MIT Differential Equations Notes
BJORN POONEN
These are an approximation of what was covered in lecture. (Please clear your browser’s
cache before reloading this file to make sure you are getting the current version.) This PDF
file is divided into sections; the instructions for viewing the table of contents depend on which
PDF viewer you are using.
If your PDF viewer is linked to a browser, you should be able to click on URLs, such as
the one below to go to the online mathlets:
http://mathlets.org/mathlets/
Small text contains a technical explanation that you might want to ignore when reading for the first time.
February 7
Warning: The notation ẏ is a standard abbreviation for dy/dt; use it only for the derivative with respect to time. If y is a function of x, write y′ or dy/dx instead.
If y has units m (meters) and t has units s (seconds), then dy/dt has the same units as y/t would, namely m/s (meters per second). Similarly, ÿ would have units m/s².
Maybe you guessed y = e^{3t}. This is a solution to the differential equation (1), because substituting it into the DE gives 3e^{3t} = 3e^{3t}. But it’s not the function I was thinking of!
Some other solutions are y = 7e^{3t}, y = −5e^{3t}, y = 0, etc. Later we’ll explain why the general solution to (1) is
y = ce^{3t}, where c is a parameter;
saying this means that
• for each number c, the function y = ce^{3t} is a solution, and
• there are no other solutions besides these.
So there is a 1-parameter family of solutions to (1).
You still haven’t guessed my secret function.
Clue 2: My function satisfies the initial condition y(0) = 6.
Solution: There is a number c such that y(t) = ce^{3t} holds for all t; we need to find c.
Plugging in t = 0 shows that 6 = ce^0, so c = 6. Thus, among the infinitely many solutions to the DE, the particular solution satisfying the initial condition is y(t) = 6e^{3t}. □
Important: Checking a solution to a DE is usually easier than finding the solution in the
first place, so it is often worth doing. Just plug in the function to both sides, and also check
that it satisfies the initial condition.
In (1) only one initial condition was needed, since only one parameter c needed to be
recovered.
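Checks like this are easy to automate. Here is a small sketch (using the sympy library; not part of the notes) that verifies y = 6e^{3t} against the DE ẏ = 3y and the initial condition y(0) = 6:

```python
# Sketch: verify that y(t) = 6*e^(3t) satisfies y' = 3y and y(0) = 6.
import sympy as sp

t = sp.symbols('t')
y = 6*sp.exp(3*t)
satisfies_de = sp.simplify(y.diff(t) - 3*y) == 0   # plug into both sides of the DE
satisfies_ic = y.subs(t, 0) == 6                   # check the initial condition
print(satisfies_de, satisfies_ic)
```

Both checks should succeed; the same two-line pattern (residual of the DE, then the initial condition) works for any claimed solution.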
Example 1.1. Is
2. Modeling
I sometimes tell people that I have a career in modeling. We’re going to talk about
mathematical modeling, which is converting a real-world problem into mathematical equations.
Guidelines:
1. Identify relevant quantities, both known and unknown, and give them symbols. Find the
units for each.
2. Identify the independent variable(s). The other quantities will be functions of them, or
constants. Often time is the only independent variable.
3. Write down equations expressing how the functions change in response to small changes in
the independent variable(s). Also write down any “laws of nature” relating the variables.
As a check, make sure that each summand in an equation has the same units.
Often simplifying assumptions need to be made; the challenge is to simplify the equations so
that they can be solved but so that they still describe the real-world system well.
Problem 2.1. I have a savings account earning interest compounded daily, and I make frequent deposits to and withdrawals from the account. Find an ODE with initial condition to model the balance.
Now that the modeling is done, the next step might be to solve (2) for the function x(t),
but we won’t do that yet.
2.2. Systems and signals. Maybe for financial planning I am interested in testing different
saving strategies (different functions q) to see what balances x they result in. To help with
this, rewrite the ODE as
ẋ − I(t)x = q(t),
where the left side, ẋ − I(t)x, is controlled by the bank, and the right side, q(t), is controlled by me.
In the “systems and signals” language of engineering, q is called the input signal, the bank
is the system, and x is the output signal. These terms do not have a mathematical meaning
dictated by the DE alone; their interpretation is guided by what is being modeled. But the
general picture is this:
• The input signal is a function of the independent variable alone, a function that enters
into the DE somehow (usually the right side of the DE, or part of the right side).
• The system processes the input signal by solving the DE with the given initial condition.
• The output signal (also called system response) is the solution to the DE.
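As a concrete illustration of a system processing an input signal, the sketch below uses sympy to solve ẋ − rx = q for a constant interest rate r and constant deposit rate q; the numeric values are assumptions for the example, not from the notes.

```python
# Sketch: solve x' - r*x = q with assumed constant interest rate r,
# deposit rate q, and initial balance x0 (hypothetical sample values).
import sympy as sp

t = sp.symbols('t')
x = sp.Function('x')
r, q, x0 = sp.Rational(5, 100), 1000, 500
sol = sp.dsolve(sp.Eq(x(t).diff(t) - r*x(t), q), x(t), ics={x(0): x0})
balance = sp.expand(sol.rhs)
print(balance)   # the output signal: the balance as a function of t
```

Here q is the input signal, the solving step is the system, and the printed balance is the output signal.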
Solution:
Step 1. This involves only the first derivative of a one-variable function y(t), so it is a
first-order ODE. Thus we can attempt separation of variables.
Step 2. Rewrite as dy/dt − 2ty = 0.
Step 3. Isolate the dy/dt term: dy/dt = 2ty.
Step 4. We can separate variables! Namely, (1/y) dy = 2t dt. (Warning: We divided by y, so at some point we will have to check y = 0 as a potential solution.)
Step 5. Integrate: ln |y| = t² + C.
Step 6. Solve for y:
|y| = e^{t²+C}
y = ±e^C e^{t²}.
As C runs over all real numbers, and as the ± sign varies, the coefficient ±e^C runs over all nonzero real numbers. Thus these solutions are y = ce^{t²} for all nonzero c.
Step 7. Because of Step 4, we need to check also the constant function y = 0; it turns out that it is a solution too. It can be considered as the function ce^{t²} for c = 0.
Conclusion: The general solution to ẏ − 2ty = 0 is
y = ce^{t²}, where c is an arbitrary real number. □
Step 8. Plugging in y = ce^{t²} to ẏ − 2ty = 0 gives ce^{t²}(2t) − 2tce^{t²} = 0, which is true, as it should be.
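Step 8 can also be carried out symbolically; a quick sympy sketch (not part of the notes):

```python
# Sketch: verify that y = c*e^(t^2) satisfies y' - 2*t*y = 0 for every c.
import sympy as sp

t, c = sp.symbols('t c')
y = c*sp.exp(t**2)
residual = sp.simplify(y.diff(t) - 2*t*y)   # should be identically 0
print(residual)
```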
e^t ÿ + 5 ẏ + t⁹ y = 0.
4.1.2. Building an inhomogeneous linear ODE. If you start with a homogeneous linear ODE,
and replace the 0 on the right by a function of t only, the result is called an inhomogeneous
linear ODE. The function of t could be a constant function, but it is not allowed to involve y.
For example,
e^t ÿ + 5 ẏ + t⁹ y = 7 sin t + 2
e^t ÿ + 5 ẏ + t⁹ y = 2.
Most general nth order inhomogeneous linear ODE:
p_n(t) y^(n) + · · · + p_1(t) ẏ + p_0(t) y = q(t).
4.1.3. Both kinds together. In testing whether an ODE is a homogeneous linear ODE or
inhomogeneous linear ODE, you are allowed to rearrange the terms. A linear ODE is an ODE
that can be rearranged into one of these two types.
Remark 4.1. If you already know that an ODE is linear, there is an easy test to decide if it is
homogeneous or not: plug in the constant function y = 0.
• If y = 0 is a solution, the ODE is homogeneous.
• If y = 0 is not a solution, the ODE is inhomogeneous.
For example, dividing e^t ÿ + 5 ẏ + t⁹ y = 0 by e^t gives
ÿ + (5/e^t) ẏ + (t⁹/e^t) y = 0.
The same can be done for any linear ODE, to put it in standard linear form
y^(n) + p_{n−1}(t) y^(n−1) + · · · + p_1(t) ẏ + p_0(t) y = q(t)
for some functions p_{n−1}(t), . . . , p_0(t), q(t) (not the same ones as before the division).
For now, we assume that we are looking for a solution y(t) defined on an open interval I, and that the functions p_{n−1}(t), . . . , p_0(t), q(t) are continuous (or at least piecewise continuous) on I. Open interval means a connected set of real numbers without endpoints, i.e., one of the following: (a, b), (−∞, b), (a, ∞), or (−∞, ∞) = R.
4.2. Nonlinear ODEs. For an ODE to be nonlinear, the functions y, ẏ, . . . must enter the
equation in a more complicated way: raised to powers, multiplied by each other, or with
nonlinear functions applied to them.
Flashcard question: Which of the following ODEs is linear?
1. ÿ − 7t y ẏ = 0
2. ÿ = e^t (y + t²)
3. ẏ − y² = 0
4. ẏ² − ty = sin t
5. ẏ = cos(y + t).
Answer: the second one; it can be rearranged into the inhomogeneous linear form
ÿ + (−e^t) y = t² e^t.
February 9
Homogeneous: ẏ + p(t) y = 0
Inhomogeneous: ẏ + p(t) y = q(t).
ẏ + p(t) y = 0
dy/dt + p(t) y = 0
dy/dt = −p(t) y
dy/y = −p(t) dt (assume for now that y is not 0).
Choose any antiderivative P (t) of p(t). Integrating gives
ln |y| = −P(t) + C
|y| = e^{−P(t)+C}
y = ±e^C e^{−P(t)}
y = ce^{−P(t)},
where c is any number (we brought back the solution y = 0 corresponding to c = 0).
If you choose a different antiderivative, it will have the form P(t) + d for some constant d, and then the new e^{−P(t)} is just a constant e^{−d} times the old one, so the set of all scalar multiples of the function e^{−P(t)} is the same as before.
Conclusion:
Theorem 5.1 (General solution to first-order homogeneous linear ODE). Let p(t) be a continuous function on an open interval I (this ensures that p(t) has an antiderivative). Let P(t) be any antiderivative of p(t). Then the general solution to ẏ + p(t) y = 0 is y = ce^{−P(t)}, where c is a parameter.
1. Find one nonzero solution y_h to the associated homogeneous equation
ẏ + p(t) y = 0.
(You need just one nonzero solution. If instead you found the general solution to the homogeneous ODE, set the parameter c equal to 1, say, to get one solution.)
2. For an undetermined function u(t), substitute
y = u(t) y_h(t)
into the inhomogeneous equation (3) and solve for u(t), to find all choices of u(t) that make this y a solution to the inhomogeneous equation.
3. Now that the general u(t) has been found, plug it back into y = u(t) y_h(t) to get the general solution to the inhomogeneous equation. □
The reason for considering uyh in Step 2 is this: we know that if we try cyh for a constant c
in the inhomogeneous equation
ẏ + p(t) y = q(t),
it won’t work since the left side will evaluate to 0 (the function cyh is a solution to the
homogeneous equation). Therefore instead we try uyh for a function u(t), and try to figure
out which functions u will make the left side evaluate to q(t). (That’s why it’s called variation
of parameters: the parameter c has been replaced by something varying.)
dy/dt = −(2/t) y
dy/y = −(2/t) dt
ln |y| = −2 ln t + C (since t > 0)
y = ce^{−2 ln t}
y = ct^{−2}.
t^{−1} u̇ = t⁵
u̇ = t⁶
u = t⁷/7 + c.
y = ut^{−2} = (t⁷/7 + c) t^{−2} = t⁵/7 + ct^{−2}.
(If you want, check by direct substitution that this really is a solution.) □
Here ∫ q(t)e^{P(t)} dt represents all possible antiderivatives of q(t)e^{P(t)}, so there are infinitely many solutions.
If you fix one antiderivative, say R(t), then the others are R(t) + c for a constant c, so the general solution is
y = (R(t) + c) e^{−P(t)}.
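This recipe can be checked symbolically. The sketch below (sympy; the sample coefficients p = 2/t and q = t⁴ are the worked example tẏ + 2y = t⁵ put in standard form) builds P, R, and the general solution and verifies it:

```python
# Sketch: general solution y = (R(t) + c)*e^(-P(t)) to y' + p(t)*y = q(t),
# with p = 2/t and q = t^4 (the worked example in standard form).
import sympy as sp

t, c = sp.symbols('t c', positive=True)
p, q = 2/t, t**4
P = sp.integrate(p, t)               # an antiderivative of p
R = sp.integrate(q*sp.exp(P), t)     # one antiderivative of q*e^P
y = (R + c)*sp.exp(-P)
residual = sp.simplify(y.diff(t) + p*y - q)
print(sp.expand(y), residual)
```

The expanded y reproduces t⁵/7 + ct⁻², matching the answer found by variation of parameters.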
5.4. Linear combinations. A linear combination of a list of functions is any function that can be built from them by scalar multiplication and addition.
Examples:
• 2 cos t + 3 sin t is a linear combination of the functions cos t and sin t.
• 9t⁵ + 3 is a linear combination of the functions t⁵ and 1.
Flashcard question: One of the functions below is not a linear combination of cos²t and 1.
Which one?
1. 3 cos²t − 4
2. sin²t
3. sin(2t)
4. cos(2t)
5. 5
6. 0
Answer: 3.
All the others are linear combinations:
3 cos²t − 4 = 3 cos²t + (−4) · 1
sin²t = (−1) cos²t + 1 · 1
sin(2t) = ???
cos(2t) = 2 cos²t + (−1) · 1
5 = 0 cos²t + 5 · 1
0 = 0 cos²t + 0 · 1.
Could there be some fancy identity that expresses sin(2t) as a linear combination of cos²t and 1? No; here’s one way to see this: Every linear combination of cos²t and 1 has the form
c₁ cos²t + c₂
for some numbers c₁ and c₂. All such functions are even functions, but sin(2t) is an odd function. (Warning: This trick might not work in other situations.)
5.5. Superposition.
You probably guessed that to get a right hand side that is 9 times as large, the solution needs to be 9 times as large: 9t⁵/7. This is a correct possible answer!
Why does this work? Imagine plugging in y = 9t⁵/7 instead of y = t⁵/7 into the left side of the DE; then each of ẏ and y will be 9 times larger, so tẏ + 2y will be 9 times larger too; that is, it will equal 9t⁵ instead of t⁵.
What special property of tẏ + 2y ensured that it would be 9 times larger? It is that each summand (tẏ and 2y) is a function of t times one of y, ẏ, . . . , so that when y is multiplied by 9, each summand gets multiplied by 9. In other words, this worked precisely because the DE was linear!
In the language of systems and signals, if the input signal (right hand side) is multiplied
by 9, the output signal (the solution) is multiplied by 9.
Adding input signals is also called superimposing them; this explains the name of the
general principle:
Using both parts gives a principle for linear combinations, as in the following example.
Answer: t⁵/7 + ct^{−2}. This is great news: if you already solved the homogeneous DE, you just have to find one solution to the inhomogeneous DE to build all solutions to the inhomogeneous DE!
Why does this work? For any number c, superposition says that adding ct^{−2} and t⁵/7 will give a solution to the inhomogeneous DE with right side 0 + t⁵ = t⁵, but why do all solutions of that inhomogeneous DE arise this way? It is because the process can be reversed: given
any solution to the inhomogeneous DE tẏ + 2y = t⁵ we can subtract the particular solution y_p = t⁵/7 to the same DE to get a solution to the homogeneous DE tẏ + 2y = 0.
The strategy suggested by Problem 5.6 can help you find the general solution y_i to any inhomogeneous linear DE
p_n(t) y^(n) + · · · + p_0(t) y = q(t):
1. List all solutions to the associated homogeneous linear DE
p_n(t) y^(n) + · · · + p_0(t) y = 0;
i.e., write down its general solution y_h.
2. Find (in some way) any one particular solution y_p to the inhomogeneous DE.
3. Add y_p to all the solutions of the homogeneous DE to get all the solutions to the inhomogeneous DE.
Summary:
y_i = y_p + y_h
(general inhomogeneous solution) = (particular inhomogeneous solution) + (general homogeneous solution).
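For the example tẏ + 2y = t⁵, the decomposition y_i = y_p + y_h can be confirmed symbolically; a sympy sketch (not part of the notes):

```python
# Sketch: y_i = y_p + y_h solves t*y' + 2*y = t^5 for every value of c.
import sympy as sp

t, c = sp.symbols('t c')
yp = t**5/7          # one particular solution
yh = c/t**2          # general homogeneous solution
yi = yp + yh
residual = sp.simplify(t*yi.diff(t) + 2*yi - t**5)
print(residual)      # should be 0 independent of c
```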
Simplifying assumptions:
• The insulating ability of the thermos does not change with time.
• The rate of cooling depends only on the difference between the soup temperature and
the external temperature.
Variables and functions (with units): Define the following:
t : time (minutes)
x : external temperature (◦ C)
y : soup temperature (◦ C)
Here t is the independent variable, and x and y are functions of t.
Equation:
ẏ = f(y − x)
for some function f. Another simplifying assumption: f(z) = −kz + ℓ for some constants k and ℓ (any reasonable function can be approximated on small inputs by its linearization at 0); this leads to
ẏ = −k(y − x) + ℓ.
Common sense says
• If y = x, then ẏ = 0. Thus ℓ should be 0.
• If y > x, then y is decreasing. This is why we wrote −k instead of just k.
So the equation becomes
ẏ = −k(y − x).
This is Newton’s law of cooling: the rate of cooling of an object is proportional to the difference between its temperature and the external temperature. The (positive) constant k is called the coupling constant, in units of minutes⁻¹; smaller k means better insulation, and k = 0 is perfect insulation. This ODE can be rearranged into standard form:
ẏ + ky = kx.
It’s a first-order inhomogeneous linear ODE! The input signal is x, the system is the thermos,
and the output signal is y.
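When the external temperature is constant, this ODE has a simple closed form; a sympy sketch (the constant external temperature x₀ is an assumption for the example):

```python
# Sketch: Newton's law of cooling y' + k*y = k*x0 with constant external
# temperature x0; the solution should relax exponentially toward x0.
import sympy as sp

t = sp.symbols('t')
k, x0, y0 = sp.symbols('k x0 y0', positive=True)
y = sp.Function('y')
sol = sp.dsolve(sp.Eq(y(t).diff(t) + k*y(t), k*x0), y(t), ics={y(0): y0})
print(sol.rhs)
```

The result is x₀ + (y₀ − x₀)e^{−kt}: the soup temperature decays toward the external temperature at a rate set by the coupling constant k.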
February 12
Existence and uniqueness theorem for a linear ODE. Let p_{n−1}(t), . . . , p_0(t), q(t) be continuous functions on an open interval I. Let a ∈ I, and let b_0, . . . , b_{n−1} be given numbers. Then there exists a unique solution to the nth order linear ODE
y^(n) + p_{n−1}(t) y^(n−1) + · · · + p_1(t) ẏ + p_0(t) y = q(t)
satisfying the n initial conditions
y(a) = b_0, ẏ(a) = b_1, . . . , y^(n−1)(a) = b_{n−1}.
Remark 6.1. For a linear ODE as above, the solution y(t) is defined on the whole interval I where the functions p_{n−1}(t), . . . , p_0(t), q(t) are continuous. In particular, if p_{n−1}(t), . . . , p_0(t), q(t) are continuous on all of R, then the solution y(t) will be defined on all of R.
7. Complex numbers
Complex numbers are expressions of the form a + bi, where a and b are real numbers, and
i is a new symbol. Multiplication of complex numbers will eventually be defined so that
i2 = −1. (Electrical engineers sometimes write j instead of i, because they want to reserve i
for current, but everybody else thinks that’s weird.)
Just as the set of all real numbers is denoted R, the set of all complex numbers is denoted
C. The notation “α ∈ C” means literally that α is an element of the set of complex numbers,
so it is a short way of saying “α is a complex number”.
(In lecture there was a joke about the Greek letter Ξ; you had to be there.)
One can add, subtract, multiply, and divide complex numbers (except for division by 0).
Addition, subtraction, and multiplication are defined as for polynomials, except that after
multiplication one simplifies by using i2 = −1; for example,
To divide z by w, multiply z/w by w̄/w̄ so that the denominator becomes real; for example,
(2 + 3i)/(1 − 5i) = (2 + 3i)/(1 − 5i) · (1 + 5i)/(1 + 5i) = (2 + 13i + 15i²)/(1 − 25i²) = (−13 + 13i)/26 = −1/2 + (1/2)i.
The arithmetic operations on complex numbers satisfy the same properties as for real numbers (zw = wz and so on). The mathematical jargon for this is that C, like R, is a field. In particular, for any complex number z and integer n, the nth power zⁿ can be defined in the usual way (need z ≠ 0 if n < 0); e.g., z³ := zzz, z⁰ := 1, z^{−3} := 1/z³. (Warning: Although there is a way to define zⁿ also for a complex number n, when z ≠ 0, it turns out that zⁿ has more than one possible value for non-integral n, so it is ambiguous notation. Anyway, the most important cases are e^z, and zⁿ for integers n; the other cases won’t even come up in this class.)
If you change every i in the universe to −i (that is, take the complex conjugate everywhere), then all true statements remain true. For example, i² = −1 becomes (−i)² = −1. Another example: If z = v + w, then z̄ = v̄ + w̄; in other words,
the conjugate of v + w is v̄ + w̄, and
the conjugate of vw is v̄ w̄.
(These two identities say that complex conjugation respects addition and multiplication.)
7.2. The complex plane. Just as real numbers can be plotted on a line, complex numbers
can be plotted on a plane: plot a + bi at the point (a, b).
Addition and subtraction of complex numbers has the same geometric interpretation as
for vectors. The same holds for scalar multiplication by a real number. (The geometric
interpretation of multiplication by a complex number is different; we’ll explain it soon.)
Complex conjugation reflects a complex number in the real axis.
The absolute value (also called magnitude or modulus) |z| of a complex number z = a + bi is its distance to the origin:
|a + bi| := √(a² + b²) (this is a real number).
For a complex number z, inequalities like z < 3 do not make sense, but inequalities like
|z| < 3 do, because |z| is a real number. The complex numbers satisfying |z| < 3 are those in
the open disk of radius 3 centered at 0 in the complex plane. (Open disk means the disk without
its boundary.)
7.3. Some useful identities. The following are true for all complex numbers z:
Re z = (z + z̄)/2, Im z = (z − z̄)/(2i), the conjugate of z̄ is z, and z z̄ = |z|².
Also, for any real number c and complex number z,
Re(cz) = c Re z, Im(cz) = c Im z.
Proof of the first identity: Write z = a + bi, where a and b are real. Then
Re z = a,
and
(z + z̄)/2 = ((a + bi) + (a − bi))/2 = a,
so Re z = (z + z̄)/2. □
The proofs of the others are similar.
Many identities have a geometric interpretation too. For example, Re z = (z + z̄)/2 says that Re z is the midpoint between z and its reflection z̄.
7.4. Complex roots of polynomials.
Example 7.1. How many roots does the polynomial z³ − 3z² + 4 have? It factors as (z − 2)(z − 2)(z + 1), so it has only two distinct roots (2 and −1). But if we count 2 twice, then the number of roots counted with multiplicity is 3, equal to the degree of the polynomial.
Some real polynomials, like z² + 9, cannot be factored completely into degree 1 real polynomials, but do factor into degree 1 complex polynomials: (z + 3i)(z − 3i). In fact, every complex polynomial factors completely into degree 1 complex polynomials — this is proved in advanced courses in complex analysis. This implies the following:
Fundamental theorem of algebra. Every degree n complex polynomial f (z) has exactly
n complex roots, if counted with multiplicity.
Since real polynomials are special cases of complex polynomials, the fundamental theorem
of algebra applies to them too. For real polynomials, the non-real roots can be paired off
with their complex conjugates.
Example 7.3. Want a fourth root of i? The fundamental theorem of algebra guarantees that z⁴ − i = 0 has a complex solution (in fact, four of them). We’ll soon learn how to find them.
The fundamental theorem of algebra will be useful for constructing solutions to higher
order linear ODEs with constant coefficients, and for discussing eigenvalues.
7.5. Real and imaginary parts of complex-valued functions. Suppose that y(t) is a complex-valued function of a real variable t. Then
y(t) = f(t) + i g(t)
for some real-valued functions f(t) and g(t). Here f(t) := Re y(t) and g(t) := Im y(t). Differentiation and integration can be done component-wise:
The functions in parentheses labelled f(t) and g(t) are real-valued, so these are the real and imaginary parts of the function y(t). □
The real exponential function y = e^t is the unique solution to
ẏ = y, y(0) = 1,
so we will use this to guide the definition for complex numbers. To avoid talking about complex-valued functions of a complex variable, which we have not learned how to differentiate, we can fix a complex constant α and try to define the complex-valued function e^{αt} of a real variable t.
Definition 7.5. For each complex constant α, the function e^{αt} is defined to be the solution to
ẏ = αy, y(0) = 1.
(The existence and uniqueness theorem for linear ODEs guarantees that there is exactly one
solution.)
Remark 7.6. Strictly speaking, we need a vector-valued variant of the existence and uniqueness theorem,
since a complex-valued function of t is equivalent to a pair of real-valued functions of t. Anyway, it is true.
For each nonzero α, the value of αt traces out the line through 0 and α as t ranges over real numbers, so the function e^{αt} specifies the value of e^z for every z on this line. These lines for varying α cover the whole complex plane, so defining e^{αt} for every α assigns a value to e^z for every complex number z.
Remark 7.7. Each value of e^z has been defined multiple times: for example, if β = 2α, then e^β has been defined as both the value of e^{βt} at t = 1 and the value of e^{αt} at t = 2. Fortunately these multiple definitions are consistent: in the example, e^{α(2t)} satisfies the same DE and initial condition that specified e^{βt}, so e^{α(2t)} = e^{βt}; now set t = 1.
7.7. Basic properties of the complex exponential function. From the definition, we can deduce that the function e^z for complex z satisfies many of the same properties as e^t for real t.
Lemma 7.8. For any complex numbers α and β, the functions e^{αt} e^{βt} and e^{(α+β)t} are equal.
Proof. We will show that both e^{αt} e^{βt} and e^{(α+β)t} are solutions to
ẏ = (α + β)y, y(0) = 1.
This implies that they are the same function, since the uniqueness part of the existence and uniqueness theorem says that there is only one solution!
The second function, e^{(α+β)t}, satisfies that DE and initial condition by definition. The first function y(t) := e^{αt} e^{βt} also satisfies the DE and initial condition:
ẏ = (αe^{αt})e^{βt} + e^{αt}(βe^{βt}) (product rule)
= (α + β)e^{αt} e^{βt}
= (α + β)y
and y(0) = e^{α·0} e^{β·0} = 1 · 1 = 1.
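A quick numerical spot-check of Lemma 7.8 (the sample values of α and β are arbitrary assumptions; this uses Python’s built-in complex exponential, which agrees with the DE definition):

```python
# Sketch: check e^(αt) * e^(βt) = e^((α+β)t) numerically at a few sample t.
import cmath

alpha, beta = 2 - 1j, 0.5 + 3j   # arbitrary sample values
max_err = max(
    abs(cmath.exp(alpha*t)*cmath.exp(beta*t) - cmath.exp((alpha + beta)*t))
    for t in (0.0, 0.7, 2.5)
)
print(max_err)   # tiny: floating-point roundoff only
```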
Theorem 7.9.
(a) e⁰ = 1.
(b) e^z e^w = e^{z+w} for all complex numbers z and w.
(c) 1/e^z = e^{−z} for every complex number z.
(d) (e^z)ⁿ = e^{nz} for every complex number z and integer n.
Proof.
(a) True by definition.
(b) The previous lemma said e^{αt} e^{βt} = e^{(α+β)t}. Evaluating at t = 1 gives e^α e^β = e^{α+β}. Renaming variables gives e^z e^w = e^{z+w}.
(c) (The proofs of (c) and (d) were skipped in lecture.) We have
e^z e^{−z} = e^{z+(−z)} = e⁰ = 1 (by (b) and (a)),
so e^{−z} is the inverse of e^z.
(d) If n = 0, then this is 1 = 1 by definition. If n = 3, then
(e^z)³ = e^z e^z e^z = e^{z+z+z} = e^{3z} (by (b), repeatedly);
the same argument works for any positive integer n. If n = −3, then
(e^z)^{−3} = 1/(e^z)³ = 1/e^{3z} (just shown) = e^{−3z} (by (c));
the same argument works for any negative integer n.
February 14
Euler’s formula. We have e^{it} = cos t + i sin t for every real number t.
As t increases, the complex number e^{it} = cos t + i sin t travels counterclockwise around the unit circle.
Theorem 7.11.
(a) e^{a+bi} = e^a (cos b + i sin b) for all real numbers a and b.
(b) e^{−it} = cos t − i sin t, which is the complex conjugate of e^{it}, for every real number t.
(c) |e^{it}| = 1 for every real number t.
Proof.
(a) We have e^{a+bi} = e^a e^{ib} = e^a (cos b + i sin b).
(b) Changing every i in the universe to −i transforms e^{it} = cos t + i sin t into e^{−it} = cos t − i sin t. (Substituting −t for t would do it too.) On the other hand, applying complex conjugation to both sides of e^{it} = cos t + i sin t shows that the complex conjugate of e^{it} is cos t − i sin t.
(c) By Euler’s formula, |e^{it}| = |cos t + i sin t| = √(cos²t + sin²t) = √1 = 1.
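Euler’s formula and |e^{it}| = 1 can likewise be spot-checked numerically; a sketch (the sample t is arbitrary):

```python
# Sketch: check e^(it) = cos t + i*sin t and |e^(it)| = 1 at a sample t.
import cmath
import math

t = 1.2345
z = cmath.exp(1j*t)
euler_err = abs(z - complex(math.cos(t), math.sin(t)))
mag_err = abs(abs(z) - 1)
print(euler_err, mag_err)
```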
Of lesser importance is the power series representation
e^z = 1 + z + z²/2! + z³/3! + · · · . (6)
This formula can be deduced by using Taylor’s theorem with remainder, or by showing that the right hand side satisfies the DE and initial condition. Some books use e^{a+bi} = e^a (cos b + i sin b) or the power series e^z = 1 + z + z²/2! + · · · as the definition of the complex exponential function, but the DE definition we gave is less contrived and focuses on what makes the function useful.
7.10. Polar forms of a complex number. Given a nonzero complex number z = x + yi, we can express the point (x, y) in polar coordinates r and θ:
x = r cos θ, y = r sin θ.
Then
x + yi = (r cos θ) + (r sin θ)i = r(cos θ + i sin θ).
In other words,
z = re^{iθ}.
The expression re^{iθ} is called a polar form of the complex number z. Here r is required to be a positive real number (assuming z ≠ 0), so r = |z|.
Any possible θ for z (a possible value for the angle or argument of z) may be called arg z,
but this is dangerously ambiguous notation since there are many values of θ for the same z:
this means that arg z is not a function.
Example 7.12. Suppose that z = −3i. So z corresponds to the point (0, −3). Then r = |z| = 3, but there are infinitely many possibilities for the angle θ. One possibility is −π/2; all the others are obtained by adding integer multiples of 2π:
θ = −π/2 + 2πk for any integer k.
To specify a unique polar form, we would have to restrict the range for θ to some interval of width 2π.
The most common choice is to require −π < θ ≤ π. This special θ is called the principal value of the argument,
and is denoted in various ways:
This assumes that r₁ and r₂ are positive real numbers, and that θ₁ and θ₂ are real numbers, as you would expect for polar coordinates.
7.11. Converting from x + yi form to a polar form. This is the same as converting from
rectangular coordinates to polar coordinates, so you are supposed to know this already. This
section was not covered in lecture.
26
Problem 7.13. Convert a nonzero complex number z = x + yi to polar form. In other
words, given real numbers x and y, find r and a possible θ.
Finding r is easy: r = |z| = √(x² + y²).
Finding θ is trickier. If x = 0 or y = 0, then x + yi is on one of the axes and θ will be an appropriate integer multiple of π/2. So assume that x and y are nonzero. The correct θ satisfies tan θ = y/x, but there are also other angles that satisfy this equation, namely θ + kπ for any integer k. Some of these other angles point in the opposite direction. In particular, tan⁻¹(y/x) might be in the opposite direction. By definition, the angle tan⁻¹(y/x) always lies in (−π/2, π/2), pointing into the right half plane, so it will be wrong when x + yi lies in the left half plane; in that case, adjust tan⁻¹(y/x) by adding or subtracting π to get a possible θ. Finally, if desired, add an integer multiple of 2π to get the principal value of the argument, which is the θ satisfying −π < θ ≤ π.
The “2-variable arctangent function” (atan2) in Mathematica and MATLAB looks not only at y/x, but also at the point (x, y), to calculate a correct θ.
Example 7.14. Suppose that z = −1 − i. Evaluating tan⁻¹(y/x) at (−1, −1) gives tan⁻¹(1) = π/4, pointing in the direction opposite to (−1, −1). Subtracting π gives −3π/4 as a possible θ. (The other possible angles θ are the numbers −3π/4 + 2πk, where k can be any integer.)
7.12. Operations in polar form. Some arithmetic operations on complex numbers are easy in polar form:
multiplication: (r₁e^{iθ₁})(r₂e^{iθ₂}) = r₁r₂ e^{i(θ₁+θ₂)} (multiply absolute values, add angles)
reciprocal: 1/(re^{iθ}) = (1/r) e^{−iθ}
division: (r₁e^{iθ₁})/(r₂e^{iθ₂}) = (r₁/r₂) e^{i(θ₁−θ₂)} (divide absolute values, subtract angles)
nth power: (re^{iθ})ⁿ = rⁿ e^{inθ} for any integer n
complex conjugation: the conjugate of re^{iθ} is re^{−iθ}.
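Python’s cmath module works directly with polar forms, so the multiplication rule can be illustrated concretely (the sample values r₁ = 2, θ₁ = π/3, r₂ = 5, θ₂ = π/4 are assumptions for the example):

```python
# Sketch: multiplying complex numbers multiplies absolute values and adds angles.
import cmath
import math

z1 = cmath.rect(2, math.pi/3)     # r1 = 2, θ1 = π/3
z2 = cmath.rect(5, math.pi/4)     # r2 = 5, θ2 = π/4
r, theta = cmath.polar(z1*z2)
print(r, theta)                   # expect r = 10 and θ = π/3 + π/4
```

Note that cmath.polar returns the principal value of the argument, in (−π, π]; here π/3 + π/4 = 7π/12 already lies in that range.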
Solution: Since i = e^{iπ/2}, multiplying by i adds π/2 to the angle of each point; that is, it rotates counterclockwise by 90° (around the origin). Next, multiplying by 3 does what you would expect: dilate by a factor of 3. Doing both leads to. . .
For example, the nose was originally on the real line, a little less than 2, so multiplying it by 3i produces a big nose close to (3i) · 2 = 6i. □
Question 7.16. How do you trap a lion?
Answer: Build a cage in the shape of the unit circle |z| = 1. Get inside the cage. Make
sure that the lion is outside the cage. Apply the function 1/z to the whole plane. Voilà! The
lion is now inside the cage, and you are outside it. (Only problem: There’s a lot of other stuff inside
the cage too. Also, don’t stand too close to z = 0 when you apply 1/z.)
Question 7.17. Why not always write complex numbers in polar form?
7.13. The function e^{(a+bi)t}. Fix a nonzero complex number a + bi. As the real number t increases, the complex number (a + bi)t moves along a line through 0, and e^{(a+bi)t} moves along part of a line, a circle, or a spiral, depending on the value of a + bi. Try the “Complex Exponential” mathlet
http://mathlets.org/mathlets/complex-exponential/
to see examples of this.
Example 7.18. Consider e^{(−5−2i)t} = e^{−5t} e^{i(−2t)} as t → ∞. Its absolute value is e^{−5t}, which tends to 0, so the point is moving inward. Its angle is −2t, which is decreasing, so the point is moving clockwise. It’s spiraling inwards clockwise.
7.14.1. An example. Let us find all complex solutions to z⁵ = −32. Write z = re^{iθ} and −32 = 32e^{iπ}:
(re^{iθ})⁵ = 32e^{iπ}
r⁵ e^{i(5θ)} = 32e^{iπ}
r⁵ = 32 (absolute values) and 5θ = π + 2πk for some integer k (angles)
r = 2 and θ = π/5 + 2πk/5 for some integer k
z = 2e^{i(π/5 + 2πk/5)} for some integer k.
These are numbers on a circle of radius 2; to get from one to the next (increasing k by 1), rotate by 2π/5. Increasing k five times brings the number back to its original position. So it’s enough to take k = 0, 1, 2, 3, 4. Answer:
z = 2e^{iπ/5}, 2e^{i3π/5}, 2e^{iπ} = −2, 2e^{i7π/5}, 2e^{i9π/5}.
Remark 7.21. The same approach (write in polar form, solve for absolute value and angle) finds the solutions to zⁿ = α for any positive integer n and nonzero complex number α.
Problem 7.22. Let n be a positive integer. The nth roots of unity are the complex solutions to zⁿ = 1. Find them all.
Solution: Rewrite the equation in polar form, using z = re^{iθ}:
(re^{iθ})ⁿ = 1
rⁿ e^{i(nθ)} = 1
rⁿ = 1 (absolute values) and nθ = 2πk for some integer k (angles)
r = 1 and θ = 2πk/n for some integer k
z = e^{i(2πk/n)} for some integer k.
As in the previous example, it’s enough to take k = 0, 1, . . . , n − 1. Setting ζ := e^{2πi/n}, the nth roots of unity are
1, ζ, ζ², . . . , ζ^{n−1}.
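The nth roots of unity are easy to generate and check numerically; a sketch with n = 5 as an assumed sample value:

```python
# Sketch: the n-th roots of unity are ζ^k for k = 0, ..., n-1, where ζ = e^(2πi/n).
import cmath
import math

n = 5
zeta = cmath.exp(2j*math.pi/n)
roots = [zeta**k for k in range(n)]
max_err = max(abs(z**n - 1) for z in roots)   # each root should satisfy z^n = 1
print(len(roots), max_err)
```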
February 16
7.14.3. Another approach to describing all nth roots of a complex number. Here is another approach to describing all complex solutions to zⁿ = α, analogous to the y_i = y_p + y_h approach to inhomogeneous linear DEs:
If z₀ is one solution to zⁿ = α,
and 1, ζ, ζ², . . . , ζ^{n−1} are all the solutions to zⁿ = 1,
then ______ are all the solutions to zⁿ = α.
Answer: z₀, ζz₀, ζ²z₀, . . . , ζ^{n−1}z₀. Why? These are solutions, since for any integer k,
(ζᵏz₀)ⁿ = (ζᵏ)ⁿ z₀ⁿ = 1 · α = α;
there can’t be any more, since a degree n polynomial equation has at most n solutions. □
Remark 7.24. To use this approach to find all solutions to zⁿ = α, you need to know one solution z₀ in advance. If there is no obvious z₀, you’ll still need polar form to solve zⁿ = α.
The answer above says that once you know one solution z₀, you get the others by repeatedly multiplying by ζ = e^{2πi/n}, which rotates by 2π/n; after n rotations, you return to z₀. Thus the n solutions to zⁿ = α form the vertices of a regular n-gon (at least if n ≥ 3).
Try the “Complex Roots” mathlet
http://mathlets.org/mathlets/complex-roots/
7.15. e^{it} and e^{−it} as linear combinations of cos t and sin t, and vice versa.
Example 7.25. The functions e^{it} and e^{−it} are linear combinations of the functions cos t and sin t:
e^{it} = (1) cos t + (i) sin t
e^{−it} = (1) cos t + (−i) sin t.
If we view e^{it} and e^{−it} as known, and cos t and sin t as unknown, then this is a system of two linear equations in two unknowns, and can be solved for cos t and sin t. This gives
cos t = (e^{it} + e^{−it})/2, sin t = (e^{it} − e^{−it})/(2i).
Thus cos t and sin t are linear combinations of e^{it} and e^{−it}. (Explicitly, sin t = (1/(2i)) e^{it} + (−1/(2i)) e^{−it}.)
Important: The function e^z has nicer properties than cos t and sin t, so it is often a good idea to use these formulas to replace cos t and sin t by these combinations of e^{it} and e^{−it}, or to view cos t and sin t as the real and imaginary parts of e^{it}.
Problem 8.1. A cart is attached to a spring attached to a wall. The cart is attached also
to a dashpot, a damping device. (A dashpot could be a cylinder filled with oil that a piston moves
through. Door dampers and car shock absorbers often actually work this way.) Also, there is an external
force acting on the cart. Model the motion of the cart.
Solution: Define variables
t : time (s)
x : position of the cart (m), with x = 0 being where the spring exerts no force
m : mass of the cart (kg)
Fspring : force exerted by the spring on the cart (N)
Fdashpot : force exerted by the dashpot on the cart (N)
Fexternal : external force on the cart (N)
F : total force on the cart (N).
The spring force is modeled by Hooke’s law, Fspring = −kx, and the dashpot force by Fdashpot = −bẋ,
where k is the spring constant (in units N/m) and b is the damping constant (in units N s/m);
here k, b > 0. Substituting these and Newton’s second law F = mẍ into
F = Fspring + Fdashpot + Fexternal
gives
mẍ + bẋ + kx = Fexternal.
Carts attached to springs are not necessarily what interest us. But oscillatory systems
arising in all the sciences are governed by the same math, and this physical system lets us
visualize their behavior.
8.2. The differential equation ẍ+x = 0. Suppose that m = k = 1 and there is no dashpot
and no external force, only a mass and spring. Then the DE is simply
ẍ + x = 0 .
Each solution x(t) gives rise to a pair of numbers (x(0), ẋ(0)). Conversely, the existence and
uniqueness theorem says that for each pair of numbers (a, b), there is exactly one solution to
ẍ + x = 0 satisfying (x(0), ẋ(0)) = (a, b). What are these solutions?
The answer is the function a cos t + b sin t. In other words, the 2-parameter family of solutions
a cos t + b sin t
There are other ways to construct solutions. For example, for any constant φ, the time-
shifted function cos(t − φ) is a solution, and if A is another constant, then
A cos(t − φ)
is a solution too. It turns out that these functions are the same as the functions a cos t + b sin t,
just written in a different form, so the family of such functions A cos(t − φ) is the general
solution again!
To explain this and to solve other DEs, we’ll need to understand functions like these and
learn how to convert between the different forms.
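The claim that A cos(t − φ) is just a cos t + b sin t in a different form (with a = A cos φ and b = A sin φ) can be spot-checked numerically; this sketch is illustrative, not from the notes:

```python
import math

# For x'' + x = 0: the solution with x(0) = a, x'(0) = b is a*cos t + b*sin t.
# The same function equals A*cos(t - phi) with a = A*cos(phi), b = A*sin(phi).
a, b = 3.0, 4.0
A = math.hypot(a, b)      # amplitude (5.0 here)
phi = math.atan2(b, a)    # phase lag
for t in (0.0, 0.5, 1.3, 2.9):
    x1 = a * math.cos(t) + b * math.sin(t)
    x2 = A * math.cos(t - phi)
    assert abs(x1 - x2) < 1e-12
```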
9. Sinusoidal functions
9.1. Construction. Start with the curve y = cos x. Then
1. Shift the graph φ units to the right (φ is phase lag, measured in radians). (For
example, shifting by φ = π/2 gives the graph of sin x, which reaches its maximum
π/2 radians after cos x does.)
2. Compress the result horizontally by dividing by a scale factor ω (angular frequency,
measured in radians/s).
3. Amplify (stretch vertically) by a factor of A (amplitude).
(Here A, ω > 0, but φ can be any real number.)
Result? The graph of a new function f (t), called a sinusoidal function (or just sinusoid).
9.2. Formula. What is the formula for f (t)? According to the instructions, each point (x, y)
on y = cos x is related to a point (t, f (t)) on the graph of f by
t = (x + φ)/ω,    f = Ay.
Equivalently, x = ωt − φ and y = f/A, so substituting into y = cos x gives
f (t) = A cos(ωt − φ) .
9.3. Alternative geometric description. Alternatively, the graph of f (t) can be described
geometrically in terms of
A : its amplitude, as above, how high the graph rises above the t-axis at its maximum
t0 : its time lag, also sometimes called τ , a t-value at which a maximum is attained (s)
P : its period, the time for one complete oscillation (= width between successive maxima) (s or s/cycle)
How do t0 and P relate to ω and φ?
• t0 = φ/ω, since this is the t-value for which the angle ωt − φ becomes 0.
• P = 2π/ω, since adding 2π/ω to t increases the angle ωt − φ by 2π.
There is also frequency ν := 1/P , measured in Hz = cycles/s. It is the number of complete
oscillations per second. To convert from frequency ν to angular frequency ω, multiply by
2π radians per cycle; thus ω = 2πν = 2π/P , which is consistent with the formula P = 2π/ω above.
Question 9.1. What is the difference between phase lag and time lag?
Answer: Phase lag φ and time lag t0 both measure how much a sinusoid A cos(ωt − φ) is
shifted relative to the standard sinusoid cos(ωt) of the same frequency, but φ is measured as
a fraction of a cycle (expressed in radians), and t0 is expressed in time units.
For example, if φ is π radians, that is half a cycle, so it means that the sinusoid is completely
out of phase, attaining a maximum where cos(ωt) has a minimum, and vice versa. The time
lag of this same sinusoid, however, will depend on the duration of a cycle: if the angular
frequency ω is very high, then there will be many cycles per second, so each cycle represents
a very short time period, so the time for half a cycle will also be very short.
To convert between phase lag and time lag, multiply or divide by the angular frequency ω.
To remember whether to multiply or divide, compare units. Since ω is measured in radians/s,
the conversion is t0 = φ/ω.
Another way to think about this: in terms of the construction of a sinusoid, φ represents
the shift in angle, and then compressing horizontally by dividing by ω gives the shift in time.
2
9.4. Three forms. There are three ways to write a sinusoidal function of angular frequency
ω:
• amplitude-phase form: A cos(ωt − φ), where A and φ are real numbers with A ≥ 0;
• complex form: Re (ceiωt ) , where c is a complex number;
• linear combination: a cos ωt + b sin ωt, where a and b are real numbers.
Different forms are useful in different contexts, so we’ll need to know how to convert between
them. The following proposition explains how.
February 20
Proposition 9.2 (key equations). The three forms represent the same sinusoid exactly when
c̄ = Ae^{iφ} = a + bi.
Warning: Don’t forget that it is c̄ and not c itself that appears in the key equations.
An equivalent form of the key equations (obtained by taking complex conjugates) is
c = Ae^{−iφ} = a − bi.
If you ever forget the key equations above, you can do the conversion manually by going
through the steps in the proof below.
Proof of Proposition 9.2.
1.
Re(c e^{iωt}) = Re(Ae^{−iφ} e^{iωt})
= Re(Ae^{i(ωt−φ)})
= A Re(e^{i(ωt−φ)})
= A cos(ωt − φ).
2.
Re(c e^{iωt}) = Re((a − bi)(cos ωt + i sin ωt))
= a cos ωt + b sin ωt.
(Actually, it would have been enough to prove equality on two sides of the triangle.)
Here are two sample problems showing how to use the key equations.
Problem 9.3. (We actually did this in lecture on Fri Feb 16.) Convert 7 cos(2t − π/2) into
complex form Re (ce^{iωt}).
Solution: Here A = 7, ω = 2, φ = π/2, so c = Ae^{−iφ} = 7e^{−iπ/2} = −7i. Thus
7 cos(2t − π/2) = Re(−7i e^{2it}). 2
Problem 9.4. (Skipped.) Convert −cos 5t − √3 sin 5t to amplitude-phase form A cos(ωt − φ).
Solution: Given: a = −1, b = −√3, ω = 5. Wanted: A and φ. So we use Ae^{iφ} = a + bi,
which says that Ae^{iφ} = −1 − i√3. First, A = √((−1)² + (−√3)²) = 2. The real-part equation
A cos φ = −1 says that cos φ = −1/2, so the angle of (−1, −√3) is φ = −2π/3 (or this plus
2πk for any integer k). Thus the answer is 2 cos(5t + 2π/3). 2
Remark 9.5. Sinusoids are always real-valued functions. It would be wrong to write one
as a cos ωt + bi sin ωt; the i should not be in there. Even the complex form Re (ce^{iωt}) of a
sinusoid is a real-valued function, because of the Re on the outside.
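The key-equation conversion can be done mechanically with Python's cmath.polar; the numbers below re-check Problem 9.4 (an illustrative sketch, not part of the notes):

```python
import cmath
import math

# Key equation: A*e^{i*phi} = a + b*i, so cmath.polar(a + b*i) recovers (A, phi).
a, b = -1.0, -math.sqrt(3)
A, phi = cmath.polar(complex(a, b))
assert abs(A - 2.0) < 1e-12
assert abs(phi - (-2 * math.pi / 3)) < 1e-12
# Hence -cos(5t) - sqrt(3)*sin(5t) = 2*cos(5t + 2*pi/3):
for t in (0.1, 0.9, 2.2):
    lhs = a * math.cos(5 * t) + b * math.sin(5 * t)
    rhs = A * math.cos(5 * t - phi)
    assert abs(lhs - rhs) < 1e-12
```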
9.5. Complex gain, gain, and phase lag. Later in the class, we’ll talk about LTI systems
(LTI stands for linear and time-invariant). These include all systems built of springs, masses,
and dashpots, and also all RLC circuits (circuits built of resistors, inductors, and capacitors).
It turns out that such a system, when fed a sinusoidal input signal, produces a sinusoidal
output signal of the same frequency. How can we compare the input and output sinusoids?
Write each sinusoid in complex form (convert if necessary):
input signal = Re(c e^{iωt}),    output signal = Re(C e^{iωt}).
Imagine feeding a corresponding “complex replacement” signal c e^{iωt} into the system and getting a
complex output signal C e^{iωt} (this probably makes no physical sense, but do it anyway):
Define complex gain as the factor by which the complex input signal has gotten “bigger”:
G := complex output / complex input = C e^{iωt} / (c e^{iωt}) = C/c.
Complex gain is a complex number.
Question 9.6. What is the physical interpretation of the complex gain G, in terms of
amplitudes and phases of the real signals?
The answers are in the two boxes below. The conversion Re (ceiωt ) = A cos(ωt − φ) uses
the key equation
c = Ae−iφ ,
so multiplying c by the complex scale factor G (to get C) amounts to
• multiplying the amplitude A by |G|, and
• increasing φ by − arg G.
The amplitude scale factor is called gain:
output amplitude
gain := = |G| .
input amplitude
Gain is a nonnegative real number.
The increase in φ is called the phase lag (of the output relative to the input): phase lag = − arg G.
Phase lag is a real number, measured in radians. Warning: This is a relative phase lag,
different from the absolute phase lag defined earlier comparing a sinusoid to the standard
sinusoid cos x.
Example 9.7. If the phase lag is π/2, that means that the maximum of the output sinusoid
occurs π/2 radians after the maximum of the input signal. (To get instead the relative time
lag, divide the phase lag by ω.)
Remark 9.8. For an LTI system, it turns out that the complex gain depends only on ω. In
other words, the complex gain is the same for all sinusoidal input signals having a fixed
angular frequency ω. Mathematically, this is because when the complex input signal ceiωt is
multiplied by a nonzero complex number, linearity implies that the complex output signal
is multiplied by the same complex number, so the complex gain (= complex output / complex input) stays the
same.
Problem 9.9. Consider two sinusoid sound waves of angular frequencies ω + ε and ω − ε,
say cos((ω + ε)t) and cos((ω − ε)t), where ε is much smaller than ω. What happens when
they are superimposed?
Solution:
cos((ω + ε)t) + cos((ω − ε)t) = Re(e^{i(ω+ε)t} + e^{i(ω−ε)t}) = Re((e^{iεt} + e^{−iεt}) e^{iωt}) = 2 cos(εt) cos(ωt).
The function cos ωt oscillates rapidly between ±1. Multiplying it by the slowly varying
function 2 cos(εt) produces a rapid oscillation between ±2 cos(εt), so one hears a sound wave of
angular frequency ω whose amplitude is the slowly varying function |2 cos(εt)|. 2
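The trigonometric identity behind this answer, cos((ω+ε)t) + cos((ω−ε)t) = 2 cos(εt) cos(ωt), can be spot-checked numerically (an illustrative sketch, with made-up values for ω and ε):

```python
import math

omega, eps = 50.0, 1.0   # eps much smaller than omega, as in the problem
for t in (0.0, 0.3, 1.7, 4.2):
    superposition = math.cos((omega + eps) * t) + math.cos((omega - eps) * t)
    envelope_form = 2 * math.cos(eps * t) * math.cos(omega * t)
    assert abs(superposition - envelope_form) < 1e-9
```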
Theorem 10.2. For any homogeneous linear DE, the set of all solutions is a vector space.
Theorem 10.2 is why homogeneous linear DEs are so nice. It says that if you know some
solutions, you can form linear combinations to build new solutions, with no extra work! This
is the key point of linearity in the homogeneous case. We will use it over and over
again in applications throughout the course.
10.2. Span. Last week we used the existence and uniqueness theorem to show that the
solutions to ẍ + x = 0 are exactly the linear combinations of cos t and sin t:
{all solutions to ẍ + x = 0} = {a cos t + b sin t, where a and b range over all numbers}.
(To be completely precise, we should say whether c1 , . . . , cn are ranging over real numbers only, or over all
complex numbers too. The answer depends on the context.)
Flashcard question: How many functions are in the set Span(cos t, sin t)?
Answer: Infinitely many! Here are some functions in this set: cos t, −5 sin t, 2 cos t + 3 sin t, and the zero function (take a = b = 0).
Problem 10.4. Let S be the set of polynomials p(t) whose degree is ≤ 2. Express S as a
span.
Solution: The set S is the set of polynomials of the form
at² + bt + c,
where a, b, c range over all numbers. Thus one possible answer is that S = Span(t², t, 1). 2
Problem 10.5. Let T be the set of all solutions to ẏ = 7y. Express T as a span.
Solution: The set T is the set of functions of the form ce7t , where c ranges over all numbers.
Thus one possible answer is that T = Span(e7t ). (The linear combinations of a single function
are just the scalar multiples of that function.) 2
Last week we showed that the set of all solutions to ẍ + x = 0 equals the set of the linear
combinations of two solutions. The same reasoning applies to any 2nd -order homogeneous
linear ODE.
Conclusion: For any 2nd -order homogeneous linear ODE, the set of solutions can be
expressed as the span of 2 solutions.
10.3. Linearly dependent functions. How do you know which two solutions to use? Will
the span of any two solutions give the set of all solutions? No! It turns out that most pairs
of solutions will work, but not every pair. To determine which pairs work, we need the notion
of linear dependence.
Example 10.6 (Why the span of two solutions might not equal the set of all solutions). The
functions cos t and 2 cos t are two solutions to ẍ + x = 0, but Span(cos t, 2 cos t) consists only
of the functions
a cos t + b(2 cos t) = (a + 2b) cos t,
which vary only over the scalar multiples of cos t. Thus
Span(cos t, 2 cos t) = Span(cos t).
The solution sin t is missing from this set, so this is not the set of all solutions to ẍ + x = 0.
Definition 10.7. Call two functions f and g linearly dependent when either f is a scalar
multiple of g or g is a scalar multiple of f . Call f and g linearly independent otherwise.
Warning: The definition of linear dependence for three or more functions is more compli-
cated. We’ll discuss it later.
Flashcard question: Which of the following is a pair of linearly independent functions?
• cos t, 2 cos t
• cos t, cos(t + π)
• cos t, cos(t − π/2)
• cos t, 0.
Answer: The third pair is linearly independent since cos(t − π/2) = sin t, and neither cos t
nor sin t is a scalar multiple of the other function (they have zeros in different places, for
instance). The second pair is linearly dependent since cos(t + π) = (−1)(cos t). The fourth
pair is linearly dependent since 0 = 0(cos t). 2
It turns out that in solving a 2nd -order homogeneous linear ODE, any two solutions can be
used to generate the others, provided that they are linearly independent:
Problem. Find the general solution to
ÿ + ẏ − 6y = 0. (7)
Solution: Try y = e^{rt}, where r is a constant to be determined. Let’s find out for which
constants r this is really a solution. We get ẏ = re^{rt} and ÿ = r²e^{rt}, so (7) becomes
r2 ert + rert − 6ert = 0
(r2 + r − 6)ert = 0.
This holds as an equality of functions if and only if
r2 + r − 6 = 0
(r − 2)(r + 3) = 0
r = 2 or r = −3.
So e2t and e−3t are solutions.
Luckily, neither is a constant times the other (the ratio is e2t /e−3t = e5t , a nonconstant
function); in other words, e2t and e−3t are linearly independent. Conclusion:
The set of all solutions to ÿ + ẏ − 6y = 0 is Span(e2t , e−3t ) .
An equivalent way to express this answer:
The general solution to ÿ + ẏ − 6y = 0 is c1 e2t + c2 e−3t , where c1 and c2 are parameters.
Remark 10.9. In the future, we’ll jump directly from the ODE to solving r² + r − 6 = 0, now
that we know how this works.
10.5. Basis. Here is a third way to express the answer to the previous problem:
The functions e2t and e−3t form a basis of the space of solutions to ÿ + ẏ − 6y = 0.
This terminology means that
• Span(e2t , e−3t ) is the space of solutions to ÿ + ẏ − 6y = 0, and
• e2t and e−3t are linearly independent.
Think of the functions e2t and e−3t in the basis as the “basic building blocks”:
• the first condition says that every solution can be built from e2t and e−3t by taking
linear combinations, and
• the second condition says that there is no redundancy in the list (neither building
block could have been built from the other one).
The plural of basis is bases, pronounced BAY-sees.
February 21
10.6. How to solve any second-order homogeneous linear ODE with constant
coefficients. To solve
a2 ÿ + a1 ẏ + a0 y = 0,
where a2 , a1 , a0 are constants (with a2 6= 0), do the following:
1. Write down the characteristic equation
a2 r2 + a1 r + a0 = 0,
in which the coefficient of ri is the coefficient of y (i) from the ODE. The polynomial
a2 r2 + a1 r + a0 is called the characteristic polynomial. (For example, ÿ + 5y = 0 has
characteristic polynomial r2 + 5.)
2. Solve the characteristic equation to list the complex roots with multiplicity. (To do
this, factor the characteristic polynomial, or complete the square, or use the quadratic
formula.)
3a. If the roots are distinct numbers r1 6= r2 then the functions
er1 t , er2 t
form a basis of the space of solutions to the ODE. In other words, the general solution is
c1 er1 t + c2 er2 t .
3b. If the roots are equal, r and r, then the functions
ert , tert
form a basis of the space of solutions to the ODE. In other words, the general solution is
c1 ert + c2 tert .
(We’ll explain later on why this works.) 2
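Steps 1–3 above can be sketched in Python; the function name char_roots is mine, not the notes':

```python
import cmath

def char_roots(a2, a1, a0):
    """Roots of the characteristic polynomial a2*r^2 + a1*r + a0 of
    a2*y'' + a1*y' + a0*y = 0, by the quadratic formula (complex allowed)."""
    disc = cmath.sqrt(a1 * a1 - 4 * a2 * a0)
    return (-a1 + disc) / (2 * a2), (-a1 - disc) / (2 * a2)

# y'' + y' - 6y = 0 (the earlier example): distinct roots 2 and -3,
# so the general solution is c1*e^{2t} + c2*e^{-3t}.
r1, r2 = char_roots(1, 1, -6)
assert {round(r1.real), round(r2.real)} == {2, -3}

# y'' - 10y' + 25y = 0: repeated root 5, so the basis is e^{5t}, t*e^{5t}.
r1, r2 = char_roots(1, -10, 25)
assert abs(r1 - r2) < 1e-9 and abs(r1 - 5) < 1e-9
```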
10.7. Complex roots. The general method in the previous section works even if some of
the roots are not real.
Question 10.11. Which basis should be used, eit , e−it or cos t, sin t ?
Answer: It depends:
• The basis eit , e−it is easier to calculate with, but it’s not immediately obvious which
linear combinations of these functions are real-valued.
• The basis cos t, sin t consisting of real-valued functions is useful for interpreting
solutions in a physical system. The general real-valued solution is c1 cos t + c2 sin t
where c1 , c2 are real numbers.
So we will be converting back and forth.
Complex basis vs. real-valued basis. Let y(t) be a complex-valued function of a real-
valued variable t. If y, ȳ is a basis for a space of functions, then Re(y), Im(y) is another
basis for the same space of functions, but having the advantage that it consists of real-valued
functions.
Proof. (Not done in detail in lecture.) Any linear combination of y and ȳ can be re-expressed
as a linear combination of Re(y) and Im(y) by substituting
y = Re(y) + i Im(y),    ȳ = Re(y) − i Im(y).
Conversely, any linear combination of Re(y) and Im(y) can be re-expressed as a linear
combination of y and ȳ by substituting
Re(y) = (y + ȳ)/2,    Im(y) = (y − ȳ)/(2i).
Thus Span(Re(y), Im(y)) = Span(y, ȳ).
To finish checking that Re(y), Im(y) is a basis for the space, we need to check that they
are linearly independent. If Im(y) were a scalar multiple of Re(y), say Re(y) = f (t) and
Im(y) = af (t), then y and ȳ would be multiples of f (t) too, so one of y and ȳ would be a
multiple of the other, so y and ȳ would be linearly dependent, which is nonsense since by
assumption they form a basis. Thus Im(y) cannot be a scalar multiple of Re(y). Similarly,
Re(y) cannot be a scalar multiple of Im(y). Thus Re(y), Im(y) are linearly independent.
Answer: Yes, but this would be less useful, because the whole point was to obtain a basis
consisting of real-valued functions.
Answer: No, because if y = f + ig, then ȳ = f − ig, so Re(y) and Re(ȳ) are both f ! They
are linearly dependent, so they can’t form a basis.
Example: For ÿ + 4ẏ + 13y = 0, the characteristic polynomial r² + 4r + 13 has roots −2 ± 3i,
so e^{(−2+3i)t}, e^{(−2−3i)t} is a basis. Writing y := e^{(−2+3i)t} = e^{−2t}(cos(3t) + i sin(3t)),
we get Re(y) = e^{−2t} cos(3t) and Im(y) = e^{−2t} sin(3t). Thus
e^{−2t} cos(3t), e^{−2t} sin(3t)
is another basis, this time consisting of real-valued functions. 2
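The arithmetic in this example can be re-checked with Python's cmath (an illustrative sketch, not part of the notes):

```python
import cmath
import math

# r^2 + 4r + 13 = 0 has roots -2 ± 3i by the quadratic formula:
disc = cmath.sqrt(4**2 - 4 * 13)      # sqrt(-36) = 6i
r = (-4 + disc) / 2
assert abs(r - (-2 + 3j)) < 1e-9

# y = e^{(-2+3i)t} has Re(y) = e^{-2t} cos 3t and Im(y) = e^{-2t} sin 3t:
t = 0.4
y = cmath.exp(r * t)
assert abs(y.real - math.exp(-2 * t) * math.cos(3 * t)) < 1e-12
assert abs(y.imag - math.exp(-2 * t) * math.sin(3 * t)) < 1e-12
```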
10.8. Harmonic oscillators and damped frequency. Let’s apply all this to the spring-
mass-dashpot system, assuming no external force.
If also there is no damping (b = 0), the DE is mẍ + kx = 0. The characteristic polynomial
mr² + k has roots ±iω, where ω := √(k/m), so the general real-valued solution is c1 cos(ωt) + c2 sin(ωt).
In other words, the real-valued solutions are all the sinusoid functions of angular frequency
ω. They could also be written as A cos(ωt − φ), where A and φ are real constants.
This system, or any other system governed by the same DE, is also called a simple harmonic
oscillator. The angular frequency ω is also called the natural frequency (or resonant frequency)
of the oscillator.
With the dashpot included, the DE is mẍ + bẋ + kx = 0.
Summary:
Case        Roots                           Situation
b = 0       two complex roots ±iω           undamped (simple harmonic oscillator)
b² < 4mk    two complex roots −s ± iωd      underdamped (damped oscillator)
b² = 4mk    repeated real root −s, −s       critically damped
b² > 4mk    distinct real roots −s1, −s2    overdamped
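The table can be turned into a small classifier (a sketch; the function name is mine, not the notes'):

```python
def damping_case(m, b, k):
    """Classify m*x'' + b*x' + k*x = 0 per the table above (assumes m, k > 0, b >= 0)."""
    if b == 0:
        return "undamped"
    disc = b * b - 4 * m * k
    if disc < 0:
        return "underdamped"
    if disc == 0:
        return "critically damped"
    return "overdamped"

assert damping_case(1, 0, 1) == "undamped"
assert damping_case(1, 1, 1) == "underdamped"        # 1 < 4
assert damping_case(1, 2, 1) == "critically damped"  # 4 == 4
assert damping_case(1, 3, 1) == "overdamped"         # 9 > 4
```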
February 23
Possible answers:
1. π√3
2. π/√3
3. 2π√3
4. 2π/√3
5. √3/π
6. √3/(2π)
7. None of the above
Answer: π/√3. Why? The solution has the same zeros as the sinusoid a cos(√3 t) +
b sin(√3 t) of angular frequency √3, period 2π/√3. But a sinusoid crosses 0 twice within
each period, so the answer is half a period, π/√3.
Try the “Damped Vibrations” mathlet
http://mathlets.org/mathlets/damped-vibrations/
11.2. Linearly dependent functions. Recall that two functions are called linearly depen-
dent when one of them is a scalar multiple of the other one. To motivate the definition for
more than two functions, first consider the following:
Answer: TRUE. Let’s explain why. The span of these three functions is the set of linear
combinations
a cos t + b sin t + c(3 cos t + 4 sin t).
But each such linear combination is also just a linear combination of cos t and sin t alone: for
example,
100 cos t + 10 sin t + 2(3 cos t + 4 sin t) = 106 cos t + 18 sin t.
Thus
Span(cos t, sin t, 3 cos t + 4 sin t) = Span(cos t, sin t),
which we already know is the set of all solutions to ẍ + x = 0. 2
Even though the statement was true, including 3 cos t + 4 sin t in the list was redundant:
it gave no new linear combinations. The general definition of linearly dependent functions
captures this notion of redundancy:
Example 11.3. The three functions cos t, sin t, and 3 cos t + 4 sin t are linearly dependent since
the third function is a linear combination of the first two.
Remark 11.4. When there are only two functions, f1 and f2 , then to say that one of them is a
linear combination of the others is the same as saying that one of them is a scalar multiple of
the other one. So the new definition is compatible with the earlier definition for two functions.
Key point: A vector space usually has infinitely many functions. To describe it compactly,
give a basis of the vector space.
11.4. Dimension. It turns out that, although a vector space can have different bases, each
basis has the same number of functions in it.
Definition 11.6. The dimension of a vector space is the number of functions in any basis.
Example 11.7. The space of solutions to ẍ + x = 0 is 2-dimensional since the basis cos t, sin t
has 2 functions. (The basis eit , e−it also has 2 functions.)
In these two examples, the dimension equals the order of the homogeneous linear ODE. It
turns out that this holds in general:
Dimension theorem for a homogeneous linear ODE. The dimension of the space of
solutions to an nth order homogeneous linear ODE is n.
In other words, the number of parameters needed in the general solution to an nth order
homogeneous linear ODE is n.
Remember when we proved that all solutions of ẍ + x = 0 were linear combinations of cos t
and sin t, by showing that no matter what the values of x(0) and ẋ(0), we could find a linear
combination of cos t and sin t that solved the DE with the same initial conditions? The same
idea proves the dimension theorem for any homogeneous linear ODE.
Let’s explain the idea for a 3rd -order ODE, in which we use 0 as starting time in the
existence and uniqueness theorem. Define f, g, h to be the solutions whose initial condition
triples (y(0), ẏ(0), ÿ(0)) are (1, 0, 0), (0, 1, 0), (0, 0, 1), respectively. Then af + bg + ch is
the solution with (y(0), ẏ(0), ÿ(0)) = (a, b, c),
so every solution y(t), no matter what its values of y(0), ẏ(0), ÿ(0) are, is some linear
combination af + bg + ch. Thus the set of all solutions to the DE is Span(f, g, h) .
If a, b, c are numbers such that y := af + bg + ch is the zero function, then its values of y(0),
ẏ(0), ÿ(0) must all be 0, which means that a, b, c are all 0. Thus f, g, h are linearly independent .
The previous two paragraphs imply that f, g, h is a basis for the space of solutions to the
DE. There are 3 functions in this basis, so the dimension of the space of solutions is 3.
11.5. Solving a homogeneous linear ODE with constant coefficients. Earlier we gave
a method to solve any second-order homogeneous linear ODEs with constant coefficients.
Now we do the same for nth order for any n.
Given
an y (n) + · · · + a1 ẏ + a0 y = 0, (8)
1. Write down the characteristic equation
a_n r^n + · · · + a_1 r + a_0 = 0,
in which the coefficient of r^i is the coefficient of y^{(i)} from the ODE. The left hand side
is called the characteristic polynomial p(r). (For example, ÿ + 5y = 0 has characteristic
polynomial r² + 5.)
2. Factor p(r) as
a_n (r − r_1)(r − r_2) · · · (r − r_n).
3. If the roots r_1, . . . , r_n are distinct, then e^{r_1 t}, . . . , e^{r_n t} form a basis of the
space of solutions, so the general solution is
c_1 e^{r_1 t} + · · · + c_n e^{r_n t}.
4. If r_1, . . . , r_n are not distinct, then e^{r_1 t}, . . . , e^{r_n t} cannot be a basis since some of these
functions are redundant (definitely not linearly independent!). If a particular root r is
repeated m times, then replace the m copies
e^{rt}, e^{rt}, e^{rt}, . . . , e^{rt}
by
e^{rt}, te^{rt}, t²e^{rt}, . . . , t^{m−1}e^{rt}.
In all cases, the n resulting functions form a basis for the space of solutions.
For example, suppose that a 6th order ODE has characteristic roots (with multiplicity)
0, 0, 0, 0, −3, −3.
We need to replace the first block of four functions, and also the last block of two functions.
So the correct basis is
1, t, t², t³, e^{−3t}, te^{−3t},
and the general solution is
c1 + c2 t + c3 t² + c4 t³ + c5 e^{−3t} + c6 te^{−3t}.
(As expected, there is a 6-dimensional space of solutions to this 6th order ODE.) 2
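The replacement rule can be sketched programmatically; basis_labels is a hypothetical helper name of mine, not from the notes:

```python
from collections import Counter

def basis_labels(roots):
    """Basis e^{rt}, t e^{rt}, ..., t^{m-1} e^{rt} for each root r repeated m times."""
    labels = []
    for r, m in Counter(roots).items():   # Counter preserves first-seen order
        for j in range(m):
            tpart = "" if j == 0 else ("t " if j == 1 else f"t^{j} ")
            labels.append(f"{tpart}e^({r}t)")
    return labels

# Roots 0, 0, 0, 0, -3, -3 give the basis 1, t, t^2, t^3, e^{-3t}, t e^{-3t}
# (the labels below show e^(0t), which equals 1):
print(basis_labels([0, 0, 0, 0, -3, -3]))
```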
Problem 11.10. Find the simplest constant-coefficient homogeneous linear ODE having
(5t + 7)e−t − 9e2t as one of its solutions.
Solution: The given function is a linear combination of
e−t , te−t , e2t
so the roots of the characteristic polynomial (with multiplicity) should include −1, −1, 2. So
the simplest characteristic polynomial is
(r + 1)(r + 1)(r − 2) = r3 − 3r − 2
and the corresponding ODE is
y (3) − 3ẏ − 2y = 0. 2
Remember when for a second-order ODE whose characteristic polynomial had complex
roots we got a basis like
e(−2+3i)t , e(−2−3i)t ,
consisting of a complex-valued function y and its complex conjugate ȳ? We explained that it
was OK to replace y, ȳ by a new basis Re y, Im y consisting of real-valued functions.
We can do the same replacement even if y, ȳ is just part of a basis.
February 26
Midterm 1
February 28
Problem. Find a basis, consisting of real-valued functions, for the space of solutions to y^{(3)} + 3ÿ + 9ẏ − 13y = 0.
Solution: The characteristic polynomial is p(r) := r3 + 3r2 + 9r − 13. Checking the divisors
of −13 (as instructed by the rational root test), we find that 1 is a root, so r − 1 is a factor.
Long division (or solving for unknown coefficients) produces the other factor:
p(r) = (r − 1)(r2 + 4r + 13).
Roots: 1, −2 + 3i, −2 − 3i.
Basis: et , e(−2+3i)t , e(−2−3i)t .
Leave e^t as is but replace y := e^{(−2+3i)t} and ȳ = e^{(−2−3i)t} by Re y, Im y to get a new basis
et , e−2t cos(3t), e−2t sin(3t)
consisting of real-valued functions. 2
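The root-finding in this solution can be re-checked in Python (an illustrative sketch, not part of the notes):

```python
import cmath

# p(r) = r^3 + 3r^2 + 9r - 13. Rational root test: try the divisors of -13.
def p(r):
    return r**3 + 3 * r**2 + 9 * r - 13

rational_root = next(d for d in (1, -1, 13, -13) if p(d) == 0)
assert rational_root == 1

# Dividing out (r - 1) leaves r^2 + 4r + 13; the quadratic formula gives -2 ± 3i.
disc = cmath.sqrt(4**2 - 4 * 13)
other = (-4 + disc) / 2
assert abs(other - (-2 + 3j)) < 1e-9
assert abs(p(other)) < 1e-9   # the complex root really is a root of p
```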
12. Inhomogeneous linear ODEs with constant coefficients
12.1. Operator notation. Let D denote the differentiation operator d/dt. For example, the ODE 2ÿ + 3ẏ + 5y = 0 can be written in operator notation as
(2D² + 3D + 5)y = 0. 2
The same argument shows that every constant-coefficient homogeneous linear ODE
an y (n) + · · · + a0 y = 0
can be written simply as
p(D) y = 0,
where p is the characteristic polynomial.
(In the example above, p(r) = 2r² + 3r + 5 and p(D) = 2D² + 3D + 5.)
Remark 12.2. Any homogeneous linear ODE can be written as Ly = 0 for some linear operator
L, even if the coefficients are nonconstant functions of t.
Remark 12.3. Delaying an input signal F (t) in time by a seconds gives a new input signal
F (t − a) (the minus sign is correct: the new input signal has the value at t = a that the old
input signal has at t = 0).
When p is a polynomial with constant coefficients, then p(D) is time-invariant, which means
that
if f (t) is a solution to p(D) x = F (t)
and a is a number,
then f (t − a) is a solution to p(D) x = F (t − a).
In words: if an input signal F (t) is delayed in time by a seconds, then the output signal is
delayed by a seconds.
This can simplify the solution to some DEs:
One answer is
x(t − π/2) = (1/2) cos(t − π/2) + (1/2) sin(t − π/2)
= (1/2) sin t − (1/2) cos t.
A system that is defined by a linear time-invariant operator is called an LTI system.
12.4. Shortcut for applying an operator to an exponential function. Since D e^{rt} = re^{rt}
and D² e^{rt} = r² e^{rt}, applying a polynomial in D to e^{rt} multiplies it by the same polynomial
in r. The same calculation, but with an arbitrary polynomial, proves the general rule:
p(D) e^{rt} = p(r) e^{rt}.
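The shortcut can be verified numerically by approximating D with finite differences; this is an illustrative check only, with an arbitrarily chosen p, r, and t:

```python
import math

# Check p(D) e^{rt} = p(r) e^{rt} for p(D) = D^2 + 7D + 12, r = 2,
# approximating the derivatives of f(t) = e^{2t} by central differences.
r, t, h = 2.0, 0.3, 1e-5
f = lambda s: math.exp(r * s)
fp = (f(t + h) - f(t - h)) / (2 * h)               # f'(t)
fpp = (f(t + h) - 2 * f(t) + f(t - h)) / h**2      # f''(t)
lhs = fpp + 7 * fp + 12 * f(t)                     # p(D) f evaluated at t
rhs = (r**2 + 7 * r + 12) * f(t)                   # p(r) e^{rt}
assert abs(lhs - rhs) < 1e-4
```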
Solution:
Characteristic polynomial: p(r) = r2 − 10r + 25 = (r − 5)2 .
Roots: 5, 5.
Basis: e5t , te5t . 2
But why does this work? Using operators, we can now explain!
In operator form, the DE is
(D − 5)2 y = 0.
The calculation
shortcut
(D − 5)2 e5t = p(D) e5t = p(5)e5t = 0e5t = 0
shows that e5t is one solution. But the DE is second-order, so the basis should have two
functions. Taking the second function to be ce5t for a constant c does not give a basis, since
e5t , ce5t are linearly dependent.
Let’s try variation of parameters! Plug in y = ue^{5t}, where u is a function to be determined.
To calculate (D − 5)² ue^{5t}, let’s apply D − 5 twice:
(D − 5) ue^{5t} = (u̇e^{5t} + 5ue^{5t}) − 5ue^{5t}
= u̇e^{5t}.
Similarly, applying D − 5 again gives (D − 5)² ue^{5t} = üe^{5t}, which is 0 exactly when ü = 0,
i.e., when u = c1 + c2 t. Thus the general solution is (c1 + c2 t)e^{5t}, which explains the basis e^{5t}, te^{5t}.
Remark 12.7. You don’t have to go through the discussion of this section each time you want
to solve p(D)y = 0; we are just explaining why the method given earlier actually works.
Exponential response formula (ERF): if p(r) ≠ 0, then (1/p(r)) e^{rt} is a particular solution
to p(D) y = e^{rt}. In other words, multiply the input signal by the number 1/p(r) to get an output signal.
Problem 12.8. Find the general solution to ÿ + 7ẏ + 12y = −5e2t .
Solution:
Characteristic polynomial: p(r) = r2 + 7r + 12 = (r + 3)(r + 4).
Roots: −3, −4.
General solution to homogeneous equation: yh := c1 e−3t + c2 e−4t .
ERF says:
(1/p(2)) e^{2t} is a particular solution to p(D) y = e^{2t};
i.e.,
(1/30) e^{2t} is a particular solution to ÿ + 7ẏ + 12y = e^{2t},
so
−(1/6) e^{2t} is a particular solution to ÿ + 7ẏ + 12y = −5e^{2t};
call this yp. The general solution is
y = yp + yh
= −(1/6) e^{2t} + c1 e^{−3t} + c2 e^{−4t}. 2
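The ERF arithmetic for this problem can be re-checked quickly (an illustrative sketch, not part of the notes):

```python
# ERF for y'' + 7y' + 12y = -5 e^{2t}: a particular solution is (-5/p(2)) e^{2t}.
def p(r):
    return r**2 + 7 * r + 12

C = -5 / p(2)   # coefficient of e^{2t} in y_p
assert p(2) == 30
assert abs(C - (-1 / 6)) < 1e-12
# Check: for y = C e^{2t}, y'' + 7y' + 12y = C(4 + 14 + 12) e^{2t} = 30C e^{2t} = -5 e^{2t}.
assert abs(30 * C + 5) < 1e-12
```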
March 2
p(D) y = ert
should have a solution even if p(r) = 0 (when ERF does not apply). Here is how to find a
particular solution in this bad case:
Generalized ERF: if r is a root of p with multiplicity m (i.e., p(r) = · · · = p^{(m−1)}(r) = 0
but p^{(m)}(r) ≠ 0), then
(t^m / p^{(m)}(r)) e^{rt} is a particular solution to p(D) y = e^{rt}.
In other words, multiply the input signal by t^m, and then multiply by the number 1/p^{(m)}(r),
where p^{(m)} is the mth derivative of p.
(The proof that this works involves a generalized shortcut, obtained by applying ∂/∂R to the shortcut
equation
p(D) e^{Rt} = p(R)e^{Rt}
m times and then setting the variable R equal to the number r. We’ll skip it.)
Generalized ERF comes up less often than regular ERF, since in most applications, p(r) 6= 0.
12.7. Sinusoidal response (complex replacement). Suppose that p is a real polyno-
mial, and ω is a real number. Question: if zp is a complex-valued particular solution to
p(D) z = e^{iωt}, what is a particular solution to p(D) x = cos ωt?
Answer: Taking real parts of both sides shows that Re(zp ) works.
The observation above leads to the complex replacement method, which we now explain.
1. Replace the input signal cos ωt by the complex input signal e^{iωt} (whose real part is
cos ωt), to get the complex replacement DE p(D) z = e^{iωt}.
2. Find a particular solution zp to the complex replacement DE. (Use ERF for this,
provided that p(iω) ≠ 0.)
3. Take the real part: xp := Re(zp ). Then xp is a particular solution to the original DE.
Recall the key equations: c̄ = Ae^{iφ} = a + bi.
Converting to amplitude-phase form. The number −2 + 2i has absolute value 2√2 and
angle 3π/4, so
−2 + 2i = 2√2 e^{i(3π/4)}
c = 1/(−2 + 2i) = (1/(2√2)) e^{i(−3π/4)}
c̄ = (1/(2√2)) e^{i(3π/4)},
which is supposed to be Ae^{iφ}. Thus the amplitude is A = 1/(2√2) = √2/4 and the phase lag is
φ = 3π/4. Conclusion: In amplitude-phase form,
xp = (√2/4) cos(2t − 3π/4) .
Converting instead to the linear combination form: c̄ = (1/(2√2)) e^{i(3π/4)} = −1/4 + (1/4)i,
which is supposed to be a + bi, so
xp = −(1/4) cos 2t + (1/4) sin 2t .
the input signal is cos ωt and the output signal is the steady-state solution xp . Here is what
happens in general, assuming p(iω) 6= 0:
• The complex replacement ODE
p(D) z = eiωt
has complex input signal e^{iωt} and complex output signal (1/p(iω)) e^{iωt} by ERF, so
complex gain G = 1/p(iω).
Remark 12.11. What happens if we change the input signal cos ωt to a different sinusoidal
function of angular frequency ω? As mentioned when we introduced complex gain, this
multiplies the complex input signal and the complex output signal by the same complex
number, so the complex gain is the same. Thus the complex gain, gain, and phase lag are
given by the same formulas as above: they depend only on the system and on ω.
Complex replacement is helpful also with other real input signals, with any real-valued
function that can be written as the real part of a reasonably simple complex input signal.
Here are some examples:
Real input signal        Complex replacement
cos ωt                   e^{iωt}
A cos(ωt − φ)            Ae^{−iφ} e^{iωt}
a cos ωt + b sin ωt      (a − bi)e^{iωt}
e^{at} cos ωt            e^{(a+iω)t}
Each function in the first column is the real part of the corresponding function in the
second column. The nice thing about these examples is that the complex replacement is a
constant times a complex exponential, so ERF (or generalized ERF) applies.
12.9. Stability.
Problem 12.12. What is the general solution to ẍ + 7ẋ + 12x = cos 2t?
Solution: As before, the roots of p(r) = r² + 7r + 12 are −3, −4, so
xh = c1 e^{−3t} + c2 e^{−4t}.
Complex replacement and ERF give the particular solution xp = Re((1/p(2i)) e^{2it}), where
p(2i) = (2i)² + 7(2i) + 12 = 8 + 14i. Thus
x = xp + xh = Re((1/(8 + 14i)) e^{2it}) + c1 e^{−3t} + c2 e^{−4t},
in which the first term is the steady-state solution and the remaining terms form a transient. 2
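The complex gain formula G = 1/p(iω) applied to this problem can be re-checked with cmath (an illustrative sketch, not part of the notes):

```python
import cmath

# For x'' + 7x' + 12x = cos(2t): complex gain G = 1/p(i*omega) with omega = 2.
def p(r):
    return r**2 + 7 * r + 12

G = 1 / p(2j)
assert abs(p(2j) - (8 + 14j)) < 1e-12   # p(2i) = -4 + 14i + 12 = 8 + 14i
gain = abs(G)                            # amplitude scale factor of the steady state
phase_lag = -cmath.phase(G)              # equals arg p(2i)
assert abs(gain - 1 / abs(8 + 14j)) < 1e-12
assert abs(phase_lag - cmath.phase(8 + 14j)) < 1e-12
```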
In general, for a forced damped oscillator, complex replacement and ERF will produce a
periodic output signal, and that particular solution is called the steady-state solution. Every
other solution is the steady-state solution plus a transient, where the transient is a function
that decays to 0 as t → +∞. As time progresses, the solution x(t) will approximate the
steady-state solution more and more closely.
Changing the initial conditions gives a new solution xnew (t). But this changes only the c1 , c2
above, so xnew (t) will approximate the same steady-state solution that x(t) approximates, and
xnew (t) − x(t) will tend to 0 as t → +∞. A system like this, in which changes in the initial
conditions have vanishing effect on the long-term behavior of the solution (i.e., xnew (t) − x(t)
tends to 0 as t → +∞), is called stable.
More generally:
Theorem 12.13 (Stability test in terms of roots). A constant-coefficient linear ODE of any
order is stable if and only if every root of the characteristic polynomial has negative real part.
12.9.3. Testing a second-order system for stability in terms of coefficients. In the 2nd -order
case, there is also a simple test directly in terms of the coefficients:
Theorem 12.14 (Stability test in terms of coefficients, 2nd -order case). Assume that a0 , a1 , a2
are real numbers with a0 > 0. The ODE
(a0 D2 + a1 D + a2 )x = F (t)
is stable if and only if a1 > 0 and a2 > 0.
Proof. By dividing by a0 , we can assume that a0 = 1. Break into cases according to the table
above.
• When the roots are a ± bi, we have a < 0 if and only if the coefficients −2a and a2 + b2
are both positive.
• When the roots are s, s, we have s < 0 if and only if the coefficients −2s and s2 are
both positive.
• When the roots are r1 , r2 , we have r1 , r2 < 0 if and only if the coefficients −(r1 + r2 )
and r1 r2 are both positive. (Knowing that −(r1 + r2 ) is positive means that at least
one of r1 , r2 is negative; if moreover the product r1 r2 is positive, then the other root
must be negative too.)
Summary: In all three cases, the roots have negative real part if and only if the coefficients of
the characteristic polynomial are positive. So to test for stability, instead of checking whether
the roots have negative real part, we can check whether the coefficients of the characteristic
polynomial are positive.
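The equivalence proved above can be spot-checked numerically. This sketch (with helper names of my own choosing) compares the root test of Theorem 12.13 against the coefficient test of Theorem 12.14 on random second-order ODEs.

```python
import numpy as np

# Compare the two stability tests for a0*x'' + a1*x' + a2*x = F(t), a0 > 0.
def stable_by_roots(a0, a1, a2):
    # Stable iff every root of the characteristic polynomial has negative real part.
    return all(r.real < 0 for r in np.roots([a0, a1, a2]))

def stable_by_coefficients(a0, a1, a2):
    # Stable iff a1 > 0 and a2 > 0 (given a0 > 0).
    return a1 > 0 and a2 > 0

rng = np.random.default_rng(0)
for _ in range(1000):
    a0 = rng.uniform(0.1, 5.0)
    a1, a2 = rng.uniform(-5.0, 5.0, size=2)
    assert stable_by_roots(a0, a1, a2) == stable_by_coefficients(a0, a1, a2)
print("tests agree on 1000 random ODEs")
```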
There is a generalization of the coefficient test to higher-order ODEs, called the Routh–Hurwitz conditions
for stability, but the conditions are much more complicated.
March 5
12.10. Resonance. Recall that a harmonic oscillator has a natural frequency. Resonance is
a phenomenon that occurs when a harmonic oscillator is driven with an input sinusoid whose
frequency is close to or equal to the natural frequency.
• “Near resonance” (frequency is close to the natural frequency): The oscillations of
the output signal will be much larger than the oscillations of the input signal — the
gain will be large. In fact, the closer that the input frequency gets to the natural
frequency, the larger the gain becomes.
• “Pure resonance” (frequency is equal to the natural frequency): The oscillations grow
with time, and there is no steady-state solution. But in a realistic physical situation,
there is at least a tiny amount of damping, which prevents the runaway growth, so
that the oscillations are bounded (but still large).
We now explain all of this by solving ODEs explicitly.
12.10.1. Warm-up: harmonic oscillator with no input signal. A typical ODE modeling a
harmonic oscillator is
ẍ + 9x = 0.
Characteristic polynomial: r² + 9.
Roots: ±3i.
Basis of solutions: e^{3it}, e^{−3it}.
Real-valued basis: cos 3t, sin 3t.
General real-valued solution: a cos 3t + b sin 3t, for real numbers a, b. These are all the sinusoids with angular frequency 3. The natural frequency is 3.
12.10.2. Near resonance. Now let’s drive the harmonic oscillator with an input sinusoid. A
typical ODE modeling this situation is
ẍ + 9x = cos ωt,
where cos ωt is the input signal.
Complex replacement: z̈ + 9z = e^{iωt}.
Here p(r) = r² + 9, so ERF (valid as long as ω ≠ 3) gives
z_p := (1/p(iω)) e^{iωt} = (1/(9 − ω²)) e^{iωt},
and taking real parts gives
x_p := (1/(9 − ω²)) cos ωt.
Complex gain: G = 1/p(iω) = 1/(9 − ω²).
Gain: |G| = 1/|9 − ω²|. This becomes very large as ω approaches 3.
Phase lag: −arg G, which is 0 or π depending on whether ω < 3 or ω > 3.
In engineering, the graph of gain as a function of ω is called a Bode plot (Bode is pronounced
Boh-dee). (Actually, engineers usually instead use a log-log plot: they plot log(gain) as a function of log ω).
On the other hand, a Nyquist plot shows the trajectory of the complex gain G as ω varies.
Also try the “Harmonic Frequency Response: Variable Natural Frequency” mathlet
http://mathlets.org/mathlets/
harmonic-frequency-response-variable-natural-frequency/
(In this one, the input signal is fixed to be sin t, and the natural frequency is adjustable.)
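A few sample values (a numerical aside, not from the lecture) make the blow-up of the gain 1/|9 − ω²| near ω = 3 concrete:

```python
# Gain of x'' + 9x = cos(omega t): |G| = 1/|9 - omega^2|.
def gain(omega):
    return 1 / abs(9 - omega ** 2)

# The gain grows without bound as omega approaches the natural frequency 3.
for omega in [1.0, 2.0, 2.9, 2.99, 2.999]:
    print(f"omega = {omega:6.3f}   gain = {gain(omega):10.3f}")

assert gain(2.999) > gain(2.99) > gain(2.9) > gain(2.0)
```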
12.10.3. Pure resonance.
12.10.4. Resonance with damping. In a realistic physical situation, there is at least a tiny
amount of damping, and this prevents the runaway growth of the previous section.
Question 12.16. What happens if ω = 3 exactly, but there is a tiny amount of damping, so
that the ODE is
ẍ + bẋ + 9x = cos ωt
(damping term bẋ; input signal cos ωt)
for some small positive constant b?
12.11. RLC circuits. Let’s model a circuit with a voltage source, resistor, inductor, and
capacitor attached in series: an RLC circuit.
t : time (s)
R : resistance of the resistor (ohms)
L : inductance of the inductor (henries)
C : capacitance of the capacitor (farads)
Q : charge on the capacitor (coulombs)
I : current (amperes)
V : voltage source (volts)
VR : voltage drop across the resistor (volts)
VL : voltage drop across the inductor (volts)
VC : voltage drop across the capacitor (volts).
I = Q̇
V_R = R I      (Ohm's law)
V_L = L İ      (Faraday's law)
V_C = (1/C) Q
V = V_R + V_L + V_C      (Kirchhoff's voltage law).
The last equation, V_L + V_R + V_C = V, becomes
L Q̈ + R Q̇ + (1/C) Q = V(t),
a second-order inhomogeneous linear ODE with unknown function Q(t). Mathematically, this is equivalent to the spring-mass-dashpot ODE mẍ + bẋ + kx = F(t), with L, R, 1/C, Q(t), V(t) playing the roles of m, b, k, x(t), F(t).
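For a concrete feel (with made-up component values, not from the lecture), one can compute the steady-state amplitude of Q from the complex gain 1/p(iω), where p(r) = Lr² + Rr + 1/C, and watch it peak near the natural frequency 1/√(LC):

```python
import numpy as np

# Steady-state charge amplitude for L Q'' + R Q' + (1/C) Q = V0 cos(omega t),
# via the complex gain 1/p(i omega).  Component values are illustrative only.
L_, R_, C_, V0 = 1.0, 0.5, 0.25, 1.0

def charge_amplitude(omega):
    p = L_ * (1j * omega) ** 2 + R_ * (1j * omega) + 1 / C_
    return V0 * abs(1 / p)

# With light damping, the amplitude peaks near 1/sqrt(L C) = 2.
omegas = np.linspace(0.1, 6, 1000)
peak = omegas[np.argmax([charge_amplitude(w) for w in omegas])]
print(f"amplitude is largest near omega = {peak:.2f}")
assert abs(peak - 2.0) < 0.3
```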
• Kirchhoff’s current law says that at each junction, the current flowing in equals the
current flowing out; applying this at the junction at the top gives
I3 = I1 + I2
V − R2 I3 − L1 I˙1 − R1 I1 = 0
V − R2 I3 − L2 I˙2 = 0.
(As one goes around the left loop counterclockwise, the electric potential increases by
V as one crosses the voltage source, and then decreases as one crosses the resistor R2
since the current I3 flows from high potential to low potential, and so on.)
13.2. Definitions.
Flashcard question: Consider the system
ẋ = 2t2 x + 3y
ẏ = 5x − 7et y
involving two unknown functions, x(t) and y(t). Which of the following describes this system?
Possible answers:
• first-order homogeneous linear system of ODEs
• second-order homogeneous linear system of ODEs
• first-order inhomogeneous linear system of ODEs
• second-order inhomogeneous linear system of ODEs
• first-order homogeneous linear system of PDEs
• second-order homogeneous linear system of PDEs
• first-order inhomogeneous linear system of PDEs
• second-order inhomogeneous linear system of PDEs
Answer: It’s a first-order homogeneous linear system of ODEs. The system is first-order
since it involves only the first derivatives of the unknown functions. This is a homogeneous
linear system since every summand is a function of t times one of x, ẋ, . . . , y, ẏ, . . . . (If there were also terms depending on t alone, then it would be an inhomogeneous linear system.)
The equations are ODEs since the functions are still functions of only one variable, t. 2
March 7
13.4. Theory. Before trying to solve such systems, we might want to have some assurance
that solutions exist. Fortunately, as in the case of a single linear ODE, they do:
Existence and uniqueness theorem for a linear system of ODEs. Let A(t) be a
matrix-valued function and let q(t) be a vector-valued function, both continuous on an open
interval I. Let a ∈ I, and let b be a vector. Then there exists a unique solution x(t) to the
system
ẋ = A(t) x + q(t)
satisfying the initial condition x(a) = b.
(Of course, the sizes of these matrices and vectors should match in order for this to make sense.)
Remark 13.1. Write x = (x1, . . . , xn) and b = (b1, . . . , bn) (as column vectors). Then the vector initial condition x(a) = b amounts to n scalar initial conditions: x1(a) = b1, . . . , xn(a) = bn.
Once the system ẋ = A(t) x and the starting time a are fixed, there is a one-to-one
correspondence
{solutions to ẋ = A(t) x} ←→ {possibilities for b}
under which each solution x(t) corresponds to its initial condition vector b := x(a) (the
existence and uniqueness theorem says that for each b, there is one solution x(t)). Adding
solutions corresponds to adding their b vectors, and scalar multiplication of solutions corre-
sponds to scalar multiplication of their b vectors too. Therefore the concepts of span, linear
independence, basis, and dimension on the left side correspond to the same concepts on the
right side. In particular,
Dimension theorem for a homogeneous linear system of ODEs. For any first-order
homogeneous linear system of n ODEs in n unknown functions
ẋ = A(t) x,
the space of solutions is an n-dimensional vector space.
13.5. Converting a second-order ODE to a system of two first-order ODEs.
Problem 13.2. Convert ẍ + 5ẋ + 6x = 0 into a system of two first-order ODEs.
Solution: Introduce a new function variable y := ẋ. Now try to express the derivatives ẋ
and ẏ in terms of x and y:
ẋ = y
ẏ = ẍ = −5ẋ − 6x = −6x − 5y.
In matrix form, this is ẋ = Ax with A = [0, 1; −6, −5]. □
(The matrix [0, 1; −6, −5] arising this way is called the companion matrix of the polynomial r² + 5r + 6.)
Remark 13.3. For constant-coefficient ODEs, the characteristic polynomial of the second-order ODE (scaled,
if necessary, to have leading coefficient 1) equals the characteristic polynomial (to be defined soon) of the
matrix of the first-order system.
Remark 13.4. Given a 3rd -order ODE with unknown function x, we can convert it to a system
of first-order ODEs by introducing y := ẋ and z := ẍ. In general, we can convert an nth -order
ODE to a system of n first-order ODEs.
Remark 13.5. One can also convert systems of higher-order ODEs to systems of first-order
ODEs. For example, a system of 4 fifth-order ODEs can be converted to a system of 20
first-order ODEs. That’s why it’s enough to study first-order systems.
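Here is a small numerical illustration of the conversion (the matrix is the companion matrix above): the eigenvalues of the companion matrix agree with the roots of r² + 5r + 6, as Remark 13.3 predicts.

```python
import numpy as np

# Companion matrix of r^2 + 5r + 6 = (r + 2)(r + 3), from converting
# x'' + 5x' + 6x = 0 into the system x' = y, y' = -6x - 5y.
A = np.array([[0.0, 1.0],
              [-6.0, -5.0]])
eigenvalues = np.linalg.eigvals(A)
assert np.allclose(sorted(eigenvalues.real), [-3.0, -2.0])
assert np.allclose(eigenvalues.imag, 0.0)
print("eigenvalues:", sorted(eigenvalues.real))
```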
13.6. Converting a system of two first-order ODEs to a second-order ODE. Con-
versely, given a system of two first-order ODEs, one can eliminate function variables to find a
second-order ODE satisfied by one of the functions.
For example, given the system
ẋ = 2x − y
ẏ = 5x + 7y,
let us find a second-order ODE satisfied by x.
Solution: Solve for y in the first equation (y = 2x − ẋ) and substitute into the second:
2ẋ − ẍ = 5x + 7(2x − ẋ).
This simplifies to
ẍ − 9ẋ + 19x = 0. □
Remark 13.7. First-order systems with more than two equations can be converted too, but
the conversion is not so easy.
Remark 13.8. In principle, we could solve a first-order linear system of ODEs by first converting
it in this way. But usually it is better just to leave it as a system.
For a constant square matrix A, we look for solutions to ẋ = Ax of the form
x = e^{λt} v,
where λ is a number and v is a nonzero constant vector. (Note: In contrast with ce^{at}, we put the e^{λt} first when writing e^{λt} v in order to follow the convention of writing the scalar first in a scalar-vector multiplication. Some people nevertheless write v e^{λt} instead of e^{λt} v; it means the same thing.)
Question 14.1. For which pairs (λ, v) consisting of a scalar and a nonzero vector is the
vector-valued function x = eλt v a solution to the system ẋ = Ax?
Solution: Plug it in, to see what has to happen in order for it to be a solution: substituting x = e^{λt} v into ẋ = Ax gives
λ e^{λt} v = e^{λt} Av.
Interchanging sides and dividing by e^{λt} (also a reversible operation) shows that this is equivalent to
Av = λv.
Problem 14.3. Let A = [1, −2; −1, 0] and let v = (2, −1). Is v an eigenvector of A?
In order to find eigenvalues and eigenvectors of a matrix, we need a few concepts from
linear algebra.
Key property: Iv = v for any vector v. (Check this yourself in the 2 × 2 case!)
Definition 14.4. The trace of a square matrix A is the sum of the entries along the diagonal.
It is denoted tr A.
Example 14.5. If A = [4, 6, 9; 1, 7, 8; 2, 3, 5], then tr A = 4 + 7 + 5 = 16.
14.5. Determinant. Covered in recitation.
To each square matrix A is associated a number det A called the determinant.
Key property: Av = 0 has a nonzero solution v if and only if det A = 0.
(Thus the determinant “determines” whether a system of linear equations has a nonzero
solution.) In the 2 × 2 case, the determinant is given by the formula
det [a, b; c, d] = ad − bc.
Warning: Trace and determinant make sense only for square matrices.
Definition 14.6. The characteristic polynomial of a square matrix A is det(λI − A).
The reason for this definition will be clear in the next section, when we show how to compute eigenvalues.
(Warning: This is not the same concept as the characteristic polynomial of a constant-coefficient linear
ODE, but there is a connection, arising when such a DE is converted to a first-order system of linear ODEs.)
Remark 14.7. We often calculate the characteristic polynomial using det(A − λI) instead.
This turns out to be the same as det(λI − A), except negated when n is odd. (The reason is that
changing the signs of all n rows of the matrix A − λI flips the sign of the determinant n times.) Usually we
care only about the roots of the polynomial, so negating the whole polynomial doesn’t make
a difference. In any case, det(A − λI) = det(λI − A) for 2 × 2 matrices (since 2 is even).
Problem 14.8. What is the characteristic polynomial of A := [7, 2; 3, 5]?
Solution: We have
A − λI = [7, 2; 3, 5] − [λ, 0; 0, λ] = [7 − λ, 2; 3, 5 − λ],
so the characteristic polynomial is
det(A − λI) = (7 − λ)(5 − λ) − 6 = λ² − 12λ + 29. □
Remark 14.10. Suppose that n > 2. Then, for an n × n matrix A, the characteristic polynomial has the form
det(λI − A) = λ^n − (tr A) λ^{n−1} + · · · ± (det A),
where the ± is + if n is even, and − if n is odd. So knowing tr A and det A determines some of the coefficients of the characteristic polynomial, but not all of them.
How can we decide whether a given number, say 5, is an eigenvalue of a square matrix A? By definition, 5 is an eigenvalue if and only if there is a nonzero vector v such that Av = 5v. Rewrite this equation with reversible steps:
Av = 5v
5v − Av = 0
5Iv − Av = 0
(5I − A)v = 0.
So the following are equivalent:
• 5 is an eigenvalue of A.
• The system (5I − A)v = 0 has a nonzero solution v.
• det(5I − A) = 0.
• Evaluating the characteristic polynomial det(λI − A) at 5 gives 0.
• 5 is a root of the characteristic polynomial. □
The same test works for any number in place of 5. (Now that we know how this works, we never again have to go through the argument above.) Conclusion: the eigenvalues of A are exactly the roots of its characteristic polynomial.
Remark 14.13. In this example, the matrix equation became two copies of the same equation
−v − 2w = 0. More generally, for any 2 × 2 matrix A and eigenvalue λ, one of the two
equations will be a scalar multiple of the other, so again we need to consider only one of
them. In particular, the system of two equations will always have a nonzero solution (as
there must be, by definition of eigenvalue).
To summarize:
Steps to find all the eigenvectors associated to a given eigenvalue λ of a 2 × 2 matrix A:
1. Calculate A − λI.
2. Expand (A − λI)v = 0 using v = (v, w); this gives a system of two equations in v and w.
3. Solve the system; one of the equations will be redundant, so nonzero solutions will exist.
4. The solution vectors (v, w) are the eigenvectors associated to λ.
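The steps can be carried out numerically; the sketch below applies them to the matrix A = [1, −2; −1, 0] from Problem 14.3 and its eigenvalue 2 (NumPy is used only to check the arithmetic).

```python
import numpy as np

# Follow the steps above for A = [[1, -2], [-1, 0]] and eigenvalue 2.
A = np.array([[1.0, -2.0],
              [-1.0, 0.0]])
lam = 2.0
M = A - lam * np.eye(2)            # step 1: A - 2I has rows (-1, -2), (-1, -2)
# Steps 2-3: the two equations -v - 2w = 0 are redundant, so v = (-2, 1)
# (and its scalar multiples) are the nonzero solutions.
v = np.array([-2.0, 1.0])
assert np.allclose(M @ v, 0.0)     # v is in the kernel of A - 2I
assert np.allclose(A @ v, lam * v) # step 4: v is an eigenvector for 2
print("v = (-2, 1) is an eigenvector of A with eigenvalue 2")
```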
14.9. Solving a 2×2 homogeneous linear system of ODEs with constant coefficients.
Steps to find a basis of solutions to ẋ = Ax, given a 2 × 2 constant matrix A with distinct
eigenvalues:
1. Compute the characteristic polynomial det(λI − A) or det(A − λI) or λ² − (tr A)λ + (det A).
2. Find the roots λ1 and λ2 of the characteristic polynomial; these are the eigenvalues.
3. Solve (A − λ1 I)v = 0 to find a nonzero eigenvector v1 associated with λ1 . (Assuming
that λ1 is not a repeated root, the eigenvectors associated to λ1 will be just the
scalar multiples of v1 .)
4. Solve (A − λ2 I)v = 0 to find a nonzero eigenvector v2 associated to λ2 .
5. Then e^{λ1 t} v1 and e^{λ2 t} v2 form a basis for the space of solutions. (Under our assumption λ1 ≠ λ2, these two vector-valued functions are linearly independent.)
The “simple” solutions forming a basis, here of the shape eλt v, are sometimes called normal
modes. There is not a precise mathematical definition of normal mode, however, since what counts as simple
is subjective.
Problem 14.15. Find the particular solution to the system
ẋ = x − 2y
ẏ = −x
satisfying the initial conditions
x(0) = −1
y(0) = 8.
Solution: This is ẋ = Ax with A := [1, −2; −1, 0].
We already found the eigenvalues and eigenvectors of A:
• Eigenvalues: 2, −1.
• Eigenvector associated to the eigenvalue 2: (−2, 1).
• Eigenvector associated to the eigenvalue −1: (1, 1).
Basis of the space of solutions: e^{2t} (−2, 1), e^{−t} (1, 1).
General solution: x(t) = c1 e^{2t} (−2, 1) + c2 e^{−t} (1, 1).
Setting t = 0 and using the initial conditions x(0) = −1, y(0) = 8 gives
−2c1 + c2 = −1
c1 + c2 = 8,
so c1 = 3 and c2 = 5. Thus
x(t) = 3 e^{2t} (−2, 1) + 5 e^{−t} (1, 1).
In other words,
x(t) = −6 e^{2t} + 5 e^{−t},
y(t) = 3 e^{2t} + 5 e^{−t}.
Since there were many opportunities to make errors, it would be wise to check the answer by verifying that these functions satisfy the original DEs and initial conditions:
ẋ = −12 e^{2t} − 5 e^{−t} = x − 2y ✓
ẏ = 6 e^{2t} − 5 e^{−t} = −x ✓
x(0) = −6 + 5 = −1 ✓
y(0) = 3 + 5 = 8 ✓
Remark 14.16. The system of linear equations involving c1 and c2 could have been written in matrix form:
[−2, 1; 1, 1] (c1, c2) = (−1, 8).
This point of view will be helpful for more complicated systems.
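A numerical sketch of this point of view: solve the matrix equation for c1, c2, then check the resulting formulas against the original system.

```python
import numpy as np

# Solve the 2x2 system for c1, c2 in matrix form, as in Remark 14.16.
M = np.array([[-2.0, 1.0],
              [1.0, 1.0]])
b = np.array([-1.0, 8.0])
c = np.linalg.solve(M, b)
assert np.allclose(c, [3.0, 5.0])

# Check the resulting solution against the ODEs x' = x - 2y, y' = -x.
t = np.linspace(0.0, 2.0, 201)
x = -6 * np.exp(2 * t) + 5 * np.exp(-t)
y = 3 * np.exp(2 * t) + 5 * np.exp(-t)
xdot = -12 * np.exp(2 * t) - 5 * np.exp(-t)
ydot = 6 * np.exp(2 * t) - 5 * np.exp(-t)
assert np.allclose(xdot, x - 2 * y)
assert np.allclose(ydot, -x)
print("c1, c2 =", c)
```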
March 9
Question 14.17. How can one visualize a solution x(t) of a 2 × 2 system of ODEs?
Answer: Make two plots, the first showing x(t) as a function of t, and the second showing y(t) as a function of t.
Better answer: Draw the solution as a parametrized curve (x(t), y(t)) in the phase plane with axes x and y. In other words, plot the point (x(t), y(t)) for every real number t (including negative t). The ODE specifies, in terms of the current position, which direction the phase plane point will move next (and how fast).
Question 14.18. Suppose that λ is a real eigenvalue of A, and v is a nonzero real eigenvector associated to λ. Then e^{λt} v is a solution to ẋ = Ax. (It is the solution satisfying x(0) = v.) Evaluating e^{λt} v at any time t gives a positive scalar multiple of v, so the trajectory is contained in the ray through v. What is the direction of the trajectory?
Two approaches:
1. Consider the length and direction of e^{λt} v as t changes.
2. Use the ODE itself to get the velocity vector at each point.
Answers:
• If λ > 0, the phase point tends to infinity (repelled from (0, 0)).
• If λ < 0, the phase point tends to (0, 0) (attracted to (0, 0)).
• If λ = 0, the phase point is stationary at v! The point v is called a critical point since ẋ = 0 there. □
Flashcard question: One of the solutions to ẋ = [−1, 0; 0, −2] x is
(x, y) = e^{−t} (1, 0) + e^{−2t} (0, 1) = (e^{−t}, e^{−2t}).
Which of the following describes the motion in the phase plane with axes x and y as t → +∞?
Possible answers:
• approaching infinity, along a curve asymptotic to the x-axis
• approaching infinity, along a curve asymptotic to the y-axis
• approaching infinity, along a straight line
• approaching the origin, along a curve tangent to the x-axis
• approaching the origin, along a curve tangent to the y-axis
• approaching the origin, along a straight line
• spiraling
• none of the above
Answer: Approaching the origin, along a curve tangent to the x-axis. As t → +∞, both x = e^{−t} and y = e^{−2t} tend to 0, but the y-coordinate tends to 0 faster than the x-coordinate, so the trajectory is tangent to the x-axis. (In fact, the y-coordinate is always the square of the x-coordinate, so the trajectory is part of the parabola y = x².) □
The phase plane trajectory by itself does not describe a solution fully, since it does not
show at what time each point is reached. The trajectory contains no information about speed,
though one can specify the direction by drawing an arrow on the trajectory.
The phase portrait (or phase diagram) is the diagram showing all the trajectories in the
phase plane. We are now ready to classify all possibilities for the phase portrait in terms
of the eigenvalue behavior. The most common ones are indicated in green; the others are
degenerate cases.
Try the “Linear Phase Portraits: Matrix Entry” mathlet
http://mathlets.org/mathlets/linear-phase-portraits-matrix-entry/
See if you can get every possibility listed below.
14.11.1. Distinct real eigenvalues. Suppose that the eigenvalues λ1 , λ2 are real and distinct.
Let v1 , v2 be corresponding eigenvectors. The set of all eigenvectors associated to λ1 consists
of all scalar multiples of v1 ; these form the eigenline of λ1 . General solution:
c1 e^{λ1 t} v1 + c2 e^{λ2 t} v2.
Opposite sign: λ1 > 0, λ2 < 0. This is called a saddle. Trajectories flow outward along the
positive eigenline (the eigenline of λ1 ) and inward along the negative eigenline (the eigenline
of λ2). Other trajectories are asymptotic to both eigenlines, tending to infinity towards the positive eigenline. (Typical solution: x = e^{2t} (1, 1) + e^{−3t} (−1, 1). When t = +1000, this is approximately a large positive multiple of (1, 1). When t = −1000, this is approximately a large positive multiple of (−1, 1).)
1
In the next two cases, in which the eigenvalues have the same sign, we’ll want to know
which is bigger. If |λ1 | > |λ2 |, call λ1 the fast eigenvalue and λ2 the slow eigenvalue; use the
same adjectives for the eigenlines.
Both positive: λ1 , λ2 > 0. This is called a repelling node (or node source). All nonzero
trajectories flow from (0, 0) towards infinity. Trajectories not contained in the eigenlines
are tangent to the slow eigenline at (0, 0), and far from (0, 0) have direction approximately
parallel to the fast eigenline.
Both negative: λ1 , λ2 < 0. This is called an attracting node (or node sink). All nonzero
trajectories flow from infinity towards (0, 0). Trajectories not contained in the eigenlines
are tangent to the slow eigenline at (0, 0), and far from (0, 0) have direction approximately
parallel to the fast eigenline.
14.11.2. Complex eigenvalues. Suppose that the eigenvalues are a ± bi with b ≠ 0.
Zero real part: a = 0. This is called a center. The nonzero trajectories are concentric ellipses. Solutions are periodic with period 2π/b.
(Typical solution: x = Re[ e^{it} (2, −i) ] = (2 cos t, sin t), a parametrization of a fat ellipse.)
Positive real part: a > 0. This is called a repelling spiral (or spiral source). All nonzero
trajectories spiral outward.
Negative real part: a < 0. This is called an attracting spiral (or spiral sink). All nonzero
trajectories spiral inward.
In these spiraling or rotating cases, how can one determine whether trajectories go clockwise or counterclockwise? It's complicated to see this in terms of eigenvalues and eigenvectors, but easy to see by testing a single velocity vector. The velocity vector at x = (1, 0) is ẋ = Ax = A (1, 0), the first column of A; trajectories go counterclockwise if and only if this velocity vector has positive y-coordinate.
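The whole classification can be packaged as a small function; the names and cases below are my own condensation of the discussion above, covering only the non-borderline types plus the center.

```python
import numpy as np

# Rough classifier for the main phase-portrait types of x' = Ax,
# using trace, determinant, and the counterclockwise test above.
def classify(A):
    tr, det = np.trace(A), np.linalg.det(A)
    disc = tr ** 2 - 4 * det
    if disc > 0:                       # distinct real eigenvalues
        if det < 0:
            return "saddle"
        return "repelling node" if tr > 0 else "attracting node"
    if disc < 0:                       # complex eigenvalues a +- bi
        # Velocity at (1, 0) is the first column of A; its y-coordinate
        # A[1][0] determines the rotation direction.
        turning = "counterclockwise" if A[1][0] > 0 else "clockwise"
        if tr == 0:
            return f"center ({turning})"
        kind = "repelling spiral" if tr > 0 else "attracting spiral"
        return f"{kind} ({turning})"
    return "borderline case"           # repeated eigenvalue: on the parabola

assert classify(np.array([[2.0, 1.0], [2.0, -3.0]])) == "saddle"
assert classify(np.array([[0.0, -1.0], [1.0, 0.0]])) == "center (counterclockwise)"
assert classify(np.array([[-1.0, -2.0], [2.0, -1.0]])).startswith("attracting spiral")
```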
14.11.3. Repeated real eigenvalue. Suppose that there is a repeated real eigenvalue, say λ, λ. The eigenspace of λ (the set of all eigenvectors associated to λ) could be either 1-dimensional or 2-dimensional.
λ ≠ 0 and A ≠ λI (1-dimensional eigenspace): This is called a degenerate node (or improper node or defective node). There is just one eigenline. It serves as both the slow eigenline and the fast eigenline: every trajectory not contained in it is tangent to it at (0, 0), and approximately parallel to it when far from (0, 0). Such trajectories are repelled from (0, 0) if λ > 0, and attracted to (0, 0) if λ < 0. (This is a borderline case between a node and a spiral.)
λ ≠ 0 and A = λI (2-dimensional eigenspace): This is called a star node. Every vector is an eigenvector. Nonzero trajectories are along rays, repelled from (0, 0) if λ > 0, and attracted to (0, 0) if λ < 0.
λ = 0 and A ≠ 0 (1-dimensional eigenspace): This could be called parallel lines. Points on the eigenline are stationary. All other trajectories are lines parallel to the eigenline.
λ = 0 and A = 0 (2-dimensional eigenspace): This could be called stationary. Every point is stationary.
14.11.4. Summary. Although all of the above may be needed for homework problems, for
exams you are expected to know only the main cases listed in green above and also the case
of a center, not the other “borderline” cases.
March 12
Problem 14.19. Sketch the phase portrait of the system ẋ = [−5, −2; −1, −4] x.
Solution: Call the matrix A. Then tr A = −9 and det A = 20 − 2 = 18, so the characteristic polynomial is λ² + 9λ + 18 = (λ + 6)(λ + 3). The eigenvalues are the roots, which are −6 and −3.
Eigenvectors of −6: These are the solutions to
(A − (−6)I)v = 0,
i.e.,
[1, −2; −1, 2] (v, w) = (0, 0),
which is the linear system
v − 2w = 0
−v + 2w = 0.
Here w can be any number c, and then v = 2c, so
(v, w) = (2c, c) = c (2, 1).
Thus the eigenline of −6 is the line through the origin in the direction of (2, 1).
1
Eigenvectors of −3: These are the solutions to
(A − (−3)I)v = 0,
i.e.,
[−2, −2; −1, −1] (v, w) = (0, 0),
which is the linear system
−2v − 2w = 0
−v − w = 0.
Here w can be any number c, and then v = −c, so the eigenline of −3 is the line through the origin in the direction of (−1, 1). Since both eigenvalues are negative, the phase portrait is an attracting node; −6 is the fast eigenvalue and −3 is the slow one, so trajectories not contained in the eigenlines are tangent to the eigenline of −3 at (0, 0). □
Problem 14.20. Sketch the phase portrait of the system ẋ = [2, 0; 0, 2] x.
Solution: Call the matrix A, so A = 2I. Every vector v satisfies Av = 2Iv = 2v, so every
vector is an eigenvector associated with the eigenvalue 2. At every position in the phase
plane, the system ẋ = Ax says that the velocity vector is 2 times the position vector, so
every trajectory moves out radially along a ray. This phase portrait is called a star node (a
degenerate case).
(If instead A were −2I, then every trajectory would tend to (0, 0) along a ray.)
Problem 14.21. Bonus problem: not done in lecture. Sketch the phase portrait of the system ẋ = [2, 1; 0, 2] x.
14.12. Trace-determinant plane. The type of phase portrait is determined by the eigen-
values λ1 , λ2 (except in the case of a repeated eigenvalue, when one needs to know whether
A is a scalar times I). And the eigenvalues are determined by the characteristic polynomial
det(λI − A) = λ² − (tr A)λ + (det A) = (λ − λ1)(λ − λ2).
(Comparing coefficients shows that tr A = λ1 + λ2 and det A = λ1 λ2 .)
Therefore the classification of phase portraits can be re-expressed in terms of tr A and
det A. First, by the quadratic formula, the number of real eigenvalues is determined by the sign of the discriminant (tr A)² − 4 det A.
Only some of the cases below were discussed in lecture.
14.12.1. Distinct real eigenvalues. Suppose that (tr A)² − 4 det A > 0.
Then the eigenvalues are real and distinct.
• If det A < 0, then λ1 λ2 < 0, so the eigenvalues have opposite sign: saddle.
• If det A > 0, then the eigenvalues have the same sign.
– If tr A > 0, repelling node.
– If tr A < 0, attracting node.
• If det A = 0, then one eigenvalue is 0; comb.
The trace-determinant plane is the plane with axes tr and det. This is completely different
from the phase plane (because the axes are different).
Whereas the phase portrait shows all possible trajectories for a system ẋ = Ax, the
trace-determinant plane has just one point for the system. The position of that point contains
information about the kind of phase portrait.
Above the parabola det = (1/4) tr², the eigenvalues are complex. Below the parabola, the eigenvalues are real and distinct.
Definition 14.22. If the phase portrait type is robust in the sense that small perturbations
in the entries of A cannot change the type of the phase portrait, then the system is called
structurally stable.
Warning: A system ẋ = Ax can be structurally stable without being stable, and can be
stable without being structurally stable. It is unfortunate that the two concepts have similar
names, since they are independent of each other.
The structurally stable cases are those corresponding to the large regions in the trace-determinant plane, not the borderline cases. For a 2 × 2 matrix A, the system ẋ = Ax is structurally stable if and only if A has either two distinct nonzero real eigenvalues, or non-real eigenvalues whose real part is nonzero.
14.15.1. Conservation of energy in the harmonic oscillator. Consider the harmonic oscillator
described by mẍ + kx = 0. Let’s check conservation of energy.
Kinetic energy: KE = mẋ²/2.
Potential energy PE is a function of x, and
PE(x) − PE(0) = −∫₀ˣ F_spring(X) dX = −∫₀ˣ (−kX) dX = kx²/2,
where the left side is the change in PE and the integral is the work done by F_spring. If we declare PE = 0 at position 0, then PE(x) = kx²/2.
Total energy:
E = KE + PE = mẋ²/2 + kx²/2.
How does total energy change with time? Differentiate:
Ė = mẋẍ + kxẋ = ẋ(mẍ + kx) = 0,
since mẍ + kx = 0. So energy is conserved.
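As a numerical illustration (with made-up values of m and k), integrating the ODE with a standard RK4 step shows the total energy holding constant along the trajectory:

```python
import numpy as np

# Integrate m x'' + k x = 0 with RK4 and watch E = m x'^2/2 + k x^2/2.
m, k = 2.0, 8.0

def f(state):                      # state = (x, xdot)
    x, v = state
    return np.array([v, -k / m * x])

state = np.array([1.0, 0.0])       # released from rest at x = 1
dt, energies = 0.001, []
for _ in range(20000):             # integrate for 20 seconds
    k1 = f(state)
    k2 = f(state + dt / 2 * k1)
    k3 = f(state + dt / 2 * k2)
    k4 = f(state + dt * k3)
    state = state + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    energies.append(m * state[1] ** 2 / 2 + k * state[0] ** 2 / 2)

energies = np.array(energies)
assert np.allclose(energies, energies[0], rtol=1e-6)
print(f"energy stays near {energies[0]:.6f}")
```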
14.15.2. Phase plane depiction of the harmonic oscillator. Three ways to depict the harmonic
oscillator:
• Graph of x as a function of t:
• Trajectory of (x(t), ẋ(t)) (a parametrized curve) in the phase plane whose horizontal axis shows x and whose vertical axis shows ẋ (the phase plane after converting the second-order ODE to a 2 × 2 system):
Let’s start with the mass to the right of equilibrium, and then let go. At t = 0, we have
x > 0 and ẋ = 0. At the first time the mass crosses equilibrium, x = 0 and ẋ < 0. When the
mass reaches its leftmost point, x < 0 and ẋ = 0 again. These give three points on the phase
plane trajectory.
March 14
Here are two ways to see that the whole trajectory is an ellipse:
1. We have x = A cos ωt for some A and ω. Thus ẋ = −Aω sin ωt. The parametrized curve
(A cos ωt, −Aω sin ωt)
is an ellipse. (This is like the parametrization of a circle, but with axes stretched by different amounts.)
2. Rearrange E = mẋ²/2 + kx²/2 as
x²/(2E/k) + ẋ²/(2E/m) = 1.
This is an ellipse with semi-axes √(2E/k) and √(2E/m).
Flashcard question: In which direction is the ellipse traversed?
Possible answers:
1. Clockwise.
2. Counterclockwise.
3. It depends on the initial conditions.
Answer: Clockwise. Above the horizontal axis, ẋ > 0, which means that x is increasing.
Changing the initial conditions changes E, which changes the ellipse. The family of all
such trajectories is a nested family of ellipses, the phase portrait of the system.
14.15.3. Energy loss in the damped oscillator. Now consider a damped oscillator described by mẍ + bẋ + kx = 0. Now
Ė = ẋ(mẍ + kx) = ẋ(−bẋ) = −bẋ².
Energy is lost to friction. The dashpot heats up. The phase plane trajectory crosses through
the equal-energy ellipses, inwards towards the origin.
It may spiral in (underdamped case), or approach the origin more directly (critically
damped and overdamped cases).
Midterm 2 covers everything up to here.
Finding the eigenvectors of a matrix amounts to solving the linear system
(A − λI)v = 0,
in which A and λ are known and v is unknown. In the 2 × 2 case, this is easy, but to handle more complicated situations, we'll need better methods for solving linear systems.
Solving a linear system comes up also when we have a general solution to an ODE (or
system of ODEs) and want to use given initial conditions to find the coefficients in the
particular solution.
And there are many other applications of solving linear systems in all branches of science
and engineering (e.g., balancing a chemical equation).
15.2. Intersecting lines in R2 . Given a system of two equations in two unknowns, each
equation describes a line (assuming that the equation is not just constant=constant). The
solution to the system is the intersection of the two lines.
From geometry, you know that there are three possibilities:
• The lines intersect at one point : one solution. (This is what happens most of the
time.) Example:
x+y =1
x − 2y = 2.
• The lines are the same, so their intersection is a line : infinitely many solutions.
Example:
x+y =1
2x + 2y = 2.
• The lines are parallel and distinct : no solutions. Example:
x+y =1
x + y = 0.
With more equations and more unknowns, there are more possibilities, and we want to
describe them all. For this, we need to develop more linear algebra.
Consider the system
2x + 5y + 7z = 15
x + z = 1.
What kind of thing is the left-hand side? It's a vector-valued function f of an unknown input vector (x, y, z): the definition of f is
f((x, y, z)) := (2x + 5y + 7z, x + z).
Problem 15.1. What is f((100, 10, 1))?
Solution: Plug in x = 100, y = 10, z = 1 to get (257, 101). □
Here f is a function from R3 to R2 (input has 3 coordinates, output has 2 coordinates).
Problem 15.2. Find a matrix A such that f(v) = Av for every vector v in R³.
Solution:
f((x, y, z)) = [2, 5, 7; 1, 0, 1] (x, y, z),
so A = [2, 5, 7; 1, 0, 1]. □
One says that A is the matrix representing the function f , and that f is the function defined
by the matrix A.
Key point: In general, a function f from Rn to Rm such that each coordinate of the
output is a linear combination of the input coordinates is represented by an m × n matrix A .
(Warning: m and n get reversed, as in our 2 × 3 example.)
15.5.1. Depicting a function. Imagine evaluating f on every vector in the input space to get
vectors in the output space. To visualize it, draw a shape in the input space, apply f to every
point in the shape, and plot the output points in the output space.
Problem 15.3. The matrix [2, 0; 0, 1] represents a linear transformation f. Depict f by showing what it does to the standard basis vectors i, j of R² and to the unit smiley. What is the area scaling factor?
Solution: We have
f((1, 0)) = [2, 0; 0, 1] (1, 0) = (2, 0)
f((0, 1)) = [2, 0; 0, 1] (0, 1) = (0, 1),
and the unit smiley is stretched horizontally into a fat smiley of the same height.
For a 2 × 2 matrix, the area scaling factor is the absolute value of the determinant:
|det [2, 0; 0, 1]| = |2| = 2. □
Try the “Matrix Vector” mathlet
http://mathlets.org/mathlets/matrix-vector/
Problem 15.4. Let f be the function that rotates each vector in R2 counterclockwise by the
angle θ. What is the corresponding matrix R?
Solution: The rotation maps (1, 0) to (cos θ, sin θ) and (0, 1) to (−sin θ, cos θ). Thus
(first column of R) = R (1, 0) = (cos θ, sin θ)
(second column of R) = R (0, 1) = (−sin θ, cos θ),
so
R = [cos θ, −sin θ; sin θ, cos θ]. □
Consider again the system
2x + 5y + 7z = 15
x + z = 1.
16.2. Row operations. The equation operations correspond to operations on the augmented
matrix, called elementary row operations:
• Multiply a row by a nonzero number.
• Interchange two rows.
• Add a multiple of one row to another row (while leaving the first row as it was).
March 16
16.4. Review.
Further question: What kind of function is the second coordinate y(t) of x(t)?
Solution: If v is a nonzero eigenvector associated to 1 + 2i, then its complex conjugate v̄ is a nonzero eigenvector associated to 1 − 2i, so the vector-valued functions
e^{(1+2i)t} v, e^{(1−2i)t} v̄
form a basis of the space of solutions. Therefore x(t) is a linear combination of these. Taking second coordinates shows that y(t) is a linear combination of e^{(1+2i)t} and e^{(1−2i)t}. We can replace the latter two functions by the real and imaginary parts of the first of them, so y(t) is also a linear combination of e^t cos(2t) and e^t sin(2t). In other words, y(t) is e^t times a sinusoidal function with ω = 2:
y(t) = e^t A cos(2t − φ)
for some amplitude A and phase lag φ.
Further question: If x(0) is on the positive x-axis, what is the next time t that x(t) lies on
the x-axis?
Solution 1: We have
$$y(t) = c_1 e^t\cos(2t) + c_2 e^t\sin(2t)$$
for some real numbers c1 and c2. Since x(0) is on the positive x-axis, y(0) = 0. Plugging
t = 0 into the formula for y(t) gives 0 = c1, so
$$y(t) = c_2 e^t \sin(2t).$$
The first positive time at which y(t) = 0 is when the angle 2t equals π, so t = π/2.
Solution 2: The sinusoidal function y(t) crosses 0 twice within each period. The period is
P = 2π/ω = π, so the time interval between crossings is P/2 = π/2.
Further question: At that first time when it crosses the x-axis again, what is its distance
to the origin compared to its initial distance to the origin?
Solution: The y-coordinate is 0 at both t = 0 and t = π/2, so we need only study x(t).
The same argument as for y(t) shows that $x(t) = e^t A'\cos(2t - \phi')$ for some A′ > 0 and some
φ′. When t goes from 0 to π/2, the angle 2t − φ′ increases by π, so the cosine changes sign,
while $e^t$ increases from 1 to $e^{\pi/2}$. Thus the distance is multiplied by $e^{\pi/2}$.
Problem 16.2. Estimate the angular frequency ω for which the steady-state solution to
$$p(D)\, z = e^{i\omega t}$$
has the largest amplitude.
Solution: By ERF, as long as $p(i\omega) \ne 0$, this equation has the solution $\dfrac{1}{p(i\omega)}\, e^{i\omega t}$. Thus the complex gain is $\dfrac{1}{p(i\omega)}$ and
the gain is $\dfrac{1}{|p(i\omega)|}$, which is largest when iω is close to a root of p(r).
Now the roots of q(r) are −1 and ±2i, and these are close to the roots of p(r). In particular, the
positive numbers ω such that iω is close to a root of p(r) are the numbers ω close to 2. Thus
the amplitude of the solution is maximized for a value of ω close to 2. □
Problem 16.3. Not actually done in lecture. A ball is thrown straight upward from the
ground. Let x(t) be its height in meters after t seconds (up until it returns to the ground).
Sketch the possible trajectories in the (x, ẋ) phase plane.
Solution: Let m be the mass of the ball. Kinetic energy is $\dfrac{m\dot{x}^2}{2}$. Declare that the potential
energy is 0 at the ground. The force of gravity is a constant −mg, so the work done as the
ball rises to height x is −mgx, so potential energy has increased to mgx. Total energy:
$$E = \frac{m\dot{x}^2}{2} + mgx.$$
For various positive constants E, these are the equations of the trajectories. They are parts
of parabolas.
In which direction are the trajectories traversed? Since ẍ = −g < 0, the velocity ẋ always
decreases, so the trajectories are traversed downward in the phase plane; above the horizontal
axis, ẋ > 0, which means that x is increasing, so there the motion is also to the right.
What if there is a little bit of air resistance? Then the phase plane trajectory crosses
through the equal-energy parabolas, and the ball lands with a lower speed than it started
with (and its velocity is of opposite sign, of course).
Question 16.4. Not actually done in lecture. Can two different trajectories in the (x, y)-
phase plane for a system ẋ = Ax ever intersect?
Answer: No. If a trajectory passes through a point v, then its behavior before and after
are uniquely determined, by the existence and uniqueness theorem. (They can approach the
same point as t → ∞, however.)
March 19
Midterm 2
March 21
16.5. Gaussian elimination. Gaussian elimination is an algorithm for converting any matrix
into row-echelon form by performing row operations. Here are the steps:
1. Find the leftmost nonzero column, and the first nonzero entry in that column (read from
the top down).
2. If that entry is not already in the first row, interchange its row with the first row.
3. Make all other entries of the column zero by adding suitable multiples of the first row to
the others.
4. At this point, the first row is done, so ignore it, and repeat the steps above for the
remaining submatrix (with one fewer row). In each iteration, ignore the rows already
taken care of.
5. Stop when all the remaining rows consist entirely of zeros. Then the whole matrix will be
in what is called row-echelon form.
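The steps above can be sketched in Python, using exact `Fraction` arithmetic to avoid floating-point surprises (the function name is ours, not the notes'):

```python
from fractions import Fraction

def row_echelon(M):
    """Convert M to row-echelon form by the steps above (returns a new matrix)."""
    M = [[Fraction(x) for x in row] for row in M]
    top = 0                                   # rows above `top` are already done
    for col in range(len(M[0])):
        pivot = next((r for r in range(top, len(M)) if M[r][col] != 0), None)
        if pivot is None:
            continue                          # no nonzero entry left in this column
        M[top], M[pivot] = M[pivot], M[top]   # step 2: interchange rows
        for r in range(top + 1, len(M)):      # step 3: clear entries below the pivot
            c = M[r][col] / M[top][col]
            M[r] = [a - c * b for a, b in zip(M[r], M[top])]
        top += 1                              # step 4: this row is done
    return M

B = row_echelon([[0, 0, 6, 2, -4, -8, 8],
                 [0, 0, 3, 1, -2, -4, 4],
                 [2, -3, 1, 4, -7, 1, 2],
                 [6, -9, 0, 11, -19, 3, 0]])
# B's rows: [2,-3,1,4,-7,1,2], [0,0,3,1,-2,-4,4], [0,0,0,0,0,-4,-2], [0,0,0,0,0,0,0]
```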
Problem 16.5. Done in recitation. Apply Gaussian elimination to convert the 4 × 7 matrix
$$\begin{pmatrix} 0&0&6&2&-4&-8&8 \\ 0&0&3&1&-2&-4&4 \\ 2&-3&1&4&-7&1&2 \\ 6&-9&0&11&-19&3&0 \end{pmatrix}$$
to row-echelon form. (This example is taken from Hill, Elementary linear algebra with applications, p. 17.)
Solution:
Step 1. The leftmost nonzero column is the first one, and its first nonzero entry is the 2:
$$\begin{pmatrix} 0&0&6&2&-4&-8&8 \\ 0&0&3&1&-2&-4&4 \\ 2&-3&1&4&-7&1&2 \\ 6&-9&0&11&-19&3&0 \end{pmatrix}.$$
Step 2. The 2 is not in the first row, so interchange its row with the first row:
$$\begin{pmatrix} 2&-3&1&4&-7&1&2 \\ 0&0&3&1&-2&-4&4 \\ 0&0&6&2&-4&-8&8 \\ 6&-9&0&11&-19&3&0 \end{pmatrix}.$$
Step 3. To make all other entries of the column zero, we need to add −3 times the first
row to the last row (the other rows are OK already):
$$\begin{pmatrix} 2&-3&1&4&-7&1&2 \\ 0&0&3&1&-2&-4&4 \\ 0&0&6&2&-4&-8&8 \\ 0&0&-3&-1&2&0&-6 \end{pmatrix}.$$
Step 4. Now the first row is done. Start over with the 3 × 7 submatrix that remains
beneath it:
$$\begin{pmatrix} 2&-3&1&4&-7&1&2 \\ 0&0&3&1&-2&-4&4 \\ 0&0&6&2&-4&-8&8 \\ 0&0&-3&-1&2&0&-6 \end{pmatrix}.$$
Step 1. The leftmost nonzero column is now the third column, and its first nonzero entry
is the 3:
$$\begin{pmatrix} 2&-3&1&4&-7&1&2 \\ 0&0&3&1&-2&-4&4 \\ 0&0&6&2&-4&-8&8 \\ 0&0&-3&-1&2&0&-6 \end{pmatrix}.$$
Step 2. The 3 is already in the first row of the submatrix (we are ignoring the first row of
the whole matrix), so no interchange is necessary.
Step 3. To make all other entries of the column zero, add −2 times the (new) first row to
the (new) second row, and 1 times the (new) first row to the (new) third row:
$$\begin{pmatrix} 2&-3&1&4&-7&1&2 \\ 0&0&3&1&-2&-4&4 \\ 0&0&0&0&0&0&0 \\ 0&0&0&0&0&-4&-2 \end{pmatrix}.$$
Step 4. Now the first and second row of the original matrix are done. Start over with the
2 × 7 submatrix beneath them:
$$\begin{pmatrix} 2&-3&1&4&-7&1&2 \\ 0&0&3&1&-2&-4&4 \\ 0&0&0&0&0&0&0 \\ 0&0&0&0&0&-4&-2 \end{pmatrix}.$$
Step 1. The leftmost nonzero column is now the penultimate column, and its first nonzero
entry is the −4 at the bottom:
$$\begin{pmatrix} 2&-3&1&4&-7&1&2 \\ 0&0&3&1&-2&-4&4 \\ 0&0&0&0&0&0&0 \\ 0&0&0&0&0&-4&-2 \end{pmatrix}.$$
Step 2. The −4 is not in the first row of the submatrix, so interchange its row with the
first row of the submatrix:
$$\begin{pmatrix} 2&-3&1&4&-7&1&2 \\ 0&0&3&1&-2&-4&4 \\ 0&0&0&0&0&-4&-2 \\ 0&0&0&0&0&0&0 \end{pmatrix}.$$
Step 3. The other entry in this column of the submatrix is already 0, so this step is not
necessary.
Now the first three rows are done. What remains below them is all zeros, so stop! The
matrix is now in row-echelon form:
$$\begin{pmatrix} 2&-3&1&4&-7&1&2 \\ 0&0&3&1&-2&-4&4 \\ 0&0&0&0&0&-4&-2 \\ 0&0&0&0&0&0&0 \end{pmatrix}. \qquad\square$$
16.6. Row-echelon form. What does row-echelon form mean? Before explaining this, we
need a few preliminary definitions. A zero row of a matrix is a row consisting entirely of zeros.
A nonzero row of a matrix is a row with at least one nonzero entry. In each nonzero row, the
first nonzero entry is called the pivot.
Example 16.6. The following 4 × 5 matrix has one zero row, and three pivots (the −5, the 2,
and the 3 in the bottom row):
$$\begin{pmatrix} 0&-5&4&4&3 \\ 2&0&0&1&7 \\ 0&0&0&0&0 \\ 0&0&0&0&3 \end{pmatrix}.$$
Definition 16.7. A matrix is in row-echelon form if it satisfies both of the following conditions:
1. All the zero rows (if any) are grouped at the bottom of the matrix.
2. Each pivot lies farther to the right than the pivots of higher rows.
Warning: Some books require also that each pivot be a 1. We are not going to require this for row-echelon
form, but we will require it for reduced row-echelon form later on.
16.7. Back-substitution.
Key point of row-echelon form: Matrices in row-echelon form correspond to systems that
are ready to be solved immediately by back-substitution: solve for each variable in reverse
order, while introducing a parameter for each variable not directly expressed in terms of later
variables, and substitute values into earlier equations once they are known.
Problem 16.8. Suppose that we are solving a linear system with unknowns x, y, z, v, w.
Suppose that we already wrote down the augmented matrix and used Gaussian elimination
to convert it to row-echelon form, resulting in
$$\left(\begin{array}{ccccc|c} 1&2&0&2&3&4 \\ 0&-1&2&3&1&5 \\ 0&0&0&0&2&6 \\ 0&0&0&0&0&0 \end{array}\right).$$
The corresponding system is
x + 2y + 2v + 3w = 4
−y + 2z + 3v + w = 5
2w = 6
0 = 0.
We solve for the variables in reverse order, using the equations from the bottom up. Start by
solving for the last variable, w:
w = 3.
There is no equation expressing v in terms of the later variable w, so v can be any number; set
v = c1 for a parameter c1. Similarly, there is no equation expressing z in terms of later
variables, so set z = c2 for a parameter c2.
Substitute the values of w, v, z into the previous equation, and solve for y:
−y + 2c2 + 3c1 + 3 = 5
y = 3c1 + 2c2 − 2.
Finally, substitute into the first equation and solve for x:
x = 4 − 2y − 2v − 3w = −8c1 − 4c2 − 1.
Suppose that a matrix is in row-echelon form. Then any column that contains a pivot is
called a pivot column. A variable whose corresponding column is a pivot column is called
a dependent variable or pivot variable. The other variables are called free variables. (The
augmented column does not correspond to any variable.)
In the problem above, x, y, w were dependent variables, and v, z were free variables.
Warning: If your matrix is not in row-echelon form yet, don’t talk about pivot columns
and pivot variables!
16.8. Reduced row-echelon form. With even more row operations, one can simplify a
matrix in row-echelon form to an even more special form:
Definition 16.9. A matrix is in reduced row-echelon form (RREF) if it satisfies all of the
following conditions:
1. It is in row-echelon form.
2. Each pivot is a 1.
3. In each pivot column, all the entries are 0 except for the pivot itself.
16.9. Gauss–Jordan elimination. The presentation of the algorithm and the first problem
below was done in recitation.
Gauss–Jordan elimination is an algorithm for converting any matrix into reduced row-echelon
form by performing row operations. Here are the steps:
109
1. Use Gaussian elimination to convert the matrix to row echelon form.
2. Divide the last nonzero row by its pivot, to make the pivot 1.
3. Make all entries in that pivot’s column 0 by adding suitable multiples of the pivot’s row
to the rows above.
4. At this point, the row in question (and all rows below it) are done. Ignore them, and go
back to Step 2, but now with the remaining submatrix, above the row just completed.
Eventually the whole matrix will be in reduced row-echelon form.
Problem. Convert the row-echelon form matrix
$$\begin{pmatrix} 2&-3&1&4&-7&1&2 \\ 0&0&3&1&-2&-4&4 \\ 0&0&0&0&0&-4&-2 \\ 0&0&0&0&0&0&0 \end{pmatrix}$$
obtained in Problem 16.5 to reduced row-echelon form.
Solution:
Step 1. The matrix is already in row-echelon form.
Step 2. The last nonzero row is the third row, and its pivot is the −4, so divide the third
row by −4:
$$\begin{pmatrix} 2&-3&1&4&-7&1&2 \\ 0&0&3&1&-2&-4&4 \\ 0&0&0&0&0&1&1/2 \\ 0&0&0&0&0&0&0 \end{pmatrix}.$$
Step 3. To make all other entries of that pivot’s column 0, add −1 times the third row to
the first row, and add 4 times the third row to the second row:
$$\begin{pmatrix} 2&-3&1&4&-7&0&3/2 \\ 0&0&3&1&-2&0&6 \\ 0&0&0&0&0&1&1/2 \\ 0&0&0&0&0&0&0 \end{pmatrix}.$$
Step 4. Now the third and fourth rows are done; move up to the second row.
Step 2. Divide the second row by its pivot 3, to make that pivot 1.
Step 3. To make the other entries of that pivot’s column 0, add −1 times the second row
to the first row:
$$\begin{pmatrix} 2&-3&0&11/3&-19/3&0&-1/2 \\ 0&0&1&1/3&-2/3&0&2 \\ 0&0&0&0&0&1&1/2 \\ 0&0&0&0&0&0&0 \end{pmatrix}.$$
Step 2. Finally, divide the first row by its pivot 2. The whole matrix is now in reduced
row-echelon form:
$$\begin{pmatrix} 1&-3/2&0&11/6&-19/6&0&-1/4 \\ 0&0&1&1/3&-2/3&0&2 \\ 0&0&0&0&0&1&1/2 \\ 0&0&0&0&0&0&0 \end{pmatrix}. \qquad\square$$
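The whole Gauss–Jordan process can be sketched as one forward sweep that normalizes each pivot to 1 and clears its entire column (exact `Fraction` arithmetic; the function name is ours, and this merges the two passes described above into one). Note that in a true RREF every pivot is 1:

```python
from fractions import Fraction

def rref(M):
    """Reduced row-echelon form of M (every pivot 1, pivot columns cleared)."""
    M = [[Fraction(x) for x in row] for row in M]
    top = 0
    for col in range(len(M[0])):
        pivot = next((r for r in range(top, len(M)) if M[r][col] != 0), None)
        if pivot is None:
            continue
        M[top], M[pivot] = M[pivot], M[top]
        p = M[top][col]
        M[top] = [x / p for x in M[top]]         # make the pivot 1
        for r in range(len(M)):                  # clear the rest of the column
            if r != top and M[r][col] != 0:
                c = M[r][col]
                M[r] = [a - c * b for a, b in zip(M[r], M[top])]
        top += 1
    return M

R = rref([[2, -3, 1, 4, -7, 1, 2],
          [0, 0, 3, 1, -2, -4, 4],
          [0, 0, 0, 0, 0, -4, -2],
          [0, 0, 0, 0, 0, 0, 0]])
# R's first row is [1, -3/2, 0, 11/6, -19/6, 0, -1/4]; every pivot is now 1
```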
Problem 16.11. Suppose that we are solving a linear system for unknowns x, y, z, v, w.
Suppose that we have used Gauss–Jordan elimination to put the augmented matrix in reduced
row-echelon form, and the result is
$$\left(\begin{array}{ccccc|c} 1&0&-2&0&7&3 \\ 0&1&6&0&8&4 \\ 0&0&0&1&9&5 \end{array}\right).$$
(Check: This really is in reduced row-echelon form!) Find the general solution to the system.
Solution: The system to be solved is
x − 2z + 7w = 3
y + 6z + 8w = 4
v + 9w = 5.
Back-substitution:
w = c1 (free variable)
v = −9w + 5 = −9c1 + 5
z = c2 (free variable)
y = −6z − 8w + 4 = −6c2 − 8c1 + 4
x = 2z − 7w + 3 = 2c2 − 7c1 + 3.
(Notice that no substitution was required: we could solve for each variable directly!) Answer:
$$\begin{pmatrix} x\\y\\z\\v\\w \end{pmatrix} = \begin{pmatrix} 2c_2-7c_1+3 \\ -6c_2-8c_1+4 \\ c_2 \\ -9c_1+5 \\ c_1 \end{pmatrix} = \begin{pmatrix}3\\4\\0\\5\\0\end{pmatrix} + c_1\begin{pmatrix}-7\\-8\\0\\-9\\1\end{pmatrix} + c_2\begin{pmatrix}2\\-6\\1\\0\\0\end{pmatrix}. \qquad\square$$
Moral: After using Gaussian elimination to put an augmented matrix into row-echelon
form, there are two ways to finish solving the linear system:
• Do back-substitution.
• Do the extra row operations needed to get the matrix into reduced row-echelon form
(Gauss–Jordan elimination), and then do (a much easier) back-substitution.
Remark 16.12. Performing row operations on A in a different order than specified by Gaussian
elimination and Gauss–Jordan elimination can lead to different row-echelon forms. But it
turns out that row operations leading to reduced row-echelon form always give the same
result, a matrix that we will write as RREF(A).
16.10. Comparing inhomogeneous and homogeneous linear systems. Recall that the
general solution to the inhomogeneous system
x + 2y + 2v + 3w = 4
−y + 2z + 3v + w = 5
2w = 6
0=0
was
$$\begin{pmatrix} x\\y\\z\\v\\w \end{pmatrix} = \underbrace{\begin{pmatrix}-1\\-2\\0\\0\\3\end{pmatrix}}_{\text{particular solution}} + \underbrace{c_1\begin{pmatrix}-8\\3\\0\\1\\0\end{pmatrix} + c_2\begin{pmatrix}-4\\2\\1\\0\\0\end{pmatrix}}_{\text{general homogeneous solution}},$$
where c1, c2 are parameters. (The labels under the braces haven’t been explained yet.)
Doing Gaussian elimination and back-substitution again would show that the general
solution to the associated homogeneous system
x + 2y + 2v + 3w = 0
−y + 2z + 3v + w = 0
2w = 0
0=0
is
$$\begin{pmatrix} x\\y\\z\\v\\w \end{pmatrix} = c_1\begin{pmatrix}-8\\3\\0\\1\\0\end{pmatrix} + c_2\begin{pmatrix}-4\\2\\1\\0\\0\end{pmatrix},$$
where c1, c2 are parameters. We now want to say that the set of solutions is
the set of all linear combinations of the two vectors
$$\begin{pmatrix}-8\\3\\0\\1\\0\end{pmatrix} \quad\text{and}\quad \begin{pmatrix}-4\\2\\1\\0\\0\end{pmatrix},$$
so it is the span of those two vectors, so it is a vector space, except that so far we’ve talked
about these linear algebra concepts only for functions, not for vectors. It’s time to introduce
these concepts for vectors too.
17. Homogeneous linear systems and linear algebra concepts
For a while, we are going to assume that vectors have real numbers as coordinates, and that all scalars
are real numbers. This is so we can describe things geometrically in Rn more easily. But eventually, we will
work with vectors in Cn whose coordinates can be complex numbers, and will allow scalar multiplication by
complex numbers.
Definition 17.1. Suppose that S is a set consisting of some of the vectors in Rn (for some
fixed value of n). Call S a vector space if all of the following are true:
0. The zero vector 0 is in S.
1. Multiplying any one vector in S by any scalar gives another vector in S.
2. Adding any two vectors in S gives another vector in S.
Remark 17.2. Such a set S is also called a subspace of Rn , because Rn itself is a vector space,
and S is a vector space contained in it.
March 23
17.3. Span.
Problem. Show that Span(v, w), the set of all linear combinations of v and w, is a vector space.
Solution:
0. The zero vector 0 is in Span(v, w) since 0 = 0v + 0w.
1. Multiplying any linear combination of v and w by any scalar gives another linear combi-
nation of v and w (for example, 5(2v + 3w) = 10v + 15w).
2. Adding any two linear combinations of v and w gives another linear combination of v and
w (for example, (2v + 3w) + (4v + 5w) = 6v + 8w). 2
17.4. Nullspace.
Theorem 17.11. For any homogeneous linear system Ax = 0, the set of all solutions is a
vector space.
This is analogous to the fact that the set of all solutions to a homogeneous linear ODE is
a vector space!
Definition 17.12. The set of all solutions to Ax = 0 is called the nullspace of the matrix A,
and denoted NS(A).
Problem 17.13. Let $A = \begin{pmatrix} 4 & 6 \\ 2 & 3 \end{pmatrix}$. Is $\begin{pmatrix} -3 \\ 2 \end{pmatrix}$ in NS(A)?
Solution: The question is asking whether $\begin{pmatrix}-3\\2\end{pmatrix}$ is a solution to Ax = 0. Is it true that
$$\begin{pmatrix} 4&6 \\ 2&3 \end{pmatrix}\begin{pmatrix}-3\\2\end{pmatrix} = \begin{pmatrix}0\\0\end{pmatrix}?$$
Yes! □
Here is a more direct way to explain why the set of all solutions to Ax = 0 is a vector
space, without computing it as a span:
0. The zero vector 0 is a solution since A0 = 0.
1. Multiplying any solution v by any scalar c gives another solution: given that Av = 0, it
follows that A(cv) = c(Av) = c0 = 0.
2. Adding any solutions gives another solution: given that Av = 0 and Aw = 0, it follows
that A(v + w) = Av + Aw = 0 + 0 = 0.
To compute NS(A), solve the system Ax = 0 by using Gaussian elimination and back-
substitution. (Shortcut: For a homogeneous system, there is no need to keep track of an
augmented column, because it would consist of zeros, and would stay that way even after
row operations.)
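The back-substitution pattern can also be read off mechanically once the matrix is in reduced row-echelon form; a sketch that assumes the input R is already in RREF (every pivot equal to 1; the function name is ours), demonstrated on the coefficient part of the reduced matrix from Problem 16.11:

```python
# Read a basis of NS(A) off a matrix R in reduced row-echelon form:
# one basis vector per free (non-pivot) column.
def nullspace_basis(R):
    n = len(R[0])
    pivots = {}                       # pivot column -> the row containing it
    for r, row in enumerate(R):
        for c, x in enumerate(row):
            if x != 0:
                pivots[c] = r
                break
    basis = []
    for f in (c for c in range(n) if c not in pivots):
        v = [0] * n
        v[f] = 1                      # set this free variable to 1, the others to 0
        for c, r in pivots.items():
            v[c] = -R[r][f]           # back-substitute for each pivot variable
        basis.append(v)
    return basis

R = [[1, 0, -2, 0, 7],
     [0, 1, 6, 0, 8],
     [0, 0, 0, 1, 9]]
print(nullspace_basis(R))   # [[2, -6, 1, 0, 0], [-7, -8, 0, -9, 1]]
```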
17.5. Linearly dependent vectors.
Definition 17.14. Vectors v1 , . . . , vn are linearly dependent if one of them is a linear combi-
nation of the others.
Definition 17.15 (equivalent). Vectors v1 , . . . , vn are linearly dependent if there exist scalars
c1 , . . . , cn not all zero such that c1 v1 + · · · + cn vn = 0.
We haven’t yet talked about dimension for a vector space of vectors, but intuitively, the
dimension of Span(v1 , . . . , vn ) will be n if the vectors are linearly independent, and less than
n if they are linearly dependent.
We’ll soon explain why this works. But first, let’s give an example:
Problem 17.16. Determine whether the vectors $\begin{pmatrix}1\\2\\3\end{pmatrix}, \begin{pmatrix}4\\5\\6\end{pmatrix}, \begin{pmatrix}7\\8\\9\end{pmatrix}$ are linearly dependent.
Solution:
1. Create the matrix $A = \begin{pmatrix} 1&4&7 \\ 2&5&8 \\ 3&6&9 \end{pmatrix}$ whose columns are the given vectors.
2. We convert A to row-echelon form. First add −2 times the first row to the second row,
and add −3 times the first row to the third row, to get
$$\begin{pmatrix} 1&4&7 \\ 0&-3&-6 \\ 0&-6&-12 \end{pmatrix}.$$
Now the first row is done. Add −2 times the second row to the third row to get
$$\begin{pmatrix} 1&4&7 \\ 0&-3&-6 \\ 0&0&0 \end{pmatrix},$$
which is in row-echelon form. Solve the corresponding system
x + 4y + 7z = 0
−3y − 6z = 0
by back-substitution: z = c1 is free, y = −2c1, and x = −4y − 7z = c1. Taking c1 = 1 gives
the nonzero solution (1, −2, 1), so the vectors are linearly dependent (indeed, v1 − 2v2 + v3 = 0). □
Answer: Checking for linear dependence is the same as searching for nonzero solutions to
$$a_1\begin{pmatrix}1\\2\\3\end{pmatrix} + a_2\begin{pmatrix}4\\5\\6\end{pmatrix} + a_3\begin{pmatrix}7\\8\\9\end{pmatrix} = \mathbf{0}.$$
By the interpretation of matrix-vector multiplication as a linear combination of columns, this
equation is the same as
$$\begin{pmatrix}1&4&7\\2&5&8\\3&6&9\end{pmatrix}\begin{pmatrix}a_1\\a_2\\a_3\end{pmatrix} = \mathbf{0},$$
so what we are really looking for is a nonzero vector $\begin{pmatrix}a_1\\a_2\\a_3\end{pmatrix}$ in the nullspace of A. □
a3
Are there numbers c1 and c2 that make this combination 0? If that happens, then the
entries in the free-variable positions on the right (the coordinates equal to c1 and c2
themselves) are 0, so c1 = 0 and c2 = 0. Thus the two column vectors found that
span NS(A) are linearly independent.
The same argument applies whenever we use Gaussian elimination and back-substitution to
solve a homogeneous linear system: the list of solutions found is always linearly independent.
17.6. Basis.
Definition 17.19. A basis of a vector space S (of vectors) is a list of vectors v1 , v2 , . . . such
that
1. Span(v1 , v2 , . . .) = S, and
2. The vectors v1 , v2 , . . . are linearly independent.
Example 17.20. Not actually mentioned in the March 23 lecture. If S is the xy-plane in R3,
then $\begin{pmatrix}1\\0\\0\end{pmatrix}, \begin{pmatrix}0\\1\\0\end{pmatrix}$ is a basis for S.
April 2
17.7. Review: solving a homogeneous linear system.
Problem 17.21. Find a basis of the vector space of solutions to the homogeneous linear
system
2x + y − 3z + 4w = 0
4x + 2y − 2z + 3w = 0
2x + y − 7z + 9w = 0.
Solution: First we convert the coefficient matrix to row-echelon form. Add −2 times the first
row to the second row, and −1 times the first row to the third row to get
$$\begin{pmatrix} 2&1&-3&4 \\ 0&0&4&-5 \\ 0&0&-4&5 \end{pmatrix}$$
and then add the second row to the third row to get
$$B := \begin{pmatrix} 2&1&-3&4 \\ 0&0&4&-5 \\ 0&0&0&0 \end{pmatrix},$$
which is in row-echelon form (the pivots are the 2 and the 4). Solve the corresponding system
2x + y − 3z + 4w = 0
4z − 5w = 0
0 = 0
by back-substitution:
w = c1
z = (5/4) c1
y = c2
x = (−y + 3z − 4w)/2 = −(1/8) c1 − (1/2) c2,
General solution:
$$\begin{pmatrix}x\\y\\z\\w\end{pmatrix} = \begin{pmatrix} -\tfrac18 c_1 - \tfrac12 c_2 \\ c_2 \\ \tfrac54 c_1 \\ c_1 \end{pmatrix} = c_1\begin{pmatrix}-1/8\\0\\5/4\\1\end{pmatrix} + c_2\begin{pmatrix}-1/2\\1\\0\\0\end{pmatrix}.$$
Thus
$$\mathrm{NS}(A) = \mathrm{Span}\left( \begin{pmatrix}-1/8\\0\\5/4\\1\end{pmatrix}, \begin{pmatrix}-1/2\\1\\0\\0\end{pmatrix} \right),$$
and these two vectors form a basis for NS(A), since solutions produced this way are
automatically linearly independent. (As a check, plug each vector back into the original
system.) □
17.8. Dimension. It turns out that every basis for a vector space has the same number of
vectors.
Definition 17.22. The dimension of a vector space is the number of vectors in any basis.
Example 17.23. The line x + 3y = 0 in R2 is a vector space L. The vector $\begin{pmatrix}-3\\1\end{pmatrix}$ by itself is
a basis for L, so the dimension of L is 1. (Not a big surprise!)
Theorem 17.24 (Formula for the dimension of the nullspace). Suppose that the result of
putting a matrix A in row-echelon form is B. Then NS(A) = NS(B) (since row reductions
do not change the solutions), and
$$\boxed{\dim \mathrm{NS}(A) = \#\{\text{non-pivot columns of } B\}.}$$
(The boxed formula holds since it is the same as dim NS(B) = #free variables.)
In other words, here are the steps to find the dimension of the space of solutions to a
homogeneous linear system Ax = 0:
1. Perform Gaussian elimination on A to convert it to a matrix B in row-echelon form.
2. Identify the pivots of B.
3. Count the number of non-pivot columns of B; that number is dim NS(A).
Warnings:
• You must put the matrix in row-echelon form before counting non-pivot columns!
• If you have an augmented column (of zeros, since we are talking about a homogeneous
system), then do not include it in the count of non-pivot columns. (The augmented
column does not correspond to a free variable, or any variable at all for that matter,
so it should not be counted.)
And here are the steps to find a basis of the space of solutions to a homogeneous linear
system Ax = 0:
1. Perform Gaussian elimination on A to convert it to a matrix B in row-echelon form.
2. Use back-substitution to find the general solution to Bx = 0.
3. The general solution will be expressed as the general linear combination of a list of
vectors; that list is a basis of NS(A).
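The dimension count described above can be sketched directly, using the row-echelon matrix B from Problem 17.21 (the function name is ours):

```python
# dim NS(A) = number of non-pivot columns of a row-echelon form B of A.
def dim_nullspace(B):
    pivot_cols = set()
    for row in B:
        for c, x in enumerate(row):
            if x != 0:
                pivot_cols.add(c)   # the first nonzero entry of a row is its pivot
                break
    return len(B[0]) - len(pivot_cols)

B = [[2, 1, -3, 4],
     [0, 0, 4, -5],
     [0, 0, 0, 0]]                  # the row-echelon matrix of Problem 17.21
print(dim_nullspace(B))             # 2
```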
0x + 0y + 0z = 9,
i.e., 0 = 9, which cannot be satisfied. So there are no solutions! The system is inconsistent. 2
$$0x_1 + \cdots + 0x_n = \underbrace{b}_{\text{nonzero number}}.$$
Problem 18.3. For which vectors b ∈ R2 does the inhomogeneous linear system
$$\begin{pmatrix} 1&2&3 \\ 2&4&6 \end{pmatrix}\begin{pmatrix}x_1\\x_2\\x_3\end{pmatrix} = \mathbf{b}$$
have a solution?
Here is what happens in general for (possibly inhomogeneous) linear systems (the explana-
tion is the same as in the example above):
Theorem 18.5. The linear system Ax = b has a solution if and only if b is in CS(A).
For the matrix $A = \begin{pmatrix}1&2&3\\2&4&6\end{pmatrix}$ in the problem above,
$$\mathrm{CS}(A) = \text{the span of } \begin{pmatrix}1\\2\end{pmatrix}, \begin{pmatrix}2\\4\end{pmatrix}, \begin{pmatrix}3\\6\end{pmatrix},$$
is true for a matrix, it will remain true after any row operation. Similarly, any linear relation between
columns is preserved by row operations. So the linear relations between columns of A are the same as the
linear relations between columns of C. The condition that certain numbered columns (say the first, second,
and fourth) of a matrix form a basis is expressible in terms of which linear relations hold, so if certain
columns form a basis for CS(C), the same numbered columns will form a basis for CS(A). Also, performing
Gauss–Jordan reduction on B to obtain C in reduced row-echelon form does not change the pivot locations.
Thus it will be enough to show that the pivot columns of C form a basis of CS(C). Since C is in reduced
row-echelon form, the pivot columns of C are the first r of the m standard basis vectors for Rm , where r is
the number of nonzero rows of C. These columns are linearly independent, and every other column is a linear
combination of them, since the entries of C below the first r rows are all zeros. Thus the pivot columns of C
form a basis of CS(C).
In particular,
$$\dim \mathrm{CS}(A) = \#\{\text{pivot columns of } B\}.$$
Warning: Usually CS(A) ≠ CS(B).
Problem 18.6. Let A be the 3 × 5 matrix $\begin{pmatrix} 1&2&3&4&5 \\ -1&-2&9&10&11 \\ 1&2&9&11&13 \end{pmatrix}$.
(a) Find a basis for CS(A).
(b) What are dim NS(A) and dim CS(A)?
Solution:
(a) First we must find a row-echelon form. Add the first row to the second, and add −1
times the first row to the third:
$$\begin{pmatrix} 1&2&3&4&5 \\ 0&0&12&14&16 \\ 0&0&6&7&8 \end{pmatrix}.$$
Then add −1/2 times the second row to the third:
$$\begin{pmatrix} 1&2&3&4&5 \\ 0&0&12&14&16 \\ 0&0&0&0&0 \end{pmatrix},$$
which is in row-echelon form, with pivots in the first and third columns. Thus the first
and third columns of A form a basis for CS(A):
$$\begin{pmatrix}1\\-1\\1\end{pmatrix}, \quad \begin{pmatrix}3\\9\\9\end{pmatrix}.$$
(b) dim CS(A) = #pivot columns = 2, and dim NS(A) = #non-pivot columns = 3. □
18.4. Rank.
Proof.
Problem 18.8. Given vectors v1 , . . . , vn ∈ Rm , how can one compute a basis of Span(v1 , . . . , vn )?
Solution:
18.6. Example: a projection. Let f be the function from R3 to R3 that projects all of R3
onto the xy-plane:
$$f\begin{pmatrix}x\\y\\z\end{pmatrix} := \begin{pmatrix}x\\y\\0\end{pmatrix}.$$
We have
$$(\text{first column of } A) = f\begin{pmatrix}1\\0\\0\end{pmatrix} = \begin{pmatrix}1\\0\\0\end{pmatrix}, \quad (\text{second column of } A) = f\begin{pmatrix}0\\1\\0\end{pmatrix} = \begin{pmatrix}0\\1\\0\end{pmatrix}, \quad (\text{third column of } A) = f\begin{pmatrix}0\\0\\1\end{pmatrix} = \begin{pmatrix}0\\0\\0\end{pmatrix}.$$
Thus $A = \begin{pmatrix}1&0&0\\0&1&0\\0&0&0\end{pmatrix}$. □
NS(A) is a subspace of the input space:
$$\mathrm{NS}(A) = \{\text{solutions to } A\mathbf{x} = \mathbf{0}\} = \left\{\text{solutions to } f\begin{pmatrix}x\\y\\z\end{pmatrix} = \mathbf{0}\right\} = \left\{ \begin{pmatrix}0\\0\\z\end{pmatrix} : z \in \mathbb{R} \right\},$$
the z-axis, agreeing with what
Back to the example: What does the solution set to Ax = b look like?
• If b is not in CS(A), then there are no solutions.
• If b is in CS(A), say $b = \begin{pmatrix}b_1\\b_2\\0\end{pmatrix}$, then
$$\{\text{solutions to } A\mathbf{x} = \mathbf{b}\} = \left\{\text{solutions to } f\begin{pmatrix}x\\y\\z\end{pmatrix} = \begin{pmatrix}b_1\\b_2\\0\end{pmatrix}\right\} = \left\{ \begin{pmatrix}b_1\\b_2\\z\end{pmatrix} : z \in \mathbb{R} \right\}.$$
April 4
and
$$\begin{pmatrix}0\\1\end{pmatrix} \xrightarrow{\ \text{rotate}\ } \begin{pmatrix}-1\\0\end{pmatrix} \xrightarrow{\ \text{reflect}\ } \begin{pmatrix}-1\\0\end{pmatrix};$$
the answer is the matrix having these outputs as the first and second columns:
$$\begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix}. \qquad\square$$
Geometric meaning: The absolute value of det A is the area scaling factor (or volume scaling
factor or. . . ).
Theorem 19.2. The determinant of an upper triangular matrix equals the product of the
diagonal entries.
For example,
$$\det\begin{pmatrix} a_{11}&a_{12}&a_{13} \\ 0&a_{22}&a_{23} \\ 0&0&a_{33} \end{pmatrix} = a_{11}a_{22}a_{33}.$$
Why is the theorem true? The Laplace expansion along the first column shows that the
determinant is $a_{11}$ times an upper triangular minor with diagonal entries $a_{22}, \ldots, a_{nn}$;
repeating the argument (induction) gives $a_{11}a_{22}\cdots a_{nn}$.
Properties of determinants:
1. Interchanging two rows changes the sign of det A.
2. Multiplying an entire row by a scalar c multiplies det A by c.
3. Adding a multiple of a row to another row does not change det A.
4. If one of the rows is all 0, then det A = 0.
5. det(AB) = det(A) det(B) (assuming that A, B are square matrices of the same size).
In particular, row operations multiply det A by nonzero scalars, but do not change whether
det A = 0.
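Properties 1–3 give an efficient way to compute determinants by elimination; a Python sketch with exact arithmetic (the function name is ours):

```python
from fractions import Fraction

# Determinant via properties 1-3: eliminate to upper triangular form, tracking
# sign changes from row swaps; then multiply the diagonal (Theorem 19.2).
def determinant(M):
    M = [[Fraction(x) for x in row] for row in M]
    n, sign = len(M), 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)            # no pivot available: matrix is singular
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            sign = -sign                  # property 1: a swap flips the sign
        for r in range(col + 1, n):       # property 3: these steps don't change det
            c = M[r][col] / M[col][col]
            M[r] = [a - c * b for a, b in zip(M[r], M[col])]
    result = Fraction(sign)
    for i in range(n):
        result *= M[i][i]
    return result

print(determinant([[2, 0], [0, 1]]))      # 2, matching Problem 15.3
```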
Question 19.3. Suppose that A is a 3 × 3 matrix such that det A = 5. Doubling every entry
of A gives a matrix 2A. What is det(2A)? (Each of the three rows gets multiplied by 2, and
each such scaling multiplies the determinant by 2, so det(2A) = 2³ · 5 = 40.)
Example 19.4. Suppose that A is a square matrix and det A ≠ 0. Then RREF(A) has
nonzero determinant too, but is now upper triangular, so its diagonal entries are nonzero.
In fact, these diagonal entries are 1 since they are pivots of a RREF matrix. Moreover, all
non-diagonal entries are 0, by definition of RREF. So RREF(A) = I.
Now imagine solving Ax = b. Gauss–Jordan elimination converts the augmented matrix
[A|b] to [I|c] for some vector c. Thus Ax = b has the same solutions as Ix = c; the unique
solution is c.
What if instead we wanted to solve many equations with the same A, say, Ax1 = b1 , . . . ,
Axp = bp ? Use many augmented columns! Gauss–Jordan elimination converts [A|b1 . . . bp ]
to [I|c1 . . . cp ], and c1 , . . . , cp are the solutions.
In other words, to solve an equation AX = B to find the unknown matrix X, convert
[A|B] to RREF [I|C]. Then C is the solution. 2
The reason we talked about this is in order to compute inverse matrices; let’s define these
now. The inverse of a square matrix A, if it exists, is the matrix A−1 satisfying
AA−1 = A−1A = I.
Suppose that A represents the linear transformation f. Then A−1 exists if and only if an
inverse function f −1 exists; in that case, A−1 represents f −1.
Problem 19.6. Does the rotation matrix $R := \begin{pmatrix}\cos\theta & -\sin\theta \\ \sin\theta & \cos\theta\end{pmatrix}$ have an inverse? If so,
what is it?
Suppose that det A ≠ 0. In 18.02, you learned one algorithm to compute A−1, using the
cofactor matrix. Now that we know how to compute RREF, we can give a faster algorithm
(faster for big matrices, at least):
New algorithm to compute A−1 :
1. Form the n × 2n augmented matrix [A|I].
2. Convert to RREF; the result will be [I|B] for some n × n matrix B.
3. Then A−1 = B.
This is a special case of Example 19.4 since A−1 is the solution to AX = I.
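A sketch of this algorithm in Python (exact arithmetic; it assumes det A ≠ 0, so the pivot search below never fails; names are ours):

```python
from fractions import Fraction

# Invert A by row-reducing the augmented matrix [A | I] to [I | B]; then B = A^{-1}.
def inverse(A):
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]                   # build [A | I]
    for col in range(n):                               # Gauss-Jordan elimination
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]               # make the pivot 1
        for r in range(n):                             # clear the rest of the column
            if r != col and M[r][col] != 0:
                c = M[r][col]
                M[r] = [a - c * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]                      # the right half is A^{-1}

# inverse([[0, -1], [1, 0]]) is [[0, 1], [-1, 0]]: rotation by -90 degrees,
# consistent with Problem 19.6 (the inverse of a rotation is the reverse rotation)
```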
19.4. Conditions for invertibility. There are two types of square matrices A:
• those with det A ≠ 0 (called nonsingular or invertible), and
• those with det A = 0 (called singular).
The answer to the one question “Is det A = 0?” determines a lot about the geometry of A
and about solving systems Ax = b, as we’ll now explain.
So if you have a matrix A for which one of these conditions holds, then all of the conditions
hold for A.
Let’s explain the consequences of det A ≠ 0. Suppose that det A ≠ 0. Then the volume
scaling factor is not 0, so the input space Rn is not flattened by A. This means that
there are no “crushed dimensions”, so NS(A) = {0}. Since no dimensions were crushed,
the image CS(A) has the same dimension as the input space, namely n. By definition,
rank(A) = dim CS(A) = n. (Alternatively, this follows from dim NS(A) + rank(A) = n.) The
only n-dimensional subspace of Rn is Rn itself, so CS(A) = Rn . Thus every b is in CS(A), so
Ax = b has a solution for every b. The system Ax = b has the same number of solutions
as Ax = 0 (they are just shifted by adding a particular solution xp ); that number is 1 (the
only solution to Ax = 0 is 0). To say that Ax = b has exactly one solution for each b means
that the associated linear transformation f is a 1-to-1 correspondence, so f −1 exists, so A−1
exists. (Moreover, we showed how to find A−1 by applying Gauss–Jordan elimination to
[A|I].) We have RREF(A) = I as explained earlier, since I is the only RREF square matrix
with nonzero determinant.
19.4.2. Singular matrices. The same theorem can be stated in terms of the opposite conditions
(it’s essentially the same theorem, so this is really just review):
Characteristic polynomial: det(λI − A). (If instead you use det(A − λI), you will need to
change the sign when n is odd.)
Expanding out the determinant shows that the characteristic polynomial has the form
$$\lambda^n - (\operatorname{tr} A)\,\lambda^{n-1} + \cdots \pm \det A$$
(the ± is + if n is even, and − if n is odd).
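For n = 2 this form is just λ² − (tr A)λ + det A, so the eigenvalues follow from the quadratic formula; a sketch (names are ours; `cmath` keeps complex eigenvalues, as for the rotation matrix of Problem 19.12):

```python
import cmath

# Eigenvalues of a 2x2 matrix from the characteristic polynomial
# lambda^2 - (tr A)*lambda + det A, via the quadratic formula.
def eigenvalues_2x2(A):
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2

print(eigenvalues_2x2([[2, 3], [0, 2]]))   # both eigenvalues equal 2
print(eigenvalues_2x2([[0, -1], [1, 0]]))  # the complex pair i and -i
```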
Problem 19.11. Find the eigenvalues of the upper triangular matrix $A := \begin{pmatrix} 2&3&5 \\ 0&2&7 \\ 0&0&6 \end{pmatrix}$.
Solution: The characteristic polynomial is
$$\det(\lambda I - A) = \det\begin{pmatrix} \lambda-2 & -3 & -5 \\ 0 & \lambda-2 & -7 \\ 0 & 0 & \lambda-6 \end{pmatrix} = (\lambda-2)(\lambda-2)(\lambda-6),$$
so the eigenvalues, listed with multiplicity, are 2, 2, 6. □
In general, for any upper triangular or lower triangular matrix, the eigenvalues are the
diagonal entries.
There is one eigenspace for each eigenvalue. Each eigenspace is a vector space, so it can be
described as the span of a basis. To compute the eigenspace of λ, compute NS(A − λI) by
Gaussian elimination and back-substitution.
Problem 19.12. (Skipped) Find the eigenvalues and eigenvectors of the 90° counterclockwise
rotation matrix $R = \begin{pmatrix}0&-1\\1&0\end{pmatrix}$.
If λ is an eigenvalue of multiplicity m, then
$$1 \le (\text{dimension of the eigenspace of } \lambda) \le m.$$
Given λ, the dimension of the eigenspace of λ is also the maximum number of linearly
independent eigenvectors of eigenvalue λ that can be found. This dimension is at least 1
since A has at least one nonzero eigenvector of eigenvalue λ (otherwise λ would not have
been an eigenvalue). That this dimension is at most m requires more work to prove, and
we’re not going to do it in this class.
Problem 19.14. A 9 × 9 matrix has characteristic polynomial $(\lambda-2)^3(\lambda-5)^6$. What are
the possibilities for the dimension of the eigenspace of 2? (The multiplicity of 2 is m = 3,
so by the inequality above the dimension is 1, 2, or 3.)
Definition 19.15. The eigenspace of λ is called complete if its dimension equals the multi-
plicity m of λ, and deficient if its dimension is less than m. Warning: Different authors use different
terminology here.
Example 19.16. If the multiplicity is 1, then the dimension of the eigenspace is sandwiched
between 1 and 1, so the eigenspace is complete.
Definition 19.17. A matrix is complete if all its eigenspaces are complete. A matrix is
deficient if at least one of its eigenspaces is deficient.
For the application to solving linear systems of ODEs, given an n × n matrix A we will
want to find as many linearly independent eigenvectors as possible. To do this, we choose a
basis of each eigenspace, and concatenate these lists of eigenvectors; it turns out that the
resulting long list is linearly independent.
How many eigenvectors are in this list? Well,
$$\#\text{eigenvectors} = \sum_{\lambda} (\#\text{eigenvectors from the eigenspace of } \lambda) = \sum_{\lambda} \dim(\text{eigenspace of } \lambda) \le \sum_{\lambda} (\text{multiplicity of } \lambda) = n,$$
with equality if and only if every eigenspace is complete.
Why does concatenating the bases produce a linearly independent list? The vectors within each basis
are linearly independent, and there are no linear relations involving eigenvectors from different eigenspaces
because of the following:
Theorem 19.18. Fix a square matrix A. Eigenvectors corresponding to distinct eigenvalues are linearly
independent.
April 6
19.6.2. Examples. Here are three examples showing all the situations that can arise for a
2 × 2 matrix.
Example 19.19. Let $A := \begin{pmatrix} 1 & -2 \\ -1 & 0 \end{pmatrix}$. The characteristic polynomial is (λ − 2)(λ + 1), so
the eigenvalues (2 and −1) each have multiplicity 1, so the eigenspaces are automatically
complete. Calculation shows that the eigenspace of 2 has basis $\begin{pmatrix}-2\\1\end{pmatrix}$ and the eigenspace of
−1 has basis $\begin{pmatrix}1\\1\end{pmatrix}$; together, these two vectors form a basis for R2.
Example 19.20. Let $B = \begin{pmatrix}5&0\\0&5\end{pmatrix}$. Since B is upper triangular (even diagonal), the eigenvalues
are 5, 5. The eigenspace of 5 is NS(B − 5I), which is the set of solutions to $\begin{pmatrix}0&0\\0&0\end{pmatrix}\mathbf{x} = \mathbf{0}$,
which is the entire space. Its dimension (namely, 2) matches the multiplicity of the eigenvalue 5,
so this eigenspace is complete. Every vector is an eigenvector with eigenvalue 5. So it is easy
to find two linearly independent eigenvectors: for example, take $\begin{pmatrix}1\\0\end{pmatrix}$ and $\begin{pmatrix}0\\1\end{pmatrix}$.
Example 19.21. Let $C = \begin{pmatrix}5&3\\0&5\end{pmatrix}$. Again the eigenvalues are 5, 5. The eigenspace of 5 is
NS(C − 5I), which is the set of solutions to $\begin{pmatrix}0&3\\0&0\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix} = \mathbf{0}$. This system consists of a
single nontrivial equation 3y = 0. Thus the eigenspace is the set of vectors of the form $\begin{pmatrix}c\\0\end{pmatrix}$;
it is only 1-dimensional, even though the multiplicity of the eigenvalue 5 is still 2. This means
that this eigenspace is deficient, and hence C is deficient. It is impossible to find two linearly
independent eigenvectors.
Remark 19.23 (Using diagonalization to compute matrix powers). Suppose that A = SDS⁻¹.
Then

    A³ = SDS⁻¹ SDS⁻¹ SDS⁻¹ = SD³S⁻¹,

since each interior S⁻¹S cancels. More generally, for any integer n ≥ 0,

    Aⁿ = SDⁿS⁻¹.   (9)
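The cancellation above is easy to check numerically. Here is a sketch (my own, not from the notes) verifying A³ = SD³S⁻¹ for the matrix A of Example 19.19, using hand-rolled 2 × 2 routines:

```python
# Check A^3 = S D^3 S^{-1} for A from Example 19.19.
# A = [[1, -2], [-1, 0]] has eigenvalues 2, -1 with eigenvectors (-2, 1), (1, 1).

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, -2], [-1, 0]]
S = [[-2, 1], [1, 1]]                    # columns are the eigenvectors
det_S = S[0][0] * S[1][1] - S[0][1] * S[1][0]
S_inv = [[ S[1][1] / det_S, -S[0][1] / det_S],
         [-S[1][0] / det_S,  S[0][0] / det_S]]
D3 = [[2**3, 0], [0, (-1)**3]]           # D^3: just cube the diagonal entries

A3_direct = matmul(A, matmul(A, A))      # A * A * A
A3_diag = matmul(S, matmul(D3, S_inv))   # S D^3 S^{-1}
```

For larger powers the diagonalized form is far cheaper: cubing D costs two multiplications per diagonal entry, independent of how the off-diagonal entries of A would mix.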
Remark 20.1. These n solutions will automatically be linearly independent, since their values
at t = 0 are the eigenvectors, which are linearly independent. (The chosen eigenvectors
within each eigenspace are linearly independent, and there is no linear dependence between
eigenvectors with different eigenvalues.)
Remark 20.2. If some of the eigenvalues are complex, they must be included (if you ignore
them, you won’t find enough eigenvectors). In this case, you may wish to find a new basis of
real-valued functions (replace each pair x, x̄ in the basis by Re x, Im x).
Remark 20.3. If some eigenspace has dimension less than the multiplicity of the eigenvalue
(that is, A is deficient), then this method fails: it does not produce enough functions to form
a basis.
20.2.1. Definition. Consider a homogeneous linear system of n ODEs ẋ = Ax. (We’ll assume
that A is constant, but everything in this section remains true even if A is replaced by a matrix-valued
function A(t).) The dimension theorem says that the set of solutions is an n-dimensional vector
space. Let x1, . . . , xn be a basis of solutions. Write x1, . . . , xn as column vectors side-by-side
to form a matrix

    X(t) := ( x1 | x2 | · · · | xn ).

(It’s really a matrix-valued function, since each xi is a vector-valued function of t.) Any such
X(t) is called a fundamental matrix for ẋ = Ax. (There are many possible bases, so there are
many possible fundamental matrices.)
Solution:
(a) The functions

    e^{2t} (2, 1) = (2e^{2t}, e^{2t})   and   e^{3t} (1, 1) = (e^{3t}, e^{3t})

are a basis of solutions, so one fundamental matrix is

    X(t) = [2e^{2t} e^{3t}; e^{2t} e^{3t}].

(b) The solution will be X(t) c for some constant vector c. Thus

    x = [2e^{2t} e^{3t}; e^{2t} e^{3t}] (c1, c2)

for some c1, c2 to be determined. Set t = 0 and use the initial condition to get

    (4, 5) = [2 1; 1 1] (c1, c2).

In other words,

    2c1 + c2 = 4
    c1 + c2 = 5.

Solving leads to c1 = −1 and c2 = 6, so

    x = [2e^{2t} e^{3t}; e^{2t} e^{3t}] (−1, 6) = (−2e^{2t} + 6e^{3t}, −e^{2t} + 6e^{3t}). □
20.2.4. Criterion for a matrix to be a fundamental matrix. To say that each column of X(t)
is a solution is the same as saying that Ẋ = AX, because the matrix multiplication can be
done column-by-column.
For an n × n matrix whose columns are solutions, to say that the columns form a basis is
equivalent to saying that they are linearly independent (the space of solutions is n-dimensional,
so if n solutions are linearly independent, their span is the entire space). By the existence
and uniqueness theorem, linear independence of solutions is equivalent to linear independence
of their initial values at t = 0, i.e., to linear independence of the columns of X(0). So it is
equivalent to say that X(0) is a nonsingular matrix.
Conclusion: X(t) is a fundamental matrix for ẋ = Ax if and only if Ẋ = AX and X(0) is nonsingular.
April 9
20.3.1. Definition. Inspired by the formula for a real (or complex) number x,

    e^x = 1 + x + x²/2! + x³/3! + · · · ,

define, for any square matrix A,

    e^A := I + A + A²/2! + A³/3! + · · · .
So eA is another matrix of the same size as A.
20.3.2. Properties.
• e^0 = I (here 0 is the zero matrix)
  (Proof: e^0 = I + 0 + 0²/2! + · · · = I.)
• (d/dt) e^{At} = A e^{At}
  (Proof: Take the derivative of e^{At} term by term.)
• If AB = BA, then e^{A+B} = e^A e^B. (Warning: This can fail if AB ≠ BA.)
  (Proof: Skipped.)
• If D = [λ1 0; 0 λ2], then e^D = [e^{λ1} 0; 0 e^{λ2}].
  (Proof: D² = [λ1² 0; 0 λ2²], D³ = [λ1³ 0; 0 λ2³], and so on. Thus

      e^D = I + D + D²/2! + · · · = [1 + λ1 + λ1²/2! + · · ·   0; 0   1 + λ2 + λ2²/2! + · · ·] = [e^{λ1} 0; 0 e^{λ2}].

  A similar statement holds for diagonal matrices of any size.)
Theorem 20.6. The function eAt is a fundamental matrix for the system ẋ = Ax.
Compare:
The solution to ẋ = ax satisfying the initial condition x(0) = c is eat c .
The solution to ẋ = Ax satisfying the initial condition x(0) = c is eAt c .
Question 20.7. If the solution is as simple as eAt c, why did we bother with the method
involving eigenvalues and eigenvectors?
Answer: Because computing eAt is usually hard! (In fact, the standard method for
computing it involves finding the eigenvalues and eigenvectors of A.)
Problem 20.8. Use the matrix exponential to find the solution to the system

    ẋ = 2x + y
    ẏ = 2y

satisfying x(0) = 5, y(0) = 7.
Solution: This is ẋ = Ax with

    A := [2 1; 0 2] = [2 0; 0 2] + [0 1; 0 0] =: D + N.

Then N² = 0, so

    e^{Nt} = I + Nt + 0 + 0 + · · · = [1 t; 0 1].

Also, Dt and Nt commute (a scalar times I commutes with any matrix of the same size), so

    e^{At} = e^{Dt+Nt}
           = e^{Dt} e^{Nt}
           = [e^{2t} 0; 0 e^{2t}] [1 t; 0 1]
           = [e^{2t} te^{2t}; 0 e^{2t}].

Thus

    (x, y) = e^{At} (5, 7)
           = [e^{2t} te^{2t}; 0 e^{2t}] (5, 7)
           = (5e^{2t} + 7te^{2t}, 7e^{2t}). □
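The closed form e^{At} = [e^{2t} te^{2t}; 0 e^{2t}] can be double-checked against the defining power series. The sketch below (my own, not from the notes) truncates the series at 40 terms:

```python
import math

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm(M, terms=40):
    """Approximate e^M by the truncated series I + M + M^2/2! + ... (2x2 only)."""
    result = [[1.0, 0.0], [0.0, 1.0]]   # running sum, starts at I
    power = [[1.0, 0.0], [0.0, 1.0]]    # holds M^k / k!, starts at I
    for k in range(1, terms):
        power = matmul(power, M)
        power = [[entry / k for entry in row] for row in power]  # now M^k / k!
        result = [[result[i][j] + power[i][j] for j in range(2)] for i in range(2)]
    return result

t = 0.5
A = [[2.0, 1.0], [0.0, 2.0]]
At = [[a * t for a in row] for row in A]
numeric = expm(At)
closed = [[math.exp(2*t), t * math.exp(2*t)],
          [0.0, math.exp(2*t)]]
```

The series converges quickly here because the entries of At are small; in general, summing the series naively is a poor way to compute a matrix exponential, which is part of the answer to Question 20.7 above.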
Why is that silly? Because we already know how to solve ẋ = Ax when we have a basis of
eigenvectors (and their eigenvalues).
But. . . the same decoupling method also lets us solve an inhomogeneous linear system, and
that’s not silly:
    S ẏ = ASy + q(t)
    S ẏ = SDy + q(t)
    ẏ = Dy + S⁻¹ q(t).
(You may skip to the last of these equations.) This is a decoupled system of inhomogeneous
linear ODEs.
4. Solve for each coordinate function of y.
5. Compute Sy; the result is x.
Problem 21.1. Find a particular solution to ẋ = Ax + q, where A := [−4 −3; 6 5] and
q = (0, cos t).
Solution: We will solve it instead with q = (0, e^{it}) (complex replacement), and take the
real part of the solution at the very end.
Step 1. We have tr A = 1 and det A = −20 − (−18) = −2.
Characteristic polynomial: λ² − λ − 2 = (λ − 2)(λ + 1).
Eigenvalues: 2, −1. Therefore define

    D := [2 0; 0 −1].
Step 2. Calculating eigenspaces in the usual way leads to corresponding eigenvectors
(1, −2) and (1, −1), so define

    S := [1 1; −2 −1].

Now A = SDS⁻¹.
Step 3. The result of substituting x = Sy is

    ẏ = Dy + S⁻¹q.

We have

    S⁻¹q = (1/det S) [−1 −1; 2 1] (0, e^{it}) = (−e^{it}, e^{it}).

Step 4. Solving the decoupled ODEs ẏ1 = 2y1 − e^{it} and ẏ2 = −y2 + e^{it} (by ERF) gives the
particular solutions y1 = (2/5 + (1/5)i) e^{it} and y2 = (1/2 − (1/2)i) e^{it}.
Step 5. Then

    x = Sy = [1 1; −2 −1] ((2/5 + (1/5)i) e^{it}, (1/2 − (1/2)i) e^{it})
           = (9/10 − (3/10)i, −13/10 + (1/10)i) (cos t + i sin t).
Final step: Take the real part to get a particular solution to the original system:

    x = ((9/10) cos t + (3/10) sin t, −(13/10) cos t − (1/10) sin t). □
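This particular solution can be checked numerically: compare a centered-difference derivative of x(t) against Ax + q with A = [−4 −3; 6 5] and q = (0, cos t). A sketch (my own, not from the notes):

```python
import math

def x(t):
    """Candidate particular solution from Problem 21.1."""
    return (0.9 * math.cos(t) + 0.3 * math.sin(t),
            -1.3 * math.cos(t) - 0.1 * math.sin(t))

def residual(t, h=1e-6):
    """Max-abs difference between the two sides of x' = Ax + (0, cos t)."""
    x1, x2 = x(t)
    xp1 = (x(t + h)[0] - x(t - h)[0]) / (2 * h)   # centered difference
    xp2 = (x(t + h)[1] - x(t - h)[1]) / (2 * h)
    rhs1 = -4 * x1 - 3 * x2                        # first row of Ax + q
    rhs2 = 6 * x1 + 5 * x2 + math.cos(t)           # second row of Ax + q
    return max(abs(xp1 - rhs1), abs(xp2 - rhs2))

worst = max(residual(t) for t in [0.0, 0.7, 1.9, 3.1])
```

Since all the functions involved are trigonometric, a residual near rounding error at a handful of points is convincing evidence the algebra above came out right.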
21.2. Variation of parameters.
Long ago we learned how to use variation of parameters to solve inhomogeneous linear
ODEs
ẏ + p(t)y = q(t).
Now we’re going to use the same idea to solve an inhomogeneous linear system of ODEs
such as
ẋ = Ax + q,
where q is a vector-valued function of t. First find a basis of solutions to the corresponding
homogeneous system
ẋ = Ax,
and put them together to form a fundamental matrix X (a matrix-valued function of t).
We know that Xc, where c ranges over constant vectors, is the general solution to the
homogeneous equation. Replace c by a vector-valued function u: try x = Xu in the original
system:
    ẋ = Ax + q
    Ẋu + X u̇ = AXu + q
    AXu + X u̇ = AXu + q
    X u̇ = q
    u̇ = X⁻¹q.

Remark 21.2. Not mentioned in lecture. One choice of X is e^{At}, in which case u̇ = e^{−At} q and

    x = e^{At} u = e^{At} ∫ e^{−At} q dt.
April 11
22. Coordinates
22.1. Coordinates with respect to a basis.
Problem 22.1. The vectors (2, 1) and (−1, 1) form a basis for R², so any vector in R² is a linear
combination of them. Find c1 and c2 such that

    c1 (2, 1) + c2 (−1, 1) = (2, 4).

(These c1, c2 are called the coordinates of (2, 4) with respect to the basis. There is only one
solution, since if there were two different linear combinations giving (2, 4), subtracting them
would give a nontrivial linear combination giving 0, which is impossible since the basis vectors
are linearly independent.)
Solution: The vector equation amounts to the linear system

    2c1 − c2 = 2
    c1 + c2 = 4.

Adding the equations gives 3c1 = 6, so c1 = 2 and c2 = 2. □
“Coordinates with respect to a basis” make sense also in vector spaces of functions.
Problem 22.2. Let V be the vector space with basis consisting of the three functions 1,
t − 3, (t − 3)2 . Find the coordinates of the function t2 with respect to this basis.
Solution: We need

    t² = c1 (1) + c2 (t − 3) + c3 (t − 3)².

If two polynomials are equal as functions (i.e., for all values of t), then their coefficients must
match. Equating constant terms gives

    0 = c1 − 3c2 + 9c3.

Equating coefficients of t gives

    0 = c2 − 6c3.

Equating coefficients of t² gives

    1 = c3.

Solving this system of three equations leads to (c1, c2, c3) = (9, 6, 1). □
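A quick numerical check (not from the notes): two polynomials of degree ≤ 2 that agree at more than two points are equal, so evaluating both sides at a few sample points confirms (c1, c2, c3) = (9, 6, 1):

```python
# Verify the coordinates of t^2 with respect to the basis 1, t-3, (t-3)^2.
c1, c2, c3 = 9, 6, 1
points = [-2.0, 0.0, 1.5, 3.0, 10.0]
errors = [abs(t**2 - (c1 + c2 * (t - 3) + c3 * (t - 3)**2)) for t in points]
```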
Question 22.5. Suppose that v1 , . . . , vn is only an orthogonal basis of Rn . How can we find
the coordinates c1 , . . . , cn of a vector w with respect to this basis?
Answer: The same trick leads to

    w · v1 = c1 v1 · v1,

so we get

    c1 = (w · v1)/(v1 · v1), · · · , cn = (w · vn)/(vn · vn). □
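The projection formula is easy to test numerically. The sketch below uses an orthogonal basis of R² of my own choosing (not an example from the notes):

```python
# Coordinates with respect to an orthogonal (not orthonormal) basis of R^2,
# using c_i = (w . v_i) / (v_i . v_i).

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

v1, v2 = (2, 1), (-1, 2)          # orthogonal: dot(v1, v2) == 0
w = (2, 4)
c1 = dot(w, v1) / dot(v1, v1)     # = 8/5
c2 = dot(w, v2) / dot(v2, v2)     # = 6/5
recombined = tuple(c1 * a + c2 * b for a, b in zip(v1, v2))
```

The point of orthogonality is that each coordinate is found independently of the others; no linear system needs to be solved, which is exactly the trick used for Fourier coefficients below.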
Answer: The shortest period is 2π/3, but sin 3t is also periodic with period any positive
integer multiple of 2π/3, including 3(2π/3) = 2π:
sin(3(t + 2π)) = sin(3t + 6π) = sin 3t.
So the answer is yes.
23.2. Square wave. Define

    Sq(t) :=  1, if 0 < t < π,
             −1, if −π < t < 0,
and extend it to a periodic function of period 2π, called a square wave. The function Sq(t)
has jump discontinuities, for example at t = 0. If you must define Sq(0), compromise between the
upper and lower values: Sq(0) := 0. The graph is usually drawn with vertical segments at the
jumps (even though this violates the vertical line test).
It turns out that

    Sq(t) = (4/π) (sin t + sin 3t/3 + sin 5t/5 + · · ·).
We’ll explain later today where this comes from.
Try the “Fourier Coefficients” mathlet
http://mathlets.org/mathlets/fourier-coefficients/
23.3. Fourier series. A linear combination like 2 sin 3t − 4 sin 7t is another periodic function
of period 2π.
Definition 23.2. A Fourier series is a linear combination of the infinitely many functions
cos nt and sin nt as n ranges over integers:
    f(t) = a0/2 + a1 cos t + a2 cos 2t + a3 cos 3t + · · ·
                + b1 sin t + b2 sin 2t + b3 sin 3t + · · ·

(Terms like cos(−2t) are redundant since cos(−2t) = cos 2t. Also sin 0t = 0 produces nothing
new. But cos 0t = 1 is included; the first term is the coefficient a0/2 times the function 1.
We’ll explain later why we write a0/2 instead of a0.)
Written using sigma-notation,

    f(t) = a0/2 + Σ_{n=1}^{∞} an cos nt + Σ_{n=1}^{∞} bn sin nt.

Recall that, for example, Σ_{n=1}^{∞} bn sin nt means the sum of the series whose nth term is
obtained by plugging in the positive integer n into the expression bn sin nt, so

    Σ_{n≥1} bn sin nt = b1 sin t + b2 sin 2t + b3 sin 3t + · · · .
Any Fourier series as above is periodic of period 2π. (Later we’ll extend the definition
of Fourier series to include functions of other periods.) The numbers an and bn are called
the Fourier coefficients of f . Each summand (a0 /2, an cos nt, or bn sin nt) is called a Fourier
component of f .
Fourier’s theorem. “Every” periodic function f of period 2π “is” a Fourier series, and the
Fourier coefficients are uniquely determined by f .
(The word “Every” has to be taken with a grain of salt: The function has to be “reasonable”. Piecewise
differentiable functions with jump discontinuities are reasonable, as are virtually all other functions that arise
in physical applications.
The word “is” has to be taken with a grain of salt: If f has a jump discontinuity at τ , then the Fourier
series might disagree with f there; the value of the Fourier series at τ is always the average of the left limit
f (τ − ) and the right limit f (τ + ), regardless of the actual value of f (τ ).)
In other words, the functions
1, cos t, cos 2t, cos 3t, . . . , sin t, sin 2t, sin 3t, . . .
form a basis for the vector space of “all” periodic functions of period 2π.
Question 23.3. Given f , how do you find the Fourier coefficients an and bn ?
In other words, how do you find the coordinates of f with respect to the basis of cosines
and sines?
Can one define the dot product of two functions? Sort of.
Definition 23.4. If f and g are real-valued periodic functions with period 2π, then their
inner product is

    ⟨f, g⟩ := ∫_{−π}^{π} f(t) g(t) dt.

(It acts like a dot product f · g, but don’t write it that way because · could be misinterpreted as multiplication.)
Example 23.7.

    ⟨sin t, sin t⟩ = ∫_{−π}^{π} sin² t dt = ?
    ⟨cos t, cos t⟩ = ∫_{−π}^{π} cos² t dt = ?

Since cos t is just a shift of sin t, the answers are going to be the same. Also, the two answers
add up to

    ∫_{−π}^{π} (sin² t + cos² t) dt = ∫_{−π}^{π} 1 dt = 2π,

so each is π.
The same idea works to show that the functions 1, cos t, cos 2t, . . . , sin t, sin 2t, . . . are orthogonal: the inner product of any two different functions on this list is 0.
23.6. Meaning of the constant term. The constant term of the Fourier series of f is
    a0/2 = (1/(2π)) ∫_{−π}^{π} f(t) dt,
which is the average value of f on (−π, π).
Similarly, the Fourier series of an odd function f has only sine terms:

    f(t) = Σ_{n=1}^{∞} bn sin nt.

For the square wave, the coefficient formula gives bn = 4/(πn) if n is odd, and bn = 0 if n is even. Thus

    b1 = 4/π, b3 = 4/(3π), b5 = 4/(5π), . . .

and all other Fourier coefficients are 0.
Conclusion:

    Sq(t) = (4/π) (sin t + sin 3t/3 + sin 5t/5 + · · ·). □
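These coefficients can be confirmed by numerical integration; a sketch (my own, not from the notes) using the midpoint rule:

```python
import math

def sq(t):
    """Square wave of period 2*pi: +1 on (0, pi), -1 on (-pi, 0)."""
    t = math.remainder(t, 2 * math.pi)   # reduce t to [-pi, pi]
    return 1.0 if t > 0 else -1.0

def b(n, N=20000):
    """b_n = (1/pi) * integral_{-pi}^{pi} Sq(t) sin(nt) dt, midpoint rule."""
    h = 2 * math.pi / N
    midpoints = (-math.pi + (k + 0.5) * h for k in range(N))
    return (h / math.pi) * sum(sq(t) * math.sin(n * t) for t in midpoints)
```

With 20000 sample points the computed values match 4/π, 0, 4/(3π), . . . to several digits, as the formulas above predict.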
April 13
23.8. Finding a Fourier series representing a function on an interval.
Problem 23.9. Suppose that f (t) is a (reasonable) function defined only on the interval
(0, π). Find numbers a0 , a1 , . . . such that
a0
f (t) = + a1 cos t + a2 cos 2t + · · ·
2
for all t ∈ (0, π).
Solution: For any ai , the right hand side will define an even periodic function of period 2π
(if the series converges). So begin by extending f (t) to a function of the same type:
• Extend f (t) to an even function on (−π, π) by defining f (−t) := f (t) for all t ∈ (−π, 0)
(and then define f (0) and f (−π) arbitrarily).
• Shift the graph of f horizontally by integer multiples of 2π to get a period 2π function
defined on all of R.
Define

    an := (1/π) ∫_{−π}^{π} f(t) cos nt dt = (2/π) ∫_0^π f(t) cos nt dt.

Then

    f(t) = a0/2 + a1 cos t + a2 cos 2t + · · ·

holds for all t ∈ R, so in particular it holds for t ∈ (0, π) (possibly excluding points of
discontinuity). □
Remark 23.10. The same function f (t) on (0, π) can be extended to an odd periodic function
of period 2π, in order to obtain
f (t) = b1 sin t + b2 sin 2t + · · · ,
where

    bn := (1/π) ∫_{−π}^{π} f(t) sin nt dt = (2/π) ∫_0^π f(t) sin nt dt.
23.9.1. Warm-ups.
Review problem 1: Given a positive integer n, what periodic function x(t) of period 2π is
a solution to
ẍ + 50x = eint ?
Solution: The characteristic polynomial is p(r) := r² + 50. ERF gives

    x(t) = (1/p(in)) e^{int} = (1/((in)² + 50)) e^{int} = (1/(50 − n²)) e^{int}.

(This is periodic of period 2π.) □
Review problem 2: Given a positive integer n, what periodic function x(t) of period 2π is
a solution to
ẍ + 50x = sin nt ?
Solution: Take imaginary parts of the previous solution to get

    x(t) = Im ((1/(50 − n²)) e^{int}) = (1/(50 − n²)) Im e^{int} = (1/(50 − n²)) sin nt. □
Problem 23.11. Suppose that f (t) is an odd periodic function of period 2π. What periodic
function x(t) of period 2π is a solution to
ẍ + 50x = f (t) ?
Solution: Since f is odd, the Fourier series of f is a linear combination of the shape
f (t) = b1 sin t + b2 sin 2t + b3 sin 3t + · · · .
By the superposition principle, the system response to f (t) is
    x(t) = (1/49) b1 sin t + (1/46) b2 sin 2t + (1/41) b3 sin 3t + · · · .
Note that each Fourier component sin nt has a different gain: the gain depends on the
frequency.
One could write the answer using sigma-notation:
    x(t) = Σ_{n≥1} bn (1/(50 − n²)) sin nt.

This is better since it shows precisely what every term in the series is (no need to “guess the
pattern”). □
Think of f (t) as the input signal, and the solution x(t) as the system response (output
signal). Summary of the solution:
    input                   system response
    e^{int}                 (1/(50 − n²)) e^{int}
    sin nt                  (1/(50 − n²)) sin nt
    sin t                   (1/49) sin t
    sin 2t                  (1/46) sin 2t
    sin 3t                  (1/41) sin 3t
    ⋮                       ⋮
    Σ_{n≥1} bn sin nt       Σ_{n≥1} (1/(50 − n²)) bn sin nt
Problem 23.12. For which input signal sin nt is the gain the largest?
Solution: The complex gain is 1/(50 − n²). The gain is |1/(50 − n²)|, which is largest when
|50 − n²| is smallest. This happens for n = 7. □
The gain for sin 7t is 1, and the next largest gain, occurring for sin 6t and sin 8t, is 1/14.
Thus the system approximately filters out all the Fourier components of f (t) except for the
sin 7t term.
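A one-line numerical check (not from the notes) of the claim that n = 7 gives the largest gain:

```python
# Gains |1/(50 - n^2)| for inputs sin(nt); n = 7 should win, since 50 - 49 = 1.
gains = {n: abs(1.0 / (50 - n**2)) for n in range(1, 21)}
n_best = max(gains, key=gains.get)
```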
≈ 0.020 sin t + 0.008 sin 3t + 0.008 sin 5t + 0.143 sin 7t − 0.003 sin 9t − (even smaller terms)
so the coefficient of sin 7t is largest, and the coefficient of sin t is second largest. (This makes
sense since the Fourier coefficient 1/((50 − n²)n) is large only when one of n or 50 − n² is small.)
Remark 23.14. Even though the system response is a complicated Fourier series, with infinitely
many terms, only one or two are significant, and the rest are negligible.
non-periodic.
There are infinitely many other solutions, namely xp + c1 cos 7t + c2 sin 7t for any c1 and
c2, but these solutions still include the −(1/14) t cos 7t term and hence are not periodic. □
Conclusion: The system response is almost indistinguishable from a pure sinusoid of angular
frequency 7.
http://mathlets.org/mathlets/fourier-coefficients-complex/
If using headphones, start with a low volume, since pure sine waves carry more energy than
they seem to, and can damage your hearing after sustained listening.
Your ear is capable of decomposing a sound wave into its Fourier components of different
frequencies. Each frequency corresponds to a certain pitch. Increasing the frequency produces
a higher pitch. More precisely, multiplying the frequency by a number greater than 1 increases
the pitch by what in music theory is called an interval. For example, multiplying the frequency
by 2 raises the pitch by an octave, and multiplying by 3 raises the pitch an octave plus a
perfect fifth.
When an instrument plays a note, it is producing a periodic sound wave in which typically
many of the Fourier coefficients are nonzero. In a general Fourier series, the combination of
the first two nonconstant terms (a1 cos t + b1 sin t, if the period is 2π) is a sinusoid of some
frequency ν, and the next combination (e.g., a2 cos 2t + b2 sin 2t) has frequency 2ν, and so
on: the frequencies are the positive integer multiples of the lowest frequency ν. The note
corresponding to the frequency ν is called the fundamental, and the notes corresponding to
frequencies 2ν, 3ν, . . . are called the overtones.
The musical staffs below show these for ν ≈ 131 Hz (the C below middle C), with the
integer multiplier shown in green.
Question 23.17. Can you guess what note corresponds to 9ν?
April 18
23.11. Fourier series of arbitrary period. Everything we did with periodic functions of
period 2π can be generalized to periodic functions of other periods.
and extend it to a periodic function of period 2L. Express this new square wave f (t) in terms
of Sq.
Solution: To avoid confusion, let’s use u as the variable for Sq. Stretching the graph of
Sq(u) horizontally by a factor L/π produces the graph of f (t).
In other words, if t and u are related by t = (L/π) u (so that u = π corresponds to t = L), then
f(t) = Sq(u). In other words, u = πt/L, so f(t) = Sq(πt/L). □
Similarly we can stretch any function of period 2π to get a function of different period.
Let L be a positive real number. Start with “any” periodic function

    g(u) = a0/2 + Σ_{n≥1} an cos nu + Σ_{n≥1} bn sin nu

of period 2π. Stretching horizontally by a factor L/π gives a periodic function f(t) of period
2L, and “every” f of period 2L arises this way. By the same calculation as above,

    f(t) = g(πt/L)
         = a0/2 + Σ_{n≥1} an cos(nπt/L) + Σ_{n≥1} bn sin(nπt/L).
The substitution u = πt/L (and du = (π/L) dt) also leads to Fourier coefficient formulas for period
2L:

    an = (1/π) ∫_{−π}^{π} g(u) cos nu du
       = (1/π) ∫_{−L}^{L} g(πt/L) cos(nπt/L) (π/L) dt
       = (1/L) ∫_{−L}^{L} f(t) cos(nπt/L) dt.
A similar formula gives bn in terms of f .
23.11.1. The inner product for periodic functions of period 2L. Adapt the definition of the
inner product to the case of functions f and g of period 2L:
    ⟨f, g⟩ := ∫_{−L}^{L} f(t) g(t) dt.
(This conflicts with the earlier definition of hf, gi, for functions for which both make sense, so perhaps it
would be better to write hf, giL for the new inner product, but we won’t bother to do so.)
The same calculations as before show that the functions

    1, cos(πt/L), cos(2πt/L), cos(3πt/L), . . . , sin(πt/L), sin(2πt/L), sin(3πt/L), . . .

form an orthogonal basis for the vector space of “all” periodic functions of period 2L, with

    ⟨1, 1⟩ = 2L
    ⟨cos(nπt/L), cos(nπt/L)⟩ = L
    ⟨sin(nπt/L), sin(nπt/L)⟩ = L

(the average value of cos² ωt is 1/2 for any ω, and the average value of sin² ωt is 1/2 too).
This gives another way to derive the Fourier coefficient formulas for functions of period 2L.
23.11.2. Summary.
• Fourier’s theorem: “Every” periodic function f of period 2L is a Fourier series

      f(t) = a0/2 + Σ_{n≥1} an cos(nπt/L) + Σ_{n≥1} bn sin(nπt/L),

  where

      an = (1/L) ∫_{−L}^{L} f(t) cos(nπt/L) dt for all n ≥ 0,
      bn = (1/L) ∫_{−L}^{L} f(t) sin(nπt/L) dt for all n ≥ 1.
• If f is even, then only the cosine terms (including the a0 /2 term) appear.
• If f is odd, then only the sine terms appear.
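The period-2L coefficient formulas can be sanity-checked on a test function with known coefficients. The sketch below (my own example, not from the notes) should recover a1 = 1 and b3 = 2 for L = 3:

```python
import math

L = 3.0

def f(t):
    """A period-2L test function with known coefficients: a_1 = 1, b_3 = 2."""
    return math.cos(math.pi * t / L) + 2 * math.sin(3 * math.pi * t / L)

def coeff(kind, n, N=20000):
    """a_n or b_n = (1/L) * integral_{-L}^{L} f(t) trig(n pi t / L) dt (midpoint rule)."""
    h = 2 * L / N
    trig = math.cos if kind == "a" else math.sin
    return (h / L) * sum(
        f(-L + (k + 0.5) * h) * trig(n * math.pi * (-L + (k + 0.5) * h) / L)
        for k in range(N))
```

All other coefficients (including a0) come out numerically zero, consistent with the orthogonality relations listed above.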
Solution: One way would be to use the Fourier coefficient formulas directly. But we will
instead obtain the Fourier series for s(t) from the Fourier series for Sq(u), by stretching and
shifting.
First, stretch horizontally by a factor of 5/π to get

    Sq(πt/5) =  1, if 0 < t < 5,
               −1, if −5 < t < 0.

Here the difference between the upper and lower values is 2, but for s(t) we want a difference
of 6, so multiply by 3:

    3 Sq(πt/5) =  3, if 0 < t < 5,
                 −3, if −5 < t < 0.

Finally add 5:

    5 + 3 Sq(πt/5) = 8, if 0 < t < 5,
                     2, if −5 < t < 0.

Since

    Sq(u) = (4/π) Σ_{n≥1, odd} (1/n) sin nu,

we get

    s(t) = 5 + 3 Sq(πt/5)
         = 5 + 3 (4/π) Σ_{n≥1, odd} (1/n) sin(nπt/5)
         = 5 + Σ_{n≥1, odd} (12/(nπ)) sin(nπt/5).
Theorem 23.21. If f is a piecewise differentiable periodic function, then the Fourier series
of f (with the an and bn defined by the Fourier coefficient formulas)
• converges to f(t) at values of t where f is continuous, and
• converges to (f(t⁻) + f(t⁺))/2 where f has a jump discontinuity.
Example 23.22. The left limit Sq(0− ) = −1 and right limit Sq(0+ ) = 1 average to 0. The
Fourier series

    (4/π) (sin t + sin 3t/3 + sin 5t/5 + · · ·)

evaluated at t = 0 converges to 0 too.
23.13. Antiderivative of a Fourier series. Suppose that f is a piecewise differentiable
periodic function. For any number C, the formula F(t) := ∫₀ᵗ f(τ) dτ + C defines an
antiderivative of f in the sense that F′(t) = f(t) at any t where f is continuous. (If f has
jump discontinuities, then at the jump discontinuities F will be only continuous, not differentiable.)
The function F is not necessarily periodic! For example, if f is a function of period 2 such
that

    f(t) :=  2, if 0 < t < 1,
            −1, if −1 < t < 0,

then the integral of f over one period is 2 − 1 = 1 > 0, so F increases by 1 over each period
and is not periodic.
An even easier example: if f(t) = 1, then F(t) = t + C for some C, so F(t) is not periodic.
But if the constant term a0 /2 in the Fourier series of f is 0, then F is periodic, and its
Fourier series can be obtained by taking the simplest antiderivative of each cosine and sine
term, and adding an overall +C, where C is the average value of F .
Problem 23.23. Let T (t) be the periodic function of period 2 such that T (t) = |t| for
−1 ≤ t ≤ 1; this is called a triangle wave. Find the Fourier series of T (t).
Solution: We could use the Fourier coefficient formula. But instead, notice that T(t) has
slope −1 on (−1, 0) and slope 1 on (0, 1), so T(t) is an antiderivative of the period 2 square
wave

    Sq(πt) = Σ_{n≥1, odd} (4/(nπ)) sin nπt.

Taking an antiderivative termwise (and using that the average value of T(t) is 1/2) gives

    T(t) = 1/2 + Σ_{n≥1, odd} (4/(nπ)) (−cos nπt/(nπ))
         = 1/2 − Σ_{n≥1, odd} (4/(n²π²)) cos nπt. □
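Since these coefficients decay like 1/n², partial sums of this series converge quickly to the triangle wave. A numerical sketch (my own, not from the notes):

```python
import math

def T_partial(t, n_max=999):
    """Partial sum of T(t) = 1/2 - sum over odd n of 4/(n^2 pi^2) cos(n pi t)."""
    return 0.5 - sum(4.0 / (n**2 * math.pi**2) * math.cos(n * math.pi * t)
                     for n in range(1, n_max + 1, 2))

samples = [-0.8, -0.25, 0.0, 0.3, 1.0]
errors = [abs(T_partial(t) - abs(t)) for t in samples]   # T(t) = |t| on [-1, 1]
```

Contrast this with the square wave itself, whose 1/n coefficients make partial sums converge much more slowly (and not uniformly near the jumps).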
April 20
Remark 23.24. A Fourier series of a piecewise differentiable periodic function f can also be
differentiated termwise, but the result will often fail to converge. For example, the termwise
derivative of the Fourier series of Sq(t) gives a nonsensical value at t = 0. (Here is one good case,
however: If f is continuous and piecewise twice differentiable, then the derivative series converges.)
23.14. Review.
23.14.1. Game: What is the matrix? The following two matrices exhibit different phenomena
(the first is the “blue” matrix, the second the “red” one):

    [2 1; 1 2],   [1/2 −1/2; −1/2 1/2].
You take the blue matrix, and it’s nonsingular, so things look pretty nice, just a little
distorted.
You take the red matrix, and dimensions get crushed!
Remember: all I’m offering is the truth.
• Which matrix satisfies det A = 0? The red one.
• Which matrix has area scaling factor 0? The red one.
• For which matrix are there nonzero vectors in NS(A)? The red one.
• Which matrix has rank 2? The blue one.
• Which matrix has column space equal to the whole space R2 ? The blue one.
• Which matrix defines a function R2 → R2 that is a 1-to-1 correspondence? The blue
one.
• For which matrix A is it true that for every vector b in R2 , the system Ax = b is
solvable? The blue one.
• Which matrix has an inverse? The blue one.
• Which matrix has RREF(A) = I? The blue one.
The point is that the answer to the question “Is det A = 0?” can tell you a lot about a matrix.
Problem 23.25. Define f (t) = |t| for −π ≤ t ≤ π and extend f (t) to a periodic function of
period 2π. Find the periodic solution to ẋ + x = f (t).
Solution: Either using the Fourier coefficient formulas or integrating Sq(t) shows that

    f(t) = π/2 − (4/π) (cos t + cos 3t/9 + cos 5t/25 + · · ·)
         = π/2 − Σ_{n≥1, odd} (4/(πn²)) cos nt.

(The constant term is π/2 since that is the average value of f.)
The strategy is to first solve
ẋ + x = cos nt
and then take a linear combination to solve the original ODE. For this, first solve the complex
replacement
ż + z = eint
by using ERF: the characteristic polynomial is p(r) := r + 1, so ERF gives a periodic solution

    z = (1/p(in)) e^{int}
      = (1/(1 + in)) e^{int}
      = (1/(1 + in)) · ((1 − in)/(1 − in)) e^{int}
      = ((1 − in)/(1 + n²)) (cos nt + i sin nt)
      = (1/(1 + n²)) ((cos nt + n sin nt) + i(−n cos nt + sin nt)).

Taking the real part gives the periodic solution to ẋ + x = cos nt:

    x = (1/(1 + n²)) (cos nt + n sin nt).

In particular, the n = 0 case of this says that the periodic solution to ẋ + x = 1 is x = 1
(kind of obvious, in hindsight). Taking a linear combination gives the answer to the original
problem:

    x = π/2 − Σ_{n≥1, odd} (4/(πn²)) (1/(1 + n²)) (cos nt + n sin nt).
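The building-block solution x = (cos nt + n sin nt)/(1 + n²) can be verified numerically by checking ẋ + x = cos nt with a centered difference; a sketch (my own, not from the notes):

```python
import math

def x_n(n, t):
    """Claimed periodic solution to x' + x = cos(nt)."""
    return (math.cos(n * t) + n * math.sin(n * t)) / (1 + n**2)

def residual(n, t, h=1e-6):
    """|x' + x - cos(nt)| with a centered-difference derivative."""
    deriv = (x_n(n, t + h) - x_n(n, t - h)) / (2 * h)
    return abs(deriv + x_n(n, t) - math.cos(n * t))

worst = max(residual(n, t) for n in (1, 3, 5) for t in (0.0, 0.4, 2.2))
```

By linearity (superposition), checking each Fourier component like this is exactly what justifies the term-by-term answer above.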
Problem 23.26. Let f (t) be the same function as above. For which angular frequencies
ω > 0 will the ODE
ẍ + ω 2 x = f (t)
exhibit pure resonance (fail to have a periodic solution)?
Solution: The solutions to the harmonic oscillator ẍ + ω 2 x = 0 are the linear combinations
of cos ωt and sin ωt; it has natural frequency ω. Pure resonance will occur when one of the
Fourier components in the right hand side has a matching angular frequency. There is a
nonzero Fourier component of frequency n for each odd integer n ≥ 1. Thus pure resonance
occurs exactly when ω is one of the numbers
1, 3, 5, . . . .
23.14.4. Solving an inhomogeneous system of ODEs. What are the methods to solve an
inhomogeneous system ẋ = Ax + q?
1. Convert to a higher-order ODE involving only one unknown function.
2. Decouple (assuming that A is complete). (Theory: If A = SDS⁻¹, substitute x = Sy to get
S ẏ = (SDS⁻¹)Sy + q = SDy + q, and multiply by S⁻¹ on the left to get ẏ = Dy + S⁻¹q.)
What you actually do:
• Calculate eigenvalues to get the diagonal matrix D.
• Calculate a basis of each eigenspace, and make all these eigenvectors into columns of
a matrix S.
• Compute S⁻¹ by converting [S|I] to RREF [I|?].
• Compute S⁻¹q.
• Solve the decoupled system ẏ = Dy + S⁻¹q for y1 and y2 separately.
• Compute x = Sy.
3. Variation of parameters. (Theory: If X is a fundamental matrix for ẋ = Ax, then
substitute x = Xu to get

    Ẋu + X u̇ = AXu + q
    AXu + X u̇ = AXu + q
    X u̇ = q
    u̇ = X⁻¹q;

integrate to find u, and then x = Xu.)
If there are initial conditions, first find the general solution to the inhomogeneous system,
and then use the initial conditions to solve for the unknown parameters (and plug them back
in at the end).
April 23
Midterm 3
April 25
(The two conditions are initial conditions since they are the value and derivative at the same
x-value, namely x = 0.)
Solution: The function v(x) = 0 is one solution. The uniqueness part of the existence and
uniqueness theorem says that there is only one solution, so 0 is the only solution.
24.2. New examples: boundary value problems.
(The two conditions are boundary conditions since they are at different x-values.)
Warning: There is no existence and uniqueness theorem for boundary value problems!
Although 0 is still a solution, there is no guarantee that there are not others. In fact, we’ll
see soon that this particular problem has other solutions, namely v(x) = b sin 3x for any
constant b.
24.3. Solving a family of boundary value problems. Let’s solve a whole family of
boundary value problems like these at once.
Problem 24.4. For each real number λ, find all functions v(x) on [0, π] satisfying

    v″(x) = λ v(x), v(0) = 0, v(π) = 0.

Solution:
Case 1: λ > 0. The general solution to v″ = λv is a e^{√λ x} + b e^{−√λ x}, and the boundary
conditions say

    a + b = 0
    a e^{√λ π} + b e^{−√λ π} = 0.

Since det [1 1; e^{√λ π} e^{−√λ π}] ≠ 0, the only solution to this linear system is (a, b) = (0, 0). Thus
the only solution to the boundary value problem is v = 0.
Case 2: λ = 0. Then the general solution is a + bx, and the boundary conditions say
a=0
a + bπ = 0.
Again the only solution to this linear system is (a, b) = (0, 0). Thus the only solution to the
boundary value problem is v = 0.
Case 3: λ < 0. The roots of the characteristic polynomial are again ±√λ, but the number
λ is negative, so each root will be a real number times i. In order to simplify the formula for
these roots, define ω to be the positive real number such that λ = −ω²; then the roots ±√λ
are simply ±iω. Now the functions e^{iωx} and e^{−iωx} form a basis of solutions to v″(x) = λv(x).
The functions cos ωx and sin ωx form a real-valued basis for the same vector space. Therefore
the general solution is a cos ωx + b sin ωx. The first boundary condition says a = 0, so
v = b sin ωx. The second boundary condition then says b sin ωπ = 0, which says different
things about b, depending on whether ω is an integer:
• If ω is not an integer, then sin ωπ 6= 0, so the second condition implies b = 0.
• If ω is an integer n, then sin ωπ = 0, so b can be anything. In this case, λ = −n2 for
some positive integer n (positive since n = ω > 0), and v(x) can be b sin nx for any
constant b.
just as Av = λv has a nonzero solution v only for special values of λ, namely the eigenvalues
of A. But the differential operator d²/dx² has infinitely many eigenvalues, as one would expect
for an ∞ × ∞ matrix.
The nonzero solutions v(x) to (d²/dx²) v = λv satisfying the boundary conditions are called
eigenfunctions, since they act like eigenvectors.
Lemma 24.5. Suppose that f (x) and g(t) are functions of independent variables x and t,
respectively. If f (x) = g(t) for all values of x and t, then there is a constant λ such that
f (x) = λ for all x and g(t) = λ for all t.
Problem 25.1. An insulated uniform metal rod with exposed ends starts at a constant
temperature, but then its ends are held in ice at 0◦ C. Model its temperature.
Variables and functions: Define
Here
• L, A, and u0 are constants;
• x and t are independent variables; and
• u = u(x, t) and q = q(x, t) are functions defined for x ∈ [0, L] and t ≥ 0.
Physics: Each bit of the rod contains internal energy, consisting of the microscopic kinetic
energy of particles (and the potential energy associated with microscopic forces). This energy
can be transferred from point to point, via atoms colliding with nearby atoms. Heat flux
density measures such heat transfer from left to right across a cross-section of the rod, per
unit area, per unit time.
We will use three laws of physics:
1. The first law of thermodynamics (conservation of energy), in the special case in which no
work is being done, states that for any bit of the rod,
Divide by A dx dt to get

    ∂u/∂t ∝ −∂q/∂x.

(More correct would be to use ∆x, ∆t, and so on, and to take a limit, but the end result is the same.)
Finally, substitute Fourier’s law q ∝ −∂u/∂x into the right hand side to get the heat equation

    ∂u/∂t = α ∂²u/∂x²,
for some constant α > 0 (called thermal diffusivity or the heat diffusion coefficient) that
depends only on the material. The heat equation is a second-order homogeneous linear partial
differential equation involving the unknown function u(x, t).
Remark 25.2. This PDE makes physical sense, since if the temperature profile (graph of u(x, t)
versus x at a fixed time) is curving upward at a point (∂²u/∂x² > 0), then the average of the
point’s neighbors is warmer than the point, so the point’s temperature should increase.
Boundary conditions: u(0, t) = 0 and u(L, t) = 0 for all t ≥ 0 (for u in degrees Celsius).
Initial condition: u(x, 0) = u0 for all x ∈ (0, L).
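The model above can also be explored numerically. The following is a minimal sketch, not part of the notes: it uses the simplest explicit finite-difference scheme (forward time, centered space), with the grid sizes and default values α = 1, L = π, u0 = 1 chosen only for illustration.

```python
import math

def heat_ftcs(u0=1.0, L=math.pi, alpha=1.0, nx=50, dt=1e-4, t_end=0.1):
    """Explicit finite-difference (FTCS) approximation of u_t = alpha*u_xx
    with u(0,t) = u(L,t) = 0 and u(x,0) = u0 at the interior grid points.
    Illustrative sketch only; dt must satisfy alpha*dt/dx^2 <= 1/2."""
    dx = L / nx
    r = alpha * dt / dx**2
    assert r <= 0.5, "stability condition violated"
    u = [u0] * (nx + 1)
    u[0] = u[nx] = 0.0                      # ends held at 0 degrees
    for _ in range(int(t_end / dt)):
        u = ([0.0]
             + [u[i] + r * (u[i+1] - 2*u[i] + u[i-1]) for i in range(1, nx)]
             + [0.0])
    return u
```

The returned list samples the temperature profile at time t_end; its midpoint can be compared against the Fourier series solution derived below.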
April 27
Summary of last lecture:
• We modeled an insulated metal rod with exposed ends held at 0 °C.
• Using physics, we found that its temperature u(x, t) was governed by the PDE
∂u/∂t = α ∂²u/∂x² (the heat equation).
For simplicity, we specialized to the case α = 1, length π, and initial temperature
u(x, 0) = 1.
• Trying u = w(t)v(x) led to separate ODEs for v and w, leading to solutions e^(−n²t) sin nx
for n = 1, 2, . . . to the PDE with boundary conditions.
• We took linear combinations to get the general solution
u(x, t) = b1 e^(−t) sin x + b2 e^(−4t) sin 2x + b3 e^(−9t) sin 3x + · · · . (10)
25.3. Initial condition. As usual, we postponed imposing the initial condition, but now it
is time to impose it.
Question 25.3. Which choices of b1 , b2 , . . . make the solution above also satisfy the initial
condition u(x, 0) = 1 for x ∈ (0, π)?
Set t = 0 in (10) and use the initial condition on the left to get
1 = b1 sin x + b2 sin 2x + b3 sin 3x + · · · for all x ∈ (0, π),
which must be solved for b1, b2, . . .. Section 23.8 showed how to find such bi: the left hand
side extends to an odd function of period 2π, namely Sq(x), so we need to solve
Sq(x) = b1 sin x + b2 sin 2x + b3 sin 3x + · · · .
Question 25.4. What does the temperature profile look like when t is large?
Answer: All the Fourier components are decaying, so u(x, t) → 0 as t → +∞ at every
position. Thus the temperature profile approaches a horizontal segment, the graph of the
zero function. But the Fourier components of higher frequency decay much faster than the
first Fourier component, so when t is large, the formula
u(x, t) ≈ (4/π) e^(−t) sin x
is a very good approximation. Eventually, the temperature profile is indistinguishable from a
sinusoid of angular frequency 1 whose amplitude is decaying to 0. This is what was observed
in the mathlet. 2
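This dominance of the first Fourier mode is easy to check numerically. A small sketch (my own illustration; the time t = 1 is an arbitrary choice):

```python
import math

def u_series(x, t, n_max=99):
    """Partial sum of the series solution found above:
    u(x,t) = (4/pi) * sum over odd n of e^(-n^2 t) sin(nx)/n."""
    return (4 / math.pi) * sum(math.exp(-n * n * t) * math.sin(n * x) / n
                               for n in range(1, n_max + 1, 2))

# At t = 1 the n = 3, 5, ... terms carry factors e^(-9), e^(-25), ...,
# so the one-term approximation (4/pi) e^(-t) sin x is already excellent.
x, t = math.pi / 2, 1.0
one_term = (4 / math.pi) * math.exp(-t) * math.sin(x)
```

Evaluating both at x = π/2 shows the full partial sum and the one-term approximation agree to several decimal places once t is of order 1.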
25.4. Analogy between a linear system of ODEs and the heat equation. We can
continue the table of analogies from Section 24.4:
Problem 25.5. Consider the same rod (α = 1, length π, initial temperature u(x, 0) = 1), but now suppose that the ends are held at the constant temperatures u(0, t) = 0 and u(π, t) = 20 instead of both at 0. What is u(x, t)?
Solution:
1. Forget the initial condition for now, and look for a solution u = u(x) that does not depend
on t. Plugging this into the heat equation PDE gives 0 = ∂²u/∂x². The general solution to this
simplified DE is u(x) = ax + b. Imposing the boundary conditions u(0) = 0 and u(π) = 20
leads to b = 0 and a = 20/π, so up = (20/π) x. (This is the solution whose temperature profile
is an unchanging straight line from u = 0 at x = 0 up to u = 20 at x = π.)
2. The PDE with the homogeneous boundary conditions is what we solved earlier; the general
solution is
uh = b1 e^(−t) sin x + b2 e^(−4t) sin 2x + b3 e^(−9t) sin 3x + · · · .
3. The general solution to the PDE with inhomogeneous boundary conditions is
u(x, t) = up + uh = (20/π) x + b1 e^(−t) sin x + b2 e^(−4t) sin 2x + b3 e^(−9t) sin 3x + · · · . (11)
4. To find the bn, set t = 0 and use the initial condition on the left:
1 = (20/π) x + b1 sin x + b2 sin 2x + b3 sin 3x + · · · for all x ∈ (0, π).
1 − (20/π) x = b1 sin x + b2 sin 2x + b3 sin 3x + · · · for all x ∈ (0, π).
Extend 1 − (20/π) x on (0, π) to an odd periodic function f(x) of period 2π. Then the bn are
π
the Fourier coefficients of f (x); they can be calculated in two ways:
• Use the Fourier coefficient formulas directly:
bn = (1/π) ∫_{−π}^{π} f(x) sin nx dx = (2/π) ∫_0^π (1 − (20/π) x) sin nx dx.
• Use the Fourier coefficient formulas to find the Fourier series for the odd periodic
extensions of 1 and x separately, namely
1 = (4/π) Σ_{n≥1, odd} (sin nx)/n
x = 2 Σ_{n≥1} (−1)^(n+1) (sin nx)/n,
for x ∈ (0, π), and take a linear combination to get 1 − (20/π) x.
π
Either way, we get
f(x) = −(36/π) sin x + (40/(2π)) sin 2x − (36/(3π)) sin 3x + (40/(4π)) sin 4x − · · · ;
that is,
b1 = −36/π, b2 = 40/(2π), b3 = −36/(3π), b4 = 40/(4π), . . . .
Plug the bn back into (11) to get
u(x, t) = (20/π) x − (36/π) e^(−t) sin x + (40/(2π)) e^(−4t) sin 2x − (36/(3π)) e^(−9t) sin 3x + (40/(4π)) e^(−16t) sin 4x − · · · .
25.6. Insulated ends.
Problem 25.6. Consider the same insulated uniform metal rod as before (α = 1, length π),
but now assume that the ends are insulated too (instead of exposed and held in ice), and
that the initial temperature is given by u(x, 0) = x for x ∈ (0, π). Now what is u(x, t)?
Solution: As usual, we temporarily forget the initial condition, and use it only at the end.
“Insulated ends” means that there is zero heat flow through the ends, so the heat flux
density function q ∝ −∂u/∂x is 0 when x = 0 or x = π. In other words, “insulated ends” means
that the boundary conditions are
(∂u/∂x)(0, t) = 0, (∂u/∂x)(π, t) = 0 for all t > 0, (12)
instead of u(0, t) = 0 and u(π, t) = 0. So we need to solve the heat equation
∂u/∂t = ∂²u/∂x²
with the boundary conditions (12). Separation of variables u(x, t) = v(x) w(t) leads to
v''(x) = λ v(x) with v'(0) = 0 and v'(π) = 0
ẇ(t) = λ w(t)
for a constant λ.
Lecture actually ended here.
Looking at the cases λ > 0, λ = 0, λ < 0 (see the details in a side calculation presented
after the rest of this solution), we find that
λ = −n² and v(x) = cos nx (times a scalar)
where n is one of 0, 1, 2, . . . (this time it turns out that n = 0 also gives a nonzero function).
For each such v(x), the corresponding w is w(t) = e^(−n²t) (times a scalar), and the normal
mode is
u = e^(−n²t) cos nx.
The case n = 0 is the constant function 1, so the general solution to the PDE with boundary
conditions is
u(x, t) = a0/2 + a1 e^(−t) cos x + a2 e^(−4t) cos 2x + a3 e^(−9t) cos 3x + · · · .
Finally, we bring back the initial condition: substitute t = 0 and use the initial condition
on the left to get
x = a0/2 + a1 cos x + a2 cos 2x + a3 cos 3x + · · ·
for all x ∈ (0, π). The right hand side is a period 2π even function, so extend the left hand
side to a period 2π even function T(x), a triangle wave, which is an antiderivative of
Sq(x) = (4/π) (sin x + (sin 3x)/3 + (sin 5x)/5 + · · ·).
Integration gives
T(x) = a0/2 − (4/π) (cos x + (cos 3x)/9 + (cos 5x)/25 + · · ·),
and the constant term a0/2 is the average value of T(x), which is π/2. Thus
T(x) = π/2 − (4/π) (cos x + (cos 3x)/9 + (cos 5x)/25 + · · ·)
u(x, t) = π/2 − (4/π) (e^(−t) cos x + e^(−9t) (cos 3x)/9 + e^(−25t) (cos 5x)/25 + · · ·).
This answer makes physical sense: when the entire bar is insulated, its temperature tends to
a constant equal to the average of the initial temperature. 2
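The final formula can be spot-checked numerically: at t = 0 it should reproduce the initial temperature x, and for large t it should approach the average π/2. A small sketch (my own):

```python
import math

def u_insulated(x, t, n_max=199):
    """Partial sum of the solution found above for insulated ends:
    u(x,t) = pi/2 - (4/pi) * sum over odd n of e^(-n^2 t) cos(nx)/n^2."""
    tail = sum(math.exp(-n * n * t) * math.cos(n * x) / (n * n)
               for n in range(1, n_max + 1, 2))
    return math.pi / 2 - (4 / math.pi) * tail
```

Because the coefficients decay like 1/n², even the t = 0 partial sum converges reasonably fast, unlike the sine series for the exposed-ends problem.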
Problem 25.7. For each real number λ, find all functions v(x) on [0, π] satisfying
v''(x) = λ v(x), v'(0) = 0, v'(π) = 0.
Solution:
Case 1: λ > 0. Write λ = ω² with ω > 0. The general solution is v = a e^(ωx) + b e^(−ωx).
The first boundary condition says ω(a − b) = 0, so a = b; the second then says
aω(e^(ωπ) − e^(−ωπ)) = 0, so a = b = 0. Thus the only solution is v = 0.
Case 2: λ = 0. The general solution is v = a + bx. The boundary conditions v'(0) = 0 and v'(π) = 0 say
b=0
b = 0.
Thus the solutions to the boundary value problem are the constant functions a.
Case 3: λ < 0. We can write λ = −ω² for some ω > 0. Then the general solution is
a cos ωx + b sin ωx. The first boundary condition says bω = 0, so b = 0, so v = a cos ωx. The
second boundary condition then says −aω sin ωπ = 0, which says different things about a,
depending on whether ω is an integer:
• If ω is not an integer, then sin ωπ ≠ 0, so the second condition implies a = 0.
• If ω is an integer n, then sin ωπ = 0, so a can be anything. In this case, λ = −n² for
some positive integer n (positive since n = ω > 0), and v(x) can be a cos nx for any
constant a.
Setting n = 0 into the result of case 3 gives the result of case 2, so we combine these cases
in the following:
Remark 25.8. The kind of boundary conditions we had earlier, specifying the values on the
boundary, are called Dirichlet boundary conditions. But the kind we have now, specifying the
derivative values on the boundary, are called Neumann boundary conditions.
April 30
We model a vibrating string (such as a guitar string) stretched along the x-axis from 0 to L. Here
• L, ρ, T are constants;
• t, x are independent variables; and
• u = u(x, t) is a function defined for x ∈ [0, L] and t ≥ 0. The vertical displacement is
measured relative to the equilibrium position in which the string makes a straight
line.
At any given time t, the string is in the shape of the graph of u(x, t) as a function of x.
Assumption: The string is taut, so the vertical displacement of the string is small, and the
slope of the string at any point is small.
Consider the piece of string between positions x and x + dx. Let θ be the (small) angle
formed by the string and the horizontal line at position x, and let θ + dθ be the same angle
at position x + dx.
Newton’s second law says that ma = F. Taking the vertical component of each side gives
(ρ dx) ∂²u/∂t² = T sin(θ + dθ) − T sin θ = T d(sin θ),
where ρ dx is the mass of the piece, ∂²u/∂t² is its vertical acceleration, and the right hand
side is the vertical component of the force.
Side calculation:
d(sin θ) = cos θ dθ
d(tan θ) = (1/cos²θ) dθ,
but cos θ = 1 − θ²/2! + · · · ≈ 1, so up to a factor that is very close to 1 we get
d(sin θ) ≈ d(tan θ) = d(∂u/∂x),
since tan θ is the slope of the string. Thus
(ρ dx) ∂²u/∂t² ≈ T d(∂u/∂x).
Divide by ρ dx to get
∂²u/∂t² ≈ (T/ρ) d(∂u/∂x)/dx
≈ (T/ρ) ∂²u/∂x².
If we define a new constant c := √(T/ρ), then this becomes the
wave equation: ∂²u/∂t² = c² ∂²u/∂x².
This makes sense intuitively, since at places where the graph of the string is concave up
(∂²u/∂x² > 0) the tension pulling on both sides should combine to produce an upward force, and
hence an upward acceleration.
Comparing units of both sides of the wave equation shows that the units for c are m/s.
The physical meaning of c as a velocity will be explained later.
The ends of a guitar string are fixed, so we have boundary conditions
u(0, t) = 0 and u(L, t) = 0 for all t ≥ 0.
26.2. Separation of variables in PDEs; normal modes. For simplicity, suppose that
c = 1 and L = π. So now we are solving the PDE with boundary conditions
∂²u/∂t² = ∂²u/∂x²
u(0, t) = 0
u(π, t) = 0.
As with the heat equation, we try separation of variables. In other words, try to find
normal modes of the form
u(x, t) = v(x)w(t),
for some nonzero functions v(x) and w(t). Substituting this into the PDE gives
v(x) ẅ(t) = v''(x) w(t)
ẅ(t)/w(t) = v''(x)/v(x).
As usual, a function of t can equal a function of x only if both are equal to the same constant,
say λ, so this breaks into two ODEs:
v''(x) = λ v(x) with v(0) = 0 and v(π) = 0
ẅ(t) = λ w(t).
As before, the boundary value problem for v forces λ = −n² and v(x) = sin nx (times a scalar)
for some n = 1, 2, . . .; for such λ, the functions w(t) = cos nt and w(t) = sin nt
are possibilities (and all the others are linear combinations). Multiplying each by the v(x)
with the matching λ gives the normal modes
cos nt sin nx and sin nt sin nx.
Taking linear combinations, we find that
u(x, t) = Σ_{n≥1} (an cos nt + bn sin nt) sin nx
is a solution to the PDE with boundary conditions, and this turns out to be the general
solution.
26.3. Initial conditions. To specify a unique solution, give two initial conditions: not only
the initial position u(x, 0), but also the initial velocity (∂u/∂t)(x, 0), at each position of the string.
(That two initial conditions are needed is related to the fact that the PDE is second-order in the t variable.)
For a plucked string, it is reasonable to assume that the initial velocity is 0, so one initial
condition is (∂u/∂t)(x, 0) = 0. What condition does this impose on the an and bn? Well, for the
general solution above,
∂u/∂t = Σ_{n≥1} (−n an sin nt sin nx) + Σ_{n≥1} (n bn cos nt sin nx)
(∂u/∂t)(x, 0) = Σ_{n≥1} n bn sin nx,
so the initial velocity is 0 exactly when every bn is 0.
If we also knew the initial position u(x, 0), we could solve for the an by extending to an odd,
period 2π function of x and using the Fourier coefficient formula.
26.4. D’Alembert’s solution: traveling waves. D’Alembert figured out another way to
write down solutions, in the case when u(x, t) is defined for all real numbers x instead of just
x ∈ [0, L]. Then, for any reasonable function f,
u(x, t) := f(x − ct)
is a solution: a waveform traveling to the right at speed c. Similarly, g(x + ct) is a solution
traveling to the left for any reasonable g, and the general solution turns out to be
u(x, t) = f(x − ct) + g(x + ct).
There is a tiny bit of redundancy: one can add a constant to f and subtract the same constant
from g without changing u.
Problem 26.2. Suppose that c = 1, that the initial position is I(x), and that the initial
velocity is 0. What does the wave look like?
Solution: The initial conditions u(x, 0) = I(x) and (∂u/∂t)(x, 0) = 0 become (after dividing
the second one by c)
f(x) + g(x) = I(x)
−f'(x) + g'(x) = 0.
The second equation says that g(x) = f(x)+C for some constant C; equivalently, g(x)−C/2 =
f(x) + C/2. If we replace f(x) by f(x) + C/2 and replace g(x) by g(x) − C/2, then the
new f and g produce the same sum f(x − t) + g(x + t) as before, but the new functions are
now equal, f(x) = g(x), instead of differing by a constant. For these new f and g, the first
equation yields f(x) = I(x)/2 and g(x) = I(x)/2. So the wave
u(x, t) = I(x − t)/2 + I(x + t)/2
consists of two equal waveforms, one traveling to the right and one traveling to the left. 2
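This traveling-wave decomposition is easy to test numerically. In the sketch below (my own illustration), the bump shape I(x) is a hypothetical choice:

```python
import math

def I(x):
    """A hypothetical initial pluck shape (a smooth bump), chosen for illustration."""
    return math.exp(-x * x)

def u(x, t):
    """d'Alembert solution with c = 1 and zero initial velocity:
    two half-height copies of I traveling right and left."""
    return 0.5 * I(x - t) + 0.5 * I(x + t)
```

At t = 0 the formula reproduces I(x), and for large t the two half-height bumps have separated completely.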
Next, define the step function s(x) := 1 for x < 0 and s(x) := 0 for x ≥ 0, and consider the solution u(x, t) = s(x − t) to the wave equation with c = 1. This is a
“cliff-shaped” wave traveling to the right. (You would be right to complain that this function is not
differentiable and therefore cannot satisfy the PDE in the usual sense, but you can imagine replacing s(x)
with a smooth approximation, a function with very steep slope. The smooth approximation also makes more
sense physically: a physical wave would not actually have a jump discontinuity.)
Another way to plot the behavior is to use a space-time diagram, in a plane with axes x
(space) and t (time). (Usually one draws only the part with t ≥ 0.) Divide the (x, t)-plane
into regions according to the value of u. The boundary between the regions is called the wave
front.
In the example above, u(x, t) = 1 for points to the left of the line x − t = 0, and u(x, t) = 0
for points to the right of the line x − t = 0. So the wave front is the line x − t = 0.
A different example:
Flashcard question: Suppose that the initial position is s(x), but the initial velocity is 0
(and still c = 1). Into how many regions is the t ≥ 0 part of the space-time diagram divided?
Consider t ≥ 0.
• If x < −t, then u(x, t) = 1/2 + 1/2 = 1.
• If −t < x < t, then u(x, t) = 1/2 + 0 = 1/2.
• If x > t, then u(x, t) = 0 + 0 = 0.
So the upper half of the plane is divided by a V-shaped wave front (the graph of |x|) into
three regions, with values 1 on the left, 1/2 in the middle, and 0 on the right. 2
Lecture actually ended here.
Remark 26.3. We have talked about waves moving in one space dimension, but waves exist
in higher dimensions too.
• In one dimension, a disturbance creates wave fronts moving to the left and right, and
the space-time diagram of the wave front is shaped like a V, as we just saw.
• In two dimensions, the disturbance caused by a pebble dropped in a still pond creates
a circular wave front that moves outward in all directions. The space-time diagram of
this wave front is shaped like an ice cream cone (without the ice cream).
• In three dimensions, the wave front created by a disturbance at a point is an expanding
sphere.
26.6. Real-life waves. In real life, there is always damping. This introduces a new term
into the wave equation:
damped wave equation: ∂²u/∂t² + b ∂u/∂t = c² ∂²u/∂x².
Separation of variables still works, but in each normal mode, the w(t) is a damped sinusoid
involving a factor e^(−bt/2) (in the underdamped case).
May 2
Problem 27.1. Draw the solution curves to ẏ = y². (This is the special case f(t, y) := y².)
Solution: Separating variables gives y^(−2) dy = dt, so
y^(−1)/(−1) = t + c (for some constant c)
y = −1/(t + c).
When c = 0, the formula describes the hyperbola ty = −1, which consists of two solution
curves (one defined for t < 0 and one defined for t > 0). For other values of c, the solution
curves are the same half-hyperbolas except shifted c units to the left.
Oops: We divided by y 2 , which is not valid if y = 0. The constant function y = 0 is a
solution too. So in addition to the half-hyperbolas above, there is one more solution curve:
the t-axis. 2
Solution curves are graphs of functions, so they must satisfy the vertical line test (at most
one point on each vertical line).
Problem 27.2. Consider the solution to ẏ = y 2 satisfying the initial condition y(0) = 1. Is
there a solution y(t) defined for all real numbers t?
Solution: If this were a linear ODE, then the existence and uniqueness theorem would
guarantee a YES answer.
But here the answer is NO, as we’ll now explain. Setting t = 0 in the general solution
above and using the initial condition leads to
1 = −1/(0 + c)
c = −1,
so
y = −1/(t − 1) = 1/(1 − t).
As t increases towards 1, the value of y(t) tends to +∞, so one says that the solution blows up
in finite time. It is impossible to extend y(t) to a solution defined and continuous at t = 1 or
beyond; the largest open interval containing the starting point 0 on which a solution exists is
(−∞, 1). 2
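Blow-up in finite time is visible even in a crude numerical solution. A sketch (my own; the step size and cutoff are arbitrary choices):

```python
def euler_blowup_time(h=1e-5, cap=1e6):
    """Euler's method for y' = y^2, y(0) = 1.  The exact solution 1/(1 - t)
    blows up at t = 1; we record the time at which the numerical solution
    first exceeds cap."""
    t, y = 0.0, 1.0
    while y < cap:
        y += h * y * y
        t += h
    return t
```

The numerical solution crosses any fixed bound at a time very close to t = 1, reflecting the vertical asymptote of the exact solution.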
27.2. Existence and uniqueness. For nonlinear ODEs, there is still an existence and
uniqueness theorem, but the solutions it provides are not necessarily defined for all t.
Existence and uniqueness theorem for a nonlinear ODE. Consider a nonlinear ODE ẏ = f(t, y) with initial condition y(a) = b. If f and ∂f/∂y are continuous on an open rectangle containing (a, b), then there exists a unique solution y(t) defined on some open interval containing a.
27.3. Slope field. We are now going to introduce concepts to help with drawing solution
curves to an ODE ẏ = f (t, y). The slope field is a diagram in which at each point (t, y), you
draw a short segment whose slope is the value f (t, y).
http://mathlets.org/mathlets/isoclines/
Warning: The slope field is not the same as the graph of f : in drawing the graph of f , the
value of f is used as a height, but in drawing a slope field, the value of f is used as the slope
of a little segment.
Why draw a slope field? The ODE is telling us that the slope of the solution curve at each
point is the value of f (t, y), so the short segment there is, to first approximation, a little
piece of the solution curve. To get an entire solution curve, follow the segments!
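Slope fields are easy to prototype in text form. A toy sketch (my own illustration, not from the notes) that prints one glyph per sample point for ẏ = y² − t:

```python
def glyph(m):
    """Crude three-way glyph for a short segment of slope m."""
    return '/' if m > 0.5 else ('\\' if m < -0.5 else '-')

def slope_field(f, ts, ys):
    """One text row per y-value, approximating the slope field of
    y' = f(t, y) with the characters /, -, and backslash."""
    return [''.join(glyph(f(t, y)) for t in ts) for y in ys]

field = slope_field(lambda t, y: y * y - t,
                    ts=[0.0, 1.0, 2.0], ys=[1.0, 0.0, -1.0])
```

Reading a row left to right shows the slopes turning downward as t increases past the 0-isocline t = y².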
27.4. Isoclines. Even with the computer display, it’s hard to tell what is going on. To
understand better, we introduce a new concept: If m is a number, the m-isocline is the set
of points in the (t, y)-plane such that the solution curve through that point has slope m.
(Isocline means “same incline”, or “same slope”.)
Question: If m is a number, what is the equation of the m-isocline of ẏ = f(t, y)?
Solution: The ODE says that the slope of the solution curve through a point (t, y) is f(t, y),
so the equation of the m-isocline is f(t, y) = m. 2
Finding the isoclines will help organize the slope field. The 0-isocline is especially helpful.
Question: For ẏ = y² − t, in which region do solution curves slope upward?
Solution: This will be the region in which f(t, y) > 0. The 0-isocline f(t, y) = 0, here the
parabola t = y², divides the plane into regions, and f(t, y) has constant sign on each region.
To test the sign, just check one point in each region. For f(t, y) := y² − t, we have f(t, y) > 0
in the region to the left of the parabola (since f(0, 1) > 0), and f(t, y) < 0 in the region to
the right of the parabola (since f(1, 0) < 0). On the left region, solution curves slope upward;
on the right region, solution curves slope downward. The answer is: the region to the left of
the parabola. 2
The solution curve through (0, 0) increases for t < 0 and decreases for t > 0, so it reaches
its maximum at (0, 0). How did we know that the solution for t > 0 does not cross the lower
part of the parabola, y = −√t, back into the upward sloping region? Answer: If it crossed
somewhere, its slope would have to be negative there, but the DE says that the slope is 0
everywhere along y = −√t. Thus y = −√t acts as a fence that solution curves already inside
the parabola cannot cross.
27.5. Example: The logistic equation. The simplest model for population x(t) is the
ODE ẋ = ax for a positive growth constant a: the rate of population growth is proportional
to the current population. But realistically, if x(t) gets too large, then because of competition
for food and space, the population will grow less quickly. In a better model, the growth rate
a would become smaller as the population grows. In the simplest model of this type, the
growth rate is a linearly decreasing function of the population, a − bx, where b is another
positive constant; then the DE is ẋ = (a − bx)x instead of ẋ = ax. In other words, the new
DE is
ẋ = ax − bx²,
where a and b are positive constants. This is a nonlinear ODE, called the logistic equation.
Let’s consider the simplest case, in which a = 1 and b = 1:
Problem 27.9. Draw the solution curves for ẋ = x − x² in the (t, x)-plane.
Solution: The first step is always to find the 0-isocline. Here f(t, x) := x − x², so the
0-isocline is x − x² = 0, which consists of the horizontal lines x = 0 and x = 1. Each of these
two lines has slope 0, matching the slope specified for the solution curve at each point of the
line, so each line itself is a solution curve! (Warning: This is not typical. An isocline is not usually a
solution curve.)
The 0-isocline divides the (t, x)-plane into three regions: in the horizontal strip 0 < x < 1,
we have f(t, x) = x − x² = x(1 − x) > 0, so solutions slope upward. In the regions below and
above, solutions slope downward.
The diagram below shows the slope field (gray segments), the 0-isocline (yellow line), and
the solution curve with initial condition x(0) = 1/2 (blue curve).
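The logistic equation here happens to have a closed-form solution, which can be used to check the qualitative picture. A small sketch (my own, assuming the initial value x(0) = 1/2 shown in the diagram):

```python
import math

def logistic(t, x0=0.5):
    """Exact solution of x' = x - x^2 with x(0) = x0, valid for 0 < x0 < 1:
    x(t) = 1 / (1 + (1/x0 - 1) e^(-t))."""
    return 1.0 / (1.0 + (1.0 / x0 - 1.0) * math.exp(-t))
```

The formula increases monotonically from x0 toward 1, matching the upward-sloping strip 0 < x < 1 in the slope field.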
May 4
Problem 28.1. Understand the behavior of the solutions to ẋ = 3x − x².
Solution: Let f(x) := 3x − x². First find the 0-isocline by solving 3x − x² = 0. This leads
to x(3 − x) = 0, so x = 0 or x = 3. These are horizontal lines. They are also solution curves,
corresponding to the constant functions x(t) = 0 and x(t) = 3.
As in last lecture, the 0-isocline divides the plane into “up” regions and “down” regions.
These are the region x < 0, the region 0 < x < 3, and the region x > 3. To find out which
are up and which are down, test one point in each:
• Since f (−1) < 0, the region x < 0 is a down region.
• Since f (1) > 0, the region 0 < x < 3 is an up region.
• Since f (4) < 0, the region x > 3 is a down region.
The phase line is a plot of the x-axis that summarizes this information:
−∞ ←− 0 −→ 3 ←− +∞
unstable stable
(The labels unstable and stable will be explained later. Sometimes the phase line is drawn
vertically instead, with +∞ at the top.)
28.3. Stability. In general, for ẋ = f (x), the real x-values such that f (x) = 0 are called
critical points. Warning: Only real numbers can qualify as critical points.
A critical point is called
• stable if solutions starting near it move towards it,
• unstable if solutions starting near it move away from it,
• semistable if the behavior depends on which side of the critical point the solution
starts.
In the case of the differential equation ẋ = 3x − x² studied above, the critical points are 0
and 3. The phase line shows that 0 is unstable, and 3 is stable.
Remark 28.2. An unstable critical point is also called a separatrix because it separates solutions
having very different fates.
Example 28.3. For ẋ = 3x − x², a solution starting just below 0 tends to −∞, while a solution
starting just above 0 tends to 3: very different fates! 2
To summarize:
Steps for understanding solutions to ẋ = f (x) qualitatively:
1. Solve f (x) = 0 to find the critical points.
2. Write down the phase line: mark the critical points on the x-axis, and in each interval
in between, draw a right arrow where f > 0 and a left arrow where f < 0. Then classify
each critical point as stable, unstable, or semistable.
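These steps can be automated crudely by sampling f on either side of each critical point. A sketch (my own illustration; eps is an assumed step size small enough to stay in the adjacent sign regions):

```python
def classify(f, x0, eps=1e-4):
    """Classify a critical point x0 of x' = f(x) by the sign of f just to
    the left and right of x0 (assumes f(x0) = 0)."""
    left, right = f(x0 - eps), f(x0 + eps)
    if left > 0 > right:
        return 'stable'      # arrows point toward x0 from both sides
    if left < 0 < right:
        return 'unstable'    # arrows point away from x0
    return 'semistable'
```

For f(x) = 3x − x² this reproduces the phase line above: 0 is unstable and 3 is stable.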
Problem 28.4. Frogs grow in a pond according to a logistic equation with growth constant
3 month−1 . The population reaches an equilibrium of 3000 frogs, but then the frogs are
harvested at a constant rate. Model the population of frogs.
t : time (months)
x : size of population (kilofrogs)
h : harvest rate (kilofrogs/month)
ẋ = 3x − x² − h. 2
This is an infinite family of autonomous equations, one for each value of h, and each has
its own phase line. If in the (h, x)-plane, we draw each phase line vertically in the vertical
line corresponding to a given value of h, and plot the critical points for each h, then we get a
diagram called a bifurcation diagram. In this diagram, color the critical points according to
whether they are stable, unstable, or semistable.
Example 28.5. If h = 2, then ẋ = 3x − x² − 2. Since 3x − x² − 2 = −(x − 2)(x − 1), the
critical points are 1 and 2, and the phase line is
−∞ ←− 1 −→ 2 ←− +∞. 2
unstable stable
For each other value of h, the critical points are the real roots of 3x − x² − h. We could
use the quadratic formula to find these roots
r1(h) = (3 − √(9 − 4h))/2, r2(h) = (3 + √(9 − 4h))/2
(assuming that 9 − 4h ≥ 0), and then graph both functions to get the bifurcation diagram.
But we don’t need to do this! The equation 3x − x² − h = 0 is the same as h = 3x − x².
The graph of this in the (x, h)-plane is a downward parabola; to get the bifurcation diagram
in the (h, x)-plane, interchange the axes by reflecting in the line h = x.
Checking one point inside the parabola (like (h, x) = (0, 1)) shows that 3x − x² − h is
positive there, and similarly 3x − x² − h is negative outside the parabola. Thus the upper
branch x = r2(h) consists of stable critical points, and the lower branch x = r1(h) consists of
unstable critical points, at least when 9 − 4h > 0.
If h = 9/4, then ẋ = 3x − x² − 9/4 = −(x − 3/2)² ≤ 0, so the only critical point is 3/2, and
the phase line is
−∞ ←− 3/2 ←− +∞.
semistable
Does this mean that a solution x(t) can go all the way from +∞ through 3/2 to −∞?
No, because it can’t cross the constant solution x = 3/2. Instead there are three possible
behaviors:
• If x(0) > 3/2, then x(t) → 3/2 as t → +∞.
• If x(0) = 3/2, then x(t) = 3/2 for all t.
• If x(0) < 3/2, then x(t) tends to −∞ (we interpret this as a population crash: the
frog population reaches 0 in finite time; the part of the trajectory with x < 0 is not
part of the population model).
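The three cases above can be reproduced numerically. A small sketch (not from the notes) computing the critical points r1(h) ≤ r2(h) of the harvested logistic equation:

```python
import math

def critical_points(h):
    """Real critical points of x' = 3x - x^2 - h, i.e. the real roots of
    3x - x^2 - h = 0, returned as [r1, r2] (empty if 9 - 4h < 0)."""
    disc = 9.0 - 4.0 * h
    if disc < 0:
        return []            # harvesting too aggressive: no equilibria
    r = math.sqrt(disc)
    return [(3.0 - r) / 2.0, (3.0 + r) / 2.0]
```

As h increases, the two roots move together, merge at h = 9/4, and then disappear, which is exactly the bifurcation visible in the diagram.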
May 7
Question: What is the largest value of h for which harvesting is sustainable?
(Sustainable means that the harvesting does not cause the population to crash to 0, but
that instead lim_{t→+∞} x(t) is positive, so that the harvesting can continue indefinitely.)
Answer: It depends on the phase line for ẋ = 3x − x² − h.
• If 0 < h < 9/4, the phase line is
−∞ ←− r1(h) −→ r2(h) ←− +∞
unstable stable
so harvesting is sustainable provided that the population stays above r1(h).
• If h = 9/4, the phase line is
−∞ ←− 3/2 ←− +∞
semistable
so harvesting is sustainable provided that the population stays at or above 3/2.
• If h > 9/4, there are no critical points, and the phase line is
−∞ ←− +∞
so the population crashes no matter where it starts. Thus the maximum sustainable
harvest rate is h = 9/4 (that is, 2250 frogs per month). 2
Remark 28.8. Harvesting at exactly the maximum rate is a little dangerous, however, because
if after a while x becomes very close to 3/2, and a little kid comes along and takes one more
frog out of the pond, the whole frog population will crash!
• Consider x ≈ 0, which is of interest when the population is getting started. The best
linear approximation to f(x) = 3x − x² for x ≈ 0 is 3x, so ẋ ≈ 3x and
x ≈ a e^(3t)
for some constant a. That is, when the population is getting started, solutions to the
logistic equation obey approximately exponential growth, until the competition for
food or space implicit in the −x² term becomes too large to ignore.
• Consider x ≈ 3, which is of interest when t → +∞. To measure deviations from 3,
define X := x − 3 ≈ 0, so x = 3 + X. The best linear approximation to f(x) for x ≈ 3
is f'(3)(x − 3) = −3X, so
Ẋ = ẋ = f(x) ≈ −3X.
Solving gives X ≈ b e^(−3t), and
x = 3 + X ≈ 3 + b e^(−3t)
for some constant b, as t → +∞. (Since we are looking at solutions with 0 < x(t) < 3,
we must have b < 0.)
The “big picture” combines numerical results for 0.1 < x < 2.9 with linear approximations
near 0 and 3.
29. Autonomous systems
Now we study a system of two autonomous equations in two unknown functions x(t) and
y(t):
ẋ = f (x, y)
ẏ = g(x, y)
Example 29.1. If x(t) is deer population (in thousands), and y(t) is wolf population (in
hundreds), then the system
ẋ = 3x − x² − xy
ẏ = y − y² + xy
is a reasonable model: each population obeys the logistic equation, except that there is an
adjustment depending on xy, which is proportional to the number of deer-wolf encounters.
Such encounters are bad for the deer, but good for the wolves!
29.1. Phase plane. Solution curves would now exist in 3-dimensional (t, x, y)-space, so they
are hard to draw. Instead, forget t, and draw the motion in the (x, y) phase plane. At each
point (x, y), the system says that the velocity vector there is the value of (f(x, y), g(x, y)).
Problem 29.2. In the deer-wolf example above, what is the velocity vector at (x, y) = (3, 2)?
Solution:
(ẋ, ẏ) = (9 − 9 − 6, 2 − 4 + 6) = (−6, 4).
Draw this velocity vector with its foot at (3, 2). 2
The velocity vectors at all points together make a vector field. If you draw them all to
scale, you will wreck your picture! Mostly what we care about is the direction, so it is OK to
shorten them. Or better yet, don’t draw them at all, and instead just draw arrowheads along
the phase plane trajectories in the direction of motion.
There is an existence and uniqueness theorem for systems of nonlinear ODEs similar to
that for a single ODE. For an autonomous system it implies that there is a unique trajectory
through each point (in a region in which the partial derivatives of f and g are continuous):
Trajectories never cross or touch!
(But see the “exception” in Remark 29.4.)
29.2. Critical points. A critical point for an autonomous system is a point in the (x, y)-plane
where the velocity vector is 0. To find all the critical points, solve
f (x, y) = 0
g(x, y) = 0.
Problem 29.3. Find the critical points for the deer-wolf system.
Solution: Factoring the system as ẋ = x(3 − x − y), ẏ = y(1 − y + x), we need x = 0 or
3 − x − y = 0, and y = 0 or 1 − y + x = 0. Checking the four combinations gives the critical
points (0, 0), (0, 1), (3, 0), and (1, 2). 2
Remark 29.4. We said earlier that trajectories never cross. While it is true that no two
trajectories can have a point in common, it is possible for two trajectories to have the same
limit as t → +∞ or t → −∞, so they can appear to come together. For a trajectory to have
a finite limiting position, the velocity must tend to 0, so the limiting position must be a
critical point.
Conclusion: It is only at a critical point that trajectories can appear to come together.
29.3.1. Warm-up: linear approximation at (0, 0). To understand the behavior of the deer-wolf
system near (0, 0), use
ẋ = 3x − x² − xy ≈ 3x
ẏ = y − y² + xy ≈ y.
In matrix form,
(ẋ, ẏ) ≈ [3 0; 0 1] (x, y).
The eigenvalues are 3 and 1, so this describes a repelling node at (0, 0).
29.3.2. Linear approximation via change of coordinates (method 1). To understand the deer-
wolf system near the critical point (1, 2), reduce to the previously solved case of (0, 0) by
making the change of variable
x=1+X
y =2+Y
so that (x, y) = (1, 2) is (X, Y ) = (0, 0) in the new coordinate system. Then
Special case where (a, b) is a critical point for the system: Then f(a, b) = 0 and g(a, b) = 0,
so the multivariable linear approximation
(f(x, y), g(x, y)) ≈ (f(a, b), g(a, b)) + J(a, b) (x − a, y − b)
simplifies to
(f(x, y), g(x, y)) ≈ J(a, b) (x − a, y − b).
Making the change of variable X := x − a and Y := y − b leads to
(Ẋ, Ẏ) = (ẋ, ẏ) = (f(x, y), g(x, y)) ≈ J(a, b) (X, Y).
Conclusion: At a critical point (a, b), if X := x − a and Y := y − b, then
(Ẋ, Ẏ) ≈ J(a, b) (X, Y).
(Here the pairs are column vectors, and J(a, b) acts by matrix multiplication.)
Problem 29.6. Find the behavior of the deer-wolf system near the critical point (1, 2).
Solution: We have
J(x, y) := [∂f/∂x ∂f/∂y; ∂g/∂x ∂g/∂y] = [3 − 2x − y, −x; y, 1 − 2y + x].
Plug in x = 1 and y = 2 to get
J(1, 2) = [−1, −1; 2, −2].
Thus, if we measure deviations from the critical point by defining X := x − 1 and Y := y − 2,
we have
(Ẋ, Ẏ) ≈ [−1, −1; 2, −2] (X, Y)
(the same as what we got using method 1). The matrix has trace −3 and determinant 4, so
the characteristic polynomial is λ² + 3λ + 4, and the eigenvalues are (−3 ± √−7)/2. These are
complex numbers with negative real part, so this describes an attracting spiral. 2
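This linearization can be checked numerically. The sketch below (my own; jacobian and eigenvalues are hypothetical helper names) approximates J by central differences and finds the eigenvalues from the trace and determinant:

```python
import cmath

def jacobian(f, g, a, b, eps=1e-6):
    """Numerical Jacobian of x' = f(x,y), y' = g(x,y) at (a, b), via central
    differences (essentially exact here, since f and g are quadratic)."""
    return [[(f(a + eps, b) - f(a - eps, b)) / (2 * eps),
             (f(a, b + eps) - f(a, b - eps)) / (2 * eps)],
            [(g(a + eps, b) - g(a - eps, b)) / (2 * eps),
             (g(a, b + eps) - g(a, b - eps)) / (2 * eps)]]

def eigenvalues(m):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    d = cmath.sqrt(tr * tr - 4 * det)
    return (tr + d) / 2, (tr - d) / 2

f = lambda x, y: 3 * x - x * x - x * y
g = lambda x, y: y - y * y + x * y
J = jacobian(f, g, 1.0, 2.0)   # should be close to [[-1, -1], [2, -2]]
```

The computed eigenvalues have real part −3/2 and imaginary part ±√7/2, confirming the attracting spiral.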
May 9
Question 29.7. When is it OK to say that the original system behaves like the linear
system?
Answer: When the matrix A := J(a, b) falls in one of the robust cases of the trace-determinant
plane: saddle, repelling node, attracting node, repelling spiral, or attracting spiral.
These cases, as opposed to the borderline cases in which A lies on the boundary between
regions in the trace-determinant plane, are called structurally stable.
Warning: Stability and structural stability are different concepts:
• Stable means that all nearby solutions tend to the critical point.
• Structurally stable means that the phase portrait type is robust, unaffected by small
changes in the matrix entries.
Steps to sketch the phase portrait of an autonomous system ẋ = f(x, y), ẏ = g(x, y):
1. Solve the system
f(x, y) = 0
g(x, y) = 0
to find all the critical points in the (x, y)-phase plane. There is a stationary trajectory
at each critical point.
2. Calculate the Jacobian matrix
J(x, y) := [∂f/∂x ∂f/∂y; ∂g/∂x ∂g/∂y].
This will be a 2 × 2 matrix of functions of x and y.
3. At each critical point (a, b),
(a) Compute the numerical 2 × 2 matrix A := J(a, b), by evaluating J(x, y) at (a, b).
(b) Determine whether the critical point is stable (attracting) or not:
stable ⇐⇒ tr A < 0 and det A > 0.
Or, for a more detailed picture, find the eigenvalues of A to classify the phase
portrait for the “linear approximation system” (Ẋ, Ẏ) ≈ A (X, Y). For further
details:
• If the eigenvalues are real, find the eigenlines. If, moreover, the eigenvalues
have the same sign, also determine the slow eigenline since trajectories in
the (X, Y )-plane will be tangent to that line.
• If the eigenvalues are complex (and not real), compute a velocity vector to
determine whether the rotation is clockwise or counterclockwise.
(c) Mark the critical point (a, b) in the (x, y)-plane, and draw a miniature copy
of the linear approximation’s phase portrait shifted so that it is centered at
(a, b); this is justified in the structurally stable cases (saddle, repelling node,
attracting node, or spiral). Indicate with arrowheads the direction of motion on
the trajectories near the critical point.
4. (Optional) Find the velocity vector at a few other points, or use a computer.
5. (Optional) Solve f (x, y) = 0 to find all the points where the velocity vector is vertical
or 0. Similarly, one could solve g(x, y) = 0 to find all the points where the velocity
vector is horizontal or 0.
6. Connect trajectories emanating from or approaching critical points, keeping in mind
that trajectories never cross or touch.
Problem 29.9. Sketch the phase portrait for the deer-wolf system
ẋ = 3x − x2 − xy
ẏ = y − y 2 + xy.
Solution: The velocity vector is vertical (or 0) at the points where ẋ = 0, i.e., where
3x − x² − xy = 0.
Factoring shows that these are the points on the lines x = 0 and 3 − x − y = 0. So in the
phase portrait we draw little vertical segments at points on these lines. In particular, there
will be trajectories along x = 0, and we can plot them using the 1-dimensional phase line
methods, by sampling the velocity vector at one point in each interval created by the critical
points. The line 3 − x − y = 0 does not contain trajectories, however, since that line has slope
−1, while trajectories are vertical as they pass through these points.
Similarly, the velocity vector is horizontal (or 0) at the points where ẏ = 0, i.e., where
y − y² + xy = 0.
These are the lines y = 0 and 1 − y + x = 0, so draw little horizontal segments at points
on these lines. Again we can study trajectories along y = 0 using 1-dimensional phase line
methods.
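These computations can be checked numerically. Here is a sketch (assuming NumPy is available; the function name `jacobian` and the listed critical points are worked out from the factorizations above) that evaluates the Jacobian of the deer-wolf system at each critical point and applies the trace–determinant stability test:

```python
import numpy as np

# Deer-wolf system: xdot = 3x - x^2 - xy, ydot = y - y^2 + xy.
# Critical points come from x(3 - x - y) = 0 and y(1 - y + x) = 0.
critical_points = [(0.0, 0.0), (3.0, 0.0), (0.0, 1.0), (1.0, 2.0)]

def jacobian(x, y):
    """Jacobian matrix of (f, g) = (3x - x^2 - xy, y - y^2 + xy)."""
    return np.array([[3 - 2*x - y, -x],
                     [y, 1 - 2*y + x]])

for (a, b) in critical_points:
    A = jacobian(a, b)
    tr, det = np.trace(A), np.linalg.det(A)
    stable = tr < 0 and det > 0
    eigs = np.linalg.eigvals(A)
    print(f"({a:.0f},{b:.0f}): tr = {tr:.1f}, det = {det:.1f}, "
          f"stable = {stable}, eigenvalues = {np.round(eigs, 2)}")
```

Only (1, 2) passes the test tr A < 0, det A > 0; the other three critical points are unstable.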
Big picture:
Try the “Vector Fields” mathlet
http://mathlets.org/mathlets/vector-fields/
29.6. Changing the parameters of the system. The big picture suggests that all trajec-
tories in the first quadrant tend to (1, 2) as t → +∞. In other words, as long as there were
some deer and some wolves to begin with, eventually the populations stabilize at about 1000
deer and 200 wolves.
Problem 29.10. Suppose that we start feeding the deer so that the system becomes
ẋ = ax − x² − xy
ẏ = y − y² + xy
for a constant a slightly larger than 3. What happens to the populations in the long run?
Solution: The critical points will move slightly, but they won’t change their stability. The
populations will end up at the stable critical point, which is the one near (1, 2). To find it,
solve
0 = ax − x² − xy
0 = y − y² + xy.
Since we’re looking for a solution with x > 0 and y > 0, it is OK to divide the equations by
x and y, respectively:
0=a−x−y
0 = 1 − y + x.
Solving gives
x = (a − 1)/2 ,    y = (a + 1)/2 .
For a = 3, this is x = 1 and y = 2. As a increases beyond 3, the deer population increases,
but we also see an increase in the wolf population. By feeding the deer we have provided
more food for the wolves as well!
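A quick check of this formula (a sketch; the function name `stable_point` is mine): for each a, the claimed point should make both velocity components vanish.

```python
def stable_point(a):
    """Positive critical point of xdot = a*x - x^2 - x*y, ydot = y - y^2 + x*y."""
    x = (a - 1) / 2
    y = (a + 1) / 2
    return x, y

for a in [3, 4, 5]:
    x, y = stable_point(a)
    # Verify it really is a critical point: both right-hand sides are 0 there.
    assert abs(a*x - x*x - x*y) < 1e-12
    assert abs(y - y*y + x*y) < 1e-12
    print(f"a = {a}: deer x = {x}, wolves y = {y}")
```

As a increases past 3, both coordinates increase, matching the observation that feeding the deer also feeds the wolves.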
29.7. Fences. In the original deer-wolf system, how can you be sure that all trajectories
starting with x > 0 and y > 0 tend to (1, 2)?
Steps to prove that all trajectories approach the stable critical point:
(1) Find a window into which all trajectories must enter and never leave.
(2) Do a numerical simulation within the window.
Let’s do step 1 for the deer-wolf system. A trajectory could escape in four ways: up, down,
left, and right. We need to rule out all four.
Bottom: A trajectory that starts in the first quadrant cannot cross the nonnegative part
of the x-axis, because the trajectories along the x-axis act as fences. A trajectory cannot
even tend to a point on the x-axis, because such a point would be a critical point, and the
phase portrait types at (0, 0) and (3, 0) make such an approach impossible.
Left: By the same argument, the nonnegative part of the y-axis is a fence that cannot be
approached.
Right: We have
ẋ = 3x − x² − xy ≤ 3x − x² < 0
whenever x > 3 (if 3x − x² is negative, then 3x − x² − xy is even more negative since it
has something subtracted). So all the vertical lines x = c for c > 3 are fences that prevent
trajectories from moving to the right across them. All trajectories move leftward if x > 3,
and they can settle down only in the range 0 ≤ x ≤ 3.
Top: Assuming x ≤ 3, we have
ẏ = y − y² + xy ≤ y − y² + 3y = 4y − y² < 0
whenever y > 4. Thus for c > 4, the horizontal segments y = c, 0 ≤ x ≤ 3 are fences
preventing trajectories from moving up through them.
Conclusion: All trajectories starting with x > 0, y > 0 (the only ones we care about)
eventually enter the window 0 ≤ x ≤ 3, 0 ≤ y ≤ 4 and stay there. This is small enough that
a numerical simulation can now show that all these points tend to (1, 2) (step 2).
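Step 2 can be carried out with a short simulation. Here is a sketch using forward Euler (the step size, time span, and starting points are my choices): trajectories started at various points with x > 0, y > 0 all end up at (1, 2).

```python
def simulate(x, y, h=0.001, steps=40000):
    """Forward-Euler simulation of the deer-wolf system up to time h*steps."""
    for _ in range(steps):
        xdot = 3*x - x*x - x*y
        ydot = y - y*y + x*y
        x, y = x + h*xdot, y + h*ydot
    return x, y

for start in [(0.5, 0.5), (2.5, 3.5), (0.1, 3.9), (3.0, 0.5)]:
    x, y = simulate(*start)
    print(f"start {start} -> ({x:.4f}, {y:.4f})")
```

Since (1, 2) is an attracting spiral (the eigenvalues there have real part −3/2), the convergence is fast once a trajectory is in the window.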
The final exam covers everything up to here, in the sense that you are not required to know
anything specific below. On the other hand, the topics below serve partly as a review of
topics covered earlier.
May 11
29.8. Nonlinear centers, limit cycles, etc. Consider an autonomous system. Suppose
that P is a critical point. Suppose that the linear approximation system at P is a center.
What is the behavior of the original system near P ? It’s not necessarily a center. (This is
not a structurally stable case.) In fact, there are many possibilities:
• nonlinear center, in which the trajectories are periodic (but not necessarily exact
ellipses)
• repelling spiral
• attracting spiral
• hybrid situation containing a limit cycle: a periodic trajectory with an outward spiral
approaching it from within and an inward spiral approaching it from outside!
For an example of a limit cycle (called the van der Pol limit cycle), set a = 0.1 in the
system
ẋ = y
ẏ = a(1 − x²)y − x.
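A numerical experiment (a sketch; the step size, time span, and the two starting points are my choices) shows one trajectory starting inside the limit cycle and one starting outside, both settling onto oscillations of the same amplitude, about 2 for this small value of a:

```python
def late_amplitude(x, y, a=0.1, h=0.001, t_total=200.0):
    """Integrate the van der Pol system with forward Euler;
    return max |x| over the last quarter of the run (the settled amplitude)."""
    steps = int(t_total / h)
    amp = 0.0
    for i in range(steps):
        xdot = y
        ydot = a*(1 - x*x)*y - x
        x, y = x + h*xdot, y + h*ydot
        if i > 3*steps // 4:
            amp = max(amp, abs(x))
    return amp

inside = late_amplitude(0.1, 0.0)   # starts inside the limit cycle
outside = late_amplitude(4.0, 0.0)  # starts outside the limit cycle
print(inside, outside)
```

Both printed amplitudes come out close to 2: the outward and inward spirals approach the same periodic trajectory.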
Problem 30.1. Model a pendulum, consisting of a weight attached to a rod hanging from a
pivot at the top.
t : time
θ : angle measured counterclockwise from the rest position
Equation: When the weight is at a certain position, let θ̂ be the unit vector in the direction
that the weight moves as θ starts to increase. The θ̂-component of the weight’s acceleration is
θ̈ = −g sin θ.   (13)
Adding a frictional term proportional to the angular velocity gives
θ̈ = −bθ̇ − g sin θ,
with friction term −bθ̇ and gravity term −g sin θ.
The θ̈ and bθ̇ terms are linear, but the g sin θ term makes the whole DE nonlinear.
Remark 30.2. If θ is very small, then it is reasonable to replace the nonlinear term by its
best linear approximation at θ = 0, namely sin θ ≈ θ, which leads to
θ̈ + bθ̇ + gθ = 0,
the familiar linear DE of a damped harmonic oscillator.
To study the nonlinear DE itself, introduce v := θ̇, so that θ̈ = −bθ̇ − g sin θ becomes the first-order system
θ̇ = v
v̇ = −bv − g sin θ.
This is an autonomous system! So we can use all the methods we’ve been developing.
Critical points: solve
v=0
−bv − g sin θ = 0.
Substituting v = 0 into the second equation leads to sin θ = 0, so θ = . . . , −2π, −π, 0, π, 2π, . . ..
Thus there are infinitely many critical points:
..., (−2π, 0), (−π, 0), (0, 0), (π, 0), (2π, 0), ....
But these represent only two distinct physical situations, since adding 2π to θ does not change
the position of the weight.
30.4. Phase portrait of the frictionless pendulum; energy levels. Let’s draw the
phase portrait in the (θ, v)-plane when b = 0 and g = 1 . Now the system is
θ̇ = v
v̇ = − sin θ.
Flashcard question: In the frictionless case, are the critical points (0, 0) and (π, 0) stable?
Answer: Neither is stable.
• The point (π, 0) corresponds to a vertical rod with the weight precariously balanced
at the top. If the weight is moved slightly away, the trajectory goes far from (π, 0).
• The point (0, 0) corresponds to a vertical rod with the weight at the bottom. If
the weight is moved slightly away, the trajectory does not tend to (0, 0) in the limit
because the pendulum oscillates forever in the frictionless case. □
To analyze the behavior near each critical point, use a linear approximation. The Jacobian
matrix is
J(θ, v) = (    0     1
            −cos θ   0 ) .
Define the total energy
E := v²/2 + (1 − cos θ),
the sum of kinetic energy v²/2 and potential energy 1 − cos θ. (Remember that all the constants were set to 1, so potential energy equals height, which we
choose to measure relative to the rest position.)
Let’s check conservation of energy:
Ė = v v̇ + (sin θ)θ̇
= v(− sin θ) + (sin θ)v
= 0.
This means that along each trajectory, E is constant. In other words, each trajectory is
contained in a level curve of E.
Energy level E = 0:
v²/2 + (1 − cos θ) = 0.
Both terms on the left are nonnegative, so their sum can be 0 only if both are 0, which
happens only at (θ, v) = (0, 0) (and the similar points with some 2πn added to θ). The
energy level E = 0 consists of the stationary trajectory at (0, 0).
Small energy levels E = c, for small c > 0: near (0, 0) we have 1 − cos θ ≈ θ²/2, so the level curve is approximately
v²/2 + θ²/2 = c,
a small circle. The trajectory goes clockwise along it, since θ̇ = v is positive on the upper half
of the circle and negative on the lower half. So trajectories near (0, 0) are periodic ovals (approximately circles);
these represent a pendulum doing a small oscillation near the bottom. The critical point is a
nonlinear center.
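Conservation of energy can also be observed numerically. Here is a sketch (forward Euler with my choice of step size and starting point; Euler drifts slowly on a center, so only approximate constancy of E is expected):

```python
import math

def pendulum_energy_drift(theta=0.5, v=0.0, h=0.0005, steps=20000):
    """Integrate the frictionless pendulum thetadot = v, vdot = -sin(theta)
    with forward Euler; return (initial energy, final energy)."""
    E0 = v*v/2 + (1 - math.cos(theta))
    for _ in range(steps):
        thetadot = v
        vdot = -math.sin(theta)
        theta, v = theta + h*thetadot, v + h*vdot
    return E0, v*v/2 + (1 - math.cos(theta))

E0, E1 = pendulum_energy_drift()
print(E0, E1)
```

The tiny drift between the two printed values is Euler's O(h) error, not a failure of conservation; the exact flow keeps E constant.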
Lecture more or less ended here. The remaining topics below were mentioned only very
briefly.
Energy level E = 2:
v²/2 + (1 − cos θ) = 2
v²/2 = 1 + cos θ
     = 1 + (2 cos²(θ/2) − 1)
     = 2 cos²(θ/2)
v = ±2 cos(θ/2).
Does this mean that the motion is periodic, going around and around? No. This energy level
contains three physical trajectories: the stationary trajectory at the inverted position, and two trajectories (one with v > 0, one with v < 0) that approach the inverted position as t → ±∞ without ever reaching it.
In the last two cases, the weight can’t actually reach the top, since its phase plane trajectory
can’t touch the stationary trajectory.
Energy level E = 3:
v²/2 + (1 − cos θ) = 3
v = ±√(4 + 2 cos θ).
The possibility v = √(4 + 2 cos θ) is a periodic function of θ, varying between √2 and √6. The
energy level consists of two trajectories: in each, the weight makes it to the top still having
some kinetic energy, so that it keeps going around (either clockwise or counterclockwise).
30.5. Phase portrait of the damped pendulum. Next let’s draw the phase portrait
when b > 0 (so there is friction) and g = 1 . The system is
θ̇ = v
v̇ = −bv − sin θ.
This time,
Ė = v v̇ + (sin θ)θ̇
= v(−bv − sin θ) + (sin θ)v
= −bv²,
which is ≤ 0, and strictly negative whenever v ≠ 0: friction dissipates energy, so trajectories cut across the energy level curves, heading toward lower energy. The critical points are the same as in the frictionless case:
..., (−2π, 0), (−π, 0), (0, 0), (π, 0), (2π, 0), ....
May 14
31.1. Euler’s method. Suppose that we want to solve a DE ẏ = f (t, y) numerically,
starting from a point (t0 , y0 ) and using a fixed step size h > 0.
Question 31.2. Where, approximately, will the point on the solution curve at time
t0 + 3h be?
Solution: The stupidest answer would be to take 3 steps each using the initial slope f (t0 , y0 )
(or equivalently, one big step of width 3h). The slightly less stupid answer is called Euler’s
method: take 3 steps, but reassess the slope after each step, using the slope field at each
successive position:
t1 := t0 + h y1 := y0 + f (t0 , y0 ) h
t2 := t1 + h y2 := y1 + f (t1 , y1 ) h
t3 := t2 + h y3 := y2 + f (t2 , y2 ) h.
The sequence of line segments from (t0 , y0 ) to (t1 , y1 ) to (t2 , y2 ) to (t3 , y3 ) is an approximation
to the solution curve. The answer to the question is approximately (t3 , y3 ). □
Usually these calculations are done by computer, and there are round-off errors in calcula-
tions. But even if there are no round-off errors, Euler’s method usually does not give the
exact answer. The problem is that the actual slope of the solution curve changes between t0
and t0 + h, so following a segment of slope f (t0 , y0 ) for this entire time interval is not exactly
correct.
To improve the approximation, use a smaller step size h, so that the slope is reassessed
more frequently. The cost of this, however, is that in order to increase t by a fixed amount,
more steps will be needed.
Under reasonable hypotheses on f , one can prove that as h → 0, this process converges
and produces an exact solution curve in the limit. This is one way to prove the existence
theorem for ODEs.
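Here is Euler's method in code, a sketch applied to the DE ẏ = 3y, y(0) = 6 from the start of these notes, whose exact solution is y = 6e³ᵗ (the function name `euler` and the chosen step sizes are mine):

```python
import math

def euler(f, t0, y0, h, steps):
    """Euler's method: repeatedly step along the slope field."""
    t, y = t0, y0
    for _ in range(steps):
        y = y + f(t, y) * h   # follow the current slope for time h
        t = t + h
    return y

exact = 6 * math.exp(3)   # exact value y(1) for ydot = 3y, y(0) = 6
for h in [0.1, 0.01, 0.001]:
    approx = euler(lambda t, y: 3*y, 0.0, 6.0, h, round(1.0/h))
    print(f"h = {h}: estimate {approx:.4f}, error {exact - approx:.4f}")
```

As h shrinks by a factor of 10, the printed error shrinks by roughly a factor of 10 as well, the O(h) behavior of Euler's method.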
31.2. Euler’s method for systems. A first-order system of ODEs can be written in vector
form ẋ = f (t, x), where f is a vector-valued function. Euler’s method works the same way.
Starting from (t0 , x0 ), define
t1 := t0 + h x1 := x0 + f (t0 , x0 ) h
t2 := t1 + h x2 := x1 + f (t1 , x1 ) h
t3 := t2 + h x3 := x2 + f (t2 , x2 ) h.
Question 31.3. How can we decide whether answers obtained numerically can be trusted?
Here are some heuristic tests. (“Heuristic” means that these tests seem to work in practice,
but they are not proved to work always.)
• Self-consistency: Solution curves should not cross! If numerically computed solution
curves appear to cross, a smaller step size is needed. (E.g., try the mathlet “Euler’s
Method” with ẏ = y² − x, step size 1, and starting points (0, 0) and (0, 1/2).)
• Convergence as h → 0: The estimate for y(t) at a fixed later time t should converge
to the true value as h → 0. If shrinking h causes the estimate to change a lot, then
h is probably not small enough yet. (E.g., try the mathlet “Euler’s Method” with
y′ = y² − x with starting point (0, 0) and various step sizes.)
• Structural stability: If small changes in the DE’s parameters or initial conditions
change the outcome completely, the answer probably should not be trusted. One
reason for this could be a separatrix, a curve such that nearby starting points on
different sides lead to qualitatively different outcomes; this is not a fault of the
numerical method, but is an instability in the answer nevertheless. (E.g., try the
mathlet “Euler’s Method” with y′ = y² − x, starting point (−1, 0) or (−1, −0.1), and
step size 0.125 or actual solution.)
31.4. Change of variable. Euler’s method generally can’t be trusted to give reasonable
values when (t, y) strays very far from the starting point. In particular, the solutions it
produces usually deviate from the truth as t → ±∞, or in situations in which y → ±∞ in
finite time. Anything that goes off the screen can’t be trusted.
To see what is really happening in this example, try the change of variable u = 1/y. To
rewrite the DE in terms of u, substitute y = 1/u and ẏ = −u̇/u²:
−u̇/u² = 1/u² − t
u̇ = −1 + tu².
This is equivalent to the original DE, but now, when y is large, u is small, and Euler’s method
can be used to estimate the time when u crosses 0, which is when y blows up.
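As an experiment, one can apply Euler's method to the u-equation directly (a sketch; the initial condition y(0) = 2, i.e. u(0) = 1/2, and the step size are my choices): u decreases from 1/2, and the time at which it crosses 0 estimates the blow-up time of y.

```python
def blowup_time(y0=2.0, h=1e-5, t_max=5.0):
    """Estimate when y blows up for ydot = y^2 - t, by tracking u = 1/y
    until it crosses 0 (t_max is a safety cap)."""
    t, u = 0.0, 1.0 / y0
    while u > 0 and t < t_max:
        u = u + h * (-1 + t*u*u)   # udot = -1 + t u^2
        t = t + h
    return t

t_star = blowup_time()
print(f"estimated blow-up time: {t_star:.3f}")
```

Comparison with ẏ = y² (blow-up at t = 1/2) and ẏ = y²/2 (blow-up at t = 1) shows the true blow-up time lies between 1/2 and 1, so the estimate can be sanity-checked.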
31.5. Runge–Kutta methods. When computing ∫ₐᵇ f (t) dt numerically, the most primitive
method is to use the left Riemann sum: divide the range of integration into subintervals of
width h, and estimate the value of f (t) on each subinterval as being the value at the left
endpoint. More sophisticated methods are the trapezoid rule and Simpson’s rule, which have
smaller errors.
There are analogous improvements to Euler’s method.
Integration Differential equation Error
left Riemann sum Euler’s method O(h)
trapezoid rule second-order Runge–Kutta method (RK2) O(h²)
Simpson’s rule fourth-order Runge–Kutta method (RK4) O(h⁴)
The big-O notation O(h⁴) means that there is a constant C (depending on everything except for h) such
that the error is at most Ch⁴, assuming that h is small. The error estimates in the table are valid for
reasonable functions.
The Runge–Kutta methods “look ahead” to get a better estimate of what happens to the
slope over the course of the interval [t0 , t0 + h].
Here is how one step of the second-order Runge–Kutta method (RK2) goes:
1. Starting from (t0 , y0 ), look ahead to see where one step of Euler’s method would land, say
(t1 , y1 ), but do not go there!
2. Instead sample the slope at the midpoint ((t0 + t1 )/2, (y0 + y1 )/2).
3. Now move along the segment of that slope: the new point is
(t0 + h, y0 + f ((t0 + t1 )/2, (y0 + y1 )/2) h).
Repeat, reassessing the slope after each step. (RK2 is also called midpoint Euler.)
The fourth-order Runge–Kutta method (RK4) is similar, but more elaborate, averaging
several slopes. It is probably the most commonly used method for solving DEs numerically.
Some people simply call it the Runge–Kutta method. The mathlets use RK4 with a small
step size to compute the “actual” solution to a DE.
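The RK2 step described above can be coded directly. Here is a sketch (the test equation ẏ = y, y(0) = 1, with exact value e at t = 1, and the function name `rk2_solve` are my choices). Halving h should divide the RK2 error by about 4, versus about 2 for Euler's method:

```python
import math

def rk2_solve(f, t0, y0, h, steps):
    """Midpoint Euler (RK2): look ahead one Euler step, then move
    along the slope sampled at the midpoint of that segment."""
    t, y = t0, y0
    for _ in range(steps):
        t1, y1 = t + h, y + f(t, y) * h          # tentative Euler step (not taken)
        slope = f((t + t1) / 2, (y + y1) / 2)    # slope at the midpoint
        t, y = t + h, y + slope * h
    return y

exact = math.e
err_h = abs(exact - rk2_solve(lambda t, y: y, 0.0, 1.0, 0.1, 10))
err_h2 = abs(exact - rk2_solve(lambda t, y: y, 0.0, 1.0, 0.05, 20))
print(err_h / err_h2)   # close to 4, as expected for an O(h^2) method
```

The same experiment with plain Euler gives a ratio near 2, which is one way to see the difference between O(h) and O(h²) in practice.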
May 16
32. Review
32.1. Check your answers! On the final exam, there is no excuse for getting an eigenvector
wrong, since you will have plenty of time to check it! You can also check solutions to linear
systems, or solutions to DEs.
32.2. D’Alembert’s solution to the wave equation. Consider the wave equation
∂²u/∂t² = c² ∂²u/∂x²
without boundary conditions. For any function f of one variable, the function u(x, t) :=
f (x − ct) of the two variables x and t is a solution to the wave equation (just plug it in —
both sides end up being c² f ″(x − ct)). At time t = 0, the shape of the wave is the graph
of u(x, 0) = f (x); at time t = 1, the shape of the wave is the graph of u(x, 1) = f (x − c),
which is the same shape shifted c units to the right; and so on. The physical meaning of this
solution is a wave keeping its shape but moving to the right with speed c.
Similarly, for any function g, the function u(x, t) := g(x + ct) is a solution. The physical
meaning of this solution is a wave moving to the left.
Since the wave equation is a linear PDE, the superposition
u(x, t) := f (x − ct) + g(x + ct)
is again a solution, for any choice of functions f and g. This turns out to be the general
solution (it’s an infinite family of solutions, since there are infinitely many possibilities for the
functions f and g). This is called d’Alembert’s solution. The f and g are like the parameters
c1 and c2 in the general solution to an ODE: to find them, one must use initial conditions.
Problem 32.1. Suppose that u(x, t) is the solution to the wave equation
∂²u/∂t² = 4 ∂²u/∂x²
such that the initial waveform at time 0 is given by the function
0, if x < 0,
r(x) = 2, if 0 < x < 1,
4, if x > 1,
and the initial velocity at each point is 0. Find a formula for u(x, t).
Solution: In the given wave equation, c2 is 4, so the speed of the waves is c = 2. Thus
u(x, t) = f (x − 2t) + g(x + 2t)
for some functions f and g to be determined. The initial conditions will constrain the
possibilities for f and g. For example, evaluating both sides at t = 0 and plugging in the
initial condition u(x, 0) = r(x) on the left side gives
r(x) = f (x) + g(x).
On the other hand, taking the t-derivative of both sides gives
∂u/∂t (x, t) = −2f ′(x − 2t) + 2g ′(x + 2t)
by the chain rule, and then evaluating at t = 0 and plugging in the initial condition ∂u/∂t (x, 0) = 0
on the left side gives
0 = −2f ′(x) + 2g ′(x),
which implies that f ′(x) = g ′(x), so f (x) = g(x) + C for some constant C. Solving the system
r(x) = f (x) + g(x)
f (x) = g(x) + C
for the unknown functions f (x) and g(x) (e.g., by substituting the second equation into the
first) gives
f (x) = (r(x) + C)/2 ,    g(x) = (r(x) − C)/2 .
Plug these back into the general solution to get the particular solution
u(x, t) = f (x − 2t) + g(x + 2t)
= (r(x − 2t) + C)/2 + (r(x + 2t) − C)/2
= r(x − 2t)/2 + r(x + 2t)/2 ,
which is a known function, since the function r was given. □
There are at least two ways to visualize the solution u(x, t) we found.
The first way is to plot the waveform at different times, to produce snapshots that if
displayed in succession will be a movie of the wave. For example, what does the wave look
like at time t = 1? The answer is the graph of
u(x, 1) = r(x − 2)/2 + r(x + 2)/2 ,
but what does this look like? The function r(x) jumps up at x = 0 and x = 1, so r(x − 2)
jumps up when x − 2 = 0 or x − 2 = 1 (that is, x = 2 or x = 3). Meanwhile, r(x + 2) jumps
up when x + 2 = 0 or x + 2 = 1 (that is, x = −2 or x = −1). Thus u(x, 1) jumps up at
x = −2, −1, 2, 3; these values divide the real line into intervals on which u(x, 1) is constant.
To find the values, we just need to evaluate u(x, 1) at a single x-value in each interval. For
example, for 2 < x < 3, the value of u(x, 1) equals
u(2.5, 1) = r(0.5)/2 + r(4.5)/2 = 2/2 + 4/2 = 3.
Similar calculations eventually lead to
u(x, 1) = 0, if x < −2,
          1, if −2 < x < −1,
          2, if −1 < x < 2,
          3, if 2 < x < 3,
          4, if 3 < x,
so the wave at t = 1 looks like a staircase with four steps going up.
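This staircase can be confirmed in a few lines (a sketch): implement r, form d'Alembert's combination with c = 2, and sample u(x, 1) once in each interval between the jumps.

```python
def r(x):
    """Initial waveform: 0 for x < 0, 2 for 0 < x < 1, 4 for x > 1."""
    if x < 0:
        return 0
    if x < 1:
        return 2
    return 4

def u(x, t):
    """d'Alembert solution with c = 2 and zero initial velocity."""
    return r(x - 2*t) / 2 + r(x + 2*t) / 2

# One sample point per interval of constancy of u(x, 1), with expected value.
samples = [(-3, 0), (-1.5, 1), (0, 2), (2.5, 3), (4, 4)]
for x, expected in samples:
    value = u(x, 1)
    print(f"u({x}, 1) = {value}")
    assert value == expected
```

The same two-line function u can be sampled at any (x, t) to fill in the constant values in the space-time diagram described next.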
The second way to visualize the solution u(x, t) is to draw its space-time diagram, in the
(x, t)-plane. At t = 0 (the horizontal axis), mark the x-values where u(x, 0) jumps up (x = 0
and x = 1) and in a different color write the value of u(x, 0) in each interval formed (these
values are 0, 2, 4). Then one can do the same for t = 1 (the horizontal line one unit higher):
mark the points where u(x, 1) jumps up (−2, −1, 2, 3) and write the values in each interval
formed. Actually, it is easier to do this for all t at once instead of one t-value at a time. That
is, since r(x) jumps at 0 and 1, the function
r(x − 2t) r(x + 2t)
u(x, t) = +
2 2
jumps whenever x − 2t = 0, x − 2t = 1, x + 2t = 0, or x + 2t = 1. The parts of these four
lines above the x-axis (i.e., the part where t ≥ 0) are the wave fronts. They divide the upper
half of the plane (t ≥ 0) into regions such that u(x, t) is constant on each region. To find the
constant value within each region, evaluate u(x, t) at one point in the region.
33. PageRank. Example: imagine a miniature web consisting of just 8 webpages, with links between them as in the directed graph drawn in lecture. (OK, maybe the web has more than 8 webpages, but you get the idea.)
Let vi be the “importance” of webpage i.
Idea: A webpage is important if important webpages link to it. Each webpage “shares” its
importance equally with all the webpages it links to.
In the example above, page 2 inherits 1/2 the importance of page 1, 1/2 the importance of
page 3, and 1/3 the importance of page 4:
v2 = (1/2)v1 + (1/2)v3 + (1/3)v4 .
Yes, this is self-referential, but still it makes sense. All eight equations are encapsulated in
(v1)   ( 0    0    0    0    0    0   1/3   0 ) (v1)
(v2)   (1/2   0   1/2  1/3   0    0    0    0 ) (v2)
(v3)   (1/2   0    0    0    0    0    0    0 ) (v3)
(v4) = ( 0    1    0    0    0    0    0    0 ) (v4)
(v5)   ( 0    0   1/2  1/3   0    0   1/3   0 ) (v5)
(v6)   ( 0    0    0   1/3  1/3   0    0   1/2) (v6)
(v7)   ( 0    0    0    0   1/3   0    0   1/2) (v7)
(v8)   ( 0    0    0    0   1/3   1   1/3   0 ) (v8)
which is of the form v = Av. In other words, v should be an eigenvector with eigenvalue 1.
Question 33.1. How do we know that a matrix like A has an eigenvector with eigenvalue 1?
Could it be that 1 is just not an eigenvalue?
Trick: use the transpose Aᵀ.
det A = det Aᵀ
det(A − λI) = det(Aᵀ − λI)
eigenvalues of A = eigenvalues of Aᵀ.
The equation
( 0   1/2  1/2   0    0    0    0    0 ) (1)   (1)
( 0    0    0    1    0    0    0    0 ) (1)   (1)
( 0   1/2   0    0   1/2   0    0    0 ) (1)   (1)
( 0   1/3   0    0   1/3  1/3   0    0 ) (1)   (1)
( 0    0    0    0    0   1/3  1/3  1/3) (1) = (1)
( 0    0    0    0    0    0    0    1 ) (1)   (1)
(1/3   0    0    0   1/3   0    0   1/3) (1)   (1)
( 0    0    0    0    0   1/2  1/2   0 ) (1)   (1)
(the matrix on the left is Aᵀ; each of its rows is a column of A, and each column of A sums to 1)
shows that 1 is an eigenvalue of Aᵀ, so 1 is an eigenvalue of A.
In the example above, the unique solution (up to multiplying by a scalar) is
v = (0.0600, 0.0675, 0.0300, 0.0675, 0.0975, 0.2025, 0.1800, 0.2950)ᵀ.
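This importance vector can be computed numerically. Here is a sketch (assuming NumPy is available; `numpy.linalg.eig` stands in for whatever large-scale method a search engine actually uses): build A, extract an eigenvector for the eigenvalue 1, and scale it so that the entries sum to 1.

```python
import numpy as np

# Column j lists how page j shares its importance among the pages it links to.
A = np.array([
    [0,   0,   0,   0,   0,   0, 1/3,   0],
    [1/2, 0, 1/2, 1/3,   0,   0,   0,   0],
    [1/2, 0,   0,   0,   0,   0,   0,   0],
    [0,   1,   0,   0,   0,   0,   0,   0],
    [0,   0, 1/2, 1/3,   0,   0, 1/3,   0],
    [0,   0,   0, 1/3, 1/3,   0,   0, 1/2],
    [0,   0,   0,   0, 1/3,   0,   0, 1/2],
    [0,   0,   0,   0, 1/3,   1, 1/3,   0],
])

eigenvalues, eigenvectors = np.linalg.eig(A)
k = np.argmin(abs(eigenvalues - 1))   # index of the eigenvalue closest to 1
v = np.real(eigenvectors[:, k])
v = v / v.sum()                       # normalize so the importances sum to 1
print(np.round(v, 4))
```

The printed vector matches the solution displayed above, with page 8 the most important.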
Google finds the eigenvector of a 50,000,000,000 × 50,000,000,000 matrix.