Elements of Numerical Analysis With Mathematica
August 21, 2016 1:36 ws-book9x6 Elements of Numerical Analysis with Mathematica... ws-book9x6Book2 page ii
Preface
that was sufficiently mathematical for our students and supported Mathe-
matica as the programming platform. During that time, our students have
seen topics come and go as we settled on a stable course. Without their par-
ticipation this text could not have been written. Special acknowledgement
goes to those who helped me understand how to present this material. In
particular this includes Scott Irwin, Yevgeniy Milman, Andrew Hofstrand,
Evan Curcio, Gregory Javens and James Kluz.
John Loustau
Hunter College (CUNY)
New York, 2015
Contents
Preface
1. Beginnings
   1.1 The Programming Basics for Mathematica
   1.2 Errors in Computation
   1.3 Newton's Method
   1.4 Secant Method
4. Numerical Differentiation
   4.1 Finite Differences and Vector Fields
   4.2 Finite Difference Method, Explicit or Forward Euler
   4.3 Neumann Stability Analysis
   4.4 Finite Difference Method, Implicit and Crank-Nicolson
5. Numerical Integration
   5.1 Trapezoid Method and Simpson's Rule
   5.2 Midpoint Method
   5.3 Gaussian Quadrature
   5.4 Comments on Numerical Integration
Bibliography
Index
Chapter 1
Beginnings
Introduction
puter system, you must always be cognizant of the potential for error in your
calculations. We will see an example of this in Section 1.2.
We next look at Newton's method. Most calculus courses include Newton's method for finding roots of differentiable functions. If you have done a Newton's method problem with pencil and paper, you know that doing two or three iterations of the process is a nightmare. Even the simplest cases are not the sort of thing most students want to do. Now, we see that it is easy to program. In this regard, it is an excellent problem for the beginning student. In addition, Mathematica provides a built-in function that performs Newton's method. It is empowering for the student to compare his results to the output produced by Mathematica. We follow Newton's method with the secant method to find the root of a function. This provides the student with the first example of an error estimating procedure.
By the end of the chapter, the student should be able to program the
basic arithmetic operations, access the standard mathematical functions,
program loops and execute conditional statements (if ... then ... else ...). A
special feature of Mathematica is the graphics engine. With minimal effort,
the student can display sophisticated graphical output. By the end of this
chapter the student will be able to use the basic 2D graphics commands.
We use Mathematica Version 10. Each year when the university renews its license, the version changes. In the past, programs for one version have either been fully upgradable to the subsequent version, or Wolfram has provided a program that upgrades code written for one version to the next.
During this semester you will be programming in Mathematica. To begin with, you will learn to program the following.
Sin[x], exponential, Exp[x] and so forth all begin with a capital letter. The
argument is enclosed in square brackets.
To start, bring up Mathematica and select a new notebook. Now type a line or two of program code, for instance, any two of the statements in the prior paragraph. To execute the code, you hold the shift key down and then press enter.
Consider the polynomial in 2 variables, f(x, y) = (x + y)^2 − 2xy − y^2. You define this in a Mathematica program with the following statement,
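The defining statement itself is lost to the page break in this copy; consistent with the function reused in the exercises of Section 1.2, it would read:

```mathematica
(* define f(x, y) = (x + y)^2 - 2xy - y^2; the underscores mark the independent variables *)
f[x_, y_] = (x + y)^2 - 2 x y - y^2;
```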
You can find descriptions of each of these in the first two or three chapters
of most any programming text for Mathematica. In addition Mathematica
includes a programming tutorial accessible via the Help Menu.
There are some comments that need to be made.
A. Error messages in Mathematica are cryptic at best. After a while you will begin to understand what they mean and use them to debug your program. But this will take some experience. On the other hand, there are circumstances where you might expect to receive an error or warning message but none is generated. For instance,
If [x == 0,
x = 5
];
will test the value of x. If it is zero, then it will be set to 5. On the other
hand,
If [x = 0,
x = 5
];
may not do this. Parentheses may only be used in computations for grouping. Square brackets are only used around the independent variables of a function, while braces are only used for vectors and matrices. For instance, each of the following expressions will cause an error in a Mathematica program:
[a + b]^2, f(x), (x, y).
The correct expressions are
(a + b)^2, f[x], {x, y}.
D. When defining a function in your program, you always follow the independent variable(s) with an underscore. This is how Mathematica identifies the independent variables. Later, when you reference the function, you must not use the underscore. For instance, the following statements define a function as x e^x, evaluate the function at x = 1, and then define a second function as the derivative of the first.
f [x ] = x Exp[x];
y = f [1];
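The page break cuts off the third statement in this copy. A minimal sketch of all three steps; the name g and the use of D for the derivative are reconstructions, not necessarily the author's original:

```mathematica
f[x_] = x Exp[x];     (* define f(x) = x e^x *)
y = f[1];             (* evaluate the function at x = 1; note: no underscore *)
g[x_] = D[f[x], x];   (* define a second function as the derivative of the first *)
```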
code is executed. In the second the output of the line of code is suppressed.
For instance, if you enter
a=3+5
and execute (press shift+enter), Mathematica will return the value, 8. On
the other hand if you type
a = 3 + 5;
and execute, then there is no printed output. In any event, the calculation does occur and the result is stored in a. Any subsequent calculation can access the result by referencing a.
G. It is best to have only one line of program code per physical line on the page. For short programs, violating this rule should not cause any problems. For long and involved programs, debugging is often a serious problem. If you have several lines of code on the same physical line, you may have trouble noticing a particular line of code that is causing an error. For instance,
z = x + y;
x = z + 1;
is preferred to
z = x + y; x = z + 1;
H. Mathematica has distinct computational procedures for integer arithmetic and decimal arithmetic. Integer calculations take much longer to execute, but the results, expressed as fractions, are exact. Decimal calculations are much faster, but there is round off error. (See Section 1.2.) For instance, if all the data for a program is whole numbers and is entered without decimal points, then Mathematica will assume that the calculations are integer and proceed. (See Problems 1 and 2 below.)
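A minimal illustration of the difference:

```mathematica
a = 1/3 + 1/6;    (* integer (exact) arithmetic: a is the fraction 1/2 *)
b = 1./3 + 1./6;  (* decimal (machine) arithmetic: b is 0.5, subject to round off *)
```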
I. We did not use := when defining the function f(x, y). There are technical differences between the two symbols used to define a function. Different authors will suggest one or the other. Our take on this is that := is used when defining a module, a function defined as a sub-program and accessed at several different locations in your program. Otherwise, to define a simple function as we have done, it is best to use =. That said, it is unlikely that you will see the difference in the contexts that arise here.
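A minimal illustration of the two forms; the names f and g are illustrative:

```mathematica
f[x_] = x^2;    (* =  : the right hand side is evaluated once, when f is defined *)
g[x_] := x^2;   (* := : the right hand side is evaluated each time g is called *)
```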
Exercises:
Errors arise from several sources. There are the errors in data collection
and data tabulation. This sort of data error needs to be avoided as much
as possible. Usually this is accomplished by quality assurance procedures
implemented at the team management level. This is not our concern in
numerical methods. Programming errors are also a quality control issue.
These errors are avoided by following good practices of software engineering.
For our own programs, we are best advised to be as simple as possible:
simple in program design, simple in coding. A mundane program is much
easier to control and modify than a brilliant but somewhat opaque one.
The simple one may take longer to code or longer to execute, but within
bounds is still preferable.
There are errors that arise because of the processes we use and the
equipment that we execute on. Both are errors due to the discrete nature
of the digital computer. These errors cannot be prevented and hence must
be controlled via error estimation.
First, the computer cannot hold an infinite decimal. Hence, the decimal representations of fractions such as 1/3 and 2/3 are inherently incorrect. Further, subsequent computations using these numbers are incorrect. A
small error in the decimal representation of a number when carried for-
ward through an iterated process may accumulate and result in an error of
considerable size. For instance, when solving a large linear system of equa-
tions, an error introduced in the upper left corner will iterate through the
Gauss-Jordan process causing a significant error in the lower right corner.
Another type of error arises from discrete processes. For instance, suppose you have an unknown function f(x). Suppose also that you know that f(1) = 1 and f(1.1) = 1.21. Then it is reasonable to estimate the derivative by the Newton quotient
df/dx (1) ≈ (1.21 − 1)/(1.1 − 1) = 0.21/0.1 = 2.1.
If in fact f(x) = x^2, then our estimated derivative is off by 0.1. But without
knowing the actual function, we have no choice but to use the estimation.
We are faced with one of two alternatives, doing nothing or proceeding
with values that we expect are flawed. The only reasonable alternative is
to proceed with errors provided we can estimate the error.
This is a special case of a more general problem. Suppose we want to compute a value y, but in fact, we can only compute the values of a sequence y_n that converges to y. Since we can never carry our computation all the way to the limit, we must have a means to estimate the nth error, y − y_n.
We formalize the error in the following definition.
Definition 1.2.1. Suppose that there is a computation that estimates a value x with the computed value x*. Then e = x − x* is called the error. In turn, |x − x*| is called the absolute error. If x ≠ 0, then (x − x*)/x is called the relative error. In this case, the relative absolute error is given by |x − x*|/|x|.
It is reasonable to ask why we should care about e if we already know x. Indeed, if we know x, there is no need for a numerical technique to estimate x with x*. In numerical analysis, the basic assumption is that x is not computable but estimable. Therefore, it is useful to have precise
definitions for these terms, as there are situations where we can estimate
the error e without knowing the actual value x. Indeed, each numerical
process should include a procedure to estimate the error. It can be argued
that any procedure that does not include an error estimate is of no value.
What is the purpose of executing a computation, if we have no idea whether
or not the computed data approximates the actual value?
A second comment is in order. It is preferable to use the relative or relative absolute error. This is because these values are dimensionless. For instance, consider the example of the derivative of the squaring function. If the data is given in meters, then e = 0.1 meters. If the data were instead displayed in kilometers, then e = 0.0001, and for centimeters, e = 10. Even though the error is the same, the impression is different. However, the relative error, 0.1/2 = 0.05, is independent of the unit of measurement. When this is the case, we say that the data is dimensionless.
In Exercise 1 below, you are asked to execute a simple calculation which
should always yield the same result independent of the input data. In this
problem you are asked to use several different values of x. Unexpectedly,
the results will vary across a broad spectrum of possible answers. In a
simple calculation like this it is possible to look at how the numbers are
represented and determine why the error occurs. But in any actual situation
the calculations are so complex that such an analysis is virtually impossible.
When executing a computation, you should always have an idea of what
the results should be. If you get impossible output and there is no error
in your program, then you may be looking at small errors in numerical
representation compounded over perhaps thousands of separate arithmetic
operations.
Exercises:
1. Consider the function f(x, y) = ((x + y)^2 − 2xy − y^2)/x^2. We expect that if x ≠ 0, then f(x, y) = 1. Set y = 10^3 and compute f for x = 10^(−0.01), 10^(−0.02), 10^(−0.03), 10^(−0.04), 10^(−0.05), 10^(−0.06), 10^(−0.07), 10^(−0.08). For each value of x compute the absolute error.
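A sketch of the computation for this exercise; the decimal points force machine arithmetic, and the particular x values are illustrative (adjust the exponents to match your edition):

```mathematica
f[x_, y_] = ((x + y)^2 - 2 x y - y^2)/x^2;
y0 = 10.^3;
Do[
  x0 = 10.^(-k/100);
  Print[{x0, f[x0, y0], Abs[f[x0, y0] - 1]}],   (* x, f, absolute error *)
  {k, 1, 8}
  ]
```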
Suppose that you have a function f(x) = y and want to find a root or a zero for f. Recall that x̄ is a root of f provided f(x̄) = 0. If f is continuous and f(x1) > 0 and f(x2) < 0, then you know that f must have at least one root between x1 and x2. This result, the Intermediate Value Theorem, is usually stated as early as Calculus 1 and most commonly proved in the first semester of Real Analysis. [Rudin (1976)] There is an intuitively simple but inefficient means to determine a good approximation for x̄ based on this theorem.
(1) Consider the midpoint of the interval [x1, x2], (x1 + x2)/2 = x̄.
(2) If f(x̄) = 0, then x̄ is a root. Exit.
(3) If f(x̄) > 0, then replace x1 with x̄.
(4) If f(x̄) < 0, then replace x2 with x̄.
(5) Return to Step 1.
The following Mathematica code segment demonstrates this process. We use the fact that a·b > 0 if and only if a and b have the same sign.
testRoot = x1;
testValue = 10^-5;
While[Abs[f[testRoot]] > testValue,
  testRoot = (x1 + x2)/2;
  If[f[testRoot]*f[x1] >= 0,
   x1 = testRoot,
   x2 = testRoot
   ];
  ];
Print[testRoot];
If x2 is closer to the root of f than x1, then the method has been productive.
Fig. 1.1 (Figure 1.3.1) f(x) = x cos(x).
Fig. 1.2 (Figure 1.3.2) f and the tangent line at B = (2, f(2)).
For instance, if f(x) = x cos(x), then there is a root at the point A near x = 1.6. (See Figure 1.3.1.)
If we start the process with x = 2, then f′(2) ≈ −2.2347, f(2) ≈ −0.8323, and the tangent to f at 2 is given by h(x) = f′(2)(x − 2) + f(2). Now h crosses the x-axis at 1.62757. Figure 1.3.2 shows f together with the tangent.
If we write this out formally, then starting at x1, the tangent line has slope f′(x1) and passes through the point (x1, f(x1)). Hence,
f′(x1) = (y − f(x1))/(x − x1).
Setting y = 0 and solving for x, we get
f′(x1)(x − x1) = −f(x1),
or
x = x1 − f(x1)/f′(x1).    (1.3.1)
Replacing x1 by x we get an iterative process that we can repeat until the absolute value |f(x1)| is less than some threshold value. We will call
(1.3.1) the operative statement. The following steps provide an outline for
the program.
The initial estimate for the root is often called the seed.
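The outline itself does not survive in this copy; the loop below is a sketch of it, with the function, the seed, and the threshold chosen for illustration:

```mathematica
f[x_] = x Cos[x];
x1 = 2.;                     (* the seed *)
threshold = 10^-5;
While[Abs[f[x1]] > threshold,
  x1 = x1 - f[x1]/f'[x1]     (* the operative statement (1.3.1) *)
  ];
Print[x1];
```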
Newton's method is implemented in Mathematica via the FindRoot command. For instance, f(x) = x cos(x) has a root between x = 1 and 3. The following Mathematica statement will implement Newton's method to get an approximate value for the root.
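The FindRoot call is lost to the page break in this copy. One plausible form, with seed 2 and option values matching the description in the text, is:

```mathematica
FindRoot[x Cos[x], {x, 2}, AccuracyGoal -> 5, MaxIterations -> 10000]
```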
This reads: continue while the residual is greater than 10^−5 and the number of iterations is less than 10000.
Fig. 1.3 (Figure 1.3.3) f and the tangent line at C = (x1, f(x1)).
Fig. 1.4 (Figure 1.3.4) A Newton's method cycle of order 2.
Finally, it is possible that Newton's method will fail to find any approximate root. Indeed, the process may cycle. In particular, starting at a value x1 you may pass on to a succession of values x2, x3, ..., xn only to have xn = x1. Once you are back to the original value, then the cycle is set and further processing is useless. The following diagram shows a function f where f(−1) = f(1) = 1, f′(−1) = −(1/2) and f′(1) = 1/2. Hence,
Exercises:
1. Consider the function f(x) = x^2. We know that f has a root at zero. Suppose we select the seed to be 0.5 and the threshold to be 10^−5. Execute a program in Mathematica to estimate the root. Since you already know the outcome, this sort of example is a good means to verify that your program is correct.
3. Figure 1.3.5 shows the graph of f(x) = x/(x^2 + 1) together with the point (1.5, f(1.5)).
a. Use FindRoot to solve f(x) = 0 starting at 1.5. What happens? Why?
b. Write your own program to execute Newton's method starting at x = 1.5. What is the output for the first 10 iterations?
c. Plot f along with the tangent at the 4th iteration. Put both plots on the same axes. (Hint: Execute both plots separately and save the output in a variable. Then execute the Show statement. The syntax of these statements is explained in the language help.)
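The hint in part c can be sketched as follows; the plot range is illustrative, and x4 is a placeholder for the 4th Newton iterate from part b:

```mathematica
f[x_] = x/(x^2 + 1);
x4 = 1.5;   (* placeholder: replace with the 4th Newton iterate from part b *)
h[x_] = f'[x4] (x - x4) + f[x4];   (* tangent line at x4 *)
p1 = Plot[f[x], {x, -15, 15}];
p2 = Plot[h[x], {x, -15, 15}];
Show[p1, p2]
```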
Fig. 1.5 (Figure 1.3.5) f and the tangent line at C = (x1, f(x1)).
we replace x2 with x̄.
For instance, if f(x) = x cos(x), then there is a root between 1 and 2. Setting x1 = 1 and x2 = 2, the secant line is given by ℓ(x) = [(f(2) − f(1))/(2 − 1)](x − 1) + f(1), and x̄ = 1.39364. The following diagram shows the graph of f along with the secant.
Fig. 1.6 (Figure 1.4.1) f with the secant joining (1, f(1)) and (2, f(2)).
Fig. 1.7 (Figure 1.4.2) f concave down near the root.
Returning to the general procedure, the points (x, y) on the secant satisfy the equation of the secant line; setting y = 0 and solving gives the secant estimate
x̄ = x1 − f(x1)/[(f(x2) − f(x1))/(x2 − x1)].    (1.4.2)
If |x2 − x1| is small, then the denominator on the right hand side, (f(x2) − f(x1))/(x2 − x1), is very near to f′(x1). As the iterative process proceeds, we should expect the successive values for x1 and x2 to converge together. In this case the expression (1.4.2) for the approximate root via the secant method will converge to the expression (1.3.1) used in Newton's method.
We turn now to error estimation. Suppose f is decreasing and concave down in the interval [x1, x2], as is the case for the current example f(x) = x cos(x). Let x̂ denote the approximate root derived from Newton's method and let x̄ denote the approximate root derived from the secant method. It is easy to see that x̄ ≤ x̂ and that the actual root must lie between them. Figure 1.4.2 illustrates this for the example case. We state this result formally in the following theorem.
Theorem 1.4.1. Consider a twice differentiable real valued function f defined on an interval [a, b]. Suppose that f has a root at x in the interval but no relative extrema or inflection points; then the following holds. Each Newton's method estimate x̂ and secant method estimate x̄ satisfy x̂ ≤ x ≤ x̄ or x̄ ≤ x ≤ x̂.
Proof. There are four cases to consider. We will do the proof for the case when f is decreasing and concave up. If a < x1 < x, then f(x1) > 0 and x̂ = x1 − f(x1)/f′(x1) > x1. We denote the tangent to f at x1 by h. Since f is concave up, h decreases faster than f in the interval [x1, x̂]. Therefore, x̂ ≤ x.
On the other side, given seeds x1 < x2, we know that f(x1) > 0 and f(x2) < 0. By definition, the secant estimate x̄ is no larger than x2. Let k denote the secant. Since f is concave up, for each x ∈ [x1, x2], f(x) < k(x). Therefore, f(x̄) < 0. It follows that x ≤ x̄ ≤ x2. This completes the proof for this case.
Exercises:
1. Consider f(x) = (x − 1)^2 − 2, which has a root at 1 + √2. Use the secant method with seeds at 2 and 3 to estimate this root.
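A sketch of the iteration for this exercise, using the replacement rule described at the start of the section (replace x2 with the secant estimate); the stopping threshold is illustrative:

```mathematica
f[x_] = (x - 1)^2 - 2;
x1 = 2.; x2 = 3.;
While[Abs[f[x2]] > 10^-5,
  x2 = x1 - f[x1]/((f[x2] - f[x1])/(x2 - x1))   (* the secant estimate (1.4.2) *)
  ];
Print[x2];   (* approximates 1 + Sqrt[2] *)
```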
d. Use the result of Exercise 1.b from the previous work on Newton's method along with the result of the prior section to get an upper bound on the absolute error.
Chapter 2
Introduction
proof as the usual argument requires the Jordan canonical form of a matrix.
We end the chapter considering max/min problems in several variables.
Along the way we show that finding the roots of a function of several vari-
ables may be recast as a max/min problem. In addition, these procedures
can be used to solve linear systems with singular coefficient matrix. Both
questions relate to the material developed in the prior chapter.
v = {1, 2, 3, 4};
Print[MatrixForm[v]];
A = {{1., 2, 3, 4}, {2, 3, 4, 5}, {5, 4, 3, 2}, {4, 3, 2, 1}};
B = {{0., 1, 0, 0}, {1, 0, 0, 0}, {0, 0, 0, 1}, {0, 0, 1, 0}};
matC = A.B;   (* C is a protected symbol in Mathematica, so we use matC *)
v = {1., 2, 3, 4};
Print[MatrixForm[matC]];
w = A.v;
w = 5*w;
A, and the n × n identity matrix. Norm[v] returns the length of the vector v, while Length[v] returns 4 if v is a four-tuple. Finally, A[[i]][[j]] = A[[i, j]] returns the (i, j)th entry of A and v[[i]] is the ith entry of the vector v.
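A quick check of these access functions; the vector and matrix are illustrative:

```mathematica
v = {1., 2, 3, 4};
A = IdentityMatrix[4];
Print[Norm[v]];                     (* the length of v, Sqrt[30] ~ 5.477 *)
Print[Length[v]];                   (* 4, the number of entries *)
Print[{A[[2]][[2]], A[[2, 2]]}];    (* two equivalent ways to get the (2,2) entry *)
Print[v[[3]]];                      (* the 3rd entry of v *)
```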
Consider a linear system Ax = b, where A is a non-singular n × n matrix and x, b ∈ R^n. The condition that A is non-singular assures us that the system has a unique solution given by x = A^(−1) b. In addition, we know that A is non-singular if and only if A is row equivalent to the n × n identity, I_n.
The standard process of solving a linear system of equations is called Gauss-Jordan elimination. This method is implemented in Mathematica. Suppose you have the linear system
[1 2 3] [x1]   [1]
[4 5 6] [x2] = [2]
[7 8 9] [x3]   [3]
In order to solve this system in Mathematica, you define the coefficient
matrix and constant vector via
coefMat = {{1,2,3},{4,5,6},{7,8,9}};
conVec = {1,2,3};
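The solving statement is lost to the page break in this copy; the standard call would be LinearSolve:

```mathematica
coefMat = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
conVec = {1, 2, 3};
LinearSolve[coefMat, conVec]   (* returns a vector x with coefMat.x == conVec *)
```

Note that this particular coefficient matrix happens to be singular; the system is consistent, so LinearSolve still returns one of its solutions.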
(1) Matrices A and B are row equivalent provided there exist elementary matrices E1, ..., Em with B = (E1 E2 ⋯ Em)A.
(2) Multiplication on the left by an elementary matrix implements the cor-
responding elementary row operation.
(3) There are three elementary row operations.
(4) The type-1 elementary row operation exchanges two rows of a matrix.
It is denoted by E(i,j) , indicating that rows i and j are exchanged.
(5) The type-2 elementary row operation multiplies a row by a nonzero scalar λ. It is denoted by Eλ(i), indicating that row i is multiplied by λ.
(6) The type-3 elementary row operation adds a scalar multiple of one row to another. It is denoted by Eλ(i)+j, indicating that λ times row i is added to row j.
(7) All elementary matrices are non-singular. Their inverses are given by E(i,j)^(−1) = E(i,j), Eλ(i)^(−1) = E(1/λ)(i) and Eλ(i)+j^(−1) = E(−λ)(i)+j. Notice that the inverse of an elementary matrix is the elementary matrix that reverses the operation.
(8) If Ax = b and Bx = c are linear systems and D is a product of elementary matrices, then B = DA and c = Db imply that the two systems have the same solution. In addition, for each type, the inverse is of the same type.
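Items (2) and (4) can be checked directly; the matrix A here is illustrative:

```mathematica
A = {{1., 2, 3}, {4, 5, 6}, {7, 8, 9}};
e12 = {{0, 1, 0}, {1, 0, 0}, {0, 0, 1}};   (* the type-1 elementary matrix E(1,2) *)
Print[MatrixForm[e12.A]];                  (* rows 1 and 2 of A exchanged *)
```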
Exercises:
1. Apply LUDecomposition to the following matrices. Write out L, U and P. Which matrices are ill conditioned?
a.
1. 1. 0.
1. 1. 3.
0. 1. 1.
b.
1. 2. 1. 7.
2. 0. 1. 4.
1. 0. 2. 5.
1. 2. 3. 11.
c.
1. 2. 3.
1. 1. 1.
5. 7. 9.
2. Let A = [a_{i,j}] be an n × n matrix with real entries. Suppose that there is an m with a_{i,j} = 0 for i ≥ m, j ≥ m and a_{i,i} ≠ 0 for 1 ≤ i < m. Prove that A is singular.
3. The Mathematica statement Eigensystem[A] returns n + 1 vectors for an n × n matrix A. The entries of the first vector are the eigenvalues of A. The remaining vectors are the corresponding eigenvectors. Apply Eigensystem to the matrices listed in Exercise 1. Recall that a matrix is singular if it has zero as an eigenvalue. When looking at computer output, a number close to zero should be considered zero. Which of the matrices in Exercise 1 are singular?
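For instance, applying Eigensystem to matrix c of Exercise 1:

```mathematica
A = {{1., 2., 3.}, {1., 1., 1.}, {5., 7., 9.}};
{vals, vecs} = Eigensystem[A];
Print[vals];   (* one eigenvalue is numerically zero, so the matrix is singular *)
```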
C. All dead trees are replaced by one or the other type of tree. New
trees enter the forest only by replacement.
D. Currently, there are 10 TA and 990 TB .
E. Since all collected data is annual, the manager sets the unit of time τ to be 1 year. Note that tA(τ) = total(TA) at a given time τ, and tB(τ) is set similarly.
To solve the problem the manager has done the following
For a given year, the number of dead trees is (0.01)tA(τ) + (0.05)tB(τ). The number of replacements by type TA is
(0.25)(0.01)tA(τ) + (0.25)(0.05)tB(τ),
and the number of replacements by type TB is
(0.75)(0.01)tA(τ) + (0.75)(0.05)tB(τ).
Unfortunately the manager has resigned leaving you to finish the study.
He has left no contact information.
a. Write tA(τ + 1) in terms of tA(τ) and tB(τ). (Hint: What factors determine tA(τ + 1)? These are: no TA dies, or there are deaths of either type that are replaced by TA, or there are deaths of TA that are replaced by TB. Which factors are positive, which are negative?)
b. Write tB(τ + 1) in terms of tA(τ) and tB(τ).
c. Recast the results of a and b in matrix form; that is, find A such that
( tA(τ + 1) )     ( tA(τ) )
( tB(τ + 1) ) = A ( tB(τ) ).
d. Use Eigensystem[A] to find the eigenvectors and eigenvalues of A. Confirm that A has a diagonal representation. We next recast the problem for a diagonal form of A.
e. Write the vector (tA(0), tB(0)) as a linear combination of the eigenvectors of A.
f. Write (tA(1), tB(1)) using the result of e and the eigenvalues of A.
We begin with the operator norm of a linear transformation. This concept is key to estimating the error in Gauss-Jordan elimination. But that is only the beginning. The operator norm is essential to understanding iterative processes with linear transformations. Hence, it plays a critical role in large matrix processes. Recast into infinite dimensional vector spaces, the linear operators with bounded norm are central. For instance, they play a key role in quantum mechanics and in linear partial differential equations.
We will develop the concept for real vector spaces. The same arguments work for the complex case.
Recall the norm of a vector in R^n, ‖v‖ = (Σ_i v_i^2)^(1/2). We extend this
Note that the theorem proves the existence of the norm without providing a means to calculate it. Rather, we are given an upper bound.
In this section, we have already seen two functions that we call a norm: the Euclidean norm of a vector in R^n and now the norm of A. These are both examples of a more general concept.
A norm gives rise to a metric and with a metric we can talk about
convergence, open sets and continuous functions.
The last property is also called the triangle inequality. It is not difficult to prove (see Exercise 8) that d(x, y) = ‖x − y‖ defines a metric on a normed linear space. Before moving on, note that as a consequence of Theorem 2.2.1, linear transformations are continuous. We state this formally in the following theorem.
We now turn to the error analysis for the solution to a linear system. Consider the linear system Av = b. We denote the solution by v and the computed solution by v̄. In turn, we set b̄ = A v̄. Now, we compute the relative normed error,
‖v − v̄‖/‖v‖ = ‖A^(−1)(b − b̄)‖/‖v‖ ≤ ‖A^(−1)‖ ‖b − b̄‖/‖v‖.
Next, we multiply top and bottom by ‖b‖ and use ‖b‖ ≤ ‖A‖‖v‖,
‖v − v̄‖/‖v‖ ≤ ‖A^(−1)‖ (‖b − b̄‖ ‖b‖)/(‖v‖ ‖b‖) ≤ ‖A‖‖A^(−1)‖ (‖b − b̄‖ ‖v‖)/(‖v‖ ‖b‖) = ‖A‖‖A^(−1)‖ ‖b − b̄‖/‖b‖.
The right hand expression is called the relative normed residual, and the coefficient C = ‖A‖‖A^(−1)‖ is the condition number.
We state this result formally as a theorem.
Theorem 2.2.4. For the linear system Av = b, the relative normed error
is bounded by the condition number times the relative normed residual.
If the condition number is very large, then the coefficient matrix for the system is singular or nearly singular, and the results returned by LinearSolve are not considered reliable. When this is the case, Mathematica will return a warning that the coefficient matrix is ill conditioned. But all of this presupposes that the matrices are not too large. For large matrices we are on our own. Indeed, large nonsingular matrices may have very large condition numbers. See Exercise 6.
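For a machine-precision matrix, the condition number C = ‖A‖‖A^(−1)‖ can be sketched directly; the matrix is illustrative, and Norm here is the 2-norm:

```mathematica
A = {{1., 2.}, {3., 4.}};
cond = Norm[A] Norm[Inverse[A]]   (* the condition number ||A|| ||A^-1|| *)
```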
Because of Theorem 2.2.3, we need to develop the operator norm in order to estimate the condition number. To begin, note that ‖A‖‖B‖ ≥ ‖AB‖, and there are cases where equality fails. (See Exercises 1, 3 and 4.) Therefore, in order to know the condition number, you must know the operator norm. Before proceeding, we need some terminology.
Alternatively, we write (A) for the absolute value of the smallest eigenvalue. The proof of the following theorem is included in the exercises.
For the case of a real symmetric matrix the situation is much more tractable. Indeed, it is a happy circumstance that linear systems with symmetric coefficient matrices arise naturally. In particular, this is the case for processes such as diffusion. The key here is that a real symmetric matrix has n orthonormal eigenvectors.
There is one more result that will be useful as we go forward. The proof
uses the Jordan canonical form. We state the theorem here without proof.
It is attributed to Gelfand. [Loustau (2016)]
Exercises:
1. Prove that for real or complex matrices, ‖A‖‖B‖ ≥ ‖AB‖.
2. Prove Theorem 2.2.2. (Hint: for part three consider ‖(A + B)v‖ = ‖Av + Bv‖ ≤ ‖Av‖ + ‖Bv‖.)
6. For the matrices in Exercise 5, set the first row to (1, 0, 0, ..., 0) and the last row to (0, 0, ..., 1). Check the condition number. Also use Eigenvalues to verify that the matrices are nonsingular.
(I − BA)e_k = (I − BA)^k e_1.
norm is less than one. In this case, we can prove that x_k → x. Indeed, it suffices to prove that e_k → 0:
‖e_k‖ = ‖(I − BA)^(k−1) e_1‖ ≤ ‖I − BA‖^(k−1) ‖e_1‖ → 0,
using Exercise 1 of the prior section. However, we can do better.
Proof. By Theorem 2.2.5, lim_k ‖(I − BA)^k‖^(1/k) < 1. Therefore, there is an integer j such that ‖(I − BA)^k‖ < 1 for every k ≥ j. Indeed, the kth root of a positive number is less than 1 only if the number is less than 1. Now if k > j, then ‖e_{nk+1}‖ = ‖(I − BA)^{nk} e_1‖ ≤ ‖(I − BA)^k‖^n ‖e_1‖ → 0 as n → ∞. The final assertion holds as any two convergent subsequences have a common convergent subsequence.
Exercises:
1. Consider the example with the tridiagonal coefficient matrix from the text. Execute the Jacobi method using the stop threshold 10^−5.
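The text's tridiagonal example does not survive in this copy; a small illustrative sketch of the Jacobi iteration x → x + B(b − Ax), with B the inverse of the diagonal of A, is:

```mathematica
A = {{4., 1., 0.}, {1., 4., 1.}, {0., 1., 4.}};   (* illustrative tridiagonal matrix *)
b = {1., 2., 3.};
B = DiagonalMatrix[1/Diagonal[A]];                (* Jacobi choice of B *)
x = {0., 0., 0.};
While[Norm[b - A.x] > 10^-5,
  x = x + B.(b - A.x)
  ];
Print[x];
```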
2. Repeat Exercise 1 using the Gauss-Seidel method and 100 × 100 matrices. How does the size change affect the relative normed error at 50, 500 and 5000 iterations?
6. Use the power iteration method to estimate the spectral radius for
(I BA) for the two examples (Jacobi and Gauss-Seidel) develped in the
text. How many iterations are requried to complete the estimate given
above?.
on the plane to use for the line ℓ. There are two standard procedures to determine
the direction vector (α, β). We develop one now and the second at the end of
the section. Recall a fact from multivariate calculus: the gradient points
in the direction of maximal increase, so its negative points in the direction of
maximal decrease. Hence, it seems reasonable to select
that direction for the line ℓ. In this case the technique is often called the
method of maximal descent.
We know that the gradient ∇f = (∂f/∂x(x_0, y_0), ∂f/∂y(x_0, y_0)) is a vector in the xy-plane, the domain of f, that determines the direction of maximal change
for f. So, it is reasonable to set α = −∂f/∂x(x_0, y_0) and β = −∂f/∂y(x_0, y_0)
and consider the line (1, 2) + t(α, β) = (1 + αt, 2 + βt) in the xy-plane. Next,
we define a function h : R → R, h(t) = f(1 + αt, 2 + βt). We can now solve
for a max/min of h. This is a one variable calculus problem. Finding a
minimum for h should yield a value for f less than 5, the value at (1, 2).
Indeed, ∇f(1, 2) = (2, 4), h(t) = (1 − 2t)² + (2 − 4t)² = 5 − 20t + 20t². The
derivative of h is −20 + 40t. It has an extremum at t = 0.5. Now h(0.5) =
f(0, 0) = 0 < 5. Indeed, we recognize the origin as the minimum of f. And
we have arrived in one step.
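The one step computation can be checked in Mathematica. This is a sketch using the function f(x, y) = x² + y², the starting point (1, 2) and the negative gradient direction of the example.

```mathematica
f[x_, y_] := x^2 + y^2;
grad = {D[f[x, y], x], D[f[x, y], y]} /. {x -> 1, y -> 2};  (* {2, 4} *)
h[t_] = f @@ ({1, 2} - t*grad);                (* search against the gradient *)
tMin = t /. First[Solve[D[h[t], t] == 0, t]];  (* t = 1/2 *)
{1, 2} - tMin*grad                             (* {0, 0}, the minimum of f *)
```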
We now state the general process for functions of several variables. Sup-
pose we seek a minimum of f mapping Rn to R.
There are many alternate choices for the direction vector for ℓ. One
choice is similar to the secant method. In this case we begin with the
Taylor expansion for f .
f(x + s) = f(x) + ∇f(x)·s + (1/2) sᵀ H(x) s + R₂ ,   (2.4.1)
where sᵀ denotes the transpose of s, H is the Hessian of f and R₂ is the
remainder term. Recall that the Hessian is the matrix whose entries are
∂²f/∂x_i∂x_j. Because of the use of the Hessian, this technique is referred
to as the Hessian method.
If we suppose that f(x) = f(x + s), then according to Rolle's theorem,
we would expect a local extremum between x and x + s. Hence, s is the
search direction. If we take the remainder term to be zero and we recast
(2.4.1),
(1/2) H(x) s = −∇f(x).   (2.4.2)
Therefore, we solve for s. Since (2.4.2) is a linear system with coefficient
matrix H(x), then we can find s provided H(x) is nonsingular. Finally,
to describe the Hessian method, we need only replace statement 1 by the
following
(1) Compute (α, β) as the solution to the linear system 0.5 H(x) s =
−∇f(x).
As mentioned at the beginning of the section, if f takes values in Rm
then g = f.f is real valued and the roots of f are now extrema for g. Hence,
we can use the techniques developed here to solve the general problem
f (x) = 0. We present examples in the exercises.
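As a sketch of this replacement step, the following Mathematica fragment solves (2.4.2) for the search direction s; the function f(x, y) = x² + xy + y² and the start (2, 1) are those of Exercise 1 below, and the variable names are ours.

```mathematica
(* One Hessian-method direction for f(x, y) = x^2 + x y + y^2 from (2, 1). *)
f[x_, y_] := x^2 + x*y + y^2;
vars = {x, y};
gradF = D[f[x, y], {vars}];      (* gradient of f *)
hessF = D[f[x, y], {vars, 2}];   (* Hessian of f  *)
x0 = {2, 1};
s = LinearSolve[0.5*hessF /. Thread[vars -> x0],
                -(gradF /. Thread[vars -> x0])]
(* s is the search direction; here the line search along x0 + t s *)
(* reaches the minimum (0, 0) of f at t = 1/2.                    *)
```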
Exercises:
1. Use maximal descent to find a minimum for f(x, y) = x² + xy + y².
Use (2, 1) as the search starting point.
The last row of the matrix is (4, 2, −3, 5), corresponding to the component 4x + 2y − 3z + 5w of L(x, y, z, w).
a. Use LUDecomposition to determine if A is singular or non-singular.
(Do not forget to introduce a decimal point to the data.) How does this
impact the problem of solving an equation of the form L(x, y, z, w) =
(x0 , y0 , z0 , w0 )?
b. Use the maximal descent method to solve L(x, y, z, w) = (1, 1, 1, 1).
Use (5, 5, 5, 5) for the initial estimate.
Use at least 35 iterations.
Use 10^{-5} as the tolerance in Step 7.
Make certain to use two if statements, one for Step 5 and one for
Step 7.
c. Redo Part b using (1, 2, 3, 4) as the initial estimate.
d. Why is it possible for the solution to b and c to be different?
e. Prove that if v is the solution to b and v′ is the solution to c, then v − v′
solves L(x, y, z, w) = (0, 0, 0, 0). (What is the kernel of a linear transformation?)
Chapter 3
Introduction
f[x_] = x*Exp[-x] - 1;
Plot[f[x], {x,1,4}];
Fig. 3.1 (Figure 3.1.1): f together with the Taylor expansion at x = 2.5. Fig. 3.2 (Figure 3.1.2): f concave down near the root x = 2.5.
g(x) = f(2.5) + (df/dx)(2.5)(x − 2.5) + (1/2!)(d²f/dx²)(2.5)(x − 2.5)² + (1/3!)(d³f/dx³)(2.5)(x − 2.5)³.
When developing g you will need to compute the derivatives of
f . Recall that the derivatives of f are computed in Mathematica via
D[f [x], x], D[f [x], x, x] and so forth. If you plot f and g on the same axis
you will see that the cubic Taylor polynomial provides a remarkably good
approximation of this function. Figure 3.1.2 shows the graph of g together
with the graph of f . Notice that the graph of g is above f on the left and
below on the right.
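In Mathematica the cubic expansion can also be generated directly with Series; this is a sketch, with g the name we give the Taylor polynomial.

```mathematica
f[x_] = x*Exp[-x] - 1;
g[x_] = Normal[Series[f[s], {s, 2.5, 3}]] /. s -> x;  (* cubic Taylor polynomial at 2.5 *)
Plot[{f[x], g[x]}, {x, 1, 4}]
```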
A numerical measurement of the goodness of fit is given by the L² norm
of f − g,
‖f − g‖₂ = ( ∫₁⁴ (f − g)² dx )^{1/2} .
This is called the norm interpolation error. In turn, the mean norm
interpolation error is
( (1/(4 − 1)) ∫₁⁴ (f − g)² dx )^{1/2} .
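Both error measurements can be evaluated with NIntegrate; a sketch, with f and g as above:

```mathematica
f[x_] = x*Exp[-x] - 1;
g[x_] = Normal[Series[f[s], {s, 2.5, 3}]] /. s -> x;
normErr = Sqrt[NIntegrate[(f[x] - g[x])^2, {x, 1, 4}]];
meanNormErr = Sqrt[NIntegrate[(f[x] - g[x])^2, {x, 1, 4}]/(4 - 1)];
{normErr, meanNormErr}
```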
The finite Taylor expansion produces a high quality one point interpolation provided we know the original function. However, suppose we have
points and no function; then we will need a different approach.
Collecting these equations we then get the following matrix equation

[ 1  x_1      ...  x_1^n     ] [ α_0 ]   [ y_1     ]
[ 1  x_2      ...  x_2^n     ] [ α_1 ]   [ y_2     ]
[ ...                        ] [ ... ] = [ ...     ]
[ 1  x_{n+1}  ...  x_{n+1}^n ] [ α_n ]   [ y_{n+1} ]

where x_i^0 = 1. This is a linear system of equations where the x_i and y_i
are known while the α_i are unknown. Hence, we can use the LinearSolve
function in Mathematica to find the coefficients of p provided the coefficient
matrix is non-singular. The matrix is called a Vandermonde matrix. It is
always nonsingular provided the x_i are distinct.
Proof. The Vandermonde matrix is singular only if the columns are dependent. In particular, only if there are scalars α_0, ..., α_n not all zero with
α_0 (1, ..., 1)ᵀ + α_1 (x_1, ..., x_{n+1})ᵀ + ... + α_n (x_1^n, ..., x_{n+1}^n)ᵀ = (0, ..., 0)ᵀ.
Hence, for each i = 1, 2, ..., n + 1 we have
α_0 + α_1 x_i + ... + α_n x_i^n = 0.
We have demonstrated a polynomial α_0 + α_1 x + ... + α_n x^n that is not zero,
has degree at most n and therefore has at most n distinct roots. However, we just
showed that it has n + 1 distinct roots, x_1, ..., x_{n+1}. As this is impossible, we
are led to the conclusion that the Vandermonde matrix is nonsingular.
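A sketch of the interpolation in Mathematica; the four data points are an example choice, not taken from the text.

```mathematica
(* Polynomial interpolation via a Vandermonde solve. *)
pts = {{1., 2.}, {2., 3.}, {3., 5.}, {4., 4.}};
xs = pts[[All, 1]]; ys = pts[[All, 2]];
n = Length[pts] - 1;
V = Table[xs[[i]]^j, {i, n + 1}, {j, 0, n}];  (* the Vandermonde matrix *)
alpha = LinearSolve[V, ys];                    (* coefficients of p *)
p[x_] = alpha.Table[x^j, {j, 0, n}];
p /@ xs   (* reproduces ys *)
```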
derivative of h. Hence, dh/dx has at least n + 1 roots on the interval (a, b).
Repeating this argument, d²h/dx² has at least n roots in (a, b). Continuing,
the kth derivative of h has at least n + 2 − k roots. So the (n + 1)st
derivative has at least 1 root. We denote this root by ξ = ξ_x̄, since ξ
depends on our choice of x̄. Now
0 = h^{(n+1)}(ξ) = f^{(n+1)}(ξ) − p^{(n+1)}(ξ) − g(x̄) (d^{n+1}/dx^{n+1}) ∏_i (x − x_i) |_{x=ξ} .
Fig. 3.3 (Figure 3.1.3): Three alternate images, y = 0.3, 0.25, 0.2; p(5) = 0.11, 0.04, 1.9.
Exercises:
1. Compute the norm error and the mean norm error for the function
f(x) = xe^{−x} − 1 and its cubic Taylor expansion about x = 2.5. Use the
interval [1, 4].
4. Repeat Exercise 3 with additional points on the x-axis at 3 and −3,
and at 4 and −4. Does this produce a better approximation of the function f?
5. Consider the points Pi = (xi , yi ), i = 1, 2, ..., n + 1, in the real plane
and the corresponding Lagrange polynomials li .
a. Prove that for each i, l_i is a degree n polynomial with l_i(x_j) = 0 if
i ≠ j, and l_i(x_i) = 1.
b. Prove that q(x) = Σ_{i=1}^{n+1} y_i l_i(x) interpolates the given points.
They determined that the data fit to the following function (to 3 decimal
places accuracy),
f(t) = 0.02424 (t/303.16)^{1.27591} .
d. Is the actual mean absolute error smaller than the estimated mean
absolute error?
B_{2,1} + t(B_{2,2} − B_{2,1}). (See Figure 3.2.1c.) If we write γ(t) in terms of the
original four points we have the usual representation for the Bezier curve,
γ(t) = (1 − t)³ B₁ + 3t(1 − t)² B₂ + 3t²(1 − t) B₃ + t³ B₄.   (3.2.1)
Fig. 3.4 (Figure 3.2.1a): Four guide points with segments. Fig. 3.5 (Figure 3.2.1b): 2nd-level points with line segments, t = 0.6. Fig. 3.6 (Figure 3.2.1c): 3rd-level points with line segments. Fig. 3.7 (Figure 3.2.1d): The Bezier curve.
points on the function graph are given by (x, f(x)), then the tangent vectors
to the graph are d/dx (x, f(x)) = (1, f′(x)). Hence, the tangent vector at
B₁ is (1, f′(1)). Since this vector must also satisfy (1, f′(1)) = 3(B₂ − B₁),
we have B₂ = B₁ + (1/3)(1, f′(1)). Similarly, B₃ = B₄ − (1/3)(1, f′(4)). As in
the previous cases, the Bezier interpolation of f is a good approximation
of the original curve.
In Section 1.3 we showed a curve for which Newton's method failed
because the process cycled; the third estimated root was equal to the first,
the fourth equal to the second and so forth. We created this curve using a Bezier
curve. We started with B₁ = (−1, 1) and B₄ = (1, 1). Next, we wanted the
slope at B₁ to be −1/2 and equal to 1/2 at B₄, so that Newton's method
would return the points (1, 0) and (−1, 0). Using the technique described above,
we have
(1, −1/2) = 3(B₂ − B₁); (1, 1/2) = 3(B₄ − B₃).
One purpose of interpolating points is to use the interpolating function
to compute the integral of an unknown function inferred from the points. We
will see later that the parametric form of the Bezier curve is significantly
more difficult to deal with than the polynomial or Taylor interpolation.
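A sketch of (3.2.1) in Mathematica; the four guide points are an example choice.

```mathematica
(* Cubic Bezier curve on four guide points. *)
{b1, b2, b3, b4} = {{0, 0}, {1, 2}, {3, 2}, {4, 0}};
bez[t_] = (1 - t)^3*b1 + 3 t (1 - t)^2*b2 + 3 t^2 (1 - t)*b3 + t^3*b4;
{bez[0], bez[1]}   (* the curve starts at b1 and ends at b4 *)
(* ParametricPlot[bez[t], {t, 0, 1}] displays the curve *)
```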
Exercises:
a. Prove that the cubic Bezier curve γ(t) defined on four points is
identical with the third Bernstein polynomial.
b. Prove that for any n, p_n(0) = P_0 and p_n(1) = P_n.
c. Use parts a, b to define a generalization of cubic Bezier curves.
We begin with a word of caution. Least square fitting in the linear case
arises also in the context of linear regression. This is more of a coincidence
than anything else. It is true that in both cases a line is fit to a finite set
of points. In addition, the line arises from the same minimization process.
Beyond that the processes are different and distinct. Least squares fitting
in the numerical methods context is a procedure that begins with a set
of points, and then guides the researcher to a polynomial, which seems
to fit well to the points. On the other hand linear regression begins with
a set of points sampled from a distribution and includes assumptions on
the distribution and the sample. Then a line is inferred. In particular,
the line is derived by minimizing the variance of a related distribution.
Furthermore, statistics are returned indicating confidence intervals for the
slope and y-intercept of the line. In addition, a general statistic is returned,
∂/∂m: Σ_i 2(y_i − (mx_i + b))(−x_i) = 2 Σ_i (mx_i² + (b − y_i)x_i),
∂/∂b: Σ_i 2(y_i − (mx_i + b))(−1) = −2 Σ_i (y_i − (mx_i + b)).
Setting these two terms to zero and reorganizing them just a little, we get
0 = Σ_i (mx_i² + (b − y_i)x_i) = m Σ_i x_i² + b Σ_i x_i − Σ_i x_i y_i ,
0 = Σ_i (y_i − (mx_i + b)) = −m Σ_i x_i − nb + Σ_i y_i .
Equivalently, the second equation reads
Σ_i y_i = (Σ_i x_i) m + nb.
Or in matrix notation,

[ Σ_i x_i²  Σ_i x_i ] [ m ]   [ Σ_i x_i y_i ]
[ Σ_i x_i   n       ] [ b ] = [ Σ_i y_i     ]

Next set

A = [ x_1 ... x_n ]
    [ 1   ... 1   ] .

Then it is immediate that

AAᵀ = [ x_1 ... x_n ] [ x_1  1 ]   [ Σ_i x_i²  Σ_i x_i ]
      [ 1   ... 1   ] [ ...  . ] = [ Σ_i x_i   n       ] ,
                      [ x_n  1 ]

and

A (y_1, ..., y_n)ᵀ = ( Σ_i x_i y_i , Σ_i y_i )ᵀ .

Hence we may rewrite the 2 by 2 system as

AAᵀ (m, b)ᵀ = A (y_1, ..., y_n)ᵀ .
This form of the linear system is most suitable for our calculations. It is
straightforward to prove that the coefficient matrix, AAᵀ, is necessarily
non-singular provided that no two x_i are equal.
To get a feel for how this looks, consider the following example. Suppose
we have points P1 = (5, 3), P2 = (4, 2), P3 = (2, 7), P4 = (0, 0), P5 =
(1, 5), P6 = (3, 3), P7 = (5, 5). The following figure shows the points and
the resulting least squares line.
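A sketch of the computation for these points; the built-in Fit function produces the same line and serves as a check.

```mathematica
(* Least squares line m x + b via the normal equations AA^T {m, b} = A ys. *)
pts = {{5, 3}, {4, 2}, {2, 7}, {0, 0}, {1, 5}, {3, 3}, {5, 5}};
xs = N[pts[[All, 1]]]; ys = N[pts[[All, 2]]];
A = {xs, Table[1, {Length[xs]}]};
{m, b} = LinearSolve[A.Transpose[A], A.ys];
{m, b}
(* Fit[N[pts], {t, 1}, t] returns the same line, b + m t *)
```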
If we had asked for a quadratic polynomial that best fit the point set,
then we would be looking for three coefficients, a, b and c. Setting up the
problem as in the linear case to get (a, b, c), differentiating with respect
Fig. 3.8 (Figure 3.3.1): even data points with least square fit.
to the three variables, setting the resulting terms to zero and solving we
would get the following linear system.
AAᵀ (a, b, c)ᵀ = A (y_1, ..., y_n)ᵀ ,
where

A = [ x_1² ... x_n² ]
    [ x_1  ... x_n  ]
    [ 1    ... 1    ] .
There are similar expressions for the cubic least squares problem, etc.
The data shown in Figure 3.3.1 would seem to be cubic (see Exercise 1
below).
At the top of the section we mentioned that least squares fitting was
separate and distinct from linear regression. Before ending the section we
add some details to that statement. The setting for linear regression starts
with two random variables, X and Y , together with the hypothesis that Y
is a linear function of X. In particular, we are supposing that Y = aX + b,
where the parameters a and b are to be determined. Then the process is to
select the parameters so as to minimize the variance of Y − (aX + b). When you do
this calculation against sample data (supposing that the sample was done
with replacement), the process is exactly the degree 1 least squares fitting.
However, within the statistical context, the process returns values that
measure the correctness of the hypothesis and provide confidence intervals
for the two parameters. These are ideas special to statistical regression not
shared with numerical analysis.
In the numerical analysis context there is no means to measure the cor-
rectness of the fit and no confidence intervals for the parameters. However,
least squares fitting is used to approximate the solution to a partial differ-
ential equation. In this case, the points that drive the least squares fitting
arise from numerical processes. We will have a means to measure how well
these values approximate the actual values and then use the least squares
process to fill between the known data.
We end this section with an important application. Exponential growth
is common in biology as well as the other sciences. For instance bacterial
growth is exponential. Epidemics show exponential growth during their
early stages. Exponential growth is characterized by the statement that
the rate of change of population size is proportional to the current size. In
particular, if f(t) represents the number of organisms in a bacteria growth
at time t, then the rate of change for f is proportional to the value of f means
that df/dt = λf(t). Hence, by integrating both sides, we get f(T) =
f(0) + λ ∫₀ᵀ f(t) dt, or f(t) = β e^{αt}, where α = λ and β = f(0).
Next, we turn this situation upside down. Suppose we have pairs (t_i, y_i)
of data, which because of the setting we know to be related via an exponential, y_i = β e^{αt_i}, but we do not know α and β. We can solve this problem
with least squares fitting. We write y = β e^{αt} and take the log of both sides.
This yields log[y] = log[β] + αt. In this form, log[y] is a linear function of
t. Hence, we have the technique. First, we take the log of the y_i, then fit
these values to the t_i using a linear least squares fitting. The result is a
line, y = at + b. And b = log[β], so β = e^b and α = a. Exercise 6 is an example of this
sort of problem.
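A sketch of the log-linear technique in Mathematica; the data pairs are an example choice (approximately 1.5 e^{0.5t}), not taken from the text.

```mathematica
(* Recover alpha and beta in y = beta Exp[alpha t] by fitting Log[y]. *)
ts = {0., 1., 2., 3., 4.};
ys = {1.5, 2.4, 4.1, 6.6, 11.0};
line = Fit[Transpose[{ts, Log[ys]}], {1, t}, t];   (* b + a t *)
{a, b} = {Coefficient[line, t], line /. t -> 0};
{alpha, beta} = {a, Exp[b]}   (* the recovered parameters *)
```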
Exercises:
1. Fit the data P1 = (5, 3), P2 = (4, 2), P3 = (2, 7), P4 =
(0, 0), P5 = (1, 5), P6 = (3, 3), P7 = (5, 5) to a line (Figure 3.3.1), a
quadratic and a cubic. In each case, calculate the sum of squares from
the curve to the points. Which curve gives the best fit?
2. For the linear case prove that AAᵀ is non-singular provided that
no two of the x_i are equal.
3. State and prove a result analogous to Exercise 2 for the case of the
quadratic least squares fitting.
‖(x_i, y_i, z_i)‖ to denote the usual Euclidean length (x_i² + y_i² + z_i²)^{1/2}.
5. Prove for the least squares fit of n + 1 points to an nth degree polynomial that A is the transpose of the Vandermonde matrix. Conclude that the least
squares fitting for this case is equivalent to the polynomial interpolation
process described in Section 3.1.
The term spline refers to a large class of curves. Some interpolate the given
points while others fit the data. They are commonly used in many areas
of application, including statistics, probability, engineering and computer
graphics. One reason for their wide use is that splines exhibit local control.
Another advantage to cubic spline interpolation or fitting over polynomial
interpolation is that the curve can simulate asymptotic behavior. For this
reason these curves are often used in probability theory to approximate a
density function.
In this section we develop two classes of spline, the classical cubic spline
[Su and Liu (1989)] and the B-spline. In addition, cubic Hermite interpolation, introduced in conjunction with Gaussian quadrature, is often referred to
as cubic orthogonal spline interpolation. We see this interpolation method
in the next section. The Bezier curve is a parametric curve. However, the
Definition 3.4.1. Consider [a, b] ⊂ R with a partition a = t_0 < t_1 < ... <
t_n = b. A spline γ defined on [a, b] is a parametric curve taking values in
Rᵐ such that
a. for each i = 1, ..., n − 1, γ restricted to (t_{i−1}, t_i) is a parametric cubic,
denoted γ_i,
b. γ_i(t_i) = γ_{i+1}(t_i), i = 1, ..., n − 2,
c. γ is twice differentiable at each t_i, i = 1, 2, ..., n − 2.
The points γ(t_i) are called the knot points and the γ_i are called the
segments.
y_i = M_i (x_i − x_{i−1})³ / (6Δ_i) + Cx_i + D,
where we write y_i = γ(x_i). Subtracting y_{i−1} from y_i we have
y_i − y_{i−1} = M_i (x_i − x_{i−1})³ / (6Δ_i) − M_{i−1} (x_i − x_{i−1})³ / (6Δ_i) + C(x_i − x_{i−1}).
Dividing through by Δ_i = x_i − x_{i−1} yields
(y_i − y_{i−1})/Δ_i = (Δ_i/6)(M_i − M_{i−1}) + C.
Now solving for C and substituting into (3.4.3) we get
γ_i′(x) = −M_{i−1} (x_i − x)²/(2Δ_i) + M_i (x − x_{i−1})²/(2Δ_i) + (y_i − y_{i−1})/Δ_i − (Δ_i/6)(M_i − M_{i−1}).   (3.4.4)
Note that the unknowns in equation (3.4.4) are M_{i−1} and M_i, and that
the equations are linear in these two variables. That is, (3.4.4) describes
a linear system of equations. We now organize this equation to get an
expression for the second derivatives (the M_i) without reference to x, the
independent variable, and then write the resulting linear system in matrix
form.
First, we write (3.4.4) in a slightly more compact form,
γ_i(x) = M_{i−1} (x_i − x)³/(6Δ_i) + M_i (x − x_{i−1})³/(6Δ_i) +
(y_i/Δ_i − M_i Δ_i/6)(x − x_{i−1}) + (y_{i−1}/Δ_i − M_{i−1} Δ_i/6)(x_i − x).
And for γ_{i+1} we get
γ_{i+1}(x) = M_i (x_{i+1} − x)³/(6Δ_{i+1}) + M_{i+1} (x − x_i)³/(6Δ_{i+1}) +
(y_{i+1}/Δ_{i+1} − M_{i+1} Δ_{i+1}/6)(x − x_i) + (y_i/Δ_{i+1} − M_i Δ_{i+1}/6)(x_{i+1} − x).
Equating γ_i′(x_i) with γ_{i+1}′(x_i) and collecting terms gives
(Δ_i/(Δ_{i+1} + Δ_i)) M_{i−1} + 2M_i + (Δ_{i+1}/(Δ_{i+1} + Δ_i)) M_{i+1} =
(6/(Δ_{i+1} + Δ_i)) [ (y_{i+1} − y_i)/Δ_{i+1} − (y_i − y_{i−1})/Δ_i ].   (3.4.5)
Equation (3.4.5) represents a linear system of equations. There are
n + 1 unknowns, M_j, j = 0, ..., n, and n − 1 equations associated to the
spline segments, γ_i, i = 1, ..., n − 1. Setting boundary values at M_0 and
M_n adds two more equations to the system,
M_0 = β_0 , M_n = β_n .   (3.4.6)
If we denote the right hand side of (3.4.5) by d_i and set Δ_i/(Δ_{i+1} + Δ_i) = μ_i,
Δ_{i+1}/(Δ_{i+1} + Δ_i) = λ_i, then (3.4.5) and (3.4.6) yield the following
tridiagonal linear system in matrix form.
[ 1    0    0    ...  0        0        0        ] [ M_0     ]   [ β_0     ]
[ μ_1  2    λ_1  ...  0        0        0        ] [ M_1     ]   [ d_1     ]
[ 0    μ_2  2    λ_2  ...      0        0        ] [ M_2     ]   [ d_2     ]
[ ...  ...  ...  ...  ...      ...      ...      ] [ ...     ] = [ ...     ]   (3.4.7)
[ 0    0    ...  μ_{n−2}  2    λ_{n−2}  0        ] [ M_{n−2} ]   [ d_{n−2} ]
[ 0    0    ...  0    μ_{n−1}  2        λ_{n−1}  ] [ M_{n−1} ]   [ d_{n−1} ]
[ 0    0    ...  0    0        0        1        ] [ M_n     ]   [ β_n     ]
The solution to (3.4.7) is a piecewise polynomial function. It is not
a polynomial, but rather the join of polynomial segments. Further, the
spline curve interpolates the given set of points. Hence, we have a process
similar to polynomial interpolation. However, based on the physical model,
splines will have local control. In particular, the problem that we identified
with polynomial interpolation cannot happen here. However, the size of
the linear system in (3.4.7) depends on the number of knot points or spline
guide points. There is an alternative called the B-spline. B-splines are piecewise
cubics that exhibit local control and do not require that we solve a linear
system. The downside is that they do not interpolate the guide points.
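A sketch of assembling and solving (3.4.7) with natural boundary values (M_0 = M_n = 0); the data points are an example choice, not taken from the text.

```mathematica
(* Second derivatives M_i for a natural cubic spline via (3.4.7). *)
xs = {0., 1., 2., 3., 4.};
ys = {0., 1., 0., -1., 0.};
n = Length[xs] - 1;
delta = Differences[xs];                       (* the Delta_i *)
mat = Table[0., {n + 1}, {n + 1}];
rhs = Table[0., {n + 1}];
mat[[1, 1]] = 1.; mat[[n + 1, n + 1]] = 1.;    (* boundary rows, M_0 = M_n = 0 *)
Do[
  mat[[i + 1, i]] = delta[[i]]/(delta[[i]] + delta[[i + 1]]);         (* mu_i *)
  mat[[i + 1, i + 1]] = 2.;
  mat[[i + 1, i + 2]] = delta[[i + 1]]/(delta[[i]] + delta[[i + 1]]); (* lambda_i *)
  rhs[[i + 1]] = 6./(delta[[i]] + delta[[i + 1]])*
    ((ys[[i + 2]] - ys[[i + 1]])/delta[[i + 1]] -
     (ys[[i + 1]] - ys[[i]])/delta[[i]]),
  {i, 1, n - 1}];
M = LinearSolve[mat, rhs]
```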
The basic idea is to remove the physical context from the spline while
keeping the basic properties. In particular, we want a piecewise cubic polynomial that is twice continuously differentiable where the segments join. In
particular, suppose we have a list of distinct points, P_i = (x_i, y_i), i = 0, ..., n.
Σ_{i=1}^{4} [ (x_i − x_{i+1}) a_{i,1} + 2 x_i a_{i,2} + 3 x_i² a_{i,3} ] = 0   (3.4.10)
Σ_{i=1}^{4} [ (x_i − x_{i+1}) a_{i,2} + 6 x_i a_{i,3} ] = 0   (3.4.11)
Proof. The result follows from the properties of the B-spline basis functions.
Proof. The result follows from the properties of the B-spline basis func-
tions.
In particular, the B-spline fit for the function is not an interpolation but it
does approximate the function.
Finally, we consider local control for B-splines. First, note that each
point on a segment γ_i is a linear combination of the four guide points for
the segment. Furthermore, as the sum of the B-spline basis functions is
1, the sum of the coefficients in the linear combination is also 1. Hence,
the curve lies within the convex hull of the four guide points. Note that
the convex hull of a point set is the smallest convex set that contains the
points. For 4 points, it is the convex quadrilateral or triangle that contains
the points. The following diagram illustrates the convex hull property. The
spline segment is generated by the four points.
In summary, B-splines are parametric cubic curves which fit the given
set of points, the guide points. Each B-spline segment is determined by four
of the guide points. For instance, segment γ_j is determined by the four
guide points P_j, P_{j+1}, P_{j+2}, P_{j+3}. In practice each segment traces a curve
very close to its four determining guide points. The B-spline curve (the
union of the segments) is twice continuously differentiable. This high level
of smoothness is the reason that B-splines are so often used.
Fig. 3.9 (Figure 3.4.1): The B-spline segment and the convex hull.
In addition, you can predict the effect on the curve that arises from
the change in one of the guide points. This is because of the convex hull
property. Any given guide point is included in the calculation of at most
four curve segments, or equivalently at most four of the bounding convex
hulls. Therefore, if you change a guide point, then you can predict the
change in the curve by looking at how the convex hulls change. Figure
3.4.2 shows a two segment B-spline. The convex hull for the first segment
is formed by P1, P2, P3, P4. For the second segment the convex hull is
determined by P2, P3, P4, P5. In Figure 3.4.3, we have moved P3.
Smoothness, local control and convergence are strong advantages
for B-splines over polynomials. Since there is no need to solve a large linear
system, B-splines are often preferred over cubic splines. Indeed, unless exact
interpolation is absolutely required, B-splines are often the technique of
choice.
Exercises:
1. Plot the four B-spline basis functions.
2. Compute the B-spline fit to the points (x_i, 1/(1 + x_i²)) for x_i =
−4, −2, −1, 0, 1, 2, 4. Plot the parametric B-spline against the graph of
Fig. 3.10 (Figure 3.4.2): Two B-spline segments with knot point A. Fig. 3.11 (Figure 3.4.3): Two B-splines, P3 has been moved to the left.
3. In Exercise 2, the fourth guide point is (0, 1). Leaving the other
guide points unchanged, repeat Exercise 2 using (0, 1.1) for the fourth
guide point. Compare the output to the curve in Exercise 2.
5. Repeat Exercise 3 with fourth guide point equal to (0, 1.2) or (0, 0.9)
or (0, 0.8).
6. Repeat Exercise 7 of Section 3 with the same data. This time fit
B-splines to the data and plot the curve on the same graph with the data.
Use the B-spline to predict values for f(750), f(850) and f(950). Determine the mean absolute error and compare the result to the result using
least squares fitting. Which method was better?
Our first task is to show that the Hermite interpolation exists; that is,
polynomials exist that satisfy the conditions of Definition 3.5.1. We start
with the n + 1, nth degree Lagrange polynomials associated to the points
x_i, i = 0, ..., n. Recall from Section 3.1,
l_i(x) = ∏_{j≠i} (x − x_j) / ∏_{j≠i} (x_i − x_j).
Since l_i(x_j) = δ_{i,j}, the same is true for h_i = l_i². Furthermore, if we set
p(x) = ∏_{i=0}^{n} (x − x_i), then
p′(x) = Σ_i ∏_{j≠i} (x − x_j);  p′(x_i) = ∏_{j≠i} (x_i − x_j).
Hence, l_i(x) = p(x)/[(x − x_i) p′(x_i)]. We compute
(d/dx) h_i = 2 l_i(x) (d/dx) l_i(x) = 2 [ p(x)/((x − x_i) p′(x_i)) ] [ p′(x)/((x − x_i) p′(x_i)) − p(x)/((x − x_i)² p′(x_i)) ].
Therefore, for j ≠ i, h_i′(x_j) = 0.
At this point we have n + 1 functions h_i, which are polynomials of degree
2n. Next, we seek polynomials u_i and v_i satisfying the following,
a. degree u_i = 1, u_i(x_i) = 1, u_i′(x_i) = −h_i′(x_i);
b. degree v_i = 1, v_i(x_i) = 0, v_i′(x_i) = 1.
Indeed, in this case H_i = u_i h_i and S_i = v_i h_i satisfy the conditions of
Definition 3.5.1. It is immediate that v_i(x) = x − x_i is consistent with item
b above. In turn, we set u_i(x) = α(x − x_i) + 1 where α = −h_i′(x_i). It is
now a simple matter to complete the proof of the following Theorem.
If M = ‖f^{(2n+2)}‖_∞, then
|e(x̄)| ≤ [ M / (2n + 2)! ] p(x̄)².   (3.5.2)
Furthermore, if f is a polynomial of degree less than or equal to 2n + 1, then
f = h.
Proof. The proof of this is analogous to the derivation of the error estimate for polynomial interpolation as developed in Section 3.1. As in that
Exercises
5. Plot the 4 Hermite cubics and verify by inspection that they satisfy
the requirements of Definition 3.5.1.
Chapter 4
Numerical Differentiation
Introduction
In this chapter we begin the study of the basic constructs of calculus from
the standpoint of numerical analysis. In particular, we look at procedures
to approximate the derivative of a function when we only know values of
the function but have no closed form or series representation. In the next
chapter we consider the integral.
We begin by introducing the finite difference as the basic technique to
approximate the derivative of a function from only knowledge of some of the
function values. For instance, if we know values of our function f at points
a and b, then (f (b) f (a))/(b a) is approximately equal to the derivative
provided b a is small. From the numerical analysis point of view, we want
an error estimate for any approximation. For this setting, if we use the
Taylor expansion to derive the expressions for the finite differences, then
the error estimation procedure is given.
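A sketch of the three first differences in Mathematica; the function, evaluation point and step are example choices.

```mathematica
(* Forward, backward and central differences for f at x0. *)
f[x_] := Sin[x];
dx = 0.1; x0 = 1.;
forward = (f[x0 + dx] - f[x0])/dx;
backward = (f[x0] - f[x0 - dx])/dx;
central = (f[x0 + dx] - f[x0 - dx])/(2 dx);
{forward, backward, central} - Cos[x0]   (* errors against f'(x0) *)
```

The central difference error is typically much smaller, reflecting its second order accuracy.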
Finite differences is the entry point into numerical processes to approx-
imate the solution of a differential equation. In the second section, we
introduce the finite difference method (FDM) to estimate solutions of the
one-dimensional heat equation. In this section we consider only explicit
FDM. On one hand the technique is successful. However, almost immediately we see that this technique can give rise to unacceptable levels of
error. This leads us to consider von Neumann stability questions in Section
3. In Section 4, we introduce Crank-Nicolson FDM. Stability is not an issue
for this procedure. This is often the method of choice for parabolic partial
differential equations (PDE). In the exercises we consider FDM as applied
to the order 1 wave equation. We see here that FDM for hyperbolic PDE
is very different. Little of the intuition gained from the parabolic case can
Exercises:
1. Take f(x) = x² and Δx = 0.1.
a. Compute the first forward, backward and central differences at x =
1, 2, ..., 100. Compare the output against the actual derivative. Compute
the mean absolute error for each case.
Fig. 4.1 (Figure 4.1.1): The partially obstructed channel. Fig. 4.2 (Figure 4.1.2): The channel with uniform grid in x and y directions. Fig. 4.3: The flow field approximated with finite differences.
d. For the interior points of D, replace the gradient in c with the central
difference in the x and y directions, (u(x_i + 0.2, y_i) − u(x_i − 0.2, y_i), u(x_i, y_i +
0.2) − u(x_i, y_i − 0.2)). Plot the vector field.
e. Compare the output in c against the output in d. Compute the mean
normed error.
u_i^{n+1} = λ u_{i+1}^n + (1 − 2λ) u_i^n + λ u_{i−1}^n .   (4.2.2)
In addition, we write uⁿ to denote the nth time state, the column vector
whose entries are u_i^n. In the notation of the topic, this rendering or choice
of finite differences is referred to as FTCS, for forward time, central space.
It is important that the problem has a unique solution. In the literature
this is referred to as setting a well posed problem. We generally assume that
the underlying physics is deterministic. The requirement that the problem
have a unique solution is consistent with this assumption. If we look at
(4.2.1) we realize that any multiple of a solution u is also a solution. We
can pin this down by designating the temperature at the boundary. In our
finite difference setting this is equivalent to declaring prior knowledge of
the temperatures at the boundary. In particular we must designate u_0^n and
u_{k+1}^n for all n. In addition we will need the initial values, u_i^0 for all locations
x_i.
This question of uniqueness arises also in the discrete setting. Notice
that (4.2.2) defines a linear relation between states, u^{n+1} = Au^n. In matrix
form this relation is

[ 1−2λ  λ     0     ...  0     0     0    ] [ u_0^n     ]   [ u_0^{n+1}     ]
[ λ     1−2λ  λ     ...  0     0     0    ] [ u_1^n     ]   [ u_1^{n+1}     ]
[ 0     λ     1−2λ  ...  0     0     0    ] [ u_2^n     ]   [ u_2^{n+1}     ]
[ ...   ...   ...   ...  ...   ...   ...  ] [ ...       ] = [ ...           ]   (4.2.3)
[ 0     0     0     ...  λ     1−2λ  λ    ] [ u_k^n     ]   [ u_k^{n+1}     ]
[ 0     0     0     ...  0     λ     1−2λ ] [ u_{k+1}^n ]   [ u_{k+1}^{n+1} ]
The matrix in (4.2.3) is nonsingular. It is symmetric and the eigenvalues
are known and nonzero for any k and any λ. (See [Loustau (2016)].)
Therefore, given an initial state u⁰, each successive state is completely
determined, uⁿ = Aⁿu⁰. Therefore, the 1D heat equation is inherently well
posed as a FDM problem. But just as noted, any multiple of this matrix
will also transform each state to the next and still be a discrete form of
Equation (4.2.1). We remove this ambiguity by designating boundary
values at a = x_0 and b = x_{k+1}. Generally, these values are known to us.
In this case, we set values that may or may not be time dependent. For
instance, if the boundary values at time t_{n+1} are β_0^{n+1} and β_{k+1}^{n+1}, then the
linear relation (4.2.3) is modified as
\[
\begin{pmatrix}
1 & 0 & 0 & \cdots & 0 & 0 & 0\\
\lambda & 1-2\lambda & \lambda & \cdots & 0 & 0 & 0\\
0 & \lambda & 1-2\lambda & \cdots & 0 & 0 & 0\\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots\\
0 & 0 & 0 & \cdots & 1-2\lambda & \lambda & 0\\
0 & 0 & 0 & \cdots & \lambda & 1-2\lambda & \lambda\\
0 & 0 & 0 & \cdots & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
\beta_0^{n+1}\\ u_1^n\\ u_2^n\\ \vdots\\ u_{k-1}^n\\ u_k^n\\ \beta_{k+1}^{n+1}
\end{pmatrix}
=
\begin{pmatrix}
\beta_0^{n+1}\\ u_1^{n+1}\\ u_2^{n+1}\\ \vdots\\ u_{k-1}^{n+1}\\ u_k^{n+1}\\ \beta_{k+1}^{n+1}
\end{pmatrix}.
\tag{4.2.4}
\]
The transformation between the format (4.2.3) and (4.2.4) can be mathematically justified. See [Loustau (2016)].
It is most important to realize that equation (4.2.1) defines a linear process that transforms the temperatures at one time state to the next. In the discrete form we realize the linear process by the matrix in (4.2.4).
Consider the following example. Suppose there is a thin rod which is insulated along its length. Suppose that the temperature is initially zero everywhere, and that the left end is suddenly heated and kept at 20 degrees. Finally we set $\lambda = 1/2$. In notation, we have set the spatial interval to $[-5, 5]$, $\Delta x = 0.1$, $\Delta t = 0.01$ and $\lambda = 1/2$. For the initial setting take $u(x, 0) = 0$ and for the boundary values take $u(-5, t) = 20$ and $u(5, t) = 0$. Using (4.2.4) we solve for approximate values of $u$ along the interval. The following plot shows the local temperatures after 10 time steps.
This is the basic idea behind the finite difference method for solving differential equations. The explicit or forward Euler method and other related techniques are remarkably successful (see Section 4). On the one hand, any differential equation may be rendered in FDM form. However, all these techniques have their limitations. Compare the output from Exercises 1 and 2 or 5 and 6 below. The change in $\lambda$ causes dramatic changes in the output. Indeed, the output of Exercise 2 is impossible. The difficulty arises because of computational error. In the next section we develop a technique that demonstrates how $\lambda$ is related to the error.
Problem 4 describes an application from cancer therapy. The specific
context is the delivery of chemotherapy across a cell membrane.
Exercises:
1. Execute the example from the text using the forward Euler method. Take $\lambda = 1/2$, the spatial interval as $[-5, 5]$, $\Delta x = 0.1$, $\Delta t = 0.01$. For the initial state take $u(x_i, 0) = 0$ for every $i$ and for the boundary values take each $u(-5, t_n) = 20$ and $u(5, t_n) = 0$. Plot the temperatures after 10 time steps.
Figure 4.2.1: Temperature distribution after 10 iterations.
2. Redo Problem 1 with $\lambda = 2$. Notice that the results are not well behaved. The problem here is that the forward Euler method for this geometric setting is not stable when $\lambda > 0.5$. Stability of transient or time dependent processes is covered in the following section.
termine $t$ and $u(t, x_2)$ when stasis has been reached. For our purposes we define stasis as $|u(t_{n+1}, x_2) - u(t_n, x_2)| < 0.0001$.
The idea here is that if the state vectors are non-increasing in norm
then we prevent the wild fluctuations in output that we experienced in the
given by
\[
(e^{\mathrm{i}jx}, e^{\mathrm{i}kx}) = N \ \text{ if } j = k, \qquad (e^{\mathrm{i}jx}, e^{\mathrm{i}kx}) = 0 \ \text{ otherwise},
\]
and
\[
\sum_{i=0}^{N-1} \bigl(e^{\mathrm{i}(j-k)h}\bigr)^{i} = N.
\]
Theorem 4.3.2. If $f(x_j) = \sum_{k=-M_0}^{M_1-1} d_k e^{\mathrm{i}kx_j}$ for each $j = 0, 1, ..., N-1$, then each $d_k = c_k$ as given in Definition 4.3.2. Conversely, for each $j$, $\sum_{k=-M_0}^{M_1-1} c_k e^{\mathrm{i}kx_j} = f(x_j)$.
Our next result and its corollary provide the mathematical foundation for Neumann stability analysis. The result identifies the norm of $f$ as an $N$-tuple with the norm of $f$ defined by the Hermitian form.
Next, suppose that $u^n$ is the state vector of an FDM process and suppose that $u^n$ has discrete Fourier coefficients $c_i^n$; then Theorem 4.3.2 may be restated as follows.
Proof. From the basic properties of the discrete Fourier interpolation and Theorem 2.3.2,
\[
\|u^{n+1}\|^2 = \Bigl(\sum_{i=-M_0}^{M_1-1} c_i^{n+1} e^{\mathrm{i}ix},\ \sum_{j=-M_0}^{M_1-1} c_j^{n+1} e^{\mathrm{i}jx}\Bigr)
= \sum_{i,j=-M_0}^{M_1-1} c_i^{n+1}\bar{c}_j^{n+1}\,(e^{\mathrm{i}ix}, e^{\mathrm{i}jx})
= N\sum_{i=-M_0}^{M_1-1} |c_i^{n+1}|^2
\le N\sum_{i=-M_0}^{M_1-1} |c_i^{n}|^2 = \|u^{n}\|^2.
\]
We now return to the 1-D heat equation and the explicit formulation as
derived from (4.2.2).
\[
u_i^{n+1} = \lambda u_{i+1}^n + (1 - 2\lambda)u_i^n + \lambda u_{i-1}^n.
\]
If we write this equation as the discrete Fourier interpolation we get
\[
\sum_{k=-M_0}^{M_1-1} c_k^{n+1} e^{\mathrm{i}kx_i} =
\sum_{k=-M_0}^{M_1-1} \bigl(\lambda c_k^{n} e^{\mathrm{i}k\Delta x} + (1 - 2\lambda)c_k^{n} + \lambda c_k^{n} e^{-\mathrm{i}k\Delta x}\bigr) e^{\mathrm{i}ki\Delta x}.
\]
Comparing the coefficients termwise,
\[
\frac{|c_j^{n+1}|}{|c_j^{n}|} = \bigl|\lambda e^{\mathrm{i}j\Delta x} + (1 - 2\lambda) + \lambda e^{-\mathrm{i}j\Delta x}\bigr|.
\]
Expressing the right hand side in terms of cosine and sine yields
\[
\frac{|c_j^{n+1}|}{|c_j^{n}|} = \bigl|\lambda(\cos(j\Delta x) + \mathrm{i}\sin(j\Delta x)) + (1 - 2\lambda) + \lambda(\cos(j\Delta x) - \mathrm{i}\sin(j\Delta x))\bigr|,
\]
or
\[
\frac{|c_j^{n+1}|}{|c_j^{n}|} = |2\lambda\cos(j\Delta x) + (1 - 2\lambda)| = |1 + 2\lambda(\cos(j\Delta x) - 1)|.
\]
Now stability will occur when the absolute value of the right hand side is less than or equal to 1, or $-1 \le 1 - 2\lambda(1 - \cos(j\Delta x)) \le 1$. Hence, $-2 \le -2\lambda(1 - \cos(j\Delta x)) \le 0$. Therefore, Neumann stability is satisfied provided $1 \ge \lambda(1 - \cos(j\Delta x))$. But the maximal value of $1 - \cos(\theta)$ is 2. Hence, stability requires that $\lambda \le 1/2$. Appealing to Theorem 4.3.3, we have proved the following theorem.
Theorem 4.3.4. The FDM rendering of the 1D heat equation is von Neumann stable if and only if $\lambda \le 0.5$.
For the proof of Theorem 4.3.3, we were able to take advantage of some simple properties of the trigonometric functions. Alternatively, we could have applied basic calculus max/min procedures to determine the values of $\lambda$ that satisfy $\lambda(1 - \cos(j\Delta x)) \le 1$.
We see now that stability depends on the balance between $\Delta x$ and $\Delta t$. Generally, if you decrease $\Delta x$ then you must decrease $\Delta t$ so that it accommodates the square of $\Delta x$.
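A quick numerical check of the bound (our own illustration in Python; the text itself works in Mathematica): sample the amplification factor $|1 + 2\lambda(\cos\theta - 1)|$ over a grid of angles. The factor stays within 1 exactly when $\lambda \le 1/2$.

```python
import math

def max_amplification(lam, samples=1000):
    """Largest |1 + 2*lam*(cos(theta) - 1)| over theta in [0, 2*pi]."""
    return max(abs(1 + 2 * lam * (math.cos(2 * math.pi * i / samples) - 1))
               for i in range(samples + 1))

print(max_amplification(0.5))  # stays at 1: every Fourier mode is bounded
print(max_amplification(2.0))  # about 7: the worst mode grows each step
```

The unstable case corresponds to the highest-frequency mode, $\theta = \pi$, exactly as the derivation predicts.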
But stability is only one issue. We also must consider the error, $e^n(x_i) = u(t_n, x_i) - u_i^n$. We can see that this is related to stability via the state matrix $A$. For row $i$ of $A$, denoted $A_{(i)}$,
\[
e^n(x_i) = u(t_n, x_i) - u_i^n = u(t_n, x_i) - A_{(i)} u^{n-1} =
u(t_n, x_i) - A_{(i)} u(t_{n-1}, x_i) + A_{(i)} e^{n-1}(x_i) = c^n(x_i) + A_{(i)} e^{n-1}(x_i),
\]
Exercises:
1. Recall the first order wave equation: given $u = u(t, x)$,
\[
\frac{\partial u}{\partial t} = \alpha \frac{\partial u}{\partial x},
\]
with FDM rendering, forward time and forward space (FTFS).
a. Implement FTFS for the first order wave.
b. Execute Neumann stability analysis for $\alpha > 0$.
c. Execute Neumann stability analysis for $\alpha < 0$.
d. What is the difference between the cases of positive and negative $\alpha$?
After solving for $u_i^n$ we have the following expression, analogous to (4.2.2).
\[
u_i^n = -\lambda u_{i+1}^{n+1} + (1 + 2\lambda)u_i^{n+1} - \lambda u_{i-1}^{n+1}. \tag{4.4.1}
\]
As in Section 2, we use (4.4.1) to form a linear relation $Au^{n+1} = u^n$. In matrix form, after setting boundary values, this yields $Bu^{n+1} = u^n$,
\[
\begin{pmatrix}
1 & 0 & 0 & \cdots & 0 & 0 & 0\\
-\lambda & 1+2\lambda & -\lambda & \cdots & 0 & 0 & 0\\
0 & -\lambda & 1+2\lambda & \cdots & 0 & 0 & 0\\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots\\
0 & 0 & 0 & \cdots & 1+2\lambda & -\lambda & 0\\
0 & 0 & 0 & \cdots & -\lambda & 1+2\lambda & -\lambda\\
0 & 0 & 0 & \cdots & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
\beta_0^{n+1}\\ u_1^{n+1}\\ u_2^{n+1}\\ \vdots\\ u_{k-1}^{n+1}\\ u_k^{n+1}\\ \beta_{k+1}^{n+1}
\end{pmatrix}
=
\begin{pmatrix}
\beta_0^{n+1}\\ u_1^{n}\\ u_2^{n}\\ \vdots\\ u_{k-1}^{n}\\ u_k^{n}\\ \beta_{k+1}^{n+1}
\end{pmatrix}.
\tag{4.4.2}
\]
The process is called backward Euler or implicit. As in the explicit case, the matrix $B$ is nonsingular. Indeed, we have a formula for the eigenvalues of $B$ and none of them are zero or near to zero.
In this case, we know the $n$th state and want to solve for the $(n+1)$st state. Hence, either we solve (4.4.2) as a linear system or we compute the inverse of $B$ and resolve $u^{n+1} = B^{-1}u^n$. Hence, $u^{n+1} = B^{-(n+1)}u^0$ provided the boundary values are time independent.
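Because $B$ is tridiagonal, each implicit step can be solved in linear time. The following Python sketch (ours, not from the text; Mathematica's LinearSolve would serve the same purpose) uses the Thomas algorithm:

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm: solve a tridiagonal linear system in O(n).

    lower[i], diag[i], upper[i] are the entries of row i; lower[0] and
    upper[-1] are unused.
    """
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        m = diag[i] - lower[i] * c[i - 1]
        if i < n - 1:
            c[i] = upper[i] / m
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / m
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def backward_euler_step(u, lam, left, right):
    """One implicit step: solve B u^{n+1} = u^n, boundary rows set to 1."""
    n = len(u)
    lower = [0.0] + [-lam] * (n - 2) + [0.0]
    diag = [1.0] + [1 + 2 * lam] * (n - 2) + [1.0]
    upper = [0.0] + [-lam] * (n - 2) + [0.0]
    rhs = [left] + list(u[1:-1]) + [right]
    return solve_tridiagonal(lower, diag, upper, rhs)

# The rod example again, but now with lambda = 2: implicit stays stable.
u = [0.0] * 101
u[0] = 20.0
for _ in range(10):
    u = backward_euler_step(u, lam=2.0, left=20.0, right=0.0)
```

Even with $\lambda$ well beyond the explicit limit of $1/2$, the computed temperatures remain between the boundary values.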
But first, we should consider stability. Analogous to the explicit case, we resolve this question by recasting (4.4.1) in Fourier form.
\[
\sum_{k=-M_0}^{M_1-1} c_k^{n} e^{\mathrm{i}kx_i} =
-\lambda \sum_{k=-M_0}^{M_1-1} c_k^{n+1} e^{\mathrm{i}kx_{i+1}} + (1 + 2\lambda)\sum_{k=-M_0}^{M_1-1} c_k^{n+1} e^{\mathrm{i}kx_i} - \lambda \sum_{k=-M_0}^{M_1-1} c_k^{n+1} e^{\mathrm{i}kx_{i-1}},
\]
or
\[
\sum_{k=-M_0}^{M_1-1} c_k^{n} e^{\mathrm{i}ki\Delta x} =
\sum_{k=-M_0}^{M_1-1} \bigl(-\lambda c_k^{n+1} e^{\mathrm{i}k\Delta x} + (1 + 2\lambda)c_k^{n+1} - \lambda c_k^{n+1} e^{-\mathrm{i}k\Delta x}\bigr) e^{\mathrm{i}ki\Delta x}.
\]
Comparing coefficients,
\[
\frac{|c_j^{n}|}{|c_j^{n+1}|} = \bigl|-\lambda(\cos(j\Delta x) + \mathrm{i}\sin(j\Delta x)) + (1 + 2\lambda) - \lambda(\cos(j\Delta x) - \mathrm{i}\sin(j\Delta x))\bigr| =
|1 + 2\lambda(1 - \cos(j\Delta x))| \ge 1,
\]
since $\cos(j\Delta x) \le 1$ and $\lambda > 0$. Therefore, $\|u^{n+1}\|/\|u^n\| \le 1$ unconditionally. We have proved the following theorem.
\[
\frac{1}{2}\bigl(u_{i+1}^{n+1} - 2u_i^{n+1} + u_{i-1}^{n+1}\bigr) + \frac{1}{2}\bigl(u_{i+1}^{n} - 2u_i^{n} + u_{i-1}^{n}\bigr).
\]
\[
u_i^{n+1/2} - u_i^{n} = \frac{\lambda}{2}\bigl(u_{i+1}^{n} - 2u_i^{n} + u_{i-1}^{n}\bigr).
\]
The step $u^n \to u^{n+1/2}$ is explicit while the step $u^{n+1/2} \to u^{n+1}$ is implicit,
\[
u_i^{n+1/2} = \frac{1}{2}\bigl(-\lambda u_{i+1}^{n+1} + 2(1 + \lambda)u_i^{n+1} - \lambda u_{i-1}^{n+1}\bigr),
\]
\[
u_i^{n+1/2} = \frac{1}{2}\bigl(\lambda u_{i+1}^{n} + 2(1 - \lambda)u_i^{n} + \lambda u_{i-1}^{n}\bigr).
\]
The corresponding matrices are
\[
\frac{1}{2}\begin{pmatrix}
1 & 0 & 0 & \cdots & 0 & 0 & 0\\
-\lambda & 2(1+\lambda) & -\lambda & \cdots & 0 & 0 & 0\\
0 & -\lambda & 2(1+\lambda) & \cdots & 0 & 0 & 0\\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots\\
0 & 0 & 0 & \cdots & 2(1+\lambda) & -\lambda & 0\\
0 & 0 & 0 & \cdots & -\lambda & 2(1+\lambda) & -\lambda\\
0 & 0 & 0 & \cdots & 0 & 0 & 1
\end{pmatrix},
\]
and
\[
\frac{1}{2}\begin{pmatrix}
1 & 0 & 0 & \cdots & 0 & 0 & 0\\
\lambda & 2(1-\lambda) & \lambda & \cdots & 0 & 0 & 0\\
0 & \lambda & 2(1-\lambda) & \cdots & 0 & 0 & 0\\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots\\
0 & 0 & 0 & \cdots & 2(1-\lambda) & \lambda & 0\\
0 & 0 & 0 & \cdots & \lambda & 2(1-\lambda) & \lambda\\
0 & 0 & 0 & \cdots & 0 & 0 & 1
\end{pmatrix}.
\]
The inverse of the first yields the $(n+1/2)$ state from the $n$th. The second maps the $(n+1/2)$ state to the $(n+1)$st. Taken together they are unconditionally stable.
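The two half-steps are easy to demonstrate numerically. The sketch below (our Python illustration; the function names and the iterative solve are ours) applies the explicit half-step and then solves the implicit half-step by Jacobi iteration, which converges because the implicit matrix is strictly diagonally dominant. With $\lambda = 2$, well beyond the explicit stability limit, a sine-shaped profile simply decays:

```python
import math

def crank_nicolson_step(u, lam, sweeps=400):
    """One Crank-Nicolson step with zero boundary values.

    Explicit half-step first, then the implicit half-step is solved by
    Jacobi iteration (the matrix is strictly diagonally dominant).
    """
    n = len(u)
    w = [0.0] * n
    for i in range(1, n - 1):
        w[i] = u[i] + 0.5 * lam * (u[i + 1] - 2 * u[i] + u[i - 1])
    v = list(w)
    for _ in range(sweeps):
        v = [0.0] + [(w[i] + 0.5 * lam * (v[i + 1] + v[i - 1])) / (1 + lam)
                     for i in range(1, n - 1)] + [0.0]
    return v

# A sine profile with zero boundaries decays smoothly even for lambda = 2.
k = 20
u = [math.sin(math.pi * i / k) for i in range(k + 1)]
for _ in range(10):
    u = crank_nicolson_step(u, lam=2.0)
```

The sine vector is an eigenvector of the second difference, so each step multiplies it by a fixed factor of magnitude below one, and the shape is preserved.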
Exercises:
1. Use the implicit method to execute Exercise 1 of Section 4.2.
Chapter 5
Numerical Integration
Introduction
Numerical Integration 99
But if the curve is given parametrically, $\gamma(t) = (\gamma_1(t), \gamma_2(t))$, $t \in [a, b]$, then the arc length is known directly from the parametric formulation as
\[
\int_a^b \bigl[\gamma_1'(t)^2 + \gamma_2'(t)^2\bigr]^{1/2}\,dt.
\]
Again, the presence of the square root often renders the integral intractable.
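This is exactly where numerical quadrature earns its keep. As a hypothetical illustration (ours, in Python), the quarter-perimeter of the ellipse $\gamma(t) = (2\cos t, \sin t)$ has no elementary antiderivative, but a composite trapezoid sum handles it directly:

```python
import math

def arc_length(g1p, g2p, a, b, n=1000):
    """Composite trapezoid estimate of the parametric arc length integral."""
    h = (b - a) / n
    f = lambda t: math.sqrt(g1p(t) ** 2 + g2p(t) ** 2)
    total = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n))
    return h * total

# Quarter of the ellipse x = 2 cos t, y = sin t, for t in [0, pi/2].
quarter = arc_length(lambda t: -2 * math.sin(t), lambda t: math.cos(t),
                     0.0, math.pi / 2)
```

For a circle the integrand is constant and the trapezoid sum is exact, which makes a convenient sanity check.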
Exercises:
1. Let $f(x) = xe^x - 1$ and set the interval to $[1, 4]$.
a. Use integration by parts to compute the integral of $f$ on the given interval. Use Mathematica to evaluate the exponentials.
b. Use the Integrate command in Mathematica. Compare the result with the result in (a).
2. Let $f(x) = xe^x - 1$ and set the interval to $[1, 4]$. Consider the partition $1 < 1.5 < 2 < 2.5 < 3 < 3.5 < 4$.
a. Compute the integral of $f$ using the trapezoid method for the given partition.
b. Compute the integral of $f$ using Simpson's rule for the given partition. Compare these results with those in 1.
The midpoint rule is remarkable both for its simplicity and its accuracy. As before, we begin with a function $f$ defined on an interval $[a, b]$ and a partition $a = x_0 < x_1 < ... < x_n = b$. For the subinterval $[x_{i-1}, x_i]$, let $\mu$ denote the midpoint of the interval and $2h$ the length. The interval now becomes $[\mu - h, \mu + h]$ and the trapezoid approximation for the integral is $2h(f(\mu + h) + f(\mu - h))/2 = h(f(\mu + h) + f(\mu - h))$. If we were to replace $[f(\mu + h) + f(\mu - h)]/2$ by $f(\mu)$, then the method would be called the midpoint rule.
Definition 5.2.1. For the interval $[a - h, a + h]$, the estimated integral for $f$ on the interval using the midpoint rule is
\[
\int_{a-h}^{a+h} f(x)\,dx \approx 2hf(a). \tag{5.2.1}
\]
If we have an interval $[a, b]$ together with a partition, $a = x_0 < x_1 < ... < x_n = b$, then the midpoint rule states that
\[
\int_a^b f(x)\,dx \approx \sum_{i=1}^{n} f(\mu_i)(x_i - x_{i-1}), \tag{5.2.2}
\]
where $\mu_i$ is the midpoint of the $i$th subinterval.
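A composite midpoint sum (5.2.2) is a one-liner. The sketch below (our Python illustration) applies it to the running example $f(x) = xe^x - 1$ on $[1, 4]$, whose exact integral is $3e^4 - 3$ by integration by parts:

```python
import math

def midpoint_rule(f, a, b, n):
    """Composite midpoint rule (5.2.2) on a uniform partition of n pieces."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

f = lambda x: x * math.exp(x) - 1
exact = 3 * math.exp(4) - 3          # integral of x e^x - 1 over [1, 4]
coarse = midpoint_rule(f, 1, 4, 6)   # the partition of width 0.5
fine = midpoint_rule(f, 1, 4, 600)
```

Shrinking $h$ by a factor of 100 shrinks the error by roughly a factor of $10^4$, the second order behavior expected of the midpoint rule.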
Figure 5.2.1a: Midpoint method. Figure 5.2.1b: Midpoint method with tangent at the midpoint.
two parallelograms that lie below the curve. Since the curve is concave up, necessarily the first sum exceeds the second, and we have proved the theorem.
Exercises:
1. Let $f(x) = xe^x - 1$ and set the interval to $[1, 4]$. Consider the partition $1 < 1.5 < 2 < 2.5 < 3 < 3.5 < 4$. Compute the integral of $f$ using the midpoint rule. Compare these results with those of Exercises 1 and 2 of Section 5.1.
Figure 5.2.1c: The trapezoid method. Figure 5.2.1d: With parallelograms.
t       0     1     2     3     4     5     6     7     8
Δ(t)    0.0   0.34  1.86  4.32  8.07  13.12 16.8  18.95 18.07
t       9     10    11    12    13    14    15    16
Δ(t)    16.69 15.25 13.86 12.58 11.4  10.33 8.95  6.46
t       17    18    19    20    21    22    23    24    25
Δ(t)    4.65  3.37  2.4   1.76  1.26  0.88  0.63  0.42  0.3
a. Plot the 26 points $(t, \Delta(t))$. Fit a cubic B-spline to the data and display both plots on the same axis. In order to ensure that the curve extends near to the domain end points, duplicate the first and last points when generating the B-spline. We denote this curve $\gamma$, and the segments of $\gamma$ as $\gamma_i(s) = (\gamma_{1i}(s), \gamma_{2i}(s))$.
b. The temperature difference is caused by the reaction. The first $t$ for which $\Delta(t) > 0$ is the starting time of the reaction, denoted $a$. To find the end time of the reaction, plot the points $(t, \log(\Delta(t)))$. You will notice that after a while the points seem to lie on a line. The value of $t$ for which
the plot begins to appear linear is denoted $b$ and is the end time for the reaction.
c. Estimate the integral $\int_a^b \Delta(t)\,dt$ using the trapezoid method.
d. Estimate the same integral using the midpoint rule as follows. Subdivide the interval into 10 sub-intervals with length $h = (b - a)/10$. On each sub-interval estimate the integral via the midpoint rule. Use the points on the B-spline to estimate values of $\Delta$. For instance, in order to get a value $\Delta(t)$, you will need to
\[
(u, v) = (v, u),\qquad (u + v, w) = (u, w) + (v, w),\qquad (u, u) > 0 \ \text{for any } u \ne 0.
\]
In particular, $(f, g) = \int_a^b fg\,dx$ defines a positive definite inner product on the space of continuous functions on an interval (see Exercise 4). This is similar to the Hermitian form which arose in the context of Neumann stability and the discrete Fourier interpolation in Section 4.3. At the time the form was defined on the complex vector space $\mathbb{C}^n$ via certain function values.
\[
f(\tilde{x}) - h(\tilde{x}) = e(\tilde{x}) = \frac{f^{(2n+2)}(\xi_{\tilde{x}})}{(2n+2)!}\, p_{n+1}(\tilde{x})^2, \tag{5.3.2}
\]
provided $f$ is $2n + 2$ times continuously differentiable.
The next step is to use the Hermite interpolation to estimate the integral of $f$.
\[
\int_a^b \frac{1}{p'_{n+1}(x_i)}\,\frac{p_{n+1}(x)^2}{(x - x_i)\,p'_{n+1}(x_i)}\,dx = \frac{1}{p'_{n+1}(x_i)}\int_a^b p_{n+1}(x)\,l_i(x)\,dx,
\]
since $l_i = p_{n+1}(x)/[p'_{n+1}(x_i)(x - x_i)]$. Hence, the condition $\gamma_i = 0$ implies that $p_{n+1}$ is orthogonal to the space of all polynomials of degree no larger than $n$. We state this formally in the following theorem.
The special case that arose in Theorem 5.3.2 is called Gaussian quadra-
ture.
\[
\int_a^b \Bigl[1 - \frac{d}{dx}h_j(x_j)(x - x_j)\Bigr]l_j^2 = \int_a^b l_j^2 - \frac{d}{dx}h_j(x_j)\int_a^b S_j = \int_a^b l_j^2.
\]
We have now proved the following corollary.
Corollary 5.3.1. The polynomial $p(x) = \prod_{i=0}^{n}(x - x_i)$ is orthogonal to $P_n$ provided the Lagrange polynomials satisfy
\[
\int_a^b l_i = \int_a^b l_i^2. \tag{5.3.7}
\]
In this case the weights are given by $\int_a^b l_i > 0$.
For instance, for the interval $[0, 1]$, $p_1(x) = x - 1/2$ and $p_2(x) = x^2 - x + 1/6$. The corresponding Gaussian quadrature points for this interval are $1/2$ for $n + 1 = 1$ and $0.5 \pm 0.5/\sqrt{3}$ for $n + 1 = 2$.
Additionally, the points and weights for the standard interval $[-1, 1]$ are listed online or in pre-computer-era numerical analysis texts [Hildebrand (1974)]. On the other hand, the Mathematica function GaussianQuadrature[n, a, b] returns the Gaussian quadrature points and weights for the $n$th order quadrature on the interval $[a, b]$. In the literature it is possible to find approximations for the Gaussian points for $n$ in the tens of thousands.
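Exercise 1 below asks for the map between intervals. For the two point rule the whole computation fits in a few lines (our Python sketch; the nodes $\pm 1/\sqrt{3}$ and unit weights on $[-1, 1]$ are the standard values):

```python
import math

def gauss2(f, a, b):
    """Two point Gaussian quadrature, mapped from [-1, 1] to [a, b].

    The affine map x = (a + b)/2 + (b - a)/2 * t carries the standard
    nodes t = -1/sqrt(3), 1/sqrt(3); the unit weights pick up the
    Jacobian factor (b - a)/2.
    """
    mid, half = (a + b) / 2, (b - a) / 2
    t = 1 / math.sqrt(3)
    return half * (f(mid - half * t) + f(mid + half * t))
```

With $n + 1 = 2$ points the rule is exact for polynomials of degree 3; for instance it reproduces $\int_1^4 (2x^3 - x)\,dx = 120$ to machine precision.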
Exercises:
1. Given a set of Gauss points for an interval $[a, b]$, determine the transformation that will map them to the corresponding Gaussian quadrature points on another interval $[\alpha, \beta]$.
3. Let $f(x) = xe^x - 1$ and set the interval to $[1, 4]$. Compute the integral of $f$ using the two point Gaussian quadrature. Compare these results with those of Exercises 1 and 2 of Section 5.1 and Exercise 1 of Section 5.2.
5. Suppose the interval is $[0, 1]$ and $p_0(x) = 1$, and use the Gram-Schmidt process to derive $p_1$ and $p_2$.
7. Prove that there are Gaussian quadrature points for every $n$. This is not the case for some of the weighted quadratures introduced in the next section.
Lemma 5.4.1. With the current notation, $\gamma_i = 0$ if and only if $p_{n+1}$ satisfies
\[
0 = \int_{-1}^{1} \frac{p_{n+1}\,q}{\sqrt{1 - x^2}}\,dx, \tag{5.4.1}
\]
for any $q \in P_n$.
Proof. With the comment before the theorem statement, the proof is routine.
If we compute the points $x_i$ we find that they are clustered at the outer edges of the interval. For instance, see Exercise 3.
Next, we turn to Laguerre quadrature. In this case we consider the half line $[0, \infty)$ and the weight $e^{-x}$.
Hence, we must solve for $k$, the segment identifier, and $t$, the specific parameter value. To determine $k$, we must find $k$ such that $\gamma_{1k}(0) \le x \le \gamma_{1k}(1)$. To locate $t$, we use Newton's method via the FindRoot function.
\[
\sum_{j=1}^{2} \frac{1}{2}\int_a^b f(x, y_j)\,dx \approx \sum_{i=1}^{2}\sum_{j=1}^{2} \frac{1}{4}\,f(x_i, y_j). \tag{5.4.5}
\]
For the final step, approximate the rectangular integral via the quadrature technique (5.4.5).
Exercises:
1. Prove Lemma 5.4.1.
for $q \in P_n$.
b. Derive the expression given for $E$.
c. Find $H_n$ and then derive the expression for $\gamma_i$.
6. Let $f(x) = xe^x - 1$ and consider the guide points $(1, f(1))$, $(1.5, f(1.5))$, $(2, f(2))$, $(2.5, f(2.5))$, $(3, f(3))$, $(3.5, f(3.5))$ and $(4, f(4))$.
a. Use a B-spline to fit this set of guide points. Plot the resulting curve.
b. Beginning with the B-spline, use the trapezoid method to approximate the integral of $f$. Compare this result with the output of 1 and 3 above.
where E is the triangle with vertices (-1, 2), (5, -1), (0, 4).
Chapter 6
Introduction
\[
u_0 = u(x_0). \tag{6.0.2}
\]
The very basic idea is that we know the initial location, $x_0$, and the initial function value $u(x_0)$. Hence, we know $f(x_0, u(x_0)) = u'(x_0)$. Now, with this information and Taylor's theorem, we can begin to approximate $u$.
There is an existence and uniqueness theorem for first order ODE. The theorem requires the Lipschitz condition. The result is due to Picard.
Theorem 6.0.1. Consider the first order ODE, $u' = f(x, u)$ with given boundary value $u_0 = u(x_0)$, where $f$ is a continuous function on its domain $D$ and $(x_0, u_0)$ is interior to $D$. Suppose that $f$ satisfies the Lipschitz condition on $D$. In particular, suppose that there is a non-negative real $K$ so that $|f(x, u) - f(x, v)| \le K|u - v|$ for all $(x, u)$ and $(x, v)$ in $D$. Then there is an interval about $x_0$ on which the ODE has a unique solution.
Proof. See [Simmons and Robertson (1991)] for a proof of the existence
step.
A second matter of note is that the case studies of this chapter are mostly population studies. We will see a simple birth/death model and a prey/predator model. In the latter case we look at competing cell types in an organism as a predator/prey model. This sort of model arises in mathematical oncology.
We begin with a first order ODE, $u' = f(x, u)$ and a boundary value $u(x_0) = u_0$. Given $\Delta x$, we seek a means to approximate $u_1 = u(x_1)$ where $x_1 = x_0 + \Delta x$. Our first technique is called forward Euler. In this case we set $u_1 = u_0 + u'(x_0)\Delta x = u_0 + f(x_0, u_0)\Delta x$. More generally, we have an iterating process,
\[
u_{n+1} = u_n + f(x_n, u_n)\Delta x, \tag{6.1.1}
\]
where $x_n = x_0 + n\Delta x$. Note that we have encountered the terminology (forward Euler) in another context. This instance is related but distinct.
Consider an example. We begin with an ODE that we know describes an exponential function. In particular, we take $u' = f(x, u(x)) = \alpha u(x)$ with $x_0 = 0$ and $u_0 = u(0) = 1$. In addition, we choose $\Delta x = 0.1$. Hence, for $\alpha = 2$,
\[
u_1 = u_0 + f(x_0, u(x_0))\Delta x = u_0 + \alpha u_0 \Delta x = 1 + 2 \cdot 1 \cdot 0.1 = 1.2.
\]
The actual solution to the equation is $u(x) = e^{\alpha x}$. Therefore, $u(x_1) = e^{2 \cdot 0.1} \approx 1.221$ and the error is about 0.021.
In turn,
\[
u_2 = u_1 + f(x_1, u(x_1))\Delta x = u_1 + \alpha u_1 \Delta x = 1.2 + 0.24 = 1.44,
\]
whereas $u(x_2) = e^{2 \cdot 0.2} \approx 1.492$. Now the error is about 0.052, more than twice the error after the first step. Continuing in this manner, each successive approximation is worse. In fact, it will grow without bound. (See Exercise 1.)
There is an alternative approach. The problem is that we are using $f(x_0, u_0)$, the slope at $x_0$, to approximate the slope at $x_1$. However, we can take advantage of the approximate value $u_1$ and compute a second estimate of the slope for $u$ at $x_1$, $\frac{1}{2}(f(x_0, u_0) + f(x_1, u_1))$. Continuing, we get a second estimate for $u$ at $x_1$. This value is called the correction for $u_1$ and is denoted $u_1^1$.
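The sequence of calculations above can be scripted directly. A Python sketch of ours (the text performs these steps in Mathematica) compares plain forward Euler with one application of the corrector on $u' = 2u$, $u(0) = 1$, $\Delta x = 0.1$:

```python
import math

def euler_step(f, x, u, dx):
    """Forward Euler (6.1.1)."""
    return u + f(x, u) * dx

def corrected_step(f, x, u, dx):
    """One corrector pass: average the slopes at x and x + dx."""
    pred = u + f(x, u) * dx
    return u + 0.5 * (f(x, u) + f(x + dx, pred)) * dx

f = lambda x, u: 2 * u
u1 = euler_step(f, 0.0, 1.0, 0.1)       # 1.2, as computed in the text
u1c = corrected_step(f, 0.0, 1.0, 0.1)  # 1.22
exact = math.exp(0.2)                   # about 1.2214
```

The single corrector pass shrinks the first-step error from about 0.021 to about 0.0014.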
t    0      1      2      3      4      5      6      7      8       9       10
B    0.007  0.0036 0.0011 0.0001 0.0004 0.0013 0.0028 0.0043 0.00056 0.00044 0.0004
$f(0.2, 99) \approx 6.19$, $N(0.3) = 100$; $f(0.3, 99) \approx 8.75$, $N(0.4) = 101$.
We continue through 11 time values and report the output in Figure 6.1.1; the population has been recorded at each time step. Next we employ the corrector. The first few calculations are summarized here. Now we use $g$ for the corrected derivative estimate.
$g(0, 100) \approx 8.58$, $N(0.1) = 99$; $g(0.1, 99) \approx 0.02$, $N(0.2) = 99$;
$g(0.2, 99) \approx 6.21$, $N(0.3) = 100$; $g(0.3, 99) \approx 8.79$, $N(0.4) = 101$.
The estimated derivatives are slightly changed, but the population figures shown in Figure 6.1.2 are rounded to the nearest integer and unchanged.
Exercises:
1. Continue the example developed above, $u' = 2u(x)$ with $x_0 = 0$, $u_0 = 1$ and $\Delta x = 0.1$. Given that $u(x) = e^{2x}$, use forward Euler to compute the estimates $u_3, ..., u_{10}$. For each case compute the corresponding error.
4. Repeat 2 for Euler with 3 correctors. How does this data change
with each iteration?
Figure 6.1.1: Output for forward Euler. Figure 6.1.2: Output for forward Euler with corrector.
5. Use least squares to fit a cubic polynomial to the given values for $N$. (See Figure 6.1.3.) Now repeat Exercise 2 using the cubic to represent $N$. Note that the cubic allows us to infer values of $N$ between the given ones. This will be useful in the next section.
Exercises:
1. Continue the basic example from Section 6.1 by comparing the midpoint estimates for $u_3, ..., u_{10}$ against the actual data and the forward Euler estimates.
Figure 6.2.1: The sigmoidal curve, $U = 0.1e^{2t}/(0.5 + 0.2(e^{2t} - 1))$.
Many of us have had the experience of being on the athletic field and receiving a thrown, kicked or batted ball. In order to receive the ball we must estimate the initial lift or angle. From experience, we know that the trajectory is a parabolic arc, and we can see the initial location. When we know the initial angle we can resolve the trajectory. We also know that it takes some experience to do this reliably and quickly. We see below that we were executing the shooting method.
This now can be expressed as $u'(x) = f(x, v, u)$, $v'(x) = g(x, v, u)$, where
Before continuing with the solution process, we consider the data. First, if $T$ were in the normal range, say 27°C, then $T = 300$ Kelvin. If, for instance, $u(0) = 35°\mathrm{C} = 308\,\mathrm{K}$, then $v'(0) = 899{,}176{,}496$. Now, think about what will happen if we were to develop this pair of equations as an initial value problem.
\[
u_{n+1} = u_n + \frac{\Delta x}{2}\bigl[f(u_n) + f(u_{n+1})\bigr] = u_n + \frac{\Delta x}{2}\bigl[v_n + v_{n+1}\bigr],
\]
where $v_n$ is the computed approximation of $u'(x_n)$.
Setting $\Delta x = 0.01$ and testing the result for negative values, we see quickly that estimates of $-0.9$ and $-1.0$ bracket the desired result. The corresponding values for $u'(0) = v(0)$ are 4.0476 and 3.9975. Employing the shooting method (6.3.3), we next test at $-0.995$. This yields a result that is correct to 4 decimal places, 4.00001. Figures 6.3.1 and 6.3.2 show values for $u$ and $u'$ for the test value $-0.995$.
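The logic of the shooting method is independent of the particular equation. The sketch below is a Python illustration of ours on a hypothetical boundary value problem, $u'' = 6x$, $u(0) = 0$, $u(1) = 2$, whose exact solution is $u = x^3 + x$ with $u'(0) = 1$. It brackets the unknown initial slope and bisects, mirroring the bracketing strategy described above:

```python
def shoot(slope, h=0.001):
    """Integrate u'' = 6x from x = 0 with u(0) = 0, u'(0) = slope.

    Forward Euler on the first order system u' = v, v' = 6x; returns u(1).
    """
    x, u, v = 0.0, 0.0, slope
    for _ in range(round(1 / h)):
        u, v = u + v * h, v + 6 * x * h
        x += h
    return u

# Bisect on the initial slope until the far boundary value u(1) = 2 is hit.
lo, hi = -5.0, 5.0
for _ in range(60):
    mid = (lo + hi) / 2
    if shoot(mid) < 2.0:
        lo = mid
    else:
        hi = mid
slope = (lo + hi) / 2
```

The recovered slope agrees with the exact $u'(0) = 1$ up to the Euler discretization error.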
Exercises:
1. Execute the example from the text.
2. Repeat the example from the section using forward Euler to estimate $u$ and $v$.
There are two ways to look at the method of lines. We begin with the one that highlights the role of lines. To provide context, we recall the 1-D heat equation, $u_t = u_{xx}$.
The method of lines proceeds as follows. From the setup, the time-space domain of the problem is a rectangle $D = [0, T] \times [0, 10.0]$. We first designate $\Delta t$ and $\Delta x$ and thereby determine a lattice of points $(t_n, x_i)$ in the domain. Exactly as with FDM, where we wrote $u_i^n$ for $u(t_n, x_i)$, we now write $u_i$ for $u(t, x_i)$. Using a second central difference and a single forward difference on the right hand side of (9.5.6) we have
\[
\frac{\partial u_i}{\partial t} = \frac{1}{\Delta x^2}\bigl(u(t, x_{i+1}) - 2u(t, x_i) + u(t, x_{i-1})\bigr) = \frac{1}{\Delta x^2}\bigl(u_{i+1}(t) - 2u_i(t) + u_{i-1}(t)\bigr). \tag{6.4.1}
\]
Hence, we have arrived at an ODE where the left hand side is the derivative of $u_i$ and the right hand side is a function of $u_i$, $u_{i-1}$ and $u_{i+1}$. From here we may employ any of our ODE techniques. Should we choose to use forward Euler we have
\[
u_i^{n+1} = u_i^n + \frac{\Delta t}{\Delta x^2}\bigl(u_{i+1}^n - 2u_i^n + u_{i-1}^n\bigr).
\]
This we immediately recognize as FTCS FDM.
Alternatively, consider the estimator/corrector. In this case
\[
u_i^{n+1} = u_i^n + \frac{\Delta t}{2\Delta x^2}\bigl[(u_{i+1}^n - 2u_i^n + u_{i-1}^n) + (u_{i+1}^{n+1} - 2u_i^{n+1} + u_{i-1}^{n+1})\bigr].
\]
We recognize this as Crank-Nicolson.
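The equivalence with FTCS is easy to verify numerically. In the Python sketch below (ours), the semi-discrete right hand side (6.4.1) is one function and the ODE stepper another; composing them with forward Euler reproduces the FTCS update:

```python
def heat_rhs(u, dx):
    """Semi-discrete right hand side (6.4.1); boundary entries held fixed."""
    inner = [(u[i + 1] - 2 * u[i] + u[i - 1]) / dx ** 2
             for i in range(1, len(u) - 1)]
    return [0.0] + inner + [0.0]

def mol_euler_step(u, dt, dx):
    """Forward Euler applied to the method-of-lines system."""
    r = heat_rhs(u, dx)
    return [u[i] + dt * r[i] for i in range(len(u))]

def ftcs_step(u, lam):
    """The FTCS update (4.2.2) with fixed boundaries, for comparison."""
    inner = [lam * u[i + 1] + (1 - 2 * lam) * u[i] + lam * u[i - 1]
             for i in range(1, len(u) - 1)]
    return [u[0]] + inner + [u[-1]]

dx, dt = 0.1, 0.002
u0 = [0.0, 3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 0.0]
a = mol_euler_step(u0, dt, dx)
b = ftcs_step(u0, lam=dt / dx ** 2)
```

The two updates agree entry by entry, which is the content of the remark above: the ODE stepper chosen for the method-of-lines system determines which FDM scheme results.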
Exercises:
1. Develop the 1-D heat equation with the method of lines with the midpoint method on the ODE side.
Bibliography
Index
parametric curve
B-spline basis functions, 67
B-spline fit, 68
B-spline guide points, 68