
UNIT 1 REVIEW OF CALCULUS

Structure
1.1 Introduction
Objectives
1.2 Three Fundamental Theorems
Intermediate Value Theorem
Rolle's Theorem
Lagrange's Mean Value Theorem
1.3 Taylor's Theorem
1.4 Errors
Round-off Error
Truncation Error
1.5 Summary
1.6 Solutions/Answers

1.1 INTRODUCTION

The study of numerical analysis involves concepts from various branches of mathematics
including calculus. In this unit, we shall briefly review certain important theorems in
calculus which are essential for the development and understanding of numerical methods.
You are already familiar with some fundamental theorems about continuous functions from
your calculus course. Here we shall review three theorems given in that course, namely,
Intermediate value theorem, Rolle's theorem and Lagrange's mean value theorem. Then we
state another important theorem in calculus due to B. Taylor and illustrate the theorem
through various examples.

Most of the numerical methods give answers that are approximations to the desired
solutions. In this situation, it is important to measure the accuracy of the approximate
solution compared to the actual solution. To find the accuracy we must have an idea of the
possible errors that can arise in computational procedures. In this unit we shall introduce you
to different forms of errors which are common in numerical computations.

The basic ideas and results that we have illustrated in this unit will be used often throughout
this course. So we suggest you go through this unit very carefully.

Objectives
After studying this unit you should be able to :
apply
i) Intermediate value theorem
ii) Rolle's theorem
iii) Lagrange's mean value theorem
iv) Taylor's theorem;
define the term 'error' in approximation;
distinguish between round-off error and truncation error and calculate these errors
as the situation demands.

1.2 THREE FUNDAMENTAL THEOREMS

In this section we shall discuss three fundamental theorems, namely, intermediate value
theorem, Rolle's theorem and Lagrange's mean value theorem. All these theorems give
properties of continuous functions defined on a closed interval [a, b]. We shall not prove
them here, but we shall illustrate their utility with various examples. Let us take up these
theorems one by one.
1.2.1 Intermediate Value Theorem
The intermediate value theorem says that a function that is continuous on a closed interval
[a, b] takes on every intermediate value, i.e., every value lying between f(a) and f(b), if
f(a) ≠ f(b).

Formally we can state the theorem as follows :

Theorem 1: Let f be a continuous function defined on a closed interval [a, b]. Let c be a number lying
between f(a) and f(b) (i.e. f(a) < c < f(b) if f(a) < f(b), or f(b) < c < f(a) if f(b) < f(a)). Then
there exists at least one point x0 ∈ [a, b] such that f(x0) = c.

The following figure (Fig. 1) may help you to visualise the theorem more easily. It gives the
graph of a function f.

Fig. 1

In this figure f(a) < f(b). The condition f(a) < c < f(b) implies that the points (a, f(a)) and
(b, f(b)) lie on opposite sides of the line y = c. This, together with the fact that f is
continuous, implies that the graph crosses the line y = c at some point. In Fig. 1 you see
that the graph crosses the line y = c at (x0, c).

The importance of this theorem is as follows : If we have a continuous function f defined on


a closed interval [a, b], then the theorem guarantees the existence of a solution of the
equation f(x) = c, where c is as in Theorem 1. However, it does not say what the solution is.
We shall illustrate this point with an example.

Example 1: Find the value of x in 0 ≤ x ≤ π/2 for which sin(x) = 1/2.

Solution: You know that the function f(x) = sin x is continuous on [0, π/2]. Since f(0) = 0 and
f(π/2) = 1, we have f(0) < 1/2 < f(π/2). Thus f satisfies all the conditions of Theorem 1.

Therefore, there exists at least one value of x, say x0, such that f(x0) = 1/2, that is, the theorem
guarantees that there exists a point x0 such that sin(x0) = 1/2. Let us try to find this point from
the graph of sin x in [0, π/2] (see Fig. 2).

Fig. 2

From the figure, you can see that the line y = 1/2 cuts the graph at the point (π/6, 1/2). Hence
there exists a point x0 = π/6 in [0, π/2] such that sin(x0) = 1/2.

Let us consider another example.

Example 2: Show that the equation 2x³ + x² - x + 1 = 5 has a solution in the interval [1, 2].

Solution: Let f(x) = 2x³ + x² - x + 1. Since f is a polynomial in x, f is continuous in [1, 2].

Also f(1) = 3, f(2) = 19 and 5 lies between f(1) and f(2). Thus f satisfies all conditions of
Theorem 1. Therefore, there exists a number x0 between 1 and 2 such that f(x0) = 5. That is,
the equation 2x³ + x² - x + 1 = 5 has a solution in the interval [1, 2].

Thus we saw that the theorem enables us to establish the existence of solutions of
certain equations of the type f(x) = 0 without actually solving them. In other words, if you
want to find an interval in which a solution (or root) of f(x) = 0 exists, then find two
numbers a, b such that f(a) f(b) < 0. Theorem 1 then states that the solution lies in ]a, b[. We
shall need some other numerical methods for finding the actual solution. We shall study the
problem of finding solutions of the equation f(x) = 0 more elaborately in Unit 2.

Why don't you try an exercise now.

El) Show that the following equations have a solution in the interval given alongside.

Let us now discuss another important theorem in calculus.

1.2.2 Rolle's Theorem


In this section we shall review Rolle's theorem. The theorem is named after the
seventeenth century French mathematician Michel Rolle (1652-1719).

Theorem 2 (Rolle's Theorem): Let f be a continuous function defined on [a, b] and
differentiable on ]a, b[. If f(a) = f(b), then there exists a number x0 in ]a, b[ such that
f'(x0) = 0.
Geometrically, we can interpret the theorem easily. You know that since f is continuous, the
graph of f is a smooth curve (see Fig. 3).

Fig. 3

You have already seen in your calculus course that the derivative f'(x0) at some point x0
gives the slope of the tangent at (x0, f(x0)) to the curve y = f(x). Therefore the theorem states
that if the end values f(a) and f(b) are equal, then there exists a point x0 in ]a, b[ such that
the slope of the tangent at the point P(x0, f(x0)) is zero, that is, the tangent is parallel to the
x-axis at that point (see Fig. 3). In fact we can have more than one point at which f'(x) = 0, as
shown in Fig. 3. This shows that the number x0 in Theorem 2 may not be unique.
The following example gives an application of Rolle's theorem.

Example 3: Use Rolle's theorem to show that there is a solution of the equation
cot x = x in ]0, π/2[.

Solution: Here we have to solve the equation cot x - x = 0. We rewrite cot x - x as
(cos x - x sin x)/sin x. Solving the equation (cos x - x sin x)/sin x = 0 in ]0, π/2[ is the same as
solving the equation cos x - x sin x = 0. Now we shall see whether we can find a function f which
satisfies the conditions of Rolle's theorem and for which f'(x) = cos x - x sin x. Our
experience in differentiation suggests that we try f(x) = x cos x. This function f is
continuous on [0, π/2], differentiable in ]0, π/2[, and its derivative is f'(x) = cos x - x sin x. Also
f(0) = 0 = f(π/2), so f satisfies all the requirements of Rolle's theorem. Hence, there exists
a point x0 in ]0, π/2[ such that f'(x0) = cos x0 - x0 sin x0 = 0. This shows that a solution to the
equation cot x - x = 0 exists in ]0, π/2[.

You can try the following exercise on the same lines as Example 3.

E2) Using Rolle's theorem show that there is a solution to the equation tan x - 1 + x = 0
in ]0, 1[.

Now, let us look at Fig. 3 carefully. We see that the line joining (a, f(a)) and (b, f(b)) is
parallel to the tangent at (x0, f(x0)). Does this property hold when f(a) ≠ f(b) also? In other
words, does there exist a point x0 in ]a, b[ such that the tangent at (x0, f(x0)) is parallel to the
line joining (a, f(a)) and (b, f(b))? The answer to this question is the content of the well-known
theorem, "Lagrange's mean value theorem", which we discuss next.

1.2.3 Lagrange's Mean Value Theorem


This theorem was first proved by the French mathematician Count Joseph Louis Lagrange
(1736-1813).

Theorem 3: Let f be a continuous function defined on [a, b] and differentiable in ]a, b[.
Then there exists a number x0 in ]a, b[ such that

f'(x0) = [f(b) - f(a)] / (b - a).    ...(1)

Geometrically we can interpret this theorem as given in Fig. 4.

Fig. 4

In this figure you can see that the straight line connecting the end points (a, f(a)) and
(b, f(b)) of the graph is parallel to some tangent to the curve at an intermediate point.

You may be wondering why this theorem is called a 'mean value theorem'. This is because of
the following physical interpretation.

Suppose f(t) denotes the position of an object at time t. Then the average (mean) velocity
during the interval [a, b] is given by

[f(b) - f(a)] / (b - a).

Now Theorem 3 states that this mean velocity during the interval [a, b] is equal to the
velocity f'(x0) at some instant x0 in ]a, b[.

We shall illustrate the theorem with an example.

Example 4: Apply the mean value theorem to the function f(x) = √x in [0, 2] (see Fig. 5).

Fig. 5: Graph of f(x) = √x.

Solution: We first note that the function f(x) = √x is continuous on [0, 2] and differentiable
in ]0, 2[, and f'(x) = 1/(2√x).

Therefore by Theorem 3, there exists a point x0 in ]0, 2[ such that

f(2) - f(0) = f'(x0) (2 - 0).

Now f(2) = √2, f(0) = 0 and f'(x0) = 1/(2√x0).

Therefore we have

√2 = 2/(2√x0) = 1/√x0, i.e. x0 = 1/2.

Thus we get that the line joining the end points (0, 0) and (2, √2) of the graph of f is parallel
to the tangent to the curve at the point (1/2, 1/√2).

We shall consider one more example.


Example 5: Consider the function f(x) = (x - 1)(x - 2)(x - 3) in [0, 4]. Find a point x0 in
]0, 4[ such that

f'(x0) = [f(4) - f(0)] / (4 - 0).

Solution: We rewrite the function f(x) as

f(x) = (x - 1)(x - 2)(x - 3) = x³ - 6x² + 11x - 6.

We know that f(x) is continuous on [0, 4], since f is a polynomial in x. Also the derivative

f'(x) = 3x² - 12x + 11

exists in ]0, 4[. Thus f satisfies all conditions of the mean value theorem. Therefore, there
exists a point x0 in ]0, 4[ such that

3x0² - 12x0 + 11 = [f(4) - f(0)] / 4 = [6 - (-6)] / 4 = 3.

This is a quadratic equation in x0, namely 3x0² - 12x0 + 8 = 0. The roots of this equation are

(6 + 2√3)/3 and (6 - 2√3)/3.

Taking √3 ≈ 1.732, we see that there are two values for x0 lying in the interval ]0, 4[.

The above example shows that the number xo in Theorem 3 may not be unique. Again, as
we mentioned in the case of Theorems 1 and 2, the mean value theorem guarantees the
existence of a point only.

Why don't you try some exercises on your own?

E3) Let f(x) = (1/3) x³ + 2x. Find a number x0 in ]0, 3[ such that
f'(x0) = [f(3) - f(0)] / (3 - 0).

E4) Find all numbers x0 in the interval ]-2, 1[ for which the tangent to the graph of
f(x) = x³ + 4 is parallel to the line joining the end points (-2, f(-2)) and (1, f(1)).
E5) Show that Rolle's theorem is a special case of mean value theorem.

So far we have used the mean value theorem to show the existence of a point satisfying
Eqn. (1). Next we shall consider an example which shows another application of the mean
value theorem.

Example 6: Find an approximate value of ∛26 using the mean value theorem.

Solution: Consider the function f(x) = x^(1/3). Then f(26) = ∛26. The number nearest to 26
for which the cube root is known is 27, i.e. f(27) = ∛27 = 3. Now we shall apply the mean
value theorem to the function f(x) = x^(1/3) in the interval [26, 27]. The function f is
continuous in [26, 27] and the derivative is

f'(x) = 1/(3 x^(2/3)).

Therefore, there exists a point x0 between 26 and 27 such that

f(27) - f(26) = f'(x0) (27 - 26), i.e. ∛26 = 3 - 1/(3 x0^(2/3)).    ...(2)

(The symbol ≈ means 'approximately equal to'.)

Since x0 is close to 27, we approximate 1/(3 x0^(2/3)) by 1/(3 (27)^(2/3)) = 1/27 ≈ 0.037.
Substituting this value in Eqn. (2) we get

∛26 ≈ 3 - 0.037 = 2.963.

Note that in writing the value of ∛26 we have rounded off the number after three decimal
places. Using the calculator we find that the exact value of ∛26 is 2.9624961.

We have given this example just to illustrate the usefulness of the theorem. The mean value
theorem has got many other applications which you will come across in later units.
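As a quick check of Example 6, here is a short Python sketch (our own illustration, not part of the printed unit) that forms the mean value estimate f(27) - f'(27)(27 - 26) and compares it with the directly computed cube root.

```python
# A minimal numerical check of Example 6 (not part of the original unit).
# We approximate 26**(1/3) by f(27) - f'(27)*(27 - 26), where f(x) = x**(1/3).

def f(x):
    return x ** (1.0 / 3.0)

def f_prime(x):
    # derivative of x**(1/3) is 1/(3*x**(2/3))
    return 1.0 / (3.0 * x ** (2.0 / 3.0))

approx = f(27) - f_prime(27) * (27 - 26)   # mean value theorem estimate
exact = f(26)                              # "calculator" value

print(approx)                  # 2.962962...
print(exact)                   # 2.962496...
print(abs(approx - exact))     # about 0.0005
```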

Now we shall discuss another theorem in calculus.

1.3 TAYLOR'S THEOREM

You are already familiar with the name of the English mathematician Brook Taylor
(1685-1731) from your calculus course. In this section we shall introduce you to a
well-known theorem due to B. Taylor. Here we shall state the theorem without proof and
discuss some of its applications.

You are familiar with polynomials of the form f(x) = a0 + a1 x + ... + an x^n, where
a0, a1, ..., an are real numbers. We can easily compute the value of a polynomial at any
point x = a by using the four basic operations of addition, multiplication, subtraction and
division. On the other hand there are functions like e^x, cos x, ln x etc. which occur
frequently in all branches of mathematics and which cannot be evaluated in the same manner.
For example, evaluating the function f(x) = cos x at 0.524 is not so simple. Now, to evaluate
such functions we try to approximate them by polynomials, which are easier to evaluate.
Taylor's theorem gives us a simple method for approximating functions f(x) by polynomials.

Let f(x) be a real-valued function defined on R which is n-times differentiable (see MTE-01,
Calculus, Unit 6, Block 2). Consider the function

P1(x) = f(x0) + (x - x0) f'(x0),

where x0 is any given real number.

Now P1(x) is a polynomial in x of degree 1, and P1(x0) = f(x0) and P1'(x0) = f'(x0). The
polynomial P1(x) is called the first Taylor polynomial of f(x) at x0. Now consider another
function

P2(x) = f(x0) + (x - x0) f'(x0) + [(x - x0)²/2!] f''(x0).

Then P2(x) is a polynomial in x of degree 2, and P2(x0) = f(x0), P2'(x0) = f'(x0) and
P2''(x0) = f''(x0). P2(x) is called the second Taylor polynomial of f(x) at x0.

Similarly we can define the rth Taylor polynomial of f(x) at x0, where 1 ≤ r ≤ n. The rth
Taylor polynomial at x0 is given by

Pr(x) = f(x0) + (x - x0) f'(x0) + [(x - x0)²/2!] f''(x0) + ... + [(x - x0)^r/r!] f^(r)(x0).    ...(3)

You can check that Pr(x0) = f(x0), Pr'(x0) = f'(x0), ..., Pr^(r)(x0) = f^(r)(x0) (see E6).


Let us consider an example.

Example 7: Find the fourth Taylor polynomial of f(x) = ln x about x0 = 1.

Solution: The fourth Taylor polynomial of f(x) is given by

P4(x) = f(1) + (x - 1) f'(1) + [(x - 1)²/2!] f''(1) + [(x - 1)³/3!] f^(3)(1) + [(x - 1)⁴/4!] f^(4)(1).

Now, f(1) = ln 1 = 0 and

f'(x) = 1/x,          f'(1) = 1
f''(x) = -1/x²,       f''(1) = -1
f^(3)(x) = 2/x³,      f^(3)(1) = 2
f^(4)(x) = -6/x⁴,     f^(4)(1) = -6.

Therefore, P4(x) = (x - 1) - (x - 1)²/2 + (x - 1)³/3 - (x - 1)⁴/4.
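To get a feel for how well P4(x) approximates ln x near x0 = 1, the following Python sketch (our own illustration, not part of the printed unit) evaluates the polynomial from Example 7 at a few points and compares it with math.log.

```python
# Illustrative check of Example 7 (not part of the original unit).
import math

def p4_ln(x):
    # Fourth Taylor polynomial of ln x about x0 = 1
    t = x - 1.0
    return t - t**2 / 2 + t**3 / 3 - t**4 / 4

for x in (0.8, 1.0, 1.2, 1.5):
    print(x, p4_ln(x), math.log(x), abs(p4_ln(x) - math.log(x)))
```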

Now, you can try some exercises.

E6) If Pr denotes the rth Taylor polynomial as given by Eqn. (3), then show that
Pr(x0) = f(x0), Pr'(x0) = f'(x0), ..., Pr^(r)(x0) = f^(r)(x0).

E7) Obtain the third Taylor polynomial of f(x) = e^x about x = 0.

We are now ready to state Taylor's theorem.

Theorem 4 (Taylor's Theorem): Let f be a real valued function having (n + 1) continuous
derivatives on ]a, b[ for some n ≥ 0. Let x0 be any point in the interval ]a, b[. Then for any
x ∈ ]a, b[, we have

f(x) = f(x0) + (x - x0) f'(x0) + [(x - x0)²/2!] f''(x0) + ... + [(x - x0)^n/n!] f^(n)(x0)
       + [(x - x0)^(n+1)/(n+1)!] f^(n+1)(c),    ...(4)

where c is a point between x0 and x.

The series given in Eqn. (4) is called the nth Taylor's expansion of f(x) at x0.

We rewrite Eqn. (4) in the form

f(x) = Pn(x) + R_(n+1)(x),

where Pn(x) is the nth Taylor polynomial of f(x) about x0 and

R_(n+1)(x) = [(x - x0)^(n+1)/(n+1)!] f^(n+1)(c).

R_(n+1)(x) depends on x, x0 and n. R_(n+1)(x) is called the remainder (or error) of the nth
Taylor's expansion after n + 1 terms.

Suppose we put x0 = a and x = a + h, where h > 0, in Eqn. (4). Then any point between a and
a + h will be of the form a + θh, 0 < θ < 1.

Therefore, Eqn. (4) can be written as

f(a + h) = f(a) + h f'(a) + (h²/2!) f''(a) + ... + (h^n/n!) f^(n)(a) + [h^(n+1)/(n+1)!] f^(n+1)(a + θh).    ...(5)

Let us now make some remarks about Taylor's theorem.

Remark 1: Suppose that the function f(x) in Theorem 4 is a polynomial of degree m. Then
f^(r)(x) = 0 for all r > m. Therefore R_(n+1)(x) = 0 for all n ≥ m. Thus, in this case, the mth
Taylor expansion of f(x) about x0 will be

f(x) = f(x0) + (x - x0) f'(x0) + [(x - x0)²/2!] f''(x0) + ... + [(x - x0)^m/m!] f^(m)(x0).

Note that the right hand side of the above equation is simply a polynomial in (x - x0).

Therefore, finding the Taylor's expansion of a polynomial function f(x) about x0 is the same as
expressing f(x) as a polynomial in (x - x0) with coefficients from R.

Remark 2: Suppose we put x0 = a, x = b and n = 0 in Eqn. (4). Then Eqn. (4) becomes

f(b) = f(a) + (b - a) f'(c),

or equivalently

f'(c) = [f(b) - f(a)] / (b - a),

which is Lagrange's mean value theorem. Therefore we can consider the mean value
theorem as a special case of Taylor's theorem.

Let us consider some examples.

Example 8: Expand f(x) = x⁴ - 5x³ + 5x² + x + 2 in powers of (x - 2).

Solution: The function f(x) is a polynomial in x of degree 4. Hence, derivatives of all
orders exist and are continuous. Therefore by Taylor's theorem, the 4th Taylor expansion of
f(x) about 2 is given by

f(x) = f(2) + (x - 2) f'(2) + [(x - 2)²/2!] f''(2) + [(x - 2)³/3!] f^(3)(2) + [(x - 2)⁴/4!] f^(4)(2).

Here f(2) = 0 and

f'(x) = 4x³ - 15x² + 10x + 1,    f'(2) = -7
f''(x) = 12x² - 30x + 10,        f''(2) = -2
f^(3)(x) = 24x - 30,             f^(3)(2) = 18
f^(4)(x) = 24,                   f^(4)(2) = 24.

Hence the expansion is

f(x) = -7(x - 2) - (x - 2)² + 3(x - 2)³ + (x - 2)⁴.


Example 9: Find the nth Taylor expansion of ln(1 + x) about x = 0 for x ∈ ]-1, 1[.

Solution: We first note that the point x = 0 lies in the given interval. Further, the function
f(x) = ln(1 + x) has continuous derivatives of all orders in ]-1, 1[. The derivatives are given by

f'(x) = 1/(1 + x),                                f'(0) = 1
f''(x) = -1/(1 + x)²,                             f''(0) = -1
...
f^(n)(x) = (-1)^(n-1) (n - 1)!/(1 + x)^n,         f^(n)(0) = (-1)^(n-1) (n - 1)!.

Therefore by applying Taylor's theorem we get that for any x ∈ ]-1, 1[

ln(1 + x) = x - x²/2 + x³/3 - ... + (-1)^(n-1) x^n/n + R_(n+1)(x),

where

R_(n+1)(x) = (-1)^n x^(n+1) / [(n + 1)(1 + c)^(n+1)]

and c is a point lying between 0 and x.

Now, let us consider the behaviour of the remainder in a small interval, say, [0, 0.5]. Then
for x in [0, 0.5], we have

| R_(n+1)(x) | = | x |^(n+1) / [(n + 1)(1 + c)^(n+1)],

where 0 < c < x.

Since | x | < 1, | x |^(n+1) < 1 for any positive integer n.

Also, since c > 0, 1/(1 + c)^(n+1) < 1. Therefore we have

| R_(n+1)(x) | ≤ 1/(n + 1).

Now 1/(n + 1) can be made as small as we like by choosing n sufficiently large, i.e.
lim (n → ∞) 1/(n + 1) = 0. This shows that lim (n → ∞) | R_(n+1)(x) | = 0.

The above example shows that if n is sufficiently large, the value of the nth Taylor
polynomial Pn(x0) at any x0 will be approximately equal to the value of the given function
f(x0). In fact, the remainder R_(n+1)(x) tells us how close the value Pn(x0) is to f(x0).
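The behaviour described in Example 9 is easy to observe numerically. The sketch below (our own illustration, not part of the printed unit) prints the actual error |ln(1 + x) - Pn(x)| at x = 0.5 together with the bound 1/(n + 1) derived above.

```python
# Illustration of Example 9 (not part of the original unit):
# the nth Taylor polynomial of ln(1+x) about 0 approaches ln(1+x) on [0, 0.5].
import math

def p_n(x, n):
    # P_n(x) = x - x**2/2 + x**3/3 - ... + (-1)**(n-1) * x**n / n
    return sum((-1) ** (k - 1) * x ** k / k for k in range(1, n + 1))

x = 0.5
for n in (1, 2, 5, 10, 20):
    err = abs(math.log1p(x) - p_n(x, n))
    print(n, err, 1.0 / (n + 1))   # actual error vs the bound 1/(n+1)
```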

Now we shall make some general observations about the remainder R_(n+1)(x) in the Taylor's
expansion of a function f(x).

Remark 3: Consider the nth Taylor expansion of f about x0 given by

f(x) = Pn(x) + R_(n+1)(x).

Then R_(n+1)(x) = f(x) - Pn(x). If lim (n → ∞) R_(n+1)(x) = 0 for some x, then for that x we say that we
can approximate f(x) by Pn(x), and we write f(x) as the infinite series

f(x) = f(x0) + f'(x0)(x - x0) + [f''(x0)/2!] (x - x0)² + ... + [f^(n)(x0)/n!] (x - x0)^n + ...    ...(6)

You are already familiar with series of this type from your calculus course. This series is
called the Taylor's series of f(x). If we put x0 = 0 in Eqn. (6) then the series

f(x) = f(0) + f'(0) x + [f''(0)/2!] x² + ... + [f^(n)(0)/n!] x^n + ...

is called Maclaurin's series.

Remark 4: If the remainder R_(n+1)(x) satisfies the condition that | R_(n+1)(x) | < M for some
n at some fixed point x = a, then M is called a bound of the error at x = a.
In this case we have

| R_(n+1)(x) | = | f(x) - Pn(x) | < M.

That is, f(x) lies in the interval ]Pn(x) - M, Pn(x) + M[. The smaller the bound M, the better the
approximation.

We shall explain these concepts with an example.
Example 10: Find the 2nd Taylor's expansion of f(x) = √(1 + x) in ]-1, 1[ about x = 0. Find
a bound for the error at x = 0.2.

Solution: Since f(x) = √(1 + x) = (1 + x)^(1/2), we have

f'(x) = (1/2)(1 + x)^(-1/2),       f'(0) = 1/2
f''(x) = -(1/4)(1 + x)^(-3/2),     f''(0) = -1/4
f^(3)(x) = (3/8)(1 + x)^(-5/2).

Applying Taylor's theorem to f(x), we get

√(1 + x) = 1 + x/2 - x²/8 + R3(x),

where c is a point lying between 0 and x and the error is given by

R3(x) = (x³/16)(1 + c)^(-5/2).

When x = 0.2, we have

R3(0.2) = [(0.2)³/16](1 + c)^(-5/2),

where 0 < c < 0.2. Since c > 0 we have (1 + c)^(-5/2) < 1.

Hence,

| R3(0.2) | ≤ (0.2)³/16 = (0.5) × 10⁻³.

Hence the bound of the error for n = 2 at x = 0.2 is (0.5) × 10⁻³.


Why don't you try some exercises now?

E8) Obtain the nth Taylor expansion of the function f(x) = 1/(1 + x) in ]-1, 1[ about x = 0.

E9) Does f(x) = √x have a Taylor series expansion about x = 0? Justify your answer.

E10) Obtain the 8th Taylor expansion of the function f(x) = cos x in [-π/4, π/4] about x = 0.
Obtain a bound for the error R9(x).
There are some functions whose Taylor's expansions are used very often. We shall list their
expansions here.

e^x = 1 + x/1! + x²/2! + ... + x^n/n! + [x^(n+1)/(n+1)!] e^c    ...(7)

sin x = x - x³/3! + x⁵/5! - ... + (-1)^(n-1) x^(2n-1)/(2n-1)! + [(-1)^n x^(2n+1)/(2n+1)!] cos c    ...(8)

cos x = 1 - x²/2! + x⁴/4! - ... + (-1)^n x^(2n)/(2n)! + [(-1)^(n+1) x^(2n+2)/(2n+2)!] cos c    ...(9)

where c, in each expansion, is as given in Taylor's theorem.

Now, let us consider some examples that illustrate the use of truncated Taylor series in finding
approximate values of some functions at certain points.

Example 11: Using Taylor's expansion for sin x about x = 0, find the approximate value of
sin 10° with error less than 10⁻⁷.

Solution: The nth Taylor expansion for sin x given in Eqn. (8) is

sin x = x - x³/3! + x⁵/5! - ... + (-1)^(n-1) x^(2n-1)/(2n-1)! + [(-1)^n x^(2n+1)/(2n+1)!] cos c,    ...(11)

where x is the angle measured in radians.

Now, in radian measure, we have

10° = π/18 radians.

Therefore, by putting x = π/18 in Eqn. (11) we get

sin(π/18) = π/18 - (π/18)³/3! + (π/18)⁵/5! - ... + (-1)^(n-1) (π/18)^(2n-1)/(2n-1)! + R_(n+1)(π/18),

where R_(n+1)(π/18) = [(-1)^n/(2n+1)!] (π/18)^(2n+1) cos c is the remainder.

If we approximate sin(π/18) by the first n terms, then the error introduced will be less than 10⁻⁷ if

| R_(n+1)(π/18) | = | [(-1)^n/(2n+1)!] (π/18)^(2n+1) cos c | < 10⁻⁷.

Maximizing cos c (i.e. taking | cos c | ≤ 1), we require that

(π/18)^(2n+1)/(2n+1)! < 10⁻⁷.    ...(12)

Using the calculator, we find that the value of the left hand side of (12) for various n is

n                  1             2             3
Left hand side     .89 × 10⁻³    .13 × 10⁻⁵    .99 × 10⁻⁹

From the table we find that the inequality in (12) is satisfied for n = 3. Hence the required
approximation is

sin(π/18) ≈ π/18 - (1/3!)(π/18)³ + (1/5!)(π/18)⁵ = 0.1736482,

with error less than 10⁻⁷.
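The choice of n made in Example 11 can be automated: keep adding terms of the sine series until the bound (π/18)^(2n+1)/(2n+1)! drops below the required tolerance. The following Python sketch (our own illustration, not part of the printed unit) does exactly this.

```python
# Illustration of Example 11 (not part of the original unit):
# sum the Taylor series of sin x at x = pi/18 until the remainder bound < 1e-7.
import math

x = math.pi / 18          # 10 degrees in radians
tol = 1e-7

total, n = 0.0, 0
while True:
    n += 1
    total += (-1) ** (n - 1) * x ** (2 * n - 1) / math.factorial(2 * n - 1)
    bound = x ** (2 * n + 1) / math.factorial(2 * n + 1)   # |remainder| <= bound
    if bound < tol:
        break

print(n, total, math.sin(x))   # n = 3, 0.17364817..., 0.17364817...
```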


Let us now find the approximate value of e using Taylor's theorem.

Example 12: Using Maclaurin's series for e^x, show that e ≈ 2.71806 with error less than
0.001. (Assume that e < 3.)

Solution: The Maclaurin's series for e^x is

e^x = 1 + x + x²/2! + ... + x^n/n! + R_(n+1)(x),  where R_(n+1)(x) = [x^(n+1)/(n+1)!] e^c.

Putting x = 1 in the above series, we get

e = 1 + 1 + 1/2! + ... + 1/n! + R_(n+1)(1) = Pn(1) + R_(n+1)(1).

Now we have to find n for which

| e - Pn(1) | = | R_(n+1)(1) | < 0.001.

Since we have chosen x0 = 0 and x = 1, the value c lies between 0 and 1, i.e. 0 < c < 1. Since
e^c < e < 3, we get

| R_(n+1)(1) | = e^c/(n + 1)! < 3/(n + 1)!.

The bound for R_(n+1)(1) for different n is given in the following table.

n                       1     2    3      4      5      6
Bound for R_(n+1)(1)    1.5   .5   .125   .025   .004   .0006

From this table, we see that

R_(n+1)(1) < .001 if n = 6.

Thus P6(1) is the desired approximation to e, i.e.

e ≈ P6(1) = 1 + 1 + 1/2! + 1/3! + 1/4! + 1/5! + 1/6! = 2.71806.
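The same remainder-driven stopping rule gives the approximation to e obtained in Example 12. A minimal Python sketch (ours, not the unit's):

```python
# Illustration of Example 12 (not part of the original unit):
# e = sum of 1/k! with the bound 3/(n+1)! on the remainder.
import math

tol = 0.001
n = 0
while 3.0 / math.factorial(n + 1) >= tol:   # bound uses e < 3
    n += 1

approx = sum(1.0 / math.factorial(k) for k in range(n + 1))
print(n, approx, math.e)   # n = 6, 2.71805..., 2.71828...
```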

See if you can do the following exercises.

E11) Using Maclaurin's expansion for cos x, find the approximate value of cos(π/4) with
error bound 10⁻⁵.

E12) How large should n be chosen in Maclaurin's expansion for e^x to have
| e^x - Pn(x) | ≤ 10⁻⁵ for | x | ≤ 1?
In numerical analysis we are concerned with developing a sequence of calculations that will
give a satisfactory answer to a problem. Since this process involves a lot of computations,
there is a chance for the presence of some errors in these computations. In the next section
we shall introduce you to the concept of 'errors' that arise in numerical computations.

1.4 ERRORS

In this section we shall discuss the concept of an 'error'. We consider two types of errors
that are commonly encountered in numerical computations.

You are already familiar, from your school arithmetic, with rounding off a number which has a
non-terminating decimal expansion. For example, we use a terminating decimal such as 3.1429
for 22/7. These rounded off numbers are approximations of the actual values. In any computational
procedure we make use of these approximate values instead of the true values. Let xT denote the true value
and xA denote the approximate value. How do we measure the goodness of an
approximation xA to xT? The simplest measure which naturally comes to our mind is the
difference between xT and xA. This measure is called the 'error'. Formally, we define error
as a quantity which satisfies the identity

True value xT = Approximate value xA + error.

Now if the 'error' in an approximation is considerably small (according to some criterion), then
we say that 'xA is a good approximation to xT'.

Let us consider an example.

Example 13: The true value of π is 3.14159265... In some mensuration problems the
value 22/7 is commonly used as an approximation to π. What is the error in this
approximation?

Solution: The true value of π is

π = 3.14159265...    ...(13)

Now, we convert 22/7 to decimal form, so that we can find the difference between the
approximate value and the true value. Then the approximate value of π is

22/7 = 3.14285714...    ...(14)

error = True value - approximate value = 3.14159265... - 3.14285714... = -0.00126449...    ...(15)

Note that in this case the error is negative. Error can be positive or negative. We shall in
general be interested in the absolute value of the error, which is defined as

| error | = | True value - approximate value |.

For example, the absolute error in Example 13 is

| error | = | -0.00126449... | = 0.00126...

Sometimes, when the true value is very large or very small, we prefer to study the error by
comparing it with the true value. This is known as the relative error, and we define this error as

Relative error = (True value - approximate value)/True value.

In the case of Example 13,

relative error = -0.00126449.../3.14159265... ≈ -0.0004.

But note that in certain computations the true value may not be available. In that case we
replace the true value by the computed approximate value in the definition of relative error.

In numerical calculations you will encounter mainly two types of errors: round-off error and
truncation error. We shall discuss these errors in the next two subsections 1.4.1 and 1.4.2
respectively.
1.4.1 Round-off Error

Let us look at Example 13 again. You can see that the numbers appearing in Eqns. (13),
(14) and (15) consist of 8 digits after the decimal point followed by dots. The line of dots
indicates that the digits continue and we are not able to write all of them. That is, these
numbers cannot be represented exactly by a terminating decimal expansion. Whenever we
use such numbers in calculations we have to decide how many digits we are going to take
into account. For example, consider again the approximate value of π. If we approximate π
using 2 digits after the decimal point (say), chopping off the other digits, then we have

π ≈ 3.14.

The error in this approximation is

error = 0.00159265...    ...(16)

If we use 3 digits after the decimal point, then using chopping we have

π ≈ 3.141.

In this case the error is given by

error = 0.00059265...    ...(17)

Now suppose we consider the approximate value rounded off to three decimal places. You
already know how to round off a number which has a non-terminating decimal expansion. Then
the value of π rounded off to 3 digits is 3.142. The error in this case is

error = -0.00040734...

which is smaller, in absolute value, than 0.00059265... given in Eqn. (17). Therefore, in
general, whenever we want to use only a certain number of digits after the decimal point,
it is always better to use the value rounded off to that many digits, because in this case
the error is usually smaller. The error involved in a process where we use the rounding off
method is called round-off error.

We now discuss the concept of floating point arithmetic.

In scientific computations a real number x is usually represented in the form

x = ± (.d1 d2 ... dn) × 10^m,

where d1, d2, ..., dn are digits between 0 and 9 and m is an integer called the
exponent. Writing a number in this form is known as floating point representation. We
denote this representation by fl(x). Such a floating point number is said to be normalized if
d1 ≠ 0. To translate a number into floating point representation we adopt either of two
methods: rounding and chopping. For example, suppose we want to represent the number
537 in the normalized floating point representation with n = 1; then we get

fl(537) = .5 × 10³ chopped
        = .5 × 10³ rounded.

In this case we get the same representation by rounding and by chopping. Now if we
take n = 2, then we get

fl(537) = .53 × 10³ chopped
        = .54 × 10³ rounded.

In this case, the representations are different.

Now if we take n = 3, then we get

fl(537) = .537 × 10³.

The number n in the floating point representation is called the precision.

The difference between the true value of a number x and fl(x) is called round-off
error. From the earlier discussion it is clear that the round-off error decreases when the precision
increases.
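The chopping and rounding of 537 described above can be imitated in code. The helper below is our own illustration (not part of the printed unit); real computers work with binary rather than decimal floating point, so this is only a model of the idea.

```python
# Illustrative decimal floating-point representation (not part of the original unit).
# Real computers use binary floating point; this mimics the decimal examples above.
import math

def fl(x, n, mode="round"):
    """Return x as 0.d1...dn * 10**m with d1 != 0 (x > 0 assumed)."""
    m = math.floor(math.log10(x)) + 1      # exponent so that 0.1 <= x/10**m < 1
    mantissa = x / 10 ** m
    scaled = mantissa * 10 ** n
    digits = math.floor(scaled) if mode == "chop" else round(scaled)
    return digits * 10 ** (m - n)

print(fl(537, 1, "chop"), fl(537, 1, "round"))   # 500 500
print(fl(537, 2, "chop"), fl(537, 2, "round"))   # 530 540
print(fl(537, 3, "chop"), fl(537, 3, "round"))   # 537 537
```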
Mathematically we define these concepts as follows:

Definition 2: Let x be a real number and let x* be a terminating decimal approximation to x.
Then we say that x* represents x rounded to k decimal places if

| x - x* | ≤ (1/2) × 10^(-k),

where k is a positive integer.

The next definition gives us a measure by which we can conclude whether or not the round-off
error occurring in an approximation process is negligible.

Definition 3: Let x be a real number and x* be an approximation to x. Then we say that x*
is accurate to k decimal places if

| x - x* | < (1/2) × 10^(-k).    ...(18)
Let us consider an example.

Example 14: Find out to how many decimal places the value 22/7 obtained in Example
13 is accurate as an approximation to π = 3.14159265...

Solution: We have already seen in Example 13 that

| π - 22/7 | = 0.00126449...

Now .0005 < .00126... < .005.

Therefore the inequality (18) is satisfied for k = 2 (but not for k = 3).

Hence, by Definition 3, we conclude that the approximation is accurate to 2 decimal places.

Here is an exercise for you.

E13) In some approximation problems where graphic methods are used, the value 355/113 is
used as an approximation to π = 3.14159265... To how many decimal places is the
value 355/113 accurate as an approximation to π?

Now we make an important remark.

Remark 5: Round-off errors can create serious difficulties in lengthy computations.
Suppose we have a problem which involves a long calculation. In the course of these
computations many rounding errors (some positive, and some negative) may occur in a
number of ways. At the end of the calculations these errors will get accumulated and we
don't know the magnitude of this error. Theoretically it can be large. But, in reality, some of
these errors (between positive and negative errors) may get cancelled, so that the
accumulated error will be much smaller.

Let us now define another type of error called 'truncation error'.

1.4.2 Truncation Error

We shall first illustrate this error with a simple example. In Sec. 1.3 we have already
discussed how to find an approximate value of a certain function f(x), for a given value of x,
using Taylor's series expansion. Let

f(x) = Σ (n = 0 to ∞) an (x - x0)^n

denote the Taylor's series of f(x) about x0. In practical situations, we cannot, in general, find
the sum of an infinite number of terms. So we must stop after a finite number of terms, say,
N. This means that we are taking

Σ (n = 0 to N) an (x - x0)^n

and ignoring the rest of the terms, that is, Σ (n = N + 1 to ∞) an (x - x0)^n.

There is an error involved in this truncating process which arises from the terms which we
exclude. This error is called the 'truncation error'. We denote this error by TE. Thus we have

TE = Σ (n = 0 to ∞) an (x - x0)^n - Σ (n = 0 to N) an (x - x0)^n = Σ (n = N + 1 to ∞) an (x - x0)^n.

You already know how to calculate this error from Sec. 1.3. There we saw that, using
Taylor's theorem, we can estimate the error (or remainder) involved in a truncation process
in some cases.

Let's see what happens if we apply Taylor's theorem to the function f(x) about the point
x0 = 0. We assume that f satisfies all conditions of Taylor's theorem. Then we have

f(x) = Σ (n = 0 to N) an x^n + R_(N+1)(x),    ...(19)

where an = f^(n)(0)/n!, R_(N+1)(x) = [x^(N+1)/(N+1)!] f^(N+1)(c) and 0 < c < x.

Now, suppose that we want to approximate f(x) by Σ (n = 0 to N) an x^n.

Then Eqn. (19) tells us that the truncation error in this approximation is given by

TE = R_(N+1)(x) = [x^(N+1)/(N+1)!] f^(N+1)(c).    ...(20)

Theoretically we can use this formula for the truncation error for any sufficiently differentiable
function. But in practice it is not easy to calculate the nth derivative of many functions.
Because of the complexity in differentiating such functions, it is better to obtain their Taylor
polynomials indirectly by using one of the standard expansions we have listed in Sec. 1.3.

For example, consider the function f(x) = e^(x²). It is difficult to calculate the nth derivative of
this function. Therefore, for convenience, we obtain the Taylor's expansion of e^(x²) using the
Taylor's expansion of e^y by putting y = x². We shall illustrate this in the following example.

Example 15: Calculate a bound for the truncation error in approximating e^(x²) by

1 + x² + x⁴/2! + x⁶/3! + x⁸/4!,  for | x | ≤ 1.

Solution: Put u = x². Then e^(x²) = e^u. Now we apply Taylor's theorem to the function
f(u) = e^u about u = 0. Then we have

e^u = 1 + u + u²/2! + u³/3! + u⁴/4! + R5(u),  where R5(u) = (u⁵/5!) e^c

and 0 < c < u. Since | x | ≤ 1, u = x² ≤ 1, i.e. c < 1. Therefore e^c < e < 3. Thus

| R5(u) | ≤ 3/5! = 0.025.

Hence the truncation error in approximating e^(x²) by the above expression is less than
25 × 10⁻³.
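The bound obtained in Example 15 can be checked directly. The sketch below (our own illustration, not part of the printed unit) evaluates the truncated series on a grid in [-1, 1] and records the largest actual error, which indeed stays below 0.025.

```python
# Numerical check of Example 15 (not part of the original unit).
import math

def p(x):
    u = x * x
    # 1 + u + u**2/2! + u**3/3! + u**4/4!  (truncated series for e**(x*x))
    return sum(u ** k / math.factorial(k) for k in range(5))

worst = max(abs(math.exp(x * x) - p(x))
            for x in [i / 100.0 for i in range(-100, 101)])
print(worst)   # about 0.0099, comfortably below the bound 0.025
```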

If the absolute value of the TE is small, then we say that the approximation is good.

Now, in practical situations we should be able to find out the value of N for which the
summation Σ an x^n gives a good approximation to f(x). For this we always specify the
accuracy (or error bound) required in advance. Then we find N using formula (20) such that
the absolute error | R_(N+1)(x) | is less than the specified accuracy. This gives the
approximation within the prescribed accuracy.


Let us consider an example.

Example 16: Find an approximate value of the integral

∫ (0 to 1) e^(x²) dx

with an error less than 0.025.

Solution: In Example 15 we observed that

e^(x²) ≈ 1 + x² + x⁴/2! + x⁶/3! + x⁸/4!,  with TE = (x¹⁰/5!) e^c.

Now we use this approximation to calculate the integral. We have

∫ (0 to 1) e^(x²) dx ≈ ∫ (0 to 1) [1 + x² + x⁴/2! + x⁶/3! + x⁸/4!] dx    ...(21)

with the truncation error

TE = ∫ (0 to 1) (x¹⁰/5!) e^c dx.

We have

| TE | ≤ ∫ (0 to 1) (3 x¹⁰/5!) dx = 3/(11 × 5!) ≈ 0.0023 < 0.025.

Integrating the right hand side of (21), we get

∫ (0 to 1) e^(x²) dx ≈ 1 + 1/3 + 1/10 + (1/7)(1/3!) + (1/9)(1/4!) ≈ 1.4618.
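Example 16 can also be verified numerically. The following sketch (our own illustration, not part of the printed unit) integrates the truncated polynomial term by term and compares the result with a simple trapezoidal estimate of the integral.

```python
# Numerical check of Example 16 (not part of the original unit).
import math

# Term-by-term integral of 1 + x^2 + x^4/2! + x^6/3! + x^8/4! over [0, 1]
poly_integral = sum(1.0 / ((2 * k + 1) * math.factorial(k)) for k in range(5))

# Crude trapezoidal estimate of the true integral for comparison
n = 1000
h = 1.0 / n
trap = h * (0.5 * (1 + math.exp(1)) +
            sum(math.exp((i * h) ** 2) for i in range(1, n)))

print(poly_integral, trap, abs(poly_integral - trap))   # ~1.4618 vs ~1.4627
```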

Here is an important remark.

Remark 6: The magnitude of the truncation error can be reduced to within any prescribed
accuracy by retaining a sufficiently large number of terms. Likewise, the magnitude of the
round-off error can be reduced by retaining additional digits.
You can now try the following exercises.

E14) a) Calculate a bound for the truncation error in approximating f(x) = sin x by

x - x³/3! + x⁵/5! - x⁷/7!,  where -1 ≤ x ≤ 1.

b) Using the approximation in (a), calculate an approximate value of the integral

∫ (0 to 1) (sin x / x) dx

with an error less than 10⁻⁴.

E15) a) Calculate the truncation error in approximating e^(-x²) by

1 - x² + x⁴/2,  -1 ≤ x ≤ 1.

b) Using the approximation in (a), calculate an approximate value of

∫ (0 to 0.1) e^(-x²) dx.

We end this unit by summarising what we have learnt in it.

1.5 SUMMARY

In this unit we have:

recalled three important theorems in calculus, namely
i) Intermediate value theorem
ii) Rolle's theorem
iii) Lagrange's mean value theorem;

stated Taylor's theorem and demonstrated it with the help of examples.
The nth Taylor's expansion is

f(x) = f(x0) + [(x - x0)/1!] f'(x0) + [(x - x0)²/2!] f''(x0) + ... + [(x - x0)^n/n!] f^(n)(x0) + [(x - x0)^(n+1)/(n+1)!] f^(n+1)(c);

defined the term 'error' occurring in numerical computations;

discussed two types of errors, namely
i) Round-off error: the error occurring in computations where we use the rounding off
method to represent a number is called round-off error;
ii) Truncation error: the error occurring in computations where we use a truncation
process to represent the sum of an infinite number of terms;

explained how Taylor's theorem is used to calculate the truncation error.

1.6 SOLUTIONS/ANSWERS

E1) a) The given equation is of the form f(x) = c where f(x) = x³ - x - 5 and c = 0.
f is a continuous function in the interval [0, 2] and f(0) = -5 and f(2) = 1. Thus 0
lies between f(0) and f(2). Therefore, by the IV theorem, the equation f(x) = 0 has a
solution in the interval [0, 2].

b) Here the equation is of the form f(x) = c where f(x) = sin x + x and c = 1.
f is a continuous function defined on [0, π/6], f(0) = 0 and f(π/6) = sin(π/6) + π/6
= 0.5 + 0.523 = 1.023. Thus f(0) < 1 < f(π/6). Therefore, by the IV theorem, the
result follows.
E2) Let f(x) = (x - 1) sin x = x sin x - sin x.
Then f'(x) = x cos x + sin x - cos x = (x - 1) cos x + sin x.
Now f'(x) = 0 implies that (x - 1) cos x + sin x = 0, that is, (x - 1) + tan x = 0.
The function f(x) = (x - 1) sin x is continuous in [0, 1], differentiable in ]0, 1[, and
f(0) = 0 = f(1). Therefore by Rolle's theorem there exists a point x0 in ]0, 1[ such that
f'(x0) = 0, i.e. tan x0 - 1 + x0 = 0.
E3) Note that f is a continuous function in [0, 3], differentiable in ]0, 3[, and
f'(x) = x² + 2.
Therefore by Lagrange's mean value theorem there exists a point x0 in ]0, 3[ such that
f'(x0) = [f(3) - f(0)]/(3 - 0).
But f'(x0) = x0² + 2, f(3) = 15 and f(0) = 0. Thus x0² + 2 = 5, i.e. x0² = 3, so
x0 = √3; since x0 is a point in ]0, 3[, we consider only the positive value.

E4) f(x) = x³ + 4 satisfies the requirements of Lagrange's mean value theorem in the
interval [-2, 1]. Therefore there exists a point x0 in ]-2, 1[ such that the slope f'(x0)
of the tangent line at x0 is the same as the slope of the line joining (-2, f(-2)) and
(1, f(1)).

But f'(x0) = 3x0².

Therefore we get

3x0² = [f(1) - f(-2)]/[1 - (-2)] = (5 + 4)/3 = 3, i.e. x0² = 1.

Since x0 must lie in ]-2, 1[, we don't consider the positive value. Therefore there exists
only one point, x0 = -1, satisfying the theorem.

E5) Suppose f is a function defined on [a, b] which satisfies all the requirements of
Lagrange's mean value theorem. Then there exists a point x0 in ]a, b[ such that
f'(x0) = [f(b) - f(a)]/(b - a).
Suppose, in particular, f satisfies the condition that f(a) = f(b), i.e. f(b) - f(a) = 0. Then
we get f'(x0) = 0. This is what Rolle's theorem states. Hence we deduce that, in
the statement of Lagrange's mean value theorem, if we add the extra condition that
f(a) = f(b), then we get Rolle's theorem.
E6) Put x = x0 in Eqn. (3); then we get

Pr(x0) = f(x0).

To calculate Pr'(x0), we differentiate both sides of Eqn. (3). Then we have

Pr'(x) = f'(x0) + [2(x - x0)/2!] f''(x0) + [3(x - x0)²/3!] f^(3)(x0) + ...    (*)

Putting x = x0 on both sides of the above expression, we get

Pr'(x0) = f'(x0) + 0 + 0 + ... = f'(x0).

Note that apart from the first term, all other terms on the R.H.S. contain the factor
(x - x0) and therefore when we put x = x0, these terms vanish.

By differentiating (*) further, we get Pr^(i)(x0) = f^(i)(x0), i = 2, ..., r.

E7) The 3rd Taylor polynomial of f(x) = e^x about x = 0 is

P3(x) = f(0) + x f'(0) + (x²/2!) f''(0) + (x³/3!) f^(3)(0).

Here f(0) = e⁰ = 1,
f'(x) = e^x, f'(0) = 1.
Similarly f''(0) = 1 = f^(3)(0).

Therefore P3(x) = 1 + x + x²/2! + x³/3!.

E8) For f(x) = 1/(1 + x) we have f(0) = 1 and

f'(x) = -1/(1 + x)²,                          f'(0) = -1
f''(x) = 2/(1 + x)³,                          f''(0) = 2
...
f^(n)(x) = (-1)^n n!/(1 + x)^(n+1),           f^(n)(0) = (-1)^n n!.

The function f(x) and its derivatives of all orders are continuous in ]-1, 1[.
Therefore by Taylor's theorem

1/(1 + x) = 1 - x + x² - ... + (-1)^n x^n + R_(n+1)(x),

where R_(n+1)(x) = (-1)^(n+1) x^(n+1)/(1 + c)^(n+2) and c is a point lying between 0 and x.

E9) No, because the derivatives of f(x) = √x are not defined at x = 0.
E10) The 8th Taylor expansion of f(x) = cos x about x0 = 0 is

cos x = 1 - x²/2! + x⁴/4! - x⁶/6! + x⁸/8! + R9(x).

The remainder is given by

R9(x) = -(x⁹/9!) sin c.

Now since x lies in [-π/4, π/4] we have | x | ≤ π/4 < 1.

Therefore, we get | R9(x) | ≤ 1/9! < 10⁻⁵.

E11) We have seen in E10 that the remainder in the 8th Taylor expansion of cos x is such
that

| R9(x) | ≤ 10⁻⁵, i.e. | cos x - P8(x) | ≤ 10⁻⁵.

This shows that we can approximate f(x) by the 8th Taylor polynomial with error bound
10⁻⁵, i.e.,

cos x ≈ 1 - x²/2! + x⁴/4! - x⁶/6! + x⁸/8!.

Putting x = π/4, we get

cos(π/4) ≈ 0.7071068.

E12) | e^x - Pn(x) | = | [x^(n+1)/(n+1)!] e^c |,

where c lies between 0 and x. Since | x | ≤ 1, we get | e^c | ≤ e. Therefore

| e^x - Pn(x) | ≤ e/(n + 1)!.

Now, we have to find an integer n such that e/(n + 1)! ≤ 10⁻⁵.

This is satisfied if n = 8, because e/9! ≈ 0.749 × 10⁻⁵ ≤ 10⁻⁵.

Therefore n = 8 is the required number.

E13) 355/113 = 3.14159292... (using a scientific calculator) and π = 3.14159265...

Then | 355/113 - π | < 0.00000027 < (1/2) × 10⁻⁶.

Therefore the approximation is accurate to 6 decimal places.
E14) a) We apply Taylor's theorem to the function f(x) = sin x in ]-1, 1[ about x = 0.
Then for n = 7 we have

sin x = x - x³/3! + x⁵/5! - x⁷/7! + R8(x),

where R8(x) = (x⁸/8!) sin c, so that | R8(x) | ≤ 1/8! = 0.24802 × 10⁻⁴.

Therefore the truncation error T.E. = R8(x) is at most 0.24802 × 10⁻⁴.

b) From (a) we have

sin x / x = 1 - x²/3! + x⁴/5! - x⁶/7! + R8(x)/x,

where R8(x)/x = (x⁷/8!) sin c, so that | R8(x)/x | ≤ 1/8!.

Now

∫ (0 to 1) [R8(x)/x] dx = ∫ (0 to 1) (x⁷/8!) sin c dx,

so the error in the integral is at most ∫ (0 to 1) (x⁷/8!) dx < 1/8! ≈ 0.25 × 10⁻⁴.

Therefore we have

∫ (0 to 1) (sin x / x) dx ≈ [x - x³/(3! × 3) + x⁵/(5! × 5) - x⁷/(7! × 7)] from 0 to 1
= 1 - 1/(3! × 3) + 1/(5! × 5) - 1/(7! × 7)
= 0.946,

with an error less than 0.25 × 10⁻⁴.

E15) a) Put u = -x². Then e^(-x²) = e^u. We consider the 2nd Taylor expansion of e^u given by

e^u = 1 + u + u²/2 + R3(u),

where R3(u) = (u³/3!) e^c.

Since u ≤ 0, e^c ≤ 1, and | u | ≤ 1. Hence | R3(u) | ≤ 1/3! = 1/6.

b) From (a) we have e^(-x²) = 1 - x² + x⁴/2 + R3(-x²).

Hence

∫ (0 to 0.1) e^(-x²) dx = ∫ (0 to 0.1) (1 - x² + x⁴/2) dx + ∫ (0 to 0.1) R3(-x²) dx.

Now

| ∫ (0 to 0.1) R3(-x²) dx | ≤ ∫ (0 to 0.1) (x⁶/3!) dx = (0.1)⁷/42 ≈ 2.4 × 10⁻⁹,

which is negligible. Therefore

∫ (0 to 0.1) e^(-x²) dx ≈ ∫ (0 to 0.1) (1 - x² + x⁴/2) dx = [x - x³/3 + x⁵/10] from 0 to 0.1
= 0.1 - 0.000333... + 0.000001 = 0.0996677.
UNIT 2 ITERATION METHODS FOR
LOCATING A ROOT
Structure
2.1 Introduction
Objectives
2.2 Initial Approximation to a Root
Tabulation Method
Graphical Method
2.3 Bisection Method
2.4 Fixed Point Iteration Method
2.5 Summary

2.1 INTRODUCTION

We often come across equations of the form x⁴ + 3x² + 2x + 1 = 0, or e^x = x - 2, or
tanh x = x, etc. Finding one or more values of x which satisfy these equations is one of the
important problems in Mathematics. From your elementary algebra course (MTE-04), you
are already familiar with some methods of solving equations of degrees 1, 2, 3 and 4.
Equations of degrees 1, 2, 3 and 4 are called linear, quadratic, cubic and biquadratic
respectively. There you might have realised that it is very difficult to use the methods
available for solving cubic and biquadratic equations. In fact no formula exists for solving
equations of degree n ≥ 5. In these cases we take recourse to approximate methods for the
determination of the solution of equations of the form

f(x) = 0.    ...(1)

The problem of finding approximate values of roots of polynomial equations of higher
degree was initiated by Chinese mathematicians. Methods of solution in various forms
appeared in the 13th century work of Ch'in Chiu-Shao. The first noteworthy work in this
direction in Europe was done by the Italian mathematician Fibonacci. Later, Vieta (around the
year 1600) and, subsequently, Isaac Newton made significant contributions to the theory.

In this unit, as well as in the next two units, we shall discuss some numerical methods which
give an approximate solution of an equation f(x) = 0. We can classify the methods of
solution into two types, namely (i) direct methods and (ii) iteration methods. Direct
methods produce solutions in a finite number of steps whereas iteration methods give an
approximate solution by repeated application of a numerical process. As we said earlier,
you have already studied direct methods in MTE-04. You will find later that for using iteration
methods we have to start with an approximate solution. Iteration methods improve this
approximate solution. We shall begin this unit by first discussing methods which enable us
to determine an initial approximate solution, and then discuss iteration methods to refine this
approximate solution.

Objectives
After studying this unit you should be able to:

find an initial approximation of the root using (1) the tabulation method, (2) the graphical method;
use the bisection method for finding approximate roots;
use the fixed point iteration method for finding approximate roots.
2.2 INITIAL APPROXIMATION TO A ROOT

You know that in many problems of engineering and the physical sciences you come across
equations in one variable of the form f(x) = 0.

For example, in Physics, the pressure-volume-temperature relationship of real gases can
be described by an equation of the form (2), where P, V, T are pressure, volume and
temperature respectively and R, r, s are constants. Eqn. (2) can be rewritten as a biquadratic
equation in V, Eqn. (3). Therefore the problem of finding the specific volume of a gas at a given
temperature and pressure reduces to solving the biquadratic equation (3) for the unknown
variable V.

Consider another example, from the life sciences: the study of the genetic problem of
recombination of chromosomes can be described in the form of a quadratic equation

p² - p + k = 0,

where p stands for the recombination fraction, with the limitation 0 ≤ p ≤ 1/2, and (1 - p) stands
for the non-recombination fraction. The problem of finding the recombination fraction of a
gene thus reduces to the problem of finding the roots of the quadratic equation p² - p + k = 0.

In these problems we are concerned with finding the value (or values) of the unknown variable
x that satisfies the equation f(x) = 0. The function f(x) may be a polynomial of the form

f(x) = a0 + a1 x + ... + an x^n,

or it may be a combination of polynomials, trigonometric, exponential or logarithmic
functions. By a root of this equation we mean a number x0 such that f(x0) = 0. The root is
also called a zero of f(x).

If f(x) is linear, then Eqn. (1) is of the form ax + b = 0, a ≠ 0, and it has only one root, given
by x = -b/a. Any equation which is not linear is called a non-linear equation. In this unit we
shall discuss some methods for finding roots of the equation f(x) = 0 where f(x) is a non-linear
function. You are already familiar with various methods for calculating roots of
quadratic, cubic and biquadratic equations (see MTE-04, Unit 3). But there is no such
formula for solving polynomial equations of degree more than 4, or even for a simple
non-polynomial equation such as e^x = x - 2.

Here we shall discuss some of the numerical approximation methods. These methods
involve two steps:

Step 1: Find an initial approximation of a root.

Step 2: Improve this approximation to get a more accurate value.

We first consider Step 1. Finding an initial approximation to a root means locating (or
estimating) a root of an equation approximately. There are two ways of achieving
this: the tabulation method and the graphical method.

Let us start with the tabulation method.

2.2.1 Tabulation Method

This method is based on the intermediate value theorem (IV Theorem) (see Theorem 1,
Unit 1). Let us try to understand the various steps involved in the method through an
example.
Suppose we want to find a root of the equation

2x - log10 x = 7.

We first compute values of f(x) = 2x - log10 x - 7 for different values of x, say x = 1, 2, 3
and 4.

When x = 1, we have f(1) = 2 - log10 1 - 7 = -5.

Similarly, we have

f(2) = 4 - log10 2 - 7 = -3.301

(Note that log10 2 is computed using a scientific calculator.)

f(3) = 6 - log10 3 - 7 = -1.477
f(4) = 8 - log10 4 - 7 = 0.398

These values are given in the following table:

Table 1
x       1        2        3        4
f(x)   -5       -3.301   -1.477    0.398

We find that f(3) is negative and f(4) is positive. Now we apply the IV Theorem to the function
f(x) = 2x - log10 x - 7 in the interval I1 = [3, 4]. Since f(3) and f(4) are of opposite signs, by the
IV theorem there exists a number x0 lying between 3 and 4 such that f(x0) = 0. That is, a root
of the function lies in the interval ]3, 4[. Note that this root is positive.

Let us now repeat the above computations for some values of x lying in ]3, 4[, say x = 3.5,
3.7 and 3.8. In the following table we report the values of f(x).

Table 2
x       3.5      3.7      3.8
f(x)   -0.544   -0.168    0.020

We find that f(3.7) and f(3.8) are of opposite signs. By applying the IV theorem again to f(x) in
the interval I2 = [3.7, 3.8], we find that the root of f(x) lies in the interval ]3.7, 3.8[. Note
that this interval is smaller than the previous interval. We call this interval a refinement of
the previous interval. Let us repeat the above procedure once again for the interval I2. In
Table 3 we give the values of f(x) for some x between 3.7 and 3.8.

Table 3
x       3.75      3.78      3.79
f(x)   -0.0740   -0.0175    0.0014

Table 3 shows that the root lies within the interval ]3.78, 3.79[, and this interval is much
smaller compared to the original interval ]3, 4[. The procedure is terminated by taking any
value of x between 3.78 and 3.79 as an approximate value of the root of the equation
f(x) = 2x - log10 x - 7 = 0.

The method illustrated above is known as the tabulation method. Let us write down the steps
involved in the method. (We will talk about the choice of the test values x1, x2, ..., xn later.)
A short code sketch of these steps is given below.

Step 1: Select some numbers x1, x2, ..., xn and calculate f(x1), f(x2), ..., f(xn). If f(xi) = 0
for some i, then xi is a root of the equation. If none of the f(xi) is zero, then proceed to Step 2.

Step 2: Find values xi and xi+1 such that f(xi) and f(xi+1) are of opposite signs, i.e.
f(xi) f(xi+1) < 0. Rename xi = a1 and xi+1 = b1. Then by the IV Theorem a root lies
between a1 and b1. Test all the values f(xj), j = 1, 2, ..., n, and determine other intervals,
if any, in which some more roots may lie.

Step 3: Repeat Step 1 by taking some numbers between a1 and b1. Again, if f(xj) = 0 for
some xj between a1 and b1, then we have found the root xj. Otherwise, continue with Step 2.

Continue the steps 1, 2, 3 till we get a sufficiently small interval ]a, b[ in which the root lies.
Then any value in ]a, b[ can be chosen as an initial approximation to the root. You
may have noticed that the test values xj, j = 1, 2, ..., n, chosen depend on the nature
of the function f(x).

We can always gather some information regarding the root either from the physical problem
in which the equation f(x) = 0 occurs, or it is specified in the problem. For example, we may
ask for the smallest positive root or a root closest to a given number, etc.
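The steps above translate almost directly into code. The following Python sketch (our own illustration, not part of the printed unit) tabulates f over a given set of points and reports every consecutive pair at which f changes sign, shown here for the function f(x) = 2x - log10 x - 7 used above.

```python
# A minimal sketch of the tabulation method (not part of the original unit).
import math

def f(x):
    return 2 * x - math.log10(x) - 7

def sign_change_intervals(f, points):
    """Return the pairs (a, b) of consecutive tabulation points with f(a)*f(b) < 0."""
    values = [f(x) for x in points]
    return [(points[i], points[i + 1])
            for i in range(len(points) - 1)
            if values[i] * values[i + 1] < 0]

print(sign_change_intervals(f, [1, 2, 3, 4]))          # [(3, 4)]
print(sign_change_intervals(f, [3.5, 3.6, 3.7, 3.8]))  # [(3.7, 3.8)]
```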

For a better understanding of the method let us consider one more example.

Example 1: Find an approximate value of the real root of the equation 2x - 3 sin x - 5 = 0.

Solution: Let f(x) = 2x - 3 sin x - 5.

Since f(-x) = -2x + 3 sin x - 5 < 0 for x > 0, the function f(x) is negative for all negative
real numbers x. Therefore the function has no negative real root. Hence the roots of this
equation must lie in [0, ∞[. Now, following Step 1, we compute values of f(x) for x = 0, 1, 2,
3, 4, ...

We have

f(1) = 2 - 3 sin 1 - 5 = -5.5244,

using the calculator. Note that x is in radians. The values f(0), f(1), f(2) and f(3) are given in
Table 4.

Table 4
x       0      1        2        3
f(x)   -5     -5.5244  -3.7279   0.5766

Now we follow Step 2. From the table we find that f(2) and f(3) are of opposite signs.
Therefore a root lies between 2 and 3. Now, to get a more refined interval, we evaluate f(x)
for some values between 2 and 3. The values are given in Table 5.

Table 5
x       2.5      2.8      2.9
f(x)   -1.7954  -0.4050   0.0823

This table of values shows that f(2.8) and f(2.9) are of opposite signs and hence the root lies
between 2.8 and 2.9. We repeat the process once again for the interval [2.8, 2.9] by taking
some values as given in Table 6.

Table 6
x       2.85     2.88     2.89
f(x)   -0.1624  -0.0159   0.0332

From Table 6 we find that the root lies between 2.88 and 2.89. This interval is small,
therefore we take any value between 2.88 and 2.89 as an initial approximation of the root.
Since f(2.88) is nearer to zero than f(2.89), we can take any number near to 2.88 as an initial
approximation to the root.
Why don't you try some exercises now?

E1) Find an initial approximation to a root of the equation 3x - ... = 0 using the
tabulation method.

E2) Find an initial approximation to a positive root of the equation 2x - tan x = 0 using the
tabulation method.
You might have realised that the tabulation method is a lengthy process for finding an initial
approximation of a root. However, since only a rough approximation to the root is required,
we normally use only one application of the tabulation method. In the next sub-section we
shall discuss the graphical method.

2.2.2 Graphical Method

In this method, we draw the approximate graph of y = f(x). The points where the curve cuts
the x-axis are taken as the required approximate values of the roots of the equation f(x) = 0.

Let us consider an example.

Example 2: Find an approximate value of a root of the biquadratic equation

x⁴ + 4x³ + 4x² - 2 = 0

using the graphical method.

Solution: We first sketch the fourth degree polynomial f(x) = x⁴ + 4x³ + 4x² - 2. The
graph is given in Fig. 1.

Fig. 1: Graph of f(x) = x⁴ + 4x³ + 4x² - 2.

The figure shows that the graph cuts the x-axis at two points, -2.55 and 0.55, approximately.
Hence -2.55 and 0.55 are taken as the approximate roots of the equation
x⁴ + 4x³ + 4x² - 2 = 0.

Now go back for a moment to Unit 1 and see Example 1 in Sec. 1.2. There we applied the
graphical method to find the roots of the equation sin x = 1/2.
Let us consider another example.

Example 3: Find the approximate value of a root of

x² - e^x = 0

using the graphical method.

Solution: The first thing to do is to draw the graph of the function f(x) = x² - e^x. It is not
easy to graph this function directly. But if we split the function as

f(x) = f1(x) - f2(x),

where f1(x) = x² and f2(x) = e^x, then we can easily draw the graphs of the functions
f1(x) and f2(x). The graphs are given in Fig. 2.

Fig. 2: Graphs of f1(x) = x² and f2(x) = e^x.

The figure shows that the two curves y = x² and y = e^x intersect at some point P. From the
figure, we find that the x-coordinate of the approximate point of intersection of the two curves
is -0.7. Thus we have f1(-0.7) ≈ f2(-0.7), and therefore f(-0.7) = f1(-0.7) - f2(-0.7) ≈ 0.
Hence -0.7 is an approximate value of the root of the equation f(x) = 0.

From the above example we observe the following: Suppose we want to apply the graphical
method for finding an approximate root of f(x) = 0. Then we may try to simplify the method
by splitting the equation as

f(x) = f1(x) - f2(x) = 0,    ...(4)

where the graphs of f1(x) and f2(x) are easy to draw. From Eqn. (4), we have f1(x) = f2(x).
The x-coordinate of the point at which the two curves y1 = f1(x) and y2 = f2(x) intersect gives
an approximate value of the root of the equation f(x) = 0. Note that we are interested only in
the x-coordinate; we don't have to worry about the y-coordinate of the point of intersection.

Often we can split the function f(x) in the form (4) in a number of ways. But we should
choose that form which involves minimum calculation and for which the graphs of f1(x) and
f2(x) are easy to draw. We illustrate this point in the following example.
Example 4: Find an approximate value of the positive real root of 3x - cos x - 1 = 0 using
the graphical method.

Solution: Since it is easy to plot 3x - 1 and cos x, we rewrite the equation as 3x - 1 = cos x.
The graphs of y = f1(x) = 3x - 1 and y = f2(x) = cos x are given in Fig. 3.

Fig. 3: Graphs of f1(x) = 3x - 1 and f2(x) = cos x.

It is clear from the figure that the x-coordinate of the point of intersection is approximately
0.6. Hence x = 0.6 is an approximate value of the root of the equation 3x - cos x - 1 = 0.

We now make a remark.

Remark 1: You should take some care while choosing the scale for graphing. A
magnification of the scale may improve the accuracy of the approximate value.
Here is an exercise for you.

E3) Find the approximate location of the roots of the following equations in the regions
given, using the graphical method.
a) f(x) = e^(-x) - x = 0, in 0 ≤ x ≤ 1
b) f(x) = e^(0.4x) - 0.4x - 9 = 0, in 0 < x ≤ 7

We have discussed two methods, namely, the tabulation method and the graphical method, which
help us in finding an initial approximation to a root. But these two methods give only a
rough approximation to a root. Now, to obtain more accurate results, we need to improve
these crude approximations. In the tabulation method we found that one way of improving
the process is refining the intervals within which a root lies. A modification of this method
is known as the bisection method. In the next section we discuss this method.

2.3 BISECTION METHOD

In the beginning of the previous section we mentioned that there are two steps involved
in finding an approximate solution. The first step has already been discussed. In this section
we consider the second step, which deals with refining an initial approximation to a root.

Once we know an interval in which a root lies, there are several procedures to refine it. The
bisection method is one of the basic methods among them. (This method is also called the
Bolzano method or the bracketing method.) We repeat the steps 1, 2, 3 of the tabulation
method given in subsection 2.2.1 in a modified form. For convenience we write the method
as an algorithm. (An algorithm is a complete and unambiguous set of instructions leading to
the solution of a problem.)

Suppose that we are given a continuous function f(x) defined on [a, b] and we want to find
the roots of the equation f(x) = 0 by the bisection method. We describe the procedure in the
following steps; a code sketch of the procedure follows the steps.

Step 1: Find points x1, x2 in the interval [a, b] such that f(x1) f(x2) < 0. That is, find points
x1 and x2 for which f(x1) and f(x2) are of opposite signs (see Step 1 of subsection 2.2.1).
This process is called 'finding an initial bisecting interval'. Then by the IV theorem a root
lies in the interval ]x1, x2[.

Step 2: Find the middle point c of the interval ]x1, x2[, i.e. c = (x1 + x2)/2. If f(c) = 0, then c
is the required root of the equation and we can stop the procedure. Otherwise we go to Step 3.

Step 3: Find out whether

f(x1) f(c) < 0.

If it holds, then the root lies in ]x1, c[. Otherwise the root lies in ]c, x2[ (see Fig. 4). Thus
in either case we have found an interval half as wide as the original interval that contains
the root.

Fig. 4: The decision process for the bisection method.

Step 4: Repeat Steps 2 and 3 with the new interval. This process either gives you the root
or an interval having width 1/4 of the original interval ]x1, x2[ which contains the required
root.

Step 5: Repeat this procedure until the interval width is as small as we desire. Each
bisection halves the length of the preceding interval. After N steps, the original interval
length will be reduced by a factor 1/2^N.

Now we shall see how this method helps in refining the initial i n t p a l s in spme of the
problems we have done in subsection 2.2.1.
Example 5 :Consider the equation 2x - loglo.x = 7 lies K
n ]3.78,3.79[. Apply bisection
method to fhd an approximate root of the equation correct to three decimal places.

Solution :Let f(x) t 2x - loglp x - 7. Frorri Table 2 in subsedZon 2.2.1, we find that
f(3.78) = - 0.01749 and f(3.79) = 0.00136. Thus a root lies in the interval 13.78, 3.791,

Then we find the middle point of the interval 33.78, 3.79[. The middle point is
c = (3.78 + 3.79)/2 = 3.785 and f(c) = f(3.785) = -0.0806 # 0. Now, we check the
condition in Step 3. Since f(3.78jfC3.785) > 0, the raot doks not lie in the interval
]3.78,3.78[.'Hence the root Ues inthe interval 13.78$,3.9[. We have to refine this interval
further to get better spproxirhation. Further bisections are shown in the following Table.

Table 7 .

-
Tr-/T[
Nwber of Bisections Bisected value xi Improved Interval ' 1

The table fltew4 that the tiproved interval after 5 bipectjons is ]3.78906,3.789375[. The
width of this interval in 3.7B9p75 - 3.78906 = 0.0003 15, If we stop further bisttdions, the
maximum absolute error w d d be 0.000315. The appmltimate root m thefef~rebe iakem a's
(3.78906-t. 3.789975)/2 = 3.789218. Hence the desired approximate value bf the mot
rounded off to Chree dec'imal places is 3.789.
Example 6 : Apply biseqtion method to find an approximation to the positive root of the
equation
2x-3sinx-5=O
rounded off to three decimal places.
Solution :Let f(x) = 2x - 3 sin x - 5.
In Example 1, we had shown that a positive root liesin the interval ]2.8,2.9[. Now we apply .
bisection method to this interval. The results are given in the following table.
SUIuthm~OI
Non-linear F~uatiuna Table 8
in one Variable
Number of bisection Bisected value xi f(xJ Improved interval
1 2.85 4.1624 ]2.85.2.9[

I 2 2.875
2.8875
- 0.0403
0.02089
]2.875,2.9[
]2.875,2.8875{

-
After 6 bisections the width of the interval is 2.8835938 2.8832031 = 0.0003907. Hence,
the maximum possible absdute error to the root is 0.0003907. Therefore the required
approximation to the root is 2.883.
Now let us make some remarks.
Remark 2 :While applying bisection method we must be careful to check that f(x) is
1

continuous. For example, we may come across functions like f(x) = --L If we consider the
x - 1'
interval 1.5, I .5[. then f(.S) f(1.5) < 0. In this case we may be tempted to use bisection
method. But we cannot use the method here because f(x) is not defined at the middle point
x = 1. We can overcome these difficulties by taking f(x) to be continuous throughout the
initial bisecting interval. (Note that if f(x) is continuous, by IV theorem f(x) assumes all
values between the interval.)
Therefore you should always examine the continuity of the function in the initial interval
before attempting the bisection method.

Remark 3 :It may happen that a function has more than one r&t in an intenal. The
bisection method helps us in determining one root only. We can determine the other roots by
properly choosing the initial intervals.
You can try some exercises now.

) Starting with the interval [%, b,], apply bisection method to the following equations
and find an interval of width 0.05 that contains a solution of the equations

~ 5 Using
) bisection method find an approximate root of the equation x3 - x - 4 = 0 in the
interval ] 1.2[ to two places of decimal.

While applying bisection method we repeatedly apply steps 2.3.4 and 5. You recall that in
the introduction we classified such a method as an Iteration method. As we mentioned in
the beginning of Sec. 2.2, a numerical process starts with an initial approximation and
means repeated application iteration i m p ~ v e this
s approximation until we get the desired accurate.value of the root.
of a numerical process or a pattern
of action. Let us cpnsider another iteration method now.

2.4 FIXED POINT ITERATION METHOD

The bisection method we have described earlier depends on ow ability to find an interval in
which the root lies. The task of finding such intervals is difficult in certain situations. In
such cases we try an alternate method called rued Point Iteration Method. We shall
discuss the advantage of this method later.
The first step in this method is to rewrite the equation f(x) = 0 as Iteration Melhtds tor l.oratinga Rcwt

x = g(x) . ..( 5 )
For example consider the equation x2 - 2x - 8 = 0.We can write it as
x=dz'x . .. ( 6 )

We can chocse the form (5) in several ways. Since f(x = 0 is the same as x = g(x). finding a ~ ~ f,,inti ~,,fa ~~uncl,on
d I: a
root of-f(x)= 0 is the same as findilig a root of x = g(x)i.e. a fixed point of g(x).Fich such point casuch that g(a)= a.
g(x) given in ( 6 ) .(7)or (8) is called an iteration function for solving f(x)= 0.

Once an iteration function is chmen. our next step is to take a point x,,. close to the root. a\
the initial approximation of the root.

Starting with xo, we tind the first approxim'ation x i as

Then we find the next approximation as


x2 =g(x,)
Similarly we find the successive approximations x2, x3. x4. . . . as

Each computation of the type x, I = g(x,) i\ called an iteration. Now. two questions arise
+

( i ) when do we stop these iterations? (ii) Does this procedure always give the requ~red
solution?
To ensure this we make the following assumptions on g(x)

Assumption*
The derivative g'(x) of g(x) exists, gO(x)is continuous and satisfies 1 g'(x) 1 < I in an
interval containing xo. (That would mean that we require 1 g'(xi) 1 < I at all iterates xi.)

The iteration is usually stopped whenever 1 xi + I - xi 1 is less than the accuracy required.

In Unit 3 you will prove that if g(x) satisfies the above conditions. then there exists a unique
point a such that g(a) = a and the sequence of iterates approach a.provided that the initial
approximation is close to the point a.

Now we shall illustrate this method with the following example.


Example 7 :Find an approximate root of the equation

using fixed point iteration method, starting with xo = 5. Stop the iteration whenever
I x i + , -xi I <0.001.
Solution :Let f(x) = x2 - 2x - 8. We saw that the equation f(x) = 0 can be written.in three
forms (6). (7) and (8). We shall take up the three forms one by one.

. , form (5). k this form the equation is written as


Case 1 :Suppose we conrider

x=(2~+8)"~ - ,.
'
~tdution+of Non-linear ~:qvdtion* Here g(x) = (2x + 8)'12. Let's see whether Assumption (*) is satisfied for this g(x). We have
in one Variable
1
gf(x) =
(2x + 8)]12
Then 1 g'(x) I < 1 whenever (2x + 8)'12 > 1. For any positive real number x, we see that the
inequality (2x + 8)'12 3r 1 is satisfied. Therefore, we consider any interval on the positive
side of x-axis. Since the starting poidt is xo = 5, we may consid& the interval I = [3,6]. This
contains the point 5, NOW,g(x) satisfies the condition that gf(x)exists on I, gf(x) is
continurns on I and I gf(x) 1 < 1 for every x in the interval [3,6]. Now we apply fixed point
. iteration method to g(x).

We get
x I = g(5) = = 4.243

Use a calculator to evaluate the


square root.

x', = 4.004

Since 1 x6,- x5 1 = I - 0.001 1 = 0.001,we conclude that an approximate valbe of a root of


f(x) = 0 is 4.

Case 2 ;-Let us consider the second form,

2x + 8
Here g(x) = -and gf(x)= '8. The I gf(x) I < I for any real number x L 3. Hence g(x)
X x2
satisfies Assumption (*) in the interval [3,6]. Now we leave it as an exercise for you to
complete the computations (See E6).

a- - 8 x2 - 8
Case 3 :Here we have x = --. Then g(x) = -and gf(x) = x. In this case 1 g'(x) 1 <1
2 2
only if 1 x 1 < I i.e. if x lies in the interval 1-1, l[.But this interval does not c~ntain5.
Therefore g(x) does not satisfy the Assumption (*) in any interval containing the initial
approximation. Htnce, the iteration method cannot provide approximation to the desired
root.

Note :This example may appear artificial to you. You are right because in this case we have
got a formula for calculating the root. This example is taken to illustrafe the method in a
simple way.

Let us consider another example.

Example 8 :use fixed point iteration procedure to find an approximate rocit of 2x - 3 sin x
- 5 = O starling with thepoint % ~2.8. + ,
Stop the iteration whenever 1 xi - xi 1 < r0-'

Solutioo :We can rewrite the equationin the form,

3 5 3
-
Here g(x) = 2 sin x + -
2 md gf(x)s -
2C O x.
~

Now at xo = 2.8, we have

which is greater than 1. Thus g(x) does not satisfy Assumption (*) and t h e f o r e in this form
the iteration method fails.
Let us now rewrite the equation in another form. We write Iteration Methods for Locating a Ri .,r

2x - 3 sin x - 5
x=x-
2 - 3 cos x
2x-3sinx-5
Then g(x) = x -
2 - 3 cos x
You may wonder how did we get this form. Note that here g(x) is of the form
f(x) You will find later thit the above equation is the iterated formula for
g(x) = x - ---
f (x)'
another popular iteratiommethod.

Then g'(x) = 1 -
(2 - 3 cos X) (2 - 3 cos X)- (2x - 3 sin x + 5) 3 sin x

-- 2x - 3 sin x + 5 3 sin x
-
(2 - 3 cos x ) ~ I
(2 - 3 cos x ) ~
At xo = 2.8, 1 gt(x0) I = 0.06693 15 (or 0.02174691) < 1
E
Therefore g(x) satisfies the Assumption (*). Using the initial approximation as xo = 2.8, we
get the successive approximation as
x, = 2.883901 5

Since I x2 - x3 I < lo-' we stop the iteration here and conclude that 2.88323 is an
approximate value of the root.

Next we shall use another form

Here g(x) = sin-' -(2";


5, and gt(x) = 7--- 2
9 - (2x - 5)2

At xo = 2.8, g'(xo) = 0.6804 < 1. In fact, we can check that in any small interval containing
2.8, 1 gr(x) 1 < 1. Thus g(x) satisfies the Assumption (*). Applying the iteration method,
we have

We find that there are two values which satisfy the above equation. One value is 0.201358
and the other is ~c- 0.201358 = 2.940235. In such situations, we take a value close to the
initial approximation. In this case the value close to the initial approximation is 2.940235.
Therefore we take this value as the starting point of the next approximation.
x, = 2.940235

Next we calculate

xZ= sin

= 0.297876 or 2.843717
Continuing like this, it needed 17 iterations to obtain the value x17 = 2.88323, which we got
from the previous form. This means that in this form the convergence is very slow.

From examples 7 and 8, we learn that if we choose the form x = g(x) properly, then we can get
the approximate root provided that the initial approximation is sufficiently close to the root. The
initial approximation is usually given in the problem or we can find using the IV theorem.
Now we shall make a remark here.
Sululiohs of Non-linear Fquatiuns Remark :The Assumption we have given for an iteration function, is a qtronger
(x)

in one Variable assrlmpiion. In actual practice there are a variety of assumptions which the Iteration function
g(x) must satisfy to ensure that the iterations approach the root. But. to use those
assumptions you would require a lot of practice in the application of techniques in
mathematical analysis. In this course, we will be restrTcting ourselves to functions that
satisfies Assumption (*). If you would like to know about the other assumptions, you may
refer to 'Elementary Numerical Analysis' by Samuel D Conte and Carl de Boor.

To get some practice over this method. you can try the following exercises.

E6) Apply fixed point iteration method to the form x = -


2X
X
starting with xo = 5 to
ohtain a root of x2 - 2.r - 8 = 0.
E7) a) Apply tixed point iteration method to the following equations u 1tl1 the 1ntt1a1
approsirnattsn given nlungside. In each caw find an approhlmate root rounded
off to 4 decimal plac~c

i i ) x = - I+ sinx , X g = 1 .
2
b) Compute the exact roots of the equation x' + 4Sx - 2 = O using quadratic
formula and compare with the approximate root obtained in (a) (i).

Let u s now briefly recall what we have done in this unit.

2.5 SUMMARY - -~

In this unit we have covered the following points :


0 We have seen that the methods for finding an approximate solution of an equation
involve two steps :
. i) Find an initial approximation to a roa.
ii) Improve the initial approximation to get a more accurate value of the root.
We have described the following iteration methods for improving an initial
approximationol'a root.
it Bisection method
ii) Fixcd point iteration method.

2.6 SOLUTIONS/ANSWERS

t =3 x - f i s i n x
El) ~ ef(x)
Since f(-x) = -3x - 41 + sin (-x) = -3x - 41 + sin < 0 for x > 0. f(x) has no
negative real root.
Computing values of f(x) for x = 0. 1. 2. . . . radians, we get
f(o)='3x 0 - c = - 1
and f ( l ) = 3 - m n 7 = 3 - d 1 +.84147 = 1.6430,assin I =0.84147.
approximately. using a calculator. Thus f(0) and f(l) are of opposite signs. Therefore
there exists a root of f(x) = 0 lying between 0 and I.
Now we randomly take some values between 0 and I. say 0.3 and 0.5. Then
f(0.3)= .9- 1.1381 =-0.23181 <O
and
f(0.5) = 0.2836836 19 > 0.
Hence the root lies in ]0.3,0.5[. Iteration \lethad\ f i ~ r1,tratinp a Wiu~i

Repeating the process once again with the values x = 0.35 and 0.41 etc. we get,
f(0.35) < 0
and
f(0;41) > 0.
Therefore the root lies between 0.35 ai~d0.41. This :nterval is small. If we stop
the iteration here, we may either take 0.41, since f(0.41) is closer to zero. or
(0.35 + 0.41)/2 = 0.38 as the required initial approximation.
Let f(x) = 2x - tan x. Since we want a positive root of f(x) = 0, we evaluate f( x ) for
x > 0.
Let us consider x = 0. 1, 1.5. Then
f(0) = 0
f( I) = 0.443
and
f(1.5) = - 11.1014
Therefore a root lies between 1 and 1.5. Now if we consider values of I'(x, fo~x = I . I
and 1.2, we get
f(l)=0.443 <O
f(l .I ) = - O.XG8 < O
and
f(l.2)=-0.1722<0
Therefor: we get that a root liec in the interval I I . I . I 1 . In fact the root lies more close
to 1. We may take ( 1 + 1.1)/2 = I .(I5 as a n i n i t r ~ approximation.
~l
3) Let f,(x)=e-X

The graphs o f f , and f2are plottcd In the l'c~llowingfigure .


Solutions of Non-linear Equations From the graph you can see that the x-coordinate of the point of intersection is
in one Variable approximately 0.55. Hence the root lies close to 0.55.
b) The given equation can be written as
f(x) = eO." - ( 0 . 4 ~+ 9)
The graphs of eO." and 0 . 4 ~+ 9 are given in the following figure, :

From the graph you can see that there are two points of intersections.
x-coordinates of the points are approximately 6 and - 22.5. Hence one root lies
close to 6 and the other root lies close to - 22.5.

E4) a) We first note that the given function f(x) = ex - 2 - x is continuous in the
interval l1.0, 1.81. Also

usi@ a calculator.
Therefore the interval 11.0. 1.8[ contains a root of the equation.
1+18
Middle point of the interval c = -= 1.4. Also, f(c) = - 3 = 4.0552 - 3 > 0.
2
Therefore the root lies in the interval ] 1, 1.4[.

Repeating this process three times more, we get the intervals ] 1.0. 1.2[. ] 1.1. 1.2[
and ] 1.1. 1.15[. Therefore the improved interval after 4 bisections is ] 1 . I , 1.15[.
The width of this interval is 0.05. This shows that the required interval of width
0.05 which contains a root of the equation is ] I . I . 1.15[.
b) Using a calculator you can show that the intervals in each of the four bisections
are given by ]3.6.4.0[, 13.6. 3.8[. ]3.6.3.7[ and 13.65. 3.70[. The width of the
last interval is 3.70 - 3.65 = 0.05. Therefore the required interval id3.65.3.701.
ES) After 5 bisections the root lies in ] 1.7959. 1.7969[. Therefore the required root correct
to two decimal places is 1.80.
Iteration Methods for Locating a Root
The iterations are given by

we have
xl = g(5) = 3.6

Since I xl - x I O I = 0.001, we conclude that an approximate value of a root of


f(x) = 0 is 4.
E7) a) i) The iteration formula in fixed point iteration method is

Here xo = -20. Starting with xo = -20, the successive iterations are given
XI =-45.1
x, = - 45.04435

Since xj and x4 are the same, we stop the iteration here. Hence.the
approximate root rounded off to four decimal places is - 45.0444.
ii) The desired root is 1.4973.
b) The given equation is x2 t 45x - 2 = 0. According to the quadratic formula, I.he
two.roots are

Comparing with the result in part (a) (i), we find $at the approximate root is the
same as the exact root - 45.0444.
UNIT 3 CHORD METHODS FOR
FINDING ROOTS
Structure
3.1 Introduction
Objectives
3.2 Regula-Falsi Method
3.3 Newton-Raphson Method
3.4 Convergence Criterion
3.5 Summary

3.1 INTRODUCTION

In the last unit we introduced you to two iteration methods for finding roots of an equation .
f(x) = 0.There we have shown that a root of the equation f(x) = 0 can be obtained by writing
the equation in the form x = g(x). Using this form we generate a sequence of approximations
,
x, = g(xi) for i = 0.1,2, . . . We had also mentioned there that the success of the iteration
+

methods depends upon the form of g(x) and the initial approximation xo. In this unit, we
shall discuss two iteration methods : regula-falsi and Newton-Raphson methods. These
methods produce results faster than bisection method. The first two sections of this unit deal
with derivations and the use of these two methods. You will be able to appreciate these
iteration methods better if you can compare the efficiency of these methods. With this In
view we introduce fhe concept of convergence criterion which helps us to check the
efficiency of each method. Sec 3.4 is devoted to the study of rate of convergence of different
iterative methods.

Objectives
After studying the unit you should be able to :
apply regula-falsi and secant methods for finding roots
apply Newton-Raphson method for finding roots
define 'order of convergence' of an iterative scheme
'obtain the order of convergence of the following four methods :
i) bisection method
ii) fixed point iteration mettiod
iii) secant method
iv) Newton-Raphson method

3.2 REGULA-FALSI METHOD (OR METHOD OF


FALSE POSITION)
In this section we shaH discuss the 'regula-falsi metbod'. The Latin word 'Regula Falsi'
means rule of falsehood. It does not mean that the nlie is a false statement. But it conveys
that the'roots that we get according to the rule are approximate roots and nor necessarily
exact roots. The method is also known as the method of false position. This method is
similar to the bisection method you have learnt in Unit 3.

The bisection method for finding approximate roots has a drawback that it makes use of only
the signs of f(a) and f(b). It does not use the values f(a), f(b) in the computations. For
example. if f(a) = 700 and f(b) = 4 . 1 , then by the bisection method the first approximate
value of a root of f(x) is the mid value xo of the interval la, b[. But at xo. f(xo) is nowhere
near 0. Therefore,in this case it makes more sense to take a value near to 4.I than the C h d Methods for Finding Roots
middle value as the approximation to the root. This drawback is to some extent overcome by
the regula-falsi method. We shall first describe the method geometrically.
Suppose we want to find a root of the equation f(x) = 0 where f(x) is a continuous function.
As in the bisection method, we first find an interval ]a. b[ such that f(a) f(b) < 0. Let us look
at the graph of f(x) given in Fig. I.

(a. f(a) )

Fig. 1 :Reguia-Falsi Method

The condition f(a) f(b) < 0 means that the points (a, f(a)) and (b, f(b)) lie on the opposite
sides of the x-axis. Let us consider the line joining (a, f(a)) and (b, f(b)). This line crosses the
x-axis at some point (c, 0) [see Fig. I]. Then we take the x-coordinate of that point as the
first approximation. If f(c) = 0, then x = c is the required root. If f(a) f(c) < 0, then the r a t
lies in ]a. c[ (see Fig. 1 (a)). In this case the graph of y = f(x) is concave near the root r).
Otherwise, if f(a) f(c) > 0, the root lies in lc, b[ (see Fig. 1 (b)). In this case the graph of
y = f(x) is convex near the root. Having fixed the interval in which the root lies. we repeat
the above procedure.
k t us now wjite the above procedure in the mathematical form. Recall the fonnula for the
line joining two points in the Cartesian plane [see MTE-051. The line joining (a. f(a))and
(b, f(b)) is given by

y - f(a) = - - f(a)
f(b)- -(X- a)
b-a
We can rewrite this in the form
y-f(a) x-a
f(b) - f(a) - b - a '
...
Since the straight line intersects the x-axis at (c, O), the point (c, 0) lies on the straight line.
Putting x = c, y = 0 in Eqn. (I), we get
-f(a) c-a
f(b) - f(a) - b -a

This expression for c gives an approximate value of a root of f(x). Simplifying (2). we can
a b w r i t e it as
a f(b) - b f(a)
C=
f(b?- f(a)
Now, examine the sign of f(c) and decide in which interval ]a, c[ or Ic, b[, the root lies. We
thus obtain a new interval such that f(x) is of opposite signs at the end points of this interval.
By repeating this process, we get a sequence of intervals ]a, b[, ]a, al[, ]a, %[, . . . as shown
in Fig. 2.
Solutions of Nun-linear Equations
in one Variable

Fig. 2

We stop the process when either of the following holds.


i) The interval containing the zero of f(x) is of sufficiently small length
or
ii) The diffbrince between two successive approximations is negligible.
In the iteration format, the method is usually written as

where ]x0, x1[ is the interval in which the root lies.

We now summarise this method in the algorithm form. This will enable you to solve
problems easily.

Step 1 : Find numbers xo and xl such that fGO)f(.x,) < 0, using the tabulation method.

xo f(xJ - x1 f(xo)
Step2:Setx2=- . This gives the first approximation.
f(xl) - f(xO)

Step 3 :If f(x2) = 0 then x2 is the required root. If f(x2) f 0 and f(x& f(x2) < 0, then the next
approximation lies in ]x0, x2[. Otherwise it lies in ]x2, xl[.

Step 4 :Repeat the process till the magnitude of the difference between two successive
iterated values xi and xi is less than the accuracy required. (Note that I xi I - xi I gives
+ +

the error after ith iteration).

Let us now understand these steps through an example.


. ... -
.

Example 1:It is known that the equation x?.+ 7x2 + 9 = 0 has a root between -8 and -7.
Use the regula-falsi method to obtain the root roundedoff to 3 decimal places. Stop the
iteration when 1 xi I - xi I < lo4.
+

Solution :For convenience we rewrite the given function f(x) as

Since we are given that xo = -8 and xl = -7, we do not have to use step 1. Now to get the
first approximation, we apply the formula in Step 2.
Since, f(xo) = f(-8) = -55 and f(xl) = f(-7) = 9 we obtain Chord Metliads for Finding Roots

(-8) 9 - (-7) (-5) = -


=
2 9 + 55
Therefore our first approximation is -7.1406.

To find the next approximation we calculaie f(x2). We have

= 1.862856
Now we compare the sign of f(x2) with the signs of f(xo) and f(xl). We can see that f(xo)
and f(x2) are of opposite signs. Therefore a root lies in the interval 1-8, -7.1406[. We apply
the formula again by renaming the end points of the interval as xl = -8, x2 = -7.1406. Then
we get the hecond approximalion as
-8 f(-7.1406) + 7.1406 f(-8) - -
. _ - ..

'We repeat this process using steps 2 and 3 given above. The iterated values are given in the
:ollow~ngtable.

Table
I
L-
, Number of iterations
- -
1
I
Interval I Iterated Values x i 1 The function value f(xi) 1

k From the table, we see that the absolute value of the difference between the 5th and 6th
iterated values is 1 7.1748226 - 7.1747855 1 = .0000371. Therefore we stop the iteration
here. Further, the values of f(x) at 6th iterated value is .00046978 = 4.6978 x lo4 which is
close to zero. Hence we conclude that -7.175 is an approximate root of x3 + 7x2 + 9 = 0
rounded off to three decimal places.
Here is an exercise for you.

- - - - - -- -- - - - - -

E 1) Obtain an approximate root for the following equations rounded off to three decimal
places, using regula-falsi method

b) xsinx- I = O

I You note that in regula-falsi method, at each stage we find an interval ] xo, x, [ which contains a
I
root and then apply iteration formula (3). This procedure has a disadvantage. To overcome this,
regula-falsi method is modified. The modified method is known as secant method. In this
method we choose xo and x, as any two approximations of the root. The Interval ] xo, x I [ need
I
not contain the root. Then we apply formula (3) with xo, xl, f(xO)and f(x,).

The iterations are now defined as :

Y - xo f(xl) - X I f(xo)
Sdutions of kon-linear Equations
in one Varlable

Note :Geometrically, in secant Method, we replace the graph ef f(x) in the interval
]x,, x,, ,[ by a straight line joining two points (x,, f(x, ,), (x, ,), f(xn+J)on the curve
+ +

and take the point of intersection with x-axis as the approximate value of the root. Any line
joining two points on the curve is called a secant line. That is why this mehod is known as
secant method. (see Fig. 3).

I
Fig. 3

Let us solve an example.


Example 2 :Determine an approximate root of the equation

using
i) secant method starting with the two initial approximations as xo = 1 and x I = 1
and
ii) regula-falsi method.
(This example was considered in the book 'Numerical methods for scientific and
engineering computation' by M.K.Jain, S.R.K. Iyengar and R.K.Jain).

Solution :. Let f(x) = cos x - x ex.


Then f(0) = 1 and f( I ) = cos 1 - e = -2.177979523. Now we apply formula (4) with
xo=Oand xl = 1. Then

Therefore the first iterated value is 0.3 146653378.To get the 2nd iterated value, we, apply
Formula (4) with x, = 1, x2 = 0.3146653378. Now f(1) = -2.177979523 and
f(0.3146653378)= 0.5 1987I 175.
Therefore

1 We contime this process. The iterated values are tabulated in the following table.

Table 2 :Secant Method


Number of iterations I Iterated values xi I f(xi)

From the table we find that the iterated values for 7th and 8th iterations are the same. Also
the value of the function at the 8th iteration is close to zero. Therefore we conclude that
0.5 177573637 is an approximate root of the equation.
ii) To apply regula-falsi method. let us first note that f(0) f(l) < 0. Therefore a root lies in
the interval 10, 1[. Now we apply Formula (3) with xg = 0 and x, = 1. Then the first
approximation is

You may have noticed that we have already calculated the expression on the right hand side
of the above equation in pan (il.
Now f(x2)=0.51987 > 0. This shows that the root lies in the interval 10.3146653378. 11. To
get the second approximation, we compute

which is same as x3 obtained in (i). We find f(x2)= 0.203545 > 0. Hence the root lies in
10.4467281446. I[. To get the third approximation. we calculate

The above expression on the right hand side is different from the expression for x4 in part
(i). This is because when we use regula-falsi method. at each stage, we have to check the
condition f(xi) f(xi .. I)< 0.
Solutions of Non-linear Equations The computed values of the rest of the approximations are given in Table 3.
in one Variable
Table 3 :Regula-Falsi Method
No. Interval Iterated value xi -: f(xi) i
1 lo,] [ 0.3146653378 0.5 19871
2 1.04467281446.1[ 0.4467281446 0.203545
3 10.4940153366.1[ 0.4940153366 0.708023 x lo-'
4 10.5099461404, 1 [ 0.5099461404 0.236077 x lo-'
5 10.5 1520 10099, 1[ 0.5 152010099 0.776011 x

From the table, we observe that we have to perform 20 iterations using regula-falsi method
to get the approximate value of the root 0.5 177573637 which we obtained by secant method
after 8 iterations. Note that the end point 1 is fixed in all iteractions given in the table.

Here are some exercises for you.

E2) Use secant method to find an approximate root of the equation x2 - 2x + 1 = 0,


rounded off to 5 decimal places, starting with xo = 2.6 and x, = 2.5. Compare the
result with the exact root I +.\IT.

E3) Find an approximate root of the cubic equation x3 + x2 - 3x - 3 = 0 using


a) i) regula-falsi method, correct to three decimal places.
ii) secant method starting with a = 1, b = 2, rounded-off to three decimal plaJces.
b) compare the results obtained by (i) and (ii) in part (a).

Next we shall discuss another iteration method.


-
3.3 NEWTON-RAPHSON METHOD

This method is one of the most useful methods for finding roots of an algebraic equation.

Suppose that we want to find an 'approximate root of the equation f(x) = 0. If f(x) is continuous.
then we can apply either bisection method or regula-falsi method ta find approximate roots.
Now if f(x) and f (x) are continuous, then we can use a new iteratioh method called
Newton-Raphson method. You will learn that this method gives the result h o r e faster than the
bikction or regula-falsi methods. The underlying idea of the method is due to mathematician
Isac Newton. But the method as now used is due to the mathematician Raphson.

Let us begin with an equation f(x) = 0 where f(x) and f ( x ) and are continuous. Let xo be an
initial approximation and assume that xo is'close to the exact root a and f(xo) z 0. Let
a = xO+ h where h is a small quantity in magnitude. Hence f ( a ) = f(xo + h) = 0

,Now we expand f(xo + h) using Taylor's theorem, Note that f(x) satisfies all the
requirements of Taylor's theorem. Therefote, wAget
f(xo + h) ;f(xd) + h f (xo) +. . . = 0
Neglecting the terms containink' and higher powers we get
.
f(xo) + h f (x,) = 0.
Chord Methods for Finding Rwts
.This gives a new approximation to a as
f(xO)
x l = x O + h = x --
0 f (xd' 9

Now the iteration can be defined by

x = x --
* I f(xl)

Eqn. (5) is called the Newton-Raphson formula.. Before solving some examples we $hall
explain this me&d geometsically.

Geometrical Interpretation df Newton-Raphson Method


Let the graph of the function y F f(x) be as shown in Fig. 4.

e.
Fig. 4 :Newton-Raphson M t h o d

If xOis an initial approlimatian to the root, then the corresponding point on the graph is
P(xv f(xO)).We draw a tangent to the curve at P. Let it intersect the x-axis at T (see Fig. 4 ) .
Let x, b~the x-coordinate of T.Let S(a,0) denote the point on the x-axis where the curve
cuts the x-axis. We know that is a mot of the equation f(x) = 0. We take x, as the new
approximation which may be closer to a than xg' Now let us find-the tangent at P(x,. f(x,,)).
The slope of the tangent at P(xo, f(xo)) is given by f(xo). Therefox by the point-slope form
of the expression for a tangent to a curve (recall the expression from MTE-OS), we can write
Y - fC%) = f(xo) (x, - xo)
This tangent passes through the point T(x,, 0) (see Fig. 4). Therefore,we get

This ik the first iterated value. To get the s e c ~ n diterated value we again consider a tangent
at the point P(xl, f(xl)) on the curve(see Fig.4) and repeat the process. Then we get a point
Sdutiom of Non-Hmclr huntions TI(%, 0) on the x-axis. From the figure. we observe that T,is more closer to S ( a . 0 ) than T.
k aw Variable
Therefore after each iteration the approximation i s coming closer and closer to the actual -
root. I n practice we do not know the actual root of a given function.

Let us now take up some examples.

Example 3 :Find the smallest positive root of


2x-tanx=O
by Newt-Raphson method, correct to 5 decimal places.
-
Sdution :Let f(x) = 2x -tan x. Then f(x) i s a continuous function and f(x) = 2 sec2x i s
also a continuous function. Recall that the given equation has already appeared in an
exercise in Unit 2 (see E2 in Unit 2). From that exercise we know that an initial
approximation to the positive root of the equation i s x = I.Now we apply the
Newton-Raphson iterated formula.
mi) .
I- 1
Xi=X. f(xi)' -- 1=l,2,3 ....
Here ,x, = I.Then f(xd = f( I)= 2 - tan 1 = 0.4425922

f ( x d = f ( l ) = 2 - ~ e c * 1= 2 - ( 1 +tan21)

=I-tan21
= -1.425519
0.4425922
Therefore x, = I-
- 1.4255 19
= 1.3 1048
For i = 2, we get
2 - tan(1.31048)
X, = 1.31048 -
I- tan2(1.3 1048)
= 1.22393
Similarly we get
x, = 1.17605

x4 = 1.165926

x, = 1.165562

xn= 1.165561

Now x, and xn are correct to five decimal places. Haice we \top the iteration process here.
The root correct to 5 decimal places i s 1.16556.

Next we shall consider an applicationof Newton-Raphson formula. We know that finding


the square mot of a number i s not easy unless we use a calculator. Calculators use some
algorithm to obtain this value. Now we shall illuhtrate how Newton-Raphson method
enables us to obtain such an algorithm for calculating square roots. Let's consider an
example.

Example 4 :Find an approximate value of fiusing the Newton-Raphson formula.

Solution :Let x = 6. Then we have x2 = 2 i.e. x2 - 2 = 0. Hence we need to find the


-
positive root of the equation x2 2 =O. Let
f(x) = x2 2. -
Then f(x) satisfies ail the conditions for applying Newton-Raphson method. We choose
xo= I as the initial approximation to the root. This i s because we know that 6
lies between
fiand and therefore we can assume that h e root will be close to 1.
Now we compute the iterated values. Chord McUloQ tor Fiading Rmts

.
The iteration' fo??ula is
.

Putting i = 1.2.3,. . . .we get

Similarly

1 Thus the value o f d ? l c o ~ c to


t seven decimal places i s 1.4 1.42 136. Now you can clicck this
1 value with the calculator.

Note I :The method used in the above example is applicable for finding square rm)t ol'i~ny
positive dial number. For example suppose we want to find an approximate value of \'A
where A i s a positive real number. Then we consider the equation x2 - A = 0. The itrri~tcd
formula in this case is

This formula involves only the basic'arithmetic operations +. -. x and +.


Note 2 :From examples (3) and (4). we find that Newton-Raphson method gives the rcwt

1% I
very fast. One reason for this is that the derivative I f ( x ) I is large compared to If(x)l for any
x = xi. The quantity - which is the difference between two iterated values is rmall in
this case. I n general we can say that if I f(xi) I is large compared to I f(xi) I, then wl. can
obtain the desired root very fast by this method.

The Newton-Raphson method has some limitations. In the following remarks we mention
some o f the difficulties..

Remark 1 :Suppose f(xi) is zero in a neighbouthood o f the root, then it g a y happen that
f(xi) = 0 for some xi. I n this case we cannot apply Newton-Raphson formula, since
division by zero is not allowed.

Remark 2 :Another difficulty is that i t may happen that f(*)is zero only at the roots. This
happens i n either o f the situations.

i) f(x) has multiple root at a.R e d l that a polynomial function f(x) has a multipk root
a of order N if we can write

, where h(x) is a function such that h(a) # 0. For a general function f(x), this means
f(a) = 0 = f(a) = . ..= fY-'(a) and ?(a) # 0.
ii) f(x).)las a stationary point (point o f maximum o f minimum) point at the root [recall
fram your c d ~ l u course
s (MTE-01) that i f f(x) = 0 at some point x then x is called a
I stationary poinj).
Solutions of Non-linear Equations In such cases some modifications to the Newton-Raphson method are necessary to get an
in one Variable
accurate result. We shall not discuss the modificatiolls here as they are beyond the scop: of
this course.

You can try some exercises now. Wherever needed, you should c.-e a calpulator for
computation.

E4) Starting with xo = 0 find an approximate root of the equation x3 - 4x + 1 = 0, rounded


off to five decimal places using Newton-Raphson method.
E5) The motion of a planet in the orbit is governed by an equatidh of the form
1
y = x - e sin x where e stands for the eccentricity. Let y = 1 and e = - Then fiird :
2
approximate root of 2x - 2 -sin x = 0 in the interval [O, n] with error less than
Start with xo = 1.5.

E6) Using ~ e w t o n - ~ a ~ h s o n s ~ u aroot


r e algorithm, find the following roots within an
accuracy of 1oP4.
i) gl/', starting with xo = 3

ii) 91 I/', starting with xo = 10

E7) Can Newton-Raphson iteration method be used to solve the equation = O? Give
reasons for your answer.

In the next section we shall discuss a criterion using which we can check the efficiency of an
iteration process.

- CONVERGENCE
3.4
- ---
CRITERION

In this section we shall introduce a new concept called 'convergence criterion' related to an
iteration process. This criterion gives us an idea of how many successive iterations have to
be carried out to obtain the root to the desired accuracy. We begin with a definition.

Definition 1 : Let xu, x I . . . . x,. . . . . be the successive approximations of an iteration


m
process. We denote the sequence of these approximations as x 1
"In=o 1 We say that
I
, xll, n = O converges to a root a with order p 2 1 if
lm

for some number h > 0. p is called the order of convergence and h is called the asymptotic
error constant.

For each i. we denote by = xi -a. Then the above inequality be written as

This inequality shows the relationship between the error in successive approximations. For
example. suppose p = 2 and 1 E~ 1 lo-' for some i, then we can expect that
1 &i + I 1 s h lo4. Thus if p is large, the iteration converges rapidly. When p takes the
integer values 1. 2 . 3 then we say that the convergence i.s linear, quadratic and cubic
respectively. In the case of linear convergence (i.e. p = I ). then we require that h < 1. In this
case we can write (6) as

If this condition is satisfied for an ireratio; '!\ rocess then we say that the iteration process
converges linearly.
jetting n = 0 in the inequality (8), we get Chord Methods for FindingRoots

I x , - a 111I x o - a I
Forn =.I, we get
I ~ ~ - a l < h l x , - a l s2 1h x - a l

Similarly for n = 2, we get


1 x j - a 1 1x1 x2-a) Ih21 x , - a I 2 h 3l x -a l
Using induction on n, we get that
1 xn-a I IP 1 xo-a I for n>O . . . (9)

If either of the inequalities (8) or (9) is satisfied, then we conclude that x 1-


1 "1.=0
converges

to the root.

Now we shall find the order of convergence of the iteration methods which you have studied
so far.

Let us first consider bisection method. . .

Convergence of bisection method


Suppose that we apply the bisection method on the interval [ao, bo] for the equation f(x) = 0.
In this method you have seen that we construct intervals [ao, bo] 2 [a,, b , ] 2 [aZ,b2] 2 . . . .
each of which contains the required root of the given equation.
I
Recall that in each step the interval width is reduced by - i.e.
2

"0 - a.
and b, - a,, = -
2"

We know that the equation f(x) = 0 has a root in [ao, b o ] Let a be the root of the equation.
an bn
Then a lies in all the intervals [ai, bi], i = 0, 1,2, . . . . For any n, let C,
2
=
+
denote the
middle point of the interval [a,,, b,]. Then coyc , , c2, . . . are taken as successive
m
approximations to the root a. Let's check the inequality (8) for jc
i nJn=o

For each n, a lies in the interval [a,, b,]. Therefore we have

m
converges to the root a. Hence we can say that the bisection method always

converges.
Solutions of on-linear Equations For practical purposes, we should be able to decide at what stage we can stop the iteration to
in one Variable have an acceptably good approximate value of a. The nufnber of iterations required to
achieve a given accuracy for the bisection method can be obtained. Suppose that we want an
approximate sobtion within an error bound of (Recall that you have studied error
bounds in Unit 1, Sec 3.4). Taking logarithms on both sides of Eqn. (lo), we find that the
number of iterations required, say n, is approximately given by
;
n = int
[ln(bo 1a;, In lo-"
I
where the symbol 'int' stands for the integral part of the number in the bracket and ]ao, bo[ is
the initial interval in which a root lies.

Let us work out an example.

Example 5 : Suppose that the bisection method is used to find a zero of f(x) in the interval
[O, I]. How many times this interval be bisected to guarantee that we have an approximate
root with absolute error less than or equal to 10-5

Solution : Let n denote the required number. To calculate n, we apply the formula in Eqn.
(II)withbo= 1,ao=OandM=5.

Then

Using a calculator, we find

= int [ 16.609640471 = 17
Similarly you can try the following exercise.

E8) For the problem given in Example 5. Unit 2. find the number n of bisections required
to have an approximate root with absolute error less than or equal to lo-'.

The following table gives the minimum number of iterations required to find an approximate
root in the interval 10, I [ for various acceptable errors.

This table shows that for getting an approximate value with an absolute error bounded by
lo-'. we have to perform 17 iterations. Thus even though the bisection method is simple to
use. it requ~resa large number of iterations to obtain a reasonably good approximate root.
This is one of the disadvantages of the bisection method.

Note : The formula given in Eqn. ( 1 I ) shows that. given an acceptable error, the number of
iterations depends upon the initial interval and thereby depends upon the initial
approximation of the root and not directly on the values of f(x) at these approximations.

Next we shall obtain the convergence criteria for the secant method.

Convergence criteria for Secant Method


Let f(x) = 0 be the given equation. Let a denote a simple root of the equation f(x) = 0. Then
we have f ( a ) # 0. The iteration scheme for the secant method is

For each i, set E~ = xi - a. Then xi = E~ + a . Substituting in Eqn. (12) we get


Chord Methods for Finding Roots

Now we expand f ( +~a ~)and f ( -~a)~ using Taylor's theorem about the point x = a .

tv(a)
W e get f ( +~a)~= f(a)+ -
1
E~ + ff(a)
-

2
E~ + .
-

since f ( a ) = 0.

Similarly,

,
Therefore f ( ~+! a) - f ( -~ +~a)= ff(a)

Substituting Eqn. ( 1 4 ) and Eqn.(l6) in Eqn. (13). we get

By neglecting the terms involving E~ + E: E ' ~- I the above expression, we get

This relationship between the errors is called the error equation. Note that this relationship
holds only if a is a simple root. Now using Eqn. ( 1 7 ) we will find a numbers p and h such
that

Setting i = j - 1, we obtain
E.= A E P.
J J-1

Taking pth root on both sides, we get

Combining Eqns. ( 1 7 ) and (18), we get


f'(a)
h&P=&.E. -
I 1 - 1 2tv(a)
Solutions of Nan-linear Equations
in one Variable
,
Substituting the expression for ei - from Eqn: (19) in the above expression we get

hep-- f f ( a ) Ei h-l/p &,;/P


1 2f(a)

1.e. h e Pi =-
2f (a)
h-vp &I + l/p ...

Equating the powers of ei on both sides of Eqn. (20) we get

This is a quadratic equation in p. The roots are given by

Sir?cep cannot be negative we ignore the negative value. Hence we have,

Now, to get the number h, we equate the constant terms on both sides of Eqn. (20). Then we
get

constant is [;;;I'1
Hence the order of convergence of the secant method is p = 1.62 and the asymptotic error

--
+

Example 6 :The following are the five successive iterations obtained by secant method to
find the root a = -2 of the equation x3 - 3x + 2 = 0.
x l =-2.6, x,=-2.4, x3=-2.106598985,
L

x4 = - 2.022641412. and x5 =- 2.000022537.


3
Compute the asymptotic error constant and show that e5= - E
3 -1'

Solution : Let f(x) = x3 - 3x + :!

Then
tv(x) = 3x2 - 3. tV(-2) = 9
tv'(x) = 6x. tv'(-2) = -12

Therefore 1. = -
I
[- i r l r= - 0.77835 1205

Now
E~ = I x5 - a 1 = I - 2.000021537 + 2
= 0.000071537
and
- E ~ = I - 2 . 0 2 2 6 4 1 4 1 2 + 2 1=0.022641412.
Hence we get that h E4 ,- E5 Chord Methods tor Findipg Roots

Convergence criterion for fixed point iteration method


Recall that in this method we write the equation in the form
x = g(x)
Let a denote a root of the equation. Let xo be an initial approximation to the root. The
iteration formula is
x i + , =g(xi),i=O, 1 , 2 , . . . . . . (21)
We assume that gf(x) exists and is continuous and I gf(x) I < 1 in an intwal containing the
root a . We also assume that xo, xl, . . . lie in this interval.

Since gf(x) is continuous near the root and I gf(x) I < 1, there exists an interval
] a - h, a + h[, where h > 0, such that I gf(x) 1 I k for some k, where 0 < k < 1.

1 Since a is a root of the equation, we have


a = g(a).
Subtracting (22) from (2 1) we get
. . . (22)

xi + - a = g(xi) - g(a)
Now the function g(x) is continuous in the interval ]xi, a [ and gf(x) exists in this interval.
Hence g(x) satisfies all the conditions of the mean value theorem [see unit 11. Then, by the
mean value theorem there exists a 5 between xi and a such that

Note that 5 lies in ] a - h, a + h[ and therefore I g'(5) I < k and hence


I I I II xi-a I
I
Setting i = 0, 1,2, . . .,n we get
Ixl-al<klxo-a1

This shows that the sequence of approximations {xi) converges to a provided that the initial
approximation is close to the root.

We summarise the result obtained for this iteration process in the following Theorem.

Theorem :If g(x) and '(x) are continllous in an interval about a root a of the equation
B
x = g(x), and if 1 g'(x) < 1 for all x in the interval, then the successive approximations
xl, x2, . . . given by

converges to the root a provided that the initial approximation xo is chosen in the above
interval.

We shall now discuss the order of convergence of this method. From the previous
discussions we have the result.
I xi ,- a I I g'(5) I
+ (xi - a ) I
Solutions of Non-linear Equations Note that 6 is dependent on each xi. Now we wish to determine the constants X and p
in one Variable
independent of xi such that

I xi+]-a I ~c I (xi-a) IP
Note that as the approximations xi get closer to the root a , g!(c) approaches a constant value
g'(a). Therefore, in the limiting case, as i + m, the approximations satisfy the relation ,
I xi+]-al <I gt(a) l I xi-a I
Therefore, we conclude that if g'(a) # 0, then the convergence of the method is linear.
If g'(a) = 0, then we have 1
i+l-a=g(xi)-a

=g[(xi - a ) + a ] - a
By applying Taylor's theorem to the (xi - a12
function g(x) about a and neglecting
higher powers.
= g(a) + (xi - a)g'(a) + 7-
g"(6) - a

since g(a) = a and g'(a) = 0 and 6 lies between xi and a.

Therefore, in the limiting case we have,


1 2
I x i + , - a 1 I - IgM(a)l l x i - a 1
2
Hence, if g'(a) = 0 and g"(a) # 0, then this iteration method is of order 2.

Example 7 :Suppose a and P are the roots of the equation x2 +- ax + b = 0.Consider a


rearrangement of this equation as

x=-- (ax + b)
X

Show that the iteration xi + ,


=
(axi + b)
-- X.
will converge near x = a when I a 1 > 1 pI

Solution :The iterations are given by


(axi + b)
xi + I = g(x.) = - -- , i=O, 1 , 2,...

- to a if
By Theorem 1, these iterations converge I gt(x)
- I < 1 near a i.e. if I gt(x)
- I
1= -
1 X'I
c 1. Note that gt(x) is continuous near a. If the iterations converge to x = a , then we require

Now you recall from your elementary algebra course (MTE-04) that if a and P are the roots,
then
a+p=-aandap=b
Therefore 1 b 1 = 1 a 1 I P 1. Substituting in Eqn. (23), we get
, ~ ~ I ~ > I l~p l I = I ~ I
'Hence 1 a l > l Dl.
Similarlyyou can solve the following exercise. Chord Methods for Finding Roots

~ 9 ;For the equation given in Example 7, show that the iteration xi + , = xi + a will
----

converge to the root x = a , when I a I < 1 p 1.

Finally we siall discuss the convergence of the Newton-Raphson method.

Convergence of Newton-Raphson Method


Newton-Raphson iteration formula is given by

To obtain the order of the method we proceed as'in the secant method. We assume that a is
a simple root of f(x) = 0. Let

Then we have

Now we expand f ( +~a )~and f ( E ~+ a ) , using Taylor's theorem. about the point a . We have

f ( a ) + E~ f '(a) +-
2
'i
2
f"(a) + . . .
I
-
'i+l-
f ( a ) + E~f"(a) + ~f f"(a) + . . .
But f ( a ) = 0 and f ( a ) # 0. Therefore

Hence, by neglecting higher powers of Ei, We get

This shows that the errors satisfy Eqn. (6) with p = 2 and h = '(a) Hence
-

2ff(a)'
Newton-Raphson method is of order 2. That is at each step. the error is proportional to the
square of the previous error.

Now, we shall discuss an alternate method for showing that the order is 2. Note that we can
write (24) in the form x = g(x) where
Sululions of Nun-linear Equations Then
inone Variable

-
- f(x) f'(x)
[f(x)12
Now,
, f(a) f( a )
g ( a ) = -- = 0, since f(a) = 0 and f ( a ) # 0.
ff(a)12
Hence by the conclusion drawn just above Example 7, the method is of order 2. Note that
this is true only if a is a simple root. If a is a multiple root i.e. i f f (a)= 0, then the
convergence is not quadratic, but only linear. We shall not prove this result, but we shall
illustrate this with an example.

Let us consider an example.

Example 8 :Let f(x) = (x - 2)4 = 0. Starting with the initial approximation xo = 2.1,
compute the iterations x,, x2, x3 and x4 using Newton-Raphson method. Is the sequence
converging quadratically or linearly?
Solution :The given function hasinultiple roots at x = 2 and is of order 4.
Newton-Raphson iteration formula for the given equation is

= 41 (3xi - 2)
Starting with xo = 2.1, the iterations are given by

Similarly
x, = 2.05625

x3 = 2.042 1875

x4 = 2.03 1640625

N o w ~ ~ = ~ ~ - 2 = 0E., l= ,x l -2=0.075, ~,=0.05625, ~~=0.0421875,


c4 = 0.03 1640625.

Then

and
3 Chord IWehds for Finding Roog
Thus the convergence is linear in this case. The error is reduced by a factor of - with each
4
iteration. This result can also be obtained directly from Eqn. (25).

You can try this exercise now :

E 10) The quadratic equation x4 - 4x2 + 4 = 0 has a double root at x = 6.Starting with x, =
1.5, compute three successive approximations to the root by Newton-Raphson
me [hod. Does the result converge quadratically or linearly ?

We now end this unit by giving a summary of it.


-

3.5 SUMMARY

In this unit we have


described the following methods for finding a root of an equation f(x) = 0
i) Regula-falsi method :
The formula is

where la, b[ is an interval such that f(a) f(b) < 0.


ii) Secant method :
The iteration formula is

where xo and x , are any two given approximations of the root.


iii) Newton-Raphson method :
The iteration formula is

where xo is an initial approximation to the root.


introduced the concept called convergence criterion of an iteration process
discussed the convergence of the following iterative methods
i) Bisection method
ii) Fixed point iteration method
iii) Secant method
iv) Newton-Raphson method.

3.6 SOLUTIONSIANSWERS
El) i) Letf(x)=xloglox-1.2~0

We have to first find two numbeis a and b such that f(a) f(b) c0. Since
the function loglox is defined only for positive values of x, we consider
on& positive numbers x. Let us take x = 1.2.3, . . .Then, using a calculator,
f(l)= 1 (log1(,1)- 1 . 2 = - 1 . 2 ~ 0
C .

Solutions of Non-linear Equations This shows that f(2) f(3) c 0 and therefore a root lies in 12,3[. NO; pui a = 2
in one Variable and b = 3. Then the first approximation of the root is

. Now f(2.72102) = 2.72102 (loglo2.72102) - 1.2 = 1.18291 - 1.2 < 0. Since


f(2.72102) f(3) < 0, a root lies in the interval ]2.72102,3(. Hence the second
' approximation is

We find f(x2) = - 0.0004 < 0. Therefore the root lies in the interval ]2.7402,3[.
The third approximation is obtained as

Since x2 and x3 rounded off to three decimal places are the same, we stop the
process here. Hence the desired approximate value of the root rounded off to
three decimal places is 2.740.
ii) Let f(x) = x sin x - I
Since f(0) = - 1 and f(2) = 0.8 18594854,a root lies in the interval 10, 2[. The firs)
approximation is

and f(x ,) = -0:0200 1921


Since f(x < 0 and f(2) > 0, the root lies in ]-0.0200192 1,2[.
The second approximation is obtained as
x2 = 1.2 124074
and
f(x2)= -0.00983461.

The root now lies in ] 1.2124074,2[.


Similarly we can calculate the third and fourth approximations as

and
- x 4 = 1.1 1415714
Since x3 and x4 rounded off to three decimal places are the same, we stop the process
here. Hence the desired root is 1.1 14.

E2) Let f(x) = x2 - 2x - 1. Starting with-%= 2.6 and x, = 2.5 the successive
approximations are,
= 2.4 1935484 Chord Methods for Finding Roots

and f(x2) = 0.0 145682.


To find the next approximation we compute

r ~ i m i l a ryou
l ~ can calculate that
x4 = 2.4 1421384
and
x5 = 2.41421356

I
I
Since x4 and x5 rounded off to 5 decimal places are the same, we stop the process
here. Therefore the required root rounded off to 5 decimal places is 2.4 1421.
Now we compare this root with the exact root 1 + fi.Using a calculator we 1 + 6=
1 2.4 142 1. ro~ndedoff to five decimal places. Hence the computed root and exact root
are the same when we round off to five decimal places.

E3) Let f(x) = x3 + x2 - 3x - 3 = 0


i) We first note that f(1) < 0 and f(2) > 0. Therefore a root lies in [I, 21. The first
approximation x, is

and f(xl) =-1.36449 c O

Therefore the root lies in ] 1.57142,2[.


Proceeding similarly, we get the values as given in the following Table.

The table shows that x5 and x4 are correct to three decimal places. Therefore we
stop the process here. Hence the root correct to three decimal places is 1.731.
ii) In secant method we start with two approximatiops a = 1 and b = 2. Then the
first approximation is the same as in part (i), namely
XI = 1.57142
To calculate the next approximation x2 we take b and x,. Here also we
&getting the same value as in part (i), namely

Then we take x, = 1.57142 and x2 = 1.70540 to get the third approximation xj.
We have
. w u t h d Nm-1- F y u h The rest of the values are given by

and
xs = 1.73205
Since x4 and x5 rounded off to three decimal places are the same,we stop here.
Hence the root is 1.732. rounded off to three decimal places.
Let us now compa~the two methods. We first note that 1 % + I - xi 1 gives the
error after ith iteration.
In regula-falsi method, the e r m after 5th iteration is
I x5 - x4 I = 1 1.73194 - 1.73140 1

whereas in secant method, the error after 5th iteration is


I x,-x, I = 1 1.73205- 1.73199 I

This shows that the error in the case of want method is smaller than that in
regula-falsi method for the same number of iterations.
m) The given function f(x) = x3 - 4x + 1 and its derivative f(x) = 3x2- 4 are continuous
everywhere.
The initial approximation is xo = 0.
The iteration formula is

The first approximation is

and f(x ,) = (0.25)3 - 4 (0.25) + 1 = 0.015625

'
The second approximation is given by

Similarly we get
x3 = 0.254101

Since x2 and x3 rounded off to four decimal places are the same. we slop the iteration
here. Hence the root is 0.2541.
-
E5) Let f(x) = 2x 2 - sin x. The f(x) and f (x) are continuous everywhere. Starting with
xo = 1.5, we compute the iterated values by the Newton-Raphson formula. The first
iteration is
+ I

= 1.5
- sin (1.5)
- 21-cos(1.5)
Chad Mctbob for FWhgRoots

Similarly,
x2 = 1.498701

wefind I x2-x, I= 1 1.498701 - 1,498702 1<


Therefore the required mot i s 1.498701
E6) i) Newton-Raphm iterated formula for computing the 6 i s

Starting with %, = 3, we obtain the iterated values as

= 2.82843 1
and xj = 2.828427

Since 1 x3 - x2 1 < lo4, we stop the iteration. Therefore the approximate mot
is 2.8284.

ii) Here the Newton-Raphson formula is

+ ~ I] i = ~ , 1.2...
x i = + [ x i - l xi-

and xo = 10. The iterated values are


XI = 9.55
X, = 9.539398
X, = 9.539392
Since 1 x3 x2 I < - we get the approximate value as 9.5393.

E7) No, because f(x) = -is not continuous at the root x = 0.


3xu3
In(O.0l) - In l(r7]= int [I1.5 129251=
E8) n = int
[ 2. 0.693 147

E9) Here g(x) = - -.The iteration


x+a

converges to a if I g'(x) 1= < I in an internal containing cr in particular


we q u i r e

i.e.(a+aY< 1 b 1.
But we havea+f5=-aand&= b. Therefon we get
p>lbl=lal 1st.
i.e. l a1 < 1 D l .
Solutions of Nun-linear equations E10) The iterated formula is
in one Variable
xf - 2
XI+, =xi-
4xi
The three successive iterations are
x, = 1.458333333
X, = 1.436667 143
x3 = 1.425497619
1 1
Then we get E~ = - E, and e2 = - E,. This shows that the sequence is not quadratically
2 2
convergent, it is linearly convergent.
UNIT 4 APPROXIMATE ROOTS OF
POLYNOMIAL EQUATIONS
Structure
4.1 Introduction
Objectives
4.2 Some Results on Roots of Polynomial Equations
4.3 Birge-Vieta Method
4.4 Graeffe's Root Squaring Method
4.5 Summary
4.6 Solutions/Anrwers

4.1 INTRODUCTION
- 4

In the last two units we discussed methods for finding approximate roots of the equation
f(x) = 0.In this unit we restrict our attention to polynomial equations. Recall that a
polynomial equation is an equation of the form f(x) = 0 where f(x) is a polynomial in x.
Polynomial equations arise very frequently in all branches of science especially in physical
applications. For example, the stability of electrical or mechanical systems is related to the
real part of one of the complex roots of a certain polynomial equation. Thus there is a need
to find all roots, real and complex, of a polynomial equation. The four iteration methods, we
have discussed so far, applies to polynomial equations also. But you have seen that all those
methods are time consuming. Thus it 1s necessary to find some efficient methods for
obtaining roots of polynomial equations.
The sixteenth century French mathematician Francois Vieta was the pioneer to develop
methods for finding approximate roots of polynomial equations. Later, several other
methods were developed for solving polynomial equations. In this unit we shall discuss two
simple methods : Birge-Vieta's and Graeffe's root squaring methods. To apply these
methods we should have some prior knowledge of location and nature of roots of a
polynomial equation. You are already familiar with some results regarding location and .
nature of roots from the elementary algebra course MTE-04. We shall beg~nthis unit by;--
listing some of the important results about the roots of polynomial equations.

Objectives
After reading this unit you should be able to :
apply the following methods for finding approximate roots of polynomial equations
i) Birge-Vieta method
ii) Graeffe's root squaring method.
list the advantages of the above methods over the methods discussed in the earlier
units.

4.2 SOME RESULTS ON ROOTS OF POLYNOMIAL


EQUATIONS

The main contribution in the study of polynomial equations is due to the French
mathematician Rene Descarte's. The results appeared in the third part of his famous paper
'La geometric' which means 'The geometry'.

Consider a polynomial equation of degree n


p(x) = anxn+ an- ,xn - +...+ a,x+ao
Solutions t)fNon-linearb:quatbns where ao, al,.. . ,a,, are real numbers and a,, # 0. You know that the roots of a polynomial
in m e Varbbk
quation need not be real numbers. it can be complex numbers, that is numbers of the form
z =,a + ib where a and b are rea1,numbers. The following results are basic to the study of
roots of polynomial equations.
Theorem 1:(Fundamental Theorem of Algebra) : Let p(x) be a polynomial of degree n 2 I
g;ven by Eqn.(l). Then p(x) = 0 has at least one root; that is there exists a number a E C
such that p(a) = 0. In fact p(x) has n complex roots which may not be distinct.
Theorem 2 :Let p(x) be a polynomial of degree n and a is a real number. Then
p(x) = (x - dr) qo(x) + . . . (2)
I
for some polynomial qo(x) of degree n - I and some constant numbej ro . qo(x) and ro are
called the quotient polynomial and the remainder respectively.
In particular, if a is a root of the equation p(x9 = 0, then ro = 0: that is ( x - a)divlaes p(x).
Then we get
p(x) = (X - a ) qo(x)
How do we determine qo(x) and ro? We can find them by the method of synthetic,divisionof
a polynomial p(x), Let us now discuss the synthetic division procedure.

Consider the polynomial p(x) as given in Eqn. (1)


p(x)=a,,xn+a,,- , x n - I + . . . + a l x + a o

Dividing p(x) by x - a we get


p(x) =%(XI (X - a)+ ro. . . . (3)
where qo(x) is a polynomial of degree n - I and ro is a constant.

Let qo(x) be represented as

qo(x)= bnxn-I + b n - ,xn-* +...+ b,x+bl -


(Note that for convenience we are denoting the coefficients by bl. . . .,b, instead of
.
b,,. bl. . . . b, - I ). Set b,, = ro. Substituting the expressions for %(XI and ro in Eqn. (3) we
get
p(x)=(~-a)(b,xn-'+bn-Ixn-2+ ...+ b-, x + b l ) + b O . . . . (4)
Now. to find b,,. b,. . . . . b, we simplify the right hand side of Eqn. (4) and compare the
. I
coefficients of x'. i = 0.1. . . . n on both sides. Note that p ( a ) = bo. Comparing the
coefficients we get

Coefficient of xn : a,, = b,,. b,, = a,,

' :a,, - , -- b,, - I - a b,.


1
coefficient of x n - b,, - I = a, - + a b,,

coefficientofxk :r4,=bk-abk+,, bk=~+abk+,

Coeficient of x0 :% = bo - a, bo=ao+abl
It is easy to perform the calculations if we write the coefficients of p(x) on a line and Approximate Roots of Pokynomial
+ ,
perform the calculations bk = ak + a bk below ak as given in the table below. Equations

Table 1 : Horner's table for synthetic division procedure.

We shall illustrate this procedure with an example.

Example 1 : Divide the polynomial


p(x) = x5 - 6x4 + 8x3 + 8x2 + 4x - 40
by x - 3 by the synthetic division method and find the remainder.
I

I Solution :Here p(x) is a polynomial of degree 5. If as, a4, a3, a2, a,, a. are the coefficients
of p(x), then the Homer's table in this case is

Table 2

Hence the quotient polynomial qo(x) is

qo(x) = x4 - 3x3 - x2 + 5x + 19

and the remainder is ro = bo = 17. Thus we have p(3) = bo = 17


I
Do the following exercises on the same lines.

E l ) Find the quotient and the remainder when 2x3 - 5x2 + 3x - 1 is divided by x - 2.
-
E2) Using synthetic division check whether a. = 3 is a root of the polynomial equation
x4 + x3 - 13x2- x + 12 = 0 and find the quotient polynomial.

Theorem 3 : Suppose that z = a + ib is a root of the polynomial equation p(x) = 0.Then the
conjugate of z, namely 5,= a - ib is also a root of the equation p(x) = 0, i.e. complex roots
occur in pairs.

We denote by p(-x) the polynomial obtained by replacing x by -x in p(x). We next give an


important Theorem due to Rene Descarte.

Theorem 4 :(Descarte's Rule of signs) : A polynomial equation p(x) = 0 cannot have


more positive roots than the number of changes in sign of its coefficients. Similarly p(x) = 0
cannot have more negative roots than the number of changes in sign of the coefficients
of p(-x).

For example, let us consider the polynomial equation

we count the changes in the sign of the coefficients. Going from left to right there are
changes between 1 and -15, between -15 and 7 and between 7 and -1 1. The total number
in one Vatiable

Here there is only one change between 1 and -15 and hence the equation cannot have more
than one negative root.

We now give another theorem which helps us in locating the real roots.

Theorem 5 :Let p(x) = 0 be a polynomial equation of degree n 2 1. Let a and b be two real
numbers with a < b. Suppose further that p(a) + 0 and p(b) # 0.Then,
i) if p(a) and p(b) have opposite signs, the equation p(x) = 0 has an odd number of roots
between a and b.
ii) if p(a) and p(b) have like signs, then p(x) = 0 either has no root or an even number of
roots between a and b.
Note : In this theorem multiplicity of the root is taken into consideration i.e. if a is a root of
multiplicity k it has to be counted k times.

As a corollary of Theorem 5, we have the following results.

Corollary 1 :An equation of odd degree with real coefficients has at least one real root
whose sign is opposite to that of the last term.

Corollary 2 :An equation of even degree whose constant term has the sign opposite to that
of the leading coefficient, has at least two real r q t s one positive and the other negative.

Corollary 3 :The result given in Theorem 5(i) is the generalisation of the Intermediate
value theorem.

The relationship between roots and coefficients of a polynomial equation is given below.

Theorem 6 :Let cr,, aZ,. . . . anbe n roots (n 1 1) of the polynomial equation

Now, you can try to solve some problems using the above theorems.

E3) How many negative roots does the equation 3x7 + x5 + 4x3 + 1Qx - 6 = 0 have? Also
determine the number of positive roots, if any.
E4) Show that the biquadratic equation
p(x) = x4 + x3 - 2x2 + 4x - 24 = 0 has at least two real roots one positive and the other
negative.

In the next section we shall discuss one of the simple methods for solving polynomial
equations.
4.3 BIRGE-VIETA METHOD
Approximate Roots of Polynomial
Equations I
- the real roots of a ~olynomial
We shall now discuss the Birae-Vieta method for finding . - 1
equation. This method is based on an original method due to two English mathematicians
Birge and Vieta. This method is a modified form of Newton - Raphson method.

Consider now, a polynomial equation of degree n, say


pn(x)=anxn+. . . + a l x +ao=O. . . . (5)
Let xo be an initial approximation to the root a.The Newton-Raphson iterated formula for &
c
improving this approximation is
pn(xl - 1
x =x ------ i=l,2, ... . . .(6)
I - P',,(x, -
To apply this formula we should be able to evaluate both pn(x) and pfn(x,)at any x, . The
most natural way is to evaluate
I I
I P n ( ~ ,=) anx: +an- Ix:- ' + . . . + a2xf + alxI+ a.
pfn(xi)= n anx:- + (n - I) an - I ~ n - 2+ . . +
. 2a2xI + a l
Francois Vieta (1540-1603)

However, this is the most inefficient way of evaluating a polynomial, because of the amount
of computations involved and also due to the possible growth of round off errors. Thus there
is a need to look for some efficient method for evaluating p,(x) and pfn(x).

Let us consider the evaluation of p,(x) and,pfn(x)at x0 using Homer's method as discussed
in the previous section.

I We have

where
P,(x) = (x - xo) 4, - + r0 .

q n - I ( ~ ) =bnxn-I + b n - 2 ~ r- 2 +.. . + b 2 x + b I
and bo = pn(xo) = rO . . . (8)
We have already discussed in the previous section how to find bi, i = 1,2, . . ., n.
I
Next we shall find the derivative pfn(xO)using Homer's method. We divide
I
q n - 1 ( ~by
) (X- xo) using Homer's method. That is, we write

Comparing the coefficients, we get ci as given in the following table

Table 3 . . .

As observed in Sec. 1, we have


C, = q,, - ,(xO).
Now, from Eqns. (7) and (8), we have
p,(x) = (x - x0) qn - 1 6 ) + pn(xo).
Solutions of Non-linear Equations Differentiating both sides of Eqn. (10) w.r.t. x, we get
in one Variable
P'"(x> = 4, - l(x) + (x - x-,J 4', - I(x). . . . (1 1)
Putting x = xo in Eqn. (1 I), we get

pt,(xo) = e,-
Comparing (9) and (12), we get
P',,(x~) = q,, - 1 ( ~ ( =
$ c1
Hence the Newton-Raphson method (Eqn. (6)) simplies to

We surnmarise the evaluation of bi and ci in the following table.

Table 4

Let us consider an example.

Example 2 : Evaluate p'(3) for the polynomial

Solution :Here the coefficients are a. = -40, a l = 4, a;! = 8, a3 = 8, a4 = -6 and a5 = 1. To


compute bo, we form the following table.

Table 5
-6 8 8 4 4 0

Therefore p' (3) = 25


To get some practice, why don't you try the following exercises.

E5) Using synthetic division, show that 2 is a simple root of the equation
p(x) = x4 - 2x3 - 7x2 + 8x + 12 = 0.
E6) Evaluate p (0.5) and p'(0.5) for
p(x) = -8x5 + 7x4 - 6x3 + 5x2 - 4x + 3

Now we shall illustrate why this method is more efficient than the direct method. Let us
consider an example. Suppose we want to evaluate the polynomial
p(x) = -8x5 + 7x4 - 6x3 + 5x2 - 4x + 3
for any given x.
When we evaluate by direct method, we compute each power of x by multiplying with x the Approxirmite Roots of Polynomial
preceding power of x as Equations

'x3 = x(x2), x4 = x(x3) etc.


Thus each term cxk takes two multiplications for k > 1. Then. the total number of
multiplications involved in the evaluation of p(x) is 1 + 2 + 2 + 2 + 2 = 9.

When we use Homer's method the total number of multiplications is 5. The number of
additions in both cases are the same. This shows that less computation is involved while
using Homer's method and thereby reduces the error in computation.

, Let us now solve some problems using Birge-Vieta method.


1

Example 3 : Use Birge-Vieta method to find all the positive real roots, rounded off to three
decimal places, of the equation
xu+7x3+24x2+x- 1 5 = 0

I Stop the iteration whenever I xi + - xi I <0.W1


Solution : W{ first note that the given equation
b4(x) = x4+ 7x3 + 24x2+ x - 15 = 0

is of degree 4. Therefore, by Theorem 1, this equation has 4 roots. Since there is only one
change of sign in the coefficients of this equation, Descarte's rule of signs (see Theotem 4),
states that the equation can have at most one positive real root.
I Now let us examine whether the equation has a positive real root.
i Since p4(0) = - 15 and p4( 1) = 19, by Intermediate value theorem, the equation has a root
lying in 10, l[.

We take xo = 0.5 as the initial approximation to the root. The first iteration is given by

Now we evaluate p4 (0.5) and p'4 (0.5) using Homer's method. The results are given in the
following table.

Table 6

-7 5625
Therefore xl = 0.5 - = 0.7459
30.75

The second iteration is given by


Solutions of Non-linear Equations Using synthetic division, we form the following table of values
in one Variable
Table 7
1 7 24 1 -15

Therefore x2 = 0.7459 - -
2'3 32 - 0.6998
50.1469

Third iteration is given by

Table 8
7 24 1 -15
0.6998 0.6998 5.388 1 20.5649 1 5.0905

For the fourth iteration we have


~~(0.6978)
X =x -
4 3 ptn(O.6978)
Table 9

Since x3 and x4 are the same. we get I x4 - xj 1 < 0.0001 and therefore we stop the iteration
here. Hence the approximate value of the root rounded off to three decimal places is 0.698.

Next we shall illustrate how Birge-Vieta's method helps us to find all real roots of a
polynomial equation.

Consider Eqn. (4)

If a is a root of the equation p(x) = 0, then p(x) is exactly divisible by x - a, that is, bo = 0.
In finding the approximations to the root by the Birge-Vieta method, we find that bo
approaches zero (bo + 0)as x, approaches a (xi + a).Hence. if xn is taken as the final
approximation to the root satisfying the criterion ( xn - xn - I 1 < E, then to this
approximation, the required quotient is
q n - 1 ( ~=) bnxn-I + b n - I x n - 2+ . . . + b,
where b'is are obtained by using xn and the Homer's method. This polynomial is called the Approximate Roots of Polynomial
deflated polynomial or reduced polynomial. The next root is now obtained using qn - ,(x) Equations

and not p,(x). Continuing this process, we can successively reduce the degree of the
polynomial and find one real root at a time.

Let us consider an example.

Example 4 :Find all the roots of the polynomial equation p3(x) = x3 + x - 3 = 0 rounded off
,
to three decimal places. Stop the iteration whenever 1 xi - xi 1 < 0.0001.
+

Solution : The equation p3(x) = 0 has three roots. Since there is only one change in the sign
of the coefficients, by Descarts' rule of signs the equation can have at most one positive real
root. The equation has no negative real root since p3(-X) = 0 has no change of sign of
coefficients. Since p3(x) = 0 is of odd degree it has at least one real root. Hence the given
equation x3 + x - 3 = 0 has one positive real root and a complex pair. Since p(1) = -1 and
p(2) = 7, by intermediate value theorem the equation has a real root lying in the interval
]1,2[. Let us find the real root using Birge-Vieta Method. Let the initial approximation
be 1.1.

First iteration

Table 10

Therefore x, = 1.1 - -- = 1.22289


4.63

Similarly, we obtain
x2= 1.21347

Since I x2 - x3 I < 0.0001, we stop the iteration here. Hence the required value of the root is
1.213, rounded off tb three decimal places. Next let us obtain the deflated polynomial of
p3(x). To.get the deflated polynomial, we have to find the polynomial q2(x) by using the
final approximation x3 = 1.213 (see Table 11).

Table 11

3 )- 0.0022. That is, the magnitude of the error in satisfying p3(x3)= 0 is


Note that ~ ~ ( 1 . 2 1=
0.0022.

This is a quadratic equation and its roots are given by


-1.213 f d(1.213)~- 4 x 2.4714
X =
2
- -1.213+ 2.9009 i
2
Solutions of Non-linear Equations Hence the three roots of the equation rounded off to three decimal places are 1.213,0.6065
in one Variable, + 1.4505 i and -0.6065 - 1.4505 i.
Remark :We now know that we can determine all the real roots of a polynomial equation
using deflated polynomials. This procedure reduces the amount of computations also. But
this method has certain limitations. The computations using deflated polynomials can cause
unexpected errors. If the roots are determined only approximately, the coefficients of the
deflated polynomials will contain some errors due to rounding off. Therefore we can expect
loss of accuracy in the remaining roots. There are some ways of minimizing this error. We
shall not be going into the details of these refinements.
Before going into the next section, you can try these exercises.
-.

E7) Fihd an approximation to one of the roots of the equation

using Birge-Vieta method starting with the initial approximation xo = -2. Stop the
iteration whenever 1 xi + , - xi 1 < 0.4 x
E8) Find all the roots of the equation x3 - 2x - 5 = 0 using Birge-Vieta method.
E9) Find the real root rounded off to two decimal places of the equation
x4 - 4x3 - 3x + 23 = 0 lying in the interval ]2,3[ by Birge-Vieta method.

4.4 GRAEFFE'S ROOT SQUARING METHOD

In the last section we have discussed a method for finding real roots of polynomial
equations. Here we shall discuss a direct method for solving polynomial equations. This
method was developed independently by three mathematicians Dandelin, Lobachevsky and
Graeffe. But Graeffe's name is usually associated with this method. The advantage of this
method is that it finds all roots of a polynomial equation simultaneously; the roots may be
real and distinct, real and equal (multiple) or complex roots.

The underlying idea of the method is based on the following fact :,Suppose PI, PZ, . . . , Pn
ate the n real and distinct roots of a polynomial equation of degree n such that they are
widely separated, that is,
I PI I >>I P21 >> I P31 >>...>>I Pnl
where >> stands for 'much greater than'. Then we can obtain the roots approximately from
the coefficients of the polynomial equation as follows :
Let the polynomial equation whose roots are P,, P2, . . . , Pn be

Using the relations between the roots and the coefficients of the polynomial as given in Sec.
4.2, we get
1
Since
-
I PI I >> I P2 I >> I P3 I >; . . . >> I Pn 1, we have from ( I 4) the a p p i l n a t i o n s Approximate Roots of Polynomial
Equations

4
These approximations can be simpfified as

J
So the problem now is to find from the given polynomial equation, a polynomial equation
whose roots are widely separated. This can be done by the method which we shall describe
now.

In the present course we shall discuss the application of the method to a polynomial
equation whose roots are real and distinct.

Let a t , a 2 , . . . ,a, be the n real and distinct roots of the polynomial equation of degree n
given by

where ao, a t , a2, . . . ,an - I , a, are real numbers and an # 0. We rewrite Eqn. (17) by
collecting all even terms on one side and all odd terms on the other side, i.e.

Squaring both sides of Eqn. ( 18), we get

Now we expand both the right and left hand sides and simplify by collecting the
coefficitnts. We get
Solutions of Nun-linear Equations Putting x' = -y in Eqn. (19), we,obtain a new equation given by
in one Variable

where
b0=$

2
bn = an

The following table helps us to compute the coefficients bo, b,, . . . ,bn of Eqn. (20) directly
from Eqn. ( 17).

Table 12
a~ a3... "n

4 a: 4 af at
o -2a,,a2 -2ala3 -2a2a4 o
o o %a4 -2a,a5 o '
0 0 0 -2aoa6 0

"0 bl b2 b,. . . b"

To form Table 12 we first write the coefficients ao, al , a2. . . . ,an as the first row. Then we
form (n + I ) columns as follows.

The terms in each column alternate in sign starting with a positive sign. The first term in
each column is the square of the coefficients ak, k = 0, 1 , 2, . . . , n. The second term in each
column is twice the product of the nearest-neighbour1ng coefficients. if there are any, with
negative sign: otherwise put it as zero. ~ 0 7 e x a m ~ lthe
e . second term in theflrst column is
zero and second term in the second colu~nnis -?ao aT Likewise the second term of the
,
(k + I )th column is -?ah - ah I.The third term in the (k + 1)th column is twice the
,,
+

product of the next neighbouring coefficients ak-,and ak + if there are any, otherwise put
it as zero. This procedure is continued until there are no coefficients available to form'the
cross products. Then we add all the terms in each column. The sum gives the coefficients bh
for k = 0. I , 2. . . . , n which are listed as the last term in each column. Since the substitution
x' = -y is used, it is easy to see that if aI, a2,. . . , anare the n roots of Eqn. (17). then
2,a2,. . . , c( are the roots of Eqn. (20).
- a;

Thus, starting with a given polynomial equation, we obtained another polynomial equation
whose roots are the squares of the roots of the original equation with negative sign.

We repeat the procedure for Eqn. (20) and obtain another equation

whose roots are the squares of the roots of Eqn. (20) with a negative sign i.e., they are fourth
powers of the roots of the original equation with a negative sign. Let this procedure be
repeated n times.Then, we obtain an equation
.L
Approximate Roots of Polynomial
whose roots yl, y2, .... *I, are given by Equations

Now, since all the roots of Eqn. (17) are real and distinct, we have
I a l I > I a 2 1 > ........> I a n /
Hence I yl 1 >> I y2.1 >> . . . . . . . . . >> 1 yn I.

We conclude that if the roots of Eqn. (17) are distinct then for large m, the 2"'th powers of
the roots are widely separated.

We stop this s q u h n g process when the cross product terms become negligible in
comparison to square terms. .

Since roots of Eqn. (2 1) are widely separated we calculate the absolute values of the roots '
yl, y2, .... "1, using Eqn. ( 16). We have

The magnitude of the roots of the original equation are therefore given by

This gives the magnitudes of the roots. To determine the sign of the roots, we substitute
these approximations-inthe original equation and verify whether positive or negative value
satisfies it.

We shall now illustrate this method with an example.

Example 5 :Find all the roots of the cubic equation x3 - 15x2+ 62x - 72 = 0 by Graeffe's
method using three squarings.

Solution :Let P3(x) = x3 - 15x2+ 62x - 72 = 0.

The equation has no negative real roots. Let us now apply the root squaring method
successively. The we get the following results :
Solutions of Non-linear Equations First Squaring
in one Variable
Table 13

Therefore the new equation is


x 3 + 101x2+ 1684x+5184=0.

Applying the squaring method to the new equation we get the lollowing results.

Second Squaring

Table 14

Thus the new equation is

For the ~hird\quaring. we have the following results.

Third Squaring

Table 15
--- - --
26873856 1788688 6833 1

[ x 2 0 4 I4 x 10" 3.1994048 x 10'' 46689889 1

Hence the new equation is

After three squarings. the roots y l , y, and y., of this equation are given by
%, a3of the original equation are
Hence, the roots a,, ~ ~ ~ r o i i r nRoots
n t e of Polynomial
Equations

Since the equation has no negative real roots, all the roots are positive. Hence the rooqs can
be taken as 9.0017,4.001 1 and 1.9990. If the approximations are rounded to 2 decimal
places, we have the roots as 9.4 and 2. Alternately, we can substitute the approximate roots
in the given equation and find their sign.

You can try these exercises now.

E10) Determine all roots of the following equations by Graeffe's root squaring method.
using three squarings.

ii) x3-2x2-5~+6=0

We have seen that Graeffe's root squaring method obtains all real roots simultaneously.
There is considerable saving in time also. The niethod can be extended to find multiple and
complex roots also. However the method is not efficient to find these roots. We shall not
discuss these extensions.
We shall end this block by summarising what we have covered in this unit.

4.5 SUMMARY

In this unit we have


discussed the following methods for finding approximate roots of polynomial
equations
i) Birge-Vieta method
ii) Graeffe's root squaring method
mentioned the advantages and disadvantages of the above methods.

El? Let p(x) = 2x3 - 5x2 + 3x - 1


Here a3 = 2, a2 =-%a, = 3, a. = -1 and a = 2. The Homer's table in this case is as
follows :

Hence the quotient polynomial is qo(x) = 2x2 - x + I and the remainder is ro = 1.


Solutions of Non-linear Equations E2) Form the H o m r ' s table in this case. From the table you can see that the last term in
in one Variable the 3rd row is zero. Hence 3 is a root of the equation. The quotient polynomial is
x3+4x2-x-4.
E3) The equation p(x) = 3x7 + x5 + 4x3 $ lox - 6 = 0 has no negative real root, since there
are no changes in the sign of cdefficients of p(-x).
Since there is one change in the sign of coefficients of p(x) the equation can have at
most one positive real root. Since the equation is of odd degree it has at least one real
root which is positive.
Since, f(0) = -6 < 0
f(1) = 12 > 0.
the equation has a positive root lying between 0 and 1 (by IV theorem).
E4) The given equation p(x) = x4 + x3 - 2x2 + 4x - 24 = 0 is of degree 4 i.e. even degree.
The sign of the constant term is negative whereas the sign of the leading coefficient is
positive. Therefore by corollary 2, the equation has two real roots, one positive and the
other negative.

E5) The Homer's table is as follows :

Since p(2) = 0 and p' (2) = -12,2 is a simple root.


E6) p(0.5) = 1.6875, p'(0.5) = -3.875
E7) The given equation is p(x) = 2x4 - 3x2 + 3x - 4 = 0.
The initial approximation is xo = -2. Then the I st iteration is

p(-2) and p' (-2) are given by the following table.

10
Therefore x , = -2 - r-
49
= 1.796

Repeating the procedure to find x2, we have


Approximate Roots of Polynomial
Therefore x2.= -1.796 - - '742 - 1.7425 Equations
- 32.565
To find x3, we have

.I018
Therefore x3 = -1.7425 +-
28.8770

Since I x3 - xl I < 0.0035 < 0.4 x lop2,we conclude that 1.7390 is the approximate
' root.
E8) Let p(x) = x3 - 2x - 5
Since there is only one change in the sign of the coefficients of p(x), the equation has
at most one real root. The equation has no negative real root since there is no change
in the sign of the coefficients of p(-x). Also

and
p(3)= 1 6 > 0
Therefore aroot lies in ]2,3[. Using xo = 2.5 as an initial approximation to the root,
you can show that 2.0945 is an approximation to the real root.
The deflated polynomial is given by the following table

Therefore we get the deflated polynomial as p2(x) = x2 + 2.0945 x + 2.3869 = 0. The


roots of this equation are given by

=-1.0473+ i 1.1359
Hence the roots are given by 2.0945, -1 .0473 + 1.1359 i, -1 .@I73 - 1.13359 i.

E10) i) The given equation is x3 +6x2 - 36x + 40 = 0

First squaring
bdutions of Non-linear Equations Second squaring
in one Variable
1600

Third squaring

Hence the new equation is


x3 + 1oxx2+ (.05 120 x 1 0 ' ~ ) x + .65536 x = 0.
The roots y,, y2 and y3 of this equation are given by

I
Hence the roots of the original equation are given by

Substituting the computed values in the original equation, we get that the roots
are approximately - 10, 2.18 and 1.83. Therefore the roots are -10, 2 and 2.
ii) Computed values of the roots are 3.014443. 1.991424 and 0.9994937.
iii) Computed values of the roots are 7.017507, -2.974432,0.958 1706.
DIRECT METHODS
' Structure
, 5.1 Introduction
i 5.2 Preliminaries
5.3 Cramer's Rule
i 5.4 Direct Methods for Special Matrices
I 5.5 Gauss Elimination Method
I
I 5.6 LU Decomposition Method
I
, 5.7 Summary
5.8 Solutions/Answers

One of the commonly occurring problems in applied mathematics is finding one or


more roots of an equation f(x)=O. In most cases explicit solutions are not available
I
and we are satisfied with being able to find one or more roots to a specified degree of
accuracy:Tii Brock 1, we ha;e discussed various numerical methods foi finding'th2
roots of an equation f(x)=O. There we have also discussed the convergence of these
, methods. Another important problem of applied mathematics is to find the solution
of systems of linear equations. Systems of linear equations arise in a large number of
areas, both directly in modelling physical situations and indirectly in the numerical
solution of other mathematical models. These applications occur in all areas of the
1
physical, biological and engineering sciences. For instance, in physics, the problem
of steady state temperature in a plate is reduced to solving linear equations.
Engineeeng problems such as determining the potential in certain electrical
networks, stresses in a building frame, flow rates in a hydraulic system etc. are all
reduced to solving a set of algebraic equations simultaneously. Linear algebraic
systems are also involved in the optimization theory, least squares fitting of data,
numerical solution of boundary value problems for ordinary and partial differential
equations, statistical infe-ce etc. Hence, the numerical solution of systems of linear
algebraic equations play a very important role.
, Numerical methods for solving linear algebraic systems may be divided into two
types, direct and iterative. Direct methods are those which, in the absence of
I
round-off or other errors, yield the exact solution in a finite number of elementary
arithmetic operations. Iterative methods start with an initial approximation and.by
applying a suitably chosen algorithm, lead to successively better approximations.
T o understand the numerical methods for solving linear system of equations it is
necessary to have some knowledge of the properties of matrices. You might have
already studied matrices, determinants and their properties in your linear algebra
course (ref: MTE-02). However, we begin with a quick recall of few definitions here.
In this unit, we have also discussed~somedirect methods for finding the solution of
system ,of linear algebraic equations.

Objectives
After studying this unit, you should be able to:
state the difference between the direct and iterative methods of solving the system
of linear algebraic equations;
obmin the solution of system of linear algebraic equations by using the direct
methods such as Cramer's rule, Gauss elimination method and LU decomposition
method;
use the pivoting technique while transforming the coefficient matrix t o upper or
lower triangular matrix.
Two matr~cesA = (a,,) and B = (bil) are equal iff they navethe same number or rows
i n d columns and their corresponding elements are equal, tha't is, a,, = bi, for all i, j.
You must also be familiar with the addition and multiplication of matrices.
Addition of matrices is defined only for matrices of same order. The sum C = A B +
of two matrices A and B, is obtained by adding the corresponding elements of A and
R , i.e., cij = aij + bij.
-4 6 3 5 -1 0
For example, if A =
[ 0 1 2 1 andB = [ 3 1 O]7then

Product of an m x n matrix A = (aij) and an n x p matrix B = (blk)is an m x p matrix C


C = AB, whose (i,k)th entry is
I1

cik = 2 a, bjk = ail bIk + a i 2 q r + ....+ain bnk


,=I

That is, to obtpin the (i,k)th element of AB, take the ith row of A and kth column
of B, multiply their corresponding elements and add up all these products. For
example, if

Note that two matrices A and B can be multiplied only if the number of columns of
A equals the number of rows of B. In the above example the product B A is not
defined.
The matrix obtained by interchangin the rows and columns of A is called the
transpose of A and is denoted by A 7g

IfA = [ -1
3]
1
then^^ = [:-:]
Determinant 1s a number associated with square matrices.

For a 2 x 2 matrix A = l a 1 1 a121

det (A) = det [::: :I] = a11.22 - a12a21

t
det (A) = a,, det
[ a22

a32 a33
a23

] - det [ a33 ] +al3 det


[ ::]
A determinant can be expanded about any row or column. The determinant of an
n x n matrix A = (aij) is given by det (A) = (-I)'+' a,! det(Ail) +(-1)i+2a,2
+
det fAi2) ...+ (- l)'+"a,, det (A,,), where the determinant is expanded about the
ith row and Aii is the (n-1) X (n- 1) matrix obtained from A by deleting the ith row
slution L1n-r Algeb-k Qualions and jth column and,i 5 i 5 n. Obviously, computation is simple if det (A) is expanded
along a row or column that has maximum number of zeros. This reduces the number
of terms to be computed.
The following example will help you to get used to calculating determinants.
Example 1 :

I IA( = (-l)'+ ' x l x (A,,I + ( - I ) ' + ~ x ~ x1 ~ , ~ 1 . + ( - 1 ) ' + ~ 1x ~61x3 =


You may now try thls exercise.
1 5-6-78= -79

El

IfA = [-i -:-i]


-

-
, calculate det (A).

If the determinant of a square matrix A has the value zero, then the matrix A is called
a singular matrix, otherwise, A is called a nonsingular matrix.

I We shall now give some more definitions.


Definition : The inverse of an n X n nonsingular matrix A 1s an n X n matrix B having
the property
AB=BA=I
where I is an ident~tymatrix of order n x n .
The inverse matrix B if it exists. is denoted by A-' and is unique.

I Definition : For a matrix A = (aij), the cofactor Aij of the element aij is given by
AiJ = (-l)'+J MiJ
where Mij(minor) is the deterrhinant of the matrix of grder (n-1) x (n-1) obtained
from A after deleting its ith row and the jth column.
Definition : The matrix of cofactors associated w ~ t hthe n x n matrix A is ann)cn matrix
A' obtained from A by replacing each element of A by its cofactor.

Let us now consider a system of n linear algebraic equations in n unknowas


a l l x l + a12x2'+ ...,. + a,, x, = b1
a 2 1 ~+ 1 az2x2 + .... S' a2, X, = b2 ... (1)

anlxl + an2x2+ .... + a,, X, = b,


where
1
a,, alz.. ....aln
a,, a,,.. ....azn
1

A is called the coefllicient matrix and has real elements.


Our problem is t o find the values xi, i=1,2 ....n if they exist, satisfying Eqn. (2).
Before we discuss some methods of solving the system (2), we give the following
definitions.

Definition : A system of linear Eqns. (2) is said to be consistent if it has at least one
solution. If no solution exists, then the system is said to be inconsistent.

Definition : The system of Eqns. (2) is said to be homogene6us if b = 0, that is, all
the elements bl, b,,. ...,b, are zero, otherwise the system is called nonhomogeneous,
In this unit, we shall consider only nonhomogeneous systems.
You also know from your linear algebra that the nonhomogeneous system of Eqns.
(2) has a unique solution, if the matrix A is nonsingular. You may recall the following
basic theorem on the solvability of linear systems (Ref. Theorem 4, Sec. 9.5, Unit 9,
Block 3, MTE-02).

Theorem 1 : A nonhomogeneous system of n linear equations in n unknowns has a


unique solution if and only if the coefficient matrix A is nonsingulan.~
If A is nonsingular, then A-' exists, and the soluti~nof system (2) can be expressed as
x =~-'b.
In case the matrix A is singular, then the system (2) has no solution if b f 0 or has
an infinite number of solutions if b = 0 . Here we assume that A is a nonsingular
matrix.
As we have already mentioned in the introduction, the methods of solution of the
system (2) may be classified into two types :
i) Direct Methods : which in the absence of mund-off errors give the exact
solution in a finite number of steps.

convergeto the exact solution vector x as the number of iterations k -


ii) Iterative Methods : Starting with an approximate solution vector xC0),these
methods generate i sequence of approximate solution vectors {xtk') which
m. Thus
iterative methods are infinite processes. Since we perform only a finite number of
iterations, these methods can only find some approximation to the solution vector
x. We shall di~cussiterative methods later in Units 7 and 8.
In this unit we shall discuss only the direct methods. You are familiar with one such
method due to the mathematician Cramer and known as Cramer's Rule. Let us
briefly review it.

5.3 CRAMER'S RULE


In the system (2), let d = d d t ( ~ f) 0 and b f 0. Then the solution of the system is
obtained as
x , = d i / d , i = 1,2......n (3)
where d, is the determinant of the matrix obtained from A by replacing the ith column
of A by the column vector b. Let us illustrate the method through an example.
solatbn of Linear Equat'oos Example 2 : Solve the system of equations.
I

I
+
3x1 x2 + 2x3 = 3
2x1 - 3x2 - X j = -3
I
XI + 2x2 + X j = 4

using Cramer's rule.


- Sdution : We have,

3 1 2
dl = -3 -3 -1 = 8 (first column in A is replaced by the column vector b)
4 2 1

3 3 2
d2 = 2 -3 -1 = 16 (second column in A is replaced by the column vector b)
1 4 1

3 1 3 '
d3 = 2 -3 -3 = -8 (third column in A is replaced by the column vector b)
1 2 4
Using (3), we get the solution

You may now try the following exercises.

E2) Solve the system of equations


3x1 + 5x2 =0
+ 2x2 - X 3 = 0
-XI

3x1 - 6x2 + 4x3 = 1


using Cramer's rule.
E3) Solve the system of e q u a--
~ons
x,+ 2x2 - 3xj + X4 = -5
xz + 3x3 + ~4 7 6
2x1 + 3 x 2 + X j + X4 = 4
x1 +x3+x4= 1
using Cramer's rule.

While going through the example and attempting the exercises you must have
observed that in Cramer's method we need to evaluate n + l determinants each of
order n, where n is the number of equations. If the number of operatildns required
to evaluate a determinant is measured in terms of multiplicatiops otlly, then to
evaluate a determinant of second order, i.e.,

we need two multiplications or (2-1) 2! multiplications. To evaluate a determ~nantor


t'hud order
Also for a system of n equations, Cramer's rule requires n + l determinants each of
order n and performs n divisions to obtain xi, i = 1,2,. ..,n. Thus the total number of
multiplications and divisions needed to solve a system of n equations, using Cramer's
rule becomes
M = total number of multiplications +total number of divisions
= ( n + l ) (n-l)n! + n
In Table 1, we have given the values of M for different values of n.

Table 1

Number of equations Number of operations


n M
2 8
3 51
4 364
3 2885
6 25206
7 241927
8 2540168
9 29030409
10 359251210

From the table, you will observe that as n increases, the number of operations
required for Cramer's rule increases very rapidly. For this reason, Cramer's rule is not
generally used for n>4. Hence for solving large systems, we need more efficient
methods. In the next section we describe some direct methods which d e ~ e n don the
form of the coefficient matrix.

We now discuss three special forms of matrix A in Eqn. (2) for which the solution
vector x can, be obtained directly.

Case 1 :A'= D , where D is a diagonal matrix. In this case'the system'of Eqns. (2)
are of the form
a,, xl ......................... = b,
a22 X2 =b2

and det (A) = a,, a 2 ....


~ a,,.
:.Since the matrix A is nonsingular, aii f 0 for i = 1,2,....,n and we obtain the solution
as
111,i = 1,2,....,n.
x.I = b.la..

Note that in this case we need only n divisions to obtain the solution vector.
:'Case 2: A = L, where L is a lower triangular matrix (aij = 0,j>i). The system of
Eqns. (2) is now of the form

+
anlxl + an2x2+ an3x3 ...+ annxn = b,
and det (A) = alla 22...a,,.
You may notice here that the first equation of the system (4) contains only xl, the
second equation contains only xl and x2 and so on. Hence, we find xl from the first
equation, x2 from the second equation and proceed in that order till we get xn from
the last equation.
Since the coefficient matrix A is nonsingular, aii # 0, i = 1,2,...,n. We thus obtain
x1 = bl/all
X2 = (b2-a21~1)/a22
X3 = (b3-a31x1 - a32~2)/833

In general, we have for any i

xi = (bi - z
i-1

j=l
aijxj)/aii,i = 1,2,. ...,n

For eitample,. consider the system of equations

From the first equation we have,


X1 = 1

From the second equation we get,

and from the third equation we have,


5+x1-3x2 -
X3 = - - -3
2 2'
Since the unknowns in this method are obtained in the order xl,x2,,,. .,Xn,this method
is called the forward substitution method.
The total number of multiplications and divisions needed to obtain the complete
solution vector x, using this method is I

M = 1 + 2 +....+ n = n(n+1)/2.
Case 3: A = U, where U is an upper triangular matrix (aij = 0, j<l). The system (2)
is now of the form
,
a,,x, + a12x2+ a13x3+...+ alnxn = bl
az2x2+ a 2 3 ~+3 ...+ a2,xn = b2
a 3 3 ~+
3 ...+ ajnxn = b3 (6)

an-1.n-1%-I t an-1,nXn = bn-1


an,, Xn = bn
and d e t < ~ =
) a , la22...ann.
You may notice here that the nth (last) equation contains only x,, the (n-1)th
equation contains xn and xn-I and so on. We c a n obtain x, from the nth equation,
xn-! from the (n- 1)th equation and proceed in that order till we get x1 from the first
equation. Since the coefficient matrix A is nonsingular, aii# 0, i = 1,2,...,n and we
obtain
Since the unknowns in this method are determined in the order xn,x,-,,..., xl, this
method is called the back substitution method. The total number of multiplications
and divisions needed to obtain the complete solution vector x using this method is
again n(n+ 1)D.
Let us consider the following example.

Example 3 : solve the linear system of equations

Solution : From the last equation, we have


x3 = 3.
From the second equation,
. . we have

Hence, from the first equation,


wk get
(5-3.2+3)
= pl-al2~2-sI3~3 = = 1
a11 2
;You may now try the following exercises :

E4) Solve the system of equations

using forward substitution method.


ES) Solve the sys_temof equations
XI - 2x2 + 3x3 - 4xq + 5x5 = 3

xs = 1
using backward substitution method.
I

I
In the above discussion you have observed that the System of Eqns. (2) can be easily 1
solved if the coefficient matrix A in Eqns. (2) has one of the three forms D,L or U
or if it can be transformed to one of these forms. Now, ygu would like to know how
to reduce the given matrix A into one of these three forms? One such method which
I
transforms the matrix A to the form U is the Gauss elimination method which we
shall descriM in the next section:
I
I
I
I
5.5 GAUSS ELIMINATION METHOD I

I
I
Gauss elimination is one of the oldest and most frequently used methods for solving I
1 systems of algebraic equations. It is attributed to the famous German mathematician, Gmua (1777-1855) 1
-
Carl Friedrick Gauss (1777 1855). This method is the generalization of the famjliar 13 ,
Solution of Linear Algebraic Equations method of eliminating one unknown between a pair of simultaneous linear equations.
You must have learnt this method in your linear algebra course (Ref. : Sec 8.4,
Unit 8, Block 2, MTE-02). In this method the matrix A is reduced to the form U by
using the elementary row operations which include :
i) interchanging any two rows
ii) multiplying (or dividing) any row by a non-zero constant
'

.iii) adding (or subtracting) a constant multiple of one row to another row.
The operation Ri + mRj is an elementary row operation, that means, add to the
elements of the ith row m times the corresponding elements of the jth row. The
elements in the jth row remain unchanged.
If any matrix A is transformed into another matrix B by a series of elementary row
operations, we say that A and B are equivalent matrices. Formally, we have the
following definition.
Definition : A matrix B is said to be row equivalent to a matrix A , if B can be obtained
from A by using a finite number of elementary row operations.
Also two linear systems Ax = b and A'x = b' are equivalent provided any solution
of one is a solution of the other. Thus, if a sequence of elementary operations on
Ax = b produces the new system A*x = b* then the systems Ax = b and A*x = b*
are equivalent.
T o understand the Gauss elimination method let us consider a system of three
equations :
allxl + a12x2 + a13x3 = bl
a21xl + a22x2 + a23x3 = b2 @I
a3lxl + a32x2 + a33x3 = b3
Let all # 0. In the first stage of elimination we multiply the first equation in Eqns. (8)
by mzl = (-azl/all) and add to the second equation. l'hen multiply the first equatlon
by m3] = (-a31/a11) and add t o the third equation. This eliminates x1 from the second
and third equations. The new system called the first derived system then becomes

a(')
32
x2 + a)$, x3 = b(')
3
where,

a
b(') = b3 - A bl
3 /
1
In the second stage of elimination we multiply the second equation in (9) by
m 3= ~ (-a::)lai:)), a:;) # 0 and add to the third equation. This ellminates x2 from the
third equation. The new system called the second derived system becomes
where .*b.

You may note here that the system of Eqns. (11) is an upper triangular system of the
form (6) and can be solved using the back substitution method provided ag) # 0.

Let us illustrate the method through an example.

Example 4 : Solve the following linear system


2x1 + 3x2 - X 3 = 5
4x1 + 4x2 - 3x3 -3
-2x1 + 3x2 - Xg = 1
using Gauss elimination method.

Solution : To eliminate x1 from the second and third equations of the system (13)
add 3 = -2 times the first equation to the second equation and add -(-2)/2=1
2
times the first equation to the third equation. We obtain the new system as

In the second stake, we eliminate x2 from the third equation of system (14). Adding
-6/(-2) = 3 times the second equation to the third equation, we get
2x, + 3x2 - i3= 5
- 2x2 - X j = -7 (15)
- 5x3 = -15

System (15) is in upper trigngular form and its solution is


x3 = 3, X2 = 2, X1 = 1.

You may observe that we can write the above procedure moreconveniently in matrix
form. Since the arithmetic operations we have performed here affect only the
elements of the matrix A and the vector b, we consider the augmented matrix i.e.
[Alb] (the matrix A augmented by the vector b) and perform the elementary row
operations on the augmented matrix.

all a12 a13


(symbol = means equivalent to)
a(')
a(') a(') b(') R3 - 2 R2
32 a(l)
22
baaiuuomd m m r E s u . t ~ o ~which
is in the desired lorm where, 4;)a::),
, a!:), a!:), by), by), ag), af) are given by
Eqns. (10) and (12).
I

Definition :The diagonal elements a,, ,):a and a$) which are used as divisors are
called pivots.
You might have observed here that for a linear system of order 3, the elimination
was performed in 3- 1=2 stages. In general for a system of n equations given by Eqns.
(2) the elimination is performed in (n-1) stages. At the ith stage of elimination, we
eliminate xi, starting from (i+l)th mw upto the nth row. Sometimes, it may happen
that the elimination process stops in less than (n- 1) stages. But this is possible only
when no equations containing the unknowns are left or when the coefficienk of all
'the unknowns in remaining equations become zero. Thus if the process stops at the
rth stage of elimination then we get a derived system of the form

0 = b(r-1)
n
where r 5n and all # 0, a%) # 0 ,..., a!:-)' # 0.
In the solution of system of linear equations we can thus expect two different
situations '

Let us now illustrate these situations through examples.

Example 5 : Solve the system of equations


4x1 + X2 + X j = 4
xl + 4x2 - 2x3 = 4

' using ~ a u selimination


i method
-
Solution : Here we have

using back substitution method, we get


x3 = -112; x, = 112; x, = 1

16 Also, det (A) = 4XT15 x (-I2)


5
-= -36
Thus in this case we observe that r = n = 3 and the given system of equations has a
uniqye solution. Also the coefficient matrix A in this case is nonsingular. Let us look
at another example.
Example 6 : Solve the system of equations
3x, + 2 x 2 + x g = 3
2x1 + x, + x, = 0 .
6x1 + 2x2 + 4x3 = 6
using Gauss elimination method. Does the solution exist?

Solution : We have.

I n this case you can see that r<n and elements b,, b;) and by) are all non-zero.
A 'ystern of is
Since we cannot determine x3 from the last equation, the system has no solution. In
inconskitent if it does not have a
such a situation we say that the equations are inconsistent. Also note that solutian.
det (A) = 0 i.e., the coefficient matrix is singular.
We now consider a situation in which not all b's are non-zero.

Example 7 : Solve the system of equations


16x1 + 22x2 + 4x3 = -2
4x1 - 3x2 + 2x3 = 9
12x1 + 25x2 + 2x3 = - 11
using Gauss elimination method.

Solution : In this case we have

[Alb] = [ 16 22 4
4 -3 2
12 25 2
-2

-11
9 ] R - 4lR1, R3 - -
3 RI
4
Now in this case ficn and elements b,, b): are non-zero, but br) is zero. Also the last
equation is satisfied for any value of x3. Thus, we get
x3 = any value

Xl= - 1 ( - 2 - 22x2 - 4x3)


16
Hence the system of equations has infinitely many solutions.

Note that in this case also det (A) = 0.


The conclusions derived from Examples 4,5 and 6 are true for any system of linear
equations. We now summarise these conclusions as follows :
i) If r = n, then the system of Eqns. (2) has a unique solution which can be obtained
using the back substitution method. Moreover, the coefficient matrix A in this
case is nonsingular.
ii) if r<n and all the elements b:;)', bL;'), ....by-') are not zero then the system
has no solution. In this case we say that the system of equations is inconsistent.
iii) If r<n and all the elements b(,:;'), bLil),...,bjl'-'), if present, are zero, then the
, system has infinite number of solutions. In this case the system has only'r linearly
independent rows.
Inboth the cases (ii) and (iii), the matrix A is singular.
Now we estimate the number of operations (multiplication and division) in the Gauss
eliminativn method for a system of n linear eqliations in n unknowns as follows :

No. of divisions
1st step of elimination (n-1) divisions
2nd step of elimination (n-2) divisions
The sum of first n natural numbers ., .......................................................
n
is 2i =7 n (n+l)
and (n-1)th step of elimination 1 divisions
1-1
... Total number of divisions A (nLl) + (n-2) -k ...... + 1
the sum of the squares of the first
n natural numbers is

No. of multiplications
1st step of elimination n(n- 1) multiplications
2nd step of elimination (n- 1) (n-2) multiplications
........................................................................
(n- I)th step df elimination 2.1 mi~ti~lications
' ... Total number of multiplications = n(n-1) + (n- 1) (n- 1) + ...... + 2.,1

Also the back substitution adds n divisions (one division at each step) and the number
of multiplications added are
(n-1)th equation 1 multiplication
//'(n-2)th equation 2 multiplication
I .....................................................
1st equation (n- 1) multiplication
n(n- 1)
Total multiplications = z ( n - 1 ) -- =
2
n(n - 1) n(n + 1)
Total operations added by back substitution = -+ n = -
2 2
Thus to find the solution vector x using the Gauss elimination method, we need

operations. For large n, we may say that the total number of operations needed i s l n"
(approximately). Thus, we find that Gauss elimination method needs much less&
number of operations compared to the Cramer's rule.

You may now try a few exercises.

E6) Use Gauss elimination method to solve the system of equations


+
x, 2x2 +
X3 = 3

3x, - 2x2 - 4x3 = -2


2x1 + 3x2 - ~3 = -6
E7) Use Gauss elimination method to solve the system of equations
+ +
3x, 18x2 9x3 = 18
+
2x1 3x2 + 3x3 = 117
4x1 + +
x2 2x3 = 283
E8) Solve the system of equations

using Gauss elimination method.


E9) Using the Gauss elimination method show that the system of equations

are inconsistent.
~ 1 0 )Use
' Gauss elimination method to solve the system of equations

It is clear from above that you can apply Gauss elimination method to a system of
I equations of any order. However, what happens if one of the diagonal elements i.e.
the pivots in the triangularization process vanishes? Then the method will fail. In such
situations we modify the Gauss elimination method and this procedure is called
pivoting.

i Pivoting
I ,,
In the elimination procedure the pivots a, ay! ,....,a("-')
nn are used as divisors. If at
I any stage of the elimination one of these pivots say at:-'). (a!:) = a,,), vanishes then
So*ctbr Ll-r Algeb-lC huatlo~ the elimination procedure cannot be continued further (see Example 8). Also, it may
happen that the pivot a!-'), though not zero, may be very small in magnitude
compared to the remaining elements in the ith column. Using a small number as a
'
divisor may lead to the growth of the round-off error. In such cases the multipliers
' -a('-2) -a!i-3)
(e.g.- I-',' ) wiil be larger than one in magnitude. The use of large
a(!-l) ' a(!-l)

multipliers will lead to magnification of errors both during the elimination phase and
during the back substitution phase of the solution. To avoid this we rearrange the
remaining rows (ith row upto nth row) so as to obtain a non-vanishing pivot or to
make it the largest element in magnitude in that column. The strategy is called
pivoting (see Example 9). The pivoting is of the two types; partial pivoting and
complete pivoting.

Partial Pivoting
In the first stage of elimination, the first column is searched for the largest element
in magnitude and this largest element is then brought at the position of the first pivot
by interchanging the first row with the row having the largest element in magnitqde
in the first column. In the second stage of elimination, the second column is searched
for the largest element in magnitude among the (n-1) elements leaving the first
element and then this largest element in magnitude is brought at the position of the
second pivot by interchanging the second row with the row having the largest element
in the second column. This searching and interchanging of rows is repeated in all the
n- 1 stages of the elimination. Thus we have the following algorithm to find the pivot.
For i = 1,2,.....n, find j such that

and interchange rows i and j.

Complete Pivoting
In the first stage of elimination, we search the entire matrix A for the largest element
in magnitude and bring it at the position of the first pivot. In the second stage of
elimination we search the square matrix of order n- l (leaving the first row and the
first-column) for the largest element in magnitude and bring it to the position of
second pivot and so on. This requires at every stage of elimination not only the
interchanging of rows but also interchanging of columns. Complete pivoting is much
more complicated and is not often used. .
In this unit. by pivoting we shall mean only partial pivoting.
Let us now understand the pivoting procedure through examples.

Example 8': Solve the system of equations


X I + x,+ ~3 = 6
+ 3x2 + 4x., =
3x1 20
+ x + 3x.3 =
Ix, 13
using Gauss elimination method with partial pivoting.

Solutibn : Let us first attempt to solve the system without pivoting. We have
Note that in the above matrix the second pivot has the value zero and the.elimination
procedure cannot be continued further unless, pivoting is used.
Let us now use the partial pivoting. In the first column 3 is the largest element.
Interchanging the rows 1 and 2, we have

In the second column, 1 is the largest element in magnitude leaving the first element.
Interchanging the second and third rows we have

You may observe her& that the iesultant matrix is in triangular form and no further
elimination is required. Using back substitution method, we obtain the solution
X3 = 2, X2 = 1, X I = 3.

Let us consider another example.

Example 9 : Solve the system of equations


-
I
+
0.0003 xl 1.566 x, = 1.569
0.3454 xl - 0.436 x2 = 3.018 (17)
using Gausselimination method with and without pjvoting. Assume that the numbers
in arithmetic calculations are rounded to four significant digits. The exact solution of
the sysiem (li) is x, = 10, x2 = 1.

Solution : Without Pivoting


I

m21 = - - = - 0.3454 = - 1151.0 (rounded to four places)


a,, , 0.0003
):a = - 0.436 - 1.566 x 1151
= - 0.436 - 1802.0 - 1802.436
= - 1802.0

b): = 3.018 - 1.569 x 1151.0


= 3.018 - 1806.0
= - 1803.0
Thus, we get the system of equations
+
0.0003 x1 1.566 x2 = 1.569
- 1802.0 ~2 = -1803.0
which gives

= 3.333
which is highly inaccurate compared to the exact solution.
.With Pivoting
We interchange the first and second equations in (17) and get
0.3454 XI - 0.436 x 2 = 3.018
+
0.0003 x 1 1.566 x2 = 1.569
we obtain

Thus, we get the system of equations


0.3454 X, - 0.436 x2 = 3.018
1.566 xz = 1.566
which gives
X2 = 1

which is the exact solution.


We now make the following two remarks about pivoting.
Remark : If the matrix A is diagonally dominant i.e.,
n
1aii1 s 2 1aij1 , then no pivoting is needed. See Example 5 in which A is
;j ;
diagonally dominant. -

Remark : If exact arithmetic is used throughout the computation, pivoting is not


necessary unless the pivot vanishes. However, if computation is carried upto a fixed
number of digits, we get accurate results if pivoting is used.
There is another convenient way of carryihg out the pivoting procedure. Instead. of
physically interchanging the equations all the time, the n original equations and the
various changes made in themcan be record& in a systematicway. Here we use an
nx(n+l) working array or matrix which we call W and is same as our augmented
matrix [Alb]. Whenever some unknown is eliminated fiom an equation, the changed
coefficients and right side for this equation are calculated and stored in the working
array W in place of the previous coefficients and right side. Also, we use an n-vector
which we call p = (pi) to keep track of which equations have already been used as
pivotal equation (and therefore should not be changed any further) and which
equations are still to be modified. Initially, the ith entry pi of p contains the integer
i, i = l , ......,n and working array W is of the form

~ u r t h e r onq
, has to be careful in the selection of the pivotal equation for each step.
For each step the pivotal equation must be selected on the basis of the current state
of the system under consideration i.e. without foreknowledge of the e f f m of the
selection on later steps. For this, we calculate initially the size di of row i of Pf, for
i = l , .....,n, where di is the number
-
di = max (aij(
At the begmning of say ktii step of dimination, we pick as pivotal equation that one
from the available n-k, which,has the absolutely largest coefficient of x, relative to
the size of the equation. This means that the integer j is selected between k and n
for which

We can also store the multipliers in the working array W instead of storing zeros.
That is, if pi is the first pivotal equation and we use the multipliers mpiYl,i=2,. ....,n
to eliminate xl from the remainihg (n-1) positions of the first column then in the
first colurqn we can store the multipliers mR,,, i=2, ...., n, instead of storing zeros.

Let us now solve the following syst=m of linear equations by scaled partial pivoting
by storing the multipliers and maintaining pivotal vector

Example 10 : Solve the following system of linear equations with pivoting

Solution : Here the wotking matrix is

and dl = 3, d2 = 4 and d3 = 5.

Note that d's will not change in the successive steps.

3 >11
Since -
5 2' 3'
.'. p1 = 3, p2 = 2 and p3 = 1.
We use the third equation to eliminate xl from first and second equations and store
corresponding multipliers instead of storing zeros in the working matrix.
W
The multipliers are rn
Pi.1
=
W
,i = 2, 3
Pi.1

After the first step the working matrix is transformed to


D I M Methods
5.6 L'U DECOMPOSITION METHOD
Let us consider the system of Eqns. (2), where A is a non-singular matrix. We first
write the matrix A as the product of a lower triangular matrix L and an u'pper
triangular matrix U in the form

A = LU
or in matrix form we write

The left side matrix A has n2.elements, whereas Land U have 1+2+...+n = n(n+ 1)/2
elements each. Thus, we have n2+n unknowns in L and U which are to be
.determined. On comparing the corresponding elements on two sides in Eqn. (19)' we
get nZequations in n2+n unknowns and hence n unknowns are undetermined. Thus,
we get a solution in te'rms of these n unknowns i.'e., we get a n parameter family of
solutions. In order to obtain a unique solution we either take all the diagonal
elements of L as 1, or all the diagonal elements of U as 1.

For uii = 1, i = 1,2,. ..,n, the method is called the Crout LU decomposition method.
For lii = 1, i = 1,2,. ..,n we have Doolittle LU .decompositionmethod. Usually Crout's
LU decomposition method is used unless it is specifically mentioned. We shall no.w
explain the method for n = 3 with uii = 1, i = 1,2,3. We have

On comparing the elements of the first column, we obtain

111= all, 121 = a21, 131 = a31


i.e., the first column of L is determined.
O n comparing the remaining elements of the first row, we get
111~12= a12; 111~13= a13
which gives
2 a12flll; U13 = ald1ll
~ 1=
Hence the first row of U is determined.
O n comparing the elements bf the second column, we get
121~12+ 122 = a22
l31u12 + 132 = a32
which gives
Now the second column of L is determined.
On comparing the elements of the second row, we get
121~13+ l22~23= a23
which gives U23 = (a23- 121 ~ ~ ~ ) / 1 ~ ~
and the second row of U is determined.
On comparing the elements of the third column, we get
131u13 + l32~23+ I33 = a33
which gives = - 131~13 - 132u23 (24)
You must have observed that in this method, we alternate between getting a column
of L and a row of U in that order. If instead of uii = 1, i = 1,2,...,n, we take
li, = 1, i = 1,2,...n, then we alternate between getting a row of U and a column of
L in that order.
Thus, it is clear froin Eqns. (20) - (24) that we can determine all the elements of L
end U provided the nonsin&r matrix A is suc! that

Similarly, for the general system of Eqns. (2), we obtain the elements of L and U
using the relations

-
Uii .-1
Also, det (A) = 111122....,I,.,,.,.

Thus we can say that every nondngular matrix A can be written as the product bf a
lower triangular matrix and an upper triangular matrix if all the principal minors of
A are nonsingular, i.e. if

Once we have obtained the elements of the matrices L and U, we write the system
of equations
Ax=b (25)
in the form
L U X =b (26)
The system (26) may be further written as the following two systems
u x = y (27)
Ly=b (28)
Now, we first solve the system (28), i.e.,
Ly =b,
using the forward substitution method t o obtain the solution vector y. Then using this
y, we solve the system (27), i.e.,
Ux=y,
using the backward substitution method to obtain the solution vector x.
The 'number of operations for this method remains the same as that in the
Gauss-elimillation method.
1 We now illustrate this method through an example.
1 Example 11 : Use the LU decomposition method to solve the system of equations
x, + X2 + X3 = 1
4x1 + 3 x 2 - x3 = 6
3x1 + 5x2 + 3x3 = 4
I

3 5 -:I [I; b 81 [H' k: ]:I


Solution : Using lii = 1, i = 1,2,3, we have

[:: 3 =
u33

On comparing the elements of row and column alternately, on both sides, we obtain
' first row : u l l = 1, u12= 1, U13=1
first column : lZ1= 4, 13] = 3
second row : u2~ = -1, u23 = 5-
second column : 132 = -2
third row : u~~ = -10
Thus, we have

Npw frorathe system


Ly=b
or

we get
y1 = 1, Y2 = 2, Y3 = 5
and from the system
u x = y
or

we get
x3 = -112, x* = 112, X l = 1.
You may now try the following exerclses :

E12) Use the L U decomposition method with u,, = 1, i = 1,2,3 to solve the system
of equations given in Example 11.
E13) Use the L U decomposition method wiqh I,, = 1, i = 1,2,3 to solve thesystem
I of equations given in E7.
E14) Use L U decomposition method to solve the system of equations given in ElO.
11 sibbaa d
Egu*~ We now end this unit by giving a summary of what we have covered in it.

5.7 SUMMARY
In this unit we have covered the following:
1) For a system of n equations
AX = b (see Eqn. (2)
in n unknowns,'where A is an n x n non-singular matrix, the methods of finding
the solution vector x may be broadly classified into two types: (1) direct m&t~ods
and (ii) iterative methods
2) Direct methods produces the exact solution in a finite number of steps provided
there are no round-off errors. Crmer's rule is one such method. This method
gives the solution vector as

where d = and di is the determinant of the matrix obtained from A by ,


IAI
replacing the ith column of A by the column vector b. Total number of operations
required for Cramer's rule in solving a system of n equations are
M = ( n + l ) (n-l)n!+n
Since the qumber M increases very rapidly, Cramer's rule is not usedfor n > 4.
3) For larger systems, direct methods become more efficient if the coefficient matrix
A is in one of the forms D (diagonal), L (lower triangular) or U (upper
triangular).
4) Gauss elimination method is another direct Method
--- -for solving large systems
- --
(n>4). In this methodthe coefficient matrix A is reduced to the form U by using
the elementary row operations. The solution vectpr x is'theo obtained by usfig
the back substitution method. For large n, the total number of operations
required in Gauss elimination method are 1
3
n3 (approximately).
5) - In Gauss elimination method if at any stage of the elimination any of the pivots
vanishes or become small in magnitude, elimination procedure cannot be
continued further. In such cases pivoting is used to obtain the solution vector x.
6) Every nonsingular matfix A can be written as the product of a lower t*angular
matrix and an upper triangular matrix, by the LU decomposition method, if all
the principal minors of A are nonsingular. Thus, LU decomposition method,
which is a modification of the Gauss elimination method can be used to obtain
the solution vector x.

E l ) det (A)= 8
E2) d = 11, d l = 11, d2 = 11, d3 =' 11
X1 = X2 = X3 = 1

E3) d = 20, d l = 0, d2 = 20, d3 = 40, d4 = -20


X1 = 0, X2 = 1, Xj = 2, X', = -1
E4) X, = ~2 = ~3 = x4 = xs = 1
E5) x5 = ~4 = x3 = X, = XI = 1
a]
2. 1 3
final derived sysiim : ["8 -7:
0 0 -- --
-1

E7) Final derivehystem :


I
E8) Final derived system :

., E9) Final derived system :

We cannot determine x4 from the last equation.

E10) Final derived s@em :

1
E l l ) Solution without pivoting :
I
Using m,, = 1.372
m,, = 1.826 and r n =~ 2.423
1 .
The final derived system is
I
0.7290 0.8100 0.9000 '
0.0
0.0
-0.1110 -0.2350
0.0

The solution is
0.02640

x = 0.2251, y = 0.2790, z = 0.3295


0.6867
-0.1084
-0.0087
I
Solutibb with pivoting;
Interchanging first and the third row and ;sir&
m2, = 0.7513
m3,= 0.5477
I Il*llrrlLlrarAllcMcEw.(bnr
and m 3 = ~ 0.6171
the final derived system is
i

The solution is x = 0.2246. v = 0.2812. z = 0.321U). i


UNIT 6 INVERSE OF A SQUARE MATRIX
Structure
6.1 Introduction
6.2 The Method of Adjoints
6.3 The Gauss-Jordan Reduction Method
6.4 LU Decomposition Method
6.5 Summary
6.6 SolutionsIAnswers

In the previous unit, you have studied the Gauss elimination and LU decomposition
methods for solving systems of algebraic equations A x = b, when A is a n x n
nonsingular matrix. Matrix inversion is another problem associated with the problem
of finding solutions of a linear system. If the inverse matrix A-' of the coefficient
matrix A is known then the solution vector x can be obtained from x = A-' b. In
general, inversion of matrices for solving system of equations should be avoided
whenever possible. This is because, it involves greater amount of work and also it is
difficult to obtain the inverse accurately in many problems. However, there are two
cases in which the explicit computation of the inverse is desirable. Firstly, when
several systems of equations, having the same coefficient matrix A but different right
hand side b, have to be solved. Then computations are reduced if we first find the
inverse matrix and then find the solution. Secondly, when the elements of A-'
themselves have some special physical significance. For instance, in the statistical
treatment of the fitting of a function to observational data by the method of least
squares, the elements of A-' give information about the kind and magnitude of errors
in the data.
In this unit, we shall study a few important methods for finding the inverse of a
nonsingular square matrix.

Objectives
After studying this unit, you should be able to :
obtain the inverse by adjoint method for n < 4;
obtain the inverse by the Gauss-Jordan and LU decomposition methods; '

obtain the solution of a system of linear equations using the inverse method.

6.2 THE METHOD OF ADJOINTS


You already know that the transpose of the matrix of the cofactors of elements of A
is called the adjoint matrix and is denoted by adj(A) (Ref. Unit 9, Block 3 of
MTE-02, Linear Algebra). .
Formally, we have the following definition.
Definition : The transpose of the cofactor matrix ACof A is called the adjoint of A
and is written as adj(A).
Thus,
adj(A) = ( A ' ) ~
The inverse of a matrix can be calculated using the adjoint of a matrix.
We obtqin the inverse matrix A-' of A from
A-' = adj (A) (1)
zqq
This method of finding the inverse of a matrix is called the method of adjoints.
-
s d d d
~ F g d w ) Eqn. (1) must not be zero and therefore the matrix A bust be
Note that d e t ( ~ in
nonsingular.
We shall not be going into the details of the method here. We shall only illustrate it
through examples.
I
Example 1 : Find A-' for the matrix

I and solve the system of equations

Solution : Since det (A) = - 1 f 0, the inverse of A exists. We obtain the cofactor
matrix A" from A by replacing each element of A by its cofactor as follows :

L -8Y
17 10 1
NOWA-1 = adj (A)
det (A)

I I Also the solution of .the given systems of equations are

iii) x = A-'b =

1 Example 2 : Find A-' for the matrix


Solutbn : We have
det(A) = 18 f 0. Thus A-' exists.
Now

. . A-' = (A3T
adj (A)
= 1
18
-
-12
-2
6 0
l8 -5 3O 11
=
1
-213
-119
0
113
-5118 116 1
Thus, A-' is again a lower triangular matrix. Similarly, we can illustrate that the
inverse of an upper triangular matrix is again upper triangular.

Example 3 : Find A-' for the matrix

Solution : Since, det (A) = 24 # 0, A-' exists.

We obtain ,

is again an upper triangular matrix.


I

You may now try the following exercises.

El) Sol& the system of equations

using the method of adjoints.


E2) Solve the system of equations

using the mtthod of adjoints.

The method of adjoints provides a systematic procedure to obtain the inverse of a ,


given matrix and for solving~systemsof linear equations. To obtain the inverse of an
n x n matrix, using this method, we need to evaluate one determinant of order n, n
I So'utton Of
Abebralc Equations is used for solving a linear system we also need matrix multiplication. The number
of operations (multiplications and divisions) needed, fbr using this method, increases
very rapidly as n increases. For this reason, this method is not used when n > 4.
For large n, there are methods which are efficient and are frequently usedfor finding
the inverse of a matrix and solving linear systems. We shall now discuss these
methods.

6.3 THE GAUSS-JORDAN REDUCTION METHOD


This method is a variation of the Gauss elimination method. In the Gauss elimination
method, using elementary row operations, we transform the matrix A to an upper
triangular matrix U and obtain the solution by using back substitution method. In
~ a u s i - ~ o r d areduction
n not only the elementi below the diagonal but also the
elements above the diagonal of A are made zero at the same time. In other words,
we transform the matrix A to a diagonal matrix D. Thisdiagonal matrix may then
be reduced to an identity matrix by dividing each row by its pivot element.
Alternately, the diagonal elements can also be made unity at the same time when the
- reduction is performed. This transforms the coefficient matrix into an identity matrix.
Thus, on completion of the Gauss-Jordan method, we have

The solution is then given by


x.I = d.I ? i = 1,2,......,n

In this method also, we use elementary row operations that are used in the Gauss
elimination method. We apply these operations both below and above the diagonal
in order to reduce all the off-diagonal elements of the matrix to zero. Pivoting can
be used to make the pivot non-zero or to make it the largest element in magnitude
in that column as discussed in Unit 5. We illustrate the method through an example.

I
Example 4 : Solve the svstem of equations
XI+ x2+ xj=,1
4x1 + 3x2 - xg = 6
3x1 + 5x2\+ 3x3 = 4
I

using Gauss-Jordan method with pivoting.

1 Solution : We have

[Alb] = [ -: :]
3 5 3 4
(interchanging first and second row)

1 5 -- (interchanging second and third row)

4 , 4
4 0 0 4

1 Rl (divide first row by 4),

A
11 R2 (divide second row by 11!4),

fi
10 R3 (divide third r o q by 10111).

which is the deiired form.


Thus, we obtain
1 X3 = - - .1
x l = l , x2= 3, 2
The method can be easily extended to a general system of n equations. Just as we
calculated the number of operations needed for Gauss elimination method in Unit 5,
in the same way you can verify that the total number of operations needed for this

E3) Verify that the total number of operations needed for Gauss Jordon reducfion
method is 1n3 + + n.
I 2 2

Clearly this method requires more number of operations compared to the Gauss
b elimination method. we.; therefore, do not use this method generally for solving system
of equations but is very commonly used for finding the inverse matrix. This is done
by augmenting the matrix A by the identity matrix I of the order same as that of A.
t
Using elementary row operations on the augmented matrix [A111 we reduce the
matrix A to the form I and in the process the datrix I is transformed to A-'

That is
!
I
Gauss
Jordan
,
[I A-']
We now illustrate the method through examples.
Example 5 : Find the inverse of the matrix

1 -2 1

using the Gauss-Jordan method.

Solution : We hav :

[AlI]=[:
-: 1 0 0 1 y1R 1
Soiutlon of Linear Algebraic Equations 1 113 213 113 0 0
2 -3 ' -1
1 -2 1 'Rz-12R1,R3-R1

Thus we obtain

Example 6 : Find the inverse of the matrix

using the Gauss-Jordan method

Solution : Here we have


.-...

Inverse of a Square Matrix

Hence

A-' =
1 0 0 0
o i o o
0 0 1 0
0 0 0 1

112' 0
2
0
112
-1
113
0
2
0
"" 0
o
- 1
1/11 21155 -17155

0
0
-113
1/11 21/55 -17155
0
3/55

is the inverse of the given lower triangular matrix.


0
3155

Let us now consider the problem of finding the inverse of an upper triangular matrix.
Example 7 : Find the inverse of the matrix

using the Gauss-Jordan method.


b ~1- A-'E~* ' ~ u t i o n:Here, we have

Hence

which is the inverse of the given upper tnangular matrix.

Note that in Examples 2,3,6 and 7, the inverse of a lowerllppr trianfiar matrix is
again a lowerlupper triangular matrix. There is another method offinding the inverse
of a matrix A which uses the pivoting strategy. Recall that in Sec. 5.5. of Unit 5, for
the solution of system of linear algebraic equation Ax = b, we showed you how the
multipliers mp,l,k'scan be stored in working array W during the process of
elimination. The main advantage of stgring these multipliers is that if we have already
solved the linear system of equations A x = b or order n, by the elimination method
and we want to solve the system Ax = c with the same coefficient matrix A, only the
right side being different, then we do not have to go through the entire elimination
process again. Since we have saved in the working matrix W all the multipliers used
and also have saved the p vector, we have only to repeat the operations on the right
hand side to obtain Z, such that U x = Z is equivalent to A x = c.
In order to understand the calculations necessary to derive E , from c consider the
changes made in the right side b during the elimination process. Let k be an integer
between 1 and n, and assume that the ith equation was used as pivotal equationq
during step k of the elimination process. Then i = pk. Initially, the right side of
equation i is just bi.

If k > 1, then ~ f t eStep


r 1, the right side is
1
! bll) = bi. - mil bPI

If k > 2, then after s t e p 2, the right side is'


by) = bill - m. b(1)
'2 P2
38 = bi-m. 11 bPI - m a b & )
Replacing i by pk in Eqn. (61, we get
b(k-') = bPk- mpkt1bpl- i p k , 2 b g )- ...... - b(k-2)
Pk rn~k'k-1 Pk-1 (7)
k = 1 , 2,...., n.
Also, since 6, = b g r l ) , j = 1, 2 ,...., n, we can rewrite Eqn. (7) as
-bk = bPk-mPk,l Lb l -mPk.2b2-
I - -
...... - mPk.k-l bk-, (8)
k = l ,....., n. .
Eqn. (8) can then be used to calculate the entries of b. But since the multipliers m ' , , ' ~
are stored in entries w,,'s of the working matrix W, we can also write Eqn. (8) in the
form
- k-l
wPkib,,
- k=l, ....., n
bk = b,, - (9)
]=I
Hence, if we just know the final content of thefirs! n columns of W and the piioting
strategy p then we can calculate the solution x of Ax = b by using the back substitution
method and writing

The vector x = [xl x2 ...... x,] will then be the solution of Ax = b.


For finding the inverse of an n x n matrix A, we use the above algorithm. We first
calculate the final contents of the n columns of the working matrix Wand the pivoting
vector p and then solve each of the n systems

Ax = e,, j=1, ......, n (11)


where el = [l 0 ..... OIT, e2 = [0 1 0 ..... OIT, .....,en = [0 0 ..... 11T7with the
help of Eqns (9) and (10). Then for each j= 1,....,n the solution of system (11) will
be the corresponding column of the iqverse matrix A-'. ~ h following
g example will
help you to understand the above procedure.

Example 8 :Find the inverse of the matrix

using partial pivoting.

T T
Splution :Initially p = p2, p3] = [I, 2, 31 and the working matrix is

W e use the second equation togiminate x, from first and third equations and store
, corresponding multipliers insteaa 6f storing zeros in the working- matrix. The
[ n~ultipliersare
we get the following working matrix

Sin& 2 = 2 so we take p = (2, 1, 3)T


4 4
W
NOWm- , = " ~ 2. i = '3

We use the first equation as pivotal equation to eliminate x2from the third equation .
and also store the multipliers. After the second step we have the following working
matrix

Now in this case, w ( ~is)our final working matrix with pivoting strategy p = (2,1, 3)T
Note that circled ones denote multipliers and squared ones denote pivot elements in
the working matrices.
To find the. inverse of the given matrix A, we have to solve
Ax = el = [b, b2 b31T
p i x = e 2 = [bl bz b31T
/
Ax = e3 = [bl b2 b31T
where el = (1 0'0IT, e2 =
Using ~ q n (9),
. we get
-
with pl = 2, b, = b2 = 0
-
with p2 = 1, b2 = bl - wll bl
-

with p3 = 3, G3 = b3 -' w31 GI - ~3262

= o - [ - + ] . ~- i.l = -1
Using Eqn. (lo), we then get the following system of equations

73X z - X3 =1
2x,.+ xz = 0
which gives x3 = - 31 ,x2 = -
-2 4 -
4 and x, = - -
9
T
3]
2
9
i.e., vector x' = [ 9 . 9
is the solution of system (12).
Remember that the solution of system (12) constitutes the first column of the &verse
matrix A-'.
In the same way we solve the system of equations Ax = e2 and Ax = e3, or

and

Using Eqns (9) and (lo), we obtain the solution of system (13) as
x = 12 I I is the second column of A-I and the solution of system
9 9 3
UNIT 7 ITERATIVE METHODS

7.1 Introduction
7.2 The General Iteration Method
7.3 The Jacobi Iteration Method
-7.4 The Gauss-Seidel Iteration Method
7.5 Summary
7.6 Solutions/Answers

,7.1 INTRODUCTION
In the previous two units, you have studied direct methods for solving linear system
of equations Ax = b, A being n x n non-singular matrix. Direct methods provide the
exact solution in a finite number of steps provided exact arithmetic is used and there
is no round-off error. Also, direct methods are generally used when the matrix A is
dense or filled, that is, there are few zero elements, and the order of the matrix is
not very large say n < 50.
Iterative methods, on the other hand, start with an initial approximation and by
applying a suitably chosen algorithm, lead to successively better approximations.
Even if the process converges, it would give only an approximate solution. These
methods are generally used when the matrix A is sparse and the order of the matrix
A is very large say n > 50. Sparse matrices have very few non-zero elements. In most
cases these non-zero elements lie on or near the main diagonal giving rise to
tri-diagonal, five diagonal or band matrix systems. It may be noted that there are no
fixed rules to decide when to use direct methods and when to use iterative methods.
However, when the coefficient matrix is sparse or large, the use of iterative methods
is ideally suited to find the solution which take advantage of the sparse nature of the
matrix involved.
In this unit we shall discuss two iterative methods, namely, Jacobi iteration and

After studying this unit, you should be able to:


"
obtain the solution of system of linear equations, Ax = b, when the matrix A is
large or sparse, by using the iterative method viz; Jacobi metliod br the
GaussBeidel method;
tell whether these iterative methods converge or not;
* '
obtain the rate of convergence and the approximate number of iterationsneeded
for the required accuracy of these.iterative methods.

7.2 THE GENERAL ITERATION METHOD


- . I niteration methods as we have already men.tioned, we start with some initial
approximate solution vector do)and generate a sequence of approximants { x ( ~ ) )
which converge to the exact solution vector x as k + m. If the method is convergent,
each iteration produces a better approximation to the exact solution. We repeat the
iterations till the required accuracy is obtained. Therefore, in an it/erative method the
amount of computation depends on the desired accuracy whereas in direct methods
the amount of computation is fixed. The number of iterations needed to obtain the
desired accuracy also depends on the initial approximation, closer the initial
apprdximation to the exact solution, faster will be the convergence.
Consider the system of equations
I
k,,lulion I I l.inrnr
~ ilprhrnie Equalions Writing the system in expanded form, We get
allxl + a12x2+ ...... alnxn = bl
azlxl + a 2 2 ~+2...... aznxn = b2

We assume that the diagonal coefficients aii f 0,(i = 1,. ..,n). If some of aii r 0, then
we rearrange the equations so that this condition holds. We then rewrite system (2) as

In matrix form, system (3) can be written as


x = Hx + c

I
where
.......- !k

I 1
0 -alz -9
a11 all
a21
,-

a22
0 - 3 2 .......-
a22
azn
a22
H = ......:................. :............................

To so!ve system (3) we make an initial'guess x(O)of the solution vector and substitute

manner until the successive iterations x ( ~have


) converged to.the'required number of

In general we can write the iteration method for solving the linear system,of
Eqns. (1) in the form
dk+')= H X ( ~+) C. k = 0.1 ......

i When the method (5) is canvergent, then


lim X(k) = lim X(k+l) = x
k-r = k-r .I.
l
and we obtain f ; o m ' ~ ~ (5)
n.
x=Hx+c (6)
1 If we define the error vector at the kth iteration as
€(k) = X(k) - (7) i
I then- subtracting Eqn. (6) from Eqn. (S), we obtain 1
= H E(k) 4

lim dk'= 0
k-rm
Before we discuss the above convergence criteria, let us recall the following
definitions from linear algebra, MTE-02.
iterative Methods t
.. -
\

eigenvalue or characteristic value of the matrix A .

The eigenvalues of the matrix A are obtained from the characteristic equation
det (A-XI) = 0
which is an nth degree polynomial in X. The roots of this polynomial XI, X2,...,Xn are
I the eigenvalues of A. Therefore, we have

We now state a theorem on the convergence of the iterative methods.

i Theorem 1 : An iteration method of the form (5) is convergent for arbitrary initial
approximate vector x(O) if and only if p(H)<l.

We define the rate of convergence as follows:

b
Definition : The number v = -loglo p(H) is called. the rate of convergence of an
iteration method.

Obviously, smaller the value of p(H), larger is the value of v.

dk)S Also the number of iterations k that will be needed to make

depends on v. For a method having higher rate of convergence, lesser number of


iterations will be needed for a fixed accuracy and fixed initial approximation. \\ e
There is another convergence criterion for iterative methods which is based on the
norm of a matrix.
! The norm of a square matrix A of order n can be 'defined in the same way as we
define the norm of an n-vector by comparing the size of Ax with the size of x (an
n-vector) as follows:
lIAxll2 IlAfl denotes the norm of A.
i) llAll2 = max

based on the euclidean vector norm, llxl12 = J Ix1I2+ /x2I2+ )xnI2


and
/

ii) l l ~ l =l ~ max IIAXIIw , based on the maximum vector norm, llxll, = max lxil.
ll~llm Isla.

In (i) and (ii) above the maximum is taken over all (non zern) n-vectors. The
I most commonly used norms is the maximum no& IW(., as it+ easier to
calculate. It can be calculated in any oMhe following two ways:
llAll, = max
k
x
i
~ a , (maximum
~l absolute column-sum)
/
I
or
llAll. = max x l a , , l (maximum absolute row sum)
k
I Solution of Linear Algebraic Eguations The norm of a matrix is a non-negative number which in addition to the property

IIABII s IIAII IIBII


satisfies all the properties of a vector norin, viz.,
a) /[A/(3 O and ((A((= 0 iff A =.o

I
b) IlaAll = 1 a1 IIAll, for all numbers a .
c) lIA+BIl 6 IIAll + IlBll
where A and B are square matrices of order'n.

Theorem 2 : The iteration method of the form (5) for the solution of system (1)
converges to the exact solution for any initial vector, if IJHJJ< 1.
Also note that
IlHll 3 P(H).
This can be easily proved by considering the eigenvalue problem Ax = Ax.
Then IlAxll =.llxxll = I A.1 llxll
or 1AI Ilxll = IIAxll 6 IlAll llxll
i.e., ihl d llAll since IIxll # 0
I Since this result is true for all eigenvalues, we have

The criterion given in Theorem 2 is only a sufficient condition, it is not necessary.


Therefore, for a system of equations for which the matrix H is such that either '

k i=l i k=l
condition is violated it is not necessary that the iteration diverges.
There is another sufficient condition for coovergence as follows:
Theorem 3 : If the matrix A is strictlv diaeonallv dominant that is.

then the iteration method (5) converges for any initial approximation x(o). I
i
?
If no better initial approximation is known, we generally take x(O) = 0.
We shall mostly use the criterion given in Theorem 1, which is both necessary and
sufficient.
i
1
I

I For using the iteration method (S), we need the matrix H a i d t6e vector &'which
depend on the matrix A and the vector b. The well-known iteration methods ?re
! based on the splitting of the matrix A in the form
A=D+L+U

I discuss two iteration methods of the form (5).

We write the system of Eqn. (1) in the form (2), viz.,


+ +
allxl aI2x2+ ... alnx, = b,
+
azlxl azzx2+ ... + a,,x, = bz

anlxl+ an2x2+ ... + annxn= b,


l t c ~ t i v Methods
t

Note that, A being a non-singular matrix, it is possible for us to make all the p i v M
non-zero. It is only when the matrix A is singular that even complete pivoting may
not lead to all the non-zero pivots.
We rewrite system (2) in the form (3) and define the Jacobi iteration method as
x I( ~ + ' )=
1 (a x ( ~ +
-- 12 2
) al3xSk) + ... + a,,~,!,~)-b,)
a11

xik+') = - 1
+
(anlxlk) an2x?) + ... + a,, ~!x: -bn)

i+j

The method (13) can be put in the matrix form as

E The method (14) is of the form ( 5 ) , where

H = -D-' (L+U) and c = D-' b

. then replace the entire vector x ( ~on


vector x ( ~ + ' )We ) the right side of Eqn. (13) by
x ( ~ + ' )to obtain the solution at the next iteration. In other words. each of the

Let us now solve a few examples for better understanding of the method and its
~~~- ~ -~
Determine the rate of convergence of the method and the number of iterations needed Iterative Methods

to make m?x IE!~)I S lo-'


Perform these number of iterations starting with-initial approximation = [I 2 21T
and com6are the result with the exact solutibn [2 4 3IT

Solution : The Jacobi method when applied to the system of Eqns. (18), gives the
iteration matrix

The eigenvalues of the matrix H are the roots of the characteristic eqmtion.
det (H-XI) = 0
Now
-A -1 --1
4 4
3
det (H-XI) = 12 -A 18
= ~ 3 - - = 0
80
-2 --1 -A
5 5
'
All the three eigenvalues of the matrix H are equal and they are equal to
A = 0.3347
The spectral radius is

We obtain t h a t e of convergence as
v = -i0gl0(0.3347) = 0.4753
The number of iterations needed for the required accuracy is given by

The Jacobi method when applied to the system of Eqns. (18) becomes

starting with the initial approximation x(" = [l 2 21T, we get from Eqn. (21)
x")=[1.75 3.375 3 . 0 1 ~
x"' = [1.8437 3.875 3.0251~
x'~'= 11.9625 3.925 2.9625IT
,'(A) =
.& [1.9906 3.9766 3.000(1]~
x(" = [1.9941 3.9953 3.00091~
- ..--,CI~G ~UIIuitionin Theorem 1 is violated. The iteration method does not conve ge
? .
Iterative Methods

We now perform few iterations and see what happens actually. Taking x(') = 0 and
using the Jacobi method

we obtain

and so on, which shows that the iterations are diverging fast. You may also try to
obtain the solution with other initial approximations.

El) ~ o i r i i f i v iterations
e of the Jacobi method for solving the system of equations
given in Example 4 with x(O) = [ l 1 1IT.

Let us now consider an example to show that the convergence criterion given in
Theorem 3 is only a sufficient condition. That is, there are system of equations which
are not diagonally dominant but, the Jacobi iteration method converges.

Example 5 : Perform iterations of the Jacobi method for solving the system of
equations

with x(O) = [0 1 llT. What can you say about the solution obtained if the exact
solution is x = [0 1 2IT?
Solution : The Jacobi method when applied to the given system of equations becomes
X(k+l)
1
= [3 - X$k) - x3(k)I
=1

xSk+') = [-1 + 3xIk)], k=0,1, .....


Using x(O) = [0 1 llT, we obtain

You may notice that the coefficient matrix is not diagonally dominant but the
iterations the exact solution after only two iterations.
And now a few exercises for yo;.

~ 2Perform
j four iterations of the Jacobi method for solving the system of equations

,with x(O) = 0. Exact solution is x = (1 -1 -llT 57


I

I ons of the Jacobi method for solving tne system V L c q u a r l v x w


I

I
I

I
I
I
I
with.dO)= 0. The exact solution is x = (1 1
I
I
E4) Perform four iterations of the Jacobi method for solving the system of equations
You may notice'here that in the first equation of system (24), we substitute the initial
approximation (xi0): xi0',.. .,xi0)).on the right hand side. In the second equation,
we substitute (xi1), xSO',...,xAO))on the right hand side. In the third equation, we
substitute (xi1), xi1), X$~),...X:O))on the right hand side. We continue in this manner
until all the components have been improved. A t the end of this first iteration, we
will have an @proved vector (xi1), xi1),. ..,xi1)). The entire process is then repeated.
In other words, the method uses an improved component as soon as it becomes
available. It is for this reason the metbod is also called the method of successive
displacements.

b
We can also write the s$tem of Eqns. (24) as follows:
all xik+') = - a12xik) - a13xjk) ... aIn xAk) + bl
-,

a21 X 1( k + l ) + a22 X2( k + l ) = - a2fl$k) - - a2n x(k)


n
+ b2

In matrix form, this system can be written as


(D+L) x ( ~ + ' )= - U x ( ~ +
) b
where D is the diagonal matrix

and L and U are respectively the lower and upper triangular matrices with the zeros
along the diagonal and are of the form

From Qn. (25), we obtain


= - ( D + L ) - ~ U X ( ~+) .(D+L)-l b
which is of the form (5) with
H = -(D+L)-' U and c = (D+L)-' b.
It may again be noted here, that if A is diagonally dominant then the iteration always
cbnverges.
Gauss-Seidel method will generally converge if the Jacobi method converges, and will
converge at a faster rate. For symmetric A ; it can be shown that
p(Gauss8eidel iteration method) = [p(Jacobi iteration method)12
of Linmr *lgebrniC Hence the rate of convergence clL ol. .gauss-Seidel method is twice the rate of
convergence of the ~ a c o bmethod.
i This result is usually true even when A is nc
symmetric.
I We shall illustrate this fact through exam~les.

Example 6 : Perform four iterations (rounded to four decimal placesj using the
Gauss-Seidel method for solving the system of equations

with do)= 0. The exact solution is x = (-1 -4 -3)T.

Solution : The Gapss-Seidel method, for the system (25) is

x{k+l) = + [x$k) + x$k)-:,.]

Taking do)= 0, we obtain the following iterations.


k =0

which is a good approximation to the exact solution x = (-1 -4 -3)T with maximum
absolute error 0.0034. Comparing with the results obtained in Example 1, we find
'-.
that the values of xi, i=1,2,3 obtained here are better approximates to the exact
60 solution than the one obtained in Example 1.
Soiution of Linear Algebraic Quation6 The eigenvalues of the matrix H are the roots of the characteristic equation
-A -1 -1
4 4
det(H-AI)= 0 --A 0 =0
8
0 -3
40 -(+AI ) +

We have
h(80A2 - 2A -1) = 0
which gives
A = 0, 0.125, -0.1
Therefore, we have
p(H) = 0.125
The rate of convergence of the method is given by
v = -10g~~(0.125) = 0.9031
The number of iterations needed for obtaining the desired accuracy is given by
k = - =2 - . Z 2 3
v 0.9031
The Gauss-Seidel method when applied to the system of Eqns. (29) becomes

X[k+')
1
= -[7 - X$k) + X$k)]
4
X$k+l) = 1
- -[-21 - 4x$k+l)- x 6 k ) ~
8 (30)

X$k+l)
1
= -[15 + zx$k+l) - (k+l)
5 x2 I
The successive iterations are obtained as
x(')= [1.75 3.75 2.951~
x ' ~ )= [1.95 3.9688 2.98631T
) [1.9956 3.9961 2.99901~
x(~=
which is an approximation to the exact solution after three iterations. Comparing the
results obtained in Example 2, we conclude that the Gauss-Seidel method converges
faster than the Jacobi method.
Example 8 : Use the Gauss-Seidel method for solving the following system of
equations.

with x(O) = [0.5 0.5 0.5 0.5IT. Compare the results with those obtained in
Example 3 after four iterations. The exact solution is x = [I 1 1 llT.
Solution : The Gauss-Seidel method, when applied to the system of Eqns. (31)
becomes
= ' [I + xik)]

x21 - 1
-q[xl (,+I) +

x3( k + l ) -
- T[x2
+ x$k)]

x$k+l) = l[l + xSk+')], k = 0, 1,...


Iterative Methods
Starting with the initial approximation x(O) = [0.5 0.5 0.5 0.51T, we obtain the
following iterates
x(') = [0.75 0.625 0.5625 0.78131~
xc2) = [0.8125 0.'6875 0.7344 0.86721~
d3)= [0.8438 0.7891 0.8282 0.9141]~
x ( ~ )= [0.8946 0.8614 0.8878 0.94391~

In Example 3, the result obtained after four iterations by the Jacobi method was
d4)= [0.8438 0.75 0.75 0.84381~
-Remark : The matrix formulations of the Jacobi and Gauss-Seidel methods are used
whenever we want to check whether the iterations converges or to find the rate of
convergence. If we wish to iterate and find solutions of the systems, we shall use the
equation form of the methods.
And now a few exercises for you.
You may now attempt the following exercises.
-- - -

E7) Perform four iterations of the Gauss-Seidel method for solving t k system of
equations given in E2).
E8) Perform four iterations of the Gauss-Seidel method for solving the system of
equations given in E3).
E9) Perform four iterations of the Gauss-Seidel method for solving the system of
equations given in E4).
E10) Set up the matrix formulation of the Gauss-Seidel method for solving the system
of equations given in E5). Perform four iterations of the method.
E l l ) Gauss-Seidel method is used to solve the system of equations given in E6).
Determine the rate of convergence and the number of iterations needed to
make m ? x I ~ $ ~G) ) [lo-'. Perform four iterations and compare the results with
'the exact solution.

We now end this unit. by giving a summary of what we have covered in it.

7.5 SUMMARY
In this unit, we have covered the following:
1) Iterative methods for solving linear system of .equations
Ax = b (see Eqn. (1))
where A is an n x n , non-singular matrix. Iterative methods are generally used
when the system is large and the matrix A is sparse. The process is 'started using
an initial approximation and lead to successively better approximations.
2) General iterative method for solving the linear system of Eqn. (1) can be written
in the form
x ( ~ + ' )= H X ( ~ ) + C, k = O,l, ........(see Eqn. (5))
where dk)and x ( ~ + ' )are the approximations to the solution vector x at the kth
and the (k+l)th iterations respectively. H is the iteration matrix which depends
on A and is generally a constant matrix. c is a column vector and depends on
both A and b.
3) Iterative method of the form given in 2) above converges for any initial vector,
if IlHll <1, which ii a sufficient condition for convergence. The necessary and
sufficient condition for convergence is p(H) <, where p(H) is the spectral radius
of H.
4) In the Jacobi iteration method or the method of simultaneous displacements.
H = - D-' (L+U); c = D-' b
where D is a diagonal matrix, L and U are respectively the lower and upper
triangular matrices with zero diagonal elements.
soluuon of *lgebrniC Qua-
5) In the Gauss-Seidel iteration method or the method of successive displacements
H = -(D + L)-'U and c = (D + L)-'b.
6) If the matrix A in Eqn. (I) is strictly diagonally dominant then the Jacobi and
Gauss-Seidel methods converge. Gauss-Seidel method converges faster than the
Jacobi method.

El) x(') = (-3 7 -31T

d3)= (-15 19 -9)T

)'(x = (-63 67 -33)T

Iterations do not converge.


E2) x") = [0.2 -1.2 -0.81~
)
'
(
x = [1.0 -0.8 -0.64JT
d3)= [0.776 -1.216 -1.041~
x(~)= [#.I024 -0.8864 -0.86721T

E3) x("= [0.75 0.0 0.251~


)'(x = [0.75 0.625 0.43751~
I x'~) = [0.9063 0.6719 0.71881~
I
~I x(~) = [0.9180 0.8594 0.75391~

E4) "'x = [-0.8 1.2 1.6 3.41~


I x'~' = [0.44 1.62 2.36 3.6IT
~ x(~)= [0.716 1.84 2.732 3.8421~
x'~) = [0.8828 1.9290 2.87% 3.92881~

x") = [0.5 0.5 0.5 0.5IT


x'~' = [0.75 0.75 0.75 0.751~
[0.875 0.875 0.875 0.875IT
x(*) = [0.9375 0.9375 0.9375 0.93751~
k=2,6
v

x"' = [0.5 0.3333 -0.54171T


x(*) = [0.7709 0.6945 -0.79001~
x'~' = [0.8950 0.8600 -0.90381~
x'~)= [0.9519 0.9359 -0.95591~
-
UNIT 8 EIGENVALUES AND
EIGENVECTORS

8.1 Introduction
8.'2 The Eigenvalue Problem
8.3 The Power Method
8.4 The Inverse Power Method
8.5 Summary
8.6 Solutions/Answers

8.1 INTRODUCTION
In Unit 7, you have seen that eigenvalues of the iteration matrix play a m'ajor role in
the study of convergence of iterative methods for solving linear system of equations.
Eigenvalues are also of great importance in many physical problems. The Stability of
an aircraft is determined by the location of the eigenvalues of a certain matrix in the
complex plane. The natural frequencies of the vibrations of a beam are actually
eigenvalues of a matrix. Thus the computation of the absolutely largest eigenvalue

For a given system of equations of the form

t e v a l u e s of the parameter A, for which the system of Eqn. (2) has a nonzero
solution, are' called the eigenvalues of A. Corresponding to these eigenvalues, the
nbnzero solutions of Eqn. (2) i.e. the vectors x, are called the eigenvectors of A. The
problem of finding the eigenvalues and the corresponding eigenvectors of a square
.katrix A is known as the eigenvalue problem. In this unit, we shall discuss ihe
eigenvalue problem. T o begin with, we shall give you some definitions and properties
related to eigenvalues. .

Solve simple eigenvalue problems;


Obtain the largest eigenvalue in magnitude and the corresponding eigenvector of
a given matrix by using the power method;
Obtain the smallest eigenvalue in magnitude and an eigenvalue closest to any
chosen number along with the corresponding eigenvector of a given matrix by
using the inverse power method.

'8.2 THE EIGENVALUE PROBLEM

homogeneous system

solution, x = 0. For the homogeneous system (3) to have a i.lonzero solution, the
matrix A must be singular and in this case the solution is not unique Pef. Theorem
Solutloa of L i m ~
Algebr*c EquaUo~ The homogeneous system of Eqn. (2) d l have a nonzero scilutiononly when the
coefficient matrix (A - XI) is singular, that is,
det (A - AI) = 0 (4)
If the matrix A i's an n x n matrix then Eqn. (4) gives a polynominal of degree n in
A. This polynomial is called the characteristic equation of A. The n roots h l , A,, ...,An
of this polynomial are the eigenvalues of A. For each eigenvalue hi, there exists a
vector xi (the eigenvector) which is the nonzero solution of the system of equations
(A - Xi) xi = 0 (5)
- - The eigenvalues have a number of interesting properties. We shall now state and
prove a few of these properties which we shall be using frequently.
P1 : A matrix A is singular if and only if it has a Zen, eigenvalue.

-
Proof : If A has a zero eigenvalue then
det (A - 0 I) = 0

*
det (A) = 0
A is singular.
Conversely, if A is singular then
det (A) = 0
+=- det (A - 0 I) = 0
* 0 is an eigenvalue of the matrix A.
P2 : A and have the same eigenvalues.
Proof : If A is an eigenvalue of A then
det (A - XI) = 0
* det (A - A I ) ~= 0 (ref. P6 Sec. 9.3, Unit 9, Block 3, MTE-02)
* det ( A -~ A I ~=) 0 (Ref. Theorem 3, Sec. 7.3, Unit7, Block 2, MTED2)
* det ( A - ~ XI) = 0
=+ A is an eigenvalue of
Hence the result.
However, the eigenvectors of A and are not the same.

P3 :If the eigenvalues of a matrix A are A', A,... ,An then the eigenvalues of Am, m
any positive integer, are Xy, A
,: ...,A t . Also both the matrices A and Am have the
same set of eigenvectors.

Proof : Since Xi (i = 1,2,...,n) are the eigenvalues of A, we have


Ax = Aix,i= 1,2,...,n (6)
Premultiplying Eqn. (6) by A on both sides, we get
A,X = A Xi x = &(Ax) = h?x (7)
which'implies that A:, &...,A: are the eigenvalues of A,. Further, A and A' have
the same eigenvectors. Premultiplying Eqn. (7) (m-1) times by A on both sides the
general result follows.

P4 : If A', A2 ,...,An are the eigenvalues of A , then l/Al, l/A2,...,l/A, are the
eigenvalues of A-'. Also both the matrices A and A-' have the same set of
eigenvectors.

Prkf : Since Xi (i= 1,2,. ..,n), q e the eigenvalues of A, we have


A x = Aix,i = 1,2,...,n (8)
Premultiplying Eqn. (8) on both sides by A-', we get
A-'A x = A, A-'X
which gives
x = Xi A-'x

and hehce the result.


P5 : If A,, A2 ,...,An are the eigenvalues of A , then A,
q, i=1,2 ,...,n are the
- Elgenvslues and Elgenvectons
eigenvalues of A-qI for any real number q. Both the matrices A and A - q I have
the same set of eigenvectors.

Proof : Since hi is an eigenvalues of A, we have


Ax = Aix, i = 1,2,...,n
Subtracting q x from both sides of Eqn. (9), we get
AX - qx = Aix - qx
which gives
( A - q1)x = (Ai - q)x
and the result follows.
1 .
P6 : If Xi, i = 1,2,...,naretheeigenvaluesof A then -? 1=1,2 ,...,n are the
hi-9
eigenvalues of (A - q1)-' for any real number q. Both the matrices A and (A - q I)-'
\
,have the kame set of eigenvectors.
P6 can be proved by combining P4 and P5. We leave the proof to you.

E l j Prove P6

We now give you a direct method of calculit?ng the eigenvalues and eigenvectors of
a matrix.
Example 1 : Find the eigenvalues of the matrix

Solution : a) Using ~ ~ n(4),


s .we obtain the characteristic equations as
\-A 0 0 :

det(A-XI) = 0 2-A 0 =0
0 0 3-A

which giyes (1-A) (2-A) (3-A) = 0.


.add hence the eigenvalues of A are A,=l, A,=2, A3=3.

which gives (1-A) (3-A) (6-A) = 0.


Eigenvalues of A are A,=l, A2=3, A3=6.

1-A 2 3
c) det (A-XI) = 0 4-A 5 =0
0 0 6-A'

Therefore, (1-A) (4-A) (6-A) = 0.


Eigenvalues of A are A , = l , A2=4, A3=6.
Remark : Observe that in Example 1 (a), the matrix A is diagonal and in parts (b)
and (c), it is lower and upper triangular respectively. In these cases the eigenvalues
of A are the diagonal elements. This is true for any diagonal, lower triangular or
upper triangular matrix. Formally, we give the-result in the following theorem.
I 1
Sdution of Idinear *lpebraic Equations Theorem 1 : The eigenvalues of a diagonal, lower triangular or an upper triangular
matrix are the diagonal elements themselves. Let us consider another example.
Example 2 : Find the eigenvalues and the corresponding eigenvectors of the matrices

I Solution : a) Using Eqns. (4), we obtain the characteristic equation as

I
which gives the polynomial

x2--5A+4=0
i.e., (A-1) (A-4) = 0
The matrix A has two distinct real eigenvalues XI = 1, A2 = 4. TO obtain the
corresponding eigenvectors we solve the system of Eqns. (5) for each value of A.
For A=l, we obtain the sytem of equations
X l + 2x2 = 0

x, + 2x2 = 0
which reduces to a single equation
x, +
2x2 = 0
Taking x2 = k, we get x, = -2k, k being arbitrary nonzero constant. Thus, the
eigenvector is of the form

I
For A=4, we obtain the sytem of equations
I
-2x1 + 2x2 = 0
Xl - X2 = 0
I
which reduces to a single equation
I x, - X2 = 0
I Taking x2 = k, we get xl = k and the corresponding eigenvectords

Note' In practice we usually omit k and say that [-2 1IT and [I 1IT are the
eigenvectors of A corresponding to the eigenvalues A = 1 and A = 4 respectively.
Moreover, the eigenvectors in this case are linearly independent.
b) The characteristic equation in this case becomes
I (A - 112= 0.
i Therefore, the matrix A has a repeated real eigenvalue. The eigenvector
I
corresponding to A = 1 is the solution of the system of Eqns. ( 9 , which reduces to
I a single equation
I

I X2 = 0.
1 Taking x, = k, we obtain the eigenvector as

I
, Note: that, in this case of repeated eigenvalues, we got linearly dependent
1 ' eigenvectors.
I
c) The characteristic equation in this case becomes
I h2-2A+5=O
1 70 which gives two complex eigenvalues A = 1 f 2i.
Taking x2 = k, we get the eigenvector

r Similarly, for A = 1 - 2i, we obtain the eigenvector

In the above problem you may note that corresponding to complex eigavalues, we
got complex eigenvectors. Let us now consider an example of 3 x 3 matrix.

Example 3 : Determine the eigenvalues and the corresponding eigenvectors for the
matrices '

Solution : a) The characteristic equation in t h i x a s e becomes


2-A -1 0
-1 2-A -1 =0
0 -1 2-A

which gives the polynomial


' (2-A) (A2-4A+2) = 0
Therefore, the eigenvalues of A are = 2,2 + f i and 2 - f i
The eigenvector of A corresponding to A = 2 is the solution of the system of
Eqns. ( 5 ) , which reduces to

Taking x3 = k, y e obtain the eigenvector

The eigenvector of A corresponding to A =2 + &is the solution of the system of


equations

! -JZ
-1
0
-1
-JZ
-1 -$Z
-1 '1 [ : , [ 1 ; ]
x.3
(10)

~ o ' f i n dthe solution of system of Eqns. (lo), we use Gauss elimination method.
Again performing R j - f i R2, we get

which give the equations

Taking x3 = k, we obtain the eigenvector

Similarly, corresponding to the eigenvalue A = 2-


solution of system of equations
a ,the eigenvector is the

Using the Gauss elimination method, the system reduces to the equations
f i x, - X 2 = o
X2-,lTx3=0
Taking x3 = k, we obtain the eigenvector

b) The characteristic equation in this case becomes


(A - 8) (A - 2)' = 0
Therefore, the matrix A has the real eigenvalues 8, 2 and 2. The eigenvalue 2 is
repeated two times.
The eigenvector corresponding to A = 8 is s o l u t i ~ nof system of Eqns. (9,which
reduces to
+
x, X2 - X j = 0
2x, + 5x, + X j = 0 (11)
2x,- x, - 5x3 = 0
Subtracting the last equation of system (11) from the second equation we obtain the
system of equations
X I + X2 - X 3 = 0

x, + X 3 = 0
Taking x3 = k. the eigenvector is

. .
The eigenvector corresponding to A = 2 is the solution of system of Eqns. ( 5 ) ,
wll,icli reduce: to a single equation.
+
2u, - X l x3 = 0 (12)
we can take any values for x, and x2 which need not be related to each other. T h e ,
two linearly independent sdutions can be written as:
El) A =

E2) A =
[
[
"
-15
J2
j2 J;]
4 3
2

10 -12 6 1
20 -4 2

E3lA= [-i 2 -3
-21 - 60 1

E4)A=[i
-1 -4
-i I]
In the examples considered so far, it was possible for us to find all the roots of the
characteristic equation exactly. But this may not always be possible. This is
particularly true for n > 3. In such cases some iterative method like Newton-Raphson
[I:; method may have to be used to find a particular eigenvalue or all the eigenvalues
from the characteristic equation. However, in many practical problems, we do not
:1
iY'
%
require all the eigenvalues but need only a selected eigenvalue. For example, when
we use iterative methods for solving a nonhomogeneous system of linear equations
Ax = b, we need to know only the largest eigenvalue in magnitude-qf the iteration
matrix H, to find out whethd the method converges or not. One iterative method;
which is frequently used to determine the largest eigenvalue in magnitude (also called
the dominant eigenvalue) and the corresponding eigenvector for a given square matrix
A is the power method. In this method d e do not find the characteristic equation.
This method is applicable only when all the eigenvalues are real and distinct. If the
.magnitude of two or more eigenvalues is the same then the method converges slowly.

8.3 THE POWER METHOD


Let us consider the eigenvalue problem
A x = Ax.
Let A,, A*, ...,An be the n real and distinct eigenvalues of A such that

Therefore, Al is the dominant eigenvalue of A.


In this method, we start with an arbitrary nonzero vector y(O) (not an eigenvector),
I
solution Or Liwar A'gebmicEquntions
and form a sequence of vectors ( Y ( ~ ) )

In the limit as k + m, y(k) converges to the eigenvector corresponding to the


dominant eigenvalue of the matrix A. We can stop the iteration when the largest
element in magnitude in y'"+')-y'") is less than the predefined error tolerance. For
simplicity, we usually take the initial vector ycU)with all its elements equal to one.
I Note that in the process of multiplying the matrix A with the vector y(k), the
I

I
Vector for which scal~nghas been elements of the vector y(k+'J may becbme very large. To avoid this, we normalize'(or
done is called a scaled vector scale) the vector y(k) at each step by dividing y(k),by its largest element in magnitude.
1 otherw~se,it IS unsealed. This will make the largest element in magnitude in 'the vector y(k+') as one and the
I remaining elements less than one.
I

I
If y(k) represents the unscaled vector and y(k) the scaled vector then, we have the
I
I power method.

v(()) = y0)and
mk+,being the largest element in magnitude of y(k+') We then
obtain the dominant eigenvalue by taking the limit

(~(~+'))r
A, = lim
k-tm (~(~))r

where r represents the rth component of that vector. Obviously, there are n ratios of
numbers. As k-m all these ratios tend to the same value, which is thedargest
eigenvalue in magnitude i.e., A,. The iteration is stopped when the magnitude of the
difference of any two ratios is less than the prescribed tolerance.
The corresponding eigenvector is then dk+')
obtained at the end of t'he last iteration
performed.
We now illustrate the method through an example.

Example 4 : Find the dominant eigenvalue and the corresponding eigenvector correct
to two decimal places of the matrix

using the power method.


Solution : We take
y(O) = v(O) = (1 1 1)T
Using Eqn. (14), we obtain

[ -1 -: -:I 1.1 [-i ]


Again,
2 -1
y(') = AV(') = =
v(7) = -y'7' = [0.7071 -1 0.70711~
3.4146

After 7 iterations, the r a t i o s B are given as 3.4138, 3.4146 and 3.4138. The
(v(@)r
maximum error in these ratios is 0.0008. Hence the dominant eigenvalue can be taken
as 3.414 and the corresponding eigenvector is [0.7071 -1 0.70711~

Note that the exact dominant eigenvalue of A as obtained in Example 3 was


2 + J2 = 3.4142 and the corresponding eigenvector was [I -& 11' which can also
1 1
be written as [-- -1 --lT = [0.7071 -1 0.70711~
J2 J2
You may now try the followingexercises.

Using four iterations of the power method and taking the initial vector y(") with all
its elements equal to one, find the dominant eigenvalue and the correspondingeigen-
vector for the following matrices.

You must have realised that an advantage of the power method is that the eigenvector
'corresponding to the dominant eigenvalue is also generated at the same time.
Usually, for most of the methods of determining eigenvalues, we need to do separate
computations to obtain the eigenvector.
In some problems, the most important eigenvalue is the eigenvalue of least
magnitude. We shall discuss now the inverse power method which gives the least
ejgenvalue in magnitude.
'1 Soktlon ot Llnenr Algebralc Equations
8.4 THE INVERSE POWER METHOD
1
We first note that if A is the smallest eigenvalue in magnitude of A, then - is the
X.-
largest eigenvalue in magnitude of A-'. The corresponding eigenvectors are same.
If we apply the power method to A-l, we obtain its largest eigenvalue and the
corresponding eigenvector. This eigenvalue is then the smallest eigenvalue in
magnitude of A and the eigenvector is same. Since power method is applied to A-l,
it is called the inverse power method.
Consider the method
y(k+l) = ~-1G(k) k=o ,1,2 ,.........
7 (17)

where y(O) is an arbitrary nonzero vector diffiient from the eigenvector of A.


However, algorithm (17) is not in suitable form, as one has find A-'. Alternately,
we write Eqn. (17) as
~,,(k+') = v(k)

We now need to solve a system of equations for Y ( ~ + 'which


), can be obtained using
any of the method discussed in the previous units. The ,largest eigenvalue of A-' ks
again given by
(y(k+l))r
p = lim
L - + ~ (dk))r

The corresponding eigenvector is d k + ' ) .


%e now illustrate the method thr$ugh an example.

Example 5 : Find the smallest eigenvalue in magnitude and the. corresponding


eigenvector of the matrix. .

using four iterations of the inverse power method.


.-" ~

Solution : Taking v(O) = [l 1 llT, we write


First iteration

..
For solving the system of Eqns. (19), we use the LU decomposition method. We write
. -

comparing the coefficients on both sides bf Eqns. (20): we obtain


and then uy(')= z
we obtain
y(') = [$ 2 $1 T
= [ 1.5 2.0 1.5 ]
T

Second iteration
~ ~ ( =2 v(l)
)

Solving LZ = v(')
and uy(')= z
we obtain

Third iteration

Fourth iteration

A p r 4 iterations, the ratios @?!


(d3))r
are given as 1.7059, 1.7083, 1.7059. The
maximum error in these ratios is 0.0024. Hence the dominant eigenvalue of A-' can
be taken as 1.70. Therefore, -- 0.5882 is the smallest eigenvalue of A in
1.70
magnitude and the corresponding eigenvector is given by [0.7073 1 0.70731~.
Note that the smallest eigenvalue in magnitude of A as calculated in Example 3 was
2-J2 = 0.5858 and the corresponding eigenvector was [I J2 llTor [0.7071 1 0.70711~.

You may now try the following exercise :


E7) Find the smallest eigenvalue in magnitude and the corresponding eigenvector of
the matrix

with do)= [-1 1lT, using four iterations of the inverse powe; method.

The inyersc! power method can be further generalized to find some other selected
eigenvalues of A. For instance, one may be interested to find the eigenvalue of A
which is nearest to some chosen number q. You know from P6 of Sec. 8.2 that the
matrices A and A-qI have the same set of eigenvectors. Further, for each eigenvalue
hi of A, hi-q is the eigenvalue of A-qI.
-

wutlon of Ll-r Al~ebraic


. Equations
We can therefore use the iteration

y^(k+1) = (A - qI)^(-1) v^(k), k = 0, 1, 2, ...          (21)

with scaling as described in Eqns. (14) - (16), i.e.

v^(k+1) = y^(k+1) / m_{k+1},          (22)

where m_{k+1} is the element of y^(k+1) that is largest in magnitude. We determine the dominant
eigenvalue μ of (A - qI)^(-1) using the procedure given in Eqn. (18), i.e.

μ = lim_{k→∞} (y^(k+1))_r / (v^(k))_r.

Using P6, we have the relation

μ = 1/(λ - q)          (23)

where λ is an eigenvalue of A.

Now since μ is the largest eigenvalue in magnitude of (A - qI)^(-1), 1/μ must be the
smallest eigenvalue in magnitude of A - qI. Hence, the eigenvalue λ = 1/μ + q of A is
closest to q.
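A hedged sketch of this shifted variant: the same inverse iteration applied to A - qI, with the eigenvalue of A nearest to q recovered as 1/μ + q. The matrix and the choice q = 3 are illustrative assumptions, chosen only so the result can be checked against a known eigenvalue.

```python
import numpy as np

def nearest_eigenvalue(A, q, v0, iterations=4):
    """Shifted inverse power iteration: solve (A - qI) y = v at each step;
    the eigenvalue of A closest to q is then 1/mu + q."""
    B = A - q * np.eye(A.shape[0])
    v = np.asarray(v0, dtype=float)
    for _ in range(iterations):
        y = np.linalg.solve(B, v)          # (A - qI) y^(k+1) = v^(k)
        mu = y[np.argmax(np.abs(y))]       # dominant eigenvalue of (A - qI)^(-1)
        v = y / mu                         # normalised as in Eqn. (22)
    return 1.0 / mu + q, v

A = np.array([[2.0, -1.0,  0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0,  2.0]])
print(nearest_eigenvalue(A, 3.0, [1.0, 1.0, 1.0]))  # converges towards 2 + sqrt(2) = 3.414
```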

Example 6 : Find the eigenvalue of the matrix A nearest to 3 and also the
corresponding eigenvector, using four iterations of the inverse power method, where
2 -1

-1
Solution : In this case q = 3. Thus we have

A - 3I =

To find y^(k+1), we need to solve the system

(A - 3I) y^(k+1) = v^(k)          (24)

and normalise y^(k+1) as given in Eqn. (22).
First iteration
Starting with v^(0) = [1  1  1]^T and using the Gauss elimination method to solve the
system (24), we obtain

Second iteration

Third iteration
After four iterations, the ratios (y^(4))_r / (v^(3))_r are given as 2.5, 2.333, 2.5. The maximum
error in these ratios is 0.1667. Hence the dominant eigenvalue of (A - 3I)^(-1) can
be taken as 2. Thus the eigenvalue λ of A closest to 3, as given by Eqn. (23), is

λ = 1/2 + 3 = 3.5

and the corresponding eigenvector is

v^(4) = [5/7  -1  5/7]^T = [0.7143  -1  0.7143]^T
Note that the eigenvalue of A closest to 3 as obtained in Example 3 was 2 + √2 = 3.4142.
The eigenvector corresponding to this eigenvalue was [0.7071  -1  0.7071]^T.
And now a few exercises for you.

E8) Find the eigenvalue which is nearest to - 1 and the corresponding eigenvector for
the matrix

with v^(0) = [-1  1]^T, using four iterations of the inverse power method.
E9) Using four iterations of the inverse power method, find the eigenvalue which is
nearest to 5 and the corresponding eigenvector for the matrix

A = [: :] (exact eigenvalues are = 1 and 6)

with v^(0) = [1  1]^T.

The eigenvalues of a given matrix can also be estimated. That is, for a given matrix
A, we can find the region in which all its eigenvalues lie. This can be done as follows:
Let λ_i be an eigenvalue of A and x_i be the corresponding eigenvector, i.e.,

A x_i = λ_i x_i          (25)

or, written out in component form, the system (26).

Let x_{i,k} be the largest element in magnitude of the vector x_i = [x_{i,1}, ..., x_{i,n}]^T.
Consider the kth equation of the system (26) and divide it by x_{i,k}. We then have

Taking magnitudes on both sides of Eqn. (27), we get Eqn. (28),

since |x_{i,j} / x_{i,k}| ≤ 1 for j = 1, 2, ..., n.

Since the eigenvalues of A and A^T are the same (Ref. P2), Eqn. (28) can also be written
with column sums in place of row sums, which gives Eqn. (29).

Since x_{i,k}, the largest element in magnitude, is unknown, we approximate Eqns. (28)
and (29) by

|λ| ≤ max_i Σ_j |a_{ij}|   (maximum absolute row sum)          (30)

and

|λ| ≤ max_j Σ_i |a_{ij}|   (maximum absolute column sum)          (31)

We can also rewrite Eqn. (27) in the form

λ_i - a_{kk} = Σ_{j≠k} a_{kj} (x_{i,j} / x_{i,k})

and, taking magnitudes on both sides, we get

|λ_i - a_{kk}| ≤ Σ_{j≠k} |a_{kj}|          (32)

Again, since A and A^T have the same eigenvalues, Eqn. (32) can be written as

|λ_i - a_{kk}| ≤ Σ_{j≠k} |a_{jk}|          (33)

Note that since the eigenvalues can be complex, the bounds (30), (31), (32) and (33)
represent circles in the complex plane. If the eigenvalues are real, then they
represent intervals. For example, when A is symmetric the eigenvalues of A are real.

Again, in Eqn. (32), since k is not known, we replace the circle by the union of the
n circles

|λ - a_{kk}| ≤ Σ_{j≠k} |a_{kj}|, k = 1, 2, ..., n.          (34)

Similarly, from Eqn. (33), we have that the eigenvalues of A lie in the union of the circles

|λ - a_{kk}| ≤ Σ_{j≠k} |a_{jk}|, k = 1, 2, ..., n.          (35)

The bounds derived in Eqns. (30), (31), (34) and (35) for the eigenvalues are all
independent bounds. Hence the eigenvalues must lie in the intersection of these
bounds. The circles derived above are called the Gerschgorin circles and the bounds
are called the Gerschgorin bounds.
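The Gerschgorin bounds are easy to compute directly from the entries of A. The sketch below (illustrative matrix and function name) prints the scalar bounds (30) and (31) and the row- and column-based circles (34) and (35) as (centre, radius) pairs.

```python
import numpy as np

def gerschgorin(A):
    """Return the scalar bounds (30), (31) and the Gerschgorin circles (34), (35)."""
    A = np.asarray(A, dtype=float)
    row_bound = np.max(np.sum(np.abs(A), axis=1))                 # bound (30)
    col_bound = np.max(np.sum(np.abs(A), axis=0))                 # bound (31)
    row_circles = [(A[i, i], np.sum(np.abs(A[i])) - abs(A[i, i]))
                   for i in range(A.shape[0])]                    # circles (34)
    col_circles = [(A[j, j], np.sum(np.abs(A[:, j])) - abs(A[j, j]))
                   for j in range(A.shape[0])]                    # circles (35)
    return row_bound, col_bound, row_circles, col_circles

# Illustrative matrix, not the one used in Example 7
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, -1.0],
     [0.0, -1.0, 2.0]]
print(gerschgorin(A))
```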
Let us now consider the following examples:
Example 7 : Estimate the eigenvalues of the matrix

using the Gerschgorin bounds.
Solution : The eigenvalues of A lie in the following regions:

i) the absolute row sums are 4, 6 and 6. Hence

|λ| ≤ max [4, 6, 6] = 6          (36)

ii) the absolute column sums are 4, 5 and 7. Hence

|λ| ≤ 7          (37)

iii) union of the circles [using (34)]

|λ - 1| ≤ 3,  |λ - 1| ≤ 5,  |λ - 2| ≤ 4          (38)

iv) union of the circles [using (35)]

|λ - 1| ≤ 3,  |λ - 1| ≤ 4,  |λ - 2| ≤ 5          (39)

The union of the circles in (iii) is |λ - 1| ≤ 5 and the union of the circles in (iv) is |λ - 2| ≤ 5.
The eigenvalues lie in all the circles (36), (37), (38) and (39), i.e., in the intersection of
these circles, as shown by the shaded region in Fig. 1.
i) When only the largest eigenvalue in magnitude is to be obtained, we use the
power method. In this method we obtain a sequence of vectors {y^(k)}, using
the iterative scheme

y^(k+1) = A y^(k), k = 0, 1, ...   (see Eqn. (13))

which in the limit as k → ∞ converges to the eigenvector corresponding to
the dominant eigenvalue of the matrix A. The vector y^(0) is an arbitrary
non-zero vector (different from the eigenvector of A).
ii) We use the inverse power method with the iteration scheme

y^(k+1) = (A - qI)^(-1) v^(k),
i.e., (A - qI) y^(k+1) = v^(k), k = 0, 1, 2, ...,

where v^(0) is an arbitrary non-zero vector (not an eigenvector),

a) with q = 0, if only the least eigenvalue of A in magnitude and the
corresponding eigenvector are to be obtained, and
b) with any q, if the eigenvalue of A nearest to some chosen number q and
the corresponding eigenvector are to be obtained.

E1) Characteristic equation : λ³ - 5λ² - λ + 5 = 0
eigenvalues : -1, 1, 5
eigenvectors : [-1  0  1]^T; [1  -√2  1]^T; [1  √2  1]^T

E2) Characteristic equation : λ³ + 25λ² + 50λ - 1000 = 0
eigenvalues : -20, -10, 5
eigenvectors : [-1  1/2  1]^T; [-1  -2  1]^T; [1/4  1/2  1]^T

E3) Characteristic equation : λ³ + λ² - 21λ - 45 = 0
eigenvalues : -3, -3, 5
eigenvectors : [1  0  1/3]^T; [0  1  2/3]^T; [-1  -2  1]^T

E4) Characteristic equation : λ³ - λ² - λ + 1 = 0
eigenvalues : -1, 1, 1
eigenvectors : [1/3  1  0]^T; [1  1  0]^T; [1  1  0]^T

After 4 iterations the ratios are given by 4.9946, 5.0054, 4.9946. The
maximum error in these ratios is 0.0108. Thus the dominant eigenvalue of A can
be taken as 5.00 and the corresponding eigenvector is [0.7075  1  0.7075]^T.
μ = 0.3. The eigenvalue of A which is nearest to -1 is obtained from Eqn. (23).

The corresponding eigenvector is [-1  2]^T.
Starting with v^(0) = [1  3]^T and solving

we get
[-: -1

T
-' 1.25
T

Similarly,

y^(2) = [ ... ]^T ;  m_2 = 19/20 = 0.95

After 4 iterations, the ratios (y^(4))_r / (v^(3))_r are 1.005, 0.9968. The maximum error in
these ratios is 0.0082. Hence the dominant eigenvalue of (A - 5I)^(-1) can be taken
as μ = 0.99.
The eigenvalue of A which is nearest to 5 is obtained from Eqn. (23):

λ = 1/0.99 + 5 ≈ 6.01.

The corresponding eigenvector is [1  1]^T.
UNIT 9 LAGRANGE'S FORM
Structure
9.1 Introduction
Objectives

9.2 Lagrange's Form

9.3 Inverse Interpolation

9.4 General Error Term

9.5 Summary

9.6 Solutions/Answers

9.1 INTRODUCTION

Let f be a real-valued function defined on the interval [a, b] and let us denote f(x_k) by f_k.
Suppose that the values of the function f(x) are given to be f_0, f_1, f_2, ..., f_n when x = x_0, x_1,
x_2, ..., x_n respectively, where x_0 < x_1 < x_2 < ... < x_n lie in the interval [a, b]. The function
f(x) may not be known to us. The technique of determining an approximate value of f(x)
for a non-tabular value of x which lies in the interval [a, b] is called interpolation. The
process of determining the value of f(x) for a value of x lying outside the interval [a, b] is
called extrapolation. In this unit, we derive a polynomial P(x) of degree ≤ n which agrees
with the values of f(x) at the given (n + 1) distinct points, called nodes or abscissas. In
other words, we can find a polynomial P(x) such that P(x_j) = f_j, j = 0, 1, 2, ..., n. Such a
polynomial P(x) is called the interpolating polynomial of f(x).

In Section 9.2 we prove the existence of an interpolating polynomial by actually
constructing one such polynomial having the desired property. The uniqueness is proved
by invoking a corollary of the fundamental theorem of algebra. In Section 9.4 we derive a
general expression for the error in approximating the function by the interpolating polynomial
at a point, and this allows us to calculate a bound on the error over an interval. In proving
this we make use of the generalized Rolle's theorem.

Objectives

After reading this unit, you should be able to:

find the Lagrange's form of interpolating polynomial interpolating f(x) at n + 1
distinct nodal points;

compute the approximate value of f at a non-tabular point;

compute the value of x̄ (approximately), given a number ȳ such that f(x̄) = ȳ
(inverse interpolation);

compute the error committed in interpolation, if the function is known, at a non-
tabular point of interest;

find an upper bound on the magnitude of the interpolation error.

9.2 LAGRANGE'S FORM

Let us recall the fundamental theorem of algebra and some useful corollaries.


Theorem 1: If P(x) is a polynomial of degree n ≥ 1, that is, P(x) = a_n x^n + a_{n-1} x^{n-1} + ...
+ a_1 x + a_0, with a_0, ..., a_n real or complex numbers and a_n ≠ 0, then P(x) has at least one
zero, that is, there exists a real or complex number ξ such that P(ξ) = 0.

Lemma 1: If z_1, z_2, ..., z_k are distinct zeros of the polynomial P(x), then

P(x) = (x - z_1)(x - z_2) ... (x - z_k) R(x)

for some polynomial R(x).

Corollary: If P_k(x) and Q_k(x) are two polynomials of degree ≤ k which agree at the k + 1
distinct points z_0, z_1, ..., z_k, then P_k(x) = Q_k(x) identically.

You have come across Rolle's theorem in Section 1.2. We need a generalized version of
this theorem in Section 9.4 (General Error Term). This is stated below.

Theorem 2: (Generalized Rolle's Theorem). Let f be a real-valued function defined on
[a, b] which is n times differentiable on ]a, b[. If f vanishes at the n + 1 distinct points x_0, ...,
x_n in [a, b], then a number c in ]a, b[ exists such that f^(n)(c) = 0.

We now show the existence of an interpolating polynomial and also show that it is unique.
The form of the interpolating polynomial that we are going to discuss in this section is
called the Lagrange form of the interpolating polynomial. We start with a relevant
theorem.

Theorem 3: Let x_0, x_1, ..., x_n be n + 1 distinct points on the real line and let f(x) be a real-
valued function defined on some interval I = [a, b] containing these points. Then there
exists exactly one polynomial P_n(x) of degree ≤ n which interpolates f(x) at x_0, ..., x_n, that
is, P_n(x_i) = f(x_i), i = 0, 1, 2, ..., n.

Proof: First we discuss the uniqueness of the interpolating polynomial, and then exhibit
one explicit construction of an interpolating polynomial (Lagrange's form).

Let P_n(x) and Q_n(x) be two distinct interpolating polynomials of degree ≤ n which
interpolate f(x) at the (n + 1) distinct points x_0, x_1, ..., x_n. Let h(x) = P_n(x) - Q_n(x). Note that
h(x) is also a polynomial of degree ≤ n. Also

h(x_i) = P_n(x_i) - Q_n(x_i) = f(x_i) - f(x_i) = 0, i = 0, 1, ..., n.

That is, h(x) has (n + 1) distinct zeros. But h(x) is of degree ≤ n, so from the Corollary to
Lemma 1 we have h(x) ≡ 0. That is, P_n(x) ≡ Q_n(x). This proves the uniqueness of the
polynomial.

Since the data is given at the points (x_0, f_0), (x_1, f_1), ..., (x_n, f_n), let the required polynomial
be written as

P_n(x) = Σ_{i=0}^{n} L_i(x) f_i          (1)

Setting x = x_j in (1), we get

P_n(x_j) = Σ_{i=0}^{n} L_i(x_j) f_i          (2)

Since this polynomial fits the data exactly, we must have

L_j(x_j) = 1
and L_i(x_j) = 0, i ≠ j,
or L_i(x_j) = δ_{ij}          (3)

The polynomials L_i(x), which are of degree ≤ n, are called the Lagrange fundamental
polynomials. It is easily verified that these polynomials are given by

L_i(x) = [(x - x_0)(x - x_1) ... (x - x_{i-1})(x - x_{i+1}) ... (x - x_n)] /
         [(x_i - x_0)(x_i - x_1) ... (x_i - x_{i-1})(x_i - x_{i+1}) ... (x_i - x_n)]          (4)

Substitution of (4) in (1) gives the required Lagrange form of the interpolating polynomial.

Remark: The Lagrange form (Eqn. (1)) of the interpolating polynomial makes it easy to show
the existence of an interpolating polynomial. But its evaluation at a point x involves a lot of
computation.

A more serious drawback of the Lagrange form arises in practice due to the following: one
often calculates a linear polynomial P_1(x), a quadratic polynomial P_2(x), etc., by increasing the
number of interpolation points, until a satisfactory approximation P_k(x) to f(x) has been
found. In such a situation the Lagrange form does not take any advantage of the availability of
P_{k-1}(x) in calculating P_k(x). Later on, we shall see how, in this respect, the Newton form,
discussed in the next unit, is more useful.
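For reference, here is a minimal Python sketch of how the form (1) with the fundamental polynomials (4) is evaluated at a point; the function name is an illustrative assumption, and the data used is that of Example 1 below.

```python
def lagrange_eval(xs, fs, x):
    """Evaluate P_n(x) = sum_i f_i L_i(x), with L_i given by formula (4)."""
    total = 0.0
    for i in range(len(xs)):
        Li = 1.0
        for j in range(len(xs)):
            if j != i:
                Li *= (x - xs[j]) / (xs[i] - xs[j])   # build L_i(x) factor by factor
        total += fs[i] * Li
    return total

# Data of Example 1 below: f(1) = -3, f(3) = 9, f(4) = 30, f(6) = 132
print(lagrange_eval([1, 3, 4, 6], [-3, 9, 30, 132], 2))   # P(2) = 0.0
```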

Let us consider some examples to construct this form of interpolation polynomials.

Example 1: If f(1) = - 3, f(3) = 9, f(4) = 30 and f(6) = 132, find the Lagrange's
interpolation polynomial of f(x).

Solution: We have x0 = 1, x, = 3, x2 = 4, x3 = 6 and fo = - 3, f1 = 9, f2 = 30, f3 = 132.

The Lagrange's interpolating polynomial P(x) is given by

where

L_0(x) = [(x - x_1)(x - x_2)(x - x_3)] / [(x_0 - x_1)(x_0 - x_2)(x_0 - x_3)],

and similarly for L_1(x), L_2(x) and L_3(x). Substituting L_j(x) and f_j, j = 0, 1, 2, 3 in Eqn. (5),
we get

P(x) = -(1/30)[x³ - 13x² + 54x - 72](-3) + (1/6)[x³ - 11x² + 34x - 24](9)
       - (1/6)[x³ - 10x² + 27x - 18](30) + (1/30)[x³ - 8x² + 19x - 12](132)

which gives on simplification

P(x) = x³ - 3x² + 5x - 6,

which is the Lagrange's interpolating polynomial of f(x).

Example 2: Using Lagrange's interpolation formula, find the value of f when x = 1.4 from
the following table.

Solution: The Lagrange's interpolating formula with 4 points is

Substituting

x_0 = 1.2, x_1 = 1.7, x_2 = 1.8, x_3 = 2.0 and

in (6), we get

Putting x = 1.4 on both sides of (7), we get


Now you can try some exercises.
E1) Show that

(i) Σ_{i=0}^{n} L_i(x) = 1,   (ii) Σ_{i=0}^{n} x_i^k L_i(x) = x^k, k ≤ n,

where L_i(x) are the Lagrange fundamental polynomials.

E2) Let w(x) = Π_{k=0}^{n} (x - x_k). Show that the interpolating polynomial of degree ≤ n with
the nodes x_0, x_1, ..., x_n can be written as

P_n(x) = Σ_{i=0}^{n} [w(x) / ((x - x_i) w'(x_i))] f(x_i).
9.3 INVERSE INTERPOLATION


In inverse interpolation in a table of values of x and y = f(x), one is given a number ȳ and
wishes to find the point x̄ so that f(x̄) = ȳ, where f(x) is the tabulated function. This
problem can always be solved if f(x) is (continuous and) strictly increasing or decreasing
(that is, the inverse of f exists). This is done by considering the table of values x_i, f(x_i), i = 0,
1, ..., n to be a table of values y_i, g(y_i), i = 0, 1, 2, ..., n for the inverse function g(y) = f^(-1)(y)
= x, by taking y_i = f(x_i), g(y_i) = x_i, i = 0, 1, 2, ..., n. Then we can interpolate for the unknown
value g(ȳ) in this table,

and x̄ = P_n(ȳ). This process is called inverse interpolation.
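In code, inverse interpolation is simply interpolation with the roles of the tabulated x and y exchanged. A small sketch (illustrative names and data; it assumes the tabulated y-values are distinct, i.e. f is strictly monotone):

```python
def lagrange_eval(xs, fs, x):
    # same Lagrange evaluation as in the earlier sketch
    total = 0.0
    for i in range(len(xs)):
        Li = 1.0
        for j in range(len(xs)):
            if j != i:
                Li *= (x - xs[j]) / (xs[i] - xs[j])
        total += fs[i] * Li
    return total

def inverse_interpolate(xs, ys, y_bar):
    """Estimate x_bar with f(x_bar) = y_bar by interpolating g(y) = x through (y_i, x_i)."""
    return lagrange_eval(ys, xs, y_bar)   # roles of x and y exchanged

# Illustrative data: y = x**3 is monotone, so the inverse g exists
print(inverse_interpolate([1.0, 2.0, 3.0, 4.0],
                          [1.0, 8.0, 27.0, 64.0], 20.0))  # approximate cube root of 20
```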

Let us consider some examples.

Example 3: From the following table, find the Lagrange's interpolating polynomial which
agrees with the values of x at the given values of y. Hence find the value of x when y = 2.

Solution: Let x = g(y). The Lagrange's interpolating polynomial P(y) of g(y) is given by

which, on simplification, gives

P(y) = y³ - y² + 1.

The Lagrange's interpolating polynomial of x is given by P(y).

∴ x = P(y) = y³ - y² + 1
∴ when y = 2, x = P(2) = 5.

Example 4: Find the value of x when y = 3 from the following table of values.

Solution: The Lagrange's interpolation polynomial of x is given by

Now you try some exercises.

E3) Find the Lagrange's interpolation polynomial of f(x) from the following data. Hence
obtain f(2).

E4) Using the Lagrange's interpolation formula, find the value of f(x) when x = 0 from
the following table:

E5) Find the value of y when x = 6 from the following table:

E6) From the following table of values, find the value of y when x = 2.5.

E7) Find the value of f(5) from the following table:

E8) Using the Lagrange's interpolation formula, find the value of y when x = 10.

E9) In the following table, h is the height above the sea level and p is the barometric
pressure. Calculate p when h = 5280.

E10) In the following table, y represents the percentage of the number of workers in a
factory whose age is less than x years. Find what percentage of workers have their
age less than 35 years.

Now we are going to find the error committed in approximating the value of the function
by P_n(x).

9.4 GENERAL ERROR TERM

Let E_n(x) = f(x) - P_n(x) be the error involved in approximating the function f(x) by an
interpolating polynomial. We derive an expression for E_n(x) in the following theorem.
This result helps us in estimating a useful bound on the error, as explained in an example.

Theorem 4: Let x_0, x_1, ..., x_n be distinct numbers in the interval [a, b] and let f have
(continuous) derivatives up to order (n + 1) in the open interval ]a, b[. If P_n(x) is the
interpolating polynomial of degree ≤ n which interpolates f(x) at the points x_0, ..., x_n, then
for each x in [a, b] a number ξ(x) in ]a, b[ exists such that

E_n(x) = f(x) - P_n(x) = [f^(n+1)(ξ(x)) / (n+1)!] (x - x_0)(x - x_1) ... (x - x_n)          (9)
Proof: If x ≠ x_k for any k = 0, 1, 2, ..., n, define the function g for t in [a, b] by

g(t) = f(t) - P_n(t) - [f(x) - P_n(x)] [(t - x_0)(t - x_1) ... (t - x_n)] / [(x - x_0)(x - x_1) ... (x - x_n)].

Since f(t) has continuous derivatives up to order (n + 1) and P_n(t) has derivatives of all
orders, g(t) has continuous derivatives up to order (n + 1). Now, for k = 0, 1, 2, ..., n, we
have g(x_k) = 0, and

g(x) = f(x) - P_n(x) - [f(x) - P_n(x)] . 1 = 0

Thus g has continuous derivatives up to order (n + 1) and g vanishes at the (n + 2) distinct
points x_0, x_1, ..., x_n and x. By the generalized Rolle's Theorem (Theorem 2) there exists ξ(x) in
]a, b[ for which g^(n+1)(ξ) = 0. Differentiating g(t) (n + 1) times with respect to t and
evaluating at ξ, we get

0 = g^(n+1)(ξ) = f^(n+1)(ξ) - [f(x) - P_n(x)] (n + 1)! / [(x - x_0)(x - x_1) ... (x - x_n)].

Simplifying, we get the error (9) at x = x̄.

The error formula (Eqn. (9)) derived above is an important theoretical result because
Lagrange interpolating polynomials are extensively used in deriving important formulae
for numerical differentiation and numerical integration.

It is to be noted that ξ = ξ(x̄) depends on the point x̄ at which the error estimate is
required. This dependence need not even be continuous. This error formula is of limited
utility since f^(n+1)(x) is not known (when we are given a set of data at specific nodes) and
the point ξ is hardly known. But the formula can be used to obtain a bound on the error of
the interpolating polynomial. Let us see how, by an example.

Example 5: The following table gives the values of f(x) = e^x. If we fit an interpolating
polynomial of degree four to the data, find the magnitude of the maximum possible error
in the computed value of f(x) when x = 1.25.

x      1.2      1.3      1.4      1.5      1.6

f(x)   3.3201   3.6692   4.0552   4.4817   4.9530


Solution: From Eqn. (9), the magnitude of the error associated with the 4th degree
polynomial approximation is given by

|E_4(x)| = |f(x) - P_4(x)| ≤ |(x - x_0)(x - x_1)(x - x_2)(x - x_3)(x - x_4)| max |f^(5)(x)| / 5!          (10)

Since f(x) = e^x, f^(5)(x) = e^x.

When x lies in the interval [1.2, 1.6],

max |f^(5)(x)| = e^1.6 = 4.9530          (11)

Substituting (11) in (10) and putting x = 1.25, the upper bound on the magnitude of the
error

= |(1.25 - 1.2)(1.25 - 1.3)(1.25 - 1.4)(1.25 - 1.5)(1.25 - 1.6)| (4.9530) / 5!

= 0.00000135.
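The arithmetic of this bound is easy to check; a small Python sketch of the computation in formula (10) with max |f^(5)| = e^1.6:

```python
from math import exp, factorial

x, nodes = 1.25, [1.2, 1.3, 1.4, 1.5, 1.6]
psi = 1.0
for xi in nodes:
    psi *= (x - xi)                       # (x - x0)(x - x1)...(x - x4)

bound = exp(1.6) * abs(psi) / factorial(5)
print(bound)                              # about 1.35e-06
```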
You may now try the following exercises.

E11) For the data of Example 5 with the last node omitted, i.e., considering only the first four
nodes, if we fit a polynomial of degree 3, find an estimate of the magnitude of the
error in the computed value of f(x) when x = 1.25. Also find an upper bound on the
magnitude of the error.

E12) The following table gives the values of x and f(x) = sinh x. If the value of f(x) when
x = 0.53 is computed from the second degree interpolation polynomial, find the
estimate of the magnitude of the error.

E13) Find the value of x when y = 3 from the following table:

E14) Find the value of x when y = 4 from the table given below:

E15) Find the interpolating polynomial which fits the following data, taking x as the
independent variable.

E16) Using Lagrange's interpolation formula, find the value of f(4) from the following
data:

Let us take a brief look at what you have studied in this unit.

9.5 SUMMARY

In this unit, we have seen how to derive the Lagrange's form of interpolating polynomial
for a given data. It has been shown that the interpolating polynomial for a given data is
unique. Moreover, the Lagrange form of interpolating polynomial can be determined for
equally spaced or unequally spaced nodes. We have also seen how the Lagrange's
interpolation formula can be applied with y as the independent variable and x as the
dependent variable, so that the value of x corresponding to a given value of y can be
calculated approximately when some conditions are satisfied. Finally, we have derived the
general error formula and its use has been illustrated to judge the accuracy of our
calculation. The mathematical formulae derived in this unit are listed below for your easy
reference.

1) Lagrange's Form

P_n(x) = Σ_{i=0}^{n} f(x_i) L_i(x), where

L_i(x) = Π_{j=0, j≠i}^{n} (x - x_j) / (x_i - x_j)

3) Interpolation Error

E_n(x) = f(x) - P_n(x) = [f^(n+1)(ξ) / (n+1)!] (x - x_0)(x - x_1) ... (x - x_n)

9.6 SOLUTIONS/ANSWERS

E1) If f(x) is a polynomial of degree ≤ n, then

f(x) = P_n(x) = Σ_{i=0}^{n} L_i(x) f(x_i), by uniqueness of the interpolating polynomial.

When f(x) = 1, we get (i).

When f(x) = x^k, k ≤ n, we get (ii) by the same argument.

E2) Since w(x) = Π_{j=0}^{n} (x - x_j),
UNIT 10 NEWTON FORM OF THE
INTERPOLATING POLYNOMIAL
Structure
10.1 Introduction
Objectives

10.2 Divided Differences

10.3 Newton's General Form of Interpolating Polynomial '

10.4 The Error of the Interpolating Polynomial

10.5 Divided Differences and Derivatives

10.6 Further Results on Interpolation Error

10.7 Summary

10.8 Solutions/Answers

10.1 INTRODUCTION
The Lagrange's form of the interpolating polynomial derived in Unit 9 has some drawbacks
compared to the Newton form of interpolating polynomial that we are going to consider now.
In practice, one is often not sure as to how many interpolation points to use. One often
calculates P_1(x), P_2(x), ..., increasing the number of interpolation points, and hence the
degrees of the interpolating polynomials, till one gets a satisfactory approximation P_k(x) to
f(x). In such an exercise, the Lagrange form seems to be wasteful as, in calculating P_k(x), no
advantage is taken of the fact that one has already constructed P_{k-1}(x), whereas in the Newton
form it is not so.

Before deriving Newton's general form of interpolating polynomial, we introduce the
concept of divided difference and the tabular representation of divided differences. Also
the error of the interpolating polynomial in this case is derived in terms of divided
differences. Using the two different expressions for the error term we get a relationship
between the nth order divided difference and the nth order derivative.

Objectives

After studying this unit, you should be able to :

obtain a divided difference in terms of function values;

form a table of divided differences and find divided differences with a given set of
arguments from the table;

show that a divided difference is independent of the order of its arguments;

obtain the Newton's divided differences interpolating polynomial for a given data;

find an estimate of f(x) for a given non-tabular value of x from a table of values of
x and y [= f(x)];

relate the kth order derivative of f(x) with the kth order divided difference from the
expression for the error term.
10.2 DIVIDED DIFFERENCES

Suppose that we have determined a polynomial P_{k-1}(x) of degree ≤ k - 1 which
interpolates f(x) at the points x_0, x_1, ..., x_{k-1}. In order to make use of P_{k-1}(x) in calculating
P_k(x), we consider the following problem: What function g(x) should be added to P_{k-1}(x)
to get P_k(x)? Let g(x) = P_k(x) - P_{k-1}(x). Now, g(x) is a polynomial of degree ≤ k and
g(x_i) = P_k(x_i) - P_{k-1}(x_i) = f(x_i) - f(x_i) = 0 for i = 0, 1, ..., k - 1.

Suppose that P_n(x) is the Lagrange polynomial of degree at most n that agrees with the
function f at the distinct numbers x_0, x_1, ..., x_n. P_n(x) can have the following
representation, called the Newton form:

P_n(x) = A_0 + A_1(x - x_0) + A_2(x - x_0)(x - x_1) + ... + A_n(x - x_0)(x - x_1) ... (x - x_{n-1})          (1)

for appropriate constants A_0, A_1, ..., A_n.

Evaluating P_n(x) (Eqn. (1)) at x_0 we get A_0 = P_n(x_0) = f(x_0). Similarly, when P_n(x) is
evaluated at x_1, we get A_1 = [f(x_1) - f(x_0)] / (x_1 - x_0). Let us introduce the notation for divided
differences and define it at this stage: the zeroeth divided difference of the function f
with respect to x_i is denoted by f[x_i] and is simply the evaluation of f at x_i, that is, f[x_i] = f(x_i).
The first divided difference of f with respect to x_i and x_{i+1} is denoted by f[x_i, x_{i+1}]
and defined as

f[x_i, x_{i+1}] = (f(x_{i+1}) - f(x_i)) / (x_{i+1} - x_i).

The remaining divided differences of higher orders are defined inductively as follows. The
kth divided difference relative to x_i, x_{i+1}, ..., x_{i+k} is defined as

f[x_i, x_{i+1}, ..., x_{i+k}] = (f[x_{i+1}, ..., x_{i+k}] - f[x_i, ..., x_{i+k-1}]) / (x_{i+k} - x_i)

where the (k-1)st divided differences f[x_i, ..., x_{i+k-1}] and f[x_{i+1}, ..., x_{i+k}] have been determined.
This shows that the kth divided difference is the divided difference of (k-1)st divided
differences, justifying the name. The divided difference f[x_1, x_2, ..., x_k] is invariant under
all permutations of the arguments x_1, x_2, ..., x_k. To show this we proceed as follows, giving
another expression for the divided difference.
For any integer k between 0 and n, let Q_k(x) be the sum of the first k + 1 terms in form (1),
i.e.,

Q_k(x) = A_0 + A_1(x - x_0) + ... + A_k(x - x_0)(x - x_1) ... (x - x_{k-1}).

Since each of the remaining terms in Eqn. (1) has the factor (x - x_0)(x - x_1) ... (x - x_k),
Eqn. (1) can be rewritten as

P_n(x) = Q_k(x) + (x - x_0) ... (x - x_k) R(x), for some polynomial R(x). As the term (x - x_0)
(x - x_1) ... (x - x_k) R(x) vanishes at each of the points x_0, ..., x_k, we have f(x_i) = P_n(x_i) = Q_k(x_i),
i = 0, 1, 2, ..., k. Since Q_k(x) is a polynomial of degree ≤ k, by uniqueness of the
interpolating polynomial, Q_k(x) = P_k(x).

This shows that P_n(x) can be constructed step by step with the addition of the next term in
Eqn. (1), as one constructs the sequence P_0(x), P_1(x), ..., with P_k(x) obtained from P_{k-1}(x)
in the form

P_k(x) = P_{k-1}(x) + A_k(x - x_0) ... (x - x_{k-1})          (2)

That is, g(x) = P_k(x) - P_{k-1}(x) is a polynomial of degree ≤ k having (at least) the k distinct
zeros x_0, ..., x_{k-1}.

∴ P_k(x) - P_{k-1}(x) = g(x) = A_k(x - x_0) ... (x - x_{k-1}) for some constant A_k. This constant
A_k is called the kth divided difference of f(x) at the points x_0, ..., x_k, for reasons discussed
below, and is denoted by f[x_0, ..., x_k]. Thus Eqn. (2) can be rewritten as

P_k(x) = P_{k-1}(x) + f[x_0, ..., x_k] (x - x_0)(x - x_1) ... (x - x_{k-1})          (3)

To get an explicit expression for f[x_0, ..., x_k], we make use of the Lagrange form of the
interpolating polynomial and the uniqueness of the interpolating polynomial.

From Eqn. (3) we have

P_k(x) = P_{k-1}(x) + f[x_0, ..., x_k] (x - x_0) ... (x - x_{k-1}).

Since (x - x_0)(x - x_1) ... (x - x_{k-1}) = x^k + a polynomial of degree < k, we can
rewrite P_k(x) as P_k(x) = f[x_0, ..., x_k] x^k + a polynomial of degree < k          (4)

(as P_{k-1}(x) is a polynomial of degree < k).

But considering the Lagrange form of the interpolating polynomial, the coefficient of x^k in
P_k(x) is Σ_{i=0}^{k} f(x_i) / Π_{j≠i} (x_i - x_j).

Therefore, on comparison with Eqn. (4), we have

f[x_0, ..., x_k] = Σ_{i=0}^{k} f(x_i) / Π_{j=0, j≠i}^{k} (x_i - x_j)          (5)

This shows that

f[y_0, ..., y_k] = f[x_0, ..., x_k]

if y_0, ..., y_k is a reordering of the sequence x_0, ..., x_k. We have defined the zeroeth divided
difference of f(x) at x_0 by f[x_0] = f(x_0), which is consistent with Eqn. (5).

For k = 1, we have from Eqn. (5)

f[x_0, x_1] = f(x_0)/(x_0 - x_1) + f(x_1)/(x_1 - x_0) = (f(x_1) - f(x_0))/(x_1 - x_0).

This shows that the first divided difference is really a divided difference.

For k = 2, it can be shown (using Eqn. (5)) that

f[x_0, x_1, x_2] = (f[x_1, x_2] - f[x_0, x_1]) / (x_2 - x_0).

This shows that the second divided difference is a divided difference of divided
differences.

We show below in Theorem 1 that for k > 2

f[x_0, ..., x_k] = (f[x_1, ..., x_k] - f[x_0, ..., x_{k-1}]) / (x_k - x_0)          (6)

This shows that the kth divided difference is the divided difference of (k - 1)st divided
differences, justifying the name. If M = {x_0, ..., x_n}, N denotes any n - 1 elements of
M, and the remaining two elements are denoted by α and β, then

f[x_0, ..., x_n] = [((n-1)st divided difference on N and α) - ((n-1)st divided difference on N and β)] / (α - β)          (7)
Theorem 1:

f[x_0, x_1, ..., x_j] = (f[x_1, ..., x_j] - f[x_0, ..., x_{j-1}]) / (x_j - x_0).

Proof: Let P_{j-1}(x) be the polynomial of degree ≤ j - 1 which interpolates f(x) at x_0, ..., x_{j-1},
and let Q_{j-1}(x) be the polynomial of degree ≤ j - 1 which interpolates f(x) at the points
x_1, ..., x_j. Let us define P(x) as

P(x) = [(x - x_0) Q_{j-1}(x) - (x - x_j) P_{j-1}(x)] / (x_j - x_0)          (8)

This is a polynomial of degree ≤ j, and P(x_i) = f(x_i) for i = 0, 1, ..., j. By uniqueness of the
interpolating polynomial we have P(x) = P_j(x). Therefore,

equating the coefficient of x^j on both sides of Eqn. (8), we obtain

(leading coefficient of P_j(x)) = [leading coefficient of Q_{j-1}(x) - leading coefficient of P_{j-1}(x)] / (x_j - x_0),

that is, f[x_0, ..., x_j] = (f[x_1, ..., x_j] - f[x_0, ..., x_{j-1}]) / (x_j - x_0).

We now illustrate this theorem with the help of a few examples, but before that we give the
table of divided differences of various orders.

Table of divided differences

Suppose we denote, for convenience, a first order divided difference of f(x) with any two
arguments by f[., .], a second order divided difference with any three arguments by f[., ., .]
and so on. Then the table of divided differences can be written as follows:

Table 1

Example 1: If f(x) = x³, find the value of f[a, b, c].

Solution: f[a, b] = (f(b) - f(a))/(b - a) = (b³ - a³)/(b - a) = a² + ab + b².

Similarly, f[b, c] = b² + bc + c².

∴ f[a, b, c] = (f[b, c] - f[a, b])/(c - a) = [(c² - a²) + b(c - a)]/(c - a) = a + b + c.

Example 2: If f(x) = 1/x, show that f[a, b, c, d] = -1/(abcd).

Solution: f[a, b] = (1/b - 1/a)/(b - a) = (a - b)/(ab(b - a)) = -1/(ab).

Similarly,

f[a, b, c] = (-1/(bc) + 1/(ab))/(c - a) = [(c - a)/(abc)]/(c - a) = 1/(abc).

Similarly,

f[b, c, d] = 1/(bcd).

∴ f[a, b, c, d] = [f[b, c, d] - f[a, b, c]]/(d - a) = [1/(bcd) - 1/(abc)]/(d - a) = -1/(abcd).

In the next section we shall make use of the divided differences to derive Newton's general
form of interpolating polynomial.

10.3 NEWTON'S GENERAL FORM OF INTERPOLATING POLYNOMIAL

In Sec. 10.2 we have shown how P_n(x) can be constructed step by step as one constructs the
sequence P_0(x), P_1(x), ..., with P_k(x) obtained from P_{k-1}(x) with the addition of the
next term in Eqn. (3), that is,

P_k(x) = P_{k-1}(x) + (x - x_0)(x - x_1) ... (x - x_{k-1}) f[x_0, ..., x_k].

Using this, Eqn. (1) can be rewritten as

P_n(x) = f[x_0] + (x - x_0) f[x_0, x_1] + (x - x_0)(x - x_1) f[x_0, x_1, x_2] + ... +
         (x - x_0)(x - x_1) ... (x - x_{n-1}) f[x_0, x_1, ..., x_n].          (9)

This can be written compactly as follows:

P_n(x) = Σ_{k=0}^{n} f[x_0, ..., x_k] Π_{j=0}^{k-1} (x - x_j).          (10)

This is the Newton's form of interpolating polynomial.
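A short Python sketch of how the divided-difference coefficients and the Newton form (10) are computed in practice (illustrative names; the data is that of Example 5 further below, where f(x) = x³ + x - 4 is tabulated at x = 0, 2, 3, 4):

```python
def divided_differences(xs, fs):
    """Return f[x0], f[x0,x1], ..., f[x0,...,xn], built column by column."""
    coef = list(fs)
    n = len(xs)
    for j in range(1, n):
        for i in range(n - 1, j - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (xs[i] - xs[i - j])
    return coef

def newton_eval(xs, coef, x):
    """Evaluate P_n(x) = f[x0] + (x - x0) f[x0,x1] + ... by nested multiplication."""
    result = coef[-1]
    for i in range(len(coef) - 2, -1, -1):
        result = result * (x - xs[i]) + coef[i]
    return result

xs, fs = [0, 2, 3, 4], [-4, 6, 26, 64]     # f(x) = x**3 + x - 4
coef = divided_differences(xs, fs)
print(coef)                                # [-4, 5, 5, 1]
print(newton_eval(xs, coef, 1))            # -2, i.e. f(1)
```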

Example 3: From the following table of values, find the Newton's form of interpolating
polynomial approximating f(x).

Solution: We notice that the values of x are not equally spaced. We are required to find a
polynomial which approximates f(x). We form the table of divided differences of f(x).

Table 2

26 1 13
6 822 132
789
7 1611

Since the divided differences up to order 4 are available, the Newton's interpolating
polynomial P_4(x) is given by

P_4(x) = f(x_0) + (x - x_0) f[x_0, x_1] + (x - x_0)(x - x_1) f[x_0, x_1, x_2] +
         (x - x_0)(x - x_1)(x - x_2) f[x_0, x_1, x_2, x_3] +
         (x - x_0)(x - x_1)(x - x_2)(x - x_3) f[x_0, x_1, x_2, x_3, x_4]          (11)

where x_0 = -1, x_1 = 0, x_2 = 3, x_3 = 6 and x_4 = 7.

The divided differences f(x_0), f[x_0, x_1], f[x_0, x_1, x_2], f[x_0, x_1, x_2, x_3] and f[x_0, x_1, x_2, x_3, x_4] are
those which lie along the diagonal at f(x_0), as shown by the dotted line. Substituting the
values of x_i and the values of the divided differences in Eqn. (11), we get
which on simplification gives

P_4(x) = x⁴ - 3x³ + 5x² - 6.
We now consider an example to show how Newton's interpolating polynomial can be used
to obtain the approximate value of the function f(x) at any non-tabular point.
Example 4: Find the approximate values of f(x) at x = 2 and x = 5 in Example 3.

Solution: Since f(x) ≈ P_4(x), from Example 3 we get

f(2) ≈ P_4(2) = 16 - 24 + 20 - 6 = 6
and
f(5) ≈ P_4(5) = 625 - 375 + 125 - 6 = 369

Note 1: When the values of f(x) for given values of x are required to be found, it is not
necessary to find the interpolating polynomial P_4(x) in its simplified form given
above. We can obtain the required values by substituting the values of x in
Eqn. (11) itself. Thus,

P_4(2) = 3 + (3)(-9) + (3)(2)(6) + (3)(2)(-1)(5) + (3)(2)(-1)(-4)(1)
       = 3 - 27 + 36 - 30 + 24 = 6.

Similarly,
P_4(5) = 3 + (6)(-9) + (6)(5)(6) + (6)(5)(2)(5) + (6)(5)(2)(-1)(1)
       = 3 - 54 + 180 + 300 - 60 = 369.

Then f(2) ≈ P_4(2) = 6
and f(5) ≈ P_4(5) = 369.

Example 5: Obtain the divided differences interpolation polynomial and the Lagrange's
interpolating polynomial of f(x) from the following data and show that they are the same.

Solution: (a) Divided differences interpolation polynomial:

Table 3

x     f[x]     f[., .]     f[., ., .]     f[., ., ., .]
0     -4
                  5
2      6                       5
                 20                             1
3     26                       9
                 38
4     64
(b) Lagrange's interpolation polynomial:

On simplifying, we get
P(x) = x3 + x - 4.
Thus, we find that both polynomials are the same.

You may now try the following exercises:

E1) Find the Lagrange's interpolating polynomial of f(x) from the table of values given
below and show that it is the same as the Newton's divided differences
interpolating polynomial.

E2) From the table of values given below, obtain the value of y when x = 1.5 using

(a) the divided differences interpolation formula,

(b) the Lagrange's interpolation formula.

E3) Using Newton's divided differences interpolation formula, find the values of f(8)
and f(15) from the following table.

In Unit 9 we have derived the general error term, i.e. the error committed in approximating
f(x) by P_n(x). In the next section we derive another expression for the error term in terms of
divided differences.

10.4 THE ERROR OF THE INTERPOLATING POLYNOMIAL

Let P_n(x) be the Newton form of interpolating polynomial of degree ≤ n which interpolates
f(x) at x_0, ..., x_n. The interpolating error E_n(x) of P_n(x) is given by

E_n(x) = f(x) - P_n(x).          (12)

Let x̄ be any point different from x_0, ..., x_n. If P_{n+1}(x) is the Newton form of interpolating
polynomial which interpolates f(x) at x_0, ..., x_n and x̄, then P_{n+1}(x̄) = f(x̄). Then by (10) we
have

P_{n+1}(x) = P_n(x) + (x - x_0)(x - x_1) ... (x - x_n) f[x_0, ..., x_n, x̄].

Putting x = x̄ in the above, we have

E_n(x̄) = f(x̄) - P_n(x̄) = (x̄ - x_0)(x̄ - x_1) ... (x̄ - x_n) f[x_0, ..., x_n, x̄].          (13)

This shows that the error is like the next term in the Newton form.
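Because the error behaves like the next Newton term, an extra tabulated point gives a computable error estimate. A tiny self-contained sketch (illustrative data, f(x) = x³, linear interpolation through x_0 = 0, x_1 = 2):

```python
# Error estimate for linear interpolation of f(x) = x**3 through x0 = 0, x1 = 2,
# using the next divided difference with the extra point xbar = 1.
f = lambda x: x**3
x0, x1, xbar = 0.0, 2.0, 1.0

f01  = (f(x1) - f(x0)) / (x1 - x0)            # f[x0, x1]
f1b  = (f(xbar) - f(x1)) / (xbar - x1)        # f[x1, xbar]
f01b = (f1b - f01) / (xbar - x0)              # f[x0, x1, xbar]

P1    = f(x0) + (xbar - x0) * f01             # linear interpolant at xbar
error = (xbar - x0) * (xbar - x1) * f01b      # Eqn. (13)
print(P1 + error, f(xbar))                    # both equal 1.0
```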

10.5 DIVIDED DIFFERENCES AND DERIVATIVES OF THE FUNCTION

Comparing Eqn. (13) with the error formula derived in Unit 9, Eqn. (9), we can establish a
relationship between divided differences and the derivatives of the function.

Comparing, we have f[x_0, x_1, ..., x_n, x̄] = f^(n+1)(ξ) / (n + 1)!

Further, it can be shown that ξ ∈ ]min x_i, max x_i[.

We state these results in the following theorem.

Theorem 2: Let f(x) be a real-valued function defined on [a, b] and n times differentiable
in ]a, b[. If x_0, ..., x_n are n + 1 distinct points in [a, b], then there exists ξ ∈ ]a, b[ such that

f[x_0, ..., x_n] = f^(n)(ξ) / n!
Corollary 1: If f(x) = x^n, then f[x_0, ..., x_n] = 1.

Corollary 2: If f(x) = x^k, k < n, then

f[x_0, ..., x_n] = 0,

since the nth derivative of x^k, k < n, is zero.

For example, consider the first divided difference

f[x_0, x_1] = (f(x_1) - f(x_0)) / (x_1 - x_0).

By the Mean value theorem, f(x_1) = f(x_0) + (x_1 - x_0) f'(ξ), x_0 < ξ < x_1.

Substituting, we get

f[x_0, x_1] = f'(ξ), x_0 < ξ < x_1.

Example 6: If f(x) = a_n x^n + a_{n-1} x^{n-1} + ... + a_1 x + a_0, then find f[x_0, x_1, ..., x_n].

Solution: From Corollaries 1 and 2 we have f[x_0, x_1, ..., x_n] = a_n (n!/n!) + 0 = a_n.

Let us consider another example.

Example 7: If f(x) = 2x³ + 3x² - x + 1, find

f[1, -1, 2, 3], f[a, b, c, d], f[4, 6, 7, 8].

Solution: Since f(x) is a cubic polynomial, the 3rd order divided differences of f(x) with
any set of arguments are constant and equal to 2, the coefficient of x³ in f(x).

Thus, it follows that f[1, -1, 2, 3], f[a, b, c, d], and f[4, 6, 7, 8] are each equal to 2.
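This is easy to verify numerically. A small Python sketch using the recurrence (6) directly (illustrative function name):

```python
f = lambda x: 2*x**3 + 3*x**2 - x + 1

def dd(xs):
    """Divided difference of f over the points xs, by the recurrence (6)."""
    if len(xs) == 1:
        return f(xs[0])
    return (dd(xs[1:]) - dd(xs[:-1])) / (xs[-1] - xs[0])

print(dd([1, -1, 2, 3]), dd([4, 6, 7, 8]))   # both 2.0, the leading coefficient
```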

You may now try the following exercises:

E4) If f(x) = 2x³ - 3x² + 7x + 1, what is the value of f[1, 2, 3, 4]?

E5) If f(x) = 3x² - 2x + 5, find f[1, 2], f[2, 3] and f[1, 2, 3].

In the next section, we are going to discuss bounds on the interpolation error.

10.6 FURTHER RESULTS ON INTERPOLATION ERROR

We have derived the error formula

E_n(x) = f(x) - P_n(x) = [f^(n+1)(ξ(x)) / (n+1)!] (x - x_0)(x - x_1) ... (x - x_n).

We assume that f(x) is (n + 1) times continuously differentiable in the interval of interest
[a, b] = I that contains x_0, ..., x_n and x. Since ξ(x) is unknown, we may replace f^(n+1)(ξ(x))
by max_{t∈I} |f^(n+1)(t)|.

If we denote (x - x_0)(x - x_1) ... (x - x_n) by ψ_n(x), then we have

|E_n(x)| = |f(x) - P_n(x)| ≤ [max_{t∈I} |f^(n+1)(t)| / (n+1)!] max_{x∈I} |ψ_n(x)|.          (14)
Consider now the case when the nodes are equally spaced, that is, x_j = x_0 + jh, j = 0, ..., N,
and h is the spacing between consecutive nodes. For the case n = 1 we have linear
interpolation. If x ∈ [x_{i-1}, x_i], then we approximate f(x) by P_1(x) which interpolates at
x_{i-1} and x_i. From Eqn. (14) we have

|E_1(x)| ≤ (1/2!) max_{t∈I} |f''(t)| max_{x∈I} |ψ_1(x)|,

where ψ_1(x) = (x - x_{i-1})(x - x_i).

Now,

dψ_1/dx = (x - x_{i-1}) + (x - x_i) = 0

gives x = (x_{i-1} + x_i)/2.

Hence, the maximum value of |(x - x_{i-1})(x - x_i)| occurs at x = x* = (x_{i-1} + x_i)/2.

The maximum value is given by

|ψ_1(x*)| = (x_i - x_{i-1})²/4 = h²/4.

Thus, we have for linear interpolation, for any x ∈ I,

|E_1(x)| ≤ (h²/8) max_{t∈I} |f''(t)|.

For the case n = 2, it can be shown that for any x ∈ [x_{i-1}, x_{i+1}],

|E_2(x)| ≤ h³M/(9√3), where |f'''(x)| ≤ M on I.

Example 8: Determine the spacing h in a table of equally spaced values of the function
f(x) = √x between 1 and 2, so that interpolation with a first degree polynomial in this
table will yield seven place accuracy.

Solution: Here f''(x) = -(1/4) x^(-3/2),

max_{1≤x≤2} |f''(x)| = 1/4,

and |E_1(x)| ≤ (h²/8)(1/4) = h²/32.

For seven place accuracy, h is to be chosen such that

h²/32 < 5 × 10⁻⁸,

or h² < (160)10⁻⁸, that is, h < 0.0013.
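The arithmetic of Example 8 can be reproduced in a few lines; a sketch of the check h²/32 ≤ 5 × 10⁻⁸, the seven-place tolerance assumed here:

```python
from math import sqrt

M2 = 0.25                    # max |f''(x)| = max (1/4) x**(-1.5) on [1, 2]
tol = 5e-8                   # seven decimal place accuracy

# |E1(x)| <= h**2 / 8 * M2  must stay below tol
h_max = sqrt(8 * tol / M2)
print(h_max)                 # about 0.00126, i.e. h < 0.0013
```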

E6) If f(x) takes the values -21, 15, 12 and 3 respectively when x assumes the values
-1, 1, 2 and 3, find the polynomial which approximates f(x).

E7) Using the following table of values, find the polynomial which approximates f(x).
Hence obtain the value of f(5).

E8) Find the polynomial which approximates f(x), tabulated below.

Also find an approximate value of f(x) at x = 1 and x = -2.

E9) If f(3) = 168, f(7) = 120, f(9) = 72 and f(10) = 63, find an approximate value of
f(6).

E10) The following table gives steam pressures P at different temperatures T, measured
in degrees. Find the pressure at temperature 372.1 degrees.

E11) From the following table, find the value of y when x = 102.

E12) From the following table of values, obtain the value of y at x = 3.

E13) Obtain the polynomial which agrees with the values of f(x) as shown below.

E14) Determine the spacing h in a table of equally spaced values of the function f(x) = √x
between 1 and 2, so that interpolation with a second-degree polynomial in this table
yields seven-place accuracy.

We now end this unit by giving a summary of what we have covered in it.

10.7 SUMMARY

In this unit we have derived a form of interpolating polynomial called Newton's general
form, which has some advantages over the Lagrange's form discussed in Unit 9. This form
is useful in deriving some other interpolating formulas. We have introduced the concept of
divided differences and discussed some of its important properties before deriving
Newton's general form. The error term has also been derived and, utilizing the error term,
we have established a relationship between the divided difference and the derivative of the
function f(x) for which the interpolating polynomial has been obtained. The main formulas
derived are listed below:
10.8 SOLUTIONS AND ANSWERS


El) x3-x2+3x+8

E2) 26.35156 '

E3) We form the divided differences table of f(x) below.


Table 4

From the Newton's divided difference interpolation formula, we have

Substituting x = 8 in the above, we get

Substituting x = 15, we get

f(15) = 3150

E4) - 3

E5) 6 5 - 2 . 5 ~]1,2[;611-2,q €1 2,3[,and6

E6) x3 - 9x2 + 17x + 6

E7) x3 - 5x2 + 6x + 1.31


E8) 3x4 - 5x3 + 6x2- 14x + 5, - 5,145

E9) 147

E10) 177.4

Ell) 15.79

E12) 84

E13) x 3 + x 2 - x + 2

E14) f'''(x) = (3/8) x^(-5/2); hence max_{1≤x≤2} |f'''(x)| = 3/8.

For seven place accuracy, h has to be chosen such that

h³/(24√3) < 5 × 10⁻⁸. This gives h ≤ 0.0128.

The number of intervals is N = (2 - 1)/h = 79.
UNIT 11 INTERPOLATION AT EQUALLY .
SPACED POINTS
Structure

11.1 Introduction
Objectives

11.2 Differences
11.2.1 Forward Differences
11.2.2 Backward Differences
11.2.3 Central Differences

11.3 Difference Formulas

11.3.1 Newton's Forward Difference Formula
11.3.2 Newton's Backward Difference Formula
11.3.3 Stirling's Central Difference Formula

11.4 Summary

11.5 Solutions/Answers

11.1 INTRODUCTION
Suppose that y is a function of x. The exact functional relation y = f(x) between x and y
may or may not be known. But the values of y at (n + 1) equally spaced values of x are
supposed to be known, i.e., (x_i, y_i), i = 0, ..., n are known, where x_i - x_{i-1} = h (fixed),
i = 1, 2, ..., n. Suppose that we are required to determine an approximate value of f(x)
or its derivative f'(x) for some values of x in the interval of interest. The methods for
solving such problems are based on the concept of finite differences. We have
introduced the concept of forward, backward and central differences and discussed their
interrelationship in Sec. 11.2.

We have already introduced two important forms of the interpolating polynomial in Units
9 and 10. These forms simplify when the nodes are equidistant. For the case of equidistant
nodes, we have derived the Newton's forward and backward difference forms and Stirling's
central difference form of interpolating polynomial, each suitable for use under a specific
situation. We have derived these methods in Sec. 11.3, and also given the corresponding
error term.

Objectives
After reading this unit, you should be able to

write a forward difference in terms of function values from a table of forward
differences and locate a difference of given order at a given point;

write a backward difference in terms of function values from a table of backward
differences and identify differences of various orders at any given point from the
table;

expand a central difference in terms of function values and form a table of central
differences;

establish relations between Δ, ∇, δ and divided differences;

obtain the interpolating polynomial of f(x) for a given data by applying any one of
the interpolating formulas;

compute f(x) approximately when x lies near the beginning of the table and estimate
the error;

compute f(x) approximately when x lies near the end of the table and estimate the
error;

estimate the value of f(x) when x lies near the middle of the table and estimate the
error.

11.2 DIFFERENCES

Suppose that we are given a table of values (x_i, y_i), i = 0, 1, 2, ..., N, where y_i = f(x_i) = f_i.
Let the nodal points be equidistant, that is,

x_i = x_0 + ih, i = 0, 1, ..., N.

For simplicity we introduce a linear change of variables

s = s(x) = (x - x_0)/h, so that x = x(s) = x_0 + sh          (2)

and introduce the notation

f(x) = f(x_0 + sh) = f_s.

The linear change of variables in Eqn. (2) transforms polynomials of degree n in x into
polynomials of degree n in s. We have already introduced the divided-difference table to
calculate a polynomial of degree ≤ n which interpolates f(x) at x_0, x_1, ..., x_n. For equally
spaced nodes, we shall deal with three types of differences, namely forward, backward
and central, and discuss their representation in the form of a table. We shall also derive the
relationship of these differences with divided differences and their interrelationship.

11.2.1 Forward Differences


We denote the forward differences of f(x) of ith order at x = x_0 + sh by Δ^i f_s and define them
as follows:

Δ^0 f_s = f_s,   Δ^i f_s = Δ^{i-1} f_{s+1} - Δ^{i-1} f_s, i = 1, 2, ...,

where Δ denotes the forward difference operator.

When s = k, that is, x = x_k, we have

for i = 1, Δf_k = f_{k+1} - f_k

for i = 2, Δ²f_k = Δf_{k+1} - Δf_k
         = f_{k+2} - f_{k+1} - [f_{k+1} - f_k]
         = f_{k+2} - 2f_{k+1} + f_k

Similarly, Δ³f_k = f_{k+3} - 3f_{k+2} + 3f_{k+1} - f_k.

We recall the binomial theorem

(1 + x)^s = Σ_{j=0}^{s} (s choose j) x^j,

where s is a non-negative integer.


We give below in Lemma 1 the relationship between the forward and divided differences.
This relation will be utilized to derive the Newton's forward-difference formula which
interpolates f(x) at x_k + ih, i = 0, 1, ..., n.

Lemma 1: For all i ≥ 0,

f[x_k, x_{k+1}, ..., x_{k+i}] = Δ^i f_k / (i! h^i)          (5)

Proof: We prove the result by induction.

For i = 0, both sides of relation (5) are the same by convention, that is, f[x_k] = f_k = Δ^0 f_k.

Assuming that relation (5) holds for i = n ≥ 0, we have for i = n + 1

f[x_k, ..., x_{k+n+1}] = (f[x_{k+1}, ..., x_{k+n+1}] - f[x_k, ..., x_{k+n}]) / (x_{k+n+1} - x_k)
                     = (Δ^n f_{k+1} - Δ^n f_k) / (n! h^n (n + 1)h) = Δ^{n+1} f_k / ((n + 1)! h^{n+1}).

This shows that relation (5) holds for i = n + 1 also. Hence (5) is proved. We now give a
result which immediately follows from this lemma in the following corollary.

Corollary: If P_n(x) is a polynomial of degree n with leading coefficient a_n, and x_0 is an
arbitrary point, then

Δ^n P_n(x_0) = a_n n! h^n

and Δ^{n+1} P_n(x_0) = 0, i.e., all higher differences are zero.

Proof: Taking k = 0 in relation (5) we have

Δ^n P_n(x_0) = n! h^n P_n[x_0, ..., x_n].

Let us recall that

f[x_0, ..., x_i] = f^(i)(ξ) / i!,          (6)

where f(x) is a real-valued function defined on [a, b], i times differentiable in ]a, b[, and
ξ ∈ ]a, b[.

Taking i = n and f(x) = P_n(x) in Eqns. (6) and (5), we get

Δ^n P_n(x_0) = h^n n! P_n^(n)(ξ)/n! = h^n n! a_n.

Since Δ^{n+1} P_n(x_0) = Δ^n P_n(x_1) - Δ^n P_n(x_0) = h^n n! a_n - h^n n! a_n = 0,

this completes the proof.
The shift operator E is defined as

E f_i = f_{i+1}.

In general, E f(x) = f(x + h). We have E^s f_i = f_{i+s}.

For example,
E² f_i = f_{i+2}, E³ f_i = f_{i+3} and E^{-1/2} f_i = f_{i-1/2}.

Now,
Δf_i = f_{i+1} - f_i = E f_i - f_i = (E - 1) f_i.

Hence the shift and forward difference operators are related by

Δ = E - 1   or   E = 1 + Δ.          (8)

Operating s times, we get

Δ^s = (E - 1)^s = Σ_{j=0}^{s} (s choose j) E^j (-1)^{s-j}          (9)

Making use of relation (8) in Eqn. (9), we get

Δ^s f_k = Σ_{j=0}^{s} (-1)^{s-j} (s choose j) f_{k+j}.

We now give in Table 1 the forward differences of various orders using 5 values.

Table 1: Forward Difference Table

Note that the forward differences Δ^k f_0 lie on a straight line sloping downward to the right.
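Building the table column by column is straightforward; a hedged Python sketch (illustrative function name; the data is that of Example 4 further below in this unit, where f(x) = x³ + 2x + 7 is tabulated at x = 1, 2, ..., 6):

```python
def forward_difference_table(fs):
    """Return the columns [f, Δf, Δ²f, ...] of the forward-difference table."""
    cols = [list(fs)]
    while len(cols[-1]) > 1:
        prev = cols[-1]
        cols.append([prev[i + 1] - prev[i] for i in range(len(prev) - 1)])
    return cols

# f(x) = x**3 + 2x + 7 at x = 1, 2, ..., 6
for col in forward_difference_table([10, 19, 40, 79, 142, 235]):
    print(col)
# third differences are constant (= 6); higher differences vanish
```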

11.2.2 Backward Differences


Let f be a real-valued function of x. Let the values of f(x) at the n + 1 equally spaced points
x_0, x_1, ..., x_n be f_0, f_1, ..., f_n respectively.

The backward differences of f(x) of ith order at x_k = x_0 + kh are denoted by ∇^i f_k. They are
defined as follows:

∇^0 f_k = f_k,   ∇^i f_k = ∇^{i-1} f_k - ∇^{i-1} f_{k-1}, i = 1, 2, ...,          (10)

where ∇ denotes the backward difference operator.

Using (10), we have, for i = 1, ∇f_k = f_k - f_{k-1}, and for i = 2, ∇²f_k = f_k - 2f_{k-1} + f_{k-2}.

Example 1: Evaluate the differences

Solution: (a) ∇³[a_2 x² + a_1 x + a_0] = 0, since all third and higher order differences of a
second degree polynomial vanish.

Note that the backward differences ∇^k f_4 lie on a straight line sloping upward to the right.

Also note that Δf_k = ∇f_{k+1} = f_{k+1} - f_k.

Try to show that Δ⁴f_0 = ∇⁴f_4.

Let us now discuss about the central differences.

11.2.3 Central Differences


The first order central difference of f at x_k, denoted by δf_k, is defined as

δf_k = f_{k+1/2} - f_{k-1/2}.

Operating with δ, we obtain the higher order central differences as

δ^s f_k = δ^{s-1} f_{k+1/2} - δ^{s-1} f_{k-1/2},

with δ^0 f_k = f_k when s = 0.

The second order central difference is given by

δ²f_k = δ[f_{k+1/2} - f_{k-1/2}] = δf_{k+1/2} - δf_{k-1/2}
     = f_{k+1} - f_k - f_k + f_{k-1}
     = f_{k+1} - 2f_k + f_{k-1}.

Similarly,

δ³f_k = f_{k+3/2} - 3f_{k+1/2} + 3f_{k-1/2} - f_{k-3/2}

and δ⁴f_k = f_{k+2} - 4f_{k+1} + 6f_k - 4f_{k-1} + f_{k-2}.

Notice that the even order differences at a tabular value x_k are expressed in terms of
tabular values of f, and odd order differences at a tabular value x_k are expressed in terms of
non-tabular values of f. Also note that the coefficients of δ^s f_k are the same as those of the
binomial expansion of (1 - x)^s, s = 1, 2, 3, ....

Since

δf_k = f_{k+1/2} - f_{k-1/2} = E^{1/2} f_k - E^{-1/2} f_k,

we have the operator relation

δ = E^{1/2} - E^{-1/2}.          (14)

The central differences at a non-tabular point x_{k+1/2} can be calculated in a similar way.

For example,

δf_{k+1/2} = f_{k+1} - f_k

δ²f_{k+1/2} = f_{k+3/2} - 2f_{k+1/2} + f_{k-1/2}

δ³f_{k+1/2} = f_{k+2} - 3f_{k+1} + 3f_k - f_{k-1}

δ⁴f_{k+1/2} = f_{k+5/2} - 4f_{k+3/2} + 6f_{k+1/2} - 4f_{k-1/2} + f_{k-3/2}          (15)

Relation (15) can be obtained easily by using the relation (14).

The following formulas can also be established:

We now give below the central difference table with 5 nodes.

Table 3 : Central Difference Table

Note that the differences δ^{2m}f_0 lie on a horizontal line shown by the dotted lines.
Note that the differences δ^{2m}f_2 lie on a horizontal line.

We now define the mean operator μ as follows:

μf_k = (1/2)[f_{k+1/2} + f_{k-1/2}] = (1/2)[E^{1/2} + E^{-1/2}] f_k.

Hence

μ = (1/2)[E^{1/2} + E^{-1/2}].

Relation Between the Operators Δ, ∇, δ and μ

We have expressed Δ, ∇, δ and μ in terms of the operator E as follows:

Δ = E - 1,  ∇ = 1 - E^{-1},  δ = E^{1/2} - E^{-1/2},  μ = (1/2)[E^{1/2} + E^{-1/2}].

Also E^{1/2} = μ + δ/2

and E^{-1/2} = μ - δ/2.
Example 2: (a) Express Δ³f_1 as a backward difference.

(b) Express Δ³f_1 as a central difference.

(c) Express δ²f_2 as a forward difference.

Solution:

(a) Δ³f_1 = (E∇)³f_1 = E³∇³f_1 = ∇³E³f_1 = ∇³f_4     (Δ = E∇)

(b) Δ³f_1 = (E^{1/2}δ)³f_1 = E^{3/2}δ³f_1 = δ³E^{3/2}f_1 = δ³f_{5/2}     (Δ = E^{1/2}δ)

(c) δ²f_2 = [E^{-1/2}Δ]²f_2 = E^{-1}Δ²f_2 = Δ²E^{-1}f_2 = Δ²f_1     (δ = E^{-1/2}Δ)

Example 3: Prove that (a) μ² = 1 + δ²/4,  (b) μδ = (Δ + ∇)/2,  (c) 1 + μ²δ² = (1 + δ²/2)².

Solution: (a) We have μ = (1/2)[E^{1/2} + E^{-1/2}], so

μ² = (1/4)[E + 2 + E^{-1}] = 1 + (1/4)[E - 2 + E^{-1}] = 1 + δ²/4.

(b) L.H.S.:
μδ = (1/2)(E^{1/2} + E^{-1/2})(E^{1/2} - E^{-1/2}) = (1/2)(E - E^{-1}).
R.H.S.:
(1/2)(Δ + ∇) = (1/2)[(E - 1) + (1 - E^{-1})] = (1/2)(E - E^{-1}).

Hence, the result.

(c) We have

μδ = (1/2)(E^{1/2} + E^{-1/2})(E^{1/2} - E^{-1/2}) = (1/2)(E - E^{-1}),

∴ 1 + μ²δ² = 1 + (E - E^{-1})²/4 = [(E - E^{-1})² + 4]/4 = [(E + E^{-1})/2]² = (1 + δ²/2)².

E1) Express ∇⁴f_5 in terms of function values.

E2) Show that (E + 1)δ = 2(E - 1)μ.


- - -

11.3 DIFFERENCE FORMULAS

We shall now derive different difference formulas using the results obtained in the
preceding section (Sec. 11.2).

11.3.1 Newton's Forward-Difference Formula

In Unit 10, we have derived the Newton's form of interpolating polynomial (using divided
differences). We have also established, in Sec. 11.2.1, the following relationship between
divided differences and forward differences:

f[x_k, ..., x_{k+n}] = Δ^n f_k / (n! h^n)          (22)

Substituting the divided differences in terms of the forward differences in the Newton's
form, and simplifying, we get the Newton's forward-difference form. The Newton's form of
the interpolating polynomial interpolating at x_k, x_{k+1}, ..., x_{k+n} is

P_n(x) = f_k + (x - x_k) f[x_k, x_{k+1}] + ... + (x - x_k)(x - x_{k+1}) ... (x - x_{k+n-1}) f[x_k, ..., x_{k+n}].

Substituting (22), we obtain

P_n(x) = f_k + (x - x_k) Δf_k/h + ... + (x - x_k)(x - x_{k+1}) ... (x - x_{k+n-1}) Δ^n f_k/(n! h^n).          (23)

Setting k = 0, we have the form

P_n(x) = f_0 + (x - x_0) Δf_0/h + ... + (x - x_0)(x - x_1) ... (x - x_{n-1}) Δ^n f_0/(n! h^n).          (24)

Using the transformation (2), we have

x - x_{k+j} = x_0 + sh - [x_0 + (k + j)h] = (s - k - j)h.

Hence (23) can be rewritten as

P_n(x) = f_k + (s - k) Δf_k + [(s - k)(s - k - 1)/2!] Δ²f_k + ... + [(s - k)(s - k - 1) ... (s - k - n + 1)/n!] Δ^n f_k,          (25)

which is a polynomial of degree ≤ n in s.

Setting k = 0 in (25) we get the formula

P_n(x) = P_n(x_0 + sh) = Σ_{j=0}^{n} (s choose j) Δ^j f_0.          (26)

The form (23), (24), (25) or (26) is called the Newton's forward-difference formula.

The error term is now given by

E_n(x) = [s(s - 1)(s - 2) ... (s - n)/(n + 1)!] h^{n+1} f^{(n+1)}(ξ), ξ ∈ ]x_0, x_n[.
Example 4: Find the Newton's forward-difference interpolating polynomial which agrees
with the table of values given below. Hence obtain the value of f(x) at x = 1.5.

Solution: We form a table of forward differences of f(x).

Table 5 : Forward Differences

Since the third order differences are constant, the higher order differences vanish and we
can infer that f(x) is a polynomial of degree 3, so that the Newton's forward-difference
interpolation polynomial represents f(x) exactly and is not an approximation to it. The
step length in the data is h = 1. Taking x_0 = 1 and the subsequent values of x as x_1, x_2, ...,
the Newton's forward-difference interpolation polynomial

P(x) = f_0 + (x - x_0) Δf_0 + [(x - x_0)(x - x_1)/2!] Δ²f_0 + [(x - x_0)(x - x_1)(x - x_2)/3!] Δ³f_0

becomes

P(x) = 10 + (x - 1)(9) + [(x - 1)(x - 2)/2](12) + [(x - 1)(x - 2)(x - 3)/6](6),

which on simplification gives

f(x) = x³ + 2x + 7.

∴ f(1.5) ≈ (1.5)³ + 2(1.5) + 7
= 3.375 + 3 + 7 = 13.375
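A minimal Python sketch of evaluating formula (26) at an off-step point, as was just done for f(1.5); only the leading entries Δ^k f_0 of the difference table are needed (illustrative names):

```python
def newton_forward(x0, h, fs, x):
    """Evaluate formula (26): P_n(x0 + sh) = sum_k C(s, k) Δ^k f_0."""
    # leading forward differences Δ^k f_0 from the tabulated values fs
    diffs, col = [fs[0]], list(fs)
    while len(col) > 1:
        col = [col[i + 1] - col[i] for i in range(len(col) - 1)]
        diffs.append(col[0])
    s = (x - x0) / h
    term, total = 1.0, 0.0
    for k, d in enumerate(diffs):
        total += term * d
        term *= (s - k) / (k + 1)        # builds the binomial coefficient C(s, k)
    return total

# Example 4 data: f(x) = x**3 + 2x + 7 at x = 1, 2, ..., 6
print(newton_forward(1.0, 1.0, [10, 19, 40, 79, 142, 235], 1.5))   # 13.375
```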
Note: If we want only the value of f(1.5), and the interpolation polynomial is not needed,
we can use the formula (26). In this case, s = (1.5 - 1)/1 = 0.5, and

f(1.5) ≈ 10 + (0.5)(9) + [(0.5)(-0.5)/2](12) + [(0.5)(-0.5)(-1.5)/6](6) = 13.375.

E3) The population of a town in the decennial census was as given below. Estimate the
population for the year 1915.

Population y: 46    66    81    93    101
(in thousands)
Example 5: From the following table, find the number of students who obtained less than
45 marks.

Solution: We form a table of the number of students f(x) whose marks are less than x. In
other words, we form a cumulative frequency table.

Table 6 : Frequency Table

We have x_0 = 40, x = 45 and h = 10, so that s = (45 - 40)/10 = 0.5. Using formula (26),

f(45) ≈ 31 + 21 - 1.125 - 1.5625 - 1.4453 = 47.8672 ≈ 48.

The number of students who obtained less than 45 marks is approximately 48.
E4) From the following table, find the value of y(0.23):

E5) Find the cubic polynomial which approximates y(x), given that

y(0) = 1, y(1) = 0, y(2) = 1 and y(3) = 10.

E6) The following table gives the values of tan x for 0.1 ≤ x ≤ 0.3. Find the value of
tan (0.12).

x       0.10     0.15     0.20     0.25     0.30

tan x   0.1003   0.1511   0.2027   0.2553   0.3093

E7) The following table gives the population of a town in consecutive censuses.
Calculate the population in the years 1915 and 1918. Hence obtain the increase in
population during the period 1915 to 1918.

Year x           1911   1921   1931   1941   1951   1961

Population y     12     15     20     27     39     52
(in thousands)

E8) Find the number of men getting wages between Rs. 10 and Rs. 15 from the
following table.

Wages in Rs. x    0 - 10    10 - 20    20 - 30    30 - 40

No. of men y      9         30         35         42

E9) The following table shows the monthly premiums to be paid to a company at
different ages. Find the premium to be paid at the age of 26 years.

Age               20      24      28      32      36

Premium in Rs.    14.27   15.81   17.72   19.96   22.48

E10) The area A of a circle of diameter d is given in the following table. Find the area of
the circle when the diameter is 82 units.
E11) In an examination, the number of candidates who secured marks in certain limits
were as follows:

Marks                0 - 19    20 - 39    40 - 59    60 - 79    80 - 99

No. of candidates    41        62         65         50         17

Find the number of candidates whose marks are 25 or less.
E12) The following table gives the amount of a chemical dissolved in water at different
temperatures.

Temperature          10°     15°     20°     25°     30°     35°

Amount dissolved     19.97   21.51   22.47   23.52   24.65   25.89

Compute the amount dissolved at 8°.

E13) Find a polynomial which fits the following data:

11.3.2 Newton's Backward-Difference Formula


Reordering the interpolating nodes as x_n, x_{n-1}, ..., x_0 and applying the Newton's divided
difference form, we get

P_n(x) = f[x_n] + (x - x_n) f[x_{n-1}, x_n] + (x - x_n)(x - x_{n-1}) f[x_{n-2}, x_{n-1}, x_n]
       + ... + (x - x_n) ... (x - x_1) f[x_0, ..., x_n]          (27)

We may also write this compactly as

P_n(x) = Σ_{k=0}^{n} f[x_{n-k}, ..., x_n] Π_{j=0}^{k-1} (x - x_{n-j})          (28)

Set x = x_n + sh; then

x - x_i = x_n + sh - [x_n - (n - i)h] = (s + n - i)h,

x - x_{n-j} = (s + n - n + j)h = (s + j)h,

and Eqn. (28) becomes

P_n(x_n + sh) = Σ_{k=0}^{n} s(s + 1) ... (s + k - 1) h^k f[x_{n-k}, ..., x_n]          (29)

We have seen already that

f[x_{n-k}, ..., x_n] = ∇^k f_n / (k! h^k).

Hence, Eqn. (29) can be written as

P_n(x) = P_n(x_n + sh) = Σ_{k=0}^{n} (-1)^k (-s choose k) ∇^k f_n.
Equation (27), (28) or (29) is called the Newton's backward-difference form.

In this case the error is given by

E_n(x) = [s(s + 1)(s + 2) ... (s + n)/(n + 1)!] h^{n+1} f^{(n+1)}(ξ).

The backward-difference form is suitable for approximating the value of the function at an x
that lies towards the end of the table.
Example 6: Find the Newton's backward differences interpolating polynomial for the data of
Example 4.

Solution: We form the table of backward differences of f(x).

Table 7 : Backward Difference Table

Tables 5 and 7 are the same except that we consider the differences of Table 7 as
backward differences. If we name the abscissas as x_0, x_1, ..., x_5, then x_n = x_5 = 6, f_n = f_5 =
235. With h = 1, the Newton's backward differences polynomial for the given data is given
by

P(x) = 235 + (x - 6)(93) + [(x - 6)(x - 5)/2](30) + [(x - 6)(x - 5)(x - 4)/6](6)
     = 235 + 93(x - 6) + 15(x - 6)(x - 5) + (x - 4)(x - 5)(x - 6),

which on simplification gives

P(x) = x³ + 2x + 7,

which is the same as the Newton's forward differences interpolation polynomial in
Example 4.
11.3.3 Stirling's Central Difference Formula

A number of central difference formulas are available, which can be used according to the
situation to maximum advantage. But we shall consider only one such method, known as
Stirling's method. This formula is used whenever interpolation is required at an x near the
middle of the table of values.

For the central difference formulas, the origin x_0 is chosen near the point being
approximated, and points below x_0 are labelled as x_1, x_2, ..., and those directly above as
x_{-1}, x_{-2}, ... (as in Table 3). Using this convention, Stirling's formula for interpolation
is given by

P_n(x) = f_0 + (s/2)[δf_{-1/2} + δf_{1/2}] + (s²/2!) δ²f_0 + [s(s² - 1²)/(2·3!)][δ³f_{-1/2} + δ³f_{1/2}]
         + [s²(s² - 1²)/4!] δ⁴f_0 + ...          (32)

where s = (x - x_0)/h, if n = 2p + 1 is odd.

If n = 2p is even, then the same formula is used, deleting the last term.
Stirling's interpolation formula is used for calculation when x lies
between x_0 - h/2 and x_0 + h/2.

It may be noted from Table 3 that the odd order differences at x_{-1/2} are those which
lie along the horizontal line between x_0 and x_{-1}. Similarly, the odd order differences at
x_{1/2} are those which lie along the horizontal line between x_0 and x_1. Even order differences
at x_0 are those which lie along the horizontal line through x_0.

Example 8: Using Stirling's formula, find the value of f(1.32) from the following table of
values.

Solution:

Table 9: Central Differences

Here x_0 = 1.3 and h = 0.1,

∴ s = (x - x_0)/h = (1.32 - 1.3)/0.1 = 0.2.

From Eqn. (32), we have

f(x) = f_0 + (s/2)[δf_{-1/2} + δf_{1/2}] + (s²/2) δ²f_0 + [s(s² - 1²)/(2·3!)][δ³f_{-1/2} + δ³f_{1/2}] + [s²(s² - 1)/4!] δ⁴f_0.

Now,

(1/2)[δf_{-1/2} + δf_{1/2}] = (1/2)(0.1889 + 0.2059) = 0.1974.

Substituting in the above equation, we get

f(1.32) = 1.73816 ≈ 1.7382.

In the following exercises, use the Stirling's interpolation formula.

E21) Find f(1.725) from the following table.

E22) Find the value of f(1.22) from the following table.

E23) Evaluate f(4.325) from the following table.

E24) Find the value of y when x = 30 from the table.

E25) Find the approximate value of y(2.15) from the table.


11.4 SUMMARY

In this unit, we have derived interpolation formulas for data with equally spaced values of
the argument. We have seen how to find the value of f(x) for a given value of x by
applying an appropriate interpolation formula derived in this unit. The application of
these formulas is easier when compared with the application of the
formulas derived in Units 9 and 10. However, the formulas derived in this unit can only be
applied to data with equally spaced arguments, whereas the formulas derived in Units 9 and
10 can be applied to data with equally spaced or unequally spaced arguments. Thus, the
formulas derived in Units 9 and 10 are of a more general nature than those of Unit 11. The
interpolation polynomial which fits a given data can be determined by using any of the
formulas derived in this unit and will be unique, whatever be the interpolation
formula that is used.

The interpolation formulas derived in this unit are listed below:

1. Newton's forward difference formula:

P_n(x) = P_n(x_0 + sh) = Σ_{k=0}^{n} (s choose k) Δ^k f_0, where s = (x - x_0)/h.

2. Newton's backward difference formula:

P_n(x) = P_n(x_n + sh) = Σ_{k=0}^{n} (-1)^k (-s choose k) ∇^k f_n, where s = (x - x_n)/h.

3. Stirling's central difference formula:

P_n(x) = f_0 + (s/2)[δf_{-1/2} + δf_{1/2}] + (s²/2!) δ²f_0 + [s(s² - 1²)/(2·3!)][δ³f_{-1/2} + δ³f_{1/2}] + [s²(s² - 1²)/4!] δ⁴f_0 + ...,

if n = 2p + 1 is odd. If n = 2p is even, the same formula is used deleting the last term.

11.5 SOLUTIONS/ANSWERS

E1) From Eqn. (12), ∇⁴f_5 = f_5 - 4f_4 + 6f_3 - 4f_2 + f_1.

E2) LHS = (E + 1)δ = (E + 1)(E^{1/2} - E^{-1/2}) = E^{3/2} - E^{-1/2}.
RHS = 2(E - 1)μ = (E - 1)(E^{1/2} + E^{-1/2}) = E^{3/2} - E^{-1/2}. Hence the result.
E3) The forward differences table is given below.

Table 10

Taking x_0 = 1911, x = 1915, h = 10, we get

s = (1915 - 1911)/10 = 0.4,

y(1915) ≈ 54.8528,

or y(1915) = 54.85 thousands.

E7) 12.54 thousands, 13.64 thousands, 1.1 thousands

E8) 15

E9) Rs. 16.25

E10) 5281

E l l ) $8

E12) 18.79 (Hint: Take all differences into consideration)

E13) 2x2-7x + 9

E14) 1.7081

E15) ~ ~ - +2 1~ 2

E16) 0.2662,0.4241

E17) The population in 1954 is 43.33 thousands and the population in 1958 is 48.81 thousands.
Hence the increase in population is approximately 5.48 thousands.

E19) Hint: The number of candidates f(x) whose marks are less than or equal to x is as
follows:

(i) Take 79 as origin and determine f(70). We get f(70) = 199.

(ii) Take 99 as origin and obtain f(89) = 232.

E20) 2x2 - 7x +9
:. S = 1.725-1.7 = o.25
0.1
Table 11
UNIT 12 NUMERICAL DIFFERENTIATION

Structure
12.1 Introduction
Objectives
12.2 Methods Based on Undetermined Coefficients
12.3 Methods Based on Finite Difference Operators
12.4 Methods Based on Interpolation
12.5 Richardson's Extrapolation
12.6 Optimum Choice of Step Length
12.7 Sulnma~y
12.8 Solutions/Answers

12.1 INTRODUCTION

Differentiation of a f u ~ ~ c t i of(x)
n is a fundamental and important concept in calculus.
W h e l ~the function is given explicitly its derivatives f'(x), fl'(x), ... etc. can be easily
found using the methods of calculus. For example, if f(x) = x2, we know that fl(x) =
2x, ftl(x) = 2 and all the higher order derivatives are zero. However, if the function is
not known explicitly but, we are given a table of values of f'(x) corresponding to a set
of values of x, then we cannot find the derivatives by using calculus methods. For
instance if f(x& represents distance travelled by a car in time xk, k = 0, 1, 2, ... seconds,
and we require the velocity and acceleration of the car at any time xk, then the
derivatives f '(x) and f "(x) representing velocity and acceleration respectively, cannot be
found analytically. Hence, the need arises to develop methods of differentiation to
obtain the derivative of a given function f(x), using the data given in the form of a
table which might have been formed as a result of scientific experiments.

Numerical inethods have the advantage that they are easily adaptable on calculators and
computers. These methods make use of the interpolating polynomials, which we
discussed in Block-3. We shall now discuss, in this unit, a few numerical differentiation
methods, namely, the method based on undetennined coefficients, methods based on
finite difference operators and methods based on interpolation.
\

Objectives
After studying this unit you should be able to
explain the iinportance of the numerical inethods over the calculus inethods;
use the method of undetennined coefficients and methods Lased on finit difference
operators to derive differentiation formulas and 'obtain the derivative of a function at
step points;
use the inethods derived from the interpolation formulas to obtain the derivative of a
function at off step points;
use Richandson's extrapolation method for obtaining higher order solutions;
obtain the optimal steplength for the given formula.

12.2 METHODS BASED ON UNDETERMINED


COEFFICIENTS
In Unit 1, we introduced you to the concepts of round-off and truncation errors. In the
derivation of the inethods of numerical differentiation, we shall be referring to these
errors quite often. Let us first quickly recall these concepts here before going further.
Numerical Differentiation Integration
and Solution of Differential Equations Definition : The round-off e m r is the quantity R which must be added to the finite
representatio~iof a computed number in order to make it the true represe~ltatio~l
of that
number. Thus
machine representation) + R = y(true representation).
Definition : The truncation error denoted by TE is the quantity which must be
added to the finite representation of the computed quantity in order that the result be
exactly equal to the quantity we are seeking to generate. Thus
y(true representation) + TE = y(exact)
The total error En is then given by
= I machine representation) - y(exact) I
lEnI = 1 ~(ntachinerepresentation) - y(true representation) I
+ y(true representation) - y(exact) (
s ) machine representation) - y(tme representation) (
+ I y(true representation) - y(exact) 1
s IRI + I n1
Defintion : Let f(h) be the exact analytical value of a given problem obtained by
using an analytical formula and fh be the approximate value obtained by using a
numerical method. If the error f(h) - & = C hP, where C is a constant, then p is known
as the order of the numerical method.
Let us consider a function f(x), whose values are given at a set o f l t a b u l ~points.
~ For
developing numerical differentiation formulas for the derivatives f (x), f (x), ...
at a
point x = x,, we express the derivative fyx), q z 1, as'a linear combination of the
values of f(x) at all arbitrarily chosen set of tabular points. Here, we assume that the
tabular points are equally spaced with the steplength h i.e. various step (nodal) points are
xm = x,, * mh, m = 0, 1, ... etc. Then we write

where yi, i = - s, - s + 1, ......, n are the unknowns to be determined and f, +,


denotes
f(xk + mh). For example, when s = n = 1 and q = 1, Eqn. (1) reduces to
h f l (x3 = Y-If,-, +Y$, +Y,f,+,.
Similarly, when s = 1, n = 2 and q = 2, we have
h2f " ( ~ 3= y-~fk-+ ybfk + ylfk+1 + yzfk+,
Now suppose we wish to determine a numercial differentiation formula for P ( x 3 of
order p using the method of undetermined coefficients. In other words, we want our
formula to give the exact derivative values when f(x) is a polynomial of degree s p,
that is, for f(x) = 1, x, x2, x3, ..., xP. We then get p+l equations for the determination
of the unknowns yi, i = -s, -s + 1 .....,n. You know that if a method is of order p,
then its is of the form chP+l kl) (a),for some constant C. This implies that if f(x)
= xm, m = 0, 1, 2, ....., p then the method gives exact results, since

- (xm)- 0, for m
dP+ 1
1 0, 1, ...,p.
dxP+
Let us now illustrate this idea to find the numerical differentiation formula of 0 @4for f "(x&
Derivation of formula for fW(x)
Without loss of generality let us take xk = 0. We shall take the points symmetrically,
that is, x,= mb; m = 0, * 1, i2.
Let f-2, f-,, f, f,, f, denote the values of f(x) at x = -2h, - h, 0, h, 2h respectively.
In this case the formula given by Eqn. (1) can be written as
h2fl1(0) = Y-2f-2 + Y-lf-, + Y&+ ylfl + y2f2 (2)
Let us now make the formula exact for f(x) = 1, x, x2, x3, x4. Then, we have
f(x) = 1, ftl(0) = 0; f-2 = f-, = q, = f, = f , = 1
f(x) = x, fl'(0) = 0, f-2 = -2h; tl F 41; = 0; fl = b; f2 = 2b;
2 "
f(x) = x , f (0) f-2 = 4h2 = f2; ffl = b2 s f,; (,r 0; (3)
. .
f(x) = 2,f "(0) = 0. f, = - 8h3; f-, = -a3; 6 = 0; f, h3, 6 8h3
f(x) = x4 , f "(0) = 0; f-2 = 16h4 = f2; f-, = h4 = f,; 6 = 0
Numerical Differentiation

Substituting these values in Eqn. (2), we obtain the followi~~g


set of equations for
determining y,, m = 0, * 1, t 2.
Y-;! + Y-1 + Yo + Y1 + Y2 = 0

-2~-2 -Y-1 + Y1+2 Y2 = 0


4 1 + ~Y - ~~+Y, + 4y2 = 2 (4)
43Y-2-y-1 +Y1 + 8 y 2 = 0
16~-2+ Y-, + Y, + 1 3 , = 0
Thus we have a system of five equations for five unknowns. The solution of this system
of Eqns. (4) is
yq2= y2 = -1112; y-l = y1 = 16/12; yo= 30112;
Hence, the numerical differentiation formula of 0(h> for ft'(0) as given by Eqn. (2) is

- ",
f l1((J) f r --f-,
12h2 [
+ 16fi-Mfo + 16f -f
21 (5)
Now, we know that the TE of the formula (5) is given by the f i t non-zero term in the
Taylor expression of

The Taylor series expansions give


f(x, - 2h) = f(xo)- 2hf1(x0)+ 2h2 f "(x,) --
4b3 f "'(xJ + -
3
2h4 fN(xo)
3
- f4h5
15
v
(x0)+-f
4h6 VI
45
(x0)- ......
h2 h3 h4 N h5
f(x0 - h) 2 f(x0)- hf '(xo) + -f "(xo) - - f "'(xo) + -f (xJ
2 6 24
--
120
f '(x0)

+ fh6 VI(xo)+ ......


720
h2 h3 h4 w h5 v
f(xo + h) = f(xo) +Pf ' ( x ~ +) -f "(x,) + -f "'(xo) + -f (xo) + -f (x0)
2 6 24 120
h6
+ -fW(x&+......
720
2h2 4h3 2h4 4h5
f(xo + 2b) = f(xo) + 2hf '(xo) + -f "(xo) + T f '"(x0) + 7fw(xo)+ -f "(x,)
2 15
+ f4h6 n(xo)+ ......
45
Substituting these expansions in Eqn. (6) and simplifying, we get the f i t non-zero term
or the TE of the formula (5) as

You may now try the following .exercise.

El) A differentiation rule of the form


f i = a&+a,f, +a&
is given. Find a,,,a, and a, so that the rule is exact for polynomials of degree 2.

You must have observed that in the numerical differentiation formula discussed above, we
have to solve a linear system of equations. If the number of nodal points involved is large
or if we have to determine a method of hi@ order, then we have to solve a large system of
linear equations, which becomes tedious. To avoid this, we can use f i t e difference
operators to obtain the differentiation formulas, which we shall illustrate in the next section.
Numerical Differentiation Integration
and Solution of Differential Equations 12.3 METHODS BASED ON FINITE DIFFERENCE
OPERATORS

Recall ttiat in Unit 10 of Block 3, we introduced the finite difference operators E, V, A,


p and 6. There we also gave the relations among Garious operators.

In order to construct the numerical differentiation formulas using these operators, we


shall first derive relations between the differential operator D where Df(x) = f ' (x), and
the various difference operators.

By Taylor series, we have


f(x+h)= f ( x ) + h f t ( x ) + h 2 f f f ( x ) + . .
= [ l + hD + h 2 ~ +' . . . ] f(x)
= ehDf(x)
Since, Ef (x) = f (x + h)

we obtain from Eqn. (7), the identity


E = ehn
which gives the relations

We can relate D with 6 as follows :

We know that 6 = EV2- E-". Using identity (8), we can write

Hence, 6 = Bin h (hDI2)


or hD = Bin h-' (612) (1 I)
Similarly p = wsh (hD12) (12)
We also have p6 = sinh (hD) or hD = sinh-' (pa) (1 3,
ti2
and pZ = cosh2 (hD/2) = 1 + sinh2 (hDn) = 1 + -
4 (14)
Using the.Maclaurin9s expansion of sinh-'x, in relation (ll), we can express hD as an
infinite series in 612.

Thus, we have
hD = 2sinh-' (612)

Notice that this formula involves off-step points when operated on f(x). The formula
involving only the step points can be obtained by using the relation (13), i.e., '
hD = sinh-' (p6)

Using the relation (14) in Eqn. (16), we obbin

Thus, Eqns. (9), (lo), (15) and (16) give us the relations between hD and various
difference operators. Let us see bow we can use these relations to derive numerical
differentiation formulas for f; , f: etc. Numerical Differentiation
We first derive formulas for ftt. From Eqn. (9), we get

Thus forward difference formulas of qh), 0(h2), 0(h3 and 0(h4) can be obtained by '
retaining respectively 1, 2 3, and 4 terms of the relation (9) as follows :

q h ) method : hf; 5 fk+,- fk (18)


I 1
0(h2) method : hfk a r(-fk+2 + 44+,- 31,) (19)

0(h3 method : hf; - 1


:(2fk+3- 9fk+,+ 1 8 6 , - llfk) (20)
I 1
O(h3 method : hfk a =(-3fk+, +164+3- 36fk+,+ 48fk+,- 25fk) (21)

TE of the formula (18) is

and that of formula (19) is

Similarly the TE of formulas (20) and (21) can be calculated. Backward difference
formulas of qh), 0(h2), 6(h3) and 0(h3 for fkl can be obtained in the same way by .
using the equality (10) and retaining l,2,3 or 4 terms. We are leaving it as an exercise
for you to derive these formulas.

E2) Derive backward difference formulas for E,' of qb), 0(h2),0(h3) and 0(h4).

Central difference formulas for fl, can be obtained by using the relation (17), i.e.,

Note that relation (17) will give us methods of 0(h2) and 0@>,on retaining 1 and 2 tern,

0(h2) method : bfi - ?( 4


1
+ - fk- (24)
1
0(h4)mculod: hf;-i?(-fk-2+8fk-1+8fk+I+fk+2) (25)

We ROW. illustpte these methods &rough an example.


/

Exampk 1 : Given the following h b k of v s b of qx) = ex, f i d f ' (0.2) using


formulas (Is), (19), (24) and (25).

b
S o l u t h : Were h = 0.1 and exact value of ex at x = 0.2 is 1.221402758.

it Using(l8), f '(0.2) =
-
f(0.3) f(0.2)
0.1
Numerical Differentiation Integration
Actual error 1.221402758 - 1.28456 = - 0.063157
- &-
r

-
and Solution of Differential Equations
1
Using (19), f '(0.2) f(0.4) + 4f(0.3) - 3f(0.2)] 1.21701

-
h2
TE -
3
f 111(0.2) -
-eo.2 O.,,,,'$ml;
= O.O1
3
Actual error = 0.004393
1
Using (24), f ' (0.2) = - (w.3) - f (0.1)) = 1.22344
0.2

I-E = - -
h2 f"'(0.2) = - -0.01
ea2= - 0.002035f;
6 6 ,
Actual m r = - 0.002037
1
f ' (0.2) = [-f (0.0) + Sf (0 I) - 8f (0.3) + f (0.4))
_
Ushg (U), = 1.221399167

TE = h4fYI(0.2) 0-0001 ea2 0.4071


I
10-5;
30 30
Actual error = 0.3591 x

Numerical differentiation formulas for f", can be obtained by considering

We can write the forward difference methods of qh), 0(h2), m 3 ) and 0@> for f'; by
using Eqn. (26) and retaining 1, 2, 3 and 4 tenns as follows :

0@3 method : h?C=-fk+3+4$+2-5$+1+2ft 00) (


q h > method: h?~----8fk+5+51~+4-136fk+3+194$+2-144$+l+43$)
:2( (32)l

Backward difference formulas can be written in the same way by using Egn. (27).
Central differenceformulas of qh2) and 0@3 for f[ are obtained by using Eqa. (28)
and retaining 1 or 2 terms in the form :

Let us consitter an example,


Example 2 : For the table of values of qx) = ex, given h -Example 1, frnd f'' (0.2)
using the formuhs (33) and (34).

Actual error P - 0.0009972


Using Eqn.(34), f "(0.2) =
[- f(O.0) + 16f(0.1) - 30f(0.2) + 16f(0.3) - f(0.4)] Numerical Differentiation
= 1.221375
0.12
h4fw 0 2)
TE = = 0.13571 x 10-5

Actual error = 0.27758 x ,

And now the fo!lowing exercises tor you.

E3) From the following table of values find f (6.0) using an O(h) formula and f I' (6.3)
usil~gan 0(h2) formula.

E4) Calculate the first and second derivatives of lnx at x = 500 from the following table.
Use q h 2 ) forward difference method. Compute TE and actual errors.
x : 500 510 520 530
f(x) : 6.2146 6.2344 6.2538 6.2729

In Secs. 12.2 and 12.3, we have derived numerical differentiation formulas to obtain the
derivative values at nodal points or step points, when the function values are given in
the form of a table. However, these methods cannot be used to find the derivative
values at off-step points. In the next section we shall derive methods which can be used
for finding the derivative values at the off-step points as well as at step-points.

12.4 METHODS BASED ON INTERPOLATION

In these methods, given the values of f(x) at a set of points x,,, xl,...%, the general
approach for deriving numerical differentiation formulas is to obtain the unique
interpolating polynomial P,(x) fitting the data. We then differentiate this polynomial q
times (q s n), to get (x). Tbe value 9Jq)(xk)then gives us the approximate value of
fi9)(x,J where X, may be a step point br an off-step point. We would like to point out
here that even when the original data are known to be accurate i.e. Pn(x,J = f ( G ,
k = 0, 1, 2, ..., n, yet the derivative values may differ considerably at these points. The
approximations may further deteriorate while finding the values at off-step points or as
the order of the derivative increases. However, these disadvantages are present in every
numerical differentiation formula, as in general, one does not know whether the function
representing a table of values has a derivative at every point or not.
We shall first derive differentiation formulas for the derivatives using non-uniform nodal
points. That is, when the diffetence between any two consecutive points is not uniform.

Non-uniform nodal points


Let the data (x,, Q, k = O,l, ..., n be given at n + 1 points where the step length xi-x14
may not be uniform.
In Unit 9 you have seen that the Lagrange interpolating polynomial fitting the data
(x,, Q,'k = 0, 1, ..., n is given by .
n
pn (XI= 2 4(x) fk (35)
k=O
&(x) am the fundamental Lagrange polynomials given

Nx) (36)
V X )= ix - xk) x1(xk)

-
and n: (x) = (X x0) (X xl) - (X- x,) (37)
Numerical Differeatiation Integration
and Solution o f Differential Equations xl(xk)= ( x ~ -x0)(xk- x ~ ) . . . ( x ~ Xk-l)(xk-
- X~+~)"-(X~- (38)

The error of interpolation is given by

En (x) = f(x) - P,(X) = f(' '1 (a), xo< a c xn,


(n + 1) !
+

Differentiating Po (x) w.r.t. x, we obtain

and the error is given by

Since in Eqn. (40), the function 4 x ) is not known in the second term on the right
hand side, we cannot evaluate EL (x) directly. However, since at a nodal point xk,
n(x3 = 0, we obtain

If we want to obtain the differentiation formulas for any higher order, say qth
(1 s q r n) order derivative, then we differentiate Pn(x), q times and get

Similarly, the error term is obtained by differeutiating En(x), q times.


Let us consider the following examples.
Example 3 : Find fl(x) and the error of approximation using Lagrange interpolation
for the date (+, 53, k = 0, 1.
Solution : We know that Pl(x) = h(x) & +Ll (x) fl
X- X1 X- X1
where Lo (x) = -and Ll(x) = -
Xo- X l x1- xo
Now,
P; (x) = L; $ + L; (x) f,
1 1
and L; (x) = - , L ' ~ ( X=) -
xo- x1 XI- xo
fo +-=-
Hence, f ' (x) = pfl(x) = -
fl (f1- Q
Xo- Xt X1- Xo (x,- XJ

(xo- xl) (xl- xo)


E' (xJ = - 2
f l1 (a) and E1(x1)= -f f l(a), xo c a xl.
2
Example 4 : Find f (x) and fl'(x) given fo, fl, f2 at ,x x,, x2 respectively, using the
Lagrange interpolation.
Solution : By Lagmnge's interpolation formula

where,
(x- xl) (x- x2) 2x- xl- x,
Lo(x) = (xo- xl) (xo- x2) ; a x ) = (xo- xl) (xo- x,)
Hence, fl(x) = P; (x) = (x) fo + L; (x) fl + (x) f2 Numerical Differentiation

P (x) '= :L (x)


a'nd : fo + L: -(x) fl + :L (x) f2

Exan r;e 5 : Given the following values of f(x) = In x, find the approximate value of
f 1 (2.C nd f"(2.0). Also find the errors of approximations.

Solution : Using the Lagrange's interpolation formula, we have

.'. we get

The exact value of f 1 (20) = 0.5

Error is given by

Similarly,

f" (x,,) = 2 (xo - x,)fo(x,


" I
- x2) + (x, - XJ f 1(xl - x2) + (x2 - XO)( ~ -2xi)

The exact value of f'' (2.0) = - 0.25.

Error ii given by
1 1
E; (xo) = 5 ~ - x2) f '"(2.0) + -(xo- x,) (x,-
( 2 ~ x1-
24
x2) [fN!?3) + fN(2.0)]
= -0.06917
You may now try the following exercise.

E5) Use Lagrange's interpolation to find f '(x) and f "(x) at x = 2.5'5.0 from the
following table

Now let us consider the case of uniform nodal points.

Uniform nodal points


When the difference between any two consecutive points is same; i.e., when we are
given values of f(x) at equally spaced points, we can use Newton's forward or
Numerical Differentiation Integration backward interpolation formulas to find the unique interpolating polynomial P,(x). Wc
and Solution of Differential Equations
can then differentiate this polynomial to fmd the derivative values either at the nodal
points or at off-step poihts.

Let the data (xk, f3, k = 0, 1, ..., n be given at (n + 1) points where the step points xL;
k = 0, 1, ..., n are equispaced with step length h. That is, we have

You know that by Newton's forward interpolation fonnula

with error
~x-xo)(x-xl)...(x-~n)An+l-
En (x) = f(a) xo<a<x,
(n+l)!hn+'

If we put
X
-
- Xo
s or x = xg + sh, then Eqns. (44) and (45) reduce respectively to
h

and
-
S(S 1). . . (s-n) h ( n + ~ ) f ( a + ~ )
n' (1' = (n + 1) ! (a)

Differentiation of Pn(x) w.r.t. x gives us

At x = xg, we have s = 0 and hence

which is same as formula (9) obtained in Sec. 123 by difference operator method. We can
obatin the derivative at any step or off-step point by finding the value of s and substituting
the same in Eqn. (48). The formula mrrespondug to Eqn. (47) in backward differences is,

where x = x,, + sh.


Formulas .for higher order derivaiives can be obtained by differentiating P: (x) further
and the corresponding error can be obtained by differentiating <(x).

Let us illustrate the method through the following examples :

Example 6 : Find the first and second derivatives of f(x) at x = 1.1 from the
following tabulated values.
Solution : Since we have to find the derivative at x = 1.1, we shall use the forward Numerical Differentiation
difference formula. The forward differences for the given data are given in Table 1.

Table 1

Since, x = xo + s h, xo = 1, h = 0.2 and x = 1.1, we have s = -


lml- 0.5
0.2
-
Substituting the value of s in formula (48), we get

Substituting the values of Afo and A34, in Eqn. (50) from Table 1, we get
f '(1.1) = 0.63
To obtain the second derivative, we differentie formula (48) and obtain

- 1
f "(x) = ~ " ( x ) -
h
k2$+ (s- 1) A3fOI
Thus f"(l.1) = 6.6

Notp : . If you construct a folward difference interpolating polynomial P(x), fitting the
dataIgiven in Table 1, you will find that f(x) = P(x) = x3 - 3x + 2 Also, ff(l.l) = 6.3,
f"(l.1) = 6.6. The values obtained from this equation or directly as done above have to
be same as the interpolating polynomial is unique.

Example 7 : Find ff(x) at x = 0.4 from the following table of values.

Solution : Since we are required to find the derivative at the end pint, we will use the
backward difference formula. The backward differencle table for the given data is given by

I Table 2
Numerical Differentiation Integration
and Solution of Differential Equations
Since x, = 0.4, h = 0.1, x = 0.4, we get s = 0

~hbstitutin~
the value of s in formula (49), we get

= 1.14913
How about trying a few exercises now ?

E6) The position qx) of a particle moving in a line at various times x, is given in the
following table. Estimate the velocity and acceleration of the particle at x = 1.5 and 35

E7) Construct a difference table for the following data

Taking h = 0.2, compute f '(1.5) and the error, if f(x) = ex.


You m z t have by now observed 'that to obtain numerical differentiation methods of
higher order, we require a large number of tabular points and thus a large number of
function evaluations at these tabular points. consequently, there is a possibility that the
round-off errors may increase so much that the numerical results may become useless.
However, it is possible to obtain higher order solutions by combining the computed
values obtained by using the same method with two different step sizes. This technique
is called extrapolation method or Richardson's extrapolation. We shall now discuss this
method in the next section.

12.5 RICHARDSON'S EXTRAPOLATION

The underlying idea in this method is as follows :

Let fi9) (h) denote the approximate value of fiP)(x,J, obtained by using a formula of
order p, with steplength h and p9)(rh) denote the value of Pd(x,J obtained by using the
same method of order p, with steplength rh. Then,
f (40) = f (q)(~,J+ ChP + 0 (hp+') (5 1)

and f(q)(rh) = f(q)(x,J + C ( r w + 0 (hp + I) (52)

Eliminatinn C between Eqns. (51) and (52), we get

Thepew approximation to fid(x& is therefore

The expression on the right hand side of Eqn. (54) for finding the value of the qth
derivative by a certain method of order p has now become a method of order p + 1.
This technique of combining two computed values obtained by using the same method
with two different step sizes, to obtain higher order solutions is called Richardson's Numetkai Differentiation
extrapolation method.

We know that the truncation error of a numerical method of order Q is given by


TE = ClhP + O(hp*').
where C1+ 0.
If, instead of denoting the higher order terms by O(hP I), we write down the actual
+

terms, we have
TE= C ~ ~ ~ + C $ ~ + ' + C ~ ~ ~ + ~ + . . .
By repeated application of Rcihardson's extrapolation technique we can obtain solutions
of higher orders, i.e. O(hP+'), O(hP+a), 0(hP+3etc. by eliminating C1, Cz,C, respectively.
Let us see how this can be done.

Consider the central difference differentiation formula (24) of qh2) given by

Let g(xJ = fl(xJ be the exact value of the derivative, which is to be obtained and

be the value given by the O(hZ) method. The truncation error of this nlethod may be
written as
g(h) = g(xJ + clh2 + c414+ w6 ...
+ (55)
h
Let f; be evaluated with different step sizes - r = 0, 1, 2, . . .
2I'
Then, we have

Eliminating C1 from Eqns. (55) and (56), we get

Eliminating Cl from Eqns. (56) and (57), we obtain

Notke that the methods g(')(h) and g("(hL2) given by Eqns. (58) and (59) are 0(h>
approximations to g(xJ

Eliminating C2 from Eqns. (58) and (59), we get

which gives an O(h6) approximation to g(x&. Gencl&ing, we f i d that the successive


higher order methods can be obtained frarn the
I Numaid Diffd*ion Inlegration This procedure is known as the Richardson's repeated extrapolstion to the limit.
and Solution of Differential Equations
I

I
Tbtse extrapolations can be stopped when

for a given error tolerence E.

Similarly, forward difference method of 0@2) can be obtained by cansidering

and using Richardson's extrapolation technique in the form

Tbis method is of qh2).

You may note that in Richardson's extrapolation, each improvement made for foxward
(or backward) difference formula increases the order of solutions by one, whereas for
central difference formula each improvement increases the order by two.

Let us now solve the following problems.

Example 8 : Tbe following table of values of f(x) = x4, is given :

Using the formula f '(x,) = [s(.3;"]' and Richardson's extrapolation method

find f l(3).

Solutioa : Note that in this example xl = 3.0. The largest step h that can be taken
h
is h = 4. computations can also be done by using step lengths hl = - = 2 and
2
hp = hlR = 1.

Using the ionnula

we get
qh2) method.

qh2) method.

qh2) method.

Therefore, using the f m u l a given by Eqn. (a)we


, have

g(l)(h) - )!(
48 - g@) m624*UW)1108
q h ? method.
3 3

0(h? method.
Numerical Differentiation

0(h6)method.

Writing- in tabular form, we have

Step Second Fourth Sixth


length order method order metbod order metbod

Thus f'(3) = 108, must be the exact solution as we have

~ x a m p k9 : Let qx) = ex. Using a central difference formula of O(h3 find fl'(l)
Improve this value using Richardson's extrapolation by taking h = 0.1 and h = 0.05.
Solution : Witb h = 0.1 and

fk11 = e+~-~f,+fl-i 9
h2
we get

With h = 0.05 we get fl'(l) =


el.05 - ze
(0.05)~
+ e0.95
- 2.718848

Both the solutions are q h 2 ) approximations. Richardson's approximation using relation


(54) with r = --
0.05
2 and p = 2, gives us

The actual value i s e = 2718282


Your may now e y the following exercises :

E8) Compute f "(0.6) from the following table using w 2 ) central difference formula.
Improve it by Richardson's extrapolation metbod using step lengths h = 0.4.0.2,O.l.

E9) Using central difference formula of 0@') find f "(0.3) Emuthe given table and improve
tbe adracy using Richardson's empolation method using step lengths h = 0.1'0.2
F

x : 0.1 0.2 0.3 0.4 0.5


f(x) : 0.091 0.155 0.182 0.171 0.130

In the numerical differentiation methods, tbe trunation e m r is of tbe form ChP which
tends to zero as h-0. However, the metbod which approximates f (9)(x) contains hq in
Numerical Dimamtiation'ntevarionthe denominator. As h is successively reduced to smaller values, the truficatioil error
and Solution of Differential Equations
decreases but the round-off error in the method may inctease as we are dividing by a
small number. It may happen that after a certain critical value of h, the round-off enor
may become more dominant than the truncation error and the numerical results obtained
may start worsening as h is further reduced. The problem of finding a steplength h
small enough so that the truncation error is small, ye, large enough so that round-off
error does not dominate the actual error is referred to as themstep size dilemma. Such a
step length, if it can be determined is called the optimal steplength for that formula.
We shall now discuss in the next section how t9 determine the optimal steplength.

12.6 OPTIMUM CHOICE 0.F STEPLENGTH


We begin by considering an example.
Consider the numerical differentiation formula

I 1

fk = (fk+ 1- fk)

2
Let f(x) = ex and we want to approximate fl(l) by taking h = - m = l , 2,..., 7.
lom'
We have from the differentiation formula (63),

The exact solution is f '(1) = 2.718282. The actual error is e - fl(l) and the truncation
-
error is ehn. With h = - 2
lorn'
m = 1, 2 ,..., 7, we have the results as given in Table 3.

Table 3
h f'(l) Actual error Approximate Tmllcation error
2 x lo-' 3.009175 - 0.290893 - 0.271828

If you look at Table 3, you will observe that the improved accuracy of the formula,
i.e. fl(l), with decreasing h does not continue indefinitely. Tbe truncation error agrees
with the a&ual error till h = 2 x = 0.002. As h is further reduced, the truncation
error ceases to approximate the actual error. This is because the actual error is
dominated by round-off error rather than the truncation error. This effect gets
worsened as h is reduced further. In such cases we determine the optimal steplength.
When f(x) is given ie tabular form, these values may not be exact These values ccntain
round-off errors. In other words, f(x3 = E, + E,, where f(x3 is the exact value, fk is the
tabulated value and E~ is the round-off error. For the numerical differentiation fnnnllla
(63), we have

If the round-off errors in f, and 4 + ,aw E~ and E~ + , then we have


Numerical Differentiation

where R is the round-off error and TE is the truncation error.

If we take E = max (IE,~), (IE,+ ,I) and Mi = max Ifu(xjl, we Fmd that

We define the optimum value of h as the one which satisfies either of the following
col;ditions :

(9 IRI- (ii) IR(+ITE(=minimum (64)

By the first condition in (64), we have

The value of the error is

-
If we use the second condition (R( ITE 1 = min, we have

2E hMz
-+-
h 2
- min.

To find the mi~imumis Eqn. (65), we differentiate the left hand side of Eqn. (65) with
respect to

2E
--
hZ
Mz
I- -= 0 or, h2
2
- 4E
-or,
2
h 1 2qKT2..

:. The minimum total error = (R(+ (TE( - w.


Let us now consider an example.

Example 10 : For the method

determine the optimal value of h using the criteria

Using this method and the first criterion, find the value of b and determine the value of
f1(2.0), from the following tabulated values of f(x) = In x. It is g i ~ e nthat the
maximum round-off error in the function evaluation is 5 x lo4

Solution : If E ~el, and E~ are the round-off errors in the given function evaluations of
fo, f,, f2 respectively, then we have

(- 3fo+ 4f1 - 5) (- 3Eo+ 4E1 - E2) hZ


f,=
2h
+ 2h
+-
3
f 11' (a)
Numerical Differeotiation Integration
and Solution of Differential Equations

Let E = max (IE,,~, [ell . and M, = m u 1f1"(x)I


We obtain

I f w e use IRI =]TEI= min, w e g e t

Hence h3 = -or
M,
12E
hTt =
R)"
-

The error is given by

If we use I R( + (TEI = min, we get

Minimum total error = 6m E~ MY


For f (x) = I n x, we have M3 = max ~ f ' " ( x )=~0.25
2 s x s 2.12
Using the criterion (RI + lTEl and E = 5 X lo4, w e get

For h = 0.06, we get

If we take h = 0.01, we get

The *exactvalue is f '(2.0) = 0.5


Clearly for h c h,, the result deteriorate.

You may now tly the following exercise.

E10) For the method

-
fk+ I fk- 1 h2 rrr
2h -6 (a),
xk- 1< a < xk+1
determine hq, using the criteria. Numerical Diffir,.~tiation

(i) ( R ( = ITE( and

(ii) 1 R ( + ( T E1 = minimum.
Using this method and the second criterion, fmd hqt for f(x) = I n x and determine
the value o f f '(2.03) from the following table of values of f(x), if it is given that the
maximum round-off error in the function evaluation is 5 x lod
-

W e now end this unit by giving a summary of what we have convered in it.

12.7 SUMMARY
In this unit w e have covered the following :
1) .If a function f(x) is not known explicitly but a table of values of €(x) corresponding to
a set of values of x is given then its derivatives can be obtained by numerical
differentiation methods.
2) Numerical differentiation formulas using
(i) the method of undetermined coefficients and
(ii) methods based on finite difference operators can be obtained for the derivatives
of a function at nodal or step points when the function is given in the form of s
table.
3) When it is required to find the derivative of a function at off-step points then the
methods mentioned in (2) above cannot be used. In such cases, the methods derived
from the interpolation formulas are useful.
4) Higher order solutions can be obtained by Richardson's extrapolation method which
uses the lower order solutions. These results are more accurate than the results
obtained directly from higher order differentiation formulas.
5) Round-off errors play a very important role in numerical differentiation. Sometimes,
if the step size is too small, the round-off errors gets magnified unmanageably. In
such cases the optimal step length for the given formula could be used, provided that
it can be detennined.

El) Let f '(x) = a, + a , f, + a,fi. Setting f(x) = 1,x, x2, we obtain

3 2 1
Solving we obtain a, = - a, = 2 s2= - -
2b

E2) O(h) method : hf; - (Ik-fk-

0(h2) method : hf; =


Numerical Differentiation Integration
and Solution of Differential Equations
0(h3) method : hf l-l l f k - 22f,- + 9$-,- 6fk-3
6

E3) Using formula (It!), we have

Ushg formula (33).

E4) Using formula (U)),we have

Using (32), we have

t
Exact value fl(x) = lh = 0.002; f "'(x) = - l/x2 = - 0.4 x lo-'
Actual error in f '(500) is 0, whereas in f "(500) it is 0.1 x lo-'. Truncation error in

f '(x) is
- ,Zf"' - -5.33
- -
x lo-' and in f "(x) it is "h2fn = 8.8 x 10-9
3 12

E5) fn tbe given problem xo = 1,xl = 2.. x2 s 3, x3 = 4 and f, = 1, fl = 16 f, = 81 and f:, = 256.
Constructing the Lagrange fundamental polynomials, we get

x3 - 7x2 + 14x - 8
) (x3 -6
3 )
-6x2+ l l x
6
1
P3(x) = G (x) f, + t;(4 f, + L;(x) f, L;( 4 f3 +

p;(x)=g(x) $+L; +L;(x) f,+L;(x)f,


P;' (x) =g(x) f, +L;I (x) f1 +L;'(x) f,+L;'(x) f3
We obtain after substitution,

The exact values off '(x) and f "(x) are (from f(x) = x4)

E6) We are required to find fl(x) and fl'(x) at x = 1.5 and 3.5 which are off-step paihfi
Using the Newton's f(,.ward difference formula with xo = 0, x = 1.5, s = 1.5, w e p t
f'(1.5) = 8.7915 and f "(1.5) = -4.0834.

Using the backward difference formula with xn = 4, x = 3.5, s = - 0.5, we B C ~


f '(3.5) = 7.393 and f "(3.5) = 1.917.
E7) The difference table for the given problem is : Numerical Differentiation

x f(x) Af A2 f A3 f A4 f
1.3 3.669
0.8i3
1.5 4.482 0.179
0.992 0.41
1.7 5.574 0.220 0.007
1.212 0.48
1.9 6.686 0.268 0.012
1.480 0.060
2.1 8.166 0.328 0.012
1.808 0.072
2.3 9.974 0.400
2.208
-2.5 12182

5:~ Taking xo = 1.5 we see tbat s = 0 and we obtain from tbe interpolation fonnula
6

Exact value is el.' = 4.4817 and error is t 0.0067

E8) Use the q2) formula (33). With h = 0.1, f1'(0.6) = 1.25%, h = 0.2, f "(0.6) = 1.26545,
h = 0.4, f "(0.6) = 1.289394.

Using Richardson's extrapolation formula,

These two results are of q h 3 . To get qh6) result we repeat the extnpolation
technique and obtain

E9) Using (24) with h = 0.1,0.2, we have

E10) If e-,, em 6, are the round-off errors in the given function evaluations f-,, 6,f,
respectively, and if E = mar ((E- ,1 , leal , 1el 1) and M3 = max ( f "'(x) 1 then

E h2
(R(a band (TEIr 7 M Y
Numerical Differeatiotion integration
and Solution of Differential Equations
-
Ifweuse IRJ JTEJ,weget

and error is given by

Ifweuse -
IRI lTEl =mi& then

and error is ="(&r


3 2

-
For f(x) = In x and using the second aiterion, we get

b=Pt
-(30 x 1 0 ~ ~0.03.
) ~

For h = 0.03, we get

f '(2.03) =
- -
0.72271 0.69315 o,492667.
0.06

If we take b = 0.01, we get


f '(2.03) = 0.4925.
Tbe exact value off ' (2.03) = 0.492611,

The result deteriorate for b < bw


UNIT 13 NUMERICAL INTEGRATION

Structure
13.1 Introduction
Objectives
13.2 Methods Based on Interpolation
Methods Using Lagrange Interpolation
Methods Using Newton's Forward Interpolation
13.3 Composite Integration
I
13.4 Romberg Integration
13.5 Summary
13.6 Solutions/Answers

13.1 INTRODUCTION

In Unit 12, w e developed methods of differentiation to obtain the derivative of a


function f(x), when its values are not known explicitly, but are given in the form of a
table. In this unit, w e shall now derive numerical methods for evaluating the definite
integrals of such functions f(x). You may recall that in calculus, the definite integral of
f(x) over the interval [a, b ] is d e f l e d as
b
Jaf(x) dx - lim R[b]
h->O

where R[h] is the left-end Riemann sum for n subintervals of length h = -and is
n
given by

The need for deriving accurate numerical methods for evaluating the definite integral
arises mainly, when the integral is either
i) a mmplicated function such as f(x) = e-', f(x) = *etc. which have no
X

anti-derivatives expressible in terms of elementary functions, or


ii) when the inregrand is given in the form of tables.

Many scientific experiments lead to a table of values and we may not only require an
approximation to the function f(x) but also may require approximate representation of
the integral of the function. Moreover, analytical evaluation of the integral may lead to
transcendental, logarithmic or circular functions. The evaluation of these functions for a
given value of x may not be an accurate process. This motivates us to study numerical
integration methods which can be easily implemented on calculators.
1 In this unit we shall develop numerical integration methods wherein the integral is
approximated by a linear combination of the values of the integrand i.e.,

b
where xo, xl, ......., x, a r e the points which divide the interval [a, b ] into n
sub-intervals and Po, PI, ........,
P, are the weights to be determined. W e shall
discuss in this unit, a few techniques to determine the unknowns in Eqn. (1).
Numerical Differentiation Integration
d d Solution of Differential Equations Objectives
After studying this unit you should be able to
use trapezoidal and Simpson's rules of integration to integrate functions given in the
form of tables and find the errors in these rules;
improve the order of the results using ~ o m b e integration
r~ or its accuracy, by
composite rules of integration.

13.2 METHODS BASED ON INTERPOLATION

In Block 3, you have studied several interpolation formulas, which fits the given
data (x,', Q, k = 0, 1, 2, .........., n. We shall now see how these interpolation
formulas can be used to develop numerical integration methods for evaluating the
definite integral of a function which is given in a tabular form. The problem of
numerical integration is to approximatc ine definite integral as a linear combination
of the values of f(x) in the form
.I n

where the n + 1 distinct points xk, k = 0, 1, 2, ......, n are called the nodes or
abscissas which divide the interval [a, b] into n sub-intervals (xo < xl < x2 < ..... xu)
and &, k = 0, 1, .,..., n are called the weights of the integration rule or
quadrature formula. We shall denote the exact value of the definite integral by I
and denote the rule of integration by
n

The error of approximating the integral I by Ih[fl is given by

E,, ifl =
b
J f(x) ax - 2 a. .
n

k- 0
f,'

The order of the integration method (3) is de%:.;,;das follows :

Definition : An integration method of the form (3) is said to be of order p if it


produces exact results for all polynomials of degree less than or,equal to p.

In Eqn. (3) we have 2n + 2 unknowns viz., n + 1 nodes xk and the n + 1 weights


and the method can be made exact for polynomials of degree 5 2n + 1. Thus, the
method of the form (3) can be of maximum order 2n + 1. But, if some of the nodes are
prescribed in advance, then the order will be reduced. If all the n + 1 nodes are
prescribed, then we have to determine only n + 1 weights and the corresponding
method will be of maximum order n.
We first derive the numerical method based on Lagrange interpolation.

13.2.1 Methods Using Lagrange Interpolation


Suppose we are given the n + 1 abscissas x,'s and the corresponding values fk's. We
know that the unique Lagrange interpolating polynomial Pn(x) of degree r n, satisfying
the interpolatory conditions P,(xJ = f(xJ, k = 0, 1, 2, ......., n, is given by
n

with the error of interpolatioa


..

where 4 ( x ) =
(x- x0) (x- xl) ' ...(x- X,.- .. . (x- x,)
(x- X i + lJ
xo) ('k- 1
') ' ' ' ' (xk- xk- 1) (xk- Xk+ 1) ''' (xt- x,)
and n (x) = (x-x,,) (x-x,) . . . (x-16). Numerical Integration

We replace the function f(x) in the definite integral (2) by the Lagrange interpolating
polynomial Pn(x) given by Eqn. (5) and obtain

where
Pk -f 4(.) dx.

The error in the integration rule is

We have

where M
, = mar
x,<x<x,
If" ' (x) 1
Let us consider now the case when the nodes x,'s .are equispaced with x, = a, x, = b,
b -a
and the length of each subinterval is h = 7e. numerical integration methods
given by (7) are then known as Newton-Cotes formulas and the weight; k ' s given by
(8) are known as Cotes numbers. Any point x e [a, b] can be written as x = xo + sh.

With this substitution, we have

4(x) =
4
k ! (n- k)!
)(11)

Usinge x = xo + sh and changing the variable of integration from x to s, we obtain

'k-
eh(s(s-1)(s-2)
k !( n - L ) !
.,... ( s - k + l ) ( s - k - 1 ) .... ( s - n ) L (12)

and
IE [fl I r hn+2Mn+1
(n+ I ) !
(s(S- l)(s-2) .. .(a-n)ds
We now derive some of the Newton Cotes formulas viz. trapezoidal rule and Simpson's
rule by using first and second degree Lagrange polynomials with equally spaced nodes.
You might bave studied these rules in your calculus course.

Trapezoidal Rule
When ir = 1, we bave xo = a, x,, = b and h = b-a. Using Eqn. (12) the Cotes numbers
can be found as

Substituting the values of Po and in Eqn. (9,we get


I Numerical Differentiation lotegration
aod Solution of Differential Equations
The error of integration is
1

where -
Mz max
Xo<X<X
If (x) 1 (16)
b
Thus, by trapezoidal rule, $ f (x) dx is given by

I I
The reason for calling this formula the trapezoidal mle is that geometrically when f(x)
is a function with positive value then -h2 (fo + fl) is the area of the trapezium with
height h = b - a and parallel sides as 6 and fl. This is an approximation to the actual
area under the curve y = f(x) above the x-axis bounded by the ordinates x = x,,, x = x,.
(see Fig. 1.). Since the error given by Eqn. (15) contains the second derivative,
trapezoidal rule integrates exactly polynomials of degree s 1.

Fig. 1

Let us now consider an example.

Example 1 : Find the approximate value of

using trapezoidal rule and obtain a bound for the error. The exact value of I = ln2 =
0.693142 correct to six decimal places.

Solution : Here xo = 0, x, = 1 and h = 1 - 0 = 1. Using Eqn. (14), we get

Actual error = 0.75 - 0.693147 = 0.056853.

The error in the trapez . rule is given by

Thus, the error bound obtaified is much greater than the actual error.
We now derive tbe Simpson's rule.

Simpson's Rule
b a
For n = 2, w e have h = -, x0 = a, xl = -
a*b and x, = b.
2 2
From (12), we find the Cotes numbers as
h
Numerical Integration

Eqn. (7) in tnis case rduces to

Is [fI = ,
h
[fo + 4fl + f2]
b
Eqn. (17) is the Simpson's rule for approximating I =
a
f (x) dx .
The magnitude of the error of integration is

- % s(s- 1) (s- 2) ds +
I
This indicates that Simpson's rule integrates polynomials of degree 3 also exactly.
Hence, we have to write the error expression (13) with n = 3. We find
h 5 ~
1 41'11 s *(s(s- 1) (s- 2) (s- 3) ds

-- $ s(s- 1) (s- 2) (s- 3) ds + s(s- 1) (s- 2) (s- 3) ds

where M, = max
X0<X<X
( f" (x) 1
Since the error in Simpsm's rule contains the fourth derivative. Simpson's rule
integrates exactly all polynomials of degree 5 3.

b
Thus, by Simpso~l'srule, S f (x) dx
a
is given by

Geometrically,
3 +,
fo 4f1 + f2 represents the area bounded by ihe quadratic curve
I
passing through (xo f,), (x, ,fl) and (x2, f2) above the x-axis and lying between the
ordinates x = xo, x = x2 (see Fig. 2).
Y
t

Fig. 2
DifferentiationIntegration In case we are given only one tabulated value in the interval [a, b], then h = b - a, and
and Solution of Differential Equations
the interpolating polynomial of degree zero is Pdx) = G, In this case, we obtain the
rectangular integration rule given by
*

The error in the integration rule is obtained h m Eqn. (13) as

where M I= max 1 f (x) 1


acxcb

If the given tabulated value in the interval [a, b] is the value at the mid-point, then we
have xk = -,
a+b
2
and 4 = -
In this case h = b a and we obtain the integration
2
rule as

Rule (21) is called the mid-point rule. The error in the rule calculated from (13) is

This shows that the mid-point rule integrates polynomials of degree one exactly. Hence
the error for the mid-poht rule is given by

where M2= max ( f (x) 1 and h b- a


s<x+b
-
We now illustrate these methods through an example.
1 2
Example 2 : Eva1uat.e e-a dx using
0

a) rectangular rule b) mid-point rule c) trapezoidal mle and d) Simpson's mle.

If the exact value of the integral is 0.74682 correct to 5 decimal places, find the error
in these rules.
1
Solution : The values of the function f(x) = e-' at x = 0, 0.5 and 1 are

Taking h = 1 and using


a) IR[fl= h6, we get IR[fj=l.

b) IM[fl= hfm, we get IM[fl= 0.7788.

c) Idfl- -h2 [f, + f,] ,we get Idfl- -21 (1+ 0.36788) = 0.68394 ,Taking h = 0.5 and
using Simpson's rule, we get

- h
[f(O) + 4f(0.5) + f(l)]

- 0.74718.

Exact value of the integral is 0.74682.


Numerical lntegratibn
The errors in these rules a& given by

%[fj = 0.06288, E,[tJ = - 0.00036.

You may now try tbe following exercise :

Use the trapezoidal and Simpson's rule to approximate the following integrals. Compare
El)
the approximations to the actual value and find a bound for the error in each case.
-

We now d e ~ i v eintegration methods using Newton's forward interpolation formula.

13.2~2 Methods Using ^Newton's Forward Interpolation


Let the .data be given at equi-spaced nodal points xk = xo + sh, s = 0, 1, 2, ........., n,
where xo = a and x, = xo + nh = b.

The step length is given by h = -b. n-a


The Newton's forward finite difference interpolation formula interpolating this data is given by

- A2fo
f(x) P,(x) = fo+ sAf0+ s(s- 1) -+ . . . +
2
S(S- 1) (s- 2). . . (s- n+ I ) A O ~ ~
n!
with the error of interpolation

Integrating both sides of Eqn. (23) w.r.t. x between the limits a and b, we can
approximate the definite integral I by the numerical integration rule

The error.of iilterpolation of (24) is given by

i
We call obtain the trapezoidal rule (14) from (24) by using linear interpolation i.e., f(x)
P,(x) = fO + s A fO. We then have

with the error of integration given by (15).

Similarly Simpson's rule (16) can be obtained from (24) by using quadratic
interpolation i.e., f(x) = P2(x).
Numerical Differentiation Integration ~ ~ xo = ka, xl =i xo +~h, x2~= xo + 2h = 4 we have
and Solution o f Differential Equations

The error of interpolation is given by Eqn. (18).

Example 3 : Find the approximate value of I = S,l+


-dx
x
using

Simpson's rule. Obtain the error bound and compare it with the actual error. qlso
compare the result obtained here with the one obtained in Example 1.
-1
Solution : Here x, = 0, xl = 0.5, x2 = 1 and h = -.2
Using Simpson's rule, we have

Exact value of I = In2 = 0.693147.

Actual error = 0.001297. The bound for the error is given by

h5
IEs[fl 1 s -
90
M4= 0.00833, where M4= max

Here too the actual error is less than the given bound.

Also actual error obtained here is much less than that obtained in Example 1.
You may now trv the following exercise.
1.5
E2) Find an approximation t o S exdx,using
1.1
a) the trapezoidal rule with h = 0.4
b) Simpson's rule with h = 0.2

The Newton-Cotes formulas as derived above are generally unsuitable for use over large
integration intervals. Consider for instance, an approximation to

(ex dx, using Simpson's rule with h = 2. Here C

Since the exact value in this case is e4 - e0 = 53.59815, the error is -3.17143. This
error is much larger than what we would generally regard as acceptable. However, large
error is to be expected as the step length h = 2.0 is too large to make the er:or
expression meaningful. In such cases, we would be required to use higher order
formulas. An alternate approach to obtain more accurate results while using lower
order methods is the use of composite integration methods, which we shail discuss in
the next section.
- - - - -- - -

13.3 COMPOSITE INTEGRATION

In composite integration we divide the given interval [a, b] into a number of


subintervals and evaluate the integral in each of the subintervals using one of the
ntegration rules. W e shall construct composite rules of integration for trapezoidal and
Simpson's methods and find the corresponding errors of integration when these
composite rules are used.

Composite Trapezoidal Rule


We divide the interval [a, b] into N subintervals of length h = We denote the
subintervals as
N

(xk-, ,xk), k = 1, 2, . . ., N where xo = a, xN = b. Then

Evaluating each of the integrals on the right hand side by trapezoidal rule, we have

The method (26) is known as composite trapezoidal rule. The error is given by

Now since f is a continuous function on the interval [a, b], we have as a consequence
of Intermediate-value theorem

If M2 = max 1 f (5) 1 . Then


"
a<E,<b

(b-a) h2
IEJfl Is -7%
The error is of order h2 and it decreases as h decreases.

Composite trapezoidal rule integrates exactly polynomials of degree r 1. We can try to


remember the formula (26) as

IT[fl = ):( [first ordinate + last ordinate + 2 (sun of the remaining ordinates)].

Composite Simpson's Rule

In using Simpson's rule of integration (17), w e need three abscissas. Hence, w e divide
the interval [a, b] into an even number of subintervals of equal length giving an odd
b -a
number of abscissas in the form a = xo < x, < x2 < ...... < xZN = b with h = -
2N
and
I
Fjumerical Differentiation Integration x, = X, + kh, k = 0, 1, 2, ...,2N. We then write
) and Solution of Differential Equations
N
1= f ( x ) dx =
a k-1 'n-t
flx) dx

Evaluating each of the integrals on the right hand side of Eqn. (28) by the Simpson's
rule, we have
I N 1

The formula (29) is known as the composite Simpson's rule of numerical integration.
The error in (29) is obtained from (18) by adding up the errors. Thus we get

(b-a)
If M4 = max fN (5) , we can write using h = -
asesb I I 2N

The error is of order h4 and it approaches zero very fast as h -r 0. The rule integrates
exactly polyfiomials of degree r 3. We can remember the composite Simpson's rule as

I,[a - (t) [first ordinate + last ordinate + 2 (sum of even ordinates) + 4 (sum of the
remaining odd ordinates)]

We now illustrate composite trapezoidal and Simpson's rule through examples.

Example 4 : ~ v a l u s t e (=
bx
using
(a) composite trapezoidal rule and (b) composite Simpson's rule witb 2, 4 and 8
subintervals.

Solution : We give in Table 1 the values of f(x) witb h = -81 from x = 0 to x = 1.


Table 1

If N = 2 then h = 0.5 and the ordinates f , f4 and f8 are So be used.

We get
IS[q ;- 1
[G+ 4fl + fa] - - 25
0.694444
Numerical Integration

It N = 4 then h = 0.25 and the ordinates 6, f,, f,, 4, fa are t6 be used.


We have

I) -
1
IT[q =, [fo + f8 + 2 4 + 4 + f6 0.697024
(

ff N = 8 then h = 1/8 and all the ordinates in Table 1 are to be used.


We obtain

The exact value of the given integral correct to six decimal places is 1112 = 0.693147.
We now give the actual errors in Table 2 below.

Table 2

Note that as h decreases the errors in both trapezoidal and Simpson's rule also decreases.
Let us consider another example.
Example 5 : Find the minimum number of intervals required to evaluate with (*
-
an accuracy lo4, by using the Simpson rule. ol+x
~olukon: In Example 4 you may observe From Table 2 that N 8 giver lo4 (l.E - 06)
accuracy. We shall now determine N from the theoretical error bound for Simpson's
rule which gives l.E - 06 accuracy. Now

where
M, = max frV(x)
o<x< l
I I

To obtain the required accuracy we should therefore have

.:
N E 9.5
We find that we cannot take N = 9 since to make use of Simpson's rule we should
have even number ~C-intervals.We therefore conclude that N = 10 should be the
minimum number &f subintervals to obtain the accuracy l.E - 0.6 (i.e., lo4)
You may now try the following exercises :

dx
E3) ~valuateJ----Z by subdividing the interval (0,l) into 6 equal p a w and using
ol+x
/.I T r a n ~ 7 n i r l a ln r l p Ih\ Simncnn'c n ~ l pH p n r p f i n d thp v a 1 1 1nf
~ w and r r t l l a l ~ r r n m
Numerical Differeatiation Integration A 'function f(x) is given by the table
i and Solution of Differential Equations E4)

Find the integral of f(x) using (a) trapezoidal rule @) Simpson's rule.

E5) The speedometer reading of a car moving on a straight road is given. Estimate the
distance travelled by the car in 12 minutes using (a) Trapezoidal rule @) Simpson's
rule.

Time: 0 2 4 6 8 10 12
(minutes)
Speedo- : 0 15 25 40 45 20 0
meter
Reading
0.4
E6) valuates 0.2
(sin - In x + ex) dx using (a) Trapezoidal rule (b) Simpson's rule
taking h = 0.1.Find the actual errors.

E7) Determine N so that the composite trapezoidal rule gives the value of
2
I' e-X dx correct

upto 3 digits after the decimal point, assuming that e-" can be calculated accurately.

You must have realised that though the trapezoidal rule is the easiest Newton-Cotes
formula to apply but it lacks the degree of accuracy generally required. There is a way
to improve the accuracy of the results obtained by the trapezoidal and Simpson rules.
This method is known as Romberg integration, or as extrapolation to the Limit.
Richardson's extrapolation technique (ref. Sec. 12.5 of Unit 12) applied to the
integration methods is called Romberg integration. We shall now discuss this technique
in the next section.

13.4 ROMBERG INTEGRATION

In Romberg integration, first we find the power series expansion of the error term in
the integration method. Then by eliminating the leading terms in the error expression, we
obtain new values which are of higher order than the previously .computed values.
I

I
If Fdh) de otes the approximate value obtained by using the composite trapezoidal rule, then
I = Fo(h) + c,h2 + c-)~+ ~ , h q+ . . . .
where I is the exact value of the integral.
. ..
gral be evaluated with the step lengths h, -h2 and -
h
4'.

i
Eliminati g C, from Eqns. (31) and (32), we get

I
Note tha this value is of 0(h4). Similarly,

etc.
Numerical Integration
Applying this method repeatedly by eliminating C2, then C3 etc. we get the Romberg
integration formula

Fm (h) =
4mFm-l (i)- Fm-l('I
,m=1,2, .... (35)
4m-1
In the same way if Go@) denote the value of the integral obtained by using the
Simpson's rule, then
I = Go@) + d1h4+ d$6 + d,h8 + . . .
where I is the exact value of the integ;al.

Let the integral be evaluated with step lengths h, h/2 and h/4.
Then, we have
I = Go@) + d1h4+ d$6 + .. .

Eliminating dl from Eqns. (36) and (37), we get

Similar1y,

10
42G0 (t]- (i)
Go
42- 1 = GI($)
etc.
Note that these values are of order h6.

Applying extrapolation technique repeatedly, we get

We now illustrate this technique through an example.


1
Example 6 : Find the value of the integral I
0
=S 1
f (x) dx where f (x) = -using
1+ x
(a) coinposite trapezoidal and (b) composite Simpson's rules, with 3, 5 and 9 nodes.
Use extrapolation technique to improve the results.
I
Solution : We take the computed values &om Example 4. The Romberg integration values are
I given in Tables 3 and 4 for composite trapezoidal and composite Simpson's rules respectively.
I
Table 3

Note that
4F0 (i)
- (i) Fo
F1(i) = 3
Numerical Differentiation Integration
and Solutipn of Differential Equations

Table 4

Note that

Suppose that we wish to evaluate the integral in the above example directly by the
trapezoidal and Simpson's rules to an accuracy 1.OE- 06. What should be the
maximum value of step length to be chosen to achieve this accuracy ?
T o answer this question let us calculate the error bound for trapezoidal rule.

h2 h2 2 h2
IEJflJ s - max lf"(x)I >-
12 O < n < l
Hence

or N 145.
Thus to obtain l.E - 06 accuracy by trayezo~dalrule we need to use 145 subintervals,
i.e., 146 function evaluations. But by extrapolation we have used only 9 evaluatio~lsand
improved these values.
Let us co~lsideranother example.
Example 7 : Use co~npositetrapezoidal rule to find J f 2 l n x d x w i t h N = 3 , 6 , 1 2
and improve the accuracy by Rotnberg integration.
Solution : We give the result in the for111 of the followi~~gtable.

Table 5
You may now try the following exercise.

. E8) The followi~~g table gives the values of I n x for x = 1, 2, ..., 11. Evaluate the
integral of the tabulated function using Trapezoidal rule with h = 1, 2. u s e
Richardson's extrapolation technique to improve the accuracy and obtain the
actual error. Compare the results obtained by using Simpson's rule with h = 1.

We now end this unit by giving a summary of what we have covered in it.

13.5 SUMMARY

In this unit, we have learnt the following :


1) If a function f(x) is not known explicitly but a table of values of x is given or when it
has no anti-derivative expressible in terms of elementary functions then its integral
cannot be obtained by calculus methods. In such cases numerical integration methods
are used to find the definite integral of f(x) using the given data.
2) The basic idea of numerical integration methods is to approximate the definite integral
as a linear combination of the values of f(x) in the form

where the (n + 1) distinct nodes xk, k = 0, 1, ......, n, x,, < xl < x2 < ..... < x, divide
the integral [a, b] into n subinterirals and pk, k = 0, 1, ......, n are the weights of the
integration rule. The error of the integration methods is then given by

3) For equispaced nodes, the integration formulas derived by using Lagrange


interpolating polynomials P, (x) of degree s n, satisfying the interpolatory conditions
P, (xJ = f (xJ, k = 0, 1, ..., n are known as Newton-Cotes formulas. Corresponding to
n = 1 and n = 2, Newton-Cotes formulas viz., trapezoidal rule and Simpson's rule
are obtained.
4) For large integration intervals, the Newton-Cotes formulas are generally unsuitable for
they give large errors. Composite integration methods can be used in such cases by
:\4 dividing the interval into a large number of subintervals and evaluating the integral in
each of the subintervals using one of the integration rules.
5) For improving the accuracy of the trapezoidal or Simpson's rules and to obtain higher
%I I order solutions, Romberg's integration can be used.

h
E l ) a) I,[fl = -
2
[fo+ f,] = 0.346574

h
Is[f] = - Ifo+ 4f, + f2]
3
0.5
= -[4 In 1.5 + I n 21 = 0.385835
3

Exact value of I = 0.386294


I
Numerical Differentiation Integration
Actual error in IT [ f j = 0.03972
I I and Solution of Differential Equations
Actual error in Is[fj = 0.000459
Also

b) IT [fj = 0.023208 ,
I, [fj =0.032296,
Exact value = 0.034812 .
%[fj
I+ 1
C) IT [fj = 0.39270, = 0.161
I, ~q= o.a4n8 , [fj = 0.00831
Exact value = 0.34657 .

E2) IT (q = 1.49718
I, [fj = 1.47754.
1
E3) With h = 116, the values of f(x) = -
1+xZ
from x = 0 to 1 are

Now
h
lT[q-y[fo + f6 + f 6 + 2f, t f,+f3 + f 4 +f,
( )I

Exact n = 3.141593

Value of n fro111IT [fl = 4 x 0.784241 = 3.136963

Error in calculating n by IT [ f j is ETIfl = 0.004629

Value of n fr0111I, [fj= 4 x 0.785398 = 3.141592

Error in n by I, [fj is E, [fj = 1.0 x lod.

E4) I$fl a (k)


[fo t f4 + 2 (fl + f2 + f,)]
Numerical Integration
E5) Let vo = 0, vl = 15, v2 = 25, v3 = 40, v, = 45, v5 = 20, v6 = 0. Then
12
I= S0 v dt, IT [v] = + v6 + 2(v1 + v2 + V3 + V4 + v ~ )=] 290

E6) The values of f(x) = sin x = In x + ex are

f (0.2) = 3.0295 1, f (0.3) = 2.849352, f (0.4) = 2.797534

I, [fl =
04 [f (0:2) + 2f (0.3) + f (0.4)] = 0.57629

Exact value = 0.574056


E, = 2.234 x loa3.
Es = 9.2 x loa5.

E7) Error in composite trapezoidal rule

(b-a13
%[fj=-- 12NZ M 2 , M 2 = max Ifl'(x)J.
o<x<1

Thus

f "' (x) = e-X'4x(3-2x2) - 0 when x = 0, x = a


max I [f " (0) , f I ' (l)] I = max [2,2e-'1 = 2

For getting the correct value upto 3 digits, we must have

The intergel' value is N = 13.

E8) With h = 1, using trapezoidal rule


Numerical Differentiation Integration With h = 2,
and Solution o f Differential Equations

By extrapolation

1
F, (h) = - [4F (h) - F (2h)]= 16.39496667
3

By Simpson's rule

5 4

1.14 = (t)[b + flo + Z fZ-] 2 C]


. k-1
+
k-l

= 16.39496667 (which is same as the value obtained by extrapolation)

Exact value of the integral = 16.376848

Actual error = 0.01811867.


UNIT 14 NUMERICAL SOLUTION OF
ORDINARY DIFFERENTIAL
EQUATIONS

Structure
14.1 Introductio~l
Objectives
14.2 Basic Co~lcepts
14.3 Taylor Series Method
14.4 Euler's Method
14.5 Richardson's Extrapolation
14.6 Sultnnary
14.7 SolutionslPu~swers

14.1 INTRODUCTION

In the previous two units, you have seen how a colnplicated or tabulated function call be
replaced by an approxunati~~g poly~~omial so that the fundatnental operations of calculus
v i ~ . ,differentiation and integratio~~
can be performed more easily. 111 this w i t we shall solve
a differential equation, that is, we shall find tl~cUILIUIOWII function which satisties a
colnbinatio~lof the indeyc~ldentvariable, dependenr variable and its derivatives. In physics,
eagiaeering, chetnistry and nlany other disciplines it has become necessary to build
nlathelnatical ~nodelsto represent complicated processes. Differential equations are one of
the rnost hnportant mathematic-al tools uscd hl nlodelling problenls in the engineering and
physical sciences. As it is 1101always possible to obtain the analytical solutio~lof differential
equations recourse must necessarily be niade to numerical rtlethods for solving differential
equations. In this unit, we shall h~troduretwo such methods namely, Euler's ?lethod and
Taylor series method to obain numerical solutio~~ of ordinary differential equations (ODEs).
We shall also introducx Richardsoa's extrapolatio~lmethod to obtain higher order solutio~ls
to ODEs using lower order nlethods. To begin with, we shall recall few basic concepts fro111
the theory of differential equations which we shall be refeml~gquite often.

Objectives
After studying this unit you should be able to :
identify the initial value problem for the first order ordinary differential equations;
obtain the solution or the initial value proble~nsby using Taylor series method and
Euler's method;
use Richardson's extrapolatio~~ technique. for ilnprovi~lgthe accuracy df the result
obtained by Euler's method. '

14.2 BASIC CONCEPTS

In this section we shall state a few defi~litio~ls


from the theory of differential equations
and define some concepis involved in the nu~nericalsolutio~lof differential equations.

Definition : equation involving one or lllore unknown f u ~ l c t i o ~(dependent


~s
variables) and its derivatives with respect to one or more known functions (independent
variables) is called a differential equation.

For example,
Numerical Differentiation Integration
and Solution of Differential Equations

are differential equations.

Differential equations of the form (I), involving derivatives w.r.t. a single independent
variable are called ordinary differential equations (ODEs) whereas, those involving
derivatives w.r.t. two or more i n d e p e ~ ~ d variables
e~~t are partial differential equations
(PDEs). Eqn. (2) is an example of PDE.

Definition : The order of a differential equation is the order of the highest order 1
!
derivative appearing in the equation a ~ t dits degree is the highest exponent of the
*
1

highest order derivative after the equation has been rationalised i.e., after it has been I
expressed in the forn~free from radicals and any fractional power of the derivatives or
negative Dower. For cxamale eauation

is of third order and second degree. Eauation 1

is of first order and second degree as it can be written in the form 1

Definition : When the depew'ent variable and its derivatives occur in the first degree
only and not as higher powers or products, the equation is said to be linear, otherwise
it is nonlinear. 1
Equ y = x2 is a linear ODE, whereas, ( x + ~ ) ' nonlinear ODE.

Sirn -- - = 0, is a nonlinear PDE.


ay2 (d$y)

In this unit we shall be concerned only with the ODEs.

The general form ,of a linear ODE of order n can be expressed in the form

~ [ y =] a, (t) (t) + a l (t) y(n-')(t) + . . . . . . + a,-, ( 9 Y (t) + an (t) Y (t) = r(t) (5)
where r(t), a, (t), i = 1, 2, . . . . . . ., n are known functions of t and
I
d
(t) -1
dtn-
..
+ . . . . . . . + a n - l ( t ) z + an(t),
is the linear differential operator. The general nonlinear ODE of order 11 can be
written as

F(t, y, y ' , y" , . . . . . . ., y(n)) = 0


or, y(n) = f(t, y, y ' , y " , . . . . . . ., y(n-1) )
Eqn. (7) is called a canonical representation of Eqn. (6). hl such a form, the highest order
derivative is expressed in tenns of lower order derivatives and the independent variable.

The general solution of an nth order ODE contains n arbitrary constants. In order to
determine these arbitrary constants, we require n conditions. If these conditions are
given at one point, then these conditions are known as initial conditions a ~ l dthe
differential equation together with the initial conditions is called an initial value
problem (IVP). The nth order IVP can be written as

y'n' (t) = f(t,y,y',yt',. . . . . ...,y(n-1) )


I
l f ' t h e 11 co~rditionsare prescribed at Inore (ha11 onr point then these 'onditions are Nunicrical Solution of Ordinary
Differen~ialEqualions
k11ow11a s I>oundary cbnditions. The differential cquation togetIrcr with the boundary
1
c o ~ r d i t i o ~is~then
s k~rownas a Ijoundary value prohlem (BVP).
t
i T h c nth order IVP ( 8 ) is equivalent to the tollowing syslen~of 11 first order equations :
t
I S r t y = yl. Then

In vector notation, this systenl can be written a s a single equation as

stf = f (t, y), Y (to) = a


dt
........., y n f , ..... f(t,yl,. .....
where y = yl,y2
( i
f(t,y) = YrYi,.

Heocc, it is sufficient to study numerical ~rlcthodsfor the solution of the first order IVP.
Y' = f(t, Y) ~ ( t " )= Yo
9 (10)
The vector fonlr of thrse neth hods can their be used to solve Eqn. (9). Before attempting to
obtain ~lunleriralsoIutions to Eqn, (lo), we lust make sure (hat the proble~nhas a unique
solution. Thc Lollowhg (heorem cllsures the existence and uniqueiress ot (he solution to IVP (10).

Thenren~1 : If f(t, y) satisfies the conditions


i) f(t, y) is a real f u n c t i o ~ ~
ii) 1(t, y) is dcfinrd and C O I I ~ ~ I I U O Ufor
S t E [t0b] ,

iii) thcre e.xist.5 a cclnstant L such that for any t E


I
t b and for any.two numbers yl and y2
[o,

then for any yo, the IVP (10) has a unique solution. This condition is called the
Lipschitz condition and L is called the 1,ipschitz constant.

W e assunre the existence and uniqueness of the solution and also that f(t, y) has
continuous partial derivatives w.r.t. t and y of a s high order as w e desire.

I
Let us assume that t b b e an interval over which the solution of the IVP (10) is
[o,
required. If w e subiivide the interval t b into
[ n I 11 subintervals using a steysize

. where tn = b, we obtain (he mesh points or grid points to, t,, t2 ...... 7 tn

a s shown in Fig. 1.

Fig. 1
W e can then write t, = to + kh, k = 0, 1, . . . . . n. A numerical method for the solution
of the IVP ( l o ) , will produce approximate values y, at the grid points tk.
Numerical :renliation Integration Remember that the approximate values yk may contain the truncation and round-off errors.
and Solutb Differential Equations
We shall now discuss the cot~structionof numerical methods and related basic concepts
with reference to a simple ODE.

Let the grid points be defined by

where = a and b + Nh = b. 1
Separating the variables and integrating, we find that the exact solution of Eqn. (11) is
eY' - I',)
~ ( t =) Y(Q (12)
In order to obtain a relation connecting two successive solution values, we set t = t,
and gtl in Eqn. (12). Thus we get

y(b) = y(f) eY'n-.'J


and

y ( t , + ~= y(b) eYt- 1

Dividing, we get

Hence we have

y(t,,),n=O, 1,.....,N-1
Ah
y(t,+J=e

Eqn. (13) gives the required relation between y(t,,) and y(t,,,).

.
Setting 11 = 0,1, 2, . . . ., N-1, successively, we can find y(tl), Y(~J,,.. . ., Y ( ~ N )
from the given value y(fo).
I
An approximate methgd or a numerical method can be obtained by approximating ehh in
Eqn. (13). For example, we may use the followiilg polynomial ayroxhations. I

and so on.

Let us retain (p+l) terms in the expansion of ehy and denote the approximation to ehy
by E(Ah). Tbe ~~umerical method for obtaining the approximate values y, of y(t,) can
then be written as

yn+,= E ( U ) y,, n = 0, 1, . . .. . . . .,N-1 (17)

The truncation error (TE) of the method is defined by

TE = ~ ( h t , -Yntr
)
Sinc'e (p+l) tenns are re.tained in the expa~lsionof ehh,we have
Numerical Solution of Ordinary
Differential Equations

The TE is of order p+l. The integer p is then called the order of the method.

We say that a ~~umerical method is stable if the error at ally stage, i.e. y, - y($) = E,
remains bounded as n + m. Let us examine the stability of the numerical method (17).
Putting Y , + ~ = ~ ( t , + ~+ )E , + ~ and y, = y(t,) + en in Eqn. (17), we have

= E ( U ) [y(t,) + en] - eAhy(tn)


(using Eqn. (13))

We note from Eqn. (18) that the error at t,,, consists of two parts. The first part E [ U ]
- eAhis the local truncation error and can be made as small as we like by suitably
deterttlit~illgE [ U ] . The second part ( E ( U )1 E, is the propagation error from the
previous step t, to $+, and will not grow if IE(&h)l < 1. If I E ( U ) ( c 1, then as
11 + m the 'propagation error tends to zero and method is said to be absolutely stable.
Formally we give the followil~gdefinition.
Defintion : A numerical method (17) is called absolutely stable if ( E ( U ) ( s 1.
You may also observe here that the exact value y(t,) given by Eqn. (13) increases if h
> 0 and decreases if h c 0, with the growth factor eAh.The approximate value yn given
by Eqn. (17) grows or decreases with the factor IE(U) 1. Thus, in order to have
mea~li~lgful~~umericalresults, it is necessary that the growth fact-r 6f the nu~tierical
method should not increase faster than the growth factor of exact solution when h > 0
and should decay at least as fast as the growth factor of the exact solution when h c 0.
Accordilrgly, we give here the following definitioa.

Defil~itiol~
: A numerical method is said to he relatively stal,le if IE ( U ) l s eAh,h > 0.

The poly~lomialapproximations (14), (15) and (16) always'give relatively stable


methods. Let us now find when the methods Y , + ~ = E ( U ) y, are absolutely stable where
E(hh) is givem by (14). (15) or (16).

This methods are given by


First order : Y,+, = ( l + U ) Y,

Thcse n~t.lhodsare absolutely stable whelk

Firstorder: I l + h h l 4 1

or-1s U s 1

or-2s hhs 2

Second order : I 1 + hh +-
h:2(sl

The right inequality gives


Numerical Differentiation Integration
and Solution oPDifferential Equalions

The second condition gives -2 a hh. Hence the right inequality gives -2 5 Ah a 0 . The
left inequality gives

For -2 a Ah a 0, this equation is always satisfied. Hence the stability col~ditiol~


is

h2h2 h3h3
Thirdorder: ( + A h + - + -2l a 1 6

Usinge the right and left inequalities, we get

-2.5rhhsO.

These intervals for Ah are know as stability intervals.

Numerical methods for finding the solution of IVP given by Eqn. (10) may be broadly
classified as
i) Singlestep methods
ii) Multistep methods
Singlestep methods ehable us to find y,+,, an approximation to y(t,+,), if y,, y,' and h
are known.

Multistep methods enable us to find y,+,, an approximation to y(t,+,), if yi, yi , i = n,


tl-1, . . . . . . . n-m+l and h are known. Such methods are called m-step multistep methods.

In this course we shall be discussing about the singlestep methods only.

A singlestep method for the solution of the IVP

y' = f(t,y), Y (to)= Yo, tE tab


( 9 )

is a recurrence relation of the form

where $ (t,,, yn, h) is known as the increment function

If yn+l can be determined from Eqn. (19) by evaluatilig the right hand side, then the
singlestep method is known as an explicit method, otherwise it is known as an implicit
method. The local truncation error of the method (19) is defined by

The largest integer p such that

is called the order of the singlestep method.

Let us now take up an example to understand how the singlestep method works.
Example 1 : find the solution of the IVP y' = hy, y(0) = 1 in 0 < t a 0.5, using the
first order method
-
Y , + ~ = (1 t Ah) yn with h = 0.1 and h = 1.

0.5 0.5
Solution : Here the number of intervals are N = -= -=
h 0.1
We have yo = 1
11, =(1 t Ah) y,=(l t A h ) = ( l tO.1A) Numerical Solution of Ordinary
Differential Equations
y 2 = (1 t Ah) y, =(1 tAh12 =(1 t 0 . 1 ~ ) ~

y, = (1 t Ah)' = (1 t 0.1 A)'


The exact solution is y(t) = ekt.

'We now give in Table 1 the values of y, for h = * 1 together with exact values.

Solution of y' = Ay, y(0) = 1, 0 r t r 0.5 with h = 0.1


A = l A=-1
t First Order Exact First Order Exact
method Solution method Solution
0 1 1 1 1
0.1 1.1 1.10517 0.9 0.90484
0.2 1.21000 1.22140 0.81 0.81873
0.3 1.33100 1.34986 0.729 0.74082
0.4 1.46410 1.49182 0.6561 0.67032
0.5 1.61051 1.64872 0.59049 0.60653

In the same way you can obtain the solution using the second order method and
compare the results obtained in the two cases.

El) Find the solution of the IVP


y' = h y , y(O)= 1

in 0 r t r0.5 using the second order method

(
y,+, = 1 t Ah t - y, with h = d.1 and A = 1.

*
We are now prepared to consider numerical methods for integrating differential
equations. The first method we discuss is the Taylor series method. It is not strictly a
numerical method, but it is the most fundamental method to which every numerical
method must compare.

14.3 TAYLOR SERIES METHOD

Let us consider the IVP given by Eqn. (lo), i.e.,

The function f may be linear or nonlinear, but we assume that f is sufficiently


differentiable w.r.t. both t and y.
The Taylor series expansion of y(t) about any point tl, is given by

Substituting t = \+,in Eqn. (22) w e have


Numerical DifferentiationIntegration where tk+l = tk+ h. Neglecting the terms cf order hp+' and higher order terms, we have
and Solution o f Differential Equations
the approxinlation

h
where + (tk, yk, h) = y; + 21 yrlk+ . . . + -yk
h ~ - l (P)
P!
This is called the Taylor Series method of order p. The truncation error of the lllethod
is given by

hp+l
--
- y(P+l) (tk + Oh), 0 < 0 < 1
(p+l) !

When p = 1, we get from Eqn. (24)

which is the Taylor series method of order one.

T o apply (24), we must know y ( 9 , yl(tk), y"(t3, . . . - . . -,Y(P) (tk).


However, y(tk) is known to us and if f is sufficiently differentiable, then higher order
derivatives can be obtained by calculating the total derivative of the given differential
equation w.r.t. t, keeping in mind that y is itself a function of t. Thus wz obtain for the
first few derivatives as :

Y' =f(t,y)
yl'=f,+f$

y"' = 4, + 2 f ty+ ?fyy + $ (f, + f $) etc.


where f, = aWat, f,, = a2f/at2etc.

The number of terms to be included in the method depends on the accuracy


requirements.

Let p = 2. Then the Taylor Series method of 0(h2) is

h3
with t h e T E = - y U ' ( a ) , t,, < a < 4,'
6
The Taylor series method of 0(h3), (p=3) is

h4
with the TE = y(P) (a), t,, c a < C+,.

Let us con~idetthe following examples.

Example 2 : Using the third order Taylor series method find the solution of the
differential equation

xy' = x - y , y(2) = 2 at x = 2.1 taking h=0.1

Solution : We have the derivatives and their values at x=2, y=2 as follows :
yr=l-Y y' (2) = 0
X
Numerical Solution of Ordinary
Differential Equations

Using taylor series method of 0(h3) given by Eqn. (28), we obtain

y(2.1) = 2 + 0.0025 - 0.000125 = 2.002375.

Example 3 : Solve the equation x2y' = 1 - xy - x2y2, y(1) = - 1 from x=l to x=2 by
using Taylor series method of 0(h2) with h = ln
and 114 and find the actual error at
x=2 if the exact solution is y = - llx.

Solution : h o r n the given equation, we have y1


-2- 1x - y2
Differentiating it w.r.t. x, we get

Using the second order method (27),

we have the following results

Since the exact value is y(2) = -0.5, we have the actual errors as

1
e, = 0.0583 with h = -
3
1
e2 = 0.0321 with h = -
4

Note that error i s small when the step size h is small.

Your may now trjm, the following exercises-

- - - - - - -

Write the Taylor series method of order four and solve the IVPs E2) and E3).

E2) y' = x - y2, y(0) = 1. Find y(0.1) taking h = 0.1.


Numerical Differentiation Integration
and Snllrtion o f Differential Equations ~ 3 ) yf = x2 + y2, y ( ~ =) oa5.'~indy(0.4)'taking h = 0.2.

E4) Using second order Taylor series method solve the IVP

y' = 3. +,: y(0) = 1. Find y(0.6) taking h = 0.2 and h = 0.1.


Find the actual error at x = 0.6 if the exact solution is y = - 6x - 12.

Notice that though the Taylor series method of order p gives us results of desired accuracy
in a few number of steps, it requires evaluation of the higher order derivati. and becomes
tedious to apply if the various derivatives are complicated. Also, it is d f l .*'t to determine
the error in such cases. We now consider a method, the Eukr's metbod -.,hichcan be
regarded as Taylor series method of order one and avoids tbese difficulties.

14.4 EULER'S METHOD


I
Let the given IVP be

Let
[t o ,bI be the interval over which the solution of the given IVP is to be
determined. Let h be the steplength. Then the nodal points are defined by tk = to + kh,
k = 0 , 1 , 2 ,........, N w i t h t N = t O + N h = b .

Fii. ;I

The exact solution y(t) at t = t,.+, can be written by Taylor series as

Neglecting the term of 0(h2) and higher order terms, we get

Yk+l = Yk +h ~ ' k

with TE =
(TI
- y"(a), tk < a < t,

From the given IVP,y' (Q = f(p, y& = 4


We can rewrite Eqn. (30) as

Eqn. (32) is known as the Euler's method and it calculates recursively the solution at
the nodal points &, k = 0, 1, ., N... . ..
Since the truncation error (31) is of order b2, Euler's method is of first order. It is also
called an 0@)method.
Let us now see the geometrioal representation of the Euler's method. Numerical Solution of Ordinary
Differential Equations

Geometrical Interpretation
Let y(t) be the solution of the given IVP, Integrating
we get
dt
*
= f(t, y) from tk to tk+l,

(y$ dt = j'.'f(t,y) dt
tk
=~ ( t ~-+y(tk)
~) (33)

We know that geometrically f(t, y) represents the slope of the curve y(t). Let us
approximate the slope of the curve between $ and tk+,by the slope at $ only. If we
approximate y(tk+J and y(tJ by yk+, and yk respectively, then we have

yk+l - yk = f(tk, ylc)Jhl dt (34)


t,

Thus in Euler's method the actual curve is approximated by a sequence of line


segments and the area under the curve is approximated by the area of the quadrilateral.
(see Fig.3)

+.1 Fig. 2 : Geometriul representation of Euler's method

Let us now consider the following examples.

Example 4 : Use Euler method to find the solution of y' = t + ly 1, given y(0) = 1.
Find the solution on [O, 0.81 with h = 0.2.

.
Solution : We have
Numerical Differentiation Integration
and Solution of Differential Equations y(0.81= Y, = Y, + (0.2) f,
= 1.856 + (0.2) [0.6 + 1.856)
= 2.3472
Example 5 : Solve the differential equation y' = t+y, y(0) = 1. t E [0,1] by Euler's
method using h = 0.1. If the exact value is y(1) = 3.436564, find the exact muT.
Solution : Euler's inethod is
= Yn + hy',
Yn+,
For the given problem, we have

= (1 + h)yn + ht,
h = O.l,y(O) = 1,
y1 = yo = (1 + 0.1) + (0.1) (0) = 1.1
y2=(1.1)(1.1)+(0.1)(0.1)= 1.22, y,= 1.362
y, = 1.5282, y, = 1.72102, y, = 1.943122,
y, = 2.197434, y, = 2.487178, y, = 2.815895
ylo = 3.187485 = y(1)
actual error = y(1) - ylo = 3.436564 - 3.187485 = 0.2491.
Remark': Since Euler's method is of O(h), it requires h to be very small to attain the
desired accuracy. Hence, very often, the number of steps to be carried out becomes very

large. In such cases, we need higher order methods to obtain the required accuracy in a
limited number of steps.

Euler's method construct yk = y(tk) for each k = 1, 2, . . . ., N,


where

This equation is called the difference equation associated with Euler's method. A
. . . . . . ., Y , + ~ soine
difference equation of order N is a relation involving y,, Y,+~,
simple difference equations are

where n is an integer.

A difference equation is said to be linear if the unknown functions (k = 0,


.
1, . . . . . . ., N) appear linearly in the difference equation. The general fonn of a
,linear noiihoinogeneous difference equation of order N is

where the coefficients aN-,, aN,, ..... . . ., a. and b may be functions of n but not of
y. All the Eqns. (35) are linear. It is easy to solve the difference Eqn. (36), when the
coefficients are constant or a function of n say linear of a quadratic function of n.

The general solutioi~of Eqn. (36) can be written in the form

where y, (c) is the complementary solution of the homogeneous equation associated


with Eqn. (36) and y,(p) is a particular solution of Eqn. (36). To obtain the
complementary solution of the homogeneous equations, we start with a solution in the
form yn = pn and substitute it in the given equation. This gives us a polynomial of
.
degree N. We assume that its roots pl, p2, . . . . . ., PN are all real aiid distinct.
The11 by linearity it follows that Numerical Solution of Ordinary
Differential Equations
y n = c , p : + c 2 p ; + ........ +CNp",
I for arbitrary constants C,, is a solutio~iof the honiogeneous equation associated with
I
Eqn. (36). A particular solution of Eqo. (36) when b is a constant call be obtained by

I
setting y,(p) = A (a c o ~ ~ s t a ~inl tEqn.
) (36) and detrrmillil~gthe value of A. For detail,
you call refer to ele~tlentarynumerical analysis by Coate- deBoor. We illustrate this
method by considering a few examples.
Example 6 : Find the solution of the initial-value difference equations
Ynt2-4~,+, + 3yn = 2", Yo = 0, Y, = 1

Solution : The homogeneous equation of the given proble~uis


~ n + 2 - 4 ~ n + l3 ~ =n O

Let y, = p". Then Eqn. (37) reduces to


pnt2- 4pnt1 + 3P" = 0.
Dividing by fin, we obtain the characteristic equation
L p2-4~+3=0
i.e., P = 1, 3

I
t
.-. y, (C) = C, (1)" + C, (3)"

This gives
= C, + 3"C2
For obtaining the particular solutio~lwe try y,(p) = A2".

I 2 " " ~ - 4 x 2 " " ~ + 3 x 2"A = 2"

I or, A = -1

i Therefore, the general solution of the given problem is

I Using conditioris for 11 = 0, 1, we obtain

:. C2= 1/2, C , = 112 and

which is the required solution.


Note : In the above method we can obtain y, for all n from one formula given by
Eqn. (39). Whereas, in the Eulers's method for obtahing'the value at each iteration, we
require the previous iterated value. We illustrate it by considering another example.
Example 7 : Using difference method find the solution of yk+, = yk + h(5+3yJ, given
y(0) = 1. Find the solution y(O.6) with h = 0.1.
Solution : We have
Yk+l + 3h) yk = 5h (40)
Solution of the homogeneous equation is

ydc) = C (1 + 3h)'.
For obtaining the particular solution we try yk(p) = Ah.

This give
Numerical DifferentiationIntegration Therefore, the general solution of the given problem is
and Solution of Differential Equations
5
y k = ~+3l1)~--.
(l
3
Using the condition y(0) = 1, we obtain C = 8/3.

Thus
8 5
y k = y (1 +3i1)~--.
3

Eqn. (41) gives the formula for obtaining yk + k.


8
y6 = y(0.6) = - (1
3
+ 3 X 0 . 1 ) ~- -53

Now Euler's method is .

Yk+l = + 3h) yk + 5h

and we get for h = 0.1

Y1 = 1.8, Y2 = 2.84, y3 = 4.192, y, = 5.9496, y5 = 8.23448, y6 = 11.204824.

You may now try the following exercises

Solve the following IVPs using Euler's method

E5) y' = 1 - 2xy, y(0.2) = 0.1948. Find y(0.4) with h = 0.2

E6) y' = -
x2 - 4y
Ly(4) ,
= 4. Find y(4.l) taking h = 0.1

E7) y' = F,y(0) = 1. Find y(0.1) with h = 0.1


Y+X

E8) y' = 1 + y2, y(0) = 1. Find y(0.6) taking h = 0.2 and h = 0.1.

You may recall that in Unit 12 we studied Richardson's extrapolatio~~ technique to


increase the order of a numerical differentiation formula without increasing the h ~ ~ c t i o ~ ~
evaluations. In Unit 13, we introduced Romberg integration which is the Richardscl~~'s
extrapolation technique applied to the integration rules. In both the cases, the order of
the numerical value was improved by the application of the Richardson's extrapolation.
In the next section we shall use this technique to obtain higher order solutions to
differential equations using lower order methods.

14.5 RICHARDSON'S EXTRAPOLATION

Consider the Euler's method

which is au O(h) method. Let F(h) and F(h/2) be the solutions obtained by using step
lengths h and h/2 respectively.

Recall that the Richardson's extrapolation method of combining two computed values with two
different step sizes, to obtain a higher order method is given by (ref.. Formula (54) Unit 12)

where p is the order of the method.


Thus, in the case of Euler's Ph
tocIwhich is of first order, once we know the values F(h)
and F(h/2) at two differentstep sizes h and hf2, Formula (42) for r = la p=l, reduces to
2 4 ~- )F(h)
F(') (h) =
2-1
Before illustrating this technique, we give you a method of determining numerically, the
order of a method.

Let y1(4) and y2(t,J be the two values obtained by a numerical method of order p with
step sizes hl and b. If e: and 5 are the corresponding errors, then
P

By taking logarithms, we get

Hence the order y of the method is

Let us now consider the followit~gexamples.


Example 6 : Using the Euler's method tabulate the solution of the IVP

in the interval (0, I ] taking h = 0.2, 0.1. Using Richardson's extrapolation technique
obtain the improved value at t = 1.

Solution : Euler's method gives

yk+l = yk + h fk where j = - 2 j yk2


=Yk-2h4yc .

Starting with t, = 0, yo = 1, we obtain the following table of values for h - 0.2.

Table 2 : h = 0.2
- - -

Thus, y(1.0) = 0.50706 with h = 0.2

Similarly, starting with t, = 0, yo = 1, we obtain the following table of values for h = 0.1.

Table 3 : h = 0.1

y(1.0) = 0.50364 with h = 0.1


hn1crica1Differmtia'iOn Integration Using formula (43), the extrapolated value at y(1) is given by
and Solution of Differential Equations

2F(0.1) - F(0.2)
(0.1) =
1
= 2(0.50364) - (0.50706)
= 0.50022

Let us consider another example

Example 7 : Use Euler's method to solve numerically the initial value: proble
y' = t t y , y(0) = 1 with h = 0.2, 0.1 and 0.05 in the interval [0, 0.61. Apply
Richardson's extrapolation technique to compute y(0.6).

Solution : Eu!er's method gives


YkeI = YL + h fk
= Yk (tk + ~3
= (lth)ykth$

Starting with = 0, yo = 1, we obtain the following table of values.

Table 4 : h = 0.2

:. y(0.6) = 1.856 with h = 0.2

Table 5 : h = 0.1

.: y(0.6) = 1.943122 with h = 0.1

Table 6 : h = 0.05

:. y(0.6) = 199171 with h = 0.05


By Richardson's extrapolation method (43)' we have Numericill Solution of Ordinary
Differential Equations
F(')(0.~5)= 2 F(O.05) - F(O.l)
= 2.040298
~("(0.1)= 2 F(O.l) - F(0.2)
= 2.030244
L
Repeating Richardson's technique and using formula (42) with p=2, we obtain

= 2.043649
The exact solution is y = - (l+t) + 2et

Hence y(0.6) = 2.044238.

The actual error of the extrapolated value is

I error = y(0.6) - F(2)(0.05)


1 = 2.044238 - 2.043649

i
I
= 0.000589
Aad now a few exercises for you

E9) The IW

is given. Find (0.6) with h = 0.2 and h = 0.1, using EulerTs method and extrapolate
the value y(0.6). Compare with the exact solution.
1 E10) Extrapolate the value y(0.6) obtained in B).
i

We now end this unit by giving a summary of what we have cwered in it.

In this unit, we have covered the following


1) Taylor series method of order p for the solution of the IVP

is given by

2, , . . . .. ..N, tN= b. The enor of approximation is given by

2) Euler's method is the Taylor series method of order one. The steps involved in
solving the IVP given by (10) by Euler's method are as follows :
Step 1 : Evaluate f(t,,, yo)
Step 2 : Find y, = yo + h f(t, yo)
Step 3 : If t,, < b, change b to b + h and yo to y, and repeat steps 1 and 2
Numerical Differentiation Integration
and Solution o f Differential Equations Step 4 : If t, = b, write the value of y
3) Richardson's extrapolation method given by Eqn. (42) can be used to improve the
values sf the function evaluated by the Euler's method.

El) We have yo= 1, A = 1, h =0.1

y2= (1.105)~
------------------
ys = (1.105)~
Table giving the values of y, together with exact values is

Table 7
t Second order method Exact solution
0 1 1
0.1 1.105 1.10517
0.2 1.22103 1.22140
0.3 1.34923 1.34986
0.4 1.49090 .1.49182
0.5 s X 34745 ' 1.64872

E2) Taylor series method of 0(h4)to solve y' = x - y2,y(0) = 1 is

yiv = -2yyu1--6y1 y"

Substituting

E3) Taylor series method :


y1=x2 + y2 , y(0) = 0.5, y' (0) = 0.25, y' (0.2) = 0.35175
y" = 2x + 2yy' y" (0) = 0.25, y" (0.2) = 0.79280
y"' = 2 + 2 y y U+2(y92 y"' (0) = 2375, y"' (0.2) = 3.13278
Yiv = 2yy'" + 6y' y" Yiv (0) = 2.75, Yiv (0:)' = 5.17158
~(0.2)= 0.55835, y(0.4) = 0.64908
E4) Second order Taylor's method is
Numaical Solution of Ordinary
Differential Equations

E5) Euler's method is yk+,= yk + hE, = yk + h (1-2xk y,J


~(0.4)= 0.1948 + (0.2) (1 - 2 x 0.2 x 0.1948)
= 0.379216.

E7) Euler's method y' = (y-x) / (y+x), y(0) = 1, ~ ' ( 0 )= 1


y(0.1) = 1 + (0.1) (1) = 1.1

E8) Euler's method is


2
Yk+l = + Yk + h ~ t
Starting with 6 = 0 and yo = 1,we have the following tables of valdes

Table 8 : h = 0.2

Table 9 : h = 0.1

Starting with to = 0, yo = 1, we have the following table of values

Table 10 : h = 0.2
Numerical Diff6rentiation Integration Table 11 : h = 0.1
and Solution of Differential Equations

Using fonnula (42), we have


F(') (0.1) = 2F(0.1) -F(0.2)
= 1.93944
Exact solution is
1
-t
y = -6(t+2) + 13eZ
Hence

The actual error of the extrapolated value is

error = y(0.6) - ~(')(0.1)

E10) From E8),we have F(O.l) = 3.5691 and F(0.2) = 2.9856


UNIT 15 SOLUTION OF ORDINARY
DIFFERENTIAL EQUATIONS
USING RUNGE-KUTTA
METHODS

Structure

15.1 Introduction
Objectives
15.2 Runge-~uttaMethods
Runge-Kutta Methods of Second Order
Runge-Kutta Methods of Third Order
Runge-Kutta Methods of Fourth Order
15.3 Richardson's Extrapolation
15.4 Summary
15.5 Solutions/Answers

15.1 INTRODUCTION

In Unit 14, we considered the IVPs

Y' = f(t, Y), Y '(to) = Yo (1)


and developed Taylor series method and Euler's method for its solution. As mentioned
earlier, Euler's method being a first order method, requires a very small step size for
reasonable accuracy and therefore may require lot of computations. Higher order Taylor
series methods require evaluation of higher order derivatives either manually or
computationally. For complicated functions, finding second, third and higher order total
derivatives is very tedious. Hence Taylor series methods of higher order are not of
much practical use in finding the solution of IVPs of the form given by Eqn. (1).

In order to avoid this difficulty, at the end of nineteenth century, the Cierman
mathematician, Runge observed that the expression for the increment function t$ (t, y, h)
in the singlestep methods [see Eqn. (24) of Sec. 14.3, Unit 141

can be modified to avoid evaluation of hlgher order derivatives. This idea was further
developed by Runge and Kutta (another German mathematician) and the methods given
by them are known as Runge-Kutta methods. Using their ideas, we can construct higher
order methods using only the function f(t, y) at selected points on each subinterval. We
shall, in the next section, derive some of these methods.

Objectives
After studying this unit, you should be able to :
obtain the solution of IVPs using Runge-Kutta methods of second, third and fourth order,
compare the solutions obtained by using Runge-Kutta and Taylor series methods;
extrapolate the approximate value of the solutions obtained by the Runge-Kutta
methods of second, third and fourth order.

15.2 RUNGE-KUTI'A METHODS

We shall first try to discw the basic idea of how the Runge- Kutta methods are developad.
Numerical D i f h t i a t i o n Integration consider the q h 2 ) singlestep method
and Solution of Differential Equations
hZ
Y , + l Z Y, + hy', + TY",

If we write Qn. (3) in the form of Eqn. (2) i.e., in terms of + [t,,, y,, hl involving
partial derivatives of f(t, y), we obtain

Runge observed that the r.h.s. of Eqn. (4) can also be obtained using the Taylor series
expansion of f(t,, + ph, y, + qhf) as
f(t,, + ph, y, + qhQ - f + ph f, (6,Y,) + qh% fy (b* yn) (5)

Comparing Eqns. (4) and (5) we € i d that p = q = 112 and the Taylor series method of
0(h2) given by Eqn. (3) can also be written as

Since (5) is of q h 2 ) , the value of yn+l in (6) has the TE of 0(h3). Hence the method
(6) is of w h Z ) which is same as that of (3).

The advantage of using (6) over Taylor series method (3) is that we need to evaluate

the function f(t, y) only at two points (t,,, y,) and . We observe that
f(t,,, y,) denotes the slope of the solution curve to the IVP (1) at (t,,, y,). Further,

[ + -, + (-1 I
f t, y, f, denotes an approximation to the slope of the solution curve at the

[ + - ( + -:)I
point t,, y t, Eqn. (6) denotes geometrically, that the slope of the solution

I
curve in the interval t , tn+l is being approximated by an approximation to the slope at
h
the middle points t,, + -. This idea can be generalised and the slope of the solution
2

[ I
curve in t,,, t,+, can be replaced by a weighted sum of slopes at a number of points in

(called off- step poinb). This idea is the basis of the Runge-Kutta methods.
[b,
Let us consider for example, the weighted sum of the slopes at the two points [t,,, y,]
and [t,, + ph, Y, + qhfl, 0 < p, q < 1 as

We a11 W, and W2 as weights and p and q as scale factors. We have to determine the
four unknowns W1, W2, p and q such that 4 (t,,, y,, h) is of 0(h2). Substituting Eqn. (5)
in (7),we have

and the method (2) reduces to

Yn+1 , Y,) + q h c 4.(tn, Y,)}]


= Yn + h [Wlf + W2 [f, + ~ h f(t,,,
= Y. + h(W1 + W 3 4 + h% (pf, + qf f,),

where ( 1, denotes that the quantities inside the brackets are evaluated at (t,,, y,).

-
Comparing the r.h.s. of Eqn. (9) wlth Eqn. (3), we € i d that
In the system of Eqns. (lo), since the number of unknowns is more than the number of Solution OE Ordinary DiEEuential
Equations using Runpc-Kutta Methods
equations, the solution is not unique and we have infinite llurnber of solutions. The
solution of Eqn. (10) can be written as

By choosing W2 arbitrarily we may obtain infinite number of second order Runge-Kutta


1
methods. If W, = 1, p = q = and W, = 0, then we get the method (6). Another
1 2 1
choice is W2 = - which gives p = q = 1 and W, = - With this choice we obtain from
2 2'
(7), the method

which is known as Heun's method.

Note that when f is a functian of t only, the method (12) is equivalent to the
trapezoidal rule of integration, whereas the method (6) is equivalent to the midpoint rule
of integration. Both the methods (6) and (12) are of 0(h2). The methods (6) and (12)
can easily be implemented to solve the IVP (1). Method (6) is usually known as
improved tangent method or modified Euler method. Method (12) is also known as
Euler-Cauchy method. '

We shall now discuss the Runge-Kutta methods of 0(h2), 0(h3) and 0(h4.

15.2.1 Runge-Kutta Methods o f second Order


The general idea of the Runge-Kutta (R-K) methods is to write the required methods as

yn+]= yn + h (weighted sum of the slopes).

where m slopes are being used. These slopes are defined by

etc. In general, we can write

r i-I 1

The parameters Ci, aij, Wj are unknowns and are to be determined to obtain the
Runge-Kutta methods.

We shall now derive the second order Runge-Kutta methods.


Consider the method as
Y n + 1 = Yn + W l K l + WzKz

where

where the parameters C2 a,, W, and W2 are chosen to make yn+] closer to y(&+3.
h2 h3
y(b+d=y(f)+ ( t d , h ~ ' + ~ ~ ' ( b ) + ~ ~ ' " ( t d , + . . . .
where

Y' = f(S Y)
yff=<+f$

~"'=<~+2f<~+S7fl+fy(~+f$)
We expand Kt and & about the point (\, yn)

Substituting these values of K1 and K, in Eqn. (IS), we have

Y.,l = Yn + ON1 + W J hG + h2 [W2C24 + w 2 a 2 1 q

Comparing Eqn. (18) with (17, we have


w1+w2=1

1
a21w2 = 5
From these equations we find that if Cz is chosen arbitrarily we have

a,, = q,w2= 1/(2C;), Wl = 1-1/(2CJ


The R-K method is given by

and Eqn. (18) becomes

Subtracting Eqn. (20) fiom the Taylor series (17, we get the truncation error as

= y(b+J - Y,+1

Since the TE is of 0(h3), all the above R-K methods are of second order. Observe that
no choice of C2 will make the leading term of TE zero for all f(t, y). The local TE
depends not only on derivatives of the solution y(t) but also on the function f(t, y). This
is typical of all the Runge-Kutta methods. Generally, C2 is chosen between 0 and 1 so
that we are evaluating f(t, y) at an off-step point in [b, b+,]. From the defmition, every
Runge- Kutta formula must reduce to a quadrature formula of the same order or greater
if f(t, y) is independent of y; where Wi and Ci will be weights and abcissas of the
corresponding numerical integration formula.
Best way o f obtaining the valuk.pf.the arbitrary parameter C2 in our tormula is to Solution o f Ordinary Differential
Equations using RungeKutta Methods
i) choose some of Wi's zero so as to minimize the computations.
ii) choose the parameter to obtain least m,
iii) choose the parameter to have longer stability interval.
Methods satisfying either of the condition (ii) or (iii) are called optimal Runge-kutta
methods.

We made the following choices :

1 1
i) C 2 = -
2'
:. a2, = - W,= 0, W2 = 1, then
2'

which is the same as improved tangent or modified Euler's method.


ii) C2-= 1, :. azl = 1, W1 = W2 = y ,then
1

which is same as the Euler-Cauchy method or Heun's method.

2
iii) C2 = -
3'
.. a2, = 3'2 W1 = -4'1 W2 = -34' then

which is the optimal R-K method.


Method (24) is optimal in the sense that it has minimum TE. In other words, with the
above choice of unknowns, the leading tenn in the TE given by Eqn. (21) is minimum.
Though several other choices are possible, we shall limit our discussion with the above
three methods only.
In order to remember the weights W and scale factors C, and a,, we draw thc
following tables :

General form Improved tangent method

Hcun's method Optimal method


We now illustrate these methods through an example.
Numerical Differeotiation integration ~~~~~l~ 1 : solve the IVP y 1 = - 2 , y(2) = 1 and fmd y(2.1) and y(22) with
and Solution of Differential Equations
h = 0.1 using the following R-K metbods of 0(h2)
a) Improved tangent method [modified Euler method (22)]
b) Heun's method [Euler-Cauchy method (23)]
c) Optimal R-K method [method (24)]
d) Taylo~series method of 0(h2).
Compare the results with the exact solution

Solution : We have the exact values

y(2.1) = 0.82988 and y(2.2) = 0.70422


a) Improved tangent method is

~ntl=~n+K,

For this problem f(t, y) = - t y2 and

K1 = (0.1) [(- 2) (I)] = - 0.2

& = (0.1) [(-205) (1 - 0.1)~] - - 0.16605 .


~(2.1)= 1 - 0.16605 = 0.83395.

Taking tl = 2.1 and yl = 0.83395, we have

~(2.2)= yl + K, = 0.83395 - 0.124487 = 0.70946


b) Heun's method is :
1
Y.~I=Y.+T(K~+%)
K, = hf (L y 3 1-0.2 .

K,= h l ( b + h, y, + Kl) = -0.1344


y(2.1) = 0.8328

Taking t, = 2.1 and yl = 0.8328, we have


-
K1 =: 0.14564 , K,= - 0.10388
y(2.2) = 0.70804
c) Optimal method is :
Solution o f Ordinary Differential
Taking tl = 2.1 and y, = 0.83358, we have Equations using RunguKutta Methods
K, = -0.1459197, K, = -0.117463
y(2.2) = 0.7090
I d) Taylor series method of q h 2 ) :

With'tl = 2.1 ,y, = 0.835, we get

y'(2.1) = - 1.4641725 ,~"(2.1)= 4.437627958

We now summarise the results obtained and give them in Table 1.


I

Table 1
Solutions and errors in solution of y' = - t y2 y(2) = 1, h = 0.1. Numbers inside
brackets denote the errors.

t Method Method Method Method Exact


(22) (23) (24) Taylor 0(h2) Solution

You may observe here that all the above numerical solutions have almost the same error.

You may now try the following exercises :

- -

Solve the following IVPs using Heun's method of q h 2 ) and the optimal R-K method
of 0(h2).

El) 10y' = ? + g, y(0) = 1. Find y(0.2) taking h = 0.1.


E2) y' = 1 + y2, y(0) = 0. Find y(0.4) taking h = 0.2. Given that the exact solution is
y(t) = tan t, find the errors.

Also compare the errors at t = 0.4, obtained here with the one obtained by Taylor
series method of 0(h2).
1
y' = 3t + - y, y(0) = 1. Find y(0.2) taking h = 0.1. Given y(t) = 13etn- 6t - 12, find
i E3) 2
the errors.

Let us now discuss the R-K methods of third order.

15.2.2 Runge-Kutta ith hods of Third Order


Here we consider the method as

Y~+~=Y~+W~K~+WZK,+W~K~
where
Numerical Differentiation Integration
and Solution of Differential Equations K, = h f(tn + c p , Yn + a21 K,)
K3 = h f(tn + C3h, Y, + a31 Kl + a32 KJ
Expanding 6,K3 and yn+linto Taylor series, substituting their values in Eqn. (25) and
comparing the coefficients of powers of h, h2 and h3, we obtain
1
a21 = C2 c2w2+ C3W3= y

We have 6 equations to determine th? 8 unknowns. Hence the system has two arbitrary
parameters. Eqns. (26) are typical of all the R-K methods. Looking at Eqn. (26), you
may note that the sum of aij9sin any row equals the corresponding Ci's and the sum of
the Wi's is equal to 1. Further, the equations are linear in W2 and W3 and have a
solution for W2 and W3 if and only if

(Ref. Sec. 8.4.2, Unit 8, Block-2, MTE-02).

Expanding the determinant and simplifying we obtain

Thus we choose C2, C3 and satisfying Eqns. (27).

Since two parameters of this system are arbitrary, we can choose C2, C, and determine
from Eqn. (27) a<

C3(C3 - CZ)
a
- 3C2)
32 = C2(2
2
If C3 = 0, or C2 = C3 hen Cz = - and we can choose a= d 0, arbitrarily. All Ci's
3
should. be chosen such that 0 < Ci < 1. Once C2 and C3 are prescribed, Wi9sand aij's
can be determined from Eqns. (26).

We shall list a few methods in the following notation

i) Classical third order R-K method


Solution of Ordinary Differential
Equations using Rung~KuttaMethods

Heun's Method

iii) Optimal method

We now illustrate the third order R-K methods by solving the problem considered in
Example 1, using (a) Heun's method @) optimal method
a) Heun's method

= - 0.16080
y(2.1) = 0.8294
Taking t, = 2.1 and y = 0.8294, we have
K, = - 0.14446
K, = - 0.13017
K3 = - 0.11950
y(2.2) = 0.70366
Numerical Differeatiation fntcgration b) Optimal method
and Solution of Differential Equations

K3 = -0.15905
y(2.1) = 0.8297
Taking tl = 2.1 and yl = 0.8297, we have
Kl = -0.14456

You can now easily find the errors in these solutions and compare the results with those
obtained in Example 1.
And now here is an exercise for you.

E4) Solve the IVP


Y' " y-4 ~ ( 0=) 2
using third order Heun's and optimal R-K methods. Find (0.2) taking h = 0.1. Given
t5e exact solution to be y(t) = 1 + t + et, fmd the errors at t = 0.2.

We now discuss the fourth order R-K methods.

15.23 Runge-Kutta Methods of Fourth Order


Consider the method as

Since the expansions of K,,K3, and yMl in Taylor series are complicated, we shall
not write down the resulting system of equations for the determination of the unknowns.
It may be noted that the system of equations has 3 arbitrary parameters We shall state
directly a few R-K methods of ~ ( h ? .The R-K methods (31) can be denoted by

c
2 a21

c3 a31 a32

C4 a41 a42 a43

4
'1
' "2' "3
' w4
For different choices of these unknowns we have the following methods :
i) Classical R-Kmethod
Solution of Ordinary Differential
(32) Equations using R u n g e b t a Methods

This is the widely used method due to its simplicity and moderate o d u . We shall also
be working out problems mostly by the classical R-K method lanlcas specified otherwise.
ii) Runge-Kutta-MI metbod

The RungaKutta-Gill method is also used widely. But, in this unit, we s h l l mostly
work out problems witb the classical R-K method of q h 9 . Hence, whenever we refer
to R-K method of q b 4 ) we mean only the classial R-K method of 0@3given by
(32). We shall now illustrate this method through examples.
Example 2 : Solve the IVP y' = t + y, y(0) = 1 by Runge-Kuttr method of 0@3 for
t E [0,0.5] witb h = 0.1. Also find the e m r at t = 0.5, if the exact solution is
y(t) = 2et-t-1.
Solution : We use the R-K metbod of 0@>given by (32).
Initially, to = 0, yo = 1.
We have
Numerical Differentiation Integration 1
and Solution of Differential Equations Y ~ = Y ~ + ~ ( K , + ~ K ~ + ~ & + K ~ )
1
= 1 + - [1+ 0.22 + 0.2210 + 0.121051 = 1.11034167
6
Taking tl = 0.1 and y, = 1:11034167, we repeat the process.

K, = hf(t,, y,) = (0.1) [0.1 t 1.110341671= 0.121034167

+ 0.144303013] = 1.24280514
Rest of the values y3, ye y5 we give in Table 2.

Table 2

-
Now the exact solution is

Error at t * 0.5 is

Let us consider another example


Exrarpk 3 : Solve the IVP
y' = 2y + 3e1, y(0) = 0 using
a) classical R-K method of q h ?
b) R-K Gill method of qh?,

Find y(0.1), y(0.2), y(0.3) taking B = 0.1. Also € i d the errors at t = 0.3, if &e exact
solution is y(t) E 3(e2 - e'),
Solution : a) Classical R-K method is Solution of Ordinary Diffaential
Equations using RungeKutta hhethods

Taking tl = 0.1, yl = 0.3486894582, we repeat the proass and obtain

Taking t2 = 0.2, y2 = 0.837870944 and repeating the process we get

K, = 0.53399502, & = 0.579481565


. K3 = 0.6 1072997, K, = 0.694677825
.: y(0.3) = 1.416807999
b) R-K-GPI method is
1
Yn+, Yn + 6 (Kl + (2-fl) & + (z+fi) K3 + K4
)
Taking t, = 0, yo = 1 and h = 0.1, we obtain

Taking t, = 0.1, yl = 0.3486894582, we obtain

Taking t, = 0.2, y, = 0.8112507529, we obtain

K, = 0.528670978,
K, = 0.6045222614,
y(0.3) e 1.416751936

-
From the exact solution we get
y(0.3) 1.416779978
Enor in classical R-K method (at t = 0.3) = 0.2802X lo*
Error in R-K43iII method (at t = 0.3) = 0.2804 x loa.
You may now try the following exercises.

Solve the following IVPs using R-K method of 0(h4)


E5) y' = y(0) .:1. Find y(O.5) taking h = 0.5.
y+t'
E6) y' = 1 -2ty, y(0.2) = 0.1948. Find y(0.4) taking h = 0.2,
Numerical Difftraat~etionIntegration
E7) 10tyr+ yZ = 0, y(4) = 1. Find y(4.2) taking h = 0.2. Find the error given the exact
and Solution of Differential Equations
1
solution is y(t) = 0. t, where c = 0.86137

-
+

E8) y' m L
tZ
- t - y2, y(1) - 1. Find y(1.3) taking h = 0.1. Given the exact solution to be
y(t) = t,
Eiid the error at t = 1.3.

In the next section, we shall study the application of Richardson's extrapolation to the
solutions of ordinary differential equations.

15.3 RICHARDSON'S EXTRAPOLATION

You know that Richardson's extrapolation technique improves the approximate value of
y(\) and the order of this improved value of y(f) exceeds the order of the method by
one.
Here we shall first calculate the solutions F(hl) and F(hJ of the given IVP with
steplengths h, and h2 where h2 = h1/2 at a given point using a Runge-Kutta method.
Then by Richardson's extrapolation technique we have for the second order method

and for the fourth order method

as the improved solution at that point, which will be of higher order than tbe original
method. We shall now illustrate the technique through an example..

Example 4 : Using Runge-Kutta method of 0@') find the solution of the IVP y' = t + y,
y(0) = 1 using h = 0.1 and 0.2 at t = 0.4. Use extrapolation technique to improve the
accuracy. Also fid the errors if the exact solution is y(t) = 2et - t - 1.

Solution : We shall use Hcun's second order method (23) to find the solution at
t = 0.4 with h = 0.1 and 0.2. The fdlowing Table 3 gives values of y(t) at t = 0.2
and t = 0.4 with h = 0.1 and 0.2.

Table 3

f Fl=F(O.l) F2 = F(0.2) Extrapolated Errors


1 . value =
31 (4F1- F2)
0.2 . 1.24205 1.24 1.242733 0.725 x lo4

0.4 1.58180 1.5768 1~ $ 3 4 7 2 o.ln x


*

You may now try the following exercises :

E9) Solve E2) taking h = 0.1 and 0.2 using q h Z )Heun's method. Extrapolate the value at
t = 0.4. Also find the error at t = 0.4.

E10) Solve E6), taking h = 0.1 and 0.2 using 0(h2) Heun's method. Extrapolate the value
at t = 0.4. Compare this solution with the solution obtained by the classical 0(h4
R-K method.

We now end this unit by giving a iummary of what we have covered in it.
Solution of Ordinary Diffexcntial
Equations using Runge-Kutta Metbds
15.4 SUMMARY
In this unit we have leamt the following :
1) Runge-Kutta methods being singlestep methods are self- starting methods.
b
2) Unlike Taylor series methods, R-K methods do not need calculation of higher order
derivatives of f(t, y) but need only the evaluation of f(t, y) at the off-step points.
3) For a given IVP of the form
y l = f ( t , y ) , ~'(to)=Yo, f"tO,b1
where the mesh points are tj = to + jh, j = 0, 1,. . . . ..,n.
t, = b = t, + nh, R-K methods are obtained by writing
= yn + h (weighted sum of the slopes)
rn

where In slopes are used. These slopes are defined by

The ullknowns Ci, ai, and Wj are then obtained by expanding K,'s and yn+lin Taylor
series about the point (t,, y,) and comparing the coefficients of different powers of h.
4) Richardso~l'sextrapolation technique canhe used to improve the approximate value
qf y(6) obtained by q h 2 ) ,0(h3) and 0(h3 methods and obtain the method of order
one higher than the method.

El) Heun's method ': y,,, = y, + 1 (K1 + K,)


Startirlg with t, = 0, yo = 1, h = 0.1
:. K, = 0.01
& = 0.010301
y(O.1) = 1.0101505
Taking tl = 0.1 , yl = 1.0101505
K, = 0.0103040403
= 0.0181327468
y(0.2) = 1.020709158
1
Optanal R-K, method : = y, + q (KI + 3K2)
t,=O,yo=l,h=O.l
Kl = 0.01, K, = 0.01017823
y(O.1) = 1.010133673
t, = 0.1, yl = 1.010133673
K, = 0.0103037, K,= 0.010620.
y(0.2) = 1.020675142

E2) Heun's method :


K, = 0.2, K2 = 0.208
y(0.2) = 0.204
K, = 0.2083232, K,= 0.2340020843
y(0.4) = 0.425 1626422
Numerical Ciffereotiation Integration
and Solution o f Differential Equations
Optimal R-K, method :
K, = 0.2, K,= 0.2035556
y(0.2) = 0.2026667
K1 = 0.2082148, 4 = 0.223321245
y(0.4) = 0.42221 1334
Taylor series method

Now the exact solution is y(t) = tan t


Exact y(0.4) = 0.422793219
Error in Heun's method = 0.236 x loA2
Error in Optimal R-X method = 0.582~
Error in Taylor series method = 0.647 x lo-'.

E3) Hem's method :

K,=0.0833125, &=0.117478125
y(0.2) = 1.166645313
Optimal R-K, method : '
K, = 0.05, K2 = 0.071666667
y(0.1) = 1.06625
K, = 0.0833125, & = 0.106089583
y(0.2) = 1.166645313
Exact y(0.2) = 1.167221935
Error in both the methods is same and = 0.577 x
1
E4) Heun's method : yn+l = yn + q (K1+ 3K3)
Starting with 6 = 0, yo = 2, h = 0.1, we have

K, = 0.2, K,= 0.203334, K3 = 0.206889


y(O.1) = 2.205167
t, = 0.1, y, = 2.205167 we have
K, = 0.210517, K, = 0.214201, K3 = 0.218130
y(0.2) = 2421393717
1
Optimal R-K method : y,,, = yn + 9 (2K, + 3& + 4K3)
K, = 0.2, Kz= 0.205, K3 = 0.207875
y(O.1) = 2.205167
t, r 0.1, y, = 2.205167
K, = 0.2105167, K,= 0.2160425, K, = 0.219220
y(0.2) = 2.421393717
exact y(0.2) = 2.421402758
Since y(0.2) is same by both the methods

Enor = 0.9041 x lo-' in both the methods at t = 0.2.


E5) K, = 0.5, K, = 0.333333
K3 = 0.3235294118, K, r 0.2258064516
y(O.5) = 1.33992199.
S o l u ~ i o nof Ordinary Differen~ial
Equations using Runge-Kut~aMethods

K1 = - 0.005, K2 = - 0.004853689024
K, = - 0.0048544, K4= - 0.004715784587
y(4.2) = 0.995 1446726.
Exact y(4.2) = 0.99514523 1, Error = 0.559 x lo4

K, = 0.1, K2 = 0.09092913832
K, = 0.09049729525, K4 = 0.08260717517
~(1.1)= - 0.909089993
K, = 0.08264471 138, & = 0.07577035491
K, = 0.0'7547152415, K, = 0.06942067502
~(1.2)= - 0.8333318022
K, = 0.06944457204, K2= 0.0641 1104536
K, = 0.06389773475, K, = 0.0591559551
~(1.3)= - 0.7692287876
Exact y(1.3) = - 0.7692307692
Error = 0.19816 x
Heun's method :
with h = 0.1
K,=0.1, & = O . l O l
y(0.1) = 0.1005
K, = 0.101010, & = 0.104061
y(0.2) = 0.203035
K, = 0.1041223, 6 = 0.1094346
y(0.3) = 0.309813
K, = 0.1095984, & = 0.1048047

with h = 0.2
F(h) = y(0.4) = 0.4251626422 [see E2]
Now
4F(h/2) - F(h)
F("(0.4) = 3

= 0.4142958537
Exact y(0.4) = 0.422793219
Error = 0.8495 x

E10) Heun's method : with h = 0.1


y(0.2) = 0.1948
K, = 0.092208, & = 0.08277952
y(0.3) = 0.28229375
K, = 0.083062374, 6 = 0.07077151
y(0.4) = 0.359210692
Heun's method with h = 0.2
K, = 0.184416, & = 0.13932541
y(0.4) = 0.35667072
F("(0.4) = 0.360057349
Result obtained by classical R-K method of 0(h3 is
y(0.4) F 0.3599794203 (see E6)

You might also like