MTE-10
Structure
1.1 Introduction
Objectives
1.2 Three Fundamental Theorems
Intermediate Value Theorem
Rolle's Theorem
Lagrange's Mean Value Theorem
1.3 Taylor's Theorem
1.4 Errors
Round-off Error
Truncation Error
1.5 Summary
1.6 Solutions/Answers
1.1 INTRODUCTION
The study of numerical analysis involves concepts from various branches of mathematics including calculus. In this unit, we shall briefly review certain important theorems in calculus which are essential for the development and understanding of numerical methods.
You are already familiar with some fundamental theorems about continuous functions from
your calculus course. Here we shall review three theorems given in that course, namely,
Intermediate value theorem, Rolle's theorem and Lagrange's mean value theorem. Then we
state another important theorem in calculus due to B. Taylor and illustrate the theorem
through various examples.
Most of the numerical methods give answers that are approximations to the desired
solutions. In this situation, it is important to measure the accuracy of the approximate
solution compared to the actual solution. To find the accuracy we must have an idea of the possible errors that can arise in computational procedures. In this unit we shall introduce you to different forms of errors which are common in numerical computations.
The basic ideas and results that we have illustrated in this unit will be used often throughout
this course. So we suggest you go through this unit very carefully.
Objectives
After studying this unit you should be able to:
apply
i) Intermediate value theorem
ii) Rolle's theorem
iii) Lagrange's mean value theorem
iv) Taylor's theorem;
define the term 'error' in approximation;
distinguish between round-off error and truncation error and calculate these errors
as the situation demands.
1.2 THREE FUNDAMENTAL THEOREMS

In this section we shall discuss three fundamental theorems, namely, the intermediate value theorem, Rolle's theorem and Lagrange's mean value theorem. All these theorems give properties of continuous functions defined on a closed interval [a, b]. We shall not prove them here, but we shall illustrate their utility with various examples. Let us take up these theorems one by one.
Solutions of Non-linear Equations in One Variable
1.2.1 Intermediate Value Theorem
The intermediate value theorem says that a function that is continuous on a closed interval [a, b] takes on every intermediate value, i.e., every value lying between f(a) and f(b) if f(a) ≠ f(b).

Theorem 1 : Let f be a continuous function defined on a closed interval [a, b]. Let c be a number lying between f(a) and f(b) (i.e. f(a) < c < f(b) if f(a) < f(b), or f(b) < c < f(a) if f(b) < f(a)). Then there exists at least one point x0 ∈ [a, b] such that f(x0) = c.
The following figure (Fig. 1) may help you to visualise the theorem more easily. It gives the
graph of a function f.
Fig. 1
In this figure f(a) < f(b). The condition f(a) < c < f(b) implies that the points (a, f(a)) and (b, f(b)) lie on opposite sides of the line y = c. This, together with the fact that f is continuous, implies that the graph crosses the line y = c at some point. In Fig. 1 you can see that the graph crosses the line y = c at (x0, c).
Example 1 : Find the value of x in 0 ≤ x ≤ π/2 for which sin x = 1/2.

Solution : You know that the function f(x) = sin x is continuous on [0, π/2]. Since f(0) = 0 and f(π/2) = 1, we have f(0) < 1/2 < f(π/2). Thus f satisfies all the conditions of Theorem 1. Therefore, there exists at least one value of x, say x0, such that f(x0) = 1/2; that is, the theorem guarantees that there exists a point x0 such that sin x0 = 1/2. Let us try to find this point from the graph of sin x in [0, π/2] (see Fig. 2).

Fig. 2

From the figure, you can see that the line y = 1/2 cuts the graph at the point (π/6, 1/2). Hence there exists a point x0 = π/6 in [0, π/2] such that sin x0 = 1/2.
Example 2 : Show that the equation 2x^3 + x^2 - x + 1 = 5 has a solution in the interval [1, 2].
Thus we see that the theorem enables us to establish the existence of solutions of certain equations of the type f(x) = 0 without actually solving them. In other words, if you want to find an interval in which a solution (or root) of f(x) = 0 exists, then find two numbers a, b such that f(a) f(b) < 0. Theorem 1 then states that a solution lies in ]a, b[. We shall need some other numerical methods for finding the actual solution. We shall study the problem of finding solutions of the equation f(x) = 0 more elaborately in Unit 2.
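The sign-change test described above can be sketched in code (a minimal Python illustration; the function and interval are taken from Example 2):

```python
import math

def has_root(f, a, b):
    """Check the hypothesis of the intermediate value theorem:
    if f is continuous and f(a), f(b) have opposite signs,
    a root of f(x) = 0 lies in ]a, b[."""
    return f(a) * f(b) < 0

# Example 2: 2x^3 + x^2 - x + 1 = 5, i.e. f(x) = 2x^3 + x^2 - x - 4
f = lambda x: 2 * x**3 + x**2 - x - 4
print(has_root(f, 1, 2))   # f(1) = -2 < 0, f(2) = 14 > 0 -> True
```

Note that the test only proves existence of a root in ]a, b[; it says nothing about uniqueness or about the root's exact location.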
E1) Show that the following equations have a solution in the interval given alongside.
1.2.2 Rolle's Theorem

Theorem 2 : Let f be a continuous function defined on [a, b] and differentiable in ]a, b[. If f(a) = f(b), then there exists at least one point x0 in ]a, b[ such that f'(x0) = 0.

Fig. 3

You have already seen in your calculus course that the derivative f'(x0) at some point x0 gives the slope of the tangent at (x0, f(x0)) to the curve y = f(x). Therefore the theorem states that if the end values f(a) and f(b) are equal, then there exists a point x0 in ]a, b[ such that the slope of the tangent at the point P(x0, f(x0)) is zero, that is, the tangent is parallel to the x-axis at that point (see Fig. 3). In fact we can have more than one point at which f'(x) = 0, as shown in Fig. 3. This shows that the number x0 in Theorem 2 may not be unique.
The following example gives an application of Rolle's theorem.
Example 3 : Use Rolle's theorem to show that there is a solution of the equation cot x = x in ]0, π/2[.
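As a sketch of one way through Example 3: an auxiliary function (our choice here, not prescribed by the text) is g(x) = x cos x, which vanishes at both 0 and π/2, so Rolle's theorem gives a point x0 in ]0, π/2[ with g'(x0) = cos x0 - x0 sin x0 = 0, i.e. cot x0 = x0. A numerical check, assuming this choice of g:

```python
import math

# g(x) = x cos x vanishes at both 0 and pi/2, so Rolle's theorem gives a
# point x0 in ]0, pi/2[ with g'(x0) = cos x0 - x0 sin x0 = 0, i.e. cot x0 = x0.
g_prime = lambda x: math.cos(x) - x * math.sin(x)

a, b = 0.1, 1.5          # g'(a) > 0, g'(b) < 0
for _ in range(60):      # bisection on g'
    m = (a + b) / 2
    if g_prime(a) * g_prime(m) <= 0:
        b = m
    else:
        a = m
x0 = (a + b) / 2
print(round(x0, 4))                        # ~0.8603
print(abs(1 / math.tan(x0) - x0) < 1e-9)   # cot x0 = x0 holds
```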
You can try the following exercise on the same lines as Example 3.
E2) Using Rolle's theorem, show that there is a solution to the equation tan x - 1 + x = 0 in ]0, 1[.
Now, let us look at Fig. 3 carefully. We see that the line joining (a, f(a)) and (b, f(b)) is parallel to the tangent at (x0, f(x0)). Does this property hold when f(a) ≠ f(b) also? In other words, does there exist a point x0 in ]a, b[ such that the tangent at (x0, f(x0)) is parallel to the line joining (a, f(a)) and (b, f(b))? The answer to this question is the content of the well-known theorem, "Lagrange's mean value theorem", which we discuss next.
Theorem 3 : Let f be a continuous function defined on [a, b] and differentiable in ]a, b[. Then there exists a number x0 in ]a, b[ such that

f'(x0) = (f(b) - f(a))/(b - a)     ...(1)
Geometrically we can interpret this theorem as shown in Fig. 4.
Fig. 4
In this figure you can see that the straight line connecting the end points (a, f(a)) and (b, f(b)) of the graph is parallel to some tangent to the curve at an intermediate point.
You may be wondering why this theorem is called 'mean value theorem'. This is because of
the following physical interpretation.
Suppose f(t) denotes the position of an object at time t. Then the average (mean) velocity during the interval [a, b] is given by

(f(b) - f(a))/(b - a).

Now Theorem 3 states that this mean velocity during the interval [a, b] is equal to the velocity f'(x0) at some instant x0 in ]a, b[.
Example 4 : Apply the mean value theorem to the function f(x) = √x in [0, 2] (see Fig. 5).

Solution : We first note that the function f(x) = √x is continuous on [0, 2] and differentiable in ]0, 2[, and f'(x) = 1/(2√x).

Now f(2) = √2 and f(0) = 0, and f'(x0) = 1/(2√x0). Therefore we have

1/(2√x0) = (√2 - 0)/(2 - 0) = 1/√2, i.e. x0 = 1/2.

Thus we get that the line joining the end points (0, 0) and (2, √2) of the graph of f is parallel to the tangent to the curve at the point (1/2, 1/√2).
(In another instance of this computation one obtains the two values x0 = (6 + 2√3)/8 and x0 = (6 - 2√3)/8; taking √3 ≈ 1.732, both values lie in the interval ]0, 4[.)
Such examples show that the number x0 in Theorem 3 may not be unique. Again, as we mentioned in the case of Theorems 1 and 2, the mean value theorem only guarantees the existence of such a point.
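As a quick numerical sanity check of Example 4 (a sketch, not part of the original text):

```python
import math

# Mean value theorem check for f(x) = sqrt(x) on [0, 2] (Example 4):
# the mean slope (f(2) - f(0))/2 must equal f'(x0) = 1/(2*sqrt(x0)) at x0 = 1/2.
f = lambda x: math.sqrt(x)
a, b = 0.0, 2.0
mean_slope = (f(b) - f(a)) / (b - a)            # = sqrt(2)/2
x0 = 0.5
tangent_slope = 1 / (2 * math.sqrt(x0))         # f'(1/2) = 1/sqrt(2)
print(abs(mean_slope - tangent_slope) < 1e-12)  # True
```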
E3) Let f(x) = (1/3)x^3 + 2x. Find a number x0 in ]0, 3[ such that f'(x0) = (f(3) - f(0))/(3 - 0).
E4) Find all numbers x0 in the interval ]-2, 1[ for which the tangent to the graph of f(x) = x^3 + 4 is parallel to the line joining the end points (-2, f(-2)) and (1, f(1)).
E5) Show that Rolle's theorem is a special case of the mean value theorem.
So far we have used the mean value theorem to show the existence of a point satisfying Eqn. (1). Next we shall consider an example which shows another application of the mean value theorem.
Note that in writing this value we have rounded off the number after three decimal places. Using a calculator we find that the exact value is 2.9624961 . . .
We have given this example just to illustrate the usefulness of the theorem. The mean value theorem has many other applications which you will come across in later units.
1.3 TAYLOR'S THEOREM

You are already familiar with the name of the English mathematician Brook Taylor (1685-1731) from your calculus course. In this section we shall introduce you to a well-known theorem due to B. Taylor. Here we shall state the theorem without proof and discuss some of its applications.
You are familiar with polynomials of the form f(x) = a0 + a1 x + . . . + an x^n, where a0, a1, . . ., an are real numbers. We can easily compute the value of a polynomial at any point x = a by using the four basic operations of addition, multiplication, subtraction and division. On the other hand there are functions like e^x, cos x, ln x etc., which occur frequently in all branches of mathematics, that cannot be evaluated in the same manner. For example, evaluating the function f(x) = cos x at 0.524 is not so simple. To evaluate such functions we try to approximate them by polynomials, which are easier to evaluate. Taylor's theorem gives us a simple method for approximating functions f(x) by polynomials.
Let f(x) be a real-valued function defined on R which is n-times differentiable (see MTE-01, Calculus, Unit 6, Block 2). Consider the function

P1(x) = f(x0) + (x - x0) f'(x0).

Now P1(x) is a polynomial in x of degree 1, and P1(x0) = f(x0) and P'1(x0) = f'(x0). The polynomial P1(x) is called the first Taylor polynomial of f(x) at x0. Now consider another function

P2(x) = f(x0) + (x - x0) f'(x0) + ((x - x0)^2/2!) f''(x0).

Then P2(x) is a polynomial in x of degree 2, and P2(x0) = f(x0), P'2(x0) = f'(x0) and P''2(x0) = f''(x0). P2(x) is called the second Taylor polynomial of f(x) at x0.

Similarly we can define the rth Taylor polynomial of f(x) at x0, where 1 ≤ r ≤ n. The rth Taylor polynomial at x0 is given by

Pr(x) = f(x0) + (x - x0) f'(x0) + ((x - x0)^2/2!) f''(x0) + . . . + ((x - x0)^r/r!) f^(r)(x0).     ...(3)
Therefore, P4(x) = (x - 1) - (x - 1)^2/2 + (x - 1)^3/3 - (x - 1)^4/4.
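Assuming, as the pattern of the coefficients suggests, that the function being expanded is ln x about x0 = 1, the polynomial P4 can be checked numerically (a sketch):

```python
import math

# Fourth Taylor polynomial of ln x about x0 = 1 (assumed function):
# P4(x) = (x-1) - (x-1)^2/2 + (x-1)^3/3 - (x-1)^4/4
def p4(x):
    h = x - 1
    return h - h**2 / 2 + h**3 / 3 - h**4 / 4

x = 1.1
print(p4(x), math.log(x))   # the two values agree to about 5 decimal places
```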
E6) If Pr denotes the rth Taylor polynomial as given by Eqn. (3), then show that Pr(x0) = f(x0), P'r(x0) = f'(x0), . . ., Pr^(r)(x0) = f^(r)(x0).
If f is (n+1)-times differentiable in an interval containing x0 and x, then Taylor's theorem states that

f(x) = f(x0) + (x - x0) f'(x0) + ((x - x0)^2/2!) f''(x0) + . . . + ((x - x0)^n/n!) f^(n)(x0) + R_{n+1}(x),     ...(4)

where the remainder is

R_{n+1}(x) = ((x - x0)^(n+1)/(n+1)!) f^(n+1)(c)

for some point c lying between x0 and x. The series given in Eqn. (4) is called the nth Taylor's expansion of f(x) at x0. R_{n+1}(x) depends on x, x0 and n; it is called the remainder (or error) of the nth Taylor's expansion.

Suppose we put x0 = a and x = a + h, where h > 0, in Eqn. (4). Then any point between a and a + h will be of the form a + θh, 0 < θ < 1, and we get

f(a + h) = f(a) + h f'(a) + (h^2/2!) f''(a) + . . . + (h^n/n!) f^(n)(a) + (h^(n+1)/(n+1)!) f^(n+1)(a + θh).     ...(5)
Remark 1 : If f(x) is itself a polynomial of degree n, then f^(n+1)(x) = 0, so the remainder vanishes and the right hand side of Eqn. (4) is simply a polynomial in (x - x0). Therefore, finding Taylor's expansion of a polynomial function f(x) about x0 is the same as expressing f(x) as a polynomial in (x - x0) with coefficients from R.
Remark 2 : Suppose we put x0 = a, x = b and n = 0 in Eqn. (4). Then Eqn. (4) becomes

f(b) = f(a) + (b - a) f'(c),

or equivalently

f'(c) = (f(b) - f(a))/(b - a),

which is Lagrange's mean value theorem. Therefore we can consider the mean value theorem as a special case of Taylor's theorem.
Here f(2) = 0,

f'(x) = 4x^3 - 15x^2 + 10x + 1,   f'(2) = -7,
f''(x) = 12x^2 - 30x + 10,        f''(2) = -2,
f^(3)(x) = 24x - 30,              f^(3)(2) = 18,
f^(4)(x) = 24,                    f^(4)(2) = 24.

Hence the expansion is

f(x) = -7(x - 2) - (x - 2)^2 + 3(x - 2)^3 + (x - 2)^4.
Solution : We first note that the point x = 0 lies in the given interval. Further, the function f(x) = ln(1 + x) has continuous derivatives of all orders. The derivatives are given by

f'(x) = 1/(1 + x),  f''(x) = -1/(1 + x)^2,

and in general f^(n)(x) = (-1)^(n-1) (n - 1)!/(1 + x)^n.
The above example shows that if n is sufficiently large, the value of the nth Taylor polynomial Pn(x) at any point x will be approximately equal to the value f(x) of the given function. In fact, the remainder R_{n+1}(x) tells us how close the value Pn(x) is to f(x).
Now we shall make some general observations about the remainder R_{n+1}(x) in the Taylor's expansion of a function f(x).

Remark 3 : Consider the nth Taylor expansion of f about x0 given by Eqn. (4), and let Pn(x) denote the nth Taylor polynomial. Then R_{n+1}(x) = f(x) - Pn(x). If lim R_{n+1}(x) = 0 as n → ∞ for some x, then for that x we say that we can approximate f(x) by Pn(x), and we write f(x) as the infinite series

f(x) = f(x0) + f'(x0)(x - x0) + (f^(2)(x0)/2!)(x - x0)^2 + . . . + (f^(n)(x0)/n!)(x - x0)^n + . . .     ...(6)
You are already familiar with series of this type from your calculus course. This series is called the Taylor's series of f(x). If we put x0 = 0 in Eqn. (6), then the resulting series

f(x) = f(0) + f'(0) x + (f^(2)(0)/2!) x^2 + . . .

is called the Maclaurin's series of f(x).
Remark 4 : If the remainder R_{n+1}(x) satisfies the condition that |R_{n+1}(x)| < M for some n at some fixed point x = a, then M is called a bound of the error at x = a. In this case we have |f(a) - Pn(a)| < M; the smaller the bound M, the better the approximation.
We shall explain these concepts with an example.
Example 10 : Find the 2nd Taylor's expansion of f(x) = √(1 + x) in ]-1, 1[ about x0 = 0. Find a bound of the error at x = 0.2.

Solution : Since f(x) = √(1 + x), we have

f'(x) = (1/2)(1 + x)^(-1/2),  f''(x) = -(1/4)(1 + x)^(-3/2),  f^(3)(x) = (3/8)(1 + x)^(-5/2).

Hence,

√(1 + x) = 1 + x/2 - x^2/8 + R3(x), where R3(x) = (x^3/16)(1 + c)^(-5/2) and 0 < c < x.

Since (1 + c)^(-5/2) < 1 for c > 0, we get

|R3(0.2)| ≤ (0.2)^3/16 = 0.0005 = (0.5) 10^-3.
E8) Obtain the nth Taylor expansion of the function f(x) = 1/(1 + x) in ]-1, 1[ about x0 = 0.
E9) Does f(x) = √x have a Taylor series expansion about x = 0? Justify your answer.

E10) Obtain the 8th Taylor expansion of the function f(x) = cos x about x = 0. Obtain a bound for the error R9(x).
There are some functions whose Taylor's expansions are used very often. We shall list their expansions (about x = 0) here:

e^x = 1 + x/1! + x^2/2! + . . . + x^n/n! + (x^(n+1)/(n+1)!) e^c     ...(7)

sin x = x - x^3/3! + x^5/5! - . . . + (-1)^(n-1) x^(2n-1)/(2n-1)! + ((-1)^n x^(2n+1)/(2n+1)!) cos c     ...(8)

cos x = 1 - x^2/2! + x^4/4! - . . . + (-1)^n x^(2n)/(2n)! + ((-1)^(n+1) x^(2n+2)/(2n+2)!) cos c     ...(9)

ln(1 + x) = x - x^2/2 + x^3/3 - . . . + (-1)^(n-1) x^n/n + (-1)^n x^(n+1)/((n+1)(1 + c)^(n+1))     ...(10)

In each case c is a point lying between 0 and x.
Now, let us consider some examples that illustrate the use of truncated Taylor series for finding approximate values of some functions at certain points.
Example 11 : Using Taylor's expansion for sin x about x = 0, find the approximate value of sin 10° with error less than 10^-7.

Solution : The nth Taylor expansion of sin x about x = 0 is

sin x = x - x^3/3! + x^5/5! - . . . + (-1)^(n-1) x^(2n-1)/(2n-1)! + R_{2n+1}(x),

where x is the angle measured in radians and R_{2n+1}(x) = ((-1)^n x^(2n+1)/(2n+1)!) cos c is the remainder. Now 10° = π/18 radians, so we have to find n such that

|R_{2n+1}(π/18)| = |((-1)^n/(2n+1)!) (π/18)^(2n+1) cos c| ≤ (π/18)^(2n+1)/(2n+1)! < 10^-7.

Computing the quantity on the left for n = 1, 2, 3, . . ., we find that this inequality is satisfied for n = 3. Hence the required approximation is

sin 10° ≈ π/18 - (π/18)^3/3! + (π/18)^5/5! = 0.1736482.
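The computation of Example 11 can be replayed in code; the loop below (a sketch) adds terms of the sine series until the remainder bound drops below 10^-7:

```python
import math

# Example 11 numerically: sum terms of the sine series at x = 10 degrees
# until the remainder bound x^(2n+1)/(2n+1)! drops below 1e-7.
x = math.pi / 18          # 10 degrees in radians
approx, n = 0.0, 0
while x**(2 * n + 1) / math.factorial(2 * n + 1) >= 1e-7:
    approx += (-1)**n * x**(2 * n + 1) / math.factorial(2 * n + 1)
    n += 1
print(n)                                 # 3 terms suffice, as in the text
print(abs(approx - math.sin(x)) < 1e-7)  # True
```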
Example 12 : Using Maclaurin's series for e^x, show that e ≈ 2.71806 with error less than 0.001. (Assume that e < 3.)

Solution : Putting x = 1 in the Maclaurin expansion of e^x, we get

e = 1 + 1/1! + 1/2! + . . . + 1/n! + R_{n+1}, where R_{n+1} = e^c/(n+1)!.

Since we have chosen x0 = 0 and x = 1, the value c lies between 0 and 1, i.e. 0 < c < 1. Since e^c < e < 3, we get

R_{n+1} < 3/(n+1)! < 0.001 if n = 6.

Hence e ≈ 1 + 1 + 1/2! + . . . + 1/6! = 2.71806.
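A sketch of Example 12 in code, using the bound 3/(n+1)! from the text:

```python
import math

# Example 12: partial sums 1 + 1/1! + ... + 1/n! approximate e; the remainder
# is e^c/(n+1)! < 3/(n+1)!, which drops below 0.001 at n = 6.
n = 0
while 3 / math.factorial(n + 1) >= 0.001:
    n += 1
s = sum(1 / math.factorial(k) for k in range(n + 1))
print(n, round(s, 5))        # 6 2.71806
```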
E11) Using Maclaurin's expansion for cos x, find the approximate value of cos(π/4) with error bound 10^-5.
E12) How large should n be chosen in Maclaurin's expansion for e^x so that the error at x = 1 is less than 10^-5?
In numerical analysis we are concerned with developing a sequence of calculations that will give a satisfactory answer to a problem. Since this process involves a lot of computations, there is a chance for some errors to be present in these computations. In the next section we shall introduce you to the concept of 'errors' that arise in numerical computations.
1.4 ERRORS
In this section we shall discuss the concept of an 'error'. We consider two types of errors
that are commonly encountered in numerical computations.
You are already familiar with rounding off a number which has a non-terminating decimal expansion from your school arithmetic. For example, we use 3.1429 for 22/7. These rounded-off numbers are approximations of the actual values. In any computational procedure we make use of these approximate values instead of the true values. Let xT denote the true value and xA denote the approximate value. How do we measure the goodness of an approximation xA to xT? The simplest measure which naturally comes to our mind is the difference between xT and xA. This measure is called the 'error'. Formally, we define error as a quantity which satisfies the identity

True value xT = Approximate value xA + error.

Example 13 : Find the error when π is approximated by 22/7.

Solution : We convert 22/7 to decimal form, so that we can find the difference between the approximate value and the true value. The approximate value of π is

22/7 = 3.14285714 . . .     ...(13)

and the true value is

π = 3.14159265 . . .     ...(14)

Then we have
error = True value - approximate value = -0.00126449 . . .     ...(15)
Note that in this case the error is negative. Error can be positive or negative. We shall in general be interested in the absolute value of the error, which is defined as

| error | = | True value - approximate value |.

For example, the absolute error in Example 13 is

| error | = | -0.00126449 . . . | = 0.00126 . . .
Sometimes, when the true value is very large or very small, we prefer to study the error by comparing it with the true value. This is known as the relative error, and we define it as

Relative error = (True value - approximate value)/True value.

But note that in certain computations, the true value may not be available. In that case we replace the true value by the computed approximate value in the definition of relative error.
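These definitions translate directly into code; the snippet below (a sketch) computes the error, absolute error and relative error for the approximation 22/7 of π:

```python
import math

# Error measures for the approximation 22/7 to pi (Example 13):
# error = true - approximate; the relative error divides by the true value.
x_t = math.pi          # true value
x_a = 22 / 7           # approximate value
error = x_t - x_a
abs_error = abs(error)
rel_error = abs_error / abs(x_t)
print(round(error, 8))       # -0.00126449, as in the text
print(rel_error)
```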
In numerical calculations, you will encounter mainly two types of errors: round-off error and truncation error. We shall discuss these errors in the next two sub-sections, 1.4.1 and 1.4.2, respectively.
1.4.1 Round-off Error
Let us look at Example 13 again. You can see that the numbers appearing in Eqns. (13), (14) and (15) consist of 8 digits after the decimal point followed by dots. The line of dots indicates that the digits continue and we are not able to write all of them. That is, these numbers cannot be represented exactly by a terminating decimal expansion. Whenever we use such numbers in calculations we have to decide how many digits we are going to take into account. For example, consider again the approximate value of π. If we approximate π using 2 digits after the decimal point (say), chopping off the other digits, then we have π ≈ 3.14.
In general, any real number x ≠ 0 can be written in the form

x = ± .d1 d2 . . . dn × 10^m,

where d1, d2, . . ., dn are digits between 0 and 9 and m is an integer called the exponent. Writing a number in this form is known as floating point representation. We denote this representation by fl(x). Such a floating point number is said to be normalized if d1 ≠ 0. To translate a number into floating point representation we adopt either of two methods: rounding and chopping. For example, suppose we want to represent the number 537 in the normalized floating point representation with n = 1. Then we get

fl(537) = .5 × 10^3 chopped
        = .5 × 10^3 rounded
In this case we get the same representation by rounding and by chopping. Now if we take n = 2, then we get

fl(537) = .53 × 10^3 chopped
        = .54 × 10^3 rounded

In this case, the representations are different. Now if we take n = 3, then we get

fl(537) = .537 × 10^3 chopped = .537 × 10^3 rounded.
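Chopping and rounding to n decimal digits can be sketched as follows (a simplified helper for positive numbers only, not a full floating point model):

```python
import math

# Normalized n-digit decimal floating point fl(x) = .d1...dn x 10^m, d1 != 0,
# formed either by chopping or by rounding the mantissa (sketch for x > 0).
def fl(x, n, mode="chop"):
    m = math.floor(math.log10(x)) + 1        # exponent making mantissa lie in [.1, 1)
    mantissa = x / 10**m
    scaled = mantissa * 10**n
    digits = math.floor(scaled) if mode == "chop" else math.floor(scaled + 0.5)
    return digits * 10.0**(m - n)

print(fl(537, 2, "chop"))    # .53 x 10^3 = 530.0
print(fl(537, 2, "round"))   # .54 x 10^3 = 540.0
```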
The next definition gives us a measure by which we can decide whether the round-off error occurring in an approximation process is negligible or not.

Definition : We say that xA approximates xT correct to d decimal places if |xT - xA| < (1/2) × 10^-d.
Example 14 : Find out to how many decimal places the value of 22/7 obtained in Example 13 is accurate as an approximation to π = 3.14159265 . . .
E13) In some approximation problems where graphical methods are used, the value 355/113 is used as an approximation to π = 3.14159265 . . . . To how many decimal places is the value 355/113 accurate as an approximation to π?
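E13 can be checked mechanically with the accuracy criterion |xT - xA| < (1/2) × 10^-d (a sketch):

```python
import math

# E13 check: 355/113 approximates pi to d decimal places when
# |pi - 355/113| < 0.5 * 10**(-d); find the largest such d.
diff = abs(math.pi - 355 / 113)
d = 0
while diff < 0.5 * 10.0**(-(d + 1)):
    d += 1
print(d)          # 6
```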
1.4.2 Truncation Error

Let

f(x) = Σ (n = 0 to ∞) an (x - x0)^n

denote the Taylor's series of f(x) about x0. In practical situations, we cannot, in general, find the sum of an infinite number of terms. So we must stop after a finite number of terms, say N. This means that we are taking

f(x) ≈ Σ (n = 0 to N) an (x - x0)^n.

The error committed by truncating the series in this way is called the truncation error (TE). You already know how to calculate this error from Sec. 1.3. There we saw that using Taylor's theorem we can estimate the error (or remainder) involved in a truncation process in some cases.
Let us see what happens if we apply Taylor's theorem to the function f(x) about the point x0 = 0. We assume that f satisfies all the conditions of Taylor's theorem. Then we have

f(x) = Σ (n = 0 to N) an x^n + (x^(N+1)/(N+1)!) f^(N+1)(c),     ...(19)

where an = f^(n)(0)/n! and 0 < c < x.

Now, suppose that we want to approximate f(x) by Σ (n = 0 to N) an x^n. Then Eqn. (19) tells us that the truncation error in this approximation is given by

TE = R_{N+1}(x) = (x^(N+1)/(N+1)!) f^(N+1)(c).     ...(20)
Theoretically we can use this formula for the truncation error for any sufficiently differentiable function. But in practice it is not easy to calculate the nth derivative of many functions. Because of the complexity of differentiating such functions, it is better to obtain their Taylor polynomials indirectly, by using one of the standard expansions we listed in Sec. 1.3. For example, consider the function f(x) = e^(x^2). It is difficult to calculate the nth derivative of this function. Therefore, for convenience, we obtain the Taylor's expansion of e^(x^2) from the Taylor's expansion of e^y by putting y = x^2. We shall illustrate this in the following example.
Example 15 : Obtain a polynomial approximation of e^(x^2) for |x| < 1 and estimate the truncation error.

Solution : Put u = x^2. Then e^(x^2) = e^u. Now we apply Taylor's theorem to the function f(u) = e^u about u = 0. Then we have

e^u = 1 + u + u^2/2! + u^3/3! + u^4/4! + R5(u), where R5(u) = (u^5/5!) e^c

and 0 < c < u. Since |x| < 1, we have u = x^2 < 1, i.e. c < 1. Therefore e^c < e < 3.
If the absolute value of the TE is small, then we say that the approximation is good. Now, in practical situations we should be able to find the value of N for which the sum Σ an x^n gives a good approximation to f(x). For this we always specify the accuracy (or error bound) required in advance. Then we find N, using formula (20), such that the absolute error |R_{N+1}(x)| is less than the specified accuracy.
Thus we get

e^(x^2) ≈ 1 + x^2 + x^4/2! + x^6/3! + x^8/4!, with TE = (x^10/5!) e^c < 3x^10/5! for |x| < 1.

This approximation can now be used, for instance, to calculate integrals involving e^(x^2) with a known error bound.
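A quick numerical check (a sketch) that the truncated series stays within the bound 3x^10/5! for |x| < 1:

```python
import math

# Truncated Taylor series for e^(x^2) obtained by substituting u = x^2
# into the expansion of e^u: 1 + x^2 + x^4/2! + x^6/3! + x^8/4!.
def approx_exp_x2(x):
    u = x * x
    return sum(u**k / math.factorial(k) for k in range(5))

x = 0.5
err = abs(approx_exp_x2(x) - math.exp(x * x))
print(err < 3 * x**10 / math.factorial(5))   # within the bound TE < 3 x^10 / 5!
```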
E15) a) Calculate the truncation error in approximating e^(-x^2) by 1 - x^2 + x^4/2, for -1 ≤ x ≤ 1.

b) Use the approximation in (a) to evaluate the integral of e^(-x^2) over [0, 0.1].
1.5 SUMMARY

In this unit, we have

reviewed three fundamental theorems about continuous functions, namely the intermediate value theorem, Rolle's theorem and Lagrange's mean value theorem;

stated Taylor's theorem:

f(x) = f(x0) + ((x - x0)/1!) f'(x0) + ((x - x0)^2/2!) f^(2)(x0) + . . . + ((x - x0)^n/n!) f^(n)(x0) + ((x - x0)^(n+1)/(n+1)!) f^(n+1)(c);
defined the term 'error' occurring in numerical computations;

discussed two types of errors, namely

i) round-off error : the error occurring in computations where we use the rounding-off method to represent a number;

ii) truncation error : the error occurring in computations where we use a truncation process to represent the sum of an infinite number of terms;

explained how Taylor's theorem is used to calculate the truncation error.
1.6 SOLUTIONS/ANSWERS
E1) a) The given equation is of the form f(x) = c where f(x) = x^3 - x - 5 and c = 0. f is a continuous function in the interval [0, 2], and f(0) = -5 and f(2) = 1. Thus 0 lies between f(0) and f(2). Therefore, by the IV theorem, the equation f(x) = 0 has a solution in the interval [0, 2].
b) Here the equation is of the form f(x) = c where f(x) = sin x + x and c = 1. f is a continuous function defined on the given interval, and the argument proceeds as in (a).
E4) f(x) = x^3 + 4 satisfies the requirements of Lagrange's mean value theorem in the interval ]-2, 1[. Therefore there exists a point x0 in ]-2, 1[ such that the slope f'(x0) of the tangent line at x0 is the same as the slope of the line joining (-2, f(-2)) and (1, f(1)). That slope is (f(1) - f(-2))/(1 - (-2)) = (5 - (-4))/3 = 3, and f'(x0) = 3x0^2, so 3x0^2 = 3, i.e. x0^2 = 1. Since x0 must lie in ]-2, 1[, we don't consider the positive value. Therefore there exists only one point, x0 = -1, satisfying the theorem.
E5) Suppose f is a function defined on [a, b] which satisfies all the requirements of Lagrange's mean value theorem. Then there exists a point x0 in ]a, b[ such that

f'(x0) = (f(b) - f(a))/(b - a).

Suppose in particular f satisfies the condition f(a) = f(b), i.e. f(b) - f(a) = 0. Then we get f'(x0) = 0. This is what Rolle's theorem states. Hence we deduce that if, in the statement of Lagrange's mean value theorem, we add the extra condition f(a) = f(b), then we get Rolle's theorem.
E6) Put x = x0 in Eqn. (3); then we get Pr(x0) = f(x0). To calculate P'r(x0), differentiate both sides of Eqn. (3) and put x = x0; continuing in this way gives the remaining identities.

E8) For f(x) = 1/(1 + x) we have

f'(x) = -1/(1 + x)^2,  f'(0) = -1,

and in general f^(n)(x) = (-1)^n n!/(1 + x)^(n+1), so f^(n)(0) = (-1)^n n!. The function f(x) and its derivatives of all orders are continuous in ]-1, 1[. Therefore, by Taylor's theorem,

1/(1 + x) = 1 - x + x^2 - . . . + (-1)^n x^n + R_{n+1}(x), where R_{n+1}(x) = (-1)^(n+1) x^(n+1)/(1 + c)^(n+2).
E11) We have seen in E10 that the remainder in the 8th Taylor expansion of cos x satisfies

|R9(x)| ≤ |x|^9/9! < 10^-5 for |x| ≤ 1.

This shows that we can approximate cos x by the 8th Taylor polynomial with error bound 10^-5, i.e.,

cos x ≈ 1 - x^2/2! + x^4/4! - x^6/6! + x^8/8!.

Putting x = π/4, we get cos(π/4) ≈ 0.7071068.

E12) We have to find an integer n such that

e/(n + 1)! < 10^-5.

Since e < 3 and 9! = 362880 > 3 × 10^5, taking n + 1 = 9, i.e. n = 8, is enough.
E13) 355/113 = 3.14159292 . . . (using a scientific calculator) and π = 3.14159265 . . .

Then |π - 355/113| < 0.00000027 < (1/2) × 10^-6.

Therefore the approximation is accurate to 6 decimal places.
E14) a) We apply Taylor's theorem to the function f(x) = sin x in ]-1, 1[ about x = 0. Then for n = 7 we have

sin x = x - x^3/3! + x^5/5! - x^7/7! + R8(x),

where R8(x) = (x^8/8!) sin c. Hence

(sin x)/x = 1 - x^2/3! + x^4/5! - x^6/7! + R8(x)/x,

where R8(x)/x = (x^8 sin c)/(8! x) = (x^7/8!) sin c.

b) Thus

∫ (from 0 to 1) (sin x)/x dx = ∫ (from 0 to 1) (1 - x^2/3! + x^4/5! - x^6/7!) dx + ∫ (from 0 to 1) (x^7/8!) sin c dx.

Now

| ∫ (from 0 to 1) (x^7/8!) sin c dx | ≤ ∫ (from 0 to 1) (x^7/8!) dx = 1/(8 · 8!).

Therefore we have

∫ (from 0 to 1) (sin x)/x dx ≈ [x - x^3/(3! · 3) + x^5/(5! · 5) - x^7/(7! · 7)] (from 0 to 1)

= 1 - 1/(3! · 3) + 1/(5! · 5) - 1/(7! · 7)

= 0.946.
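The arithmetic in E14 can be verified numerically (a sketch; the Riemann sum below is only a rough independent check, not part of the original solution):

```python
import math

# E14 numerically: integrating the truncated series of sin(x)/x over [0, 1]
# gives 1 - 1/(3!*3) + 1/(5!*5) - 1/(7!*7), accurate to about 1/(8*8!).
approx = (1 - 1 / (math.factorial(3) * 3) + 1 / (math.factorial(5) * 5)
          - 1 / (math.factorial(7) * 7))
print(round(approx, 3))      # 0.946

# crude independent check against a midpoint Riemann sum of sin(x)/x
n = 100000
riemann = sum(math.sin((k + 0.5) / n) / ((k + 0.5) / n) for k in range(n)) / n
print(abs(approx - riemann) < 1e-4)   # True
```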
E15) a) Put u = -x^2. Then e^(-x^2) = e^u. We consider the 2nd Taylor expansion of e^u, given by

e^u = 1 + u + u^2/2 + R3(u),

where R3(u) = (e^c u^3)/3! and c lies between 0 and u. Since u ≤ 0, we have e^c ≤ 1. Hence, since |u|^3 ≤ 1 for -1 ≤ x ≤ 1,

|R3(u)| ≤ 1/3! = 1/6.
b) From (a) we have

e^(-x^2) = 1 - x^2 + x^4/2 + R3(-x^2).

Hence

∫ (from 0 to 0.1) e^(-x^2) dx = ∫ (from 0 to 0.1) (1 - x^2 + x^4/2) dx + ∫ (from 0 to 0.1) R3(-x^2) dx.

Now

| ∫ (from 0 to 0.1) R3(-x^2) dx | ≤ ∫ (from 0 to 0.1) (x^6/3!) dx = (0.1)^7/42,

which is negligible. Therefore

∫ (from 0 to 0.1) e^(-x^2) dx ≈ [x - x^3/3 + x^5/10] (from 0 to 0.1) = 0.0996677.
UNIT 2 ITERATION METHODS FOR
LOCATING A ROOT
Structure
2.1 Introduction
Objectives
2.2 Initial Approximation to a Root
Tabulation Method
Graphical Method
2.3 Bisection Method
2.4 Fixed Point Iteration Method
2.5 Summary
2.1 INTRODUCTION
In this unit as well as in the next two units we shall discuss somenumerical methods which
gives an approximate solution of an equation f(x) = 0. We can classify the methods of
solution into two types, namely (i) Direct methods and (ii) Iteration methods. Direct
methods produce solutions in a finite number of steps whereas iteration methods give an
approximate solution by repeated application of a numerical process. As we said earlier,.
direct methods you have done in MTE-04. You will find later that for using iteration.
methods we have to start with an approximate solution. lteration methods improve this
approximate solution. We shall begin this unit by first discussing methods which enable us
to determine an initial approximate solution and then discuss iteration methods to refine this
approximate solution.
Objectives
After studying this unit you should be able to:

find an initial approximation of the root using (i) the tabulation method, (ii) the graphical method;

use the bisection method for finding approximate roots;

use the fixed point iteration method for finding approximate roots.
2.2 INITIAL APPROXIMATION TO A ROOT
You know that in many problems of engineering and physical sciences you come across
equations in one variable of the form f(x) = 0.
Therefore the problem of finding the specific volume of a gas at a given temperature and pressure reduces to solving the biquadratic equation Eqn. (3) for the unknown variable V.
Consider another example from the life sciences: the study of the genetic problem of recombination of chromosomes can be described in the form

p^2 - p + k = 0,

where p stands for the recombination fraction, with the limitation 0 ≤ p ≤ 1/2, and (1 - p) stands for the non-recombination fraction. The problem of finding the recombination fraction of a gene reduces to the problem of finding roots of the quadratic equation p^2 - p + k = 0.
In these problems we are concerned with finding the value (or values) of the unknown variable x that satisfies the equation f(x) = 0. The function f(x) may be a polynomial of the form a0 + a1 x + . . . + an x^n, or a more general function. If f(x) is linear, then Eqn. (1) is of the form ax + b = 0, a ≠ 0, and it has only one root, given by x = -b/a. Any equation which is not linear is called a non-linear equation. In this unit we shall discuss some methods for finding roots of the equation f(x) = 0 where f(x) is a non-linear function. You are already familiar with various methods for calculating roots of quadratic, cubic and biquadratic equations (see MTE-04, Unit 3). But there is no such formula for solving polynomial equations of degree more than 4, or even for a simple equation like 2x - log10 x - 7 = 0.
Here we shall discuss some of the numerical approximation methods. These methods involve two steps: first we determine an initial approximation to the root, and then we refine it to the desired accuracy.

Tabulation Method

We first compute values of f(x) = 2x - log10 x - 7 for different values of x, say x = 1, 2, 3 and 4.
When x = 1, we have f(1) = 2 - log10 1 - 7 = -5.

Similarly, we have

f(2) = 4 - log10 2 - 7 = -3.301
Table 1
We find that f(3) is negative and f(4) is positive. Now we apply the IV theorem to the function f(x) = 2x - log10 x - 7 in the interval I1 = [3, 4]. Since f(3) and f(4) are of opposite signs, by the IV theorem there exists a number x0 lying between 3 and 4 such that f(x0) = 0. That is, a root of the function lies in the interval ]3, 4[. Note that this root is positive.
Let us now repeat the above computations for some values of x lying in ]3, 4[, say x = 3.5, 3.7 and 3.8. In the following table we report the values of f(x).
Table 2
We find that f(3.7) and f(3.8) are of opposite signs. By applying the IV theorem again to f(x) in the interval I2 = [3.7, 3.8], we find that the root of f(x) lies in the interval ]3.7, 3.8[. Note that this interval is smaller than the previous interval. We call this interval a refinement of the previous interval. Let us repeat the above procedure once again for the interval I2. In Table 3 we give the values of f(x) for some x between 3.7 and 3.8.
Table 3
Table 3 shows that the root lies within the interval ]3.78, 3.79[, and this interval is much smaller compared to the original interval ]3, 4[. The procedure is terminated by taking any value of x between 3.78 and 3.79 as an approximate value of the root of the equation f(x) = 2x - log10 x - 7 = 0.
The method illustrated above is known as the tabulation method. Let us write down the steps involved in the method.

Step 1 : Select some numbers x1, x2, . . ., xn and calculate f(x1), f(x2), . . ., f(xn). (We will talk about the choice of x1, x2, . . ., xn later.) If f(xi) = 0 for some i, then xi is a root of the equation. If none of the f(xi) is zero, then proceed to Step 2.
Step 2 : Find values xi and xi+1 such that f(xi) and f(xi+1) are of opposite signs, i.e. f(xi) f(xi+1) < 0. Rename xi = a1 and xi+1 = b1. Then by the IV theorem a root lies between a1 and b1. Test all the values f(xj), j = 1, 2, . . ., n, and determine other intervals, if any, in which some more roots may lie.
Step 3 : Repeat Step 1 by taking some numbers between a1 and b1. Again, if f(xj) = 0 for some xj between a1 and b1, then we have found the root xj. Otherwise, continue with Step 2.

Continue Steps 1, 2 and 3 till we get a sufficiently small interval ]a, b[ in which the root lies. Then any value in ]a, b[ can be chosen as an initial approximation to the root. You may have noticed that the test values xj, j = 1, 2, . . ., n, chosen depend on the nature of the function f(x).
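Steps 1-3 above can be sketched in code; the grid size and the three rounds of refinement below are illustrative choices, not prescribed by the text:

```python
import math

# A sketch of the tabulation method for f(x) = 2x - log10(x) - 7:
# tabulate f on a grid, keep a subinterval where the sign changes, and refine.
f = lambda x: 2 * x - math.log10(x) - 7

def refine(a, b, points=10):
    xs = [a + (b - a) * i / points for i in range(points + 1)]
    for x0, x1 in zip(xs, xs[1:]):
        if f(x0) * f(x1) < 0:      # IV theorem: a root lies in ]x0, x1[
            return x0, x1
    return a, b

a, b = 1.0, 4.0
for _ in range(3):                  # three rounds of refinement, as in the text
    a, b = refine(a, b)
print(round(a, 3), round(b, 3))     # a small bracket inside ]3, 4[
```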
We can always gather some information regarding the root, either from the physical problem in which the equation f(x) = 0 occurs, or because it is specified in the problem. For example, we may ask for the smallest positive root, or a root closest to a given number, etc.
For a better understanding of the method let us consider one more example.
Example 1 : Find the approximate value of the real root of the equation 2x - 3 sin x - 5 = 0.

Solution : Let f(x) = 2x - 3 sin x - 5. Since f(-x) = -2x + 3 sin x - 5 < 0 for x > 0, the function f(x) is negative for all negative real numbers x. Therefore the function has no negative real root. Hence the roots of this equation must lie in [0, ∞[. Now, following Step 1, we compute the values of f(x) for x = 0, 1, 2, 3, 4, . . .
We have
Table 4
Now we follow Step 2. From the table we find that f(2) and f(3) are of opposite signs. Therefore a root lies between 2 and 3. Now, to get a more refined interval, we evaluate f(x) for some values between 2 and 3. The values are given in Table 5.
Table 5
This table of values shows that f(2.8) and f(2.9) are of opposite signs and hence the root lies
between 2.8 and 2.9. We repeat the process once again for the interval [2.8,2.91 by taking
some values as given in Table 6.
From Table 6 we find that the root lies between 2.88 and 2.89. This interval is small,
therefore we take any value between 2.88 and 2.89 as an initial approximation of the root.
Since f(2.88) is nearer to zero than f(2.89), we can take any number near to 2.88 as an initial
approximation to the root.
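The refinement process of Steps 1-3 can be sketched in a short program. The following Python fragment is our own illustration, not part of the original text; the helper name `refine` and the step sizes are arbitrary choices.

```python
import math

def f(x):
    # Equation from Example 1: 2x - 3 sin x - 5 = 0
    return 2 * x - 3 * math.sin(x) - 5

def refine(f, a, b, step):
    """Scan [a, b] with the given step and return the first subinterval
    on which f changes sign, i.e. which brackets a root."""
    x = a
    while x < b:
        if f(x) * f(x + step) < 0:
            return x, x + step
        x += step
    return None

# Step 1: the roots lie in [0, oo[; a sign change occurs between 2 and 3.
a, b = refine(f, 0.0, 5.0, 1.0)
# Steps 2-3: refine the bracketing interval with smaller steps.
a, b = refine(f, a, b, 0.1)    # gives ]2.8, 2.9[
a, b = refine(f, a, b, 0.01)   # gives ]2.88, 2.89[
print(round(a, 2), round(b, 2))
```

Any value in the final interval can then serve as the initial approximation, exactly as in the text.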
Why don't you try some exercises now.
The figure shows that the graph cuts the x-axis at two points, -2.55 and 0.55, approximately. Hence -2.55 and 0.55 are taken as the approximate roots of the equation x^4 + 4x^3 + 4x^2 - 2 = 0.
Now go back for a moment to Unit 1 and see Example 1 in Sec. 1.2. There we applied the graphical method to find the roots of the equation sin x = 1/2.
Let us consider another example.
Example 3 : Find the approximate value of a root of x^2 - e^x = 0.
Solution : The first thing to do is to draw the graph of the function f(x) = x^2 - e^x. It is not easy to graph this function. Now if we split the function as
f(x) = f1(x) - f2(x),
where f1(x) = x^2 and f2(x) = e^x, then we can easily draw the graphs of the functions f1(x) and f2(x). The graphs are given in Fig. 2.
The figure shows that the two curves y = x^2 and y = e^x intersect at some point P. From the figure, we find that the x-coordinate of the point of intersection of the two curves is approximately -0.7. Thus we have f1(-0.7) ≈ f2(-0.7), and therefore f(-0.7) = f1(-0.7) - f2(-0.7) ≈ 0. Hence -0.7 is an approximate value of the root of the equation f(x) = 0.
From the above example we observe the following : Suppose we want to apply the graphic method for finding an approximate root of f(x) = 0. Then we may try to simplify the method by splitting the equation as
f(x) = f1(x) - f2(x) = 0 . . . (4)
where the graphs of f1(x) and f2(x) are easy to draw. From Eqn. (4), we have f1(x) = f2(x). The x-coordinate of the point at which the two curves y1 = f1(x) and y2 = f2(x) intersect gives an approximate value of a root of the equation f(x) = 0. Note that we are interested only in the x-coordinate; we don't have to worry about the y-coordinate of the point of intersection of the curves.
Often we can split the function f(x) in the form (4) in a number of ways. But we should choose that form which involves minimum calculation and for which the graphs of f1(x) and f2(x) are easy to draw. We illustrate this point in the following example.
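The same splitting idea can also be checked numerically: instead of plotting f1 and f2 and reading the crossing off the graph, we can scan for a sign change of f1 - f2. The following Python sketch is our own illustration (the function name `crossing` and the step size are arbitrary choices); it uses f1(x) = x^2 and f2(x) = e^x from Example 3.

```python
import math

# Graphic method, done numerically: locate an interval where the two
# curves y = f1(x) and y = f2(x) cross, i.e. where f1 - f2 changes sign.
f1 = lambda x: x * x
f2 = lambda x: math.exp(x)

def crossing(f1, f2, a, b, step=0.1):
    """Return a subinterval of [a, b] of the given width
    on which f1 - f2 changes sign."""
    d = lambda x: f1(x) - f2(x)
    x = a
    while x < b:
        if d(x) * d(x + step) <= 0:
            return x, x + step
        x += step
    return None

print(crossing(f1, f2, -2.0, 2.0))   # an interval around -0.7
```

This reproduces the conclusion of Example 3: the curves cross a little to the left of -0.7.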
Example 4 :Find an approximate value of the positive real root of 3x - cos x - 1 = 0 using
graphic method.
Solution : Since it is easy to plot 3x - 1 and cos x, we rewrite the equation as 3x - 1 = cos x.
The graphs of y = f1(x) = 3x - 1 and y = f2(x) = cos x are given in Fig. 3.
Fig. 3 : Graphs of f1(x) = 3x - 1 and f2(x) = cos x
It is clear from the figure that the x-coordinate of the point of intersection is approximately 0.6. Hence x = 0.6 is an approximate value of the root of the equation 3x - cos x - 1 = 0.
We now make a remark.
Remark 1 : You should take some care while choosing the scale for graphing. A magnification of the scale may improve the accuracy of the approximate value.
Here is an exercise for you.
E3) Find the approximate location of the roots of the following equations in the regions
given using graphic method.
a) f(x) = e^(-x) - x = 0, in 0 ≤ x ≤ 1
b) f(x) = e o e 4-~0 . 4 ~- 9 = 0, in 0 < x 5 7
We have discussed two methods, namely, the tabulation method and the graphical method, which help us in finding an initial approximation to a root. But these two methods give only a rough approximation to a root. Now to obtain more accurate results, we need to improve these crude approximations. In the tabulation method we found that one way of improving the process is refining the intervals within which a root lies. A modification of this method is known as the bisection method. In the next section we discuss this method.
2.3 BISECTION METHOD

In the beginning of the previous section we mentioned that there are two steps involved in finding an approximate solution. The first step has already been discussed. In this section we consider the second step, which deals with refining an initial approximation to a root. Once we know an interval in which a root lies, there are several procedures to refine it. The bisection method is one of the basic methods among them. We repeat the Steps 1, 2, 3 of the tabulation method given in subsection 2.2.1 in a modified form. For convenience we write the method as an algorithm. (An algorithm is a complete and unambiguous set of instructions leading to the solution of a problem.)
Suppose that we are given a continuous function f(x) defined on [a, b] and we want to find the roots of the equation f(x) = 0 by the bisection method. (This method is also called the Bolzano method or the bracketing method.) We describe the procedure in the following steps :
Step 1 : Find points x1, x2 in the interval [a, b] such that f(x1) f(x2) < 0, that is, points x1 and x2 for which f(x1) and f(x2) are of opposite signs (see Step 1 of subsection 2.2.1). This process is called "finding an initial bisecting interval". Then by the IV theorem a root lies in the interval ]x1, x2[.
Step 2 : Find the middle point c of the interval ]x1, x2[, i.e. c = (x1 + x2)/2. If f(c) = 0, then c is the required root of the equation and we can stop the procedure. Otherwise we go to Step 3.
Step 3 : Check the sign of f(x1) f(c). If f(x1) f(c) < 0, then the root lies in the interval ]x1, c[; rename c as x2. Otherwise the root lies in ]c, x2[; rename c as x1.
Step 4 : Repeat Steps 2 and 3 with the new interval.
Step 5 : Stop the bisections when the width of the interval becomes smaller than the accuracy required, and take any value in the final interval (for instance, its middle point) as the approximate root.
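The five steps above can be collected into a short program. The following Python sketch is our own illustration (the names and the stopping width are our choices); it is applied here to the function of Example 5 below, f(x) = 2x - log10 x - 7.

```python
import math

def bisect(f, x1, x2, width=1e-3):
    """Bisection method following Steps 1-5: x1, x2 must satisfy
    f(x1) * f(x2) < 0 (the initial bisecting interval)."""
    assert f(x1) * f(x2) < 0, "f must change sign on the initial interval"
    while x2 - x1 > width:
        c = (x1 + x2) / 2          # Step 2: middle point
        if f(c) == 0:
            return c, c            # c is the exact root
        if f(x1) * f(c) < 0:       # Step 3: root lies in ]x1, c[
            x2 = c
        else:                      # otherwise the root lies in ]c, x2[
            x1 = c
    return x1, x2                  # Step 5: final small interval

f = lambda x: 2 * x - math.log10(x) - 7
lo, hi = bisect(f, 3.78, 3.79, width=0.0005)
print(round((lo + hi) / 2, 3))     # middle point as the approximate root
```

Note that each pass of the loop halves the width of the interval, so the maximum possible absolute error is halved at every bisection.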
Now we shall see how this method helps in refining the initial intervals in some of the problems we have done in subsection 2.2.1.
Example 5 : The equation 2x - log10 x = 7 has a root which lies in ]3.78, 3.79[. Apply the bisection method to find an approximate root of the equation correct to three decimal places.
Solution : Let f(x) = 2x - log10 x - 7. From Table 2 in subsection 2.2.1, we find that f(3.78) = -0.01749 and f(3.79) = 0.00136. Thus a root lies in the interval ]3.78, 3.79[. Then we find the middle point of the interval ]3.78, 3.79[. The middle point is c = (3.78 + 3.79)/2 = 3.785 and f(c) = f(3.785) = -0.00806 ≠ 0. Now, we check the condition in Step 3. Since f(3.78) f(3.785) > 0, the root does not lie in the interval ]3.78, 3.785[. Hence the root lies in the interval ]3.785, 3.79[. We have to refine this interval further to get a better approximation. Further bisections are shown in the following table.
Table 7
Number of bisections | Bisected value xi | Improved interval
The table shows that the improved interval after 5 bisections is ]3.78906, 3.789375[. The width of this interval is 3.789375 - 3.78906 = 0.000315. If we stop further bisections, the maximum absolute error would be 0.000315. The approximate root may therefore be taken as (3.78906 + 3.789375)/2 = 3.789218. Hence the desired approximate value of the root rounded off to three decimal places is 3.789.
Example 6 : Apply biseqtion method to find an approximation to the positive root of the
equation
2x - 3 sin x - 5 = 0
rounded off to three decimal places.
Solution :Let f(x) = 2x - 3 sin x - 5.
In Example 1, we had shown that a positive root lies in the interval ]2.8, 2.9[. Now we apply the
bisection method to this interval. The results are given in the following table.
Table 8
Number of bisections | Bisected value xi | f(xi) | Improved interval
1 | 2.85 | -0.1624 | ]2.85, 2.9[
2 | 2.875 | -0.0403 | ]2.875, 2.9[
3 | 2.8875 | 0.02089 | ]2.875, 2.8875[
After 6 bisections the width of the interval is 2.8835938 - 2.8832031 = 0.0003907. Hence, the maximum possible absolute error to the root is 0.0003907. Therefore the required approximation to the root is 2.883.
Now let us make some remarks.
Remark 2 : While applying the bisection method we must be careful to check that f(x) is continuous. For example, we may come across functions like f(x) = 1/(x - 1). If we consider the interval ]0.5, 1.5[, then f(0.5) f(1.5) < 0. In this case we may be tempted to use the bisection method. But we cannot use the method here, because f(x) is not defined at the middle point x = 1. We can overcome such difficulties by requiring f(x) to be continuous throughout the initial bisecting interval. (Note that if f(x) is continuous, then by the IV theorem f(x) assumes all values between its values at the end points of the interval.)
Therefore you should always examine the continuity of the function in the initial interval before attempting the bisection method.
Remark 3 : It may happen that a function has more than one root in an interval. The bisection method helps us in determining one root only. We can determine the other roots by properly choosing the initial intervals.
You can try some exercises now.
E4) Starting with the interval [a0, b0], apply the bisection method to the following equations and find an interval of width 0.05 that contains a solution of the equations.
E5) Using the bisection method, find an approximate root of the equation x^3 - x - 4 = 0 in the interval ]1, 2[ correct to two places of decimal.
While applying the bisection method we repeatedly apply Steps 2, 3, 4 and 5. Recall that in the introduction we classified such a method as an iteration method. (Iteration means repeated application of a numerical process or a pattern of action.) As we mentioned in the beginning of Sec. 2.2, a numerical process starts with an initial approximation, and iteration improves this approximation until we get the desired accurate value of the root.
Let us consider another iteration method now.
2.4 FIXED POINT ITERATION METHOD

The bisection method we have described earlier depends on our ability to find an interval in which the root lies. The task of finding such intervals is difficult in certain situations. In such cases we try an alternate method called the fixed point iteration method. We shall discuss the advantage of this method later.
The first step in this method is to rewrite the equation f(x) = 0 in the form
x = g(x) . . . (5)
For example, consider the equation x^2 - 2x - 8 = 0. We can write it as
x = (2x + 8)^(1/2) . . . (6)
or
x = (2x + 8)/x . . . (7)
or
x = (x^2 - 8)/2 . . . (8)
We can choose the form (5) in several ways. Since f(x) = 0 is the same as x = g(x), finding a root of f(x) = 0 is the same as finding a root of x = g(x), i.e. a fixed point of g(x). (A fixed point of a function g is a point α such that g(α) = α.) Each such g(x) given in (6), (7) or (8) is called an iteration function for solving f(x) = 0.
Once an iteration function is chosen, our next step is to take a point x0, close to the root, as the initial approximation of the root. Each computation of the type x_(i+1) = g(x_i) is called an iteration. Now, two questions arise :
i) When do we stop these iterations?
ii) Does this procedure always give the required solution?
To ensure this we make the following assumption on g(x).
Assumption (*) : The derivative g'(x) of g(x) exists, g'(x) is continuous and satisfies | g'(x) | < 1 in an interval containing x0. (That would mean that we require | g'(x_i) | < 1 at all iterates x_i.)
The iteration is usually stopped whenever | x_(i+1) - x_i | is less than the accuracy required.
In Unit 3 you will prove that if g(x) satisfies the above conditions, then there exists a unique point α such that g(α) = α and the sequence of iterates approaches α, provided that the initial approximation is close to the point α.
Example 7 : Find an approximate root of the equation x^2 - 2x - 8 = 0 using the fixed point iteration method, starting with x0 = 5. Stop the iteration whenever | x_(i+1) - x_i | < 0.001.
Solution : Let f(x) = x^2 - 2x - 8. We saw that the equation f(x) = 0 can be written in the three forms (6), (7) and (8). We shall take up the three forms one by one.
Case 1 : x = (2x + 8)^(1/2)
Here g(x) = (2x + 8)^(1/2). Let's see whether Assumption (*) is satisfied for this g(x). We have
g'(x) = 1/(2x + 8)^(1/2)
Then | g'(x) | < 1 whenever (2x + 8)^(1/2) > 1. For any positive real number x, this inequality is satisfied. Therefore, we consider any interval on the positive side of the x-axis. Since the starting point is x0 = 5, we may consider the interval I = [3, 6]. This contains the point 5. Now, g(x) satisfies the conditions that g'(x) exists on I, g'(x) is continuous on I and | g'(x) | < 1 for every x in the interval [3, 6]. Now we apply the fixed point iteration method to g(x). We get
x1 = g(5) = (18)^(1/2) = 4.243
x2 = g(4.243) = 4.060
x3 = g(4.060) = 4.015
x4 = g(4.015) = 4.004
and the iterates approach the exact root 4.
Case 2 : x = (2x + 8)/x
Here g(x) = (2x + 8)/x and g'(x) = -8/x^2. Then | g'(x) | < 1 for any real number x ≥ 3. Hence g(x) satisfies Assumption (*) in the interval [3, 6]. Now we leave it as an exercise for you to complete the computations (see E6).
Case 3 : Here we have x = (x^2 - 8)/2. Then g(x) = (x^2 - 8)/2 and g'(x) = x. In this case | g'(x) | < 1 only if | x | < 1, i.e. if x lies in the interval ]-1, 1[. But this interval does not contain 5. Therefore g(x) does not satisfy Assumption (*) in any interval containing the initial approximation. Hence, the iteration method cannot provide an approximation to the desired root.
Note : This example may appear artificial to you. You are right, because in this case we have a formula for calculating the roots. This example is taken to illustrate the method in a simple way.
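A sketch of the method in Python may also help (our own illustration; the tolerance and the iteration limit are arbitrary choices). We use the iteration function of Case 1.

```python
import math

def fixed_point(g, x0, tol=0.001, max_iter=100):
    """Fixed point iteration x_{i+1} = g(x_i); stop when |x_{i+1} - x_i| < tol.
    Assumption (*) (|g'(x)| < 1 near x0) should be checked beforehand."""
    x = x0
    for _ in range(max_iter):
        x_new = g(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("iteration did not converge")

# Case 1 of Example 7: g(x) = (2x + 8)**0.5 with x0 = 5; the exact root is 4.
g = lambda x: math.sqrt(2 * x + 8)
root = fixed_point(g, 5.0)
print(round(root, 2))
```

Replacing `g` by the Case 3 function (x**2 - 8)/2 would make the iterates diverge, exactly as predicted by the failure of Assumption (*).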
Example 8 : Use the fixed point iteration procedure to find an approximate root of 2x - 3 sin x - 5 = 0, starting with the point x0 = 2.8. Stop the iteration whenever | x_(i+1) - x_i | < 10^(-5).
Solution : We can rewrite the equation as x = (3 sin x + 5)/2.
Here g(x) = (3 sin x + 5)/2 and g'(x) = (3/2) cos x.
At x0 = 2.8, | g'(x0) | = (3/2) | cos 2.8 | = 1.413, which is greater than 1. Thus g(x) does not satisfy Assumption (*) and therefore in this form the iteration method fails.
Let us now rewrite the equation in another form. We write
x = x - (2x - 3 sin x - 5)/(2 - 3 cos x)
Then g(x) = x - (2x - 3 sin x - 5)/(2 - 3 cos x)
You may wonder how we got this form. Note that here g(x) is of the form g(x) = x - f(x)/f'(x). You will find later that this is the iteration formula of another popular iteration method.
Then
g'(x) = 1 - [(2 - 3 cos x)(2 - 3 cos x) - (2x - 3 sin x - 5) 3 sin x]/(2 - 3 cos x)^2
      = (2x - 3 sin x - 5) 3 sin x/(2 - 3 cos x)^2
At x0 = 2.8, | g'(x0) | ≈ 0.0175 < 1. Therefore g(x) satisfies Assumption (*). Using the initial approximation x0 = 2.8, we get the successive approximations
x1 = 2.8839015
and the next iterates x2 and x3 agree with each other to more than five decimal places. Since | x3 - x2 | < 10^(-5) we stop the iteration here and conclude that 2.88323 is an approximate value of the root.
Let us consider one more form. We can rewrite the equation as sin x = (2x - 5)/3, i.e.
x = sin^(-1) [(2x - 5)/3], so that g(x) = sin^(-1) [(2x - 5)/3].
At x0 = 2.8, | g'(x0) | = 0.6804 < 1. In fact, we can check that in any small interval containing 2.8, | g'(x) | < 1. Thus g(x) satisfies Assumption (*). Applying the iteration method, we have
sin x1 = (2 × 2.8 - 5)/3 = 0.2
We find that there are two values of x1 which satisfy this equation. One value is 0.201358 and the other is π - 0.201358 = 2.940235. In such situations, we take the value close to the initial approximation. In this case the value close to the initial approximation is 2.940235. Therefore we take this value as the starting point of the next iteration:
x1 = 2.940235
Next we calculate
sin x2 = (2 × 2.940235 - 5)/3 = 0.293490
so that x2 = 0.297876 or 2.843717.
Continuing like this, we need 17 iterations to obtain the value x17 = 2.88323, which we got from the previous form. This means that in this form the convergence is very slow.
From Examples 7 and 8, we learn that if we choose the form x = g(x) properly, then we can get the approximate root, provided that the initial approximation is sufficiently close to the root. The initial approximation is usually given in the problem, or we can find it using the IV theorem.
Now we shall make a remark here.
Remark 4 : The Assumption (*) we have given for an iteration function is a strong assumption. In actual practice there are a variety of assumptions which the iteration function g(x) must satisfy to ensure that the iterations approach the root. But to use those assumptions you would require a lot of practice in the application of techniques in mathematical analysis. In this course, we will be restricting ourselves to functions that satisfy Assumption (*). If you would like to know about the other assumptions, you may refer to 'Elementary Numerical Analysis' by Samuel D. Conte and Carl de Boor.
To get some practice with this method, you can try the following exercises.
ii) x = 1 + (sin x)/2, x0 = 1.
b) Compute the exact roots of the equation x^2 + 45x - 2 = 0 using the quadratic formula and compare with the approximate root obtained in (a) (i).
2.5 SUMMARY
2.6 SOLUTIONS/ANSWERS
E1) Let f(x) = 3x - √(1 + sin x).
Since f(-x) = -3x - √(1 + sin(-x)) = -3x - √(1 - sin x) < 0 for x > 0, f(x) has no negative real root.
Computing values of f(x) for x = 0, 1, 2, . . . radians, we get
f(0) = 3 × 0 - √(1 + sin 0) = -1
and f(1) = 3 - √(1 + sin 1) = 3 - √1.84147 = 1.6430, as sin 1 = 0.84147, approximately, using a calculator. Thus f(0) and f(1) are of opposite signs. Therefore there exists a root of f(x) = 0 lying between 0 and 1.
Now we take some values between 0 and 1, say 0.3 and 0.5. Then
f(0.3) = 0.9 - 1.1382 = -0.2382 < 0
and
f(0.5) = 0.28368 > 0.
Hence the root lies in ]0.3, 0.5[.
Repeating the process once again with the values x = 0.35 and 0.41 etc., we get
f(0.35) < 0
and
f(0.41) > 0.
Therefore the root lies between 0.35 and 0.41. This interval is small. If we stop the iteration here, we may either take 0.41, since f(0.41) is closer to zero, or (0.35 + 0.41)/2 = 0.38 as the required initial approximation.
E2) Let f(x) = 2x - tan x. Since we want a positive root of f(x) = 0, we evaluate f(x) for x > 0.
Let us consider x = 0, 1, 1.5. Then
f(0) = 0
f(1) = 0.443
and
f(1.5) = -11.1014
Therefore a root lies between 1 and 1.5. Now if we consider values of f(x) for x = 1.1 and 1.2, we get
f(1) = 0.443 > 0
f(1.1) = 0.235 > 0
and
f(1.2) = -0.1722 < 0
Therefore a root lies in the interval ]1.1, 1.2[. We may take (1.1 + 1.2)/2 = 1.15 as an initial approximation.
E3) Let f1(x) = e^(-x).
From the graph you can see that there are two points of intersection. The x-coordinates of the points are approximately 6 and -22.5. Hence one root lies close to 6 and the other root lies close to -22.5.
E4) a) We first note that the given function f(x) = e^x - 2 - x is continuous in the interval ]1.0, 1.8[. Also
f(1.0) = e - 3 = -0.2817 < 0 and f(1.8) = e^1.8 - 3.8 = 2.2496 > 0,
using a calculator. Therefore the interval ]1.0, 1.8[ contains a root of the equation.
The middle point of the interval is c = (1.0 + 1.8)/2 = 1.4. Also, f(c) = e^1.4 - 3.4 = 4.0552 - 3.4 = 0.6552 > 0.
Therefore the root lies in the interval ]1, 1.4[.
Repeating this process three times more, we get the intervals ]1.0, 1.2[, ]1.1, 1.2[ and ]1.1, 1.15[. Therefore the improved interval after 4 bisections is ]1.1, 1.15[. The width of this interval is 0.05. This shows that the required interval of width 0.05 which contains a root of the equation is ]1.1, 1.15[.
b) Using a calculator you can show that the intervals in each of the four bisections are given by ]3.6, 4.0[, ]3.6, 3.8[, ]3.6, 3.7[ and ]3.65, 3.70[. The width of the last interval is 3.70 - 3.65 = 0.05. Therefore the required interval is ]3.65, 3.70[.
E5) After 5 bisections the root lies in ]1.7959, 1.7969[. Therefore the required root correct to two decimal places is 1.80.
E6) The iterations are given by x_(i+1) = (2x_i + 8)/x_i, i = 0, 1, 2, . . . Starting with x0 = 5, we have
x1 = g(5) = 18/5 = 3.6
E7) a) i) Here g(x) = 2/x - 45 and x0 = -20. Starting with x0 = -20, the successive iterations are given by
x1 = -45.1
x2 = -45.04435
Since x3 and x4 are the same, we stop the iteration there. Hence the approximate root rounded off to four decimal places is -45.0444.
ii) The desired root is 1.4973.
b) The given equation is x^2 + 45x - 2 = 0. According to the quadratic formula, the two roots are
x = (-45 ± √2033)/2, i.e. 0.0444 and -45.0444.
Comparing with the result in part (a) (i), we find that the approximate root is the same as the exact root -45.0444.
UNIT 3 CHORD METHODS FOR
FINDING ROOTS
Structure
3.1 Introduction
Objectives
3.2 Regula-Falsi Method
3.3 Newton-Raphson Method
3.4 Convergence Criterion
3.5 Summary
3.1 INTRODUCTION
In the last unit we introduced you to two iteration methods for finding roots of an equation f(x) = 0. There we showed that a root of the equation f(x) = 0 can be obtained by writing the equation in the form x = g(x). Using this form we generate a sequence of approximations x_(i+1) = g(x_i) for i = 0, 1, 2, . . . We also mentioned there that the success of the iteration methods depends upon the form of g(x) and the initial approximation x0. In this unit, we shall discuss two iteration methods : the regula-falsi and Newton-Raphson methods. These methods produce results faster than the bisection method. The first two sections of this unit deal with the derivations and the use of these two methods. You will be able to appreciate these iteration methods better if you can compare their efficiency. With this in view we introduce the concept of a convergence criterion, which helps us to check the efficiency of each method. Sec. 3.4 is devoted to the study of the rate of convergence of different iterative methods.
Objectives
After studying the unit you should be able to :
apply regula-falsi and secant methods for finding roots;
apply the Newton-Raphson method for finding roots;
define 'order of convergence' of an iterative scheme;
obtain the order of convergence of the following four methods :
i) bisection method
ii) fixed point iteration method
iii) secant method
iv) Newton-Raphson method
3.2 REGULA-FALSI METHOD

The bisection method for finding approximate roots has a drawback : it makes use of only the signs of f(a) and f(b); it does not use the values f(a), f(b) in the computations. For example, if f(a) = 700 and f(b) = -0.1, then by the bisection method the first approximate value of a root of f(x) is the mid-point x0 of the interval ]a, b[. But at x0, f(x0) may be nowhere near 0. Therefore, in this case it makes more sense to take a value near to b, where | f | is small, than the middle value as the approximation to the root. This drawback is to some extent overcome by the regula-falsi method. We shall first describe the method geometrically.
Suppose we want to find a root of the equation f(x) = 0 where f(x) is a continuous function. As in the bisection method, we first find an interval ]a, b[ such that f(a) f(b) < 0. Let us look at the graph of f(x) given in Fig. 1.
The condition f(a) f(b) < 0 means that the points (a, f(a)) and (b, f(b)) lie on opposite sides of the x-axis. Let us consider the line joining (a, f(a)) and (b, f(b)). This line crosses the x-axis at some point (c, 0) [see Fig. 1]. Then we take the x-coordinate of that point as the first approximation. If f(c) = 0, then x = c is the required root. If f(a) f(c) < 0, then the root lies in ]a, c[ (see Fig. 1 (a)); in this case the graph of y = f(x) is concave near the root. Otherwise, if f(a) f(c) > 0, the root lies in ]c, b[ (see Fig. 1 (b)); in this case the graph of y = f(x) is convex near the root. Having fixed the interval in which the root lies, we repeat the above procedure.
Let us now write the above procedure in mathematical form. Recall the formula for the line joining two points in the Cartesian plane [see MTE-05]. The line joining (a, f(a)) and (b, f(b)) is given by
y - f(a) = [(f(b) - f(a))/(b - a)] (x - a)
We can rewrite this in the form
(y - f(a))/(f(b) - f(a)) = (x - a)/(b - a) . . . (1)
Since the straight line intersects the x-axis at (c, 0), the point (c, 0) lies on the straight line. Putting x = c, y = 0 in Eqn. (1), we get
-f(a)/(f(b) - f(a)) = (c - a)/(b - a) . . . (2)
This expression gives an approximate value c of a root of f(x) = 0. Simplifying (2), we can also write it as
c = [a f(b) - b f(a)]/[f(b) - f(a)] . . . (3)
Now, examine the sign of f(c) and decide in which interval, ]a, c[ or ]c, b[, the root lies. We thus obtain a new interval such that f(x) is of opposite signs at its end points. By repeating this process, we get a sequence of intervals ]a, b[, ]a, a1[, ]a, a2[, . . . as shown in Fig. 2.
Fig. 2
We now summarise this method in the algorithm form. This will enable you to solve
problems easily.
Step 1 : Find numbers x0 and x1 such that f(x0) f(x1) < 0, using the tabulation method.
Step 2 : Set x2 = [x0 f(x1) - x1 f(x0)]/[f(x1) - f(x0)]. This gives the first approximation.
Step 3 : If f(x2) = 0, then x2 is the required root. If f(x2) ≠ 0 and f(x0) f(x2) < 0, then the next approximation lies in ]x0, x2[. Otherwise it lies in ]x2, x1[.
Step 4 : Repeat the process till the magnitude of the difference between two successive iterated values x_i and x_(i+1) is less than the accuracy required. (Note that | x_(i+1) - x_i | gives an estimate of the absolute error in the approximation.)
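Steps 1-4 can be sketched in Python as follows (our own illustration; the names, tolerance and iteration cap are arbitrary choices). We test it on the equation x^3 + 7x^2 + 9 = 0, which, as Example 1 notes, has a root between -8 and -7.

```python
def regula_falsi(f, x0, x1, tol=1e-4, max_iter=100):
    """Regula-falsi (false position): f(x0) * f(x1) < 0 is required (Step 1)."""
    assert f(x0) * f(x1) < 0
    x_prev = x0
    for _ in range(max_iter):
        # Step 2: intersection of the chord with the x-axis, formula (3)
        x2 = (x0 * f(x1) - x1 * f(x0)) / (f(x1) - f(x0))
        if f(x2) == 0 or abs(x2 - x_prev) < tol:   # Step 4 stopping rule
            return x2
        # Step 3: keep the subinterval on which the sign change persists
        if f(x0) * f(x2) < 0:
            x1 = x2
        else:
            x0 = x2
        x_prev = x2
    raise RuntimeError("no convergence")

f = lambda x: x**3 + 7 * x**2 + 9
print(round(regula_falsi(f, -8.0, -7.0), 3))
```

Unlike bisection, the new point is weighted towards the end point where | f | is smaller, which is exactly how the method overcomes the drawback described above.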
Example 1 : It is known that the equation x^3 + 7x^2 + 9 = 0 has a root between -8 and -7. Use the regula-falsi method to obtain the root rounded off to 3 decimal places. Stop the iteration when | x_(i+1) - x_i | < 10^(-4).
Solution : Since we are given that x0 = -8 and x1 = -7, we do not have to use Step 1. Now to get the first approximation, we apply the formula in Step 2. Since f(x0) = f(-8) = -55 and f(x1) = f(-7) = 9, we obtain
x2 = [(-8)(9) - (-7)(-55)]/[9 - (-55)] = -457/64 = -7.140625
Now we compare the sign of f(x2) with the signs of f(x0) and f(x1). We can see that f(x0) and f(x2) are of opposite signs. Therefore a root lies in the interval ]-8, -7.1406[. We apply the formula again by renaming the end points of the interval as x1 = -8, x2 = -7.1406. Then we get the second approximation as
x3 = [x1 f(x2) - x2 f(x1)]/[f(x2) - f(x1)] = [-8 f(-7.1406) - (-7.1406) f(-8)]/[f(-7.1406) - f(-8)]
We repeat this process using Steps 2 and 3 given above. The iterated values are given in the following table.
Table 1
Number of iterations | Interval | Iterated value x_i | Function value f(x_i)
From the table, we see that the absolute value of the difference between the 5th and 6th iterated values is | 7.1748226 - 7.1747855 | = 0.0000371. Therefore we stop the iteration here. Further, the value of f(x) at the 6th iterated value is 0.00046978 = 4.6978 × 10^(-4), which is close to zero. Hence we conclude that -7.175 is an approximate root of x^3 + 7x^2 + 9 = 0 rounded off to three decimal places.
Here is an exercise for you.
E1) Obtain an approximate root for the following equations rounded off to three decimal places, using the regula-falsi method.
b) x sin x - 1 = 0
You will note that in the regula-falsi method, at each stage we find an interval ]x0, x1[ which contains a root and then apply the iteration formula (3). This procedure has a disadvantage. To overcome this, the regula-falsi method is modified. The modified method is known as the secant method. In this method we choose x0 and x1 as any two approximations of the root; the interval ]x0, x1[ need not contain the root. Then we apply formula (3) with x0, x1, f(x0) and f(x1):
x2 = [x0 f(x1) - x1 f(x0)]/[f(x1) - f(x0)] . . . (4)
In general, the iterated values are given by
x_(n+1) = [x_(n-1) f(x_n) - x_n f(x_(n-1))]/[f(x_n) - f(x_(n-1))]
Note : Geometrically, in the secant method we replace the graph of f(x) in the interval ]x_(n-1), x_n[ by the straight line joining the two points (x_(n-1), f(x_(n-1))) and (x_n, f(x_n)) on the curve, and take the point of intersection of this line with the x-axis as the next approximate value of the root. Any line joining two points on a curve is called a secant line. That is why this method is known as the secant method (see Fig. 3).
Fig. 3
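Because the latest two iterates are always used, whether or not they bracket the root, the secant method is even shorter to program than regula-falsi. The following Python sketch is our own illustration; the printed values f(0) = 1, f(1) = -2.177979523 and the root 0.5177573637 in Example 2 are consistent with f(x) = cos x - x e^x, which is the function we use here.

```python
import math

def secant(f, x0, x1, tol=1e-9, max_iter=100):
    """Secant method: formula (4) applied to the two latest iterates,
    with no bracketing requirement."""
    for _ in range(max_iter):
        x2 = (x0 * f(x1) - x1 * f(x0)) / (f(x1) - f(x0))
        if abs(x2 - x1) < tol:
            return x2
        x0, x1 = x1, x2
    raise RuntimeError("no convergence")

f = lambda x: math.cos(x) - x * math.exp(x)
print(round(secant(f, 0.0, 1.0), 6))
```

The price of dropping the bracketing condition is that convergence is no longer guaranteed; the gain, as Example 2 shows, is far fewer iterations when the method does converge.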
Example 2 : Find an approximate root of the equation cos x - x e^x = 0 using
i) the secant method, starting with the two initial approximations x0 = 0 and x1 = 1, and
ii) the regula-falsi method.
(This example was considered in the book 'Numerical Methods for Scientific and Engineering Computation' by M.K. Jain, S.R.K. Iyengar and R.K. Jain.)
Solution : i) Let f(x) = cos x - x e^x. Then f(0) = 1 and f(1) = cos 1 - e = -2.177979523. Applying formula (4) with x0 = 0, x1 = 1, we get
x2 = [0 × f(1) - 1 × f(0)]/[f(1) - f(0)] = -1/(-3.177979523) = 0.3146653378
Therefore the first iterated value is 0.3146653378. To get the 2nd iterated value, we apply formula (4) with x1 = 1, x2 = 0.3146653378. Now f(1) = -2.177979523 and f(0.3146653378) = 0.519871175. Therefore
x3 = [x1 f(x2) - x2 f(x1)]/[f(x2) - f(x1)] = 0.4467281446
We continue this process. The iterated values are tabulated in the following table.
From the table we find that the iterated values for the 7th and 8th iterations are the same. Also the value of the function at the 8th iteration is close to zero. Therefore we conclude that 0.5177573637 is an approximate root of the equation.
ii) To apply the regula-falsi method, let us first note that f(0) f(1) < 0. Therefore a root lies in the interval ]0, 1[. Now we apply formula (3) with x0 = 0 and x1 = 1. Then the first approximation is x2 = 0.3146653378. You may have noticed that we have already calculated the expression on the right hand side in part (i).
Now f(x2) = 0.519871 > 0. This shows that the root lies in the interval ]0.3146653378, 1[. To get the second approximation, we compute x3 = 0.4467281446, which is the same as x3 obtained in (i). We find f(x3) = 0.203545 > 0. Hence the root lies in ]0.4467281446, 1[. To get the third approximation, we calculate x4. The expression for x4 is now different from the expression for x4 in part (i). This is because when we use the regula-falsi method, at each stage we have to check the condition f(x_i) f(x_(i+1)) < 0.
The computed values of the rest of the approximations are given in Table 3.
Table 3 : Regula-Falsi Method
No. | Interval | Iterated value x_i | f(x_i)
1 | ]0, 1[ | 0.3146653378 | 0.519871
2 | ]0.4467281446, 1[ | 0.4467281446 | 0.203545
3 | ]0.4940153366, 1[ | 0.4940153366 | 0.708023 × 10^(-1)
4 | ]0.5099461404, 1[ | 0.5099461404 | 0.236077 × 10^(-1)
5 | ]0.5152010099, 1[ | 0.5152010099 | 0.776011 × 10^(-2)
From the table, we observe that we have to perform 20 iterations using the regula-falsi method to get the approximate value of the root 0.5177573637, which we obtained by the secant method after 8 iterations. Note that the end point 1 is fixed in all iterations given in the table.
3.3 NEWTON-RAPHSON METHOD

This method is one of the most useful methods for finding roots of an algebraic equation. Suppose that we want to find an approximate root of the equation f(x) = 0. If f(x) is continuous, then we can apply either the bisection method or the regula-falsi method to find approximate roots. Now if f(x) and f'(x) are continuous, then we can use a new iteration method called the Newton-Raphson method. You will learn that this method gives the result faster than the bisection or regula-falsi methods. The underlying idea of the method is due to the mathematician Isaac Newton. But the method as now used is due to the mathematician Raphson.
Let us begin with an equation f(x) = 0 where f(x) and f'(x) are continuous. Let x0 be an initial approximation, and assume that x0 is close to the exact root α and f'(x0) ≠ 0. Let α = x0 + h, where h is small in magnitude. Hence f(α) = f(x0 + h) = 0.
Now we expand f(x0 + h) using Taylor's theorem. Note that f(x) satisfies all the requirements of Taylor's theorem. Therefore, we get
f(x0 + h) = f(x0) + h f'(x0) + . . . = 0
Neglecting the terms containing h^2 and higher powers, we get
f(x0) + h f'(x0) = 0, i.e. h = -f(x0)/f'(x0).
This gives a new approximation to α as
x1 = x0 + h = x0 - f(x0)/f'(x0)
The successive approximations are given by
x_(i+1) = x_i - f(x_i)/f'(x_i), i = 0, 1, 2, . . . . . . (5)
Eqn. (5) is called the Newton-Raphson formula. Before solving some examples we shall explain this method geometrically.
explain this me&d geometsically.
Fig. 4 : Newton-Raphson Method
If x0 is an initial approximation to the root, then the corresponding point on the graph is P(x0, f(x0)). We draw a tangent to the curve at P. Let it intersect the x-axis at T (see Fig. 4). Let x1 be the x-coordinate of T. Let S(α, 0) denote the point on the x-axis where the curve cuts the x-axis. We know that α is a root of the equation f(x) = 0. We take x1 as the new approximation, which may be closer to α than x0. Now let us find the tangent at P(x0, f(x0)). The slope of the tangent at P(x0, f(x0)) is given by f'(x0). Therefore, by the point-slope form of the expression for a tangent to a curve (recall the expression from MTE-05), we can write
y - f(x0) = f'(x0) (x - x0)
This tangent passes through the point T(x1, 0) (see Fig. 4). Therefore we get
0 - f(x0) = f'(x0) (x1 - x0), i.e. x1 = x0 - f(x0)/f'(x0).
This is the first iterated value. To get the second iterated value we again consider the tangent at the point P1(x1, f(x1)) on the curve (see Fig. 4) and repeat the process. Then we get a point T1(x2, 0) on the x-axis. From the figure, we observe that T1 is closer to S(α, 0) than T. Therefore after each iteration the approximation comes closer and closer to the actual root. (In practice, of course, we do not know the actual root of the given function.)
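Formula (5) translates directly into code. The following Python sketch is our own illustration (the tolerance and iteration cap are arbitrary choices); it is applied to the equation 2x - tan x = 0, whose positive root we located near 1.15 in E2 of Unit 2.

```python
import math

def newton_raphson(f, df, x0, tol=1e-6, max_iter=50):
    """Newton-Raphson iteration (5): x_{i+1} = x_i - f(x_i)/f'(x_i).
    df must not vanish at the iterates (see Remark 1 below)."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("no convergence")

f = lambda x: 2 * x - math.tan(x)
df = lambda x: 2 - 1 / math.cos(x)**2    # f'(x) = 2 - sec(x)**2
print(round(newton_raphson(f, df, 1.0), 5))
```

Note that, unlike bisection or regula-falsi, the method needs the derivative f'(x) in addition to f(x), but it uses only a single current approximation rather than an interval.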
Example 3 : Using the Newton-Raphson method, find a root of the equation 2x - tan x = 0, taking x0 = 1 as the initial approximation.
Solution : Let f(x) = 2x - tan x. Then f'(x) = 2 - sec^2 x = 1 - tan^2 x. For i = 0, we have
f(x0) = f(1) = 2 - tan 1 = 0.4425922
f'(x0) = f'(1) = 1 - tan^2 1 = -1.425519
Therefore x1 = 1 - 0.4425922/(-1.425519) = 1.31048
For i = 1, we get
x2 = 1.31048 - [2 × 1.31048 - tan(1.31048)]/[1 - tan^2(1.31048)] = 1.22393
Similarly we get
x3 = 1.17605
x4 = 1.165926
x5 = 1.165562
x6 = 1.165561
Now x5 and x6 agree to five decimal places. Hence we stop the iteration process here. The root correct to 5 decimal places is 1.16556.
Note 1 : The method used in the above example is applicable for finding the square root of any positive real number. For example, suppose we want to find an approximate value of √A where A is a positive real number. Then we consider the equation x² - A = 0. The iteration formula in this case is
xᵢ₊₁ = (1/2)(xᵢ + A/xᵢ)
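A sketch of this square-root iteration (the guard for A = 0, the starting value and the relative tolerance are our own choices):

```python
def newton_sqrt(A, tol=1e-12, max_iter=100):
    """Approximate sqrt(A) via Newton-Raphson on x^2 - A = 0:
    x_{i+1} = (x_i + A/x_i) / 2."""
    if A < 0:
        raise ValueError("A must be non-negative")
    if A == 0:
        return 0.0
    x = max(A, 1.0)  # any positive starting value works
    for _ in range(max_iter):
        x_new = 0.5 * (x + A / x)
        if abs(x_new - x) <= tol * x_new:  # relative stopping test
            return x_new
        x = x_new
    return x
```

For instance, `newton_sqrt(2)` converges to 1.41421356... in a handful of steps.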
In the above example the iteration converged very fast. One reason for this is that the derivative |f'(x)| is large compared to |f(x)| for each x = xᵢ. The quantity f(xᵢ)/f'(xᵢ), which is the difference between two iterated values, is small in this case. In general we can say that if |f'(xᵢ)| is large compared to |f(xᵢ)|, then we can obtain the desired root very fast by this method.
The Newton-Raphson method has some limitations. In the following remarks we mention some of the difficulties.
Remark 1 : Suppose f'(x) is zero in a neighbourhood of the root; then it may happen that f'(xᵢ) = 0 for some xᵢ. In this case we cannot apply the Newton-Raphson formula, since division by zero is not allowed.
Remark 2 : Another difficulty is that it may happen that f'(x) is zero only at the roots. This happens in either of the following situations.
i) f(x) has a multiple root at α. Recall that a polynomial function f(x) has a multiple root α of order N if we can write
f(x) = (x - α)^N h(x),
where h(x) is a function such that h(α) ≠ 0. For a general function f(x), this means
f(α) = 0 = f'(α) = ... = f^(N-1)(α) and f^(N)(α) ≠ 0.
ii) f(x) has a stationary point (point of maximum or minimum) at the root [recall from your calculus course (MTE-01) that if f'(x) = 0 at some point x, then x is called a stationary point].
In such cases some modifications to the Newton-Raphson method are necessary to get an accurate result. We shall not discuss the modifications here as they are beyond the scope of this course.
You can try some exercises now. Wherever needed, you should use a calculator for computation.
E7) Can the Newton-Raphson iteration method be used to solve the equation = 0? Give reasons for your answer.
In the next section we shall discuss a criterion using which we can check the efficiency of an
iteration process.
3.4 CONVERGENCE CRITERION
In this section we shall introduce a new concept called 'convergence criterion' related to an
iteration process. This criterion gives us an idea of how many successive iterations have to
be carried out to obtain the root to the desired accuracy. We begin with a definition.
|εᵢ₊₁| ≤ λ |εᵢ|^p ... (6)
for some number λ > 0. p is called the order of convergence and λ is called the asymptotic error constant.
This inequality shows the relationship between the errors in successive approximations. For example, suppose p = 2 and |εᵢ| ≈ 10⁻² for some i; then we can expect that |εᵢ₊₁| ≤ λ · 10⁻⁴. Thus if p is large, the iteration converges rapidly. When p takes the integer values 1, 2, 3 we say that the convergence is linear, quadratic and cubic respectively. In the case of linear convergence (i.e. p = 1), we require that λ < 1. In this case we can write (6) as
|εᵢ₊₁| ≤ λ |εᵢ|, 0 < λ < 1
If this condition is satisfied for an iteration process, then we say that the iteration process converges linearly.
Chord Methods for Finding Roots
Since εₙ = xₙ - α, the condition for linear convergence can also be written as
|xₙ₊₁ - α| ≤ λ |xₙ - α|, 0 < λ < 1 ... (8)
Setting n = 0 in the inequality (8), we get
|x₁ - α| ≤ λ |x₀ - α|
For n = 1, we get
|x₂ - α| ≤ λ |x₁ - α| ≤ λ² |x₀ - α|
Continuing in this way, we get |xₙ - α| ≤ λⁿ |x₀ - α|. Since 0 < λ < 1, λⁿ → 0 as n → ∞, and hence the sequence of approximations converges to the root.
Now we shall find the order of convergence of the iteration methods which you have studied so far.
"0 - a.
and b, - a,, = -
2"
We know that the equation f(x) = 0 has a root in [ao, b o ] Let a be the root of the equation.
an bn
Then a lies in all the intervals [ai, bi], i = 0, 1,2, . . . . For any n, let C,
2
=
+
denote the
middle point of the interval [a,,, b,]. Then coyc , , c2, . . . are taken as successive
m
approximations to the root a. Let's check the inequality (8) for jc
i nJn=o
m
converges to the root a. Hence we can say that the bisection method always
converges.
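The bisection process itself can be sketched as follows (a minimal version; the function and interval in the usage line are illustrative assumptions, not from the text):

```python
def bisect(f, a, b, tol=1e-6, max_iter=100):
    """Bisection: repeatedly halve [a, b], keeping the half where f changes sign."""
    if f(a) * f(b) > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        c = (a + b) / 2.0
        if f(c) == 0 or (b - a) / 2.0 < tol:
            return c
        if f(a) * f(c) < 0:
            b = c
        else:
            a = c
    return (a + b) / 2.0

# illustrative use: the positive root of x^2 - 2 = 0 in [1, 2]
root = bisect(lambda x: x * x - 2, 1.0, 2.0)
```

The midpoint sequence converges to √2 ≈ 1.414214, in line with the error bound (b₀ - a₀)/2ⁿ⁺¹ above.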
Solutions of Non-linear Equations in one Variable
For practical purposes, we should be able to decide at what stage we can stop the iteration to have an acceptably good approximate value of α. The number of iterations required to achieve a given accuracy for the bisection method can be obtained. Suppose that we want an approximate solution within an error bound of 10⁻ᴹ. (Recall that you have studied error bounds in Unit 1.) Taking logarithms on both sides of Eqn. (10), we find that the number of iterations required, say n, is approximately given by
n = int [ (ln(b₀ - a₀) + M ln 10) / ln 2 ] + 1 ... (11)
where the symbol 'int' stands for the integral part of the number in the bracket and ]a₀, b₀[ is the initial interval in which a root lies.
Example 5 : Suppose that the bisection method is used to find a zero of f(x) in the interval [0, 1]. How many times must this interval be bisected to guarantee an approximate root with absolute error less than or equal to 10⁻⁵?
Solution : Let n denote the required number. To calculate n, we apply the formula in Eqn. (11) with b₀ = 1, a₀ = 0 and M = 5. Then
n = int [ 5 ln 10 / ln 2 ] + 1 = int [16.60964047] + 1 = 17
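The formula of Eqn. (11) is easy to check in code (the function below simply evaluates that formula):

```python
import math

def bisection_count(a0, b0, M):
    """Number of bisection steps guaranteeing an absolute error <= 10**-M."""
    return int((math.log(b0 - a0) + M * math.log(10)) / math.log(2)) + 1

n = bisection_count(0.0, 1.0, 5)
print(n)  # 17
```

After n halvings the interval length (b₀ - a₀)/2ⁿ is indeed below 10⁻⁵.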
Similarly you can try the following exercise.
E8) For the problem given in Example 5, Unit 2, find the number n of bisections required to have an approximate root with absolute error less than or equal to 10⁻⁵.
The following table gives the minimum number of iterations required to find an approximate root in the interval ]0, 1[ for various acceptable errors.
This table shows that for getting an approximate value with an absolute error bounded by 10⁻⁵, we have to perform 17 iterations. Thus even though the bisection method is simple to use, it requires a large number of iterations to obtain a reasonably good approximate root. This is one of the disadvantages of the bisection method.
Note : The formula given in Eqn. (11) shows that, given an acceptable error, the number of iterations depends upon the initial interval, and thereby depends upon the initial approximation of the root and not directly on the values of f(x) at these approximations.
Next we shall obtain the convergence criteria for the secant method.
Now we expand f(εᵢ + α) and f(εᵢ₋₁ + α) using Taylor's theorem about the point x = α.
We get
f(εᵢ + α) = f(α) + εᵢ f'(α) + (εᵢ²/2) f''(α) + ...
= εᵢ f'(α) + (εᵢ²/2) f''(α) + ...,
since f(α) = 0.
Similarly,
f(εᵢ₋₁ + α) = εᵢ₋₁ f'(α) + (εᵢ₋₁²/2) f''(α) + ...
Substituting these expansions in the secant formula and retaining only the leading terms, we obtain
εᵢ₊₁ = [f''(α)/(2f'(α))] εᵢ εᵢ₋₁ ... (17)
This relationship between the errors is called the error equation. Note that this relationship holds only if α is a simple root. Now using Eqn. (17) we will find numbers p and λ such that
|εᵢ₊₁| = λ |εᵢ|^p
Setting i = j - 1, we obtain
εⱼ = λ εⱼ₋₁^p, i.e. εⱼ₋₁ = (εⱼ/λ)^(1/p)
Substituting these in the error equation (17), we get
λ εⱼ^p = [f''(α)/(2f'(α))] εⱼ (εⱼ/λ)^(1/p) = [f''(α)/(2f'(α))] λ^(-1/p) εⱼ^(1 + 1/p) ... (20)
Equating the powers of εⱼ on both sides of Eqn. (20), we get p = 1 + 1/p, i.e. p² - p - 1 = 0, whose positive root is p = (1 + √5)/2 ≈ 1.62.
Now, to get the number λ, we equate the constant terms on both sides of Eqn. (20). Then we get
λ^(1 + 1/p) = f''(α)/(2f'(α)), i.e. λ = [f''(α)/(2f'(α))]^(1/p), since 1 + 1/p = p.
Hence the order of convergence of the secant method is p ≈ 1.62 and the asymptotic error constant is [f''(α)/(2f'(α))]^0.62.
Example 6 : The following are the five successive iterations obtained by the secant method to find the root α = -2 of the equation x³ - 3x + 2 = 0:
x₁ = -2.6, x₂ = -2.4, x₃ = -2.106598985, x₄ = -2.022641412, x₅ = -2.000021537
Then
f'(x) = 3x² - 3, f'(-2) = 9
f''(x) = 6x, f''(-2) = -12
Therefore λ = [-12/(2 × 9)]^0.618 = -0.778351205
Now
ε₅ = |x₅ - α| = |-2.000021537 + 2| = 0.000021537
and
ε₄ = |x₄ - α| = |-2.022641412 + 2| = 0.022641412.
Hence the errors decrease at the rate predicted by the error equation.
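The secant iteration itself can be sketched as follows (a minimal version; the equation x³ - 3x + 2 = 0 and the starting values are those of Example 6, while the tolerance is an arbitrary choice):

```python
def secant(f, x0, x1, tol=1e-10, max_iter=50):
    """Secant iteration: x_{i+1} = x_i - f(x_i)(x_i - x_{i-1}) / (f(x_i) - f(x_{i-1}))."""
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        if f1 == f0:  # flat chord (or both values already zero): stop
            break
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        if abs(x2 - x1) < tol:
            return x2
        x0, f0, x1, f1 = x1, f1, x2, f(x2)
    return x1

root = secant(lambda x: x ** 3 - 3 * x + 2, -2.6, -2.4)
```

The iterates approach the simple root α = -2, with each error roughly proportional to the 1.62 power of the previous one.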
Since g'(x) is continuous near the root and |g'(α)| < 1, there exists an interval ]α - h, α + h[, where h > 0, such that |g'(x)| ≤ k for some k, where 0 < k < 1. Since α is a root of x = g(x), we have g(α) = α, and therefore
xᵢ₊₁ - α = g(xᵢ) - g(α)
Now the function g(x) is continuous in the interval with end points xᵢ and α, and g'(x) exists in this interval. Hence g(x) satisfies all the conditions of the mean value theorem [see Unit 1]. Then, by the mean value theorem, there exists a ξ between xᵢ and α such that
xᵢ₊₁ - α = g'(ξ)(xᵢ - α)
Hence |xᵢ₊₁ - α| ≤ k |xᵢ - α| ≤ ... ≤ kⁱ⁺¹ |x₀ - α|, and kⁱ⁺¹ → 0 as i → ∞ since 0 < k < 1. This shows that the sequence of approximations {xᵢ} converges to α provided that the initial approximation is close to the root.
We summarise the result obtained for this iteration process in the following theorem.
Theorem 1 : If g(x) and g'(x) are continuous in an interval about a root α of the equation x = g(x), and if |g'(x)| < 1 for all x in the interval, then the successive approximations x₁, x₂, ... given by
xᵢ₊₁ = g(xᵢ), i = 0, 1, 2, ...
converge to the root α provided that the initial approximation x₀ is chosen in the above interval.
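The iteration of Theorem 1 can be sketched as follows. The particular choice g(x) = cos x is our own illustrative assumption (it is not from the text); it satisfies |g'(x)| = |sin x| < 1 near its fixed point, so Theorem 1 guarantees convergence:

```python
import math

def fixed_point(g, x0, tol=1e-8, max_iter=200):
    """Iterate x_{i+1} = g(x_i) until successive values agree within tol."""
    x = x0
    for _ in range(max_iter):
        x_new = g(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# g(x) = cos x has a unique fixed point near 0.739
x = fixed_point(math.cos, 1.0)
```

Since |g'(α)| = sin(0.739...) ≈ 0.674 ≠ 0, the convergence here is linear, as discussed next.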
We shall now discuss the order of convergence of this method. From the previous discussion we have the relation
|xᵢ₊₁ - α| = |g'(ξ)| |xᵢ - α|
Note that ξ is dependent on each xᵢ. Now we wish to determine constants λ and p, independent of xᵢ, such that
|xᵢ₊₁ - α| ≤ λ |xᵢ - α|^p
Note that as the approximations xᵢ get closer to the root α, g'(ξ) approaches the constant value g'(α). Therefore, in the limiting case, as i → ∞, the approximations satisfy the relation
|xᵢ₊₁ - α| ≤ |g'(α)| |xᵢ - α|
Therefore, we conclude that if g'(α) ≠ 0, then the convergence of the method is linear.
If g'(α) = 0, then we have
xᵢ₊₁ - α = g(xᵢ) - α
= g[(xᵢ - α) + α] - α
= g(α) + (xᵢ - α) g'(α) + [(xᵢ - α)²/2] g''(ξ) - α
[by applying Taylor's theorem to the function g(x) about α and neglecting higher powers]
= [(xᵢ - α)²/2] g''(ξ),
since g(α) = α and g'(α) = 0. Hence in this case the convergence is quadratic.
Here the equation x² + ax + b = 0 is rewritten in the form x = g(x) with
g(x) = -(ax + b)/x
so that g'(x) = b/x². By Theorem 1, these iterations converge to α if |g'(x)| < 1 near α, i.e. if
|g'(x)| = |b|/x² < 1.
Note that g'(x) is continuous near α. If the iterations converge to x = α, then we require
|b| < α² ... (23)
Now you recall from your elementary algebra course (MTE-04) that if α and β are the roots, then
α + β = -a and αβ = b
Therefore |b| = |α| |β|. Substituting in Eqn. (23), we get
α² > |α| |β|, i.e. |α| > |β|.
Hence |α| > |β|, that is, this iteration can converge only to the root of larger magnitude.
Similarly you can solve the following exercise.
E9) For the equation given in Example 7, show that the iteration xᵢ₊₁ = -b/(xᵢ + a) will converge to the root of smaller magnitude.
To obtain the order of the method we proceed as in the secant method. We assume that α is a simple root of f(x) = 0. Let
εᵢ = xᵢ - α
Then, from the Newton-Raphson formula, we have
εᵢ₊₁ = εᵢ - f(εᵢ + α)/f'(εᵢ + α)
Now we expand f(εᵢ + α) and f'(εᵢ + α), using Taylor's theorem, about the point α. We have
εᵢ₊₁ = εᵢ - [f(α) + εᵢ f'(α) + (εᵢ²/2) f''(α) + ...] / [f'(α) + εᵢ f''(α) + (εᵢ²/2) f'''(α) + ...]
But f(α) = 0 and f'(α) ≠ 0. Therefore, simplifying and retaining only the leading term, we get
εᵢ₊₁ = [f''(α)/(2f'(α))] εᵢ² + ...
This shows that the errors satisfy Eqn. (6) with p = 2 and λ = f''(α)/(2f'(α)). Hence the Newton-Raphson method is of order 2. That is, at each step the error is proportional to the square of the previous error.
Now, we shall discuss an alternate method for showing that the order is 2. Note that we can write (24) in the form x = g(x) where
g(x) = x - f(x)/f'(x)
Then
g'(x) = 1 - { [f'(x)]² - f(x) f''(x) } / [f'(x)]²
= f(x) f''(x) / [f'(x)]²
Now,
g'(α) = f(α) f''(α) / [f'(α)]² = 0, since f(α) = 0 and f'(α) ≠ 0.
Hence, by the conclusion drawn just above Example 7, the method is of order 2. Note that this is true only if α is a simple root. If α is a multiple root, i.e. if f'(α) = 0, then the convergence is not quadratic, but only linear. We shall not prove this result, but we shall illustrate it with an example.
Example 8 : Let f(x) = (x - 2)⁴ = 0. Starting with the initial approximation x₀ = 2.1, compute the iterations x₁, x₂, x₃ and x₄ using the Newton-Raphson method. Is the sequence converging quadratically or linearly?
Solution : The given function has a multiple root at x = 2, of order 4. The Newton-Raphson iteration formula for the given equation is
xᵢ₊₁ = xᵢ - (xᵢ - 2)⁴ / [4(xᵢ - 2)³] = (1/4)(3xᵢ + 2)
Starting with x₀ = 2.1, the iterations are given by
x₁ = 2.075
x₂ = 2.05625
x₃ = 2.0421875
x₄ = 2.031640625
Then
εᵢ₊₁ = |xᵢ₊₁ - 2| = (3/4) |xᵢ - 2| = (3/4) εᵢ
Thus the convergence is linear in this case. The error is reduced by a factor of 3/4 with each iteration. This result can also be obtained directly from Eqn. (25).
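The linear behaviour can be checked numerically with the simplified formula xᵢ₊₁ = (3xᵢ + 2)/4 from Example 8 (a small sketch):

```python
x = 2.1
errors = []
for _ in range(4):
    x = (3 * x + 2) / 4  # Newton step for f(x) = (x - 2)^4, simplified
    errors.append(abs(x - 2))

# each error is 3/4 of the previous one: linear convergence
ratios = [errors[i + 1] / errors[i] for i in range(3)]
print(x)  # 2.031640625
```

Every ratio equals 3/4, confirming that the error shrinks by a constant factor rather than quadratically.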
E10) The equation x⁴ - 4x² + 4 = 0 has a double root at x = √2. Starting with x₀ = 1.5, compute three successive approximations to the root by the Newton-Raphson method. Does the result converge quadratically or linearly?
3.5 SUMMARY
3.6 SOLUTIONS/ANSWERS
E1) i) Let f(x) = x log₁₀ x - 1.2
We have to first find two numbers a and b such that f(a) f(b) < 0. Since the function log₁₀ x is defined only for positive values of x, we consider only positive numbers x. Let us take x = 1, 2, 3, .... Then, using a calculator,
f(1) = 1 · (log₁₀ 1) - 1.2 = -1.2 < 0
This shows that f(2) f(3) < 0 and therefore a root lies in ]2, 3[. Now put a = 2 and b = 3. Then the first approximation of the root is
We find f(x₂) = -0.0004 < 0. Therefore the root lies in the interval ]2.7402, 3[. The third approximation is obtained as
Since x₂ and x₃ rounded off to three decimal places are the same, we stop the process here. Hence the desired approximate value of the root rounded off to three decimal places is 2.740.
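The successive approximations in E1 follow the regula-falsi pattern (each step replaces an endpoint of the bracketing interval by the x-intercept of the chord). A sketch of that computation, with the interval ]2, 3[ found above and an arbitrary stopping tolerance:

```python
import math

def regula_falsi(f, a, b, tol=1e-4, max_iter=100):
    """Regula falsi: replace an endpoint by the x-intercept of the chord."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    x = a
    for _ in range(max_iter):
        x_new = a - fa * (b - a) / (fb - fa)
        fx = f(x_new)
        if abs(x_new - x) < tol:
            return x_new
        if fa * fx < 0:
            b, fb = x_new, fx
        else:
            a, fa = x_new, fx
        x = x_new
    return x

f = lambda x: x * math.log10(x) - 1.2
root = regula_falsi(f, 2.0, 3.0)
```

The iterates settle near 2.7406, matching the hand computation to three decimal places.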
ii) Let f(x) = x sin x - 1
Since f(0) = -1 and f(2) = 0.818594854, a root lies in the interval ]0, 2[. The first approximation is
and
x₄ = 1.11415714
Since x₃ and x₄ rounded off to three decimal places are the same, we stop the process here. Hence the desired root is 1.114.
E2) Let f(x) = x² - 2x - 1. Starting with x₀ = 2.6 and x₁ = 2.5, the successive approximations are
x₂ = 2.41935484
Similarly you can calculate that
x₄ = 2.41421384
and
x₅ = 2.41421356
Since x₄ and x₅ rounded off to 5 decimal places are the same, we stop the process here. Therefore the required root rounded off to 5 decimal places is 2.41421.
Now we compare this root with the exact root 1 + √2. Using a calculator we get 1 + √2 = 2.41421, rounded off to five decimal places. Hence the computed root and the exact root are the same when we round off to five decimal places.
The table shows that x₅ and x₄ are correct to three decimal places. Therefore we stop the process here. Hence the root correct to three decimal places is 1.731.
ii) In the secant method we start with the two approximations a = 1 and b = 2. Then the first approximation is the same as in part (i), namely
x₁ = 1.57142
To calculate the next approximation x₂ we take b and x₁. Here also we are getting the same value as in part (i), namely
x₂ = 1.70540
Then we take x₁ = 1.57142 and x₂ = 1.70540 to get the third approximation x₃. We have
The rest of the values are given by
and
x₅ = 1.73205
Since x₄ and x₅ rounded off to three decimal places are the same, we stop here. Hence the root is 1.732, rounded off to three decimal places.
Let us now compare the two methods. We first note that |xᵢ₊₁ - xᵢ| gives the error after the ith iteration.
In the regula-falsi method, the error after the 5th iteration is
|x₅ - x₄| = |1.73194 - 1.73140| = 0.00054
This shows that the error in the case of the secant method is smaller than that in the regula-falsi method for the same number of iterations.
E4) The given function f(x) = x³ - 4x + 1 and its derivative f'(x) = 3x² - 4 are continuous everywhere.
The initial approximation is x₀ = 0.
The iteration formula is
xᵢ₊₁ = xᵢ - (xᵢ³ - 4xᵢ + 1)/(3xᵢ² - 4)
The second approximation is given by
Similarly we get
x₃ = 0.254101
Since x₂ and x₃ rounded off to four decimal places are the same, we stop the iteration here. Hence the root is 0.2541.
E5) Let f(x) = 2x - 2 - sin x. Then f(x) and f'(x) are continuous everywhere. Starting with x₀ = 1.5, we compute the iterated values by the Newton-Raphson formula. The first iteration is
x₁ = 1.5 - [2(1.5) - 2 - sin(1.5)] / [2 - cos(1.5)]
Similarly,
x₂ = 1.498701

= 2.828431
and x₃ = 2.828427
Since |x₃ - x₂| < 10⁻⁴, we stop the iteration. Therefore the approximate root is 2.8284.
xᵢ = (1/2)[xᵢ₋₁ + A/xᵢ₋₁], i = 1, 2, ...

i.e. (α + a)² > |b|.
But we have α + β = -a and αβ = b, so α + a = -β. Therefore we get
β² > |b| = |α| |β|,
i.e. |α| < |β|.
E10) The iteration formula is
xᵢ₊₁ = xᵢ - (xᵢ² - 2)/(4xᵢ)
The three successive iterations are
x₁ = 1.458333333
x₂ = 1.436607143
x₃ = 1.425497619
Then we get ε₁ ≈ (1/2) ε₀ and ε₂ ≈ (1/2) ε₁. This shows that the sequence is not quadratically convergent; it is linearly convergent.
UNIT 4 APPROXIMATE ROOTS OF
POLYNOMIAL EQUATIONS
Structure
4.1 Introduction
Objectives
4.2 Some Results on Roots of Polynomial Equations
4.3 Birge-Vieta Method
4.4 Graeffe's Root Squaring Method
4.5 Summary
4.6 Solutions/Answers
4.1 INTRODUCTION
In the last two units we discussed methods for finding approximate roots of the equation f(x) = 0. In this unit we restrict our attention to polynomial equations. Recall that a polynomial equation is an equation of the form f(x) = 0 where f(x) is a polynomial in x. Polynomial equations arise very frequently in all branches of science, especially in physical applications. For example, the stability of electrical or mechanical systems is related to the real part of one of the complex roots of a certain polynomial equation. Thus there is a need to find all roots, real and complex, of a polynomial equation. The four iteration methods we have discussed so far apply to polynomial equations also. But you have seen that all those methods are time consuming. Thus it is necessary to find some efficient methods for obtaining roots of polynomial equations.
The sixteenth-century French mathematician Francois Vieta was the pioneer in developing methods for finding approximate roots of polynomial equations. Later, several other methods were developed for solving polynomial equations. In this unit we shall discuss two simple methods: the Birge-Vieta method and Graeffe's root squaring method. To apply these methods we should have some prior knowledge of the location and nature of the roots of a polynomial equation. You are already familiar with some results regarding the location and nature of roots from the elementary algebra course MTE-04. We shall begin this unit by listing some of the important results about the roots of polynomial equations.
Objectives
After reading this unit you should be able to :
apply the following methods for finding approximate roots of polynomial equations
i) Birge-Vieta method
ii) Graeffe's root squaring method.
list the advantages of the above methods over the methods discussed in the earlier
units.
The main contribution to the study of polynomial equations is due to the French mathematician Rene Descartes. The results appeared in the third part of his famous work 'La Geometrie', which means 'The Geometry'.
Approximate Roots of Polynomial Equations
Coefficient of x⁰ : a₀ = b₀ - α b₁, i.e. b₀ = a₀ + α b₁
It is easy to perform the calculations if we write the coefficients of p(x) on a line and perform the calculations bₖ = aₖ + α bₖ₊₁ below aₖ, as given in the table below.
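The rule bₖ = aₖ + α bₖ₊₁ translates directly into code (a minimal sketch; α is named x0 here, and coefficients are listed from the leading term down to the constant):

```python
def synthetic_division(coeffs, x0):
    """Divide p(x) (coeffs from highest degree to constant) by (x - x0).

    Returns (quotient_coeffs, remainder); the remainder equals p(x0)."""
    b = [coeffs[0]]  # b_n = a_n
    for a in coeffs[1:]:
        b.append(a + x0 * b[-1])  # b_k = a_k + x0 * b_{k+1}
    return b[:-1], b[-1]

# E1 below: divide 2x^3 - 5x^2 + 3x - 1 by (x - 2)
q, r = synthetic_division([2, -5, 3, -1], 2)
print(q, r)  # [2, -1, 1] 1
```

So the quotient is 2x² - x + 1 and the remainder is 1, i.e. p(2) = 1.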
Solution : Here p(x) is a polynomial of degree 5. If a₅, a₄, a₃, a₂, a₁, a₀ are the coefficients of p(x), then Horner's table in this case is
Table 2
q₄(x) = x⁴ - 3x³ - x² + 5x + 19
E1) Find the quotient and the remainder when 2x³ - 5x² + 3x - 1 is divided by x - 2.
E2) Using synthetic division, check whether α = 3 is a root of the polynomial equation x⁴ + x³ - 13x² - x + 12 = 0, and find the quotient polynomial.
Theorem 3 : Suppose that z = a + ib is a root of the polynomial equation p(x) = 0. Then the conjugate of z, namely z̄ = a - ib, is also a root of the equation p(x) = 0, i.e. complex roots occur in conjugate pairs.
To apply this rule, we count the changes in the sign of the coefficients. Going from left to right, there are changes between 1 and -15, between -15 and 7, and between 7 and -11. The total number of sign changes is 3, and hence the equation cannot have more than three positive roots.
Here there is only one change, between 1 and -15, and hence the equation cannot have more than one negative root.
We now give another theorem which helps us in locating the real roots.
Theorem 5 : Let p(x) = 0 be a polynomial equation of degree n ≥ 1. Let a and b be two real numbers with a < b. Suppose further that p(a) ≠ 0 and p(b) ≠ 0. Then,
i) if p(a) and p(b) have opposite signs, the equation p(x) = 0 has an odd number of roots
between a and b.
ii) if p(a) and p(b) have like signs, then p(x) = 0 either has no root or an even number of
roots between a and b.
Note : In this theorem the multiplicity of each root is taken into consideration, i.e. if α is a root of multiplicity k it has to be counted k times.
Corollary 1 :An equation of odd degree with real coefficients has at least one real root
whose sign is opposite to that of the last term.
Corollary 2 :An equation of even degree whose constant term has the sign opposite to that
of the leading coefficient, has at least two real r q t s one positive and the other negative.
Corollary 3 :The result given in Theorem 5(i) is the generalisation of the Intermediate
value theorem.
The relationship between roots and coefficients of a polynomial equation is given below.
Now, you can try to solve some problems using the above theorems.
E3) How many negative roots does the equation 3x⁷ + x⁵ + 4x³ + 10x - 6 = 0 have? Also determine the number of positive roots, if any.
E4) Show that the biquadratic equation
p(x) = x4 + x3 - 2x2 + 4x - 24 = 0 has at least two real roots one positive and the other
negative.
In the next section we shall discuss one of the simple methods for solving polynomial
equations.
4.3 BIRGE-VIETA METHOD
We shall now discuss the Birge-Vieta method for finding the real roots of a polynomial equation. This method is based on an original method due to two English mathematicians, Birge and Vieta. This method is a modified form of the Newton-Raphson method.
However, this is the most inefficient way of evaluating a polynomial, because of the amount of computation involved and also due to the possible growth of round-off errors. Thus there is a need to look for some efficient method for evaluating pₙ(x) and p'ₙ(x).
Let us consider the evaluation of pₙ(x) and p'ₙ(x) at x₀ using Horner's method as discussed in the previous section.
We have
pₙ(x) = (x - x₀) qₙ₋₁(x) + r₀
where
qₙ₋₁(x) = bₙ xⁿ⁻¹ + bₙ₋₁ xⁿ⁻² + ... + b₂ x + b₁
and b₀ = pₙ(x₀) = r₀ ... (8)
We have already discussed in the previous section how to find bᵢ, i = 1, 2, ..., n.
Next we shall find the derivative p'ₙ(x₀) using Horner's method. We divide qₙ₋₁(x) by (x - x₀) using Horner's method. That is, we write
qₙ₋₁(x) = (x - x₀) qₙ₋₂(x) + c₁
with the coefficients cᵢ computed as in the following table.
Table 3
Comparing (9) and (12), we get
p'ₙ(x₀) = qₙ₋₁(x₀) = c₁
Hence the Newton-Raphson method (Eqn. (6)) simplifies to
xᵢ₊₁ = xᵢ - b₀/c₁
Table 4
Table 5
-6 8 8 4 4 0
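The two synthetic-division passes, one for b₀ = pₙ(x₀) and one for c₁ = p'ₙ(x₀), can be sketched together (a minimal version; the polynomial in the usage line is the one from E6 below):

```python
def horner_eval_and_derivative(coeffs, x0):
    """Return (p(x0), p'(x0)) using two simultaneous synthetic-division passes.

    coeffs runs from the leading coefficient down to the constant."""
    b = coeffs[0]
    c = coeffs[0]
    for a in coeffs[1:-1]:
        b = a + x0 * b       # first pass: quotient coefficients
        c = b + x0 * c       # second pass on the quotient: builds p'(x0)
    b = coeffs[-1] + x0 * b  # final step gives b_0 = p(x0)
    return b, c

# p(x) = -8x^5 + 7x^4 - 6x^3 + 5x^2 - 4x + 3 at x0 = 0.5
p, dp = horner_eval_and_derivative([-8, 7, -6, 5, -4, 3], 0.5)
print(p, dp)  # 1.6875 -2.5
```

This interleaves the b and c recurrences instead of tabulating them separately, but the arithmetic is exactly that of Tables 3 to 5.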
E5) Using synthetic division, show that 2 is a simple root of the equation
p(x) = x4 - 2x3 - 7x2 + 8x + 12 = 0.
E6) Evaluate p (0.5) and p'(0.5) for
p(x) = -8x5 + 7x4 - 6x3 + 5x2 - 4x + 3
Now we shall illustrate why this method is more efficient than the direct method. Let us consider an example. Suppose we want to evaluate the polynomial
p(x) = -8x⁵ + 7x⁴ - 6x³ + 5x² - 4x + 3
for any given x.
When we evaluate by the direct method, we compute each power of x by multiplying the preceding power of x by x; this takes 4 multiplications, and multiplying each power by its coefficient takes 5 more, i.e. 9 multiplications in all. When we use Horner's method the total number of multiplications is 5. The number of additions in both cases is the same. This shows that less computation is involved when using Horner's method, which thereby reduces the error in computation.
Example 3 : Use the Birge-Vieta method to find all the positive real roots, rounded off to three decimal places, of the equation
x⁴ + 7x³ + 24x² + x - 15 = 0
Solution : The given equation is of degree 4. Therefore, by Theorem 1, this equation has 4 roots. Since there is only one change of sign in the coefficients of this equation, Descartes' rule of signs (see Theorem 4) states that the equation can have at most one positive real root.
Now let us examine whether the equation has a positive real root.
Since p₄(0) = -15 and p₄(1) = 18, by the Intermediate value theorem, the equation has a root lying in ]0, 1[.
We take x₀ = 0.5 as the initial approximation to the root. The first iteration is given by
Now we evaluate p₄(0.5) and p'₄(0.5) using Horner's method. The results are given in the following table.
Table 6
Therefore x₁ = 0.5 - (-7.5625 / 30.75) = 0.7459
Therefore x₂ = 0.7459 - (2.3132 / 50.1469) = 0.6998
Table 8
1 7 24 1 -15
0.6998 | 0.6998 5.3881 20.5649 15.0905
Since x₃ and x₄ are the same, we get |x₄ - x₃| < 0.0001 and therefore we stop the iteration here. Hence the approximate value of the root rounded off to three decimal places is 0.698.
Next we shall illustrate how the Birge-Vieta method helps us to find all the real roots of a polynomial equation.
If α is a root of the equation p(x) = 0, then p(x) is exactly divisible by x - α, that is, b₀ = 0. In finding the approximations to the root by the Birge-Vieta method, we find that b₀ approaches zero (b₀ → 0) as xᵢ approaches α (xᵢ → α). Hence, if xₙ is taken as the final approximation to the root satisfying the criterion |xₙ - xₙ₋₁| < ε, then to this approximation the required quotient is
qₙ₋₁(x) = bₙ xⁿ⁻¹ + bₙ₋₁ xⁿ⁻² + ... + b₁
where the bᵢ's are obtained by using xₙ and Horner's method. This polynomial is called the deflated polynomial or reduced polynomial. The next root is now obtained using qₙ₋₁(x) and not pₙ(x). Continuing this process, we can successively reduce the degree of the polynomial and find one real root at a time.
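Putting the pieces together, Newton steps driven by the two Horner passes, followed by deflation, can be sketched as follows (a minimal version; the usage line applies it to x³ + x - 3 = 0 with starting value 1.1, as in Example 4 below):

```python
def birge_vieta(coeffs, x0, tol=1e-4, max_iter=50):
    """Find one real root by the Birge-Vieta method and deflate.

    coeffs runs from the leading coefficient down to the constant.
    Returns (root, deflated_coeffs)."""
    x = x0
    for _ in range(max_iter):
        b = [coeffs[0]]
        for a in coeffs[1:]:
            b.append(a + x * b[-1])    # b_0 = p(x) is b[-1]
        c = [b[0]]
        for bk in b[1:-1]:
            c.append(bk + x * c[-1])   # c_1 = p'(x) is c[-1]
        x_new = x - b[-1] / c[-1]      # Newton step x - b0/c1
        if abs(x_new - x) < tol:
            x = x_new
            break
        x = x_new
    # deflate: quotient coefficients computed with the final approximation
    q = [coeffs[0]]
    for a in coeffs[1:-1]:
        q.append(a + x * q[-1])
    return x, q

root, q = birge_vieta([1, 0, 1, -3], 1.1)
```

The remaining roots are then sought in the deflated polynomial q rather than in the original one.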
Example 4 : Find all the roots of the polynomial equation p₃(x) = x³ + x - 3 = 0 rounded off to three decimal places. Stop the iteration whenever |xᵢ₊₁ - xᵢ| < 0.0001.
Solution : The equation p₃(x) = 0 has three roots. Since there is only one change in the sign of the coefficients, by Descartes' rule of signs the equation can have at most one positive real root. The equation has no negative real root since p₃(-x) = 0 has no change of sign in its coefficients. Since p₃(x) = 0 is of odd degree, it has at least one real root. Hence the given equation x³ + x - 3 = 0 has one positive real root and a complex pair. Since p(1) = -1 and p(2) = 7, by the intermediate value theorem the equation has a real root lying in the interval ]1, 2[. Let us find the real root using the Birge-Vieta method. Let the initial approximation be 1.1.
First iteration
Table 10
Similarly, we obtain
x₂ = 1.21347
Since |x₃ - x₂| < 0.0001, we stop the iteration here. Hence the required value of the root is 1.213, rounded off to three decimal places. Next let us obtain the deflated polynomial of p₃(x). To get the deflated polynomial, we have to find the polynomial q₂(x) by using the final approximation x₃ = 1.213 (see Table 11).
Table 11
Table 11
using the Birge-Vieta method, starting with the initial approximation x₀ = -2. Stop the iteration whenever |xᵢ₊₁ - xᵢ| < 0.4 × 10⁻².
E8) Find all the roots of the equation x³ - 2x - 5 = 0 using the Birge-Vieta method.
E9) Find the real root, rounded off to two decimal places, of the equation x⁴ - 4x³ - 3x + 23 = 0 lying in the interval ]2, 3[ by the Birge-Vieta method.
4.4 GRAEFFE'S ROOT SQUARING METHOD
In the last section we discussed a method for finding the real roots of polynomial equations. Here we shall discuss a direct method for solving polynomial equations. This method was developed independently by three mathematicians: Dandelin, Lobachevsky and Graeffe. But Graeffe's name is usually associated with this method. The advantage of this method is that it finds all the roots of a polynomial equation simultaneously; the roots may be real and distinct, real and equal (multiple) or complex.
The underlying idea of the method is based on the following fact: Suppose β₁, β₂, ..., βₙ are the n real and distinct roots of a polynomial equation of degree n such that they are widely separated, that is,
|β₁| >> |β₂| >> |β₃| >> ... >> |βₙ|
where >> stands for 'much greater than'. Then we can obtain the roots approximately from the coefficients of the polynomial equation as follows:
Let the polynomial equation whose roots are β₁, β₂, ..., βₙ be
a₀xⁿ + a₁xⁿ⁻¹ + ... + aₙ = 0
Using the relations between the roots and the coefficients of the polynomial as given in Sec. 4.2, we get
β₁ + β₂ + ... + βₙ = -a₁/a₀, β₁β₂ + β₁β₃ + ... = a₂/a₀, and so on. ... (14)
Since |β₁| >> |β₂| >> |β₃| >> ... >> |βₙ|, we have from (14) the approximations
β₁ ≈ -a₁/a₀, β₁β₂ ≈ a₂/a₀, β₁β₂β₃ ≈ -a₃/a₀, ...
These approximations can be simplified as
|βₖ| ≈ |aₖ/aₖ₋₁|, k = 1, 2, ..., n ... (16)
So the problem now is to find, from the given polynomial equation, a polynomial equation whose roots are widely separated. This can be done by the method which we shall describe now.
In the present course we shall discuss the application of the method only to polynomial equations whose roots are real and distinct.
Let α₁, α₂, ..., αₙ be the n real and distinct roots of the polynomial equation of degree n given by
a₀xⁿ + a₁xⁿ⁻¹ + ... + aₙ₋₁x + aₙ = 0 ... (17)
where a₀, a₁, a₂, ..., aₙ₋₁, aₙ are real numbers and aₙ ≠ 0. We rewrite Eqn. (17) by collecting all the even-power terms on one side and all the odd-power terms on the other side, and then square both sides.
Now we expand both the right and left hand sides and simplify by collecting the coefficients. We get
Putting x² = -y in Eqn. (19), we obtain a new equation given by
b₀yⁿ + b₁yⁿ⁻¹ + ... + bₙ = 0 ... (20)
where
b₀ = a₀²
bₙ = aₙ²
and the remaining bₖ are computed as described below.
The following table helps us to compute the coefficients b₀, b₁, ..., bₙ of Eqn. (20) directly from Eqn. (17).
Table 12
a₀   a₁       a₂       a₃       ...  aₙ
a₀²  a₁²      a₂²      a₃²      ...  aₙ²
0    -2a₀a₂   -2a₁a₃   -2a₂a₄   ...  0
0    0        2a₀a₄    2a₁a₅    ...  0
0    0        0        -2a₀a₆   ...  0
b₀   b₁       b₂       b₃       ...  bₙ
To form Table 12 we first write the coefficients a₀, a₁, a₂, ..., aₙ as the first row. Then we form (n + 1) columns as follows.
The terms in each column alternate in sign, starting with a positive sign. The first term in each column is the square of the coefficient aₖ, k = 0, 1, 2, ..., n. The second term in each column is twice the product of the nearest neighbouring coefficients, if there are any, with negative sign; otherwise we put it as zero. For example, the second term in the first column is zero and the second term in the second column is -2a₀a₂. Likewise the second term of the (k + 1)th column is -2aₖ₋₁aₖ₊₁. The third term in the (k + 1)th column is twice the product of the next neighbouring coefficients aₖ₋₂ and aₖ₊₂, if there are any; otherwise we put it as zero. This procedure is continued until there are no coefficients available to form the cross products. Then we add all the terms in each column. The sum gives the coefficient bₖ for k = 0, 1, 2, ..., n, which is listed as the last term in each column. Since the substitution x² = -y is used, it is easy to see that if α₁, α₂, ..., αₙ are the n roots of Eqn. (17), then -α₁², -α₂², ..., -αₙ² are the roots of Eqn. (20).
Thus, starting with a given polynomial equation, we have obtained another polynomial equation whose roots are the squares of the roots of the original equation, with negative sign.
We repeat the procedure for Eqn. (20) and obtain another equation whose roots are the squares of the roots of Eqn. (20) with a negative sign, i.e., they are the fourth powers of the roots of the original equation with a negative sign. Let this procedure be repeated m times. Then we obtain an equation ... (21)
whose roots y₁, y₂, ..., yₙ are given by
yᵢ = -αᵢ^(2^m), i = 1, 2, ..., n
Now, since all the roots of Eqn. (17) are real and distinct, we have
|α₁| > |α₂| > ... > |αₙ|
Hence |y₁| >> |y₂| >> ... >> |yₙ| when m is large. We conclude that if the roots of Eqn. (17) are distinct, then for large m the 2^m-th powers of the roots are widely separated.
We stop this squaring process when the cross-product terms become negligible in comparison with the square terms.
Since the roots of Eqn. (21) are widely separated, we calculate the absolute values of the roots y₁, y₂, ..., yₙ using Eqn. (16). We have
|yₖ| ≈ |bₖ/bₖ₋₁|, k = 1, 2, ..., n
where b₀, b₁, ..., bₙ now denote the coefficients of Eqn. (21). The magnitudes of the roots of the original equation are therefore given by
|αₖ| ≈ |bₖ/bₖ₋₁|^(1/2^m)
This gives the magnitudes of the roots. To determine the sign of a root, we substitute these approximations in the original equation and verify whether the positive or the negative value satisfies it.
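The squaring table and the magnitude formula can be sketched as follows (a minimal version for real, distinct, nonzero roots; the cubic in the usage line is the one solved in Example 5 below):

```python
def graeffe_step(a):
    """One root-squaring step (Table 12): from p(x) with coefficients a
    (leading first), build the coefficients of the polynomial whose roots
    are the squares of the roots of p, with negative sign."""
    n = len(a) - 1
    b = []
    for k in range(n + 1):
        s = a[k] * a[k]          # square term, always positive sign
        sign, j = -1, 1
        while k - j >= 0 and k + j <= n:
            s += 2 * sign * a[k - j] * a[k + j]  # alternating cross products
            sign, j = -sign, j + 1
        b.append(s)
    return b

def graeffe_root_magnitudes(a, m=3):
    """Apply m squarings, then read off |root_k| ~ |b_k / b_{k-1}|**(1/2**m)."""
    for _ in range(m):
        a = graeffe_step(a)
    return [abs(a[k] / a[k - 1]) ** (1.0 / 2 ** m) for k in range(1, len(a))]

mags = graeffe_root_magnitudes([1, -15, 62, -72], m=3)
```

After three squarings the magnitudes come out close to 9, 4 and 2, in agreement with Example 5.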
Example 5 : Find all the roots of the cubic equation x³ - 15x² + 62x - 72 = 0 by Graeffe's method using three squarings.
Solution : Since there is no change of sign in the coefficients of p(-x) = -x³ - 15x² - 62x - 72, the equation has no negative real roots. Let us now apply the root squaring method successively. Then we get the following results:
First Squaring
Table 13
Applying the squaring method to the new equation we get the following results.
Second Squaring
Table 14
Third Squaring
Table 15
26873856 1788688 6833 1
After three squarings, the roots y₁, y₂ and y₃ of the resulting equation are given by
Hence the roots α₁, α₂, α₃ of the original equation are
α₁ ≈ 9.0017, α₂ ≈ 4.0011, α₃ ≈ 1.9990
Since the equation has no negative real roots, all the roots are positive. Hence the roots can be taken as 9.0017, 4.0011 and 1.9990. If the approximations are rounded to 2 decimal places, we have the roots as 9, 4 and 2. Alternatively, we can substitute the approximate roots in the given equation and determine their signs.
E10) Determine all the roots of the following equations by Graeffe's root squaring method, using three squarings.
ii) x³ - 2x² - 5x + 6 = 0
We have seen that Graeffe's root squaring method obtains all the real roots simultaneously. There is considerable saving in time also. The method can be extended to find multiple and complex roots as well. However, the method is not efficient for finding these roots, and we shall not discuss these extensions.
We shall end this block by summarising what we have covered in this unit.
4.5 SUMMARY
Therefore x1 = 2 - 10/49 = 1.796
...
Therefore x3 = 1.7425 - 0.1018/28.8770 = 1.7390
Since |x3 - x2| = 0.0035 < 0.4 × 10⁻², we conclude that 1.7390 is the approximate root.
E8) Let p(x) = x³ - 2x - 5.
Since there is only one change in the sign of the coefficients of p(x), the equation has at most one real root. The equation has no negative real root since there is no change in the sign of the coefficients of p(-x). Also
p(2) = -1 < 0
and
p(3) = 16 > 0.
Therefore a root lies in ]2, 3[. Using x0 = 2.5 as an initial approximation to the root, you can show that 2.0945 is an approximation to the real root.
The deflated polynomial is given by the following table
= -1.0473 ± 1.1359i
Hence the roots are given by 2.0945, -1.0473 + 1.1359i and -1.0473 - 1.1359i.
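The real root quoted in E8 can be reproduced with a short Newton-Raphson sketch (an assumed implementation, not taken from the unit; the function names are illustrative):

```python
def newton(f, fprime, x0, tol=1e-6, max_iter=50):
    # Newton-Raphson iteration: x_(k+1) = x_k - f(x_k)/f'(x_k)
    x = x0
    for _ in range(max_iter):
        step = f(x) / fprime(x)
        x -= step
        if abs(step) < tol:
            break
    return x

# p(x) = x^3 - 2x - 5, starting from x0 = 2.5 as in E8
root = newton(lambda x: x**3 - 2*x - 5, lambda x: 3*x**2 - 2, 2.5)
```

Starting from x0 = 2.5 this converges to 2.0945..., in agreement with the answer quoted above.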
First squaring
Second squaring
Third squaring
Hence the roots of the original equation are given by
Substituting the computed values in the original equation, we get that the roots
are approximately - 10, 2.18 and 1.83. Therefore the roots are -10, 2 and 2.
ii) The computed values of the roots are 3.014443, 1.991424 and 0.9994937.
iii) The computed values of the roots are 7.017507, -2.974432 and 0.9581706.
UNIT 5 DIRECT METHODS
Structure
5.1 Introduction
5.2 Preliminaries
5.3 Cramer's Rule
5.4 Direct Methods for Special Matrices
5.5 Gauss Elimination Method
5.6 LU Decomposition Method
5.7 Summary
5.8 Solutions/Answers
Objectives
After studying this unit, you should be able to:
state the difference between the direct and iterative methods of solving a system of linear algebraic equations;
obtain the solution of a system of linear algebraic equations by using direct methods such as Cramer's rule, the Gauss elimination method and the LU decomposition method;
use the pivoting technique while transforming the coefficient matrix to upper or lower triangular form.
Two matrices A = (aij) and B = (bij) are equal iff they have the same number of rows and columns and their corresponding elements are equal, that is, aij = bij for all i, j.
You must also be familiar with the addition and multiplication of matrices.
Addition of matrices is defined only for matrices of the same order. The sum C = A + B of two matrices A and B is obtained by adding the corresponding elements of A and B, i.e., cij = aij + bij.
For example, if A = [ -4 6 3 ; 0 1 2 ] and B = [ 5 -1 0 ; 3 1 0 ], then
C = A + B = [ 1 5 3 ; 3 2 2 ].
That is, to obtain the (i, k)th element of AB, take the ith row of A and the kth column of B, multiply their corresponding elements and add up all these products. For
example, if
Note that two matrices A and B can be multiplied only if the number of columns of
A equals the number of rows of B. In the above example the product B A is not
defined.
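The row-by-column rule translates directly into code. The following is a minimal, illustrative Python sketch (not part of the unit):

```python
def matmul(a, b):
    # (i, k)th entry of AB = ith row of A times kth column of B.
    n, m, p = len(a), len(b), len(b[0])
    assert len(a[0]) == m  # columns of A must equal rows of B
    return [[sum(a[i][j] * b[j][k] for j in range(m)) for k in range(p)]
            for i in range(n)]

product = matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

The assertion mirrors the compatibility condition stated above: the product exists only when the number of columns of A equals the number of rows of B.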
The matrix obtained by interchanging the rows and columns of A is called the transpose of A and is denoted by Aᵀ.
For example, if A = [ -1 3 ; 1 2 ], then Aᵀ = [ -1 1 ; 3 2 ].
The determinant is a number associated with square matrices.
det (A) = a11 det [ a22 a23 ; a32 a33 ] - a12 det [ a21 a23 ; a31 a33 ] + a13 det [ a21 a22 ; a31 a32 ]
E1) Calculate det (A).
If the determinant of a square matrix A has the value zero, then the matrix A is called
a singular matrix, otherwise, A is called a nonsingular matrix.
Definition : For a matrix A = (aij), the cofactor Aij of the element aij is given by
Aij = (-1)^(i+j) Mij
where Mij (the minor) is the determinant of the matrix of order (n-1) × (n-1) obtained from A after deleting its ith row and jth column.
Definition : The matrix of cofactors associated with the n × n matrix A is an n × n matrix Aᶜ obtained from A by replacing each element of A by its cofactor.
Definition : A system of linear Eqns. (2) is said to be consistent if it has at least one
solution. If no solution exists, then the system is said to be inconsistent.
Definition : The system of Eqns. (2) is said to be homogeneous if b = 0, that is, all the elements b1, b2, ..., bn are zero; otherwise the system is called nonhomogeneous. In this unit, we shall consider only nonhomogeneous systems.
You also know from your linear algebra that the nonhomogeneous system of Eqns.
(2) has a unique solution, if the matrix A is nonsingular. You may recall the following
basic theorem on the solvability of linear systems (Ref. Theorem 4, Sec. 9.5, Unit 9,
Block 3, MTE-02).
3x1 + x2 + 2x3 = 3
2x1 - 3x2 - x3 = -3
x1 + 2x2 + x3 = 4
d1 = | 3 1 2 ; -3 -3 -1 ; 4 2 1 | = 8 (first column in A is replaced by the column vector b)
d2 = | 3 3 2 ; 2 -3 -1 ; 1 4 1 | = 16 (second column in A is replaced by the column vector b)
d3 = | 3 1 3 ; 2 -3 -3 ; 1 2 4 | = -8 (third column in A is replaced by the column vector b)
Using (3), we get the solution
While going through the example and attempting the exercises you must have observed that in Cramer's method we need to evaluate n+1 determinants, each of order n, where n is the number of equations. If the number of operations required to evaluate a determinant is measured in terms of multiplications only, then to evaluate a determinant of second order, i.e.,
Table 1
From the table, you will observe that as n increases, the number of operations required for Cramer's rule increases very rapidly. For this reason, Cramer's rule is not generally used for n > 4. Hence for solving large systems, we need more efficient methods. In the next section we describe some direct methods which depend on the form of the coefficient matrix.
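For small systems, Cramer's rule is nevertheless simple to code. The sketch below is illustrative (determinants are evaluated by Laplace expansion along the first row, which is acceptable only for small n); it solves the 3 × 3 system worked out above:

```python
def det(m):
    # Laplace expansion along the first row (fine for small n only)
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] *
               det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def cramer(a, b):
    # x_j = det(A_j) / det(A), where A_j is A with column j replaced by b
    d = det(a)
    xs = []
    for j in range(len(a)):
        aj = [row[:] for row in a]
        for i in range(len(a)):
            aj[i][j] = b[i]
        xs.append(det(aj) / d)
    return xs

x = cramer([[3, 1, 2], [2, -3, -1], [1, 2, 1]], [3, -3, 4])
```

For the system above, d = 8, d1 = 8, d2 = 16, d3 = -8, so x = (1, 2, -1).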
We now discuss three special forms of the matrix A in Eqn. (2) for which the solution vector x can be obtained directly.
Case 1 : A = D, where D is a diagonal matrix. In this case the system of Eqns. (2) is of the form
a11 x1 = b1
a22 x2 = b2
. . . . . . . . .
ann xn = bn
Note that in this case we need only n divisions to obtain the solution vector.
Case 2 : A = L, where L is a lower triangular matrix (aij = 0, j > i). The system of Eqns. (2) is now of the form
a11 x1 = b1
a21 x1 + a22 x2 = b2   (4)
. . . . . . . . .
an1 x1 + an2 x2 + an3 x3 + ... + ann xn = bn
and det (A) = a11 a22 ... ann.
You may notice here that the first equation of the system (4) contains only xl, the
second equation contains only xl and x2 and so on. Hence, we find xl from the first
equation, x2 from the second equation and proceed in that order till we get xn from
the last equation.
Since the coefficient matrix A is nonsingular, aii ≠ 0, i = 1, 2, ..., n. We thus obtain
x1 = b1/a11
x2 = (b2 - a21 x1)/a22
x3 = (b3 - a31 x1 - a32 x2)/a33
and in general
xi = (bi - Σ (j=1 to i-1) aij xj)/aii, i = 1, 2, ..., n
M = 1 + 2 +....+ n = n(n+1)/2.
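The general formula above translates line by line into a forward substitution routine. A minimal sketch (the lower triangular test system is illustrative, not from the text):

```python
def forward_substitution(L, b):
    # Solve Lx = b for lower triangular L:
    # x_i = (b_i - sum_{j < i} L[i][j] * x_j) / L[i][i]
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        s = sum(L[i][j] * x[j] for j in range(i))
        x[i] = (b[i] - s) / L[i][i]
    return x

x = forward_substitution([[1, 0, 0], [2, 1, 0], [1, 1, 2]], [1, 4, 7])
```

Each unknown is obtained with one division, plus the multiplications counted in M above.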
Case 3 : A = U, where U is an upper triangular matrix (aij = 0, j < i). The system (2) is now of the form
a11 x1 + a12 x2 + a13 x3 + ... + a1n xn = b1
a22 x2 + a23 x3 + ... + a2n xn = b2
a33 x3 + ... + a3n xn = b3   (6)
. . . . . . . . .
ann xn = bn
which can be solved for xn, xn-1, ..., x1, in that order, using the backward substitution method.
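Back substitution is the mirror image of the previous case: we solve the last equation first and work upwards. A minimal sketch (the upper triangular test system is illustrative):

```python
def back_substitution(U, b):
    # Solve Ux = b for upper triangular U, from the last equation up:
    # x_i = (b_i - sum_{j > i} U[i][j] * x_j) / U[i][i]
    n = len(b)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / U[i][i]
    return x

x = back_substitution([[2, 3, -1], [0, -2, -1], [0, 0, -5]], [5, -7, -15])
```

Here x3 = 3 comes from the last equation, then x2 = 2 and x1 = 1 follow in turn.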
In the above discussion you have observed that the system of Eqns. (2) can be easily solved if the coefficient matrix A in Eqns. (2) has one of the three forms D, L or U, or if it can be transformed to one of these forms. Now, you would like to know how to reduce the given matrix A into one of these three forms. One such method which transforms the matrix A to the form U is the Gauss elimination method, which we shall describe in the next section.
5.5 GAUSS ELIMINATION METHOD
Gauss elimination is one of the oldest and most frequently used methods for solving systems of algebraic equations. It is attributed to the famous German mathematician, Carl Friedrich Gauss (1777-1855). This method is the generalization of the familiar method of eliminating one unknown between a pair of simultaneous linear equations. You must have learnt this method in your linear algebra course (Ref.: Sec 8.4, Unit 8, Block 2, MTE-02). In this method the matrix A is reduced to the form U by
using the elementary row operations which include :
i) interchanging any two rows
ii) multiplying (or dividing) any row by a non-zero constant
iii) adding (or subtracting) a constant multiple of one row to another row.
The operation Ri → Ri + mRj is an elementary row operation; that is, add to the elements of the ith row m times the corresponding elements of the jth row. The elements of the jth row remain unchanged.
If any matrix A is transformed into another matrix B by a series of elementary row
operations, we say that A and B are equivalent matrices. Formally, we have the
following definition.
Definition : A matrix B is said to be row equivalent to a matrix A , if B can be obtained
from A by using a finite number of elementary row operations.
Also two linear systems Ax = b and A'x = b' are equivalent provided any solution
of one is a solution of the other. Thus, if a sequence of elementary operations on
Ax = b produces the new system A*x = b* then the systems Ax = b and A*x = b*
are equivalent.
To understand the Gauss elimination method let us consider a system of three equations:
a11 x1 + a12 x2 + a13 x3 = b1
a21 x1 + a22 x2 + a23 x3 = b2   (8)
a31 x1 + a32 x2 + a33 x3 = b3
Let a11 ≠ 0. In the first stage of elimination we multiply the first equation in Eqns. (8) by m21 = (-a21/a11) and add it to the second equation. Then multiply the first equation by m31 = (-a31/a11) and add it to the third equation. This eliminates x1 from the second and third equations. The new system, called the first derived system, then becomes
a11 x1 + a12 x2 + a13 x3 = b1
a22^(1) x2 + a23^(1) x3 = b2^(1)   (9)
a32^(1) x2 + a33^(1) x3 = b3^(1)
where
aij^(1) = aij - (ai1/a11) a1j,  bi^(1) = bi - (ai1/a11) b1,  i, j = 2, 3.
In the second stage of elimination we multiply the second equation in (9) by m32 = (-a32^(1)/a22^(1)), a22^(1) ≠ 0, and add it to the third equation. This eliminates x2 from the third equation. The new system, called the second derived system, becomes
where
a33^(2) = a33^(1) + m32 a23^(1),  b3^(2) = b3^(1) + m32 b2^(1).
You may note here that the system of Eqns. (11) is an upper triangular system of the form (6) and can be solved using the back substitution method provided a33^(2) ≠ 0.
Solution : To eliminate x1 from the second and third equations of the system (13), add -3/2 times the first equation to the second equation and add -(-2)/2 = 1 times the first equation to the third equation. We obtain the new system as
In the second stage, we eliminate x2 from the third equation of system (14). Adding -6/(-2) = 3 times the second equation to the third equation, we get
2x1 + 3x2 - x3 = 5
- 2x2 - x3 = -7   (15)
- 5x3 = -15
You may observe that we can write the above procedure more conveniently in matrix form. Since the arithmetic operations we have performed here affect only the elements of the matrix A and the vector b, we consider the augmented matrix [A|b] (the matrix A augmented by the vector b) and perform the elementary row operations on the augmented matrix.
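The augmented-matrix form of the procedure can be sketched as follows (an illustrative implementation without pivoting, so every pivot is assumed non-zero; the test system is the one solved by Cramer's rule in Sec. 5.3):

```python
def gauss_eliminate(a, b):
    # Reduce the augmented matrix [A|b] to upper triangular form,
    # then back substitute. No pivoting: all pivots must be non-zero.
    n = len(b)
    A = [row[:] + [b[i]] for i, row in enumerate(a)]   # [A|b]
    for k in range(n - 1):
        for i in range(k + 1, n):
            m = A[i][k] / A[k][k]          # multiplier m_ik
            for j in range(k, n + 1):
                A[i][j] -= m * A[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (A[i][n] - s) / A[i][i]
    return x

x = gauss_eliminate([[3, 1, 2], [2, -3, -1], [1, 2, 1]], [3, -3, 4])
```

The elimination stage produces exactly the derived systems (9) and (11); the final loop is the back substitution of Sec. 5.4.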
Definition : The diagonal elements a11, a22^(1) and a33^(2) which are used as divisors are called pivots.
You might have observed here that for a linear system of order 3, the elimination was performed in 3 - 1 = 2 stages. In general, for a system of n equations given by Eqns. (2), the elimination is performed in (n-1) stages. At the ith stage of elimination, we eliminate xi from the (i+1)th row upto the nth row. Sometimes, it may happen that the elimination process stops in less than (n-1) stages. But this is possible only when no equations containing the unknowns are left or when the coefficients of all the unknowns in the remaining equations become zero. Thus if the process stops at the rth stage of elimination then we get a derived system of the form
0 = bn^(r-1)
where r ≤ n and a11 ≠ 0, a22^(1) ≠ 0, ..., arr^(r-1) ≠ 0.
In the solution of a system of linear equations we can thus expect two different situations:
Solution : We have
In this case you can see that r < n and the elements b1, b2^(1) and b3^(2) are all non-zero. Since we cannot determine x3 from the last equation, the system has no solution. In such a situation we say that the equations are inconsistent. Also note that det (A) = 0, i.e., the coefficient matrix is singular.
A system of equations is inconsistent if it does not have a solution.
We now consider a situation in which not all b's are non-zero.
[A|b] = [ 16 22 4 | -2 ; 4 -3 2 | -11 ; 12 25 2 | 9 ]   R2 - (1/4)R1, R3 - (3/4)R1
Now in this case r < n and the elements b1, b2^(1) are non-zero, but b3^(2) is zero. Also the last equation is satisfied for any value of x3. Thus, we get
x3 = any value
No. of divisions
1st step of elimination : (n-1) divisions
2nd step of elimination : (n-2) divisions
. . . . . . . . . . . . . . . . . . . . .
(n-1)th step of elimination : 1 division
∴ Total number of divisions = (n-1) + (n-2) + ... + 1 = n(n-1)/2
[The sum of the first n natural numbers is Σ i = n(n+1)/2; the sum of the squares of the first n natural numbers is Σ i² = n(n+1)(2n+1)/6.]
No. of multiplications
1st step of elimination : n(n-1) multiplications
2nd step of elimination : (n-1)(n-2) multiplications
. . . . . . . . . . . . . . . . . . . . .
(n-1)th step of elimination : 2·1 multiplications
∴ Total number of multiplications = n(n-1) + (n-1)(n-2) + ... + 2·1 = n(n²-1)/3
Also the back substitution adds n divisions (one division at each step), and the number of multiplications added is
(n-1)th equation : 1 multiplication
(n-2)th equation : 2 multiplications
. . . . . . . . . . . . . . . . . . .
1st equation : (n-1) multiplications
Total multiplications = 1 + 2 + ... + (n-1) = n(n-1)/2
Total operations added by back substitution = n(n-1)/2 + n = n(n+1)/2
Thus to find the solution vector x using the Gauss elimination method, we need
n(n²-1)/3 + n(n-1)/2 + n(n+1)/2 = (n³ + 3n² - n)/3
operations. For large n, we may say that the total number of operations needed is n³/3 (approximately). Thus, we find that the Gauss elimination method needs far fewer operations than Cramer's rule.
are inconsistent.
E10) Use the Gauss elimination method to solve the system of equations
It is clear from the above that you can apply the Gauss elimination method to a system of equations of any order. However, what happens if one of the diagonal elements, i.e., one of the pivots in the triangularization process, vanishes? Then the method will fail. In such situations we modify the Gauss elimination method, and this procedure is called pivoting.
Pivoting
In the elimination procedure the pivots a11, a22^(1), ..., ann^(n-1) are used as divisors. If at any stage of the elimination one of these pivots, say aii^(i-1) (a11^(0) = a11), vanishes, then the elimination procedure cannot be continued further (see Example 8). Also, it may happen that the pivot aii^(i-1), though not zero, is very small in magnitude compared to the remaining elements in the ith column. Using a small number as a divisor may lead to the growth of the round-off error. In such cases the multipliers (e.g., -a(i+1),i^(i-1)/aii^(i-1)) will be larger than one in magnitude. The use of large multipliers will lead to magnification of errors both during the elimination phase and during the back substitution phase of the solution. To avoid this we rearrange the remaining rows (ith row upto the nth row) so as to obtain a non-vanishing pivot, or to make it the largest element in magnitude in that column. This strategy is called pivoting (see Example 9). Pivoting is of two types: partial pivoting and complete pivoting.
Partial Pivoting
In the first stage of elimination, the first column is searched for the largest element in magnitude, and this largest element is then brought to the position of the first pivot by interchanging the first row with the row having the largest element in magnitude in the first column. In the second stage of elimination, the second column is searched for the largest element in magnitude among the (n-1) elements, leaving the first element, and then this largest element in magnitude is brought to the position of the second pivot by interchanging the second row with the row having the largest element
in the second column. This searching and interchanging of rows is repeated in all the (n-1) stages of the elimination. Thus we have the following algorithm to find the pivot:
For i = 1, 2, ..., n, find j (i ≤ j ≤ n) such that
|aji| = max (i ≤ k ≤ n) |aki|
and interchange the ith and jth rows.
Complete Pivoting
In the first stage of elimination, we search the entire matrix A for the largest element in magnitude and bring it to the position of the first pivot. In the second stage of elimination we search the square matrix of order n-1 (leaving the first row and the first column) for the largest element in magnitude and bring it to the position of the second pivot, and so on. This requires at every stage of elimination not only the
interchanging of rows but also interchanging of columns. Complete pivoting is much more complicated and is not often used.
In this unit, by pivoting we shall mean only partial pivoting.
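Partial pivoting adds only a row search and an interchange to each elimination stage. A minimal sketch (illustrative; as a test it uses the ill-conditioned system (17) that appears in the examples below, whose exact solution is x1 = 10, x2 = 1):

```python
def gauss_solve_pivot(a, b):
    # Gauss elimination with partial pivoting on the augmented matrix.
    n = len(b)
    A = [row[:] + [b[i]] for i, row in enumerate(a)]   # [A|b]
    for k in range(n - 1):
        # bring the largest |A[i][k]|, i >= k, into the pivot position
        p = max(range(k, n), key=lambda i: abs(A[i][k]))
        A[k], A[p] = A[p], A[k]
        for i in range(k + 1, n):
            m = A[i][k] / A[k][k]
            for j in range(k, n + 1):
                A[i][j] -= m * A[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (A[i][n] - s) / A[i][i]
    return x

# system (17): 0.0003 x1 + 1.566 x2 = 1.569, 0.3454 x1 - 0.436 x2 = 3.018
x = gauss_solve_pivot([[0.0003, 1.566], [0.3454, -0.436]], [1.569, 3.018])
```

The row interchange makes 0.3454, not 0.0003, the divisor, so the multipliers stay small in magnitude.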
Let us now understand the pivoting procedure through examples.
Solution : Let us first attempt to solve the system without pivoting. We have
Note that in the above matrix the second pivot has the value zero and the elimination procedure cannot be continued further unless pivoting is used.
Let us now use the partial pivoting. In the first column 3 is the largest element.
Interchanging the rows 1 and 2, we have
In the second column, 1 is the largest element in magnitude leaving the first element.
Interchanging the second and third rows we have
You may observe here that the resultant matrix is in triangular form and no further elimination is required. Using the back substitution method, we obtain the solution
x3 = 2, x2 = 1, x1 = 3.
we obtain x1 = 3.333,
which is highly inaccurate compared to the exact solution.
With Pivoting
We interchange the first and second equations in (17) and get
0.3454 x1 - 0.436 x2 = 3.018
0.0003 x1 + 1.566 x2 = 1.569
we obtain
Further, one has to be careful in the selection of the pivotal equation for each step. For each step the pivotal equation must be selected on the basis of the current state of the system under consideration, i.e., without foreknowledge of the effect of the selection on later steps. For this, we calculate initially the size di of row i of A, for i = 1, ..., n, where di is the number
di = max (j) |aij|
At the beginning of, say, the kth step of elimination, we pick as pivotal equation the one from those available which has the absolutely largest coefficient of xk relative to the size of the equation. This means that the integer j is selected between k and n for which |ajk|/dj is largest.
We can also store the multipliers in the working array W instead of storing zeros. That is, if p1 is the first pivotal equation and we use the multipliers m(pi,1), i = 2, ..., n, to eliminate x1 from the remaining (n-1) positions of the first column, then in the first column we can store the multipliers m(pi,1), i = 2, ..., n, instead of storing zeros.
Let us now solve the following system of linear equations by scaled partial pivoting, storing the multipliers and maintaining the pivotal vector.
and d1 = 3, d2 = 4 and d3 = 5.
Since 3/5 > 1/2 > 1/3,
∴ p1 = 3, p2 = 2 and p3 = 1.
We use the third equation to eliminate xl from first and second equations and store
corresponding multipliers instead of storing zeros in the working matrix.
The multipliers are m(pi,1) = W(pi,1)/W(p1,1), i = 2, 3.
A = LU
or in matrix form we write
The left side matrix A has n² elements, whereas L and U have 1 + 2 + ... + n = n(n+1)/2 elements each. Thus, we have n² + n unknowns in L and U which are to be determined. On comparing the corresponding elements on the two sides in Eqn. (19), we get n² equations in n² + n unknowns, and hence n unknowns are undetermined. Thus, we get a solution in terms of these n unknowns, i.e., we get an n-parameter family of solutions. In order to obtain a unique solution we either take all the diagonal elements of L as 1, or all the diagonal elements of U as 1.
For uii = 1, i = 1, 2, ..., n, the method is called the Crout LU decomposition method. For lii = 1, i = 1, 2, ..., n, we have the Doolittle LU decomposition method. Usually Crout's LU decomposition method is used unless otherwise specifically mentioned. We shall now explain the method for n = 3 with uii = 1, i = 1, 2, 3. We have
Similarly, for the general system of Eqns. (2), we obtain the elements of L and U
using the relations
uii = 1
Also, det (A) = l11 l22 ... lnn.
Thus we can say that every nonsingular matrix A can be written as the product of a lower triangular matrix and an upper triangular matrix if all the principal minors of A are nonsingular, i.e., if
Once we have obtained the elements of the matrices L and U, we write the system
of equations
Ax=b (25)
in the form
L U X =b (26)
The system (26) may be further written as the following two systems
u x = y (27)
Ly=b (28)
Now, we first solve the system (28), i.e.,
Ly =b,
using the forward substitution method t o obtain the solution vector y. Then using this
y, we solve the system (27), i.e.,
Ux=y,
using the backward substitution method to obtain the solution vector x.
The number of operations for this method remains the same as that in the Gauss elimination method.
We now illustrate this method through an example.
Example 11 : Use the LU decomposition method to solve the system of equations
x1 + x2 + x3 = 1
4x1 + 3x2 - x3 = 6
3x1 + 5x2 + 3x3 = 4
[ 1 1 1 ; 4 3 -1 ; 3 5 3 ] = [ l11 0 0 ; l21 l22 0 ; l31 l32 l33 ] [ 1 u12 u13 ; 0 1 u23 ; 0 0 1 ]
On comparing the elements of row and column alternately on both sides, we obtain
first row : l11 = 1, u12 = 1, u13 = 1
first column : l21 = 4, l31 = 3
second row : l22 = -1, u23 = 5
second column : l32 = 2
third row : l33 = -10
Thus, we have
From the system Ly = b we get
y1 = 1, y2 = -2, y3 = -1/2
and from the system
Ux = y
we get
x3 = -1/2, x2 = 1/2, x1 = 1.
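The whole Crout procedure of Example 11 can be sketched as follows (an assumed implementation with uii = 1; the two sweeps at the end are the forward and backward substitutions corresponding to Eqns. (28) and (27)):

```python
def crout_lu(a):
    # Crout decomposition A = LU with unit diagonal in U (u_ii = 1).
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    U = [[float(i == j) for j in range(n)] for i in range(n)]
    for j in range(n):
        for i in range(j, n):        # jth column of L
            L[i][j] = a[i][j] - sum(L[i][k] * U[k][j] for k in range(j))
        for i in range(j + 1, n):    # jth row of U
            U[j][i] = (a[j][i] - sum(L[j][k] * U[k][i] for k in range(j))) / L[j][j]
    return L, U

A = [[1, 1, 1], [4, 3, -1], [3, 5, 3]]   # Example 11
b = [1, 6, 4]
L, U = crout_lu(A)
n = 3
y = [0.0] * n                             # forward: Ly = b
for i in range(n):
    y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
x = [0.0] * n                             # backward: Ux = y
for i in range(n - 1, -1, -1):
    x[i] = y[i] - sum(U[i][k] * x[k] for k in range(i + 1, n))
```

For Example 11 this reproduces l33 = -10 and the solution x = (1, 1/2, -1/2) obtained above.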
You may now try the following exercises:
E12) Use the LU decomposition method with lii = 1, i = 1, 2, 3 to solve the system of equations given in Example 11.
E13) Use the LU decomposition method with lii = 1, i = 1, 2, 3 to solve the system of equations given in E7.
E14) Use the LU decomposition method to solve the system of equations given in E10.
We now end this unit by giving a summary of what we have covered in it.
5.7 SUMMARY
In this unit we have covered the following:
1) For a system of n equations
Ax = b (see Eqn. (2))
in n unknowns, where A is an n × n non-singular matrix, the methods of finding the solution vector x may be broadly classified into two types: (i) direct methods and (ii) iterative methods.
2) Direct methods produce the exact solution in a finite number of steps provided there are no round-off errors. Cramer's rule is one such method. This method gives the solution vector as
E l ) det (A)= 8
E2) d = 11, d1 = 11, d2 = 11, d3 = 11
x1 = x2 = x3 = 1
1
E11) Solution without pivoting:
Using m21 = 1.372, m31 = 1.826 and m32 = 2.423,
the final derived system is
0.7290   0.8100   0.9000
0.0     -0.1110  -0.2350
0.0      0.0      ...
The solution is
0.02640 ...
In the previous unit, you have studied the Gauss elimination and LU decomposition
methods for solving systems of algebraic equations Ax = b, when A is an n × n nonsingular matrix. Matrix inversion is another problem associated with the problem of finding solutions of a linear system. If the inverse matrix A⁻¹ of the coefficient matrix A is known, then the solution vector x can be obtained from x = A⁻¹b. In
general, inversion of matrices for solving system of equations should be avoided
whenever possible. This is because, it involves greater amount of work and also it is
difficult to obtain the inverse accurately in many problems. However, there are two
cases in which the explicit computation of the inverse is desirable. Firstly, when
several systems of equations, having the same coefficient matrix A but different right
hand side b, have to be solved. Then computations are reduced if we first find the
inverse matrix and then find the solution. Secondly, when the elements of A⁻¹ themselves have some special physical significance. For instance, in the statistical treatment of the fitting of a function to observational data by the method of least squares, the elements of A⁻¹ give information about the kind and magnitude of errors in the data.
In this unit, we shall study a few important methods for finding the inverse of a
nonsingular square matrix.
Objectives
After studying this unit, you should be able to :
obtain the inverse by the adjoint method for n < 4;
obtain the inverse by the Gauss-Jordan and LU decomposition methods;
obtain the solution of a system of linear equations using the inverse method.
Solution : Since det (A) = -1 ≠ 0, the inverse of A exists. We obtain the cofactor matrix Aᶜ from A by replacing each element of A by its cofactor as follows:
Now A⁻¹ = adj (A)/det (A)
iii) x = A⁻¹b =
∴ A⁻¹ = (Aᶜ)ᵀ/det (A) = adj (A)/det (A)
= (1/18) [ -12 0 0 ; -2 6 0 ; -5 3 18 ]
= [ -2/3 0 0 ; -1/9 1/3 0 ; -5/18 1/6 1 ]
Thus, A⁻¹ is again a lower triangular matrix. Similarly, we can illustrate that the inverse of an upper triangular matrix is again upper triangular.
We obtain
In this method also, we use elementary row operations that are used in the Gauss
elimination method. We apply these operations both below and above the diagonal
in order to reduce all the off-diagonal elements of the matrix to zero. Pivoting can
be used to make the pivot non-zero or to make it the largest element in magnitude
in that column as discussed in Unit 5. We illustrate the method through an example.
Example 4 : Solve the system of equations
x1 + x2 + x3 = 1
4x1 + 3x2 - x3 = 6
3x1 + 5x2 + 3x3 = 4
Solution : We have
[A|b] = [ 1 1 1 | 1 ; 4 3 -1 | 6 ; 3 5 3 | 4 ]
(interchanging the first and second rows)
(4/11)R2 (divide the second row by 11/4),
(11/10)R3 (divide the third row by 10/11).
E3) Verify that the total number of operations needed for the Gauss-Jordan reduction method is n³/2 + n²/2 + n.
Clearly this method requires more operations than the Gauss elimination method. We, therefore, do not generally use this method for solving systems of equations, but it is very commonly used for finding the inverse matrix. This is done by augmenting the matrix A by the identity matrix I of the same order as A.
Using elementary row operations on the augmented matrix [A|I], we reduce the matrix A to the form I, and in the process the matrix I is transformed to A⁻¹. That is,
[A|I] → (Gauss-Jordan) → [I|A⁻¹]
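This [A|I] → [I|A⁻¹] reduction can be sketched in a few lines (an illustrative implementation with partial pivoting; the 2 × 2 test matrix is hypothetical, not one of the worked examples):

```python
def gauss_jordan_inverse(a):
    # Invert A by row-reducing the augmented matrix [A | I].
    n = len(a)
    M = [list(map(float, row)) + [float(i == j) for j in range(n)]
         for i, row in enumerate(a)]          # [A | I]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))   # partial pivoting
        M[k], M[p] = M[p], M[k]
        piv = M[k][k]
        M[k] = [v / piv for v in M[k]]        # normalise the pivot row
        for i in range(n):                    # clear column k above AND below
            if i != k:
                f = M[i][k]
                M[i] = [vi - f * vk for vi, vk in zip(M[i], M[k])]
    return [row[n:] for row in M]             # the right half is A^(-1)

inv = gauss_jordan_inverse([[2, 1], [1, 1]])
```

Unlike plain Gauss elimination, the inner loop eliminates both below and above the diagonal, which is exactly what turns the left half into I.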
We now illustrate the method through examples.
Example 5 : Find the inverse of the matrix
A = [ 3 1 2 ; 2 -3 -1 ; 1 -2 1 ]
Solution : We have
[A|I] = [ 3 1 2 | 1 0 0 ; 2 -3 -1 | 0 1 0 ; 1 -2 1 | 0 0 1 ]   (1/3)R1
= [ 1 1/3 2/3 | 1/3 0 0 ; 2 -3 -1 | 0 1 0 ; 1 -2 1 | 0 0 1 ]   R2 - 2R1, R3 - R1
Thus we obtain
Hence
A⁻¹ =
Let us now consider the problem of finding the inverse of an upper triangular matrix.
Example 7 : Find the inverse of the matrix
Hence
Note that in Examples 2, 3, 6 and 7, the inverse of a lower/upper triangular matrix is again a lower/upper triangular matrix. There is another method of finding the inverse of a matrix A which uses the pivoting strategy. Recall that in Sec. 5.5 of Unit 5, for the solution of the system of linear algebraic equations Ax = b, we showed you how the multipliers m(pk,k) can be stored in the working array W during the process of elimination. The main advantage of storing these multipliers is that if we have already solved the linear system of equations Ax = b of order n by the elimination method, and we want to solve the system Ax = c with the same coefficient matrix A, only the right side being different, then we do not have to go through the entire elimination process again. Since we have saved in the working matrix W all the multipliers used, and also have saved the p vector, we have only to repeat the operations on the right hand side to obtain z, such that Ux = z is equivalent to Ax = c.
In order to understand the calculations necessary to derive z from c, consider the changes made in the right side b during the elimination process. Let k be an integer between 1 and n, and assume that the ith equation was used as the pivotal equation during step k of the elimination process. Then i = pk. Initially, the right side of equation i is just bi.
Solution : Initially p = [p1, p2, p3]ᵀ = [1, 2, 3]ᵀ and the working matrix is
We use the second equation to eliminate x1 from the first and third equations, and store the corresponding multipliers instead of storing zeros in the working matrix. The multipliers are
we get the following working matrix
We use the first equation as the pivotal equation to eliminate x2 from the third equation and also store the multipliers. After the second step we have the following working matrix
Now in this case, W^(2) is our final working matrix with pivoting strategy p = (2, 1, 3)ᵀ. Note that circled entries denote multipliers and squared entries denote pivot elements in the working matrices.
To find the inverse of the given matrix A, we have to solve the systems
Ax = e1, Ax = e2, Ax = e3,
where e1 = (1 0 0)ᵀ, e2 = (0 1 0)ᵀ and e3 = (0 0 1)ᵀ.
Using Eqn. (9), we get
with p1 = 2, b̄1 = b2 = 0
with p2 = 1, b̄2 = b1 - w11 b̄1
Using Eqn. (10), we then get the following system of equations
3x2 - x3 = -1
2x1 + x2 = 0
which gives x3 = -1/3, x2 = -4/9 and x1 = 2/9, i.e., the vector
x^(1) = [2/9  -4/9  -1/3]ᵀ
is the solution of system (12).
Remember that the solution of system (12) constitutes the first column of the inverse matrix A⁻¹.
In the same way we solve the systems of equations Ax = e2 and Ax = e3.
Using Eqns. (9) and (10), we obtain the solution of system (13) as
x^(2) = [1/9  2/9  1/3]ᵀ, which is the second column of A⁻¹ and the solution of system
UNIT 7 ITERATIVE METHODS
7.1 Introduction
7.2 The General Iteration Method
7.3 The Jacobi Iteration Method
7.4 The Gauss-Seidel Iteration Method
7.5 Summary
7.6 Solutions/Answers
7.1 INTRODUCTION
In the previous two units, you have studied direct methods for solving the linear system of equations Ax = b, A being an n × n non-singular matrix. Direct methods provide the
exact solution in a finite number of steps provided exact arithmetic is used and there
is no round-off error. Also, direct methods are generally used when the matrix A is
dense or filled, that is, there are few zero elements, and the order of the matrix is
not very large say n < 50.
Iterative methods, on the other hand, start with an initial approximation and by
applying a suitably chosen algorithm, lead to successively better approximations.
Even if the process converges, it would give only an approximate solution. These
methods are generally used when the matrix A is sparse and the order of the matrix
A is very large say n > 50. Sparse matrices have very few non-zero elements. In most
cases these non-zero elements lie on or near the main diagonal giving rise to
tri-diagonal, five diagonal or band matrix systems. It may be noted that there are no
fixed rules to decide when to use direct methods and when to use iterative methods.
However, when the coefficient matrix is sparse or large, iterative methods, which take advantage of the sparse nature of the matrix involved, are ideally suited for finding the solution.
In this unit we shall discuss two iterative methods, namely, the Jacobi iteration method and the Gauss-Seidel iteration method.
We assume that the diagonal coefficients aii ≠ 0, (i = 1, ..., n). If some aii = 0, then we rearrange the equations so that this condition holds. We then rewrite system (2) as
where
H = [ 0, -a12/a11, ..., -a1n/a11 ;
      -a21/a22, 0, ..., -a2n/a22 ;
      . . . . . . . . . . . . . ;
      -an1/ann, -an2/ann, ..., 0 ]
and c = [b1/a11, b2/a22, ..., bn/ann]ᵀ.
To solve system (3) we make an initial guess x^(0) of the solution vector and substitute it into the right hand side.
In general we can write the iteration method for solving the linear system of Eqns. (1) in the form
x^(k+1) = Hx^(k) + c, k = 0, 1, ...   (5)
The method converges if and only if the error ε^(k) = x^(k) - x satisfies
lim (k→∞) ε^(k) = 0
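Iteration (5), with the H and c given above, is exactly the Jacobi sweep: each component is updated from the previous iterate. A minimal sketch (illustrative; the test system is a strictly diagonally dominant tridiagonal system chosen so the iteration converges, as the theorems below require):

```python
def jacobi(a, b, x0=None, tol=1e-10, max_iter=500):
    # x_i^(k+1) = (b_i - sum_{j != i} a_ij * x_j^(k)) / a_ii
    n = len(b)
    x = list(x0) if x0 is not None else [0.0] * n   # default guess x^(0) = 0
    for _ in range(max_iter):
        x_new = [(b[i] - sum(a[i][j] * x[j] for j in range(n) if j != i))
                 / a[i][i] for i in range(n)]
        if max(abs(u - v) for u, v in zip(x_new, x)) < tol:
            return x_new
        x = x_new
    return x

x = jacobi([[4, 1, 0], [1, 4, 1], [0, 1, 4]], [5, 6, 5])  # solution (1, 1, 1)
```

Note that all components of x_new are computed from the old iterate x; this is what distinguishes Jacobi from Gauss-Seidel.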
Before we discuss the above convergence criterion, let us recall the following definitions from your linear algebra course (MTE-02).
The eigenvalues of the matrix A are obtained from the characteristic equation
det (A - λI) = 0
which is an nth degree polynomial in λ. The roots of this polynomial, λ1, λ2, ..., λn, are the eigenvalues of A. Therefore, we have
Theorem 1 : An iteration method of the form (5) is convergent for an arbitrary initial
approximate vector x^(0) if and only if ρ(H) < 1.

Definition : The number ν = -log10 ρ(H) is called the rate of convergence of an
iteration method.
ii) ||A||∞ = max ||Ax||∞ / ||x||∞, based on the maximum vector norm ||x||∞ = max_i |x_i|.

In (i) and (ii) above the maximum is taken over all (non-zero) n-vectors. The
most commonly used norm is the maximum norm ||A||∞, as it is easier to
calculate. The two norms can be calculated as follows:

||A||_1 = max_k Σ_i |a_ik|   (maximum absolute column sum)

or

||A||∞ = max_i Σ_k |a_ik|   (maximum absolute row sum)
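The row-sum and column-sum formulas above are easy to compute directly from their definitions. The following sketch is ours, not part of the original unit, and the 3x3 matrix in it is only sample data:

```python
def max_norm(A):
    """||A||_inf : maximum absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)

def one_norm(A):
    """||A||_1 : maximum absolute column sum."""
    return max(sum(abs(row[j]) for row in A) for j in range(len(A[0])))

# Hypothetical iteration matrix, used only as sample data.
H = [[0.0, 0.25, -0.25],
     [0.5, 0.0, 0.125],
     [0.4, -0.2, 0.0]]
print(max_norm(H))  # 0.625
print(one_norm(H))  # 0.9
```

Both norms are below 1 here, so an iteration driven by this matrix would converge for any starting vector by Theorem 2 below.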
The norm of a matrix is a non-negative number which, in addition, satisfies the properties

a) ||A|| > 0 if A ≠ 0.
b) ||αA|| = |α| ||A||, for all numbers α.
c) ||A + B|| ≤ ||A|| + ||B||

where A and B are square matrices of order n.
Theorem 2 : The iteration method of the form (5) for the solution of system (1)
converges to the exact solution for any initial vector, if ||H|| < 1.

Also note that

||H|| ≥ ρ(H).

This can be easily proved by considering the eigenvalue problem Ax = λx.
Then ||Ax|| = ||λx|| = |λ| ||x||
or |λ| ||x|| = ||Ax|| ≤ ||A|| ||x||
i.e., |λ| ≤ ||A||, since ||x|| ≠ 0.

Since this result is true for all eigenvalues, we have ρ(A) ≤ ||A||. Note that ||H|| < 1
is only a sufficient condition: if this condition is violated, it is not necessary that the
iteration diverges.
There is another sufficient condition for convergence, as follows:

Theorem 3 : If the matrix A is strictly diagonally dominant, that is,

|a_ii| > Σ_{j≠i} |a_ij|,  i = 1, ..., n,

then the iteration method (5) converges for any initial approximation x^(0).
If no better initial approximation is known, we generally take x^(0) = 0.
We shall mostly use the criterion given in Theorem 1, which is both necessary and
sufficient.
For using the iteration method (5), we need the matrix H and the vector c, which
depend on the matrix A and the vector b. The well-known iteration methods are
based on the splitting of the matrix A in the form

A = D + L + U

Note that, A being a non-singular matrix, it is possible for us to make all the pivots
non-zero. It is only when the matrix A is singular that even complete pivoting may
not lead to all non-zero pivots.
We rewrite system (2) in the form (3) and define the Jacobi iteration method as

x_1^(k+1) = -(1/a11)(a12 x_2^(k) + a13 x_3^(k) + ... + a1n x_n^(k) - b1)
...
x_n^(k+1) = -(1/ann)(an1 x_1^(k) + an2 x_2^(k) + ... + a_{n,n-1} x_{n-1}^(k) - bn)

that is,

x_i^(k+1) = -(1/aii)( Σ_{j≠i} a_ij x_j^(k) - b_i ),  i = 1, ..., n.
Let us now solve a few examples for a better understanding of the method.
Determine the rate of convergence of the method and the number of iterations needed
for the required accuracy.

Solution : The Jacobi method when applied to the system of Eqns. (18) gives the
iteration matrix

H = [  0     1/4   -1/4 ]
    [ 1/2     0     1/8 ]
    [ 2/5   -1/5     0  ]
The eigenvalues of the matrix H are the roots of the characteristic equation

det (H - λI) = 0

Now

det (H - λI) = | -λ     1/4   -1/4 |
               | 1/2    -λ     1/8 |
               | 2/5   -1/5    -λ  |

which gives

λ³ - 3/80 = 0

All three eigenvalues of the matrix H are therefore equal in magnitude:

|λ| = (3/80)^(1/3) = 0.3347

The spectral radius is

ρ(H) = 0.3347

We obtain the rate of convergence as

ν = -log10 (0.3347) = 0.4753
The number of iterations needed for the required accuracy is given by
The Jacobi method when applied to the system of Eqns. (18) becomes

x_1^(k+1) = (1/4)[7 + x_2^(k) - x_3^(k)]
x_2^(k+1) = (1/8)[21 + 4x_1^(k) + x_3^(k)]        (21)
x_3^(k+1) = (1/5)[15 + 2x_1^(k) - x_2^(k)]

Starting with the initial approximation x^(0) = [1 2 2]^T, we get from Eqn. (21)

x^(1) = [1.75 3.375 3.0]^T
x^(2) = [1.8437 3.875 3.025]^T
x^(3) = [1.9625 3.925 2.9625]^T
x^(4) = [1.9906 3.9766 3.0000]^T
x^(5) = [1.9941 3.9953 3.0009]^T
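The iterates above can be reproduced with a short sketch of the component form of the Jacobi method. The coefficients of system (18) are not fully legible in our copy; the system used below is reconstructed from the quoted iterates and should be read as an assumption:

```python
def jacobi(A, b, x0, iters):
    """Jacobi iteration: every component of the new iterate is computed
    from the components of the previous iterate only."""
    n = len(b)
    x = list(x0)
    for _ in range(iters):
        x = [(b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
             for i in range(n)]
    return x

# Reconstructed system (18):  4x1 - x2 + x3 = 7,  -4x1 + 8x2 - x3 = 21,
# -2x1 + x2 + 5x3 = 15; its exact solution is [2, 4, 3].
A = [[4.0, -1.0, 1.0],
     [-4.0, 8.0, -1.0],
     [-2.0, 1.0, 5.0]]
b = [7.0, 21.0, 15.0]
print(jacobi(A, b, [1.0, 2.0, 2.0], 1))  # [1.75, 3.375, 3.0]
```

One sweep from x^(0) = [1 2 2]^T reproduces x^(1) above; further sweeps approach the exact solution [2 4 3]^T at the rate governed by ρ(H) = 0.3347.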
Since the condition in Theorem 1 is violated, the iteration method does not converge.

We now perform a few iterations and see what happens actually. Taking x^(0) = 0 and
using the Jacobi method
we obtain
and so on, which shows that the iterations are diverging fast. You may also try to
obtain the solution with other initial approximations.
E1) Perform five iterations of the Jacobi method for solving the system of equations
given in Example 4 with x^(0) = [1 1 1]^T.
Let us now consider an example to show that the convergence criterion given in
Theorem 3 is only a sufficient condition. That is, there are systems of equations which
are not diagonally dominant but for which the Jacobi iteration method converges.
Example 5 : Perform iterations of the Jacobi method for solving the system of
equations
with x^(0) = [0 1 1]^T. What can you say about the solution obtained if the exact
solution is x = [0 1 2]^T?

Solution : The Jacobi method when applied to the given system of equations becomes

x_1^(k+1) = 3 - x_2^(k) - x_3^(k)

with similar expressions for the other components. You may notice that the
coefficient matrix is not diagonally dominant but the iterations reach the exact
solution after only two iterations.
And now a few exercises for you.

E2) Perform four iterations of the Jacobi method for solving the system of equations

with x^(0) = 0. The exact solution is x = (1 1 ... )^T.
E4) Perform four iterations of the Jacobi method for solving the system of equations
You may notice here that in the first equation of system (24), we substitute the initial
approximation (x_2^(0), x_3^(0), ..., x_n^(0)) on the right-hand side. In the second equation,
we substitute (x_1^(1), x_3^(0), ..., x_n^(0)) on the right-hand side. In the third equation, we
substitute (x_1^(1), x_2^(1), x_4^(0), ..., x_n^(0)) on the right-hand side. We continue in this
manner until all the components have been improved. At the end of this first iteration, we
will have an improved vector (x_1^(1), x_2^(1), ..., x_n^(1)). The entire process is then
repeated. In other words, the method uses an improved component as soon as it becomes
available. It is for this reason that the method is also called the method of successive
displacements.
We can also write the system of Eqns. (24) as follows:

a11 x_1^(k+1) = -a12 x_2^(k) - a13 x_3^(k) - ... - a1n x_n^(k) + b1
...

and L and U are respectively the lower and upper triangular matrices with zeros
along the diagonal, of the form
Example 6 : Perform four iterations (rounded to four decimal places) using the
Gauss-Seidel method for solving the system of equations

which is a good approximation to the exact solution x = (-1 -4 -3)^T with maximum
absolute error 0.0034. Comparing with the results obtained in Example 1, we find
that the values of x_i, i = 1, 2, 3 obtained here are better approximations to the exact
solution than the ones obtained in Example 1.
The eigenvalues of the matrix H are the roots of the characteristic equation

det (H - λI) = | -λ     1/4        -1/4      |
               |  0    1/8 - λ      0        | = 0
               |  0    3/40      -1/10 - λ   |

We have

λ(80λ² - 2λ - 1) = 0

which gives

λ = 0, 0.125, -0.1

Therefore, we have

ρ(H) = 0.125

The rate of convergence of the method is given by

ν = -log10 (0.125) = 0.9031

The number of iterations needed for obtaining the desired accuracy is given by

k = 2/ν = 2/0.9031 ≈ 2.2
The Gauss-Seidel method when applied to the system of Eqns. (29) becomes

x_1^(k+1) = (1/4)[7 + x_2^(k) - x_3^(k)]
x_2^(k+1) = -(1/8)[-21 - 4x_1^(k+1) - x_3^(k)]        (30)
x_3^(k+1) = (1/5)[15 + 2x_1^(k+1) - x_2^(k+1)]

The successive iterations are obtained as

x^(1) = [1.75 3.75 2.95]^T
x^(2) = [1.95 3.9688 2.9863]^T
x^(3) = [1.9956 3.9961 2.9990]^T
which is an approximation to the exact solution after three iterations. Comparing
with the results obtained in Example 2, we conclude that the Gauss-Seidel method
converges faster than the Jacobi method.
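The sweep structure of Eqns. (30), where each updated component is used immediately, can be sketched as follows. As before, the coefficient matrix is our reconstruction of the system (an assumption), chosen because it reproduces the quoted iterates:

```python
def gauss_seidel(A, b, x0, iters):
    """Gauss-Seidel iteration: each updated component is used immediately
    in the remaining equations of the same sweep."""
    n = len(b)
    x = list(x0)
    for _ in range(iters):
        for i in range(n):
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - s) / A[i][i]
    return x

# Reconstructed system:  4x1 - x2 + x3 = 7,  -4x1 + 8x2 - x3 = 21,
# -2x1 + x2 + 5x3 = 15; exact solution [2, 4, 3].
A = [[4.0, -1.0, 1.0], [-4.0, 8.0, -1.0], [-2.0, 1.0, 5.0]]
b = [7.0, 21.0, 15.0]
print(gauss_seidel(A, b, [1.0, 2.0, 2.0], 1))  # [1.75, 3.75, 2.95]
```

One sweep from x^(0) = [1 2 2]^T gives x^(1) above, and since ρ(H) = 0.125 here (versus 0.3347 for Jacobi), far fewer sweeps are needed for the same accuracy.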
Example 8 : Use the Gauss-Seidel method for solving the following system of
equations.
with x^(0) = [0.5 0.5 0.5 0.5]^T. Compare the results with those obtained in
Example 3 after four iterations. The exact solution is x = [1 1 1 1]^T.
Solution : The Gauss-Seidel method, when applied to the system of Eqns. (31),
becomes

x_1^(k+1) = (1/2)[1 + x_2^(k)]
x_2^(k+1) = (1/2)[x_1^(k+1) + x_3^(k)]
x_3^(k+1) = (1/2)[x_2^(k+1) + x_4^(k)]
x_4^(k+1) = (1/2)[1 + x_3^(k+1)]
In Example 3, the result obtained after four iterations by the Jacobi method was

x^(4) = [0.8438 0.75 0.75 0.8438]^T
Remark : The matrix formulations of the Jacobi and Gauss-Seidel methods are used
whenever we want to check whether the iterations converge or to find the rate of
convergence. If we wish to iterate and find solutions of the systems, we use the
equation form of the methods.
You may now attempt the following exercises.
E7) Perform four iterations of the Gauss-Seidel method for solving the system of
equations given in E2).
E8) Perform four iterations of the Gauss-Seidel method for solving the system of
equations given in E3).
E9) Perform four iterations of the Gauss-Seidel method for solving the system of
equations given in E4).
E10) Set up the matrix formulation of the Gauss-Seidel method for solving the system
of equations given in E5). Perform four iterations of the method.
E11) The Gauss-Seidel method is used to solve the system of equations given in E6).
Determine the rate of convergence and the number of iterations needed to
make max_i |x_i^(k+1) - x_i^(k)| ≤ 10^(-2). Perform four iterations and compare the
results with the exact solution.
We now end this unit by giving a summary of what we have covered in it.
7.5 SUMMARY
In this unit, we have covered the following:
1) Iterative methods for solving linear systems of equations

Ax = b (see Eqn. (1))

where A is an n × n non-singular matrix. Iterative methods are generally used
when the system is large and the matrix A is sparse. The process is started using
an initial approximation and leads to successively better approximations.
2) General iterative method for solving the linear system of Eqn. (1) can be written
in the form
x^(k+1) = Hx^(k) + c,  k = 0, 1, ...   (see Eqn. (5))

where x^(k) and x^(k+1) are the approximations to the solution vector x at the kth
and the (k+1)th iterations respectively. H is the iteration matrix which depends
on A and is generally a constant matrix. c is a column vector which depends on
both A and b.
3) The iterative method of the form given in 2) above converges for any initial vector
if ||H|| < 1, which is a sufficient condition for convergence. The necessary and
sufficient condition for convergence is ρ(H) < 1, where ρ(H) is the spectral radius
of H.
of H.
4) In the Jacobi iteration method, or the method of simultaneous displacements,

H = -D^(-1)(L + U);  c = D^(-1) b

where D is a diagonal matrix and L and U are respectively the lower and upper
triangular matrices with zero diagonal elements.
5) In the Gauss-Seidel iteration method, or the method of successive displacements,

H = -(D + L)^(-1) U and c = (D + L)^(-1) b.

6) If the matrix A in Eqn. (1) is strictly diagonally dominant, then the Jacobi and
Gauss-Seidel methods converge. The Gauss-Seidel method converges faster than
the Jacobi method.
8.1 Introduction
8.'2 The Eigenvalue Problem
8.3 The Power Method
8.4 The Inverse Power Method
8.5 Summary
8.6 Solutions/Answers
8.1 INTRODUCTION
In Unit 7, you have seen that eigenvalues of the iteration matrix play a m'ajor role in
the study of convergence of iterative methods for solving linear system of equations.
Eigenvalues are also of great importance in many physical problems. The stability of
an aircraft is determined by the location of the eigenvalues of a certain matrix in the
complex plane. The natural frequencies of the vibrations of a beam are actually
eigenvalues of a matrix. Thus the computation of the absolutely largest eigenvalue
is of great practical importance.

The values of the parameter λ for which the system of Eqn. (2) has a nonzero
solution are called the eigenvalues of A. Corresponding to these eigenvalues, the
nonzero solutions of Eqn. (2), i.e. the vectors x, are called the eigenvectors of A. The
problem of finding the eigenvalues and the corresponding eigenvectors of a square
matrix A is known as the eigenvalue problem. In this unit, we shall discuss the
eigenvalue problem. To begin with, we shall give you some definitions and properties
related to eigenvalues.
A homogeneous system always has the trivial solution, x = 0. For the homogeneous
system (3) to have a nonzero solution, the matrix A must be singular, and in this
case the solution is not unique (Ref. Theorem ...).

The homogeneous system of Eqn. (2) will have a nonzero solution only when the
coefficient matrix (A - λI) is singular, that is,

det (A - λI) = 0    (4)
If the matrix A is an n × n matrix then Eqn. (4) gives a polynomial of degree n in
λ. This polynomial is called the characteristic equation of A. The n roots λ1, λ2, ..., λn
of this polynomial are the eigenvalues of A. For each eigenvalue λi, there exists a
vector xi (the eigenvector) which is a nonzero solution of the system of equations

(A - λiI) xi = 0    (5)
The eigenvalues have a number of interesting properties. We shall now state and
prove a few of these properties which we shall be using frequently.
P1 : A matrix A is singular if and only if it has a zero eigenvalue.

Proof : If A has a zero eigenvalue then

det (A - 0·I) = 0
⇒ det (A) = 0
⇒ A is singular.

Conversely, if A is singular then

det (A) = 0
⇒ det (A - 0·I) = 0
⇒ 0 is an eigenvalue of the matrix A.
P2 : A and A^T have the same eigenvalues.

Proof : If λ is an eigenvalue of A then

det (A - λI) = 0
⇒ det (A - λI)^T = 0   (Ref. P6, Sec. 9.3, Unit 9, Block 3, MTE-02)
⇒ det (A^T - λI) = 0   (Ref. Theorem 3, Sec. 7.3, Unit 7, Block 2, MTE-02)
⇒ λ is an eigenvalue of A^T.

Hence the result.
However, the eigenvectors of A and A^T are not the same.
P3 : If the eigenvalues of a matrix A are λ1, λ2, ..., λn then the eigenvalues of A^m, m
any positive integer, are λ1^m, λ2^m, ..., λn^m. Also both the matrices A and A^m have the
same set of eigenvectors.

P4 : If λ1, λ2, ..., λn are the eigenvalues of A, then 1/λ1, 1/λ2, ..., 1/λn are the
eigenvalues of A^(-1). Also both the matrices A and A^(-1) have the same set of
eigenvectors.
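Properties P2 and P4 can be checked numerically on a small example. The sketch below is ours (the 2x2 matrix is hypothetical); it computes eigenvalues directly from the characteristic equation λ² - tr(A)λ + det(A) = 0 of a 2x2 matrix:

```python
import math

def eig2(A):
    """Eigenvalues of a 2x2 matrix from lambda^2 - tr(A) lambda + det(A) = 0.
    Assumes the eigenvalues are real."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    d = math.sqrt(tr * tr - 4 * det)
    return sorted([(tr - d) / 2, (tr + d) / 2])

A = [[4.0, 1.0], [2.0, 3.0]]
At = [[4.0, 2.0], [1.0, 3.0]]      # transpose: same trace and determinant
print(eig2(A))                     # [2.0, 5.0]
print(eig2(At))                    # [2.0, 5.0]  (P2)

det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
Ainv = [[A[1][1] / det, -A[0][1] / det], [-A[1][0] / det, A[0][0] / det]]
print(eig2(Ainv))                  # approx [0.2, 0.5]  (P4: reciprocals of 5 and 2)
```

P2 holds because transposition changes neither the trace nor the determinant, so the characteristic equation is unchanged.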
E1) Prove P6.

We now give you a direct method of calculating the eigenvalues and eigenvectors of
a matrix.
Example 1 : Find the eigenvalues of the matrix

det (A - λI) = | 1-λ   0    0  |
               |  0   2-λ   0  | = 0
               |  0    0   3-λ |

c) det (A - λI) = | 1-λ   2    3  |
                  |  0   4-λ   5  | = 0
                  |  0    0   6-λ |
which gives the polynomial

λ² - 5λ + 4 = 0,  i.e., (λ - 1)(λ - 4) = 0

The matrix A has two distinct real eigenvalues λ1 = 1, λ2 = 4. To obtain the
corresponding eigenvectors we solve the system of Eqns. (5) for each value of λ.
For λ = 1, we obtain the system of equations

x1 + 2x2 = 0
x1 + 2x2 = 0

which reduces to a single equation

x1 + 2x2 = 0

Taking x2 = k, we get x1 = -2k, k being an arbitrary nonzero constant. Thus, the
eigenvector is of the form k[-2 1]^T.
For λ = 4, we obtain the system of equations

-2x1 + 2x2 = 0
x1 - x2 = 0

which reduces to a single equation

x1 - x2 = 0

Taking x2 = k, we get x1 = k and the corresponding eigenvector is k[1 1]^T.
Note: In practice we usually omit k and say that [-2 1]^T and [1 1]^T are the
eigenvectors of A corresponding to the eigenvalues λ = 1 and λ = 4 respectively.
Moreover, the eigenvectors in this case are linearly independent.
b) The characteristic equation in this case becomes

(λ - 1)² = 0.

Therefore, the matrix A has a repeated real eigenvalue λ = 1. The eigenvector
corresponding to λ = 1 is the solution of the system of Eqns. (5), which reduces to
a single equation

x2 = 0.

Taking x1 = k, we obtain the eigenvector as k[1 0]^T.

Note that, in this case of repeated eigenvalues, we get linearly dependent
eigenvectors.
c) The characteristic equation in this case becomes

λ² - 2λ + 5 = 0

which gives two complex eigenvalues λ = 1 ± 2i.

Taking x2 = k, we get the eigenvector.

In the above problem you may note that corresponding to complex eigenvalues, we
get complex eigenvectors. Let us now consider an example of a 3 × 3 matrix.
Example 3 : Determine the eigenvalues and the corresponding eigenvectors for the
matrices
To find the solution of the system of Eqns. (10), we use the Gauss elimination method.
Again performing R3 - √2 R2, the system reduces to the equations

√2 x1 - x2 = 0
x2 - √2 x3 = 0

Taking x3 = k, we obtain the eigenvector k[1 √2 1]^T.

x1 + x3 = 0

Taking x3 = k, the eigenvector is
The eigenvector corresponding to λ = 2 is the solution of the system of Eqns. (5),
which reduces to a single equation

2x1 - x2 + x3 = 0    (12)

We can take any values for x1 and x2, which need not be related to each other. The
two linearly independent solutions can be written as:
E1) - E4) Find the eigenvalues and the corresponding eigenvectors of the given
matrices A.
In the examples considered so far, it was possible for us to find all the roots of the
characteristic equation exactly. But this may not always be possible. This is
particularly true for n > 3. In such cases some iterative method like the Newton-Raphson
method may have to be used to find a particular eigenvalue or all the eigenvalues
from the characteristic equation. However, in many practical problems, we do not
require all the eigenvalues but need only a selected eigenvalue. For example, when
we use iterative methods for solving a nonhomogeneous system of linear equations
Ax = b, we need to know only the largest eigenvalue in magnitude of the iteration
matrix H, to find out whether the method converges or not. One iterative method
which is frequently used to determine the largest eigenvalue in magnitude (also called
the dominant eigenvalue) and the corresponding eigenvector of a given square matrix
A is the power method. In this method we do not find the characteristic equation.
This method is applicable only when all the eigenvalues are real and distinct. If the
magnitude of two or more eigenvalues is the same, then the method converges slowly.
The elements of the vector y^(k+1) may become very large. To avoid this, we normalize
(or scale) the vector y^(k) at each step by dividing it by its largest element in magnitude.
(A vector for which scaling has been done is called a scaled vector; otherwise, it is
unscaled.) This will make the largest element in magnitude in the vector v^(k+1) equal
to one and the remaining elements less than or equal to one in magnitude.

If y^(k) represents the unscaled vector and v^(k) the scaled vector, then we have the
power method

y^(k+1) = A v^(k),  v^(k+1) = y^(k+1) / m_{k+1},  k = 0, 1, ...

with v^(0) = y^(0) and m_{k+1} being the largest element in magnitude of y^(k+1). We then
obtain the dominant eigenvalue by taking the limit

λ1 = lim (k -> infinity) (y^(k+1))_r / (v^(k))_r

where r represents the rth component of that vector. Obviously, there are n ratios of
numbers. As k -> infinity all these ratios tend to the same value, which is the largest
eigenvalue in magnitude, i.e., λ1. The iteration is stopped when the magnitude of the
difference of any two ratios is less than the prescribed tolerance.
The corresponding eigenvector is then v^(k+1), obtained at the end of the last iteration
performed.
We now illustrate the method through an example.
Example 4 : Find the dominant eigenvalue and the corresponding eigenvector, correct
to two decimal places, of the matrix

After 7 iterations, the ratios (y^(k+1))_r / (v^(k))_r are given as 3.4138, 3.4146 and 3.4138.
The maximum error in these ratios is 0.0008. Hence the dominant eigenvalue can be taken
as 3.414 and the corresponding eigenvector is [0.7071 -1 0.7071]^T.
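The scaled iteration described above can be sketched in a few lines. The matrix of Example 4 is not legible in our copy, so the matrix below is an assumption: we use tridiag(-1, 2, -1), whose dominant eigenvalue 2 + √2 = 3.4142 and eigenvector proportional to [0.7071 -1 0.7071]^T match the values quoted in the example (the eigenvector is determined only up to sign):

```python
def power_method(A, v0, iters):
    """Power method with scaling: y = A v, then divide y by its largest
    element in magnitude. That element converges to the dominant eigenvalue."""
    n = len(v0)
    v = list(v0)
    m = 1.0
    for _ in range(iters):
        y = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        m = max(y, key=abs)          # largest element in magnitude of y
        v = [yi / m for yi in y]
    return m, v

# Assumed matrix: reproduces the quoted 3.414 and [0.7071 -1 0.7071] (up to sign).
A = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
lam, v = power_method(A, [1.0, 1.0, 1.0], 30)
print(round(lam, 3))  # 3.414
```

Note that the eigenvector comes out of the same iteration for free, which is the advantage pointed out below.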
Using four iterations of the power method and taking the initial vector y^(0) with all
its elements equal to one, find the dominant eigenvalue and the corresponding
eigenvector for the following matrices.
You must have realised that an advantage of the power method is that the eigenvector
'corresponding to the dominant eigenvalue is also generated at the same time.
Usually, for most of the methods of determining eigenvalues, we need to do separate
computations to obtain the eigenvector.
In some problems, the most important eigenvalue is the eigenvalue of least
magnitude. We shall now discuss the inverse power method, which gives the
eigenvalue of least magnitude.
8.4 THE INVERSE POWER METHOD

We first note that if λ is the smallest eigenvalue in magnitude of A, then 1/λ is the
largest eigenvalue in magnitude of A^(-1). The corresponding eigenvectors are the same.
If we apply the power method to A^(-1), we obtain its largest eigenvalue and the
corresponding eigenvector. The reciprocal of this eigenvalue is then the smallest
eigenvalue in magnitude of A, and the eigenvector is the same. Since the power
method is applied to A^(-1), it is called the inverse power method.

Consider the method

y^(k+1) = A^(-1) v^(k),  k = 0, 1, 2, ...    (17)
For solving the system of Eqns. (19), we use the LU decomposition method. We write
A = LU.

Second iteration

Ay^(2) = v^(1)

Solving Lz = v^(1) and Uy^(2) = z, we obtain

Third iteration

Fourth iteration

with v^(0) = [-1 1]^T, using four iterations of the inverse power method.
The inverse power method can be further generalized to find some other selected
eigenvalues of A. For instance, one may be interested in finding the eigenvalue of A
which is nearest to some chosen number q. You know from P6 of Sec. 8.2 that the
matrices A and A - qI have the same set of eigenvectors. Further, for each eigenvalue
λi of A, λi - q is the eigenvalue of A - qI.

v^(k+1) = y^(k+1) / m_{k+1}

Using P6, we have the relation

μ = 1/(λ - q)

where λ is an eigenvalue of A. Now since μ is the largest eigenvalue in magnitude of
(A - qI)^(-1), 1/μ must be the smallest eigenvalue in magnitude of A - qI. Hence, the
eigenvalue λ = 1/μ + q of A is closest to q.
~xamp'le6 : Find the eigenvalue of the matrix A, nearest to 3 and also the
corresponding eigenvector using four iterations of the inverse power method where,
2 -1
-1
Solution : In this case q = 3. Thus we have
A-31 = [ -i
-1 -1
To find Y(~+'),
we need to solve the system
L J
and normalise y(k+l) as given in Eqn. (22).
First iteration
Starting with v(O) = [l 1 llT and using the Gauss elimination method to solve the
system (24), we obtain
Second iteration

Third iteration

After four iterations, the ratios (y^(4))_r / (v^(3))_r are given as 2.5, 2.333, 2.5. The
maximum error in these ratios is 0.1667. Hence the dominant eigenvalue of (A - 3I)^(-1)
can be taken as 2. Thus the eigenvalue λ of A closest to 3, as given by Eqn. (23), is
λ = 1/2 + 3 = 3.5.
E8) Find the eigenvalue which is nearest to -1 and the corresponding eigenvector for
the matrix

with v^(0) = [-1 1]^T, using four iterations of the inverse power method.

E9) Using four iterations of the inverse power method, find the eigenvalue which is
nearest to 5 and the corresponding eigenvector for the matrix

with v^(0) = [1 1]^T.
The eigenvalues of a given matrix can also be estimated. That is, for a given matrix
A, we can find the region in which all its eigenvalues lie. This can be done as follows:

Let λi be an eigenvalue of A and xi be the corresponding eigenvector, i.e.,

A xi = λi xi    (25)

or

(A - λiI) xi = 0    (26)

Let x_{i,k} be the largest element in magnitude of the vector xi = [x_{i,1}, ..., x_{i,n}]^T.
Consider the kth equation of the system (26) and divide it by x_{i,k}. We then have

λi - a_kk = Σ_{j≠k} a_kj (x_{i,j} / x_{i,k})    (28)

since

|x_{i,j} / x_{i,k}| ≤ 1 for j = 1, 2, ..., n.
Since the eigenvalues of A and A^T are the same (Ref. P2), Eqn. (28) can also be
written as

λi - a_kk = Σ_{j≠k} a_jk (x_{i,j} / x_{i,k})    (29)

Since x_{i,k}, the largest element in magnitude, is unknown, we approximate
Eqns. (28) and (29) by

|λi| ≤ max_k Σ_j |a_kj|    (30)   (maximum absolute row sum)

and

|λi| ≤ max_k Σ_j |a_jk|    (31)   (maximum absolute column sum)

Similarly, from Eqn. (28) we obtain

|λi - a_kk| ≤ Σ_{j≠k} |a_kj|    (32)

Again, since A and A^T have the same eigenvalues, Eqn. (32) can be written as

|λi - a_kk| ≤ Σ_{j≠k} |a_jk|    (33)
Note that since the eigenvalues can be complex, the bounds (30), (31), (32) and (33)
represent circles in the complex plane. If the eigenvalues are real, then they
represent intervals. For example, when A is symmetric (A = A^T), the eigenvalues of A
are real.

Again, in Eqn. (32), since k is not known, we replace the circle by the union of the
n circles

∪_k { λ : |λ - a_kk| ≤ Σ_{j≠k} |a_kj| }    (34)

Similarly, from Eqn. (33), we have that the eigenvalues of A lie in the union of the
circles

∪_k { λ : |λ - a_kk| ≤ Σ_{j≠k} |a_jk| }    (35)

The bounds derived in Eqns. (30), (31), (34) and (35) for the eigenvalues are all
independent bounds. Hence the eigenvalues must lie in the intersection of these
bounds. The circles derived above are called the Gerschgorin circles and the bounds
are called the Gerschgorin bounds.
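The row-wise circles of Eqn. (34) are straightforward to compute. In the sketch below (ours), the sample matrix is the same coefficient matrix we reconstructed for the Jacobi example of Unit 7, reused here purely as illustrative data:

```python
def gerschgorin(A):
    """Row-wise Gerschgorin circles: centre a_kk, radius = sum of the
    magnitudes of the off-diagonal entries of row k."""
    discs = []
    for k, row in enumerate(A):
        radius = sum(abs(a) for j, a in enumerate(row) if j != k)
        discs.append((row[k], radius))
    return discs

A = [[4.0, -1.0, 1.0], [-4.0, 8.0, -1.0], [-2.0, 1.0, 5.0]]
for centre, r in gerschgorin(A):
    print(f"|lambda - {centre}| <= {r}")
# |lambda - 4.0| <= 2.0
# |lambda - 8.0| <= 5.0
# |lambda - 5.0| <= 3.0
```

Running the same function on the transpose of A gives the column-wise circles of Eqn. (35); every eigenvalue must lie in the intersection of the two unions.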
Let us now consider the following examples:
Example 7 : Estimate the eigenvalues of the matrix
After 4 iterations, the ratios are given by 4.9946, 5.0054, 4.9946. The
maximum error in these ratios is 0.0108. Thus the dominant eigenvalue of A can
be taken as 5.00 and the corresponding eigenvector is [0.7075 1 0.7075]^T.
μ = 0.8. The eigenvalue of A which is nearest to -1 is obtained from

λ = 1/μ + q = 1.25 - 1 = 0.25
Similarly,

After 4 iterations, the ratios (y^(4))_r / (v^(3))_r are 1.005, 0.9968. The maximum error in
these ratios is 0.0082. Hence the dominant eigenvalue of (A - 5I)^(-1) can be taken
as μ = 0.99.

The eigenvalue of A which is nearest to 5 is obtained from

λ = 1/μ + q = 1/0.99 + 5 = 6.01

The corresponding eigenvector is [1 ...]^T.
UNIT 9 LAGRANGE'S FORM
Structure
9.1 Introduction
Objectives
9.5 Summary
9.6 Solutions/Answers
Let f be a real-valued function defined on the interval [a, b] and let us denote f(x_k) by f_k.
Suppose that the values of the function f(x) are given to be f_0, f_1, f_2, ..., f_n when x = x_0, x_1,
x_2, ..., x_n respectively, where x_0 < x_1 < x_2 < ... < x_n lie in the interval [a, b]. The function
f(x) may not be known to us. The technique of determining an approximate value of f(x)
for a non-tabular value of x which lies in the interval [a, b] is called interpolation. The
process of determining the value of f(x) for a value of x lying outside the interval [a, b] is
called extrapolation. In this unit, we derive a polynomial P(x) of degree ≤ n which agrees
with the values of f(x) at the given (n + 1) distinct points, called nodes or abscissas. In
other words, we can find a polynomial P(x) such that P(x_j) = f_j, j = 0, 1, 2, ..., n. Such a
polynomial P(x) is called the interpolating polynomial of f(x).
Objectives
compute the value of x̄ (approximately) given a number ȳ such that f(x̄) = ȳ
(inverse interpolation);
Lemma 1: If z_1, z_2, ..., z_k are distinct zeros of the polynomial P(x), then
P(x) = (x - z_1)(x - z_2) ... (x - z_k) R(x) for some polynomial R(x).

Corollary: If P_k(x) and Q_k(x) are two polynomials of degree ≤ k which agree at the k + 1
distinct points z_0, z_1, ..., z_k, then P_k(x) = Q_k(x) identically.

You have come across Rolle's theorem in Section 1.2. We need a generalized version of
this theorem in Section 9.4 (General Error Term). This is stated below.
We now show the existence of an interpolating polynomial and also show that it is unique.
The form of the interpolating polynomial that we are going to discuss in this section is
called the Lagrange form of the interpolating polynomial. We start with a relevant
theorem.
Theorem 3: Let x_0, x_1, ..., x_n be n + 1 distinct points on the real line and let f(x) be a real-valued
function defined on some interval I = [a, b] containing these points. Then there
exists exactly one polynomial P_n(x) of degree ≤ n which interpolates f(x) at x_0, ..., x_n, that
is, P_n(x_i) = f(x_i), i = 0, 1, 2, ..., n.
Proof: First we discuss the uniqueness of the interpolating polynomial, and then exhibit
one explicit construction of an interpolating polynomial (Lagrange's Form).
That is, h(x) has (n + 1) distinct zeros. But h(x) is of degree ≤ n, and from the Corollary to
Lemma 1 we have h(x) = 0. That is, P_n(x) = Q_n(x). This proves the uniqueness of the
polynomial.
Since the data is given at the points (x_0, f_0), (x_1, f_1), ..., (x_n, f_n), let the required polynomial
be written as

Substitution of (4) in (1) gives the required Lagrange form of the interpolating polynomial.
Remark: The Lagrange form (Eqn. (1)) of the interpolating polynomial makes it easy to show
the existence of an interpolating polynomial. But its evaluation at a point x involves a lot
of computation.
A more serious drawback of the Lagrange form arises in practice due to the following: One
calculates a linear polynomial P_1(x), a quadratic polynomial P_2(x), etc., by increasing the
number of interpolation points, until a satisfactory approximation P_k(x) to f(x) has been
found. In such a situation the Lagrange form does not take any advantage of the availability of
P_{k-1}(x) in calculating P_k(x). Later on, we shall see how, in this respect, the Newton form,
discussed in the next unit, is more useful.
Example 1: If f(1) = -3, f(3) = 9, f(4) = 30 and f(6) = 132, find the Lagrange
interpolating polynomial of f(x).

where

L_0(x) = (x - x_1)(x - x_2)(x - x_3) / [(x_0 - x_1)(x_0 - x_2)(x_0 - x_3)]

Substituting L_j(x) and f_j, j = 0, 1, 2, 3 in Eqn. (5), we get

P(x) = -(1/30)[x³ - 13x² + 54x - 72](-3) + (1/6)[x³ - 11x² + 34x - 24](9)
       - (1/6)[x³ - 10x² + 27x - 18](30) + (1/30)[x³ - 8x² + 19x - 12](132)

     = x³ - 3x² + 5x - 6

which is the Lagrange interpolating polynomial of f(x).
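The evaluation of the Lagrange form can be sketched directly from Eqn. (1). The sketch below (ours) checks itself against the data of Example 1; note that the simplified polynomial x³ - 3x² + 5x - 6 indeed gives P(1) = -3, P(3) = 9, P(4) = 30 and P(6) = 132:

```python
def lagrange(nodes, values, x):
    """Evaluate the Lagrange form P(x) = sum_j L_j(x) f_j, where
    L_j(x) = prod_{i != j} (x - x_i) / (x_j - x_i)."""
    total = 0.0
    for j, (xj, fj) in enumerate(zip(nodes, values)):
        Lj = 1.0
        for i, xi in enumerate(nodes):
            if i != j:
                Lj *= (x - xi) / (xj - xi)
        total += Lj * fj
    return total

nodes = [1.0, 3.0, 4.0, 6.0]
values = [-3.0, 9.0, 30.0, 132.0]
print(lagrange(nodes, values, 4.0))   # 30.0  (interpolation reproduces the data)
print(lagrange(nodes, values, 2.0))   # approx 0.0 (= 2^3 - 3*2^2 + 5*2 - 6)
```

The nested loops make the cost of each evaluation O(n²), which is the "lot of computation" mentioned in the Remark above.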
Substituting in (6), we get

E2) Let w(x) = ∏_{k=0}^{n} (x - x_k). Show that the interpolating polynomial of degree ≤ n with
the nodes x_0, x_1, ..., x_n can be written as

P_n(x) = Σ_{k=0}^{n} [w(x) / ((x - x_k) w′(x_k))] f(x_k).
Example 3: From the following table, find the Lagrange interpolating polynomial which
agrees with the values of x at the given values of y. Hence find the value of x when y = 2.

Solution: Let x = g(y). The Lagrange interpolating polynomial P(y) of g(y) is given by

∴ x = P(y) = y³ - y² + 1
∴ when y = 2, x = P(2) = 5.
Example 4: Find the value of x when y = 3 from the following table of values.
E3) Find the Lagrange interpolation polynomial of f(x) from the following data. Hence
obtain f(2).

E4) Using the Lagrange interpolation formula, find the value of f(x) when x = 0 from
the following table:
E6) From the following table of values, find the value of y when x = 2.5
E8) Using the Lagrange's interpolation formula, find the value of y when x = 10.
E9) In the following table, h is the height above the sea level and p is the barometric
pressure. Calculate p when h = 5280.
E10) In the following table, y represents the percentage of the number of workers in a
factory whose age is less than x years. Find what percentage of workers have their
age less than 35 years.
Now we are going to find the error committed in approximating the value of the function
by P_n(x).

9.4 ERROR

Let E_n(x) = f(x) - P_n(x) be the error involved in approximating the function f(x) by an
interpolating polynomial. We derive an expression for E_n(x) in the following theorem.
This result helps us in estimating a useful bound on the error, as explained in an example.
Theorem 4: Let x_0, x_1, ..., x_n be distinct numbers in the interval [a, b] and let f have
(continuous) derivatives up to order (n + 1) in the open interval ]a, b[. If P_n(x) is the
interpolating polynomial of degree ≤ n which interpolates f(x) at the points x_0, ..., x_n, then
for each x in [a, b] a number ξ(x) in ]a, b[ exists such that

E_n(x) = f(x) - P_n(x) = [f^(n+1)(ξ(x)) / (n + 1)!] ∏_{i=0}^{n} (x - x_i)    (9)

Proof: If x ≠ x_k for any k = 0, 1, 2, ..., n, define the function g for t in [a, b] by

Since f(t) has continuous derivatives up to order (n + 1) and P(t) has derivatives of all
orders, g(t) has continuous derivatives up to order (n + 1). Now, for k = 0, 1, 2, ..., n, we
have
The error formula (Eqn. (9)) derived above is an important theoretical result because
Lagrange interpolating polynomials are extensively used in deriving important formulae
for numerical differentiation and numerical integration.

It is to be noted that ξ = ξ(x) depends on the point x at which the error estimate is
required. This dependence need not even be continuous. This error formula is of limited
utility since f^(n+1)(x) is not known (when we are given a set of data at specific nodes) and
the point ξ is hardly known. But the formula can be used to obtain a bound on the error of
the interpolating polynomial. Let us see how, by an example.
Example 5: The following table gives the values of f(x) = e^x. If we fit an interpolating
polynomial of degree four to the data, find the magnitude of the maximum possible error
in the computed value of f(x) when x = 1.25.

Solution: From Eqn. (9), the magnitude of the error associated with the 4th degree
polynomial approximation is given by

|E_4(x)| ≤ |(x - x_0)(x - x_1)(x - x_2)(x - x_3)(x - x_4)| Max |f^(5)(x)| / 5!    (10)

Since f(x) = e^x, f^(5)(x) = e^x. When x lies in the interval [1.2, 1.6],

Max |f^(5)(x)| = e^1.6 = 4.9530    (11)

Substituting (11) in (10) and putting x = 1.25, the upper bound on the magnitude of the
error

= 0.00000135.
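The bound of Eqn. (10) can be sketched numerically. The table of Example 5 is not legible in our copy, so the nodes below are an assumption: taking them as 1.2, 1.3, 1.4, 1.5, 1.6 reproduces the quoted bound 0.00000135 exactly.

```python
import math

def interp_error_bound(nodes, x, deriv_max):
    """Bound from Eqn. (9): |f(x) - P_n(x)| <= M/(n+1)! * prod |x - x_i|,
    where M bounds |f^(n+1)| on the interval."""
    w = 1.0
    for xi in nodes:
        w *= abs(x - xi)
    return deriv_max * w / math.factorial(len(nodes))

nodes = [1.2, 1.3, 1.4, 1.5, 1.6]   # assumed tabulation of f(x) = e^x
bound = interp_error_bound(nodes, 1.25, math.exp(1.6))
print(f"{bound:.8f}")                # 0.00000135
```

Note how small the bound is: with five closely spaced nodes the product of the |x - x_i| factors is already of order 10^-5 before dividing by 5!.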
You may now try the following exercises.

E11) For the data of Example 5, with the last node omitted, i.e., considering only the first
four nodes, if we fit a polynomial of degree 3, find an estimate of the magnitude of the
error in the computed value of f(x) when x = 1.25. Also find an upper bound on the
magnitude of the error.
E12) The following table gives the values of x and f(x) = sinh x. If the value of f(x) when
x = 0.53 is computed from the second degree interpolation polynomial, find the
estimate of the magnitude of the error.
E14) Find the value of x when y = 4 from the table given below:
E15) Find the interpolating polynomial which fits the following data, taking x as the
independent variable.
E16) Using Lagrange's interpolation formula, find the value of f(4) from the following
data:
Let us take a brief look at what you have studied in this unit.
9.5 SUMMARY
In this unit, we have seen how to derive the Lagrange form of interpolating polynomial
for a given data. It has been shown that the interpolating polynomial for a given data is
unique. Moreover, the Lagrange form of interpolating polynomial can be determined for
equally spaced or unequally spaced nodes. We have also seen how the Lagrange
interpolation formula can be applied with y as the independent variable and x as the
dependent variable, so that the value of x corresponding to a given value of y can be
calculated approximately when some conditions are satisfied. Finally, we have derived the
general error formula, and its use has been illustrated to judge the accuracy of our
calculation. The mathematical formulae derived in this unit are listed below for your easy
reference.
1) Lagrange's Form
   Pn(x) = Σ (i = 0 to n) Li(x) f(xi)
3) Interpolation Error
   f(x) = Pn(x) = Σ (i = 0 to n) Li(x) f(xi), by uniqueness of the interpolating polynomial.
   Since w(x) = Π (j = 0 to n) (x - xj)
UNIT 10 NEWTON FORM OF THE
INTERPOLATING POLYNOMIAL
Structure
10.1 Introduction
Objectives
10.7 Summary
10.8 Solutions/Answers
10.1 INTRODUCTION
The Lagrange form of the interpolating polynomial derived in Unit 9 has some drawbacks
compared to the Newton form of interpolating polynomial that we are going to consider now.
In practice, one is often not sure as to how many interpolation points to use. One often
calculates P1(x), P2(x), ..., increasing the number of interpolation points, and hence the
degrees of the interpolating polynomials, till one gets a satisfactory approximation Pk(x) to
f(x). In such an exercise, the Lagrange form seems wasteful: in calculating Pk(x), no
advantage is taken of the fact that one has already constructed Pk-1(x), whereas in the
Newton form this is not so.
Objectives
form a table of divided differences and find divided differences with a given set of
arguments from the table;
find an estimate of f(x) for a given non-tabular value of x from a table of values of
x and y [= f(x)];
relate the kth order derivative of f(x) with the kth order divided difference from the
expression for the error term.
10.3 DIVIDED DIFFERENCES
Suppose that Pn(x) is the Lagrange polynomial of degree at most n that agrees with the
function f at the distinct numbers x0, x1, ..., xn. Pn(x) can have the following
representation, called the Newton form.
The remaining divided differences of higher orders are defined inductively as follows. The
kth divided difference relative to xi, xi+1, ..., xi+k is defined as
Since each of the remaining terms in Eqn. (1) has the factor (x - x0)(x - x1)...(x - xk),
Eqn. (1) can be rewritten as
Pn(x) = Qk(x) + (x - x0)...(x - xk) R(x) for some polynomial R(x). As the term (x - x0)
(x - x1)...(x - xk) R(x) vanishes at each of the points x0, ..., xk, we have f(xi) = Pn(xi) = Qk(xi),
i = 0, 1, 2, ..., k. Since Qk(x) is a polynomial of degree ≤ k, by uniqueness of the
interpolating polynomial, Qk(x) = Pk(x).
This shows that Pn(x) can be constructed step by step with the addition of the next term in
Eqn. (1), as one constructs the sequence P0(x), P1(x), ..., with Pk(x) obtained from Pk-1(x)
in the form
That is, g(x) is a polynomial of degree ≤ k having (at least) the k distinct zeros x0, ..., xk-1.
∴ Pk(x) - Pk-1(x) = g(x) = Ak(x - x0)...(x - xk-1) for some constant Ak. This constant
Ak is called the kth divided difference of f(x) at the points x0, ..., xk, for reasons discussed
below, and is denoted by f[x0, ..., xk]. Thus Eqn. (2) can be rewritten as
Pk(x) = Pk-1(x) + f[x0, ..., xk] (x - x0)(x - x1)...(x - xk-1)        (3)
To get an explicit expression for f[x0, ..., xk], we make use of the Lagrange form of the
interpolating polynomial and the uniqueness of the interpolating polynomial.
Since (x - x0)(x - x1)...(x - xk-1) = x^k + a polynomial of degree < k, we can
rewrite Pk(x) as
Pk(x) = f[x0, ..., xk] x^k + a polynomial of degree < k        (4)
if y0, ..., yk is a reordering of the sequence x0, ..., xk. We have defined the zeroth divided
difference of f(x) at x0 by f[x0] = f(x0), which is consistent with Eqn. (5).
This shows that the first divided difference is really a divided difference.
This shows that the second divided difference is a divided difference of divided
differences.
This shows that the kth divided difference is the divided difference of (k - 1)st divided
differences, justifying the name. If M = {x0, ..., xn} and N denotes any n - 1 elements of
M, and the remaining two elements are denoted by α and β, then
f[x0, ..., xn] = [(n - 1)st divided difference on N and α - (n - 1)st divided difference on N and β] / (α - β)        (7)
Theorem 1:
Proof: Let Pj-1(x) be the polynomial of degree ≤ j - 1 which interpolates f(x) at x0, ..., xj-1,
and let Qj-1(x) be the polynomial of degree ≤ j - 1 which interpolates f(x) at the points
x1, ..., xj. Let us define P(x) as
This is a polynomial of degree ≤ j, and P(xi) = f(xi) for i = 0, 1, ..., j. By uniqueness of the
interpolating polynomial, we have P(x) = Pj(x). Therefore
Equating the coefficients of x^j on both sides of Eqn. (8), we obtain the (leading) coefficient of
We now illustrate this theorem with the help of a few examples, but before that we give the
table of divided differences of various orders.
Suppose we denote, for convenience, a first order divided difference of f(x) with any two
arguments by f[.,.], a second order divided difference with any three arguments by f[.,.,.],
and so on. Then the table of divided differences can be written as follows.
Table 1
For f(x) = x^3:
f[a,b,c] = (f[b,c] - f[a,b]) / (c - a)
         = [(c^2 - a^2) + b(c - a)] / (c - a)
         = a + b + c.
For f(x) = 1/x:
f[a,b,c] = (f[b,c] - f[a,b]) / (c - a)
         = [-1/(bc) + 1/(ab)] / (c - a)
         = [(c - a)/(abc)] / (c - a)
         = 1/(abc)
Similarly,
f[b,c,d] = 1/(bcd)
∴ f[a,b,c,d] = (f[b,c,d] - f[a,b,c]) / (d - a)
             = [1/(bcd) - 1/(abc)] / (d - a)
             = [(a - d)/(abcd)] / (d - a)
             = -1/(abcd)
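A small sketch verifying these values with exact rational arithmetic; the points a = 2, b = 3, c = 5, d = 7 are arbitrary choices for illustration:

```python
from fractions import Fraction

def divided_difference(xs, f):
    # f[x0,...,xk] = (f[x1,...,xk] - f[x0,...,xk-1]) / (xk - x0)
    if len(xs) == 1:
        return f(xs[0])
    return (divided_difference(xs[1:], f)
            - divided_difference(xs[:-1], f)) / (xs[-1] - xs[0])

f = lambda x: Fraction(1) / x
a, b, c, d = map(Fraction, (2, 3, 5, 7))
print(divided_difference([a, b, c], f))     # 1/30  = 1/(abc)
print(divided_difference([a, b, c, d], f))  # -1/210 = -1/(abcd)
```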
In the next section we shall make use of divided differences to derive Newton's general
form of interpolating polynomial.
In Sec. 10.2 we have shown how Pn(x) can be constructed step by step as one constructs the
sequence P0(x), P1(x), ..., with Pk(x) obtained from Pk-1(x) with the addition of the
next term in Eqn. (3), that is,
Pk(x) = Pk-1(x) + (x - x0)(x - x1)...(x - xk-1) f[x0, ..., xk]
Using this, Eqn. (1) can be rewritten as
Pn(x) = f[x0] + (x - x0) f[x0,x1] + (x - x0)(x - x1) f[x0,x1,x2] + ... +
(x - x0)(x - x1)...(x - xn-1) f[x0,x1,...,xn].        (9)
This can be written compactly as follows :
Example 3: From the following table of values, find the Newton's form of interpolating
polynomial approximating f(x).
Solution: We notice that the values of x are not equally spaced. We are required to find a
polynomial which approximates f(x). We form the table of divided differences of f(x).
Table 2

 x     f(x)   First   Second   Third   Fourth
-1        3
                 -9
 0       -6               6
                 15                 5
 3       39              41                  1
                261                13
 6      822             132
                789
 7     1611
Since the divided differences up to order 4 are available, the Newton's interpolating
polynomial P4(x) is given by
P4(x) = f(x0) + (x - x0) f[x0,x1] + (x - x0)(x - x1) f[x0,x1,x2] +
(x - x0)(x - x1)(x - x2) f[x0,x1,x2,x3] + (x - x0)(x - x1)(x - x2)(x - x3) f[x0,x1,x2,x3,x4]
where x0 = -1, x1 = 0, x2 = 3, x3 = 6 and x4 = 7.
We now consider an example to show how Newton's interpolating polynomial can be used
to obtain the approximate value of the function f(x) at any non-tabular point.
Example 4: Find the approximate values of f(x) at x = 2 and x = 5 in Example 3.
Solution: Since f(x) ≈ P4(x), from Example 3, we get
f(2) ≈ P4(2) = 16 - 24 + 20 - 6 = 6
and
f(5) ≈ P4(5) = 625 - 375 + 125 - 6 = 369
Note 1: When the values of f(x) for given values of x are required to be found, it is not
necessary to find the interpolating polynomial P4(x) in its simplified form given
above. We can obtain the required values by substituting the values of x in
Eqn. (11) itself. Thus,
P4(2) = 3 + (3)(-9) + (3)(2)(6) + (3)(2)(-1)(5) + (3)(2)(-1)(-4)(1)
      = 3 - 27 + 36 - 30 + 24 = 6.
Similarly,
P4(5) = 3 + (6)(-9) + (6)(5)(6) + (6)(5)(2)(5) + (6)(5)(2)(-1)(1)
      = 3 - 54 + 180 + 300 - 60 = 369.
Then f(2) ≈ P4(2) = 6
and f(5) ≈ P4(5) = 369.
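The whole calculation can be sketched in a few lines: build the top edge of the divided-difference table for the data of Example 3, then evaluate the Newton form by nested multiplication:

```python
def newton_coefficients(xs, fs):
    # Top edge of the divided-difference table: f[x0], f[x0,x1], ...
    coeffs = [fs[0]]
    col = list(fs)
    for k in range(1, len(xs)):
        col = [(col[i + 1] - col[i]) / (xs[i + k] - xs[i])
               for i in range(len(col) - 1)]
        coeffs.append(col[0])
    return coeffs

def newton_eval(xs, coeffs, x):
    # Nested (Horner-like) evaluation of the Newton form, Eqn. (9)
    result = coeffs[-1]
    for k in range(len(coeffs) - 2, -1, -1):
        result = result * (x - xs[k]) + coeffs[k]
    return result

xs = [-1, 0, 3, 6, 7]
fs = [3, -6, 39, 822, 1611]
coeffs = newton_coefficients(xs, fs)
print(coeffs)                       # [3, -9.0, 6.0, 5.0, 1.0]
print(newton_eval(xs, coeffs, 2))   # 6.0
print(newton_eval(xs, coeffs, 5))   # 369.0
```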
Example 5: Obtain the divided differences interpolation polynomial and the Lagrange
interpolating polynomial of f(x) from the following data and show that they are the same.
Table 3
On simplifying, we get
P(x) = x3 + x - 4.
Thus, we find that both polynomials are the same.
E2) From the table of values given below, obtain the value of y when x = 1.5 using
E3) Using Newton's divided differences interpolation formula, find the values of f(8)
and f(15) from the following table.
In Unit 9 we derived the general error term, i.e., the error committed in approximating
f(x) by Pn(x). In the next section we derive another expression for the error term in terms of
divided differences.
This shows that the error is like the next term in the Newton form.
Comparing, we have f[x0, x1, ..., xn+1] = f^(n+1)(ξ) / (n + 1)!
Theorem 2: Let f(x) be a real-valued function, defined on [a,b] and n times differentiable
in ]a,b[. If x0, ..., xn are n + 1 distinct points in [a,b], then there exists ξ ∈ ]a,b[ such that
Corollary 1:
If f(x) = x^n, then
f[x0, ..., xn] = 1.
Corollary 2:
If f(x) = x^k, k < n, then
f[x0, ..., xn] = 0.
In the next section, we are going to discuss bounds on the interpolation error.
Consider now the case when the nodes are equally spaced, that is, xj = x0 + jh, j = 0, ..., N,
and h is the spacing between consecutive nodes. For the case n = 1 we have linear
interpolation. If x ∈ [xi-1, xi], then we approximate f(x) by P1(x), which interpolates at
xi-1 and xi. From Eqn. (14) we have
|E1(x)| ≤ (1/2) max (t ∈ I) |f''(t)| · max (x ∈ I) |ψ1(x)|
where ψ1(x) = (x - xi-1)(x - xi).
Now,
dψ1/dx = (x - xi-1) + (x - xi) = 0
Hence, the maximum value of |(x - xi-1)(x - xi)| occurs at x = x* = (xi-1 + xi)/2.
For the case n = 2, it can be shown that for any x ∈ [xi-1, xi+1],
|E2(x)| ≤ h^3 M / (9√3), where |f'''(x)| ≤ M on I.
Example 8: Determine the spacing h in a table of equally spaced values of the function
f(x) = √x between 1 and 2, so that interpolation with a first degree polynomial in this
table yields seven-place accuracy.
Solution: Here
max (1 ≤ x ≤ 2) |f''(x)| = 1/4,
and
|E1(x)| ≤ (h^2/8)(1/4) = h^2/32.
For seven-place accuracy, h is to be chosen such that
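That is, h^2/32 < 5·10^-8. A short sketch of the computation:

```python
import math

# Linear-interpolation error bound for f(x) = sqrt(x) on [1, 2]:
# |E1(x)| <= (h^2 / 8) * max|f''|, with f''(x) = -(1/4) x^(-3/2),
# so max|f''| = 1/4, attained at x = 1.
M2 = 0.25
tol = 5e-8                     # seven-place accuracy
h = math.sqrt(8 * tol / M2)    # largest admissible spacing
print(h)                       # about 0.00126
```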
E6) If f(x) takes the values -21, 15, 12 and 3 respectively when x assumes the values
-1, 1, 2 and 3, find the polynomial which approximates f(x).
E7) Using the following table of values, find the polynomial which approximates f(x).
Hence obtain the value of f(5).
E9) If f(3) = 168, f(7) = 120, f(9) = 72 and f(10) = 63, find an approximate value of
f(6).
E10) The following table gives steam pressures P at different temperatures T, measured
in degrees. Find the pressure at temperature 372.1 degrees.
E13) Obtain the polynomial which agrees with the values of f(x) as shown below
E14) Determine the spacing h in a table of equally spaced values of the function f(x) = √x
between 1 and 2, so that interpolation with a second-degree polynomial in this table
yields seven-place accuracy.
We now end this unit by giving a summary of what we have covered in it.
10.6 SUMMARY
In this unit we have derived a form of interpolating polynomial called Newton's general
form, which has some advantages over the Lagrange form discussed in Unit 9. This form
is useful in deriving some other interpolating formulas. We have introduced the concept of
divided differences and discussed some of its important properties before deriving
Newton's general form. The error term has also been derived, and utilizing the error term
we have established a relationship between the divided difference and the derivative of the
function f(x) for which the interpolating polynomial has been obtained. The main formulas
derived are listed below:
f(15) = 3150
E4) - 3
E9) 147
E10) 177.4
Ell) 15.79
E12) 84
E13) x^3 + x^2 - x + 2
E14) h^3/(24√3) < 5·10^-8. This gives h ≈ 0.0128.
The number of intervals is N = (2 - 1)/h ≈ 79.
UNIT 11 INTERPOLATION AT EQUALLY
SPACED POINTS
Structure
11.1 Introduction
Objectives
11.2 Differences
11.2.1 Forward Differences
11.2.2 Backward Differences
11.2.3 Central Differences
11.4 Summary
11.1 INTRODUCTION
Suppose that y is a function of x. The exact functional relation y = f(x) between x and y
may or may not be known. But the values of y at (n + 1) equally spaced values of x are
supposed to be known, i.e., (xi, yi), i = 0, ..., n are known, where xi - xi-1 = h (fixed),
i = 1, 2, ..., n. Suppose that we are required to determine an approximate value of f(x)
or its derivative f'(x) for some values of x in the interval of interest. The methods for
solving such problems are based on the concept of finite differences. We have
introduced the concept of forward, backward and central differences and discussed their
interrelationship in Sec. 11.2.
We have already introduced two important forms of the interpolating polynomial in Units
9 and 10. These forms simplify when the nodes are equidistant. For the case of equidistant
nodes, we have derived the Newton's forward and backward difference forms and Stirling's
central difference form of interpolating polynomial, each suitable for use under a specific
situation. We have derived these methods in Sec. 11.3, and also given the corresponding
error term.
Objectives
After reading this unit, you should be able to
11.2 DIFFERENCES
Suppose that we are given a table of values (xi, yi), i = 0, 1, 2, ..., N, where yi = f(xi) = fi.
Let the nodal points be equidistant. That is,
s = s(x) = (x - x0)/h, so that x = x(s) = x0 + sh
The linear change of variables in Eqn. (2) transforms polynomials of degree n in x into
polynomials of degree n in s. We have already introduced the divided-difference table to
calculate a polynomial of degree ≤ n which interpolates f(x) at x0, x1, ..., xn. For equally
spaced nodes, we shall deal with three types of differences, namely, forward, backward
and central, and discuss their representation in the form of a table. We shall also derive the
relationship of these differences with divided differences and their interrelationship.
= fk+2 - 2fk+1 + fk
For i = 0, both sides of relation (5) are the same by convention, that is,
This shows that relation (5) holds for i = n + 1 also. Hence (5) is proved. We now give a
result which immediately follows from this theorem in the following corollary.
where f(x) is a real-valued function defined on [a,b] and i times differentiable in ]a,b[, and
ξ ∈ ]a,b[.
Taking i = n and f(x) = Pn(x) in Eqns. (6) and (7), we get
Δ^n Pn(x0) = h^n n! an.
Since Δ^(n+1) Pn(x0) = Δ^n Pn(x1) - Δ^n Pn(x0)
= h^n n! an - h^n n! an = 0.
This completes the proof.
The shift operator E is defined as
Efi = fi+1
In general, Ef(x) = f(x + h).
We have E^s fi = fi+s.
For example,
E^2 fi = fi+2, E^3 fi = fi+3 and E^(-1/2) fi = fi-1/2.
Now,
Δfi = fi+1 - fi = Efi - fi = (E - 1)fi
Hence the shift and forward difference operators are related by
Δ = E - 1
or E = 1 + Δ.
Operating s times, we get
We now give in Table 1 the forward differences of various orders using 5 values.
Note that the forward differences Δ^k f0 lie on a straight line sloping downward to the right.
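Such a table is mechanical to build: each column is the first difference of the previous one. A small sketch, using illustrative values of f(x) = x^3 + 2x + 7 (the cubic that appears again in Example 6 of this unit):

```python
# Forward differences of various orders from 5 tabulated values,
# built column by column (Delta f_i = f_{i+1} - f_i).
def forward_difference_columns(fs):
    cols = [list(fs)]
    while len(cols[-1]) > 1:
        prev = cols[-1]
        cols.append([prev[i + 1] - prev[i] for i in range(len(prev) - 1)])
    return cols

# Illustrative data: f(x) = x^3 + 2x + 7 at x = 1, 2, 3, 4, 5
fs = [10, 19, 40, 79, 142]
for k, col in enumerate(forward_difference_columns(fs)):
    print(k, col)
# 0 [10, 19, 40, 79, 142]
# 1 [9, 21, 39, 63]
# 2 [12, 18, 24]
# 3 [6, 6]        <- constant third differences, since the data are cubic
# 4 [0]
```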
The backward differences of f(x) of ith order at xk = x0 + kh are denoted by ∇^i fk. They are
defined as follows:
Note that the backward differences ∇^k f4 lie on a straight line sloping upward to the right.
= fk+1 - fk - (fk - fk-1)
= fk+1 - 2fk + fk-1
Similarly, /
Since
The central differences at a non-tabular point xk+1/2 can be calculated in a similar way.
For example,
δfk+1/2 = fk+1 - fk
δ^2 fk+1/2 = fk+3/2 - 2fk+1/2 + fk-1/2
δ^4 fk+1/2 = fk+5/2 - 4fk+3/2 + 6fk+1/2 - 4fk-1/2 + fk-3/2
We have
We now give below the central difference table with 5 nodes.
Note that the differences δ^(2m) f0 lie on a horizontal line shown by the dotted lines.
μfk = (1/2)[E^(1/2) + E^(-1/2)] fk.
Hence
μ = (1/2)[E^(1/2) + E^(-1/2)],  Δ = E - 1.
Also E^(1/2) = μ + δ/2
and E^(-1/2) = μ - δ/2.
(a) δ^2 fk = [E^(-1/2) Δ]^2 fk = E^(-1) Δ^2 fk = Δ^2 E^(-1) fk   (since δ = E^(-1/2) Δ),
and μ^2 = 1 + δ^2/4.
(b) L.H.S.
μδ = (1/2)(E^(1/2) + E^(-1/2))(E^(1/2) - E^(-1/2)) = (1/2)(E - E^(-1))
R.H.S.
(1/2)(Δ + ∇) = (1/2)[(E - 1) + (1 - E^(-1))] = (1/2)(E - E^(-1)).
(c) We have
μδ = (1/2)(E^(1/2) + E^(-1/2))(E^(1/2) - E^(-1/2)) = (1/2)(E - E^(-1))
∴ 1 + μ^2 δ^2 = 1 + (E - E^(-1))^2/4 = [(E - E^(-1))^2 + 4]/4 = (E + E^(-1))^2/4
In Unit 10, we have derived Newton's form of interpolating polynomial (using divided
differences). We have also established in Sec. 11.2.1 the following relationship between
divided differences and forward differences.
Substituting the divided differences in terms of the forward differences in the Newton's
form, and simplifying, we get Newton's forward-difference form. The Newton's form of
interpolating polynomial interpolating at xk, xk+1, ..., xk+n is
Pn(x) = fk + (s - k) Δfk + [(s - k)(s - k - 1)/2!] Δ^2 fk + ... +
[(s - k)(s - k - 1)...(s - k - n + 1)/n!] Δ^n fk        (25)
of degree ≤ n.
The form (23), (24), (25) or (26) is called the Newton's forward-difference formula.
The error term is now given by
Since the third order differences are constant, the higher order differences vanish, and we
can infer that f(x) is a polynomial of degree 3; the Newton's forward-difference
interpolation polynomial exactly represents f(x) and is not an approximation to f(x). The
step length in the data is h = 1. Taking x0 = 1 and the subsequent values of x as x1, x2, ...,
the Newton's forward-difference interpolation polynomial
becomes
and
E3) The population of a town in the decennial census was as given below. Estimate the
population for the year 1915.
Population y: 46 66 81 93 101
(in thousands)
Example 5: From the following table, find the number of students who obtained less than
45 marks.
Solution: We form a table of the number of students f(x) whose marks are less than x. In
other words, we form a cumulative frequency table.
= 47.8672 ≈ 48
The number of students who obtained less than 45 marks is approximately 48.
E4) From the following table, find the value of y (0.23):
E5) Find the cubic polynomial which approximates y(x), given that
E6) The following table gives the values of tan x for 0.1 ≤ x ≤ 0.3. Find the value of
tan(0.12).
E7) The following table gives the population of a town in ten consecutive censuses.
Calculate the population in the year 1915 and 1918. Hence obtain the increase in
population during the period 1915 and 1918.
Population y 12 15 20 . 27 39 52
(in thousands)
E8) Find the number of men getting wages between Rs. 10 and Rs. 15 from the
following table.
No. of men y 9 30 35 42
E9) The following table shows the monthly premiums to be paid to a company at
different ages. Find the premium to be paid at the age of 26 years.
Age 20 24 28 32 36
E10) The area A of a circle of diameter d is given in the following table. Find the area of
the circle when the diameter is 82 units.
E11) In an examination, the number of candidates who secured marks in certain limits
were as follows:
Marks: 0-19  20-39  40-59  60-79  80-99
No. of candidates: 41  62  65  50  17
Find the number of candidates whose marks are 25 or less.
E12) The following table gives the amount of a chemical dissolved in water at different
temperatures.
x - xn-j = (s + n - n + j)h = (s + j)h
and
The backward-difference form is suitable for approximating the value of the function at an x
that lies towards the end of the table.
Example 6: Find the Newton's backward differences interpolating polynomial for the data of
Example 4.
Solution: Tables 5 and 7 are the same, except that we consider the differences of Table 7 as
backward differences. If we name the abscissas as x0, x1, ..., x5, then xn = x5 = 6, fn = f5 =
235. With h = 1, the Newton's backward differences polynomial for the given data is given
by
P(x) = x^3 + 2x + 7,
which is the same as the Newton's forward differences interpolation polynomial in
Example 4.
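Since the forward-difference form reproduces the cubic exactly, a short sketch can confirm it. The table values are assumed to be f(x) = x^3 + 2x + 7 at x = 1, ..., 6, consistent with f5 = 235 quoted above:

```python
def newton_forward(x0, h, fs, x):
    # Top-edge coefficients f0, Delta f0, Delta^2 f0, ..., then
    # P(x0 + sh) = f0 + s*Df0 + s(s-1)/2! * D^2 f0 + ...
    diffs, col = [fs[0]], list(fs)
    while len(col) > 1:
        col = [col[i + 1] - col[i] for i in range(len(col) - 1)]
        diffs.append(col[0])
    s = (x - x0) / h
    total, coeff = 0.0, 1.0
    for k, d in enumerate(diffs):
        total += coeff * d
        coeff *= (s - k) / (k + 1)
    return total

fs = [x**3 + 2*x + 7 for x in range(1, 7)]   # 10, 19, 40, 79, 142, 235
print(newton_forward(1, 1, fs, 2.5))         # 27.625 = 2.5^3 + 2(2.5) + 7
```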
11.3.3 Stirling's Central Difference Form
A number of central difference formulas are available which can be used according to the
situation to maximum advantage. But we shall consider only one such method, known as
Stirling's method. This formula is used whenever interpolation is required at an x near the
middle of the table of values.
For the central difference formulas, the origin x0 is chosen near the point being
approximated, and points below x0 are labelled as x1, x2, ... and those directly above as
x-1, x-2, ... (as in Table 3). Using this convention, Stirling's formula for interpolation
is given by
If n = 2p is even, then the same formula is used deleting the last term.
The Stirling's interpolation is used for calculation when x lies
between x0 - (1/4)h and x0 + (1/4)h.
It may be noted from Table 3 that the odd order differences at x-1/2 are those which
lie along the horizontal line between x0 and x-1. Similarly, the odd order differences at
x1/2 are those which lie along the horizontal line between x0 and x1. Even order differences
at x0 are those which lie along the horizontal line through x0.
Example 8: Using Stirling's formula, find the value of f(1.32) from the following table of
values.
Solution:
∴ s = (x - x0)/h = (1.32 - 1.30)/0.1 = 0.2.
From Eqn. (32), we have
f(x) = f0 + (s/2)[δf-1/2 + δf1/2] + (s^2/2!) δ^2 f0 + [s(s^2 - 1^2)/(2·3!)][δ^3 f-1/2 + δ^3 f1/2]
       + [s^2(s^2 - 1^2)/4!] δ^4 f0.
Now,
(1/2)[δf-1/2 + δf1/2] = (1/2)(0.1889 + 0.2059) = 0.1974
= 1.73816 ≈ 1.7382.
where s = (x - x0)/h.
2. Newton's backward difference formula:
if n = 2p + 1 is odd. If n = 2p is even, the same formula is used deleting the last term.
11.5 SOLUTIONS/ANSWERS
E1) From Eqn. (12), ∇^4 f5 = f5 - 4f4 + 6f3 - 4f2 + f1
E2) ~ + E- vz)= Em 2pS = 2 ~ *pS
LHS = E ' (Em
RHS = ~E"(E* - Em)p = 2EYr8.
E3) The forward differences table is given below.
Table 10
Taking x0 = 1911, x = 1915, h = 10, we get
s = (1915 - 1911)/10 = 0.4
= 54.8528
or y(1915) = 54.85 thousands.
E8) 15
E10) 5281
E11) 48
E13) 2x^2 - 7x + 9
E14) 1.7081
E15) ~ ~ - +2 1~ 2
E16) 0.2662,0.4241
E17) Population in 1954 is 43.33 thousands and the population in 1958 is 48.81 thousands.
Hence the increase in population is approximately 5.48 thousands.
E19) Hint: The number of candidates f(x) whose marks are less than or equal to x is as
follows:
E20) 2x^2 - 7x + 9
∴ s = (1.725 - 1.7)/0.1 = 0.25
Table 11
UNIT 12 NUMERICAL DIFFERENTIATION
Structure
12.1 Introduction
Objectives
12.2 Methods Based on Undetermined Coefficients
12.3 Methods Based on Finite Difference Operators
12.4 Methods Based on Interpolation
12.5 Richardson's Extrapolation
12.6 Optimum Choice of Step Length
12.7 Summary
12.8 Solutions/Answers
12.1 INTRODUCTION
Differentiation of a function f(x) is a fundamental and important concept in calculus.
When the function is given explicitly, its derivatives f'(x), f''(x), ... etc. can be easily
found using the methods of calculus. For example, if f(x) = x^2, we know that f'(x) =
2x, f''(x) = 2 and all the higher order derivatives are zero. However, if the function is
not known explicitly but we are given a table of values of f(x) corresponding to a set
of values of x, then we cannot find the derivatives by using calculus methods. For
instance, if f(xk) represents the distance travelled by a car in time xk, k = 0, 1, 2, ... seconds,
and we require the velocity and acceleration of the car at any time xk, then the
derivatives f'(x) and f''(x), representing velocity and acceleration respectively, cannot be
found analytically. Hence the need arises to develop methods of differentiation to
obtain the derivative of a given function f(x), using the data given in the form of a
table, which might have been formed as a result of scientific experiments.
Numerical methods have the advantage that they are easily adaptable on calculators and
computers. These methods make use of the interpolating polynomials which we
discussed in Block-3. We shall now discuss, in this unit, a few numerical differentiation
methods, namely, the method based on undetermined coefficients, methods based on
finite difference operators and methods based on interpolation.
Objectives
After studying this unit you should be able to
explain the importance of the numerical methods over the calculus methods;
use the method of undetermined coefficients and methods based on finite difference
operators to derive differentiation formulas and obtain the derivative of a function at
step points;
use the methods derived from the interpolation formulas to obtain the derivative of a
function at off-step points;
use Richardson's extrapolation method for obtaining higher order solutions;
obtain the optimal step length for the given formula.
d^(p+1)/dx^(p+1) (x^m) = 0, for m = 0, 1, ..., p.
Let us now illustrate this idea by finding the numerical differentiation formula of O(h^4) for f''(xk).
Derivation of formula for f''(x)
Without loss of generality let us take xk = 0. We shall take the points symmetrically,
that is, xm = mh; m = 0, ±1, ±2.
Let f-2, f-1, f0, f1, f2 denote the values of f(x) at x = -2h, -h, 0, h, 2h respectively.
In this case the formula given by Eqn. (1) can be written as
h^2 f''(0) = γ-2 f-2 + γ-1 f-1 + γ0 f0 + γ1 f1 + γ2 f2        (2)
Let us now make the formula exact for f(x) = 1, x, x^2, x^3, x^4. Then, we have
f(x) = 1, f''(0) = 0; f-2 = f-1 = f0 = f1 = f2 = 1
f(x) = x, f''(0) = 0; f-2 = -2h; f-1 = -h; f0 = 0; f1 = h; f2 = 2h
f(x) = x^2, f''(0) = 2; f-2 = 4h^2 = f2; f-1 = h^2 = f1; f0 = 0        (3)
f(x) = x^3, f''(0) = 0; f-2 = -8h^3; f-1 = -h^3; f0 = 0; f1 = h^3; f2 = 8h^3
f(x) = x^4, f''(0) = 0; f-2 = 16h^4 = f2; f-1 = h^4 = f1; f0 = 0
f''(0) ≈ [1/(12h^2)] [-f-2 + 16f-1 - 30f0 + 16f1 - f2]        (5)
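Formula (5) can be checked numerically. A quick sketch applies it to f(x) = e^x at x = 0, where the exact second derivative is 1:

```python
import math

def second_derivative_o4(f, x, h):
    # Formula (5): an O(h^4) five-point approximation to f''(x)
    return (-f(x - 2*h) + 16*f(x - h) - 30*f(x)
            + 16*f(x + h) - f(x + 2*h)) / (12 * h**2)

approx = second_derivative_o4(math.exp, 0.0, 0.1)
print(approx, abs(approx - 1.0))   # the error is O(h^4): tiny for h = 0.1
```

Halving h should reduce the error by roughly a factor of 16, which is a simple way to confirm the fourth-order behaviour.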
Now, we know that the TE of the formula (5) is given by the first non-zero term in the
Taylor expansion of
You must have observed that in the numerical differentiation formulas discussed above, we
have to solve a linear system of equations. If the number of nodal points involved is large,
or if we have to determine a method of high order, then we have to solve a large system of
linear equations, which becomes tedious. To avoid this, we can use finite difference
operators to obtain the differentiation formulas, which we shall illustrate in the next section.
Numerical Differentiation, Integration and Solution of Differential Equations

12.3 METHODS BASED ON FINITE DIFFERENCE OPERATORS
Thus, we have
hD = 2 sinh^-1 (δ/2)
Notice that this formula involves off-step points when operated on f(x). The formula
involving only the step points can be obtained by using the relation (13), i.e.,
hD = sinh^-1 (μδ)
Thus, Eqns. (9), (10), (15) and (16) give us the relations between hD and the various
difference operators. Let us see how we can use these relations to derive numerical
differentiation formulas for fk', fk'', etc.
We first derive formulas for fk'. From Eqn. (9), we get
Thus forward difference formulas of O(h), O(h^2), O(h^3) and O(h^4) can be obtained by
retaining respectively 1, 2, 3 and 4 terms of the relation (9) as follows:
Similarly the TE of formulas (20) and (21) can be calculated. Backward difference
formulas of O(h), O(h^2), O(h^3) and O(h^4) for fk' can be obtained in the same way by
using the equality (10) and retaining 1, 2, 3 or 4 terms. We are leaving it as an exercise
for you to derive these formulas.
E2) Derive backward difference formulas for fk' of O(h), O(h^2), O(h^3) and O(h^4).
Central difference formulas for fk' can be obtained by using the relation (17), i.e.,
Note that relation (17) will give us methods of O(h^2) and O(h^4), on retaining 1 and 2 terms.
Solution: Here h = 0.1 and the exact value of e^x at x = 0.2 is 1.221402758.
Using (18), f'(0.2) = [f(0.3) - f(0.2)]/0.1 = 1.28456
Actual error = 1.221402758 - 1.28456 = -0.063157
Using (19), f'(0.2) = (1/0.2)[-f(0.4) + 4f(0.3) - 3f(0.2)] = 1.21701
TE = (h^2/3) f'''(0.2) = (0.01/3) e^0.2 = 0.004071
Actual error = 0.004393
Using (24), f'(0.2) = (1/0.2)[f(0.3) - f(0.1)] = 1.22344
TE = -(h^2/6) f'''(0.2) = -(0.01/6) e^0.2 = -0.002036
Actual error = -0.002037
Using (25), f'(0.2) = (1/1.2)[f(0.0) - 8f(0.1) + 8f(0.3) - f(0.4)] = 1.221399167
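A sketch reproducing all four estimates and their actual errors for f(x) = e^x with h = 0.1:

```python
import math

f, x, h = math.exp, 0.2, 0.1
exact = f(x)

fwd1 = (f(x + h) - f(x)) / h                                    # (18), O(h)
fwd2 = (-f(x + 2*h) + 4*f(x + h) - 3*f(x)) / (2*h)              # (19), O(h^2)
cen2 = (f(x + h) - f(x - h)) / (2*h)                            # (24), O(h^2)
cen4 = (f(x - 2*h) - 8*f(x - h) + 8*f(x + h) - f(x + 2*h)) / (12*h)  # (25), O(h^4)

for name, v in [("(18)", fwd1), ("(19)", fwd2), ("(24)", cen2), ("(25)", cen4)]:
    print(name, round(v, 6), "actual error", round(exact - v, 6))
```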
We can write the forward difference methods of O(h), O(h^2), O(h^3) and O(h^4) for fk'' by
using Eqn. (26) and retaining 1, 2, 3 and 4 terms as follows:
Backward difference formulas can be written in the same way by using Eqn. (27).
Central difference formulas of O(h^2) and O(h^4) for fk'' are obtained by using Eqn. (28)
and retaining 1 or 2 terms in the form:
E3) From the following table of values, find f'(6.0) using an O(h) formula and f''(6.3)
using an O(h^2) formula.
E4) Calculate the first and second derivatives of ln x at x = 500 from the following table.
Use the O(h^2) forward difference method. Compute the TE and actual errors.
x : 500 510 520 530
f(x) : 6.2146 6.2344 6.2538 6.2729
In Secs. 12.2 and 12.3, we have derived numerical differentiation formulas to obtain the
derivative values at nodal points or step points, when the function values are given in
the form of a table. However, these methods cannot be used to find the derivative
values at off-step points. In the next section we shall derive methods which can be used
for finding the derivative values at the off-step points as well as at step-points.
In these methods, given the values of f(x) at a set of points x0, x1, ..., xn, the general
approach for deriving numerical differentiation formulas is to obtain the unique
interpolating polynomial Pn(x) fitting the data. We then differentiate this polynomial q
times (q ≤ n) to get Pn^(q)(x). The value Pn^(q)(xk) then gives us the approximate value of
f^(q)(xk), where xk may be a step point or an off-step point. We would like to point out
here that even when the original data are known to be accurate, i.e. Pn(xk) = f(xk),
k = 0, 1, 2, ..., n, yet the derivative values may differ considerably at these points. The
approximations may further deteriorate while finding the values at off-step points, or as
the order of the derivative increases. However, these disadvantages are present in every
numerical differentiation formula, as in general one does not know whether the function
representing a table of values has a derivative at every point or not.
We shall first derive differentiation formulas for the derivatives using non-uniform nodal
points. That is, when the difference between any two consecutive points is not uniform.
Lk(x) = π(x) / [(x - xk) π'(xk)]        (36)
where π(x) = (x - x0)(x - x1)...(x - xn)        (37)
and π'(xk) = (xk - x0)(xk - x1)...(xk - xk-1)(xk - xk+1)...(xk - xn)        (38)
Since in Eqn. (40) the function ξ(x) is not known in the second term on the right
hand side, we cannot evaluate En'(x) directly. However, since at a nodal point xk,
π(xk) = 0, we obtain
If we want to obtain the differentiation formulas for any higher order, say the qth
(1 ≤ q ≤ n) order derivative, then we differentiate Pn(x) q times and get
where,
L0(x) = (x - x1)(x - x2) / [(x0 - x1)(x0 - x2)];  L0'(x) = (2x - x1 - x2) / [(x0 - x1)(x0 - x2)]
Hence, f'(x) ≈ Pn'(x) = L0'(x) f0 + L1'(x) f1 + L2'(x) f2
Example 5: Given the following values of f(x) = ln x, find the approximate value of
f'(2.0) and f''(2.0). Also find the errors of approximation.
∴ we get
The error is given by
Similarly,
the error is given by
E2''(x0) = (1/3)(2x0 - x1 - x2) f'''(2.0) + (1/24)(x0 - x1)(x0 - x2)[f''''(ξ3) + f''''(2.0)]
= -0.06917
You may now try the following exercise.
E5) Use Lagrange's interpolation to find f'(x) and f''(x) at x = 2.5 and 5.0 from the
following table.
Let the data (xk, fk), k = 0, 1, ..., n be given at (n + 1) points, where the step points xk,
k = 0, 1, ..., n are equispaced with step length h. That is, we have
with error
En(x) = [(x - x0)(x - x1)...(x - xn) / ((n + 1)! h^(n+1))] Δ^(n+1) f(α),  x0 < α < xn
If we put
(x - x0)/h = s, or x = x0 + sh, then Eqns. (44) and (45) reduce respectively to
and
En(s) = [s(s - 1)...(s - n) / (n + 1)!] h^(n+1) f^(n+1)(α)
which is the same as formula (9) obtained in Sec. 12.3 by the difference operator method. We can
obtain the derivative at any step or off-step point by finding the value of s and substituting
the same in Eqn. (48). The formula corresponding to Eqn. (47) in backward differences is
Example 6 : Find the first and second derivatives of f(x) at x = 1.1 from the
following tabulated values.
Solution : Since we have to find the derivative at x = 1.1, we shall use the forward
difference formula. The forward differences for the given data are given in Table 1.
Table 1
Substituting the values of Δf_0 and Δ³f_0 in Eqn. (50) from Table 1, we get
f'(1.1) = 0.63
To obtain the second derivative, we differentiate formula (48) and obtain

f''(x) ≈ P''_n(x) = (1/h²)[Δ²f_0 + (s − 1) Δ³f_0]

Thus f''(1.1) = 6.6
Note : If you construct a forward difference interpolating polynomial P(x), fitting the
data given in Table 1, you will find that f(x) = P(x) = x³ − 3x + 2. Also, f'(1.1) = 0.63,
f''(1.1) = 6.6. The values obtained from this equation, or directly as done above, have to
be the same, as the interpolating polynomial is unique.
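The note above can be checked numerically. Here is a small Python sketch; the data values for f(x) = x³ − 3x + 2 at x = 1.1, 1.2, 1.3, 1.4 with h = 0.1 are regenerated from the formula rather than read from Table 1, which is an assumption about the tabulated grid:

```python
# Forward-difference check, assuming the tabulated function is
# f(x) = x^3 - 3x + 2 sampled at x = 1.1, 1.2, 1.3, 1.4 with h = 0.1.

def forward_differences(values):
    """Return the columns [f, Δf, Δ²f, Δ³f, ...] of the difference table."""
    table = [list(values)]
    while len(table[-1]) > 1:
        prev = table[-1]
        table.append([prev[i + 1] - prev[i] for i in range(len(prev) - 1)])
    return table

h = 0.1
xs = [1.1 + i * h for i in range(4)]
fs = [x**3 - 3 * x + 2 for x in xs]
d = forward_differences(fs)

# f'(x0) ≈ (1/h)(Δf0 − Δ²f0/2 + Δ³f0/3) and f''(x0) ≈ (1/h²)(Δ²f0 − Δ³f0)
fprime = (d[1][0] - d[2][0] / 2 + d[3][0] / 3) / h
fsecond = (d[2][0] - d[3][0]) / h**2

print(fprime, fsecond)  # ≈ 0.63 and 6.6, agreeing with the text
```

Because the data come from a cubic, these third-difference formulas reproduce f'(1.1) = 0.63 and f''(1.1) = 6.6 essentially exactly.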
Solution : Since we are required to find the derivative at the end point, we will use the
backward difference formula. The backward difference table for the given data is given by
Table 2
Since x_n = 0.4, h = 0.1, x = 0.4, we get s = 0.
Substituting the value of s in formula (49), we get
= 1.14913
How about trying a few exercises now ?
E6) The position f(x) of a particle moving in a line at various times x is given in the
following table. Estimate the velocity and acceleration of the particle at x = 1.5 and 3.5.
Let f^(q)(h) denote the approximate value of f^(q)(x_k) obtained by using a formula of
order p with steplength h, and f^(q)(rh) denote the value of f^(q)(x_k) obtained by using the
same method of order p with steplength rh. Then,

f^(q)(h) = f^(q)(x_k) + C h^p + O(h^{p+1})    (51)

The expression on the right hand side of Eqn. (54) for finding the value of the qth
derivative by a certain method of order p has now become a method of order p + 1.
This technique of combining two computed values obtained by using the same method
with two different step sizes, to obtain higher order solutions is called Richardson's
extrapolation method.
terms, we have

TE = C_1 h^p + C_2 h^{p+1} + C_3 h^{p+2} + ...

By repeated application of Richardson's extrapolation technique we can obtain solutions
of higher orders, i.e. O(h^{p+1}), O(h^{p+2}), O(h^{p+3}) etc., by eliminating C_1, C_2, C_3 respectively.
Let us see how this can be done.
Let g(x_k) = f'(x_k) be the exact value of the derivative, which is to be obtained, and g(h)
be the value given by the O(h²) method. The truncation error of this method may be
written as

g(h) = g(x_k) + c_1 h² + c_2 h⁴ + c_3 h⁶ + ...    (55)

Let f'_k be evaluated with the different step sizes h/2^r, r = 0, 1, 2, ...

Then, we have
Then, we have
Notice that the methods g^(1)(h) and g^(1)(h/2) given by Eqns. (58) and (59) are O(h⁴)
approximations to g(x_k).
These extrapolations can be stopped when two successive values agree to the desired accuracy.
You may note that in Richardson's extrapolation, each improvement made for the forward
(or backward) difference formula increases the order of the solution by one, whereas for the
central difference formula each improvement increases the order by two.
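As an illustration of one extrapolation step, here is a hedged Python sketch; the function f(x) = eˣ, the point x = 1 and the step h = 0.4 are chosen for illustration and are not taken from the text:

```python
import math

# One Richardson-extrapolation step on the O(h²) central-difference
# estimate of f'(x), illustrated with f(x) = e^x at x = 1.

def g(f, x, h):
    """O(h²) central-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

f, x, h = math.exp, 1.0, 0.4
g0_h = g(f, x, h)
g0_h2 = g(f, x, h / 2)

# Eliminating the c1*h² term gives an O(h⁴) value: g1 = [4g(h/2) - g(h)]/3
g1 = (4 * g0_h2 - g0_h) / 3

exact = math.exp(1.0)
print(abs(g0_h2 - exact), abs(g1 - exact))  # the extrapolated value is closer
```

One division and one subtraction thus raise the order from two to four, exactly as the text describes for central differences.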
find f'(3).
Solution : Note that in this example x_1 = 3.0. The largest step h that can be taken
is h = 4. Computations can also be done by using the step lengths h_1 = h/2 = 2 and
h_2 = h_1/2 = 1.
we get

g(h), g(h/2), g(h/4) : O(h²) method.

g^(1)(h) = [4g(h/2) − g(h)]/3,  g^(1)(h/2) = [4g(h/4) − g(h/2)]/3 : O(h⁴) method.

Numerical Differentiation

g^(2)(h) = [16g^(1)(h/2) − g^(1)(h)]/15 : O(h⁶) method.
Example 9 : Let f(x) = eˣ. Using the central difference formula of O(h²),

f''_k = (f_{k+1} − 2f_k + f_{k−1}) / h²,

find f''(1). Improve this value using Richardson's extrapolation by taking h = 0.1 and h = 0.05.
Solution : With h = 0.1 and the above formula,
we get
E8) Compute f''(0.6) from the following table using the O(h²) central difference formula.
Improve it by Richardson's extrapolation method using step lengths h = 0.4, 0.2, 0.1.
E9) Using the central difference formula of O(h²), find f''(0.3) from the given table and improve
the accuracy using Richardson's extrapolation method with step lengths h = 0.1, 0.2.
In the numerical differentiation methods, the truncation error is of the form Ch^p, which
tends to zero as h → 0. However, the method which approximates f^(q)(x) contains h^q in
the denominator. As h is successively reduced to smaller values, the truncation error
decreases but the round-off error in the method may increase, as we are dividing by a
small number. It may happen that beyond a certain critical value of h, the round-off error
becomes more dominant than the truncation error, and the numerical results obtained
may start worsening as h is further reduced. The problem of finding a steplength h
small enough so that the truncation error is small, yet large enough so that the round-off
error does not dominate the actual error, is referred to as the step-size dilemma. Such a
step length, if it can be determined, is called the optimal steplength for that formula.
We shall now discuss in the next section how to determine the optimal steplength.
f'_k = (f_{k+1} − f_k)/h    (63)

Let f(x) = eˣ and suppose we want to approximate f'(1) by taking h = 2/10^m, m = 1, 2, ..., 7.
We have from the differentiation formula (63),
The exact solution is f'(1) = e = 2.718282. The actual error is e − f'(1) and the truncation
error is −eh/2. With h = 2/10^m, m = 1, 2, ..., 7, we have the results as given in Table 3.
Table 3
h          f'(1)      Actual error   Approximate truncation error
2 × 10⁻¹   3.009175   −0.290893      −0.271828
If you look at Table 3, you will observe that the improvement in the accuracy of the formula,
i.e. of f'(1), with decreasing h does not continue indefinitely. The truncation error agrees
with the actual error till h = 2 × 10⁻³ = 0.002. As h is further reduced, the truncation
error ceases to approximate the actual error. This is because the actual error is
dominated by round-off error rather than by the truncation error. This effect gets
worse as h is reduced further. In such cases we determine the optimal steplength.
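The effect described here is easy to reproduce in double-precision arithmetic. The following sketch (the particular step sizes are illustrative choices, not the ones in Table 3) shows the error of the O(h) forward difference first shrinking and then growing as h decreases:

```python
import math

# Step-size dilemma for the O(h) forward difference
# f'(x) ≈ (f(x+h) − f(x))/h applied to f(x) = e^x at x = 1.

def fwd(h):
    return (math.exp(1 + h) - math.exp(1)) / h

exact = math.exp(1)
errors = {h: abs(fwd(h) - exact) for h in (1e-1, 1e-4, 1e-8, 1e-12)}

# Truncation error shrinks with h at first ...
print(errors[1e-1] > errors[1e-4])   # True
# ... but for very small h round-off dominates and accuracy worsens
print(errors[1e-12] > errors[1e-8])  # True
```

The turning point depends on the machine precision; in IEEE double precision it occurs near h ≈ 10⁻⁸ for this formula, which is exactly the optimal-steplength phenomenon the next section quantifies.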
When f(x) is given in tabular form, these values may not be exact; they contain
round-off errors. In other words, f(x_k) = f_k + ε_k, where f(x_k) is the exact value, f_k is the
tabulated value and ε_k is the round-off error. For the numerical differentiation formula
(63), we have

If we take E = max(|ε_k|, |ε_{k+1}|) and M_2 = max |f''(x)|, we find that
We define the optimum value of h as the one which satisfies either of the following
conditions:

(i) |R| = |TE|,  (ii) |R| + |TE| = minimum.

If we use the second condition, |R| + |TE| = min, we have

2E/h + hM_2/2 = min.    (65)

To find the minimum in Eqn. (65), we differentiate the left hand side of Eqn. (65) with
respect to h and equate it to zero:

−2E/h² + M_2/2 = 0,  or  h² = 4E/M_2,  or  h_opt = 2√(E/M_2).
Using this method and the first criterion, find the value of h_opt and determine the value of
f'(2.0) from the following tabulated values of f(x) = ln x. It is given that the
maximum round-off error in the function evaluation is 5 × 10⁻⁶.
Solution : If ε_0, ε_1 and ε_2 are the round-off errors in the given function evaluations of
f_0, f_1, f_2 respectively, then we have
Hence h³ = 12E/M_3, or

h_opt = (12E/M_3)^{1/3}
f'_k = (f_{k+1} − f_{k−1})/(2h) − (h²/6) f'''(ξ),   x_{k−1} < ξ < x_{k+1}

determine h_opt using the criteria

(i) |R| = |TE|,
(ii) |R| + |TE| = minimum.
Using this method and the second criterion, find h_opt for f(x) = ln x and determine
the value of f'(2.03) from the following table of values of f(x), if it is given that the
maximum round-off error in the function evaluation is 5 × 10⁻⁶.
We now end this unit by giving a summary of what we have covered in it.
12.7 SUMMARY
In this unit we have covered the following :
1) If a function f(x) is not known explicitly but a table of values of f(x) corresponding to
a set of values of x is given, then its derivatives can be obtained by numerical
differentiation methods.
2) Numerical differentiation formulas using
(i) the method of undetermined coefficients and
(ii) methods based on finite difference operators
can be obtained for the derivatives of a function at nodal or step points when the
function is given in the form of a table.
3) When it is required to find the derivative of a function at off-step points then the
methods mentioned in (2) above cannot be used. In such cases, the methods derived
from the interpolation formulas are useful.
4) Higher order solutions can be obtained by Richardson's extrapolation method which
uses the lower order solutions. These results are more accurate than the results
obtained directly from higher order differentiation formulas.
5) Round-off errors play a very important role in numerical differentiation. Sometimes,
if the step size is too small, the round-off errors get magnified unmanageably. In
such cases the optimal step length for the given formula could be used, provided that
it can be determined.
Solving, we obtain a_0 = −3/(2h), a_1 = 2/h, a_2 = −1/(2h).
Exact values: f'(x) = 1/x = 0.002; f''(x) = −1/x² = −0.4 × 10⁻⁵.
The actual error in f'(500) is 0, whereas in f''(500) it is 0.1 × 10⁻⁵. The truncation error in
f'(x) is −(h²/3) f''' = −5.33 × 10⁻⁷ and in f''(x) it is (h²/12) f'''' = 8.8 × 10⁻⁹.
E5) In the given problem x_0 = 1, x_1 = 2, x_2 = 3, x_3 = 4 and f_0 = 1, f_1 = 16, f_2 = 81 and f_3 = 256.
Constructing the Lagrange fundamental polynomials, we get

L_0(x) = −(x³ − 9x² + 26x − 24)/6,   L_1(x) = (x³ − 8x² + 19x − 12)/2,
L_2(x) = −(x³ − 7x² + 14x − 8)/2,   L_3(x) = (x³ − 6x² + 11x − 6)/6

P'_3(x) = L'_0(x) f_0 + L'_1(x) f_1 + L'_2(x) f_2 + L'_3(x) f_3
The exact values off '(x) and f "(x) are (from f(x) = x4)
E6) We are required to find f'(x) and f''(x) at x = 1.5 and 3.5, which are off-step points.
Using Newton's forward difference formula with x_0 = 0, x = 1.5, s = 0.75, we get
f'(1.5) = 8.7915 and f''(1.5) = −4.0834.
x      f(x)     Δf      Δ²f     Δ³f     Δ⁴f
1.3    3.669
                0.813
1.5    4.482            0.179
                0.992           0.041
1.7    5.474            0.220           0.007
                1.212           0.048
1.9    6.686            0.268           0.012
                1.480           0.060
2.1    8.166            0.328           0.012
                1.808           0.072
2.3    9.974            0.400
                2.208
2.5   12.182
Taking x_0 = 1.5, we see that s = 0 and we obtain from the interpolation formula
E8) Use the O(h²) formula (33). With h = 0.1, f''(0.6) = 1.2596; h = 0.2, f''(0.6) = 1.26545;
h = 0.4, f''(0.6) = 1.289394.
These two results are of O(h⁴). To get the O(h⁶) result we repeat the extrapolation
technique and obtain
E10) If ε_{−1}, ε_0, ε_1 are the round-off errors in the given function evaluations f_{−1}, f_0, f_1
respectively, and if E = max(|ε_{−1}|, |ε_0|, |ε_1|) and M_3 = max |f'''(x)|, then

|R| ≤ E/h  and  |TE| ≤ (h²/6) M_3.
If we use |R| = |TE|, we get

If we use |R| + |TE| = min, then
For f(x) = ln x and using the second criterion, we get

h_opt = (30 × 10⁻⁶)^{1/3} ≈ 0.03.

f'(2.03) = (0.72271 − 0.69315)/0.06 = 0.492667.
Structure
13.1 Introduction
Objectives
13.2 Methods Based on Interpolation
Methods Using Lagrange Interpolation
Methods Using Newton's Forward Interpolation
13.3 Composite Integration
13.4 Romberg Integration
13.5 Summary
13.6 Solutions/Answers
13.1 INTRODUCTION
where R[h] is the left-end Riemann sum for n subintervals of length h = (b − a)/n and is
given by
The need for deriving accurate numerical methods for evaluating the definite integral
arises mainly, when the integral is either
i) a complicated function such as f(x) = e^{−x²}, f(x) = (sin x)/x, etc., which have no
antiderivative expressible in terms of elementary functions.
Many scientific experiments lead to a table of values and we may not only require an
approximation to the function f(x) but also may require approximate representation of
the integral of the function. Moreover, analytical evaluation of the integral may lead to
transcendental, logarithmic or circular functions. The evaluation of these functions for a
given value of x may not be an accurate process. This motivates us to study numerical
integration methods which can be easily implemented on calculators.
In this unit we shall develop numerical integration methods wherein the integral is
approximated by a linear combination of the values of the integrand, i.e.,
I[f] = ∫_a^b f(x) dx ≈ λ_0 f_0 + λ_1 f_1 + ... + λ_n f_n    (1)

where x_0, x_1, ..., x_n are the points which divide the interval [a, b] into n
sub-intervals and λ_0, λ_1, ..., λ_n are the weights to be determined. We shall
discuss in this unit a few techniques to determine the unknowns in Eqn. (1).
Objectives
After studying this unit you should be able to
use the trapezoidal and Simpson's rules of integration to integrate functions given in the
form of tables and find the errors in these rules;
improve the order of the results using Romberg integration, or their accuracy by
composite rules of integration.
In Block 3, you have studied several interpolation formulas which fit the given
data (x_k, f_k), k = 0, 1, 2, ..., n. We shall now see how these interpolation
formulas can be used to develop numerical integration methods for evaluating the
definite integral of a function which is given in a tabular form. The problem of
numerical integration is to approximate the definite integral as a linear combination
of the values of f(x) in the form
I[f] = ∫_a^b f(x) dx ≈ sum_{k=0}^{n} λ_k f_k    (2)

where the n + 1 distinct points x_k, k = 0, 1, 2, ..., n are called the nodes or
abscissas, which divide the interval [a, b] into n sub-intervals (x_0 < x_1 < x_2 < ... < x_n),
and λ_k, k = 0, 1, ..., n are called the weights of the integration rule or
quadrature formula. We shall denote the exact value of the definite integral by I
and the rule of integration by

I_n[f] = sum_{k=0}^{n} λ_k f_k

so that the error is

E_n[f] = ∫_a^b f(x) dx − sum_{k=0}^{n} λ_k f_k

where L_k(x) = [(x − x_0)(x − x_1) ... (x − x_{k−1})(x − x_{k+1}) ... (x − x_n)] /
[(x_k − x_0)(x_k − x_1) ... (x_k − x_{k−1})(x_k − x_{k+1}) ... (x_k − x_n)]

and π(x) = (x − x_0)(x − x_1) ... (x − x_n).

Numerical Integration
We replace the function f(x) in the definite integral (2) by the Lagrange interpolating
polynomial P_n(x) given by Eqn. (5) and obtain

I ≈ sum_{k=0}^{n} λ_k f_k    (7)

where

λ_k = ∫_a^b L_k(x) dx.    (8)

We have

|E_n[f]| ≤ [M_{n+1}/(n + 1)!] ∫_a^b |π(x)| dx,

where M_{n+1} = max_{x_0 ≤ x ≤ x_n} |f^{(n+1)}(x)|.
Let us consider now the case when the nodes x_k are equispaced with x_0 = a, x_n = b,
and the length of each subinterval is h = (b − a)/n. The numerical integration methods
given by (7) are then known as Newton-Cotes formulas and the weights λ_k given by
(8) are known as Cotes numbers. Any point x ∈ [a, b] can be written as x = x_0 + sh.
λ_k = [(−1)^{n−k} h / (k!(n − k)!)] ∫_0^n s(s − 1)(s − 2) ... (s − k + 1)(s − k − 1) ... (s − n) ds    (12)

and

|E_n[f]| ≤ [h^{n+2} M_{n+1} / (n + 1)!] ∫_0^n |s(s − 1)(s − 2) ... (s − n)| ds
We now derive some of the Newton-Cotes formulas, viz. the trapezoidal rule and Simpson's
rule, by using first and second degree Lagrange polynomials with equally spaced nodes.
You might have studied these rules in your calculus course.
Trapezoidal Rule
When n = 1, we have x_0 = a, x_1 = b and h = b − a. Using Eqn. (12) the Cotes numbers
can be found as λ_0 = λ_1 = h/2, so that by the trapezoidal rule ∫_a^b f(x) dx is given by

I_T[f] = (h/2)[f_0 + f_1]    (14)

with error

|E_T[f]| ≤ (h³/12) M_2,    (15)

where M_2 = max_{x_0 ≤ x ≤ x_1} |f''(x)|.    (16)
The reason for calling this formula the trapezoidal rule is that geometrically, when f(x)
is a function with positive values, (h/2)(f_0 + f_1) is the area of the trapezium with
height h = b − a and parallel sides f_0 and f_1. This is an approximation to the actual
area under the curve y = f(x) above the x-axis, bounded by the ordinates x = x_0 and x = x_1
(see Fig. 1). Since the error given by Eqn. (15) contains the second derivative, the
trapezoidal rule integrates exactly polynomials of degree ≤ 1.
Fig. 1
using the trapezoidal rule and obtain a bound for the error. The exact value is I = ln 2 =
0.693147 correct to six decimal places.
Thus, the error bound obtained is much greater than the actual error.
We now derive Simpson's rule.
Simpson's Rule
For n = 2, we have h = (b − a)/2, x_0 = a, x_1 = (a + b)/2 and x_2 = b.
From (12), we find the Cotes numbers as λ_0 = h/3, λ_1 = 4h/3, λ_2 = h/3, so that

I_S[f] = (h/3)[f_0 + 4f_1 + f_2]    (17)

Eqn. (17) is Simpson's rule for approximating I = ∫_a^b f(x) dx.
The magnitude of the error of integration is obtained from (13) with n = 2. Since

∫_0^2 s(s − 1)(s − 2) ds = 0,

Simpson's rule integrates polynomials of degree 3 also exactly.
Hence, we have to write the error expression (13) with n = 3. We find

|E_S[f]| ≤ (h⁵/90) M_4,

where M_4 = max_{x_0 ≤ x ≤ x_2} |f''''(x)|.

Since the error in Simpson's rule contains the fourth derivative, Simpson's rule
integrates exactly all polynomials of degree ≤ 3.
Thus, by Simpson's rule, ∫_a^b f(x) dx is given by (17). Geometrically,
(h/3)(f_0 + 4f_1 + f_2) represents the area bounded by the quadratic curve
passing through (x_0, f_0), (x_1, f_1) and (x_2, f_2) above the x-axis and lying between the
ordinates x = x_0, x = x_2 (see Fig. 2).

Fig. 2
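The two single-application rules are easily compared in code. The sketch below assumes, following Example 1, the test integral I = ∫₁² dx/x = ln 2:

```python
import math

# Single applications of the trapezoidal rule (14) and Simpson's rule (17)
# to I = ∫₁² dx/x = ln 2.

def trapezoid(f, a, b):
    h = b - a
    return (h / 2) * (f(a) + f(b))

def simpson(f, a, b):
    h = (b - a) / 2
    return (h / 3) * (f(a) + 4 * f(a + h) + f(b))

f = lambda x: 1 / x
exact = math.log(2)
it, is_ = trapezoid(f, 1, 2), simpson(f, 1, 2)
print(it, is_, exact)  # 0.75, 0.694444..., 0.693147...
```

Simpson's rule, whose error carries h⁵ f''''(ξ) rather than h³ f''(ξ), lands much closer to ln 2 than the trapezoidal value.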
In case we are given only one tabulated value in the interval [a, b], then h = b − a, and
the interpolating polynomial of degree zero is P_0(x) = f_0. In this case, we obtain the
rectangular integration rule given by

I_R[f] = (b − a) f_0    (20)

If the given tabulated value in the interval [a, b] is the value at the mid-point, then we
have x_k = (a + b)/2 and f_k = f((a + b)/2). In this case h = b − a and we obtain the integration
rule as

I_M[f] = (b − a) f((a + b)/2)    (21)
Rule (21) is called the mid-point rule. The error in the rule calculated from (13) is
This shows that the mid-point rule integrates polynomials of degree one exactly. Hence
the error for the mid-point rule is given by
If the exact value of the integral is 0.74682 correct to 5 decimal places, find the errors
in these rules.
Solution : The values of the function f(x) = e^{−x²} at x = 0, 0.5 and 1 are
f(0) = 1, f(0.5) = 0.77880, f(1) = 0.36788.
c) Taking h = 1 and using the trapezoidal rule I_T[f] = (h/2)[f_0 + f_1], we get
I_T[f] = (1/2)(1 + 0.36788) = 0.68394. Taking h = 0.5 and
using Simpson's rule, we get

I_S[f] = (h/3)[f(0) + 4f(0.5) + f(1)] = 0.74718.
E1) Use the trapezoidal and Simpson's rules to approximate the following integrals. Compare
the approximations to the actual value and find a bound for the error in each case.
f(x) ≈ P_n(x) = f_0 + sΔf_0 + [s(s − 1)/2!] Δ²f_0 + ... + [s(s − 1)(s − 2) ... (s − n + 1)/n!] Δⁿf_0    (23)
with the error of interpolation
Integrating both sides of Eqn. (23) w.r.t. x between the limits a and b, we can
approximate the definite integral I by the numerical integration rule

We can obtain the trapezoidal rule (14) from (24) by using linear interpolation, i.e., f(x) ≈
P_1(x) = f_0 + sΔf_0. We then have
Similarly, Simpson's rule (17) can be obtained from (24) by using quadratic
interpolation, i.e., f(x) ≈ P_2(x). Taking x_0 = a, x_1 = x_0 + h, x_2 = x_0 + 2h = b, we have
Simpson's rule. Obtain the error bound and compare it with the actual error. Also
compare the result obtained here with the one obtained in Example 1.
Solution : Here x_0 = 0, x_1 = 0.5, x_2 = 1 and h = 1/2.
Using Simpson's rule, we have
|E_S[f]| ≤ (h⁵/90) M_4 = 0.00833, where M_4 = max |f''''(x)|.
Here too the actual error is less than the given bound.
Also actual error obtained here is much less than that obtained in Example 1.
You may now try the following exercise.

E2) Find an approximation to ∫_{1.1}^{1.5} eˣ dx, using
a) the trapezoidal rule with h = 0.4
b) Simpson's rule with h = 0.2
The Newton-Cotes formulas derived above are generally unsuitable for use over large
integration intervals. Consider, for instance, an approximation to ∫_0^4 eˣ dx obtained
by Simpson's rule with h = 2.
Since the exact value in this case is e⁴ − e⁰ = 53.59815, the error is −3.17143. This
error is much larger than what we would generally regard as acceptable. However, a large
error is to be expected, as the step length h = 2.0 is too large to make the error
expression meaningful. In such cases, we would be required to use higher order
formulas. An alternative approach to obtain more accurate results while using lower
order methods is the use of composite integration methods, which we shall discuss in
the next section.
Evaluating each of the integrals on the right hand side by trapezoidal rule, we have
The method (26) is known as the composite trapezoidal rule. The error is given by

Now since f is a continuous function on the interval [a, b], we have, as a consequence
of the intermediate value theorem,

|E_T[f]| ≤ [(b − a) h²/12] M_2

The error is of order h² and it decreases as h decreases.

I_T[f] = (h/2)[first ordinate + last ordinate + 2 (sum of the remaining ordinates)].
In using Simpson's rule of integration (17), we need three abscissas. Hence, we divide
the interval [a, b] into an even number of subintervals of equal length, giving an odd
number of abscissas in the form a = x_0 < x_1 < x_2 < ... < x_{2N} = b with h = (b − a)/(2N)
and x_k = x_0 + kh, k = 0, 1, 2, ..., 2N. We then write

I = ∫_a^b f(x) dx = sum_{k=1}^{N} ∫_{x_{2k−2}}^{x_{2k}} f(x) dx    (28)
Evaluating each of the integrals on the right hand side of Eqn. (28) by Simpson's
rule, we have
The formula (29) is known as the composite Simpson's rule of numerical integration.
The error in (29) is obtained from (18) by adding up the errors. Thus we get
If M_4 = max_{a ≤ ξ ≤ b} |f''''(ξ)|, we can write, using h = (b − a)/(2N),

|E_S[f]| ≤ [(b − a) h⁴/180] M_4
The error is of order h⁴ and it approaches zero very fast as h → 0. The rule integrates
exactly polynomials of degree ≤ 3. We can remember the composite Simpson's rule as

I_S[f] = (h/3)[first ordinate + last ordinate + 2 (sum of even ordinates) + 4 (sum of the
remaining odd ordinates)]
Example 4 : Evaluate I = ∫_0^1 dx/(1 + x) using
(a) the composite trapezoidal rule and (b) the composite Simpson's rule with 2, 4 and 8
subintervals.
We get

I_S[f] = (1/6)[f_0 + 4f_1 + f_2] = 0.694444

I_T[f] = (1/8)[f_0 + f_4 + 2(f_1 + f_2 + f_3)] = 0.697024

The exact value of the given integral correct to six decimal places is ln 2 = 0.693147.
We now give the actual errors in Table 2 below.
Table 2
Note that as h decreases, the errors in both the trapezoidal and Simpson's rules also decrease.
Let us consider another example.
Example 5 : Find the minimum number of intervals required to evaluate ∫_0^1 dx/(1 + x) with
an accuracy 10⁻⁶, by using Simpson's rule.
Solution : From Table 2 in Example 4 you may observe that N = 8 gives 10⁻⁶ (1.E−06)
accuracy. We shall now determine N from the theoretical error bound for Simpson's
rule which gives 1.E−06 accuracy. Now
where

M_4 = max_{0 ≤ x ≤ 1} |f''''(x)|

∴ N ≈ 9.5

We find that we cannot take N = 9 since, to make use of Simpson's rule, we should
have an even number of intervals. We therefore conclude that N = 10 should be the
minimum number of subintervals to obtain the accuracy 1.E−06 (i.e., 10⁻⁶).
You may now try the following exercises :
E3) Evaluate ∫_0^1 dx/(1 + x²) by subdividing the interval (0, 1) into 6 equal parts and using
(a) the trapezoidal rule (b) Simpson's rule. Hence find the value of π and the actual errors.
E4) A function f(x) is given by the table below.
Find the integral of f(x) using (a) the trapezoidal rule (b) Simpson's rule.
E5) The speedometer reading of a car moving on a straight road is given below. Estimate the
distance travelled by the car in 12 minutes using (a) the trapezoidal rule (b) Simpson's
rule.

Time (minutes)      : 0   2   4   6   8   10  12
Speedometer reading : 0   15  25  40  45  20  0
E6) Evaluate ∫_{0.2}^{0.4} (sin x − ln x + eˣ) dx using (a) the trapezoidal rule (b) Simpson's rule,
taking h = 0.1. Find the actual errors.
E7) Determine N so that the composite trapezoidal rule gives the value of ∫_0^2 e^{−x²} dx correct
up to 3 digits after the decimal point, assuming that e^{−x²} can be calculated accurately.
You must have realised that though the trapezoidal rule is the easiest Newton-Cotes
formula to apply, it lacks the degree of accuracy generally required. There is a way
to improve the accuracy of the results obtained by the trapezoidal and Simpson's rules.
This method is known as Romberg integration, or extrapolation to the limit.
Richardson's extrapolation technique (ref. Sec. 12.5 of Unit 12) applied to the
integration methods is called Romberg integration. We shall discuss this technique
in the next section.
In Romberg integration, first we find the power series expansion of the error term in
the integration method. Then by eliminating the leading terms in the error expression, we
obtain new values which are of higher order than the previously .computed values.
If F_0(h) denotes the approximate value obtained by using the composite trapezoidal rule, then

I = F_0(h) + c_1 h² + c_2 h⁴ + c_3 h⁶ + ...

where I is the exact value of the integral.
Let the integral be evaluated with the step lengths h, h/2 and h/4.

Eliminating c_1 from Eqns. (31) and (32), we get

F_1(h/2) = [4F_0(h/2) − F_0(h)]/3

Note that this value is of O(h⁴). Similarly,
etc.
Numerical Integration
Applying this method repeatedly, eliminating C_2, then C_3 etc., we get the Romberg
integration formula

F_m(h) = [4^m F_{m−1}(h/2) − F_{m−1}(h)] / (4^m − 1),   m = 1, 2, ....    (35)
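Formula (35) can be sketched in a few lines of Python; ∫₀¹ dx/(1+x) = ln 2 is reused here as the test integral, which is an illustrative choice rather than the example the text works next:

```python
import math

# Romberg formula (35) applied to composite-trapezoidal values
# F0(h), F0(h/2), F0(h/4) for I = ∫₀¹ dx/(1+x) = ln 2.

def trap(f, a, b, n):
    h = (b - a) / n
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + k * h) for k in range(1, n)))

def romberg_step(m, finer, coarser):
    return (4**m * finer - coarser) / (4**m - 1)

f = lambda x: 1 / (1 + x)
F0 = [trap(f, 0, 1, n) for n in (2, 4, 8)]                   # O(h²) values
F1 = [romberg_step(1, F0[i + 1], F0[i]) for i in range(2)]   # O(h⁴) values
F2 = romberg_step(2, F1[1], F1[0])                           # O(h⁶) value

exact = math.log(2)
print(abs(F0[-1] - exact), abs(F1[-1] - exact), abs(F2 - exact))
```

Each pass through `romberg_step` removes the leading error term, so the printed errors fall by several orders of magnitude per column, as the text promises.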
In the same way, if G_0(h) denotes the value of the integral obtained by using
Simpson's rule, then

I = G_0(h) + d_1 h⁴ + d_2 h⁶ + d_3 h⁸ + ...

where I is the exact value of the integral.
Let the integral be evaluated with step lengths h, h/2 and h/4. Then, we have

I = G_0(h) + d_1 h⁴ + d_2 h⁶ + ...

Similarly,

I = G_0(h/2) + d_1 (h/2)⁴ + d_2 (h/2)⁶ + ...

Eliminating d_1, we get

[4² G_0(h/2) − G_0(h)] / (4² − 1) = G_1(h/2)

etc.

Note that these values are of order h⁶.
Note that

F_1(h/2) = [4F_0(h/2) − F_0(h)] / 3
Table 4
Note that
Suppose that we wish to evaluate the integral in the above example directly by the
trapezoidal and Simpson's rules to an accuracy of 1.0E−06. What should be the
maximum value of the step length to be chosen to achieve this accuracy?
To answer this question let us calculate the error bound for the trapezoidal rule:

|E_T[f]| ≤ (h²/12) max_{0 ≤ x ≤ 1} |f''(x)|

Hence

or N ≈ 145.
Thus, to obtain 1.E−06 accuracy by the trapezoidal rule we need to use 145 subintervals,
i.e., 146 function evaluations. But by extrapolation we have used only 9 evaluations and
improved these values.
Let us consider another example.
Example 7 : Use the composite trapezoidal rule to find ∫_1^2 ln x dx with N = 3, 6, 12,
and improve the accuracy by Romberg integration.
Solution : We give the results in the form of the following table.
Table 5
You may now try the following exercise.
E8) The following table gives the values of ln x for x = 1, 2, ..., 11. Evaluate the
integral of the tabulated function using the trapezoidal rule with h = 1, 2. Use
Richardson's extrapolation technique to improve the accuracy and obtain the
actual error. Compare with the results obtained by using Simpson's rule with h = 1.
We now end this unit by giving a summary of what we have covered in it.
13.5 SUMMARY
where the (n + 1) distinct nodes x_k, k = 0, 1, ..., n, x_0 < x_1 < x_2 < ... < x_n, divide
the interval [a, b] into n subintervals and λ_k, k = 0, 1, ..., n are the weights of the
integration rule. The error of the integration methods is then given by
E1) a) I_T[f] = (h/2)[f_0 + f_1] = 0.346574

I_S[f] = (h/3)[f_0 + 4f_1 + f_2] = (0.5/3)[4 ln 1.5 + ln 2] = 0.385835
b) I_T[f] = 0.023208, I_S[f] = 0.032296, Exact value = 0.034812.

c) I_T[f] = 0.39270, I_S[f] = 0.34478, E_S[f] = 0.00831, Exact value = 0.34657.
E2) I_T[f] = 1.49718, I_S[f] = 1.47754.
E3) With h = 1/6, the values of f(x) = 1/(1 + x²) from x = 0 to 1 are
Now

I_T[f] = (h/2)[f_0 + f_6 + 2(f_1 + f_2 + f_3 + f_4 + f_5)]

Exact π = 3.141593.

I_T[f] = (0.1/2)[f(0.2) + 2f(0.3) + f(0.4)] = 0.57629
|E_T[f]| ≤ [(b − a)³/(12N²)] M_2,  M_2 = max_{0 ≤ x ≤ 1} |f''(x)|.
Thus
By extrapolation
F_1(h) = (1/3)[4F_0(h) − F_0(2h)] = 16.39496667
By Simpson's rule
Structure
14.1 Introduction
Objectives
14.2 Basic Concepts
14.3 Taylor Series Method
14.4 Euler's Method
14.5 Richardson's Extrapolation
14.6 Summary
14.7 Solutions/Answers
14.1 INTRODUCTION
In the previous two units, you have seen how a complicated or tabulated function can be
replaced by an approximating polynomial so that the fundamental operations of calculus,
viz., differentiation and integration, can be performed more easily. In this unit we shall solve
a differential equation, that is, we shall find the unknown function which satisfies a
combination of the independent variable, dependent variable and its derivatives. In physics,
engineering, chemistry and many other disciplines it has become necessary to build
mathematical models to represent complicated processes. Differential equations are one of
the most important mathematical tools used in modelling problems in the engineering and
physical sciences. As it is not always possible to obtain the analytical solution of differential
equations, recourse must necessarily be made to numerical methods for solving differential
equations. In this unit, we shall introduce two such methods, namely, Euler's method and the
Taylor series method, to obtain numerical solutions of ordinary differential equations (ODEs).
We shall also introduce Richardson's extrapolation method to obtain higher order solutions
to ODEs using lower order methods. To begin with, we shall recall a few basic concepts from
the theory of differential equations which we shall be referring to quite often.
Objectives
After studying this unit you should be able to :
identify the initial value problem for first order ordinary differential equations;
obtain the solution of initial value problems by using the Taylor series method and
Euler's method;
use Richardson's extrapolation technique for improving the accuracy of the result
obtained by Euler's method.
For example,
Differential equations of the form (1), involving derivatives w.r.t. a single independent
variable, are called ordinary differential equations (ODEs), whereas those involving
derivatives w.r.t. two or more independent variables are called partial differential equations
(PDEs). Eqn. (2) is an example of a PDE.
Definition : The order of a differential equation is the order of the highest order
derivative appearing in the equation, and its degree is the highest exponent of the
highest order derivative after the equation has been rationalised, i.e., after it has been
expressed in the form free from radicals and any fractional or negative power of the
derivatives. For example, equation
Definition : When the dependent variable and its derivatives occur in the first degree
only, and not as higher powers or products, the equation is said to be linear; otherwise
it is nonlinear.
For example, y' + y = x² is a linear ODE, whereas (y')² + y = x² is a nonlinear ODE.
The general form of a linear ODE of order n can be expressed as

L[y] = a_0(t) y^(n)(t) + a_1(t) y^(n−1)(t) + ... + a_{n−1}(t) y'(t) + a_n(t) y(t) = r(t)    (5)

where r(t), a_i(t), i = 0, 1, ..., n are known functions of t and

L = a_0(t) d^n/dt^n + a_1(t) d^{n−1}/dt^{n−1} + ... + a_{n−1}(t) d/dt + a_n(t)

is the linear differential operator. The general nonlinear ODE of order n can be
written as
The general solution of an nth order ODE contains n arbitrary constants. In order to
determine these arbitrary constants, we require n conditions. If these conditions are
given at one point, then they are known as initial conditions, and the
differential equation together with the initial conditions is called an initial value
problem (IVP). The nth order IVP can be written as
Hence, it is sufficient to study numerical methods for the solution of the first order IVP

y' = f(t, y),  y(t_0) = y_0    (10)

The vector form of these methods can then be used to solve Eqn. (9). Before attempting to
obtain numerical solutions to Eqn. (10), we must make sure that the problem has a unique
solution. The following theorem ensures the existence and uniqueness of the solution to the IVP (10).
then for any yo, the IVP (10) has a unique solution. This condition is called the
Lipschitz condition and L is called the 1,ipschitz constant.
We assume the existence and uniqueness of the solution and also that f(t, y) has
continuous partial derivatives w.r.t. t and y of as high order as we desire.
I
Let us assume that t b b e an interval over which the solution of the IVP (10) is
[o,
required. If w e subiivide the interval t b into
[ n I 11 subintervals using a steysize
. where tn = b, we obtain (he mesh points or grid points to, t,, t2 ...... 7 tn
a s shown in Fig. 1.
Fig. 1
W e can then write t, = to + kh, k = 0, 1, . . . . . n. A numerical method for the solution
of the IVP ( l o ) , will produce approximate values y, at the grid points tk.
Numerical Differentiation, Integration and Solution of Differential Equations
Remember that the approximate values y_k may contain truncation and round-off errors.
We shall now discuss the construction of numerical methods and related basic concepts with reference to the simple ODE
y' = λy, y(t_0) = y_0,   (11)
where t_0 = a and t_0 + Nh = b.
Separating the variables and integrating, we find that the exact solution of Eqn. (11) is
y(t) = y(t_0) e^(λ(t - t_0))   (12)
In order to obtain a relation connecting two successive solution values, we set t = t_n and t = t_(n+1) in Eqn. (12). Thus we get
y(t_n) = y(t_0) e^(λ(t_n - t_0)),  y(t_(n+1)) = y(t_0) e^(λ(t_(n+1) - t_0))
Dividing, we get
y(t_(n+1)) / y(t_n) = e^(λh)
Hence we have
y(t_(n+1)) = e^(λh) y(t_n), n = 0, 1, ....., N-1   (13)
Eqn. (13) gives the required relation between y(t_n) and y(t_(n+1)).
Setting n = 0, 1, 2, ...., N-1, successively, we can find y(t_1), y(t_2), ...., y(t_N) from the given value y(t_0).
An approximate method or a numerical method can be obtained by approximating e^(λh) in Eqn. (13). For example, we may use the following polynomial approximations
E(λh) = 1 + λh,
E(λh) = 1 + λh + (λ^2 h^2)/2,
and so on.
Let us retain (p+1) terms in the expansion of e^(λh) and denote the approximation to e^(λh) by E(λh). The numerical method for obtaining the approximate values y_n of y(t_n) can then be written as
y_(n+1) = E(λh) y_n, n = 0, 1, ....., N-1   (17)
The truncation error of the method is
TE = y(t_(n+1)) - y_(n+1)
Since (p+1) terms are retained in the expansion of e^(λh), we have
Numerical Solution of Ordinary
Differential Equations
The TE is of order p+l. The integer p is then called the order of the method.
We say that a numerical method is stable if the error at any stage, i.e. y_n - y(t_n) = ε_n, remains bounded as n → ∞. Let us examine the stability of the numerical method (17). Putting y_(n+1) = y(t_(n+1)) + ε_(n+1) and y_n = y(t_n) + ε_n in Eqn. (17), we have
ε_(n+1) = [E(λh) - e^(λh)] y(t_n) + E(λh) ε_n   (18)
We note from Eqn. (18) that the error at t_(n+1) consists of two parts. The first part [E(λh) - e^(λh)] y(t_n) is the local truncation error and can be made as small as we like by suitably determining E(λh). The second part E(λh) ε_n is the propagation error from the previous step t_n to t_(n+1) and will not grow if |E(λh)| < 1. If |E(λh)| < 1, then as n → ∞ the propagation error tends to zero and the method is said to be absolutely stable. Formally we give the following definition.
Definition : A numerical method (17) is called absolutely stable if |E(λh)| ≤ 1.
You may also observe here that the exact value y(t_n) given by Eqn. (13) increases if λh > 0 and decreases if λh < 0, with the growth factor e^(λh). The approximate value y_n given by Eqn. (17) grows or decays with the factor E(λh). Thus, in order to have meaningful numerical results, it is necessary that the growth factor of the numerical method should not increase faster than the growth factor of the exact solution when λh > 0, and should decay at least as fast as the growth factor of the exact solution when λh < 0. Accordingly, we give here the following definition.
Definition : A numerical method is said to be relatively stable if |E(λh)| ≤ e^(λh), h > 0.
First order : |1 + λh| ≤ 1
or -1 ≤ 1 + λh ≤ 1
or -2 ≤ λh ≤ 0
Second order : |1 + λh + (λ^2 h^2)/2| ≤ 1
The left inequality gives (λ^2 h^2)/2 + λh + 2 ≥ 0, which is satisfied for all λh. The right inequality gives -2 ≤ λh ≤ 0.
Third order : |1 + λh + (λ^2 h^2)/2 + (λ^3 h^3)/6| ≤ 1
or -2.5 ≤ λh ≤ 0.
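These intervals can be checked numerically. The following Python sketch (ours, not part of the text) scans values of λh and finds where |E(λh)| ≤ 1 first holds for each polynomial approximation:

```python
# Numerical check of the absolute stability intervals: scan x = lambda*h
# and return the first x (from the left) with |E(x)| <= 1.

def stability_interval(E, lo=-4.0, hi=0.0, step=1e-4):
    x = lo
    while x <= hi:
        if abs(E(x)) <= 1.0:
            return x
        x += step
    return None

E1 = lambda x: 1 + x                        # first order
E2 = lambda x: 1 + x + x**2 / 2             # second order
E3 = lambda x: 1 + x + x**2 / 2 + x**3 / 6  # third order

print(round(stability_interval(E1), 2))   # about -2.0
print(round(stability_interval(E2), 2))   # about -2.0
print(round(stability_interval(E3), 2))   # about -2.51
```

The third order endpoint, about -2.51, is the negative root of 1 + x + x^2/2 + x^3/6 = -1, so "-2.5" above is a rounded value.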
Numerical methods for finding the solution of IVP given by Eqn. (10) may be broadly
classified as
i) Singlestep methods
ii) Multistep methods
Singlestep methods enable us to find y_(n+1), an approximation to y(t_(n+1)), if y_n, y'_n and h
are known.
If y_(n+1) can be determined from Eqn. (19) by evaluating the right hand side, then the singlestep method is known as an explicit method; otherwise it is known as an implicit method. The local truncation error of the method (19) is defined by
Let us now take up an example to understand how the singlestep method works.
Example 1 : Find the solution of the IVP y' = λy, y(0) = 1 in 0 < t ≤ 0.5, using the first order method
y_(n+1) = (1 + λh) y_n with h = 0.1 and λ = 1.
Solution : Here the number of intervals is N = 0.5/h = 0.5/0.1 = 5.
We have y_0 = 1
y_1 = (1 + λh) y_0 = (1 + λh) = (1 + 0.1λ)
y_2 = (1 + λh) y_1 = (1 + λh)^2 = (1 + 0.1λ)^2
We now give in Table 1 the values of y_n for λ = 1 together with the exact values.
In the same way you can obtain the solution using the second order method
y_(n+1) = (1 + λh + (λ^2 h^2)/2) y_n with h = 0.1 and λ = 1,
and compare the results obtained in the two cases.
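As a check on Example 1, the short Python sketch below (our illustration) applies the first and second order growth factors E(λh) for λ = 1, h = 0.1 and compares the values at t = 0.5 with the exact e^0.5:

```python
import math

# First and second order approximations to e^(lambda*h), applied as
# y_(n+1) = E(lambda*h) * y_n  for y' = lambda*y, y(0) = 1.
lam, h, N = 1.0, 0.1, 5

E1 = 1 + lam * h                        # first order factor
E2 = 1 + lam * h + (lam * h) ** 2 / 2   # second order factor

y1 = E1 ** N    # first order value at t = 0.5
y2 = E2 ** N    # second order value at t = 0.5
exact = math.exp(lam * 0.5)

print(round(y1, 5), round(y2, 5), round(exact, 5))
# -> 1.61051 1.64745 1.64872
```

The second order factor (1.105)^k also reproduces the entries of Table 7 in the solutions at the end of the unit.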
We are now prepared to consider numerical methods for integrating differential
equations. The first method we discuss is the Taylor series method. It is not strictly a
numerical method, but it is the most fundamental method to which every numerical
method is compared.
y_(k+1) = y_k + h φ(t_k, y_k, h)
where φ(t_k, y_k, h) = y'_k + (h/2!) y''_k + ..... + (h^(p-1)/p!) y_k^(p)
This is called the Taylor series method of order p. The truncation error of the method is given by
TE = (h^(p+1)/(p+1)!) y^(p+1)(t_k + θh), 0 < θ < 1
y' = f(t, y)
y'' = f_t + f f_y
The second order Taylor series method is therefore
y_(k+1) = y_k + h y'_k + (h^2/2) y''_k,
with the TE = (h^3/6) y'''(α), t_n < α < t_(n+1).
The Taylor series method of O(h^3), (p = 3), is
y_(k+1) = y_k + h y'_k + (h^2/2) y''_k + (h^3/6) y'''_k,
with the TE = (h^4/4!) y^(4)(α), t_n < α < t_(n+1).
Example 2 : Using the third order Taylor series method, find the solution of the differential equation
y' = 1 - y/x, y(2) = 2.
Solution : We have the derivatives and their values at x = 2, y = 2 as follows :
y' = 1 - y/x,   y'(2) = 0
Example 3 : Solve the equation x^2 y' = 1 - xy - x^2 y^2, y(1) = -1 from x = 1 to x = 2 by using the Taylor series method of O(h^2) with h = 1/3 and 1/4, and find the actual error at x = 2 if the exact solution is y = -1/x.
Since the exact value is y(2) = -0.5, we have the actual errors as
e_1 = 0.0583 with h = 1/3
e_2 = 0.0321 with h = 1/4
Write the Taylor series method of order four and solve the IVPs E2) and E3).
E4) Using second order Taylor series method solve the IVP
Notice that though the Taylor series method of order p gives us results of the desired accuracy in a small number of steps, it requires evaluation of the higher order derivatives and becomes tedious to apply if the various derivatives are complicated. Also, it is difficult to determine the error in such cases. We now consider a method, the Euler's method, which can be regarded as a Taylor series method of order one and avoids these difficulties.
Let [t_0, b] be the interval over which the solution of the given IVP is to be determined. Let h be the steplength. Then the nodal points are defined by t_k = t_0 + kh, k = 0, 1, 2, ........, N with t_N = t_0 + Nh = b.
Fig. 2
y_(k+1) = y_k + h y'_k
with TE = (h^2/2) y''(α), t_k < α < t_(k+1)
Eqn. (32) is known as Euler's method and it calculates recursively the solution at the nodal points t_k, k = 0, 1, ....., N.
Since the truncation error (31) is of order h^2, Euler's method is of first order. It is also called an O(h) method.
Let us now see the geometrical representation of the Euler's method.
Geometrical Interpretation
Let y(t) be the solution of the given IVP. Integrating dy/dt = f(t, y) from t_k to t_(k+1), we get
∫_(t_k)^(t_(k+1)) (dy/dt) dt = ∫_(t_k)^(t_(k+1)) f(t, y) dt
i.e., y(t_(k+1)) - y(t_k) = ∫_(t_k)^(t_(k+1)) f(t, y) dt   (33)
We know that geometrically f(t, y) represents the slope of the curve y(t). Let us approximate the slope of the curve between t_k and t_(k+1) by the slope at t_k only. If we approximate y(t_(k+1)) and y(t_k) by y_(k+1) and y_k respectively, then we have
Example 4 : Use Euler's method to find the solution of y' = t + |y|, given y(0) = 1. Find the solution on [0, 0.8] with h = 0.2.
Solution : We have
y(0.8) = y_4 = y_3 + (0.2) f_3
= 1.856 + (0.2) [0.6 + 1.856]
= 2.3472
Example 5 : Solve the differential equation y' = t + y, y(0) = 1, t ∈ [0, 1] by Euler's method using h = 0.1. If the exact value is y(1) = 3.436564, find the actual error.
Solution : Euler's method is
y_(n+1) = y_n + h y'_n
For the given problem, we have
y_(n+1) = (1 + h) y_n + h t_n, h = 0.1, y(0) = 1,
y_1 = (1 + 0.1)(1) + (0.1)(0) = 1.1
y_2 = (1.1)(1.1) + (0.1)(0.1) = 1.22,  y_3 = 1.362
y_4 = 1.5282,  y_5 = 1.72102,  y_6 = 1.943122,
y_7 = 2.197434,  y_8 = 2.487178,  y_9 = 2.815895
y_10 = 3.187485 ≈ y(1)
Actual error = y(1) - y_10 = 3.436564 - 3.187485 = 0.2491.
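The computation of Example 5 can be reproduced with a few lines of Python (a sketch, not part of the original text):

```python
# Euler's method applied to Example 5: y' = t + y, y(0) = 1,
# h = 0.1, integrated up to t = 1.

def euler(f, t0, y0, h, steps):
    t, y = t0, y0
    for _ in range(steps):
        y += h * f(t, y)
        t += h
    return y

y10 = euler(lambda t, y: t + y, 0.0, 1.0, 0.1, 10)
print(round(y10, 6))             # -> 3.187485, as in the worked example
print(round(3.436564 - y10, 4))  # -> 0.2491, the actual error
```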
Remark': Since Euler's method is of O(h), it requires h to be very small to attain the
desired accuracy. Hence, very often, the number of steps to be carried out becomes very
large. In such cases, we need higher order methods to obtain the required accuracy in a
limited number of steps.
This equation is called the difference equation associated with Euler's method. A difference equation of order N is a relation involving y_n, y_(n+1), ......., y_(n+N). Some simple difference equations are
where n is an integer.
where the coefficients a_(N-1), a_(N-2), ........, a_0 and b may be functions of n but not of y. All the Eqns. (35) are linear. It is easy to solve the difference Eqn. (36) when the coefficients are constant or a function of n, say a linear or a quadratic function of n.
A particular solution can be obtained by setting y_n(p) = A (a constant) in Eqn. (36) and determining the value of A. For details, you can refer to Elementary Numerical Analysis by Conte and de Boor. We illustrate this method by considering a few examples.
Example 6 : Find the solution of the initial-value difference equation
y_(n+2) - 4 y_(n+1) + 3 y_n = 2^n,  y_0 = 0,  y_1 = 1
Solution : The roots of the characteristic equation β^2 - 4β + 3 = 0 are β = 1 and β = 3.
∴ y_n(c) = C_1 (1)^n + C_2 (3)^n
This gives
y_n(c) = C_1 + 3^n C_2
For obtaining the particular solution we try y_n(p) = A 2^n.
Substituting, we get A(4 - 8 + 3) = 1, or A = -1.
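With the initial conditions y_0 = 0, y_1 = 1 the constants work out (our computation) to C_1 = 0, C_2 = 1, so y_n = 3^n - 2^n. A short Python check:

```python
# Verifying the solution of Example 6: the general solution is
# y_n = C1 + C2*3^n - 2^n, and y_0 = 0, y_1 = 1 give C1 = 0, C2 = 1
# (our computation), i.e. y_n = 3^n - 2^n.

def y(n):
    return 3 ** n - 2 ** n

assert y(0) == 0 and y(1) == 1                       # initial conditions
for n in range(20):                                  # the recurrence itself
    assert y(n + 2) - 4 * y(n + 1) + 3 * y(n) == 2 ** n
print("y_n = 3^n - 2^n satisfies the difference equation")
```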
y_k(c) = C (1 + 3h)^k.
For obtaining the particular solution we try y_k(p) = A.
This gives
A = (1 + 3h) A + 5h, i.e., A = -5/3.
Therefore, the general solution of the given problem is
y_k = C (1 + 3h)^k - 5/3.
Using the condition y(0) = 1, we obtain C = 8/3.
Thus
y_k = (8/3) (1 + 3h)^k - 5/3.
y_(k+1) = (1 + 3h) y_k + 5h
E6) y' = x^2 - 4y, y(4) = 4. Find y(4.1) taking h = 0.1.
E8) y' = 1 + y2, y(0) = 1. Find y(0.6) taking h = 0.2 and h = 0.1.
which is an O(h) method. Let F(h) and F(h/2) be the solutions obtained by using step lengths h and h/2 respectively.
Recall that Richardson's extrapolation is a method of combining two computed values, obtained with two different step sizes, to get a higher order result (ref. Formula (54), Unit 12).
Let y_1(t_k) and y_2(t_k) be the two values obtained by a numerical method of order p with step sizes h_1 and h_2. If e_1 and e_2 are the corresponding errors, then
in the interval (0, 1], taking h = 0.2, 0.1. Using Richardson's extrapolation technique obtain the improved value at t = 1.
Table 2 : h = 0.2
Similarly, starting with t_0 = 0, y_0 = 1, we obtain the following table of values for h = 0.1.
Table 3 : h = 0.1
y^(1)(1) = 2F(0.1) - F(0.2)
= 2(0.50364) - (0.50706)
= 0.50022
Example 7 : Use Euler's method to solve numerically the initial value problem y' = t + y, y(0) = 1 with h = 0.2, 0.1 and 0.05 in the interval [0, 0.6]. Apply Richardson's extrapolation technique to compute y(0.6).
Table 4 : h = 0.2
Table 5 : h = 0.1
Table 6 : h = 0.05
Extrapolated value : y(0.6) ≈ 2.043649
The exact solution is y(t) = -(1 + t) + 2e^t, so that y(0.6) = 2.044238.
Error = 0.000589
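The whole of Example 7 can be reproduced in Python (a sketch; the small difference from the text's 2.043649 comes from rounding in the hand computation):

```python
import math

# Euler's method for y' = t + y, y(0) = 1 with h = 0.2, 0.1, 0.05,
# followed by repeated Richardson extrapolation at t = 0.6.

def euler(t0, y0, h, steps):
    t, y = t0, y0
    for _ in range(steps):
        y += h * (t + y)
        t += h
    return y

F = {h: euler(0.0, 1.0, h, round(0.6 / h)) for h in (0.2, 0.1, 0.05)}

# first level: 2F(h/2) - F(h) removes the O(h) error term
F1a = 2 * F[0.1] - F[0.2]
F1b = 2 * F[0.05] - F[0.1]
# second level: (4*F1b - F1a)/3 removes the O(h^2) term
F2 = (4 * F1b - F1a) / 3

exact = -(1 + 0.6) + 2 * math.exp(0.6)
print(round(F2, 6), round(exact - F2, 6))
```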
And now a few exercises for you.
E9) The IVP
is given. Find y(0.6) with h = 0.2 and h = 0.1, using Euler's method, and extrapolate the value y(0.6). Compare with the exact solution.
E10) Extrapolate the value y(0.6) obtained in E8).
We now end this unit by giving a summary of what we have covered in it.
is given by
2) Euler's method is the Taylor series method of order one. The steps involved in solving the IVP given by (10) by Euler's method are as follows :
Step 1 : Evaluate f(t_0, y_0)
Step 2 : Find y_1 = y_0 + h f(t_0, y_0)
Step 3 : If t_0 < b, change t_0 to t_0 + h and y_0 to y_1 and repeat steps 1 and 2
Step 4 : If t_0 = b, write the value of y_1
3) Richardson's extrapolation method given by Eqn. (42) can be used to improve the values of the function evaluated by the Euler's method.
y_2 = (1.105)^2
..................
y_5 = (1.105)^5
The table giving the values of y_n together with the exact values is
Table 7
t      Second order method    Exact solution
0      1                      1
0.1    1.105                  1.10517
0.2    1.22103                1.22140
0.3    1.34923                1.34986
0.4    1.49090                1.49182
0.5    1.64745                1.64872
Substituting
Table 8 : h = 0.2
Table 9 : h = 0.1
Table 10 : h = 0.2
Table 11 : h = 0.1
Structure
15.1 Introduction
Objectives
15.2 Runge-Kutta Methods
Runge-Kutta Methods of Second Order
Runge-Kutta Methods of Third Order
Runge-Kutta Methods of Fourth Order
15.3 Richardson's Extrapolation
15.4 Summary
15.5 Solutions/Answers
15.1 INTRODUCTION
In order to avoid this difficulty, at the end of the nineteenth century, the German mathematician Runge observed that the expression for the increment function φ(t, y, h) in the singlestep methods [see Eqn. (24) of Sec. 14.3, Unit 14]
can be modified to avoid evaluation of higher order derivatives. This idea was further developed by Runge and Kutta (another German mathematician) and the methods given by them are known as Runge-Kutta methods. Using their ideas, we can construct higher order methods using only the function f(t, y) at selected points on each subinterval. We shall, in the next section, derive some of these methods.
Objectives
After studying this unit, you should be able to :
obtain the solution of IVPs using Runge-Kutta methods of second, third and fourth order,
compare the solutions obtained by using Runge-Kutta and Taylor series methods;
extrapolate the approximate value of the solutions obtained by the Runge-Kutta
methods of second, third and fourth order.
We shall first try to discuss the basic idea of how the Runge-Kutta methods are developed.
Consider the O(h^2) singlestep method
y_(n+1) = y_n + h y'_n + (h^2/2) y''_n   (3)
If we write Eqn. (3) in the form of Eqn. (2), i.e., in terms of φ[t_n, y_n, h] involving partial derivatives of f(t, y), we obtain
Runge observed that the r.h.s. of Eqn. (4) can also be obtained using the Taylor series expansion of f(t_n + ph, y_n + qhf) as
f(t_n + ph, y_n + qhf) ≈ f + ph f_t(t_n, y_n) + qhf f_y(t_n, y_n)   (5)
Comparing Eqns. (4) and (5) we find that p = q = 1/2 and the Taylor series method of O(h^2) given by Eqn. (3) can also be written as
Since (5) is of O(h^2), the value of y_(n+1) in (6) has the TE of O(h^3). Hence the method (6) is of O(h^2), which is the same as that of (3).
The advantage of using (6) over the Taylor series method (3) is that we need to evaluate the function f(t, y) only at the two points (t_n, y_n) and (t_n + h/2, y_n + (h/2) f_n). We observe that f(t_n, y_n) denotes the slope of the solution curve to the IVP (1) at (t_n, y_n). Further, f(t_n + h/2, y_n + (h/2) f(t_n, y_n)) denotes an approximation to the slope of the solution curve at the point t_n + h/2. Eqn. (6) denotes geometrically, that the slope of the solution curve in the interval [t_n, t_(n+1)] is being approximated by an approximation to the slope at the middle point t_n + h/2. This idea can be generalised and the slope of the solution curve in [t_n, t_(n+1)] can be replaced by a weighted sum of slopes at a number of points in [t_n, t_(n+1)] (called off-step points). This idea is the basis of the Runge-Kutta methods.
Let us consider, for example, the weighted sum of the slopes at the two points (t_n, y_n) and (t_n + ph, y_n + qhf), 0 < p, q ≤ 1, as
We call W_1 and W_2 weights and p and q scale factors. We have to determine the four unknowns W_1, W_2, p and q such that φ(t_n, y_n, h) is of O(h^2). Substituting Eqn. (5) in (7), we have
where ( )_n denotes that the quantities inside the brackets are evaluated at (t_n, y_n).
Comparing the r.h.s. of Eqn. (9) with Eqn. (3), we find that
Solution of Ordinary Differential Equations using Runge-Kutta Methods
In the system of Eqns. (10), since the number of unknowns is more than the number of equations, the solution is not unique and we have an infinite number of solutions. The solution of Eqn. (10) can be written as
Note that when f is a function of t only, the method (12) is equivalent to the trapezoidal rule of integration, whereas the method (6) is equivalent to the midpoint rule of integration. Both the methods (6) and (12) are of O(h^2). The methods (6) and (12) can easily be implemented to solve the IVP (1). Method (6) is usually known as the improved tangent method or the modified Euler method. Method (12) is also known as the Euler-Cauchy method.
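Both O(h^2) methods are easy to implement. The Python sketch below (ours) applies them to the problem y' = -t y^2, y(2) = 1 used later in Table 1; the exact solution y = 2/(t^2 - 2) is our own computation from the initial condition:

```python
# The two O(h^2) methods on y' = -t*y^2, y(2) = 1, whose exact solution
# is y = 2/(t^2 - 2) (our computation). Method (6) is the improved
# tangent (modified Euler) method; (12) is the Euler-Cauchy (Heun) method.

def f(t, y):
    return -t * y * y

def modified_euler_step(t, y, h):      # method (6): slope at the midpoint
    return y + h * f(t + h / 2, y + h / 2 * f(t, y))

def euler_cauchy_step(t, y, h):        # method (12): average of end slopes
    k1 = h * f(t, y)
    k2 = h * f(t + h, y + k1)
    return y + (k1 + k2) / 2

h = 0.1
y6, y12 = 1.0, 1.0
for i in range(2):                     # integrate from t = 2 to t = 2.2
    t = 2 + i * h
    y6 = modified_euler_step(t, y6, h)
    y12 = euler_cauchy_step(t, y12, h)

exact = 2 / (2.2 ** 2 - 2)
print(round(y6, 4), round(y12, 4), round(exact, 4))
```

Both values land within about 0.005 of the exact y(2.2), consistent with a second order method at this step size.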
We shall now discuss the Runge-Kutta methods of O(h^2), O(h^3) and O(h^4).
The parameters Ci, aij, Wj are unknowns and are to be determined to obtain the
Runge-Kutta methods.
where
where the parameters C_2, a_21, W_1 and W_2 are chosen to make y_(n+1) closer to y(t_(n+1)).
Expanding y(t_(n+1)) in Taylor series, we have
y(t_(n+1)) = y(t_n) + h y'(t_n) + (h^2/2!) y''(t_n) + (h^3/3!) y'''(t_n) + ....
where
y' = f(t, y)
y'' = f_t + f f_y
y''' = f_tt + 2 f f_ty + f^2 f_yy + f_y (f_t + f f_y)
We expand K_1 and K_2 about the point (t_n, y_n). Comparing with the Taylor expansion of y(t_(n+1)), we obtain the conditions
W_1 + W_2 = 1,  a_21 = C_2,  a_21 W_2 = 1/2
From these equations we find that if C_2 is chosen arbitrarily, we have
a_21 = C_2,  W_2 = 1/(2C_2),  W_1 = 1 - 1/(2C_2)
Subtracting Eqn. (20) from the Taylor series (17), we get the truncation error as
TE = y(t_(n+1)) - y_(n+1)
Since the TE is of O(h^3), all the above R-K methods are of second order. Observe that no choice of C_2 will make the leading term of the TE zero for all f(t, y). The local TE depends not only on derivatives of the solution y(t) but also on the function f(t, y). This is typical of all the Runge-Kutta methods. Generally, C_2 is chosen between 0 and 1 so that we are evaluating f(t, y) at an off-step point in [t_n, t_(n+1)]. From the definition, every Runge-Kutta formula must reduce to a quadrature formula of the same order or greater if f(t, y) is independent of y, where the W_i and C_i will be the weights and abscissas of the corresponding numerical integration formula.
The best way of obtaining the value of the arbitrary parameter C_2 in our formula is to
i) choose some of the W_i's zero so as to minimize the computations,
ii) choose the parameter to obtain the least TE,
iii) choose the parameter to have a longer stability interval.
Methods satisfying condition (ii) or (iii) are called optimal Runge-Kutta methods.
i) C_2 = 1/2, ∴ a_21 = 1/2, W_1 = 0, W_2 = 1; then
y_(n+1) = y_n + K_2
iii) C_2 = 2/3, ∴ a_21 = 2/3, W_1 = 1/4, W_2 = 3/4; then
y_(n+1) = y_n + (1/4)(K_1 + 3K_2)
Table 1
Solutions and errors in the solution of y' = -t y^2, y(2) = 1, h = 0.1. Numbers inside brackets denote the errors.
You may observe here that all the above numerical solutions have almost the same error.
Solve the following IVPs using Heun's method of O(h^2) and the optimal R-K method of O(h^2). Also compare the errors at t = 0.4, obtained here, with the one obtained by the Taylor series method of O(h^2).
E3) y' = 3t + (1/2) y, y(0) = 1. Find y(0.2) taking h = 0.1. Given y(t) = 13 e^(t/2) - 6t - 12, find the errors.
y_(n+1) = y_n + W_1 K_1 + W_2 K_2 + W_3 K_3
where
K_1 = h f(t_n, y_n)
K_2 = h f(t_n + C_2 h, y_n + a_21 K_1)
K_3 = h f(t_n + C_3 h, y_n + a_31 K_1 + a_32 K_2)
Expanding K_2, K_3 and y_(n+1) into Taylor series, substituting their values in Eqn. (25) and comparing the coefficients of powers of h, h^2 and h^3, we obtain
a_21 = C_2,  a_31 + a_32 = C_3,  W_1 + W_2 + W_3 = 1,
C_2 W_2 + C_3 W_3 = 1/2,  C_2^2 W_2 + C_3^2 W_3 = 1/3,  C_2 a_32 W_3 = 1/6   (26)
We have 6 equations to determine the 8 unknowns. Hence the system has two arbitrary parameters. Eqns. (26) are typical of all the R-K methods. Looking at Eqn. (26), you may note that the sum of the a_ij's in any row equals the corresponding C_i, and the sum of the W_i's is equal to 1. Further, the equations are linear in W_2 and W_3 and have a solution for W_2 and W_3 if and only if
Since two parameters of this system are arbitrary, we can choose C_2, C_3 and determine from Eqn. (27)
a_32 = C_3 (C_3 - C_2) / [C_2 (2 - 3 C_2)]
If C_3 = 0 or C_2 = C_3, then C_2 = 2/3 and we can choose a_32 ≠ 0 arbitrarily. All C_i's should be chosen such that 0 < C_i ≤ 1. Once C_2 and C_3 are prescribed, the W_i's and a_ij's can be determined from Eqns. (26).
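The determination of the remaining coefficients from chosen C_2, C_3 can be sketched in Python with exact rational arithmetic (our illustration of Eqns. (26)-(27)):

```python
from fractions import Fraction as Fr

# Given C2 and C3, solve the linear pair
#   C2*W2 + C3*W3 = 1/2,  C2^2*W2 + C3^2*W3 = 1/3
# for W2, W3, then recover W1, a21, a32, a31 from Eqns. (26)-(27).

def third_order_coeffs(C2, C3):
    det = C2 * C3 * (C3 - C2)                      # determinant of the pair
    W3 = (Fr(1, 3) * C2 - Fr(1, 2) * C2 ** 2) / det
    W2 = (Fr(1, 2) * C3 ** 2 - Fr(1, 3) * C3) / det
    W1 = 1 - W2 - W3
    a21 = C2
    a32 = C3 * (C3 - C2) / (C2 * (2 - 3 * C2))
    a31 = C3 - a32
    return W1, W2, W3, a21, a31, a32

# Heun's third order choice C2 = 1/3, C3 = 2/3:
print(third_order_coeffs(Fr(1, 3), Fr(2, 3)))
```

For C_2 = 1/3, C_3 = 2/3 this reproduces Heun's third order weights W = (1/4, 0, 3/4), matching the formula y_(n+1) = y_n + (1/4)(K_1 + 3K_3) used in the solutions.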
Heun's Method
We now illustrate the third order R-K methods by solving the problem considered in Example 1, using (a) Heun's method (b) the optimal method.
a) Heun's method
K_3 = -0.16080
y(2.1) = 0.8294
Taking t_1 = 2.1 and y_1 = 0.8294, we have
K_1 = -0.14446
K_2 = -0.13017
K_3 = -0.11950
y(2.2) = 0.70366
b) Optimal method
K3 = -0.15905
y(2.1) = 0.8297
Taking t_1 = 2.1 and y_1 = 0.8297, we have
K_1 = -0.14456
You can now easily find the errors in these solutions and compare the results with those
obtained in Example 1.
And now here is an exercise for you.
Since the expansions of K_2, K_3, K_4 and y_(n+1) in Taylor series are complicated, we shall not write down the resulting system of equations for the determination of the unknowns. It may be noted that the system of equations has 3 arbitrary parameters. We shall state directly a few R-K methods of O(h^4). The R-K methods (31) can be denoted by
C_2 | a_21
C_3 | a_31  a_32
C_4 | a_41  a_42  a_43
      W_1   W_2   W_3   W_4
For different choices of these unknowns we have the following methods :
i) Classical R-K method
(32)
This is the widely used method due to its simplicity and moderate order. We shall also be working out problems mostly by the classical R-K method unless specified otherwise.
ii) Runge-Kutta-Gill method
The Runge-Kutta-Gill method is also used widely. But, in this unit, we shall mostly work out problems with the classical R-K method of O(h^4). Hence, whenever we refer to the R-K method of O(h^4) we mean only the classical R-K method of O(h^4) given by (32). We shall now illustrate this method through examples.
Example 2 : Solve the IVP y' = t + y, y(0) = 1 by the Runge-Kutta method of O(h^4) for t ∈ [0, 0.5] with h = 0.1. Also find the error at t = 0.5, if the exact solution is y(t) = 2e^t - t - 1.
Solution : We use the R-K method of O(h^4) given by (32).
Initially, to = 0, yo = 1.
We have
y_1 = y_0 + (1/6)(K_1 + 2K_2 + 2K_3 + K_4)
= 1 + (1/6)[0.1 + 0.22 + 0.2210 + 0.12105] = 1.11034167
Taking t_1 = 0.1 and y_1 = 1.11034167, we repeat the process and obtain
y_2 = 1.24280514
The rest of the values y_3, y_4, y_5 are given in Table 2.
Table 2
Now the exact solution is y(t) = 2e^t - t - 1.
Error at t = 0.5 is
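A Python sketch (ours) of the classical R-K method for Example 2; the first step reproduces y_1 = 1.11034167 above, and the error at t = 0.5 comes out of the order 10^-6:

```python
import math

# Classical R-K method (32) for Example 2:
# y' = t + y, y(0) = 1, h = 0.1, integrated to t = 0.5.

def f(t, y):
    return t + y

def rk4_step(t, y, h):
    K1 = h * f(t, y)
    K2 = h * f(t + h / 2, y + K1 / 2)
    K3 = h * f(t + h / 2, y + K2 / 2)
    K4 = h * f(t + h, y + K3)
    return y + (K1 + 2 * K2 + 2 * K3 + K4) / 6

t, y, h = 0.0, 1.0, 0.1
for _ in range(5):
    y = rk4_step(t, y, h)
    t += h

exact = 2 * math.exp(0.5) - 0.5 - 1
print(round(y, 8), round(abs(exact - y), 10))
```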
Find y(0.1), y(0.2), y(0.3) taking h = 0.1. Also find the errors at t = 0.3, if the exact solution is y(t) = 3(e^(2t) - e^t).
Solution : a) The classical R-K method gives
K_1 = 0.528670978,
K_2 = 0.6045222614,
y(0.3) ≈ 1.416751936
From the exact solution we get
y(0.3) = 1.416779978
Error in the classical R-K method (at t = 0.3) = 0.2802 × 10^(-4)
Error in the R-K-Gill method (at t = 0.3) = 0.2804 × 10^(-4).
You may now try the following exercises.
E8) y' = 1/t^2 - y/t - y^2, y(1) = -1. Find y(1.3) taking h = 0.1. Given the exact solution to be y(t) = -1/t, find the error at t = 1.3.
In the next section, we shall study the application of Richardson's extrapolation to the
solutions of ordinary differential equations.
You know that Richardson's extrapolation technique improves the approximate value of y(t_k), and the order of this improved value of y(t_k) exceeds the order of the method by one.
Here we shall first calculate the solutions F(h_1) and F(h_2) of the given IVP with steplengths h_1 and h_2, where h_2 = h_1/2, at a given point using a Runge-Kutta method. Then by Richardson's extrapolation technique we have, for a second order method,
F^(1) = [4 F(h/2) - F(h)] / 3
as the improved solution at that point, which will be of higher order than the original method. We shall now illustrate the technique through an example.
Example 4 : Using the Runge-Kutta method of O(h^2), find the solution of the IVP y' = t + y, y(0) = 1 using h = 0.1 and 0.2 at t = 0.4. Use the extrapolation technique to improve the accuracy. Also find the errors if the exact solution is y(t) = 2e^t - t - 1.
Solution : We shall use Heun's second order method (23) to find the solution at t = 0.4 with h = 0.1 and 0.2. The following Table 3 gives the values of y(t) at t = 0.2 and t = 0.4 with h = 0.1 and 0.2.
Table 3
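The calculation of Example 4 can be sketched in Python (our illustration):

```python
import math

# Heun's O(h^2) method for y' = t + y, y(0) = 1 at t = 0.4 with
# h = 0.2 and h = 0.1, then Richardson extrapolation
# F1 = (4*F(h/2) - F(h)) / 3.

def f(t, y):
    return t + y

def heun(t0, y0, h, steps):
    t, y = t0, y0
    for _ in range(steps):
        k1 = h * f(t, y)
        k2 = h * f(t + h, y + k1)
        y += (k1 + k2) / 2
        t += h
    return y

F_02 = heun(0.0, 1.0, 0.2, 2)    # h = 0.2
F_01 = heun(0.0, 1.0, 0.1, 4)    # h = 0.1
F1 = (4 * F_01 - F_02) / 3       # extrapolated value at t = 0.4

exact = 2 * math.exp(0.4) - 0.4 - 1
print(round(F_02, 6), round(F_01, 6), round(F1, 6), round(abs(exact - F1), 6))
```

The extrapolated value lands within about 2 × 10^-4 of the exact y(0.4), roughly an order of magnitude better than either Heun solution alone.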
E9) Solve E2) taking h = 0.1 and 0.2 using O(h^2) Heun's method. Extrapolate the value at t = 0.4. Also find the error at t = 0.4.
E10) Solve E6), taking h = 0.1 and 0.2 using O(h^2) Heun's method. Extrapolate the value at t = 0.4. Compare this solution with the solution obtained by the classical O(h^4) R-K method.
We now end this unit by giving a summary of what we have covered in it.
15.4 SUMMARY
In this unit we have learnt the following :
1) Runge-Kutta methods, being singlestep methods, are self-starting methods.
2) Unlike Taylor series methods, R-K methods do not need calculation of higher order
derivatives of f(t, y) but need only the evaluation of f(t, y) at the off-step points.
3) For a given IVP of the form
y' = f(t, y), y(t_0) = y_0, t ∈ [t_0, b]
where the mesh points are t_j = t_0 + jh, j = 0, 1, ......, n, and t_n = b = t_0 + nh, R-K methods are obtained by writing
y_(n+1) = y_n + h (weighted sum of the slopes)
The unknowns C_i, a_ij and W_j are then obtained by expanding the K_i's and y_(n+1) in Taylor series about the point (t_n, y_n) and comparing the coefficients of different powers of h.
4) Richardson's extrapolation technique can be used to improve the approximate value of y(t_k) obtained by O(h^2), O(h^3) and O(h^4) methods, and obtain a method of order one higher than the method.
K_1 = 0.0833125, K_2 = 0.117478125
y(0.2) = 1.166645313
Optimal R-K method :
K_1 = 0.05, K_2 = 0.071666667
y(0.1) = 1.06625
K_1 = 0.0833125, K_2 = 0.106089583
y(0.2) = 1.166645313
Exact y(0.2) = 1.167221935
Error in both the methods is the same, = 0.577 × 10^(-3)
E4) Heun's method : y_(n+1) = y_n + (1/4)(K_1 + 3K_3)
Starting with t_0 = 0, y_0 = 2, h = 0.1, we have
K_1 = -0.005, K_2 = -0.004853689024
K_3 = -0.0048544, K_4 = -0.004715784587
y(4.2) = 0.9951446726.
Exact y(4.2) = 0.995145231, Error = 0.559 × 10^(-6)
K_1 = 0.1, K_2 = 0.09092913832
K_3 = 0.09049729525, K_4 = 0.08260717517
y(1.1) = -0.909089993
K_1 = 0.08264471138, K_2 = 0.07577035491
K_3 = 0.07547152415, K_4 = 0.06942067502
y(1.2) = -0.8333318022
K_1 = 0.06944457204, K_2 = 0.06411104536
K_3 = 0.06389773475, K_4 = 0.0591559551
y(1.3) = -0.7692287876
Exact y(1.3) = -0.7692307692
Error = 0.19816 × 10^(-5)
Heun's method :
with h = 0.1
K_1 = 0.1, K_2 = 0.101
y(0.1) = 0.1005
K_1 = 0.101010, K_2 = 0.104061
y(0.2) = 0.203035
K_1 = 0.1041223, K_2 = 0.1094346
y(0.3) = 0.309813
K_1 = 0.1095984, K_2 = 0.1048047
with h = 0.2
F(h) = y(0.4) = 0.4251626422 [see E2]
Now
F^(1)(0.4) = [4 F(h/2) - F(h)] / 3
= 0.4142958537
Exact y(0.4) = 0.422793219
Error = 0.8495 × 10^(-2)