
CHAPTER 4 (5 LECTURES)

ITERATIVE TECHNIQUES IN MATRIX ALGEBRA

1. Introduction
In this chapter we will study iterative techniques to solve linear systems. An initial approximation
(or approximations) will be found, and new approximations are then determined based on how well
the previous approximations satisfied the equation. The objective is to find a way to minimize the
difference between the approximations and the exact solution. To discuss iterative methods for solving
linear systems, we first need to determine a way to measure the distance between n-dimensional column
vectors. This will permit us to determine whether a sequence of vectors converges to a solution of the
system. In actuality, this measure is also needed when the solution is obtained by the direct methods
presented in Chapter 3. Those methods required a large number of arithmetic operations, and using
finite-digit arithmetic leads only to an approximation to an actual solution of the system.

2. Norms of Vectors and Matrices


2.1. Vector Norms. Let $\mathbb{R}^n$ denote the set of all $n$-dimensional column vectors with real-number components. To define a distance in $\mathbb{R}^n$ we use the notion of a norm, which is the generalization of the absolute value on $\mathbb{R}$, the set of real numbers.
Definition 2.1. A vector norm on $\mathbb{R}^n$ is a function $\|\cdot\|$ from $\mathbb{R}^n$ into $\mathbb{R}$ with the following properties:
(1) $\|x\| \ge 0$ for all $x \in \mathbb{R}^n$.
(2) $\|x\| = 0$ if and only if $x = 0$.
(3) $\|x + y\| \le \|x\| + \|y\|$ for all $x, y \in \mathbb{R}^n$ (triangle inequality).
(4) $\|\alpha x\| = |\alpha|\,\|x\|$ for all $x \in \mathbb{R}^n$ and $\alpha \in \mathbb{R}$.
Definition 2.2. The $l_2$ and $l_\infty$ norms for the vector $x = (x_1, x_2, \ldots, x_n)^t$ are defined by
\[ \|x\|_2 = \left( \sum_{i=1}^{n} |x_i|^2 \right)^{1/2} \quad \text{and} \quad \|x\|_\infty = \max_{1 \le i \le n} |x_i|. \]
Note that each of these norms reduces to the absolute value in the case $n = 1$.
The $l_2$ norm is called the Euclidean norm of the vector $x$ because it represents the usual notion of distance from the origin when $x$ is in $\mathbb{R}^1 = \mathbb{R}$, $\mathbb{R}^2$, or $\mathbb{R}^3$. For example, the $l_2$ norm of the vector $x = (x_1, x_2, x_3)^t$ gives the length of the straight line joining the points $(0, 0, 0)$ and $(x_1, x_2, x_3)$.
Example 1. Determine the $l_2$ norm and the $l_\infty$ norm of the vector $x = (-1, 1, -2)^t$.
Sol. The vector $x = (-1, 1, -2)^t$ in $\mathbb{R}^3$ has norms
\[ \|x\|_2 = \sqrt{(-1)^2 + (1)^2 + (-2)^2} = \sqrt{6} \]
and
\[ \|x\|_\infty = \max\{|-1|, |1|, |-2|\} = 2. \]
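These norms are easy to compute numerically. The following is a minimal sketch in Python with NumPy (the notes do not prescribe a programming language, and the helper names are our own) that reproduces the values of Example 1:

import numpy as np

def l2_norm(x):
    # l2 norm: square root of the sum of squared components
    return np.sqrt(np.sum(np.abs(x) ** 2))

def linf_norm(x):
    # l-infinity norm: largest component in absolute value
    return np.max(np.abs(x))

x = np.array([-1.0, 1.0, -2.0])
print(l2_norm(x))    # 2.449489... = sqrt(6)
print(linf_norm(x))  # 2.0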
Definition 2.3 (Distance between vectors in $\mathbb{R}^n$). If $x = (x_1, x_2, \ldots, x_n)^t$ and $y = (y_1, y_2, \ldots, y_n)^t$ are vectors in $\mathbb{R}^n$, the $l_2$ and $l_\infty$ distances between $x$ and $y$ are defined by
\[ \|x - y\|_2 = \left( \sum_{i=1}^{n} (x_i - y_i)^2 \right)^{1/2}, \qquad \|x - y\|_\infty = \max_{1 \le i \le n} |x_i - y_i|. \]

Definition 2.4 (Matrix Norm). A matrix norm on the set of all $n \times n$ matrices is a real-valued function $\|\cdot\|$ defined on this set, satisfying for all $n \times n$ matrices $A$ and $B$ and all real numbers $\alpha$:

(1) $\|A\| \ge 0$,
(2) $\|A\| = 0$ if and only if $A = O$, the matrix with all zero entries,
(3) $\|\alpha A\| = |\alpha|\,\|A\|$,
(4) $\|A + B\| \le \|A\| + \|B\|$,
(5) $\|AB\| \le \|A\|\,\|B\|$.
If $\|\cdot\|$ is a vector norm on $\mathbb{R}^n$, then
\[ \|A\| = \max_{\|x\| = 1} \|Ax\| \]
is a matrix norm.
The matrix norms we will consider have the forms
\[ \|A\|_\infty = \max_{\|x\|_\infty = 1} \|Ax\|_\infty \quad \text{and} \quad \|A\|_2 = \max_{\|x\|_2 = 1} \|Ax\|_2. \]

Theorem 2.5. If $A = (a_{ij})$ is an $n \times n$ matrix, then
\[ \|A\|_\infty = \max_{1 \le i \le n} \sum_{j=1}^{n} |a_{ij}|. \]

Example 2. Determine $\|A\|_\infty$ for the matrix
\[ A = \begin{pmatrix} 1 & 2 & -1 \\ 0 & 3 & -1 \\ 5 & -1 & 1 \end{pmatrix}. \qquad (2.1) \]
Sol. We have
\[ \sum_{j=1}^{3} |a_{1j}| = |1| + |2| + |-1| = 4, \qquad \sum_{j=1}^{3} |a_{2j}| = |0| + |3| + |-1| = 4, \]
and
\[ \sum_{j=1}^{3} |a_{3j}| = |5| + |-1| + |1| = 7. \]
So the above theorem implies that $\|A\|_\infty = \max\{4, 4, 7\} = 7$.
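As a quick numerical check of Theorem 2.5 and Example 2, one can compute the maximum absolute row sum directly; a sketch in Python/NumPy (the helper name is our own):

import numpy as np

def matrix_inf_norm(A):
    # ||A||_inf is the maximum absolute row sum
    return np.max(np.sum(np.abs(A), axis=1))

A = np.array([[1.0, 2.0, -1.0],
              [0.0, 3.0, -1.0],
              [5.0, -1.0, 1.0]])
print(matrix_inf_norm(A))         # 7.0
print(np.linalg.norm(A, np.inf))  # same value from NumPy's built-in norm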
Definition 2.6 (Eigenvalue and eigenvector). Let $A$ be a square matrix. A number $\lambda$ is called an eigenvalue of $A$ if there exists a nonzero vector $x$ such that $Ax = \lambda x$; the vector $x$ is called a corresponding eigenvector.
Definition 2.7 (Characteristic polynomial). The characteristic polynomial of $A$ is defined by
\[ P(\lambda) = \det(A - \lambda I). \]
A number $\lambda$ is an eigenvalue of the matrix $A$ if and only if $\lambda$ is a root of the characteristic polynomial, i.e., $P(\lambda) = 0$.
Definition 2.8 (Spectral Radius). The spectral radius ρ(A) of a matrix A is defined by
ρ(A) = max |λ|, where λ is an eigenvalue of A.
2.2. Convergent Matrices. In studying iterative matrix techniques, it is of particular importance
to know when powers of a matrix become small (that is, when all the entries approach zero). Matrices
of this type are called convergent.
Definition 2.9. An $n \times n$ matrix $A$ is convergent if
\[ \lim_{k \to \infty} A^k = 0. \]
Example 3. Show that the matrix
\[ A = \begin{pmatrix} \tfrac{1}{2} & 0 \\ \tfrac{1}{4} & \tfrac{1}{2} \end{pmatrix} \]
is a convergent matrix.

Sol. Computing powers of $A$, we obtain
\[ A^2 = \begin{pmatrix} \tfrac{1}{4} & 0 \\ \tfrac{1}{4} & \tfrac{1}{4} \end{pmatrix}, \qquad A^3 = \begin{pmatrix} \tfrac{1}{8} & 0 \\ \tfrac{3}{16} & \tfrac{1}{8} \end{pmatrix}, \]
and, in general,
\[ A^k = \begin{pmatrix} \left(\tfrac{1}{2}\right)^k & 0 \\ \dfrac{k}{2^{k+1}} & \left(\tfrac{1}{2}\right)^k \end{pmatrix}. \]
So $A$ is a convergent matrix because
\[ \lim_{k \to \infty} \frac{1}{2^k} = 0 \quad \text{and} \quad \lim_{k \to \infty} \frac{k}{2^{k+1}} = 0, \]
and therefore
\[ \lim_{k \to \infty} A^k = 0, \]
which implies that the matrix $A$ is convergent.
Note that the convergent matrix $A$ in this example has spectral radius $\rho(A) = \tfrac{1}{2}$, because $\tfrac{1}{2}$ is the only eigenvalue of $A$. This illustrates an important connection between the spectral radius of a matrix and the convergence of the matrix, as detailed in the following result.
Theorem 2.10. The following statements are equivalent.
(i) $A$ is a convergent matrix.
(ii) $\lim_{k \to \infty} \|A^k\| = 0$ for some natural norm.
(iii) $\rho(A) < 1$.
The proof of this theorem can be found in advanced texts of numerical analysis.
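In practice, condition (iii) can be tested numerically by computing the eigenvalues; a small sketch (assuming Python with NumPy) applied to the matrix of Example 3:

import numpy as np

A = np.array([[0.5, 0.0],
              [0.25, 0.5]])

# spectral radius = largest eigenvalue magnitude
rho = max(abs(np.linalg.eigvals(A)))
print(rho)  # 0.5 < 1, so A is convergent

# the powers of A do indeed shrink toward the zero matrix
print(np.linalg.matrix_power(A, 20))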

3. Iterative Methods
The linear system Ax = b may have a large order. For such systems Gauss elimination is often too
expensive in either computation time or computer memory requirements or both.
In an iterative method, a sequence of progressively better approximations (iterates) is produced to approximate the solution.
Jacobi and Gauss-Seidel Method: We start with an example. Consider the system of equations
9x1 + x2 + x3 = 10
2x1 + 10x2 + 3x3 = 19
3x1 + 4x2 + 11x3 = 0.
One class of iterative methods for solving this system is as follows. We write
\[ x_1 = \frac{1}{9}(10 - x_2 - x_3), \qquad x_2 = \frac{1}{10}(19 - 2x_1 - 3x_3), \qquad x_3 = \frac{1}{11}(0 - 3x_1 - 4x_2). \]
Let $x^{(0)} = [x_1^{(0)}\; x_2^{(0)}\; x_3^{(0)}]$ be an initial approximation to the solution $x$. Then define the iteration
\[ x_1^{(k+1)} = \frac{1}{9}\left(10 - x_2^{(k)} - x_3^{(k)}\right), \]
\[ x_2^{(k+1)} = \frac{1}{10}\left(19 - 2x_1^{(k)} - 3x_3^{(k)}\right), \]
\[ x_3^{(k+1)} = \frac{1}{11}\left(0 - 3x_1^{(k)} - 4x_2^{(k)}\right), \qquad k = 0, 1, 2, \ldots. \]
This is called the Jacobi method, or the method of simultaneous replacements. The method is named after the German mathematician Carl Gustav Jacob Jacobi.
We start with $[0\; 0\; 0]$ and obtain
\[ x_1^{(1)} = 1.1111, \quad x_2^{(1)} = 1.900, \quad x_3^{(1)} = 0.0, \]
\[ x_1^{(2)} = 0.9000, \quad x_2^{(2)} = 1.6778, \quad x_3^{(2)} = -0.9939, \]

etc.
Another approach to solving the same system is the following:
\[ x_1^{(k+1)} = \frac{1}{9}\left(10 - x_2^{(k)} - x_3^{(k)}\right), \]
\[ x_2^{(k+1)} = \frac{1}{10}\left(19 - 2x_1^{(k+1)} - 3x_3^{(k)}\right), \]
\[ x_3^{(k+1)} = \frac{1}{11}\left(0 - 3x_1^{(k+1)} - 4x_2^{(k+1)}\right), \qquad k = 0, 1, 2, \ldots. \]
This method is called the Gauss-Seidel method, or the method of successive replacements. It is named after the German mathematicians Carl Friedrich Gauss and Philipp Ludwig von Seidel. Starting with $[0\; 0\; 0]$, we obtain
\[ x_1^{(1)} = 1.1111, \quad x_2^{(1)} = 1.6778, \quad x_3^{(1)} = -0.9131, \]
\[ x_1^{(2)} = 1.0262, \quad x_2^{(2)} = 1.9687, \quad x_3^{(2)} = -0.9588. \]
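The two iterations above can be reproduced componentwise. Below is a minimal sketch in Python/NumPy (the loop structure and names are our own, not part of the notes) of one Jacobi sweep and one Gauss-Seidel sweep on this system:

import numpy as np

A = np.array([[9.0, 1.0, 1.0],
              [2.0, 10.0, 3.0],
              [3.0, 4.0, 11.0]])
b = np.array([10.0, 19.0, 0.0])
n = len(b)

def jacobi_step(x):
    # every component uses only values from the previous iterate
    x_new = np.empty(n)
    for i in range(n):
        s = sum(A[i, j] * x[j] for j in range(n) if j != i)
        x_new[i] = (b[i] - s) / A[i, i]
    return x_new

def gauss_seidel_step(x):
    # components are overwritten as soon as they are computed
    x = x.copy()
    for i in range(n):
        s = sum(A[i, j] * x[j] for j in range(n) if j != i)
        x[i] = (b[i] - s) / A[i, i]
    return x

print(jacobi_step(np.zeros(n)))        # [1.1111, 1.9, 0.0]
print(gauss_seidel_step(np.zeros(n)))  # [1.1111, 1.6778, -0.9131]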

General Approach: The Jacobi iterative method is obtained by solving the $i$th equation in $Ax = b$ for $x_i$ to obtain (provided $a_{ii} \neq 0$)
\[ x_i = \frac{1}{a_{ii}} \left[ \sum_{\substack{j=1 \\ j \neq i}}^{n} (-a_{ij} x_j) + b_i \right], \qquad i = 1, 2, \ldots, n. \]
For each $k \ge 0$, generate the components of $x^{(k+1)}$ from those of $x^{(k)}$ by
\[ x_i^{(k+1)} = \frac{1}{a_{ii}} \left[ \sum_{\substack{j=1 \\ j \neq i}}^{n} \left(-a_{ij} x_j^{(k)}\right) + b_i \right], \qquad i = 1, 2, \ldots, n. \]

In matrix form,
\[ D x^{(k+1)} = -(L + U) x^{(k)} + b, \]
where $A = D + L + U$ with $D$, $L$, and $U$ the diagonal, strictly lower-triangular, and strictly upper-triangular parts of $A$, respectively.
If $D^{-1}$ exists, then the Jacobi iterative scheme is
\[ x^{(k+1)} = -D^{-1}(L + U) x^{(k)} + D^{-1} b. \]
Writing $T_j = -D^{-1}(L + U)$ and $B = D^{-1} b$, we obtain
\[ x^{(k+1)} = T_j x^{(k)} + B. \]
The matrix $T_j$ is called the iteration matrix.
For Gauss-Seidel, we write each equation as
\[ a_{11} x_1^{(k+1)} = -a_{12} x_2^{(k)} - \cdots + b_1, \]
\[ a_{21} x_1^{(k+1)} + a_{22} x_2^{(k+1)} = -a_{23} x_3^{(k)} - \cdots + b_2, \]
\[ \vdots \]
\[ a_{n1} x_1^{(k+1)} + a_{n2} x_2^{(k+1)} + \cdots + a_{nn} x_n^{(k+1)} = b_n. \]
In general, we solve the $i$th equation in $Ax = b$ for $x_i$, and the iterative scheme is
\[ x_i^{(k+1)} = \frac{1}{a_{ii}} \left[ -\sum_{j=1}^{i-1} a_{ij} x_j^{(k+1)} - \sum_{j=i+1}^{n} a_{ij} x_j^{(k)} + b_i \right], \qquad i = 1, 2, \ldots, n. \]
In matrix form,
\[ (D + L) x^{(k+1)} = -U x^{(k)} + b, \]
where $D$, $L$, and $U$ are as above, so that
\[ x^{(k+1)} = -(D + L)^{-1} U x^{(k)} + (D + L)^{-1} b, \]

\[ x^{(k+1)} = T_g x^{(k)} + B, \qquad k = 0, 1, 2, \ldots, \]
where $T_g = -(D + L)^{-1} U$ is called the iteration matrix and $B = (D + L)^{-1} b$.
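The splitting is straightforward to set up numerically. The following sketch (assuming Python/NumPy) forms $T_j$ and $T_g$ for the system used above and checks their spectral radii:

import numpy as np

A = np.array([[9.0, 1.0, 1.0],
              [2.0, 10.0, 3.0],
              [3.0, 4.0, 11.0]])

D = np.diag(np.diag(A))   # diagonal part
L = np.tril(A, k=-1)      # strictly lower-triangular part
U = np.triu(A, k=1)       # strictly upper-triangular part

T_j = -np.linalg.inv(D) @ (L + U)   # Jacobi iteration matrix
T_g = -np.linalg.inv(D + L) @ U     # Gauss-Seidel iteration matrix

spectral_radius = lambda T: max(abs(np.linalg.eigvals(T)))
print(spectral_radius(T_j))  # < 1, so Jacobi converges for this A
print(spectral_radius(T_g))  # < 1, so Gauss-Seidel converges as well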
Example 4. Use the Gauss-Seidel method to approximate the solution of the following system:
4x1 + x2 − x3 = 3
2x1 + 7x2 + x3 = 19
x1 − 3x2 + 12x3 = 31.
Continue the iterations until two successive approximations are identical when rounded to three signif-
icant digits.
Sol. To begin, write the system in the form
\[ x_1 = \frac{1}{4}(3 - x_2 + x_3), \]
\[ x_2 = \frac{1}{7}(19 - 2x_1 - x_3), \]
\[ x_3 = \frac{1}{12}(31 - x_1 + 3x_2). \]
As
\[ |a_{11}| = 4 > |a_{12}| + |a_{13}| = 2, \qquad |a_{22}| = 7 > |a_{21}| + |a_{23}| = 3, \qquad |a_{33}| = 12 > |a_{31}| + |a_{32}| = 4, \]
the coefficient matrix is strictly diagonally dominant. Therefore the Gauss-Seidel iterations will converge.
Starting with the initial vector $x^{(0)} = [0, 0, 0]^t$, the first approximation is
\[ x_1^{(1)} = 0.7500, \quad x_2^{(1)} = 2.5000, \quad x_3^{(1)} = 3.1458. \]
Similarly,
\[ x^{(2)} = [0.9115, 2.0045, 3.0085]^t, \quad x^{(3)} = [1.0010, 1.9985, 2.9995]^t, \quad x^{(4)} = [1.000, 2.000, 3.000]^t. \]
3.1. Convergence analysis of iterative methods.
Theorem 3.1 (Necessary and sufficient condition). A necessary and sufficient condition for the convergence of an iterative method $x^{(k+1)} = T x^{(k)} + B$ is that the spectral radius of the iteration matrix $T$ satisfies $\rho(T) < 1$.
Proof. Let $\rho(T) < 1$. The sequence of vectors $x^{(k)}$ produced by the iterative method (e.g., Gauss-Seidel) is given by
\[ x^{(1)} = T x^{(0)} + B, \]
\[ x^{(2)} = T x^{(1)} + B = T\left(T x^{(0)} + B\right) + B = T^2 x^{(0)} + (T + I) B, \]
\[ \vdots \]
\[ x^{(k)} = T^k x^{(0)} + \left(T^{k-1} + T^{k-2} + \cdots + T + I\right) B. \]
Since $\rho(T) < 1$,
\[ \lim_{k \to \infty} T^k x^{(0)} = 0. \]
Therefore
\[ \lim_{k \to \infty} x^{(k)} = (I - T)^{-1} B, \]

so $x^{(k)}$ converges to the unique solution of $x = T x + B$.

Conversely, assume that the sequence $x^{(k)}$ converges to $x$. Now
\[ x - x^{(k)} = T x + B - T x^{(k-1)} - B = T\left(x - x^{(k-1)}\right) = T^2\left(x - x^{(k-2)}\right) = \cdots = T^k\left(x - x^{(0)}\right). \]
Let $z = x - x^{(0)}$. Then
\[ \lim_{k \to \infty} T^k z = \lim_{k \to \infty} \left(x - x^{(k)}\right) = x - \lim_{k \to \infty} x^{(k)} = x - x = 0. \]
Since the initial vector $x^{(0)}$, and hence $z$, is arbitrary, this forces $\rho(T) < 1$.

Theorem 3.2. If $A$ is strictly diagonally dominant, then for any initial starting vector the Jacobi and Gauss-Seidel iterative methods applied to $Ax = b$ always converge.
Proof. We assume that $A$ is strictly diagonally dominant, hence $a_{ii} \neq 0$ and
\[ |a_{ii}| > \sum_{\substack{j=1 \\ j \neq i}}^{n} |a_{ij}|, \qquad i = 1, 2, \ldots, n. \]

The Jacobi iterations are given by
\[ x^{(k+1)} = -D^{-1}(L + U)\, x^{(k)} + D^{-1} b = T_j\, x^{(k)} + B. \]
The method is convergent if and only if $\rho(T_j) < 1$. Since $\rho(T_j) \le \|T_j\|_\infty$, it suffices to bound $\|T_j\|_\infty$. The entries of $T_j = -D^{-1}(L + U)$ are $-a_{ij}/a_{ii}$ for $j \neq i$ and $0$ for $j = i$, so
\[ \|T_j\|_\infty = \max_{1 \le i \le n} \frac{1}{|a_{ii}|} \sum_{\substack{j=1 \\ j \neq i}}^{n} |a_{ij}| < 1 \]
by strict diagonal dominance. This shows the convergence of the Jacobi method.

Next we prove the convergence of the Gauss-Seidel method. The Gauss-Seidel iterations are given by
\[ x^{(k+1)} = -(D + L)^{-1} U x^{(k)} + (D + L)^{-1} b = T_g\, x^{(k)} + B. \]
Let $\lambda$ be an eigenvalue of the iteration matrix $T_g$ and $x$ a corresponding eigenvector. Then
\[ T_g x = \lambda x \;\Longrightarrow\; -(D + L)^{-1} U x = \lambda x \;\Longrightarrow\; -U x = \lambda (D + L) x. \]
Componentwise, for $i = 1, 2, \ldots, n$,
\[ -\sum_{j=i+1}^{n} a_{ij} x_j = \lambda \sum_{j=1}^{i} a_{ij} x_j = \lambda a_{ii} x_i + \lambda \sum_{j=1}^{i-1} a_{ij} x_j, \]
so
\[ \lambda a_{ii} x_i = -\lambda \sum_{j=1}^{i-1} a_{ij} x_j - \sum_{j=i+1}^{n} a_{ij} x_j \]
and therefore
\[ |\lambda|\, |a_{ii}|\, |x_i| \le |\lambda| \sum_{j=1}^{i-1} |a_{ij}|\, |x_j| + \sum_{j=i+1}^{n} |a_{ij}|\, |x_j|. \]

Since $x$ is an eigenvector, $x \neq 0$, so we may normalize it so that $\|x\|_\infty = 1$. Choosing the index $i$ for which $|x_i| = \|x\|_\infty = 1$ (so that $|x_j| \le 1$ for every $j$), the inequality above gives
\[ |\lambda| \left( |a_{ii}| - \sum_{j=1}^{i-1} |a_{ij}| \right) \le \sum_{j=i+1}^{n} |a_{ij}|, \]
\[ \Longrightarrow\; |\lambda| \le \frac{\sum_{j=i+1}^{n} |a_{ij}|}{|a_{ii}| - \sum_{j=1}^{i-1} |a_{ij}|} < 1, \]
since strict diagonal dominance gives $|a_{ii}| - \sum_{j=1}^{i-1} |a_{ij}| > \sum_{j=i+1}^{n} |a_{ij}|$. Hence the spectral radius satisfies $\rho(T_g) < 1$, which implies that the Gauss-Seidel method is convergent.
Example 5. Given the matrix
\[ A = \begin{pmatrix} 1 & 2 & -2 \\ 1 & 1 & 1 \\ 2 & 2 & 1 \end{pmatrix}, \]
decide whether the Gauss-Seidel method converges to the solution of $Ax = b$.
Sol. The iteration matrix of the Gauss-Seidel method is
\[ T_g = -(D + L)^{-1} U = -\begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 2 & 2 & 1 \end{pmatrix}^{-1} \begin{pmatrix} 0 & 2 & -2 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix} = -\begin{pmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & -2 & 1 \end{pmatrix} \begin{pmatrix} 0 & 2 & -2 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix} = -\begin{pmatrix} 0 & 2 & -2 \\ 0 & -2 & 3 \\ 0 & 0 & -2 \end{pmatrix} = \begin{pmatrix} 0 & -2 & 2 \\ 0 & 2 & -3 \\ 0 & 0 & 2 \end{pmatrix}. \]
The eigenvalues of the iteration matrix $T_g$ are $\lambda = 0, 2, 2$, so the spectral radius is $2 > 1$. The iteration therefore diverges.

4. The SOR method


We observed that the convergence of an iterative technique depends on the spectral radius of the
matrix associated with the method. One way to select a procedure to accelerate convergence is to
choose a method whose associated matrix has minimal spectral radius. These techniques are known
as Successive Over-Relaxation (SOR). The SOR method is devised by applying extrapolation to the
Gauss-Seidel method. This extrapolation takes the form of a weighted average between the previous
iterate and the computed Gauss-Seidel iterate, applied successively to each component. Writing $x_i^{GS}$ for the value produced by a Gauss-Seidel step, we introduce a weight $\omega$ and modify the procedure for computing $x_i^{(k+1)}$ to
\[ x_i^{(k+1)} = x_i^{(k)} + \omega\left( x_i^{GS} - x_i^{(k)} \right) = (1 - \omega)\, x_i^{(k)} + \omega\, x_i^{GS}. \]
The last term is calculated by a Gauss-Seidel step, and we write
\[ x_i^{(k+1)} = (1 - \omega)\, x_i^{(k)} + \frac{\omega}{a_{ii}} \left[ b_i - \sum_{j=1}^{i-1} a_{ij} x_j^{(k+1)} - \sum_{j=i+1}^{n} a_{ij} x_j^{(k)} \right], \qquad i = 1, 2, \ldots, n. \]

The choice of relaxation factor ω is not necessarily easy, and depends upon the properties of the
coefficient matrix. If A is a symmetric and positive definite matrix and 0 < ω < 2, then the SOR
method converges for any choice of initial approximate vector x(0) .

Important Note: If a matrix $A$ is symmetric, it is positive definite if and only if all of its leading principal submatrices have positive determinants (i.e., all leading principal minors are positive).
Example 6. Consider the linear system $Ax = b$, where
\[ A = \begin{pmatrix} 3 & -1 & 1 \\ -1 & 3 & -1 \\ 1 & -1 & 3 \end{pmatrix}, \qquad b = \begin{pmatrix} -1 \\ 7 \\ -7 \end{pmatrix}. \]
a. Check that the SOR method with relaxation parameter $\omega = 1.25$ can be used to solve this system.
b. Compute the first iteration of the SOR method starting at the point $x^{(0)} = (0, 0, 0)^t$.
Sol. a. Let us verify the sufficient condition for using the SOR method: we have to check whether the matrix $A$ is symmetric and positive definite. $A$ is symmetric since $A = A^T$, so let us check positive definiteness:
 
\[ \det(3) = 3 > 0, \qquad \det\begin{pmatrix} 3 & -1 \\ -1 & 3 \end{pmatrix} = 8 > 0, \qquad \det(A) = 20 > 0. \]
All leading principal minors are positive, so the matrix $A$ is positive definite. We know that for symmetric positive definite matrices the SOR method converges for values of the relaxation parameter $\omega$ in the interval $0 < \omega < 2$. Therefore the SOR method with $\omega = 1.25$ can be used to solve this system.
b. The iterations of the SOR method are easier to compute componentwise than in vector form. Write the system as equations and write down the Gauss-Seidel iterations:
\[ x_1^{(k+1)} = \left(-1 + x_2^{(k)} - x_3^{(k)}\right)/3, \]
\[ x_2^{(k+1)} = \left(7 + x_1^{(k+1)} + x_3^{(k)}\right)/3, \]
\[ x_3^{(k+1)} = \left(-7 - x_1^{(k+1)} + x_2^{(k+1)}\right)/3. \]
Now multiply the right-hand side by the parameter $\omega$ and add to it the vector $x^{(k)}$ from the previous iteration multiplied by the factor $(1 - \omega)$:
\[ x_1^{(k+1)} = (1 - \omega)\, x_1^{(k)} + \omega\left(-1 + x_2^{(k)} - x_3^{(k)}\right)/3, \]
\[ x_2^{(k+1)} = (1 - \omega)\, x_2^{(k)} + \omega\left(7 + x_1^{(k+1)} + x_3^{(k)}\right)/3, \]
\[ x_3^{(k+1)} = (1 - \omega)\, x_3^{(k)} + \omega\left(-7 - x_1^{(k+1)} + x_2^{(k+1)}\right)/3. \]
For $k = 0$:
\[ x_1^{(1)} = (1 - 1.25)\cdot 0 + 1.25\,(-1 + 0 - 0)/3 = -0.41667, \]
\[ x_2^{(1)} = (1 - 1.25)\cdot 0 + 1.25\,(7 - 0.41667 + 0)/3 = 2.7431, \]
\[ x_3^{(1)} = (1 - 1.25)\cdot 0 + 1.25\,(-7 + 0.41667 + 2.7431)/3 = -1.6001. \]
The next three iterations are
\[ x^{(2)} = (1.4972, 2.1880, -2.2288)^t, \quad x^{(3)} = (1.0494, 1.8782, -2.0141)^t, \quad x^{(4)} = (0.9428, 2.0007, -1.9723)^t. \]
The exact solution is $x = (1, 2, -2)^t$.
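A compact way to reproduce these SOR iterations is the componentwise update derived above. The following Python/NumPy sketch (the helper name and loop structure are our own) applies it to this example with $\omega = 1.25$:

import numpy as np

def sor(A, b, omega, x0, num_iters):
    # componentwise SOR sweep: blend of the old value and the Gauss-Seidel value
    x = x0.astype(float).copy()
    n = len(b)
    for _ in range(num_iters):
        for i in range(n):
            s = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]
            gs_value = (b[i] - s) / A[i, i]
            x[i] = (1 - omega) * x[i] + omega * gs_value
    return x

A = np.array([[3.0, -1.0, 1.0],
              [-1.0, 3.0, -1.0],
              [1.0, -1.0, 3.0]])
b = np.array([-1.0, 7.0, -7.0])

print(sor(A, b, 1.25, np.zeros(3), 1))  # approx [-0.41667, 2.7431, -1.6001]
print(sor(A, b, 1.25, np.zeros(3), 4))  # approaching the exact solution (1, 2, -2)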

5. Error Bounds and Iterative Refinement


Definition 5.1. Suppose $\tilde{x} \in \mathbb{R}^n$ is an approximation to the solution of the linear system defined by $Ax = b$. The residual vector for $\tilde{x}$ with respect to this system is $r = b - A\tilde{x}$.
It seems intuitively reasonable that if $\tilde{x}$ is an approximation to the solution $x$ of $Ax = b$ and the residual vector $r = b - A\tilde{x}$ has the property that $\|r\|$ is small, then $\|x - \tilde{x}\|$ would be small as well. This is often the case, but certain systems, which occur frequently in practice, fail to have this property.

Example 7. The linear system $Ax = b$ given by
\[ \begin{pmatrix} 1 & 2 \\ 1.0001 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 3 \\ 3.0001 \end{pmatrix} \]
has the unique solution $x = (1, 1)^t$. Determine the residual vector for the poor approximation $\tilde{x} = (3, -0.0001)^t$.
Sol. We have
\[ r = b - A\tilde{x} = \begin{pmatrix} 3 \\ 3.0001 \end{pmatrix} - \begin{pmatrix} 1 & 2 \\ 1.0001 & 2 \end{pmatrix} \begin{pmatrix} 3 \\ -0.0001 \end{pmatrix} = \begin{pmatrix} 0.0002 \\ 0 \end{pmatrix}, \]
so $\|r\|_\infty = 0.0002$. Although the norm of the residual vector is small, the approximation $\tilde{x} = (3, -0.0001)^t$ is obviously quite poor; in fact, $\|x - \tilde{x}\|_\infty = 2$.
Theorem 5.2. Suppose that $\tilde{x}$ is an approximation to the solution of $Ax = b$, $A$ is a nonsingular matrix, and $r$ is the residual vector for $\tilde{x}$. Then for any natural norm,
\[ \|x - \tilde{x}\| \le \|r\| \cdot \|A^{-1}\|, \]
and if $x \neq 0$ and $b \neq 0$,
\[ \frac{\|x - \tilde{x}\|}{\|x\|} \le \|A\| \cdot \|A^{-1}\|\, \frac{\|r\|}{\|b\|}. \]
Proof. Since $r = b - A\tilde{x} = Ax - A\tilde{x}$ and $A$ is nonsingular, we have $x - \tilde{x} = A^{-1} r$, so
\[ \|x - \tilde{x}\| = \|A^{-1} r\| \le \|A^{-1}\| \cdot \|r\|. \]
Moreover, since $b = Ax$, we have $\|b\| \le \|A\| \cdot \|x\|$. So $1/\|x\| \le \|A\|/\|b\|$ and
\[ \frac{\|x - \tilde{x}\|}{\|x\|} \le \frac{\|A\| \cdot \|A^{-1}\|}{\|b\|}\, \|r\|. \]

Condition Numbers: The inequalities in the above theorem imply that $\|A^{-1}\|$ and $\|A\| \cdot \|A^{-1}\|$ provide an indication of the connection between the residual vector and the accuracy of the approximation. In general, the relative error $\|x - \tilde{x}\|/\|x\|$ is of most interest, and this error is bounded by the product of $\|A\| \cdot \|A^{-1}\|$ with the relative residual $\|r\|/\|b\|$. Any convenient norm can be used; the only requirement is that it be used consistently throughout.
Definition 5.3. The condition number of the nonsingular matrix $A$ relative to a norm $\|\cdot\|$ is
\[ K(A) = \|A\| \cdot \|A^{-1}\|. \]
With this notation, the inequalities in the above theorem become
\[ \|x - \tilde{x}\| \le K(A)\, \frac{\|r\|}{\|A\|} \quad \text{and} \quad \frac{\|x - \tilde{x}\|}{\|x\|} \le K(A)\, \frac{\|r\|}{\|b\|}. \]
For any nonsingular matrix $A$ and natural norm $\|\cdot\|$,
\[ 1 = \|I\| = \|A \cdot A^{-1}\| \le \|A\| \cdot \|A^{-1}\| = K(A). \]
A matrix $A$ is well-conditioned if $K(A)$ is close to 1, and ill-conditioned when $K(A)$ is significantly greater than 1. Conditioning in this context refers to the relative assurance that a small residual vector implies a correspondingly accurate approximate solution. When $K(A)$ is very large, the solution of $Ax = b$ will be very sensitive to relatively small changes in $b$; equivalently, a relatively small residual may well correspond to a relatively large error in $\tilde{x}$ compared with $x$. These comments are also valid when the changes are made to $A$ rather than to $b$.
Example 8. Determine the condition number for the matrix
\[ A = \begin{pmatrix} 1 & 2 \\ 1.0001 & 2 \end{pmatrix}. \]

Sol. We saw in the previous example that the very poor approximation $(3, -0.0001)^t$ to the exact solution $(1, 1)^t$ had a residual vector with small norm, so we should expect the condition number of $A$ to be large. We have $\|A\|_\infty = \max\{|1| + |2|,\; |1.0001| + |2|\} = 3.0001$, which would not be considered large. However,
\[ A^{-1} = \begin{pmatrix} -10000 & 10000 \\ 5000.5 & -5000 \end{pmatrix}, \quad \text{so} \quad \|A^{-1}\|_\infty = 20000, \]
and for the infinity norm, $K(A) = (20000)(3.0001) = 60002$. The size of the condition number for this example should certainly keep us from making hasty accuracy decisions based on the residual of an approximation.
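The computation is easy to verify numerically; a sketch (assuming Python/NumPy) for the matrix of Example 8:

import numpy as np

A = np.array([[1.0, 2.0],
              [1.0001, 2.0]])

norm_A = np.linalg.norm(A, np.inf)                      # 3.0001
norm_A_inv = np.linalg.norm(np.linalg.inv(A), np.inf)   # about 20000 (up to rounding)
K = norm_A * norm_A_inv
print(K)  # about 60002: A is ill-conditioned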
5.1. The Residual Correction Method. A further use of this error estimation procedure is to define an iterative method for improving the computed value $x$. Let $x^{(0)}$ be the initial computed value for $x$, generally obtained by using Gaussian elimination. Define
\[ r^{(0)} = b - A x^{(0)}. \]
Then the error $e^{(0)} = x - x^{(0)}$ satisfies
\[ A e^{(0)} = r^{(0)}. \]
Solving this system by Gaussian elimination, we obtain an approximate value of $e^{(0)}$. Using it, we define an improved approximation
\[ x^{(1)} = x^{(0)} + e^{(0)}. \]
Now we repeat the entire process, calculating
\[ r^{(1)} = b - A x^{(1)}, \qquad x^{(2)} = x^{(1)} + e^{(1)}, \]
where $e^{(1)}$ is the approximate solution of
\[ A e^{(1)} = r^{(1)}, \qquad e^{(1)} = x - x^{(1)}. \]
Continue this process until there is no further decrease in the size of the error vector.
For example, use a computer with four-digit floating-point decimal arithmetic with rounding, and
use Gaussian elimination with pivoting. The system to be solved is
x1 + 0.5x2 + 0.3333x3 = 1
0.5x1 + 0.3333x2 + 0.25x3 = 0
0.3333x1 + 0.25x2 + 0.2x3 = 0
Then
\[ x^{(0)} = [8.968, -35.77, 29.77]^t, \quad r^{(0)} = [-0.005341, -0.004359, -0.0005344]^t, \quad e^{(0)} = [0.09216, -0.5442, 0.5239]^t, \]
\[ x^{(1)} = [9.060, -36.31, 30.29]^t, \quad r^{(1)} = [-0.0006570, -0.0003770, -0.0001980]^t, \quad e^{(1)} = [0.001707, -0.01300, 0.01241]^t, \]
\[ x^{(2)} = [9.062, -36.32, 30.30]^t. \]
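The residual correction (iterative refinement) loop is short to write down. Below is a minimal sketch in Python/NumPy; note that, as an assumption of the sketch, the correction equation is solved with a library routine rather than hand-rolled Gaussian elimination, and full double precision is used instead of the four-digit arithmetic of the example, so the numbers will differ:

import numpy as np

def iterative_refinement(A, b, num_steps=3):
    x = np.linalg.solve(A, b)      # initial computed solution
    for _ in range(num_steps):
        r = b - A @ x              # residual of the current approximation
        e = np.linalg.solve(A, r)  # approximate error from A e = r
        x = x + e                  # corrected approximation
    return x

A = np.array([[1.0, 0.5, 0.3333],
              [0.5, 0.3333, 0.25],
              [0.3333, 0.25, 0.2]])
b = np.array([1.0, 0.0, 0.0])
print(iterative_refinement(A, b))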

6. Power method for approximating eigenvalues


The eigenvalues of an $n \times n$ matrix $A$ are obtained by solving its characteristic equation
\[ \det(A - \lambda I) = 0, \quad \text{i.e.,} \quad \lambda^n + c_{n-1}\lambda^{n-1} + c_{n-2}\lambda^{n-2} + \cdots + c_0 = 0. \]
For large values of n, the polynomial equations like this one are difficult, time-consuming to solve
and sensitive to rounding errors. In this section we look at an alternative method for approximating
eigenvalues. The method can be used only to find the eigenvalue of A that is largest in absolute value.
We call this eigenvalue the dominant eigenvalue of A.

Definition 6.1 (Dominant Eigenvalue and Dominant Eigenvector). Let λ1 , λ2 , · · · , and λn be the
eigenvalues of an n × n matrix A. λ1 is called the dominant eigenvalue of A if
|λ1 | > |λi |, i = 2, 3, . . . , n.
The eigenvectors corresponding to λ1 are called dominant eigenvectors of A.
6.1. The Power Method. The power method for approximating eigenvalues is iterative. First we assume that the matrix $A$ has a dominant eigenvalue with corresponding dominant eigenvectors. Then we choose an initial approximation $x^{(0)}$ to one of the dominant eigenvectors of $A$. This initial approximation must be a nonzero vector in $\mathbb{R}^n$. Finally, we form the sequence
\[ x^{(1)} = A x^{(0)}, \quad x^{(2)} = A x^{(1)} = A^2 x^{(0)}, \quad \ldots, \quad x^{(k)} = A x^{(k-1)} = A\left(A^{k-1} x^{(0)}\right) = A^k x^{(0)}. \]
For large $k$, and by properly scaling this sequence, we obtain a good approximation to the dominant eigenvector of $A$. This procedure is illustrated in the following example.
Example 9. Complete six iterations of the power method to approximate a dominant eigenvector of
\[ A = \begin{pmatrix} 2 & -12 \\ 1 & -5 \end{pmatrix}. \]
Sol. We begin with an initial nonzero approximation
\[ x^{(0)} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}. \]
We then obtain the following approximations by the power method:
\[ y^{(1)} = A x^{(0)} = \begin{pmatrix} 2 & -12 \\ 1 & -5 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} -10 \\ -4 \end{pmatrix} = -10 \begin{pmatrix} 1.00 \\ 0.40 \end{pmatrix} = -10\, x^{(1)}, \]
\[ y^{(2)} = A x^{(1)} = \begin{pmatrix} 2 & -12 \\ 1 & -5 \end{pmatrix} \begin{pmatrix} 1.00 \\ 0.40 \end{pmatrix} = \begin{pmatrix} -2.8 \\ -1 \end{pmatrix} = -2.8 \begin{pmatrix} 1.000 \\ 0.357 \end{pmatrix} = -2.8\, x^{(2)}, \]
\[ y^{(3)} = A x^{(2)} = \begin{pmatrix} 2 & -12 \\ 1 & -5 \end{pmatrix} \begin{pmatrix} 1.000 \\ 0.357 \end{pmatrix} = \begin{pmatrix} -2.284 \\ -0.785 \end{pmatrix} = -2.284 \begin{pmatrix} 1.000 \\ 0.3436 \end{pmatrix} = -2.284\, x^{(3)}. \]
After several iterations, we note that the scaling factors appear to approach the dominant eigenvalue $\lambda = -2$.
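A scaled power iteration like the one carried out by hand above can be written in a few lines. The following Python/NumPy sketch follows the scaling convention used in these notes (dividing by the entry of largest magnitude); the function name is our own:

import numpy as np

def power_method(A, x0, num_iters):
    x = x0.astype(float)
    mu = None
    for _ in range(num_iters):
        y = A @ x
        # scale factor: entry of largest magnitude (approximates the dominant eigenvalue)
        mu = y[np.argmax(np.abs(y))]
        x = y / mu
    return mu, x

A = np.array([[2.0, -12.0],
              [1.0, -5.0]])
mu, v = power_method(A, np.array([1.0, 1.0]), 6)
print(mu)  # approaches the dominant eigenvalue -2
print(v)   # approaches a dominant eigenvector, roughly (1, 1/3)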

Theorem 6.2. If $A$ is an $n \times n$ diagonalizable matrix with a dominant eigenvalue, then there exists a nonzero vector $x_0$ such that the sequence of vectors
\[ A x_0,\; A^2 x_0,\; A^3 x_0,\; \ldots,\; A^k x_0,\; \ldots \]
approaches a multiple of the dominant eigenvector of $A$.
Proof. Since $A$ is diagonalizable, it has $n$ linearly independent eigenvectors $x_1, x_2, \ldots, x_n$ with corresponding eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. We assume that these eigenvalues are ordered so that $\lambda_1$ is the dominant eigenvalue (with corresponding eigenvector $x_1$).
Because the $n$ eigenvectors $x_1, x_2, \ldots, x_n$ are linearly independent, they form a basis for $\mathbb{R}^n$. For the initial approximation $x_0$, we choose a nonzero vector such that the linear combination
\[ x_0 = c_1 x_1 + c_2 x_2 + \cdots + c_n x_n \]
has a nonzero leading coefficient $c_1$. (If $c_1 = 0$, the power method may not converge, and a different $x_0$ must be used as the initial approximation.)
Now, multiplying both sides of this equation by $A$ produces
\[ A x_0 = c_1 (A x_1) + c_2 (A x_2) + \cdots + c_n (A x_n), \]

\[ A x_0 = c_1 (\lambda_1 x_1) + c_2 (\lambda_2 x_2) + \cdots + c_n (\lambda_n x_n), \]
since $A x_i = \lambda_i x_i$ for each eigenvalue $\lambda_i$.
Repeated multiplication of both sides of this equation by $A$ produces
\[ A^k x_0 = c_1 \left(\lambda_1^k x_1\right) + c_2 \left(\lambda_2^k x_2\right) + \cdots + c_n \left(\lambda_n^k x_n\right), \]
which implies that
\[ A^k x_0 = \lambda_1^k \left[ c_1 x_1 + c_2 \left(\frac{\lambda_2}{\lambda_1}\right)^k x_2 + \cdots + c_n \left(\frac{\lambda_n}{\lambda_1}\right)^k x_n \right]. \]
Now, from our original assumption that $\lambda_1$ is larger in absolute value than the other eigenvalues, it follows that each of the fractions
\[ \left|\frac{\lambda_2}{\lambda_1}\right|,\; \left|\frac{\lambda_3}{\lambda_1}\right|,\; \ldots,\; \left|\frac{\lambda_n}{\lambda_1}\right| < 1. \]
Therefore each of the factors
\[ \left(\frac{\lambda_2}{\lambda_1}\right)^k,\; \left(\frac{\lambda_3}{\lambda_1}\right)^k,\; \ldots,\; \left(\frac{\lambda_n}{\lambda_1}\right)^k \]
must approach 0 as $k$ approaches infinity. This implies that the approximation
\[ A^k x_0 \approx \lambda_1^k c_1 x_1, \qquad c_1 \neq 0, \]
improves as $k$ increases. Since $x_1$ is a dominant eigenvector, any scalar multiple of $x_1$ is also a dominant eigenvector. Thus we have shown that $A^k x_0$ approaches a multiple of the dominant eigenvector of $A$.
Example 10. Calculate seven iterations of the power method with scaling to approximate a dominant eigenvector of the matrix
\[ A = \begin{pmatrix} 1 & 2 & 0 \\ -2 & 1 & 2 \\ 1 & 3 & 1 \end{pmatrix}. \]
Sol. Using $x^{(0)} = [1, 1, 1]^T$ as the initial approximation, we obtain
\[ y^{(1)} = A x^{(0)} = \begin{pmatrix} 1 & 2 & 0 \\ -2 & 1 & 2 \\ 1 & 3 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 3 \\ 1 \\ 5 \end{pmatrix}, \]
and by scaling we obtain the approximation
\[ x^{(1)} = \frac{1}{5} \begin{pmatrix} 3 \\ 1 \\ 5 \end{pmatrix} = \begin{pmatrix} 0.60 \\ 0.20 \\ 1.00 \end{pmatrix}. \]
Similarly we get
\[ y^{(2)} = A x^{(1)} = \begin{pmatrix} 1.00 \\ 1.00 \\ 2.20 \end{pmatrix} = 2.20 \begin{pmatrix} 0.45 \\ 0.45 \\ 1.00 \end{pmatrix} = 2.20\, x^{(2)}, \]
\[ y^{(3)} = A x^{(2)} = \begin{pmatrix} 1.35 \\ 1.55 \\ 2.80 \end{pmatrix} = 2.8 \begin{pmatrix} 0.48 \\ 0.55 \\ 1.00 \end{pmatrix} = 2.8\, x^{(3)}, \]
\[ y^{(4)} = A x^{(3)} = 3.1 \begin{pmatrix} 0.51 \\ 0.51 \\ 1.00 \end{pmatrix} = 3.1\, x^{(4)}, \]
etc.
After several iterations, we observe that the dominant eigenvector is approximately
\[ x = \begin{pmatrix} 0.50 \\ 0.50 \\ 1.00 \end{pmatrix}, \]
and the scaling factors approach the dominant eigenvalue $\lambda = 3$.

Remark 6.1. The power method is useful for computing eigenvalues, but it gives only the dominant eigenvalue. To find other eigenvalues we use properties of the matrix, such as the fact that the sum of all eigenvalues equals the trace of the matrix. Also, if $\lambda$ is an eigenvalue of $A$, then $\lambda^{-1}$ is an eigenvalue of $A^{-1}$. Hence the smallest (in magnitude) eigenvalue of $A$ corresponds to the dominant eigenvalue of $A^{-1}$.
6.2. Inverse Power Method. The inverse power method is a modification of the power method that gives faster convergence. It is used to determine the eigenvalue of $A$ that is closest to a specified number $\sigma$.
We consider $A - \sigma I$; its eigenvalues are $\lambda_1 - \sigma, \lambda_2 - \sigma, \ldots, \lambda_n - \sigma$, where $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the eigenvalues of $A$. The eigenvalues of $(A - \sigma I)^{-1}$ are then
\[ \frac{1}{\lambda_1 - \sigma},\; \frac{1}{\lambda_2 - \sigma},\; \ldots,\; \frac{1}{\lambda_n - \sigma}. \]
The eigenvalue of the original matrix $A$ that is closest to $\sigma$ corresponds to the eigenvalue of largest magnitude of the shifted and inverted matrix $(A - \sigma I)^{-1}$.
To find the eigenvalue closest to $\sigma$, we apply the power method to obtain the eigenvalue $\mu$ of $(A - \sigma I)^{-1}$, and then recover the corresponding eigenvalue $\lambda$ of the original problem from $\lambda = 1/\mu + \sigma$. This method is called shift-and-invert. In practice, instead of computing $y = (A - \sigma I)^{-1} x$ directly, we solve the linear system $(A - \sigma I) y = x$; we need not compute the inverse of the matrix.
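The shift-and-invert idea translates directly into code: each step solves a linear system with the fixed matrix $A - \sigma I$ rather than forming its inverse. A sketch in Python/NumPy (assumptions of the sketch: a fixed number of iterations and the same scaling convention as in the examples above):

import numpy as np

def inverse_power_method(A, sigma, x0, num_iters):
    x = x0.astype(float)
    M = A - sigma * np.eye(len(x0))
    mu = None
    for _ in range(num_iters):
        y = np.linalg.solve(M, x)     # y = (A - sigma I)^(-1) x without forming the inverse
        mu = y[np.argmax(np.abs(y))]  # estimate of the dominant eigenvalue of (A - sigma I)^(-1)
        x = y / mu
    return 1.0 / mu + sigma, x        # recover the eigenvalue of A closest to sigma

A = np.array([[-4.0, 14.0, 0.0],
              [-5.0, 13.0, 0.0],
              [-1.0, 0.0, 2.0]])
lam, v = inverse_power_method(A, 19.0 / 3.0, np.array([1.0, 1.0, 1.0]), 1)
print(lam)  # 6.1818 after one step, tending to the eigenvalue 6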
Example 11. Apply the inverse power method with $x^{(0)} = [1, 1, 1]^T$ to the matrix
\[ A = \begin{pmatrix} -4 & 14 & 0 \\ -5 & 13 & 0 \\ -1 & 0 & 2 \end{pmatrix} \]
with $\sigma = 19/3$.
Sol. For the inverse power method we consider
\[ A - \frac{19}{3} I = \begin{pmatrix} -\frac{31}{3} & 14 & 0 \\ -5 & \frac{20}{3} & 0 \\ -1 & 0 & -\frac{13}{3} \end{pmatrix}. \]
Starting with $x^{(0)} = [1, 1, 1]^T$, computing $y^{(1)} = (A - \sigma I)^{-1} x^{(0)}$ amounts to solving $(A - \sigma I) y^{(1)} = x^{(0)}$, that is,
\[ \begin{pmatrix} -\frac{31}{3} & 14 & 0 \\ -5 & \frac{20}{3} & 0 \\ -1 & 0 & -\frac{13}{3} \end{pmatrix} \begin{pmatrix} a \\ b \\ c \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}. \]
Solving the above system gives $a = -6.6$, $b = -4.8$, and $c = 1.2923$, so $y^{(1)} = (-6.6, -4.8, 1.2923)^T$. We normalize by taking $-6.6$ as the scale factor:
\[ x^{(1)} = \frac{1}{-6.6}\, y^{(1)} = (1, 0.7272, -0.1958)^T. \]
Therefore the first approximation to the eigenvalue of $A$ near $19/3$ is
\[ \lambda \approx \frac{1}{-6.6} + \frac{19}{3} = 6.1818. \]
Repeating the above procedure, we obtain the eigenvalue, which is 6.
Example 12. Find the eigenvalue of the matrix
\[ A = \begin{pmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{pmatrix} \]
nearest to 3 using the power method.
Sol. The eigenvalue of the matrix $A$ which is nearest to 3 is the smallest eigenvalue in magnitude of $A - 3I$. Hence it is the largest eigenvalue in magnitude of $(A - 3I)^{-1}$. Now
\[ A - 3I = \begin{pmatrix} -1 & -1 & 0 \\ -1 & -1 & -1 \\ 0 & -1 & -1 \end{pmatrix}, \qquad B = (A - 3I)^{-1} = \begin{pmatrix} 0 & -1 & 1 \\ -1 & 1 & -1 \\ 1 & -1 & 0 \end{pmatrix}. \]

Starting with
\[ x^{(0)} = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \]
we obtain
\[ y^{(1)} = B x^{(0)} = \begin{pmatrix} 0 & -1 & 1 \\ -1 & 1 & -1 \\ 1 & -1 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ -1 \\ 0 \end{pmatrix} = 1 \cdot x^{(1)}, \]
\[ y^{(2)} = B x^{(1)} = \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix} = 1 \cdot x^{(2)}, \]
\[ y^{(3)} = B x^{(2)} = \begin{pmatrix} 2 \\ -3 \\ 2 \end{pmatrix} = 3 \begin{pmatrix} 0.6667 \\ -1 \\ 0.6667 \end{pmatrix} = 3\, x^{(3)}, \]
\[ y^{(4)} = B x^{(3)} = \begin{pmatrix} 1.6667 \\ -2.3334 \\ 1.6667 \end{pmatrix} = 2.3334 \begin{pmatrix} 0.7143 \\ -1 \\ 0.7143 \end{pmatrix} = 2.3334\, x^{(4)}. \]
After six iterations, we obtain the dominant eigenvalue of the matrix $B$, which is approximately 2.4, with dominant eigenvector
\[ \begin{pmatrix} 0.7143 \\ -1 \\ 0.7143 \end{pmatrix}. \]
Now the eigenvalue of the matrix $A$ nearest to 3 is $3 \pm \frac{1}{2.4} = 3 \pm 0.42$, that is, $3.42$ or $2.58$. Since $2.58$ does not satisfy $\det(A - 2.58\, I) = 0$, the correct eigenvalue of $A$ nearest to 3 is $3.42$.
Although the power method worked well in these examples, we must say something about cases in which the power method may fail. There are basically three such cases:
1. Using the power method when $A$ is not diagonalizable. Recall that $A$ has $n$ linearly independent eigenvectors if and only if $A$ is diagonalizable. Of course, it is not easy to tell just by looking at $A$ whether it is diagonalizable.
2. Using the power method when $A$ does not have a dominant eigenvalue, or when the dominant eigenvalue is such that $|\lambda_1| = |\lambda_2|$ with $\lambda_1 \neq \lambda_2$.
3. If the entries of $A$ contain significant error, the powers $A^k$ will have significant roundoff error in their entries.

Exercises
(1) Find the $l_\infty$ and $l_2$ norms of the vectors:
a. $x = (3, -4, 0, \tfrac{3}{2})^t$,
b. $x = (\sin k, \cos k, 2^k)^t$ for a fixed positive integer $k$.
(2) Find the $l_\infty$ norm of the matrix
\[ \begin{pmatrix} 4 & -1 & 7 \\ -1 & 4 & 0 \\ -7 & 0 & 4 \end{pmatrix}. \]
(3) The following linear system $Ax = b$ has $x$ as the actual solution and $\bar{x}$ as an approximate solution. Compute $\|x - \bar{x}\|_\infty$ and $\|A\bar{x} - b\|_\infty$. Also compute $\|A\|_\infty$.
x1 + 2x2 + 3x3 = 1
2x1 + 3x2 + 4x3 = −1
3x1 + 4x2 + 6x3 = 2,
x = (0, −7, 5)t
x̄ = (−0.2, −7.5, 5.4)t .

(4) Find the first two iterations of Jacobi and Gauss-Seidel using x(0) = 0:
4.63x1 − 1.21x2 + 3.22x3 = 2.22
−3.07x1 + 5.48x2 + 2.11x3 = −3.17
1.26x1 + 3.11x2 + 4.57x3 = 5.11.
(5) The linear system
\[ x_1 - x_3 = 0.2, \]
\[ -\tfrac{1}{2} x_1 + x_2 - \tfrac{1}{4} x_3 = -1.425, \]
\[ x_1 - \tfrac{1}{2} x_2 + x_3 = 2 \]
has the solution $(0.9, -0.8, 0.7)^T$.
a. Is the coefficient matrix strictly diagonally dominant?
b. Compute the spectral radius of the Gauss-Seidel iteration matrix.
c. Perform four iterations of the Gauss-Seidel iterative method to approximate the solution.
d. What happens in part (c) when the first equation in the system is changed to x1 −2x3 = 0.2?
(6) Show that Gauss-Seidel method does not converge for the following system of equations
2x1 + 3x2 + x3 = −1
3x1 + 2x2 + 2x3 = 1
x1 + 2x2 + 2x3 = 1.
(7) Find the first two iterations of the SOR method with ω = 1.1 for the following linear systems,
using x(0) = 0 :
4x1 + x2 − x3 = 5
−x1 + 3x2 + x3 = −4
2x1 + 2x2 + 5x3 = 1.
(8) Compute the condition numbers of the following matrices relative to $\|\cdot\|_\infty$:
a. \[ \begin{pmatrix} 3.9 & 1.6 \\ 6.8 & 2.9 \end{pmatrix}, \] b. \[ \begin{pmatrix} 0.04 & 0.01 & -0.01 \\ 0.2 & 0.5 & -0.2 \\ 1 & 2 & 4 \end{pmatrix}. \]
(9) (i) Use Gaussian elimination and three-digit rounding arithmetic to approximate the solutions
to the following linear systems. (ii) Then use one iteration of iterative refinement to improve
the approximation, and compare the approximations to the actual solutions.
a.
0.03x1 + 58.9x2 = 59.2
5.31x1 − 6.10x2 = 47.0.
Actual solution (10, 1)t .
b.
3.3330x1 + 15920x2 + 10.333x3 = 7953
2.2220x1 + 16.710x2 + 9.6120x3 = 0.965
−1.5611x1 + 5.1792x2 − 1.6855x3 = 2.714.
Actual solution (1, 0.5, −1)t .
(10) The linear system $Ax = b$ given by
\[ \begin{pmatrix} 1 & 2 \\ 1.0001 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 3 \\ 3.0001 \end{pmatrix} \]
has solution $(1, 1)^t$. Use four-digit rounding arithmetic to find the solution of the perturbed system
\[ \begin{pmatrix} 1 & 2 \\ 1.000011 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 3.00001 \\ 3.00003 \end{pmatrix}. \]

Is matrix A ill-conditioned?
(11) Determine the largest eigenvalue and the corresponding eigenvector, correct to three decimals, of the matrix
\[ \begin{pmatrix} 1 & -1 & 0 \\ -2 & 4 & -2 \\ 0 & -1 & 2 \end{pmatrix} \]
using the power method with $x^{(0)} = (-1, 2, 1)^t$.
(12) Use the inverse power method to approximate the dominant eigenvalue of the matrix
\[ \begin{pmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{pmatrix} \]
until a tolerance of $10^{-2}$ is achieved, with $x^{(0)} = (1, -1, 2)^t$.
(13) Find the eigenvalue of the matrix
\[ \begin{pmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{pmatrix} \]
nearest to 3 using the inverse power method.

Bibliography
[Burden] Richard L. Burden, J. Douglas Faires and Annette Burden, “Numerical Analysis,” Cengage
Learning, 10th edition, 2015.
[Atkinson] K. Atkinson and W. Han, “Elementary Numerical Analysis,” John Wiley and Sons, 3rd
edition, 2004.

Appendix A. Algorithms
Algorithm (Gauss-Seidel):
(1) Input matrix $A = [a_{ij}]$, vector $b$, initial approximation $XO = x^{(0)}$, tolerance TOL, maximum number of iterations $N$.
(2) Set $k = 1$.
(3) While $k \le N$ do steps 4-7.
(4) For $i = 1, 2, \ldots, n$ set
\[ x_i = \frac{1}{a_{ii}} \left[ -\sum_{j=1}^{i-1} a_{ij} x_j - \sum_{j=i+1}^{n} a_{ij}\, XO_j + b_i \right]. \]
(5) If $\|x - XO\| < \text{TOL}$, then OUTPUT $(x_1, x_2, \ldots, x_n)$; STOP.
(6) Set $k = k + 1$.
(7) For $i = 1, 2, \ldots, n$ set $XO_i = x_i$.
(8) OUTPUT $(x_1, x_2, \ldots, x_n)$; STOP.
Algorithm (Power Method):
(1) Start.
(2) Define the matrix $A$ and an initial guess $x$.
(3) Calculate $y = Ax$.
(4) Find the element of $y$ largest in magnitude and assign it to $K$.
(5) Calculate the fresh value $x = (1/K)\, y$.
(6) If the change in $K$ from the previous iteration exceeds the error tolerance, go to step 3.
(7) Stop.
