Stationary Iterative Methods
Abstract
Stationary iterative methods for solving systems of linear equations are considered by some as out of date and out of favor, as compared to methods based on Krylov subspace iterations. However, these methods are still useful in many circumstances because they are easier to implement and, more importantly, can be used as pre-conditioners in combination with Krylov-subspace methods. In this note, we briefly introduce the fundamental ideas of stationary iterative methods.
1 Introduction
We consider solving a linear system of equations
Ax = b, (1)
where we will always assume, unless specified otherwise, that A is n by n and real, i.e., A ∈ ℝ^{n×n}, nonsingular, and the right-hand side (RHS) b ∈ ℝ^n is nonzero.
For any nonsingular matrix Q ∈ ℝ^{n×n}, one can rewrite the system into an equivalent form:
Ax = b  ⇔  x = M x + c,   (2)
where
M = I − Q^{-1}A,   c = Q^{-1}b.   (3)
Starting from an initial guess x_0, the fixed-point form (2) suggests the iteration
x_{k+1} = M x_k + c,   (5)
which is repeated until some stopping criterion is met. Methods of this form are called stationary because we do exactly the same thing at every iteration, which is to multiply the current iterate by M and add to it the vector c.
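As a minimal Matlab sketch of scheme (5) (the names x0, tol, and maxit are assumed inputs, not from the note), each step can be written in residual-correction form so that M is never formed explicitly:

    % One generic stationary iteration:  x <- x + Q \ (b - A*x),
    % algebraically the same as x <- M*x + c with M = I - inv(Q)*A, c = inv(Q)*b.
    x = x0;
    for k = 1:maxit
        r = b - A*x;                      % current residual
        if norm(r) < tol, break; end      % simple residual test
        x = x + Q \ r;                    % one stationary step
    end

Different choices of the splitting matrix Q give the different methods discussed below.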
Let x∗ be the solution to Ax = b or equivalently, let it satisfy
x∗ = M x∗ + c. (6)
Subtracting (6) from (5), the error e_k := x_k − x∗ satisfies
e_{k+1} = x_{k+1} − x∗ = M(x_k − x∗) = M e_k,   (7)
and hence, in any vector norm,
‖e_{k+1}‖ ≤ ‖M‖ ‖e_k‖,   (8)
where by convention the matrix norm is the one induced by the given vector norm.
Recall an induced matrix norm is defined by
‖M‖ := max_{x≠0} ‖Mx‖ / ‖x‖.
It is clear from (8) that after every iteration the error, as measured by the given norm, is reduced at least by the fixed factor ‖M‖ whenever ‖M‖ is less than one. Therefore, we have a sufficient condition for convergence.
Theorem 1. Let x∗ satisfy x∗ = M x∗ + c. If ‖M‖ < 1 for some induced matrix norm, then the stationary iterative method (5) converges from any initial guess x_0; that is,
lim_{k→∞} x_k = x∗.
2 Jacobi Method
Wikipedia, the free encyclopedia, offers a description of the Jacobi method [the excerpt is not reproduced here]. I would not call it a clear description (in fact, some statements are technically wrong), though it at least informs us that Jacobi was a German mathematician.
In the Jacobi method, Q is chosen as the diagonal matrix formed by the diagonal of A; that is, Q = D, where
D_{ij} = A_{ii} if i = j, and D_{ij} = 0 if i ≠ j.   (9)
Therefore, the Jacobi method can be written as the following scheme:
x_{k+1} = x_k + D^{-1}(b − A x_k).   (10)
The idea behind the Jacobi method is simple. At each iteration, one solves the i-th equation in Ax = b for a new value of x_i, the i-th variable in x, while fixing all the other variables at their values from the prior iteration.
In Matlab, (10) could be implemented as
r = b - A*x;  x = x + invd.*r;
where invd is the vector of reciprocals of the diagonal entries of A. A commonly used stopping criterion is
‖b − Ax‖ / (1 + ‖b‖) < ε,   (11)
for a prescribed tolerance ε > 0, where one is added to the norm of b in the denominator to guard against the case of an excessively small right-hand side b.
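Putting (10) and (11) together, a complete Jacobi loop might look as follows (a minimal sketch; tol and maxit are assumed names, and x0 is an arbitrary initial guess):

    % Jacobi iteration (10) with stopping test (11).
    invd = 1 ./ diag(A);                            % reciprocals of the diagonal of A
    x = x0;
    for k = 1:maxit
        r = b - A*x;
        if norm(r) < tol*(1 + norm(b)), break; end  % criterion (11)
        x = x + invd .* r;                          % the update (10)
    end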
When is the Jacobi method convergent? Let us inspect the ℓ_∞ norm (the maximum absolute row sum) of its corresponding iteration matrix M = I − D^{-1}A. Since the diagonal of M is zero, we have
‖M‖_∞ = max_{1≤i≤n} Σ_{j≠i} |a_{ij}| / |a_{ii}|,
so that
Σ_{j≠i} |a_{ij}| < |a_{ii}|,  i = 1, 2, ..., n   ⇒   ‖M‖_∞ < 1.   (12)
A matrix satisfying the left inequality in (12) is called row strictly diagonally dominant. Similarly, a matrix is called column strictly diagonally dominant if, for each column, the absolute value of the diagonal element is greater than the sum of the absolute values of all the off-diagonal elements in that column.
Theorem 2 below does not say that if a matrix is not strictly diagonally dominant, then the Jacobi method does not converge. It provides a sufficient, but not necessary, condition for convergence.
Now what if A is column but not row strictly diagonally dominant? In this case, we can measure the errors in the vector norm ‖·‖_D defined by
‖x‖_D := ‖Dx‖_1,
whose induced matrix norm is ‖M‖_D = ‖DMD^{-1}‖_1. Observe that
DMD^{-1} = D(I − D^{-1}A)D^{-1} = I − AD^{-1}.
Hence
‖M‖_D = ‖I − AD^{-1}‖_1 = max_{1≤j≤n} Σ_{i≠j} |a_{ij}| / |a_{jj}| < 1
whenever A is column strictly diagonally dominant, since the ℓ_1 matrix norm is the maximum absolute column sum. In summary, we have the following result.
Theorem 2. If A is either row or column strictly diagonally dominant, then the Jacobi method converges from any initial guess.
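The hypothesis of Theorem 2 is also easy to check numerically; the following Matlab fragment is an illustrative sketch, not part of the original note:

    % Test row and column strict diagonal dominance of A.
    d = abs(diag(A));
    offrow = sum(abs(A), 2) - d;    % row sums of off-diagonal moduli
    offcol = sum(abs(A), 1)' - d;   % column sums of off-diagonal moduli
    row_sdd = all(d > offrow);      % true if A is row strictly diagonally dominant
    col_sdd = all(d > offcol);      % true if A is column strictly diagonally dominant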
3 Fundamental Theorem of Convergence
We already mentioned that Theorem 1 is only a sufficient condition. The following
theorem gives a necessary and sufficient condition in terms of the spectral radius of
M , defined to be
ρ(M) = max{|λ| : det(M − λI) = 0},   (13)
the maximum modulus of the eigenvalues of M.
Theorem 3. The stationary iterative method (5) converges from any initial guess if
and only if
ρ(M ) < 1. (14)
The proof of Theorem 3 relies on the following fact: for any B ∈ ℝ^{n×n},
ρ(B) = inf { ‖B‖ : ‖·‖ is an induced matrix norm }.   (15)
Proof. Obviously, in any vector norm and for any eigen-pair (λ, x) of B,
|λ| ‖x‖ = ‖λx‖ = ‖Bx‖ ≤ ‖B‖ ‖x‖,
so that |λ| ≤ ‖B‖. Indeed, ρ(B) is a lower bound of ‖B‖ over all induced norms. We prove (15) by showing that for any ε > 0, there exists a nonsingular matrix S that defines an induced norm ‖·‖_S := ‖S(·)S^{-1}‖_1 such that ‖B‖_S ≤ ρ(B) + ε.
By the well-known Schur theorem, any square matrix is similar to an upper triangular matrix; namely, P B P^{-1} = D + U, where D is diagonal and U is strictly upper triangular. Furthermore, for a scalar t > 0, let T be the diagonal matrix with
T_{ii} = 1/t^i,   i = 1, 2, ..., n.
Then
T P B P^{-1} T^{-1} = D + T U T^{-1}.
The strictly upper triangular matrix Û := T U T^{-1} has elements
û_{ij} = u_{ij} t^{j−i} for j > i, and û_{ij} = 0 for j ≤ i,
where the nonzero elements can be made arbitrarily small by choosing t sufficiently small. Therefore, letting S = T P, we have shown that
S B S^{-1} = D + Û,   ‖Û‖_1 ≤ ε,
and consequently ‖B‖_S = ‖S B S^{-1}‖_1 ≤ ‖D‖_1 + ‖Û‖_1 ≤ ρ(B) + ε, since the diagonal of D consists of the eigenvalues of B.
Now we prove Theorem 3. Since ρ(M) < 1 implies ‖M‖ < 1 for some induced norm ‖·‖, the sufficiency of condition (14) for convergence follows directly from Theorem 1. For necessity, let us assume that there exists an eigen-pair (λ, d) of M such that M d = λd and |λ| ≥ 1. Let x_0 = x∗ + d, where x∗ is the solution, so that e_0 = x_0 − x∗ = d. Then
e_k = M^k e_0 = λ^k d,   ‖e_k‖ = |λ|^k ‖d‖ ≥ ‖d‖ > 0 for all k,
implying non-convergence. This establishes the necessity of condition (14) for convergence from any initial point.
However, it should be clear from the proof that convergence from some initial
guesses is still possible even when ρ(M ) ≥ 1.
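For small examples, condition (14) can be verified directly; a minimal Matlab sketch, assuming for illustration the Jacobi splitting Q = D:

    % Spectral radius of the Jacobi iteration matrix M = I - D^{-1} A.
    n = size(A, 1);
    M = eye(n) - diag(1 ./ diag(A)) * A;
    rho = max(abs(eig(M)));         % the method converges for every x0 iff rho < 1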
4 Gauss-Seidel Method
Another popular stationary iterative method is the Gauss-Seidel (GS) method, where Q is chosen to be the lower triangular part, including the diagonal, of A. Partition A into three parts:
A = D − L − U,
where D is the diagonal and −L (−U) is the strictly lower (upper) triangular part of A. Then for the GS method,
Q = D − L.
Both the Jacobi and the GS methods solve one equation (the i-th) for one variable (the i-th) at a time. The difference is that while the Jacobi method fixes the other variables at their prior-iteration values, the GS method immediately uses new values once they become available. Therefore, the GS method generally converges faster.
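A minimal Gauss-Seidel sketch in Matlab, written in the same residual-correction form as the Jacobi loop above (tol and maxit assumed):

    % Gauss-Seidel sweep: solve (D - L) x_new = U x + b, i.e.
    % x <- x + (D - L)^{-1} (b - A*x).
    Q = tril(A);                    % D - L: lower triangle of A, diagonal included
    x = x0;
    for k = 1:maxit
        r = b - A*x;
        if norm(r) < tol*(1 + norm(b)), break; end
        x = x + Q \ r;              % triangular solve (forward substitution)
    end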
Like the Jacobi method, the GS method has guaranteed convergence for strictly
diagonally dominant matrices.
Theorem 4. If A is either row or column strictly diagonally dominant, then the Gauss-Seidel method converges from any initial guess.
We sketch the proof for the row-dominant case (the column case is Exercise 2). We will show ρ(M) < 1. Let (λ, x) be an eigen-pair of M, where x is scaled so that |x_j| ≤ 1 for j = 1, 2, ..., n and |x_i| = 1 for some index i. From the definition of M,
M x = λx   ⇒   λ(Dx − Lx) = U x.
Taking moduli on both sides of the i-th component equation and using |x_i| = 1 and |x_j| ≤ 1, we obtain
|λ| ( |a_{ii}| − Σ_{j<i} |a_{ij}| ) ≤ Σ_{j>i} |a_{ij}|,
which, by row strict diagonal dominance, implies |λ| < 1. Hence ρ(M) < 1.
Unlike the Jacobi method, the Gauss-Seidel method has guaranteed convergence
for another class of matrices.
Theorem 5. The Gauss-Seidel method converges from any initial guess if A is symmetric positive definite.
A sketch of the proof: let (λ, x) be an eigen-pair of M (note λ ≠ 1 since A is nonsingular). Using Q + Q∗ − A = D, which holds for the GS splitting of a symmetric A, one can derive (Exercise 3)
x∗Dx / (x∗Ax) = x∗(Q + Q∗ − A)x / (x∗Ax) = (1 − |λ|^2) / |1 − λ|^2 > 0.
Since A is symmetric positive definite, x∗Ax > 0 and x∗Dx > 0, which forces |λ| < 1 and hence ρ(M) < 1.
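As a quick numerical illustration of Theorem 5 (a sketch on an arbitrary symmetric positive definite test matrix, not an example from the note):

    % Build an SPD test matrix and compute the GS iteration matrix.
    n = 50;
    B = randn(n);
    A = B'*B + n*eye(n);            % symmetric positive definite by construction
    M = tril(A) \ (tril(A) - A);    % (D - L)^{-1} U, since tril(A) - A = U
    rho = max(abs(eig(M)))          % observed to satisfy rho < 1, as Theorem 5 predicts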
5 SOR Method
SOR stands for Successive Over-Relaxation. It is an extension of the GS method. For SOR, with a relaxation parameter ω > 0, the diagonal is split into two parts and distributed to both the left- and right-hand sides. That is,
Q = (1/ω) D − L,   Q − A = U − (1 − 1/ω) D.
Therefore,
M = ( (1/ω) D − L )^{-1} ( U − (1 − 1/ω) D ),
or equivalently,
M(ω) = (D − ωL)^{-1} (ωU + (1 − ω) D).   (18)
Theorem 6. The SOR method converges from any initial guess if A is symmetric positive definite and ω ∈ (0, 2).
The proof follows from a similar argument as for the GS method and is left as
an exercise. In addition, the condition ω ∈ (0, 2) is always necessary for convergence
from any initial guess.
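For completeness, a minimal SOR sketch in Matlab following the splitting above (omega, tol, and maxit are assumed names; omega = 1 recovers the GS method):

    % SOR sweep with Q = (1/omega) D - L, i.e.
    % x <- x + ((1/omega) D - L)^{-1} (b - A*x).
    D = diag(diag(A));
    Q = D/omega + tril(A, -1);      % (1/omega) D - L, since tril(A,-1) = -L
    x = x0;
    for k = 1:maxit
        r = b - A*x;
        if norm(r) < tol*(1 + norm(b)), break; end
        x = x + Q \ r;
    end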
Exercises
1. Prove that for any nonsingular matrix S ∈ ℝ^{n×n}, ‖S(·)S^{-1}‖_p is an induced matrix norm in ℝ^{n×n} for p ≥ 1 (where ‖·‖_p is the matrix norm induced by the vector p-norm).
2. Prove the convergence of the GS method for column strictly diagonally dominant matrices.
3. Prove Theorem 5 in detail following the given sketch.
4. Prove Theorem 6.
5. Prove that a necessary condition for SOR to converge is ω ∈ (0, 2). (Hint: first show det(M(ω)) = (1 − ω)^n.)