Ronel N. Dadula, MSIT - A
A Research in Numerical Methods: The Conjugate Gradient Method
The conjugate gradient method is an algorithm for the numerical solution of systems of linear equations of the form

Ax = b,

where the n-by-n matrix A is symmetric (i.e., A^T = A), positive definite (i.e., x^T A x > 0 for all non-zero vectors x in R^n), and real.
We say that two non-zero vectors u and v are conjugate (with respect to A) if

u^T A v = 0.

Since A is symmetric and positive definite, the left-hand side defines an inner product

<u, v>_A := u^T A v.
Two vectors are conjugate if they are orthogonal with respect to this inner product.
Being conjugate is a symmetric relation: if u is conjugate to v, then v is conjugate to u.
(Note: This notion of conjugate is not related to the notion of complex conjugate.)
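To make the definition concrete, here is a minimal sketch (assuming NumPy is available; the helper names inner_A and are_conjugate are introduced here for illustration and do not come from the original text) that evaluates the A-inner product and tests conjugacy for a symmetric positive definite A:

    import numpy as np

    def inner_A(u, v, A):
        # A-inner product <u, v>_A = u^T A v (A assumed symmetric positive definite).
        return u @ A @ v

    def are_conjugate(u, v, A, tol=1e-10):
        # u and v are conjugate with respect to A when u^T A v = 0 (up to round-off).
        return abs(inner_A(u, v, A)) < tol

Because A = A^T, inner_A(u, v, A) equals inner_A(v, u, A), which is exactly the symmetry of the conjugacy relation noted above.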
Suppose that {p_k} is a sequence of n mutually conjugate directions. Then the p_k form a basis of R^n, so we can expand the solution x* of Ax = b in this basis:

x* = α_1 p_1 + α_2 p_2 + ... + α_n p_n.

Left-multiplying by p_k^T A and using the conjugacy of the directions (p_k^T A p_i = 0 for i ≠ k) gives p_k^T b = p_k^T A x* = α_k p_k^T A p_k, so that

α_k = (p_k^T b) / (p_k^T A p_k).

This result is perhaps most transparent by considering the inner product defined above.
This gives the following method for solving the equation Ax = b: find a sequence of n conjugate directions, and then compute the coefficients α_k.
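As an illustration of this direct method, the sketch below (assuming NumPy; the function name conjugate_directions_solve and the choice of starting from the standard basis are assumptions made here) builds n mutually conjugate directions by Gram-Schmidt orthogonalization in the A-inner product and then forms x* from the coefficients α_k:

    import numpy as np

    def conjugate_directions_solve(A, b):
        # Build n mutually A-conjugate directions by Gram-Schmidt
        # orthogonalization (in the A-inner product) of the standard basis.
        n = len(b)
        directions = []
        for e in np.eye(n):
            p = e.copy()
            for q in directions:
                p -= (q @ A @ e) / (q @ A @ q) * q   # remove the A-component along q
            directions.append(p)
        # Expand x* = sum_k alpha_k p_k with alpha_k = p_k^T b / (p_k^T A p_k).
        x = np.zeros(n)
        for p in directions:
            x += (p @ b) / (p @ A @ p) * p
        return x

For a small symmetric positive definite A this agrees with np.linalg.solve(A, b); the point of the conjugate gradient method, discussed next, is that suitable directions can instead be generated one at a time.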
If we choose the conjugate vectors p_k carefully, then we may not need all of them to obtain a good approximation to the solution x*. So, we want to regard the conjugate gradient method as an iterative method. This also allows us to solve systems where n is so large that the direct method would take too much time.
We denote the initial guess for x* by x_0. We can assume without loss of generality that x_0 = 0 (otherwise, consider the system Az = b − Ax_0 instead). Starting with x_0, we search for the solution, and in each iteration we need a metric to tell us whether we are closer to the (unknown) solution x*. This metric comes from the fact that x* is also the unique minimizer of the quadratic function

f(x) = (1/2) x^T A x − x^T b,  for x in R^n;

so if f(x) becomes smaller in an iteration, it means that we are closer to x*.
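A minimal sketch of this merit function and its gradient (assuming NumPy; the helper names f and grad_f are introduced here for illustration):

    import numpy as np

    def f(x, A, b):
        # Quadratic function whose unique minimizer is the solution of Ax = b.
        return 0.5 * x @ A @ x - b @ x

    def grad_f(x, A, b):
        # Gradient of f; it equals Ax - b and vanishes exactly at x*.
        return A @ x - b

The gradient grad_f(x, A, b) is the negative of the residual b − Ax, which is the quantity the method works with below.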
This suggests taking the first basis vector p_1 to be the negative of the gradient of f at x = x_0. This gradient equals Ax_0 − b. Since x_0 = 0, this means we take p_1 = b. The other vectors in the basis will be conjugate to the gradient, hence the name conjugate gradient method.
We insist that the search directions p_k be conjugate to each other; this conjugacy constraint is what improves convergence over plain gradient descent. Following the direction p_{k+1}, the next optimal location is given by

x_{k+1} = x_k + α_{k+1} p_{k+1},

with

α_{k+1} = (p_{k+1}^T b) / (p_{k+1}^T A p_{k+1}) = (p_{k+1}^T (b − A x_k)) / (p_{k+1}^T A p_{k+1}),

where the last equality holds because p_{k+1} and x_k are conjugate (x_k is a combination of the earlier directions p_1, ..., p_k, each of which is conjugate to p_{k+1}).
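Putting these pieces together, a sketch of the resulting iteration (assuming NumPy; the function name conjugate_gradient, the stopping tolerance, and the iteration cap are choices made here, not taken from the original text) is:

    import numpy as np

    def conjugate_gradient(A, b, x0=None, tol=1e-10, max_iter=None):
        # Unpreconditioned conjugate gradient for a symmetric positive definite A.
        n = len(b)
        x = np.zeros(n) if x0 is None else np.asarray(x0, dtype=float).copy()
        r = b - A @ x                      # residual = negative gradient of f at x
        p = r.copy()                       # first search direction
        rs_old = r @ r
        for _ in range(max_iter or n):
            Ap = A @ p
            alpha = rs_old / (p @ Ap)      # step length along p
            x = x + alpha * p
            r = r - alpha * Ap             # updated residual
            rs_new = r @ r
            if np.sqrt(rs_new) < tol:      # stop once the residual is small enough
                break
            p = r + (rs_new / rs_old) * p  # next direction, conjugate to the previous ones
            rs_old = rs_new
        return x

In exact arithmetic this terminates in at most n iterations, which is exactly what the numerical example below illustrates for n = 2.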
Numerical example
To illustrate the method, we will perform two steps of the conjugate gradient method on a small (2-by-2) symmetric positive definite system Ax = b, beginning with an initial guess x_0; a complete two-step computation in code is sketched at the end of this example.
Our first step is to calculate the residual vector r_0 associated with x_0, computed from the formula r_0 = b − Ax_0.
Since this is the first iteration, we will use the residual vector r_0 as our initial search direction p_0; the method of selecting p_k will change in further iterations.
We then compute the step length α_0 = (r_0^T r_0) / (p_0^T A p_0) and take the step x_1 = x_0 + α_0 p_0. This completes the first iteration, the result being an "improved" approximate solution to the system, x_1. We may now move on and compute the next residual vector r_1 using the formula r_1 = r_0 − α_0 A p_0.
Our next step in the process is to compute the scalar β_0 = (r_1^T r_1) / (r_0^T r_0), which will eventually be used to determine the next search direction p_1. Now, using this scalar β_0, we can compute the next search direction p_1 using the relationship p_1 = r_1 + β_0 p_0.
We now compute the scalar α_1 from the newly acquired p_1, using the same formula as for α_0: α_1 = (r_1^T r_1) / (p_1^T A p_1). Finally, we find x_2 using the same update as that used to find x_1, namely x_2 = x_1 + α_1 p_1.
The result, x_2, is a "better" approximation to the system's solution than x_1 and x_0. If exact arithmetic were used in this example instead of limited-precision arithmetic, the exact solution would theoretically be reached after n = 2 iterations (n being the order of the system).
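The following sketch carries out these two steps end to end (assuming NumPy; the particular matrix A, right-hand side b, and initial guess x_0 are illustrative values chosen here and are not necessarily those of the original worked example):

    import numpy as np

    # Illustrative 2-by-2 symmetric positive definite system (values assumed here).
    A = np.array([[4.0, 1.0],
                  [1.0, 3.0]])
    b = np.array([1.0, 2.0])
    x = np.array([2.0, 1.0])                 # initial guess x_0

    r = b - A @ x                            # r_0 = b - A x_0
    p = r.copy()                             # p_0 = r_0 (first search direction)

    for k in range(2):                       # two iterations suffice in exact arithmetic (n = 2)
        Ap = A @ p
        alpha = (r @ r) / (p @ Ap)           # alpha_k = r_k^T r_k / (p_k^T A p_k)
        x = x + alpha * p                    # x_{k+1} = x_k + alpha_k p_k
        r_new = r - alpha * Ap               # r_{k+1} = r_k - alpha_k A p_k
        beta = (r_new @ r_new) / (r @ r)     # beta_k = r_{k+1}^T r_{k+1} / (r_k^T r_k)
        p = r_new + beta * p                 # p_{k+1} = r_{k+1} + beta_k p_k
        r = r_new
        print(f"x_{k + 1} =", x)

    print("exact solution:", np.linalg.solve(A, b))

With these assumed values the second iterate agrees with np.linalg.solve(A, b) up to round-off, matching the statement that the exact solution is reached after n = 2 iterations.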