Ronel N. Dadula, MSIT - A. Research in Numerical Method: Conjugate Gradient Method

The conjugate gradient method is an iterative method for solving systems of linear equations involving a symmetric, positive-definite matrix. It works by iteratively computing search directions that are conjugate to each other. In each iteration, it finds an optimal step size along the search direction to minimize a quadratic function related to the residual. The method resembles Gram-Schmidt orthonormalization as it enforces conjugacy of the search directions. A numerical example demonstrates two iterations of the method on a simple 2x2 system.


Ronel N. Dadula
MSIT - A

Research in Numerical Method: Conjugate Gradient Method

Conjugate gradient method

Description of the method

Suppose we want to solve the following system of linear equations

Ax = b

where the n-by-n matrix A is symmetric (i.e., A^T = A), positive definite (i.e., x^T A x > 0 for
all non-zero vectors x in R^n), and real.

We denote the unique solution of this system by x*.

The conjugate gradient method as a direct method

We say that two non-zero vectors u and v are conjugate (with respect to A) if

u^T A v = 0.

Since A is symmetric and positive definite, the left-hand side defines an inner product

<u, v>_A := u^T A v.

Two vectors are conjugate if they are orthogonal with respect to this inner product.
Being conjugate is a symmetric relation: if u is conjugate to v, then v is conjugate to u.
(Note: This notion of conjugate is not related to the notion of complex conjugate.)
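As a quick illustration, conjugacy can be checked numerically. The snippet below is a minimal sketch assuming NumPy; the matrix and vectors are arbitrary choices made for illustration, not values taken from the text.

import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])   # an arbitrary symmetric positive-definite matrix
u = np.array([1.0, 0.0])
v = np.array([1.0, -2.0])    # chosen so that u^T A v = 2*1 + 1*(-2) = 0

print(u @ A @ v)             # 0.0: u and v are conjugate with respect to A
print(u @ v)                 # 1.0: yet they are not orthogonal in the ordinary sense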

Suppose that {p_k} is a sequence of n mutually conjugate directions. Then the p_k form a
basis of R^n, so we can expand the solution x* of Ax = b in this basis:

x* = α_1 p_1 + α_2 p_2 + ... + α_n p_n.

The coefficients are given by

p_k^T b = p_k^T A x* = Σ_i α_i p_k^T A p_i = α_k p_k^T A p_k

(because the p_i are mutually conjugate, every term with i ≠ k vanishes), which gives

α_k = (p_k^T b) / (p_k^T A p_k).

This result is perhaps most transparent by considering the inner product defined above.

This gives the following method for solving the equation Ax = b: find a sequence of n
conjugate directions, and then compute the coefficients αk.
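To make this direct view concrete, here is a minimal sketch in Python, assuming NumPy and assuming that a full set of mutually conjugate directions is already available; solve_with_conjugate_directions is a hypothetical helper name, not part of any library.

import numpy as np

def solve_with_conjugate_directions(A, b, directions):
    # Direct view of the method: expand x* in a basis of mutually A-conjugate
    # directions p_k, using the coefficients alpha_k = (p_k^T b) / (p_k^T A p_k).
    x = np.zeros_like(b, dtype=float)
    for p in directions:
        alpha = (p @ b) / (p @ A @ p)
        x = x + alpha * p
    return x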

The conjugate gradient method as an iterative method

If we choose the conjugate vectors pk carefully, then we may not need all of them to
obtain a good approximation to the solution x*. So, we want to regard the conjugate
gradient method as an iterative method. This also allows us to solve systems where n is
so large that the direct method would take too much time.

We denote the initial guess for x* by x0. We can assume without loss of generality that
x0 = 0 (otherwise, consider the system Az = b − Ax0 instead). Starting with x0 we search
for the solution and in each iteration we need a metric to tell us whether we are closer to
the solution x* (that is unknown to us). This metric comes from the fact that the solution
x* is also the unique minimizer of the following quadratic function

f(x) = (1/2) x^T A x − b^T x;

so if f(x) becomes
smaller in an iteration it means that we are closer to x*.

This suggests taking the first basis vector p1 to be the negative of the gradient of f at x =
x0. This gradient equals Ax0−b. Since x0 = 0, this means we take p1 = b. The other
vectors in the basis will be conjugate to the gradient, hence the name conjugate
gradient method.
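As a small sketch of this (assuming NumPy; f and grad_f are hypothetical helper names), the quadratic and its gradient can be written down directly, which also shows why the first search direction is simply b when x0 = 0:

import numpy as np

def f(x, A, b):
    # the quadratic whose unique minimizer (for symmetric positive-definite A)
    # is the solution of A x = b
    return 0.5 * x @ A @ x - b @ x

def grad_f(x, A, b):
    # gradient of f; at the initial guess x0 = 0 the negative gradient is b,
    # which is why the first search direction is taken to be b
    return A @ x - b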

Let r_k be the residual at the kth step:

r_k = b − A x_k.

Note that r_k is the negative gradient of f at x = x_k, so the gradient descent method would
be to move in the direction r_k. Here, we insist that the directions p_k be conjugate to each
other. We also require that the next search direction be built out of the current residual and
all previous search directions, which is reasonable enough in practice.

The conjugation constraint is an orthonormality-type constraint, and hence the algorithm
bears a resemblance to Gram-Schmidt orthonormalization.
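For intuition, here is a minimal sketch of that resemblance, assuming NumPy: a modified Gram-Schmidt process carried out in the A-inner product, which turns any linearly independent set of vectors into mutually conjugate directions. The function name a_conjugate_directions is a hypothetical choice for illustration.

import numpy as np

def a_conjugate_directions(A, vectors):
    # Modified Gram-Schmidt in the A-inner product <u, v>_A = u^T A v:
    # each vector is made A-orthogonal (i.e., conjugate) to all directions
    # accepted so far. Input vectors are assumed linearly independent.
    directions = []
    for v in vectors:
        p = np.array(v, dtype=float)
        for q in directions:
            p = p - (q @ A @ p) / (q @ A @ q) * q
        directions.append(p)
    return directions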

This gives the following expression for the next search direction:

p_{k+1} = r_k − Σ_{i ≤ k} ((p_i^T A r_k) / (p_i^T A p_i)) p_i.

Following this direction, the next optimal location is given by

x_{k+1} = x_k + α_{k+1} p_{k+1}

with

α_{k+1} = (p_{k+1}^T (b − A x_k)) / (p_{k+1}^T A p_{k+1}) = (p_{k+1}^T b) / (p_{k+1}^T A p_{k+1}),

where the last equality holds because p_{k+1} and x_k are conjugate.
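Putting these pieces together, here is a minimal sketch of the iteration in Python, assuming NumPy. It uses the standard simplifications of the formulas above, in which only the most recent residual and direction are kept and the coefficients reduce to α_k = (r_k^T r_k) / (p_k^T A p_k) and β_k = (r_{k+1}^T r_{k+1}) / (r_k^T r_k); conjugate_gradient is a hypothetical helper name, not a library routine.

import numpy as np

def conjugate_gradient(A, b, x0=None, tol=1e-10, max_iter=None):
    # Iterative conjugate gradient for a symmetric positive-definite matrix A.
    n = len(b)
    x = np.zeros(n) if x0 is None else np.array(x0, dtype=float)
    r = b - A @ x                      # residual r_k = b - A x_k (negative gradient of f)
    p = r.copy()                       # first search direction
    rs_old = r @ r
    for _ in range(max_iter or n):
        Ap = A @ p
        alpha = rs_old / (p @ Ap)      # optimal step length along p
        x = x + alpha * p
        r = r - alpha * Ap             # cheap residual update, equivalent to b - A x
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:      # stop once the residual is small enough
            break
        p = r + (rs_new / rs_old) * p  # next direction, conjugate to the previous ones
        rs_old = rs_new
    return x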

Numerical example

To illustrate the conjugate gradient method, we will complete a simple example.

Considering the linear system Ax = b given by

we will perform two steps of the conjugate gradient method beginning with the initial
guess

in order to find an approximate solution to the system.


Solution

Our first step is to calculate the residual vector r0 associated with x0. This residual is
computed from the formula r0 = b - Ax0, and in our case is equal to

Since this is the first iteration, we will use the residual vector r0 as our initial search
direction p0; the method of selecting pk will change in further iterations.

We now compute the scalar α_0 using the relationship

α_0 = (r_0^T r_0) / (p_0^T A p_0).

We can now compute x_1 using the formula

x_1 = x_0 + α_0 p_0.

This result completes the first iteration, the result being an "improved" approximate
solution to the system, x_1. We may now move on and compute the next residual vector
r_1 using the formula

r_1 = r_0 − α_0 A p_0.

Our next step in the process is to compute the scalar β_0 that will eventually be used to
determine the next search direction p_1:

β_0 = (r_1^T r_1) / (r_0^T r_0).

Now, using this scalar β_0, we can compute the next search direction p_1 using the
relationship

p_1 = r_1 + β_0 p_0.

We now compute the scalar α_1 using our newly acquired p_1 by the same method as
that used for α_0:

α_1 = (r_1^T r_1) / (p_1^T A p_1).

Finally, we find x_2 using the same method as that used to find x_1:

x_2 = x_1 + α_1 p_1.

The result, x_2, is a "better" approximation to the system's solution than x_1 and x_0. If
exact arithmetic were used in this example instead of limited-precision arithmetic, the
exact solution would be reached after n = 2 iterations (n being the order of the system).
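The same two steps can be carried out in a short script. This is a sketch assuming NumPy; the matrix A, right-hand side b, and initial guess x0 below are assumptions chosen for illustration (any small symmetric positive-definite system would do), so the printed numbers illustrate the procedure rather than reproduce the example above.

import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])            # assumed symmetric positive-definite matrix
b = np.array([1.0, 2.0])              # assumed right-hand side
x = np.array([2.0, 1.0])              # assumed initial guess x0

r = b - A @ x                         # r0 = b - A x0
p = r.copy()                          # p0 = r0

for k in range(2):                    # two iterations, enough for an n = 2 system in exact arithmetic
    Ap = A @ p
    alpha = (r @ r) / (p @ Ap)        # alpha_k
    x = x + alpha * p                 # x_{k+1} = x_k + alpha_k p_k
    r_new = r - alpha * Ap            # r_{k+1} = r_k - alpha_k A p_k
    beta = (r_new @ r_new) / (r @ r)  # beta_k
    p = r_new + beta * p              # p_{k+1} = r_{k+1} + beta_k p_k
    r = r_new
    print(f"after iteration {k + 1}: x = {x}, residual norm = {np.linalg.norm(r):.2e}")

print("direct solve for comparison:", np.linalg.solve(A, b))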
