
Lanczos Method Seminar For Eigenvalue Reading Group: Andre Leger



1 Introduction and Notation


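• Eigenvalue problem: $Ax = \lambda x$, $A \in \mathbb{C}^{N \times N}$, $x \in \mathbb{C}^N$.
• Throughout, $A$ is real and symmetric, $A = A^T$, so $\lambda \in \mathbb{R}$. (For a complex Hermitian $A$, read $^T$ as the conjugate transpose.)
• The vectors $q_1, \dots, q_N$ are orthonormal if
1. $q_i^T q_i = 1$, i.e. $\|q_i\|_2 = 1$,
2. $q_i^T q_j = 0$ for $i \neq j$,
3. equivalently, $Q^T = Q^{-1}$ for $Q = [q_1, \dots, q_N]$.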

2 A Reminder of the Power Method


• Recall that the Power Method finds the eigenvector associated with the eigenvalue of largest magnitude.
• Simply iterate $x_{k+1} = c A x_k$, where $c$ is a normalisation constant (e.g. $c = 1/\|A x_k\|_2$) that prevents $x_{k+1}$ from growing large.
• As $k \to \infty$, $x_{k+1} \to v_1$, the eigenvector associated with the eigenvalue $\lambda_1$, where $|\lambda_1| > |\lambda_2| \geq |\lambda_3| \geq \dots \geq |\lambda_N|$.
• We obtain the corresponding eigenvalue from the Rayleigh quotient

$$R(A, x_k) = \frac{x_k^T A x_k}{\|x_k\|_2^2}$$

• Why don't we just use the QR method? If A is sparse, an iteration of the QR method does not preserve the sparsity of the new matrix, which makes it inefficient for large sparse problems.
• Note: We only find ONE eigenvector and eigenvalue. What if we want more?
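
The Power Method and Rayleigh quotient described above can be sketched in a few lines of NumPy. This is only an illustration: the function name `power_method`, the fixed iteration count and the small test matrix are assumptions made here, not part of the seminar.

```python
import numpy as np

def power_method(A, x0, iters=200):
    """Power iteration x_{k+1} = c A x_k with c = 1 / ||A x_k||_2."""
    x = x0 / np.linalg.norm(x0)
    for _ in range(iters):
        y = A @ x                      # apply A
        x = y / np.linalg.norm(y)      # normalise so the iterate stays bounded
    theta = (x @ A @ x) / (x @ x)      # Rayleigh quotient R(A, x_k)
    return theta, x

# Hypothetical symmetric test matrix with eigenvalues 3 - sqrt(3), 3, 3 + sqrt(3).
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
theta, x = power_method(A, np.ones(3))
print(theta)                                # estimate of the dominant eigenvalue
print(np.linalg.norm(A @ x - theta * x))    # residual of the computed eigenpair
```

Only one eigenpair comes out of this loop, which is the limitation the last bullet points at.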

3 The Idea Behind Lanczos Method


• Let's follow the Power Method, but save each iterate, so that we obtain
$$v,\ Av,\ A^2 v,\ \dots,\ A^{k-1} v$$

• These vectors span the Krylov space

$$K_k(A, v) = \mathrm{span}\{v, Av, A^2 v, \dots, A^{k-1} v\}$$

• So after $n$ iterations the vectors
$$v,\ Av,\ \dots,\ A^{n-1} v$$
are, in general, linearly independent, and an approximation to $x$ can be formed from the space they span.
• By the Power Method, the iterates tend to the dominant eigenvector, so the sequence becomes nearly linearly dependent, but we want a sequence of linearly independent vectors.
• Hence we orthogonalise the vectors. This is the basis of the Lanczos Method.
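
The near linear dependence of the raw Krylov vectors can be observed numerically. The sketch below assumes a hypothetical diagonal test matrix and a random starting vector; `krylov_matrix` is a helper name introduced here for illustration. It prints the condition number of the matrix of (normalised) Krylov vectors as columns are added, and compares it with an orthonormal basis of the same space obtained from a QR factorisation.

```python
import numpy as np

def krylov_matrix(A, v, k):
    """Columns v, Av, ..., A^{k-1} v, each scaled to unit 2-norm."""
    K = np.empty((A.shape[0], k))
    w = v / np.linalg.norm(v)
    for j in range(k):
        K[:, j] = w
        w = A @ w
        w = w / np.linalg.norm(w)
    return K

rng = np.random.default_rng(0)
A = np.diag(np.linspace(1.0, 10.0, 50))   # hypothetical symmetric test matrix
v = rng.standard_normal(50)

conds = []
for k in (2, 5, 10, 20):
    conds.append(np.linalg.cond(krylov_matrix(A, v, k)))
    print(k, conds[-1])                   # grows rapidly with k

# An orthonormal basis of the same 10-dimensional Krylov space.
Q, _ = np.linalg.qr(krylov_matrix(A, v, 10))
print(np.linalg.cond(Q))                  # close to 1
```

The Lanczos Method builds such an orthonormal basis directly, one vector at a time, instead of forming the ill-conditioned Krylov matrix first.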

4 Lanczos Method


• Assume we have orthonormal vectors
$$q_1, q_2, \dots, q_N$$

• Let $Q = [q_1, q_2, \dots, q_k]$, hence

$$Q^T Q = I$$

• We want to reduce $A$ to a tridiagonal matrix $T$ by applying a similarity transformation:

$$Q^T A Q = T \quad \text{or} \quad AQ = QT$$

• So we define $T_{k+1,k}$ to be

$$T_{k+1,k} =
\begin{pmatrix}
\alpha_1 & \beta_1 & 0 & \cdots & 0 \\
\beta_1 & \alpha_2 & \beta_2 & \ddots & \vdots \\
0 & \beta_2 & \alpha_3 & \ddots & 0 \\
\vdots & \ddots & \ddots & \ddots & \beta_{k-1} \\
0 & \cdots & 0 & \beta_{k-1} & \alpha_k \\
0 & \cdots & \cdots & 0 & \beta_k
\end{pmatrix}
\in \mathbb{C}^{k+1,k}$$

• After $k$ steps we have $A Q_k = Q_{k+1} T_{k+1,k}$ for $A \in \mathbb{C}^{N,N}$, $Q_k \in \mathbb{C}^{N,k}$, $Q_{k+1} \in \mathbb{C}^{N,k+1}$, $T_{k+1,k} \in \mathbb{C}^{k+1,k}$.
• We observe that
$$A Q_k = Q_{k+1} T_{k+1,k} = Q_k T_{k,k} + \beta_k q_{k+1} e_k^T$$
• Now $AQ = QT$, hence
$$A [q_1, q_2, \dots, q_k] = [q_1, q_2, \dots, q_k]\, T_k$$

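• The first column of the left hand side is given by

$$A q_1 = \alpha_1 q_1 + \beta_1 q_2$$

• The $i$th column by

$$A q_i = \beta_{i-1} q_{i-1} + \alpha_i q_i + \beta_i q_{i+1}, \quad i = 2, \dots \qquad (\dagger)$$
• We wish to find the alphas and betas, so multiply † on the left by $q_i^T$ and use orthonormality:
$$\begin{aligned}
q_i^T A q_i &= q_i^T \beta_{i-1} q_{i-1} + q_i^T \alpha_i q_i + q_i^T \beta_i q_{i+1} \\
&= \beta_{i-1} q_i^T q_{i-1} + \alpha_i q_i^T q_i + \beta_i q_i^T q_{i+1} \\
&= \alpha_i q_i^T q_i = \alpha_i
\end{aligned}$$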

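• We obtain $\beta_i$ by rearranging † into the recurrence formula

$$r_i \equiv \beta_i q_{i+1} = A q_i - \alpha_i q_i - \beta_{i-1} q_{i-1}$$

• We assume $\beta_i \neq 0$, and since $\|q_{i+1}\|_2 = 1$ we have $\beta_i = \|r_i\|_2$.

• We may now determine the next orthonormal vector
$$q_{i+1} = \frac{r_i}{\beta_i}$$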
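
As a small worked illustration of these formulas, take $A = \mathrm{diag}(1, 2, 3)$ and $q_1 = \frac{1}{\sqrt{3}}(1, 1, 1)^T$. The matrix and starting vector are chosen here for convenience and are not from the seminar.

```latex
\begin{aligned}
\alpha_1 &= q_1^T A q_1 = \tfrac{1}{3}(1 + 2 + 3) = 2, \\
r_1 &= A q_1 - \alpha_1 q_1 = \tfrac{1}{\sqrt{3}}(-1, 0, 1)^T, \qquad
\beta_1 = \|r_1\|_2 = \sqrt{2/3}, \qquad
q_2 = \tfrac{1}{\sqrt{2}}(-1, 0, 1)^T, \\
\alpha_2 &= q_2^T A q_2 = \tfrac{1}{2}(1 + 3) = 2, \\
r_2 &= A q_2 - \alpha_2 q_2 - \beta_1 q_1 = \tfrac{\sqrt{2}}{6}(1, -2, 1)^T, \qquad
\beta_2 = \|r_2\|_2 = \tfrac{1}{\sqrt{3}}, \qquad
q_3 = \tfrac{1}{\sqrt{6}}(1, -2, 1)^T, \\
\alpha_3 &= q_3^T A q_3 = \tfrac{1}{6}(1 + 8 + 3) = 2, \qquad
r_3 = A q_3 - \alpha_3 q_3 - \beta_2 q_2 = 0 .
\end{aligned}
```

Here $\beta_3 = 0$, so the recurrence terminates after three steps, and the resulting $3 \times 3$ tridiagonal matrix with diagonal $(2, 2, 2)$ and off-diagonals $(\sqrt{2/3}, 1/\sqrt{3})$ has characteristic polynomial $(2 - \lambda)\left[(2 - \lambda)^2 - 1\right]$, i.e. eigenvalues $1, 2, 3$, the same as $A$.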

5 A Little Proof - Omit from Seminar


Lemma: Every vector $q_{i+1}$ generated by the three-term recurrence † is orthogonal to all $q_k$ with $k \leq i$.

Proof

• We assume $q_{i+1}^T q_i = 0 = q_{i+1}^T q_{i-1}$ and, as the induction hypothesis, $q_i^T q_k = 0$ for $k < i$.

• We prove $q_{i+1}^T q_k = 0$ for $k < i$.

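• Take $k \leq i - 2$. We first show that $q_k$ and $q_i$ are $A$-orthogonal. Since $A = A^T$, applying † to $q_k$ gives

$$\begin{aligned}
q_k^T A q_i &= q_k^T A^T q_i = (A q_k)^T q_i \\
&= (\beta_{k-1} q_{k-1} + \alpha_k q_k + \beta_k q_{k+1})^T q_i \\
&= \beta_{k-1} q_{k-1}^T q_i + \alpha_k q_k^T q_i + \beta_k q_{k+1}^T q_i \\
&= 0 + 0 + 0 = 0
\end{aligned}$$

because $k + 1 \leq i - 1 < i$.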

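• Now multiply † on the left by $q_k^T$, so that

$$q_k^T A q_i = \beta_{i-1} q_k^T q_{i-1} + \alpha_i q_k^T q_i + \beta_i q_k^T q_{i+1}$$

Rearranging, we obtain

$$\beta_i q_k^T q_{i+1} = q_k^T A q_i - \beta_{i-1} q_k^T q_{i-1} - \alpha_i q_k^T q_i = 0$$

and since $\beta_i \neq 0$, it follows that $q_k^T q_{i+1} = 0$.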

6 The Lanczos Algorithm


Initialise: choose a starting vector $r$, set $q_0 = 0$ and let $\beta_0 = \|r\|_2$
Begin Loop: for $j = 1, \dots$
    $q_j = r / \beta_{j-1}$
    $r = A q_j$
    $r = r - q_{j-1} \beta_{j-1}$
    $\alpha_j = q_j^T r$
    $r = r - q_j \alpha_j$
    Orthogonalise if necessary
    $\beta_j = \|r\|_2$
    Compute approximate eigenvalues of $T_j$
    Test convergence (see remarks)
endfor
End Loop
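
The loop above translates almost line by line into NumPy. The following is a minimal sketch, not a production implementation: the function name `lanczos`, its return values and the random test matrix are assumptions made for illustration. It performs no re-orthogonalisation and stops early only if a residual is exactly zero.

```python
import numpy as np

def lanczos(A, v, k):
    """Run up to k Lanczos steps on a real symmetric A, starting from v.

    Returns Q (N x k), the k x k tridiagonal T, beta_k and q_{k+1}, which
    satisfy A Q = Q T + beta_k q_{k+1} e_k^T up to rounding.
    """
    N = A.shape[0]
    Q = np.zeros((N, k))
    alpha = np.zeros(k)
    beta = np.zeros(k)
    q_prev = np.zeros(N)                 # q_0 = 0
    r = np.asarray(v, dtype=float).copy()
    b = np.linalg.norm(r)                # beta_0
    for j in range(k):
        q = r / b                        # q_j = r / beta_{j-1}
        Q[:, j] = q
        r = A @ q - b * q_prev           # r = A q_j - q_{j-1} beta_{j-1}
        alpha[j] = q @ r                 # alpha_j = q_j^T r
        r = r - alpha[j] * q             # r = r - q_j alpha_j
        b = np.linalg.norm(r)            # beta_j
        beta[j] = b
        q_prev = q
        if b == 0.0:                     # exact invariant subspace found
            k = j + 1
            break
    q_next = r / b if b > 0 else np.zeros(N)
    T = (np.diag(alpha[:k])
         + np.diag(beta[:k - 1], 1)
         + np.diag(beta[:k - 1], -1))
    return Q[:, :k], T, beta[k - 1], q_next

rng = np.random.default_rng(0)
B = rng.standard_normal((30, 30))
A = (B + B.T) / 2                        # hypothetical symmetric test matrix
Q, T, beta_k, q_next = lanczos(A, rng.standard_normal(30), 8)

e_k = np.zeros(8)
e_k[-1] = 1.0
print(np.linalg.norm(A @ Q - Q @ T - beta_k * np.outer(q_next, e_k)))
print(np.linalg.norm(Q.T @ Q - np.eye(8)))

# Three steps on diag(1, 2, 3) recover its full spectrum.
_, T3, _, _ = lanczos(np.diag([1.0, 2.0, 3.0]), np.ones(3), 3)
print(np.linalg.eigvalsh(T3))
```

Only the two most recent vectors are needed inside the loop; `Q` is stored here so that the relation $A Q_k = Q_k T_{k,k} + \beta_k q_{k+1} e_k^T$ can be checked afterwards.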

7 Remarks 1: Finding the Eigenvalues and Eigenvectors


• So how do we find the eigenvalues and eigenvectors?

• If $\beta_k = 0$ then

1. We diagonalise the matrix $T_k$, e.g. with the simple QR method, to find the exact eigenvalues:

$$T_k = S_k \,\mathrm{diag}(\lambda_1, \dots, \lambda_k)\, S_k^T$$

where the matrix $S_k$ is orthogonal, $S_k S_k^T = I$.


2. The exact eigenvectors are given correspondingly by the columns of the matrix $Y$, where

$$S_k^T Q_k^T A Q_k S_k = \mathrm{diag}(\lambda_1, \dots, \lambda_k)$$

so that $Y = Q_k S_k$.
3. The method converges towards the extreme (e.g. the largest) eigenvalues first. The proof is very difficult and is omitted.

• In practice $\beta_k$ is never exactly zero. Hence we obtain only approximations that converge to the eigenvalues.

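– After $k$ steps we have $A Q_k = Q_k T_{k,k} + \beta_k q_{k+1} e_k^T$.


– For $\beta_k$ small we obtain approximations to the eigenvalues, $\theta_i \approx \lambda_i$, from

$$T_k = S_k \,\mathrm{diag}(\theta_1, \dots, \theta_k)\, S_k^T$$

– We multiply the expression for $A Q_k$ above on the right by $S_k$ and write $Y_k = Q_k S_k$, with columns $y_j$, so that

$$\begin{aligned}
A Q_k S_k &= Q_k T_{k,k} S_k + \beta_k q_{k+1} e_k^T S_k \\
&= Q_k S_k \,\mathrm{diag}(\theta_1, \dots, \theta_k)\, S_k^T S_k + \beta_k q_{k+1} e_k^T S_k \\
A Y_k &= Y_k \,\mathrm{diag}(\theta_1, \dots, \theta_k) + \beta_k q_{k+1} e_k^T S_k \\
A Y_k e_j &= Y_k \,\mathrm{diag}(\theta_1, \dots, \theta_k)\, e_j + \beta_k q_{k+1} e_k^T S_k e_j \\
A y_j &= y_j \theta_j + \beta_k q_{k+1} S_{kj} \\
A y_j - \theta_j y_j &= \beta_k q_{k+1} S_{kj} \\
\therefore \ \|A y_j - \theta_j y_j\|_2 &= |\beta_k| |S_{kj}| \qquad (1)
\end{aligned}$$

where $S_{kj} = e_k^T S_k e_j$ is the last entry of the $j$th eigenvector of $T_k$.

– So if $\beta_k \to 0$ the residual in (1) vanishes and $\theta_j \to \lambda_j$.
– Otherwise $|\beta_k||S_{kj}|$ needs to be small to have a good approximation, hence the convergence criterion
$$|\beta_k||S_{kj}| < \epsilon$$
for a chosen tolerance $\epsilon$.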
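
Relation (1) can be checked numerically. The sketch below is an illustration under stated assumptions: the compact `lanczos` helper, the diagonal test matrix (a cluster of eigenvalues in $[0, 1]$ plus three separated ones at 2, 2.5 and 3) and the step count are all chosen here, not taken from the seminar. It uses `numpy.linalg.eigh` to diagonalise $T_k$ and compares the true residuals with the cheap estimate $|\beta_k||S_{kj}|$.

```python
import numpy as np

def lanczos(A, v, k):
    """Plain Lanczos without re-orthogonalisation (assumes no breakdown)."""
    N = A.shape[0]
    Q = np.zeros((N, k))
    alpha = np.zeros(k)
    beta = np.zeros(k)
    q_prev = np.zeros(N)
    r = np.asarray(v, dtype=float).copy()
    b = np.linalg.norm(r)
    for j in range(k):
        q = r / b
        Q[:, j] = q
        r = A @ q - b * q_prev
        alpha[j] = q @ r
        r = r - alpha[j] * q
        b = np.linalg.norm(r)
        beta[j] = b
        q_prev = q
    T = np.diag(alpha) + np.diag(beta[:-1], 1) + np.diag(beta[:-1], -1)
    return Q, T, beta[-1]

# Hypothetical spectrum: 97 clustered eigenvalues and three separated ones.
eigs = np.concatenate([np.linspace(0.0, 1.0, 97), [2.0, 2.5, 3.0]])
A = np.diag(eigs)
rng = np.random.default_rng(0)

Q, T, beta_k = lanczos(A, rng.standard_normal(100), 20)
theta, S = np.linalg.eigh(T)             # Ritz values theta_j, eigenvectors S
Y = Q @ S                                # Ritz vectors y_j

residuals = np.linalg.norm(A @ Y - Y * theta, axis=0)   # ||A y_j - theta_j y_j||
estimates = np.abs(beta_k * S[-1, :])                   # |beta_k| |S_kj|

for t, res, est in zip(theta[-3:], residuals[-3:], estimates[-3:]):
    print(t, res, est)
```

The estimate needs only $\beta_k$ and the last row of $S_k$, so the convergence test can be applied without forming any Ritz vector $y_j$.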

8 Remarks 2: Difficulties with Lanczos Method


• In practice, the problem is that the orthogonality is not preserved.

• As soon as one eigenvalue converges all the basis vectors qi pick up perturbations biased
toward the direction of the corresponding eigenvector and orthogonality is lost.

• A “ghost” copy of the eigenvalue will appear again in the tridiagonal matrix T.

• To counter this we fully re-orthonormalize the sequence by using Gram-Schmidt or even QR.

• However, either approach would be expensive if the dimension of the Krylov space is large.

• So instead a selective re-orthonormalization is pursued. More specifically, the practical approach is to orthonormalize half-way, i.e., to within half machine precision, $\sqrt{\epsilon_M}$.

• If the eigenvalues of $A$ are not well separated, then we can use a shift $\sigma$ and employ the matrix

$$(A - \sigma I)^{-1}$$

following the shifted inverse power method to generate the appropriate Krylov subspaces.
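
The loss of orthogonality and the effect of full re-orthonormalization described above can be observed in a small experiment. This is a sketch under assumptions made here: the `reorth` flag, the choice of two classical Gram-Schmidt passes, and the diagonal test matrix are illustrative only. It compares $\|Q_k^T Q_k - I\|$ with and without re-orthogonalisation and counts the Ritz values that sit at the largest eigenvalue.

```python
import numpy as np

def lanczos(A, v, k, reorth=False):
    """Lanczos with optional full re-orthogonalisation (assumes no breakdown)."""
    N = A.shape[0]
    Q = np.zeros((N, k))
    alpha = np.zeros(k)
    beta = np.zeros(k)
    q_prev = np.zeros(N)
    r = np.asarray(v, dtype=float).copy()
    b = np.linalg.norm(r)
    for j in range(k):
        q = r / b
        Q[:, j] = q
        r = A @ q - b * q_prev
        alpha[j] = q @ r
        r = r - alpha[j] * q
        if reorth:
            # Two Gram-Schmidt passes against all previous Lanczos vectors.
            for _ in range(2):
                r = r - Q[:, :j + 1] @ (Q[:, :j + 1].T @ r)
        b = np.linalg.norm(r)
        beta[j] = b
        q_prev = q
    T = np.diag(alpha) + np.diag(beta[:-1], 1) + np.diag(beta[:-1], -1)
    return Q, T

# Hypothetical spectrum with well separated largest eigenvalues, so that
# Ritz values converge early and orthogonality is lost soon afterwards.
eigs = np.concatenate([np.linspace(0.0, 1.0, 197), [2.0, 2.5, 3.0]])
A = np.diag(eigs)
v = np.random.default_rng(0).standard_normal(200)
k = 60

Q_plain, T_plain = lanczos(A, v, k)
Q_full, T_full = lanczos(A, v, k, reorth=True)

err_plain = np.linalg.norm(Q_plain.T @ Q_plain - np.eye(k))
err_full = np.linalg.norm(Q_full.T @ Q_full - np.eye(k))
print(err_plain, err_full)

# Number of Ritz values at the largest eigenvalue 3; more than one means ghosts.
ghosts_plain = int(np.sum(np.abs(np.linalg.eigvalsh(T_plain) - 3.0) < 1e-8))
ghosts_full = int(np.sum(np.abs(np.linalg.eigvalsh(T_full) - 3.0) < 1e-8))
print(ghosts_plain, ghosts_full)
```

The re-orthogonalised run costs an extra $O(Nj)$ work at step $j$ and requires storing all of $Q_k$, which is the expense that motivates the selective strategy.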
