Chapter 4
The QR Algorithm
and hence $A_k$ and $A_{k-1}$ are unitarily similar. The matrix sequence $\{A_k\}$ converges (under certain assumptions) towards an upper triangular matrix [11].
Algorithm 4.1 Basic QR algorithm
1: Let $A \in \mathbb{C}^{n\times n}$. This algorithm computes an upper triangular matrix $T$ and a unitary matrix $U$ such that $A = UTU^*$ is the Schur decomposition of $A$.
2: Set $A_0 := A$ and $U_0 = I$.
3: for $k = 1, 2, \dots$ do
4:   $A_{k-1} =: Q_k R_k$; /* QR factorization */
5:   $A_k := R_k Q_k$;
6:   $U_k := U_{k-1} Q_k$; /* Update transformation matrix */
7: end for
8: Set $T := A_\infty$ and $U := U_\infty$.
Let us assume that the eigenvalues are mutually different in magnitude, so that we can number the eigenvalues such that $|\lambda_1| > |\lambda_2| > \cdots > |\lambda_n|$. Then, as we will show in Chapter 8, the elements of $A_k$ below the diagonal converge to zero like
$$|a_{ij}^{(k)}| = O(|\lambda_i/\lambda_j|^k), \qquad i > j. \tag{4.2}$$
With the same assumption on the eigenvalues, $A_k$ tends to an upper triangular matrix and $U_k$ converges to the matrix of Schur vectors. Let us verify this numerically in MATLAB:
D = diag([4 3 2 1]);
rand('seed',0);
format short e
S=rand(4); S = (S - .5)*2;
A = S*D/S % A_0 = A = S*D*S^{-1}
for i=1:20,
[Q,R] = qr(A); A = R*Q
end
Looking at the element-wise quotients of the last two matrices one recognizes the conver-
gence rates claimed in (4.2).
A(20)./A(19) = [ 1.0000 0.9752 1.0000 -1.0000]
[ 0.7495 1.0000 0.9988 -1.0008]
[ 0.5000 0.6668 1.0000 -1.0001]
[ -0.2500 -0.3334 -0.4999 1.0000]
So, again the eigenvalues are visible on the diagonal of $A_{20}$. The element-wise quotients of $A_{20}$ relative to $A_{19}$ are
A(20)./A(19) = [ 1.0000 1.0000 1.0000 -1.0000]
[ 0.4000 1.0000 0.4993 -1.0000]
[ 0.4000 0.4753 1.0000 -1.0000]
[ -0.2000 -0.5000 -0.5000 1.0000]
Notice that (4.2) does not state a rate for the element at position (3, 2).
These little numerical tests are intended to demonstrate that the convergence rates given in (4.2) are indeed observed in an actual run of the basic QR algorithm. The conclusions we can draw are the following:
1. The convergence of the algorithm is slow. In fact, it can be arbitrarily slow if eigenvalues are very close to each other.
2. The algorithm is expensive. Each iteration step requires the computation of the QR factorization of a full $n \times n$ matrix, i.e., each single iteration step has a complexity of $O(n^3)$. Even if we assume that the number of steps is proportional to $n$, we would arrive at an $O(n^4)$ complexity. The latter assumption is not even assured, see point 1 of this discussion.
In the following we want to improve on both issues. First we want to find a matrix
structure that is preserved by the QR algorithm and that lowers the cost of a single
iteration step. Then, we want to improve on the convergence properties of the algorithm.
Definition 4.1 A matrix $H$ is a Hessenberg matrix if its elements below the lower off-diagonal are zero,
$$h_{ij} = 0, \qquad i > j + 1.$$
The Hessenberg form is preserved by a QR step, as the following constructive proof shows.
Proof. Given a Hessenberg matrix $H$ with QR factorization $H = QR$, we show that $\overline{H} = RQ$ is again a Hessenberg matrix.
The Givens rotation or plane rotation $G(i, j, \vartheta)$ is defined by
$$G(i,j,\vartheta) := \begin{bmatrix}
1 & \cdots & 0 & \cdots & 0 & \cdots & 0\\
\vdots & \ddots & \vdots & & \vdots & & \vdots\\
0 & \cdots & c & \cdots & s & \cdots & 0\\
\vdots & & \vdots & \ddots & \vdots & & \vdots\\
0 & \cdots & -s & \cdots & c & \cdots & 0\\
\vdots & & \vdots & & \vdots & \ddots & \vdots\\
0 & \cdots & 0 & \cdots & 0 & \cdots & 1
\end{bmatrix}
\begin{matrix} \\ \\ \leftarrow i\\ \\ \leftarrow j\\ \\ \\ \end{matrix} \tag{4.4}$$
where $c = \cos\vartheta$ and $s = \sin\vartheta$, and the nontrivial entries occur in rows and columns $i$ and $j$.
Thus, it is a simple matter to zero a single specific entry in a vector by using a Givens rotation.
Now, let us look at a Hessenberg matrix $H$. We can show the principle of the procedure by means of a $4 \times 4$ example.
$$H = \begin{bmatrix}
\times & \times & \times & \times\\
\times & \times & \times & \times\\
0 & \times & \times & \times\\
0 & 0 & \times & \times
\end{bmatrix}
\xrightarrow{G(1,2,\vartheta_1)^*\,\cdot}
\begin{bmatrix}
\times & \times & \times & \times\\
0 & \times & \times & \times\\
0 & \times & \times & \times\\
0 & 0 & \times & \times
\end{bmatrix}
\xrightarrow{G(2,3,\vartheta_2)^*\,\cdot}
\begin{bmatrix}
\times & \times & \times & \times\\
0 & \times & \times & \times\\
0 & 0 & \times & \times\\
0 & 0 & \times & \times
\end{bmatrix}
\xrightarrow{G(3,4,\vartheta_3)^*\,\cdot}
\begin{bmatrix}
\times & \times & \times & \times\\
0 & \times & \times & \times\\
0 & 0 & \times & \times\\
0 & 0 & 0 & \times
\end{bmatrix} = R$$
So,
$$\underbrace{G_3^*G_2^*G_1^*}_{Q^*}\,H = R \iff H = QR.$$
Then we compute
$$\overline{H} = RQ = RG_1G_2G_3,$$
or, pictorially,
$$R = \begin{bmatrix}
\times & \times & \times & \times\\
0 & \times & \times & \times\\
0 & 0 & \times & \times\\
0 & 0 & 0 & \times
\end{bmatrix}
\xrightarrow{\cdot\,G(1,2,\vartheta_1)}
\begin{bmatrix}
\times & \times & \times & \times\\
\times & \times & \times & \times\\
0 & 0 & \times & \times\\
0 & 0 & 0 & \times
\end{bmatrix}
\xrightarrow{\cdot\,G(2,3,\vartheta_2)}
\begin{bmatrix}
\times & \times & \times & \times\\
\times & \times & \times & \times\\
0 & \times & \times & \times\\
0 & 0 & 0 & \times
\end{bmatrix}
\xrightarrow{\cdot\,G(3,4,\vartheta_3)}
\begin{bmatrix}
\times & \times & \times & \times\\
\times & \times & \times & \times\\
0 & \times & \times & \times\\
0 & 0 & \times & \times
\end{bmatrix} = \overline{H}$$
Thus, $\overline{H}$ is again a Hessenberg matrix: multiplication with $G(i,i+1,\vartheta_i)$ from the right affects only columns $i$ and $i+1$ and introduces at most one nonzero below the diagonal, at position $(i+1,i)$.
We repeat the experiment, but now we first reduce $A$ to Hessenberg form:
D = diag([4 3 2 1]);
rand('seed',0);
S=rand(4); S = (S - .5)*2;
A = S*D/S % A_0 = A = S*D*S^{-1}
H = hess(A); % built-in MATLAB function: generates a
             % unitarily similar Hessenberg matrix
for i=1:30,
  [Q,R] = qr(H); H = R*Q
end
This yields the matrix sequence
H( 0) = [ -4.4529e-01 -1.8641e+00 -2.8109e+00 7.2941e+00]
[ 8.0124e+00 6.2898e+00 1.2058e+01 -1.6088e+01]
[ 0.0000e+00 4.0087e-01 1.1545e+00 -3.3722e-01]
[ 0.0000e+00 0.0000e+00 -1.5744e-01 3.0010e+00]
Again the elements in the lower off-diagonal reflect nicely the convergence rates in (4.2).
4.2.2 Complexity
We give the algorithm for a single Hessenberg QR step in a MATLAB-like way, see Algorithm 4.2. By
$$H_{k:j,m:n} \in \mathbb{C}^{(j-k+1)\times(n-m+1)}$$
we denote the submatrix of $H$ formed by rows $k$ through $j$ and columns $m$ through $n$.
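A single Hessenberg QR step can be sketched in MATLAB as follows; the two loops correspond to those of Algorithm 4.2, first computing $H := Q^*H = R$ by Givens rotations and then $H := RQ$. The function names hessqr_step and givens_cs, and the sign convention of the rotations, are our own choices:

function H = hessqr_step(H)
% One QR step for a Hessenberg matrix H using n-1 Givens rotations.
n = size(H,1); c = zeros(n-1,1); s = zeros(n-1,1);
% First loop: H := Q^*H = R; rotation k zeros the element (k+1,k).
for k = 1:n-1
    [c(k),s(k)] = givens_cs(H(k,k), H(k+1,k));
    H(k:k+1,k:n) = [c(k) s(k); -s(k) c(k)]*H(k:k+1,k:n);
end
% Second loop: H := R*Q; apply the rotations from the right.
for k = 1:n-1
    H(1:k+1,k:k+1) = H(1:k+1,k:k+1)*[c(k) -s(k); s(k) c(k)];
end
end

function [c,s] = givens_cs(a,b)
% Rotation parameters with [c s; -s c]*[a; b] = [r; 0].
r = hypot(a,b);
if r == 0, c = 1; s = 0; else, c = a/r; s = b/r; end
end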
If we neglect the determination of the parameters $c_k$ and $s_k$, see (4.5), then each of the two loops requires
$$\sum_{i=1}^{n-1} 6i = 6\,\frac{n(n-1)}{2} \approx 3n^2 \text{ flops}.$$
A flop is a floating point operation ($+$, $-$, $\times$, $/$). We do not distinguish between them, although they may slightly differ in their execution time on a computer. Optionally, we also have to execute the operation $U_k := U_{k-1}Q_k$ of Algorithm 4.1. This is achieved by a loop similar to the second loop in Algorithm 4.2. Since all $n$ rows of $U$ are involved in each rotation, this update requires
$$\sum_{i=1}^{n-1} 6n \approx 6n^2 \text{ flops}.$$
Altogether, a QR step with a Hessenberg matrix, including the update of the unitary transformation matrix, requires $12n^2$ floating point operations. This has to be set in relation to a QR step with a full matrix that costs $\frac{7}{3}n^3$. Consequently, we have gained a factor of $O(n)$ in terms of operations by moving from dense to Hessenberg form. However, we may still have very slow convergence if one of the quotients $|\lambda_k|/|\lambda_{k+1}|$ is close to 1.
4.3 The Householder reduction to Hessenberg form
In the previous section we discovered that it is a good idea to perform the QR algorithm
with Hessenberg matrices instead of full matrices. But we have not discussed how we
transform a full matrix (by means of similarity transformations) into Hessenberg form.
We catch up on this issue in this section.
A Householder reflector is a matrix of the form
$$P = I - 2uu^*, \qquad \|u\| = 1.$$
It is easy to verify that Householder reflectors are Hermitian and that $P^2 = I$. From this we deduce that $P$ is unitary. It is clear that we only have to store the Householder vector $u$ to be able to multiply a vector (or a matrix) with $P$:
$$Px = x - 2u(u^*x).$$
This multiplication only costs $4n$ flops, where $n$ is the length of the vectors.
A task that we repeatedly want to carry out with Householder reflectors is to transform a vector $x$ onto a multiple of $e_1$,
$$Px = x - u(2u^*x) = \alpha e_1.$$
Since $P$ is unitary, we must have $\alpha = \rho\|x\|$, where $\rho \in \mathbb{C}$ has absolute value one. Therefore,
$$u = \frac{x - \rho\|x\|e_1}{\|x - \rho\|x\|e_1\|}
= \frac{1}{\|x - \rho\|x\|e_1\|}\begin{bmatrix} x_1 - \rho\|x\|\\ x_2\\ \vdots\\ x_n \end{bmatrix}.$$
We can freely choose $\rho$ provided that $|\rho| = 1$. Let $x_1 = |x_1|e^{i\phi}$. To avoid numerical cancellation we set $\rho = -e^{i\phi}$. In the real case, one commonly sets $\rho = -\operatorname{sign}(x_1)$. If $x_1 = 0$ we can set $\rho$ in any way.
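For the real case, this construction can be sketched in a few lines of MATLAB (the function name house_vec is ours; we assume $x \neq 0$):

function [u,alpha] = house_vec(x)
% Householder vector u with (I - 2*u*u')*x = alpha*e_1, for real x ~= 0.
rho = -sign(x(1));
if rho == 0, rho = -1; end       % x(1) = 0: rho may be chosen freely
alpha = rho*norm(x);
u = x; u(1) = u(1) - alpha;      % u = x - rho*||x||*e_1, no cancellation
u = u/norm(u);
end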
The multiplication of P1 from the left inserts the desired zeros in column 1 of A. The
multiplication from the right is necessary in order to have similarity. Because of the
nonzero structure of P1 the first column of P1 A is not affected. Hence, the zeros stay
there.
The reduction continues in a similar way:
$$P_1AP_1 = \begin{bmatrix}
\times & \times & \times & \times & \times\\
\times & \times & \times & \times & \times\\
0 & \times & \times & \times & \times\\
0 & \times & \times & \times & \times\\
0 & \times & \times & \times & \times
\end{bmatrix}
\xrightarrow{P_2^*\,\cdot\,/\,\cdot\,P_2}
\begin{bmatrix}
\times & \times & \times & \times & \times\\
\times & \times & \times & \times & \times\\
0 & \times & \times & \times & \times\\
0 & 0 & \times & \times & \times\\
0 & 0 & \times & \times & \times
\end{bmatrix}
\xrightarrow{P_3^*\,\cdot\,/\,\cdot\,P_3}
\begin{bmatrix}
\times & \times & \times & \times & \times\\
\times & \times & \times & \times & \times\\
0 & \times & \times & \times & \times\\
0 & 0 & \times & \times & \times\\
0 & 0 & 0 & \times & \times
\end{bmatrix}
= P_3P_2P_1A\underbrace{P_1P_2P_3}_{U}.$$
Algorithm 4.3 gives the details for the general n × n case. In step 4 of this algorithm,
the Householder reflector is generated such that
$$(I - 2u_ku_k^*)\begin{bmatrix} a_{k+1,k}\\ a_{k+2,k}\\ \vdots\\ a_{n,k} \end{bmatrix}
= \begin{bmatrix} \alpha\\ 0\\ \vdots\\ 0 \end{bmatrix}
\qquad\text{with}\qquad
u_k = \begin{bmatrix} u_1\\ u_2\\ \vdots\\ u_{n-k} \end{bmatrix}
\quad\text{and}\quad |\alpha| = \|a_{k+1:n,k}\|$$
according to the considerations of the previous subsection. The Householder vectors are stored at the locations of the zeros. Therefore the matrix $U = P_1 \cdots P_{n-2}$ that effects the similarity transformation from the full $A$ to the Hessenberg $H$ is computed after all Householder vectors have been generated, thus saving $(2/3)n^3$ flops. The overall complexity of the reduction is
• Application of $P_k$ from the left: $\displaystyle\sum_{k=1}^{n-2} 4(n-k-1)(n-k) \approx \tfrac{4}{3}n^3$
• Application of $P_k$ from the right: $\displaystyle\sum_{k=1}^{n-2} 4n(n-k) \approx 2n^3$
Algorithm 4.3 Reduction to Hessenberg form
1: This algorithm reduces a matrix $A \in \mathbb{C}^{n\times n}$ to Hessenberg form $H$ by a sequence of Householder reflections. $H$ overwrites $A$.
2: for k = 1 to n−2 do
3:   Generate the Householder reflector $P_k$;
4:   /* Apply $P_k = I_k \oplus (I_{n-k} - 2u_ku_k^*)$ from the left to $A$ */
5:   $A_{k+1:n,k:n} := A_{k+1:n,k:n} - 2u_k(u_k^*A_{k+1:n,k:n})$;
6:   /* Apply $P_k$ from the right, $A := AP_k$ */
7:   $A_{1:n,k+1:n} := A_{1:n,k+1:n} - 2(A_{1:n,k+1:n}u_k)u_k^*$;
8: end for
9: if eigenvectors are desired form $U = P_1 \cdots P_{n-2}$ then
10:   $U := I_n$;
11:   for k = n−2 downto 1 do
12:     /* Update $U := P_kU$ */
13:     $U_{k+1:n,k+1:n} := U_{k+1:n,k+1:n} - 2u_k(u_k^*U_{k+1:n,k+1:n})$;
14:   end for
15: end if
• Form $U = P_1 \cdots P_{n-2}$: $\displaystyle\sum_{k=1}^{n-2} 4(n-k)(n-k) \approx \tfrac{4}{3}n^3$
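Algorithm 4.3 translates almost line by line into MATLAB. The following sketch uses the house_vec helper from above and, deviating from the storage scheme described earlier, keeps the Householder vectors in a separate array V for clarity:

function [H,U] = hess_reduce(A)
% Reduce a real n x n matrix A to Hessenberg form H = U'*A*U by
% Householder reflectors; a sketch of Algorithm 4.3.
n = size(A,1); H = A; V = zeros(n,n);
for k = 1:n-2
    u = house_vec(H(k+1:n,k)); V(k+1:n,k) = u;
    H(k+1:n,k:n) = H(k+1:n,k:n) - 2*u*(u'*H(k+1:n,k:n));  % P_k from left
    H(:,k+1:n)   = H(:,k+1:n)   - 2*(H(:,k+1:n)*u)*u';    % P_k from right
end
U = eye(n);
for k = n-2:-1:1                  % accumulate U = P_1*...*P_{n-2}
    u = V(k+1:n,k);
    U(k+1:n,k+1:n) = U(k+1:n,k+1:n) - 2*u*(u'*U(k+1:n,k+1:n));
end
end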
4.4 Improving the convergence of the QR algorithm
The convergence of the Hessenberg QR algorithm can be improved dramatically by introducing shifts. The following lemma prepares the ground.
Lemma 4.4 Let $H$ be an irreducible Hessenberg matrix, i.e., $h_{i+1,i} \neq 0$ for all $i = 1, \dots, n-1$. Let $H = QR$ be the QR factorization of $H$. Then for the diagonal elements of $R$ we have
$$|r_{kk}| > 0 \quad \text{for all } k < n.$$
Thus, if $H$ is singular then $r_{nn} = 0$.
Proof. Let us look at the k-th step of the Hessenberg QR factorization. For illustration,
let us consider the case k = 3 in a 5 × 5 example, where the matrix has the structure
$$\begin{bmatrix}
+ & + & + & + & +\\
0 & + & + & + & +\\
0 & 0 & + & + & +\\
0 & 0 & \times & \times & \times\\
0 & 0 & 0 & \times & \times
\end{bmatrix}.$$
The plus-signs indicate elements that have been modified. In step 3, the (nonzero) element $h_{43}$ will be zeroed by a Givens rotation $G(3,4,\varphi)$ that is determined such that
$$\begin{bmatrix} \cos\varphi & -\sin\varphi\\ \sin\varphi & \cos\varphi \end{bmatrix}
\begin{bmatrix} \tilde h_{kk}\\ h_{k+1,k} \end{bmatrix}
= \begin{bmatrix} r_{kk}\\ 0 \end{bmatrix}.$$
Because the Givens rotation preserves vector lengths, we have
$$|r_{kk}| = \sqrt{|\tilde h_{kk}|^2 + |h_{k+1,k}|^2} \geq |h_{k+1,k}| > 0,$$
which proves the claim, as $H$ is irreducible.
The lemma suggests introducing shifts into the QR algorithm. A QR step with shift $\lambda$ reads
1: $H - \lambda I = QR$; /* QR factorization */
2: $\overline{H} = RQ + \lambda I$.
This is again a similarity transformation, since
$$\overline{H} = Q^*(H - \lambda I)Q + \lambda I = Q^*HQ.$$
Thus, if $\lambda$ is an eigenvalue of $H$, then $H - \lambda I$ is singular, and Lemma 4.4 implies $r_{nn} = 0$. Hence the last row of $RQ$ vanishes,
$$RQ = \begin{bmatrix} \tilde H & \tilde h\\ 0^T & 0 \end{bmatrix},$$
so that $\overline{H} = RQ + \lambda I$ has $\bar h_{n,n-1} = 0$ and $\bar h_{n,n} = \lambda$: the shift deflates the eigenvalue $\lambda$ in a single step.
Of course, the eigenvalues of $H$ are unknown. A cheap and often effective heuristic is to choose the last diagonal element as the shift in step $k$,
$$\sigma_k := h_{n,n}^{(k-1)} = e_n^*H^{(k-1)}e_n. \tag{4.7}$$
Algorithm 4.4 implements this heuristic. Notice that the shift changes in each iteration
step! Notice also that deflation is incorporated in Algorithm 4.4. As soon as the last lower
off-diagonal element is sufficiently small, it is declared zero, and the algorithm proceeds
with a smaller matrix. In Algorithm 4.4 the ‘active portion’ of the matrix is m × m.
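A sketch of such a shifted QR iteration with deflation in MATLAB, in the spirit of Algorithm 4.4; the function name, the use of the built-in qr on the active block, and the deflation tolerance are our own choices, and a real implementation would exploit the Hessenberg structure as in Algorithm 4.2:

function d = shifted_qr(H)
% Shifted QR iteration with deflation on a Hessenberg matrix H;
% assumes real eigenvalues so that the Rayleigh shift converges.
n = size(H,1); d = zeros(n,1); m = n;      % m x m active portion
while m > 1
    sigma = H(m,m);                        % Rayleigh-quotient shift (4.7)
    [Q,R] = qr(H(1:m,1:m) - sigma*eye(m));
    H(1:m,1:m) = R*Q + sigma*eye(m);
    if abs(H(m,m-1)) < eps*(abs(H(m-1,m-1)) + abs(H(m,m)))
        d(m) = H(m,m);                     % eigenvalue found: deflate
        m = m - 1;
    end
end
d(1) = H(1,1);
end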
Lemma 4.4 guarantees that a zero is produced at position $(n, n-1)$ in the Hessenberg matrix $\overline H$ if the shift equals an eigenvalue of $H$. What happens if $h_{n,n}$ is a good approximation to an eigenvalue of $H$? Let us assume that we have an irreducible Hessenberg matrix
$$\begin{bmatrix}
\times & \times & \times & \times & \times\\
\times & \times & \times & \times & \times\\
0 & \times & \times & \times & \times\\
0 & 0 & \times & \times & \times\\
0 & 0 & 0 & \varepsilon & h_{n,n}
\end{bmatrix},$$
where $\varepsilon$ is a small quantity. If we perform a shifted Hessenberg QR step, we first have to factor $H - h_{n,n}I$, $QR = H - h_{n,n}I$. After $n-2$ steps of this factorization the R-factor is almost upper triangular,
$$\begin{bmatrix}
+ & + & + & + & +\\
0 & + & + & + & +\\
0 & 0 & + & + & +\\
0 & 0 & 0 & \alpha & \beta\\
0 & 0 & 0 & \varepsilon & 0
\end{bmatrix}.$$
From (4.5) we see that the last Givens rotation has the nontrivial elements
$$c_{n-1} = \frac{\alpha}{\sqrt{|\alpha|^2 + |\varepsilon|^2}}, \qquad
s_{n-1} = \frac{-\varepsilon}{\sqrt{|\alpha|^2 + |\varepsilon|^2}}.$$
Applying the Givens rotations from the right, one sees that the last lower off-diagonal element of $\overline H = RQ + h_{n,n}I$ becomes
$$\bar h_{n,n-1} = \frac{\varepsilon^2\beta}{\alpha^2 + \varepsilon^2}. \tag{4.8}$$
So, we have quadratic convergence unless $\alpha$ is also tiny.
A second, even more often used shift strategy is the Wilkinson shift:
$$\sigma_k := \text{eigenvalue of }
\begin{bmatrix} h_{n-1,n-1}^{(k-1)} & h_{n-1,n}^{(k-1)}\\ h_{n,n-1}^{(k-1)} & h_{n,n}^{(k-1)} \end{bmatrix}
\text{ that is closer to } h_{n,n}^{(k-1)}. \tag{4.9}$$
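In MATLAB, the Wilkinson shift can be sketched as follows (wilkinson_shift is our name; for robustness one would use a cancellation-free formula instead of eig):

function sigma = wilkinson_shift(H)
% Eigenvalue of the trailing 2x2 block of H closer to H(n,n), see (4.9).
n = size(H,1);
ev = eig(H(n-1:n,n-1:n));
[~,idx] = min(abs(ev - H(n,n)));
sigma = ev(idx);
end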
We apply the shifted Hessenberg QR algorithm with the Rayleigh quotient shift (4.7) in MATLAB. The code
D = diag([4 3 2 1]);
rand('seed',0);
S=rand(4); S = (S - .5)*2;
A = S*D/S;
H = hess(A)
for i=1:8,
  [Q,R] = qr(H-H(4,4)*eye(4)); H = R*Q+H(4,4)*eye(4);
end
produces the output
H( 0) = [ -4.4529e-01 -1.8641e+00 -2.8109e+00 7.2941e+00]
[ 8.0124e+00 6.2898e+00 1.2058e+01 -1.6088e+01]
[ 0.0000e+00 4.0087e-01 1.1545e+00 -3.3722e-01]
[ 0.0000e+00 0.0000e+00 -1.5744e-01 3.0010e+00]
4.5 The double shift QR algorithm
The Wilkinson shift (4.9) can be complex even when the matrix $H$ is real. The double shift QR algorithm avoids complex arithmetic. Let $\sigma_1$ and $\sigma_2$ be the two eigenvalues of the trailing $2\times 2$ block used for the Wilkinson shift. If $\sigma_1 \in \mathbb{C}\setminus\mathbb{R}$ then $\sigma_2 = \bar\sigma_1$. Let us perform two QR steps using $\sigma_1$ and $\sigma_2$ as shifts. Setting $k = 1$ for convenience we get
$$\begin{aligned}
H_0 - \sigma_1I &= Q_1R_1,\\
H_1 &= R_1Q_1 + \sigma_1I,\\
H_1 - \sigma_2I &= Q_2R_2,\\
H_2 &= R_2Q_2 + \sigma_2I.
\end{aligned} \tag{4.10}$$
From the second and third equations in (4.10) we obtain
$$R_1Q_1 + (\sigma_1 - \sigma_2)I = Q_2R_2.$$
Multiplying this equation with $Q_1$ from the left and with $R_1$ from the right we get
$$Q_1Q_2R_2R_1 = Q_1(R_1Q_1 + (\sigma_1 - \sigma_2)I)R_1
= (H_0 - \sigma_1I)(H_0 - \sigma_2I)
= H_0^2 - 2\,\mathrm{Re}(\sigma_1)H_0 + |\sigma_1|^2I,$$
which is a real matrix. Therefore, $(Q_1Q_2)(R_2R_1)$ is the QR factorization of a real matrix. We can choose (scale) $Q_1$ and $Q_2$ such that $Z := Q_1Q_2$ is real orthogonal. (Then also $R_2R_1$ is real.) By consequence,
$$H_2 = (Q_1Q_2)^*H_0(Q_1Q_2) = Z^TH_0Z$$
is real.
A procedure to compute H2 by avoiding complex arithmetic could consist of three
steps:
1. Form the real matrix $M = H_0^2 - sH_0 + tI$ with $s = 2\,\mathrm{Re}(\sigma) = \mathrm{trace}(G) = h_{n-1,n-1}^{(k-1)} + h_{n,n}^{(k-1)}$ and $t = |\sigma|^2 = \det(G) = h_{n-1,n-1}^{(k-1)}h_{n,n}^{(k-1)} - h_{n-1,n}^{(k-1)}h_{n,n-1}^{(k-1)}$. Notice that $M$ has two lower off-diagonals.
2. Compute the QR factorization $M = ZR$.
3. Set $H_2 = Z^TH_0Z$.
This procedure is however too expensive, since item 1, i.e., forming $H_0^2$, requires $O(n^3)$ flops.
A remedy for the situation is provided by the Implicit Q Theorem.
Theorem 4.5 (The implicit Q theorem) Let $A \in \mathbb{R}^{n\times n}$. Let $Q = [q_1, \dots, q_n]$ and $V = [v_1, \dots, v_n]$ be orthogonal matrices that both similarly transform $A$ to Hessenberg form, $H = Q^TAQ$ and $G = V^TAV$. Let $k$ denote the smallest positive integer for which $h_{k+1,k} = 0$, with $k = n$ if $H$ is irreducible.
If $q_1 = v_1$ then $q_i = \pm v_i$ and $|h_{i,i-1}| = |g_{i,i-1}|$ for $i = 2, \dots, k$. If $k < n$, then $g_{k+1,k} = 0$.
Proof. Let $W := V^TQ$. Clearly, $W$ is orthogonal, and $GW = WH$. We first show that
$$w_i = We_i \in \operatorname{span}\{e_1, \dots, e_i\}, \qquad i \leq k, \tag{4.11}$$
i.e., that the first $k$ columns of $W$ form an upper triangular matrix. (Notice that orthogonal upper triangular matrices are diagonal with diagonal entries $\pm 1$.)
This is proved inductively. For $i = 1$ we have $w_1 = e_1$ by the assumption that $q_1 = v_1$. For $1 < i \leq k$ we assume that (4.11) is true for $w_1, \dots, w_{i-1}$ and use the equality $GW = WH$. The $(i-1)$-th column of this equation reads
$$Gw_{i-1} = GWe_{i-1} = WHe_{i-1} = \sum_{j=1}^{i} w_jh_{j,i-1}.$$
Solving for $w_ih_{i,i-1}$ and using the induction assumption together with the Hessenberg structure of $G$, we find
$$w_ih_{i,i-1} = Gw_{i-1} - \sum_{j=1}^{i-1} w_jh_{j,i-1} \in \operatorname{span}\{e_1, \dots, e_i\}.$$
As $h_{i,i-1} \neq 0$ for $i \leq k$, this proves (4.11).
Since the $w_i$, $i \leq k$, form an orthogonal upper triangular matrix, we have $w_i = \pm e_i$ for $i \leq k$, and hence $q_i = \pm v_i$. Moreover,
$$h_{i,i-1} = e_i^THe_{i-1} = e_i^TQ^TAQe_{i-1} = e_i^TQ^TVGV^TQe_{i-1} = w_i^TGw_{i-1} = \pm g_{i,i-1}.$$
Finally, if $k < n$, then
$$g_{k+1,k} = e_{k+1}^TGe_k = \pm\, e_{k+1}^TGWe_k = \pm\, e_{k+1}^TWHe_k = \pm \sum_{j=1}^{k} e_{k+1}^Tw_jh_{j,k} = 0,$$
since $w_j \in \operatorname{span}\{e_1, \dots, e_j\}$ implies $e_{k+1}^Tw_j = 0$ for $j \leq k$.
Golub and van Loan [6, p.347] write that “The gist of the implicit Q theorem is that if
QT AQ = H and Z T AZ = G are both unreduced Hessenberg matrices and Q and Z have
the same first column, then G and H are “essentially equal” in the sense that G = DHD
with D = diag(±1, . . . , ±1).”
We apply the Implicit Q Theorem in the following way: We want to compute the Hessenberg matrix $H_{k+1} = Z^TH_{k-1}Z$ where $ZR$ is the QR factorization of $M = H_{k-1}^2 - sH_{k-1} + tI$. The Implicit Q Theorem now tells us that we essentially get $H_{k+1}$ by any orthogonal similarity transformation $H_{k-1} \to Z_1^*H_{k-1}Z_1$ provided that $Z_1^*H_{k-1}Z_1$ is Hessenberg and $Z_1e_1 = Ze_1$.
Let $P_0$ be the Householder reflector with
$$P_0Me_1 = P_0(H_{k-1}^2 - sH_{k-1} + tI)e_1 = \alpha e_1.$$
Since only the first three elements of the first column $Me_1$ of $M$ are nonzero, $P_0$ has the structure
$$P_0 = \begin{bmatrix}
\times & \times & \times & & & \\
\times & \times & \times & & & \\
\times & \times & \times & & & \\
& & & 1 & & \\
& & & & \ddots & \\
& & & & & 1
\end{bmatrix}.$$
So,
$$H_{k-1}' := P_0^TH_{k-1}P_0 = \begin{bmatrix}
\times & \times & \times & \times & \times & \times & \times\\
\times & \times & \times & \times & \times & \times & \times\\
+ & \times & \times & \times & \times & \times & \times\\
+ & + & \times & \times & \times & \times & \times\\
& & & \times & \times & \times & \times\\
& & & & \times & \times & \times\\
& & & & & \times & \times
\end{bmatrix}.$$
We now reduce $P_0^TH_{k-1}P_0$ similarly to Hessenberg form the same way as we did earlier, by a sequence of Householder reflectors $P_1, \dots, P_{n-2}$. However, $P_0^TH_{k-1}P_0$ is a Hessenberg matrix up to the bulge at the top left. We take this structure into account when forming the $P_i = I - 2p_ip_i^T$. So, the structures of $P_1$ and of $P_1^TP_0^TH_{k-1}P_0P_1$ are
$$P_1 = \begin{bmatrix}
1 & & & & & & \\
& \times & \times & \times & & & \\
& \times & \times & \times & & & \\
& \times & \times & \times & & & \\
& & & & 1 & & \\
& & & & & 1 & \\
& & & & & & 1
\end{bmatrix}, \qquad
H_{k-1}'' = P_1^TH_{k-1}'P_1 = \begin{bmatrix}
\times & \times & \times & \times & \times & \times & \times\\
\times & \times & \times & \times & \times & \times & \times\\
0 & \times & \times & \times & \times & \times & \times\\
0 & + & \times & \times & \times & \times & \times\\
& + & + & \times & \times & \times & \times\\
& & & & \times & \times & \times\\
& & & & & \times & \times
\end{bmatrix}.$$
The transformation with P1 has chased the bulge one position down the diagonal. The
consecutive reflectors push it further by one position each until it falls out of the matrix
at the end of the diagonal. Pictorially, we have
$$H_{k-1}''' = P_2^TH_{k-1}''P_2 = \begin{bmatrix}
\times & \times & \times & \times & \times & \times & \times\\
\times & \times & \times & \times & \times & \times & \times\\
& \times & \times & \times & \times & \times & \times\\
& 0 & \times & \times & \times & \times & \times\\
& 0 & + & \times & \times & \times & \times\\
& & + & + & \times & \times & \times\\
& & & & & \times & \times
\end{bmatrix},$$
$$H_{k-1}'''' = P_3^TH_{k-1}'''P_3 = \begin{bmatrix}
\times & \times & \times & \times & \times & \times & \times\\
\times & \times & \times & \times & \times & \times & \times\\
& \times & \times & \times & \times & \times & \times\\
& & \times & \times & \times & \times & \times\\
& & 0 & \times & \times & \times & \times\\
& & 0 & + & \times & \times & \times\\
& & & + & + & \times & \times
\end{bmatrix},$$
$$H_{k-1}''''' = P_4^TH_{k-1}''''P_4 = \begin{bmatrix}
\times & \times & \times & \times & \times & \times & \times\\
\times & \times & \times & \times & \times & \times & \times\\
& \times & \times & \times & \times & \times & \times\\
& & \times & \times & \times & \times & \times\\
& & & \times & \times & \times & \times\\
& & & 0 & \times & \times & \times\\
& & & 0 & + & \times & \times
\end{bmatrix},$$
$$H_{k-1}'''''' = P_5^TH_{k-1}'''''P_5 = \begin{bmatrix}
\times & \times & \times & \times & \times & \times & \times\\
\times & \times & \times & \times & \times & \times & \times\\
& \times & \times & \times & \times & \times & \times\\
& & \times & \times & \times & \times & \times\\
& & & \times & \times & \times & \times\\
& & & & \times & \times & \times\\
& & & & 0 & \times & \times
\end{bmatrix}.$$
It is easy to see that the Householder vector $p_i$, $i < n-2$, has only three nonzero elements, at positions $i+1$, $i+2$, $i+3$. Of $p_{n-2}$ only the last two elements are nonzero. Clearly,
$$P_0P_1 \cdots P_{n-2}e_1 = P_0e_1 = Me_1/\alpha.$$
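To make the bulge chasing concrete, here is a MATLAB sketch of one Francis double shift step in the style of Golub and van Loan [6]; the function names are ours, and $H$ is assumed to be real, unreduced, and of order at least 3:

function H = francis_step(H)
% One implicit double shift QR step on a real Hessenberg matrix H.
n = size(H,1);
s = H(n-1,n-1) + H(n,n);                      % s = trace(G)
t = H(n-1,n-1)*H(n,n) - H(n-1,n)*H(n,n-1);    % t = det(G)
% First column of M = H^2 - s*H + t*I (only three nonzeros):
x = H(1,1)^2 + H(1,2)*H(2,1) - s*H(1,1) + t;
y = H(2,1)*(H(1,1) + H(2,2) - s);
z = H(2,1)*H(3,2);
for k = 0:n-3                                 % chase the bulge
    [v,beta] = house([x; y; z]);
    q = max(1,k);
    H(k+1:k+3,q:n) = H(k+1:k+3,q:n) - beta*v*(v'*H(k+1:k+3,q:n));
    r = min(k+4,n);
    H(1:r,k+1:k+3) = H(1:r,k+1:k+3) - beta*(H(1:r,k+1:k+3)*v)*v';
    x = H(k+2,k+1); y = H(k+3,k+1);
    if k < n-3, z = H(k+4,k+1); end
end
[v,beta] = house([x; y]);                     % final 2x2 reflector
H(n-1:n,n-2:n) = H(n-1:n,n-2:n) - beta*v*(v'*H(n-1:n,n-2:n));
H(:,n-1:n) = H(:,n-1:n) - beta*(H(:,n-1:n)*v)*v';
end

function [v,beta] = house(x)
% Householder vector v (v(1) = 1) and scalar beta such that
% (I - beta*v*v')*x is a multiple of e_1; avoids cancellation.
sigma = x(2:end)'*x(2:end); v = [1; x(2:end)];
if sigma == 0
    beta = 0;                                 % x already a multiple of e_1
else
    mu = sqrt(x(1)^2 + sigma);
    if x(1) <= 0, v1 = x(1) - mu; else, v1 = -sigma/(x(1) + mu); end
    beta = 2*v1^2/(sigma + v1^2);
    v = [v1; x(2:end)]/v1;
end
end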
Remark 4.3. Notice that in Algorithm 4.5 a double step is taken also if the eigenvalues of
$$G = \begin{bmatrix} h_{qq} & h_{qp}\\ h_{pq} & h_{pp} \end{bmatrix}$$
are real.
A run of an implementation qr2st of the double shift QR algorithm produces the following output:
>> PR=qr2st(H)
1 6 -1.7735e-01 -1.2807e+00
2 6 -5.9078e-02 -1.7881e+00
3 6 -1.6115e-04 -5.2705e+00
4 6 -1.1358e-07 -2.5814e+00
5 6 1.8696e-14 1.0336e+01
6 6 -7.1182e-23 -1.6322e-01
7 5 1.7264e-02 -7.5016e-01
8 5 2.9578e-05 -8.0144e-01
9 5 5.0602e-11 -4.6559e+00
10 5 -1.3924e-20 -3.1230e+00
11 4 1.0188e+00 -9.1705e-16
Let us count the flops of Algorithm 4.5. The Householder vectors have just three nonzero elements, so for a vector $x$ of length 3, the inner product $u^Tx$ costs 5 flops, and multiplying with 2 costs another one. The operation $x := x - u\gamma$, $\gamma = 2u^Tx$, costs 6 flops, altogether 12 flops.
In the $k$-th step of the loop there are $n-k$ of these applications from the left in step 13 and $k+4$ from the right in step 15. In this step there are thus about $12n + O(1)$ flops to be executed. As $k$ runs from 1 to $p-3$, we have about $12pn$ flops per QR step. Since $p$ runs from $n$ down to about 2, we arrive at $6n^3$ flops. If we assume that two steps are required per eigenvalue, the flop count for Francis' double step QR algorithm to compute all eigenvalues of a real Hessenberg matrix is $12n^3$. If also the eigenvector matrix is accumulated, two additional statements have to be inserted into Algorithm 4.5. After step 15 we have
1: $Q_{1:n,k+1:k+3} := Q_{1:n,k+1:k+3}P$;
and after the final reflector
1: $Q_{1:n,p-1:p} := Q_{1:n,p-1:p}P$;
which costs another $12n^3$ flops.
We earlier gave the estimate of $6n^2$ flops for a single Hessenberg QR step, see Algorithm 4.2. If these have to be spent in complex arithmetic, then the single shift Hessenberg QR algorithm is more expensive than the double shift Hessenberg QR algorithm, which is executed in real arithmetic.
Remember that the reduction to Hessenberg form costs $\frac{10}{3}n^3$ flops without forming the transformation matrix and $\frac{14}{3}n^3$ if this matrix is formed.
4.6 The symmetric tridiagonal QR algorithm
For a symmetric matrix, the reduction of Section 4.3 yields a symmetric tridiagonal matrix $A$. An explicit QR step with shift then reads:
1: Choose a shift $\mu$;
2: Compute the QR factorization $A - \mu I = QR$;
3: Update $A$ by $A = RQ + \mu I$.
Of course, this is done by means of plane rotations and by respecting the symmetric tridiagonal structure of $A$.
In the more elegant implicit form of the algorithm, we first compute the first Givens rotation $G_0 = G(1,2,\vartheta_0)$ of the QR factorization that zeros the $(2,1)$ element of $A - \mu I$,
$$\begin{bmatrix} c & s\\ -s & c \end{bmatrix}\begin{bmatrix} a_{11} - \mu\\ a_{21} \end{bmatrix}
= \begin{bmatrix} \ast\\ 0 \end{bmatrix}, \qquad c = \cos(\vartheta_0),\quad s = \sin(\vartheta_0). \tag{4.12}$$
This rotation is applied directly to $A$ as a similarity transformation; it creates a bulge outside the tridiagonal band, which is then chased down the diagonal by further rotations $G_1, \dots, G_{n-2}$ until the result
$$\overline{A} = Q^*AQ, \qquad Q = G_0G_1 \cdots G_{n-2},$$
is again tridiagonal. Notice that
$$Qe_1 = G_0G_1 \cdots G_{n-2}e_1 = G_0e_1.$$
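A sketch of forming this first rotation from the shift and the entries $a_{11}$, $a_{21}$ (the function name is ours):

function [c,s] = first_rotation(a11, a21, mu)
% Parameters of G_0 in (4.12): [c s; -s c]*[a11-mu; a21] = [r; 0].
r = hypot(a11 - mu, a21);
c = (a11 - mu)/r;
s = a21/r;
end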
Both the explicit and the implicit QR step thus form the same first plane rotation $G_0$. By the Implicit Q Theorem 4.5, the explicit and the implicit QR step compute essentially the same $\overline{A}$.
Suppose, for example, that the tridiagonal matrix decouples because $b_4 = 0$, into a first block (rows 1 to 3) and a second block (rows 4 to 6). The shift for the next step is determined from the elements $a_5$, $a_6$, and $b_6$. According to (4.12), the first plane rotation is determined from the shift and the elements $a_1$ and $b_1$. The implicit shift algorithm then chases the bulge down the diagonal. In this particular situation, the procedure finishes already in row/column 4 because $b_4 = 0$. Thus the shift, which is an approximation to an eigenvalue of the second block (rows 4 to 6), is applied to the wrong first block (rows 1 to 3). Clearly, this shift does not improve convergence.
If the QR algorithm is applied in its explicit form, then still the first block is not treated properly, i.e., with a (probably) wrong shift, but at least the second block is diagonalized rapidly.
Deflation is done as indicated in Algorithm 4.6. Deflation is particularly simple in the symmetric case since it just means that a tridiagonal eigenvalue problem decouples into two (or more) smaller tridiagonal eigenvalue problems. Notice, however, that the eigenvectors are still $n$ elements long.
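A sketch of a standard deflation test for a symmetric tridiagonal matrix with diagonal a and off-diagonal b; the tolerance is our own choice:

function b = deflate(a, b)
% Set negligible off-diagonal elements to zero; the problem then
% decouples into smaller tridiagonal eigenvalue problems.
for i = 1:length(b)
    if abs(b(i)) < eps*(abs(a(i)) + abs(a(i+1)))
        b(i) = 0;
    end
end
end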
4.7 Research
Still today the QR algorithm computes the Schur form of a matrix and is by far the
most popular approach for solving dense nonsymmetric eigenvalue problems. Multishift
and aggressive early deflation techniques have led to significantly more efficient sequential
implementations of the QR algorithm during the last decade. For a brief survey and a
discussion of the parallelization of the QR algorithm, see [7].
The three steps of the presented symmetric QR algorithm are (1) reduction of the original matrix to tridiagonal form, (2) computation of the eigenpairs of the tridiagonal matrix, and (3) back-transformation of the eigenvectors. In the ELPA project the first step has been successfully replaced by a two-stage procedure: transformation full to banded, and banded to tridiagonal. This approach improves the utilization of memory hierarchies [8, 3].
4.8 Summary
The QR algorithm is a very powerful algorithm to stably compute the eigenvalues and (if needed) the corresponding eigenvectors or Schur vectors. All steps of the algorithm cost $O(n^3)$ floating point operations, see Table 4.1. The one exception is the case where only eigenvalues are desired of a symmetric tridiagonal matrix. The linear algebra software package LAPACK [1] contains subroutines for all possible ways the QR algorithm may be employed.
                                                 nonsymmetric case        symmetric case
                                               without      with        without      with
                                               Schur vectors            eigenvectors
transformation to Hessenberg/tridiagonal form  10/3 n^3    14/3 n^3     4/3 n^3     8/3 n^3
real double step Hessenberg/tridiagonal QR     20/3 n^3    50/3 n^3     24 n^2      6 n^3
algorithm (2 steps per eigenvalue assumed)
total                                          10 n^3      25 n^3       4/3 n^3     9 n^3
We finish by repeating that the QR algorithm is a method for dense matrix problems. The reduction of a sparse matrix to tridiagonal or Hessenberg form produces fill-in, thus destroying the sparsity structure, which one almost always tries to preserve.
Bibliography
[1] E. Anderson, Z. Bai, C. Bischof, J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney, S. Ostrouchov, and D. Sorensen, LAPACK Users' Guide – Release 2.0, SIAM, Philadelphia, PA, 1994. (Software and guide are available from Netlib at URL http://www.netlib.org/lapack/).
[2] P. Arbenz and G. H. Golub, Matrix shapes invariant under the symmetric QR
algorithm, Numer. Linear Algebra Appl., 2 (1995), pp. 87–93.
[4] J. W. Demmel, Applied Numerical Linear Algebra, SIAM, Philadelphia, PA, 1997.
[6] G. H. Golub and C. F. van Loan, Matrix Computations, The Johns Hopkins
University Press, Baltimore, MD, 2nd ed., 1989.
[9] B. N. Parlett, The QR algorithm, Computing Sci. Eng., 2 (2000), pp. 38–42.
[10] H. Rutishauser, Solution of eigenvalue problems with the LR-transformation, NBS
Appl. Math. Series, 49 (1958), pp. 47–81.
[11] J. H. Wilkinson, The Algebraic Eigenvalue Problem, Clarendon Press, Oxford, 1965.