
Linear Algebra and its Applications 293 (1999) 85–131

www.elsevier.com/locate/laa

Spectral and structural analysis of high precision finite difference matrices
for elliptic operators

Stefano Serra Capizzano a,*, Cristina Tablino Possio b,1
a Dipartimento di Energetica ``S. Stecco'', Università di Firenze, Florence, Italy
b Dipartimento di Scienza dei Materiali, Università di Milano-Bicocca, Milan, Italy
Received 19 December 1997; accepted 29 December 1998
Submitted by R.A. Brualdi

Abstract
In this paper we study the structural properties of matrices coming from
high-precision Finite Difference (FD) formulae, when discretizing elliptic (or
semielliptic) differential operators $L(a, u)$ of the form
\[
(-)^k \frac{d^k}{dx^k}\left( a(x)\, \frac{d^k}{dx^k} u(x) \right).
\]
Strong relationships with Toeplitz structures and Linear Positive Operators (LPO) are
highlighted. These results allow one to give a detailed analysis of the eigenvalue
localisation/distribution of the arising matrices. The obtained spectral analysis is then
used to define optimal Toeplitz preconditioners in a very compact and natural way and,
in addition, to prove Szegő-like and Widom-like ergodic theorems for the spectra of the
related preconditioned matrices. A wide numerical experimentation, confirming the
theoretical results, is also reported. © 1999 Elsevier Science Inc. All rights reserved.
AMS classification: 65F10; 15A12; 15A18; 65N22

Keywords: Finite differences; Elliptic operators; Toeplitz and Vandermonde matrices; Linear
positive operators; Ergodic theorems; Matrix algebras; Preconditioning

* Corresponding author. Department of Energetica ``S. Stecco'', University of Florence, Via
Lombroso 6/17, 50134 Florence, Italy; e-mail: serra@mail.dm.unipi.it
1 E-mail: Cristina.Tablino.Possio@mater.unimi.it

0024-3795/99/$ – see front matter © 1999 Elsevier Science Inc. All rights reserved.
PII: S0024-3795(99)00022-1
86 S. Serra Capizzano, C. Tablino Possio / Linear Algebra Appl. 293 (1999) 85±131

1. Introduction

This paper concerns the main properties of matrices coming from a large
class of Finite Difference (FD) discretizations of template problems of the form
\[
\begin{cases}
(-)^k \dfrac{d^k}{dx^k}\left( a(x)\, \dfrac{d^k}{dx^k} u(x) \right) = f(x) & \text{on } \Omega = [0, 1], \\[4pt]
\text{Dirichlet B.C. on } \partial\Omega.
\end{cases}
\tag{1}
\]
We have focused our attention on 1D problems, even though the same analysis
can be carried out in a very similar way in the multidimensional case as
well (see Section 7).
Recently this matter was considered by Tilli [44] and Serra Capizzano [30],
by looking at the problem from two different points of view. The Tilli approach
is in the style of the Szegő results [17] and mainly concentrates on the
asymptotic estimates and ergodic theorems regarding eigen/singular values of a
large class of matrices that the author calls ``Locally Toeplitz'' matrices, which
contain Toeplitz structures and FD matrices as proper subsets.
The Serra Capizzano approach concerns an in-depth analysis (not
only asymptotic, but for any fixed dimension) of the localization of the
spectra of FD matrices, Toeplitz-based preconditioners and especially of the
related preconditioned matrices. In effect, the second approach gives
information on the ``relative distribution'' of preconditioned matrices, while the
first applies to the ``real distribution'' of the original matrices (FD matrices,
Toeplitz, etc.).
Due to applicative interest, here we follow and generalize this second point
of view by showing relationships with the theory of linear positive operators
(LPOs) [20]. In fact, such relationships are very welcome when considering
recent results [38] which indicate that many different problems from applied
and numerical mathematics can be treated in a uniform way when LPOs are
encountered.
Therefore, for any fixed $k$, we look for high-precision FD formulae by
formalising the problem in terms of the solution of special Vandermonde-like
systems (see also [23]). In such a way, by setting the mesh size $h = (n + 1)^{-1}$,
the discrete approximation of the template problem (1) gives rise to an $n \times n$
linear system whose coefficient matrix is constructed by using two sets of
values: the solution of a Vandermonde-like linear system and a proper set of
equispaced samplings of the function $a(x)$. When $a(x) \equiv 1$, the resulting
matrices enjoy the Toeplitz structure; otherwise these matrices are simply
banded.
In addition, when $a(x) > 0$ and when the ``weak formulation'' of the
problem (1) is considered, it is well known that the resulting bilinear functional
is self-adjoint and coercive [8]. We would like to find again these important
structural properties in the discrete approximation.
In the case of an abstract Faedo–Galerkin approach (Finite Elements, etc.)
the symmetry and the coerciveness are directly inherited from the resulting
finite dimensional bilinear forms, so that the obtained matrices are symmetric
and positive definite.
In the FD case the situation is quite different. In general the matrices
discretizing the template problem are not symmetric. Moreover, by imposing
symmetry, the structure arising in the case $a(x) \equiv 1$ is symmetric and Toeplitz,
but its positive definiteness is actually not guaranteed (we can exhibit FD
formulae approximating the operator $-d^2/dx^2$, which are related to Toeplitz
matrices with ``big'' negative eigenvalues (see Section 3.1)).
This fact is connected to an interesting representation theorem proved in
Section 3: if a symmetric FD formula approximates the operator $(-)^k d^{2k}/dx^{2k}$
with a precision order $\nu \ge 1$, then the resulting Toeplitz matrix sequence admits
a generating function $g(x)$ such that $g(x) - x^{2k} \approx x^{\nu+2k}$ (for the notion of
$\approx$ see Definition 3.1) in any small neighbourhood of $x = 0$. Therefore $g(x)$ must be
nonnegative (not identically zero) in a neighbourhood of zero, but not
necessarily everywhere.
Since we are interested in computation and since we want to use the
powerful preconditioned conjugate gradient (PCG) methods or multigrid methods,
we would like to deal with symmetric and positive definite matrices discretizing
problems (1). More specifically, we follow a natural approach for the
discretization of the operator appearing in (1): since it can be looked at as the
composition of two derivatives of order $k$, we leave the operator in ``divergence
form'' and we discretize the inner and outer derivatives separately. In fact, if
the operator in (1) is represented as
\[
(-)^k \sum_{j=0}^{k} \binom{k}{j}\, \frac{d^j}{dx^j} a(x)\; \frac{d^{2k-j}}{dx^{2k-j}} u(x)
\]
then the resulting FD discretizations lead to inherently nonsymmetric matrices,
while the previous choice can preserve the symmetry and the positive
definiteness.
In particular, these structural properties are guaranteed when the inner and
the outer derivatives are discretized through a conjugated pair of FD formulae
(see Definition 2.1) and the coefficient function $a(x)$ is positive. For this reason,
for any fixed number $q$ of distinct equispaced points, among all the possible
$q$-points formulae giving rise to symmetric structures, we choose the best
approximation FD formulae, that is those for which the resulting precision
order $\nu$ is maximized and equals $q + 1 - k$. The resulting FD matrices are
indicated as $A_n(a, m, k)$, where $n$ is the matrix dimension, $a$ and $k$ are those
appearing in (1) and $m = \lfloor q/2 \rfloor$ (see Sections 2 and 4).

For this class of FD formulae the following properties hold true.

· $A_n(a, m, k)$ shows a special dyadic decomposition
\[
A_n(a, m, k) = \sum_i a(\tilde{x}_i)\, Q_{n,i},
\]
$\{\tilde{x}_i\}$ being the set of equispaced samplings of the coefficient function
$a(x)$ and $Q_{n,i}$ being nonnegative definite matrices of rank one (dyads).
· By calling $S^{n \times n}$ the space of the real symmetric matrices, for any fixed $n$, $m$, $k$,
the operator
\[
A_n(\cdot, m, k) : C[0, 1] \to S^{n \times n}
\]
is linear and normally positive in the sense that it maps nonnegative
(positive) functions in nonnegative (positive) definite matrices.
· For any $a$ and nonnegative $b$ belonging to $C[0, 1]$, we find
\[
r = \inf_{Y} \frac{a}{b} \le \lambda(P_n(a, b)) \le \sup_{Y} \frac{a}{b} = R,
\qquad
Y = \{x \in [0, 1] : |b(x)| + |a(x)| > 0\},
\]
where $P_n(a, b)$ equals $A_n^{+}(b, m, k) A_n(a, m, k)$, $X^{+}$ denoting the
Moore–Penrose pseudo-inverse [24,26] of $X$ and $\lambda(X)$ any eigenvalue (not formally zero)
of $X$.
· When $a$ and $b$ are nonnegative and equivalent (i.e. $r, R \in (0, \infty)$), then the
topological closure $Z$ of the set of the eigenvalues of all the matrices $\{P_n(a, b)\}_n$
is contained in $[r, R] \cup \{0\}$ and contains the range of the function $a/b$.
· When $a$ and $b$ are nonnegative and equivalent and under some assumptions
on the zeros of $b$, if we denote by $\{\lambda_i^{(n)}\}_{i,n}$ all the complete sets of the
eigenvalues of the matrices $\{P_n(a, b)\}_n$, then the following relation holds true: for
any continuous function $F$ with bounded support,
\[
\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} F\left(\lambda_i^{(n)}\right) = \int_{[0,1]} F(g)\, dx,
\tag{2}
\]
where $g(x) = a(x)/b(x)$ over $\{x : b(x) > 0\}$ and $g(x)$ equals zero
where the function $b(x)$ vanishes. Notice that $g(x)$ is the functional
counterpart of the matrices $\{A_n^{+}(b, m, k) A_n(a, m, k)\}_n$ obtained by Moore–Penrose
pseudo-inversion [24,26], since $g(x)$ can be also defined as $b^{+} a$ where
$b^{+}(x) = 1/b(x)$ over the set where $b(x)$ is positive, and is zero where $b(x)$
is zero (a sort of pseudo-inversion of a nonnegative function).
The first two statements are important for establishing the exact rank of the
matrices $A_n(a, m, k)$, which is especially meaningful when $a(x)$ has zeros. The
third statement not only suggests natural preconditioners, but is a useful tool
as well for establishing a precise upper bound for the spectral condition
numbers (with respect to the Euclidean vector norm) of these matrices.
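As a rough illustration of the dyadic decomposition and of the bounds in the third statement, the following sketch treats only the simplest case k = 1 with the two-point stencil (-1, 1). The construction of the dyads, the midpoint sampling and the helper names `dyadic_fd_matrix` and `quad_form` are assumptions made for this example; they are not taken from the formal definitions given later in the paper. Since both matrices are sums of the same dyads with different weights, every Rayleigh quotient ratio must fall between the smallest and the largest sampled value of a/b.

```python
import random

def dyadic_fd_matrix(a, n):
    """Assemble sum_i a(x_i) * Q_i for k = 1 with the 2-point stencil (-1, 1).

    Assumption (illustrative only): Q_i = w_i w_i^T, where w_i is the stencil
    (-1, 1) restricted to the n interior nodes, and the samples are the
    midpoints x_i = (i + 1/2) h, i = 0..n, with h = 1/(n + 1).
    """
    h = 1.0 / (n + 1)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n + 1):
        # Nonzero entries of w_i: -1 at interior node i-1, +1 at interior node i.
        w = {}
        if i - 1 >= 0:
            w[i - 1] = -1.0
        if i <= n - 1:
            w[i] = 1.0
        ai = a((i + 0.5) * h)
        for r, wr in w.items():
            for s, ws in w.items():
                A[r][s] += ai * wr * ws   # rank-one (dyad) update
    return A

def quad_form(A, v):
    """Quadratic form v^T A v."""
    size = len(v)
    return sum(v[r] * A[r][s] * v[s] for r in range(size) for s in range(size))

a = lambda x: 1.0 + x * x   # sample coefficient, strictly positive
b = lambda x: 1.0           # coefficient defining the preconditioner

n = 20
A_a = dyadic_fd_matrix(a, n)
A_b = dyadic_fd_matrix(b, n)   # tridiagonal matrix with stencil (-1, 2, -1)

h = 1.0 / (n + 1)
samples = [(i + 0.5) * h for i in range(n + 1)]
r = min(a(x) / b(x) for x in samples)
R = max(a(x) / b(x) for x in samples)

random.seed(0)
ratios = []
for _ in range(200):
    v = [random.uniform(-1.0, 1.0) for _ in range(n)]
    ratios.append(quad_form(A_a, v) / quad_form(A_b, v))

print(r, min(ratios), max(ratios), R)
```

When b is strictly positive, the eigenvalues of the preconditioned matrix are stationary values of this ratio, which is why the same interval also contains them.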

We want to stress again that this analysis is very similar to the one followed
in [10,29,31,33,34] in a Toeplitz context (compare Theorem 3.1 [10], Theorem
2.4 [29], Theorem 2.1 [33] and Theorem 3.1 [34]), so revealing itself to be an
important confirmation of the unifying theory devised in [38].
Lastly, the fourth and the fifth statements are concerned with the asymptotic
distribution of the eigenvalues of the preconditioned matrices.
In more detail, if $a(x)$ is strictly positive (i.e. the problem is strictly elliptic),
then a simple choice of the preconditioner is $A_n(1, m, k)$. The related PCG
method is optimal (see Theorem 4.4) in the sense that the expected number of
iterations for reaching the solution within a preassigned accuracy is bounded
by a constant not depending on the dimension $n$: of course this is true if the
chosen accuracy does not depend on the dimension $n$.
Furthermore, the ergodic relation (2) is equivalent to stating that the
eigenvalues of $P_n(a, b)$ distribute as the above defined function $g(x)$: this fact implies
that the number of iterations predicted by the Axelsson–Lindskog bounds [2] is
very tight and precise.
Finally, the matrix $A_n(1, m, k)$ is a Toeplitz one, whose generating function is
explicitly known and, in particular, is nonnegative with a zero at $x = 0$.
Therefore very fast methods can be applied to these Toeplitz structures, such as:
· multigrid methods requiring $O(n(2q - 1))$ ops and $O(\log n)$ parallel steps
with $O(2nq)$ processors in the parallel PRAM model [27] of computation
[12,13];
· a recursive displacement-rank based technique requiring
$O(n \log(2q - 1) + \log(n)(2q - 1)\log^2(2q - 1))$ ops and $O(\log n)$ parallel steps
with $O(2nq)$ processors [5],
where $n$ is the matrix dimension and $2q - 1$ the matrix bandwidth.
Clearly, in order to obtain the total computational cost, the quoted costs
have to be multiplied by the number of iterations, which is constant with
respect to $n$, and added to the cost of a few matrix-vector multiplications (recall
the PCG algorithm). The final cost is of $O(n \log(2q - 1))$ ops and
$O(\log n\,(2q - 1))$ parallel steps with $O(2nq)$ processors in the PRAM model of
computation.
In conclusion, we have reduced the asymptotic cost of these band systems to
the cost of the band-Toeplitz systems for which the recent literature provides
very sophisticated algorithms [12,13,5].
These simple but rather powerful results can be strongly generalized with
regard to two main directions.
· When considering $a \in L^1$, we have to modify the definition of $a(\tilde{x}_i)$ as a
``mean value'' in order to deal with this irregular case (compare the
suggestions given in [30,38,36]).
· When considering $p$-dimensional elliptic problems on square regions (see
Section 7 for a specific example), we have to carefully extend the analysis
given in this paper through tensor arguments [41,36].

The paper is organized as follows. In Section 2 we relate the derivation of
FD formulae with Vandermonde-like systems. In Section 3 we discuss several
relations among FD matrices, band matrices and Toeplitz matrices, showing
how structured matrices can be used to describe specific aspects of FD
matrices. Section 4 is devoted to a structural and spectral analysis of the operators
$A_n(\cdot, m, k)$ and $A_n^{+}(\cdot, m, k) A_n(\cdot, m, k)$ and Section 5 to its applications to the
preconditioning problem. In Section 6 we discuss the complexity of these
methods, while in Section 7 we show how the proposed technique can be
extended to high dimensional/nonregular ($a \in L^1$) problems. In Section 8 we
perform several numerical experiments showing the effectiveness of the
proposed ideas (a complementary theoretical and numerical analysis of these and
other related preconditioning techniques is presented in a twin paper [39]).
Concluding remarks are given in Section 9.

2. FD formulae and Vandermonde matrices

In this section some general properties of FD formulae for the discretization
of the differential operator $d^k/dx^k$ by using $q$ equispaced mesh points are
highlighted.
It is natural to assume that the discretization of $(d^k/dx^k (u(x)))|_{x=x_r}$,
$1 \le k \le q - 1$, involves $m = \lfloor q/2 \rfloor$ mesh points less than $x_r$, $m = \lfloor q/2 \rfloor$ greater
than $x_r$, plus the point $x_r$ if $q$ is odd. More precisely, if $q = 2m + 1$, then the
mesh points are defined as $x_j = x_r + jh$, $j = -m, \ldots, m$, while if $q = 2m$ then
they are defined as $x_j = x_r + (j - 1/2)h$, $j = 1, \ldots, m$ and $x_j = x_r + (j + 1/2)h$,
$j = -m, \ldots, -1$.
By supposing $u$ regular enough and by making use of standard Taylor
expansions, we look for FD formula coefficients $c_j$ so that
\[
h^{-k} \sum_j c_j\, u(x_j) = \sum_{t=0}^{s} \phi_t\, h^{t-k} \left. \frac{d^t}{dx^t} u(x) \right|_{x=x_r}
\]
is an approximation of $(d^k/dx^k (u(x)))|_{x=x_r}$. This means that we have to choose
$c_j$ for which $\phi_k = 1$ and $\phi_t = 0$ for $t \le s - 1$, $t \ne k$ with $s \ge k + 1$. The precision
order of the FD formula associated with the coefficient vector $c \in \mathbb{R}^q$ is, at
least, $\nu = s - k$.
In such a way we obtain a sequence of linear systems of the form
\[
D_s V_s c = e_{k+1}, \qquad s = k + 1, \ldots, q,
\tag{3}
\]
where $e_{k+1}$ is the $(k + 1)$th vector of the canonical basis of $\mathbb{R}^s$, $D_s$ is an $s \times s$
diagonal matrix with $(D_s)_{i,i} = ((i - 1)!)^{-1}$ and $V_s$ is an $s \times q$ matrix made up by
the first $s$ rows of the matrix $V_q$, transpose of the Vandermonde matrix
generated by the distinct $q$ mesh point displacements.

If $q = 2m + 1$ the Vandermonde matrix points are $\pm j$, $j = 0, 1, \ldots, m$, so
that
\[
V_q =
\begin{bmatrix}
1 & 1 & \cdots & 1 & 1 & 1 & \cdots & 1 & 1 \\
(-m) & (-m+1) & \cdots & (-1) & 0 & (1) & \cdots & (m-1) & (m) \\
(-m)^2 & (-m+1)^2 & \cdots & (-1)^2 & 0 & (1)^2 & \cdots & (m-1)^2 & (m)^2 \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots & \vdots \\
(-m)^{2m} & (-m+1)^{2m} & \cdots & (-1)^{2m} & 0 & (1)^{2m} & \cdots & (m-1)^{2m} & (m)^{2m}
\end{bmatrix},
\tag{4}
\]
otherwise they are $\pm(j - 1/2)$, $j = 1, \ldots, m$, so that
\[
V_q =
\begin{bmatrix}
1 & 1 & \cdots & 1 & 1 & \cdots & 1 & 1 \\
(-m+\tfrac12) & (-m+\tfrac32) & \cdots & (-\tfrac12) & (\tfrac12) & \cdots & (m-\tfrac32) & (m-\tfrac12) \\
(-m+\tfrac12)^2 & (-m+\tfrac32)^2 & \cdots & (-\tfrac12)^2 & (\tfrac12)^2 & \cdots & (m-\tfrac32)^2 & (m-\tfrac12)^2 \\
\vdots & \vdots & & \vdots & \vdots & & \vdots & \vdots \\
(-m+\tfrac12)^{2m-1} & (-m+\tfrac32)^{2m-1} & \cdots & (-\tfrac12)^{2m-1} & (\tfrac12)^{2m-1} & \cdots & (m-\tfrac32)^{2m-1} & (m-\tfrac12)^{2m-1}
\end{bmatrix}.
\tag{5}
\]

Owing to the definition of the $D_s$ matrix and to the property of Vandermonde
matrices generated by distinct points, it is always possible to ensure that
$\mathrm{rank}(D_s V_s) = s$. In particular, when $s = q$, $D_q V_q \in \mathbb{R}^{q \times q}$ and $\det(D_q V_q) \ne 0$, so
that a unique $q$-points FD formula with precision order at least $q - k$ exists; in
addition, the coefficients $c_j$ are rational as a consequence of the Cramer
resolution rule.
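The square system (3) with s = q can be solved exactly in rational arithmetic, which also makes the rationality of the coefficients visible. The sketch below is only an illustration: the function names `mesh_displacements` and `fd_coefficients` are hypothetical, and plain Gauss-Jordan elimination is used as a convenient stand-in for whatever solver one prefers.

```python
from fractions import Fraction
from math import factorial

def mesh_displacements(q):
    """Displacements (in units of h) of the q equispaced points around x_r."""
    m = q // 2
    if q % 2 == 1:
        return [Fraction(j) for j in range(-m, m + 1)]      # -m, ..., 0, ..., m
    return [Fraction(2 * j + 1, 2) for j in range(-m, m)]    # -m + 1/2, ..., m - 1/2

def fd_coefficients(k, q):
    """Solve D_q V_q c = e_{k+1} exactly and return the q rational coefficients."""
    if not 1 <= k <= q - 1:
        raise ValueError("need 1 <= k <= q - 1")
    pts = mesh_displacements(q)
    # Augmented matrix: row t holds p_j^t / t! followed by the right-hand side.
    rows = [[p ** t / factorial(t) for p in pts] + [Fraction(int(t == k))]
            for t in range(q)]
    # Gauss-Jordan elimination in exact rational arithmetic.
    for col in range(q):
        piv = next(r for r in range(col, q) if rows[r][col] != 0)
        rows[col], rows[piv] = rows[piv], rows[col]
        pivot = rows[col][col]
        rows[col] = [x / pivot for x in rows[col]]
        for r in range(q):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return [rows[t][q] for t in range(q)]

print(fd_coefficients(2, 3))   # second derivative on three points
print(fd_coefficients(1, 2))   # first derivative on two staggered points
print(fd_coefficients(2, 5))   # second derivative on five points
```

The returned vector must still be scaled by the factor h^{-k} before it is applied to samples of u.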
Let $\mathrm{rev}(c)$ be the vector so that $\mathrm{rev}(c)_j = c_{-j}$ ($j = -m, \ldots, m$ if $q = 2m + 1$;
$j = \pm 1, \ldots, \pm m$ if $q = 2m$); due to the structural properties of the matrix $V_s$ the
following lemmas hold true.

Lemma 2.1 [40]. If $c \in \mathbb{R}^q$ is the coefficient vector related to an FD formula
discretizing the operator $d^k/dx^k$, $k \ge 1$, with precision order $\nu$, then
$d = (-)^k\, \mathrm{rev}(c)$ gives an FD formula for the same operator, showing the same
precision order.

For notational simplicity, we introduce the following de®nition.

Definition 2.1. Let $c \in \mathbb{R}^q$ be the coefficient vector related to an FD formula
discretizing the operator $d^k/dx^k$, $k \ge 1$, with precision order $\nu$; then the pair
$(c, d)$ with $d = (-)^k\, \mathrm{rev}(c)$ is named conjugated pair of FD formulae of
precision order $\nu$. Moreover, $d$ denotes the conjugated FD formula of $c$ and vice
versa.

In addition, the unique vector $c$ defining the maximal precision FD formula
is symmetric or antisymmetric with respect to its middle according to the
quantity $k \pmod 2$ being equal to 0 or 1.

Lemma 2.2 [40]. If $c \in \mathbb{Q}^q$ is the coefficient vector related to the maximal
precision FD formula discretizing $d^k/dx^k$, $k \ge 1$, then $\mathrm{rev}(c) = (-)^k c$. The
precision order $\nu$ equals $q - k + 1$ if $k + q$ is odd and equals $q - k$ if $k + q$ is
even.

Proof. Case $k$ even and $q = 2m + 1$ odd. It is evident that the symmetric vector $c$
with $c_{-j} = c_j$, $j = 1, \ldots, m$, satisfies all the even index equations of the system
(3) (see the matrix $V_q$ in (4)). By imposing these conditions, the following
reduced system is obtained
\[
D_{m+1}
\begin{bmatrix}
1 & 2 & \cdots & 2 \\
0 & & & \\
\vdots & & 2X_m & \\
0 & & &
\end{bmatrix}
\begin{bmatrix} c_0 \\ c_1 \\ \vdots \\ c_m \end{bmatrix}
= e_{(k/2+1)},
\tag{6}
\]
where
\[
D_{m+1} = \mathrm{diag}\left((0!)^{-1}, (2!)^{-1}, \ldots, ((2m)!)^{-1}\right),
\qquad
X_m = Y_m\, \mathrm{diag}(y_1, \ldots, y_m)
\]
and
\[
Y_m =
\begin{bmatrix}
1 & \cdots & 1 \\
y_1 & \cdots & y_m \\
\vdots & & \vdots \\
y_1^{m-1} & \cdots & y_m^{m-1}
\end{bmatrix}
\]
is the transpose of the Vandermonde matrix generated by the nonvanishing
distinct points $y_j = j^2$, $j = 1, \ldots, m$. Owing to the property of Vandermonde
matrices, the reduced system also shows a unique solution that must coincide
with the unique solution of the full system (3). It is also evident that the related
precision order is $q - k + 1$, due to the symmetry.
Case $k$ odd and $q = 2m$ even: see [40]. □

The usefulness of Lemma 2.2 is evident in the practical computation of the
maximal precision FD formula coefficients, but it is also of paramount
importance in defining some properties related to Toeplitz structures which are
discussed in Section 3.

Definition 2.2. Let $c \in \mathbb{Q}^q$ be the coefficient vector related to the maximal
precision FD formula discretizing $d^k/dx^k$, $k \ge 1$, with respect to $q \ge k + 1$
equispaced points; then the pair $(c, c)$ is named maximal precision conjugated
pair.

3. FD formulae and structured matrices

The aim of this section is to present some relationships existing between FD
matrices and Toeplitz matrices [17].

3.1. Toeplitz matrices and generating functions

Hereafter, the Toeplitz matrices are assumed to be generated by Lebesgue
integrable functions $f$ defined on the fundamental interval $[-\pi, \pi]$, in the sense
that the entry of the $n \times n$ Toeplitz matrix $T_n(f)$ along the $k$th diagonal is given
by the $k$th Fourier coefficient of $f$:
\[
a_k = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\, e^{-\mathrm{i}kx}\, dx, \qquad \mathrm{i}^2 = -1, \quad k \in \mathbb{Z}.
\]
If $f$ is real-valued, then we notice that the quoted definition implies that
$a_{-k} = \overline{a_k}$, so that all the matrices of the sequence $\{T_n(f)\}_n$ are Hermitian.
A deeper relation between the spectral structure of the class $\{T_n(f)\}_n$ and the
symbol $f$ is well described in the following fundamental results.

Theorem 3.1 [17,46]. Let $f : [-\pi, \pi] \to \mathbb{R}$ be a Lebesgue integrable function and
let us denote by $\{T_n(f)\}_n$ the corresponding family of Toeplitz matrices. Let $\lambda_i^{(n)}$
be the eigenvalues of $T_n(f)$. Then, for any continuous function $F \in C(\mathbb{R})$ with
bounded support, the following asymptotic formula holds true
\[
\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} F\left(\lambda_i^{(n)}\right) = \frac{1}{2\pi} \int_{-\pi}^{\pi} F(f(x))\, dx.
\tag{7}
\]
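A small numerical check of formula (7) can be made without computing any eigenvalue, by taking the test function F(t) = t^2 and using the identity that the sum of the squared eigenvalues of a symmetric matrix equals the trace of its square. This is a sketch under stated assumptions: the symbol f(x) = 2 - 2cos(x) and the helper names are choices made for the example, and F(t) = t^2 does not have bounded support, which is harmless here only because the eigenvalues stay in a bounded interval and F could be truncated outside it.

```python
import math

def toeplitz(coeffs, n):
    """n x n symmetric Toeplitz matrix; coeffs[d] is the entry on the d-th diagonal."""
    return [[coeffs[abs(i - j)] if abs(i - j) < len(coeffs) else 0.0
             for j in range(n)] for i in range(n)]

def mean_trace_of_square(T):
    """(1/n) * trace(T^2), i.e. the mean of the squared eigenvalues of symmetric T."""
    n = len(T)
    return sum(T[i][j] * T[j][i] for i in range(n) for j in range(n)) / n

def symbol_mean(F, f, samples=20000):
    """Midpoint rule for (1/(2*pi)) * integral over [-pi, pi] of F(f(x)) dx."""
    step = 2 * math.pi / samples
    return sum(F(f(-math.pi + (s + 0.5) * step)) for s in range(samples)) / samples

f = lambda x: 2 - 2 * math.cos(x)   # symbol with Fourier coefficients a_0 = 2, a_1 = a_{-1} = -1
F = lambda t: t * t

limit = symbol_mean(F, f)
for n in (10, 50, 250):
    print(n, mean_trace_of_square(toeplitz([2.0, -1.0], n)), limit)
```

For this symbol the left-hand side can be worked out by hand as 6 - 2/n, so the gap to the limit value 6 decays like 1/n.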

Theorem 3.2 [17,45]. Let $f$ and $T_n(f)$ stand as in Theorem 3.1. Let $\lambda(X)$ be the
generic eigenvalue of the matrix $X$ and let $m_f$ and $M_f$ be the essential infimum
and supremum of $f$, respectively, i.e. the $\inf f$ and $\sup f$ up to within zero
measure sets [28]. Then for any $n \in \mathbb{N}^{+}$ the following cases occur:
· $m_f < \lambda(T_n(f)) < M_f$ if $m_f < M_f$, or
· $\lambda(T_n(f)) = M$ if $m_f = M_f = M$.

A consequence of Theorem 3.2 is the strong characterization of the
asymptotic inertia of the matrices $\{T_n(f)\}_n$.

Corollary 3.1 [47]. Let $f$ and $T_n(f)$ stand as in Theorem 3.1. Let $\lambda_i^{(n)}$ be the
eigenvalues of $T_n(f)$ ordered in a nondecreasing way. Then, for any choice of
$a < b$, the following asymptotic formula
\[
\lim_{n \to \infty} \frac{1}{n}\, \#\{i : \lambda_i^{(n)} \in (a, b)\} = \frac{1}{2\pi}\, m\{x \in [-\pi, \pi] : f(x) \in (a, b)\}
\tag{8}
\]
holds true, provided that $m\{x \in [-\pi, \pi] : f(x) = a\} = m\{x \in [-\pi, \pi] : f(x) = b\} = 0$,
where $m\{\cdot\}$ denotes the usual Lebesgue measure on the real
line.

Finally, a further result concerning the asymptotic evaluation of the spectral
condition number of the matrices $\{T_n(f)\}_n$ is proved in the following theorem
due to Kac et al. [19] and generalized in [33,6]. The following preliminary
definition is needed.

Definition 3.1. Let $f$ and $g$ be two Lebesgue integrable functions. We write
$f \sim g$ over the interval $I$ if $f$ and $g$ are nonnegative over $I$ and there exist two
positive constants $c_1$ and $c_2$ so that $c_1 g \le f \le c_2 g$ almost everywhere in $I$. In
addition, we write $f \approx g$ over the interval $I$ if either $f \sim g$ or $-f \sim g$
or $f \sim -g$ or $-f \sim -g$ over $I$.

Theorem 3.3 [19,33,6]. If $f$ is a Lebesgue integrable function so that $f \sim g$
over $[-\pi, \pi]$ and $g = \prod_{j=1}^{s} |x - x_j|^{\alpha_j}$, $\alpha_j > 0$, $s \in \mathbb{N}$, then the spectral condition
number $\kappa_2(T_n(f))$ of the Toeplitz matrices $\{T_n(f)\}_n$ is asymptotic to
$n^{\alpha}$ with $\alpha = \max_j \alpha_j$. Moreover, when $f \sim x^{2k}$ over $[-\delta, \delta]$ for some positive $\delta$,
then the condition number of $\{T_n(f)\}_n$ grows at least as $n^{2k}$.

In the case where $f$ is complex-valued, a similar theory exists [25,1,45],
though the resulting Toeplitz matrices are not Hermitian and the ergodic result
concerns singular values in place of eigenvalues.

Theorem 3.4 [25,1,45,46]. Let $f : [-\pi, \pi] \to \mathbb{C}$ be a Lebesgue integrable function
and let us denote by $\{T_n(f)\}_n$ the corresponding family of Toeplitz matrices. Let
$\sigma_i^{(n)}$ be the singular values of $T_n(f)$. Then, for any function $F \in C(\mathbb{R})$ with
bounded support, the following asymptotic formula holds true:
\[
\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} F\left(\sigma_i^{(n)}\right) = \frac{1}{2\pi} \int_{-\pi}^{\pi} F(|f(x)|)\, dx.
\tag{9}
\]

In the following, we make heavy use of the spectral Szegő–Tyrtyshnikov
theory for Toeplitz matrices in order to give a spectral asymptotic
characterization to the FD matrices coming from the discretization of some constant
functional coefficient differential operators.

For instance, let us consider a vector $c \in \mathbb{R}^q$ ($q = 2m + 1$) related to an FD
formula as in the previous section for the discretization of the operator
$d^{2k}/dx^{2k}$. The discrete approximation of the problem $(-)^k d^{2k}/dx^{2k}(u) = f$ leads
to a linear system whose coefficient matrix is a band-Toeplitz matrix having on
its $r$th row the vector $(0, \ldots, 0, \tilde{c}, 0, \ldots, 0)$, where $\tilde{c} = (-)^k c$.
By supposing that the coefficient $\tilde{c}_0$ of the vector
$\tilde{c} = (\tilde{c}_{-m}, \ldots, \tilde{c}_{-1}, \tilde{c}_0, \tilde{c}_1, \ldots, \tilde{c}_m)$ lies on the main diagonal of the matrix,
the resulting Toeplitz matrix $T_n(p_{\tilde{c}})$ shows a trigonometric complex-valued polynomial
$p_{\tilde{c}}(x) = \sum_{j=-m}^{m} \tilde{c}_j e^{\mathrm{i}jx}$ as generating function.
If we consider the FD formula of maximal precision order $\nu = q - 2k + 1$,
then in light of Lemma 2.2 the vector $c$ is so that $c = \mathrm{rev}(c)$, that is $c_{-j} = c_j$ $\forall j$.
This fact has a noteworthy impact on the generating function $p_{\tilde{c}}$, which becomes a
real-valued trigonometric polynomial $p_{\tilde{c}}(x) = \tilde{c}_0 + 2 \sum_{j=1}^{m} \tilde{c}_j \cos(jx)$ with
$p_{\tilde{c}}(-x) = p_{\tilde{c}}(x)$ for any $x$. In practice, in light of the Szegő theory (Theorems 3.1
and 3.2), we have strong information concerning the distribution and
localization of the eigenvalues of the related $n \times n$ FD matrices.
For instance, let us consider the discretization of $-d^2/dx^2$ by using the
formula of maximal precision order $\nu = 2$ over three points: $\tilde{c} = (-1, 2, -1)$.
The related Toeplitz matrix $A_n(a, m, k) = A_n(1, 1, 1) = T_n(p_{\tilde{c}})$ is generated by
$p_{\tilde{c}}(x) = 2 - 2\cos(x)$ and therefore, in light of the Szegő Theorem 3.2, we have
the eigenvalues belonging to the open interval $(0, 4)$ and the minimal
eigenvalue of $T_n(p_{\tilde{c}})$ going to 0 as $n$ tends to infinity asymptotically to $n^{-2}$ [33].
Consequently, we deduce that the matrices $A_n(1, 1, 1)$ are positive definite for
any dimension.
Nevertheless, this property is not general for FD matrices coming from
the FD discretization of elliptic operators by means of symmetric FD formulae
(refer to Section 3.3).

3.2. The approximation of the ``continuous operator''

We want to discover some relationships between the ``continuous'' operator
\[
(-)^k\, d^{2k}/dx^{2k} : C^{2k}[0, 1] \to C[0, 1]
\]
and the FD matrices discretizing the operator itself. In order to avoid
cumbersome notations we first consider the case $k = 1$. Setting $f_j(x) = e^{\mathrm{i}j\pi x}$, $j \in \mathbb{Z}$,
$x \in [0, 1]$, we find that $-d^2/dx^2 (f_j) = (j\pi)^2 f_j$. Therefore the function $f_j(x)$ is an
eigenfunction and $(j\pi)^2$ is the related eigenvalue.
Notice that these eigenfunctions are the continuous counterpart of the
Fourier eigenvectors of the circulant algebras [9]. In addition, the set of the
eigenvalues of the continuous operator can be looked at as a sampling of the
function $x^2$ over the grid $j\pi$, $j \in \mathbb{N}$.
Let us consider the vector $\tilde{c} = (-1, 2, -1)$ defining the 3-points FD formula
of maximal precision order $\nu = 2$ with respect to the operator $-d^2/dx^2$. The
discrete operator is represented by $h^{-2} \cdot T_n(p_{\tilde{c}})$ where $T_n(p_{\tilde{c}})$ is the $n \times n$
Toeplitz matrix generated by the trigonometric polynomial
$p_{\tilde{c}}(x) = 2 - 2\cos(x) = 4\sin^2(x/2)$, whose eigenvalues are given by
\[
h^{-2}\, 4 \sin^2\left( \frac{j\pi}{2(n + 1)} \right) = (j\pi)^2 + O(h^2), \qquad h = (n + 1)^{-1},
\]
when $j$ is a constant with respect to $n$. Moreover we observe that $x^2 - p_{\tilde{c}}(x) \approx x^4$
over any neighbourhood of $x = 0$ small enough. This is not a specific property
of the quoted FD formula, but is a general one as formally stated in the
subsequent proposition.
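The closed-form eigenvalues quoted above can be verified directly: the sketch below checks that the sine vectors with entries sin(i j pi/(n+1)), a standard fact for the tridiagonal stencil (-1, 2, -1), are eigenvectors, and then compares the scaled eigenvalues with (j pi)^2. The helper names `apply_stencil` and `check_eigenpair` are illustrative choices.

```python
import math

def apply_stencil(v):
    """Multiply the tridiagonal Toeplitz matrix with stencil (-1, 2, -1) by v."""
    n = len(v)
    return [2 * v[i]
            - (v[i - 1] if i > 0 else 0.0)
            - (v[i + 1] if i < n - 1 else 0.0)
            for i in range(n)]

def check_eigenpair(n, j):
    """Return (lambda_j, residual) for the candidate j-th eigenpair, 1 <= j <= n."""
    theta = j * math.pi / (n + 1)
    lam = 4 * math.sin(theta / 2) ** 2
    v = [math.sin((i + 1) * theta) for i in range(n)]
    Tv = apply_stencil(v)
    residual = max(abs(Tv[i] - lam * v[i]) for i in range(n))
    return lam, residual

n = 200
h = 1.0 / (n + 1)
for j in (1, 2, 3):
    lam, res = check_eigenpair(n, j)
    # Scaled discrete eigenvalue, continuous eigenvalue, eigenpair residual.
    print(j, lam / h ** 2, (j * math.pi) ** 2, res)
```

The residual stays at rounding level for every j, while the agreement with (j pi)^2 is only good for j small compared with n, in line with the requirement that j be constant with respect to n.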
It is worthwhile noting that the polynomials $\{p_c\}$, where $c$ is a symmetric
vector, share some structural properties which are direct consequences of the
features of the considered FD formulae.

Proposition 3.1. Let $c \in \mathbb{R}^q$ ($q = 2m + 1$) be such that $c_j = c_{-j}$ for each
$j = 1, \ldots, m$ and let $p_c(x)$ be the even real-valued trigonometric polynomial
\[
c_0 + 2 \sum_{j=1}^{m} c_j \cos(jx).
\]
Then the following two are equivalent:
· $x^2 - p_c(x) \approx x^{2+\nu}$ with $\nu \ge 1$ in a neighbourhood of $x = 0$,
· the FD formula related to $p_c(x)$ discretizing the operator $-d^2/dx^2$ has a
precision order equal to $\nu$.

Proof. Let us consider a vector $c$ such that $x^2 - p_c(x) \approx x^{2+\nu}$, $\nu \ge 1$, holds true.
From a direct calculation of the Taylor expansion
\[
x^2 - p_c(x) = -\left( c_0 + 2 \sum_{j=1}^{m} c_j \right) + \left( 1 + \sum_{j=1}^{m} c_j j^2 \right) x^2
- 2 \sum_{i=2}^{m} \frac{(-)^i}{(2i)!} \left( \sum_{j=1}^{m} c_j j^{2i} \right) x^{2i} + R_m(x),
\]
$R_m(x) = r_m x^{2m+2} + o(x^{2m+2})$, $r_m \ne 0$, it follows that the relation $x^2 - p_c(x) \approx x^{2+\nu}$
is equivalent to fulfilling the first $(2 + \nu)/2$ equations of the reduced
Vandermonde-like system (6). Notice that the regularity of the involved functions and
the symmetry of $c$ imply that $\nu$ is an even number and therefore the minimal
precision order is $\nu = 2$. □

In the same way a similar statement can be proved for the discretization of
the operator $(-)^k d^{2k}/dx^{2k}$.

Proposition 3.2. Let $c \in \mathbb{R}^q$ ($q = 2m + 1$) be such that $c_j = c_{-j}$ for each
$j = 1, \ldots, m$ and let $p_c(x)$ be the even real-valued trigonometric polynomial
\[
c_0 + 2 \sum_{j=1}^{m} c_j \cos(jx).
\]
Then the following two are equivalent:
· $x^{2k} - p_c(x) \approx x^{2k+\nu}$ with $\nu \ge 1$ in a neighbourhood of $x = 0$,
· the FD formula related to $p_c(x)$ discretizing the operator $(-)^k d^{2k}/dx^{2k}$ has a
precision order equal to $\nu$.

Corollary 3.2. Let $c \in \mathbb{R}^q$ ($q = 2m + 1$) be such that $c_j = c_{-j}$ for each
$j = 1, \ldots, m$ and let $p_c(x)$ be the even real-valued trigonometric polynomial
\[
c_0 + 2 \sum_{j=1}^{m} c_j \cos(jx).
\]
If the condition $x^{2k} - p_c(x) \approx x^{2k+\nu}$, $\nu \ge 1$, holds locally in a neighbourhood of
$x = 0$, then the matrices $T_n(p_c)$ are ill-conditioned and the related spectral
condition number grows at least as $n^{2k}$.

Proof. It is a direct consequence of Theorem 3.3. □

In other words, the relation $x^{2k} - p_c(x) \approx x^{2k+\nu}$ is a different way of writing
and looking at the Vandermonde-like linear systems considered in Section 2, so
that a more precise FD formula is simply a more precise local expansion of the
spectral function $x^{2k}$ in a neighbourhood of $x = 0$.
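The link between the precision order and the local expansion of the symbol can be checked mechanically: for a symmetric stencil, one expands the cosine polynomial in even powers of x and looks for the first power at which it departs from the monomial of degree 2k. The sketch below does this in exact rational arithmetic; the function name `symbol_order`, its argument layout and the cut-off `max_terms` are assumptions made for the example.

```python
from fractions import Fraction

def symbol_order(c_half, k=1, max_terms=20):
    """Return nu such that x^(2k) - p_c(x) starts at the power x^(2k + nu).

    c_half = [c_0, c_1, ..., c_m] describes the symmetric stencil
    (c_m, ..., c_1, c_0, c_1, ..., c_m), so p_c(x) = c_0 + 2 * sum_j c_j cos(j x).
    Returns None when p_c does not match x^(2k) up to order 2k (no consistency).
    """
    c0 = Fraction(c_half[0])
    rest = [Fraction(c) for c in c_half[1:]]

    def taylor_coeff(i):
        # Exact coefficient of x^(2i) in p_c(x).
        if i == 0:
            return c0 + 2 * sum(rest)
        fact = 1
        for t in range(2, 2 * i + 1):
            fact *= t
        moment = sum(c * (j + 1) ** (2 * i) for j, c in enumerate(rest))
        return Fraction(2 * (-1) ** i, fact) * moment

    for i in range(max_terms):
        target = 1 if i == k else 0        # coefficient of x^(2i) in x^(2k)
        if target - taylor_coeff(i) != 0:
            return 2 * (i - k) if i > k else None
    return None

print(symbol_order([2, -1]))                                          # stencil (-1, 2, -1)
print(symbol_order([Fraction(5, 2), Fraction(-4, 3), Fraction(1, 12)]))  # 5-point stencil
print(symbol_order([-1, 1, Fraction(-1, 2)]))                         # (1/2)(-1, 2, -2, 2, -1)
```

The three stencils give orders 2, 4 and 2 respectively: the second one is the five-point formula of maximal precision for the second derivative, while the third shows that using more points does not by itself raise the order.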
Therefore, in the language of numerical analysis of differential equations,
the condition $p_c(x) \sim x^{2k}$ is the consistency condition, while the relation
$x^{2k} - p_c(x) \approx x^{2k+\nu}$ establishes that a consistent FD formula has a precision order
equal to $\nu$.
Notice that Corollary 3.2 tells us that a condition number growing as $n^{2k}$ is
necessary for the consistency condition and therefore is inherent in every good
FD discretization scheme.
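For the three-point formula with k = 1 this growth can be observed directly from the closed-form eigenvalues 4 sin^2(j pi / (2(n+1))) given in Section 3.2. The following sketch (the function name `condition_number` is an illustrative choice) prints the spectral condition number together with its ratio to n^2, which settles near 4/pi^2.

```python
import math

def condition_number(n):
    """Spectral condition number of the n x n tridiagonal Toeplitz matrix
    with stencil (-1, 2, -1), computed from its closed-form eigenvalues."""
    lam_min = 4 * math.sin(math.pi / (2 * (n + 1))) ** 2
    lam_max = 4 * math.sin(n * math.pi / (2 * (n + 1))) ** 2
    return lam_max / lam_min

for n in (100, 200, 400, 800):
    kappa = condition_number(n)
    print(n, kappa, kappa / n ** 2)
```

Doubling n multiplies the condition number by roughly four, which is the n^{2k} behaviour with k = 1.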
Finally, in the light of these remarks, we can say that the FD discretization is
a ``local approximation'' of the continuous operators in the sense that the
discrete eigenvalue function $p_c(x)$ tends toward the continuous eigenvalue
function $x^{2k}$. Here the convergence is not intended as a global convergence in a
given functional topology but it is a convergence in the sense of the asymptotic
Taylor expansion in a neighbourhood of $x = 0$: this ``locality'' in the
approximation has a spectral interpretation that we will highlight in the next section.

3.3. Local and global positive definiteness

In light of the last propositions in Section 3, it is evident that the consistency
of an FD formula, together with the symmetry of the associated vector $c$, imposes
that the generating function of the Toeplitz matrices be locally nonnegative.
Nevertheless, the global nonnegativity of the polynomial $p_c$ is not implied and
in fact is not necessary.
By considering the FD formula related to $\tilde{c} = (-1, 2, -1)$, the generating
function is the polynomial $p_{\tilde{c}}(x) = 2 - 2\cos(x)$ which is nonnegative and not
identically zero. So, by virtue of the Szegő Theorem 3.2, it follows that the
related FD matrices $\{T_n(p_{\tilde{c}})\}_n$ are positive definite for any dimension (refer to
Section 3.1).
Now, let us consider the 5-point FD formula related to the choice of the vector $\tilde c = \frac{1}{2}(-1, 2, -2, 2, -1)$. This formula has precision order $\nu = 2$ and gives rise to a family of Toeplitz matrices whose generating function is $p_{\tilde c}(x) = \cos(x)(2 - 2\cos(x))$. This generating function is, as expected, nonnegative in a neighbourhood of $x = 0$, but attains a negative minimum $m_{p_{\tilde c}} = -4$ at $x = \pi$ and a positive maximum $M_{p_{\tilde c}} = 1/2$ at $x = \pi/3$. Therefore, again in light of the Szegő Theorem 3.2, the eigenvalues of the related Toeplitz matrices belong to the open interval $(-4, 1/2)$. So, for $n$ large enough, the matrices of $\{T_n(p_{\tilde c})\}_n$ are indefinite and have $N^- = O(n)$ negative eigenvalues, with the minimal one tending to $-4$ as the dimension $n$ tends to infinity. Moreover, the number of negative eigenvalues $N^-$ is not negligible: by virtue of the Widom Corollary 3.1 and of the fundamental theorem of algebra [28], $N^-$ is asymptotic to
$$\frac{n}{2\pi}\, m\{x \in [-\pi, \pi] : p_{\tilde c}(x) < 0\} = \frac{n}{2}.$$
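The two examples above can be checked numerically. The following is only a minimal sketch, assuming Python and a uniform midpoint grid on $[-\pi, \pi]$ (both are our illustrative choices, as are the names `symbol` and `N`): it evaluates the generating functions of the 3-point and 5-point formulae and estimates the minimum, the maximum and the fraction of $[-\pi, \pi]$ where the 5-point symbol is negative.

```python
import math

def symbol(coeffs, x):
    """Evaluate p_c(x) = sum_j c_j e^{i j x} for a symmetric real stencil
    centred at j = 0; the imaginary parts cancel, so cosines suffice."""
    m = len(coeffs) // 2
    return sum(c * math.cos((j - m) * x) for j, c in enumerate(coeffs))

c3 = [-1.0, 2.0, -1.0]                 # 3-point formula
c5 = [-0.5, 1.0, -1.0, 1.0, -0.5]      # 5-point formula (1/2)(-1, 2, -2, 2, -1)

N = 20000                              # illustrative grid resolution
grid = [-math.pi + 2.0 * math.pi * (i + 0.5) / N for i in range(N)]
vals3 = [symbol(c3, x) for x in grid]
vals5 = [symbol(c5, x) for x in grid]

min3 = min(vals3)                      # should be nonnegative
min5, max5 = min(vals5), max(vals5)    # should approach -4 and 1/2
neg_fraction = sum(v < 0 for v in vals5) / N   # should approach 1/2

print(min3 >= 0.0, round(min5, 3), round(max5, 3), round(neg_fraction, 3))
```

The negative fraction multiplied by $n$ is the quantity that the Widom-type argument identifies as the asymptotic number of negative eigenvalues.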
To sum up, while positive definiteness is a structural property in the Faedo-Galerkin approach, it does not always hold in the case of FD matrices. For this reason, and for practical purposes, we will consider only those FD formulae which give rise to positive definite matrices in the case $a(x) \equiv 1$. To do this, the main tool is to take into account only those formulae derived through a conjugated pair of FD formulae according to Definition 2.1.

3.4. FD formulae and convolution masks

Let us suppose that $c \in \mathbb{R}^{q_1}$ and $d \in \mathbb{R}^{q_2}$ are two coefficient vectors associated with two FD formulae discretizing the operator $d^k/dx^k$ with $k \ge 1$ and precision order $\nu_1$ and $\nu_2$, respectively. Note that we consider both $q_1$ and $q_2$ odd when $k$ is even and vice versa. In order to calculate the quantity $z_r$ obtained as a discrete approximation of
$$(-)^k \left. \frac{d^k}{dx^k}\left( a(x)\, \frac{d^k}{dx^k} u(x) \right) \right|_{x = x_r},$$
we make use of the two quoted formulae, applying the former to obtain the discretization of the inner operator and the latter to obtain the discretization of the outer operator. More precisely, by calling $v(x) = d^k/dx^k\,(u(x))$, we have
$$v(x_r) = \sum_{j=1}^{q_1} \frac{c_j}{h^k}\, u(x_{r+j-s_1}) + O(h^{\nu_1})$$
so that
$$z_r = \frac{(-)^k}{h^{2k}} \sum_{i=1}^{q_2} d_i\, a(x_{r+i-s_2}) \left( \sum_{j=1}^{q_1} c_j\, u(x_{r+i+j-s_2-s_1}) \right), \qquad (10)$$
where
$$s_i = \begin{cases} m_i + 1 & \text{if } q_i = 2m_i + 1,\\ m_i + 1/2 & \text{if } q_i = 2m_i, \end{cases}$$
and where the global precision is $\nu = \min(\nu_1, \nu_2)$ under the assumption $a \in C^k[0, 1]$.
This expression has a matrix interpretation as follows: call $a_r$ the $q_2$-dimensional vector of the evaluations of $a(x)$ as in (10) and $a_r \circ d$ the $q_2$-dimensional vector obtained from the componentwise product between $a_r$ and $d$. Call $T$ the $q_2 \times (q_1 + q_2 - 1)$ Toeplitz matrix having the vector $(c, 0, \ldots, 0)$ on its first row. In such a way the quantity $z_r$ can be represented in a compact way as a weighted convolution
$$z_r = \frac{(-)^k}{h^{2k}}\, (a_r \circ d)^T\, T\, u_r, \qquad (11)$$
where the vector $u_r \in \mathbb{R}^{q_1 + q_2 - 1}$ is constructed from the equispaced evaluations of $u(x)$ displayed in (10).
It is also evident that the quantity $z_r$ represents the left hand side of the $r$th row of the linear system arising from the discretization of the template problem (1). Therefore, according to the expression (11), we can completely define the related coefficient matrix $A_n(a, c, d, k)$.
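The equivalence between the double sum (10) and the weighted convolution (11) can be illustrated with a small sketch. It assumes Python with 0-based local indexing of $a_r$, $c$, $d$ and $u_r$ (so the shifts $s_1$, $s_2$ are already absorbed); the function names and the sample data are hypothetical and chosen only for illustration.

```python
def z_double_sum(a_r, c, d, u_r, k, h):
    """Direct evaluation of (10): outer sum over d_i a_i, inner sum over c_j u_{i+j}."""
    q1, q2 = len(c), len(d)
    total = 0.0
    for i in range(q2):
        inner = sum(c[j] * u_r[i + j] for j in range(q1))
        total += d[i] * a_r[i] * inner
    return (-1) ** k / h ** (2 * k) * total

def z_weighted_convolution(a_r, c, d, u_r, k, h):
    """Evaluation of (11): (a_r o d)^T T u_r, with T the q2 x (q1 + q2 - 1)
    Toeplitz matrix whose first row is (c, 0, ..., 0)."""
    q1, q2 = len(c), len(d)
    width = q1 + q2 - 1
    # Row i of T is the vector c shifted i places to the right.
    T = [[c[col - i] if 0 <= col - i < q1 else 0.0 for col in range(width)]
         for i in range(q2)]
    weights = [a_r[i] * d[i] for i in range(q2)]      # componentwise product
    Tu = [sum(T[i][col] * u_r[col] for col in range(width)) for i in range(q2)]
    return (-1) ** k / h ** (2 * k) * sum(w * t for w, t in zip(weights, Tu))

# Illustrative data for k = 1: c = (-1, 1) and d = (-1)^k rev(c) = (-1, 1).
k, h = 1, 0.5
c, d = [-1.0, 1.0], [-1.0, 1.0]
a_r = [2.0, 3.0]              # a at the two half-grid points around x_r
u_r = [1.0, 4.0, 9.0]         # u at x_{r-1}, x_r, x_{r+1}
z1 = z_double_sum(a_r, c, d, u_r, k, h)
z2 = z_weighted_convolution(a_r, c, d, u_r, k, h)
print(z1, z2)
```

With these data both routines reduce to the classical 3-point approximation of $-(a u')'$, namely $[a_{r-1/2}(u_r - u_{r-1}) - a_{r+1/2}(u_{r+1} - u_r)]/h^2$.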

Definition 3.2. The symbol $A_n(a, c, d, k)$ denotes the matrix discretizing the problem (1) through the FD formula coefficient vector $c$ for the inner derivative and the FD formula coefficient vector $d$ for the outer derivative, according to Eq. (11) for the $r$th row, where the factor $h^{2k}$ is neglected because it is included in the right hand side of the associated linear system.
When $d$ equals $c$ and $(c, c)$ is the maximal precision conjugated pair, then the matrix $A_n(a, c, c, k)$ is denoted by the shorter symbol $A_n(a, m, k)$ with $m = \lfloor q/2 \rfloor$ (or simply $A_n(a)$ when it is clear from the context).

The weighted convolution representation is useful for giving an interesting dyadic decomposition of the matrices $A_n(a, c, d, k)$, as highlighted in the subsequent section.

3.5. The dyadic decomposition of $A_n(a, c, d, k)$

Let $c \in \mathbb{R}^{q_1}$, $d \in \mathbb{R}^{q_2}$ and let $G_n = \{\tilde x_i\}$ be the set of the equispaced samplings of the coefficient function $a(x)$ with respect to the FD formula coefficient vector $d$ and let $\hat a_i(x)$ be the piecewise linear and continuous function so that
$$\hat a_i(x) = \begin{cases} 1 & \text{if } x = \tilde x_i,\\ 0 & \text{otherwise on } G_n. \end{cases}$$

The following representation theorem holds true.

Theorem 3.5. For every dimension $n$, the matrix $A_n(a, c, d, k)$ can be expressed as
$$\sum_i a(\tilde x_i)\, A_n(\hat a_i, c, d, k),$$
where, for any $i$,
$$A_n(\hat a_i, c, d, k) = (-)^k Q_n(c, \mathrm{rev}(d), i) = (-)^k \begin{bmatrix} 0 & 0 & 0\\ 0 & D_i & 0\\ 0 & 0 & 0 \end{bmatrix}, \qquad D_i = \mathrm{rev}(d)\, c^T \in \mathbb{R}^{q_2 \times q_1},$$
and $\# G_n = n + q_2 - 1$.

Proof. As a consequence of (11) the $r$th row of the matrix $A_n(a, c, d, k)$ is given by
$$(-)^k\, [0, \ldots, 0, (a_r \circ d)^T T, 0, \ldots, 0],$$
where the number of the zero entries at the beginning of the row is $r - \alpha$, with $\alpha$ an absolute constant depending on the chosen FD formulae (with the convention that negative values are assumed to be zero).
Due to the linearity of the componentwise product, we can consider directly the matrices $A_n(\hat a_i, c, d, k)$.
Case $q_2$ odd ($k$ even): In this case $(a_r)_j = a(x_{r+j})$, $j = -\lfloor q_2/2 \rfloor, \ldots, \lfloor q_2/2 \rfloor$. So, if $a(x) = \hat a_i(x)$, it is evident that at most the $q_2$ rows given by $i - \lfloor q_2/2 \rfloor \le r \le i + \lfloor q_2/2 \rfloor$ ($1 \le r \le n$) are not identically zero and, on each row, the unique nonvanishing entry is selected in a backward manner. More precisely, the vector $a_r \circ d$ is equal to $d_t e_t$, with $t = i - r + \lceil q_2/2 \rceil$, so that $(a_r \circ d)^T T = [0, \ldots, 0, d_t c_1, \ldots, d_t c_{q_1}, 0, \ldots, 0]$ where the number of beginning zero entries is equal to $t - 1$.
To sum up, the number of initial zero entries of the $r$th row of the matrix $A_n(\hat a_i, c, d, k)$ is equal to $r - \alpha + t - 1$ and is always the same since for increasing $r$ values, decreasing $t$ values are obtained. Therefore, we have
$$A_n(\hat a_i, c, d, k) = (-)^k \begin{bmatrix} 0 & 0 & 0\\ 0 & D_i & 0\\ 0 & 0 & 0 \end{bmatrix} \begin{matrix} \}\ i - \lfloor q_2/2 \rfloor - 1 \text{ rows},\\ \}\ q_2 \text{ rows},\\ \phantom{\}} \end{matrix}$$
where
$$D_i = \begin{bmatrix} d_{q_2}\\ \vdots\\ d_1 \end{bmatrix} [c_1, \ldots, c_{q_1}] \in \mathbb{R}^{q_2 \times q_1},$$
and where the $(1, 1)$ entry of the rectangular block $D_i$ is located at the position $(i - \lfloor q_2/2 \rfloor,\ i + \lfloor q_2/2 \rfloor - \alpha + 1)$ with respect to the matrix $A_n(\hat a_i, c, d, k)$. Notice that in the case of a nonpositive position index $i - \lfloor q_2/2 \rfloor$ and/or $i + \lfloor q_2/2 \rfloor - \alpha + 1$ we have to consider only the submatrix having positive entry indices with regard to those of $A_n(\hat a_i, c, d, k)$.
Case $q_2$ even ($k$ odd): In this case, $(a_r)_j = a(x_{r+j+1/2})$, $j = -q_2/2, \ldots, q_2/2 - 1$, so that, if $a(x) = \hat a_i(x)$, the at most $q_2$ not identically zero rows are given by $i - q_2/2 + 1 \le r \le i + q_2/2$. In the same manner as in the $q_2$ odd case, we obtain
$$A_n(\hat a_i, c, d, k) = (-)^k \begin{bmatrix} 0 & 0 & 0\\ 0 & D_i & 0\\ 0 & 0 & 0 \end{bmatrix} \begin{matrix} \}\ i - q_2/2 \text{ rows},\\ \}\ q_2 \text{ rows},\\ \phantom{\}} \end{matrix}$$
where the $(1, 1)$ entry of the rectangular block $D_i$ is located at the position $(i - q_2/2 + 1,\ i + q_2/2 - \alpha + 1)$ with respect to the matrix $A_n(\hat a_i, c, d, k)$. $\square$

Corollary 3.3. Let $c \in \mathbb{R}^q$ be an FD formula coefficient vector and let $d = (-)^k \mathrm{rev}(c)$ be the corresponding conjugated FD formula coefficient vector; then each dyad
$$(-)^k Q_n(c, \mathrm{rev}(d), i) = c[i]\, (c[i])^T$$
is symmetric and nonnegative definite. Here the vector $c[i]$ denotes a vector of $\mathbb{R}^n$ containing in ``its middle'' the vector $c$ suitably shifted in accordance with $i$.

Proof. By taking into account Theorem 3.5, the proof is trivial. In fact, since $\alpha$ is equal to $\lfloor (2q - 1)/2 \rfloor + 1$, $D_i \in \mathbb{R}^{q \times q}$ is a diagonal block and since $\mathrm{rev}(d) = (-)^k c$, it directly follows that the dyad
$$(-)^k Q_n(c, \mathrm{rev}(d), i) = (-)^{2k} Q_n(c, c, i) \qquad (12)$$
has rank one and is symmetric and nonnegative definite by construction since for each $y \in \mathbb{R}^n \setminus \{0\}$, $y = (y_1, x, y_2)$, we have
$$y^T Q_n(c, c, i)\, y = x^T c\, c^T x = \|x^T c\|_2^2 \ge 0. \quad \square$$
By means of Corollary 3.3, we can state the following theorem which provides a link between the number of the zeros of the nonnegative function $a(x)$ and the rank of the matrix $A_n(a, c, (-)^k \mathrm{rev}(c), k)$.

Theorem 3.6. Let $G_n = \{\tilde x_i\}$ be the set of equispaced samplings of the nonnegative coefficient function $a(x)$ as considered in the representation Theorem 3.5 and let $I_n^+(a) = \{i : a(\tilde x_i) > 0\}$. Suppose that the $n + q - 1$ vectors $\{c[i] : i = 1, \ldots, n + q - 1\}$ considered in Corollary 3.3 strongly generate $\mathbb{R}^n$ in the sense that each subset $\{c[i_k] : 1 \le i_1 < i_2 < \cdots < i_n \le n + q - 1\}$ is a basis for $\mathbb{R}^n$. Then
$$\mathrm{rank}(A_n(a, c, (-)^k \mathrm{rev}(c), k)) = \min\{n, \#(I_n^+(a))\}.$$

Proof. To prove the statement it is sufficient to refer to the representation Theorem 3.5 and to Corollary 3.3, since we have
$$A_n(a, c, (-)^k \mathrm{rev}(c), k) = \sum_{i \in I_n^+(a)} a(\tilde x_i)\, Q_n(c, c, i). \quad \square$$
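The rank formula can be tested exactly on a small case. The sketch below assumes Python with rational arithmetic, the stencil $c = (-1, 1)$ (for which the $n + 1$ shifted vectors do strongly generate $\mathbb{R}^n$, since their only linear relation involves all of them), and a hypothetical 0-based placement of $c$ inside $c[i]$ with boundary truncation; the helper names and the sample values are ours.

```python
from fractions import Fraction

def shifted_vector(c, i, n):
    """Hypothetical indexing of c[i]: entry c[j] sits at position i - (q - 1) + j;
    entries falling outside 0..n-1 are discarded (boundary truncation)."""
    q = len(c)
    v = [Fraction(0)] * n
    for j in range(q):
        pos = i - (q - 1) + j
        if 0 <= pos < n:
            v[pos] = Fraction(c[j])
    return v

def assemble(samples, c, n):
    """A = sum_i a(x_i) c[i] c[i]^T over the n + q - 1 samples."""
    A = [[Fraction(0)] * n for _ in range(n)]
    for i, a_i in enumerate(samples):
        v = shifted_vector(c, i, n)
        for r in range(n):
            for s in range(n):
                A[r][s] += Fraction(a_i) * v[r] * v[s]
    return A

def rank(M):
    """Exact rank by Gaussian elimination over the rationals."""
    M = [row[:] for row in M]
    rk, rows, cols = 0, len(M), len(M[0])
    for col in range(cols):
        pivot = next((r for r in range(rk, rows) if M[r][col] != 0), None)
        if pivot is None:
            continue
        M[rk], M[pivot] = M[pivot], M[rk]
        for r in range(rk + 1, rows):
            factor = M[r][col] / M[rk][col]
            M[r] = [x - factor * y for x, y in zip(M[r], M[rk])]
        rk += 1
    return rk

n, c = 5, [-1, 1]
samples = [1, 0, 2, 0, 0, 3]          # n + q - 1 = 6 nonnegative samples of a
positives = sum(s > 0 for s in samples)
A = assemble(samples, c, n)
print(rank(A), min(n, positives))
```

Replacing `samples` by six strictly positive values gives a full-rank matrix, in agreement with the $\min\{n, \cdot\}$ in the statement.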

Finally, in the case $a(x) \equiv 1$, we can give a global description of the matrices $A_n(a, c, d, k)$ and, in particular, a spectral characterization of the matrices $A_n(a, c, (-)^k \mathrm{rev}(c), k)$.

Theorem 3.7. The matrices $A_n(1, c, d, k)$ are Toeplitz matrices generated by a complex-valued polynomial $g(x) = (-)^k p_c(x)\, p_d(x)$, $x \in [-\pi, \pi]$, where $p_c$ and $p_d$ are the complex-valued polynomials related to the FD formula coefficient vectors $c \in \mathbb{R}^{q_1}$ and $d \in \mathbb{R}^{q_2}$.
If $d = (-)^k \mathrm{rev}(c)$, then $g(x)$ equals the nonnegative real-valued polynomial $|p_c(x)|^2$.
Proof. As a consequence of the weighted convolution formula (11) and by recalling that $a(x) \equiv 1$, it follows that the nonzero entries of the generic row of the matrix $A_n(1, c, d, k)$ are given by a vector $w$ so that
$$w_j = (-)^k \sum_{s + t = j} c_s d_t, \quad j = -(m_1 + m_2), \ldots, m_1 + m_2, \quad m_i = \left\lfloor \frac{q_i}{2} \right\rfloor, \qquad (13)$$
$s \in \{-m_1, \ldots, m_1\}$, $t \in \{-m_2, \ldots, m_2\}$. By using standard arguments of symbolic polynomial computation [4], this identity represents the convolution product between the two polynomials $p_c$ and $p_d$ defined as
$$p_c(x) = p_c(z) = \sum_{j=-m_1}^{m_1} c_j z^j, \qquad p_d(x) = p_d(z) = \sum_{j=-m_2}^{m_2} d_j z^j,$$
where $z = e^{ix}$ and with the assumption that $q_i = 2m_i + 1$ (the case $q_i = 2m_i$ can be treated in a very similar way). Therefore, $A_n(1, c, d, k)$ is a Toeplitz matrix with generating function
$$g(x) = p_w(z) = (-)^k p_c(z)\, p_d(z) = \sum_{j=-(m_1+m_2)}^{m_1+m_2} w_j z^j.$$
If $d = (-)^k \mathrm{rev}(c)$, then we have
$$p_d = (-)^k p_c(z^{-1}),$$
so that, by direct computation we find
$$g(x) = p_w(z) = (-)^{2k} p_c(z)\, p_c(z^{-1}) = p_c(z)\, \overline{p_c(z)} = |p_c(z)|^2. \quad \square$$
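Identity (13) and the resulting symbol can be checked with a short sketch. It assumes Python, 0-based coefficient lists whose exponents are centred around the middle entry (half-integer exponents when $q$ is even), and two illustrative stencils of our choosing: $c = (-1, 1)$ for $k = 1$ and $c = (1, -2, 1)$ for $k = 2$. The function names are hypothetical.

```python
import cmath

def laurent_product(c, d):
    """Coefficients of p_c(z) p_d(z): the discrete convolution of c and d."""
    w = [0.0] * (len(c) + len(d) - 1)
    for s, cs in enumerate(c):
        for t, dt in enumerate(d):
            w[s + t] += cs * dt
    return w

def conjugated_symbol(c, k):
    """Vector w of (13) for the conjugated pair d = (-1)^k rev(c)."""
    d = [(-1) ** k * cj for cj in reversed(c)]
    return [(-1) ** k * wj for wj in laurent_product(c, d)]

def evaluate(coeffs, x):
    """p(e^{ix}) with exponents centred so that the middle of the vector has power 0."""
    m = (len(coeffs) - 1) / 2
    return sum(cj * cmath.exp(1j * (j - m) * x) for j, cj in enumerate(coeffs))

w1 = conjugated_symbol([-1.0, 1.0], 1)        # expected Toeplitz row (-1, 2, -1)
w2 = conjugated_symbol([1.0, -2.0, 1.0], 2)   # expected Toeplitz row (1, -4, 6, -4, 1)

x = 0.7                                        # arbitrary test point
gap1 = abs(evaluate(w1, x) - abs(evaluate([-1.0, 1.0], x)) ** 2)
gap2 = abs(evaluate(w2, x) - abs(evaluate([1.0, -2.0, 1.0], x)) ** 2)
print(w1, w2, gap1 < 1e-12, gap2 < 1e-12)
```

The two gaps compare $g(x)$, evaluated from the convolved coefficients, with $|p_c(x)|^2$ evaluated directly.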

The previous characterization can be linked to the powerful Szegő theory regarding the Toeplitz matrices generated by real-valued functions. More precisely the following statement holds true.

Proposition 3.3. The Toeplitz matrices $\{A_n(1, c, (-)^k \mathrm{rev}(c), k) = T_n(|p_c|^2)\}_n$ are symmetric positive definite for any value of the dimension $n$. In particular the related spectral condition number is asymptotically greater than $c\, n^{2k}$, with $c$ an absolute constant.

Proof. Since $|p_c|^2$ is a real-valued nonnegative (not identically zero) trigonometric polynomial, the matrices $A_n(1, c, (-)^k \mathrm{rev}(c), k)$ are symmetric and their positive definiteness directly follows from Theorem 3.2. In addition, the symmetry of the Toeplitz structure implies that the vector $w$ is such that $w_j = w_{-j}$ for each $j$.
Moreover, since $w$ is related to a consistent FD formula for the discretization of the operator $d^{2k}/dx^{2k}$ with precision order $\nu$ equal to the precision order of the FD formula defined by the vector $c$, it follows that $g(x)$ has a zero of order $2k$ at $x = 0$ (by virtue of Proposition 3.2) and, in light of Theorem 3.3, the claimed thesis ensues. $\square$
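The growth rate in Proposition 3.3 can be observed in the simplest case. The sketch below assumes Python and restricts attention to $k = 1$ with $c = (-1, 1)$, so that $T_n(|p_c|^2)$ is the matrix $\mathrm{tridiag}(-1, 2, -1)$, whose eigenvalues $2 - 2\cos(j\pi/(n+1))$, $j = 1, \ldots, n$, are classical; the function name is ours.

```python
import math

def cond_second_difference(n):
    """Spectral condition number of tridiag(-1, 2, -1) of size n, from the
    closed-form extreme eigenvalues 2 - 2 cos(j pi / (n + 1)), j = 1 and j = n."""
    lam_min = 2.0 - 2.0 * math.cos(math.pi / (n + 1))
    lam_max = 2.0 - 2.0 * math.cos(n * math.pi / (n + 1))
    return lam_max / lam_min

sizes = (50, 100, 200, 400)
ratios = {n: cond_second_difference(n) / n ** 2 for n in sizes}
for n in sizes:
    print(n, round(ratios[n], 4))     # the ratio settles near 4 / pi^2
```

The ratio $\kappa_2 / n^{2}$ stays bounded away from zero, which is the $c\, n^{2k}$ behaviour with $k = 1$.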

4. A class of FD matrices as a class of LPOs

Here we want to show that $A_n(\cdot) \equiv A_n(\cdot, m, k)$ is an LPO for any fixed $n$, $m$ and $k$, that is a linear operator mapping nonnegative functions $a(x)$ into nonnegative definite $n \times n$ matrices. Moreover, we prove that this operator is also normally positive, in the sense that it maps strictly positive functions into positive definite symmetric matrices.
Let us denote by $L(X, Y)$ the class of the linear operators from the real vector space $X$ to the real vector space $Y$ and let us denote by $LP(X, Y)$ the convex subset of $L(X, Y)$ of the LPOs from $X$ to $Y$. It is evident by construction that $A_n(\cdot)$ belongs to $L(C[0, 1], S^{n \times n})$ where $S^{n \times n}$ is the vector space of real symmetric $n \times n$ matrices. As proved in Section 3.5, the matrices $A_n(1)$ (see Definition 3.2) are Toeplitz matrices whose generating function is $g(x) = |p_c|^2$, so that $T_n(|p_c|^2)$ are positive definite for any dimension.
In the following, we formally prove the positivity of the considered matrix-valued operators in the case of a generic continuous function $a(x)$.

4.1. Structural and spectral analysis of $A_n(\cdot)$ and $A_n^+(\cdot) A_n(\cdot)$

Let $c = d \in \mathbb{Q}^q$ be the FD formula coefficient vector discretizing the operator $d^k/dx^k$ with maximal precision order $\nu = q - k + 1$ with respect to $q$ equispaced points. Let $G_n = \{\tilde x_i\}$ be the set of equispaced samplings of the coefficient function $a(x)$, where
$$\tilde x_i = \begin{cases} ih, & i = 1 - m, \ldots, n + m & \text{if } q = 2m + 1,\\ (i + 1/2)h, & i = 1 - m, \ldots, n + m - 1 & \text{if } q = 2m, \end{cases}$$
and $\# G_n = n + q - 1$. It is evident that we are supposing the definition set of the function $a(x)$ to be enlarged to $[-\epsilon, 1 + \epsilon]$ for a fixed (small) positive $\epsilon$. This assumption is necessary since $O(1)$ points of the set $G_n$ lie outside the interval $[0, 1]$ even if the maximal distance of $\tilde x_i \in G_n$ from $[0, 1]$ is $O(h)$. In the case of Dirichlet boundary conditions, it is natural to extend $a(x)$ using a standard Taylor expansion.
According to Definition 3.2 the following results hold true.

Theorem 4.1. The matrices $A_n(a)$ can be expressed as
$$\sum_i a(\tilde x_i)\, Q_n(c, c, i). \qquad (14)$$

Proof. Recall Theorem 3.5 and Lemma 2.1. $\square$
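A concrete instance of the dyadic representation (14) may help. The sketch below assumes Python, the case $k = 1$ with $c = (-1, 1)$ (so $q = 2$, $m = 1$ and the samples are the half-grid points $(i + 1/2)h$, $i = 0, \ldots, n$, with $h = 1/(n+1)$), and a hypothetical 0-based placement of $c$ inside $c[i]$ with boundary truncation. It compares the sum of dyads with the classical 3-point matrix for $-(a u')'$; the coefficient $a(x) = 1 + x^2$ and all names are illustrative.

```python
def shifted_vector(c, i, n):
    """Hypothetical indexing of c[i]: c[j] sits at position i - (q - 1) + j,
    out-of-range entries being discarded."""
    q = len(c)
    v = [0.0] * n
    for j in range(q):
        pos = i - (q - 1) + j
        if 0 <= pos < n:
            v[pos] = c[j]
    return v

def dyadic_sum(samples, c, n):
    """sum_i a(x_i) c[i] c[i]^T, i.e. the right-hand side of (14)."""
    A = [[0.0] * n for _ in range(n)]
    for i, a_i in enumerate(samples):
        v = shifted_vector(c, i, n)
        for r in range(n):
            for s in range(n):
                A[r][s] += a_i * v[r] * v[s]
    return A

def tridiagonal_reference(samples, n):
    """Classical 3-point matrix: diagonal a_p + a_{p+1}, off-diagonal -a_{p+1}."""
    B = [[0.0] * n for _ in range(n)]
    for p in range(n):
        B[p][p] = samples[p] + samples[p + 1]
        if p + 1 < n:
            B[p][p + 1] = B[p + 1][p] = -samples[p + 1]
    return B

n = 6
h = 1.0 / (n + 1)
samples = [1.0 + ((i + 0.5) * h) ** 2 for i in range(n + 1)]   # a(x) = 1 + x^2
A = dyadic_sum(samples, [-1.0, 1.0], n)
B = tridiagonal_reference(samples, n)
max_diff = max(abs(A[r][s] - B[r][s]) for r in range(n) for s in range(n))
T = dyadic_sum([1.0] * (n + 1), [-1.0, 1.0], n)                # a = 1: Toeplitz case
print(max_diff, T[0][:3])
```

With $a \equiv 1$ the same routine returns $\mathrm{tridiag}(-1, 2, -1) = T_n(2 - 2\cos x)$.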

Corollary 4.1. For any $n$, $k$ and $m$, the operator $A_n(\cdot) : C[0, 1] \to S^{n \times n}$ is linear and positive. In addition, the operator is normally positive, i.e. if $a(x)$ is a strictly positive function then $A_n(a)$ is a positive definite matrix.

Proof. By invoking Theorem 4.1 when $a(x)$ is nonnegative, we find that $A_n(a)$ is a nonnegative linear combination of the nonnegative definite dyads $Q_n(c, c, i)$. So, the nonnegative definiteness of the matrix $A_n(a)$ directly follows.
To prove the normality property we use the Rayleigh quotient of the considered matrix $A_n(a)$. For any $x \in \mathbb{R}^n$
$$x^T A_n(a)\, x = \sum_i a(\tilde x_i)\, x^T Q_n(c, c, i)\, x \ge a_{\min} \sum_i x^T Q_n(c, c, i)\, x,$$
where $a_{\min} = \min\{a(\tilde x_i) : \tilde x_i \in G_n\}$.
We now notice that $\sum_i Q_n(c, c, i)$ is nothing more than the Toeplitz matrix $A_n(1) = T_n(|p_c|^2)$, where $p_c$ is the polynomial associated with the maximal precision FD formula coefficient vector $c$, so that, in light of Proposition 3.3, $T_n(|p_c|^2)$ is positive definite. Finally, for any $x \ne 0$, it follows that
$$x^T A_n(a)\, x \ge a_{\min} \sum_i x^T Q_n(c, c, i)\, x = a_{\min}\, x^T T_n(|p_c|^2)\, x \ge a_{\min}\, \lambda_{\min}(T_n(|p_c|^2))\, \|x\|_2^2 > 0$$
and the result is proved. $\square$
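The chain of inequalities in the proof can be probed numerically without assembling any matrix, because $x^T Q_n(c, c, i)\, x = (c[i]^T x)^2$. The sketch below assumes Python, $k = 1$ with $c = (-1, 1)$, the half-grid samples $(i + 1/2)h$, the same hypothetical placement of $c$ inside $c[i]$ as in the previous examples, and an illustrative positive coefficient $a(x) = 2 + \sin(3x)$.

```python
import math
import random

def quadratic_form(samples, c, x):
    """x^T A x for A = sum_i samples[i] c[i] c[i]^T, computed as
    sum_i samples[i] (c[i]^T x)^2 (hypothetical indexing, truncated at the boundary)."""
    n, q = len(x), len(c)
    total = 0.0
    for i, a_i in enumerate(samples):
        proj = 0.0
        for j in range(q):
            pos = i - (q - 1) + j
            if 0 <= pos < n:
                proj += c[j] * x[pos]
        total += a_i * proj * proj
    return total

n, c = 40, [-1.0, 1.0]
h = 1.0 / (n + 1)
samples = [2.0 + math.sin(3.0 * (i + 0.5) * h) for i in range(n + 1)]
ones = [1.0] * (n + 1)
a_min = min(samples)

random.seed(0)                       # reproducible random test vectors
bound_holds = True
for _ in range(200):
    x = [random.uniform(-1.0, 1.0) for _ in range(n)]
    qa = quadratic_form(samples, c, x)       # x^T A_n(a) x
    q1 = quadratic_form(ones, c, x)          # x^T T_n(|p_c|^2) x
    bound_holds = bound_holds and qa >= a_min * q1 - 1e-9 and q1 > 0.0
print(bound_holds)
```

A random search of this kind cannot prove positive definiteness; it only illustrates the inequality on sampled vectors.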

We stress that the statement of Corollary 4.1 is still valid, with unchanged proof, for all the matrices $A_n(a, c, d, k)$ with $d = (-)^k \mathrm{rev}(c)$ as a consequence of Corollary 3.3 and Proposition 3.3. Moreover the assumption of continuity of the symbol $a$ can be easily removed.
Let $\ker(X)$ be the null space of the matrix $X$, that is the space of the eigenvectors of $X$ associated with the zero eigenvalue.

Definition 4.1. Two continuous nonnegative functions $a$ and $b$ will be called equivalent if $a \sim b$ over $[0, 1]$ according to the two positive constants $c_1$ and $c_2$, that is $c_1 b \le a \le c_2 b$. (As a consequence they take the same values, up to positive multiplicative constants, and they vanish in the same subset of $[0, 1]$.)

Definition 4.2. Two sequences of $n \times n$ nonnegative definite matrices $\{A_n\}_n$ and $\{B_n\}_n$ are called equivalent if there exist two positive absolute constants $c_1$ and $c_2$ such that the related Rayleigh quotients satisfy the inequalities
$$c_1\, x^T B_n x \le x^T A_n x \le c_2\, x^T B_n x$$
for any $x$.

Lemma 4.1. Let $A$ and $B$ be two nonnegative definite $n \times n$ matrices. If $A$ and $B$ are extracted from two equivalent sequences, then
· $\ker(A) = \ker(B)$;
· $A$ and $B$ have exactly the same inertia.

Proof. If $u$ belongs to $\ker(B)$, clearly $u^T B u = 0$. Therefore, from the equivalence of $A$ and $B$, it follows that $u^T A u = 0$. Since $A$ is nonnegative definite, by invoking the classical Schur decomposition, we find that $A$ has a nonnegative definite square root $A^{1/2}$. Now, by defining $v = A^{1/2} u$ we have
$$0 = u^T A u = v^T v = \|v\|_2^2$$
so that $v = 0$. Consequently, $u$ belongs to $\ker(A^{1/2}) = \ker(A)$.
From the dual relation
$$c_2^{-1}\, x^T A x \le x^T B x \le c_1^{-1}\, x^T A x,$$
we can prove that $\ker(A) \subseteq \ker(B)$ in the same way.
Finally, from the assumption of nonnegative definiteness, all the other eigenvalues of $A$ and $B$ are positive and the proof is concluded. $\square$

We prove that the nonnegative functions $a(x)$ partition the matrices $A_n(a)$ into equivalence classes in the following theorem. Even if this result is a direct consequence of the representation Theorem 4.1, we make use in the proof of the positivity of the operator $A_n(\cdot)$ which is, in this specific FD context, a consequence of Theorem 4.1.

Theorem 4.2. Let $a$ and $b$ be two equivalent functions. Then $\{A_n(a)\}_n$ and $\{A_n(b)\}_n$ are equivalent.

Proof. From the relation
$$c_1 b \le a \le c_2 b$$
and from Corollary 4.1, it ensues that for any $x \in \mathbb{R}^n$ the following inequality holds
$$c_1\, x^T A_n(b)\, x \le x^T A_n(a)\, x \le c_2\, x^T A_n(b)\, x. \quad \square$$
The following result is of notable practical interest, since it shows that the functions $a(x)$ also partition the matrices $A_n(a)$ with regard to the spectral condition numbers.

Theorem 4.3. Let $a$ and $b$ be two equivalent functions and let $A_n(b)$ be positive definite. Then
· $\kappa_2(A_n(a)) \sim \kappa_2(A_n(b))$;
· if $a(x) > 0$ then $\kappa_2(A_n(a)) \sim \kappa_2(T_n(|p_c|^2))$;
· if $a(x) \ge 0$ then $\kappa_2(A_n(a)) \ge c\, n^{2k}$, with $c > 0$ and for $n$ large enough.

Proof. From Theorem 4.2 it follows that for any $x \in \mathbb{R}^n$ the following inequalities hold
$$c_1\, x^T A_n(b)\, x \le x^T A_n(a)\, x \le c_2\, x^T A_n(b)\, x$$
and therefore the matrix $A_n(a)$ is positive definite as well. The rest of the proof arises from Proposition 3.3 and Theorem 3.3, while the technique is the same as the one introduced in Theorem 2.2 of [33] in a Toeplitz context. $\square$

Now we want to define the concept of optimal preconditioner in a PCG context.

Definition 4.3. Let $\{A_n\}_n$ and $\{B_n\}_n$ be two sequences of $n \times n$ nonnegative definite matrices. The matrices $\{B_n\}_n$ are optimal preconditioners for the matrices $\{A_n\}_n$ if
· $\ker(B_n^+ A_n) = \ker(A_n)$,
· all the other nonzero eigenvalues of $B_n^+ A_n$ belong to a bounded interval $[c_1, c_2]$, with $c_i$ positive constants independent of $n$.

We notice that the given definition coincides with the one suggested by Axelsson and Lindskog [3] when each $A_n$ is invertible. In general $\{B_n\}_n$ are optimal preconditioners for $\{A_n\}_n$ if and only if the two sequences of matrices are equivalent. More precisely, the preconditioner must work only in the subspace where $A_n$ is invertible. In the orthogonal subspace where $A_n$ is the null matrix the problem of finding a solution to the related linear system is ill-posed and therefore it is convenient that the matrix $B_n$ acts as the null matrix in the same subspace. All these ideas and properties are summarized in the preceding definition of equivalent sequences of matrices.
The following result gives the theoretical support for our preconditioning techniques.

Theorem 4.4. Let $a$ and $b$ be two equivalent functions. Then, for any $n$, the matrix
$$A_n^+(b)\, A_n(a)$$
has the same null space as $A_n(a)$ and $A_n(b)$ while the other nonzero eigenvalues belong to the closed set $[c_1, c_2]$, $c_i$ being the equivalence constants in Definition 4.1. Therefore, according to Definition 4.3, we can say that the matrices $\{A_n(b)\}_n$ are optimal preconditioners for the matrices $\{A_n(a)\}_n$.

Proof. From Theorem 4.2 we know that $\{A_n(a)\}_n$ and $\{A_n(b)\}_n$ are equivalent, so that Lemma 4.1 assures that the null spaces of the symmetric nonnegative definite matrices $A_n(a)$ and $A_n(b)$ coincide and they coincide also with the null spaces of $A_n^+(a)$ and $A_n^+(b)$ as a consequence of the classical Schur decomposition. In addition, since $\ker(A_n(a)) = \{y \in \mathbb{R}^n : y = A_n(a)x,\ x \in \mathbb{R}^n\}^\perp$, with a contradiction argument, it is easily proved that they also coincide with the null space $K$ of the matrix $A_n^+(b) A_n(a)$.
Let us consider a generic vector $u = u_1 + u_2$, such that $u_1 \in K$ and $u_2 \in K^\perp$, $K$ being the quoted null space. Due to the nonnegative definiteness, it is evident that, for $u_2 \ne 0$,
$$u^T A_n(a)\, u = u_2^T A_n(a)\, u_2 \ne 0$$
and
$$u^T A_n(b)\, u = u_2^T A_n(b)\, u_2 \ne 0,$$
so that, in light of the equivalence relation stated in Theorem 4.2, we deduce that
$$c_1 \le \frac{u^T A_n(a)\, u}{u^T A_n(b)\, u} \le c_2. \qquad (15)$$
In addition, it is easy to prove that
$$u_2 = (A_n^+(b))^{1/2} (A_n(b))^{1/2} u_2.$$
Now, we suppose that $\lambda \ne 0$ is an eigenvalue of $A_n^+(b) A_n(a)$ so that
$$A_n^+(b)\, A_n(a)\, u = \lambda u.$$
First we multiply by $(A_n(b))^{1/2}$ and we obtain
$$(A_n^+(b))^{1/2} A_n(a)\, u = \lambda (A_n(b))^{1/2} u.$$
Then we observe that
$$A_n(a)\, u = A_n(a) (A_n^+(b))^{1/2} (A_n(b))^{1/2} u$$
and we find the following key relationship
$$(A_n^+(b))^{1/2} A_n(a) (A_n^+(b))^{1/2} (A_n(b))^{1/2} u = \lambda (A_n(b))^{1/2} u.$$
Finally by defining $v = (A_n(b))^{1/2} u$ and by multiplying by $v^T$, it follows that
$$\lambda = \frac{v^T (A_n^+(b))^{1/2} A_n(a) (A_n^+(b))^{1/2} v}{v^T v} = \frac{u^T A_n(a)\, u}{u^T A_n(b)\, u},$$
so that by (15) we have $c_1 \le \lambda \le c_2$ and the claimed thesis follows. $\square$
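The bound on the preconditioned spectrum can be observed with a power iteration. The sketch below assumes Python, the case $k = 1$ with $c = (-1, 1)$ (tridiagonal matrices built from half-grid samples), and a strictly positive $b$, so that $A_n(b)$ is invertible and $A_n^+(b) = A_n(b)^{-1}$. The pair $b(x) = 1 + x^2$, $a(x) = b(x)(2 + \cos 5x)$ is our illustrative choice, equivalent with $c_1 = 1$ and $c_2 = 3$; the helper names and the iteration count are hypothetical.

```python
import math

def assemble(samples):
    """Diagonal and off-diagonal of sum_i a_i c[i] c[i]^T for c = (-1, 1)."""
    n = len(samples) - 1
    diag = [samples[p] + samples[p + 1] for p in range(n)]
    off = [-samples[p + 1] for p in range(n - 1)]
    return diag, off

def matvec(diag, off, x):
    n = len(diag)
    y = [diag[i] * x[i] for i in range(n)]
    for i in range(n - 1):
        y[i] += off[i] * x[i + 1]
        y[i + 1] += off[i] * x[i]
    return y

def solve_tridiagonal(diag, off, rhs):
    """Thomas algorithm for a symmetric positive definite tridiagonal system."""
    n = len(diag)
    d, y = diag[:], rhs[:]
    for i in range(1, n):
        w = off[i - 1] / d[i - 1]
        d[i] -= w * off[i - 1]
        y[i] -= w * y[i - 1]
    x = [0.0] * n
    x[-1] = y[-1] / d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (y[i] - off[i] * x[i + 1]) / d[i]
    return x

def largest_preconditioned_eigenvalue(a, b, n, steps=300):
    """Power iteration on A_n(b)^{-1} A_n(a); returns the final Rayleigh quotient
    u^T A_n(a) u / u^T A_n(b) u, an estimate of the largest eigenvalue."""
    h = 1.0 / (n + 1)
    pts = [(i + 0.5) * h for i in range(n + 1)]
    da, oa = assemble([a(x) for x in pts])
    db, ob = assemble([b(x) for x in pts])
    u = [1.0 + 0.1 * i for i in range(n)]            # arbitrary starting vector
    for _ in range(steps):
        u = solve_tridiagonal(db, ob, matvec(da, oa, u))
        scale = max(abs(t) for t in u)
        u = [t / scale for t in u]
    Au, Bu = matvec(da, oa, u), matvec(db, ob, u)
    return sum(p * q for p, q in zip(u, Au)) / sum(p * q for p, q in zip(u, Bu))

b = lambda x: 1.0 + x * x
a = lambda x: (1.0 + x * x) * (2.0 + math.cos(5.0 * x))
estimates = {n: largest_preconditioned_eigenvalue(a, b, n) for n in (20, 80)}
print(estimates)
```

Whatever the accuracy reached by the iteration, the returned value is a ratio of the form (15) and therefore has to lie in $[c_1, c_2] = [1, 3]$ for every $n$.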

The latter result can be generalized to the case where the symbols $a$ and $b$ are not equivalent. However, since we are interested in the applicative problem of the preconditioning, we stress that this extension has a theoretical interest only. In fact, Theorem 4.4 tells us that $\{A_n(b)\}_n$ are optimal preconditioners for $\{A_n(a)\}_n$ if $a$ and $b$ are equivalent. Theorem 4.5 tells us that the property of equivalence of $a$ and $b$ is not only sufficient but also necessary.

Theorem 4.5. Let $a$ and $b$ be functions defined on $[0, 1]$ with $b$ taking nonnegative values. Then all the eigenvalues of
$$A_n^+(b)\, A_n(a)$$
belong to the set $[c_1, c_2] \cup \{0\}$, where $c_1 = \inf_Y a(x)/b(x)$, $c_2 = \sup_Y a(x)/b(x)$ with $Y = \{x \in [0, 1] : |a(x)| + |b(x)| > 0\}$.

Therefore, the only way to guarantee that the matrices $\{A_n(b)\}_n$ are optimal preconditioners for the matrices $\{A_n(a)\}_n$ is for $a$ and $b$ to be equivalent. In the case where $a$ is not equivalent to $b$ we have $c_1 \le 0$ or $c_2 = \infty$ (see also Corollary 4.3).

4.2. Density and distribution results

In order to prove the density and the distribution results (in the spirit of the Szegő Theorem 3.1) we need a preparatory lemma regarding the asymptotic inertia of the matrices $A_n(a)$.
Notice that in Theorem 4.2 and Lemma 4.1, we linked the inertia of the matrices $\{A_n(a)\}_n$ and $\{A_n(b)\}_n$ when $a$ and $b$ belonged to the same equivalence class. Here we are interested in linking the asymptotic inertia of $\{A_n(a)\}_n$ to the distribution of the sign of the functional coefficient $a(x)$.
We will make use of the following definitions and notations.

Definition 4.4. Fix $s < t$ and $a$, $b$ two continuous functions with $b$ taking nonnegative values. Then
$$\begin{aligned}
N(s, t, a, n) &= \#\{i : \lambda_i(A_n(a)) \in (s, t)\},\\
N(s, t, (a, b), n) &= \#\{i : \lambda_i(A_n^+(b) A_n(a)) \in (s, t)\},\\
A^+(a) &= \{x \in [0, 1] : a(x) > 0\},\\
A^-(a) &= \{x \in [0, 1] : a(x) < 0\},\\
A^0(a) &= \{x \in [0, 1] : a(x) = 0\}.
\end{aligned}$$
Moreover, when $(s, t) = (-\infty, 0)$ or $(s, t) = (0, \infty)$, we have in short
$$\begin{aligned}
N(0, \infty, a, n) &= N^+(A_n(a)),\\
N(0, \infty, (a, b), n) &= N^+(A_n^+(b) A_n(a)),\\
N(-\infty, 0, a, n) &= N^-(A_n(a)),\\
N(-\infty, 0, (a, b), n) &= N^-(A_n^+(b) A_n(a)).
\end{aligned}$$
Finally, by $\alpha^-(a)$, $\alpha^+(a)$ and $\alpha^0(a)$ we denote the Lebesgue measure $m\{\cdot\}$ of the sets $A^-(a)$, $A^+(a)$ and $A^0(a)$, respectively.

Lemma 4.2. With the preceding notations the following relationships:
$$\begin{aligned}
N^+(A_n(a)) &= n\,\alpha^+(a) + O(1),\\
N^-(A_n(a)) &= n\,\alpha^-(a) + O(1),\\
\dim(\mathrm{Ker}(A_n(a))) &= n\,\alpha^0(a) + O(1)
\end{aligned}$$
hold true when [Hp1] the continuous function $a(x)$ has a finite number of sign changes.

Proof. Let $I_n^+(a) = \{i : a(\tilde x_i) > 0\}$, $I_n^-(a) = \{i : a(\tilde x_i) < 0\}$ and $I_n^0(a) = \{i : a(\tilde x_i) = 0\}$ where $G_n = \{\tilde x_i\}$ is the set of the equispaced samplings of the function $a(x)$ defined in Section 4.1 and where the number of $\tilde x_i$ lying outside the interval $[0, 1]$ needed by the FD formula is a constant with respect to $n$, i.e. $O(1)$.
Under the assumption of continuity of the functional coefficient $a(x)$, it follows that:
$$\begin{aligned}
\#(I_n^+(a)) &= n\,\alpha^+(a) + O(1),\\
\#(I_n^-(a)) &= n\,\alpha^-(a) + O(1), \qquad (16)\\
\#(I_n^0(a)) &= n\,\alpha^0(a) + O(1).
\end{aligned}$$
Now, by taking into account the dyadic decomposition of the matrix $A_n(a)$ given in Theorem 4.1, it is evident that each dyad $Q_n(c, c, i)$ is made up of a weighted sum of at most $q^2$ terms of the form $e_s e_t^T$ where $e_k$ is the $k$th element of the canonical basis of $\mathbb{R}^n$ and $s$, $t$ belong to a proper neighbourhood of the index $i$; more precisely it holds that
$$Q_n(c, c, i) = \begin{cases} \sum_{s,t=-m}^{m} c_s c_t\, e_{i+s} e_{i+t}^T & \text{if } q = 2m + 1,\\ \sum_{s,t=-m}^{-1} c_s c_t\, e_{i+s+1} e_{i+t+1}^T + \sum_{s,t=1}^{m} c_s c_t\, e_{i+s} e_{i+t}^T & \text{if } q = 2m, \end{cases}$$
with the convention that the terms corresponding to nonexisting canonical vectors are discarded.
Therefore the space $\mathbb{R}^n$ can be partitioned as a direct sum of orthogonal subspaces in the following way. First define the sets of indices
$$\begin{aligned}
\hat I_n^+(a) &= \{j \in \{1, \ldots, n\} : \forall i \in I_n^-(a) \cup I_n^0(a),\ e_j^T Q_n(c, c, i)\, e_j = 0\},\\
\hat I_n^-(a) &= \{j \in \{1, \ldots, n\} : \forall i \in I_n^+(a) \cup I_n^0(a),\ e_j^T Q_n(c, c, i)\, e_j = 0\},\\
\hat I_n^0(a) &= \{j \in \{1, \ldots, n\} : \forall i \in I_n^-(a) \cup I_n^+(a),\ e_j^T Q_n(c, c, i)\, e_j = 0\},\\
W(a) &= \{j : j \in \{1, \ldots, n\} \setminus (\hat I_n^+(a) \cup \hat I_n^-(a) \cup \hat I_n^0(a))\},
\end{aligned}$$
where the definition of the set $W(a)$ is necessary to give evidence of canonical vectors belonging to those adjacent dyads $Q_n(c, c, i)$ for which a change in the sign of the functional coefficient $a(x)$ occurs.

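It is easy to check that:
$$\hat I_n^+(a) \subseteq I_n^+(a), \qquad \hat I_n^-(a) \subseteq I_n^-(a), \qquad \hat I_n^0(a) \subseteq I_n^0(a),$$
$$\hat I_n^0(a) \cap \hat I_n^+(a) = \emptyset, \qquad \hat I_n^-(a) \cap \hat I_n^+(a) = \emptyset, \qquad \hat I_n^0(a) \cap \hat I_n^-(a) = \emptyset,$$
and that
$$\begin{aligned}
\# W(a) &= O(1),\\
\#(\hat I_n^+(a)) &= [\#(I_n^+(a)) - O(1)]^+,\\
\#(\hat I_n^-(a)) &= [\#(I_n^-(a)) - O(1)]^+,\\
\#(\hat I_n^0(a)) &= [\#(I_n^0(a)) - O(1)]^+,
\end{aligned}$$
where the symbol $x^+$ denotes $\max\{x, 0\}$ and where the relationships $\#(\hat I_n^*(a)) = [\#(I_n^*(a)) - O(1)]^+$, $* \in \{+, -, 0\}$, follow from the assumption that $a(x)$ has a finite number of sign changes. By defining:
$$\begin{aligned}
\mathcal{S}^+(a) &= \mathrm{span}\{e_j : j \in \hat I_n^+(a)\},\\
\mathcal{S}^-(a) &= \mathrm{span}\{e_j : j \in \hat I_n^-(a)\},\\
\mathcal{S}^0(a) &= \mathrm{span}\{e_j : j \in \hat I_n^0(a)\},\\
\mathcal{W}(a) &= \mathrm{span}\{e_j : j \in W(a)\},
\end{aligned}$$
we find that $\mathbb{R}^n$ is the direct sum of the four orthogonal subspaces $\mathcal{S}^+(a)$, $\mathcal{S}^-(a)$, $\mathcal{S}^0(a)$ and $\mathcal{W}(a)$. Notice that the presence of boundary dyads (i.e. those dyads whose representation requires less than $q^2$ canonical vectors) has no influence on the considered partitioning of $\mathbb{R}^n$.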
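It is crucial to observe that for any choice of the vectors $x \in \mathcal{S}^+(a)$, $y \in \mathcal{S}^-(a)$ and $z \in \mathcal{S}^0(a)$ with $\|x\|_2 = \|y\|_2 = \|z\|_2 = 1$, we find
$$\begin{aligned}
x^T A_n(a)\, x &> 0, & A_n(a)\, x &\in \mathcal{S}^+(a) \oplus \mathcal{W}(a);\\
y^T A_n(a)\, y &< 0, & A_n(a)\, y &\in \mathcal{S}^-(a) \oplus \mathcal{W}(a);\\
A_n(a)\, z &= 0.
\end{aligned}$$
In fact, take for instance $x \in \mathcal{S}^+(a)$; by the representation Theorem 4.1 we have
$$x^T A_n(a)\, x = \sum_{i \in \hat I_n^+(a)} a(\tilde x_i)\, x^T Q_n(c, c, i)\, x. \qquad (17)$$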

Now let us consider a strictly positive continuous function $\hat a$ which coincides with $a$ over $A^+(a) \cap G_n$. From the definition of $\mathcal{S}^+(a)$ and $I_n^+(a)$ and from the relation (17), we can check that
$$x^T A_n(a)\, x = x^T A_n(\hat a)\, x,$$
where
$$x^T A_n(\hat a)\, x > 0$$
since, from Corollary 4.1, we know that the linear operator $A_n(\cdot) \equiv A_n(\cdot, m, k)$ is also normally positive. The other two inequalities follow in a similar manner.
Now, by virtue of the relation $A_n(a)\, z = 0$ and by virtue of the third relation displayed in (16), we find that
$$\dim(\mathrm{Ker}(A_n(a))) \ge \#(\hat I_n^0(a)) = n\,\alpha^0(a) + O(1).$$

At this point, call $v_1, \ldots, v_{N^+}$, $w_1, \ldots, w_{N^-}$ and $s_1, \ldots, s_{N^0}$ the eigenvectors of $A_n(a)$ respectively related to its positive, negative and null eigenvalues. Call $\mathcal{Z}^+(a)$, $\mathcal{Z}^-(a)$ and $\mathcal{Z}^0(a)$ the subspaces of $\mathbb{R}^n$ spanned by these three sets of vectors. Here, in short, the symbols $N^+$, $N^-$ and $N^0$ denote $N^+(A_n(a))$, $N^-(A_n(a))$ and $\dim(\mathrm{Ker}(A_n(a)))$ respectively.
In light of their definitions, the following set-theoretic relationships:
$$\begin{aligned}
\mathcal{S}^+(a) \cap (\mathcal{Z}^-(a) \oplus \mathcal{Z}^0(a)) &= \{0\},\\
\mathcal{S}^-(a) \cap (\mathcal{Z}^+(a) \oplus \mathcal{Z}^0(a)) &= \{0\},\\
(\mathcal{S}^+(a) \oplus \mathcal{S}^0(a)) \cap \mathcal{Z}^-(a) &= \{0\},\\
(\mathcal{S}^-(a) \oplus \mathcal{S}^0(a)) \cap \mathcal{Z}^+(a) &= \{0\}
\end{aligned}$$
hold true. Moreover, by exploiting the well known inequalities
$$[\dim(X) + \dim(Y) - n]^+ \le \dim(X \cap Y) \le \min\{\dim(X), \dim(Y)\},$$
we deduce that
$$\begin{aligned}
0 = \dim(\mathcal{S}^+(a) \cap (\mathcal{Z}^-(a) \oplus \mathcal{Z}^0(a))) &\ge [\#(\hat I_n^+(a)) + (n - N^+) - n]^+ = [\#(\hat I_n^+(a)) - N^+]^+,\\
0 = \dim(\mathcal{S}^-(a) \cap (\mathcal{Z}^+(a) \oplus \mathcal{Z}^0(a))) &\ge [\#(\hat I_n^-(a)) + (n - N^-) - n]^+ = [\#(\hat I_n^-(a)) - N^-]^+,\\
0 = \dim((\mathcal{S}^+(a) \oplus \mathcal{S}^0(a)) \cap \mathcal{Z}^-(a)) &\ge [n - [\#(\hat I_n^-(a)) + \dim(\mathcal{W}(a))] + N^- - n]^+ = [N^- - [\#(\hat I_n^-(a)) + \dim(\mathcal{W}(a))]]^+,\\
0 = \dim((\mathcal{S}^-(a) \oplus \mathcal{S}^0(a)) \cap \mathcal{Z}^+(a)) &\ge [n - [\#(\hat I_n^+(a)) + \dim(\mathcal{W}(a))] + N^+ - n]^+ = [N^+ - [\#(\hat I_n^+(a)) + \dim(\mathcal{W}(a))]]^+
\end{aligned}$$
and finally
$$\begin{aligned}
\#(\hat I_n^+(a)) &\le N^+ \le \#(\hat I_n^+(a)) + \dim(\mathcal{W}(a)),\\
\#(\hat I_n^-(a)) &\le N^- \le \#(\hat I_n^-(a)) + \dim(\mathcal{W}(a)),\\
\#(\hat I_n^0(a)) &\le \dim(\mathrm{Ker}(A_n(a))) \le \#(\hat I_n^0(a)) + \dim(\mathcal{W}(a)).
\end{aligned}$$
By recalling that $\dim(\mathcal{W}(a)) = O(1)$ and the relations linking the sizes of the sets $\hat I_n^*(a)$ and $I_n^*(a)$ with $* \in \{+, -, 0\}$, the claimed thesis directly follows. $\square$
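The inertia count of Lemma 4.2 can be reproduced in the tridiagonal case. The sketch below assumes Python, $k = 1$ with $c = (-1, 1)$ and half-grid samples, and the sign-changing coefficient $a(x) = x - 1/3$ (our illustrative choice, with $\alpha^-(a) = 1/3$ and $\alpha^+(a) = 2/3$). The number of negative eigenvalues is read off the signs of the pivots of the $LDL^T$ factorization, a standard Sturm-sequence technique justified by the Sylvester inertia law; it is not a procedure taken from the paper.

```python
def assemble(samples):
    """Diagonal and off-diagonal of sum_i a_i c[i] c[i]^T for c = (-1, 1)."""
    n = len(samples) - 1
    diag = [samples[p] + samples[p + 1] for p in range(n)]
    off = [-samples[p + 1] for p in range(n - 1)]
    return diag, off

def count_negative(diag, off):
    """Number of negative eigenvalues of a symmetric tridiagonal matrix,
    obtained by counting the negative pivots of its LDL^T factorization."""
    count, pivot = 0, 1.0
    for i in range(len(diag)):
        pivot = diag[i] - (off[i - 1] ** 2 / pivot if i > 0 else 0.0)
        if pivot == 0.0:
            pivot = 1e-30          # guard against an exact zero pivot
        if pivot < 0.0:
            count += 1
    return count

n = 90
h = 1.0 / (n + 1)
samples = [(i + 0.5) * h - 1.0 / 3.0 for i in range(n + 1)]   # a(x) = x - 1/3
diag, off = assemble(samples)
n_minus = count_negative(diag, off)
n_plus = count_negative([-t for t in diag], [-t for t in off])  # negatives of -A
print(n_minus, n / 3.0, n_plus, 2.0 * n / 3.0)
```

The two counts should differ from $n\,\alpha^-(a)$ and $n\,\alpha^+(a)$ by a quantity that does not grow with $n$.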

Remark. The assumption [Hp1] of the preceding lemma concerning the finite number of sign changes of $a(x)$ can be expressed more explicitly: in fact it is equivalent to the assumption that the set of the zeros of $a(x)$ is made up of a finite collection of isolated zeros and of a finite collection of closed disjoint intervals.

It is worthwhile noticing that the assumption that the continuous function $a(x)$ has an infinite number of isolated zeros is a bit pathological, even if we can provide some examples. Take for instance $a(x) = x \cdot \sin(x^{-1})$ whose isolated zeros are $x_k = 1/(k\pi)$ with $k$ a positive integer.
The following lemma allows one to treat the case of infinitely many isolated zeros with a finite number of accumulation points.

Lemma 4.3. With the preceding notations the following relationships:
$$\begin{aligned}
N^+(A_n(a)) &= n\,\alpha^+(a) + o(n),\\
N^-(A_n(a)) &= n\,\alpha^-(a) + o(n),\\
\dim(\mathrm{Ker}(A_n(a))) &= n\,\alpha^0(a) + o(n)
\end{aligned}$$
hold true when the continuous function $a(x)$ fulfills the following assumption. [Hp2] there exists a finite number of points $\{s_1, \ldots, s_p\}$ such that for any positive $\epsilon$ the function $a(x)$ has a finite number of sign changes in the new definition set $X_\epsilon = [0, 1] \setminus \bigcup_{j=1}^{p} (s_j - \epsilon, s_j + \epsilon)$.

Proof. It is enough to repeat the same proof as in Lemma 4.2 by discarding the indices $i$ so that $\tilde x_i$ belongs to $\bigcup_{j=1}^{p} (s_j - \epsilon, s_j + \epsilon)$. $\square$

Notice that the assumption [Hp2] is necessary and cannot be violated. In fact in [42] we exhibit a continuous function $a$ such that $\alpha^0(a)$ is positive but $A_n(a)$ is positive definite (i.e. $\dim(\mathrm{Ker}(A_n(a))) = 0$ for any $n$).

We are now ready to prove the first Szegő-style theorem concerning the asymptotic distribution of the eigenvalues of the preconditioned matrices $P_n(a, b) = A_n^+(b) A_n(a)$.

Theorem 4.6. Let $a$ and $b$ be two equivalent functions and let $s \in \mathbb{R}$ so that $a - zb$ fulfills assumption [Hp2] for $z \in \{0, s\}$. Let $(s, \infty)$ be the generic half-line and let us denote by $F_s$ the characteristic function of $(s, \infty)$. Then, by denoting by $\lambda_i^{(n)}$ the eigenvalues of the matrices $\{P_n(a, b)\}_n$, we find that
$$\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} F_s\left(\lambda_i^{(n)}\right) = \int_{A^+(b)} F_s(a/b)\, dx + m\{A^0(b)\}\, F_s(0), \qquad (18)$$
where $A^+(b) = \{x \in [0, 1] : b(x) > 0\}$.


P
Proof. We ®rst point out that the quantity niˆ1 Fs …k…n†
i † denotes the number of
eigenvalues of fPn …a; b†gn belonging to …s; 1†, that is N …s; 1; …a; b†; n†, in ac-
cordance with De®nition 4.4.
Setting g…x† ˆ 0 if b…x† ˆ 0 and a…x†=b…x† otherwise, we have
Z
Fs …a=b†dx ‡ mfA0 …b†gFs …0†
A‡ …b†
Z1
ˆ Fs …g†dx ˆ mfx 2 ‰0; 1Š : g…x† 2 …s; 1†g:
0

Now, let us consider the symmetric matrices
\[
W(s) = A_n(a) - s A_n(b)
\]
and
\[
U_n(a,b) = (A_n^{+}(b))^{1/2} A_n(a) (A_n^{+}(b))^{1/2}.
\]
Notice that the matrix $U_n(a,b)$ has the same spectrum as the preconditioned matrix $P_n(a,b)$ since the matrix $U_n(a,b)$ can also be written as $(A_n(b))^{1/2} P_n(a,b) (A_n^{+}(b))^{1/2}$ and since the null spaces of $A_n(b)$ and $A_n(a)$ are identical.
In addition, the matrices $U_n(a,b)$ and $W(s)$ are formally linked by the following relationship
\[
W(s) = (A_n(b))^{1/2} \bigl(U_n(a,b) - sI\bigr) (A_n(b))^{1/2}. \tag{19}
\]
Considering the unitary matrix $U$ whose first columns are chosen as a basis of the null space of $A_n(b)$, we find
\[
(A_n(b))^{1/2} = U \begin{bmatrix} \operatorname{diag}(0, \ldots, 0) & 0 \\ 0 & X \end{bmatrix} U^H,
\]
where $X$ is symmetric and strictly positive definite. Since $\ker(P_n(a,b)) = \ker(A_n(b))$ we can also write that
\[
U_n(a,b) - sI = U \begin{bmatrix} \operatorname{diag}(-s, \ldots, -s) & 0 \\ 0 & H_s \end{bmatrix} U^H
\]
and, in light of Eq. (19),
\[
W(s) = U \begin{bmatrix} \operatorname{diag}(0, \ldots, 0) & 0 \\ 0 & X H_s X \end{bmatrix} U^H.
\]

So, it follows that $H_s$ and $X H_s X$ have the same inertia, by virtue of the Sylvester inertia law, owing to the invertibility and the symmetry of $X$. In other words we can say that the matrices $W(s)$ and $U_n(a,b) - sI$ have the same inertia in the subspace orthogonal to $Z_0(b)$, where $Z_0(b)$ is the null space of $A_n(b)$. Therefore, recalling that $P_n(a,b)$ and $U_n(a,b)$ have the same spectra, we find that
\[
\begin{aligned}
N(s, \infty, (a,b), n) &= \#\{i : \lambda_i(P_n(a,b)) > s\} \\
&= \#\{i : \lambda_i(U_n(a,b) - sI) > 0\} \\
&= \#\{i : \lambda_i(U_n(a,b)) = 0\}\, F_s(0) + \#\{i : \lambda_i(H_s) > 0\} \\
&= \dim(\ker(P_n(a,b)))\, F_s(0) + \#\{i : \lambda_i(X H_s X) > 0\}.
\end{aligned}
\]

Now, through Lemmas 4.2 and 4.3 and by comparison with the splitting of $W(s)$, we have
\[
\begin{aligned}
N(s, \infty, (a,b), n) &= \bigl(n \cdot m\{A_0(b)\} + O(er)\bigr) F_s(0) + \#\{i : \lambda_i(X H_s X) > 0\} \\
&= \bigl(n \cdot m\{A_0(b)\} + O(er)\bigr) F_s(0) + \#\{i : \lambda_i(W(s)) > 0\},
\end{aligned}
\]
where, again in light of Lemmas 4.2 and 4.3,
\[
\begin{aligned}
\#\{i : \lambda_i(W(s)) > 0\} &= N(0, \infty, a - sb, n) \\
&= n \cdot m\{x \in [0,1] : a - sb > 0\} + O(er) \\
&= n \cdot m\{x \in A^{+}(b) : a/b > s\} + O(er) \\
&= n \int_{A^{+}(b)} F_s(a/b)\, dx + O(er).
\end{aligned}
\]
Finally we have $er = O(1)$ if [Hp1] holds for $a - sb$ or $er = o(n)$ if [Hp2] holds for $a - sb$. These remarks end the proof. □
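
The inertia argument used in the proof can be checked numerically with a short sketch. It is only an illustration under stated assumptions, not the construction used in the paper: we assume the simplest three-point scheme (k = 1), for which $A_n(c)$ is tridiagonal with diagonal entries $c(x_{i-1/2}) + c(x_{i+1/2})$ and off-diagonal entries $-c(x_{i+1/2})$ on a uniform grid with $h = 1/(n+1)$. The function names are hypothetical. With $b(x) = x$ the matrix $A_n(b)$ is positive definite on this grid, so the number of eigenvalues of $P_n(a,b)$ larger than $s$ equals the number of positive eigenvalues of $W(s) = A_n(a - sb)$, which is counted through the signs of the $LDL^T$ pivots.

```python
import math

def a(x):
    # Coefficient of the example a(x) = x(1 + x^2)/exp(x) used in Section 5.
    return x * (1.0 + x * x) / math.exp(x)

def b(x):
    return x

def positive_eigenvalue_count(c, n):
    """Count the positive eigenvalues of the assumed tridiagonal A_n(c).

    Assumption: diagonal c(x_{i-1/2}) + c(x_{i+1/2}), off-diagonal
    -c(x_{i+1/2}), h = 1/(n+1). By the Sylvester inertia law the count
    equals the number of positive pivots of the LDL^T factorization.
    """
    h = 1.0 / (n + 1)
    mid = [c((j + 0.5) * h) for j in range(n + 1)]  # c at the n+1 midpoints
    positives = 0
    pivot = 0.0
    for i in range(n):
        diag = mid[i] + mid[i + 1]
        pivot = diag if i == 0 else diag - mid[i] ** 2 / pivot
        if pivot == 0.0:
            pivot = 1e-300  # guard against an exact zero pivot
        if pivot > 0.0:
            positives += 1
    return positives

def measure_where_ratio_exceeds(s, tol=1e-12):
    """Lebesgue measure of {x in [0,1] : a(x)/b(x) > s}.

    Here a/b = (1 + x^2) exp(-x) is decreasing on [0,1], so the set is an
    interval [0, x*) and bisection for x* gives its measure.
    """
    def ratio(x):
        return (1.0 + x * x) * math.exp(-x)
    if ratio(1.0) > s:
        return 1.0
    if ratio(0.0) <= s:
        return 0.0
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        middle = 0.5 * (lo + hi)
        if ratio(middle) > s:
            lo = middle
        else:
            hi = middle
    return 0.5 * (lo + hi)

n, s = 300, 0.9
count = positive_eigenvalue_count(lambda x: a(x) - s * b(x), n)
fraction = count / n                     # (1/n) * N(s, infinity, (a, b), n)
limit = measure_where_ratio_exceeds(s)   # right-hand side of (18); m{A_0(b)} = 0
print(fraction, limit)
```

For this choice of $a$ and $b$ the set $A_0(b)$ has zero measure, so the second term on the right-hand side of (18) vanishes and the two printed numbers should be close for large $n$.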

Notice that, when $a - sb$ has only a finite number of isolated zeros and $F$, $a$ and $b$ are Lipschitz continuous, then the quoted result tells us that the error
\[
\frac{1}{n} \sum_{i=1}^{n} F\bigl(\lambda_i^{(n)}\bigr) - \int_{A^{+}(b)} F(g)\, dx
\]
is bounded by $c/n$ with $c$ an absolute constant. Therefore the given statement is also in the style of the second-order results due to Widom [48] in the Toeplitz context.
The following are the main Szegő-type results.

Theorem 4.7. Let $a$ and $b$ be two equivalent functions so that $a - sb$ fulfills assumption [Hp2] for any $s \in H$ with $H = \mathbb{R}$ and $0 \in H$, then the eigenvalues $\lambda_i^{(n)}$ of the matrices $\{P_n(a,b)\}_n$ satisfy the ergodic formula (20), i.e., for any continuous function $F$ with bounded support, it ensues
\[
\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} F\bigl(\lambda_i^{(n)}\bigr) = \int_{A^{+}(b)} F(a/b)\, dx + m\{A_0(b)\}\, F(0), \tag{20}
\]
where $A^{+}(b) = \{x \in [0,1] : b(x) > 0\}$.

Proof. Call $S$ the space of the linear combinations of functions of the form $F_s$ with $s \in H$. Observe that the topological closure with respect to the infinity norm of the space $S$ contains all the continuous functions with bounded support. Now the general statement given in (20) is a consequence of Theorem 4.6 and of the linearity with respect to $F$ of both sides of Eq. (20). □

Theorem 4.8. Let $a$ and $b$ be two continuous functions with $b$ taking nonnegative values and $a_0(b) = m\{A_0(b)\} = 0$. Then the eigenvalues $\lambda_i^{(n)}$ of the matrices $\{P_n(a,b)\}_n$ satisfy the following ergodic relation, i.e., for any continuous function $F$ with bounded support, it holds
\[
\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} F\bigl(\lambda_i^{(n)}\bigr) = \int_{A^{+}(b)} F(a/b)\, dx + m\{A_0(b)\}\, F(0). \tag{21}
\]

Proof. By Proposition 5.2 in [44] (see also Theorem 3.3 in [37]), it follows that the sequence $\{A_n(a)\}_n$ is Locally Toeplitz with regard to the pair $(a, |p_c|^2)$. Therefore by Theorem 3.6 in [44] (Eq. (43)) we deduce that $\{A_n(a)\}_n$ distributes as $a(x)\,|p_c(y)|^2$, i.e. for any continuous function $F$ with bounded support, it holds
\[
\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} F\bigl(\lambda_i^{(n)}(A_n(a))\bigr) = \frac{1}{2\pi} \int_{[0,1] \times [-\pi,\pi]} F\bigl(a(x)\,|p_c(y)|^2\bigr)\, dx\, dy.
\]
Moreover by the assumption and since $|p_c(y)|^2$ is a nonzero polynomial we have $m\{(x,y) \in [0,1] \times [-\pi,\pi] : b(x)\,|p_c(y)|^2 = 0\} = 0$ (i.e. $b(x)\,|p_c(y)|^2$ is "sparsely vanishing" [11]).
Finally $A_n(\cdot)$ is a Linear Positive Operator for any $n$. Therefore the assumptions of Theorem 2.6 in [35] are fulfilled and the claimed thesis follows. □

Remark. Theorem 4.7 does not follow from Theorem 4.8 since there exist equivalent continuous functions $a$ and $b$ with $A_0(b)$ having positive measure so that one of the assumptions of Theorem 4.8 is violated. Conversely, Theorem 4.8 does not follow from Theorem 4.7. In fact there exist continuous functions $b$ with $a_0(b) = m\{A_0(b)\} = 0$ such that the set $A_0(b)$ does not fulfill condition [Hp2]: take $b(x) = \inf_{y \in K} |x - y|$ with $K$ being the classical Cantor set of $[0,1]$ so that $A_0(b) = K$.

As a direct consequence of Theorems 4.7 and 4.8 we find the following.

Corollary 4.2. Let $a$ and $b$ be two continuous functions satisfying the assumptions of Theorem 4.7 or of Theorem 4.8. Then the topological closure $Z$ of the set of the eigenvalues of all the matrices $\{P_n(a,b)\}_n$ is contained in $[c_1, c_2] \cup \{0\}$ and the set $Z$ contains the usual range of the function $a/b$. Moreover, the eigenvectors related to the null eigenvalue belong, for any dimension, to the null space of $A_n(a)$.

Proof. It is a simple consequence of Theorems 4.7 or 4.8 according to the assumptions. For more details, see the proof of a more general result in [38]. □

When the two functions are not equivalent then the matrices $\{A_n(b)\}_n$ are not good preconditioners since a "lot" of eigenvalues (not related to null spaces of $\{A_n(a)\}_n$) of the preconditioned matrices cluster at zero and/or at infinity and this is at odds with Definition 4.3.

Corollary 4.3. Let $a$ and $b$ be two nonequivalent continuous functions with $b$ taking nonnegative values. Then the following cases can occur:
· For $n$ large enough, the null space of $P_n(a,b)$ strictly contains the one of $A_n(a)$.
· $\forall \epsilon > 0$, $\exists c_\epsilon \in (0,1]$, $\exists n_\epsilon \in \mathbb{N}$ so that at least $c_\epsilon n$ eigenvalues of $P_n(a,b)$ belong to $(-\epsilon, \epsilon)$ for $n \ge n_\epsilon$. Moreover the related eigenvectors do not belong to the null space of $A_n(a)$.
· $\forall M > 0$, $\exists c_M \in (0,1]$, $\exists n_M \in \mathbb{N}$ so that at least $c_M n$ eigenvalues of $P_n(a,b)$ belong to $(-\infty, -M] \cup [M, \infty)$ for $n \ge n_M$.

Proof. For the sake of notational simplicity, we consider the nonequivalent case where the function $a(x)$ is nonnegative, continuous, the set $\{x \in [0,1] : a(x) = 0\}$ is strictly contained in the set $\{x \in [0,1] : b(x) = 0\}$ and there exists an interval $J_1 \subset [0,1]$ so that $b(x) = 0$ over $J_1$ and $a(x)$ is strictly positive on $J_1$.
By referring to the notations used in Lemma 4.2, we consider the set $I^{0}_{n,J_1}(b) = \{j : b(\tilde{x}_j) = 0,\ \tilde{x}_j \in J_1\} \subset I^{0}_{n}(b)$ and the set $\hat{I}^{0}_{n,J_1}(b) = \hat{I}^{0}_{n}(b) \cap I^{0}_{n,J_1}(b)$ consequently defined. Therefore, since
\[
\hat{I}^{0}_{n,J_1}(b) \subset I^{0}_{n,J_1}(b), \qquad \#\bigl(\hat{I}^{0}_{n,J_1}(b)\bigr) = \#\bigl(I^{0}_{n,J_1}(b)\bigr) - O(1),
\]
we deduce that the subspace $S^{0}_{J_1}(b)$ of the vectors of the canonical basis, whose indices belong to $\hat{I}^{0}_{n,J_1}(b)$, is contained in the null space of $A_n^{+}(b)$, while
\[
z^T A_n(a) z > 0
\]
with $z \in S^{0}_{J_1}(b)$, $\|z\|_2 = 1$. Now, in light of Lemma 4.2, the dimension of the subspace $S^{0}_{J_1}(b)$ is equal to $n \cdot m\{J_1\} + O(1)$ and the theorem is proved. □

Therefore, if the equivalence condition does not hold, then the preconditioned matrix is characterized by a number of very small and/or very large eigenvalues that grows linearly with $n$. This fact, in light of the Axelsson–Lindskog analysis [3], leads to a substantial deterioration of the performance of the related PCG method.

5. The Toeplitz preconditioner

In the previous section we have proved that optimal preconditioners for the matrices $\{A_n(a)\}_n$ are matrices of the same kind $\{A_n(b)\}_n$, provided that $b$ is a function equivalent to $a$ (according to Definition 4.1). The computational interest of such a result crucially depends on the "cheap solution" property of a generic linear system whose coefficient matrix is $A_n(b)$.
Therefore, in the case of a strictly elliptic problem, i.e. $a(x)$ strictly positive, it is evident that a preconditioner such as the matrix $D_{m,k} = A_n(1, m, k) \equiv A_n(1)$ ($b \equiv 1$) is a good choice for two reasons:
· $\{D_{m,k}\}_n$ are optimal preconditioners in the Axelsson–Lindskog sense [3],
· $D_{m,k}$ is a band-Toeplitz matrix, for any size $n$.
Notice that the optimality of the preconditioner $D_{m,k}$ is a direct consequence of Theorem 4.4. Moreover, we appreciate the role of this preconditioner since the original coefficient matrix $A_n(a)$ has a condition number which grows asymptotically at least as $n^{2k}$ (recall Theorem 4.3).
A further improvement of the preconditioner $D_{m,k}$ is based on the use of the main diagonal of $A_n(a)$. More specifically, we define the preconditioner matrix $P$ as $D_n^{1/2}(a) D_{m,k} D_n^{1/2}(a)$ where $D_n(a)$ is the suitably scaled main diagonal of $A_n(a)$.
The exceptional clustering properties of the preconditioned matrices associated with $P$ have been analysed in [30] for some very special choices of the value $m$. In [39] a general analysis of the eigenvalue clustering of the matrix set $\{P^{-1} A_n(a)\}_n$ is performed. In Section 8, extensive numerical experiments are reported, confirming the effectiveness of this approach.
Finally, we want to consider the case where $a(x)$ is a nonnegative function and has zeros. In this case the simple preconditioner $D_{m,k}$ does not assure the optimality of the related PCG method. In effect, by using Corollary 4.3, we can state a negative result: more precisely, for any $\epsilon > 0$ there exist $n_\epsilon$ and a positive constant $c_\epsilon$ such that, for any $n \ge n_\epsilon$, at least $\lceil c_\epsilon n \rceil$ eigenvalues of $D_{m,k}^{-1} A_n(a)$ belong to $[0, \epsilon]$. Therefore, we must expect that the performance of the related PCG methods is strongly spoiled (see all the tables in Section 8 when $a(x)$ has zeros).
On the other hand, the performance of the PCG method based on the improved preconditioner $P$ is still good and an in-depth theoretical explanation of this phenomenon is presented in [39].
However, Theorem 4.4 also furnishes a further indication for dealing with the case of $a(x)$ with zeros: supposing that $a(x)$ has isolated zeros and $b(x)$ is chosen so that
\[
0 < c_1 \le \frac{a}{b} \le c_2,
\]
then the matrices $\{A_n(b)\}_n$ are still optimal preconditioners according to the optimality definition considered by Axelsson and Lindskog [3].
For numerical evidence of the associated PCG method see the last Table 5 in Section 8, where $a(x)$ has a complicated expression but is asymptotic to $x$ and therefore $b(x) = x$ can be chosen.
However, in this context it is interesting to give numerical evidence of the spectral results found in Section 4. In Fig. 1 the plot of the eigenvalues of $A_n(a)$ and $A_n(b)$ with $a(x) = x(1 + x^2)/\exp(x)$ and $b(x) = x$ is reported. Notice, for instance, that the plot of the eigenvalues of $A_n(b)$ is not close to a sampling of the function $b(x) = x$. Nevertheless, this is not a surprising result since, in the light of the Tilli analysis [44], we know that the eigenvalues of $\{A_n(b)\}_n$ distribute as $b(x)\,|p_c(y)|^2$ over the rectangle $[0,1] \times [-\pi, \pi]$.

Fig. 1. $A_n(x(1 + x^2)/\exp(x), m, k)$ and $A_n(x, m, k)$ spectra ($n = 300$, $m = 2$, $k = 2$).

In Fig. 2 the plot of the eigenvalues of the preconditioned matrix $P_n(a,b) = A_n^{-1}(b) A_n(a)$ is reported. Notice that, in light of the ergodic results and since $b$ vanishes in a negligible subset of $[0,1]$, this plot must distribute as $a/b$.

Fig. 2. $P_n(a,b) = A_n^{-1}(x, m, k) A_n(x(1 + x^2)/\exp(x), m, k)$ spectrum ($n = 300$, $m = 2$, $k = 2$).
To give numerical evidence of this, the comparison between these eigenvalues and the values $(a/b)(x_i)$ on a uniform $n$-dimension grid $\{x_i\}$ of $[0,1]$ is reported in Fig. 3 where the absolute error $|(a/b)(x_i) - \lambda_i(P_n(a,b))|$ is plotted with respect to the different cases of $k$ and $m$ values. These numerical results perfectly confirm the theoretical expectations of Theorems 4.7 and 4.8 and Corollary 4.2.

Fig. 3. Absolute error $|a(x)/b(x) - \lambda_i(P_n(a,b))|$ vs $x$ ($a(x) = x(1 + x^2)/\exp(x)$, $b(x) = x$, $n = 300$).

6. Operation counts

As shown in Section 5, in the case of nonnegative functions with isolated zeros, optimal preconditioners are based on Toeplitz matrices. This statement is important since in recent literature it is possible to choose among different very efficient band-Toeplitz solvers whose cost is strongly reduced with respect to classical band solvers [16]:
· multigrid methods requiring $O((2q - 1)n)$ ops and $O(\log n)$ parallel steps with $O(2nq)$ processors [12,13] in the parallel PRAM model of computation,
· a recursive displacement-rank based technique [5] requiring $O(n \log(2q - 1) + (2q - 1) \log^2(2q - 1) \log(n))$ ops and $O(\log n)$ parallel steps with $O(2nq)$ processors,
where $n$ is the matrix dimension and $2q - 1$ the matrix bandwidth.
So, in order to obtain the total computational cost needed to get the solution of a system $A_n(a)u = f$ using the PCG method, the preceding costs must be multiplied by the number of PCG iterations, which is constant with respect to $n$ (see Theorems 4.4 and 4.7), and added to the cost of a few matrix-vector multiplications (recall the PCG algorithm). The final cost is $O(n \log(2q - 1))$ ops and $O(\log(n(2q - 1)))$ parallel steps with $O(2nq)$ processors in the PRAM model of computation.
In conclusion, we have reduced the asymptotic cost of these band systems (which are Locally Toeplitz [44,38]) to the cost of band-Toeplitz systems for which the recent literature provides very sophisticated algorithms (for the most efficient see [5]).
Note that both the bandwidth of the matrix $A_n(a) \equiv A_n(a, m, k)$ and the bandwidth of the Toeplitz preconditioner $D_{m,k}$ grow linearly with $m$, because the matrices $D_{m,k}$ and $A_n(a)$ share the same pattern.
In addition, when $a(x)$ is regular enough, the approximation error given by the exact solution of the linear system associated with $A_n(a)$ goes to zero as $h^{q-k+1}$. Therefore, for increasing $q$ values, a substantial improvement in the precision is obtained while the computational cost increases only logarithmically with $m$ [5].
As evidence of this observation, let us consider the two "consecutive" matrices $A_n(a, m, k)$ and $A_n(a, m + 1, k)$. Then, in order to obtain an approximation error bounded by a fixed $\epsilon$, in the first case we require a dimension
\[
n_1 = O\!\left(\frac{1}{\epsilon^{1/(q-k+1)}}\right)
\]
and in the second case we require
\[
n_2 = O\!\left(\frac{1}{\epsilon^{1/(q-k+3)}}\right).
\]
Setting $k = 2$, $q = 3$, we find
\[
n_1 = O\!\left(\frac{1}{\epsilon^{1/2}}\right), \qquad n_2 = O\!\left(\frac{1}{\epsilon^{1/4}}\right) = O(\sqrt{n_1}).
\]
Therefore, in the second case for $\epsilon = 10^{-8}$ instead of solving a system of order 10 000, we must solve a system of order 100: this dramatic reduction of the algebraic problem size is obtained at the cost of a small growth of the bandwidth of the involved systems.
However, the bandwidth of $A_n(a, m + 1, k)$ equals the bandwidth of $A_n(a, m, k)$ plus 4 and, in the considered example, the cost of each iteration is strongly reduced since it moves from $c \cdot 10^4 \log(5)$ to $c \cdot 10^2 \log(9)$, where $c$ is a suitable positive constant.
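
The arithmetic of this example can be reproduced with a few lines. The sketch below is only illustrative: the constants hidden in the $O(\cdot)$ symbols and the constant $c$ are set to 1, which is our assumption, and the helper name `required_dimension` is hypothetical.

```python
import math

def required_dimension(eps, q, k, extra_order=0):
    """Dimension n for which h^(q-k+1+extra_order) is of order eps.

    The constant hidden in the O(.) symbol is taken equal to 1 (assumption).
    """
    return eps ** (-1.0 / (q - k + 1 + extra_order))

eps, q, k = 1e-8, 3, 2
n1 = required_dimension(eps, q, k)                 # A_n(a, m, k)
n2 = required_dimension(eps, q, k, extra_order=2)  # A_n(a, m + 1, k)

# Cost of one iteration, n * log(bandwidth), with bandwidths 5 and 9
# as in the text and the constant c set to 1 (assumption).
cost1 = n1 * math.log(5)
cost2 = n2 * math.log(9)

print(round(n1), round(n2), cost1 / cost2)
```

Under these assumptions the ratio `cost1 / cost2` measures how much cheaper a single iteration becomes when the higher-order formula is used.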

7. Further generalizations

These simple, but rather powerful, results can be strongly generalized with regard to two main directions.
· When considering $a \in L^1$, we have to modify the definition of $a(\tilde{x}_i)$ as a "mean value" in order to deal with this irregular case (compare the suggestions given in [30,38,36]).
· When considering $p$-dimensional elliptic problems on square regions [36,41], the only delicate step is the careful use of tensor arguments.

7.1. The case where $a(x)$ is not regular

When the coefficient function $a \in L^1$ is not a continuous function, due to the role played by the zero-measure sets, it is senseless to discretize the template problem (1) by using the evaluations of the function $a$ in some discrete set of points. Therefore, in order to deal with $L^1$ functions, we have to modify the definition of $a(\tilde{x}_i)$ as a "mean value". More precisely, the usual value $a(\tilde{x}_i)$ is replaced by
\[
n \int_{I_i} a(t)\, dt, \qquad I_i = [\tilde{x}_{i-1}, \tilde{x}_i]. \tag{22}
\]
Notice that in the case where $a(x)$ is continuous the standard and the new definition of the quantity $a(\tilde{x}_i)$ do not coincide but are asymptotically equivalent. In particular the following result holds true.

Proposition 7.1. Let $a(x)$ be a continuous function. Then
\[
\left| a(\tilde{x}_i) - n \int_{I_i} a(t)\, dt \right| \le \omega_a(n^{-1})
\]
with $\omega_a(\cdot)$ being the modulus of continuity of the function $a(x)$.
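
A small numerical check of this bound can be sketched as follows. The grid $\tilde{x}_i = i/n$ is our assumption (the factor $n$ in (22) suggests intervals $I_i$ of length $1/n$), and the test function $a(x) = x^2$, for which the mean value and the modulus of continuity on $[0,1]$ are available in closed form, is chosen only for illustration.

```python
def a(x):
    return x * x

def antiderivative(x):
    # Primitive of a(x) = x^2, used to evaluate the integral in (22) exactly.
    return x ** 3 / 3.0

def modulus_of_continuity(delta):
    # For a(x) = x^2 on [0,1]: sup |a(x) - a(y)| over |x - y| <= delta
    # is attained at x = 1, y = 1 - delta.
    return 2.0 * delta - delta * delta

n = 50
h = 1.0 / n
worst = 0.0
for i in range(1, n + 1):
    left, right = (i - 1) * h, i * h          # I_i = [x_{i-1}, x_i], assumed x_i = i/n
    mean_value = n * (antiderivative(right) - antiderivative(left))
    worst = max(worst, abs(a(right) - mean_value))

bound = modulus_of_continuity(h)
print(worst, bound)
```

The largest gap between the point value and the mean value is attained on the last interval, where $a$ varies most, and it stays below $\omega_a(n^{-1})$ as the proposition states.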



Notice that the dyadic decomposition Theorem is still valid and if the function $a(x)$ is (essentially) nonnegative, owing to the use of the Lebesgue integral in Eq. (22), the matrix $A_n(a)$ can still be looked at as a Linear Positive Operator from the linear space of the Lebesgue integrable functions (with the "essential" ordering) to the linear space of the real symmetric $n \times n$ matrices.
Since these two properties are the key ones in the theory discussed in the previous sections, we can conclude that all the theorems developed in Sections 3 and 4 can be generalized to this case (for a similar extension refer to Section 6 in [36]).

7.2. A multidimensional example

Here we are not interested in a general and detailed analysis of the multidimensional case, owing to the cumbersome notation it requires. Therefore, in order to show how the proposed ideas can be extended and used in this context, we focus our attention on a two dimensional problem ($p = 2$) only. More precisely, let us consider the weighted Bilaplacian problem
\[
\frac{\partial^2}{\partial x^2}\left( a(x,y) \frac{\partial^2}{\partial x^2} u(x,y) \right) + \frac{\partial^2}{\partial y^2}\left( a(x,y) \frac{\partial^2}{\partial y^2} u(x,y) \right) = f(x,y)
\]
over $\Omega = [0,1]^2$ with the homogeneous boundary conditions. We discretize both the operators
\[
\frac{\partial^2}{\partial z^2}\left( a(x,y) \frac{\partial^2}{\partial z^2} u(x,y) \right), \qquad z \in \{x, y\},
\]
by using $m = 2$ ($q = 5$) and the maximal precision order FD formula.
This leads to a 9-point formula with respect to the $x$ axis and to the $y$ axis: the related matrix, denoted by $A_n(a, m_x, m_y, k)$ with $m_x = m_y = k = 2$, is a two-level banded matrix.
The first preconditioner is constructed by using the same formula applied to the case $a(x) \equiv 1$. Therefore, the related matrix is a two-level Toeplitz matrix [21,45] denoted by $D_{m_x,m_y,k}$ with $m_x = m_y = k = 2$. Additionally, we can enrich the preconditioning technique by using the information given by the main diagonal $D_n(a)$ of $A_n(a, m_x, m_y, k)$, so defining the preconditioner $P = D_n^{1/2}(a) D_{m_x,m_y,k} D_n^{1/2}(a)$.
For the matrices $A_n(a, m_x, m_y, k)$, $D_{m_x,m_y,k}$ and $D_{m_x,m_y,k}^{-1} A_n(a, m_x, m_y, k)$ the following results hold:
· the operator $A_n(\cdot, m_x, m_y, k)$ is linear and positive;
· $D_{m_x,m_y,k}$ is a two-level Toeplitz matrix;
· the eigenvalues of $D_{m_x,m_y,k}^{-1} A_n(a, m_x, m_y, k)$ belong to the interval $[c_1, c_2]$ with $c_1 = \inf a$ and $c_2 = \sup a$.
Therefore we can conclude that all the theorems developed in Sections 3 and 4 can be generalized to this case (for a similar, but more detailed extension, see [30] in the case $k = 1$, and [41]).
Concerning the computational cost we point out that the displacement rank technique developed in [5] is no longer "optimal" in the multilevel Toeplitz case. The same remark also holds for the classical band-solvers [16]. Therefore, in the multilevel case the only known optimal iterative solvers are those based on the multigrid methods [13,43] or on mixed methods (PCG + multigrid) [29]: with the latter multigrid-type choices, the computational cost is linear in the size of the linear system and linear in the bandwidths of the coefficient matrix.

8. Numerical experiments

In this section we present numerical results obtained in the solution of the linear system $A_n(a, m, k)u = f$ with respect to different choices of the coefficient function $a(x)$, ranging from the "easy" case given by $k = 1$ and a strictly positive coefficient $a(x) \equiv e^x$ to the "difficult" one given by $k = 2$ and nonnegative coefficient functions with zeros and/or discontinuity in the function and/or in their derivatives. Note that in the last case, this choice of the functional coefficient greatly increases the ill-conditioning of the resulting sequence of matrices $\{A_n(a, m, k)\}_n$.

Table 1
Number of PCG steps in the case of $A_n(a, m, k)$ with $k = 1$, $m = 1$ and $n = 300$
a(x)                      D_n(a)   D_{1,1}   P
1 + x                     300      11        3
                          300      11        3
exp(x)                    300      14        3
                          300      14        4
x                         300      107       7
                          300      106       7
x^2                       300      433       4
                          300      433       4
(x - 0.5)^2               150      212       4
                          150      430       7
|x - 0.5| + 0.5           150      11        4
                          300      11        4
|x - 0.5|                 150      70        8
                          150      102       11
√ 1 + x if x ≥ 0          300      11        4
  1 if x < 0              300      11        4
sin^2(7x) + 1             300      12        8
                          300      12        8
exp(x) if x ≤ 2/3         300      11        5
2 - x if x > 2/3          300      11        6
x(1 + x^2)/exp(x)         300      93        7
                          300      92        7

Table 2
Number of PCG steps in the case of $A_n(a, m, k)$ with $k = 1$, $m = 2$ and $n = 300$
a(x)                      D_n(a)   D_{2,1}   P
1 + x                     323      11        3
                          325      11        3
exp(x)                    323      14        3
                          324      14        4
x                         326      107       7
                          326      107       7
x^2                       327      434       4
                          326      433       4
(x - 0.5)^2               158      213       6
                          267      431       7
|x - 0.5| + 0.5           158      11        4
                          324      11        4
|x - 0.5|                 158      70        8
                          324      102       11
√ 1 + x if x ≥ 0          343      12        4
  1 if x < 0              345      11        4
sin^2(7x) + 1             323      12        8
                          325      12        8
exp(x) if x ≤ 2/3         343      11        5
2 - x if x > 2/3          345      11        5
x(1 + x^2)/exp(x)         326      93        7
                          326      93        7
We do not make explicit comparison to circulant preconditioners [7,18,22] because, as shown in [14,15], $\tau$ and band-Toeplitz based preconditioners perform much better than circulant ones in the case of Dirichlet boundary value problems. It is also true that circulant preconditioners are very effective in the case of y-periodic linear elliptic problems [22] with strictly positive smooth coefficient functions.
This complementary behaviour of the two techniques is not surprising: in fact, while the discretized y-periodic linear elliptic problems have a "circulant pattern", the discretized Dirichlet BVPs show a symmetric pattern which is naturally closer to the $\tau$ and symmetric Toeplitz structure.
We performed our tests at different grid sizes and we noticed no significant change since, as proved in the previous sections, our PCG method shows a convergence rate which is independent of the dimension $n$ of the involved systems. Therefore, we only report the numerical results obtained in the case $n = 300$.

Table 3
Number of PCG steps in the case of $A_n(a, m, k)$ with $k = 1$, $m = 3$ and $n = 300$
a(x)                      D_n(a)   D_{3,1}   P
1 + x                     343      11        3
                          345      11        3
exp(x)                    343      15        3
                          345      15        4
x                         346      107       7
                          346      107       7
x^2                       347      434       4
                          346      433       4
(x - 0.5)^2               167      213       6
                          264      431       7
|x - 0.5| + 0.5           167      11        4
                          344      11        4
|x - 0.5|                 167      70        8
                          344      102       11
√ 1 + x if x ≥ 0          343      12        4
  1 if x < 0              345      11        4
sin^2(7x) + 1             343      12        8
                          345      12        8
exp(x) if x ≤ 2/3         343      11        6
2 - x if x > 2/3          345      11        6
x(1 + x^2)/exp(x)         346      93        7
                          346      93        7
As usual the errors have been measured according to the relative Euclidean norm of the residuals. More precisely, we say that the error at the $s$th step is less than $\varepsilon$ if the relationship $\|r_s\|_2 / \|r_0\|_2 < \varepsilon$ holds, $r_s$ being the residual of the PCG method at the $s$th step.
The considered preconditioners are: the diagonal matrix $D_n(a)$ given by the suitably scaled main diagonal of $A = A_n(a, m, k)$, the band-Toeplitz matrix $D_{m,k} = A_n(1, m, k) = A_n(1)$ and the improved Toeplitz based preconditioner $P = D_n^{1/2}(a) D_{m,k} D_n^{1/2}(a)$.
We make comparison with the basic preconditioners $D_n(a)$ and $D_{m,k}$ showing that only the combination of the two matrices leads to "more than linear" PCG methods (see also [30,39]).
On the other hand, when $a(x)$ is strictly positive, even if irregular or highly oscillating, we observe that the matrix $D_{m,k}$ performs like an optimal preconditioner.
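
The comparison among the three preconditioners can be mimicked with a short, self-contained sketch. It is not the code used for the tables: we assume the simplest three-point scheme ($k = m = 1$), in which $A_n(a)$ is tridiagonal with diagonal entries $a(x_{i-1/2}) + a(x_{i+1/2})$ and off-diagonal entries $-a(x_{i+1/2})$, $h = 1/(n+1)$; we assume that the "suitable scaling" of $D_n(a)$ is the one giving $D_n(1) = I$; and the function names are hypothetical. The stopping rule is the relative residual criterion described above, with $f$ the vector of all ones.

```python
import math

def build(a, n):
    """Diagonal and off-diagonal of the assumed three-point matrix A_n(a)."""
    h = 1.0 / (n + 1)
    mid = [a((j + 0.5) * h) for j in range(n + 1)]  # a at the n+1 midpoints
    diag = [mid[i] + mid[i + 1] for i in range(n)]
    off = [-mid[i + 1] for i in range(n - 1)]
    return diag, off

def matvec(diag, off, v):
    out = [d * x for d, x in zip(diag, v)]
    for i, o in enumerate(off):
        out[i] += o * v[i + 1]
        out[i + 1] += o * v[i]
    return out

def solve_tridiagonal(diag, off, rhs):
    """Thomas algorithm for a symmetric positive definite tridiagonal system."""
    n = len(rhs)
    piv, y = [diag[0]], [rhs[0]]
    for i in range(1, n):
        w = off[i - 1] / piv[i - 1]
        piv.append(diag[i] - w * off[i - 1])
        y.append(rhs[i] - w * y[i - 1])
    x = [0.0] * n
    x[-1] = y[-1] / piv[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (y[i] - off[i] * x[i + 1]) / piv[i]
    return x

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def pcg_steps(diag, off, f, apply_inverse, tol=1e-7, max_steps=2000):
    """PCG steps until ||r_s||_2 / ||r_0||_2 < tol (zero initial guess).

    Only the residual recursion is kept, since just the step count is needed.
    """
    r = f[:]
    z = apply_inverse(r)
    p = z[:]
    rz = dot(r, z)
    r0_norm = math.sqrt(dot(r, r))
    for step in range(1, max_steps + 1):
        Ap = matvec(diag, off, p)
        alpha = rz / dot(p, Ap)
        r = [ri - alpha * q for ri, q in zip(r, Ap)]
        if math.sqrt(dot(r, r)) / r0_norm < tol:
            return step
        z = apply_inverse(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return max_steps

n = 300
diag, off = build(lambda x: 1.0 + x, n)      # A_n(a) with a(x) = 1 + x
t_diag, t_off = build(lambda x: 1.0, n)      # D_{1,1} = A_n(1) = tridiag(-1, 2, -1)
root = [math.sqrt(d / 2.0) for d in diag]    # D_n(a)^{1/2}, assumed scaling D_n(1) = I

def inv_diagonal(r):
    return [ri / (s * s) for ri, s in zip(r, root)]

def inv_toeplitz(r):
    return solve_tridiagonal(t_diag, t_off, r)

def inv_P(r):
    # P^{-1} = D_n(a)^{-1/2} D_{1,1}^{-1} D_n(a)^{-1/2}
    y = solve_tridiagonal(t_diag, t_off, [ri / s for ri, s in zip(r, root)])
    return [yi / s for yi, s in zip(y, root)]

f = [1.0] * n
steps = {name: pcg_steps(diag, off, f, inv)
         for name, inv in [("D_n(a)", inv_diagonal),
                           ("D_{1,1}", inv_toeplitz),
                           ("P", inv_P)]}
print(steps)
```

The exact counts depend on the assumptions listed above, so they need not coincide with the entries of Table 1; the qualitative ordering, with the diagonal preconditioner far behind the two Toeplitz based ones, is what the sketch is meant to show.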
All these theoretical expectations have strong confirmation in Tables 1–4 where the number of PCG steps to reach the solution within a preassigned accuracy $\varepsilon = 10^{-7}$ is reported both in the case $(f)_i = 1$ for every $i = 1, \ldots, n$ and $f$ vector of random numbers uniformly distributed on $[0,1]$.
The same tests are reported in Table 5 with respect to the case of $P_n(a,b) = A_n^{-1}(b, m, k) A_n(a, m, k)$ where $a(x) = x(1 + x^2)/\exp(x)$ and $b(x) = x$. Lastly, Table 6 is devoted to the multidimensional example of the weighted bilaplacian problem reported in Section 7.2.

Table 4
Number of PCG steps in the case of $A_n(a, m, k)$ with $k = 2$, $m = 2$ and $n = 300$
a(x)                      D_n(a)   D_{2,2}   P
1 + x                     -        13        4
                          -        13        4
exp(x)                    -        17        4
                          -        17        4
x                         -        124       11
                          -        124       11
x^2                       -        468       11
                          -        467       11
(x - 0.5)^2               -        227       21
                          -        448       27
|x - 0.5| + 0.5           -        14        7
                          -        14        8
|x - 0.5|                 -        82        23
                          -        109       27
√ 1 + x if x ≥ 0          -        14        6
  1 if x < 0              -        14        6
sin^2(7x) + 1             -        14        13
                          -        14        13
exp(x) if x ≤ 2/3         -        14        15
2 - x if x > 2/3          -        14        15
x(1 + x^2)/exp(x)         -        108       11
                          -        108       11

Table 5
Number of PCG steps in the case of preconditioning of $A_n(a, m, k)$, $a(x) = x(1 + x^2)/\exp(x)$, with $A_n(b, m, k)$, $b(x) = x$ ($n = 300$)
m, k     A_n(x, m, k)
1, 1     8
         7
2, 1     8
         7
3, 1     8
         7
2, 2     9
         9

Table 6
Number of PCG steps in the case of $A_n(a, m_x, m_y, k)$ with $k = 2$, $m_x = m_y = 2$ ($n = 100, 400, 900$)
a(x, y)                              D_n(a)  D_{2,2}  P     D_n(a)  D_{2,2}  P     D_n(a)  D_{2,2}  P
1 + x + y                            56      13       3     206     14       4     448     14       4
                                     69      13       3     252     14       4     543     14       4
exp(x + y)                           55      19       3     205     23       3     448     24       4
                                     57      19       3     209     22       3     449     24       4
x + y                                58      23       5     212     35       5     462     44       5
                                     76      24       5     291     35       5     625     43       5
(x + y)^2                            58      41       5     214     96       6     470     149      6
                                     76      47       5     293     104      6     629     162      6
(x - 0.5)^2 + (y - 0.5)^2            15      15       7     67      47       9     146     76       10
                                     63      40       9     234     87       11    509     14       8
|x - 0.5| + |y - 0.5| + 0.5          15      10       5     67      13       6     147     14       7
                                     65      11       7     238     13       7     509     14       8
|x - 0.5| + |y - 0.5|                15      14       7     67      25       9     146     33       11
                                     65      21       9     243     33       12    525     39       14
√ 1 + x + y if x + y ≥ 0             56      9        4     206     10       4     446     11       4
  1 if x + y < 0                     70      9        4     264     10       4     575     11       4
sin^2(7(x + y)) + 1                  57      10       11    207     11       14    456     12       14
                                     80      10       12    304     11       15    660     12       15
exp(x + y) if x + y ≤ 2/3            57      23       8     213     35       10    469     43       12
2 - x if x + y > 2/3                 83      24       8     307     36       12    658     43       14
(x + y)(1 + (x + y)^2)/exp(x + y)    58      21       5     212     31       5     462     38       5
                                     79      22       5     292     31       5     622     37       5

9. Concluding remarks

A general spectral and structural analysis concerning FD matrices and a class of preconditioned matrices has been performed. Besides the theoretical interest, the discussed results are a useful guide in order to define optimal preconditioners with respect to the considered class of elliptic problems.
Moreover, we want to stress the role played by algebraic arguments (the dyadic decomposition) and by LPOs in order to find spectral ergodic results of Szegő and Widom type.
To conclude, we notice that the connection among different topics proved successful in solving spectral and computational problems in structured linear algebra.

References

[1] F. Avram, On bilinear forms on Gaussian random variables and Toeplitz matrices, Probab. Theory Related Fields 79 (1988) 37–45.
[2] O. Axelsson, V. Barker, Finite Element Solution of Boundary Value Problems, Theory and Computation, Academic Press, New York, 1984.
[3] O. Axelsson, G. Lindskog, On the rate of convergence of the preconditioned conjugate gradient method, Numer. Math. 48 (1986) 499–523.
[4] D. Bini, Matrix structure in parallel matrix computation, Calcolo 25 (1988) 37–51.
[5] D. Bini, B. Meini, Effective methods for solving banded Toeplitz systems, invited lecture: SIAM annual meeting, Stanford, CA, July, 1997, SIAM J. Matrix Anal. Appl., in press.
[6] A. Böttcher, S. Grudsky, On the condition numbers of large semi-definite Toeplitz matrices, Linear Algebra Appl. 279 (1998) 285–301.
[7] R.H. Chan, T.F. Chan, Circulant preconditioner for elliptic problems, J. Numer. Linear Algebra Appl. 1 (1992) 77–101.
[8] P. Ciarlet, The Finite Element Method for Elliptic Problems, North-Holland, Amsterdam, 1978.
[9] P. Davis, Circulant Matrices, Wiley, New York, 1979.
[10] F. Di Benedetto, G. Fiorentino, S. Serra, C.G. preconditioning for Toeplitz matrices, Comput. Math. Appl. 25 (1993) 35–45.
[11] F. Di Benedetto, S. Serra Capizzano, A unifying approach to matrix algebra preconditioning, Numer. Math. 82 (1) (1999) 57–90.
[12] G. Fiorentino, S. Serra, Multigrid methods for Toeplitz matrices, Calcolo 28 (1991) 283–305.
[13] G. Fiorentino, S. Serra, Multigrid methods for symmetric positive definite block Toeplitz matrices with nonnegative generating functions, SIAM J. Sci. Comput. 17 (1996) 1068–1081.
[14] G. Fiorentino, S. Serra, Tau preconditioners for (high order) elliptic problems, in: Vassilevski (Ed.), Proceedings of the Second IMACS conference on Iterative Methods in Linear Algebra, Blagoevgrad (Bulgaria), June 1995, pp. 241–252.
[15] G. Fiorentino, S. Serra, Fast parallel solvers for elliptic problems, Comput. Math. Appl. 32 (1996) 61–68.
[16] G. Golub, C. Van Loan, Matrix Computations, Johns Hopkins University Press, Baltimore, 1983.
[17] U. Grenander, G. Szegő, Toeplitz Forms and Their Applications, 2nd ed., Chelsea, New York, 1984.

[18] T. Huckle, Fast transforms for tridiagonal linear equations, BIT 34 (1994) 99–112.
[19] M. Kac, W. Murdoch, G. Szegő, On the extreme eigenvalues of certain Hermitian forms, J. Rat. Mech. Anal. 13 (1953) 767–800.
[20] P.P. Korovkin, Linear Operators and Approximation Theory (English translation), Hindustan Publishing, Delhi, 1960.
[21] T.K. Ku, C.C.J. Kuo, On the spectrum of a family of preconditioned Toeplitz matrices, SIAM J. Sci. Stat. Comput. 13 (1992) 948–966.
[22] I. Lirkov, S. Margenov, P. Vassilevsky, Circulant block factorization for elliptic problems, Computing 53 (1994) 59–74.
[23] N. Macon, A. Spitzbart, Inverses of Vandermonde matrices, Amer. Math. Monthly 65 (1958) 95–100.
[24] E.H. Moore, General Analysis. Part I. Amer. Phil. Soc., Philadelphia, 1935.
[25] S.V. Parter, On the distribution of singular values of Toeplitz matrices, Linear Algebra Appl. 80 (1986) 115–130.
[26] R. Penrose, A generalized inverse for matrices, Proc. Cambridge Phil. Soc. 51 (1955) 406–413.
[27] M.J. Quinn, Parallel Computing: Theory and Practice, McGraw-Hill, New York, 1994.
[28] W. Rudin, Real and Complex Analysis, McGraw-Hill, New York, 1985.
[29] S. Serra, Preconditioning strategies for asymptotically ill-conditioned block Toeplitz systems, BIT 34 (1994) 579–594.
[30] S. Serra, The rate of convergence of Toeplitz based PCG methods for second order nonlinear boundary value problems, Numer. Math. 81 (3) (1999) 461–495.
[31] S. Serra, Preconditioning strategies for Hermitian Toeplitz systems with nondefinite generating functions, SIAM J. Matrix Anal. Appl. 17 (4) (1996) 1007–1019.
[32] S. Serra, The extension of the concept of generating function to a class of preconditioned Toeplitz matrices, Linear Algebra Appl. 267 (1997) 139–161.
[33] S. Serra, On the extreme eigenvalues of Hermitian (block) Toeplitz matrices, Linear Algebra Appl. 270 (1998) 109–129.
[34] S. Serra, Asymptotic results on the spectra of block Toeplitz preconditioned matrices, SIAM J. Matrix Anal. Appl. 20 (1) (1998) 31–44.
[35] S. Serra Capizzano, An ergodic theorem for classes of preconditioned matrices, Linear Algebra Appl. 282 (1998) 161–183.
[36] S. Serra Capizzano, Spectral behavior of matrix-sequences and discretized boundary value problems, A preliminary version in TR nr. 31, LAN, Dept. of Mathematics, Univ. of Calabria, 1998.
[37] S. Serra Capizzano, C. Tablino Possio, Superlinear preconditioning for optimal preconditioners of collocation linear systems, submitted.
[38] S. Serra Capizzano, Some theorems on linear positive operators and functionals and their applications, TR nr. 26, LAN, Dept. of Mathematics, Univ. of Calabria, 1997.
[39] S. Serra Capizzano, C. Tablino Possio, High-precision Finite Difference schemes and Toeplitz based preconditioners for Elliptic Problems, submitted.
[40] S. Serra Capizzano, C. Tablino Possio, Spectral and structural analysis of high precision finite differences matrices discretizing elliptic operators, TR nr. 28, LAN, Dept. of Mathematics, Univ. of Calabria, 1997.
[41] S. Serra Capizzano, C. Tablino Possio, Preconditioning strategies for 2D Finite Difference matrix sequences, manuscript, 1999.
[42] S. Serra Capizzano, Locally X matrices, spectral distributions, preconditioning and applications, TR nr. 29, LAN, Dept. of Mathematics, Univ. of Calabria, 1998, SIAM J. Matrix Anal. Appl., to appear.
[43] H. Sun, X. Jin, Q. Chang, Multigrid scheme for ill-conditioned block Toeplitz linear systems, TR nr. 12, Dept. of Mathematics, Chinese Univ. of Hong Kong, 1998.

[44] P. Tilli, Locally Toeplitz matrices: Spectral theory and applications, Linear Algebra Appl. 278 (1998) 91–120.
[45] E.E. Tyrtyshnikov, A unifying approach to some old and new theorems on distribution and clustering, Linear Algebra Appl. 232 (1996) 1–43.
[46] E.E. Tyrtyshnikov, N.L. Zamarashkin, Spectra of multilevel Toeplitz matrices: Advanced theory via simple matrix relationships, Linear Algebra Appl. 270 (1998) 15–27.
[47] H. Widom, Toeplitz matrices, in: I. Hirshman Jr. (Ed.), Studies in Real and Complex Analysis, Math. Ass. Amer., 1965.
[48] H. Widom, On the singular values of Toeplitz matrices, Zeit. Anal. Anw. 8 (1989) 221–229.
