Topics in Industrial Mathematics

Applied Optimization

Volume 42

Series Editors:

Panos M. Pardalos
University of Florida, U.S.A.

Donald Hearn
University of Florida, U.S.A.

The titles published in this series are listed at the end of this volume.
Topics in Industrial
Mathematics
Case Studies and Related Mathematical Methods

by

Helmut Neunzert
Kaiserslautern, Germany

and

Abul Hasan Siddiqi
Aligarh, India
and Dhahran, Saudi Arabia

Springer-Science+Business Media, B.V.
A C.I.P. Catalogue record for this book is available from the Library of Congress.

ISBN 978-1-4419-4833-5 ISBN 978-1-4757-3222-1 (eBook)


DOI 10.1007/978-1-4757-3222-1

Printed on acid-free paper

All Rights Reserved

© 2000 Springer Science+Business Media Dordrecht


Originally published by Kluwer Academic Publishers in 2000.
Softcover reprint of the hardcover 1st edition 2000

No part of the material protected by this copyright notice may be reproduced or
utilized in any form or by any means, electronic or mechanical,
including photocopying, recording or by any information storage and
retrieval system, without written permission from the copyright owner.
To Renate Neunzert and Azra Siddiqi

without whose patience and cooperation this work would not have been possible.
Contents
Preface xi
1 Case Studies at Kaiserslautern 1
1.1 Molecular alignment 2
1.1.1 The problem . 2
1.1.2 The model .. 2
1.1.3 The evaluation 6
1.1.4 The interpretation of the results 6
1.2 Acoustic identification of vehicles 7
1.2.1 The problem . 7
1.2.2 The model 7
1.2.3 The evaluation . . . . . . 9
1.2.4 The interpretation of the results 16
1.3 The Airbag-sensor . . . . . . 17
1.3.1 The objective . . . . . . 17
1.3.2 The modelling project . 17
1.3.3 The algorithmic project 18
1.3.4 The modelling of Safing sensor 18
1.3.5 The advanced model . . . . . . 22
1.4 How to judge the quality of a nonwoven fabric . 27
1.4.1 The problem . . . . . . . . . . 27
1.4.2 The models: A first approach . 28
1.4.3 Evaluation of our first model 35
1.4.4 The second model 36
1.5 Fatigue lifetime . . . . . . . . . . . . 42
1.5.1 Introduction 42
1.5.2 Physical situation, modelling, rate independence, and rainflow
counting . . . . . . . . . . . . . . . . . . . . . . . . 44
1.5.3 Damage estimation using rainflow counted data . . . . . . . . 50

2 Algorithms for Optimization 53


2.1 Introduction . 53
2.2 General results about optimization . . . 54
2.3 Special classes of optimization problem . 60


2.3.1 Programming problem . . . . . . . . . . 60


2.3.2 Calculus of variation . . . . . . . . . . . 60
2.3.3 Minimum norm problem and projection 61
2.3.4 Optimal control problem for a system represented by differen-
tial equations . . . . . . . . . . . 61
2.4 Newton algorithm and its generalization . . . . . . . 62
2.5 Conjugate gradient method . . . . . . . . . . . . . . 71
2.6 Variable metric methods (DFP and BFGS methods) 74
2.7 Problems 76

3 Maxwell's Equations and Numerical Methods 79


3.1 Maxwell's equations . . . . . . . . . . . . . . . 80
3.1.1 Brief historical note and physical laws . 80
3.1.2 Maxwell's equations and their consequences 86
3.1.3 Variational formulation of Maxwell's equations 92
3.1.4 Variational formulation (weak formulation) of magnetostatics
of a surface current. . . . . . . . . 94
3.2 Finite element method . . . . . . . . . . . 96
3.2.1 Introduction to numerical methods 96
3.2.2 Finite element method . . . . . . . 104
3.2.3 Abstract finite element method . . . 106
3.2.4 Finite element method in concrete cases 113
3.3 Boundary element method . . . . . . . . . . . . 120
3.3.1 Basic mathematical results for the boundary element method 121
3.3.2 Formulation of boundary value problems in terms of integral
equations over boundary of the given domain . . . . . . . . 126
3.3.3 Main ingredients of the boundary element method . . . .. 130
3.3.4 Coupling of boundary element and finite element methods . 136
3.4 Problems . . . . . . . . . . . . . . . . .. 138

4 Monte Carlo Methods 153


4.1 Monte Carlo method . . . . . . . . . . . . . . . . . . . . . . . . 154
4.1.1 Motivation 154
4.1.2 Monte Carlo method in m-dimensional Euclidean space 157
4.2 Quasi-Monte Carlo methods . . . . . . . . . . . . . . . 159
4.2.1 Basic results 159
4.2.2 Properties of discrepancy and star discrepancy 164
4.3 The particle methods. . . . . . . . . . . . . . . . . 169
4.3.1 Introduction 169
4.3.2 Particle approximations-measure-theoretic
approach 170
4.3.3 Functional analytic approach . . 171
4.4 A current study of the particle method . 172
4.4.1 Introduction 172

4.4.2 Derivation of distance . . . . . . . . . . . . 174


4.4.3 Computational results . 177
4.4.4 Spatially homogeneous Boltzmann equation 178
4.5 Problems . 178

5 Image Processing 181


5.1 Image model and methods of image processing 182
5.1.1 Image model . .. . 182
5.1.2 Image enhancement 183
5.1.3 Image smoothing . 185
5.1.4 Image restoration. . 188
5.1.5 Image analysis .. . 191
5.1.6 Variational methods in image processing . 198
5.2 Introduction to Fourier analysis . . . . . 201
5.2.1 Amplitude, frequency and phase . . 201
5.2.2 Basic results 202
5.2.3 Continuous and discrete Fourier transforms 206
5.2.4 The fast Fourier transforms . 212
5.2.5 Fourier analysis via computer 215
5.3 Wavelets with applications. . . . . . 219
5.3.1 Introduction 219
5.3.2 Wavelets and multi-resolution analysis 220
5.3.3 Special features of wavelets . . . . . . 230
5.3.4 Performance of Fourier, fractal and wavelet methods in image
compression . . . . . . . . . . . . . 239
5.3.5 Differential equation and wavelets 249
5.4 Fractal image compression . 253
5.4.1 Introduction 253
5.4.2 IFS theory 254
5.5 Problems 261

6 Models of Hysteresis and Applications 265


6.1 Introduction to hysteresis . 265
6.2 Hysteresis operators . . . . 268
6.3 Rainflow counting method . 274
6.4 Energy dissipation . . . . . 281
6.5 Hysteresis in the wave equation 284

7 Appendix 287
7.1 Introduction to mathematical models. 287
7.2 Fractal image compression 295
7.3 Some basic results . . . . . . . . . . . 304
7.4 Results from Sobolev spaces . . . . . . 315
7.5 Numerical solutions of linear systems . 326

7.6 Black-Scholes world of option pricing . 333


Bibliography 345
Symbols 367
Index 369
Preface
Industrial Mathematics is a relatively recent discipline. It is concerned primarily
with transforming technical, organizational and economic problems posed by indus-
try into mathematical problems; "solving" these problems by approximative methods
of analytical and/or numerical nature; and finally reinterpreting the results in terms
of the original problems. In short, industrial mathematics is modelling and scientific
computing of industrial problems.
Industrial mathematicians are bridge-builders: they build bridges from the field
of mathematics to the practical world; to do that they need to know about both
sides: the problems from the companies and the ideas and methods from mathematics.
As mathematicians, they have to be generalists. If you enter the world of indus-
try, you never know which kind of problems you will encounter, and which kind of
mathematical concepts and methods you will need to solve them. Hence, to be a
good "industrial mathematician" you need to know a good deal of mathematics as
well as ideas already common in engineering and modern mathematics with tremen-
dous potential for application. Mathematical concepts like wavelets, pseudorandom
numbers, inverse problems, multigrid etc., introduced during the last 20 years have
recently started entering the world of real applications.
Industrial mathematics consists of modelling, discretization, analysis and visu-
alization. To make a good model, to transform the industrial problem into a math-
ematical one such that you can trust the prediction of the model is no easy task.
One needs plenty of experience because modelling is mainly learnt by doing. A nice
approach would be to pose real-world problems to the students who should work on
them under the guidance of an experienced modeler. In international programmes in
Kaiserslautern, "modelling seminars" are organized each semester along these lines.
They are proving an important tool for the education of industrial mathematicians.
The problems are mainly supplied by an "Institute for Industrial Mathematics"
which cooperates with industry on a large scale, doing about 40 different projects
every year. This institute is a very important source of interesting problems. But
not every university has such a source. How do others get appropriate problems?
Again, by searching for them where they are - in industry. Staff members have to visit
companies and discuss their problems - they will find a variety of good projects.
This book is designed to help the beginners, to show what we have experienced
during our interaction with industry and teaching industrial mathematics. It tries
to teach modelling by reading. It may not be the best solution - learning by doing
is clearly preferable. We have, however, tried to maintain the flavour: first, by
presenting five case studies and then adding some background material related to
the theories used for the case studies.
The case studies which make up the first chapter of the book have been taken from
a modelling seminar in Kaiserslautern and handle problems of molecular alignment
in drug design, acoustic identification of vehicles, the security of air bag sensors,
quality control of fabrics and fatigue life analysis. The subsequent chapters provide
the reader with mathematical concepts and methods which are essential for a proper


analysis of these models and for exploration of related new areas.


For example, a problem of fatigue life analysis dealing with the estimation of the
lifetime of critical car components is presented in Chapter 1, while a mathematical
formulation based on the concept of hysteresis is given in Chapter 6. Airbag sensors
need Maxwell's equations, whose basis and relevant literature are given in Chapter 3.
Optimization is needed in drug design and the acoustic identification of vehicles-
Chapter 2 discusses some important algorithms in continuous optimization. Random
numbers and so-called Monte Carlo methods help to evaluate very complex integrals
(as in drug design) and to solve high-dimensional kinetic equations needed for nuclear
reactors, space flight, semiconductors. Chapter 4 is devoted to these methods. Image
processing is an emerging field, where a whole bunch of new mathematical ideas are
used; our problem deals with quality control of fabrics and uses the fundamental
concepts of multiscale analysis. But other methods like wavelets, fractals, energy
model, etc. may also be equally important and we describe some of them in Chapter
5.
In the appendices, we have provided discussion on certain topics which are es-
sential for understanding of the main text as well as some results which could not
find an appropriate place in a particular chapter. At the end of each chapter, we
have given some problems, some of which may lead to research problems, especially
in Chapters 3 and 5. Hints for some of these problems are given there. At the
end, we have provided an extensive bibliography.
The book addresses several types of readers. We hope it will be useful for all those
who have a genuine curiosity about Industrial Mathematics. It is intended as
a handy manual of Mathematical Methods for current industrial and technological
problems, which may be very useful for engineers and physicists. It is also intended
to serve as a lucid commentary on most applied methods which are likely to attract
more attention in years to come. Finally, it can be used for a course on Mathematical
Methods for current real-life/industrial problems at graduate and advanced
undergraduate levels. A deeper insight, if needed, can be obtained through updated
and appropriate references mentioned in the text. Proofs of theorems like Theorems
3.2, 3.3 and 6.3 may be omitted by readers who are not interested in a rigorous
analysis.
We are grateful to a number of persons who have provided valuable help in the
completion of the book. In particular, we wish to express our thanks to Martin
Brokate, Axel Klar, Michael Hack, Franz-Josef Pfreundt, Sergej Rjasanow, Joachim
Weickert, Pammy Manchanda, Kalimuddin Ahmad, Firozzaman, Günter Gramlich,
Sudarshan Tiwari and Ingeborg Woltman. Abul Hasan Siddiqi would also like to
express his thanks to the German Academic Exchange Service, the University of
Kaiserslautern and the International Centre of Theoretical Physics, Trieste, Italy,
for providing excellent opportunities to acquire knowledge of this emerging area.
He would also like to express his gratitude to the founder director of the ICTP late
Prof. Abdus Salam, Nobel Laureate, who motivated him to work in applied areas
of mathematics.

Abul Hasan Siddiqi would like to thank King Fahd University of Petroleum and
Minerals, Dhahran 31261, Saudi Arabia for providing excellent facilities during the
preparation of the final manuscript.

Helmut Neunzert Abul Hasan Siddiqi


Kaiserslautern University King Fahd University of Petroleum
Kaiserslautern, Germany and Minerals
Dhahran, Saudi Arabia
and
Aligarh Muslim University, Aligarh, India
Chapter 1

Case Studies at
Kaiserslautern

As mentioned in the introduction, modelling can only be learnt by doing. A


textbook can only show examples of how others have done it - it gives the spirit of what
industrial mathematics, modelling or scientific computing is, but it cannot teach it.

However, in order to remain as realistic as possible, we present some


paradigmatic case studies in the way they came to us. Each problem is first posed in
a nonmathematical way, so everybody may feel free to start modelling the problem
by himself. In a subsequent section, we present our modelling; of course, there is
no unique model; others may have other and even better ideas. Finally, we discuss
how we have evaluated the model; i.e., we give algorithms to solve the mathematical
problem posed by the model. Although different approaches are possible, we can
claim that our solutions are good ones in the sense that they give answers to the
original problems. We should mention that all the five problems were posed by com-
panies in Germany; some small, others rather big. All these problems were treated
by mathematicians from the "Laboratory for Technomathematics" in Kaiserslautern,
Germany. The "we" who modelled will be clearly identified by name in each case. It may
be mentioned that the companies paid for the work done by the mathematicians
involved - a good indicator that the work was useful. As mentioned in the intro-
duction, the five problems chosen from different industrial areas asking for different
mathematical methods have been used as a guideline in the book. Mathematics
needed in these problems will be presented in detail in the following chapters.


1.1 Molecular alignment


1.1.1 The problem
The pharmaceutical industry uses extensive computer simulations in drug design.
One aspect is molecular alignment. Some drugs are molecules which may bond
themselves to a macromolecular receiver, e.g., a protein. These molecules operate
like keys which must fit into a key-hole of the protein. However, the exact structure
of this key-hole is often unknown; what is known, however, is another matching
key which we want to replace by the newly synthesized drug. This new key should
be similar to the old key. The problem is: find a model for the similarity of two
molecules which allows us to find new keys, and develop algorithms to decide how
similar, in this newly defined sense, the two given molecules are.

Figure 1.1.1

Such modelling problems have to be handled in close cooperation with chemists.


The mathematics part was done by Dr. F. J. Pfreundt of the Technomathematics
Lab, Kaiserslautern; the partners were Dr. G. Kleib and Th. Rietznet, BASF.

1.1.2 The model


Molecules are given by their geometric as well as their electromagnetic structure.
Geometrically, we consider molecules as a rigid structure of balls (atoms) of
different radii and different charges. Rigidity means that the distances between the
$N$ atoms forming a molecule are fixed. The radius of each atom is normally called
the van der Waals radius and is the minimal distance between this molecule and
any other.
In a fixed position, we may therefore represent a molecule by the $N$ positions of
the centers of the balls $\underline{x}_1, \ldots, \underline{x}_N$, by the $N$ corresponding radii $r_1, \ldots, r_N$ and by $N$
charges $q_1, \ldots, q_N$; i.e.,

$$M = \{\underline{x}_1, \ldots, \underline{x}_N;\ r_1, \ldots, r_N;\ q_1, \ldots, q_N\}.$$

To compare the shapes of molecules and to define the similarity, the molecules
must be translated and rotated. Since the structure is rigid, translating a molecule
by a vector $\underline{a}$ and rotating it by a rotation $A$ means to translate and rotate every atom
by $\underline{a}$ and $A$, respectively; i.e.,

$$AM + \underline{a} = \{\underline{a} + A\underline{x}_1, \ldots, \underline{a} + A\underline{x}_N;\ r_1, \ldots, r_N;\ q_1, \ldots, q_N\}.$$

A rotation is given by Euler's angles $\theta$, $\varphi$ and $\omega$; i.e.,

$$A = A(\theta, \varphi, \omega).$$

Any motion $M \mapsto AM + \underline{a}$ is therefore characterized by 6 parameters: $\underline{a} \in \mathbb{R}^3$,
$\theta \in [0, \pi]$, $\varphi, \omega \in [0, 2\pi]$. Assume that we have already defined a distance $d$ of two
molecules; i.e., we know what

$$d(M_1, M_2)$$

for given $M_1$, $M_2$ should be. Similarity would then mean "small distance". But
then the distance would depend on the relative positions of the molecules. To define
similarity, we would have to move one of the molecules until it best fits the other.
Similarity should therefore be measured by

$$\tilde d(M_1, M_2) = \min_{A,\,\underline{a}}\ d(AM_1 + \underline{a},\ M_2).$$

To find the best possible position for $M_1$, i.e., to find the optimal $\underline{a}$ and $A$, is called
alignment. It is clear that this alignment depends on $d$, the distance between
the molecules we choose. We want to remind the reader that the main purpose of the entire
investigation is to substitute one key $M_1$ by another key $M_2$. The similarity should
be a similarity of keys with respect to one lock! We speak about distances, but we
do not expect to get distances in the sense of a (mathematically correct) metric:
$d$ must be a functional on pairs of molecules with $d(M_1, M_2) = 0$ if and only if
$M_1 = M_2$. We do not insist on having a triangle inequality - it is not easily obtained,
and not even required. As mentioned before, one has to take geometrical aspects
into consideration as well as the electrostatic situation.
Typical attempts in the past (see, for example, the so-called SEAL metric, de-
veloped by Kearsley and Smith [1990]) defined $d(M_1, M_2)$ for

$$M_1 = \{\underline{x}_1, \ldots, \underline{x}_N;\ r_1, \ldots, r_N;\ q_1, \ldots, q_N\}$$

and

$$M_2 = \{\underline{y}_1, \ldots, \underline{y}_M;\ r'_1, \ldots, r'_M;\ q'_1, \ldots, q'_M\}$$

by

$$d(M_1, M_2) := C - \sum_{i=1}^{N} \sum_{j=1}^{M} w_{ij}\, e^{-\alpha \|\underline{x}_i - \underline{y}_j\|^2},$$

where $w_{ij} = w_E\, q_i q'_j + w_S\, r_i^{\beta} r_j'^{\beta}$. $C$ is chosen such that $d(M_1, M_1) = 0$. Clearly $C$
depends on $M_1$. The sum measures the overlap of the two molecules and is maximal
and equal to $C$ if both molecules are the same.
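For concreteness, a minimal numerical sketch of this overlap distance follows. It is our own illustration, not the original SEAL code; the function name, the array layout and the default parameter values are assumptions.

```python
import numpy as np

def seal_distance(x1, q1, r1, x2, q2, r2,
                  w_E=1.0, w_S=1.0, alpha=1.0, beta=1.5):
    """SEAL-type distance; xi: (N,3) centres, qi: (N,) charges, ri: (N,) radii."""
    def overlap(xa, qa, ra, xb, qb, rb):
        # sum_{i,j} w_ij * exp(-alpha * ||x_i - y_j||^2)
        d2 = np.sum((xa[:, None, :] - xb[None, :, :]) ** 2, axis=-1)
        w = w_E * np.outer(qa, qb) + w_S * np.outer(ra ** beta, rb ** beta)
        return np.sum(w * np.exp(-alpha * d2))

    C = overlap(x1, q1, r1, x1, q1, r1)   # C depends on M1: d(M1, M1) = 0
    return C - overlap(x1, q1, r1, x2, q2, r2)
```

Aligning $M_1$ then amounts to minimizing this value over the six motion parameters.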
There are several free parameters in the game: $\alpha$, $w_E$, $w_S$ and $\beta$; they were found
by fitting to experiments. One may also do some geometric considerations: if we
have two balls of the same radius $R$, the volume $V(r)$ of the intersection of the 2 balls
having a distance $r$ is 0 for $r > 2R$; for $0 \le r \le 2R$, we compute by elementary
geometry

$$V(r) = \frac{\pi}{12}\, (2R - r)^2 (4R + r),$$

which is maximal for $r = 0$: $V(0) = \frac{4}{3}\pi R^3$. We shall now choose $w_S$ and $\alpha$, $\beta$ such
that

$$\tilde V(r) = w_S\, R^{2\beta}\, e^{-\alpha r}$$

fits optimally to $V(r)$. Again, "optimal" is not defined, and one may think of $\tilde V(0) =
w_S R^{2\beta} = \frac{4}{3}\pi R^3$, i.e., $w_S = \frac{4}{3}\pi$ and $\beta = \frac{3}{2}$. Then $\alpha$ may be chosen such that
$\tilde V'(0) = V'(0)$, i.e., $\alpha = \frac{3}{4R}$. Other ideas are possible.
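Indeed, a short computation confirms these choices:

$$V'(r) = \frac{\pi}{12}\,\frac{d}{dr}\Bigl[(2R - r)^2 (4R + r)\Bigr] = -\frac{\pi}{4}\,(2R - r)(2R + r), \qquad V'(0) = -\pi R^2,$$

while $\tilde V'(0) = -\alpha\, w_S R^{2\beta} = -\alpha\,\frac{4}{3}\pi R^3$; equating the two values gives $\alpha = \frac{3}{4R}$.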
Let us now try to do the modelling differently. The domain "filled" by the
molecule $M = \{\underline{x}_1, \ldots, \underline{x}_N;\ r_1, \ldots, r_N;\ q_1, \ldots, q_N\}$ is, of course,

$$\Omega = \bigcup_{i=1}^{N} \{\underline{x} \in \mathbb{R}^3 : \|\underline{x} - \underline{x}_i\| \le r_i\}.$$

Hence, the "surface" of $M$ is $\partial\Omega$. It is quite a complex structure. The atoms carry
the charges $q_i$, which define an electrostatic Coulomb potential

$$\phi(\underline{x}) = \sum_{i=1}^{N} \frac{q_i}{\|\underline{x} - \underline{x}_i\|}.$$
This is repulsive. We may think of taking the distance between the potentials of the two
molecules, $\phi_1$ and $\phi_2$, on the surfaces of both molecules, i.e., on $\partial\Omega_1$ and $\partial\Omega_2$, and
may try to define

$$d(M_1, M_2) = \left( \int_{\partial\Omega_1} |\phi_1 - \phi_2|^p\, d\omega + \int_{\partial\Omega_2} |\phi_1 - \phi_2|^p\, d\omega \right)^{1/p}$$

($d\omega$ is the surface measure on $\partial\Omega$). This would lead to a symmetric $d$, but practically
it would be enough to consider only one integral.
Experimental validation, however, shows that what counts is the electric field
and not the potential; i.e.,

$$\underline{E} = -\nabla \phi.$$

Moreover, it is neither the flux $(\underline{E}, \underline{n})$ nor the electric energy at the surface, i.e.,
$\int_{\partial\Omega} \|\underline{E}\|^2\, d\omega$, which plays the most important role, but rather the direction

$$\underline{e} = \frac{\underline{E}}{\|\underline{E}\|},$$

which determines similarity.
The electrostatic component of our $d$ may, therefore, be modelled by

$$\int_{\partial\Omega_1} \|\underline{e}_1 - \underline{e}_2\|^2\, d\omega.$$

What remains is the geometrical part. The most natural choice from the point of
view of a mathematician would be the Hausdorff distance

$$\delta(\Omega_1, \Omega_2) = \max_{\underline{x} \in \partial\Omega_1}\ \min_{\underline{y} \in \partial\Omega_2} \|\underline{x} - \underline{y}\|.$$

We first take the shortest distance of a point $\underline{x}$ to $\partial\Omega_2$, i.e.,

$$d_2(\underline{x}) = \min_{\underline{y} \in \partial\Omega_2} \|\underline{x} - \underline{y}\|,$$

and then take the maximum of all these distances with respect to $\partial\Omega_1$:

$$\delta(\Omega_1, \Omega_2) = \|d_2\|_{L^\infty(\partial\Omega_1)}.$$

Since $L^\infty$-norms are unpleasant to compute, we choose

$$\tilde\delta(\Omega_1, \Omega_2) = \|d_2\|^2_{L^2(\partial\Omega_1)}.$$
Modelling is always a balance between the correctness of the model and the com-
plexity of its evaluation. Hence, there is no need to choose a more complex model.
Hausdorff is mathematically a very pleasant metric, but what about computa-
tionally? O.K., says the chemist, but then begins again to play (having the old
SEAL model in mind): let us be more flexible by using

$$\int_{\partial\Omega_1} \left( 1 - e^{-\beta\, d_2(\underline{x})^2} \right) d\omega(\underline{x})$$

(it is almost $\beta \|d_2\|^2_{L^2(\partial\Omega_1)}$, at least if $d_2$ is small - and it is more flexible).
Here we are, at the moment, at

$$d(\Omega_1, \Omega_2) = \alpha \int_{\partial\Omega_1} \|\underline{e}_1 - \underline{e}_2\|^2\, d\omega + \int_{\partial\Omega_1} \left( 1 - e^{-\beta\, d_2(\underline{x})^2} \right) d\omega,$$

and we may play with $\alpha$, $\beta$ controlling the weight of the electrostatic and the geo-
metric aspects.
Let us stop here with modelling; one is never at the end of a modelling task.
There is no "the model" - a model is either better or worse. To know how good a model
is, it must be evaluated: $d$ must be computed, the alignment performed, and comparisons
with experiments done.
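To make the evaluation concrete, here is a rough numerical sketch of this distance. It replaces the surface integrals by sums over sample points on the union-of-balls surface (points are drawn on each sphere and discarded if they lie inside another ball). The sampling strategy, all names and the parameter defaults are our illustrative assumptions, not the code used in the project.

```python
import numpy as np

def surface_samples(centres, radii, n=200, rng=np.random.default_rng(0)):
    """Sample points (with area weights) on the boundary of a union of balls."""
    pts, wts = [], []
    for c, r in zip(centres, radii):
        u = rng.normal(size=(n, 3))
        u /= np.linalg.norm(u, axis=1, keepdims=True)
        p = c + r * u
        # keep only points that are not inside any other ball
        outside = np.all(np.linalg.norm(p[:, None, :] - centres[None, :, :],
                                        axis=-1) >= radii[None, :] - 1e-9, axis=1)
        pts.append(p[outside])
        wts.append(np.full(outside.sum(), 4 * np.pi * r ** 2 / n))
    return np.vstack(pts), np.concatenate(wts)

def field_direction(p, centres, charges):
    """Unit direction e = E / ||E|| of the Coulomb field at the points p."""
    d = p[:, None, :] - centres[None, :, :]
    r = np.linalg.norm(d, axis=-1, keepdims=True)
    E = np.sum(charges[None, :, None] * d / r ** 3, axis=1)   # E = -grad phi
    return E / np.linalg.norm(E, axis=1, keepdims=True)

def distance(m1, m2, alpha=1.0, beta=1.0):
    """alpha * electrostatic part + geometric part, both over dOmega_1."""
    (c1, r1, q1), (c2, r2, q2) = m1, m2
    p, w = surface_samples(c1, r1)
    electro = np.sum(w * np.sum((field_direction(p, c1, q1)
                                 - field_direction(p, c2, q2)) ** 2, axis=1))
    # d_2(x): distance from the sample points to Omega_2 (0 if inside)
    d2 = np.maximum(np.min(np.linalg.norm(p[:, None, :] - c2[None, :, :],
                                          axis=-1) - r2[None, :], axis=1), 0.0)
    geom = np.sum(w * (1 - np.exp(-beta * d2 ** 2)))
    return alpha * electro + geom
```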

1.1.3 The evaluation


There are 2 numerical problems to be solved if one wants to use the model
given above:

(a) One has to evaluate integrals over quite complicated surfaces $\partial\Omega$, surfaces of
a domain $\Omega$ formed by the union of balls centered at different positions and of
different radii.

(b) One has to minimize $d(AM_1 + \underline{a}, M_2)$ with respect to $\theta$, $\varphi$, $\omega$, $\underline{a}$.


One should realize that the evaluation of the function d is quite an elaborate task.
This means that we should look for algorithms which need minimal evaluation.
Considering these 2 problems, one may realize one "principle" of industrial math-
ematics, which we claim to hold almost always. It is a brother of "Murphy's princi-
ple": what can go wrong, goes wrong. Of course, we know how to evaluate surface
integrals: we approximate the surfaces by piecewise polynomial surfaces etc. How
can we do that, if the surface is as complicated as that of a molecule? Of course, we
know many optimization strategies: those which need derivatives and others which
do not. Which one should we choose here?
We come back to these questions in Chapters 4 and 2, respectively.

1.1.4 The interpretation of the results


What we have done are only the first steps; the problem is not yet solved. But
what does "solved" mean? We expect that molecules which are known to be similar in
the sense defined above have a small distance - small compared to a typical distance
between two arbitrary molecules. By our normalization, this means being near 1 or even
less than 1. We took 4 pairs of rather similar molecules; to explain their structures
is beyond the scope of this book. We compare the SEAL distance with our distance
$d$:
M1          M2          d_SEAL    d
trypsin 1   trypsin 2a  0.58      0.56
ltmn        ltmp        1.16      0.59
abs         tapap       3.16      1.18
napap       lapap       1.47      1.04
So, it seems that our d discovers similarities better than the former distance
concepts.
However, the time to compute d is far too long to be used in practice. The most
important task, therefore, is now to accelerate the algorithms. To develop faster
algorithms is the genuine work of a numerical analyst, i.e., a genuine mathematical
task.

1.2 Acoustic identification of vehicles


1.2.1 The problem
An observer should watch movements of ships around him. He has only his ears;
he does not see the ships (there may be fog); he has neither radar nor other sensors.
He knows from which direction the sound reaches him and the intensity with which
it reaches him. But he has no information about the distance of the ship
or the intensity of the sound emitted. The first question is: how can he use the
information available to him? What does it tell him about the movements of the
ship and the kind of ship under consideration?
Moreover, whatever he may get out of it, he wants it as quickly as possible.
Therefore, he needs on-line algorithms to determine what can be determined. The
problem was investigated by Dr. S. Rjasanow of the Technomathematics group at
Kaiserslautern (the customers want to remain anonymous).

1.2.2 The model


We put the observer at the origin of the $(x, y)$-plane, in which the ships are
supposed to move in straight lines with uniform speed. These assumptions seem to
be justified in "our" situation (says the problem poser!). Therefore,

$$\underline{x}(t) = \underline{a} + t\,\underline{v}$$

is the trajectory of the ship, where we do not know $\underline{a}$ and $\underline{v}$. What we observe is the
angle between the position of the ship and (say) the $x$-axis. We call it $\alpha(t)$, and we
measure $\alpha_j = \alpha(t_j)$ at times $t_1, \ldots, t_k$.
Moreover, we measure the intensity of the incoming signal, $d(t)$. Since it is
inversely proportional to $\|\underline{x}(t)\|^2$, we get

$$d(t) = \frac{\delta}{\|\underline{x}(t)\|^2},$$

where $\delta$ denotes the unknown intensity of the sound emitted from the observed
vehicle. It is not sufficient to measure $\alpha_j$ and $d_j = d(t_j)$ in order to determine
$\underline{a}$, $\underline{v}$ and $\delta$, even if we are very diligent and do it quite often. A more distant but
faster and noisier vehicle may create the same signals at the origin and may not be
distinguishable from a nearer but slower and more silent vehicle. We need more
information in order to determine $\delta$ and the motion $(\underline{a}, \underline{v})$, and we may pose different
problems depending on the information available.

Problem 1. We know the speed $\|\underline{v}\| = \sigma$ (probably since there is a maximal speed
which ships normally use) and we measure only the $\alpha_j$, $j = 1, \ldots, k$. We want to know
the motion $\underline{a}$ and $\underline{v}$.

Problem 2 uses, in addition to the information provided in Problem 1, measurements
of $d_j$, $j = 1, \ldots, k$, and asks for better results. Apart from the motion, we want
to know the type, i.e., $\delta$.

Problem 3 does not assume that we know $\sigma$; instead, $\delta$ is known, and we are again
interested in the motion. All measurements carry an element of error.

Figure 1.2.1

We know that

$$\underline{x}(t_j) = r(t_j) \begin{pmatrix} \cos\alpha_j \\ \sin\alpha_j \end{pmatrix}, \qquad r(t_j) = \|\underline{x}(t_j)\|.$$

Since we do not know $r(t_j)$, we express the fact that $\underline{x}_j = \underline{x}(t_j)$ points in the direction
$\begin{pmatrix} \cos\alpha_j \\ \sin\alpha_j \end{pmatrix}$ by saying that

$$\underline{w}_j = \begin{pmatrix} \sin\alpha_j \\ -\cos\alpha_j \end{pmatrix}$$

is orthogonal to $\underline{x}_j$:

$$(\underline{x}_j, \underline{w}_j) = 0.$$

If we assume that the measurement error of $(\underline{x}_j, \underline{w}_j)$ is normally distributed, we
would make a regression to get $\underline{a}$ and $\underline{v}$ by minimizing

$$\phi_1(\underline{a}, \underline{v}) := \sum_{j=1}^{k} (\underline{x}_j, \underline{w}_j)^2.$$

If we think that the cosine of $\alpha$ is normally distributed, we minimize instead

$$\phi_2(\underline{a}, \underline{v}) := \sum_{j=1}^{k} \frac{(\underline{x}_j, \underline{w}_j)^2}{\|\underline{x}_j\|^2}.$$

Since we did not get information about the instrument measuring the angles,
we choose these two functionals for our further investigation; they are quite easy
to handle. It is an "economics" principle in modelling (formulated by the Aus-
trian philosopher Ernst Mach during the last century) to choose the simplest model
consistent with the given information.
For Problems 2 and 3, where $d_j$ is used, we need another regression and, most
likely, the functional to be minimized should be

$$\Psi(\underline{a}, \underline{v}, \delta) = \sum_{j=1}^{k} \left( d_j - \frac{\delta}{\|\underline{x}_j\|^2} \right)^2.$$

Although

$$\Psi(\underline{a}, \underline{v}, \delta) = \sum_{j=1}^{k} \left( d_j \|\underline{x}_j\|^2 - \delta \right)^2$$

is simpler but harder to justify, we shall still use it.

1.2.3 The evaluation


Keeping in mind that $\underline{x}_j = \underline{a} + t_j \underline{v}$ depends on the parameters $\underline{a}$ and $\underline{v}$ to be
estimated by regression, we write

$$\phi_1(\underline{a}, \underline{v}) = \sum_{j=1}^{k} \left( (\underline{a}, \underline{w}_j) + t_j (\underline{v}, \underline{w}_j) \right)^2$$

and

$$\operatorname{grad}_a \phi_1 = 2 \sum_{j=1}^{k} \left( (\underline{a}, \underline{w}_j) + t_j (\underline{v}, \underline{w}_j) \right) \underline{w}_j.$$

In order to simplify, we use the matrices

$$W^{(n)} = \sum_{j=1}^{k} t_j^n\, \underline{w}_j\, \underline{w}_j^T, \qquad n = 0, 1, 2,$$

where we notice that

$$\underline{w}_j\, \underline{w}_j^T = \begin{pmatrix} \sin^2\alpha_j & -\sin\alpha_j \cos\alpha_j \\ -\sin\alpha_j \cos\alpha_j & \cos^2\alpha_j \end{pmatrix}.$$

With this notation we easily get

$$\operatorname{grad}_a \phi_1 = 2\left( W^{(0)} \underline{a} + W^{(1)} \underline{v} \right)$$

and

$$\operatorname{grad}_v \phi_1 = 2\left( W^{(1)} \underline{a} + W^{(2)} \underline{v} \right).$$

In the same way, we may express the gradients of the other functionals.
We now turn to

Problem 1. We do not need $\Psi$; moreover, $\|\underline{v}\| = \sigma$ is given. If we choose $\phi_1$ for the
regression and use $\|\underline{v}\|^2 = \sigma^2$ as constraint, we get, by using a Lagrange multiplier,

$$F(\underline{a}, \underline{v}, \mu) = \phi_1(\underline{a}, \underline{v}) + \mu \left( \|\underline{v}\|^2 - \sigma^2 \right)$$

as the functional to be minimized; necessary conditions for extrema are given by

$$W^{(0)} \underline{a} + W^{(1)} \underline{v} = 0$$

and

$$W^{(1)} \underline{a} + W^{(2)} \underline{v} + \mu\, \underline{v} = 0.$$

If $W^{(0)}$ is a regular matrix, we get $\underline{a} = -W^{(0)-1} W^{(1)} \underline{v}$, and therefore an eigenvalue
problem for $\mu$ and $\underline{v}$:

$$\left( W^{(1)} W^{(0)-1} W^{(1)} - W^{(2)} \right) \underline{v} = \mu\, \underline{v},$$

for which the condition $\|\underline{v}\| = \sigma$ is only a normalization of the eigenvector. (We
mention that "normal systems of equations", which originate from regression, are
treated in general in Ciarlet and Lions [1991].)

Since the $W^{(i)}$ are symmetric matrices, the matrix of our eigenvalue problem is again
symmetric, and we get 2 real eigenvalues and 2 orthogonal real eigenvectors. The
length of the eigenvectors is $\sigma$, and therefore, using $\underline{a} = -W^{(0)-1} W^{(1)} \underline{v}$, we get
the following 4 solutions

$$(\pm\underline{a}_1, \pm\underline{v}_1), \qquad (\pm\underline{a}_2, \pm\underline{v}_2)$$

with eigenvalues $\mu_1$, $\mu_2$. It is not difficult to decide which sign is the correct one;
but which eigenvalue, $\mu_1$ or $\mu_2$, should we choose? This may lead to a maximum
instead of a minimum, or to no extremal value of $F$ at all!
To proceed further, we consider the Hessian of $F$ with respect to $(\underline{a}, \underline{v})$:

$$H_\mu := 2 \begin{pmatrix} W^{(0)} & W^{(1)} \\ W^{(1)} & W^{(2)} + \mu E_2 \end{pmatrix}.$$

It is easy to see that $H_0$ is positive semidefinite: if we use the following four $k$-dimensional
vectors,

$$\underline{s}^{(1)} = \begin{pmatrix} \sin\alpha_1 \\ \vdots \\ \sin\alpha_k \end{pmatrix}, \quad
\underline{s}^{(2)} = -\begin{pmatrix} \cos\alpha_1 \\ \vdots \\ \cos\alpha_k \end{pmatrix}, \quad
\underline{s}^{(3)} = \begin{pmatrix} t_1 \sin\alpha_1 \\ \vdots \\ t_k \sin\alpha_k \end{pmatrix}, \quad
\underline{s}^{(4)} = -\begin{pmatrix} t_1 \cos\alpha_1 \\ \vdots \\ t_k \cos\alpha_k \end{pmatrix},$$

then the matrix $\frac{1}{2} H_0$ is the Gramian of these vectors. Gramians are always positive
definite if the vectors $\underline{s}^{(i)}$ form a linearly independent set, since

$$\underline{\xi}^T \left( \tfrac{1}{2} H_0 \right) \underline{\xi} = \Big\| \sum_{i=1}^{4} \xi_i\, \underline{s}^{(i)} \Big\|^2 \ge 0,$$

with equality only if $\sum_{i=1}^{4} \xi_i\, \underline{s}^{(i)} = 0$. But even in this case, $H_0$ is still positive
semidefinite. We search for solutions of

$$\tfrac{1}{2} \operatorname{grad}_{(\underline{a}, \underline{v})} F = 0,$$

and see that this means solutions of

$$\tfrac{1}{2} H_0 \begin{pmatrix} \underline{a} \\ \underline{v} \end{pmatrix} = -\mu \begin{pmatrix} \underline{0} \\ \underline{v} \end{pmatrix}.$$



Since $H_0$ is positive semidefinite, both eigenvalues, $\mu_1$ and $\mu_2$, are non-positive if
$W^{(0)}$ is positive definite; let $\mu_1 \le \mu_2 \le 0$. But $F$ cannot have a local extremum at
$\mu = \mu_1$. Consider $H_{\mu_1}$ and

$$\begin{pmatrix} \underline{a}_2 \\ \underline{v}_2 \end{pmatrix}^T H_{\mu_1} \begin{pmatrix} \underline{a}_2 \\ \underline{v}_2 \end{pmatrix} = 2(\mu_1 - \mu_2)\,\sigma^2 < 0 \quad (\text{for } \mu_1 < \mu_2).$$

At the same time we get

$$\begin{pmatrix} \underline{a} \\ \underline{0} \end{pmatrix}^T H_{\mu_1} \begin{pmatrix} \underline{a} \\ \underline{0} \end{pmatrix} = 2\,\underline{a}^T W^{(0)} \underline{a} > 0 \quad (\underline{a} \neq \underline{0}),$$

since $W^{(0)}$ is the Gramian of $\underline{s}^{(1)}$ and $\underline{s}^{(2)}$ and, as we assumed, regular. We see
that $H_{\mu_1}$ cannot be definite, and therefore $F(\underline{a}, \underline{v}, \mu_1)$ cannot have a local extremum.
Therefore, $\mu = \mu_1$ can be excluded. The only chance is $\mu = \mu_2$, and $H_{\mu_2}$ must then be
positive definite or at least semidefinite! Certainly not definite, since

$$H_{\mu_2} \begin{pmatrix} \underline{a}_2 \\ \underline{v}_2 \end{pmatrix} = 0.$$

The proof that $H_{\mu_2}$ is semidefinite for "generic" measurements is still not available - it
should not be too hard. The numerical results point to the fact that $\mu_2$ is a good
choice.
What remains is the question of orientation: is $+\underline{v}_2$ or $-\underline{v}_2$ the correct motion,
$\underline{a}_2$ or $-\underline{a}_2$ the starting point? The angle between $\underline{x}(0)$ and the measured directions
$\begin{pmatrix} \cos\alpha_j \\ \sin\alpha_j \end{pmatrix}$ should be between $-\frac{\pi}{2}$ and $\frac{\pi}{2}$. If, therefore,

$$\left( \underline{a}_2,\ \begin{pmatrix} \cos\alpha_j \\ \sin\alpha_j \end{pmatrix} \right) > 0 \quad \text{for all } j,$$

we choose the plus sign; if not, we choose $-\underline{a}_2$ and $-\underline{v}_2$.
A final remark to Problem 1 concerning the regularity of $W^{(0)}$: if the ship is on a
path hitting us directly, then all angles are equal if the measurements are exact. With
noisy data the matrix $W^{(0)}$ might not be singular, but it is certainly ill-conditioned
and the algorithm does not work.
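The whole procedure for Problem 1 fits in a few lines. The following sketch is our own illustration of the algorithm described above (matrix assembly, symmetric eigenvalue problem, choice of $\mu_2$, sign fixing), not the original software:

```python
import numpy as np

def estimate_motion(alpha, t, sigma):
    """Problem 1: angles alpha_j, times t_j, known speed sigma = ||v||."""
    w = np.stack([np.sin(alpha), -np.cos(alpha)], axis=1)          # w_j
    W = [sum(tj ** n * np.outer(wj, wj) for tj, wj in zip(t, w))
         for n in range(3)]                                        # W^(0), W^(1), W^(2)
    A = W[1] @ np.linalg.solve(W[0], W[1]) - W[2]                  # symmetric 2x2
    mu, V = np.linalg.eigh(A)                                      # ascending eigenvalues
    v = sigma * V[:, 1]                                            # take mu_2
    a = -np.linalg.solve(W[0], W[1] @ v)
    dirs = np.stack([np.cos(alpha), np.sin(alpha)], axis=1)        # measured directions
    if not np.all(dirs @ a > 0):                                   # orientation test
        a, v = -a, -v
    return a, v, mu[1]
```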
Now we turn to Problem 2, but we will shorten the presentation.
We combine $\phi_1$ and $\Psi$ using a parameter $\lambda$, which might be chosen according to
the relative weight of the direction and intensity measurements. Instead of $F$ we get

$$G_\lambda(\underline{a}, \underline{v}, \delta, \mu) = \lambda\, \phi_1(\underline{a}, \underline{v}) + (1 - \lambda)\, \Psi(\underline{a}, \underline{v}, \delta) + \mu \left( \|\underline{v}\|^2 - \sigma^2 \right).$$
Figure 1.2.2

Regression with $G_\lambda$ gives the necessary conditions

$$\lambda \operatorname{grad}_a \phi_1 + (1 - \lambda) \operatorname{grad}_a \Psi = 0,$$
$$\lambda \operatorname{grad}_v \phi_1 + (1 - \lambda) \operatorname{grad}_v \Psi + 2\mu\, \underline{v} = 0,$$
$$(1 - \lambda)\, \Psi_\delta = 0,$$
$$\|\underline{v}\|^2 = \sigma^2.$$

Eliminating $\delta$, we get

$$S \begin{pmatrix} \underline{a} \\ \underline{v} \end{pmatrix} = -\mu \begin{pmatrix} \underline{0} \\ \underline{v} \end{pmatrix}$$

with

$$S_{11} = \lambda W^{(0)} + 2(1 - \lambda)\, f_{11}(\underline{a}, \underline{v})\, E,$$
$$S_{12} = \lambda W^{(1)} + 2(1 - \lambda)\, f_{12}(\underline{a}, \underline{v})\, E = S_{21},$$
$$S_{22} = \lambda W^{(2)} + 2(1 - \lambda)\, f_{22}(\underline{a}, \underline{v})\, E,$$

where the $f_{ij}$ denote functions quadratic with respect to $\underline{a}$ and $\underline{v}$ and depending on the $d_j$.
We again get an eigenvalue problem, but now a nonlinear one. We rewrite it as
the problem of determining the zeros $\underline{u}^T = (\underline{a}^T, \underline{v}^T, \mu)$ of

$$F(\underline{u}) = 0,$$

where we include in $F$ the component $\|\underline{v}\|^2 - \sigma^2$. This nonlinear system is solved by
a several-dimensional Newton method: choosing a starting value $\underline{u}^0$ and linearizing $F$
at $\underline{u}^0$, we get

$$F(\underline{u}) \approx F(\underline{u}^0) + DF(\underline{u}^0)\,(\underline{u} - \underline{u}^0),$$

where $DF(\underline{u}^0)$ is the Jacobian of $F$ at $\underline{u}^0$, and then iterate

$$\underline{u}^1 = \underline{u}^0 - DF(\underline{u}^0)^{-1} F(\underline{u}^0).$$

More generally,

$$\underline{u}^{j+1} = \underline{u}^j + \underline{\Delta}^j,$$

with $\underline{\Delta}^j$ solving

$$DF(\underline{u}^j)\, \underline{\Delta}^j = -F(\underline{u}^j).$$

(It is not really necessary to compute $DF(\underline{u}^j)^{-1}$; it is enough to solve this equation
for a given right hand side.) In our case the Jacobian $DF$ has a special structure.
The equation is solved by any iteration method such as Gauß-Seidel or SOR; we
get an "inner" iteration in that way, performed in each step of the "outer"
$j$-iteration. We may remind the reader that $\lambda$ has to be chosen by the user.
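A generic sketch of this outer/inner scheme (our illustration; $F$ and $DF$ are supplied by the caller, and the Gauss-Seidel solver assumes a nonzero diagonal):

```python
import numpy as np

def gauss_seidel(A, b, sweeps=50):
    """Inner iteration: approximately solve A x = b."""
    x = np.zeros(len(b))
    for _ in range(sweeps):
        for i in range(len(b)):
            x[i] = (b[i] - A[i, :i] @ x[:i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
    return x

def newton(F, DF, u0, tol=1e-10, max_iter=50):
    """Outer iteration: u^{j+1} = u^j + Delta^j with DF(u^j) Delta^j = -F(u^j)."""
    u = u0.copy()
    for _ in range(max_iter):
        r = F(u)
        if np.linalg.norm(r) < tol:
            break
        u = u + gauss_seidel(DF(u), -r)    # inner solve replaces DF^{-1}
    return u
```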

Problem 3 is a bit simpler since there is no constraint; $\delta$ is given, and we start with

$$G(\underline{a}, \underline{v}) = \lambda\, \phi_1(\underline{a}, \underline{v}) + (1 - \lambda)\, \Psi(\underline{a}, \underline{v}, \delta).$$

Again, we may write the regression as a nonlinear equation for $\begin{pmatrix} \underline{a} \\ \underline{v} \end{pmatrix}$, which we
try to solve by a Newton method. We are not completely in command of the
domain of convergence. Although practically everything looks fine, in reality,
since we do not require that $\|\underline{v}\| = \sigma$, we get a new critical point $\underline{a} = \underline{v} = \underline{0}$, and we
cannot exclude that our method just converges to this trivial solution. To explain
the situation, we study a similar one-dimensional problem with $x \in \mathbb{R}$ instead of
$\begin{pmatrix} \underline{a} \\ \underline{v} \end{pmatrix}$; $G$ consists of a quadratic and a fourth order term:

$$G(x) = \alpha x^2 + \beta (x^2 - \delta)^2, \qquad \alpha, \beta, \delta > 0.$$

If $\alpha \ge 2\beta\delta$, $G$ has only one minimum: the trivial one $x = 0$;
if $\alpha < 2\beta\delta$, $G$ has a maximum at $x = 0$ and two minima at

$$x = \pm\sqrt{\delta - \frac{\alpha}{2\beta}}.$$
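Indeed,

$$G'(x) = 2\alpha x + 4\beta x (x^2 - \delta) = 2x \left( \alpha + 2\beta (x^2 - \delta) \right),$$

so the nontrivial critical points satisfy $x^2 = \delta - \frac{\alpha}{2\beta}$, which is real exactly when $\alpha < 2\beta\delta$.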
Figure 1.2.3 (the cases $\alpha \ge 2\beta\delta$ and $\alpha < 2\beta\delta$)

We are interested in the second situation only; the one-dimensional example
suggests that we get an interesting solution if at $\underline{a} = \underline{v} = \underline{0}$ the function $G$ has a
maximum. To see when this is the case, let us consider the Hessian of $G$ at $\underline{a} = \underline{v} = \underline{0}$.
We get

$$DG(\underline{0}, \underline{0}) = \lambda \begin{pmatrix} W^{(0)} & W^{(1)} \\ W^{(1)} & W^{(2)} \end{pmatrix} - 2(1 - \lambda) \begin{pmatrix} f_{11}(\underline{0}, \underline{0})\, E & f_{12}(\underline{0}, \underline{0})\, E \\ f_{21}(\underline{0}, \underline{0})\, E & f_{22}(\underline{0}, \underline{0})\, E \end{pmatrix}.$$

The last matrix is again a Gramian of the two vectors $(\sqrt{d_1}, \ldots, \sqrt{d_k})^T$ and
$(\sqrt{d_1}\, t_1, \ldots, \sqrt{d_k}\, t_k)^T$ and therefore positive definite. For sufficiently small $\lambda$, the
matrix $DG(\underline{0}, \underline{0})$ is therefore negative definite, so that we are in the (hopefully)
interesting situation. Therefore we try small values of $\lambda$.

Remark. Until now, we have only used $\phi_1$. Let us at least look at the simplest
Problem 1 with $\phi_2$ instead of $\phi_1$. We want to see how sensitive the solution is with
respect to the chosen functional. Moreover, we try another method of solution.
We want to minimize $\phi_2(\underline{a}, \underline{v}) = \sum_{j=1}^{k} \frac{(\underline{x}_j, \underline{w}_j)^2}{\|\underline{x}_j\|^2}$ with the constraint $\|\underline{v}\|^2 = \sigma^2$. We
avoid the Lagrange multipliers by putting

$$\underline{v} = \sigma \begin{pmatrix} \cos\varphi \\ \sin\varphi \end{pmatrix},$$

Figure 1.2.4

and writing $\phi_2$ as a function of $\underline{a}$ and $\varphi$: $\phi_2(\underline{a}, \varphi)$. The price we pay is the shape
of the normal equations: $\operatorname{grad}_a \phi_2$ is a rational function of $(a_1, a_2)$ and has many
zeros; $\frac{\partial \phi_2}{\partial \varphi}$ oscillates near the origin and at infinity. Newton's method cannot be
used for that reason - the domain of convergence is very small. We need other
optimization methods to approach the problem, but we did not find methods which
were absolutely reliable. In 95% of all practical problems they work - but clients in
industry want software which never fails: too many exceptions and too much
interaction with the user would have been needed. We therefore abandoned these ideas.

1.2.4 The interpretation of the results


We are not allowed to present real data. Therefore we show a constructed
example which is not far from reality.
We assume that the "truth" is

$$\underline{x}(t) = \begin{pmatrix} -10 \\ 250 \end{pmatrix} + t \begin{pmatrix} 0 \\ -6 \end{pmatrix}, \qquad 0 \le t \le 50,$$

and $\delta = 0.1$.
We make the data very noisy by adding normally distributed errors to the $\underline{w}_j$ with
a variation $w_r$ and also normally distributed errors to the $d_j$ with a variation $d_r$. For
the $t_j$, we take 100 equidistant points. For the functional $\phi_1$, Problems 1, 2 and 3 give
the following estimates for $\underline{a}$ and $\underline{v}$, whose correct values are $\begin{pmatrix} -10 \\ 250 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ -6 \end{pmatrix}$:

w_r    d_r    Problem 1            Problem 2            Problem 3
              a        v            a        v            a        v
0.1    0.0    -8.75    -0.03        -8.83    -0.03        -9.56    -0.01
              245.00   -5.99        245.00   -5.99        249.50   -5.98
0.3    0.0    -4.11    -0.14        -4.21    -0.14        -8.13    -0.05
              220.00   -5.99        221.00   -5.99        239.00   -5.73
0.5    0.5    0.36     0.23         0.43     0.25         -5.36    -0.12
              192.00   -5.99        197.00   -5.99        239.00   -5.73

(For each noise level, the two rows give the two components of the estimates of $\underline{a}$ and $\underline{v}$.)

Problem 1 takes less computing time; Problems 2 and 3 are solved more slowly,
but with approximately equal effort. Problem 3 clearly delivers the best solution.
Our results show that the functional $\phi_2$ cannot compete - it is slower and less precise.
Of course, the computation is done on-line - one uses the data at hand at a certain
time. Each newly incoming measurement improves the result; the "old" result is used
as the starting value for the Newton iteration. In that way, this iteration needs fewer
steps.
In the beginning, the results are less encouraging, especially if the course of the
ship is pointed directly at the observer. However, they soon get better if the path
does not go exactly through the origin.
The client accepted the results but soon asked for higher computational speed.
A problem for industry is never solved: it is like the German fairy tale of a little
boy with the strange name Häwelmann. He was never content with what he got; he
always shouted "more, more".

1.3 The Airbag-sensor


1.3.1 The objective
Usually, in an airbag system of a car, the blowing up of the airbag is controlled
by an electronic and an electromechanical sensor. The airbag will be released if both
sensors react to an outer influence, closing an electronic circuit.
In this section, we deal with the electromechanical sensor, the so-called Safing
sensor. Basically, it consists of a magnet along with a metallic ring which can move
within certain boundaries. The electronic circuit will close if the ring moves over a
certain distance. The sensor has to be built in such a way that the circuit is not
closed if the force is small (say, in case of braking only), but is closed in case of a
strong force (crash case).
We discuss here a model for the magnetic force and other influences on the ring.

1.3.2 The modelling project


Model a Safing sensor, that is, the movement of a ring under the action of
an external force and the magnetic field.

1.3.3 The algorithmic project


Give a correct computational method for the model. This need not come
immediately; no on-line answer is required. This project was carried out by Dr.
Axel Klar, Martin Braun and a group of students during the ECMI Modelling
Week 1994.

1.3.4 The modelling of Safing sensor


Usually, in the airbag system of a car there are two sensors controlling the
blowing up of the airbag. The first sensor is a purely electronic sensor. This
sensor may also react to electromagnetic influences not caused by a car crash. To
avoid the blowing up of the airbag in such a case, a second sensor is used. This is
the so-called Safing sensor. It is built on an electromechanical basis. The airbag will
blow up only if both sensors react.

Figure 1.3.1 (schematic of the sensor: conical magnet in a cylinder; dimensions 9 mm and 19 mm)

In this project, we deal with the Safing sensor only. It consists of a conical
magnet in a cylinder. Around this cylinder, there is a metallic ring with mass $m$
that can move along the cylinder. Initially, the ring must be at one side as shown in
Figure 1.3.1. If it hits one side with a certain velocity $v$, then it bounces back with
a velocity $c \cdot v$, where $c$ is the elasticity coefficient (in our models $c = 0.3$).
Whenever there are forces on the car, there is also an acceleration of the ring in
relation to the cylinder. So the ring starts moving. If it moves over a certain point,
say $A$, the electronic circuit closes. However, this should only happen if the forces
on the car are very large, i.e., if there is a crash. If these forces are only small, for

example, if one brakes, the circuit should not close. In that case, the ring should
not move over the point A. Further, after the braking, the ring must move back to
its initial position. All this can be accomplished by the magnet and its magnetic
field and force. This force causes the ring not to move too far if there is only a small
force on the car.
In the project, we wanted to determine the movement of the ring. The main
problem was the calculation of the magnetic field and the magnetic force. First, we
assumed the magnetic force to be a constant or a linear function of x, where x is
the direction of the movement. Then, we worked out the real magnetic field and the
magnetic force theoretically and found a numerical solution for this case.
It is important to get an estimation for the accelerations of the ring in case of a
crash and in case of braking only. Then, the required magnetic force and, thus, the
magnetic field can be determined. When all forces are known, the movement of the
ring can be modelled as a function of time.
In this section, realistic values for the acceleration of the car (and the ring in the
sensor) will be determined for the crash case and the braking case. We will consider
only a little crash and very strong braking. If the ring moves a suitable distance in
these cases, then it will also move for a strong crash or little braking.
Suppose a car crashes into a wall at a speed of 36 km/h, i.e., 10 m/s, and suppose the
crash lasts 100 ms. Then the acceleration is ($a = v/t$) 100 m/s² ≈ 10g, where $g$
is the earth acceleration. This is not very accurate, but from experiments it follows
that this value is realistic, although it is very small.
Suppose a car that is driving at a speed of 36 km/h brakes suddenly and stands
still after half a second (this is very fast). Then the mean acceleration is 20 m/s² ≈
2g. This value is also realistic, although it is very large.
From this, the magnitude of the magnetic force can be determined such that
the circuit closes if the car crashes and does not close if the car only brakes. The
acceleration of the ring due to the magnetic force should be between 20 m/s² and
100 m/s², acting against the crash and braking force.

Governing equations
The physics of our problem can be described by Newton's second law:

$$m\,\ddot{x} = \sum_{i=1}^{n} F_i.$$

Choosing the positive $x$-direction towards the right of the initial position of the ring,
we obtain

$$m_R\,\ddot{x} = F_{\mathrm{car}} + F_{\mathrm{fric}} - F_{\mathrm{mag}} - F_{\mathrm{ind}},$$

where
• $F_{\mathrm{car}}$ is the force which is exerted on the car. It can either be the crash force
$F_{\mathrm{crash}}$ or the brake force $F_{\mathrm{brake}}$.

• $F_{\mathrm{fric}}$ is the friction force between the ring and the cylinder. It always acts
against the movement; i.e.,

$$F_{\mathrm{fric}} = \begin{cases} -\mu\, m_R\, g & \text{for } \dot{x}(t) > 0, \text{ i.e., the ring moves in the positive } x\text{-direction}, \\ \phantom{-}\mu\, m_R\, g & \text{for } \dot{x}(t) < 0, \text{ i.e., the ring moves in the negative } x\text{-direction}, \end{cases}$$

where $m_R = 0.002$ kg is the mass of the ring, $\mu = 0.5$ is the friction coefficient
and $g = 9.81$ m/s² is the earth acceleration.
• $F_{\mathrm{mag}}$ is the magnetic force on the ring.
• $F_{\mathrm{ind}}$ is the induction force induced by the movement of the ring.
From now on, we consider only the different kinds of accelerations of the ring,
which are obtained from the corresponding forces divided by the mass of the
ring. Then

$$\ddot{x}(t) = a_{\mathrm{car}}(t) + a_{\mathrm{fric}}(\dot{x}(t)) - a_{\mathrm{mag}}(t, x(t)) - a_{\mathrm{ind}}(\dot{x}(t))$$

holds. This equation is a non-linear ordinary differential equation which can be
solved by numerical methods only. As the integration method, we used the trape-
zoidal rule, which is applied twice: first to get the velocity $\dot{x}(t)$ and then the movement
of the ring $x(t)$.
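One plausible reading of this time stepping (a sketch; the step size $h$ and the details of the scheme are our assumptions, as the book does not spell them out): with $\ddot{x}_n$ evaluated from the right hand side above,

$$\dot{x}_{n+1} = \dot{x}_n + \frac{h}{2} \left( \ddot{x}_n + \ddot{x}_{n+1} \right), \qquad x_{n+1} = x_n + \frac{h}{2} \left( \dot{x}_n + \dot{x}_{n+1} \right).$$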

First model
In the first model, the magnetic force which is exerted on the ring is assumed
to be constant in time and independent of the position of the ring. The acceleration
of the ring due to this magnetic force is $a_{\mathrm{mag}} = \mathrm{const}$. Furthermore, the acceleration
due to the force on the car is taken as

$$a_{\mathrm{car}}(t) = \begin{cases} (1 - \cos\omega t)\, a_c & \text{for } t \le T, \\ 0 & \text{for } t > T, \end{cases}$$

where $a_c$ is the amplitude of the acceleration, $\omega = \frac{2\pi}{T}$, and $T$ is the time that the
force lasts. This cosine function appears to be a rather good approximation of the
real acceleration if there is a crash or a brake force. The acceleration due to the
friction between the ring and the cylinder is as described above. To approximate the
movement of the ring in case of a real crash with this model, the following values
are used:
$a_c = 100$ m/s², $a_{\mathrm{mag}} = 20$ m/s², $a_{\mathrm{fric}} = 5$ m/s², $T = 25$ ms.
To approximate the movement in case of braking, the same values are used except
for $a_c$, which is taken as 40 m/s².
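A compact simulation sketch of the first model with the crash-case values above. The travel range $L$ of the ring, the explicit evaluation of the (formally implicit) trapezoidal step and the bounce handling are simplifying assumptions of this illustration:

```python
import numpy as np

a_c, a_mag, a_fric, T = 100.0, 20.0, 5.0, 0.025   # crash case, SI units
L, c = 0.010, 0.3            # assumed travel range of the ring; elasticity

def a_car(t):                # (1 - cos(omega t)) a_c pulse, omega = 2 pi / T
    return (1 - np.cos(2 * np.pi * t / T)) * a_c if t <= T else 0.0

def accel(t, xdot):          # x'' = a_car + a_fric(x') - a_mag
    return a_car(t) - a_fric * np.sign(xdot) - a_mag

h, n = 1e-5, 30000           # step size, number of steps (0.3 s in total)
x, xdot, acc = 0.0, 0.0, accel(0.0, 0.0)
for k in range(n):
    acc_new = accel((k + 1) * h, xdot)
    xdot_new = xdot + 0.5 * h * (acc + acc_new)     # trapezoidal rule ...
    x_new = x + 0.5 * h * (xdot + xdot_new)         # ... applied twice
    if x_new < 0.0 or x_new > L:                    # bounce: v -> -c v
        x_new = min(max(x_new, 0.0), L)
        xdot_new = -c * xdot_new
    x, xdot, acc = x_new, xdot_new, acc_new
print(f"final position: {1000 * x:.2f} mm")
```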

Second model
In the second model, the acceleration due to the magnetic force and the friction
force is the same as in the first model, but the acceleration due to the force which
is exerted on the car is taken from real crash data. A picture of these values is
shown in Figure 1.3.2.
Again $a_{\mathrm{mag}} = 20$ m/s² and $a_{\mathrm{fric}} = 5$ m/s². The crash time $T$ is approximately
90 ms, which can be easily calculated.

With these values, it appears that the ring stays a longer time at the right hand
side of the cylinder than in the previous model. This is because the crash lasts a
longer time. Therefore, the closing time of the electronic circuit is longer (about
90 ms). (The closing time is the time during which the ring is beyond point $A$, i.e.,
the time between two passings of point $A$.)

Figure 1.3.2

Third model
In the third model, again the acceleration due to the forces on the car is taken
from a real crash, and the friction force is the same as in the two previous models.
In this model, the magnet in the cylinder is replaced by a spring, which gives the
following linear acceleration of the ring as a function of its position:

$$a_{\mathrm{spring}}(x) = c_{\mathrm{spring}} \cdot x + c_0,$$

where $c_{\mathrm{spring}}$ is the spring constant and $c_0$ is the spring force on the ring in its
initial position, divided by the mass of the ring.
$c_{\mathrm{spring}}$ and $c_0$ are chosen as 2000 s⁻² and 5 m/s², respectively. The acceleration
$c_0$ is introduced to make sure that the ring moves back to its initial position after
braking. The closing time with this approximation is 95 ms.

1.3.5 The advanced model


Here, we have to calculate the force the conical permanent magnet exerts on the
ring. However, this involves magnetization in an indirect manner only: magnetiza-
tion is generated by the microscopic currents described by the Maxwell equations. For
this we refer to a book on classical electrodynamics, for example, by J. D. Jackson or
by Sommerfeld, or the comparatively recent book by Dautray-Lions; see the references
in Chapter 3. The main tool is always the system of Maxwell equations. We are
mainly interested in macroscopic effects, that is, in the magnetizations and the
forces induced by them. Here one has to be careful: macroscopic currents also play a
role; note, for example, that the movement of the ring also implies a movement of
charges and thus a current; the resulting so-called electromotive force is not
taken into account initially. We shall consider this electric force in a later step. Let
us describe the magnetization of the ring and the force which results from it by
formulating the Maxwell equations in terms of the magnetization instead of the currents.
Magnetization is a vector field $M(\underline{x})$ which we may consider as a sum of magnetic
moments per unit volume, that is, as a density.
Between the magnetic field $H$, the magnetic induction $B$ and the magnetization $M$,
the following relation holds:

$$H = \frac{1}{\mu_0} B - M \quad \text{or} \quad B = \mu_0 H + \mu_0 M,$$

where $\mu_0$ is a well known constant.
$M$ consists of the magnetization of the cone $K$, $M_K$, and the induced magnetization of the
ring, $M_R$, which is not known. However, $M_K$ is given and known: it is constant in
$K$ and points in the direction of the axis of the magnet, that is, of the $z$-axis.
The stationary Maxwell equation is given below:

$$\operatorname{div} B = 0,$$

or

$$\operatorname{div} H = -\operatorname{div} M.$$

If we could determine $H$, we would get

$$B = \mu_0 H + \mu_0 M.$$

$B$ now generates the magnetization of the ring, and

$$M_R = \frac{\mu_R - 1}{\mu_0\, \mu_R}\, B_{|R} = \mu\, B_{|R}.$$

In principle, we have to solve the following system of equations:

$$\operatorname{div} H = -\operatorname{div} M_K - \operatorname{div} M_R, \qquad B = \mu_0 H + \mu_0 (M_K + M_R), \qquad M_R = \mu B. \tag{$*$}$$

For determining $H$ and $B$, we must know $M_R$, which again depends on $B$. In reality,
$M_R$ has less influence on $B$ than $M_K$. Here, one can think of an iteration, say,

$$\operatorname{div} H^{(i+1)} = -\operatorname{div} M_K - \operatorname{div} M_R^{(i)},$$
$$B^{(i+1)} = \mu_0 \left( H^{(i+1)} + M_K + M_R^{(i)} \right),$$
$$M_R^{(i+1)} = \mu B^{(i+1)}, \qquad i = 0, 1, 2, \ldots$$

One can start with $M_R^{(0)} = 0$; so

$$\operatorname{div} H^{(1)} = -\operatorname{div} M_K,$$
$$B^{(1)} = \mu_0 \left( H^{(1)} + M_K \right),$$

and finally

$$M_R^{(1)} = \mu B^{(1)}.$$

One can hope that $B^{(1)}$ and $M_R^{(1)}$ are sufficiently accurate and that further iterations
do not yield a big change. In any case, we shall try this, without having
estimated the error so far.
How does one solve (*)? The first problem only appears because $M_K$ is not
differentiable: it jumps at the surface $\partial K$ and is otherwise constant. This problem
is, like several others in the field of differential equations, self-fabricated. Really, the
equation is as follows:
the flux of $B^{(1)}$ through the boundary of an arbitrary bounded smooth body is always
zero; that is,

$$\int_S \left( B^{(1)}, \underline{n} \right) d\omega = 0$$

for all closed surfaces $S$. Of course, the Gauss theorem changes this into $\operatorname{div} B = 0$:

$$\int_V \operatorname{div} B^{(1)}\, d\underline{x} = \int_{\partial V} \left( B^{(1)}, \underline{n} \right) d\omega = 0$$

for all sets $V$ with sufficiently smooth boundary, and from this it directly follows
that

$$\operatorname{div} B^{(1)} = 0.$$

But, of course, the theorem is valid for differentiable $B^{(1)}$ only - and this is not the
case here, at least not on $\partial K$. Away from $\partial K$, we have $\operatorname{div} M_K = 0$, i.e.,
$\operatorname{div} H^{(1)} = 0$; on $\partial K$, it follows from $\int_S (B^{(1)}, \underline{n})\, d\omega = 0$ that

$$\left( B^{(1)}, \underline{n} \right)_{\partial K_+} = \left( B^{(1)}, \underline{n} \right)_{\partial K_-},$$

where $\underline{n}$ is the outer normal vector on $\partial K$, $\partial K_+$ denotes the boundary limit
when we approach $\partial K$ from the interior, and $\partial K_-$ is the exterior limit. This
can be seen from Figure 1.3.3.

Figure 1.3.3.

Indeed, applying $\int_S (B^{(1)}, \underline{n})\, d\omega = 0$ to a thin closed "pillbox" straddling $\partial K$, we
have $(B^{(1)}, \underline{n})_{\partial K_-} - (B^{(1)}, \underline{n})_{\partial K_+} = 0$ (when the box is very thin, the side
face does not play any role). This is precisely the continuity of $(B^{(1)}, \underline{n})$. Thus, from

$$B^{(1)} = \mu_0 \left( H^{(1)} + M_K \right),$$

it follows that

$$\left( H^{(1)}, \underline{n} \right)_{\partial K_-} - \left( H^{(1)}, \underline{n} \right)_{\partial K_+} = \left( M_K, \underline{n} \right)_{\partial K_+},$$

which is the jump of the normal component of $M_K$ ($M_K$ vanishes outside $K$).


Our problem to determine $H^{(1)}$ now reads: $\operatorname{div} H^{(1)} = 0$ away from $\partial K$; on $\partial K$,
the above jump condition must hold.
In order to calculate $H^{(1)}$, we need the following Maxwell equation:

$$\operatorname{rot} H^{(1)} = \underline{j}.$$

Here, $\underline{j}$ denotes the microscopic and macroscopic currents. Our interest is $H^{(1)}$ in
the ring $R$, where we started with $M_R^{(0)} = 0$. Outside of $K$, and especially in the ring $R$,
$\underline{j}$ is equal to zero, i.e., $\operatorname{rot} H^{(1)} = 0$. These are the integrability conditions for $H^{(1)}$,
and in view of the Poincaré Lemma, we may suppose a potential $\phi$ for $H^{(1)}$:

$$H^{(1)} = -\nabla \phi.$$

This gives us

$$\Delta \phi = 0$$

away from $\partial K$, together with the jump condition

$$\left( \frac{\partial \phi}{\partial n} \right)_{\partial K_+} - \left( \frac{\partial \phi}{\partial n} \right)_{\partial K_-} = \left( M_K, \underline{n} \right)_{\partial K_+}.$$

One observes how the special conical form of the magnet comes into play:
if the magnet were of cylindrical form, the normal component of $M_K$ would vanish on the
lateral surface and there would be no jump in the normal derivative there; the field
would only be generated at the top and bottom faces. Now, we utilize our knowledge
of the solution of the Laplace equation and of the jumps of the solution. One needs the
well-known theory of the "single layer" potential, which together with the "double layer"
is the main tool for the boundary integral method.
We describe quickly all the relevant material.
Let $N(\underline{x}, \underline{y}) = \frac{1}{4\pi \|\underline{x} - \underline{y}\|}$ be the kernel of the Laplace equation. Then we have,
for given $m \in C^0(\partial K)$: the function

$$\phi(\underline{x}) = \int_{\partial K} N(\underline{x}, \underline{y})\, m(\underline{y})\, d\omega(\underline{y})$$

satisfies the Laplace equation outside $\partial K$, is continuous in the whole space, and its
normal derivative $\frac{\partial \phi}{\partial n}$ has a jump of height $m$.

The desired potential is therefore

$$\phi(\underline{x}) = \int_{\partial K} N(\underline{x}, \underline{y}) \left( M_K, \underline{n} \right)(\underline{y})\, d\omega(\underline{y}).$$

We look at the ring $R$. Having $\phi$, we get

$$B^{(1)} = -\mu_0 \nabla \phi \quad (M_K \text{ is zero in the ring}),$$

and therefore we have

$$M_R^{(1)}(\underline{x}) = -\mu\, \mu_0\, \nabla \phi(\underline{x}), \qquad \underline{x} \in R.$$
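As a numerical sketch: if $\partial K$ is triangulated into flat panels (centroids, outward normals, areas), a midpoint quadrature of the single-layer potential gives $B^{(1)}$ and hence $M_R^{(1)}$ at points of the ring. The triangulation and all names are assumed inputs of this illustration; the argument mu is the constant $\mu = \frac{\mu_R - 1}{\mu_0 \mu_R}$.

```python
import numpy as np

MU0 = 4e-7 * np.pi            # permeability of vacuum

def B1(x, y, nrm, dA, M_K):
    """B^(1) = -mu_0 grad phi at points x (P,3); y, nrm, dA describe the panels."""
    m = nrm @ M_K                                   # layer density (M_K, n) per panel
    d = x[:, None, :] - y[None, :, :]               # (P, panels, 3)
    r = np.linalg.norm(d, axis=-1, keepdims=True)
    grad_N = -d / (4 * np.pi * r ** 3)              # grad_x of 1 / (4 pi |x - y|)
    grad_phi = np.sum(grad_N * (m * dA)[None, :, None], axis=1)
    return -MU0 * grad_phi

def ring_magnetisation(x, y, nrm, dA, M_K, mu):
    return mu * B1(x, y, nrm, dA, M_K)              # M_R^(1) = mu B^(1)
```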

The conical surface, consisting of the lateral side, top and bottom, must be parametrized
and the integral evaluated. This is straightforward, since $M_K$ is constant and $\underline{n}(\underline{y})$
is fixed on each one of the three sub-surfaces.
Now it is nearly done: we have the magnetization of the ring, from which we
can calculate the force on $R$. In principle, this is done again with the underlying
micro-currents and the associated Lorentz force $\underline{j} \times B$. There are currents inside $R$ and
on the surface. We have

$$\text{Force} = \int_R \left( \underline{j}_i \times B \right) d\underline{x} + \int_{\partial R} \left( \underline{j}_0 \times B \right) d\omega.$$

We are in the "macro-theory" with magnetization instead of currents, and we must
know that

$$\underline{j}_i = \operatorname{rot} M_R \quad \text{and} \quad \underline{j}_0 = M_R \times \underline{n}$$

hold. With this we have

$$\text{Force} = \int_R \left( \operatorname{rot} M_R \right) \times B\, d\underline{x} + \int_{\partial R} \left( M_R \times \underline{n} \right) \times B\, d\omega.$$

By the identity $\underline{a} \times (\underline{b} \times \underline{c}) = \langle \underline{a}, \underline{c} \rangle\, \underline{b} - \langle \underline{a}, \underline{b} \rangle\, \underline{c}$ and the relation
$\operatorname{div} B = 0$, we get

$$\text{Force} = \int_{\partial R} \left( M_R, \underline{n} \right) B\, d\omega.$$

We are again interested in the component of the force in the direction of the axis of
the cylinder. This means the evaluation of another surface integral over
the surface of the ring, which has 4 parts: the inner and outer cylindrical surfaces and both
frontal sides. This is also simple, although $M_R$ is not constant in $R$.
The movement of the ring under this magnetic force can be seen in Figure 1.3.4.

This is what has been done up to now. However, there is still much to do, for
example: clarify whether the iteration of $H^{(i)}$ can really be terminated; estimate how
the electromotive forces influence the behaviour; incorporate the switching into the
investigation. It looks as if the third point will lead to a continuation of the project.
But if we want simulation results which are really accurate, we have to look
at many points in a more precise manner.
The fundamental equations (Maxwell's equations) are completely reliable; classi-
cal electrodynamics is quite a mature theory. Moreover, the material constants are
well known. What needs to be examined here are the simplifications and neglec-
tions: problems which are clearly formulated mathematically but may sometimes be
difficult to solve. This is a very rewarding area of research where, surprisingly, few
mathematicians are working!
In Chapter 3, the fundamental results concerning Maxwell's equations are pre-
sented, indicating the current techniques and methodology for their solution.

Figure 1.3.4. Movement of the ring with the real magnetic force.

1.4 How to judge the quality of a nonwoven fabric


1.4.1 The problem
A variety of industrial products, ranging from carpets to baby napkins, make
use of artificial fabrics. Nonwoven fabrics, agglomerations of (plastic) fibres, are
mainly used. They are sometimes called "fleeces". The visual and mechanical
properties of these fabrics depend on the homogeneity of the fibres - homogeneity
with respect to the local density of the fibres and homogeneity with respect to
their directions. If there are areas of different density, they look like dark or light
clouds; the defect of nonuniform density is therefore called "cloudiness". If many
parallel fibres stick together, the fabric shows anisotropies, or "ships". Clouds and
ships reduce the quality of a fabric; traditionally this quality was judged through
visual inspection - experts looked at the fabric and ranked its quality. To get a more
objective judgment, one needs a measure for nonuniformity. The company takes
on-line images of the fabric; ca. 1 m long pieces of the fabric ribbon are mapped and
should be evaluated on-line in order to change or stop the production process if the
quality decreases.
The problem consists in developing a mathematical model evaluating the fabric
images which describes the nonuniformity of these images in a way corresponding
to the judgment of quality by experts. The calculation of the quality given by the
model should be fast enough to enable an on-line judgment.
The project has quite a long history, in which several people worked success-
fully, but only for a short while before further improvements were requested ("More,
28 GHAPTER 1. GASE STUDIES AT KAISERSLAUTERN

more", shouts little Hävelmann!). With respect to cloudiness, it seems that we got
a saturation and there is now even a software product judging the cloudiness in a
proper way. Ships are still under consideration.
The long history has led to a long list of contributors: Dr. Brian Wetton, Dr . H.
G. Stark (both are meanwhile professors, in Canada and in Germany respectively),
Dr. P. Hackh, Dr. R. Rösch, Dr. J. Weiekert (and one of us). We shall describe
some attempts in detail; others just by mentioning them.

1.4.2 The models: A first approach


At the end of the production process, a piece of the fabric is screened by a laser and the intensity of the transmitted light is measured "pointwise". Such a point has a diameter corresponding to that of the laser beam and there are ca. N = 5000 points in each image. The points are called pixels, and to each pixel a grey value, i.e., the measured intensity, is assigned. Since an image is a two-dimensional object, we use double indices $(i,j)$ to address an individual pixel; the grey value of pixel $(i,j)$ is denoted by $\mu_{i,j}$ and assumed to be nonnegative. We are interested in intensity fluctuations and therefore only in relative values. Hence, we normalize $\mu$ by
$$ \mu_{i,j} := \frac{\mu_{i,j}}{\sum_{k,l}\mu_{k,l}}, $$
where $N_1 \cdot N_2 = N$ is the total number of pixels. As a positive, normalized function on a finite domain, it may be interpreted as a probability density, and this in fact helps us in getting ideas in spite of the fact that there is no randomness involved. Absolute uniformity would mean $\mu_{i,j} = \frac{1}{N}$ for all $(i,j)$; we denote this uniform distribution by $\bar\mu_{i,j}$.

Our modelling task seems to be clear: if we want one number to characterize the quality, it should be something like a distance between a measure $\mu$ and $\bar\mu$. The larger the distance, the poorer the quality.

1) Since $\mu$ is given by an N-tuple, i.e., $\mu\in\mathbb{R}^N$, we may use any distance known in $\mathbb{R}^N$, for example
$$ d_p(\mu,\bar\mu) = \left(\sum_{i,j}\left|\mu_{i,j} - \frac1N\right|^p\right)^{1/p}. $$
This is a simple but wrong idea. Why it is wrong we understand by looking at a "one-dimensional fabric" (this only spares us elaborate index handling - the idea is, of course, independent of the dimension): A cloud is a section $\alpha\le i\le\beta$ in which $\mu_i$ is larger (light cloud) or smaller (dark cloud) than the average $\bar\mu_i = \frac1N$.
• 1

Figure 1.4.1. (Sketch of a one-dimensional grey-value profile with a dark cloud and a light cloud.)

Now assume that we have a fabric with a large hole of length, say, $\frac{1}{10}N$ in the middle. The corresponding $\mu$ could be
$$ \mu_{\text{hole}} = \Big(\tfrac{10}{9}\tfrac1N,\dots,\tfrac{10}{9}\tfrac1N,\ \underbrace{0,\dots,0}_{\text{length }N/10},\ \tfrac{10}{9}\tfrac1N,\dots,\tfrac{10}{9}\tfrac1N\Big); $$
realize that $\mu_{\text{hole}}$ is normalized to 1: we have $\frac{9}{10}N$ pixels of intensity $\frac{10}{9}\frac1N$ and $\frac{1}{10}N$ pixels of intensity 0, which sums up to 1. Such a fabric has a very big dark hole - 10% of the image is completely dark.

Now take another rather regular fabric - not uniform, but oscillating and without big holes:
$$ \mu_{\text{osc}} = \Big(\tfrac1N\big(1+\tfrac13\big),\ \tfrac1N\big(1-\tfrac13\big),\ \tfrac1N\big(1+\tfrac13\big),\ \dots\Big). $$
We see that $|\mu_i - \bar\mu_i| = \frac{1}{3N}$ and therefore
$$ d_2(\mu_{\text{osc}},\bar\mu) = \frac{\sqrt N}{3N} = \frac{1}{3\sqrt N}. $$
The distance of $\mu_{\text{hole}}$ to uniformity is
$$ d_2(\mu_{\text{hole}},\bar\mu) = \left(\frac{9N}{10}\cdot\frac{1}{81N^2} + \frac{N}{10}\cdot\frac{1}{N^2}\right)^{1/2} = \frac{1}{3\sqrt N}. $$

Both fabrics - the oscillating one and the one with the big dark hole - have the same $d_2$-distance, but certainly not the same quality.

$d_p$-distances have the property that they are invariant against index permutations; since $\bar\mu$ does not change under an index permutation $\sigma$, we have
$$ \left(\sum_{i=1}^N\left(\mu_{\sigma(i)} - \frac1N\right)^p\right)^{1/p} = \left(\sum_{i=1}^N\left(\mu_i - \frac1N\right)^p\right)^{1/p}. $$
By index permutation, many small holes may be transformed into one big hole, and both would have the same $d_p$-distance. This contradicts the idea of quality.

2) A physicist has a natural measure for the distance to uniformity: the entropy. Since we compare with $\bar\mu$, we speak of relative entropy - the entropy in comparison to $\bar\mu$. It is defined by
$$ d_{\text{ent}}(\mu,\bar\mu) = \sum_{i,j}\mu_{ij}\ln\frac{\mu_{ij}}{\bar\mu_{ij}} = \sum_{i,j}\mu_{ij}\ln\mu_{ij} + \ln N. $$
But what was true for $d_p$ - and what is an advantage of entropy in statistical physics - is also true for $d_{\text{ent}}$ and makes it unsuitable for our purpose: it is invariant against index permutations, so a big hole would count as many small ones.

3) We should take the idea of a hole more seriously. A hole is an index domain (a set of neighbouring pixels) $\Omega$ in which $\mu_{ij} - \frac1N$ has one sign: it is larger or smaller than $\frac1N$ everywhere in $\Omega$. The size of the hole is given by the size of $\Omega$ and by the deviations of the grey values from the average. We call it "hole volume" and define it as
$$ V(\Omega) = \left|\sum_{(i,j)\in\Omega}\left(\mu_{ij} - \frac1N\right)\right|. $$
We restrict our consideration again to one-dimensional fabrics for the sake of simplicity. Here, $\Omega$ is a section $[\alpha,\beta]$ and the hole volume is
$$ V(\alpha,\beta) = \left|\sum_{i=\alpha}^{\beta}\left(\mu_i - \frac1N\right)\right|; $$
$\alpha,\beta$ are boundaries of holes - by definition. The largest hole is therefore
$$ \max_{\alpha,\beta\ \text{hole boundaries}} V(\alpha,\beta). $$

If we pass over a boundary, $\mu_j - \frac1N$ changes its sign and the volume defined as above becomes smaller. Therefore we keep the idea of the largest hole as distance (but may change the values of the distance a bit) when we omit the restriction that $\alpha,\beta$ are boundaries of holes, and define
$$ D(\mu,\bar\mu) = \max_{1\le\alpha\le\beta\le N}\left|\sum_{i=\alpha}^{\beta}\left(\mu_i - \frac1N\right)\right|, $$
or, again, in two dimensions,
$$ D(\mu,\bar\mu) = \max_{\Omega\ \text{connected}}\left|\sum_{(i,j)\in\Omega}\left(\mu_{ij} - \frac1N\right)\right|. $$
We make a last simplification in restricting the set of connected domains to the set of rectangles. We finally get
$$ D(\mu,\bar\mu) = \max\left\{\left|\sum_{(i,j)\in R}\left(\mu_{ij} - \frac1N\right)\right| :\ R = \{(i,j):\ \alpha_1\le i\le\beta_1,\ \alpha_2\le j\le\beta_2\}\right\}. $$

We go on with this definition; it is a distance and it is sensitive to index permutations. But there is another reason for choosing it: it has a long mathematical tradition - in a field of pure mathematics, namely, the theory of numbers. And therefore much is known about it. Moreover, we used the distance already in Problem 1.

As for one-dimensional images: already in 1917, the famous mathematician Hermann Weyl defined D in a paper entitled (originally in German) "On the uniform distribution of numbers modulo 1", Weyl [1917]. It is a paper in the field of analytic number theory, more precisely, in the theory of Diophantine approximation (starting with the question how exactly any real number may be approximated by rational numbers); just for scientific curiosity: Do you know Dirichlet's approximation theorem, first proved by Kronecker, saying that for every real number $\mu$ and any natural number N, there exist integers n, m such that
$$ \left|\mu - \frac mn\right| < \frac{1}{nN},\qquad 1\le n\le N\,? $$
One may observe that there are then always integers m, n such that the fraction $\frac mn$ approximates $\mu$ with an error less than $\frac{1}{n^2}$. It is a nice game: Try it!
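For readers who want to try the game on a computer: the following brute-force search is our own illustrative sketch (the function name and the example value $\mu = \sqrt2$ are ours, not from the text). For each denominator n it simply tests the nearest fraction m/n:

    # Brute-force illustration of Dirichlet's approximation theorem:
    # for a real mu and a natural number N, find integers m, n with
    # 1 <= n <= N such that |mu - m/n| < 1/(n*N).
    import math

    def dirichlet_pair(mu, N):
        for n in range(1, N + 1):
            m = round(mu * n)          # best numerator for this denominator
            if abs(mu - m / n) < 1.0 / (n * N):
                return m, n
        return None                    # never reached, by the theorem

    m, n = dirichlet_pair(math.sqrt(2), 100)
    print(m, n, abs(math.sqrt(2) - m / n), 1 / (n * 100))

For $\mu = \sqrt2$ and N = 100 this returns the convergent 99/70 of the continued fraction of $\sqrt2$.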

The same distance also occurs in statistics: for densities $\mu,\nu$, one defines the Kolmogorov-Smirnov distance as
$$ D(\mu,\nu) = \max_{1\le\alpha\le\beta\le N}\left|\sum_{j=\alpha}^{\beta}(\mu_j - \nu_j)\right|. $$
Here again many results are available.


It is one of the pleasures of modelling if one discovers that others, working in completely different fields, find similar notions and concepts. It is not only a pleasure but also an advantage: others, in general quite intelligent people, have thought carefully about it. As a consequence, a modeller should have a deep mathematical knowledge; this knowledge might be the only valuable contribution a mathematician can make to the work of interdisciplinary problem-solving groups.

As mentioned earlier, D is a metric in $\mathbb{R}^N$; for example, there is a triangle inequality. In each metric D, there are unit circles, defined as the points whose distance to zero is less than or equal to one:
$$ E = \{\mu :\ D(\mu,0)\le 1\} $$
(of course, 0 is not a distribution in our sense; but we want to play with the concept to get some feeling). If N = 2, we can draw a picture of E - please verify it. It is not quite what we are used to when thinking of unit balls, but it is reasonable (the fact that it is invariant to permutation is due to the very small N) and it may work if we are able to find fast algorithms for large N.

4) But before we try more mathematical ideas: the $\mu$ are probability densities, and in spaces of probability measures we may find more concepts of distances. For example, there is a Lipschitz distance which, if $\mu$ is interpreted as a measure, is defined as
$$ \rho(\mu,\nu) := \sup\left\{\left|\int\varphi\,d\mu - \int\varphi\,d\nu\right| :\ \varphi\in C_b\ \text{and}\ |\varphi(x)-\varphi(y)|\le\|x-y\|\right\}. $$
Here, $C_b$ is the space of bounded continuous functions on a domain in $\mathbb{R}^N$ - the concept has to be transferred to our situation with finite domain:
$$ \rho(\mu,\nu) := \sup\left\{\left|\sum_{i=1}^N(\mu_i-\nu_i)\varphi_i\right| :\ (\varphi_1,\dots,\varphi_N)\in\mathbb{R}^N\ \text{and}\ |\varphi_i-\varphi_{i+1}|\le\frac1N,\ i=1,\dots,N,\ \varphi_{N+1}=0\right\}. $$



Figure 1.4.2

This seems to be very complicated. We have to maximize with respect to the set
$$ \Phi = \left\{\varphi\in\mathbb{R}^N :\ |\varphi_i - \varphi_{i+1}|\le\frac1N,\ i=1,\dots,N,\ \varphi_{N+1}=0\right\}. $$
This is a problem of linear programming. But, working a bit more, we find an equivalent expression for $\rho$ which is much simpler to handle and which we formulate as a theorem (we got it from a private communication from J. Wick in 1990).

Theorem 1. The discrete Lipschitz distance is also given by
$$ \rho(\mu,\nu) = \frac1N\sum_{i=1}^N\left|\sum_{j=1}^i(\mu_j - \nu_j)\right|. $$

Proof. Let $y_j := \mu_j - \nu_j$. Now
$$ \sum_{j=1}^N y_j\varphi_j = \sum_{j=1}^N\left(\sum_{i=1}^j y_i\right)(\varphi_j - \varphi_{j+1}) $$
(we get it by Abel's "partial summation", the discrete analogue of the partial integration formula). Putting $y_0 = \varphi_{N+1} = 0$ and denoting $\sum_{i=0}^j y_i = Y_j$, we get
$$ \sum_{j=1}^N y_j\varphi_j = \sum_{j=1}^N Y_j(\varphi_j - \varphi_{j+1}). $$
Since $|\varphi_j - \varphi_{j+1}|\le\frac1N$, we get $\left|\sum_{j=1}^N y_j\varphi_j\right|\le\frac1N\sum_{j=1}^N|Y_j|$. The last expression is therefore an upper bound for $\rho(\mu,\nu)$. But this bound is even reached for some special $\varphi$, which we construct: we denote $\beta_j = \operatorname{sign} Y_j$, such that $|Y_j| = \beta_jY_j$ and
$$ \sum_{j=1}^N|Y_j| = \sum_{j=1}^N\beta_jY_j = \sum_{j=1}^N\beta_j\sum_{i=1}^j y_i\quad\text{(by definition)}. $$
Changing the order of summation, we get
$$ \sum_{j=1}^N|Y_j| = \sum_{i=1}^N y_i\sum_{j=i}^N\beta_j. $$
Putting $\phi_i = \frac1N\sum_{j=i}^N\beta_j$, we see that $|\phi_i - \phi_{i+1}| = \frac1N|\beta_i|\le\frac1N$ and
$$ \frac1N\sum_{j=1}^N|Y_j| = \sum_{i=1}^N\phi_i y_i. $$
The bound is attained by $\phi$. Hence,
$$ \max_{\varphi}\left|\sum_{i=1}^N y_i\varphi_i\right| = \sum_{i=1}^N\phi_i y_i = \frac1N\sum_{j=1}^N|Y_j|, $$
which is the statement of the theorem.
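The theorem is also easy to check numerically. The following sketch is ours (not part of the original text): it evaluates the closed formula, verifies that the optimal $\phi$ constructed in the proof attains it, and confirms that random admissible $\varphi$ never exceed it:

    # Numerical check of Theorem 1: the Lipschitz distance
    #   rho(mu, nu) = (1/N) * sum_i |Y_i|,  Y_i = sum_{j<=i} (mu_j - nu_j),
    # equals the supremum of |sum_i (mu_i - nu_i) * phi_i| over all phi
    # with |phi_i - phi_{i+1}| <= 1/N and phi_{N+1} = 0.
    import numpy as np

    rng = np.random.default_rng(0)
    N = 50
    mu = rng.random(N); mu /= mu.sum()          # two normalized "images"
    nu = rng.random(N); nu /= nu.sum()
    y = mu - nu
    Y = np.cumsum(y)

    rho_formula = np.abs(Y).sum() / N

    # the optimal phi of the proof: phi_i = (1/N) * sum_{j>=i} sign(Y_j)
    phi_opt = np.sign(Y)[::-1].cumsum()[::-1] / N
    assert np.isclose(abs(y @ phi_opt), rho_formula)

    # random admissible phi never exceed the formula value
    best = 0.0
    for _ in range(2000):
        d = rng.uniform(-1 / N, 1 / N, N)       # d_i = phi_i - phi_{i+1}
        phi = d[::-1].cumsum()[::-1]            # phi_i = sum_{j>=i} d_j
        best = max(best, abs(y @ phi))
    assert best <= rho_formula + 1e-12
    print(rho_formula, best)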
$\rho$ may be interpreted in a "hole" philosophy: $|Y_j|$ is the volume of a hole which begins at the left boundary (at j = 1). $\rho$ is therefore not the volume of the largest hole but the average of certain hole volumes.

The unit sphere in $\mathbb{R}^2$ with respect to $\rho$ looks like a parallelogram.

Figure 1.4.3

It is no longer symmetric in $(x_1,x_2)$, not even in 2 dimensions. In two dimensions, the definition (or better, representation) of $\rho$ is: let
$$ Y_{ij} = \sum_{k=1}^i\sum_{l=1}^j(\mu_{kl} - \nu_{kl})\quad\text{for } 1\le i,j\le N; $$
then
$$ \rho(\mu,\nu) = \frac{1}{N^2}\sum_{i,j}|Y_{ij}|. $$
$\rho$ is, as a model for quality, as good as D. Now we need fast algorithms to compute D and $\rho$.

1.4.3 Evaluation of our first model

It is important to get fast algorithms to compute D or $\rho$; realize that N is rather big, up to 6000. Moreover, we need on-line algorithms - they have to accompany the production process.

For $\rho$, that is easy: compute the partial sums $Y_i$ or $Y_{ik}$ and take the (arithmetic) average. Our theorem delivers a fast and easy algorithm. What about D? In one dimension, we have
$$ D(\mu,\bar\mu) = \max_{1\le\alpha\le\beta\le N}\left|\sum_{i=\alpha}^{\beta}\left(\mu_i - \frac1N\right)\right| = \max_{1\le\alpha\le\beta\le N}|Y_\beta - Y_{\alpha-1}| = \max_{0\le\alpha,\beta\le N}|Y_\beta - Y_\alpha| $$
(please realize that we have substituted the restriction $\alpha\le\beta$ by arbitrary $\alpha,\beta$)
$$ = \max_{0\le\beta\le N}Y_\beta - \min_{0\le\alpha\le N}Y_\alpha. $$

This is also very easy: one has to compute the partial sums, their maxima and their minima, and the difference. But there is no such formula in two dimensions. It is easier to use an equivalent formulation (equivalent with respect to the modelling quality), the "star" version, where the maximum is taken only over rectangles anchored at a corner,
$$ D^*(\mu,\bar\mu) = \max_{1\le\beta_1,\beta_2\le N}\left|\sum_{i\le\beta_1,\ j\le\beta_2}\left(\mu_{ij} - \frac1N\right)\right|, $$
where D is related to $D^*$ - see Hack [1990] for more details.
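Both one-dimensional algorithms fit into a few lines; the following sketch is ours (the function names are illustrative). It also evaluates the two example fabrics from above, which the $d_2$-distance could not distinguish:

    # One-dimensional discrepancy D and Lipschitz distance rho of a
    # grey-value vector mu (normalized to sum 1) to the uniform 1/N.
    import numpy as np

    def discrepancy_D(mu):
        N = len(mu)
        Y = np.concatenate(([0.0], np.cumsum(mu - 1.0 / N)))  # Y_0..Y_N
        return Y.max() - Y.min()   # = max_{a<=b} |sum_{a..b}(mu_i - 1/N)|

    def lipschitz_rho(mu):
        N = len(mu)
        Y = np.cumsum(mu - 1.0 / N)
        return np.abs(Y).mean()    # = (1/N) * sum_i |Y_i|

    mu_hole = np.full(100, 10 / 9 / 100); mu_hole[45:55] = 0.0  # big hole
    mu_osc = (1 + np.tile([1 / 3, -1 / 3], 50)) / 100           # oscillation
    print(discrepancy_D(mu_hole), discrepancy_D(mu_osc))
    print(lipschitz_rho(mu_hole), lipschitz_rho(mu_osc))

D gives 0.1 for the hole but only 1/300 for the oscillation, separating the two fabrics as intended.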


The results with D were initially very satisfying, but not completely: sometimes the quality estimated by D (or $\rho$) and the judgment of the experts disagree. Fine, says the company, but not fine enough. And the game begins again.

1.4.4 The second model


Clouds and strips are features which have typical sizes. Clouds are isotropic irregularities of a size which is much larger than the diameter of the fibres and much smaller than the size of the fabric. Strips are anisotropic defects of a length and thickness which have different orders of magnitude.

It is a typical feature of modern applied mathematics to apply a "multiscale analysis" to signals, images and functions. Practically, this means looking at an image first from a very short distance: one sees many small details but loses information about the "whole". Increasing the distance to the image (you may see people in galleries studying first tiny details of a painting - for example the title! - and then stepping back to get an overall impression of it), one realizes that the details disappear and the features on larger scales come into view. The mathematical model of this process of zooming, of magnification and minification, is called "multiscale analysis". It appears in different forms and connected with different other theories; for example, as wavelets, but also in numerical analysis as multigrid or hierarchical bases (see, for example, Louis, Maaß and Rieder [1997] or Hackbusch [1992]). The general idea is to construct bases which separate the scales, i.e., different elements represent features at different scales. In that way, it is possible to filter out features of a certain size. Modern mathematics provides constructions of these bases which allow easy decomposition of images. "Easy" means computationally fast; one way is to care for the orthonormality of the basis. Decomposition is then Fourier - or now: wavelet - analysis, since wavelets form orthonormal bases with scale separation.

It should be clear that we first tried to describe clouds and strips with the help of a wavelet analysis; see Stark [1990]. In fact, we started to apply wavelets to industrial problems in 1987, as one of the first groups outside of France.

The results were not bad, but not so impressive either as to justify the application of these advanced tools. In 1995, J. Weickert proposed a simpler but similar concept, the so-called "technique of pyramids", known since long in image processing, which is somehow a prototype of multiscale analysis.

We shall present this approach here - it is, as we said, an easy introduction to a modern mathematical concept. It will be applied to the problem of cloudiness and only for 1d images (but here, the extension to 2d is straightforward).
The process of moving away from an image is modelled by "melting together" any pair of adjacent pixels; magnification, i.e., moving towards the image, means the splitting of one pixel into two. How do we do this splitting and pasting? It is clear from the above that dimensions N related to powers of 2 play a special role. Hence, we start with
$$ N = 2^k + 1,\qquad \mu\in\mathbb{R}^{2^k+1}. $$
We should see $\mu$ as a representation of the complete image, containing all details even on the finest scale. Larger scales mean less information, and the image at larger scales will be represented by elements of $\mathbb{R}^{2^j+1}$ with $1\le j\le k-1$. Processes of magnification, the transition from larger to finer scales, must consist in putting new information in, and are mappings from $\mathbb{R}^{2^j+1}$ to $\mathbb{R}^{2^{j+1}+1}$; we call them interpolation or prolongation operators. Since we restrict our considerations to linear transformations, these operators may be represented by $(2^{j+1}+1)\times(2^j+1)$ matrices. One idea to prolongate is simply to insert the average of two neighbouring grey values:
Figure 1.4.4.

The corresponding matrix is
$$ I_j = \begin{pmatrix} 1 & 0 & & & \cdots \\ \frac12 & \frac12 & 0 & & \\ 0 & 1 & 0 & & \\ 0 & \frac12 & \frac12 & 0 & \\ & 0 & 1 & 0 & \\ & & & \ddots & \\ & & 0 & \frac12 & \frac12 \\ & & \cdots & 0 & 1 \end{pmatrix}. $$
For j = 0, we put
$$ I_0 = \begin{pmatrix} 1 & 0 \\ \frac12 & \frac12 \\ 0 & 1 \end{pmatrix}. $$
We see that the row sums are always 1.


The opposite operation of "moving away", a reduction, which removes information, projects larger spaces to smaller ones; reduction operators are represented by $(2^j+1)\times(2^{j+1}+1)$ matrices. A simple pasting process is just to take average values.

Figure 1.4.5. (Reduction $\mathbb{R}^{2^{j+1}+1}\to\mathbb{R}^{2^j+1}$: the boundary pixel becomes $\frac23\mu_0 + \frac13\mu_1$, an interior pixel $\frac14\mu_1 + \frac12\mu_2 + \frac14\mu_3$.)

We see that in the interior we melt $\mu_m$, $\mu_{m+1}$ and $\mu_{m+2}$ together to one new element $\frac14\mu_m + \frac12\mu_{m+1} + \frac14\mu_{m+2}$; i.e., we convolute $\mu$ with a mask $\left(\frac14,\frac12,\frac14\right)$. But we "sample" only every second element: $\frac14\mu_1 + \frac12\mu_2 + \frac14\mu_3$, $\frac14\mu_3 + \frac12\mu_4 + \frac14\mu_5,\dots$ At the left boundary, we convolute with $\left(\frac23,\frac13\right)$, at the right correspondingly with $\left(\frac13,\frac23\right)$. We get the following matrices:
$$ R_j = \begin{pmatrix} \frac23 & \frac13 & 0 & & & \cdots \\ 0 & \frac14 & \frac12 & \frac14 & 0 & \\ & 0 & 0 & \frac14 & \frac12 & \frac14 \\ & & & & \ddots & \\ & \cdots & & 0 & \frac13 & \frac23 \end{pmatrix}. $$

Moreover, we put
$$ R_0 = \begin{pmatrix} \frac13 & \frac13 & \frac13 \end{pmatrix}. $$
Again, the sums of the row elements are one.
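For later use, one reduction step and one prolongation step can be written down directly; this is our own sketch of the action of $R_j$ and $I_j$ on a vector of length $2^{j+1}+1$ (the special case $R_0$ above is not included):

    # One reduction step (mask (1/4,1/2,1/4) inside, (2/3,1/3) and
    # (1/3,2/3) at the boundaries) and one prolongation step (copy a
    # pixel / insert the average of two neighbours), as in R_j and I_j.
    import numpy as np

    def reduce_step(v):                     # length 2M+1 -> length M+1
        M = (len(v) - 1) // 2
        w = np.empty(M + 1)
        w[0] = 2/3 * v[0] + 1/3 * v[1]      # left boundary mask
        for i in range(1, M):
            w[i] = v[2*i - 1]/4 + v[2*i]/2 + v[2*i + 1]/4
        w[M] = 1/3 * v[-2] + 2/3 * v[-1]    # right boundary mask
        return w

    def prolong_step(v):                    # length M+1 -> length 2M+1
        w = np.empty(2 * len(v) - 1)
        w[0::2] = v                         # keep the old pixels
        w[1::2] = (v[:-1] + v[1:]) / 2      # insert averages in between
        return w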
We want to make a side remark which connects our consideration with a different approach. To convolute with the sequence $\left(\frac14,\frac12,\frac14\right)$ looks like a very rough discretization of the convolution with a Gaussian, i.e., of
$$ \bar\mu(\sigma,x) := \int_{-\infty}^{+\infty}G_\sigma(x-y)\,\mu(y)\,dy, $$
where
$$ G_\sigma(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-x^2/2\sigma^2}, $$
with a variance $\sigma$ to be determined. Now we know that $\bar\mu(t,x)$ solves the diffusion equation $\bar\mu_t = \bar\mu_{xx}$ with initial value $\bar\mu(0,x) = \mu(x)$. If our reduction is a discrete approximation of a continuous convolution, it should be a discrete approximation of the diffusion equation. In fact, we discretize x with a step size $\Delta x$ such that $\mu_k(t) = \bar\mu(t,k\Delta x)$, $k\in\mathbb{Z}$, and we discretize t with $\Delta t$, setting $\mu_k(j) := \mu_k(j\Delta t)$.

By using a central difference approximation for the second derivative, we end up with
$$ \mu_k(j+1) = \mu_k(j) + \frac{\Delta t}{\Delta x^2}\big(\mu_{k+1}(j) - 2\mu_k(j) + \mu_{k-1}(j)\big) = \frac{\Delta t}{\Delta x^2}\,\mu_{k+1}(j) + \left(1 - 2\,\frac{\Delta t}{\Delta x^2}\right)\mu_k(j) + \frac{\Delta t}{\Delta x^2}\,\mu_{k-1}(j); $$

with $\frac{\Delta t}{\Delta x^2} = \frac14$ we get the above-mentioned mask. One step of reduction therefore means one timestep of diffusion. In diffusion processes, perturbations of high frequency disappear quickly, while the "long waves" remain. That is exactly what our reduction does, too. Why do we mention this diffusion aspect? The theory of diffusion offers plenty of additional knowledge like the maximum principle, monotony, decrease of variation. It is rewarding to study diffusion filters in image processing, as quite some people have done.

Now we return to our reduction operators for an iterative reduction process. The basis is the complete image $\mu = v^k\in\mathbb{R}^{2^k+1}$ and we reduce it step by step. We get

Figure 1.4.6. (The Gaussian pyramid: $v^k\xrightarrow{R_{k-1}}v^{k-1}\to\dots\xrightarrow{R_0}v^0$.)

In each of these steps, the high-frequency effects are reduced: each step is a low-pass filter, and this pyramid, called the "Gaussian pyramid", may be interpreted as a sequence of low-pass filtered versions of $\mu$.

In each step, information is thrown away. What are these losses? If we want to compare $v^j$ with $v^{j+1}$, we face the fact that these vectors belong to different spaces. First, we have to "lift" $v^j$ to $\mathbb{R}^{2^{j+1}+1}$ in order to be able to compare. This lifted version should have the same information as $v^j$. Therefore the lifting is done by our interpolation: $I_jv^j$ is in $\mathbb{R}^{2^{j+1}+1}$ and we are now able to compare
$$ w^{j+1} = v^{j+1} - I_jv^j. $$
$w^{j+1}$ contains the information lost by the reduction $v^{j+1}\to v^j$, i.e., by $R_j$. To give a simple example: if $v^1 = (v_1,v_2,v_3)$, then $v^0 = R_0v^1 = \frac13(v_1+v_2+v_3) = \bar v$ and $I_0v^0 = (\bar v,\bar v,\bar v)$. Therefore, $w^1 = (v_1-\bar v,\ v_2-\bar v,\ v_3-\bar v)$ contains the deviation from the average. In general, $w^j$ contains all features belonging to the jth scale. The computation of $w^j$ is comparable with a band-pass filter, where the width of the band corresponds to one step in our size scaling. If we write formally $w^0 = v^0$, we may write $w^k,\dots,w^0$ in the form of a pyramid; we call this pyramid the "Laplace pyramid".

Given the last reduction $v^0 = w^0$ and the losses $w^k,\dots,w^1$, it is easy to reconstruct the whole image recursively:
$$ v^1 = w^1 + I_0v^0,\qquad v^2 = w^2 + I_1v^1,\qquad\dots,\qquad v^k = w^k + I_{k-1}v^{k-1}. $$

Figure 1.4.7
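The whole decomposition and the recursive reconstruction take only a few lines in code. The sketch below is ours and relies on reduce_step and prolong_step from the previous listing; the assert verifies the lossless reconstruction:

    def laplace_pyramid(mu, k):
        # Gaussian pyramid v^k, ..., v^0 and Laplace pyramid w^k, ..., w^0.
        v = [mu]                                  # v[0] = v^k, full image
        for _ in range(k):
            v.append(reduce_step(v[-1]))          # v^{j+1} -> v^j
        w = [v[i] - prolong_step(v[i + 1]) for i in range(k)]
        w.append(v[-1])                           # formally w^0 = v^0
        return w                                  # w[0] = w^k, ..., w[k] = w^0

    def reconstruct(w):
        v = w[-1]                                 # start from v^0 = w^0
        for detail in reversed(w[:-1]):
            v = detail + prolong_step(v)          # v^{j+1} = w^{j+1} + I_j v^j
        return v

    mu = np.random.default_rng(1).random(2**6 + 1)
    w = laplace_pyramid(mu, 4)
    assert np.allclose(reconstruct(mu := mu), mu) or True  # see below
    assert np.allclose(reconstruct(w), mu)        # lossless reconstruction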

Storing $w^k,\dots,w^0$ means storing the whole image. Of course, $w^k,\dots,w^0$ contains more bits than $\mu$ - it contains redundant information - but it is still very useful. For example, we may reduce the amount of data by dropping the kind of information we are not interested in. If we are not interested in details on scales up to order $j < k$, we are content with $v^j$ instead of $\mu$ and need only $w^0,\dots,w^j$. This set contains $1 + 3 + \dots + (2^j+1) = 2^{j+1} + j$ data, which is less than the $2^k+1$ data in the original image.

Remarks.

a) Since $w^j$ should contain only "fluctuations" of a certain order of magnitude, their average value should be zero. Whether this condition is fulfilled depends on the reduction at the boundary - our masks $\left(\frac23,\frac13\right)$ and $\left(\frac13,\frac23\right)$ are properly chosen.

b) We considered only 1d problems. Real 2d images are reduced to the 1d case by treating the rows separately one after the other.

We will now use our Laplace pyramid to judge the cloudiness. We look at fluctuations at the jth scale; the larger the $L^2$-norm of $w^j$, the more this scale contributes to the nonuniformity. We normalize this $L^2$-norm and consider
$$ \sigma_j^2(\mu) := \frac{1}{2^j+1}\,\|w^j\|_2^2, $$
which is (since the average of $w^j$ is zero) the "variance" of $w^j$. Now, what is a cloud? First of all, it is not an objective concept: it is defined by the judgment of an expert who looks at the fabric. Weickert has done experiments with 18 experts and found out that all of them weighted irregularities on a middle scale more. If we try to weigh the scales

by introducing a weighted measure as quality measure,
$$ q(\mu) := \sum_{j=1}^k c_j\,\sigma_j^2(\mu), $$
the expert opinion reached maximally the scale j = 4 (which corresponds to a cloud size of ca. 7 cm) and it included all scales between 3 and 20 cm. This expert evaluation defined appropriate weights $c_j$, and in computing $q(\mu)$ with these weights, an instrument for an objective and acknowledged quality measurement was developed.
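On top of the pyramid code, the quality measure is a one-liner; the following sketch is ours, and the weights in the example call are invented placeholders, not Weickert's calibrated values:

    def cloudiness(mu, k, c):
        # q(mu) = sum_j c_j * sigma_j^2, sigma_j^2 = variance of w^j.
        w = laplace_pyramid(mu, k)                # w[0] = w^k, ..., w[k] = w^0
        sigma2 = [np.mean(d**2) for d in w[:-1]]  # averages of w^j are ~ 0
        return sum(cj * s for cj, s in zip(c, sigma2))

    # 'mu' is the test image from the previous listing; the weights are
    # hypothetical and would in practice come from the expert calibration.
    q = cloudiness(mu, 4, c=[0.1, 0.3, 0.4, 0.2])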
What remains to be done: strips!

Final remarks.
Pyramids have many features in common with wavelets (which correspond to the $w^j$). They are not as elegant: they are not orthogonal, and $w^0,\dots,w^k$ therefore contain redundant information. But the price wavelets pay for orthonormality is some inflexibility which often creates problems, especially at the boundary. Pyramids contain a bit of the concepts of wavelets and multigrid in a quite simple way. They are able to teach more than just how to handle a problem of quality control.

1.5 Fatigue lifetime


1.5.1 Introduction
Steel may seem to be a very ordinary material, but actually behaves in a
complicated way. One of the issues which is still not understood well enough is the
slow degradation and eventual destruction of steel-made machinery and equipment.
For example, consider the rotor blades of a wind power generator which can be
relatively large. The larger they are, the heavier they will be, and the stronger will
be the downward pull of gravity upon them. But since the blades rotate, this force
will change all the time. If the blade points upward, it will be pushed towards the
shaft; if it points downward, it will be pulled away from it . If the blade points
sidewards, gravity will bend it down - just imagine a bridge over a valley which
is supported on only one end. Now, these rotors turn several million times over
the years. If the construction has not been done properly, some degradation occurs
during every single turn which, even if it is very slight, will eventually destroy the
rotor because of the large number of turns. This phenomenon, called fatigue, is
actually very common. A railway track experiences a heavy force each time a railway
wheel passes over it. Fatigue cracks in structural parts of airplanes have reportedly
caused crashes and near crashes in the past and, even if detected in time, are a
recurrent cause of trouble. It would be very helpful if one could predict them in a
reliable manner before they start to appear. The fatigue issue is also very important
for the construction of trucks, buses and cars. On one hand you may want to have
the truck weighing as little as possible in order to reduce fuel consumption and price

of production; on the other hand its parts have to be strong enough to withstand
the bumps when travelling over the roads for years. Here the analysis is particularly
difficult since, as everybody knows, these bumps are rather irregular.
Although, of course, engineers have done most of the research and development
in this area, it has turned out that mathematics contributes new insights and new
algorithms for computing estimates of the amount of damage. This is not just
speculation; for example, near Kaiserslautern, Germany, a company run by mathe-
maticians, computer scientists and engineers alike, which has had its origin in the
group of industrial mathematics headed by Neunzert of Kaiserslautern University,
provides software used by the majority of German car producers.
The research group at Kaiserslautern started a systematic study of fatigue analysis in 1984 and has made significant contributions (see, e.g., Dressler and Hack [1996], Dressler, Hack and Krüger [1997] and Brokate and Sprekels [1996]) by establishing a relationship between the phenomenon of hysteresis and fatigue lifetime and by developing methods particularly useful for data of long drives or for extrapolated data.
The estimation of the lifetime of a component in the automotive, aircraft or railway industry must be carried out to verify the durability of the machine as a whole, or of a part of it, prior to its actual manufacturing. A specific part of a car is usually not destroyed by one large load, but by the growth of very small cracks during many, typically several million, hysteresis loops. At ambient temperatures, the frequency of the oscillation is not taken into account. The damage induced by the individual loops is accumulated according to the famous Palmgren-Miner rule. Experimental results for a given work-piece are usually condensed into an S-N-diagram depicting the Wöhler line, a plot of the (scalar) stress amplitude S versus the number N of cycles (oscillations between two amplitudes; for example, 0 and S) until destruction occurs. For a sequence of cycles of varying amplitude, the Palmgren-Miner rule of linear damage accumulation evaluates the total accumulated damage as the sum of the contributions 1/N from the individual oscillations. The damage induced by these loops is used to estimate the lifetime. The process of finding the duration in which a specific part or the whole machine will be destroyed is called fatigue analysis. The duration is called the fatigue life. Components under consideration may be of different size and shape, so we require a local evaluation of the damage depending on material parameters only. Very often no information about the local values is available. Hence, to get real local values, one has to build a specimen, put it into the vehicle and drive it on the test track.
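To illustrate the Palmgren-Miner rule numerically: the sketch below is ours and assumes a Basquin-type parametrization of the Wöhler line, $S = S_f\cdot N^b$, which is a common choice but not taken from the text; the material constants and cycle counts are invented for illustration only.

    # Palmgren-Miner linear damage accumulation: each cycle of amplitude S
    # contributes 1/N(S), where N(S) is read off the Woehler (S-N) line.
    def cycles_to_failure(S, S_f=900.0, b=-0.1):
        # Basquin-type Woehler line: S = S_f * N**b  <=>  N = (S/S_f)**(1/b)
        return (S / S_f) ** (1.0 / b)

    def miner_damage(amplitudes):
        return sum(1.0 / cycles_to_failure(S) for S in amplitudes)

    # a part is predicted to fail when the accumulated damage reaches 1
    print(miner_damage([200.0] * 100000 + [400.0] * 1000))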
It may be observed that the motivation for data reduction schemes in fatigue analysis is not just the reduction of data (in the sense of storage saving) as suggested by the direct meaning of the word. The main point is to focus attention on the relevant information by intelligent filtering, that is, by omitting the immense mass of data having no effect on damage accumulation. This permits both an effective modular use of modern numerical damage evaluation techniques and the reorganization of test-drive data for test stand experiments. From the point of view of damage analysis, it is well accepted that the rainflow method is the optimal rate-independent data reduction scheme for one-dimensional load histories. From a practical point of view, there are strong points in favour of the rainflow reconstruction algorithm, and it has no disadvantages against Markov simulation, e.g., in performance and reliability. It is mathematically exact in the sense that there are no approximative or heuristic arguments used. The main goal of this section is to introduce briefly a very effective on-line counting, the so-called "4-point counting" of Krüger et al. [1985], and to present a fatigue lifetime estimate based on rainflow counted data using the local strain approach.

1.5.2 Physical situation, modelling, rate independence, and rainflow counting
Our goal is to estimate the lifetime of a structural component experiencing repetitive loading. These components can be of different size and shape; hence we need a local evaluation of the damage depending on material parameters only. The basic assumptions here are that the damage to the component is fatigue-based, that the damage is not due to heavy loads, and that the loading is one-dimensional. One can model this situation by assuming that the component has one crack, perpendicular to this loading, that is larger than all others. Due to this, the global stress and strain in the component can be thought to be higher at this point (Figure 1.5.1).


Figure 1.5.1. Schematic drawing of the basic situation.

Thus the global stress and strain are one-dimensional and take their local values
at the crack. The breaking will eventually occur at a point when the total accumu-
lated damage exceeds a certain value.

$$ \text{Stress} = \sigma = \frac{L}{A} = \frac{\text{Force}}{\text{Cross-section area}},\qquad \text{Strain} = \varepsilon = \frac{\Delta l}{l} = \frac{\text{Extension}}{\text{Original length}}; $$
$(\sigma_l,\varepsilon_l)$ are their local values.


Let us consider in more detail the expansion and compression of the component. The expansion will cause an elastic deformation up to a certain point. If this point is passed, then the material undergoes a plastic deformation which causes a permanent damage. Similarly, under compression, the rod will deform first elastically and then plastically. So, in the stress-strain plane, the trajectory $(\sigma(t),\varepsilon(t))$ of the history of the sample will follow a so-called hysteresis loop as in Figure 1.5.2.

Figure 1.5.2. A single hysteresis loop. (A → B ~ expansion, B → A ~ compression; the rounded parts of the loop correspond to nonlinear plastic deformation.)

Now, we note that the units of area in the stress-strain plane are energy per unit volume. Thus, each time the trajectory completes a hysteresis loop, energy is dissipated - a function of the size of the loop - which is assumed to permanently damage the structure.

There are two types of curves that make up the time history of the straining of the structure, and they are related by a simple formula. These are called the cyclic curve and the doubled curve. Initially, the trajectory follows the cyclic curve until its first turning point (corresponding to a change of direction of the force applied). At this point, the trajectory changes its direction to follow the doubled curve, because it first must undo what has previously been done before creating strain in the other direction. The formula for the doubled curve, when the cyclic curve is given by $\varepsilon(t) = g(\sigma(t))$, is
$$ \bar\varepsilon(t) = \pm\,2g\!\left(\frac{\bar\sigma(t)}{2}\right), $$
where $\bar\varepsilon(t) = \varepsilon(t) - \varepsilon(t_n)$, $\bar\sigma(t) = \sigma(t) - \sigma(t_n)$ and $t_n$ is the last time at which there is a turning point. This is known as the Masing law and, in order to describe the trajectory completely, we must add to it the so-called Memory laws. These refer to Figure 1.5.3.

Figure 1.5.3. Memory laws. (Stress σ versus strain ε: the trajectory and the cyclic curve.)

(i) After closing a hysteresis loop on the cyclic curve, the trajectory follows the
cyclic curve again.
(ii) After closing a hysteresis loop on a basis branch, the trajectory follows this
branch again.
(iii) If a branch of the trajectory that started on the cyclic curve ever meets it
again, then the trajectory slope changes and it follows the cyclic curve away
from the origin.
The first two of these laws can be explained by noting that completing a hysteresis
loop wipes this loop from the memory of the trajectory, so it continues along the
original track. The third law describes a similar loss of memory of the trajectory
when it intersects with the cyclic curve. This is due to the fact that this point could
also have been reached by simply traversing the cyclic curve from the origin (the
only difference being the accumulated damage).
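Returning to the Masing law for a moment, here is a concrete illustration (our example, not from the text): a widely used choice for the cyclic curve is the Ramberg-Osgood form
$$ \varepsilon = g(\sigma) = \frac{\sigma}{E} + \left(\frac{\sigma}{K'}\right)^{1/n'}, $$
for which the doubled curve becomes
$$ \bar\varepsilon = 2g\!\left(\frac{\bar\sigma}{2}\right) = \frac{\bar\sigma}{E} + 2\left(\frac{\bar\sigma}{2K'}\right)^{1/n'}; $$
the elastic part is unchanged, while the plastic part is the cyclic curve magnified by the factor 2 in both axes.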
The load-time graph for a short stretch of experimentally observed data is given in Figure 1.5.4.

Figure 1.5.4. Time series. (Load L over time; circles mark residual turning points.)

Corresponding to the load-time graph in Figure 1.5.4, the schematic stress-strain trajectory is given in Figure 1.5.5.

The rainflow matrix and the residual graph. We start with a short time series L(t). First, we discretize the load into n intervals of equal length
$$ \Delta L = \frac{\max|L|}{n}, $$
where n is a given integer, and we denote by $L_i$ the load corresponding to interval i, counting in the positive direction. The rainflow matrix A is then defined by $a_{ij}$ being the number of hysteresis cycles whose closing parts start at $L_j$ and end at $L_i$. So, if $i < j$, the loop is "sitting" and if $i > j$, the loop is "hanging". Furthermore, the loops contained in only one load interval are not counted, i.e., $a_{ii} = 0$ for $i = 1,2,\dots,n$. If we delete all hysteresis loops from the time series, then whatever is left is called the residual graph or residue. Let $t_i, t_{i+1}, t_{i+2}, t_{i+3}$ be 4 successive residual turning points in Figure 1.5.4 with
$$ |L(t_{i+1}) - L(t_{i+2})|\ \le\ |L(t_i) - L(t_{i+1})| \quad\text{and}\quad |L(t_{i+1}) - L(t_{i+2})|\ \le\ |L(t_{i+2}) - L(t_{i+3})|. $$
Then we delete $t_{i+1}$ and $t_{i+2}$ from the time series; this is called the Madelung deletion.
Figure 1.5.5. Trajectory in the stress-strain plane. (Schematic: trajectory, boundary curve, cyclic curve; circles mark residual turning points.)

Since there is no hysteresis loop in the residual graph, the load values are

of increasing modulus, until a possible tailing at the end which may be considered as a nesting of unfinished hysteresis loops. The concept of the residual graph is illustrated in Figure 1.5.6.

Load stream → stream of hysteresis loops in the σ-ε plane → damage.

Rainflow counting simply means that, through the repeated Madelung deletions, one eliminates all inner cycles, while noting them down or counting them in the process. Damage can be considered as a functional on the space of loading functions, which are piecewise monotone functions v defined on [0,T] into $\mathbb{R}$, where T is a fixed number. This means that there exists a subdivision $0 = t_0 < t_1 < \dots < t_n = T$ such that the restriction of v to any interval $[t_i,t_{i+1}]$ is monotone. In this case, we call $(t_0,t_1,\dots,t_n)$ a monotonicity division of [0,T]. Let $M_{pm}(0,T)$ be the space of all piecewise monotone functions on [0,T].
Figure 1.5.6. (Load L over time t: time series (dashed), residual graph (solid); squares mark residual turning points.)

For $v\in M_{pm}(0,T)$ and for a minimal monotonicity division $(t_0,t_1,\dots,t_n)$ of [0,T], $(v(t_0),v(t_1),\dots,v(t_n))$ is called the string of turning points of v. A transformation $\phi$ of [0,T] into itself is called monotonicity-preserving if $\phi$ is increasing, $\phi(0) = 0$, and $\phi(T) = T$. A functional $\mathcal{D}$ on $M_{pm}(0,T)$ is called rate-independent if $\mathcal{D}(v) = \mathcal{D}(v\circ\phi)$ for any $v\in M_{pm}(0,T)$ and any monotonicity-preserving $\phi$. It can be verified that $\mathcal{D}(v) = \mathcal{D}(\tilde v)$ for any $\tilde v\in M_{pm}(0,T)$ with $\tilde v(t_i') = v(t_i)$ for all $i = 0,1,2,\dots,n$ and $(t_0',t_1',\dots,t_n')$ a monotonicity division of $\tilde v$. This implies that we can define any rate-independent functional as a functional on the string of turning points. This concept gives us the first part of data reduction: we just need the string of turning points.
From an algorithmic point of view, a very effective on-line counting is the so-called "4-point" counting given by Krüger et al. [1985], whose refinement and didactical reorganizations are presented in Dressler, Hack and Krüger [1997]. A mathematical description can be seen in Brokate and Sprekels [1996, p. 76]. Having the loop stream, damage is evaluated using material properties, experimental data and a damage parameter. This damage parameter induces a functional $\mathcal{D}$ that maps the stream of hysteresis loops into its damage value, indicating how much the stream of loops has damaged the part. Usually a value of zero means not damaged at all, and the value one indicates that the part is broken. Schematically, we get:

Figure 1.5.7. (A hysteresis loop in the load history, with upper and lower loading values $L_o$ and $L_u$.)
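The idea of the 4-point counting can be sketched in a few lines. The following is our own simplified version (not the exact implementation of Krüger et al. [1985]); it processes a string of discretized turning points, repeatedly applies the Madelung deletion to the last four residual points, and records each deleted pair in the rainflow matrix:

    import numpy as np

    def rainflow_4point(turning_points, n_levels):
        # turning_points: alternating integer load levels in 1..n_levels.
        rfm = np.zeros((n_levels, n_levels), dtype=int)   # rainflow matrix
        res = []                                          # residual graph
        for p in turning_points:
            res.append(p)
            # 4-point rule: if the inner range of the last four residual
            # points is contained in both outer ranges, the two middle
            # points close a hysteresis loop and are deleted.
            while len(res) >= 4:
                a, b, c, d = res[-4:]
                if abs(b - c) <= abs(a - b) and abs(b - c) <= abs(c - d):
                    rfm[c - 1, b - 1] += 1   # loop between levels b and c
                    del res[-3:-1]           # Madelung deletion of b and c
                else:
                    break
        return rfm, res

At the end, rfm holds the counted loops and res is the residue of increasing modulus described above.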

1.5.3 Damage estimation using rainflow counted data


In view of the following facts, one looks for the damage of a rainflow counted loading:

(i) The load can be rainflow counted on-line, that is, during the drive on the test track, to reduce the data so that much longer drives and much finer measurements can be applied.

(ii) There exist mathematical tools to manipulate the rainflow count, that is, to extrapolate the data in such a way that the resulting rainflow count has the same distribution of hysteresis loops. This gives a much better extrapolation of track data than simply merging several load histories.

(iii) One can manipulate the rainflow matrix to simulate special situations.

(iv) One can generate artificial rainflow matrices.

It may be observed that all information on the original order with respect to time of the hysteresis loops (in the plane load versus local strain) is lost when the data is rainflow counted. We get a set of loops instead of a stream. The problem is that the counting is done on the loading stream, and therefore we can only get information on the location of the loops with respect to the loading. We cannot read the exact location of the local strains from the rainflow count. For each class of hysteresis loops, we can get the upper and lower loading values $L_o$, $L_u$. By using the geometrical and material properties, one can calculate the amplitude of strain $\varepsilon_a$ (for details we refer to Neuber [1961]). By the transformation from hysteresis loops in the loading-strain plane into such in the stress-strain plane, one also loses the information on the exact location in the stress-strain plane, and one only has the amplitudes ($\sigma_a$ and $\varepsilon_a$).

But most of the damage parameters also depend on the location of the hysteresis loops in the stress-strain plane. The reason is that the damage induced by a hysteresis loop of a given stress and strain amplitude is larger under tension than under compression. The crucial value hereby is the mean stress ($\sigma_m$). Hence one cannot speak of the damage of a rainflow count with respect to such a mean stress-dependent damage parameter. However, it is possible to proceed from a rainflow count to a stream of turning points.
A rainflow count consists of the tuple (RFM, RES), where RFM denotes the rainflow matrix with entries RFM(i,j) giving the number of hysteresis loops from the level i to the level j, and RES is the residual of the count.

We consider the set $\mathcal{L}$ of streams of loading turning points which lead to the given rainflow count (RFM, RES). If RC denotes the rainflow count operator, we write formally
$$ \mathcal{L} = RC^{-1}(RFM,RES). $$
For each stream L we can calculate the damage $\mathcal{D}(H(L))$. Hence we can canonically define the damage of $\mathcal{L}$, and thus of (RFM, RES), by the expectation value of the damage of the streams of turning points in $\mathcal{L}$, where we assume that every stream has the same probability. This assumption makes sense if we do not have any further information on the loading. Thus
$$ \mathcal{D}(RFM,RES) = \frac{1}{|\mathcal{L}|}\sum_{L\in\mathcal{L}}\mathcal{D}(H(L)). $$

If the stream of turning points consists of the hysteresis loops $\{h_r :\ r = 1,\dots,T\}$, the total damage is evaluated as
$$ \mathcal{D}_{tot} = \sum_{r=1}^T d(h_r), $$
where $d(h_r)$ denotes the damage induced by the hysteresis loop $h_r$. For details and the implementation of this technique, we refer to Dressler and Hack [1996], where they have also shown the application of this method, included in the Fatigue Analysis Tool FALANCS™, to real data.

It may be remarked that the contents of this section, as well as those of Chapter 6, are going to be the essential tools for computational work in material sciences, such as in the study of elasticity, plasticity, elasto-plasticity with strain hardening, and elasto-viscoplasticity.
Chapter 2

Algorithms for Optimization

2.1 Introduction
The traces of optimization can be found in the development of calculus. As far back as 1629, Pierre de Fermat showed that the necessary condition for an extremum (minimum or maximum) of a real-valued function of one variable is that the derivative must be zero. A class of extremum problems that has been the favourite of giants like Jean Bernoulli, Leonhard Euler, Adrien-Marie Legendre and Carl Gustav Jacobi since the end of the eighteenth century and the beginning of the nineteenth century is known as the calculus of variation.

The calculus of variation is an infinite-dimensional problem of unconstrained optimization in which the functional to be minimized is defined by an integral. In the second half of the nineteenth century, Karl Weierstrass posed the crucial question of the existence of a solution of minima and maxima and answered it for a fairly general situation. In 1939, Nobel Laureate Leonid Vitalyevich Kantorovich formulated many problems of economics in the form of optimization problems for linear functionals, known as the mathematical method of production planning and organization, and made an extensive study of such problems. Basically, these results constituted the main ideas of linear programming (see, for example, Kantorovich [1975]). Without having any knowledge of this work, in 1942, George B. Dantzig encountered similar problems of minimizing linear functionals under linear constraints while studying applications of mathematics to industrial problems. Around 1950, David Gale, Harold William Kuhn and Albert William Tucker contributed much to this field and also extended this theory to non-linear functionals under constraints. Around 1957, L.S. Pontryagin, V.G. Boltyanskij, R.V. Gamkrelidze and Richard E. Bellman studied the optimal control problem, a special type of optimization problem and a generalization of the calculus of variations where one looks for the solution of a system of differential equations which minimizes or maximizes (extremizes) a functional. In the 1960s, A.Ya. Dubovitskii and A.A. Milyutin, as well as B.N. Pshenichnyj, Lucien W. Neustadt, Hubert Halkin, Jack Warga et al., developed general techniques for obtaining extremum conditions for abstract optimization problems with
constraints so as to include the Kuhn-Tucker theorem and the maximum principle obtained earlier. Convex analysis has played a vital role in the study of abstract extremum problems; Ralph Tyrrell Rockafellar and his co-workers have made valuable contributions in this area. A fairly general theorem for the existence of a unique solution of the optimization problem in the setting of reflexive Banach spaces, in particular for Hilbert spaces, was proved by I. Ekeland and R. Temam in 1974. As a corollary, one obtains from this theorem that the optimization problem for the energy functional has a unique solution under fairly general conditions. A generalization of the variational problem (variational equation) in the form of variational inequalities was introduced by Guido Stampacchia, Jacques-Louis Lions and G. Fichera in the 1970s, and it has been extensively studied by Alain Bensoussan, C. Baiocchi, G. Duvaut, R. Glowinski, F. Giannessi, G. Isac, N. Kikuchi, D. Kinderlehrer, U. Mosco, P.D. Panagiotopoulos et al. An interesting part of this theory is that a large class of optimization problems, including the programming problems (linear and non-linear), are special cases of the variational inequality problems. Shape optimization problems are of vital importance in the design and construction of industrial structures like cars, aircraft and spacecraft (Sokolowski and Zolesio [1992]). For a comprehensive account of the above development, we refer to Polyak [1987] and Siddiqi [1986, 1994]. A comprehensive algorithm for unconstrained optimization is presented by Goldfarb [1994].

Since the main objective of this chapter is to present those results of optimization which are frequently used in industrial applications, we confine ourselves to optimization algorithms like Newton, gradient, conjugate gradient, and quasi-Newton, especially DFP (Davidon-Fletcher-Powell) and BFGS (Broyden-Fletcher-Goldfarb-Shanno).

It may be recalled that iterative methods, where one constructs sequences which converge to a solution of the optimization problem, are known as optimization algorithms. It may also be observed that a field of more recent developments, known as automatic (or computational) differentiation, could influence the development of optimization algorithms. Algorithmic differentiation is a process for evaluating derivatives which depends only on an algorithmic specification of the function to be differentiated. In actual practice the specification of the function is all or part of a computer program, and the derivative values are produced by the execution of a program derived from the program of the original function; hence the term "automatic" (Rall). We refer to Griewank and Corliss [1991] and Berz, Bischof, Corliss and Griewank [1996] for a detailed account of this elegant study.

A recent book by Outrata, Kočvara and Zowe [1998] contains valuable contributions in the field of algorithmic non-smooth optimization.

2.2 General results about optimization


Let X be aspace with appropriate structure (topologieal or algebraic or both)
and let K be a subset of X . Furthermore, let F be a function on X into R . The
2.2. GENERAL RESULTS ABOUT OPTIMIZATION 55

general problem of optimization, say (P), is to find an element u of K such that


F(u) :::; F(v) for all v E K . If this happens, we write

F(u) = vEK
inf F(v) (P) ,

where u is called a minimizer and the value F(u) is called the minimum value. Finding the maxima of F is equivalent to finding the minima of -F. Sometimes this problem is called the constrained optimization problem. If K = X, then (P) is often known as the unconstrained optimization problem. If the minimum exists only relative to a neighbourhood of u, then u is called a relative or local minimum. If u is a minimum or maximum, then it is called an extremum, and a minimum or maximum value of F is called an extremum value. If X = R, the set of real numbers, K = [a,b] ⊆ R and F is a continuous function on R into itself, then F has a minimum on [a,b].

For a topological space X, K a compact subset of X and F continuous, (P) has a solution. The optimization problem for a Banach space X where K is not compact - for example, K the unit ball of the Banach space, namely $K = \{x\in X :\ \|x\|\le 1\}$ - and F a continuous functional had been an open problem for a long time. This was resolved in 1974 in the context of a reflexive Banach space where K is a closed, convex and non-empty subset of X and F a bounded, continuous and convex functional (see, for example, Siddiqi [1986, p. 233]). The solution is unique if F is also strictly convex. If X is a normed linear space, the derivative of a functional F exists at a point $u\in X$ and u is an extremum of F, then $F'(u) = 0$. This is an extension of a famous result of Fermat that if u is a minimum of F(x) on $\mathbb{R}^n$ and F is differentiable at u, then $\nabla F(u) = 0$. These results can be extended to an open subset of X in the following manner.

Theorem 2.1. Let U be an open subset of a normed linear space X and F : U ⊂ X → R be a functional. If F is differentiable at an element u of U and F has an extremum value at this element, then
$$ F'(u) = 0. \qquad(2.1) $$
Equation (2.1) is often called Euler's equation.

Remark 2.1.

(i) The condition that U be an open subset is essential: the theorem does not hold for X = R, U = [0,1] and F : [0,1] → R defined by F(x) = x.

(ii) For X = $\mathbb{R}^n$, (2.1) is equivalent to the equations
$$ \frac{\partial F}{\partial x_i}(u) = 0,\qquad 1\le i\le n. \qquad(2.2) $$

Theorem 2.2. Let U be an open subset of $\mathbb{R}^n$, let $\phi_i : U\to\mathbb{R}$, $1\le i\le m$, be a sequence of continuous functions whose first derivatives exist and are continuous, and let u be a point of the set M,
$$ M = \{y\in U :\ \phi_i(y) = 0,\ 1\le i\le m\}\subset U, \qquad(2.3) $$
at which the derivatives $\phi_i'(u)$, $1\le i\le m$, are linearly independent elements of the vector space of all bounded linear functionals on $\mathbb{R}^n$ into $\mathbb{R}$. Suppose that F : U → R is differentiable at u and F has an extremum value at u over M; then there exist m numbers $\mu_i(u)$, $1\le i\le m$, uniquely defined, such that
$$ F'(u) + \mu_1(u)\phi_1'(u) + \dots + \mu_m(u)\phi_m'(u) = 0. \qquad(2.4) $$
The numbers $\mu_i$, $1\le i\le m$, obtained in this theorem are called the Lagrange multipliers associated with the extremum u under the constraints (2.3).

Example 2.1. Let $X = \mathbb{R}^2$, $F(x_1,x_2) = -x_2$, $M = \{(x_1,x_2)\in\mathbb{R}^2 :\ \phi(x_1,x_2) = x_1^2 + x_2^2 - 1 = 0\}$. Then equation (2.4) takes the form
$$ \begin{pmatrix}0\\-1\end{pmatrix} + \mu\begin{pmatrix}2u_1\\2u_2\end{pmatrix} = 0, $$
which, together with $\phi(u) = 0$, gives the extremum $u = (0,1)$ with $\mu = \frac12$ (and $u = (0,-1)$ with $\mu = -\frac12$).

Remark 2.2. In order to solve a problem posed in the form of Theorem 2.2, we need to find the m+n unknowns $u_i$, $1\le i\le n$, and $\mu_j$, $1\le j\le m$, which are solutions of the system of m+n equations ($u = (u_1,u_2,\dots,u_n)$)
$$ \frac{\partial F}{\partial u_1}(u) + \mu_1\frac{\partial\phi_1}{\partial u_1}(u) + \dots + \mu_m\frac{\partial\phi_m}{\partial u_1}(u) = 0, $$
$$ \vdots \qquad(2.5) $$
$$ \frac{\partial F}{\partial u_n}(u) + \mu_1\frac{\partial\phi_1}{\partial u_n}(u) + \dots + \mu_m\frac{\partial\phi_m}{\partial u_n}(u) = 0, $$
$$ \phi_1(u) = 0,\ \dots,\ \phi_m(u) = 0. $$
The first n equations may also be written in the matrix form
$$ \nabla F(u) + \sum_{i=1}^m\mu_i\nabla\phi_i(u) = 0. \qquad(2.6) $$

The following theorem is a generalization of a well-known result of classical calculus.

Theorem 2.3. Let U be an open subset of a normed linear space X and F : U ⊂ X → R be differentiable on U and twice differentiable at the point $u\in U$. Then
$$ F''(u)(w,w)\ge 0\ \text{ for every } w\in X, \qquad(2.7) $$
provided F has a minimum value at u.

Extrema and convexity. Theorem 2.1 takes the following form:

Theorem 2.4. If, in addition to the hypotheses of Theorem 2.1, K is a convex subset of U, then
$$ F'(u)(v-u)\le 0\ \text{ for every } v\in K\text{ and } u\in K. \qquad(2.8) $$
It may be observed that if K is a subspace of X, then (2.8) takes the form
$$ F'(u)v = 0\ \text{ for every } v\in K, $$
and in particular $F'(u) = 0$, the condition (2.1), if K = X.


Theorem 2.5. Let K be a convex subset of a normed linear space X.

(i) If a convex functional F : K ⊂ X → R has a minimum value in a neighbourhood of a point $u\in K$, then u is a minimizer over the entire set K.

(ii) Every strictly convex functional F : K ⊂ X → R has at most one minimizer, and that minimizer, say u, is strict; that is, $F(u) < F(v)$ for all $v\in K$, $v\ne u$.

(iii) Let F : U ⊂ X → R be a convex function defined on an open subset U of X containing K and differentiable at a point $u\in K$. Then the functional has a minimum at u if and only if
$$ F'(u)(v-u)\ge 0\ \text{ for every } v\in K. \qquad(2.9) $$
If the set K is open, the preceding condition is equivalent to (2.1).
By the Weierstrass theorem of analysis, the optimization problem for a continuous function from $\mathbb{R}^n$ into $\mathbb{R}$ has at least one solution over a closed bounded subset of $\mathbb{R}^n$. The following theorem provides the solution of such a problem for the non-bounded case.

Theorem 2.6. Let K be a non-empty closed subset of $\mathbb{R}^n$ and F : $\mathbb{R}^n\to\mathbb{R}$ a continuous function which is coercive on K in the sense that $F(x)\to\infty$ as $\|x\|\to\infty$. Then (P) has at least one solution.

This result has been extended to the infinite-dimensional case as follows.

Theorem 2.7. Let K be a non-empty, convex, closed subset of a Hilbert space X and F : K ⊂ X → R be convex, continuous and coercive in the sense of the above theorem. Then (P) has at least one solution. This solution is unique if, in addition, F is strictly convex.

It may be remarked that this theorem holds in the context of reflexive Banach spaces, while continuity can be replaced by a weaker condition, namely lower semicontinuity. However, this cannot be extended beyond reflexive Banach spaces, as it breaks down for the $L^1$ space (the space of Lebesgue integrable functions). For a proof of the general form of this theorem, we refer to Siddiqi [1986]. Let a(·,·) be a bilinear, continuous and symmetric form on a Hilbert space X and F ∈ X* (the space of all bounded linear functionals on X, known as the dual space). Then
$$ J(v) = \frac12\,a(v,v) - F(v) \qquad(2.10) $$
is often called the energy functional; it is called a quadratic functional if A in (2.11) below is a matrix. In view of some well-known results of functional analysis (see, for example, Siddiqi [1986]), (2.10) can be written as $J(v) = \frac12\langle Av,v\rangle - \langle y,v\rangle$ for every $v\in X$ and an element $y\in X$ (associated with F by the Riesz representation theorem), where $A : X\to X$ is bounded linear. A relation of the type
$$ a(u,v) = F(v)\quad\text{or}\quad\langle Au,v\rangle = \langle y,v\rangle, \qquad(2.11) $$

for every $v\in U$, a subspace of X, and u an element of U, is known as the variational equation. Let K be a non-empty, closed and convex subset of a Hilbert space X; then the problem of finding $u\in K$ such that
$$ a(u,v-u)\ge F(v-u)\ \text{ for every } v\in K, \qquad(2.12) $$
or equivalently,
$$ \langle Au,v-u\rangle\ge\langle y,v-u\rangle\ \text{ for every } v\in K, $$
is known as the variational inequality problem. Inequality (2.12) is called a variational inequality (see Siddiqi [1986]). It can be proved as a corollary of Theorem 2.7 (see Siddiqi [1986]) that the optimization problem (P) for the energy functional J has a unique solution provided a(·,·) is coercive, that is, there exists a number $\alpha > 0$ such that $a(v,v)\ge\alpha\|v\|^2$ for every $v\in X$. It can also be proved that finding the solution of the problem (P) for the energy functional over a non-empty, convex and closed subset K of X is equivalent to finding the solution of the variational inequality (2.12), where a(·,·) is bilinear, bounded, coercive and symmetric. If K is a subspace of X and, in particular, X itself, then finding the solution of the variational equation (2.11) is equivalent to solving the optimization problem for J(v) under the conditions mentioned above on a(·,·); that is, there exists a unique $u\in K$ such that
$$ J(u)\le J(v)\ \text{ for all } v\in K\quad\text{if and only if}\quad a(u,v) = F(v)\ \text{ for all } v\in K, $$
or
$$ \langle Au,v\rangle = \langle y,v\rangle\quad\text{or}\quad Au = y. $$
Similarly, if K is a non-empty, convex and closed subset, then there exists a unique $u\in K$ such that
$$ J(u)\le J(v)\ \text{ for all } v\in K\quad\text{if and only if}\quad a(u,v-u)\ge F(v-u)\ \text{ for all } v\in K, $$
or
$$ \langle Au,v-u\rangle\ge\langle y,v-u\rangle. $$
For details of the above-mentioned results, we refer to Ciarlet [1989] and Siddiqi [1986].
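For the quadratic case on $\mathbb{R}^n$, the equivalence between minimizing J and solving Au = y is easy to observe numerically; the following sketch is ours (the test matrix is random):

    # Minimizing J(v) = 1/2 <Av, v> - <y, v> on R^n is equivalent to
    # solving Au = y when A is symmetric positive definite.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 5
    B = rng.random((n, n))
    A = B @ B.T + n * np.eye(n)        # symmetric positive definite
    y = rng.random(n)

    def J(v):
        return 0.5 * v @ A @ v - y @ v

    u = np.linalg.solve(A, y)          # the unique minimizer
    v = u + 0.01 * rng.standard_normal(n)
    assert J(u) < J(v)                 # any perturbation increases J
    assert np.allclose(A @ u - y, 0)   # the gradient of J vanishes at u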

2.3 Special classes of optimization problem


2.3.1 Programming problem
Constrained optimization problems of the following type are called programming problems:

(i) Let $U = \{v\in X :\ \phi_i(v)\le 0,\ 1\le i\le m';\ \phi_i(v) = 0,\ m'+1\le i\le m\}$, where $\phi_i : X\to\mathbb{R}$, $1\le i\le m$. For F : U ⊂ X → R, X a normed linear space, (P) is called a non-linear programming problem, with inequality constraints if $m' = m$ and with equality constraints if $m' = 0$.

(ii) If $U = \{v\in X :\ \phi_i(v)\le 0,\ 1\le i\le m\}$ and F and the $\phi_i$ are convex functionals, then the optimization problem (P) is called a convex programming problem.

(iii) If $X = \mathbb{R}^n$, $J(v) = \frac12\langle Av,v\rangle - \langle y,v\rangle$, $A = A^T\in A_n(\mathbb{R})$, $y\in\mathbb{R}^n$, the matrix A is assumed to be positive definite, and $U = \{v\in\mathbb{R}^n :\ \phi_i(v)\le d_i,\ 1\le i\le m\}$, where the $\phi_i$, $i = 1,2,\dots,m$, are affine (of the form $\phi_i(x) = \langle a_i,x\rangle + d_i$, $x\in\mathbb{R}^n$) and hence convex, then (P) is called a quadratic programming problem. If, moreover, the functional itself is linear, $F(v) = \langle c,v\rangle$, and the constraints are given by a matrix $A = (a_{ij})$, then the problem (P) is called a linear programming problem.

2.3.2 Calculus of variation


Let $X = C^1[a,b]$ be the space of all continuous functions whose first derivatives exist and are continuous, and let
$$ F(x(t)) = \int_a^b f(x(t),x'(t),t)\,dt, $$
where f(·,·,·) is continuous in all arguments and is continuously differentiable in x and x'. Then (P) is called the calculus of variation problem. The function x(t) which yields extremum values of F must satisfy the Euler-Lagrange equation, namely,
$$ \frac{\partial f}{\partial x} - \frac{d}{dt}\left(\frac{\partial f}{\partial x'}\right) = 0. \qquad(2.13) $$
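A classical worked example (ours, not from the text): for the arclength functional $F(x) = \int_a^b\sqrt{1 + x'(t)^2}\,dt$ we have $\partial f/\partial x = 0$, so (2.13) reduces to
$$ \frac{d}{dt}\left(\frac{x'}{\sqrt{1+x'^2}}\right) = 0; $$
hence $x'$ is constant and the extremals are straight lines - the shortest connection between two points.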

2.3.3 Minimum norm problem and projection


Let M be a closed convex subset of a Hilbert space X and let
φ = inf_{x∈M} ‖x‖. Then there exists an element u of M such that φ = ‖u‖. This
result provides an answer to the problem of existence of an element whose norm is
less than or equal to the norms of all elements of a subset. For a given element w
of X, there exists an element P_M(w) ∈ M such that ‖w − P_M(w)‖ = inf_{v∈M} ‖w − v‖.
The element P_M(w) satisfies the condition

 (P_M(w) − w, v − P_M(w)) ≥ 0 for all v ∈ M,   (2.14)

and, conversely, if an element u ∈ M satisfies the condition

 (u − w, v − u) ≥ 0 for all v ∈ M,   (2.15)

then P_M(w) = u.

The element P_M(w) is called the projection of the element w ∈ X on M, and
the operator P_M of X onto the set M is called the projection operator. The
projection operator is non-expansive, that is, it satisfies the condition

 ‖P_M(u) − P_M(v)‖ ≤ ‖u − v‖ for all u, v ∈ X.

P_M is linear if and only if M is a subspace, and in that case (2.14) takes the form

 (P_M(w) − w, v) = 0 for every v ∈ M.
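
To make the projection concrete, the following Python sketch (an illustration, not from the text) takes X = Rⁿ and M a coordinate box, for which P_M reduces to componentwise clipping, and verifies (2.14) and the non-expansiveness numerically; the box bounds and test points are arbitrary assumptions.

    import numpy as np

    # Illustrative assumption: M = {v : lo <= v_i <= hi}, a closed convex box,
    # whose projection is componentwise clipping.
    lo, hi = 0.0, 1.0

    def project(w):
        # P_M(w): the nearest point of the box M to w
        return np.clip(w, lo, hi)

    rng = np.random.default_rng(0)
    w = 3.0 * rng.normal(size=5)
    p = project(w)

    # Variational characterization (2.14): (P_M(w) - w, v - P_M(w)) >= 0 for v in M
    for _ in range(1000):
        v = rng.uniform(lo, hi, size=5)
        assert np.dot(p - w, v - p) >= -1e-12

    # Non-expansiveness: ||P_M(u) - P_M(v)|| <= ||u - v||
    u, v = rng.normal(size=5), rng.normal(size=5)
    assert np.linalg.norm(project(u) - project(v)) <= np.linalg.norm(u - v) + 1e-12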

2.3.4 Optimal control problem for a system represented by differential equations
We consider a system described by the nonlinear differential equation

 x′(t) = f(x(t), u(t)) on [0, T], a real finite-time interval,   (2.16)

with

 x(0) = x₀,   (2.16)(a)

where x(t) ∈ Rⁿ is the state vector, u(t) ∈ Rᵐ is the control vector, and
f : Rⁿ × Rᵐ → Rⁿ is assumed to be continuously differentiable in its arguments.
For the class of admissible controls, we take

 U = C¹(0, T; Rᵐ) = {u : [0, T] → Rᵐ : u′ exists and is continuous},

and suppose that, for all u ∈ U, equation (2.16) has a unique solution x ∈ C¹(0, T; Rⁿ).
We call x(t) the trajectory corresponding to the control u(t). We suppose that the
terminal time T is fixed and x(t) satisfies

 G(x(T)) = C,   (2.17)

where G : Rⁿ → Rᵐ and C ∈ Rᵐ. The functional to be extremized is

 J(u) = ∫₀ᵀ l(x, u) dt,   (2.18)

where l : Rⁿ × Rᵐ → R, and we suppose l and G to be continuously differentiable.
The optimal control problem is to find an admissible pair (x, u) which extremizes
(minimizes or maximizes) J(·) under the conditions of equations (2.16) and (2.17).
For more details, see Siddiqi [1986].

2.4 Newton algorithm and its generalization


The Newton method deals with the search for zeros of the equation F(x) =
0, F : U ⊂ X → Y, where X and Y are normed spaces; in particular, for X = Y =
R, F : R → R, or X = Y = Rⁿ, F : Rⁿ → Rⁿ, with U an open subset of X
(an open interval of R or an open ball of Rⁿ). Once we have this method, the function
F can be replaced by F′ or ∇F to obtain an algorithm for finding the extrema of
F, that is, the zeros of F′ or ∇F, which are the critical points of F. One can check,
for instance, that if F : [a, b] → [a, b] and |F′(x)| ≤ k < 1, then F(x) = x has a unique
solution, that is, F has a unique fixed point. For a function F : U ⊂ R → R, U an
open subset of R, the Newton method is defined by the sequence

 u_{k+1} = u_k − F(u_k)/F′(u_k), k ≥ 0,   (2.19)

where u₀ is an arbitrary point of the open set U. The geometric meaning of (2.19) is that
each point u_{k+1} is the intersection of the x-axis with the tangent to the graph of F
at the point (u_k, F(u_k)). This particular case suggests the following generalization
for a function F : U ⊂ X → Y: for an arbitrary point u₀ ∈ U, the sequence {u_k}
is defined by

 u_{k+1} = u_k − {F′(u_k)}⁻¹ F(u_k),   (2.20)

under the assumption that all the points u_k lie in U. If X = Y = Rⁿ, F(u) = 0
is equivalent to

 F₁(u) = 0, u = (u₁, u₂, ..., uₙ) ∈ Rⁿ,
 F₂(u) = 0,
 ⋮
 Fₙ(u) = 0,

where Fᵢ : Rⁿ → R, i = 1, 2, ..., n. A single iteration of the Newton method consists
in solving the linear system

 F′(u_k) Δu_k = −F(u_k), with matrix F′(u_k) = (∂Fᵢ(u_k)/∂xⱼ)_{i,j},   (2.21)

and then setting

 u_{k+1} = u_k + Δu_k.

It may be observed that if F is an affine function, that is, F(x) = Ax − b, where A = (aᵢⱼ)
is a square matrix of size n, A ∈ 𝒜ₙ(R), and b ∈ Rⁿ, then the iteration
described above reduces to solving the linear system Au = b. In this case,
the method converges in a single iteration.
We now look for:

(i) sufficient conditions which guarantee the existence of a zero of the function F,
and

(ii) an algorithm for approximating such an element u, that is, for constructing a
sequence {u_k} of points of U such that

 lim_{k→∞} u_k = u.

We state below two theorems concerning the existence of a unique zero of F and
state their corollaries for the existence of a unique zero of ∇F. The extrema of F
are attained among the zeros of ∇F.

Theorem 2.8. Let X be a Banach space, U an open subset of X, Y a normed
linear space, and F : U ⊂ X → Y differentiable over U. Suppose that there exist
three constants α, β and γ such that α > 0, γ < 1, S_α(u₀) = {u ∈ X : ‖u − u₀‖ ≤ α} ⊂ U,
and that:

(i) for every k ≥ 0 and every u ∈ S_α(u₀), A_k(u) = A_k ∈ BL(X, Y) is bijective and

 sup_{k≥0} sup_{u∈S_α(u₀)} ‖A_k⁻¹(u)‖_{BL(Y,X)} ≤ β;

(ii) sup_{k≥0} sup_{u′∈S_α(u₀)} ‖F′(u′) − A_k(u′)‖_{BL(X,Y)} ≤ γ/β;

(iii) ‖F(u₀)‖ ≤ (α/β)(1 − γ).

Then the sequence defined by

 u_{k+1} = u_k − A_k⁻¹(u_k) F(u_k), k ≥ 0,   (2.22)

is entirely contained within the ball and converges to a zero u of F in S_α(u₀), which
is unique. Furthermore,

 ‖u_k − u‖ ≤ (γᵏ/(1 − γ)) ‖u₁ − u₀‖.   (2.23)

Theorem 2.9. Let X be a Banach space, U an open subset of X, Y a normed
linear space, and F : U ⊂ X → Y continuously differentiable over U. Suppose that
u is a point of U such that

 F(u) = 0, A = F′(u) : X → Y is bounded, linear and bijective,
 sup_{k≥0} ‖A_k − A‖_{BL(X,Y)} ≤ λ/‖A⁻¹‖_{BL(Y,X)}, with λ < 1/2.

Then there exists a closed ball S_r(u), with centre u and radius r, such that, for every
point u₀ ∈ S_r(u), the sequence {u_k} defined by

 u_{k+1} = u_k − A_k⁻¹ F(u_k), k ≥ 0,   (2.24)

is contained in S_r(u) and converges to the point u, which is the only zero of F in the
ball S_r(u). Furthermore, there exists a number γ < 1 such that

 ‖u_k − u‖ ≤ γᵏ ‖u₀ − u‖.   (2.25)
As a consequence of Theorem 2.8, we get the following result:

Corollary 2.1. Let U be an open subset of a Banach space X and let F : U ⊂
X → R be twice differentiable in the open set U. Suppose that there are
three constants α, β, γ such that α > 0, γ < 1, S_α(u₀) = {v ∈ X : ‖v − u₀‖ ≤ α} ⊂ U,
A_k(v) ∈ BL(X, X*) is bijective for every v ∈ S_α(u₀), and

 sup_{k≥0} sup_{v∈S_α(u₀)} ‖A_k⁻¹(v)‖_{BL(X*,X)} ≤ β,

 sup_{k≥0} sup_{v,v′∈S_α(u₀)} ‖F″(v) − A_k(v′)‖_{BL(X,X*)} ≤ γ/β,

 ‖F′(u₀)‖_{X*} ≤ (α/β)(1 − γ).

Then the sequence {u_k} defined by

 u_{k+1} = u_k − A_k⁻¹(u_k) F′(u_k), k ≥ 0,

is contained in the ball S_α(u₀) and converges to a zero of F′, say u, which is the
only zero in this ball. Furthermore,

 ‖u_k − u‖ ≤ (γᵏ/(1 − γ)) ‖u₁ − u₀‖.

As a consequence of Theorem 2.9, we get the following result.

Corollary 2.2. Let U be an open subset of a Banach space X and let F : U ⊂ X → R
be a function which is twice differentiable in U. Moreover, let u be a point of U such
that

 F′(u) = 0, A = F″(u) ∈ BL(X, X*) is bijective,
 sup_{k≥0} ‖A_k − A‖_{BL(X,X*)} ≤ λ/‖A⁻¹‖, and λ < 1/2.

Then there exists a closed ball S_r(u), with centre u and radius r > 0, such that, for
every point u₀ ∈ S_r(u), the sequence {u_k} defined by u_{k+1} = u_k − A_k⁻¹ F′(u_k) is
contained in S_r(u) and converges to the point u, which is the only zero of F′ in the
ball. Furthermore, u_{k+1} = u_k − A_k⁻¹(u_k)F′(u_k) converges geometrically; namely,
there exists a γ such that γ < 1 and ‖u_k − u‖ ≤ γᵏ‖u₀ − u‖, k ≥ 0.

Remark 2.3.
(i) Let X = Rⁿ; the generalized Newton method of Corollary 2.1 takes the form

 u_{k+1} = u_k − A_k⁻¹(u_k) ∇F(u_k),   (2.26)

where the A_k(u_k) are invertible matrices of order n and ∇F(u_k) denotes the gradient
vector of the function F at the point u_k ((Rⁿ)* is identified with Rⁿ). In
particular, the original Newton method corresponds to

 A_k(u_k) = ∇²F(u_k),   (2.26a)

where the matrix ∇²F(u_k) is the Hessian of the function F at the point u_k.

(ii) The special case A_k(u_k) = φ⁻¹ I is known as the gradient method with
fixed parameter.
(iii) The special case A_k(u_k) = φ_k⁻¹ I is called the gradient method with
variable parameter.
(iv) The special case A_k(u_k) = (φ(u_k))⁻¹ I is called the gradient method
with optimal parameter, where the number φ(u_k) (provided it exists) is
determined from the condition

 F(u_k − φ(u_k)∇F(u_k)) = inf_{φ∈R} F(u_k − φ∇F(u_k)).   (2.27)

General definition of the gradient method. Every iterative method for which the
point u_{k+1} is of the form

 u_{k+1} = u_k − φ_k ∇F(u_k), φ_k > 0,

is called a gradient method. If φ_k is fixed, it is called a gradient method
with fixed parameter, while it is called a gradient method with variable
parameter provided φ_k varies with k.

Theorem 2.10. Let X = Rⁿ and let the functional F : X → R be elliptic, that is,
continuously differentiable and such that there is a positive constant α with
(∇F(v) − ∇F(u), v − u) ≥ α‖v − u‖² for all u, v ∈ X. Then the
gradient method with optimal parameter converges.

Remark 2.4.
(i) The following properties of elliptic functionals are quite useful (for details we
refer to Ciarlet [1989]):
(a) An elliptic functional F : X → R (X a Hilbert space, in particular X = Rⁿ) is strictly
convex and coercive, and it satisfies the inequality

 F(v) − F(u) ≥ (∇F(u), v − u) + (α/2)‖v − u‖² for every u, v ∈ X.   (2.28)

(b) If F is twice differentiable, then it is elliptic if and only if

 (∇²F(u)w, w) ≥ α‖w‖² for every u, w ∈ X.   (2.29)

(c) A quadratic functional over Rⁿ,

 F(v) = ½(Av, v) − (y, v), A an n × n matrix with A = Aᵀ,

is elliptic if and only if

 (∇²F(u)w, w) = (Aw, w) ≥ λ₁‖w‖² for all u, w ∈ Rⁿ, with λ₁ > 0,   (2.30)

where λ₁ denotes the smallest eigenvalue of A.

(ii) Let J(v) = ½(Av, v) − (y, v), A : Rⁿ → (Rⁿ)* = Rⁿ.
Since ∇J(u_k) and ∇J(u_{k+1}) are orthogonal and ∇J(v) = Av − y, we have

 (∇J(u_k − φ(u_k)w_k), w_k) = 0, where w_k = Au_k − y = ∇J(u_k).

This implies that φ(u_k) = ‖w_k‖²/(Aw_k, w_k).
A single iteration of the method then takes the following form:
(i) Calculate the vector w_k = Au_k − y.

(ii) Calculate the number φ(u_k) = ‖w_k‖²/(Aw_k, w_k).

(iii) Calculate the vector u_{k+1} = u_k − φ(u_k)w_k.
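
A minimal Python sketch of this three-step iteration follows; the matrix and the data are illustrative assumptions, not from the text.

    import numpy as np

    # Gradient method with optimal parameter for J(v) = (1/2)(Av,v) - (y,v),
    # A symmetric positive definite.
    def steepest_descent(A, y, u0, tol=1e-10, max_iter=500):
        u = u0.astype(float)
        for _ in range(max_iter):
            w = A @ u - y                   # step (i): w_k = grad J(u_k)
            if np.linalg.norm(w) < tol:
                break
            phi = (w @ w) / (w @ (A @ w))   # step (ii): optimal parameter
            u = u - phi * w                 # step (iii): u_{k+1} = u_k - phi w_k
        return u

    A = np.array([[4.0, 1.0], [1.0, 3.0]])  # illustrative SPD matrix
    y = np.array([1.0, 2.0])
    print(steepest_descent(A, y, np.zeros(2)), np.linalg.solve(A, y))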



Theorem 2.11. Let F : Rⁿ → R be a functional which is differentiable and satisfies
the following properties. There are two positive constants α and β such that

(i) (∇F(v) − ∇F(u), v − u) ≥ α‖v − u‖² for all v, u ∈ Rⁿ, with α > 0;

(ii) ‖∇F(v) − ∇F(u)‖ ≤ β‖v − u‖ for every u, v ∈ Rⁿ.

Furthermore, let there exist two numbers a and b such that

 0 < a ≤ φ_k ≤ b < 2α/β² for every k.

Then the gradient method with variable parameter converges, and the convergence is
geometric in the sense that there exists a constant γ, depending on α, β, a, b, such that
γ < 1 and ‖u_k − u‖ ≤ γᵏ‖u₀ − u‖.
Remark 2.5.
(i) If F is twice differentiable, then condition (ii) can also be written in the form
sup_{u∈Rⁿ} ‖∇²F(u)‖ ≤ β.
(ii) In the case of an elliptic quadratic functional F(v) = ½(Av, v) − (y, v), one
iteration of the method takes the form

 u_{k+1} = u_k − φ_k(Au_k − y),

and it follows from Theorem 2.11 that the method is convergent if 0 < a ≤
φ_k ≤ b < 2λ₁/λₙ², where λ₁ and λₙ are the least and the largest eigenvalues of
the symmetric positive definite matrix A.

Proof of Theorem 2.8. First of all, we prove that for every integer k ≥ 1,

 ‖u_k − u_{k−1}‖ ≤ β‖F(u_{k−1})‖,
 ‖u_k − u₀‖ ≤ α, equivalently u_k ∈ S_α(u₀),
 ‖F(u_k)‖ ≤ (γ/β)‖u_k − u_{k−1}‖.

We apply the method of finite induction for the proof. Let us show that the
results are true for k = 1. Putting k = 1 in relation (2.22), we get

 u₁ − u₀ = −A₀⁻¹(u₀)F(u₀),   (2.31)

which implies that ‖u₁ − u₀‖ ≤ β‖F(u₀)‖ ≤ α(1 − γ) ≤ α by the hypotheses of the
theorem. Further, from (2.31), we can write

 F(u₁) = F(u₁) − F(u₀) − A₀(u₀)(u₁ − u₀).

By the Mean Value Theorem applied to the function u → F(u) − A₀(u₀)u, we have

 ‖F(u₁)‖ ≤ sup_{u∈S_α(u₀)} ‖F′(u) − A₀(u₀)‖ ‖u₁ − u₀‖ ≤ (γ/β)‖u₁ − u₀‖,

by condition (ii) of the theorem.

Let us assume that the desired results are true for the integer k = n − 1. Since
u_n − u_{n−1} = −A_{n−1}⁻¹(u_{n−1})F(u_{n−1}), it follows that ‖u_n − u_{n−1}‖ ≤ β‖F(u_{n−1})‖,
which gives the first relation for k = n. Then we have

 ‖u_n − u_{n−1}‖ ≤ β‖F(u_{n−1})‖ ≤ β(γ/β)‖u_{n−1} − u_{n−2}‖
  = γ‖u_{n−1} − u_{n−2}‖ ≤ ··· ≤ γ^{n−1}‖u₁ − u₀‖.

This implies that

 ‖u_n − u₀‖ ≤ Σ_{i=1}ⁿ ‖u_i − u_{i−1}‖ ≤ (Σ_{i=1}ⁿ γ^{i−1}) ‖u₁ − u₀‖
  ≤ ‖u₁ − u₀‖/(1 − γ) ≤ β‖F(u₀)‖/(1 − γ) ≤ (β/(1 − γ)) (α/β)(1 − γ) = α,

which means that u_n ∈ S_α(u₀).

For the proof of the last relation, we write

 F(u_n) = F(u_n) − F(u_{n−1}) − A_{n−1}(u_{n−1})(u_n − u_{n−1}).

By applying the Mean Value Theorem to the function u → F(u) − A_{n−1}(u_{n−1})u,
we get

 ‖F(u_n)‖ ≤ sup_{u∈S_α(u₀)} ‖F′(u) − A_{n−1}(u_{n−1})‖ ‖u_n − u_{n−1}‖ ≤ (γ/β)‖u_n − u_{n−1}‖,

and the last relation is established for k = n. Hence these three relations are true for all
integral values of k.

We now prove the existence of a zero of the functional F in the ball S_α(u₀).
Since

 ‖u_{k+m} − u_k‖ ≤ Σ_{i=0}^{m−1} ‖u_{k+i+1} − u_{k+i}‖
  ≤ γᵏ Σ_{i=0}^{m−1} γⁱ ‖u₁ − u₀‖ ≤ (γᵏ/(1 − γ))‖u₁ − u₀‖ → 0 as k → ∞,   (2.32)

{u_k} is a Cauchy sequence of points in the ball S_α(u₀), which is a closed
subset of the complete metric space X (X being a Banach space). This implies that
there exists a point u ∈ S_α(u₀) such that

 lim_{k→∞} u_k = u.

Since F is differentiable and therefore continuous, we get

 ‖F(u)‖ = lim_{k→∞} ‖F(u_k)‖ ≤ (γ/β) lim_{k→∞} ‖u_k − u_{k−1}‖ = 0,

which, in turn, implies F(u) = 0 by the first axiom of the norm. By taking the
limit m → ∞ in (2.32), we find that ‖u_k − u‖ ≤ (γᵏ/(1 − γ))‖u₁ − u₀‖, which is the
desired result (2.23) concerning geometric convergence.

Finally, we show that u is unique. Let v be another zero of F, that is, F(v) = 0.
Since F(u) = F(v) = 0,

 v − u = −A₀⁻¹(u₀)(F(v) − F(u) − A₀(u₀)(v − u)),

from which it follows that

 ‖v − u‖ ≤ ‖A₀⁻¹(u₀)‖ sup_{w∈S_α(u₀)} ‖F′(w) − A₀(u₀)‖ ‖v − u‖ ≤ γ‖v − u‖,

which implies that u = v, as γ < 1.

Proof of Theorem 2.9.
(i) First of all, we show the existence of constants α and β such that

 α > 0, S_α(u) = {x ∈ X : ‖x − u‖ ≤ α} ⊂ U,   (2.33)

and

 sup_{k≥0} sup_{x∈S_α(u)} ‖I − A_k⁻¹F′(x)‖ ≤ β < 1.   (2.34)

For every integer k, we can write A_k = A(I + A⁻¹(A_k − A)) with ‖A⁻¹(A_k −
A)‖ ≤ λ < 1 in view of a condition of the theorem. Thus, the A_k are isomorphisms
from X onto Y and, moreover,

 ‖A_k⁻¹‖ = ‖(A(I + A⁻¹(A_k − A)))⁻¹‖
  ≤ ‖(I + A⁻¹(A_k − A))⁻¹‖ ‖A⁻¹‖ ≤ ‖A⁻¹‖/(1 − λ).

This implies that

 ‖I − A_k⁻¹A‖ = ‖A_k⁻¹A_k − A_k⁻¹A‖ ≤ ‖A_k⁻¹‖ ‖A_k − A‖
  ≤ (‖A⁻¹‖/(1 − λ)) (λ/‖A⁻¹‖),

or

 ‖I − A_k⁻¹A‖ ≤ λ/(1 − λ) = β′ < 1 for λ < 1/2.

Let δ be such that β′ < β′ + δ = β < 1. From here, (2.33) and (2.34) follow
immediately, keeping in mind the continuity of the derivative F′ and the fact that
A = F′(u).

(ii) Let u₀ be any point of the ball S_α(u) and let {u_k} be the sequence defined by
u_{k+1} = u_k − A_k⁻¹F(u_k); each of these elements lies in S_α(u), so that
{u_k} is well defined. Since F(u) = 0, we have

 u_{k+1} − u = u_k − A_k⁻¹F(u_k) − (u − A_k⁻¹F(u)).

The Mean Value Theorem applied to the function x → x − A_k⁻¹F(x) shows
that

 ‖u_{k+1} − u‖ ≤ sup_{x∈S_α(u)} ‖I − A_k⁻¹F′(x)‖ ‖u_k − u‖ ≤ β‖u_k − u‖

by (2.34); continuing in this way, we get

 ‖u_k − u‖ ≤ βᵏ‖u₀ − u‖,

which is the geometric convergence. This relation also implies that u_k → u
as k → ∞, as β < 1.

(iii) The zero u of F is unique in the ball. To see this, let v be another point such that
F(v) = 0. The sequence {u_k} corresponding to u₀ = v is a stationary sequence,
since u₁ = u₀ − A₀⁻¹F(u₀) = u₀; on the other hand, it converges to the
point u by the above discussion. This implies u = v.

For the proof of Theorem 2.10 we refer to Ciarlet [1989, pp. 300–301]; we
prove here Theorem 2.11.

Proof of Theorem 2.11. In the gradient method with variable parameter, we have
u_{k+1} = u_k − φ_k∇F(u_k). Since ∇F(u) = 0 for a minimum at u, we can write

 u_{k+1} − u = (u_k − u) − φ_k(∇F(u_k) − ∇F(u)).

This implies that

 ‖u_{k+1} − u‖² = ‖u_k − u‖² − 2φ_k(∇F(u_k) − ∇F(u), u_k − u) + φ_k²‖∇F(u_k) − ∇F(u)‖²
  ≤ (1 − 2αφ_k + β²φ_k²)‖u_k − u‖²,

under the condition that φ_k > 0. If

 0 < a ≤ φ_k ≤ b < 2α/β²,

then

 1 − 2αφ_k + β²φ_k² < 1,

and so

 ‖u_{k+1} − u‖ ≤ γ‖u_k − u‖,

where γ < 1 depends on α, β, a and b. This also implies the convergence of {u_k}.

Newton algorithm for finding inf_{x∈Rⁿ} F(x), F : Rⁿ → R.

Algorithm 2.1:
Data: u₀ ∈ Rⁿ.
Step 0. Set k = 0.
Step 1. Compute the Newton search direction
 h_k = −H(u_k)⁻¹∇F(u_k).
 Here H(u) is the Hessian matrix of F.
Step 2. Set u_{k+1} = u_k + h_k,
 replace k by k + 1 and go to Step 1.

Newton–Armijo Algorithm 2.2:
Parameters: α ∈ (0, 1/2), β ∈ (0, 1).
Data: u₀ ∈ Rⁿ.
Step 0. Set i = 0.
Step 1. Compute the Newton search direction
 h_i = −H(u_i)⁻¹∇F(u_i).
 Stop if h_i = 0.
Step 2. Compute the Armijo step size
 λ_i = max_{k∈N} {βᵏ : F(u_i + βᵏh_i) − F(u_i) ≤ αβᵏ(h_i, ∇F(u_i))}.
Step 3. Set u_{i+1} = u_i + λ_ih_i,
 replace i by i + 1 and go to Step 1.
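
The following Python sketch of the Newton–Armijo iteration is a minimal illustration, not from the text: the test function (Rosenbrock's) is an assumption, and no safeguard for an indefinite Hessian is included.

    import numpy as np

    def newton_armijo(F, grad, hess, u0, alpha=0.25, beta=0.5,
                      tol=1e-10, max_iter=50):
        u = u0.astype(float)
        for _ in range(max_iter):
            g = grad(u)
            if np.linalg.norm(g) < tol:
                break
            h = -np.linalg.solve(hess(u), g)     # Newton direction (Step 1)
            lam = 1.0                            # Armijo rule (Step 2)
            while F(u + lam * h) - F(u) > alpha * lam * (h @ g):
                lam *= beta
            u = u + lam * h                      # Step 3
        return u

    # Illustrative test problem: Rosenbrock's function, minimum at (1, 1).
    F = lambda u: (1 - u[0])**2 + 100*(u[1] - u[0]**2)**2
    grad = lambda u: np.array([-2*(1 - u[0]) - 400*u[0]*(u[1] - u[0]**2),
                               200*(u[1] - u[0]**2)])
    hess = lambda u: np.array([[2 - 400*u[1] + 1200*u[0]**2, -400*u[0]],
                               [-400*u[0], 200.0]])
    print(newton_armijo(F, grad, hess, np.array([-1.2, 1.0])))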

2.5 Conjugate gradient method


The conjugate gradient method deals with the minimization of a quadratic
functional on X = Rⁿ; that is,

 J : v ∈ Rⁿ → ½(Av, v) − (b, v),

or

 J(v) = ½(Av, v) − (b, v),

where A is a symmetric positive definite n × n matrix. Starting with an arbitrary
initial vector u₀, we set d₀ = ∇J(u₀). If ∇J(u₀) = 0, the algorithm terminates.
Otherwise, we define the number

 r₀ = (∇J(u₀), d₀)/(Ad₀, d₀);

then the vector u₁ is given by

 u₁ = u₀ − r₀d₀.

Assuming that the vectors u₁, d₁, ..., u_{k−1}, d_{k−1}, u_k have been constructed, which
assumes that the gradient vectors ∇J(u_l), 0 ≤ l ≤ k − 1, are all non-zero, one
of two situations will prevail: either ∇J(u_k) = 0 and the process terminates, or
∇J(u_k) ≠ 0, in which case we define the vector

 d_k = ∇J(u_k) + (‖∇J(u_k)‖²/‖∇J(u_{k−1})‖²) d_{k−1};

then the number r_k and the vector u_{k+1} are given by

 r_k = (∇J(u_k), d_k)/(Ad_k, d_k) and u_{k+1} = u_k − r_kd_k,

respectively.
This elegant algorithm was developed by Hestenes and Stiefel in 1952. The
method converges in at most n iterations (for a proof, see Polyak [1987, p. 69]). For
a computer programme implementing these algorithms, we refer to Press et al. [1992].
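
A minimal Python sketch of the Hestenes–Stiefel iteration just described, with illustrative data, reads as follows.

    import numpy as np

    # Conjugate gradient for J(v) = (1/2)(Av,v) - (b,v), A symmetric positive
    # definite; converges in at most n steps in exact arithmetic.
    def conjugate_gradient(A, b, u0, tol=1e-12):
        u = u0.astype(float)
        g = A @ u - b                 # gradient of J at u
        d = g.copy()                  # initial direction d_0
        for _ in range(len(b)):
            if np.linalg.norm(g) < tol:
                break
            r = (g @ d) / (d @ (A @ d))            # step length r_k
            u = u - r * d                          # u_{k+1} = u_k - r_k d_k
            g_new = A @ u - b
            d = g_new + (g_new @ g_new) / (g @ g) * d   # new direction d_{k+1}
            g = g_new
        return u

    A = np.array([[4.0, 1.0], [1.0, 3.0]])         # illustrative SPD matrix
    b = np.array([1.0, 2.0])
    print(conjugate_gradient(A, b, np.zeros(2)))   # approximately A^{-1} b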
The study of the conjugate gradient method for non-quadratic functions from Rⁿ
into R began in the sixties. Details of these methods and their comparative merits in
different cases can be found in Powell [1986], Gilbert and Nocedal [1992] and Polak
[1997]. However, we present here the essential ingredients of two of the best of these
methods, namely the Fletcher–Reeves (FR) and Polak–Ribière (PR) methods.

Let F : Rⁿ → R be twice differentiable; we look for inf_{v∈Rⁿ} F(v). The
point at which inf_{v∈Rⁿ} F(v) is attained will be denoted by arg inf F. Starting with an
arbitrary vector u₀, one assumes the vectors u₁, u₂, ..., uₙ to have been constructed,
which means that the gradient vectors ∇F(u_i), 0 ≤ i ≤ n − 1, are non-zero. In such
a situation, either ∇F(uₙ) = 0 and the algorithm terminates, or ∇F(uₙ) ≠ 0, in
which case the vector u_{n+1} is defined (if it exists and is unique) by the relations

 u_{n+1} = uₙ − rₙdₙ, F(uₙ − rₙdₙ) = inf_{r∈R} F(uₙ − rdₙ),

the successive descent directions d_i being defined by the recurrence relation

 d₀ = ∇F(u₀),
 d_i = ∇F(u_i) + r_i d_{i−1}.

The case

 r_i = (∇F(u_i), ∇F(u_i) − ∇F(u_{i−1}))/‖∇F(u_{i−1})‖²

is called the Polak–Ribière formula, and in this case the conjugate gradient method
is called the Polak–Ribière conjugate gradient method; one denotes this r_i by
r_i^{PR}.
The case

 r_i = ‖∇F(u_i)‖²/‖∇F(u_{i−1})‖²

is called the Fletcher–Reeves formula, and the corresponding method is called the
Fletcher–Reeves conjugate gradient method. Such an r_i is denoted by r_i^{FR}. It
may be observed that the Polak–Ribière conjugate gradient method is more efficient
in practice.

Polak–Ribière conjugate gradient algorithm:

Data: u₀ ∈ Rⁿ.
Step 0. Set i = 0, d₀ = ∇F(u₀), and h₀ = −d₀.
Step 1. Compute the step size
 λ_i = arg min_{λ≥0} F(u_i + λh_i).
Step 2. Update: set u_{i+1} = u_i + λ_ih_i,
 d_{i+1} = ∇F(u_{i+1}),
 r_i^{PR} = (d_{i+1}, d_{i+1} − d_i)/‖d_i‖²,
 h_{i+1} = −d_{i+1} + r_i^{PR}h_i.
Step 3. Replace i by i + 1 and go to Step 1.

Fletcher–Reeves conjugate gradient algorithm:

Data: u₀ ∈ Rⁿ.
Step 0. Set i = 0, d₀ = ∇F(u₀), and h₀ = −d₀.
Step 1. Compute the step size
 λ_i = arg min_{λ≥0} F(u_i + λh_i).
Step 2. Update: set u_{i+1} = u_i + λ_ih_i,
 d_{i+1} = ∇F(u_{i+1}),
 r_i^{FR} = ‖d_{i+1}‖²/‖d_i‖²,
 h_{i+1} = −d_{i+1} + r_i^{FR}h_i.
Step 3. Replace i by i + 1 and go to Step 1.
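
The following Python sketch implements the Polak–Ribière algorithm above; the exact line search of Step 1 is replaced by a bounded scalar minimization, and, as indicated in the comment, changing a single line yields the Fletcher–Reeves variant. The test function is an illustrative assumption.

    import numpy as np
    from scipy.optimize import minimize_scalar

    def polak_ribiere(F, grad, u0, tol=1e-8, max_iter=100):
        u = u0.astype(float)
        d = grad(u)
        h = -d
        for _ in range(max_iter):
            if np.linalg.norm(d) < tol:
                break
            # Step 1: approximate line search for lambda_i >= 0
            lam = minimize_scalar(lambda t: F(u + t * h),
                                  bounds=(0.0, 10.0), method='bounded').x
            u = u + lam * h                     # Step 2: update the iterate
            d_new = grad(u)
            r = d_new @ (d_new - d) / (d @ d)   # r_i^PR; FR: (d_new@d_new)/(d@d)
            h = -d_new + r * h
            d = d_new
        return u

    # Illustrative smooth test function with minimum at the origin.
    F = lambda u: np.log(np.cosh(u[0])) + 0.5 * u[1]**2
    grad = lambda u: np.array([np.tanh(u[0]), u[1]])
    print(polak_ribiere(F, grad, np.array([2.0, -3.0])))   # approximately (0, 0)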

2.6 Variable metric methods (DFP and BFGS methods)
Variable metric methods are used to minimize functions whose values and gradients
can be computed at arbitrary points. Variable metric methods come in two main flavours: one is the
Davidon–Fletcher–Powell (DFP) algorithm and the other is the Broyden–Fletcher–
Goldfarb–Shanno (BFGS) algorithm. The BFGS scheme is superior in the sense that it
has a low round-off error and converges faster in comparison with other methods. Con-
sider a function F : Rⁿ → R whose value and gradient ∇F(x) can be calculated
at a given n-dimensional point u, taken as the origin of the coordinate system. Our goal is
to find u ∈ Rⁿ at which F has its minimum value, that is, F(u) ≤ F(x) for all x ∈ Rⁿ.
In the first place, we approximate F(·) by a quadratic functional using Taylor's ex-
pansion; then we look for a point of Rⁿ at which the quadratic functional attains its
minimum and subsequently find the minimum value. The function F can be ap-
proximated by its Taylor series

 F(x) = F(u) + Σᵢ (∂F/∂xᵢ)xᵢ + ½ Σ_{i,j} (∂²F/∂xᵢ∂xⱼ)xᵢxⱼ + ···
  ≈ c − (b, x) + ½(Ax, x),   (2.35)

where

 c = F(u), b = −∇F(u), A = (aᵢⱼ) = (∂²F/∂xᵢ∂xⱼ) = ∇²F(u),
 x = (x₁, x₂, ..., xₙ) ∈ Rⁿ.

In this approximation, the gradient of F is ∇F(x) = Ax − b, where (·,·) denotes the
inner product. u is an extremum of F if ∇F(u) = 0, or Au = b. Thus, finding the
extremum of F is approximately equivalent to solving the matrix equation Au = b,
that is, equivalent to finding A⁻¹. Newton's method can be employed for this task.

The matrix A whose components are the second partial derivatives of the function
is called the Hessian matrix of the function at the given point. The basic idea of
the variable metric method is to build up, iteratively, a good approximation to the
inverse Hessian matrix A⁻¹, that is, to construct a sequence of matrices H_i with the
property

 lim_{i→∞} H_i = A⁻¹.   (2.36)

It is much better if the limit is achieved after n iterations instead of ∞. Consider
finding a minimum by using Newton's method to search for a zero of the gradient
of the function. Near the current point x_i, we have the second-order approximation

 F(x) = F(x_i) + ((x − x_i), ∇F(x_i)) + ½(A(x − x_i), (x − x_i));

so

 ∇F(x) = ∇F(x_i) + A(x − x_i).   (2.37)

In Newton's method, we set ∇F(x) = 0 to determine the next iteration point:

 x − x_i = −A⁻¹∇F(x_i).   (2.38)

The left-hand side in (2.38) is the finite step which is required to reach the exact minimum; the
right-hand side is known if we have an accurate H ≈ A⁻¹.

The "quasi" in quasi-Newton is because we do not use the actual Hessian matrix
of F but instead use an approximation to it. Now, consider a descent direction of
F at x_i. This is a direction p along which F decreases: (∇F(x_i), p) < 0.
For the Newton direction (2.38) to be a descent direction, we must have

 (∇F(x_i), x − x_i) = −(∇F(x_i), A⁻¹∇F(x_i)) < 0,   (2.39)

that is, A must be positive definite.

When we are not close enough to the minimum, taking the full Newton step with
p, even with positive definite A, need not decrease the function. Here we can use
a line search and a backtracking strategy to choose a step along the direction of the
Newton step p.

These properties are valid for the DFP algorithm (Press et al. [1992]), which updates
H_{i+1}. Subtracting equation (2.38) at x_{i+1} from that same equation at x_i gives

 x_{i+1} − x_i = A⁻¹(∇F_{i+1} − ∇F_i),   (2.40)

where ∇F_i ≡ ∇F(x_i). We require that the new approximation H_{i+1} satisfy equa-
tion (2.40); that is,

 x_{i+1} − x_i = H_{i+1}(∇F_{i+1} − ∇F_i);

i.e., the updating formula should be of the form H_{i+1} = H_i + correction. The DFP
(Davidon–Fletcher–Powell) updating formula is

 H_{i+1} = H_i + (y_i y_iᵀ)/(y_iᵀ s_i) − (H_i s_i s_iᵀ H_i)/(s_iᵀ H_i s_i),   (2.41)

where

 s_i = ∇F_{i+1} − ∇F_i,
 y_i = x_{i+1} − x_i.

The BFGS (Broyden–Fletcher–Goldfarb–Shanno) formula is the same but with
one additional term; i.e., ··· + (s_iᵀ H_i s_i) u uᵀ, where

 u = y_i/(y_iᵀ s_i) − H_i s_i/(s_iᵀ H_i s_i).   (2.42)

All these methods are invariant under linear transformations of the variables and
converge to a strict local minimizer.
For a computer programme, see Press et al. [1992]. For details concerning the
algorithms and the convergence of these formulae, we refer to Polak [1997].
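
As an illustration (the backtracking parameters and the test function are assumptions of this sketch, and no safeguard against a vanishing denominator y_iᵀs_i is included), the following Python code combines the updating formulae (2.41)–(2.42) with a backtracking line search.

    import numpy as np

    def bfgs(F, grad, x0, tol=1e-8, max_iter=200):
        x = x0.astype(float)
        H = np.eye(len(x0))                  # initial guess for A^{-1}
        g = grad(x)
        for _ in range(max_iter):
            if np.linalg.norm(g) < tol:
                break
            p = -H @ g                       # quasi-Newton direction
            lam = 1.0                        # backtracking (sufficient decrease)
            while F(x + lam * p) > F(x) + 1e-4 * lam * (g @ p):
                lam *= 0.5
            x_new = x + lam * p
            g_new = grad(x_new)
            s = g_new - g                    # s_i of (2.41)
            y = x_new - x                    # y_i of (2.41)
            Hs = H @ s
            u = y / (y @ s) - Hs / (s @ Hs)  # u of (2.42)
            H = (H + np.outer(y, y) / (y @ s)
                   - np.outer(Hs, Hs) / (s @ Hs)
                   + (s @ Hs) * np.outer(u, u))   # BFGS = DFP + extra term
            x, g = x_new, g_new
        return x

    # Illustrative test problem: Rosenbrock's function, minimum at (1, 1).
    F = lambda u: (1 - u[0])**2 + 100*(u[1] - u[0]**2)**2
    grad = lambda u: np.array([-2*(1 - u[0]) - 400*u[0]*(u[1] - u[0]**2),
                               200*(u[1] - u[0]**2)])
    print(bfgs(F, grad, np.array([-1.2, 1.0])))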

2.7 Problems

Problem 2.1. Let m ≥ n > 0, A = (aᵢⱼ), i = 1, 2, ..., m, j = 1, 2, ..., n, and
y ∈ Rᵐ. Write down the solution of the optimization problem for the function
F(x) = ‖Ax − y‖ over Rⁿ.

Problem 2.2. (a) Let F : Rⁿ → R be continuously differentiable in an open
convex set U ⊂ Rⁿ. Show that u ∈ U can be an extremum of F on U only if
∇F(u) = 0.
(b) Extend (a) to F : X → Y, where Y is a normed linear space, X is a
vector space, and F is a Gâteaux differentiable function.

Problem 2.3. (a) Discuss the application of the Newton method for finding a
minimizer of f(x) = sin x.
(b) Under what conditions on a real-valued function f defined on [a, b] does
f(x) = 0 have a solution? Justify your answer.

Problem 2.4. Explain the concepts of the Fibonacci search and the golden section
search.

Problem 2.5. Let Ax = y, where A is an n × n matrix and x and y are elements of
Rⁿ. Write down a sequence of approximate solutions of this equation and examine
its convergence. Under what condition on A does this equation necessarily have a unique
solution?

Problem 2.6. Explain the concept of steepest descent and apply it to study
the optimization of the functional F defined on a Hilbert space H as follows:

 F(x) = (Ax, x) − 2(y, x),

where A is a self-adjoint positive definite operator on H and x, y ∈ H. For any x₁ ∈ H,
construct the sequence {xₙ}, where

 x_{n+1} = xₙ + ((zₙ, zₙ)/(Azₙ, zₙ)) zₙ, zₙ = y − Axₙ,

and show that {xₙ} converges to x₀ in H, which is the unique
solution of Ax = y. Furthermore, show that, by defining

 F(x) = (A(x − x₀), x − x₀),

the rate of convergence is geometric.

Problem 2.7. Show that the Polak–Ribière conjugate gradient algorithm converges
under appropriate conditions.

Problem 2.8. Examine the convergence of the DFP algorithm.


Chapter 3

Maxwell's Equations, Finite and Boundary Element Methods

What are the essential needs of modern life? One can say without any hesita-
tion: electric lighting, heating and cooling, telephones, electric motors and genera-
tors, radio, television, X-ray machines, ECG machines, radar and weather forecasting
equipment, etc. In fact, the development of electromagnetism, embracing elec-
tricity and magnetism, completely revolutionized our way of life. Every day, some
new tools, techniques and equipment are being developed based on the concepts of
electromagnetic theory. The backbone of modern electromagnetism is a system,
or set, of four equations established by the Scottish scientist James Clerk Maxwell,
a professor at the University of Cambridge, around 1872. This system is now known
as the system of Maxwell's equations. In the last three decades, two powerful meth-
ods, namely the finite element and the boundary element methods, have been developed to
find solutions of physical phenomena represented by differential equations.

The main object of this chapter is to give a brief introduction to this elegant
system of mathematical models along with these two methods. In Section 3.1.1, we
give a brief historical note along with the physical laws which led to the derivation of
Maxwell's equations. Consequences of Maxwell's equations are presented in Section
3.1.2, while the variational (weak) formulation of Maxwell's equations is discussed in
Section 3.1.3. Sections 3.2.1 to 3.2.4 are devoted to the various aspects of the finite
element method. The boundary element method is introduced in Section 3.3, and
problems are presented in Section 3.4.

3.1 Maxwell's equations

3.1.1 Brief historical note and physical laws

Electromagnetic theory encompasses electric and magnetic theory. It deals with
electromagnetic fields, regions in which electric and magnetic forces act. As the
story goes, in about 600 B.C., the Greek mathematician, astronomer and philosopher
Thales observed that when amber is rubbed with silk it produces sparks, an observation which led
eventually to the notions of electricity, the electron and electronics (the Greek word for amber is
electron). He also observed the attractive power between pieces of a natural mag-
netic rock called lodestone, found at a place called Magnesia, from which the words
magnet and magnetism are derived. Around 1600 A.D., William Gilbert of England
invented the electroscope for measuring electrostatic effects and pointed out that the
earth itself is a huge magnet. Around 1750, the American scientist-statesman Benjamin
Franklin demonstrated, with his lightning rod experiments, that lightning is electrical,
introduced positive and negative charges
and established the conservation law of charge. After this, Charles Au-
gustin de Coulomb of France invented the torsion balance for measuring electric and
magnetic forces. During the same period, the famous German mathematician and
astronomer Karl Friedrich Gauss formulated his famous divergence theorem relating
volume and surface integrals. In 1800, Alessandro Volta of Italy invented the voltaic cell and
the electric battery. In 1819, the Danish physicist Hans Christian Oersted noted that
electricity produces magnetism. In the following years, the French physicist André
Marie Ampère observed that the atoms in a magnet are magnetized by tiny electric
currents circulating in them. At this time, Georg Simon Ohm of Germany pronounced
his famous law relating current, voltage and resistance. In 1831, Michael Faraday of
London demonstrated that a changing magnetic field produces an electric current.

The above mentioned discoveries provided motivation to James Clerk Maxwell, a
professor at Cambridge University, to develop a unified theory of electricity and
magnetism, which he published in the form of a classic treatise in 1873. He postu-
lated that light is electromagnetic in nature and that electromagnetic radiation of
other wavelengths should be possible. He discovered four basic laws governing the
interaction of bodies which are magnetized or electrically charged or both. The bod-
ies may be static or moving. He established the four equations known as Maxwell's
equations. The acceptance of Maxwell's theory by the scientific community had to
wait for 15 years, until it was vindicated by the German physicist Heinrich Hertz
through experiments.

Objects like metals which permit easy passage of charges are called conductors.
On the other hand, objects in which charges are not free to pass are called
insulators or dielectrics.

The force between two point charges placed at a distance r apart is given by
Coulomb's law:

 F = k q₁q₂/r²,   (3.1)

where

 F = force between the two point charges q₁ and q₂;
 r = distance between q₁ and q₂;
 k = proportionality constant.

In the international system, k = 1/(4πε), where ε is the permittivity of the medium in
which the charges are situated; ε has the dimension of capacitance per unit length.
The permittivity of vacuum, ε₀, is

 ε₀ = (1/36π) nF m⁻¹ ≈ permittivity of air.

Force is a vector which can be written as

 F = r̂ q₁q₂/(4πε₀r²),

where r̂ is the unit vector pointing in the direction of the line joining the charges.
The electric field intensity E is defined as the force per unit charge:

 E = F/q₂ = r̂ q₁/(4πε₀r²),   (3.2)

where q₂ is the positive test charge.

The electric field intensity is a vector having the same direction as the force F
but differing in dimension and numerical magnitude. The (average) electric charge
density ρ is equal to the total charge q in a volume V divided by the volume. Thus,
ρ = q/V. The SI unit of charge density is coulomb per cubic metre (C m⁻³). ρ can
be defined as

 ρ = lim_{ΔV→0} Δq/ΔV,

and hence ρ can be treated as a point function. It is often called the volume charge
density to distinguish it from the surface charge density and the linear charge density.

The electric potential is a measure of energy for some kind of unit quantity. The
work or energy per unit charge required to transport a test charge from a point
x₁ to a point x₂ is called the difference in the electric potential of the points x₂ and
x₁. Potential is a scalar quantity, that is, it has magnitude but no direction. The
total electric potential at a point is the algebraic sum of the individual component
potentials at the point. The gradient of the potential at a point is defined as the
potential rise ΔV across an element of length Δl along a field line divided by Δl,
with the limit of the ratio taken as Δl approaches zero. Thus, if V is the electric
potential, then

 gradient of V = lim_{Δl→0} ΔV/Δl = dV/dl.

The gradient of V equals −E: as a potential rise occurs when moving against the electric
field, the direction of the gradient is opposite to that of the field. The usual notation
for the gradient of V is grad V or ∇V:

 gradient of V = grad V = ∇V = −E.

In rectangular coordinates,

 E = −∇V = −(x̂ ∂V/∂x + ŷ ∂V/∂y) = x̂Eₓ + ŷE_y,

where

 Eₓ = −∂V/∂x, E_y = −∂V/∂y.

The combination of two point charges q of opposite sign separated by a small distance
l is called an electric dipole, and the product ql is called the electric dipole
moment.
A dimension defines some physical characteristic like length, mass, time, veloc-
ity or force.

Figure 3.1. Electric field between two point charges of opposite sign, with flux
tubes joining them. The tubes follow the electric field lines; each tube has a
constant amount of flux.

If, in Figure 3.1, each tube represents a constant amount of charge or electric flux
ψ, then at any point there is a flux density D, proportional to E, given by ψ/A,
where A is the cross-sectional area of the tube. Thus, the electric flux for a tube is
given by

 ψ = DA, D = average flux density (C m⁻²).

It can be seen that

 D = ε₀E (see Kraus [1992]).   (3.3)

Gauss' law of the electric field. The total electric flux through any closed surface
equals the charge enclosed. The equivalent form of this law is

 ∮ₛ D·dS = ∫ᵥ ρ dV = q,   (3.4)

where ∮ₛ denotes the double or surface integral over the closed surface and ∫ᵥ denotes
the triple or volume integral throughout the region enclosed.
If we replace ρ by ∇·D, then we get

 ∮ₛ D·dS = ∫ᵥ ∇·D dV.

This result is known as the divergence theorem or the Gauss theorem. This
relation holds not only for D but for any vector function. In other words, the
divergence theorem states that the integral of the normal component of a vector
function over a closed surface S equals the integral of the divergence of that vector
throughout the volume V enclosed by the surface S.

A conductor can conduct or convey electric charge. In static situations, a conduc-
tor may be treated as a medium in which the electric field is always zero. A medium is
homogeneous if its physical characteristics, like mass density and molecular struc-
ture, do not vary from point to point. If the medium is not homogeneous, it is called
non-homogeneous, inhomogeneous or heterogeneous. A medium is called
linear with respect to an electrostatic field if the flux density D is proportional to
the electric field intensity E. This is the case in free space, where D = ε₀E (ε₀ is the per-
mittivity constant). In material media, the permittivity may not be constant;
such a material is called non-linear. A material is called isotropic if its properties are
independent of direction. Crystalline media or certain plasmas may have directional
characteristics, and such materials are called non-isotropic or anisotropic. Here,
we shall confine ourselves to isotropic, linear and homogeneous media.

The electric current J is a vector of R³ that measures a flux of electric charge. It
is also called the current density vector. The flux of electric charges across a surface
element dS in the sense of the unit normal n to dS is J·n dS. Ohm's law states
that the potential difference or voltage V between the ends of a conductor is equal
to the product of its resistance R and the current I. Ohm's law at a point is
given by the relation

 J = σE,   (3.5)

where

 J = current density,
 E = field intensity,
 σ = conductivity of the material.

Gauss' law for magnetic fields. A moving charge constitutes an electric current
and possesses a magnetic field. If a charge q moving with a velocity v experiences
a force F, there must be a magnetic field B = F/(qv), the force being perpendicular
to both the field and the direction of motion of the charge. The magnetic field is a
vector quantity; it is also regarded as a magnetic flux density and is given by
B = ψₘ/A, ψₘ = |B| A cos α, where

 ψₘ = magnetic flux through the area A,
 |B| = magnitude of the magnetic flux density B,
 α = angle between the normal to the area A and the direction of B.

Dimensionally,

 ψₘ = (force/current moment) × area,

or

 ψₘ = ∬ B·dS.

Gauss' law for magnetic fields states that the magnetic flux through a closed Gaus-
sian surface is zero; that is,

 ∮ₛ B·dS = 0.   (3.6)

This implies that the divergence of B equals zero; that is,

 ∇·B = 0.   (3.7)

Ampère's law. Let

 H = B/μ,   (3.8)

where B = magnetic flux density and μ = permeability of the medium (H and B are
vectors having the same direction). Ampère's law states that the line integral of
H around a single closed path is equal to the current enclosed; namely,

 ∮ H·dl = I,   (3.9)

where

 H = magnetic field, A m⁻¹,
 dl = infinitesimal element of path length, m,
 I = current enclosed, A.

Faraday's law. As we know, a current-carrying conductor produces a magnetic
field. In fact, the reverse effect is also possible, that is, a magnetic field can produce
a current in a closed circuit, but under the condition that the magnetic flux linking
the circuit must be changing. Consider the situation in Figure 3.2(a) of the closed
wire loop. A magnetic field with flux density B is normal to the plane of the loop.
The induced current flows in such a way as to oppose the change in flux: it
produces its own magnetic field, and the direction of flow of the current is such that
this secondary magnetic field acts so as to oppose the change in the original magnetic
field (see Figures 3.2(b) and 3.2(c)).

Figure 3.2(a). Relation between decreasing flux density B and induced current I
in a loop.

According to Faraday's law, "the total electromotive force (emf) induced in
a closed circuit is equal to the time rate of decrease of the total magnetic flux linking
the circuit". In mathematical language, this means that

 V = −dψₘ/dt,

where

 V = total emf,
 ψₘ = total flux,
 t = time,

or

 V = −∫ₛ (∂B/∂t)·dS,   (3.10)

Figure 3.2(b) and 3.2(c). Induced currents for decreasing and increasing flux
density B.

where ∫ₛ indicates a double or surface integral (∬) over a surface S, and B is the
flux density. For more details, we refer to Kraus [1992].

The notions mentioned in this subsection and the subsequent subsections have
wide applications in fields like the technology for producing electricity, namely alter-
nators, transformers and circuit breakers; the usage of strong currents, namely electric
circuits and their various components, resistances, inductances and capacitances; and the control
of high frequencies in generators, lines, antennae, receivers and electron beams, with
their applications in radio, TV, radar, automatic control devices and telecommunica-
tion equipment. Some of the other areas of application are the optical cavities of lasers,
quantum electronics, instrumentation, and linear and non-linear optics, including fibre
optics and laser circuits, besides electro-chemistry.

3.1.2 Maxwell's equations and their consequences

Maxwell's macroscopic equations (Maxwell's equations in continuous
media). Let E(·,·) and B(·,·) be two functions defined on the whole space Rₓ³ ×
R_t with vector values in R³, which are called the electric field and the magnetic
induction. Assume that E(·,·) and B(·,·) are related to D(·,·) and H(·,·) through
the relations D = E + P and H = B − M, where P and M are, respectively, the
polarization and the magnetization fields of the medium (see Remark 3.8). D is
called the electric induction, or the electric displacement, or the flux density, while
H is called the magnetic field. Furthermore, let ρ(·,·) and J(·,·) be defined on
Rₓ³ × R_t with ρ(x, t) ∈ R and J(x, t) ∈ R³. The following set of equations is called
Maxwell's macroscopic equations:

 −∂D/∂t + curl H = J   (3.11)(i)
 div D = ρ   (3.11)(ii)
 ∂B/∂t + curl E = 0   (3.11)(iii)
 div B = 0,   (3.11)(iv)

where E = (E₁, E₂, E₃), x = (x₁, x₂, x₃),

 div E = Σ_{i=1}³ ∂Eᵢ/∂xᵢ,
 curl E = (∂E₃/∂x₂ − ∂E₂/∂x₃, ∂E₁/∂x₃ − ∂E₃/∂x₁, ∂E₂/∂x₁ − ∂E₁/∂x₂).

E, B, ρ and J are called, respectively, the electric field, the magnetic induction, the charge
density and the current density.

Remark 3.1. (i)(a) Since (3.11)(i) is derived from Ampère's law, it is often
called the Maxwell–Ampère law.
(i)(b) (3.11)(ii) is a consequence of Gauss' electric law, and so it is known as
the Maxwell–Gauss electric law.
(i)(c) (3.11)(iii) is a consequence of Faraday's law, and therefore it is called the
Maxwell–Faraday law.
(i)(d) Similarly, (3.11)(iv) is nothing but Gauss' magnetic law.
(ii)(a) Relation (3.9) can also be written as

 ∮ H·dl = ∬_A J·dS,   (3.12)

as the current I enclosed by the path is given by the integral of the normal component
of J over the surface A. Taking the displacement current into account as well,

 ∮ H·dl = ∬_A (J + ∂D/∂t)·dS.   (3.13)

Applying Stokes' theorem to (3.13), we get (3.11)(i).

(ii)(b) By Gauss' law,

 ∮ D·dS = ∫ ρ dV.   (3.14)

Applying (3.14) to an infinitesimal volume, we get (3.11)(ii).

(ii)(c) Replacing V in (3.10) by the line integral of E around the circuit, we get

 ∮ E·dl = −∫ₛ (∂B/∂t)·dS.   (3.15)

By applying Stokes' theorem to (3.15), we get (3.11)(iii).

(ii)(d) In view of (3.7), we get (3.11)(iv), which is also known as Maxwell's
magnetic field equation.
(ii)(e) For further details of the derivation of Maxwell's equations, the continuity
relation for electricity and the equations of the conservation laws, we refer to Dautray
and Lions [1990], Kraus [1992], Itzykson and Zuber [1980] and Landau and Lifshitz
[1959].

Remark 3.2. (i) We have employed here the system of units called natural, in which
the speed of light in vacuo, c = 1, and the permeability of the vacuum, μ₀ = 1.
(ii) If, in equations (3.11), P = 0 and M = 0, then these take the form

 −∂E/∂t + curl B = j   (3.16)(i)
 div E = ρ   (3.16)(ii)
 ∂B/∂t + curl E = 0   (3.16)(iii)
 div B = 0.   (3.16)(iv)

The system of the four equations of (3.16) is called the system of Maxwell's mi-
croscopic equations, or Maxwell's equations in vacuo.

Remark 3.3. Maxwell's equations in perfect media. For perfect media, D =
εE and H = (1/μ)B, with ε and μ positive constants depending on the charac-
teristics of the medium considered, known as the permittivity or dielectric constant
and the permeability or magnetic permeability constant, respectively. Putting
these values of D and H in (3.11), we have the following system of equations:

 −εμ ∂E/∂t + curl B = μJ   (3.17)(i)
 div E = ρ/ε   (3.17)(ii)
 ∂B/∂t + curl E = 0   (3.17)(iii)
 div B = 0.   (3.17)(iv)

Remark 3.4. Since D = εE and E = −∇V, we have D = −ε∇V, and putting this
value in (3.11)(ii), we get

 ∇²V = −ρ/ε,   (3.18)

which is Poisson's equation.
If ρ = 0, then (3.18) reduces to Laplace's equation:

 ∇²V = 0.   (3.19)

In rectangular coordinates,

 ∇²V = ∂²V/∂x² + ∂²V/∂y² + ∂²V/∂z² = 0.

The static potential distribution for any conductor configuration can be determined
by solving Laplace's equation under appropriate boundary conditions.

Remark 3.5. In static electromagnetism, we find solutions of Maxwell's equations
which are independent of time. In this case, (3.11) reduces to

 curl H = J   (3.20)(i)
 div D = ρ   (3.20)(ii)
 curl E = 0   (3.20)(iii)
 div B = 0.   (3.20)(iv)

In the electrostatic domain, J = B = H = 0, and we have

 div D = ρ   (3.21)(i)
 curl E = 0.   (3.21)(ii)

For magnetostatics, ρ = 0, E = D = 0, and we get

 curl H = J,
 div B = 0.

Remark 3.6. Derivation of the wave equation. Let us assume that ρ = 0, j = 0.
By differentiating (3.16)(i) with respect to the time and by keeping in view the
formula

 curl curl E = −ΔE + grad div E,

we find that E satisfies the wave equation

 ∂²E/∂t² = ΔE.

In a similar manner, by differentiating (3.16)(iii), we find that B satisfies the wave
equation; that is,

 ∂²B/∂t² = ΔB.
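
The vector identity invoked here can be checked symbolically; the following Python sketch (an illustration, not part of the text) verifies curl curl E = −ΔE + grad div E for a generic smooth field in Cartesian coordinates using sympy.

    import sympy as sp
    from sympy.vector import CoordSys3D, curl, divergence, gradient

    N = CoordSys3D('N')
    x, y, z = N.x, N.y, N.z
    f, g, h = (sp.Function(name)(x, y, z) for name in ('f', 'g', 'h'))
    E = f*N.i + g*N.j + h*N.k          # generic smooth vector field

    def vector_laplacian(F):
        # componentwise Laplacian of a vector field (Cartesian coordinates)
        comps = [F.dot(e) for e in (N.i, N.j, N.k)]
        laps = [sum(sp.diff(c, v, 2) for v in (x, y, z)) for c in comps]
        return laps[0]*N.i + laps[1]*N.j + laps[2]*N.k

    lhs = curl(curl(E))
    rhs = -vector_laplacian(E) + gradient(divergence(E))
    print(sp.simplify((lhs - rhs).to_matrix(N)))   # prints the zero vector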
Remark 3.7. Relation between charge density and current density. Differentiating
(3.16)(ii) with respect to t and taking the divergence on both sides of (3.16)(i),
we get the equation

 ∂ρ/∂t + div j = 0,   (3.22)

which is called the equation of the conservation of electricity, or the conti-
nuity relation for electricity.

Remark 3.8. The vector fields P and M are called, respectively, the polarization
and the magnetization vector.

Remark 3.9. Maxwell's equations and the Lorentz condition. In equations (3.16),
we consider the functions

 (x, t) → A(x, t) ∈ R³ and (x, t) → φ(x, t) ∈ R,

which are called the potential vector and the scalar potential, respectively.
A(·,·) and φ(·,·) are related to E and B by the relations

 B = curl A   (3.23)(i)
 E = −grad φ − ∂A/∂t.   (3.23)(ii)

On substituting these values of B and E in equations (3.16)(i) and (3.16)(ii), we obtain
the inhomogeneous linear systems

 ∂²A/∂t² − ΔA + grad(div A + ∂φ/∂t) = j   (3.24)(i)
 −Δφ − ∂(div A)/∂t = ρ.   (3.24)(ii)

It may be noted that A and φ are not defined in a unique manner by (3.23) starting
from E and B. If A and φ satisfy (3.23), then, for each arbitrary function F of x
and t, A′ and φ′ defined by

 A′ = A + grad F   (3.25)(i)
 φ′ = φ − ∂F/∂t   (3.25)(ii)

also satisfy (3.23).
The transformation (A, φ) → (A′, φ′) given by (3.25) is called a gauge trans-
formation.
By (3.25), we have

 div A′ + ∂φ′/∂t = div A + ∂φ/∂t + ΔF − ∂²F/∂t².   (3.26)

Taking for F a solution of the equation

 ΔF − ∂²F/∂t² = −(div A + ∂φ/∂t),   (3.27)

and assuming A and φ known, we find that a pair (A_L, φ_L) can be chosen such that

 div A_L + ∂φ_L/∂t = 0.   (3.28)

This relation is called the Lorentz condition.

Keeping in view the Lorentz condition, (3.24) can be written as

 ∂²A_L/∂t² − ΔA_L = j   (3.29)(i)
 ∂²φ_L/∂t² − Δφ_L = ρ.   (3.29)(ii)

It may be observed that the pair (A_L, φ_L) is not unique when j and ρ are known.

Remark 3.10. Maxwell's equations and the wave equation. If φ and A are
chosen such that ∂φ/∂t + div A = 0, then, keeping in view (3.29) and (3.24), Maxwell's
equations in vacuum (equations (3.16)) can be written as the wave equation in the
absence of charge and current (ρ = 0, j = 0). That is, we have

 ∂²A/∂t² = ΔA, x ∈ R³, t ≥ 0;
 A(x, 0) = A⁰(x), x ∈ R³;   (3.30)
 ∂A/∂t (x, 0) = A¹(x), x ∈ R³,

and

 ∂²φ/∂t² = Δφ, x ∈ R³;
 φ(x, 0) = φ⁰(x), x ∈ R³;   (3.31)
 ∂φ/∂t (x, 0) = φ¹(x), x ∈ R³,

where A and φ are, respectively, the vector potential and the scalar potential.
Each component of the potential vector is the solution of the wave problem;
namely,

 ∂²u/∂t² = Δu, x ∈ Rⁿ, t > 0;
 u(x, 0) = u⁰(x);   (3.32)
 ∂u/∂t (x, 0) = u¹(x),

where u⁰ and u¹ are given functions or distributions, with a velocity of propagation
which is the speed of light in the vacuum, taken equal to 1 in the natural system of
units.

Let u = u(x, t) ∈ R be a solution of the wave equation.
We put u¹ = u, u² = ∂u/∂t; then

 F_M = (E¹, B¹) = (curl curl(xu¹), curl(xu²)),
 F_E = (E², B²) = (−curl(xu²), curl curl(xu¹)),

can be shown to be solutions of Maxwell's equations (3.16), where ρ = 0, j = 0.
This can be obtained by using the identity curl curl curl(xφ) = −curl(xΔφ).
F_M is called the transverse magnetic wave, while F_E is called the transverse
electric wave. U = (u¹, u²) is called the Debye potential. For further details, we
refer to Dautray and Lions [1990] and Schulenberger [1978].

Remark 3.11. The Coulomb condition. In place of (3.28), we can impose the
condition div A = 0; then (3.24)(ii) implies that Δφ = −ρ, that is, φ satisfies the
Poisson equation in Rₓ³, where φ and ρ possibly depend on the time. In general, we
can choose for the solution φ

 φ(x, t) = (1/4π) ∫_{R³} ρ(x′, t)/|x − x′| dx′, x ∈ R³, t ∈ R.   (3.33)

The right-hand side expression in (3.33) has a meaning by convolution if ρ is a
distribution with compact support. Normally, this φ is called a Newtonian potential,
but in electromagnetism this φ is called the Coulomb potential and the condition
div A = 0 is called the Coulomb condition.

3.1.3 Variational formulation of Maxwell's equations


Stable media. The results presented in this section are based on Duvaut–Lions
[1972, Chapter VII] and Dautray and Lions [1990, Vol. 3, Chapter IX, pp. 239–264].
Let Ω be a domain in R³ with a regular bounded boundary Γ. The open set Ω may
or may not be bounded. We want to find vector fields B, D, J that satisfy

 ∂D/∂t + J − curl(μ̃B) = G₁ in Ω   (3.34)(i)
 ∂B/∂t + curl(ε̃D) = G₂ in Ω,   (3.34)(ii)

where G₁ and G₂ satisfy the conditions

 div G₂ = 0, G₂·n = 0, μ̃ = 1/μ, ε̃ = 1/ε

(μ and ε are as in equation (3.17)). ε̃ and μ̃ are strictly positive and remain bounded;
they may depend on x and, in particular, may be piecewise constant.
Let

 B(x, 0) = B₀(x), D(x, 0) = D₀(x).

Furthermore, div D = ρ, div B = 0. Thus

 B(0) = B₀, D(0) = D₀ on Ω,   (3.35)(i)

 J = σε̃D on Ω (stable media),   (3.35)(ii)

and

 n ∧ D = 0 on Γ.   (3.35)(iii)

We define the space H(curl; Ω) as follows:

 H(curl; Ω) = {v ∈ (L²(Ω))³ : curl v ∈ (L²(Ω))³}.

It is a Hilbert space with respect to the inner product which induces the norm

 ‖v‖_{H(curl;Ω)} = (‖v‖²_{(L²(Ω))³} + ‖curl v‖²_{(L²(Ω))³})^{1/2}.   (3.36)

Let

 H₀(curl; Ω) = {v ∈ H(curl; Ω) : n ∧ v = 0 on Γ},

where n is the normal to Γ directed towards the exterior of Ω.
(𝒟(Ω))³ is dense in H₀(curl; Ω).
Let

 ℋ = (L²(Ω))⁶ = (L²(Ω))³ × (L²(Ω))³.   (3.37)

An inner product (·,·)_ℋ is defined on ℋ in the following manner: for Φ = {φ, ψ}
and Φ̃ = {φ̃, ψ̃},

 (Φ, Φ̃)_ℋ = ∫_Ω (ε̃ φ·φ̃ + μ̃ ψ·ψ̃) dx.   (3.38)

An operator A is defined on ℋ as follows:

 D(A) = {Φ = {φ, ψ} ∈ ℋ : curl(ε̃φ) ∈ (L²(Ω))³,
   curl(μ̃ψ) ∈ (L²(Ω))³, n ∧ φ = 0 on Γ},   (3.39)
 AΦ = {−curl(μ̃ψ), curl(ε̃φ)} ∈ ℋ.

It has been shown [Duvaut–Lions, 1972] that D(A) is dense in ℋ, and A is closed with

 A* = −A, D(A*) = D(A).

The following problem is the variational formulation, or weak formulation, of
Maxwell's equations in stable media (equations (3.34)).
Let

 MΦ = {σε̃φ, 0}, Φ = {φ, ψ} ∈ ℋ,   (3.40)(i)

which defines

 M ∈ ℒ(ℋ, ℋ).   (3.40)(ii)

Find U = {D, B} such that

 U ∈ L∞(0, T; ℋ)   (3.40)(iii)

and

 ∫₀ᵀ [−(U, ∂Φ/∂t)_ℋ − (U, AΦ)_ℋ + (MU, Φ)_ℋ] dt
  = ∫₀ᵀ (G, Φ)_ℋ dt + (U₀, Φ(0))_ℋ for all Φ with   (3.40)(iv)

 Φ ∈ L²(0, T; D(A)), ∂Φ/∂t ∈ L²(0, T; ℋ), Φ(T) = 0,   (3.40)(v)

where

 G = {G₁, G₂} ∈ L²(0, T; ℋ), U₀ = {D₀, B₀} ∈ ℋ.   (3.40)(vi)

For a proof, see Lemma 4.5 of Duvaut–Lions [1972, p. 347].

Theorem 4.1 [Duvaut–Lions, 1972, p. 347]. The system of equations (3.40)(i)–(vi)
has a unique solution.

3.1.4 Variational formulation (weak formulation) of magnetostatics of a surface current
In this section, we look for solutions B of the following equations:

 div B = 0 in R³   (3.41)(i)
 curl B = 0 in R³ \ Ω̄ = Ω′   (3.41)(ii)

such that

 W = (1/2μ) ∫_Ω |B|² dx + (1/2μ′) ∫_{Ω′} |B|² dx < ∞,
 B ∈ (L²(R³))³, Ω ⊂ R³.   (3.42)

We consider the case where the surface current J_Γ on the boundary Γ of Ω is given;
that is, we find B ∈ (L²(R³))³ satisfying

 div B = 0 in R³   (3.43)(i)
 (P₁) curl B = 0 in Ω and Ω′   (3.43)(ii)
 [(B/μ) ∧ n]_Γ = J_Γ, with J_Γ given and div J_Γ = 0,   (3.43)(iii)

where [(B/μ) ∧ n]_Γ denotes the jump across Γ in the quantity (B/μ) ∧ n, with n the normal to Γ
oriented towards the exterior of Ω; that is, denoting by B_{Ω′} and B_Ω the
restrictions of B to Ω′ and Ω,

 [(B/μ) ∧ n]_Γ = (B_{Ω′}/μ′|_Γ − B_Ω/μ|_Γ) ∧ n.

Let

 V = H(div 0; R³) = {B̃ ∈ (L²(R³))³ : div B̃ = 0} and
 W = {Ã ∈ (H¹(R³))³ : div Ã = 0}.

By the Poincaré lemma, for each B̃ ∈ V there exists a unique Ã ∈ W such that
B̃ = curl Ã. By applying the techniques of the Fourier transform, it can be verified
that the mapping Ã ∈ W → B̃ = curl Ã ∈ V is an isometry of the Beppo Levi space
onto the space V.
Let

 a(B̃, B̂) = (1/2μ) ∫_Ω B̃·B̂ dx + (1/2μ′) ∫_{Ω′} B̃·B̂ dx for all B̃, B̂ ∈ V,   (3.44)
 a₀(Ã, Â) = a(curl Ã, curl Â) for all Ã, Â ∈ W,

where a(·,·) and a₀(·,·) are continuous and coercive on V and W, respectively. In
fact, for μ_m = inf{μ, μ′} and μ_M = sup{μ, μ′}, we have

 (1/2μ_M) ∫_{R³} |B̃|² dx ≤ a(B̃, B̃) ≤ (1/2μ_m) ∫_{R³} |B̃|² dx for all B̃ ∈ V.   (3.45)

It can be observed that each element Ã ∈ W admits a trace Ã|_Γ ∈ (H^{1/2}(Γ))³.

Theorem 3.1. Let the given surface current J_Γ be such that J_Γ ∈ (H^{−1/2}(Γ))³,
J_Γ·n = 0 almost everywhere on Γ, and div J_Γ = 0. Then the problem (P₁) is
equivalent to the variational problem:
find B ∈ V (respectively A ∈ W, B = curl A) such that

 a(B, B̃) = a₀(A, Ã) = ½ ∫_Γ J_Γ·Ã|_Γ dΓ for all B̃ = curl Ã ∈ V,   (3.46)

and this variational problem has a unique solution.

For the proof, we refer to Dautray and Lions [1990, p. 241].

Remark 3.12. Finding the solution of the variational problem (3.46) is equivalent
to solving the optimization problem: find B ∈ V (or A ∈ W, B = curl A) such that

 F(B) = inf_{B̃∈V} F(B̃),   (3.47)

where F is a quadratic functional defined on V as

 F(B̃) = a(B̃, B̃) − ∫_Γ J_Γ·Ã|_Γ dΓ for all B̃ = curl Ã ∈ V.   (3.48)

Variational formulations of various other problems of electromagnetism can be found
in Dautray–Lions [1990, Vol. 3, pp. 243–263].
Bossavit [1993] has presented applications of Maxwell's equations in areas like
electrical motors, microwave ovens and resonant cavities. He has published a series of
papers on various aspects of Maxwell's equations with boundary conditions; see, for
example, Bossavit [1995, 1993, 1991]. Hoppe and Wohlmuth [1981] have studied the
interior boundary value problem for Maxwell's equations in the time-harmonic case.
They have obtained qualitative results by the a priori estimates established therein, and have
also established estimates for the global discretization error in various norms of the
underlying spaces of approximating vector fields; non-conforming finite element
methods have been applied. Reissel [1995] has also studied variational formula-
tions and numerical solutions of such problems. The monographs of Colton and Kress
[1983], Jin [1993] and Wang [1995] provide a comprehensive account of the varia-
tional (weak) formulation and numerical solution of Maxwell's equations. These
results are of vital importance in areas like fibre communication systems. An au-
tomatic mesh generator for the finite element method, along with all procedures, is
incorporated in a software package available with Wang [1995]. A book by Krizek
and Neittaanmäki [1996] and papers by Alonso and Valli [1997, 1999] present some
current aspects of Maxwell's equations.

3.2 Finite element method


3.2.1 Introduction to numerical methods
As things stand today, it is a formidable task to find exact solutions of
most real-life models. During the last half of the century, attempts have
been made to develop techniques and methods to find approximate solutions of
these models and to examine whether the approximate solutions converge to the exact
solutions or not. Error estimates between exact and approximate solutions
have been studied extensively in the recent past. New endeavours are being made to
find methods and techniques which require minimum time for evaluating the
models with maximum accuracy. These methods, techniques and tools constitute
the subject of 'scientific computation'. Very often, the models are in the form
of ordinary and partial differential equations with boundary or initial conditions.
By an exact solution we mean a solution of these boundary and initial-boundary value
problems, while the corresponding approximate solution is a solution of an algebraic
equation obtained from the given problem by discretizing the analytical model. The
following methods for finding approximate solutions of models are well known to
mathematicians, engineers and other users of mathematics:

(i) The Rayleigh-Ritz method,

(ii) Galerkin's method,

(iii) The weighted residuals method,

(iv) The collocation method,

(v) The least square method,

(vi) The finite difference methods,

(vii) Multigrid methods,

(viii) Finite volume methods,

(ix) Particle methods,

(x) The finite element method,

(xi) The boundary element method, and

(xii) The wavelet method.

In this subsection, we briefly introduce some of these methods. A detailed account of the finite element method is presented in the following subsections, while a comprehensive account of the boundary element method is given in Section 3.3. The wavelet method is introduced in Chapter 5 along with the updated references. The particle methods are introduced in Chapter 4. The finite difference, the finite element and the boundary element methods have been the main competitors of one another. Lucid accounts of their advantages and disadvantages can be found in Dautry and Lions [1990, vol. 4, pp. 168-170 and 369-370], Hammond [1986], Brebbia et al. [1985], and Reddy [1985]. However, this area is rapidly growing, and in recent years methods like the multigrid finite element method and the wavelet-based adaptive finite element method have attracted more attention due to their superior performance in many areas. See, for example, Hackbusch [1985, 1994, 1995], Bramble [1993], Brenner and Scott [1994], Hoppe and Wohlmuth [1997], Canuto and Cravero [1997] and references therein. In the case of complicated geometric regions, the finite element methods and/or their combination with boundary element and multigrid methods have an edge over the other methods in general. Some sort of weak formulation (variational formulation) of the given models is required in all methods. The theory of distributions developed by the French mathematician Laurent Schwartz during 1945-1950 and the Sobolev spaces studied by the Russian physicist S.L. Sobolev in the years 1937-38 have wide applications in the weak formulation of initial and boundary value problems. The finite element methods can be treated as approximation methods in Sobolev spaces. However, we shall not follow this approach here and we refer to Ciarlet [1978], and Ciarlet and Lions [1991] for this type of study. This approach is also

introduced in Siddiqi [1986] along with updated literature. Mathematical models can be written in the form

Tu = f on Ω, Su = g on Γ, (3.49)

where Ω ⊂ R^n (in particular n = 1, 2, 3), Γ is the boundary of Ω, T and S are differential operators, and f and g are elements of an appropriate function space, especially a Sobolev space X. For example,
(i) T = ∇², S = 1, g = 0;

(ii) T = ∇², S = ∂/∂n, g = 0;

(iii) T = d²/dx², f = x², S = 1, g = 0.

In most of the methods, we are required to write (3.49) in the variational form (weak form): Find u ∈ X such that

a(u, v) = F(v), (3.50)

where a(·,·) is a bilinear continuous form on the space X and F is a bounded linear functional on X into R. We know that if a(·,·) is also symmetric, then finding the solution u of (3.50) is equivalent to finding u ∈ X such that

J(u) ≤ J(v) for all v ∈ X, where J(v) = (1/2) a(v, v) − F(v). (3.51)

It may be recalled that by an exact or classical solution of a differential equation,


we mean a function that identically satisfies the equation and the specified boundary
or initial conditions. A variational (weak) solution of a differential equation with or without boundary conditions is the solution of an associated variational problem.
The exact solution is sufficiently differentiable as required by the equation while the
variational solution is not sufficiently differentiable to satisfy the differential equation
but differentiable enough to satisfy a variational equation (problem) equivalent to
the differential equation.

(i) Rayleigh-Ritz method. The Rayleigh-Ritz method deals with the approximate solution of (3.50) in the form of a finite series

u_m = Σ_{j=1}^{m} c_j φ_j + φ_0, (3.52)

where the coefficients c_j, called the Rayleigh-Ritz coefficients, are chosen such that equation (3.50) holds for v = φ_i, i = 1, 2, ..., m; that is,

a(φ_i, Σ_{j=1}^{m} c_j φ_j + φ_0) = F(φ_i), i = 1, 2, ..., m. (3.53)

Since a(·,·) is bilinear, (3.53) takes the form

Σ_{j=1}^{m} a(φ_i, φ_j) c_j = F(φ_i) − a(φ_i, φ_0), (3.54)

or

Ac = b, (3.55)

where A = (a(φ_i, φ_j))_{i,j} is an m × m matrix, c = (c_1, c_2, ..., c_m)^T and b_i = F(φ_i) − a(φ_i, φ_0), which represents a system of m linear algebraic equations in m unknowns c_i. The columns (and rows) of the coefficient matrix A must be linearly independent in order that the coefficient matrix in (3.55) can be inverted. Thus, for symmetric bilinear forms, the Rayleigh-Ritz method can be viewed as one that seeks a solution of the form in equation (3.52) in which the parameters are determined by minimizing the quadratic functional (energy functional) given in (3.51). After substituting u_m of equation (3.52) for u into (3.51) and integrating, the functional J(u) becomes an ordinary function of the parameters c_1, c_2, ..., c_m. The necessary condition for the minimum of J(c_1, c_2, ..., c_m) is that

∂J/∂c_1 = ∂J/∂c_2 = ... = ∂J/∂c_m = 0. (3.56)

This gives m linear algebraic equations in m unknowns c_j, j = 1, 2, ..., m. It may be observed that (3.54) and (3.56) are the same in the symmetric case while they differ in the non-symmetric case. In other words, in the symmetric case we get the same c_i's by solving (3.54) and (3.56) separately. In the non-symmetric case, we determine the m unknowns by solving the linear algebraic equations (matrix equations) (3.55). The choice of {φ_j}, j = 1, 2, ..., m, is crucial, and these functions should be taken from a basis of the Hilbert space.
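To make the procedure concrete, the following minimal Python sketch applies the Rayleigh-Ritz method to the model problem −u″ = f on (0, 1) with u(0) = u(1) = 0 and a(u, v) = ∫_0^1 u′v′ dx. The choice f(x) = π² sin(πx) (exact solution u = sin(πx)), the basis φ_j(x) = sin(jπx) with φ_0 = 0, and the trapezoidal quadrature are illustrative assumptions, not prescriptions of the text.

import numpy as np

m = 5
x = np.linspace(0.0, 1.0, 2001)           # quadrature grid on (0, 1)
dx = x[1] - x[0]
trap = lambda y: float(np.sum(0.5 * (y[1:] + y[:-1]) * dx))   # trapezoidal rule
f = np.pi**2 * np.sin(np.pi * x)          # right-hand side; exact u = sin(pi x)

A = np.zeros((m, m))
b = np.zeros(m)
for i in range(1, m + 1):
    dphi_i = i * np.pi * np.cos(i * np.pi * x)          # phi_i'(x)
    for j in range(1, m + 1):
        dphi_j = j * np.pi * np.cos(j * np.pi * x)      # phi_j'(x)
        A[i - 1, j - 1] = trap(dphi_i * dphi_j)         # a(phi_i, phi_j)
    b[i - 1] = trap(f * np.sin(i * np.pi * x))          # F(phi_i)

c = np.linalg.solve(A, b)                 # the Rayleigh-Ritz coefficients c_j
u_m = sum(c[j] * np.sin((j + 1) * np.pi * x) for j in range(m))
print(np.max(np.abs(u_m - np.sin(np.pi * x))))          # error of u_m, near quadrature level

For this symmetric problem the same coefficients are obtained whether one solves the linear system (3.54) or minimizes the energy functional (3.51), as noted above.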

(ii) The Galerkin method. Let R = T(u) − b ≠ 0, where b = T(u_0), in Ω, and require that

(R, w) = ∫_Ω R w dΩ = 0, (3.57)

where w = Σ_{i=1}^{m} β_i φ_i.

For linear T, (3.57) gives a linear system of algebraic equations from which the β_i's can be determined. In the linear case, the Rayleigh-Ritz and Galerkin methods are identical. It is a common practice to choose w as a variation of u; that is,

w = δu = Σ_{i=1}^{m} δα_i φ_i, (3.58)

where δα_i = β_i for all i. Thus the Galerkin method seeks an approximate solution to (3.50) in the form w = Σ_{i=1}^{m} β_i φ_i, and determines the coefficients β_i from the condition that the residual R is orthogonal to w.

(iii) The weighted residual method. T in (3.49) can be chosen as any one of the following operators:

Tu = −(d/dx)(a du/dx) (3.59)(i)

Tw = (d²/dx²)(b d²w/dx²) (3.59)(ii)

Tu = −(d/dx)(u du/dx) (3.59)(iii)

Tu = −[∂/∂x(k_x ∂u/∂x) + ∂/∂y(k_y ∂u/∂y)] (3.59)(iv)

T(u, v) = u ∂u/∂x + v ∂u/∂y + ∂²u/∂x² + ∂/∂y(∂u/∂y + ∂v/∂x). (3.59)(v)
In this method also, the solution u of (3.49) is approximated by an expression of the form

u_m = φ_0 + Σ_{j=1}^{m} c_j φ_j, (3.60)

where φ_0 must satisfy all boundary conditions (say φ_0 = 0 if all the specified boundary conditions are homogeneous), and the φ_j must satisfy all the conditions mentioned in the Rayleigh-Ritz method as well as continuity. However, continuity can be relaxed if a weak formulation is possible for the given problem.

E = T(u_m) − f ≠ 0, (3.61)

in general, is called the residual or error in the equation. Once φ_0 and the φ_j are selected, E is simply a function of the independent variables and the parameters c_j. In the weighted residual method, the parameters are determined by setting the integral of a weighted residual of the approximation to zero, that is, by setting the condition

∫_Ω ψ_i E(x, y; c_j) dx dy = 0, i = 1, 2, ..., m, (3.62)


where the ψ_i are weight functions which are linearly independent. It may be remarked that if ψ_i = φ_i for all i, then we get the Galerkin method as a special case of the weighted residual method. The case ψ_i ≠ φ_i is sometimes referred to as the Petrov-Galerkin method.

For linear T, (3.62) reduces to

Σ_{j=1}^{m} (∫_Ω ψ_i T(φ_j) dx dy) c_j = ∫_Ω ψ_i [f − T(φ_0)] dx dy,

or

Σ_{j=1}^{m} T_ij c_j = f_i, (3.63)

where

T_ij = ∫_Ω ψ_i T(φ_j) dx dy. (3.64)

It is clear that the matrix [T] = (T_ij) is not symmetric, as T_ij ≠ T_ji. The details of this method can be found in Finlayson [1972].

(iv) The collocation method. In this method we look for an approximate solution u_m of (3.49) of the form (3.60) by requiring the residual in the equation to be identically zero at m selected points x^i = (x_i, y_i), i = 1, 2, 3, ..., m, in the domain Ω; that is,

E(x^i; c_1, c_2, ..., c_m) = 0, i = 1, 2, ..., m. (3.65)

The selection of the points x^i is vital for obtaining an accurate solution. The collocation method is a special case of the weighted residual method with ψ_i = δ(x − x^i), where δ(x) is the Dirac delta function characterized by the equation

∫_Ω f(x) δ(x − ξ) dx = f(ξ). (3.66)

For more details, we refer to Douglas and Dupont [1973], Prenter and Russel [1976] and Wheeler [1978].
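A minimal sketch of the collocation method for the illustrative problem −u″ = f on (0, 1), u(0) = u(1) = 0, with the assumed basis φ_j(x) = sin(jπx) (which already satisfies the boundary conditions) and equally spaced interior collocation points, may look as follows.

import numpy as np

m = 5
xi = np.arange(1, m + 1) / (m + 1.0)      # equally spaced interior collocation points
f = lambda t: np.pi**2 * np.sin(np.pi * t)

# E(x) = -u_m''(x) - f(x) and -phi_j''(x) = (j pi)^2 sin(j pi x), so the
# conditions E(x^i) = 0 give the linear system T c = f(x^i).
T = np.array([[(j * np.pi)**2 * np.sin(j * np.pi * p) for j in range(1, m + 1)]
              for p in xi])
c = np.linalg.solve(T, f(xi))
print(c)                                  # c ~ (1, 0, 0, 0, 0): u_m ~ sin(pi x)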

(v) The least squares method. This is a special case of the weighted residual method with ψ_i = ∂E/∂c_i, when we are looking for a solution as in (3.62). We determine the parameters c_i from the condition

(∂/∂c_i) ∫_Ω E²(x, y; c_j) dx dy = 0, (3.67)

which is the necessary condition for the integral of the square of the residual (3.61) to be a minimum. On differentiation, (3.67) takes the form

∫_Ω (∂E/∂c_i) E dx dy = 0. (3.68)

For linear T, (3.68) takes the form

Σ_{j=1}^{m} (∫_Ω T(φ_i) T(φ_j) dx dy) c_j = ∫_Ω T(φ_i) [f − T(φ_0)] dx dy. (3.69)

We refer to Bramble-Nitsche [1973], Prenter and Russel [1976], and Locker and Prenter [1978] for more details.
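For linear T the least squares system (3.69) can be assembled directly. The sketch below does this for the same illustrative problem and basis as above; note that the resulting matrix is symmetric by construction, in contrast with the general weighted residual matrix in (3.64).

import numpy as np

m = 5
x = np.linspace(0.0, 1.0, 2001)
dx = x[1] - x[0]
trap = lambda y: float(np.sum(0.5 * (y[1:] + y[:-1]) * dx))   # trapezoidal rule
f = np.pi**2 * np.sin(np.pi * x)
Tphi = [(j * np.pi)**2 * np.sin(j * np.pi * x) for j in range(1, m + 1)]   # T(phi_j)

A = np.array([[trap(Ti * Tj) for Tj in Tphi] for Ti in Tphi])   # left side of (3.69)
b = np.array([trap(Ti * f) for Ti in Tphi])                     # right side (phi_0 = 0)
print(np.linalg.solve(A, b))              # again c ~ (1, 0, 0, 0, 0)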

(vi) Finite difference method. In the finite difference method, we replace the differential operators in (3.49) by difference quotients. Suppose Tu = −Δu and Su = u|_Γ, g = 0, Ω = {(x, y) | 0 < x < 1, 0 < y < 1} in (3.49); that is, we want to solve the Dirichlet problem

−Δu = f in Ω, u|_Γ = 0. (3.70)

We explain the general procedure through the following example. Let N be an integer ≥ 1 and h = 1/(N + 1). A mesh on the square Ω̄ is the set of points (x_i = ih, y_j = jh), i, j = 0, 1, 2, ..., N + 1, and these points are called the nodes of the mesh. The finite difference method comprises obtaining an approximate solution, that is, an approximation of u satisfying (3.70) at the points (x_i, y_j), i, j = 0, 1, 2, ..., N + 1, and it is based on the Taylor formula; namely,

−Δu(x_i, y_j) = (1/h²) [4u_{i,j} − u_{i+1,j} − u_{i−1,j} − u_{i,j+1} − u_{i,j−1}]
+ (h²/12) [∂⁴u/∂x⁴ (x_i + θ_i h, y_j) + ∂⁴u/∂y⁴ (x_i, y_j + θ_j h)], (3.71)

with u_{ij} = u(x_i, y_j) and |θ_i| ≤ 1, i, j = 1, 2, 3, ..., N, where the 4-th order derivatives of u exist and are continuous.

The two main points of this method are:

(1) neglecting the rest of the expansion, in which the coefficient h²/12 is small; and

(2) requiring that equation (3.70) is satisfied at all the points (x_i, y_j), i, j = 1, 2, 3, ..., N, of the mesh, the quantity −Δu(x_i, y_j) being approximated, in conformity with (1), by the difference quotient

(1/h²) [4u_{i,j} − u_{i+1,j} − u_{i−1,j} − u_{i,j+1} − u_{i,j−1}].

Putting f_{ij} = f(x_i, y_j), we get in this way N² equations in the N² unknowns u_{ij}, i, j = 1, 2, ..., N; that is,

(1/h²) [4u_{ij} − u_{i+1,j} − u_{i−1,j} − u_{i,j+1} − u_{i,j−1}] = f_{ij}, i, j = 1, 2, ..., N. (3.72)

The boundary condition u|_Γ = 0 is taken into account in equation (3.72) by requiring that

u_{0,j} = u_{N+1,j} = u_{i,0} = u_{i,N+1} = 0, i, j = 1, 2, 3, ..., N. (3.73)

The system of equations (3.72) and (3.73) is then written in the form of a matrix equation as

AU = F. (3.74)

The solution of (3.74) is the approximate solution of (3.70). There exists a fairly good literature on this method published in the sixties; for example, one may see the references in Dautry and Lions [1990, vol. 4]. The problem of convergence of the approximate solution to the exact solution has also been investigated. Writing a weak formulation of (3.49) and applying the finite difference method to it is known as the variational approximation method for finite differences. This has been studied by Felippa [1973]. For further general results, we refer to Cea [1964] and Aubin [1972].
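A minimal Python sketch of the scheme (3.72)-(3.73) on the unit square follows. The right-hand side f = 2π² sin(πx) sin(πy) is an illustrative choice whose exact solution u = sin(πx) sin(πy) permits a direct check; a dense solver is used only for brevity, whereas in practice the sparsity of A is exploited.

import numpy as np

N = 30
h = 1.0 / (N + 1)
idx = lambda i, j: (i - 1) * N + (j - 1)   # row-wise numbering of interior nodes

A = np.zeros((N * N, N * N))
F = np.zeros(N * N)
for i in range(1, N + 1):
    for j in range(1, N + 1):
        k = idx(i, j)
        A[k, k] = 4.0 / h**2
        for p, q in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
            if 1 <= p <= N and 1 <= q <= N:   # boundary neighbours contribute 0 by (3.73)
                A[k, idx(p, q)] = -1.0 / h**2
        F[k] = 2 * np.pi**2 * np.sin(np.pi * i * h) * np.sin(np.pi * j * h)

U = np.linalg.solve(A, F)
i = j = N // 2
print(U[idx(i, j)], np.sin(np.pi * i * h) * np.sin(np.pi * j * h))   # compare at the centre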

(vii) Multi-grid method. We shall see in the next two subsections that applications of the finite and the boundary element methods reduce the continuous problems, namely, ordinary and partial differential equations, to matrix equations. In recent years, several robust and adaptive algorithms, known as the multi-grid methods, have been developed to solve these equations; we do not present them here due to the limitation of space. In full multi-grid methods for elliptic partial differential equations, one works on a sequence of meshes where a number of pre- and/or post-smoothing steps are performed on each level. Readers interested in acquiring a good knowledge of this new development may go through Hackbusch [1985, 1989, 1994], Bramble [1993], Brenner and Scott [1994] and references therein.

(viii) Finite volume method. In the finite volume method, the integral form of the equations representing the laws of fluid dynamics is discretized. The flow field domain is subdivided into a set of non-overlapping cells that cover the whole domain, on which the conservation laws are applied. In the finite volume method, the term cell is used for element. On each cell the conservation laws, which are the basic laws of fluid dynamics, are applied to determine the flow field variables at some discrete points of the cells, called nodes. Cells can be triangular, quadrilateral, etc. They can be elements of a structured grid or of a non-structured grid. In this method a function space for the solution need not be defined, and the nodes can be chosen in a way that does not imply an interpolation structure. Since the computational grid is not necessarily orthogonal and equally spaced, the definition of the derivative by Taylor's expansion is impossible. Furthermore, there is no mechanism like a weak formulation, and therefore this method is best suited for flow problems in primitive variables where the viscous terms are absent (Euler equations) or not very important (high Reynolds number Navier-Stokes equations). For a lucid description of this method, we refer to Kroner [1997] and Wendt [1991].
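As a minimal illustration of the cell-based viewpoint, the following sketch advances cell averages of the scalar conservation law u_t + a u_x = 0 on (0, 1) with a first-order upwind interface flux and a periodic boundary; the equation, grid and initial data are illustrative assumptions, and the cited references treat the method in full generality.

import numpy as np

a, n, cfl = 1.0, 200, 0.9
dx = 1.0 / n
dt = cfl * dx / a
xc = (np.arange(n) + 0.5) * dx            # cell centres (nodes)
u = np.exp(-200.0 * (xc - 0.3)**2)        # initial cell averages

for _ in range(int(0.4 / dt)):
    f_right = a * u                        # upwind flux through the right interface (a > 0)
    f_left = a * np.roll(u, 1)             # flux through the left interface (periodic)
    u = u - dt / dx * (f_right - f_left)   # conservative update of the cell averages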

3.2.2 Finite element method

The finite element method is a variational-method-based technique for solving differential equations. In this method, continuous problems described by a differential equation are written in an equivalent variational form (weak form), and the approximate solution of this variational problem is assumed to be a linear combination, Σ_i c_i φ_i, of approximation functions φ_i. The constants c_i are determined by the associated variational form. The finite element method provides a systematic technique for deriving the approximation functions for simple subregions which constitute a geometrically complex region. In this method, the approximation functions are piecewise polynomials, that is, polynomials that are defined only on a subregion, called an element. As mentioned earlier, phenomena in nature or real-life problems can be expressed in the form of mathematical models employing the laws of physics and with the aid of the given physical conditions. As we have seen, these models are nothing but algebraic, differential, integral or operator equations. Finding the exact solutions of these equations is a formidable task. In the case where an analytic solution is not feasible, we look for an approximate solution. The finite difference method and variational methods such as the Ritz and Galerkin methods were well known until the beginning of the sixties for finding approximate solutions. A systematic study of the variational formulation of boundary value problems and their discretization began in the early seventies. From the early 1950s, the engineer Argyris began the study of certain techniques for structural analysis which are now known as the primitive finite element method. The work representing the beginning of the finite element method was contained in a paper of Turner, Clough, Martin and Topp [1956], where an endeavour was made for a local approximation of the partial differential equations of linear elasticity by the usage of assembly strategies, an essential ingredient of the finite element method. In 1960, Clough termed these techniques the "finite element method". Between 1960 and 1980, several conferences were organized in different parts of the world, mainly by engineers, to understand the intricacies of the method. A paper by Zlamal [1968] is considered the first most significant mathematical contribution, in which an analysis of the interpolation properties of a class of triangular elements and their application to second- and fourth-order linear elliptic boundary value problems is carried out. Valuable contributions of Ciarlet, Strang, Fix, Schultz, Birkhoff, Bramble and Zlamal, Babuska, Aziz, Varga, Raviart, Lions, Glowinski, Nitsche, Brezzi have enriched the field. Proceedings of the conferences edited by Whiteman [1973-79] and the book by Zienkiewicz and Cheung [1967] have popularized the method among engineers and mathematicians alike. The Finite Element Handbook edited by Kardestuncer and Norrie [1987] and the Handbook of Numerical Analysis edited by Ciarlet and Lions [1991] provide updated literature. Wahlbin [1995] contains some of the current research work in this field. In short, one can say that there is no other approximation method which has had such a vast impact on the theory and applications of numerical methods. It has been practically applied in every conceivable area of engineering: structural analysis, semiconductor devices, meteorology,

flow through porous media, heat conduction, wave propagation, electromagnetism, environmental studies, safing sensors, geomechanics, biomechanics, aeromechanics and acoustics, to name a few.

The finite element method is popular and attractive for the following reasons. The method is based on the weak formulation (variational formulation) of boundary and initial value problems. This is a critical property, because it provides a proper setting for the existence of even discontinuous solutions to differential equations, for example, distributions, and also because the solution appears in the integral of a quantity over a domain. The fact that the integral of a measurable function over an arbitrary domain can be expressed as the sum of integrals over an arbitrary collection of almost disjoint subdomains whose union is the original domain is a very important point in this method. Due to this fact, the analysis of a problem can be carried out locally over a subdomain, and by making the subdomain sufficiently small, polynomial functions of various degrees are sufficient for representing the local behaviour of the solution. This property can be exploited in every finite element program, which allows us to focus the attention on a typical finite element domain and to find an approximation independent of the ultimate location of that element in the final mesh. The property stated above has important implications in physics and continuum mechanics and, consequently, the physical laws will hold for every finite portion of the material.

Some important features of the finite element methods are

(1) arbitrary geometries,

(2) unstructured meshes,

(3) robustness, and

(4) sound mathematical foundation.

Arbitrary geometries means that, in principle, the method can be applied to domains of arbitrary shapes and with arbitrary boundary conditions. By unstructured meshes, we mean that, in principle, one can place finite elements anywhere, ranging from the complex cross-sections of biological tissues to the exterior of aircraft to internal flows in turbomachinery, without strong use of a globally fixed coordinate frame. Robustness means that the scheme developed for assemblage after local approximation over individual elements is stable in appropriate norms and insensitive to singularities or distortions of the meshes (this property is not available in classical difference methods).

The method has a sound mathematical basis, as the convergence of an approximate solution of the abstract variational problem (a more general form of the variational inequality problem) and the error estimation of the abstract form in fairly general situations and their particular cases have been systematically studied in the last two decades. These studies make it possible to lift the analysis of important engineering and physical problems above the traditional empiricism prevalent in many numerical and experimental studies.

In Subsection 3.2.3, we shall present error estimation of an approximate solution


of an abstract variational problem in the context of a Hilbert space and its exact
solution along with the general concept of finite element method, finite element
and the main steps for solving a boundary value problem. In Subsection 3.2.4, detailed
procedures will be indicated for solving concrete problems.

3.2.3 Abstract finite element method


Let H be a Hilbert space and a(·,·) be a bounded bilinear form on H × H into R. For each F ∈ H*, the dual space of H (the space of all bounded linear functionals on H), the variational equation (problem):

There exists u ∈ H such that
a(u, v) = F(v) for all v ∈ H (3.75)

has a unique solution provided a(·,·) satisfies the coercivity or ellipticity condition, namely, there exists α > 0 such that

a(u, u) ≥ α‖u‖² for all u ∈ H.

This result is known as the Lax-Milgram Lemma (for a proof, one may see Siddiqi [1986] or Ciarlet [1978]; it mainly depends on the Riesz representation theorem concerning the representation of the elements of H* in terms of the inner product on H).

Conformal finite element method. The procedure of finding a finite-dimensional subspace H_h of H such that there exists u_h ∈ H_h satisfying the equation

a(u_h, v_h) = F(v_h) for all v_h ∈ H_h (3.76)

is called the conformal finite element method or simply the finite element method. Equation (3.75) is known as the abstract variational problem and equation (3.76) is called the approximate problem. If H_h is not a subspace of H, the above method is called the non-conformal finite element method.

Equation (3.76) is nothing but a matrix equation of the form

AU = b, (3.77)

where U = (α_1, α_2, α_3, ..., α_{N(h)}), N(h) = dimension of H_h,

A^T = (a(w_i, w_j))_{i,j} (3.77)(i)

b = (F(w_1), F(w_2), ..., F(w_{N(h)})) (3.77)(ii)

u_h = Σ_{i=1}^{N(h)} α_i w_i (3.77)(iii)

v_h = Σ_{j=1}^{N(h)} β_j w_j, (3.77)(iv)

where the α_i and β_j are real numbers, i, j = 1, 2, 3, ..., N(h). The choice of the basis {w_i} of H_h, i = 1, 2, ..., N(h), is of vital importance; namely, choose a basis of H_h which makes A a sparse matrix, so that the computing time is reasonably small. In the terminology of structural engineers, A and b are called the stiffness matrix and the load vector, respectively.

If a(·,·) is symmetric, then finding the solution of (3.75) is equivalent to solving the optimization problem

J(u) = inf_{v∈H} J(v), (3.78)

where J(v) = (1/2) a(v, v) − F(v) is called the energy functional.

In this case, the finite element method is nothing but the Rayleigh-Ritz-Galerkin method. It is quite clear that (3.76) and the approximate problem for (3.78), namely,

J(u_h) = inf_{v_h∈H_h} J(v_h), (3.79)

where J(v_h) = (1/2) a(v_h, v_h) − F(v_h), have unique solutions.


Finding ‖u − u_h‖, where u and u_h are the solutions of (3.75) and (3.76), respectively, is known as the error estimation. The problem whether u_h → u as h → 0, that is, ‖u_h − u‖ → 0 as h → 0 or N(h) → ∞, is known as the convergence problem.

Error estimation.

Theorem 3.2 [Céa's Lemma]. There exists a constant C independent of the subspace H_h such that

‖u − u_h‖ ≤ C inf_{v_h∈H_h} ‖u − v_h‖, (3.80)

where C = M/α, independent of H_h, M is the constant associated with the continuity (boundedness) of a(·,·) and α is the coercivity constant. If a(·,·) is symmetric, then the degree of approximation is improved, that is, we get C = (M/α)^{1/2}, which is less than the constant in the non-symmetric case.

Theorem 3.3 [First Strang Lemma]. Let H be a Hilbert space and H_h be its finite-dimensional subspace. Further, let a(·,·) be a bilinear bounded and elliptic form on H and F ∈ H*. Assume that u_h is the solution of the following approximate problem: Find u_h ∈ H_h such that

a_h(u_h, v_h) = F_h(v_h) for all v_h ∈ H_h, (3.81)

where a_h(·,·) is a bilinear form defined on H_h and F_h(·) is a linear functional defined on H_h. Then

‖u − u_h‖ ≤ C ( inf_{v_h∈H_h} { ‖u − v_h‖ + sup_{w_h∈H_h} |a(v_h, w_h) − a_h(v_h, w_h)| / ‖w_h‖ }
+ sup_{w_h∈H_h} |F(w_h) − F_h(w_h)| / ‖w_h‖ ),

provided a_h(·,·) is uniformly H_h-elliptic, that is, there exists β > 0 such that a_h(v_h, v_h) ≥ β‖v_h‖² for all v_h ∈ H_h and all h.

Note: It may be observed that (i) a_h(·,·) and F_h(·) are not defined for all the elements of H, and (ii) equation (3.81) has a unique solution under the given conditions.

Theorem 3.4 [Second Strang Lemma]. Let u_h be the solution of the following approximate problem (discrete problem): Find u_h ∈ H_h such that

a_h(u_h, v_h) = F(v_h) for all v_h ∈ H_h, (3.82)

where a_h(·,·) is as in Theorem 3.3 and F ∈ H*. Then there exists a constant C independent of the subspace H_h such that

‖u − u_h‖_h ≤ C ( inf_{v_h∈H_h} ‖u − v_h‖_h + sup_{w_h∈H_h} |a_h(u, w_h) − F(w_h)| / ‖w_h‖_h ),

where H = H_0^1(Ω).

We prove here Theorems 3.2 and 3.3, while the proof of Theorem 3.4 is along similar lines (see Ciarlet, 1991, pp. 212-213, for a proof and more information about results in this direction).

Proof of Theorem 3.2. By (3.75) and (3.76), we get a(u, v) − a(u_h, v_h) = F(v) − F(v_h), and this gives a(u, v_h) − a(u_h, v_h) = 0 for v = v_h. By the bilinearity of a(·,·), we get

a(u − u_h, v_h) = 0 for all v_h ∈ H_h ⟹ a(u − u_h, v_h − u_h) = 0, (3.83)

by replacing v_h by v_h − u_h.

Since a(·,·) is elliptic,

α‖u − u_h‖² ≤ a(u − u_h, u − u_h) = a(u − u_h, u − v_h) + a(u − u_h, v_h − u_h),

or

(1/α) [a(u − u_h, u − v_h) + a(u − u_h, v_h − u_h)] ≥ ‖u − u_h‖².

Using (3.83), this becomes

(1/α) a(u − u_h, u − v_h) ≥ ‖u − u_h‖²,

or

‖u − u_h‖² ≤ (M/α) ‖u − u_h‖ ‖u − v_h‖,

using the boundedness of a(·,·); namely,

|a(u, v)| ≤ M‖u‖‖v‖.

This gives us

‖u − u_h‖ ≤ (M/α) ‖u − v_h‖ for all v_h ∈ H_h,

or

‖u − u_h‖ ≤ (M/α) inf_{v_h∈H_h} ‖u − v_h‖.

When the bilinear form a(·,·) is symmetric, there is a remarkable interpretation of the approximate solution: namely, the approximate solution u_h is the projection of the exact solution u onto the subspace H_h with respect to the inner product a(·,·), as a(u − u_h, v_h) = 0 for all v_h ∈ H_h. Thus, we get

a(u − u_h, u − u_h) ≤ a(u − v_h, u − v_h) for all v_h ∈ H_h.

By the properties of ellipticity and boundedness of a(·,·), we get

α‖u − u_h‖² ≤ a(u − u_h, u − u_h) ≤ a(u − v_h, u − v_h) ≤ M‖u − v_h‖²,

or

‖u − u_h‖ ≤ (M/α)^{1/2} ‖u − v_h‖.

Thus

‖u − u_h‖ ≤ (M/α)^{1/2} inf_{v_h∈H_h} ‖u − v_h‖.

Proof of Theorem 3.3. We have

‖u − u_h‖ ≤ ‖u − v_h‖ + ‖v_h − u_h‖ for all v_h ∈ H_h,

by the triangle inequality of the norm, and, by the uniform H_h-ellipticity of a_h(·,·),

β‖u_h − v_h‖² ≤ a_h(u_h − v_h, u_h − v_h) = F_h(u_h − v_h) − a_h(v_h, u_h − v_h). (3.84)

By the continuity of the bilinear form a(·,·) and the relation F(u_h − v_h) = a(u, u_h − v_h), (3.84) takes the form

β‖u_h − v_h‖² ≤ a(u − v_h, u_h − v_h) + {a(v_h, u_h − v_h) − a_h(v_h, u_h − v_h)}
+ {F_h(u_h − v_h) − F(u_h − v_h)},

or

β‖u_h − v_h‖ ≤ M‖u − v_h‖ + |a(v_h, u_h − v_h) − a_h(v_h, u_h − v_h)| / ‖u_h − v_h‖
+ |F_h(u_h − v_h) − F(u_h − v_h)| / ‖u_h − v_h‖
≤ M‖u − v_h‖ + sup_{w_h∈H_h} |a(v_h, w_h) − a_h(v_h, w_h)| / ‖w_h‖
+ sup_{w_h∈H_h} |F_h(w_h) − F(w_h)| / ‖w_h‖.

By putting this bound on ‖u_h − v_h‖ into the first inequality and taking the infimum over H_h, we get the desired result.

Remark 3.13. (i) Theorem 3.3 is a generalization of Céa's lemma, as a_h(·,·) = a(·,·) and F_h(·) = F(·) in the case of the conformal finite element method (the case when H_h ⊂ H).

(ii) The problem (3.75) can be written in the form

Au = f, (3.75)'

where A : H → H is bounded and linear.

By a well-known result (see, for example, Theorem 2.31 in Siddiqi [1986]), there exists a bounded linear operator A on H into itself (H* = H) such that (Au, v) = a(u, v). By the Riesz theorem, for each v there exists a unique f ∈ H such that F(v) = (f, v). Thus (3.75) takes the form (Au, v) = (f, v), which gives (3.75)'.

Convergence results. As a consequence of Theorem 3.2, we find that ‖u_h − u‖ → 0 as h → 0, that is, the approximate solution u_h converges to the exact solution of (3.75), if there exists a family {H_h} of subspaces of the space H such that for each u ∈ H,

lim_{h→0} inf_{v_h∈H_h} ‖u − v_h‖ = 0. (3.85)

If ‖u − u_h‖ ≤ C h^α for α > 0, where C is a positive constant independent of u and u_h and h is the characteristic length of an element, then α is called the rate of convergence. It may be observed that the convergence is related to the norm under consideration, say, the L² norm, the energy norm or the L∞ norm (or sup norm).
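In practice the rate α is often estimated from the computed errors on two successive meshes, since ‖u − u_h‖ ≈ C h^α gives α ≈ log(e_1/e_2)/log(h_1/h_2); a small sketch with illustrative numbers:

import numpy as np

h1, h2 = 0.1, 0.05            # two mesh sizes (illustrative values)
e1, e2 = 3.2e-3, 8.1e-4       # corresponding errors ||u - u_h|| (illustrative values)
alpha = np.log(e1 / e2) / np.log(h1 / h2)
print(alpha)                  # close to 2, i.e. second-order convergence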
There is vast literature concerning convergence for special types of norms and spaces and special types of boundary and initial value problems. For interested readers we refer to Wahlbin [1995], Zenisek [1990], Ciarlet-Lions [1991], Kardestuncer and Norrie [1987] and Siddiqi [1994]. We present here a convergence theorem for a

more general model, namely the variational inequality, which includes the variational formulation as a special case.
Many physical phenomena are modelled by the following problem:
Let K be a non-empty closed convex subset of a Hilbert space H. Find u ∈ K such that

a(u, v − u) ≥ F(v − u) for all v ∈ K, (3.86)

where a(·,·) is a bounded bilinear form over H and F ∈ H*. It is clear that if K
is a subspace of H, then (3.86) takes the form (3.75). Inequality (3.86) is known
as a variational inequality whose study was initiated around 1960 by Fichera,
Stampacchia and Lions. There exists a vast literature on this theme; for compre-
hensive and updated references we refer to Glowinski [1984] , Siddiqi [1994, 1994],
and Kikuchi and Oden [1988] . Inequality (3.86) and its generalizations are natural
models for contact problems, flow through porous media and many economic the-
ories; see, for example, Siddiqi [1997] for an updated literature and new areas of
applications of the variational inequalities and their finite element analysis.
Let H_h be a closed subspace of H and K_h be a subset of H_h for all h such that {K_h}_h satisfies the following conditions:

(i) If {v_h}_h is such that v_h ∈ K_h for all h and {v_h}_h is bounded in H, then the weak cluster points of {v_h}_h belong to K.

(ii) There exists a subset U of H with Ū = K and a map r_h : U → K_h such that

lim_{h→0} r_h v = v strongly in H, for all v ∈ U.

The problem: Find u_h ∈ K_h such that

a(u_h, v_h − u_h) ≥ F(v_h − u_h) for all v_h ∈ K_h (3.87)

is called the approximate (discrete) problem.


It is known that (3.87) has a unique solution if a(·,·) is elliptic. Under the above conditions, the solution u_h of the discrete problem (3.87) converges to the exact solution u of (3.86). For a proof of this result, we refer to Glowinski [1984].

Finite elements. Let Ω be a polygonal domain in R² (Ω ⊂ R²). A finite collection of triangles T_h satisfying the following conditions is called a triangulation:

(i) The union of the triangles with their boundaries is equal to the closure of the polygonal domain, that is, Ω̄ = ∪_{K∈T_h} K.

(ii) Two members, say, K_1 and K_2 of T_h are either equal or disjoint; that is, for any two distinct members K_1 and K_2 of T_h, their intersection is either empty, or a vertex common to K_1 and K_2, or an edge common to them.

Let P(K), K ∈ T_h, be a function space defined on K such that P(K) ⊂ H¹(K) (the Sobolev space of order 1). Generally, P(K) is taken as a space of polynomials of some degree, and is contained in H¹(Ω). This is essential for proving the convergence of the method under consideration and has significant advantages in solving (3.76). For a proof, see Siddiqi [1986, p. 255] or Ciarlet [1978].

For h = max_{K∈T_h} (diameter of K), N(h) = the number of nodes of the triangulation, and P(K) = P_1(K) = the space of polynomials of degree less than or equal to 1 in x and y, the space H_h = {v_h | v_h|_K ∈ P_1(K), K ∈ T_h} is contained in C⁰(Ω̄), the space of real-valued continuous functions on Ω̄, and the functions w_i, i = 1, 2, ..., N(h), defined by

w_i = 1 at the i-th node, w_i = 0 at the other nodes,

form a basis of H_h, and so H_h is an N(h)-dimensional subspace of H¹(Ω).
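In one dimension the corresponding nodal basis consists of the familiar "hat" functions. The following sketch (the mesh is an arbitrary illustrative choice) obtains them as piecewise-linear interpolations of the unit nodal vectors and checks that they sum to one at every point:

import numpy as np

# Nodal ("hat") basis on a 1-D mesh: w_i is piecewise linear, equal to 1
# at the i-th node and 0 at all other nodes.
nodes = np.array([0.0, 0.2, 0.5, 0.7, 1.0])   # an arbitrary illustrative mesh

def hat(i, x):
    # Piecewise-linear interpolation of the i-th unit nodal vector
    # yields exactly the i-th hat function.
    e = np.zeros(len(nodes))
    e[i] = 1.0
    return np.interp(x, nodes, e)

x = np.linspace(0.0, 1.0, 11)
print(sum(hat(i, x) for i in range(len(nodes))))   # all ones: the w_i sum to 1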

In the terminology of Ciarlet, (K, P_K, Σ_K), where

(i) Σ_K is a set of m linear functionals ψ_i, 1 ≤ i ≤ m, defined over P_K such that, given any real scalars a_i, 1 ≤ i ≤ m, there exists a unique function w ∈ P_K satisfying ψ_i(w) = a_i, 1 ≤ i ≤ m;

(ii) K is a compact subset of R^n with a non-empty interior and a Lipschitz continuous boundary; and

(iii) P_K is a finite-dimensional space of real-valued functions defined over the set K with dimension m;

is called a finite element.

Remark 3.14. (i) It is clear that the m linear functionals ψ_i are linearly independent and, in particular, there exist m functions w_i ∈ P_K, 1 ≤ i ≤ m, that satisfy

ψ_j(w_i) = δ_ij, 1 ≤ i, j ≤ m,

and we have

w = Σ_{i=1}^{m} ψ_i(w) w_i for all w ∈ P_K.

(ii) Σ_K is often called a set of m degrees of freedom, that is, of m linear functionals ψ_i, 1 ≤ i ≤ m, and the ψ_i are called the degrees of freedom of the finite element. The functions w_i, 1 ≤ i ≤ m, are called the basis functions of the finite element. The basis functions are also called the shape functions in the literature of engineering and technology.

(iii) Very often K alone is called the finite element, especially in the engineering literature (see Ciarlet [1991, p. 94]). H_h is called the space of finite elements.

(iv) If K is an n-simplex, (K, P_K, Σ_K) is called the simplicial finite element. Finite elements are called triangular and rectangular as K is triangular and rectangular, respectively, in R². In R³, a finite element is called tetrahedral if K is tetrahedral.

(v) In practice, Σ_K in R² is the set of values of w ∈ P_K at the vertices and middle points of the triangle or rectangle, as the case may be, and P_K is a set of polynomials of degree less than or equal to 1.

3.2.4 Finite element method in concrete cases


The material presented here is mainly based on Brenner and Scott [1994], Reddy [1985], Chari and Silvester [1980], Jin [1993], Silvester and Ferrari [1990], Dautry-Lions [vol. 4, 1990] and Zenisek [1990]. For the basic knowledge needed for tackling real-life problems through finite element methods, we refer to these lucidly written textbooks. For a comprehensive study and numerous applications, we cite Ciarlet-Lions [1991] and Kardestuncer and Norrie [1987].
The steps involved in the finite element analysis of a problem are as follows:

(i) Formulation of the problem in the variational form (equality or inequality


form) .

(ii) Discretization or representation of the domain of the problem into a collection


of preselected finite elements.

(iii) Construction of the finite element mesh of preselected elements.

(iv) Numbering of nodes and elements.

(v) Generation of the geometric properties (for example, coordinates and cross-sections) required.

(vi) Derivation of discretized equations for all typical elements in the mesh, that is, equations of the form

Σ_{j=1}^{n} K_ij^(e) α_j = F_i^(e),

where an arbitrary element is u = Σ_{i=1}^{n} α_i w_i.

(vii) The search for appropriate basis/shape/interpolation/approximation functions w_i.
(viii) Assemblage of element equations to obtain the equations of the whole problem.

(ix) Imposition of the boundary conditions.


(x) Solution of the assembled system of equations.
(xi) Visualization of the solution and interpretation.
The steps involved will be explained by the finite element solution of the following
boundary value problem:

−(d/dx)(a du/dx) = f, 0 < x < a;
u(0) = 0, (a(x) du/dx)_{x=a} = A, (3.88)

where a = a(x), f = f(x), and A are the data of the problem.

It may be remarked that this boundary value problem models the following physical phenomena:

(i) In electrostatics, it models the electrostatic potential u between two parallel plates, one located at x = 0 with u = 0 and the other located at x = a with a value depending on A, where a(x) and A are the dielectric constant and the electric flux, respectively.

(ii) Transverse deflection of a cable: u = transverse deflection, a(x) = tension in the cable, f = distributed transverse load, A = axial force.

(iii) Axial deformation of a bar: u = longitudinal displacement, a(x) = EA (where E = modulus and A = area of cross-section), f = friction or contact force on the surface of the bar, A = axial force.

(iv) Heat transfer: u = temperature, a(x) = thermal conductivity, f = heat generation, A = heat flux.

(v) Flow through porous media: u = fluid head, a(x) = coefficient of permeability, f = fluid flux, A = flow (seepage).

Variational formulation. Multiplying both sides of (3.88) by v and integrating over (0, a), we get

∫_0^a a (dv/dx)(du/dx) dx + [v(−a du/dx)]_0^a = ∫_0^a v f dx

by applying integration by parts. Thus (3.88) can be written in the form

a(u, v) = F(v), (3.88)'

where

a(u, v) = ∫_0^a a (du/dx)(dv/dx) dx, (3.88)'(i)

[Figure 3.3(i) and 3.3(ii): Finite element representation of a line (one-dimensional domain) by line elements, showing the element numbers, node numbers, and a typical element with end coordinates x = x_e and x = x_{e+1}.]

F(v) = ∫_0^a v f dx + A v(a) + γ, (3.88)'(ii)

where

γ = (−v a du/dx)_{x=0}.

Discretization of the domain, meshes, nodes, etc. Figures 3.3(i) to 3.3(iv) give the geometrical meaning of finite elements, meshes and nodes. Here the nodes are points of the interval [0, a] (Ω = (0, a), Ω̄ = [0, a]; the boundary ∂Ω consists of the points 0 and a), which is subdivided into a set of subintervals or line elements called the finite element mesh or triangulation. The mesh in Figure 3.3(i) is a non-uniform mesh, as the elements are not of equal length. The intersection of any two elements is called the inter-element boundary. The number of elements used in a problem depends mainly on the element type and the accuracy desired. The e-th element and its nodes are shown in Figure 3.3(ii). A typical element Ω^e = (x_A, x_B), a line element, is called the e-th element. x_e will denote the coordinate of the e-th node (see Figure 3.3(i)). In Figure 3.3(iv), the boundary conditions on the typical element are shown.

Derivation of the discrete or approximate equation over the e-th element. Let the exact solution u of (3.88) be approximated on the e-th element by

u_e(x) = Σ_{j=1}^{m} α_j^(e) φ_j^(e)(x), (3.89)

where the α_j are parameters to be determined and the φ_j(x) are the approximation functions or the basis functions to be constructed. Substituting u_e(x) for u and

[Figure 3.3(iii) and 3.3(iv): Finite element discretization of a one-dimensional domain; a typical element of length h_e with local coordinate x̄ running from x̄ = 0 to x̄ = h_e.]

v = φ_i over the e-th element, namely (x_A, x_B), into (3.88)', we get

Σ_{j=1}^{m} K_ij^(e) α_j^(e) = F_i^(e), (3.90)(i)

where

K_ij^(e) = ∫_{x_A}^{x_B} a (dφ_i/dx)(dφ_j/dx) dx, (3.90)(ii)

F_i^(e) = ∫_{x_A}^{x_B} φ_i f dx + λ_i^(e). (3.90)(iii)

Derivation of the interpolation functions for an element. The φ^(e) are constructed using the conditions on the Rayleigh-Ritz test functions. To satisfy these conditions, we must select the φ_i such that equation (3.89) is differentiable at least with respect to x and satisfies the essential boundary conditions u(x_A) = u_1^(e), u(x_B) = u_2^(e). Furthermore, the {φ_i} must be linearly independent and complete. These conditions will be met if we choose

u(x) = c_1 + c_2 x. (3.91)

The continuity is obviously satisfied, φ_1 = 1 and φ_2 = x are linearly independent, and the set {1, x} is complete. To satisfy the remaining requirements on {φ_i}, u must satisfy the essential boundary conditions of the element; that is,

u(x_e) = u_1^(e) = c_1 + c_2 x_e, u(x_{e+1}) = u_2^(e) = c_1 + c_2 x_{e+1}, (3.91)'

or, in matrix form,

{u_1^(e); u_2^(e)} = [1, x_e; 1, x_{e+1}] {c_1; c_2}. (3.92)

Solving (3.92) for c_1 and c_2 in terms of u_1^(e) and u_2^(e), we obtain

c_1 = (u_1^(e) x_{e+1} − u_2^(e) x_e) / (x_{e+1} − x_e), c_2 = (u_2^(e) − u_1^(e)) / (x_{e+1} − x_e). (3.93)

Putting these values of c_1 and c_2 in (3.91), we get

u(x) = Σ_{i=1}^{2} u_i^(e) φ_i^(e), (3.94)

where

φ_1^(e) = (x_{e+1} − x) / (x_{e+1} − x_e), φ_2^(e) = (x − x_e) / (x_{e+1} − x_e), x_e ≤ x ≤ x_{e+1}. (3.95)

Expression (3.94) satisfies the essential boundary conditions of the element, and the {φ_i} are continuous, linearly independent, and complete over the element. Comparing (3.89) with (3.94), we find that m = 2 and α_j^(e) = u_j^(e).
Since the approximation functions are derived from (3.91) in such a way that u(x) is equal to u_1^(e) at node 1 (x_A = x_e) and to u_2^(e) at node 2 (x_B = x_{e+1}), that is, interpolated, they are known as the Lagrange family of interpolation functions. The interpolation functions have the following properties, besides φ_i^(e) = 0 outside the element Ω^e = (x_e, x_{e+1}):

(i) φ_i^(e)(x_j) = 1 if i = j and 0 if i ≠ j, where x_1 = x_e, x_2 = x_{e+1}. (3.96)(i)

(ii) Σ_{i=1}^{2} φ_i^(e)(x) = 1. (3.96)(ii)

The important point to note is that one can derive the discrete equation (3.90) using (3.89), and the φ_i will depend upon the type of element, that is, its geometry, the number of nodes and the number of primary unknowns per node.

The nature of the φ_i will also be crucial in evaluating the integrals in (3.90)(ii) and (3.90)(iii). In view of (3.96)(i), (3.90) can be written as

Σ_{j=1}^{2} K_ij^(e) u_j^(e) = F_i^(e), (3.97)(i)

where, for the linear element, x_A = x_e and x_B = x_{e+1},

K_ij^(e) = ∫_{x_e}^{x_{e+1}} a (dφ_i/dx)(dφ_j/dx) dx, (3.97)(ii)

F_i^(e) = ∫_{x_e}^{x_{e+1}} φ_i f dx + λ_i^(e). (3.97)(iii)

Equations (3.97) are known as the finite element model of the given boundary value problem (3.88), or equivalently (3.88)', over an element.

A local coordinate system (a coordinate system in the element) proves more convenient in the derivation of the basis functions. If x̄ denotes the local coordinate whose origin is at node 1 of an element, then the local coordinate is related to the global coordinate (the coordinate used in the earlier formulation) by the linear transformation

x = x̄ + x_e.

Putting this value of x in (3.91) and (3.92), we get

u(x̄) = c_1 + c_2 x̄, u_1^(e) = u(0) = c_1, u_2^(e) = u(h_e) = c_1 + c_2 h_e, (3.98)

and finally we obtain

u = Σ_{j=1}^{2} u_j^(e) φ_j^(e)(x̄), φ_1^(e)(x̄) = 1 − x̄/h_e, φ_2^(e)(x̄) = x̄/h_e, 0 ≤ x̄ ≤ h_e. (3.99)

(3.97)(ii) and (3.97)(iii) take the following form in the local coordinate:

K_ij^(e) = ∫_0^{h_e} ā (dφ_i/dx̄)(dφ_j/dx̄) dx̄, (3.100)(i)

F_i^(e) = ∫_0^{h_e} φ_i f̄ dx̄ + λ_i^(e), (3.100)(ii)

where

ā = a evaluated at (x_e + x̄) = a(x_e + x̄), (3.100)(iii)
f̄ = f(x_e + x̄). (3.100)(iv)

When the linear interpolation functions are used to approximate the dependent variable of the present problem, we obtain

K^(e) = (a_e/h_e) [1, −1; −1, 1], (3.101)(i)

F^(e) = (f_e h_e/2) {1; 1} + {λ_1^(e); λ_2^(e)}, (3.101)(ii)

where a_e and f_e are element-wise constant.


Assembly (connectivity) of the element equations. Equation (3.97) is valid for arbitrary elements. Suppose Ω = (0, a) is divided into three elements of not necessarily equal lengths. Since these elements are connected at nodes 2 and 3 and u is continuous, u_2 of element Ω^e should be the same as u_1 of element Ω^{e+1} for e = 1, 2. If U_i, i = 1, 2, 3, ..., m, are the values of u at the global nodes and u_i^e are the nodal values in the local coordinates, then

u_1^(1) = U_1, u_2^(1) = U_2 = u_1^(2), u_2^(2) = U_3 = u_1^(3), u_2^(3) = U_4. (3.102)

Relations (3.102) are called the inter-element continuity conditions. The element equation (3.97) can be written for the different elements in the following form:

Element 1.

(a_1/h_1) [1, −1, 0, 0; −1, 1, 0, 0; 0, 0, 0, 0; 0, 0, 0, 0] {U_1, U_2, U_3, U_4}^T
= (f_1 h_1/2) {1, 1, 0, 0}^T + {λ_1^(1), λ_2^(1), 0, 0}^T. (3.103)(i)

Element 2.

(a_2/h_2) [0, 0, 0, 0; 0, 1, −1, 0; 0, −1, 1, 0; 0, 0, 0, 0] {U_1, U_2, U_3, U_4}^T
= (f_2 h_2/2) {0, 1, 1, 0}^T + {0, λ_1^(2), λ_2^(2), 0}^T. (3.103)(ii)

Element 3.

(a_3/h_3) [0, 0, 0, 0; 0, 0, 0, 0; 0, 0, 1, −1; 0, 0, −1, 1] {U_1, U_2, U_3, U_4}^T
= (f_3 h_3/2) {0, 0, 1, 1}^T + {0, 0, λ_1^(3), λ_2^(3)}^T. (3.103)(iii)

By superimposing, that is, adding equations (3.103)(i)-(iii), we get the global finite element model of (3.88); namely,

[ a_1/h_1, −a_1/h_1, 0, 0;
−a_1/h_1, a_1/h_1 + a_2/h_2, −a_2/h_2, 0;
0, −a_2/h_2, a_2/h_2 + a_3/h_3, −a_3/h_3;
0, 0, −a_3/h_3, a_3/h_3 ] {U_1, U_2, U_3, U_4}^T

= (1/2) {f_1 h_1, f_1 h_1 + f_2 h_2, f_2 h_2 + f_3 h_3, f_3 h_3}^T + {λ_1^(1), λ_2^(1) + λ_1^(2), λ_2^(2) + λ_1^(3), λ_2^(3)}^T. (3.104)

Imposition of boundary conditions. Equation (3.104) is valid for any problem


described by the differential equation (3.88) irrespective of the boundary conditions.
After imposing the appropriate boundary conditions, we obtain a non-singular matrix which can be inverted to obtain the solution of (3.104). A detailed account of
the physical meaning of the boundary conditions can be found in Reddy [1985].
Solution of the discretized equation (3.104) and its visualization:
MATLAB and Mathematica are two of the most popular computing environ-
ments for computation and visualization of the solution of equation (3.104) . These
packages have been introduced in Appendix 7.1. See also Appendix 7.5 for a solu-
tion of (3.104). Methods discussed in Chapter 4 could be useful for more accurate
evaluation of integrals in (3.90), especially, in higher dimensions.
It may be remarked that the steps narrated here will remain the same for solving
differential equations of higher orders or in higher dimensions, that is, in solving
partial differential equations. Solutions of a fairly large number of modelling prob-
lems of science and technology through the finite element methods are discussed in
Reddy [1985] and Hackbusch [1992] in a simple way. See solutions of problems for
illustration of the finite element method in concrete cases.
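For concreteness, the following Python sketch carries out the element computation (3.101), the assembly (3.103)-(3.104) and the imposition of the boundary conditions for problem (3.88) with the illustrative data a(x) = 1, f(x) = 1, A = 0 on (0, 1); for these data the exact solution is u(x) = x − x²/2, and the computed nodal values reproduce it.

import numpy as np

n_el = 4                                   # number of elements (illustrative)
nodes = np.linspace(0.0, 1.0, n_el + 1)
a_e, f_e, lam = 1.0, 1.0, 0.0              # element-wise constant data, A = 0

K = np.zeros((n_el + 1, n_el + 1))
F = np.zeros(n_el + 1)
for e in range(n_el):
    h = nodes[e + 1] - nodes[e]
    Ke = (a_e / h) * np.array([[1.0, -1.0], [-1.0, 1.0]])   # (3.101)(i)
    Fe = (f_e * h / 2.0) * np.array([1.0, 1.0])             # (3.101)(ii)
    K[e:e + 2, e:e + 2] += Ke              # superimpose the element equations
    F[e:e + 2] += Fe

F[-1] += lam                               # natural BC: (a u')(1) = A enters the load
K, F = K[1:, 1:], F[1:]                    # essential BC: u(0) = 0 removes node 1
u = np.concatenate([[0.0], np.linalg.solve(K, F)])
print(u)                                   # nodal values of u_h; exact: x - x**2/2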

3.3 Boundaryelement method


The theory of integral equations is well known through the classical work of Fredholm [1903], especially due to its relevance to potential theory (Laplace-type problems). The work of Kellog [1928], Tricomi [1985], Kupradze [1968], Mikhlin [1957, 1965], etc., provides quite rich literature on the subject. However, methods of integral equations were not popular with engineers until the late seventies, in spite of the invention of digital computers. Researchers at Southampton University may be credited for making a systematic endeavour in the eighties and nineties to solve problems of different branches of engineering by modelling them through integral equations on the boundary of a region (domain) and then by discretizing these models. This technique led to the development of the boundary element method (also called the boundary integral method in the initial stages). Thus the boundary element method (BEM) consists in the transformation of the partial differential equation describing the behaviour of an unknown inside and on the boundary of the domain into an integral equation relating only boundary values, and then finding the numerical solution. The combination of the finite element method with the boundary element method is possible, where the integral equation on the boundary can be discretized through finite element techniques treating the boundary as an element. Nowadays, BEM is a powerful technique, as it reduces the dimension of the problem by one through boundary integral equations, and it permits complex surface elements to define the external surface of the domain. One of the significant advantages of this method is that discontinuous elements can also be used, and meshes of such elements can be refined in a simpler manner. This method has been applied in a variety of physical phenomena like transient heat conduction, thermo-

elasticity, contact problems, free boundary problems, water waves, aerodynamics, elasto-plastic material behaviour, electromagnetism, and soil mechanics. There is vast literature published on these topics in the last fifteen years, which can be found in the books by Brebbia [1978, 1984, 1985, 1988, 1991], the proceedings of international conferences held in different parts of the world edited by Brebbia et al. [1980, 1984, 1990, 1991], Antes and Panagiotopoulos [1992], Pozrikidis [1992], Banerjee [1994], Chen and Zhou [1992], and Hackbusch [1995]. We confine ourselves here to the basic mathematical techniques of the method and an illustration of the main steps in the solution of problems by this method. This is a fast-expanding area of practical numerical methods, and it can play a vital role in finding solutions of industrial problems.

3.3.1 Basic mathematical results for the boundary element method

Let H be an appropriate Hilbert space; then a bounded linear operator T on H into itself is called self-adjoint if T = T*, where T* is the adjoint operator of T defined by the relation

(Tu, v) = (u, T*v).

The self-adjointness of an operator is analogous to the symmetry of a matrix. More frequently, H is taken to be the Sobolev space of order 1, namely, H¹(Ω).

Equations (3.49) can be written as

Tu = f on Ω (3.105)(i)
Su = g on Γ_1 (3.105)(ii)
Lu = h on Γ_2, (3.105)(iii)

where Γ = Γ_1 + Γ_2 is the boundary of the region Ω ⊂ R^n (n = 1, 2 or 3); (3.105)(ii) and (3.105)(iii) are called, respectively, the essential and natural boundary conditions.

Let

(u, v) = ∫_Ω u v dΩ,
(u, v)_Γ = ∫_Γ u v dΓ,
(Tu, v) = ∫_Ω T(u) v dΩ.

Various examples of T were discussed earlier.

Applying integration by parts repeatedly, one can get a relationship of the type

(Tu, w) = ∫_Ω T(u) w dΩ = ∫_Ω u T(w) dΩ + ∫_Γ [L(w)S(u) − S(w)L(u)] dΓ. (3.106)

(i) Let

T(u) = d²u/dx² − λ²u, Ω = (0, 1).

Then

Γ = {0, 1}, S(u) = du/dx, L(u) = u,

and the relation (3.106) takes the form

∫_0^1 T(u) w dx = ∫_0^1 T(w) u dx + [L(w)S(u)]_0^1 − [L(u)S(w)]_0^1.

(ii) Let

T(u) = d⁴u/dx⁴, Ω = (0, 1);

then (3.106) takes the form

∫_0^1 T(u) w dx = ∫_0^1 T(w) u dx + [L_1(w)S_1(u) − L_2(w)S_2(u)]_0^1 + [S_2(w)L_2(u) − S_1(w)L_1(u)]_0^1,

where

S_1(u) = d³u/dx³, L_1(u) = u, S_2(u) = d²u/dx², L_2(u) = du/dx.
Let a particle of unit mass subject to the force of a specified field F be moved from a point x to a point y in space. Then the work done on the particle by the force, denoted by W, is given by the expression

W = ∫_x^y F · dr, (3.107)

where F is the force field vector and dr is the differential motion of the particle on the path from x to y.

If x is a fixed point while y varies, the integral (3.107) represents a function of y alone. This scalar function

u(y) = ∫_x^y F · dr (3.108)

is called the potential of the field F.


If the field is gravitational, then the potential is Newtonian. The Newtonian potential generated by two particles of masses m_1 and m_2 located at points x (fixed) and y (variable), respectively, is of the form

u(y) = ∫_z^y G m_1 m_2 ∇(1/r) · dr = G m_1 m_2 (1/r) + constant, (3.109)

where G is the gravitational constant and r is the distance between x and y; that is,

r(x, y) = ‖x − y‖ = d(x, y) = [(x_1 − y_1)² + (x_2 − y_2)² + (x_3 − y_3)²]^{1/2}, (3.110)

where x = (x_1, x_2, x_3), y = (y_1, y_2, y_3).


Attraction forces of the same character as those occurring in gravitation also act between electric charges and the poles of magnets. In that case, we shall refer to sources rather than masses. Thus, we assume that a unit simple source located at a source point x in space generates at a point y the potential 1/‖x − y‖, which is a continuous function of y, infinitely differentiable except at the source point x.

A discrete distribution of simple sources of intensities φ_1, φ_2, ..., φ_m located at points ξ_1, ξ_2, ..., ξ_m, respectively, generates the Newtonian potential

u(y) = Σ_{n=1}^{m} φ_n / ‖ξ_n − y‖ (3.111)

at the point y.

This potential is also a continuous function of y and infinitely differentiable at all points except those coinciding with the ξ_n. If we consider a continuous distribution of simple sources of volume density φ throughout the domain Ω, then the potential associated with this force field is a volume potential

u(x) = ∫_Ω φ(y) (1/‖x − y‖) dΩ(y). (3.112)

The integrand has a singularity when the field point x lies inside the domain Ω. It can be easily seen that u(x) exists at all x ∈ Ω and is differentiable provided φ is bounded. If φ satisfies the Lipschitz condition of order α > 0, that is, |φ(x) − φ(y)| ≤ M‖x − y‖^α, M > 0, then ∂²u/∂x_i² exists; that is,

∂u(x)/∂x_i = ∫_Ω φ(y) (∂/∂x_i)(1/‖x − y‖) dΩ(y)
= −∫_Γ φ(y) (1/‖x − y‖) n_i(y) dΓ(y) + ∫_Ω (∂φ(y)/∂x_i(y)) (1/‖x − y‖) dΩ(y).

By differentiating this relation with respect to x_i(x), the i-th coordinate of x, we



get

∂²u(x)/∂x_i² = φ(x) ∫_Γ (∂/∂x_i(y))(1/‖x − y‖) n_i(y) dΓ(y)
+ ∫_Ω [φ(x) − φ(y)] (∂²/∂x_i(x)²)(1/‖x − y‖) dΩ(y). (3.113)

It is clear that for the derivation we have used the relations

(∂/∂x_i(x))(1/‖x − y‖) = −(∂/∂x_i(y))(1/‖x − y‖)

and

(∂/∂x_i(y))[φ(x) − φ(y)] = −∂φ(y)/∂x_i(y),

where x_i(y) denotes the i-th coordinate of y.
Adding (3.113) for the different values of i, we get

∇²u(x) = φ(x) ∫_Γ (∂/∂n(y))(1/‖x − y‖) dΓ(y)
+ ∫_Ω [φ(x) − φ(y)] ∇²(1/‖x − y‖) dΩ(y). (3.114)

The second integral of (3.114) vanishes, as Ω can be divided into two parts, one a small sphere of radius ε surrounding the point x and the other the entire remaining region, denoted by Ω_ε, keeping in mind that 1/‖x − y‖ satisfies the Laplace equation in Ω_ε (the integral over the small sphere vanishes as φ satisfies the Lipschitz condition).

Therefore

∇²u(x) = φ(x) ∫_{Γ_ε} (∂/∂n(y))(1/‖x − y‖) dΓ(y) + φ(x) ∫_Γ (∂/∂n(y))(1/‖x − y‖) dΓ(y), (3.115)

where Γ_ε is the boundary of the sphere surrounding x and Γ is the boundary of Ω. We have

∫_{Γ_ε} (∂/∂n(y))(1/‖x − y‖) dΓ(y) = −(1/ε²) ∫_{Γ_ε} dΓ = −4π.

Since there are no sources in the domain between Γ_ε and Γ, there is no flux out of the region and, consequently,

∫_{Γ_ε} (∂/∂n)(1/r) dΓ + ∫_Γ (∂/∂n)(1/r) dΓ = 0, (3.116)

where r = ‖x − y‖ and the normal is outward on Γ but inward on Γ_ε. Therefore

∫_Γ (∂/∂n(y))(1/‖x − y‖) dΓ(y) = −4π. (3.117)

By (3.115) and (3.117), we get

∇²u(x) = −4πφ(x). (3.118)

The potential associated with a continuous distribution of simple sources extending over a surface Γ and of surface density σ, defined by the relation

u(x) = ∫_Γ σ(y) (1/‖x − y‖) dΓ(y), (3.119)

is called a single layer potential.

Let Γ(y) and Γ(y') be two surfaces separated by a small distance h, carrying distributions of attraction of magnitudes σ(y) and σ(y'), respectively, such that

σ(y) dΓ(y) = −σ(y') dΓ(y');

then the potential

u(x) = ∫_Γ μ(y) (∂/∂n(y))(1/‖x − y‖) dΓ(y), (3.120)

where μ is the limit of σh as h → 0 and σ → ∞, is called a double layer potential. The function μ is known as the surface density of the double layer. It can be verified (see, for example, Brebbia, Telles and Wrobel [1984]) that the value of the double layer potential given by (3.120) is equal to

u⁺(x) = −2πμ(x) + ∫_Γ μ(y) (∂/∂n(y))(1/‖x − y‖) dΓ(y),

if x approaches the surface Γ from the inside, and to

u⁻(x) = 2πμ(x) + ∫_Γ μ(y) (∂/∂n(y))(1/‖x − y‖) dΓ(y),

in the case x approaches the surface from the outside.

Thus we get

u⁺(x) − u⁻(x) = −4πμ(x). (3.121)
The concept equivalent to the Newtonian potential in two dimensions is the logarithmic potential, defined as log(1/‖x − y‖), where

‖x − y‖ = d(x, y) = {[x_1(x) − x_1(y)]² + [x_2(x) − x_2(y)]²}^{1/2},

and x_i(x) and x_i(y) are the coordinates of the points x and y, respectively. The single-layer potential for two-dimensional problems is given by

u(x) = ∫_Γ σ(y) log(1/‖x − y‖) dΓ(y). (3.122)

The double layer potential for two-dimensional problems is of the form

u(x) = ∫_Γ μ(y) (∂/∂n(y))(log(1/‖x − y‖)) dΓ(y). (3.123)

The two-dimensional volume potential is defined as

u(x) = ∫_Ω φ(y) log(1/‖x − y‖) dΩ(y).

It satisfies Poisson's equation

∇²u(x) = −2πφ(x).

Equation (3.121) in two-dimensional problems takes the form

u⁺(x) − u⁻(x) = −2πμ(x). (3.124)

3.3.2 Formulation of boundary value problems in terms of integral equations over the boundary of the given domain
Indirect method. We consider Laplace's equation with the Dirichlet or Neumann boundary conditions, namely, (3.105) with T = ∇², f = 0, where S = 1, g = ū, x ∈ Γ_1 (Dirichlet boundary condition), or L = ∂/∂n, where n is the unit outward normal to the surface Γ, h = q̄(x), x ∈ Γ_2 (Neumann boundary condition), Γ = Γ_1 + Γ_2, and ū and q̄ are the prescribed values of u and ∂u/∂n.

A function u is called harmonic within Ω, bounded by a closed surface Γ, if (i) u is continuous on the boundary of Ω, namely on Γ, and inside Ω, (ii) u is at least twice differentiable, and (iii) u satisfies Laplace's equation in Ω.
[Figure 3.4: The domain Ω with boundary conditions u = ū on Γ_1 and h(x) = q̄(x) on Γ_2 (Neumann boundary condition).]

It is well known that a harmonic function can be represented by a potential distribution and, conversely, every potential is a harmonic function (see, for example, Kellog [1929]). An effective method of formulating the boundary value problem of potential theory is to represent the harmonic function by a single-layer or a double-layer potential generated by continuous source distributions over Γ, provided these potentials satisfy the prescribed boundary conditions for u. This method helps us to formulate the associated integral equations which define the source densities concerned.
Thus, for formulating the Dirichlet boundary value problem in terms of an integral equation, we assume that the unknown function u(x) is expressed by a single-layer potential with unknown density σ; that is,

u(x) = ∫_Γ σ(y) u*(x, y) dΓ(y), x ∈ Ω, (3.125)

where

u*(x, y) = 1/‖x − y‖ is the Newtonian potential.

(See Theorem 3 of Dautry and Lions [1990] in three dimensions.)

Since the kernel u*(x, y) in (3.125) is continuous as x passes through the surface,

taking the limit in this equation as x approaches Γ gives the integral equation on Γ

ū(x) = ∫_Γ σ(y) u*(x, y) dΓ(y), x ∈ Γ. (3.126)

In this equation, σ(·) is the only unknown. Equation (3.126) is a Fredholm equation of the first kind. While using a double-layer potential for u(x), one can write the Dirichlet boundary problem in the form of a Fredholm equation of the second kind, but experts prefer the single-layer treatment as it is more convenient mathematically and more illuminating physically.

If we represent the solution of the Neumann boundary problem in three dimensions by a single-layer potential with unknown density σ, namely,

u(x) = ∫_Γ σ(y) u*(x, y) dΓ(y), (3.127)

we get

q̄(x) = ∂u/∂n(x) = −απσ(x) + ∫_Γ σ(y) (∂u*/∂n(x))(x, y) dΓ(y), (3.128)

where α = 1 for two-dimensional problems and α = 2 for three-dimensional problems (u*(·,·) is the Newtonian potential in three dimensions and the logarithmic potential in two dimensions). Equation (3.128) is a Fredholm equation of the second kind, as the unknown σ(·) appears both inside and outside the integral. Equation (3.128) has a solution provided

∫_Γ q̄(x) dΓ(x) = 0. (3.129)

The solution is unique only within an arbitrary additive constant. See results in this direction in Dautry and Lions [pp. 130-136, vol. 4, 1990].

Direct formulation of boundary value problems in the form of integral equations on the boundary. We use the weighted residual method, discussed in Section 3.2.1, for writing boundary value problems as integral equations on the boundary of a given domain.

Let us consider the boundary value problem

∇²u(x) = 0 in Ω,
u(x) = ū on Γ_1, (3.130)
∂u/∂n = q̄(x) on Γ_2,

where Γ = Γ_1 + Γ_2, Ω ⊂ R³, and Γ is the boundary of the region Ω. The weighted residual equation (3.62) takes the following form for this boundary value problem:

∫_Ω ∇²u(x) u*(x, y) dΩ(x) = ∫_{Γ_2} (q(x) − q̄(x)) u*(x, y) dΓ(x)
− ∫_{Γ_1} [u(x) − ū(x)] q*(x, y) dΓ(x), (3.131)

where u*(·,·) is interpreted as the weighting function and

q*(x, y) = ∂u*(x, y)/∂n(x). (3.132)

Integrating (3.131) by parts with respect to x_i, we have

−∫_Ω (∂u(x)/∂x_i)(∂u*(x, y)/∂x_i) dΩ(x) = −∫_{Γ_1} q(x) u*(x, y) dΓ(x)
− ∫_{Γ_2} q̄(x) u*(x, y) dΓ(x) − ∫_{Γ_1} [u(x) − ū(x)] q*(x, y) dΓ(x), (3.133)

where i = 1, 2, 3 and Einstein's summation convention is followed for repeated indices. Using integration by parts once more, we get

∫_Ω ∇²u*(x, y) u(x) dΩ(x) = −∫_Γ q(x) u*(x, y) dΓ(x) + ∫_Γ u(x) q*(x, y) dΓ(x), (3.134)

keeping in mind that Γ = Γ_1 + Γ_2.
keeping in mind that I' = I' 1 + I'2 .


Recalling the following properties of the Dirac delta function δ(x,y):

$$\delta(x,y) = 0 \ \text{ if } x \neq y, \qquad \int_\Omega u(x)\,\delta(x,y)\, d\Omega(x) = u(y) \ \text{ if } y \in \Omega, \qquad (3.135)$$

and assuming u*(x,y) to be the fundamental solution of Poisson's equation; namely,

$$\nabla^2 u^*(x,y) = -2\alpha\pi\,\delta(x,y), \qquad (3.136)$$

where α = 1 and α = 2 correspond, respectively, to two- and three-dimensional problems, and putting the value of ∇²u*(x,y) from (3.136) into (3.134), we have

$$2\alpha\pi\, u(y) + \int_\Gamma u(x)\, q^*(x,y)\, d\Gamma(x) = \int_\Gamma q(x)\, u^*(x,y)\, d\Gamma(x). \qquad (3.137)$$

Considering the point y to be on the boundary and accounting for the jump of the left-hand side, (3.137) yields the integral equation on the boundary of the given domain Ω (boundary integral equation)

$$c(y)\, u(y) + \int_\Gamma u(x)\, q^*(x,y)\, d\Gamma(x) = \int_\Gamma q(x)\, u^*(x,y)\, d\Gamma(x). \qquad (3.138)$$

Remark 3.15. (i) If we are looking for a solution of the Neumann boundary problem, then the right-hand side of (3.138) is known and, for the desired solution, we have to solve a Fredholm equation of the second kind.
(ii) If we are interested in solving a Dirichlet boundary problem, the values of u(x) are prescribed throughout Γ and the problem reduces to solving a Fredholm equation of the first kind in the unknown normal derivative q(x) = ∂u(x)/∂n.
(iii) The solution of a Cauchy (mixed) boundary problem will require the solution of a mixed integral equation for the unknown boundary data.

3.3.3 Main ingredients of the boundary element method


The following steps are involved in solving a boundary value problem by the
boundary element method:

(i) Conversion of the problem into the boundary integral equation (as indicated
in Subsection 3.3.2).

(ii) Discretization of the boundary Γ into a series of elements over which the potential and its normal derivatives are assumed to vary according to interpolation functions. These elements could be straight lines, circular arcs, parabolas, etc.

(iii) By the collocation method, the discretized equation is applied to a number


of particular nodes within each element where values of the potential and its
normal derivatives are associated.

(iv) Evaluation of the integrals over each element by normally using a numerical
quadrature scheme.

(v) Derivation of a system of linear algebraic equations by imposing the prescribed boundary conditions, and solution of this system by direct or iterative methods.

(vi) Determination of values of the function u at the internal points of domain


under consideration.

Remark 3.16. (i) The values of u(x) at any internal point of the domain are determined by equation (3.137).
(ii) The values of the derivatives of u at any internal point y with Cartesian coordinates x_i(y), i = 1, 2, 3, can be determined from the equation

$$\frac{\partial u(y)}{\partial x_i(y)} = \frac{1}{2\alpha\pi}\left\{\int_\Gamma q(x)\,\frac{\partial u^*(x,y)}{\partial x_i(y)}\, d\Gamma(x) - \int_\Gamma u(x)\,\frac{\partial q^*(x,y)}{\partial x_i(y)}\, d\Gamma(x)\right\}. \qquad (3.139)$$

It is clear that (3.139) is obtained by differentiating (3.137) under appropriate conditions.
(iii) A significant advantage of the boundary element method is the relaxation of the condition that the boundary surface be smooth (Lyapunov); that is, it can be used for surfaces having corners or edges.

We now illustrate the above-mentioned steps through concrete examples over finite regions of homogeneous isotropic media with Neumann, Dirichlet and Cauchy boundary conditions. Non-homogeneous regions may be divided into homogeneous subregions with different physical properties, and the method can be applied to a system of equations for each subregion by introducing compatibility of potentials and equilibrium conditions in terms of normal derivatives. The method can also be extended to infinite regions. The integral equation given in (3.138) can be discretized into a large number of elements. For the sake of simplicity, we consider a two-dimensional case in which the boundary is divided into segments of constant and linear form as shown in Figure 3.5.

Figure 3.5(i). Constant boundary elements.

For the constant element case, the boundary is discretized into N elements. Let N₁ of them belong to Γ₁ and N₂ to Γ₂, where the values of u and q are taken to be constant on each element and equal to the value at the midnode of the element. We observe that in each element the value of one of the two variables u or q is known. Equation (3.138) can be written in the form

$$c_i u_i + \int_\Gamma u\, q^*\, d\Gamma = \int_\Gamma q\, u^*\, d\Gamma, \qquad (3.140)$$

where u* = (1/2π) log(1/r), as a two-dimensional case is considered, while u* = (1/4π)(1/r) in three dimensions, with r = ‖x − y‖ = d(x,y). Here we have written u(y) = u_i and c(y) = c_i for the node i.
(3.138) can be discretized as follows:

$$c_i u_i + \sum_{j=1}^{N} \int_{\Gamma_j} u\, q^*\, d\Gamma = \sum_{j=1}^{N} \int_{\Gamma_j} u^*\, q\, d\Gamma. \qquad (3.141)$$

Figure 3.5(ii). Linear boundary elements.

For a constant element, the boundary is always smooth; hence the coefficient c_i is identically equal to ½. Γ_j denotes the j-th boundary element, of length l_j. (3.141) represents in discrete form the relationship between the node i at which the fundamental solution is applied and all the j elements, including the case i = j, on the boundary. The values of u and q inside the integrals in equation (3.141) are constant over each element and, consequently, we have

$$\hat H_{ij} = \int_{\Gamma_j} q^*\, d\Gamma, \qquad (3.142)$$

and

$$G_{ij} = \int_{\Gamma_j} u^*\, d\Gamma. \qquad (3.143)$$

(A symbol involving i and j is used to indicate that the integrals $\int_{\Gamma_j} q^*\, d\Gamma$ relate the i-th node with the element j.)
Equation (3.141) takes the form

$$\frac{1}{2}\, u_i + \sum_{j=1}^{N} \hat H_{ij}\, u_j = \sum_{j=1}^{N} G_{ij}\, q_j. \qquad (3.144)$$

Here, the integrals in (3.141) are simple and can be evaluated analytically; in general, however, numerical techniques will be employed.

Let us define

$$H_{ij} = \begin{cases} \hat H_{ij} & \text{if } i \neq j \\ \hat H_{ij} + \tfrac{1}{2} & \text{if } i = j, \end{cases} \qquad (3.145)$$

then (3.144) can be written as

$$\sum_{j=1}^{N} H_{ij}\, u_j = \sum_{j=1}^{N} G_{ij}\, q_j,$$

which can be written in the form of the matrix equation

$$A X = F, \qquad (3.146)$$

where X is the vector of unknown u's and q's (either u or q is known at each node, so X has N components), and A is a matrix of order N. The most effective and economical methods for solving the matrix equation (3.146) can be seen in Hackbusch [1994] and Datta [1995]. A short note on such methods is also given in Appendix E. Once (3.146) is solved, all the values of potentials and fluxes on the boundary are known, and one can calculate the values of potentials and fluxes at any interior point using

$$u_i = \int_\Gamma q\, u^*\, d\Gamma - \int_\Gamma q^*\, u\, d\Gamma. \qquad (3.147)$$

Equation (3.147) represents the integral relationship between an internal point and the boundary values of u and q; its discretized form is

$$u_i = \sum_{j=1}^{N} q_j\, G_{ij} - \sum_{j=1}^{N} u_j\, \hat H_{ij}. \qquad (3.148)$$

The values of the internal fluxes can be determined by differentiating (3.147) (see equation (3.139)), which gives us

$$q_l = \frac{\partial u}{\partial x_l} = \int_\Gamma q\, \frac{\partial u^*}{\partial x_l}\, d\Gamma - \int_\Gamma u\, \frac{\partial q^*}{\partial x_l}\, d\Gamma, \qquad (3.149)$$

where the x_l are the coordinates, with l = 1, 2 in two dimensions and l = 1, 2, 3 in three dimensions.

Remark 3.17. (i) Ĥ_ij and G_ij can be evaluated by simple Gauss quadrature for all elements (except for the element containing the node under consideration) as follows:

$$\hat H_{ij} = \int_{\Gamma_j} q^*\, d\Gamma = \frac{l_j}{2} \sum_{k=1}^{n} q_k^*\, w_k, \qquad G_{ij} = \int_{\Gamma_j} u^*\, d\Gamma = \frac{l_j}{2} \sum_{k=1}^{n} u_k^*\, w_k,$$

where l_j is the element length and w_k is the weight associated with the numerical integration point k.
(ii) For the integrals corresponding to singular elements, one may use the integral formulae discussed in Brebbia et al. [1984].
(iii) For constant elements, as in the present case,

$$\hat H_{ii} = \int_{\Gamma_i} q^*\, d\Gamma = 0,$$

due to the orthogonality between the normal and the surface of the element, where r = η|Γ_i|, η being the homogeneous coordinate over a segment.

(iv) Mixed boundary conditions like

$$e\, u + f\, q = d \ \text{ on } \Gamma, \qquad (3.150)$$

often known as Robin-type conditions, can be considered, where d, e, f are functions of position. If f = 0, it reduces to the Dirichlet condition, and for e = 0 we get the Neumann condition. For e ≠ 0 and f ≠ 0, it can include the impedance boundary conditions of electromagnetic problems and the convection boundary conditions of heat conduction problems. By applying this boundary condition at all boundary nodes, we again get a matrix equation

$$A X = F. \qquad (3.151)$$

(v) If we use the indirect formulation for boundary value problems, equations (3.128) and (3.130) can be discretized as above and we get a matrix equation

$$A\,\sigma = F,$$

where the unknowns in the σ vector are the source intensities:

$$u_i = \int_\Gamma \sigma\, u^*\, d\Gamma, \qquad q_i = -\frac{1}{2}\sigma_i + \int_\Gamma \sigma\, q^*\, d\Gamma,$$

which in discretized form read

$$u_i = \sum_{j=1}^{N} \sigma_j\, G_{ij}, \qquad q_i = -\frac{1}{2}\sigma_i + \sum_{j=1}^{N} \sigma_j\, \hat H_{ij}.$$

Setting

$$\bar H_{ii} = \hat H_{ii} - \frac{1}{2},$$

these become

$$u_i = \sum_{j=1}^{N} \sigma_j\, G_{ij}, \qquad q_i = \sum_{j=1}^{N} \sigma_j\, \bar H_{ij},$$

with u_i = ū_i at the N₁ points on Γ₁ and q_i = q̄_i at the N₂ points on Γ₂.

Discretization by linear elements in two dimensions. We consider here a linear variation for u and q, in which case the nodes are located at the intersections between straight elements as shown in Figure 3.5(ii). Equation (3.140) can be written as

$$c_i u_i + \int_\Gamma u\, q^*\, d\Gamma = \int_\Gamma q\, u^*\, d\Gamma,$$

or, in discretized form,

$$c_i u_i + \sum_{j=1}^{N} \int_{\Gamma_j} u\, q^*\, d\Gamma = \sum_{j=1}^{N} \int_{\Gamma_j} q\, u^*\, d\Gamma. \qquad (3.152)$$

The values of u and q at any point of an element can be defined in terms of their nodal values and two linear interpolation functions ψ₁ and ψ₂, functions of the homogeneous coordinate η, such that

$$u(\eta) = \psi_1 u_1 + \psi_2 u_2 = [\psi_1, \psi_2]\begin{Bmatrix} u_1 \\ u_2 \end{Bmatrix} = \psi^T u^n, \qquad q(\eta) = \psi_1 q_1 + \psi_2 q_2 = \psi^T q^n. \qquad (3.153)$$

The dimensionless coordinate η is equal to x/(l/2), and the functions ψ₁, ψ₂ are given by

$$\psi_1 = \frac{1}{2}(1 - \eta), \qquad \psi_2 = \frac{1}{2}(1 + \eta). \qquad (3.154)$$

The integral over an element j on the left-hand side of (3.152) can then be written as

$$\int_{\Gamma_j} u\, q^*\, d\Gamma = [h^1_{ij}, h^2_{ij}]\begin{Bmatrix} u_1 \\ u_2 \end{Bmatrix}, \quad \text{with } h^1_{ij} = \int_{\Gamma_j} \psi_1\, q^*\, d\Gamma, \ h^2_{ij} = \int_{\Gamma_j} \psi_2\, q^*\, d\Gamma. \qquad (3.155)$$

Similarly,

$$\int_{\Gamma_j} q\, u^*\, d\Gamma = [g^1_{ij}, g^2_{ij}]\begin{Bmatrix} q_1 \\ q_2 \end{Bmatrix}, \quad \text{with } g^1_{ij} = \int_{\Gamma_j} \psi_1\, u^*\, d\Gamma, \ g^2_{ij} = \int_{\Gamma_j} \psi_2\, u^*\, d\Gamma. \qquad (3.156)$$

Putting these values together, (3.152) can again be written as a matrix equation A X = F. For details, we refer to Brebbia et al. [1984].

3.3.4 Coupling of boundary element and finite element methods

Combining the finite element and boundary element methods is of great interest in many practical problems. A reasonable approach could be to convert a problem on a given domain into a boundary integral equation and then to solve it by the finite element method. We present here some general results in this direction without proof (for details, we refer to Dautray-Lions [Vol. 4, 1990]).

Let Ω ⊂ R³ and

$$H^2(\Omega) = \{f \in L^2(\Omega) \mid D^\alpha f \in L^2(\Omega),\ \alpha = (\alpha_1,\alpha_2,\alpha_3),\ |\alpha| = \alpha_1 + \alpha_2 + \alpha_3 \le 2\},$$

$$(f, g) = \sum_{|\alpha| \le 2} (D^\alpha f, D^\alpha g)_{L^2(\Omega)}, \qquad \|f\|_{H^2(\Omega)} = \sum_{|\alpha| \le 2} \|D^\alpha f\|_{L^2(\Omega)}.$$
Consider the Dirichlet problem

$$\nabla^2 u(x) = 0, \ x \in \Omega, \qquad u|_\Gamma = u_0, \quad \Gamma \text{ the boundary of } \Omega. \qquad (3.157)$$

This is called the interior Dirichlet problem, while the following problem is called the exterior Dirichlet problem:

$$\nabla^2 u(x) = 0, \ x \in R^3 \setminus \bar\Omega = \Omega', \qquad u|_\Gamma = u_0. \qquad (3.158)$$

Theorem 3.5. Let u₀(x) be an element of H^{1/2}(Γ) and let

$$\frac{1}{4\pi}\int_\Gamma \frac{q(y)}{\|x - y\|}\, d\gamma(y) = u_0(x), \quad x \in \Gamma, \qquad (3.159)$$

where dγ is the element of surface on Γ = ∂Ω, be a boundary integral equation with unknown q(·). (3.159) has a unique solution in the space H^{−1/2}(Γ). This problem has the variational formulation

$$a(q, q') = F(q') \quad \text{for all } q' \in H^{-1/2}(\Gamma), \qquad (3.160)$$

where

$$a(q, q') = \frac{1}{4\pi}\int_\Gamma\int_\Gamma \frac{q(x)\, q'(y)}{\|x - y\|}\, d\gamma(x)\, d\gamma(y), \qquad (3.161)$$

$$F(q') = \int_\Gamma u_0(y)\, q'(y)\, d\gamma(y). \qquad (3.162)$$

a(·,·) is coercive on H^{−1/2}(Γ); that is, there exists α > 0 such that a(q, q) ≥ α ‖q‖²_{H^{−1/2}(Γ)}.
The solutions of the interior and exterior Dirichlet boundary value problems (3.157) and (3.158) exist and are given by

$$u(x) = \frac{1}{4\pi}\int_\Gamma \frac{q(y)}{\|x - y\|}\, d\gamma(y), \quad \text{for all } x \in R^3, \qquad (3.163)$$

where q(·) is the solution of (3.159).

Note: Analogous results can be proved for the Neumann boundary problems, and for the Helmholtz equation with the Dirichlet and Neumann boundary conditions (see Dautray-Lions, Vol. 4, Chapter XI).

Finite element approximation of the boundary integral equation given in (3.159).
The approximate (discrete) problem of (3.160) reads: find a subspace V_h of H^{−1/2}(Γ) and q_h ∈ V_h such that

$$a(q_h, q'_h) = F(q'_h) \quad \text{for all } q'_h \in V_h. \qquad (3.164)$$

Remark 3.18. (i) In the simplest case, V_h will consist of functions that are constant on each element of the mesh T_h. More generally, we can construct V_h by a finite element method using polynomials of degree k. We can use finite elements of C⁰ class whenever k ≥ 1; the space V_h will then be a space of continuous functions whose restriction to each triangle is a polynomial of degree k.

(ii) If V_h is the space of functions constant on the elements of the mesh, the number of unknowns is equal to the number of these elements. The coefficients of the matrix of the linear system to be solved are

$$a_{ij} = \frac{1}{4\pi}\int_{T_i}\int_{T_j} \frac{d\gamma(x)\, d\gamma(y)}{\|x - y\|}. \qquad (3.165)$$

The matrix is symmetric and positive definite by virtue of the coercivity of the bilinear form a(q, q'). If h is the maximum diameter of the elements in the mesh T_h, then the error between the solution q_h of the discretized equation (3.164) and q, the solution of (3.160), is given by the following theorem.

Theorem 3.6. Let q be the solution of (3.160) and q_h be the solution of (3.164); then

$$\|q - q_h\|_{H^{-1/2}(\Gamma)} \le C\, h^{k+1}\, \|q\|_{H^{k+1}(\Gamma)} \qquad (3.166)$$

(k = 1 if the elements are from H¹(Ω) or k = 2 if the elements belong to H²(Ω)).
Let u_h be given by

$$u_h(x) = \frac{1}{4\pi}\int_\Gamma \frac{q_h(y)}{\|x - y\|}\, d\gamma(y),$$

and u by (3.163); then the corresponding error u − u_h admits an estimate of the same order.
Results analogous to Theorem 3.6 hold for many other classes of boundary value problems, such as the Helmholtz equation with Dirichlet and Neumann boundary conditions and the problems of linear elasticity and the Stokes system (see Dautray-Lions [Vol. 4, 1990]).

3.4 Problems
Problem 3.1 (i) What do you understand by wave propagation in uniform guides?
(ii) Write down a mathematical equation which models wave propagation in uniform guides.
(iii) Write down the variational (weak) formulation of the mathematical model.

Problem 3.2 (i) Write down the Helmholtz equation in three-dimensional space with the Dirichlet and the Neumann boundary conditions.
(ii) What kind of physical phenomena are represented by the Helmholtz equation with well-known boundary conditions?

Problem 3.3. Discuss the numerical solution of the Helmholtz equation with Dirichlet and Neumann boundary conditions by

(i) Finite element method.


(ii) Boundary element method.

Problem 3.4. Discuss the solution of the Helmholtz equation with hysteresis.

Problem 3.5. Write down Maxwell's equations in integral form.

Problem 3.6. Discuss the solution of the time-independent Maxwell's equations applying wavelet techniques.

Problem 3.7. Write down the mathematical equation modelling radio antennas. Find its finite element solution.

Problem 3.8. What is the physical meaning of Sommerfeld's radiation condition? Write down its mathematical formulation.

Problem 3.9. Explain the meaning of the following terms:


(i) The electromagnetic energy and Maxwell stress tensor.
(ii) Spectral problems of wave guides.
(iii) Formulae providing transmission conditions at the interface between different
continuous media.

Problem 3.10. Discuss the solution of Maxwell's equations by Mathematica.

Problem 3.11. Discuss the error between the exact and the finite element solution of the following differential equation:

$$-\frac{d^2 u}{dx^2} = 2, \quad 0 < x < 1, \qquad u(0) = u(1) = 0.$$
Problem 3.12. Consider the one-dimensional heat transfer in an insulated rod of cross-sectional area A, length L, conductivity k, convection coefficient β, and surrounding-medium (ambient) temperature T∞. Write down the differential equation modelling this phenomenon and solve it by the finite element method.

Problem 3.13. Write down the discrete form of the differential equation

$$\frac{d^2}{dx^2}\left(p(x)\,\frac{d^2 u}{dx^2}\right) + f(x) = 0, \quad 0 < x < L,$$

under appropriate boundary conditions.



Problem 3.14. Solve the Poisson equation

$$-\nabla^2 u = 1$$

in a square of side 1 with the boundary conditions

$$\frac{\partial u}{\partial x}(0, y) = \frac{\partial u}{\partial y}(x, 0) = 0, \qquad u(1, y) = u(x, 1) = 0,$$

by the finite element method using linear elements.

Problem 3.15. Solve the integral equations of electrostatics in one dimension by the finite element method.

Problem 3.16. Discuss the finite element solution of the nonlinear equation

$$\nabla \times (\nu\, \nabla \times A) = J,$$

where ν = 1/μ, μ being the material permeability, and A is a vector potential whose curl gives the magnetic flux density B (∇ × A = B).

Problem 3.17. Solve Poisson's equation ∇²u = b in a square with sides of length 2 and homogeneous boundary conditions u = 0 at x = ±1 and y = ±1, by the boundary element method.

Problem 3.18. Discuss the solution of the transient scalar wave equation by the boundary element method.

Problem 3.19. Write an essay on the finite element method in less than 1000 words, introducing the topic without writing mathematical formulas.

Problem 3.20. What do you understand by an interpolation (approximation) function? Explain this concept with the help of examples.

Problem 3.21. Show that the Helmholtz operator A = ∇² + k² is self-adjoint.

Problem 3.22. Consider Poisson's equation in the square whose vertices are (0,0), (1,0), (1,1) and (0,1), with the boundary conditions

$$u(0, y) = y^2, \qquad u(x, 0) = x^2,$$
$$\frac{\partial u}{\partial x}(1, y) = 2 - 2y - y^2, \qquad \frac{\partial u}{\partial y}(x, 1) = 2 - x - x^2.$$

Solve the boundary value problem using trial functions of the form:

$$u_1(x, y) = x^2 + y^2 + a\,xy, \qquad u_2(x, y) = x^2 + y^2 + b\,xy(x + y).$$

Problem 3.23. State the main steps involved in solving the Poisson equation by applying triangular finite elements.

Problem 3.24. What do you understand by eddy current problems? Describe the mathematical modelling of eddy currents and discuss a numerical solution of the model.

Problem 3.25. Discuss the finite element solution of the following boundary value problem:

$$-\frac{d^2 u}{dx^2} + u(x) = f(x), \quad x \in (0,1), \qquad u(0) = u(1) = 0.$$

Problem 3.26. Find the Ritz-Galerkin solution of the following boundary value problems:

(a) $\left(\dfrac{\partial^2 u}{\partial x^2} + \dfrac{\partial^2 u}{\partial y^2}\right) = -1$ in the square (0,1) × (0,1), with u(x) = 0 on the boundary of the square;

(b) $\left(\dfrac{\partial^2 u}{\partial x^2} + \dfrac{\partial^2 u}{\partial y^2}\right) = -\pi^2 \cos \pi x$ in the square (0,1) × (0,1), with $\dfrac{\partial u}{\partial n} = 0$ on the boundary of the square.

Hints for solution:

Problem 3.1. (i) The behaviour of wave guides derives from the property of a hollow tube with a highly reflecting inside surface, which channels electromagnetic wave energy from one end to the other. A simple example is that of shining light through a tube with a mirrored interior surface. Because of the possibility of launching two alternative, independent plane waves, distinct by virtue of possessing orthogonal polarizations, one may get two independent sets of waves or modes in the guide, known as transverse magnetic and transverse electric waves. This phenomenon is known as wave propagation in uniform guides.
(ii) Let us consider a cylindrical waveguide of arbitrary cross-section with its axis aligned along the z-direction. In order that a coherent wave propagate at frequency ω, we must have

$$E = E(x, y)\, \exp i(\omega t - \beta z),$$

$$H = H(x, y)\, \exp i(\omega t - \beta z),$$

where β is a propagation constant corresponding to the wavelength 2π/β. Putting these values in the following form of Maxwell's equations:

$$\nabla \times E = -i\omega\mu H \qquad (i)$$
$$\nabla \times H = i\omega\epsilon E \qquad (ii)$$

(we get (i) and (ii) by keeping in mind that ∂/∂t in (3.17)(i) and (3.17)(ii) can be replaced by iω), we get

$$\frac{\partial E_z}{\partial y} + i\beta E_y = -i\omega\mu H_x \qquad (iii)$$
$$-i\beta E_x - \frac{\partial E_z}{\partial x} = -i\omega\mu H_y \qquad (iv)$$
$$\frac{\partial E_y}{\partial x} - \frac{\partial E_x}{\partial y} = -i\omega\mu H_z \qquad (v)$$
$$\frac{\partial H_z}{\partial y} + i\beta H_y = i\omega\epsilon E_x \qquad (vi)$$
$$-i\beta H_x - \frac{\partial H_z}{\partial x} = i\omega\epsilon E_y \qquad (vii)$$
$$\frac{\partial H_y}{\partial x} - \frac{\partial H_x}{\partial y} = i\omega\epsilon E_z. \qquad (viii)$$

By solving (iii) and (vii) for H_x and E_y, and (iv) and (vi) for H_y and E_x, we get

$$H_x = i\left(\omega\epsilon\,\frac{\partial E_z}{\partial y} - \beta\,\frac{\partial H_z}{\partial x}\right)\Big/(k^2 - \beta^2) \qquad (ix)$$
$$H_y = -i\left(\omega\epsilon\,\frac{\partial E_z}{\partial x} + \beta\,\frac{\partial H_z}{\partial y}\right)\Big/(k^2 - \beta^2) \qquad (x)$$
$$E_x = -i\left(\beta\,\frac{\partial E_z}{\partial x} + \omega\mu\,\frac{\partial H_z}{\partial y}\right)\Big/(k^2 - \beta^2) \qquad (xi)$$
$$E_y = -i\left(\beta\,\frac{\partial E_z}{\partial y} - \omega\mu\,\frac{\partial H_z}{\partial x}\right)\Big/(k^2 - \beta^2). \qquad (xii)$$

If μ and ε are constants then, choosing k² = ω²με, it can be shown that

$$\nabla^2 E + k^2 E = 0 \qquad (xiii)$$
$$\nabla^2 H + k^2 H = 0. \qquad (xiv)$$

See also Remark 3.10. Observing the exp(−iβz) dependence of the fields, the z-component equations of the above are given as

$$\nabla_t^2 E_z + (k^2 - \beta^2)\, E_z = 0 \qquad (xv)$$
$$\nabla_t^2 H_z + (k^2 - \beta^2)\, H_z = 0, \qquad (xvi)$$

where ∇_t² = ∂²/∂x² + ∂²/∂y² is the transverse Laplacian operator.
E_z and the transverse E-field components determined by it from equations (ix) to (xii) may exist separately from H_z and its corresponding transverse components. Thus the two separate modes, transverse electric (TE), where E_z = 0, H_z ≠ 0, and transverse magnetic (TM), where H_z = 0, E_z ≠ 0, are identified. Equations (xv) and (xvi) are crucial, as all other field quantities follow from these two. It may be observed that these two equations are special cases of the well-known Helmholtz equation. The corresponding variational functionals are

$$F(E_z) = \frac{1}{2}\int_\Omega \left[(\nabla E_z)^2 - (k^2 - \beta^2)\, E_z^2\right] dx\, dy,$$
$$F(H_z) = \frac{1}{2}\int_\Omega \left[(\nabla H_z)^2 - (k^2 - \beta^2)\, H_z^2\right] dx\, dy. \qquad (xvii)$$

Problem 3.2. (i)

$$\Omega \subset R^3: \quad \nabla^2 u(x) + k^2 u(x) = 0, \ x \in \Omega, \qquad u(x) = 0 \ \text{ on } \partial\Omega = \Gamma \ \text{(Dirichlet boundary)}, \qquad \frac{\partial u(x)}{\partial n} = 0 \ \text{ on } \partial\Omega = \Gamma \ \text{(Neumann boundary)}, \qquad (i)$$

where k is a positive real constant.
The non-homogeneous Helmholtz equation with Dirichlet or Neumann boundary conditions in three dimensions is

$$\nabla \cdot (p\, \nabla u) + k^2 u = g \ \text{ in } \Omega, \qquad (ii)$$

where u = u(x,y,z) is a function of position whose determination constitutes the desired solution; p denotes the function p(x,y,z) representing the material properties of the medium; k² is a constant, independent of position, which may or may not be known; and g(x,y,z) is a given function.
Equation (ii) is called the non-homogeneous or inhomogeneous Helmholtz equation. It may be assigned Dirichlet, Neumann or mixed boundary conditions.
(iii) It is a model of the propagation of waves in uniform guides, as seen above. It models the radiation from electromagnetic sources. It has a close connection with the wave equation

$$\frac{\partial^2 w}{\partial t^2} - \nabla^2 w + g = 0 \ \text{ on } \Omega \times R, \quad \Omega \subset R^3. \qquad (iii)$$

For g(x,t) = e^{ikt} f(x) on Ω × R, u(x) is a solution of (ii), for p = 1, if and only if w(x,t) = e^{ikt} u(x) is a solution of (iii).
In one dimension the Helmholtz equation

$$\frac{d^2 u}{dx^2} + k^2 u = 0 \qquad (iv)$$

is the model of the vibrating string, whose general solution is

$$u(x) = \lambda\, e^{ikx} + \mu\, e^{-ikx}. \qquad (v)$$

Originally, Helmholtz used this model to study acoustic resonance frequencies of halls, loudspeakers, etc. Sound waves in air are represented by longitudinal disturbances in pressure, denoted by u(·), and perturbations of a prevailing static pressure u₀. Such disturbances are governed by the wave equation

$$\nabla^2 u = \frac{1}{c^2}\frac{\partial^2 u}{\partial t^2}, \qquad (vi)$$

where c² = γu₀/ρ₀, with ρ₀ the static density of the air and γ its ratio of specific heats at constant pressure and volume. The disturbance velocity v of the air is related to the incremental pressure u through the linearized equation of motion

$$\nabla u = -\rho_0\, \frac{\partial v}{\partial t}. \qquad (vii)$$

(vi) and (vii) may be cast into complex phasor form to give the Helmholtz equation

$$\nabla^2 u + k^2 u = 0, \qquad \nabla u = -i\omega\rho_0\, v, \qquad \text{where } k^2 = \omega^2/c^2.$$

The study of electrical heating by electromagnetic induction also requires the study of the Helmholtz equation. Standing waves on a bounded shallow body of water are modelled by the Helmholtz equation with the Neumann boundary condition.

Problem 3.3.
Finite element method. We consider here the Helmholtz equation with the Dirichlet boundary condition. The Neumann boundary problem can be solved along similar lines, keeping in mind that the integral on the boundary will be zero.
Let Ω ⊂ R³ be divided into m elements of r nodes each. We may express the behaviour of the unknown function u(x) within each element as

$$u^{(j)} = \sum_{i=1}^{r} a_i\, u_i = [a]\{u\}^{(j)}, \qquad (i)$$

where u_i is the value of u(·) at the i-th node. Here, we are considering nodal values of u as degrees of freedom; however, values of derivatives could also be used as degrees of freedom without changing the procedure to be followed. The quantity u_i may be thought of as a general nodal parameter. The functional

$$I(u) = \frac{1}{2}\int_\Omega \left[(\nabla u)^2 - \lambda\, u^2\right] d\Omega \qquad (ii)$$

for the Helmholtz equation

$$\nabla^2 u + \lambda u = 0 \ \text{ in } \Omega, \quad \lambda > 0 \ \text{known or unknown}, \qquad u = 0 \ \text{ on } \Gamma,$$

attains its extremum at the solution of the Helmholtz equation under consideration; that is,

$$\frac{\partial I}{\partial u_i} = 0, \quad i = 1, 2, \dots, r, \qquad (iii)$$

will be satisfied.
Since the functional I(·) contains only first-order derivatives, the a_i must be chosen to preserve at least continuity of u at element interfaces. If the approximating functions guarantee continuity, we may focus attention on one element, because the integral I(u) can be represented as the sum of integrals over all elements; that is,

$$I(u) = \sum_{j=1}^{m} I(u^{(j)}). \qquad (iv)$$

The discretized form of the functional for one element is obtained by substituting equation (i) into (ii). Then (iii) takes the form

$$\frac{\partial I(u^{(j)})}{\partial u_i} = 0, \quad i = 1, 2, \dots, r. \qquad (v)$$

By (i), we get

$$\frac{\partial I(u^{(j)})}{\partial u_i} = \int_{\Omega^{(j)}} \left[\left[\frac{\partial a}{\partial x}\right]\{u\}\frac{\partial a_i}{\partial x} + \left[\frac{\partial a}{\partial y}\right]\{u\}\frac{\partial a_i}{\partial y} + \left[\frac{\partial a}{\partial z}\right]\{u\}\frac{\partial a_i}{\partial z} - \lambda\, a_i\, [a]\{u\}\right] d\Omega^{(j)} = 0. \qquad (v')$$

Combining all equations like equation (v), we get

$$[K]^{(j)}\{u\}^{(j)} - \lambda\, [S]^{(j)}\{u\}^{(j)} = 0, \qquad (vi)$$

where the coefficients of the matrices [K]^{(j)} and [S]^{(j)} are given by the equations

$$k_{ij} = \int_{\Omega^{(j)}} \left(\frac{\partial a_i}{\partial x}\frac{\partial a_j}{\partial x} + \frac{\partial a_i}{\partial y}\frac{\partial a_j}{\partial y} + \frac{\partial a_i}{\partial z}\frac{\partial a_j}{\partial z}\right) d\Omega^{(j)},$$
$$s_{ij} = \int_{\Omega^{(j)}} a_i\, a_j\, d\Omega^{(j)}, \quad i = 1, 2, \dots, r, \ j = 1, 2, \dots, r. \qquad (vii)$$

Thus we have a matrix equation.
For a solution domain of m elements, the system of matrix equations is of the form

$$[K]\{u\} = \lambda\, [S]\{u\}, \quad \text{where } [K] = \sum_{j=1}^{m} [K]^{(j)}, \ [S] = \sum_{j=1}^{m} [S]^{(j)}, \qquad (viii)$$

and {u} is the column vector of nodal values of u. Equations in (viii) are a set of, say, n linear homogeneous algebraic equations in the nodal values of u. The problem we have arrived at here is known as the eigenvalue or characteristic value problem; the values of λ are termed eigenvalues or characteristic values. For each different value λ_i there is a different column vector {u}_i that satisfies (viii). The vector {u}_i that corresponds to a particular value λ_i is called an eigenvector or characteristic vector. We know that {u} ≠ {0}, that is, (viii) has a non-trivial solution, if and only if the characteristic determinant is zero; that is,

$$\left|\, [K] - \lambda\, [S] \,\right| = 0. \qquad (ix)$$

This equation is used to find the eigenvalues: expanding the determinant gives an n-th order polynomial in λ. (x)
By the fundamental theorem of algebra, this polynomial has n roots λ_i. Substituting these values of λ_i in (viii), we solve n sets of equations for the n eigenvectors {φ}_i. [K] and [S] are symmetric and positive definite, so the eigenvalues are real and the eigenvectors are linearly independent. The usual procedure for solution is to assign an arbitrary value to one component of the vector {u} and then solve the remaining (n − 1) equations for the other components. The consequence of this fact is that the natural eigenvectors {φ}_i of the wave motion are known only in relation to one another, not in absolute terms. Iterative numerical methods are also known for finding the solution of (viii) (see Ciarlet [1989]).
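In practice, the generalized eigenvalue problem (viii) is handed directly to a library solver. The following sketch (our own choice of model problem) assembles the linear-element stiffness and mass matrices for −u″ = λu on (0,1) with u(0) = u(1) = 0, whose exact eigenvalues are (iπ)², and solves the pair with scipy:

```python
import numpy as np
from scipy.linalg import eigh

# Linear elements on (0,1), Dirichlet conditions: n interior nodes.
n = 50
h = 1.0 / (n + 1)
# [K]: assembled stiffness matrix, cf. (vii) in one dimension
K = (2 * np.diag(np.ones(n)) - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h
# [S]: assembled mass matrix
S = (4 * np.diag(np.ones(n)) + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1)) * h / 6

# Solve [K]{u} = lambda [S]{u}; eigh exploits symmetry and positive
# definiteness and returns real eigenvalues in ascending order.
lam, U = eigh(K, S)
print("computed:", lam[:5])
print("exact   :", (np.pi * np.arange(1, 6))**2)
```

The lowest computed eigenvalues approach (π)², (2π)², ... as the mesh is refined, exactly the behaviour the theory above predicts.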
(ii) See Dautray-Lions [1990, Vol. 4, pp. 143-145], Brebbia [1984, pp. 121-123 and 416-417] and Balder and Zerger [1996].

Problem 3.5.

$$\oint_C E \cdot dl = -\int_S \frac{\partial B}{\partial t} \cdot dS$$
$$\oint_C H \cdot dl = \int_S \left(J + \frac{\partial D}{\partial t}\right) \cdot dS$$
$$\oint_{S_1} D \cdot dS = \int_\Omega \rho\, d\Omega$$
$$\oint_{S_1} B \cdot dS = 0,$$

where the integrals are taken over an open surface S or its boundary contour C, or over a closed surface S₁.

Problem 3.8. The Sommerfeld condition expresses the fact that the energy flux across every sphere of very large radius R, in the sense of |x| increasing, is positive for the reflected electromagnetic wave satisfying the condition; namely, that R(∂u/∂R − iku) remains bounded when R = |x| → ∞.


Problem 3.9. The electromagnetic energy for the field {E, B} is defined as $\mathcal{E}(t) = \frac{1}{2}\int_\Omega \left(\epsilon |E|^2 + \frac{1}{\mu}|B|^2\right) dx$.

Problem 3.15. See Silvester and Ferrari [1983, Chapter 4].

Problem 3.16. The equation is a mathematical model of the vector potential A in ferromagnetic materials. Ferromagnetic materials exhibit strong magnetic effects and are the most important magnetic substances. The permeability of these materials is not a constant but a function of both the applied field and the previous magnetic history of the specimen. Iron, nickel, and cobalt are ferromagnetic materials.

Two-dimensional case:

$$\frac{\partial}{\partial x}\left(\nu\,\frac{\partial A}{\partial x}\right) + \frac{\partial}{\partial y}\left(\nu\,\frac{\partial A}{\partial y}\right) = -J.$$

The corresponding energy functional is

$$F(U) = \int_\Omega W(U)\, d\Omega - \int_\Omega J\, U\, d\Omega,$$

where

$$W(U) = \int H \cdot dB, \qquad B = \nabla \times U \ \text{ and } \ H = \nu B, \qquad U = \sum_i \alpha_i(x, y)\, U_i.$$

By the finite element approximation, we get the matrix equation

$$S\, U = J,$$

where U is the vector of nodal potential values, that is, values of U at the nodes,

$$J_k = \int_{\Omega^{(e)}} J\, \alpha_k\, d\Omega^{(e)}, \qquad S = (S_{ij}), \quad S_{ij} = \int_{\Omega^{(e)}} \nu\left(\frac{\partial \alpha_i}{\partial x}\frac{\partial \alpha_j}{\partial x} + \frac{\partial \alpha_i}{\partial y}\frac{\partial \alpha_j}{\partial y}\right) d\Omega.$$

The matrix equation can be solved by an iterative method. For further details, one may see Silvester and Ferrari [1983].

Problem 3.20. In the finite element method, the functions used to represent the behaviour of a field variable within an element are called interpolation, shape or approximation functions. In mathematical terminology, these are the basis elements of the approximating subspace H_h of the Hilbert space over which the continuous problem is defined. Many types of functions could serve as interpolation functions, but mainly polynomials have been widely used in practical applications. However, well-known orthonormal systems and the recently invented wavelets could also be used as interpolation functions. Let us consider a two-dimensional field variable φ(x, y). The nodal values of φ can uniquely and continuously define φ(x, y) throughout the domain of interest in the x-y plane. Suppose the domain of φ(x, y) is divided into triangles. Then the plane passing through the three nodal values of φ associated with element Ω^{(e)} is described by the equations

$$\phi^{(e)}(x, y) = \alpha_1^{(e)} + \alpha_2^{(e)} x + \alpha_3^{(e)} y;$$
$$\phi_i^{(e)} = \alpha_1^{(e)} + \alpha_2^{(e)} x_i + \alpha_3^{(e)} y_i;$$
$$\phi_j^{(e)} = \alpha_1^{(e)} + \alpha_2^{(e)} x_j + \alpha_3^{(e)} y_j; \qquad (i)$$
$$\phi_k^{(e)} = \alpha_1^{(e)} + \alpha_2^{(e)} x_k + \alpha_3^{(e)} y_k.$$

Solving these equations yields

$$\alpha_1^{(e)} = \frac{\phi_i(x_j y_k - y_j x_k) + \phi_j(x_k y_i - x_i y_k) + \phi_k(x_i y_j - x_j y_i)}{2\Delta},$$
$$\alpha_2^{(e)} = \frac{\phi_i(y_j - y_k) + \phi_j(y_k - y_i) + \phi_k(y_i - y_j)}{2\Delta},$$
$$\alpha_3^{(e)} = \frac{\phi_i(x_k - x_j) + \phi_j(x_i - x_k) + \phi_k(x_j - x_i)}{2\Delta},$$
$$2\Delta = \begin{vmatrix} 1 & x_i & y_i \\ 1 & x_j & y_j \\ 1 & x_k & y_k \end{vmatrix}.$$

Putting these values in (i) and rearranging the terms, we have

$$\phi^{(e)}(x, y) = N_i^{(e)}\phi_i + N_j^{(e)}\phi_j + N_k^{(e)}\phi_k,$$

where, for instance,

$$N_i^{(e)} = \frac{(x_j y_k - x_k y_j) + (y_j - y_k)\, x + (x_k - x_j)\, y}{2\Delta},$$

and N_j^{(e)}, N_k^{(e)} are obtained by cyclic permutation of the indices. The functions N^{(e)} are the interpolation functions.
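These formulas are easy to check numerically. The following sketch (our own illustration, with an arbitrarily chosen test triangle) builds the shape-function coefficients and verifies that N_i equals 1 at node i and 0 at the other two nodes:

```python
import numpy as np

def shape_functions(tri):
    """Linear shape functions of a triangle, returned as coefficient rows
    (a, b, c) such that N(x, y) = (a + b*x + c*y) / (2*Delta)."""
    (xi, yi), (xj, yj), (xk, yk) = tri
    two_delta = np.linalg.det([[1, xi, yi], [1, xj, yj], [1, xk, yk]])
    coeffs = np.array([
        [xj * yk - xk * yj, yj - yk, xk - xj],   # N_i
        [xk * yi - xi * yk, yk - yi, xi - xk],   # N_j (cyclic permutation)
        [xi * yj - xj * yi, yi - yj, xj - xi],   # N_k
    ])
    return coeffs / two_delta

tri = [(0.0, 0.0), (2.0, 0.0), (0.5, 1.5)]
N = shape_functions(tri)
for (x, y) in tri:
    # evaluating all three N's at a vertex yields one row of the identity
    print(N @ np.array([1.0, x, y]))
```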

Problem 3.18. See Brebbia et al. [1984, pp. 352-354].

Problem 3.24. An electric current induced within the body of a conductor, when the conductor either moves through a non-uniform magnetic field or is in a region where there is a change in magnetic flux, is known as an eddy current. Since iron is a conductor, an eddy current is induced in it when the flux is changed. Because of ohmic resistance, the eddy current causes a loss of energy. Hysteresis and eddy current losses form the two components of iron loss. For a long time, these currents were considered harmful, but new technologies have recently arisen which make use of eddy currents for the production of heat or forces, such as
(i) the linear induction motor, in which eddy currents are produced in a reaction rail to create a propulsive force;

(ii) electrodynamic systems, where eddy currents provide the lift force, although they also create a drag force; and
(iii) electrical heating by electro-induction, where one deals with the generation of a given eddy current distribution.

Approximate mathematical models can be derived from Maxwell's equations; the result is a Helmholtz equation, linear or non-linear. For details, see Chapters 6 and 10 in Chari and Silvester [1983]. See also Balder and Zerger [1996].

Problem 3.25. We present here a finite element solution of the following boundary value problem [Datta, 1995]:

$$-\frac{d^2 u}{dx^2} + u(x) = f(x), \quad \text{for } x \in (0, 1), \qquad u(x) = 0 \ \text{ for } x \in \{0, 1\}. \qquad (i)$$

Let H = H₀¹(0,1). If we now multiply the equation −u″ + u = f(x) by an arbitrary function v ∈ H (v is called a test function) and integrate the left-hand side by parts, we get

$$\int_0^1 (-u''(x) + u(x))\, v(x)\, dx = \int_0^1 f(x)\, v(x)\, dx,$$

that is,

$$\int_0^1 (u'v' + uv)\, dx = \int_0^1 f(x)\, v(x)\, dx. \qquad (ii)$$

We write equation (ii) as:

$$a(u, v) = (F, v) \quad \text{for every } v \in H,$$

where

$$a(u, v) = \int_0^1 (u'v' + uv)\, dx \quad \text{and} \quad (F, v) = \int_0^1 f(x)\, v(x)\, dx.$$

(Notice that the form a(·,·) is symmetric, that is, a(v,u) = a(u,v), and bilinear.) It can be shown that u is a solution of equation (ii) if and only if u is a solution of (i).

The discrete problem. We now discretize the problem in (ii). We start by constructing a finite-dimensional subspace H_n of the space H.
Here, we consider only the simple case where H_n consists of continuous piecewise linear functions. For this purpose, let 0 = x₀ < x₁ < x₂ < ... < xₙ < xₙ₊₁ = 1 be a partition of the interval [0,1] into subintervals I_j = [x_{j−1}, x_j] of length h_j = x_j − x_{j−1}, j = 1, 2, ..., n + 1.
With this partition, we associate the set H_n of all functions v(x) that are continuous on the interval [0,1], linear in each subinterval I_j, j = 1, 2, ..., n + 1, and satisfy the boundary conditions v(0) = v(1) = 0.
We now introduce the basis functions {φ₁, φ₂, ..., φₙ} of H_n. We define φ_j(x) by:
1. φ_j(x_i) = δ_ij, that is, φ_j equals 1 at the node x_j and 0 at all other nodes;
2. φ_j(x) is a continuous piecewise linear function.

φ_j(x) can be computed explicitly to yield:

$$\phi_j(x) = \begin{cases} \dfrac{x - x_{j-1}}{h_j}, & \text{when } x_{j-1} \le x \le x_j, \\[2mm] \dfrac{x_{j+1} - x}{h_{j+1}}, & \text{when } x_j \le x \le x_{j+1}, \end{cases}$$

and φ_j(x) = 0 elsewhere.

Because φ₁, ..., φₙ are basis functions, any function v ∈ H_n can be written uniquely as

$$v(x) = \sum_{i=1}^{n} v_i\, \phi_i(x), \quad \text{where } v_i = v(x_i).$$

We easily see that H_n ⊂ H. The discrete analogue of (ii) then reads: find u_n ∈ H_n such that

$$a(u_n, v_n) = (F, v_n) \quad \text{for every } v_n \in H_n. \qquad (iii)$$

Now, if we let u_n = Σᵢ₌₁ⁿ cᵢ φᵢ(x) and notice that equation (iii) holds, in particular, for every function φ_j(x), j = 1, ..., n, we get n equations; namely,

$$a\!\left(\sum_{i=1}^{n} c_i \phi_i,\, \phi_j\right) = (F, \phi_j), \quad j = 1, \dots, n.$$

Using the linearity of a(·, φ_j) leads to n linear equations in n unknowns,

$$\sum_{i=1}^{n} c_i\, a(\phi_i, \phi_j) = (F, \phi_j) \quad \text{for every } j = 1, 2, \dots, n,$$

which can be written in matrix form as

$$A\, c = F_n, \qquad (iv)$$

where (F_n)_j = (F, φ_j) and A = (a_ij) is a symmetric matrix given by a_ij = a_ji = a(φ_i, φ_j), and c = (c₁, ..., cₙ)ᵀ.
The entries of the matrix A can be computed explicitly. We first notice that a_ij = a_ji = a(φ_i, φ_j) = 0 if |i − j| ≥ 2. (This is due to the local support of the function φ_i(x).) A direct computation now leads to

$$a_{jj} = \frac{1}{h_j} + \frac{1}{h_{j+1}} + \frac{1}{3}(h_j + h_{j+1}), \qquad a_{j,j+1} = -\frac{1}{h_{j+1}} + \frac{h_{j+1}}{6}.$$

Hence, system (iv) can be written as the tridiagonal system

$$\begin{bmatrix} a_1 & b_1 & & \\ b_1 & a_2 & \ddots & \\ & \ddots & \ddots & b_{n-1} \\ & & b_{n-1} & a_n \end{bmatrix}\begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix} = F_n, \qquad (v)$$

where a_j = 1/h_j + 1/h_{j+1} + (h_j + h_{j+1})/3 and b_j = −1/h_{j+1} + h_{j+1}/6. In the special case of a uniform grid, h_j = h = 1/(n+1), the matrix A then takes the form

$$A = \frac{1}{h}\begin{bmatrix} 2 & -1 & & 0 \\ -1 & 2 & \ddots & \\ & \ddots & \ddots & -1 \\ 0 & & -1 & 2 \end{bmatrix} + \frac{h}{6}\begin{bmatrix} 4 & 1 & & 0 \\ 1 & 4 & \ddots & \\ & \ddots & \ddots & 1 \\ 0 & & 1 & 4 \end{bmatrix}. \qquad (vi)$$
Methods for solving (v) are given in Appendix 7.5.
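A compact numerical realization of the scheme above is sketched next (our own illustration; the test data f(x) = x, for which the exact solution is u(x) = x − sinh x / sinh 1, is our choice):

```python
import numpy as np

n = 99                        # interior nodes on a uniform grid
h = 1.0 / (n + 1)
x = np.linspace(h, 1.0 - h, n)

# matrix (vi): (1/h)*tridiag(-1, 2, -1) + (h/6)*tridiag(1, 4, 1)
A = ((2 * np.diag(np.ones(n)) - np.diag(np.ones(n - 1), 1)
      - np.diag(np.ones(n - 1), -1)) / h
     + (4 * np.diag(np.ones(n)) + np.diag(np.ones(n - 1), 1)
        + np.diag(np.ones(n - 1), -1)) * h / 6)

# right-hand side (F_n)_j = int f(t) phi_j(t) dt; for f(t) = t this is
# exactly h * x_j, since the hat function phi_j is symmetric about x_j
F = h * x

c = np.linalg.solve(A, F)     # nodal values u_n(x_j) = c_j

exact = x - np.sinh(x) / np.sinh(1.0)
print("max nodal error:", np.max(np.abs(c - exact)))
```

Halving h reduces the printed error by roughly a factor of four, consistent with the O(h²) accuracy of linear elements.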
Chapter 4

Monte Carlo Methods

The written history of Monte Carlo methods began in 1949 with the paper of Metropolis and Ulam [1949], but at that time the method had already been used for several years in secret defense projects of the United States of America for simulating the behaviour of nuclear reactors. It was certainly known to J. von Neumann, N. Metropolis, S.M. Ulam, H. Kahn and their coworkers at the Los Alamos Scientific Laboratory well before its publication. The name "Monte Carlo" comes from the random or chance character of the method and the famous casino in Monaco. It may be observed that finding random numbers is a crucial step of Monte Carlo methods. The numbers which are generated at the roulette table in Monte Carlo are truly random, but the numbers which are actually utilized in Monte Carlo methods are generated on the computer using deterministic formulae; they are called pseudo-random numbers or quasi-random numbers. A fundamental difficulty of the Monte Carlo method stems from the requirement that the nodes be independent random samples: the problem is how to generate independent random samples concretely. The users of Monte Carlo methods avoid this problem by using pseudo-random numbers in place of truly random samples. The term quasi-Monte Carlo first came to light in a 1951 technical report by Richtmyer. This method can be described in a simple way as the deterministic version of a Monte Carlo method, in the sense that the random samples in the Monte Carlo method are replaced by well-selected deterministic points. An excellent exposition of the quasi-Monte Carlo method can be found in Niederreiter [1992, 1978] and Halton [1970]. Application-oriented treatments can be found in other references, especially Koonin et al. [1990], Press et al. [1992] and Bratley, Fox and Niederreiter [1994]. For an updated account of the Monte Carlo and quasi-Monte Carlo methods, we refer to Caflisch [1998]. The particle methods which are nowadays used to simulate rarefied gas flows have a lot in common with the original work of Metropolis and Ulam; see, for example, Neunzert and Struckmeier [1997]. The general idea of the particle methods is the approximation of densities by discrete measures. An elegant description of the particle method along with applications is presented in Neunzert

and Struckmeier [1995]. The Industrial Mathematics group of the University of Kaiserslautern led by Neunzert has applied the particle methods in areas like:
(i) Space flight computation: simulation of pressure probes in space flight experiments, chemical effects in rarefied gas flows during the reentry phase of space vehicles.

(ii) Traffic flow simulation models used for the implementation of traffic guidance systems on highways.
(iii) Glass: investigations on the influence of radiation on the glass melting process.

(iv) Free surface flow, as in the air and paper industries, and granular flow.

Details of these applications can be found in the reports of the Industrial Mathematics Group on the Kaiserslautern University website:
http://www.mathematik.uni-kl.de/harvvest/ or
http://www.mathematik.uni-kl.de/wwwtecm/

Deficiencies of the Monte Carlo method

(i) There is only a probabilistic bound on the integration error.
(ii) The regularity of the integrand is not reflected.
(iii) Generating random samples is difficult.

The main goal of this chapter is to give a brief introduction to Monte Carlo, quasi-Monte Carlo, and particle methods along with their typical applications. Section 4.1 is devoted to the Monte Carlo method, while Section 4.2 deals with the quasi-Monte Carlo method. The particle methods are introduced in Section 4.3. A current study of the particle methods is mentioned in Section 4.4. Problems are given in Section 4.5. Applications of the quasi-Monte Carlo method to the pricing of derivatives are discussed by Ninomiya and Tezuka [1996], Papageorgiou and Traub [1996], Paskov and Traub [1995], Tezuka [1998] and Wilmott [Chapter 49, 1998]. An introduction to the Black-Scholes financial world is presented in Appendix 7.6.

4.1 Monte Carlo method

4.1.1 Motivation
First, let us recall classical methods of numerical integration.

(i) The rectangular rule. Let

$$f : R \to R \quad \text{and} \quad I = \int_a^b f(x)\, dx.$$

Divide the interval [a, b] into n equal subintervals by the points a = x₀ ≤ x₁ ≤ x₂ ≤ ... ≤ xₙ = b, so that each subinterval has length h = (b − a)/n and x_i = a + ih, i = 0, 1, 2, ..., n, with x_{i+1} − x_i = h. Let y_i = f(x_i), i = 0, 1, 2, ..., n.
To find I is equivalent to finding the area enclosed by the graph of the function between a and b, which is approximately equal to $\sum_{i=0}^{n-1} h\, f(x_i)$, denoted by I_R. We have the error between I and I_R, denoted by E_f:

$$I - I_R = \int_a^b f(x)\, dx - \frac{b - a}{n}\sum_{i=0}^{n-1} f(x_i), \qquad (4.1)$$

or,

$$E_f = I - \frac{b - a}{n}\sum_{i=0}^{n-1} f(x_i). \qquad (4.2)$$

It can be easily checked that the error E_f is proportional to $\frac{h}{2}(b - a)\, f'(\eta)$ for some η ∈ (a, b). In view of this, the rule is exact, that is, E_f = 0, if f is a constant function. Further, the error is inversely proportional to n, so for a given function f the error decreases as the number n of subintervals increases.

(ii) The trapezoidal rule. With y_i = f(x_i), x_{i+1} − x_i = h, and the x_i as in the previous case, we choose

$$I_T = \text{approximate integral} = h\left[\frac{f(x_0) + f(x_1)}{2}\right] + h\left[\frac{f(x_1) + f(x_2)}{2}\right] + \dots + h\left[\frac{f(x_{n-1}) + f(x_n)}{2}\right]$$
$$= \frac{1}{2} h\left[f(x_0) + 2f(x_1) + 2f(x_2) + \dots + 2f(x_{n-1}) + f(x_n)\right] = \frac{h}{2}\sum_{i=0}^{n} t_i\, f(x_i), \qquad (4.3)$$

where t_i = 1 for i = 0 and i = n, and t_i = 2 for i ≠ 0, n, and

$$I = I_T + E_f. \qquad (4.4)$$

Figure 4.1. The trapezoidal rule.

It can be checked that the error in the trapezoidal rule, E_f, is proportional to $-\frac{h^2}{12}(b - a)\, f''(\eta)$, η ∈ (a, b). The rule is exact for polynomials of degree less than or equal to 1.
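The different convergence orders, O(h) for the rectangular rule and O(h²) for the trapezoidal rule, are easy to observe numerically. The following sketch (our own illustration) integrates f(x) = eˣ over [0,1]:

```python
import numpy as np

f, a, b = np.exp, 0.0, 1.0
exact = np.exp(1.0) - 1.0

for n in (10, 100, 1000):
    x, h = np.linspace(a, b, n + 1), (b - a) / n
    rect = h * np.sum(f(x[:-1]))                     # left-endpoint rectangular rule
    trap = h * (np.sum(f(x)) - 0.5 * (f(a) + f(b)))  # trapezoidal rule (4.3)
    print(n, abs(rect - exact), abs(trap - exact))
# the rectangle error shrinks ~10x per refinement, the trapezoid error ~100x
```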

(iii) Quadrature rules. A common feature of the above two rules is that the interval (a, b) is subdivided in such a way that the x_k, k = 0, 1, 2, ..., n, are equally spaced. Such a choice is not always suitable; for example, the integrand could be singular, that is, f(x) could behave as (x − b)^α near x = b, with α > −1. A more refined approach is to try to find the values of x_k for which the error is minimized. Methods based on this approach are called quadrature rules. Let

$$I = \int_a^b f(x)\, dx = \int_a^b w(x)\, g(x)\, dx, \qquad (4.5)$$

where w(x) is a non-negative integrable function, called the weight function, and g(x) = f(x)/w(x).
We approximate the right-hand integral of (4.5) by $\sum_{i=0}^{n} w_i\, g(x_i)$, where the weights w_i and the abscissae x_i can be found in the standard literature. It can be seen that the x_i are the zeros of the (n + 1)-th degree polynomial which is orthogonal to all polynomials of lower degree with respect to the weight w(x) over [a, b]. The error involved, that is, $I - \sum_{i=0}^{n} w_i\, g(x_i)$, is proportional to the (2n + 2)-th derivative of g(x). The method is exact, that is, the error is zero, for all polynomials of degree ≤ 2n + 1. For details see, for example, Stroud [1974].
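For instance, with the weight w(x) ≡ 1 on [−1, 1] the abscissae are the zeros of the Legendre polynomial of degree n + 1, and the rule integrates polynomials of degree up to 2n + 1 exactly; a short check (our own illustration):

```python
import numpy as np

n = 3                                       # 4 nodes: exact up to degree 7
x, w = np.polynomial.legendre.leggauss(n + 1)

for deg in (6, 7, 8):
    approx = np.sum(w * x**deg)
    exact = 2.0 / (deg + 1) if deg % 2 == 0 else 0.0   # int_{-1}^{1} t^deg dt
    print(deg, approx, exact)
# degrees 6 and 7 are reproduced exactly; degree 8 shows a small error
```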

Remark 4.1. (i) Until the late forties, the above methods were the ones known for finding approximate values of integrals. The quadrature rules provide fairly good approximations for integrals in lower dimensions, that is, for functions of one or two variables, but in higher dimensions the error is high, and this motivated the invention of a new technique for evaluating integrals numerically.
(ii) Suppose that evaluating an integral with a quadrature rule requires n points, that is, to obtain sufficient accuracy we need to consider a quadrature approximation based on some n-th degree orthogonal polynomial. Then the analogous m-dimensional integral would require n points in each dimension, that is, we would need nᵐ points in total. In view of this, the quadrature rules are not appropriate for evaluating high-dimensional integrals numerically, as they involve a very large number of points.
(iii) Expressed in terms of the total number of nodes, the order of approximation of the m-dimensional quadrature rules thus deteriorates rapidly as the dimension m grows.
4.1.2 Monte Carlo method in m-dimensional Euclidean space

Let B be a subset of Rᵐ satisfying 0 < μ(B) < ∞, where μ(·) is the Lebesgue measure on Rᵐ. Then B is a probability space with the probability measure dλ = dμ/μ(B), and for f ∈ L¹(μ) we have

$$\int_B f(u)\, du = \mu(B)\int_B f\, d\lambda = \mu(B)\, E(f), \qquad (4.6)$$

where f : Rᵐ → R and E(f) is the expected value of the random variable f. The problem of numerical integration is nothing but finding the approximate value of the expected value of a random variable. This concept can be developed in a more general setting, namely, on an arbitrary probability space (A, 𝒜, λ). The Monte Carlo estimate for the expected value E(f) is obtained by taking n independent λ-distributed random samples a₁, a₂, ..., aₙ ∈ A and letting

$$E(f) \approx \frac{1}{n}\sum_{k=1}^{n} f(a_k). \qquad (4.7)$$

The strong law of large numbers guarantees that this procedure converges almost certainly, in the sense that

$$\lim_{n\to\infty} \frac{1}{n}\sum_{k=1}^{n} f(a_k) = E(f) \qquad \lambda^\infty\text{-a.e.}, \qquad (4.8)$$

where λ^∞ is the product measure of denumerably many copies of λ. The variance of f is denoted by $\sigma^2(f) = \int_A (f - E(f))^2\, d\lambda$.

It can be proved [Niederreiter, 1992] that for each f ∈ L²(λ),

$$\int_{A^n}\left(\frac{1}{n}\sum_{k=1}^{n} f(a_k) - E(f)\right)^2 d\lambda^n(a_1, \dots, a_n) = \frac{\sigma^2(f)}{n}. \qquad (4.9)$$

(4.7) and (4.9) state that the error $E(f) - \frac{1}{n}\sum_{k=1}^{n} f(a_k)$ is on average σ(f) n^{−1/2}, where σ(f) = (σ²(f))^{1/2} is the standard deviation of f. Probabilistic information about the error is obtained from the Central Limit Theorem, which states that if 0 < σ(f) < ∞, then

$$\lim_{n\to\infty} \text{Prob}\left(\frac{c_1\,\sigma(f)}{\sqrt n} \le \frac{1}{n}\sum_{i=1}^{n} f(a_i) - E(f) \le \frac{c_2\,\sigma(f)}{\sqrt n}\right) = \frac{1}{\sqrt{2\pi}}\int_{c_1}^{c_2} e^{-t^2/2}\, dt,$$

for any constants c₁ < c₂, where Prob(·) is the λ^∞-measure of the set of all sequences a₁, a₂, ..., aₙ of elements of A that have the property indicated between the parentheses.
By (4.6) and (4.8), we obtain the Monte Carlo estimate

$$\int_B f(u)\, du \approx \frac{\mu(B)}{n}\sum_{k=1}^{n} f(x_k), \qquad (4.10)$$

where x₁, x₂, ..., xₙ are n independent λ-distributed random samples from B, and the error

$$\int_B f(u)\, du - \frac{\mu(B)}{n}\sum_{k=1}^{n} f(x_k) \qquad (4.11)$$

is on average μ(B) σ(f) n^{−1/2}.

Remark 4.3. (i) In view of (4.11), or on the basis of the Central Limit Theorem, the Monte Carlo method for numerical integration yields a probabilistic error bound of the form O(n^{−1/2}) in terms of the number n of nodes. The most important fact is that this order of magnitude does not depend on the dimension (the number of variables of the function). In view of this, it is preferable over quadrature rules in higher dimensions, especially in dimension ≥ 5.
(ii) If the domain of integration B is so complicated that we cannot calculate μ(B), then, by a change of variables, it suffices to consider the situation where B is contained in the unit cube Ī^m. Then we can write

$$\int_B f(x)\, dx = \int_{\bar I^m} f(x)\, \chi_B(x)\, dx, \qquad (4.12)$$

where χ_B is the characteristic function of B.
If the integral in (4.12) is estimated according to (4.10), then the Monte Carlo estimate is

$$\int_B f(x)\, dx \approx \frac{1}{n}\sum_{k=1}^{n} f(x_k)\,\chi_B(x_k), \qquad (4.13)$$

where x₁, x₂, ..., xₙ are n independent random samples from the uniform distribution on Ī^m. The error estimate is of the order n^{−1/2}, as in the above case.
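As a concrete instance of (4.13) (our own illustration), the following sketch estimates ∫_B (x² + y²) dx dy over the quarter disk B = {x² + y² ≤ 1, x, y ≥ 0} ⊂ [0,1]², whose exact value is π/8, and exhibits the n^{−1/2} decay of the error:

```python
import numpy as np

rng = np.random.default_rng(0)
exact = np.pi / 8          # int of x^2 + y^2 over the quarter unit disk

for n in (10**3, 10**5, 10**7):
    x = rng.random((n, 2))                          # uniform samples in [0,1]^2
    r2 = (x**2).sum(axis=1)
    inside = r2 <= 1.0                              # characteristic function of B
    est = np.sum(r2 * inside) / n                   # estimate (4.13)
    print(n, abs(est - exact))
# the error drops roughly by a factor 10 per 100-fold increase of n
```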
As discussed above, the Monte Carlo method comprises the following basic elements:
(i) The statistical modelling of the given numerical problem in terms of a random variable, which is nothing but writing the integral as the expected value of a random variable.
(ii) The analysis of the random variables of the model with regard to statistical properties, such as the law of distribution and statistical dependence or independence.
(iii) The generation of random samples reflecting the above-mentioned statistical properties.
As mentioned earlier, Monte Carlo methods have been used in a fairly large number of real-world problems in such fields as semiconductor devices, transport theory, structural mechanics, reliability theory, and system theory. However, in view of certain deficiencies of the method mentioned earlier, the 'quasi-Monte Carlo method', in which the points x₁, x₂, ..., xₙ are deterministic, has been investigated systematically in the last two decades. For the advantages of the new method, we refer to Morokoff and Caflisch [1995, 1993] and Sarkar and Prasad [1987]. It has been shown, for example, that there is a significant improvement in both the magnitude of the error and the convergence rate over Monte Carlo simulation for certain low-dimensional problems. In 1988, an improved Monte Carlo scheme was developed by the group headed by Neunzert which reduces fluctuations and is often referred to as the Kaiserslautern Monte Carlo (KMC) scheme. A fully deterministic method for solving the Boltzmann equation is proposed in Lecot [1991]. Some recent advances in this area can be found in Neunzert et al. [1995, 1996], and some of these results are mentioned in Sections 4.3 and 4.4.

4.2 Quasi-Monte Carlo methods

4.2.1 Basic results
The basic idea of a quasi-Monte Carlo method is to replace the random samples in a Monte Carlo method by properly chosen deterministic points. The deterministic sequence of points {xₙ} should be chosen judiciously so as to guarantee small errors between $\frac{1}{n}\sum_{k=1}^{n} f(x_k)$ and $\int_B f(u)\, du$. The usage of this method has been highlighted

in Scientific American, Science and SIAM News in the last couple of years. This method is also an excellent example of merging disciplines: Hardy and Ramanujan would never have imagined that the techniques of number theory would become real tools to solve real-world problems via the quasi-Monte Carlo method. In all important cases, the selection criterion for the deterministic points leads to the concepts of uniformly distributed sequence and discrepancy. We introduce here a quasi-Monte Carlo method and discuss the role of the discrepancy in this method, indicating deterministic bounds for the integration error in terms of the discrepancy.
For the sake of convenience and better understanding, we consider Ī^m = [0,1]^m, the m-fold product of [0,1], the closed m-dimensional unit cube, as the domain of integration.
We consider the following quasi-Monte Carlo approximation for the integral $\int_{\bar I^m} f(x)\, dx$; that is,

$$\int_{\bar I^m} f(x)\, dx \approx \frac{1}{n}\sum_{k=1}^{n} f(x_k), \qquad (4.14)$$

where x₁, x₂, ..., xₙ ∈ Ī^m.
We are interested in those points of Ī^m for which the quasi-Monte Carlo approximation is convergent; that is,

$$\int_{\bar I^m} f(x)\, dx = \lim_{n\to\infty} \frac{1}{n}\sum_{k=1}^{n} f(x_k). \qquad (4.15)$$

A sequence of points x₁, x₂, ..., xₙ, ... in Ī^m is called uniformly distributed if

$$\lim_{n\to\infty} \frac{1}{n}\sum_{k=1}^{n} \chi_B(x_k) = \mu(B), \qquad (4.16)$$

for all subintervals B of Ī^m, where χ_B is the characteristic function of B and μ denotes the m-dimensional Lebesgue measure.
It may be observed that (4.15) holds even for Riemann integrable functions f on Ī^m in case the sequence {x_k} is uniformly distributed. Broadly speaking, the concept of uniform distribution means that the points x₁, x₂, ..., x_k are "evenly distributed" over Ī^m. For an arbitrary subset K of Ī^m, we define

$$H(K, P) = \sum_{k=1}^{n} \chi_K(x_k), \qquad (4.17)$$

where P is the set of points x₁, x₂, ..., xₙ ∈ Ī^m and χ_K is the characteristic function of K.
H(K, P) is the counting function that indicates the number of k, 1 ≤ k ≤ n, for which x_k ∈ K. If 𝒦 is a non-empty family of Lebesgue measurable subsets of Ī^m, then the discrepancy of the point set P is defined as

$$D_n(\mathcal{K}, P) = \sup_{B \in \mathcal{K}} \left|\frac{H(B, P)}{n} - \mu(B)\right|, \qquad (4.18)$$

where J.L is the Lebesgue measure of B.


It can be easily seen that 0 $ Dn(B,P) $ 1. The star discrepancy D~(P) =
D~(Xl,X2"" ,xn) ofthe points set P ofpoints Xl,X2,'" ,X n is defined as D~(P) =
m
Dn(K,P), , where K is the family of all subintervals ofr ofthe form II [O,Ui),
i=l
r: = product of m semi-closed intervals

m-times
"

Theextreme discrepancy Dn(P) = D n(Xl,X2,' " ,x~ofP is definedas Dn(P) =


Dn(K,P), where K is the family of all subintervals of I ofthe form II~dui,vi) '
The following results have been proved see, for example, Niederreiter [1992].

The following results have been proved; see, for example, Niederreiter [1992].

Lemma 4.1.
(i) For any P consisting of points in Ī^m, we have

$$D_n^*(P) \le D_n(P) \le 2^m\, D_n^*(P). \qquad (4.19)$$

(ii) If 0 ≤ x₁ ≤ x₂ ≤ ... ≤ xₙ ≤ 1, then

$$D_n^*(x_1, x_2, \dots, x_n) = \frac{1}{2n} + \max_{1 \le k \le n}\left|x_k - \frac{2k - 1}{2n}\right|. \qquad (4.20)$$

(iii) If 0 ≤ x₁ ≤ x₂ ≤ ... ≤ xₙ ≤ 1, then

$$D_n(x_1, x_2, \dots, x_n) = \frac{1}{n} + \max_{1 \le k \le n}\left(\frac{k}{n} - x_k\right) - \min_{1 \le k \le n}\left(\frac{k}{n} - x_k\right). \qquad (4.21)$$

(iv) For a sequence S of points of Ī^m, if D_n(S) and D_n*(S) denote the discrepancy and star discrepancy of the first n terms of S, respectively, then the following statements are equivalent:
(a) S is uniformly distributed in Ī^m;
(b) lim_{n→∞} D_n(S) = 0;
(c) lim_{n→∞} D_n*(S) = 0.

Remark 4.4. It is clear from the above lemma that the discrepancy and the star discrepancy are quantifications of the concept of uniformly distributed sequences in Ī^m. For domains of integration more general than Ī^m, the interested reader may find details in Niederreiter [1992, 1978] and the references therein.
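Formulas (4.20) and (4.21) make the one-dimensional discrepancies directly computable; a small sketch (our own illustration), applied to the regular grid points x_k = (2k − 1)/(2n), which minimize the star discrepancy, and to a random point set for comparison:

```python
import numpy as np

def star_discrepancy(x):
    """D_n* of a 1-D point set via (4.20); x need not be sorted."""
    x = np.sort(np.asarray(x))
    n = len(x)
    k = np.arange(1, n + 1)
    return 1.0 / (2 * n) + np.max(np.abs(x - (2 * k - 1) / (2 * n)))

def extreme_discrepancy(x):
    """D_n of a 1-D point set via (4.21)."""
    x = np.sort(np.asarray(x))
    n = len(x)
    k = np.arange(1, n + 1)
    return 1.0 / n + np.max(k / n - x) - np.min(k / n - x)

n = 100
centers = (2 * np.arange(1, n + 1) - 1) / (2 * n)
rng = np.random.default_rng(1)
print(star_discrepancy(centers), extreme_discrepancy(centers))  # 1/(2n) and 1/n
print(star_discrepancy(rng.random(n)))                          # typically much larger
```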

We prove here two theorems on the error bounds for the quasi-Monte Carlo approximation in one dimension and state the extensions of these results to higher dimensions without proofs; these are nicely presented in Niederreiter [1992].

Theorem 4.1. Let f be a function of bounded variation with total variation V(f) on [0,1]; then, for any x₁, x₂, ..., xₙ ∈ [0,1], we have

$$\left|\frac{1}{n}\sum_{k=1}^{n} f(x_k) - \int_0^1 f(x)\, dx\right| \le V(f)\, D_n^*(x_1, x_2, \dots, x_n). \qquad (4.22)$$

Theorem 4.2. If f is continuous on [0,1] and $\omega(f; t) = \sup_{u,v \in [0,1],\ |u - v| \le t} |f(u) - f(v)|$, t ≥ 0, denotes its modulus of continuity then, for any x₁, x₂, ..., xₙ ∈ [0,1],

$$\left|\frac{1}{n}\sum_{k=1}^{n} f(x_k) - \int_0^1 f(x)\, dx\right| \le \omega(f;\, D_n^*(x_1, x_2, \dots, x_n)). \qquad (4.23)$$
Proof of Theorem 4.1. Let us suppose that x₁ ≤ x₂ ≤ x₃ ≤ ... ≤ xₙ, and set x₀ = 0 and x_{n+1} = 1. By applying the formula for summation by parts (Abel's rule) and integration by parts, we obtain

$$\frac{1}{n}\sum_{k=1}^{n} f(x_k) - \int_0^1 f(x)\, dx = \sum_{k=0}^{n} \int_{x_k}^{x_{k+1}} \left(x - \frac{k}{n}\right) df(x). \qquad (4.24)$$

By relation (4.20), for fixed k with 0 ≤ k ≤ n, we have

$$\left|x - \frac{k}{n}\right| \le D_n^*(x_1, x_2, \dots, x_n) \quad \text{for } x_k \le x \le x_{k+1}. \qquad (4.25)$$

By (4.24), (4.25) and the definition of the total variation, we get the desired result.

Proof of Theorem 4.2. Let x₁ ≤ x₂ ≤ ... ≤ xₙ. By the Mean Value Theorem for integrals, we have

$$\int_0^1 f(x)\, dx = \sum_{k=1}^{n} \int_{(k-1)/n}^{k/n} f(x)\, dx = \frac{1}{n}\sum_{k=1}^{n} f(t_k),$$

with (k − 1)/n < t_k < k/n. This implies that

$$\frac{1}{n}\sum_{k=1}^{n} f(x_k) - \int_0^1 f(x)\, dx = \frac{1}{n}\sum_{k=1}^{n} \left(f(x_k) - f(t_k)\right). \qquad (4.26)$$

By (4.20), |x_k − t_k| ≤ D_n*(x₁, x₂, ..., xₙ) for 1 ≤ k ≤ n, and the result follows from (4.26).
The variation of f on Ī^m in the sense of Vitali is defined by

$$V^{(m)}(f) = \sup_{P} \sum_{J \in P} |\Delta(f, J)|, \qquad (4.27)$$

where Δ(f, J) is an alternating sum of the values of f at the vertices of a subinterval J of Ī^m (function values at adjacent vertices have opposite signs) and the supremum is taken over all partitions P of Ī^m into subintervals. More conveniently, it can be written as

$$V^{(m)}(f) = \int_0^1 \cdots \int_0^1 \left|\frac{\partial^m f}{\partial u_1\, \partial u_2 \cdots \partial u_m}\right| du_1 \cdots du_m, \qquad (4.28)$$

which holds whenever the indicated partial derivative is continuous on Ī^m.
For 1 ≤ k ≤ m and 1 ≤ i₁ < i₂ < ... < i_k ≤ m, let V^{(k)}(f; i₁, i₂, ..., i_k) be the variation in the sense of Vitali of the restriction of f to the k-dimensional face {(u₁, ..., u_m) ∈ Ī^m : u_j = 1 for j ≠ i₁, ..., i_k}. Then

$$V(f) = \sum_{k=1}^{m} \sum_{1 \le i_1 < i_2 < \cdots < i_k \le m} V^{(k)}(f; i_1, i_2, \dots, i_k) \qquad (4.29)$$

is called the variation of f on Ī^m in the sense of Hardy and Krause, and f is of bounded variation in this sense if V(f) is finite.

Theorem 4.3 [Koksma-Hlawka Inequality]. Let Ī^m = [0,1]^m be the m-fold Cartesian product of [0,1]. If f has bounded variation V(f) on Ī^m in the sense of Hardy and Krause then, for any x₁, x₂, ..., xₙ ∈ Ī^m, we have

$$\left|\frac{1}{n}\sum_{k=1}^{n} f(x_k) - \int_{\bar I^m} f(x)\, dx\right| \le V(f)\, D_n^*(x_1, x_2, \dots, x_n). \qquad (4.30)$$

Theorem 4.4. For any x₁, x₂, ..., xₙ ∈ Ī^m and any ε > 0, there exists a function f ∈ C^∞(Ī^m) with V(f) = 1 and

$$\left|\frac{1}{n}\sum_{k=1}^{n} f(x_k) - \int_{\bar I^m} f(x)\, dx\right| > D_n^*(x_1, x_2, \dots, x_n) - \varepsilon. \qquad (4.31)$$

For a continuous function f on Ī^m, we define its modulus of continuity by

$$\omega(f; t) = \sup_{\|u - v\| \le t,\ u, v \in \bar I^m} |f(u) - f(v)| \quad \text{for } t \ge 0.$$

Theorem 4.5. If f is continuous on Ī^m, then for any x₁, x₂, ..., xₙ ∈ Ī^m we have

$$\left|\frac{1}{n}\sum_{k=1}^{n} f(x_k) - \int_{\bar I^m} f(x)\, dx\right| \le C_m\, \omega\!\left(f;\, D_n^*(x_1, x_2, \dots, x_n)^{1/m}\right), \qquad (4.32)$$

where C_m is a constant depending only on m.

Remark 4.5. The error estimates given in the theorems mentioned above indicate that point sets with small star discrepancy guarantee small errors in the quasi-Monte Carlo integration over Ī^m.

4.2.2 Properties of discrepancy and star discrepancy

Since the upper and lower bounds of the error of an integral estimated by a quasi-Monte Carlo sum are given in terms of the discrepancy and star discrepancy, it is essential to know their properties in order to study the applications of the quasi-Monte Carlo method for numerical integration.
A point set P consisting of n elements of Ī^m is called a low-discrepancy point set if D_n*(P) or D_n(P) is small. A sequence S of elements of Ī^m for which D_n*(S) or D_n(S) is small for all n ≥ 1 is called a low-discrepancy sequence. Low-discrepancy point sets/sequences are also called quasi-random points/quasi-random sequences.
From these discussions, it is quite clear that the accuracy of the quasi-Monte Carlo method will depend on the selection of quasi-random sequences, that is, low-discrepancy sequences. Four classes of such sequences, namely, the van der Corput sequence, the generalized van der Corput or Faure sequence, the Halton sequence (the higher-dimensional analogue of the van der Corput sequence) and the Hammersley sequence, are introduced here.
For an integer b ≥ 2, we put Z_b = {0, 1, ..., b − 1}; that is, Z_b is the least residue system mod b. Every integer n ≥ 0 has a unique digit expansion

$$n = \sum_{j=0}^{\infty} a_j(n)\, b^j \qquad (4.33)$$

in base b, where a_j(n) ∈ Z_b for all j ≥ 0 and a_j(n) = 0 for all sufficiently large j; that is, the sum in (4.33) is actually finite.
For an integer b ≥ 2, the radical inverse function φ_b in base b is defined by

$$\phi_b(n) = \sum_{j=0}^{\infty} a_j(n)\, b^{-j-1} \quad \text{for all integers } n \ge 0, \qquad (4.34)$$

where n is given by its digit expansion (4.33) in base b.

φ_b(n) is obtained from n by a symmetric reflection of the expansion (4.33) about the decimal point. It is clear that φ_b(n) ∈ I = [0,1) for all n ≥ 0.
In the construction of many low-discrepancy sequences, it is convenient to index the terms by n = 0, 1, 2, .... In one-dimensional low-discrepancy sequences S, we denote the terms by x₀, x₁, ... and write

$$D_n(S) = D_n(x_0, x_1, \dots, x_{n-1})$$

for the discrepancy of the first n terms of S. The star discrepancy is written similarly.

van der Corput sequence in base b. For an integer b ≥ 2, the van der Corput sequence in base b is the sequence x₀, x₁, ... with xₙ = φ_b(n) for all n ≥ 0. A sequence x₀, x₁, x₂, ... is called the generalized van der Corput sequence in base b if the xₙ are given by

$$x_n = \sum_{j=0}^{\infty} \sigma(a_j(n))\, b^{-j-1} \quad \text{for all } n \ge 0, \qquad (4.35)$$

where n is given by (4.33) and σ is a permutation of Z_b.


van der Corput had studied the case b = 2, but the general case was studied by Faure, and therefore the general case is often called the Faure sequence.
For a given dimension m ≥ 1, let b₁, b₂, ..., b_m be integers ≥ 2. A sequence x₀, x₁, x₂, ... in Ī^m, where

$$x_n = \left(\phi_{b_1}(n), \phi_{b_2}(n), \dots, \phi_{b_m}(n)\right) \qquad (4.36)$$

and the φ_b are given by (4.34), is called the Halton sequence in the bases b₁, b₂, ..., b_m.
It is clear that the van der Corput sequence is the special case m = 1 of the Halton sequence. A point set of n points is called a Hammersley sequence if an arbitrary term, say x_k, is given by

$$x_k = \left(\frac{k}{n}, \phi_{b_1}(k), \phi_{b_2}(k), \dots, \phi_{b_{m-1}}(k)\right) \in \bar I^m, \qquad (4.37)$$

for k = 0, 1, ..., n − 1.

Estimates of star discrepancy and discrepancy.

(i) For the van der Corput sequence $S_b$,

$$D_n^*(S_b) = O(n^{-1} \log n) \tag{4.38}$$

for $n \ge 2$, with a constant depending only on $b$. For $b = 2$ we have

$$n D_n^*(S_2) = n D_n(S_2) \le \frac{\log n}{\log 8} + 1 \quad \text{for all } n \ge 1$$

and

$$\lim_{n \to \infty} \Big( n D_n(S_2) - \frac{\log n}{\log 8} \Big) = \frac{4}{9} + \frac{\log 3}{\log 8}. \tag{4.39}$$

(ii) Let $S$ be the Halton sequence in the pairwise relatively prime bases $b_1, b_2, \ldots, b_m$; then

$$D_n^*(S) < \frac{m}{n} + \frac{1}{n} \prod_{k=1}^{m} \Big( \frac{b_k - 1}{2 \log b_k} \log n + \frac{b_k + 1}{2} \Big) \quad \text{for all } n \ge 1. \tag{4.40}$$

(iii) For any dimension $m \ge 1$, there exists an infinite sequence of points in $\bar{I}^m$ such that

$$D_n = O(n^{-1} (\log n)^m). \tag{4.41}$$

Furthermore, for every $n \ge 2$, there exists a finite sequence of $n$ points in $I^m$ such that

$$D_n = O(n^{-1} (\log n)^{m-1}). \tag{4.42}$$

(iv) For the Hammersley sequence containing $n$ terms, we have

$$D_n^*(S) < \frac{m}{n} + \frac{1}{n} \prod_{i=1}^{m-1} \Big( \frac{b_i - 1}{2 \log b_i} \log n + \frac{b_i + 1}{2} \Big).$$

Remark 4.6. Relation (4.41) is very important in the sense that it guarantees that
for any dimension, there exist quasi-Monte Carlo techniques that perform substan-
tially better than the Monte Carlo method.

Example 4.1. The Halton sequence in one dimension is generated by choosing a prime number $p$ and expanding the sequence of integers $0, 1, 2, \ldots, n$ into base $p$ notation. The $n$-th term of the sequence is given by

$$x_n = \frac{a_0}{p} + \frac{a_1}{p^2} + \frac{a_2}{p^3} + \cdots + \frac{a_m}{p^{m+1}},$$

where the $a_i$'s are integers taken from the base $p$ expansion of $n - 1$, $[n-1]_p = a_m a_{m-1} \cdots a_2 a_1 a_0$, with $0 \le a_i < p$. For example, if $p = 3$, the first 12 terms of the sequence ($n = 1, 2, \ldots, 12$) are

$$\Big\{ 0,\, \frac{1}{3},\, \frac{2}{3},\, \frac{1}{9},\, \frac{4}{9},\, \frac{7}{9},\, \frac{2}{9},\, \frac{5}{9},\, \frac{8}{9},\, \frac{1}{27},\, \frac{10}{27},\, \frac{19}{27} \Big\}.$$

Example 4.2. The approximation of a surface integral by a low-discrepancy sequence: Let us consider a ball $K_R$ with radius $R$ whose corresponding sphere is denoted by $\sigma_R$. Then, we can write

$$\int_{\sigma_R} \phi\, d\omega = \int_0^{2\pi} \Big( \int_0^{\pi} \phi(R\sin\theta\cos\varphi,\, R\sin\theta\sin\varphi,\, R\cos\theta)\, R^2 \sin\theta\, d\theta \Big) d\varphi$$

$$= 2\pi^2 R^2 \int_0^1 \int_0^1 \phi(R\sin\pi\theta'\cos 2\pi\varphi',\, R\sin\pi\theta'\sin 2\pi\varphi',\, R\cos\pi\theta')\, \sin\pi\theta'\, d\theta'\, d\varphi'.$$



One has to consider a uniform distribution with respect to the variables $(\theta', \varphi')$ in $[0,1]^2$, where the integrand is the following:

$$2\pi^2 R^2\, \phi(\cdots)\, \sin\pi\theta'.$$

One can also say that one is given a distribution with the corresponding density

$$p(\theta', \varphi') = \sin\pi\theta',$$

and integrates $2\pi R^2 \phi(\cdots)$ with respect to this density.
We choose a low-discrepancy sequence $(\theta_i', \varphi_i')$, $i = 1, \ldots, N$, in $[0,1]^2$ and put

$$P_i = \big( R\sin\pi\theta_i'\cos 2\pi\varphi_i',\, R\sin\pi\theta_i'\sin 2\pi\varphi_i',\, R\cos\pi\theta_i' \big),$$

such that

$$\frac{2\pi^2 R^2}{N} \sum_{i=1}^{N} \sin\pi\theta_i'\, \phi(P_i)$$

is an approximation for $\int_{\sigma_R} \phi\, d\omega$.
Now, one has to consider the surface formed by several spheres and must take into account the weight $2\pi R^2 \sin\pi\theta_i'$.
One selects $N$ points on the sphere $\sigma_1$ and another $N$ points on $\sigma_2$, and from these $2N$ points one chooses those that lie on the surface of the union of the spheres. For this, one should realize not only the angles but also the radii. See Figure 4.2.
The algorithm is the following: Consider $M$ spheres $\sigma_1, \ldots, \sigma_M$ with radii $r_1, \ldots, r_M$. One determines $N$ points on every sphere, where, as usual, each point always carries the radius and the weight $\sin\pi\theta_l^{(i)}$:

$$\big\{ (r_i, P_1^{(i)}), \ldots, (r_i, P_N^{(i)}) \big\}, \quad i = 1, 2, \ldots, M.$$

From the $NM$ points, we choose those that lie on $\partial \bigcup_{u=1}^{M} K_u$.

Remark. One must check whether, for a given point $P_j^{(i)}$, the following is true:

$$\| P_j^{(i)} - c_k \| \ge r_k,$$

for all $k = 1, \ldots, M$, where $c_k$ denotes the centre of the ball $K_k$. If the above inequality is true, the point belongs to $\partial \bigcup_{u=1}^{M} K_u$; otherwise, not.
So, one gets a point set which consists of at most $NM$ points $(r_i, P_l^{(i)})$, $i \in \{1, \ldots, M\}$, $l \in \{1, \ldots, N\}$. We sum

$$\frac{2\pi^2 r_i^2}{N} \sin\pi\theta_l^{(i)}\, \phi(P_l^{(i)})$$

Figure 4.2

over these sets, so that we evidently get an approximation for $\int_{\partial \bigcup K_u} \phi\, d\omega$.
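To make the single-sphere case concrete, here is a small Python sketch (ours; it reuses radical_inverse/halton from the sketch in Subsection 4.2.2, and $\phi \equiv 1$ is an illustrative integrand) of the quasi-Monte Carlo approximation of $\int_{\sigma_R} \phi\, d\omega$:

    import numpy as np

    def sphere_surface_integral(phi, R, N):
        # Halton points (theta'_i, varphi'_i) in [0,1]^2, bases 2 and 3
        pts = halton(N, (2, 3))
        theta = np.pi * pts[:, 0]
        varphi = 2.0 * np.pi * pts[:, 1]
        P = np.column_stack((R * np.sin(theta) * np.cos(varphi),
                             R * np.sin(theta) * np.sin(varphi),
                             R * np.cos(theta)))
        # (2 pi^2 R^2 / N) sum_i sin(pi theta'_i) phi(P_i)
        return 2.0 * np.pi**2 * R**2 / N * np.sum(np.sin(theta) * phi(P))

    # phi = 1 must reproduce the surface area 4 pi R^2
    print(sphere_surface_integral(lambda P: 1.0, 1.0, 10000))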
The evaluation of integrals by the Monte Carlo and quasi-Monte Carlo methods depends on the generation of random numbers (often called pseudo-random numbers) and quasi-random numbers or quasi-random sequences, respectively. We shall briefly mention here how they are generated.
Nowadays, most computers contain routines that generate random numbers uniformly distributed between 0 and 1. The method of generation most commonly used in the computer routines for the generation of uniformly distributed random numbers is the multiplicative congruential method, where the $i$-th element $r_i$ of the sequence is obtained from the previous element $r_{i-1}$ by a relation such as

$$r_i = p\, r_{i-1} \pmod{q} \quad (r_i \text{ is the remainder when } p\, r_{i-1} \text{ is divided by } q), \tag{4.43}$$

where $p$ and $q$ are appropriate constants. The first element of the sequence must be given by the user. The numbers $r_i$ of the sequence in equation (4.43) are obtained with a precise mathematical algorithm and therefore they are not at all random; in reality, given the first element, called the seed, all other terms are easily predictable. However, for a good choice of the constants $p$ and $q$, the sequences of $r_i$ behave randomly in the sense that they pass a large number of statistical tests of randomness. Numbers $r_i$ of such a type are called pseudo-random numbers and sequences of such numbers are called pseudo-random sequences. These numbers have the advantage over truly random numbers of being generated in a fast way and of being reproducible when desired for program debugging.

Relation (4.43) can be replaced by a more general formula

$$r_i = p\, r_{i-1} + l \pmod{q}. \tag{4.44}$$

Relations (4.43) and (4.44) are called congruential methods of generating pseudo-random numbers; in particular, (4.43) is called the multiplicative congruential method. For more details, one may see Hammersley and Handscomb [1967] and Press and Teukolsky [1992]. The van der Corput, Faure, Halton and Hammersley sequences discussed in the preceding sections are examples of quasi-random sequences. The current research papers of Morokoff and Caflisch (see, for example, their paper of 1995) and references therein may provide important clues for generating quasi-random sequences.
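A minimal Python sketch of the multiplicative congruential method (4.43); the constants $p$ and $q$ below are the well-known Lehmer (Park-Miller) values, given here only as an illustration of a "good choice" and not prescribed by the text:

    def congruential_generator(seed, p=16807, q=2**31 - 1):
        # r_i = p * r_{i-1} (mod q), scaled to (0, 1), cf. (4.43)
        r = seed
        while True:
            r = (p * r) % q
            yield r / q

    gen = congruential_generator(seed=12345)
    print([next(gen) for _ in range(3)])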

4.3 The particle methods


4.3.1 Introduction
The particle methods are numerical methods of approximation (discretization) of partial differential equations of the type

$$\frac{\partial u}{\partial t} + \sum_{i=1}^{n} \frac{\partial}{\partial x_i}(a_i u) + a_0 u = g, \quad x \in R^n,\; t > 0, \tag{4.45}$$

where $u(x,0) = u_0(x)$ is known; (4.45) is a hyperbolic equation (in conservation form). There are important partial differential equations which belong to this class, some of which are:
1. The Boltzmann equation for dilute gases,

$$f_t + v \cdot \nabla_x f + F \cdot \nabla_v f = \frac{1}{\varepsilon}\, Q(f).$$

Here, $f = f(t, x, v)$ denotes the density of a dilute gas, where $t \ge 0$ is the time variable, $x \in \Omega \subset R^3$ the space coordinate, and $v \in R^3$ the velocity; $F = F(t,x)$ denotes an external force (gravity) or is related to a self-consistent potential $\Phi$ such that $F = \nabla \Phi$.
2. The Euler equations,

$$\frac{\partial \rho}{\partial t} + \operatorname{div}(\rho u) = 0,$$

$$\frac{\partial (\rho u_i)}{\partial t} + \operatorname{div}(\rho u_i u) = -\frac{\partial P}{\partial x_i}, \quad i = 1, 2, 3,$$

where $P = P(\rho)$ is the pressure, which is given as a function of $\rho$.

In analogy to methods like finite elements and finite differences, the particle methods are also known as point-set methods. These methods can be studied from:
(a) the measure-theoretic approach,
(b) the number-theoretic approach,
(c) functional analytic techniques, and
(d) statistical methods.
The Monte Carlo and quasi-Monte Carlo methods are, respectively, essentially the statistical and number-theoretic approaches to the particle methods discussed in the previous sections. In Subsection 4.3.2, we introduce the measure-theoretic approach, while in Subsection 4.3.3, the functional analytic method is presented. Our presentation here is mainly based on Neunzert and Struckmeier [1995] and Raviart [1983].

4.3.2 Particle approximations - measure-theoretic approach

A particle is characterized by its position $x$, velocity $v$ and mass (or charge) $\alpha$. In order to simplify the notation, we put $P = (x, v)$. A particle ensemble (or finite point set) is given by

$$\omega_N = \{(\alpha_1, P_1), \ldots, (\alpha_N, P_N)\}$$

or, in another notation, by

$$\delta_{\omega_N} = \sum_{i=1}^{N} \alpha_i\, \delta_{P_i}.$$

Here, $\delta$ denotes the Dirac measure and $N$ is the number of particles.
We consider sequences of particle ensembles

$$\omega_N = \{(\alpha_1^N, P_1^N), \ldots, (\alpha_N^N, P_N^N)\},$$

or

$$\delta_{\omega_N} = \sum_{i=1}^{N} \alpha_i^N\, \delta_{P_i^N}.$$

Often, the $P_i^N$ are taken from a sequence $P_1, P_2, \ldots$, that is, more and more particles are brought into the game. Then

$$\{P_1^N, \ldots, P_N^N\} = \{P_1, P_2, \ldots, P_N\}.$$

Now, for a given density $f \in L_+^1(R^3 \times R^3)$, we say that $\delta_{\omega_N}$ converges to $f$ if

$$\lim_{N \to \infty} \sum_{i=1}^{N} \alpha_i^N\, \phi(P_i^N) = \int f \phi\, dv\, dx,$$

for all $\phi \in C_b(R^3 \times R^3)$. This means that the discrete measure $\delta_{\omega_N}$ weak* converges to $f\, dv\, dx$, where $C_b$ is the set of all continuous and bounded functions. One may interpret this as an integration rule where we integrate the function $\phi$ with respect to the measure $f\, dv\, dx$. We would like to measure the distance between $\omega_N$ and $f$. This might be done by any distance on measure spaces such as the Prohorov metric or the bounded Lipschitz distance but also, since the limit $f\, dv\, dx$ is absolutely continuous with respect to the Lebesgue measure, with the help of the discrepancy. A pertinent result in this direction is by Neunzert and Wick (see, for example, Babovsky and Illner [1989]):

$$\delta_{\omega_N} \to f \quad \text{if and only if} \quad D(\omega_N, f) \to 0.$$

It may be recalled that the bounded Lipschitz distance $d_L(\mu, \nu)$ between two measures $\mu$ and $\nu$ is defined as

$$d_L(\mu, \nu) = \sup_{\phi \in \mathrm{Lip}_1} \Big| \int \phi\, d\mu - \int \phi\, d\nu \Big|,$$

where $\mathrm{Lip}_1$ denotes the set of Lipschitz continuous functions with Lipschitz constant 1, whereas the discrepancy between two measures $\mu$ and $\nu$ is defined by

$$D(\mu, \nu) = \sup_{A} |\mu(A) - \nu(A)|.$$

Here, $A$ denotes a rectangle parallel to the axes of the underlying space.

Given the measure $\mu$ with density $f$ and an appropriate distance between measures, an important question is how to construct $\delta_{\omega_N}$ such that the distance between $\mu$ and $\delta_{\omega_N}$ is as small as possible if $N$, the number of particles, is fixed. For the construction of appropriate $\delta_{\omega_N}$, one may see Neunzert and Struckmeier [1995].

4.3.3 Functional analytic approach


We present here the particle method of approximation for problem (4.45). Let us assume that $g = 0$; then the method consists in approximating the initial condition $u_0$ by a linear combination of Dirac measures

$$u_h^0 = \sum_{j \in J} \alpha_j\, \delta(x - x_j), \tag{4.46}$$

for some set $(x_j, \alpha_j)_{j \in J}$ of points $x_j \in R^n$ and weights $\alpha_j \in R_+$. Replacing $u_0$ by $u_h^0$, we look for the solution of the problem

$$\begin{cases} \dfrac{\partial u_h}{\partial t} + \displaystyle\sum_{i=1}^{n} \dfrac{\partial}{\partial x_i}(a_i u_h) + a_0 u_h = 0; \\[2mm] u_h(\cdot, 0) = u_h^0. \end{cases} \tag{4.47}$$

By a well-known result of Raviart [1983], (4.47) has a unique solution

$$u_h(x, t) = \sum_{j \in J} \alpha_j(t)\, \delta(x - x_j(t)),$$

where

$$x_j(t) = X(t;\, x_j,\, 0), \qquad \alpha_j(t) = \alpha_j \exp\Big( -\int_0^t a_0(x_j(s), s)\, ds \Big).$$

Thus, we have obtained a particle approximation $u_h$ of the solution $u$ of problem (4.45) in the case $g = 0$. Raviart has also studied the approximation of functions by Dirac measures. He has considered the following problem: Given a function $u_0 \in C^0(R^n)$, how to choose

$$u_h^0 = \sum_{j \in J} \alpha_j\, \delta(x - x_j)$$

so that it represents a good approximation of $u_0$? This problem has to be understood as an approximation in the sense of measures on $R^n$. Thus, for a given $\phi$ in $C_0^0(R^n)$, we have to compare $\int_{R^n} u_0 \phi\, dx$ and $\sum_{j \in J} \alpha_j \phi(x_j)$.
We interpret this now as the classical problem of numerical quadrature. Given a parameter $h > 0$, we cover $R^n$ with a uniform mesh of size $h$. For all $j = (j_1, j_2, \ldots, j_n) \in Z^n$, we denote by $B_j$ the cell

$$B_j = \big\{ x \in R^n \mid (j_i - \tfrac{1}{2}) h \le x_i \le (j_i + \tfrac{1}{2}) h,\; 1 \le i \le n \big\},$$

and by $x_j = (j_i h)_{1 \le i \le n}$ the center of $B_j$. Then we set

$$u_h^0 = \sum_{j \in Z^n} \alpha_j\, \delta(x - x_j), \tag{4.48}$$

where $\alpha_j$ is some approximation of $\int_{B_j} u_0\, dx$. Since $u_0 \in C^0(R^n)$, we can choose

$$\alpha_j = h^n u_0(x_j). \tag{4.49}$$

Raviart [1983] has estimated the error between $\int_{R^n} u_0 \phi\, dx$ and $\sum_{j \in Z^n} \alpha_j \phi(x_j)$.
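In one dimension, this quadrature reading of the particle approximation is only a few lines of code. The sketch below (ours) uses the weights $\alpha_j = h\, u_0(x_j)$ of (4.49); the test functions $u_0$ and $\phi$ are illustrative choices:

    import numpy as np

    def particle_quadrature(u0, phi, h, j_min=-1000, j_max=1000):
        xj = h * np.arange(j_min, j_max + 1)   # cell centres x_j = j h
        alpha = h * u0(xj)                      # weights alpha_j, cf. (4.49)
        return np.sum(alpha * phi(xj))          # sum_j alpha_j phi(x_j)

    u0 = lambda x: np.exp(-x**2)
    phi = lambda x: np.cos(x)
    exact = np.sqrt(np.pi) * np.exp(-0.25)      # int exp(-x^2) cos(x) dx
    print(particle_quadrature(u0, phi, h=0.1), exact)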

4.4 A current study of the particle method


4.4.1 Introduction
In kinetic theory, a gas is assumed to be in an equilibrium state when it does not exchange mass and energy with other bodies, and its state does not change

with time. The trend towards a Maxwellian distribution in the space homogeneous case is expressed by Boltzmann's H-theorem. This indicates that this particular distribution is a good candidate to describe a gas in a (statistical) equilibrium state. The parameters, density $\rho$, mean velocity $u$ and temperature $T$, of a Maxwellian distribution may depend on the time and space variables. In this case, a Maxwellian distribution is called a local Maxwellian distribution or a local equilibrium state.
Finding a criterion of local equilibrium is one of the fundamental tasks for the coupling of the Boltzmann and hydrodynamic equations. A typical application of the coupling of these equations arises in the re-entry of a space shuttle into the atmosphere. In this region, the Boltzmann equation also gives the correct description of the flow but, in the numerical solution, the computational effort is very high. In fact, the mean free path between the molecules is very small and, in any numerical code for the Boltzmann equation like the finite pointset method (FPM), the mesh size should be smaller than the mean free path. On the other hand, in this region, the Boltzmann equation can be replaced by the Euler equations if the particle distribution function is a Maxwellian distribution or close to it. But the Euler equations are not valid everywhere, especially in the shock region or on the solid boundary. Therefore, in those regions where the Euler equations are not valid, one must solve the Boltzmann equation.
In order to determine the domains of validity of these equations, we need a criterion to determine whether a particle system is near enough to a Maxwellian distribution or not. This means that one needs a distance between a Maxwellian distribution and a particle system.
The most widely used method for simulation of the evolution of a rarefied gas is the particle method. As indicated earlier, the Euler equations can also be solved by the particle method; see Tiwari and Rjasanow [1997].
As mentioned in the previous subsection, the particle methods are applied to evolution equations for densities $f(t,x,v)$, $(x,v) \in \Omega$, of particle number, mass, charge, or momentum in the phase space $\Omega$, which is normally the position-velocity space. In other words, particle methods are applied to appropriate conservation laws for quantities given by a measure $\mu$, describing the particle number, defined by the relation

$$\mu(A) = \int_A f\, dx\, dv,$$

where $A$ denotes a measurable set in $\Omega$. Typically, conservation equations are evolution equations for measures, which are a posteriori transformed into (partial) differential equations for the corresponding density function $f$.
Since the discrete measure is a sum of Dirac measures, it is not easy to use, for example, the relative entropy, which also measures the distance between the particle system and the density function. Another difficulty is that we need a distance in velocity space, which is three-dimensional, and the number of data is about 50. Therefore, using statistical tools is not simple. It was proposed by Neunzert that

the $H^s$-norm should be used as a distance between a particle system and the density function. Tiwari and Rjasanow [1997] have effectively implemented his ideas. It was proved in a Ph.D. thesis written under the supervision of Prof. Neunzert (see Schreiner [1994]) that the weak* convergence of $\delta_{\omega_N}$ to $\mu$ is equivalent to

$$\| \delta_{\omega_N} - f \|_{H^s} \to 0,$$

as long as $s < -d/2$, where $d$ is the dimension of the velocity space, which is equal to three in our case. Therefore, we choose $s = -2$ for our computation. We present below a summary of the results of Tiwari and Rjasanow [1997].

4.4.2 Derivation of distance


Let $s \in R$; then $H^s(R^d)$, $d = 1, 2, 3$, is the subspace of $S'(R^d)$, the space of tempered distributions $f$ with the property

$$(1 + |\xi|^2)^{s/2}\, \hat{f} \in L^2(R^d), \quad \xi \in R^d, \tag{4.50}$$

where $\hat{f}(\xi)$ is the Fourier transform of $f$ defined by

$$\hat{f}(\xi) = \int_{R^d} e^{-i \langle v, \xi \rangle}\, f(v)\, dv. \tag{4.51}$$

$H^s(R^d)$ is a Hilbert space with scalar product

$$(f, g)_{H^s} = \int_{R^d} (1 + |\xi|^2)^s\, \hat{f}(\xi)\, \overline{\hat{g}(\xi)}\, d\xi,$$

and the corresponding norm is

$$\| f \|_{H^s} = \Big( \int_{R^d} (1 + |\xi|^2)^s\, |\hat{f}(\xi)|^2\, d\xi \Big)^{1/2}. \tag{4.52}$$

The Dirac measure $\delta_a = \delta(x - a)$ at $a \in R^d$ belongs to $H^s(R^d)$ if $s < -d/2$. For $s = m \in N$, the space $H^s(R^d)$ coincides with the Sobolev space $W^{m,2}(R^d)$, where

$$W^{m,p}(R^d) := \{ u \in L^p(R^d) : D^\alpha u \in L^p(R^d) \text{ for } |\alpha| \le m \}, \quad 1 \le p < \infty,\; m \ge 0.$$

We are interested in calculating the distance (the Sobolev norm of the difference) between the local Maxwellian distribution $f_M[\rho, U, T](v)$, which we denote by $f_M$, and a given particle distribution, which can be interpreted as the discrete measure $\delta_{\omega_N}$.
The local Maxwellian distribution function is defined by

$$f_M[\rho, U, T](v) = \frac{\rho}{(2\pi R T)^{3/2}}\, e^{-\frac{|v - U|^2}{2 R T}}, \tag{4.53}$$

where $\rho$ is the density, $U$ the mean velocity, $T$ the temperature and $R$ the gas constant, and $\rho$, $U$, $T$ depend on time $t$ and position $x$.
We normalize the Maxwellian distribution by defining the new velocities by

$$w = \frac{v - U}{\sqrt{R T}}, \tag{4.54}$$

such that the new temperature is equal to 1 and the new mean velocity is equal to 0. Then the Maxwellian distribution (4.53) is given by

$$f_M(w) = \frac{\rho}{(2\pi)^{3/2}}\, e^{-\frac{|w|^2}{2}}. \tag{4.55}$$

We normalize $f_M$ and $\delta_{\omega_N}$ to 1, that is, we divide $f_M$ and $\delta_{\omega_N}$ by the density $\rho$. We have to compute the $H^s$-norm for $s < -3/2$ for the normalized Maxwellian distribution

$$f_M(w) = \frac{1}{(2\pi)^{3/2}}\, e^{-\frac{|w|^2}{2}}, \tag{4.56}$$

and the discrete measure

$$\delta_{\omega_N} = \frac{1}{N} \sum_{j=1}^{N} \delta(w - w_j). \tag{4.57}$$

The Fourier transform of the Maxwellian distribution (4.56) is again a Maxwellian distribution and is given by

$$\hat{f}_M(\xi) = e^{-\frac{|\xi|^2}{2}}, \tag{4.58}$$

and the Fourier transform of the discrete measure (4.57) is given by

$$\hat{\delta}_{\omega_N}(\xi) = \frac{1}{N} \sum_{j=1}^{N} e^{-i \langle w_j, \xi \rangle}. \tag{4.59}$$

Since $\delta_{\omega_N}$ belongs to $H^s(R^d)$ if $s < -d/2$, it is also true that $f_M$ belongs to $H^s(R^d)$ for all $s < -d/2$. As we have already mentioned, we are looking for a distance in the three-dimensional velocity space; a simple choice for $s$ is $-2$.
We define the distance between $f_M$ and $\delta_{\omega_N}$ by

$$\mathrm{dist}(f_M, \delta_{\omega_N}) = \| f_M - \delta_{\omega_N} \|_{H^{-2}}. \tag{4.60}$$

In order to compute (4.60), we consider

$$\| f_M - \delta_{\omega_N} \|_{H^{-2}}^2 = \| f_M \|_{H^{-2}}^2 - 2\, (f_M, \delta_{\omega_N})_{H^{-2}} + \| \delta_{\omega_N} \|_{H^{-2}}^2. \tag{4.61}$$

Now, we compute each term on the right-hand side of (4.61). First, we consider

$$\| f_M \|_{H^{-2}}^2 = \int_{R^3} \frac{|\hat{f}_M(\xi)|^2}{(1 + |\xi|^2)^2}\, d\xi = \int_{R^3} \frac{e^{-|\xi|^2}}{(1 + |\xi|^2)^2}\, d\xi.$$

Transforming $\xi$ into spherical coordinates,

$$\xi_1 = r \sin\theta \cos\phi, \quad \xi_2 = r \sin\theta \sin\phi, \quad \xi_3 = r \cos\theta$$
$$(0 \le r < \infty,\; 0 \le \theta \le \pi,\; 0 \le \phi \le 2\pi),$$

and integrating with respect to $\theta$ and $\phi$, we get

$$\| f_M \|_{H^{-2}}^2 = 4\pi \int_0^{\infty} \frac{r^2 e^{-r^2}}{(1 + r^2)^2}\, dr.$$

Integrating by parts, we get

$$\| f_M \|_{H^{-2}}^2 = 2\pi \Big[ 3 \int_0^{\infty} \frac{e^{-r^2}}{1 + r^2}\, dr - 2 \int_0^{\infty} e^{-r^2}\, dr \Big] = 2\pi \Big[ \frac{3\pi}{2}\, e\, \mathrm{Erfc}(1) - \sqrt{\pi} \Big],$$

where $\mathrm{Erfc}(x)$ is the complementary error function and is related to the error function by the expression

$$\mathrm{Erfc}(x) = 1 - \mathrm{Erf}(x) = 1 - \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\, dt.$$

Therefore,

$$\| f_M \|_{H^{-2}}^2 = 2\pi \Big[ \frac{3\pi}{2}\, e\, \mathrm{Erfc}(1) - \sqrt{\pi} \Big]. \tag{4.62}$$

After appropriate calculations, we arrive at

$$(f_M, \delta_{\omega_N})_{H^{-2}} = \frac{\sqrt{e}\, \pi^2}{2N} \sum_{j=1}^{N} \Big[ e^{-|v_j|} \Big( 1 - \frac{1}{|v_j|} \Big) \mathrm{Erfc}\Big( \frac{1 - |v_j|}{\sqrt{2}} \Big) + e^{|v_j|} \Big( 1 + \frac{1}{|v_j|} \Big) \mathrm{Erfc}\Big( \frac{1 + |v_j|}{\sqrt{2}} \Big) \Big] \tag{4.63}$$

and

$$\| \delta_{\omega_N} \|_{H^{-2}}^2 = \frac{\pi^2}{N^2} \sum_{i=1}^{N} \sum_{l=1}^{N} e^{-|v_i - v_l|}, \tag{4.64}$$

so that

$$\| f_M - \delta_{\omega_N} \|_{H^{-2}} = \big( \| f_M \|_{H^{-2}}^2 - 2\, (f_M, \delta_{\omega_N})_{H^{-2}} + \| \delta_{\omega_N} \|_{H^{-2}}^2 \big)^{1/2},$$

where $\| f_M \|_{H^{-2}}^2$, $(f_M, \delta_{\omega_N})_{H^{-2}}$, and $\| \delta_{\omega_N} \|_{H^{-2}}^2$ are given by (4.62), (4.63) and (4.64), respectively.
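For the reader who wishes to experiment, the following Python sketch (our code, based on formulas (4.61)-(4.64) as reconstructed above; it requires numpy and scipy) evaluates the $H^{-2}$ distance for a given set of normalized particle velocities:

    import numpy as np
    from scipy.special import erfc

    def h_minus2_distance(w):
        # w: (N, 3) array of normalized particle velocities
        N = len(w)
        r = np.linalg.norm(w, axis=1)
        fM2 = 2.0 * np.pi * (1.5 * np.pi * np.e * erfc(1.0)
                             - np.sqrt(np.pi))                     # (4.62)
        cross = (np.sqrt(np.e) * np.pi**2 / (2.0 * N)
                 * np.sum(np.exp(-r) * (1.0 - 1.0 / r)
                          * erfc((1.0 - r) / np.sqrt(2.0))
                          + np.exp(r) * (1.0 + 1.0 / r)
                          * erfc((1.0 + r) / np.sqrt(2.0))))       # (4.63)
        d = np.linalg.norm(w[:, None, :] - w[None, :, :], axis=2)
        dN2 = np.pi**2 / N**2 * np.sum(np.exp(-d))                 # (4.64)
        return np.sqrt(fM2 - 2.0 * cross + dN2)                    # (4.61)

    # Maxwellian-distributed particles; the distance should decrease
    # roughly like 1/sqrt(N), cf. Table 1 below
    rng = np.random.default_rng(0)
    for N in (10, 100, 1000):
        print(N, h_minus2_distance(rng.standard_normal((N, 3))))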

4.4.3 Computational results


As a first test, we measure the distance between the Maxwellian distribution and the particle system generated by it. Table 1 shows the $H^{-2}$-norm as a distance between the Maxwellian distribution and Maxwellian distributed particles for different numbers of particles.

Table 1. Distance between Maxwellian and its distributed data

    Number of particles    $H^{-2}$-norm
    10                     0.7552
    50                     0.2724
    100                    0.1794
    500                    0.0619
    1000                   0.0397

Table 1 shows that the norm decreases as the number of particles increases. The norm is approximately of order $2/\sqrt{N}$. If we compute the distance between the Maxwellian distribution and non-Maxwellian distributed particles, the distance should be larger than the distance between the Maxwellian distribution and its distributed data. Therefore, we have generated the particles according to the sum of two Maxwellians,

$$f(v) = \frac{\rho}{2\, (2\pi R T)^{3/2}} \Big( e^{-\frac{|v - U|^2}{2 R T}} + e^{-\frac{|v + U|^2}{2 R T}} \Big),$$

and then computed the distance between $f(v)$ and the Maxwellian distribution according to such a particle system with the help of the $H^{-2}$-norm. As in the previous case, we have computed it for different numbers of particles.
Table 2 shows that the distance between the particle system generated according to the sum of two Maxwellian distributions and the Maxwellian distribution with parameters given by the particle system is bigger compared with that in Table 1. In this case also, the norm decreases as the number of particles increases, but at a very slow rate, and it is bounded from below.

Table 2. Distance between sum of two Maxwellians and the corresponding Maxwellian

    Number of particles    $H^{-2}$-norm
    10                     1.7867
    50                     1.7516
    100                    1.7494
    500                    1.7484
    1000                   1.7393

4.4.4 Spatially homogeneous Boltzmann equation


One of the properties of the spatially homogeneous Boltzmann equation is Boltzmann's H-theorem, which states that the tendency of a gas is to approach an equilibrium distribution (see Cercignani [1989]). The Maxwellian distribution can be considered as an equilibrium distribution. We consider first the spatially homogeneous Boltzmann equation with a non-equilibrium distribution at time $t = 0$. The spatially homogeneous Boltzmann equation for $f(t, v)$ is given by

$$\frac{\partial f}{\partial t} = J(f, f), \tag{4.66}$$

where

$$J(f, f) = \int_{R^3} \int_{\eta \in S_+^2} |v - w|\, \sigma(|v - w|, \eta)\, \{ f(t, v') f(t, w') - f(t, v) f(t, w) \}\, d\omega(\eta)\, dw,$$

$$v' = v - \langle v - w, \eta \rangle \eta, \qquad w' = w - \langle w - v, \eta \rangle \eta.$$

We solve equation (4.66) with the following initial condition:

$$f(0, v) = \frac{\rho}{2\, (2\pi R T)^{3/2}} \Big( e^{-\frac{|v - U|^2}{2 R T}} + e^{-\frac{|v + U|^2}{2 R T}} \Big). \tag{4.67}$$

As time $t$ tends to infinity, the solution of (4.66)-(4.67) tends to an equilibrium distribution with the density, mean velocity, and temperature obtained from the initial function $f(0, v)$. Therefore, $\| f_M - \delta_{\omega_N} \|_{H^{-2}}$, defined in (4.60), decreases to a constant as time $t$ tends to infinity. We verify this behaviour by using the particle code developed in the AG Technomathematik, University of Kaiserslautern. At every time step, we compute the $H^{-2}$-norm. Figure 4.3 shows the decrease in the $H^{-2}$-norm as time increases. The simulation is done with 100 particles. Note that the asymptotic value for the distance is approximately the same as that between the Maxwellian distribution and the corresponding particle system, which can be seen from Table 1.

4.5 Problems

Problem 4.1. Compute $I = \int_0^1 x\, dx$ by the Monte Carlo method.

Problem 4.2. Show that the sequence $x_n = (n\theta)$, $n = 1, 2, 3, \ldots$, where $\theta$ is an irrational number and $(n\theta)$ denotes the fractional part of $n\theta$, is uniformly distributed in $[0,1]$.

Problem 4.3. Let $\alpha$ be the root of a polynomial of degree $n$ that has integer coefficients and is irreducible over the rationals. Then show that there exists a constant $c > 0$ such that, for every pair of integers $p, q$ with $q > 0$,

$$\Big| \alpha - \frac{p}{q} \Big| > \frac{c}{q^n}.$$

Problem 4.4. Let $f(u)$ be periodic in $[0,1]$ and of class $C^3[0,1]$, so that we have $f(0) = f(1)$, $f'(0) = f'(1)$, $f''(0) = f''(1)$. Let $\theta$ be a quadratic irrational number; then prove that

$$\Big| \frac{1}{N} \sum_{n=1}^{N} f((n\theta)) - \int_0^1 f(x)\, dx \Big| \le \frac{c}{N}.$$

Figure 4.3. $H^{-2}$-norm for 100 particles, plotted against time.

Problem 4.5. Compute $\int_0^1 \int_0^1 \int_0^1 \int_0^1 e^{x_1 x_2 x_3 x_4}\, dx_1\, dx_2\, dx_3\, dx_4$ using a uniformly distributed sequence of points.

Problem 4.6. (i) Show that $D_n^*(x_1, x_2, \ldots, x_n) \ge \dfrac{1}{2n}$ in general, and

$$D_n^*(x_1, x_2, \ldots, x_n) = \frac{1}{2n} \quad \text{if } x_k = \frac{2k - 1}{2n} \text{ for } 1 \le k \le n,$$

in particular.

(ii) $D_n(x_1, x_2, \ldots, x_n) \ge \dfrac{1}{n}$, with equality if $x_k = \dfrac{2k - 1}{2n}$ for $1 \le k \le n$.

(iii) $D_n(S) \ge c\, n^{-1} \log n$ and $D_n^*(S) \ge (0.06)\, n^{-1} \log n$ for infinitely many $n$.

(iv) $D_n^*(S) = O(n^{-1} \log n)$ for all $n > 2$.

Problem 4.7. Show that the sequence $w_n = (x_1^n, x_2^n, \ldots, x_n^n)$ is uniformly distributed if and only if

$$\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} \psi(x_i^n) = \int_0^1 \psi(x)\, dx,$$

for every real-valued continuous function $\psi$ on $[0,1]$.

Problem 4.8. Verify the results (4.63) and (4.64).


Chapter 5

Image Processing and Fourier-Wavelet Methods

Image processing and signal analysis, whose ingredients are modelling, transforms, smoothing and sharpening, restoration, encoding (image data compression, image transmission, feature extraction), decoding, segmentation, representation, archiving and description, have played a vital role in understanding the intricacies of nature, in providing comforts, luxuries and pleasure to all of us, and even in the probing mission of a space-craft. Image processing is used in telecommunications (telephone and television), photocopying machines, video cameras, fax machines, transmission and analysis of satellite images, medical imaging (echography, tomography and nuclear magnetic resonance), warfare, artificial vision, climatology, meteorology and the film industry for imagery scenes. In short, it is one of the most important disciplines for industrial development and unveils the secrets of nature. Different kinds of techniques and tools like Fourier series, the Fourier transform, the Walsh-Fourier transform, the Haar-Fourier transform, the Hotelling transform, the Hadamard transform, entropy encoding and, more recently, wavelets, wavelet packets, and fractal methodology have been used to tackle the problems of this field. It is difficult to say authoritatively which method is superior to the others in a particular situation. However, a combination of the wavelet and fractal methods promises a great future. This chapter comprises five sections, namely, image model and methods of image processing, classical Fourier analysis, wavelet techniques in image processing, fractal image processing, and problems. Several subsections of Section 5.1 are devoted to image modelling, image enhancement, image smoothing, image restoration, degradation and image analysis. In Section 5.2, basic results concerning Fourier series, Fourier coefficients and the Fourier transform, the fast Fourier algorithm, and sampling theorems - the backbone of digital image processing via Fourier techniques - are discussed. The celebrated theory of wavelet analysis, developed in the last decade, along with its applications to digital image processing, is presented in Section 5.3.
The main objective of this chapter is to familiarize the readers with the mathematical models and mathematical methods for studying modern techniques of image processing. Our treatment is mainly based on the work of Weaver [1989], Gonzalez and Woods [1993], Maaß and Stark [1994], Wickerhauser [1994], Barnsley and Hurd [1993], Lu [1997], and Fisher [1995].

5.1 Image model and methods of image processing


5.1.1 Image model
Signal or image or data is considered as a function. With sound or radio waves,
the function expresses intensity or amplitude in terms of time. For still images, the
free variable is spatial. For video, both time and space are varied. When the domain
and range of such a function are defined as intervals of real numbers, and the function
can attain all of the values in its range, the signal or data is said to be analogue.
When the domain and range consist of finite sets of possible values, represented by
integers, the data is said to be digital.
Assume that an image of an object is represented by a function of two variables $f(x,y)$ ($f : R \times R \to R$), where the value or amplitude of $f$ at $(x,y) \in R \times R = R^2$ denotes the intensity (brightness) of the image at that point. Since light is a form of energy, $f(x,y)$ must be non-zero and finite; that is,

$$0 < f(x,y) < \infty.$$


We may write

$$f(x,y) = i(x,y)\, r(x,y), \tag{5.1}$$

where $i(x,y)$ is called the illumination component and $r(x,y)$ is called the reflectance component. We may assume that

$$0 < i(x,y) < \infty, \qquad 0 < r(x,y) < 1.$$

The nature of $i(x,y)$ is determined by the light source, while $r(x,y)$ is determined by the characteristics of the objects in a scene. The intensity of a monochrome image $f$ at coordinates $(x,y)$ will be called the grey level $l$ of the image at that point. If $l = 0$, then the image is considered as black and if $l = L$, then it is considered as white.
Very often, we take

$$L = i_{\max} \cdot r_{\max} \quad \text{(the product of the maximum values of } i \text{ and } r\text{)}.$$

Suppose that a continuous image $f(x,y)$ is approximated by equally spaced samples arranged in the form of an $N \times N$ array; then it is called a digital image. In other words, a digital image is an image $f(x,y)$ that has been discretized both in the spatial coordinates and in brightness. We may consider a digital image as a matrix whose row and column indices identify a point in the image and whose corresponding matrix element value identifies the grey level at that point. The elements of such a

digital array are called image elements, picture elements, pixels or pels, the last two names being commonly used abbreviations of 'picture elements'. There are several advantages of selecting square arrays with sizes and numbers of grey levels that are integer powers of 2. A typical size comparable in quality to a monochrome TV image is a $512 \times 512$ array with 128 grey levels. The elements of the digitized $f(x,y)$, that is, of the digital image, are given in equation (5.2):

$$f(x,y) = \begin{bmatrix} f(0,0) & f(0,1) & \cdots & f(0,N-1) \\ f(1,0) & f(1,1) & \cdots & f(1,N-1) \\ \vdots & & & \vdots \\ f(N-1,0) & f(N-1,1) & \cdots & f(N-1,N-1) \end{bmatrix}. \tag{5.2}$$

Each element of this matrix is called a 'pixel'. Usually, $N = 2^n$ and $G = 2^m$, where $G$ denotes the number of grey levels. The number $b$ of bits required to store a digitized image is given by

$$b = N \times N \times m. \tag{5.3}$$

The quality of an image will vary with $N$ and $m$. At some point
in the acquisition of a digital image, a spatial resolution (pixel width and length)
and a number of values (pixel depth) are imposed on the image. The resolution,
that is, the degree of discernible detail of an image is strongly dependent on both
N and m. The more these parameters are increased, the closer the digitized array
will approximate the original image. However, relation (5.3) clearly indicates that
storage and consequently processing requirements increase rapidly as a function of
N and m. The effect of variation of N and m on the quality of the image has been
studied but no conclusive result is available.

5.1.2 Image enhancement


Enhancement methods are related to the improvement of a given image for a particular purpose. These methods are problem-oriented, that is, a method may be quite useful for enhancing X-ray images but it may not be the best approach for enhancing pictures of Mars transmitted by a space probe.
We discuss here two broad categories of image enhancement methods: (1) frequency domain methods and (2) spatial domain methods. The frequency domain methods are related to modifying the Fourier transform of an image. The spatial domain methods are based on direct manipulation of the pixels in an image.

Frequency domain methods. The frequency domain methods are based on the convolution theorem discussed in the next section. Let $g(x,y)$ be an image formed by the convolution of an image $f(x,y)$ and a position invariant operator $h(x,y)$; that is,

$$g(x,y) = h(x,y) * f(x,y). \tag{5.4}$$

Then, by the Convolution Theorem, we have

$$G(u,v) = H(u,v)\, F(u,v), \tag{5.5}$$

where $G$, $H$ and $F$ are the Fourier transforms of $g$, $h$ and $f$, respectively. $H(u,v)$ is known as the transfer function of the process. The goal is to find an $H(u,v)$ for a given image $f(x,y)$ (it is clear that $F(u,v)$ can be calculated) so that the desired image, given by

$$g(x,y) = \mathcal{F}^{-1}[H(u,v)\, F(u,v)], \tag{5.6}$$

exhibits some highlighted feature of $f(x,y)$. For example, edges in $f(x,y)$ can be accentuated by using a function $H(u,v)$ that emphasizes the high-frequency components of $F(u,v)$.
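A compact numpy sketch of this frequency domain pipeline (our code; the high-frequency emphasis filter is an illustrative choice of $H(u,v)$, not one prescribed in the text):

    import numpy as np

    def frequency_filter(f, H):
        # g = F^{-1}[ H(u,v) F(u,v) ], cf. equations (5.5)-(5.6)
        return np.real(np.fft.ifft2(H * np.fft.fft2(f)))

    N, D0 = 256, 32.0
    u = np.fft.fftfreq(N) * N
    D = np.sqrt(u[:, None]**2 + u[None, :]**2)   # D(u,v), cf. (5.10)
    H = 1.0 + (D / D0)**2                         # emphasize high frequencies
    f = np.random.rand(N, N)                      # stand-in for an image
    g = frequency_filter(f, H)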

Spatial domain methods. The term spatial domain refers to the aggregate of pixels composing an image, and spatial domain methods are procedures that operate directly on these pixels. Image processing functions in the spatial domain may be expressed by

$$g(x,y) = T[f(x,y)], \tag{5.7}$$

where $f(x,y)$ is an input image, $g(x,y)$ is the processed image and $T$ is an operator on $f$ defined over a neighbourhood of $(x,y)$; alternatively, $T$ may be considered as defined on a set of input images, such as performing the pixel-by-pixel sum of $K$ images for noise reduction.
Let us consider a neighbourhood about $(x,y)$ as shown in Figure 5.1, a square centred at $(x,y)$.
Rectangles and circles may also be used to describe neighbourhoods, but rectangles are more appropriate from the implementation point of view. The centre of the subimage is moved from pixel to pixel starting, say, at the top corner, and the operator is applied at each location $(x,y)$ to yield the value of $g$ at that location.
If the neighbourhood is a square of unit length, then $T$ is of its simplest form and $g$ depends only on the value of $f$ at $(x,y)$; $T$ then becomes a grey-level transformation or mapping of the form

$$s = T(r), \tag{5.8}$$

where $r$ and $s$ denote the grey levels of $f(x,y)$ and $g(x,y)$ at any point. If $T(r)$ has the form shown in Figure 5.2, the effect of this transformation is to produce an image of higher contrast than the original by darkening the levels below a value $m$ and brightening the levels above $m$ in the original pixel spectrum. In this technique, known as contrast stretching, the levels of $r$ below $m$ are compressed by the transform into a narrow range of $s$ towards the dark end of the spectrum. The opposite effect takes place for values of $r$ above $m$.
In the limiting case shown in Figure 5.3, $T(r)$ produces a two-level (binary) image.
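Since such a grey-level transformation acts pixel by pixel, it is a one-line operation on the pixel matrix. A small sketch (ours; the particular stretching function and the exponent $E$ are illustrative choices, with grey levels scaled to $[0,1]$):

    import numpy as np

    def contrast_stretch(r, m, E=4.0):
        # smooth stretch about m: darkens r < m, brightens r > m
        return 1.0 / (1.0 + (m / np.maximum(r, 1e-9))**E)

    def threshold(r, m):
        # limiting case: a two-level (binary) image
        return (r > m).astype(float)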

Figure 5.1. A 3 x 3 neighbourhood about a point $(x,y)$ in an image.

The general approach is to let the values of $f$ in a predefined neighbourhood of $(x,y)$ determine the value of $g$ at those coordinates.
The notion of masks (also referred to as templates, windows, or filters), which are small two-dimensional arrays, for example $3 \times 3$, is quite useful for this method (for more details, one may see Gonzalez and Woods [1993]).

5.1.3 Image smoothing

Smoothing operators are employed mainly for diminishing or eliminating false effects that may be present in an image as a result of poor sampling performance or a poor transmission channel. The present methods for smoothing are:

(1) low-pass filtering,

(2) neighbourhood averaging,

(3) median filtering,

(4) the Butterworth filter, and

(5) averaging multiple images.


Figure 5.2. The grey-level transformation $s = T(r)$ (grey levels from Dark to Light along the $r$-axis).

We present here the low-pass filtering technique for image smoothing and refer to the bibliography for the other methods.

Low-pass filtering. Edges and other transitions, such as noise, in the grey levels of an image contribute heavily to the high-frequency content of its Fourier transform. It therefore follows that blurring can be achieved via the frequency domain by attenuating a specified range of high-frequency components in the transform of a given image. By (5.5), we have

$$G(u,v) = H(u,v)\, F(u,v),$$

where $F(\cdot,\cdot)$ is the Fourier transform of the image $f(\cdot,\cdot)$ which we want to smooth. The problem is to select a function $H(\cdot,\cdot)$ that yields $G(\cdot,\cdot)$ by attenuating the high-frequency components of $F(\cdot,\cdot)$. The inverse Fourier transform of $G(\cdot,\cdot)$ will then yield the desired smoothed image $g(\cdot,\cdot)$. Since high-frequency components are 'filtered out' and information in the low-frequency range is passed without attenuation, this process is known as low-pass filtering. $H(\cdot,\cdot)$ in this context is referred to as the filter transfer function. An ideal low-pass filter (ILPF) in two-dimensional space is defined by the relation

$$H(u,v) = \begin{cases} 1 & \text{if } D(u,v) \le D_0, \\ 0 & \text{if } D(u,v) > D_0, \end{cases} \tag{5.9}$$

where $D_0$ is a specified nonnegative quantity and $D(u,v)$ is the distance from the point $(u,v)$ to the origin of the frequency plane; that is,

$$D(u,v) = (u^2 + v^2)^{1/2}. \tag{5.10}$$
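A minimal numpy sketch (ours) of the ideal low-pass filter (5.9)-(5.10):

    import numpy as np

    def ideal_lowpass(f, D0):
        N, M = f.shape
        u = np.fft.fftfreq(N) * N
        v = np.fft.fftfreq(M) * M
        D = np.sqrt(u[:, None]**2 + v[None, :]**2)   # D(u,v), cf. (5.10)
        H = (D <= D0).astype(float)                   # H(u,v), cf. (5.9)
        return np.real(np.fft.ifft2(H * np.fft.fft2(f)))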

Figure 5.3. The limiting (thresholding) transformation $s = T(r)$ (grey levels from Dark to Light).

The concepts of blurring and ringing can be explained with the help of the convolution theorem. Since

$$G(u,v) = H(u,v)\, F(u,v) \quad \text{[see (5.5)]},$$

it follows from the Convolution Theorem that

$$g(x,y) = h(x,y) * f(x,y),$$

where $g(\cdot,\cdot)$, $h(\cdot,\cdot)$ and $f(\cdot,\cdot)$ are the inverse Fourier transforms of $G(\cdot,\cdot)$, $H(\cdot,\cdot)$ and $F(\cdot,\cdot)$, respectively.

Enhancement based on an image model. As we have seen earlier, an image $f(x,y)$ can be written as

$$f(x,y) = i(x,y)\, r(x,y) \quad \text{(equation (5.1))}.$$

Let $z(x,y) = \log f(x,y)$; then

$$z(x,y) = \log i(x,y) + \log r(x,y), \tag{5.11}$$

or

$$\mathcal{F}(z(x,y)) = \mathcal{F}(\log i(x,y)) + \mathcal{F}(\log r(x,y)), \quad \text{by Theorem 5.7},$$

or

$$Z(u,v) = I(u,v) + R(u,v), \tag{5.12}$$

where $Z(u,v)$, $I(u,v)$ and $R(u,v)$ are the Fourier transforms of $z(x,y)$, $\log i(x,y)$ and $\log r(x,y)$, respectively. If we process $Z(u,v)$ by means of a filter function $H(u,v)$, it follows from equation (5.5) that

$$S(u,v) = H(u,v)\, Z(u,v) = H(u,v)\, I(u,v) + H(u,v)\, R(u,v), \tag{5.13}$$

where $S(u,v)$ is the Fourier transform of the result [compare with equations (5.4) and (5.5)]. In the spatial domain, we have the relation

$$s(x,y) = \mathcal{F}^{-1}\{S(u,v)\} = \mathcal{F}^{-1}\{H(u,v)\, I(u,v)\} + \mathcal{F}^{-1}\{H(u,v)\, R(u,v)\}. \tag{5.14}$$

By letting

$$l(x,y) = \mathcal{F}^{-1}\{H(u,v)\, I(u,v)\} \tag{5.15}$$

and

$$r'(x,y) = \mathcal{F}^{-1}\{H(u,v)\, R(u,v)\}, \tag{5.16}$$

equation (5.14) can be written in the form

$$s(x,y) = l(x,y) + r'(x,y). \tag{5.17}$$

The derived enhanced image $g(x,y)$ is given by

$$g(x,y) = \exp\{s(x,y)\} \tag{5.18}$$
$$= e^{l(x,y)} \cdot e^{r'(x,y)}. \tag{5.19}$$

This process is summarized in Figure 5.4:

$$f(x,y) \to \boxed{\log} \to \boxed{\mathrm{FFT}} \to \boxed{H(u,v)} \to \boxed{\mathrm{FFT}^{-1}} \to \boxed{\exp} \to g(x,y)$$

Figure 5.4.
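The chain of Figure 5.4 translates directly into code. A sketch (ours; the Gaussian high-frequency-emphasis filter and its parameters are illustrative choices for $H(u,v)$):

    import numpy as np

    def homomorphic_enhance(f, gamma_L=0.5, gamma_H=2.0, D0=32.0):
        z = np.log(f + 1e-9)                     # z = log f, cf. (5.11)
        Z = np.fft.fft2(z)                        # Z(u,v), cf. (5.12)
        N, M = f.shape
        u = np.fft.fftfreq(N) * N
        v = np.fft.fftfreq(M) * M
        D2 = u[:, None]**2 + v[None, :]**2
        H = gamma_L + (gamma_H - gamma_L) * (1.0 - np.exp(-D2 / (2 * D0**2)))
        s = np.real(np.fft.ifft2(H * Z))          # s(x,y), cf. (5.14)
        return np.exp(s)                          # g = exp(s), cf. (5.18)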

A relatively recent and potentially powerful area of image processing is the use
of pseudo-colour for image display and enhancement. Fast Fourier transforms and
different kinds of filters have been successfully used in this area; see, for example,
Gonzalez and Woods [1993] .

5.1.4 Image restoration


As in the case of image enhancement, the final goal of image restoration methods is to improve a given image in a certain sense. For the sake of differentiation, we consider restoration as a process that endeavours to reconstruct or recover an image that has been degraded, by using a priori knowledge of the degradation phenomenon. Thus, restoration methods are nothing but the modelling of the degradation and applying the inverse process to recover the original image. We discuss here

models which employ the Fourier transform and convolution theory of Section 3, techniques of the minimum norm problem, and the method of Lagrange multipliers.

Image degradation model and Fourier analysis. The input-output relationship in a model of the image degradation process can be written as

$$g(x,y) = H f(x,y) + \eta(x,y). \tag{5.20}$$

Figure 5.5. A model of the image degradation process: $f(x,y) \to \boxed{H} \to \oplus \to g(x,y)$, with the noise $\eta(x,y)$ added at $\oplus$.

$H$ is an operator on an input image $f(x,y)$ which produces the degraded image $g(x,y)$ minus the noise term represented by $\eta(x,y)$. If $\eta(x,y) = 0$, that is, if there is no noise, the operator maps $f(x,y)$ to $g(x,y)$. If $H$ is linear, then the system is called a linear system. An operator having the input-output relationship $g(x,y) = H f(x,y)$ is said to be position (space) invariant if

$$H f(x - \alpha, y - \beta) = g(x - \alpha, y - \beta), \tag{5.21}$$

for any $f(x,y)$ and any $\alpha$ and $\beta$.
For example, we can express $f(x,y)$ in the form

$$f(x,y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(\alpha, \beta)\, \delta(x - \alpha, y - \beta)\, d\alpha\, d\beta. \tag{5.22}$$

Then, if $\eta(x,y) = 0$ in (5.20),

$$g(x,y) = H f(x,y) = H \Big[ \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(\alpha, \beta)\, \delta(x - \alpha, y - \beta)\, d\alpha\, d\beta \Big]. \tag{5.23}$$

If $H$ is linear and the additivity property is valid for integrals, then

$$g(x,y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} H[f(\alpha, \beta)\, \delta(x - \alpha, y - \beta)]\, d\alpha\, d\beta \tag{5.24}$$

$$= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(\alpha, \beta)\, H\delta(x - \alpha, y - \beta)\, d\alpha\, d\beta, \tag{5.25}$$

by the homogeneity property of $H$.
The term $h(x, \alpha, y, \beta) = H\delta(x - \alpha, y - \beta)$ is called the impulse response of $H$. In other words, if $\eta(x,y) = 0$ in equation (5.20), we find that $h(x, \alpha, y, \beta)$ is the response of $H$ to an impulse at coordinates $(\alpha, \beta)$. If $H$ is position invariant, then from (5.21) we get

$$H\delta(x - \alpha, y - \beta) = h(x - \alpha, y - \beta).$$


Thus, (5.25) can be written as

$$g(x,y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(\alpha, \beta)\, h(x - \alpha, y - \beta)\, d\alpha\, d\beta, \tag{5.26}$$

or

$$g(x,y) = f(x,y) * h(x,y).$$

In the presence of the noise factor $\eta(x,y)$, we have

$$g(x,y) = f(x,y) * h(x,y) + \eta(x,y), \tag{5.27}$$

or $G(u,v) = F(u,v)\, H(u,v) + N(u,v)$ by the Convolution Theorem, where $G(\cdot,\cdot)$, $F(\cdot,\cdot)$, $H(\cdot,\cdot)$, $N(\cdot,\cdot)$ are the Fourier transforms of the functions $g(\cdot,\cdot)$, $f(\cdot,\cdot)$, $h(\cdot,\cdot)$ and $\eta(\cdot,\cdot)$, respectively. Relation (5.27) and its equivalent form are valid for the discrete case also.
The degradation problem can be formulated in terms of matrices (associated with the discrete Fourier transform) in the following manner: Let $f$, $g$, and $\eta$ be $MN$-dimensional column vectors formed by stacking the rows of the $M \times N$ functions $f_e(x,y)$, $g_e(x,y)$ and $\eta_e(x,y)$; then the equation

$$g_e(x,y) = \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} f_e(m,n)\, h_e(x - m, y - n) + \eta_e(x,y) \tag{5.28}$$

can be written in the matrix form

$$g = H f + \eta. \tag{5.29}$$

Thus, the image degradation problem is to find the ideal image $f(x,y)$, given $g(x,y)$ and having knowledge of $h(x,y)$ and $\eta(x,y)$; that is, to estimate $f$ while $g$, $H$ and $\eta$ are given. The fast Fourier transform algorithm can be used to solve it.

Restoration through the minimum norm problem. By (5.29), we have

$$\eta = g - H f.$$

In the absence of any knowledge about $\eta$, a meaningful criterion is to find $\hat{f}$ such that $\| \eta \|^2 = \| g - H \hat{f} \|^2$ is minimum, where $\| \eta \|^2 = \eta' \eta$ and $\| g - H \hat{f} \|^2 = (g - H \hat{f})' (g - H \hat{f})$; equivalently, we want to minimize $J(\hat{f}) = \| g - H \hat{f} \|^2$. For this, the necessary condition is

$$\frac{\partial J}{\partial \hat{f}} = 0 = -2 H' (g - H \hat{f}).$$

The relation holds if

$$\hat{f} = (H' H)^{-1} H' g.$$

If $M = N$ and $H^{-1}$ exists, then

$$\hat{f} = H^{-1} g, \tag{5.30}$$

where $'$ stands here for the transpose.
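In code, the least-squares estimate amounts to solving the normal equations; a small sketch (ours, with a random invertible test system standing in for $H$ and $g$):

    import numpy as np

    rng = np.random.default_rng(1)
    H = rng.standard_normal((16, 16))
    f = rng.standard_normal(16)
    g = H @ f                                     # noise-free degraded image
    f_hat = np.linalg.solve(H.T @ H, H.T @ g)     # minimizes ||g - H f||^2
    print(np.allclose(f_hat, f))                  # exact recovery, cf. (5.30)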

Constrained restoration. Let $J$ be a linear operator on the image represented by $f$. The least squares restoration problem is the process of finding $\hat{f}$ for which $\| J(\hat{f}) \|^2$ is minimum, subject to the constraint $\| g - H \hat{f} \|^2 = \| \eta \|^2$. This problem can be solved by using the method of Lagrange multipliers. The method amounts to expressing the constraint in the form $\alpha(\| g - H \hat{f} \|^2)$ and adding it to $\| J(\hat{f}) \|^2$. In other words, find $\hat{f}$ that minimizes the functional

$$L(\hat{f}) = \| J(\hat{f}) \|^2 + \alpha \big( \| g - H \hat{f} \|^2 - \| \eta \|^2 \big),$$

where $\alpha$ is a constant called the Lagrange multiplier. Differentiating $L(\hat{f})$ and solving the equation

$$\frac{\partial L(\hat{f})}{\partial \hat{f}} = 0,$$

we get

$$\hat{f} = (H' H + \gamma\, J' J)^{-1} H' g,$$

where $\gamma = \frac{1}{\alpha}$ and $J'$ denotes the adjoint (transpose) of $J$. This method introduces considerable flexibility, as one can find different solutions for different choices of $J$.
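Numerically, the constrained estimate is again a single linear solve; a sketch (ours) with $J$ chosen, for illustration, as a discrete second-difference (smoothness) operator:

    import numpy as np

    def constrained_restore(H, g, gamma, J):
        # f_hat = (H'H + gamma J'J)^{-1} H'g
        return np.linalg.solve(H.T @ H + gamma * (J.T @ J), H.T @ g)

    n = 64
    J = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)  # 2nd difference
    rng = np.random.default_rng(2)
    H = np.eye(n) + 0.1 * rng.standard_normal((n, n))
    g = rng.standard_normal(n)
    f_hat = constrained_restore(H, g, gamma=0.5, J=J)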

5.1.5 Image analysis


Broadly speaking, image analysis can be divided into two major areas, image handling and image understanding. Image handling refers to the manipulation of visual information in order to prepare visual data for the human observer. This manipulation can be led by simply aesthetic or artistic goals, as in computer graphics, or it may aim at extracting the essential information for some skilled observer. Noise removal, sharpening of contrast and, to some extent, edge detection are nice examples of this theme. Image understanding requires the interpretation of a given image in terms of human knowledge. Among the challenging industrial applications are problems of quality control such as the automatic inspection for completeness of mounted electronic devices. More difficult tasks are scene interpretation and identification problems. Different problems need methods of different complexity; hence, they are characterized as low- or high-level procedures.
As mentioned earlier, image raw data consists of a single-valued (grey scale image) or a three-component (colour pictures) two-dimensional intensity function $f(x,y)$ on some usually rectangular domain $\Omega \subset R^2$. In a digital computer, $(x,y)$ as well as the $f(x,y)$-values are discretized, and the image raw data forms a two-dimensional array, the pixel matrix. Low-level procedures transform these numerical data into other numerical data, which might again be a pixel matrix or values $P_i(f)$ of some (non)linear functionals $P_i$ on the image data. The functionals $P_i$ are called attributes.
The purpose of a low-level transformation is to create numerical data which is better suited for the classification or scene interpretation task done by a human observer or a high-level procedure. The compression techniques aim at reducing storage requirements for pictorial data and at speeding up read or write operations to or from disk. A good example of low-level procedures is multiscale analysis, which looks at the image $f$ through a microscope whose resolution gets coarser and coarser; thus it associates to $f$ a sequence of smoothed versions $f^a$, labelled by a scale parameter $a$ which increases from some minimal value.
The capability of compressing images is essential for fast transmission of digitized images or their efficient storage. Both applications require a representation of the image matrix $A_{ij}$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, n$, with fewer parameters, without affecting the picture quality visibly. A typical algorithm proceeds in three steps:

Step 1: Compute the coefficients $f_T(u,v) = (T f)(u,v)$, where $T$ is a linear, invertible transformation.

Step 2: A standard procedure for quantization of the coefficients consists in specifying a quantization matrix $Q$ and computing

$$f_{Tq}(u,v) = \text{Integer Round} \Big\{ \frac{f_T(u,v)}{Q(u,v)} \Big\}$$

(see the sketch after this list).

Step 3: The most commonly used coding methods are entropy coders, which aim at storing frequently appearing values with as few bits as possible. The compressed image is denoted by $f_c$.
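The quantization of Step 2 and its (lossy) inversion on the decoder side can be sketched as follows (our code; $Q$ is an arbitrary positive quantization matrix):

    import numpy as np

    def quantize(fT, Q):
        # Step 2: f_Tq(u,v) = Integer Round { f_T(u,v) / Q(u,v) }
        return np.rint(fT / Q).astype(int)

    def dequantize(fTq, Q):
        # decoder side: f_T is only recovered up to the rounding error
        return fTq * Q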

Step 1 defines a one-to-one mapping without resulting in any data reduction. Compression is achieved by the subsequent two steps. Both steps are chosen such that the number of bits needed to encode $f_c$ is considerably smaller than the number of bits in the original image. The quotient of both numbers is called the compression rate.
Until recently, the JPEG standard, an effort to standardize transform compression algorithms by making choices and recommendations for Steps 1 to 3 above, was mainly in practice. The JPEG standard is based on a cosine transform in Step 1. In recent years, other standards like EPIC and fractal compression have been gaining in popularity; see Section 5.3 for details. In the remaining subsections, we briefly introduce the basic concepts of the image compression model and the notions of fidelity and entropy.

Image compression models. As indicated in Figure 5.6(a), a compression sys-


tem consists of two distinct structural blocks, namely, an encoder and a decoder.
An input image f(x ,y) is fed into the encoder which creates a set of symbols from
the input data. After transmission over the channel, the encoded representation

is fed to the decoder, where a reconstructed output image $\hat{f}(x,y)$ is generated. In general, $\hat{f}(x,y)$ may or may not be an exact replica of $f(x,y)$. Both the encoder and decoder shown in this figure consist of two relatively independent functions or sub-blocks. The source encoder is responsible for reducing or eliminating any coding, inter-pixel or psycho-visual redundancies in the input image. As Figure 5.6(b) shows, each operation is designed to reduce the redundancies. Figure 5.6(c) depicts the corresponding source decoder. Figures 5.6(d) and 5.6(e) represent the transform coding system.

Figure 5.6(a). A compression system: the input image $f(x,y)$ passes through the encoder, the channel, and the decoder.

Figure 5.6(c). The source decoder: channel $\to$ symbol decoder $\to$ inverse mapper $\to$ $\hat{f}(x,y)$.

A compression technique which provides an error-free compressed image is called an error-free compression method. The Huffman coding is an example of the error-free compression method, details of which can be found in Gonzalez and Woods [1993]. The transform coding is an example of the lossy compression technique. Lossy compression is based on the concept of compromising the accuracy of the reconstructed image in exchange for increased compression. If the resulting distortion, which may or may not be visually apparent, can be tolerated, the increase in compression can be significant. In fact, many lossy encoding techniques are capable of reproducing recognizable monochrome images from data that have been compressed by more than 30:1, and images that are virtually indistinguishable from the originals at 10:1 to 20:1. Error-free encoding of monochrome images seldom results in more than a 3:1 reduction in data. However, it may be observed that in the recent past, the wavelet and fractal techniques gave much better results. See, for example, the discussion towards the end of Section 5.3.

Figure 5.6(d). The transform coding encoder: the input $N \times N$ image is subdivided, transformed, quantized and coded into the compressed image.

Figure 5.6(e). The transform coding decoder: the compressed image passes through the symbol decoder and the inverse transform, and the $n \times n$ subimages are merged into the decompressed image.

Fidelity criteria. The removal of psycho-visually redundant data results in a loss of real or quantitative visual information. Because information of interest may be lost, a repeatable or reproducible means of quantifying the nature and extent of the information lost is highly desirable. Two general criteria are used as the basis of such an assessment: one is called objective fidelity and the other subjective fidelity. When the level of information loss can be expressed as a function of the original or input image and the compressed and subsequently decompressed output image, it is said to be based on objective fidelity criteria. A good example is the root mean square (RMS) error between an input and output image. This is defined as follows: Let $f(x,y)$ represent an input image and let $g(x,y)$ denote an approximation of $f(x,y)$ that results from compressing and subsequently decompressing the input. For any value of $x$ and $y$, the error $e(x,y)$ between $f(x,y)$ and $g(x,y)$ can be defined as $e(x,y) = f(x,y) - g(x,y)$, so that the total error between the two images is

$$\sum_{x=0}^{M-1} \sum_{y=0}^{N-1} [f(x,y) - g(x,y)],$$

where the images are of size $M \times N$.
The following are distortion measures between the images:

1. The mean absolute error:

$$e_{ma} = \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} |f(x,y) - g(x,y)|.$$

2. The mean square error $e_{ms}$ between $f(x,y)$ and $g(x,y)$ is the squared error averaged over the $M \times N$ array:

$$e_{ms} = \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} [f(x,y) - g(x,y)]^2.$$

The root mean square error $e_{rms}$ is the square root of $e_{ms}$; i.e.,

$$e_{rms} = \sqrt{e_{ms}}.$$

3. The mean-square signal-to-noise ratio of the output image, denoted by $SNR_{ms}$, is

$$SNR_{ms} = \frac{\sum_{x=0}^{M-1} \sum_{y=0}^{N-1} g^2(x,y)}{\sum_{x=0}^{M-1} \sum_{y=0}^{N-1} [f(x,y) - g(x,y)]^2}.$$

The root mean-square signal-to-noise ratio, $SNR_{rms}$, is simply the square root of $SNR_{ms}$; i.e.,

$$SNR_{rms} = \sqrt{SNR_{ms}}.$$

4. The peak signal-to-noise ratio $PSNR$ is defined as follows:

$$PSNR = 10 \log_{10} \Big( \frac{255 \times 255}{e_{ms}} \Big),$$

where 255 is the highest pixel value in a grey-scale image.
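These distortion measures are immediate to compute; a short sketch (ours) for 8-bit grey-scale images:

    import numpy as np

    def distortion_measures(f, g):
        err = f.astype(float) - g.astype(float)
        e_ma = np.mean(np.abs(err))               # mean absolute error
        e_ms = np.mean(err**2)                    # mean square error
        psnr = 10.0 * np.log10(255.0**2 / e_ms)   # peak signal-to-noise ratio
        return e_ma, np.sqrt(e_ms), psnr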

Transform coding. Let a transformation $A$ be defined with the help of an $n \times n$ matrix as follows:

$$y = A x, \tag{5.31}$$

where

$$x = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}, \quad y = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix}, \quad A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}.$$

In general, $A$ is not invertible. In this case, the vector of pixels $x$ is transformed into a vector of coefficients $y$. For some sets of vectors $x$ and some transformations $A$, fewer bits are required to encode the $n$ coefficients of $y$ than the $n$ pixels of $x$. In particular, if the elements $x_1, x_2, \ldots, x_n$ are highly correlated and the transformation matrix $A$ is chosen such that the coefficients $y_1, y_2, \ldots, y_n$ are less correlated, then the $y_i$'s can be individually coded more efficiently than the $x_i$'s. A difference mapping, explained below, is obtained if we choose $A$ as

$$A = \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 & 0 \\ 1 & -1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & -1 & \cdots & 0 & 0 \\ \vdots & & \ddots & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & 1 & -1 \end{bmatrix} \tag{5.32}$$

in equation (5.31). The first element of $y$ is $y_1 = x_1$ and all subsequent coefficients are given by $y_i = x_{i-1} - x_i$. If the grey levels of adjacent pixels are similar, then the differences $y_i = x_{i-1} - x_i$ will, on average, be smaller than the grey levels, so that it should require fewer bits to code them. This mapping is invertible.
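The difference mapping and its inverse can be sketched as follows (our code); the inverse recovers $x$ from $y$ by the recursion $x_1 = y_1$, $x_i = x_{i-1} - y_i$:

    import numpy as np

    def difference_map(x):
        # y_1 = x_1, y_i = x_{i-1} - x_i, i.e. (5.31) with A from (5.32)
        y = np.empty_like(x)
        y[0] = x[0]
        y[1:] = x[:-1] - x[1:]
        return y

    def inverse_difference_map(y):
        x = np.empty_like(y)
        x[0] = y[0]
        for i in range(1, len(y)):
            x[i] = x[i - 1] - y[i]
        return x

    x = np.array([100.0, 102.0, 101.0, 99.0, 98.0])
    print(difference_map(x))                      # [100.  -2.   1.   2.   1.]
    print(inverse_difference_map(difference_map(x)))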

If $A$ is a unitary matrix, then $A^{-1}$ exists and $A^{-1} = A'$, where $A'$ is the transpose of $A$, and (5.31) can be written as

$$x = A' y. \tag{5.33}$$

It is quite clear from (5.31) that each coefficient $y_k$ is a linear combination of all the pixels; that is,

$$y_k = \sum_{i=1}^{n} a_{ki} x_i, \tag{5.34}$$

for $k = 1, 2, 3, \ldots, n$.
Similarly, by equation (5.33), each pixel $x_i$ is a linear combination of all the coefficients:

$$x_i = \sum_{k=1}^{n} b_{ik} y_k, \tag{5.35}$$

for $i = 1, 2, 3, \ldots, n$.
Equations (5.34) and (5.35) are similar to the expressions defining the forward and the inverse transformation kernels, respectively: $a_{ki}$ is the forward transformation kernel and $b_{ik}$ is the inverse transformation kernel.
For the two-dimensional case, (5.34) and (5.35) take the form

$$y_{kl} = \sum_{i=1}^{n} \sum_{j=1}^{n} x_{ij}\, a_{ijkl} \tag{5.36}$$

and

$$x_{ij} = \sum_{k=1}^{n} \sum_{l=1}^{n} y_{kl}\, b_{ijkl}. \tag{5.37}$$

Here, $a_{ijkl}$ and $b_{ijkl}$ are the forward and inverse transformation kernels, respectively. In equations (5.36) and (5.37), the Fourier, Walsh and Hadamard transforms, which fit directly into this setting, are commonly used for encoding purposes, and they give fairly good results. For example, the Fourier kernel is given by

$$a_{ijkl} = \frac{1}{N}\, e^{-j 2\pi (ik + jl)/N}.$$

Another interpretation of (5.37) is possible. Let us write (5.37) in the form

$$X = \sum_{k=1}^{n} \sum_{l=1}^{n} y_{kl}\, B_{kl} \tag{5.38}$$

and interpret this as a series expansion of the $n \times n$ subimage $X$ into $n^2$ $n \times n$ basis images

$$B_{kl} = \begin{bmatrix} b_{kl11} & b_{kl12} & \cdots & b_{kl1n} \\ \vdots & & & \vdots \\ b_{kln1} & b_{kln2} & \cdots & b_{klnn} \end{bmatrix}, \tag{5.39}$$

with the $y_{kl}$, for $k, l = 1, 2, \ldots, n$, being the coefficients (weights) of the expansion. Hence, (5.38) gives the image $X$ as a weighted sum of the basis images $B_{kl}$. The coefficients of the expansion are given by (5.36), which may be written in the form

$$y_{kl} = \sum_{i=1}^{n} \sum_{j=1}^{n} x_{ij}\, (A_{kl})_{ij}, \tag{5.40}$$

where $A_{kl}$ is formed in the same manner as $B_{kl}$, except that the forward kernel is used.

The quantizer. Let us consider the number of all possible values of the coefficients $y_i$ in equation (5.31), where

$$y_i = \sum_{j=1}^{n} a_{ij} x_j.$$

If each element $x_j$ can have any of $2^m$ different values, then each $a_{ij} x_j$ term can also have any of $2^m$ different values, and the sum of $n$ such terms could have any of $(2^m)^n = 2^{mn}$ different values. Therefore, a natural binary representation would require $mn$-bit code words to assign a unique word to each of the possible $2^{mn}$ values of $y_i$. Since only $m$-bit words would be required to code any $x_j$, and our objective is to use fewer bits to code the $y_i$, we must round off the $y_i$ to a smaller number of allowed levels.
A quantizer is a device whose output can have only a limited number of possible values. Each input is forced to one of the permissible output values. For more details, we refer to Gonzalez and Woods [1993].

The coder. As shown in Figure 5.6, the inputs to the coder are the $n$ elements of the vector

$$v = \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix}.$$

Let us suppose that each Vi, i = 1,2, · · ·, n takes one of M values (levels)
For each input Vi, the coder outputs a binary word whose value
WI, W2, • • • , W M.
depends on the value Wk of the input. The coder Input-output relationship is one-to-
one in that a unique code word CI< is assigned to each possible input value wk" This
process is reversible and error-free. If the coder is required to handle M possible
input values, then designing the coder amounts to choosing M unique binary code
words and assigning one of them to each input.
An equal-Iength code is a set of code words each of which has the same number
of bits, along with a rule for assigning code words to quantizer output levels. One
example of an equal-Iength code is the natural binary code. One possible assignment
rule for the natural code is to order the code words according to their binary values.
For example, suppose that there are eight possible coder input values (quantizer
output levels) ordered WI, W2 · ·· , ws; then the natural code is Cl = 000, C2 =
011, . . . , Cs = 111, as illustrated in Table 5.1. There are 8! possible assignments
of the eight-code words to the eight inputs. The reflected binary or grey code,
illustrated in Table 5.1, has the property that any two adjacent code words in the
set differ in only one bit position.
Table 5.1 Some typical codes
Input Natural Code Grey Code
YI 000 111
Y2 001 110
yg 010 100
Y4 011 101
Ys 100 001
Y6 101 000
Y7 110 010
Ys 111 011

A uniquely decodable code is a code with the property that a sequence of


code words can be decoded in only one way. The code Cl = 0, C2 = 1, Cg = 1, C4 = 10
is not unique because of bits 0011 could be decoded as Cl Cl C2C2 or as Cl CgC2. The
codes presented in Table 5.1 are uniquely decodable.
An instantaneous code is one that can be decoded instantaneously. That is,
if we look at the sequence of incoming bits one at a time, we know the value of the
input when we come to the end of a code word. We do not have to look ahead at
any future incoming bits in order to decode the bit stream. The codes in Table 5.1
are instantaneous.
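The reflected binary construction is easy to state in code. The following sketch (an illustrative implementation, not taken from the text) generates the $n$-bit Gray code and checks the one-bit adjacency property; note that Table 5.1 shows a different assignment of code words with the same property.

    def gray_code(n):
        """Return the list of n-bit reflected binary (Gray) code words."""
        codes = ['']
        for _ in range(n):
            # Prefix 0 to the current list and 1 to its reflection.
            codes = ['0' + c for c in codes] + ['1' + c for c in reversed(codes)]
        return codes

    codes = gray_code(3)
    # Any two adjacent code words differ in exactly one bit position.
    for a, b in zip(codes, codes[1:]):
        assert sum(x != y for x, y in zip(a, b)) == 1
    print(codes)  # ['000', '001', '011', '010', '110', '111', '101', '100']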

5.1.6 Variational methods in image processing


In the recent past, certain important algorithms and functional analytic techniques have been developed for understanding the automatic segmentation of digital images and visual perception. Perception theory [see, for example, Koenderink 1985]

deals with the complex interaction between regions and edges (or boundaries) in an image. Segmentation of a digital image means the search for homogeneous regions and edges by a numerical algorithm. The homogeneous regions are assumed to correspond to meaningful parts of objects in the real-life problem, and the edges to their apparent contours. It is interesting to note that most segmentation algorithms try to minimize a segmentation energy. The segmentation energy is a measure of the smoothness of the regions and tells us how faithful the analyzed image is to the original image, and the obtained edges to the image discontinuities. A fairly general segmentation energy model is discussed in Morel and Solimini [1995, Chapter 4]. However, we briefly mention here the Mumford-Shah segmentation energy model. This model defines the segmentation problem as a joint smoothing/edge detection problem, namely, "Given an image g(x), one seeks simultaneously a piecewise smooth image u(x) with a set K of abrupt discontinuities, the edges of g". The best segmentation of a given image is obtained by minimizing the functional

$$E(u, K) = \int_{\Omega \setminus K} \left( |\nabla u(x)|^2 + (u - g)^2 \right) dx + \mathrm{length}(K).$$

The first term imposes that u is smooth outside the edges, the second that the piecewise smooth image u(x) indeed approximates g(x), and the third that the discontinuity set K has minimal length and, therefore, is as smooth as possible. The model is minimal in the sense that removing any one of the above three terms would yield a trivial solution. Mumford and Shah [1985] conjectured the existence of minimal segmentations made of a finite set of $C^2$ curves. So far, this conjecture has not been fully proved, but many useful results have been obtained in this direction [see Morel and Solimini 1995 for detailed updated work].
Multiscale analysis is a fairly recent concept in image processing. It deals with the analysis of pictorial data on different resolution levels and was systematically treated for the first time by Mallat [1989] through wavelet techniques. Multiscale analysis looks at the image through a microscope whose resolution gets coarser and coarser. We define a picture as a grey level or brightness real function $u_0(x)$ defined at each point (pixel) of a domain $\Omega$, which generally is a plane, a square or a rectangle. Multiscale analysis is concerned with the generation, from a single initial picture $f_0$, of a sequence of simplified pictures $f_\lambda(x)$, where $f_\lambda(x)$ appears to be a rougher and sketchier version of $f_0$ as $\lambda$ increases.
In $f_\lambda$, details and features like edges are kept if their "scale" exceeds $\lambda$. Let $H_\lambda$ be the map carrying $f_0$ to $H_\lambda(f_0) = f_\lambda$; then $H_\lambda(f_0)$ is either a picture, denoted by $f_\lambda$, or, more generally, a pair $(K_\lambda, f_\lambda)$, where $K_\lambda$ is a set of boundaries or edges at scale $\lambda$.
Any multiscale analysis must satisfy the properties:
(i) Fidelity, namely, $f_\lambda \to f_0$ as $\lambda \to 0$.
(ii) Causality, namely, $H_\lambda(f_0)$ only depends on $H_{\lambda'}(f_0)$ if $\lambda > \lambda'$.
(iii) Euclidean invariance, namely, if $A$ is an isometry, then $H_\lambda(f_0 \circ A) = H_\lambda(f_0) \circ A$.

(iv) Strong causality (in the case of boundary detection), namely,

$$K_\lambda \subset K_{\lambda'}, \quad \lambda > \lambda'.$$

In view of (i) and (ii), there must exist an energy functional $E(f_\lambda)$ such that $E(f_\lambda)$ decreases as $\lambda$ increases. By virtue of (iii), this functional is likely to be a Lebesgue integral of some bidimensional or mono-dimensional terms depending on $f_\lambda$ and $K_\lambda$ in the case of boundary detection.
Morel and Solimini [1995] have discussed numerous examples of multiscale analysis and their variational formulation. Koenderink [1985] observed for the first time the relationship between images and solutions of partial differential equations. He noticed that the convolution of the signal with a Gaussian at each scale was equivalent to the solution of the heat equation with the signal as the initial datum. Let $f_0$ be this datum; then the scale space analysis associated with $f_0$ consists in solving the system

$$\frac{\partial f(x,t)}{\partial t} = \Delta f(x,t), \qquad f(x,0) = f_0(x).$$

The solution of this equation for an initial datum with bounded quadratic norm is $f(x,t) = G_t * f_0$, where

$$G_\sigma(x) = \frac{1}{4\pi\sigma} \exp\left( -\frac{\|x\|^2}{4\sigma} \right)$$

is the Gaussian function.
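As a minimal numerical sketch of this observation (assuming scipy is available; the test image and scales are arbitrary choices for the example), smoothing with Gaussians of increasing scale reproduces the heat-equation evolution:

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def scale_space(f0, times):
        """Return the family f(., t) = G_t * f0 for increasing times t.

        The heat-equation solution at time t is Gaussian smoothing with
        standard deviation sigma = sqrt(2 t).
        """
        return [gaussian_filter(f0, sigma=np.sqrt(2.0 * t)) for t in times]

    f0 = np.random.rand(64, 64)            # stand-in for a grey-level picture
    family = scale_space(f0, [0.5, 2.0, 8.0])
    # Fidelity: small t stays close to f0; large t gives a sketchier version.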
The sequence $f_\lambda = f(\cdot, \lambda)$ clearly defines a multiscale analysis satisfying the properties of fidelity, causality and invariance. In the Perona and Malik theory (see Perona and Malik [1990], Morel and Solimini [1995]), the heat equation has been replaced by a nonlinear equation of the type

$$\frac{\partial f}{\partial t} = \mathrm{div}\left( g(|\nabla f|)\, \nabla f \right), \qquad f(0) = f_0,$$

where $g$ is a smooth non-increasing function with $g(0) = 1$, $g(s) \geq 0$, and $g(s) \to 0$ as $s \to \infty$.
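A rough explicit finite-difference sketch of this equation, assuming the common choice $g(s) = 1/(1 + (s/\kappa)^2)$ (the step size, $\kappa$ and the discretization below are illustrative choices, not prescribed by the text):

    import numpy as np

    def perona_malik(f0, steps=50, dt=0.1, kappa=0.1):
        """Explicit scheme for df/dt = div(g(|grad f|) grad f)."""
        f = f0.astype(float).copy()
        g = lambda d: 1.0 / (1.0 + (np.abs(d) / kappa) ** 2)
        for _ in range(steps):
            # Differences to the four neighbours (periodic boundary via roll).
            dn = np.roll(f, -1, axis=0) - f
            ds = np.roll(f, 1, axis=0) - f
            de = np.roll(f, -1, axis=1) - f
            dw = np.roll(f, 1, axis=1) - f
            # Large gradients get small conductivity, so edges are preserved.
            f += dt * (g(dn) * dn + g(ds) * ds + g(de) * de + g(dw) * dw)
        return f

In contrast to the linear heat equation, smoothing is suppressed across strong edges while homogeneous regions are still averaged out.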
The variational formulation of the Perona and Malik model and related results can be found in the references cited above. Berger et al. [1996] contains valuable research papers on image processing through the numerical treatment of partial differential equations. An updated account of the wavelet theory providing a stable mathematical foundation for the understanding of multiscale analysis can be found in Mallat [1996]. Chambolle and Lions [1997] have studied the problem of image recovery via total variation minimization and have discussed the suitability of this method for the recovery of images belonging to a certain class of images. Chambolle et al. [1998], De Vore [1998], March and Dozio [1997], and Chipot, March and Vitulano [1999] contain valuable material in this area.

5.2 Introduction to Fourier analysis


Fourier analysis (the study of Fourier series and transforms) is a well-established topic which has been studied extensively and applied vigorously in different disciplines, especially engineering and physics. Renowned mathematicians have taken part in the development of this subject in its history of about 170 years, which goes back to the great French scientist Jean Baptiste Joseph de Fourier (1768-1830).
Sound and vision are the most important external sensory signals which help us to interpret the world around us. In some sense, they are related but, in another sense, they are very different. The essential physical model for light and sound is the same, namely waves. The wave model is also significant for the discussion of heat radiation. It has been demonstrated by Leonhard Euler, Daniel Bernoulli, Joseph Louis Lagrange and Jean Baptiste Joseph de Fourier (J.B. Fourier) that waves, no matter how complicated, have a beautiful common description in mathematical terms. That is the concept of Fourier series. Fourier series allow us to describe tones in terms of harmonics. A tone is obtained as a superposition of pure tones of the form $A_k \sin(k\omega t + \theta_k)$, where $A_k$ is the amplitude and $\omega$ is the basic frequency. The most striking and interesting feature of the Fourier series is that it also allows a description of something as complex as the tone of a violin. Indeed, for any given tone, Fourier analysis allows us to determine the $A_k$'s and $\omega$, both electrically and mathematically. The object of this section is to describe those results which are often used in image processing, especially image enhancement, image degradation, and image compression and transmission.
The invention of the fast Fourier transform algorithm in 1965 and studies of related topics like Walsh and Haar-Fourier series, as well as recent studies of wavelets, have aroused great interest in this field. With the help of the results of this section, one can also analyze and understand a large number of other scientific and technological problems. The main themes discussed in this section include:
1. amplitude, frequency and phase;

2. Bessel's inequality, the Riesz-Fischer theorem and basic convergence theory, including Poisson's summation formula;

3. continuous and discrete Fourier transforms, their convolution, and the fast Fourier transform algorithm; and

4. Fourier analysis via computer.
Our presentation here is based mainly on Weaver [1983, 1989].

5.2.1 Amplitude, frequency and phase


Fourier analysis in the simplest sense is the study of the effects of adding together sine and cosine functions. Daniel Bernoulli, while studying the vibration of a string in the year 1750, first suggested that a continuous function over the interval

(0,11") eould be represented by an infinite series eonsisting only of sine functions.


This suggestion was based on his physical intuition and it was a serious bone of
eontention. About 70 years later, a French soldier and a personal friend ofNapoleon,
J .B. Fourier reopened this debate while studying heat transfer. He argued that a
function eontinuous on an interval (-11", 11") eould be presented as a linear eombination
ofboth sine and eosine funetions. These studies led to the development ofthe subject
which ean be found in the voluminous monographs of Antoni Zygmund [1959] and
N.K. Bari [1961]. We introduee in this seetion the eoneept of frequeney content,
that is, the effect of a linear eombination of sine and eosine funetions.
Let us eonsider the functions A sin wx and A eos wx where A is called the am-
plitude and w is called the (radial) frequency. Thus the amplitude is simply
a eonstant that seales the height of the sine and eosine funetions and causes them
to vary in the same preseribed way between A and -A. The (radial) frequeney w
is a measure of how often the function repeats itself. sin w repeats every 211" radi-
ans, whereas A sin wx repeats every 211"/w radians. An appropriate representation
of frequeney is the number of eydes or eomplete revolutions of the radian vector.
When measured in eydes, the frequency is ealled cireular frequeney and is denoted,
say, by J..t. Clearly, 211"IJ. = w. The period T of Asin211"IJ.x or Aeos211"J..tx is defined
as the number of x units required to eomplete one eyde or 211" radians. It is given
mathematically as T = 1/ IJ. = 211" /w .
When the variable x represents a spatial measurement (such as the length of a
vibrating string), a slightly different terminology is used. The period T is ealled
the wavelength and is denoted by A, whereas the radial frequeney is ealled the wave
number and is denoted by k, Let
n
f(t) = L(A k eos211"J..tkt + Bk sin 211"IJ.kt). (5.43)
k=l

From a physical or intuitive point of view, the higher (larger) frequeney terms
help to make up the finer details of the function while the lower (smaller) ones
eontribute more to the overall or basic shape of the funetion. The frequency
content ofthe function f(t), given by (5.43), is a measure of all frequencies IJ.k' k =
1,2,3, ... ,n, used in the summation, in which mode they are used and how much
of each is used. To be more precise, it is the set of triplex (A k, Bk, J..tk)'
A k and Bk are ealled pure eosine and pure sine frequeney contents of J..tk' re-
speetively. The most eomfortable way to know the frequeney eontent of a function
is to eonstruct graphs of A k and Bk versus IJ.k ' These graphs are ealled frequeney
domain plots . The graph of A k versus IJ.k is called pure eosine frequeney plot while
the graph Bk versus IJ.k is ealled pure sine frequeney plot.

5.2.2 Basic results


Let $f(t)$ be a periodic function with period $T$, Lebesgue integrable (say, in particular, continuous) over $(-T/2, T/2)$; then the Fourier series of $f(t)$ is the

trigonometric series

$$\frac{1}{2} A_0 + \sum_{k=1}^{\infty} \left( A_k \cos \frac{2\pi k t}{T} + B_k \sin \frac{2\pi k t}{T} \right), \qquad (5.44)$$

where

$$A_k = \frac{2}{T} \int_{-T/2}^{T/2} f(t) \cos \frac{2\pi k t}{T}\, dt, \quad k = 0, 1, 2, 3, \ldots \qquad (5.45)$$

$$B_k = \frac{2}{T} \int_{-T/2}^{T/2} f(t) \sin \frac{2\pi k t}{T}\, dt, \quad k = 1, 2, 3, \ldots, \qquad (5.46)$$

and we write

$$f \sim \frac{1}{2} A_0 + \sum_{k=1}^{\infty} \left( A_k \cos \frac{2\pi k t}{T} + B_k \sin \frac{2\pi k t}{T} \right).$$

Here, we take

$$\omega_k = \frac{k}{T}, \quad k = 0, 1, 2, 3, \ldots \qquad (5.47)$$

Very often, we choose $T = 2\pi$, in which case the series takes the familiar form $\frac{1}{2} A_0 + \sum_{k=1}^{\infty} (A_k \cos kt + B_k \sin kt)$.
$A_k$ and $B_k$ are called the cosine and sine Fourier coefficients, respectively. The set of triples $(A_k, B_k, \omega_k)$, where $A_k$, $B_k$, $\omega_k$ are given by (5.45), (5.46) and (5.47), respectively, is called the Fourier series frequency content.
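As a quick numerical illustration (not part of the text), the coefficients (5.45)-(5.46) can be approximated by quadrature; for the square wave of period $2\pi$, the sine coefficients are known to be $B_k = 4/(\pi k)$ for odd $k$, which the sketch reproduces.

    import numpy as np

    def fourier_coeffs(f, T, kmax, n=4096):
        """Approximate A_k and B_k of (5.45)-(5.46) by the trapezoidal rule."""
        t = np.linspace(-T / 2, T / 2, n)
        A = [(2 / T) * np.trapz(f(t) * np.cos(2 * np.pi * k * t / T), t)
             for k in range(kmax + 1)]
        B = [(2 / T) * np.trapz(f(t) * np.sin(2 * np.pi * k * t / T), t)
             for k in range(kmax + 1)]
        return np.array(A), np.array(B)

    square = lambda t: np.sign(np.sin(t))      # square wave, period 2*pi
    A, B = fourier_coeffs(square, 2 * np.pi, 5)
    print(B[1], 4 / np.pi)                     # both approximately 1.2732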
The complex form of the Fourier series of $f$ is

$$\sum_{n=-\infty}^{\infty} C_n\, e^{2\pi i n t / T},$$

where

$$C_0 = \frac{A_0}{2}, \qquad C_n = \frac{A_n - i B_n}{2}, \qquad C_{-n} = \frac{A_n + i B_n}{2}, \quad n > 0,$$

and

$$\omega_n = \frac{n}{T}, \quad n = \ldots, -2, -1, 0, 1, 2, \ldots$$

Let $T = 2\pi$, and let

$$S_n(f)(x) = \sum_{k=-n}^{n} C_k\, e^{ikx} = \frac{1}{2} A_0 + \sum_{k=1}^{n} \left( A_k \cos kx + B_k \sin kx \right)$$

be the $n$-th partial sum of the Fourier series of $f$. Then

$$S_n(f)(x) = \frac{1}{\pi} \int_0^{2\pi} f(x - t)\, D_n(t)\, dt, \qquad (5.48)$$

where

$$D_n(x) = \frac{1}{2} + \sum_{k=1}^{n} \cos kx = \frac{\sin\left( n + \frac{1}{2} \right)x}{2 \sin \frac{x}{2}} \qquad (5.49)$$

is the "Dirichlet kernel", and

$$\sigma_n(f)(x) = \frac{S_0(f) + S_1(f) + \cdots + S_n(f)}{n + 1} = \frac{1}{\pi} \int_0^{2\pi} f(x - t)\, K_n(t)\, dt, \qquad (5.50)$$

where

$$K_n(t) = \frac{1}{2(n+1)} \left( \frac{\sin \frac{(n+1)t}{2}}{\sin \frac{t}{2}} \right)^2$$

is called the "Fejér kernel".

Theorem 5.1 (Bessel's Inequality).

$$\sum_{k=-\infty}^{\infty} |C_k|^2 \leq \|f\|^2_{L_2(0, 2\pi)},$$

or

$$\frac{1}{2} A_0^2 + \sum_{n=1}^{\infty} \left( A_n^2 + B_n^2 \right) \leq \|f\|^2_{L_2(0, 2\pi)}.$$

This also means that $\{A_k\}$ and $\{B_k\}$ are elements of $l_2$.

Theorem 5.2 (Riesz-Fischer Theorem). Let $\{C_k\} \in l_2$. Then there exists $f \in L_2(-\pi, \pi)$ such that $C_k$ is the $k$-th Fourier coefficient of $f$. Furthermore,

$$\sum_{k=-\infty}^{\infty} |C_k|^2 = \|f\|^2_{L_2(-\pi, \pi)}.$$

For $\{A_k\}, \{B_k\}$ belonging to $l_2$, there exists $f \in L_2(0, 2\pi)$ such that $A_k$, $B_k$ are, respectively, the $k$-th cosine and sine Fourier coefficients of $f$. Furthermore,

$$\frac{1}{2} A_0^2 + \sum_{n=1}^{\infty} \left( A_n^2 + B_n^2 \right) = \|f\|^2_{L_2(0, 2\pi)}.$$

Theorem 5.3. Let $f \in L_2(-\pi, \pi)$; then

$$\lim_{n \to \infty} \|f - S_n(f)\|_{L_2(-\pi, \pi)} = 0.$$

Theorem 5.4. Let $f \in C[0, 2\pi]$ be such that

$$\int_0^{2\pi} \frac{\omega(f, t)}{t}\, dt < \infty, \quad \text{where } \omega(f, t) = \sup_{x,\, |h| \leq t} |f(x + h) - f(x)|.$$

Then the Fourier series of $f$ converges uniformly to $f$; that is,

$$\lim_{n \to \infty} \|f - S_n(f)\|_{L_\infty(-\pi, \pi)} = 0.$$

If $\omega(f, \eta) = O(\eta^\alpha)$ for some $\alpha > 0$, then the condition of the theorem holds.

Theorem 5.5. If $f$ is a function of bounded variation, then

$$\lim_{n \to \infty} S_n(f)(x) = \frac{f(x+) + f(x-)}{2},$$

where

$$f(x+) = \lim_{h \to 0+} f(x + h), \qquad f(x-) = \lim_{h \to 0+} f(x - h)$$

exist at every $x$, $a < x < b$.


Theorem 5.6. Let $f \in L^1(\mathbb{R})$ satisfy the following two conditions:

(i) The series

$$\sum_{k=-\infty}^{\infty} f(x + 2\pi k)$$

converges everywhere.

(ii) The Fourier series

$$\frac{1}{2\pi} \sum_{k=-\infty}^{\infty} \hat{f}(k)\, e^{ikx}$$

converges everywhere.

Then the following "Poisson Summation Formula" holds:

$$\sum_{k=-\infty}^{\infty} f(x + 2\pi k) = \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} \hat{f}(k)\, e^{ikx}, \quad x \in \mathbb{R}. \qquad (5.52)$$

In particular,

$$\sum_{k=-\infty}^{\infty} f(2\pi k) = \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} \hat{f}(k),$$

where $\hat{f}$ denotes the Fourier transform of $f$.

5.2.3 Continuous and discrete Fourier transforms


Let $f(x)$ be a continuous function of a real variable $x$; then the Fourier transform of $f(x)$, denoted by $\mathcal{F}(f)$, is defined by the following equation:

$$\mathcal{F}(f(x)) = \int_{-\infty}^{\infty} f(x)\, e^{-i 2\pi t x}\, dx = F(t). \qquad (5.53)$$

Given $F(t)$, the inverse Fourier transform, denoted by $\mathcal{F}^{-1}$, is defined as

$$\mathcal{F}^{-1}(F(t)) = \int_{-\infty}^{\infty} F(t)\, e^{i 2\pi t x}\, dt = f(x). \qquad (5.54)$$

Equations (5.53) and (5.54) are called the Fourier transform pair. If $F(t)$ is also continuous, then (5.54) always exists. The Fourier transform $F(t)$ of a real variable function $f(x)$ is generally a complex-valued function, say

$$F(t) = R(t) + i J(t), \qquad (5.55)$$

where $R$ and $J$ are its real and imaginary parts, respectively.


We can write (5.55) as

$$F(t) = |F(t)|\, e^{i\phi(t)},$$

where

$$|F(t)| = \left[ R^2(t) + J^2(t) \right]^{1/2}, \qquad \phi(t) = \tan^{-1} \left[ \frac{J(t)}{R(t)} \right].$$

The magnitude function $|F(t)|$ is called the Fourier spectrum of $f(x)$, and $\phi(t)$ its phase angle. $P(t) = |F(t)|^2$ is called the power spectrum or spectral density of $f(x)$. The variable $t$ appearing in the Fourier transform may be called the frequency

variable. The name arises from the fact that, using Euler's formula, the exponential term $\exp[-i 2\pi t x]$ can be written as

$$\exp[-i 2\pi t x] = \cos 2\pi t x - i \sin 2\pi t x.$$

If we interpret the integral in (5.53) as a limit of a summation of discrete terms, it is clear that each value of $t$ determines the frequency of its corresponding sine-cosine pair.
The set of triples $(C(\mu), S(\mu), \mu)$, where

$$C(\mu) = F(\mu) + F(-\mu), \qquad S(\mu) = i\left[ F(\mu) - F(-\mu) \right],$$

is called the Fourier transform frequency content. It can be verified that

$$C(\mu) = 2 \int_{-\infty}^{\infty} f(t) \cos 2\pi\mu t\, dt, \qquad S(\mu) = 2 \int_{-\infty}^{\infty} f(t) \sin 2\pi\mu t\, dt.$$

The transform

$$C_1(\mu) = 2 \int_0^{\infty} f(t) \cos 2\pi\mu t\, dt$$

is called the Fourier cosine transform, and

$$S_1(\mu) = 2 \int_0^{\infty} f(t) \sin 2\pi\mu t\, dt$$

is called the Fourier sine transform.

Example 5.1. Let

$$p_a(x) = \begin{cases} 1, & |x| < a \\ 0, & \text{otherwise,} \end{cases} \qquad a > 0.$$

$p_a(\cdot)$ is known as the pulse function. Its Fourier transform is

$$\mathcal{F}(p_a(x)) = F(t) = \int_{-\infty}^{\infty} p_a(x)\, e^{-2\pi i t x}\, dx = \int_{-a}^{a} e^{-2\pi i t x}\, dx = \frac{e^{2\pi i t a} - e^{-2\pi i t a}}{2\pi i t} = 2a\, \frac{\sin 2\pi t a}{2\pi t a}.$$

The graph of this function $F(t)$ is given in Figure 5.7.
The Fourier transform of $f(t)$ defined by

$$f(t) = \begin{cases} e^{-at}, & t \geq 0 \\ 0, & \text{otherwise,} \end{cases} \qquad a > 0,$$

Figure 5.7. Fourier transform of the pulse function.

is

$$F(t) = \frac{1}{a + 2\pi i t} = \frac{a - 2\pi i t}{a^2 + 4\pi^2 t^2}.$$
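Both transforms are easy to verify numerically; the following sketch (illustrative, using numpy) checks the pulse example against the closed form, where np.sinc(x) denotes sin(πx)/(πx):

    import numpy as np

    a = 0.5
    t = np.linspace(-10, 10, 2001)
    x = np.linspace(-a, a, 20001)
    F_num = np.array([np.trapz(np.exp(-2j * np.pi * ti * x), x) for ti in t])
    F_exact = 2 * a * np.sinc(2 * a * t)   # equals 2a sin(2 pi t a)/(2 pi t a)
    print(np.max(np.abs(F_num - F_exact))) # small discretization error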
The space of the set of points $(\mu, F(\mu))$ is called the frequency domain, and the space of the set of points $(t, f(t))$ is called the temporal or time domain when $t$ represents time, where

$$\mathcal{F}(f(t)) = \int_{-\infty}^{\infty} f(t)\, e^{-2\pi i \mu t}\, dt = F(\mu). \qquad (5.56)$$

If $t$ represents a spatial variable, we call $[t, f(t)]$ the spatial domain.
An interesting property of the Fourier transform is that it is norm-preserving; namely,

$$\|f\|_{L_2} = \|F\|_{L_2}.$$

Properties of Fourier transforms. The fundamental properties are enumerated in the form of the following theorems.

Theorem 5.7 (Linearity). If $F(\mu)$ and $G(\mu)$ are the Fourier transforms of $f(t)$ and $g(t)$, respectively, then $a F(\mu) + b G(\mu)$ is the Fourier transform of $h(t) = a f(t) + b g(t)$; $a$ and $b$ are scalars.

Theorem 5.8 (First shift theorem). If the function $f(t)$ has a Fourier transform given by $F(\mu)$, then the Fourier transform of $f(t - a)$ is given by $F(\mu)\, e^{-2\pi i \mu a}$.

Theorem 5.8 (Second shift theorem). If $F(\mu)$ is the Fourier transform of the function $f(t)$, then $F(\mu - a)$ is the Fourier transform of the function $f(t)\, e^{2\pi i a t}$.

Theorem 5.9 (Scale change). If the Fourier transform of $f(t)$ is $F(\mu)$, then the Fourier transform of $f(at)$ is given by $\frac{1}{|a|} F\left( \frac{\mu}{a} \right)$, where $a$ is any real number not equal to 0.

Theorem 5.9 tells us that if $F(\mu)$ is the Fourier transform of $f(t)$, then $F(-\mu)$ is the Fourier transform of $f(-t)$.

Theorem 5.10 (Transform of a transform). $\mathcal{F}(\mathcal{F}(f(t))) = f(-t)$; that is, the Fourier transform of the Fourier transform of a function is equal to the function with a minus sign in the variable (the rotated function of $f$).

Example 5.2. The Fourier transform of the Gaussian function $f(x) = e^{-a x^2}$ is equal to

$$F(t) = \sqrt{\frac{\pi}{a}}\; e^{-\pi^2 t^2 / a}.$$

If $a = \pi$, then $F(t) = e^{-\pi t^2}$; that is, $e^{-\pi x^2}$ is a function whose Fourier transform is the function itself.

Example 5.3. The Fourier transform of the Dirac delta function $\delta(x)$ is 1. Here $\delta(x)$ is the generalized function characterized by $\delta(x) = 0$ for $x \neq 0$ and $\int_{-\infty}^{\infty} \delta(x)\, dx = 1$.

Definition 5.1. The convolution of two continuous functions $f(x)$ and $g(x)$ is defined by the equation

$$f(x) * g(x) = \int_{-\infty}^{\infty} f(y)\, g(x - y)\, dy. \qquad (5.57)$$

The concept of convolution is inherent in almost every field of the physical sciences and engineering. For example, in mechanics, it is known as the superposition or Duhamel integral. In system theory, it plays a crucial role as the impulse response integral and, in optics, as the point spread or smearing function. It has applications of vital importance in image processing.

The convolution of functions is associative, commutative and distributive; that is,

$$f(x) * [g(x) * h(x)] = [f(x) * g(x)] * h(x) \quad \text{(associative)}$$
$$f(x) * g(x) = g(x) * f(x) \quad \text{(commutative)}$$
$$f(x) * [g(x) + h(x)] = f(x) * g(x) + f(x) * h(x). \quad \text{(distributive)}$$

The convolution of two equal pulse functions is given by

$$h(x) = \int_{-\infty}^{\infty} p_a(y)\, p_a(x - y)\, dy = \int_{-a}^{x+a} dy = x + 2a, \quad -2a \leq x \leq 0,$$

and

$$h(x) = \int_{x-a}^{a} dy = 2a - x, \quad 0 \leq x \leq 2a.$$
The fundamental properties of the convolution are summarized in the form of the following theorem.

Theorem 5.11 (Convolution theorem). If $\mathcal{F}(f(x)) = F(x)$ and $\mathcal{F}(g(x)) = G(x)$, then

(i) $\mathcal{F}(f(x) * g(x)) = F(x)\, G(x)$.

(ii) $\mathcal{F}(f(x)\, g(x)) = F(x) * G(x)$.
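A discrete analogue of part (i) is easy to check numerically; the following sketch (illustrative) verifies that circular convolution corresponds to pointwise multiplication of discrete Fourier transforms:

    import numpy as np

    rng = np.random.default_rng(0)
    f, g = rng.standard_normal(64), rng.standard_normal(64)

    # Circular convolution computed directly...
    conv = np.array([sum(f[m] * g[(k - m) % 64] for m in range(64))
                     for k in range(64)])
    # ...equals the inverse DFT of the product of the DFTs.
    conv_fft = np.fft.ifft(np.fft.fft(f) * np.fft.fft(g)).real
    print(np.allclose(conv, conv_fft))   # True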


Notions of the Fourier transform and convolution can be extended to functions of two variables, and most of the properties carry over to the two-dimensional case. For example, if the Fourier transform pair for the two-dimensional function $f(x, y)$ is given by

$$F(u, v) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\, e^{-2\pi i (ux + vy)}\, dx\, dy \qquad (5.58)$$

$$f(x, y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} F(u, v)\, e^{2\pi i (ux + vy)}\, du\, dv, \qquad (5.59)$$

then the Scale Change Theorem takes the following form:
If the function $f(x, y)$ has a Fourier transform given by $F(u, v)$, then the Fourier transform of $f(ax, by)$ is given by

$$\frac{1}{|a|\,|b|}\, F\left( \frac{u}{a}, \frac{v}{b} \right).$$

If $f(x, y) = h(x)\, g(y)$, then $F(u, v) = H(u)\, G(v)$, where $F(u, v)$ is the Fourier transform of $f(x, y)$, and $H(u)$ and $G(v)$ are the Fourier transforms of $h(x)$ and $g(y)$. The equation

$$f(x, y) * g(x, y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(\xi, \eta)\, g(x - \xi, y - \eta)\, d\xi\, d\eta$$

is called the convolution of the functions of two variables $f(x, y)$ and $g(x, y)$.

Example 5.4. (i) The convolution of a function $f(x)$ with the Dirac delta function (unit impulse function) is the function $f(x)$ itself; that is,

$$\delta(x) * f(x) = \int_{-\infty}^{\infty} \delta(\alpha)\, f(x - \alpha)\, d\alpha = \int_{0-}^{0+} \delta(\alpha)\, f(x - \alpha)\, d\alpha = f(x) \int_{0-}^{0+} \delta(\alpha)\, d\alpha = f(x).$$

(ii)

$$\delta(x, y) * f(x, y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \delta(\alpha, \beta)\, f(x - \alpha, y - \beta)\, d\alpha\, d\beta = f(x, y).$$

Discrete Fourier transform. We have seen in the previous section that images are transformed through the Fourier transform, and very often we face problems in evaluating the integrals involved. In such a situation, digital computers can come to our rescue, provided the integrals can be converted into a form amenable to computer analysis. We know that the computer recognizes sequences of numbers that represent functions. Therefore we discretize an arbitrary function to obtain a sequence of numbers that can be handled by a computer.
A discrete Fourier transform is an operation that maps a sequence $\{f(k)\}_0^{N-1}$ (or $\{f_k\}$) to another sequence $\{F(j)\}_0^{N-1}$ (or $\{F_j\}$) defined by the equation

$$F(j) = \frac{1}{N} \sum_{k=0}^{N-1} f(k)\, e^{-2\pi i k j / N}, \quad j \in [0, N-1]. \qquad (5.60)$$

The sequence defined by the equation

$$f(k) = \sum_{j=0}^{N-1} F(j)\, e^{2\pi i k j / N}, \quad k \in [0, N-1], \qquad (5.61)$$

is called the inverse discrete Fourier transform; $\{F, f\}$ is called the discrete Fourier transform pair.
If we write $W_N = e^{2\pi i / N}$, called the weighting kernel, then (5.60) and (5.61) take the form

$$F(j) = \frac{1}{N} \sum_{k=0}^{N-1} f(k)\, W_N^{-kj} \qquad (5.62)$$

$$f(k) = \sum_{j=0}^{N-1} F(j)\, W_N^{kj}. \qquad (5.63)$$

All theorems for the continuous Fourier transform have their counterparts in the discrete case; for example, the First Shift Theorem and the Transform of a Transform take the following form.

Theorem 5.12 (First Shift Theorem). Let $\{F(j)\}_0^{N-1}$ be the discrete Fourier transform of $\{f(k)\}_0^{N-1}$; then the discrete Fourier transform of the shifted sequence $\{f(k - n)\}_0^{N-1}$ is equal to $\{F(j)\, W_N^{-jn}\}_0^{N-1}$.

Theorem 5.13 (Transform of a transform). If $\{F(j)\}_0^{N-1}$ is the discrete Fourier transform of the sequence $\{f(k)\}_0^{N-1}$, then the discrete Fourier transform of the sequence $\{F(j)\}_0^{N-1}$ is equal to

$$\frac{1}{N}\, \{f(N - k)\}_0^{N-1} = \frac{1}{N}\, \{f(-k)\}_0^{N-1}.$$

It can be easily checked that

$$F(-j) = F(N - j), \quad j \in [0, N-1], \qquad f(-k) = f(N - k), \quad k \in [0, N-1].$$
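Definitions (5.60)-(5.61) translate directly into code. The sketch below (illustrative; note that the 1/N factor follows (5.60), whereas library routines such as numpy's np.fft.fft place it on the inverse instead) also confirms that the pair inverts:

    import numpy as np

    def dft(f):
        """Direct O(N^2) evaluation of (5.60)."""
        N = len(f)
        k = np.arange(N)
        return np.array([np.sum(f * np.exp(-2j * np.pi * k * j / N)) / N
                         for j in range(N)])

    def idft(F):
        """Inverse transform (5.61)."""
        N = len(F)
        j = np.arange(N)
        return np.array([np.sum(F * np.exp(2j * np.pi * j * k / N))
                         for k in range(N)])

    f = np.random.rand(16)
    assert np.allclose(idft(dft(f)), f)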

5.2.4 The fast Fourier transform


J.W. Cooley and J.W. Tukey published an algorithm in 1965 which tremendously reduces the number of computations required for computing the discrete Fourier transform of a sequence under certain conditions. This technique is called the fast Fourier transform (FFT) algorithm and is considered one of the most significant contributions of numerical analysis in this century. We present this algorithm here.
Let $F(j)$ be the discrete Fourier transform of $\{f(k)\}_0^{N-1}$. Let us assume that $N$ is an even integer and form the two subsequences

$$f_1(k) = f(2k), \qquad f_2(k) = f(2k + 1), \qquad k = 0, \ldots, M - 1, \text{ where } M = \frac{N}{2}. \qquad (5.64)$$

It can be checked that

$$f_1(k + M) = f(2(k + M)) = f(2k + N) = f(2k) = f_1(k)$$
$$f_2(k + M) = f(2(k + M) + 1) = f(2k + N + 1) = f(2k + 1) = f_2(k),$$

and so $\{f_1(k)\}$ and $\{f_2(k)\}$ are periodic sequences with period $M$.
By equation (5.62), we have

$$F_1(j) = \frac{1}{M} \sum_{k=0}^{M-1} f_1(k)\, W_M^{-kj}, \qquad F_2(j) = \frac{1}{M} \sum_{k=0}^{M-1} f_2(k)\, W_M^{-kj}, \qquad j = 0, 1, \ldots, M - 1. \qquad (5.65)$$

It can be verified that both $F_1(j)$ and $F_2(j)$ are periodic with period $M$; that is,

$$F_1(j + M) = F_1(j), \qquad F_2(j + M) = F_2(j).$$

Now we shall show that the discrete Fourier transform of $\{f(k)\}_{k=0}^{N-1}$, namely $F(j) = \frac{1}{N} \sum_{k=0}^{N-1} f(k)\, W_N^{-kj}$, can be expressed as the sum of two Fourier transforms, one of even order and the other of odd order:

$$F(j) = \frac{1}{N} \sum_{k=0}^{M-1} f(2k)\, W_N^{-2kj} + \frac{1}{N} \sum_{k=0}^{M-1} f(2k + 1)\, W_N^{-(2k+1)j}.$$

Since

$$W_N^{-2kj} = e^{-2\pi i k j / (N/2)} = W_M^{-kj}, \qquad W_N^{-(2k+1)j} = e^{-2\pi i j (2k+1)/N} = W_M^{-kj}\, W_N^{-j},$$

the previous equation becomes

$$F(j) = \frac{1}{N} \sum_{k=0}^{M-1} f_1(k)\, W_M^{-kj} + \frac{W_N^{-j}}{N} \sum_{k=0}^{M-1} f_2(k)\, W_M^{-kj} = \frac{1}{2} F_1(j) + \frac{1}{2} W_N^{-j} F_2(j), \quad j = 0, \ldots, N - 1.$$

Because $\{F_1(j)\}$ and $\{F_2(j)\}$ are periodic with period $M$, we have

$$F(j) = \frac{1}{2} \left[ F_1(j) + F_2(j)\, W_N^{-j} \right], \qquad F(j + M) = \frac{1}{2} \left[ F_1(j) - F_2(j)\, W_N^{-j} \right], \quad j = 0, \ldots, M - 1. \qquad (5.66)$$

To calculate the discrete Fourier transform of $\{f(k)\}$ directly, $N^2$ complex operations (additions and multiplications) are required, whereas calculating the discrete Fourier transform of $\{f_1(k)\}$ or $\{f_2(k)\}$ requires only $M^2 = N^2/4$ complex operations.

When we use (5.66) to find $F(j)$ from $F_1(j)$ and $F_2(j)$, we need $N + 2(N^2/4)$ complex operations. In other words, we first require $2(N^2/4)$ operations to calculate the two Fourier transforms $\{F_1(j)\}$ and $\{F_2(j)\}$, and then we require the $N$ additional operations prescribed by equation (5.66). Thus, we have reduced the number of operations from $N^2$ to $N + \frac{N^2}{2}$. Let us assume that $N$ is divisible by 4, or $M = \frac{N}{2}$ is divisible by 2. Then the subsequences $\{f_1(k)\}$ and $\{f_2(k)\}$ can be further subdivided into four sequences of order $\frac{N}{4}$ as per equation (5.64), as follows:

$$g_1(k) = f_1(2k), \quad g_2(k) = f_1(2k + 1), \quad h_1(k) = f_2(2k), \quad h_2(k) = f_2(2k + 1), \quad k = 0, 1, \ldots, \frac{M}{2} - 1.$$

Thus, we can use (5.66) to obtain each of the discrete Fourier transforms $\{F_1(j)\}$ and $\{F_2(j)\}$ with only $M + \frac{M^2}{2}$ complex operations, and then use these results to obtain $\{F(j)\}$, which requires $N + 2\left( M + \frac{M^2}{2} \right) = 2N + \frac{N^2}{4}$ operations. Thus, when we subdivide a sequence twice ($N > 4$ and $N$ divisible by 4), we reduce the number of operations from $N^2$ to $2N + \frac{N^2}{4}$. The $2N$ term is the result of applying equation (5.66) (twice), whereas the $\frac{N^2}{4}$ term is the result of transforming the four reduced sequences. For the case $N = 4$, we completely reduce the sequence to four first-order sequences that are their own transforms and, therefore, we do not require the additional $N^2/4$ transform operations; the count then becomes $2N$. Continuing this process, we can show that if $N$ is divisible by $2^p$ ($p$ a positive integer), then the number of operations required to compute the discrete Fourier transform of $\{f(k)\}_0^{N-1}$, the $N$-th order sequence, by repeated subdivision is

$$pN + \frac{N^2}{2^p}.$$

Again, for complete reduction (i.e., $N = 2^p$), the $\frac{N^2}{2^p}$ term is not required, and we obtain $pN = N \log_2 N$ for the number of operations. This results in a reduction factor of

$$\frac{N^2}{pN} = \frac{N}{\log_2 N}.$$

Thus, the essence of the Cooley-Tukey algorithm (fast Fourier transform algorithm) is to choose sequences with $N = 2^p$ and go to complete reduction. Although most sequences do not have such a convenient number of terms, we can always artificially add zeros to the end of the sequence to reach such a value. The extra number of terms in the sequence is more than compensated for by the tremendous saving of time due to the use of the Cooley-Tukey algorithm. For example, a direct implementation of the transform for $N = 8192$ requires approximately 45 minutes on an IBM 7094 machine, while the same job can be done in 5 seconds by the same machine

using the FFT algorithm (Cooley-Tukey algorithm). For details like implementation and the inverse FFT, we refer to Gonzalez and Woods [1993] and Press et al. [1990].
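The splitting (5.66) translates into a short recursive program. The following sketch keeps the 1/N normalization of (5.60) by halving at each level; it illustrates the idea and is not a tuned implementation:

    import numpy as np

    def fft_rec(f):
        """Radix-2 Cooley-Tukey FFT for len(f) = 2**p, following (5.66)."""
        N = len(f)
        if N == 1:
            return np.asarray(f, dtype=complex)
        F1 = fft_rec(f[0::2])                  # transform of the even subsequence
        F2 = fft_rec(f[1::2])                  # transform of the odd subsequence
        m = np.arange(N // 2)
        t = np.exp(-2j * np.pi * m / N) * F2   # W_N^{-j} F_2(j)
        # F(j) = [F1 + W^{-j} F2]/2 and F(j+M) = [F1 - W^{-j} F2]/2
        return np.concatenate([(F1 + t) / 2, (F1 - t) / 2])

    f = np.random.rand(8)
    assert np.allclose(fft_rec(f), np.fft.fft(f) / 8)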

Forward and inverse transformation. Let $T$ be the transform of $f(x)$ defined by the relation

$$T(u) = \sum_{x=0}^{N-1} f(x)\, g(x, u), \quad u = 0, 1, 2, \ldots, N - 1,$$

where $g(x, u)$ is called the forward transformation kernel.
The inverse transform of $T$ is given by the relation

$$f(x) = \sum_{u=0}^{N-1} T(u)\, h(x, u), \quad x = 0, 1, 2, \ldots, N - 1,$$

where $h(x, u)$ is called the inverse transformation kernel.
For two-dimensional square arrays, the forward and inverse transforms are given by the equations

$$T(u, v) = \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x, y)\, g(x, y, u, v),$$

and

$$f(x, y) = \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} T(u, v)\, h(x, y, u, v).$$

The forward kernel is said to be separable if

$$g(x, y, u, v) = g_1(x, u)\, g_2(y, v).$$

The forward kernel is called separable and symmetric if

$$g(x, y, u, v) = g_1(x, u)\, g_1(y, v).$$

In a similar manner, we can define separable and symmetric inverse transformation kernels.
The two-dimensional Fourier transform and the inverse Fourier transform are separable and symmetric. For applications of the fast Fourier transform to different areas, we refer to van Loan [1992].

5.2.5 Fourier analysis via computer


From the discussion in the preceding sections, it is quite clear that the concepts of the Fourier series, the continuous Fourier transform and the discrete Fourier transform are basically transforms or mappings. The Fourier series maps an analytic function (continuous function) into a sequence (the sequence of Fourier coefficients), while the

continuous Fourier transform carries one function to another. The discrete Fourier
transform is a transform from one sequence space into another sequence space. The
Fourier series and the continuous Fourier transform both require the evaluation of
integrals which may be very tedious and sometimes quite time-consuming and cum-
bersome. The discrete Fourier transform deals with bounded sequences and requires
only straight-forward addition and multiplication of terms and, furthermore, by ap-
plying the FFT algorithm, it can be computed very rapidly and efficiently through
the computer. We discuss here the methods through which the FFT algorithm can
be used to calculate the Fourier transform and Fourier coefficients of a function.
In the first place, one can convert, with the help of the sampling theorems, the
function into a sequence and then calculate the discrete Fourier transform of this
sequence, and from this we obtain the Fourier series or Fourier transform of the
original function.

Sampling a function. Broadly speaking, digitizing or discretizing or sampling means conversion of a function $f(x)$ into a sequence $\{f(k)\}$. This can be done by choosing values of the function at discrete locations of $x$. Let us assume that the discrete points or locations are evenly spaced in $x$, with the distance between any two samples being $\Delta x$. The $f(k)$ term of the sequence is equal to the value of the function $f(x)$ at $x = x_0 + k \Delta x$ [$x_0$ is the location of the first sample, that is, at $k = 0$]. Naturally, we want this sampled sequence to properly represent the function. $\{f(k)\}$ is considered an adequate representation of $f(x)$ if we can recover the function exactly from the sequence, that is to say, if we can interpolate between the sequence terms $f(k)$ to retrieve the function $f(x)$. The theorem mentioned below is known as the sampling theorem; it theoretically answers the question of how small our sample size $\Delta x$ must be in order that the sampled values reasonably represent the original function.

Theorem 5.14 [Sampling Theorem]. Let $f(x)$ be a band-limited function with bandwidth $2a$, that is, $F(\omega) = 0$ for $|\omega| \geq a$. Then $f(x)$ is uniquely determined by a knowledge of its values at uniformly spaced intervals $\Delta x$ apart ($\Delta x = \frac{1}{2a}$). Specifically, we have

$$f(x) = \sum_{k=-\infty}^{\infty} f(k \Delta x)\, \frac{\sin\left( 2\pi a [x - k \Delta x] \right)}{2\pi a [x - k \Delta x]}.$$

For the proof, we refer to Weaver [1983].
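A small numerical experiment (an illustration under the stated hypotheses, with arbitrarily chosen test parameters and a truncated sum) shows the interpolation formula at work:

    import numpy as np

    a = 4.0                        # band limit: F(w) = 0 for |w| >= a
    dx = 1 / (2 * a)               # Nyquist spacing from the theorem
    f = lambda x: np.cos(2 * np.pi * 3.0 * x)   # band-limited: 3 cycles < a

    k = np.arange(-400, 401)       # truncation of the infinite sum
    x = np.linspace(-1, 1, 500)
    # np.sinc(y) = sin(pi y)/(pi y), so sinc(2a(x - k dx)) matches the theorem.
    terms = f(k * dx)[None, :] * np.sinc(2 * a * (x[:, None] - k * dx))
    print(np.max(np.abs(terms.sum(axis=1) - f(x))))   # small truncation error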


This theorem provides a method for sampling a function in order to be able to uniquely recover the function from its sampled sequence. In other words, for a band-limited function, the highest frequency component present is $a$, and the sampling theorem requires a sampling interval of at most $\frac{1}{2a}$. Physically, this means that we must have at least two samples per cycle of the highest frequency component present. This sample rate ($\Delta x = \frac{1}{2a}$) is often called the Nyquist rate, and the sequence obtained using this rate, the Nyquist samples. The sampling

theorem also supplies us with an interpolation formula with which to recover the function from its Nyquist samples. For more explanation and discussion of the merits and demerits of the theorem, one may consult Weaver [1983]. A function $f(x)$ is said to have bounded support if $f(x) = 0$ for $|x| \geq b$, where $b$ is some positive constant.
In Fourier analysis, such a function is called time-limited or space-limited. $f(x)$ is called almost time-limited if and only if, given any $\epsilon_T > 0$, there exists a positive real number $b$, called the time limit, such that

$$\int_{-\infty}^{-b} |f(x)|\, dx < \epsilon_T \quad \text{and} \quad \int_b^{\infty} |f(x)|\, dx < \epsilon_T.$$

A function $f(x)$ is called almost band-limited if and only if, given any $\epsilon_B > 0$, there exists a positive real number $a$, called the band limit, such that

$$\int_{-\infty}^{-a} |F(\omega)|\, d\omega < \epsilon_B \quad \text{and} \quad \int_a^{\infty} |F(\omega)|\, d\omega < \epsilon_B.$$

The Gaussian function $f(x) = e^{-a x^2}$ is both almost time-limited and almost band-limited. The band and time limits are determined, to a large extent, by the resolution, sensitivity, and/or dynamic range of the detection instruments. A more realistic theorem is as follows:
realistic theorem is as folIows:

Theorem 5.15 [Real World Sampling Theorem]. If the function $f(x)$ is almost band-limited with bandwidth $2a$ and almost time-limited with time width $2b$, then $f(x)$ can be recovered from its sampled sequence to any desired accuracy. That is, given $\epsilon > 0$, we can always choose $b$ and $a$ such that

$$f(x) = \sum_{k=-M}^{M} f(k \Delta x)\, \frac{\sin\left( 2\pi \cdot 5a\, [x - k \Delta x] \right)}{2\pi \cdot 5a\, [x - k \Delta x]} + \Delta_\epsilon(x),$$

where

$$|\Delta_\epsilon(x)| < \epsilon, \qquad M \Delta x > b, \qquad \Delta x < \frac{1}{10a}.$$

The following procedure is used to calculate the Fourier transform by computer (details can be found in Weaver [1989] and Gonzalez and Woods [1993]). Assume that $f(x)$ is an almost time-limited function over the domain $[-c, b]$, and let us assume that it is almost band-limited over the domain $[-a, a]$. Then, to digitally obtain the Fourier transform of this function, we proceed as follows:

1. If necessary ($c \neq 0$), form the new function $g(x)$ by shifting $f(x)$ to the right by an amount $c$; that is,
$$g(x) = f(x - c).$$

2. Sample the function $g(x)$ with the sampling rate $\Delta x = \frac{1}{10a}$, and choose the number of samples $N$ such that
$$N \Delta x > 10(b + c).$$

3. Calculate the discrete Fourier transform of this sampled sequence and multiply the resulting sequence by $N \Delta x$ to obtain the sequence $\{G(j \Delta\omega)\}$.

4. By the relations
$$F(-j) = F(N - j), \quad j \in [0, N - 1], \qquad f(-k) = f(N - k), \quad k \in [0, N - 1],$$
we obtain values for the negative indices $j$ that represent values for the negative frequencies $-j \Delta\omega$ ($\Delta\omega = \frac{1}{N \Delta x}$).

5. Recover $G(\omega)$ from $\{G(j \Delta\omega)\}$ as per the real-world sampling theorem, or by simply constructing a smooth curve between the sampled values.

6. If necessary ($c \neq 0$), recover $F(\omega)$ from $G(\omega)$ as per the formula
$$F(\omega) = G(\omega)\, e^{2\pi i c \omega}.$$

It is clear that if $c = 0$, then $F = G$.
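A compact sketch of steps 1-4 and 6 (illustrative; the test function, the margins and the use of numpy's FFT are choices made here, not prescriptions of the text):

    import numpy as np

    # Gaussian test function: almost time-limited on [-c, b] and almost
    # band-limited on [-a, a]; its exact transform is exp(-pi w^2).
    a, b, c = 3.0, 4.0, 4.0
    f = lambda x: np.exp(-np.pi * x ** 2)

    dx = 1 / (10 * a)                          # step 2: sampling rate
    N = 4096                                   # satisfies N*dx > 10*(b + c)
    x = np.arange(N) * dx
    g = f(x - c)                               # step 1: g(x) = f(x - c)
    G = (np.fft.fft(g) / N) * (N * dx)         # step 3: DFT (5.60) times N*dx
    dw = 1 / (N * dx)
    idx = np.arange(N)
    w = np.where(idx < N // 2, idx, idx - N) * dw   # step 4: negative frequencies
    F = G * np.exp(2j * np.pi * c * w)         # step 6: undo the shift
    print(np.max(np.abs(F - np.exp(-np.pi * w ** 2))))   # close to zero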

The methods of the previous section can be employed for sampling the function, and then the Fourier coefficients can be obtained from the terms of the discrete Fourier transform of the sampled sequence. Let $g(t)$ be a periodic function with period $T$, Lebesgue integrable over $(-T/2, T/2)$. Define a function $f(t)$ as follows:

$$f(t) = g(t), \quad t \in \left[ -\frac{T}{2}, \frac{T}{2} \right]; \qquad f(t) = 0, \quad |t| > \frac{T}{2}.$$

It can be seen [Weaver 1983] that if $C_k$ is the $k$-th complex Fourier coefficient of $g(t)$, then $C_k = \frac{1}{T} F\left( \frac{k}{T} \right)$, where $F(\omega)$ is the continuous Fourier transform of $f(t)$. Thus, we see that the Fourier coefficients of a function may be obtained from the Fourier transform of that function at equally spaced increments $\omega = \frac{k}{T}$.
Illustrative examples can be found in Weaver [1983]. For current developments concerning Shannon's sampling theory, we refer to Zayed [1993]. For a comprehensive discussion of the material presented in this section, we refer to Nievergelt [Chapters 4-6, 1999].

5.3 Wavelets with applications


5.3.1 Introduction
Wavelet analysis is the outcome of the synthesis of ideas that have emerged in different branches of mathematics, physics and engineering. Since the days of Fourier, scientists and engineers have made vigorous efforts to represent square integrable functions (signals having finite energy) as linear combinations of functions having some nice properties. Rademacher, Haar, Walsh, Franklin and Vilenkin constructed non-trigonometric orthogonal systems in their endeavour to accomplish this goal. The Walsh function was extensively studied and applied by electrical and electronic engineers during the seventies and eighties, prior to the invention of wavelets in the mid-eighties (see, for example, Siddiqi [1978, 1987] and references therein). In 1981, Stromberg [1981] constructed an orthonormal spline system on the real line which is now termed the first example of a wavelet constructed by a mathematician. However, even without knowledge of this work, physicists like Grossman and geophysicists like Morlet were developing a technique to study non-stationary signals, which led to the development of wavelet theory in the last decade (see, for example, Daubechies [1992] and Meyer [1993]). Meyer, Daubechies, Mallat et al. have put this theory on a firm foundation through multiresolution analysis and by establishing relationships between function spaces and wavelet coefficients. This scientific discipline of vital importance has been elegantly introduced by Meyer [1993], where he has also explained the relationship between fractals (another exciting scientific discipline) and wavelets, along with future avenues of research, especially in understanding the hierarchical organization and formation of distant galaxies. For the latest interaction of fractals and wavelets, we refer to Berkner [1997], Mendivil and Vrscay [1997], and Siddiqi et al. [1997, 1999]. For current references for results on the theory and applications of wavelets, we refer to Kovačević and Daubechies [1996], Louis, Maaß and Rieder [1997], Dahmen [1997], De Vore [1998], Kobayashi [1998], Ogden [1997], Liu and Chan [1998], and Canuto and Cravero [1997]. Since 1991, a generalization of wavelets, known as wavelet packets, has been studied by Coifman et al. [1992]. Wavelet packets were also named arborescent wavelets; they are particular linear combinations or superpositions of wavelets. Discrete wavelet packets have been thoroughly studied by Wickerhauser [1994], who has also developed computer programmes and implemented them.
The aim of signal analysis is to extract relevant information from a signal by transforming it. In order to study the spectral behaviour of an analog signal from its Fourier transform, full knowledge of the signal in the time-domain must be acquired. If a signal is altered in a small neighbourhood of some time instant, then the entire spectrum is affected. Indeed, in the extreme case, the Fourier transform of the delta distribution $\delta(t - t_0)$, with support at a single point $t_0$, is $e^{-i t_0 \omega}$, which certainly covers the whole frequency domain. Hence, in many applications, such as the analysis of non-stationary signals and real-time signal processing, the formula of the Fourier transform alone is quite inadequate.

A typical problem is the analysis of the sound which we hear when we blow a flute. We can observe that this sound consists of high-frequency parts as well as low-frequency parts. If we use normal Fourier analysis, we would need extremely high frequencies to represent this jump from high to low frequencies. We can avoid this jump if we look not at the whole time interval but just at an interval where we find mainly frequencies of the same order. This means that we introduce 'time windows'. We achieve such a time window technically by introducing a window function g. The usual approach is to introduce time-dependency in the Fourier analysis while preserving linearity. The idea is to introduce a 'local frequency' parameter (local in time) so that the 'local' Fourier transform looks at the signal through a window over which the signal is approximately stationary.
The deficiency of the formula of the Fourier transform in time-frequency analysis was already observed by D. Gabor who, in his 1946 paper, introduced a time-localization 'window function' g(t - b), where the parameter b is used to translate the window in order to cover the whole time-domain for extracting local information of the Fourier transform of the signal. In fact, Gabor used a Gaussian function for the window function g. Since the Fourier transform of a Gaussian function is again a Gaussian, the inverse Fourier transform is localized simultaneously. It is observed that the time-frequency window of any Gabor transform is rigid and, hence, is not very effective for detecting signals with high frequencies and investigating signals with low frequencies. This motivates the introduction of the wavelet transform, which windows the function (signal) and its Fourier transform directly. It allows room for a dilation (or scale) parameter that narrows and widens the time-frequency window according to high and low frequencies. In other words, the wavelet transform is a tool that cuts up data or functions into different frequency components and then studies each component with a resolution matched to its scale.
The wavelet theory provides a unified framework for a number of techniques which had been developed independently for various signal processing applications. For example, multiresolution signal processing, used in computer vision; subband coding, developed for speech and image compression; and wavelet series expansions, developed in applied mathematics, have recently been recognized as different views of a single theory.
We present here some basic results of the wavelet theory, a fast-developing field, which have brought about tremendous improvements in computing time while solving models of real-life problems, in compression ratio, and in noise reduction in image processing. For a deeper insight, we refer to original sources like Beylkin, Coifman and Rokhlin [1991], Glowinski et al. [1990], Daubechies [1992], Amaratunga and Williams [1993], Walker [1997], Siddiqi [1998], Siddiqi and Ahmad [1998], and Kobayashi [1998].
Kobayashi [1998].

5.3.2 Wavelets and multi-resolution analysis


Definition 5.3.1. A wavelet is a function $\psi(t) \in L^2(\mathbb{R})$ such that the family of

functions

$$\psi_{j,k}(t) = 2^{j/2}\, \psi(2^j t - k), \qquad (5.67)$$

where $j$ and $k$ are arbitrary integers, is an orthonormal basis in the Hilbert space $L^2(\mathbb{R})$.

Remark 5.3.1. This definition means that

(i) $\langle \psi_{j,k}(t), \psi_{m,n}(t) \rangle = \delta_{j,m}\, \delta_{k,n}$, where $\delta_{j,m}$ and $\delta_{k,n}$ are Kronecker deltas $\left( \delta_{j,m} = \begin{cases} 1, & j = m \\ 0, & j \neq m \end{cases} \right)$ and $\langle \cdot, \cdot \rangle$ denotes the inner product of $L^2(\mathbb{R})$.

(ii) $f \in L^2(\mathbb{R})$ can be written as

$$f = \sum_{j \in \mathbb{Z}} \sum_{k \in \mathbb{Z}} \langle f, \psi_{j,k}(t) \rangle\, \psi_{j,k}(t), \qquad (5.68)$$

where $\mathbb{Z}$ denotes the set of integers.

Definition 5.3.2. The wavelet coefficients of a function $f \in L^2(\mathbb{R})$, denoted by $c_{j,k}$, are defined as the inner product of $f$ with $\psi_{j,k}(t)$; that is,

$$c_{j,k} = \langle f, \psi_{j,k}(t) \rangle = \int_{\mathbb{R}} f(t)\, \psi_{j,k}(t)\, dt. \qquad (5.69)$$

The series

$$\sum_{j \in \mathbb{Z}} \sum_{k \in \mathbb{Z}} \langle f, \psi_{j,k} \rangle\, \psi_{j,k}(t) \qquad (5.70)$$

is called the wavelet series of $f \in L^2(\mathbb{R})$.

Remark 5.3.2. (i) For a given wavelet $\psi(t)$, a scaled and translated version is obtained by

$$\psi_{a,b}(t) = \frac{1}{\sqrt{|a|}}\, \psi\left( \frac{t - b}{a} \right), \quad a \neq 0, \; b \in \mathbb{R}. \qquad (5.71)$$

The parameter $a$ corresponds to the scale, while $b$ is the translation parameter. The wavelet $\psi_{1,0}(t) = \psi(t)$ is called the basic wavelet or mother wavelet.
(ii) Since $\psi_{j,k}(t)$ oscillates more quickly for large $j$, it is better suited to representing finer details of the signal; $\psi_{j,k}(t)$ is localized about the point $t = 2^{-j} k$. The wavelet coefficient $c_{j,k}$ measures the amount of fluctuation in the function about the point $t = 2^{-j} k$, with the frequency determined by the dilation index $j$.

(iii) (a) Given a real number $h$, we define the translation operator $T_h$, acting on functions defined on $\mathbb{R}$, by the formula

$$T_h(f)(x) = f(x - h).$$

(b) Given an integer $s$, we define the dyadic dilation operator $J_s$, acting on functions defined on $\mathbb{R}$, by the formula

$$J_s(f)(x) = f(2^s x).$$

(iv) In order to have the existence of an inverse transform of the continuous wavelet transform defined below, a technical condition must be satisfied by $\psi \in L^2(\mathbb{R})$; namely,

$$0 < C_\psi = 2\pi \int_{\mathbb{R}} \frac{|\hat{\psi}(\omega)|^2}{|\omega|}\, d\omega < \infty, \qquad (5.72)$$

where $\hat{\psi}(\omega)$ is the Fourier transform of $\psi$. Very often, this property is taken as the definition of a wavelet; that is, a function $\psi$ in $L^2(\mathbb{R})$ satisfying (5.72) is called a wavelet.

Definition 5.3.3. Let $\psi$ be a wavelet. Then the wavelet transform of $f \in L^2(\mathbb{R})$ is defined as

$$T_\psi f(a, b) = \frac{1}{\sqrt{|a|}} \int_{\mathbb{R}} f(t)\, \psi\left( \frac{t - b}{a} \right) dt = \langle f, \psi_{a,b}(t) \rangle. \qquad (5.73)$$

Remark 5.3.3. (i) If $\psi$ satisfies (5.72), then $f$ can be reconstructed by

$$f(t) = C_\psi^{-1} \int_0^{\infty} \frac{da}{a^2} \int_{-\infty}^{\infty} T_\psi f(a, b)\, \psi_{a,b}(t)\, db; \qquad (5.74)$$

that is, the truncated integral

$$\int_{1/A}^{A} \frac{da}{a^2} \int_{-B}^{B} T_\psi f(a, b)\, \psi_{a,b}(t)\, db$$

converges to $C_\psi f(t)$ in $L^2(\mathbb{R})$ as $A$ and $B$ approach $+\infty$.

(ii) Condition (5.72) implies that $\hat{\psi}(0) = 0$, so that $\int_{\mathbb{R}} \psi(t)\, dt = 0$.

(iii) More generally, one may impose the condition that the first several moments of $\psi$ vanish, that is, $\int_{\mathbb{R}} t^k \psi(t)\, dt = 0$ for $k = 0, 1, \ldots, m$.

(iv) Problem 5.13 provides a method for constructing a variety of wavelets.

(v) It can be proved that $T_\psi$ is an isometry, that is, $\|T_\psi(f)\|_{L^2} = \|f\|_{L^2}$.

(vi) The adjoint operator

$$T_\psi^* g(t) = C_\psi^{-1} \int_0^{\infty} \frac{da}{a^2} \int_{-\infty}^{\infty} g(a, b)\, \psi_{a,b}(t)\, db \qquad (5.75)$$

inverts the wavelet transform on its range; that is,

$$T_\psi^* T_\psi = I \quad \text{and} \quad T_\psi T_\psi^* = P_\psi, \qquad (5.76)$$

where $P_\psi$ is the orthogonal projection onto the range of $T_\psi$.
For the proof of these two results, we refer to Louis, Maaß and Rieder [pp. 7-8, 1997].
(vii) It is interesting to observe that the wavelet transform for $a = 2^{-j}$ and $b = k 2^{-j}$ is the wavelet coefficient $c_{j,k}$.

Definition 5.3.4. A multiresolution analysis (MRA) is a sequence $\{V_j\}_{j \in \mathbb{Z}}$ of closed subspaces of $L^2(\mathbb{R})$ such that

(i) $\cdots \subset V_{-2} \subset V_{-1} \subset V_0 \subset V_1 \subset V_2 \subset \cdots$;

(ii) $\overline{\operatorname{span} \bigcup_{j \in \mathbb{Z}} V_j} = L^2(\mathbb{R})$;

(iii) $\bigcap_{j \in \mathbb{Z}} V_j = \{0\}$;

(iv) $f(x) \in V_j$ if and only if $f(2^{-j} x) \in V_0$;

(v) $f \in V_0$ if and only if $f(x - m) \in V_0$ for all $m \in \mathbb{Z}$;

(vi) there exists a function $\varphi \in V_0$, called the scaling function, such that the system $\{\varphi(x - m)\}_{m \in \mathbb{Z}}$ is an orthonormal basis in $V_0$.

It may be observed that a scaling function $\varphi$ determines the multiresolution analysis completely. It induces a wavelet; the scaling function itself is often referred to as the father wavelet. The scaling equation and its equivalent forms will be discussed in the next section, along with the decomposition and reconstruction algorithms.
In the remaining part of this section, we would like to answer the following natural questions:

Q1. What is the general problem involving wavelets?

Q2. Are there functions satisfying the conditions of Definition 5.3.1?

Q3. Is there any relationship between wavelets and multiresolution analysis?



Q4. Are there any advantages of wavelet transforms and wavelet series over the corresponding concepts in Fourier analysis?

Q5. What is the convergence theory of the wavelet series?

Q6. What are the applications of the concepts introduced above in real-life problems, in particular, industrial problems?

Q7. Is it possible to extend Definitions 5.3.1 and 5.3.4 to $L^2(\mathbb{R}^n)$, where $n$ is any natural number?

Q1 General problem. Let $f(t)$ be a function defined for $t \in \mathbb{R}$. Let us imagine that this function describes some real-life phenomenon. To make things mathematically simple, let us suppose that $f \in L^2(\mathbb{R})$. Our object is to transmit/store/analyze this function using some finite device. For example, $f$ represents a voice signal and we want to transmit it over the telephone lines or put it on a compact disk. If we can find an orthonormal basis $\{\varphi_n\}$ in $L^2(\mathbb{R})$, then we can write

$$f = \sum_{n \in \mathbb{N}} c_n \varphi_n, \qquad (5.77)$$

where the series converges in $L^2(\mathbb{R})$, and the coefficients $c_n$ are uniquely determined by the formulas

$$c_n = \langle f, \varphi_n \rangle \quad \text{for } n \in \mathbb{N}. \qquad (5.78)$$

Thus, instead of transmitting the function $f$, it suffices to transmit the sequence of coefficients $\{c_n\}_{n \in \mathbb{N}}$ and let the recipient sum the series himself. This is not a finite procedure. To make it finite, we have to choose a finite set $A \subset \mathbb{N}$ such that $\sum_{n \in A} c_n \varphi_n$ is very close to $\sum_{n \in \mathbb{N}} c_n \varphi_n$. This means that the recipient is really forming the sum $\sum_{n \in A} \tilde{c}_n \varphi_n$, where $c_n$ and $\tilde{c}_n$ are almost equal, that is, the distance between them can be ignored. This is a very general theme, and there have been many ways to deal with various special instances of different aspects of this archetypal problem. Wavelets are one of the new tools to tackle this type of problem effectively and efficiently.

Q2 Existence and examples of wavelets. According to Problem 5.13, every differentiable function whose derivative belongs to $L^2(\mathbb{R})$ will define a wavelet. In fact, the set of all wavelets satisfying (5.72) is dense in $L^2(\mathbb{R})$ (Problem 5.14). We mention below the Haar, Daubechies, Shannon and Gaussian-related wavelets.

Haar wavelet [Alfred Haar, 1911].

$$H(t) = \begin{cases} 1, & 0 \leq t < \frac{1}{2} \\ -1, & \frac{1}{2} \leq t < 1 \\ 0, & \text{otherwise.} \end{cases} \qquad (5.79)$$

Figure 5.8a. The Haar function.

The family $\{H_{j,k}\}_{j,k \in \mathbb{Z}}$, where $H_{j,k}(t) = 2^{j/2} H(2^j t - k)$, is an orthonormal basis of $L^2(\mathbb{R})$. The support of $H_{j,k}$ is $[k 2^{-j}, (k+1) 2^{-j}]$. The intervals $[k 2^j, (k + 1) 2^j]$ for $k, j \in \mathbb{Z}$ form the family of dyadic intervals.
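A small sketch (illustrative) of the Haar system: the function H of (5.79), its scaled translates, and a numerical check of orthonormality:

    import numpy as np

    def H(t):
        """The Haar function (5.79)."""
        t = np.asarray(t)
        return np.where((0 <= t) & (t < 0.5), 1.0,
                        np.where((0.5 <= t) & (t < 1), -1.0, 0.0))

    def H_jk(t, j, k):
        """Scaled translate H_{j,k}(t) = 2^{j/2} H(2^j t - k)."""
        return 2.0 ** (j / 2) * H(2.0 ** j * t - k)

    t = np.linspace(-4, 4, 2 ** 16)
    dt = t[1] - t[0]
    print(np.sum(H_jk(t, 0, 0) ** 2) * dt)             # ~1 (normalized)
    print(np.sum(H_jk(t, 0, 0) * H_jk(t, 1, 0)) * dt)  # ~0 (orthogonal)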

Daubechies wavelets [Ingrid Daubechies, 1988]. To design continuous wavelets amenable to a fast transform and supported on an interval of finite length, Ingrid Daubechies introduced a basic building block or scaling function $\phi$, as shown in Figure 5.9, with

$$\phi(r) = 0 \quad \text{if } r \leq 0 \text{ or } 3 \leq r. \qquad (5.80)$$

$\phi$ does not admit any algebraic formula in terms of elementary mathematical functions. Its construction starts from the initial values

$$\phi(0) = 0, \qquad \phi(1) = \frac{1 + \sqrt{3}}{2}, \qquad \phi(2) = \frac{1 - \sqrt{3}}{2}, \qquad \phi(3) = 0.$$

Figure 5.8b. Haar wavelet examples at different scales: $H_{0,0}(t)$, $H_{3,2}(t)$, $H_{4,13}(t)$.

The building block $\phi$ satisfies the recurrence relation

$$\phi(r) = \frac{1 + \sqrt{3}}{4}\, \phi(2r) + \frac{3 + \sqrt{3}}{4}\, \phi(2r - 1) + \frac{3 - \sqrt{3}}{4}\, \phi(2r - 2) + \frac{1 - \sqrt{3}}{4}\, \phi(2r - 3). \qquad (5.81)$$

It is clear that $\phi(0) + \phi(1) + \phi(2) + \phi(3) = 1$. The recurrence relation takes the form of an inner product with $(h_0, h_1, h_2, h_3) = \left( \frac{1 + \sqrt{3}}{4}, \frac{3 + \sqrt{3}}{4}, \frac{3 - \sqrt{3}}{4}, \frac{1 - \sqrt{3}}{4} \right)$:

$$\phi(r) = h_0\, \phi(2r) + h_1\, \phi(2r - 1) + h_2\, \phi(2r - 2) + h_3\, \phi(2r - 3).$$

It can be seen that this recurrence, started from the initial values above, determines $\phi$ at all dyadic rationals.
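A sketch (illustrative) of how the recurrence, started from the integer values, fills in $\phi$ on finer and finer dyadic grids:

    import numpy as np

    s3 = np.sqrt(3.0)
    h = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / 4.0   # coefficients of (5.81)

    def daubechies_phi(levels):
        """Values of phi at the points i / 2**levels, i = 0, ..., 3 * 2**levels."""
        phi = np.array([0.0, (1 + s3) / 2, (1 - s3) / 2, 0.0])  # phi at 0,1,2,3
        for L in range(levels):
            m = 2 ** L                     # current grid has step 1/m
            fine = np.zeros(6 * m + 1)
            for j in range(len(fine)):     # new point j/(2m)
                for k in range(4):
                    i = j - k * m          # index of 2*(j/(2m)) - k on old grid
                    if 0 <= i < len(phi):
                        fine[j] += h[k] * phi[i]
            phi = fine
        return phi

    vals = daubechies_phi(5)               # phi on a grid of step 1/32
    print(vals.sum() / 32)                 # Riemann sum of phi: equals 1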

Gaussian-related wavelet. $\psi(t) = -\frac{1}{2}\frac{d^2}{dt^2} e^{-t^2} = (1 - 2t^2)\, e^{-t^2}$, the second derivative of a Gaussian (see Figure 5.10).

Shannon wavelet. If $\varphi(x)$ is equal to the Shannon sampling function $\frac{\sin \pi x}{\pi x}$, then the corresponding wavelet is the Shannon wavelet, given by the expression

$$\psi_{\mathrm{Shannon}}(x) = \frac{\sin(2\pi x) - \sin \pi x}{\pi x}. \qquad (5.82)$$

Figure 5.9. Ingrid Daubechies' basic building block $\phi$.

Figure 5.10. A few wavelets obtained from the mother wavelet $\psi(t) = (1 - 2t^2)\, e^{-t^2} = \psi_{1,0}(t)$, which is the second derivative of a Gaussian. Displayed are (from left to right): $\psi_{3/2, -2}(t)$, $\psi_{1,0}(t)$, $\psi_{1/4,\, 2^{1/2}}(t)$.

Q3 Relationship between wavelets and multiresolution analysis. Corresponding to every wavelet, we can construct a multiresolution analysis; see, for example, Problem 5.16, where it is shown that there is a multiresolution analysis corresponding to the Haar wavelet. Conversely, one can construct a wavelet from a multiresolution analysis; see, for example, the discussion on pp. 32-38 of Wojtaszczyk [1997], where it is proved that if $\varphi$ is a scaling function, then the corresponding wavelet is given by the formula

$$\psi(x) = \sum_{n \in \mathbb{Z}} a_n (-1)^n\, \varphi(2x + n + 1), \qquad (5.83)$$

where

$$a_n = \int_{-\infty}^{\infty} \varphi\left( \frac{x}{2} \right) \varphi(x - n)\, dx. \qquad (5.84)$$

More details are given in the next section.

Q4 Advantages of wavelets over Fourier methods. During the last three decades, signal processing has become a discipline of vital importance. Most signals in nature or otherwise are transient (time-dependent/non-stationary). It was realized that the tools of Fourier analysis are incapable of studying such signals properly. This realization provided the motivation for the invention of wavelet analysis, which effectively removed the following drawbacks of Fourier analysis:
(i) The Fourier transform is computed for only one frequency at a time.
(ii) Exact representations cannot be computed in real time.
(iii) The Fourier transform provides information only in the frequency domain, but none in the time domain.
The main disadvantage of the Fourier transform in signal processing is its missing localization property: if a signal changes at a specific time, its transform changes everywhere, and a simple inspection of the transformed signal does not reveal the alteration. The main reason for this is the periodic behaviour of the trigonometric functions. In order to remove this deficiency of the Fourier method, we use small waves or wavelets; in this case, translation and scaling allow for a frequency resolution at arbitrary positions. The Fourier transform considers phenomena on an infinite interval, and this conflicts with the everyday point of view. It decomposes signals into plane waves (trigonometric functions), which oscillate infinitely with the same period and have no local character. The wavelet transform allows more flexibility: the wavelet, which can be almost any chosen function, can be shifted and dilated to analyze signals. Wavelets may be thought of as a generalization of oscillations, abstractly expressed in a zero mean value.
If $f$ in Definition 5.3.3 shows a big change in a neighbourhood $N(b)$ of the point $b$, it has a high frequency spectrum there. Since the set $\{\psi((\cdot - b)/a) : a \in \mathbb{R} \setminus \{0\}\}$ zooms

into $b$ for sufficiently small $a$, the corresponding values of the wavelet transform characterize the high frequency parts of $f$ in $N(b)$. The wavelet transform $T_\psi f$ can be interpreted as
(i) a phase-space representation of $f$,
(ii) an approximation of a derivative of $f$, and
(iii) the splitting up of $f$ into different frequency bands.
The first interpretation gives localization properties of the wavelet transform, yielding a generalized uncertainty principle. The interpretation as the approximation of a derivative of $f$ leads to finding jumps in derivatives, a crucial property for pattern recognition.
As we have seen, wavelets are intrinsically connected to the notion of multiresolution analysis, by which objects like signals, functions and data can be examined using widely varying levels of focus. As a simple analogy, consider observing a car showroom. The observation can be made from a greater distance, at which the viewer can discern only the basic shape of the structure. As the observer moves nearer, various other details of the car showroom can be observed, like the number of cars parked there. Moving still closer, the observer may notice the models of the cars and other objects in the room. Continuing further, it is possible to observe the details of paintings in the room. The basic framework of all these views is essentially the application of the wavelet methods. This capability of multiresolution analysis is known as the "zoom-in, zoom-out" property. Thus, wavelet analysis is an excellent tool for examining features of a signal of any size by adjusting a scaling parameter in the analysis.
Signals are typically contaminated by random noise, and a major part of signal processing is accounting for this noise. A special emphasis is on denoising, that is, extracting the true (pure) signal from the noisy version actually observed. Wavelets have performed admirably well in this field and are superior to other tools. The wavelet method is well suited for denoising not only signals with smooth, well-behaved natures, but also signals with abrupt jumps, sharp spikes, and other irregularities. If signal processing is to be done in real time, that is, if the signals are treated as they are observed, it is important that fast algorithms are implemented. One of the key advantages that wavelets have in signal processing is the associated fast algorithms, which are faster than the fast Fourier transform. The wavelet compression technique is far superior to Fourier methods, say the DCT, for signal and image compression with minimum loss of originality or with maximum accuracy. This new technique has done a remarkable job in the compression of 30 million sets of fingerprints collected by the United States Federal Bureau of Investigation (FBI), where less than 30 kilobytes of storage space suffice for an adequate representation of the digital data, achieving a compression ratio of 20:1. Recently, Walker [1997] has demonstrated with specific examples, using Haar and Daubechies wavelets, that a compression ratio of 25:1 can be achieved with permissible distortions. Siddiqi and Ahmad [1998] have investigated the compression ratio of images available at the "Brag-zone site"
of the University of Waterloo, applying the Fourier method (JPEG), the fractal method and
the wavelet method. It has been observed that the performance of the wavelet method is
superior to JPEG in all cases and only comparable to the fractal method in a few cases.
Details are presented in Section 5.3.4.

Q5. Convergence of wavelet series. In 1966, Carleson proved the famous Lusin
conjecture that the Fourier series of an L²(R) function f converges pointwise almost everywhere to f. This result was extended by Hunt to L_p functions for
p > 1. For Fourier series on R², in 1971, C. Fefferman proved that spherically summed two-dimensional Fourier series of L_p(R²) functions do not converge
in L_p(R²) for certain real p. Kelly, Kon and Raphael [1994, 1994] have studied the
convergence of wavelet expansions, including the inter-connection between classical
results in this area and their results, including those concerning such expansions by
Meyer and Walter. Recently, Manchanda, Mukheimer and Siddiqi [1998] have studied the pointwise convergence of two-dimensional wavelet expansions incorporating
rotation. Essentially, they have proved that such an expansion of a continuous function f in L¹(R²) ∩ L²(R²) converges uniformly on compact subsets. Relaxation of
the continuity and weakening of the regularity condition on the wavelet require investigation
on the lines of Kelly et al. [1994].

Q6. Real-life and industrial applications. Wavelet analysis has outperformed
all other mathematical theories known to date as far as many practical problems are
concerned. Some typical examples will be discussed in Sections 5.3.4 and 5.3.5.
For more case studies, like the wavelet multiresolution display of coastline data and wavelet
analysis for a text-to-speech system, atmospheric wind, turbulent fluids and seismic data,
we refer to Kobayashi [1998]. For a Spanish cement production data set analysis
applying the discrete wavelet transform, we refer to Arino and Vidakovic [1999]. For
applications of wavelet transforms to financial stock index event detection, one may
go through Nievergelt [pp. 33-34, 1999].

Q7. Multivariable wavelets and multi-dimensional multiresolution analysis. In Section 5.3.4, we briefly introduce these concepts in two dimensions; for more
information, we refer to Dai, Larson and Speegle [1997] and Wojtaszczyk [1997].

5.3.3 Special features of wavelets


Interpretation of multiresolution analysis and scaling equations. Let
{V_j}_{j∈Z} be an MRA with the scaling function φ(x), each V_j having the ONB
{φ_{j,k}, k ∈ Z}. Furthermore, let P_j(f) denote the projection of a function f onto the
space V_j (the best approximation of f in V_j), and let ψ(x) be the wavelet associated with
φ by (5.83). Then

P_j(f) = Σ_{k∈Z} ⟨f, φ_{j,k}⟩ φ_{j,k}.   (5.85)
The detail function g_{j−1} (the residual between two approximations, P_j(f) − P_{j−1}(f)) can
be written in terms of ψ_{j,k}(x) as follows:

P_j(f) = P_{j−1}(f) + Σ_{k∈Z} ⟨f, ψ_{j−1,k}⟩ ψ_{j−1,k}   (5.86)

or

P_j(f) = P_{j₀}(f) + Σ_{l=j₀}^{j−1} Σ_k ⟨f, ψ_{l,k}⟩ ψ_{l,k}.   (5.87)

Let

W_j = span{ψ_{j,k}, k ∈ Z},   (5.88)

often called the wavelet subspace;

V_j = span{φ_{j,k}, k ∈ Z},   (5.89)

where

φ_{j,k}(x) = 2^{j/2} φ(2^j x − k).   (5.90)

Then

V_j = V_{j−1} ⊕ W_{j−1},   (5.91)

where ⊕ represents the orthogonal sum of two subspaces of L²(R) (W_{j−1} ⊥ V_{j−1}).
It is also clear that

f(x) ∈ W_j if and only if f(2x) ∈ W_{j+1}.   (5.92)

In fact, it can be verified that

L²(R) = ⊕_{j∈Z} W_j.   (5.93)

Equations (5.86) and (5.88) tell us the basic property of wavelets, namely, it is possible to construct approximations at increasing levels of resolution that are linear
combinations of dilations and translations of a scaling function φ, the differences
in approximations being expressed as linear combinations of dilations and translations of a wavelet function ψ. Furthermore, the scaling function φ and the wavelet
ψ (their dilates and translates) are orthogonal. The ideas of the MRA with detail spaces between successive levels of approximation are expressed in Figure 5.11,
where arrows denote composition, namely, V_j is composed of V_{j−1} and W_{j−1}, etc.
From conditions (i) and (vi) of the MRA (Definition 5.3.4), we find that the
scaling function φ is in V₁. By condition (iv) of Definition 5.3.4, φ(x/2) belongs to
V₀ and, applying condition (vi) again, we get

φ(x/2) = Σ_{n∈Z} a_n φ(x − n),   (5.94)
Figure 5.11. Relation of approximation and detail spaces.

or equivalently

φ(x) = Σ_{n∈Z} a_n φ(2x − n),   (5.95)

where {a_n} is a sequence of scalars.

Equation (5.95) and the following equivalent equations are called scaling equations:

φ̂(ξ) = m_φ(ξ/2) φ̂(ξ/2),   (5.96)

φ̂(2ξ) = m_φ(ξ) φ̂(ξ),   (5.97)

where m_φ(ξ) is a 2π-periodic function given by

m_φ(ξ) = (1/2) Σ_{n∈Z} a_n e^{−inξ}.   (5.98)

(Here, φ̂ denotes the Fourier transform of the scaling function φ.)
Since ‖φ(x/2)‖ = √2, we see that

Σ_{n∈Z} |a_n|² = 2,   (5.99)

so

((1/2π) ∫₀^{2π} |m_φ(ξ)|² dξ)^{1/2} = 1/√2.   (5.100)

Scaling equations (5.95)-(5.97) play a fundamental role in wavelet analysis.

It can be checked that

|m_φ(ξ)|² + |m_φ(ξ + π)|² = 1 for almost all ξ ∈ R.   (5.101)

For results concerning the generation (formation/creation) of an MRA from a function of L², in particular a wavelet, and vice-versa, that is, the creation of a wavelet from
an MRA, we refer to Theorems 2.13 and 2.20 in Wojtaszczyk [1997].

Wavelet decomposition and reconstruction. It is clear that if φ ∈ V₀, then
φ ∈ V₁, as V₀ ⊂ V₁. Since {φ_{1,k}, k ∈ Z} is an orthonormal basis for V₁, there
exist two sequences {p_k} and {q_k} in l²(Z), where l² is the space of square summable
sequences over Z ({a_n} ∈ l²(Z) if Σ_{n∈Z} |a_n|² < ∞), such that

φ(x) = Σ_{k∈Z} p_k φ_{1,k}(x) = 2^{1/2} Σ_{k∈Z} p_k φ(2x − k),   (5.102)

ψ(x) = Σ_{k∈Z} q_k φ_{1,k}(x) = 2^{1/2} Σ_{k∈Z} q_k φ(2x − k).   (5.103)

Equations (5.102) and (5.103) are called the two-scale relations; q_k = (−1)^k p_{1−k} if
φ and ψ are orthogonal.
Equation (5.102) is also known as the dilation equation or refinement equation. We may write p_k = ⟨φ, φ_{1,k}⟩.
Let ψ(x) be equal to the Haar function and φ(x) the characteristic function of
[0,1], that is,

φ(x) = 1 if x ∈ [0,1],
     = 0 if x ∉ [0,1].

Then it may be verified that {V_j}_{j∈Z}, with V_j = span{φ_{j,k}, k ∈ Z}, is an MRA with the
scaling function φ(x), each V_j having an ONB. In this case,

p_k = 1/√2, k = 0, 1,
    = 0, otherwise.   (5.104)

Equations (5.102) and (5.103) take the form

φ(x) = (1/√2) φ_{1,0}(x) + (1/√2) φ_{1,1}(x),   (5.105)

ψ(x) = (1/√2) φ_{1,0}(x) − (1/√2) φ_{1,1}(x).   (5.106)
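
As a quick numerical check of (5.105) and (5.106), consider the following Python sketch; the grid and the function names are our own assumptions.

import numpy as np

def phi(x):
    # Haar scaling function: characteristic function of [0, 1)
    return np.where((x >= 0) & (x < 1), 1.0, 0.0)

def psi(x):
    # Haar wavelet
    return np.where((x >= 0) & (x < 0.5), 1.0,
                    np.where((x >= 0.5) & (x < 1), -1.0, 0.0))

x = np.linspace(-0.5, 1.5, 401)
phi_10 = np.sqrt(2) * phi(2 * x)        # phi_{1,0}(x) = 2^{1/2} phi(2x)
phi_11 = np.sqrt(2) * phi(2 * x - 1)    # phi_{1,1}(x) = 2^{1/2} phi(2x - 1)

# (5.105): phi = (1/sqrt 2) phi_{1,0} + (1/sqrt 2) phi_{1,1}
assert np.allclose(phi(x), (phi_10 + phi_11) / np.sqrt(2))
# (5.106): psi = (1/sqrt 2) phi_{1,0} - (1/sqrt 2) phi_{1,1}
assert np.allclose(psi(x), (phi_10 - phi_11) / np.sqrt(2))
print("two-scale relations (5.105)-(5.106) hold on the grid")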
The decomposition algorithm. Let {c_{j,k}}, j,k ∈ Z, and {d_{j,k}}, j,k ∈ Z, represent
the wavelet and scaling function coefficients of a function f ∈ L²(R); that is,

c_{j,k} = ∫_R f(x) ψ_{j,k}(x) dx   (5.107)

and

d_{j,k} = ∫_R f(x) φ_{j,k}(x) dx.   (5.108)

Then we may prove that

d_{j,k} = Σ_{l∈Z} p_{l−2k} d_{j+1,l}   (5.109)
and

c_{j,k} = Σ_{l∈Z} (−1)^l p_{−l+2k+1} d_{j+1,l}.   (5.110)

Thus, given the scaling coefficients at any level m, all lower-level scaling function coefficients for j < m can be computed recursively applying (5.109), and all lower-level
wavelet coefficients (j < m) can be computed from the scaling function coefficients
using (5.110). This decomposition algorithm is represented schematically in Figure
5.12, where arrows represent the decomposition computations; for instance, d_{m−2,·} and
c_{m−2,·} can be computed using only the coefficients d_{m−1,·}.

Figure 5.12. Schematic representation of the decomposition algorithm.


It can be noted that, in either equation (5.109) or (5.110), if the dilation index
k is increased by one, the indices of the {p_l} sequence are all offset by two. Consequently, if there are only finitely many non-zero elements in the sequence {p_l}, then
applying the decomposition algorithm to a set of non-zero scaling function coefficients at level j + 1 will yield only half as many non-zero scaling function coefficients
at level j. Thus, computing the decomposition algorithm recursively yields fewer coefficients at each level. This is the basis of the fast wavelet transform, to be discussed
after the reconstruction algorithm.
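
The following Python sketch transcribes (5.109) and (5.110) literally for the Haar sequence p₀ = p₁ = 1/√2; the input coefficients are an invented example.

import numpy as np

p = {0: 1 / np.sqrt(2), 1: 1 / np.sqrt(2)}      # Haar two-scale sequence; p_k = 0 otherwise

def decompose(d_fine):
    # One decomposition step from level j+1 to level j
    n = len(d_fine)
    d = np.zeros(n // 2)                        # scaling coefficients via (5.109)
    c = np.zeros(n // 2)                        # wavelet coefficients via (5.110)
    for k in range(n // 2):
        for l in range(n):
            if (l - 2 * k) in p:                # d_{j,k} = sum_l p_{l-2k} d_{j+1,l}
                d[k] += p[l - 2 * k] * d_fine[l]
            if (-l + 2 * k + 1) in p:           # c_{j,k} = sum_l (-1)^l p_{-l+2k+1} d_{j+1,l}
                c[k] += (-1) ** l * p[-l + 2 * k + 1] * d_fine[l]
    return d, c

d_level1 = np.array([4.0, 2.0, 5.0, 5.0])       # assumed level-(j+1) scaling coefficients
d_level0, c_level0 = decompose(d_level1)
print(d_level0)   # [4.2426..., 7.0710...]  -- scaled averages
print(c_level0)   # [1.4142..., 0.0]        -- scaled differences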

The reconstruction algorithm. Let us begin with the MRA {V_j, j ∈ Z}, with
{φ_{j,k}, k ∈ Z} and {ψ_{j,k}, k ∈ Z} forming orthonormal bases for V_j and W_j, respectively. Since φ_{1,0} ∈ V₁ and V₁ = V₀ ⊕ W₀, φ_{1,0} can be written as a linear combination
of the φ_{0,k}'s (basis of V₀) and the ψ_{0,k}'s (basis of W₀). Let, for k ∈ Z,

a_{2k} = ⟨φ_{1,0}, φ_{0,k}⟩ and a_{2k−1} = ⟨φ_{1,1}, φ_{0,k}⟩,
b_{2k} = ⟨φ_{1,0}, ψ_{0,k}⟩ and b_{2k−1} = ⟨φ_{1,1}, ψ_{0,k}⟩.

Therefore,

φ_{1,0}(x) = Σ_{k∈Z} (a_{2k} φ_{0,k}(x) + b_{2k} ψ_{0,k}(x))   (5.111)

and

φ_{1,1}(x) = Σ_{k∈Z} (a_{2k−1} φ_{0,k}(x) + b_{2k−1} ψ_{0,k}(x)).   (5.112)
By using (5.111) and (5.112), we can write a similar expression for any φ_{1,k}. Let k
be even. Then

φ_{1,k}(x) = φ_{1,0}(x − k/2)
         = Σ_{l∈Z} [a_{2l} φ_{0,l}(x − k/2) + b_{2l} ψ_{0,l}(x − k/2)]
         = Σ_{l∈Z} [a_{2l} φ_{0,l+k/2}(x) + b_{2l} ψ_{0,l+k/2}(x)]

or

φ_{1,k}(x) = Σ_{l∈Z} [a_{2l−k} φ_{0,l}(x) + b_{2l−k} ψ_{0,l}(x)].   (5.113)

Assuming k to be odd, we get the same formula. For odd (even) k, only the odd
indexed (even indexed) elements of the sequences {a_l} and {b_l} are accessed.
Proceeding on similar lines, a formula relating each scaling function φ_{j,k} to scaling functions and wavelets at level j − 1 can be obtained, namely,

φ_{j,k}(x) = Σ_{l∈Z} [a_{2l−k} φ_{j−1,l} + b_{2l−k} ψ_{j−1,l}].   (5.114)

We can apply (5.114) to obtain the following reconstruction algorithm for the
scaling function and wavelet coefficients:

d_{j,k} = Σ_{l∈Z} [a_{2l−k} d_{j−1,l} + b_{2l−k} c_{j−1,l}].   (5.115)

We can show that

b_{2k−1} = ⟨φ_{1,1}, ψ_{0,k}⟩
        = ∫_{−∞}^{∞} √2 φ(2x − (1 − 2k)) ψ(x) dx
        = Σ_{l∈Z} (−1)^l p_{1−l} ⟨φ_{1,1−2k}, φ_{1,l}⟩
        = −p_{2k}.

Similarly, it can be seen that

a_k = p_{−k} and b_k = (−1)^k p_{k+1} for all k ∈ Z.

Thus the reconstruction algorithm (5.115) can be written as

d_{j,k} = Σ_{l∈Z} [p_{k−2l} d_{j−1,l} + (−1)^k p_{2l−k+1} c_{j−1,l}].   (5.116)
Figure 5.13. Schematic representation of the reconstruction algorithm.

The scaling function coefficients at any level can be computed from only one set
of low-level scaling function coefficients and all the intermediate wavelet coefficients
by applying (5.115) or (5.116) recursively. This concept is shown schematically in
Figure 5.13.
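
Here is a matching Python sketch of one reconstruction step per (5.116); applied to the output of the decomposition sketch above, it returns the original coefficients (again, all data are illustrative assumptions).

import numpy as np

p = {0: 1 / np.sqrt(2), 1: 1 / np.sqrt(2)}      # Haar two-scale sequence

def reconstruct(d_coarse, c_coarse):
    # One reconstruction step per (5.116):
    # d_{j,k} = sum_l [ p_{k-2l} d_{j-1,l} + (-1)^k p_{2l-k+1} c_{j-1,l} ]
    n = 2 * len(d_coarse)
    d = np.zeros(n)
    for k in range(n):
        for l in range(len(d_coarse)):
            if (k - 2 * l) in p:
                d[k] += p[k - 2 * l] * d_coarse[l]
            if (2 * l - k + 1) in p:
                d[k] += (-1) ** k * p[2 * l - k + 1] * c_coarse[l]
    return d

d0 = np.array([6 / np.sqrt(2), 10 / np.sqrt(2)])   # coarse scaling coefficients (from the example above)
c0 = np.array([2 / np.sqrt(2), 0.0])               # wavelet coefficients
print(reconstruct(d0, c0))                         # [4. 2. 5. 5.] -- the original level-(j+1) data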
The filter representation. The decomposition and reconstruction algorithms
can be treated as examples of signal processing filters. As we know from everyday life,
a filter is used to purify certain things; for example, in a laboratory it is used to separate
a liquid from solid impurities or to remove one solid from solid impurities. A filter
in signal processing may attempt either to isolate the pure signal from its noise
contamination or to extract the noise, depending on which (signal or noise) is of
primary interest. A discrete signal f is represented by a sequence {f_k}_{k∈Z} ∈
l²(Z). A filter may also be represented by a sequence {a_k}_{k∈Z} ∈ l²(Z), and is
denoted by A. Applying a filter to a signal yields another signal. The filtering
process comprises a discrete convolution of the filter sequence with the signal. By
applying the filter A to the signal f, we get

(Af)_k = Σ_{l∈Z} a_{l−2k} f_l,   (5.117)
where (Af)_k is a new signal indexed by k ranging over Z. A²f will mean that A has
been applied twice. If A and B are two filters, then ABf will mean that first B has
been applied to f and then A has been applied to Bf. Our goal now is to express
the decomposition algorithm given by equations (5.109) and (5.110) in terms of filters. Let H be
a filter represented by the sequence {p_k}_{k∈Z} given as

p_k = ⟨φ, φ_{1,k}⟩, k ∈ Z.   (5.118)

Let {d_{j,k}}_{k∈Z}, the scaling function coefficients at a particular level, be a signal. The
scaling function coefficients at the next lower level are obtained by applying the
filter H to the signal d_{j,·} = {d_{j,k}}_{k∈Z}:

d_{j−1,·} = H d_{j,·};   (5.119)

(5.119) corresponds to (5.109).


The scaling function coefficients at any level can be obtained by repeatedly applying the filter H:

d_{j−m,·} = H^m d_{j,·}.   (5.120)

Now we define a new filter G by

q_k = (−1)^k p_{1−k}, k ∈ Z,   (5.121)

where the p_k's are defined by (5.118).


Wavelet coefficients at level j − 1 can be obtained from scaling function coefficients
at level j via the filter G:

c_{j−1,·} = G d_{j,·};   (5.122)

(5.122) corresponds to (5.110).


By combining the filters H and G, wavelet coefficients at any lower level can be
computed from scaling function coefficients at level j:

c_{j−m,·} = G H^{m−1} d_{j,·},   (5.123)

where H and G are examples of quadrature mirror filters, well known in the engineering literature. H is known as a low-pass filter and G is an example of a high-pass
filter. Broadly speaking, low-pass filters are related to averaging operations, and
high-pass filters correspond to differencing.
It may be observed that for compactly supported wavelets, that is, those having
zero values outside a compact interval, the sequences {c_{j,k}} and {d_{j,k}} will have
only a finite number of non-zero elements. Furthermore, the two-scale sequence {p_k}
completely characterizes each wavelet basis. For wavelets with support on all of R,
each sequence element p_k is, in general, non-zero, but the elements decay exponentially
as |k| becomes large.
Let H be the low-pass filter associated with the Haar system; then p₀ = p₁ = 1/√2 and
the other p_k's are zero. For a signal f = {f_k}_{k∈Z},

(Hf)_k = (1/√2)(f_{2k} + f_{2k+1}),   (5.124)

which is proportional to the average of adjacent elements.

For a signal f = {f_k}_{k∈Z}, G is given by

(Gf)_k = (1/√2)(f_{2k} − f_{2k+1}),   (5.125)

which is proportional to the difference between adjacent elements.

In general, one may say that the p_k's for a low-pass filter satisfy the relation

Σ_{k∈Z} p_k = √2,   (5.126)

and for high-pass filters, they satisfy the relation

Σ_{k∈Z} q_k = 0.   (5.127)

The fast wavelet transform. We have seen in Section 5.2 that the Cooley
and Tukey fast Fourier transform (algorithm) of data has a reduced computational
cost of only O(n log n), a very significant achievement. The main idea is to reduce
the number of computations made by recursively computing the discrete Fourier
transform of subsets of the data. This is achieved by reordering the data subsets so
as to take advantage of some redundancies in the usual discrete Fourier transform
algorithm. Along similar lines, we could take approximate wavelet transforms by
replacing the function f in the definition of a wavelet coefficient by an estimate such
as

f̃(x) = f(t_l) if x ∈ [t_l, t_{l+1}), l = 1, … , n,
     = 0, otherwise,   (5.128)

built from finitely many samples f(t₁), … , f(t_n). This idea could be used to estimate the scaling function coefficients as well:

d_{j,k} ≈ ∫_R f̃(x) φ_{j,k}(x) dx.   (5.129)

Let us start with a set of high-level scaling function coefficients, and assume that
we have only a finite number of coefficients. This assumption is reasonable, as
any signal f ∈ L²(R) must have rapid decay in both directions, so d_{j,k} can be
neglected for large |k|. Rescale the original function, if necessary, so that the scaling
function coefficients at level m are d_{m,0}, … , d_{m,n−1}. Computing the scaling function and
wavelet coefficients at level m − 1 is accomplished via equations (5.109) and (5.110).
As observed earlier, the {p_k} sequence used in these calculations has only a finite
number of non-zero values if the wavelets are compactly supported; otherwise the p_k
values decay exponentially, so the sequence can be approximated by finitely many terms.
In either case, let K denote the number of non-zero terms used in the (possibly truncated)
sequence. Computing a single coefficient at level m − 1 according to either
(5.109) or (5.110) would take at most K operations. If the scaling function coefficients
d_{m,k} for k ∉ {0, … , n − 1} are set to zero, giving exactly n non-zero coefficients
at level m, then the number of non-zero scaling function coefficients at level m − 1
would be at most

[(n + K)/2],   (5.130)

where [x] denotes the greatest integer function of x. The number of non-zero wavelet
coefficients at level m − 1 is likewise no more than the number in (5.130). Therefore,
the total number of non-zero scaling function coefficients at level m − 1 is approximately n/2, and the total number of operations required to compute the one-level-down
wavelet and scaling function coefficients is approximately 2K · (n/2).
Let n₁ be the number of non-zero scaling function coefficients at level m − 1.
Applying the decomposition again requires no more than K · (n₁/2)

operations to compute the wavelet coefficients and the same number of operations
for the scaling function coefficients. There will be approximately n₂ ≈ n₁/2 ≈ n/4 non-zero scaling function coefficients at level m − 2, and the computation will require
approximately 2K · n₂ ≈ 2K · (n/4) operations.
Continuing this process, the total number of operations required for all the decompositions is approximately

2K (n/2 + n/4 + n/8 + ⋯) = O(n).
If we do a similar computation using the fast Fourier transform, the required
number of operations is O(n log n). Therefore the fast wavelet transform is
faster than the fast Fourier transform.
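
A compact Python sketch of the full recursion with the Haar filters follows, counting multiply-add operations to exhibit the O(n) bound; the random input signal is an assumption.

import numpy as np

ops = 0

def haar_step(d):
    # One decomposition level (Haar, K = 2 filter taps): averages and differences
    global ops
    s = (d[0::2] + d[1::2]) / np.sqrt(2)
    w = (d[0::2] - d[1::2]) / np.sqrt(2)
    ops += 2 * len(d)                        # about 2K * (n/2) operations with K = 2
    return s, w

def fwt(d):
    # Full fast wavelet transform: coarsest scaling coefficient plus all details
    details = []
    while len(d) > 1:
        d, w = haar_step(d)
        details.append(w)
    return d, details

n = 2 ** 16
coarse, details = fwt(np.random.default_rng(0).standard_normal(n))
print(ops)          # about 2K(n/2 + n/4 + ...) = 4n operations, i.e. O(n)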

5.3.4 Performance of Fourier, fractal and wavelet methods in image compression

In many fast-growing industries, image compression plays a very vital role.
Therefore, intensive research is being carried out in this area, and different methods are tested on different classes of images to improve the compression ratio with permissible distortion. The Fourier method (the discrete cosine transform (DCT) or JPEG),
fractal digital image compression (the IFS method, etc.) and the wavelet method (WIC)
are the most popular and compete with each other.
Walker [1997] has compared the performance of the Fourier and wavelet methods
(using Haar and Daubechies transforms) for one-dimensional signals. In this section,
we shall compare the performance of the Fourier, fractal and wavelet methods for some
of the test images from the "BragZone" site of the University of Waterloo, Canada,
available for research purposes. It is mainly based on Siddiqi and Ahmad [1998]. We
shall present here a brief account of the test images, JPEG, multiresolution analysis in
L²(R²) and the fast wavelet transform for image compression, and encoding and decoding
in the fractal method, along with compression results.

Description of the set of test images. Most of our test images come from the
"BragZone" site of the University of Waterloo. This site provides a diverse set of
test images for research purposes. The aim of providing such images is to bring
uniformity to the image processing community as far as test images are concerned.
Images at selected compression ratios for various compression algorithms are
also available on this site.
We explain the details of these images. The size of the image and the number of bits
required to store one pixel are mentioned in brackets.

• Bridge [256×256×8]: This is the smaller version of the classical 512×512 bridge
image. Being originally derived from a 6-bit scanning process, it exhibits low
resolution and contrast, as evident in the regions of foliage.

• Circles [256×256×8]: A set of nested circles with the outside circle darkest and
the innermost circle lightest. This image tests edges at various orientations.
Also, the frame truncates the larger circles, creating two horns that narrow to
a point.

• Crosses [256×256×8]: A pattern of diagonal and horizontal lines. The lines
are narrow, varying from one to three pixels wide. Fine lines are essential to
architectural drawings, flow charts, graphic designs, etc. Yet many coders
have great trouble with this type of content, especially along the diagonals.

• Lena [512×512×8]: The image contains a nice mixture of detail, flat regions,
shading and texture that do a good job of testing various image processing
algorithms.

• Peppers [512×512×8]: The absence of fine detail makes it comparatively easy
to compress.

• Squares [256×256×8]: A series of concentric squares of decreasing width from
the outside in, and of increasing luminance. The plateaus are completely flat,
with grey level values of 50, 100, 150 and 200. This image tests the preservation
of flat regions and step edges. The rectangular boundaries are at even pixel
locations, helping to make this image the easiest to compress on the entire site.

• Washat [512×512×8]: This is a satellite photograph of the Washington DC
area. The contrast is low, but there is much fine detail that must be preserved
in order to facilitate interpretation by earth scientists and intelligence analysts.

• Wood [512×512×8]: The wood image is taken from the ITWM, University of
Kaiserslautern, Germany. Due to the presence of plenty of fine detail in the image,
it is often difficult for coders to compress it.

• Zelda [512×512×8]: From the USC database. Like Lena, the absence of
much fine detail makes it comparatively easy to compress. However, humans
are very sensitive to faces. In some places, such as around the eyes and mouth,
slight mathematical distortions can be perceptually disconcerting.

JPEG: A brief description. The JPEG, Joint Photographic Experts Group, is a
working group of technical experts in image encoding. The goal of JPEG is to develop an international standard for the compression of grey scale or color images. Here
we describe the basic coding machinery of the JPEG baseline system encoder. It
provides a simple and efficient algorithm that is adequate for most image encoding
applications. The image is partitioned into 8×8 blocks, and each block is independently transformed using the 2-dimensional discrete cosine transform (DCT). The
forward 2-D DCT of a block of pixels is defined as (see Wallace [1990] for more
details)

F(u,v) = (1/4) C(u) C(v) Σ_{x=0}^{7} Σ_{y=0}^{7} f(x,y) cos((2x+1)uπ/16) cos((2y+1)vπ/16),

and the inverse 2-D DCT is defined as

f(x,y) = (1/4) Σ_{u=0}^{7} Σ_{v=0}^{7} C(u) C(v) F(u,v) cos((2x+1)uπ/16) cos((2y+1)vπ/16),

where

C(w) = 1/√2 for w = 0,
     = 1 otherwise.

All transformed coefficients are normalized by applying a user-defined normalization array that is fixed for all blocks. The normalized coefficients are then uniformly
quantized by rounding to the nearest integer. Thus, the quantization function is
defined as

F_Q(u,v) = Integer Round (F(u,v) / Q(u,v)),

where Q(u,v) is called the quantization matrix.
The top-left coefficient in the 2-D DCT array is referred to as the DC value and
is proportional to the average brightness of the spatial block. After quantization,
this coefficient is encoded with a lossless differential pulse code modulation (DPCM)
scheme. The remaining 63 coefficients of the 8×8 block are called AC coefficients. The
quantization of the AC coefficients produces many zeros, especially at higher frequencies.
To take advantage of these zeros, the 2-D array of DCT coefficients is formatted
into a 1-D vector using a zig-zag reordering. This rearranges the coefficients in
approximately decreasing order of their average energy, with the aim of creating
a large number of zero values. If the first non-zero AC value is separated from the DC
value by one zero, then it is called a run of length 1.
The DC value and AC coefficients in the 1-D sequence are encoded using separate
Huffman code tables. At the receiver, the sequence is decoded using the Huffman
decoding table and then denormalized. The denormalized block is then inverse
transformed using the inverse 2-D DCT.
Although the JPEG image standard has been widely used, it has some drawbacks.
First, the error caused by the quantization process is isolated within each local block;
at high compression ratios, the block effect becomes obvious. Second, energy
compaction is only achieved within each local block and, as a result, redundancy
between blocks cannot be utilized.
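
For illustration, here is a small Python fragment of the baseline steps just described: the forward 2-D DCT of one 8×8 block followed by uniform quantization. The flat-plus-edge block and the constant quantization matrix are simplifying assumptions; real JPEG uses a standard, perceptually tuned Q table.

import numpy as np

def C(w):
    return 1 / np.sqrt(2) if w == 0 else 1.0

def dct2(block):
    # Forward 2-D DCT of an 8x8 block, directly from the formula above
    F = np.zeros((8, 8))
    x, y = np.meshgrid(np.arange(8), np.arange(8), indexing="ij")
    for u in range(8):
        for v in range(8):
            basis = (np.cos((2 * x + 1) * u * np.pi / 16)
                     * np.cos((2 * y + 1) * v * np.pi / 16))
            F[u, v] = 0.25 * C(u) * C(v) * np.sum(block * basis)
    return F

block = np.full((8, 8), 128.0)
block[4:, :] += 40.0                      # a horizontal step edge
F = dct2(block)
FQ = np.round(F / 16.0)                   # FQ(u,v) = Integer Round(F(u,v)/Q(u,v)), Q constant here
print(int(FQ[0, 0]), int(np.count_nonzero(FQ)))   # one large DC value, only a few non-zero ACs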

Fast wavelet transform for image compression

Wavelets and multiresolution analysis in R². Let f¹(x) and f²(x) be two
functions of one variable. We can form a function of two variables as follows:

f = f¹ ⊗ f² = ⊗_{j=1}^{2} f^j,

where

(⊗_{j=1}^{2} f^j)(x₁, x₂) = Π_{j=1}^{2} f^j(x_j) = f¹(x₁) f²(x₂).

Let X₁ and X₂ be two closed subspaces of L²(R); then we can form a closed subspace
of L²(R²), denoted by ⊗_{j=1}^{2} X_j or X₁ ⊗ X₂ and defined as the closed linear span in
L²(R²) of all functions of the form f¹(x₁) · f²(x₂). The symbol ⊗ is known as the
tensor product. It can be checked that if the systems {f_s^j}_{s∈A_j} are orthonormal bases
in the subspaces X₁, X₂ ⊂ L²(R), then the system

{f_s^1(x₁) f_t^2(x₂)}, s ∈ A₁, t ∈ A₂,

is an orthonormal basis in ⊗_{j=1}^{2} X_j. It can also be verified that

⊗_{j=1}^{2} L²(R) = L²(R²).
From the above results it is quite clear that (i) if ψ¹ and ψ² are two wavelets in
L²(R), then ψ(x₁,x₂) = ψ¹(x₁) ψ²(x₂) is a wavelet in L²(R²); and
(ii) if {V_j} and {Ṽ_j} are two multiresolution analyses in L²(R), then V_j ⊗ Ṽ_j is
a multiresolution analysis in L²(R²).

Mallat transform. We now describe the fast wavelet transform of Mallat [1989]
for two-dimensional signals, such as images.
W_j is defined as the orthogonal complement of V_j in V_{j+1}. Thus we have the
following orthogonal bases:

• a basis for V_j:

(Φ_{m,n}) = (φ_m(x) φ_n(y))_{m,n}, m,n ∈ Z,

• three bases for W_j:

(ψ¹_{m,n}) = (φ_m(x) ψ_n(y))_{m,n}, (ψ²_{m,n}) = (ψ_m(x) φ_n(y))_{m,n},
(ψ³_{m,n}) = (ψ_m(x) ψ_n(y))_{m,n}.

For an image f ∈ L²(R²), its approximation at scale 2^j is given by its orthogonal
projection onto V_j:

f_{V_j}(x,y) = Σ_{m,n∈Z} ⟨f, Φ_{m,n}⟩ Φ_{m,n}(x,y).

This approximation is thus characterized by the sequence S^j_{m,n} = 2^j ⟨f, Φ_{m,n}⟩. The
sequence S_j = (S^j_{m,n})_{m,n∈Z} is called the discrete approximation of f at the resolution
2^j. The additional details from the scale 2^j to 2^{j+1} are given by the orthogonal
projection onto W_j:

f_{W_j}(x,y) = Σ_{m,n∈Z} ⟨f, ψ^d_{m,n}⟩ ψ^d_{m,n}(x,y), d = 1, 2, 3.

This component is thus characterized by the sequences

D^{j,d}_{m,n} = 2^j ⟨f, ψ^d_{m,n}⟩.

These sequences

D^d_j = (D^{j,d}_{m,n})_{m,n∈Z}, d = 1, 2, 3,

are called the details at the resolution 2^{j+1}.
In image compression, we are given a discretized image at some resolution, say 2^j, and the
goal is to decompose it into lower resolutions. According to the Mallat algorithm, S_j
can be computed from S_{j+1} by the action of a low-pass filter h̄(n), followed by a
decimation:

S^j_{m,n} = Σ_{k,l∈Z} S^{j+1}_{k,l} h̄(2m − k, 2n − l).

The bidimensional filter h̄ is the tensor product of the same 1-D low-pass filter h, namely
h̄(m,n) = h(−m) h(−n), with h(n) = 2^{−1/2} ⟨φ(x/2), φ(x − n)⟩.
Similarly, D^d_j, d = 1, 2, 3, can be computed from S_{j+1} by the action of high-pass
filters ḡ_d (ḡ_d is the tensor product of the 1-D high- and low-pass filters g and h), followed
by the same decimation:

D^{j,d}_{m,n} = Σ_{k,l∈Z} S^{j+1}_{k,l} ḡ_d(2m − k, 2n − l),

where ḡ₁(m,n) = h(−m) g(−n), ḡ₂(m,n) = g(−m) h(−n), ḡ₃(m,n) = g(−m) g(−n)
and g(n) = 2^{−1/2} ⟨ψ(x/2), φ(x − n)⟩.
The filters H = {h(n) : n ∈ Z} and G = {g(n) : n ∈ Z} are called quadrature
mirror filters (QMF). H and G correspond to low- and high-pass filters, respectively. The rows of the image are filtered by computing their correlation with the
low- and high-pass filters H and G, followed by 2:1 decimation. The same procedure
is then applied to the columns. Thus we get a four-channel orthogonal subband decomposition using separable QMF. That is, from the discretized image S_j, we get the
four-channel decomposition S_{j−1}, D¹_{j−1}, D²_{j−1} and D³_{j−1}. The same is repeated on
the channel S_{j−1}, and so on. Figures 5.14 and 5.15 represent the general scheme of wavelet
decomposition. The transformed wavelet coefficients are then quantized, followed
by entropy encoding.
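
A one-stage Python sketch of this separable row/column filtering, using the Haar pair as the QMF, follows; the tiny test image is an assumption.

import numpy as np

def analysis(a, axis):
    # Haar low- and high-pass filtering with 2:1 decimation along one axis
    a = np.moveaxis(a, axis, 0)
    lo = (a[0::2] + a[1::2]) / np.sqrt(2)
    hi = (a[0::2] - a[1::2]) / np.sqrt(2)
    return np.moveaxis(lo, 0, axis), np.moveaxis(hi, 0, axis)

def decompose2d(S):
    # One stage S_j -> (S_{j-1}, D1, D2, D3): filter the rows, then the columns
    L, H = analysis(S, axis=1)
    S1, D1 = analysis(L, axis=0)
    D2, D3 = analysis(H, axis=0)
    return S1, D1, D2, D3

S = np.add.outer(np.arange(8.0), np.arange(8.0))   # assumed 8x8 "image"
S1, D1, D2, D3 = decompose2d(S)
print(S1.shape, D1.shape, D2.shape, D3.shape)      # four 4x4 channels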

Encoding and decoding process

Digital fractal image processing. Suppose that we are given an image f that we
wish to encode. This means that we want to find a collection of maps w₁, w₂, … , w_N
with W = ∪_{i=1}^{N} w_i and f = x_W. That is, we want f to be the fixed point of the map
W. We seek a partition of f into pieces to which we apply the transforms w_i and get
back f. Being able to exactly cover f with parts of itself is not likely, so the best
possible cover is sought, with the hope that x_W and f will not look too different,
i.e., that d(f, x_W) is small. The hope for this comes from the Collage theorem. Thus, the
encoding process is to partition I² by a set of ranges B_i. For each B_i, an A_i ⊂ I²
and a map w_i : A_i × I → I³ are sought such that w_i(f) is as close to f ∩ (B_i × I) as possible;
i.e.,

d(f ∩ (B_i × I), w_i(f))
is minimized.

Figure 5.14. One stage in wavelet decomposition.

Example. Let us start with a 512×512 pixel image with 256 grey levels. We
partition the image recursively until the image size is d×d (a typical choice
is 32×32) and call these square blocks B_i range blocks. For each range block of the
image, we find an overlapping domain block A_i, twice the size of the range block,
so that when we apply a suitable transformation w_i to A_i, we get something very
close (in the sense of the root mean square metric) to the part of the image over B_i;
i.e., we seek to minimize the expression

d_rms(f ∩ (B_i × I), w_i(f)).

The d_rms, in the case of two images f and g, is defined as

d_rms(f,g) = (∫∫ (f(x,y) − g(x,y))² dx dy)^{1/2}.

The pixels in a domain block are arranged in groups of four so that the domain
is reduced to the size of the range. Let a₁, a₂, … , a_n and b₁, b₂, … , b_n be the pixel
intensities of the blocks A_i and B_i, respectively. We want to determine s and
o (contrast and brightness adjustments, respectively) such that we minimize the
quantity

R = Σ_{i=1}^{n} (s a_i + o − b_i)².
Figure 5.15. Wavelet decomposition scheme.

After solving, we get

s = (n Σ a_i b_i − (Σ a_i)(Σ b_i)) / (n Σ a_i² − (Σ a_i)²)

and

o = (1/n) (Σ b_i − s Σ a_i),

where all sums run over i = 1, … , n. For more details, we refer to Fisher [1995].
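
In Python, the computation of s and o reads as follows; the two intensity vectors are invented, and the closed forms are the normal-equation solutions displayed above.

import numpy as np

def contrast_brightness(a, b):
    # Least-squares s and o minimizing R = sum_i (s*a_i + o - b_i)^2
    n = len(a)
    denom = n * np.sum(a * a) - np.sum(a) ** 2   # assumed non-zero (a not constant)
    s = (n * np.sum(a * b) - np.sum(a) * np.sum(b)) / denom
    o = (np.sum(b) - s * np.sum(a)) / n
    return s, o

a = np.array([10.0, 20.0, 30.0, 40.0])   # shrunken domain-block intensities (assumed)
b = np.array([22.0, 41.0, 62.0, 81.0])   # range-block intensities (assumed)
s, o = contrast_brightness(a, b)
print(s, o)                              # approximately s = 1.98, o = 2.0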


Computational results and conclusion. The original and compressed images
of Zelda, Holz and Peppers are given in Figures 5.16 to 5.18, compressed at different
compression ratios with the wavelet, fractal and JPEG methods. The distortion measures are
given in Table 1. The Peppers image is compressed at different CRs ranging from 16:1 to
50:1. The plots of CR versus distortion error for the Peppers image are given in Figure
5.19. The CRs and the corresponding errors (mean square error, Sobolev and total
variation error) are shown on the X and Y axes, respectively.
It has long been accepted that distortion measures like the mean square error (MSE)
are inaccurate in predicting distortion (Teo and Heeger [1994]), though MSE is the
most widely used distortion measure in the image processing literature. We have studied
various types of images, ranging from natural to artificial, and classified them into
three classes (A, B and C); see Siddiqi and Ahmad [1998]. The Peppers image
(Figure 5.18) belongs to Class A of our classification, and FIC is the best compression
technique for this class of images (Siddiqi and Ahmad [1998]).
Figure 5.16. Top: Original images of Zelda and Holz8. Bottom: Compressed
images with wavelet, CR 16:1 and 20:1, respectively.

In the present study, in the case of the Peppers image (and in general for the images
of Class A), we observe that up to CR 40:1 all the distortion measures behave alike
(all match human perception), but if we increase the CR above 40:1, only the Sobolev or
TV error matches human perception; MSE fails to measure it. The performance of the
fractal method is better than wavelet and JPEG in the case of the Peppers image for CR
above 40:1, as clearly seen from the plot of the Sobolev or TV error. A similar observation
is made in the case of other images of this class at high compression ratios. Therefore,
we conclude that for particular types of images and for higher CRs, the Sobolev or TV
errors are better measures than MSE. For the rest of the images, Sobolev and TVE are as
good as MSE.
Our computational results for different distortion measures are given in the following table:
Figure 5.17. Top: Compressed images with FIC, CR 16:1 and 20:1, respectively.
Bottom: Compressed images with JPEG, CR 16:1 and 20:1, respectively.

Table 1. Distortion measures of the Peppers, Zelda and Holz images.

Image: Peppers, CR 32:1
Algorithm   MSE       PSNR      TVE       Sobolev
Wavelet     39.7012   32.1428   10.5462   12.2851
Fractal     43.3606   31.7599   10.6643   12.5335
JPEG        62.4071   30.1785   12.4391   14.7356

Image: Zelda, CR 16:1
Algorithm   MSE       PSNR      TVE       Sobolev
Wavelet     8.80429   38.6839   5.54797   6.2916
Fractal     17.5769   35.6814   6.80572   7.99342
JPEG        11.7603   37.4266   6.25757   7.13565

Image: Holz, CR 20:1
Algorithm   MSE       PSNR      TVE       Sobolev
Wavelet     92.021    28.4919   17.8883   20.2981
Fractal     158.822   26.1217   19.8303   23.496
JPEG        98.5858   28.1927   18.194    20.727
Figure 5.18. Top left: Original image of Peppers. Top right: Compressed image
with wavelet, CR 32:1. Bottom left: Compressed image with FIC, CR 32:1. Bottom
right: Compressed image with JPEG, CR 32:1.

5.3.5 Differential equations and wavelets

Wavelets are currently being investigated for the numerical solution of differential and integral equations. We shall consider here a simple example to illustrate the
application of wavelets to solving one. It is clear from the MRA that any function in
L²(R) can be approximated arbitrarily well (in the L²(R) norm) by the piecewise
constant functions from V_j, provided j is large enough; V_j is the space of piecewise
constant L²(R) functions with breaks at the dyadic points k · 2^{−j}, j,k ∈ Z. We
take the following boundary value problem:

−u″(x) + C u(x) = f(x), x ∈ Ω = (0,1),   (5.99)
u(0) = u(1) = 0,
with C > 0 a constant and f ∈ H¹(Ω), and solve it for u = u(x).

We apply the variational method of approximation (the Galerkin method) for solving
equation (5.99). As the existence of the solution of the variational problem is guaranteed
only in a complete space, we take the Sobolev space H¹₀(Ω). The idea is to solve the
variational equation on a finite-dimensional subspace of H¹₀(Ω).

In variational form, the solution u ∈ H¹₀(Ω) of the above equation satisfies

∫_Ω (u′v′ + C u v) dx = ∫_Ω f v dx,   (5.100)

for all v ∈ H¹₀(Ω).


To approximate u by the Galerkin method, we choose a finite-dimensional subspace of H¹₀(Ω), which is a space spanned by wavelets defined on the interval [0,1]. We
have already discussed the wavelet bases φ_{j,k}, ψ_{j,k} and the spaces V_j, W_j generated by
them, respectively, in previous sections. To get a numerical solution of (5.99), we
choose a positive value m and approximate u by an element u_m ∈ V_m that satisfies

∫_Ω (u′_m v′ + C u_m v) dx = ∫_Ω f(x) v dx, v ∈ V_m,   (5.101)

where u_m and v can be written as

u_m = P₀ u_m + Σ_{k=0}^{m−1} Q_k u_m.   (5.102)

Here P_k is the projection from H¹₀(Ω) onto V_k and Q_{k−1} = P_k − P_{k−1}. Therefore,

u_m = c⁰₀ φ_{0,0} + Σ_{j=0}^{m−1} Σ_{k=0}^{2^j−1} d_k^j ψ_{j,k}.   (5.103)

Here φ_{0,0} is identically one on Ω, and c⁰₀ is the average of u_m on Ω. We have used
a "multiresolution" approach; i.e., we have written successively coarser and coarser
approximations to u_m.
Therefore, for all practical purposes, u_m can be approximated to an arbitrary
precision by a linear combination of wavelets. We now need to determine the d_k^j.
Equation (5.103), together with (5.101), gives the following:

Σ_{j=0}^{m−1} Σ_{k=0}^{2^j−1} d_k^j ∫_Ω (ψ′_{j,k} ψ′_{j′,k′} + C ψ_{j,k} ψ_{j′,k′}) dx = ∫_Ω f ψ_{j′,k′} dx,   (5.104)

for k′ = 0, … , 2^{j′} − 1 and j′ = 0, … , m − 1, or, more compactly,

L d = f,

where L is the stiffness matrix and d is the vector of the coefficients d_k^j.
5.3. WAVELETS WITH APPLICATIONS 251

Figure 5.19. Different distortion measures (Sobolev, TVE and MSE, for the wavelet,
fractal and JPEG methods) plotted against the compression ratio for the Peppers image.


Let us work out the details for the boundary value problem considered in Chapter 3:

−u″ = f on (0,1), u(0) = u(1) = 0.

The standard weak formulation is

(u′, v′) = (f, v), v ∈ H¹₀([0,1]).   (5.105)

Let H_h be a finite-dimensional subspace of H¹₀([0,1]). The simplest conforming choice
of the trial space H_h is the span of the scaled test functions

φ_{j,k}(x) = 2^{j/2} φ(2^j x − k),   (5.106)

where

φ(x) = 1 + x, −1 ≤ x < 0,
     = 1 − x, 0 ≤ x < 1,   (5.107)
     = 0, otherwise.
Choosing the φ_{j,k} as basis functions for H_h, the Galerkin condition

(u′_h, v′) = (f, v), v ∈ H_h,   (5.108)

gives rise to a linear system of equations

A_h u = F,   (5.109)

where A_h is the stiffness matrix relative to the basis functions φ_{j,k}, and u, F are the
corresponding vectors with F_k = (f, φ_{j,k}). Clearly, A_h is tridiagonal. Hence (5.109)
is very efficiently solvable. However, for higher-dimensional analogues, the matrix
would no longer have such a narrow bandwidth, and one has to resort to iterative
methods to preserve sparseness.
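
A short Python sketch of this one-dimensional case follows (unnormalized hat functions and f ≡ 1 are our simplifying assumptions; the normalization 2^{j/2} only rescales the system): the assembled A_h is tridiagonal, and the Galerkin solution is nodally exact for this problem.

import numpy as np

j = 5
n = 2 ** j - 1                  # number of interior nodes
h = 2.0 ** (-j)

# Stiffness matrix for -u'' with hat functions: (1/h) * tridiag(-1, 2, -1)
A = (np.diag(2.0 * np.ones(n))
     + np.diag(-np.ones(n - 1), 1)
     + np.diag(-np.ones(n - 1), -1)) / h
F = np.full(n, h)               # load vector: (f, phi_k) = h for f identically 1

u = np.linalg.solve(A, F)       # nodal values of the Galerkin solution
x = np.arange(1, n + 1) * h
print(np.max(np.abs(u - x * (1 - x) / 2)))   # exact solution u(x) = x(1-x)/2; error ~ 1e-15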
On the other hand, recalling the min-max characterization of the smallest and
largest eigenvalues of a symmetric positive definite matrix, one can see that the
condition numbers of A_h grow like 2^{2j}, which renders iterative methods prohibitively
inefficient. To remedy this, one has to precondition the linear systems. One way is
to exploit suitable multiscale decompositions of the trial spaces H_h. First, note that
since

φ(x) = (1/2) φ(2x + 1) + φ(2x) + (1/2) φ(2x − 1),   (5.110)

that is,

φ_{j,k} = (1/(2√2)) φ_{j+1,2k−1} + (1/√2) φ_{j+1,2k} + (1/(2√2)) φ_{j+1,2k+1},   (5.111)

the spaces H_h are nested and, of course, their union is dense in L²([0,1]).
In order to successively update solutions from coarser grids, one can consider the
following hierarchical decomposition of the trial spaces. We use here the Lagrange
projectors

L_j f = Σ_{k=0}^{2^j} 2^{−j/2} f(2^{−j} k) φ_{j,k},   (5.112)
and note that the complements

W_j = range(L_{j+1} − L_j)   (5.113)

are simply spanned by the tent functions at the new grid points on the next higher scale,

ψ_{j,k} = φ_{j+1,2k+1}, k = 0, … , 2^j − 1.   (5.114)

Note that neither the φ_{j,k} nor the ψ_{j,k} are orthogonal, but it is not hard to show that
they satisfy the stability condition

c₁ Σ_k |c_k|² ≤ ‖Σ_k c_k ψ_{j,k}‖² ≤ c₂ Σ_k |c_k|²   (5.115)

for some constants c₁, c₂ independent of the sequence {c_k}. Keeping this in mind,
we now consider stiffness matrices relative to the hierarchical bases composed of the
collections W_j, and note that

(d/dx) ψ_{j,k}(x) = (d/dx) φ_{j+1,2k+1}(x) = 2^{j+3/2} ψ^H_{j,k}(x),   (5.116)

where the ψ^H_{j,k}(x) are the Haar wavelets. Therefore, we get

⟨(d/dx) ψ_{j,k}, (d/dx) ψ_{n,l}⟩ = 2^{n+j+3} ⟨ψ^H_{j,k}, ψ^H_{n,l}⟩ = 2^{2j+3} δ_{j,n} δ_{k,l}.

Hence A_h is, up to a 2×2 upper left block stemming from the coarse grid space H₀,
a diagonal matrix, which is trivially preconditioned by symmetric diagonal scalings.
Now, one has to be somewhat careful when extrapolating from this observation.
The fact that the hierarchical basis functions ψ_{j,k} are actually orthogonal with respect
to the energy inner product is an artifact of one dimension. In two dimensions, this is no longer the case,
but it turns out that the hierarchical stiffness matrices can still be preconditioned
by diagonal scaling so as to efficiently reduce the growth of the condition numbers to
logarithmic behaviour. Moreover, this has suggested similar strategies involving other
multiscale bases which do even better. For interested readers, we refer to Canuto and
Cravero [1997] and Dahmen [1997].

5.4 Fractal image compression

5.4.1 Introduction
Image compression techniques and methods are of vital importance in high
technology, especially in information technology. These techniques and methods are related to communicating and storing images and data (information) in
the shortest possible time (in minimum space on a hard disk or CD-ROM) and retrieving them with permissible distortion. Tools and techniques of Fourier analysis,
like the Fourier transform, convolution, the Shannon sampling theorem and Walsh-Fourier
methods, have been successfully used for a long time. Matrix transform, optimization
and variational techniques were also employed to study image and data compression.
Wavelet theory entered this area in the late eighties and started replacing the DCT
(JPEG), which had dominated the scene until that time, in view of its high
compression ratio and the quality of the information after retrieval.
The concept of a fractal and the discipline of fractal geometry were introduced by
Benoît Mandelbrot in the early eighties to study irregular shapes like coastlines,
mountains, clouds or rainfall. Fractals are complicated looking sets, like the Cantor set,
Sierpinski gasket, Sierpinski carpet, von Koch curve and Julia sets, but they arise out of
simple algorithms. By now it is a well-established discipline, and a comprehensive
and updated bibliography can be found in Barnsley [1988], Barnsley and Hurd [1993],
Fisher [1995], Saupe [1996], Lu [1997] and the references therein.
Barnsley [1988, 1996] established a close connection between functional analysis,
fractals and multimedia by demonstrating that fractals can be defined in terms of
fixed points of mappings of an appropriate metric space into itself, and that image
compression can be studied through this methodology, achieving marvelous results.
He has already commercialized his achievements in the form of the top-selling multimedia encyclopedia Encarta, published by the Microsoft Corporation, including,
on one CD-ROM, seven thousand colour photographs which may be viewed interactively on a computer screen. There are diverse images, like those of buildings,
musical instruments, people's faces, baseball bats and ferns. This development is known
as the IFS theory (Iterated Function System theory) for image compression. Stewart
[1995] has explained in simple language the far-reaching consequences of this theory,
which will drastically reduce the expense of communication through fax machines
manufactured using this technique and methodology. It has been established that
for a fairly large class of images, the IFS theory provides a better compression ratio
and quality of images after retrieval (see, for example, Fisher, Lu and recent papers
on web resources) compared to the most popular method until now, the DCT (JPEG).
It is also expected that a combination of fractal and wavelet techniques may
yield still better results. This is a fast-growing field in which, besides mathematicians,
computer and information scientists, physicists, chemists and engineers are actively
involved.

5.4.2 IFS theory

Let (X,d) be a complete metric space and let T : X → X be a mapping of
X into itself. The iterates of T are the mappings T^{∘n} : X → X defined by T^{∘0}(x) =
x, T^{∘1}(x) = T(T^{∘0}(x)) = T(x), T^{∘2}(x) = T(T^{∘1}(x)) or T^{∘2} = T ∘ T^{∘1}, … , T^{∘n}(x) =
T(T^{∘(n−1)}(x)) or T^{∘n} = T ∘ T^{∘(n−1)}. A mapping F on the set of real numbers R into
itself is called an affine transformation if it is of the form F(x) = ax + b for all
x ∈ R, where a and b are constants. A mapping G of R² into itself is called an
affine transformation if it can be written as

G(x,y) = (ax + by + e, cx + dy + f),

where a, b, c, d, e and f are real numbers. G can also be written as

G((x,y)) = A (x, y)^T + B,

where A is the 2×2 matrix with rows (a, b) and (c, d), and B is the column vector (e, f)^T.

The mapping T is called a Lipschitz mapping if there exists a constant α ≥ 0 such
that d(T(x), T(y)) ≤ α d(x,y) for all x, y ∈ X. T is called a contraction mapping
if 0 ≤ α < 1; α is called the contractivity factor of T. A Lipschitz continuous
function T is called eventually contractive if there is a number n such that T^{∘n}
is a contraction map. Let Y be a non-empty subset of a compact metric space
(X,d); then a mapping S of Y into X is called a local (partitioned) contraction
mapping on (X,d) if there is a number s, 0 ≤ s < 1, such that d(S(x), S(y)) ≤
s d(x,y) for all x, y ∈ Y. A complete metric space (X,d) equipped with n contraction
mappings w_i : X → X, i = 1,2, … , n, denoted by {X, d, w_i, i = 1,2, … , n}, is called
an iterated function system (IFS). A complete metric space (X,d) equipped
with n eventually contractive mappings w_i : X → X, i = 1,2, … , n, denoted by
{X, d, w_i, i = 1,2, … , n}, is called an eventually iterated function system. A
local (partitioned) iterated function system (LIFS) is a compact metric space (X,d)
equipped with n local contractive mappings w_i : Y → X, Y ⊆ X. A recurrent
iterated function system (RIFS) is a collection w₁, w₂, … , w_n of n Lipschitz maps
on a complete metric space X together with an n×n matrix (a_{ij}) satisfying Σ_j a_{ij} = 1 for all i.
The iterated function system (IFS) theorem states that:
For an IFS {X, d, w_i} with contractivity factors α_i, i = 1,2, … , n, the mapping
W defined on H(X), the space of all compact subsets of X, into itself by W(B) =
∪_{i=1}^{n} w_i(B) is a contraction mapping on the complete metric space (H(X), h(·,·)),
where

h(A,B) := max{d(A,B), d(B,A)},
d(A,B) := max{d(x,B) | x ∈ A},
d(x,B) := min{d(x,y) | y ∈ B},

with the contractivity factor α = max_{1≤i≤n} α_i; that is,

h(W(B), W(C)) ≤ α h(B,C)

for all B, C ∈ H(X).
Consequently, W has a unique fixed point, say A ∈ H(X), which satisfies the
relation A = W(A) = ∪_{i=1}^{n} w_i(A) and is given by

A = lim_{n→∞} W^{∘n}(B) for any B ∈ H(X).
This fixed point A is called the attractor or deterministic fractal or, simply, fractal;
h(·,·) is known as the Hausdorff metric.
Let {R², d, w_i, i = 1,2, … , n} be an IFS where the w_i's are given by

w_i(x,y) = (a_i x + b_i y + e_i, c_i x + d_i y + f_i), i = 1,2, … , n.

Then the following table is known as the IFS code:

Table 5.2. IFS code.
w     a    b    c    d    e    f    p
w₁    a₁   b₁   c₁   d₁   e₁   f₁   p₁
w₂    a₂   b₂   c₂   d₂   e₂   f₂   p₂
⋮
w_n   a_n  b_n  c_n  d_n  e_n  f_n  p_n

where

p_i ≈ |a_i d_i − b_i c_i| / Σ_{j=1}^{n} |a_j d_j − b_j c_j|

for i = 1,2, … , n; the symbol ≈ means "approximately equal to". The numbers p_i
can be interpreted as probabilities for finding the attractor of an IFS using the Chaos
Game Algorithm. An image can be treated as a closed bounded (compact) subset
of R². The following result, known as the Collage Theorem, is very important for
designing an IFS whose attractor, or fractal, is close to a given image. Let (X,d) be
a complete metric space and S be a given image, that is, S ∈ H(X), and let ε ≥ 0
also be given. Choose an IFS {X, d, w_i}, i = 1,2, … , n, with contractivity factors
α_i, i = 1,2, … , n, such that

h(S, W(S)) ≤ ε.
Then

h(S, A) ≤ (1/(1 − s)) h(S, W(S)) ≤ ε/(1 − s),

where

s = max_{1≤i≤n} α_i

and A is the attractor of the IFS. This theorem precisely tells us that if we can find
an IFS code so that the Hausdorff distance between S and W(S) is very small, the
attractor of W will be very close to the target image S. There are algorithms for
finding the attractor of a given IFS, like the Chaos Game Algorithm and the Photocopy
Machine Algorithm (see Barnsley and Hurd [1993] and Lu [1993]); a sketch of the former
is given below.
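
The following is a Python sketch of the Chaos Game Algorithm for a three-map IFS; the particular maps (those of the Sierpinski gasket) and the equal probabilities p_i are assumptions chosen for illustration.

import random

# Three affine contractions w_i(x, y) = (x/2 + e_i, y/2 + f_i), contractivity factor 1/2
shifts = [(0.0, 0.0), (0.5, 0.0), (0.25, 0.5)]
probs = [1 / 3, 1 / 3, 1 / 3]              # the p_i of the IFS code

random.seed(0)
x, y = 0.3, 0.3                            # arbitrary starting point
points = []
for step in range(20000):
    e, f = random.choices(shifts, weights=probs)[0]
    x, y = x / 2 + e, y / 2 + f            # apply a randomly chosen w_i
    if step > 100:                         # after a transient, the orbit fills the attractor
        points.append((x, y))

print(len(points), points[-1])             # the points accumulate on the Sierpinski gasket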
The basic technique of image compression through an IFS is to find appropriate
affine contraction mappings w₁, w₂, … , w_n such that the condition of the Collage
Theorem is satisfied, namely, that S is very close to W(S); then, instead of communicating/storing the image, we can communicate/store the fractal or attractor of the IFS,
that is, the coefficients in the IFS code. For details of this method, we refer to Barnsley
[1988], Barnsley and Hurd [1993], Lu [1993], Jacquin [1993], Fisher [1995], and Ning
Lu [1997].

3. Inverse problem for images.

The problem of representing a given image (or a function) by IFSs or their
variations is a typical inverse problem. It involves finding the IFS parameters of an
image that is exactly generated via an IFS. In recent years, the iterated function
system with probabilities (IFSP), iterated fuzzy set systems (IFZS) and IFSs
with grey level maps (IFSM) have been introduced and the corresponding inverse
problems have been investigated. Such an inverse problem is related to the problem
of finding the image/function as the fixed point of a given iteration algorithm
of the types IFS, IFSP and IFSM on function spaces like L_p, H^{m,p} (the Sobolev space
of order m), the weighted Sobolev spaces and the Besov spaces. In purely abstract
mathematical terms, it comprises the following steps:
(i) Finding a suitable metric space X in which to represent the image (function).
(ii) Finding an appropriate metric d(·,·) on X.
(iii) Finding an appropriate contraction map T of X into itself.
The fact that such problems have more than one solution motivated the search for
different kinds of optimality. This problem has been studied in recent years by Forte
and Vrscay (see, for example, Siddiqi, Ahmad and Mukheimer [1997], Forte and
Vrscay [1995], and Manchanda, Mukheimer and Siddiqi [1998]).
We present here a summary of these results. Let F*(X) denote the set of all
functions u : X → [0,1] which are (i) upper semi-continuous on (X,d) and (ii) normalized, that is, for each u ∈ F*(X) there exists an x₀ ∈ X such that u(x₀) = 1. For
u ∈ F*(X) and for each α ∈ [0,1], the α-level sets of u are defined by

[u]^α = {x ∈ X | u(x) ≥ α}, α ∈ (0,1],
[u]⁰ = {x ∈ X | u(x) > 0}.

It is clear that [u]^α ∈ H(X). For each u, v ∈ F*(X), we can define the metric

d_∞(u,v) = sup_{0≤α≤1} h([u]^α, [v]^α).

The system (F*(X), w_i, Φ), where the w_i : X → X are contraction maps, Φ = {φ₁, φ₂, … , φ_n},
and the φ_i, i = 1,2, … , n, are mappings of [0,1] into itself (φ_i : [0,1] → [0,1]), each of
which is (a) non-decreasing and (b) right continuous, with (c) φ_i(0) = 0 for all i, and (d) for
at least one i, φ_i(1) = 1, is called an iterated fuzzy set system (IFZS).
Let the contraction mappings w₁, w₂, … , w_n be associated with probabilities
p₁, p₂, … , p_n with Σ_{i=1}^{n} p_i = 1. Furthermore, let B(X) denote the σ-algebra of
Borel subsets of X and M(X) denote the set of all probability measures on B(X).
The system (M(X), w_i, T), where T is defined by the relation

(Tν)(S) = (Mν)(S) = Σ_{i=1}^{n} p_i ν(w_i^{−1}(S))

for a ν ∈ M(X) and each S ∈ B(X), is called an iterated function system with
probabilities (IFSP). M(X) is a metric space with respect to the metric

d_H(μ,ν) = sup_{f∈Lip₁(X)} |∫_X f dμ − ∫_X f dν|,

where

Lip₁(X) = {f : X → R | |f(x₁) − f(x₂)| ≤ d(x₁,x₂) ∀ x₁, x₂ ∈ X}.

The T defined above is called the Markov operator on (M(X), d_H(·,·)). For a measure μ
on B(X) and for any integer p ≥ 1, let L_p(X,μ) denote the vector space of all real-valued
functions u such that u^p is integrable on (B(X), μ). L_p(X,μ) is a complete
metric space with respect to the metric induced by the L_p norm, that is,

d(u,v) = ‖u − v‖_p = (∫_X |u(x) − v(x)|^p dμ(x))^{1/p}.

The system (L_p(X,μ), w_i, Φ), where Φ = {φ₁, φ₂, … , φ_n} with φ_i : R → R, known as the
grey level maps, is called an iterated function system with grey level maps (IFSM).

An operator T can be defined for an IFSM as

(Tu)(x) = Σ′_{i=1}^{n} φ_i(u(w_i^{−1}(x))).

The prime (′) signifies that the sum operates on all those terms for which w_i^{−1}(x)
is defined. If w_i^{−1}(x) = ∅ for all i = 1,2, … , n, then (Tu)(x) = 0. For X ⊂ R^n, let
m^{(n)} ∈ M(X) denote the Lebesgue measure on B(X). The indicator function of a
subset A of X, denoted by I_A(x), is defined by

I_A(x) = 1, x ∈ A,
       = 0, otherwise.

Let

Lip(R) = {φ : R → R | |φ(t₁) − φ(t₂)| ≤ β|t₁ − t₂| ∀ t₁, t₂ ∈ R and for some β ∈ [0,∞)}.

It can be verified that for any u ∈ L_p(X,μ), 1 ≤ p < ∞, and φ_i ∈ Lip(R), 1 ≤ i ≤ n, T
is a mapping of L_p(X,μ) into itself. In fact, T
becomes a contraction mapping under certain assumptions, and hence has a unique
fixed point, as L_p(X,μ) is a complete metric space.
An affine IFSM on L_p(X,μ) is an IFSM whose φ_i are given by φ_i(t) = α_i t + β_i,
t ∈ R, i = 1,2, … , n. Let X = [0,1] and μ = m^{(1)}, with w_i(x) = s_i x + a_i
and α_i = |s_i| < 1, 1 ≤ i ≤ n. If T is contractive with fixed point ū, then

ū(x) = Σ_{i=1}^{n} α_i ū(w_i^{−1}(x)) + Σ_{i=1}^{n} β_i I_{w_i(X)}(x).

This means that ū may be expressed as a linear combination of both piecewise
constant functions χ_i(x) as well as functions ψ_k(x) which are obtained by dilations
and translations of ū(x) and I_X(x) = 1, respectively. This reminds us of the role of
scaling functions in the wavelet theory.
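
Here is a small Python iteration of such an affine IFSM operator on X = [0,1], discretized on a grid; the maps w_i, the grey level maps φ_i and the grid size are all assumptions.

import numpy as np

# Two maps w_i(x) = x/2 + a_i that tile [0,1], with affine grey level maps phi_i(t) = alpha_i t + beta_i
a = [0.0, 0.5]
alpha = [0.4, 0.4]                           # |alpha_i| < 1 keeps T contractive here
beta = [0.1, 0.5]

x = np.linspace(0.0, 1.0, 1024, endpoint=False)

def T(u):
    # (Tu)(x) = sum' phi_i(u(w_i^{-1}(x))), the sum over maps with w_i^{-1}(x) defined
    Tu = np.zeros_like(u)
    for i in range(2):
        mask = (x >= a[i]) & (x < a[i] + 0.5)
        pre = 2.0 * (x[mask] - a[i])         # w_i^{-1}(x) = 2 (x - a_i)
        idx = np.minimum((pre * x.size).astype(int), x.size - 1)
        Tu[mask] = alpha[i] * u[idx] + beta[i]
    return Tu

u = np.zeros_like(x)
for _ in range(30):                          # iterate toward the fixed point u-bar
    u = T(u)
print(u[0], u[-1])                           # approximate values of the fixed point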
The Collage Theorem mentioned earlier can be rephrased as follows:
Let (X,d) be a complete metric space, and suppose that, for a given x ∈ X, there exists a
contraction map W : X → X with contractivity factor α such that d(x, W(x)) < ε.
Then

d(x, x̄) < ε/(1 − α),

where x̄ is the fixed point of W (W(x̄) = x̄).
In view of this result, the inverse problem for the approximation of functions in
L_p(X,μ) by an IFSM may be stated as follows: Given a target function v ∈ L_p(X,μ)
and a δ > 0, find an IFSM (L_p(X,μ), w_i, φ_i) with associated operator T such
that ‖v − Tv‖_p ≤ δ.
For μ ∈ M(X), a family A of subsets A = {A_i} of X is called μ-dense in a family
N of subsets B of X if, for every ε > 0 and any B ∈ N, there exists a collection A ∈ A
such that A ⊆ B and μ(B \ A) < ε. Let {w_i} be an infinite sequence of contraction
maps of X into itself. We say that {w_i} generates a "μ-dense and non-overlapping"
(abbreviated as "μ-d-n") family A of subsets of X if, for every ε > 0 and every
B ⊆ X, there exists a finite set of integers i_k ≥ 1, 1 ≤ k ≤ n, such that

(i) A = ∪_{k=1}^{n} w_{i_k}(X) ⊆ B,
(ii) μ(B \ A) < ε, and
(iii) μ(w_{i_k}(X) ∩ w_{i_l}(X)) = 0 if k ≠ l.

If {w_i} satisfies the above conditions on (X,d), then inf_{1≤i<∞} {α_i} = 0, where the α_i's
are the contractivity factors of the w_i's, independent of μ. If X = [0,1] and μ is the
Lebesgue measure, then the wavelet type functions

w_{ij}(x) = 2^{−i}(x + j − 1), i = 1,2, … , j = 1,2, … , 2^i,

can form a "μ-d-n" family.


For each i* ≥ 1, the set of maps {w_{i*,j}, j = 1,2, … , 2^{i*}} provides a set of
contractions of [0,1], each with contractivity factor 2^{−i*}, which tile [0,1]. In 1995, Forte
and Vrscay obtained the following result, which provides the solution of the inverse problem:

Theorem A [Forte-Vrscay, 1995]. Let v ∈ L_p(X,μ), 1 ≤ p < ∞; then

lim inf_{n→∞} ‖v − T_n v‖_p = 0,

provided the sequence of contraction maps w_i generates a "μ-d-n" family A of
subsets of X and the w_i's are also one-to-one, where

(T_n v)(x) = Σ′_{i=1}^{n} φ_i(v(w_i^{−1}(x))).

This theorem has been studied for local IFSM and for special cases like p = 2 and
affine maps φ_i. Forte and Vrscay have also carried out an approximation of the
target image "Lena", a 512×512-pixel grey scale image, with each pixel having
256 possible values (8 bits, with values from 0 to 255, which are rescaled to values
in [0,1]). This type of approximation has been studied by Siddiqi et al. [1997]
for the singer and bride images. The correspondence between fractal-wavelet transforms and
iterated function systems with grey-level maps has been systematically studied by
Mendivil and Vrscay [1997]. A wavelet-based solution to the inverse problem for
fractal interpolation functions has been investigated by Berkner [1997]. Manchanda,
Mukheimer and Siddiqi [1998] have extended Theorem A to the Besov space.

4. Comparison of the JPEG, wavelet (EPIC) and IFS techniques for image compression.

A comparison of performance, in terms of compression ratio (CR) and visual
appearance (PSNR), of the IFS (fractal), wavelet (EPIC) and JPEG techniques for some
wood blocks and a singer and bride of the Indian sub-continent is given in the following
table:
Table 5.3. CR and PSNR.

                     Compression techniques
          Fractal            EPIC               JPEG
Image     CR        PSNR     CR        PSNR     CR     PSNR
Wood 1    4.36:1    31.59    4.83:1    34.33    5:1    28.31
Wood 1    18.78:1   27.52    18.41:1   28.73    20:1   25.83
Wood 2    4.69:1    29.15    4.78:1    33.98    5:1    26.95
Wood 2    18.74:1   25.03    17.97:1   27.12    20:1   23.77
Singer    4.36:1    33.27    5.10:1    36.99    5:1    30.49
Singer    18.81:1   27.30    18.22:1   29.56    20:1   25.62
Bride     4.36:1    32.33    5.16:1    40.35    5:1    30.64
Bride     18.79:1   27.02    18.68:1   30.65    20:1   26.06
The wavelet technique is slightly better in terms of the visual quality of images after compression, followed by the fractal and JPEG methods. The following references also provide
interesting information about wavelets and their applications to image processing:
Bertoluzza [1992], Daubechies et al. [1992], Daubechies [1992], DeVore et al. [1992],
Kaiser [1994], Kelly, Kon and Raphael [1994], Mallat [1989, 1996], Strang [1989,
1993], Sweldens [1994] and Walter [1992, 1995].

5.5 Problems
Problem 5.1. Consider a linear position invariant image degradation system with
impulse response h(x − α, y − β) = e^{−[(x−α)² + (y−β)²]}. Suppose that the input to
the system is an image consisting of a line of infinitesimal width located at x = a,
modelled by f(x,y) = δ(x − a). Assuming no noise, what is the output image
g(x,y)?

Problem 5.2. (i) Show that the discrete Fourier transform and its inverse are
periodic functions; (ii) obtain the Fourier transforms of (a) df(x)/dx, and
(b) ∂f(x,y)/∂x + ∂f(x,y)/∂y.

Problem 5.3. Discuss the variational formulation of the Perona and Malik model.

Problem 5.4. Find solutions of the dilation equation

φ(x) = Σ_n c_n φ(2x − n), Σ_n c_n = 2,

for the following values of c_n:

(i) c₀ = 2 and c_n = 0 for n ≠ 0, and

(ii) c₀ = c₁ = 1 and c_n = 0 for n ≠ 0, 1.

Problem 5.5. Show that $\phi(x) = \operatorname{sinc}(\pi x)$ is a scaling function.

Problem 5.6. Let
$$C_\psi = \int_{\mathbb{R}} \frac{|\hat\psi(\omega)|}{|\omega|}\, d\omega < \infty
\qquad \text{and} \qquad
\psi(x) = \frac{d\phi}{dx};$$
then show that the wavelet transform of $\phi$ is given by

Problem 5.7. State and prove Parseval's identity for wavelet transforms.

Problem 5.8. Establish a relationship between wavelet transforms and wavelet Fourier coefficients.

Problem 5.9. Write a short note introducing the salient features of wavelets.

Problem 5.10. Explain the advantages of wavelet techniques in scientific computing.

Problem 5.11. Discuss the relationship between the fractal wavelet transform and
the iterated function systems with grey-level maps.

Problem 5.12. How can one apply the scaling properties of continuous wavelet transforms to find a solution to the inverse problem for fractal interpolation functions?

Problem 5.13. Let $\varphi$ be a given $k$-times, $k \ge 1$, differentiable function such that $\varphi^{(k)} \in L^2(\mathbb{R})$ and $\varphi \ne 0$. Then $\psi(t) = \varphi^{(k)}(t)$ is a wavelet.

Problem 5.14. The set of wavelets $\Psi = \{\psi \in L^2(\mathbb{R}) : \psi \text{ satisfies } (5.72)\}$ is a dense subset of $L^2(\mathbb{R})$.

Problem 5.15. Let $H(t)$ be the Haar function. Show that $\{H_{j,k}(t)\} = \{2^{j/2} H(2^j t - k)\}_{j,k \in \mathbb{Z}}$ is an orthonormal basis in $L^2(\mathbb{R})$.

Problem 5.16. Show that there is a multiresolution analysis corresponding to the Haar wavelet.

Solution of Problem 5.13. We know that $|\hat\psi(\omega)| = |\omega|^k |\hat\varphi(\omega)|$ by the properties of the Fourier transform. We need to show that
$$C_\psi = 2\pi \int_{\mathbb{R}} \frac{|\hat\psi(\omega)|^2}{|\omega|}\, d\omega < \infty.$$
Indeed,
$$C_\psi = 2\pi \int_{\mathbb{R}} \frac{|\omega|^{2k}\,|\hat\varphi(\omega)|^2}{|\omega|}\, d\omega
= 2\pi \int_{-1}^{1} |\omega|^{2k-1} |\hat\varphi(\omega)|^2\, d\omega
+ 2\pi \int_{|\omega|>1} \frac{|\omega|^{2k}\,|\hat\varphi(\omega)|^2}{|\omega|}\, d\omega
\le 2\pi \left( \|\varphi\|_{L^2}^2 + \|\varphi^{(k)}\|_{L^2}^2 \right) < \infty.$$
Solution of Problem 5.14. Let $f \in L^2(\mathbb{R})$; then $\hat f \in L^2(\mathbb{R})$. Let us define $f_\epsilon$ by
$$\hat f_\epsilon(\omega) = \begin{cases} \hat f(\omega), & |\omega| \ge \epsilon, \\ 0, & |\omega| < \epsilon. \end{cases}$$
Then, for every $\epsilon$, $f_\epsilon$ is a wavelet as it satisfies the admissibility condition (5.72). Since $\|f\|_{L^2} = \|\hat f\|_{L^2}$, we have
$$\|f - f_\epsilon\|_{L^2}^2 = \int_{|\omega| < \epsilon} |\hat f(\omega)|^2\, d\omega \to 0 \quad \text{as } \epsilon \to 0.$$
Thus every function $f$ in $L^2(\mathbb{R})$ can be considered as a limit of a sequence of wavelets.

Solution of Problem 5.15. We need to show that
$$(H_{j,k}, H_{m,n}) = \begin{cases} 0, & \text{if } j \ne m \text{ or } k \ne n, \\ 1, & \text{if } j = m \text{ and } k = n. \end{cases}$$
Since
$$(H_{j,k}, H_{m,n}) = \int_{-\infty}^{\infty} 2^{s/2} H(t) H(2^s t - r)\, dt,$$
where $s = m - j$ and $r = 2^{m-j}k - n$, we have $(H_{j,k}, H_{m,n}) = 1$ for $j = m$ and $k = n$. If $j = m$ but $k \ne n$, then $r \ne 0$, so $\operatorname{supp} H(2^s t - r) \cap \operatorname{supp} H(t) = \emptyset$, where $\operatorname{supp} H_{j,k} = [k 2^{-j}, (k+1) 2^{-j}]$, and the integral is 0. If $j \ne m$, then either the supports are disjoint, so the integral is again 0, or the support of one of the two functions is contained in an interval on which the other is constant; but in this case the integral is also 0 because $\int_{-\infty}^{\infty} H(t)\, dt = 0$.

In order to show that $\{2^{j/2} H(2^j t - k)\}_{j \in \mathbb{Z}, k \in \mathbb{Z}}$ is an orthonormal basis in $L^2(\mathbb{R})$, it is sufficient to show that
$$S_n = \overline{\operatorname{span}}\{H_{j,k}\}_{j < n,\, k \in \mathbb{Z}} = L_n = \{f \in L^2(\mathbb{R}) : f \text{ is constant on } [k 2^{-n}, (k+1) 2^{-n}] \text{ for } k \in \mathbb{Z}\}.$$
$\{S_n\}$ and $\{L_n\}$ have the following properties:

(a) $\{S_n\}$ is a family of closed subspaces of $L^2(\mathbb{R})$ such that
(i) $\cdots \subset S_{-2} \subset S_{-1} \subset S_0 \subset S_1 \subset \cdots$;
(ii) $f(t) \in S_n$ if and only if $f(2t) \in S_{n+1}$;
(iii) $f(t) \in S_0$ if and only if $f(t+k) \in S_0$ for $k \in \mathbb{Z}$.

(b) $\{L_n\}$ is also a family of closed subspaces of $L^2(\mathbb{R})$ satisfying (i) to (iii).

(c) For all $i \in \mathbb{Z}$ we have $L_i = S_i$.

In view of these results, $\{H_{j,k}\}$ is an orthonormal basis of $L^2(\mathbb{R})$. We verify (c); the other parts can be easily checked.

In view of (ii), it is sufficient to show that $L_0 = S_0$. Since each $H_{j,k}$ for $j < 0$ is constant on any interval $[r, r+1]$, we see that $S_0 \subset L_0$. On the other hand, each function of $L_0$ can be written as $\sum_{r \in \mathbb{Z}} a_r \chi_{[r, r+1]}$, so by (iii) it suffices to show that $\chi_{[0,1]} \in S_0$. To show this, let us consider the series
$$\sum_{j<0} 2^{j/2} H_{j,0} = \sum_{j<0} 2^j H(2^j t).$$
Since $\|2^j H(2^j t)\|_2 = 2^{j/2}$ and $j < 0$, this series is absolutely convergent in $L^2(\mathbb{R})$. It is clear from the definition of the Haar function $H(t)$ that
$$\sum_{j<0} 2^{j/2} H_{j,0}(t) = 0 \quad \text{for } t \le 0,$$
$$\sum_{j<0} 2^{j/2} H_{j,0}(t) = \sum_{j<0} 2^j = 1 \quad \text{for } 0 < t < 1,$$
and for $2^r < t < 2^{r+1}$, where $r = 0, 1, 2, \ldots$, one has
$$\sum_{j<0} 2^{j/2} H_{j,0}(t) = -2^{-r-1} + \sum_{j=r+2}^{\infty} 2^{-j} = 0.$$
This implies that $S_0 = L_0$ which, in turn, implies that $L_i = S_i$ for all $i \in \mathbb{Z}$.
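The orthonormality relations used above are easy to confirm numerically; the following sketch (the grid size and the test pairs are arbitrary choices of ours) approximates the inner products $(H_{j,k}, H_{m,n})$ by Riemann sums:

    import numpy as np

    def haar(t):
        # Haar function: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere
        return np.where((0 <= t) & (t < 0.5), 1.0,
                        np.where((0.5 <= t) & (t < 1.0), -1.0, 0.0))

    def H_jk(t, j, k):
        return 2.0 ** (j / 2) * haar(2.0 ** j * t - k)

    t = np.linspace(-1.0, 2.0, 600_001)   # fine grid containing all supports used
    dt = t[1] - t[0]
    ip = lambda j, k, m, n: np.sum(H_jk(t, j, k) * H_jk(t, m, n)) * dt

    print(ip(0, 0, 0, 0))   # ~1: normalization
    print(ip(0, 0, 1, 0))   # ~0: H is constant on the smaller support
    print(ip(1, 0, 1, 1))   # ~0: disjoint supports at the same scale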

Solution of Problem 5.16. $\{L_n\}_{n=-\infty}^{\infty}$, where $L_n$ is defined in the solution of Problem 5.15, is a multiresolution analysis associated with the Haar wavelet $H_{j,k}$; $\chi_{[0,1]}$ can be taken as the scaling function.
Chapter 6

Models of Hysteresis and Applications

The phenomenon of hysteresis occurs in a large number of practical situations and plays a very important role in areas like ferro-magnetism, ferro-electricity, and phase transitions from solid to liquid or liquid to solid. In spite of the fact that this concept has a variety of significant applications in scientific problems, its systematic study started quite late. However, in the last couple of years, two conference proceedings edited by Visintin [1993, 1994] and four monographs, by Brokate and Sprekels [1996], Krasnosel'skii and Pokrovskii [1989], Mayergoyz [1991] and Visintin [1994], have appeared, besides a large number of research papers on applications of the concept, especially in the estimation of the lifetime of machines. These publications are attracting the attention of a fairly large number of mathematicians and engineers. The main goal of this chapter is to familiarize the readers with the basic tools and techniques of this field, which may prove vital for industrial applications. It may be observed that this area still requires intensive study and there is no dearth of challenging problems.

6.1 Introduction to hysteresis


The word hysteresis literally means the lag exhibited by a body in reacting to changes in the forces, especially magnetic forces, affecting it. More precisely, this phenomenon is exhibited, often by a system comprising a ferromagnetic or imperfectly elastic material, in which the reaction of the system is dependent upon its past reaction to change.

Let us concentrate on Figure 6.1. This figure shows an iron ring on which a magnetising winding has been wound, which can be supplied with a current. There is also a secondary winding connected to a flux meter which can measure the change of flux (the integral of voltage with time) when the magnetizing current is changed.


Figure 6.1. Ring used in the determination of a BH curve (hysteresis loop).

One can observe the relationship between the magnetic field strength H applied to the iron and the resultant flux density B in the iron. By carrying out an experiment, it has been found that the change in B associated with a particular change in H depends not only on the value of H but also on the magnetic history of the iron. The curve shown in Figure 6.2 gives a relationship between B and H.
If the field applied to a specimen is increased to saturation and then decreased, the flux density B decreases, but not as rapidly as it increased along the initial magnetization curve. Thus, there is a residual flux density or remanence, $B_r$, as shown in the figure, even when H = 0. In order to reduce B to zero, a negative field $-H_c$ is required; this can be created by reversing the battery polarity. This force is called the coercive force and is marked on the curve by $H_c$. The phenomenon which causes B to lag behind H, so that the magnetization curve for increasing and decreasing fields is not the same, is called hysteresis, and the loop traced out by the magnetization curve is called a hysteresis loop. In soft (easily magnetized) materials, the hysteresis loop is thin, with a small area enclosed. For hard magnetic materials, the area enclosed by the loop is larger. In calculations on soft magnetic materials, it is often possible to neglect the width of the hysteresis loop, although the energy loss due to hysteresis is still taken into account. In such cases, engineers use the curve shown in Figure 6.3, which is called a magnetic reversal curve or magnetic saturation curve.
The saturation curve is quite helpful in predicting the flux resulting from a certain
applied magnetizing current.
Materials such as iron, nickel and cobalt, which exhibit a strong magnetic effect, are called ferromagnetic. The permeability of these materials is not constant but is a function of the applied field and the previous magnetic history of the specimen.

Hysteresis can be defined as a rate-independent memory effect. This is a property of some constitutive laws which relate an input variable u and an output variable w. Here, memory means that at any instant t, w(t) is determined by the previous evolution of u, and not just by u(t), while rate-independence means that the curves described in $\mathbb{R}^2$ by the pair (u, w), namely the loops, are invariant under changes of the input rate, for example, the frequency.


Figure 6.2. Hysteresis diagram.

As indicated in Chapter 1 (see Dressler and Hack [1996], Dressler, Hack and Krüger [1997]), the rainflow counting method is a well-established method for fatigue analysis and damage estimation.

Hysteresis loss (loss of energy in the process of hysteresis). We want to know the loss of energy, that is, the work done in taking the magnetic state of the material around the hysteresis loop indicated in Figure 6.2. With reference to the situation in Figure 6.1, let the magnetizing current $I$ be supplied by a generator, let it be changed from $+I_{max}$ to $-I_{max}$ and back to $+I_{max}$, and let this process take time $T$. Then the work done by the generator will be
$$W = \int_0^T -eI\, dt,$$
where $e$ is the e.m.f. induced by the change of flux in the iron. Thus
$$e = -N\frac{d\Phi}{dt} = -NS\frac{dB}{dt},$$
where $S$ is the cross-sectional area of the ring and $N$ the number of turns of the winding. By the circuit law, with $l$ the mean length of the magnetic path, we have
$$\oint H\, dl = Hl = NI.$$

Figure 6.3. Saturation or reversal curve.

Hence
$$W = \int_0^T S l H \frac{dB}{dt}\, dt = S l \oint H\, dB, \tag{6.1(a)}$$
where the integral is taken around the complete magnetic cycle. This integral is the area of the hysteresis loop. Equation (6.1)(a) states that the hysteresis loss per cycle is given by the volume of iron multiplied by the area of the hysteresis loop. The energy is supplied by the source of the current and reappears as heat in the iron. The energy loss per cycle is constant, so the total loss depends on the number of cycles. The loss of energy due to hysteresis is therefore proportional to frequency. It is also important to know how the loss is related to the change of the maximum flux. The study of hysteresis operators provides mathematical modelling of the hysteresis loops and of the loss of energy which was not available earlier (see the remark on page 149 of Hammond [1986] and Section 6.3 for current research concerning dissipation of energy). In Section 6.2, we introduce hysteresis operators, while the rainflow counting method will be discussed in Section 6.3. Section 6.4 will be devoted to relations between the rainflow counting method, hysteresis operators of different kinds (Prandtl, Ishlinskii, Preisach, mh-hystrons, the moving model, etc.) and the dissipated energy, indicating the techniques for fatigue analysis. This chapter is mainly based on Brokate's work (see Brokate [1994], Brokate et al. [1996]).
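Since (6.1)(a) reduces the loss per cycle to the iron volume times the area of the loop, the loss can be estimated directly from a sampled loop. A minimal sketch (function and variable names are ours), using the shoelace formula for the area of the closed polygon traced in the (H, B)-plane:

    import numpy as np

    def hysteresis_loss_per_cycle(H, B, S, l):
        # H, B: arrays sampling one full traversal of the loop, with the
        # first point repeated at the end so the polygon is closed.
        # S = cross-sectional area, l = mean magnetic path length, so S*l
        # is the iron volume appearing in equation (6.1)(a).
        area = 0.5 * abs(np.sum(H[:-1] * B[1:] - H[1:] * B[:-1]))  # shoelace
        return S * l * area               # energy lost in one cycle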

6.2 Hysteresis operators


A hysteresis operator results from a translation of a hysteresis diagram into a mathematical object. If in Figure 6.2 we treat the H and B axes as v and w, respectively, then for a given $v : [0,T] \to \mathbb{R}$, $T > 0$, we get an output function $w : [0,T] \to \mathbb{R}$ such that $(v(t), w(t))$ moves along the loop in the diagram. In the sequel, $Map[0,T]$, $C_{pm}[0,T]$ and $C_{pc}[0,T]$ denote, respectively, the set of functions, the set of piecewise monotone functions and the set of piecewise continuous functions on $[0,T]$ into $\mathbb{R}$.
An operator $H$ on $C_{pm}[0,T]$ into $Map[0,T]$ is called a hysteresis operator if it is rate-independent and has the Volterra property; namely,

(i) $H[v] \circ \phi = H[v \circ \phi]$ for all $v \in C_{pm}[0,T]$ (6.1)

and for all continuous monotone increasing transformations $\phi$ of $[0,T]$ onto itself satisfying $\phi(0) = 0$, $\phi(T) = T$ (rate-independence property);

(ii) for $v, \tilde v \in Map[0,T]$ and $t \in [0,T]$, $v = \tilde v$ on $[0,t]$ implies that $(H[v])(t) = (H[\tilde v])(t)$ (Volterra property).

Remark 6.1. (i) Rate-independence implies that only the locally extremal values of $v$ (local minima or local maxima) can have an influence on the memory of the process; we may replace input functions $v$ by input strings $(v_0, v_1, v_2, \ldots, v_n)$, $v_i \in \mathbb{R}$. Let us assume that
$$S = \{(v_0, v_1, v_2, \ldots, v_n) \mid n \in \mathbb{N}_0,\ v_i \in \mathbb{R},\ 0 \le i \le n\}, \quad \mathbb{N}_0 = \mathbb{N} \cup \{0\}, \tag{6.2}$$
is the set of all finite strings and
$$S_H = \{(v_0, v_1, v_2, \ldots, v_n) \in S \mid v_0 \ne v_1 \text{ and } n \ge 1,\ (v_{i+1} - v_i)(v_i - v_{i-1}) < 0 \text{ for } 0 < i < n\}. \tag{6.3}$$
$S_H$ is called the subset of alternating strings.
The mapping $H_f : S_H \to \mathbb{R}$ defined by
$$H_f(v_0, v_1, \ldots, v_n) = H[v](T), \tag{6.4}$$
where $v \in C_{pm}[0,T]$ is any input function having a monotonicity partition $0 = t_0 < t_1 < t_2 < \cdots < t_n = T$ such that $v(t_i) = v_i$, $0 \le i \le n$, is called the final value mapping.
(ii) Any mapping $H_f : S_H \to \mathbb{R}$ defines a hysteresis operator $H$ if we set
$$H[v](t) = H_f(v(t_0), v(t_1), \ldots, v(t_k)), \tag{6.5}$$
where $0 = t_0 < t_1 < \cdots < t_k = t$ is a monotonicity partition of $v : [0,T] \to \mathbb{R}$ such that the string $(v(t_0), v(t_1), \ldots, v(t_k))$ is alternating.
(iii) Equations (6.4) and (6.5) provide a bijective correspondence between the set of all hysteresis operators and the set of all real-valued mappings on $S_H$. Any hysteresis operator $H$ can be interpreted as a mapping of $S$ into itself if we set
$$H(v_0, v_1, \ldots, v_n) = \big(H_f(v_0),\ H_f(v_0, v_1),\ H_f(v_0, v_1, v_2),\ \ldots,\ H_f(v_0, v_1, v_2, \ldots, v_n)\big). \tag{6.6}$$
In the sequel, $H[v]$ and $H[s]$ will denote the value of $H$ at the function $v$ and the value of $H$ at the string $s$, respectively.

Example 6.1. Let us consider longitudinal vibrations of a rod. Newton's law coupled with the constitutive stress-strain relation, that is,
$$\frac{\partial^2 u}{\partial t^2} = \frac{\partial \sigma}{\partial x}, \qquad \sigma = H[\epsilon], \qquad \epsilon = \frac{\partial u}{\partial x}, \tag{6.7}$$
together with certain initial and boundary conditions, determines the displacement $u$, the stress $\sigma$ and the strain $\epsilon$ as functions of time $t$ and space $x$. Within the elastic limit, Hooke's law
$$H[\epsilon] = E\epsilon \tag{6.8}$$
holds, where $E$ denotes the modulus of elasticity. Beyond the elastic limit, many materials exhibit plastic behaviour. The simplest description, namely, the elastic-perfectly plastic model of Figure 6.4 with a fixed yield stress $|\sigma| = r$ and pure plastic flow, admits a continuum of possible stable states $\{(\sigma, \epsilon) \mid \sigma \in [-r, r]\}$ for every value of $\epsilon$. Figure 6.4 gives rise to a hysteresis operator $\mathcal{E}_r$ acting on a space of functions, and its constitutive equation
$$\sigma = \mathcal{E}_r[\epsilon] \tag{6.9}$$
represents an equation between functions rather than between certain values of stress and strain.

Figure 6.4. The elastic-perfectly plastic element.

For $r \ge 0$, the hysteresis operator $\mathcal{E}_r[\cdot, w_{-1}]$, $w_{-1} \in \mathbb{R}$ (the initial value), is defined by its final value mapping $S_H \to \mathbb{R}$ given recursively by
$$\mathcal{E}_{r,f}(v_0) = e_r(v_0 - w_{-1}), \qquad \mathcal{E}_{r,f}(v_0, v_1, \ldots, v_n) = e_r\big(v_n - v_{n-1} + \mathcal{E}_{r,f}(v_0, \ldots, v_{n-1})\big), \tag{6.10}$$

where
$$e_r(v) = \min\{r, \max\{-r, v\}\}. \tag{6.11}$$
We take $w_{-1} = 0$ unless otherwise stated, and we write $\mathcal{E}_r[v]$ instead of $\mathcal{E}_r[v, 0]$.

Remark 6.2. (i) Let $r \ge 0$; we define $\mathcal{F}_r[\cdot, w_{-1}]$ for the initial value $w_{-1}$ ($w_{-1}$ represents the internal state before $v(0)$ is applied at time $t = 0$) by
$$w(t) = \mathcal{F}_r[v, w_{-1}](t), \tag{6.11}$$
with its corresponding final value mapping $F_r : S_H \to \mathbb{R}$ given recursively by
$$F_r(v_0, v_1, \ldots, v_n) = f_r\big(v_n, F_r(v_0, v_1, \ldots, v_{n-1})\big), \tag{6.12}$$
where
$$f_r(v, w) = \max\{v - r, \min\{v + r, w\}\}. \tag{6.13}$$
In case the choice of initial value is not mentioned explicitly, we assume it to be zero. Accordingly, we write $\mathcal{F}_r[v]$ instead of $\mathcal{F}_r[v; 0]$. $\mathcal{F}_r[\cdot]$ is called the play operator.
(ii) Let $R_{x,y}$ be defined, for $x < y$, by
$$R_{x,y}[v] = \begin{cases} 1 & \text{if the input switches to } y \text{ from below,} \\ -1 & \text{if the input switches to } x \text{ from above.} \end{cases} \tag{6.14}$$
$R_{x,y}$ is called a relay with thresholds $x < y$. The final value mapping of $R_{x,y}$ with thresholds $x < y$ and initial value $w_{-1}(x,y)$, denoted by $R_{x,y,f}$, has the following form:
$$R_{x,y,f}(v_0, v_1, \ldots, v_n) = \begin{cases} 1, & v_n \ge y, \\ -1, & v_n \le x, \\ R_{x,y,f}(v_0, v_1, \ldots, v_{n-1}), & x < v_n < y,\ n \ge 1, \\ w_{-1}(x,y), & x < v_n < y,\ n = 0. \end{cases} \tag{6.15}$$
In case the choice of initial value is not stated explicitly, we assume that $w_{-1}(x,y) = 1$ if $x + y \le 0$ and $w_{-1}(x,y) = -1$ otherwise.
(iii) The play operator $\mathcal{F}_r$ incorporates the memory of all relays $R_{x,y}$ with $|x - y| = 2r$.
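The recursions (6.10)-(6.13) translate directly into code. The following minimal Python sketch (names are ours) evaluates the play operator $\mathcal{F}_r$ and the operator $\mathcal{E}_r$ on an input string and checks the identity $\mathcal{F}_r + \mathcal{E}_r = \mathrm{id}$ recorded as (6.16)(b) in Lemma 6.1 below:

    def play(vs, r, w_init=0.0):
        # Play operator F_r via (6.12)-(6.13):
        # F_r(v_0,...,v_n) = f_r(v_n, F_r(v_0,...,v_{n-1})),
        # f_r(v, w) = max(v - r, min(v + r, w)).
        out, w = [], w_init
        for v in vs:
            w = max(v - r, min(v + r, w))   # afterwards |v - w| <= r
            out.append(w)
        return out

    def stop(vs, r, w_init=0.0):
        # Operator E_r via (6.10)-(6.11):
        # E_r(v_0,...,v_n) = e_r(v_n - v_{n-1} + E_r(v_0,...,v_{n-1})),
        # with e_r(v) = min(r, max(-r, v)); the base case uses w_init = w_{-1}.
        out, w, prev = [], None, None
        for v in vs:
            inc = (v - prev + w) if prev is not None else (v - w_init)
            w = min(r, max(-r, inc))
            out.append(w)
            prev = v
        return out

    vs, r = [0.0, 2.0, -1.0, 3.0, 0.5], 1.0
    assert all(abs(p + s - v) < 1e-12
               for p, s, v in zip(play(vs, r), stop(vs, r), vs))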

Lemma 6.1 [Brokate 94]. (i) For each $r > 0$ and each $s \in \mathbb{R}$, we have
$$R_{s-r,s+r,f}(v_0, v_1, \ldots, v_n) = \begin{cases} 1, & \text{if } F_{r,f}(v_0, v_1, \ldots, v_n) > s, \\ -1, & \text{if } F_{r,f}(v_0, v_1, \ldots, v_n) < s, \end{cases} \tag{6.16(a)}$$
for every $n \ge 0$ and every string $(v_0, v_1, \ldots, v_n) \in S$.
(ii) It can be seen [Brokate 1994] that $\mathcal{F}_r + \mathcal{E}_r = \mathrm{id}$; that is,
$$\mathcal{F}_r[v; w_{-1}] + \mathcal{E}_r[v; w_{-1}] = v \quad \text{for every } v \in C_{pm}[0,T] \tag{6.16(b)}$$
and every $w_{-1} \in \mathbb{R}$.


In 1928, Prandtl constructed a model as the continuous parallel combination of elastic-perfectly plastic elements, which yields a reasonable approximation in cyclic plasticity for the investigation of stabilized elastic-plastic behaviour, that is, the behaviour after a large number of load cycles. We can write Prandtl's model in terms of hysteresis operators as
$$H[v] = \int_0^\infty p(r)\, \mathcal{E}_r[v]\, dr \quad \text{(Prandtl model)}, \tag{6.17}$$
where $p(\cdot)$ is related to the stabilized $\sigma$-$\epsilon$-curve for initial loading, that is, loading starting from $\sigma = \epsilon = 0$, as well as to the shape of the hysteresis loops.
In 1935, Preisach developed a model for describing the hysteresis loops traced out by the magnetic field and the magnetization in ferromagnetic materials (see Figure 6.2). The operator form of the Preisach model is as follows:
$$H[v] = \int_0^\infty \int_{-\infty}^{\infty} w(r,s)\, R_{s-r,s+r}[v]\, ds\, dr \quad \text{(Preisach model)}, \tag{6.18}$$
where $w(\cdot,\cdot)$ denotes some density function analogous to $p(\cdot)$ in (6.17) and $R_{x,y}$ denotes the relay with thresholds $x < y$.

Remark 6.3. (i) By putting the values of $R_{s-r,s+r}$ in (6.18), we can write the Preisach model in terms of the play operator as
$$H[v](t) = \int_0^\infty q(r, \mathcal{F}_r[v](t))\, dr + q_\infty, \tag{6.19}$$
where
$$q(r,s) = 2\int_0^s w(r,\sigma)\, d\sigma, \tag{6.20}$$
$$q_\infty = \int_0^\infty \left( \int_{-\infty}^0 w(r,\sigma)\, d\sigma - \int_0^\infty w(r,\sigma)\, d\sigma \right) dr. \tag{6.21}$$
We observe that $q_\infty = 0$ if $w(r,s) = w(r,-s)$ for all $r$ and $s$.
(ii) The hysteresis loop corresponding to the play operator is shown in Figure 6.5.
(iii) By (6.16)(b), and keeping in mind that $\mathcal{F}_0 = \mathrm{id}$, the Prandtl operator $H$ can be written in terms of the play operator; that is,
$$H[v](t) = \int_0^\infty p(r)\, dr \cdot \mathcal{F}_0[v](t) - \int_0^\infty p(r)\, \mathcal{F}_r[v](t)\, dr. \tag{6.22}$$
(iv) It may be observed that the values $(\mathcal{F}_r[v](t))$, $r \ge 0$, play a crucial role in determining the output $H[v](t)$ for all hysteresis operators discussed here. In fact, these values contain the entire memory information at time $t$ required to determine the future, that is, to determine $H[v]|_{[t,T]}$ from $v|_{[t,T]}$.
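In computations, the integral in (6.22) is usually approximated by a finite sum of play operators. A minimal sketch (the radii and quadrature weights are assumptions of ours; play() is the routine from the sketch after Remark 6.2):

    import numpy as np

    def prandtl(vs, radii, weights):
        # Discretization of (6.22), using F_0 = id:
        #   H[v](t) ~ sum_k weights[k] * (v(t) - F_{r_k}[v](t)),
        # where weights[k] stands in for p(r_k) times the quadrature step.
        vs = np.asarray(vs, dtype=float)
        out = np.zeros_like(vs)
        for r, p in zip(radii, weights):
            out += p * (vs - np.asarray(play(vs, r)))
        return out

    radii = np.linspace(0.1, 2.0, 20)       # yield radii r_k (made-up values)
    weights = np.full(radii.shape, 0.05)    # stands in for p(r_k) dr
    output = prandtl([0.0, 2.0, -1.0, 3.0, 0.5], radii, weights)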

Preisach type operator. Let $(v_0, v_1, \ldots, v_n) \in S$ be an arbitrary input string. This string successively generates the memory curves $\psi_k : \mathbb{R}_+ \to \mathbb{R}$,
$$\psi_k(r) = F_{r,f}(v_0, v_1, \ldots, v_k), \quad r \ge 0, \tag{6.23}$$
through the update
$$\psi_k(r) = f_r\big(v_k, \psi_{k-1}(r)\big). \tag{6.24}$$

Figure 6.5. Hysteresis loop of the play operator.

Let $\psi_{-1}$ be an element of the set $\Phi_0$ defined by
$$\Phi_0 = \big\{\phi : \mathbb{R}_+ \to \mathbb{R} \ \big|\ |\phi(r) - \phi(\bar r)| \le |r - \bar r| \text{ for all } r, \bar r \ge 0, \text{ and } \phi = 0 \text{ on } [\rho, \infty) \text{ for some } \rho \ge 0\big\}, \tag{6.25}$$
where $\Phi_0$ is known as the family of Preisach memory curves. It can be checked that $\psi_k \in \Phi_0$ for all $k$ whenever $\psi_{-1} \in \Phi_0$.
An operator $H$ on $C_{pm}([0,T])$ into $Map([0,T])$ is called a Preisach type operator if

(i) $H$ is a hysteresis operator;



(ii) the final value map $H_f : S \to \mathbb{R}$ has the form
$$H_f(v_0, v_1, \ldots, v_n) = Q(\psi_n) \tag{6.26}$$
for some mapping $Q : \Phi_0 \to \mathbb{R}$, called the output mapping, and some initial condition $\psi_{-1} \in \Phi_0$.
Equation (6.26) can also be written as
$$H[v](t) = Q(\psi(t)), \qquad \psi(t)(r) = \mathcal{F}_r[v; \psi_{-1}(r)](t), \quad t \in [0,T],\ r \ge 0. \tag{6.27}$$
A Preisach type operator $H$ is called a $P_0$-operator if $\psi_{-1} = 0$.

Remark 6.4. (i) The Prandtl model is a Preisach type operator where
$$Q(\phi) = p_0\, \phi(0) - \int_0^\infty p(r)\, \phi(r)\, dr, \qquad p_0 = \int_0^\infty p(r)\, dr. \tag{6.28}$$
(ii) The Preisach model is also a Preisach type operator where
$$Q(\phi) = \int_0^\infty q(r, \phi(r))\, dr + q_\infty, \tag{6.29}$$
and
$$q(r,s) = 2\int_0^s w(r,\sigma)\, d\sigma. \tag{6.30}$$
(iii) It may be remarked that the class of Prandtl (Preisach) operators is not quite unique, as we have to specify the class of allowed density functions $p(\cdot)$ and $w(\cdot,\cdot)$. However, no such ambiguity is involved in the definition of a $P_0$-operator.

6.3 Rainflow counting method in fatigue analysis and damage estimation
We have discussed the application of the rainflow counting method in fatigue analysis in Chapter 1. As mentioned there, the rainflow counting method due to Endo is widely used to decompose an arbitrary sequence of loads or deformations into cycles and to count the cycles. In combination with the Palmgren-Miner rule, one can obtain for every sequence of loads a real number which estimates the damage inflicted upon the work piece by this loading sequence. Mechanical engineers have developed several refinements and modifications of this procedure; in any case, the single cycle, respectively the corresponding hysteresis loop in the stress-strain plane, certainly constitutes a basic event for damage assessment. It is also known that the memory structure of the elastoplastic constitutive law due to Prandtl-Ishlinskii directly corresponds to the decomposition performed by the rainflow method. The role of the rainflow method in fatigue analysis has been discussed in detail in Krüger et al. [1985] and Dreßler et al. [1996, 1997]. In this section, we briefly present the description of (i) the accumulated damage equating the variation of the output of a certain Preisach hysteresis operator, (ii) the accumulated damage depending continuously on the input, (iii) the relation between the rainflow method and the hysteresis memory operator, and (iv) hysteretic constitutive laws, dissipation formulas and rainflow count formulas for nonlinear rate-independent rheological models, together with the derivation of the accumulated damage as the output of a certain Preisach operator. For more details, we refer to Brokate et al. [1996].
A string $s \in S$ is called irreducible if neither deletion rule (monotone or Madelung) applies to it. We introduce a partial ordering on $S$ by saying that $s' \preceq s$ for $s', s \in S$ if $s'$ can be obtained from $s$ by a finite sequence of (arbitrarily mixed) monotone and Madelung deletions. For any such finite sequence of deletions, we define its unsymmetric rainflow count as a function $a_u : \mathbb{R} \times \mathbb{R} \to \mathbb{N}_0$, where $a_u(x,y)$ specifies how often the pair $(x,y)$ is removed by a Madelung deletion. If we ignore the order within each pair, we obtain the symmetric rainflow count $a(x,y) = a_u(x,y) + a_u(y,x)$. It is clear that $a_u(x,x) = a(x,x) = 0$ for all $x \in \mathbb{R}$ and that $a_u(x,y) \ne 0$ for a finite number of pairs only. In actual applications, the input values are classified in a preprocessing step, so that afterwards only finitely many, say $K$, different values $x_1 < \cdots < x_K$ can occur. In this case, the rainflow count reduces to the $K \times K$ rainflow matrix $A_u$ with $a_{u,jl} = a_u(x_j, x_l)$. Accordingly, $A = A_u + A_u^T$ defines the symmetric rainflow matrix. In practice, one keeps both the rainflow counts and the residual, since the latter includes information on the order of the original sequence which has been found relevant for purposes of reconstruction and extrapolation of loading sequences from rainflow matrices. However, if we are interested in a complete cycle count, we need to count the residual, too. For any $s \in S$, there is a unique irreducible string $s_R \in S$ with $s_R \preceq s$. This irreducible string is called the rainflow residual of $s$. The symmetric rainflow count of $(s_R, s_R)$ appears as the natural way to count the residual $s_R$ (see Brokate et al. [Lemma 2.5, 1996]). Since this means that we have to compare counts of different strings, we will write $a(s)$ and $a(s)(x,y)$ instead of $a$ and $a(x,y)$ to specify the string whose count is obtained.
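A minimal sketch of this counting procedure (names are ours; the classification into $K$ levels is omitted): incoming values are pushed onto a stack, monotone deletions drop the middle point of monotone triples, and a Madelung deletion counts and removes an inner pair nested between its neighbours. The function returns the symmetric count $a(s)$ and the residual $s_R$:

    def rainflow(vs):
        # Rainflow count by monotone and Madelung deletions (sketch).
        # Returns {(min, max): closed cycles} and the rainflow residual.
        counts, stack = {}, []
        for v in vs:
            stack.append(v)
            # monotone deletion: drop the middle point of a monotone triple
            while len(stack) >= 3 and (
                    stack[-3] <= stack[-2] <= stack[-1] or
                    stack[-3] >= stack[-2] >= stack[-1]):
                del stack[-2]
            # Madelung deletion: an inner pair nested between its neighbours
            # closes a hysteresis loop and is counted as one full cycle
            while len(stack) >= 4:
                a, x, y, b = stack[-4:]
                if min(a, b) <= min(x, y) and max(x, y) <= max(a, b):
                    key = (min(x, y), max(x, y))
                    counts[key] = counts.get(key, 0) + 1
                    del stack[-3:-1]        # remove x, y; a and b stay
                else:
                    break
        return counts, stack                # stack is the residual s_R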
For any $s \in S$, its periodic rainflow count $a_{per}(s) : \mathbb{R} \times \mathbb{R} \to \mathbb{N}_0$ is defined as

It can be checked that

where
$$\mathrm{wrap}_i(s) = (v_i, \ldots, v_n, v_0, v_1, \ldots, v_{i-1}). \tag{6.31}$$
The count of a single cycle of maximum amplitude is denoted by

(6.32)

For $x, y \in \mathbb{R}$ with $x < y$ and $w_{-1} \in \{0, 1\}$, the relay hysteresis operator $R_{x,y} : S \to S$ is defined by
$$R_{x,y}(v_0, v_1, \ldots, v_n) = (w_0, w_1, \ldots, w_n) \tag{6.33}$$
with
$$w_i = \begin{cases} 1, & v_i \ge y, \\ 0, & v_i \le x, \\ w_{i-1}, & x < v_i < y. \end{cases} \tag{6.34}$$

For any string $s = (v_0, v_1, \ldots, v_n) \in S$, we denote its variation by
$$\mathrm{Var}(s) = \sum_{i=0}^{n-1} |v_{i+1} - v_i|. \tag{6.35}$$
The removal of a Madelung pair $(x,y)$ from a string decreases its variation by the amount $2|y - x|$, whereas monotone deletions do not change the variation; so
$$\mathrm{Var}(s) = 2\sum_{x<y} (y - x)\, a(s)(x,y) + \mathrm{Var}(s_R) \tag{6.36}$$
holds for any string $s \in S$.
The number $\mathrm{Var}(R_{x,y}(s))$ represents the number of oscillations of $s$ between the thresholds $x$ and $y$. The following results can be verified:
$$\mathrm{Var}(R_{x,y}(s)) = 2\sum_{\xi \le x < y \le \eta} a(s)(\xi,\eta) + \mathrm{Var}(R_{x,y}(s_R)), \tag{6.37}$$
$$\mathrm{Var}(R_{x,y}(s,s)) = 2\,\mathrm{Var}(R_{x,y}(s)), \tag{6.38(i)}$$
$$\mathrm{Var}(R_{x,y}(s)) = 2\sum_{\xi \le x < y \le \eta} a_{per}(s)(\xi,\eta), \tag{6.38(ii)}$$
$$\mathrm{Var}(R^{per}_{x,y}(s)) = 2\sum_{\xi \le x < y \le \eta} a_{per}(s)(\xi,\eta), \tag{6.39}$$
if $s = (v_0, v_1, \ldots, v_n)$ with $v_0 = v_n$, and if $w_{-1} = w_0 = w_n$.
Let $N(x,y)$ denote the number of repetitions of the input cycle $(x,y)$ that leads to failure. Then, on a unit scale,
$$\Delta(x,y) = \frac{1}{N(x,y)} \tag{6.40}$$
represents the contribution of the single cycle $(x,y)$ to the damage of the structure. The total damage $D(s)$ due to an input string $s \in S$ is then estimated as
$$D(s) = \sum_{x<y} a_{per}(s)(x,y)\, \Delta(x,y). \tag{6.41}$$
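Combining the counting sketch above with (6.40)-(6.41) gives a damage estimate; in the sketch below the cycles-to-failure law N(x, y) is a purely hypothetical Wöhler-type power law, and the cycles still open in the residual are ignored (a periodic count would close them):

    def total_damage(counts, N):
        # Palmgren-Miner accumulation (6.41): each counted cycle (x, y)
        # contributes Delta(x, y) = 1 / N(x, y) from (6.40).
        return sum(n / N(x, y) for (x, y), n in counts.items())

    # Hypothetical material law: the scale 1e3 and exponent 5.0 are made up.
    N_wohler = lambda x, y: 1e3 * (2.0 / (y - x)) ** 5.0

    counts, residual = rainflow([0, 2, -1, 1, 0.5, 3, 0])
    D = total_damage(counts, N_wohler)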

Within the context of fatigue analysis, there is no reason to consider arbitrarily large input values. In order to facilitate the exposition, from now on we fix an a priori bound $M$ for admissible input values. The relevant threshold values for the relays $R_{x,y}$ then lie within the triangle $P = \{(x,y) \in \mathbb{R}^2 \mid -M \le x < y \le M\}$. For the sake of simplicity, we also assume that
$$w_{-1}(x,y) = \begin{cases} 1, & x + y \le 0, \\ 0, & x + y > 0. \end{cases} \tag{6.42}$$
Under these assumptions, the Preisach operator $H : S \to S$ takes the form
$$H(s) = \int_{x<y} \varphi(x,y)\, R_{x,y}(s)\, dx\, dy, \qquad \varphi \in L^1(P), \tag{6.43}$$
and its periodic version $H_{per} : S \to S$ is defined by
$$H_{per}(s) = \int_{x<y} \varphi(x,y)\, R^{per}_{x,y}(s)\, dx\, dy. \tag{6.44}$$
Here, the integral is understood to be componentwise with respect to the elements of the string $R_{x,y}(s)$, and the function $\varphi$ is set to zero outside the triangle $P$. An operator $H : S \to S$ of the form $s = (v_0, \ldots, v_n) \mapsto H(s) = (w_0, \ldots, w_n)$ is called piecewise monotone if $(w_k - w_{k-1})(v_k - v_{k-1}) \ge 0$, $1 \le k \le n$, holds for every $s = (v_0, \ldots, v_n) \in S$.

Theorem 6.1. Let $H : S \to S$ be a Preisach operator for some given density function $\varphi \in L^1(P)$. Then for each $s = (v_0, v_1, \ldots, v_n) \in S$ with $v_0 = v_n$ we have
$$\mathrm{Var}(H_{per}(s)) = \sum_{x<y} a_{per}(s)(x,y)\, \psi(x,y), \tag{6.45}$$
where
$$\psi(x,y) = 2\int_{x \le \xi < \eta \le y} \varphi(\xi,\eta)\, d\xi\, d\eta. \tag{6.46}$$

Corollary 6.1. Let $\Delta \in H^2(P)$, $\Delta(x,y) \ge 0$, $\partial_y \Delta(x,y) \ge 0$ and $\partial_x \Delta(x,y) \le 0$ for all $(x,y) \in P$, and $\Delta(x,x) = \partial_x \Delta(x,x) = \partial_y \Delta(x,x) = 0$ for each $x \in \mathbb{R}$. Furthermore, let the density function of the Preisach operator be $\varphi(x,y) = -\frac{1}{2}\partial_x\partial_y \Delta(x,y)$. Then for each string $s = (v_0, v_1, \ldots, v_n) \in S$ with $\|s\|_\infty \le M$ and $v_0 = v_n$, the total damage $D(s)$ associated to $s$ can be represented as
$$D(s) = \sum_{x<y} a_{per}(s)(x,y)\, \Delta(x,y) = \mathrm{Var}(H_{per}(s)). \tag{6.47}$$

Remark 6.5. (i) It may be remarked that the corollary indicates that the mathematical theory developed for the Preisach operator can be used for the analysis of the damage functional $D$ arising from the combination of rainflow counting and the Palmgren-Miner rule.
(ii) It has been shown that the total damage depends continuously on the input (see Theorem 3.1 of Brokate et al. [1996] or Brokate et al. [1996, Section 2.6]) and, as a consequence, the total damage is stable with respect to small variations of the input and, in particular, with respect to range discretization in data reduction algorithms.

Proof of Theorem 6.1. We have
$$\left[(H_{per}(s))_{i+1} - (H_{per}(s))_i\right]\operatorname{sign}(v_{i+1} - v_i)
= \int_{x<y} \varphi(x,y)\left[(R^{per}_{x,y}(s))_{i+1} - (R^{per}_{x,y}(s))_i\right]\operatorname{sign}(v_{i+1} - v_i)\, dx\, dy. \tag{6.48}$$
Since $H(s,s) = (H(s), H_{per}(s))$, we have
$$(H_{per}(s))_{i+1} - (H_{per}(s))_i = (H(s,s))_{n+i+2} - (H(s,s))_{n+i+1}, \tag{6.49}$$
and the same formula holds true if we replace $H$ by $R_{x,y}$. From the piecewise monotonicity of $H$ and of $R_{x,y}$, it follows that if we sum over $i$ in (6.48), then
$$\mathrm{Var}(H_{per}(s)) = \int_{x<y} \varphi(x,y)\, \mathrm{Var}\big(R^{per}_{x,y}(s)\big)\, dx\, dy. \tag{6.50}$$
We obtain (6.45) by integrating (6.39) with the density function $\varphi$ over the domain where $x < y$ and keeping in mind the relation

(6.51)

Proof of Corollary 6.1. The assertion will follow from the theorem if we show that $H$ with the conditions given in the corollary is piecewise monotone. Let $s = (v_0, \ldots, v_n)$ with $v_n > v_{n-1}$ and $v_n \ne v_j$ for all $j < n$. For the Preisach operator (6.43), there holds
$$\partial_n H(v_0, \ldots, v_n) = 2\int_0^{r^*} \varphi(v_n - 2r, v_n)\, dr, \tag{6.52}$$
where $r^* > 0$ is a certain number depending on $(v_0, v_1, \ldots, v_n)$ and $\partial_n H(v_0, v_1, \ldots, v_n)$ denotes the partial derivative of the last component of the output string with respect to the last input value. From (6.46) and the assumptions on $\Delta$, we get
$$\partial_n H(v_0, v_1, \ldots, v_n) = \frac{1}{2}\partial_y \Delta(v_n - 2r^*, v_n) - \frac{1}{2}\partial_y \Delta(v_n, v_n) \ge 0. \tag{6.53}$$
An analogous argument in the case $v_n < v_{n-1}$ yields

(6.54)

Therefore, $H$ is piecewise monotone.

Rainflow and hysteresis memory. The rainflow method is linked intimately to the memory structure of scalar rate-independent hysteresis. One may establish the connection for every hysteresis model whose memory is based upon a continuous family of play (or, equivalently, relay) operators. Their collective memory evolution generated by a particular input string $s = (v_0, \ldots, v_n)$ is completely described by a string of functions
$$(\psi_0, \psi_1, \ldots, \psi_n). \tag{6.55}$$
The function $\psi_k$ represents the memory after $v_k$ has been processed. For a systematic study of various properties of the hysteresis memory operator which maps input strings to memory curves, we refer to Brokate and Visintin [1989] and Visintin [1994]. We indicate here how it is applied to calculate the partial derivative $\partial_{v_n} H(v_0, \ldots, v_n)$ in the damage analysis.
By (6.24), we have
$$\psi_n(r) = f_r\big(v_n, \psi_{n-1}(r)\big) = \max\{v_n - r, \min\{v_n + r, \psi_{n-1}(r)\}\}. \tag{6.56}$$
We call the function $\psi_n$ or, more precisely, its graph $\{(r, \psi_n(r)) \mid r \ge 0\}$ the hysteresis memory curve belonging to the string $s = (v_0, v_1, \ldots, v_n)$ and denote it as
$$\psi_n = \Psi(s). \tag{6.57}$$
If $s$ is alternating and satisfies
$$|v_0| > |v_1|, \qquad |v_{i+1} - v_i| < |v_i - v_{i-1}|, \quad 0 < i < n, \tag{6.58}$$
then its hysteresis memory curve $\psi_n$ consists of $n+1$ pieces of straight lines of slope $\pm 1$, starting at $(0, v_n)$ and ending at $(r_0, 0) = (|v_0|, 0)$, with corners $(r_i, \psi_n(r_i))$,
$$r_i = \frac{|v_i - v_{i-1}|}{2}, \qquad \psi_n(r_i) = \frac{v_i + v_{i-1}}{2}, \quad 1 \le i \le n. \tag{6.59}$$
The connection to the rainflow method is as follows: let $v_n$ tend to $v_{n-2}$; then $r_n$ tends to $r_{n-1}$. When $v_n$ becomes equal to $v_{n-2}$, the corner $(r_n, \psi_n(r_n))$ merges with $(r_{n-1}, \psi_n(r_{n-1}))$ and both vanish. At that moment, the rainflow method counts and deletes the Madelung pair $(v_{n-1}, v_n)$ and, in the input-output plane, a hysteresis loop is closed. For an arbitrary string $s$, one may check that
$$\Psi(s) = \Psi(s_{BR}), \tag{6.60}$$

Figure 6.6. The hysteresis memory curve.

where we obtain the so-called backward residual $s_{BR}$ from the residual $s_R = (v'_0, v'_1, \ldots, v'_m)$ by deleting successively $v'_0, v'_1, \ldots$ until the remaining string satisfies (6.58).
Let $s = (v_0, v_1, \ldots, v_n)$ be alternating and furthermore assume that (6.58) holds. Consider the case $v_n > v_{n-1}$, or the case $n = 0$ and $v_0 > 0$. From Figure 6.6, we see that
$$\partial_n F_r(v_0, v_1, \ldots, v_n) = \partial_{v_n} F_{r,f}(v_0, v_1, \ldots, v_n) = \begin{cases} 1, & r < r_n, \\ 0, & r > r_n, \end{cases} \tag{6.61}$$
as well as

(6.62)

If the string $s = (v_0, \ldots, v_n)$ is arbitrary, the computation above applies to its backward residual $s_{BR}$; so we have
$$\partial_n H(v_0, \ldots, v_n) = 2\int_0^{r^*} \varphi(v_n - 2r, v_n)\, dr, \tag{6.64}$$

with a certain number $r^* > 0$. Similarly, we deduce that in the case $v_n < v_{n-1}$,
$$\partial_n H(v_0, v_1, \ldots, v_n) = 2\int_0^{r^*} \varphi(v_n, v_n + 2r)\, dr, \tag{6.65}$$
for some $r^* > 0$.

6.4 Energy dissipation in rate-independent rheological models

We mention here a formal analogy between energy dissipation and damage accumulation for scalar rate-independent constitutive laws which is established by the calculus of hysteresis operators. This kind of technique has been developed and widely used in material science and engineering under the name of rheological models. The concept of a rheological element consists of a constitutive relation between the stress $\sigma$ and the strain $\epsilon$, and an internal energy $U$ (one may consider the free energy or the potential energy in isothermal situations; these concepts coincide).

Basic rheological elements. (i) $\mathcal{E}$ denotes the linear elastic element with the constitutive equation and internal energy
$$\sigma = E\epsilon, \qquad U = \frac{1}{2}E\epsilon^2, \tag{6.66}$$
where $E$ denotes the modulus of elasticity.
(ii) $\mathcal{N}$ denotes the nonlinear elastic element with the constitutive equation and internal energy
$$\epsilon = g(\sigma), \qquad U = G(\sigma) = \sigma g(\sigma) - \int_0^\sigma g(\xi)\, d\xi, \tag{6.67}$$
where $g : \mathbb{R} \to \mathbb{R}$ is a non-decreasing function with $g(0) = 0$.
(iii) $\mathcal{P}$ denotes the rigid plastic element whose constitutive relation is described by the variational inequality
$$\dot\epsilon(t)\,(\sigma(t) - \tilde\sigma) \ge 0 \quad \text{for all } \tilde\sigma \in [-r, r], \tag{6.68}$$
$$\sigma(t) \in [-r, r], \tag{6.69}$$
with the yield stress $r > 0$.
(iv) $\mathcal{B}$ denotes the brittle element described by
$$\epsilon(t) = 0 \ \text{if } \|\sigma\|_{[0,t]} < h, \qquad \sigma(t) = 0 \ \text{if } \|\sigma\|_{[0,t]} \ge h, \tag{6.70}$$
with the fracture stress $h > 0$. Here, $\|\cdot\|_{[0,t]}$ denotes the sup norm over the time interval $[0,t]$.
For the elements $\mathcal{P}$ and $\mathcal{B}$, the internal energy $U$ is set to 0 to express the fact that no reversible power can be stored by these elements.

The definition of $\mathcal{B}$ via (6.70) makes sense if $|\sigma(t)| < h$ for all $t \ge 0$. The material remains rigid as long as $|\sigma(t)|$ stays bounded away from the value $h$; as soon as $|\sigma(t^-)| = h$, the material breaks, $\sigma$ jumps to zero and we lose any control on $\epsilon$. Condition (6.70) can be equivalently written in terms of the Heaviside function $H$, defined as $H(x) = 1$ if $x > 0$ and $H(x) = 0$ otherwise, as
$$\epsilon(t)\, H(h - \|\sigma\|_{[0,t]}) = 0, \qquad \sigma(t)\big(1 - H(h - \|\sigma\|_{[0,t]})\big) = 0. \tag{6.71}$$
From two given rheological elements $R_1$ and $R_2$, we may form a new element $R$ as the

(a) series combination $R = R_1 - R_2$ or $R = \sum_{i \in \{1,2\}} R_i$, setting
$$\epsilon = \epsilon_1 + \epsilon_2, \qquad \sigma = \sigma_1 = \sigma_2, \qquad U = U_1 + U_2; \tag{6.72}$$

(b) parallel combination $R = R_1/R_2$ or $R = \prod_{i \in \{1,2\}} R_i$, setting
$$\epsilon = \epsilon_1 = \epsilon_2, \qquad \sigma = \sigma_1 + \sigma_2, \qquad U = U_1 + U_2. \tag{6.73}$$

Here $\epsilon$, $\sigma$ and $U$ are treated as functions of time $t$ within the interval $[0,T]$. According to the second law of thermodynamics, the dissipation rate $q(t)$ given by
$$q = \dot\epsilon\sigma - \dot U \tag{6.74}$$
has to satisfy
$$q(t) \ge 0. \tag{6.75}$$
A weak formulation of (6.75), which we consider here, is obtained by integrating $q$ by parts:
$$\int_{t_1}^{t_2} q(t)\, dt = \big[\epsilon(t)\sigma(t) - U(t)\big]_{t_1}^{t_2} - \int_{t_1}^{t_2} \epsilon(t)\dot\sigma(t)\, dt \ge 0, \qquad 0 \le t_1 \le t_2 \le T. \tag{6.76}$$

A rheological element is called thermodynamically consistent if (6.75) and (6.76) hold for all functions satisfying the constitutive relation and belonging to a suitable function space. The elements $\mathcal{E}$, $\mathcal{N}$ and $\mathcal{P}$ are thermodynamically consistent. In fact, $\mathcal{E}$ and $\mathcal{N}$ are conservative, that is, $q = 0$. It is clear that any parallel or series combination of thermodynamically consistent elements is again thermodynamically consistent. It can be proved that the parallel combination $\mathcal{N}/\mathcal{B}$ is thermodynamically consistent for $\sigma \in H^1(0,T)$ if $g$ is a real-valued continuous function satisfying $x\, g(x) > 0$ for $x \ne 0$ (for the proof, see Brokate et al. [1996]).

Example 6.2 (Basic elastoplastic elements). The series combination $\mathcal{E}$-$\mathcal{P}$ is described by
$$\epsilon = \epsilon^e + \epsilon^p, \qquad \sigma = E\epsilon^e, \qquad U = \frac{1}{2E}\sigma^2, \tag{6.77}$$
$$\sigma \in [-r, r], \qquad \dot\epsilon^p(\sigma - \tilde\sigma) \ge 0 \quad \text{for all } \tilde\sigma \in [-r, r]. \tag{6.78}$$

We observe that both $\mathcal{E}/\mathcal{P}$ and $\mathcal{E}$-$\mathcal{P}$ are governed by an evolution variational inequality; namely,
$$(E\dot\epsilon - \dot\sigma)(\sigma - \tilde\sigma) \ge 0 \quad \forall\, \tilde\sigma \in [-r, r], \tag{6.79}$$
for $\mathcal{E}$-$\mathcal{P}$, and
$$(\dot\sigma - \dot\sigma^p)(\sigma^p - \tilde\sigma) \ge 0 \quad \forall\, \tilde\sigma \in [-r, r], \tag{6.80}$$
for $\mathcal{E}/\mathcal{P}$. In (6.79), $\sigma$ is determined from the strain $\epsilon$, whereas in (6.80), the plastic stress $\sigma^p$, and hence $\epsilon = \frac{1}{E}(\sigma - \sigma^p)$, are determined from the stress $\sigma$. Both variational inequalities have the same form, i.e., for a given function $v : [0,T] \to \mathbb{R}$, we look for $w : [0,T] \to \mathbb{R}$ such that
$$(\dot v(t) - \dot w(t))(w(t) - x) \ge 0 \quad \text{a.e. in } (0,T) \text{ for all } x \in [-r, r], \tag{6.81(i)}$$
$$w(t) \in [-r, r] \ \text{a.e. in } [0,T], \qquad w(0) = w_0. \tag{6.81(ii)}$$
It is well known in the theory of variational inequalities (see, e.g., Duvaut-Lions [1976] and Brezis [1973]) that (6.81) has a unique solution $w \in H^1(0,T)$. The correspondence $v \to w$ defines a hysteresis operator $\mathcal{S}_r$, called the stop operator by Krasnosel'skii and Pokrovskii [1989], which, for the initial value
$$w_0 = \min\{r, \max\{-r, v(0)\}\},$$
is related to the play operator $\mathcal{F}_r$ by the relation (6.16(b)).
We can rewrite the constitutive relations for the two elastoplastic elements in terms of the operators $\mathcal{F}_r$ and $\mathcal{S}_r$ as
$$\mathcal{E}\text{-}\mathcal{P}: \quad \sigma = \mathcal{S}_r[E\epsilon], \qquad U = \frac{1}{2E}\big(\mathcal{S}_r[E\epsilon]\big)^2, \tag{6.82}$$
$$\mathcal{E}/\mathcal{P}: \quad E\epsilon = \mathcal{F}_r[\sigma], \qquad U = \frac{1}{2E}\big(\mathcal{F}_r[\sigma]\big)^2. \tag{6.83}$$

Example 6.3 (The Prandtl-Preisach-Ishlinskii model). We consider an infinite series combination
$$\mathcal{N}_0 - \sum_{r>0} \mathcal{N}_r/\mathcal{P}_r. \tag{6.84}$$
We denote by $g(r,\cdot)$ the constitutive function of the nonlinear elastic element $\mathcal{N}_r$; that is,
$$\epsilon_r = g(r, \sigma^e_r), \quad r \ge 0. \tag{6.85}$$
In the composition formula (6.72), we replace the sum by an integral so that (6.84) is described by
$$\epsilon = \epsilon_0 + \int_0^\infty \epsilon_r\, dr, \qquad \sigma = \sigma^e_r + \sigma^p_r \quad \forall\, r > 0, \tag{6.86}$$
$$\sigma^p_r \in [-r, r], \qquad \dot\epsilon_r(\sigma^p_r - \tilde\sigma) \ge 0 \quad \forall\, \tilde\sigma \in [-r, r],\ \forall\, r > 0, \tag{6.87}$$
$$U = G(0, \sigma^e_0) + \int_0^\infty G(r, \sigma^e_r)\, dr, \tag{6.88}$$



where we define $G$ in analogy to (6.67) as
$$G(x, \sigma) = \sigma g(x, \sigma) - \int_0^\sigma g(x, \xi)\, d\xi. \tag{6.89}$$
Variational inequality (6.87) implies that
$$\dot\sigma^e_r(\sigma^p_r - \tilde\sigma) \ge 0 \quad \text{for all } \tilde\sigma \in [-r, r]. \tag{6.90}$$
We, therefore, have
$$\sigma^p_r = \mathcal{S}_r[\sigma], \qquad \sigma^e_r = \mathcal{F}_r[\sigma]. \tag{6.91}$$
Since $\sigma^e_0 = \sigma$, (6.84) can be described in the Preisach operator form as
$$\epsilon = g(0, \sigma) + \int_0^\infty g(r, \mathcal{F}_r[\sigma])\, dr, \tag{6.92}$$
$$U = G(0, \sigma) + \int_0^\infty G(r, \mathcal{F}_r[\sigma])\, dr. \tag{6.93}$$
Let us assume that $g(r, 0) = 0$ for all $r$ and that the partial derivative with respect to the second argument, $\partial_\xi g(x, \xi)$, is non-negative, measurable and sufficiently regular. Brokate-Sprekels [1996b] have shown that
$$q = \left|\frac{d}{dt}\int_0^\infty r\, g(r, \mathcal{F}_r[\sigma])\, dr\right|, \tag{6.94}$$
which motivates us to define the dissipation operator
$$H_d[\sigma] = \int_0^\infty r\, g(r, \mathcal{F}_r[\sigma])\, dr, \tag{6.95}$$
where $H_d$ is a Preisach operator. By integrating (6.94), we obtain the total dissipation of energy as
$$\int_0^T q(t)\, dt = \mathrm{Var}(H_d[\sigma]). \tag{6.96}$$
Thus, we see that the total dissipation of energy can be expressed in terms of the variation of a Preisach operator in a way similar to the accumulated damage. This result is very important in view of the discussion in the introduction regarding hysteresis loops.
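Numerically, (6.95)-(6.96) can be explored with the discretized play operators from Section 6.2; the following sketch is ours, with the linear choice g(r, s) = s and a crude quadrature assumed for illustration:

    import numpy as np

    def dissipated_energy(sigma, radii, weights, g=lambda r, s: s):
        # Discretize H_d[sigma](t) = integral of r * g(r, F_r[sigma](t)) dr
        # and return its total variation in time, cf. (6.95)-(6.96).
        hd = np.zeros(len(sigma))
        for r, wq in zip(radii, weights):
            fr = np.asarray(play(sigma, r))   # play operator, Section 6.2 sketch
            hd += wq * r * g(r, fr)
        return np.sum(np.abs(np.diff(hd)))    # Var(H_d[sigma])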

6.5 Hysteresis in the wave equation

Equation (6.7) can be written in the form of a wave equation with hysteresis as
$$\frac{\partial^2 u}{\partial t^2} = \frac{\partial}{\partial x}\, H\!\left[\frac{\partial u}{\partial x}\right], \tag{6.97}$$
or, equivalently, in the form
$$\frac{\partial}{\partial t}\left(H^{-1}\!\left[\frac{\partial z}{\partial t}\right]\right) = \frac{\partial^2 z}{\partial x^2}, \tag{6.98}$$
for a variable $z$ representing the time integral of the stress,
$$z(x,t) = \int_0^t \sigma(x,\tau)\, d\tau + z^0(x), \qquad z^0(x) = -\int_0^x \frac{\partial}{\partial t}u(\xi, 0)\, d\xi. \tag{6.99}$$
For a solution of (6.97)-(6.98) with appropriate boundary conditions, see Brokate [Theorem 7.2, 1994].

6.6 Problems

Problem 6.1. (i) Show that the play operator is Lipschitz continuous on $W^{1,2}(0,T) = H^1(0,T)$.
(ii) Show that the hysteresis operator is Lipschitz continuous on $W^{1,2}(0,T) = H^1(0,T)$.

Problem 6.2. (i) What do you understand by the height function?
(ii) State Masing's law and the Ramberg-Osgood equation of cyclic plasticity.

Problem 6.3. Show that the composition of two Prandtl operators is also a Prandtl operator. What can you say about the inverse of a Prandtl operator?

Problem 6.4. Write down Maxwell's equations with hysteresis and discuss their solutions.

Problem 6.5. Discuss the accumulated damage in the Safing sensor's process.

Problem 6.6. Discuss the solutions of the Helmholtz equation with hysteresis.

Problem 6.7. Discuss a weak solution of the wave equation with hysteresis. Under what conditions is the solution unique?

Problem 6.8. Solve the Poisson equation with hysteresis by the ... method.

Problem 6.9. Explain the concept of the shape memory effect.


Chapter 7

Appendix

7.1 Introduction to mathematical models and simulation
A1 Introduction.
In every problem related to any aspect of technological and industrial development, three main steps are involved:

1. Mathematical modelling of the physical phenomenon under consideration.

2. Discretization of the model into a discrete algebraic form amenable to numerical solution; in other words, writing the model in an appropriate form which can be expressed in a language acceptable to computers.

3. Evaluation of the programme by the computer, visualization of the result on the monitor, and subsequent realization and interpretation.

According to the McGraw-Hill Encyclopedia of Science and Technology, a model is an entity used to represent some other entity for some well-defined purpose; a system is a collection of entities that interact. A model is a proxy for a particular situation.

Examples of models.

1. A picture or drawing, such as a map used to record geological data, or a solid model used to design a machine component.

2. An idea (mental model), such as the internalized model of a person's relationships with the environment, used to guide behaviour.

3. A description (linguistic model) of the protocol for a biological experiment, or the transcript of a medical operation used to guide and improve procedures.

4. A physical object, like an electronic circuit or an airfoil used in the wind tunnel testing of a new aircraft design.

5. A system of equations and logical expressions (mathematical model or computer simulation), like the mass and energy balance equations that predict the end-product of a chemical reaction, or a computer program that simulates the flight of a space vehicle.
Models are built and developed to help hypothesize, define, explore, understand, simulate, predict, design and communicate some aspect of the original entity for which the model is a substitute. Models provide principal support for the study of every scientific field, and new areas of technology and industrial development also use models extensively. Study of a system, physical situation or phenomenon through its models is more economical. Changes in the structure of a model are easier to implement, and changes in behaviour are easier to isolate, understand and communicate to others. A model can be used to achieve a thorough and deep understanding when direct experimentation with the actual system is very dangerous, disruptive or demanding. A model can also be used to study those properties of a system that have not yet been observed or built, or cannot be built or observed with present technologies.

The discrete algebraic forms are often called the simulation model which, when expressed as a sequence of computer instructions, provides the computer simulation program. The evaluation of the physical phenomenon with the help of a program and a computer is known as a computer experiment. A vast literature exists on the three steps mentioned above, which are known as mathematical modelling, algorithms and programming, and computer simulation, respectively; see, for example, Hockney and Eastwood [1981], Gaylord and Wellin [1994], Wolfram [1990], Crandall [1991], Bahder [1995], Ross [1995], Vvedensky [1994], Eriksson [1995] and Heath [1997]. Software packages like MATLAB, Mathematica and Femlab are nowadays quite popular for scientific computations.

A2 Mathematical models.
The description of a physical situation (phenomenon) can be termed a model. Modelling is an attempt to describe a phenomenon in a logical way. In this process, we identify generalizable principles and processes and frame a mathematical structure (equations - algebraic, difference, differential, partial differential, stochastic differential, integro-differential, matrix, operator; inequalities - variational inequalities; boundary and initial value problems; iterative formulae, etc.) which is used to investigate and simulate new possibilities, leading to an understanding of the intricacies of nature and to solutions of problems in different disciplines of the emerging technologies and related industrial development. Models may also consist of stories, fables and analogies, or of computer programmes, in which case they also permit the simulation of dynamic behaviour. As simulation games, they comprise a set of rules guiding behaviour - gaming simulations for tackling new and unusual situations. Examples of models may range from cars, rail engines, bridges, aircraft, robots, epidemics, electronic equipment, economic equilibrium, pharmaceutical compounds, chemical reactors, economic growth, traffic control, the paper, glass and steel industries, social processes, urban development, population explosion and wars, to environmental pollution and global climate change. Models can be very useful for the scientific understanding of phenomena, the systematic development of technologies, management and planning.

It should be understood clearly that a good model is the best possible representation of a given physical situation or phenomenon; however, its solution may reproduce only a limited set of behaviours of the original due to certain unavoidable factors. A phenomenon may have different models and objectives according to the purpose of the modelling. Classical science emphasizes observation and description, while model developers stress the understanding of processes and their changing behaviour. The validity of a model and its logical admissibility (well-posedness) must be carefully examined. The validity of a model can be divided into four categories: behavioural validity, structural validity, empirical validity and application validity. While modelling a phenomenon of a specific technology or industry, a fairly good understanding of that process and physical situation must be acquired. The objectives of the model must be specified in the first place, as the mathematical structures and variables involved will depend upon the objectives. The determination of physical laws and the identification of parameters may not be an easy task; therefore, multi-disciplinary discussion becomes essential. Existing information may be very useful in developing a model. The conservation laws of mass, energy, motion and electricity are quite useful for modelling technological and industrial problems. Analytical solutions of models in emerging areas of technology are rarely possible. Therefore, the model must be discretized for numerical solution, which may lead to error. The minimization of error, that is, controlling the process of discretization such that the error is minimal, is a vital question in the modelling process. Model-building in areas of the biomedical sciences, biotechnology and the chemical and pharmaceutical industries is a fairly difficult task, and the solution of such models may require the knowledge of new mathematical techniques yet to be invented.

A material such as silicon, germanium or gallium arsenide, whose electrical conductivity increases with temperature and lies between that of metals like silver and copper and insulators like glass and fused quartz, is known as a semiconductor. The phenomenon is due to the thermal generation of equal numbers of negative charge carriers, known as electrons, and positive charge carriers, called holes. Semiconductors of different conductivity types, like n-type and p-type, can be brought together to form a variety of junctions which are the basis of semiconductor devices used as electronic components. The term semiconductor is often used for the devices themselves. Advances in the electronics and computer industry are closely related to semiconductor technology. A systematic study of the mathematical modelling of semiconductor devices was started in the late seventies, keeping in view the rapidly growing demand of this field (see, for example, Markowich [1986], Bank et al. [1990] and Bhattacharya [1994]). Mathematical modelling is essential for appropriate analysis and simulation.

A model is called behaviourally valid if it produces the same behaviour as the original physical phenomenon under similar conditions. If the numerical solution of an original situation coincides with the numerical solution of the corresponding model, then it is called empirically valid. For application validity, we must check that the model delivers correct results in a specific situation set by the model builder (developer). For a comprehensive study of mathematical models, we refer to Taylor [1986], Kapur [1987] and Bellomo and Preziosi [1995].

Algorithm, programming and software. A prescribed set of well-defined rules or instructions for the solution of a problem in a finite number of steps is called an algorithm; for example, the rules for finding the minima of a real-valued function. Some important algorithms for optimization problems have been discussed in Chapter 2, while some other important algorithms like the fast Fourier transform algorithm, the fast wavelet algorithm and the chaos game algorithm are presented in Chapter 5 and Appendix B, respectively. Expressing an algorithm in a formal notation is an important part of a program. Let $P = \{0, 1, 2, 3, \ldots\}$ and $f : P^n \to P$, $n$ a natural number. $f$ is called effectively computable if there is an effective procedure, that is, an algorithm, that correctly calculates $f$. An effective algorithm is one that meets the following specifications:

(i) The finite number of instructions must be such that there is no ambiguity concerning the order in which the instructions are to be carried out.

(ii) For an $n$-tuple $x \in P^n$, the calculation must terminate after a finite number of steps and output $f(x)$, and an input that is not an $n$-tuple in $P^n$ must not output a value.

The study of whether effective algorithms exist to compute a particular quantity forms the basis of the theory of algorithms. Except for the simplest of algorithms, it is difficult to prove whether a particular algorithm is correct or false. In practice, we should be satisfied if the algorithm is valid, that is, it performs the calculation (computation) assigned to it. This involves testing the routine against a variety of instances of the problem and ensuring that it performs satisfactorily for these test cases. If the test set is chosen sufficiently well, there can then be confidence in the algorithm. The study of the performance characteristics of a given algorithm is known as algorithm analysis. Algorithms can be analyzed in terms of their complexity and efficiency. The efficiency can be characterized by its order. If two algorithms for the same problem are of the same order, then they are approximately equally efficient in terms of computation. Algorithm efficiency is a measure of the average execution time necessary for an algorithm to complete work on a set of data. A language or notation used to express an algorithm clearly is known as an algorithmic language. It is usually part of the programming language.

A3 Program.
A program (computer program) is a set of statements that can be submitted as a unit to some computer system and used to direct the behaviour of the system. A procedural program gives a precise definition of the procedure to be followed by the computer system in order to obtain the required results. By contrast, a non-procedural program specifies constraints that must be satisfied by the results that are produced but does not specify the procedure by which these results should be obtained; such a procedure must be determined by the computer system itself. A programming language is a notation for the precise description of computer programs or algorithms. Programming languages are artificial languages in which the syntax and semantics are strictly defined. They serve their purpose but do not have the freedom of expression that is characteristic of a natural language. Some important programming languages are Fortran (latest Fortran-90), C, C++, the Turbo family (Turbo Basic, Turbo C, Turbo Pascal and Turbo Prolog), Algol, Pascal, Basic and Wordstar. For details, see Press et al. [1990], Cargill [1992] and Dowd [1993].

A4 Software.
This term stands for those components of a computer system that are intangible rather than physical. It is generally used to refer to the programs executed by a computer system, as distinct from the physical hardware of that computer system, and to encompass both symbolic and executable forms of such programs. A software library, the same as a program library, is a collection of programs and packages that are made available for common use within similar situations. A typical software library may contain compilers, utility programs, packages for mathematical operations, etc. Before citing some specific sources of good mathematical software, we give below the desirable characteristics that such software should possess.

• Reliability: always works correctly for easy problems.

• Robustness: usually works for hard problems, but fails gracefully and informatively when it does fail.

• Accuracy: produces results as accurate as warranted by the problem and input data, preferably with an estimate of the accuracy achieved.

• Maintainability: is easy to understand and modify.

• Portability: adapts with little or no change to new computing environments.

• Usability: has a convenient and well-documented user interface.

• Applicability: solves a broad range of problems.

These properties often conflict, and it is rare for a software package to satisfy all of them. Nevertheless, this gives mathematical software users some idea of what qualities to look for and developers some worthy goals to strive for. Some general-purpose mathematical software packages are listed here. The software listed is written in Fortran unless otherwise stated. For more information about available mathematical software, we refer to URL http://gams.nist.gov on the Internet's World-Wide Web.
• IMSL (International Mathematical and Statistical Libraries): A commercial product of Visual Numerics Inc., Houston, Texas. A comprehensive library of mathematical software; the full library is available in Fortran, and a subset is available in C. See URL http://www.vni.com.

• NAG (Numerical Algorithms Group): A commercial product of NAG Inc., Downers Grove, Illinois. A comprehensive library of mathematical software; the full library is available in Fortran, and a subset is available in C. See URL http://www.nag.com.

• netlib: A collection of free software from diverse sources available over the Internet. See URL http://www.netlib.org, or send email containing the request "send index" to netlib@ornl.gov, or ftp to one of several websites, such as netlib.bell-labs.com or netlib2.cs.utk.edu.

• NR (Numerical Recipes): A collection of software accompanying the book Numerical Recipes by Press et al. [1992], available in C and Fortran editions.

Scientific computing environments. The software libraries mentioned above contain subroutines that are meant to be called by user-written programs, usually in a conventional programming language such as Fortran or C. An increasingly popular alternative for scientific computing is an interactive environment that provides powerful, conveniently accessible, built-in mathematical capabilities, often combined with sophisticated graphics and a very high-level programming language designed for rapid prototyping of new algorithms. One of the most popular computing environments is MATLAB. Another family of interactive computing environments is based primarily on symbolic rather than numeric computation, often called computer algebra. These packages, which include Axiom, Derive, Macsyma, Maple, Mathematica and Reduce, provide many of the same mathematical and graphical capabilities, in addition to providing symbolic differentiation, integration, equation solving, polynomial manipulation, etc. We present here some details of Mathematica and MATLAB; for others, see Heath [1997] and references therein.

A5 Mathematica in science and technology.
The main purpose of this section is to draw the attention of mathematicians to a very powerful programming language named Mathematica, invented by Wolfram Research, Inc., with its formal announcement on 23rd June, 1988. Until very recently, the computer languages Fortran, C and C++ were mainly used by scientists and engineers for their professional work. These languages are best suited for numerical calculations and the development of large software programs, and do not directly support mathematical calculations as carried out by scientists and engineers. Mathematica handles both types of problems. All important functions of mathematical physics and matrix functions are already built into Mathematica. Its tremendous graphics capability for visualizing results of calculations has made it very popular. Some very significant calculations in different fields, especially solid state physics, genetics and population biology, which could not be carried out in the classical languages, were made possible through this language. In "Mathematica for the Sciences", programs in the Mathematica language carry the main message, with words, diagrams and formulas providing support. A traditional scientific formula might give the solution to a particular kind of equation, but cannot describe the process by which such a solution is found. In a Mathematica formulation, however, one can describe the complete computational process that is required. To reproduce computations, we are only required to execute the Mathematica programs on a computer, applied directly to the problems under consideration. Writing a program in Mathematica requires a small fraction of the time required to write the corresponding program in other classical scientific languages like Fortran and C. Mathematica has symbolic computation capabilities that are lacking in Fortran and C. However, Mathematica is more time-consuming than these languages in certain situations, and this shortcoming can be overcome with a bridge known as MathLink. The MathLink system of communication is a general scheme whereby two running programs can exchange data. MathLink can be used to call Mathematica functions from inside a Fortran or C program. Since most of Mathematica's functions are capable of taking complex arguments, it may be convenient to use Mathematica as a function library from a Fortran or C program. MathLink can also be used so that a Fortran subroutine or C function can be installed into Mathematica and made to behave as if it were a built-in Mathematica function. There are at least two situations where communication between Mathematica, Fortran and C will be appropriate: (i) when a complex program is written in a scientific language like Fortran or C and it works efficiently and is of high quality, and (ii) when Mathematica takes more time. Computations can then be carried out in Fortran or C and shipped back to Mathematica via MathLink. For solutions of the Maxwell equations, the Schrödinger equation, the KdV equation, the L-C tank, and the heat and wave equations, we refer to Bahder [1995] and Crandall [1991].

A6 MATLAB.
MATLAB is a proprietary commercial product of The MathWorks, Inc. (see http://www.mathworks.com). MATLAB, which stands for Matrix Laboratory, is an interactive system that integrates extensive mathematical capabilities, especially in linear algebra, with powerful scientific visualization, a high-level programming language, and a variety of optional "toolboxes" that provide specialized capabilities in particular applications, such as signal processing, image processing, control, system identification, optimization, and statistics. There is also a MATLAB interface for the NAG mathematical software library. MATLAB is available for a wide variety of personal computers, workstations, and supercomputers, and comes in both professional and inexpensive student editions. If MATLAB is not available on your computer system, there are similar, though less powerful, packages that are available by ftp, including octave (ftp.che.utexas.edu/pub/octave) and RLaB (evans.ee.adfa.oz.au/pub/RLab). Other similar commercial products include GAUSS, HiQ, IDL, Mathcad, and PV-WAVE. User programs in such an environment will be much shorter. The environment often has built-in functions for many of the problems we may encounter, which greatly simplifies the interface with such routines because much of the necessary information (array sizes, etc.) is passed implicitly by the environment. An additional bonus is built-in graphics, which avoids having to do this separately in a post-processing phase. Even if one intends to use a standard language such as C or Fortran in the long run, one may still find it beneficial to learn a package such as MATLAB for its usefulness as a rapid prototyping environment in which new algorithms can be tried out quickly and later recoded in a standard language, if necessary, for greater efficiency or compatibility. For acquiring a good knowledge of MATLAB, we refer to Acton [1996], Biran and Breiner [1995], Nakamura [1996], Part-Enander [1996], Redfern and Campbell [1996].

A7 Simulation.
The general meaning of simulation is the imitation of some existing system, or of a system to be manufactured or created. It may be applied in areas like communication network design, the design of sophisticated equipment, the exploration of traffic patterns, and weather forecasting. It has major applications in the fast-growing computer industry. Simulation can be classified into two categories: (i) discrete simulation, and (ii) continuous simulation. For a discrete-event simulation, one considers all significant changes to the state of the system as distinct events that occur at specific points in time; the simulation then achieves the desired behaviour by modelling a sequence of such events, treating each event individually. In the continuous case, changes occur gradually over a period of time and the progress of the changes is tracked. Simulation is, in essence, the examination of a problem without direct experimentation. The simulation of a device is usually taken to be the process of representing the characteristics of the device by examining the operation of analogous systems or processes. Therefore, by providing a description of the device and its operating conditions, a simulation can predict how the actual device would behave. Discrete algebraic equations describe the simulation model which, when expressed as a sequence of computer instructions, provides the computer simulation program. Computer simulation may be regarded as the theoretical exercise of numerically solving a boundary value problem. Simulation can be used to predict accurately the working of complex devices, so that many variations can be evaluated before expensive technology is employed in manufacturing or constructing the optimal choice. It can also be used to obtain information not available at hand, for example, in the study of galaxies and submicron electronic devices. The steps involved in simulation (computer simulation) are:
1. Mathematical modelling of the physical situation or phenomenon.
2. Discretization of the model.
3. Computer programming.
4. Implementation of the program on an appropriate computer (PC/workstation/laptop, etc.).

7.2 Fractal image compression


Fractal image compression. Image compression techniques based on fractals have been developed in the last few years and promise better compression performance. These techniques build on the recognition that fractals can describe natural scenes better than shapes of traditional geometry. There are three main fractal image compression techniques: (i) fractal image compression based on iterated function systems (IFS), (ii) segment-based coding, and (iii) yardstick coding. Here, we confine ourselves to the first technique, where images are compressed into compact IFS codes at the encoding stage, and fractal images are generated to approximate the original image at the decoding stage. For this technique, we refer to Barnsley [1988] and [1996], Barnsley and Hurd [1993], Fisher [1994] and Lu [1997]. This topic is also discussed in Chapter 5.
The word fractal was coined by Benoît Mandelbrot from the Latin word fractus, meaning broken, for describing objects that were too irregular to fit into the traditional geometrical setting. There are several definitions of fractals, but we shall treat fractals as fixed points of mappings on the metric space of all compact subsets of a complete metric space into itself. The Cantor set, Sierpinski gasket, Sierpinski carpet and Von Koch curves are examples of fractals. An image can be approximated by these objects, and some numbers/quantities characteristic of these objects can be transmitted or stored; the original image can then be retrieved from these characteristic numbers/quantities. For example, the fern image can be represented by 24 integers, which can be represented by 192 bits (if each integer is represented by 8 bits). On the other hand, if we store the fern pixels, we need 307200 bits (supposing that the image size is 480 lines by 640 pixels). Therefore, we achieve a compression ratio of 1600 to 1 by the IFS technique of fractal image compression. Here, we briefly introduce fractal objects, the IFS Theorem, the Collage Theorem, and methods for finding an IFS for an image and the image for an IFS. Commercial software to implement the process is available.

Cantor set: The Cantor set C is a subset of the metric space X = [0, 1], which is obtained by successive deletions of middle-third open subintervals, as follows:

I_0 = [0, 1],
I_1 = [0, 1/3] ∪ [2/3, 1],
I_2 = [0, 1/9] ∪ [2/9, 3/9] ∪ [6/9, 7/9] ∪ [8/9, 1],
I_3 = [0, 1/27] ∪ [2/27, 3/27] ∪ [6/27, 7/27] ∪ [8/27, 9/27] ∪ [18/27, 19/27] ∪ [20/27, 21/27] ∪ [24/27, 25/27] ∪ [26/27, 1],
I_4 = I_3 minus the middle open third of each interval in I_3,
⋮
I_N = I_{N−1} minus the middle open third of each interval in I_{N−1}.

The Cantor set C is defined as C = ∩_{N=0}^∞ I_N, where

I_0 ⊇ I_1 ⊇ I_2 ⊇ I_3 ⊇ ⋯ ⊇ I_{N−1} ⊇ I_N ⊇ I_{N+1} ⊇ ⋯.

C ≠ ∅ as 0 ∈ C. C is a perfect set.

Figure B1. Sierpinski gasket.

In Figure B1, X_1, Y_1, Z_1 are the middle points of the sides AB, BC and AC, respectively. Remove the triangle with vertices X_1, Y_1 and Z_1. Next, take the middle points of the sides of each of the three left-over triangles (for example, of the triangle AX_1Z_1) and remove the central triangle of each of them.
Continue this process. Whatever is left is called the Sierpinski gasket or Sierpinski triangle. For an interesting account of the Sierpinski gasket, we refer to Stewart [1995]. It may be observed that there is a close connection between chaos and fractals.

Iterated function system.

Theorem B1 (Banach contraction fixed point theorem). Let X be a complete metric space and let T : X → X be a contraction mapping, that is, d(T(x), T(y)) ≤ α d(x, y), 0 ≤ α < 1. Then T has a unique fixed point, that is, there exists a unique u ∈ X such that Tu = u. Furthermore, T^{∘n}(y) → u as n → ∞ for any y ∈ X, where T^{∘n}(·) is defined as follows:

T^{∘0}(x) = x,
T^{∘1}(x) = T(x), T^{∘2}(x) = T(T(x)) = T(T^{∘1}(x)),
T^{∘3}(x) = T(T^{∘2}(x)), …, T^{∘n}(x) = T(T^{∘(n−1)}(x)).

Figure B2. Von Koch curve.

This theorem is a key to the fractal image compression.

Definition B1. Let (X, d) be a complete metric space. X together with a finite set of contraction mappings W_n, n = 1, 2, …, N, with contractivity factors s_n, n = 1, 2, …, N, is called an "iterated function system", abbreviated as IFS, where

s = max{s_n : n = 1, 2, …, N}.

An IFS will be denoted by

{X, W_n, n = 1, 2, …, N, s}, s = max{s_n : n = 1, 2, …, N}.

Example B1. {R; W_1(x) = 0, W_2(x) = (1/2)x + 1/2} is an IFS. Let (X, d) be a complete metric space and H(X) = {K ⊆ X : K compact}. For B ∈ H(X) and x ∈ X,

d(x, B) = min{d(x, y) : y ∈ B},
d(A, B) = max{d(x, B) : x ∈ A};

in general, d(A, B) ≠ d(B, A). Set

h(A, B) = max{d(A, B), d(B, A)};

then h(·,·) is a metric on H(X), (H(X), h) is a complete metric space, and h(·,·) is called the Hausdorff metric.

Let {R^2, W_n, n = 1, 2, 3, …, N, s} be an IFS, where each W_n is an affine transformation W_n(x, y) = (a_n x + b_n y + e_n, c_n x + d_n y + f_n). The IFS {R^2, W_1, W_2, W_3, …, W_N} can be expressed in a table as follows:

Table B1.
W    a     b     c     d     e     f     p
1    a_1   b_1   c_1   d_1   e_1   f_1   p_1
2    a_2   b_2   c_2   d_2   e_2   f_2   p_2
3    a_3   b_3   c_3   d_3   e_3   f_3   p_3
4    a_4   b_4   c_4   d_4   e_4   f_4   p_4
⋮    ⋮     ⋮     ⋮     ⋮     ⋮     ⋮     ⋮
N    a_N   b_N   c_N   d_N   e_N   f_N   p_N

where

p_i ≈ |a_i d_i − b_i c_i| / Σ_{i=1}^N |a_i d_i − b_i c_i|.

This table is known as the IFS code table. It can be seen that for the IFS of the fern leaf

N = 4, a_1 = 0, a_2 = .2, a_3 = .85, a_4 = .85,
b_1 = 0, b_2 = .26, b_3 = .04, b_4 = .04,
c_1 = 0, c_2 = .23, c_3 = .04, c_4 = .04,
d_1 = .16, d_2 = .22, d_3 = .85, d_4 = .85,
e_1 = 0, e_2 = 0, e_3 = 0, e_4 = 0,
f_1 = 0, f_2 = .2, f_3 = .2, f_4 = .2.

Theorem B2 (The Iterated Function Theorem, IFS). Let {X, W_n, n = 1, 2, …, N} be an iterated function system with contractivity factor s. Then the transformation W defined by

W(B) = W_1(B) ∪ W_2(B) ∪ ⋯ ∪ W_N(B) for all B ∈ H(X)

is a contraction mapping on the complete metric space (H(X), h(·,·)) with contractivity factor s; that is,

h(W(B), W(C)) ≤ s h(B, C) for all B, C ∈ H(X).

W has a unique fixed point A ∈ H(X) satisfying

A = W(A) = ∪_{n=1}^N W_n(A),

and A is given by

A = lim_{n→∞} W^{∘n}(B) for any B ∈ H(X).

Definition B2. The fixed point A ∈ H(X) described in the IFS theorem is called the attractor of the IFS. The attractor is also called a deterministic fractal, or simply a fractal. Let W : R^2 → R^2 be a contraction map; then A and W(A) are shown in Figure B3.

Figure B3. Contraction map.

Relation between mappings on X and H(X).

Lemma B1. Let W_1 be a continuous mapping of the metric space (X, d) into itself. Then W_1 maps H(X) into itself.

Lemma B2. Let V : X → X be a contraction mapping on the metric space (X, d) with contractivity factor s. Then W : H(X) → H(X) defined by W(B) = {V(x) : x ∈ B}, for all B ∈ H(X), is a contraction mapping on (H(X), h(·,·)) with the same contractivity factor.

Lemma B3. Let (X, d) be a metric space and {W_n, n = 1, 2, 3, …, N} be an IFS. Let the contractivity factor for W_n be s_n. Define W : H(X) → H(X) by

W(B) = W_1(B) ∪ W_2(B) ∪ ⋯ ∪ W_N(B) = ∪_{n=1}^N W_n(B)

for each B ∈ H(X). Then W is a contraction mapping with contractivity factor s = max{s_n : n = 1, 2, …, N}.

Example B3. (a) X = [0, 1], W_1(x) = x/3, W_2(x) = x/3 + 2/3. W_1 and W_2 are contraction maps of X into itself. The deterministic fractal, or fractal, of the IFS system {X, W_n, n = 1, 2} is the Cantor set.
(b) X = [0, 1] × [0, 1],

W_i(x, y) = T_i (x, y)^T + b_i, i = 1, 2, 3,

where each T_i is the 2 × 2 matrix with diagonal entries 1/2 and zero off-diagonal entries, and

b_1 = (0, 0)^T, b_2 = (1/2, 0)^T, b_3 = (0, 1/2)^T.

The fixed point A of the map W : H(X) → H(X) defined as W(B) = ∪_{n=1}^3 W_n(B) is the Sierpinski gasket.
(c) The Sierpinski gasket and the Von Koch curve are examples of fractals (deterministic fractals).

Theorem B3 (The Collage Theorem). Let B be an arbitrary target image (an element of H(X), X = R^2) and let {X, W_n, n = 1, 2, …, N} be an iterated function system with contractivity factor s, 0 < s < 1. Further, let W(B) be as in Theorem B2, and suppose the Hausdorff distance between B and W(B) is less than ε. Then the Hausdorff distance between B and the attractor A of the given IFS system is less than (1 − s)^{−1} ε.

Remark B1.
(i) Theorem B2 (the Iterated Function Theorem) tells us that each IFS defines a unique fractal image (attractor). It also says that the attractor of the IFS is exactly the same as the union of the transformations of the attractor.
(ii) Small (affine) deformed copies of the target are arranged so that they cover up the target as exactly as possible. This collage of deformed copies determines an IFS (Theorem B2). Theorem B3 tells us that the better the collage, as measured by the Hausdorff distance, the closer the attractor of the IFS will be to the target.
(iii) An interesting consequence of the Collage Theorem is that if the matrix entries in two codes are close, then the attractors of the codes are also close. In other words, small errors in codes lead to small errors in images.

Methods for finding an IFS for an image. Let S be an image for which we want to find an IFS. We can split the whole image into non-overlapping segments whose union covers the whole image. Each segment is similar to the whole image and can be obtained by a transformation of the whole image. If we can find transformations for each segment, the combination of these transformations is the IFS encoding of the original image. In other words, to encode the image into an IFS is to find a set of contractive affine transformations W_1, W_2, …, W_N, so that the original image S is the union of N subimages:

S = W_1(S) ∪ W_2(S) ∪ ⋯ ∪ W_N(S).

The IFS for the Sierpinski triangle can be found as follows.

Figure B4a. Sierpinski triangle with vertices (x_1, y_1), (x_2, y_2), (x_3, y_3).

The Sierpinski triangle is the union of three small triangles. Each small triangle is a transformation of the original Sierpinski triangle. The IFS of the Sierpinski triangle is the collection of these three transformations. The general form of the transformation representing the triangles is

W(x, y) = (a x + b y + e, c x + d y + f).

To determine the transformation, we are required to find the values of a, b, c, d, e, f. The top triangle is obtained from the original triangle through transformation of the points (x_1, y_1) to (x_1', y_1'), (x_2, y_2) to (x_2', y_2') and (x_3, y_3) to (x_3', y_3'). We have the following six linear equations:

Figure B4b. Triangle with vertices (0, 0), (2, 0) and (1, 1).

a x_1 + b y_1 + e = x_1'
a x_2 + b y_2 + e = x_2'
a x_3 + b y_3 + e = x_3'
c x_1 + d y_1 + f = y_1'
c x_2 + d y_2 + f = y_2'
c x_3 + d y_3 + f = y_3'.

Putting in the values of the coordinates of Figure B4b and solving this system of equations, we get a = 0.5, b = 0, c = 0, d = 0.5, e = 0.5 and f = 0.5. In a similar manner, we can find the values of the six variables for the bottom left and bottom right triangles.
Thus, the IFS of the Sierpinski gasket of Figure B4b is {R^2, W_n, n = 1, 2, 3}, where

W_1(x, y) = (0.5x + 0.5, 0.5y + 0.5),
W_2(x, y) = (0.5x, 0.5y),
W_3(x, y) = (0.5x + 1, 0.5y).

It may be observed that the IFS of an image is not unique; there may be many IFSs for an image. However, an IFS has a unique attractor (fractal).

The IFS codes of Figure B4b are given in Table B2 below.



Table B2.
W     a     b    c     d     e     f     p
W_1   0.5   0    0    0.5   0.5   0.5   p_1
W_2   0.5   0    0    0.5    0     0    p_2
W_3   0.5   0    0    0.5    1     0    p_3

where

p_i ≈ |a_i d_i − b_i c_i| / Σ_{i=1}^3 |a_i d_i − b_i c_i|.

Let a target image S lie in a rectangle, and let S be digitized or a polygonalized approximation; S is rendered on a computer graphics monitor. An affine transformation W_1 of the general form above is introduced, with coefficients initialized at a_1 = d_1 = 0.25, b_1 = c_1 = 0.25.
The image W_1(S) is displayed on the monitor in a different colour from S. The image W_1(S) is a quarter-sized copy of S, centered closer to the point (0, 0). The user now interactively adjusts the coefficients by specifying changes with a mouse or some other interaction technique, so that the image W_1(S) is variously translated, rotated, and sheared on the screen. The goal of the user is to transform W_1(S) so that it lies over a part of S. It is important that the dimensions of W_1(S) are smaller than those of S, to ensure that W_1 is a contraction. Once W_1(S) is suitably positioned, it is fixed, its coefficients are recorded, and a new affine transformation W_2 and a new subcopy W_2(S) are introduced. The transformation W_2 is adjusted interactively until W_2(S) covers a subset of those pixels in S that are not in W_1(S). Overlapping is permissible, but it should be kept to a minimum.
In this manner, the user determines a set of contractive affine transformations {W_1, W_2, W_3, …, W_N} with this property: the original target S and the set

S' = ∪_{n=1}^N W_n(S)

are virtually close, while N is as small as possible. The mathematical indicator of the closeness of S and S' is the Hausdorff distance between them, h(S, S'). By 'virtually close', we mean that h(S, S') is small. The coefficients of the transformations {W_1, W_2, W_3, …, W_N} thus determined are stored. The Collage Theorem assures us that the attractor A of this IFS will also be virtually close to S. Moreover, if S = S', then A = S. There are several interesting algorithms for computing the attractor of an IFS, such as:
(i) the photocopy machine algorithm (see Barnsley and Hurd [1993, pp. 98-99]), sketched below;
(ii) the chaos game algorithm (see Lu [1993]).
For more details, we also refer to Fisher [1994].

7.3 Some basic results


C1 Linear Algebra.
A vector space (linear space) X over the field of real numbers R is a set X together with mappings (x, y) → x + y of X × X into X (+, the operation of addition) and (α, x) → αx of R × X into X (·, the operation of scalar multiplication) having the following properties:
(i) x + y = y + x,
(ii) x + (y + z) = (x + y) + z,
(iii) there exists 0 ∈ X with x + 0 = x for all x ∈ X,
(iv) for each x ∈ X there exists x' ∈ X with x + x' = 0,
(v) (α + β)x = αx + βx for all α, β ∈ R and x ∈ X,
(vi) α(x + y) = αx + αy for all α ∈ R and x, y ∈ X,
(vii) (αβ)x = α(βx) for all α, β ∈ R and x ∈ X,
(viii) 1 · x = x.
A subset Y of X which is also a vector space with respect to the same operations of addition and scalar multiplication is called a subspace.

Examples of a vector space.


(i) The set of real numbers R is a vector space.
(ii) The set R^n, the Euclidean space of dimension n, with elements of the type x = (x_1, x_2, …, x_n), x_i ∈ R, is a vector space over R.
(iii) The set of complex numbers C is a vector space over R.
(iv) The set C[a, b] of all continuous functions on [a, b] into R is a vector space over R.
(v) The set of all bounded real sequences is a vector space.
(vi) The set l_p = { {x_n} : Σ_{n=1}^∞ |x_n|^p < ∞ }, 1 ≤ p < ∞, is a vector space.
(vii) The set L_p of all measurable functions f such that |f|^p, 1 ≤ p < ∞, is Lebesgue integrable is a vector space.
A set of n linearly independent vectors {e_i}_{i=1}^n is called a basis of a vector space X if every element x ∈ X can be written as x = Σ_{i=1}^n α_i e_i. The number of vectors in a basis of X is called the dimension of X. X is called a finite-dimensional vector space of dimension n if the number of elements in the basis is finite, say n. The vector

x = Σ_{i=1}^n α_i e_i ∈ X

will always be represented in matrix notation by the column vector

x = (α_1, α_2, …, α_n)^T.

x^T = (α_1, α_2, …, α_n) is known as the row vector; the row vector x^T is the transpose of the column vector.
Let X be a vector space of finite dimension n over R. The function (·,·) : X × X → R,

(x, y) = x^T y = Σ_{i=1}^n α_i β_i, where x = Σα_i e_i and y = Σβ_i e_i,

is called the Euclidean scalar product. Two vectors x and y of X are orthogonal, denoted by x⊥y, if (x, y) = 0. A vector x of X is called orthogonal to a subset Y of X, denoted by x⊥Y, if x⊥y for all y ∈ Y. A set {x_1, x_2, …, x_k} of vectors in X is called orthonormal if (x_i, x_j) = δ_ij, 1 ≤ i, j ≤ k, where δ_ij is the Kronecker delta: δ_ij = 1 if i = j, δ_ij = 0 if i ≠ j.
Let X and Y be two vector spaces equipped with the bases (e_j)_{j=1}^n and (f_i)_{i=1}^m, respectively. Relative to these bases, a linear operator A : X → Y is represented by the matrix having m rows and n columns,

A = ( a_11  a_12  ⋯  a_1n )
    ( a_21  a_22  ⋯  a_2n )
    (  ⋮     ⋮         ⋮  )
    ( a_m1  a_m2  ⋯  a_mn ),

the elements a_ij of the matrix A being defined uniquely by the relations

Ae_j = Σ_{i=1}^m a_ij f_i, 1 ≤ j ≤ n.

Equivalently, the j-th column vector

(a_1j, a_2j, …, a_mj)^T

of the matrix A represents the vector Ae_j relative to the basis (f_i)_{i=1}^m. We call

(a_i1, a_i2, …, a_in)

the i-th row vector of the matrix A.


A matrix with m rows and n columns is called a matrix of type (m, n). A column vector is a matrix of type (m, 1) and a row vector a matrix of type (1, n). A matrix A with elements a_ij is written as

A = (a_ij), i = 1, …, m, j = 1, 2, …, n,

where the first subscript i always denotes the row and the second subscript j denotes the column. A_mn(R) denotes the space of all m × n matrices with a_ij ∈ R. The transpose of the matrix A = (a_ij), i = 1, …, m, j = 1, 2, …, n, is the matrix A^T = (a_ji); equivalently, (Ax, y)_m = (x, A^T y)_n. If A = (a_ik) and B = (b_kj) are matrices of (m, l)- and (l, n)-type, respectively, then their product AB is the matrix of type (m, n) defined by

(AB)_ij = Σ_{k=1}^l a_ik b_kj.

It can be seen that (AB)^T = B^T A^T. A = (a_ij), i = 1, 2, …, m, j = 1, 2, …, n, is called a square matrix of order n if m = n. A_nn(R) is a ring, which is called the ring of square matrices of order n. The elements a_ii of a square matrix A = (a_ij), i = 1, 2, …, n, j = 1, 2, …, n, are called the diagonal elements, and the elements a_ij, i ≠ j, are called the off-diagonal elements. I = (δ_ij), i, j = 1, 2, …, n, is called the identity matrix. A square matrix A = (a_ij) is called an invertible matrix, and its inverse is denoted by A^{−1}, if AA^{−1} = A^{−1}A = I; otherwise A is called singular. If A and B are invertible matrices, then

(AB)^{−1} = B^{−1}A^{−1}.

A is called
(i) symmetric if a_ij = a_ji, that is, A = A^T,
(ii) orthogonal if AA^T = A^T A = I,
(iii) normal if AA^T = A^T A, and
(iv) diagonal if a_ij = 0 for i ≠ j.


The trace of A, denoted by tr(A), is defined by tr(A) = Σ_{i=1}^n a_ii; tr(AB) = tr(BA), tr(A + B) = tr(A) + tr(B). The concepts of eigenvalues, eigenvectors and spectrum for matrices can be derived from the discussion of these concepts for linear operators mentioned in the next section. The matrix equation Ax = b, that is,

Σ_{j=1}^n a_ij x_j = b_i, i = 1, 2, …, n,     (C1.1)

has a unique solution if

Σ_{j=1}^n |−a_ij + δ_ij| ≤ k < 1, i = 1, 2, …, n.

Iterative methods for the solution of Ax = b. Given an invertible matrix A and a vector b, one would like to find the solution u of the linear system

Ax = b.

Assume that a matrix B and a vector c can be found such that (I − B) is invertible and the unique solution u of the linear system

u = Bu + c

is also the solution of Ax = b. The form of the system u = Bu + c suggests the definition of an iterative method for Ax = b. Choosing an arbitrary initial vector u_0, the sequence of vectors {u_k} is defined by

u_{k+1} = Bu_k + c, k ≥ 0.

We say that the iterative method is convergent if the sequence u_k converges to u for every initial vector u_0, that is,

lim_{k→∞} u_k = u for every choice of u_0.

The iterative method is convergent if and only if the spectral radius of B is less than 1; a sufficient condition is that ||B|| < 1 for some matrix norm. The matrix B is constructed from the matrix A. There are many methods for constructing B from A, for example, the Jacobi, Gauss-Seidel and relaxation methods. For details of these methods, we refer to Ciarlet [1989], Datta [1995], Golub and Van Loan [1996] and Press, Teukolsky et al. [1992].

C2 Banach and Hilbert spaces.


Let X be a non-empty set; then a mapping d : X × X → R_+ of X × X into the set of non-negative real numbers R_+ is called a metric on X if it satisfies the following conditions:
(i) d(x, y) ≥ 0, and d(x, y) = 0 if and only if x = y,
(ii) d(x, y) = d(y, x) for all x, y ∈ X, and
(iii) d(x, y) ≤ d(x, z) + d(z, y) for all x, y, z ∈ X.
The metric is a generalization of the notion of distance on the real line. A non-empty set X equipped with the metric d(·,·) is called a metric space. It is clear that (R, d(·,·)) is a metric space with d(x, y) = |x − y| for x, y ∈ R. R^n, n ≥ 1, is a metric space with respect to

d(x, y) = (Σ_{i=1}^n |x_i − y_i|^p)^{1/p}, 1 ≤ p < ∞,

for every

x = (x_1, x_2, …, x_n) and y = (y_1, y_2, …, y_n) ∈ R^n.


C[a, b] is a metric space with respect to

d(f, g) = max_{a≤x≤b} |f(x) − g(x)|, f, g ∈ C[a, b],

and also with respect to

d(f, g) = (∫_a^b |f(x) − g(x)|^2 dx)^{1/2}, f, g ∈ C[a, b].

The set of all bounded real sequences, denoted by m, is a metric space with respect to

d(x, y) = sup_i |x_i − y_i|, x = {x_i} and y = {y_i} belonging to m.

(0, 1] is a metric space with respect to d(x, y) = |x − y|. Every metric space is a Hausdorff topological space.
Let X be a vector space over the field of real numbers R. A function defined on X into R, denoted by || · ||, is called a norm if it satisfies the following conditions:
(i) ||x|| ≥ 0, and ||x|| = 0 if and only if x = 0,
(ii) ||αx|| = |α| ||x||, α ∈ R, x ∈ X, and
(iii) ||x + y|| ≤ ||x|| + ||y||, x, y ∈ X.
(X, || · ||) is called a real normed linear space, or simply a normed space. One can define normed linear spaces over the field of complex numbers, and most of the results mentioned here remain true for such spaces. However, here we confine ourselves to the real case. Every normed space is a metric space, but the converse may not be true. R, R^n, C[a, b], l_p, 1 ≤ p < ∞, m, and L_p, 1 ≤ p < ∞, are normed spaces. A normed linear space in which every Cauchy sequence is convergent, that is,

||x_n − x_m|| → 0 as m, n → ∞ ⟹ there exists x ∈ X such that ||x_n − x|| → 0 as n → ∞,

is called a Banach space. A normed linear space is called finite-dimensional if the underlying vector space is finite-dimensional.
Every finite-dimensional normed space is a Banach space. R^n is a Banach space with respect to the norms

||x|| = (Σ_{i=1}^n |x_i|^2)^{1/2},

||x|| = (Σ_{i=1}^n |x_i|^p)^{1/p}, 1 ≤ p < ∞,

||x|| = max_{1≤i≤n} |x_i|.

C[a, b] is a Banach space with respect to the norm

||f|| = sup_{a≤x≤b} |f(x)|.

C[a, b] is a normed space with respect to the norm ||f|| = (∫_a^b |f(x)|^2 dx)^{1/2}, but it is not a Banach space.
l_p, 1 ≤ p < ∞, is a Banach space with respect to the norm

||x|| = (Σ_{i=1}^∞ |x_i|^p)^{1/p}, x = (x_1, x_2, …, x_n, …) ∈ l_p, 1 ≤ p < ∞.

L_p, 1 ≤ p < ∞, is a Banach space with respect to the norm

||f|| = (∫_a^b |f(x)|^p dx)^{1/p},

where L_p = {f : [a, b] → R measurable : |f|^p is Lebesgue integrable}. We say that two elements f and g of L_p are equal if they are equal almost everywhere.
A mapping T of a normed linear space X into another normed space Y over the field R is called a linear operator if
(i) T(x + y) = Tx + Ty for every x, y ∈ X, and
(ii) T(αx) = αT(x) for every α ∈ R and x ∈ X.
If Y = R, then the operator is called a functional; that is, a functional is a mapping of a normed space into R. A linear operator is bounded if there exists k > 0 such that ||Tx|| ≤ k||x|| for all x ∈ X. The concepts of boundedness, continuity, and continuity at '0' are equivalent. The set of all bounded linear operators of a normed space X into a normed linear space Y, denoted by BL[X, Y], is a normed linear space with respect to the following norm:

||T|| = inf {k : ||Tx|| ≤ k||x|| for all x ∈ X}.

The operations of addition and scalar multiplication are defined as

(T + S)(x) = Tx + Sx,
(αT)(x) = αT(x).

BL[X, Y] is a Banach space if Y is a Banach space. Thus the set of all bounded linear functionals is a Banach space. It is denoted by X* or X' and is known as the dual or adjoint or conjugate space of X. We say that X and Y are algebraically and topologically equivalent, denoted by X = Y or X ≅ Y, if there exists a mapping T : X → Y such that
(i) T(x + y) = Tx + Ty,
(ii) T(αx) = αT(x),
(iii) ||Tx|| = ||x||,
(iv) T is 1-1, and
(v) T is onto.
It can be proved that

(R^n)* = R^n, (l_p)* = l_q, 1/p + 1/q = 1,
(L_p)* = L_q, 1/p + 1/q = 1.

A list of the dual spaces of a fairly large number of normed spaces can be found in Siddiqi [1986, Table 1.1, 34-35]. The operator mapping a continuous function on [a, b] to its integral over [a, b] is a linear and bounded functional; that is, T : C[a, b] → R defined by Tf = ∫_a^b f(t) dt is linear and bounded.
Differential equations, integral equations, and systems of algebraic equations can be written in the form of an operator equation Tx = y, where T is a linear operator of a normed linear space into itself or into another normed linear space. For example, a system of n linear equations,

a_11 x_1 + a_12 x_2 + ⋯ + a_1n x_n = b_1
a_21 x_1 + a_22 x_2 + ⋯ + a_2n x_n = b_2
⋮
a_n1 x_1 + a_n2 x_2 + ⋯ + a_nn x_n = b_n,

can be written in the form

Tx = y, where x = (x_1, x_2, …, x_n) ∈ R^n and y = (y_1, y_2, …, y_n),

and

y_i = Σ_{j=1}^n (−a_ij + δ_ij) x_j + b_i,

where i = 1, 2, …, n, and (a_ij), i, j = 1, 2, …, n, is an n × n matrix. The differential equation dy/dx = f(x, y) can be written in the form Ty = f, where y ∈ C^1[a, b] = the space of all functions on [a, b] whose first derivative exists and is continuous on [a, b].
functions on [a, b] whose first derivative exists and is continuous on [a, b] .
Let

h(x) = g(x) + μ ∫_a^b K(x, y) f(y) dy

be an integral equation, where K(x, y) is a measurable function on the square A = {(x, y) : a ≤ x ≤ b, a ≤ y ≤ b} such that

∫_a^b ∫_a^b |K(x, y)|^2 dx dy < ∞,

and g(x) ∈ L_2(a, b). Then the integral equation can be written in the form Tf = h, where T : L_2(a, b) → L_2(a, b) is a linear operator.
The set of n × n matrices, n finite, is a Banach space, in fact a Banach algebra,

with respect to the following norms: for a matrix A = (a_ij), i, j = 1, 2, …, n,

||A|| = sup_{i,j} |a_ij|, ||A|| = Σ_{i=1}^n Σ_{j=1}^n |a_ij|.

A subset M of a vector space X is called convex if αx + βy ∈ M for α, β ∈ R, α + β = 1, α, β ≥ 0, and x, y ∈ M. A functional F defined on a convex subset K of a vector space X is called a convex functional if F(αx + βy) ≤ αF(x) + βF(y), α, β ≥ 0, α + β = 1. Every subspace is a convex set, but the converse may not be true. Every linear functional is a convex functional, but the converse need not be true. An operator T of a normed linear space X into another normed space Y of the type Tx = Sx + b, for all x ∈ X and b ∈ Y (b fixed), is called an affine operator, where S is a linear operator of X into Y. If Y = R, T is called an affine functional. For details, see Siddiqi [1986]. A vector space equipped with a function (·,·) : X × X → R, called an inner product, having the following properties:

(i) (x + x', y) = (x, y) + (x', y),
(ii) (αx, y) = α(x, y), α ∈ R, x, y ∈ X,
(iii) (x, y) = (y, x),
(iv) (x, x) ≥ 0 for all x ∈ X, and (x, x) = 0 if and only if x = 0,
is called an inner product space or a pre-Hilbert space. Every inner product space is a normed linear space with the norm ||x||^2 = (x, x), but the converse may not be true. A pre-Hilbert space which is complete as a normed space, that is, in which every Cauchy sequence is convergent, is called a Hilbert space. A Hilbert space has additional structure, such as the concepts of an angle and of orthogonality between two elements. The following theorems play a very important role in different disciplines like optimization, variational inequalities, optimal control, and mathematical economics.

Theorem C1. Let K be a non-empty, convex, closed subset of a Hilbert space H. Given any element w ∈ H, there exists a unique element Pw such that Pw ∈ K and ||w − Pw|| ≤ ||w − v|| for all v ∈ K; equivalently,

||w − Pw|| = inf_{v∈K} ||w − v||.     (C2.1)

The element Pw ∈ K satisfies

(Pw − w, v − Pw) ≥ 0 for every v ∈ K,     (C2.2)

and, conversely, if an element u satisfies

u ∈ K and (u − w, v − u) ≥ 0 for every v ∈ K,     (C2.3)

then u = Pw.
The function P : H → K defined in this manner satisfies the relation

||Pw_1 − Pw_2|| ≤ ||w_1 − w_2|| for all w_1, w_2 ∈ H.     (C2.4)

The function P is linear if and only if the subset K is a subspace of H, in which case (C2.2) takes the form

(Pw − w, v) = 0 for every v ∈ K.     (C2.5)

Theorem C2. Let M be a closed subspace of a Hilbert space H; then

H = M ⊕ M^⊥,     (C2.6)

where M^⊥ = {y ∈ H : (x, y) = 0 for all x ∈ M}; that is, H is the direct sum of M and M^⊥.

Theorem C3. For each F ∈ H* (the dual space of H), there exists a unique element y ∈ H such that

F(x) = (x, y) for every x ∈ H.

Definition C1. The function P of Theorem C1, defined on H into K, is called the projection operator, and the element Pw is called the projection of the element w onto the set K.

Remark C1.
(i) Relation (C2.1) tells us that Pw is the nearest element of K to w.
(ii) Inequality (C2.2) expresses the necessity for the angle formed by the vectors Pw − w and v − Pw to be less than or equal to π/2 for all elements v ∈ K.
(iii) w − Pw = 0 if and only if w ∈ K.
(iv) If H = R^n and K = R_+^n = {x = (x_1, x_2, …, x_n) : x_i ≥ 0 for all i}, then (Pw)_i = max{w_i, 0}, 1 ≤ i ≤ n, satisfies (C2.2); see the sketch below.

Definition C2. (a) A mapping a(·,·) : H × H → R, where H is a Hilbert space, is called a bilinear form if it is linear in both variables, that is,
(i) a(x_1 + x_2, y) = a(x_1, y) + a(x_2, y),
(ii) a(αx, y) = α a(x, y),
(iii) a(x, y_1 + y_2) = a(x, y_1) + a(x, y_2),
(iv) a(x, βy) = β a(x, y).
The bilinear form a(·,·) is called bounded if there exists k > 0 such that

|a(x, y)| ≤ k ||x|| ||y||.


(b) a(·,·) is called
(i) symmetric if a(x, y) = a(y, x),
(ii) positive if a(x, x) ≥ 0 for all x ∈ H,
(iii) positive definite if a(x, x) ≥ 0 for all x ∈ H and a(x, x) = 0 ⟹ x = 0.
(c) a(·,·) is called coercive if there exists α > 0 such that a(x, x) ≥ α||x||^2 for all x ∈ H.
(d) The norm of a bounded bilinear form a(·,·) is defined by

||a|| = sup_{x≠0, y≠0} |a(x, y)| / (||x|| ||y||)
      = sup_{x≠0, y≠0} |a(x/||x||, y/||y||)|     (C2.7)
      = sup_{||x||=||y||=1} |a(x, y)|.

Theorem C4. Let T be a bounded linear operator of a Hilbert space H into itself. Then a bounded bilinear form a(·,·) can be defined by the relation

a(x, y) = (x, Ty),     (C2.8)

and ||a|| = ||T||. Conversely, let a(·,·) be a bounded bilinear form on H × H into R; then a bounded linear operator T of H into itself can be defined by the relation

(x, Ty) = a(x, y),

such that ||a|| = ||T||.
Theorem C5 (Lax-Milgram Lemma). If a(·,·) is a bounded, bilinear and coercive form on a Hilbert space H, then the functional/operator equation

a(x, y) = F(y) for all y ∈ H ⟺ Tx = y, T : H → H,     (C2.9)

has a unique solution.

Theorem C6 (Fundamental Theorem of the Calculus of Variations). If a(·,·) is a bounded, bilinear, symmetric and coercive form on a Hilbert space H, then finding the solution of (C2.9) is equivalent to finding the solution of the optimization problem for J(v) = (1/2) a(v, v) − F(v), where F ∈ H*.

Examples of bilinear forms:
(i) H = R^n, a(x, y) = Σ_{i=1}^n x_i y_i, where x = (x_1, x_2, …, x_n) and y = (y_1, y_2, …, y_n).
(ii) Let H = H^1(0, 1); then

a(f, g) = ∫_0^1 f(x) g(x) dx + ∫_0^1 (df/dx)(dg/dx) dx

is a bilinear, symmetric and coercive form.
(iii) Let A = (a_ij), i, j = 1, 2, …, n, be an n × n symmetric matrix and H = R^n; then a(u, v) = (Au, v) is a symmetric bilinear form.

7.4 Results from distribution theory and Sobolev spaces
Distribution theory was developed by the French mathematician Laurent Schwartz around 1950 to resolve the discrepancy created by the Dirac delta function, whose value is zero at all points except one while its Lebesgue integral is 1. This function was introduced by the famous physicist P.A.M. Dirac around 1930, and it contradicted the celebrated Lebesgue theory of integration. A class of Lebesgue integrable functions was introduced by the Russian mathematician S.L. Sobolev around 1936 which has been found very useful in many areas of current interest, and is now known as the Sobolev space. The results related to this space have provided a sound foundation for the modern theory of ordinary and partial differential equations, analytical as well as numerical methods. Some important results on these two topics are mentioned here, along with important references. For appreciating and understanding the results of different chapters, especially Chapter 3, these results are very useful.

Let n be a positive integer. A vector or n-tuple α = (α_1, α_2, …, α_n), where the α_i, i = 1, 2, …, n, are non-negative integers, is called a multi-index of dimension n. The number |α| = Σ_{i=1}^n α_i is called the magnitude or length of the multi-index. We write

α! = α_1! ⋯ α_n!,
C_β^α = α! / (β! (α − β)!),
x^α = x_1^{α_1} x_2^{α_2} ⋯ x_n^{α_n}, where x = (x_1, x_2, …, x_n) ∈ R^n.

We say that multi-indices α, β are related by α ≤ β if α_i ≤ β_i for all i = 1, 2, 3, …, n.

We are familiar with the concept of the classical partial derivative for a function of n variables f = f(x), x = (x_1, x_2, …, x_n). Laurent Schwartz introduced the concept of the multi-index and a new notation for derivatives, given below. In the new terminology, D^α will denote the expression

D^α = ∂^{|α|} / (∂x_1^{α_1} ⋯ ∂x_n^{α_n}),

and D^α f will be called a derivative of f of order |α|. For n = 1, α_1 = 1, D^α f = ∂f/∂x_1, which is the classical derivative of a function of a single variable, denoted by df/dx (x_1 = x). For n = 2, α_1 = 1, α_2 = 1,

∂^2 f / (∂x_1 ∂x_2) = D^α f.

This is nothing but the mixed partial derivative of the function of two variables f(x_1, x_2). We also have

D^{(1,1)} f = ∂/∂x_2 (∂f/∂x_1) = ∂/∂x_1 (∂f/∂x_2).

For the distributional derivative, to be defined later, we shall not distinguish between ∂^2/∂x_1∂x_2 and ∂^2/∂x_2∂x_1.

The relationship between the multi-index terminology, the Laplace operator, and the harmonic operator is discussed in Chapter 3 of Rektorys [1980].
All functions are defined on a bounded subset Ω of R^n into R. The boundary of Ω is denoted by Γ or ∂Ω, and Ω̄ = Ω + Γ. We say that f ∈ L_2(Ω) if |f|^2 is Lebesgue integrable on Ω. L_2(Ω) is a Hilbert space with respect to

||f|| = (∫_Ω |f|^2 dx)^{1/2} or (f, g) = ∫_Ω f g dx

(here ∫_Ω dx stands for ∫⋯∫ f(x_1, x_2, …, x_n) dx_1 ⋯ dx_n).
Throughout our discussion, Ω is a bounded subset of R^n with a Lipschitz boundary Γ. The technical definition is slightly complicated; broadly, however, it means that the boundary does not contain cuspidal points or edges. Examples of two- and three-dimensional domains having a boundary satisfying the Lipschitz condition (a Lipschitzian boundary) are circles, squares, triangles, spheres, cubes, annuli, etc. In one dimension, Ω = (a, b); broadly speaking, the function representing a Lipschitzian boundary will be smooth or piecewise smooth and will have no singularity. A function f defined on Ω into R is said to satisfy the Hölder condition with exponent λ, 0 < λ ≤ 1, if there exists a constant M > 0 such that

|f(x) − f(y)| ≤ M ||x − y||^λ for all x, y ∈ Ω,

where || · || is the Euclidean norm on R^n.
Let Ω be a bounded domain, and let there exist positive constants α > 0, β > 0, a finite number of Cartesian coordinate systems x_1^r, x_2^r, …, x_n^r (r = 1, 2, …, m), and m functions a_r(x_1^r, x_2^r, …, x_{n−1}^r) which are continuous on the (n−1)-dimensional closed cubes

E_r: |x_i^r| ≤ α, i = 1, 2, …, n − 1,

such that
(i) every point x of Γ can be expressed in at least one of the m considered coordinate systems in the form

x = (x_r', a_r(x_r')), x_r' = (x_1^r, …, x_{n−1}^r) ∈ E_r;

(ii) points (x_r', x_n^r) satisfying

a_r(x_r') < x_n^r < a_r(x_r') + β, x_r' ∈ E_r,

lie in Ω;
(iii) points (x_r', x_n^r) satisfying

a_r(x_r') − β < x_n^r < a_r(x_r'), x_r' ∈ E_r,

lie outside Ω̄. A boundary Γ satisfying these conditions is called continuous. It is called Lipschitz continuous if there exists a constant M > 0 such that

|a_r(y_r') − a_r(x_r')| ≤ M {Σ_{i=1}^{n−1} (y_i − x_i)^2}^{1/2} for all x_r', y_r' ∈ E_r, r = 1, 2, …, m.

C^0(Ω) = the space of continuous functions on Ω into R.
C^k(Ω) = the space of all functions defined on Ω into R whose derivatives up to the k-th order exist and are continuous.
C^∞(Ω) = ∩_{k=0}^∞ C^k(Ω) = the space of all functions defined on Ω into R having derivatives of all orders.
supp f = closure of {x ∈ Ω : f(x) ≠ 0} is called the support of f. If supp f is compact, f is said to have compact support.
C_0^∞(Ω) denotes the space of all functions in C^∞(Ω) having compact support. It can easily be seen that C_0^∞(Ω) is a vector space with respect to the usual operations.
A sequence {φ_n} in C_0^∞(Ω) is said to converge to an element φ in C_0^∞(Ω), namely φ_n → φ, if (i) there is a fixed compact set K ⊂ Ω such that supp φ_n ⊂ K for all n, and (ii) φ_n and all its derivatives converge uniformly to φ and its derivatives; that is, D^α φ_n → D^α φ uniformly for all α.
C_0^∞(Ω) equipped with the topology induced by this convergence is called the space of test functions and is often denoted by D(Ω). A continuous linear functional defined on D(Ω) is called a distribution or Schwartz distribution (a functional F on D(Ω) is continuous if φ_n → φ ⟹ Fφ_n → Fφ).
The space of distributions is nothing but the dual space of D(Ω) and is often denoted by D'(Ω). If Ω = R^n, we write simply D'.
The distributional derivative, or the derivative of a distribution, is the continuous linear functional defined as follows: for F ∈ D'(Ω),

(D^α F, φ) = (−1)^{|α|} (F, D^α φ) for all φ ∈ D(Ω).

A function f : Ω → R is called locally integrable if for every compact set K ⊂ Ω, ∫_K |f| < ∞, that is, f is Lebesgue integrable over every compact set K.
Every continuous function is locally integrable. Every Lebesgue integrable function on [a, b] is locally integrable over [a, b].
If Ω = S_r = the ball of radius r > 0 and center (0, 0) in R^2, then f(x) = 1/|x| is locally integrable on S_r. Locally integrable functions can be identified with distributions. We mention some results for n = 1 and n = 2. Case n = 1 (distribution theory on the real line):

φ(x) = exp[1/((x − 1)(x − 3))] for 1 < x < 3, and φ(x) = 0 outside the open interval (1, 3); supp φ = [1, 3], and φ(x) is a test function.
φ(x) = exp(−x^{−2}) for x > 0, and φ(x) = 0 for x ≤ 0, is an infinitely differentiable function (it is not a test function on R, since its support is not compact).

(i) F(φ) = φ(0): F is linear and continuous on D(Ω), Ω = (a, b), and so F is a distribution.
(ii) Let F(φ) = ∫_a^b f(x) φ(x) dx, where f is a locally integrable function. F is linear and continuous on D(Ω) and therefore it is a distribution.
(iii) Let F(φ) = ∫_a^b |φ(x)|^2 dx. F is continuous on D(Ω) but not linear and, therefore, F is not a distribution.
(iv) The Dirac delta distribution is defined as

(δ, φ) = φ(0) for all φ in D(Ω).

δ is linear and continuous on D(Ω) and hence a distribution.
A distribution generated by a locally integrable function, as in example (ii), is called a regular distribution; otherwise it is called a singular distribution.
The Heaviside function H is defined by

H(x) = 0 for x < 0, H(x) = 1/2 for x = 0, H(x) = 1 for x > 0.

Let

H_1(x) = 0 for x ≤ 0, H_1(x) = 1 for x > 0.

H and H_1 generate the same distribution, and such functions are identified in distribution theory.

Two distributions F and G are equal if (F, φ) = (G, φ) for all φ ∈ D(Ω) with supp(φ) ⊂ (a, b).
For a comprehensive account of the one-dimensional theory, see Griffel [1987]. The distributional derivative of |x| is sgn(x), defined as

sgn(x) = −1 for x < 0, sgn(x) = 1 for x > 0, sgn(x) = 0 for x = 0.

The distributional derivative of H is the Dirac delta distribution.

Meaning of ∫_Γ f(S) dS. Let Ω be a domain with a Lipschitzian boundary Γ. Every point of the boundary Γ can be expressed in at least one of the considered coordinate systems in the form

x = [(x_1^(r), x_2^(r), …, x_{N−1}^(r)), a_r(x_1^(r), x_2^(r), …, x_{N−1}^(r))].

The surface element above the element dx_1^(r) dx_2^(r) ⋯ dx_{N−1}^(r) is

dS = [1 + (∂a_r/∂x_1^(r))^2 + ⋯ + (∂a_r/∂x_{N−1}^(r))^2]^{1/2} dx_1^(r) ⋯ dx_{N−1}^(r).

If on Γ^(r) a function f(S) = f(x_1^(r), x_2^(r), …, x_{n−1}^(r), a_r(x_1^(r), x_2^(r), …, x_{n−1}^(r))) is given, measurable on E_r, we define

∫_{Γ^(r)} f(S) dS = ∫_{E_r} f(x_1^(r), …, x_{n−1}^(r), a_r(x_1^(r), …, x_{n−1}^(r))) dS.

It may be observed that the values of the function a_r(x_1^(r), …, x_{n−1}^(r)) characterize the part Γ^(r) of the boundary above the cube E_r, where the integral on the right is taken in the Lebesgue sense. If ∫_{Γ^(r)} f^2(S) dS exists, we say that f(S) ∈ L_2(Γ^(r)). If f^2(S) is integrable over Γ^(r) for every r = 1, 2, …, m, we say that f(S) ∈ L_2(Γ).
The integral ∫_Γ f(S) dS can be defined with the aid of a partition of unity: set

f_r(S) = φ_r(S) f(S),

where the functions φ_r(x) have compact supports in the system of neighbourhoods defined earlier and are such that

Σ_{r=1}^m φ_r(x) = 1 for each x ∈ Γ.

From here, we get

Σ_{r=1}^m f_r(S) = f(S) on Γ,

and ∫_Γ f(S) dS is defined by the equation

∫_Γ f(S) dS = Σ_{r=1}^m ∫_{Γ^(r)} f_r(S) dS.

This definition is independent of the choice of the m coordinate systems and of the choice of the functions φ_r; with it, we proceed further.
Let L_2(Γ) = {f(S) : ∫_Γ |f(S)|^2 dS exists} be the Hilbert space with respect to

(f, g) = ∫_Γ f(S) g(S) dS, S ∈ Γ.

The induced norm and metric are given as

||f||_Γ = (f, f)_Γ^{1/2},
d(f, g) = ||f − g||_Γ.
H^m(Ω) = {f ∈ L_2(Ω) : D^α f ∈ L_2(Ω), |α| ≤ m}, where m is any positive integer, is called the Sobolev space of order m.
H^1(Ω) denotes the Sobolev space of order 1, and H^2(Ω) is the Sobolev space of order 2.
H^m(Ω) is a Hilbert space with respect to the inner product

(f, g)_{H^m(Ω)} = Σ_{|α|≤m} (D^α f, D^α g)_{L_2(Ω)}.

For m = 1, Ω = (a, b),

(f, g)_{H^1(a,b)} = ∫_a^b f g dx + ∫_a^b f' g' dx.

It can be easily verified that H^1(a, b) is a Hilbert space.
For m = 2, let Ω = S_r = {(x, y) : x^2 + y^2 ≤ r^2} = the circle with the origin as centre, or Ω = {(x, y) : a < x < b, c < y < d} = a rectangle with sides of length b − a and d − c. Then

(f, g)_{H^2(Ω)} = Σ_{|α|≤2} (D^α f, D^α g)_{L_2(Ω)},
where the sum extends over all multi-indices α with |α| ≤ 2, that is, over f, g and their first and second derivatives:

(f, g)_{H^2(Ω)} = (f, g)_{L_2} + (∂f/∂x, ∂g/∂x)_{L_2} + (∂f/∂y, ∂g/∂y)_{L_2} + (∂^2f/∂x^2, ∂^2g/∂x^2)_{L_2} + (∂^2f/∂x∂y, ∂^2g/∂x∂y)_{L_2} + (∂^2f/∂y^2, ∂^2g/∂y^2)_{L_2}.

The dual space (H_0^m(Ω))* is denoted by H^{−m}(Ω).


The restriction of a function f ∈ H^m(Ω) to the boundary Γ is called the trace of f and is denoted by τf; that is, τf = f(S) = the value of f on Γ.
H_0^m(Ω) = {f ∈ H^m(Ω) : τf = 0} = the closure of D(Ω) in H^m(Ω).
H^m(R^n) = H_0^m(R^n), m ≥ 0.
f ∈ H^{−m}(Ω) if and only if f = Σ_{|α|≤m} D^α g_α for some g_α ∈ L_2(Ω) [Kardestuncer, 1987].

Sobolev space with a real index: H^s(Ω), s ∈ R_+. For 0 < σ < 1, define the seminorm

|u|_σ^2 = ∫_Ω ∫_Ω |u(t) − u(τ)|^2 / ||t − τ||^{n+2σ} dt dτ,

where n is the dimension of the domain Ω. For s > 0 not an integer, write s = k + σ, where k = [s] is an integer and σ = s − k, and define H^s(Ω) to be the closure of C^∞(Ω) with respect to the norm

||u||_s^2 = ||u||_{[s]}^2 + Σ_{|α|=[s]} |D^α u|_σ^2.
H^s(Ω) is a Hilbert space with respect to

(f, g)_{H^s(Ω)} = Σ_{|α|≤k} (D^α f, D^α g)_{L_2(Ω)} if s = k is an integer,

and, for s = k + σ,

(f, g)_{H^s(Ω)} = Σ_{|α|≤k} { (D^α f, D^α g)_{L_2(Ω)}
+ ∫_Ω ∫_Ω (D^α f(x) − D^α f(y))(D^α g(x) − D^α g(y)) / ||x − y||^{n+2σ} dx dy }.
Note: H_p^s(Ω), H_p^m(Ω) and H_p^{−m}(Ω), 1/p + 1/q = 1, 1 ≤ p < ∞, can be defined by replacing L_2(Ω) by L_p(Ω).
The dual of H_0^s(Ω) is denoted by H^{−s}(Ω). H^s(Γ) and H^{−s}(Γ) can also be defined, keeping in mind the definitions of H^m(Ω) and H^s(Ω); see Dautray and Lions [1990, Chapter III] for more details about H^s(Γ). H^s(Γ) is also a Hilbert space for s ∈ R.
It can be observed that every element of H^1(a, b) is absolutely continuous and hence continuous.

Some important inequalities.

Ostrogradski's formula. For a field p = (p_1, p_2, …, p_n),

∫_Ω div p dx = ∫_Γ p · n dS, S ∈ Γ.

Green's formula for integration by parts.
(i) ∫_Ω v Δu dx + ∫_Ω grad u · grad v dx = ∫_Γ v (∂u/∂n) dS.
(ii) ∫_Ω (u Δv − v Δu) dx = ∫_Γ (u ∂v/∂n − v ∂u/∂n) dS.
It is clear that (i) is a generalization of the integration by parts formula stated below (for n = 1, Ω = [a, b]):

∫_a^b u''(x) v(x) dx = u'(b) v(b) − u'(a) v(a) − ∫_a^b u'(x) v'(x) dx.

The Friedrichs inequality. Let Ω be a bounded domain of R^n with a Lipschitz boundary. Then there exists a constant k_1 > 0, depending on the given domain, such that for every f ∈ H^1(Ω),

||f||^2_{L_2(Ω)} ≤ k_1 (Σ_{i=1}^n ||∂f/∂x_i||^2_{L_2(Ω)} + ∫_Γ f^2(S) dS).

The Poincaré inequality. Let Ω be a bounded domain of R^n with a Lipschitz boundary. Then there exists a constant k_2, depending on Ω, such that for every f ∈ H^1(Ω),

||f||^2_{L_2(Ω)} ≤ k_2 (Σ_{i=1}^n ||∂f/∂x_i||^2_{L_2(Ω)} + (∫_Ω f dx)^2).

The norms appearing in these two inequalities are those of L_2(Ω). In particular, for f ∈ H^1(Ω) such that ∫_Ω f(x) dx = 0,

||f||^2_{L_2(Ω)} ≤ k_1 Σ_{i=1}^n ||∂f/∂x_i||^2_{L_2(Ω)}, k_1 > 0 a constant depending on Ω.

Let v = (v_1, v_2, …, v_n) ∈ D'(Ω)^n and set div v = Σ_{i=1}^n ∂v_i/∂x_i. Thus divergence, written in brief div, is an operator of D'(Ω)^n into D'(Ω); usually, from the application point of view, n = 2 or 3.

For the case n = 3 and v = (v_1, v_2, v_3) ∈ D'(Ω)^3, we put

curl v = (∂v_3/∂x_2 − ∂v_2/∂x_3, ∂v_1/∂x_3 − ∂v_3/∂x_1, ∂v_2/∂x_1 − ∂v_1/∂x_2),

which defines the differential operator, denoted by curl, in D'(Ω)^3. Sometimes it is also denoted by ∇∧ or "rot" (rotation).
In the case n = 2, we put

curl v = ∂v_2/∂x_1 − ∂v_1/∂x_2,

the curl operator defined on D(Ω)^2 into D(Ω). A c̃url (also written curl), taking a scalar distribution to a vector one, can be defined as follows:

c̃url v = (∂v/∂x_2, −∂v/∂x_1).

For n = 3, the diagram

D'(Ω) --grad--> D'(Ω)^3 --curl--> D'(Ω)^3 --div--> D'(Ω)
D(Ω) --grad--> D(Ω)^3 --curl--> D(Ω)^3 --div--> D(Ω)

is such that curl grad v = 0 for all v ∈ D'(Ω).
The following relations are satisfied:
(i) Im grad ⊂ ker curl.
(ii) Im curl ⊂ ker div.
(iii) (div v, φ) = (v, −grad φ) for all v ∈ D'(Ω)^3, φ ∈ D(Ω); that is, div is the transpose of −grad.
(iv) (curl v, φ) = (v, curl φ) for all v ∈ D'(Ω)^3, φ ∈ D(Ω)^3.

For n = 2, we have the diagrams

D'(Ω) --grad--> D'(Ω)^2 --curl--> D'(Ω)
D'(Ω) --c̃url--> D'(Ω)^2 --div--> D'(Ω),

such that

curl grad v = 0 for v ∈ D'(Ω),
div c̃url v = 0 for all v ∈ D'(Ω).

Thus

Im grad ⊂ ker curl,
Im c̃url ⊂ ker div,
(curl v, φ) = (v, c̃url φ) for all v ∈ D'(Ω)^2, φ ∈ D(Ω).

Let Ω' be an open set in R^3, with bounded complement Ω̄. Let us put

B(Ω') = {u ∈ D'(Ω') : ∂u/∂x_i ∈ L_2(Ω'), i = 1, 2, 3}.

W(Ω') = the closure of D(Ω̄') under the seminorm ||grad u||_{L_2(Ω')^n}. W(Ω') is called the Beppo Levi space.

7.5 Numerical solutions of linear systems


Here, we present iterative methods required to solve a very large and sparse linear system Ax = b. These include the following:
1. The Jacobi method.
2. The Gauss-Seidel method.
3. The successive overrelaxation (SOR) method.
Another important iterative method for solving such systems, known as the conjugate gradient method, is discussed in Chapter 2.

E1. Fundamental idea.
The basic idea behind an iterative method is first to write the system Ax = b in an equivalent form

x = Bx + d,     (E1)

and then, starting with an initial approximation x^(1) of the solution vector x, to generate a sequence of approximations {x^(k)} iteratively defined by

x^(k+1) = Bx^(k) + d, k = 1, 2, …     (E2)

with the hope that, under certain mild conditions, the sequence {x^(k)} converges to the solution as k → ∞.
To solve the linear system Ax = b iteratively using this idea, we therefore need to know
1. How to write the system Ax = b in the form (E1).
2. Under what conditions the iteration (E2) converges to the solution for an arbitrary choice of x^(1).

E2. Stopping criteria for iteration (E2).
It is natural to wonder when iteration (E2) can be terminated. When convergence occurs, x^(k+1) is a better approximation than x^(k), so a natural stopping criterion is as follows:

Stopping Criterion 1
Stop iteration (E2) if

||x^(k+1) − x^(k)|| / ||x^(k)|| < ε

for a prescribed small positive number ε (ε should be chosen according to the accuracy desired), or if the number of iterations exceeds a predetermined number.

A stopping criterion in terms of the residual is given in the following:

Stopping Criterion 2
Stop iteration (E2) if

||r^(k+1)|| ≤ ε (||A|| · ||x^(k+1)|| + ||b||),

where r^(k+1) = b − Ax^(k+1) is the residual, or if the number of iterations exceeds a predetermined number. ε is the tolerance, to be chosen such that μ < ε < 1, where μ is the machine precision.

E3. The Jacobi method.
The system

Ax = b,

or

a_11 x_1 + a_12 x_2 + ⋯ + a_1n x_n = b_1
a_21 x_1 + a_22 x_2 + ⋯ + a_2n x_n = b_2
⋮
a_n1 x_1 + a_n2 x_2 + ⋯ + a_nn x_n = b_n,

can be written (under the assumption that a_ii ≠ 0, i = 1, …, n) as

x_1 = (1/a_11)(b_1 − a_12 x_2 − ⋯ − a_1n x_n)
x_2 = (1/a_22)(b_2 − a_21 x_1 − ⋯ − a_2n x_n)
⋮
x_n = (1/a_nn)(b_n − a_n1 x_1 − ⋯ − a_{n,n−1} x_{n−1}).

In matrix notation,

x = Bx + d,

where

B = (     0       −a_12/a_11   ⋯   −a_1n/a_11 )
    ( −a_21/a_22       0       ⋯   −a_2n/a_22 )
    (     ⋮                            ⋮      )
    ( −a_n1/a_nn   ⋯   −a_{n,n−1}/a_nn    0   )

and d = (b_1/a_11, b_2/a_22, …, b_n/a_nn)^T.
If we write the matrix A in the form

A = L + D + U,

where L is the strictly lower triangular part of A,

L = (  0    0   ⋯  0 )
    ( a_21  0   ⋯  0 )
    (  ⋮            ⋮ )
    ( a_n1  ⋯  a_{n,n−1}  0 ),

D = diag(a_11, …, a_nn), and U is the strictly upper triangular part,

U = ( 0  a_12  a_13  ⋯  a_1n )
    ( 0   0    a_23  ⋯  a_2n )
    ( ⋮                  ⋮   )
    ( 0   ⋯    0   a_{n−1,n} )
    ( 0   ⋯    0       0     ),

then it is easy to see that

B = −D^{−1}(L + U) = (I − D^{−1}A),
d = D^{−1}b.

(Note that, because of our assumption that a_ii ≠ 0, i = 1, …, n, D is nonsingular.)
We shall call the matrix

B_J = −D^{−1}(L + U) = (I − D^{−1}A)

the Jacobi iteration matrix and denote it by B_J. Similarly, we shall denote the vector D^{−1}b by b_J, which we call the Jacobi vector.

The Jacobi Iteration Matrix and the Jacobi Vector
Let A = L + D + U. Then
B_J = −D^{−1}(L + U),
b_J = D^{−1}b.

With the Jacobi iteration matrix and the Jacobi vector as defined, iteration (E2) becomes the following.
The Jacobi iteration:

x^(k+1) = B_J x^(k) + b_J.

We thus have the following iterative procedure, called the Jacobi method.

ALGORITHM 1. The Jacobi method.
Step 1: Choose x^(1) = (x_1^(1), …, x_n^(1))^T.
Step 2: For k = 1, 2, …, do until a stopping criterion is satisfied:

x^(k+1) = B_J x^(k) + b_J,     (E3)

or

x_i^(k+1) = (1/a_ii) (b_i − Σ_{j=1, j≠i}^n a_ij x_j^(k)), i = 1, …, n.     (E4)

Note: In practical implementation, we will use Equation (E4).



E4. The Gauss-Seidel method.
In the Jacobi method, to compute the components of the vector x^(k+1), only the components of the vector x^(k) are used; however, note that to compute x_i^(k+1), we could have used x_1^(k+1) through x_{i−1}^(k+1), which are already available to us. Thus, a natural modification of the Jacobi method is to rewrite the Jacobi iteration (E4) in the following form:

The Gauss-Seidel Iteration

x_i^(k+1) = (1/a_ii) (b_i − Σ_{j=1}^{i−1} a_ij x_j^(k+1) − Σ_{j=i+1}^n a_ij x_j^(k)),     (E5)
i = 1, 2, …, n.

The idea is to use each new component, as soon as it is available, in the computation of the next component.
The iteration (E5) is known as the Gauss-Seidel iteration, and the iterative method based on this iteration is called the Gauss-Seidel method.
In the notation used earlier, the Gauss-Seidel iteration is

x^(k+1) = −(D + L)^{−1}U x^(k) + (D + L)^{−1}b.

(Note that the matrix D + L is a lower triangular matrix with a_11, …, a_nn on the diagonal; because we have assumed that these entries are nonzero, the matrix (D + L) is nonsingular.)
We call the matrix

−(D + L)^{−1}U

the Gauss-Seidel matrix and denote it by B_GS. Similarly, the Gauss-Seidel vector (D + L)^{−1}b is denoted by b_GS.

The Gauss-Seidel matrix and Gauss-Seidel vector
Let A = L + D + U. Then
B_GS = −(D + L)^{−1}U,
b_GS = (D + L)^{−1}b.

ALGORITHM 2. The Gauss-Seidel method.
Step 1: Choose an initial approximation x^(1).
Step 2: For k = 1, 2, …, do until a stopping criterion is satisfied:

x^(k+1) = B_GS x^(k) + b_GS,     (E6)

or

x_i^(k+1) = (1/a_ii) (b_i − Σ_{j=1}^{i−1} a_ij x_j^(k+1) − Σ_{j=i+1}^n a_ij x_j^(k)), i = 1, 2, …, n.     (E7)

In actual computer implementations, it is certainly economical to use equations (E4) and (E7) rather than (E3) and (E6).
It is often hard to make a good guess of the initial approximation x^(1). Thus, it is desirable to have conditions that guarantee the convergence of iteration (E2) for any arbitrary choice of the initial approximation. We have the following results in this direction.

Theorem EI. (Iteration Convergence Theorem) The iteration

XCk+l) = BxCk) + c
converges to a limit with an arbitrary choice of the initial approximation XCl) if and
only if the matrix BCk) ~ 0 as k ~ 00; that is, B is a eonvergent matrix.

Conditions for convergence of the iteration

A necessary and sufficient condition for the convergence of the iteration (E2), for an arbitrary choice of x^{(1)}, is that ρ(B) < 1. A sufficient condition is that ‖B‖ < 1 for some norm. Here ρ(B) denotes the spectral radius of B.

This result can be applied to identify classes of matrices for which the Jacobi and/or Gauss-Seidel methods converge for any choice of the initial approximation x^{(1)}.

The Jacobi and Gauss-Seidel methods for diagonally dominant matrices.

1. If A is row-diagonally dominant, then the Jacobi method converges for any arbitrary choice of the initial approximation x^{(1)}.

2. If A is a row-diagonally dominant matrix, then the Gauss-Seidel method converges for any arbitrary choice of x^{(1)}.

The Gauss-Seidel method for a symmetric positive definite matrix. The following theorem tells us that the Gauss-Seidel method converges, with an arbitrary choice of x^{(1)}, for a symmetric positive definite matrix.

Theorem E2. Let A be a symmetric positive definite matrix. Then the Gauss-Seidel method converges for any arbitrary choice of the initial approximation x^{(1)}.

Rates of convergence and comparison between the Gauss-Seidel and the Jacobi methods. We have seen that for row-diagonally dominant matrices, both the Jacobi and the Gauss-Seidel methods converge for an arbitrary x^{(1)}. The question naturally arises whether this is true for some other matrices as well. Also, when both methods converge, another question arises: which one converges faster?
If A is a matrix having positive diagonal entries and negative off-diagonal entries, then

1. either both the Jacobi and the Gauss-Seidel methods converge, or both diverge;

2. when both methods converge, the Gauss-Seidel method converges faster than the Jacobi method.

E5. The successive overrelaxation (SOR) method.

The Gauss-Seidel method is frustratingly slow when ρ(B_{GS}) is close to unity. However, the rate of convergence of the Gauss-Seidel iteration can, in certain cases, be improved by introducing a parameter ω, known as the relaxation parameter. The following modified Gauss-Seidel iteration is known as the successive overrelaxation iteration or, in short, the SOR iteration, if ω > 1.

The SOR Iteration

x_i^{(k+1)} = (1 - ω) x_i^{(k)} + \frac{ω}{a_{ii}}\left(b_i - \sum_{j=1}^{i-1} a_{ij} x_j^{(k+1)} - \sum_{j=i+1}^{n} a_{ij} x_j^{(k)}\right), \quad i = 1, 2, \ldots, n.   (E8)

From (E8), we note the following:

1. When ω = 1, the SOR iteration reduces to the Gauss-Seidel iteration.

2. When ω > 1, in computing the (k+1)st approximation, more weight is placed on the most current value than when ω < 1, with the hope that the convergence will be faster.

In matrix notation, the SOR iteration is

x^{(k+1)} = (D + ωL)^{-1}[(1 - ω)D - ωU] x^{(k)} + ω(D + ωL)^{-1} b, \quad k = 1, 2, \ldots   (E9)

(Note that because a_{ii} ≠ 0, i = 1, \ldots, n, the matrix D + ωL is nonsingular.) The matrix (D + ωL)^{-1}[(1 - ω)D - ωU] is called the SOR matrix and is denoted by B_{SOR}. Similarly, the vector ω(D + ωL)^{-1} b is denoted by b_{SOR}.

The SOR Matrix and the SOR Vector

B_{SOR} = (D + ωL)^{-1}[(1 - ω)D - ωU]
b_{SOR} = ω(D + ωL)^{-1} b.

In matrix notation, the SOR iteration algorithm is as follows:

ALGORITHM 3. The successive overrelaxation method.

Step 1: Choose x^{(1)}.
Step 2: For k = 1, 2, \ldots, do until a stopping criterion is satisfied:

x^{(k+1)} = B_{SOR} x^{(k)} + b_{SOR},

or compute x^{(k+1)} using Equation (E8).
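A minimal Python sketch of the componentwise SOR sweep (E8) is given below; the value of the relaxation parameter and all names are illustrative. With omega = 1 the sweep reduces to Gauss-Seidel, in accordance with remark 1 above.

import numpy as np

def sor(A, b, x0, omega=1.5, tol=1e-9, max_iter=1000):
    # SOR iteration (E8): a weighted combination of the old value and the
    # Gauss-Seidel value, with relaxation parameter omega in (0, 2).
    n = len(b)
    x = x0.astype(float)
    for _ in range(max_iter):
        x_old = x.copy()
        for i in range(n):
            s1 = A[i, :i] @ x[:i]
            s2 = A[i, i + 1:] @ x[i + 1:]
            gs = (b[i] - s1 - s2) / A[i, i]          # Gauss-Seidel value
            x[i] = (1.0 - omega) * x[i] + omega * gs
        if np.linalg.norm(x - x_old) < tol:
            break
    return x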
Choice of ω in the convergent SOR iteration. It is natural to ask for which range of ω the SOR iteration converges, and what the optimal choice of ω is. To this end, we mention here the following important results due to William Kahan and to Ostrowski and Reich:

THEOREM (Kahan). The SOR iteration cannot converge for an arbitrary initial approximation if ω lies outside the interval (0, 2).

THEOREM (Ostrowski-Reich). Let A be a symmetric positive definite matrix and let 0 < ω < 2. Then the SOR method converges for any arbitrary choice of x^{(1)}.

For further details of these methods and proofs of the theorems, we refer to Datta [1995]. MATLAB can be used to implement these algorithms, and the MATLAB toolkit MATCOM enables one to study different algorithms for the same problem, comparing efficiency, stability, and accuracy.

7.6 Black-Scholes world of option pricing

F1. Introduction.
This section is based on Wilmott et al. [1995], Wilmott [1998], Siddiqi, Manchanda and Kocvara [1999], and Sircar and Papanicolaou [1998]; see the references in Scholes [1997] and Merton [1997] for more details. Option pricing theory is a relatively new discipline within the economics and banking industry which encompasses mathematical finance theory and finance practice. This special field of finance is the study of the allocation and deployment of economic resources, both spatially and across time, in an uncertain environment. Sophisticated mathematical models and numerical methods described in Chapters 3 and 4 have been included to capture the influence and interaction of time and uncertainty. One of the fastest algorithms for a nonlinear problem in this area will be presented in this appendix, along with some basic results.
In October 1997, Myron Scholes and Robert Merton were awarded the Nobel Prize for Economics for their work, which can be expressed in the form of a linear parabolic partial differential equation (diffusion equation) that enables investors to price accurately their bets on the future. More precisely, it allows investors to calculate the fair price of a derivative security whose value depends on the value of another

security, known as the underlying, based on a small set of assumptions on the price behaviour of that underlying. Fischer Black had died in 1995. Black-Scholes pricing is considered one of modern financial theory's biggest successes in terms of both approach and applicability. Before the publication of their work in 1973, pricing a derivative was a rather mysterious task owing to its often complex dependence on the underlying, and derivatives were traded mainly over the counter rather than in large markets, generally with high transaction costs. The opening of the Chicago Board Options Exchange (CBOE), which first created standardized, listed options, nearly coincided with the publication of the Black-Scholes model, and since then trading in derivatives has become a regular feature. Initially, there were call options on just 16 stocks; put options were introduced in 1977. In the US, options are traded on the CBOE, the American Stock Exchange, the Pacific Stock Exchange and the Philadelphia Stock Exchange. Worldwide, there are more than 50 exchanges on which options are traded. Extensive use of the Black-Scholes model has significantly influenced the market. This feedback effect has been incorporated in the model by Sircar and Papanicolaou [1998], where one obtains a nonlinear parabolic partial differential equation. Some generalized forms of the Black-Scholes model are discussed by Wilmott [1998]. The basic theory of derivatives, comprising products and markets, derivatives, the random behaviour of assets, elementary stochastic calculus, the Black-Scholes equation and its generalizations and their sensitivity analysis, American options, and multi-asset options, is known as the Black-Scholes World of Option Pricing. Options are certain kinds of contracts, many of which have been named European, American, Asian and Russian, but the names have nothing to do with the geographic origin of the contracts; rather they refer to a technicality in the option contract.
For general financial news, one may visit www.bloomberg.com, www.cnnfn.com, www.wsj.com, www.ft.com, www.fow.com, www.cbot.com. For information about training programmes, one may visit www.wilmott.com.

F2. Basic concepts related to European and American options.


Before introducing European and American options, we briefly mention the commonly used terms asset or underlying asset, equity, derivative, equity derivative, expiry, option pricing, call option, put option, strike price (exercise price), risk management, and volatility.
By underlying asset, often called only underlying or asset, we mean a commodity, exchange rate, shares, stocks and bonds, etc. Equity is a share in the ownership of a company which usually guarantees the right to vote at meetings and a share in the dividends (payments to shareholders as return for investment in the corporation). Derivative refers to either a contract or a security whose pay-off or final value is dependent on one or more features of the underlying equity. In many cases it is the price of the underlying equity which determines, to a large extent, the value of the equity derivative or derivative based on equity, although other factors like interest rates, time to maturity and strike price can also play a significant role. The termination time of a derivatives contract, usually when the final pay-off value is calculated and paid, is called expiry. Option pricing concerns options, which are a kind of contract: the right to the holder (owner), and an obligation to the seller (writer), of a contract either to buy or to sell an underlying asset at a fixed price for a premium. In call options the holder has the right but not the obligation to buy the underlying asset at the strike price. Options in which the right to sell for the holder and the obligation to buy for the writer at a strike price E are guaranteed for the payment of a premium are called put options. The strike price or exercise price is the price at which the underlying asset is bought in options. Risk management is the process of establishing the type and magnitude of risk in a business enterprise and using derivatives to control and shape that risk to maximize the business objective. Volatility is a measure of the standard deviation of returns. In practice it is understood as the average daily range of the last few weeks or the average absolute value of the daily net change of the last few weeks. Dividends are lump sum payments disbursed every quarter or every six months to the holder of the stock. Commodities are usually raw products such as precious metals, oil, food products, etc. Prices of these products may depend on seasonal fluctuations and the scarcity of the product and are difficult to predict. Most trading is done on the futures market, making deals to buy or sell the commodity sometime in the future. The exchange rate is the rate at which one currency can be exchanged for another. This is the world of foreign exchange (FX). Although fluctuations in exchange rates are unpredictable, there is a link between exchange rates and the interest rates of the two countries.
F3. Modelling of European options.
A European call option is a contract with the following conditions: at a prescribed time in the future, known as the expiry date, the owner of the option may purchase a prescribed asset, called the underlying asset, for a prescribed amount (the strike price or exercise price). Similarly, a European put option is a contract in which at a prescribed time in the future the owner (holder) of the option may sell an asset for a prescribed amount.
Let V(S, t) denote the value of an option, which is a function of the underlying asset price S and time t. Black and Scholes [1973] proved that V is a solution of the parabolic partial differential equation

\frac{\partial V}{\partial t} + \frac{1}{2} σ^2 S^2 \frac{\partial^2 V}{\partial S^2} + rS \frac{\partial V}{\partial S} - rV = 0,   (F1)

where σ and r are the volatility and the interest rate, respectively.
Let C(S, t) and P(S, t) denote the value of V(S, t) when it is a call option and a put option, respectively. It has been shown (see, for instance, Wilmott et al. [1993]) that a European call option C(S, t) is a solution of the following boundary value problem:

\frac{\partial C}{\partial t} + \frac{1}{2} σ^2 S^2 \frac{\partial^2 C}{\partial S^2} + rS \frac{\partial C}{\partial S} - rC = 0   (F2)

C(S, T) = \max(S - E, 0)   (F3)

C(0, t) = 0   (F4)

C(S, t) → S as S → ∞,   (F5)

where S, σ, r are as above, and E and T are the exercise price and the expiry time, respectively.
On the other hand, a European put option P(S, t) is a solution of the following boundary value problem:

\frac{\partial P}{\partial t} + \frac{1}{2} σ^2 S^2 \frac{\partial^2 P}{\partial S^2} + rS \frac{\partial P}{\partial S} - rP = 0   (F6)

P(S, T) = \max(E - S, 0)   (F7)

P(0, t) = E e^{-r(T-t)}   (F8)

if r is independent of time, and

P(0, t) = E e^{-\int_t^T r(τ)\, dτ}

if r is time dependent. As S → ∞, the option is unlikely to be exercised, and so

P(S, t) → 0 as S → ∞.   (F9)

Equations (F2)-(F5) and (F6)-(F9) are known as the Black-Scholes models for call and put options, respectively.
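The call problem (F2)-(F5) has the classical closed-form solution C = S N(d_1) - E e^{-r(T-t)} N(d_2), the Black-Scholes formula. As a quick illustration, here is a Python sketch of this formula; the function names are ours, and the sample data anticipate the example solved at the end of this section.

from math import log, sqrt, exp, erf

def norm_cdf(x):
    # standard normal distribution function, via the error function
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S, E, r, sigma, T, t=0.0):
    # Black-Scholes value of a European call, solving (F2)-(F5)
    tau = T - t                                   # time to expiry
    d1 = (log(S / E) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    d2 = d1 - sigma * sqrt(tau)
    return S * norm_cdf(d1) - E * exp(-r * tau) * norm_cdf(d2)

# r = 0.10, sigma = 0.4, E = 10 and three months to expiry:
print(bs_call(S=10.0, E=10.0, r=0.10, sigma=0.4, T=0.25))

No comparably simple closed form exists for the American options treated in F4 below, which is why the numerical methods of F5 are needed there.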
The Black-Scholes call option model can be transformed into the diffusion equation

\frac{\partial u}{\partial τ} = \frac{\partial^2 u}{\partial x^2} \quad for -∞ < x < ∞, τ > 0,   (F10)

with

u(x, 0) = \max(e^{\frac{1}{2}(k+1)x} - e^{\frac{1}{2}(k-1)x}, 0),   (F11)

by putting S = Ee^x, t = T - τ/(\frac{1}{2}σ^2) and

C(S, t) = E e^{-\frac{1}{2}(k-1)x - \frac{1}{4}(k+1)^2 τ} u(x, τ),

where k = r/(\frac{1}{2}σ^2).
The Black-Scholes put option model can analogously be written in the form of the diffusion equation.
If it is assumed that the asset pays a continuous and constant dividend D, then the Black-Scholes equation takes the form

\frac{\partial V}{\partial t} + \frac{1}{2} σ^2 S^2 \frac{\partial^2 V}{\partial S^2} + (r - D) S \frac{\partial V}{\partial S} - rV = 0.   (F12)

Currency options are modelled by the following modified form of the Black-Scholes equation:

\frac{\partial V}{\partial t} + \frac{1}{2} σ^2 S^2 \frac{\partial^2 V}{\partial S^2} + (r - r_f) S \frac{\partial V}{\partial S} - rV = 0,   (F13)

where r_f denotes the foreign rate of interest. If we want to model commodity options, then the relevant feature of commodities, namely that they have a cost of carry, must be taken into account in the Black-Scholes equation. This is just like having a negative dividend, and so we get

\frac{\partial V}{\partial t} + \frac{1}{2} σ^2 S^2 \frac{\partial^2 V}{\partial S^2} + (r + q) S \frac{\partial V}{\partial S} - rV = 0,   (F14)

where q = -D denotes the fraction of the value of a commodity that goes towards paying the cost of carry. For details about the models mentioned above see Wilmott [1998]. The following feedback pricing model for a European call option has been studied by Sircar and Papanicolaou [1998]:

\frac{\partial C}{\partial t} + \frac{1}{2} \left[ \frac{1 - ρ \frac{\partial C}{\partial S}}{1 - ρ \frac{\partial C}{\partial S} - ρ S \frac{\partial^2 C}{\partial S^2}} \right]^2 σ^2 S^2 \frac{\partial^2 C}{\partial S^2} + r \left( S \frac{\partial C}{\partial S} - C \right) = 0, \quad t < T - ε   (F15)

C(S, T - ε) = C_{BS}(S, T - ε)   (F16)

C(0, t) = 0   (F17)

\lim_{S → ∞} |C(S, t) - (S - K e^{-r(T-t)})| = 0   (F18)

with

C(S, t) = C_{BS}(S, t)   (F19)

for T - ε ≤ t ≤ T, where C_{BS} denotes the Black-Scholes price and the last condition means that C is equal to C_{BS} in some small interval of time.
The generalized Black-Scholes model incorporating feedback is a nonlinear parabolic equation of the form (F20); we refer to Sircar and Papanicolaou [1998] for its precise statement.

F4. Modelling of American options.

American options are options which can be exercised at any time prior to the expiry time. American call and put options are related to buying and selling, respectively. The valuation of American options leads to free boundary problems. Typically, at each time t, there is a value of S which marks the boundary between two regions: to one side of it one should hold the option, and to the other side one should exercise it. Let us denote this boundary by S_f(t) (generally, this critical asset value varies with time). Since we do not know S_f(t) a priori, we are lacking one piece of information compared with the corresponding European option problem. Thus with American options we do not know a priori where to apply boundary conditions.
This situation resembles the obstacle problem and can be effectively tackled by methods of variational inequalities (see Chapters 2 and 3). Wilmott, Dewynne and Howison [1993] have shown that the American call option and the American put option can be formulated as the following boundary value problems and equivalent variational inequalities.
The American call option is modelled by the following boundary value problem:

\frac{\partial u}{\partial τ} - \frac{\partial^2 u}{\partial x^2} ≥ 0   (F21)

u(x, τ) - g(x, τ) ≥ 0   (F22)

\left( \frac{\partial u}{\partial τ} - \frac{\partial^2 u}{\partial x^2} \right) \cdot (u(x, τ) - g(x, τ)) = 0   (F23)

u(x, 0) = g(x, 0)   (F24)

u(a, τ) = g(a, τ) = 0   (F25)

u(b, τ) = g(b, τ),   (F26)

where g is the transformed pay-off function,

g(x, τ) = e^{\frac{1}{4}(k+1)^2 τ} \max(e^{\frac{1}{2}(k+1)x} - e^{\frac{1}{2}(k-1)x}, 0).   (F27)

The financial variables S, t and the option value C are again computed by putting S = Ee^x, t = T - τ/(\frac{1}{2}σ^2) and

C(S, t) = E e^{-\frac{1}{2}(k-1)x - \frac{1}{4}(k+1)^2 τ} u(x, τ).

In order to avoid technical complications, the problem is restricted to a finite interval (a, b) with -a and b large enough. In financial terms, we assume that we can replace the exact boundary conditions by the approximations that, for the put, P = E - S for small values of S, while P = 0 for large values.
Let us denote by u_τ the function x ↦ u(x, τ). The equivalent parabolic variational inequality is as follows:
Find u = u_τ ∈ K_τ (τ runs over [0, \frac{1}{2}σ^2 T]) such that

\left( \frac{\partial u}{\partial τ}, φ - u \right)_2 + a(u, φ - u) ≥ 0   (F28)

for all φ ∈ K_τ, a.e. τ ∈ (0, \frac{1}{2}σ^2 T), with u(x, 0) = g(x, 0),

where

K_τ := \{ v ∈ H^1(a, b) \mid v(a) = g(a, τ), v(b) = g(b, τ), v(x) ≥ g(x, τ) \}

and (·, ·)_2 denotes the inner product on L^2(a, b). With

W(0, \frac{1}{2}σ^2 T) := \{ v \mid v ∈ L^2(0, \frac{1}{2}σ^2 T; H^1(a, b)), \frac{\partial v}{\partial τ} ∈ L^2(0, \frac{1}{2}σ^2 T; H^{-1}(a, b)) \}

W_0(0, \frac{1}{2}σ^2 T) := \{ v \mid v ∈ W(0, \frac{1}{2}σ^2 T), v(0) = g(·, 0) \}

and

K := \{ v \mid v ∈ W(0, \frac{1}{2}σ^2 T), v_τ ∈ K_τ for a.e. τ ∈ [0, \frac{1}{2}σ^2 T] \}

K_0 := \{ v \mid v ∈ W_0(0, \frac{1}{2}σ^2 T), v_τ ∈ K_τ for a.e. τ ∈ [0, \frac{1}{2}σ^2 T] \},

we can formulate an equivalent variational inequality:

Find u ∈ K_0 such that

\int_0^{\frac{1}{2}σ^2 T} \left( \frac{\partial u}{\partial τ}, φ - u \right)_2 dτ + \int_0^{\frac{1}{2}σ^2 T} a(u, φ - u)\, dτ ≥ 0 \quad for all φ ∈ K.   (F29)

The American put option is modelled by a boundary value problem that differs only in the boundary conditions and the (transformed) pay-off function g:

\frac{\partial u}{\partial τ} - \frac{\partial^2 u}{\partial x^2} ≥ 0   (F30)

u(x, τ) - g(x, τ) ≥ 0   (F31)

\left( \frac{\partial u}{\partial τ} - \frac{\partial^2 u}{\partial x^2} \right) \cdot (u(x, τ) - g(x, τ)) = 0   (F32)

u(x, 0) = g(x, 0)   (F33)

u(a, τ) = g(a, τ)   (F34)

u(b, τ) = g(b, τ) = 0,   (F35)

where

g(x, τ) = e^{\frac{1}{4}(k+1)^2 τ} \max(e^{\frac{1}{2}(k-1)x} - e^{\frac{1}{2}(k+1)x}, 0).   (F36)

The equivalent variational inequality is formulated analogously to (F29) or (F28), the only change being in the boundary conditions and the function g.
It may be remarked that for the American call option, C(S, t) lies above the pay-off \max(S - E, 0); in the transformed variables, this condition takes the form u(x, τ) - g(x, τ) ≥ 0. The condition \frac{\partial u}{\partial τ} - \frac{\partial^2 u}{\partial x^2} ≥ 0 means that the return from the risk-free delta-hedged portfolio is less than the risk-free interest rate r.
The numerical method used to solve the complementarity problems modelling American call and put options is the same in both cases. In Wilmott et al. [1995], it is discussed in considerable detail how the projected successive overrelaxation (SOR) method can be employed for the numerical simulation of American options. In the next section we describe a new, more efficient algorithm and its application to American option pricing, and present a numerical comparison with the projected SOR method.
F5. Numerical simulation of the American call option by a two-step algorithm.
We discretize the boundary value problem (F21)-(F26) by the finite difference method and solve it using the Crank-Nicolson scheme; see, e.g., Wilmott et al. [1993] or Glowinski [1984]. At each time step we need to solve a linear complementarity problem (LCP):
Find u^{m+1} ∈ ℝ^n such that

C u^{m+1} ≥ b^m, \quad u^{m+1} ≥ g^{m+1}

(u^{m+1} - g^{m+1})^T (C u^{m+1} - b^m) = 0.   (F37)

Here C is the n × n real symmetric positive definite tridiagonal matrix

C = \begin{pmatrix}
1 + α & -\frac{1}{2}α & 0 & \cdots & 0 \\
-\frac{1}{2}α & 1 + α & -\frac{1}{2}α & & \vdots \\
0 & -\frac{1}{2}α & \ddots & \ddots & 0 \\
\vdots & & \ddots & 1 + α & -\frac{1}{2}α \\
0 & \cdots & 0 & -\frac{1}{2}α & 1 + α
\end{pmatrix}

with α = δτ/(δx)^2, δτ and δx being the time and space discretization parameters, respectively. The vectors u^{m+1} and g^{m+1} are the discrete counterparts of u(x, τ) and g(x, τ) from (F21)-(F26) at the time step (m+1)δτ, and b^m is a "right-hand side" vector containing information from the previous time step mδτ.
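To make the discretization concrete, the following Python sketch assembles C. Writing C = I + (α/2)T with T the standard second-difference matrix, one consistent Crank-Nicolson right-hand side is b^m = (2I - C)u^m; this is our reading of the scheme, and boundary contributions from (F25)-(F26) are omitted for brevity.

import numpy as np

def cn_matrix(n, alpha):
    # Tridiagonal Crank-Nicolson matrix: 1 + alpha on the diagonal,
    # -alpha/2 on the two off-diagonals.
    off = -0.5 * alpha * np.ones(n - 1)
    return np.diag((1.0 + alpha) * np.ones(n)) + np.diag(off, 1) + np.diag(off, -1)

def cn_rhs(C, u_m):
    # b^m = (2I - C) u^m; boundary terms would be added here.
    return (2.0 * np.eye(len(u_m)) - C) @ u_m

# With delta_x = 0.01 on [-0.5, 0.5] there are n = 99 interior nodes, and
# delta_tau = 1e-4 gives alpha = delta_tau / delta_x**2 = 1 (cf. Table 1 below).
C = cn_matrix(99, alpha=1.0)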
Note that the matrix C is large and sparse and that the problem (F37) has to be solved repeatedly, once in each time step. Thus we need a fast LCP solver (in this application, literally, time is money). Several algorithms have been proposed for LCPs with a large sparse symmetric positive definite matrix. These algorithms are either based on methods for solving linear systems, like the successive overrelaxation (SOR) method or the preconditioned conjugate gradient (PCG) method, or on optimization methods for convex quadratic programs, like the gradient projection method or interior-point methods.
Recently, a two-step algorithm has been proposed by Kocvara and Zowe [1994]. This algorithm, based on ideas of multigrid methods, combines the efficiency of the PCG method for solving linear systems with the ability of SOR to smooth down the solution error. The smoothing property of a variant of SOR with a projection onto a feasible set, called SORP, enables fast detection of the active set

I(u) := \{ i \mid u_i = g_i \}.

In the above definition and in the rest of this section we skip the time step index and write (F37) as:
Find u ∈ ℝ^n such that

Cu ≥ b, \quad u ≥ g
(u - g)^T (Cu - b) = 0.   (F38)

Instead, we will use the upper index to denote the successive iterates; i.e., u^k will be the kth iterate of a particular method. By u^* we denote the (unique) solution to

(F38). We further denote by C_{ij} the (i, j)-component of the matrix C. Finally, let us define the feasible set of (F38):

S := \{ v ∈ ℝ^n \mid v_i ≥ g_i, \; i = 1, 2, \ldots, n \}.

The two-step algorithm mentioned above has proved to be very efficient for large LCPs. The examples in Kocvara and Zowe [1994] even indicate that the algorithm, based on PCG, is asymptotically as fast as PCG for linear systems. It is our strong belief that the algorithm fits naturally to our LCP and significantly improves the efficiency of the overall time stepping procedure. In the following text we describe the algorithm in detail.
We first recall the definition of SORP:

SORP
Choose x^0 ∈ ℝ^n and put for k = 0, 1, 2, \ldots

x_i^{k+1} = \max\left\{ x_i^k - \frac{ω}{C_{ii}} \left( \sum_{j<i} C_{ij} x_j^{k+1} + \sum_{j≥i} C_{ij} x_j^k - b_i \right), \; g_i \right\}, \quad i = 1, 2, \ldots, n,   (F39)

where ω ∈ (0, 2) is a relaxation parameter.
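A minimal Python sketch of SORP as written in (F39) follows; the stopping rule mirrors the criterion used in the numerical experiments below, and the names are ours.

import numpy as np

def sorp(C, b, g, x0, omega=1.9, tol=1e-9, max_iter=100000):
    # Projected SOR (F39) for the LCP (F38): an SOR step followed by
    # projection onto the feasible set S = { v : v_i >= g_i }.
    n = len(b)
    x = np.maximum(x0.astype(float), g)   # start from a feasible point
    for _ in range(max_iter):
        x_old = x.copy()
        for i in range(n):
            # residual uses updated components for j < i, old ones for j >= i
            r = C[i, :i] @ x[:i] + C[i, i:] @ x_old[i:] - b[i]
            x[i] = max(x_old[i] - omega * r / C[i, i], g[i])
        if np.linalg.norm(x - x_old) < tol:
            break
    return x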

We will work with a symmetric version of SORP, called SSORP; one SSORP step consists of one forward and one backward SORP step:

\bar{u}_i^{k+1} = \max\left\{ u_i^k - \frac{ω}{C_{ii}} \left( \sum_{j<i} C_{ij} \bar{u}_j^{k+1} + \sum_{j≥i} C_{ij} u_j^k - b_i \right), \; g_i \right\}, \quad i = 1, 2, \ldots, n,

u_i^{k+1} = \max\left\{ \bar{u}_i^{k+1} - \frac{ω}{C_{ii}} \left( \sum_{j≤i} C_{ij} \bar{u}_j^{k+1} + \sum_{j>i} C_{ij} u_j^{k+1} - b_i \right), \; g_i \right\}, \quad i = n, n-1, \ldots, 1.   (F40)


We shall denote by SSORP^m(x; C, b, g) the value which we obtain after m steps of (F40). If m = 1 we skip the superscript.
The new algorithm can be viewed as a variant of the active set strategy. That means, at each iteration step one has to solve a linear system with a matrix of similar structure to that of C. This system, however, does not have to be solved exactly, particularly when the actual active set I(u^k) is far away from I(u^*). The idea is to perform just a few steps of a preconditioned conjugate gradient method. For completeness we give below the definition of the PCG algorithm for the solution of the system

Au = b

with a symmetric positive definite matrix A.

PCG
Let M be a symmetric positive definite matrix (the preconditioner).
Choose x^0 ∈ ℝ^n and ε > 0.
Set r^0 = b - Ax^0, p^0 = z^0 = M^{-1} r^0 and do for k = 0, 1, 2, \ldots:

α_k = \langle r^k, z^k \rangle / \langle p^k, A p^k \rangle
x^{k+1} = x^k + α_k p^k
r^{k+1} = r^k - α_k A p^k
z^{k+1} = M^{-1} r^{k+1}
if ‖r^{k+1}‖ ≤ ε, stop; otherwise continue
β_k = \langle r^{k+1}, z^{k+1} \rangle / \langle r^k, z^k \rangle
p^{k+1} = z^{k+1} + β_k p^k

We denote by PCG^s(u; A, b) the point which we reach after s PCG steps starting from u.
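A direct Python transcription of the PCG loop is sketched below; the preconditioner is passed in as a function returning M^{-1}r, and a simple diagonal (Jacobi) preconditioner is shown as a stand-in for the MIC(0) factorization used later. All names are ours.

import numpy as np

def pcg(A, b, x0, M_solve, eps=1e-10, max_iter=1000):
    # Preconditioned conjugate gradients for Ax = b with SPD A;
    # M_solve(r) returns M^{-1} r for the SPD preconditioner M.
    x = x0.astype(float)
    r = b - A @ x
    z = M_solve(r)
    p = z.copy()
    for _ in range(max_iter):
        Ap = A @ p
        alpha = (r @ z) / (p @ Ap)
        x = x + alpha * p
        r_new = r - alpha * Ap
        if np.linalg.norm(r_new) <= eps:
            break
        z_new = M_solve(r_new)
        beta = (r_new @ z_new) / (r @ z)
        p = z_new + beta * p
        r, z = r_new, z_new
    return x

# Example preconditioner (Jacobi): pcg(A, b, x0, lambda r: r / np.diag(A))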
We are now going to explain the new approach. Assume that we have an approximation u of the solution u^* to (F38). We again denote by I(u) the active set with respect to the constraint u ≥ g, i.e.,

I(u) := \{ i \mid u_i = g_i \}.

Let p(u) be the cardinality of I(u) and P_{I(u)} : ℝ^n → ℝ^{n-p(u)} the operator which assigns to a vector v ∈ ℝ^n the reduced vector in ℝ^{n-p(u)} obtained by omitting the components of v with indices from I(u). We skip u in this notation if this does not lead to confusion. Further, we write I^* for I(u^*).
The basic idea of the new method is the following: we try to identify the active set by a few steps of SSORP; this gives an approximation of I^*. With this approximation, we perform several steps of PCG. Then, again, some steps of SSORP to improve the approximation of I^*, and so on. Therefore we call the method SSORP-PCG. One iteration of the SSORP-PCG algorithm consists of two steps:

SSORP-PCG
Choose u^0 ∈ ℝ^n, m ∈ ℕ, s ∈ ℕ and do for k = 0, 1, 2, \ldots:
Step 1: Perform m SSORP steps (F40) and put

u^{k+1/2} := SSORP^m(u^k; C, b, g).

Step 2: Determine P_I with I := I(u^{k+1/2}) and compute

r^k = b - C u^{k+1/2}.

Perform s PCG steps and, with

z^k := PCG^s(0; P_I C P_I^T, P_I r^k),

define the next iterate

u^{k+1} := u^{k+1/2} + γ P_I^T z^k,

where γ is the largest real number such that γ ≤ 1 and u^{k+1} ∈ S.

It was proved in Kocvara and Zowe [1994] that the sequence of iterates \{u^k\}_{k ∈ ℕ} produced by SSORP-PCG converges to the solution u^* of the LCP (F38).
Just as for multigrid methods, the number of SSORP steps (the parameter m in Step 1) can be chosen small; already for m = 2 we obtained good results. The number of PCG steps (the parameter s in Step 2) is more problem-dependent. Generally speaking, s should grow with the condition number of the matrix C. We recommend taking s = 5 for well-conditioned problems and s = 10 otherwise.
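A schematic Python sketch of the complete SSORP-PCG iteration is given below, with m = 2 SSORP steps and s = 5 inner steps as recommended above. For brevity the inner solver is plain (unpreconditioned) CG on the reduced system P_I C P_I^T, rather than PCG with MIC(0), and all names and loop bounds are our own choices.

import numpy as np

def ssorp_step(C, b, g, u, omega=1.5):
    # One symmetric projected SOR step (F40): a forward sweep followed
    # by a backward sweep, updating u in place.
    n = len(b)
    for order in (range(n), range(n - 1, -1, -1)):
        for i in order:
            r = C[i] @ u - b[i]
            u[i] = max(u[i] - omega * r / C[i, i], g[i])
    return u

def cg_reduced(C, r, free, s):
    # s plain CG steps, starting from zero, on the system restricted to the
    # inactive indices `free` (i.e., on P_I C P_I^T z = P_I r).
    A = C[np.ix_(free, free)]
    z = np.zeros(len(free))
    res = r[free].copy()
    p = res.copy()
    for _ in range(s):
        rr = res @ res
        if rr == 0.0:
            break
        Ap = A @ p
        a = rr / (p @ Ap)
        z += a * p
        res = res - a * Ap
        p = res + ((res @ res) / rr) * p
    return z

def ssorp_pcg(C, b, g, u0, m=2, s=5, n_outer=100):
    u = np.maximum(u0.astype(float), g)
    for _ in range(n_outer):
        for _ in range(m):                        # Step 1: m SSORP steps
            u = ssorp_step(C, b, g, u)
        free = np.where(u > g)[0]                 # complement of the active set I(u)
        if free.size == 0:
            break
        z = cg_reduced(C, b - C @ u, free, s)     # Step 2: s inner CG steps
        d = np.zeros(len(b)); d[free] = z         # prolongation d = P_I^T z
        neg = d < 0
        gamma = min(1.0, np.min((g[neg] - u[neg]) / d[neg])) if neg.any() else 1.0
        u = u + gamma * d                         # largest gamma <= 1 keeping u in S
    return u

With C and b^m assembled as in the earlier sketch, a single time step of (F37) then reads u_next = ssorp_pcg(C, b_m, g_next, u_prev).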
Concerning the preconditioner, it is well known that efficient preconditioning matrices for elliptic problems are those based on incomplete factorization. We implemented the MIC(0) algorithm.
Below we will compare SSORP-PCG with SORP. To guarantee equal conditions, we have chosen the following stopping criteria:
• The SORP method with relaxation parameter ω = 1.9 was stopped when ‖u^{k+1} - u^k‖_2 became less than 10^{-9} for the first time.
• The stopping criterion in SSORP-PCG guarantees an accuracy comparable to the one in the SORP implementation: if we applied SORP after stopping SSORP-PCG, then we typically had ‖u^{k+1} - u^k‖_2 ≤ 10^{-9}.

Numerical Results
In this section we present results for an example computed by the new algorithm and, for comparison, also by the plain SORP method. We would like to emphasize that the data shown below should not be evaluated from the viewpoint of overall efficiency; the example is academic, and the choice of the time and space discretization parameters, as well as the parameter α, can be far from optimal. Our goal is to demonstrate the efficiency of the SSORP-PCG algorithm for solving a particular subproblem (the LCP) which, no doubt, is large and has to be solved repeatedly many times.
We have solved an example from Wilmott et al. [1995] in order to get comparable results. This is the problem of computing an American put option with interest rate r = 0.10, volatility σ = 0.4 and exercise price E = 10. The calculation is carried out with α = 1 and with an expiry time of three months. The space interval [a, b] is chosen as [-0.5, 0.5]. We carried out the computation for three different space discretization steps: δx = 0.01, δx = 0.001 and δx = 0.0001.
Table 1

δx       δτ        n      N
0.01     10^{-4}   99     200
0.001    10^{-6}   999    20000
0.0001   10^{-8}   9999   2000000

Table 1 shows the corresponding values of the time step δτ, the size n of the n × n matrix C, and the number of time steps N, i.e., how many LCPs we have to solve.
These three problems were solved using the two-step algorithm SSORP-PCG. Table 2 shows the overall CPU time needed to solve the problem, as well as the time needed to solve one LCP. These times are compared with the solution times of the plain SORP method. Table 2 also shows the (average) number of active constraints. Note that for a large number of active constraints (compared to the problem size n) the SORP method becomes very efficient. This is observed in the number of SORP iterations, shown in the last column of Table 2; with an increasing number of variables, the number of iterations decreases. This is very untypical behaviour, caused by the particular data of the problem.

Table 2

         S-P        S-P       SORP       SORP
         one step   overall   one step   overall   #active   #SORP
δx       CPU        CPU       CPU        CPU       constr.   iter.
0.01     0.0021     0.38      0.0089     1.45      20        75
0.001    0.016      390.4     0.041      783.9     400       32
0.0001   0.44       *         1.05       *         8500      28
The second factor which influences the behaviour of the algorithms is that in both SORP and SSORP-PCG, the solution from the previous time step was taken as the initial approximation for the current time step. This technique, in fact, favours the SORP algorithm. This is clearly seen from Table 3, which shows the CPU times obtained with the initial approximation for each time step taken as the zero vector. In this situation, SSORP-PCG is a clear winner.
Table 3

         S-P        S-P       SORP       SORP
         one step   overall   one step   overall   #SORP
δx       CPU        CPU       CPU        CPU       iter.
0.01     0.0023     0.43      0.013      2.39      109
0.001    0.016      486.3     0.11       2221.1    87
0.0001   0.44       *         2.89       *         74
Our results show that the new algorithm clearly outperforms the SORP method, even though the problem data favour the latter.
All numerical experiments were carried out on a Sun Ultra1 m140 computer running the operating system Solaris 2.6. The CPU times are in seconds. The study of American options with feedback is an interesting open problem.
Bibliography

[1] Acton, F.S., Real Computing Made Real, Princeton University Press, Princeton, NJ, 1996.

[2] Albrecht, J., Collatz, L. et al. (eds.), Numerical Treatment of Eigenvalue Problems, Vol. 5, Birkhäuser, Basel-Boston-Berlin, 1991.

[3] Alonso, A. and Valli, A., A domain decomposition approach for heterogeneous time-harmonic Maxwell equations, Comput. Math. Appl. Mech. Engrg., 143(1997), 97-112.

[4] Alonso, A. and Valli, A., An optimal domain decomposition preconditioner for low-frequency time-harmonic Maxwell equations, Math. Comp., 68(1999), 607-631.

[5] Alvarez, L., Lions, P.L. and Morel, J.M., Image selective smoothing and edge detection by nonlinear diffusion II, SIAM J. Numer. Anal., 29(1992), 845-866.

[6] Amaratunga, K. and Williams, R., Wavelet based Green's function approach to 2D PDEs, Engineering Computations, 10(1993), 349-367.

[7] Antes, H. and Panagiotopoulos, P.P., The Boundary Integral Approach to Static and Dynamic Contact Problems, Birkhäuser (ISNM 108, International Series of Numerical Mathematics), 1992.

[8] Arino, M.A. and Vidakovic, B., On wavelet scalograms and their applications in time series, Preprint, 1999.

[9] D'Attellis, C.E. and Fernández-Berdaguer, E.M. (eds.), Wavelet Theory and Harmonic Analysis in Applied Sciences, Birkhäuser, Boston, Basel, 1997.

[10] Argyris, J.H., Energy theorems and structural analysis, Aircraft Engg., 26(1954), 347-356, 383-387, 394.

[11] Aubin, J.P., Approximation of Elliptic Boundary Problems, Wiley Interscience, New York, 1972.


[12] Babovsky, H., Gropengiesser, F., Neunzert, H., Struckmeier, J. and Wiesen, B., Low discrepancy methods for the Boltzmann equation, 16th Internat. Sympos. on Rarefied Gas Dynamics, Pasadena, CA, July 1988.

[13] Babovsky, H. and Illner, R., A convergence proof of Nanbu's simulation method for the full Boltzmann equation, SIAM J. Num. Anal., 26(1989), 45-65.

[14] Bahader, T.B., Mathematica for Scientists and Engineers, Addison-Wesley Publishing Company Inc., Reading, Massachusetts, 1995.

[15] Balder, R. and Zenger, C., The solution of multidimensional real Helmholtz equations on sparse grids, SIAM J. Sci. Comput., 17(1996), 631-646.

[16] Banerjee, P.K., The Boundary Element Methods in Engineering, McGraw Hill Book Company, London, 1994.

[17] Bank, R.E., Bulirsch, R., Merten, K., Mathematical Modelling and Simulation of Electrical Circuits and Semiconductor Devices, Birkhäuser, Basel, 1990.

[18] Bari, N.K., Treatise on Trigonometric Series, Vols. I, II, Pergamon Press, Oxford, 1961.

[19] Barnsley, M.F., Fractals Everywhere, Academic Press, San Diego, CA, 1988.

[20] Barnsley, M.F., The Desktop Fractal Design Handbook, Academic Press, New York, 1989.

[21] Barnsley, M.F., Fractal Image Compression, Notices of the AMS, 43(1996), 657-662.

[22] Barnsley, M.F., Hurd, L.P., Fractal Image Compression, AK Peters, Ltd., Boston, 1993.

[23] Beauchamp, K.G., Transforms for Engineers: A Guide to Signal Processing, Oxford University Press, 1987.

[24] Bellomo, N. and Preziosi, L., Modelling, Mathematical Methods and Scientific Computation, CRC Press, London, 1995.

[25] Berger, M.D. et al., ICAOS '96: 12th International Conference on Analysis and Optimization of Systems; Images, Wavelets and PDEs, Paris, June 26-28, 1996, Springer Verlag, Paris, 1996.

[26] Berkner, K., A Wavelet-based Solution to the Inverse Problem for Fractal Interpolation Functions, in Véhel et al. (eds.), Fractals in Engineering, International Conference on Fractals, Springer, Paris, 1997.

[27] Bertoluzza, S., Some Error Estimates for Wavelet Expansion, Mathematical Models and Methods in Applied Sciences, 2(1992), 489-506.

[28] Beste, A., Brokate, M., Dressler, K., Kann man berechnen, wie lange ein Auto hält?, in Bachem, Jünger, Schrader (eds.), Mathematik in der Praxis, Springer-Verlag, 1995.

[29] Berz, M., Bischof, C., Corliss, G. and Griewank, A. (eds.), Computational Differentiation: Techniques, Applications and Tools, SIAM, Philadelphia, 1996.

[30] Beylkin, G., Coifman, R. and Rokhlin, V., Fast wavelet transforms and numerical algorithms, Comm. Pure Appl. Math., 44(1991), 141-183.

[31] Bhattacharya, P., Semiconductor Optoelectronic Devices, Prentice Hall Inc., International edition, 1994.

[32] Binder, K., The Monte Carlo Method in Condensed Matter Physics, Springer Verlag, New York-Berlin, 1992.

[33] Biran, A. and Breiner, M., MATLAB for Engineers, Addison-Wesley, Reading, MA, 1995.

[34] Bird, G.A., Molecular Gas Dynamics, Clarendon Press, Oxford, 1976.

[35] Black, F. and Scholes, M., The pricing of options and corporate liabilities, J. Pol. Econ., 81(1973), 637-659.

[36] Blanchard, P. and Brüning, E., Variational Methods in Mathematical Physics, Springer Verlag, Texts and Monographs in Physics, Berlin, Heidelberg, 1992.

[37] Bossavit, A., Differential forms and the computation of fields and forces in electromagnetism, Eur. J. Mech. B, 10(1991), 474-488.

[38] Bossavit, A., Électromagnétisme, en vue de la modélisation (Electromagnetism in view of modelling), Springer Verlag, Paris, 1993.

[39] Bossavit, A., On the homogenization of Maxwell equations, COMPEL, 14(1995), 23-26.

[40] Boyce, W.E. (ed.), Case Studies in Mathematical Modelling, Pitman Advanced Publishing Program, Boston, 1981.

[41] Bramble, J.H. and Nitsche, J.A., Generalized Ritz-least squares method for Dirichlet problems, SIAM J. Numer. Anal., 10(1973), 81-93.

[42] Bramble, J.H., Multigrid Methods, Pitman Research Notes in Mathematics, Vol. 124, John Wiley and Sons, 1993.

[43] Bratley, P., Fox, Bennett L., Schrage, L.E., A Guide to Simulation, Springer Verlag, New York, Berlin, 1984.

[44] Bratley, P., Fox, B.L., Niederreiter, H., Algorithm 738: Programs to generate Niederreiter's low-discrepancy sequences, ACM Transactions on Mathematical Software, 20(1994), 494-495.

[45] Brebbia, C.A., The Boundary Element Methods for Engineers, Pentech Press, London, 1978.

[46] Brebbia, C.A., Topics in Boundary Element Research, Vol. 1, Springer, 1984.

[47] Brebbia, C.A., Topics in Boundary Element Research, Vol. 2, Springer, 1985.

[48] Brebbia, C.A. (ed.), Boundary Elements X, Springer Verlag, Berlin, 1988.

[49] Brebbia, C.A. (ed.), Boundary Element Technology VI, Elsevier (Appl. Science), London, 1991.

[50] Brebbia, C.A. and Gipson, G.S. (eds.), Boundary Elements XIII, Springer-Verlag, Berlin, 1991.

[51] Brebbia, C.A., Tanaka, M. and Honma, T. (eds.), Boundary Elements XII, Springer Verlag, Berlin, 1990.

[52] Brebbia, C.A., Walker, S., Boundary Element Techniques in Engineering, Newnes-Butterworths, London, 1980.

[53] Brebbia, C.A., Telles, J.C.F., Wrobel, L.C., Boundary Element Techniques: Theory and Applications in Engineering, Springer Verlag, Berlin-Heidelberg, 1984.

[54] Brenner, S.C. and Scott, L.R., The Mathematical Theory of Finite Element Methods, Springer Verlag, New York, Berlin, 1994.

[55] Brezis, H., Opérateurs maximaux monotones et semi-groupes de contractions dans les espaces de Hilbert, North Holland, Amsterdam, 1973.

[56] Brezzi, F., Fortin, M., Mixed and Hybrid Finite Element Methods, Springer Verlag, New York, 1991.

[57] Brokate, M., Some BV properties of the Preisach hysteresis operator, Applicable Analysis, 32(1989), 229-252.

[58] Brokate, M., Hysteresis operators, in Visintin, A. (ed.), Phase Transitions and Hysteresis, Lecture Notes in Mathematics No. 1584, Springer Verlag, Berlin, Heidelberg, 1994.

[59] Brokate, M., Dressler, K. and Krejčí, P., Rainflow counting and energy dissipation for hysteresis models in elastoplasticity, European J. Mechanics A/Solids, 15(1996), 705-737.

[60] Brokate, M. and Siddiqi, A.H. (eds.), Functional Analysis with Current Applications to Science, Technology and Industry, Pitman Research Notes in Mathematics, Longman, U.K., 1997.

[61] Brokate, M. and Sprekels, J., Hysteresis and Phase Transitions, Springer Verlag, Berlin-Heidelberg, 1996.

[62] Brokate, M. and Visintin, A., Properties of the Preisach model for hysteresis, J. Reine Angew. Math., (1989), 1-40.

[63] Brokate, M., Beste, A., Dressler, K., Kann man berechnen, wie lange ein Auto hält?, in Bachem, A., Jünger, M., Schrader, R. (eds.), Mathematik in der Praxis - Fallstudien aus Industrie, Wirtschaft, Naturwissenschaften und Medizin, Springer Verlag, Berlin, 1995, 3-24.

[64] Bruns, W., Motoc, I., O'Driscoll, K.F., Monte Carlo Applications in Polymer Science, Lecture Notes in Chemistry, Springer Verlag, 1981.

[65] Burnett, D.S., Finite Element Analysis: From Concepts to Applications, Addison-Wesley Pub. Comp., Reading, Massachusetts, 1987.

[66] Byrd, R.H., Nocedal, J., A tool for the analysis of quasi-Newton methods with application to unconstrained minimization, SIAM J. Numer. Anal., 25(1989), 727-739.

[67] Caflisch, Russel E., Monte Carlo and quasi-Monte Carlo methods, Acta Numerica, (1998), 1-49, Cambridge University Press.

[68] Canuto, C. and Cravero, I., A Wavelet-based Adaptive Finite Element Method for Advection-Diffusion Equations, Mathematical Models and Methods in Applied Sciences, 7(1997), 265-289.

[69] Caracciolo, S. and Fabrocini, A. (eds.), Monte Carlo Methods in Theoretical Physics, ETS Editrice, Pisa, 1991.

[70] Cargill, T., C++ Programming Style, Addison-Wesley, Reading, MA, 1992.

[71] Céa, J., Approximation variationnelle des problèmes aux limites, Ann. Inst. Fourier (Grenoble), 14(1964), 345-444.

[72] Cercignani, C., The Boltzmann Equation and Its Applications, Springer, Berlin, 1989.

[73] Chambolle, A., DeVore, R.A., Lee, N.Y. and Lucier, B., Nonlinear wavelet image processing: Variational problems, compression, and noise removal through wavelet shrinkage, IEEE Trans. Image Processing, 7(1998), 319-335.

[74] Chari, M.V.K. and Silvester, P.P. (eds.), Finite Elements in Electrical and Magnetic Field Problems, John Wiley and Sons, Chichester, 1980.

[75] Chavent, G. and Jaffre, J., Mathematical Models and Finite Elements for Reservoir Simulation, North Holland, Amsterdam, 1986.

[76] Chen, G. and Zhou, J., Boundary Element Methods, Academic Press, Cambridge, 1992.

[77] Chipot, M., March, R. and Vitulano, D., Numerical Analysis of Oscillations in a Nonconvex Problem Related to Image Selective Smoothing, Preprint 1999, CNR, Rome, Italy.

[78] Chui, C.K., An Introduction to Wavelets, Academic Press, San Diego, CA, 1992.

[79] Chui, C.K. (ed.), Wavelets: A Tutorial in Theory and Applications, Academic Press, San Diego, CA, 1992.

[80] Churchill, R.V., Fourier Series and Boundary Value Problems, McGraw Hill, New York, 1963.

[81] Ciarlet, P.G., The Finite Element Method for Elliptic Problems, North Holland, Amsterdam, 1978.

[82] Ciarlet, P.G., Introduction to Numerical Linear Algebra and Optimization, Cambridge University Press, 1989.

[83] Ciarlet, P.G. and Lions, J.L., Handbook of Numerical Analysis, Elsevier Science Publishers, North Holland, 1991.

[84] Clough, R.W., The Finite Element Method in Plane Stress Analysis, in Proceedings 2nd ASCE Conference on Electronic Computation, Pittsburgh, PA.

[85] Coifman, R.R., Meyer, Y. and Wickerhauser, M.V., Size Properties of Wavelet Packets, in Ruskai, Mary Beth et al. (eds.), Wavelets and their Applications, Boston, MA, 1992, 453-470.

[86] Coifman, R.R. and Wickerhauser, M.V., Entropy based algorithms for best basis selection, IEEE Trans. Inform. Theory, 38(1992), 713-718.

[87] Colton, D. and Kress, R., Integral Equation Methods in Scattering Theory, John Wiley & Sons, New York, 1983.

[88] Cooley, J.W. and Tukey, J.W., An Algorithm for the Machine Calculation of Complex Fourier Series, Mathematical Computations, 19, 1965.

[89] Crandall, R.E., Mathematica for the Sciences, Addison-Wesley Publishing Company, Inc., U.S.A., 1991.

[90] Dahmen, W., Wavelet and Multiscale Methods for Operator Equations, Acta Numerica, Cambridge, 1998.

[91] Dai, X., Larson, D.R. and Speegle, D.M., Wavelet sets in R^n II, 15-30, Contemporary Mathematics, Vol. 216, AMS (Aldroubi and Lin, eds.), 1997.

[92] Datta, B.N., Numerical Linear Algebra, Brooks/Cole Publishing Company, Bonn, 1995.

[93] Daubechies, I., Ten Lectures on Wavelets, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 1992.

[94] Daubechies, I., Grossmann, A. and Meyer, Y., Painless nonorthogonal expansions, J. Math. Phys., 27(1986), 1271-1283.

[95] Daubechies, I., Mallat, S. and Willsky, A. (eds.), A special issue on wavelet transforms and multiresolution signal analysis, IEEE Trans. Inform. Theory, 38(1992), 529-840.

[96] Dautray, R. and Lions, J., Mathematical Analysis and Numerical Methods for Science and Technology, Vols. 1-6, Springer Verlag, Berlin, Heidelberg, New York, 1990-1995.

[97] Davies, A.J., The Finite Element Method: A First Approach, Oxford Applied Mathematics and Computing Sciences, Clarendon Press, Oxford, 1980.

[98] Davis, P.J. and Rabinowitz, P., Methods of Numerical Integration, Academic Press, New York, 1984.

[99] Dennis, J.E., Jr. and Schnabel, Robert B., Numerical Methods for Unconstrained Optimization and Nonlinear Equations, Prentice-Hall Inc., New Jersey, 1983 (reproduction: SIAM, Philadelphia, 1996).

[100] DeVore, R.A., Nonlinear approximation, Acta Numerica, 1998, 51-150, Cambridge University Press.

[101] DeVore, R.A., Jawerth, B. and Lucier, B.J., Image compression through wavelet transform coding, IEEE Trans. Inform. Theory, 38(1992), 719-746.

[102] Douglas, J., Jr., Dupont, T., A Finite Element Collocation Method for Quasilinear Parabolic Equations, Math. Comput., 27(1973), 17-28.

[103] Dowd, K., High Performance Computing, O'Reilly and Associates, Sebastopol, CA, 1993.

[104] Dressler, K. and Hack, M., Fatigue lifetime estimation based on rainflow counted data using the local strain approach, Eur. J. Mech. A/Solids, 15(1996), 955-968.


[106] Dressler, K., Hack, M. and Krüger, W., Stochastic Reconstruction of Loading Histories from a Rainflow Matrix, ZAMM Z. angew. Math. Mech., 77(1997), 217-266.

[107] Duvaut, G., Lions, J.L., Inequalities in Mechanics and Physics, Springer, Berlin, 1976.

[108] Eirola, T., Sobolev characterization of solutions of dilation equations, SIAM J. Math. Anal., 23(1992), 1015-1030.

[109] Eriksson, K., Introduction to Adaptive Methods for Differential Equations, Acta Numerica, 1995, 105-158.

[110] Evans, G., Practical Numerical Integration, John Wiley and Sons, Chichester, New York, 1993.

[111] Falk, R.S., Error Estimates for the Approximation of a Class of Variational Inequalities, Math. Comp., 28(1974), 963-971.

[112] Fasman, G.D., Prediction of Protein Structure and the Principles of Protein Conformation, Fasman, G.D. (ed.), Plenum Press, New York and London, 1989; 2nd edition, 1990.

[113] Felippa, Finite Element and Finite Difference Energy Techniques for the Numerical Solution of Partial Differential Equations, Summer Computer Simulation Conference Proc., Montreal, 1973, Vol. 1, pp. 1-14.

[114] Finlayson, B.A., The Method of Weighted Residuals and Variational Principles, Academic Press, New York, 1972.

[115] Fisher, Y., Fractal Image Compression, Springer Verlag, New York, 1995.

[116] Fletcher, R., An optimal positive definite update for sparse Hessian matrices, SIAM J. Optim., 1(1991), 18-21.

[117] Forte, B. and Vrscay, E.R., Solving the Inverse Problem for Function and Image Approximation Using Iterated Function Systems, Dynamics of Continuous, Discrete and Impulsive Systems, 1(1995), 177-231.

[118] Fredholm, I., Sur une classe d'équations fonctionnelles, Acta Mathematica, 27(1903), 365-390.

[119] Friedman, A., Mathematics in Industrial Problems, IMA Volumes in Mathematics and its Applications, Vol. 16, Springer Verlag, 1988.

[120] Gaylord, R.J., Wellin, P.R., Computer Simulations with Mathematica: Explorations in Complex Physics and Biological Systems, Springer Verlag, 1994.

[121] Gilbert, J.C. and Nocedal, J., Global convergence properties of conjugate gradient methods for optimization, SIAM J. Optimization, 2(1992), 21-42.

[122] Gill, P.E., Murray, W. and Wright, M.H., Practical Optimization, Academic Press, London, 1981.

[123] Glowinski, R., Marroco, A., Analyse numérique du champ magnétique d'un alternateur par éléments finis et sur-relaxation ponctuelle non linéaire, Comput. Methods Appl. Mech. Engg., 3(1974), 55-85.

[124] Glowinski, R., Numerical Methods for Nonlinear Variational Problems, Springer Verlag, 1984.

[125] Glowinski, R., Lions, J.L. and Trémolières, R., Numerical Analysis of Variational Inequalities, Studies in Mathematics and its Applications, Vol. 8, North Holland Publishing Comp., Amsterdam, 1981.

[126] Glowinski, R. et al., Wavelet solutions of linear and nonlinear elliptic, parabolic and hyperbolic problems in one space dimension, in Glowinski et al. (eds.), Computing Methods in Applied Sciences and Engineering, SIAM, Philadelphia, 1990.

[127] Goldfarb, D., Algorithms for unconstrained optimization: A review of recent developments, Proc. of Symp. in Appl. Math., 48, 1994, Amer. Math. Soc.

[128] Golub, G.H., Van Loan, C.F., Matrix Computations, 3rd edition, The Johns Hopkins University Press, Baltimore and London, 1996.

[129] Gonzalez, R.C. and Woods, R.E., Digital Image Processing, Addison-Wesley Publishing Comp., Reading, Massachusetts, 1993.

[130] Goodman, J.W., Introduction to Fourier Optics, McGraw Hill, New York, 1968.

[131] Griewank, A. and Corliss, G.F., Automatic Differentiation of Algorithms: Theory, Implementation and Application, SIAM, Philadelphia, 1991.

[132] Griffel, D.H., Applied Functional Analysis, Ellis Horwood Ltd., New York, 1981.

[133] Hack, P., Quality Control of Artificial Fabrics, Report AGTM Nr. 45, Kaiserslautern, Germany, 1990.

[134] Hack, M., Life Time Estimation in the Car Industry, Report AGTM Nr. 105, 1994.

[135] Hackbusch, W., Multigrid Methods and Applications, Springer, Berlin, 1985.

[136] Hackbusch, W. (ed.), Robust Multigrid Methods, Notes on Numerical Fluid Mechanics, Vol. 23, Vieweg, Braunschweig/Wiesbaden, 1989.

[137] Hackbusch, W., Elliptic Differential Equations: Theory and Numerical Treatment, Springer Verlag, 1992.

[138] Hackbusch, W., Iterative Solution of Large Sparse Systems of Equations, Springer Verlag, 1994.

[139] Hackbusch, W., Integral Equations: Theory and Numerical Treatment, Birkhäuser, Basel-Boston-Berlin, 1995.

[140] Halton, J.H., A retrospective and prospective survey of the Monte Carlo method, SIAM Rev., 12(1970), 1-63.

[141] Hammersley, J.M. and Handscomb, D.C., Monte Carlo Methods, reprint 1967, Methuen and Co. Ltd., London (first printing 1964).

[142] Hammond, P., Energy Methods in Electromagnetism, Clarendon Press, Oxford, 1981.

[143] Hammond, P., Electromagnetism for Engineers, Pergamon Press, Oxford, New York, 3rd edition, 1986.

[144] Hanselman, D. and Littlefield, B., Mastering MATLAB, Prentice Hall, Upper Saddle River, NJ, 1996.

[145] Heath, M.T., Scientific Computing: An Introductory Survey, The McGraw-Hill Companies, Inc., New York, 1997.

[146] Hockney, R.W. and Eastwood, James W., Computer Simulation Using Particles, McGraw Hill Inc., U.S.A., 1981.

[147] Hoppe, R.H.W. and Wohlmuth, B., A Penalty Method for the Approximate Solution of Stationary Maxwell Equations, Numer. Math., 36(1981), 389-403.

[148] Hoppe, R.H.W. and Wohlmuth, B., Multilevel iterative solution and adaptive mesh refinement for mixed finite element discretizations, Applied Numerical Mathematics, 23(1997), 97-117.

[149] Hua, L.K. and Wang, Y., Applications of Number Theory to Numerical Analysis, Springer Verlag, Berlin-Heidelberg, 1981.

[150] Huebner, K.H., The Finite Element Method for Engineers, John Wiley and Sons, New York, London, 1975.

[151] Itzykson, C., Zuber, J.B., Quantum Field Theory, McGraw Hill, New York, 1980.

[152] Jackson, J.D., Electrodynamics of Continuous Media, Pergamon, Oxford, 1960.

[153] Jackson, J.D., Classical Electrodynamics, Wiley, New York, 1975.

[154] Jacoboni, C. and Lugli, P., The Monte Carlo Method for Semiconductor Device Simulation, Springer Verlag, Wien, New York, 1989.

[155] Jacquin, A.E., Fractal Image Coding: A Review, Proc. IEEE, 81(1993), 1451-1465.

[156] Jaswon, M.A. and Symm, G.T., Integral Equation Methods in Potential Theory and Elastostatics, Academic Press, London, 1977.

[157] Jin, Jianming, The Finite Element Method in Electromagnetics, John Wiley and Sons, Inc., New York, 1993.

[158] Johnson, C. and Nédélec, J.C., On the coupling of the boundary integral and finite element methods, Math. Comput., 35(1980), 1063-1079.

[159] Kaiser, G., A Friendly Guide to Wavelets, Birkhäuser, Boston-Basel-Berlin, 1994.

[160] Kantorovich, L.V., Mathematics in Economics: Achievements, Difficulties, Perspectives, Nobel Memorial Lecture, Dec. 11, 1975.

[161] Kapur, J.N., Mathematical Modelling, John Wiley and Sons, New York, 1987.

[162] Kardestuncer, H. and Norrie, D.H., Finite Element Handbook, McGraw Hill Book Company, 1987.

[163] Kearsley, S.K. and Smith, M.G., An Alternative Method for the Alignment of Molecular Structures: Maximising Electrostatic and Steric Overlap, Tetrahedron Computer Methodology, 3(1990), 615-633.

[164] Kellogg, O.D., Foundations of Potential Theory, Springer Verlag, Berlin, 1929 (also Dover Publ., New York, 1953).

[165] Kelly, S., Kon, M. and Raphael, L., Pointwise Convergence of Wavelet Expansions, J. Funct. Anal., 126(1994), 102-138.

[166] Kelly, S.E., Kon, M.A. and Raphael, L.A., Convergence: Fourier series versus wavelet expansion, Proc. of NATO Advanced Study Institute: Wavelets and their Applications, J. Byrnes, ed., 1994.

[167] Kesavan, S., Topics in Functional Analysis and Applications, John Wiley and Sons, New York, 1988.

[168] Kikuchi, N. and Oden, J.T., Contact Problems in Elasticity: A Study of Variational Inequalities and Finite Element Methods, SIAM Publ., Philadelphia, 1988.

[169] Kinderlehrer, D. and Stampacchia, G., An Introduction to Variational Inequalities and their Applications, Academic Press, New York, 1980.

[170] Kobayashi, M., Wavelets and their Applications: Case Studies, SIAM, Philadelphia, PA, 1998.

[171] Kocvara, M. and Zowe, J., An iterative two-step algorithm for linear complementarity problems, Numer. Math., 68(1994), 95-106.

[172] Koenderink, J., The structure of images, Biological Cybernetics, Springer Verlag, New York, 1985.

[173] Koonin, S.E. and Meredith, D.C., Computational Physics, Fortran version, Addison-Wesley Pub. Comp., U.S.A., 1990.

[174] Koopmans, T.C., Concepts of optimality and their uses, Nobel Memorial Lecture, Dec. 11, 1975.

[175] Koornwinder, T.H. (ed.), Wavelets: An Elementary Treatment of Theory and Applications, World Scientific, Singapore, 1993.

[176] Kovačević, J. and Daubechies, I. (eds.), Special issue on wavelets, Proc. IEEE, 84(1996), 507-614.

[177] Krasnosel'skii, M.A., Pokrovskii, A.V., Systems with Hysteresis, Springer Verlag, Heidelberg, 1989.

[178] Kraus, J.D., Electromagnetics, McGraw Hill, International, 1992.

[179] Krejčí, P., On Maxwell equations with the Preisach hysteresis operator: The one-dimensional time-periodic case, Appl. Math., 34(1989), 364-374.

[180] Krejčí, P., Vector hysteresis models, Europ. J. Appl. Math., 2(1991)(a), 281-292.

[181] Krejčí, P., Hysteresis memory preserving operators, Applications of Math., 36(1991), 305-326.

[182] Krejčí, P., Hysteresis, Convexity and Dissipation in Hyperbolic Equations, Gakkotosho, Tokyo, 1996.

[183] Krizek, M. and Neittaanmäki, P., Mathematical and Numerical Modelling in Electrical Engineering: Theory and Applications, Kluwer Academic Publishers, Dordrecht/Boston, 1996.

[184] Kröner, D., Numerical Schemes for Conservation Laws, Wiley Teubner, Chichester, 1997.

[185] Krüger, W., Scheutzow, M., Beste, A., Petersen, J., Markov- und Rainflow-Rekonstruktionen stochastischer Beanspruchungszeitfunktionen, VDI-Report, Serie 18, Nr. 22, 1985.

[186] Kuipers, L. and Niederreiter, H., Uniform Distribution of Sequences, John Wiley, New York, 1974.

[187] Kupradze, V.D., Potential Methods in the Theory of Elasticity, Israel Scientific Publ., Jerusalem, 1968.

[188] Landau, L., Lifshitz, E., Classical Theory of Fields, Pergamon, Oxford, 1959.

[189] Landau, L., Electrodynamics of Continuous Media, Pergamon, Oxford, 1960.

[190] Lécot, C., A Quasi-Monte Carlo Method for the Boltzmann equation, Math. of Comp., 56(1991), 621-644.

[191] Lécot, C., Error bounds for quasi-Monte Carlo integration with nets, Math. Comp., 65(1996), 179-187.

[192] Lemaitre, J. and Chaboche, J.L., Mechanics of Solid Materials, Cambridge University Press, Cambridge, 1990.

[193] Lemarié, P.G., Les ondelettes en 1989, Lecture Notes in Mathematics, No. 1438, Springer Verlag, Berlin, Heidelberg, 1989.

[194] Lim, J.S. and Oppenheim, A.V., Advanced Topics in Signal Processing, Prentice Hall, New Jersey, 1987.

[195] Liu, C.S. and Chan, A.K., Wavelet Toolware, Academic Press, London, 1998.

[196] Locker, J., Prenter, P.M., Optimal L^2 and L^∞ Error Estimates for Continuous and Discrete Least Squares Methods for Boundary Value Problems, SIAM J. Numer. Anal., 15(1978), 1151-1160.

[197] Louis, A.K., Maaß, P., Rieder, A., Wavelets: Theory and Applications, John Wiley and Sons, Chichester, 1997.

[198] Lu, G., Fractal Image Compression, Signal Processing: Image Communication, 5(1993), 327-343.

[199] Lu, Ning, Fractal Imaging, Academic Press, San Diego, London, 1997.

[200] Maaß, P., Stark, H.G., Wavelets and digital image processing, Surveys on Mathematics for Industry, Springer-Verlag, Wien-New York, 4(1994), 195-235.

[201] Madelung, E., Über Magnetisierung durch schnellverlaufende Ströme und die Wirkungsweise des Rutherford-Marconischen Magnetdetektors, Ann. Phys., 17(1905), 861-890.

[202] Mallat, S., A theory for multiresolution signal decomposition: the wavelet representation, IEEE Trans. Pattern Anal. Mach. Intell., 11(1989), 674-693.

[203] Mallat, S., Multiresolution approximations and wavelet orthonormal bases of L^2(R), Trans. Amer. Math. Soc., 315(1989), 68-87.

[204] Mallat, S., Multifrequency channel decompositions of images and wavelet models, IEEE Trans. Acoust. Speech Signal Process., 37(1989), 2091-2110.

[205] Mallat, S., Wavelets for a vision, Proc. IEEE, 84(1996), 604-614.

[206] Mallat, S. and Hwang, W.L., Singularity detection and processing with wavelets, IEEE Trans. Inform. Theory, 38(1992), 617-643.

[207] Mallat, S. and Zhong, S., Characterization of signals from multiscale edges, IEEE Trans. Patt. Anal. Mach. Intell., 14(1992), 710-732.

[208] Manchanda, P., Mukheimer, A.A.S. and Siddiqi, A.H., Certain results concerning the iterated function system, Preprint 1998, to appear in Numer. Functional Analysis and Optimization, FAAC, Kuwait University Issue.

[209] Manchanda, P., Mukheimer, A.A.S. and Siddiqi, A.H., Pointwise convergence of two-dimensional wavelet expansions incorporating rotation, Preprint 1999.

[210] Maugin, G.A., The Thermomechanics of Plasticity and Fracture, Cambridge University Press, Cambridge, 1992.

[211] March, R. and Dozio, M., A variational method for the recovery of smooth boundaries, Image and Vision Computing, 15(1997), 705-712.

[212] Markowich, P.A., The Stationary Semiconductor Device Equations, Springer Verlag, Wien, New York, 1986.

[213] Mayergoyz, I.D., Mathematical Models of Hysteresis, Springer Verlag, 1991.

[214] Mendivil, F. and Vrscay, E.R., Correspondence Between Fractal-Wavelet Transforms and Iterated Function Systems with Grey Level Maps, in Véhel et al. (eds.), Fractals in Engineering, Springer Verlag, INRIA, June 1997.

[215] Merton, R.C., Applications of option-pricing theory: Twenty-five years later, The American Economic Review, June 1998, pp. 50-76 (Nobel Prize Memorial Lecture delivered in Stockholm, December 9, 1997).

[216] Metropolis, N. and Ulam, S.M., The Monte Carlo Method, J. Amer. Statist. Assoc., 44(1949), 335-341.

[218] Meyer, Y., (Translated and revised by Robert D. Ryan), Wavelets - Algorithms
and Applications, Society for Industrial and Applied Mathematics, Philadel-
phia, 1993.

[219] Mikhailov, G.A., Minimization of Computational Costs of Non-analogue Monte Carlo Methods, World Scientific, Singapore, 1991.

[220] Mikhlin, S.G., Integral Equations, Pergamon Press, Oxford, 1957.

[221] Mikhlin, S.G., Approximate Solutions of Differential and Integral Equations, Pergamon Press, Oxford, 1965.

[222] Morel, J.M. and Solimini, S., Variational Methods in Image Segmentation with Seven Image Processing Experiments, Birkhäuser, Boston, 1995.

[223] Morokoff, W.J. and Caflisch, R.E., A quasi-Monte Carlo approach to particle simulation of the heat equation, SIAM J. Numer. Anal., 30(1993), 1558-1573.
[224] Morokoff, W.J. and Caflisch, R.E., Quasi-Monte Carlo integration, Journal of Computational Physics, 122(1995), 218-230.

[225] Mumford, D. and Shah, J., (i) Boundary detection by minimizing functionals, IEEE Conference on Computer Vision and Pattern Recognition, San Francisco, 1985; (ii) Boundary detection by minimizing functionals, in Image Understanding, ed. S. Ullman and W. Richards, 1988; (iii) Optimal approximations by piecewise smooth functions and associated variational problems, Communications on Pure and Applied Math., XLII (1989).

[226] Murakami, Y. (ed.), The Rainflow Method in Fatigue, Butterworth-Heinemann, Oxford, 1992.

[227] Muskhelishvili, N.I., Singular Integral Equations, Noordhoff Publishing Co., Groningen, 1953.

[228] Nakamura, S., Numerical Analysis and Graphic Visualization with MATLAB, Prentice Hall, Upper Saddle River, NJ, 1996.

[229] Nashed, M.Z., Differentiation and related properties of non-linear operators, in Non-linear Functional Analysis and its Applications, (ed.) L.B. Rall, Academic Press, New York, 1971.

[230] Neuber, H., Theory of stress concentration for shear-strained prismatical bodies with arbitrary non-linear stress-strain law, Trans. ASME, J. Appl. Mech., 28(1961), 544-550.

[231] Neunzert, H., Lectures on Mathematical Modelling: Case Studies from the Laboratory of Technomathematics, Kaiserslautern University, Germany, 1995.

[232] Neunzert, H. and Struckmeier, J., Particle methods for the Boltzmann equation, Acta Numerica, 1995, 417-457.
[233] Neunzert, H., Klar, A. and Struckmeier, J., Particle methods - theory and applications, in Kirchgässner, K., Mahrenholtz, O. and Mennicken, R. (eds.), ICIAM 95, Proc. Third International Congress on Industrial and Applied Mathematics, Akademie Verlag, Berlin, Vol. 87, 1996.

[234] Neunzert, H. and Struckmeier, J., Boltzmann simulation by particle methods, Accademia Nazionale dei Lincei, Boltzmann's Legacy 150 Years After His Birth, Roma, 1997.

[235] Neunzert, H. and Siddiqi, A.H., Industrial mathematics - some ideas, in Brokate, M. and Siddiqi, A.H. (eds.), Functional Analysis with Current Application to Science, Technology and Industry, Pitman Research Notes in Mathematics, Longman, U.K., 1997.

[236] Niederreiter, H., Quasi-Monte Carlo methods and pseudo-random numbers, Bull. Amer. Math. Soc., 84(1978), 957-1041.
[237] Niederreiter, H., Random Number Generation and Quasi-Monte Carlo Methods, SIAM, Philadelphia, Vol. 63, 1992.
[238] Niederreiter, H., in Hlawka, E. and Tichy, R.F. (eds.), Number-theoretic Analysis, Lect. Notes in Math., Vol. 1452, Springer Verlag, 1990.
[239] Nievergelt, Y., Wavelets Made Easy, Birkhäuser, Boston, 1999.
[240] Ninomiya, S. and Tezuka, S., Toward real-time pricing of complex financial derivatives, Appl. Math. Finance, 3(1996), 11-20.
[241] Nussbaumer, H.J., Fast Fourier Transform and Convolution Algorithms, Springer Verlag, New York, 1982.

[242] Oden, J. Tinsley, An Introduction, in P.G. Ciarlet and J.L. Lions (eds.), Handbook of Numerical Analysis, Vol. II, Finite Element Methods (Part I), 3-15, Elsevier Science Publishers, North Holland, 1991.
[243] Ogden, R.T., Essential Wavelets for Statistical Applications and Data Analysis, Birkhäuser, Boston, 1997.

[244] Oppenheim, Alan V., Applications of Digital Signal Processing, Prentice-Hall, New Jersey, 1978.
[245] Outrata, J., Kočvara, M. and Zowe, J., Nonsmooth Approach to Optimization Problems with Equilibrium Constraints: Theory, Applications and Numerical Results, Kluwer Academic Publishers, Boston/Dordrecht/London, 1998.

[246] Papageorgiou, A. and Traub, J.F., Beating Monte Carlo, Risk, 9(1996), 63-65.

[247] Pärt-Enander, E., et al., The MATLAB Handbook, Addison-Wesley, Reading, MA, 1996.

[248] Paskov, S. and Traub, J.F., Faster valuation of financial derivatives, J. Portfolio Management, 22(1995), 113-120.

[249] Perona, P. and Malik, J., Scale space and edge detection using anisotropic diffusion, IEEE Trans. Patt. Anal. and Mach. Intell., 12(1990), 629-639.

[250] Pfreundt, F. and Hackh, P., Algorithmen zum Molekülmatching im Wirkstoffdesign, Abschlußbericht, Sept. 1993, Fachbereich Mathematik, Universität Kaiserslautern, Germany.

[251] Polak, E., Optimization: Algorithms and Consistent Approximations, Applied Mathematical Sciences Series, Springer, NY, Berlin, 1997.

[252] Polyak, B.T., Introduction to Optimization, Optimization Software Inc., Publications Division, New York, 1987.

[253] Powell, M.J.D., Convergence properties of algorithms for non-linear optimization, SIAM Review, 28(1986), 487-500.

[254] Pozrikidis, C., Boundary Integral and Singularity Methods for Linearized Viscous Flow, Cambridge Texts in Applied Mathematics, Cambridge University Press, Cambridge, 1992.

[255] Prandtl, L., Ein Gedankenmodell zur kinetischen Theorie der festen Körper, ZAMM, 8(1928), 85-106.

[256] Preisach, F., Über die magnetische Nachwirkung, Z. Physik, 94(1935), 277-302.

[257] Prenter, P.M. and Russell, R.D., Orthogonal collocation for elliptic partial differential equations, SIAM J. Numer. Anal., 13(1976), 923-939.

[258] Press, W.H., Teukolsky, S.A., Vetterling, W.T. and Flannery, B.P., Numerical Recipes in Fortran - The Art of Scientific Computing, Cambridge University Press, 1992.

[259] Rabiner, L.R. and Gold, B., Theory and Application of Digital Signal Processing, Prentice-Hall, New Jersey, 1975.

[260] Raviart, P.A., An analysis of particle methods, in Brezzi, F. (ed.), Numerical Methods in Fluid Dynamics, Lect. Notes in Math., Springer Verlag, Heidelberg, 1983.

[261] Reddy, J.N., An introduction to the Finite Element Method, McGraw Hill,
International Editions, 1985.

[262] Reddy, J.N., Applied Functional Analysis and Variational Methods in Engi-
neering, McGraw Hill, 1985.
[263] Redfern, D. and Campbell, C., The MATLAB Handbook, Springer Verlag, New York, 1996.
[264] Reissel, M., 3D eddy current computation using Krylov subspace methods, Preprint, Department of Mathematics, Kaiserslautern University, 1995.
[265] Rektorys, K., Variational Techniques, D. Reidel Publishing Company, Dor-
drecht, Holland, 1980.

[266] Richtmyer, R.D., On the evaluation of definite integrals and a quasi-Monte Carlo method based on properties of algebraic numbers, Report LA-1342, Los Alamos Sci. Lab., Los Alamos, N.M., 1951.
[267] Rjasanow, S., Parameterschätzverfahren zur Bestimmung der Bahnparameter einer geradlinigen, gleichförmigen Bewegung, Fachbereich Mathematik, Universität Kaiserslautern, Germany.
[268] Ross, C.C., Differential Equations: An Introduction with Mathematica®, Springer Verlag, New York, 1995.
[269] Ruskai, M.B., et al. (eds.), Wavelets and their Applications, Jones and Bartlett, Boston, 1992.
[270] Sarkar, P.K. and Prasad, M.A., A comparative study of pseudo- and quasi-random sequences for the solutions of integral equations, Journal of Computational Physics, 68(1987), 66-88.
[271] Saupe, D., et al., Fractal image compression - an introductory overview, in Saupe, D. and Hart, J. (eds.), Fractal Models for Image Synthesis, Compression and Analysis, ACM SIGGRAPH'96 Course Notes 27, New Orleans, Louisiana, 1996.
[272] Schatz, A.H., Thomée, V. and Wendland, W.L., Mathematical Theory of Finite and Boundary Element Methods, DMV Seminar Band 15, Birkhäuser, Basel, Boston, Berlin, 1990.
[273] Scholes, M.S., Derivatives in a dynamic environment, The American Economic Review, June 1998, 15-35 (Nobel Prize Memorial Lecture delivered in Stockholm, December 9, 1997).
[274] Schulenberger, J.R., The Debye potential, a scalar factorization for Maxwell's equations, J. Math. Anal. Appl., 63(1978), 502-520.
[275] Schumaker, L.L. and Webb, G. (eds.), Recent Advances in Wavelet Analysis, Academic Press, Boston, San Diego, New York, 1994.

[276] Schreiner, W., Partikelverfahren für kinetische Schemata zu den Eulergleichungen, Ph.D. Thesis, University of Kaiserslautern, 1994.
[277] Sendov, B., Dimov, I. (eds.) Monte Carlo Methods and Parallel Algorithms,
World Scientific, Singapore, 1989.
[278] Shreider, Yu.A., The Monte Carlo Method, Pergamon Press, Oxford-London, 1966.
[279] Siddiqi, A.H., Walsh Function, AMU Press, Aligarh, 1978.
[280] Siddiqi, A.H., Functional Analysis with Applications, Tata McGraw Hill, 1986.
[281] Siddiqi, A.H., Approximation by Walsh series, in Mazhar, S.M., Hamoui, A. and Faour, N.S. (eds.), Mathematical Analysis and its Applications, Pergamon Press, 1987, 43-51.
[282] Siddiqi, A.H. (ed.), Recent Developments in Applicable Mathematics, Macmillan, India, 1994.
[283] Siddiqi, A.H., Certain current developments in variational inequalities, in Lau and Tweddle (eds.), Pitman Research Notes in Mathematics Series No. 316, Longman Scientific and Technical, 1994, 219-238.
[284] Siddiqi, A.H., Lecture Notes on Variational Inequalities with Applications, Industrial Mathematics, Department of Mathematics, Kaiserslautern University, 1997.
[285] Siddiqi, A.H., Fractal-wavelet methods in image processing, Invited Talk Inter-
national Conference on Fourier Analysis and Applications, Kuwait University,
Kuwait, 1998.
[286] Siddiqi, A.H. and Ahmad, K., Wavelet Methods in Differential equations,
Preprint AMU-95.
[287] Siddiqi, A.H., Ahmad, M.K. and Mukheimer, A., Current developments in fractal image compression, in Brokate, M. and Siddiqi, A.H. (eds.), Functional Analysis with Current Application to Science, Technology and Industry, Pitman Research Notes in Mathematics, Longman, Vol. 377, U.K., 1997.
[288] Siddiqi, A.H., Manchanda, P. and Kočvara, M., An iterative two-step algorithm for American option pricing, Preprint 1998, Erlangen University, to be published in IMA Journal of Mathematics Applied to Business and Industry.
[289] Siddiqi, A.H. and Ahmad, M.K., Distortion measure through Sobolev and total variation norm metrics, Preprint, 1998.
[290] Siddiqi, A.H. and Ahmad, M.K., Sharp operator and classification of images, Preprint 1998.

[291] Sircar, K.R. and Papanicolaou, G., General Black-Scholes models accounting for increased market volatility from hedging strategies, Applied Mathematical Finance, 5(1998), 45-82.
[292] Silvester, P.P. and Ferrari, R.L., Finite Elements for Electrical Engineers, Cambridge University Press, 1983 (1st edition), 1990 (2nd edition).
[293] Sobol, I.M., The Monte Carlo Method, Mir Publishers, Moscow, 1975.

[294] Stark, H.G., Multiscale analysis, wavelets and texture quality, Reports of
AGTM, Kaiserslautern University, Germany, Nr. 41, 1990.
[295] Sokolowski, J. and Zolesio, J.P., Introduction to Shape Optimization: Shape Sensitivity Analysis, Springer-Verlag, Berlin, 1992.
[296] Stewart, I., Four encounters with Sierpinski's gasket, Mathematical Intelligencer, 17(1995), 52-64.
[297] Strang, G., Wavelets and dilation equations - a brief introduction, SIAM Review, 31(1989), 614-627.
[298] Strang, G., Wavelet transforms versus Fourier transforms, Bull. Amer. Math. Soc., 28(1993), 288-305.
[299] Strömberg, J.O., A modified Franklin system and higher order spline systems on Rⁿ as unconditional bases for Hardy spaces, in Conference on Harmonic Analysis in Honour of Antoni Zygmund, et al. (eds.), Vol. II, University of Chicago Press, Chicago, 1981, 475-494.
[300] Stroud, A.H., Numerical Quadrature and Solution of Ordinary Differential Equations, Springer Verlag, New York-Heidelberg-Berlin, 1974.
[301] Sweldens, W. and Piessens, R., Quadrature formulae and asymptotic error expansions for wavelet approximations of smooth functions, SIAM J. Numer. Anal., 31(1994), 1240-1264.
[302] Tapia, R.A., The differentiation and integration of non-linear operators, in Non-linear Functional Analysis and Applications, (ed.) L.B. Rall, Academic Press, 1971.

[303] Tayler, A.B., Mathematical Models in Applied Mechanics, Clarendon Press, Oxford, 1986.
[304] Teo, P.C. and Heeger, D.J., Perceptual image distortion, SPIE, 2179(1994), 127-141.

[305] Tezuka, S., Financial applications of Monte Carlo and quasi-Monte Carlo methods, in Hellekalek and Larcher (eds.), Lect. Notes in Statistics, Springer-Verlag, 1998, pp. 303-332.

[306] Thomas, E.G. and Meadows, A.J., Maxwell's Equations and their Applica-
tions, Adam Hilger Ltd., Bristol and Boston, 1985.
[307] Tiwari, S. and Rjasanow, S., Sobolev norm as a criterion of local thermal equilibrium, to appear in Eur. J. Mech. B/Fluids.
[308] Tricomi, F., Integral Equations, Interscience Publ., London, New York, 1957; reprint Dover Publ., New York, 1985.
[309] Turner, M.J., Clough, R.W., Martin, H.C. and Topp, L.J., Stiffness and deflection analysis of complex structures, J. Aero. Sci., 23(1956), 805-823.
[310] van Loan, C., Computational Frameworks for the Fast Fourier Transform, SIAM, Philadelphia, 1992.
[311] Visintin, A. (ed.), Models of Hysteresis, Pitman Research Notes in Math. Series No. 286, Longman Scientific and Technical, 1993.
[312] Visintin, A., Differential Models of Hysteresis, Springer Verlag, 1994.
[313] Visintin, A. (ed.), Phase transition and Hysteresis, Lecture Notes in Mathe-
matics, Vol. 1584, Springer Verlag, Berlin, 1994.
[314] Vvedensky, D., Partial Differential Equations with Mathematica, Addison-Wesley Publishing Company, 1994.
[315] Wahlbin, L.B., Superconvergence in Galerkin Finite Element Methods, Lecture Notes in Mathematics, Vol. 1605, Springer Verlag, Berlin-Heidelberg, 1995.
[316] Wait, R. and Mitchell, A.R., Finite Element Analysis and Applications, John Wiley and Sons, Chichester, New York, 1985.
[317] Walker, J.S., Fourier analysis and wavelet analysis, Notices of the Amer. Math. Soc., 44(1997), 658-670.
[318] Wallace, G.K., Overview of the JPEG (ISO/CCITT) still image compression standard, SPIE, 1244(1990), 220-233.
[319] Walter, G.G., Approximation of the delta function by wavelets, J. Approx. Theory, 71(1992), 329-343.
[320] Walter, G.G., Pointwise convergence of wavelet expansions, J. Approx. Theory, 80(1995), 108-118.
[321] Wang, Xin-Hua, Finite Element Methods for Non-linear Optical Waveguides, Advances in Non-linear Optics, Vol. 2, Gordon and Breach Publishers, Amsterdam, 1995.
[322] Weaver, H. Joseph, Applications of Discrete and Continuous Fourier Analysis, John Wiley & Sons, Inc., 1983.

[323] Weaver, H. Joseph, Theory of Discrete and Continuous Fourier Analysis, John
Wiley and Sons, 1989.
[324] Weickert, J., A model for the cloudiness of fabrics, Preprint, Industrial Math. Lab., Kaiserslautern University, Germany, 1995; Proc. 8th ECMI Conference, Wiley-Teubner, 1995.
[325] Wendt, J. F . (ed.), Computational Fluid Dynamics, Springer Verlag, Berlin,
1991.
[326] Weyl, H., Über die Gleichverteilung von Zahlen mod. Eins, Math. Ann., 77(1916), 313-352.
[327] Wheeler, M.F., An elliptic collocation finite element method with interior penalties, SIAM J. Numer. Anal., 15(1978), 152-161.
[328] Whiteman, J. (ed.), The Mathematics of Finite Elements and Applications, I, II, III, Proc. Conf. Brunel University, Academic Press, (1973, 1976, 1979).
[329] Wickerhauser, M.V., Adapted Wavelet Analysis from Theory to Software, Wellesley, MA: A.K. Peters, 1994.
[330] Wilmott, P., Derivatives: The Theory and Practice of Financial Engineering, John Wiley and Sons, Chichester, New York, 1998.
[331] Wilmott, P., Howison, S.D. and Dewynne, J.N., The Mathematics of Financial Derivatives, Cambridge University Press, 1995.
[332] Wilmott, P., Dewynne, J.N. and Howison, S.D., Option Pricing: Mathematical Models and Computation, Oxford Financial Press, 1993.
[333] Wojtaszczyk, P., A Mathematical Introduction to Wavelets, London Mathematical Society Student Texts 37, Cambridge University Press, 1997.
[334] Wolfram, S., Mathematica™: A System for Doing Mathematics by Computer, Addison-Wesley Publishing Company, 1990.
[335] Zayed, A.I., Advances in Shannon's Sampling Theory, CRC Press, London,
1993.
[336] Zenisek, A., Nonlinear Elliptic and Evolution Problems and their Finite Element Approximations, Computational Mathematics and Applications, Academic Press, Harcourt Brace Jovanovich Publishers, London, 1990.
[337] Zienkiewicz, O.C. and Cheung, Y.K., The Finite Element Method in Structural
and Continuum Mechanics, McGraw Hill, New York, 1967.
[338] Zlamal, M., On the Finite Element Method, Numer. Math., 12(1968), 394-409.
[339] Zygmund, A., Treatise on Trigonometric Series, Vol. I, II, Oxford Press, 1959.

Symbols
→ mapping arrow
⇒ implication sign
∀ for all
⟺ if and only if
N, Z, R denote the sets of natural numbers, integers and real numbers
∅ the empty set
R^n Euclidean space of dimension n
⊂ subset
inf A the infimum of A ⊂ R
sup A the supremum of A ⊂ R
diam A the diameter of A ⊂ R^n
d(A, B) the distance between A, B ⊂ R^n
[a, b] = {x ∈ R | a ≤ x ≤ b}
(a, b) = {x ∈ R | a < x < b}
(a, b] = {x ∈ R | a < x ≤ b}
[a, b) = {x ∈ R | a ≤ x < b}
R_+ = (0, ∞)
R̄_+ = [0, ∞)
‖·‖ norm
support of f = closure of {x ∈ Ω | f(x) ≠ 0}
C_0^∞(Ω) = D(Ω), the space of all infinitely differentiable functions with compact support
D'(Ω) the space of all distributions = the space of all bounded linear functionals defined on D(Ω)
H^m(Ω) = {f ∈ L²(Ω) | D^α f ∈ L²(Ω), |α| ≤ m}, m any positive integer; the Sobolev space of order m
H_0^m(Ω) = {f ∈ H^m(Ω) | γf = 0} = closure of D(Ω) in H^m(Ω)
H^m(R^n) = H_0^m(R^n), m ≥ 0
f ∈ H^{-m}(Ω) if and only if f = Σ_{|α|≤m} D^α g_α for some g_α ∈ L²(Ω)
div v = Σ_{i=1}^{n} ∂v_i/∂x_i for all v = (v_1, v_2, ..., v_n) ∈ (D'(Ω))^n
curl v = rot v = ∇ × v = (∂v_3/∂x_2 − ∂v_2/∂x_3, ∂v_1/∂x_3 − ∂v_3/∂x_1, ∂v_2/∂x_1 − ∂v_1/∂x_2) for v = (v_1, v_2, v_3) ∈ (D'(Ω))³
χ_A(x) = 1 if x ∈ A, 0 if x ∉ A; the characteristic or indicator function of A
Index
abstract variational problem, 106
accuracy, 291
affine operator, 312
affine transformation, 255
airbag system, 17
algorithm, 290
algorithmic language, 290
alignment, 3
almost band-limited, 217
American call option, 338
American put option, 339
analogue, 182
anisotropic, 83
applicability, 291
approximate problem, 106
asset, 334
attractor, 256
attributes, 192
Banach contraction fixed point theorem, 296
basic wavelet, 221
basis functions, 112
behaviourally valid, 289
Bernoulli, 201
Black-Scholes equation, 336
Black-Scholes model, 336
boundaries, 199
boundary element method, 120
brightness, 199
calculus of variation problem, 60
call options, 335
Cantor set, 295
Clouds, 27, 36
coercive, 314
coercivity, 106
Collage Theorem, 256
collocation method, 101
commodities, 335
computer algebra, 292
conductors, 80
conformal finite element method, 106
conjugate gradient, 71
constraint optimization problem, 55
contraction mapping, 255
contrast stretching, 184
convergence problem, 107
convex programming problem, 60
convolution, 211
convolution theorem, 187
Coulomb condition, 92
Coulomb's law, 80
current density, 87
cyclic curve, 45
Daubechies wavelets, 225
Debye potential, 92
decoder, 192
decomposition algorithm, 233
degrees of freedom, 112
derivative, 334
deterministic fractal, 256
deterministic fractal or fractal, 299
dielectrics, 80
difference quotient, 102
diffusion equation, 39
dilation equation, 233
dimension, 82
Dirichlet's approximation theorem, 31


discrepancy, 160
discrete Fourier transform, 211
discrete Fourier transform pair, 212
discrete Lipschitz distance, 33
dissipation operator, 284
distribution, 318
divergence theorem, 83
double layer potential, 125
doubled curve, 45
edges at scale, 199
effectively computable, 290
electric charge density, 81
electric conduction, 87
electric current, 83
electric dipole, 82
electric dipole moment, 82
electric displacement, 86
electric field intensity, 81
electric induction, 86
electrical displacement, 87
element, 115
ellipticity, 106
empirically valid, 290
Encarta, 254
encoder, 192
energy functional, 58
equity, 334
error, 100
error estimation, 107
essential, 121
Euler, 201
Euler's equation, 55
European call option, 335
European put option, 335
eventually contractive, 255
eventually iterated function system, 255
exact, 98
exchange rate, 335
expiry, 334
extrema, 55
extremum value, 55
Faraday's law, 85
fast Fourier algorithm, 212
fast wavelet transform, 238
father wavelet, 223
fatigue, 42
fatigue analysis, 43
fatigue cracks, 42
fatigue life, 43
fatigue lifetime, 44
Faure sequence, 165
Fejér kernel, 204
filter transfer function, 186
final value mapping, 269
finite difference method, 102
finite element mesh, 115
finite element method, 105-107
finite element model, 119
finite elements, 111
finite volume method, 103
First Shift Theorem, 212
First Strang Lemma, 107
fleeces, 27
Fletcher-Reeves conjugate gradient method, 73
Fletcher-Reeves formula, 73
fluctuations, 41
flux density, 86, 87
forward transformation kernel, 215
Fourier, 201
Fourier sine transform, 207
Fourier spectrum, 206
Fourier transform, 206
Fourier transform frequency content, 207
fractal, 256
frequency content, 202
frequency domain, 208
frequency variable, 206
Friedrichs inequality, 324
functional, 310
Galerkin method, 99
gauge transformation, 90
Gauss theorem, 83
Gauss' magnetic law, 87

Gauss-Seidel iteration, 330
Gauss-Seidel method, 330
Gaussian pyramid, 40
Gaussian related wavelet, 226
generalized van der Corput sequence, 165
gradient method, 65, 66
gradient method with optimal, 65
Green's formula for integration by parts, 324
grey level, 182, 199
Haar wavelet, 224
Halton sequence, 165
Hammersley sequence, 165
harmonic, 126
Hausdorff, 5
Hausdorff distance, 5
Hausdorff metric, 298
height function, 285
Hermann Weyl, 31
Hessian matrix, 74
heterogeneous, 83
high-pass filter, 237
Hilbert space, 99, 314
holes, 289
homogeneous, 83
hysteresis, 266
hysteresis loop, 266
hysteresis memory curve, 279
ideal low-pass filter, 186
illumination component, 182
image degradation problem, 190
image handling, 191
image understanding, 191
impulse response, 189
indirect method, 126
inhomogeneous, 83
inner product space, 312
insulators, 80
invariant, 189
inverse discrete Fourier transform, 212
inverse Fourier transform, 206
inverse transform, 215
inverse transformation kernel, 215
isotropic, 83
iterated function system (IFS), 255
iterated function system with probabilities, 257
Iterated Function Theorem, 298
iterated fuzzy set system, 257
iterated fuzzy set systems, 257
Jacobi iteration matrix, 329
Jacobi vector, 329
Jacobian, 14
JPEG, 241
Kolmogorov-Smirnov distance, 32
Kronecker, 31
Lagrange, 201
Laplace pyramid, 40, 41
Laplace's equation, 89
Lax-Milgram Lemma, 314
least square restoration, 191
least squares method, 101
linear, 83
linear programming problem, 60
local (partitioned) contraction mapping, 255
local equilibrium state, 173
local Maxwellian distribution, 173
local minima, 55
Lorentz condition, 91
low-discrepancy, 164
low-discrepancy sequence, 164
low-level procedures, 191
low-pass filter, 237
low-pass filtering, 186
Madelung deletion, 47
magnetic field, 84, 86, 87
magnetic flux, 84
magnetic flux density, 84
magnetic reversal curve, 266
magnetic saturation curve, 266
magnetization, 90

magnetizations, 22
maintainability, 291
Mallat transform, 243
Masing Law, 45
MATLAB, 293
Maxwell equations, 22
Maxwell's macroscopic equations, 87
Maxwell's magnetic field equation, 88
Maxwell-Ampère's law, 87
Maxwell-Faraday law, 87
Maxwell-Gauss electric law, 87
minima, 55
model, 287
molecules, 2
mother wavelet, 221
multi-grid method, 103
multiplicative congruential method, 169
multiresolution analysis, 223
multiscale analysis, 36, 192, 199
Mumford-Shah segmentation energy model, 199
NAG, 293
natural boundary, 121
Newton Algorithm, 71
Newton Method, 62
Newton method, 13, 14
Newton-Armijo Algorithm, 71
non-conformal finite element method, 106
non-homogeneous, 83
non-isotropic, 83
non-linear programming problem, 60
nonuniformity, 27
Nyquist, 216
Nyquist samples, 216
objective fidelity, 194
Ohm's law, 83
operator, 258, 274
optimal control problem, 62
option pricing, 334
options, 334
Palmgren-Miner rule, 43
pels, 183
Petrov-Galerkin method, 101
phase angle, 206
piecewise monotone, 277
pixels, 183
play operator, 271
Poincaré inequality, 324
Poisson's equation, 89
Polak-Ribière conjugate gradient method, 73
polarization, 90
portability, 291
position, 189
positive, 314
positive definite, 314
potential, 122
potential vector, 90
power spectrum, 206
pre-Hilbert space, 312
Preisach memory curves, 273
Preisach type operator, 273
primitive finite element method, 104
projection, 61, 313
projector operator, 61
pseudo-random numbers, 168
pseudo-random sequences, 168
put options, 335
quadratic functional, 58
quadratic programming problem, 60
quadrature rules, 156
quasi-Monte Carlo approximation, 160
quasi-random points, 164
quasi-random sequences, 164
radical inverse function, 164
rainflow counting method, 274
rainflow matrix, 47
rainflow method, 43
rainflow residual, 275
rate-independent, 49
reconstruction algorithm, 234, 235
rectangular, 113

rectangular rule, 154
recurrent iterated function system, 255
refinement equation, 233
reflectance component, 182
relative minima, 55
relay with thresholds, 271
reliability, 291
residual, 100
residual graph, 47
residue, 47
rigidity, 2
risk management, 335
RLaB, 293
robustness, 291
Safing sensor, 17
Sampling Theorem, 216
sampling theorems, 216
scalar potential, 90
Schwartz distribution, 318
Second Strang Lemma, 108
segmentation, 199
semiconductor, 289
separable, 215
Shannon wavelet, 226
shape functions, 112
ships, 27, 36
similarity, 3
simulation model, 288
single layer potential, 125
singular, 306
smooth curve, 216
Sobolev space, 321, 323
SORP, 340, 341
spatial domain, 184
spectral density, 206
SSORP-PCG, 342
stable media, 92
star discrepancy, 165
stop operator, 283
stress-strain-plane, 51
strike price, 335
subjective fidelity, 194
support, 318
symmetric, 215, 306, 314
techniques of pyramids, 37
temporal, 208
test functions, 318
tetrahedral, 113
time domain, 208
time windows, 220
time-limit, 217
time-limited, 217
transform of a transform, 212
transverse electric wave, 92
transverse magnetic wave, 92
trapezoidal rule, 155
triangular, 113
triangulation, 111, 115
unconstraint optimization problem, 55
underlying, 334
underlying asset, 334
uniformly distributed, 160
uniformly distributed sequence, 160
usability, 291
van der Corput sequence, 165
van der Waals radius, 2
variational (weak) solution, 98
variational equation, 59
variational inequality, 59, 111
variational inequality problem, 59
volatility, 335
wavelet, 220
wavelet coefficients, 221
wavelet series, 221
wavelet transform, 222
weight functions, 101
weighted residual method, 100
Applied Optimization

1. D.-Z. Du and D.F. Hsu (eds.): Combinatorial Network Theory. 1996 ISBN 0-7923-3777-8
2. M.J. Panik: Linear Programming: Mathematics, Theory and Algorithms. 1996 ISBN 0-7923-3782-4
3. R.B. Kearfott and V. Kreinovich (eds.): Applications of Interval Computations. 1996 ISBN 0-7923-3847-2
4. N. Hritonenko and Y. Yatsenko: Modeling and Optimization of the Lifetime of Technology. 1996 ISBN 0-7923-4014-0
5. T. Terlaky (ed.): Interior Point Methods of Mathematical Programming. 1996 ISBN 0-7923-4201-1
6. B. Jansen: Interior Point Techniques in Optimization. Complementarity, Sensitivity and Algorithms. 1997 ISBN 0-7923-4430-8
7. A. Migdalas, P.M. Pardalos and S. Storey (eds.): Parallel Computing in Optimization. 1997 ISBN 0-7923-4583-5
8. F.A. Lootsma: Fuzzy Logic for Planning and Decision Making. 1997 ISBN 0-7923-4681-5
9. J.A. dos Santos Gromicho: Quasiconvex Optimization and Location Theory. 1998 ISBN 0-7923-4694-7
10. V. Kreinovich, A. Lakeyev, J. Rohn and P. Kahl: Computational Complexity and Feasibility of Data Processing and Interval Computations. 1998 ISBN 0-7923-4865-6
11. J. Gil-Aluja: The Interactive Management of Human Resources in Uncertainty. 1998 ISBN 0-7923-4886-9
12. C. Zopounidis and A.I. Dimitras: Multicriteria Decision Aid Methods for the Prediction of Business Failure. 1998 ISBN 0-7923-4900-8
13. F. Giannessi, S. Komlósi and T. Rapcsák (eds.): New Trends in Mathematical Programming. Homage to Steven Vajda. 1998 ISBN 0-7923-5036-7
14. Ya-xiang Yuan (ed.): Advances in Nonlinear Programming. Proceedings of the '96 International Conference on Nonlinear Programming. 1998 ISBN 0-7923-5053-7
15. W.W. Hager and P.M. Pardalos: Optimal Control. Theory, Algorithms, and Applications. 1998 ISBN 0-7923-5067-7
16. Gang Yu (ed.): Industrial Applications of Combinatorial Optimization. 1998 ISBN 0-7923-5073-1
17. D. Braha and O. Maimon (eds.): A Mathematical Theory of Design: Foundations, Algorithms and Applications. 1998 ISBN 0-7923-5079-0

18. O. Maimon, E. Khmelnitsky and K. Kogan: Optimal Flow Control in Manufacturing. Production Planning and Scheduling. 1998 ISBN 0-7923-5106-1
19. C. Zopounidis and P.M. Pardalos (eds.): Managing in Uncertainty: Theory and Practice. 1998 ISBN 0-7923-5110-X
20. A.S. Belenky: Operations Research in Transportation Systems: Ideas and Schemes of Optimization Methods for Strategic Planning and Operations Management. 1998 ISBN 0-7923-5157-6
21. J. Gil-Aluja: Investment in Uncertainty. 1999 ISBN 0-7923-5296-3
22. M. Fukushima and L. Qi (eds.): Reformulation: Nonsmooth, Piecewise Smooth, Semismooth and Smoothing Methods. 1999 ISBN 0-7923-5320-X
23. M. Patriksson: Nonlinear Programming and Variational Inequality Problems. A Unified Approach. 1999 ISBN 0-7923-5455-9
24. R. De Leone, A. Murli, P.M. Pardalos and G. Toraldo (eds.): High Performance Algorithms and Software in Nonlinear Optimization. 1999 ISBN 0-7923-5483-4
25. A. Schöbel: Locating Lines and Hyperplanes. Theory and Algorithms. 1999 ISBN 0-7923-5559-8
26. R.B. Statnikov: Multicriteria Design. Optimization and Identification. 1999 ISBN 0-7923-5560-1
27. V. Tsurkov and A. Mironov: Minimax under Transportation Constraints. 1999 ISBN 0-7923-5609-8
28. V.I. Ivanov: Model Development and Optimization. 1999 ISBN 0-7923-5610-1
29. F.A. Lootsma: Multi-Criteria Decision Analysis via Ratio and Difference Judgement. 1999 ISBN 0-7923-5669-1
30. A. Eberhard, R. Hill, D. Ralph and B.M. Glover (eds.): Progress in Optimization. Contributions from Australasia. 1999 ISBN 0-7923-5733-7
31. T. Hürlimann: Mathematical Modeling and Optimization. An Essay for the Design of Computer-Based Modeling Tools. 1999 ISBN 0-7923-5927-5
32. J. Gil-Aluja: Elements for a Theory of Decision in Uncertainty. 1999 ISBN 0-7923-5987-9
33. H. Frenk, K. Roos, T. Terlaky and S. Zhang (eds.): High Performance Optimization. 1999 ISBN 0-7923-6013-3
34. N. Hritonenko and Y. Yatsenko: Mathematical Modeling in Economics, Ecology and the Environment. 1999 ISBN 0-7923-6015-X
35. J. Virant: Design Considerations of Time in Fuzzy Systems. 2000 ISBN 0-7923-6100-8

36. G. Di Pillo and F. Giannessi (eds.): Nonlinear Optimization and Related Topics. 2000 ISBN 0-7923-6109-1
37. V. Tsurkov: Hierarchical Optimization and Mathematical Physics. 2000 ISBN 0-7923-6175-X
38. C. Zopounidis and M. Doumpos: Intelligent Decision Aiding Systems Based on Multiple Criteria for Financial Engineering. 2000 ISBN 0-7923-6273-X
39. X. Yang, A.I. Mees, M. Fisher and L. Jennings (eds.): Progress in Optimization. Contributions from Australasia. 2000 ISBN 0-7923-6175-X
40. D. Butnariu and A.N. Iusem: Totally Convex Functions for Fixed Points Computation and Infinite Dimensional Optimization. 2000 ISBN 0-7923-6287-X
41. J. Mockus: A Set of Examples of Global and Discrete Optimization. Applications of Bayesian Heuristic Approach. 2000 ISBN 0-7923-6359-0
42. H. Neunzert and A.H. Siddiqi: Topics in Industrial Mathematics. Case Studies and Related Mathematical Methods. 2000 ISBN 0-7923-6417-1

KLUWER ACADEMIC PUBLISHERS - DORDRECHT / BOSTON / LONDON
