Harmonic Analysis
Hans G. Feichtinger
hans.feichtinger@univie.ac.at
October 1, 2013
Abstract
Harmonic Analysis (HA) has the Fourier transform, convolution operators, translation invariant linear systems, and the Shannon sampling theorem based on Poisson's formula as central topics. During the last century HA has split into an abstract direction, dealing with functions over locally compact Abelian (LCA) groups, and on the other hand a variety of application areas, such as digital signal and image processing or mobile communication, influencing our daily life.
This article describes a line of thought which has been developed by an abstract harmonic analyst who has turned into an application oriented harmonic analyst over the last decades. The author also likes to run numerical experiments in order to gain insight into both practical questions as well as abstract concepts.
The change of perspectives that took place through this process is described as a transition from Classical Fourier Analysis to Postmodern Harmonic Analysis, with strong functional analytic aspects, but also incorporating concepts from abstract and computational harmonic analysis.
This is just a first attempt, with a modest philosophical background, but we hope that the terminology, and above all the views, recommendations and methods described in this note will help to enrich harmonic analysis and contribute to its healthy development in the years to come.
Readers who don't agree with, or who support, the author's view-points expressed in this note are encouraged to contact him and to share their views, also with younger and older colleagues active in the field, as well as with engineers or computer scientists and other practical users of Fourier analysis.
Since we will talk a lot about concepts and ideas we have to acknowledge that concepts emerge in the scientific community, typically have a life span where they receive high attention (see the interesting article by Wilder [62]), and once they have been explored properly find their way into the canon of teaching at the universities or get discarded. Among the concepts relevant for analysis the ideas of Fourier certainly play a central role and have influenced the field for centuries now. Nevertheless, the chapter should not be closed: the way Fourier analysis is taught may need some reconsideration, and this article will propose some ideas in such a direction.
1 Mathematical Problems and their Tools
1.1 A Coarse Tour d'Horizon
Let us first quickly try to summarize the development of Fourier analysis. Where did the theory start, and how far has Fourier analysis come in two centuries, in particular during the last decades, in which the distributional view-point appearing in the context of micro-local analysis works with sophisticated arguments from so-called hard analysis? On the other hand, the numerical practice performed by people working in electrical engineering, signal and image processing or computer science (of course we hope to convince at least some readers that the distributional view-point is not so complex, and that it should be further developed to be more useful for the applied scientists!) is perceived nowadays as a far distant subject having little in common with Classical Fourier Analysis, apart from the name. In this situation we will try to argue that there is more to be said and more to be done for the (re-)unification of the field, with a high potential for synergy, making the two branches more supportive of each other in a significant way.
Not surprisingly our story starts about 200 years ago with J.B. Fourier's original claim that every periodic function has a representation as an infinite trigonometric series. It took a century to give a proper meaning to this statement, because it required subsequent generations of mathematicians (including Riemann and Lebesgue) to develop proper notions of what a function is (leading to the currently widely accepted concept of a mapping) and how an integral of such a function should be interpreted, in the most general setting. Nowadays these concepts appear as properly determined and granted, and a person raising doubts about such well established concepts has a difficult position. But if we try to take a naive (not to say child-like) viewpoint, we have to ask ourselves: are equivalence classes of measurable functions modulo null-functions in the Lebesgue sense really such natural objects? Are they suitable to describe something like the change of temperature in a room or in a town?
Even more provocatively, let us shortly discuss a very simple (physical) function on R^4, namely the temperature in a room or in some region, say a town. At first sight one may expect that this is a smooth function defined everywhere, describing the temperature at each point in space, using a world coordinate system, and for any given moment in time (during the observation interval). But does it really have a physical meaning to even think of the temperature during a millisecond in a point-like volume? Is there a clear meaning to the expression temperature at a given location at a particular time, or do we realistically infer the existence of such a function only indirectly (is it a generalized function?) by measuring local and temporal means? Does it help us in this situation that we are allowed to ignore the values of the function over a null-set? Will it be an L^p-function? In addition we have to face that computers will never be able to handle an uncountable amount of information.
Of course this example is mentioned in order to stimulate a rethinking of the question whether mathematicians have discovered a god-given truth and the right way of describing functions (under any circumstances), or whether this is more a history-based, sophisticated and universally accepted way of describing things, which might also be replaced by alternative ways. So what about the idea of taking functions (as we think of them now) as possible limits (available only under favorable conditions) of a natural
concept of local average information (nowadays called distributions)? In other words, could one think of distributions as natural objects, and of difficulties in Fourier analysis as partially stemming from pushing inappropriately for domains (such as L^1(R^d))?
Of course a lot of positive things can (and should) be said about the Lebesgue integral. In some sense the original problems of Fourier theory (arising from questions of the convergence of Fourier series) could be elegantly answered by the theory of Lebesgue integration, and thus younger generations learning the subject (including the author when he was young) view this (important) approach as being carved in stone, as the right way to view things, as the final insight. As a consequence, the way most of us are teaching the subject is not too different from how it was done 60 years ago, just a little bit more elegant and with a slightly stronger functional analytic flavor.
Coming back to the Fourier transform: once the Lebesgue integral was established it was possible to properly define the Fourier transform on the Banach space L^1(R^d), in fact on (L^1(G), ‖·‖_1) for general LCA groups G, and then, via a standard density argument, to extend it to a unitary Fourier-Plancherel transform on L^2(G). But it is also clear that this Fourier-Plancherel transform is not anymore an integral transform in the strict (pointwise a.e.) sense. Furthermore Plancherel's theorem can be proved in a similar way starting from any reasonable dense subspace; one does not need L^1(R^d) for this purpose.
If we look back we can see that Fourier Analysis was deeply involved in the development of modern functional analysis and measure theory, the proof of Plancherel's theorem being a typical example of how things can be done if technical problems arise. The method consists in reducing the problem to dense subspaces and then extending the operators by an approximation argument, using estimates for their mapping properties with respect to appropriate norms.
But of course the development did not stop, and new view-points have been developed to cope with objects (then called generalized functions or distributions) which should have Fourier transforms as well, somehow as limits (but in which sense?) of ordinary functions, very much like real numbers are considered as limits of their finite decimal approximations. And of course such considerations, if they are to be put on solid ground, have to deal with the problem of existence and of the natural embedding of the old objects (here: ordinary functions) into the larger context.
The reader will easily guess that with my arguments given above I am opening the stage for distributions, which can be elegantly described via their actions as linear functionals on the test functions (I would call this the modern view-point). Alternatively (but less elegantly) they can be described by a more elementary view-point working with sequences of ordinary functions. Such an approach was prominently pursued in the work of M.J. Lighthill [43]. Although it is nice to know that such an approach exists, its influence on the community has been minor in the last decades and virtually nobody is following it anymore, because it is too cumbersome compared with the more powerful and elegant functional analytic setting (see e.g. [34]).
Among the different possible settings (including ultra-distributions and Gelfand-Shilov classes) that one can choose in order to describe generalized functions, the approach provided by Laurent Schwartz is certainly the most popular and in some sense the most natural one. It is based on the space S(R^d) (of rapidly decreasing functions) as a space of test functions, endowed with a natural family of seminorms. The vector space of so-called tempered distributions is then simply the dual S'(R^d), the space of continuous linear functionals on S(R^d). The construction is carried out in such a way that S(R^d) is invariant under the Fourier transform, which allows to extend the action of the FT to the dual space. This approach has finally revolutionized the treatment of PDEs (see the fundamental work of Lars Hörmander, [34-36], which we cannot elaborate here), leading to micro-local analysis and the theory of pseudo-differential operators.
For questions of abstract HA over LCA groups a corresponding theory was proposed by Bruhat (see [5, 50, 51]), but since it makes use of structure theory it is a rather complicated tool and has found relatively little use in the past (see e.g. the work of M. Rieffel [53], or A. Vourdas [59]).
In the last two or three decades of the last century time-frequency analysis and in particular Gabor analysis received a lot more attention (in parallel with the rapid growth of wavelet theory), and new methods had to be developed in order to properly describe the situation encountered in this setting. In fact, it was not really possible to save Gabor's original approach by an appropriate use of distribution theory ([38]). Instead, it became more important to have a family of suitable function spaces to describe mapping properties of relevant operators, such as frame operators or Gabor multipliers, namely the modulation spaces (see [14] or [16]). We also had to learn how to handle redundant representations, now known as Gabor frames (as a counterpart to Gaborian Riesz bases). This is in contrast to the situation of wavelet theory, where it was possible to design suitable orthonormal bases of wavelet type, which at the same time also form unconditional bases for a whole range of function spaces, including the Besov and Triebel-Lizorkin spaces.
Interestingly enough (but not yet fully accepted or known in the community) the tools (test functions and distributions) arising in this context are also useful in the description of questions outside time-frequency analysis, in particular in the context of classical Fourier analysis (summability methods) as well as for the description of questions of discretization. In fact, one of the main points of this little note is to try to indicate the possible use of what we call the Banach Gelfand Triple based on the Segal algebra S_0(G) for a wide range of problems, including Gabor analysis, but not at all limited to this setting (see [9] for a full account).
Remark 1. We also advocate the view-point that the development of mathematics as a discipline is not only based on the amount of facts accumulated, or on the increased complexity of statements possible within the theory, or on continued specialization. On the contrary: Major progress is based on the lucky moments in the history of our science where new view-points allow us to get a better and easier understanding of what has been done so far, as well as (ideally) enabling us to answer questions that could not be answered using the classical tools (see corresponding remarks in [7]).
1.2 Classical Fourier Analysis
A good overview of what is nowadays perceived as Classical Fourier Analysis is provided by the survey talk held by Charles Fefferman at the International Mathematical Congress 1974, entitled Recent Progress in Classical Fourier Analysis [12]. It features among others Calderon-Zygmund operators, H^p-spaces, atomic decompositions and Cotlar's Lemma, and of course Carleson's famous Acta paper [6] on the convergence and growth of partial sums of Fourier series. In fact, in this period the school around E. Stein systematically developed ways to describe the smoothness of functions, using Bessel potentials, allowing to describe fractional smoothness, with Sobolev spaces arising as the natural cornerstones for integer smoothness, positive or negative, as well as the family of Besov spaces B^s_{p,q}(R^d). They can be viewed as generalized Lipschitz spaces with respect to L^p-norms (see Stein's book on Singular Integrals and Differentiability Properties of Functions [55]). See the books by Muscalu and Schlag [49] for a recent account in this direction, or the books by Grafakos (e.g. [31]).
It turned out that the Littlewood-Paley decompositions of these spaces allow to provide atomic decompositions of tempered distributions with the extra property that the (weighted) summability conditions of the corresponding coefficients (which are not uniquely determined), expressed in terms of weighted mixed norm spaces, determine the membership of a given function in one of these smoothness spaces. As a typical reference see the early work of Frazier and Jawerth ([29]), which was inspired by the work of Peetre. The atomic decompositions described there can in fact be seen as precursors to modern wavelet theory.
For the pioneers in interpolation theory, Jaak Peetre and Hans Triebel, these function spaces and the use of dyadic decompositions on the Fourier transform side were the starting point to identify the corresponding interpolation spaces, using either real or complex interpolation methods (for pairs of Banach spaces). In this way the family of Triebel-Lizorkin spaces F^s_{p,q}(R^d) arose (with Bessel potential spaces being special cases, for q = 2, within this family). For a systematic summary of all the known properties of these spaces (duality, embedding, traces and much more) the reader may consult the books of H. Triebel (e.g. [56, 57]). It was also recognized that the L^p-spaces belong to this family, but only for 1 < p < ∞, while one should replace L^1(R^d) and L^∞(R^d) by the Hardy space H^1(R^d) and its dual, the famous BMO-space, respectively. The usefulness of these function spaces for many purposes relevant at that time, e.g. the study of maximal functions or the theory of Calderon-Zygmund (CZ) operators, gave this line of research (among others) great visibility within the analysis community.
Later on the good fit between wavelet orthonormal systems and this setting, in combination with the existence of efficient numerical algorithms, was among the various reasons why wavelet theory took off very quickly and in fact immediately caught the attention of pure and applied mathematicians as well as applied scientists such as electrical engineers or computer scientists interested in image processing. From the very beginning, e.g. in the first preprints of Yves Meyer, it has been pointed out that the wavelet ONBs are not just orthonormal bases for L^2(R^d), but they are also unconditional bases for all those function spaces mentioned above. Moreover, the matrix representation of CZ-operators in such wavelet bases has good off-diagonal decay, which in turn explains many of their good properties, and why these function spaces are very suitable for the description of CZ-operators. It also became clear that good wavelets have to have good decay (concentration in the time-domain) as well as satisfy a few moment conditions, in order to allow the characterization of functions resp. tempered distributions through the size of their wavelet coefficients, or equivalently through the membership of the continuous wavelet transform in a suitable weighted mixed-norm space over scale space.
The intensive study of the connection between membership of functions in various function spaces (of high or low smoothness, with or without decay in time) and the corresponding wavelet coefficients, which more recently has been carried over to the anisotropic setting as well, led to the insight that strong decay allows one to guarantee that most of the function (e.g. in the sense of some L^p-norm) is already encapsulated in a relatively small number of wavelet terms. From there the theory of sparsity took off, with the search for efficient algorithms to find the best approximation of a signal by finite partial sums of wavelet series. Such ideas are behind certain data compression methods, using thresholding methods, but also provide a setting for compressed sensing.
Although wavelet theory is far from providing universal tools for analysis (nor does any other theory), it was nevertheless very inspiring for a number of developments in modern analysis. Describing function spaces by suitable (orthogonal or even over-complete and non-orthogonal) sets of specific functions became an important branch of analysis, contributing to image compression applications, to the description of pseudo-differential operators, and to the characterization of function spaces. By now we have a theory of shearlet expansions, as well as a variety of concepts for atomic decompositions and again the characterization of various function spaces in the complex domain.
Out of this rich variety of function spaces, which in some respect are all very similar to each other (the corresponding unified coorbit theory was developed in the late 80s, see [17]), we will pick out a few spaces which arose originally in the context of time-frequency analysis. There is also a natural link to the Schrödinger representation of the reduced Heisenberg group (from a group representation theoretical point of view), or, on the other hand, simply to a characterization of function spaces through the behavior of the STFT (short-time Fourier transform) of their elements (see [18]).
There are many reasons for going with this specific setting: Above all it is simple to explain, and it treats the time- and the frequency variable on equal footing, hence it is optimally suited for a description of the Fourier transform (as a transition from pure frequencies to Dirac point measures). Secondly it has many applications in the description of real analysis problems (e.g. Poisson's formula or the Fourier inversion theorem). Finally one can obtain in this setting a kernel theorem and other useful descriptions of operators which reduce to MATLAB implementable terms in the case of the group G = Z_n.
The rest of the manuscript is organized as follows: First we give a quick summary of the ingredients describing the Banach Gelfand Triple (S_0, L^2, S_0'), just enough to indicate that it is the proper setting for the description of time-frequency analysis, for many aspects of classical analysis, even for problems in abstract harmonic analysis, but above all a suitable format to relate the connections between the different settings (in the spirit of Conceptual Harmonic Analysis). We then provide a few typical cases where this setting makes the description much easier and more transparent than the classical approaches found in the literature. Finally we will indicate (only shortly) the relevance and natural occurrence of the so-called w*-convergence in S_0'(G). We list a few occurrences, and show how it can be used to turn typical heuristic arguments used in the literature into solid mathematical statements formulated in a distributional setting.
2 The Idea of Conceptual Harmonic Analysis
Let us try to describe our aim and vision of Postmodern Harmonic Analysis through the concept of Conceptual Harmonic Analysis, which has been proposed by the author already a while ago, as a counterpart to abstract and applied resp. computational harmonic analysis. Let us therefore briefly summarize the short-comings of the current AHA view-point. Again, the historical perspective may help to understand the situation a bit better.
2.1 From Fourier to the FFT and back
Many books in the field of Fourier Analysis follow the historical path, reminding the reader of the establishment of the theory of Fourier series, which in the context of functional analysis is nothing else than the expansion of elements of a Hilbert space (here L^2(T)) with respect to some complete orthonormal system (here the system of complex exponentials). If we take the abstract view-point we are in the setting of a compact Abelian group G, for which one can always find a family of characters χ ∈ Ĝ forming (automatically) a CONB for L^2(G); hence we have an unconditional series expansion

f = Σ_{χ∈Ĝ} ⟨f, χ⟩ χ,  for f ∈ L^2(G).  (1)
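For the finite group G = Z_N this expansion is a plain statement of linear algebra and can be checked directly; the following minimal numpy sketch (purely illustrative, with the normalized characters χ_k(x) = e^{2πikx/N}/√N) verifies (1):

```python
import numpy as np

N = 16
x = np.arange(N)
# rows: the normalized characters chi_k(x) = exp(2j*pi*k*x/N) / sqrt(N)
chars = np.exp(2j * np.pi * np.outer(np.arange(N), x) / N) / np.sqrt(N)

f = np.random.randn(N)
coeffs = chars.conj() @ f                  # the inner products <f, chi_k>
assert np.allclose(chars.T @ coeffs, f)    # f = sum_k <f, chi_k> chi_k
```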
Later on the condition of periodicity of functions could be given up; the theory of Fourier series was replaced by the technically much more challenging continuous Fourier transform theory, but in both cases the question of inversion is rather delicate if understood in a pointwise sense (see [6]). Later on A. Weil ([61]) and others (Gelfand theory) pushed the theory of the Fourier transform to its natural limits, as far as the setting is concerned, namely the underlying LCA (locally compact Abelian) groups. For the concrete setting of Euclidean spaces the theory of tempered distributions developed by Laurent Schwartz ([54]) provided the natural setting. Chronologically at the end of this development (and there are meanwhile books on different versions of the fast Fourier transform) the FFT came into the picture, see [8].
But couldn't we just try to revert this order? After all, the FFT routine realizes the DFT (discrete Fourier transform), which is a simple and nice unitary mapping from C^n to C^n (up to the usual normalization factor √n) and can be taught in any linear algebra course. Looking at the matrix in a concrete way it turns out to be a Vandermonde matrix, which allows to connect the properties of the finite Fourier transform with properties of polynomials. In this way it is easy to explain why regular subsampling corresponds to a periodization of the spectrum. Just for an illustration consider the polynomial p(t) = 1 + 2t^2 + 3t^4, which can be interpreted as the lower order polynomial q(z) = 1 + 2z + 3z^2, evaluated at z = t^2. But taking the squares of the unit roots of order n is the same as running twice through the unit roots of order n/2 (assuming that n is even).
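Both observations are easy to check numerically; here is a small numpy sketch (an illustration under the standard DFT convention, not code from any cited source) verifying that subsampling by a factor of 2 folds the spectrum, and that squaring the n-th roots of unity runs twice through the (n/2)-th roots:

```python
import numpy as np

n = 8
x = np.random.randn(n) + 1j * np.random.randn(n)
X = np.fft.fft(x)

# subsampling by 2 in time periodizes (folds) the spectrum:
#   fft(x[::2])[k] = (X[k] + X[k + n/2]) / 2
assert np.allclose(np.fft.fft(x[::2]), (X[:n // 2] + X[n // 2:]) / 2)

# p(t) = 1 + 2 t^2 + 3 t^4 = q(t^2): on the n-th roots of unity it takes
# only n/2 distinct values, since t_{k+n/2}^2 = t_k^2
t = np.exp(2j * np.pi * np.arange(n) / n)
p = 1 + 2 * t**2 + 3 * t**4
assert np.allclose(p[:n // 2], p[n // 2:])
```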
But how can this help us to get from this finite setting to Fourier series and eventually to the continuous Fourier transform? Can we say that (for sufficiently large n at least) the FFT allows us to compute a good approximation of f̂, at least for nice functions f? What is the connection between the FFT of a sequence of regular samples of f and corresponding samples (at which spacing?) of f̂? What kind of setting would we need to answer such a question?
Let us just mention here (because we will not be able to provide the details of the answer in this note in sufficient detail): The distributional setting allows us to view finite signals as periodic and discrete signals (if we use for a moment engineering terminology), resp. as discrete periodic (hence unbounded, but translation bounded) measures, which, properly described, will converge to a continuous limit in a very natural way!
2.2 Conceptual Harmonic Analysis
While Abstract Harmonic Analysis (AHA) was and is helpful for ignoring the technical differences between the different settings, it only establishes the analogy between different groups. Already this has some advantages compared to the engineering approach. Engineering books often make a distinction between continuous and discrete variables, between one and several dimensions, and between periodic and non-periodic signals; the AHA perspective just asks for the identification of Ĝ, given G. Depending on the context the elements of the dual group are called characters, pure frequencies or plane waves (see e.g. [11]).
AHA also provides general insights such as the fact that Ĝ is discrete if and only if G is compact, and vice versa. Also there is a natural identification of the dual group of Ĝ with G itself (formulated in the Pontrjagin-van Kampen duality).
The idea of Conceptual Harmonic Analysis (CHA) is a more integrative view-point. In the Postmodern Era we already have all these tools, and we should try to put things together in a more coherent way. In other words, we can do Fourier analysis in the setting of numerical software packages (many of us are using MATLAB, or other packages), but we are also still interested in questions of continuous Fourier analysis. We may thus ask questions like the following ones: How can we compute the norm of the Fourier transform of a given function in some function space? How can we approximately identify the turning point or a local maximum of the Fourier transform of some explicitly given Schwartz function f ∈ S(R^d), preferably with a given precision, not depending on the individual function, but only on qualitative information about it? Moreover, the process should be computationally realizable, and questions of resources (such as time) and memory requirements will enter the evaluation of any such algorithm, which of course will typically start with samples of the function taken over a regular grid, and will somehow make use of the FFT. But how can it be done, what kind of new proofs will be required, and under which conditions will the users be satisfied with the method (e.g. because they can be assured that the performance is close to optimal)?
Obviously plotting the FFT of a sequence of samples will not deliver such a result, even if it might (properly done) give some indication of what one can expect. Note that the result of the FFT is just a finite sequence (maybe multi-indexed) of complex numbers, while we are looking for a function, labeled by natural parameters, such as Hz, to describe the acoustical frequency content of a musical signal.
This reasoning brings us to the idea that the continuous limit can be obtained in various different ways, very much like Riemann sums approximate the Riemann integral. In other words, we are thinking of a family of approximating operations for our target operation, which form a kind of Cauchy net, because we clearly expect that two sufficiently fine approximations are also close to each other in a suitable sense. But which approach is computationally most effective, or most robust?
In summary, CHA (as we try to promote it) tries to emphasize the connections between the different settings in addition to the analogy already provided by the AHA perspective. The current state of this approach still emphasizes the qualitative aspects. We see some relevant results telling us that certain approximations work well asymptotically for a large class of functions. A typical representative of such a result is the computation of f̂ for f ∈ S_0(R^d), using only computations (FFTs) applied to (periodized) function samples, as given in [40].
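To indicate the flavor of such a result (without reproducing the precise statement or algorithm of [40]), here is a small numpy sketch; the routine `approx_ft`, the grid sizes and the test function are purely illustrative choices. For a well-concentrated function such as the Gaussian e^{-πx^2}, which equals its own Fourier transform, the FFT of its samples reproduces samples of f̂ almost to machine precision:

```python
import numpy as np

def approx_ft(f, n=256, L=16.0):
    """Approximate fhat(w) = int f(x) exp(-2 pi i w x) dx via an FFT of
    n samples of f on the centered grid of total length L (step h = L/n);
    the resulting frequencies live on the grid of spacing 1/L."""
    h = L / n
    x = (np.arange(n) - n // 2) * h
    w = (np.arange(n) - n // 2) / L
    # Riemann sum over the samples = a centered DFT (fftshift bookkeeping)
    F = h * np.fft.fftshift(np.fft.fft(np.fft.ifftshift(f(x))))
    return w, F

gauss = lambda x: np.exp(-np.pi * x**2)   # invariant under the Fourier transform
w, F = approx_ft(gauss)
print(np.max(np.abs(F - gauss(w))))       # error near machine precision
```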
Of course we expect that in the near future aspects of approximation theory will have to come into the picture, as well as aspects of numerical analysis. Just think of the familiar situation found in numerical integration: how densely should one sample in order to guarantee that the Riemann sum is sufficiently close to the integral, given some a priori information about the smoothness of the function f which is to be integrated? To which extent does the required sampling rate depend on this quality of f? And in addition: aren't there more efficient numerical integration methods (comparable to something like the trapezoidal rule) that allow to achieve the same level of approximation at lower computational complexity, at least for nice functions, maybe not for the most general ones (i.e. the continuous functions, which are well suited for the Riemann integral in its full generality)?
So at the end we see that the proposed setting requires an understanding of questions reaching from approximation theory to numerical analysis and abstract harmonic analysis, but in a more integrative sense than usual.
2.3 A Comparison with Number Systems
Of course the author has already tried to explain some of the proposed concepts to his students, and in doing so a comparison with a familiar situation arising in the analysis courses turned out to be helpful. More precisely, we think about the use of the different number systems, resp. the different fields which play a role for computations.

Let us therefore present these comparisons here in a nutshell. We also hope that the perspective provided in this way will make it easier for readers less familiar with functional analytic thinking (say a graduate engineering student) to appreciate the relevance of the three layers proposed through our concept of Banach Gelfand Triples, which will play a central role below.
Thus let us first recall that we all have learned about the tower of number systems. First of all we have the rational numbers, the minimal field containing the natural numbers. In other words, we need and get them just by carrying out the four basic operations (addition etc.), starting from the natural number system N. In fact we have just two important operations at hand, namely addition and multiplication. But because all of them are supposed to be invertible (except for multiplication by zero), we have in fact subtraction (formally: addition of the additive inverse element) and division (multiplication with the multiplicative inverse element) at our disposal. Overall, there are quite simple algebraic rules, and also compatibility between the two operations. In particular, every child knows that the inverse of a/b is just b/a (if b ≠ 0!).
We also have learned in our analysis courses that Q is incomplete, and that there is no rational number x such that x^2 = 2. Hence there is/was a need to enlarge the rational numbers. Although the p-adic numbers are a fascinating object (where the completion is taken with respect to an alternative metric), we would like to remind the reader that this completion process leads to a uniquely determined complete field if we measure distances in the Euclidean metric. This uniqueness implies that one may work with different concrete models, knowing however that they are mutually equivalent from the logical point of view (maybe not so much from the practical or computational perspective). Of course the representation of real numbers as infinite decimal expansions has a number of advantages, e.g. with respect to the ability of quickly comparing the size of two numbers.
But this enlargement process (in the abstract setting done through equivalence classes of Cauchy sequences) is in a way non-trivial and requires treating two aspects carefully. One has to properly embed the rationals into the reals, i.e. to provide an injective mapping j : Q → R, i.e. to associate to each fraction p/q (with p and q having no common divisor) a well defined infinite decimal expansion r ∈ R; and on the other hand one has to extend the already existing multiplication of rational numbers to the new (and strictly larger) setting of real numbers. Of course we know in practice how to do it: given a pair of real numbers r_1, r_2 we truncate their infinite decimal expansions at a certain decimal, thus obviously obtaining rational numbers with denominators of the form 10^k for some k ∈ N, whose product is again a rational and hence a real number. Although this finite decimal expression is not just the truncation of r_1 · r_2, it is of course possible to verify that these products of truncated decimal expansions converge (for k → ∞) to some limit, i.e. more and more digits of the product are obtained exactly in this way, and we thus have a natural multiplication on R. Using similar ideas one computes the multiplicative inverse, i.e. 1/r for r ∈ R.
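A two-line numerical illustration of this truncation argument (didactic only): the products of the k-digit truncations of √2 and π stabilize, digit by digit, as k grows:

```python
from decimal import Decimal
import math

r1, r2 = math.sqrt(2), math.pi
exact = r1 * r2
for k in (1, 3, 6, 9):
    # truncate after k decimals: a rational with denominator 10**k
    t1 = Decimal(int(r1 * 10**k)) / 10**k
    t2 = Decimal(int(r2 * 10**k)) / 10**k
    print(k, t1 * t2, abs(float(t1 * t2) - exact))  # error shrinks like 10**(-k)
```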
This being said, it is clear that numbers such as 1/√2 are well defined and can be computed in many different ways, but the user does not have to care about the actual realization of these operations; just the well-definedness of the object within R counts and is sufficient in order to do interesting and correct mathematics.
In the end, the impossibility of solving quadratic equations (such as x^2 + 1 = 0) suggests that one may need a further extension of the real number system, and in fact it is a kind of miracle that the trick of adjoining the complex unit (or imaginary unit), typically denoted by i (or j in engineering), helps to overcome this problem.

But does this object really exist? Can we just write down a non-existent number? After all, it has been called the imaginary unit, indicating that it might not really exist, and wishful thinking alone cannot solve a real problem!
As we know, mathematicians have found a way to define the field of complex numbers C through pairs of real numbers. So instead of a complex number z = a + ib we deal with a pair of real numbers (z is viewed as a point in the complex or Gaussian plane), define addition and multiplication properly, and verify that we have obtained a field with respect to the addition and multiplication defined in the expected way. Again one has to identify R with the subset of C consisting of elements of the form (a, 0), a ∈ R, and to check that the new multiplication is (the only one) compatible with the old one, given for real numbers.
For our analogy we will use this comparison to indicate that generalized functions (such as the Dirac measure) are not just vague objects but rather well defined ones; one just has to be careful in manipulating them and follow clear rules, which are often motivated (if not uniquely determined in some way) by the behavior on ordinary functions. So in the end complex numbers exist in the same way as bounded linear functionals exist, in a very natural sense, allowing also computations with them, or the application of linear operators (such as convolution operators or the Fourier transform, which is originally defined as an integral transform on test functions; but this concrete form of operation does not have to be meaningful in the more general setting, just as the inversion of rationals is only trivial on the rationals, but not in the setting of real numbers, where a more complicated procedure has to be applied).
So altogether we see a graded system of number systems, with a natural embedding of one into the other (larger) one, so that whenever we are doing computations they can be done either at the lowest possible level (e.g. using rationals) or at the highest level, and the result will always be the same. In other words, there is a lot of consistency within this graded system of fields, and ambiguities that might occur (from a logical point of view) can be eliminated at the fundamental level, so that the user does not have to care about such ambiguities at all when doing practical work, i.e. computations using any of these number systems.

We can learn from this example that a proper terminology and a well-defined set of computational rules may simplify computations very much. In fact, one might argue that it is no surprise that mathematics was not of a high value within the Roman empire, just because the system of Roman numerals is rather inadequate for such a way of thinking.
2.4 Axiomatic Requests from Conceptual Harmonic Analysis
We have just seen that the idea of Conceptual Harmonic Analysis (CHA) requires above all a flexible setting which allows to describe the important operations in a natural and easy way.
Although a fine analysis of interesting (linear) operators requires the full collection of suitable function spaces, the reduction to a minimal set of spaces, namely the so-called Banach Gelfand Triple, will be enough for our purposes. On the other hand, part of our argumentation will be that the restriction to the Hilbert space setting alone would not make sense. In fact, many of the problems in classical Fourier analysis are connected with the concentration on the (too large) spaces L^2(R^d) and L^1(R^d).
The motivation comes partially from the success of the triple (S, L^2, S')(R^d), consisting of the Schwartz space of rapidly decreasing functions, the Hilbert space L^2(R^d) and the dual space of tempered distributions S'(R^d). It is also an example of a so-called rigged Hilbert space, i.e. a Hilbert space endowed with some extra properties.
Since the topology on S(R^d) (a Fréchet space with respect to a suitable metric) is a bit complicated, we suggest to look for a suitable Banach space of test functions. As a consequence it is also possible to practically work with the dual space, either endowed with the norm topology or (and we will see that this is an important extra structure) the so-called w*-topology. A typical object to be handled there is the Dirac comb Ш = Σ_{n∈Z^d} δ_n, because this is a central object in signal processing, e.g. in order to describe the sampling process (as a multiplication operator by Ш). Poisson's formula can in fact be expressed then as the fact that F(Ш) = Ш, which in turn implies that sampling of the signal corresponds to a periodization of the spectrum f̂ in the frequency domain.
We do not claim that this collection of requirements uniquely determines the Banach Gelfand Triple (S_0, L^2, S_0') described in the following section, but it certainly makes it a natural candidate.

3 The Banach Gelfand Triple (S_0, L^2, S_0')

The basic objects of time-frequency analysis are the translation and modulation operators,

T_x f(t) = f(t − x),   M_ω f(t) = e^{2πiω·t} f(t),   with x, ω, t ∈ R^d.
These operators are intertwined by the Fourier transform:

(T_x f)^ = M_{−x} f̂,   (M_ω f)^ = T_ω f̂.

Time-frequency analysis (TF-analysis) starts with the Short-Time Fourier Transform

V_g f(λ) = ⟨f, M_ω T_t g⟩ = ⟨f, π(λ)g⟩ = ⟨f, g_λ⟩,   λ = (t, ω).
For any pair of functions f, g ∈ L^2(R^d) the STFT V_g f is a bounded, continuous and square integrable function over phase space, i.e. defined over R^d × R^d. If ‖g‖_2 = 1 then the mapping f ↦ V_g f is even isometric, i.e. ‖V_g f‖_{L^2(R^d×R^d)} = ‖f‖_{L^2(R^d)}. A function f from L^2(R^d) belongs to the (smaller) space S_0(R^d) if for some non-zero Schwartz function g

‖f‖_{S_0} = ∫_{R^{2d}} |V_g f(x, ω)| dx dω < ∞.
Different windows define the same space and equivalent norms. One has isometric invariance of (S_0(R^d), ‖·‖_{S_0}) under time-frequency shifts and the Fourier transform; in fact these properties hold for (S_0(G), ‖·‖_{S_0}) over general LCA groups, making (S_0(G), ‖·‖_{S_0}) and its dual (S_0'(G), ‖·‖_{S_0'}) the ideal setting to discuss Banach spaces of functions resp. distributions with such invariance properties (see [4, 15] for discussions of some basic properties of such spaces).
Let us also note that a distribution σ ∈ S'(R^d) belongs to the subspace S_0'(R^d) if and only if its spectrogram V_g(σ) is bounded over R^d × R^d, with the possible norm ‖σ‖_{S_0'} = ‖V_g(σ)‖_∞. Moreover, w*-convergence in S_0'(R^d) can be shown to be equivalent to the very natural concept of uniform convergence of the corresponding spectrograms over compact subsets of R^d × R^d. It is not difficult to show that pure frequencies resp. Dirac measures converge (only) in this sense if their parameters converge.
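All these objects become concrete in the finite setting. The following numpy sketch (an illustration of the definitions over Z_N; window and signal are arbitrary choices) implements the full STFT and confirms the discrete analogue of the isometry ‖V_g f‖_2 = ‖f‖_2 for ‖g‖_2 = 1, with a factor N accounting for the unnormalized FFT:

```python
import numpy as np

def stft(f, g):
    """Full STFT over Z_N x Z_N:  V[t, w] = <f, M_w T_t g>."""
    N = len(f)
    V = np.empty((N, N), dtype=complex)
    for t in range(N):
        # the FFT evaluates all frequency shifts for a fixed time shift t
        V[t] = np.fft.fft(f * np.conj(np.roll(g, t)))
    return V

N = 64
x = np.arange(N)
g = np.exp(-np.pi * (x - N / 2) ** 2 / N)     # Gaussian-type window
g /= np.linalg.norm(g)                        # ||g||_2 = 1
f = np.random.randn(N) + 1j * np.random.randn(N)

V = stft(f, g)
# discrete counterpart of ||V_g f||_2 = ||f||_2 (the factor N comes from
# the unnormalized FFT)
assert np.isclose(np.sum(np.abs(V) ** 2), N * np.linalg.norm(f) ** 2)
```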
4 How to make use of the BGT
The concept of Banach Gelfand Triples merges three possible view-points which play a different role in different contexts. At the lower level descriptions can typically be taken literally, i.e. integral transformations can be carried out as usual, often even just by means of Riemann integrals. The intermediate level allows the preservation of energy norms and scalar products, but sometimes limiting procedures are needed when integral transforms are applied to general elements. Finally, the dual space is large enough to contain objects such as Dirac measures (taking the role of unit vectors in the discrete setting), even Dirac combs, or pure frequencies.
Let us again illustrate this through the example of the Fourier transform (also because this is one of the central themes of Harmonic Analysis).

Here the roles of the different layers are clear. The space of test functions may be seen as too small for several considerations, but it has the big advantage that almost all of the problems normally associated with the use of the Fourier transform seem to disappear, even if one is only willing to use absolutely convergent Riemann integrals. In fact, the Fourier inversion takes the expected form, it can be understood in the pointwise sense, no sets of measure zero or measure theoretical arguments have to be involved, and even Poisson's formula is valid for all f ∈ S_0(R^d).
By way of natural extension the Fourier transform can be extended to the Hilbert space L^2(R^d), becoming now a unitary isomorphism between L^2(R^d) and L^2(R̂^d). Instead of L^1 ∩ L^2 one simply considers another dense subspace, namely S_0, and verifies the isometry property of the FT on this dense subspace. L^2(R^d) is then simply the completion of S_0(R^d) with respect to the L^2-norm. Whether the Hilbert space consists of equivalence classes of measurable functions (this is the standard model), or is just the completion of S_0(R^d) with respect to the L^2-norm, or a space of (regular) distributions with certain integrability constraints, does not matter for a wide range of application areas. In contrast, the preservation of orthogonality is in fact an important practical issue.
Finally, the dual space S_0'(R^d) is large enough to contain the pure frequencies (or Dirac distributions), which are anyway always considered as the building blocks of the Fourier transform, even if they do not belong to L^2(R^d).
4.1 The Segal Algebra (S_0(G), ‖·‖_{S_0})

As indicated above, the Segal algebra (S_0(G), ‖·‖_{S_0}) is defined for an arbitrary LCA group G, with isometric invariance under time-frequency shifts and the (generalized) Fourier transform.

4.2 Gabor Analysis

For g ∈ S_0(R^d) and a lattice Λ ⊂ R^d × R^d the analysis and synthesis operators associated with the Gabor family (π(λ)g)_{λ∈Λ} are bounded, by constants which depend only on the S_0-norm of g and geometric (covering) properties of the lattice Λ ⊂ R^d × R^d.
An important result in Gabor analysis is now the following:

Theorem 2. Assume that for some g ∈ S_0(R^d) the Gabor family (π(λ)g)_{λ∈Λ} is a Gabor frame; then the canonical dual window g̃ also belongs to S_0(R^d).
This is a rather deep theorem when taken in full generality, but not so difficult to verify (using the Neumann series and Janssen's representation of the Gabor frame operator) whenever the given lattice is of good density (hence the adjoint lattice Λ° is thin enough).

The combination of these two results actually provides a scenario for proving continuous dependence of dual frames on both the atom g ∈ S_0(R^d) and the lattice Λ. Note that, since TF-shifts applied to an individual f ∈ L^2(R^d) depend continuously on their parameters, the synthesis map R_g also depends continuously on Λ (with the natural convergence of lattices expressed by their generating matrices), but this is not a perturbation in the operator norm sense!
Nevertheless we have the following robustness result [21]:

Theorem 3. Assume that (g, Λ) with g ∈ S_0(R^d) generates a Gabor frame or a Gaborian Riesz basic sequence. Then for all g_1 close enough to g in the S_0-sense, and all lattices Λ_1 close enough to Λ, the corresponding family derived from (g_1, Λ_1) is of the same type. Moreover, the canonical dual generators depend continuously (in the S_0-sense) on both variables.
This result is not only of theoretical interest, but it also allows to approximate a general continuous problem by a corresponding rational problem, which in turn can be well approximated by a finite (and computable) problem. How such an argument can be used to approximately compute the dual window for a Gaussian window and a general lattice in R^2 is described in [10] (together with alternative methods).
4.3 Fourier Inversion and Summability

Although at first sight (L^1(G), ‖·‖_1) appears to be the natural domain of the Fourier transform, Fourier inversion is problematic there, since the transform of an integrable function need not be integrable itself. The classical way out is the use of summability kernels, i.e. of families (h_α) with h_α ∈ FL^1(Ĝ), sup_α ‖F^{-1}(h_α)‖_1 < ∞, and |h_α(s) − 1| → 0 with α, uniformly over compact sets.

A large reservoir of such summability kernels is obtained by taking a decent function h and simply applying dilations to it, i.e. to choose h_ρ = D_ρ h, with D_ρ h(z) = h(ρz). Taking then the limit ρ → 0 provides the proper setting, covering a rich variety of classical cases.
It is easy to argue that it is enough to assume that h ∈ S_0(R^d) with h(0) = 1. In fact, in this case h = ĝ for some g ∈ S_0(R^d) with ∫_{R^d} g(x) dx = 1. Since on the Fourier transform side the simple dilation corresponds to the L^1-norm preserving stretching St_ρ g(x) = ρ^{−d} g(x/ρ), we find that h_ρ · f̂ ∈ FL^1(R^d) · S_0(R^d) ⊂ S_0(R^d) ⊂ L^1(R^d), and on the other hand f = lim_{ρ→0} St_ρ g ∗ f.
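This kind of Gaussian summability is easy to observe numerically. In the sketch below (illustrative only; the classical Fourier pair 1/(1+x^2) ↔ πe^{-2π|ω|} serves as test case) the regularized inversion integral approaches f(x_0) as ρ → 0:

```python
import numpy as np

f    = lambda x: 1.0 / (1.0 + x**2)                      # f(x) = 1/(1+x^2)
fhat = lambda w: np.pi * np.exp(-2 * np.pi * np.abs(w))  # its Fourier transform

x0 = 0.5
w = np.linspace(-40.0, 40.0, 200001)
dw = w[1] - w[0]
for rho in (1.0, 0.3, 0.1, 0.03):
    h = np.exp(-np.pi * (rho * w) ** 2)     # dilated Gaussian factor h(rho w)
    val = (np.sum(h * fhat(w) * np.exp(2j * np.pi * w * x0)) * dw).real
    print(rho, val)                          # -> f(0.5) = 0.8 as rho -> 0
```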
4.4 Poisson's Formula

For f ∈ S_0(R^d) Poisson's formula is valid pointwise, with absolute convergence of both sides:

Σ_{k∈Z^d} f(k) = Σ_{n∈Z^d} f̂(n).
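For a concrete instance take the dilated Gaussian f(x) = e^{-π(x/s)^2}, whose transform is f̂(ω) = s·e^{-π(sω)^2}; a few lines of numpy (an illustrative check, with an arbitrary parameter s) confirm the formula to machine precision:

```python
import numpy as np

s = 1.7                                   # arbitrary dilation parameter s > 0
k = np.arange(-50, 51)                    # truncation is harmless: Gaussian decay
lhs = np.sum(np.exp(-np.pi * (k / s) ** 2))        # sum of f(k)
rhs = np.sum(s * np.exp(-np.pi * (s * k) ** 2))    # sum of fhat(n)
assert np.isclose(lhs, rhs)
```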
Unfortunately (as already pointed out in Katznelson's book, see [41]) it is not enough to assume integrability of f and f̂, nor is the absolute convergence of the sums on both sides sufficient.
Kahane and Lemarié [39] were able to give pairs of weighted L^p-conditions which still allow to come up with counterexamples to Poisson's formula (in the pointwise sense, as stated above). It was then K. Gröchenig (see [32]) who was able to show that in all the cases where the combined conditions on the function and its Fourier transform are strong enough to avoid this unpleasant situation, i.e. to guarantee the validity of Poisson's formula, one already has an embedding into S_0(R^d). On the other hand, for f ∈ S_0(R^d) the validity of Poisson's formula is obvious (at least once the characterization of S_0(R^d) using atomic decompositions is known, see [13]).
Thus altogether it is not wrong to argue that S_0(G) is the largest universally defined Banach space of continuous functions such that Poisson's formula is valid for general co-compact lattices Λ ⊂ G (because in this case the orthogonal lattice Λ⊥ is a lattice in Ĝ).

4.5 Translation Bounded Measures

In a similar spirit one could rephrase some of the results found in the literature by saying that symmetry is established by looking at those (still possibly unbounded) measures which are translation bounded (i.e. belong to the Wiener amalgam space W(M, ℓ^∞)). The Dirac comb along a lattice Λ ⊂ G provides a good example of such a situation.
4.6 Multipliers
In the discussion of multipliers between different translation invariant spaces, the book of Larsen [42] describes a large number of multiplier spaces, sometimes using rather difficult concepts, such as the concept of quasi-measures as introduced by G. Gaudry [30]. It is also shown by Gaudry that not only can the translation invariant system be represented by a convolution with a quasi-measure, but also the transfer function is a quasi-measure. This looks nice at first sight (engineers would say that there is a well-defined impulse response, and its Fourier transform, also a quasi-measure, is the transfer distribution), but unfortunately general quasi-measures (although behaving decently, namely like pseudo-measures locally) may not have a Fourier transform, not even in the sense of tempered distributions.
Within the context of the S_0-BGT the situation is quite simply described. Every operator from L^p to L^q (with p < ∞) commuting with translations can also be viewed (via restriction to a dense subspace) as an operator T from S_0(G) to S_0'(G) (commuting with translations). Any such operator maps S_0(G) in fact already into C_b(G) ∩ S_0'(G), and consequently it is not difficult to find out that the linear functional σ ∈ S_0'(G), defined by σ(f) = T(f✓)(0) (with f✓(x) = f(−x)), is the appropriate convolution kernel.
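Over G = Z_N this recipe can be tested directly: the sketch below (illustrative; the operator T is an arbitrary convolution) recovers the impulse response h of a translation invariant operator via σ(f) = T(f✓)(0), applied to the unit vectors:

```python
import numpy as np

N = 32
h = np.random.randn(N)                     # impulse response of T
T = lambda f: np.fft.ifft(np.fft.fft(h) * np.fft.fft(f))  # T f = h * f on Z_N

flip = lambda f: np.roll(f[::-1], 1)       # (flip f)(x) = f(-x) on Z_N

e = np.eye(N)                              # unit vectors delta_k
sigma = np.array([T(flip(e[k]))[0] for k in range(N)])
assert np.allclose(sigma, h)               # the functional recovers the kernel
```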
Obviously the space S(R^d) designed by L. Schwartz has been tailored to the needs of Partial Differential Equations, with the consequence that a family of unbounded operators (including differentiation and multiplication with polynomials, as well as the Fourier transform) have to act continuously, and consequently the topology of S(R^d) is all but easy to describe. It is even less easy to convince engineers to make use of it, and it is in fact not necessary to have distributions of this generality for the majority of application problems. However a proper handling of distributions such as Dirac combs, describing the sampling and/or periodization procedures, is very important for a good understanding. Fortunately S_0'(R^d) is large enough for this purpose, but still easy to handle technically. The only new aspect that has to be emphasized is the use of two topologies on this space, namely the norm convergence, but even more importantly the w*-convergence. Although this notion is less familiar among engineers, one should point out that the simple fact that Riemann sums converge to the integral is an instance of this type of convergence, indicating that the concept is not at all a very complicated one.
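For instance, the normalized Dirac combs μ_m = (1/m) Σ_k δ_{k/m} act on a test function g precisely by forming Riemann sums, and μ_m converges to the Lebesgue measure in the w*-sense; a minimal illustration (with the standard Gaussian, ∫g = 1):

```python
import numpy as np

g = lambda x: np.exp(-np.pi * x**2)      # test function with integral 1

for m in (1, 4, 16, 64):
    k = np.arange(-40 * m, 40 * m + 1)
    # action of (1/m) * sum_k delta_{k/m} on g: a Riemann sum for int g
    print(m, np.sum(g(k / m)) / m)       # -> 1.0
```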
5 The Role of Computational Harmonic Analysis
Distribution theory is not only an important tool for real or harmonic analysis. It is also needed whenever one wants to describe the approximation of a continuous scenario by discrete, resp. periodic or even finite dimensional models. Since the functions from such models can be seen as discrete, periodic signals, one cannot describe their convergence to, say, some L^2(R^d)-function using the L^2(R^d)-norm, simply because those periodic and discrete signals do not belong to L^2(R^d), not even locally. It is the concept of w*-convergence in S_0'(R^d) which allows to describe such limits properly.

A typical question arising in computational Gabor analysis is to decide whether a given pair (g, Λ) is a Gabor frame.
Just as a first step we have studied the finite case, i.e. signals of finite length (resp. functions on Z_N, resp. discrete periodic functions). First we had to be able to compute the generators of the canonical dual and of the tight Gabor frames for general lattices, including the non-separable ones. This in turn raised the question whether we can generate all of them, maybe even produce a complete list of such lattices at a given redundancy (there are only finitely many possible rational redundancies for a given integer N, and only those not too high are typically of interest for applications); of course this should enable the user to run an exhaustive search.
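For separable lattices aZ_N × bZ_N both generators are obtained from the frame operator by plain linear algebra, via g̃ = S⁻¹g and h = S^{-1/2}g. The following numpy sketch illustrates this standard recipe (all parameters are arbitrary choices, and no claim is made about the actual software used in these experiments); it also checks perfect reconstruction and tightness:

```python
import numpy as np

def gabor_synthesis_matrix(g, a, b):
    """Columns: the Gabor atoms M_{b l} T_{a k} g over Z_N (separable lattice)."""
    N = len(g)
    x = np.arange(N)
    atoms = [np.exp(2j * np.pi * b * l * x / N) * np.roll(g, a * k)
             for k in range(N // a) for l in range(N // b)]
    return np.array(atoms).T

N, a, b = 48, 4, 4                          # redundancy N/(a*b) = 3
x = np.arange(N)
g = np.exp(-np.pi * (x - N / 2) ** 2 / N)   # Gaussian-type window

D = gabor_synthesis_matrix(g, a, b)
S = D @ D.conj().T              # frame operator; commutes with the lattice TF-shifts
gd = np.linalg.solve(S, g)      # canonical dual window   S^{-1} g
w, U = np.linalg.eigh(S)
gt = U @ ((U.conj().T @ g) / np.sqrt(w))    # canonical tight window S^{-1/2} g

f = np.random.randn(N)
coeff = gabor_synthesis_matrix(gd, a, b).conj().T @ f
assert np.allclose(D @ coeff, f)            # perfect reconstruction via dual atoms

Dt = gabor_synthesis_matrix(gt, a, b)
assert np.allclose(Dt @ Dt.conj().T, np.eye(N))   # Parseval (tight) frame
```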
Once we were able to do this we had to come up with a variety of quality criteria, among them the condition number of the frame operator, the S_0-norm of the dual atom, or geometric properties of the considered lattices using maximal pavements and minimal coverings. Only systematic and exhaustive computations of various (other) figures of merit then allowed us to identify the most relevant criteria, and in fact to establish that (in this case) the rankings derived from each of them turned out to be more or less equivalent. As a further step one then has to find out which version of the criterion is the most efficient one, and how to select from the full variety of possible lattices the most interesting or most suitable ones for whatever specific application (and corresponding figures of merit).
In fact, at the end we are not far from a consumer report, which gives a customer (in our case maybe an engineer) advice on which system of functions might be most suitable for her/his application, or on which combinations of Gabor atoms with TF-lattices can be recommended, resp. might not be as good as expected according to such an analysis.

An interesting point is the fact that the justification for the validity of a hopefully unbiased recommendation itself requires quite a bit of sophisticated analysis, which is interesting in its own right, but has found little attention so far.
5.3 Fourier Transform in Practice
The FFT (resp. DFT) is understood as the natural way of doing a (continuous) Fourier transform, because it looks so similar, and after all a computer can only handle finite data. This is a typical engineering argument, which is of course true, but perhaps not convincing for people in numerical analysis or for those who have to take care of the implementation of some algorithm on a DSP, looking for an efficient realization of some idea coming up in constructive approximation theory.

For many of those working regularly with the FFT, the mathematical viewpoint (that CHA would advocate), namely to view what is done in the finite setting as an approximation of the continuous problem by a corresponding problem over Z_N, or to treat it as a question of approximating a distributional entity in the w*-sense by a distribution which can be thought of as living on Z_N (i.e. a discrete and periodic distribution), would be seen as an overkill in terms of complicated concepts.
On the other hand practical books, especially quite recent ones, such as [58] or [?], indicate to their readers that one has to be quite careful in the interpretation of e.g. the values of the output of the FFT (which coordinate corresponds to which frequency, etc.), and a plethora of tricks has to be learned by the user, most of which would be relatively easily explained within the CHA context.
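A typical instance of such bookkeeping is the mapping from FFT bin index to physical frequency: for sampling rate f_s and length N, bin k corresponds to k·f_s/N Hz. A small illustrative numpy example (parameters are arbitrary):

```python
import numpy as np

fs = 1000.0                            # sampling rate in Hz
N = 1024
t = np.arange(N) / fs
sig = np.sin(2 * np.pi * 137.0 * t)    # a 137 Hz tone

spec = np.fft.rfft(sig)
freqs = np.fft.rfftfreq(N, d=1 / fs)   # bin k  <->  k * fs / N  Hz
print(freqs[np.argmax(np.abs(spec))])  # ~137 Hz, on a grid of spacing fs/N
```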
In fact, there was perhaps not enough interest from both sides (meaning engineers and mathematicians) to build and reinforce this bridge, but I see a formidable task here: to improve the understanding and the teaching of these subjects. An interesting recent source is also [1].
6 The relevance of w*-convergence
There are many situations in Fourier analysis where heuristic arguments are used to describe the transition from one setting of Fourier analysis to another. For example, one often finds the formulation (used also by engineers): On a computer we can only deal with finite sequences, and therefore instead of computing the Fourier transform of f ∈ L^1(R^d) we have to apply the FFT to a sequence of samples. Another typical case is the approximation of the Fourier transform by thinking of the integral transform as the limit of Fourier series expansions for the given function f ∈ L^1(R^d), viewed as a p-periodic function (obviously defined, at least if f has compact support), by taking the limit p → ∞. We plan to give a variety of such examples in the rest of this section.
6.1 Fourier Integrals as Limits of Fourier Series
One of the spots where most presentations of the Fourier transform have a hard time explaining, in a more than purely heuristic way, why the Fourier transform (and its inverse) takes its classical shape, is the question how the Fourier transform can be understood as the limiting case of the classical theory of Fourier expansions of periodic functions, just by letting the length of the period go to infinity.
First let us view the theory of Fourier series in our distributional setting. This is not a new idea, but just a refreshment of known ideas from the context of tempered distributions, cast into the more simple setting of S_0'(R). A p-periodic function f on R can be written as a convolution with a Dirac comb, f = h ∗ Ш_p with Ш_p = Σ_{k∈pZ} δ_k, where h ∈ S_0(R) is obtained from f using a partition of unity, for example a BUPU (bounded uniform partition of unity) of the form (T_{np}φ)_{n∈Z}, by setting h = φ · f. Consequently

f̂ = F(h ∗ Ш_p) = ĥ · (1/p) Ш_{1/p},

or in other words

f̂ = (1/p) Σ_{n∈Z} ĥ(n/p) δ_{n/p},

telling us that in turn the (p-periodic) function f is (at least in some sense) an infinite sum of pure frequencies from the lattice Z/p, which is getting more and more dense (within R) as p → ∞. The correct way of verifying this convergence is again the w*-setting. Note that (1/p) Ш_{1/p} is convergent to the constant 1 (resp. to the Haar measure on R^d) in the w*-sense.
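Numerically one can watch exactly this effect: the n-th Fourier coefficient of the p-periodization of f equals (1/p) f̂(n/p), so for growing p the (rescaled) coefficients sample f̂ on finer and finer grids. A short illustrative sketch (with the self-dual Gaussian):

```python
import numpy as np

f    = lambda x: np.exp(-np.pi * x**2)   # the Gaussian equals its own transform
fhat = f

for p in (2.0, 4.0, 8.0):
    M = 4096
    x = np.linspace(0.0, p, M, endpoint=False)
    fp = sum(f(x - j * p) for j in range(-6, 7))   # p-periodization of f
    n = int(p)                                     # inspect the frequency n/p = 1
    # n-th Fourier coefficient of fp, via a Riemann sum over one period
    c = np.mean(fp * np.exp(-2j * np.pi * n * x / p))
    print(p, c.real, fhat(n / p) / p)              # matches (1/p) * fhat(n/p)
```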
6.2 Generalized Stochastic Processes
The last two decades have shown that the BGT (S_0, L^2, S_0') also provides a natural setting for generalized stochastic processes. In particular, one can speak of a stationary process in this setting if such a process belongs to S_0'; its auto-correlation is then just the distributional Fourier transform of the corresponding spectral measure.