Fa Notes
Tom Leinster
2013–14
Contents
D Duality
D1 An abstract view of Fourier analysis
D2 The dual of a finite abelian group
D3 Fourier transforms on a finite abelian group
D4 Fourier inversion on a finite abelian group
Chapter A
If we can deal with the sequence (cn ) rather than the function f , everything
will be much easier (and more algebraic).
2 If we can express f in this way, then the coefficients c_n must be given by
\[ c_n = \frac{f^{(n)}(0)}{n!}. \]
(Proof: differentiate each side of (A:1) n times, then evaluate at 0.)
(Figure A.1). It can be shown that $f^{(n)}(0)/n! = 0$ for all n (which you
might guess from the flatness of the function near 0). But of course
$f(x) \neq \sum_{n=0}^{\infty} 0 \cdot x^n$ except at x = 0. This brings to light:
Figure A.1: The function $x \mapsto e^{-1/x^2}$
5 Yet another problem: the power series method only captures infinitely differ-
entiable functions. (For recall that every power series is infinitely differen-
tiable inside its disk of convergence.) However, many interesting functions
are not infinitely differentiable.
6 In this power series approach to functions, the number 0 is given a special
role. This isn’t necessarily good or bad, but might seem a little suspicious.
We could equally well consider power series $\sum c_n (x - a)^n$ centred at a, for
any other value of a.
7 Also suspicious: the power series $\sum \frac{f^{(n)}(0)}{n!} x^n$ of f depends only on the values
of f near 0. In the jargon, it’s locally determined. You can’t reasonably
expect to predict the value of f for large |x| given only the value of f for
small |x|.
In other words, if you had two functions f and g that were the same
throughout the interval (−δ, δ), then they’d have the same power series.
However, they might have very different values outside that interval.
So for various reasons, the algebraist’s dream can’t be realized using power
series. However. . .
we can rewrite this more efficiently as
\[ f(x) = \sum_{k=-\infty}^{\infty} c_k e^{2\pi i k x} \]
The algebraist’s dream can’t be fully realized. But Fourier series come closer
to realizing it than power series do, at least for periodic functions on R. They’re
not as obvious an idea as power series, but in many ways they work better.
We’ll spend a lot of the course exploring the extent to which functions can be
understood in terms of their Fourier series. There are many analytic subtleties,
which we’ll have to think hard about.
The development of Fourier theory has been very important historically. It
has been the spur for a lot of important ideas in mathematics, not all obviously
connected to Fourier analysis. We’ll meet some along the way.
Figure A.2: Excerpt from the index of Tom Körner’s book Fourier Analysis
A2 Pseudo-historical overview
For the lecture of 16 January 2014
Most mathematicians are terrible historians. They can’t resist recounting what
should have happened, not what did happen. I can’t claim to be any better—
hence the ‘pseudo’ of the title.
This lecture is all about the ‘excessive optimism’ and ‘excessive pessimism’
mentioned in the index of Körner’s book (Figure A.2).
A 1-periodic function on R is determined by its values on [0, 1) (or any
other interval of length 1). We say that a 1-periodic function is integrable if
its restriction to [0, 1) is integrable in the usual sense. Let f : R → C be an
integrable 1-periodic function.
For k ∈ Z, the kth Fourier coefficient of f is
\[ \hat{f}(k) = \int_0^1 f(x) e^{-2\pi i k x} \, dx. \]
It’s not entirely clear what kind of thing Sf is. For instance, it might not
always converge, and it’s not even obvious what convergence should mean. But
the following definition is perfectly safe: for n ≥ 0, the nth Fourier partial
sum is the function Sn f given by
\[ (S_n f)(x) = \sum_{k=-n}^{n} \hat{f}(k) e^{2\pi i k x}. \]
for all x ∈ [0, 1)? For ‘most’ x ∈ [0, 1)? For some x ∈ [0, 1)?
This is the question of pointwise convergence. There are other kinds of
convergence, as we’ll see, some of which are important in ways that pointwise
convergence is not.
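The Fourier coefficients and partial sums defined above can be explored numerically. What follows is only a rough sketch: the integral is replaced by a midpoint Riemann sum, and the names `fourier_coefficient`, `partial_sum` and `tent` are illustrative choices, not notation from the course. The sample function `tent` (distance to the nearest integer) is continuous and 1-periodic.

```python
import cmath

def fourier_coefficient(f, k, samples=2000):
    """Approximate the k-th Fourier coefficient of a 1-periodic f by a midpoint Riemann sum."""
    total = 0j
    for j in range(samples):
        x = (j + 0.5) / samples
        total += f(x) * cmath.exp(-2j * cmath.pi * k * x)
    return total / samples

def partial_sum(f, n, x, samples=2000):
    """Approximate (S_n f)(x), the sum over k = -n..n of f^(k) e^{2 pi i k x}."""
    return sum(
        fourier_coefficient(f, k, samples) * cmath.exp(2j * cmath.pi * k * x)
        for k in range(-n, n + 1)
    )

def tent(x):
    """Illustrative continuous 1-periodic function: distance from x to the nearest integer."""
    x = x % 1.0
    return min(x, 1.0 - x)

# For this particular function the partial sums at x = 0.3 settle near tent(0.3) = 0.3.
approx = partial_sum(tent, 20, 0.3)
```

A numerical experiment like this says nothing about convergence in general, of course; it only shows what the definitions compute for one well-behaved function.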
Fourier himself (1768–1830) thought that yes, it’s always true. But he was
imprecise about this (and most other things). It’s not even clear what he would
have taken the word ‘function’ to mean.
To start with a point that was certainly clear to Fourier himself, it can’t
be true for all x and all f . For instance, suppose it’s true for all x for some
particular f . Define g to be the same as f except at one point of [0, 1), where
it takes some different value. Then $\hat{g}(k) = \hat{f}(k)$ for all k, so S_n g = S_n f for all
n, so it can’t also be true that (S_n g)(x) → g(x) as n → ∞ for all x.
(For instance, take g to be the 1-periodic function given by g(x) = 1 for
x ∈ Z and g(x) = 0 for x ∉ Z. Then for all n, the function S_n g has constant
value 0, so (S_n g)(x) ↛ g(x) whenever x ∈ Z.)
Backing up Fourier’s intuition, Dirichlet proved:
Theorem A2.1 (Dirichlet, 1829) Let f : R → C be a 1-periodic, continu-
ously differentiable function. Then (Sn f )(x) → f (x) as n → ∞ for all x ∈ [0, 1)
(or equivalently, for all x ∈ R).
Once Dirichlet had proved that, it was generally believed that ‘continuously
differentiable’ could be relaxed to ‘continuous’, and that it was just a matter of time
before someone figured out a proof that would work in this wider generality.
Most of the most prominent mathematicians of the day believed this. Dirichlet
believed it, Cauchy believed it, and Riemann, Weierstrass, Dedekind and
Poisson all believed it. Cauchy even claimed he’d proved it. (Standards of
rigour were lower in those days.) Dirichlet promised he’d prove it, but never
did.
But then came a bombshell.
Theorem A2.2 (du Bois-Reymond, 1876) There is a 1-periodic, continuous
function f : R → C such that for some x ∈ [0, 1), the sequence
$((S_n f)(x))_{n=0}^{\infty}$ fails to converge.
Widespread pessimism ensued. The pendulum swung from the general con-
sensus that Fourier series behave perfectly for all continuous functions to the
opposite extreme. After du Bois-Reymond’s theorem appeared, it came to be
believed that there was some 1-periodic continuous function f : R → C with the
property that for all x ∈ [0, 1), the sequence $((S_n f)(x))_{n=0}^{\infty}$ fails to converge.
This pessimism was reinforced by a further example:
Theorem A2.4 (Kolmogorov, 1926) There exists a 1-periodic, Lebesgue integrable
function f : R → C such that for all x ∈ [0, 1), the sequence
$((S_n f)(x))_{n=0}^{\infty}$ fails to converge.
We won’t be doing Lebesgue integrability in this course, and I won’t assume
you know anything about it. But as general cultural background:
This situation persisted until relatively recently. For instance, one of the
books on my own undergraduate reading list was Apostol’s 1957 text Mathe-
matical Analysis. In most respects, it is like today’s textbooks, but he states in
it that it’s still unknown whether for a continuous function, the Fourier series
has to converge at even one point.
The turning point came in the 1960s.
All known proofs of Carleson’s theorem are hard, still far too hard for a
course such as this.
(I say ‘still’ because most hard proofs of important facts are simplified over
time. This was a big theorem that attracted a lot of attention, so lots of people
have put in lots of work to simplify the proof. They have simplified it, but
it’s still well beyond our reach. This is true even for the very watered-down
version of Carleson’s theorem in the first bullet point: that for merely continuous
functions, there is merely one point at which the Fourier series behaves well.)
Carleson’s theorem is best possible, in a sense that can be made precise
(Theorem A3.18). Roughly, this means that you can’t do any better than ‘al-
most all’. It almost completely answers the question of pointwise convergence of
Fourier series. But there are other types of convergence too, and the behaviour
of Fourier series from those points of view is interesting too, and has provoked
a lot of mathematical developments—as we shall see.
A3 Integration
For the lecture of 20 January 2014
We won’t be doing any Fourier analysis for the first couple of weeks. The course
pulls together several strands of analysis, and we’re going to look at them one at
a time before attempting to bring them together. This has the disadvantage of
postponing the moment when we first see any Fourier theory, but the advantage
that we can concentrate on one thing at a time.
For the rest of this section, let I ⊆ R be a bounded interval.
Definition A3.1 i. A function f : I → R is integrable if it is bounded and
Riemann integrable.
ii. Let f : I → C be a function. Writing f_1, f_2 : I → R for its real and
imaginary parts (so that f(x) = f_1(x) + i f_2(x) for all x ∈ I), we say that
f is integrable if f_1 and f_2 are both integrable.
In that case, we define the integral of f by
\[ \int_I f(x) \, dx = \int_I f_1(x) \, dx + i \int_I f_2(x) \, dx. \]
Part (i) is a declaration about how the word ‘integrable’ will be used in this
course. Other people use it differently. For example, a more permissive meaning
(which you don’t have to know anything about) would be ‘Lebesgue integrable’.
Examples A3.2 i. Bounded continuous functions are integrable. (If I is
closed then ‘bounded’ follows automatically from ‘continuous’.)
ii. Let J ⊆ I be an interval. The characteristic function (or indicator
function) of J is the function
χJ : I → C
defined by
\[ \chi_J(x) = \begin{cases} 1 & \text{if } x \in J, \\ 0 & \text{if } x \notin J. \end{cases} \]
Then χ_J is integrable, and $\int_I \chi_J(x) \, dx = |J|$. Here |J| is the length of
the interval J, defined by |J| = sup J − inf J (or as 0 if J = ∅). Concretely,
if J is [a, b] or [a, b) or (a, b] or (a, b) then |J| = b − a.
Lemma A3.4 i. If f, g : I → C are integrable then so is f + g, with
\[ \int_I (f + g)(x) \, dx = \int_I f(x) \, dx + \int_I g(x) \, dx. \]
Proof Parts (i) and (ii) follow from the corresponding properties of real-valued
integration, and part (iii) follows directly from the definitions.
I’ll occasionally state ‘Facts’, without proof. Some of these facts were proved
in previous courses (such as PAA). Others weren’t, and I’ll be asking you to take
those on trust.
Fact A3.5 Let f : I → C be an integrable function and φ : C → C a continuous
function. Then the composite φ ◦ f : I → C is integrable.
(If you’ve come across Lebesgue integration, you may be aware that both
this lemma and Example A3.6 fail for Lebesgue integration. This is one respect
in which the Riemann theory makes life simpler.)
Fact A3.8 If f, g : I → R are integrable and f(x) ≤ g(x) for all x ∈ I then
$\int_I f(x) \, dx \le \int_I g(x) \, dx$.
Note that this is for functions into R, not C; inequalities make no sense in
C.
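for some u ∈ C with |u| = 1. Now
\begin{align*}
\left| \int_I f(x) \, dx \right|
  &= \operatorname{Re} \left| \int_I f(x) \, dx \right|
   = \operatorname{Re} \left( \frac{1}{u} \int_I f(x) \, dx \right)
   = \operatorname{Re} \int_I \frac{1}{u} f(x) \, dx \\
  &= \int_I \operatorname{Re} \left( \frac{1}{u} f(x) \right) dx && \text{(A:2)} \\
  &\le \int_I \left| \frac{1}{u} f(x) \right| dx && \text{(A:3)} \\
  &= \int_I |f(x)| \, dx = \int_I |f|(x) \, dx.
\end{align*}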
Here Re(z) denotes the real part of a complex number z, equation (A:2) follows
from the definition of complex-valued integration (Definition A3.1(ii)), the in-
equality (A:3) follows from Fact A3.8 and the fact that Re(z) ≤ |z| for all z ∈ C,
and the rest follow either from earlier lemmas or directly from the definitions.
Fact A3.13 If f, g : I → C are integrable and f = g a.e. then
$\int_I f(x) \, dx = \int_I g(x) \, dx$.
Fact A3.14 If h : I → R is integrable, h(x) ≥ 0 for all x ∈ I, and
$\int_I h(x) \, dx = 0$, then h = 0 a.e.
Proof For (i), suppose for a contradiction that f(x) ≠ g(x). By continuity, we
can find some δ > 0 such that f(t) ≠ g(t) for all t ∈ (x − δ, x + δ) ∩ I. Also,
since f = g a.e., we can choose E ⊆ I of measure zero such that f (t) = g(t) for
all t ∈ I \ E. Then (x − δ, x + δ) ∩ I ⊆ E, so (x − δ, x + δ) ∩ I has measure zero
by Example A3.11(iii). But |I| > 0, so (x − δ, x + δ) ∩ I is an interval of length
> 0, contradicting Example A3.11(iv).
Part (ii) follows immediately.
Proof We could deduce this from Fact A3.14 and Proposition A3.15. This is
unnecessarily complicated, though. Sheet 1, q.4 asks you to find a direct proof.
∗ ∗ ∗
The last two results of this lecture are non-examinable, and are included just
for background.
Fact A3.17 A function f : I → C is integrable if and only if it is bounded and
continuous a.e. (that is, the set {x ∈ I : f is not continuous at x} has measure
zero).
For example, $\chi_{Q \cap [0,1]}$ : [0, 1] → C is not integrable, since the set of
discontinuities is [0, 1], which does not have measure zero.
At the end of the last lecture, I said that in a certain sense, Carleson’s
theorem (Theorem A2.6) cannot be improved. Here is the precise statement.
Theorem A3.18 (Kahane and Katznelson, 1960s) Let E ⊆ [0, 1] be a set
of measure zero. Then there is a 1-periodic, continuous function f : R → C such
that for all x ∈ E, the sequence $((S_n f)(x))_{n=0}^{\infty}$ fails to converge.
Figure A.3: The area between f and g is small, but the largest difference between
them is large.
There is no single right answer to this question. Compare the question ‘how big
is a person?’ I might interpret that as a question about height, but you might
interpret it as a question about weight. Neither of us would be right or wrong.
Height and weight are correlated, but not logically related in an absolute sense.
Related to the question of the title is another question: what does it mean
for two functions to be ‘close’ ?
We might say that two functions f and g are close if the area between their
graphs is small (that is, $\int |f(x) - g(x)| \, dx$ is small). Or, we might say that
they are close if their values are never too different (that is, $\sup_x |f(x) - g(x)|$
is small). Both ideas are sensible, but they are not the same, as Figure A.3
demonstrates.
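To make the contrast between the two notions of closeness concrete, here is a small numerical sketch. The function `spike` is a hypothetical stand-in for the difference f − g in a picture like Figure A.3: a tall, thin triangle. Both measurements are sampled approximations, so treat the numbers as indicative only.

```python
def one_norm(h, samples=100_000):
    """Midpoint Riemann sum for the integral of |h| over [0, 1)."""
    return sum(abs(h((j + 0.5) / samples)) for j in range(samples)) / samples

def sup_norm(h, samples=100_000):
    """Largest sampled value of |h| on [0, 1) (a lower bound for the true supremum)."""
    return max(abs(h((j + 0.5) / samples)) for j in range(samples))

def spike(x, width=0.01, height=5.0):
    """Hypothetical difference f - g: a triangular spike of the given width centred at 1/2."""
    return height * max(0.0, 1.0 - abs(x - 0.5) / (width / 2))

area = one_norm(spike)   # the triangle's area, height * width / 2 = 0.025
gap = sup_norm(spike)    # just under the spike's height of 5
```

So the area between the graphs is tiny while the largest pointwise difference is not: the two ways of measuring size genuinely disagree.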
For the rest of this section, let I ⊆ R be a bounded interval of length > 0.
Definition A4.1 Let f : I → C be an integrable function. Define:
• $\|f\|_1 = \int_I |f(x)| \, dx$, the 1-norm of f;
• $\|f\|_2 = \sqrt{\int_I |f(x)|^2 \, dx}$, the 2-norm of f;
If we want to refer to the 1-norm in the abstract, we sometimes write it as
$\|\cdot\|_1$. The dot is a blank or placeholder, into which arguments can be inserted.
The same goes for $\|\cdot\|_2$ and $\|\cdot\|_\infty$.
Lemma A4.3 Let $\|\cdot\|$ stand for any of $\|\cdot\|_1$, $\|\cdot\|_2$ or $\|\cdot\|_\infty$. Then for all
integrable functions f, g : I → C and all c ∈ C:
i. $\|f\| \ge 0$;
ii. $\|cf\| = |c| \cdot \|f\|$;
iii. $\|f + g\| \le \|f\| + \|g\|$.
Proof All easy except (iii) for $\|\cdot\|_2$, which is Sheet 1, q.6(iv).
A norm on the set {integrable functions I → C} is an operation satisfying
conditions (i)–(iii) and one further condition: that if $\|f\| = 0$ then f = 0. This
further condition fails for $\|\cdot\|_1$ and $\|\cdot\|_2$ (as the next example shows), so strictly
speaking we should call them ‘seminorms’. But I will abuse terminology and go
on calling them the 1-norm and 2-norm. (The ∞-norm really is a norm.)
Example A4.4 Define f : [0, 1) → C by
\[ f(x) = \begin{cases} 1 & \text{if } x = 0, \\ 0 & \text{otherwise.} \end{cases} \]
Then $\|f\|_1 = \|f\|_2 = 0$ but f ≠ 0.
Lemma A4.5 i. For an integrable function f : I → C,
\[ \|f\|_1 = 0 \iff f = 0 \text{ a.e.} \iff \|f\|_2 = 0. \]
Definition A4.6 Let (fn ) be a sequence of integrable functions from I to C,
and let f be an integrable function from I to C. We say:
• f_n → f in $\|\cdot\|_1$ if $\|f_n - f\|_1 \to 0$ as n → ∞;
• f_n → f in $\|\cdot\|_2$ (or in mean square) if $\|f_n - f\|_2 \to 0$ as n → ∞;
• f_n → f in $\|\cdot\|_\infty$ (or uniformly) if $\|f_n - f\|_\infty \to 0$ as n → ∞;
• f_n → f pointwise if for all x ∈ I, f_n(x) → f(x) as n → ∞;
\[ \|f_n\| - \|f\| \le \|f_n - f\|, \qquad \|f\| - \|f_n\| \le \|f - f_n\| = \|f_n - f\|, \]
so
\[ \bigl| \|f_n\| - \|f\| \bigr| \le \|f_n - f\|. \]
f_n → f uniformly
⇒ f_n → f pointwise
⇒ f_n → f a.e.
We now restrict ourselves to the case I = [0, 1), since that’s what we’ll need
once we start to consider 1-periodic functions.
Lemma A4.10 i. Let f : [0, 1) → C be an integrable function. Then
\[ \|f\|_1 \le \|f\|_2 \le \|f\|_\infty. \]
f_n → f in $\|\cdot\|_\infty$
⇒ f_n → f in $\|\cdot\|_2$
⇒ f_n → f in $\|\cdot\|_1$.
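As a quick sanity check on the chain of inequalities between the three norms on [0, 1), one can approximate all three for a sample function. This is a sketch only: the integrals are midpoint Riemann sums, the supremum is a maximum over sample points, and the function x ↦ x² is an arbitrary illustrative choice.

```python
import math

def norms(f, samples=10_000):
    """Return sampled approximations to (||f||_1, ||f||_2, ||f||_inf) on [0, 1)."""
    values = [abs(f((j + 0.5) / samples)) for j in range(samples)]
    one = sum(values) / samples
    two = math.sqrt(sum(v * v for v in values) / samples)
    sup = max(values)
    return one, two, sup

# Sample function f(x) = x^2: the exact values are 1/3, 1/sqrt(5) and 1.
one, two, sup = norms(lambda x: x * x)
```

Note that the inequalities depend on the interval having length 1; on a longer interval the 1-norm can exceed the ∞-norm.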
$\|\cdot\|_\infty \Rightarrow \|\cdot\|_2 \Rightarrow \|\cdot\|_1$
$\|\cdot\|_\infty \Rightarrow \text{pointwise} \Rightarrow \text{a.e.}$
A5 Periodic functions
For the lecture of 27 January 2014; part one of two
i. 1-periodic function R → C;
ii. function [a, a + 1) → C, for some a ∈ R;
iii. function T → C, where T is the circle.
In (i), it’s best to think of the real line R as coiled up into a spiral of period
1, so that x, x ± 1, x ± 2, . . . all lie on the same vertical line.
In (ii), it’s best to think of the interval [a, a + 1) as bent round into a circle,
as shown. (The values of a we most commonly use are 0 and −1/2.)
The viewpoint in (iii) is ultimately the most satisfactory. To get from (i)
to (iii), think of pushing down on the coil to squash it into a circle. To get
from (ii) to (iii), join the two ends of the arc.
Formally, T is the quotient group R/Z. (In case you’ve forgotten what this
means, there is an equivalence relation ∼ on R given by x ∼ y ⇐⇒ x − y ∈ Z;
then R/Z is the set of equivalence classes.) The elements of T are the elements
of R but with x regarded as the same as x + n in T whenever x ∈ R and n ∈ Z.
So,
. . . , −1.9, −0.9, 0.1, 1.1, 2.1, . . .
are all names for the same element of T. (Compare the fact that
are all names for the same element of Z/10Z, i.e. the same integer mod 10.)
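The statement that different real numbers can name the same element of T is easy to check by reducing mod 1. The sketch below uses exact rational arithmetic to avoid floating-point noise; the helper name `representative` is purely illustrative.

```python
from fractions import Fraction

def representative(x):
    """Return the representative of the class x + Z that lies in [0, 1)."""
    return x % 1

# The numbers -1.9, -0.9, 0.1, 1.1, 2.1 written as exact fractions.
names = [Fraction(n, 10) for n in (-19, -9, 1, 11, 21)]
reps = {representative(x) for x in names}   # every name reduces to 1/10
```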
There are one-to-one correspondences between functions as in (i), functions
as in (ii), and functions as in (iii). We will switch freely between the three
ways of thinking about periodic functions, but most often, we will adopt the
viewpoint of (iii).
We will also take for granted some easy lemmas about periodic functions,
e.g. that any linear combination or product of 1-periodic functions is again 1-
periodic.
Definition A5.1 A function T → C is continuous if the corresponding 1-
periodic function R → C is continuous.
Note that a 1-periodic function f : R → C is continuous if and only if its
restriction $\tilde{f}$ to [0, 1) is continuous and $\lim_{x \to 1^-} \tilde{f}(x) = \tilde{f}(0)$. (Here $\lim_{x \to 1^-}$ means
the limit as x tends to 1 from below.) To see why the second condition is needed,
consider the function f : R → C defined by f(x) = x − ⌊x⌋, where ⌊x⌋ is the
integer part of x:
[Graph of f on the interval from −1 to 2.]
A6 The inner product
For the lecture of 27 January 2014; part two of two
In order for this definition to make sense, we need to know that the function
f · g is integrable. Lemmas A3.4(iii) and A3.7 guarantee this.
Lemma A6.2 Let f, g, h : T → C be integrable functions, and let a, b ∈ C.
Then:
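i. $\langle g, f \rangle = \overline{\langle f, g \rangle}$;
ii. $\langle af + bg, h \rangle = a \langle f, h \rangle + b \langle g, h \rangle$ and $\langle f, ag + bh \rangle = \bar{a} \langle f, g \rangle + \bar{b} \langle f, h \rangle$;
iii. $\langle f, f \rangle = \|f\|_2^2 \ge 0$.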
The properties of $\langle \cdot, \cdot \rangle$ stated in this lemma nearly say that it is an inner
product. All that prevents it from being one is that $\langle f, f \rangle = 0$ does not quite
imply f = 0; it only implies that f = 0 almost everywhere (by Lemma A4.5(i)).
However, I will abuse terminology slightly by referring to $\langle f, g \rangle$ as the inner
product of f and g anyway.
Here are some further properties of $\langle \cdot, \cdot \rangle$.
Lemma A6.3 Let f, g : T → C be integrable functions. Then:
i. $\|f + g\|_2^2 = \|f\|_2^2 + \|g\|_2^2 + 2 \operatorname{Re} \langle f, g \rangle$ (‘cosine rule’);
ii. if $\langle f, g \rangle = 0$ then $\|f + g\|_2^2 = \|f\|_2^2 + \|g\|_2^2$ (‘Pythagoras’).
Proof For (i), use $\|h\|_2^2 = \langle h, h \rangle$. Part (ii) follows immediately.
Note the modulus sign on the left-hand side. Without it, the inequality
would not even make sense, since $\langle f, g \rangle$ is not usually a real number.
From this, we deduce that $\|f + g\|_2 \le \|f\|_2 + \|g\|_2$ (Sheet 1, q.6 again; see
also Lemma A4.3). We also deduce the following, which completes the proof of
Lemma A4.10:
Lemma A6.5 Let f : [0, 1) → C be an integrable function. Then $\|f\|_1 \le \|f\|_2$.
\[ \langle |f|, 1 \rangle = \|f\|_1, \qquad \bigl\| |f| \bigr\|_2 = \|f\|_2, \qquad \|1\|_2 = 1, \]
giving $\|f\|_1 \le \|f\|_2 \cdot 1 = \|f\|_2$, as required.
Proof We have
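\begin{align*}
|\langle f, g \rangle| = \left| \int_T f(x) \overline{g(x)} \, dx \right| &\le \int_T \left| f(x) \overline{g(x)} \right| dx \\
&= \int_T |f(x)| \cdot |g(x)| \, dx \le \int_T |f(x)| \cdot \|g\|_\infty \, dx = \|f\|_1 \|g\|_\infty,
\end{align*}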
Lemma A6.6 easily implies two more small results, both of which will be
useful later:
Lemma A6.8 Let f : T → C be an integrable function. Then
$\|f\|_2 \le \sqrt{\|f\|_1 \|f\|_\infty}$.
A7 Characters and Fourier series
For the lecture of 30 January 2014; part one of two
Among all periodic functions, certain ones are special. These are the so-
called ‘characters’. Fourier theory can be seen as an attempt to build all periodic
functions out of characters.
Let k ∈ Z. We define e_k : R → C by $e_k(x) = e^{2\pi i k x}$. This function is
1-periodic, so can be seen as a function e_k : T → C, the kth character of T.
Remarks A7.1 i. The notation e_k is not standard. No one outside this
class will know what you mean by ‘e_k’ unless you define it.
ii. The apparently strange terminology ‘character of T’ will be put into con-
text in the very last part of this course.
Here are some elementary properties of the characters.
Lemma A7.2 Let k ∈ Z and x, y ∈ T. Then:
i. e_k is continuous;
ii. |e_k(x)| = 1;
iii. e_k(x + y) = e_k(x) e_k(y), e_k(−x) = 1/e_k(x), and e_k(0) = 1.
Note that in (iii), it does make sense to add and subtract elements of T,
because T is by definition a group (the quotient R/Z).
Proof Straightforward.
Some further elementary properties:
Lemma A7.3 Let k, ℓ ∈ Z. Then:
i. $e_{k+\ell} = e_k \cdot e_\ell$, $e_{-k} = 1/e_k$, and $e_0 = 1$;
ii. $e_{-k} = \overline{e_k}$;
iii. $e_k = e_1^k$.
Proof Straightforward.
We now come to a crucial property of the characters.
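Lemma A7.4 The characters $(e_k)_{k \in Z}$ are orthonormal. That is, for k, ℓ ∈ Z,
\[ \langle e_k, e_\ell \rangle = \begin{cases} 1 & \text{if } k = \ell, \\ 0 & \text{if } k \neq \ell. \end{cases} \]
Proof We have
\[ \langle e_k, e_\ell \rangle = \int_T e_k(x) \overline{e_\ell(x)} \, dx = \int_0^1 e_k(x) e_{-\ell}(x) \, dx = \int_0^1 e_{k-\ell}(x) \, dx, \]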
You can think of the characters as analogous to the standard basis vectors
in Rn (which are also orthonormal). When we express a point of Rn in terms of
its coordinates, we are viewing it as a linear combination of the standard basis
vectors. Similarly, in Fourier theory, we seek to view any periodic function as a
linear combination of the characters. The analogy is not exact, because there
are infinitely many characters, so we have to take infinite linear combinations
of characters. This is what gives the subject its subtlety.
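Orthonormality of the characters can be checked numerically, in the same spirit as checking that standard basis vectors have dot products 0 and 1. The sketch assumes the inner product is the integral of f(x) times the complex conjugate of g(x) over one period, as in the proof of Lemma A7.4, and approximates it by a midpoint Riemann sum. The names `character` and `inner` are illustrative.

```python
import cmath

def character(k):
    """The k-th character e_k(x) = exp(2 pi i k x)."""
    return lambda x: cmath.exp(2j * cmath.pi * k * x)

def inner(f, g, samples=1000):
    """Midpoint Riemann sum for the integral over [0, 1) of f(x) * conj(g(x))."""
    total = 0j
    for j in range(samples):
        x = (j + 0.5) / samples
        total += f(x) * g(x).conjugate()
    return total / samples

same = inner(character(3), character(3))        # approximately 1
different = inner(character(3), character(-2))  # approximately 0
```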
Here are the central definitions of this course.
Definition A7.5 Let f : T → C be an integrable function.
i. For k ∈ Z, the kth Fourier coefficient of f is
\[ \hat{f}(k) = \langle f, e_k \rangle. \]
ii. The Fourier series of f is $\sum_{k=-\infty}^{\infty} \langle f, e_k \rangle e_k$. Compare: if u_1, …, u_n denote
the standard basis vectors of R^n, then $v = \sum_{k=1}^{n} (v \cdot u_k) u_k$ for all v ∈ R^n.
So, we might guess that f is ‘equal’ to its Fourier series (whatever that
means). The central question of this subject is whether, and in what sense,
this is actually true.
Proof Part (i) follows from Lemma A6.2, and part (ii) from Lemma A6.9.
A8 Trigonometric polynomials
For the lecture of 30 January 2014; part two of two
(k ∈ Z).
Proof We have
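\[ \hat{g}(k) = \langle g, e_k \rangle = \left\langle \sum_{\ell=-n}^{n} c_\ell e_\ell, e_k \right\rangle = \sum_{\ell=-n}^{n} c_\ell \langle e_\ell, e_k \rangle = \begin{cases} c_k & \text{if } -n \le k \le n, \\ 0 & \text{otherwise,} \end{cases} \]
where in the last step we used the fact that the characters are orthonormal.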
Corollary A8.4 If $\sum_{k=-n}^{n} c_k e_k = \sum_{k=-n}^{n} d_k e_k$ then c_k = d_k for all k ∈
{−n, . . . , 0, . . . , n}.
Here is a baby version of the whole of Fourier theory.
Proposition A8.6 Let n ≥ 0. Then the functions
given by
\[ g \mapsto (\hat{g}(-n), \ldots, \hat{g}(0), \ldots, \hat{g}(n)), \]
\[ \sum_{k=-n}^{n} c_k e_k \mapsfrom (c_{-n}, \ldots, c_0, \ldots, c_n) \]
are mutually inverse.
Proof • Let $g = \sum_{k=-n}^{n} c_k e_k$ be a trigonometric polynomial of degree ≤ n.
We must show that $g = \sum_{k=-n}^{n} \hat{g}(k) e_k$. This follows from Lemma A8.3.
• Let $(c_{-n}, \ldots, c_n) \in C^{2n+1}$ and put $g = \sum_{k=-n}^{n} c_k e_k$. We must show that
$\hat{g}(k) = c_k$ for all k ∈ {−n, . . . , n}. This also follows from Lemma A8.3.
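The round trip in this ‘baby version’ can be tried out numerically: start from a list of coefficients, build the trigonometric polynomial, and recover the coefficients as Fourier coefficients. This is a sketch under the assumption that a midpoint Riemann sum is an acceptable stand-in for the integral; the coefficient values and the names `trig_poly` and `fourier_coefficient` are hypothetical.

```python
import cmath

def trig_poly(coeffs, n):
    """Build g as the sum of c_k e_k for k = -n..n, from coeffs = [c_{-n}, ..., c_n]."""
    def g(x):
        return sum(c * cmath.exp(2j * cmath.pi * k * x)
                   for k, c in zip(range(-n, n + 1), coeffs))
    return g

def fourier_coefficient(g, k, samples=64):
    """Midpoint Riemann sum for the k-th Fourier coefficient of g."""
    total = 0j
    for j in range(samples):
        x = (j + 0.5) / samples
        total += g(x) * cmath.exp(-2j * cmath.pi * k * x)
    return total / samples

n = 2
coeffs = [1 + 2j, 0, -3, 0.5j, 4]          # hypothetical c_{-2}, ..., c_2
g = trig_poly(coeffs, n)
recovered = [fourier_coefficient(g, k) for k in range(-n, n + 1)]
outside = fourier_coefficient(g, 5)        # a coefficient with |k| > n, approximately 0
```

For trigonometric polynomials of low degree, a Riemann sum on an evenly spaced grid reproduces the coefficients up to rounding, which is why so few sample points suffice here.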
given by
\[ f \mapsto (\hat{f}(k))_{k=-\infty}^{\infty}, \]
\[ \sum_{k=-\infty}^{\infty} c_k e_k \mapsfrom (c_k)_{k=-\infty}^{\infty}. \]
[Figure A.4: the graphs of f and f(· + t)]
f (· + t) : T → C
defined by
x 7→ f (x + t).
Geometrically, this means shifting the graph of the function by t units to the
left (Fig. A.4). (Or really, since T is a circle, it means rotating the graph by a
fraction t of a revolution.) It’s not so hard to see (try it!) that
f is continuous ⇐⇒ f (· + t) → f pointwise as t → 0
f is uniformly continuous ⇐⇒ f(· + t) → f in $\|\cdot\|_\infty$ as t → 0
Step functions are integrable, by Example A3.2(ii) and Lemma A3.4. Indeed,
\[ \int_I \sum_k c_k \chi_{J_k}(x) \, dx = \sum_k c_k |J_k|. \]
(You may be familiar with something like this definition from the theory of
metric spaces; for example, Q is dense in R.)
Proposition A9.4 Let I be a bounded interval. Then {step functions I → C}
is dense in {integrable functions I → C} with respect to $\|\cdot\|_1$.
Proof We prove it just for I = [0, 1), since this is the case that will matter
most to us and the proof for other bounded intervals is very similar.
First take a real-valued integrable function f : [0, 1) → R, and let ε > 0. By
definition of integration, we can choose a partition P of [0, 1) such that
\[ L(f, P) > \int_0^1 f(x) \, dx - \varepsilon. \]
Put J_k = [x_{k−1}, x_k) and put
\[ g = \sum_{k=1}^{n} m_k \chi_{J_k}. \]
(Draw a picture!) Then g is a step function, with $L(f, P) = \int_0^1 g(x) \, dx$. We
have g(x) ≤ f(x) for all x ∈ [0, 1), so
\[ \|f - g\|_1 = \int_0^1 (f(x) - g(x)) \, dx = \int_0^1 f(x) \, dx - L(f, P) < \varepsilon, \]
as required.
Now take an arbitrary integrable function f : [0, 1) → C, say f = f_1 + i f_2 with
f_1, f_2 : [0, 1) → R. Let ε > 0. By definition of integrability of complex-valued
functions (Definition A3.1(ii)), f_1 and f_2 are integrable. So by the previous
paragraph, we can choose step functions g_1, g_2 : [0, 1) → R such that
We are now ready to prove that integrable functions are ‘sort of continuous’,
in the sense of (A:4), following the plan described above.
Theorem A9.5 Let f : T → C be an integrable function. Then f(· + t) → f in
$\|\cdot\|_1$ as t → 0 (that is, $\|f(\cdot + t) - f\|_1 \to 0$ as t → 0).
Proof For the duration of this proof only, let us say that an integrable function
g : T → C is ‘good’ if g(· + t) → g in $\|\cdot\|_1$ as t → 0. We will prove that every
integrable function is good.
First, for any interval J ⊆ [0, 1), the characteristic function χ_J is good.
Indeed, write a = inf J and b = sup J. If a = b, or if a = 0 and b = 1, then
$\|\chi_J(\cdot + t) - \chi_J\|_1 = 0$ for all t. Otherwise, we can show that
\[ \|\chi_J(\cdot + t) - \chi_J\|_1 = 2|t| \]
whenever |t| is sufficiently small (exercise; see Figure A.6). Hence
$\|\chi_J(\cdot + t) - \chi_J\|_1 \to 0$ as t → 0.
Second, every step function g : [0, 1) → C is good. Indeed, if $g = \sum_{k=1}^{n} c_k \chi_{J_k}$
(in the usual notation) then
\begin{align*}
\|g(\cdot + t) - g\|_1 &= \Bigl\| \sum_{k=1}^{n} c_k \bigl( \chi_{J_k}(\cdot + t) - \chi_{J_k} \bigr) \Bigr\|_1 \\
&\le \sum_{k=1}^{n} |c_k| \, \bigl\| \chi_{J_k}(\cdot + t) - \chi_{J_k} \bigr\|_1 \\
&\to 0
\end{align*}
where the second equality is by substitution. (Intuitively, the area between the
graphs f and g is unchanged if we shift everything horizontally by t.) Also,
by the second part, we can choose δ > 0 such that for all t ∈ (−δ, δ), we have
$\|g(\cdot + t) - g\|_1 < \varepsilon/3$. Now for all t ∈ (−δ, δ), we have
as required.
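The identity used in the first step of this proof, that shifting the characteristic function of an interval by t changes it by exactly 2|t| in the 1-norm once |t| is small, is left as an exercise, but it can at least be checked numerically. A minimal sketch, assuming a midpoint Riemann sum for the 1-norm and an arbitrarily chosen interval [0.2, 0.7); the names `chi` and `shift_distance` are illustrative.

```python
def chi(a, b):
    """Characteristic function of [a, b) inside [0, 1), extended 1-periodically."""
    return lambda x: 1.0 if a <= x % 1.0 < b else 0.0

def shift_distance(f, t, samples=100_000):
    """Midpoint Riemann sum for the 1-norm of f(. + t) - f over one period."""
    total = 0.0
    for j in range(samples):
        x = (j + 0.5) / samples
        total += abs(f(x + t) - f(x))
    return total / samples

f = chi(0.2, 0.7)
# Each distance should be close to 2|t|: about 0.2, 0.02 and 0.1 respectively.
distances = {t: shift_distance(f, t) for t in (0.1, 0.01, -0.05)}
```

The factor 2 appears because the shifted and unshifted intervals differ in two strips of width |t|, one at each endpoint.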
Chapter B
Convergence of Fourier
series in the 2- and 1-norms
It’s as simple as that. Compare the long and complicated saga of pointwise
convergence recounted in Section A2. In contrast, this result is clean, easily
stated, and not too difficult to prove.
The 2-norm will play a much greater role than the 1-norm in this chapter. Because
$\|\cdot\|_2$ goes hand in hand with inner products (recalling that $\|f\|_2^2 = \langle f, f \rangle$),
working in the 2-norm has a great deal in common with ordinary Euclidean geometry.
It’s the most easily visualized context to work in.
The title of this section is made precise by part (i) of the following lemma,
illustrated in Figure B.1. Recall that given an integrable function f , its nth
Fourier partial sum Sn f is a trigonometric polynomial of degree ≤ n.
Figure B.1: Approximating a function f by a trigonometric polynomial of degree
≤ n. (Labels in the figure: f, S_n f, g, 0, and the trig polys of deg ≤ n.)
by Pythagoras (Lemma A6.3(ii)). Part (i) follows, then part (ii) by putting
g = 0 in (B:1).
For part (iii), we have
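\begin{align*}
\|S_n f\|_2^2 &= \left\langle \sum_{k=-n}^{n} \hat{f}(k) e_k, \sum_{\ell=-n}^{n} \hat{f}(\ell) e_\ell \right\rangle \\
&= \sum_{k,\ell=-n}^{n} \hat{f}(k) \overline{\hat{f}(\ell)} \langle e_k, e_\ell \rangle \\
&= \sum_{k=-n}^{n} |\hat{f}(k)|^2,
\end{align*}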
where the first two equalities use Lemma A6.2 and the third uses the orthonor-
mality of the characters (Lemma A7.4).
Proof By parts (ii) and (iii) of Lemma B1.1, we have $\sum_{k=-n}^{n} |\hat{f}(k)|^2 \le \|f\|_2^2$
for all n ≥ 0. The results follow.
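The inequality at the heart of this proof, that the sum of |f̂(k)|² over |k| ≤ n never exceeds the squared 2-norm of f, can be watched in action. The sketch below uses the sawtooth x ↦ x − ⌊x⌋ as a sample integrable function, with midpoint Riemann sums in place of the integrals; its squared 2-norm is the integral of x² over [0, 1), namely 1/3. The helper names are illustrative.

```python
import cmath

def fourier_coefficient(f, k, samples=2000):
    """Midpoint Riemann sum for the k-th Fourier coefficient of f."""
    total = 0j
    for j in range(samples):
        x = (j + 0.5) / samples
        total += f(x) * cmath.exp(-2j * cmath.pi * k * x)
    return total / samples

def sawtooth(x):
    """Sample integrable function: f(x) = x - floor(x)."""
    return x % 1.0

# |f^(k)|^2 for |k| <= 20, computed once and reused for each partial sum.
squares = {k: abs(fourier_coefficient(sawtooth, k)) ** 2 for k in range(-20, 21)}
bessel_sums = [sum(squares[k] for k in range(-n, n + 1)) for n in (0, 5, 20)]
norm_squared = 1.0 / 3.0     # squared 2-norm of the sawtooth
```

The sums increase with n and stay below 1/3, as the inequality requires.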
\[ \|f - S_N f\|_2 \ge \|f - S_{N+1} f\|_2 \]
\[ \varepsilon > \|f - S_N f\|_2 \ge \|f - S_{N+1} f\|_2 \ge \|f - S_{N+2} f\|_2 \ge \cdots, \]
B2 Convolution: definition and examples
For the lecture of 6 February 2014; part two of two
(x ∈ T).
Remark B2.2 Recall from Section A5 that the circle T is a group (the quotient
R/Z). The group operation is addition mod 1. So, it does make sense to add
and subtract elements of T, as we did in the definition of convolution.
B3 Convolution: properties
For the lecture of 10 February 2014; part one of two
Remark B3.3 This lemma tells us that + and ∗ give the set
{integrable functions T → C}
the structure of a ‘commutative algebra’ over C, that is, both a vector space
over C and a ring.
Or nearly. The only missing part is that it has no multiplicative identity.
Nowadays, the definition of ‘ring’ is usually taken to include the existence of a
multiplicative identity; but in analysis especially, there are important rings that
do not have one.
We’ll see very soon that if the identity did exist, it would be the mythical
‘delta function’.
Again, this is stronger than might be expected. It’s not merely true that
the convolution of two trigonometric polynomials is a trigonometric polyno-
mial. In fact, the convolution of anything with a trigonometric polynomial is a
trigonometric polynomial.
Proof We may write $g = \sum_{k=-n}^{n} c_k e_k$ for some n ≥ 0 and c_k ∈ C. Then
\[ f * g = \sum_{k=-n}^{n} c_k (f * e_k) = \sum_{k=-n}^{n} c_k \hat{f}(k) e_k, \]
Since Fourier partial sums are important in Fourier theory, this example
suggests that $\sum_{k=-n}^{n} e_k$ is also important. It is, and it has its own name:
Definition B3.7 Let n ≥ 0. The Dirichlet kernel of order n is
$D_n = \sum_{k=-n}^{n} e_k$.
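To get a feel for the Dirichlet kernel, here is a sketch that evaluates it straight from the definition and compares it with the closed form sin((2n + 1)πx)/sin(πx). The closed form is a standard fact obtained by summing the geometric series, but it is not stated in these notes, so treat it here as an outside assumption being tested. The function names are illustrative.

```python
import cmath
import math

def dirichlet(n, x):
    """D_n(x) as the sum of e^{2 pi i k x} over k = -n..n, straight from the definition."""
    return sum(cmath.exp(2j * cmath.pi * k * x) for k in range(-n, n + 1))

def dirichlet_closed(n, x):
    """Closed form sin((2n+1) pi x) / sin(pi x), valid when x is not an integer."""
    return math.sin((2 * n + 1) * math.pi * x) / math.sin(math.pi * x)

value = dirichlet(1, 0.5)   # e^{-i pi} + 1 + e^{i pi} = -1, so D_1 takes negative values
```

At x = 0 every term of the sum equals 1, so D_n(0) = 2n + 1; the closed form has to be read as a limit there.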
Here are two further properties of convolution.
Lemma B3.10 For integrable functions f, g : T → C,
\[ \|f * g\|_\infty \le \|f\|_\infty \|g\|_1. \]
The final property should remind you of an important fact about Fourier
transforms.
\[ \widehat{f * g}(k) = \hat{f}(k) \hat{g}(k). \]
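The rule that convolution turns into multiplication of Fourier coefficients has an exact discrete analogue, which makes for a clean numerical check. The sketch below assumes the usual definition of convolution on the circle, (f ∗ g)(x) = ∫ f(t) g(x − t) dt over one period, and replaces every integral by a Riemann sum on the grid j/N. The two sample functions and all names are hypothetical choices.

```python
import cmath

N = 256                                  # number of sample points on [0, 1)
grid = [j / N for j in range(N)]

def coefficient(values, k):
    """Riemann-sum version of the k-th Fourier coefficient of sampled values."""
    return sum(v * cmath.exp(-2j * cmath.pi * k * j / N)
               for j, v in enumerate(values)) / N

def convolve(f_vals, g_vals):
    """Riemann-sum version of (f * g)(x_j): average of f(t_m) g(x_j - t_m) over the grid."""
    return [sum(f_vals[m] * g_vals[(j - m) % N] for m in range(N)) / N
            for j in range(N)]

f_vals = [x * (1 - x) for x in grid]                  # sample function f(x) = x(1 - x)
g_vals = [1.0 if x < 0.25 else 0.0 for x in grid]     # sample function: indicator of [0, 1/4)
conv_vals = convolve(f_vals, g_vals)

k = 3
lhs = coefficient(conv_vals, k)                        # coefficient of the convolution
rhs = coefficient(f_vals, k) * coefficient(g_vals, k)  # product of the coefficients
```

On the grid the two sides agree up to rounding error, for every k, because index arithmetic mod N mirrors addition on T.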
[Figure B.3: the function ∆ε, with height 1/ε on the interval from −ε/2 to ε/2]
This short section is largely intended as motivation for the rest of Part B. It
contains no definitions or theorems, but the ideas are important for what follows.
Fact: there is no integrable function δ : T → C such that
\[ \text{for all continuous } f : T \to C, \qquad \int_T f(x) \delta(x) \, dx = f(0). \tag{B:5} \]
(Compare Sheet 1, q.5.) But we can get close. Non-rigorously, for a ‘small’
ε > 0, put
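\[ \Delta_\varepsilon = \frac{1}{\varepsilon} \chi_{[-\varepsilon/2, \varepsilon/2]} : [-1/2, 1/2) \to C \]
(Figure B.3). Then for any continuous function f : T → C,
\[ \int_T f(x) \Delta_\varepsilon(x) \, dx = \frac{1}{\varepsilon} \int_{-\varepsilon/2}^{\varepsilon/2} f(x) \, dx \approx \frac{1}{\varepsilon} \int_{-\varepsilon/2}^{\varepsilon/2} f(0) \, dx = f(0), \]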
f ∗ δ = f,
since
\[ (f * \delta)(x) = \int_T f(x - t) \delta(t) \, dt = f(x - 0) = f(x) \]
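The non-rigorous claim that integrating f against ∆ε nearly recovers f(0) can be tested on an example. In this sketch the sample function cos(2πx) and the helper name `average_against_delta_eps` are illustrative assumptions; the integral of f against ∆ε is just the average of f over [−ε/2, ε/2], approximated by a midpoint Riemann sum.

```python
import math

def average_against_delta_eps(f, eps, samples=10_000):
    """Midpoint Riemann sum for (1/eps) times the integral of f over [-eps/2, eps/2]."""
    total = 0.0
    for j in range(samples):
        x = -eps / 2 + (j + 0.5) * eps / samples
        total += f(x)
    return total / samples

def f(x):
    """Sample continuous 1-periodic function with f(0) = 1."""
    return math.cos(2 * math.pi * x)

# The averages approach f(0) = 1 as eps shrinks.
approximations = [average_against_delta_eps(f, eps) for eps in (0.5, 0.1, 0.01)]
```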
If δ is merely a limit of trigonometric polynomials K_n, say K_n → δ in $\|\cdot\|_2$,
then perhaps it follows that f ∗ K_n → f ∗ δ in $\|\cdot\|_2$. In that case, we would have
f ∗ K_n → f in $\|\cdot\|_2$. Lemma B1.3 would then imply that S_n f → f in $\|\cdot\|_2$.
This is the result we’re aiming for.
Figure B.4: Axiom PAD3 states that for large n, the shaded area is small.
Remarks B5.2 i. PAD2 is inspired by the thought that if $\int_T \delta(t) f(t) \, dt = f(0)$
for all continuous f (as in the previous section), then in particular
this is true when f is the constant function 1, giving $\int_T \delta(t) \, dt = 1$.
ii. In PAD3, the ‘δ’ mentioned is a real number, not the delta function!
iii. PAD2 tells us that the area under the graph of Kn is always 1. PAD3 says
that as n gets larger, that area gets concentrated into an ever-narrower
strip around the y-axis (Figure B.4).
iv. For this part of the theory, it’s convenient to view functions on T as
functions on [−1/2, 1/2).
v. The name ‘approximation to delta’ is non-standard. The usual name is
‘approximation to the identity’, since delta is the (non-existent) identity
for convolution.
vi. Almost every definition in this course is conceptually well-motivated. This
one, however, is not. The conditions PAD1–3 are just what is needed in
order to make the arguments work. (Other variants are possible.) How-
ever, it’s only a stepping stone, which we’ll use to reach theorems that are
clean and free of arbitrary conditions.
Examples B5.3 i. $(n \chi_{[-1/2n, 1/2n)})_{n=1}^{\infty}$ is a PAD. (Check!) A typical
element of this sequence is shown in Figure B.3, taking ε = 1/n.
ii. $(D_n)_{n=0}^{\infty}$ is not a PAD. First, it fails PAD1, since (for instance)
D_1(1/2) = −1 < 0. More seriously, it can be shown that $\|D_n\|_1 \to \infty$ as n → ∞,
whereas if (K_n) is a PAD then
\[ \|K_n\|_1 = \int_T K_n(t) \, dt = 1 \tag{B:6} \]
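Both examples can be probed numerically. The sketch below checks that the box functions nχ on [−1/2n, 1/2n) have integral 1, and estimates the 1-norm of the Dirichlet kernel for a few values of n to see it growing. The integrals are midpoint Riemann sums over [−1/2, 1/2), the kernel is written in its real form 1 + 2 Σ cos(2πkx) (which follows from pairing e_k with e_{−k}), and the function names are illustrative.

```python
import math

def dirichlet(n, x):
    """D_n(x) in real form: 1 + 2 * (cos(2 pi x) + ... + cos(2 pi n x))."""
    return 1.0 + 2.0 * sum(math.cos(2 * math.pi * k * x) for k in range(1, n + 1))

def box(n):
    """The function n * chi_{[-1/2n, 1/2n)} on [-1/2, 1/2)."""
    return lambda x: float(n) if -0.5 / n <= x < 0.5 / n else 0.0

def integral(h, samples=4000):
    """Midpoint Riemann sum for the integral of h over [-1/2, 1/2)."""
    return sum(h(-0.5 + (j + 0.5) / samples) for j in range(samples)) / samples

box_areas = [integral(box(n)) for n in (1, 4, 10)]    # each area is 1
dirichlet_norms = [integral(lambda x, n=n: abs(dirichlet(n, x)))
                   for n in (1, 4, 16)]               # increasing with n
```

Three values of n prove nothing about the limit, but they do show the 1-norms already exceeding 1, which is enough to rule out (B:6) for the Dirichlet kernels.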
Integrating (B:7) with respect to x shows that for all n ≥ 0 and δ ∈ (0, 1/2),
\begin{align*}
\|f * K_n - f\|_1
&\le \int_{-1/2}^{1/2} \int_{|t|<\delta} |f(x - t) - f(x)| K_n(t)\,dt\,dx + 2\|f\|_\infty \int_{\delta<|t|\le 1/2} K_n(t)\,dt \tag{B:8}\\
&= \int_{|t|<\delta} \|f(\cdot - t) - f\|_1 K_n(t)\,dt + 2\|f\|_\infty \int_{\delta<|t|\le 1/2} K_n(t)\,dt. \tag{B:9}
\end{align*}
(To get (B:8), we used the fact that 2‖f‖_∞ ∫_{δ<|t|≤1/2} K_n(t) dt is independent of
x, so that integrating it with respect to x over an interval of length 1 leaves it
unchanged.)
We now prove convergence. Let ε > 0. By Theorem A9.5 (‘integrable
functions are sort of continuous’), we can choose δ ∈ (0, 1/2) such that for all
t ∈ (−δ, δ), ‖f(· − t) − f‖_1 < ε/2. By PAD3, we can then choose N ≥ 0 such
that for all n ≥ N,
\[ \int_{\delta<|t|\le 1/2} K_n(t)\,dt < \frac{\varepsilon}{4\|f\|_\infty}. \]
So by (B:9), for all n ≥ N,
\begin{align*}
\|f * K_n - f\|_1 &\le \frac{\varepsilon}{2} \int_{|t|<\delta} K_n(t)\,dt + 2\|f\|_\infty \cdot \frac{\varepsilon}{4\|f\|_\infty}\\
&\le \frac{\varepsilon}{2} \int_{-1/2}^{1/2} K_n(t)\,dt + \frac{\varepsilon}{2} \tag{B:10}\\
&= \varepsilon, \tag{B:11}
\end{align*}
using PAD1 in (B:10) and PAD2 in (B:11).
We now know that f ∗ K_n → f in ‖·‖_1 for any integrable f. It is relatively
easy to deduce the stronger result that f ∗ K_n → f in ‖·‖_2. The key is
Lemma A6.8, which gives an upper bound on the 2-norm in terms of the 1- and
∞-norms.
Proposition B5.5 Let (K_n)_{n=0}^∞ be a PAD and let f : T → C be an integrable
function. Then f ∗ K_n → f in ‖·‖_2.
Proof For all n ≥ 0, we have
\[ \|f * K_n\|_\infty \le \|f\|_\infty \|K_n\|_1 = \|f\|_\infty \]
by Lemma B3.10 and (B:6). So for all n ≥ 0,
\[ \|f * K_n - f\|_\infty \le \|f * K_n\|_\infty + \|f\|_\infty \le 2\|f\|_\infty. \]
(Idea: we now have control over the ∞-norm of (f ∗ K_n − f). The previous
proposition gives us control over its 1-norm. Putting them together will give us
control over its 2-norm.)
Hence for all n ≥ 0, using Lemma A6.8 and Proposition B5.4,
\begin{align*}
\|f * K_n - f\|_2 &\le \sqrt{\|f * K_n - f\|_1 \, \|f * K_n - f\|_\infty}\\
&\le \sqrt{\|f * K_n - f\|_1}\,\sqrt{2\|f\|_\infty}\\
&\to 0 \text{ as } n \to \infty.
\end{align*}
B6 Summing the unsummable
For the lecture of Monday 24 February 2014
Problem (D_n) is not a PAD, as noted in Example B5.3(ii). Also, the sequence
(D_n)_{n=0}^∞ does not converge in any of our usual five senses. In other words,
Σ_{k=−∞}^∞ e_k is thoroughly unsummable.
does not converge, since the partial sums S_N = Σ_{n=0}^N (−1)^n are
S0 = 1, S1 = 0, S2 = 1, S3 = 0, ...,
S = 1 + (−1 + 1) + (−1 + 1) + · · · = 1 + 0 + 0 + · · · = 1.
Alice and Bob agree to split the difference, and so conclude that S = 1/2.
ii. Consider two copies of S lined up in columns:
2S = (1 − 1 + 1 − 1 + ···)
   +     (1 − 1 + 1 − ···)
   =  1 + 0 + 0 + 0 + ···
   =  1.
ii. Let Σ_{n=0}^∞ x_n be a series in C. If the sum exists, then the Cesàro sum
exists and is equal to Σ x_n.
(The idea now: we want |a_N − s| to be small. Let L be a large integer. Then
|s_n − s| is small when n ≥ L, and (|s_0 − s| + ··· + |s_{L−1} − s|)/(N + 1) is small
when N is much greater than L, since the numerator does not depend on N.)
Let ε > 0. Since s_n → s, we can choose L such that |s_n − s| < ε/2 for all
n ≥ L. We can then choose an integer
\[ M \ge \max\Bigl\{ L,\ \frac{2}{\varepsilon}\bigl(|s_0 - s| + \cdots + |s_{L-1} - s|\bigr) - 1 \Bigr\}. \]
For all N ≥ M ,
where in (B:12), we used the definition of M in the first summand and the
definition of L in the second.
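To see Cesàro summation at work, here is a minimal numerical sketch. The helper name `cesaro_means` and the two test series (Grandi's series and a geometric series) are illustrative choices, not part of the notes; the code only evaluates the definition a_N = (s_0 + ··· + s_N)/(N + 1) for finitely many N, so it suggests the limits rather than proving them.

```python
from itertools import accumulate

def cesaro_means(terms):
    """Return the Cesaro means a_N = (s_0 + ... + s_N) / (N + 1) of a series,
    where s_n = x_0 + ... + x_n are its partial sums."""
    partial_sums = list(accumulate(terms))        # s_0, s_1, ..., s_N
    running_totals = accumulate(partial_sums)     # s_0 + ... + s_N for each N
    return [total / (n + 1) for n, total in enumerate(running_totals)]

# Grandi's series 1 - 1 + 1 - 1 + ...: the partial sums oscillate between 1 and 0,
# but their averages settle down.
grandi = [(-1) ** n for n in range(2000)]
grandi_means = cesaro_means(grandi)

# A convergent series, sum of 2^(-n) = 2: the Cesaro means approach the same value,
# only more slowly than the partial sums do.
geometric = [0.5 ** n for n in range(2000)]
geometric_means = cesaro_means(geometric)

print(grandi_means[-1], geometric_means[-1])
```

The second series illustrates the proposition just proved: averaging does not change the limit of a sequence that already converges.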
Remark B6.4 The partial sums D_n = Σ_{k=−n}^n e_k are too wild to form a PAD.
The proposition we have just proved suggests that the Cesàro means (1/(N + 1))(D_0 +
··· + D_N) might be tamer. It turns out that they are; indeed, they form a PAD.
This will allow us to complete the plan described at the beginning of Section B5,
thus proving that in the 2-norm, Fourier series always converge.
B7 The Fejér kernel
For the lecture of Thursday 27 February 2014; part one of two
At the end of the last section, we expressed the hope that although the Dirichlet
kernels D_n do not form a PAD, perhaps their Cesàro means
\[ \frac{1}{n+1}(D_0 + \cdots + D_n) \]
do. Here we show that this is indeed the case.
Definition B7.1 Let n ≥ 0. The Fejér kernel of order n is
\[ F_n = \frac{1}{n+1}(D_0 + \cdots + D_n) : T \to \mathbb{C}. \]
Note that the Fejér kernel, like the Dirichlet kernel, is a trigonometric poly-
nomial.
Way back in Section A1, we abandoned the sine-and-cosine formulation of
Fourier series, choosing to work with the more elegant exponential formulation.
But it will be useful to have expressions for Dn and Fn in traditional trigono-
metric form.
Lemma B7.2 Let n ≥ 0 and 0 ≠ t ∈ T. Then
Since t ≠ 0, we have e_1(t) ≠ 1, and we may therefore apply the formula for
summing a geometric series. After some routine algebra, we get
\[ D_n(t) = \frac{e_1(t)^{n+1/2} - e_1(t)^{-(n+1/2)}}{e_1(t)^{1/2} - e_1(t)^{-1/2}}. \tag{B:13} \]
Noting that e_1(t)^α = e^{2πiαt} and applying the formula e^{iθ} = cos θ + i sin θ, we
obtain the result on D_n(t).
We now use a trick: multiply the top and bottom of (B:13) by the bottom.
This gives
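So for any N ≥ 0, the sum Σ_{n=0}^N D_n(t) telescopes, giving
\begin{align*}
\sum_{n=0}^N D_n(t) &= \frac{\bigl(e_1(t)^{N+1} + e_1(t)^{-(N+1)}\bigr) - \bigl(e_1(t)^{0} + e_1(t)^{-0}\bigr)}{\bigl(e_1(t)^{1/2} - e_1(t)^{-1/2}\bigr)^2}\\
&= \frac{\bigl(e_1(t)^{(N+1)/2} - e_1(t)^{-(N+1)/2}\bigr)^2}{\bigl(e_1(t)^{1/2} - e_1(t)^{-1/2}\bigr)^2}\\
&= \frac{[2i \sin((N+1)\pi t)]^2}{[2i \sin(\pi t)]^2},
\end{align*}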
This explicit formula helps us to prove our main result on the Fejér kernel.
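As a sanity check on the algebra, the following sketch compares the Fejér kernel computed straight from Definition B7.1 with the closed form sin²((N + 1)πt) / ((N + 1) sin²(πt)) obtained by dividing the telescoped sum by N + 1. The function names and the sample points are arbitrary choices made for illustration.

```python
import cmath
import math

def e(k, t):
    """Character e_k(t) = exp(2 pi i k t) on T = R/Z."""
    return cmath.exp(2j * math.pi * k * t)

def dirichlet(n, t):
    """D_n(t) = sum of e_k(t) for k = -n, ..., n (the sum is real)."""
    return sum(e(k, t) for k in range(-n, n + 1)).real

def fejer(n, t):
    """F_n(t) = (D_0(t) + ... + D_n(t)) / (n + 1), straight from the definition."""
    return sum(dirichlet(m, t) for m in range(n + 1)) / (n + 1)

def fejer_closed_form(n, t):
    """sin^2((n+1) pi t) / ((n+1) sin^2(pi t)), valid for t not an integer."""
    return math.sin((n + 1) * math.pi * t) ** 2 / ((n + 1) * math.sin(math.pi * t) ** 2)

# Largest discrepancy over a few orders n and a few nonzero points t.
max_gap = max(
    abs(fejer(n, t) - fejer_closed_form(n, t))
    for n in range(12)
    for t in (0.01, 0.1, 0.25, 0.37, 0.49, -0.3)
)
print(max_gap)
```

The discrepancy should be at the level of floating-point rounding; anything larger would point to a slip in the algebra.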
Proposition B7.3 (F_n)_{n=0}^∞ is a PAD. In particular, there is a PAD consisting
of trigonometric polynomials.
Hence
\begin{align*}
\int_T F_N(t)\,dt &= \frac{1}{N+1}\Bigl(\int_T D_0(t)\,dt + \cdots + \int_T D_N(t)\,dt\Bigr)\\
&= \frac{1}{N+1}(1 + \cdots + 1) = 1.
\end{align*}
PAD3: Let δ ∈ (0, 1/2). We must prove that
\[ \lim_{N\to\infty} \int_{\delta<|t|\le 1/2} F_N(t)\,dt = 0. \]
We use the fact that sin θ ≥ (2/π)θ for all θ ∈ [0, π/2]. (This can be proved using
convexity; see Figure B.5.) Now
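\begin{align*}
\int_{\delta<|t|\le 1/2} F_N(t)\,dt &= \frac{2}{N+1} \int_\delta^{1/2} \frac{\sin^2((N+1)\pi t)}{\sin^2 \pi t}\,dt \tag{B:14}\\
&\le \frac{2}{N+1} \int_\delta^{1/2} \frac{1}{(2t)^2}\,dt \tag{B:15}\\
&= \frac{1}{2(N+1)} \Bigl(\frac{1}{\delta} - 2\Bigr) \tag{B:16}\\
&\to 0 \text{ as } N \to \infty. \tag{B:17}
\end{align*}
Here (B:14) follows from Lemma B7.2 and F_N being even, (B:15) is because
sin²((N + 1)πt) ≤ 1 and |sin πt| ≥ (2/π) · πt, and (B:16) is a routine calculation.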
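The estimate can also be watched numerically. The sketch below approximates the total integral of F_N (which PAD2 says is 1) and the tail integral over δ < |t| ≤ 1/2, and compares the tail with the right-hand side of (B:16). The value δ = 0.1, the orders N and the midpoint rule are arbitrary choices for illustration, and the integrals are only approximations.

```python
import math

def fejer(n, t):
    """Fejer kernel via its closed form; F_n(0) = n + 1 by continuity."""
    s = math.sin(math.pi * t)
    if abs(s) < 1e-12:
        return float(n + 1)
    return math.sin((n + 1) * math.pi * t) ** 2 / ((n + 1) * s * s)

def midpoint_integral(func, a, b, steps=20000):
    """Midpoint-rule approximation to the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

delta = 0.1
checks = []
for n in (5, 50, 500):
    total = midpoint_integral(lambda t: fejer(n, t), -0.5, 0.5)      # PAD2: about 1
    tail = 2 * midpoint_integral(lambda t: fejer(n, t), delta, 0.5)  # uses F_n even
    bound = (1 / delta - 2) / (2 * (n + 1))                          # right side of (B:16)
    checks.append((n, total, tail, bound))
    print(n, round(total, 6), round(tail, 6), round(bound, 6))
```

In each row the tail should sit below the bound, and both should shrink as N grows, which is exactly what PAD3 asks for.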
Figure B.5: sin θ ≥ (2/π)θ for all θ ∈ [0, π/2].
B8 The main theorem
For the lecture of Thursday 27 February 2014; part two of two
We can now prove the main theorem of Part B. It states that Fourier’s idea
works perfectly when we use the 2- or 1-norm.
Theorem B8.1 Let f : T → C be an integrable function. Then S_n f → f in
both ‖·‖_2 and ‖·‖_1.
Proof This follows from Theorem B8.1, noting that the Fourier partial sums
Sn f are trigonometric polynomials.
Corollary B8.5 Let f, g : T → C be integrable functions such that f̂(k) = ĝ(k)
for all k ∈ Z. Then:
i. f = g a.e.;
ii. f (x) = g(x) for all x ∈ T such that f and g are both continuous at x;
iii. f = g if f and g are both continuous.
This result does not mention the 2-norm (or indeed any norm) in its state-
ment, although it does use the 2-norm in its proof. It answers a very fundamental
question about Fourier series, telling us:
Different functions have different Fourier series.
Here ‘different functions’ must be understood as ‘functions that are not almost
everywhere equal’, since it’s a basic fact that if f = g a.e. then f and g have
the same Fourier coefficients; see A3.13.
This encourages us to believe in Fantasy A8.7.
Chapter C
                 ‖·‖_∞
               ⇙       ⇘
         ‖·‖_2           pointwise
        ⇙                      ⇘
  ‖·‖_1                          a.e.
Part B covered the lower-left region of the diagram: convergence of Fourier series
in the 2- and 1-norms. In Part C, we look at the upper-right region: uniform
and pointwise convergence. (We will never look seriously at almost everywhere
convergence.)
C1 Warm-up
For the lecture of Monday 3 March 2014; part one of two
In Part B, we showed:
for all integrable f : T → C, S_n f → f in ‖·‖_2 and ‖·‖_1.
Is it also true in ‖·‖_∞?
No, since if S_n f → f in ‖·‖_∞ then f is continuous (by Fact A4.8), whereas
not every integrable function is continuous.
But this leaves open the possibility that:
for all continuous f : T → C, S_n f → f in ‖·‖_∞.
However, this is not true either. By du Bois–Reymond’s example (Theo-
rem A2.2), there is some continuous f : T → C such that Sn f does not even
converge pointwise to f , let alone uniformly.
So if we’re hoping for all Fourier series to converge in either the uniform or
pointwise sense, we’ll be disappointed. But perhaps it’s true if we dilute our
hopes, asking only for convergence almost everywhere. In other words, we might
hope that:
for all continuous f : T → C, Sn f → f almost everywhere.
This is true, by Carleson’s theorem (Theorem A2.6). But as previously men-
tioned, all known proofs are much too hard for this course.
However, there are some easier results available. For example, we will be
able to prove the theorem of Dirichlet that if f is continuously differentiable,
then Sn f → f uniformly (hence in all five senses). We will also look at an
application that seems to have nothing to do with Fourier theory: the so-called
equidistribution theorem of Weyl, concerning the statistical behaviour of mul-
tiples of irrational numbers.
(This should remind you of the situation with convergence in the 1-, 2- and
∞-norms—but it’s the other way round. Notice that squaring a small positive
number decreases it.)
Proof For the first implication, suppose that Σ_{k=−∞}^∞ |c_k| < ∞. Then certainly
the set {|c_k| : k ∈ Z} is bounded; put ‖c‖_∞ = sup_{k∈Z} |c_k|. For all n ≥ 0, we
have
\[ \sum_{k=-n}^{n} |c_k|^2 \le \|c\|_\infty \sum_{k=-n}^{n} |c_k| \le \|c\|_\infty \sum_{k=-\infty}^{\infty} |c_k|, \]
so Σ_{k=−∞}^∞ |c_k|² < ∞.
For the second implication, if Σ_{k=−∞}^∞ |c_k|² < ∞ then {|c_k|² : k ∈ Z} is
bounded, so {|c_k| : k ∈ Z} is bounded too.
Remark C1.2 No further implications hold. To see that the converse of the
first implication fails, consider
\[ c_k = \begin{cases} 1/k & \text{if } k \ge 1\\ 0 & \text{if } k \le 0. \end{cases} \]
To see that the converse of the second implication fails, take ck = 1 for all k.
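A quick numerical look at the first counterexample, as a sketch only: finite partial sums cannot prove divergence, but they make the contrast visible. The helper `partial_sums` and the cut-offs are illustrative choices.

```python
import math

def partial_sums(n_terms):
    """Partial sums of |c_k| and |c_k|^2 for c_k = 1/k (k >= 1), c_k = 0 (k <= 0)."""
    abs_sum = sum(1 / k for k in range(1, n_terms + 1))
    square_sum = sum(1 / k ** 2 for k in range(1, n_terms + 1))
    return abs_sum, square_sum

for n_terms in (10, 1000, 100000):
    abs_sum, square_sum = partial_sums(n_terms)
    # The first column keeps growing (roughly like log n); the second stays below pi^2/6.
    print(n_terms, round(abs_sum, 4), round(square_sum, 6))
```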
Here is another basic fact. It is about convergence in the 1-norm, but will
be useful for arguments about uniform and pointwise convergence.
We can now answer a subtle question that has been present from the begin-
ning. Can it happen that the Fourier series of f converges, but not to f ? For
convergence in the uniform sense, no.
Lemma C1.4 Let f : T → C be an integrable function. Suppose that the
sequence (S_n f)_{n=0}^∞ converges uniformly (not necessarily to f). Then:
Proof Let g be the uniform limit of the sequence (S_n f). By Fact A4.8, g is
continuous. Certainly S_n f → g in ‖·‖_1, so f̂(k) = ĝ(k) for all k ∈ Z by
Lemma C1.3 (taking ‘c_k’ to be f̂(k) and ‘f’ to be g). The result now follows
from Corollary B8.5.
C2 What if the Fourier coefficients are absolutely summable?
For the lecture of Monday 3 March 2014; part two of two
We saw in Proposition B1.2 that Σ_{k=−∞}^∞ |f̂(k)|² < ∞ for any integrable function
f : T → C. However, f may or may not have the stronger property that
Σ_{k=−∞}^∞ |f̂(k)| < ∞. (To see why this is stronger, recall Lemma C1.1.)
In this section, we will see that if Σ_{k=−∞}^∞ |f̂(k)| < ∞ then the Fourier series
of f behaves well, in a sense that will be made precise. In the next section, we
will see that many common functions do have this property.
We begin with a result on double sequences.
Proposition C2.1 Let c ∈ C^Z. Suppose that Σ_{k=−∞}^∞ |c_k| < ∞. Then the
sequence (Σ_{k=−n}^n c_k e_k)_{n=0}^∞ converges uniformly to a continuous function g.
Moreover, ĝ(k) = c_k for all k ∈ Z.
Proof For n ≥ 0, write s_n = Σ_{k=−n}^n c_k e_k.
First we show that the sequence (s_n) converges pointwise. Indeed, let x ∈ T.
Given ε > 0, we can choose N such that Σ_{k : |k|≥N} |c_k| < ε; then for all m ≥
n ≥ N,
\[ |s_m(x) - s_n(x)| = \Bigl|\sum_{k \,:\, n<|k|\le m} c_k e_k(x)\Bigr| \le \sum_{k \,:\, n<|k|\le m} |c_k| < \varepsilon. \]
C3 Continuously differentiable functions
For the lecture of Thursday 6 March 2014; part one of two
We have just shown that if Σ_{k=−∞}^∞ |f̂(k)| < ∞ then the Fourier series of
f behaves well (Theorem C2.2). If you had a particular function f in front of
you, you might be able to calculate its Fourier coefficients, you might be able
to calculate the sum of their absolute values, and that sum might be finite.
In that case, you could conclude that the Fourier series of that particular f
was well-behaved. But are there general conditions on f guaranteeing that
Σ_{k=−∞}^∞ |f̂(k)| < ∞?
The answer turns out to be yes. But beware that not all integrable functions
f : T → C satisfy Σ |f̂(k)| < ∞. In fact, there exist quite ‘nice’ functions f
with Σ |f̂(k)| = ∞:
Example C3.1 Define f : T → C by
\[ f(x) = \sum_{n \ge 2} \frac{\sin(2\pi n x)}{n \log n}. \]
For example, a function T → C belongs to the set C^1(T) if and only if the
corresponding 1-periodic function R → C is continuously differentiable.
Since uniform convergence is the strongest kind of convergence, this tells us
that the Fourier series of a continuously differentiable function converges in all
five senses. Thus, it behaves as well as any function could.
But the converse fails, even for continuous functions. Here is an example due
to Weierstrass:
\[ f(x) = \sum_{n=1}^{\infty} 2^{-n} \cos(2^n \pi x). \]
It can be shown (non-examinably) that Σ |f̂(k)| < ∞ and that f is continuous.
But it can also be shown that f is not differentiable anywhere, let alone
continuously differentiable everywhere.
C4 Fejér’s theorem
For the lecture of Thursday 6 March 2014; part two of two
Proof For (i), let x ∈ T. In the proof of Proposition B5.4, equation (B:7)
stated that for all n ≥ 0 and δ ∈ (0, 1/2),
\[ |(f * K_n)(x) - f(x)| \le \int_{|t|<\delta} |f(x - t) - f(x)| K_n(t)\,dt + 2\|f\|_\infty \int_{\delta<|t|\le 1/2} K_n(t)\,dt. \]
Suppose that f is continuous at x. Let ε > 0. Choose δ ∈ (0, 1/2) such that
|f(x − t) − f(x)| < ε/2 for all t ∈ (−δ, δ). By PAD3, we can also choose N such
that for all n ≥ N,
\[ \int_{\delta<|t|\le 1/2} K_n(t)\,dt < \frac{\varepsilon}{4\|f\|_\infty}. \]
Then for all n ≥ N,
\begin{align*}
|(f * K_n)(x) - f(x)| &\le \frac{\varepsilon}{2} \int_{|t|<\delta} K_n(t)\,dt + 2\|f\|_\infty \cdot \frac{\varepsilon}{4\|f\|_\infty}\\
&\le \frac{\varepsilon}{2} \int_{-1/2}^{1/2} K_n(t)\,dt + \frac{\varepsilon}{2} = \varepsilon,
\end{align*}
as required.
For (ii), suppose that f is continuous. Let ε > 0. Every continuous map
from a compact metric space to a metric space is uniformly continuous, and T
is compact, so f is uniformly continuous. Hence we can choose δ ∈ (0, 1/2) such
that for all x ∈ T and t ∈ (−δ, δ), we have |f (x − t) − f (x)| < ε/2.
(In case you have not previously encountered this theorem about metric
spaces, an alternative argument is available. Here we use instead the theorem
that any continuous function on a closed bounded interval is uniformly contin-
uous. Since f is continuous on [−1, 1], it is uniformly continuous there. So by
interpreting T as [−1/2, 1/2), we can choose δ as in the last paragraph.)
By PAD3, we can choose N as in the proof of (i). For all x ∈ T, for all
n ≥ N , we calculate that |(f ∗ Kn )(x) − f (x)| < ε, again as in the proof of (i).
So ‖f ∗ K_n − f‖_∞ ≤ ε for all n ≥ N, as required.
Proof (Fn ) is a PAD and An f = f ∗ Fn , so this follows from the last proposi-
tion.
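Here is a small numerical illustration of uniform convergence of the Fejér means, under some assumptions of my own: the test function is f(x) = |x| on [−1/2, 1/2) (continuous on T), its Fourier coefficients are taken from the standard closed form f̂(0) = 1/4 and f̂(k) = ((−1)^k − 1)/(2π²k²) for k ≠ 0, and the supremum is only approximated on a grid of 200 points. A_n f is computed as the average of S_0 f, ..., S_n f, which equals f ∗ F_n by linearity.

```python
import math

def coeff(k):
    """Fourier coefficient of f(x) = |x| on [-1/2, 1/2): 1/4 at k = 0,
    ((-1)^k - 1) / (2 pi^2 k^2) otherwise."""
    if k == 0:
        return 0.25
    return ((-1) ** k - 1) / (2 * math.pi ** 2 * k ** 2)

def partial_sum(n, x):
    """(S_n f)(x); the coefficients are real and even in k, so cosines suffice."""
    return coeff(0) + sum(2 * coeff(k) * math.cos(2 * math.pi * k * x) for k in range(1, n + 1))

def fejer_mean(n, x):
    """(A_n f)(x) = ((S_0 f)(x) + ... + (S_n f)(x)) / (n + 1)."""
    return sum(partial_sum(m, x) for m in range(n + 1)) / (n + 1)

grid = [i / 200 - 0.5 for i in range(200)]

def sup_error(n):
    """Approximate sup-norm distance between A_n f and f, measured on the grid."""
    return max(abs(fejer_mean(n, x) - abs(x)) for x in grid)

errors = [sup_error(n) for n in (2, 8, 32)]
print(errors)
```

The printed errors should decrease as n grows, in line with the uniform convergence A_n f → f for continuous f.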
Corollary C4.4 The set {trigonometric polynomials} is dense in C(T) with
respect to ‖·‖_∞.
Proof Sheet 4.
C5 Differentiable functions and the Riemann localization principle
For the lecture of Monday 10 March
This section contains two major results. First, there is a convergence theorem
for differentiable functions, to add to our existing convergence theorem for con-
tinuously differentiable functions (Dirichlet’s Theorem C3.4). Then, we meet
the deep and shocking ‘localization principle’ of Riemann.
Dirichlet’s theorem states that if f is continuously differentiable then Sn f →
f uniformly. We will weaken both the hypothesis and the conclusion, proving
that if f is differentiable (but not necessarily continuously so) then Sn f → f
pointwise (but not necessarily uniformly).
To prove this, we need another fact about integration.
Fact C5.1 Let I ⊆ R be a bounded interval and x ∈ I. Let g : I → C be a
bounded function such that for all δ > 0, the restriction g|_{I \ (x−δ, x+δ)} is
integrable. Then g is integrable.
The proof is omitted and non-examinable, but can be found in the appendix
of the book Fourier Analysis by Stein and Shakarchi.
Theorem C5.2 Let f : T → C be an integrable function. Then:
i. (Sn f )(x) → f (x) for all x ∈ T such that f is differentiable at x;
ii. Sn f → f pointwise if f is differentiable.
Proof We just have to prove (i), since this immediately implies (ii). Take x ∈ T
such that f is differentiable at x.
For all n ≥ 0,
\begin{align*}
f(x) - (S_n f)(x) &= \int_T D_n(t) f(x)\,dt - \int_T D_n(t) f(x - t)\,dt \tag{C:4}\\
&= \int_T \bigl(f(x) - f(x - t)\bigr) \frac{e_{n+1}(t) - e_{-n}(t)}{e_1(t) - 1}\,dt \tag{C:5}\\
&= \int_T \bigl(e_{n+1}(t) - e_{-n}(t)\bigr) g(t)\,dt, \tag{C:6}
\end{align*}
where in (C:4) we used the fact that ∫_T D_n(t) dt = 1, in (C:5) we used
equation (B:13) (from the proof of Lemma B7.2), and in (C:6) we put
\[ g(t) = \frac{f(x) - f(x - t)}{e_1(t) - 1} \]
We prove that g is integrable using Fact C5.1.
First we show that for each δ > 0, the function g|_{[−1/2,1/2) \ (−δ,δ)} is integrable.
Let δ > 0. By Lemma A3.7, it is enough to show that the restrictions of the
functions
\[ t \mapsto 1/(e_1(t) - 1), \qquad t \mapsto f(x) - f(x - t) \]
to [−1/2, 1/2) \ (−δ, δ) are both integrable. And indeed, the first restricted
function is integrable because it is continuous and bounded, and the second is
integrable because f is.
Next we show that g is bounded. For 0 < |t| ≤ 1/2, we have
\[ g(t) = \frac{f(x) - f(x - t)}{t} \cdot \frac{t}{e_1(t) - 1} \to f'(x) \cdot \frac{1}{2\pi i} \quad \text{as } t \to 0, \]
using l’Hôpital’s rule in the second factor. Define g(0) = f′(x)/2πi; then g is
continuous at 0, so there exists η > 0 such that g|_{(−η,η)} is bounded. But also,
g|_{[−1/2,1/2) \ (−η,η)} is bounded (since it is integrable). Hence g itself is bounded.
It follows from Fact C5.1 that g is integrable, as required.
In other words, whether or not the Fourier series of f at x converges to f (x)
depends only on the behaviour of f near x.
Equivalently,
(Sn f )(x) − (Sn g)(x) → 0 as n → ∞.
The result follows.
The theme of ‘local versus global’ becomes increasingly visible at this level
of mathematics. You may have met it in differential geometry; for example,
a surface is locally like R2 , but globally may be quite unlike it. In number
theory, there is the local-to-global principle of Hasse, which has to do with p-adic
numbers and is helpful in determining the solvability of polynomial equations
over the integers. The interplay between local and global is captured formally
by the notion of sheaf, which is especially prominent in algebraic geometry.
This section is an application of our theory. We will use what we have learned
to solve a problem that appears to have nothing to do with Fourier analysis.
Question:
So far, we haven’t seen any examples of equidistributed sequences, only non-
examples. It’s not obvious that there are any equidistributed sequences at all!
But using Fourier analysis, we’ll construct lots.
When we defined equidistributed sequence, we were choosing a precise meaning
for the phrase ‘evenly spread’. But here’s a different, less obvious interpretation.
We could say that a sequence (x_n) in [0, 1) is ‘evenly spread’ if for all
integrable functions f : T → C,
\[ \frac{1}{n} \sum_{j=1}^{n} f(x_j) \to \int_0^1 f(t)\,dt \quad \text{as } n \to \infty. \tag{C:7} \]
Think for a while about why this is a sensible idea—understanding the rest of
this section depends on it.
There are variations on this idea in which we use different classes of func-
tions. For instance, we could regard (xn ) as ‘evenly spread’ if (C:7) holds for
all continuous (rather than integrable) functions f , or all step functions, or all
characters, etc. Actually, one of these variations produces the notion of equidis-
tributed sequence itself:
Lemma C6.3 Let (x_n)_{n=1}^∞ be a sequence in [0, 1). Then (x_n) is equidistributed
if and only if (C:7) holds whenever f is the characteristic function of an interval
in [0, 1).
and
\[ \int_0^1 \chi_I(t)\,dt = |I|. \]
The result follows.
We now show that all these notions of ‘evenly spread’ sequence produced by
choosing different classes of function are, in fact, the same.
Theorem C6.4 Let (x_n)_{n=1}^∞ be a sequence in [0, 1). The following are equivalent:
i. (C:7) holds for all integrable f : T → C;
Proof We show that (i) ⇒ (ii) ⇒ (iii) ⇒ (iv) ⇒ (v) ⇒ (vi) ⇒ (i).
(i) ⇒ (ii): trivial.
(ii) ⇒ (iii): assume (ii). Let f = Σ_{k=−m}^m c_k e_k be a trigonometric polynomial.
Then
\[ \frac{1}{n} \sum_{j=1}^{n} f(x_j) = \sum_{k=-m}^{m} c_k \cdot \frac{1}{n} \sum_{j=1}^{n} e_k(x_j) \to \sum_{k=-m}^{m} c_k \int_0^1 e_k(t)\,dt = \int_0^1 f(t)\,dt \]
as n → ∞.
(iii) ⇒ (iv): assume (iii). Let f : T → C be a continuous function. Let
ε > 0. By Corollary C4.4, we can find a trigonometric polynomial g such that
‖f − g‖_∞ < ε/3. By assumption, we can choose N such that for all n ≥ N,
\[ \Bigl| \frac{1}{n} \sum_{j=1}^{n} g(x_j) - \int_0^1 g(t)\,dt \Bigr| < \varepsilon/3. \]
(iv) ⇒ (v): assume (iv). Let I ⊆ [0, 1) be an interval, and let ε > 0. We
can find a continuous function g_U : [0, 1) → R such that g_U(t) ≥ χ_I(t) for all
t ∈ [0, 1) and
\[ \int_0^1 g_U(t)\,dt - \int_0^1 \chi_I(t)\,dt < \varepsilon/2 \]
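So for all n ≥ N,
\begin{align*}
&\frac{1}{n} \sum_{j=1}^{n} \chi_I(x_j) - \int_0^1 \chi_I(t)\,dt \tag{C:8}\\
&\quad\le \frac{1}{n} \sum_{j=1}^{n} g_U(x_j) - \int_0^1 \chi_I(t)\,dt\\
&\quad= \Bigl( \frac{1}{n} \sum_{j=1}^{n} g_U(x_j) - \int_0^1 g_U(t)\,dt \Bigr) + \Bigl( \int_0^1 g_U(t)\,dt - \int_0^1 \chi_I(t)\,dt \Bigr)
\end{align*}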
Proof This is essentially (v) ⇐⇒ (ii) of Theorem C6.4. Indeed, condition (v)
is equivalent to (x_n) being equidistributed, by Lemma C6.3. Condition (ii) holds
if and only if for each k ∈ Z,
\[ \frac{1}{n} \sum_{j=1}^{n} e^{2\pi i k x_j} \to \int_0^1 e_k(t)\,dt \quad \text{as } n \to \infty. \tag{C:9} \]
When k = 0, (C:9) holds for any sequence (x_n) (check!). On the other hand,
∫_0^1 e_k(t) dt = ⟨e_k, e_0⟩ = 0 whenever k ≠ 0. The result follows.
This enables us to find, at last, some examples of equidistributed sequences.
Corollary C6.6 (Weyl’s equidistribution theorem) Let α ∈ R \ Q. Then
the sequence (⟨nα⟩)_{n=1}^∞ in [0, 1) is equidistributed.
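Proof We verify Weyl’s criterion. Let 0 ≠ k ∈ Z. For all n ≥ 1,
\[ \frac{1}{n} \sum_{j=1}^{n} e^{2\pi i k \langle j\alpha \rangle} = \frac{1}{n} \sum_{j=1}^{n} e^{2\pi i k j \alpha} \]
since ⟨jα⟩ − jα ∈ Z. We sum this geometric series using the standard formula,
which is valid as long as the ratio is not equal to 1. Here, this means e^{2πikα} ≠ 1,
or equivalently kα ∉ Z, which is true as k ≠ 0 and α ∉ Q. Thus, the formula is
valid and the sum of the geometric series is
\[ \frac{1}{n}\, e^{2\pi i k \alpha}\, \frac{e^{2\pi i k n \alpha} - 1}{e^{2\pi i k \alpha} - 1}. \]
So for all n ≥ 1,
\[ \Bigl| \frac{1}{n} \sum_{j=1}^{n} e^{2\pi i k \langle j\alpha \rangle} \Bigr| \le \frac{1}{n} \cdot \frac{|e^{2\pi i k n \alpha}| + |1|}{|e^{2\pi i k \alpha} - 1|} = \frac{1}{n} \cdot \frac{2}{|e^{2\pi i k \alpha} - 1|} \to 0 \]
as n → ∞.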
Weyl’s equidistribution theorem tells us that ‘in the long run’, about 1/100 of
these fractional parts lie between 0.12 and 0.13 (for instance). In particular,
there are infinitely many natural numbers n such that
Could you prove this without Weyl’s theorem? Could you prove that there is
even one number n with this property?
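For a feel of what equidistribution looks like in practice, here is a rough experiment. The choice α = √2, the sample size and the helper names are mine; floating-point arithmetic only approximates an irrational number, so this is an illustration, not a proof.

```python
import cmath
import math

def fractional_part(x):
    """<x> = x - floor(x), an element of [0, 1)."""
    return x - math.floor(x)

def fraction_in_interval(alpha, n, a, b):
    """Proportion of <alpha>, <2 alpha>, ..., <n alpha> lying in [a, b)."""
    hits = sum(1 for j in range(1, n + 1) if a <= fractional_part(j * alpha) < b)
    return hits / n

def weyl_average(alpha, n, k):
    """|(1/n) sum_{j=1}^n exp(2 pi i k <j alpha>)|, the quantity in Weyl's criterion."""
    total = sum(cmath.exp(2j * math.pi * k * fractional_part(j * alpha)) for j in range(1, n + 1))
    return abs(total) / n

alpha = math.sqrt(2)     # assumed example of an irrational number
n = 100000
proportion = fraction_in_interval(alpha, n, 0.12, 0.13)   # interval of length 1/100
averages = [weyl_average(alpha, n, k) for k in (1, 2, 3)]
print(proportion, averages)
```

The proportion should come out close to 1/100, and the averages for k = 1, 2, 3 should be tiny, matching the bound (1/n) · 2/|e^{2πikα} − 1| from the proof.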
Chapter D
Duality
The right general context for Fourier analysis is that of topological groups. To
understand what a topological group is, you first need to know roughly what
a topological space is. So here is a short, informal introduction to topological
spaces.
Roughly speaking, a set X is called a topological space if we know what
it means to ‘move gradually’ within X. For instance, R^n is a topological space
because we know what it means for one point to be close to another, and we
know what a ‘continuous path’ in R^n is (namely, a continuous map [0, 1] → R^n).
Similarly, the sphere is a topological space: you know what it means to move
gradually on the surface of the earth. The circle T is a topological space for the
same reason.
Some non-mathematical examples to help your intuition: the set of all
colours is a topological space, because you know what it means for a colour
to change gradually. The set of all possible human faces is a topological space,
because you know what it means for a face to change gradually (e.g. as you
age) or for one face to be similar to another. Your own changing face defines a
continuous map f from [0, 1] to the space of all faces, with f (0) being your face
at birth and f (1) your face at death.
Formally, a topological space is a set X equipped with some extra data
(specifying what it means to ‘move gradually’). That extra data is called a
‘topology’ on X. We will not need the formal definition.
Some topological spaces are rather trivial. For instance, you can’t move
gradually from one integer to another without passing through non-integers, so
the set Z is a topological space in a trivial way: the only way of moving gradually
is to stay still. Formally, this is called the discrete topology on Z. (If you know
the definition of topological space, the discrete topology is the topology in which
all subsets are open.) Any set can be given the discrete topology.
Topological spaces are the right context for the definition of continuous map.
You know what continuity means for a function R → R, and more generally for
a function R^n → R^m, and also for a function T → C. Given topological spaces
X and Y, there is a definition of what it means for a function X → Y to be
continuous. The definitions for R^n, T, C, etc. are all special cases of this general
definition.
Roughly speaking, a topological group is a topological space that is also
a group. We are most interested in those topological groups that are abelian
(that is, the multiplication is commutative) and ‘locally compact’ (a condition
that I will not explain; it is satisfied by all the examples I will mention). So,
what we are interested in is locally compact, abelian topological groups, which
are usually just called ‘locally compact abelian groups’ or LCAGs for short.
Examples D1.1 i. R^n is a LCAG (with addition as the group operation),
for any n ≥ 0.
ii. The circle T is a LCAG too. Recall that T is the quotient group R/Z, so
that its group operation is addition too.
iii. Z is a LCAG, with + as the group operation and the discrete topology.
iv. Any finite abelian group is a LCAG, with the discrete topology.
v. S = {z ∈ C : |z| = 1} is a LCAG, with · as the group operation. In fact,
T ≅ S, via the map t ↦ e^{2πit}.
Definition D1.2 Let G be a LCAG. A character of G is a continuous group
homomorphism G → S.
Examples D1.3 i. For each ξ ∈ R^n, there is a character e_ξ of R^n defined
by
\[ e_\xi(x) = e^{2\pi i\, \xi \cdot x} \]
(x ∈ R^n), where ξ·x is the dot product of ξ and x. (This is the ordinary
product in the case n = 1.) You know that e_ξ is continuous, and you can
easily check that e_ξ is a group homomorphism (try it!). Fact: these are
the only characters of R^n.
ii. For each k ∈ Z, there is a character ek of T defined in the usual way.
Lemma A7.2 states exactly that each ek is a character in the sense of Def-
inition D1.2. Fact: these are the only characters of T. So, the characters
of T in our new sense are precisely the characters of T in our old sense.
iii. For each t ∈ T, there is a character ε_t of Z defined by
\[ \varepsilon_t(k) = e^{2\pi i k t} \]
(k ∈ Z). Again, you can easily check that each ε_t really is a character,
that is, a continuous homomorphism. Fact: these are the only characters
of Z.
The set
\[ \hat{G} = \{\text{characters of } G\} \]
forms a LCAG in a natural way. I won’t describe the topology, but the group
structure is given ‘pointwise’: if e_1 and e_2 are characters of G then the character
e_1 · e_2 is defined by (e_1 · e_2)(x) = e_1(x) · e_2(x) (x ∈ G). The hat notation is
related to, but different from, the hat notation f̂ for the Fourier coefficients or
Fourier transform of a function f.
Examples D1.3 tell us what Ĝ is for various LCAGs G:
G      Ĝ
R^n    R^n
T      Z
Z      T
For instance, T̂ ≅ Z because we have an isomorphism
\[ \mathbb{Z} \to \hat{\mathbb{T}}, \qquad k \mapsto e_k. \]
Theorem D1.4 (Pontryagin duality) \(\hat{\hat{G}} \cong G\) for all LCAGs G.
The group Ĝ can be thought of as the ‘mirror image’ of the group G (Figure D.1).
Pontryagin duality states that the mirror image of the mirror image
of G is G itself.
\[ \{\text{nice functions on } G\} \;\underset{(\check{\ })}{\overset{(\hat{\ })}{\rightleftarrows}}\; \{\text{nice functions on } \hat{G}\}, \]
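Figure D.1 (labels recovered from the original figure): locally compact abelian groups G and their duals Ĝ on either side of a mirror line. Labels: R, R², R³ (ordinary Fourier transforms); T and Z (Fourier series); finite abelian groups (rest of Part D).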
(x ∈ G). The terminology suggests that the two processes should be mutually
inverse, which they are if the functions are ‘nice’ enough.
When G = R^n (and so Ĝ = R^n too), f̂ is the usual Fourier transform
of a function f : R^n → C (see Example D1.3(i)), and φ̌ is the inverse Fourier
transform of a function φ : R^n → C.
When G = T (and so Ĝ = Z), what is here called f̂(e_k) is what we usually
call f̂(k), the kth Fourier coefficient of f. Given a double sequence φ : Z → C,
what is here called φ̌(x) is Σ_{k∈Z} φ(k)e_k(x). So, this is nothing but Fantasy A8.7.
Both Fourier transforms and Fourier series therefore arise as special cases
of the general theory of Fourier transforms on locally compact abelian groups.
Developing this theory is a major undertaking, requiring some of the theory of
topological groups and also some measure theory. But there is a special case in
which all the complications disappear, and that is what we will study for the
rest of the course. It is the theory of Fourier transforms on finite abelian groups.
D2 The dual of a finite abelian group
For the lecture of Monday 24 March. (Lecture of Thu 20 March is cancelled)
Maybe you’re fed up with fussy analytic conditions: this function is continu-
ously differentiable, that series is absolutely summable, and so on. If so, the last
few lectures are for you. They will show you a world in which Fourier analysis
works beautifully without any analytic conditions at all.
Everything will be developed from the ground up, assuming only some ba-
sic group theory. I will not assume any definitions, notation or results from
Section D1.
Definition D2.1 We denote by S the multiplicative group {z ∈ C : |z| = 1}.
• the inverse of e ∈ Ĝ is ē (again, defined as in Notation A3.3);
Proof The set of all functions G → S is certainly a group under these operations.
We show that Ĝ is a subgroup of this group, that is:
Figure D.2: The complex nth roots of unity are exactly the powers of
exp(2πi/n). Shown: n = 7, with ω = exp(2πi/7) and the points 1, ω, ω², ..., ω⁶ marked on the circle S.
Next recall that any two groups G1 and G2 have a product (also called a
‘direct product’) G1 × G2 .
Lemma D2.5 \(\widehat{G_1 \times G_2} \cong \hat{G}_1 \times \hat{G}_2\) for all finite abelian groups G_1 and G_2.
Proof Sheet 5.
Fact D2.6 Every finite abelian group is isomorphic to C_{n_1} × C_{n_2} × ··· × C_{n_k}
for some k, n_1, n_2, ..., n_k ≥ 1.
This is part of the classification theorem for finite abelian groups, which you
may have met in other courses.
Putting together the last three results gives:
Proposition D2.7 Ĝ ≅ G for every finite abelian group G.
Proof This follows from Lemma D2.4, Lemma D2.5 and Fact D2.6.
This proposition is the reason why the finite abelian groups were drawn on
the central dotted line (the ‘mirror’) of Figure D.1.
Remark D2.8 For a given G, there is usually no canonical (i.e. God-given)
isomorphism G → Ĝ. In order to construct an isomorphism, you have to make
some arbitrary choice, of the same kind you make when tossing a coin or choosing
a basis for a vector space.
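To make the dual concrete, here is a small sketch that writes down #G characters of a product of cyclic groups and checks that each is a homomorphism into S. The group C_2 × C_6, the additive notation, and the explicit formula x ↦ exp(2πi Σ a_i x_i / n_i) are illustrative assumptions of mine; labelling characters by elements a of G is exactly the sort of arbitrary choice the remark describes. The code shows there are at least #G characters; that there are no others is part of Proposition D2.7.

```python
import cmath
import math
from itertools import product

def elements(orders):
    """Elements of C_{n_1} x ... x C_{n_k}, written additively as tuples of residues."""
    return list(product(*(range(n) for n in orders)))

def character(orders, a):
    """The function x -> exp(2 pi i sum_i a_i x_i / n_i), one for each element a."""
    def chi(x):
        return cmath.exp(2j * math.pi * sum(ai * xi / n for ai, xi, n in zip(a, x, orders)))
    return chi

def add(orders, x, y):
    """Group operation: componentwise addition modulo n_i."""
    return tuple((xi + yi) % n for xi, yi, n in zip(x, y, orders))

def rounded(z):
    """Round a complex number so that equal values compare equal despite rounding error."""
    return complex(round(z.real, 6), round(z.imag, 6))

orders = (2, 6)                                   # an arbitrary small example
group = elements(orders)
characters = [character(orders, a) for a in group]

# Each chi satisfies chi(x + y) = chi(x) * chi(y) and takes values of modulus 1.
is_homomorphism = all(
    abs(chi(add(orders, x, y)) - chi(x) * chi(y)) < 1e-9
    for chi in characters for x in group for y in group
)
# Different labels a give different functions on G.
value_tables = {tuple(rounded(chi(x)) for x in group) for chi in characters}
print(is_homomorphism, len(value_tables), len(group))
```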
D3 Fourier transforms on a finite abelian group
For the lecture of Thursday 27 March
First we’ll make some definitions analogous to the definitions for Fourier series.
Then we’ll prove some results analogous to results on Fourier series. We’ll
discover that life is much easier on a finite abelian group than on the circle.
For the rest of this section, let G be a finite abelian group.
In (i), Fn(X) is a vector space over C, via the usual addition and scalar
multiplication of functions (as in Notation A3.3).
In (ii), the factor 1/#G ensures that ∫_G 1 dx = 1, just as ∫_T 1 dx = 1. So the
integral of a function on G can be thought of as its mean value, just as for
functions on T.
We now establish some elementary properties of integration, directly analo-
gous to those for ordinary integration stated in Lemma A3.4.
Lemma D3.2 i. For any functions f, g : G → C,
\[ \int_G (f + g)(x)\,dx = \int_G f(x)\,dx + \int_G g(x)\,dx. \]
Lemma D3.3 ⟨·, ·⟩ is an inner product on Fn(G).
Note that ⟨·, ·⟩ is a genuine inner product; that is, if ⟨f, f⟩ = 0 then f = 0.
(Contrast the comments after Lemma A6.2.) This is because for f ∈ Fn(G),
\[ \langle f, f \rangle = \frac{1}{\#G} \sum_{x \in G} |f(x)|^2, \]
Definition D3.6 Let f ∈ Fn(G). The Fourier transform of f is the function
f̂ ∈ Fn(Ĝ) defined by f̂(e) = ⟨f, e⟩ (e ∈ Ĝ).
Proof Both sides of this equation are functions on Ĝ, so to prove equality, we
need to take an arbitrary element e′ of Ĝ and show that evaluating each side at
e′ gives the same result.
Let e′ ∈ Ĝ. Then
\begin{align*}
\Bigl( \sum_{e \in \hat{G}} \phi(e) \cdot e \Bigr)^{\wedge} (e') &= \Bigl\langle \sum_{e} \phi(e) \cdot e,\ e' \Bigr\rangle\\
&= \sum_{e} \phi(e) \langle e, e' \rangle \quad \text{(by Lemma D3.3)}\\
&= \phi(e') \quad \text{(by orthonormality).}
\end{align*}
(y ∈ G). The family (δ_x)_{x∈G} spans Fn(G), since if f ∈ Fn(G) then f =
Σ_{x∈G} f(x)δ_x (check!). It is also linearly independent, since if (c_x)_{x∈G} is a
family of complex numbers such that Σ_{x∈G} c_x δ_x = 0, then for each y ∈ G we
have
\[ 0 = \Bigl( \sum_{x \in G} c_x \delta_x \Bigr)(y) = \sum_{x \in G} c_x \delta_x(y) = c_y. \]
So it is a basis.
Lemma D3.8 doesn’t really have an analogue in the world of Fourier series
(that is, Fourier theory on the circle group T).
Theorem D3.9 The characters of G form an orthonormal basis of Fn(G).
This is the (simpler) analogue of Corollary B8.5; see also Remark C4.3.
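Theorem D3.9 can be checked numerically in a small case. The sketch below assumes $G = \mathbb{Z}/6\mathbb{Z}$, characters of the form $e_k(x) = e^{2\pi i k x/6}$, and the inner product $\langle f, g\rangle = \frac{1}{\#G}\sum_x f(x)\overline{g(x)}$. None of these concrete choices, nor the names used, come from the notes themselves.

```python
import cmath

N = 6  # hypothetical choice: G = Z/6Z

def character(k):
    """The character e_k(x) = exp(2*pi*i*k*x/N) of Z/NZ (assumed form)."""
    return lambda x: cmath.exp(2j * cmath.pi * k * x / N)

def inner(f, g):
    """<f, g> = (1/#G) * sum over x of f(x) * conj(g(x)) (assumed form)."""
    return sum(f(x) * complex(g(x)).conjugate() for x in range(N)) / N

# Gram matrix of the characters: entry (j, k) is <e_j, e_k>.
gram = [[inner(character(j), character(k)) for k in range(N)]
        for j in range(N)]

# Orthonormality says the Gram matrix is the identity (up to rounding).
is_orthonormal = all(
    abs(gram[j][k] - (1 if j == k else 0)) < 1e-12
    for j in range(N) for k in range(N)
)

# N orthonormal (hence linearly independent) vectors in the N-dimensional
# space Fn(G) must form a basis.
number_of_characters = len(gram)
```

In this example there are as many characters as group elements, which is what lets orthonormality alone force the basis property.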
D4 Fourier inversion on a finite abelian group
For the lecture of Monday 31 March
The upside-down hat is meant to suggest the opposite of the Fourier trans-
form. It is pronounced ‘check’ (as in ‘phi-check’).
Remark D4.2 Suppose we were dealing with the circle $T$ rather than a finite abelian group $G$. We have $\hat{T} = \mathbb{Z}$ (as in Section D1), so $\phi$ would be an element of $\mathrm{Fn}(\mathbb{Z})$, that is, a double sequence $c = (c_k)_{k \in \mathbb{Z}}$. Then $\check{c}$ is notation for the familiar expression $\sum_{k \in \mathbb{Z}} c_k e_k$.
Fantasy A8.7 comes true in the setting of finite abelian groups. (No fussy
analytical details!) This is the content of the following theorem.
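Theorem D4.3 The maps
\[ \mathrm{Fn}(G) \underset{(\check{\ })}{\overset{(\hat{\ })}{\rightleftarrows}} \mathrm{Fn}(\hat{G}) \]
are linear and mutually inverse. In particular, the vector spaces $\mathrm{Fn}(G)$ and $\mathrm{Fn}(\hat{G})$ are isomorphic.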
Proof Corollary D3.10 states that $\check{\hat{f}} = f$ for all $f \in \mathrm{Fn}(G)$, and Proposition D3.7 states that $\hat{\check{\phi}} = \phi$ for all $\phi \in \mathrm{Fn}(\hat{G})$. So $(\hat{\ })$ and $(\check{\ })$ are mutually inverse. The map $(\hat{\ })$ is linear since $\langle\cdot,\cdot\rangle$ is linear in the first argument; so its inverse $(\check{\ })$ is linear too. Hence $(\hat{\ })$ and $(\check{\ })$ define an isomorphism of vector spaces.
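The two round trips in Theorem D4.3 can be tried out on a small cyclic group. This is only a sketch under assumptions: $G = \mathbb{Z}/5\mathbb{Z}$, characters $e_k(x) = e^{2\pi i k x/5}$, the inner product $\langle f, g\rangle = \frac{1}{\#G}\sum_x f(x)\overline{g(x)}$, and $\check{\phi} = \sum_{e \in \hat{G}} \phi(e) \cdot e$ (the finite analogue of the expression in Remark D4.2). The function names `hat` and `check` and the sample values are hypothetical.

```python
import cmath

N = 5  # hypothetical choice: G = Z/5Z

def character(k):
    """The character e_k(x) = exp(2*pi*i*k*x/N) of Z/NZ (assumed form)."""
    return lambda x: cmath.exp(2j * cmath.pi * k * x / N)

def inner(f, g):
    """<f, g> = (1/#G) * sum over x of f(x) * conj(g(x)) (assumed form)."""
    return sum(f(x) * complex(g(x)).conjugate() for x in range(N)) / N

def hat(f):
    """Fourier transform: the list of values <f, e_k>, indexed by k."""
    return [inner(f, character(k)) for k in range(N)]

def check(phi):
    """Inverse transform: the function x -> sum over k of phi[k] * e_k(x)."""
    return lambda x: sum(phi[k] * character(k)(x) for k in range(N))

# A function on G, given by its table of values (arbitrary sample data).
f_values = [2, -1, 0.5, 3j, 4]

def f(x):
    return f_values[x % N]

# Round trip G -> dual -> G: should reproduce f.
recovered_f = [check(hat(f))(x) for x in range(N)]

# Round trip dual -> G -> dual: should reproduce phi.
phi = [1, 0, 2j, -3, 0.25]
recovered_phi = hat(check(phi))
```

Both round trips agree with the starting data up to floating-point rounding, with no convergence question to settle, which is the contrast with the circle case drawn above.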
Definition D4.4 For $f \in \mathrm{Fn}(G)$, define $\|f\|_2 = \sqrt{\langle f, f\rangle}$.
We know from Theorem D4.3 that the function $(\hat{\ }) : \mathrm{Fn}(G) \to \mathrm{Fn}(\hat{G})$ is an isomorphism of vector spaces. We’d like to say that it’s also an isomorphism of metric spaces, that is, distance-preserving: $\|f\|_2 = \|\hat{f}\|_2$ for all $f \in \mathrm{Fn}(G)$. But the right-hand side is not yet defined. So we make some definitions:
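Definition D4.6
i. For $\phi \in \mathrm{Fn}(\hat{G})$, define $\int_{\hat{G}} \phi(e)\,de = \sum_{e \in \hat{G}} \phi(e) \in \mathbb{C}$.
ii. For $\phi, \psi \in \mathrm{Fn}(\hat{G})$, define $\langle \phi, \psi \rangle = \int_{\hat{G}} \phi(e)\overline{\psi(e)}\,de \in \mathbb{C}$.
iii. For $\phi \in \mathrm{Fn}(\hat{G})$, define $\|\phi\|_2 = \sqrt{\langle \phi, \phi \rangle}$.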
Compare and contrast (i) with Definition D3.1(ii): here, there is no factor of $\frac{1}{\#G}$. This is to make the following true:
\[ \|f\|_2 = \|\hat{f}\|_2. \]
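The role of the two different normalisations can be seen in a small computation. The sketch below assumes $G = \mathbb{Z}/4\mathbb{Z}$ with characters $e_k(x) = e^{2\pi i k x/4}$, uses $\|f\|_2^2 = \frac{1}{\#G}\sum_x |f(x)|^2$ on $G$, and $\|\phi\|_2^2 = \sum_e |\phi(e)|^2$ on $\hat{G}$ as in Definition D4.6. The sample function and all names are illustrative assumptions.

```python
import cmath
import math

N = 4  # hypothetical choice: G = Z/4Z

def character(k):
    """The character e_k(x) = exp(2*pi*i*k*x/N) of Z/NZ (assumed form)."""
    return lambda x: cmath.exp(2j * cmath.pi * k * x / N)

def hat(f_values):
    """Fourier transform of a table of values: entry k is <f, e_k>."""
    return [
        sum(f_values[x] * character(k)(x).conjugate() for x in range(N)) / N
        for k in range(N)
    ]

def norm_on_G(f_values):
    """||f||_2 on G: the sum carries the factor 1/#G."""
    return math.sqrt(sum(abs(v) ** 2 for v in f_values) / N)

def norm_on_dual(phi):
    """||phi||_2 on the dual: a plain sum, with no factor 1/#G."""
    return math.sqrt(sum(abs(v) ** 2 for v in phi))

f_values = [1, 2, 3, 4]          # arbitrary sample function on Z/4Z
f_hat = hat(f_values)

lhs = norm_on_G(f_values)        # sqrt((1 + 4 + 9 + 16) / 4) = sqrt(7.5)
rhs = norm_on_dual(f_hat)        # equals lhs up to rounding

# With the factor 1/#G also applied on the dual side, the norms would differ.
rhs_wrong = math.sqrt(sum(abs(v) ** 2 for v in f_hat) / N)
```

Here `rhs_wrong` comes out smaller than `lhs` by a factor of $\sqrt{\#G} = 2$, so in this example the identity holds only with the unnormalised sum on the dual side.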
Proof Suppose that $e(x) = e(y)$ for all $e \in \hat{G}$. Since $x \neq y$, there is some $f \in \mathrm{Fn}(G)$ such that $f(x) \neq f(y)$. (E.g. define $f$ by $f(x) = 1$ and $f(z) = 0$ for all $z \neq x$.) Then by Corollary D3.10,
\[ f(x) = \sum_{e \in \hat{G}} \hat{f}(e) e(x) = \sum_{e \in \hat{G}} \hat{f}(e) e(y) = f(y), \]
a contradiction.
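($e \in \hat{G}$). Then $\mathrm{ev}_x \in \hat{\hat{G}}$ for all $x \in G$, and the map
\[
\begin{array}{ccc}
G & \longrightarrow & \hat{\hat{G}} \\
x & \longmapsto & \mathrm{ev}_x
\end{array}
\]
is an isomorphism of groups.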
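For a cyclic group the double dual can be explored directly. The following sketch assumes $G = \mathbb{Z}/6\mathbb{Z}$ written additively (so the product $xy$ of the text becomes $x + y$), characters $e_k(x) = e^{2\pi i k x/6}$ with $e_j \cdot e_k = e_{j+k \bmod 6}$, and $\mathrm{ev}_x(e) = e(x)$. A character is represented by its index $k$; the names and this representation are assumptions made for illustration.

```python
import cmath

N = 6  # hypothetical choice: G = Z/6Z, written additively
TOL = 1e-12

def character(k):
    """The character e_k(x) = exp(2*pi*i*k*x/N) of Z/NZ (assumed form)."""
    return lambda x: cmath.exp(2j * cmath.pi * k * x / N)

def ev(x):
    """Evaluation at x: sends the character e_k (given by its index k) to e_k(x)."""
    return lambda k: character(k)(x)

indices = range(N)

# Each ev_x respects the product of characters, e_j * e_k = e_{(j+k) mod N},
# so ev_x is itself a character of the dual group.
ev_x_is_character = all(
    abs(ev(x)((j + k) % N) - ev(x)(j) * ev(x)(k)) < TOL
    for x in indices for j in indices for k in indices
)

# The map x -> ev_x respects the group operation of G.
ev_is_homomorphism = all(
    abs(ev((x + y) % N)(k) - ev(x)(k) * ev(y)(k)) < TOL
    for x in indices for y in indices for k in indices
)

# Distinct points give distinct evaluation maps: some character separates them.
ev_is_injective = all(
    any(abs(ev(x)(k) - ev(y)(k)) > 1e-9 for k in indices)
    for x in indices for y in indices if x != y
)
```

The code checks only the homomorphism and injectivity properties in this one example. It does not check surjectivity.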
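Proof For each $x \in G$, the function $\mathrm{ev}_x$ is a homomorphism $\hat{G} \to S$: indeed, for all $e_1, e_2 \in \hat{G}$, we have
\[ \mathrm{ev}_x(e_1 \cdot e_2) = (e_1 \cdot e_2)(x) = e_1(x) \cdot e_2(x) = \mathrm{ev}_x(e_1) \cdot \mathrm{ev}_x(e_2). \]
\[
\begin{aligned}
& \mathrm{ev} \text{ is a homomorphism} \\
\iff{} & \forall x, y \in G, \ \mathrm{ev}(xy) = \mathrm{ev}(x) \cdot \mathrm{ev}(y) \\
\iff{} & \forall x, y \in G, \ \mathrm{ev}_{xy} = \mathrm{ev}_x \cdot \mathrm{ev}_y \\
\iff{} & \forall x, y \in G, \ \forall e \in \hat{G}, \ \mathrm{ev}_{xy}(e) = (\mathrm{ev}_x \cdot \mathrm{ev}_y)(e) \\
\iff{} & \forall x, y \in G, \ \forall e \in \hat{G}, \ \mathrm{ev}_{xy}(e) = \mathrm{ev}_x(e) \cdot \mathrm{ev}_y(e) \\
\iff{} & \forall x, y \in G, \ \forall e \in \hat{G}, \ e(xy) = e(x) \cdot e(y),
\end{aligned}
\]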
∗ ∗ ∗