
Fourier Analysis


The University of Edinburgh

School of Physics and Astronomy

Fourier Analysis
Prof. John A. Peacock
jap@roe.ac.uk

Session: 2013/14

Introduction

Describing continuous signals as a superposition of waves is one of the most useful concepts in physics, and features in many branches: acoustics, optics and quantum mechanics, for example. The most common and useful approach uses Fourier techniques, which were invented by Joseph Fourier in the early 19th century. In many cases of relevance in physics, the waves evolve independently, and this allows solutions to be found for many ordinary differential equations (ODEs) and partial differential equations (PDEs). We will also explore some other techniques for solving common equations in physics, such as Green's functions and separation of variables, and investigate some aspects of digital sampling of signals.
As a reminder of notation, a single wave mode might have the form

$$ \psi(x) = a\cos(kx + \phi). \tag{1.1} $$

Here, $a$ is the wave amplitude; $\phi$ is the phase; and $k$ is the wavenumber, where the wavelength is $\lambda = 2\pi/k$. Equally, we might have a function that varies in time: then we would deal with $\cos\omega t$, where $\omega$ is angular frequency and the period is $T = 2\pi/\omega$. In what follows, we will tend to assume that the waves are functions of $x$, but this is an arbitrary choice: the mathematics will apply equally well to functions of time.

Fourier Series

Learning outcomes
In this section we will learn how Fourier series (real and complex) can be used to represent functions
and sum series. We will also see what happens when we use truncated Fourier series as an approximation to the original function, including the Gibbs phenomenon for discontinuous functions.

2.1

Overview

Fourier series are a way of expressing a function as a sum, or linear superposition, of waves of
different frequencies:

$$ f(x) = \sum_i a_i \cos(k_i x + \phi_i). \tag{2.1} $$

This becomes better specified if we consider the special case where the function is periodic with a period $2L$. This requirement means that we can only consider waves where a whole number of wavelengths fit into $2L$: $2L = n\lambda \Rightarrow k = n\pi/L$. Unfortunately, this means we will spend a lot of time writing $n\pi/L$, making the formulae look more complicated than they really are.
A further simplification is to realize that the phase of the waves need not be dealt with explicitly.
This is because of the trigonometric identity (which you should know)

$$ \cos(A + B) = \cos(A)\cos(B) - \sin(A)\sin(B). \tag{2.2} $$

Thus a single wave mode of given phase can be considered to be the combination of a sin and a cos
mode, both of zero phase.
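
As a quick numerical illustration of this point, the sketch below splits a single mode $a\cos(kx+\phi)$ into a zero-phase cosine and a zero-phase sine using Eqn. (2.2). The function names (`split_phase`, `wave`, `wave_from_parts`) are invented for this example and are not part of the notes.

```python
import math

def split_phase(a, phi):
    """Return (c, s) with a*cos(kx + phi) = c*cos(kx) + s*sin(kx), from Eqn. (2.2)."""
    return a * math.cos(phi), -a * math.sin(phi)

def wave(a, k, phi, x):
    """A single wave mode with explicit phase."""
    return a * math.cos(k * x + phi)

def wave_from_parts(a, k, phi, x):
    """The same mode rebuilt from a zero-phase cos mode and a zero-phase sin mode."""
    c, s = split_phase(a, phi)
    return c * math.cos(k * x) + s * math.sin(k * x)
```

The two evaluations should agree to rounding error for any amplitude, wavenumber and phase, which is why the phase never needs to appear explicitly in the series that follow.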

• Fourier Series deal with functions that are periodic over a finite interval, e.g. $-1 < x < 1$. The function is assumed to repeat outside this interval.
• Fourier Series are useful if (a) the function really is periodic, or (b) we only care about the function in a finite range (e.g. $-\pi < x < \pi$). We'll discuss this more in Sec. 2.7.
• If the range is infinite, we can use a Fourier Transform (see section 3).
• We can decompose any function we like in this way (well, any that satisfy some very mild mathematical restrictions).
• The sines and cosines are said to form a complete set. This means the same as the last bullet point. We won't prove this.
• One can decompose functions in other complete sets of functions (e.g. powers: the Taylor series is an example of this), but the Fourier Series is perhaps the most common and useful. Most of this course will be concerned with Fourier Series and Fourier Transforms (see later).

2.2

Periodic Functions

Periodic functions satisfy


f (t + T ) = f (t)

(2.3)

for all t. T is then the period. Similarly, a function can be periodic in space: f (x + X) = f (x).
Exercise: Show that if f (t) and g(t) are periodic with period T , then so are af (t) + bg(t) and
cf (t)g(t), where a, b, c are constants.
Note that a function which is periodic with a period X is also periodic with period 2X, and indeed
periodic with period nX, for any integer n. The smallest period is called the fundamental period.
Note also that the function does not have to be continuous.
Examples:
• $\sin x$ and $\cos x$ both have a fundamental period of $2\pi$.
• $\sin\left(\frac{n\pi x}{L}\right)$ has a period of $2L/n$, where $n$ is an integer.
• So $\sin\left(\frac{n\pi x}{L}\right)$ and $\cos\left(\frac{n\pi x}{L}\right)$ all have periods $2L$ as well as $2L/n$ (for all integer $n$).
• Note that the boundary of the period can be anything convenient: $0 \le x \le 2L$ for example, or $a \le x \le a + 2L$ for any $a$. Since it is periodic, it doesn't matter.

2.3

The Fourier expansion

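Within the interval $-L \le x \le L$, we can write a general (real-valued) function as a linear superposition of these Fourier modes:

$$
\begin{aligned}
f(x) &= \frac{1}{2}a_0 + \sum_{n=1}^{\infty} a_n \cos\left(\frac{n\pi x}{L}\right) + \sum_{n=1}^{\infty} b_n \sin\left(\frac{n\pi x}{L}\right) \\
&= \frac{1}{2}a_0 + \sum_{n=1}^{\infty} \left[ a_n \cos\left(\frac{n\pi x}{L}\right) + b_n \sin\left(\frac{n\pi x}{L}\right) \right]
\end{aligned}
\tag{2.4}
$$
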

where an and bn are (real-valued) expansion coefficients, also known as Fourier components. The
reason for the unexpected factor 1/2 multiplying a0 will be explained below.
2.3.1

What about n < 0?

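We don't need to include negative $n$ because the Fourier modes have a well defined symmetry (even or odd) under $n \to -n$: let's imagine we included negative $n$ and that the expansion coefficients are $A_n$ and $B_n$:

$$ f(x) = \frac{A_0}{2} + \sum_{n \neq 0} \left[ A_n \cos\left(\frac{n\pi x}{L}\right) + B_n \sin\left(\frac{n\pi x}{L}\right) \right] \tag{2.5} $$

$$ = \frac{A_0}{2} + \sum_{n=1}^{\infty} \left[ A_n \cos\left(\frac{n\pi x}{L}\right) + A_{-n} \cos\left(\frac{-n\pi x}{L}\right) + B_n \sin\left(\frac{n\pi x}{L}\right) + B_{-n} \sin\left(\frac{-n\pi x}{L}\right) \right]. \tag{2.6} $$

Now, $\cos\left(\frac{-n\pi x}{L}\right) = \cos\left(\frac{n\pi x}{L}\right)$ and $\sin\left(\frac{-n\pi x}{L}\right) = -\sin\left(\frac{n\pi x}{L}\right)$, so we can rewrite this as

$$ f(x) = \frac{A_0}{2} + \sum_{n=1}^{\infty} \left[ (A_n + A_{-n}) \cos\left(\frac{n\pi x}{L}\right) + (B_n - B_{-n}) \sin\left(\frac{n\pi x}{L}\right) \right]. \tag{2.7} $$

At this point $A_n$ and $A_{-n}$ are unknown constants. As they only appear summed together (rather than separately) we may as well just rename them as a single, unknown constant: $a_0 = A_0$, $a_n \equiv A_n + A_{-n}$ $(n \ge 1)$. We do the same for $b_n \equiv B_n - B_{-n}$. So, overall it is sufficient to consider just positive values of $n$ in the sum.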

2.4

Orthogonality

Having written a function as a sum of Fourier modes, we would like to be able to calculate the
components. This is made easy because the Fourier mode functions are orthogonal i.e. for non-zero
integers m and n,
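
$$ \int_{-L}^{L} dx\, \cos\left(\frac{m\pi x}{L}\right) \cos\left(\frac{n\pi x}{L}\right) = \begin{cases} 0 & m \neq n \\ L & m = n \end{cases} \tag{2.8} $$

$$ \int_{-L}^{L} dx\, \sin\left(\frac{m\pi x}{L}\right) \sin\left(\frac{n\pi x}{L}\right) = \begin{cases} 0 & m \neq n \\ L & m = n \end{cases} \tag{2.9} $$

$$ \int_{-L}^{L} dx\, \cos\left(\frac{m\pi x}{L}\right) \sin\left(\frac{n\pi x}{L}\right) = 0. \tag{2.10} $$

You can do the integrals using the trigonometric identities in Eqn. (2.14) below. Note that one of the Fourier modes is a constant (the $a_0/2$ term), so we will also need

$$ \int_{-L}^{L} dx\, \cos\left(\frac{n\pi x}{L}\right) = \begin{cases} 0 & n \neq 0 \\ 2L & n = 0 \end{cases} \tag{2.11} $$

$$ \int_{-L}^{L} dx\, \sin\left(\frac{n\pi x}{L}\right) = 0 \tag{2.12} $$

Note the appearance of $2L$ here, rather than $L$ in the $n > 0$ cases above.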
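
The orthogonality relations (2.8) to (2.12) are easy to check numerically. The following is a minimal sketch using a midpoint-rule integral; the helper names (`integrate`, `cos_cos`, `sin_sin`, `cos_sin`) and the step count are choices made for this illustration only.

```python
import math

def integrate(g, a, b, steps=2000):
    """Midpoint-rule estimate of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

def cos_cos(m, n, L=1.0):
    """Left-hand side of Eqn. (2.8); with m = 0 it also covers Eqn. (2.11)."""
    return integrate(lambda x: math.cos(m * math.pi * x / L) * math.cos(n * math.pi * x / L), -L, L)

def sin_sin(m, n, L=1.0):
    """Left-hand side of Eqn. (2.9)."""
    return integrate(lambda x: math.sin(m * math.pi * x / L) * math.sin(n * math.pi * x / L), -L, L)

def cos_sin(m, n, L=1.0):
    """Left-hand side of Eqn. (2.10)."""
    return integrate(lambda x: math.cos(m * math.pi * x / L) * math.sin(n * math.pi * x / L), -L, L)
```

For example, `cos_cos(2, 3)` should be close to zero, `cos_cos(3, 3, L=2.0)` close to $L = 2$, and `cos_cos(0, 0, L=1.5)` close to $2L = 3$.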

FOURIER ANALYSIS: LECTURE 2


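The orthogonality is the fact that we get zero in each case if $m \neq n$. We refer to the collected Fourier modes as an orthogonal set of functions.
Let us show one of these results. If $m \neq n$,

$$
\begin{aligned}
\int_{-L}^{L} dx\, \cos\left(\frac{m\pi x}{L}\right) \cos\left(\frac{n\pi x}{L}\right)
&= \frac{1}{2} \int_{-L}^{L} dx \left\{ \cos\left[\frac{(m+n)\pi x}{L}\right] + \cos\left[\frac{(m-n)\pi x}{L}\right] \right\} \\
&= \frac{1}{2} \left[ \frac{L \sin\left\{\frac{(m+n)\pi x}{L}\right\}}{(m+n)\pi} + \frac{L \sin\left\{\frac{(m-n)\pi x}{L}\right\}}{(m-n)\pi} \right]_{-L}^{L} \\
&= 0 \quad \text{if } m \neq n.
\end{aligned}
\tag{2.13}
$$

If $m = n$, the second cosine term is $\cos 0 = 1$, which integrates to $L$.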


ASIDE: useful trigonometric relations. To prove the orthogonality, the following formulae are useful:

$$
\begin{aligned}
2\cos A \cos B &= \cos(A + B) + \cos(A - B) \\
2\sin A \cos B &= \sin(A + B) + \sin(A - B) \\
2\sin A \sin B &= -\cos(A + B) + \cos(A - B) \\
2\cos A \sin B &= \sin(A + B) - \sin(A - B)
\end{aligned}
\tag{2.14}
$$

To derive these, we write $e^{i(A \pm B)} = e^{iA} e^{\pm iB}$, and rewrite each exponential using $e^{\pm i\theta} = \cos\theta \pm i\sin\theta$. Add or subtract the two expressions and take real or imaginary parts as appropriate to get each of the four results. Alternatively, the orthogonality can be proved using the complex representation directly: $\cos(kx) = [\exp(ikx) + \exp(-ikx)]/2$, so a product of cosines always generates oscillating terms like $\exp(-ix\Delta k)$; these always integrate to zero, unless $\Delta k = 0$.

2.5

Calculating the Fourier components

The Fourier basis functions are always the same. When we expand different functions as Fourier series, the difference lies in the values of the expansion coefficients. To calculate these Fourier components we exploit the orthogonality proved above. The approach will be the same as we follow when we extract components of vectors, which are expressed as a sum of components times basis vectors: $\mathbf{v} = \sum_i a_i \mathbf{e}_i$. The basis vectors are orthonormal, so we extract the $j$th component just by taking the dot product with $\mathbf{e}_j$ to project along that direction:

$$ \mathbf{e}_j \cdot \mathbf{v} = \mathbf{e}_j \cdot \sum_i a_i \mathbf{e}_i = a_j. \tag{2.15} $$

This works because all the terms in the series give zero, except the one we want. The procedure with Fourier series is exactly analogous:
1. Choose which constant we wish to calculate (i.e. $a_m$ or $b_m$ for some fixed, chosen value of $m$).
2. Multiply both sides by the corresponding Fourier mode (e.g. $\cos\left(\frac{m\pi x}{L}\right)$ if we are interested in $a_m$, or $\sin\left(\frac{m\pi x}{L}\right)$ if we are trying to find $b_m$).
3. Integrate over the full range ($-L \le x \le L$ in this case).
4. Rearrange to get the answer.
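So, to get $a_m$:

$$
\begin{aligned}
\int_{-L}^{L} dx\, \cos\left(\frac{m\pi x}{L}\right) f(x)
&= \frac{1}{2} a_0 \int_{-L}^{L} dx\, \cos\left(\frac{m\pi x}{L}\right) \\
&\quad + \sum_{n=1}^{\infty} \left[ a_n \int_{-L}^{L} dx\, \cos\left(\frac{m\pi x}{L}\right) \cos\left(\frac{n\pi x}{L}\right) + b_n \int_{-L}^{L} dx\, \cos\left(\frac{m\pi x}{L}\right) \sin\left(\frac{n\pi x}{L}\right) \right] \\
&= a_0 L\, \delta_{m0} + \sum_{n=1}^{\infty} L\, a_n\, \delta_{mn} \\
&= L\, a_m.
\end{aligned}
\tag{2.16-2.21}
$$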

$\delta_{mn}$ is the Kronecker delta function:

$$ \delta_{mn} = \begin{cases} 1 & m = n \\ 0 & m \neq n \end{cases} \tag{2.22} $$

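Rearranging:

$$ a_m = \frac{1}{L} \int_{-L}^{L} dx\, \cos\left(\frac{m\pi x}{L}\right) f(x) \tag{2.23} $$

$$ \text{Similarly,} \quad b_m = \frac{1}{L} \int_{-L}^{L} dx\, \sin\left(\frac{m\pi x}{L}\right) f(x). \tag{2.24} $$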

So this is why the constant term is defined as a0 /2: it lets us use the above expression for am for
all values of m, including zero.
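
The four-step recipe above can be sketched numerically: project a function onto each mode with Eqns. (2.23) and (2.24). This is only an illustration under stated assumptions; the function name `fourier_coefficients`, the midpoint-rule integrator and the step count are all choices made here, not something prescribed by the notes.

```python
import math

def integrate(g, a, b, steps=4000):
    """Midpoint-rule estimate of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

def fourier_coefficients(f, L, n_max, steps=4000):
    """Return lists (a, b) with a[m], b[m] for m = 0..n_max, via Eqns. (2.23)-(2.24)."""
    a, b = [], []
    for m in range(n_max + 1):
        # Steps 2 and 3: multiply by the chosen mode and integrate over -L..L.
        a.append(integrate(lambda x: math.cos(m * math.pi * x / L) * f(x), -L, L, steps) / L)
        b.append(integrate(lambda x: math.sin(m * math.pi * x / L) * f(x), -L, L, steps) / L)
    return a, b
```

Applied to a function that is already a finite sum of modes, such as $1 + 2\cos(\pi x/L) - 3\sin(2\pi x/L)$, the routine should return $a_0 = 2$ (because the constant term is $a_0/2$), $a_1 = 2$, $b_2 = -3$, and zero for everything else.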

2.6

Even and odd expansions

What if the function we wish to expand is even: $f(-x) = f(x)$, or odd: $f(-x) = -f(x)$? Because the Fourier modes are also even ($\cos\left(\frac{n\pi x}{L}\right)$) or odd ($\sin\left(\frac{n\pi x}{L}\right)$), we can simplify the Fourier expansions.
2.6.1

Expanding an even function

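Consider first the case that $f(x)$ is even:

$$ b_m = \frac{1}{L} \int_{-L}^{L} dx\, \sin\left(\frac{m\pi x}{L}\right) f(x) = \frac{1}{L} \int_{0}^{L} dx\, \sin\left(\frac{m\pi x}{L}\right) f(x) + \frac{1}{L} \int_{-L}^{0} dx\, \sin\left(\frac{m\pi x}{L}\right) f(x) \tag{2.25} $$

In the second integral, make a change of variables $y = -x \Rightarrow dy = -dx$. The limits on $y$ are $L \to 0$, and we use this minus sign to switch them round to $0 \to L$. $f(x) = f(-y) = +f(y)$ because it is an even function, whereas $\sin\left(-\frac{m\pi y}{L}\right) = -\sin\left(\frac{m\pi y}{L}\right)$ as it is odd. Overall, then:

$$ b_m = \frac{1}{L} \int_{0}^{L} dx\, \sin\left(\frac{m\pi x}{L}\right) f(x) - \frac{1}{L} \int_{0}^{L} dy\, \sin\left(\frac{m\pi y}{L}\right) f(y) = 0 \tag{2.26} $$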

Figure 2.1: $e^{-|x|}$ in $-1 < x < 1$.


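i.e. the Fourier decomposition of an even function contains only even Fourier modes. Similarly, we can show that

$$ a_m = \frac{1}{L} \int_{0}^{L} dx\, \cos\left(\frac{m\pi x}{L}\right) f(x) + \frac{1}{L} \int_{0}^{L} dy\, \cos\left(\frac{m\pi y}{L}\right) f(y) = \frac{2}{L} \int_{0}^{L} dx\, \cos\left(\frac{m\pi x}{L}\right) f(x). \tag{2.27} $$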
2.6.2

Expanding an odd function

For an odd function we get a similar result: all the $a_m$ vanish, so we only get odd Fourier modes, and we can calculate the $b_m$ by doubling the result from integrating from $0 \to L$:

$$ a_m = 0 \tag{2.28} $$

$$ b_m = \frac{2}{L} \int_{0}^{L} dx\, \sin\left(\frac{m\pi x}{L}\right) f(x) \tag{2.29} $$

We derive these results as before: split the integral into regions of positive and negative $x$; make a transformation $y = -x$ for the latter; exploit the symmetries of $f(x)$ and the Fourier modes $\cos\left(\frac{m\pi x}{L}\right)$, $\sin\left(\frac{m\pi x}{L}\right)$.
Example: $f(x) = e^{-|x|}$ for $-1 < x < 1$. The fundamental period is 2.

Figure 2.2: Fourier Series for $e^{-|x|}$ in $-1 < x < 1$ summed up to $m = 1$ and to $m = 5$.
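The function is symmetric, so we seek a cosine series, with $L = 1$:

$$
\begin{aligned}
a_m &= \frac{2}{L} \int_{0}^{L} dx\, \cos\left(\frac{m\pi x}{L}\right) f(x) \\
&= 2 \int_{0}^{1} dx\, \cos(m\pi x)\, e^{-x} \\
&= 2 \int_{0}^{1} dx\, \frac{1}{2}\left( e^{im\pi x} + e^{-im\pi x} \right) e^{-x} \\
&= \int_{0}^{1} dx \left( e^{im\pi x - x} + e^{-im\pi x - x} \right) \\
&= \left[ \frac{e^{(im\pi - 1)x}}{im\pi - 1} + \frac{e^{-(im\pi + 1)x}}{-(im\pi + 1)} \right]_{0}^{1}
\end{aligned}
\tag{2.30-2.31}
$$

Now $e^{im\pi} = (e^{i\pi})^m = (-1)^m$, and similarly $e^{-im\pi} = (-1)^m$, so (noting that there is a contribution from $x = 0$)

$$
\begin{aligned}
a_m &= \frac{(-1)^m e^{-1} - 1}{im\pi - 1} - \frac{(-1)^m e^{-1} - 1}{im\pi + 1} \\
&= \left[ (-1)^m e^{-1} - 1 \right] \left[ \frac{1}{im\pi - 1} - \frac{1}{im\pi + 1} \right] \\
&= \left[ (-1)^m e^{-1} - 1 \right] \frac{2}{(im\pi - 1)(im\pi + 1)} \\
&= \frac{2\left[ (-1)^m e^{-1} - 1 \right]}{-m^2\pi^2 - 1} \\
&= \frac{2\left[ 1 - (-1)^m e^{-1} \right]}{1 + m^2\pi^2}.
\end{aligned}
\tag{2.32}
$$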
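
A short numerical cross-check of Eqn. (2.32) can be reassuring. The sketch below compares the closed form with a direct midpoint-rule evaluation of the defining integral (with $L = 1$), and also sums the cosine series. The function names and step counts are assumptions made for this illustration.

```python
import math

def a_m_closed(m):
    """Closed form of Eqn. (2.32) for f(x) = exp(-|x|) with L = 1."""
    return 2.0 * (1.0 - (-1) ** m * math.exp(-1.0)) / (1.0 + (m * math.pi) ** 2)

def a_m_numeric(m, steps=20000):
    """Midpoint-rule estimate of 2 * integral_0^1 cos(m pi x) exp(-x) dx."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += math.cos(m * math.pi * x) * math.exp(-x)
    return 2.0 * h * total

def partial_sum(x, m_max):
    """Truncated cosine series a_0/2 + sum_{m=1}^{m_max} a_m cos(m pi x)."""
    return 0.5 * a_m_closed(0) + sum(
        a_m_closed(m) * math.cos(m * math.pi * x) for m in range(1, m_max + 1)
    )
```

Because the coefficients fall off like $1/m^2$, the truncated series approaches $e^{-|x|}$ fairly quickly inside the interval, consistent with the behaviour shown in Fig. 2.2.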

Figure 2.3: $f(x) = x^2$ as a periodic function.

FOURIER ANALYSIS: LECTURE 3


2.7

Periodic extension, or what happens outside the range?

To discuss this, we need to be careful to distinguish between the original function that we expanded, $f(x)$ (which is defined for all $x$), and the Fourier series expansion $f_{\rm FS}(x)$ that we calculated (which is valid only for $-L \le x \le L$).
Inside the expansion range $f_{\rm FS}(x)$ is guaranteed to agree exactly with $f(x)$. Outside this range, the Fourier expansion $f_{\rm FS}(x)$ will not, in general, agree with $f(x)$.
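As an example, let's expand the function $f(x) = x^2$ between $-L$ and $L$ ($L$ is some number, which we might decide to set equal to $\pi$). This is an even function so we know $b_n = 0$. The other coefficients are:

$$ a_0 = \frac{1}{L} \int_{-L}^{L} dx\, x^2 = \frac{2}{L} \int_{0}^{L} dx\, x^2 = \frac{2}{L} \frac{L^3}{3} = \frac{2L^2}{3} $$

$$
\begin{aligned}
a_m &= \frac{1}{L} \int_{-L}^{L} dx\, x^2 \cos\left(\frac{m\pi x}{L}\right) = \frac{2}{L} \int_{0}^{L} dx\, x^2 \cos\left(\frac{m\pi x}{L}\right) = \frac{2L^2}{m^3\pi^3} \left[ y^2 \sin y + 2y \cos y - 2 \sin y \right]_{0}^{m\pi} \\
&= \frac{2L^2}{m^3\pi^3} \times 2m\pi(-1)^m = \frac{4L^2(-1)^m}{m^2\pi^2}
\end{aligned}
\tag{2.33}
$$

For details, see below.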
So, overall our Fourier series is

$$ f_{\rm FS}(x) = \frac{L^2}{3} + \frac{4L^2}{\pi^2} \sum_{n=1}^{\infty} \frac{(-1)^n}{n^2} \cos\left(\frac{n\pi x}{L}\right). \tag{2.34} $$

Inside the expansion range $f_{\rm FS}(x)$ agrees exactly with the original function $f(x)$. Outside, however, it does not: $f(x)$ keeps rising quadratically, whereas $f_{\rm FS}(x)$ repeats with period $2L$. We say the Fourier series has periodically extended the function $f(x)$ outside the expansion range. This is shown in Fig. 2.3.
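
The periodic extension is easy to see numerically. The sketch below evaluates a truncated version of Eqn. (2.34) inside and outside the expansion range; the function name `f_fs`, the default $L = 1$ and the truncation at 2000 terms are arbitrary choices for this illustration.

```python
import math

def f_fs(x, L=1.0, n_max=2000):
    """Partial sum of the Fourier series (2.34) for f(x) = x^2 on [-L, L]."""
    s = sum((-1) ** n / n ** 2 * math.cos(n * math.pi * x / L) for n in range(1, n_max + 1))
    return L ** 2 / 3.0 + 4.0 * L ** 2 / math.pi ** 2 * s
```

With $L = 1$, `f_fs(0.5)` should be close to $0.25$. At $x = 2.5$, which lies outside the range, the series should again give roughly $0.25$ (the value at $x = 0.5$, one period earlier) rather than $x^2 = 6.25$.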

Figure 2.4: The Fourier spectrum $a_n$ (with y-axis in units of $L^2$) for function $f(x) = x^2$.

There are some special cases where $f_{\rm FS}(x)$ does agree with $f(x)$ outside the range. If $f(x)$ is itself periodic with period $2L/p$ (i.e. the size of the range divided by some integer $p$, such that $f(x + 2L/p) = f(x)$), then $f_{\rm FS}(x)$ will agree with $f(x)$ for all $x$.
Another special case is where $f(x)$ is only defined in the finite range of expansion, e.g. because we are only considering a string extending from 0 to $L$. Physically, then it does not matter if $f_{\rm FS}(x)$ deviates from $f(x)$ outside the range.
A plot of the coefficients, $\{c_n\}$ versus $n$, is known as the spectrum of the function: it tells us how much of each frequency is present in the function. The process of obtaining the coefficients is often known as spectral analysis. We show the spectrum for $f(x) = x^2$ in Fig. 2.4.
Choice of periodic extension. There is no unique way of casting $f(x)$ as a periodic function, and there may be good and bad ways of doing this. For example, suppose we were interested in representing $f(x) = x^2$ for $0 < x < L$: we have already solved this by considering the even function $x^2$ over $-L < x < L$, so the periodicity can be over a range that is larger than the range of interest. Therefore, we could equally well make an odd periodic function by adopting $+x^2$ for $0 < x < L$ and $-x^2$ for $-L < x < 0$. This is then suitable for a sin series. The coefficients for this are

$$
\begin{aligned}
b_m &= \frac{1}{L} \int_{0}^{L} dx\, x^2 \sin\left(\frac{m\pi x}{L}\right) + \frac{1}{L} \int_{-L}^{0} dx\, (-x^2) \sin\left(\frac{m\pi x}{L}\right) \\
&= \frac{2}{L} \int_{0}^{L} dx\, x^2 \sin\left(\frac{m\pi x}{L}\right) = \frac{2L^2}{m^3\pi^3} \left[ -y^2 \cos y + 2y \sin y + 2 \cos y \right]_{0}^{m\pi} \\
&= \frac{2L^2}{m^3\pi^3} \left[ (-1)^{m+1} m^2\pi^2 + 2(-1)^m - 2 \right]
\end{aligned}
\tag{2.35}
$$
So now we have two alternative expansions, both of which represent $f(x) = x^2$ over $0 < x < L$. To lowest order, these are

$$ \cos: \quad f(x) = \frac{L^2}{3} - \frac{4L^2}{\pi^2} \cos\left(\frac{\pi x}{L}\right) + \cdots \tag{2.36} $$

$$ \sin: \quad f(x) = \frac{2L^2(\pi^2 - 4)}{\pi^3} \sin\left(\frac{\pi x}{L}\right) + \cdots. \tag{2.37} $$

Here the sin coefficient is $b_1$ from Eqn. (2.35); its leading part is $2L^2/\pi$. It should be clear that the first of these works better, since the function does behave quadratically near $x = 0$, whereas the single sin term is nothing like the target function. In order to get comparable accuracy, we need many more terms for the sin series than the cos series: the coefficients for the cos series decline as $1/m^2$, as against only $1/m$ for the sin series at large $m$, so the sin series shows very poor convergence.
Doing the integrals for the $x^2$ expansion. We need to do the integral

$$ a_m = \frac{2}{L} \int_{0}^{L} dx\, x^2 \cos\left(\frac{m\pi x}{L}\right) \tag{2.38} $$

The first stage is to make a substitution that simplifies the argument of the cosine function:

$$ y = \frac{m\pi x}{L} \quad \Rightarrow \quad dy = \frac{m\pi}{L}\, dx \tag{2.39} $$

which also changes the upper integration limit to $m\pi$. So

$$ a_m = \frac{2}{L} \int_{0}^{m\pi} dy\, \frac{L}{m\pi} \frac{L^2 y^2}{m^2\pi^2} \cos y = \frac{2L^2}{m^3\pi^3} \int_{0}^{m\pi} dy\, y^2 \cos y. \tag{2.40} $$

We now solve this simplified integral by integrating by parts. An easy way of remembering integration by parts is

$$ \int u \frac{dv}{dx}\, dx = [uv] - \int v \frac{du}{dx}\, dx \tag{2.41} $$

and in this case we will make $u = y^2$ and $dv/dy = \cos y$. Why? Because we want to differentiate $y^2$ to make it a simpler function:

$$ \int dy\, y^2 \cos y = y^2 \sin y - \int dy\, 2y \sin y. \tag{2.42} $$

We now repeat the process for the integral on the RHS, setting $u = 2y$ for the same reason:

$$ \int dy\, y^2 \cos y = y^2 \sin y - \left[ -2y \cos y + \int dy\, 2 \cos y \right] = y^2 \sin y + 2y \cos y - 2 \sin y. \tag{2.43} $$

So, using $\sin m\pi = 0$ and $\cos m\pi = (-1)^m$:

$$ a_m = \frac{2L^2}{m^3\pi^3} \left[ y^2 \sin y + 2y \cos y - 2 \sin y \right]_{0}^{m\pi} = \frac{2L^2}{m^3\pi^3} \times 2m\pi(-1)^m = \frac{4L^2(-1)^m}{m^2\pi^2}. \tag{2.44} $$

2.8

Complex Fourier Series

Sines and cosines are one Fourier basis, i.e. they provide one way to expand a function in the interval $[-L, L]$. Another, very similar basis is that of complex exponentials.

$$ f(x) = \sum_{n=-\infty}^{\infty} c_n \phi_n(x) \quad \text{where} \quad \phi_n(x) = e^{+ik_n x} = e^{in\pi x/L} \tag{2.45} $$

$k_n = n\pi/L$ is the wavenumber.

This is a complex Fourier series, because the expansion coefficients, $c_n$, are in general complex numbers even for a real-valued function $f(x)$ (rather than being purely real as before). Note that the sum over $n$ runs from $-\infty$ to $\infty$ in this case. (The plus sign in the phase of the exponentials is a convention chosen to match the convention used for Fourier Transforms in Sec. 3.)
Again, these basis functions are orthogonal and the orthogonality relation in this case is

$$
\int_{-L}^{L} dx\, \phi_m^*(x) \phi_n(x) = \int_{-L}^{L} dx\, e^{-i(k_m - k_n)x} =
\begin{cases}
[x]_{-L}^{L} = 2L & (\text{if } n = m) \\
\left[ \dfrac{\exp(-i(k_m - k_n)x)}{-i(k_m - k_n)} \right]_{-L}^{L} = 0 & (\text{if } n \neq m)
\end{cases}
= 2L\, \delta_{mn}. \tag{2.46}
$$

Note the complex conjugation of $\phi_m(x)$: this is needed in order to make the terms involving $k$ cancel; without it, we wouldn't have an orthogonality relation.
For the case $n \neq m$, we note that $m - n$ is a non-zero integer (call it $p$) and

$$
\begin{aligned}
\exp[-i(k_m - k_n)L] - \exp[-i(k_m - k_n)(-L)] &= \exp[-ip\pi] - \exp[ip\pi] \\
&= (\exp[-i\pi])^p - (\exp[i\pi])^p = (-1)^p - (-1)^p = 0.
\end{aligned}
\tag{2.47}
$$

For $p = m - n = 0$ the denominator is also zero, hence the different result.
We can use (and need) this orthogonality to find the coefficients. If we decide we want to calculate $c_m$ for some chosen value of $m$, we multiply both sides by the complex conjugate of $\phi_m(x)$ and integrate over the full range:

$$ \int_{-L}^{L} dx\, \phi_m^*(x) f(x) = \sum_{n=-\infty}^{\infty} c_n \int_{-L}^{L} dx\, \phi_m^*(x) \phi_n(x) = \sum_{n=-\infty}^{\infty} c_n\, 2L\, \delta_{mn} = c_m\, 2L \tag{2.48} $$

$$ \Rightarrow \quad c_m = \frac{1}{2L} \int_{-L}^{L} dx\, \phi_m^*(x) f(x) = \frac{1}{2L} \int_{-L}^{L} dx\, e^{-ik_m x} f(x) \tag{2.49} $$

As before, we exploited that the integral of a sum is the same as a sum of integrals, and that the $c_n$ are constants.
2.8.1

Relation to real Fourier series

The complex approach may seem an unnecessary complication. Obviously it is needed if we have to represent a complex function, but for real functions we need to go to some trouble in order to make sure that the result is real:

$$ f(x) = \sum_{n=-\infty}^{\infty} c_n e^{in\pi x/L} \quad \Rightarrow \quad f(x) = f^*(x) = \sum_{n=-\infty}^{\infty} c_n^* e^{-in\pi x/L} \tag{2.50} $$

Equating the coefficients of the $e^{im\pi x/L}$ mode, we see that the Fourier coefficients have to be Hermitian:

$$ c_{-m} = c_m^*. \tag{2.51} $$

This shows why it was necessary to consider both positive and negative wavenumbers, unlike in the sin and cos case.
The advantage of the complex approach is that it is often much easier to deal with integrals involving exponentials. We have already seen this when discussing how to prove the orthogonality relations for sin and cos. Also, doing things this way saves having to do twice the work in obtaining coefficients for sin and cos series separately, since both the $a_n$ and $b_n$ coefficients are given by a single integral:

$$ c_n = \frac{1}{2L} \int_{-L}^{L} f(x) \exp(-ik_n x)\, dx = \frac{1}{2L} \int_{-L}^{L} f(x) \left[ \cos k_n x - i \sin k_n x \right] dx = \frac{1}{2}(a_n - ib_n). \tag{2.52} $$

The signs here follow from the $e^{-ik_m x}$ in Eqn. (2.49). This extra factor of $\frac{1}{2}$ arises from the orthogonality relations, reflecting the fact that the mean value of $|\exp(ikx)|^2$ is 1, whereas the mean of $\cos^2 kx$ or $\sin^2 kx$ is 1/2. Thus all coefficients for a real series are already known if we know the complex series:

$$ a_n = c_n + c_{-n}; \qquad b_n = i(c_n - c_{-n}). \tag{2.53} $$

FOURIER ANALYSIS: LECTURE 4


2.8.2

Example

To show the complex Fourier approach in action, we revisit our example of expanding $f(x) = x^2$ for $x \in [-L, L]$ (we could choose $L = \pi$ if we wished). The general expression for the Fourier coefficients, $c_m$, takes one of the following forms, depending on whether or not $m$ is zero:

$$ c_{m=0} = \frac{1}{2L} \int_{-L}^{L} dx\, x^2 = \frac{L^2}{3} \tag{2.54} $$

$$ c_{m \neq 0} = \frac{1}{2L} \int_{-L}^{L} dx\, x^2 e^{-im\pi x/L} = \frac{2L^2(-1)^m}{m^2\pi^2} \tag{2.55} $$

See below for details of how to do the second integral. We notice that in this case all the $c_m$ are real, but this is not the case in general.
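
The coefficient integral (2.49) can also be evaluated numerically, which gives a quick check on Eqns. (2.54) and (2.55) and on the Hermitian property (2.51). The following is a minimal sketch; the names `c_m_numeric` and `c_m_closed`, the midpoint rule and the step count are assumptions made for this illustration.

```python
import cmath
import math

def c_m_numeric(f, m, L=1.0, steps=4000):
    """Midpoint-rule estimate of c_m = (1/2L) * integral_{-L}^{L} exp(-i m pi x / L) f(x) dx."""
    h = 2.0 * L / steps
    total = 0j
    for i in range(steps):
        x = -L + (i + 0.5) * h
        total += f(x) * cmath.exp(-1j * m * math.pi * x / L)
    return total * h / (2.0 * L)

def c_m_closed(m, L=1.0):
    """Closed forms (2.54) and (2.55) for f(x) = x^2."""
    if m == 0:
        return L ** 2 / 3.0
    return 2.0 * L ** 2 * (-1) ** m / (m * math.pi) ** 2
```

For $f(x) = x^2$ the numerical coefficients should come out real to within rounding. For an odd real function such as $f(x) = x$ they should instead be purely imaginary, while still satisfying $c_{-m} = c_m^*$.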
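ASIDE: doing the integral. We want to calculate

$$ c_m \equiv \frac{1}{2L} \int_{-L}^{L} dx\, x^2 e^{-im\pi x/L} \tag{2.56} $$

To make life easy, we should change variables to make the exponent simpler (whilst keeping $y$ real), i.e. set $y = m\pi x/L$, for which $dy = (m\pi/L)\, dx$. The integration limits become $\pm m\pi$:

$$ c_m = \frac{1}{2L} \int_{-m\pi}^{m\pi} dy\, \frac{L}{m\pi} \frac{L^2 y^2}{m^2\pi^2} e^{-iy} = \frac{L^2}{2m^3\pi^3} \int_{-m\pi}^{m\pi} dy\, y^2 e^{-iy}. \tag{2.57} $$

Now we want to integrate by parts. We want the RHS integral to be simpler than the first, so we set $u = y^2 \Rightarrow du = 2y\, dy$ and $dv/dy = e^{-iy} \Rightarrow v = e^{-iy}/(-i) = ie^{-iy}$ (multiplying top and bottom by $i$ and recognising $-i \times i = 1$). So

$$ c_m = \frac{L^2}{2m^3\pi^3} \left\{ \left[ iy^2 e^{-iy} \right]_{-m\pi}^{m\pi} - \int_{-m\pi}^{m\pi} dy\, 2y \cdot ie^{-iy} \right\} = \frac{L^2}{2m^3\pi^3} \left\{ \left[ iy^2 e^{-iy} \right]_{-m\pi}^{m\pi} - 2i \int_{-m\pi}^{m\pi} dy\, y e^{-iy} \right\} \tag{2.58} $$

The integral is now simpler, so we play the same game again, this time with $u = y \Rightarrow du/dy = 1$ to get:

$$ c_m = \frac{L^2}{2m^3\pi^3} \left\{ \left[ iy^2 e^{-iy} \right]_{-m\pi}^{m\pi} - 2i \left( \left[ iye^{-iy} \right]_{-m\pi}^{m\pi} - \int_{-m\pi}^{m\pi} dy\, ie^{-iy} \right) \right\} \tag{2.59} $$

$$ = \frac{L^2}{2m^3\pi^3} \left\{ \left[ iy^2 e^{-iy} \right]_{-m\pi}^{m\pi} - 2i \left( \left[ iye^{-iy} \right]_{-m\pi}^{m\pi} - i \left[ ie^{-iy} \right]_{-m\pi}^{m\pi} \right) \right\} \tag{2.60} $$

$$ = \frac{L^2}{2m^3\pi^3} \left[ iy^2 e^{-iy} - 2i \cdot i \cdot ye^{-iy} + 2i \cdot i \cdot i\, e^{-iy} \right]_{-m\pi}^{m\pi} \tag{2.61} $$

$$ = \frac{L^2}{2m^3\pi^3} \left[ e^{-iy} \left( iy^2 + 2y - 2i \right) \right]_{-m\pi}^{m\pi} \tag{2.62} $$

We can now just substitute the limits in, using $e^{im\pi} = e^{-im\pi} = (-1)^m$ (so $e^{-iy}$ has the same value at both limits). Alternatively, we can note that the first and third terms in the round brackets are even under $y \to -y$ and therefore we will get zero when we evaluate between symmetric limits $y = \pm m\pi$ (N.B. this argument only works for symmetric limits). Only the second term, which is odd, contributes:

$$
\begin{aligned}
c_m &= \frac{L^2}{2m^3\pi^3} \left[ 2ye^{-iy} \right]_{-m\pi}^{m\pi} = \frac{L^2}{2m^3\pi^3} \left[ 2m\pi e^{-im\pi} - (-2m\pi) e^{im\pi} \right] \\
&= \frac{L^2}{2m^3\pi^3} \times 4m\pi(-1)^m = \frac{2L^2(-1)^m}{m^2\pi^2}.
\end{aligned}
\tag{2.63}
$$

2.9

Comparing real and complex Fourier expansions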

There is a strong link between the real and complex Fourier basis functions, because cosine and sine can be written as the sum and difference of two complex exponentials:

$$
\phi_n(x) = \cos\left(\frac{n\pi x}{L}\right) + i \sin\left(\frac{n\pi x}{L}\right)
\quad \Rightarrow \quad
\begin{cases}
\cos\left(\dfrac{n\pi x}{L}\right) = \dfrac{1}{2} \left[ \phi_n(x) + \phi_{-n}(x) \right] \\[2ex]
\sin\left(\dfrac{n\pi x}{L}\right) = \dfrac{1}{2i} \left[ \phi_n(x) - \phi_{-n}(x) \right]
\end{cases}
\tag{2.64}
$$

so we expect there to be a close relationship between the real and complex Fourier coefficients. Staying with the example of $f(x) = x^2$, we can rearrange the complex Fourier series by splitting the sum as follows:

$$ f_{\rm FS}(x) = c_0 + \sum_{n=1}^{\infty} c_n e^{in\pi x/L} + \sum_{n=-1}^{-\infty} c_n e^{in\pi x/L} \tag{2.65} $$

We can now relabel the second sum in terms of $n' = -n$:

$$ f_{\rm FS}(x) = c_0 + \sum_{n=1}^{\infty} c_n e^{in\pi x/L} + \sum_{n'=1}^{\infty} c_{-n'} e^{-in'\pi x/L} \tag{2.66} $$

Now $n$ and $n'$ are just dummy summation indices with no external meaning, so we can now relabel $n' \to n$ and the second sum now looks a lot like the first. Noting from Eqn. (2.63) that in this case $c_{-m} = c_m$, we see that the two sums combine to give:

$$ f_{\rm FS}(x) = c_0 + \sum_{n=1}^{\infty} c_n \left[ \phi_n(x) + \phi_{-n}(x) \right] = c_0 + \sum_{n=1}^{\infty} 2c_n \cos\left(\frac{n\pi x}{L}\right) \tag{2.67} $$

So, this suggests that our real and complex Fourier expansions are identical with $a_n = 2c_n$ (and $b_n = 0$). Comparing our two sets of coefficients in Eqns. (2.33) and (2.63), we see this is true.

2.10

Some other features of Fourier series

In this subsection, we're going to look at some other properties and uses of Fourier expansions.
2.10.1

Differentiation and integration

If the Fourier series of $f(x)$ is differentiated or integrated, term by term, the new series (if it converges, and in general it does) is the Fourier series for $f'(x)$ or $\int dx\, f(x)$, respectively (in the latter case, only if $a_0 = 0$; if not, there is a term $a_0 x/2$, which would need to be expanded in sin and cos terms).
This means that we do not have to do a second expansion to, for instance, get a Fourier series for
the derivative of a function.
It can also be a way to do a difficult integral. Integrals of sines and cosines are relatively easy, so
if we need to integrate a function it may be more straightforward to do a Fourier expansion first.

FOURIER ANALYSIS: LECTURE 5


2.10.2

Solving ODEs

Fourier Series can be very useful if we have a differential equation with coefficients that are constant, and where the equation is periodic (see footnote 1). For example

$$ \frac{d^2 z}{dt^2} + p \frac{dz}{dt} + rz = f(t) \tag{2.68} $$

where the driving term $f(t)$ is periodic with period $T$, i.e. $f(t + T) = f(t)$ for all $t$. We solve this by expanding both $f$ and $z$ as Fourier Series, and relating the coefficients. Note that $f(t)$ is a given function, so we can calculate its Fourier coefficients.
Writing

$$
\begin{aligned}
z(t) &= \frac{1}{2}a_0 + \sum_{n=1}^{\infty} a_n \cos(n\omega t) + \sum_{n=1}^{\infty} b_n \sin(n\omega t) \\
f(t) &= \frac{1}{2}A_0 + \sum_{n=1}^{\infty} A_n \cos(n\omega t) + \sum_{n=1}^{\infty} B_n \sin(n\omega t)
\end{aligned}
\tag{2.69}
$$

where the fundamental frequency is $\omega = 2\pi/T$, we have

$$
\begin{aligned}
\frac{dz}{dt} &= -\sum_{n=1}^{\infty} n\omega\, a_n \sin(n\omega t) + \sum_{n=1}^{\infty} n\omega\, b_n \cos(n\omega t) \\
\frac{d^2 z(t)}{dt^2} &= -\sum_{n=1}^{\infty} n^2\omega^2 a_n \cos(n\omega t) - \sum_{n=1}^{\infty} n^2\omega^2 b_n \sin(n\omega t)
\end{aligned}
\tag{2.70}
$$

Footnote 1: For non-periodic functions, we can use a Fourier Transform, which we will cover later.

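Then the l.h.s. of the differential equation becomes

$$ \frac{d^2 z}{dt^2} + p \frac{dz}{dt} + rz = \frac{r}{2}a_0 + \sum_{n=1}^{\infty} \left[ \left( -n^2\omega^2 a_n + pn\omega b_n + ra_n \right) \cos(n\omega t) + \left( -n^2\omega^2 b_n - pn\omega a_n + rb_n \right) \sin(n\omega t) \right]. \tag{2.71} $$

Now we compare the coefficients with the coefficients of expansion of the r.h.s. driving term $f(t)$. The constant term gives

$$ ra_0 = A_0 \quad \Rightarrow \quad a_0 = \frac{A_0}{r}. \tag{2.72} $$

Equating the coefficients of $\cos(n\omega t)$ and $\sin(n\omega t)$ gives:

$$
\begin{aligned}
-n^2\omega^2 a_n + pn\omega b_n + ra_n &= A_n \\
-n^2\omega^2 b_n - pn\omega a_n + rb_n &= B_n
\end{aligned}
\tag{2.73}
$$

This is a pair of simultaneous equations for $a_n$, $b_n$, which we can solve, e.g. with matrices:

$$ \begin{pmatrix} -n^2\omega^2 + r & pn\omega \\ -pn\omega & -n^2\omega^2 + r \end{pmatrix} \begin{pmatrix} a_n \\ b_n \end{pmatrix} = \begin{pmatrix} A_n \\ B_n \end{pmatrix} \tag{2.74} $$

Calculating the inverse of the matrix gives

$$ \begin{pmatrix} a_n \\ b_n \end{pmatrix} = \frac{1}{D} \begin{pmatrix} -n^2\omega^2 + r & -pn\omega \\ pn\omega & -n^2\omega^2 + r \end{pmatrix} \begin{pmatrix} A_n \\ B_n \end{pmatrix} \tag{2.75} $$

where

$$ D \equiv (r - n^2\omega^2)^2 + p^2 n^2 \omega^2. \tag{2.76} $$

So for any given driving term the solution can be found. Note that this equation filters the driving field: squaring and adding the two rows of Eqn. (2.75) gives $a_n^2 + b_n^2 = (A_n^2 + B_n^2)/D$, so

$$ \sqrt{a_n^2 + b_n^2} = \frac{\sqrt{A_n^2 + B_n^2}}{\sqrt{D}} \tag{2.77} $$

For large $n$, equation 2.76 gives $D \propto n^4$, so the high frequency harmonics are severely damped.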
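
The algebra above can be checked with a few lines of code. The sketch below applies the inverse matrix of Eqn. (2.75) to one harmonic of the driving term and returns the response coefficients; the function name `response` and its argument order are assumptions made for this illustration.

```python
import math

def response(A_n, B_n, n, omega, p, r):
    """Solve Eqn. (2.73) for (a_n, b_n) using the inverse matrix in Eqn. (2.75)."""
    k = r - n ** 2 * omega ** 2      # diagonal entry of the matrix in (2.74)
    q = p * n * omega                # off-diagonal magnitude
    D = k ** 2 + q ** 2              # determinant, Eqn. (2.76)
    a_n = (k * A_n - q * B_n) / D
    b_n = (q * A_n + k * B_n) / D
    return a_n, b_n
```

Substituting the returned pair back into Eqn. (2.73) should reproduce $A_n$ and $B_n$, and the response amplitude $\sqrt{a_n^2 + b_n^2}$ should equal the driving amplitude divided by $\sqrt{D}$.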
Discussion: If we set $p = 0$ and $r > 0$ we see that we have a Simple Harmonic Motion problem ($z'' + \omega_0^2 z = f(t)$), with natural frequency $\omega_0 = \sqrt{r}$. $p$ represents damping of the system. If the forcing term has a much higher frequency ($n\omega \gg \omega_0$) then $D$ is large and the amplitude is suppressed: the system cannot respond to being driven much faster than its natural oscillation frequency. In fact we see that the amplitude is greatest if $n\omega$ is about $\omega_0$ (if $p$ is small): an example of resonance.
Let us look at this in more detail. Suppose we drive the oscillator at a single frequency, so $A_n = 1$ say, for a single $n$, and we make $B_n = 0$ (by choosing the origin of time). All other $A_n$, $B_n = 0$.
The solution is

$$ a_n = \frac{1}{D} (\omega_0^2 - n^2\omega^2); \qquad b_n = \frac{1}{D}\, pn\omega \tag{2.78} $$

If the damping $p$ is small, we can ignore $b_n$ for our initial discussion.
So, if the driving frequency is less than the natural frequency, $n\omega < \omega_0$, then $a_n$ and $A_n$ have the same sign, and the oscillations are in phase with the driving force. If the driving frequency is higher than the natural frequency, $n\omega > \omega_0$, then $a_n < 0$ and the resulting motion is out of phase with the driving force.

Figure 2.5: The amplitude of the response of a damped harmonic oscillator (with small damping) to driving forces of different frequencies. The spike is close to the natural frequency of the oscillator.
2.10.3

Resonance

If the driving frequency is equal to the natural frequency, $n\omega = \omega_0$, then $a_n = 0$, and all the motion is in $b_n$. So the motion is $\pi/2$ out of phase ($\sin(n\omega t)$ rather than the driving $\cos(n\omega t)$). Here we can't ignore $p$.
2.10.4

Other solutions

Finally, note that we can always add a solution to the homogeneous equation (i.e. where we set
the right hand side to zero). The final solution will be determined by the initial conditions (z and
dz/dt). This is because the equation is linear and we can superimpose different solutions.
For the present approach, this presents a problem: the undriven motion of the system will not be periodic, and hence it cannot be described by a Fourier series. This explains the paradox of the above solution, which implies that $a_n$ and $b_n$ are zero if $A_n$ and $B_n$ vanish, as they do in the homogeneous case. So apparently $z(t) = 0$ in the absence of a driving force, whereas of course an oscillator displaced from $z = 0$ will show motion even in the absence of an applied force. For a proper treatment of this problem, we have to remove the requirement of periodicity, which will be done later when we discuss the Fourier transform.

2.11

Fourier series and series expansions

We can sometimes exploit Fourier series to either give us series approximations for numerical quantities, or to give us the result of summing a series.
Consider $f(x) = x^2$, which we expanded as a Fourier series in Eqn. (2.34) above, and let's choose the expansion range to be $-\pi \le x \le \pi$ (i.e. we'll set $L = \pi$). At $x = 0$ we have $f(x) = f_{\rm FS}(x) = 0$.

N         $\pi_N$         $|\pi_N - \pi|$
1         3.4641016151    0.3225089615
2         3.0000000000    0.1415926536
3         3.2145502537    0.0729576001
4         3.0956959368    0.0458967168
5         3.1722757341    0.0306830805
6         3.1192947921    0.0222978615
7         3.1583061852    0.0167135316
8         3.1284817339    0.0131109197
9         3.1520701305    0.0104774769
10        3.1329771955    0.0086154581
100       3.1414981140    0.0000945396
1000      3.1415916996    0.0000009540
10000     3.1415926440    0.0000000095
100000    3.1415926535    0.0000000001

Table 1: A series approximation to $\pi$ from Eqn. (2.81)


Substituting into Eqn. (2.34) we have

$$ 0 = \frac{\pi^2}{3} + \sum_{n=1}^{\infty} \frac{4(-1)^n}{n^2} \quad \Rightarrow \quad \frac{\pi^2}{12} = -\sum_{n=1}^{\infty} \frac{(-1)^n}{n^2} = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n^2} \tag{2.79} $$

This result can be useful in two ways:
1. We solve a physics problem, and find the answer as a sum $\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n^2}$. Using the above result we can replace the sum by $\frac{\pi^2}{12}$.
2. We need a numerical approximation for $\pi$. We can get this by truncating the sum at some upper value $n = N$ [as in Eqn. (2.86)] and adding together all the terms in the sum.

$$ \frac{\pi^2}{12} = 1 - \frac{1}{4} + \frac{1}{9} - \frac{1}{16} + \ldots \tag{2.80} $$

Let's call this approximation $\pi_N$:

$$ \pi_N \equiv \sqrt{12 \sum_{n=1}^{N} \frac{(-1)^{n+1}}{n^2}} \tag{2.81} $$

Table 1 shows how $\pi_N$ approaches $\pi$ as we increase $N$.

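We can get different series approximations by considering different values of $x$ in the same Fourier series expansions. For instance, consider $x = \pi$. This gives:

$$ \pi^2 = \frac{\pi^2}{3} + \sum_{n=1}^{\infty} \frac{4(-1)^n}{n^2} (-1)^n \quad \Rightarrow \quad \frac{\pi^2}{6} = \sum_{n=1}^{\infty} \frac{1}{n^2} \equiv \zeta(2) \tag{2.82} $$

This is an example of the Riemann zeta function $\zeta(s)$ which crops up a lot in physics. It has limits:

$$ \zeta(s) \equiv \sum_{n=1}^{\infty} \frac{1}{n^s} \to \begin{cases} 1 & \text{as } s \to \infty \\ \infty & \text{as } s \to 1 \end{cases} \tag{2.83} $$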

FOURIER ANALYSIS: LECTURE 6


2.11.1

Convergence of Fourier series

Fourier series (real or complex) are very good ways of approximating functions in a finite range, by
which we mean that we can get a good approximation to the function by using only the first few
modes (i.e. truncating the sum over n after some low value n = N ).
This is how music compression works in MP3 players, or how digital images are compressed in
JPEG form: we can get a good approximation to the true waveform by using only a limited number
of modes, and so all the modes below a certain amplitude are simply ignored.
We saw a related example of this in our approximation to $\pi$ using Eqn. (2.81) and Table 1.
Not examinable:
Mathematically, this translates as the Fourier components converging to zero, i.e. $a_n, b_n \to 0$ as $n \to \infty$, provided $f(x)$ is bounded (i.e. has no divergences). But how quickly do the high order coefficients vanish? There are two common cases:
1. The function and its first $p - 1$ derivatives ($f(x), f'(x), \ldots, f^{(p-1)}(x)$) are continuous, but the $p$th derivative $f^{(p)}(x)$ has discontinuities:

$$ a_n, b_n \sim 1/n^{p+1} \quad \text{for large } n. \tag{2.84} $$

An example of this was our expansion of $f(x) = x^2$. When we periodically extend the function, there is a discontinuity in the gradient ($p = 1$ derivative) at the boundaries $x = \pm L$. We have already seen $a_n \sim 1/n^2$ as expected (with $b_n = 0$).
2. $f(x)$ is periodic and piecewise continuous (i.e. it has jump discontinuities, but only a finite number within one period):

$$ a_n, b_n \sim 1/n \quad \text{for large } n. \tag{2.85} $$

An example of this is the expansion of the odd function $f(x) = x$, which jumps at the boundary. The Fourier components turn out to be $b_n \sim 1/n$ (with $a_n = 0$).
End of non-examinable section.
2.11.2

How close does it get? Convergence of Fourier expansions

We have seen that the Fourier components generally get smaller as the mode number n increases.
If we truncate the Fourier series after $N$ terms, we can define an error $D_N$ that measures how much
the truncated Fourier series differs from the original function: i.e. if
$$ f_N(x) = \frac{a_0}{2} + \sum_{n=1}^{N} \left[ a_n \cos\left(\frac{n\pi x}{L}\right) + b_n \sin\left(\frac{n\pi x}{L}\right) \right], \tag{2.86} $$
we define the error as
$$ D_N = \int_{-L}^{L} dx\, |f(x) - f_N(x)|^2 \ge 0. \tag{2.87} $$


Figure 2.6: The Gibbs phenomenon for truncated Fourier approximations to the signum function
Eqn. 2.88. Note the different x-range in the lower two panels.
That is, we square the difference between the original function and the truncated Fourier series at
each point $x$, then integrate across the full range of validity of the Fourier series. Technically, this
is what is known as an $L^2$ norm.
Some things you should know, but which we will not prove: if $f$ is reasonably well-behaved (no non-integrable
singularities, and only a finite number of discontinuities), the Fourier series is optimal in
the least-squares sense, i.e. if we ask what Fourier coefficients will minimise $D_N$ for some given
$N$, they are exactly the coefficients that we obtain by solving the full Fourier problem.
Furthermore, as $N \to \infty$, $D_N \to 0$. This sounds like we are guaranteed that the Fourier series will
represent the function exactly in the limit of infinitely many terms. But looking at the equation for
$D_N$, it can be seen that this is not so: it's always possible to have (say) $f_N = 2f$ over some range
$\Delta x$, and the best we can say is that $\Delta x$ must tend to zero as $N$ increases.
EXAMPLE: As an example of how Fourier series converge (or not), consider the signum function
which picks out the sign of a variable:
$$ f(x) = \mathrm{signum}\, x = \begin{cases} -1 & \text{if } x < 0, \\ +1 & \text{if } x \ge 0, \end{cases} \tag{2.88} $$

N      D_N
10     0.0808
50     0.0162
100    0.0081
250    0.0032
Table 2: Error $D_N$ on the $N$-term truncated Fourier series approximation to the signum function
Eqn. 2.88.
which we will expand in the range $-1 \le x \le 1$ (i.e. we set $L = 1$). The function is odd, so $a_n = 0$
and we find
$$ b_n = 2 \int_0^1 dx\, \sin(n\pi x) = \frac{2}{n\pi} \left[ 1 - (-1)^n \right]. \tag{2.89} $$
$f(x)$ has discontinuities at $x = 0$ and $x = \pm L = \pm 1$ (due to the periodic extension), so from
Sec. 2.11.1 we expected $b_n \sim 1/n$.
In Table 2 we show the error $D_N$ for the signum function for increasing values of $N$. As expected
the error decreases as $N$ gets larger, but relatively slowly. We'll see why this is in the next section.
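The error $D_N$ for the signum function can be evaluated without any numerical integration. The sketch below assumes $L = 1$ and uses the orthogonality of the sine modes, which gives $D_N = \int f^2\, dx - \sum_{n \le N} b_n^2$ with $\int_{-1}^{1} f^2\, dx = 2$; the function names are ours.

```python
import math

def b_n(n):
    """Sine coefficient of signum(x) on -1 <= x <= 1, as in Eqn. (2.89)."""
    return 2.0 * (1 - (-1) ** n) / (n * math.pi)

def truncation_error(N):
    """D_N = int |f - f_N|^2 dx over -1 <= x <= 1, for f = signum.

    By orthogonality (with L = 1): D_N = int f^2 dx - sum_{n<=N} b_n^2,
    and int f^2 dx = 2 for the signum function.
    """
    return 2.0 - sum(b_n(n) ** 2 for n in range(1, N + 1))

for N in (10, 50, 100, 250):
    D = truncation_error(N)
    print(N, round(D, 4), round(N * D, 3))  # N * D_N tends to a constant
```

The last column suggests $D_N \propto 1/N$ at large $N$, which is the slow convergence referred to in the text: since $b_n \sim 1/n$, the neglected tail $\sum_{n > N} b_n^2$ only falls as $1/N$.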
2.11.3

Ringing artefacts and the Gibbs phenomenon

We saw above that we can define an error associated with the use of a truncated Fourier series of $N$
terms to describe a function. Note that $D_N$ measures the total error by integrating the deviation
at each value of $x$ over the full range. It does not tell us whether the deviations between $f_N(x)$
and $f(x)$ were large and concentrated at certain values of $x$, or smaller and more evenly distributed
over the full range.
An interesting case is when we try to describe a function with a finite discontinuity (i.e. a jump)
using a truncated Fourier series, such as our discussion of the signum function above.
In Fig. 2.6 we plot the original function $f(x)$ and the truncated Fourier series for various $N$. We
find that the truncated sum works well, except near the discontinuity. Here the function overshoots
the true value and then has a damped oscillation. As we increase $N$ the oscillating region gets
smaller, but the overshoot remains roughly the same size (about 18% of the true value, i.e. about 9% of the full jump).
This overshoot is known as the Gibbs phenomenon. Looking at the plot, we can see that it tends to
be associated with extended oscillations either side of the step, known as ringing artefacts. Such
artefacts will tend to exist whenever we try to describe sharp transitions with Fourier methods, and
are one of the reasons that MP3s can sound bad when the compression uses too few modes. We can
reduce the effect by using a smoother method of Fourier series summation, but this is well beyond
this course. For the interested, there are some more details at
http://en.wikipedia.org/wiki/Gibbs_phenomenon.
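The persistence of the overshoot is easy to see numerically. This is a minimal sketch for the signum function on $-1 \le x \le 1$, using the coefficients $b_n = 4/(n\pi)$ for odd $n$; the grid size and the function names are our own choices.

```python
import math

def partial_sum(x, N):
    """Truncated Fourier series of signum(x) on -1 <= x <= 1 (odd n only)."""
    return sum(4.0 / (n * math.pi) * math.sin(n * math.pi * x)
               for n in range(1, N + 1, 2))

def peak_value(N, points=2000):
    """Largest value of the partial sum on a uniform grid over 0 < x <= 0.5."""
    return max(partial_sum(0.5 * (j + 1) / points, N) for j in range(points))

for N in (11, 51, 201):
    print(N, peak_value(N))  # stays near 1.18 however large N becomes
```

The peak moves closer to the step as $N$ increases, but its height does not approach the true value of 1.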

2.12

Parseval's theorem

There is a useful relationship between the mean square value of the function $f(x)$ and the Fourier
coefficients. Parseval's formula is
$$ \frac{1}{2L} \int_{-L}^{L} |f(x)|^2\, dx = |a_0/2|^2 + \frac{1}{2} \sum_{n=1}^{\infty} \left( |a_n|^2 + |b_n|^2 \right), \tag{2.90} $$

or, for complex Fourier Series,
$$ \frac{1}{2L} \int_{-L}^{L} |f(x)|^2\, dx = \sum_{n=-\infty}^{\infty} |c_n|^2. \tag{2.91} $$

The simplicity of the expression in the complex case is an example of the advantage of doing things
this way.
The quantity $|c_n|^2$ is known as the power spectrum. This is by analogy with electrical circuits, where
power is $I^2 R$. So the mean of $f^2$ is like the average power, and $|c_n|^2$ shows how this is contributed
by the different Fourier modes.
Proving Parseval is easier in the complex case, so we will stick to this. The equivalent for the
sin+cos series is included for interest, but is not examinable. First, note that $|f(x)|^2 = f(x) f^*(x)$
and expand $f$ and $f^*$ as complex Fourier Series:

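$$ |f(x)|^2 = f(x) f^*(x) = \sum_{n=-\infty}^{\infty} c_n \phi_n(x) \sum_{m=-\infty}^{\infty} c_m^* \phi_m^*(x) \tag{2.92} $$
(recall that $\phi_n(x) = e^{i k_n x}$). Then we integrate over $-L \le x \le L$, noting the orthogonality of $\phi_n$
and $\phi_m$:
$$ \int_{-L}^{L} |f(x)|^2\, dx = \sum_{m,n=-\infty}^{\infty} c_n c_m^* \int_{-L}^{L} \phi_n(x) \phi_m^*(x)\, dx = \sum_{m,n=-\infty}^{\infty} c_n c_m^* (2L \delta_{mn}) = 2L \sum_{n=-\infty}^{\infty} c_n c_n^* = 2L \sum_{n=-\infty}^{\infty} |c_n|^2 \tag{2.93} $$
where we have used the orthogonality relation $\int_{-L}^{L} \phi_n(x) \phi_m^*(x)\, dx = 2L$ if $m = n$, and zero otherwise.

2.12.1

Summing series via Parseval

Consider once again the case of $f = x^2$. The lhs of Parseval's theorem is $(1/2L) \int_{-L}^{L} x^4\, dx = (1/5) L^4$.
The complex coefficients were derived earlier, so the sum on the rhs of Parseval's theorem is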

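$$ \sum_{n=-\infty}^{\infty} |c_n|^2 = |c_0|^2 + \sum_{n \ne 0} |c_n|^2 = \left( \frac{L^2}{3} \right)^2 + 2 \sum_{n=1}^{\infty} \left( \frac{2L^2 (-1)^n}{n^2 \pi^2} \right)^2 = \frac{L^4}{9} + \sum_{n=1}^{\infty} \frac{8L^4}{n^4 \pi^4}. \tag{2.94} $$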

Equating the two sides of the theorem, we therefore get
$$ \sum_{n=1}^{\infty} \frac{1}{n^4} = (\pi^4/8)(1/5 - 1/9) = \pi^4/90. \tag{2.95} $$
This is a series that converges faster than the ones we obtained directly from the series at special
values of $x$.
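As a quick numerical check of this result, the sketch below sums the series directly and also compares the two sides of Parseval's theorem for $f = x^2$. The value $L = 2$ and the function names are arbitrary choices of ours.

```python
import math

def partial_zeta4(terms):
    """Partial sum of 1/n^4 for n = 1 .. terms."""
    return sum(1.0 / n ** 4 for n in range(1, terms + 1))

def parseval_rhs(L, terms):
    """Right-hand side of Parseval for f = x^2: L^4/9 + sum 8 L^4 / (n^4 pi^4)."""
    return L ** 4 / 9 + 8 * L ** 4 / math.pi ** 4 * partial_zeta4(terms)

target = math.pi ** 4 / 90
for terms in (1, 5, 20, 100):
    print(terms, partial_zeta4(terms), target - partial_zeta4(terms))

L = 2.0  # assumed half-range, for illustration only
print(L ** 4 / 5, parseval_rhs(L, 100))  # lhs and rhs of Parseval's theorem
```

Because the terms fall as $1/n^4$, a handful of terms already gives several correct digits.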

FOURIER ANALYSIS: LECTURE 7

Fourier Transforms

Learning outcomes
In this section you will learn about Fourier transforms: their definition and relation to Fourier
series; examples for simple functions; physical examples of their use including diffraction and
the solution of differential equations.
You will learn about the Dirac delta function and the convolution of functions.

3.1

Fourier transforms as a limit of Fourier series

We have seen that a Fourier series uses a complete set of modes to describe functions on a finite
interval, e.g. the shape of a string of length $\ell$. In the notation we have used so far, $\ell = 2L$. In some
ways, it is easier to work with $\ell$, which we do below; but most textbooks traditionally cover Fourier
series over the range $2L$, and these notes follow this trend.
Fourier transforms (FTs) are an extension of Fourier series that can be used to describe non-periodic
functions on an infinite interval. The key idea is to see that a non-periodic function can be viewed
as a periodic one, but taking the limit of $\ell \to \infty$. This is related to our earlier idea of being able to
construct a number of different periodic extensions of a given function. This is illustrated in Fig.
3.1 for the case of a square pulse that is only non-zero between $-a < x < +a$. When $\ell$ becomes
large compared to $a$, the periodic replicas of the pulse are widely separated, and in the limit of
$\ell \to \infty$ we have a single isolated pulse.

Figure 3.1: Different periodic extensions of a square pulse that is only non-zero between $-a <
x < +a$. As the period of the extension, $\ell$, increases, the copies of the pulse become more widely
separated. In the limit of $\ell \to \infty$, we have a single isolated pulse and the Fourier series goes over
to the Fourier transform.

Fourier series only include modes with wavenumbers $k_n = 2n\pi/\ell$, with adjacent modes separated by
$\delta k = 2\pi/\ell$. What happens to our Fourier series if we let $\ell \to \infty$? Consider again the complex series
for $f(x)$:
$$ f(x) = \sum_{n=-\infty}^{\infty} C_n\, e^{i k_n x}, \tag{3.1} $$
where the coefficients are given by
$$ C_n = \frac{1}{\ell} \int_{-\ell/2}^{\ell/2} dx\, f(x)\, e^{-i k_n x}, \tag{3.2} $$

and the allowed wavenumbers are $k_n = 2n\pi/\ell$. The separation of adjacent wavenumbers (i.e. for
$n \to n + 1$) is $\delta k = 2\pi/\ell$; so as $\ell \to \infty$, the modes become more and more finely separated in $k$. In
the limit, we are then interested in the variation of $C$ as a function of the continuous variable $k$.
The factor $1/\ell$ outside the integral looks problematic for taking the limit $\ell \to \infty$, but this can be
evaded by defining a new quantity:
$$ \tilde f(k) \equiv \ell \times C(k) = \int_{-\infty}^{\infty} dx\, f(x)\, e^{-ikx}. \tag{3.3} $$
The function $\tilde f(k)$ (officially called 'f tilde', but more commonly 'f twiddle'; $f_k$ is another common
notation) is the Fourier transform of the non-periodic function $f$.
To complete the story, we need the inverse Fourier transform: this gives us back the function $f(x)$
if we know $\tilde f$. Here, we just need to rewrite the Fourier series, remembering the mode spacing
$\delta k = 2\pi/\ell$:
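$$ f(x) = \sum_{k_n} C(k)\, e^{ikx} = \sum_{k_n} (\ell/2\pi)\, C(k)\, e^{ikx}\, \delta k = \frac{1}{2\pi} \sum_{k_n} \tilde f(k)\, e^{ikx}\, \delta k. \tag{3.4} $$
In this limit, the final form of the sum becomes an integral over $k$:
$$ \sum g(k)\, \delta k \to \int g(k)\, dk \quad \text{as } \delta k \to 0; \tag{3.5} $$
this is how integration gets defined in the first place. We can now write an equation for $f(x)$ in
which $\ell$ does not appear:
$$ f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} dk\, \tilde f(k)\, e^{ikx}. \tag{3.6} $$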
Note the infinite range of integration in k: this was already present in the Fourier series, where the
mode number n had no limit.
EXAM TIP: You may be asked to explain how the FT is the limit of a Fourier Series (for perhaps
6 or 7 marks), so make sure you can reproduce the stuff in this section.
The density of states. In the above, our sum was over individual Fourier modes. But if $C(k)$ is
a continuous function of $k$, we may as well add modes in bunches over some bin in $k$, of size $\Delta k$:
$$ f(x) = \sum_{k\ \mathrm{bin}} C(k)\, e^{ikx}\, N_{\rm bin}, \tag{3.7} $$
where $N_{\rm bin}$ is the number of modes in the bin. What is this? It is just $\Delta k$ divided by the mode
spacing, $2\pi/\ell$, so we have
$$ f(x) = \frac{\ell}{2\pi} \sum_{k\ \mathrm{bin}} C(k)\, e^{ikx}\, \Delta k. \tag{3.8} $$
The term $\ell/2\pi$ is the density of states: it tells us how many modes exist in unit range of $k$. This is
a widely used concept in many areas of physics, especially in thermodynamics. Once again, we can
take the limit of $\Delta k \to 0$ and obtain the integral for the inverse Fourier transform.
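Summary. A function $f(x)$ and its Fourier transform $\tilde f(k)$ are therefore related by:
$$ f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} dk\, \tilde f(k)\, e^{ikx}; \tag{3.9} $$
$$ \tilde f(k) = \int_{-\infty}^{\infty} dx\, f(x)\, e^{-ikx}. \tag{3.10} $$
We say that $\tilde f(k)$ is the FT of $f(x)$, and that $f(x)$ is the inverse FT of $\tilde f(k)$.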
EXAM TIP: If you are asked to state the relation between a function and its Fourier transform
(for maybe 3 or 4 marks), it is sufficient to quote these two equations. If the full derivation is
required, the question will ask explicitly for it.
Note that, since the Fourier Transform is a linear operation,
$$ FT[f(x) + g(x)] = \tilde f(k) + \tilde g(k). \tag{3.11} $$
For a real function $f(x)$, its FT satisfies the same Hermitian relation that we saw in the case of
Fourier series:
$$ \tilde f(-k) = \tilde f^*(k) \tag{3.12} $$
Exercise: prove this.
FT conventions. Eqns. (3.10) and (3.9) are the definitions we will use for FTs throughout this
course. Unfortunately, there are many different conventions in active use for FTs. Aside from using
different symbols, these can differ in:
The sign in the exponent
The placing of the $2\pi$ prefactor(s) (and sometimes it is $\sqrt{2\pi}$)
Whether there is a factor of $2\pi$ in the exponent
The bad news is that you will probably come across all of these different conventions. The good
news is that it is relatively easy to convert between them if you need to. The best news is that
you will almost never need to do this conversion.

k space and momentum space. The Fourier convention presented here is the natural one that
emerges as the limit of the Fourier series. But it has the disadvantage that it treats the Fourier
transform and the inverse Fourier transform differently by a factor of $2\pi$, whereas in physics we need
to learn to treat the functions $f(x)$ and $\tilde f(k)$ as equally valid forms of the same thing: the 'real-space'
and 'k-space' forms. This is most obvious in quantum mechanics, where a wave function $\exp(ikx)$
represents a particle with a well-defined momentum, $p = \hbar k$, according to de Broglie's hypothesis.
Thus the description of a function in terms of $\tilde f(k)$ is often called the 'momentum-space' version.
The result that illustrates this even-handed approach most clearly is to realise that the Fourier
transform of $f(x)$ can itself be transformed:
$$ \widetilde{\tilde f(k)}(K) = \int_{-\infty}^{\infty} dk\, \tilde f(k)\, e^{-iKk}. \tag{3.13} $$
We will show below that
$$ \widetilde{\tilde f(k)}(K) = 2\pi f(-K): \tag{3.14} $$
so in essence, repeating the Fourier transform gets you back the function you started with (up to a
factor of $2\pi$ and a reflection $K \to -K$). $f$ and $\tilde f$ are really just two sides of the same coin.

FOURIER ANALYSIS: LECTURE 8


3.2

Some simple examples of FTs

In this section we'll find the FTs of some simple functions.

EXAM TIP: You may be asked to define and sketch $f(x)$ in each case, and also to calculate and
sketch $\tilde f(k)$.
3.2.1

The top-hat

A top-hat function $\Pi(x)$ of height $h$ and width $2a$ ($a$ assumed positive), centred at $x = d$ is defined
by:
$$ \Pi(x) = \begin{cases} h, & \text{if } d - a < x < d + a, \\ 0, & \text{otherwise}. \end{cases} \tag{3.15} $$
The function is sketched in Fig. 3.2.
Its FT is:
$$ \tilde f(k) = \int_{-\infty}^{\infty} dx\, \Pi(x)\, e^{-ikx} = h \int_{d-a}^{d+a} dx\, e^{-ikx} = 2ah\, e^{-ikd}\, \mathrm{sinc}(ka) \tag{3.16} $$
The derivation is given below. The function $\mathrm{sinc}\, x \equiv \sin x / x$ is sketched in Fig. 3.3 (with notes on
how to do this also given below). $\tilde f(k)$ will look the same (for $d = 0$), but the nodes will now be at
$k = n\pi/a$ and the intercept will be $2ah$ rather than 1. You are very unlikely to have to sketch $\tilde f(k)$
for $d \ne 0$.

EXAM TIPS: If the question sets $d = 0$, clearly there is no need to do a variable change from
$x$ to $u$.
Sometimes the question specifies that the top-hat should have unit area, i.e. $h \times (2a) = 1$, so you
can replace $h$.
The width of the top-hat won't necessarily be $2a$. . .
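Deriving the FT:
$$ \tilde f(k) = \int_{-\infty}^{\infty} dx\, \Pi(x)\, e^{-ikx} = h \int_{d-a}^{d+a} dx\, e^{-ikx} \tag{3.17} $$
Now we make a substitution $u = x - d$ (which now centres the top-hat at $u = 0$). The integrand
$e^{-ikx}$ becomes $e^{-ik(u+d)} = e^{-iku} \times e^{-ikd}$. We can pull the factor $e^{-ikd}$ outside the integral because
it does not depend on $u$. The integration limits become $u = \pm a$. There is no scale factor, i.e.
$du = dx$.
This gives
$$ \tilde f(k) = h e^{-ikd} \int_{-a}^{a} du\, e^{-iku} = h e^{-ikd} \left[ \frac{e^{-iku}}{-ik} \right]_{-a}^{a} = h e^{-ikd}\, \frac{e^{-ika} - e^{ika}}{-ik} $$
$$ = h e^{-ikd} \times \frac{2a}{ka} \times \frac{e^{ika} - e^{-ika}}{2i} = 2ah\, e^{-ikd}\, \frac{\sin(ka)}{ka} = 2ah\, e^{-ikd}\, \mathrm{sinc}(ka) \tag{3.18} $$
Note that we conveniently multiplied top and bottom by $2a$ midway through.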
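The closed form in Eqn. (3.18) can be compared against a brute-force numerical integral. This is only a sketch: the parameter values and function names below are illustrative assumptions, and the integral is done with a simple midpoint rule.

```python
import cmath
import math

def sinc(x):
    """sin(x)/x, with the limiting value 1 at x = 0."""
    return 1.0 if x == 0 else math.sin(x) / x

def tophat_ft_numeric(k, h, a, d, samples=4000):
    """Midpoint-rule estimate of h * int exp(-ikx) dx over d-a < x < d+a."""
    dx = 2 * a / samples
    return sum(h * cmath.exp(-1j * k * (d - a + (j + 0.5) * dx)) * dx
               for j in range(samples))

def tophat_ft_exact(k, h, a, d):
    """Closed form 2 a h exp(-ikd) sinc(ka)."""
    return 2 * a * h * cmath.exp(-1j * k * d) * sinc(k * a)

h, a, d = 2.0, 0.5, 1.5  # assumed example values
for k in (0.0, 1.3, -4.0):
    print(k, tophat_ft_numeric(k, h, a, d), tophat_ft_exact(k, h, a, d))
```

Note that for $d \ne 0$ the transform is complex: the shift only introduces the phase factor $e^{-ikd}$, while the modulus is the same sinc shape.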


Sketching sinc x: You should think of $\mathrm{sinc}\, x \equiv \sin x / x$ as a $\sin x$ oscillation (with nodes at $x = n\pi$
for non-zero integer $n$), but with the amplitude of the oscillations dying off as $1/x$. Note that sinc x is an
even function, so it is symmetric when we reflect about the y-axis.
The only complication is at $x = 0$, when $\mathrm{sinc}\, 0 = 0/0$, which appears undefined. To deal with this,
expand $\sin x = x - x^3/3! + x^5/5! + \ldots$, so it is obvious that $\sin x / x \to 1$ as $x \to 0$.
EXAM TIP: Make sure you can sketch this, and that you label all the zeros (nodes) and
intercepts.

Figure 3.2: Sketch of top-hat function defined in Eqn. (3.15)



3.2.2

The Gaussian

The Gaussian curve is also known as the bell-shaped or normal curve. A Gaussian of width $\sigma$
centred at $x = d$ is defined by:
$$ f(x) = N \exp\left( -\frac{(x - d)^2}{2\sigma^2} \right) \tag{3.19} $$
where $N$ is a normalization constant, which is often set to 1. We can instead define the normalized
Gaussian, where we choose $N$ so that the area under the curve is unity, i.e. $N = 1/\sqrt{2\pi\sigma^2}$.
This normalization can be proved by a neat trick, which is to extend to a two-dimensional Gaussian
for two independent (zero-mean) variables $x$ and $y$, by multiplying the two independent Gaussian
functions:
$$ p(x, y) = \frac{1}{2\pi\sigma^2}\, e^{-(x^2 + y^2)/2\sigma^2}. \tag{3.20} $$
The integral over both variables can now be rewritten using polar coordinates:
$$ \iint p(x, y)\, dx\, dy = \int p(x, y)\, 2\pi r\, dr = \frac{1}{2\pi\sigma^2} \int 2\pi r\, e^{-r^2/2\sigma^2}\, dr \tag{3.21} $$
and the final expression clearly integrates to
$$ P(r > R) = \exp\left( -R^2/2\sigma^2 \right), \tag{3.22} $$
which is unity for $R = 0$, so the distribution is indeed correctly normalized.


The Gaussian is sketched for $d = 0$ and two different values of the width parameter $\sigma$. Fig. 3.4 has
$N = 1$ in each case, whereas Fig. 3.5 shows normalized curves. Note the difference, particularly in
the intercepts.
For $d = 0$, the FT of the Gaussian is
$$ \tilde f(k) = \int_{-\infty}^{\infty} dx\, N \exp\left( -\frac{x^2}{2\sigma^2} \right) e^{-ikx} = \sqrt{2\pi}\, N \sigma \exp\left( -\frac{k^2 \sigma^2}{2} \right), \tag{3.23} $$
i.e. the FT of a Gaussian is another Gaussian (this time as a function of $k$).

Figure 3.3: Sketch of $\mathrm{sinc}\, x \equiv \sin x / x$
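Deriving the FT. For notational convenience, let's write $a = \frac{1}{2\sigma^2}$, so
$$ \tilde f(k) = N \int_{-\infty}^{\infty} dx\, \exp\left( -\left[ a x^2 + ikx \right] \right) \tag{3.24} $$
Now we can complete the square inside $[\ldots]$:
$$ a x^2 + ikx = a \left( x + \frac{ik}{2a} \right)^2 + \frac{k^2}{4a} \tag{3.25} $$
giving
$$ \tilde f(k) = N e^{-k^2/4a} \int_{-\infty}^{\infty} dx\, \exp\left( -a \left( x + \frac{ik}{2a} \right)^2 \right). \tag{3.26} $$
We then make a change of variables:
$$ u = \sqrt{a} \left( x + \frac{ik}{2a} \right). \tag{3.27} $$
This does not change the limits on the integral, and the scale factor is $dx = du/\sqrt{a}$, giving
$$ \tilde f(k) = \frac{N}{\sqrt{a}}\, e^{-k^2/4a} \int_{-\infty}^{\infty} du\, e^{-u^2} = N \sqrt{\frac{\pi}{a}}\, e^{-k^2/4a} = \sqrt{2\pi}\, N \sigma\, e^{-k^2 \sigma^2/2}, \tag{3.28} $$
where we changed back from $a$ to $\sigma$. To get this result, we have used the standard result
$$ \int_{-\infty}^{\infty} du\, e^{-u^2} = \sqrt{\pi}. \tag{3.29} $$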
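A direct numerical integration gives a quick sanity check on Eqn. (3.23). The sketch below assumes $d = 0$, truncates the integral at $|x| = 10\sigma$ (where the integrand is negligible) and uses a midpoint rule; the function names and parameter values are ours.

```python
import cmath
import math

def gaussian_ft_numeric(k, sigma, norm=1.0, half_range=10.0, samples=4000):
    """Midpoint-rule estimate of int norm*exp(-x^2/(2 sigma^2)) exp(-ikx) dx."""
    xmax = half_range * sigma
    dx = 2 * xmax / samples
    total = 0j
    for j in range(samples):
        x = -xmax + (j + 0.5) * dx
        total += norm * math.exp(-x * x / (2 * sigma ** 2)) * cmath.exp(-1j * k * x) * dx
    return total

def gaussian_ft_exact(k, sigma, norm=1.0):
    """Closed form sqrt(2 pi) * norm * sigma * exp(-k^2 sigma^2 / 2)."""
    return math.sqrt(2 * math.pi) * norm * sigma * math.exp(-k * k * sigma ** 2 / 2)

for sigma in (0.5, 2.0):
    for k in (0.0, 1.0, 2.0):
        print(sigma, k, gaussian_ft_numeric(k, sigma).real, gaussian_ft_exact(k, sigma))
```

Comparing the two values of `sigma` shows the reciprocal behaviour directly: the wider Gaussian in $x$ has a transform that falls off much faster in $k$.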

3.3

Reciprocal relations between a function and its FT

These examples illustrate a general and very important property of FTs: there is a reciprocal (i.e.
inverse) relationship between the width of a function and the width of its Fourier transform. That
is, narrow functions have wide FTs and wide functions have narrow FTs.
This important property goes by various names in various physical contexts, e.g.:

Heisenberg Uncertainty Principle: the rms uncertainty in position space ($\Delta x$) and the rms
uncertainty in momentum space ($\Delta p$) are inversely related: $(\Delta x)(\Delta p) \ge \hbar/2$. The equality
holds for the Gaussian case (see below).
Bandwidth theorem: to create a very short-lived pulse (small $\Delta t$), you need to include a very
wide range of frequencies (large $\Delta\omega$).
In optics, this means that big objects (big relative to wavelength of light) cast sharp shadows
(narrow FT implies closely spaced maxima and minima in the interference fringes).

Figure 3.4: Sketch of Gaussians with $N = 1$

We discuss two explicit examples in the following subsections:
3.3.1

The top-hat

The width of the top-hat as defined in Eqn. (3.15) is obviously 2a.


For the FT, whilst the $\mathrm{sinc}\, ka$ function extends across all $k$, it dies away in amplitude, so it does
have a width. Exactly how we define the width does not matter; let's say it is the distance between
the first nodes $k = \pm\pi/a$ in each direction, giving a width of $2\pi/a$.
Thus the width of the function is proportional to a, and the width of the FT is proportional to 1/a.
Note that this will be true for any reasonable definition of the width of the FT.
3.3.2

The Gaussian

Again, the Gaussian extends infinitely but dies away, so we can define a width. For a Gaussian,
it is easy to do this rigorously in terms of the standard deviation (square root of the average of
$(x - d)^2$), which is just $\sigma$ (check you can prove this).
Comparing the form of FT in Eqn. (3.23) to the original definition of the Gaussian in Eqn. (3.19), if
the width of $f(x)$ is $\sigma$, the width of $\tilde f(k)$ is $1/\sigma$ by the same definition. Again, we have a reciprocal
relationship between the width of the function and that of its FT. Since $p = \hbar k$, the width in
momentum space is $\hbar$ times that in $k$ space.
The only subtlety in relating this to the uncertainty principle is that the probability distributions
use $|\psi|^2$, not $|\psi|$. If the width of $\psi(x)$ is $\sigma$, then the width of $|\psi|^2$ is $\sigma/\sqrt{2}$. Similarly, the uncertainty
in $k$ is $(1/\sigma)/\sqrt{2}$, so that in momentum is $\hbar/(\sigma\sqrt{2})$, which gives the extra factor $1/2$ in
$(\Delta x)(\Delta p) = \hbar/2$.

Figure 3.5: Sketch of normalized Gaussians. The intercepts are $f(0) = 1/\sqrt{2\pi\sigma^2}$.

Figure 3.6: The Fourier expansion of the function $f(x) = 1/(4|x|^{1/2})$, $|x| < 1$ is shown in the LH
panel (a cosine series, up to $n = 15$). The RH panel compares $df/dx$ with the sum of the derivative
of the Fourier series. The mild divergence in $f$ means that the expansion converges; but for $df/dx$
it does not.

3.4

Differentiating and integrating Fourier series

Once we have a function expressed as a Fourier series, this can be a useful alternative way of
carrying out calculus-related operations. This is because differentiation and integration are linear
operations that are distributive over addition: this means that we can carry out differentiation or
integration term-by-term in the series:
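$$ f(x) = \sum_{n=-\infty}^{\infty} C_n\, e^{i k_n x} \tag{3.30} $$
$$ \frac{df}{dx} = \sum_{n=-\infty}^{\infty} C_n\, (i k_n)\, e^{i k_n x} \tag{3.31} $$
$$ \int f\, dx = \sum_{n=-\infty}^{\infty} C_n\, (i k_n)^{-1}\, e^{i k_n x} + \mathrm{const}. \tag{3.32} $$
The only complication arises in the case of integration, if $C_0 \ne 0$: then the constant term integrates
to be $\propto x$, and this needs to be handled separately.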
From these relations, we can see immediately that the Fourier coefficients of a function and its
derivative are very simply related by powers of $k$: if the $n$th Fourier coefficient of $f(x)$ is $C_n$, the
$n$th Fourier coefficient of $df(x)/dx$ is $(i k_n) C_n$. Taking the limit of a non-periodic function, the same
relation clearly applies to Fourier Transforms. Thus, in general, multiple derivatives transform as:
$$ FT\left[ f^{(p)}(x) \right] = FT\left[ \frac{d^p f}{dx^p} \right] = (ik)^p\, \tilde f(k) \tag{3.33} $$

The main caveat with all this is that we still require that all the quantities being considered must
be suitable for a Fourier representation, and this may not be so. For example, $f(x) = 1/\sqrt{x}$ for
$0 < x < 1$ is an acceptable function: it has a singularity at $x = 0$, but this is integrable, so all the
Fourier coefficients converge. But $f'(x) = -x^{-3/2}/2$, which has a divergent integral over $0 < x < 1$.
Attempts to use a Fourier representation for $f'(x)$ would come adrift in this case, as is illustrated
in Fig. 3.6.
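For a well-behaved function, the derivative rule of Eqn. (3.33) with $p = 1$ can be verified numerically. The sketch below uses the Gaussian $f(x) = e^{-x^2/2}$ as an assumed test function and a midpoint-rule Fourier integral; the helper `fourier_transform` is ours.

```python
import cmath
import math

def fourier_transform(func, k, xmax=12.0, samples=6000):
    """Midpoint-rule estimate of int func(x) exp(-ikx) dx over |x| < xmax."""
    dx = 2 * xmax / samples
    total = 0j
    for j in range(samples):
        x = -xmax + (j + 0.5) * dx
        total += func(x) * cmath.exp(-1j * k * x) * dx
    return total

def f(x):
    return math.exp(-x * x / 2)        # assumed test function

def df(x):
    return -x * math.exp(-x * x / 2)   # its derivative, done by hand

for k in (0.5, 1.7, 3.0):
    lhs = fourier_transform(df, k)          # FT of the derivative
    rhs = 1j * k * fourier_transform(f, k)  # (ik) times the FT of the function
    print(k, lhs, rhs)
```

Both columns should agree to high accuracy; the result is purely imaginary here because $f'$ is an odd function.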

FOURIER ANALYSIS: LECTURE 9

The Dirac delta function

The Dirac delta function is a very useful tool in physics, as it can be used to represent a very
localised or instantaneous impulse whose effect then propagates. Informally, it is to be thought of
as an infinitely narrow (and infinitely tall) spike. Mathematicians think it's not a proper function,
since a function is a machine, $f(x)$, that takes any number $x$ and replaces it with a well-defined
number $f(x)$. Dirac didn't care, and used it anyway. Eventually, the 'theory of distributions' was
invented to say he was right to follow his intuition.

4.1

Definition and basic properties

The Dirac delta function $\delta(x - d)$ is defined by two expressions. First, it is zero everywhere except
at the point $x = d$ where it is infinite:
$$ \delta(x - d) = \begin{cases} 0 & \text{for } x \ne d, \\ \infty & \text{for } x = d. \end{cases} \tag{4.34} $$
Secondly, it tends to infinity at $x = d$ in such a way that the area under the Dirac delta function is
unity:
$$ \int_{-\infty}^{\infty} dx\, \delta(x - d) = 1. \tag{4.35} $$

4.1.1

The delta function as a limiting case

To see how a spike of zero width can have a well-defined area, it is helpful (although not strictly
necessary) to think of the delta function as the limit of a more familiar function. The exact shape
of this function doesn't matter, except that it should look more and more like a (normalized) spike
as we make it narrower.
Two possibilities are the top-hat as the width $a \to 0$ (normalized so that $h = 1/(2a)$), or the
normalized Gaussian as $\sigma \to 0$. We'll use the top-hat here, just because the integrals are easier.
Let $\Pi_a(x)$ be a normalized top-hat of width $2a$ centred at $x = 0$ as in Eqn. (3.15); we've made
the width parameter obvious by putting it as a subscript here. The Dirac delta function can then
be defined as
$$ \delta(x) = \lim_{a \to 0} \Pi_a(x). \tag{4.36} $$

EXAM TIP: When asked to define the Dirac delta function, make sure you write both Eqns. (4.34)
and (4.35).
4.1.2

Sifting property

The sifting property of the Dirac delta function is that, given some function $f(x)$:
$$ \int_{-\infty}^{\infty} dx\, \delta(x - d)\, f(x) = f(d) \tag{4.37} $$

i.e. the delta function picks out the value of the function at the position of the spike (so long as it
is within the integration range). This is just like the sifting property of the Kronecker delta inside
a discrete sum.
EXAM TIP: If you are asked to state the sifting property, it is sufficient to write Eqn. (4.37).
You do not need to prove the result as in Sec. 4.1.5 unless specifically asked to.
Technical aside: The integration limits don't technically need to be infinite in the above formulae.
If we integrate over a finite range $a < x < b$ the expressions become:
$$ \int_a^b dx\, \delta(x - d) = \begin{cases} 1 & \text{for } a < d < b, \\ 0 & \text{otherwise.} \end{cases} \tag{4.38} $$
$$ \int_a^b dx\, \delta(x - d)\, f(x) = \begin{cases} f(d) & \text{for } a < d < b, \\ 0 & \text{otherwise.} \end{cases} \tag{4.39} $$
That is, we get the above results if the position of the spike is inside the integration range, and zero
otherwise.
4.1.3

Compare with the Kronecker delta

The Kronecker delta
$$ \delta_{mn} = \begin{cases} 1 & m = n \\ 0 & m \ne n \end{cases} \tag{4.40} $$
plays a similar sifting role for discrete modes, as the Dirac delta does for continuous modes. For
example:
$$ \sum_{n=1}^{\infty} A_n\, \delta_{mn} = A_m \tag{4.41} $$

which is obvious when you look at it. Be prepared to do this whenever you see a sum with a
Kronecker delta in it.
4.1.4

Delta function of a more complicated argument

Sometimes you may come across the Dirac delta function of a more complicated argument, $\delta[f(x)]$,
e.g. $\delta(x^2 - 4)$. How do we deal with these? Essentially we use the definition that the delta function
integrates to unity when it is integrated with respect to its argument, i.e.
$$ \int_{-\infty}^{\infty} \delta[f(x)]\, df = 1. \tag{4.42} $$
Changing variables from $f$ to $x$,
$$ \int \delta[f(x)]\, \frac{df}{dx}\, dx = 1 \tag{4.43} $$
where we have not put the limits on $x$, as they depend on $f(x)$. Comparing with one of the
properties of $\delta(x)$, we find that
$$ \delta[f(x)] = \frac{\delta(x - x_0)}{|df/dx|_{x=x_0}} \tag{4.44} $$
where the derivative is evaluated at the point $x = x_0$ where $f(x_0) = 0$. Note that if there is more
than one solution ($x_i$; $i = 1, \ldots$) to $f = 0$, then $\delta(f)$ is a sum
$$ \delta[f(x)] = \sum_i \frac{\delta(x - x_i)}{|df/dx|_{x=x_i}}. \tag{4.45} $$
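As an illustration of Eqn. (4.45), take the example $\delta(x^2 - 4)$ mentioned above: the roots are $x = \pm 2$ with $|df/dx| = 4$ at each, so integrating against a test function $g(x)$ should give $[g(2) + g(-2)]/4$. The sketch below checks this numerically by standing in a narrow unit-area Gaussian of width `eps` for the delta function; that stand-in, the test function and the names used are our assumptions.

```python
import math

def delta_eps(u, eps):
    """Narrow unit-area Gaussian of width eps, used as a stand-in for delta(u)."""
    return math.exp(-u * u / (2 * eps * eps)) / (eps * math.sqrt(2 * math.pi))

def integrate_against_delta(g, eps=0.01, xmin=-5.0, xmax=5.0, samples=200000):
    """Midpoint-rule estimate of int g(x) delta_eps(x^2 - 4) dx."""
    dx = (xmax - xmin) / samples
    total = 0.0
    for j in range(samples):
        x = xmin + (j + 0.5) * dx
        total += g(x) * delta_eps(x * x - 4.0, eps) * dx
    return total

g = math.exp                              # assumed test function
expected = (g(2.0) + g(-2.0)) / 4.0       # sum over roots of g(x_i) / |2 x_i|
print(integrate_against_delta(g), expected)
```

The grid spacing has to be much finer than `eps / 4`, the effective width of each spike in $x$, which is why so many sample points are used.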

4.1.5

Proving the sifting property

We can use the limit definition of the Dirac delta function [Eqn. (4.36)] to prove the sifting property
given in Eqn. (4.37):
$$ \int_{-\infty}^{\infty} dx\, f(x)\, \delta(x) = \int_{-\infty}^{\infty} dx\, f(x) \lim_{a \to 0} \Pi_a(x) = \lim_{a \to 0} \int_{-\infty}^{\infty} dx\, f(x)\, \Pi_a(x). \tag{4.46} $$
We are free to pull the limit outside the integral because nothing else depends on $a$. Substituting
for $\Pi_a(x)$, the integral is only non-zero between $-a$ and $a$. Similarly, we can pull the normalization
factor out to the front:
$$ \int_{-\infty}^{\infty} dx\, f(x)\, \delta(x) = \lim_{a \to 0} \frac{1}{2a} \int_{-a}^{a} dx\, f(x). \tag{4.47} $$
What this is doing is averaging $f$ over a narrow range of width $2a$ around $x = 0$. Provided the
function is continuous, this will converge to a well-defined value $f(0)$ as $a \to 0$ (this is pretty well
the definition of continuity).
Alternatively, suppose the function was differentiable at $x = 0$ (which not all continuous functions
will be: e.g. $f(x) = |x|$). Then we can Taylor expand the function around $x = 0$ (i.e. the position
of the centre of the spike):
$$ \int_{-\infty}^{\infty} dx\, f(x)\, \delta(x) = \lim_{a \to 0} \frac{1}{2a} \int_{-a}^{a} dx \left[ f(0) + x f'(0) + \frac{x^2}{2!} f''(0) + \ldots \right]. \tag{4.48} $$
The advantage of this is that all the $f^{(n)}(0)$ are constants, which makes the integral easy:
$$ \int_{-\infty}^{\infty} dx\, f(x)\, \delta(x) = \lim_{a \to 0} \frac{1}{2a} \left( f(0) \left[ x \right]_{-a}^{a} + f'(0) \left[ \frac{x^2}{2} \right]_{-a}^{a} + \frac{f''(0)}{2!} \left[ \frac{x^3}{3} \right]_{-a}^{a} + \ldots \right) \tag{4.49} $$
$$ = \lim_{a \to 0} \left( f(0) + \frac{a^2}{6} f''(0) + \ldots \right) \tag{4.50} $$
$$ = f(0), \tag{4.51} $$
which gives the sifting result.


EXAM TIP: An exam question may ask you to derive the sifting property in this way. Make
sure you can do it.
Note that the odd terms vanished after integration. This is special to the case of the spike being
centred at x = 0. It is a useful exercise to see what happens if the spike is centred at x = d instead.
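The limiting procedure can also be followed numerically. The following is a minimal sketch in which a unit-area top-hat of half-width $a$ centred at $x = d$ is integrated against a test function; we use $\cos x$ as an assumed example, for which the leading correction $a^2 f''(0)/6$ in Eqn. (4.50) is $-a^2/6$.

```python
import math

def tophat_sift(f, d, a, samples=2000):
    """(1/2a) * int f(x) dx over d-a < x < d+a, by the midpoint rule.

    This is the integral of f against a unit-area top-hat centred at x = d.
    """
    dx = 2 * a / samples
    return sum(f(d - a + (j + 0.5) * dx) for j in range(samples)) * dx / (2 * a)

for a in (1.0, 0.1, 0.01):
    estimate = tophat_sift(math.cos, 0.0, a)
    # columns: a, estimate, deviation from f(0), predicted leading deviation
    print(a, estimate, estimate - math.cos(0.0), -a * a / 6)
```

As $a$ shrinks, the estimate approaches $f(0) = 1$ and the deviation tracks the $a^2/6$ term. Changing the second argument to a non-zero `d` is one way to explore the exercise suggested above.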

4.1.6

Some other properties of the delta function

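These include:
$$ \delta(-x) = \delta(x) $$
$$ x\, \delta(x) = 0 $$
$$ \delta(ax) = \frac{\delta(x)}{|a|} $$
$$ \delta(x^2 - a^2) = \frac{\delta(x - a) + \delta(x + a)}{2|a|} \tag{4.52} $$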

The proofs are left as exercises.


4.1.7

Calculus with the delta function

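The $\delta$-function is easily integrated:
$$ \int_{-\infty}^{x} dy\, \delta(y - d) = \Theta(x - d), \tag{4.53} $$
where
$$ \Theta(x - d) = \begin{cases} 0 & x < d \\ 1 & x \ge d \end{cases} \tag{4.54} $$
which is called the Heaviside function, or just the 'step function'.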


The derivative can also be written down, realising that the delta-function must obey the relation
$f(x)\delta(x) = f(0)\delta(x)$, and applying the product rule:
$$ f(x)\, d\delta(x)/dx = -f'(x)\, \delta(x) + f(0)\, d\delta(x)/dx. \tag{4.55} $$
Integrating this over an infinite range, the first term on the RHS gives $-f'(0)$, using the sifting
property; the second term gives zero, since $\delta(x) = 0$ at either end of the interval. Thus the derivative
of the delta-function sifts for (minus) the derivative of the function:
$$ \int_{-\infty}^{\infty} f(x)\, [d\delta(x)/dx]\, dx = -f'(0), \tag{4.56} $$
which could alternatively be proved by applying integration by parts.
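The sign in Eqn. (4.56) is easy to get wrong, so a numerical check is worthwhile. In this sketch the delta function is replaced by a narrow unit-area Gaussian of width `sigma` (our assumption, as in the limiting-case discussion above), whose derivative is known analytically; the test function is an arbitrary choice with $f'(0) = 3$.

```python
import math

def delta_prime(x, sigma):
    """Derivative of a unit-area Gaussian of width sigma (stand-in for d(delta)/dx)."""
    gauss = math.exp(-x * x / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))
    return -x / sigma ** 2 * gauss

def sift_derivative(f, sigma, samples=4000):
    """Midpoint-rule estimate of int f(x) delta_prime(x) dx over |x| < 10 sigma."""
    xmax = 10 * sigma
    dx = 2 * xmax / samples
    total = 0.0
    for j in range(samples):
        x = -xmax + (j + 0.5) * dx
        total += f(x) * delta_prime(x, sigma) * dx
    return total

def f(x):
    return math.sin(3 * x) + x * x   # assumed test function, f'(0) = 3

for sigma in (0.3, 0.1, 0.03):
    print(sigma, sift_derivative(f, sigma))  # tends to -f'(0) = -3
```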


4.1.8

More than one dimension

Finally, in some situations (e.g. a point charge at $\mathbf{r} = \mathbf{r}_0$), we might need a 3D Dirac delta function,
which we can write as a product of three 1D delta functions:
$$ \delta(\mathbf{r} - \mathbf{r}_0) = \delta(x - x_0)\, \delta(y - y_0)\, \delta(z - z_0) \tag{4.57} $$
where $\mathbf{r}_0 = (x_0, y_0, z_0)$. Note that $\delta(\mathbf{r} - \mathbf{a})$ is not the same as $\delta(r - a)$: the former picks out a
point at position $\mathbf{a}$, but the latter picks out a spherical shell of radius $a$. Suppose we had a spherically
symmetric function $f(r)$. The sifting property of the 3D function is
$$ \int f(r)\, \delta(\mathbf{r} - \mathbf{a})\, d^3x = f(\mathbf{a}) = f(a), \tag{4.58} $$
whereas
$$ \int f(r)\, \delta(r - a)\, d^3x = \int f(r)\, \delta(r - a)\, 4\pi r^2\, dr = 4\pi a^2 f(a). \tag{4.59} $$

4.1.9

Physical importance of the delta function

The $\delta$-function is a tool that arises a great deal in physics. There are a number of reasons for this.
One is that the classical world is made up out of discrete particles, even though we often treat
matter as having continuously varying properties such as density. Individual particles of zero size
have infinite density, and so are perfectly suited to be described by $\delta$-functions. We can therefore
write the density field produced by a set of particles at positions $x_i$ as
$$ \rho(x) = \sum_i M_i\, \delta(x - x_i). \tag{4.60} $$
This expression means we can treat all matter in terms of just the density as a function of position,
whether the matter is continuous or made of particles.
This decomposition makes us look in a new way at the sifting theorem:
$$ f(x) = \int f(q)\, \delta(x - q)\, dq. \tag{4.61} $$
The integral is the limit of a sum, so this actually says that the function $f(x)$ can be thought
of as made up by adding together infinitely many $\delta$-function spikes: think of a function as a
mathematical hedgehog. This turns out to be an incredibly useful viewpoint when solving linear
differential equations: the response of a given system to an applied force $f$ can be calculated if we
know how the system responds to a single spike. This response is called a Green's function, and
will be a major topic later in the course.

4.2

FT and integral representation of $\delta(x)$

The Dirac delta function is very useful when we are doing FTs. The FT of the delta function follows
easily from the sifting property:
$$ \tilde f(k) = \int_{-\infty}^{\infty} dx\, \delta(x - d)\, e^{-ikx} = e^{-ikd}. \tag{4.62} $$
In the special case $d = 0$, we get simply $\tilde f(k) = 1$.
The inverse FT gives us the integral representation of the delta function:
$$ \delta(x - d) = \frac{1}{2\pi} \int_{-\infty}^{\infty} dk\, \tilde f(k)\, e^{ikx} = \frac{1}{2\pi} \int_{-\infty}^{\infty} dk\, e^{-ikd}\, e^{ikx} \tag{4.63} $$
$$ = \frac{1}{2\pi} \int_{-\infty}^{\infty} dk\, e^{ik(x - d)}. \tag{4.64} $$
You ought to worry that it's entirely unobvious whether this integral converges, since the integrand
doesn't die off at $\pm\infty$. A safer approach is to define the $\delta$-function (say) in terms of a Gaussian of
width $\sigma$, where we know that the FT and inverse FT are well defined. Then we can take the limit
of $\sigma \to 0$.
In the same way that we have defined a delta function in $x$, we can also define a delta function
in $k$. This would, for instance, represent a signal composed of oscillations of a single frequency or
wavenumber $K$. Again, we can write it in integral form if we wish:
$$ \delta(k - K) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{i(k - K)x}\, dx. \tag{4.65} $$

This $k$-space delta function has exactly the same sifting properties when we integrate over $k$ as the
original version did when integrating over $x$.
Note that the sign of the exponent is irrelevant:
$$ \delta(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{\pm ikx}\, dk \tag{4.66} $$
which is easy to show by changing variable from $k$ to $-k$ (the limits swap, which cancels the sign
change $dk \to -dk$).
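One way to get a feel for the integral representation is to cut the $k$ integral off at $|k| < K$, which gives $(1/2\pi) \int_{-K}^{K} e^{ikx}\, dk = \sin(Kx)/(\pi x)$, and then watch its sifting behaviour as $K$ grows. This cut-off is our own regularisation for illustration, a different choice from the Gaussian limit suggested above. For the assumed test function $f(x) = e^{-x^2/2}$ the sifted value works out to $\mathrm{erf}(K/\sqrt{2})$, which tends to $f(0) = 1$.

```python
import math

def delta_K(x, K):
    """(1/2pi) * int exp(ikx) dk over -K < k < K, evaluated analytically."""
    return K / math.pi if x == 0 else math.sin(K * x) / (math.pi * x)

def sift(f, K, xmax=10.0, samples=20000):
    """Midpoint-rule estimate of int f(x) delta_K(x) dx over |x| < xmax."""
    dx = 2 * xmax / samples
    total = 0.0
    for j in range(samples):
        x = -xmax + (j + 0.5) * dx
        total += f(x) * delta_K(x, K) * dx
    return total

def f(x):
    return math.exp(-x * x / 2)   # assumed test function, f(0) = 1

for K in (1.0, 2.0, 5.0):
    print(K, sift(f, K), math.erf(K / math.sqrt(2)))
```

The truncated kernel oscillates and does not decay quickly in $x$, but its integral against a smooth function still converges to the sifted value.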

FOURIER ANALYSIS: LECTURE 10

Ordinary Differential Equations

We saw earlier that if we have a linear differential equation with a driving term on the right-hand
side which is a periodic function, then Fourier Series may be a useful method to solve it. If the
problem is not periodic, then Fourier Transforms may be able to solve it.

5.1

Solving Ordinary Differential Equations with Fourier transforms

The advantage of applying a FT to a differential equation is that we replace the differential equation
with an algebraic equation, which may be easier to solve. Let us illustrate the method with a couple
of examples.
5.1.1

Simple example

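The equation to be solved is
$$ \frac{d^2 z}{dt^2} - \omega_0^2 z = f(t). \tag{5.67} $$
Take the FT, which for $z$ is:
$$ \tilde z(\omega) = \int_{-\infty}^{\infty} z(t)\, e^{-i\omega t}\, dt \tag{5.68} $$
and noting that $d/dt \to i\omega$, so $d^2/dt^2 \to -\omega^2$, the equation becomes
$$ -\omega^2 \tilde z(\omega) - \omega_0^2 \tilde z(\omega) = \tilde f(\omega). \tag{5.69} $$
Rearranging,
$$ \tilde z(\omega) = \frac{-\tilde f(\omega)}{\omega_0^2 + \omega^2} \tag{5.70} $$
with a solution
$$ z(t) = -\frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{\tilde f(\omega)}{\omega_0^2 + \omega^2}\, e^{i\omega t}\, d\omega. \tag{5.71} $$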

What this says is that a single oscillating f (t), with amplitude a, will generate a response in phase
with the applied oscillation, but of amplitude a/(02 + 2 ). For the general case, we superimpose
oscillations of different frequency, which is what the inverse Fourier transform does for us.
Note that this gives a particular solution of the equation. Normally, we would argue that we can
also add a solution of the homogeneous equation (where the rhs is set to zero), which in this case
is $Ae^{\omega_0 t} + Be^{-\omega_0 t}$. Boundary conditions would determine what values the constants $A$ and $B$ take.
But when dealing with Fourier transforms, this step may not be appropriate. This is because the
Fourier transform describes non-periodic functions that stretch over an infinite range of time,
so the manipulations needed to impose a particular boundary condition amount to a particular
imposed force. If we believe that we have an expression for $f(t)$ that is valid for all times, then
boundary conditions can only be set at $t = -\infty$. Physically, we would normally lack any reason for
a displacement in this limit, so the homogeneous solution would tend to be ignored, even though
it should be included as a matter of mathematical principle.
We will come back to this problem later, when we can go further with the calculation (see Convolution, section 6).
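
As a rough numerical illustration of the Fourier-space solution, the sketch below divides the discrete transform of a driving term by $-(\omega_0^2 + \omega^2)$ and transforms back. It assumes NumPy, a Gaussian driving term and a periodic grid wide enough that wrap-around is negligible; the function name `solve_via_fft` is hypothetical.

```python
import numpy as np

def solve_via_fft(f_vals, dt, omega0):
    """Particular solution of z'' - omega0^2 z = f on a uniform, periodic time grid,
    using z~(omega) = -f~(omega) / (omega0^2 + omega^2)."""
    omega = 2.0 * np.pi * np.fft.fftfreq(f_vals.size, d=dt)  # angular frequencies
    z_tilde = -np.fft.fft(f_vals) / (omega0**2 + omega**2)
    return np.fft.ifft(z_tilde).real

omega0 = 1.0
t = np.linspace(-40.0, 40.0, 4096, endpoint=False)
dt = t[1] - t[0]
f = np.exp(-t**2 / 2.0)            # illustrative driving term
z = solve_via_fft(f, dt, omega0)

# Check the ODE with a centred second difference (the periodic wrap is harmless
# here because both f and z are negligible at the ends of the grid).
z_dd = (np.roll(z, -1) - 2.0 * z + np.roll(z, 1)) / dt**2
residual = np.max(np.abs(z_dd - omega0**2 * z - f))
print(residual)
```

The residual is limited by the finite-difference check rather than by the Fourier solution itself, and the response has the opposite sign to the positive driving term.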

5.2 LCR circuits

Let us look at a more complicated problem with an electrical circuit. LCR circuits consist of an
inductor of inductance L, a capacitor of capacitance C and a resistor of resistance R. If they are
in series, then in the simplest case of one of each in the circuit, the voltage across all three is the
sum of the voltages across each component. The voltage across R is IR, where I is the current;
across the inductor it is LdI/dt, and across the capacitor it is Q/C, where Q is the charge on the
capacitor:
$$V(t) = L\frac{dI}{dt} + RI + \frac{Q}{C}. \tag{5.72}$$
Now, since the rate of change of charge on the capacitor is simply the current, dQ/dt = I, we can
differentiate this equation, to get a second-order ODE for I:
$$L\frac{d^2 I}{dt^2} + R\frac{dI}{dt} + \frac{I}{C} = \frac{dV}{dt}. \tag{5.73}$$

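If we know how the applied voltage $V(t)$ varies with time, we can use Fourier methods to determine
$I(t)$. We write $\tilde I(\omega) = \int_{-\infty}^{\infty} I(t)\, e^{-i\omega t}\, dt$, and note that the FT of $dI/dt$ is $i\omega \tilde I(\omega)$,
and that of $d^2I/dt^2$ is $-\omega^2 \tilde I(\omega)$. Hence
$$-\omega^2 L \tilde I(\omega) + i\omega R \tilde I(\omega) + \frac{1}{C}\tilde I(\omega) = i\omega \tilde V(\omega), \tag{5.74}$$
where $\tilde V(\omega)$ is the FT of $V(t)$. Solving for $\tilde I(\omega)$:
$$\tilde I(\omega) = \frac{i\omega \tilde V(\omega)}{C^{-1} + i\omega R - \omega^2 L}, \tag{5.75}$$
and hence
$$I(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{i\omega \tilde V(\omega)}{C^{-1} + i\omega R - \omega^2 L}\, e^{i\omega t}\, d\omega. \tag{5.76}$$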

So we see that the individual Fourier components obey a form of Ohm's law, but involving a complex
impedance, $Z$:
$$\tilde V(\omega) = Z(\omega)\,\tilde I(\omega); \qquad Z = R + i\omega L - \frac{i}{\omega C}. \tag{5.77}$$

Figure 5.7: A simple series LCR circuit.


This is a very useful concept, as it immediately allows more complex circuits to be analysed, using
the standard rules for adding resistances in series or in parallel.
The frequency dependence of the impedance means that different kinds of LCR circuit can function
as filters of the time-dependent current passing through them: different Fourier components (i.e.
different frequencies) can be enhanced or suppressed. For example, consider a resistor and inductor
in series:
$$\tilde I(\omega) = \frac{\tilde V(\omega)}{R + i\omega L}. \tag{5.78}$$
For high frequencies, the current tends to zero; for $\omega \ll R/L$, the output of the circuit (current
over voltage) tends to the constant value $\tilde I(\omega)/\tilde V(\omega) = 1/R$. So this would be called a low-pass filter:
it only transmits low-frequency vibrations. Similarly, a resistor and capacitor in series gives
$$\tilde I(\omega) = \frac{\tilde V(\omega)}{R + (i\omega C)^{-1}}. \tag{5.79}$$

This acts as a high-pass filter, removing frequencies below about $(RC)^{-1}$. Note that the LR circuit
can also act in this way if we measure the voltage across the inductor, $V_L$, rather than the current
passing through it:
$$\tilde V_L(\omega) = i\omega L\, \tilde I(\omega) = i\omega L\, \frac{\tilde V(\omega)}{R + i\omega L} = \frac{\tilde V(\omega)}{1 + R(i\omega L)^{-1}}. \tag{5.80}$$
Finally, a full series LCR circuit is a band-pass filter, which removes frequencies below $(RC)^{-1}$ and
above $R/L$ from the current.
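
The filter behaviour can be seen by evaluating the complex responses directly. This is a minimal sketch assuming NumPy; the component values and the function names `response_lr`, `response_rc` and `impedance_lcr` are illustrative assumptions, not taken from the notes.

```python
import numpy as np

R, L, C = 1.0e3, 0.1, 1.0e-6   # ohms, henries, farads (illustrative values)

def response_lr(omega):
    """I~/V~ for a resistor and inductor in series, eq. (5.78)."""
    return 1.0 / (R + 1j * omega * L)

def response_rc(omega):
    """I~/V~ for a resistor and capacitor in series, eq. (5.79)."""
    return 1.0 / (R + 1.0 / (1j * omega * C))

def impedance_lcr(omega):
    """Series LCR impedance, eq. (5.77)."""
    return R + 1j * omega * L - 1j / (omega * C)

omega = np.logspace(0, 8, 9)          # angular frequencies in rad/s
print(np.abs(response_lr(omega)))     # flat at 1/R, then falls above R/L
print(np.abs(response_rc(omega)))     # rises, then flat at 1/R above 1/(RC)
omega_peak = 1.0 / np.sqrt(L * C)     # where the two reactive terms cancel
print(abs(impedance_lcr(omega_peak))) # impedance reduces to R here
```

With these values $R/L = 10^4$ rad/s and $(RC)^{-1} = 10^3$ rad/s, so the pass band of the full series circuit lies between the two.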

5.3 The forced damped Simple Harmonic Oscillator

The same mathematics arises in a completely different physical context: imagine we have a mass
m attached to a spring with a spring constant k, and which is also immersed in a viscous fluid that
exerts a resistive force proportional to the speed of the mass, with a constant of proportionality D.
Imagine further that the mass is driven by an external force f (t). The equation of motion for the
displacement z(t) is
$$m\ddot z = -kz - D\dot z + f(t). \tag{5.81}$$

This is the same equation as the LCR case, with


$$(z, f, m, k, D) \to (I, \dot V, L, C^{-1}, R). \tag{5.82}$$

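To solve the equation for $z(t)$, first define a characteristic frequency by $\omega_0^2 = k/m$, and let $\gamma = D/m$.
Then
$$\ddot z + \gamma \dot z + \omega_0^2 z = a(t), \tag{5.83}$$
where $a(t) = f(t)/m$. Now take the FT of the equation with respect to time, and note that the FT
of $\dot z(t)$ is $i\omega\, \tilde z(\omega)$, and the FT of $\ddot z(t)$ is $-\omega^2 \tilde z(\omega)$. Thus
$$-\omega^2 \tilde z(\omega) + i\omega\gamma\, \tilde z(\omega) + \omega_0^2 \tilde z(\omega) = \tilde a(\omega), \tag{5.84}$$
where $\tilde a(\omega)$ is the FT of $a(t)$. Hence
$$\tilde z(\omega) = \frac{\tilde a(\omega)}{-\omega^2 + i\gamma\omega + \omega_0^2}. \tag{5.85}$$

5.3.1 Explicit solution via inverse FT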

This solution in Fourier space is general and works for any time-dependent force. Once we have
a specific form for the force, we can in principle use the Fourier expression to obtain the exact
solution for $z(t)$. How useful this is in practice depends on how easy it is to do the integrals, first to
transform $a(t)$ into $\tilde a(\omega)$, and then the inverse transform to turn $\tilde z(\omega)$ into $z(t)$. For now, we shall
just illustrate the approach with a simple case.
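Consider therefore a driving force that can be written as a single complex exponential:
$$a(t) = A \exp(i\Omega t). \tag{5.86}$$
Fourier transforming, we get
$$\tilde a(\omega) = \int_{-\infty}^{\infty} A e^{i\Omega t} e^{-i\omega t}\, dt = 2\pi A\, \delta(\Omega - \omega) = 2\pi A\, \delta(\omega - \Omega). \tag{5.87}$$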

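Unsurprisingly, the result is a $\delta$-function spike at the driving frequency. Since we know that $\tilde z(\omega) =
\tilde a(\omega)/(-\omega^2 + i\gamma\omega + \omega_0^2)$, we can now use the inverse FT to compute $z(t)$:
$$\begin{aligned}
z(t) &= \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{\tilde a(\omega)}{-\omega^2 + i\gamma\omega + \omega_0^2}\, e^{i\omega t}\, d\omega \\
&= A \int_{-\infty}^{\infty} \frac{\delta(\omega - \Omega)}{-\omega^2 + i\gamma\omega + \omega_0^2}\, e^{i\omega t}\, d\omega \\
&= A\, \frac{e^{i\Omega t}}{-\Omega^2 + i\gamma\Omega + \omega_0^2}.
\end{aligned} \tag{5.88}$$
This is just the answer we would have obtained if we had taken the usual route of trying a solution
proportional to $\exp(i\Omega t)$, but the nice thing is that the inverse FT has produced this for us
automatically, without needing to guess.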

5.3.2 Resonance

The result can be made a bit more intuitive by splitting the various factors into amplitudes and
phases. Let $A = |A| \exp(i\phi)$ and $(-\Omega^2 + i\gamma\Omega + \omega_0^2) = a \exp(i\alpha)$, where
$$a = \sqrt{(\omega_0^2 - \Omega^2)^2 + \gamma^2 \Omega^2} \tag{5.89}$$
and
$$\tan\alpha = \gamma\Omega/(\omega_0^2 - \Omega^2). \tag{5.90}$$
Then we have simply
$$z(t) = \frac{|A|}{a} \exp[i(\Omega t + \phi - \alpha)], \tag{5.91}$$
so the dynamical system returns the input oscillation, modified in amplitude by the factor $1/a$ and
lagging in phase by $\alpha$. For small frequencies, this phase lag is very small; it becomes $\pi/2$ when
$\Omega = \omega_0$; for larger $\Omega$, the phase lag tends to $\pi$.

The amplitude of the oscillation is maximized when $a$ is a minimum. Differentiating, this is when
we reach the resonant frequency:
$$\Omega = \Omega_{\rm res} = \sqrt{\omega_0^2 - \gamma^2/2}, \tag{5.92}$$
i.e. close to the natural frequency of the oscillator when $\gamma$ is small. At this point, $a = (\gamma^2 \omega_0^2 -
\gamma^4/4)^{1/2}$, which is $\gamma\omega_0$ to leading order. From the structure of $a$, we can see that it changes by a large
factor when the frequency moves from resonance by an amount of order $\gamma$ (i.e. if $\gamma$ is small then
the width of the response is very narrow). To show this, argue that we want the term $(\omega_0^2 - \Omega^2)^2$,
which is negligible at resonance, to be equal to $\gamma^2 \omega_0^2$. Solving this gives
$$\Omega = (\omega_0^2 \pm \gamma\omega_0)^{1/2} = \omega_0 (1 \pm \gamma/\omega_0)^{1/2} \simeq \omega_0 \pm \gamma/2. \tag{5.93}$$
Thus we are mostly interested in frequencies that are close to $\omega_0$, and we can approximate $a$ by
$$a \simeq [(\Omega^2 - \omega_0^2)^2 + \gamma^2 \omega_0^2]^{1/2} \simeq [4\omega_0^2 (\Omega - \omega_0)^2 + \gamma^2 \omega_0^2]^{1/2}. \tag{5.94}$$
Thus, if we set $\Omega = \Omega_{\rm res} + \epsilon$, then
$$\frac{1}{a} \simeq \frac{(\gamma\omega_0)^{-1}}{(1 + 4\epsilon^2/\gamma^2)^{1/2}}, \tag{5.95}$$
which is a Lorentzian dependence of the square of the amplitude on frequency deviation from
resonance.
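
These resonance results can be checked numerically. The sketch below assumes NumPy and illustrative values $\omega_0 = 1$, $\gamma = 0.1$; the helper name `a_of` is hypothetical.

```python
import numpy as np

omega0, gamma = 1.0, 0.1               # illustrative values with gamma << omega0

def a_of(Omega):
    """Modulus a of (-Omega^2 + i*gamma*Omega + omega0^2), eq. (5.89)."""
    return np.abs(-Omega**2 + 1j * gamma * Omega + omega0**2)

Omega = np.linspace(0.5, 1.5, 200001)
Omega_peak = Omega[np.argmin(a_of(Omega))]         # numerical location of the peak response
Omega_res = np.sqrt(omega0**2 - gamma**2 / 2.0)    # eq. (5.92)
a_min = np.sqrt(gamma**2 * omega0**2 - gamma**4 / 4.0)

# Phase lag alpha at Omega = omega0 should be pi/2.
alpha_at_omega0 = np.angle(-omega0**2 + 1j * gamma * omega0 + omega0**2)

# The squared amplitude 1/a^2 should fall to about half its peak at Omega_res + gamma/2.
half_power_ratio = (a_min / a_of(Omega_res + gamma / 2.0))**2
print(Omega_peak, Omega_res, alpha_at_omega0, half_power_ratio)
```

The half-power ratio is close to, but not exactly, 0.5 because (5.95) is only the leading-order approximation in $\gamma/\omega_0$.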
5.3.3 Taking the real part?

The introduction of a complex acceleration may cause some concern. A common trick at an elementary level is to use complex exponentials to represent real oscillations, the argument being that
(as long as we deal with linear equations) the real and imaginary parts are processed separately and so
we can just take the real part at the end. Here, we have escaped the need to do this by saying that
real functions require the Hermitian symmetry $c_{-m} = c_m^*$. If $a(t)$ is to be real, we must therefore
also have the negative-frequency part:
$$\tilde a(\omega) = 2\pi A\, \delta(\omega - \Omega) + 2\pi A^*\, \delta(\omega + \Omega). \tag{5.96}$$
The inverse Fourier transform of this is
$$a(t) = A \exp(i\Omega t) + A^* \exp(-i\Omega t) = 2|A| \cos(\Omega t + \phi), \tag{5.97}$$
where $A = |A| \exp(i\phi)$. Apart from a factor 2, this is indeed what we would have obtained via the
traditional approach of just taking the real part of the complex oscillation.
Similarly, therefore, the time-dependent solution when we insist on this real driving force of given
frequency comes simply from adding the previous solution to its complex conjugate:
$$z(t) = \frac{|A|}{a} \exp[i(\Omega t + \phi - \alpha)] + \frac{|A|}{a} \exp[-i(\Omega t + \phi - \alpha)] = 2\frac{|A|}{a} \cos(\Omega t + \phi - \alpha). \tag{5.98}$$

Figure 6.8: Illustration of the convolution of two functions, $h = f * g$, viewed as the area of the overlap
resulting from a relative shift of $x$.

FOURIER ANALYSIS: LECTURE 11

Convolution

Convolution combines two (or more) functions in a way that is useful for describing physical systems
(as we shall see). Convolutions describe, for example, how optical systems respond to an image,
and we will also see how our Fourier solutions to ODEs can often be expressed as a convolution. In
fact the FT of the convolution is easy to calculate, so it is worth looking out for when an integral
is in the form of a convolution, for in that case it may well be that FTs can be used to solve it.
First, the definition. The convolution of two functions f (x) and g(x) is defined to be
$$f(x) * g(x) = \int_{-\infty}^{\infty} dx'\, f(x')\, g(x - x'). \tag{6.99}$$

The result is also a function of $x$, meaning that we get a different number for the convolution for
each possible $x$ value. Note the positions of the dummy variable $x'$, especially that the argument
of $g$ is $x - x'$ and not $x' - x$ (a common mistake in exams).
There are a number of ways of viewing the process of convolution. Most directly, the definition here
is a measure of overlap: the functions $f$ and $g$ are shifted relative to one another by a distance $x$,
and we integrate their product. This viewpoint is illustrated in Fig. 6.8.
But this is not the best way of thinking about convolution. The real significance of the operation is
that it represents a blurring of a function. Here, it may be helpful to think of $f(x)$ as a signal, and
$g(x)$ as a blurring function. As written, the integral definition of convolution instructs us to take
the signal at $x'$, $f(x')$, and replace it by something proportional to $f(x')\,g(x - x')$: i.e. spread out
over a range of $x$ around $x'$. This turns a sharp feature in the signal into something fuzzy centred
at the same location. This is exactly what is achieved e.g. by an out-of-focus camera.
Alternatively, we can think about convolution as a form of averaging. Take the above definition of
convolution and put $y = x - x'$. Inside the integral, $x$ is constant, so $dy = -dx'$. But now we are
integrating from $y = \infty$ to $-\infty$, so we can lose the minus sign by re-inverting the limits:
$$f(x) * g(x) = \int_{-\infty}^{\infty} dy\, f(x - y)\, g(y). \tag{6.100}$$

Figure 6.9: Convolution of two top hat functions (each of unit height, extending from $-a/2$ to $a/2$).

This says that we replace the value of the signal at $x$, $f(x)$, by an average of all the values around
$x$, displaced from $x$ by an amount $y$ and weighted by the function $g(y)$. This is an equivalent view
of the process of blurring. Since it doesn't matter what we call the dummy integration variable,
this rewriting of the integral shows that convolution is commutative: you can think of $g$ blurring $f$
or $f$ blurring $g$:
$$f(x) * g(x) = \int_{-\infty}^{\infty} dz\, f(z)\, g(x - z) = \int_{-\infty}^{\infty} dz\, f(x - z)\, g(z) = g(x) * f(x). \tag{6.101}$$

6.1 Examples of convolution

1. Let $\Pi(x)$ be the top-hat function of width $a$. Then $\Pi(x) * \Pi(x)$ is the triangular function of base width $2a$.
This is much easier to do by sketching than by working it out formally: see Figure 6.9.
2. Convolution of a general function $g(x)$ with a delta function $\delta(x - a)$:
$$\delta(x - a) * g(x) = \int_{-\infty}^{\infty} dx'\, \delta(x' - a)\, g(x - x') = g(x - a), \tag{6.102}$$
using the sifting property of the delta function. This is a clear example of the blurring effect of
convolution: starting with a spike at $x = a$, we end up with a copy of the whole function $g(x)$,
but now shifted to be centred around $x = a$. So here the sifting property of a delta-function
has become a shifting property. Alternatively, we may speak of the delta-function becoming
dressed by a copy of the function $g$.
The response of the system to a delta function input (i.e. the function g(x) here) is sometimes
called the Impulse Response Function or, in an optical system, the Point Spread Function.
3. Making double slits: to form double slits of width $a$ separated by distance $2d$ between centres:
$$[\delta(x + d) + \delta(x - d)] * \Pi(x). \tag{6.103}$$
We can form diffraction gratings with more slits by adding in more delta functions.
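
The first example above can also be checked on a grid instead of by sketching. This is a minimal sketch assuming NumPy and a unit-height top hat of width $a = 1$; the grid and variable names are illustrative.

```python
import numpy as np

a = 1.0
x = np.linspace(-2.0, 2.0, 4001)         # symmetric grid with an odd number of points
dx = x[1] - x[0]
top_hat = np.where(np.abs(x) < a / 2.0, 1.0, 0.0)   # unit-height top hat of width a

# Discrete approximation to the convolution integral (6.99); mode="same"
# keeps the result on the original x grid.
numeric = np.convolve(top_hat, top_hat, mode="same") * dx
exact = np.clip(a - np.abs(x), 0.0, None)           # triangle of base width 2a

print(np.max(np.abs(numeric - exact)))   # small, of order the grid spacing dx
```

For a unit-height top hat the triangle peaks at height $a$, the area of full overlap.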

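6.2 The convolution theorem

The convolution theorem states that the Fourier transform of a convolution is a product of the individual Fourier transforms:
$$\mathrm{FT}[f(x) * g(x)] = \tilde f(k)\, \tilde g(k), \tag{6.104}$$
$$\mathrm{FT}[f(x)\, g(x)] = \frac{1}{2\pi}\, \tilde f(k) * \tilde g(k), \tag{6.105}$$
where $\tilde f(k)$, $\tilde g(k)$ are the FTs of $f(x)$, $g(x)$ respectively.

Note that:
$$\tilde f(k) * \tilde g(k) \equiv \int_{-\infty}^{\infty} dq\, \tilde f(q)\, \tilde g(k - q). \tag{6.106}$$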

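We'll prove the first of these, and we will use the Dirac delta function.
The convolution $h = f * g$ is
$$h(x) = \int_{-\infty}^{\infty} f(x')\, g(x - x')\, dx'. \tag{6.107}$$
We substitute for $f(x')$ and $g(x - x')$ in terms of their FTs, noting the argument of $g$ is not $x'$:
$$f(x') = \frac{1}{2\pi} \int_{-\infty}^{\infty} \tilde f(k)\, e^{ikx'}\, dk,$$
$$g(x - x') = \frac{1}{2\pi} \int_{-\infty}^{\infty} \tilde g(k)\, e^{ik(x - x')}\, dk.$$
Hence (relabelling the $k$ to $k'$ in $g$, so we don't have two $k$ integrals)
$$h(x) = \frac{1}{(2\pi)^2} \int_{-\infty}^{\infty} \left[ \int_{-\infty}^{\infty} \tilde f(k)\, e^{ikx'}\, dk \int_{-\infty}^{\infty} \tilde g(k')\, e^{ik'(x - x')}\, dk' \right] dx'. \tag{6.108}$$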

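Now, as is very common with these multiple integrals, we do the integrations in a different order.
Notice that the only terms which depend on $x'$ are the two exponentials, indeed only part of the
second one. We do this one first, using the fact that the integral gives $2\pi$ times a Dirac delta
function:
$$\begin{aligned}
h(x) &= \frac{1}{(2\pi)^2} \int_{-\infty}^{\infty} \tilde f(k) \left[ \int_{-\infty}^{\infty} \tilde g(k')\, e^{ik'x} \left( \int_{-\infty}^{\infty} e^{i(k - k')x'}\, dx' \right) dk' \right] dk \\
&= \frac{1}{(2\pi)^2} \int_{-\infty}^{\infty} \tilde f(k) \left[ \int_{-\infty}^{\infty} \tilde g(k')\, e^{ik'x}\, [2\pi\, \delta(k - k')]\, dk' \right] dk.
\end{aligned}$$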

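Having a delta function simplifies the integration enormously. We can do either the $k$ or the $k'$
integration immediately (it doesn't matter which you do; let us do $k'$):
$$\begin{aligned}
h(x) &= \frac{1}{2\pi} \int_{-\infty}^{\infty} \tilde f(k) \left[ \int_{-\infty}^{\infty} \tilde g(k')\, e^{ik'x}\, \delta(k - k')\, dk' \right] dk \\
&= \frac{1}{2\pi} \int_{-\infty}^{\infty} \tilde f(k)\, \tilde g(k)\, e^{ikx}\, dk.
\end{aligned}$$
Since
$$h(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \tilde h(k)\, e^{ikx}\, dk, \tag{6.109}$$
we see that
$$\tilde h(k) = \tilde f(k)\, \tilde g(k). \tag{6.110}$$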

Note that we can apply the convolution theorem in reverse, going from Fourier space to real space,
so we get the key result to remember about the convolution theorem:
$$\begin{aligned}
\text{Convolution in real space} &\Leftrightarrow \text{Multiplication in Fourier space} \\
\text{Multiplication in real space} &\Leftrightarrow \text{Convolution in Fourier space}
\end{aligned} \tag{6.111}$$
This is an important result. Note that if one has a convolution to do, it is often most efficient to
do it with Fourier Transforms, not least because a very efficient way of doing them on computers
exists: the Fast Fourier Transform, or FFT.
CONVENTION ALERT! Note that if we had chosen a different convention for the $2\pi$ factors
in the original definitions of the FTs, the convolution theorem would look different. Make sure
you use the right one for the conventions you are using!
Note that convolution commutes, $f(x) * g(x) = g(x) * f(x)$, which is easily seen (e.g. since the FT
is $\tilde f(k)\, \tilde g(k) = \tilde g(k)\, \tilde f(k)$).
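
As an illustration of doing a convolution with FFTs, the sketch below compares a direct discrete convolution with the product of discrete transforms. It assumes NumPy; the random input sequences are arbitrary, and the discrete transform carries its own normalisation convention, so no explicit $2\pi$ factors appear.

```python
import numpy as np

rng = np.random.default_rng(0)
f = rng.normal(size=64)      # arbitrary sample sequences (illustrative)
g = rng.normal(size=32)

# Direct (linear) convolution: sum over i of f[i] * g[n - i].
direct = np.convolve(f, g)

# Same result via the convolution theorem. Zero-padding both inputs to the
# full output length avoids the wrap-around of the periodic discrete transform.
n = f.size + g.size - 1
via_fft = np.fft.irfft(np.fft.rfft(f, n) * np.fft.rfft(g, n), n)

print(np.max(np.abs(direct - via_fft)))   # agreement to rounding error
```

For long sequences the FFT route scales as $N \log N$ rather than $N^2$, which is the efficiency gain referred to above.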
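Example application: Fourier transform of the triangular function of base width $2a$. We know
that a triangle is a convolution of top hats:
$$\Lambda(x) = \Pi(x) * \Pi(x). \tag{6.112}$$
Hence by the convolution theorem:
$$\mathrm{FT}[\Lambda] = (\mathrm{FT}[\Pi(x)])^2 = \mathrm{sinc}^2\left(\frac{ka}{2}\right), \tag{6.113}$$
where the top hat is taken to be normalised to unit area, so that its FT is $\mathrm{sinc}(ka/2)$; for a top hat of unit height there is an extra overall factor $a^2$.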

FOURIER ANALYSIS: LECTURE 12


6.3 Application of FTs and convolution

6.3.1 Fraunhofer Diffraction

Imagine a single-slit optics experiment (see Fig. 6.10). Light enters from the left, and interferes,
forming a pattern on a screen on the right. We apply Huygens' Principle, which states that each
point on the aperture acts as a source. Let the vertical position on the source be $x$, and the
transmission of the aperture be $T(x)$. We will take this to be a top-hat for a simple slit,
$$T(x) = \begin{cases} 1 & -\frac{a}{2} < x < \frac{a}{2} \\ 0 & |x| > \frac{a}{2} \end{cases}, \tag{6.114}$$
but we will start by letting $T(x)$ be an arbitrary function (reflecting partial transmission, or multiple
slits, for example).

Figure 6.10: A slit of width a permitting light to enter. We want to compute the intensity on a
screen a distance D away. Credit: Wikipedia
From a small element $dx$, the electric field at distance $r$ (on the screen) is
$$dE = E_0\, \frac{T(x)\, dx}{r}\, e^{i(kr - \omega t)}, \tag{6.115}$$
for some source strength $E_0$. To get the full electric field, we integrate over the aperture:
$$E(y) = E_0 \int \frac{T(x)\, dx}{r}\, e^{i(kr - \omega t)}. \tag{6.116}$$

If the screen is far away, and the angle $\theta$ is small, then $\sin\theta = y/D \simeq \theta$, and $r \simeq D$ in the
denominator. In the exponent, we need to be more careful.
If $D$ is the distance of the screen from the origin of $x$ (e.g. the middle of the slit), then Pythagoras
says that
$$r = \left[ D^2 + (y - x)^2 \right]^{1/2} = D \left[ 1 + (y - x)^2/D^2 \right]^{1/2} \simeq D + (y - x)^2/2D \tag{6.117}$$
(where we assume a distant screen and small angles, so that both $x$ and $y$ are $\ll D$). The terms in $r$
that depend on $x$ are $-(y/D)x + x^2/2D = -x\theta + (x/2D)x$; if we are interested in diffraction at fixed
$\theta$, we can always take the screen far enough away that the second term is negligible ($x/D \ll \theta$). To
first order in $\theta$, we then have the simple approximation that governs Fraunhofer diffraction:
$$r \simeq D - x\theta. \tag{6.118}$$

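As a result,
$$E(y) \simeq \frac{E_0\, e^{i(kD - \omega t)}}{D} \int_{-\infty}^{\infty} T(x)\, e^{-ixk\theta}\, dx. \tag{6.119}$$
So we see that the electric field is proportional to the FT of the aperture $T(x)$, evaluated at $k\theta$:
$$E(y) \propto \tilde T(k\theta). \tag{6.120}$$
Note the argument is not $k$!

The intensity $I(y)$ (or $I(\theta)$) is proportional to $|E(y)|^2$, so the phase factor cancels out, and
$$I(\theta) \propto \left| \tilde T(k\theta) \right|^2. \tag{6.121}$$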
For a single slit of width $a$, we find as before that
$$I(\theta) \propto \mathrm{sinc}^2\left(\frac{ka\theta}{2}\right) = \mathrm{sinc}^2\left(\frac{\pi a\theta}{\lambda}\right), \tag{6.122}$$
where $\lambda = 2\pi/k$ is the wavelength of the light. Note that the first zero of the diffraction pattern is
when the argument is $\pi$, so $\pi a\theta/\lambda = \pi$, or $\theta = \lambda/a$.
If the wavelength is much less than the size of the object (the slit here), then the diffraction pattern is
effectively confined to a very small angle, and the optics is effectively geometric (i.e. straight-line
ray-tracing with shadows etc).
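Diffraction using the convolution theorem

Double-slit interference: two slits of width $a$ and spacing $2d$. In terms of $k' \equiv k\theta = 2\pi\theta/\lambda$, and dropping the constant factor set by the normalisation of $\Pi$,
$$\begin{aligned}
\mathrm{FT}[\{\delta(x + d) + \delta(x - d)\} * \Pi(x)] &= \mathrm{FT}[\{\delta(x + d) + \delta(x - d)\}]\; \mathrm{FT}[\Pi(x)] && (6.123) \\
&= \{\mathrm{FT}[\delta(x + d)] + \mathrm{FT}[\delta(x - d)]\}\; \mathrm{FT}[\Pi(x)] && (6.124) \\
&= (e^{ik'd} + e^{-ik'd})\, \mathrm{sinc}\left(\frac{k'a}{2}\right) && (6.125) \\
&= 2\cos(k'd)\, \mathrm{sinc}\left(\frac{k'a}{2}\right). && (6.126)
\end{aligned}$$
Hence the intensity is a $\mathrm{sinc}^2$ function, modulated by a shorter-wavelength $\cos^2$ function. See Fig.
6.11.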
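
The double-slit result can be compared with a direct numerical Fourier transform of the aperture. The sketch below assumes NumPy, unit-height slits (so a factor $a$ multiplies the sinc) and illustrative values of $a$ and $d$; the function names are hypothetical.

```python
import numpy as np

def sinc(u):
    """sinc(u) = sin(u)/u as used in the notes; np.sinc uses sin(pi u)/(pi u)."""
    return np.sinc(u / np.pi)

def double_slit_ft(kp, a, d):
    """FT of two unit-height slits of width a centred on x = -d and x = +d."""
    return 2.0 * a * np.cos(kp * d) * sinc(kp * a / 2.0)

def numeric_ft(kp, a, d, n=2000):
    """Midpoint-rule estimate of the integral of T(x) exp(-i k' x) over both slits."""
    s = (np.arange(n) + 0.5) * (a / n) - a / 2.0     # midpoints across one slit
    x = np.concatenate([s - d, s + d])
    return np.sum(np.exp(-1j * np.outer(kp, x)), axis=1) * (a / n)

a, d = 1.0, 3.0                      # illustrative slit width and half-separation
kp = np.linspace(-20.0, 20.0, 81)    # k' = k * theta
intensity = np.abs(double_slit_ft(kp, a, d))**2
print(np.max(np.abs(numeric_ft(kp, a, d) - double_slit_ft(kp, a, d))))
```

The `intensity` array shows the slowly varying $\mathrm{sinc}^2$ envelope multiplied by the rapidly oscillating $\cos^2(k'd)$ fringes.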
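6.3.2 Solving ODEs, revisited

Recall our problem
$$\frac{d^2 z}{dt^2} - \omega_0^2 z = f(t). \tag{6.127}$$
Using FTs, we found that a solution was
$$z(t) = -\frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{\tilde f(\omega)}{\omega_0^2 + \omega^2}\, e^{i\omega t}\, d\omega. \tag{6.128}$$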

Now we can go a bit further, because we see that the FT of $z(t)$ is a product (in Fourier space), of
$\tilde f(\omega)$ and
$$\tilde g(\omega) \equiv -\frac{1}{\omega_0^2 + \omega^2}, \tag{6.129}$$
hence the solution is a convolution in real (i.e. time) space:
$$z(t) = \int_{-\infty}^{\infty} f(t')\, g(t - t')\, dt'. \tag{6.130}$$

Figure 6.11: Intensity pattern from a double slit, each of width b and separated by 2d (Credit:
Yonsei University).

An exercise for you is to show that the FT of
$$g(t) = -\frac{e^{-\omega_0 |t|}}{2\omega_0} \tag{6.131}$$
is $\tilde g(\omega) = -1/(\omega_0^2 + \omega^2)$, so we finally arrive at the general solution for a driving force $f(t)$:
$$z(t) = -\frac{1}{2\omega_0} \int_{-\infty}^{\infty} f(t')\, e^{-\omega_0 |t - t'|}\, dt'. \tag{6.132}$$
Note how we have put in $g(t - t') = -e^{-\omega_0 |t - t'|}/2\omega_0$ here, not $g(t)$ or $g(t')$, as required for a convolution.
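
The convolution form of the solution can be tested by evaluating the integral on a grid and substituting back into the differential equation. This is a minimal sketch assuming NumPy, $\omega_0 = 1$ and a Gaussian driving force; all of these are illustrative choices.

```python
import numpy as np

omega0 = 1.0
t = np.linspace(-20.0, 20.0, 4001)     # symmetric grid, odd number of points
dt = t[1] - t[0]
f = np.exp(-t**2 / 2.0)                # illustrative driving force

# Kernel g(t) from eq. (6.131), sampled on the same grid (centred on t = 0).
g = -np.exp(-omega0 * np.abs(t)) / (2.0 * omega0)

# Discrete version of z(t) = integral of f(t') g(t - t') dt', eq. (6.130).
z = np.convolve(f, g, mode="same") * dt

# Check z'' - omega0^2 z = f at interior points with a centred second difference.
z_dd = (z[2:] - 2.0 * z[1:-1] + z[:-2]) / dt**2
residual = np.max(np.abs(z_dd - omega0**2 * z[1:-1] - f[1:-1]))
print(residual, z[2000])
```

For this driving force the value at $t = 0$ can be done analytically, $z(0) = -e^{1/2}\sqrt{\pi/2}\,\mathrm{erfc}(1/\sqrt{2}) \simeq -0.6557$, which the grid estimate should reproduce closely.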

FOURIER ANALYSIS: LECTURE 13

Parseval's theorem for FTs (Plancherel's theorem)

For FTs, there is a similar relationship between the average of the square of the function and the
FT coefficients as there is with Fourier Series. For FTs it is strictly called Plancherel's theorem, but
is often called the same as FS, i.e. Parseval's theorem; we will stick with Parseval. The theorem
says
$$\int_{-\infty}^{\infty} |f(x)|^2\, dx = \frac{1}{2\pi} \int_{-\infty}^{\infty} |\tilde f(k)|^2\, dk. \tag{7.133}$$

It is useful to compare different ways of proving this:

(1) The first is to go back to Fourier series for a periodic $f(x)$: $f(x) = \sum_n c_n \exp(ik_n x)$, and $|f|^2$
requires us to multiply the series by its complex conjugate, which gives lots of cross terms. But when we integrate
over one fundamental period, all oscillating terms average to zero. Therefore the only terms that
survive are ones where $c_n \exp(ik_n x)$ pairs with $c_n^* \exp(-ik_n x)$. This gives us Parseval's theorem for
Fourier series:
$$\frac{1}{\ell} \int_{-\ell/2}^{\ell/2} |f(x)|^2\, dx = \sum_n |c_n|^2 \quad\Rightarrow\quad \int_{-\ell/2}^{\ell/2} |f(x)|^2\, dx = \ell \sum_n |c_n|^2 = \frac{1}{\ell} \sum_n |\tilde f|^2, \tag{7.134}$$
using the definition $\tilde f = \ell c_n$. But the mode spacing is $dk = 2\pi/\ell$, so $1/\ell$ is $dk/2\pi$. Now we take the
continuum limit of $\ell \to \infty$ and $\sum dk$ becomes $\int dk$.
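(2) Alternatively, we can give a direct proof using delta-functions:
$$|f(x)|^2 = f(x) f^*(x) = \left( \frac{1}{2\pi} \int \tilde f(k) \exp(ikx)\, dk \right) \times \left( \frac{1}{2\pi} \int \tilde f^*(k') \exp(-ik'x)\, dk' \right), \tag{7.135}$$
which is
$$\frac{1}{(2\pi)^2} \iint \tilde f(k)\, \tilde f^*(k') \exp[ix(k - k')]\, dk\, dk'. \tag{7.136}$$
If we now integrate over $x$, we generate a delta-function:
$$\int \exp[ix(k - k')]\, dx = (2\pi)\, \delta(k - k'). \tag{7.137}$$
So
$$\int |f(x)|^2\, dx = \frac{1}{2\pi} \iint \tilde f(k)\, \tilde f^*(k')\, \delta(k - k')\, dk\, dk' = \frac{1}{2\pi} \int |\tilde f(k)|^2\, dk. \tag{7.138}$$

7.1 Energy spectrum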

As in the case of Fourier series, $|\tilde f(k)|^2$ is often called the Power Spectrum of the signal. If we have
a field (such as an electric field) where the energy density is proportional to the square of the field,
then we can interpret the square of the Fourier Transform coefficients as the energy associated with
each frequency, i.e. the total energy radiated is
$$\int_{-\infty}^{\infty} |f(t)|^2\, dt. \tag{7.139}$$
By Parseval's theorem, this is equal to
$$\frac{1}{2\pi} \int_{-\infty}^{\infty} |\tilde f(\omega)|^2\, d\omega, \tag{7.140}$$
and we interpret $|\tilde f(\omega)|^2/(2\pi)$ as the energy radiated per unit (angular) frequency, at frequency $\omega$.
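
Parseval's theorem is easy to verify numerically by approximating the continuous transform with a discrete one. The sketch below assumes NumPy and a Gaussian test signal, for which both sides equal $\sqrt{\pi}$; the grid parameters are illustrative.

```python
import numpy as np

n = 4096
x = np.linspace(-40.0, 40.0, n, endpoint=False)
dx = x[1] - x[0]
f = np.exp(-x**2 / 2.0)                  # illustrative signal

# Approximate the continuous FT by dx times the discrete transform; the phase
# factor from the grid offset does not affect |f~|^2.
f_tilde = dx * np.fft.fft(f)
dk = 2.0 * np.pi / (n * dx)              # mode spacing

energy_x = np.sum(np.abs(f)**2) * dx                        # left-hand side of (7.133)
energy_k = np.sum(np.abs(f_tilde)**2) * dk / (2.0 * np.pi)  # right-hand side of (7.133)
print(energy_x, energy_k, np.sqrt(np.pi))
```

The factor $dk/2\pi = 1/(n\,dx)$ is the discrete counterpart of the $1/\ell$ that appears in the Fourier-series form of the theorem.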
7.1.1 Exponential decay

If we have a quantum transition from an upper state to a lower state, which happens spontaneously,
then the intensity of emission will decay exponentially, with a timescale $\tau = 1/a$, as well as having
a sinusoidal dependence with frequency $\omega_0$:
$$f(t) = e^{-at} \cos(\omega_0 t) \qquad (t > 0). \tag{7.141}$$

Figure 7.12: Frequency spectrum of two separate exponentially decaying systems with 2 different
time constants $\tau$. (x axis is frequency, y axis $|\tilde f(\omega)|^2$ in arbitrary units).
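Algebraically it is easier to write the cosine as a sum of complex exponentials and do the FT of
each exponential separately. So consider
$$f(t) = \frac{1}{2} e^{-at} \left( e^{i\omega_0 t} + e^{-i\omega_0 t} \right) \qquad (t > 0). \tag{7.142}$$
The Fourier transform is
$$\tilde f(\omega) = \frac{1}{2} \int_0^{\infty} \left( e^{-at - i\omega t + i\omega_0 t} + e^{-at - i\omega t - i\omega_0 t} \right) dt \tag{7.143}$$
$$\Rightarrow\quad 2\tilde f(\omega) = \left[ \frac{e^{-at - i\omega t + i\omega_0 t}}{-a - i\omega + i\omega_0} + \frac{e^{-at - i\omega t - i\omega_0 t}}{-a - i\omega - i\omega_0} \right]_0^{\infty} = \frac{1}{(a + i\omega - i\omega_0)} + \frac{1}{(a + i\omega + i\omega_0)}. \tag{7.144}$$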
This is sharply peaked near $\omega = \omega_0$; near this frequency, we therefore ignore the second term, and
the frequency spectrum is
$$|\tilde f(\omega)|^2 \simeq \frac{1}{4}\, \frac{1}{[a + i(\omega - \omega_0)]\,[a - i(\omega - \omega_0)]} = \frac{1}{4\,[a^2 + (\omega - \omega_0)^2]}. \tag{7.145}$$
This is a Lorentzian curve with width $a = 1/\tau$. Note that the width of the line in frequency is
inversely proportional to the decay timescale $\tau$. This is an example of the Uncertainty Principle,
and relates the natural width of a spectral line to the decay rate. See Fig. 7.12.
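
The closed form and the Lorentzian approximation can both be checked against a direct numerical integral. This sketch assumes NumPy and illustrative values $a = 0.05$, $\omega_0 = 1$; the function names are hypothetical.

```python
import numpy as np

a, omega0 = 0.05, 1.0          # illustrative decay rate and line frequency

def ft_exact(omega):
    """Closed form (7.144), divided by 2 to give f~(omega)."""
    return 0.5 * (1.0 / (a + 1j * (omega - omega0)) + 1.0 / (a + 1j * (omega + omega0)))

def ft_numeric(omega, t_max=400.0, n=400001):
    """Trapezoidal estimate of the integral (7.143), truncated at t_max >> 1/a."""
    t = np.linspace(0.0, t_max, n)
    dt = t[1] - t[0]
    integrand = np.exp(-a * t) * np.cos(omega0 * t) * np.exp(-1j * omega * t)
    return (np.sum(integrand) - 0.5 * (integrand[0] + integrand[-1])) * dt

def lorentzian(omega):
    """Approximate power spectrum (7.145), valid near omega = omega0."""
    return 0.25 / (a**2 + (omega - omega0)**2)

peak_error = abs(abs(ft_exact(omega0))**2 - lorentzian(omega0)) / lorentzian(omega0)
print(peak_error)      # small when a << omega0
```

The Lorentzian falls to half its peak value at $\omega = \omega_0 \pm a$, which is the sense in which $a = 1/\tau$ is the width of the line.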

Footnote 2: Note that the integral in (7.143) is similar to one which leads to Delta functions, but it isn't, because of the $e^{-at}$ term. For
this reason, you can integrate it by normal methods. If $a = 0$, then the integral does indeed lead to Delta functions.

7.2 Correlations and cross-correlations

Correlations are defined in a similar way to convolutions, but look carefully, as they are slightly
different. With correlations, we are concerned with how similar functions are when one is displaced
by a certain amount. If the functions are different, the quantity is called the cross-correlation; if it
is the same function, it is called the auto-correlation, or simply correlation.
The cross-correlation of two functions is defined by
$$c(X) \equiv \langle f^*(x)\, g(x + X) \rangle \equiv \int_{-\infty}^{\infty} f^*(x)\, g(x + X)\, dx. \tag{7.146}$$

Compare this with convolution (equation 6.99). $X$ is sometimes called the lag. Note that cross-correlation does not commute, unlike convolution. The most interesting special case is when $f$ and
$g$ are the same function: then we have the auto-correlation function.
The meaning of these functions is easy to visualise if the functions are real: at zero lag, the auto-correlation function is then proportional to the variance in the function (it would be equal if we
divided the integral by a length $\ell$, where the functions are zero outside that range). So then the
correlation coefficient of the function is
$$r(X) = \frac{\langle f(x)\, f(x + X) \rangle}{\langle f^2 \rangle}. \tag{7.147}$$

If r is small, then the values of f at widely separated points are unrelated to each other: the point
at which r falls to 1/2 defines a characteristic width of a function. This concept is used particularly
in random processes.
The FT of a cross-correlation is
$$\tilde c(k) = \tilde f^*(k)\, \tilde g(k). \tag{7.148}$$
This looks rather similar to the convolution theorem, which is hardly surprising given the similarity
of the definitions of cross-correlation and convolution. Indeed, the result can be proved directly from
the convolution theorem, by writing the cross-correlation as a convolution.
A final consequence of this is that the FT of an auto-correlation is just the power spectrum; or, to
give the inverse relation:
$$\langle f^*(x)\, f(x + X) \rangle = \frac{1}{2\pi} \int_{-\infty}^{\infty} |\tilde f|^2 \exp(ikX)\, dk. \tag{7.149}$$
This is known as the Wiener-Khinchin theorem, and it generalises Parseval's theorem (to which it
reduces when $X = 0$).
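
A discrete, periodic analogue of the Wiener-Khinchin theorem can be verified in a few lines. The sketch assumes NumPy and an arbitrary real random sequence; the periodic (wrap-around) auto-correlation is used because that is what the discrete transform naturally represents.

```python
import numpy as np

rng = np.random.default_rng(1)
f = rng.normal(size=128)                 # arbitrary real samples (illustrative)

# Direct periodic auto-correlation: c[X] = sum over x of f[x] * f[x + X].
direct = np.array([np.sum(f * np.roll(f, -X)) for X in range(f.size)])

# Wiener-Khinchin route: inverse transform of the power spectrum |F|^2.
power = np.abs(np.fft.fft(f))**2
via_power = np.fft.ifft(power).real

print(np.max(np.abs(direct - via_power)))   # agreement to rounding error
```

At zero lag the result is simply the sum of $f^2$, which is the discrete form of Parseval's theorem.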

7.3 Fourier analysis in multiple dimensions

We have now completed all the major tools of Fourier analysis, in one spatial dimension. In many
cases, we want to consider more than one dimension, and the extension is relatively straightforward.
Start with the fundamental Fourier series, $f(x) = \sum_n c_n \exp(i2\pi nx/\ell_x)$. $f(x)$ can be thought of
as $F(x, y)$ at constant $y$; if we change $y$, the effective $f(x)$ changes, so the $c_n$ must depend on $y$.
Hence we can Fourier expand these as a series in $y$:
$$c_n(y) = \sum_m d_{nm} \exp(i2\pi my/\ell_y), \tag{7.150}$$
where we assume that the function is periodic in $x$, with period $\ell_x$, and $y$, with period $\ell_y$. The
overall series is then
$$F(x, y) = \sum_{n,m} d_{nm} \exp[2\pi i(nx/\ell_x + my/\ell_y)] = \sum_{n,m} d_{nm} \exp[i(k_x x + k_y y)] = \sum_{n,m} d_{nm} \exp[i(\mathbf{k} \cdot \mathbf{x})]. \tag{7.151}$$

Figure 7.13: Illustrating the origin of the density of states in 2D. The allowed modes are shown as
points in the $(k_x, k_y)$ plane, with a separation in $k_x$ and $k_y$ of $2\pi/\ell$, where $\ell$ is the periodicity. The number of modes
between $|\mathbf{k}|$ and $|\mathbf{k}| + d|\mathbf{k}|$ (i.e. inside the shaded annulus) is well approximated by $(\ell/2\pi)^2$ times
the area of the annulus, as $\ell \to \infty$, and the mode spacing tends to zero. Clearly, in $D$ dimensions,
the mode density is just $(\ell/2\pi)^D$.
This is really just the same as the 1D form, and the extension to $D$ dimensions should be obvious.
In the end, we just replace the usual $kx$ term with the dot product between the position vector and
the wave vector.
The Fourier transform in $D$ dimensions just involves taking the limit of $\ell_x \to \infty$, $\ell_y \to \infty$ etc. The
Fourier coefficients become a continuous function of $\mathbf{k}$, in which case we can sum over bins in $\mathbf{k}$
space, each containing $N_{\rm modes}(\mathbf{k})$ modes:
$$F(\mathbf{x}) = \sum_{\rm bin} d(\mathbf{k}) \exp[i(\mathbf{k} \cdot \mathbf{x})]\, N_{\rm modes}. \tag{7.152}$$
The number of modes in a given $k$-space bin is set by the period in each direction: allowed modes
lie on a grid of points in the space of $k_x$, $k_y$ etc. as shown in Figure 7.13. If for simplicity the period
is the same in all directions, the density of states is $\ell^D/(2\pi)^D$:
$$N_{\rm modes} = \frac{\ell^D}{(2\pi)^D}\, d^D k. \tag{7.153}$$
This is an important concept which is used in many areas of physics.

The Fourier expression of a function is therefore
$$F(\mathbf{x}) = \frac{1}{(2\pi)^D} \int \tilde F(\mathbf{k}) \exp[i(\mathbf{k} \cdot \mathbf{x})]\, d^D k, \tag{7.154}$$
where we have defined $\tilde F(\mathbf{k}) \equiv \ell^D d(\mathbf{k})$. The inverse relation would be obtained as in 1D, by
appealing to orthogonality of the modes:
$$\tilde F(\mathbf{k}) = \int F(\mathbf{x}) \exp[-i(\mathbf{k} \cdot \mathbf{x})]\, d^D x. \tag{7.155}$$
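
The density-of-states argument can be tested by counting grid points in an annulus. This is a minimal sketch in 2D assuming NumPy; the box size $\ell = 100$ and the annulus limits are illustrative values.

```python
import numpy as np

ell = 100.0                               # illustrative box size (period)
spacing = 2.0 * np.pi / ell               # separation of allowed modes in k_x and k_y
n_max = 100
n = np.arange(-n_max, n_max + 1)
kx, ky = np.meshgrid(n * spacing, n * spacing)
k_mod = np.hypot(kx, ky)

k_lo, k_hi = 5.0, 5.5                     # annulus between |k| and |k| + d|k|
counted = np.count_nonzero((k_mod >= k_lo) & (k_mod < k_hi))
predicted = (ell / (2.0 * np.pi))**2 * np.pi * (k_hi**2 - k_lo**2)
print(counted, predicted)
```

The count and the prediction differ only by a small fractional amount that comes from grid points near the edges of the annulus, and this shrinks as $\ell$ grows.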

FOURIER ANALYSIS: LECTURE 14

Digital analysis and sampling

Imagine we have a continuous signal (e.g. pressure of air during music) which we sample by making
measurements at a few particular times. Any practical storage of information must involve this
step of analogue-to-digital conversion. This means we are converting a continuous function into
one that is only known at discrete points, i.e. we are throwing away information. We would feel
a lot more comfortable doing this if we knew that the missing information can be recovered, by
some form of interpolation between the sampled points. Intuitively, this seems reasonable if the
sampling interval is very fine: by the definition of continuity, the function between two sampled
points should be arbitrarily close to the average of the sample values as the locations of the samples
get closer together. But the sampling interval has to be finite, so this raises the question of how
coarse it can be; clearly we would prefer to use as few samples as possible consistent with not losing
any information. This question does have a well-posed answer, which we can derive using Fourier
methods.
The first issue is how to represent the process of converting a function $f(x)$ into a set of values
$\{f(x_i)\}$. We can do this by using some delta functions:
$$f(x) \to f_s(x) \equiv f(x) \sum_i \delta(x - x_i). \tag{8.156}$$
This replaces our function by a sum of spikes at the locations $x_i$, each with a weight $f(x_i)$. This
representation of the sampled function holds the information of the sample values and locations.
So, for example, if we try to average the sampled function over some range, we automatically get
something proportional to just adding up the sample values that lie in the range:
$$\int_{x_1}^{x_2} f_s(x)\, dx = \sum_{\rm in\ range} f(x_i). \tag{8.157}$$

Figure 8.14: Top: An infinite comb in real space, with spikes at $x = 0, \pm\Delta x, \pm 2\Delta x, \ldots$ This represents the sampling pattern of a function
which is sampled regularly every $\Delta x$. Bottom: The FT of the infinite comb, which is also an infinite
comb, with spikes at $u = 0, \pm 1/\Delta x, \pm 2/\Delta x, \ldots$ Note that $u$ here is $k/(2\pi)$.

8.1 The infinite comb

If we sample regularly with a spacing $\Delta x$, then we have an infinite comb: an infinite series of
delta functions. The comb is (see Fig. 8.14):
$$g(x) = \sum_{j=-\infty}^{\infty} \delta(x - j\Delta x). \tag{8.158}$$
This is also known as the Shah function.


To compute the FT of the Shah function, we will write it in another way. This is derived from
the fact that the function is periodic, and therefore suitable to be written as a Fourier series with
$\ell = \Delta x$:
$$g(x) = \sum_n c_n \exp(2\pi nix/\Delta x). \tag{8.159}$$
The coefficients $c_n$ are just
$$c_n = \frac{1}{\Delta x} \int_{-\Delta x/2}^{\Delta x/2} \delta(x)\, dx = \frac{1}{\Delta x} \tag{8.160}$$
(the factor $\exp(-2\pi nix/\Delta x)$ in the coefficient integral is unity at $x = 0$, where the delta function acts), so that
$$g(x) = \frac{1}{\Delta x} \sum_n \exp(2\pi nix/\Delta x) = \frac{1}{2\pi} \int \tilde g(k) \exp(ikx)\, dk. \tag{8.161}$$
From this, the Fourier transform is obvious (or it could be extracted formally by integrating our
new expression for $g(x)$ and obtaining a delta-function):
$$\tilde g(k) = \frac{2\pi}{\Delta x} \sum_{n=-\infty}^{\infty} \delta(k - 2\pi n/\Delta x), \tag{8.162}$$
which is an infinite comb in Fourier space, with spacing $2\pi/\Delta x$.

Figure 8.15: If the sampling is fine enough, then the original spectrum can be recovered from the
sampled spectrum.

Figure 8.16: If the sampling is not fine enough, then the power at different frequencies gets mixed
up, and the original spectrum cannot be recovered.

The FT of a function sampled with an infinite comb is therefore ($1/2\pi$ times) the convolution of
this and the FT of the function:
$$\tilde f_s(k) = \frac{1}{2\pi}\, \tilde f(k) * \tilde g(k) = \frac{1}{\Delta x} \sum_{n=-\infty}^{\infty} \tilde f(k - 2\pi n/\Delta x). \tag{8.163}$$
In other words, each delta-function in the $k$-space comb becomes dressed with a copy of the
transform of the original function.

Figure 8.17: If $\sin t$ is sampled at unit values of $t$, then $\sin(t + 2\pi t)$ is indistinguishable at the
sampling points. The sampling theorem says we can only reconstruct the function between the
samples if we know that high-frequency components are absent.

8.2 Shannon sampling, aliasing and the Nyquist frequency

We can now go back to the original question: do the sampled values allow us to reconstruct the
original function exactly? An equivalent question is whether the transform of the sampled function
allows us to reconstruct the transform of the original function.
The answer is that this is possible (a) if the original spectrum is bandlimited, which means that the power is confined to a finite range of wavenumber (i.e. there is a maximum wavenumber $k_{\rm max}$ which has non-zero Fourier coefficients); and (b) if the sampling is fine enough. This is illustrated in Figs 8.15 and 8.16. If the sampling is not frequent enough, the power at different wavenumbers gets mixed up. This is called aliasing. The condition to be able to measure the spectrum accurately is to have a sample at least as often as the Shannon rate

$$\Delta x = \frac{\pi}{k_{\rm max}}. \tag{8.164}$$

The Nyquist wavenumber is defined as

$$k_{\rm Nyquist} = \frac{\pi}{\Delta x}, \tag{8.165}$$

which needs to exceed the maximum wavenumber in order to avoid aliasing:

$$k_{\rm Nyquist} \ge k_{\rm max}. \tag{8.166}$$

For time-sampled data (such as sound), the same applies, with wavenumber $k$ replaced by frequency $\omega$.
There is a simple way of seeing that this makes sense, as illustrated in Figure 8.17. Given samples of a Fourier mode at a certain interval, $\Delta x$, a mode with a frequency increased by any multiple of $2\pi/\Delta x$ clearly has the same result at the sample points.
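The statement that a mode shifted by a multiple of $2\pi/\Delta x$ is invisible at the sample points can be checked directly. The following is a minimal sketch; the values of `dx` and `k` are arbitrary choices made for illustration and are not taken from the notes.

```python
import numpy as np

dx = 1.0                      # sampling interval (assumed value for illustration)
k = 1.0                       # wavenumber of the original mode
x_n = dx * np.arange(16)      # sample positions x_n = n * dx

original = np.sin(k * x_n)
# Shift the wavenumber by one multiple of 2*pi/dx: the phase at each sample
# changes by 2*pi*n, so the sampled values are unchanged.
aliased = np.sin((k + 2 * np.pi / dx) * x_n)

max_difference = np.max(np.abs(original - aliased))
print(max_difference)
```

The two sets of samples agree to rounding error, so the samples alone cannot tell the two modes apart.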

8.2.1

Interpolation of samples

The idea of having data that satisfy the sampling theorem is that we should be able to reconstruct
the full function from the sampled values; how do we do this in practice? If the sampled function is
the product of $f(x)$ and the Shah function, we have seen that the FT of the sampled function is the same as $\tilde f/\Delta x$, for $|k| < \pi/\Delta x$. If we now multiply by $\tilde T(k)$: a top-hat in $k$ space, extending from $-\pi/\Delta x$ to $+\pi/\Delta x$, with height $\Delta x$, then we have exactly $\tilde f$ and can recover $f(x)$ by an inverse Fourier transform. This $k$-space multiplication amounts to convolving the sampled data with the inverse Fourier transform of $\tilde T(k)$, so we recover $f(x)$ in the form

$$f(x) = [f(x)g(x)] * T(x) = \int f(q)\sum_n \delta(q - n\Delta x)\,T(x - q)\,dq = \int \sum_n f(n\Delta x)\,\delta(q - n\Delta x)\,T(x - q)\,dq, \tag{8.167}$$

using $f(x)\delta(x - a) = f(a)\delta(x - a)$. The sum of delta-functions sifts to give

$$f(x) = \sum_n f(n\Delta x)\,T(x - n\Delta x), \tag{8.168}$$

i.e. the function $T(x) = \sin[\pi x/\Delta x]/(\pi x/\Delta x)$ is the interpolating function. This is known as sinc interpolation.
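As an illustration of Eq. (8.168), the sketch below reconstructs a band-limited function between its samples. The test function, the sample spacing and the truncation of the infinite sum to $|n| \le 200$ are all assumptions made for this example; because of the truncation the reconstruction is only approximate.

```python
import numpy as np

def sinc_interpolate(samples, n, dx, x):
    """Evaluate sum_n f(n dx) T(x - n dx) with T(x) = sin(pi x/dx)/(pi x/dx)."""
    # np.sinc(u) is sin(pi u)/(pi u), which is T(x) with u = x/dx.
    kernel = np.sinc((x[:, None] - n[None, :] * dx) / dx)
    return kernel @ samples

def f(x):
    # Band-limited test function: its transform vanishes for |k| > pi/2.
    return np.sinc(0.25 * x) ** 2

dx = 1.0                                # Nyquist wavenumber pi/dx = pi
n = np.arange(-200, 201)                # finite stand-in for the infinite sum
samples = f(n * dx)

x = np.linspace(-5.0, 5.0, 101)         # points between the samples
reconstructed = sinc_interpolate(samples, n, dx, x)
max_error = np.max(np.abs(reconstructed - f(x)))
print(max_error)
```

The residual error comes from cutting off the sum; it shrinks as more samples are included.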

8.3

CDs and compression

Most human beings can hear frequencies in the range 20 Hz to 20 kHz. The sampling theorem means that the sampling frequency needs to be at least 40 kHz to capture the 20 kHz frequencies. The CD standard samples at 44.1 kHz. The data consist of stereo: two channels each encoded as 16-bit integers. Allowing one bit for sign, the largest number encoded is thus $2^{15} - 1 = 32767$. This allows signals of typical volume to be encoded with a fractional precision of around 0.01%, an undetectable level of distortion. This means that an hour of music uses about 700 MB of information. But in practice, this requirement can be reduced by about a factor 10 without noticeable degradation in quality. The simplest approach would be to reduce the sampling rate, or to encode the signal with fewer bits. The former would require a reduction in the maximum frequency, making the music sound dull; but fewer bits would introduce distortion from the quantization of the signal. The solution implemented in the MP3 and similar algorithms is more sophisticated than this: the time series is split into frames of 1152 samples (0.026 seconds at CD rates) and each is Fourier transformed. Compression is achieved by storing simply the amplitudes and phases of the strongest modes, as well as using fewer bits to encode the amplitudes of the weaker modes, according to a perceptual encoding where the operation of the human ear is exploited, knowing how easily faint tones of a given frequency are masked by a loud one at a different frequency.
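The numbers quoted above follow from simple arithmetic, sketched below. The variable names are ours; the inputs are the CD parameters given in the text.

```python
sample_rate_hz = 44_100       # CD sampling rate
channels = 2                  # stereo
bytes_per_sample = 2          # 16-bit integers

bytes_per_hour = sample_rate_hz * channels * bytes_per_sample * 3600
megabytes_per_hour = bytes_per_hour / 1e6

largest_value = 2 ** 15 - 1                 # one bit reserved for the sign
frame_duration_s = 1152 / sample_rate_hz    # length of one MP3 frame

print(megabytes_per_hour, largest_value, frame_duration_s)
```

This gives roughly 635 MB for an hour of uncompressed stereo, the same order as the round figure quoted above, and a frame length of about 0.026 s.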

8.4

Prefiltering

If a signal does not obey the sampling theorem, it must be modified to do so before digitization.
Analogue electronics can suppress high frequencies, although they are not completely removed.
The sampling process itself almost inevitably performs this task to an extent, since it is unrealistic
to imagine that one could make an instantaneous sample of a waveform. Rather, the sampled signal
is probably an average of the true signal over some period.

This is easily analysed using the convolution theorem. Suppose each sample, taken at an interval $\tau$, is the average of the signal over a time interval $T$, centred at the sample time. This is a convolution:

$$f_c(t) = \int f(t')\,g(t - t')\,dt', \tag{8.169}$$

where $g(t - t')$ is a top hat of width $T$ centred on $t' = t$. We therefore know that

$$\tilde f_c(\omega) = \tilde f(\omega)\,\frac{\sin(\omega T/2)}{\omega T/2}. \tag{8.170}$$

At the Nyquist frequency, $\pi/\tau$, the Fourier signal in $f$ is suppressed by a factor $\sin(\pi T/2\tau)/(\pi T/2\tau)$. The natural choice of $T$ would be the same as $\tau$ (accumulate an average signal, store it, and start again). This gives $\sin(\pi/2)/(\pi/2) = 0.64$ at the Nyquist frequency, so aliasing is not strongly eliminated purely by binning the data, and further prefiltering is required before the data can be sampled.
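The suppression factor in Eq. (8.170) is easy to evaluate numerically. In this sketch the function name `boxcar_suppression` and the value of `tau` are our own choices for illustration.

```python
import numpy as np

def boxcar_suppression(omega, T):
    """Factor sin(omega T/2)/(omega T/2) from averaging over a window T."""
    # np.sinc(u) = sin(pi u)/(pi u), so use u = omega*T/(2*pi).
    return np.sinc(omega * T / (2 * np.pi))

tau = 1.0                          # sampling interval (illustrative value)
omega_nyquist = np.pi / tau
factor = boxcar_suppression(omega_nyquist, T=tau)
print(factor)
```

With $T = \tau$ the factor at the Nyquist frequency is $2/\pi$, i.e. the value 0.64 quoted above.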

Discrete Fourier Transforms & the FFT

This section is added to the course notes as a non-examinable supplement, which may be of interest
to those using numerical Fourier methods in project work. We have explored the properties of
sampled data using the concept of an infinite array of delta functions, but this is not yet a practical
form that can be implemented on a computer.

9.1

The DFT

Suppose that we have a function, $f(x)$, that is periodic with period $\ell$, and which is known only at $N$ equally spaced values $x_n = n(\ell/N)$. Suppose also that $f(x)$ is band-limited with a maximum wavenumber that satisfies $|k_{\rm max}| < \pi/(\ell/N)$, i.e. it obeys the sampling theorem. If we wanted to describe this function via a Fourier series, we would need the Fourier coefficients

$$f_k(k) = \frac{1}{\ell}\int_0^{\ell} f(x)\exp[-ikx]\,dx. \tag{9.171}$$

This integral can clearly be approximated by summing over the $N$ sampled values:

$$f_k(k) = \frac{1}{\ell}\sum_n f(x_n)\exp[-ikx_n]\,\ell/N = \frac{1}{N}\sum_n f(x_n)\exp[-ikx_n]; \tag{9.172}$$

in fact, we show below that this expression yields the exact integral for data that obey the sampling theorem. The range of grid values is irrelevant because of the periodicity of $f$. Suppose we sum from $n = 1$ to $N$ and then change to $n = 0$ to $N - 1$: the sum changes by $f(x_0)\exp[-ikx_0] - f(x_N)\exp[-ikx_N]$, but $f(x_0) = f(x_N)$ and $x_N - x_0 = \ell$. Since the allowed values of $k$ are multiples of $2\pi/\ell$, the change in the sum vanishes. We can therefore write what can be regarded as the definition of the discrete Fourier transform of the data:

$$f_k(k_m) = \frac{1}{N}\sum_{n=0}^{N-1} f(x_n)\exp[-ik_m x_n], \tag{9.173}$$

where the allowed values of $k$ are $k_m = m(2\pi/\ell)$ and the allowed values of $x$ are $x_n = n(\ell/N)$. This expression has an inverse of very similar form:

$$f(x_j) = \sum_{m=0}^{N-1} f_k(k_m)\exp[ik_m x_j]. \tag{9.174}$$

To prove this, insert the first definition in the second, bearing in mind that $k_m x_n = 2\pi mn/N$. This gives the expression

$$\frac{1}{N}\sum_{m,n} f(x_n)\exp[2\pi i m(j - n)/N] = \frac{1}{N}\sum_{m,n} f(x_n)\,z^m, \tag{9.175}$$

where $z = \exp[2\pi i(j - n)/N]$. Consider $\sum_m z^m$: where $j = n$ we have $z^m = 1$ and the sum is $N$. But if $j \ne n$, the sum is zero. To show this, consider $z\sum_m z^m = \sum_m z^m + z^N - 1$. But $z^N = 1$, and we have $z\sum_m z^m = \sum_m z^m$, requiring the sum to vanish if $z \ne 1$. Hence $\sum_m z^m = N\delta_{jn}$, and this orthogonality relation proves that the inverse is exact.
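The DFT pair of Eqs (9.173) and (9.174) can be written out directly as matrix products. This is a minimal sketch for checking the definitions, not an efficient implementation; the function names and the random test data are assumptions for illustration. Note that NumPy's `np.fft.fft` uses the same sign convention but omits the $1/N$ on the forward transform.

```python
import numpy as np

def dft(f):
    """f_k(k_m) = (1/N) sum_n f(x_n) exp(-i k_m x_n), with k_m x_n = 2 pi m n / N."""
    N = len(f)
    m = np.arange(N)[:, None]
    n = np.arange(N)[None, :]
    return (np.exp(-2j * np.pi * m * n / N) @ f) / N

def inverse_dft(fk):
    """f(x_j) = sum_m f_k(k_m) exp(+i k_m x_j)."""
    N = len(fk)
    j = np.arange(N)[:, None]
    m = np.arange(N)[None, :]
    return np.exp(2j * np.pi * j * m / N) @ fk

rng = np.random.default_rng(0)
f = rng.normal(size=8)

fk = dft(f)
round_trip_error = np.max(np.abs(inverse_dft(fk) - f))
# NumPy's fft omits the 1/N factor on the forward transform.
numpy_mismatch = np.max(np.abs(fk - np.fft.fft(f) / len(f)))
print(round_trip_error, numpy_mismatch)
```

Both differences are at the level of rounding error, as the orthogonality relation requires.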
An interesting aspect of the inverse as written is that it runs only over positive wavenumbers; don't we need $k$ to be both positive and negative? The answer comes from the gridding in $x$ and $k$: since $k_m x_n = 2\pi mn/N$, letting $m \to m + N$ has no effect. Thus a mode with $m = N - 1$ is equivalent to one with $m = -1$ etc. So the $k$-space array stores increasing positive wavenumbers in its first elements, jumping to large negative wavenumbers once $k$ exceeds the Nyquist frequency. In a little more detail, the situation depends on whether $N$ is odd or even. If it is odd, then $m = (N + 1)/2$ is an integer equivalent to $-(N - 1)/2$, so the successive elements $m = (N - 1)/2$ and $m = (N + 1)/2$ hold the symmetric information near to the Nyquist frequency, $|k| = [(N - 1)/N] \times \pi/(\ell/N)$. On the other hand, if $N$ is even, we have a single mode exactly at the Nyquist frequency: $m = N/2 \Rightarrow |k| = (N/2)(2\pi/\ell) = \pi/(\ell/N)$. It seems at first as though the lack of pairing of modes at positive and negative $k$ may cause problems with enforcing the Hermitian symmetry needed for a real $f(x)$, but this is clearly not the case, since we can start with a real $f$ and generate the $N$ Fourier coefficients as above.
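The storage order described above is the one used by NumPy, and can be inspected with `np.fft.fftfreq`. The sketch below lists the integer mode numbers $m$ for an even and an odd $N$, and checks the Hermitian symmetry for real data; the choices $N = 8$ and $N = 7$ and the random data are assumptions for illustration.

```python
import numpy as np

# Integer mode numbers m in storage order; k_m = m * (2 pi / ell).
modes_even = np.fft.fftfreq(8, d=1.0 / 8)   # N = 8
modes_odd = np.fft.fftfreq(7, d=1.0 / 7)    # N = 7
print(modes_even)
print(modes_odd)

# For real data the coefficients obey F[N - m] = conj(F[m]).
rng = np.random.default_rng(1)
f = rng.normal(size=8)
F = np.fft.fft(f)
hermitian_error = max(abs(F[8 - m] - np.conj(F[m])) for m in range(1, 8))
nyquist_imag = abs(F[4].imag)    # the unpaired m = N/2 mode is real
```

For $N = 8$ the order is 0, 1, 2, 3, $-4$, $-3$, $-2$, $-1$, with the single Nyquist mode stored at $m = 4$; for $N = 7$ it is 0, 1, 2, 3, $-3$, $-2$, $-1$.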
Finally, we should prove how this connects with our experience with using Fourier series. Here we would say

$$f_k(k) = \frac{1}{\ell}\int_0^{\ell} f(x)\exp[-ikx]\,dx. \tag{9.176}$$

We have taken the integral over 0 to $\ell$ rather than symmetrically around zero, but the same Fourier coefficient arises as long as we integrate over one period. Now, we have seen that the exact function can be interpolated from the sampled values:

$$f(x) = \sum_{n=-\infty}^{\infty} f(nX)\,T(x - nX) = \sum_{m=-\infty}^{\infty}\sum_{n=0}^{N-1} f([n + mN]X)\,T(x - [n + mN]X), \tag{9.177}$$

where $X = \ell/N$, $T$ is the sinc interpolation function, and the second form explicitly sums over all the periodic repetitions of $f$. Putting this into the integral for $f_k$ gives

\begin{align}
f_k &= \frac{1}{\ell}\sum_{m,n} f([n + mN]X)\int_0^{\ell} T(x - [n + mN]X)\exp[-ikx]\,dx \tag{9.178}\\
&= \frac{1}{\ell}\sum_{m,n} f([n + mN]X)\exp[-ik(n + mN)X]\int_{y_1}^{y_2} T(y)\exp[-iky]\,dy \tag{9.179}\\
&= \frac{1}{\ell}\sum_{n=0}^{N-1} f(nX)\exp[-iknX]\sum_{m=-\infty}^{\infty}\int_{y_1}^{y_2} T(y)\exp[-iky]\,dy \tag{9.180}\\
&= \frac{1}{\ell}\sum_{n=0}^{N-1} f(nX)\exp[-iknX]\int_{-\infty}^{\infty} T(y)\exp[-iky]\,dy. \tag{9.181}
\end{align}

The successive simplifications use (a) the fact that $f$ is periodic, so $f(x + NX) = f(x)$; (b) the fact that allowed wavenumbers are a multiple of $2\pi/\ell$, so $kNX$ is a multiple of $2\pi$; (c) recognising that the $y$ limits are $y_1 = -(n + mN)X$ and $y_2 = \ell - (n + mN)X$, so that summing over $m$ joins together segments of length $\ell$ into an integral over all values of $y$. But as we saw in section 8.2.1, the integral has the constant value $X$ while $|k|$ is less than the Nyquist frequency. Thus, for the allowed values of $k$, $f_k = (1/N)\sum_{n=0}^{N-1} f(nX)\exp[-iknX]$, so that the DFT gives the exact integral for the Fourier coefficient.

9.2

The FFT

We have seen the advantages of the DFT in data compression, meaning that it is widely used
in many pieces of contemporary consumer electronics. There is therefore a strong motivation to
compute the DFT as rapidly as possible; the Fast Fourier Transform does exactly this.
At first sight, there may seem little scope for saving time. If we define the complex number $W \equiv \exp[-i2\pi/N]$, then the DFT involves us calculating the quantity

$$F_m \equiv \sum_{n=0}^{N-1} f_n W^{nm}. \tag{9.182}$$

The most time-consuming part of this calculation is the complex multiplications between $f_n$ and powers of $W$. Even if all the powers of $W$ are precomputed and stored, there are still $N - 1$ complex multiplications to carry out for each of $N - 1$ non-zero values of $m$, so apparently the time for DFT computation scales as $N^2$ for large $N$.
The way to evade this limit is to realise that many of the multiplications are the same, because $W^N = 1$ but $W^{nm}$ has $nm$ reaching large multiples of $N$, up to $(N - 1)^2$. As an explicit example, consider $N = 4$:

\begin{align}
F_0 &= f_0 + f_1 + f_2 + f_3 \tag{9.183}\\
F_1 &= f_0 + f_1 W + f_2 W^2 + f_3 W^3 \tag{9.184}\\
F_2 &= f_0 + f_1 W^2 + f_2 W^4 + f_3 W^6 \tag{9.185}\\
F_3 &= f_0 + f_1 W^3 + f_2 W^6 + f_3 W^9. \tag{9.186}
\end{align}

There are apparently 9 complex multiplications (plus a further 5 if we need to compute the powers $W^2$, $W^3$, $W^4$, $W^6$, $W^9$). But the only distinct powers needed are $W^2$ & $W^3$. The overall transform can then be rewritten saving four multiplications: removing a redundant multiplication by $W^4$; recognising that the same quantities appear in more than one $F_i$; and that some multiplications distribute over addition:

\begin{align}
F_0 &= f_0 + f_1 + f_2 + f_3 \tag{9.187}\\
F_1 &= f_0 + f_2 W^2 + W(f_1 + f_3 W^2) \tag{9.188}\\
F_2 &= f_0 + f_2 + W^2(f_1 + f_3) \tag{9.189}\\
F_3 &= f_0 + f_2 W^2 + W^3(f_1 + f_3 W^2). \tag{9.190}
\end{align}

So now there are 5 multiplications, plus 2 for the powers: a reduction from 14 to 7.
It would take us too far afield to discuss how general algorithms for an FFT are constructed to achieve the above savings for any value of $N$. The book Numerical Recipes by Press et al. (CUP) has plenty of detail. The result is that the naive $\sim N^2$ time requirement can be reduced to $\sim N \ln N$, provided $N$ has only a few small prime factors, most simply a power of 2.
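To make the saving concrete, the sketch below compares three routes to the same transform: the direct sum of Eq. (9.182), the $N = 4$ rearrangement of Eqs (9.187)-(9.190), and a short recursive scheme that splits the data into even and odd samples. The recursive routine is the textbook radix-2 idea, shown here only as an illustration under the assumption that $N$ is a power of 2; it is not claimed to be the algorithm of any particular library, and all function names are ours.

```python
import numpy as np

def direct_dft(f):
    """F_m = sum_n f_n W^(n m) with W = exp(-2 pi i / N): about N^2 multiplications."""
    N = len(f)
    W = np.exp(-2j * np.pi / N)
    return np.array([sum(f[n] * W ** (n * m) for n in range(N)) for m in range(N)])

def radix2_fft(f):
    """Recursive splitting into even and odd samples; N must be a power of 2."""
    N = len(f)
    if N == 1:
        return np.asarray(f, dtype=complex)
    even = radix2_fft(f[0::2])
    odd = radix2_fft(f[1::2])
    twiddle = np.exp(-2j * np.pi * np.arange(N // 2) / N)   # powers of W
    # W^(m + N/2) = -W^m, so each product twiddle*odd is used twice.
    return np.concatenate([even + twiddle * odd, even - twiddle * odd])

def four_point(f):
    """The N = 4 rearrangement written out, using only W, W^2 and W^3."""
    f0, f1, f2, f3 = f
    W = np.exp(-2j * np.pi / 4)
    W2, W3 = W * W, W * W * W
    a, b = f2 * W2, f3 * W2            # shared between F1 and F3
    return np.array([f0 + f1 + f2 + f3,
                     f0 + a + W * (f1 + b),
                     f0 + f2 + W2 * (f1 + f3),
                     f0 + a + W3 * (f1 + b)])

rng = np.random.default_rng(2)
f = rng.normal(size=16) + 1j * rng.normal(size=16)
err_direct = np.max(np.abs(direct_dft(f) - np.fft.fft(f)))
err_radix2 = np.max(np.abs(radix2_fft(f) - np.fft.fft(f)))
err_four = np.max(np.abs(four_point(f[:4]) - np.fft.fft(f[:4])))
```

All three agree with `np.fft.fft` to rounding error. In `four_point` the five products involving the data are exactly those counted in the text.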


FOURIER ANALYSIS: LECTURE 15

10

Green's functions

10.1

Response to an impulse

We have spent some time so far in applying Fourier methods to solution of differential equations such as the damped oscillator. These equations are all in the form of

$$L\,y(t) = f(t), \tag{10.191}$$

where $L$ is a linear differential operator. For the damped harmonic oscillator, $L = (d^2/dt^2 + \gamma\,d/dt + \omega_0^2)$. As we know, linearity is an important property because it allows superposition: $L(y_1 + y_2) = Ly_1 + Ly_2$. It is this property that lets us solve equations in general by the method of particular integral plus complementary function: guess a solution that works for the given driving term on the RHS, and then add any solution of the homogeneous equation $Ly = 0$; this is just adding zero to each side, so the sum of the old and new $y$ functions still solves the original equation.
In this part of the course, we focus on a very powerful technique for finding the solution to such problems by considering a very simple form for the RHS: an impulse, where the force is concentrated at a particular instant. A good example would be striking a bell with a hammer: the subsequent ringing is the solution to the equation of motion. This impulse response function is also called a Green's function after George Green, who invented it in 1828 (note the apostrophe: this is not a Green function). We have to specify the time at which we apply the impulse, $T$, so the applied force is a delta-function centred at that time, and the Green's function solves

$$L\,G(t, T) = \delta(t - T). \tag{10.192}$$

Notice that the Green's function is a function of $t$ and of $T$ separately, although in simple cases it is also just a function of $t - T$.
This may sound like a peculiar thing to do, but the Green's function is everywhere in physics. An example where we can use it without realising is in electrostatics, where the electrostatic potential $\phi$ satisfies Poisson's equation:

$$\nabla^2\phi = -\rho/\epsilon_0, \tag{10.193}$$

where $\rho$ is the charge density. What is the Green's function of this equation? It is the potential due to a charge of value $-\epsilon_0$ at position vector $\mathbf{q}$:

$$G(\mathbf{r}, \mathbf{q}) = -\frac{1}{4\pi|\mathbf{r} - \mathbf{q}|}. \tag{10.194}$$

10.2

Superimposing impulses

The reason it is so useful to know the Green's function is that a general RHS can be thought of as a superposition of impulses, just as a general charge density arises from summing individual point charges. We have seen this viewpoint before in interpreting the sifting property of the delta-function. To use this approach here, take $LG(t, T) = \delta(t - T)$ and multiply both sides by $f(T)$ (which is a constant as far as $L$ is concerned, since $L$ acts only on $t$). But now integrate both sides over $T$, noting that $L$ can be taken outside the integral because it doesn't depend on $T$:

$$L\int G(t, T)\,f(T)\,dT = \int \delta(t - T)\,f(T)\,dT = f(t). \tag{10.195}$$

The last step uses sifting to show that indeed adding up a set of impulses on the RHS, centred at differing values of $T$, has given us $f(t)$. Therefore, the general solution is a superposition of the different Green's functions:

$$y(t) = \int G(t, T)\,f(T)\,dT. \tag{10.196}$$

This says that we apply a force $f(T)$ at time $T$, and the Green's function tells us how to propagate its effect to some other time $t$ (so the Green's function is also known as a propagator).
10.2.1

Importance of boundary conditions

When solving differential equations, the solution is not unique until we have applied some boundary conditions. This means that the Green's function that solves $LG(t, T) = \delta(t - T)$ also depends on the boundary conditions. This shows the importance of having boundary conditions that are homogeneous: in the form of some linear constraint(s) being zero, such as $y(a) = y(b) = 0$, or $y(a) = \dot y(b) = 0$. If such conditions apply to $G(t, T)$, then a solution that superimposes $G(t, T)$ for different values of $T$ will still satisfy the boundary condition. This would not be so for $y(a) = y(b) = 1$, and the problem would have to be manipulated into one for which the boundary conditions were homogeneous by writing a differential equation for $z \equiv y - 1$ in that case.

10.3

Finding the Green's function

The above method is general, but to find the Green's function it is easier to restrict the form of the differential equation. To emphasise that the method is not restricted to dependence on time, now consider a spatial second-order differential equation of the general form

$$\frac{d^2y}{dx^2} + a_1(x)\frac{dy}{dx} + a_0(x)\,y(x) = f(x). \tag{10.197}$$

Now, if we can solve for the complementary function (i.e. solve the equation for zero RHS), the Green's function can be obtained immediately. This is because a delta function vanishes almost everywhere. So if we now put $f(x) \to \delta(x - z)$, then the solution we seek is a solution of the homogeneous equation everywhere except at $x = z$.
We split the range into two, $x < z$, and $x > z$. In each part, the r.h.s. is zero, so we need to solve the homogeneous equation, subject to the boundary conditions at the edges. At $x = z$, we have to be careful to match the solutions together. The $\delta$ function is infinite here, which tells us that the first derivative must be discontinuous, so when we take the second derivative, it diverges. The first derivative must change discontinuously by 1. To see this, integrate the equation between $z - \epsilon$ and $z + \epsilon$, and let $\epsilon \to 0$:

$$\int_{z-\epsilon}^{z+\epsilon}\frac{d^2y}{dx^2}\,dx + \int_{z-\epsilon}^{z+\epsilon} a_1(x)\frac{dy}{dx}\,dx + \int_{z-\epsilon}^{z+\epsilon} a_0(x)\,y\,dx = \int_{z-\epsilon}^{z+\epsilon}\delta(x - z)\,dx. \tag{10.198}$$

The second and third terms vanish as $\epsilon \to 0$, as the integrands are finite, and the r.h.s. integrates to 1, so

$$\left.\frac{dy}{dx}\right|_{z+} - \left.\frac{dy}{dx}\right|_{z-} = 1. \tag{10.199}$$

Note that the boundary conditions are important. If $y = 0$ on the boundaries, then we can add up the Green's function solutions with the appropriate weight. If the Green's function is zero on the boundary, then any integral of $G$ will also be zero on the boundary and satisfy the conditions.
10.3.1

Example

Consider the differential equation

$$\frac{d^2y}{dx^2} + y = x \tag{10.200}$$

with boundary conditions $y(0) = y(\pi/2) = 0$.

The Green's function is continuous at $x = z$, has a discontinuous derivative there, and satisfies the same boundary conditions as $y$. From the properties of the Dirac delta function, except at $x = z$, the Green's function satisfies

$$\frac{d^2G(x, z)}{dx^2} + G(x, z) = 0. \tag{10.201}$$

(Strictly, we might want to make this a partial derivative, at fixed $z$. It is written this way so it looks like the equation for $y$). This is a harmonic equation, with solution

$$G(x, z) = \begin{cases} A(z)\sin x + B(z)\cos x & x < z \\ C(z)\sin x + D(z)\cos x & x > z. \end{cases} \tag{10.202}$$

We now have to adjust the four unknowns $A, B, C, D$ to match the boundary conditions.
The boundary condition $y = 0$ at $x = 0$ means that $B(z) = 0$, and $y = 0$ at $x = \pi/2$ implies that $C(z) = 0$. Hence

$$G(x, z) = \begin{cases} A(z)\sin x & x < z \\ D(z)\cos x & x > z. \end{cases} \tag{10.203}$$
Continuity of $G$ implies that $A(z)\sin z = D(z)\cos z$ and a discontinuity of 1 in the derivative implies that $-D(z)\sin z - A(z)\cos z = 1$. We have 2 equations in two unknowns, so we can eliminate $A$ or $D$:

$$-A(z)\frac{\sin^2 z}{\cos z} - A(z)\cos z = 1 \;\Rightarrow\; A(z) = \frac{-\cos z}{\sin^2 z + \cos^2 z} = -\cos z \tag{10.204}$$

and consequently $D(z) = -\sin z$. Hence the Green's function is

$$G(x, z) = \begin{cases} -\cos z\,\sin x & x < z \\ -\sin z\,\cos x & x > z. \end{cases} \tag{10.205}$$

The solution for a driving term $x$ on the r.h.s. is therefore (be careful here with which solution for $G$ to use: the first integral on the r.h.s. has $x > z$)

$$y(x) = \int_0^{\pi/2} z\,G(x, z)\,dz = -\cos x\int_0^{x} z\sin z\,dz - \sin x\int_x^{\pi/2} z\cos z\,dz. \tag{10.206}$$

Integrating by parts,

$$y(x) = (x\cos x - \sin x)\cos x - \tfrac{1}{2}(\pi - 2\cos x - 2x\sin x)\sin x = x - \frac{\pi}{2}\sin x. \tag{10.207}$$
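The result (10.207) can be checked numerically by integrating the Green's function of Eq. (10.205) against the driving term. This is a minimal sketch: the grid resolution and the hand-written trapezoidal rule are our own choices for illustration.

```python
import numpy as np

def green(x, z):
    """G(x, z) for y'' + y = delta(x - z) with y(0) = y(pi/2) = 0."""
    return np.where(x < z, -np.cos(z) * np.sin(x), -np.sin(z) * np.cos(x))

def trapezoid(values, grid):
    # Simple trapezoidal rule, written out to avoid depending on a NumPy version.
    return np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(grid))

z = np.linspace(0.0, np.pi / 2, 20001)      # integration grid (assumed resolution)
x_test = np.linspace(0.0, np.pi / 2, 11)

y_numeric = np.array([trapezoid(z * green(x, z), z) for x in x_test])
y_exact = x_test - 0.5 * np.pi * np.sin(x_test)
max_error = np.max(np.abs(y_numeric - y_exact))
print(max_error)
```

The numerical integral agrees with $x - (\pi/2)\sin x$ to within the accuracy of the quadrature, and vanishes at both boundaries as required.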

10.4

Summary

So to recap, the procedure is to find the Green's function by

• replacing the driving term by a Dirac delta function
• solving the homogeneous equation either side of the impulse, with the same boundary conditions, e.g. $G = 0$ at two boundaries, or $G = \partial G/\partial x = 0$ at one boundary. Note the form of the solution will be the same for (e.g.) $x < z$ and $x > z$, but the coefficients (strictly, they are not constant coefficients, but rather functions of $z$) will differ either side of $x = z$.
• matching the solutions at $x = z$ (so $G(x, z)$ is continuous).
• introducing a discontinuity of 1 in the first derivative $\partial G(x, z)/\partial x$ at $x = z$
• integrating the Green's function with the actual driving term to get the full solution.

FOURIER ANALYSIS: LECTURE 16


10.5

Example with boundary conditions at the same place/time

A mouse wishes to steal a piece of cheese from an ice-rink at the Winter Olympics. The cheese, which has a mass of 1 kg, is conveniently sitting on a frictionless luge of negligible mass. The mouse attaches a massless string and pulls, starting at $t = 0$. Unfortunately the mouse gets tired very quickly, so the force exerted declines rapidly: $f(t) = e^{-t}$ (SI units). Find, using Green's functions, the resulting motion of the cheese, $z(t)$, and its terminal speed.
The equation to be solved is

$$\frac{d^2z}{dt^2} = e^{-t}. \tag{10.208}$$

Since the cheese is sitting on the luge, we take the boundary conditions to be

$$z = 0; \qquad \frac{dz}{dt} = 0 \quad \text{at } t = 0. \tag{10.209}$$

We can, of course, solve this equation very easily simply by integrating twice, and applying the boundary conditions. As an exercise, we are going to solve it with Green's functions. This also makes the point that there is often more than one way to solve a problem.
For an impulse at $T$, the Green's function satisfies

$$\frac{\partial^2 G(t, T)}{\partial t^2} = \delta(t - T) \tag{10.210}$$

so for $t < T$ and $t > T$ the equation to be solved is $\partial^2 G/\partial t^2 = 0$, which has solution

$$G(t, T) = \begin{cases} A(T)\,t + B(T) & t < T \\ C(T)\,t + D(T) & t > T \end{cases} \tag{10.211}$$

Now, we apply the same boundary conditions. $G(t = 0) = 0 \Rightarrow B = 0$. The derivative $G'(t = 0) = 0 \Rightarrow A = 0$, so $G(t, T) = 0$ for $t < T$. This makes sense when one thinks about it. We are applying an impulse at time $T$, so until the impulse is delivered, the cheese remains at rest.
Continuity of $G$ at $t = T$ implies

$$C(T)\,T + D(T) = 0, \tag{10.212}$$

and a discontinuity of 1 in the derivative at $T$ implies that

$$C(T) - A(T) = 1. \tag{10.213}$$

Hence $C = 1$ and $D = -T$ and the Green's function is

$$G(t, T) = \begin{cases} 0 & t < T \\ t - T & t > T \end{cases} \tag{10.214}$$

The full solution is then

$$z(t) = \int_0^{\infty} G(t, T)\,f(T)\,dT \tag{10.215}$$

where $f(T) = e^{-T}$. Hence

$$z(t) = \int_0^{t} G(t, T)\,f(T)\,dT + \int_t^{\infty} G(t, T)\,f(T)\,dT. \tag{10.216}$$

The second integral vanishes, because $G = 0$ for $t < T$, so

$$z(t) = \int_0^t (t - T)\,e^{-T}\,dT = t\left[-e^{-T}\right]_0^t - \left\{\left[-T e^{-T}\right]_0^t + \int_0^t e^{-T}\,dT\right\} \tag{10.217}$$

which gives the motion as

$$z(t) = t - 1 + e^{-t}. \tag{10.218}$$

We can check that $z(0) = 0$, that $z'(0) = 0$, and that $z''(t) = e^{-t}$. The final speed is $z'(t \to \infty) = 1$, so the cheese moves at 1 m s$^{-1}$ at late times. Note that this technique can solve for an arbitrary driving term, obtaining the solution as an integral. This can be very useful, even if the integral cannot be done analytically, as a numerical solution may still be useful.
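As an example of such a numerical solution, the sketch below evaluates the Green's function integral for the cheese problem on a grid and compares it with Eq. (10.218). The grid, the cut-off of the upper limit at $T = 20$ and the helper names are assumptions made for illustration; any other force could be substituted for `force`.

```python
import numpy as np

def green(t, T):
    """Causal Green's function of d^2 z/dt^2: zero before the impulse, t - T after."""
    return np.where(t > T, t - T, 0.0)

def force(T):
    return np.exp(-T)

def trapezoid(values, grid):
    return np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(grid))

T = np.linspace(0.0, 20.0, 200001)       # impulse times; upper limit stands in for infinity
t_test = np.array([0.0, 0.5, 1.0, 2.0, 5.0, 10.0])

z_numeric = np.array([trapezoid(green(t, T) * force(T), T) for t in t_test])
z_exact = t_test - 1.0 + np.exp(-t_test)
max_error = np.max(np.abs(z_numeric - z_exact))
print(max_error)
```

Because $G$ vanishes for $T > t$, the part of the grid beyond $t$ contributes nothing, which is the numerical counterpart of dropping the second integral in Eq. (10.216).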

10.6

Causality

The above examples showed how the boundary conditions influence the Green's function. If we are thinking about differential equations in time, there will often be a different boundary condition, which is set by causality. For example, write the first equation we considered in a form that emphasises that it is a harmonic oscillator:

$$\ddot G(t, T) + \omega_0^2\,G(t, T) = \delta(t - T). \tag{10.219}$$

Since the system clearly cannot respond before it is hit, the boundary condition for such applications would be expected on physical grounds to be

$$G(t, T) = 0 \quad (t < T). \tag{10.220}$$

Whether or not such behaviour is achieved depends on the boundary conditions. Our first example did not satisfy this criterion, because the boundary conditions were of the form $y(a) = y(b) = 0$.
This clearly presents a problem if $T$ is between the points $a$ and $b$: it's as if the system knows when we will strike the bell, or how hard, in order that the response at some future time $t = b$ will vanish. In contrast, our second example with boundary conditions at a single point ended up yielding causal behaviour automatically, without having to put it in by hand.
The causal Green's function is particularly easy to find, because we only need to think about the behaviour at $t > T$. Here, the solution of the homogeneous equation is $A\sin\omega_0 t + B\cos\omega_0 t$, which must vanish at $t = T$. Therefore it can be written as $G(t, T) = A\sin[\omega_0(t - T)]$. The derivative must be unity at $t = T$, so the causal Green's function for the undamped harmonic oscillator is

$$G(t, T) = \frac{1}{\omega_0}\sin[\omega_0(t - T)]. \tag{10.221}$$

10.6.1

Comparison with direct Fourier solution

As a further example, we can revisit the differential equation with the opposite sign from the oscillator:

$$\frac{d^2z}{dt^2} - \omega_0^2\,z = f(t). \tag{10.222}$$

We solved this above by taking the Fourier transform of each side, to obtain

$$z(t) = -\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{\tilde f(\omega)}{\omega_0^2 + \omega^2}\,e^{i\omega t}\,d\omega. \tag{10.223}$$

We then showed that this is in the form of a convolution:

$$z(t) = -\frac{1}{2\omega_0}\int_{-\infty}^{\infty} f(T)\,e^{-\omega_0|t - T|}\,dT. \tag{10.224}$$

This looks rather similar to the solution in terms of the Green's function, so can we say that $G(t, T) = -\exp(-\omega_0|t - T|)/2\omega_0$? Direct differentiation gives $\dot G = \pm\exp(-\omega_0|t - T|)/2$, with the $+$ sign for $t > T$ and the $-$ sign for $t < T$, so it has the correct jump in derivative and hence satisfies the equation for the Green's function.
But this is a rather strange expression, since it is symmetric in time: a response at $t$ can precede $T$. The problem is that we have imposed no boundary conditions. If we insist on causality, then $G = 0$ for $t < T$ and $G = A\exp[\omega_0(t - T)] + B\exp[-\omega_0(t - T)]$ for $t > T$. Clearly $A = -B$, so $G = 2A\sinh[\omega_0(t - T)]$. This now looks similar to the harmonic oscillator, and a unit step in $\dot G$ at $t = T$ requires

$$G(t, T) = \frac{1}{\omega_0}\sinh[\omega_0(t - T)]. \tag{10.225}$$

So the correct solution for this problem will be

$$z(t) = \frac{1}{\omega_0}\int_{-\infty}^{t} f(T)\sinh[\omega_0(t - T)]\,dT. \tag{10.226}$$

Note the changed upper limit in the integral: forces applied in the future cannot affect the solution at time $t$. We see that the response, $z(t)$, will diverge as $t$ increases, which is physically reasonable: the system has homogeneous modes that either grow or decline exponentially with time. Special care with boundary conditions would be needed if we wanted to excite only the decaying solution; in other words, this system is unstable.
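Because both causal Green's functions above depend only on $t - T$, the solution integral is a convolution and can be approximated on a uniform grid with `np.convolve`. The sketch below does this for a unit step force switched on at $t = 0$, an assumed test case for which the integrals can be done by hand: $(1 - \cos\omega_0 t)/\omega_0^2$ for the oscillator and $(\cosh\omega_0 t - 1)/\omega_0^2$ for the unstable system. The values of `omega0` and `dt` are illustrative.

```python
import numpy as np

omega0 = 2.0                        # illustrative value
dt = 1.0e-3
t = np.arange(0.0, 3.0, dt)
f = np.ones_like(t)                 # unit step force switched on at t = 0

# Causal Green's functions as functions of t - T (both vanish for t < T).
G_oscillator = np.sin(omega0 * t) / omega0
G_unstable = np.sinh(omega0 * t) / omega0

# Discrete version of z(t) = int_0^t G(t - T) f(T) dT (simple Riemann sum).
z_oscillator = np.convolve(G_oscillator, f)[: len(t)] * dt
z_unstable = np.convolve(G_unstable, f)[: len(t)] * dt

exact_oscillator = (1.0 - np.cos(omega0 * t)) / omega0**2
exact_unstable = (np.cosh(omega0 * t) - 1.0) / omega0**2

err_oscillator = np.max(np.abs(z_oscillator - exact_oscillator))
rel_err_unstable = np.max(np.abs(z_unstable - exact_unstable)) / exact_unstable[-1]
```

The Riemann sum is only first-order accurate in `dt`, so the agreement is at the level of `dt` rather than rounding error. The oscillator response stays bounded while the $\sinh$ response grows exponentially, which is the instability noted in the text.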

FOURIER ANALYSIS: LECTURE 17

11

Partial Differential Equations and Fourier methods

The final element of this course is a look at partial differential equations from a Fourier point of
view. For those students taking the 20-point course, this will involve a small amount of overlap
with the lectures on PDEs and special functions.

11.1

Examples of important PDEs

PDEs are very common and important in physics. Here, we will illustrate the methods under study
with three key examples:
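$$\text{The wave equation}: \quad \nabla^2\psi = \frac{1}{c^2}\frac{\partial^2\psi}{\partial t^2} \tag{11.227}$$

$$\text{The diffusion equation}: \quad \nabla^2\psi = \frac{1}{D}\frac{\partial\psi}{\partial t} \tag{11.228}$$

$$\text{Schrödinger's equation}: \quad -\frac{\hbar^2}{2m}\nabla^2\psi + V\psi = i\hbar\frac{\partial\psi}{\partial t} \tag{11.229}$$

These are all examples in 3D; for simplicity, we will often consider the 1D analogue, in which $\psi(\mathbf{r}, t)$ depends only on $x$ and $t$, so that $\nabla^2$ is replaced by $\partial^2/\partial x^2$.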
11.1.1

The wave equation

A simple way to see the form of the wave equation is to consider a single plane wave, represented by $\psi = \exp[i(\mathbf{k}\cdot\mathbf{x} - \omega t)]$. We have $\nabla^2\psi = -k^2\psi$, and $(\partial^2/\partial t^2)\psi = -\omega^2\psi$. Since $\omega/|\mathbf{k}| = c$, this one mode satisfies the wave equation. But a general $\psi$ can be created by superposition of different waves (as in Fourier analysis), so $\psi$ also satisfies the equation. Exactly the same reasoning is used in deriving Schrödinger's equation. Here we use de Broglie's relations for momentum and energy:

$$\mathbf{p} = \hbar\mathbf{k}; \qquad E = \hbar\omega. \tag{11.230}$$

Then the nonrelativistic energy-momentum relation, $E = p^2/2m + V$, becomes $\hbar\omega = (\hbar^2/2m)k^2 + V$. A single wave therefore obeys Schrödinger's equation, and by superposition and completeness, so does a general $\psi$.
11.1.2

The diffusion equation

The diffusion equation is important because it describes how heat and particles get transported, (typically) under conduction. For completeness, here is a derivation, although this is non-examinable in this course.

The heat flux density (energy per second crossing unit area of surface) is assumed to be proportional to the gradient of the temperature (which we will call $u$, as $T$ is conventionally used for the separated function $T(t)$):

$$f(x, t) = -\lambda\frac{\partial u(x, t)}{\partial x}, \tag{11.231}$$

where $\lambda$ is a constant (the thermal conductivity). The minus sign is there because if the temperature gradient is positive, then heat flows towards negative $x$.
Now consider a thin slice of width $\delta x$ from $x$ to $x + \delta x$: there is energy flowing in at the left which is (per unit area) $f(x)$, and energy is flowing out to the right at a rate $f(x + \delta x)$. So in time $\delta t$, for unit area of surface, the energy content increases by an amount

$$\delta Q = \delta t\,[f(x) - f(x + \delta x)] \simeq -\delta t\,\delta x\,(\partial f/\partial x), \tag{11.232}$$

where we have made a Taylor expansion $f(x + \delta x) = f(x) + \delta x\,(\partial f/\partial x) + O([\delta x]^2)$. This heats the gas. The temperature rise is proportional to the energy added per unit volume,

$$\delta u = c\,\frac{\delta Q}{V} \tag{11.233}$$

where the constant of proportionality $c$ here is called the specific heat capacity. $V = \delta x$ is the volume of the slice (remember it has unit area cross-section). Dividing by $\delta t$ then gives the diffusion equation:

$$\delta u = -c\,\frac{\partial f}{\partial x}\,\delta x\,\delta t\,\frac{1}{V} = c\lambda\frac{\partial^2 u}{\partial x^2}\,\delta t \;\Rightarrow\; \frac{\partial u}{\partial t} = \kappa\frac{\partial^2 u}{\partial x^2} \tag{11.234}$$

(for a constant $\kappa = \lambda c$). We often write $\kappa$ as $D$, the diffusion coefficient. This same equation applies to the evolution of concentrations of particle number density.
In 3D, this generalises to

$$\frac{\partial u}{\partial t} = \kappa\nabla^2 u. \tag{11.235}$$

The heat transport relation $f = -\lambda(\partial u/\partial x)$ takes a vector form $\mathbf{f} = -\lambda\nabla u$, which is just a flow in the direction of maximum temperature gradient, but otherwise identical to the 1D case. When there is a flux-density vector in 3D, the corresponding density, $\rho$, obeys the continuity equation, $\nabla\cdot\mathbf{f} = -\partial\rho/\partial t$. Since the change in temperature is $c$ times the change in heat density, this gives the above 3D heat equation.

11.2

Solving PDEs with Fourier methods

The Fourier transform is one example of an integral transform: a general technique for solving
differential equations.
Transformation of a PDE (e.g. from $x$ to $k$) often leads to simpler equations (algebraic or ODE typically) for the integral transform of the unknown function. This is because spatial derivatives turn into factors of $ik$. Similar behaviour is seen in higher numbers of dimensions. When $\psi$ is a single Fourier mode

$$\text{1D}: \quad \frac{\partial\psi}{\partial x} \to ik\psi; \qquad \frac{\partial^2\psi}{\partial x^2} \to -k^2\psi \tag{11.236}$$

$$\text{3D}: \quad \nabla\psi \to i\mathbf{k}\psi; \qquad \nabla^2\psi \to -k^2\psi. \tag{11.237}$$

These simpler equations are then solved and the answer transformed back to give the required solution. This is just the method we used to solve ordinary differential equations, but with the difference that there is still a differential equation to solve in the untransformed variable. Note that we can choose whether to Fourier transform from $x$ to $k$, resulting in equations that are still functions of $t$, or we can transform from $t$ to $\omega$, or we can transform both. Both routes should work, but normally we would choose to transform away the higher derivative (e.g. the spatial derivative, for the diffusion equation).
The FT method works best for infinite systems. In subsequent lectures, we will see how Fourier
series are better able to incorporate boundary conditions.
11.2.1

Example: the diffusion equation

As an example, we'll solve the diffusion equation for an infinite system.

$$\frac{\partial^2 n(x, t)}{\partial x^2} = \frac{1}{D}\frac{\partial n(x, t)}{\partial t}. \tag{11.238}$$

The diffusion coefficient $D$ is assumed to be independent of position. This is important, otherwise the FT method is not so useful. The procedure is as follows:
• FT each side:
  • Multiply both sides by $e^{-ikx}$
  • Integrate over the full range $-\infty < x < \infty$.
  • Write the (spatial) FT of $n(x, t)$ as $\tilde n(k, t)$
  • Pull the temporal derivative outside the integral over $x$
• Use Eqn. (3.33) with $p = 2$ to get:

$$(ik)^2\,\tilde n(k, t) = \frac{1}{D}\frac{\partial\tilde n(k, t)}{\partial t} \tag{11.239}$$

• This is true for each value of $k$ ($k$ is a continuous variable). This is a partial differential equation, but let us for now fix $k$, so we have a simple ODE involving a time derivative, and we note that $d(\ln\tilde n) = d\tilde n/\tilde n$, so we need to solve

$$\frac{d\ln\tilde n}{dt} = -k^2 D. \tag{11.240}$$

• Its solution is $\ln\tilde n(k, t) = -k^2 Dt + \text{constant}$. Note that the constant can be different for different values of $k$, so the general solution is

$$\tilde n(k, t) = \tilde n_0(k)\,e^{-Dk^2 t}, \tag{11.241}$$

where $\tilde n_0(k) \equiv \tilde n(k, t = 0)$, to be determined by the initial conditions.
• The answer (i.e. general solution) comes via an inverse FT:

$$n(x, t) = \int_{-\infty}^{\infty}\frac{dk}{2\pi}\,\tilde n(k, t)\,e^{ikx} = \int_{-\infty}^{\infty}\frac{dk}{2\pi}\,\tilde n_0(k)\,e^{ikx - Dk^2 t}. \tag{11.242}$$

Figure 11.18: Variation of concentration with distance x at various diffusion times.
SPECIFIC EXAMPLE: We add a small drop of ink to a large tank of water (assumed 1-dimensional). We want to find the density of ink as a function of space and time, $n(x, t)$.
Initially, all the ink ($S$ particles) is concentrated at one point (call it the origin):

$$n(x, t = 0) = S\,\delta(x) \tag{11.243}$$

implying (using the sifting property of the Dirac delta function),

$$\tilde n_0(k) \equiv \tilde n(k, 0) = \int_{-\infty}^{\infty} dx\,n(x, t = 0)\,e^{-ikx} = \int_{-\infty}^{\infty} dx\,S\,\delta(x)\,e^{-ikx} = S. \tag{11.244}$$

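Putting this into Eqn. (11.242) we get:

$$\begin{aligned}
n(x,t) &= \int_{-\infty}^{\infty} \frac{dk}{2\pi}\, \tilde n(k,t)\, e^{ikx} = \int_{-\infty}^{\infty} \frac{dk}{2\pi}\, \tilde n_0(k)\, e^{ikx - Dk^2 t} \\
&= \int_{-\infty}^{\infty} \frac{dk}{2\pi}\, S\, e^{ikx - Dk^2 t} = \frac{S}{\sqrt{2\pi}\,\sqrt{2Dt}}\; e^{-x^2/(4Dt)} .
\end{aligned} \tag{11.245}$$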

(we used the 'completing the square' trick that we previously used to FT the Gaussian). Compare
this with the usual expression for a Gaussian,

$$\frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{x^2}{2\sigma^2}\right) \tag{11.246}$$

and identify the width $\sigma$ with $\sqrt{2Dt}$.
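For reference, the completing-the-square step behind Eqn. (11.245) can be written out explicitly. This is a sketch of the standard manipulation, assuming the Gaussian integral $\int_{-\infty}^{\infty} e^{-a k^2}\,dk = \sqrt{\pi/a}$ and that the shift of the integration variable into the complex plane is harmless (the shifted variable $k'$ is introduced here for illustration):

```latex
\begin{align*}
ikx - Dk^2 t &= -Dt\left(k - \frac{ix}{2Dt}\right)^2 - \frac{x^2}{4Dt},\\
\int_{-\infty}^{\infty}\frac{dk}{2\pi}\, S\, e^{ikx - Dk^2 t}
 &= \frac{S}{2\pi}\, e^{-x^2/(4Dt)} \int_{-\infty}^{\infty} e^{-Dt\,k'^2}\, dk'
 \qquad \left(k' = k - \frac{ix}{2Dt}\right)\\
 &= \frac{S}{2\pi}\sqrt{\frac{\pi}{Dt}}\; e^{-x^2/(4Dt)}
 = \frac{S}{\sqrt{4\pi Dt}}\; e^{-x^2/(4Dt)} .
\end{align*}
```

Since $\sqrt{4\pi Dt} = \sqrt{2\pi}\,\sqrt{2Dt}$, this agrees with the Gaussian form of width $\sqrt{2Dt}$.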


So, the ink spreads out with concentration described by a normalized Gaussian centred on the origin
with width $\sigma = \sqrt{2Dt}$. The important features are:
• normalized: there are always S particles in total at every value of t
• centred on the origin: where we placed the initial drop
• width $\sigma = \sqrt{2Dt}$: gets broader as time increases
  - $\sigma \propto \sqrt{t}$: characteristic of random walk ("stochastic") process
  - $\sigma \propto \sqrt{D}$: if we increase the diffusion constant D, the ink spreads out more quickly.

The solution n(x, t) is sketched for various t in Fig. 11.18.
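As a numerical cross-check of the ink-drop result, the inverse FT can be evaluated by brute-force quadrature and compared with the Gaussian of width $\sqrt{2Dt}$. The sketch below is illustrative only: the function names, the cutoff `k_max` and the step count are assumptions made here, and the cutoff must be large enough that $e^{-Dk^2t}$ is negligible at $|k| = k_{\max}$.

```python
import cmath
import math


def ink_density_numeric(x, t, D=1.0, S=1.0, k_max=12.0, n_steps=4000):
    """Approximate n(x,t) = integral dk/(2 pi) S exp(ikx - D k^2 t).

    Uses the midpoint rule on [-k_max, k_max]; valid only when
    exp(-D k_max^2 t) is negligible (so not for very small t).
    """
    dk = 2.0 * k_max / n_steps
    total = 0.0 + 0.0j
    for j in range(n_steps):
        k = -k_max + (j + 0.5) * dk
        total += S * cmath.exp(1j * k * x - D * k * k * t)
    # The imaginary part cancels by symmetry in k; keep the real part.
    return (total * dk / (2.0 * math.pi)).real


def ink_density_exact(x, t, D=1.0, S=1.0):
    """Closed form: S times a normalized Gaussian of width sigma = sqrt(2 D t)."""
    sigma = math.sqrt(2.0 * D * t)
    return S / (math.sqrt(2.0 * math.pi) * sigma) * math.exp(-x * x / (2.0 * sigma * sigma))
```

Calling both functions at the same (x, t), for example x = 0.7 and t = 0.5, should give values that agree to many decimal places.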

FOURIER ANALYSIS: LECTURE 18


11.3

Fourier solution of the wave equation

One is used to thinking of solutions to the wave equation being sinusoidal, but they don't have to
be. We can use Fourier Transforms to show this rather elegantly, applying a partial FT ($x \to k$,
but keeping t as is).
The wave equation is

$$c^2\,\frac{\partial^2 u(x,t)}{\partial x^2} = \frac{\partial^2 u(x,t)}{\partial t^2} \tag{11.247}$$
where c is the wave speed. We Fourier Transform w.r.t. x to get $\tilde u(k,t)$ (note the arguments),
remembering that the FT of $\partial^2/\partial x^2$ is $-k^2$:

$$-c^2 k^2\, \tilde u(k,t) = \frac{\partial^2 \tilde u(k,t)}{\partial t^2}. \tag{11.248}$$

This is a harmonic equation for $\tilde u(k,t)$, with solution

$$\tilde u(k,t) = A e^{-ikct} + B e^{ikct} \tag{11.249}$$

However, because the derivatives are partial derivatives, the 'constants' A and B can be functions
of k. Let us write these arbitrary functions as $\tilde f(k)$ and $\tilde g(k)$, i.e.

$$\tilde u(k,t) = \tilde f(k)\, e^{-ikct} + \tilde g(k)\, e^{ikct}. \tag{11.250}$$

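We now invert the transform, to give

$$\begin{aligned}
u(x,t) &= \int_{-\infty}^{\infty} \frac{dk}{2\pi} \left[ \tilde f(k)\, e^{-ikct} + \tilde g(k)\, e^{ikct} \right] e^{ikx} \\
&= \int_{-\infty}^{\infty} \frac{dk}{2\pi}\, \tilde f(k)\, e^{ik(x-ct)} + \int_{-\infty}^{\infty} \frac{dk}{2\pi}\, \tilde g(k)\, e^{ik(x+ct)} \\
&= f(x - ct) + g(x + ct)
\end{aligned}$$

and f and g are arbitrary functions.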
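To see concretely that non-sinusoidal shapes work, one can pick two arbitrary smooth profiles and check by finite differences that f(x - ct) + g(x + ct) satisfies the wave equation. The sketch below is illustrative: the particular pulse shapes, the wave speed `C` and the step `h` are assumptions chosen here, not part of the notes.

```python
import math

C = 2.0  # wave speed (arbitrary illustrative value)


def f(s):
    """Right-moving profile: a Gaussian pulse, deliberately non-sinusoidal."""
    return math.exp(-s * s)


def g(s):
    """Left-moving profile: a smooth, non-periodic Lorentzian bump."""
    return 1.0 / (1.0 + s * s)


def u(x, t):
    """d'Alembert form of the solution."""
    return f(x - C * t) + g(x + C * t)


def wave_residual(x, t, h=1e-3):
    """Return c^2 u_xx - u_tt using central second differences.

    For an exact solution this is zero up to O(h^2) truncation error.
    """
    u_xx = (u(x + h, t) - 2.0 * u(x, t) + u(x - h, t)) / (h * h)
    u_tt = (u(x, t + h) - 2.0 * u(x, t) + u(x, t - h)) / (h * h)
    return C * C * u_xx - u_tt
```

The residual should be small compared with the individual terms at any point (x, t); replacing `f` or `g` with any other twice-differentiable function should leave that conclusion unchanged.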

11.4

Fourier solution of the Schrödinger equation in 2D

Consider the time-dependent Schrödinger equation in 2D, for a particle trapped in a (zero) potential
2D square well with infinite potentials on walls at x = 0, L, y = 0, L:

$$-\frac{\hbar^2}{2m}\nabla^2\psi(\mathbf{x},t) = i\hbar\,\frac{\partial\psi(\mathbf{x},t)}{\partial t}. \tag{11.251}$$

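For example, let us perform a FT with respect to $\mathbf{x}$. $\nabla^2 \to -\mathbf{k}\cdot\mathbf{k} = -k^2 = -(k_x^2 + k_y^2)$, so

$$\frac{\hbar^2 k^2}{2m}\,\tilde\psi(\mathbf{k},t) = i\hbar\,\frac{\partial\tilde\psi(\mathbf{k},t)}{\partial t}. \tag{11.252}$$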

We can integrate this with an integrating factor:

$$\tilde\psi(\mathbf{k},t) = \tilde\psi(\mathbf{k},0)\, e^{-i\frac{\hbar k^2}{2m} t}. \tag{11.253}$$

The time-dependence of the wave is $e^{-i\omega t}$, or, in terms of energy $E = \hbar\omega$, $e^{-iEt/\hbar}$, where

$$E = \frac{\hbar^2 k^2}{2m}. \tag{11.254}$$

If the particle is trapped in the box, then the wavefunction must be zero on the boundaries, so
the wavelength in the x direction must be $\lambda_x = 2L, L, 2L/3, \ldots$ i.e. $2L/q$ for $q = 1, 2, 3, \ldots$, or
wavenumbers of $k_x = 2\pi/\lambda_x = q\pi/L$. Similarly, $k_y = r\pi/L$ for $r = 1, 2, \ldots$.
So the wavefunction is a superposition of modes of the form

$$\psi(\mathbf{x},t) = A \sin\left(\frac{q\pi x}{L}\right) \sin\left(\frac{r\pi y}{L}\right) e^{-iEt/\hbar} \tag{11.255}$$

for integers q, r. A is a normalization constant to ensure that $\iint |\psi|^2\, dx\, dy = 1$.

Each mode has an energy

$$E = \frac{\hbar^2 k^2}{2m} = \frac{\hbar^2\pi^2 (q^2 + r^2)}{2mL^2}. \tag{11.256}$$

For a square well, the energy levels are degenerate: different combinations of q and r give the same
energy level.
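The degeneracy can be made explicit by listing which mode pairs (q, r) share the same value of $q^2 + r^2$, i.e. the same energy in units of $\hbar^2\pi^2/(2mL^2)$. The helper below is a small illustrative sketch; the function name and the cutoff `max_index` are assumptions introduced here.

```python
from collections import defaultdict


def levels(max_index):
    """Group modes (q, r), with 1 <= q, r <= max_index, by q^2 + r^2.

    Keys are energies in units of hbar^2 pi^2 / (2 m L^2). Only levels with
    q^2 + r^2 <= max_index^2 + 1 are guaranteed to list every mode.
    """
    groups = defaultdict(list)
    for q in range(1, max_index + 1):
        for r in range(1, max_index + 1):
            groups[q * q + r * r].append((q, r))
    return dict(sorted(groups.items()))
```

For instance, `levels(8)[5]` contains the pair of modes (1, 2) and (2, 1), which are related by swapping x and y, while `levels(8)[50]` also contains (5, 5) alongside (1, 7) and (7, 1), a degeneracy that does not come from that swap alone.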

12

Separation of Variables

We now contrast the approach of Fourier transforming the equations with another standard technique. If we have a partial differential equation for a function which depends on several variables,
e.g. u(x, y, z, t), then we can attempt to find a solution which is separable in the variables:
u(x, y, z, t) = X(x)Y (y)Z(z)T (t)

(12.257)

where X, Y, Z, T are some functions of their arguments, and we try to work out what these functions
are. Examples of separable functions are $xyz^2 e^{-t}$, $x^2 \sin(y)(1 + z^2)t$, but not $(x^2 + y^2)zt$. Not all
PDEs have separable solutions, but many physically important examples do.
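Let us consider the one-dimensional wave equation (so we have only x and t as variables) as an
example:

$$\frac{\partial^2 u}{\partial x^2} = \frac{1}{c^2}\frac{\partial^2 u}{\partial t^2}. \tag{12.258}$$

We try a solution u(x, t) = X(x)T(t):

$$\frac{\partial^2 (XT)}{\partial x^2} = \frac{1}{c^2}\frac{\partial^2 (XT)}{\partial t^2}. \tag{12.259}$$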

Now notice that on the left hand side, T is not a function of x, so can come outside the derivative,
and also, since X is a function of x only, the partial derivative with respect to x is the same as the
ordinary derivative. A similar argument holds on the right hand side, where X is not a function of
t, so

$$T\,\frac{d^2 X}{dx^2} = \frac{X}{c^2}\frac{d^2 T}{dt^2}. \tag{12.260}$$
The trick here is to divide by XT, to get

$$\frac{1}{X}\frac{d^2 X}{dx^2} = \frac{1}{c^2 T}\frac{d^2 T}{dt^2}. \tag{12.261}$$

Now, the left hand side is not a function of t (only of x), whereas the right hand side is not a
function of x (only of t). The only way these two independent quantities can be equal for all t and
x is if they are both constant. The constant is called the separation constant, and let us call it $-k^2$.
(Note that we aren't sure at this point that the separation constant is negative; if it turns out it is
positive, we'll come back and call it $k^2$, or, alternatively, let k be imaginary). Hence the equation
for X is (multiplying by X)

$$\frac{d^2 X}{dx^2} = -k^2 X. \tag{12.262}$$
You know the solution to this:

$$X(x) = A \exp(ikx) + B \exp(-ikx) \tag{12.263}$$

for constants A and B (alternatively, we can write X as a sum of sines and cosines).
The equation for T is

$$\frac{1}{c^2}\frac{d^2 T}{dt^2} = -k^2 T \tag{12.264}$$

which has solution

$$T(t) = C \exp(i\omega t) + D \exp(-i\omega t) \tag{12.265}$$

where $\omega = ck$. If we take in particular B = C = 0 and A = D = 1, we have a solution

$$u(x,t) = \exp[i(kx - \omega t)] \tag{12.266}$$

which we recognise as a sinusoidal wave travelling in the +x direction. In general, we will get a
mixture of this and $\exp[i(kx + \omega t)]$, which is a sinusoidal wave travelling in the negative x direction.
We will also get the same exponentials with the opposite sign in the exponent. These could be
combined into

$$u(x,t) = A \sin[(kx - \omega t) + \alpha] + B \sin[(kx + \omega t) + \beta], \tag{12.267}$$

which is a mixture of waves travelling in the two directions, with different phases.
IMPORTANT: Notice that we can add together any number of solutions with different
values of the separation constant $-k^2$, and we will still satisfy the equation. This means
that the full solution can be a more complicated non-periodic function, just as Fourier transforms
allow us to express a general function as a superposition of periodic modes. In this case (as we saw
above), the general solution of the 1D wave equation is

$$u(x,t) = f(x - ct) + g(x + ct), \tag{12.268}$$

for any (twice-differentiable) functions f and g.


In general, to find which of the many solutions are possible in a given situation, we need to specify
the boundary conditions for the problem. An important criterion is that the boundary conditions
need to be homogeneous, i.e. in the form u = 0 for some combination of time and space (commonly
as a function of space at t = 0). This is because we want to build a general solution by superposition,
and sums of terms only keep the boundary condition unchanged if it is zero. If the condition is
u = const, we can convert it to u = 0 by subtracting the constant: u - const still solves the equation.

Figure 12.19: Contrasting the travelling-wave and standing-wave solutions to the wave equation. (Two surface plots over x and t.)
Here is another example. If we require that u is zero at two boundaries $x = (0, \pi)$, and that at
t = 0 the solution is a sine wave, u(x, 0) = sin(3x), starting from rest ($\partial u/\partial t = 0$ at t = 0), then the solution is u(x, t) = sin(3x) cos(3ct).
This is a standing wave, which does not propagate, just varies its amplitude with time.
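Note that we can write the standing wave solution as a superposition of waves travelling in opposite
directions (with $ck = \omega$):

$$\begin{aligned}
\sin(kx)\cos(\omega t) &= \frac{1}{2i}\left(e^{ikx} - e^{-ikx}\right) \frac{1}{2}\left(e^{i\omega t} + e^{-i\omega t}\right) \\
&= \frac{1}{4i}\left[ e^{i(kx+\omega t)} + e^{i(kx-\omega t)} - e^{-i(kx-\omega t)} - e^{-i(kx+\omega t)} \right] \\
&= \frac{1}{2}\left[ \sin(kx + \omega t) + \sin(kx - \omega t) \right].
\end{aligned} \tag{12.269}$$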
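The identity is easy to confirm numerically. The short sketch below compares the standing-wave form with the sum of two counter-propagating waves, and checks that the standing wave vanishes at the boundaries x = 0 and x = π for k = 3; the function names and default parameter values are illustrative assumptions.

```python
import math


def standing(x, t, k=3.0, c=1.0):
    """Standing wave sin(kx) cos(omega t), with omega = c k."""
    omega = c * k
    return math.sin(k * x) * math.cos(omega * t)


def travelling_pair(x, t, k=3.0, c=1.0):
    """Half the sum of a left-moving and a right-moving sine wave."""
    omega = c * k
    return 0.5 * (math.sin(k * x + omega * t) + math.sin(k * x - omega * t))
```

Evaluating both functions at the same (x, t) should give the same number up to rounding, and `standing` evaluated at x = 0 or x = π should be zero (to rounding) at every time.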

FOURIER ANALYSIS: LECTURE 19


12.1

Solving the diffusion equation via separation of variables

Let us now try to solve the diffusion equation in 1D:

$$\frac{\partial^2 u}{\partial x^2} = \frac{1}{\kappa}\frac{\partial u}{\partial t}. \tag{12.270}$$

We wish to find a solution with $u \to 0$ as $t \to \infty$. We try separating variables, u(x, t) = X(x)T(t),
to find

$$\frac{1}{X}\frac{d^2 X}{dx^2} = \frac{1}{\kappa T}\frac{dT}{dt} = -\lambda^2 \tag{12.271}$$

where we have written the separation constant as $-\lambda^2$. The equation for X is the same as we had
before. This time, let us write the solution as sines and cosines:

$$X(x) = A \sin(\lambda x) + B \cos(\lambda x). \tag{12.272}$$

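The equation for T is

$$\frac{dT}{dt} = -\lambda^2 \kappa T, \tag{12.273}$$

or

$$\frac{d \ln T}{dt} = -\lambda^2 \kappa \tag{12.274}$$

which has solution

$$T(t) = C \exp(-\lambda^2 \kappa t) \tag{12.275}$$

so we have a separable solution (absorbing the constant C into A and B)

$$u(x,t) = \left[ A \sin(\lambda x) + B \cos(\lambda x) \right] \exp(-\lambda^2 \kappa t). \tag{12.276}$$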

This tends to zero as $t \to \infty$ provided $\lambda^2 > 0$. If we had chosen $\lambda^2 < 0$, the spatial variation
would be a combination of sinh and cosh terms; because these functions do not oscillate, it is not
possible to satisfy homogeneous boundary conditions. Therefore we have to consider only $\lambda^2 > 0$.
This is physically reassuring, since the solution that diverges exponentially with time hardly feels
intuitively correct, and indeed it conflicts with our previous solution of the diffusion equation by
Fourier methods.
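The separable solution can be checked directly against the diffusion equation by finite differences. In this sketch the values of λ, κ, A and B are arbitrary illustrative choices, and the symbol κ for the diffusivity is taken as an assumption about the notation.

```python
import math

LAM = 2.0    # separation parameter lambda (illustrative value)
KAPPA = 0.5  # diffusivity kappa (illustrative value)
A, B = 1.0, 0.3


def u_sep(x, t):
    """Separable solution [A sin(lam x) + B cos(lam x)] exp(-lam^2 kappa t)."""
    return (A * math.sin(LAM * x) + B * math.cos(LAM * x)) * math.exp(-LAM * LAM * KAPPA * t)


def diffusion_residual(x, t, h=1e-3):
    """Return u_xx - (1/kappa) u_t from central differences; ~0 for a solution."""
    u_xx = (u_sep(x + h, t) - 2.0 * u_sep(x, t) + u_sep(x - h, t)) / (h * h)
    u_t = (u_sep(x, t + h) - u_sep(x, t - h)) / (2.0 * h)
    return u_xx - u_t / KAPPA
```

The residual should vanish to within the O(h²) truncation error, and `u_sep` should decay towards zero at late times, as required for $\lambda^2 > 0$.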
Note that we can add in solutions with different $\lambda$, and as usual the full solution will depend on
the initial conditions. This begins to look like Fourier analysis. We certainly need to add at least
a constant in order to make the solution physically sensible: as written, it allows for negative
temperatures. But we can always add a constant to any solution of the diffusion equation; so when
we speak of a boundary condition involving u = 0, this really means the temperature is at some
uniform average value, which we do not need to specify.

12.2

Separation of variables with several dimensions

As a last example of solving equations via separation of variables, let us consider a more complicated
situation, where we have 2 space dimensions, and time. Consider an infinite square column of side
L which is initially (at t = 0) at zero temperature, u(x, y, t = 0) = 0. We ignore z as by symmetry
there is no heat flowing along this direction. At t = 0 it is immersed in a heat bath at temperature
$T_0$. We need to solve the heat equation

$$\nabla^2 u = \frac{1}{\kappa}\frac{\partial u}{\partial t}, \tag{12.277}$$

and we will look for separable solutions in the following form:


u(x, y, z, t) = T0 + X(x)Y (y)T (t).

(12.278)

We do this because this will make it easier to absorb the boundary conditions. Although u = 0
throughout the block initially, it is $u = T_0$ at the surface of the block (at all times). Thus as written,
XYT = 0 on the surface of the block at t = 0, which is a simpler boundary condition. We didn't
have to choose this form, but it makes the working simpler. In any case, the differential equation
is independent of $T_0$:

$$YT\,\frac{d^2 X}{dx^2} + XT\,\frac{d^2 Y}{dy^2} - \frac{XY}{\kappa}\frac{dT}{dt} = 0 \tag{12.279}$$

Dividing by XYT,

$$\frac{1}{X}\frac{d^2 X}{dx^2} + \frac{1}{Y}\frac{d^2 Y}{dy^2} - \frac{1}{\kappa T}\frac{dT}{dt} = 0 \tag{12.280}$$

Since the first term is not a function of y, t, and the second is not a function of x, t etc, we conclude
that all the terms must be constant, e.g.

$$\frac{1}{T}\frac{dT}{dt} = -\lambda \quad\Rightarrow\quad T(t) \propto e^{-\lambda t}. \tag{12.281}$$

We next find the equation for X, by isolating terms which depend on x only:

$$\frac{1}{X}\frac{d^2 X}{dx^2} = -\frac{1}{Y}\frac{d^2 Y}{dy^2} - \frac{\lambda}{\kappa} = -k_x^2 = \text{constant} \tag{12.282}$$

(the l.h.s. is not a function of y, z, the Y term is not a function of x, hence they must be equal to
another constant).

$$\frac{d^2 X}{dx^2} = -k_x^2 X \quad\Rightarrow\quad X(x) = A e^{ik_x x} + B e^{-ik_x x} \tag{12.283}$$

and similarly for Y, except that the equivalent wavenumber $k_y$ must satisfy $k_x^2 + k_y^2 = \lambda/\kappa$, from
equation (12.282).
Now, as stated above, the terms we calculate here must be zero on the boundaries at x = 0, L,
y = 0, L. Hence the solutions for X and Y must be sinusoidal, with the correct period, e.g.

$$X(x) \propto \sin\left(\frac{m\pi x}{L}\right) \tag{12.284}$$

for any integer m. Similarly for Y. So a separable solution is

$$u(x,y,t) = T_0 + C_{mn} \sin\left(\frac{m\pi x}{L}\right) \sin\left(\frac{n\pi y}{L}\right) e^{-\lambda_{mn} t} \tag{12.285}$$

where

$$\lambda_{mn} = \frac{\pi^2 \kappa}{L^2}\,(m^2 + n^2). \tag{12.286}$$

Here we have identified the separation constants explicitly with the integers m, n, rewriting $\lambda/\kappa = k_x^2 + k_y^2$.
Now we can add the separable solutions:

$$u(x,y,t) = T_0 + \sum_{m,n=0}^{\infty} C_{mn} \sin\left(\frac{m\pi x}{L}\right) \sin\left(\frac{n\pi y}{L}\right) e^{-\lambda_{mn} t}. \tag{12.287}$$

All that remains is to determine the constants $C_{mn}$. We use the initial condition that inside the
volume u = 0 when t = 0 (when the exponential term is unity), so

$$\sum_{m,n=0}^{\infty} C_{mn} \sin\left(\frac{m\pi x}{L}\right) \sin\left(\frac{n\pi y}{L}\right) = -T_0. \tag{12.288}$$

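This looks very much like a Fourier Series, and we can use the same trick of the orthogonality of the
sin functions. Multiply by $\sin(m'\pi x/L)$ and integrate with respect to x, giving 0 unless $m = m'$,
and L/2 if $m = m'$. Similarly for y, so

$$\begin{aligned}
C_{mn} \left(\frac{L}{2}\right)^2 &= -T_0 \int_0^L \sin\left(\frac{m\pi x}{L}\right) dx \int_0^L \sin\left(\frac{n\pi y}{L}\right) dy \\
&= -T_0 \left[ \frac{L}{m\pi} \cos\left(\frac{m\pi x}{L}\right) \right]_0^L \left[ \frac{L}{n\pi} \cos\left(\frac{n\pi y}{L}\right) \right]_0^L .
\end{aligned} \tag{12.289}$$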

Figure 12.20: Temperature at an early time t = 0.01, for $T_0 = 1$, $\kappa = 1$ and $L = \pi$, and then at a
late time t = 1.
The bracketed cosine terms vanish if m or n is even. If m, n are both odd, the right hand side is $-4L^2 T_0/(mn\pi^2)$, from
which we get

$$C_{mn} = \begin{cases} -16\,T_0/(\pi^2 mn) & m, n \text{ both odd} \\ 0 & \text{otherwise} \end{cases} \tag{12.290}$$
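The coefficients can be cross-checked by carrying out the projection integrals numerically. The sketch below applies the midpoint rule to the two factorized integrals, divides by $(L/2)^2$, and compares with the closed form $-16T_0/(\pi^2 mn)$; the function names, the step count and the default $L = \pi$ are illustrative assumptions.

```python
import math


def c_mn_numeric(m, n, T0=1.0, L=math.pi, n_steps=2000):
    """Project -T0 onto sin(m pi x/L) sin(n pi y/L) by the midpoint rule.

    The double integral factorizes into an x-integral times a y-integral;
    dividing by the normalization (L/2)^2 gives C_mn.
    """
    h = L / n_steps
    ix = sum(math.sin(m * math.pi * (i + 0.5) * h / L) for i in range(n_steps)) * h
    iy = sum(math.sin(n * math.pi * (j + 0.5) * h / L) for j in range(n_steps)) * h
    return -T0 * ix * iy / (L / 2.0) ** 2


def c_mn_exact(m, n, T0=1.0):
    """Closed form: -16 T0 / (pi^2 m n) for m, n both odd, zero otherwise."""
    if m % 2 == 1 and n % 2 == 1:
        return -16.0 * T0 / (math.pi ** 2 * m * n)
    return 0.0
```

The two functions should agree closely for small odd m, n, and the numerical value should be essentially zero whenever either index is even.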
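Finally the full solution is

$$u(x,y,t) = T_0 \left\{ 1 - \frac{16}{\pi^2} \sum_{m,n\ \mathrm{odd}} \frac{1}{mn} \sin\left(\frac{m\pi x}{L}\right) \sin\left(\frac{n\pi y}{L}\right) \exp\left[ -(m^2 + n^2)\,\frac{\kappa\pi^2}{L^2}\, t \right] \right\}. \tag{12.291}$$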
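A truncated version of this double series is straightforward to evaluate. The sketch below sums over odd m, n up to a cutoff; the function name, the cutoff `max_odd`, and the default values $T_0 = 1$, $\kappa = 1$, $L = \pi$ are illustrative assumptions. At t = 0 the series converges slowly (the terms fall off only as 1/mn), so the interior value is reproduced only approximately there.

```python
import math


def temperature(x, y, t, T0=1.0, kappa=1.0, L=math.pi, max_odd=199):
    """Evaluate the series solution truncated to odd m, n <= max_odd."""
    total = 0.0
    for m in range(1, max_odd + 1, 2):
        sx = math.sin(m * math.pi * x / L) / m
        for n in range(1, max_odd + 1, 2):
            decay = math.exp(-(m * m + n * n) * kappa * math.pi ** 2 * t / L ** 2)
            total += sx * math.sin(n * math.pi * y / L) / n * decay
    return T0 * (1.0 - 16.0 / math.pi ** 2 * total)
```

On the boundary every sine factor in x or y vanishes, so the function returns $T_0$ at all times; at the centre of the column it starts near zero and relaxes towards $T_0$, with the slowest-decaying (m = n = 1) mode dominating at late times.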
