Great Clarendon Street, Oxford, OX2 6DP,
United Kingdom
Oxford University Press is a department of the University of Oxford.
It furthers the University’s objective of excellence in research, scholarship,
and education by publishing worldwide. Oxford is a registered trade mark of
Oxford University Press in the UK and in certain other countries
© Charles S. Adams and Ifan G. Hughes 2019
The moral rights of the authors have been asserted
First Edition published in 2019
Impression: 1
All rights reserved. No part of this publication may be reproduced, stored in
a retrieval system, or transmitted, in any form or by any means, without the
prior permission in writing of Oxford University Press, or as expressly permitted
by law, by licence or under terms agreed with the appropriate reprographics
rights organization. Enquiries concerning reproduction outside the scope of the
above should be sent to the Rights Department, Oxford University Press, at the
address above
You must not circulate this work in any other form
and you must impose this same condition on any acquirer
Published in the United States of America by Oxford University Press
198 Madison Avenue, New York, NY 10016, United States of America
British Library Cataloguing in Publication Data
Data available
Library of Congress Control Number: 2018953423
ISBN 978–0–19–878678–8 (hbk.)
ISBN 978–0–19–878679–5 (pbk.)
DOI: 10.1093/oso/9780198786788.001.0001
Printed and bound by
CPI Group (UK) Ltd, Croydon, CR0 4YY
Links to third party websites are provided by Oxford in good faith and
for information only. Oxford disclaims any responsibility for the materials
contained in any third party website referenced in this work.
To K for love and cake, and E for the hedgehog. (CSA)
I Dad ac Aled, cyd-deithwyr ffyddlon ar drywydd dirgelwch
goleuni. (IGH)
Preface
PC, or laptop (or even your phone!) can be used to perform diffraction
integrals efficiently. The teaching of optics cannot be reduced to a
sequence of computer codes and algorithms; however, the insight gained
by utilizing modern computer techniques to produce visualizations in
conjunction with more conventional analytic approaches is, in our
experience, a far better way to gain a deeper understanding of light
propagation. For this reason, we make available a library of supporting
python codes that were used to generate many of the figures in this
book; see:
http://www.dur.ac.uk/physics/opticsf2f
Preliminary versions of parts of the material have been used by
thousands of physics and natural sciences students at Durham University;
we are grateful to all of those who helped identify and eradicate
inconsistencies, errors, and other sources of confusion. We also thank the
undergraduates and summer students who let us use some of the images
and data recorded in the undergraduate laboratory. In many ways, we
have learnt more from our students than from anyone else. Many colleagues
kindly donated their time to proofread various chapters, and we are
indebted to them for this service: Tom Lancaster gave us sage advice
about writing a physics book, and made suggestions for improvements
on most of the manuscript; we benefited enormously from the years of
experience as an optics teacher of Steve Hopkins, who read carefully the
first half; Robert Bettles helped root out mathematical inconsistencies
and made suggestions for clearer wording, especially in Chapter 13;
Lukas Novotny proofread Chapter 12; Aled and Rhiannon Hughes gave
advice on the photographic elements; and Eileen Lovell discovered our
problem with plurals. The authors have enjoyed discussions on various
topics in optics with mentors and colleagues over the years, including
Geoffrey Brooker, Antoine Browaeys, Allister Ferguson, Matthew Jones,
Klaus Mølmer, Tilman Pfau, Erling Riis, and Nicholas Spong. Mr John
Harris inspired one of the authors (IGH) to study optics at Ysgol Gyfun
Ystalyfera.
We would like to thank our families for support and encouragement,
our copy editor Graham Bliss, production editor Saranya Jayakumar,
and Sönke Adlung and Harriet Konishi at OUP for their enthusiasm
and patience.
CSA and IGH, Durham, on the 311th birthday of Leonhard Euler, who
taught us that $e^{i\pi} = -1$, 15 April, 2018.
Contents
1 Light as a wave
1.1 Wave optics
1.2 A brief history
1.3 Maxwell’s equations
1.4 Maxwell’s wave equation
1.5 Principle of superposition
1.6 The harmonic wave solution
1.7 E or B?
1.8 Phasors
1.9 Spatial frequency
1.10 Intensity/Poynting vector
1.11 Complex representation
1.12 Scalar approximation
1.13 General solution
1.14 Propagation
1.15 Waves and quanta
Exercises
4 Polarization
4.1 Introduction
4.2 Linear basis (|)
4.3 Linear polarization (|)
4.4 Circular polarization (|)
4.5 Elliptical polarization (|)
4.6 Circular basis (◦)
4.7 Poincaré sphere (◦)
4.8 Photon spin (◦)
4.9 Polarized light in a medium
4.10 Polarizers
4.11 Malus’ Law
4.12 Linear birefringence (|)
4.13 Wave plates (|)
4.14 Circular birefringence (|)
4.15 Natural optical activity (|)
4.16 The Faraday effect (|)
4.17 Interference
Exercises
8 Coherence
8.1 Introduction
8.2 Statistical light
8.3 Temporal coherence
8.4 White light
8.5 Wiener–Khinchin–Einstein theorem
8.6 Power spectral density
8.7 Intensity correlations
8.8 Spatial coherence
8.9 van Cittert–Zernike
8.10 Propagation of coherence
8.11 Stellar interferometry
Exercises
9.5 f to f
9.6 Two-lens system
9.7 Magnification
9.8 Complementarity I
Exercises
Exercises
References
Index
1 Light as a wave
We’re all equal before a wave.
Laird John Hamilton (San Francisco 1964–)
1.1 Wave optics
This book is about wave optics, which is the foundation stone of
the wider edifice of optical phenomena illustrated in Fig. 1.1. The
optics map includes: electromagnetic optics, where we care about
the electromagnetic character of light; quantum optics, where we
care more about effects associated with counting individual photons,
and non-linear optics where the field is sufficiently strong that the
interaction with a medium is non-linear, see Chapter 13. Wave optics
gets us surprisingly far, and only in a few special topics do we need to
invoke additional phenomena associated with the full electromagnetic
theory or quantum theory. A cornerstone of wave optics is the principle
of superposition, see Section 1.5. This says that we can add any
solution of the wave equation to form new solutions. So starting with
one wave (this chapter and Chapter 2), we can add another wave
and explain two wave phenomena such as interference (Chapter 3)
and polarization (Chapter 4), and then we add more (Chapters 5–7).
A sum of many waves allows us to explain the full range of wave
complexity found in Nature.
Optics f2f: From Fourier to Fresnel. Charles S. Adams and Ifan G. Hughes
© Charles S. Adams and Ifan G. Hughes 2019.
Published in 2019 by Oxford University Press. DOI: 10.1093/oso/9780198786788.001.0001
comparing data on the speed of light [1] with independent measurements
of the electrical permittivity and magnetic permeability of free space,
$\epsilon_0$ and $\mu_0$, respectively [2], Maxwell realized that light is
an electromagnetic wave that travels at a speed
$c = 1/\sqrt{\mu_0 \epsilon_0}$. As most information we acquire about the
Universe is delivered by electromagnetic waves this was hugely
significant. In making the connection, Maxwell unified the previously
unrelated disciplines of optics and electromagnetism, and introduced the
concept of the electromagnetic field, a function of space and time that
characterizes the forces on particles. The field concept provided a
template for modern physics and was subsequently applied to matter as
well, see e.g. Lancaster and Blundell (2014).
[1] Armand Hippolyte Louis Fizeau (Paris 1819–Venteuil 1896) and Jean
Bernard Léon Foucault (Paris 1819–1868) made accurate measurements of
the speed of light in 1848 and 1850, respectively.
[2] Wilhelm Eduard Weber (Wittenberg 1804–Göttingen 1891) and Rudolf
Hermann Arndt Kohlrausch (Göttingen 1809–Erlangen 1858) found that the
ratio of electrostatic to electromagnetic units could be combined to
produce a speed close to that of light in 1856.
$$\nabla^2 \mathbf{E} - \frac{1}{c^2}\frac{\partial^2 \mathbf{E}}{\partial t^2} = 0 , \tag{1.7}$$
where $c = 1/\sqrt{\mu_0 \epsilon_0}$ is the speed of light. Maxwell’s wave
equation, eqn (1.7), is linear: there are no terms above first order in E.
Note also that both the electric and magnetic fields are governed by the
same wave equation [5]. The challenge of wave optics is that even for the
simplest of scenarios, the solution of the vector wave eqn (1.7) is
complicated. However, as we shall see there are a number of
simplifications that often apply.
[5] Taking the curl of eqn (1.2) and substituting eqn (1.1) and eqn (1.4)
allows us to derive the wave equation for the magnetic field:
$$\nabla^2 \mathbf{B} - \frac{1}{c^2}\frac{\partial^2 \mathbf{B}}{\partial t^2} = 0 . \tag{1.8}$$
1.5 Principle of superposition
A key feature of the wave equation, eqn (1.7), is that it is linear in the
field. As a consequence the principle of linear superposition holds.
Principle of superposition
$$\mathbf{F} = \mathbf{F}_0 \cos(\mathbf{k}\cdot\mathbf{r} - \omega t + \phi_0) , \tag{1.9}$$
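As a rough numerical illustration of the harmonic wave, eqn (1.9), the sketch below evaluates a scalar version of the wave for propagation along z in vacuum and lists the associated period, frequency and spatial frequency. The function names and the choice of a 600 nm wavelength are assumptions made for this example only; they are not taken from the book's code library.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s (exact by definition)


def harmonic_wave_parameters(wavelength):
    """Temporal and spatial parameters of a harmonic wave in vacuum."""
    k = 2 * math.pi / wavelength   # wave-vector magnitude, rad/m
    omega = C * k                  # angular frequency, rad/s
    return {
        "k": k,
        "omega": omega,
        "frequency": omega / (2 * math.pi),   # crests per second, Hz
        "period": wavelength / C,             # time between crests, s
        "spatial_frequency": 1 / wavelength,  # crests per metre, 1/m
    }


def harmonic_wave(z, t, wavelength, amplitude=1.0, phi0=0.0):
    """Scalar form of eqn (1.9) with k.r = kz (propagation along z)."""
    p = harmonic_wave_parameters(wavelength)
    return amplitude * math.cos(p["k"] * z - p["omega"] * t + phi0)


if __name__ == "__main__":
    params = harmonic_wave_parameters(600e-9)  # illustrative wavelength
    for name, value in params.items():
        print(f"{name:>18s} = {value:.4e}")
```

For visible wavelengths the period is of order femtoseconds, which is why detectors respond to a time average of the field rather than to individual oscillations.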
1.7 E or B?
For an electromagnetic wave in vacuum or a non-magnetic medium, the
electric and magnetic fields, E and B, are linearly related, and it is
sufficient to consider either E or B only. To show this, take a harmonic
wave solution with k along the z axis such that
$\mathbf{E} = \mathbf{E}_0 \cos(kz - \omega t)$ and
$\mathbf{B} = \mathbf{B}_0 \cos(kz - \omega t)$. Using the Maxwell equation,
$\nabla \times \mathbf{E} = -\partial \mathbf{B}/\partial t$, we find that
$\mathbf{k} \times \mathbf{E}_0 = \omega \mathbf{B}_0$, and hence that
$$|E_0/B_0| = c .$$
In general, we shall choose to work with E because it has the stronger
interaction with charges inside a medium. For example, the ratio of the
electric to the magnetic force from eqn (1.5) is
$$\left|\frac{F_e}{F_m}\right| = \left|\frac{qE}{q\,\mathbf{v}\times\mathbf{B}}\right| = \frac{c}{v} , \tag{1.12}$$
where we have used the fact that for a harmonic wave in free space
$|E/B| = c$. In an insulator or dielectric the speed of a charge can be
estimated using the Bohr model of an atom. If the mean radius of a
bound electron is of the order of the Bohr radius $a_0$ then, recalling
that in the Bohr model, the angular momentum of excited states is
quantized, $mvr = \hbar$, we arrive at an electron speed of
$v = \hbar/ma_0$. Consequently, the ratio of the electric to the magnetic
force is
$$\left|\frac{F_e}{F_m}\right| = \frac{c}{v} = \frac{c}{\hbar/ma_0} = \frac{1}{\alpha} , \tag{1.13}$$
Fig. 1.3 A harmonic wave with wavelength λ = 600 nm travelling along
the z axis: eqn (1.9) with k·r = kz and phase offset φ0 = 0. (a) The
magnitude of the electric field as a function of time at z = 0. We
observe a wave crest every 2 fs, i.e. 500,000 billion waves per second,
corresponding to a frequency of 500 THz. (b) The magnitude of the
electric field as a function of position at t = 0. We observe a wave
crest every 600 nm, corresponding to 1.67 million waves per metre, i.e.
a spatial frequency of $1.67 \times 10^{6}$ m$^{-1}$.
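The size of the ratio in eqn (1.13) can be checked with a few lines of arithmetic. The sketch below assumes CODATA values for the reduced Planck constant, electron mass and Bohr radius (these numbers are supplied here and do not appear in the text), and the function names are illustrative.

```python
# Physical constants in SI units (CODATA values, quoted here as assumptions).
HBAR = 1.054571817e-34   # reduced Planck constant, J s
M_E = 9.1093837015e-31   # electron mass, kg
A0 = 5.29177210903e-11   # Bohr radius, m
C = 299_792_458.0        # speed of light, m/s


def bohr_electron_speed():
    """Electron speed from m v r = hbar with r equal to the Bohr radius."""
    return HBAR / (M_E * A0)


def electric_to_magnetic_force_ratio(speed):
    """Ratio c/v of eqn (1.12) for a charge moving at the given speed."""
    return C / speed


if __name__ == "__main__":
    v = bohr_electron_speed()
    print(f"v       = {v:.3e} m/s")
    print(f"F_e/F_m = {electric_to_magnetic_force_ratio(v):.1f}")
```

The ratio comes out close to 137, so for bound electrons the magnetic force is smaller than the electric force by roughly two orders of magnitude.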
1.8 Phasors
A convenient way to represent the phase of any wave at a particular
position and time is using a phasor, a unit vector that rotates
anti-clockwise in a fictional plane with an angle φ relative to the
positive horizontal axis [9]. For the harmonic wave solution, eqn (1.9),
with F replaced by E:
$$\mathbf{E} = \mathbf{E}_0 \cos\phi , \tag{1.14}$$
[9] In complex notation, see Section 1.11, this fictional plane
corresponds to the complex plane.
Warning:
A phasor is a unit vector in a fictional plane representing the phase
of a wave, φ. The axes are directions in virtual space, not real space.
A phasor vector is only a graphical representation of phase and has
nothing to do with the electric field (or polarization) vector.
ν̃ = 1/λ . (1.15)
The magnitude of the wave vector is the phase change per unit length
and equal to 2π times the wave number:
$$u = \frac{1}{\lambda} \sin\theta . \tag{1.17}$$
where the $\tfrac{1}{2}$ comes from the average of $\cos^2 \omega t$ and $E_0$ is the magnitude
of E 0 . The key aspect is that the intensity of a light wave—the quantity
measured by optical detectors—is proportional to the square of the
amplitude of the electric field. If we count photons, the intensity
tells us how many photons we expect per unit area per unit time. In
Section 1.11, we shall derive the equivalent result using complex notation
for the field.
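The two ingredients of this result, the factor of one half from averaging $\cos^2 \omega t$ and the quadratic dependence of intensity on field amplitude, can be checked numerically. The sketch below uses the relation $I = \tfrac{1}{2}\epsilon_0 c E_0^2$ of eqn (1.30); the function names, the CODATA value of $\epsilon_0$ and the example intensity of 1000 W m$^{-2}$ are assumptions chosen for illustration.

```python
import math

EPSILON_0 = 8.8541878128e-12  # permittivity of free space, F/m (CODATA value)
C = 299_792_458.0             # speed of light, m/s


def mean_cos_squared(samples=1000):
    """Average of cos^2 over one full period, sampled uniformly."""
    total = sum(math.cos(2 * math.pi * j / samples) ** 2 for j in range(samples))
    return total / samples


def intensity_from_amplitude(e0):
    """Time-averaged intensity (W/m^2) for field amplitude e0 (V/m)."""
    return 0.5 * EPSILON_0 * C * e0 ** 2


def amplitude_from_intensity(intensity):
    """Field amplitude (V/m) giving the stated time-averaged intensity."""
    return math.sqrt(2 * intensity / (EPSILON_0 * C))


if __name__ == "__main__":
    print(f"<cos^2> = {mean_cos_squared():.6f}")
    e0 = amplitude_from_intensity(1000.0)  # illustrative intensity, W/m^2
    print(f"E0      = {e0:.1f} V/m")
```

Doubling the field amplitude quadruples the intensity, which is the property that makes interference terms visible to a detector.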
E = E e^{iφ} , (1.28)
$$I = \tfrac{1}{2} \epsilon_0 c E_0^2 , \tag{1.30}$$
$$\nabla^2 E - \frac{1}{c^2}\frac{\partial^2 E}{\partial t^2} = 0 . \tag{1.31}$$
For the special case where the field varies in only one spatial dimension,
say z, the wave equation reduces to
$$\frac{\partial^2 E}{\partial z^2} - \frac{1}{c^2}\frac{\partial^2 E}{\partial t^2} = 0 . \tag{1.32}$$
As we shall see next, the one-dimensional wave equation has a very
general solution corresponding to a propagating wave form.
The scalar approximation breaks down if we tilt the field vector too
far relative to the propagation direction, as in the case of strong focusing
illustrated in Fig. 1.7. We shall discuss the full vector theory of focusing
in Chapter 12. A lens changes both the distribution of wave vectors, k,
and the distribution of electric field vectors, E. The scalar approximation
says we can neglect this change in E which is only approximately true
as long as the range of propagation angles relative to the optical axis
remains sufficiently small. The regime where the spread in propagation
angles is not too large is known as paraxial optics, see Section 2.13. It is
apparent from Fig. 1.7 that the scalar approximation is only applicable in
this paraxial regime. The scalar approximation is known to break down
when a light beam is tightly focused, and the longitudinal components
become significant, as is depicted in Fig. 1.7.
Fig. 1.7 Intensity distribution in the xz plane corresponding to the x and
z field components for focused light (high intensity is white, zero
intensity is black). In the scalar approximation, we only retain the
dominant component of the field; in this example, $E_x$, and the associated
intensity, $I_x$. The effect of a lens is to tilt the electric field vector,
converting a part of $E_x$ into $E_z$. For small tilt angles, the scalar
approximation remains valid. However, for larger tilt angles, a
significant fraction of π/2 (the strong focusing limit), other electric
field components appear (lower image) and a full vector treatment is
needed, see Chapter 12.
$$\frac{\partial f}{\partial z} = f'(z - ct) , \tag{1.33}$$
and
$$\frac{\partial^2 f}{\partial z^2} = f''(z - ct) . \tag{1.34}$$
In addition
$$\frac{\partial f}{\partial t} = -c f'(z - ct) , \tag{1.35}$$
moreover
$$\frac{\partial^2 f}{\partial t^2} = c^2 f''(z - ct) . \tag{1.36}$$
Thus by substituting eqn (1.34) and eqn (1.36) into eqn (1.32) we
conclude that $E = E_0 f(z - ct)$ is a solution to the one-dimensional wave
equation. Furthermore, a similar analysis shows that $E = E_0 g(z + ct)$
is also a solution for any function, g(z, t), of the form g(z + ct). Using
the principle of superposition we see that the most general solution to
the one-dimensional wave equation is $E = E_0 [f(z - ct) + g(z + ct)]$. We
refer to f(z − ct) and g(z + ct) as travelling-wave solutions, for reasons
that are explained in Fig. 1.8. The solution $E = E_0 f(z - ct)$ evidently
represents an electric-field wave travelling to the right, with speed c;
whereas the solution $E = E_0 g(z + ct)$ represents an electric-field wave
travelling to the left, also with speed c.
Fig. 1.8 Generic solutions to the one-dimensional wave equation
corresponding to a wave propagating from left to right (a) f(z − ct), and
from right to left (b) g(z + ct). The plots show the wave at successive
times $t_C > t_B > t_A$.
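The claim that any profile of the form f(z − ct) or g(z + ct) satisfies the one-dimensional wave equation, eqn (1.32), can be tested numerically with finite differences. The sketch below is only an illustration: the Gaussian profile, the units (micrometres and femtoseconds) and the step sizes are all assumptions chosen for convenience.

```python
import math

C = 0.299792458  # speed of light in micrometres per femtosecond


def profile(u):
    """An arbitrary smooth pulse shape; a Gaussian of unit width (assumed)."""
    return math.exp(-u ** 2)


def field(z, t, direction=+1):
    """direction=+1 gives f(z - ct) (moving right), -1 gives g(z + ct)."""
    return profile(z - direction * C * t)


def wave_equation_residual(z, t, direction=+1, dz=1e-3, dt=1e-3):
    """Return (d2E/dz2 - (1/c^2) d2E/dt2, d2E/dz2) by central differences."""
    e0 = field(z, t, direction)
    d2z = (field(z + dz, t, direction) - 2 * e0 + field(z - dz, t, direction)) / dz ** 2
    d2t = (field(z, t + dt, direction) - 2 * e0 + field(z, t - dt, direction)) / dt ** 2
    return d2z - d2t / C ** 2, d2z


if __name__ == "__main__":
    for direction in (+1, -1):
        residual, curvature = wave_equation_residual(0.3, 0.5, direction)
        print(f"direction {direction:+d}: curvature = {curvature:.4f}, "
              f"residual = {residual:.2e}")
```

The residual is small compared with the individual second derivatives, and it shrinks further as the step sizes are reduced, which is the behaviour expected of a true solution.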
1.14 Propagation
One of the most important questions in optics is how does light
propagate from A to B? Often, we know the field in a particular input
plane, and want to know the field everywhere else, or at least in a
particular observation plane. A large part of this book is devoted
to finding an expression for the field ‘downstream’ of the input plane.
We shall find—in Chapters 5 and 6, respectively—that the light field
anywhere can be written as a sum of waves with either planar or curved
wave fronts each with a particular phase. So the short answer to the
question, how does light propagate from A to B? is that it is all about the
phase. As momentum is conserved, a more accurate statement is that it
is all about phase and momentum. Using these simple concepts—adding
1.15 Waves and quanta
$$\mathbf{p} = \hbar \mathbf{k} . \tag{1.39}$$
$$\nabla^2 E + k^2 E = 0 , \tag{1.40}$$
where k = ω/c.
This time-independent form of the wave equation is known as the
Helmholtz equation. The time-independent Schrödinger equation
for the wave function, ψ, of a particle of mass m, and energy E in a
potential V,
$$-\frac{\hbar^2}{2m} \nabla^2 \psi + V\psi = E\psi , \tag{1.41}$$
can also be written in the form of a Helmholtz equation,
$$\nabla^2 \psi + k^2 \psi = 0 , \tag{1.42}$$
where $k = \sqrt{2m(E - V)/\hbar^2}$. Consequently, the same wave theory
works in both cases. The analogy works both ways. One can either think
of optical photons as being confined in potentials created by matter, as
in an optical fibre or waveguide; or particles confined by potentials which
can be created by light, as in optical tweezers (Adams et al. 1994). We
shall make use of this analogy in later chapters, particularly when we
consider what the field looks like inside an aperture and when we discuss
the light distribution inside an optical fibre, see Chapter 11.
Fig. 1.10 Intensity map for light propagating through a two-lens system.
Black and white correspond to zero and peak intensity, respectively, but
where is the photon? How to calculate this image is discussed in
Chapter 6.
We should keep in mind that the light field is inherently lumpy. We
shall often write the field amplitude, $E_0$, as if it were a constant, when
in fact it contains fluctuations, see Chapter 8; $E_0$ is the average of a
fluctuating field. A final caveat on the use of the photon concept is
that although we often talk about them, the photonic character of the
field is only really important if we are sensitive to individual photon
correlations in a way that goes beyond classical wave theory. The
story of correlated photons lies in the realm of quantum optics, see
e.g. Loudon (2000). Verifying that light is a quantum phenomenon
requires both interference and counting, i.e. both the wave and particle
character of the field. The reality of photons is referenced to correlations
in the detected signal. To emphasize this point, Roy Jay Glauber
(New York City 1925–)—co-recipient of the 2005 Nobel Prize in Physics
for his contribution to the quantum theory of optical coherence—says
(Roychoudhuri 2008): A photon is what a photon detector detects, and
A photon is where a photon detector detects it.
Chapter summary
Exercises
(1.1) Speed of light
The speed of light has been defined as $c = 299\,792\,458$ m s$^{-1}$
(exact). Likewise, the value for the vacuum permeability is exact,
$\mu_0 = 4\pi \times 10^{-7}$ N A$^{-2}$ (exact). The value for the
permittivity of free space is thus defined by the relation
$c^2 = 1/(\mu_0 \epsilon_0)$. Evaluate $\epsilon_0$ to four significant
figures.
(1.2) Photons in a beam
A laser pointer emits light of wavelength λ = 650 nm in a beam of power
2 mW. How many photons per second are emitted?
(1.9) Harmonic wave (1)
Sketch the form of the harmonic wave $E = E_0 \cos(kz - \omega t)$, with a
wavelength λ = 0.5 μm, at t = 0, for z in the range
−1.5 μm ≤ z ≤ 1.5 μm.
(1.10) Harmonic wave (2)
Sketch the form of the harmonic wave $E = E_0 \cos(kz - \omega t)$, with a
wavelength λ = 0.5 μm,

Rewrite U in terms of the voltage across the capacitor V and the
capacitance C. For a capacitor with area A and spacing d the
capacitance is $C = A\epsilon_0/d$ and the field is E = V/d.
Substituting for C, find an expression for energy density, u. How would
this expression change for a time-varying field?
2 One wave: plane or curved
Every great decision creates ripples – like a huge boulder
dropped in a lake.
Benjamin Disraeli (London 1804–1881)
2.1 Introduction
2.2 Wave fronts
2.3 Plane waves
2.4 Transverse property
ik · E = 0, (2.8)
ik × E = iωB . (2.9)
Warning
Certain properties of plane waves, such as
are not necessarily shared by ALL light waves. Plane waves form
a convenient basis, but the properties of a superposition of plane waves
are not the same as the properties of the individual basis functions.
There are numerous examples of situations where the statement light
is a transverse wave is not valid.
The scalar plane wave has four parameters: (i) wavelength λ = 2π/k =
2πc/ω, (ii) amplitude $E_0$, (iii) propagation direction θ, and (iv)
direction of the electric field vector (in this case along y), but
remember that the transverse property means that the propagation
direction and the direction of electric field vector are related. The rate
of phase variation along the x axis is
$$k_x = k \sin\theta , \tag{2.12}$$
and the photon momentum along the x axis is
$$p_x = \hbar k \sin\theta . \tag{2.13}$$
As discussed in Section 1.8, it is convenient to represent the phase of
any wave using a phasor. For the complex form of a scalar plane wave
eqn (2.10) the phasor angle is
$$\phi = k_x x + k_y y + k_z z , \tag{2.14}$$
and eqn (2.10) can be written as
$$E = E_0 e^{i\phi} . \tag{2.15}$$
In the propagation direction, the phasor rotates anti-clockwise with an
angle proportional to the distance travelled, as illustrated in Fig. 2.6.
In this phasor representation, the field only has two parameters, an
amplitude $E_0$, and a phase, φ, but the phase is a function of space and
time. To describe the propagation of light, it is convenient to define an
optical axis, which we shall take as the z axis. The xy plane at z = 0 is
defined as the input plane, and the field in the input plane is written as
$E^{(0)} = E_0 f(x', y')$. Using eqn (2.10) the field a distance z
downstream is
$$E^{(z)} = e^{i k_z z} E^{(0)} , \tag{2.16}$$
where $k_z = k \cos\theta$ and $e^{i k_z z}$ is known as the propagator. In
Chapter 6 we shall employ this plane wave propagation equation to describe
the propagation of arbitrary fields by writing them as a superposition of
plane waves propagating at different angles.
Fig. 2.6 Phasor evolution for a plane wave in the xz plane. In the
propagation direction, the phasor completes one revolution (a 2π
rotation) between successive wave crests separated by a distance λ.
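The propagator of eqn (2.16) is simple enough to evaluate directly. The following sketch applies it to a single plane-wave component; the function name, the 600 nm wavelength and the use of Python's complex arithmetic are assumptions made for illustration, not the book's own code.

```python
import cmath
import math


def propagate_plane_wave(field_in, wavelength, theta, z):
    """Apply the propagator exp(i k_z z) of eqn (2.16) to a complex amplitude.

    theta is the propagation angle relative to the z axis (radians) and
    k_z = k cos(theta) is the axial component of the wave vector.
    """
    k = 2 * math.pi / wavelength
    kz = k * math.cos(theta)
    return cmath.exp(1j * kz * z) * field_in


if __name__ == "__main__":
    wavelength = 600e-9  # illustrative value, m
    for theta_deg in (0, 30, 60):
        out = propagate_plane_wave(1.0, wavelength, math.radians(theta_deg), wavelength)
        print(f"theta = {theta_deg:2d} deg: phasor angle after one wavelength "
              f"along z = {math.degrees(cmath.phase(out)):7.2f} deg")
```

A tilted wave accumulates less phase per unit distance along z than an on-axis wave, while the magnitude of the field is unchanged; this angle-dependent phase is what a superposition of plane waves exploits.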
where we have allowed for the possibility that the plane wave has a
different amplitude and propagates in a different direction. At the
boundary z = 0, the variation of the phase along the x axis must be the
same, see Fig. 2.7, which requires that
$$\sin\theta_i = n \sin\theta_t . \tag{2.21}$$
This is known as the law of refraction. For the more general case
where the interface is between a medium with refractive index $n_i$ and a
second medium with index $n_t$, the law becomes
$$n_i \sin\theta_i = n_t \sin\theta_t . \tag{2.22}$$
Although the law of refraction is often referred to as Snell’s law, after
Willebrord Snellius (Leiden 1580–1626), it first appeared in the work of
Ibn Sahl (Baghdad 940–1000) in 984, see Rashid (1990) [5].
Fig. 2.7 Phase continuity at an interface between vacuum (left) and a
medium with refractive index n (shaded region on the right).
[5] The first mathematical derivation was by René Descartes (La Haye
en Touraine 1596–Stockholm 1650) in Dioptrique 1637.
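The law of refraction, eqn (2.22), translates into a one-line calculation. The sketch below is a minimal example with hypothetical function and parameter names; the handling of the case where no real transmitted angle exists (total internal reflection) goes beyond what is discussed above and is included only so that the function is well defined for all inputs.

```python
import math


def refraction_angle(theta_i, n_i=1.0, n_t=1.5):
    """Transmitted angle (radians) from n_i sin(theta_i) = n_t sin(theta_t).

    Returns None when n_i sin(theta_i) / n_t exceeds 1, i.e. when there is
    no real transmitted angle (total internal reflection).
    """
    s = n_i * math.sin(theta_i) / n_t
    if abs(s) > 1.0:
        return None
    return math.asin(s)


if __name__ == "__main__":
    for theta_deg in (0, 15, 30, 45, 60):
        theta_t = refraction_angle(math.radians(theta_deg))
        print(f"theta_i = {theta_deg:2d} deg -> theta_t = "
              f"{math.degrees(theta_t):5.2f} deg (vacuum to n = 1.5)")
```

Entering a denser medium, the transmitted ray always bends towards the normal; reversing the two indices reverses the ray path.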
2.8 Dispersion
As the refractive index, n, depends on the wavelength, see Chapter 13,
different colours refract by different angles as they enter and leave a
medium, which gives rise to optical phenomena such as rainbows. The
change in refractive index with wavelength is known as dispersion. We
briefly review dispersion here and postpone the details to Chapter 13.
Media that transmit visible light typically have electronic resonances
at frequencies in the ultra-violet region and the refractive index spectrum
looks something like the curve shown in Fig. 2.8, see Chapter 13. The
refractive index is larger for higher frequencies and consequently shorter
wavelengths (blue rather than red) refract more. It is worth noting
that this is the opposite to diffraction, where longer wavelengths (red
rather than blue) diffract more, see Chapter 5.
Fig. 2.8 The angular frequency dependence of the refractive index, n,
for a simple medium with a single resonance at angular frequency,
$\omega_0$. If $\omega_0$ is in the ultra-violet region, then the angular
frequency of both red and blue light, $\omega_r$ and $\omega_b$,
respectively, is less than $\omega_0$. The refractive index for blue light
is larger as it is closer to resonance.
$$\frac{E_{r\perp}}{E_{i\perp}} = -\frac{\sin(\theta_i - \theta_t)}{\sin(\theta_i + \theta_t)} . \tag{2.27}$$
The other case where the field is polarized within or parallel to the xz
plane is slightly more complex as now we have two components of the
field, see Fig. 2.9, and we need to go beyond a scalar wave theory. Now
only the component of the field perpendicular to the interface (the z
component in Fig. 2.9) satisfies the superposition principle—consistent
with Maxwell’s equation, ∇ · E = 0, with no surface charge. This gives
(Ei + Er ) cos θi = Et cos θt . (2.28)
and eliminating Et we find that the reflected wave is given by
$$\frac{E_r}{E_i} = \frac{-n\cos\theta_i + \cos\theta_t}{n\cos\theta_i + \cos\theta_t} . \tag{2.30}$$
tan θb = n .
At Brewster’s angle there is no reflected light, all the light is
transmitted, which is particularly useful in applications where low loss
is important, such as inside laser cavities, see Chapter 11. For typical
glasses with n = 1.5, Brewster’s angle is about 57°, as shown in
Fig. 2.10. The disappearance of the reflected wave at Brewster’s angle
arises due to the transverse nature of plane waves; namely, the
impossibility of solutions with coexisting orthogonal transverse fields.
It follows that for a real light field, which cannot have the infinite
spatial extent of a plane wave, the Brewster-angle condition can only be
met partially [8].
Fig. 2.11 The intensity reflection coefficient, R, at normal incidence,
eqn (2.32), as a function of the refractive index, n. The reflection
coefficient rises from R = 0.04 for n = 1.5 (glass) to R = 0.17 for
n = 2.4 (diamond).
[8] Note that there is nothing in the theory relating to the microscopic
properties of the medium. Consequently, Brewster’s angle is not related
to the angular dependence of fields produced by microscopic dipoles
inside the medium.
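The amplitude reflection coefficients of eqns (2.27) and (2.30) can be evaluated numerically to see how the reflected wave for the in-plane polarization vanishes at the angle satisfying tan θb = n. The sketch below assumes a vacuum-to-medium interface with real refractive index n; the function names are illustrative, and the normal-incidence limit of eqn (2.27) is inserted by hand because the expression is 0/0 there.

```python
import math


def transmitted_angle(theta_i, n):
    """Refraction from vacuum into index n: sin(theta_i) = n sin(theta_t)."""
    return math.asin(math.sin(theta_i) / n)


def r_perpendicular(theta_i, n):
    """Eqn (2.27): field polarized perpendicular to the plane of incidence."""
    if theta_i == 0.0:
        return (1 - n) / (1 + n)  # limiting value as theta_i -> 0
    theta_t = transmitted_angle(theta_i, n)
    return -math.sin(theta_i - theta_t) / math.sin(theta_i + theta_t)


def r_parallel(theta_i, n):
    """Eqn (2.30): field polarized within the plane of incidence."""
    theta_t = transmitted_angle(theta_i, n)
    num = -n * math.cos(theta_i) + math.cos(theta_t)
    den = n * math.cos(theta_i) + math.cos(theta_t)
    return num / den


def brewster_angle(n):
    """Angle satisfying tan(theta_b) = n."""
    return math.atan(n)


if __name__ == "__main__":
    for n in (1.5, 2.4):
        theta_b = brewster_angle(n)
        print(f"n = {n}: R(normal) = {r_parallel(0.0, n) ** 2:.3f}, "
              f"r_parallel at Brewster = {r_parallel(theta_b, n):.1e}")
```

At normal incidence the two polarizations give the same intensity reflection coefficient, as they must, since the plane of incidence is then undefined.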
2.11 Reflectivity
For normal incidence, $\theta_i = 0$, the intensity reflection coefficient
for either polarization reduces to
$$R = \left|\frac{E_r}{E_0}\right|^2 = \left(\frac{1-n}{1+n}\right)^2 . \tag{2.32}$$
We plot R versus refractive index in Fig. 2.11. As the reflectivity of
optical media like glass and transparent crystals is low, interference
2.12 Curved wave fronts
(1) First, being close to the optical axis means that we can make the
small-angle approximation:
$$x ,\ y ,\ x' \text{ and } y' < z ,$$
$$r = [z^2 + (x - x')^2 + (y - y')^2]^{1/2} ,$$
Example 2.1
Paraxial plane wave: In the paraxial regime, we can derive a paraxial form of
the plane wave solution, eqn (2.10). Rewriting the paraxial expression for the axial
component of the wave vector, eqn (2.35), in terms of spatial frequencies using kx =
2πu and ky = 2πv, we obtain
kz = k − π(u2 + v 2 )λ , (2.37)
where v is the spatial frequency in the y direction. Substituting for the components
of k in eqn (2.10) we obtain
$$E = E_0\, e^{i[k - \pi(u^2 + v^2)\lambda]z}\, e^{i2\pi(ux + vy)} . \tag{2.38}$$
This equation is known as the paraxial plane wave solution. The first exponential
factor is the phase evolution for a plane wave along z (less than kz for a wave
propagating at an angle), and the second is the phase variation along x and y. It
follows from eqn (2.38) that the field in an xy plane at z can be written as
$$E^{(z)} = e^{i[k - \pi(u^2 + v^2)\lambda]z}\, E^{(0)} , \tag{2.39}$$
where E (0) is the field in the xy plane at z = 0. This equation, the paraxial form of
eqn (2.16), is the basis of paraxial Fourier optics, as we shall see in Chapter 6.
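To get a feel for how good the paraxial expression for the axial wave-vector component, eqn (2.37), is, one can compare it with the exact value $k_z = (k^2 - k_x^2 - k_y^2)^{1/2}$. The sketch below does this for a few propagation angles, using the spatial frequency u = sin θ/λ of eqn (1.17); the function names and the chosen wavelength and angles are assumptions for illustration.

```python
import math


def kz_exact(wavelength, u, v=0.0):
    """Exact axial wave-vector component for spatial frequencies u and v."""
    k = 2 * math.pi / wavelength
    kx, ky = 2 * math.pi * u, 2 * math.pi * v
    return math.sqrt(k * k - kx * kx - ky * ky)


def kz_paraxial(wavelength, u, v=0.0):
    """Paraxial form, eqn (2.37): k_z = k - pi (u^2 + v^2) lambda."""
    k = 2 * math.pi / wavelength
    return k - math.pi * (u * u + v * v) * wavelength


if __name__ == "__main__":
    wavelength = 600e-9  # illustrative value, m
    for theta_deg in (1, 5, 10, 20):
        u = math.sin(math.radians(theta_deg)) / wavelength
        exact = kz_exact(wavelength, u)
        approx = kz_paraxial(wavelength, u)
        print(f"theta = {theta_deg:2d} deg: relative error in k_z = "
              f"{(approx - exact) / exact:.2e}")
```

The relative error grows roughly as the fourth power of the angle, but note that the accumulated phase error also scales with the propagation distance z, so a small error in $k_z$ can still matter over long distances.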
Unlike a spherical wave, the paraxial spherical wave has a particular
propagation direction, which we have taken as the positive z direction.
The paraxial spherical wave is particularly useful in the description of
diffraction, as we shall see in later chapters. A paraxial spherical wave
solution with source at (x′, 0, 0) is illustrated in Fig. 2.14. If the
source is at the origin, x′ = y′ = 0, then in the y = 0 plane, the paraxial
spherical wave is
[12] For a paraxial spherical wave with source at (0, 0, 0), the relative
error at (x, 0, z) in neglecting $x^2/(2z)$ in 1/r is
$\Delta r/r = x^2/(2zr)$, whereas the relative error in the phase, kr, is
$\Delta\phi/(2\pi) = kx^2/(4\pi z) = x^2/(2\lambda z)$. For $z \gg \lambda$,
the former is negligible, i.e., the wave amplitude $E_0/(kr)$ is slowly
varying compared to the phase, $e^{ikr}$.
$e^{ik\rho^2/2z}$, expresses the wave-front curvature with radius of
curvature equal to the propagation distance, z. As we saw in Figs. 2.2 and
2.14, as we propagate further from the source, the radius of curvature
increases, and the wave fronts become more and more planar. In
Fig. 2.1(right), we showed a visualization of the phase of a spherical wave
in a particular xz plane and indicated the phase variation along the x axis
at a distance z from the source, which is given by the real part of
$e^{ikx^2/2z}$. Next, we consider how an ideal lens changes the curvature
of a paraxial spherical wave [13].
[13] Curved wave fronts are common in optics and we shall encounter
many examples where quadratic phase factors, similar to the
$e^{ik\rho^2/2z}$ term in eqn (2.40), arise in later chapters.
2.15 Lenses: a brief history
The lens, named after the lens culinaris or lentil, is an essential
component in most optical instruments [14]. Although the light-bending
properties of water and glass were well known to the Greeks and
Romans, the first scientific understanding of lenses is attributed to Ibn
Sahl (Baghdad 940–1000) who wrote his treatise On Burning Mirrors
and Lenses in 984, which contained the first exposition of the law
of refraction. Ibn Sahl worked out that to focus light with minimal
[14] So useful that Nature invented it! The origin of man-made lenses is
not clear. The so-called Nimrod lens, found at the Assyrian temple of
Nimrod and now in the British Museum, is a lens-shaped glass over 2700
years old, but may not have been used as a lens.
$$E^{(L)} = E^{(0)}\, e^{-ik\rho^2/2f} . \tag{2.47}$$
2.17 Collimation
A lens can also be used to collimate light. A diverging spherical wave
centred at z = −f incident on a lens in the z = 0 plane is converted to
a plane wave, as shown in Fig. 2.17(i). Note the positive and negative
sign of the curvature for diverging and converging waves, Fig. 2.17(i)
and (ii), respectively.
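Collimation can be seen directly in the phase factors: a diverging paraxial wave from a source a distance f upstream arrives at the lens with the quadratic phase $e^{ik\rho^2/2f}$, and the ideal thin lens of eqn (2.47) multiplies it by $e^{-ik\rho^2/2f}$, leaving a flat phase. The sketch below checks this at a few radial positions; the function names, wavelength and focal length are assumptions, and the slowly varying amplitude factor is ignored.

```python
import cmath
import math


def curvature_phase(rho, wavelength, radius):
    """Quadratic phase exp(i k rho^2 / 2R) of a paraxial spherical wave.

    radius > 0 describes a diverging wave, radius < 0 a converging wave.
    """
    k = 2 * math.pi / wavelength
    return cmath.exp(1j * k * rho ** 2 / (2 * radius))


def thin_lens(field, rho, wavelength, focal_length):
    """Ideal thin lens, eqn (2.47): multiply by exp(-i k rho^2 / 2f)."""
    k = 2 * math.pi / wavelength
    return field * cmath.exp(-1j * k * rho ** 2 / (2 * focal_length))


if __name__ == "__main__":
    wavelength, f = 600e-9, 0.1  # illustrative values, m
    for rho in (0.0, 0.5e-3, 1.0e-3):
        incident = curvature_phase(rho, wavelength, f)  # source at z = -f
        out = thin_lens(incident, rho, wavelength, f)
        print(f"rho = {rho * 1e3:.1f} mm: phase after lens = "
              f"{cmath.phase(out):+.2e} rad")
```

Running the same lens function on a flat input phase gives `curvature_phase(rho, wavelength, -f)`, i.e. a wave converging towards a focus a distance f downstream.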
Real lenses are not as perfect as the ideal thin lens we have considered
above. First, the focal length depends on the refractive index n which
is a function of wavelength, as discussed in Section 2.8. This leads
to chromatic aberration, where different wavelengths are focused
in different planes. Chromatic aberrations are reduced by using a
second lens made of a different glass (an achromatic doublet) to cancel
the dispersion of the first. In addition, beyond the paraxial regime,
other sources of aberration arise due to spherical surfaces and the
finite thickness of the lens [18]. Optical engineering is focused on
reducing aberration using multiple lenses; see e.g. Fischer et al. (2008).
[18] An ideal lens would have a parabolic or aspherical surface, however
standard polishing techniques produce spherical surfaces. The difference
gives rise to spherical aberrations. In the paraxial regime, a spherical
and parabolic surface are the same.
such that
$$\frac{E_{s_2}}{iks_2}\, e^{-iks_2} = \frac{E_{s_1}}{iks_1}\, e^{iks_1} ,$$
and from the ρ dependence,
$$\frac{1}{s_1} - \frac{1}{f} = -\frac{1}{s_2} ,$$
or
$$\frac{1}{s_1} + \frac{1}{s_2} = \frac{1}{f} . \tag{2.52}$$
This equation is known as the thin-lens equation and is easily
extended from spherical waves to images. For images, each point in
an input plane a distance s1 upstream of the lens is mapped onto an
image point at a distance s2 downstream of the lens. The positions
z = −s1 and z = s2 are known as conjugate planes.
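As a rough numerical illustration of eqn (2.52), the short Python sketch below solves the thin-lens equation for the image distance. The function name and the values f = 0.10 m and s1 = 0.30 m are illustrative assumptions, not taken from the text.

```python
def image_distance(s1, f):
    """Solve 1/s1 + 1/s2 = 1/f, eqn (2.52), for the image distance s2."""
    if s1 == f:
        # Object in the focal plane: the output is collimated (Section 2.17).
        raise ValueError("s1 = f gives an image at infinity")
    return 1.0 / (1.0 / f - 1.0 / s1)


# Hypothetical values chosen only for illustration.
f = 0.10   # focal length in metres
s1 = 0.30  # object distance upstream of the lens in metres
s2 = image_distance(s1, f)
print(f"conjugate planes: z = -{s1:.2f} m and z = +{s2:.2f} m")
```

Setting s1 = 2f returns s2 = 2f, the symmetric case in which object and image are equally far from the lens.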
Chapter summary
Exercises
(2.1) Plane-wave properties (1)
Verify the results of equations (2.4)–(2.7).
axis in vacuum at a particular instant in time, e.g., t = 0.
error (as a percentage) in using the small-angle approximation for the case of light propagating at an angle θ = 30° relative to the z axis.
(2.4) Paraxial distance
Write an expression for the distance r between an input point (x′, 0) and an observation point (x, z). Use |x − x′| < z to expand r in terms of z, x, x′, x², and x′². What is this approximation called? Explain, briefly, when it might be possible to neglect the x′² term while retaining the other terms.
(2.5) Paraxial plane waves
Write an equation to describe the electric field variation along the x axis for a paraxial plane wave with amplitude E0 propagating in the xz plane at an angle θ relative to the z axis.
Write an equation in complex notation for a plane wave propagating at angles θx and θy relative to the z axis in the xz and yz planes, respectively. Express your answer in terms of kx = k sin θx, ky = k sin θy, and k only.
Write an inequality for kx, ky, and z in the paraxial regime. Use this inequality to write an expression for a paraxial plane wave.
(2.6) Paraxial spherical wave
Write an equation for a spherical wave with origin at z0. Rewrite this equation for the case of the paraxial regime. Comment on whether a lens with focal length f in the z = 0 plane would cancel or double the transverse phase dependence. What would the field look like upstream of the lens?
(2.7) Diverging and converging paraxial spherical waves
Write expressions for both diverging and converging paraxial spherical waves propagating along the z axis, if the two waves originate from z = −f and z = f, respectively.
(2.8) Collimation of a point source
Write an equation for the electric field of a spherical wave centred on the origin. Rewrite this equation in a plane a distance z = f downstream in the paraxial regime. Comment on the approximations used.
A plano-convex lens with focal length f is placed in the z = f plane. What is the form of the wave fronts downstream of the lens?
(2.9) Scalar and paraxial breakdown
Give an example of an optical instrument where the scalar approximation breaks down. Explain why these approximations break down in this case.
(2.10) Intensity
If the electric field at position (x, y, z) is given by E(x, y, z) = E0 e^{i(kx x + ky y + kz z)}, write an expression for the intensity distribution I(x, y, z) and comment on the spatial dependence.
(2.11) Dispersion
Both dispersion and diffraction may induce a change in the propagation direction. Explain why dispersion tends to deflect blue light more than red, whereas for diffraction it is the other way around.
Two waves: interference 3
Two roads diverged in a wood, and I —
I took the one less traveled by,
And that has made all the difference.
Robert Frost (San Francisco 1874–Boston 1963), Mountain Interval, 1916.
3.1 Introduction
3.2 A brief history
3.3 Two plane waves
3.4 Standing waves
3.5 Two spherical waves
3.6 Young’s interferometer
3.7 Plane plus spherical
3.8 Three waves
3.9 Diffraction grating
3.10 Interferometry
3.11 Fabry–Perot etalon
3.12 Michelson interferometer
Chapter summary
Exercises
3.1 Introduction
In Chapter 2 we considered one wave (with either planar or curved wave fronts). In this chapter we consider the sum of two waves. The ability to add any two wave solutions follows directly from the linearity of the wave equation and the principle of superposition, Section 1.5 in Chapter 1. The two waves could be either completely independent, one wave that induces another wave inside a medium, or two waves obtained from one by dividing the wave front or amplitude into two and subsequently recombining them. The two waves may have different frequencies, or the same frequency but different propagation directions, or even the same frequency and same direction but arrive via different paths. The sum of two waves gives rise to the phenomenon of interference, see Fig. 3.1. For two waves to interfere they must have a well-defined relative phase, a property known as coherence that we shall consider in Chapter 8. Depending on their relative phase, two waves may interfere either constructively or destructively. We begin with a brief history of interference phenomena.
Optics f2f: From Fourier to Fresnel. Charles S. Adams and Ifan G. Hughes
© Charles S. Adams and Ifan G. Hughes 2019.
Published in 2019 by Oxford University Press. DOI: 10.1093/oso/9780198786788.001.0001
Example 3.1
Two plane waves at angles ±θ0/2 to the z axis: Consider two plane
waves propagating in the xz plane at angles θ = ±θ0 /2 relative to the z axis,
such that the angle between them is θ0 , i.e., x and z, horizontal and vertical,
respectively in Fig. 3.4. The wave vectors are k1 = (k sin θ0 /2, 0, k cos θ0 /2) and
k2 = (−k sin θ0 /2, 0, k cos θ0 /2). In this case, eqn (3.1) becomes
\[ E = E_0\, e^{i[k\cos(\theta_0/2)z - \omega t]} \left[ e^{ik\sin(\theta_0/2)x} + e^{-ik\sin(\theta_0/2)x} \right] . \]
This equation says that we have a travelling wave along z and a standing wave along
x. The standing wave is formed because we have two components that propagate in
opposite directions along x. In Section 3.4, we shall consider the special case where
k1 and k2 are anti-parallel. By calculating the time average we find an intensity
\[ I = 4I_0 \cos^2[k\sin(\theta_0/2)x] = 2I_0 \{ 1 + \cos[2k\sin(\theta_0/2)x] \} , \tag{3.5} \]
where we have rewritten the result using 2 cos²(πux) = 1 + cos(2πux) to highlight that
the spatial frequency of cosine-squared is twice that of cosine.
There is no intensity variation along the bisector of k1 and k2 , i.e., along z, as
both plane waves have the same value of kz . However, there are periodic interference
fringes in the direction of Δk, which in this example is along x, as there are two
values of kx . This type of interference pattern may be observed, for example, when
a plane wave is reflected by a wedge-shaped piece of glass producing two reflected
waves propagating at different angles, see Exercise 3.2. By considering the intensity
variation along x, we find that the distance between maxima in the interference
pattern, Λ, is
\[ \Lambda = \frac{\lambda}{2\sin(\theta_0/2)} , \tag{3.6} \]
where θ0 is the angle between the wave vectors. In the small-angle approximation, the
spacing between the maxima is Λ ≈ λ/θ0 and the spatial frequency of the intensity
pattern is θ0/λ.
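To get a feel for eqn (3.6), the following Python sketch compares the exact fringe spacing with the small-angle estimate Λ ≈ λ/θ0. The wavelength (633 nm) and the crossing angle (1°) are hypothetical values chosen for illustration, and the function name is our own.

```python
import math


def fringe_spacing(wavelength, theta0):
    """Distance between intensity maxima, eqn (3.6): lambda / (2 sin(theta0/2))."""
    return wavelength / (2.0 * math.sin(theta0 / 2.0))


# Hypothetical values: a red laser and a 1 degree angle between the wave vectors.
wavelength = 633e-9           # metres
theta0 = math.radians(1.0)    # angle between k1 and k2 in radians

exact = fringe_spacing(wavelength, theta0)
small_angle = wavelength / theta0
print(f"exact spacing       = {exact * 1e6:.3f} um")
print(f"small-angle spacing = {small_angle * 1e6:.3f} um")

# Counter-propagating waves (theta0 = pi) give the standing-wave period lambda/2.
print(f"theta0 = pi gives {fringe_spacing(wavelength, math.pi) / wavelength:.2f} wavelengths")
```

For small crossing angles the two estimates agree closely, while θ0 = π recovers the λ/2 period of the standing wave considered in Section 3.4.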
function of time
\[ E = 2E_0\, e^{-i\omega t} \cos(\mathbf{k}\cdot\mathbf{r}) , \tag{3.7} \]
and we get a static or standing wave whose spatial dependence is fixed but with an amplitude that oscillates in time. As above, we detect the intensity (or time-averaged Poynting vector magnitude), which is
\[ I = 4I_0 \cos^2(\mathbf{k}\cdot\mathbf{r}) = 2I_0 [ 1 + \cos(2\mathbf{k}\cdot\mathbf{r}) ] . \tag{3.8} \]
Fig. 3.5 The formation of a standing wave. Left column: Two counter-propagating waves moving apart at times t = 0 (top) to (11/40)T (bottom) in intervals of T/40. Right column: The corresponding sum. At t = T/4 (row 11), the amplitude is zero. For t > T/4 (row 12), the amplitude starts to increase again. The shaded region indicates the maximum amplitude of the resulting standing wave.
Note that whereas the intensity of each individual plane wave is uniform,
their sum has an intensity distribution that varies sinusoidally along
the propagation direction, k. The spatial frequency of the interference
fringes is 2u = 2k/(2π), and as predicted by eqn (3.6), the spatial period
of the interference fringes is λ/2. A time sequence of two counter-propagating
waves and their sum is shown in Fig. 3.5.
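The λ/2 period of the standing-wave intensity can be checked numerically. The sketch below, a minimal illustration rather than anything from the text, adds two real counter-propagating waves of unit amplitude, averages the square of the sum over one optical period, and compares the result with 4 I0 cos²(kz) from eqn (3.8). Working in units of the wavelength, dropping all physical prefactors so that I0 = 1/2, and the number of time samples are assumptions made for convenience.

```python
import math


def time_averaged_intensity(z, wavelength, e0=1.0, steps=400):
    """Average of (E1 + E2)^2 over one period for counter-propagating real waves."""
    k = 2.0 * math.pi / wavelength
    total = 0.0
    for n in range(steps):
        wt = 2.0 * math.pi * n / steps
        field = e0 * math.cos(k * z - wt) + e0 * math.cos(-k * z - wt)
        total += field ** 2
    return total / steps


wavelength = 1.0   # positions are measured in units of the wavelength
i0 = 0.5           # time average of a single unit-amplitude wave (prefactors dropped)
for z in (0.0, 0.125, 0.25, 0.5):
    ratio = time_averaged_intensity(z, wavelength) / i0
    expected = 4.0 * math.cos(2.0 * math.pi * z / wavelength) ** 2
    print(f"z = {z:5.3f}  I/I0 = {ratio:.3f}  4cos^2(kz) = {expected:.3f}")
```

The intensity returns to its maximum at z = λ/2 and vanishes at z = λ/4, as expected for fringes with half the optical wavelength.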
\[ E = E_s \frac{e^{i[\mathbf{k}_1\cdot(\mathbf{r}-\mathbf{d}_1)]}}{ik|\mathbf{r}-\mathbf{d}_1|} + E_s \frac{e^{i[\mathbf{k}_2\cdot(\mathbf{r}-\mathbf{d}_2)]}}{ik|\mathbf{r}-\mathbf{d}_2|} , \tag{3.9} \]
where d1 and d2 are vectors that specify the origin of each spherical wave. We can make a useful simplification when the propagation distance z is much larger than both the wavelength λ and the separation of the sources, |d1 − d2|. In this case we can put r − d1,2 ≈ r̄, where r̄ is a vector from the mid-point between the source points to the observation point, see Fig. 3.6. In this case
\[ E = E_s \frac{e^{i(\mathbf{k}_1\cdot\bar{\mathbf{r}})}}{ik\bar{r}} \left\{ 1 + e^{i[(\mathbf{k}_2-\mathbf{k}_1)\cdot\bar{\mathbf{r}}]} \right\} , \tag{3.10} \]
and the intensity is
\[ I = 2\bar{I}_s \left\{ 1 + \cos[(\mathbf{k}_2-\mathbf{k}_1)\cdot\bar{\mathbf{r}}] \right\} , \tag{3.11} \]
Fig. 3.6 The interference pattern for two spherical waves with centres at positions d1 and d2 relative to the origin (marked by the black dot). The time-averaged intensity pattern is shown using the greyscale. The wave vectors at the observation position P (white dot) are k1 and k2 (indicated by the white arrows). The polar vector and a vector from the mid-point between the sources to the point P are r and r̄, respectively. The wave crests are indicated by the white semicircles. In contrast to plane waves, Fig. 3.4, the interference pattern spreads out as it propagates. Note the difference to an amplitude interference pattern, such as circular water waves, shown in Fig. 3.1.
\[ E = E_s \frac{e^{ikr_1}}{ikr_1} + E_s \frac{e^{ikr_2}}{ikr_2} , \tag{3.12} \]
where Es is the effective amplitude of the spherical waves,5 and r1 and r2 are the distances from the centre of each hole to the observation point, as defined in Fig. 3.7. Again we neglect the explicit time dependence as it cancels when we calculate intensity. In the far-field, z ≫ d, we can neglect the slight variation in the relative amplitude of the two waves and replace 1/√r by 1/√r̄, where r̄ = z + (x² + y²)/2z is the paraxial distance from the mid-point between the two apertures (0, 0, 0) to the observation point (x, y, z), see Fig. 3.7. In this case
\[ E = \bar{E}_s \left( e^{ikr_1} + e^{ikr_2} \right) , \tag{3.13} \]
5 Only if we specify a hole diameter can we relate Es to the amplitude of the incident field, E0, see Chapter 5.
\[ r_1 = \bar{r} - \frac{xd}{2z} + \frac{d^2}{8z} ; \quad r_2 = \bar{r} + \frac{xd}{2z} + \frac{d^2}{8z} . \tag{3.16} \]
In the far-field, where z ≫ x′, it is also convenient to neglect the x′²/2z = d²/8z term. This is known as the Fraunhofer approximation, see Chapter 5.7 For Young’s two apertures, we can retain the x′² terms, as they cancel in the final result. Substituting for r1 and r2, eqn (3.16) in eqn (3.13), the sum of the two waves becomes
\[ E = \bar{E}_s\, e^{ik(\bar{r}+d^2/8z)} \left( e^{ikdx/2z} + e^{-ikdx/2z} \right) = 2\bar{E}_s\, e^{ik(\bar{r}+d^2/8z)} \cos\left(\frac{kdx}{2z}\right) . \tag{3.17} \]
7 Note that although the x′²/2z becomes smaller with increasing z, the x′x/z does not, as the range of x increases linearly with z as the light field spreads out. Typically, z is at least a factor of two larger than x and more than an order of magnitude larger than x′.
This expression tells us how the path difference, ±dx/2z, and hence
the phasor angles, ±kdx/2z, depend on the transverse position x in the
observation plane. The prefactor contains information about the mean
distance between the apertures and the observation point, and gives rise
to a global phase that disappears when we calculate the time-averaged
intensity. The intensity is proportional to the modulus-squared of the
field:
\[ I = 4\bar{I}_s \cos^2\left(\frac{kdx}{2z}\right) = 2\bar{I}_s \left[ 1 + \cos\left(\frac{2\pi x}{(\lambda/d)z}\right) \right] , \tag{3.18} \]
intensity, the global phase factor, e^{ik(r̄+d²/8z)}, has disappeared and only
the relative phase between the two paths matters.
The intensity far downstream of Young’s two apertures, eqn (3.18)
plotted in Fig. 3.8, varies sinusoidally with position along the x axis
Example 3.2
Young’s double slit using eqn (3.11): Consider two slits aligned in the vertical,
or y direction, centred at (±d/2, 0), as in Fig. 3.9. The mid-point between the slits
is at the origin. If the field is uniform along y, we need only consider the field
dependence along x. From eqn (3.11) the sum of the two waves is
I = 2Īs {1 + cos [(k2 − k1 ) · r̄]} , (3.19)
where for slits the two waves are cylindrical, and Īs = Is /kr̄ with Is equal to the
on-axis intensity in the observation plane for a single slit. If θ1 and θ2 are the angles
from the centre of each slit to the observation point relative to the z axis, then
(k2 − k1 ) · r = (k sin θ2 − k sin θ1 )x + (k cos θ2 − k cos θ1 )z .
Fig. 3.9 Geometry of the double slit used in Example 3.2. In practice, z and d, more like in Fig. 3.8.
In the far-field, z ≫ x, we can make use of the small-angle approximation,
\[ \sin\theta_1 \approx \theta_1 = \frac{x - d/2}{z} ; \quad \cos\theta_1 \approx 1 , \]
\[ \sin\theta_2 \approx \theta_2 = \frac{x + d/2}{z} ; \quad \cos\theta_2 \approx 1 . \]
Substituting into the intensity formula we find
\[ I = 2\bar{I}_s \left[ 1 + \cos\left(\frac{kdx}{z}\right) \right] . \tag{3.20} \]
This result is identical to the two-hole example, eqn (3.18), except for the different
form of Īs .
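The fringe pattern of eqn (3.20) is easy to evaluate numerically. The Python sketch below returns the normalized intensity I/Īs and the fringe period (λ/d)z. The slit separation, screen distance, and wavelength are hypothetical values picked for illustration, and the function names are our own.

```python
import math


def young_intensity(x, d, z, wavelength):
    """Normalized far-field intensity I / I_s-bar = 2[1 + cos(k d x / z)], eqn (3.20)."""
    k = 2.0 * math.pi / wavelength
    return 2.0 * (1.0 + math.cos(k * d * x / z))


def fringe_period(d, z, wavelength):
    """Spacing between adjacent bright fringes, (lambda / d) z."""
    return (wavelength / d) * z


# Hypothetical set-up: 0.25 mm slit separation, 1.5 m to the screen, 633 nm light.
d, z, wavelength = 0.25e-3, 1.5, 633e-9
period = fringe_period(d, z, wavelength)
print(f"fringe period = {period * 1e3:.3f} mm")
for x in (0.0, period / 4, period / 2, period):
    print(f"x = {x * 1e3:6.3f} mm   I / I_s = {young_intensity(x, d, z, wavelength):.3f}")
```

The intensity swings between 4 Īs at the bright fringes and zero midway between them, so the average over one period is 2 Īs, twice the single-slit value.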
where we have used the paraxial form of the spherical wave, eqn (2.40).
The corresponding intensity distribution is
\[ I = 2I_s \left[ 1 + \cos\left(\frac{k\rho^2}{2z}\right) \right] = 4I_s \cos^2\left(\frac{k\rho^2}{4z}\right) . \tag{3.22} \]
This intensity pattern in the xy plane is shown in Fig. 3.10. As we have seen before, interference converts phase information (in this case the phase across the wave front of a spherical wave as in Fig. 2.1) into intensity information. The plane wave acts as a reference that generates an intensity read-out of the phase of the other wave.9 The first zero moving away from the centre occurs at a radius ρ1, where kρ1²/4z = π/2, which gives
\[ \rho_1 = \sqrt{\lambda z} . \tag{3.23} \]
The subsequent dark fringes occur where the phase difference kρ²/2z is an odd multiple of π,
\[ \rho_m = \sqrt{(2m-1)\lambda z} , \tag{3.24} \]
where m is a positive integer.
Fig. 3.10 The interference pattern between a plane wave and a spherical wave a distance z downstream of the origin of the spherical wave.
9 We shall encounter similar patterns when we look at phase differences between on-axis and off-axis paths in Chapter 5.
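As a quick check of eqns (3.22) to (3.24), the sketch below computes the radii of the first few dark rings, ρm = √((2m − 1)λz), and confirms that the intensity 4 Is cos²(kρ²/4z) vanishes there. The wavelength and distance are hypothetical values, and the function names are our own.

```python
import math


def dark_ring_radius(m, wavelength, z):
    """Radius of the m-th dark ring, eqn (3.24), for m = 1, 2, 3, ..."""
    return math.sqrt((2 * m - 1) * wavelength * z)


def ring_intensity(rho, wavelength, z):
    """Normalized intensity I / I_s = 4 cos^2(k rho^2 / 4z), eqn (3.22)."""
    k = 2.0 * math.pi / wavelength
    return 4.0 * math.cos(k * rho ** 2 / (4.0 * z)) ** 2


# Hypothetical values: 633 nm light observed 0.5 m from the point source.
wavelength, z = 633e-9, 0.5
for m in (1, 2, 3):
    rho = dark_ring_radius(m, wavelength, z)
    print(f"m = {m}: rho = {rho * 1e3:.3f} mm, I / I_s = {ring_intensity(rho, wavelength, z):.1e}")
```

Because the radii scale as the square root of odd integers, the rings crowd together with increasing radius.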
In Fig. 3.11 we show the interference pattern in the xz plane. A pattern of this type would be produced if a point-like scatterer (at the origin) reflects a part of the incident plane wave. Alternatively, by placing a mask at −z that reproduces the amplitude of the interference pattern we can recreate the field produced by the scatterer when it is not there. This is the principle of holography, invented by Dennis Gabor (Budapest 1900–London 1978) while working at British Thomson-Houston in 1947. The mask needed to create the required phase and amplitude is known as a hologram. As any object is simply an array of scattering points, holography can recreate images of three-dimensional objects. Figure 3.11 also provides an example of how interference can produce light patterns that appear not to travel in straight lines!
Fig. 3.11 Intensity pattern in the xz plane for a superposition of a plane wave and a spherical wave. A ‘mask’ placed at −z that creates the field shown in Fig. 3.10 reproduces an image of a point-like object at the origin. A complex mask or hologram can be used to recreate any scattered field.
Example 3.3
Newton’s rings: A pattern similar to Fig. 3.10 was first observed by Newton
in 1717 when he looked at the reflection from a spherical glass surface placed
on top of a reflecting planar surface. A similar pattern is also observed due to
the interference between the reflections from planar and curved surfaces of a
plano-convex lens, as illustrated in Fig. 3.12. The sum of planar and curved
waves from the front and back surface of a lens with radius of curvature RL is
\[ E = R^{1/2}E_0\, e^{ikz} - (1 - R^{1/2})R^{1/2}E_0\, e^{ikz} e^{2inkt} , \]
\[ \phantom{E} = R^{1/2}E_0\, e^{ikz} - (1 - R^{1/2})R^{1/2}E_0\, e^{ikz} e^{2inkt_0} e^{-ink\rho^2/R_L} , \]
where R^{1/2} is the amplitude reflection coefficient as defined in eqn (2.32) and
t0 is the thickness of the lens. If nkt0 = mπ, then we observe bright fringes at
positions ρm = √((2m − 1)λRL/2) and the pattern is similar to Fig. 3.10 with
the bright and dark fringes interchanged.
We plot both the normalized field E/Ēs and the normalized intensity
I/Īs as a function of position in the observation plane in Fig. 3.14.
The new feature compared to two apertures is that there are now two
types of peaks; big peaks with relative intensity 9, and smaller peaks
with relative intensity 1. These two types of peak are called principal
maxima and subsidiary maxima, respectively.
The principal and subsidiary maxima are easily interpreted in terms of phasor diagrams. For a principal maximum, the phasors interfere constructively, giving a resultant field vector that is three times larger and an intensity (proportional to the modulus-squared of the field) that is nine times larger, see Fig. 3.14.10 As we move away from a principal maximum along the x axis, one phasor remains fixed while the other two rotate clockwise and anti-clockwise at the same rate. We can think of this as a clock face with three hands. At a position x the relative angles of the phasors are φ = −kdx/z, 0, and +kdx/z. When kdx/z = 2π/3, i.e. x = (1/3)(λ/d)z, the three phasors interfere destructively to give zero. We refer to the zero on either side of the central maximum as the first zero. When kdx/z = π, the two rotating phasors are in the opposite direction to the fixed phasor, giving a resultant of minus one, and hence the intensity is one. This case corresponds to the subsidiary maximum and is midway between principal maxima. As we move further along x we eventually come back to the position at kdx/z = 2π where all the phasors line up again, giving another principal maximum.
10 Note that the spatial average of the intensity along the x axis is three times that of a single phasor. This highlights the key feature of interference: energy is conserved but spatially redistributed.
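The three-handed clock picture can be reproduced in a few lines of Python. In this minimal sketch, delta stands for the relative phasor angle kdx/z, and the function name is our own; it sums three unit phasors at angles −delta, 0, and +delta and returns the normalized intensity.

```python
import cmath
import math


def three_phasor_intensity(delta):
    """Normalized intensity |e^{-i delta} + 1 + e^{+i delta}|^2 for delta = k d x / z."""
    field = cmath.exp(-1j * delta) + 1.0 + cmath.exp(1j * delta)
    return abs(field) ** 2


cases = [
    ("principal maximum", 0.0),
    ("first zero", 2.0 * math.pi / 3.0),
    ("subsidiary maximum", math.pi),
    ("next principal maximum", 2.0 * math.pi),
]
for label, delta in cases:
    print(f"{label:24s} delta = {delta:.3f} rad   I / I_s = {three_phasor_intensity(delta):.3f}")
```

The four cases give relative intensities of 9, 0, 1, and 9 (to rounding error), matching the principal maxima, first zero, and subsidiary maximum described above.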
Example 3.4
Three plane waves: Consider the sum of three plane waves propagating at angles −θ0, 0, and +θ0, relative to the z axis in the xz plane. The total field is
\[ E = E_0 \left[ e^{i(k\sin\theta_0 x + k\cos\theta_0 z)} + e^{ikz} + e^{i(-k\sin\theta_0 x + k\cos\theta_0 z)} \right] , \tag{3.27} \]
and the intensity distribution along the x axis in the far-field has the form
\[ I = I_0 \left\{ 1 + 4\cos[k(\cos\theta_0 - 1)z]\cos(k\sin\theta_0 x) + 4\cos^2(k\sin\theta_0 x) \right\} . \tag{3.28} \]
The intensity pattern is plotted in Fig. 3.15. As the third plane wave has a different spatial frequency along z, the intensity remains a function of z, repeating over a distance Λ = λ/(1 − cos θ0).
Fig. 3.14 Normalized field (a) and intensity (b) corresponding to the sum of three phasors used to describe the far-field diffraction pattern produced by three small apertures. Phasor diagrams for positions where I/Is = 0, 1, and 9 are shown.
\[ I = \bar{I}_s \frac{\sin^2(Nkdx/2)}{\sin^2(kdx/2)} . \tag{3.29} \]
This function is plotted in Fig. 3.16. The key features are that: the
\[ E_r = E_i \sum_{m=-N/2}^{N/2} e^{i\phi_m} , \tag{3.31} \]
Example 3.5
Wavelength resolution: A grating is typically used to separate different wave-
lengths. If we change the wavelength, then the principal maxima move (except for
the central, or zero-order, maximum). If we consider two wavelengths, λ1 and λ2 ,
then the first-order diffraction peaks are at relative positions (λ1 /d)z and (λ2 /d)z,
respectively. We can say that the two wavelengths are ‘resolved’, if the first
principal maximum at one colour (λ1 ) overlaps with the zero adjacent to the principal
maximum of the other colour (λ2 ), as shown in Fig. 3.18.13 As the distance to the
first zero is (λ/N d)z, this condition gives
\[ \frac{\lambda_2}{d} z = \frac{\lambda_1}{d} z + \frac{\lambda_1}{Nd} z . \tag{3.33} \]
Rearranging, we find that
\[ \frac{\lambda_1}{\Delta\lambda} = \frac{\lambda_1}{\lambda_2 - \lambda_1} = N . \tag{3.34} \]
This quantity is often called the resolving power of the grating. The equation tells us that the smallest wavelength difference we can hope to resolve is inversely proportional to the number of slits (or lines) on the grating. Note that the resolving power is a factor of two higher for the second-order diffraction peak.
Fig. 3.18 Detail of the zero and first order for a diffraction grating illuminated by two colours, λ1 (grey) and λ2 (black). In this example, the wavelength difference is chosen such that first order at λ2 sits at the position of a zero for λ1.
13 See also Chapter 9 for further discussion of this point.
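Equation (3.34) can be turned around to estimate how many slits must be illuminated to resolve two nearby wavelengths. The sketch below is a minimal illustration: the function name is our own, and the scaling of the resolving power with diffraction order (order × N) is an assumption extrapolated from the remark about the second-order peak. As test values it uses the sodium D-line wavelengths, 589.0 nm and 589.6 nm.

```python
import math


def slits_needed(wavelength, delta_wavelength, order=1):
    """Smallest number of slits N with order * N >= wavelength / delta_wavelength.

    For order = 1 this is eqn (3.34); the order dependence is an assumption
    based on the factor of two quoted for the second-order peak.
    """
    return math.ceil(wavelength / (order * delta_wavelength))


wavelength = 589.0e-9        # shorter sodium D-line wavelength in metres
delta_wavelength = 0.6e-9    # separation of the two D lines in metres
for order in (1, 2):
    n = slits_needed(wavelength, delta_wavelength, order)
    print(f"order {order}: at least {n} illuminated slits")
```

Close to a thousand slits are needed in first order, which is why spectroscopic gratings are ruled with many lines per millimetre and illuminated over a wide beam.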
3.10 Interferometry
The application of interference to measurement is known as interfer-
ometry. Interferometry allows us to measure changes in the phase of a
wave, and is used to measure length, as in gravitational wave detection
(Abbott et al. 2016) or the spectrum of light, see Chapter 8. Young’s
two-hole experiment, N -slits, and the reflection grating in Fig. 3.17
are examples of wave front division interferometry, where the
wave front is divided spatially, and the component parts subsequently
recombine and interfere. The colour of a butterfly wing, or the chirped echo from the steps of the Chichen Itza pyramid14 are other examples. It is also possible to produce interference by amplitude division, where a part of the light field is redirected by a partially reflecting surface, then the two parts are recombined at the same, or another, interface. For measurement applications, amplitude division has the advantage that no light is thrown away. Below we focus on two examples of amplitude division interferometry: the Fabry–Perot interferometer15 and the Michelson interferometer.16 Other examples include the Mach–Zehnder, Sagnac, and Jamin. Aside from differences in optical layout and applications, the underlying physics of all interferometers is the same, namely, the addition of two (or more) waves.
14 The reflections of successive steps produce a rising pitch similar to the call of the Quetzal bird.
15 Named after Maurice Paul Auguste Charles Fabry (Marseille 1867–Paris 1945) and Jean-Baptiste Alfred Pérot (Metz 1863–Paris 1925). Fabry, along with Henri Buisson, also discovered the ozone layer, see Mulligan (1998).
16 Named after Albert Abraham Michelson (Strzelno 1852–Pasadena 1931) who used it to try to measure the motion of the Earth through the aether.
3.11 Fabry–Perot etalon
In Chapter 2, we looked at the reflection and transmission of light at an interface. In this section we consider two interfaces where light reflects back and forth giving rise to multiple-path interference. In optics, a system that reflects light back and forth is variously known as a Fabry–Perot interferometer, an etalon,17 or a cavity. An example is illustrated in Fig. 3.19. The physics of the Fabry–Perot interferometer, described using an N phasor sum, provides a convenient starting point to understand a diverse range of interference phenomena, including the colour of soap and oil films, anti-reflection coatings on optics and laser cavities. All that matters in each case is the reflection coefficient of each interface, R^{1/2}, their separation, ℓ, and the wavelength of the light, λ.
17 Translated from the French as standard.
The transmitted field is a sum of light transmitted directly and light that is reflected back and forth and then transmitted as illustrated in Fig. 3.19. If we assume that the incident light is a plane wave and the amplitude reflection and transmission coefficients at each interface are R^{1/2} and T^{1/2}, respectively,18 then we can write the total transmission as a geometric progression:
\[ E_t = E_0 T \left( 1 + Re^{i\phi} + R^2 e^{2i\phi} + \ldots \right) = E_0 T \frac{1}{1 - Re^{i\phi}} , \tag{3.35} \]
where φ = 2nkℓ/cos θ is the phase accumulated during one round trip, n is the refractive index, θ is the angle of propagation (relative to z) inside the Fabry–Perot, and ℓ is the length. The transmitted intensity is given by the modulus-squared:19
\[ \frac{I_t}{I_0} = \frac{1}{1 + 4(\mathcal{F}^2/\pi^2)\sin^2(\phi/2)} , \tag{3.36} \]
where we have defined
\[ \mathcal{F} = \frac{\pi\sqrt{R}}{1 - R} , \tag{3.37} \]
which is known as the finesse. The finesse determines the sensitivity of the transmission (or reflection) to small changes in the wavelength or spacing between the two interfaces.20 For example, if we plot the transmitted intensity, eqn (3.36), as a function of the length, ℓ, for two values of the reflectivity, R = 0.10 (F = 1.1, grey) and R = 0.80 (F = 14, black), in Fig. 3.20, we see a dramatic change in the width of the transmission maxima. For low reflectivity and hence low finesse, the transmission oscillates sinusoidally as a function of the etalon or cavity length, ℓ, with peak transmission whenever the round-trip phase is an integer multiple of 2π. As the phase depends on the wavelength, λ, for white light illumination, the interference fringes are coloured. This effect gives rise to the colour observed in oil or soap films.
Fig. 3.19 A Fabry–Perot interferometer where two interfaces at z = −ℓ and z = 0 reflect light back and forth.
18 The reflection and transmission coefficients are, in general, complex as there is a phase shift on reflection; however, including these phase shifts does not change the main result.
19 The modulus-squared of eqn (3.35) is
\[ \frac{I_t}{I_0} = \frac{T}{1 - Re^{i\phi}} \, \frac{T}{1 - Re^{-i\phi}} = \frac{T^2}{1 + R^2 - 2R\cos\phi} = \frac{T^2}{1 + R^2 - 2R(1 - 2\sin^2\phi/2)} = \frac{T^2}{(1 - R)^2 + 4R\sin^2\phi/2} . \]
Using T² = (1 − R)² we get
\[ \frac{I_t}{I_0} = \frac{1}{1 + [4R/(1 - R)^2]\sin^2\phi/2} . \]
20 In Example 7.12, we show that the finesse also relates to the average number of times that light is reflected back and forth before escaping.
For high reflectivity interfaces, the transmission consists of narrow
resonances, the black curve in Fig. 3.20. Again, the resonances occur
when the phase difference is an integer multiple of 2π, i.e., the mth
resonance is given by φm = 2nkℓ/cos θ = 2mπ. If we write φ =
φm + δ then for high finesse, the transmission function, eqn (3.36), is
only non-zero in regions where δ is small, and we use the small-angle
approximation to write sin φ/2 ≈ ±δ/2. Using φm = 2nkmℓ/cos θ and
km = 2πνm/c, for n = 1 and θ = 0, δ/2 = π(ν − νm)(2ℓ/c) and eqn (3.36)
becomes
\[ \frac{I_t}{I_0} = \frac{(\Delta\nu/2)^2}{(\nu - \nu_m)^2 + (\Delta\nu/2)^2} , \tag{3.38} \]
where Δν is the full-width at half-maximum (FWHM) of the resonances,
\[ \Delta\nu = \frac{\Delta\nu_{\rm fsr}}{\mathcal{F}} , \tag{3.39} \]
and Δνfsr = c/(2ℓ) is known as the free spectral range. Equation
(3.38) says that for high finesse, the transmission is a sum of Lorentzian
resonances with spacing Δνfsr, and width, Δνfsr/F. As a high
finesse cavity or etalon has narrow transmission peaks, it is extremely
sensitive to small changes in either the cavity length or the wavelength
(equivalently frequency) and can be used as a frequency reference.
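The finesse values quoted for R = 0.10 and R = 0.80, and the equivalence of eqn (3.36) with the modulus-squared of the geometric sum in eqn (3.35), can be checked with the sketch below. It is a minimal illustration assuming lossless interfaces, T = 1 − R, as in the derivation of eqn (3.36); the function names and the 10 cm cavity length in the last line are our own choices.

```python
import cmath
import math


def finesse(R):
    """Finesse F = pi sqrt(R) / (1 - R), eqn (3.37)."""
    return math.pi * math.sqrt(R) / (1.0 - R)


def transmission(phi, R):
    """Transmitted fraction I_t / I_0 from eqn (3.36)."""
    F = finesse(R)
    return 1.0 / (1.0 + 4.0 * (F / math.pi) ** 2 * math.sin(phi / 2.0) ** 2)


def transmission_from_field(phi, R):
    """Modulus-squared of E_t / E_0 = T / (1 - R e^{i phi}), assuming T = 1 - R."""
    T = 1.0 - R
    return abs(T / (1.0 - R * cmath.exp(1j * phi))) ** 2


def linewidth(length, R, c=299_792_458.0):
    """FWHM = free spectral range / finesse, eqn (3.39), for n = 1 and theta = 0."""
    return c / (2.0 * length) / finesse(R)


for R in (0.10, 0.80):
    print(f"R = {R:.2f}: finesse = {finesse(R):.2f}, "
          f"minimum transmission = {transmission(math.pi, R):.3f}")
print(f"hypothetical 10 cm cavity, R = 0.80: FWHM = {linewidth(0.10, 0.80) / 1e6:.1f} MHz")
```

Raising R from 0.10 to 0.80 pushes the transmission midway between resonances from roughly two-thirds down to about one per cent, which is the narrowing of the peaks described in the text.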
\[ I = \tfrac{1}{2} I_0 \left[ 1 + \cos(4\pi\Delta\ell/\lambda) \right] , \tag{3.41} \]
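The behaviour of eqn (3.41) can be explored with a short sketch. Here Δℓ is taken to be the difference in length between the two arms, which is an assumption on our part, and the function name and the 633 nm wavelength are illustrative choices.

```python
import math


def michelson_output(delta_l, wavelength, i0=1.0):
    """Output intensity I = (I0 / 2) [1 + cos(4 pi delta_l / lambda)], eqn (3.41).

    delta_l is assumed to be the arm-length difference.
    """
    return 0.5 * i0 * (1.0 + math.cos(4.0 * math.pi * delta_l / wavelength))


wavelength = 633e-9  # hypothetical laser wavelength in metres
for fraction in (0.0, 1.0 / 8.0, 1.0 / 4.0, 1.0 / 2.0):
    delta_l = fraction * wavelength
    print(f"delta_l = {fraction:.3f} wavelengths   I / I0 = {michelson_output(delta_l, wavelength):.3f}")
```

The output goes from bright to dark for an arm-length change of only λ/4, because the light traverses each arm twice. At Δℓ = λ/8 the output sits at half maximum, where the slope with respect to Δℓ is steepest.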
Chapter summary
• The sum of two waves (with either planar or curved wave fronts)
produces interference.
• Constructive or destructive interference depends on the relative
phase of the two waves.
• Young’s two-hole experiment is an example of a wave front
division interferometer. For small holes the interference pattern is
given by the sum of two spherical waves.
• The interference of a plane wave and a spherical wave produces a
Newton’s ring interference pattern.
• Thin films, diffraction gratings, and laser cavities are all examples
of multiple-path interference.
• Young’s two-hole experiment and the diffraction grating are
examples of wave front division. A Fabry–Perot and Michelson
are examples of amplitude division interferometry, where light
interferes via multiple reflections between two reflective interfaces.
Exercises
(3.1) Interference with two inclined plane waves of different amplitudes
Rework the analysis of Section 3.3 for two plane waves with the same frequency, propagating in different directions, with different amplitudes. Show that the interference pattern is still periodic in space. What is the spatial period? What are the maximum and minimum intensities?
(3.2) Wedge fringes
A plane wave with wavelength 633 nm is incident on a pane of glass whose front and back surface normals are inclined at angles of ±0.050° relative to the propagation direction. Calculate the spatial period of the fringes observed in reflection.
(3.3) Double slit with a green laser pointer
A spherical wave is written as E = Es e^{ikr}/(ikr). Explain why there is a factor of k in the denominator. In the Fraunhofer approximation, the distance r between a point (x′, 0) in the input plane and a point (x, z) in the observation plane is given by r = r̄ − x′x/z, where r̄ is the distance between (0, 0) and (x, z). Rewrite the spherical wave in the Fraunhofer approximation and show that it can be written in the form
\[ E = E_s \frac{e^{ik\bar{r}}}{ik\bar{r}} (1 + \epsilon)\, e^{i\phi} . \]
Give expressions for ε and φ. In a Young’s double-slit experiment using a green laser pointer, the slit positions are at x′ = ±0.5 mm and the distance to the screen is z = 1 m. Estimate the size of the phase term φ and the correction to the amplitude ε for a laser wavelength λ = 500 nm. As r̄ = z + x²/2z, we can write that 1/r̄ = 1/z to first order in x/z. Use your answers to justify a further approximation in order to re-write the spherical wave in terms of x′, x, and z only.
(3.4) Adding N phasors
The phase of a wave evolves as e^{ikr}, where r is the distance traversed. Write an expression for r for the case where the start and finish coordinates in the xz plane are (x′, 0) and (x, z), respectively. Rewrite r for the case where z ≫ x and x′. Give expressions for r that are used in the Fresnel and Fraunhofer approximations, respectively.
Write an expression for the sum of 4 phasors with source points x′ = −(3/2)d, −(1/2)d, (1/2)d, and (3/2)d.
The intensity of light is proportional to the modulus-squared of the field amplitude. Write an expression for the modulus-squared of the phasor sum. Express your answer in terms of cosines. What is the maximum value?
Draw phasor diagrams corresponding to the observer positions, (i) x = λz/2d and (ii) λz/d, and specify the intensity in both cases.
(3.5) Summing plane waves
In an optics experiment, the light field can be approximated by three plane waves with amplitude E0 propagating at angles −θ, 0, and +θ relative to the z axis. Write an expression for the field along the x axis.
(3.6) Summing real waves
In the xz plane, the general plane solution to Maxwell’s wave equation is E = E0 cos(kx x + kz z − ωt). Consider two plane waves propagating at angles ±θ relative to the z axis. Write an expression for the total field along the x axis. Re-write the sum in the form of a standing wave and discuss what happens as a function of time. What is the field at ωt = π? Explain, briefly, what this means for the energy of the field and energy conservation.
(3.7) Light and water
In a phasor model of the tides, see Fig. 3.2, two phasors are sufficient to explain a wave form with both principal and subsidiary maxima, i.e. alternating larger and smaller peaks. In contrast, for light, three phasors are needed to account for an equivalent intensity pattern, see Fig. 3.15. Explain, briefly, the difference between the two cases.
(3.8) Young’s two holes
Young made two holes in an opaque screen with a spacing of 1 mm. He observed the interference pattern on a screen a distance 2 m downstream. What was the spacing between the interference fringes assuming that the centre wavelength of light is 550 nm?
(3.9) More than two holes
A screen contains four narrow slits uniformly spaced with separation d. Give their positions along the x axis assuming that they are symmetrically distributed about the z axis.
Write a phasor sum in the far-field.
Sketch phasor diagrams for (i) a position with zero intensity on either side of the principal maxima, and (ii) a position with zero intensity midway between principal maxima.
(3.10) Fabry–Perot etalon
Show that the free spectral range of a Fabry–Perot etalon is Δλ = λ²/(2nℓ), where ℓ is the length and n is the refractive index inside the etalon. Find the optimal thickness of a thin film of titanium dioxide intended to partially separate the D-lines of sodium with wavelengths of 589.0 and 589.6 nm.
(3.11) Sensitivity of a gravitational wave detector
A Michelson interferometer consists of a beamsplitter that divides an input with amplitude E0 into two equal amplitude ‘arms’ with lengths ℓ1 and ℓ2. A mirror retro-reflects each arm such that the two paths interfere at the beamsplitter.
(a) Write an expression for the output field after the two paths recombine at the beamsplitter. State any assumptions you make.
(b) The path difference, ℓ2 − ℓ1, is chosen such that the intensity at the output is one-half of the maximum value. A gravitational wave arriving at a Michelson interferometer increases the length of one arm by Δℓ, and decreases the length of the other arm by Δℓ.
(c) Write an expression for ℓ2 − ℓ1 in terms of the wavelength λ in the absence of a gravitational wave, i.e. when Δℓ = 0.
(d) Next, write an expression for the output intensity as a function of Δℓ, assuming that Δℓ is small.
(e) If the power circulating in each arm is 0.8 MW and the minimal detectable signal is 1.0 μW, the wavelength is 0.5 μm and the length of each arm is 4 km, estimate the minimum strain, Δℓ/ℓ, that can be detected in principle.
(f) Give two reasons why Young’s double-slit interferometer is less well suited to measure gravitational waves than a Michelson interferometer.
Hints: cos(A + B) = cos A cos B − sin A sin B. For small B, sin B = B, cos B = 1 and cos(A + B) = cos A − B sin A.
(3.12) Energy conservation in the Michelson interferometer
A Michelson interferometer is adjusted such that the output as expressed by eqn (3.41) is zero. Where has the energy gone?
4 Polarization

‘I remember in 37 when . . . you could go up a spiral staircase and sit up on top. Those were great, great days.’
Tiny Tim (New York 1932–Hennepin County 1996).

4.1 Introduction
4.2 Linear basis (|)
4.3 Linear polarization (|)
4.4 Circular polarization (|)
4.5 Elliptical polarization (|)
4.6 Circular basis (◦)
4.7 Poincaré sphere (◦)
4.8 Photon spin (◦)
4.9 Polarized light in a medium
4.10 Polarizers
4.11 Malus’ Law
4.12 Linear birefringence (|)
4.13 Wave plates (|)
4.14 Circular birefringence (|)
4.15 Natural optical activity (|)
4.16 The Faraday effect (|)
4.17 Interference
Chapter summary
Exercises

4.1 Introduction

Polarization is a fundamental property of any wave motion that can sustain oscillations in more than one direction for a given direction of propagation.1 Light can exist in two polarization states: photons have an angular momentum or spin, either parallel or anti-parallel to the propagation direction.2 Consequently, we can think of polarization as a two-wave phenomenon. Most light sources such as lamps or stars tend to produce light with a mixture of polarization states (unpolarized light). Unpolarized light can be converted into polarized light using optical components. Lasers tend to produce polarized light.

The polarization properties of light are responsible for many everyday optical phenomena such as the reduction of glare, or scatter, when looking through polarizing sun glasses; the anti-glare devices on display screens and monitors; optical devices such as DVD players; and the glasses used in 3D cinema. Polarization analysis is used in the eyes of the mantis shrimp and other animals. Understanding the polarization properties of light is of vital importance in optical science, and finds utility in other fields.3 The scattered blue light from the sky is polarized, with the extent and orientation of polarization depending on the viewing angle with respect to the Sun. There is evidence that bees can detect the direction of the electric-field vector in the celestial polarization pattern (Evangelista et al. 2014), and it is thought that the Vikings used sky-polarimetric techniques for maritime navigation (Horváth et al. 2011).

In this chapter, we shall investigate the polarization properties of optical waves of infinite transverse spatial extent,4 formed from the superposition of two co-propagating monochromatic waves of the same frequency,5 but with different electric field directions. The vector that specifies the direction of the electric field is called the polarization vector. The harmonic waves being superposed may have different amplitudes and phase.

1 The recently observed gravitational waves (Abbott et al. 2016) also display polarization phenomena.
2 As photons are massless and cannot be brought to rest, they can only have two angular momentum states.
3 In 1848, the study of polarized light propagating through solutions of tartaric acid led to the discovery of chiral chemistry by Louis Pasteur (Dole 1822–Marnes-la-Coquette 1895), see Section 4.15.
4 Using Fourier techniques from Chapter 6 it is possible to generalize the treatment to spatially localized waves by summing over many wave vector orientations, see Chapter 12.
5 Again Fourier techniques can be employed to extend the analysis to non-monochromatic waves.

First, we consider polarized light propagating in free space. In a plane perpendicular to the propagation direction, two orthogonal basis
Optics f2f: From Fourier to Fresnel. Charles S. Adams and Ifan G. Hughes
© Charles S. Adams and Ifan G. Hughes 2019.
Published in 2019 by Oxford University Press. DOI: 10.1093/oso/9780198786788.001.0001
field is zero; the electric field vector sweeps out a spiral or helix as it
propagates in space. The sign of the phase difference of π/2 dictates the
sense of rotation of the electric field.
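Substituting $E_1 = E_0/\sqrt{2}$ and $E_2 = E_1 e^{\pm i\pi/2}$ in eqn (4.1) we obtain the following expressions for L- and R-circularly polarized light:
$$\mathbf{E}_L = \frac{1}{\sqrt{2}} E_0 \left(\hat{\boldsymbol{\epsilon}}_1 + i\hat{\boldsymbol{\epsilon}}_2\right) e^{i(\mathbf{k}\cdot\mathbf{r}-\omega t)}, \tag{4.4}$$
$$\mathbf{E}_R = \frac{1}{\sqrt{2}} E_0 \left(\hat{\boldsymbol{\epsilon}}_1 - i\hat{\boldsymbol{\epsilon}}_2\right) e^{i(\mathbf{k}\cdot\mathbf{r}-\omega t)}. \tag{4.5}$$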
whereas the ˆ2 components exactly cancel, leaving a plane wave linearly
polarized along ˆ1 . When E− and E+ have the same magnitude but
different phases the resultant wave is linearly polarized, but at another
angle in the transverse plane; the relevant calculation is outlined in the
end-of-chapter exercises. An example of this decomposition is shown
in Fig. 4.7. The concept that a phase difference between two circularly
polarized waves of opposite polarization leads to a rotation of the plane
of polarization will be useful in the context of optical activity, and
forms the basis for the explanation of Faraday rotation, as we shall
discuss in Section 4.14.
4.10 Polarizers
A polarizer or polarizing filter modifies both the polarization state
and the amount of light transmitted. A polarizer resolves the input
electric field into orthogonal components, and only transmits one of
them. A polarizer can be used to turn an unpolarized wave into a
polarized wave, at the expense of a reduction in the intensity of the
transmitted wave.
Polarizers may be based on either dichroism or Fresnel reflection.
Light incident at an interface at Brewster’s angle experiences a reflec-
tivity of 0% for the in-plane p-polarization, and 15% for the out-of-
plane s-polarization, see Fig. 2.10. Consequently, the reflected light
is s-polarized and the transmitted light has a higher percentage of p-
polarization. By using two media with high and low indices it is possible
to adjust Brewster’s angle to be 45°. Using a stack of alternating high
and low index layers, the reflection coefficient for the s-polarization can
be increased to close to 100%. This device, illustrated schematically in
Fig. 4.11, is known as a polarizing beam-splitter cube.
A polarizing filter attenuates one polarization and transmits the other, as in Fig. 4.12. An ideal filter would have no attenuation for one component, and infinite extinction for the orthogonal component. The most common polarizing filter is Polaroid, where the vastly different extinction along different axes is a manifestation of the alignment of herapathite (iodoquinine sulfate) crystals embedded in a plastic sheet.14

Fig. 4.11 A polarizing beam-splitter cube. A stack of layers with high and low refractive indices inserted between two prisms reflects the out-of-plane s-polarization and transmits the in-plane p-polarization.

Fig. 4.12 A polarizer (grey disc) only transmits light parallel to a particular axis. In this example, the axis of the polarizer is vertical and the input light is linearly polarized at +45°. Only the vertical component is transmitted. The amplitude is reduced by a factor of √2, and the intensity by a factor of 2. In contrast to earlier figures, now we indicate the polarization state by the black line that follows the time evolution of the electric field vector at a particular position.

14 The polarizing effect of iodoquinine sulfate was discovered by Doctor William Bird Herapath (Bristol 1820–1868). The iodoquinine sulfate crystals were found when iodine was added to the urine of a dog that had been fed quinine (Kahr et al. 2009).

4.11 Malus’ Law

To calculate the reduction in the intensity of plane polarized light incident on a polarizing filter, we simply resolve the incident light’s electric field into components parallel and perpendicular to the axis of the polarizer along which light is transmitted. Figure 4.12 shows the geometry. If the angle between the direction of polarization of the incident light and the transmission axis of the polarizer is α, then only the component E0 cos α is transmitted. Recalling the result from Chapter 1 that optical detectors detect intensity, proportional to the electric field squared, we can predict the intensity of a plane wave of incident intensity I0 that is transmitted by a polarizer at an angle α with respect to the polarization vector of the input:
$$I(\alpha) = I_0 \cos^2\alpha. \tag{4.13}$$
This result is called Malus’ Law, after Étienne-Louis Malus (Paris 1775–1812), who made a number of fundamental discoveries, especially regarding polarized light. Unfortunately, Malus died within three years of discovering the phenomenon of polarization by reflection, leaving François Arago (Estagel 1786–Paris 1853) and Jean-Baptiste Biot (Paris 1774–1862) to explain his observations (Levitt, 2009).
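As a small numerical illustration of Malus' Law, the sketch below evaluates the transmitted intensity for a few angles between the input polarization and the transmission axis of an ideal polarizer. The function name `malus_intensity` and the sample angles are our own choices for illustration, and an ideal (lossless, perfectly extinguishing) polarizer is assumed.

```python
import math

def malus_intensity(i0, alpha):
    """Intensity transmitted by an ideal polarizer (assumed lossless).

    i0    : incident intensity of linearly polarized light
    alpha : angle (radians) between the input polarization and the
            transmission axis of the polarizer
    """
    return i0 * math.cos(alpha) ** 2

# Illustrative angles: aligned, 45 degrees (the case of Fig. 4.12), crossed.
aligned = malus_intensity(1.0, 0.0)
at_45 = malus_intensity(1.0, math.pi / 4)
crossed = malus_intensity(1.0, math.pi / 2)

for name, value in [("aligned", aligned), ("45 deg", at_45), ("crossed", crossed)]:
    print(f"{name:>8}: I/I0 = {value:.3f}")
```

For the 45° case the intensity is halved, which is the factor of 2 quoted in the caption of Fig. 4.12.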
4.12 Linear birefringence (|) 59
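$$\mathbf{E}^{(0)} = \frac{1}{\sqrt{2}} E_0 \left(\hat{\boldsymbol{\epsilon}}_f + \hat{\boldsymbol{\epsilon}}_s\right) e^{-i\omega t}.$$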
inclined at an angle α with respect to the fast axis, the slow component picks up a π phase shift and the action of the half-wave plate can be written as
$$\mathbf{E}_{\rm in} = E_0 \left(\cos\alpha\, \hat{\boldsymbol{\epsilon}}_1 + \sin\alpha\, \hat{\boldsymbol{\epsilon}}_2\right) e^{i(\mathbf{k}\cdot\mathbf{r}-\omega t)}$$
$$\mathbf{E}_{\rm out} = E_0 \left(\cos\alpha\, \hat{\boldsymbol{\epsilon}}_1 - \sin\alpha\, \hat{\boldsymbol{\epsilon}}_2\right) e^{i(\mathbf{k}\cdot\mathbf{r}+n_f k\ell-\omega t)}, \tag{4.14}$$
where ℓ is the length of the wave plate. In words, this result says that the electric field is reflected with respect to the fast axis.18 Linearly polarized light incident at an angle α with respect to the fast axis exits the wave plate linearly polarized, but with the direction of polarization rotated to be at an angle of −α.

18 We are assuming that the fast and slow axes correspond to the directions of $\hat{\boldsymbol{\epsilon}}_1$ and $\hat{\boldsymbol{\epsilon}}_2$, respectively.

Circularly polarized light: Recalling from Section 4.6 that the unit vectors for L- and R-circularly polarized light are $\hat{\boldsymbol{\epsilon}}_+ = \frac{1}{\sqrt{2}}(\hat{\boldsymbol{\epsilon}}_1 + i\hat{\boldsymbol{\epsilon}}_2)$ and $\hat{\boldsymbol{\epsilon}}_- = \frac{1}{\sqrt{2}}(\hat{\boldsymbol{\epsilon}}_1 - i\hat{\boldsymbol{\epsilon}}_2)$, and that a half-wave plate retards the slow polarization component by half a wavelength relative to the fast polarization component, it is evident that circularly polarized light changes its handedness on passing through a half-wave plate.
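The action of the half-wave plate on both linear and circular input states can be checked with a few lines of Python. This is a minimal sketch, assuming the fast axis lies along direction 1 and the slow axis along direction 2, and dropping the common phase factor; the helper names `half_wave_plate` and `linear_angle` are hypothetical and chosen only for this illustration.

```python
import cmath
import math

def half_wave_plate(e1, e2):
    """Apply a pi phase shift to the slow component (direction 2).

    Assumes the fast axis is along direction 1; the phase factor common
    to both components is omitted because it does not affect polarization.
    """
    return e1, e2 * cmath.exp(1j * math.pi)

def linear_angle(e1, e2):
    """Angle from axis 1 of a linearly polarized field with real components."""
    return math.atan2(e2.real, e1.real)

# Linear input at alpha = 30 degrees to the fast axis (illustrative value).
alpha = math.radians(30.0)
out = half_wave_plate(complex(math.cos(alpha)), complex(math.sin(alpha)))
angle_out = math.degrees(linear_angle(*out))  # expected to be close to -30

# L-circular input (1, +i)/sqrt(2) in the convention of the text.
s = 1 / math.sqrt(2)
left_out = half_wave_plate(complex(s), 1j * s)  # close to (1, -i)/sqrt(2)

print(f"output angle = {angle_out:.1f} deg")
print(f"L-circular input leaves as ({left_out[0]:.3f}, {left_out[1]:.3f})")
```

The linear state is mirrored about the fast axis, and the sign of the imaginary component flips for circular input, which is the change of handedness described above.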
the fast and slow axes, i.e. along the direction $\frac{1}{\sqrt{2}}(\hat{\boldsymbol{\epsilon}}_1 + \hat{\boldsymbol{\epsilon}}_2)$; whereas R-hand circular polarization also becomes linearly polarized, but along the direction $\frac{1}{\sqrt{2}}(\hat{\boldsymbol{\epsilon}}_1 - \hat{\boldsymbol{\epsilon}}_2)$, θ = −π/4 (−45°). These results are as expected
on account of the time-reversed situations previously described.
the identity and (ii) the enantiomeric purity of the substance, or (iii)
the concentration of a known substance in a solution.
Optical rotation is said to be dextro rotatory if the direction of
polarization rotates clockwise when looking towards the source, and
laevo rotatory if the rotation is anti-clockwise. Optical rotation
is a reciprocal optical process, meaning that if a wave picks up a rotation β on traversing the medium, the rotation is undone if the wave is retro-reflected back through the same medium, see Fig. 4.15. Figure 4.17 shows how the plane of polarization of a linearly polarized wave rotates as it propagates through a sugar solution.

Fig. 4.17 Photographs of two lasers with different wavelengths propagating through corn syrup (λ = 633 nm and 532 nm in the upper and lower images, respectively). We only see scattered light when the polarization is orthogonal to the observation plane. The distance between the intensity maxima is Λ = λ/(nL − nR). For the green laser (lower image) Λg = 14 cm. For the red laser (upper image) Λr > Λg because red is further from resonance and the index difference is smaller, see Fig. 4.10. The attenuation is larger for green light (lower image). Images courtesy of Miranda Nixon, Durham University, 2015.

4.16 The Faraday effect (|)

In this section, we consider the Faraday effect, where an applied magnetic field induces circular birefringence. In 1845, in a sequence of experimental investigations, Michael Faraday revealed for the first time the link between electromagnetism and light. These experiments had far-reaching consequences that shaped the modern world, such as the invention of electric motors and the ability to transform heat into electricity. Faraday showed that a magnetic field in the same direction as the wave vector k can induce a change in the plane of polarization, an effect which became known as Faraday rotation. We discussed in
Section 4.6 how the natural basis for describing atom–light interactions
is the circular basis. Atomic transitions that are degenerate in the
absence of the magnetic field occur at different frequencies when the
field is applied. As a consequence, the absorption coefficient for L- and
R-circularly polarized light is different—the medium is said to exhibit
circular dichroism. There will also be a concomitant difference in
the refractive indices for the different handednesses of light, i.e. circular birefringence. We can therefore use the same analysis as in Section 4.15 to predict a Faraday rotation angle for a medium of length ℓ of β = π(nL − nR)ℓ/λ. As the index difference is proportional to the external magnetic field B this is often written as
$$\beta = V B \ell, \tag{4.18}$$
where V = π(nL − nR)/(λB) (units rad T⁻¹ m⁻¹) is called the Verdet coefficient, which is a property of the medium.22 Media with large Verdet coefficients are either crystals that contain paramagnetic ions, such as terbium, e.g. terbium gallium garnet (TGG); or atomic vapours, where Verdet coefficients that are orders of magnitude larger than TGG can be achieved, but only over a restricted wavelength range (Weller et al. 2012).

22 It is often called the Verdet constant, somewhat of a misnomer as it is wavelength dependent.
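To make the relation between the rotation angle, the index difference, and the Verdet coefficient concrete, the following sketch evaluates β = π(nL − nR)ℓ/λ and V = π(nL − nR)/(λB). The function names are our own, and the numerical inputs are the rubidium-vapour values quoted in Exercise 4.21(iv), used here purely as an illustration.

```python
import math

def faraday_rotation(delta_n, length, wavelength):
    """Rotation angle beta = pi (n_L - n_R) l / lambda, in radians."""
    return math.pi * delta_n * length / wavelength

def verdet_coefficient(delta_n, wavelength, b_field):
    """Verdet coefficient V = pi (n_L - n_R) / (lambda B), in rad T^-1 m^-1."""
    return math.pi * delta_n / (wavelength * b_field)

# Values quoted in Exercise 4.21(iv): n_L - n_R, cell length (m),
# wavelength (m), and magnetic field (T).
delta_n, length, wavelength, b_field = 9.75e-5, 2.00e-3, 780e-9, 0.600

beta = faraday_rotation(delta_n, length, wavelength)
verdet = verdet_coefficient(delta_n, wavelength, b_field)

print(f"beta = {math.degrees(beta):.1f} deg")
print(f"V    = {verdet:.1f} rad T^-1 m^-1")
```

By construction the two functions satisfy β = VBℓ, which is a useful consistency check when changing units.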
Chapter summary
Exercises
(4.1) Plot of linearly polarized light
Use eqn (4.1) to plot the electric field in the $\hat{\boldsymbol{\epsilon}}_1$–$\hat{\boldsymbol{\epsilon}}_2$ plane for times t/T = 0, 1/8, 1/4, 3/8, and 1/2. Assume that E1 and E2 are real, equal in
The projection of the angular momentum of the photons in the wave onto the axis of propagation must therefore be reversed after traversing the half-wave plate. Is this consistent with the conservation of angular momentum?

(4.15) Intensity before and after wave plates
Using eqns (4.14) and (4.15) verify that for both half- and quarter-wave plates, although the electric field is modified on transmission, the intensity is invariant.

(4.16) Cascading polarization components (1)
Consider a linearly polarized wave incident normally on a sequence of wave plates. The direction of polarization is at π/4 with respect to the initial quarter-wave plate axes; there then follows a half-wave plate with axes at an arbitrary orientation, with the final element being a quarter-wave plate with the same orientation as the first. Describe the state of polarization after each element.

(4.17) Cascading polarization components (2)
Consider unpolarized light incident normally on a polarizing filter; the transmitted light is then incident on a quarter-wave plate with axes oriented at π/4 with respect to the axis of the polarizer. The light then reflects from a mirror and passes through the quarter-wave plate before being incident on the polarizer for a second time. By analysing the polarization state after each component, explain why no light is transmitted through the polarizer on the second traversal. What is a practical use of this device? (Hint: the mirror can be replaced by a computer monitor). Does this device work for every colour? Does the device work for light waves that are not normally incident?

(4.18) Cascading polarization components (3)
Consider a vertically polarized plane wave normally incident on a polarizer whose axis is parallel to the plane of the electric field of the light. Downstream the light traverses a second polarizer, whose axis is inclined at π/4 with respect to the first, and a final polarizer whose axis is orthogonal to the first. Write down (vector) expressions for the electric field before and after each polarizer. What fraction of the initial light intensity is transmitted by this sequence of polarizers? Repeat the analysis when the middle polarizer is removed.

(4.19) Eliminating phase shifts by suitable choice of space and time origins
For a pair of counter-propagating parallel linearly polarized waves let the electric field be
$$\mathbf{E} = E_0 \left[\cos(kz - \omega t + \delta_-) + \cos(kz + \omega t + \delta_+)\right] \hat{\mathbf{x}}.$$
Show that a shift of the origin of the coordinate system along the z axis using the expression $z' = z + (\delta_- + \delta_+)/2k$ allows the field to be rewritten as
$$\mathbf{E} = E_0 \left\{\cos[kz' - (\omega t + \delta')] + \cos[kz' + (\omega t + \delta')]\right\} \hat{\mathbf{x}},$$
where $\delta' = (\delta_+ - \delta_-)/2$. Show that δ′ can also be eliminated with an appropriate choice of temporal origin.

(4.20) Standing waves with complex waves
Use complex notation for the plane waves to derive the form of the electric field, eqns (4.22), (4.25), and (4.28), for the three different standing waves analysed in the text.

(4.21) Faraday effect and optical diode
The electric field of left- and right-circularly polarized plane waves propagating along the z axis may be written as $\mathbf{E}_L = \frac{1}{\sqrt{2}} E_0 (\hat{\mathbf{x}} + i\hat{\mathbf{y}})\, e^{i(kz-\omega t)}$ and $\mathbf{E}_R = \frac{1}{\sqrt{2}} E_0 (\hat{\mathbf{x}} - i\hat{\mathbf{y}})\, e^{i(kz-\omega t)}$, where $\hat{\mathbf{x}}$ and $\hat{\mathbf{y}}$ are unit vectors along x and y.
(i) Write an equation for a plane wave propagating along z and linearly polarized along x in terms of $\mathbf{E}_L$ and $\mathbf{E}_R$.
(ii) The plane wave enters a Faraday medium at z = 0. Inside the medium left- and right-circularly polarized light have refractive indices, nL and nR, respectively. Write an equation for the field after propagating a distance z inside the medium.
(iii) By writing nL = n + Δn/2 and nR = n − Δn/2, where n = (nL + nR)/2 and Δn = nL − nR, show that $\mathbf{E} = E_0 (\cos\varphi\, \hat{\mathbf{x}} - \sin\varphi\, \hat{\mathbf{y}})\, e^{i(nkz-\omega t)}$, where ϕ = Δnkz/2 = π(nL − nR)z/λ. (Note that this result also applies to an optically active medium.)
(iv) For rubidium gas, in a magnetic field of 0.600 T using a laser at 780 nm, nL − nR = 9.75 × 10⁻⁵. If the gas cell has a length of 2.00 mm what is the direction of polarization of light after traversing the cell? What is the value of the Verdet constant?
(v) Explain, briefly, how this medium could be combined with two linear polarizers to realize an optical diode (a device that transmits light in one direction only).

(4.22) Magnetic fields of standing waves
Use either (i) complex notation of the magnetic field of the constituent plane waves, or (ii) the vector potential given the form of the electric field, to derive the magnetic fields for the three standing waves as expressed by eqns (4.23), (4.26), and (4.29).
72 Many waves I: Fresnel and Fraunhofer
Using
$$\int_{-\infty}^{\infty} e^{-\pi x'^2/(i\lambda z)}\, dx' = \sqrt{i\lambda z},$$
and similarly for the y′ integral, we find that
$$A = \frac{E_0}{i\lambda z}.$$

This equation is known as the Fresnel diffraction integral. In the paraxial regime, the distance from the source point (x′, y′, 0) to the observation point (x, y, z) is
$$r_p = z + \frac{(x - x')^2 + (y - y')^2}{2z}.$$
This equation says that the field at a point P, with coordinates (x, y, z), is given by a sum of contributions from points P′, with coordinates (x′, y′, 0), with a phase that depends on the optical path length between P′ and P. The Fresnel diffraction integral extends the discrete phasor sum we encountered in Chapter 3 to infinitely many waves.

Next we consider some special cases that can be solved analytically and provide considerable insight. First the case of cylindrical symmetry.
where $\bar{r} = z + \rho^2/2z$, and $\rho' = (x'^2 + y'^2)^{1/2}$ and $\rho = (x^2 + y^2)^{1/2}$ are the radial displacements in the input plane and the observation plane, respectively.

To further simplify, we assume that the input field is a plane wave propagating along z. To set an upper limit on the values of x′ and y′ that contribute (which is required in order to use the scalar approximation, see Section 1.12) we can assume the input plane contains a circular aperture with radius Ra that we can vary. The scalar approximation is valid as long as Ra < z, such that we are in the paraxial regime.

As the input field, E0 f(x′, y′), is cylindrically symmetric, we can replace f(x′, y′) by f(ρ′), where for a circular aperture f(ρ′) = 1 for ρ′ ≤ Ra and 0 otherwise. The field is given by the sum of paths from input points (ρ′, 0) to the observation point (0, z). The phase of these contributions oscillates as the source point (ρ′, 0) moves away from the z axis. In Fig. 5.5 we have labelled two points, ρ1 and ρ2, where the phase changes sign. This happens when the path difference between the on-axis path (0, 0) to (0, z) and the off-axis path (ρm, 0) to (0, z) is equal to an integer multiple of λ/2. Using the paraxial distance, rp, we can write that
$$\frac{\rho_m^2}{2z} = m\frac{\lambda}{2}, \tag{5.5}$$
which gives
$$\rho_m = \sqrt{m\lambda z}. \tag{5.6}$$

Fig. 5.5 The radius of the first Fresnel zone, ρ1, occurs when the path difference between, (0, 0) to (0, z) and (ρ1, 0) to (0, z), is λ/2. In the paraxial limit, this is equal to $\rho_1^2/2z$.

The region in the input plane between ρm−1 and ρm is known as the mth Fresnel zone. In Fig. 5.6 we show these Fresnel zones in the input plane. Light passing through the white regions contributes to the field with positive phase, and light passing through the grey regions with negative phase.3 All zones have the same area,
$$\pi(\rho_{m+1}^2 - \rho_m^2) = \pi\left[(m + 1)\lambda z - m\lambda z\right] = \pi\lambda z. \tag{5.7}$$

Fig. 5.6 Fresnel zones: The field component arriving at (0, z) from (ρ′, 0) has a phase, $\phi = k\left(z + \rho'^2/2z\right)$. The curve plotted along x′ shows cos φ, which changes sign whenever $\rho' = \sqrt{m\lambda z}$, where m is an integer. The first Fresnel zone (central white circle) with radius $\rho' \le \sqrt{\lambda z}$ contributes with positive phase. The second zone, between $\sqrt{\lambda z} < \rho' \le \sqrt{2\lambda z}$ (shown in grey) contributes with a negative phase.

3 Although the phase plotted in Fig. 5.6 looks similar to the phase of a spherical wave, the phase across the aperture is uniform, and now we are considering the phase of wave components originating at (ρ′, 0) when they arrive at (0, z).

Fresnel realized that as the field from successive Fresnel zones interferes destructively, then if we block all the odd, or all the even, zones we can arrange to have purely constructive interference on-axis in a particular plane downstream. The mask is known as a Fresnel zone plate and looks very similar to Fig. 5.6. Note that Fresnel zones are abstract theoretical constructs, whereas a zone plate is a physical device. The effective focal length of the zone plate is
$$f = \frac{\rho_1^2}{\lambda}, \tag{5.8}$$
5.5 Circular aperture 75
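$$E^{(z)} = \frac{E_0 e^{ikz}}{i\lambda z} \int_0^{R_a} e^{ik\rho'^2/2z}\, 2\pi\rho'\, d\rho', \tag{5.9}$$
$$\phantom{E^{(z)}} = -E_0 e^{ikz}\left(e^{ikR_a^2/2z} - 1\right) = -2iE_0 e^{ikz} e^{ikR_a^2/4z} \sin\left(\frac{kR_a^2}{4z}\right).$$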
Figure 5.10 (top row) shows the intensity pattern in the xy plane at increasing distance z downstream of a circular aperture. The left and right images are at z = 0 (Fresnel number infinite) and z = Ra²/λ (Fresnel number unity), respectively. Although for high Fresnel number (higher Fresnel zones) the scalar approximation breaks down, it is possible to approximate the vector nature of the field using an obliquity factor, where E0 is replaced by ½E0(1 + cos θ), or, more accurately, by considering each vector component of the field, as we shall see in Chapter 12. The bottom row in Fig. 5.10 shows the case of a complementary screen: an opaque disk rather than an aperture. Here the spot of Arago is seen as a bright region in the centre of the shadow.
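The on-axis result of eqn (5.9) and the Fresnel zone radii of eqn (5.6) can be explored numerically. The sketch below takes the modulus squared of eqn (5.9), giving an on-axis intensity of 4I0 sin²(kRa²/4z), and evaluates it for apertures that expose exactly one and exactly two Fresnel zones. The function names, the wavelength, and the distance are illustrative assumptions, not values from the text.

```python
import math

def fresnel_zone_radius(m, wavelength, z):
    """Outer radius of the m-th Fresnel zone, rho_m = sqrt(m lambda z)."""
    return math.sqrt(m * wavelength * z)

def on_axis_intensity(aperture_radius, wavelength, z, i0=1.0):
    """On-axis intensity behind a circular aperture.

    Modulus squared of eqn (5.9): I = 4 I0 sin^2(k Ra^2 / 4z).
    """
    k = 2 * math.pi / wavelength
    return 4 * i0 * math.sin(k * aperture_radius**2 / (4 * z)) ** 2

# Illustrative (assumed) values: red laser light observed 1 m downstream.
wavelength, z = 633e-9, 1.0

r1 = fresnel_zone_radius(1, wavelength, z)
r2 = fresnel_zone_radius(2, wavelength, z)

bright = on_axis_intensity(r1, wavelength, z)  # one zone exposed
dark = on_axis_intensity(r2, wavelength, z)    # two zones exposed

print(f"rho_1 = {r1 * 1e3:.3f} mm, rho_2 = {r2 * 1e3:.3f} mm")
print(f"I/I0 with one zone open  = {bright:.3f}")
print(f"I/I0 with two zones open = {dark:.3e}")
```

Opening exactly one zone gives four times the unobstructed intensity on axis, whereas opening two zones gives a dark spot, because the contributions of the first and second zones cancel.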
Example 5.1
Finite lens size: A lens with finite size can only capture the light that falls within the aperture of the lens. For a plane wave incident on a circular lens with diameter D, the aperture function is given by a circ-function:
$$f(\rho') = \mathrm{circ}\left(\frac{\rho'}{D}\right) = \begin{cases} 0 & \rho' > D/2 \\ 1 & \rho' \le D/2 \end{cases}. \tag{5.17}$$
Substituting into eqn (5.16), and evaluating the integral in cylindrical coordinates, see Section B.13 in Appendix B, we find that
$$I^{(f)} = I_0 \frac{\pi^2 D^4}{16\lambda^2 f^2}\, \mathrm{jinc}^2\left(\frac{\pi D\rho}{\lambda f}\right), \tag{5.18}$$
where jinc(α) = 2J1(α)/α and J1 is the first-order Bessel function of the first kind. This intensity distribution is known as an Airy pattern and is shown in Fig. 5.12. The first zero in the Airy pattern is given by the first zero of the Bessel function, and occurs at a radius of ρ = 1.22fλ/D. The finite size of the intensity distribution at the focus sets a limit to the smallest detail that can be resolved by a lens, see Chapter 9. To resolve finer detail, the ratio f/D (called the f-number in photography) needs to be small.
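The factor 1.22 in the radius of the first dark ring follows from the first zero of J1. As a check that needs only the Python standard library, the sketch below evaluates J1 from Bessel's integral representation, J1(x) = (1/π)∫₀^π cos(t − x sin t) dt, locates its first positive zero by bisection, and divides by π. The function names, the bracketing interval, and the lens parameters at the end are our own assumptions for illustration.

```python
import math

def bessel_j1(x, steps=2000):
    """J1(x) from Bessel's integral, evaluated with the midpoint rule."""
    h = math.pi / steps
    total = sum(math.cos(t - x * math.sin(t))
                for t in (h * (i + 0.5) for i in range(steps)))
    return total * h / math.pi

def first_zero_j1(lo=3.0, hi=4.5, tol=1e-10):
    """First positive zero of J1 by bisection (bracket [3, 4.5] assumed)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bessel_j1(lo) * bessel_j1(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

alpha0 = first_zero_j1()
coefficient = alpha0 / math.pi  # rho = coefficient * f * lambda / D

def airy_radius(focal_length, wavelength, diameter):
    """Radius of the first dark ring in the focal plane."""
    return coefficient * focal_length * wavelength / diameter

# Illustrative (assumed) lens: f = 100 mm, D = 25 mm, lambda = 550 nm.
rho = airy_radius(0.100, 550e-9, 0.025)
print(f"first zero of J1 = {alpha0:.4f}, coefficient = {coefficient:.4f}")
print(f"first dark ring radius = {rho * 1e6:.2f} micrometres")
```

Halving the f-number f/D halves the radius of the first dark ring, which is the scaling behind the closing remark of the example.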
dark ring is located at a radius of $\rho = \sqrt{x^2 + y^2} = 1.22 f\lambda/D$.

propagation distance, z, is sufficiently large (which we call the far-field regime) that we can neglect the x′² and y′² in the Fresnel diffraction integral. Starting from eqn (5.4), we can write the intensity as
$$I^{(z)} = \frac{I_0}{\lambda^2 z^2} \left| \iint_{-\infty}^{\infty} f(x', y')\, e^{-i2\pi(xx' + yy')/(\lambda z)}\, e^{ik\rho'^2/(2z)}\, dx'\, dy' \right|^2,$$
where $I^{(0)} = I_0 |f(x', y')|^2$ is the intensity in the input plane. The Fraunhofer approximation says that for z ≫ ρ′ (for all input coordinates that contribute) we can set $e^{ik\rho'^2/2z} \approx 1$ and therefore
$$I^{(z)} = \frac{I_0}{\lambda^2 z^2} \left| \iint_{-\infty}^{\infty} f(x', y')\, e^{-i2\pi(xx' + yy')/(\lambda z)}\, dx'\, dy' \right|^2. \tag{5.19}$$
5.8 One, two, many slits 79
$$z \gg d_R = \frac{a^2}{\lambda}, \tag{5.20}$$
where dR is the Rayleigh distance or Rayleigh length, see also Section 5.8. The condition z ≫ dR defines the far field. Even in this far-field region, the Fraunhofer diffraction formula is still only approximate.10

10 At z = 10dR or 100dR the phase error is π/40 ∼ 0.08 or π/400 ∼ 0.008, therefore to achieve a 1% accuracy, z ≫ dR means two orders of magnitude larger.

In the case of cartesian separability, Section 5.6, f(x′, y′) = g(x′)h(y′) and we can separate the x′ and y′ integrals in eqn (5.19). In this case the far-field Fraunhofer diffraction formula is written as
$$I^{(z)} = \frac{I_0}{\lambda^2 z^2} \left| \int_{-\infty}^{\infty} g(x')\, e^{-ikxx'/z}\, dx' \int_{-\infty}^{\infty} h(y')\, e^{-ikyy'/z}\, dy' \right|^2. \tag{5.21}$$
A similar result holds in the focal plane of a lens with z = f. If the field is uniform in the y direction, starting from eqn (5.13) and making the Fraunhofer approximation, we obtain
$$I^{(z)} = \frac{I_0}{\lambda z} \left| \int_{-\infty}^{\infty} g(x')\, e^{-i2\pi xx'/(\lambda z)}\, dx' \right|^2. \tag{5.22}$$
Example 5.2
Single-slit Fraunhofer diffraction: The slit is assumed to have a width a in the x direction and infinite spatial extent in the y direction, such that we can write the field in the z = 0 plane as $E^{(0)} = E_0\, g(x')$, where11
$$g(x') = \mathrm{rect}\left(\frac{x'}{a}\right) = \begin{cases} 0 & |x'| > a/2 \\ 1 & |x'| \le a/2 \end{cases}, \tag{5.23}$$
where we have introduced the label rect to denote the rectangular function shown in Fig. 5.13(i).

11 We shall discuss the limitations of this idealized aperture function in Chapter 11. In practice, the screen will have finite thickness, and the edges of the slit are unlikely to be smooth on the scale of a wavelength; however, if a ≫ λ and z ≫ a we can neglect these imperfections.
to a². Increasing the width of the slit increases the input flux by a factor a and
reduces the width of the diffraction pattern by another factor of a. The case with
a lens is the same but with z replaced by f and the result only applies in the focal
plane, as shown in Fig. 5.14(ii).
giving
$$\int_{-\infty}^{\infty} f(x')\, e^{-i2\pi xx'/(\lambda z)}\, dx' = \frac{e^{-i2\pi dx/(\lambda z)}}{-i2\pi x/(\lambda z)} \left(e^{-i\pi ax/(\lambda z)} - e^{i\pi ax/(\lambda z)}\right)$$
$$\phantom{\int} = e^{-i2\pi dx/(\lambda z)}\, a\, \mathrm{sinc}\left(\frac{\pi ax}{\lambda z}\right). \tag{5.28}$$
The effect of translation is only to multiply by a phasor factor, $e^{-i2\pi dx/(\lambda z)}$. Inserting this result into eqn (5.22) we find that
$$I^{(z)} = I_0 \frac{a^2}{\lambda z}\, \mathrm{sinc}^2\left(\frac{\pi ax}{\lambda z}\right), \tag{5.29}$$
which is the same as before. This seems odd; how can we translate the slit without changing the diffraction pattern? The answer is that the Fraunhofer approximation assumes that x′ ≪ z for all x′ that contribute, therefore we can only move the slit a small distance, d ≪ z, before the Fraunhofer approximation breaks down.
In contrast, in the focal plane of the lens where the Fraunhofer diffraction formula
is ‘exact’, the insensitivity of the diffraction pattern to translation in the input plane
is illustrated in Fig. 5.16. In summary, a small displacement in the input plane gives
rise to an exponential phase factor in the far-field amplitude, which on its own does
not change the intensity distribution. This topic is explored further in Exercise 5.14.
Example 5.4
Double slit: Now consider two slits with width a and spacing d, as shown in Fig. 5.13(iii). In this case, the input function becomes
$$f(x') = \begin{cases} 0 & -\infty < x' \le -d/2 - a/2 \\ 1 & -d/2 - a/2 < x' \le -d/2 + a/2 \\ 0 & -d/2 + a/2 < x' \le d/2 - a/2 \\ 1 & d/2 - a/2 < x' \le d/2 + a/2 \\ 0 & d/2 + a/2 < x' \le \infty \end{cases}. \tag{5.30}$$
The integral in the Fraunhofer diffraction formula is now a sum of two displaced slits at x′ = ±d/2. Using our previous result for a single displaced slit, eqn (5.28), the integral for the two slits is the sum of two terms:12
$$\int_{-\infty}^{\infty} f(x')\, e^{-ikxx'/z}\, dx' = \left(e^{-ikdx/(2z)} + e^{ikdx/(2z)}\right) \int_{-a/2}^{a/2} e^{-ikxx'_d/z}\, dx'_d,$$
$$\phantom{\int} = 2a \cos\left(\frac{kdx}{2z}\right) \mathrm{sinc}\left(\frac{kax}{2z}\right). \tag{5.31}$$

12 We have chosen to write the integrals in terms of k this time.

Fig. 5.16 Fraunhofer diffraction using a lens in the z = 0 plane. In this example, a slit is placed in a plane at z = −f and the Fraunhofer diffraction pattern is observed at z = f. Translating the slit does not change the position of the diffraction pattern.
Now the exponential phase factors due to the slit displacements give rise to an
interference term which does modify the intensity pattern. Substituting k = 2π/λ,
we find that the intensity distribution in the far field is
$$I^{(z)} = \frac{4I_0 a^2}{\lambda z} \cos^2\left(\frac{\pi dx}{\lambda z}\right) \mathrm{sinc}^2\left(\frac{\pi ax}{\lambda z}\right). \tag{5.32}$$
This function is plotted in Fig. 5.17. The cosine-squared term produces interference
fringes with a spacing (λ/d)z, eqn (3.18), as in Chapter 3. The sinc-squared term
limits the peak intensity of each fringe. As the sinc-squared function goes to zero at
x = ±(λ/a)z, if d/a is an integer the expected fringe at this position is suppressed.
This is referred to as a missing order. In Fig. 5.17, d/a = 20 and the 20th fringe
(counting the central fringe as zero) is suppressed.
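The missing order can be seen directly by evaluating eqn (5.32) at the fringe positions. The following sketch uses d/a = 20, as in Fig. 5.17, and compares the 19th and 20th fringes with the central one. The function names, wavelength, slit width, and distance are assumptions chosen for illustration.

```python
import math

def sinc(u):
    """Unnormalized sinc, sin(u)/u, with the limit 1 at u = 0."""
    return 1.0 if u == 0 else math.sin(u) / u

def double_slit_intensity(x, a, d, wavelength, z, i0=1.0):
    """Far-field double-slit intensity, eqn (5.32)."""
    u = math.pi * x / (wavelength * z)
    return (4 * i0 * a**2 / (wavelength * z)
            * math.cos(d * u) ** 2 * sinc(a * u) ** 2)

# Illustrative (assumed) values, with d/a = 20 as in Fig. 5.17.
wavelength, z, a = 500e-9, 1.0, 10e-6
d = 20 * a

spacing = wavelength * z / d  # fringe spacing (lambda/d) z
peak = double_slit_intensity(0.0, a, d, wavelength, z)

fringe_19 = double_slit_intensity(19 * spacing, a, d, wavelength, z) / peak
fringe_20 = double_slit_intensity(20 * spacing, a, d, wavelength, z) / peak

print(f"fringe spacing = {spacing * 1e3:.2f} mm")
print(f"19th fringe, relative intensity = {fringe_19:.2e}")
print(f"20th fringe, relative intensity = {fringe_20:.2e}")
```

The 20th fringe falls on the first zero of the sinc-squared envelope and is suppressed, while the 19th fringe is weak but still present.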
Example 5.5
Many slits (the diffraction grating): The above treatment can be extended to
N -slits, in which case the prefactor in eqn (5.31) becomes a sum of N terms, similar to
the N -slit interference discussed in Chapter 3. Extending the 2-slit sum in eqn (5.31)
5.9 2D Fraunhofer
The previous examples focused on diffraction in only one transverse
direction, x. We now consider some examples where there is diffraction
in both x and y. A schematic of the Fraunhofer diffraction is shown
Example 5.6
Laser beam: A useful example of cartesian separability is the case of a gaussian
13
laser beam,13 see Chapter 11 for more detail. For a cylindrically symmetrical laser Named after the function, see
beam, the field in the z = 0 plane may be written as E (0) E0 g(x )h(y ), where g(x ) = Sec. B.6 associated with Carl Friedrich
2 2 2 2
e−x /w0 , h(y ) = e−y /w0 , and w0 is the beam radius. Both the x -integral and Gauss (Brunswick 1777–Göttingen
y -integral in eqn (5.21) are performed by completing the square, see Appendix B. 1855).
The x -integral gives,
ˆ ∞
2 √ 2 2 2 2 2
e−x /w0 e−i2πxx /λ dx = πw0 e−π w0 x /λ z .
−∞
Similarly for y , and we obtain the far-field intensity distribution,
π 2 w4 2 2 2 2 2 π 2 w4 2 2
I (z) = I0 2 20 e−2π w0 ρ /λ z = I0 2 20 λze−2ρ /w , (5.35)
λ z λ z
2 2 2
where ρ = (x + y ) and w = [λ/(πw0 )]z is the far-field beam radius. Hence
in the far field, the Fraunhofer intensity distribution is also a gaussian, but with a
significantly larger beam radius, that is inversely proportional to the initial width,
w0 . The angular spread of the laser beam, see Fig. 5.20, is defined as Δθ = w/z,
thus
λ
Δθ = . (5.36)
πw0
As for the case of a single slit, the Fraunhofer approximation is only valid when the width of the light distribution is much larger than the initial size, i.e., when spreading due to diffraction $\Delta\theta\, z \gg w_0$. The cross-over between initial size dominating and diffraction dominating occurs at the Rayleigh distance (more often called the Rayleigh range, $z_R$, for laser beams), which is defined by $\Delta\theta\, z_R = w_0$, which gives
$$z_R = \frac{\pi w_0^2}{\lambda} . \tag{5.37}$$
Fig. 5.20 In the far field, where the laser beam radius, w, is much larger than the initial size (or waist), w0, we can write w = Δθ z, where Δθ = λ/(πw0).
The gaussian has the property that the product of the size and spread (momentum
distribution) is a minimum, consequently the Rayleigh range is larger than the
Rayleigh distance for other light distributions such as the rectangular aperture.
Although Fraunhofer diffraction gives the correct result for the far-field intensity
distribution of a laser it ignores wave front curvature, as we shall see in Chapter 11.
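The relations in eqns (5.36) and (5.37) are easy to evaluate numerically. The following is a minimal sketch, not taken from the text: the function names and the sample values (a 633 nm beam with a 1 mm waist) are illustrative assumptions.

```python
import math

def divergence(wavelength, w0):
    """Far-field angular spread, eqn (5.36): lambda / (pi * w0)."""
    return wavelength / (math.pi * w0)

def rayleigh_range(wavelength, w0):
    """Rayleigh range, eqn (5.37): pi * w0**2 / lambda."""
    return math.pi * w0 ** 2 / wavelength

def far_field_radius(wavelength, w0, z):
    """Far-field beam radius w = delta_theta * z (valid for z >> z_R)."""
    return divergence(wavelength, w0) * z

# Assumed example values (not from the text).
wavelength = 633e-9   # metres
w0 = 1.0e-3           # metres

dtheta = divergence(wavelength, w0)
z_r = rayleigh_range(wavelength, w0)
print("angular spread (rad):", dtheta)
print("Rayleigh range (m):", z_r)
print("beam radius at 100 m (m):", far_field_radius(wavelength, w0, 100.0))
```

By construction the product of the angular spread and the Rayleigh range returns the waist, which is the cross-over condition stated above.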
84 Many waves I: Fresnel and Fraunhofer
Example 5.7
Rectangular aperture: Our second example of Fraunhofer diffraction in two
transverse dimensions is the case of a rectangular aperture with width a and height
b, see Fig. 5.21(i). We assume uniform illumination, for example using a laser
beam with beam radius, $w_0$, much larger than the dimensions of the aperture, $w_0 \gg a > b$. The field immediately downstream of the aperture plane can be written as $E^{(0)} = E_0 f(x', y')$, with
$$f(x', y') = \mathrm{rect}\left(\frac{x'}{a}\right) \mathrm{rect}\left(\frac{y'}{b}\right) . \tag{5.38}$$
The corresponding intensity distribution is shown in Fig. 5.21(i). To find the intensity
distribution in the far field we use eqn (5.21) and the integral eqn (5.24) to obtain
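$$I^{(z)} = I_0 \frac{a^2 b^2}{\lambda^2 z^2}\, \mathrm{sinc}^2\left(\frac{\pi a x}{\lambda z}\right) \mathrm{sinc}^2\left(\frac{\pi b y}{\lambda z}\right) . \tag{5.39}$$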
This intensity distribution is shown in Fig. 5.21(ii). Along the x and y axes the first zeros occur at x = ±λz/a and y = ±λz/b, respectively. For a > b, the input field is wide and short, while the diffraction pattern is tall and thin. In Chapter 6 we shall see how this inverse scaling arises from the Fourier relationship between position and momentum: a narrow real space distribution requires a large spread in momentum, and vice versa. The peak intensity of the diffraction pattern is proportional to (ab)², i.e., the square of the area of the aperture. This scaling is explored further in the end-of-chapter exercises.
Fig. 5.21 (i) Uniform illumination of a rectangular aperture with dimensions a and b in the horizontal and vertical directions, respectively. (ii) The far-field intensity pattern. The first zeros are at ±λz/a and ±λz/b in the horizontal and vertical directions, respectively.
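The two properties of eqn (5.39) highlighted in Example 5.7, the position of the first zeros and the (ab)² scaling of the peak, can be checked with a short numerical sketch. The function names and aperture dimensions below are assumptions chosen for illustration, and sinc(t) is taken to mean sin(t)/t, as the argument πax/λz suggests.

```python
import math

def sinc(t):
    """Unnormalised sinc, sin(t)/t, with the t = 0 limit handled explicitly."""
    return 1.0 if t == 0.0 else math.sin(t) / t

def rect_far_field(x, y, a, b, wavelength, z, i0=1.0):
    """Far-field intensity of a uniformly lit a-by-b aperture, eqn (5.39)."""
    prefactor = i0 * (a * b / (wavelength * z)) ** 2
    sx = sinc(math.pi * a * x / (wavelength * z))
    sy = sinc(math.pi * b * y / (wavelength * z))
    return prefactor * sx ** 2 * sy ** 2

# Assumed example values (not from the text).
a, b = 2.0e-4, 1.0e-4      # aperture width and height in metres
wavelength, z = 633e-9, 1.0

peak = rect_far_field(0.0, 0.0, a, b, wavelength, z)
first_zero_x = wavelength * z / a
print("peak intensity / I0:", peak)
print("intensity at x = lambda z / a:", rect_far_field(first_zero_x, 0.0, a, b, wavelength, z))
print("peak ratio when a is doubled:", rect_far_field(0.0, 0.0, 2 * a, b, wavelength, z) / peak)
```

Doubling one side of the aperture doubles its area, so the on-axis intensity rises by a factor of four, while the first zero along that axis moves inwards by a factor of two.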
Example 5.8
Single or multiple slits and a laser: A likely scenario in a single-slit diffraction experiment is that the slit is tall and thin ($b \gg a$), and the laser beam is smaller than the aperture in the vertical direction, $w_0 < b$, as in Fig. 5.22(i). If the laser beam size is relatively large (of the order of a millimetre) then we are likely to observe the diffraction pattern at an intermediate distance z corresponding to the far field in x but the near field in y. In terms of the Rayleigh length for single slit diffraction and Rayleigh range of the laser, the observation distance is
$$d_R \ll z \ll z_R .$$
For $b > w_0 \gg a$, the intensity profile is approximately given by a cartesian separable function of the form $f(x', y') = g(x')h(y')$, where
$$g(x') = \mathrm{rect}\left(\frac{x'}{a}\right) \quad\text{and}\quad h(y') = \mathrm{gauss}\left(\frac{y'}{w_0}\right) , \tag{5.40}$$
with $\mathrm{gauss}(y'/w_0) = e^{-y'^2/w_0^2}$ describing the laser field profile. We can assume that
the laser beam remains unchanged in the vertical y direction, and use the Fraunhofer diffraction integral for one transverse dimension multiplied by a fixed y-dependence:
$$I^{(z)} = I_0 \frac{a^2}{\lambda z}\, \mathrm{sinc}^2\left(\frac{\pi a x}{\lambda z}\right) \mathrm{gauss}^2\left(\frac{y}{w_0}\right) . \tag{5.41}$$
For diffraction in only one direction the prefactor is 1/(λz), see eqn (5.22). The calculated far-field intensity pattern for this case is shown in Fig. 5.22(ii). The pattern consists of a sinc-squared pattern along x and a gaussian along y. Figure 5.23 shows an example with a laser beam and five vertical slits, where one sees the five-slit diffraction pattern along x and the gaussian profile along y.
Fig. 5.22 (i) Laser illumination of a vertical slit with width a and height b, where $a \ll b$. The laser beam radius is much larger than the width but smaller than the height, $a \ll w_0 < b$. (ii) The far-field intensity pattern has a sinc-squared pattern in the horizontal but is gaussian in the vertical direction. (i) and (ii) are not to scale.
If, in contrast, we move the observation plane back into the far field of the laser beam, $z \gg z_R$, then there is diffraction in both transverse directions, and the intensity in the observation plane is
$$I^{(z)} = I_0 \frac{\pi a^2 w_0^2}{\lambda^2 z^2}\, \mathrm{sinc}^2\left(\frac{\pi a x}{\lambda z}\right) \mathrm{gauss}^2\left(\frac{\pi w_0 y}{\lambda z}\right) . \tag{5.42}$$
5.10 Fresnel integrals 85
The pattern looks similar to Fig. 5.22 but now with a beam radius in the vertical
direction given by, w = Δθz, where Δθ = λ/(πw0 ).
Example 5.9
Single slit: Consider a long narrow slit of width a in the z = 0 plane, orientated vertically, along the y′ axis. The slit is illuminated by uniform monochromatic light propagating along z.
14 The Fresnel integrals are similar to the error function (Hughes 2010), but with a complex argument. They are also used in the design of roads and velodrome tracks to minimise the forces experienced on entering a bend. The minimum-force trajectory is known as the transition curve.
As the field is uniform along y′, we can use eqn (5.13) with f(x′) = 1 for |x′| ≤ a/2 and 0 otherwise, see Fig. 5.13(i), and the Fresnel diffraction integral is
$$E^{(z)} = \frac{E_0 e^{ikz}}{\sqrt{i\lambda z}} \int_{-a/2}^{a/2} e^{ik(x-x')^2/2z}\, dx' , \tag{5.44}$$
or in terms of ξ = x′ − x, as in Fig. 5.24,
$$E^{(z)} = \frac{E_0 e^{ikz}}{\sqrt{i\lambda z}} \int_{-a/2-x}^{a/2-x} e^{ik\xi^2/2z}\, d\xi , \tag{5.45}$$
and the intensity is
$$I^{(z)} = \frac{I_0}{\lambda z} \left| \int_{-a/2-x}^{a/2-x} e^{ik\xi^2/2z}\, d\xi \right|^2 , \tag{5.46}$$
where $I_0$ is the intensity in the absence of the slit. The intensity pattern predicted by eqn (5.46) is shown in Fig. 5.25. In order to compute the integral, it is convenient to rescale all distances in terms of $\sqrt{\lambda z/2}$, where $\sqrt{\lambda z}$ is known as the Fresnel length. The rescaled position in the observation plane is $\tilde{x} = x/\sqrt{\lambda z/2}$. The rescaled transverse displacement between the input point (x′, 0) and the observation point (x, z) is $\tilde{\xi} = \xi/\sqrt{\lambda z/2}$. Re-writing the integral in terms of these scaled variables we find that
$$\frac{I^{(\tilde{z})}}{I_0} = \frac{1}{2} \left| \int_{\tilde{\xi}_1}^{\tilde{\xi}_2} e^{i(\pi/2)\tilde{\xi}^2}\, d\tilde{\xi} \right|^2 , \tag{5.47}$$
where $\tilde{\xi}_1 = -\tilde{a}/2 - \tilde{x}$ and $\tilde{\xi}_2 = \tilde{a}/2 - \tilde{x}$, with $\tilde{a} = a/\sqrt{\lambda z/2}$ being the dimensionless slit width. The reason for this rescaling is that we now can rewrite the integral in terms of Fresnel integrals, giving
$$\frac{I^{(\tilde{z})}}{I_0} = \frac{1}{2} \left\{ \left[ C(\tilde{\xi}_2) - C(\tilde{\xi}_1) \right]^2 + \left[ S(\tilde{\xi}_2) - S(\tilde{\xi}_1) \right]^2 \right\} . \tag{5.48}$$
Figure 5.25 is generated by evaluating eqn (5.43) on a grid for many values of (x, z).
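As a rough illustration of how such a grid evaluation might be carried out, the sketch below evaluates eqn (5.48) at a single observation point. It assumes the usual definitions C(x) = ∫₀ˣ cos(πt²/2) dt and S(x) = ∫₀ˣ sin(πt²/2) dt, and computes them by Simpson's rule rather than with a library routine; the function names, step density, and sample values are illustrative assumptions, not the authors' code.

```python
import math

def fresnel_cs(x, steps_per_unit=400):
    """Return (C(x), S(x)) by Simpson's rule on cos and sin of pi t^2 / 2."""
    n = 2 * max(1, math.ceil(abs(x) * steps_per_unit / 2))  # even interval count
    h = x / n
    c_sum = 0.0
    s_sum = 0.0
    for j in range(n + 1):
        weight = 1 if j in (0, n) else (4 if j % 2 else 2)
        phase = 0.5 * math.pi * (j * h) ** 2
        c_sum += weight * math.cos(phase)
        s_sum += weight * math.sin(phase)
    return c_sum * h / 3.0, s_sum * h / 3.0

def slit_intensity(x, a, wavelength, z):
    """Relative intensity I/I0 behind a slit of width a, eqn (5.48)."""
    scale = math.sqrt(wavelength * z / 2.0)   # rescaling length sqrt(lambda z / 2)
    xi1 = (-a / 2.0 - x) / scale
    xi2 = (a / 2.0 - x) / scale
    c1, s1 = fresnel_cs(xi1)
    c2, s2 = fresnel_cs(xi2)
    return 0.5 * ((c2 - c1) ** 2 + (s2 - s1) ** 2)

# Assumed example values: a 0.1 mm slit, 633 nm light, observed 1 cm downstream.
a, wavelength, z = 1.0e-4, 633e-9, 0.01
for x_um in (0, 20, 40, 60, 80):
    print(x_um, "um:", slit_intensity(x_um * 1e-6, a, wavelength, z))
```

Repeating the call over a range of x and z values would build up a map like the one described. In the far field the on-axis value approaches a²/(λz), the one-dimensional Fraunhofer prefactor of eqn (5.41).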
Example 5.10
Double slit: We now extend the single-slit theory to more slits. For two slits of width a separated by a distance d the aperture function looks as in Fig. 5.13(iii), and the Fresnel diffraction integral is
$$I^{(z)} = \frac{I_0}{\lambda z} \left| \int_{-d/2-a/2-x}^{-d/2+a/2-x} e^{ik\xi^2/2z}\, d\xi + \int_{d/2-a/2-x}^{d/2+a/2-x} e^{ik\xi^2/2z}\, d\xi \right|^2 .$$
As previously, this integral can be rewritten as a sum of Fresnel integrals, now with four terms rather than two. The four coefficients are $\tilde{\xi}_1 = -\tilde{d}/2 - \tilde{a}/2 - \tilde{x}$ to $\tilde{\xi}_4 = \tilde{d}/2 + \tilde{a}/2 - \tilde{x}$. The intensity pattern in the xz plane for this case is shown in Fig. 5.27. We see how the light from each slit first spreads out, and then overlaps to form the two-slit interference pattern. In the far field, on the right-hand side of the figure, we see the cosine-squared interference fringes characteristic of Young’s double-slit experiment, Chapter 3. However, the intensity pattern is very different to Young’s sketch, Fig. 3.3, as Young was drawing amplitude rather than intensity.
Fig. 5.27 Intensity in the xz plane downstream of a double slit. In the far field (far right) the intensity distribution has evolved into fringes which spread out linearly with propagation distance, similar to the interference between two cylindrical waves, Fig. 3.6.
Example 5.11
Edge: A slit reduces to an edge if we move the other edge to infinity, i.e., set $\tilde{\xi}_1 = -\tilde{x}$ and $\tilde{\xi}_2 = \infty$. In this case, using C(∞) = 1/2 and S(∞) = 1/2 we obtain
$$\frac{I^{(\tilde{z})}}{I_0} = \frac{1}{2} \left\{ \left[ \frac{1}{2} + C(\tilde{x}) \right]^2 + \left[ \frac{1}{2} + S(\tilde{x}) \right]^2 \right\} . \tag{5.49}$$
This intensity pattern downstream of an edge is plotted in Fig. 5.28. Note that the function in units of the scaled variable $\tilde{x}$ is always the same. If we propagate in the z direction the pattern spreads out, corresponding to a simple rescaling of the horizontal axis, but it always has the same functional form. Note also that constructive interference leads to a higher value of the intensity in the shadow relative to the incident wave, and that for large displacements from the edge the intensity asymptotically becomes equal to the value obtained were the light to propagate without obstruction. For all points downstream with the same lateral displacement as the edge (x = 0), the intensity is exactly one quarter the value of the incident beam. This is discussed further in an end-of-chapter exercise.
Fig. 5.28 Intensity pattern downstream of an edge. As the field propagates, the pattern retains its functional form but spreads out with a scaling that depends on the square root of the propagation distance z.
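The edge result, eqn (5.49), depends only on the scaled position, so a single function of one variable describes the pattern at every distance. The sketch below is a hedged illustration: it assumes the standard Fresnel-integral definitions, evaluates them by Simpson's rule, and uses function names chosen here for convenience.

```python
import math

def fresnel_cs(x, steps_per_unit=400):
    """Return (C(x), S(x)) by Simpson's rule on cos and sin of pi t^2 / 2."""
    n = 2 * max(1, math.ceil(abs(x) * steps_per_unit / 2))  # even interval count
    h = x / n
    c_sum = 0.0
    s_sum = 0.0
    for j in range(n + 1):
        weight = 1 if j in (0, n) else (4 if j % 2 else 2)
        phase = 0.5 * math.pi * (j * h) ** 2
        c_sum += weight * math.cos(phase)
        s_sum += weight * math.sin(phase)
    return c_sum * h / 3.0, s_sum * h / 3.0

def edge_intensity(x_tilde):
    """Relative intensity I/I0 at scaled position x_tilde, eqn (5.49)."""
    c, s = fresnel_cs(x_tilde)
    return 0.5 * ((0.5 + c) ** 2 + (0.5 + s) ** 2)

# Negative x_tilde lies in the geometric shadow, positive in the lit region.
for x_tilde in (-2.0, -1.0, 0.0, 1.2, 5.0):
    print(x_tilde, edge_intensity(x_tilde))
```

At the scaled position zero both Fresnel integrals vanish and the expression reduces to (1/2)(1/4 + 1/4) = 1/4, which is the one-quarter value quoted in the example.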
Chapter summary
Exercises
(5.1) Fresnel diffraction integral
Write an expression for the Fresnel diffraction integral in terms of a sum of phasors for (i) the most general case, and (ii) when we can neglect diffraction in the y direction. Explain the two main differences.
(5.2) Fresnel diffraction integral: from two to one transverse dimensions
The Fresnel diffraction integral is
$$E^{(z)} = \frac{E_0}{i\lambda z} \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x', y')\, e^{ikr_p}\, dx'\, dy' ,$$
where $r_p = z + [(x - x')^2 + (y - y')^2]/(2z)$ and k = 2π/λ. For a field that is uniform in the y′ direction, we can write f(x′, y′) = f(x′). Show that the field at y = 0 is given by
$$E^{(z)} = \frac{E_0}{\sqrt{i\lambda z}}\, e^{ikz} \int_{-\infty}^{\infty} f(x')\, e^{ik(x-x')^2/2z}\, dx' .$$
Hint: $\int_{-\infty}^{\infty} e^{-\pi y'^2/(i\lambda z)}\, dy' = \sqrt{i\lambda z}$.
What is the field along the z axis if the field is also uniform in the x′ direction? How does your answer compare to the incident field?
(5.3) Fresnel diffraction integral: cylindrical symmetry
Write an expression for the Fresnel diffraction
integral in terms of a sum of phasors for (i) the most general case, and (ii) when we can neglect diffraction in the y direction. Explain the two main differences.
(5.4) Fresnel diffraction from an edge
Figure 5.28 shows that the intensity at x = 0, the location of the edge, is one quarter of the value of the incident light. Why is this? [Hint: Consider what the intensity would be at x = 0 from the mirror-image edge, and then consider adding the fields from these two configurations.]
(5.5) Fresnel zones (1)
In Fig. 5.10 top row which images are closest to the case of 1, 2, and 4 Fresnel zones? Explain your reasoning.
(5.6) Fresnel zones (2)
The field on-axis at a distance z downstream of a cylindrically symmetrical aperture is given by
$$E^{(z)} = \frac{E_0 e^{ikz}}{i\lambda z} \int_0^{\infty} f(\rho')\, e^{ik\rho'^2/2z}\, 2\pi\rho'\, d\rho' ,$$
where f(ρ′) is the aperture function. Write an expression for the field on-axis at a distance z downstream for the case of a circular annulus with inner and outer radii ρ1 and ρ2, respectively. [Hint: $\int^{\xi_2} \ldots$
$m \gg 1$. Write an expression for width of the mth zone, δRm, in terms of m, the focal length f, and the wavelength, λ. Write an expression for the focal spot size, xf = fλ/D, in terms of focal length f, the wavelength λ, and the number of zones m. Hence show that the width of the outermost (or mth) zone is approximately equal to the spot size.
(5.10) Other forms of the Fresnel diffraction integral
Write the Fresnel diffraction integral for one transverse dimension x in the form of (i) a convolution integral, and (ii) a Fourier transform. Write the Fourier variable kx in terms of k, x and z, or u in terms of λ, x and z.
(5.11) An improved Fresnel zone plate?
A conventional Fresnel zone plate achieves a high intensity on-axis by blocking all of either the odd or the even Fresnel zones, thus eliminating the destructive cancellation of the fields from neighbouring zones. What would happen if it were possible to manufacture a mask that rather than blocking the even zones, allowed the light to pass but retarded the phase by π?
(5.12) X-ray crystallography and Fraunhofer diffraction
A typical wavelength for X-ray crystallography is of the order of $1 \times 10^{-10}$ m, and a typical separation of planes in a crystal is of the order of a few $\times 10^{-10}$ m. Show therefore that the relevant
Optics f2f: From Fourier to Fresnel. Charles S. Adams and Ifan G. Hughes
© Charles S. Adams and Ifan G. Hughes 2019.
Published in 2019 by Oxford University Press. DOI: 10.1093/oso/9780198786788.001.0001
92 Many waves II: Fourier
$$f(x) = \int_{-\infty}^{\infty} F(u)\, e^{i2\pi ux}\, du . \tag{6.6}$$
$$F(u) = \int_{-\infty}^{\infty} f(x)\, e^{-i2\pi ux}\, dx . \tag{6.7}$$
This equation says that the spectrum of wave amplitudes, F(u), is given by the Fourier transform of the spatial function, f(x). An example of a Fourier transform is given in the bottom row of Fig. 6.3, with F(u) and f(x) in the left and right columns, respectively. Filling in the gaps in the discrete spectrum has the effect of suppressing all the repetitions of the periodic wave form, leaving just one. In summary, a Fourier series, or discrete sum of waves, produces a periodic wave form; whereas a continuous integral, or Fourier transform, produces any wave form.
Often, we shall use the following shorthand notation:
$$F(u) = \mathcal{F}[f(x)](u) = \int_{-\infty}^{\infty} f(x)\, e^{-i2\pi ux}\, dx , \tag{6.8}$$
and
$$f(x) = \mathcal{F}^{-1}[F(u)](x) = \int_{-\infty}^{\infty} F(u)\, e^{i2\pi ux}\, du , \tag{6.9}$$
for the forward and inverse transforms, respectively.8
8 There is no standard convention for the location of the minus sign in the exponential in eqns (6.6) and (6.7), nor where to include the factors of 2π. In the inverse transform, we use the sign convention that matches the physics convention for the positive frequency term in the complex form of the plane-wave solution, $e^{i2\pi ux}$. Instead of using the spatial frequency u, we could write the Fourier transform in terms of the phase change per unit distance, which in optics corresponds to the component of the wave vector in the x direction, $k_x = 2\pi u$. In this case, the Fourier transform takes the form:
$$F(k_x) = \int_{-\infty}^{\infty} f(x)\, e^{-ik_x x}\, dx , \tag{6.10}$$
and the inverse transform is
$$f(x) = \int_{-\infty}^{\infty} F(k_x)\, e^{ik_x x}\, \frac{dk_x}{2\pi} . \tag{6.11}$$
Both the u and kx forms of the Fourier transform have their advantages and disadvantages. Generally, we shall use the u form, and convert to kx as required.
In Chapter 8 we shall look beyond monochromatic light to consider waves with different wavelengths. In this case, the wave form is made up of components where the wave vector k not only has a range of directions (spread in kx and ky but k is fixed), but also a range of magnitudes, spread in k = 2π/λ. If the spectrum of angular frequencies, ω = ck, is given by a function, F(ω),9 then we use a Fourier transform in the time domain to relate the time dependence of the field f(t) to the spectrum:
$$f(t) = \int_{-\infty}^{\infty} F(\omega)\, e^{-i\omega t}\, \frac{d\omega}{2\pi} , \tag{6.12}$$
and, conversely, to obtain the spectrum from the time dependence,
$$F(\omega) = \int_{-\infty}^{\infty} f(t)\, e^{i\omega t}\, dt . \tag{6.13}$$
9 F(ω)dω is the amplitude of the field component with angular frequency between ω and ω + dω.
where A(0) = E0 F(u). Equation (6.15) says that we can write the field
E (0) as a superposition of plane waves with amplitudes, A(0) , and spatial
frequencies, u, along the x axis. As spatial frequency is related to the
angle of propagation via u = sin θ/λ, see Fig. 6.4, this is saying that any
spatial field distribution along x can be written as a superposition of
plane waves propagating at angles, θ, relative to the z axis.
Fig. 6.4 The sinusoidal curves show the phase variation along x (or x′) for plane waves propagating at the angle, θ, relative to the z axis, for two values of θ. The larger the angle, the higher the spatial frequency in the x direction. In the small-angle approximation, the spatial frequency in the x direction is u = θ/λ.
To illustrate this angular-spectrum concept, in Fig. 6.5 we revisit the case of two plane waves, first considered in Section 3.3. In (a) and (b) we show the intensity patterns produced by two plane waves propagating at different angles in the xz plane (note that the z axis is vertical in this plot). Below in (c) and (d) we plot the distribution of propagation angles, the angular spectrum, $A^{(0)} = E_0 F(u)$, which in this case is represented by two Dirac δ-functions, see Section B.2, one for each wave. Using the relationship between spatial frequency and propagation angle, two plane waves propagating at angles θ = ±θ0/2 have spatial frequencies, u = ±θ0/(2λ), and as for eqn (6.7), we can write
$$F(u) = F_0 \left[ \delta\left(u + \frac{\theta_0}{2\lambda}\right) + \delta\left(u - \frac{\theta_0}{2\lambda}\right) \right] , \tag{6.16}$$
Example 6.1
Angular spectrum of a laser beam: Consider a laser beam, see Chapter 11 for
more details. The transverse field dependence in the z = 0 plane is given by
$$E^{(0)} = E_0\, e^{-x'^2/w^2} , \tag{6.20}$$
where w is the beam radius. The intensity distribution is plotted in Fig. 6.6 and has the form of a gaussian with a standard deviation, or a $1/\sqrt{e}$-width, Δx = w/2.
What is the angular spectrum associated with this light distribution?
The angular spectrum, eqn (6.17), is given by the Fourier transform of the field. In
Fig. 6.7 we show the construction of a gaussian wave form from harmonic waves. As
in Fig. 6.3 a discrete sum or Fourier series produces a periodic wave form, in this case
a train of gaussian wave packets. To suppress all the repetitions of the wave packet,
except one, we fill all the gaps in the discrete spectrum, i.e. replace the Fourier series
by a Fourier transform. For the special case of a gaussian wave form, the spectrum
is also a gaussian function. As the variables x and u are interchangeable, the Fourier inversion theorem (the relationship between the forward and inverse transforms, see Appendix B) is easily proved for this case.
Fig. 6.6 The transverse intensity profile of a laser beam. At a transverse displacement equal to the beam radius w the intensity falls to 1/e² times the on-axis value.
Mathematically, the Fourier transform is found by completing the square, see Section B.6, giving
$$A^{(0)} = \mathcal{F}\left[E^{(0)}\right](u) = \sqrt{\pi}\, w E_0\, e^{-\pi^2 u^2 w^2} . \tag{6.21}$$
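The closed form in eqn (6.21) can be compared against a direct numerical evaluation of the forward transform, eqn (6.7). The sketch below is illustrative only: it sets E0 = 1, uses a plain Riemann sum over a window of ±8w, and the function names, window, and sample count are assumptions.

```python
import cmath
import math

def gaussian_ft_numeric(u, w, samples=4001, half_width=8.0):
    """Riemann-sum estimate of the transform of exp(-x^2/w^2) at spatial frequency u."""
    dx = 2.0 * half_width * w / (samples - 1)
    total = 0j
    for j in range(samples):
        x = -half_width * w + j * dx
        total += math.exp(-(x / w) ** 2) * cmath.exp(-2j * math.pi * u * x)
    return total * dx

def gaussian_ft_analytic(u, w):
    """Closed form of eqn (6.21) with E0 = 1."""
    return math.sqrt(math.pi) * w * math.exp(-(math.pi * u * w) ** 2)

w = 1.0e-3   # assumed beam radius in metres
for u in (0.0, 200.0, 500.0):   # spatial frequencies in cycles per metre
    numeric = gaussian_ft_numeric(u, w)
    print(u, abs(numeric), gaussian_ft_analytic(u, w))
```

The imaginary part of the numerical result is negligible because the input is real and even, and the two columns should agree closely, showing that a narrower beam (smaller w) produces a broader spectrum.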
This expression tells us that the distribution of propagation angles, θ = sin−1 (uλ),
and hence the distribution of transverse momentum, px = h sin θ/λ, needed to form a
localized gaussian light distribution, is also gaussian. Substituting u = sin θ/λ ≈ θ/λ,
The angular divergence of a localized gaussian beam is illustrated in Fig. 6.8. This
inverse relationship between the initial real space width, w0 , and the angular spread,
Δθ, is true for all distributions, not just gaussians. We can also express the
angular divergence in terms of a transverse momentum distribution. The momentum
distribution is proportional to the modulus-squared of the angular spectrum,
$$|A^{(0)}|^2 = \pi w^2 E_0^2\, e^{-2\pi^2 u^2 w^2} = \pi w^2 E_0^2\, e^{-p_x^2 w^2/(2\hbar^2)} ,$$
where we have used $u = p_x/(2\pi\hbar)$. This gives an uncertainty (standard deviation or a $1/\sqrt{e}$-width) in the x-component of photon momentum of
$$\Delta p_x = \hbar/w . \tag{6.25}$$
Combining this with the uncertainty in position obtained from the intensity distribution, Δx = w/2, we find that
$$\Delta x\, \Delta p_x = \hbar/2 , \tag{6.26}$$
which is the Heisenberg uncertainty relationship for photons at the waist of a laser beam. In optics, Heisenberg’s uncertainty relationship (arising from the Fourier relationship between space and momentum) tells us that if our light distribution has a small spatial extent, it must have a large spread in transverse momentum, and vice versa.13
Fig. 6.8 A gaussian laser beam with width (standard deviation) Δx has a momentum spread Δp = ħ/(2Δx) leading to an angle divergence Δθ = 2(Δp/p). The factor of 2 arises, as in Fig. 6.6, because Δθ is the angle to the 1/e²-intensity radius, whereas Δp is a $1/\sqrt{e}$-width.
13 As we shall see in Chapter 11, eqn (6.26) is only true at the position where the laser beam radius w is a minimum. This position is called the beam waist and the minimum value of w is written as w0.
6.4 Propagation
A common scenario in optics is that we know the field in a particular
plane, for example, E (0) in the z = 0 plane, and want to find the
field, E (z) , after propagating a distance z. Using the angular spectrum
method, propagation reduces to multiplying each plane-wave component
by a propagation phase. Recalling eqn (2.10), $E = E_0\, e^{i(k_x x + k_y y + k_z z)}$, we see that in moving from z = 0 to z, the plane wave acquires a phase $e^{ik_z z}$, which is called the propagator.14 Consequently, if the angular spectrum in the input plane at z = 0 is $A^{(0)}$, then the angular spectrum a distance z downstream will be
$$A^{(z)} = e^{ik_z z} A^{(0)} . \tag{6.27}$$
14 For a plane wave propagating at a small angle, θ, relative to the z axis shown in Fig. 2.5, the phase variation along z is typically much faster than along x, $k_z \gg k_x$ for small θ.
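The recipe in eqn (6.27), transform, multiply by the propagator, transform back, can be sketched for one transverse dimension as follows. This is an assumption-laden illustration rather than the book's code: it uses a plain O(N²) discrete Fourier transform from the standard library in place of a fast Fourier transform, a periodic sampling window, and parameter values (633 nm light, a 10 µm gaussian waist) chosen only for the example.

```python
import cmath
import math

def dft(values, sign):
    """Plain discrete Fourier sum with kernel exp(sign * 2 pi i j m / n)."""
    n = len(values)
    return [sum(v * cmath.exp(sign * 2j * math.pi * j * m / n)
                for j, v in enumerate(values)) for m in range(n)]

def propagate(field, dx, wavelength, z):
    """Propagate a sampled 1D field a distance z using eqn (6.27)."""
    n = len(field)
    spectrum = dft(field, -1)                  # forward transform, as in eqn (6.7)
    shifted = []
    for m, amplitude in enumerate(spectrum):
        index = m if m < n / 2 else m - n      # signed spatial-frequency index
        u = index / (n * dx)
        # Axial wave-vector component; cmath.sqrt makes evanescent terms decay.
        kz = 2.0 * math.pi * cmath.sqrt(1.0 / wavelength ** 2 - u * u)
        shifted.append(amplitude * cmath.exp(1j * kz * z))
    return [v / n for v in dft(shifted, +1)]   # inverse transform

# Assumed example: gaussian field sampled on 128 points, 2 um apart.
n, dx, wavelength, w0 = 128, 2.0e-6, 633e-9, 10.0e-6
field = [math.exp(-(((j - n / 2) * dx) / w0) ** 2) for j in range(n)]
z_r = math.pi * w0 ** 2 / wavelength
out = propagate(field, dx, wavelength, z_r)
print("on-axis intensity ratio after one Rayleigh range:",
      abs(out[n // 2]) ** 2 / abs(field[n // 2]) ** 2)
```

Because every sampled spatial frequency here is below 1/λ, the propagator is a pure phase and the summed intensity is conserved. For a field that varies in only one transverse dimension, the on-axis intensity after one Rayleigh range is expected to fall by a factor close to 1/√2.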
EZ=ifft2(P*fftshift(fft2(E0))), where P=exp(i*KZ*Z) is the propagator, KZ=2*pi*sqrt(1/(L*L)-U*U-V*V) is the axial component of the wave vector, L is the wavelength, and ‘fft’ represents an in-built fast Fourier transform, see Section B.14.
Example 6.2
Hedgehog solution of Helmholtz equation: Here we present an alternative derivation of eqn (6.27), and hence eqn (6.29), starting from the Helmholtz equation, eqn (1.40):
$$\frac{\partial^2 E}{\partial x^2} + \frac{\partial^2 E}{\partial y^2} + \frac{\partial^2 E}{\partial z^2} + k^2 E = 0 .$$
On substituting
$$E = \iint_{-\infty}^{\infty} A\, e^{i(k_x x + k_y y)}\, \frac{dk_x}{2\pi} \frac{dk_y}{2\pi} ,$$
we find that
$$\iint_{-\infty}^{\infty} \frac{dk_x}{2\pi} \frac{dk_y}{2\pi}\, e^{i(k_x x + k_y y)} \left[ \frac{d^2 A}{dz^2} + (k^2 - k_x^2 - k_y^2) A \right] = 0 ,$$
which is satisfied if the integrand is zero for any kx or ky, i.e.
$$\frac{d^2 A}{dz^2} + (k^2 - k_x^2 - k_y^2) A = 0 .$$
This equation has the solution
$$A(k_x, k_y, z) = e^{i(k^2 - k_x^2 - k_y^2)^{1/2} z}\, A(k_x, k_y, 0) ,$$
in agreement with eqn (6.27).
Fig. 6.10 Application of eqn (6.29): The light distribution in the input plane, $E^{(0)}$, is propagated a distance z to find $E^{(z)}$ and $I^{(z)}$. The x dependence of each $I^{(z)}$ ‘slice’ is plotted vertically at horizontal position z to produce a map of the intensity distribution in the xz plane.
6.5 Fourier to Fresnel
$$h = \mathcal{F}^{-1}[H] = \mathcal{F}^{-1}\left[e^{ik_z z}\right] .$$
In the paraxial regime, we can calculate h analytically. Expanding kz in terms of kx and ky,
$$k_z = (k^2 - k_x^2 - k_y^2)^{1/2} \simeq k - \frac{k_x^2}{2k} - \frac{k_y^2}{2k} = k - \pi\lambda(u^2 + v^2) , \tag{6.32}$$
and the 2D inverse Fourier transform of the propagator is
$$h = e^{ikz}\, \mathcal{F}^{-1}\left[e^{-i\pi(u^2+v^2)\lambda z}\right](u, v) = \frac{1}{i\lambda z}\, e^{ikz}\, e^{i\pi\rho^2/(\lambda z)} ,$$
where $\rho^2 = x^2 + y^2$, and we have used the Fourier toolkit result, eqn (B.38).18 Inserting this result into the convolution integral, eqn (6.31), we find
$$E^{(z)} = E^{(0)} * h = \iint_{-\infty}^{\infty} E^{(0)}(x', y')\, h(x - x', y - y')\, dx'\, dy' = \frac{e^{ikz}}{i\lambda z} \iint_{-\infty}^{\infty} E^{(0)}\, e^{ik[(x-x')^2 + (y-y')^2]/2z}\, dx'\, dy' , \tag{6.33}$$
which is the Fresnel diffraction integral, eqn (5.2). By deriving the Fresnel diffraction integral from the hedgehog equation, eqn (6.29), we demonstrate the equivalence of the Huygens–Fresnel and Fourier optics viewpoints. In addition, we obtain the amplitude and phase of the constituent waves, $E^{(0)}/i\lambda z$, directly.19
18 (margin note, opening lines not recovered)
$$e^{-i\pi u^2 \lambda z} = e^{-\pi^2 u^2 w^2} ,$$
and the inverse transform becomes
$$\mathcal{F}^{-1}\left[e^{-\pi^2 u^2 w^2}\right](u) = \frac{1}{\sqrt{\pi}\, w}\, e^{-x^2/w^2} = \frac{1}{\sqrt{i\lambda z}}\, e^{-\pi x^2/(i\lambda z)} ,$$
and similarly for the y direction.
19 Note that the $1/i = e^{-i\pi/2}$ factor appearing in the Fresnel diffraction integral corresponds to a phase advance π/4 from each dimension. In the time domain, an input field $e^{-i\omega t}$ becomes $e^{-i(\omega t + \pi/2)}$. This phase advance is a problem for the secondary wave concept because it appears that secondary waves are ahead of the incident field violating causality! The phase advance is known as the Gouy phase and we shall analyse it again for the case of laser beams in Chapter 11. The Gouy phase can be interpreted geometrically: as we cannot focus light to a point, a focused light beam travels less far than predicted by the geometrical path, see Boyd (1980).
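The paraxial expansion in eqn (6.32) underlies the step from the exact propagator to the Fresnel integral, so it is useful to see how small the neglected term is. The sketch below compares the exact and paraxial values of kz for a wave at a small angle; the function names and sample numbers are assumptions, and the expected leading error, −kθ⁴/8, comes from the next term of the binomial expansion of the square root.

```python
import math

def kz_exact(wavelength, u, v=0.0):
    """Exact axial wave-vector component, (k^2 - kx^2 - ky^2)^(1/2)."""
    return 2.0 * math.pi * math.sqrt(1.0 / wavelength ** 2 - u * u - v * v)

def kz_paraxial(wavelength, u, v=0.0):
    """Paraxial form of eqn (6.32): k - pi * lambda * (u^2 + v^2)."""
    k = 2.0 * math.pi / wavelength
    return k - math.pi * wavelength * (u * u + v * v)

wavelength = 633e-9      # assumed wavelength in metres
theta = 0.01             # assumed propagation angle in radians
u = theta / wavelength   # small-angle spatial frequency
k = 2.0 * math.pi / wavelength

difference = kz_exact(wavelength, u) - kz_paraxial(wavelength, u)
print("kz error per metre (rad):", difference)
print("leading-order estimate  :", -k * theta ** 4 / 8.0)
```

For this angle the accumulated phase error is of order a hundredth of a radian per metre of propagation, which indicates why the paraxial form is adequate for weakly diverging beams.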
6.6 Fresnel to Fourier 101
Example 6.3
Double slit revisited: Consider the case of two slits in the form of two rectangular
apertures each with width a, height b, separated by a distance, d, where b > d > a.
The horizontal component of the aperture function is
$$g(x') = \mathrm{rect}\left(\frac{x'}{a}\right) * X_d^{(2)}(x') , \tag{6.38}$$
where we have made use of the replicating comb function, $X_d^{(N)}(x)$, see Section B.9, to make identical copies of the single-slit aperture function. The Fourier transform of $X_d^{(2)}(x)$ is
$$\mathcal{F}\left[X_d^{(2)}(x)\right](u) = e^{-i\pi ud} + e^{i\pi ud} = 2\cos \pi ud , \tag{6.39}$$
where u = x/λz. For laser illumination with a beam size, $w_0 < b$, the light distribution in the y direction is given by h(y′) = gauss(y′/w0), see Fig. 6.11(i). For an observation distance, $d_R \ll z \ll z_R$, we are in the far field for diffraction in the x direction, but there is no diffraction in y and the y′ integral returns a factor of $\sqrt{\lambda z}$ times h(y). Using the convolution theorem for the x direction, Section B.4, we can use eqn (6.37) to write the Fraunhofer intensity distribution as
$$I^{(z)} = \frac{4 I_0 a^2}{\lambda z}\, \cos^2\left(\frac{\pi d x}{\lambda z}\right) \mathrm{sinc}^2\left(\frac{\pi a x}{\lambda z}\right) \mathrm{gauss}^2\left(\frac{y}{w_0}\right) . \tag{6.40}$$
This intensity distribution is shown in Fig. 6.11(ii). The x dependence is the same
as in eqn (5.32). This example illustrates the convenience of using Fourier methods.
symmetrically about the origin.20 We can write the aperture function for the array as
$$f_N(x', y') = \sum_{m=0}^{(N-1)/2} f_1(x' \pm md, y') = f_1(x', y') * \sum_{m=0}^{(N-1)/2} \delta(x' \pm md, y') . \tag{6.41}$$
20 As we are interested in the intensity diffraction pattern we know that the square modulus of the Fourier transform of the aperture function is invariant under a translation, therefore we are free to choose a convenient origin.
6.7 Regular arrays 103
$$I^{(z)} = \frac{I_0}{\lambda^2 z^2}\, \underbrace{\left| \mathcal{F}[f_1(x', y')] \right|^2}_{\text{single aperture}}\; \underbrace{\left[ \frac{\sin(N\pi ud)}{\sin(\pi ud)} \right]^2}_{\text{distribution}} . \tag{6.43}$$
The intensity diffraction pattern has two components: (i) the diffraction pattern we would have obtained with only one aperture, and (ii) a function of the array only, i.e. independent of the details of the aperture. This result is known as the array theorem. Figures 6.12 and 6.13 show examples of regular arrays of three identical circular apertures and nine triangular apertures, respectively. The extension to a regular two-dimensional array, Fig. 6.13, is an end-of-chapter exercise. The array theorem is particularly useful because Nature often gives us exact copies of some functions in a regular array, e.g. crystals; eqn (6.43) is often used in crystallography.
Fig. 6.13 The aperture distribution (above) and intensity diffraction pattern (below) for nine identical triangular apertures. Note the alternating principal and subsidiary maxima in both x and y, with peak intensities set by the diffraction pattern for a single triangular aperture.
Example 6.4
Many slits and gratings revisited: As an example of a regular array we consider
laser illumination of N vertical slits. The input field is cartesian separable and we
can write the aperture function as f(x , y ) = g(x )h(y ), where along x the N -slits
are formed by a convolution of a rect function and a comb function with N teeth,
see Section B.9:
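$$g(x') = \mathrm{rect}\left(\frac{x'}{a}\right) * X_d^{(N)}(x') . \tag{6.44}$$
For N = 5, $g(x') = \mathrm{rect}(x'/a) * X_d^{(5)}(x')$,
$$\mathcal{F}\left[X_d^{(5)}(x)\right](u) = e^{-i4\pi ud} + e^{-i2\pi ud} + 1 + e^{i2\pi ud} + e^{i4\pi ud} ,$$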
and we obtain the intensity pattern shown in Fig. 5.23. Experimental images
corresponding to the intensity patterns for one to six slits are shown in Fig. 6.14.
Note how (i) the intensity of the principal maxima increases as N 2 ; (ii) the number
of subsidiary maxima is N − 2; and (iii) the width of the principal maxima scales as
1/N . For larger N —the grating limit—we can write the aperture function as a
product of an infinite Dirac comb, see Section B.9, and a function that describes
how the grating is illuminated. For uniform illumination of a grating with length
L = N d, the aperture function is
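$$f(x') = \mathrm{rect}\left(\frac{x'}{a}\right) * X_d^{(N)}(x') = \mathrm{rect}\left(\frac{x'}{a}\right) * \left[ \mathrm{rect}\left(\frac{x'}{L}\right) \frac{1}{d}\, X\left(\frac{x'}{d}\right) \right] ,$$
and the diffracted field is proportional to
$$\mathcal{F}[f(x')] = N a\, \mathrm{sinc}(\pi ua) \left[ \mathrm{sinc}(N\pi ud) * d\, X(ud) \right] = N a\, \mathrm{sinc}(\pi ua) \sum_{m=-\infty}^{\infty} \mathrm{sinc}[N\pi(ud - m)] ,$$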
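The array factor in eqn (6.43) and the three observations listed in Example 6.4 can be checked numerically. The sketch below sums N unit phasors placed symmetrically about the origin, compares the result with the closed-form ratio of sines, and counts the subsidiary maxima between two neighbouring principal maxima. The function names, the choice N = 5, and the unit slit spacing are illustrative assumptions.

```python
import cmath
import math

def array_factor_sum(n_slits, u, d):
    """Squared modulus of the sum of N phasors for centres symmetric about 0."""
    total = 0j
    for m in range(n_slits):
        position = (m - (n_slits - 1) / 2.0) * d
        total += cmath.exp(-2j * math.pi * u * position)
    return abs(total) ** 2

def array_factor_closed(n_slits, u, d):
    """Closed form [sin(N pi u d) / sin(pi u d)]^2, with the limit N^2 at the poles."""
    s = math.sin(math.pi * u * d)
    if abs(s) < 1e-12:
        return float(n_slits ** 2)
    return (math.sin(n_slits * math.pi * u * d) / s) ** 2

def count_subsidiary_maxima(n_slits, d, samples=2000):
    """Count local maxima strictly between the principal maxima at u = 0 and u = 1/d."""
    values = [array_factor_closed(n_slits, j / (samples * d), d)
              for j in range(1, samples)]
    return sum(1 for j in range(1, len(values) - 1)
               if values[j] > values[j - 1] and values[j] > values[j + 1])

n_slits, d = 5, 1.0
print("principal maximum:", array_factor_closed(n_slits, 0.0, d))
print("subsidiary maxima per order:", count_subsidiary_maxima(n_slits, d))
print("sum vs closed form at u = 0.21:",
      array_factor_sum(n_slits, 0.21, d), array_factor_closed(n_slits, 0.21, d))
```

Multiplying this factor by the single-slit sinc-squared envelope gives the full N-slit pattern, in line with the array theorem.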
the atoms as blocking the field or as scattering sources of the field. The
Fraunhofer limit of the Fresnel diffraction integral, eqn (6.36), for the
complementary function can be evaluated using the properties of Fourier
transforms, see Appendix B. For a two-dimensional complementary
Example 6.5
Diffraction by a wire: As an illustration of Babinet’s principle we consider the
scenario illustrated schematically in Fig 6.16, where a wire is placed in the near field
of a laser pointer. The radius of the laser beam, w0 , is larger than the diameter of
the wire, a: w0 > a. We observe the diffraction pattern at a distance less than the
Rayleigh range, $z_R = \pi w_0^2/\lambda$, but much larger than the Rayleigh length for the wire, $d_R \ll z \ll z_R$, where $d_R = a^2/\lambda$. In this case, we are in the far-field Fraunhofer limit for the obstacle, but remain in the near field of the laser beam, and can assume that the gaussian beam profile of the laser is unchanged. The aperture function, f(x′, y′) = g(x′)h(y′), is
$$g(x') = \left[ 1 - \mathrm{rect}\left(\frac{x'}{a}\right) \right] \mathrm{gauss}\left(\frac{x'}{w_0}\right) \quad\text{and}\quad h(y') = \mathrm{gauss}\left(\frac{y'}{w_0}\right) .$$
In the x direction we have a product of two functions, so to find the Fourier transform we use the inverse convolution theorem. Assuming that we can neglect the change in the laser beam, we obtain
$$G(u) = \left[ \delta\left(\frac{x}{\lambda z}\right) - a\, \mathrm{sinc}\left(\frac{\pi x a}{\lambda z}\right) \right] * \mathrm{gauss}\left(\frac{x}{w_0}\right) , \tag{6.48}$$
where we have substituted u = x/λz on the right-hand side. Note that the gauss
function that describes the laser beam is unchanged because we are in the near field
of the laser beam, z < zR . The convolution results in a gaussian at the origin
x = 0 and a slight smearing out of the sinc function, which is negligible for $w_0 \gg a$.
Consequently, the intensity distribution—proportional to the modulus-squared of
G(u)—looks like the original laser beam superimposed on top of a much wider sinc-
squared pattern as shown in Fig. 6.16.
Chapter summary
Exercises
(6.1) Fourier series coefficients (1)
From the definition of the coefficients in eqn (6.1), and by following the steps outlined in the text, derive the explicit relations of eqns (6.2) and (6.3).
(6.2) Fourier series coefficients (2)
Show that the coefficients in eqn (6.1) can be combined into one amplitude and a phase for a cosine wave: $a_j \cos(2\pi u_j x) + b_j \sin(2\pi u_j x) \equiv \tilde{a}_j \cos(2\pi u_j x + \phi_j)$, and derive expressions for $\tilde{a}_j$ and $\phi_j$.
(6.3) Fourier series coefficients (3)
Derive expressions which relate the coefficients $c_j$ of eqn (6.4) to $\tilde{a}_j$ and $\phi_j$.
(6.4) Fourier series of a square wave
A square wave with spatial period d is defined within one period as
$$f(x) = \begin{cases} 1 & |x| \le d/4 \\ 0 & |x| > d/4 \end{cases} .$$
(i) Show that the Fourier series of this function has
coefficients, $a_0 = 1/2$ and $a_j = -[2/(j\pi)] \sin j\pi/2$.
(ii) Sketch the function for the range −d ≤ x ≤ d.
(iii) Explain why all $b_j$ terms are zero.
(iv) Plot the Fourier series representation of the series using
(a) the DC term and the fundamental spatial frequency,
(b) the DC term, the fundamental, and the second harmonic, and
(c) the first ten non-zero terms.
(6.5) Fourier series of a rectified sine wave
(i) Calculate the Fourier coefficients for a sinusoidal wave with a spatial period d, f(x) = sin(2πx/d). The rectified wave is defined within one period d as
$$f(x) = \begin{cases} +\sin(2\pi x/d) & 0 \le x \le d/2 \\ -\sin(2\pi x/d) & d/2 \le x \le d \end{cases} .$$
(ii) Show that the Fourier series of this function has coefficients $b_j = 0$ and
$$a_j = \begin{cases} 2/\pi & j = 0 \\ 0 & j = 1, 3, 5 \ldots \\ -(4/\pi)[1/(j^2 - 1)] & j = 2, 4, 6 \ldots \end{cases} .$$
(iii) Sketch the function for the range −d ≤ x ≤ d.
(iv) Why are all $b_j$ terms zero?
(v) Why do we need only even numbered harmonics?
(vi) Plot the Fourier series representation of the series using
(a) only the DC term,
(b) the DC term and the second harmonic, and
(c) the first ten non-zero terms.
(6.6) Angular spectrum of a laser
(i) Write an expression for field amplitude along the x axis of a laser beam with beam waist w0. Assume that the laser is propagating in the z direction and that the beam waist is in the z = 0 plane.
(ii) Write an expression for the x component of the angular spectrum of plane waves $A^{(0)}$ in the z = 0 plane.
(iii) Use your result to derive an expression for the angular divergence. What is assumed about the angular width of the intensity distribution?
(iii) Using the de Broglie relation, write an expression for the transverse momentum distribution, i.e. the probability of measuring a momentum, px, in the x direction. Comment on the normalization.
(iv) Write an expression for the transverse momentum distribution a distance z downstream of the beam waist. Comment on your reasoning.
(6.7) Momentum distribution of a laser beam
Use the uncertainty principle for photons to derive the standard deviation in the photon momentum distribution in terms of the radius of the beam waist, w0. Comment on why the momentum distribution is independent of the laser wavelength but the angular spread is not. [Hint: Δx = w0/2.]
(6.8) The transverse velocity of photons
Write an equation for the angular divergence Δθ of a laser beam with wavelength λ and beam waist w0. Use this expression to estimate an average transverse velocity of photons, vx, for a red laser pointer with λ = 0.63 μm and w0 = 1.0 mm. Estimate the difference between the longitudinal velocity, vz, and the speed of light $c = 3.0 \times 10^8$ m s⁻¹. Assume that the beam is cylindrically symmetrical.
(6.9) Propagation
Explain, briefly, why if you know the light field in the z = 0 plane, $E^{(0)}$, you can then determine how the light will propagate downstream; whereas if you only know the intensity, $I^{(0)}$, you can’t. What information is missing in the intensity?
(6.10) Diffraction grating
A one-dimensional diffraction grating has a transmission profile,
$$T(x') = 0.5 + 0.4 \cos\left(2\pi \frac{x'}{d}\right) + 0.1 \cos\left(4\pi \frac{x'}{d}\right) ,$$
where d is the period of the grating. Show that the intensity Fraunhofer diffraction pattern consists of five spots. What is their angular location? Calculate the relative intensities of the five spots.
(6.11) Four identical apertures (1)
Four identical infinitesimally small holes are aligned along the x′ axis, with neighbours separated by b. The transmission function of the aperture is
$$T(x', y') = \delta(x' + 3b/2, y') + \delta(x' + b/2, y') + \delta(x' - b/2, y') + \delta(x' - 3b/2, y') .$$
Show that the Fraunhofer intensity diffraction pattern as a function of angles θx, θy is given by
$$I(\theta_x, \theta_y) = 4 I_1 \left[ \cos\left(2\pi \frac{3b\theta_x}{2\lambda}\right) + \cos\left(2\pi \frac{b\theta_x}{2\lambda}\right) \right]^2 ,$$
where $I_1$ is the intensity which would be obtained from one hole. Sketch $I/I_1$ as a function of θx.
Optics f2f: From Fourier to Fresnel. Charles S. Adams and Ifan G. Hughes
© Charles S. Adams and Ifan G. Hughes 2019.
Published in 2019 by Oxford University Press. DOI: 10.1093/oso/9780198786788.001.0001
112 Optical phenomena in the time domain
where f(t) describes the time dependence of the field. The quantity A(ω) = E0 F(ω) represents the frequency spectrum which can also be written as a function of k or λ. Note that the units of the spectrum A(ω) are different to the units of the angular spectrum A.¹ Next, we use the Fourier transform to find the frequency spectrum of a square pulse.
¹ As a consequence of the different variables in the Fourier transform, eqns (6.7) and (6.13).
Example 7.1
A mode-locked laser: In a laser cavity of length L (see Fig. 11.6 in Chapter 11), the boundary conditions for the electric field at the mirrors allow only discrete angular frequencies; for the mth mode the angular frequency is ωm = mω1 = mπc/L. Therefore the spectrum is a sum of electric fields of the form Em exp(−iωm t + iφm), with a phase φm for each mode. By a process known as mode locking, it is possible to arrange for all the modes to oscillate with the same phase, φm, in which case we can write the spectrum as F(ω) = Σm Fm exp(−iωm t). If we assume that all N modes have the same amplitude, then we can write F(ω) as a product of the infinite frequency-replicating function, Xω1(ω), and a rectangle function that selects N modes, rect[(ω − ωc)/Δω]; F(ω) = F0 Xω1(ω) rect[(ω − ωc)/Δω]; here F0 has dimensions of time. The rect function is centred on ωc, the central angular frequency of the laser spectrum, and Δω = (N − 1)ω1 is the bandwidth occupied by the excited modes. The mode number for the central frequency is given by the ratio of the cavity length to half the wavelength, 2L/λ. For lasers with a cavity length of L ∼ 1 m, the excited modes have mode numbers of order 10⁶. We use eqn (7.8) to calculate the temporal profile:
$$
\begin{aligned}
f(t) = \mathcal{F}^{-1}[F(\omega)](t) &= F_0\,\mathcal{F}^{-1}\!\left[X_{\omega_1}(\omega)\times\mathrm{rect}\!\left(\frac{\omega-\omega_c}{\Delta\omega}\right)\right], \\
&= F_0\,\mathcal{F}^{-1}\!\left[X_{\omega_1}(\omega)\right] * \mathcal{F}^{-1}\!\left[\mathrm{rect}\!\left(\frac{\omega-\omega_c}{\Delta\omega}\right)\right], \\
&= \frac{F_0\,\Delta\omega}{4\pi^2}\,\exp(-i\omega_c t)\,X_T(t) * \mathrm{sinc}\!\left(\frac{\Delta\omega\,t}{2}\right),
\end{aligned}
\tag{7.9}
$$
where T = 2π/ω1. Note that we have used the convolution theorem between steps two and three, and eqns (B.5), (B.9), and (B.13) between steps three and four. This function is a set of periodically displaced sinc functions in time, and is known as a mode-locked pulse train.⁶ The field profile is shown in Fig. 7.7. We see that the pulses are separated in time by T = 2π/ω1 = 2L/c: this is the round-trip time of the cavity. It is also evident that the temporal width of the pulses decreases as the bandwidth of the laser, Δω, increases.
⁶ It can also be written explicitly as a sum without the comb function:
$$ f(t) = \frac{F_0\,\Delta\omega}{4\pi^2}\,\exp(-i\omega_c t)\times\sum_{n=-\infty}^{+\infty}\mathrm{sinc}\!\left[\frac{\Delta\omega\,(t-nT)}{2}\right]. $$
Fig. 7.7 The time dependence of the output of a mode-locked laser. The pulses are separated by the cavity round-trip time, T = 2L/c, and have a width inversely proportional to the laser bandwidth, τ = 2π/Δω.
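The pulse train of Example 7.1 can be checked numerically by summing the cavity modes directly. The following is a minimal sketch rather than the authors' method: the function name `mode_locked_field`, the choice of N = 11 modes, and the small central mode number (100 instead of ∼10⁶, to keep the sampling manageable) are all assumptions made for illustration.

```python
import numpy as np

def mode_locked_field(t, n_modes, omega_1, m_centre):
    """Sum n_modes equal-amplitude, equal-phase modes centred on mode m_centre."""
    # Mode numbers m_centre - (N-1)/2 ... m_centre + (N-1)/2 (N assumed odd).
    m = m_centre + np.arange(n_modes) - (n_modes - 1) // 2
    # Each mode contributes exp(-i * m * omega_1 * t); the sum is the pulse train.
    return np.exp(-1j * np.outer(m * omega_1, t)).sum(axis=0)

c = 3.0e8                      # speed of light (m/s)
L = 1.0                        # cavity length (m), as in the example
omega_1 = np.pi * c / L        # mode spacing, omega_m = m * pi * c / L
T = 2 * np.pi / omega_1        # pulse separation, equal to 2L/c
N = 11                         # number of locked modes (illustrative assumption)

t = np.linspace(0.0, 2 * T, 4001)   # two round trips; t[2000] equals T
intensity = np.abs(mode_locked_field(t, N, omega_1, m_centre=100)) ** 2
```

With these assumed values the intensity peaks at N² = 121 at t = 0, T, and 2T, and increasing N (and hence Δω) narrows each pulse.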
Example 7.2
Ar+ ion laser: For the Ar+ ion laser we can write the Fourier spectrum as
$$ F(\omega) = X_{\omega_1}(\omega)\,G_0\exp\!\left[-\frac{(\omega-\omega_c)^2}{\Delta\omega^2}\right], $$
where Δω is the bandwidth of the excited modes, and ωc the central angular frequency of the laser. When all the modes are locked, we calculate the temporal shape of the output:
$$
\begin{aligned}
f(t) &= \mathcal{F}^{-1}[F(\omega)](t), \\
&= \mathcal{F}^{-1}\!\left[X_{\omega_1}(\omega)\times G_0\exp\!\left(-(\omega-\omega_c)^2/\Delta\omega^2\right)\right], \\
&= G_0\,\mathcal{F}^{-1}\!\left[X_{\omega_1}(\omega)\right] * \mathcal{F}^{-1}\!\left[\exp\!\left(-(\omega-\omega_c)^2/\Delta\omega^2\right)\right], \\
&= \frac{\Delta\omega\,G_0}{4\pi^{3/2}}\,\exp(-i\omega_c t)\,X_T(t) * \mathrm{gauss}\!\left(\frac{\Delta\omega\,t}{2}\right).
\end{aligned}
\tag{7.10}
$$
This mode-locked pulse train, shown schematically in Fig. 7.8, is a set of periodically displaced gaussian functions in time. As expected, the pulses are separated in time by 2L/c, and again it is evident that the temporal width of the pulses decreases as the bandwidth of the laser increases.
Fig. 7.8 The time dependence of the output of a mode-locked laser with a gaussian gain profile. The gaussian pulses are separated by the cavity round-trip time, T = 2L/c, and have a (1/e-intensity) width that is inversely proportional to the laser bandwidth, τ = 1/Δω.

7.6 Two frequencies
As we have done in earlier chapters, we will start our investigation of optical fields that are not monochromatic by considering the simplest possible case: the addition of two waves of almost equal frequency. Let E1 and E2 be two harmonic waves propagating along the z direction. Their sum is
$$ E = E_0\cos(k_1 z - \omega_1 t) + E_0\cos(k_2 z - \omega_2 t), \tag{7.11} $$
where we have assumed for simplicity that both waves have the same amplitude. Using a standard trigonometric identity⁷ we can rewrite this as
$$ E = 2E_0\cos\!\left(\bar{k}z - \bar{\omega}t\right)\cos\!\left(\Delta k\,z/2 - \Delta\omega\,t/2\right), \tag{7.12} $$
⁷ cos A + cos B = 2 cos[(A + B)/2] cos[(A − B)/2].
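The product form of eqn (7.12) can be checked against the direct sum of two equal-amplitude cosine waves. The sketch below uses made-up wavenumbers and angular frequencies in arbitrary units (assumptions, not values from the text), with k̄ and ω̄ the mean values and Δk = k2 − k1, Δω = ω2 − ω1.

```python
import numpy as np

# Illustrative values (assumptions): two waves with nearly equal k and omega.
E0 = 1.0
k1, k2 = 10.0, 10.5          # wavenumbers (arbitrary units)
w1, w2 = 20.0, 21.0          # angular frequencies (arbitrary units)

k_bar, w_bar = (k1 + k2) / 2, (w1 + w2) / 2   # carrier (mean) values
dk, dw = k2 - k1, w2 - w1                      # differences

z = np.linspace(0.0, 50.0, 2001)
t = 0.3

# Direct sum of the two harmonic waves.
E_sum = E0 * np.cos(k1 * z - w1 * t) + E0 * np.cos(k2 * z - w2 * t)
# Product form of eqn (7.12): fast carrier multiplied by a slow envelope.
E_product = 2 * E0 * np.cos(k_bar * z - w_bar * t) * np.cos(dk * z / 2 - dw * t / 2)
# The slow factor bounds the field and defines the beat envelope.
envelope = 2 * E0 * np.abs(np.cos(dk * z / 2 - dw * t / 2))
```

The two arrays agree to rounding error, and the field never exceeds the slowly varying envelope.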
$$ E(z,t) = \mathcal{F}^{-1}\!\left[e^{-i\omega(k)t}\,\mathcal{F}[E(z,0)]\right], \tag{7.16} $$
equation, eqn (6.29). Next, we solve this equation for the simplest
case of propagation in a dispersionless medium, where ω(k) is linearly
proportional to k.
Example 7.3
Dispersionless propagation: In a dispersionless medium, we can write the
dispersion relation as ω(k) = (c/n)k, where the refractive index, n, is independent
of frequency. In this case, eqn (7.16) is exactly solvable using the same method
we used to derive the Fresnel diffraction integral, Section 6.5. Writing h = F⁻¹[e^{−i(c/n)kt}](z) = δ[z − (c/n)t] and using the inverse convolution theorem, we find
$$ E(z,t) = \delta[z - (c/n)t] * E(z,0) = E[z - (c/n)t,\,0]. \tag{7.17} $$
The result is that the wave translates a distance z = (c/n)t without changing shape,
as shown in Figs. 7.10 and 7.11. This is the same result as we found in Section 1.13;
however, the use of eqn (7.16) is particularly powerful, as it can now be applied for
any form of ω(k), as we will show.
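Equation (7.16) maps directly onto a discrete Fourier transform. The following is a minimal sketch under stated assumptions: the helper name `propagate`, the grid, and the choice c/n = 1 in arbitrary units are illustrative, and NumPy's FFT imposes periodic boundaries and its own sign convention (a component at wavenumber k varies as exp(+ikz)), which may differ from the convention used in the text.

```python
import numpy as np

def propagate(E0_z, dz, omega_of_k, t):
    """Apply eqn (7.16): E(z,t) = F^-1[ exp(-i*omega(k)*t) * F[E(z,0)] ]."""
    k = 2 * np.pi * np.fft.fftfreq(E0_z.size, d=dz)   # wavenumber grid
    return np.fft.ifft(np.exp(-1j * omega_of_k(k) * t) * np.fft.fft(E0_z))

n_points = 1024
z = np.linspace(-50.0, 50.0, n_points, endpoint=False)
dz = z[1] - z[0]
v = 1.0                                   # c/n in arbitrary units (assumed)
E_initial = np.exp(-z**2 / 2.0)           # gaussian wave form at t = 0

shift_points = 128
t = shift_points * dz / v                 # time taken to move 128 grid points
# Dispersionless medium: omega(k) = (c/n) k.
E_final = propagate(E_initial, dz, lambda k: v * k, t)
```

For the linear dispersion relation the output is the input shifted by (c/n)t with its shape unchanged, as in eqn (7.17). Passing a different `omega_of_k` reuses the same function for a dispersive medium.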
If we call the integral, or envelope function, F(ξ), then the field can be written as
$$ E(z,t) = E_0\,e^{i(k_c z - \omega_c t)}\,F\!\left(t\left.\frac{d\omega}{dk}\right|_{k_c} - z\right). \tag{7.21} $$
The peak of the Fourier integral will occur when the phase factor is zero, such that all the components add constructively. Therefore the group will move to a location where
$$ t\left.\frac{d\omega}{dk}\right|_{k_c} = z, \tag{7.22} $$
and we can define a group velocity as¹¹
$$ v_{\rm gp} = \frac{d\omega}{dk}. \tag{7.23} $$
¹¹ As the derivative is always evaluated at kc (equivalently carrier angular frequency, ωc), the subscript to denote this is typically omitted.
In a dispersionless medium, where only the first two terms in eqn (7.18) are significant, we learn from eqn (7.21) that the pulse propagates without distortion in shape, as the envelope function that defines the group propagates at a constant velocity: the group velocity. Figure 7.11 is an example of the evolution of an optical pulse at the group velocity (the same as Fig. 7.10 but for a rectangular pulse). Also, although we have restricted our attention to waves moving along one direction (z), the full vector group velocity can be calculated by applying the gradient operator with respect to the wave vector, v_gp = ∇_k ω.
Recalling that the refractive index is defined by the relation k = n(ω)ω/c, we can also write the expression for the group velocity as
$$ v_{\rm gp} = \frac{c}{n + \omega\,dn/d\omega}. \tag{7.24} $$
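The step from the definition of the group velocity, eqn (7.23), to eqn (7.24) can be written out explicitly. A short derivation, taking the refractive-index relation in the form k = n(ω)ω/c:

```latex
\begin{align}
k &= \frac{n(\omega)\,\omega}{c}, \\
\frac{dk}{d\omega} &= \frac{1}{c}\left(n + \omega\frac{dn}{d\omega}\right), \\
v_{\rm gp} &= \frac{d\omega}{dk} = \left(\frac{dk}{d\omega}\right)^{-1}
            = \frac{c}{n + \omega\,dn/d\omega}.
\end{align}
```

When dn/dω = 0 the last line reduces to the phase velocity, c/n.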
This result helps us to classify two different regimes: (i) when dn/dω > 0, this case is referred to as normal dispersion, and vgp < vp; and (ii) when dn/dω < 0, this case is referred to as anomalous dispersion, and vgp > vp.¹² It must be emphasized that the group velocity is not the average of the phase velocities. Equation (7.24) says that the group velocity depends both on the refractive index and its derivative with respect to frequency. Many curious optical phenomena arise when the phase and group velocities are vastly different in magnitude; indeed, the phase and group velocities can even have different signs. We shall encounter some examples in the following sections. In a medium where the phase velocity (or equivalently refractive index) is constant over the frequency range spanned by the pulse, the group and phase velocities do not differ. As expected, in vacuum, vgp = vp = c.
¹² There are other mathematically equivalent ways of writing the expression for group velocity, as the refractive index can also be given as a function of wavelength or wave vector magnitude; an end-of-chapter exercise investigates some of the alternatives.
The concept of a group index is also useful in the context of dispersive media. The group index, ngp, is defined as
$$ n_{\rm gp} = \frac{c}{v_{\rm gp}} = n + \omega\frac{dn}{d\omega}. \tag{7.25} $$
We will now look at what happens when the gradient term, ω dn/dω, dominates.
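Equation (7.25) is easy to evaluate numerically for any model of n(ω). The sketch below is illustrative only: the function names and the finite-difference step are assumptions, the Cauchy coefficients for BK7 are those quoted in Exercise 7.16, and the 589 nm wavelength is an arbitrary choice.

```python
import numpy as np

C = 2.998e8  # speed of light (m/s)

def cauchy_index(omega, a=1.5046, b_nm2=4200.0):
    """Cauchy formula n = A + B/lambda^2 with lambda in nm (BK7 values, Exercise 7.16)."""
    wavelength_nm = 2 * np.pi * C / omega * 1e9
    return a + b_nm2 / wavelength_nm**2

def group_index(n_of_omega, omega, rel_step=1e-6):
    """Eqn (7.25): n_gp = n + omega * dn/domega, using a central finite difference."""
    d_omega = rel_step * omega
    dn_domega = (n_of_omega(omega + d_omega) - n_of_omega(omega - d_omega)) / (2 * d_omega)
    return n_of_omega(omega) + omega * dn_domega

wavelength = 589e-9                         # illustrative wavelength (m)
omega = 2 * np.pi * C / wavelength
n = cauchy_index(omega)
n_gp = group_index(cauchy_index, omega)
```

For this model the derivative term is positive (normal dispersion), so the group index exceeds the refractive index; analytically, n_gp = A + 3B/λ² for the two-term Cauchy formula.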
Example 7.4
Dispersion of a gaussian wave packet: We can use the hedgehog equation, eqn (7.16), to calculate the change in width (or duration) of a gaussian pulse due to GVD (the third term in eqn (7.18)). The pulse has initial wave form E(z, 0) = E0 g(z, 0)h(z, 0), where g(z, 0) = e^{−z²/2(Δz0)²} describes the pulse envelope and h(z, 0) = e^{ikc z}, the carrier wave. The Fourier transform of the envelope function is G(k) = √(2π) Δz0 e^{−(Δz0)²k²/2}. Multiplying by the propagator in eqn (7.16), e^{−iω(k)t}, the third term in eqn (7.18) produces a term of the form, e^{−β²k²/2}, where
$$ \beta^2 = (\Delta z_0)^2 + i\frac{d^2\omega}{dk^2}t. \tag{7.26} $$
Taking the inverse transform, we obtain
$$ g(z,t) = \frac{\Delta z_0}{\beta}\,e^{-z^2/2\beta^2} * \delta(z - vt), \tag{7.27} $$
where v = ωc/kc and
$$ \frac{1}{\beta^2} = \frac{1}{(\Delta z_0)^2 + i(d^2\omega/dk^2)t}. \tag{7.28} $$
This confirms that after propagation the wave form is still a gaussian, but with a different width. The modified width, Δzt, is found by separating the real and imaginary parts, i.e. writing
$$ \frac{1}{\beta^2} = \frac{1}{(\Delta z_t)^2} + i\alpha(t), \tag{7.29} $$
which gives
Fig. 7.12 Propagation of a gaussian pulse through a medium with group velocity dispersion. The centre of the pulse propagates at a speed vgp = c/ngp, where ngp is the group refractive index at the central frequency ωc.
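$$ (\Delta z_t)^2 = (\Delta z_0)^2 + \frac{1}{(\Delta z_0)^2}\left(\frac{d^2\omega}{dk^2}\right)^2 t^2, \tag{7.30} $$
and the imaginary part,
$$ \alpha(t) = -\frac{(d^2\omega/dk^2)\,t}{(\Delta z_0)^4 + (d^2\omega/dk^2)^2 t^2}, \tag{7.31} $$
which describes a linear chirp (the frequency, ω, depends linearly and the phase, ωt, quadratically, on time).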
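The width formula, eqn (7.30), can be checked by applying the propagator numerically to the envelope g(z, 0) = exp[−z²/2(Δz0)²]. This is a sketch under stated assumptions: only the GVD term of the dispersion relation is kept, so ω(k) is replaced by ½(d²ω/dk²)k², and the numerical values (arbitrary units) and variable names are illustrative.

```python
import numpy as np

dz0 = 1.0            # initial width Delta z_0 (arbitrary units, assumed)
w2 = 0.5             # d^2 omega / dk^2 (arbitrary units, assumed)
t = 6.0              # propagation time (arbitrary units, assumed)

z = np.linspace(-200.0, 200.0, 8192, endpoint=False)
k = 2 * np.pi * np.fft.fftfreq(z.size, d=z[1] - z[0])

g0 = np.exp(-z**2 / (2 * dz0**2))                        # envelope at t = 0
# Keep only the GVD term of the dispersion relation: omega(k) -> (1/2) w2 k^2.
g_t = np.fft.ifft(np.exp(-1j * 0.5 * w2 * k**2 * t) * np.fft.fft(g0))

# |g|^2 is proportional to exp(-z^2 / dz_t^2), so <z^2> = dz_t^2 / 2.
weights = np.abs(g_t) ** 2
dz_t_numeric = np.sqrt(2 * np.sum(weights * z**2) / np.sum(weights))
dz_t_formula = np.sqrt(dz0**2 + (w2 * t / dz0) ** 2)     # eqn (7.30)
```

The numerically measured width agrees with eqn (7.30); at t = 0 both reduce to Δz0.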
$$ \Delta t = |D|\,z\,\Delta\lambda. \tag{7.32} $$
For the gaussian pulse with spatial width, Δz0, from eqn (7.30) we find that vgp² Δt = (d²ω/dk²)(z/Δz0). Substituting for spectral width, Δk = −2πΔλ/λ² = 1/Δz0, and assuming vgp ≈ c inside the fibre, we obtain
$$ \Delta t = \frac{2\pi}{c^2\lambda^2}\frac{d^2\omega}{dk^2}\,\Delta\lambda\,z, \tag{7.33} $$
and therefore,
$$ |D| = \frac{2\pi}{c^2\lambda^2}\frac{d^2\omega}{dk^2}. \tag{7.34} $$
It is possible to write D in numerous ways but they all involve a second-order derivative of the refractive index; an end-of-chapter exercise investigates some of these alternatives. In optical communications, GVD causes the pulses to spread out and merge into one another, which limits the maximum data rate or propagation distance. For optical fibres, the maximal dispersion is specified by the International Telecommunication Union as |D| < 3.5 ps nm⁻¹ km⁻¹ at λ = 1.55 μm. In Fig. 7.13 we show a simulation of an optical communications signal to illustrate the effect of dispersion. The signal is a rectangular wave with amplitude between 0 and 1 and pulse duration τ (100 ps for 10 Gbit s⁻¹). The plots show both a 0 and 1 arriving within a time window centred around t = 0. This is known as an eye diagram. On the right we show a histogram of the signal level recorded within a time window τ/2. The histogram indicates the distinguishability between 0 and 1. In the lower plot, 0 and 1 are clearly distinguishable; for longer propagation distance (middle and top) they become less so. The overlap of the 0 and 1 histograms determines the bit-error rate (BER) of the communication channel.
Fig. 7.13 Eye diagrams in a simulated optical fibre communications link. The '0' and '1' signal wave forms at propagation distances of 0, 100, and 200 km are shown. If τ = 100 ps, then the bandwidth at λ = 1.55 μm (Δλ = (λ²/c)Δν for Δν = 20 GHz) is of order 0.15 nm. A dispersion coefficient of |D| = 2 ps nm⁻¹ km⁻¹ adds 30 ps of broadening over 100 km, which is enough to significantly increase the bit-error rate (upper plots).
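The numbers quoted in the caption to Fig. 7.13 follow from Δλ = (λ²/c)Δν and Δt = |D|zΔλ. A minimal sketch of the arithmetic, using the caption's values; the variable names are illustrative.

```python
C = 3.0e8                      # speed of light (m/s)

wavelength = 1.55e-6           # m
delta_nu = 20e9                # Hz, bandwidth quoted for tau = 100 ps
dispersion = 2.0               # |D| in ps nm^-1 km^-1
distance_km = 100.0

# Convert the frequency bandwidth to a wavelength bandwidth: dlambda = (lambda^2 / c) dnu.
delta_lambda_nm = wavelength**2 / C * delta_nu * 1e9
# Pulse broadening dt = |D| z dlambda, here in picoseconds.
broadening_ps = dispersion * distance_km * delta_lambda_nm
```

This gives Δλ ≈ 0.16 nm and Δt ≈ 32 ps, consistent with the order-of-magnitude figures (0.15 nm and 30 ps) in the caption, and a substantial fraction of the 100 ps bit period.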
effect of the large group index is to delay the pulse relative to the dispersionless case, as is apparent in the top frame of Fig. 7.15. This is the slow-light effect.
In practice, a large real part of the refractive index is associated with a concomitant imaginary part (see Section 13.5) which leads to loss of light via scattering. This loss is apparent as a reduction in the intensity of the pulse in Fig. 7.15. To avoid loss, it is preferable to choose the angular frequency of the light, ωc, as far away as possible from the resonant frequency of the medium, ω0; however, this also results in lower dispersion. A solution to this problem is to exploit a phenomenon known as EIT (electromagnetically induced transparency), where an additional laser is used to engineer the dispersion of the medium.¹³
¹³ Using this idea, Schmidt and colleagues (Schmidt et al. 1996) achieved a group velocity of c/3000. Later, Hau and colleagues (Hau et al. 1999) slowed light to vgp = 17 m s⁻¹, slower than a cyclist! In EIT experiments temporal control of the relevant optical fields also allows one to store light; see Fleischhauer et al. (2005) for details.
Fig. 7.15 Propagation of a non-resonant gaussian pulse with central angular frequency, ωc < ω0. The delay relative to the dispersionless medium (grey curve in the top graph) is the slow light effect: ω dn/dω ≫ 1. The normalized spectrum (grey) and n(ω) − 1 are shown inset.

7.12 Fast light
By exploiting the anomalous dispersion region, it is possible to realize media where the second term in eqn (7.25) is both large and negative, in which case the group index is negative. This phenomenon is known as fast light. The controversial description of 'superluminal propagation' is occasionally used to describe this regime. To observe fast light, one must satisfy the inequality dn/dω < −n/ω. One strategy to fulfil
Exercises

(7.1) Spectrum of a rectangular pulse
Explicitly evaluate the integral in eqn (7.1) for the temporal function defined in eqn (7.2), and verify that the spectrum of a rectangular pulse in time is indeed two displaced sinc functions.

(7.2) Spectrum of an isolated triangular pulse
A triangular pulse of duration τ has a profile that is zero for t ≤ −τ/2; the electric field increases linearly for −τ/2 ≤ t ≤ 0, decreases linearly to zero for 0 ≤ t ≤ τ/2, and is zero for t ≥ τ/2. Calculate the frequency spectrum of this pulse. [Hint: a triangular function can be obtained by convolving two rectangular pulses of half the width.]

(7.3) Spectrum of a periodic train of triangular pulses
What is the frequency spectrum of a periodic train (period T) of triangular pulses of duration τ?

(7.4) Appearance of negative frequencies in the Fourier transform of pulses
Start with F(ω) = ∫_{−∞}^{∞} f(t) e^{iωt} dt. Note that F(ω) is in general complex, and we seem to need to sum over negative frequencies. In this question we shall consider the physical significance of these terms.
(i) First, show that for a real function f(t), the function F(ω) obeys the relation F(ω)* = F(−ω).
(ii) Next, show that if f(t) is a real even function, then F(ω) = 2∫_{0}^{∞} f(t) cos ωt dt, which is real, and F(ω) = F(−ω). Therefore the real part of F(ω) tells us how much cos ωt there is in f(t).
(iii) Show that if f(t) is a real odd function, then F(ω) is purely imaginary, and F(ω) = −F(−ω). Therefore the imaginary part of F(ω) tells us how much sin ωt there is in f(t).
(iv) Now we can get rid of the negative frequencies completely. Using the results of the earlier parts of this question, show that we can always write the transform of a real function as a sum over positive frequencies with real amplitudes.

(7.5) Fourier transform of two pulses
Verify the result in the text, that the Fourier transform of a pair of identical pulses of duration τ and separation T is given by eqn (7.5).

(7.6) Uniform amplitude mode-locked pulse train
In the text we have used Fourier methods to calculate the time dependence of the uniform amplitude mode-locked pulse train. For this particular case the spectrum can also be calculated as the sum of a geometric series. Show explicitly that this technique generates an identical answer to that of eqn (7.9).

(7.7) Width of mode-locked pulses
In an Ar+ ion laser of cavity length L = 2.00 m, the gain bandwidth has a gaussian profile and a standard deviation of 2π × 1.0 GHz. Calculate the temporal duration and separation of the pulses in the mode-locked train.

(7.8) Bandwidth of short-pulse lasers (1)
A mode-locked Ti:sapphire laser has a central operating wavelength of 800 nm, and produces pulses of duration 10 fs. (i) Sketch the form of the optical field. (ii) What is the angular-frequency bandwidth of the pulses? (iii) What is the wavelength bandwidth of the pulses?

(7.9) Bandwidth of short-pulse lasers (2)
Show that if an optical pulse only lasts for a duration of approximately N cycles, the bandwidth of the spectrum is approximately 1/N of the central frequency.

(7.10) Intensity of mode-locked pulses
(i) Show that the electric field strength of the mode-locked pulse train of eqn (7.3) has a peak amplitude proportional to N, the number of excited modes. (ii) Hence show that the peak intensity scales as N². (iii) Further, show that the duration of the pulses is inversely proportional to N. (iv) Hence show that the average intensity scales as N. This result emphasizes that the mode-locked pulses arise as a consequence of interference, a process that can redistribute energy but can neither increase nor destroy the sum of the energy of the individual modes.

(7.11) Temporal form of mode-locked pulse train
Write the temporal form of the gaussian mode-locked pulse train of eqn (7.10) explicitly as a sum without the comb function.

(7.12) Photon lifetime inside a Fabry–Perot cavity
An ultra-short gaussian light pulse with centre frequency, ωc, is emitted inside a Fabry–Perot cavity, see Section 3.11. The field outside the
cavity for t > 0 is
$$ E(t) = E_0\,e^{-(i\omega_c + \kappa/2)t}\sum_m e^{-(t - mT)^2/\tau^2}, $$
where 1/κ is the cavity decay time, T = 2ℓ/c is the cavity round-trip time, and m is an integer. The two interfaces have intensity reflectivities of R and unity, respectively. As we lose a fraction 1 − R on each round-trip, the change in the pulse intensity is δIp/δt = −(1 − R)Ip/[(2ℓ)/c], which has the solution
$$ I_p = I_0\,e^{-(1-R)t/[(2\ell)/c]}, $$
which gives κ ≈ (1 − R)c/(2ℓ). To find the spectrum of light emitted by the cavity, |F(ω)|², it is convenient to write the time dependence as f(t) = g(t)h(t), where
$$ g(t) = \begin{cases} 0 & t < 0 \\ e^{-(i\omega_c + \kappa/2)t} & t > 0 \end{cases}, $$
and
$$ h(t) = X(t/T) * \mathrm{gauss}(t/\tau). $$
Find an expression for the Fourier transforms, G(ω) = F[g(t)] and H(ω) = F[h(t)] and then use the convolution theorem to find an expression for the spectrum of light |F(ω)|² emitted by the cavity. What is the width of the peaks in terms of κ? How does this width of the peaks compare to the width of the transmission resonances of a Fabry–Perot interferometer, eqn (3.38)?

(7.13) Slow and fast oscillations for two colours
A sodium lamp emits light with two wavelengths, 589.0 and 589.6 nm. Evaluate the quantities k̄ = (k1 + k2)/2, ω̄ = (ω1 + ω2)/2, Δk = k2 − k1, and Δω = ω2 − ω1. Comment on the magnitudes of your results.

(7.14) Alternative expressions for the group velocity
(i) Starting with the definition vgp = dω/dk, and recalling that the phase velocity is defined as vp = ω/k, show that vgp = vp + k dvp/dk = vp − λ dvp/dλ. What signs do dvp/dk and dvp/dλ take in regions of normal and anomalous dispersion, respectively?
(ii) Show that vgp = c/(n − λ dn/dλ).
(iii) Show that
$$ v_{\rm gp} = v_{\rm p}\left(1 - \frac{k}{n}\frac{dn}{dk}\right). $$

(7.15) Phase and group velocities for matter waves
(i) Use a trial solution, ψ = Ae^{i(kz−ωt)}, in the Schrödinger equation to derive the following dispersion relation for the matter wave associated with a particle of mass m in free space: ℏk²/2m = ω. (ii) Show that the phase velocity is vp = ℏk/2m. (iii) Show that the group velocity is vgp = ℏk/m. (iv) Interpret these results. (v) Do matter waves in free space exhibit normal or anomalous dispersion?

(7.16) Cauchy's formula for refractive index variation
Cauchy found an empirical formula for the variation of refractive index as a power series in 1/λ². The simplest form of his relation is n = A + B/λ². For BK7 borosilicate glass the coefficients are A = 1.5046 and B = 4200 nm². Find (a) the refractive index, and (b) the group index at: (i) 400 nm, (ii) 500 nm, and (iii) 600 nm.

(7.17) GVD-induced pulse broadening (1)
For an optical pulse where group velocity dispersion is not negligible, show that the pulse broadening is Δt = |D|zΔλ, where the dispersion parameter is
$$ |D| = \frac{\lambda}{c}\frac{d^2 n}{d\lambda^2}. $$
Show that this can also be written as
$$ |D| = \frac{2\pi c}{\lambda^2}\frac{d^2 k}{d\omega^2}. $$

(7.18) GVD-induced pulse broadening (2)
(i) Write down the form of the electric field for a gaussian pulse with initial spatial width (rms) Δz0. (ii) By taking the Fourier transform, evaluate the wave vector spectrum using eqn (7.15). (iii) Keep the group velocity dispersion term in the expansion of eqn (7.18), and evaluate the field at a time t later from eqn (7.14). [Hint: for a gaussian profile the integral is analytic, having completed the square.] (iv) Show that the spatial width after propagation, Δzt, is exactly of the form of eqn (7.30).

(7.19) Slow light
What is the group index for a pulse of light slowed to 17 m s⁻¹?

(7.20) Fast light
In a slow or fast light medium there is a compression of the physical length of the pulse by a factor 1/|ngp|. What is the length of a 25 ns long pulse (i) in free space, and (ii) in a fast light medium with ngp = −1 × 10⁶? [Hint: these are the parameters from Jennewein et al. (2016).]
8 Coherence

Io ritornai dalla santissima onda, rifatto.
I return to the sacred wave, refreshed.
Dante Alighieri (Florence 1265–Ravenna 1321), Divina Commedia (1308–20)

8.1 Introduction
8.2 Statistical light
8.3 Temporal coherence
8.4 White light
8.5 Wiener–Khinchin–Einstein theorem
8.6 Power spectral density
8.7 Intensity correlations
8.8 Spatial coherence
8.9 van Cittert–Zernike
8.10 Propagation of coherence
8.11 Stellar interferometry
Chapter summary
Exercises

8.1 Introduction
In previous chapters, when discussing interference and diffraction, we restricted our attention to monochromatic light with a single wavelength λ and a well-defined phase, i.e. light fields that can be described in terms of the harmonic wave solution, eqn (1.9) of Chapter 1. In this chapter, we shall extend our discussion to include light with more than one frequency, components whose relative phase varies with time, and extended sources that emit neither plane nor spherical waves. The concept of coherence relates to the extent that knowledge of the electric field at one point in space and time provides information about the phase at other points, i.e. to what extent there is a correlation between the fields at the two locations (displaced in either space or time). The extent of these correlations between fields at different locations gives us a quantitative measure of the coherence, and there exists a continuous scale between fully coherent and completely incoherent.
Incoherence describes the situation when we do not have complete information about the field, and the best we can do is to express the field as a statistical mixture of different components. The unknown phases in incoherent light are treated as random. This is a significant break with previous chapters where we have assumed that we had complete knowledge of the field, and could write it as a superposition of fully coherent plane or spherical wave solutions, eqns (2.10) and (2.33), recall Fig. 2.1. Real light fields do not behave like these idealizations because light sources are themselves statistical, emitting different frequencies with a different time dependence, as in Fig. 8.1. For a statistical source it is impossible to completely predict the relative phase at subsequent times. In addition, the capabilities of measurement are limited. In paraxial optics, where we choose to detect only light close to a particular axis, we select parts of the field with particular phase relationships and may find that coherence grows with propagation distance. Consequently, it is important to emphasize that:
Fig. 8.1 A statistical light source (in the plane on the left) emits light with different frequencies and phases that evolve with time (the black and grey waves indicate two times). Our knowledge about the source is limited by what we choose to detect.
where we have used the standard relationship between intensity and field amplitude, I = ½ε0cE². Now we perform the time average. The first two terms have a constant average (of I1 and I2, respectively) but the last term has a magnitude that varies⁴ with the sum (ω1 + ω2)/2, and difference |ω1 − ω2|/2, and what we see depends on whether our detector is fast enough to follow these frequencies. As optical frequencies are in the hundreds of terahertz range, and even fast detectors can only measure tens of gigahertz, it is typically the case that we will only see the time averages of these terms, which are zero. Consequently, the time-averaged intensity is
$$ \langle I\rangle = I_1 + I_2. \tag{8.4} $$
⁴ cos A cos B = ½ cos(A − B) + ½ cos(A + B).
This expression says that if the light field contains different frequency components, and we measure the time-averaged intensity, then the total intensity is given by the sum of the intensities of each component. We can generalize this result in the form of a crude rule,⁵ namely: for coherent light we add amplitudes and then square to find the intensity; whereas for incoherent light we simply add intensities:
Rule: coherent, add amplitudes; incoherent, add intensities.
⁵ Crude because all light is somewhere between coherent and incoherent.
Fig. 8.3 Top: Time dependence, f(t), for a monochromatic wave with central angular frequency, ωc, that is phase shifted at random times t1, t2, t3, etc. The average time between these phase jumps corresponds to the coherence time. Below: The spectrum of the wave, given by |F(ω)|², where F(ω) = F[f(t)]. The spectrum (shown in grey) is described by a Lorentzian function (black line) with a width inversely proportional to the coherence time. The Fourier relationship between the spectrum width and correlation (or coherence) time is an example of the Wiener–Khinchin–Einstein theorem, see Section 8.5 and Exercise 8.11.
We now extend our discussion to the more realistic case of fields composed of more than two discrete frequencies. The randomness associated with incoherence arises from the addition of many independent waves with different phases and different frequencies. In this case, inspired by Fig. 8.3, we define coherence time, τc, as the inverse of the bandwidth⁶ of the light:
$$ \tau_c = \frac{1}{\Delta\nu}, \tag{8.5} $$
and the distance light travels in the coherence time is known as the coherence length, Lc, defined as
$$ L_c = c\,\tau_c. \tag{8.6} $$
⁶ A more detailed discussion of how the width of various functions is defined is found later in the chapter.
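Equations (8.5) and (8.6) are simple enough to tabulate for any bandwidth. In the sketch below the function names and the three bandwidths are hypothetical choices for illustration; they are not the entries of Table 8.1.

```python
C = 3.0e8  # speed of light (m/s)

def coherence_time(bandwidth_hz):
    """Eqn (8.5): tau_c = 1 / delta_nu."""
    return 1.0 / bandwidth_hz

def coherence_length(bandwidth_hz):
    """Eqn (8.6): L_c = c * tau_c."""
    return C * coherence_time(bandwidth_hz)

# Hypothetical bandwidths chosen for illustration (not the Table 8.1 values).
sources = {
    "broadband (300 THz)": 3.0e14,
    "narrow line (1 GHz)": 1.0e9,
    "stable laser (1 MHz)": 1.0e6,
}
lengths = {name: coherence_length(dnu) for name, dnu in sources.items()}
```

The assumed bandwidths span coherence lengths from about a micrometre (300 THz) to 300 m (1 MHz), which is why narrow-linewidth sources are needed when path differences are large.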
⁷ Equation (8.5) and Table 8.1 clearly demonstrate the motivation for using stable lasers in interferometers where large path differences are encountered.
Table 8.1 lists the spectral widths and coherence lengths for five different light sources.⁷ The coherence length of sunlight is so short that path
Table 8.1 Central wavelength, λc , spectral width, coherence time, and length for five
different light sources.
i.e. the sum of the two interference patterns, one for each colour on its
own. The intensity as a function of Δ, known as an interferogram,
is plotted at the bottom of Fig. 8.4 and on the right in row (iii) of
Fig. 8.5. The pattern is similar to the phenomenon of beats in the time
domain, discussed in Chapter 7. As we shall show, the interferogram
provides a means of determining the spectrum of a source. As we add
more frequencies, row (iv) in Fig. 8.5, some of the maxima in the beat
pattern are suppressed. For a continuous spectrum, row (v), all but
the central fringe in the interferogram are suppressed. As we might
guess by comparing Fig. 8.5 to Fig. 6.7, the input spectrum and the
interferogram are related via a Fourier transform. We shall quantify
this Fourier relationship in Section 8.5. The use of a Michelson, or
other interferometer, to characterize an unknown spectrum is known as
Fourier-transform spectroscopy.
Figure 8.5 also illustrates the relationship between fringe visibility
and the input spectrum. The visibility of the fringes is equal to 1 for all
path-length differences for a monochromatic input, rows (i) and (ii); the
visibility oscillates periodically as a function of path-length difference for
discrete frequencies in the input, rows (iii)–(iv); the visibility rapidly
falls to zero as a function of path-length difference for white light input,
row (v). As we discussed in Section 8.3 the maximum path difference
at which interference can be observed is the coherence length. Row (v)
The first term is the sum of the intensities of the waves from the separate
arms, and the second term involves the average of the product of the
fields of the separate waves. Γ(τ) is known as the autocorrelation
function of the electric field,[12] formally defined as

\[ \Gamma(\tau) = \langle E^{*}(t)\,E(t+\tau) \rangle = \frac{1}{T_a}\int_0^{T_a} E^{*}(t)\,E(t+\tau)\,\mathrm{d}t , \tag{8.11} \]

where the averaging time Ta is long compared to the characteristic
timescale of the fluctuations.[13] It is also useful to write a normalized
form of the autocorrelation,

\[ \gamma(\tau) = \frac{\Gamma(\tau)}{\Gamma(0)} = \frac{\langle E^{*}(t)\,E(t+\tau) \rangle}{\langle E^{*}(t)\,E(t) \rangle} . \tag{8.12} \]

[12] It is also known as the first-order correlation function. Later in the
chapter, when we meet intensity correlations, second-order coherence
functions will appear.

[13] To calculate the autocorrelation of a function we multiply each point of
the function by another point a time τ later, and then sum the products
over the integration duration Ta. Mathematically it is very similar to
the convolution operation, Section B.4; however only one function is needed,
and there is no need to take the mirror image in the multiplication.

The normalized autocorrelation function is constrained to lie within the
range 0 ≤ |γ(τ)| ≤ 1. As it quantifies the temporal correlation between
the two fields in the Michelson interferometer it is a measure of the
coherence of the fields. Equation (8.12) quantifies the earlier discussion:
when |γ(τ)| = 1 the fields are fully coherent, ‘add amplitudes’; when |γ(τ)| = 0
the fields are fully incoherent, ‘add intensities’; and when |γ(τ)| ≠ 0 or 1 the
fields are partially coherent. Evidently the temporal behaviour of γ(τ)
allows a quantification of the coherence time and length.
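The definition of the normalized autocorrelation can be tried out numerically. The following is a minimal sketch, assuming a field sampled at equal time steps; the function name `normalized_autocorrelation`, the sampling parameters, and the random phase-reset model are illustrative choices rather than anything prescribed in the text.

```python
import numpy as np

def normalized_autocorrelation(field, lag):
    """Discrete estimate of gamma(tau) = <E*(t) E(t+tau)> / <E*(t) E(t)>.

    `field` holds equally spaced samples of the complex field and `lag`
    is the delay tau expressed as a whole number of samples (lag >= 0).
    """
    n = len(field) - lag
    gamma_tau = np.mean(np.conj(field[:n]) * field[lag:lag + n])
    gamma_0 = np.mean(np.abs(field[:n]) ** 2)
    return gamma_tau / gamma_0

dt = 0.01                       # sample spacing (arbitrary time units)
t = np.arange(0, 2000, dt)
omega0 = 2 * np.pi              # angular frequency (assumed value)

# Monochromatic wave: |gamma| should stay equal to 1 for every delay.
mono = np.exp(-1j * omega0 * t)

# Assumed model of chaotic light: the phase is reset at random,
# on average once per tau_c.
rng = np.random.default_rng(0)
tau_c = 2.0
jumps = rng.random(t.size) < dt / tau_c
phase = np.cumsum(jumps * rng.uniform(0, 2 * np.pi, t.size))
chaotic = np.exp(-1j * (omega0 * t + phase))

for lag in (0, 100, 400):
    print(lag * dt,
          abs(normalized_autocorrelation(mono, lag)),
          abs(normalized_autocorrelation(chaotic, lag)))
```

For the monochromatic wave the modulus stays at one, whereas for the phase-reset wave it falls as the delay grows beyond the assumed `tau_c`, which is the behaviour used to define a coherence time.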
The autocorrelation is also related to the fringe visibility. We can
write the time-averaged intensity of eqn (8.9) as
I = 2I0 {1 + γ(τ )} , (8.13)
where I0 is the intensity from either wave alone. Therefore, for this case,
we have
\[ V = \frac{I_{\max} - I_{\min}}{I_{\max} + I_{\min}} = \frac{(1 + |\gamma(\tau)|) - (1 - |\gamma(\tau)|)}{(1 + |\gamma(\tau)|) + (1 - |\gamma(\tau)|)} = |\gamma(\tau)| , \tag{8.14} \]
i.e. the fringe visibility is equal to the modulus of the normalized
autocorrelation function of the fields. This confirms mathematically our
earlier result that coherent light gives clear fringes; partially coherent
light gives less distinct interference fringes; and incoherent light does
not produce interference fringes.[14]

[14] For this particular case of the two waves having equal amplitude, the
visibility is exactly equal to the modulus of the autocorrelation; this
one-to-one correspondence is not obtained with waves of different
intensities, as is investigated in an end-of-chapter exercise.

Table 8.2 shows the functional form for the autocorrelation function,
γ(τ), for a monochromatic wave, a Lorentzian chaotic light source
(similar to the example shown in Fig. 8.3 and explored in Exercise 8.11),
and a Doppler-broadened light source.
Table 8.2 Example of the autocorrelation function γ(τ ) for three different
types of light waves of central angular frequency ωc . Typical values of the
correlation time, τc , for different light sources can be found in Table 8.1.
spectrum of the light. As we saw in Section 8.2, the electric field from a
sum of many waves with different frequencies may be chaotic or random,
but the theorem also applies to any stationary random process and
its spectrum. A random, or stochastic, process is called stationary if
the statistical properties such as the mean and variance—calculated
from the probability densities governing the fluctuations; see Hughes
and Hase (2010)—are invariant under a translation of the origin of time.
Specifically, when calculating the average in eqn (8.11) the value of the
integral is independent of the initial time.[17]

[17] The wave form shown in Fig. 8.3 is an example of a stationary random
function, where the statistical properties do not change over time.

Even for a field with random fluctuations, we can still write the average
intensity as a sum of contributions from components with angular
frequency, ω, which is proportional to[18]

[18] Note in particular the use of the dummy variables for time in the Fourier
\[ \Delta\nu = \frac{1}{\tau_c} , \tag{8.23} \]
Example 8.1
Fourier transform spectroscopy: We can further highlight the role of the Fourier
transform in Michelson interferometry with broadband statistical light by looking
again at the intensity at the output, as expressed by eqn (8.9). Recalling the
Fourier relationship between the electric field’s autocorrelation and the power spectral
density, eqns (8.17) and (8.18), and the link between intensity and power spectral
density, eqn (8.19), and taking advantage of the fact that S(ω) is real, we can rewrite
this expression as[23]

\[ I = 2\int_0^{\infty} S(\omega)\left[1 + \cos(\omega\tau)\right]\mathrm{d}\omega . \tag{8.24} \]

[23] Note the different integration limits to eqn (8.19).
The interpretation of this result is that the output of the Michelson interferometer
as a function of time delay, or path-length difference—the interferogram—is a sum
of contributions produced by each monochromatic component of the input light,
weighted by the power spectral density. For a sum of discrete outputs we have seen
that the visibility collapsed and then revived; here, with a continuous input spectrum,
there is a decrease in the visibility as a function of the time delay. Measuring the
decrease in visibility of the interference fringes as a function of path-length difference,
as in Fig. 8.6, allows us to evaluate the spectrum of the input light (the power spectral
density). This is the basis of a widely used technique known as Fourier transform
spectroscopy.
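Equation (8.24) can be evaluated numerically to see how a continuous spectrum washes out the fringes. The sketch below assumes a gaussian power spectral density in arbitrary units; the function name `interferogram`, the grid sizes, and the values of `omega_c` and `sigma` are illustrative assumptions.

```python
import numpy as np

def interferogram(omega, spectrum, tau):
    """Trapezoidal-rule estimate of eqn (8.24) for each delay in `tau`.

    omega    : equally spaced angular frequencies (omega >= 0)
    spectrum : power spectral density S(omega) sampled on `omega`
    tau      : array of time delays
    """
    integrand = spectrum[None, :] * (1 + np.cos(np.outer(tau, omega)))
    d_omega = omega[1] - omega[0]
    area = integrand.sum(axis=1) - 0.5 * (integrand[:, 0] + integrand[:, -1])
    return 2 * d_omega * area

# Assumed example spectrum: a gaussian line centred on omega_c.
omega_c, sigma = 10.0, 0.5
omega = np.linspace(0.0, 20.0, 2001)
spectrum = np.exp(-(omega - omega_c) ** 2 / (2 * sigma ** 2))
tau = np.linspace(0.0, 20.0, 401)

signal = interferogram(omega, spectrum, tau)

# At zero delay every component interferes constructively, I(0) = 4 * integral of S.
total = interferogram(omega, spectrum, np.array([0.0]))[0] / 4

# For this gaussian spectrum the fringe envelope is itself a gaussian in tau.
envelope = np.exp(-(sigma * tau) ** 2 / 2)
expected = 2 * total * (1 + envelope * np.cos(omega_c * tau))
print(np.max(np.abs(signal - expected)))
```

The fringes oscillate at the central frequency while their visibility decays on a timescale of order 1/`sigma`, so a broader spectrum gives a shorter interferogram.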
plane waves with angles in the range −as /(2zs ) < θs < as /(2zs ) to enter
the interferometer.
In Young’s apparatus, for an on-axis input point the field at a
displacement x in the observation plane is

\[ E = E_1 e^{ikdx/2z} + E_2 e^{-ikdx/2z} . \tag{8.26} \]

For an input point displaced by a distance xs the two terms pick up an
additional phase such that

\[ E = E_1 e^{ikdx/2z} e^{ikdx_s/2z_s} + E_2 e^{-ikdx/2z} e^{-ikdx_s/2z_s} , \tag{8.27} \]

and the modulus-squared is

\[ EE^{*} = E_1^2 + E_2^2 + 2E_1E_2 \cos\left(\frac{kdx}{z} + \frac{kdx_s}{z_s}\right) . \tag{8.28} \]

If E2 = E1, then we can write the intensity in the plane z as

\[ I_s = 2\bar{I}_s\left[1 + \cos\left(\frac{kdx}{z} + \frac{kdx_s}{z_s}\right)\right] = 4\bar{I}_s \cos^2\left[\frac{kd}{2}\left(\frac{x}{z} + \frac{x_s}{z_s}\right)\right] , \tag{8.29} \]

where Īs is the intensity in the observation plane if one slit is blocked.
Hence, the result of the displaced input is simply to translate the
interference pattern by a geometrical factor (z/zs)xs.

Fig. 8.7 Young’s interferometer, consisting of a source plane at z = −zs,
a double-slit plane at z = 0, and the observation plane at z. In this example,
the incident light is assumed to be a plane wave propagating at an angle θs
relative to the z axis. In the paraxial limit, we can write that θs = xs/zs,
where xs is a transverse displacement in the source plane. The effect of the
displaced source point is to translate the interference pattern by a distance
(xs/zs)z in the observation plane.
The next step is to sum over the source coordinate, xs . As each
component originates from a different point on a distant source, e.g. the
Sun, we can assume that we can sum each component incoherently, i.e.
add their intensity contributions rather than their amplitudes. Hence for
an extended input, the intensity distribution is given by the integral of
eqn (8.29) over the source coordinate xs with limits ±as /2. The result of
this integral for different values of as is shown in Fig. 8.8 (right column).
As more displaced waves are added, the visibility of the interference
fringes decreases towards zero; then the fringes partially reappear, but
with a π phase difference. The reason for this sign reversal is that there
are now more waves in the sum with their central maximum displaced
by half a spatial period. Next, we derive an analytical expression for the
visibility of the fringe pattern seen in Fig. 8.8.
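The incoherent sum over source points can also be carried out numerically. The following sketch integrates eqn (8.29) over xs for a few source widths and measures the fringe visibility; all numerical values (wavelength, distances, slit separation) are assumed for illustration, and the comparison with a sinc function is the result one obtains by doing the same integral analytically.

```python
import numpy as np

wavelength = 550e-9   # m (assumed)
d = 1.0e-3            # slit separation, m (assumed)
z = 1.0               # slits to observation plane, m (assumed)
z_s = 1.0             # source plane to slits, m (assumed)
k = 2 * np.pi / wavelength

def fringe_pattern(x, a_s, n_source=400):
    """Incoherent average of eqn (8.29) over source points |x_s| <= a_s / 2.

    Source points are placed at the midpoints of n_source equal intervals.
    The result is in units of 2 * I_s-bar.
    """
    x_s = ((np.arange(n_source) + 0.5) / n_source - 0.5) * a_s
    phase = k * d * (x[:, None] / z + x_s[None, :] / z_s)
    return np.mean(1 + np.cos(phase), axis=1)

def visibility(intensity):
    """Fringe visibility (I_max - I_min) / (I_max + I_min)."""
    return (intensity.max() - intensity.min()) / (intensity.max() + intensity.min())

x = np.linspace(-2e-3, 2e-3, 2001)
a_zero = wavelength * z_s / d      # source width at which the fringes first vanish

for fraction in (0.1, 0.5, 1.0, 1.5):
    a_s = fraction * a_zero
    v_numeric = visibility(fringe_pattern(x, a_s))
    # np.sinc(t) is sin(pi t) / (pi t)
    v_sinc = abs(np.sinc(a_s * d / (wavelength * z_s)))
    print(f"a_s = {a_s * 1e3:.3f} mm  V_numeric = {v_numeric:.3f}  |sinc| = {v_sinc:.3f}")
```

The visibility drops to zero when the source width reaches (λ/d)zs and partially recovers for wider sources, matching the collapse and revival described above.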
There is a simple explanation for why the fringe visibility goes to zero
when as = (λ/d)zs . Recall that for a rectangular aperture of width as ,
the sinc diffraction pattern has first zeros at an angle of λ/as . Therefore
the transverse size of the spot on the screen with the double slits is
zs λ/as . If the slit is narrower than (λ/d)zs the diffraction pattern covers
both slits (of separation d). We can say that the transverse coherence
length is longer than the separation of the slits, hence we get clear
fringes. By contrast, for a wider slit there is a narrower diffraction
pattern, and light illuminating one of the slits is not coherent with the
light illuminating the other.[31] Figure 8.10 highlights the importance of
the width of the first slit in the formation of clear interference fringes
in Young’s experiment. Note the trade-off between the amount of
light incident on the screen containing the double slits (a wider initial
aperture transmits more light) and the coherence of the illumination (a
narrower initial aperture illuminates the double slits more coherently).

[31] In a photon picture one can ask whether it is possible to work out which
path a photon takes from the source to the screen. If one can, there is
no interference. The condition on the width of the first slit is exactly to
ensure that no ‘which path’ information is available, see Section 9.8.
distribution over the source. Note that an incoherent set of emitters gives
rise, in general, to a partially coherent field. Specifically, propagation
and diffraction of light can improve the degree of coherence of the field.
That is why we emphasized earlier that coherence is a property of the
field, not the source: Young found a way to illuminate coherently two
holes using sunlight by placing a sufficiently small aperture in front of
them; thus the question ‘is the Sun a coherent source?’ is not useful.
where the double integral is over all coordinates, x1 and x2, in the z = 0
plane. The equation shows that the spatial coherence characteristics in
the two planes are related by a Fourier transform. The importance of the
Fresnel factor, e^{ik(x2² − x1²)/2z}, which characterizes the size of the Fresnel
zones, as illustrated in Fig. 8.12, decreases as z increases, because the
Fresnel zones become larger and the zones associated with input points
x1 and x2 have a larger overlap. If the field in the first plane has a
very short transverse coherence length, it is possible to derive a simpler
result: that the transverse correlation function in the observation plane
varies with position in exactly the same way as the field amplitude in a
coherent diffraction experiment, with the aperture function replaced by
the intensity distribution of the source:

\[ \langle E^{*}(0)\,E(x) \rangle = \frac{2}{c\epsilon_0}\, L_c\, \frac{e^{ikx^2/2z}}{\lambda z} \int_{-\infty}^{\infty} I(x_2)\, e^{-ikxx_2/z}\, \mathrm{d}x_2 , \tag{8.35} \]

where we have written I(x2) = ½ cε0 E0² |f(x2)|² and Lc is the coherence
length in the input plane.

Fig. 8.12 The propagation of coherence: the correlation (spatial coherence)
between the fields at (0, z) and (x, z) is given by the Fourier transform
of the overlap between Fresnel zones centred around points (x1, z) and
(x2, z) in the input plane. The spatial coherence grows as the field propagates,
i.e. a field that is incoherent at z = 0 can become more coherent in the plane
z, as in Young’s two-hole experiment with sunlight.

As a consequence, light from a source composed of incoherent emitters
(such as a star) with characteristic dimension D at a distance z downstream
will have a transverse coherence length of ∼ λz/D. For the case of a uniform
circular source of diameter D the spatial coherence function first goes to
zero at a value of 1.22λz/D.
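As a rough numerical illustration of these two estimates, the sketch below applies them to filtered sunlight. The solar diameter and the Earth–Sun distance used here are approximate values assumed for the example, not numbers taken from this section.

```python
# Transverse coherence length of sunlight at Earth (order-of-magnitude sketch).
wavelength = 550e-9            # m, centre of the visible spectrum
sun_diameter = 1.39e9          # m (assumed approximate value)
earth_sun_distance = 1.496e11  # m (assumed approximate value)

# Order-of-magnitude estimate, lambda * z / D
estimate = wavelength * earth_sun_distance / sun_diameter

# First zero of the coherence function for a uniform circular source
first_zero = 1.22 * estimate

print(f"lambda z / D      = {estimate * 1e6:.0f} micrometres")
print(f"1.22 lambda z / D = {first_zero * 1e6:.0f} micrometres")
```

Both numbers come out at a few tens of micrometres, which is why an interferometer needs closely spaced apertures, or a small upstream aperture as Young used, to show fringes with direct sunlight.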
Extending the analysis of transverse coherence to two transverse
dimensions we arrive at the concept of a coherence area. Our
example above used Young’s apparatus, concentrating on the form of
the interference fringes along one dimension. We found it useful to
compare the transverse coherence length of the light arriving at the two
slits relative to the slit separation. Of course, the light illuminating the
slits has a two-dimensional distribution. Using the van Cittert–Zernike
theorem, it is possible to show,[33] for a monochromatic uniformly bright

[33] See Goodman (1985) for details.

\[ A_c = \frac{(\lambda z)^2}{A_s} = \frac{\lambda^2}{\Omega_s} , \tag{8.36} \]
where Ωs is the solid angle subtended by the source at the centre of
the observation region. This result encapsulates the earlier discussion
that the coherence properties of a wave field improve with propagation,
i.e. it is easier to observe interference fringes if the light illuminates an
interferometer.
Chapter summary
Exercises
(8.1) Temporal coherence
If the spectral width of a source is quoted as the wavelength range Δλ,
show that the coherence time is Δt = λc²/(cΔλ), where λc is the central
wavelength.

(8.2) Coherence of sunlight
Assuming a filter is used to select a narrow band of colours around a mean
wavelength of 550 nm, and given the diameter of the Sun, and the Earth–Sun
separation in the text, calculate the transverse coherence length of
sunlight at Earth.

(8.3) Single emitters
Discuss the coherence of light emitted by a single atom. How would you test
the coherence properties experimentally?

(8.4) Visibility of fringes and γ(τ)
Repeat the analysis of the form of the interference fringes in an
amplitude-splitting interferometer, but with different amplitudes for the
two waves. In this case, show that the visibility is given by

\[ V = \frac{2\,(I_1 I_2)^{1/2}}{I_1 + I_2}\,|\gamma(\tau)| , \]

where I1 and I2 are, respectively, the intensities that waves 1 and 2 alone
would produce. Show that this reduces to the value quoted in the text in the
special case of equal intensities.

(8.5) Autocorrelation function for a monochromatic wave
For a monochromatic wave with angular frequency ω0, show that
γ(τ) = e^{−iω0τ}. Hence show that the magnitude of the first-order
correlation is always equal to one; i.e. the light is perfectly coherent.

(8.6) Autocorrelation function for different light sources
Plot the three forms of the autocorrelation function given in Table 8.2 as
a function of the variable τ/τc. Comment on similarities and differences
among the curves.

(8.7) Properties of the autocorrelation function
Show that the autocorrelation function, Γ(τ), is:
(i) a maximum value at zero delay.
(ii) a Hermitian symmetric function of τ; i.e. Γ*(τ) = Γ(−τ).
(iii) periodic if E is periodic, with the same period.
(iv) in the limit of large τ, Γ(τ → ∞) = ⟨E*⟩⟨E⟩.
Use the result of part (iv) to comment on the value of the autocorrelation
function of the electric field in the limit of large τ.

(8.8) Power-equivalent width of autocorrelation functions
Verify that the form of the autocorrelation functions in Table 8.2 for
(i) Lorentzian and (ii) gaussian chaotic light are consistent with the
power-equivalent width as defined in eqn (8.15).

(8.9) The Wiener–Khinchin–Einstein theorem
Use the results of the Fourier toolkit (especially the derivation of the
convolution theorem) to fill in all of the details skipped in the text to
prove the Wiener–Khinchin–Einstein theorem, i.e. that the autocorrelation
function of a stationary random process and the power spectrum of the
process form a Fourier transform pair.

(8.10) Normalized power spectral density
Using results from the Fourier toolkit, show that the normalized power
spectral density and the normalized autocorrelation function form a
Fourier transform pair.

(8.11) Lorentzian lineshape: tutorial
The wave form shown in Fig. 8.3 is an example of a stationary random
function, where the statistical properties do not change over time.
Although an analytic form for the Fourier transform of such functions does
not exist, we can still calculate it numerically and compare the result to
analytical expressions for the power spectrum. If we define the
autocorrelation as

\[ \gamma(t) = \langle f^{*}(t')\,f(t'+t) \rangle = \lim_{T\to\infty} \frac{1}{T} \int_0^{T} f^{*}(t')\,f(t'+t)\,\mathrm{d}t' , \]

and the power spectrum as

\[ S(\omega) = \lim_{T\to\infty} \frac{1}{T}\, F^{*}(\omega)\,F(\omega) , \]

then using Parseval’s theorem, see Appendix B, eqn (B.25), and the
Wiener–Khinchin–Einstein theorem, show that

\[ \int_0^{\infty} S(\omega)\,\mathrm{d}\omega = 2\pi \langle |f(t)|^2 \rangle . \]

If P(φ, t) is the probability of the phase jumping by an amount between φ
and φ + dφ in a time t, then

\[ \gamma(t) = e^{-i\omega_c t} \int_0^{2\pi} e^{-i\varphi}\, P(\varphi, t)\,\mathrm{d}\varphi . \]

If the average time between phase jumps is τ then we can write that

\[ P(\varphi, t) = e^{-|t|/\tau}\,\delta(\varphi) + \frac{1}{2\pi}\left(1 - e^{-|t|/\tau}\right) , \]

where the first term is the probability that no jump has occurred and the
second term is to normalize. Using these expressions, show that the
autocorrelation function is

\[ \gamma(t) = e^{-i\omega_c t}\, e^{-|t|/\tau} . \]

Now use the Wiener–Khinchin–Einstein theorem to show that the normalized
power spectrum is

\[ \tilde{S}(\omega) = \frac{1/\tau^2}{1/\tau^2 + (\omega - \omega_c)^2} . \tag{8.37} \]

Write a simulation of a harmonic wave with random phase resets as in
Fig. 8.3. Calculate the Fourier transform numerically and fit the power
spectrum with the analytical Lorentzian lineshape, eqn (8.37).
Write an expression for the probability of detecting a photon with angular
frequency between ω and ω + dω over a measurement time T.

(8.12) Fourier transform spectrometry: qualitative
Sketch the interferograms associated with a source that emits:
(i) two colours with equal amplitude, and
(ii) a continuous spectrum with a width equal to one-tenth of the central
frequency.
(8.13) Fourier transform spectrometry: quantitative
Annotate sketches of interferograms for these specific cases:
(i) illumination with a sodium lamp emitting one line at 589.0 nm which is
twice as intense as another line at 589.6 nm. Both lines have a gaussian
profile for their power spectral density with a full width at half maximum
(FWHM) of 2 GHz.
(ii) illumination with a helium–neon laser with a wavelength 632.8 nm with
a gaussian profile for the power spectral density with a FWHM of 1.5 GHz.

(8.14) Young’s two-hole experiment
Thomas Young placed a single hole upstream of a screen containing two holes
in order to limit the effective spatial extent of the source. The distance
between the single hole and the two holes was equal to the distance between
the two holes and the observation plane, zs = z = 1.0 m, and the slit
separation was d = 1.0 mm.
(i) Estimate the size of the hole needed to observe interference fringes
for a central wavelength of 0.55 μm.
[Hint: take the upper limit on the hole size as the diameter where the
fringes disappear, and remember to include a factor of 1.22 for circular
apertures.]
(ii) What is the spacing between the fringes?
(iii) If the visible spectrum is between 445 and 625 nm, estimate how many
fringes are visible.

(8.15) Spatial coherence and Young’s double slits (1)
In the text we derived the width of the slit for which the visibility of
the fringes first goes to zero, as = (λ/d)zs. What is the relationship
between the fringes from waves originating at positions xs and xs + as/2?
Hence explain why the visibility is zero for this value of as.

(8.16) Spatial coherence and Young’s double slits (2)
Consider a Young’s interferometer where the first slit has a fixed width
as, but the separation d between the pair of holes in the second screen is
variable. Discuss what happens to the visibility of the fringes as a
function of d.

(8.17) Spatial coherence and Young’s double slits (3)
Figure 8.14 shows the intensity pattern in a Young’s double-slit experiment
as the entrance slit width as is varied. Sketch how the visibility varies
as a function of as, indicating the position as = (λ/d)zs.

(8.18) Temporal coherence and Young’s double slits (1)
Figure 8.15 shows the intensity pattern in a Young’s double-slit experiment
where the temporal coherence rather than spatial coherence determines the
fringe visibility. Show using a similar analysis to Section 8.9 that the
intensity is given by

\[ I = 2I_1\left[1 + |\gamma| \cos\left(\frac{k_c\, d\, x}{z}\right)\right] , \]

where kc = 2π/λc, νc = c/λc is the central frequency of the input light,
and γ = Γ(ν)/Γ(0), where

\[ \Gamma(\nu) = \int_{-\infty}^{\infty} F(\nu)\, e^{i2\pi\nu dx/(cz)}\, \mathrm{d}\nu . \]

Derive an expression for the intensity pattern if instead of the gaussian
spectrum of Fig. 8.15 an interference filter is placed at the input, and
the frequency spectrum can be described by the function
F(ν) = rect[(ν − νc)/Δν]. The eighth fringe at x = 8(λc/d)z is found to be
suppressed completely. What is the relationship between Δν and νc = c/λc?

(8.19) Temporal coherence and Young’s double slits (2)
Use the analysis of Exercise 8.18 to show that the approximate number of
fringes observed, e.g. the number to first zero of the envelope function,
is independent of the dimensions of the apparatus. If we can approximate
visible sunlight as a rectangular function with Δν/νc = 1/3, estimate how
many fringes Young might have been able to observe.

(8.20) Temporal coherence and Young’s double slits (3)
Figure 8.16 shows the intensity pattern in the xz plane for a Young’s
double-slit experiment using white light. Sketch the far-field intensity
distributions along the x axis for
(i) red light only,
(ii) blue light only, and
(iii) all frequency components.
Comment on the near-field distributions for each case.

(8.21) Michelson’s stellar interferometer
Using the parameters given in the text, calculate:
(i) the transverse coherence length, and
(ii) the coherence area of the light from Betelgeuse at Earth.
\[ \Lambda = \frac{\lambda}{2\sin\alpha} , \]
where tan α = D/(2f). Any spatial frequencies larger than 1/Λ will miss
the edge of the lens so Λ sets the resolution limit. The Abbe diffraction
limit is often written in terms of the numerical aperture (NA) of
the lens. We can write that the smallest resolvable detail has a spatial
extent,

\[ \Delta x = \frac{\lambda}{2\,\mathrm{NA}} , \]

where NA = n sin α, if we include the possibility that the microscope
may be operated with a liquid with refractive index n.[1] Abbe’s result,
Δx ≈ fλ/D, agrees approximately with the Fraunhofer or Fourier result,
Δx = 1.22fλ/D, see Fig. 5.12. Abbe’s contribution also illustrates a
fruitful synergy between industry and academia (and between theory
and experiment).

[1] In the small-angle approximation, the numerical aperture becomes

\[ \mathrm{NA} = n \sin\left[\tan^{-1}\left(\frac{D}{2f}\right)\right] \approx n\,\frac{D}{2f} . \]
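The three resolution estimates quoted above are easy to compare numerically. This is a small sketch with assumed lens parameters; the function name `abbe_limit` and the values of the wavelength, D and f are illustrative only.

```python
import math

def abbe_limit(wavelength, diameter, focal_length, n=1.0):
    """Smallest resolvable detail, lambda / (2 NA), with NA = n sin(alpha)
    and tan(alpha) = D / (2 f)."""
    alpha = math.atan(diameter / (2 * focal_length))
    numerical_aperture = n * math.sin(alpha)
    return wavelength / (2 * numerical_aperture)

# Assumed example values: green light, 5 mm lens diameter, 50 mm focal length.
wavelength, D, f = 550e-9, 5e-3, 50e-3

abbe = abbe_limit(wavelength, D, f)
small_angle = f * wavelength / D         # uses NA ~ n D / (2 f)
fraunhofer = 1.22 * f * wavelength / D   # Fraunhofer (Airy pattern) result

print(f"Abbe, lambda/(2 NA):   {abbe * 1e6:.3f} micrometres")
print(f"small-angle f lambda/D: {small_angle * 1e6:.3f} micrometres")
print(f"1.22 f lambda/D:        {fraunhofer * 1e6:.3f} micrometres")
```

For a lens this slow the exact and small-angle forms agree to about a tenth of a per cent, and both differ from the Fraunhofer result only by the factor 1.22.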
where
the optical axis. Such a scenario arises, for example, in astronomy when
imaging distant objects. The incident field in the z = 0 plane is given
by E0 fi(x′, y′), where

\[ f_i(x', y') = e^{ik\Delta\theta x'} = e^{i2\pi(\Delta\theta/\lambda)x'} . \]
intensity are said to be ‘just resolved’ when the maximum of one sits on
the first minimum of the other. This gives f Δθ = 1.22f λ/D and hence
an angular resolution limit,
\[ \Delta\theta_R = 1.22\,\frac{\lambda}{D} . \tag{9.6} \]
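As a quick arithmetic illustration of eqn (9.6), the sketch below evaluates the angular resolution limit for an assumed aperture of 2.4 m at a wavelength of 0.55 μm; both the function name and the aperture size are choices made for this example.

```python
import math

def rayleigh_limit(wavelength, diameter):
    """Angular resolution limit of eqn (9.6), in radians."""
    return 1.22 * wavelength / diameter

# Assumed example: a 2.4 m aperture observing at 0.55 micrometres.
theta = rayleigh_limit(0.55e-6, 2.4)
arcseconds = math.degrees(theta) * 3600

print(f"{theta:.2e} rad = {arcseconds:.3f} arcsec")
```

Doubling the aperture diameter halves the limit, which is the usual argument for building larger telescopes when diffraction rather than the atmosphere sets the resolution.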
9.5 f to f
A lens performs a Fourier transform of the field incident on the lens;
however, the transform is not exact due to the effect of wave-front
curvature. The electric field at a position (x, y) in the focal plane is
given by the Fresnel diffraction integral, eqn (6.35),

\[ E^{(f)} = \frac{e^{ikf}\, e^{ik\rho^2/2f}}{i\lambda f}\, \mathcal{F}\left[E^{(0)}\right](u, v) , \tag{9.7} \]

e^{ikρ²/2f}, which tells us that the wave fronts in the focal plane are curved.
Although this curvature is not apparent if we are only interested in
intensity, it does have a dramatic effect on the propagation around the
focus. This effect is illustrated in Fig. 9.7 (top image). The input field in
the z = 0 plane is a rect function with uniform phase. The signature of
the wave-front curvature term from eqn (9.7) is that the field upstream
and downstream of the focal plane is not the same.
Now we consider the effect of moving the input plane back to z =
−f , lower image in Fig. 9.7. We still observe an Airy pattern in the
focal plane, but the intensity pattern upstream and downstream is very
different to the upper image. Note how the intensity pattern upstream
and downstream of the focal plane is symmetric for this case. Now we
show mathematically that the effect of moving the input plane back to
z = −f is to cancel the wave-front curvature in the focal plane. If the
input plane is moved upstream a distance f , as in Fig. 9.7, then each
plane-wave component arriving in the focal plane has travelled an extra
distance f . This means that each plane-wave component characterized
by the angular spectrum amplitude A is multiplied by an additional
propagation phase, e^{ik_z f}. Using the paraxial expansion of kz in terms
of kx and ky, eqn (6.32) from Section 6.5, we obtain

\[ e^{ik_z f} = e^{ikf}\, e^{-ik_\rho^2 f/2k} = e^{ikf}\, e^{-ik\rho^2/2f} , \tag{9.8} \]

where we have used k_ρ = kρ/f. Hence in eqn (9.7) we replace A^(0) by
A^(−f) e^{ik_z f} which gives

\[ E^{(f)} = \frac{e^{ikf}\, e^{ik\rho^2/2f}}{i\lambda f}\, A^{(-f)}\, e^{ikf}\, e^{-ik\rho^2/2f} = \frac{e^{i2kf}}{i\lambda f}\, \mathcal{F}\left[E^{(-f)}\right] . \tag{9.9} \]

Fig. 9.7 The effect of moving the input plane from the lens plane z = 0
(upper image) to z = −f upstream (lower image). Moving the input plane
back to z = −f cancels the wave-front curvature (quadratic phase factor) in
the focal plane such that the field upstream and downstream of the focus
is symmetric (lower image).
This is a remarkable result as it shows that by moving the input plane
upstream we can cancel the wave-front curvature in the focal plane and
(apart from a constant prefactor) the Fourier transform relationship
between the input and output fields is exact! We refer to this case as an
optical Fourier transform. There are numerous applications where
we care about the phase of the field, and the cancellation of the wave-
front curvature is important.
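The cancellation in eqn (9.9) can be checked numerically. The sketch below, with an assumed wavelength and focal length, multiplies the quadratic phase factor of eqn (9.7) by the paraxial propagation phase of eqn (9.8) and also estimates the small phase left over if the exact kz is used instead of its paraxial expansion.

```python
import numpy as np

wavelength = 633e-9      # m (assumed)
f = 0.1                  # focal length, m (assumed)
k = 2 * np.pi / wavelength

rho = np.linspace(0.0, 1e-3, 201)   # radial position in the focal plane, m
k_rho = k * rho / f                 # mapping between k_rho and rho used in the text

curvature = np.exp(1j * k * rho**2 / (2 * f))   # quadratic phase factor in eqn (9.7)
extra = np.exp(-1j * k_rho**2 * f / (2 * k))    # rho-dependent part of exp(i k_z f), eqn (9.8)
product = curvature * extra                     # equals 1 everywhere if the curvature cancels

# Phase left over when the exact k_z replaces the paraxial expansion.
k_z = np.sqrt(k**2 - k_rho**2)
residual = (k_z - (k - k_rho**2 / (2 * k))) * f

print(np.allclose(product, 1.0))
print(np.abs(residual).max())
```

Within the paraxial approximation the product is identically one, so only the constant phase remains; the residual from the exact kz grows as the fourth power of ρ/f and stays at the milliradian level for these assumed values.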
There is another significant change in moving the input plane
upstream. Light can now diffract between the input plane and the first
lens, which in practice has a finite diameter D. If the light distribution
in the input plane contains features that are strongly localized, with a
transverse size less than 1.22f λ/D, then some of the input light spreads
out sufficiently fast so as to miss the first lens. The finite size of the
lens, or correspondingly if the lens is apodized, the so-called entrance
pupil size, sets an upper limit on the spatial frequencies accepted by
the optical system. Next, we consider moving the input plane in the
example of Young’s double slit where the effect of changing the wave-
front curvature in the focal plane is dramatic.
Example 9.1
Young’s double slit: An example of changing the lens position in a Young’s double-
slit experiment is shown in Fig. 9.8. Consider a ‘one-dimensional’ scenario where
the field is uniform in the y direction. An opaque screen with two slits with width a
and separation d is placed either in the z = 0 plane or at z = −f , and illuminated
by uniform monochromatic light with wavelength λ. The light passes through a lens
with focal length f in the z = 0 plane. A plot of the intensity pattern in the xz
plane for both cases—calculated using the angular spectrum method—is shown in
Fig. 9.8. Here we are interested in finding an analytical expression for the intensity
distribution in the focal plane of the lens at z = f . For the one-dimensional Fourier
transform, eqn (9.9) has the form
\[ E^{(f)} = \frac{e^{i2kf}}{\sqrt{i\lambda f}}\, \mathcal{F}\left[E^{(-f)}\right] . \tag{9.10} \]

The input field along the x′ axis is E^(−f) = E0 f(x′), where the aperture function is
given by

\[ f(x') = X_d^{(2)}(x') * \mathrm{rect}\left(\frac{x'}{a}\right) , \tag{9.11} \]

and the field in the Fourier plane is proportional to the one-dimensional Fourier
transform,

\[ F(u) = \mathcal{F}\left[f(x')\right] = a\,\mathrm{sinc}(\pi u a)\cos(\pi u d) . \tag{9.12} \]

Substituting into eqn (9.10) and using u = x/(λf), we find

\[ E^{(f)} = E_0\, \frac{e^{i2kf}}{\sqrt{i\lambda f}}\, a\,\mathrm{sinc}\left(\frac{\pi a x}{\lambda f}\right) \cos\left(\frac{\pi d x}{\lambda f}\right) . \tag{9.13} \]

The intensity distribution, proportional to the modulus-squared, is the same as if
the double slit is in the lens plane, but the absence of the e^{ikρ²/2f} term means that
the wave fronts are planar rather than curved, as is apparent in Fig. 9.8.

Fig. 9.8 The effect of moving the input plane from the lens plane, z = 0
(upper image), to z = −f upstream (lower image). Although the intensity
distribution in the focal plane at z = f remains the same, the signature of
wave-front curvature is clearly manifest by the changes in the fringe pattern
upstream and downstream of the focal plane.

As in previous chapters we should remember the distinction between Fraunhofer
diffraction, where the Fourier transform relationship holds for any far-field
propagation distance z, and the case of a lens, where the Fresnel quadratic phase
terms are still important, and the Fourier relationship only holds for a particular
plane (the focal plane in this example).
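The focal-plane intensity that follows from eqn (9.13) can be evaluated directly. The sketch below uses assumed values for the wavelength, focal length, slit width and slit separation, and normalizes the intensity to one on axis; note that NumPy defines `np.sinc(t)` as sin(πt)/(πt), so the sinc(πax/λf) of the text corresponds to `np.sinc(a * x / (wavelength * f))`.

```python
import numpy as np

wavelength = 633e-9   # m (assumed)
f = 0.5               # focal length, m (assumed)
a = 50e-6             # slit width, m (assumed)
d = 250e-6            # slit separation, m (assumed)

def focal_plane_intensity(x):
    """Modulus-squared of eqn (9.13), normalized to 1 at x = 0."""
    u = x / (wavelength * f)    # Fourier variable u = x / (lambda f)
    return (np.sinc(a * u) * np.cos(np.pi * d * u)) ** 2

fringe_spacing = wavelength * f / d   # period of the cos^2 fringes
envelope_zero = wavelength * f / a    # first zero of the single-slit envelope

x = np.linspace(-2 * envelope_zero, 2 * envelope_zero, 2001)
intensity = focal_plane_intensity(x)

print(f"fringe spacing = {fringe_spacing * 1e3:.3f} mm")
print(f"envelope zero  = {envelope_zero * 1e3:.3f} mm")
```

With these assumed dimensions there are d/a = 5 fringe periods between the centre and the first zero of the sinc envelope.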
plane is described by aperture function f(x, y), i.e. E = E0 f(x, y), then
the field distribution in the Fourier plane is
\[ g(x, y) = \frac{e^{i2kf}}{i\lambda f}\, \mathcal{F}\left[f(x, y)\right] = \frac{e^{i2kf}}{i\lambda f}\, F(u, v) , \tag{9.14} \]

where we use the mapping between the Fourier variables and the real
space, u = x/fλ and v = y/fλ, to obtain the real space distribution in
the Fourier plane. The field in the output plane is given by a Fourier
transform of the field in the Fourier plane,

\[ h(x, y) = \frac{e^{i2kf}}{i\lambda f}\, \mathcal{F}\left[g(x, y)\right] = -\frac{e^{i4kf}}{(\lambda f)^2}\, \mathcal{F}\left[F(u, v)\right] . \tag{9.15} \]
To evaluate the Fourier transform of F(u, v) we try to make it look
the inversion of the image in the output plane is shown in Fig. 9.9.
Figure 9.10 illustrates the inverse scaling between the input/output
planes and the Fourier plane.
9.7 Magnification
Now we consider what happens if the lenses have different focal lengths
f1 and f2 . In this case the field distribution in the input plane is
magnified by a factor f2 /f1 , as follows from geometrical optics, see
Fig. 9.11. This is the basic principle of a microscope. A large
magnification is achieved by choosing a small f1 . The first lens is called
the objective because it is close to the object, and the second lens the
eyepiece. In a microscope, the object does not necessarily need to be
placed in the focal plane of the objective, but for convenience we consider
the case where it is.

Fig. 9.10 A symmetrical two-lens imaging system, with an input plane at
z = −2f, first lens at z = −f, second lens at z = f, and output plane at
z = 2f. The Fourier plane is located midway between the lenses at z = 0.
The upper and lower images illustrate the inverse scaling between the input
plane and Fourier plane.

Fig. 9.11 A geometrical optics schematic of a two-lens system with a
magnification of two.

To see how the magnification works in terms of Fourier transforms we
repeat the analysis in Section 9.6, except with f1 for the first lens and f2
for the second. We shall find that the change of Fourier variables leads
directly to a rescaling of the image. The field in the Fourier plane is

\[ g(x, y) = \frac{e^{i2kf_1}}{i\lambda f_1}\, \mathcal{F}\left[f(x, y)\right](u_1, v_1) = \frac{e^{i2kf_1}}{i\lambda f_1}\, F(u_1, v_1) , \tag{9.19} \]

where u1 = x/f1λ and v1 = y/f1λ. The field at the output is

\[ h(x, y) = \frac{e^{i2kf_2}}{i\lambda f_2}\, \mathcal{F}\left[g(x, y)\right](u_2, v_2) = -\frac{e^{i2k(f_1+f_2)}}{\lambda^2 f_1 f_2}\, \mathcal{F}\left[F(u_1, v_1)\right](u_2, v_2) , \tag{9.20} \]

where u2 = x/f2λ and v2 = y/f2λ. We do the Fourier transform by
rewriting it as an inverse transform but in terms of u2 and v2, which
introduces a scaling factor,

\[ \begin{aligned}
\mathcal{F}\left[F(u_1, v_1)\right] &= \mathcal{F}\left[F(x/f_1\lambda,\; y/f_1\lambda)\right] , \\
&= \mathcal{F}\left\{F\left[(f_2/f_1)u_2,\; (f_2/f_1)v_2\right]\right\} , \\
&= (f_2\lambda)^2 \iint_{-\infty}^{\infty} F\left(\frac{f_2}{f_1}u_2,\; \frac{f_2}{f_1}v_2\right) e^{-i2\pi(u_2 x + v_2 y)}\, \mathrm{d}u_2\, \mathrm{d}v_2 , \\
&= (f_1\lambda)^2\, f\left(-\frac{x}{f_2/f_1},\; -\frac{y}{f_2/f_1}\right) .
\end{aligned} \tag{9.21} \]

So the output field is

\[ h(x, y) = -\frac{1}{f_2/f_1}\, f\left(-\frac{x}{f_2/f_1},\; -\frac{y}{f_2/f_1}\right) e^{i2k(f_1+f_2)} . \tag{9.22} \]

Fig. 9.12 A two-lens system with a magnification of two. The example
shown illustrates the magnification of some intensity fringes.

For f1 < f2 the image is stretched by a scaling factor f2/f1 and as the
light is spread out over a large area the intensity is reduced by a factor
of (f2/f1)². In Fig. 9.12 we repeat the scenario of Fig. 9.11 except now
showing the intensity pattern as it propagates through the system.
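The scaling in eqn (9.22) can be illustrated with a short numerical sketch. The input below is a hypothetical off-centre gaussian spot, chosen so that the inversion is visible; the focal lengths, grid and function names are assumptions made for the example, and the constant phase factor is omitted because it does not affect the intensity.

```python
import numpy as np

f1, f2 = 0.1, 0.2     # focal lengths in m (assumed), giving f2 / f1 = 2
M = f2 / f1

def input_field(x, y):
    """Hypothetical input: a gaussian spot centred at x = 0.5 mm, y = 0."""
    return np.exp(-((x - 0.5e-3) ** 2 + y ** 2) / (0.2e-3) ** 2)

def output_field(x, y):
    """Eqn (9.22) without the constant phase factor exp[i 2k(f1 + f2)]."""
    return -(1 / M) * input_field(-x / M, -y / M)

grid = np.linspace(-4e-3, 4e-3, 801)
x, y = np.meshgrid(grid, grid)
cell = (grid[1] - grid[0]) ** 2

intensity_out = np.abs(output_field(x, y)) ** 2
power_in = np.sum(np.abs(input_field(x, y)) ** 2) * cell
power_out = np.sum(intensity_out) * cell
peak = np.unravel_index(np.argmax(intensity_out), x.shape)

print(f"power in / power out = {power_in / power_out:.6f}")
print(f"output peak at x = {x[peak] * 1e3:.2f} mm, peak intensity = {intensity_out[peak]:.3f}")
```

The spot reappears inverted at twice the distance from the axis with its peak intensity reduced by (f2/f1)², while the integrated power is unchanged, as expected for a lossless imaging system.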
9.8 Complementarity I
In this chapter we have explored light propagation through optical
systems and seen that sometimes light is localized at particular positions,
or along particular paths, while at other times it is delocalized and
may even interfere with itself. In quantum physics, we tend to think
of path as a particle-like property and interference as a wave-like
property, so what does this tell us about wave–particle duality?
One solution to the wave–particle duality paradox is the principle of
complementarity,[2] which states that one can observe either the wave-like
or particle-like properties but not both at the same time. For
example, in Young’s double-slit experiment we can observe either the
path (which slit the photons pass through) or the interference fringes,
but not both.

[2] Formulated by Niels Bohr (Copenhagen 1885–Copenhagen 1962).
To illustrate why, we can use the example of Young’s double-slit
experiment, but performed with atoms or electrons rather than photons,
see Adams et al. (1994). To gain path information we need to look
at which slit an atom has passed through, as illustrated in Fig. 9.13.
Looking means scattering light off the atoms and in order to resolve
the slits we need to scatter photons with a range of angles of order
λ/d, where d is the slit separation. To capture all these photons we
need a high numerical aperture lens, called a Heisenberg microscope
after Werner Heisenberg (Würzburg 1901–Munich 1976) who devised
this thought experiment. As a result of momentum conservation, the
scattered photons change the momentum, and hence the wave vector,
of the matter wave, see Fig. 9.13. As photons are emitted at random
angles, the effect of many scattering events is to introduce a range of
phase shifts, as in Fig. 8.8, which ‘washes out’ the interference pattern.
If the recoil in the x direction is $\Delta k_x$ then the shifted fringes become
$\cos[(k_x + \Delta k_x)d/2]$. Using the Fourier relationship between spatial
resolving power, $\Delta x$, and photon momentum component, $\Delta p_x = \hbar\Delta k_x$,
for $\Delta x < d$, $\Delta k_x > 2\pi/d$, and the shift is greater than $\pi$ leading to a
complete wash-out of the interference pattern. The effect of averaging
over a large enough range of $\Delta k_x$ to resolve the path ‘washes out’ the
interference fringes. In Chapter 10 we shall revisit this complementarity
concept using only photons, see Section 10.9.
Fig. 9.13 Schematic of the Heisenberg microscope. A Young’s double-slit
experiment is performed using atoms (or other particles such as electrons).
The microscope, consisting of a high numerical aperture lens (grey), is
designed to provide ‘which-path’ information by detecting scattered photons
with wave vectors k. However, a scattered photon deflects the particle,
causing the interference fringes to shift.
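The wash-out argument of Section 9.8 can be illustrated numerically. The sketch below is not taken from the text: it assumes idealized fringes of the form 1 + cos(φ + δ), where the extra phase δ (standing in for the recoil-induced shift) is spread uniformly over a chosen range, and it reports the visibility of the averaged pattern. The function name and the sampling parameters are illustrative choices.

```python
import numpy as np

def averaged_visibility(shift_range, n_shifts=2001, n_phase=721):
    """Visibility of fringes 1 + cos(phi + delta) after averaging over
    delta spread uniformly across [-shift_range/2, +shift_range/2]."""
    phi = np.linspace(0.0, 2.0 * np.pi, n_phase)            # position along the fringes
    delta = np.linspace(-shift_range / 2, shift_range / 2, n_shifts)
    # Average the shifted fringe patterns over all phase shifts
    pattern = (1.0 + np.cos(phi[:, None] + delta[None, :])).mean(axis=1)
    return (pattern.max() - pattern.min()) / (pattern.max() + pattern.min())

for shift_range in (0.0, np.pi / 2, np.pi, 2 * np.pi):
    v = averaged_visibility(shift_range)
    print(f"phase-shift range {shift_range:5.2f} rad: visibility {v:.3f}")
```

Under these assumptions the visibility falls as the range of phase shifts grows and essentially vanishes once the shifts span a full fringe period, which is the sense in which acquiring path information washes out the fringes.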
Chapter summary
Exercises
(9.1) Imaging
Write equations for (i) a spherical wave with origin
at z = f, and (ii) a paraxial spherical wave with
origin at z = f in the z = 0 plane. Comment on
whether a lens with focal length f in the z = 0
plane would cancel or double the transverse phase
dependence, and what the field would look like
upstream of the lens.
(9.2) Point-spread function
Write an expression for amplitude and intensity
point-spread functions for a single lens with
diameter D and focal length f.
(9.3) Point-spread function of Hubble Space Telescope
Given that the Hubble Space Telescope has a
2.4 m primary mirror, estimate the angular width
of the point-spread function, assuming that it is
limited by diffraction, for the centre of the optical
spectrum (λ ∼ 0.55 μm).
(9.4) Resolving power
Give an expression for the angular resolution
limit, $\Delta\theta_{\min}$, for light with wavelength λ, of
an instrument with entrance aperture size, D.
Describe, briefly, the practical limits to the
resolution of the instrument.
(9.5) Focusing laser beams
A laser beam with beam waist $w_0$ is incident
on a lens with focal length f and diameter D.
The effect of the finite size of the lens is to
clip the edges of the gaussian beam which can
be described using an aperture function, $f(\rho') =$
gauss(ρ /w0 )circ(ρ /D). The field in the focal (9.8) 2D Optical transforms
plane is proportional to the Fourier transform, The field incident on a lens with focal length f
2 2
F(kρ ) = F[f(x , y )](u, v). Write an expression for in the z = 0 plane is E (0) = 14 E0 e−ρ /w0 [3 +
the Fourier transform. Describe what happens in cos(2πx /d)], where w0 is the beam radius and
the two limit cases (i) w0 D and (ii) w0 < D. d = w0 /4 is the distance characterizing a fringe
Write expressions for the size of the focal spot pattern on the beam. What is the spacing between
in each case. Assuming that the total power of the intensity maxima in the x direction in units of
the laser is fixed such that the input intensity d? Write an expression for the field in the focal
is inversely proportional to w0 , comment on the plane. What are the positions of the maxima in
optimal ratio of w0 /D to maximize the on-axis the focal plane? What is the 1/e width of the
intensity in the focal plane. maxima in the focal plane? What is the ratio of
(9.6) Focal spot the spacing between the maxima to their width?
Write an equation for the intensity distribution What is the intensity ratio between the brightest
in the focal plane of a lens with diameter, D, and faintest maxima?
assuming that the field on the lens is characterized (9.9) Photography: depth-of-field
by an aperture function f(x , y ). Explain, briefly, In photography, the depth of field (depth of
why it is not possible to produce the intensity focus) relates to how far the object (image) plane
pattern can move without the image becoming blurred.
2 This maps directly onto the concept of Rayleigh
I0 π 2 D4 π(2D)ρ
I = 2 2 π(2D) gauss
(f ) 2
. distance, or Rayleigh range, where we ask how far
λ f λf can a light field propagate before its distribution
What is the input field distribution if the intensity changes substantially? So for a focused gaussian
in the focal plane is light distribution with beam radius wf , we can say
that the depth of focus or tolerance in the position
4 2 2
I0 π 2 D 1 β of the image plane is zR = πwf2 /λ. Show that the
I (f )
= 2 2 jinc (β) − jinc , depth of focus is proportional to the f-number
λ f 2 2 2
squared, where f-number is the ratio of the focal
where β = πDρ/(λf )? This relates to the topic of length f to the diameter of the entrance aperture,
apodization, discussed in the Chapter 10. D. Comment on how the result is modified for
(9.7) Convolution and ‘blurring’ of details depth of field.
Sketch the convolution of the one-dimensional (9.10) Two-lens system
function that represents a periodic array of Sketch a version of Fig. 9.14, labelling the key
rectangles of width a and separation d, with a planes along the propagation axis, with lines to
gaussian of width w. Illustrate three cases: (i) mark the spatial extent of the light field. Indicate
w a, (ii) w ∼ a, and (iii) w > d. Comment on the wave fronts before the first lens, after the
your results. second lens, and in the focal plane.
Fig. 9.14 Two examples of light propagating through a two-lens system, see Exercise 9.10.
10 Spatial filtering

Sometimes a strange light
shines, purer than the moon,
casting no shadow
R. S. Thomas (Caerdydd 1913–Pentrefelin 2000), Frequencies, 1978.

10.1 Introduction
10.2 Apodization
10.3 Spatial filtering
10.4 1D periodic
10.5 2D periodic
10.6 2D arbitrary objects
10.7 Convolution
10.8 Phase-contrast imaging
10.9 Complementarity II
Chapter summary
Exercises

10.1 Introduction
In this chapter we examine apodization (modifying the aperture
function in order to change the point-spread function) and spatial
filtering (modifying the transmission in the Fourier plane in order to
process an image or light distribution). Both of these concepts share
the feature that understanding Fourier optics enables the design of
optical devices with improved performance. Apodization exploits the
Fourier link between the lens plane and focal plane to suppress secondary
maxima in the point-spread function. We shall also analyse inverse
apodization, and explain how super resolution is achieved. Spatial
filtering exploits a two-lens system to perform Fourier analysis and
synthesis. By modifying the amplitude or phase of the diffraction
pattern in the Fourier plane we can re-engineer the image. By
modifying the phase—phase-contrast imaging—it is possible to
render transparent objects visible.
10.2 Apodization
Previously, in Chapter 9, our discussion of the imaging properties of
optical systems largely focused on only the central part of the diffraction
pattern. As encapsulated in the Rayleigh criterion, it is this central
part that determines the ability of the optical device to resolve two
equally bright point objects. The pattern of light surrounding the central
maximum is much fainter; but there are side lobes, or ‘wings’.¹
If we are interested in looking for a faint object near a bright object,
then the wings of each diffraction pattern diminish significantly our
ability to resolve detail. Making a narrower point-spread function is
unlikely to be possible, as many optical instruments such as telescopes
are already as large as they feasibly can be; therefore another technique
is necessary to modify the point-spread function.
¹ Also called the ‘feet’ of the pattern.
Optics f2f: From Fourier to Fresnel. Charles S. Adams and Ifan G. Hughes
© Charles S. Adams and Ifan G. Hughes 2019.
Published in 2019 by Oxford University Press. DOI: 10.1093/oso/9780198786788.001.0001
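Table 10.1 One-dimensional apodization functions apod(x) over the range $-a/2 \le x \le a/2$,
and the corresponding Fourier transforms (the amplitude point-spread function).

Function    apod(x)    Fourier transform
triangle    $1 - \dfrac{2|x|}{a}$    $\dfrac{a}{2}\,\mathrm{sinc}^2\!\left(\dfrac{\pi a u}{2}\right)$
cosine      $\cos\dfrac{\pi x}{a}$    $\dfrac{2a}{\pi}\,\dfrac{\cos \pi a u}{(1 - 4a^2u^2)}$
Hann        $\cos^2\dfrac{\pi x}{a}$    $\dfrac{a}{2}\,\dfrac{\mathrm{sinc}\,\pi a u}{(1 - a^2u^2)}$
Blackman    $\dfrac{21}{50} + \dfrac{1}{2}\cos\dfrac{2\pi x}{a} + \dfrac{2}{25}\cos\dfrac{4\pi x}{a}$    $\dfrac{a}{2}\,\dfrac{\left(\frac{21}{25} - \frac{9}{100}a^2u^2\right)\mathrm{sinc}\,\pi a u}{(1 - a^2u^2)(1 - a^2u^2/4)}$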
Example 10.1
Cosine apodization: As an example of apodization in one dimension we calculate
the point-spread function when the conventional uniform pupil function, rect(x/a), is
softened by the cosine function apod(x) = cos(πx/a). Therefore the modified pupil
transmission function is t(x) = apod(x)rect(x/a).
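$$\begin{aligned}
\mathrm{psf}(u) &= \mathcal{F}[t(x)], \\
&= \mathcal{F}\left[\mathrm{rect}\left(\frac{x}{a}\right) \times \cos\left(\frac{\pi x}{a}\right)\right], \\
&= \mathcal{F}\left[\mathrm{rect}\left(\frac{x}{a}\right)\right] \ast \mathcal{F}\left[\cos\left(\frac{\pi x}{a}\right)\right], \\
&= a\,\mathrm{sinc}(\pi a u) \ast \frac{1}{2}\left[\delta\left(u - \frac{1}{2a}\right) + \delta\left(u + \frac{1}{2a}\right)\right], \\
&= \frac{a}{2}\left\{\mathrm{sinc}\left[\pi\left(au - \frac{1}{2}\right)\right] + \mathrm{sinc}\left[\pi\left(au + \frac{1}{2}\right)\right]\right\}, \\
&= \frac{2a}{\pi}\,\frac{\cos \pi a u}{(1 - 4a^2u^2)},
\end{aligned}$$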
which is the result quoted in Table 10.1. The two sinc functions are displaced such
that the amplitudes in the wings have opposite signs and partially cancel, resulting
in the desired effect of greatly suppressed secondary maxima.
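The closed form of Example 10.1 can be checked against a direct numerical Fourier integral. The following is a minimal sketch, assuming the convention sinc(x) = sin(x)/x used in Table 10.1 and a simple midpoint-rule integration; the function names and the number of sample points are illustrative choices, not part of the text.

```python
import numpy as np

def psf_numeric(u, a=1.0, n=20001):
    """Fourier transform of cos(pi x / a) over -a/2 <= x <= a/2 (midpoint rule)."""
    dx = a / n
    x = -a / 2 + (np.arange(n) + 0.5) * dx
    t = np.cos(np.pi * x / a)                     # apodized pupil t(x)
    return np.sum(t * np.exp(-2j * np.pi * u * x)) * dx

def psf_cosine(u, a=1.0):
    """Closed form 2a cos(pi a u) / [pi (1 - 4 a^2 u^2)], away from u = +-1/(2a)."""
    return 2 * a * np.cos(np.pi * a * u) / (np.pi * (1 - 4 * a**2 * u**2))

def psf_uniform(u, a=1.0):
    """a sinc(pi a u) for the unapodized pupil; note np.sinc(x) = sin(pi x)/(pi x)."""
    return a * np.sinc(a * u)

for u in (0.0, 0.3, 1.2, 2.7):
    print(u, psf_numeric(u).real, psf_cosine(u))
```

Comparing `psf_cosine` and `psf_uniform` near their first secondary maxima (roughly u ≈ 1.9/a and u ≈ 1.4/a respectively) shows the suppression of the wings relative to the central peak.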
Example 10.2
Two-dimensional gaussian apodization of a circular aperture: We consider
a gaussian transmission filter that modifies the field in the plane of the lens such that
the field immediately downstream is described by the distribution function
$$f(x', y') = \mathrm{gauss}\left(\frac{\rho'}{w}\right)\mathrm{circ}\left(\frac{\rho'}{D}\right), \tag{10.1}$$
where $w < D/2$ for the filter to have any effect. Using the inverse convolution
theorem we find that the field in the focal plane is proportional to
$$F(\rho) = \frac{\pi D^2}{4}\,\mathrm{jinc}\left(\frac{\pi D \rho}{\lambda f}\right) \ast \pi w^2\,\mathrm{gauss}\left(\frac{\pi w \rho}{\lambda f}\right). \tag{10.2}$$
The convolution with a gaussian works as a smoothing function that slightly broadens
the central peak but also smoothes out the oscillations of the subsidiary maxima.
Detail of the field distribution in the focal plane is shown in Fig. 10.3. The field in
the xz plane is illustrated in Fig. 10.1. Note that in comparison with Fig. 9.2, the
focal spot is larger but the fringes around the central spot are strongly suppressed.
The field in the xz plane for this example is shown in Fig. 10.4(ii).
Fig. 10.3 Apodization: Intensity images of the focal ‘spot’ without (left)
and with (right) apodization. Notice how the wings are suppressed, allowing
faint objects to be imaged in the vicinity of the main spot.
Table 10.2 Comparison of the diffraction patterns obtained when the pupil of an image-
forming instrument is modified.
10.4 1D periodic
As a first example of spatial filtering, consider an input field that contains
a cosine-squared transverse modulation, $E^{(-2f)} = E_0 f(x)\,e^{-x^2/w_0^2}$,
where $f(x) = \cos^2 2\pi u_0 x$, $2u_0$ is the spatial frequency of the modulation,
and the beam size is larger than the modulation wavelength, $w_0 > 1/(2u_0)$.
Recalling that $\cos^2 2\pi u_0 x = (1 + \cos 4\pi u_0 x)/2$, we find
$\mathcal{F}[\cos^2 2\pi u_0 x] = [\delta(u) + \tfrac{1}{2}\delta(u - 2u_0) + \tfrac{1}{2}\delta(u + 2u_0)]/2$; it follows that
the Fourier transform has three contributions, with spatial frequencies
$u = 0$, and $u = \pm 2u_0$. For this object we expect to see three intense
spots in the Fourier plane, as evident in Fig. 10.7.
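The three contributions can be seen directly with a discrete Fourier transform. This is a one-dimensional sketch with illustrative values (beam radius 1, u0 = 4, a 4096-point grid), all of which are assumptions rather than values from the text.

```python
import numpy as np

w0 = 1.0            # beam radius (illustrative)
u0 = 4.0            # cos^2(2 pi u0 x) has spatial frequency 2*u0 (illustrative)
n, span = 4096, 32.0
dx = span / n
x = (np.arange(n) - n // 2) * dx

# Input field: cosine-squared modulation on a gaussian envelope (E0 = 1)
field = np.cos(2 * np.pi * u0 * x) ** 2 * np.exp(-x**2 / w0**2)

# Discrete approximation to the Fourier transform, with x = 0 at the origin
spectrum = np.fft.fftshift(np.fft.fft(np.fft.ifftshift(field))) * dx
u = np.fft.fftshift(np.fft.fftfreq(n, d=dx))

def amplitude_at(freq):
    """Magnitude of the spectrum at the grid point closest to freq."""
    return abs(spectrum[np.argmin(np.abs(u - freq))])

for freq in (-2 * u0, 0.0, 2 * u0):
    print(f"u = {freq:+.1f}: |spectrum| = {amplitude_at(freq):.4f}")
```

With these parameters the spectrum consists of three well-separated gaussian spots at u = 0 and u = ±2u0, the outer two having half the field amplitude of the central one.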
To perform spatial filtering, we can choose to block either the high
or the low spatial frequencies in the Fourier plane. Low-pass filtering
is seen in the upper panel of Fig. 10.7, where the mask in the Fourier
plane blocks the spatial frequencies u = ±2u0 . The image therefore only
contains the spatial frequency u = 0, and the second lens transforms
this into a uniform beam in the output plane, as expected. High-pass
filtering is achieved (lower panel in Fig. 10.7) by choosing a mask in the
Fourier plane that blocks the u = 0 component, but transmits the spatial
frequencies u = ±2u0 . The interesting result is that when we block the
zero frequency component the spatial frequency of the fringes observed
in the output plane is doubled. This follows because when we remove
the u = 0 component we see interference between the $\pm 2u_0$ components,
which has a spatial frequency of $4u_0$. (It is evident that the periodicity
of the bright lines in the plane z = 2f is twice as high as it is in the plane
z = −2f.) This is an example of false detail. Understanding false detail
and the resolution possible by transmitting different diffraction orders
was important historically in Abbe’s development of a theory for image
formation in a microscope. We are familiar with optical systems where
the image is lower resolution than the object, but here the converse can
occur.
Fig. 10.7 Low- and high-pass spatial filtering (upper and lower image,
respectively): In the upper image only the low spatial frequencies pass
through the Fourier plane at z = 0, leading to an output field without
fringes. In the lower image only the high spatial frequencies pass and only
fringes remain; note that the spatial period of the fringes is halved. Note
the similarity of the intensity pattern before the first lens to the Talbot
carpets of Section 5.11.
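The doubling of the fringe frequency under high-pass filtering can be reproduced in a few lines. The sketch below models the 4f filter as "Fourier transform, multiply by a mask, transform back" in one dimension; the image inversion of a real 4f system is ignored, and the parameter values and helper names are illustrative assumptions.

```python
import numpy as np

w0, u0 = 1.0, 4.0                        # illustrative values
n, span = 4096, 32.0
dx = span / n
x = (np.arange(n) - n // 2) * dx
u = np.fft.fftfreq(n, d=dx)

field_in = np.cos(2 * np.pi * u0 * x) ** 2 * np.exp(-x**2 / w0**2)

def filter_4f(field, mask):
    """Multiply the spectrum by a mask and transform back."""
    return np.fft.ifft(np.fft.fft(field) * mask)

def fringe_frequency(intensity):
    """Dominant non-zero spatial frequency of an intensity pattern."""
    spec = np.abs(np.fft.fft(intensity))
    spec[np.abs(u) < 0.5 * u0] = 0.0     # discard the smooth envelope near u = 0
    return abs(u[np.argmax(spec)])

high_pass = (np.abs(u) > u0).astype(float)   # blocks the u = 0 spot
low_pass = (np.abs(u) < u0).astype(float)    # blocks the u = +-2*u0 spots

out_high = np.abs(filter_4f(field_in, high_pass)) ** 2
out_low = np.abs(filter_4f(field_in, low_pass)) ** 2

print("input fringe frequency     :", fringe_frequency(np.abs(field_in) ** 2))
print("high-pass fringe frequency :", fringe_frequency(out_high))
```

In this model the input intensity is dominated by fringes at 2u0, the high-pass output shows fringes at 4u0, and the low-pass output is a smooth gaussian with no fringes.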
10.5 2D periodic
Example 10.3
Cleaning up the mode of a laser: To illustrate low-pass filtering we consider a
practical example of the removal of intensity irregularities from a gaussian laser beam.
Passing a gaussian beam through numerous optical components, which might have
dust on them, leads to fringes, lumps, and bumps in the intensity profile. These
correspond with plane-wave components with higher spatial frequency, therefore
by using a 4f spatial filter and blocking high spatial frequencies in the Fourier
plane we can remove them. We show how this works for one particular spatial
frequency component. By extension, the same principle can be applied to other
spatial frequencies. Consider a laser beam with some cosine fringes in the x direction,
see Fig. 10.9(i). The field in the input plane at z = −2f is $E^{(-2f)} = E_0 f(x', y')$, where
$$f(x', y') = e^{-(x'^2 + y'^2)/w_0^2}\left[1 + \epsilon\cos\left(\frac{2\pi x'}{d}\right)\right], \tag{10.5}$$
where $\epsilon$ and d are the amplitude and wavelength of the fringes. The spatial frequency
of the fringes is $u_0 = 1/d$. The ratio $w_0/d$ tells us the number of fringes within a
distance equal to the beam radius. As an example we shall take $w_0/d = 5$. The field
in the Fourier plane is proportional to the Fourier transform,
$$\mathcal{F}[f(x', y')](u, v) = [G(u, v) \ast H(u)](u, v),$$
where
$$G(u, v) = \pi w_0^2\, e^{-\pi^2 (u^2 + v^2) w_0^2},$$
$$H(u) = \delta(u) + \frac{\epsilon}{2}\,\delta\left(u + \frac{1}{d}\right) + \frac{\epsilon}{2}\,\delta\left(u - \frac{1}{d}\right).$$
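A one-dimensional numerical sketch of this mode-cleaning idea is given below. It keeps the ratio w0/d = 5 from the example, but the fringe amplitude (0.2), the grid, and the choice of a sharp-edged mask that passes |u| < 1/(2d) are assumptions made purely for illustration.

```python
import numpy as np

w0 = 1.0
d = w0 / 5.0          # fringe wavelength, so that w0/d = 5 as in the example
eps = 0.2             # fringe amplitude (assumed value)
n, span = 4096, 32.0
dx = span / n
x = (np.arange(n) - n // 2) * dx
u = np.fft.fftfreq(n, d=dx)

clean = np.exp(-x**2 / w0**2)                          # ideal gaussian profile
dirty = clean * (1 + eps * np.cos(2 * np.pi * x / d))  # profile with fringes

# Low-pass mask in the Fourier plane: pass only |u| < 1/(2d)
pinhole = (np.abs(u) < 0.5 / d).astype(float)
filtered = np.fft.ifft(np.fft.fft(dirty) * pinhole).real

ripple_before = np.max(np.abs(dirty - clean))
ripple_after = np.max(np.abs(filtered - clean))
print(f"ripple before: {ripple_before:.3e}, after: {ripple_after:.3e}")
```

Because the side spots at u = ±1/d are well separated from the central gaussian when w0/d = 5, blocking them leaves the clean gaussian essentially untouched.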
Example 10.4
Edge detection using a high-pass filter: One interesting application of a high-pass
filter is the ability to pick out edges in an image. We first consider what
happens mathematically in one dimension (using cylindrical lenses) and then some
two-dimensional examples using letters. We shall use a gaussian filter function for
convenience. For a high-pass filter in the x direction, we define the transmission
function as
$$\mathrm{high}(x) = 1 - \mathrm{gauss}\left(\frac{x}{x_c}\right), \tag{10.8}$$
where $x_c$ defines an effective cut-off distance, where light with position $x < x_c$ in the
Fourier plane is strongly attenuated, similar to the 2D mask in Fig. 10.6. If the input
image is a broad rect function of width a, then in the Fourier plane, immediately
after the filter, the field is proportional to
$$g(x) = a\left[1 - \mathrm{gauss}\left(\frac{x}{x_c}\right)\right]\mathrm{sinc}\left(\frac{\pi a x}{\lambda f}\right). \tag{10.9}$$
The field in the output plane is proportional to the Fourier transform of g(x), which
we can evaluate using the inverse convolution theorem. We find
$$h(x) = \mathrm{rect}\left(\frac{x}{a}\right) - \mathrm{rect}\left(\frac{x}{a}\right) \ast \sqrt{\pi}\,x_c\,\mathrm{gauss}\left(\frac{\pi x_c x}{\lambda f}\right). \tag{10.10}$$
The convolution of a gauss and a rect produces a smoothed rect function, as shown
in Fig. 10.10(top). The difference between a rect and a smoothed rect, shown in
Fig. 10.10(middle), is only non-zero near the edges. This function is large close to an
edge and zero everywhere else, i.e. a high-pass filter picks out the edges of an image,
because it is at the edges of the original function that the field varies most rapidly in
space, which generates high-frequency components. The high-pass filter only allows
these components to contribute to the image.
Fig. 10.10 High-pass filtering of a rect function, g(x). Upper panel
shows the original rect function (dark grey), and the modified one obtained
by convolving with a gaussian (light grey). The output of the 4f set-up is
proportional to the difference between these functions, shown in the middle
panel. The intensity, shown in the lower panel, is only non-zero where
there is a sudden change (an edge) in the input function.
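The edge-picking behaviour can be reproduced with a discrete Fourier transform. In this sketch the filter is written in terms of spatial frequency u rather than position in the Fourier plane, and the rect width, cut-off frequency, and grid are illustrative assumptions.

```python
import numpy as np

n, span = 4096, 40.0
dx = span / n
x = (np.arange(n) - n // 2) * dx
u = np.fft.fftfreq(n, d=dx)

a = 10.0     # width of the input rect (illustrative)
uc = 1.0     # cut-off spatial frequency of the gaussian high-pass filter (illustrative)

rect = (np.abs(x) <= a / 2).astype(float)
high = 1.0 - np.exp(-(u / uc) ** 2)            # gaussian high-pass transmission

# Output field: rect minus a gaussian-smoothed rect
output = np.fft.ifft(np.fft.fft(rect) * high)
intensity = np.abs(output) ** 2

# Positions where the output is bright: these cluster around x = +-a/2
edge_positions = x[intensity > 0.5 * intensity.max()]
print("bright region spans", edge_positions.min(), "to", edge_positions.max())
print("intensity at the centre of the rect:", intensity[n // 2])
```

The output intensity is negligible in the middle of the rect and far outside it, and is concentrated within a distance of order 1/(π uc) of the two edges.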
10.7 Convolution
A spatial filter set-up can be used to perform a convolution. For example,
if we want to make copies of an image we need to convolve with a comb
function. A convolution in real space is equivalent to a multiplication
in Fourier space, so to convolve with a comb we can multiply by the
Fourier transform of a comb (also a comb) in the Fourier plane. This is
illustrated in Fig. 10.14, where a grating is placed in the Fourier plane.
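The equivalence between multiplying in the Fourier plane and convolving in real space can be checked on a discrete grid. The sketch below uses a three-tooth comb and a slit-like "image"; the grid size, slit width, and tooth spacing are illustrative assumptions, and the convolution is circular because a discrete Fourier transform is used.

```python
import numpy as np

n = 1024
x = np.arange(n) - n // 2

image = (np.abs(x) <= 20).astype(float)      # a single slit-like "image"

spacing = 200                                 # separation of the copies, in samples
comb = np.zeros(n)
for m in (-1, 0, 1):
    comb[n // 2 + m * spacing] = 1.0          # three delta-like teeth centred on x = 0

# Multiply the two spectra; ifftshift places x = 0 at index 0 so no extra shift appears
product = np.fft.fft(np.fft.ifftshift(image)) * np.fft.fft(np.fft.ifftshift(comb))
copies = np.fft.fftshift(np.fft.ifft(product)).real

# Direct construction of the expected result: shifted copies of the image
expected = sum(np.roll(image, m * spacing) for m in (-1, 0, 1))
print("maximum deviation:", np.max(np.abs(copies - expected)))
```

Each tooth of the comb produces one displaced copy of the image, which is the numerical counterpart of placing a grating in the Fourier plane.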
Example 10.5
Phase encoding: In phase-contrast imaging we read out the phase information in
the field via interference. To see how important the phase information is in any
image we consider a simple example where in the Fourier plane of a 4f spatial filter
we swap the phase information. The effect is shown in Fig. 10.16. The left-hand
column shows the input intensity, the middle column shows the field in the Fourier
plane, and the right-hand column shows the output field. In the Fourier plane we
swap the phase patterns, which converts the cat into a duck, and vice versa; i.e. phase
is more important than intensity in determining the image. If we ignore the intensity
information completely in the Fourier plane and just imprint the phase pattern on a
laser beam we get a similar result.
10.9 Complementarity II
In Chapter 9, we illustrated the principle of complementarity using
Heisenberg’s microscope, Section 9.8. Another way of looking at
complementarity, which has the advantage of only using photons, is
to exploit the real and Fourier space distributions in our 4f imaging
system. Whereas in the Heisenberg microscope we attempt to measure
which-path information and pay the price that the fringe visibility is
degraded, this time we attempt to observe the interference pattern and
find that the which-path information is degraded. The idea is to combine
Young’s double-slit experiment with the 4f spatial filter, as illustrated in
Fig. 10.17. If there is no mask, then photons that enter through input
slits A and B will exit through output slits A and B, respectively. If
the photon passes through both slits then there will be an interference
pattern in the Fourier plane.
that this optical system both provides path information and is sensitive
to interference? Does this violate complementarity? The answer is no,
because inserting the grating scrambles the which-path information by
diffracting light from one entrance slit to the other exit slit, as illustrated
in the lower image in Fig. 10.17. The grating produces multiple copies
of each entrance slit as we saw in Fig. 10.14. An experiment verifying
this effect and reaffirming complementarity was performed using single
photons by Jacques et al. in 2008.
Interestingly, the better we match the grating transmission to the
interference pattern, the more efficient is the diffraction between paths
A and B. This is because the fringe visibility, V, and the path
distinguishability, D, are linked via the complementarity inequality,
$V^2 + D^2 \le 1$, see Jacques et al. (2008). A convenient way to think about
this is to use time-reversal symmetry or the reciprocity theorem
to map back from the output plane to the Fourier plane; then it follows
that having light at both output ports would produce an interference
pattern that matches the grating, as happened for the two inputs.
Chapter summary
Exercises
(10.1) Intensity of subsidiary maxima
(i) For a one-dimensional rectangular aperture tabulate the intensity of the
1st, 2nd, . . ., 10th subsidiary maximum relative to that of the central peak.
(ii) Repeat the analysis for a two-dimensional symmetric aperture where the
point-spread function is the Airy pattern.
(10.2) Point-spread functions of apodizing functions
Use Fourier techniques, as demonstrated in Example 10.1, to reproduce the
Fourier transforms of the apodizing functions listed in Table 10.1.
(10.3) Plotting apodized intensity point-spread functions
Plot the intensity point-spread functions of the apodizing functions in
Table 10.1 to reproduce Fig. 10.2. Make two versions of each graph, one
with a linear and the other a logarithmic scale for the ordinate.
(10.4) Reduced maximum intensity with apodized pupil
Show that the peak intensity of an apodized function has to be less than
that of the unapodized aperture.
[Hint: use the central ordinate theorem.]
(10.5) Designing an apodizing function for a desired point-spread function
Discuss why it is not possible to design a point-spread function (amplitude
or intensity), and then find the corresponding apodizing function.
[Hint: consider the extent of the Fourier transform of an arbitrary function.]
(10.6) Inverse apodization in one dimension
Consider the function
$$\mathrm{apod}(x') = \begin{cases} 1 & -a/2 \le x \le -\alpha a/2 \\ 0 & -\alpha a/2 \le x \le \alpha a/2 \\ 1 & \alpha a/2 \le x \le a/2 \end{cases}$$
where $0 \le \alpha \le 1$. Calculate the amplitude point-spread function for this
inverse-apodized function. Plot the width of the central maximum and the
peak intensity as a function of α. Comment on the trade-off between the
narrower central maximum and the drop off in peak intensity.
(10.7) Narrower point-spread function for inverse-apodized functions
For a circular pupil with an inverse-apodization filter show that the central
peak of the point-spread function has to be narrower than that of the
unaltered pupil. [Hint: At the centre of the diffraction pattern, from the
central ordinate theorem we know that the field is the integral over (1 − T),
which is positive. For a value of 1.22λ/D the field has two components, that
from the ‘1’ goes to zero. Therefore show that the field here must be
negative. (Recall that the central maximum of the point-spread function of T
is wider than the Airy pattern.) Hence show that the Fourier transform of
this modified pattern must cross zero between these two values, and thus is
narrower.]
(10.8) Width of central maximum for an inverse-apodized circular aperture: absorbing filter
Calculate the intensity point-spread function for a circular pupil with
diameter D in front of a lens of focal length f using an annular aperture,
which only transmits light in the region
$$\alpha D/2 \le \rho \le D/2,$$
with $0 \le \alpha \le 1$. Plot the width of the central maximum and the peak
intensity as a function of α. Comment on the trade-off between the
narrower central maximum and the drop off in peak intensity.
(10.9) Width of central maximum for an inverse-apodized circular aperture: phase filter
Repeat the analysis of the previous question, but with an inverse-apodizing
mask given by
$$\mathrm{apod}(\rho') = \begin{cases} -1 & 0 \le \rho \le \alpha D/2 \\ 1 & \alpha D/2 \le \rho \le D/2 \end{cases}$$
Comment on the trade-off between the narrower central maximum and the
drop off in peak intensity.
(10.10) Spatial filtering of a 1D grating
The object in a 4f set-up is a one-dimensional grating with transmission profile
$$f(x') = 0.5 + 0.4\cos(2\pi x'/d) + 0.1\cos(4\pi x'/d),$$
where d is the period of the grating.
(a) Show that the intensity diffraction pattern in the Fourier plane consists
of five spots.
(b) Given the wavelength of the light used is λ, what is their location?
(c) Calculate the relative intensities of the five spots.
(d) What intensity pattern is observed in the output if
(i) no filter is inserted,
(ii) the outer two spots are blocked, and
(iii) only the outer two spots are transmitted?
(10.11) Fourier transform of the letter K
Figure 10.5 depicts the square modulus of the two-dimensional (2D) Fourier
transform of the letter K. Explain the form of the diffraction pattern.
(10.12) Fourier transform of the letter E
(i) Sketch the form of the square modulus of the 2D Fourier transform of
the letter E.
(ii) Is it possible to insert a mask in the Fourier plane to remove the
vertical bar, and retain the horizontal bars?
(iii) If so, sketch the form of the filter.
(iv) Explain why it is impossible to insert a mask in the Fourier plane to
remove one of the horizontal bars, such that the image looks like an F.
(10.13) Twisted comb
A grating with period d can be described using the function $X_d(x)$. If the
grating is translated by a half-period this becomes $t(x) = X_d(x - d/2)$. Find
the expression for the Fourier transform of t(x) with Fourier variable
$u = x/(\lambda f)$, and sketch the function. Use your sketch to explain
Fig. 10.14(ii) and (iii).
(10.14) Spatial filtering of a letter
Write a caption for Fig. 10.18. Estimate the ratio between the cut-off
between low and high spatial frequencies, $\rho_c$, and the widths, a, of the
lines used to create the letter.
(10.15) Convolutor
Sketch the optical layouts used to obtain the images shown in Fig. 10.19.
Indicate where the detector should be placed in order to observe (i) and (ii).
$$A^{(0)} = \pi w_0^2 E_0\, e^{-k_\rho^2 w_0^2/4} = \pi w_0^2 E_0\, e^{-\pi^2 (u^2 + v^2) w_0^2}. \tag{11.3}$$
For the case of a gaussian input field we can solve this equation exactly.
The angular spectrum in a plane at z is given by multiplying the angular
spectrum in the input plane, eqn (11.3), by the propagator:
where
$$q = z - \mathrm{i} z_R$$
is known as the complex beam parameter and $z_R = k w_0^2/2 = \pi w_0^2/\lambda$
is the Rayleigh range as defined in Chapter 5. Finally, we find the
inverse transform using $\mathcal{F}^{-1}[\mathrm{gauss}(k_\rho \xi/2)] = 1/(\pi\xi^2)\,\mathrm{gauss}(\rho/\xi)$ with
$\xi^2 = \mathrm{i}2q/k$, which gives
$$\mathcal{F}^{-1}\left[e^{-\mathrm{i}k_\rho^2 q/2k}\right] = \frac{1}{\pi \mathrm{i} 2q/k}\, e^{\mathrm{i}k\rho^2/2q}. \tag{11.6}$$
Multiplying by the prefactor $\pi w_0^2 E_0 e^{\mathrm{i}kz}$ we get
Fig. 11.1 The geometry of a gaussian laser beam in the xz plane. The
greyscale shows intensity (peak intensity is white). The grey dashed lines
correspond to the beam radius, w, which is the transverse distance at
which the intensity falls to $1/e^2$ of its on-axis value. The peak intensity and
minimum radius, $w_0$, occurs at the position of the beam waist, which here
$$w = w_0\left(1 + z^2/z_R^2\right)^{1/2}, \tag{11.10}$$
$$R = z + z_R^2/z, \tag{11.11}$$
Beam radius: The factor $e^{-\rho^2/w^2}$ in eqn (11.12) indicates that the
transverse spatial profile remains gaussian, but with a modified beam
radius given by eqn (11.10). At $z = z_R$ the beam radius $w = \sqrt{2}\,w_0$,
see Fig. 11.2, and the beam area is double that at the waist. In
the far-field, $z \gg z_R$, we recover the simple result that the beam
radius increases linearly with distance, $w = w_0 z/z_R$, and the angular
divergence $\Delta\theta = w/z = w_0/z_R = \lambda/\pi w_0$, as we saw in Chapter 5. The
Rayleigh range characterizes the cross-over between the near field and
far field, and also how far the beam can propagate before the transverse
distribution changes substantially. In this respect it is analogous to
the Rayleigh distance in diffraction, see Chapter 5, and emphasizes the
point that a length scale corresponding to transverse size-squared over
wavelength is characteristic of all diffraction phenomena.
Fig. 11.2 Illustration of the Rayleigh range, $z_R$, and angular divergence,
$\Delta\theta = \lambda/(\pi w_0)$, of a laser beam in the xz plane. The initial beam radius
(waist) in the z = 0 plane is $w_0$. The beam radius at $z = z_R$ is $\sqrt{2}\,w_0$.
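The beam radius and radius of curvature of eqns (11.10) and (11.11) are easy to evaluate numerically. The sketch below uses an assumed 1 mm waist and 633 nm wavelength purely as an example; the function names are illustrative.

```python
import numpy as np

def rayleigh_range(w0, wavelength):
    """z_R = pi w0^2 / lambda."""
    return np.pi * w0**2 / wavelength

def beam_radius(z, w0, wavelength):
    """Eqn (11.10): w = w0 (1 + z^2/z_R^2)^(1/2)."""
    zR = rayleigh_range(w0, wavelength)
    return w0 * np.sqrt(1 + (z / zR) ** 2)

def curvature_radius(z, w0, wavelength):
    """Eqn (11.11): R = z + z_R^2/z, taken as infinite at the waist (z = 0)."""
    zR = rayleigh_range(w0, wavelength)
    return np.inf if z == 0 else z + zR**2 / z

def divergence(w0, wavelength):
    """Far-field angular divergence, lambda / (pi w0)."""
    return wavelength / (np.pi * w0)

w0, lam = 1e-3, 633e-9          # example values: 1 mm waist, 633 nm light
zR = rayleigh_range(w0, lam)
print(f"Rayleigh range    : {zR:.3f} m")
print(f"w(z_R) / w0       : {beam_radius(zR, w0, lam) / w0:.4f}")
print(f"divergence        : {divergence(w0, lam) * 1e3:.4f} mrad")
print(f"w/z at z = 100 z_R: {beam_radius(100 * zR, w0, lam) / (100 * zR) * 1e3:.4f} mrad")
```

For these example values the Rayleigh range is about 5 m, the radius at z = zR is √2 times the waist, and w/z approaches the far-field divergence once z is many Rayleigh ranges.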
Wave-front curvature: The factor $e^{\mathrm{i}k\rho^2/2R}$ in eqn (11.12), similar
to the quadratic phase term appearing in the paraxial spherical wave in
Chapter 2, Section 2.14, tells us that the wave fronts are curved.
If we consider the phase factor associated with the spherical wave $e^{\mathrm{i}kr}$
with $r = (R^2 + \rho^2)^{1/2}$, then in the paraxial limit ($\rho \ll R$) we have
$r \approx R + \rho^2/2R$, giving the same dependence on ρ as the laser beam. So
the wave fronts are approximately spherical with a radius of curvature
R. Note that $R > z$ so the effective origin of the spherical wave front
is always further away than the waist, as shown in Fig. 11.1. Note that
the radius of curvature is infinite in the z = 0 plane, i.e. the wave fronts
are planar at the beam waist, as shown in Fig. 11.3.⁴ Finally, we note
⁴ We could have guessed this from the symmetry of the gaussian beam
solution upstream and downstream of the waist, but still it may seem
surprising that the wave fronts at the waist are planar yet the beam spreads
out.
Gouy phase: The prefactor in eqn (11.12) is also complex and can
be rewritten in terms of an amplitude and a phase. First separating the
real and imaginary parts, we have
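$$\frac{z_R}{\mathrm{i}q} = \frac{z_R}{\mathrm{i}z + z_R} = \frac{z_R^2 - \mathrm{i}z_R z}{z^2 + z_R^2},$$
which can be written as
$$\frac{z_R}{\mathrm{i}q} = \frac{z_R}{(z^2 + z_R^2)^{1/2}}\, e^{-\mathrm{i}\alpha} = \frac{w_0}{w}\, e^{-\mathrm{i}\alpha},$$
where
$$\alpha = \tan^{-1}\left(\frac{z}{z_R}\right) \tag{11.13}$$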
is the Gouy phase (see also Chapter 5), and the factor w0 /w ensures
that energy is conserved. This phase was first discovered by Louis George
Gouy (Vals les Bains 1854–1926) in 1890, and arises due to the finite
size of a wave at a focus. The Gouy phase evolves by π from one side
of the focus to the other, and is crucial in explaining some features in
light–matter interactions (see Chapter 13).
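The decomposition of the prefactor into an amplitude w0/w and a Gouy phase can be verified with complex arithmetic. This is a small sketch with illustrative function names; the test positions and the value of zR are arbitrary.

```python
import numpy as np

def prefactor(z, zR):
    """Complex prefactor z_R / (i q) with q = z - i z_R."""
    q = z - 1j * zR
    return zR / (1j * q)

def gouy_phase(z, zR):
    """Eqn (11.13): alpha = arctan(z / z_R)."""
    return np.arctan(z / zR)

zR = 2.0                                       # arbitrary Rayleigh range
for z in (-3.0, 0.0, 0.5, 10.0):
    amplitude = 1 / np.sqrt(1 + (z / zR) ** 2)            # w0 / w from eqn (11.10)
    rebuilt = amplitude * np.exp(-1j * gouy_phase(z, zR))
    print(z, prefactor(z, zR), rebuilt)

# Total phase change from far upstream to far downstream of the focus
print("total Gouy phase change:", gouy_phase(1e9, 1.0) - gouy_phase(-1e9, 1.0))
```

The two columns agree at every position, and the accumulated phase across the focus tends to π.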
By equating the real and imaginary parts of eqn (11.16) we find the new
waist size in terms of zR2 and position z2 . This formula is particularly
useful in the design of laser cavities.
Consider the case where the laser beam is sufficiently well collimated
that we can assume that the beam waist lies in the same plane as the
lens, i.e. z1 = 0 as in Fig. 11.5. Putting z1 = 0 in eqn (11.16) and
equating imaginary parts, we obtain
$$z_{R2} = \frac{z_{R1}}{1 + z_{R1}^2/f^2}. \tag{11.17}$$
If $z_{R1} = \pi w_0^2/\lambda$ and $z_{R2} = \pi w_f^2/\lambda$, then for $z_{R1} \gg f$, $z_{R2} \approx f^2/z_{R1}$,
giving
$$w_f = \frac{f\lambda}{\pi w_0}, \tag{11.18}$$
which is the same result as obtained using the Fraunhofer diffraction
formula, see Chapter 5. This is a useful result for estimating the size of
a focused laser beam; however, we have to remember that it assumes that
the lens does not introduce any aberrations—optimal focusing without
aberration is described as diffraction limited.
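The quality of the approximation in eqn (11.18) can be checked against eqn (11.17) directly. The sketch below assumes, as in the text, that the incident waist lies in the plane of the lens; the example values (1 mm waist, 633 nm, 100 mm focal length) and the function names are illustrative.

```python
import math

def focused_rayleigh_range(zR1, f):
    """Eqn (11.17): Rayleigh range after the lens for an incident waist at the lens."""
    return zR1 / (1 + zR1**2 / f**2)

def focused_waist(w0, wavelength, f):
    """Focused waist obtained from eqn (11.17) using z_R = pi w^2 / lambda."""
    zR1 = math.pi * w0**2 / wavelength
    zR2 = focused_rayleigh_range(zR1, f)
    return math.sqrt(wavelength * zR2 / math.pi)

def focused_waist_approx(w0, wavelength, f):
    """Eqn (11.18): w_f = f lambda / (pi w0), valid when z_R1 >> f."""
    return f * wavelength / (math.pi * w0)

w0, lam, f = 1e-3, 633e-9, 0.1      # example values in metres
print(f"from eqn (11.17): {focused_waist(w0, lam, f) * 1e6:.3f} um")
print(f"from eqn (11.18): {focused_waist_approx(w0, lam, f) * 1e6:.3f} um")
```

For these values the incident Rayleigh range is roughly fifty times the focal length, and the two estimates of the focused waist (about 20 μm) agree to better than 0.1 per cent.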
Note that the waist of the focused beam is not in the focal plane, see
Fig. 11.5, and the wave fronts in the focal plane are curved, not planar,
as expected from the Fresnel diffraction integral, eqn (6.34). By equating
the real parts in eqn (11.16) we find that the focal shift (the difference
between the focal plane and waist plane) is approximately $z_{R2}^2/f$, where
$z_{R2}$ is the Rayleigh range of the focused beam. Alternatively, it can be
written as $f^3/z_{R1}^2$, where $z_{R1}$ is the Rayleigh range of the incident beam,
see Exercise 11.11.
Fig. 11.5 A laser beam with beam waist $w_0$ in the z = 0 plane is incident
on a lens with focal length f positioned in the z = 0 plane. The lens creates a
new waist with radius, $w_f$, at position $z = f - z_{R2}^2/f$, where $z_{R2}$ is the
Rayleigh range of the focused beam.
Example 11.1
Laser pointer: An interesting example of optical engineering which exploits these
ideas is a green laser pointer. The heart of the laser consists of a plano-convex cavity
similar to Fig. 11.6. The laser light is generated inside the cavity using a crystal of
yttrium aluminium garnet doped with neodymium ions (Nd:YAG). The Nd ions are
optically excited using light from a semiconductor diode laser and are subsequently
stimulated to emit into the laser mode. The emitted laser light at λ = 1.06 μm is
frequency doubled to produce green light at 0.532 μm using a non-linear crystal. A
typical cavity length is L = 1.00 cm. Using eqn (11.21) we obtain a mode waist size
of w0 = 58.1 μm, which defines the physical size of the region where we need to excite
the laser crystal.
Example 11.2
Symmetric cavity: A symmetric version of a laser cavity, see Fig. 11.7, consists of
two curved mirrors at z = ±L/2. In this case,
$$R_m = \frac{L}{2} + \frac{z_R^2}{L/2}. \tag{11.22}$$
Using $z_R = \pi w_0^2/\lambda$ we can re-arrange to find the beam waist at the centre of the
cavity,
$$w_0 = \left(\frac{\lambda}{\pi}\right)^{1/2}\left(\frac{L}{2}\right)^{1/4}\left(R_m - \frac{L}{2}\right)^{1/4}. \tag{11.23}$$
We can see that $w_0 \to 0$ for $L = 0$ and $L = 2R_m$, which set the limits of stability of
the cavity. The optimal stability is at the mid-point, $L = R_m$, which also corresponds
with the maximum value of $w_0$ for a particular mirror radius of curvature $R_m$.
Fig. 11.7 Schematic of a symmetric cavity. The wave front curvature of
the gaussian beam matches the mirror curvature, at $z = \pm L/2$. The beam
waist is located in the z = 0 plane. For there to be a gaussian mode with
a finite beam waist, the cavity must be shorter than $2R_m$, where $R_m$ is the
radius of curvature of the mirror.
11.5 Waveguides
An optical resonator preserves the spatial profile of a propagating beam
by compensating diffraction by refocusing. If we use lenses rather than
curved mirrors, as in Fig. 11.8, the light propagates in one direction
rather than back and forth, yet the spatial mode remains the same. We
could call this sequence of lenses a single-mode waveguide. The most
common form of single-mode waveguide is an optical fibre, effectively an
infinitely long lens where the focusing effect is just enough to cancel
diffraction, creating a stable transverse field distribution or mode. In
a single-mode waveguide, the transverse profile is uniquely defined and
the light emerging from the fibre has the same spatial distribution as the input.

Fig. 11.8 Unwrapping the cavity in Fig. 11.7, where a sequence of lenses rather than curved mirrors preserves the spatial mode.

Before we consider how this works, it is important to mention that lensing is not the only way to guide light. If a reflecting interface surrounds the propagation direction then light is guided like water in a pipe. For example, light inside a glass rod may be confined by total internal reflection.6 This type of light guide is called multimode because it supports many propagation modes rather than just one. For a multimode guide, light propagation cannot be described analytically and we need to revert to the hedgehog equation, eqn (6.29), or vector theory, see Chapter 12.

Here we shall focus on single-mode waveguides as these are the most useful for applications such as optical communications.7 As both light and matter are described by solutions of a wave equation, there is a useful analogy between guided light modes and the bound states found in quantum systems. This idea is very powerful. Consequently we shall devote a significant fraction of the following sections to the cross-over between quantum physics and guided light.

6 This type of light guide is used in the rod-lens endoscope invented by Harold Horace Hopkins (Leicester 1918–Reading 1994). Hopkins also made major contributions to Fourier optics and the theory of aberrations, still used in lens design.

7 Although the mathematics of cylindrical waveguides is not pretty, as optical fibres are so ubiquitous it is important for any student of optics to have at least some idea of how they work, and where to find the mathematics if required.
In Fig. 11.10(ii) we show the sum of allowed modes, which is similar to,
but not a perfect reproduction of, a rect function, partly because the
maximum angular spatial frequency is capped at kmax = 2π/λ. Using
the modified light distribution we can calculate the far-field intensity
pattern, as shown in Fig. 11.10(iii). The difference to the sinc-squared
distribution is small, but noticeable. If we reduce the slit width the
discrepancy becomes larger as fewer modes are allowed. Worth noting is
that although now we have a discrete spectrum of kx values, each one is
associated with a mode with finite spatial extent, so has a corresponding
spread in kx .
is
$$E(\delta z) = E_0\, e^{i n k \delta z}\, e^{i n_0 k \rho^2/2R}\, e^{-\rho^2/w_0^2} ,$$
$$n = n_0 - \Delta n\, \frac{\rho^2}{w_0^2} , \qquad (11.28)$$
such that the refractive index is n0 on-axis and decreases by an amount
Δn over a transverse distance w0 . Substituting this index profile into
the gaussian beam propagation equation (neglecting terms above second
order in ρ), we find
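$$E(\delta z) = E_0\, e^{i n_0 k \delta z} \exp\left(-i\,\frac{\Delta n}{w_0^2}\, k\, \delta z\, \rho^2\right) \exp\left(i\,\frac{n_0}{2 z_R^2}\, k\, \delta z\, \rho^2\right) e^{-\rho^2/w_0^2} .$$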
Example 11.3
GRIN fibre revisited: In this example, we find the mode inside a graded-index fibre
by solving the Helmholtz equation, eqn (1.40). Inside a medium with refractive index
n, ∇2 E + n2 k2 E = 0, and for a propagating mode of the form E(x, y, z) = E(x, y)eiβz ,
we obtain
$$\frac{\partial^2 E}{\partial x^2} + \frac{\partial^2 E}{\partial y^2} + (n^2 k^2 - \beta^2) E = 0 , \qquad (11.31)$$
where β is known as the propagation constant. The parabolic index variation in a
graded-index fibre has the convenient property of cartesian separability allowing us
to separate the Helmholtz equations for each transverse dimension. For x, we have
$$\frac{\partial^2 E}{\partial x^2} + \left(n_0^2 k^2 - \beta^2 - 2 n_0 \Delta n\, \frac{k^2 x^2}{w_0^2}\right) E = 0 , \qquad (11.32)$$
Comparing the $x^2$ terms in the Helmholtz and Schrödinger equations, (11.32) and (11.33), gives10
$$\Delta n = \frac{2}{n_0 k^2 w_0^2} , \qquad (11.35)$$
which is the same result we found by imprinting a parabolic phase.

10 Equating the $x^2$ terms gives $2 n_0 \Delta n (k^2/w_0^2) = 1/a_0^4$, and using $w_0 = \sqrt{2}\, a_0$, we get $2 n_0 \Delta n (k^2/w_0^2) = 4/w_0^4$.
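To get a feel for the size of the index change required, eqn (11.35) can be evaluated for some representative numbers. This is a minimal sketch; the on-axis index, wavelength and mode waist below are hypothetical, and k is taken to be the vacuum wavenumber 2π/λ.

```python
import numpy as np

def grin_index_drop(n0, wavelength, w0):
    """Index decrease over a radius w0 that supports a gaussian mode of waist w0, eqn (11.35)."""
    k = 2 * np.pi / wavelength             # vacuum wavenumber
    return 2.0 / (n0 * k**2 * w0**2)

# Hypothetical fibre parameters
n0, wavelength, w0 = 1.5, 1.55e-6, 5e-6
dn = grin_index_drop(n0, wavelength, w0)
print(dn)                                  # of order 3e-3 for these values

# Consistency check against the x^2 matching condition, 2 n0 dn k^2 / w0^2 = 4 / w0^4
k = 2 * np.pi / wavelength
lhs = 2 * n0 * dn * k**2 / w0**2
rhs = 4 / w0**4
```

An index change of a few parts in a thousand is sufficient here, and the required change falls as the square of the mode waist.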
Example 11.4
Higher-order modes: For the m = 1 mode in eqn (11.36), the integral in the core
becomes
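$$\int_0^\rho \kappa\, d\rho = \int_0^\rho \left(n_1^2 k^2 - \beta^2 - \frac{1}{\rho^2}\right)^{1/2} d\rho = \left(\frac{u^2 \rho^2}{a^2} - 1\right)^{1/2} - \cos^{-1}\left(\frac{a}{u\rho}\right) ,$$
where use has been made of the standard integral
$$\int \frac{(\xi^2 - 1)^{1/2}}{\xi}\, d\xi = (\xi^2 - 1)^{1/2} - \cos^{-1}\frac{1}{\xi} .$$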
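In the cladding we get
$$\int_a^\rho \kappa\, d\rho = \left(\frac{w^2 \rho^2}{a^2} + 1\right)^{1/2} - \left(w^2 + 1\right)^{1/2} - \ln\left[\frac{1 + (w^2\rho^2/a^2 + 1)^{1/2}}{1 + (w^2 + 1)^{1/2}}\, \frac{a}{\rho}\right] ,$$
which uses another standard integral
$$\int \frac{(\xi^2 + \alpha^2)^{1/2}}{\xi}\, d\xi = (\xi^2 + \alpha^2)^{1/2} - \alpha \ln\left[\frac{\alpha + (\xi^2 + \alpha^2)^{1/2}}{\xi}\right] .$$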
$$= B\, \frac{1 + \sqrt{w^2\rho^2/a^2 + 1}}{1 + \sqrt{w^2 + 1}}\, \frac{a}{\rho}\, \exp\left[\sqrt{w^2 + 1} - \sqrt{\frac{w^2\rho^2}{a^2} + 1}\,\right] ,$$
for ρ < a and ρ > a, respectively. Again matching the field and gradient across the boundary we find
$$\cos\left[(u^2 - 1)^{1/2} - \cos^{-1}\frac{1}{u} - \frac{\pi}{4}\right] = \frac{(u^2 - 1)^{1/2}}{V} . \qquad (11.51)$$

V (black curve). The gaussian mode of a graded-index fibre is shown in grey for comparison. For small V the mode spreads out far into the cladding. For large V the mode is strongly localized in the core, but when V > 2.405 higher-order modes can also propagate in the fibre.
We plot the solution to eqn (11.51) together with the solution for the
fundamental (m = 0) mode, eqn (11.49), in Fig. 11.14, and indicate the
range where the fibre is single mode. As V is inversely proportional to
the wavelength, the plot shows us that if the wavelength is too short
then V is large and the fibre becomes multimode. At the other extreme,
if the wavelength is too long, then it does not support any propagating
modes. Consequently any fibre with a particular core radius and index
step will have a finite range of wavelengths over which it is single mode.
For b close to zero, u ≈ V and w ≈ 0 and the mode decays only slowly
into the cladding. As V increases—which happens either if we increase
the core radius a or reduce the wavelength λ—both b and w increase
and the mode becomes more localized in the core. This is illustrated in
Fig. 11.15, where we plot the modes for three values of V . Eventually
as we increase V further, we can fit another mode into the core. Single-
mode fibres are typically designed to operate just below the single mode
limit, V = 2.405, where the mode is well localized within the core. This
minimizes losses due to variations in the core radius.
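The single-mode condition can be turned into a quick design check. The sketch below assumes the usual definition of the normalized frequency for a step-index fibre, V = (2πa/λ)(n1² − n2²)^{1/2}, with n1 and n2 the core and cladding indices; the fibre parameters are hypothetical.

```python
import numpy as np

V_CUTOFF = 2.405   # single-mode limit quoted in the text

def v_number(a, wavelength, n1, n2):
    """Normalized frequency of a step-index fibre (assumed standard definition)."""
    return 2 * np.pi * a / wavelength * np.sqrt(n1**2 - n2**2)

def cutoff_wavelength(a, n1, n2):
    """Shortest wavelength at which only the fundamental mode propagates."""
    return 2 * np.pi * a * np.sqrt(n1**2 - n2**2) / V_CUTOFF

# Hypothetical fibre: 4.1 micron core radius, small index step
a, n1, n2 = 4.1e-6, 1.4504, 1.4447
lam_c = cutoff_wavelength(a, n1, n2)
for lam in (1.0e-6, 1.55e-6):
    V = v_number(a, lam, n1, n2)
    print(lam, V, "single mode" if V < V_CUTOFF else "multimode")
```

Because V scales as 1/λ, the same fibre is multimode at wavelengths shorter than the cut-off and single mode at longer wavelengths, until the mode becomes too weakly guided to be useful.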
Chapter summary
Exercises
(11.1) Laser beams and spherical waves
Write an equation for a gaussian beam in terms of the complex beam parameter, q = z − izR. Rewrite this equation in the far field, z ≫ zR, in terms of the cylindrical coordinates (ρ, z). How does this compare to a paraxial spherical wave? How do the amplitudes compare? What does this suggest about the effective size of a source of a spherical wave?

(11.2) Laser beam: on-axis intensity (1)
How does the on-axis intensity of a laser beam scale with the beam waist w0? Explain the physical origin of the power law.

(11.3) Laser beam: on-axis intensity (2)
Show that the probability of detecting a photon along the propagation axis of a laser beam is described by a Lorentzian (or Cauchy–Lorentz) distribution.

(11.4) Wave-front curvature
For a gaussian beam with waist in the z = 0 plane plot the wave-front curvature, R, as a function of distance. For the abscissa use the scaled variable z/zR. Verify mathematically that the wave-front curvature attains its minimum value for z = zR.

(11.5) Gouy phase
For a gaussian beam with waist in the z = 0 plane plot the Gouy phase as a function of distance. For the abscissa use the scaled variable z/zR. Verify mathematically that the Gouy phase changes by π in traversing the plane containing the beam's waist.

(11.6) Wave-front curvature at the waist
Verify that the wave fronts of a gaussian beam are flat at the waist. Does this mean that the beam can be considered to be a plane wave at this location?

(11.7) Characterizing a gaussian beam
In addition to the wavelength, how many other independent parameters are required to characterize a gaussian beam?

(11.8) Rayleigh range
What is the Rayleigh range of a laser of wavelength λ = 633 nm of waist 0.250 mm? What is the size of the beam after it has propagated 500 m?

(11.9) Beam expansion (1)
A red laser beam is intended for use in a survey theodolite. If it has a waist size of 1 mm, show that the beam 'remains parallel' over a distance of approximately 5 m. Think of a suitable definition for a diffracting beam to remain parallel. Typically a beam-expanding telescope is used on the output of a survey laser. If the beam waist is expanded to be 25 mm, over what distance will the beam remain parallel? Which has the largest spot size on the moon: a He–Ne laser with waist 1 mm, or beam expanded in a telescope to be 1 m? [Hint: Distance to moon = 3.8 × 10^5 km.]

(11.10) Beam expansion (2)
Calculate how many Rayleigh ranges a laser beam has to propagate before the central intensity becomes 50 times weaker than its initial value.

(11.11) Focal shift of a focused laser
A laser beam with a waist w0 = w1 is incident on a lens with focal length f in the z = 0 plane.
(a) What is the wave-front curvature of the incident beam in the z = 0 plane?
(b) What is the wave-front curvature, R, immediately after the lens?
(c) If the new waist is formed in a plane at z = z2, use your equation for R to write an expression for z2 in terms of f and the Rayleigh range of the focused beam, zR2. Hence find an expression for the focal shift, z2 − f.
(d) Use the complex beam parameter, eqn (11.16), to show that $z_2 = f - f^3/z_{R1}^2$, where zR1 is the Rayleigh range of the input beam.
(e) Using w2 = fλ/(πw1), show that the expressions for z2 obtained in (c) and (d) are the same.

(11.12) Laser cavities
Write an equation for the radius of curvature, R, of a laser beam with waist size w0 in terms of the propagation distance, z. Define any additional quantities used.
Find the distance at which the radius of curvature is a minimum, i.e. the wave front is most curved. What is the radius of curvature, R, at this distance?
A laser cavity of length L consists of two mirrors with radius of curvature Rm. By matching wave-front curvature to the mirror curvature, derive an expression for the beam waist inside the cavity and find the maximum value of L that will give a stable cavity.

(11.13) Minimizing beam size
Write an equation for the radius, w, of a laser beam with waist size, w0, as a function of propagation distance, z. Define any quantities used.
In an optically pumped solid-state laser, it is found that the laser threshold is a minimum when the pump laser beam radius is minimized at both ends of the laser crystal. Find an expression for the optimal pump beam waist, w0, that minimizes the threshold in terms of the length of the laser crystal and the wavelength of the pump laser. Compare the Rayleigh range of the optimal pump beam to the length of the crystal.

(11.14) Laser focusing
By equating real and imaginary parts in eqn (11.16) find expressions for the new waist position, z2, and the new waist size, w2, in terms of z1 and w1.

(11.15) Project: photonic crystal fibre
Use eqn (6.29) to simulate the propagation of a gaussian mode through the hollow-core structure shown in Fig. 11.16, and see whether you can find solutions where the mode is confined within the lower index core. The cladding region is structured with regions of high and low index which imprints a spatially dependent phase that impedes transverse propagation.
Optics f2f: From Fourier to Fresnel. Charles S. Adams and Ifan G. Hughes
© Charles S. Adams and Ifan G. Hughes 2019.
Published in 2019 by Oxford University Press. DOI: 10.1093/oso/9780198786788.001.0001
196 Vector light fields
has a finite extent, thus the gradient of the function cannot be zero
everywhere. The full solution of Maxwell’s wave equation for fields
with finite spatial extent contains a component polarized parallel to the
propagation direction, as illustrated in the lower image in Fig. 12.1.
Before attempting to resolve this issue mathematically, let us consider
what is happening physically. We have already made extensive use of
the concept that any function, f(x, y, z), can be written as a sum of
plane waves; the difference now is that as these plane waves will be
inclined at different angles, there will be components of their (transverse)
electric fields which will lie along the z axis. This is evident in
Fig. 12.2, where we see how the superposition of plane waves can lead to regions where light is purely longitudinal, i.e. polarized parallel to the 'propagation direction'. Consequently, we should always expect that beam solutions in vector diffraction theory will have components of the electric field along the propagation direction, i.e. that optical beams are not purely transverse waves. To see how this works mathematically, let us write the form of the vector electric field for a beam as $E = [f(x, y, z)\,\hat{x} + g(x, y, z)\,\hat{z}]\, e^{i(kz - \omega t)}$. On substituting this modified form into the Maxwell equation, ∇ · E = 0, we obtain
$$\frac{\partial f}{\partial x} + ikg + \frac{\partial g}{\partial z} = 0 . \qquad (12.1)$$

Fig. 12.2 Two plane waves with amplitudes E1 and E2 = √2 E1, and wave vectors k1 = (k, 0, 0) and k2 = (k/√2, 0, k/√2). At a point (x, z) where the two waves are π out of phase, the total field, E1 + E2, is purely axial, i.e. along z.
By using an integrating factor,1 this linear first-order differential equation can be solved, giving the expression
$$g = -e^{-ikz} \int \frac{\partial f}{\partial x}\, e^{ikz}\, dz . \qquad (12.2)$$
We learn three things from eqn (12.2): (i) for the special case of a plane wave, f(x, y, z) is constant and therefore the derivative is zero; this is consistent with our earlier findings that there is no component of the electric field along the propagation direction, i.e. plane waves are transverse. (ii) For beams of light, the gradients of the field distribution along the transverse polarization direction generate a longitudinal (z) component of field. (iii) Certain beam shapes, such as a gaussian, have zero gradient for all points along the line (x = 0, y = 0), and as a consequence the field will be purely transverse along the z axis. From the second point we also learn that narrow beams, where the field distribution falls off quickly off-axis, will have large gradients with concomitant larger non-transverse field values.2 It is not surprising that a narrow beam, which is very different to a plane wave, does not share one of the properties of a plane wave. From a Fourier perspective, to make a narrow beam, a broad range of plane waves at large angles will have to be summed.

1 Linear first-order differential equations that can be expressed in the form dg/dz + P(z)g = Q(z) can be solved by multiplying by an integrating factor, given by $\exp\left(\int^z P(z')\,dz'\right)$. This makes the left-hand side of the equation a total derivative, facilitating integration of both sides and giving a solution for g(z).

2 This also applies to electromagnetic fields in waveguides. In particular, for microwave fields in waveguides, where transverse confinement occurs over a range comparable to the wavelength, significant axial electric field components exist; see Yariv and Yeh (2007).
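Points (ii) and (iii) can be illustrated with a few lines of code. The sketch below makes one assumption beyond the text: that ∂f/∂x varies slowly with z compared with e^{ikz}, so that the integral in eqn (12.2) reduces to g ≈ (i/k) ∂f/∂x. The beam size is a hypothetical value.

```python
import numpy as np

wavelength = 1.0                      # work in units of the wavelength
k = 2 * np.pi / wavelength
w = 5.0 * wavelength                  # hypothetical beam radius

x = np.linspace(-4 * w, 4 * w, 4001)  # transverse cut along y = 0
f = np.exp(-x**2 / w**2)              # gaussian transverse profile
dfdx = np.gradient(f, x)              # numerical derivative of the profile
g = 1j * dfdx / k                     # slowly-varying-envelope form of eqn (12.2)

ratio = np.abs(g).max() / np.abs(f).max()
expected = np.sqrt(2) * np.exp(-0.5) / (k * w)   # analytic maximum of |g|/|f|
on_axis = np.abs(g[len(x) // 2])                 # longitudinal field at x = 0
print(ratio, expected, on_axis)
```

The longitudinal component vanishes on axis, peaks at x = ±w/√2 where the gradient of the gaussian is largest, and grows in relative size as the beam is made narrower.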
12.2 Beyond paraxial
From Section 12.1 we saw that the gaussian scalar solution is not
compatible with Maxwell’s equations, and in Section 12.3 we develop a
Figure 12.3 shows plots of both the transverse and longitudinal com-
ponents of the electric field of the gaussian beam in the xz plane. As
expected from the symmetry of the gaussian beam, the longitudinal
component is zero on-axis. It is not too difficult to show (see end-of-
chapter exercise) that the ratio of the maximum value of the longitudinal
component to the maximum value of the transverse component is $\sqrt{2}\, e^{-1/2}/kw = 0.858/kw = 0.137\, \lambda/w$. In the paraxial regime the parameter 1/kw is small, therefore, as expected, the transverse component dominates. It is also evident that the smaller the beam for a given wavelength, the more prominent the longitudinal component, as expected.4

4 Note that in waveguides, where the fields are confined in the transverse directions by conductors, it is possible to have pure transverse electric (TE) and transverse magnetic (TM) modes; see Yariv and Yeh (2007).

12.2.1 Optical beams: 'non-existence' theorems

Before going on to consider vector beams beyond the paraxial limit we
note that there are some ‘non-existence’ theorems about optical beams.
We emphasized in Chapter 2 that not all of the properties of plane waves
are shared by every optical beam, and in this section we have been able
to quantify the relative importance of the longitudinal component. This
result could also be interpreted as the impossibility of obtaining a pure
transverse beam. There are other such statements that can be made,
inspired by Lekner’s paper (Lekner, 2003):
• Pure transverse electric and magnetic optical beams do not exist.
• Beams of fixed linear polarization do not exist.
• Beams which are everywhere circularly polarized in a fixed plane
do not exist.
Further details of the mathematical analysis to back up these statements
are given in the end-of-chapter exercises. A related result is the non-existence of isotropic light waves. Maxwell's equations dictate that all spherical electromagnetic waves are intrinsically anisotropic, i.e. the electric and magnetic fields depend on at least one angular variable (Zangwill, 2013). The impossibility of achieving a continuous, nowhere-zero vector field that is everywhere tangent to a spherical surface is sometimes called the 'hairy billiard ball' theorem, see Fig. 12.4; Milnor (1978) provides a mathematical proof.

Fig. 12.4 The hairy ball theorem: the impossibility of a continuous transverse vector field on a sphere. There must be at least two positions where there is a discontinuity and hence the amplitude is zero. Image courtesy of Nicholas Spong, Durham University, 2018.
where the harmonic time dependence has been suppressed. The vector
angular spectrum method represents the field in the half space where z >
0 as a superposition of the basis functions of eqn (12.9). As in Chapter 6,
only the plane-wave components with $k_x^2 + k_y^2 < k^2$ contribute to the summation. These components will contribute to the far field whereas components with $k_x^2 + k_y^2 > k^2$ produce a complex $k_z$ and represent evanescent plane waves that decay exponentially with z. Once again we rewrite the transverse wave vector in terms of spatial frequencies kx = 2πu and ky = 2πv and eliminate kz; the phase term of the plane wave becomes
$$e^{i(k_x x + k_y y + k_z z)} = e^{i2\pi[ux + vy + (1 - u^2 - v^2)^{1/2} z]/\lambda} . \qquad (12.10)$$
We can also make use of the fact that plane waves are transverse to eliminate the amplitude of the z component of the plane wave:
$$\mathbf{k} \cdot \mathbf{E} = 0 \quad \Rightarrow \quad E_{0z} = -\frac{1}{k_z}\left(k_x E_{0x} + k_y E_{0y}\right) . \qquad (12.11)$$
We are left with two field components, E0x and E0y. As we established in Chapter 6 the angular spectrum of the plane waves can be calculated from the Fourier transform of the field in the plane z = 0; we generalize that idea here for vector waves. We introduce two functions,6 $A_x^{(0)}$ and $A_y^{(0)}$, which are, respectively, the angular spectrum of the x and y components of the electric field in the plane z = 0:
$$A_x^{(0)}(u, v) = \int\!\!\int_{-\infty}^{\infty} E_x^{(0)}(x, y)\, e^{-i2\pi(ux + vy)}\, dx\, dy = \mathcal{F}[E_x^{(0)}(x, y)](u, v) , \qquad (12.12)$$
$$A_y^{(0)}(u, v) = \int\!\!\int_{-\infty}^{\infty} E_y^{(0)}(x, y)\, e^{-i2\pi(ux + vy)}\, dx\, dy = \mathcal{F}[E_y^{(0)}(x, y)](u, v) . \qquad (12.13)$$

6 There can only be two independent functions that allow full specification of all fields downstream. We have chosen $A_x^{(0)}$ and $A_y^{(0)}$; the spectrum of Ez can be obtained from ∇ · E = 0, and the spectrum of the magnetic field from ∇ × E = −∂B/∂t. See Clemmow (1966) for a full mathematical discussion.
As for the scalar case, the propagation of the plane waves from the plane z = 0 downstream is trivial (this is one strong motivation for using this method) and simply involves multiplying the amplitude of the angular spectrum by the phase factor $e^{ik_z z}$,
$$A_x^{(z)}(u, v) = e^{ik_z z} A_x^{(0)}(u, v) ,$$
$$A_y^{(z)}(u, v) = e^{ik_z z} A_y^{(0)}(u, v) . \qquad (12.14)$$
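Therefore the full propagation equations for the vector field into the half space where z > 0, the equivalent of the hedgehog equation (6.29), are given by the inverse Fourier transform of the angular spectrum components:
$$E_x^{(z)} = \mathcal{F}^{-1}\left[e^{ik_z z}\, \mathcal{F}[E_x^{(0)}]\right] , \qquad (12.15)$$
$$E_y^{(z)} = \mathcal{F}^{-1}\left[e^{ik_z z}\, \mathcal{F}[E_y^{(0)}]\right] , \qquad (12.16)$$
$$E_z^{(z)} = -\mathcal{F}^{-1}\left[e^{ik_z z} \left(\frac{k_x}{k_z}\, \mathcal{F}[E_x^{(0)}] + \frac{k_y}{k_z}\, \mathcal{F}[E_y^{(0)}]\right)\right] . \qquad (12.17)$$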
It is worth reiterating the point we made in Chapter 6: eqns (12.15)–(12.17) are amazingly powerful. Once we have specified the tangential electric field in one plane, these equations allow the full vector field to be calculated downstream.7 There exist very efficient algorithms for calculating the two-dimensional Fourier transforms at the heart of these results on a computer (and other modern electronic devices such as smart phones); consequently these equations are used extensively in computational optics.

7 There is a very insightful discussion by Clemmow (1966) as to the power of the formalism that allows the plane-wave representation of a field in a half space solely in terms of an 'aperture distribution' of certain tangential field components. An alternative physical interpretation in terms of current densities as sources of electromagnetic fields is not as flexible.
We can also compare the full vector solutions of eqns (12.15)–(12.17)
with their scalar counterpart, eqn (6.29). Firstly, we notice that there is
no ‘cross-talk’ between the polarization components along the x and y
directions. As we often encounter the situation where only one of these
components is non-zero, we are reassured that the full vector solution
does not ‘generate’ another transverse component—this justifies one of
the main assumptions of the scalar approximation. Secondly, we see that
in agreement with Section 1.12 the longitudinal component is smaller
than the transverse component for each plane wave in the representation
by the geometrical factors kx /kz and ky /kz , for the x and y components,
respectively. Equation (12.17) allows us to quantify the discussion of
how large an angular width a certain beam may have for the beam to
be considered transverse to a certain degree of approximation.
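Eqns (12.15)–(12.17) map directly onto a fast Fourier transform. The following is a minimal sketch rather than a definitive implementation: the function name, the grid and the beam parameters are all hypothetical, the discrete transform stands in for the continuous one, and plane waves with kz = 0 are simply dropped to avoid dividing by zero.

```python
import numpy as np

def propagate_vector_field(Ex0, Ey0, dx, wavelength, z):
    """Vector angular-spectrum propagation from z = 0; returns (Ex, Ey, Ez) at z."""
    k = 2 * np.pi / wavelength
    ny, nx = Ex0.shape
    kx = 2 * np.pi * np.fft.fftfreq(nx, d=dx)      # kx = 2 pi u
    ky = 2 * np.pi * np.fft.fftfreq(ny, d=dx)      # ky = 2 pi v
    KX, KY = np.meshgrid(kx, ky)
    # complex square root: evanescent components acquire an imaginary kz and decay
    kz = np.sqrt((k**2 - KX**2 - KY**2).astype(complex))
    Ax, Ay = np.fft.fft2(Ex0), np.fft.fft2(Ey0)    # angular spectra in the plane z = 0
    phase = np.exp(1j * kz * z)
    ok = np.abs(kz) > 1e-9 * k                     # skip grazing waves with kz = 0
    Az = np.zeros_like(Ax)
    Az[ok] = -(KX[ok] * Ax[ok] + KY[ok] * Ay[ok]) / kz[ok]   # transversality, k.E = 0
    return (np.fft.ifft2(phase * Ax),
            np.fft.ifft2(phase * Ay),
            np.fft.ifft2(phase * Az))

# Hypothetical example: x-polarized gaussian of radius 5 wavelengths
wavelength, w, N, dx = 1.0, 5.0, 256, 0.25
x = (np.arange(N) - N // 2) * dx
X, Y = np.meshgrid(x, x)
Ex0 = np.exp(-(X**2 + Y**2) / w**2)
Ey0 = np.zeros_like(Ex0)
Ex, Ey, Ez = propagate_vector_field(Ex0, Ey0, dx, wavelength, z=0.0)

ratio = np.abs(Ez).max() / np.abs(Ex).max()
expected = np.sqrt(2) * np.exp(-0.5) / (2 * np.pi * w / wavelength)
print(ratio, expected)
```

For this input the y component stays zero, which is the absence of cross-talk noted above, and the longitudinal component is zero on axis with a peak value close to the paraxial estimate 0.858/kw.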
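$$\mathbf{E}_R = \left[A_\theta^{(0)}(\theta, \phi)\, \hat{\theta} + A_\phi^{(0)}(\theta, \phi)\, \hat{\phi}\right] \frac{e^{ikr}}{r} , \qquad (12.18)$$
where
$$A_\theta^{(0)} = \frac{-ik}{2\pi}\left[\cos\phi\, A_x^{(0)}(k s_x, k s_y) + \sin\phi\, A_y^{(0)}(k s_x, k s_y)\right] , \qquad (12.19)$$
$$A_\phi^{(0)} = \frac{-ik}{2\pi} \cos\theta \left[-\sin\phi\, A_x^{(0)}(k s_x, k s_y) + \cos\phi\, A_y^{(0)}(k s_x, k s_y)\right] . \qquad (12.20)$$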
What we learn from these equations is that the amplitude of only one of
the plane-wave building blocks from the angular spectrum representation
contributes to the asymptotic radiation field in the direction ŝ; that with
wave vector parallel to this direction, i.e. with kx = sx k, ky = sy k, and
kz = sz k. The contributions from all other plane waves destructively
interfere along this direction. The asymptotic form of the radiated field can be written as11
$$\mathbf{E}_R(r\mathbf{s}) \sim E_0 \mathbf{F}(\mathbf{s})\, \frac{e^{ikr}}{r} \quad \text{as } kr \to \infty . \qquad (12.21)$$

11 The symbol ∼ is used to denote 'asymptotic to'. See Rhodes (1964) and Smith (1997) for further details of asymptotic methods in electromagnetism.

The (angular dependent) function E0 F(s) is referred to as the radiation
pattern. Note in particular that the radial dependence takes the form
of a spherical wave. The angular-spectrum method uses plane waves as
building blocks, but the asymptotic value of the field in a given direction
does not take the form of a plane wave.
We saw in Chapter 2, Fig. 2.1, that for a plane wave the surfaces
of constant phase are planar surfaces perpendicular to the direction of
propagation. We can think of these as geometrical wave fronts. The
geometrical rays are the vectors that denote the direction of energy flow.
For plane waves in vacuum the rays are parallel to the wave vector.
For the radiation field of eqn (12.18) the wave fronts are spherical and
the rays are in the radial direction, once again orthogonal to the wave fronts.12

Having spent some time on the far-field, asymptotic form of the light downstream of an aperture, we move on to the issue of describing the vector light fields that are achieved by focusing. But first, we augment the discussion of polarization states of light from Chapter 4, by introducing two new states of polarization.

12 It is fascinating to see some of the concepts of geometrical optics emerge from analysis of the far-field radiation pattern. Geometrical optics has been one of the longest studied topics in the physical sciences, certainly many centuries before the appearance of electromagnetic fields and Maxwell's equations. The reader interested in more details of the link between geometrical optics and the (vector) wave theory of light is directed to Born and Wolf's classic text, Born and Wolf (1999).

12.4 Radial/azimuthal modes

In Chapter 4 we used the linear and circular polarization bases extensively. Here, we extend the discussion of types of polarization states
of optical beams. We shall demonstrate that radial and azimuthal
polarization states of light can be generated by taking suitable
combinations of linearly polarized beams. These so-called cylindrical
vector beams are solutions of the vector wave equation that obey
cylindrical symmetry in both amplitude and polarization (Zhan, 2009).
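One way to see how such beams arise from linearly polarized ones is to combine two first-order modes. The sketch below assumes Hermite–gaussian-like profiles x e^{−ρ²/w²} and y e^{−ρ²/w²} as the building blocks; this particular choice, and the variable names, are illustrative assumptions rather than the construction used in the text.

```python
import numpy as np

w = 1.0                                   # hypothetical beam radius
x = np.linspace(-3 * w, 3 * w, 121)
X, Y = np.meshgrid(x, x)
envelope = np.exp(-(X**2 + Y**2) / w**2)
hg10 = X * envelope                       # first-order mode with a node along x = 0
hg01 = Y * envelope                       # first-order mode with a node along y = 0

# Components are stacked as (Ex, Ey)
E_radial = np.stack([hg10, hg01])         # x-polarized HG10 plus y-polarized HG01
E_azimuthal = np.stack([-hg01, hg10])     # x-polarized -HG01 plus y-polarized HG10

# Field direction relative to the radius vector (X, Y)
radial_cross = X * E_radial[1] - Y * E_radial[0]        # zero if E is parallel to rho
azimuthal_dot = X * E_azimuthal[0] + Y * E_azimuthal[1] # zero if E is perpendicular to rho
intensity = E_radial[0]**2 + E_radial[1]**2             # doughnut-shaped profile
```

Both superpositions share the same doughnut-shaped intensity, differing only in whether the field points along or around the radius vector.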
Note that neither the radial nor the azimuthal polarized beams can have a finite field at the origin,14 as there is a singularity in the direction of the polarization vector, similar to the zero density at the centre of a fluid vortex.15

14 Beams with intensity patterns similar to those depicted in Figs. 12.7 and 12.8 are frequently referred to as 'doughnut beams'.

15 Note that the most general cylindrical vector beam has a fixed direction of polarization with respect to the radial vector, and can be generated (Zhan, 2009) from a linear superposition of radial and azimuthal polarization.

12.5 High-NA focusing

In previous sections we have discussed the limitations of scalar diffraction theory, and showed how a beyond-paraxial approximation can be used to model vector fields. The fact that most of this book uses scalar theory shows to a certain extent that the vector addition is not often crucial. However, one particular field of modern optics where scalar diffraction theory is not valid, and that necessitates the use of a vector diffraction theory, is that of high numerical aperture (NA) optical focusing and imaging. In addition to being of fundamental interest, during this century there has been a burgeoning interest in the study of this phenomenon, as tightly focused fields have found a wide range of applications in, e.g. data storage, optical microscopy, optical tweezers16 and particle manipulation, drilling holes, to name but a few.

16 Further details can be found in Jones et al. (2015).
Most lenses have spherical surfaces, as these are by far the easiest to machine. A compound lens formed of many spherical surfaces will have many aberrations, especially if some of the rays are at large angles with respect to the optical axis. A notable exception is an aplanatic system, where 'aberration-free' imaging of the points located in the vicinity of the optical axis can be achieved.17

The starting equation for our analysis was written down and solved numerically for a plane wave illuminating a finite-diameter lens over half a century ago, by Wolf (1959) and Richards and Wolf (1959). However, in spite of the success of their vector angular spectrum method, the numerically intense nature of the integrals meant that not much attention was devoted to this topic. Early in this millennium, Youngworth and Brown (2000) published a paper that applied Richards and Wolf's method to the focusing of high-numerical-aperture cylindrical vector beams, and the topic has flourished since, largely as the power of modern computers renders calculation of complicated three-dimensional fields and intensities in the vicinity of a focus a tractable problem.18

17 We do not have space to discuss the details here, but there is a strong link between the features of an aplanatic lens and a concept known as the Abbe sine condition, named after Ernst Abbe. Systems are designed to fulfil Abbe's condition with the intention of reducing aberrations. An insightful discussion can be found in Wave Theory of Aberrations, Hopkins (1950).

18 This allows the design of, for example, complex optical longitudinal polarization structures, such as linked and knotted longitudinal vortex lines; see Maucher et al. (2018).
$$\mathbf{E}_P = \frac{-ik}{2\pi} \int\!\!\int_\Omega \frac{\mathbf{a}_1}{s_z}\, e^{ik(s_x x + s_y y + s_z z)}\, ds_x\, ds_y . \qquad (12.29)$$
The integration is over the solid angle, Ω, subtended by the lens at the
focus. The particularly simple form of the phase factor in the exponential
is a consequence of modelling an aplanatic aberration-free lens. The
element of solid angle, dsx dsy /sz , can also be written as (Richards and
Wolf, eqn (2.25))
dsx dsy /sz = dΩ = sin θdθdφ. (12.30)
The amplitude function a1 behind the lens is derived from the amplitude
before the lens by the equation
$$\mathbf{a}_1 = f \sqrt{\cos\theta}\, f(\theta) \left[E_r\, \hat{g}_0 + E_\phi \left(\hat{g}_0 \times \hat{k}\right)\right] . \qquad (12.31)$$
The factor $\sqrt{\cos\theta}$ arises from energy conservation considerations; see
Richards and Wolf eqn (2.13), and the end-of-chapter exercises. It is
convenient to use cylindrical coordinates (ρP , φP , zP ) for point P , with
the origin at the paraxial focus.
From eqn (12.29) we derive expressions for the cartesian components
of the electric field in the vicinity of the focus, where
The first two are easily interchanged for the radial and azimuthal
components:
$$A = \frac{k f E_0}{2} = \frac{\pi f E_0}{\lambda} . \qquad (12.37)$$
The integrals that appear in the above expressions are given by:
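$$I_0 = \int_0^\alpha f(\theta) \sqrt{\cos\theta}\, \sin\theta\, (1 + \cos\theta)\, J_0(k\rho_P \sin\theta)\, e^{ikz_P \cos\theta}\, d\theta , \qquad (12.38)$$
$$I_1 = \int_0^\alpha f(\theta) \sqrt{\cos\theta}\, \sin^2\theta\, J_1(k\rho_P \sin\theta)\, e^{ikz_P \cos\theta}\, d\theta , \qquad (12.39)$$
$$I_2 = \int_0^\alpha f(\theta) \sqrt{\cos\theta}\, \sin\theta\, (1 - \cos\theta)\, J_2(k\rho_P \sin\theta)\, e^{ikz_P \cos\theta}\, d\theta . \qquad (12.40)$$
$$f(\theta) = e^{-(x^2 + y^2)/w^2} = e^{-f^2 \sin^2\theta/w^2} , \qquad (12.41)$$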
where we have assumed that the waist of the (broad) gaussian beam is
on the front surface of the lens.
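The integrals of eqns (12.38)–(12.40) are one-dimensional and can be evaluated by simple quadrature. The sketch below is one possible approach, not the method used to produce the figures: the Bessel functions are computed from Bessel's integral so that only NumPy is needed, the helper names are hypothetical, and the maximum angle α (set here by an assumed numerical aperture of 0.9 in air) is not taken from the text.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezium-rule integral over the last axis."""
    return np.sum(0.5 * (y[..., 1:] + y[..., :-1]) * np.diff(x), axis=-1)

def bessel_j(n, x, samples=801):
    """J_n(x) from Bessel's integral, (1/pi) * int_0^pi cos(n t - x sin t) dt."""
    t = np.linspace(0.0, np.pi, samples)
    x = np.asarray(x, dtype=float)
    return trapezoid(np.cos(n * t - x[..., None] * np.sin(t)), t) / np.pi

def focal_integrals(k_rho, k_z, alpha, f_over_w, samples=1001):
    """I0, I1, I2 of eqns (12.38)-(12.40), with k_rho = k*rho_P and k_z = k*z_P."""
    theta = np.linspace(0.0, alpha, samples)
    s, c = np.sin(theta), np.cos(theta)
    apod = np.exp(-(f_over_w * s)**2)              # gaussian apodization, eqn (12.41)
    common = apod * np.sqrt(c) * np.exp(1j * k_z * c)
    arg = k_rho * s
    I0 = trapezoid(common * s * (1 + c) * bessel_j(0, arg), theta)
    I1 = trapezoid(common * s**2 * bessel_j(1, arg), theta)
    I2 = trapezoid(common * s * (1 - c) * bessel_j(2, arg), theta)
    return I0, I1, I2

alpha = np.arcsin(0.9)        # assumed maximum half-angle
f_over_w = 1 / 0.4            # from w/f = 0.4
I0, I1, I2 = focal_integrals(0.0, 0.0, alpha, f_over_w)
print(abs(I0), abs(I1), abs(I2))
```

At the geometric focus only I0 survives, because J1 and J2 vanish at zero argument; away from the axis all three integrals contribute.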
Figure 12.10 shows plots of Ex , Ey , and Ez , as well as the total intensity,
in the plane zP = 0. We have chosen the parameter w/f = 0.4. As
expected, the x-component dominates; the focal spot is compact but
not fully symmetric (it is elongated along the x axis). However, we also
see a weaker longitudinal Ez component, with a bimodal distribution.
From symmetry arguments it is easy to argue why the longitudinal
component has to be zero at the origin, but it is finite to either side.
There is also an even weaker Ey component, that arises from ‘cross talk’
between the directions when the rays are refracted by the lens. This has
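$$E_{\rho_P} = A \int_0^\alpha f(\theta) \sqrt{\cos\theta}\, \sin 2\theta\, J_1(k\rho_P \sin\theta)\, e^{ikz_P \cos\theta}\, d\theta , \qquad (12.42)$$
$$E_{z_P} = 2iA \int_0^\alpha f(\theta) \sqrt{\cos\theta}\, \sin^2\theta\, J_0(k\rho_P \sin\theta)\, e^{ikz_P \cos\theta}\, d\theta . \qquad (12.43)$$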
0
Fig. 12.11 Field distributions for a
focused radially polarized beam, see
Fig. 12.7, in the focal plane. The radial
From symmetry, the azimuthal component is zero everywhere. Fig- and axial polarization states, Eρ and
ures 12.11 and 12.12 show plots of Eρ and Ez in the xy plane with zP = 0, Ez , are shown. The parameters are the
and Iρ and Iz in the xz plane for y = 0, respectively. As in Fig. 12.10, we same as in Fig. 12.10.
have chosen w/f = 0.4. Figure 12.7 illustrates why on-axis the field has
to be purely longitudinal. The spot size is compact; indeed, one of the most prominent early investigations of radially polarized light showed that a smaller focal spot size could be obtained in comparison with linearly polarized light (Dorn et al. 2003). Figure 12.11 also hints that the strongest contribution to the longitudinal focal component of the radial beam comes from rays which are tipped most by the lens. This can be confirmed by using an annular mask which apodizes the input such that only light within a restricted range of θ is transmitted.
The peak intensities of the radial and longitudinal components occur at different locations, and for our parameters their ratio is 5. Clearly, by using a radial beam and a high numerical aperture lens, the form of the field at focus is very different from the prediction of scalar wave theory. This is not remotely surprising, as the conditions are those under which the scalar approximation is expected to fail.
$$E_{\varphi_P} = 2A \int_0^{\alpha} f(\theta) \sqrt{\cos\theta}\, \sin\theta\, J_1(k\rho_P \sin\theta)\, e^{ikz_P \cos\theta}\, d\theta . \tag{12.44}$$
From symmetry, the radial and longitudinal components are both zero.
Figure 12.13 shows plots of Eφ in the plane zP = 0 and the plane x = 0.
We have used the same parameters as previously, i.e. w = 40λ and the
focal length is f = 100λ . From symmetry it is evident that there can
be no field on-axis, and that there cannot be a longitudinal component.
To finish this section, we note that our treatment has assumed that
the waves behind the lens and in the focal region were in air. In many
high-resolution microscopes a transparent fluid with high refractive
index is used, and oil-immersion lenses are practically ubiquitous for
most applications. The numerical aperture is increased, allowing for
higher resolution. Our aim in this section was to give a flavour of
how vector-light formalism can be used to describe the tight focusing
of different light beams. This is a field of modern optics where there is
considerable activity, and we direct the reader to recent review articles—
Chen et al. (2012), Brown (2011)—and the research literature to see
what researchers are doing at the cutting edge of this vibrant field.
Chapter summary
Exercises
(12.1) Ratio of longitudinal to transverse fields for a gaussian beam
Verify the statement in the text after eqn (12.7), that the ratio of longitudinal to transverse fields for a gaussian beam is $\sqrt{2}\,e^{-1/2}/kw = 0.858/kw = 0.137\,\lambda/w$.

(12.2) Importance of the non-transverse field for a beam
By interpreting w in the expansion parameter 1/kw generally as a measure of the width of a beam, comment on the importance of the non-transverse component of the field for the following: (i) a 1 m wide beam at the entrance of a telescope; (ii) a He–Ne laser beam (λ = 0.633 μm) expanded to a width of 10 cm; and (iii) a diode laser (λ = 0.780 μm) focused to 25 μm.

(12.3) Magnetic field in the paraxial limit
For paraxial vector fields where the longitudinal component is given by eqn (12.3), show that the magnetic field is given by $i\omega\mathbf{B} = \nabla \times \mathbf{E}$. Verify that these solutions are consistent with the Maxwell equations $\nabla \cdot \mathbf{E} = 0$ and $\nabla \cdot \mathbf{B} = 0$.

(12.4) Non-existence theorem (1): Pure transverse electric and magnetic optical beams do not exist
This question is based on Lekner (2003), Section 2.1. A purely transverse electric and magnetic beam would have fields of the form
$$\mathbf{E} = E_x \hat{x} + E_y \hat{y} ,$$
and
$$\mathbf{B} = B_x \hat{x} + B_y \hat{y} ,$$
with a time dependence of $e^{-ickt}$. Substitute these solutions into Maxwell's equations to show that the electric field components are governed by the equations
$$\frac{\partial^2 E_x}{\partial z^2} + k^2 E_x = 0 ,$$
and
$$\frac{\partial^2 E_y}{\partial z^2} + k^2 E_y = 0 .$$
Show that propagating solutions to these two equations will be of the form $E_x/E_0 = e^{ikz} F(x, y)$ and $E_y/E_0 = e^{ikz} G(x, y)$, and that both F and G are harmonic solutions, i.e. subject to the equations
$$\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\right) F = 0 ,$$
and
$$\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\right) G = 0 .$$
Therefore neither F nor G can localize the field about the axis and constitute a beam.

(12.5) Non-existence theorem (2): Beams of fixed linear polarization do not exist
This question is based on Lekner (2003), Section 2.2. If a beam of fixed linear polarization existed, we could write it in the form $\mathbf{E} = E_x e^{i(kz-\omega t)} \hat{x}$. Using the two curl Maxwell equations show that these imply that
$$\frac{\partial^2 E_x}{\partial y^2} + \frac{\partial^2 E_x}{\partial z^2} + k^2 E_x = 0 ,$$
$$\frac{\partial^2 E_x}{\partial x\,\partial y} = \frac{\partial^2 E_x}{\partial x\,\partial z} = 0 .$$
These equations imply that Ex must be a function of y and z but not x, and thus cannot be a localized beam solution along x.

(12.6) Non-existence theorem (3): Beams which are everywhere circularly polarized in a fixed plane do not exist
This question is based on Lekner (2003), Section 2.3. We can write a circularly polarized beam in the form $\mathbf{E} = (E_x \hat{x} + E_y \hat{y})\, e^{-ickt}$, where Ex and Ey have the same magnitude and are a quarter of a cycle out of phase. Using the same mathematical procedure as the last question show that the functional form of Ex (and hence Ey) has to be of the form Ex = Ex(x, z), and Ex = Ex(y, z). This implies that Ex can only be a function of z, and thus cannot be a localized beam solution in the x, y plane.

(12.7) Magnetic field in the vector angular spectrum formalism (1)
For the electric field solutions of eqns (12.15)–(12.17) find expressions for the (vector) magnetic field.

(12.8) Magnetic field in the vector angular spectrum formalism (2)
In the text we considered the vector angular spectrum method when we specified a tangential electric field in an aperture in the z = 0 plane. This is sometimes referred to as the TE case. We could also specify a tangential magnetic field in this plane, the so-called TM case. Find expressions for the (vector) electric and magnetic fields for this case in terms of the Fourier transforms of $B_x^{(0)}$ and $B_y^{(0)}$, where $\mathbf{B}(x, y, z = 0) = B_x^{(0)} \hat{x} + B_y^{(0)} \hat{y}$.

(12.9) Vector angular spectrum (1)
For the electric field solutions of eqns (12.15)–(12.17) evaluate the fields in the plane z = 0. Comment on your result.

(12.10) Vector angular spectrum (2)
For the z-component of the electric field solution of eqn (12.17) there are terms proportional to kx and ky. Recalling the result of Appendix B about the Fourier transform of a derivative, eqn (B.24), explain why, in the context of Section 12.2, there is an inevitability to this result.

(12.11) Vector angular spectrum (3)
The electric field solutions of eqns (12.15)–(12.17) are extremely compact, owing to the Fourier notation. To see why these equations did not find much utility until the advent of efficient Fourier algorithms on powerful contemporary computers, it is instructive to rework these equations, retaining the integrals. Rewrite each of the equations explicitly as two pairs of double integrals.

(12.12) Hermite–Gauss modes
Use the Fourier transform of a derivative, eqn (B.24), to show that if a TEM00 mode is a solution to the hedgehog equation, eqn (6.29), then the Hermite–Gauss mode, eqn (12.22), is also a solution.

(12.13) Magnetic component of radiated field
For the asymptotic radiated field of eqn (12.18), what is the form of the magnetic field?

(12.14) Fields before and after an aplanatic lens
By considering the energy transported along a ray parallel to the optical axis before the lens, show that the field strength after the lens must be modified by a factor $\sqrt{\cos\theta}$.

(12.15) Fields at the focus of a gaussian beam (1)
By considering the geometry of the refraction of the rays, explain why the focal field of an x polarized gaussian beam has a bimodal pattern for the z-component of electric field, and a quadrupolar symmetry for the field component along y.

(12.16) Fields at the focus of a gaussian beam (2)
Generate your own version of Fig. 12.10. Study what happens as the width of the input gaussian beam is varied. What happens when the initial width greatly exceeds the aperture of the lens? Explain your result.

(12.17) Fields at the focus of a radial beam (1)
Generate your own version of Fig. 12.11.

(12.18) Fields at the focus of a radial beam (2)
Consider the effect of an annular aperture. Restrict the range of integration of θ to the interval from βα to α, where 0 ≤ β ≤ 1. Plot a graph of the full width at half maximum of the spot size versus β. Comment on your result.

(12.19) Fields at the focus of an azimuthal beam (1)
Generate your own version of Fig. 12.13.

(12.20) Fields at the focus of an azimuthal beam (2)
By drawing a figure similar to Fig. 12.8 for the case of azimuthal polarization for the input, explain why the axial field is zero, and the focused field has to be transverse.

(12.21) Fields at the focus (1)
We have used the results of Richards and Wolf to calculate the vector field in the vicinity of the focus of a high numerical aperture lens. Quantify the concept of 'in the vicinity'. What assumptions of the model break down in other spatial regions?

(12.22) Fields at the focus (2)
Use the results presented in this chapter to produce a version of Fig. 12.14 for light linearly polarized along y. How does the size of the focal spot in the x and y directions compare to the prediction of paraxial theory?
Optics f2f: From Fourier to Fresnel. Charles S. Adams and Ifan G. Hughes
© Charles S. Adams and Ifan G. Hughes 2019.
Published in 2019 by Oxford University Press. DOI: 10.1093/oso/9780198786788.001.0001
This expression contains two important results. First, the phase of the propagating wave evolves with a phase change per unit length of $k' = nk$, where
$$n = 1 + \frac{1}{2}\chi' = 1 + \frac{N\alpha'}{2\epsilon_0} , \tag{13.13}$$
is the refractive index.$^5$ The refractive index is related to the real part of the polarizability, $\alpha'$, or equivalently, the real part of the susceptibility, $\chi'$. Secondly, the amplitude of the wave is attenuated at a rate $k\chi''/2$. Hence, the real and imaginary parts of the polarizability (or susceptibility) determine the refractive and scattering properties of the medium, respectively. The distinguishing feature of the real and imaginary parts is whether the dipole oscillates in phase, quadrature, or anti-phase with the incident field. The real part oscillates predominantly either in phase or anti-phase (with a π phase lag) relative to the driving field. The imaginary part oscillates in quadrature (with a π/2 phase lag) relative to the driving field, see Fig. 13.5.
$^5$ In Exercise 13.2 we consider whether it makes sense to define a refractive index over a distance scale less than a wavelength. For most transparent optical media, like glass or water, the light frequency is less than the resonance frequency, $\omega < \omega_0$, $\alpha' > 0$, and the refractive index is greater than one.
The above derivation gives an important insight into the apparent
change in the speed of light, v = c/n, inside a medium. Light still
propagates at c but interference between the incident field and the phase-
shifted induced dipolar field, eqn (13.9) results in a wave with a modified
wavelength and a different apparent propagation speed.
13.3 Ewald–Oseen extinction 217
Example 13.1
The Bouguer–Beer–Lambert law: The second important result in eqn (13.12)
concerns the imaginary part of the polarizability or susceptibility. The imaginary
part modifies the amplitude of the wave (via extinction in the forward direction).
The intensity $I = \frac{1}{2} c\epsilon_0 |E|^2$ as a function of propagation distance is
$$\delta E_d^{(z,z')} = \frac{1}{2}\, i\chi k\, \delta z'\, E^{(z')} e^{ik(z-z')} ,$$
where $E^{(z')}$ is the total field in the $z'$ plane (including the dipolar fields from other slabs) and $e^{ik(z-z')}$ is the propagation phase.$^6$ The sum of the forward- and backward-propagating dipolar fields is
$$E_d^{(z)} = i\frac{k\chi}{2} \int_0^{z} dz'\, E^{(z')} e^{ik(z-z')} + i\frac{k\chi}{2} \int_z^{\infty} dz'\, E^{(z')} e^{-ik(z-z')} , \tag{13.15}$$
and the total field in plane z is a sum of the incident field plus the induced dipole fields,
$$E^{(z)} = E_i^{(z)} + E_d^{(z)} . \tag{13.16}$$
$^6$ For an infinite slab the near-field angular terms in the dipolar field average to zero. A derivation including the near-field terms can be found in Fearn et al. (1996).
Fig. 13.7 The forward- and backward-propagating fields in an extended medium. The field experienced by a dipole in any plane is the sum of the incident field plus the forward- and backward-propagating dipolar fields.
Substituting a trial solution of the form $E^{(z)} = E_0' e^{ik'z}$ into eqn (13.15) and integrating along z over the whole extent of the medium, we obtain an expression for the dipole component,
$$E_d^{(z)} = i\frac{k\chi}{2} e^{ikz} \int_0^{z} dz'\, E_0' e^{i(k'-k)z'} + i\frac{k\chi}{2} e^{-ikz} \int_z^{\infty} dz'\, E_0' e^{i(k'+k)z'}$$
$$= \frac{k\chi E_0'}{2(k'-k)}\, e^{ikz} \left[ e^{i(k'-k)z} - 1 \right] - \frac{k\chi E_0'}{2(k'+k)}\, e^{-ikz} e^{i(k'+k)z}$$
$$= \frac{k\chi E_0'}{2(k'-k)} \left( e^{ik'z} - e^{ikz} \right) - \frac{k\chi E_0'}{2(k'+k)}\, e^{ik'z} , \tag{13.17}$$
where we have assumed that there is no contribution from the limit z → ∞. Substituting this result into eqn (13.16) and adding the incident field we obtain a total field,
$$E_0' e^{ik'z} = \left[ E_0 - \frac{k\chi E_0'}{2(k'-k)} \right] e^{ikz} + \frac{k^2 \chi E_0'}{k'^2 - k^2}\, e^{ik'z} . \tag{13.18}$$
Equating terms, we find that
$$\frac{k^2 \chi}{k'^2 - k^2} = 1 , \tag{13.19}$$
which gives the ratio of the new to old angular spatial frequency,
$$\frac{k'}{k} = \sqrt{1 + \chi} = n , \tag{13.20}$$
where n is the refractive index.$^7$ This result agrees with our previous result, eqn (13.13), when the dipolar field is small, $|\chi| \ll 1$. From the $e^{ikz}$ term we get
$$E_0' = \frac{2E_0 (k'-k)}{k\chi} , \tag{13.21}$$
which by substituting for χ, we can rearrange as
$$E_0' = \frac{2E_0 (k'-k)}{(k'^2 - k^2)/k} = \frac{2E_0}{k'/k + 1} = \frac{2E_0}{n+1} . \tag{13.22}$$
$^7$ Note that in macroscopic theory the relationship between refractive index and susceptibility is obtained trivially by definition. The speed of light, c/n, is assumed to arise due to the polarization induced by the incident field, $P = \chi\epsilon_0 E$. Using $D = \epsilon_r \epsilon_0 E = \epsilon_0 E + P$ and rearranging for P, we find $P = (\epsilon_r - 1)\epsilon_0 E = (n^2 - 1)\epsilon_0 E$, where for a non-magnetic medium with $\mu_r = 1$, $n = \sqrt{\epsilon_r}$, we obtain $\chi = n^2 - 1$, as in the microscopic theory.
The reflected field is
$$E_r = E_0 - \frac{2E_0}{n+1} = E_0\, \frac{n-1}{n+1} . \tag{13.23}$$
This is the same as the Fresnel reflection coefficient for normal incidence, see Chapter 2. Finally, substituting in eqn (13.18) we find that the induced dipolar field is
$$E_d^{(z)} = -E_0 e^{ikz} + \frac{2}{n+1} E_0 e^{inkz} . \tag{13.24}$$
This remarkable result, known as the Ewald–Oseen extinction theorem,$^8$ states that the dipole field consists of two terms:
(1) The first term exactly cancels the incident field; and
(2) The second term is an equivalent wave with amplitude $2E_0/(n+1)$ that propagates with angular wave vector $k' = nk$.
$^8$ Paul Peter Ewald (Berlin 1888–Ithaca 1985), Carl Wilhelm Oseen (Lund 1879–Uppsala 1944).
13.4 Clausius–Mossotti
So far we have assumed that the atoms or molecules in the medium do
not interact except indirectly via the propagating field. In this case,
the microscopic response characterized by the polarizability and the
macroscopic response, characterized by the susceptibility are linearly
related. However in a dense medium, the near-field part of the induced
dipolar field, see Section A.3, may begin to influence the light–matter
interaction.
Consider a small spherical void inside a homogeneous dielectric, as in Fig. 13.8. If the polarization P in the medium is uniform and aligned with the z axis then using Gauss' law we can say that the charge density on the surface of the void, σ, is given by
$$\int \mathbf{E} \cdot d\mathbf{S} = \frac{1}{\epsilon_0} \int \sigma\, dS .$$
For an annulus of charge at an angle θ we have
$$\frac{P}{\epsilon_0} \cos\theta\, dS = -\frac{\sigma}{\epsilon_0}\, dS ,$$
so $\sigma = -P\cos\theta$. The area of the annulus is $r\,d\theta\, 2\pi r \sin\theta$. Using Coulomb's law to sum the z-component of the field at the centre of the void due to surface charge (the field of each charge element points from the surface towards the centre, which contributes a factor $-\cos\theta$), we find
$$E_z = -\frac{1}{4\pi\epsilon_0 r^2} \int_0^{\pi} d\theta\, \sigma\, 2\pi r^2 \sin\theta \cos\theta = \frac{P}{3\epsilon_0} .$$
Fig. 13.8 The back action of dipoles on each other is modelled as the mean field at the centre of a void due to the surrounding medium.
The total field at a particular location, called the 'local' field, is equal to the sum of the incident field plus the dipolar field produced by all the other dipoles,
$$E_{\rm loc} = E + \frac{P}{3\epsilon_0} , \tag{13.25}$$
where P = N d is the polarization density. The susceptibility determines the bulk response, $P = \epsilon_0 \chi E$, whereas the polarizability determines the local response, $P = N\alpha E_{\rm loc}$. Substituting for $E_{\rm loc}$ and P we find that
$$\chi = \frac{N\alpha/\epsilon_0}{1 - \frac{1}{3} N\alpha/\epsilon_0} . \tag{13.26}$$
This equation relates the macroscopic variable χ to the microscopic (or single atom) parameter α and is known as the Lorentz–Lorenz law.$^9$ If we rewrite χ in terms of the refractive index, $\chi = n^2 - 1$, eqn (13.20), we find how the refractive index varies with number density N,
$$n^2 = 1 + \frac{N\alpha/\epsilon_0}{1 - \frac{1}{3} N\alpha/\epsilon_0} . \tag{13.27}$$
$^9$ Derived by Ludwig Lorenz (Helsingør 1829–Copenhagen 1891) in 1869 and Hendrik Anton Lorentz (Arnhem 1853–Haarlem 1928) in 1880. Lorentz won the Nobel Prize in 1902 together with Pieter Zeeman (Zonnemaire 1865–Amsterdam 1943) for the discovery and explanation of the Zeeman effect, and is also known for the Lorentz transformation.
At low density the second term is small and we recover the dilute result, eqn (13.13), whereas at higher density the term in the denominator
$$P = \frac{1}{2} c\epsilon_0 \frac{|d|^2 k^4}{(4\pi\epsilon_0)^2} \int_0^{\pi} 2\pi \sin^3\theta\, d\theta = \frac{|d|^2 c k^4}{3\,(4\pi\epsilon_0)} . \tag{13.43}$$
In the language of particles, we would say that the incoming photons are scattered by the dipole; or as the dipole radiates, it removes energy from the beam via scattering. Substituting for P in eqn (13.42), using $I_0 = \frac{1}{2} c\epsilon_0 E_0^2$ and $|d| = |\alpha| E_0$, we find that the scattering cross section is
$$\sigma = \frac{8\pi}{3} \frac{|\alpha|^2}{(4\pi\epsilon_0)^2}\, k^4 . \tag{13.44}$$
This is a useful result in two regimes. Firstly, when the light frequency is far off resonance, where the wavelength dependence explains why the sky is blue, and secondly on resonance.
Example 13.2
Off-resonance scattering: the blue sky: Many atoms or molecules have resonances in the ultra-violet, so for visible light, the light frequency is much less than the resonance frequency, $\omega \ll \omega_0$. This is the case for light travelling through air or other transparent media such as water or glass. In this case the polarizability, eqn (C.12), becomes frequency independent, $\alpha \approx 2D_0^2/(\hbar\omega_0)$, see eqn (C.16) in Appendix C, and the scattering cross section, eqn (13.44), is inversely proportional to the fourth power of the wavelength,
$$\sigma = \frac{8\pi}{3} \frac{|\alpha|^2}{(4\pi\epsilon_0)^2} \left( \frac{2\pi}{\lambda} \right)^4 = \frac{8\pi^3 |\alpha|^2}{3\epsilon_0^2} \frac{1}{\lambda^4} . \tag{13.45}$$
This is known as Rayleigh scattering and is responsible for the blue sky. Shorter wavelength (e.g. blue light) is scattered more than longer wavelength (red light), as in Fig. 13.12. If there are N molecules per unit volume (and we can neglect intermolecular interactions) then the light intensity after propagating a distance z through the atmosphere is $I = I_0 e^{-N\sigma z}$. For white light, the directly transmitted component contains more red than blue, which is particularly apparent when we look at the Sun at sunrise or sunset (see end-of-chapter exercise).
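As a quick illustration of the λ⁻⁴ law of eqn (13.45), the sketch below compares the scattering cross sections for blue and red light. The wavelengths 0.45 μm and 0.65 μm are those quoted in Exercise 13.3, and the function name `rayleigh_ratio` is our own. The polarizability cancels in the ratio, provided it is the same at both wavelengths.

```python
def rayleigh_ratio(lambda_short, lambda_long):
    """Ratio sigma(short)/sigma(long) from the 1/lambda^4 law of eqn (13.45)."""
    return (lambda_long / lambda_short) ** 4

ratio = rayleigh_ratio(0.45e-6, 0.65e-6)   # blue versus red
print(f"blue/red cross-section ratio = {ratio:.2f}")
```

Blue light is therefore scattered roughly four times more strongly than red light.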
Example 13.3
Resonant scattering: The other interesting case is on resonance. Substituting the resonant polarizability, eqn (C.12) with Δ = 0, giving $\alpha = i6\pi\epsilon_0 k^{-3}$, in eqn (13.44), we find that the resonant cross section (the effective area removed from the incident beam due to destructive interference in the forward direction) is
$$\sigma = \frac{3\lambda^2}{2\pi} . \tag{13.46}$$
This is a surprisingly simple result, and says that the resonance cross section does not depend on the dipole moment or the scattering rate; it only depends on the wavelength. As we shall see in Example 13.4, the same result holds for any point-like scatterer with dimensions much less than the wavelength. However, if the scattered field is isotropic then there is less overlap between the incident field and the dipolar field in the forward direction, and we obtain a smaller cross section, $\sigma = \lambda^2/2\pi$.
Fig. 13.12 Sunlight propagating from left to right contains red (light grey) and blue (dark grey) photons. The blue sees a larger scattering cross-section, leaving a larger component of red in the directly transmitted light.
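The step from eqn (13.44) to eqn (13.46) can be confirmed numerically. The sketch below assumes an illustrative wavelength of 780 nm and uses the resonant polarizability α = i6πε₀/k³ quoted above; the function name `cross_section` is our own.

```python
import math

EPS0 = 8.8541878128e-12                      # vacuum permittivity (F/m)

def cross_section(alpha, k):
    """Scattering cross section of eqn (13.44)."""
    return (8 * math.pi / 3) * abs(alpha) ** 2 / (4 * math.pi * EPS0) ** 2 * k ** 4

wavelength = 780e-9                          # illustrative choice (m)
k = 2 * math.pi / wavelength
alpha_res = 1j * 6 * math.pi * EPS0 / k ** 3 # resonant polarizability
sigma = cross_section(alpha_res, k)
sigma_expected = 3 * wavelength ** 2 / (2 * math.pi)   # eqn (13.46)
print(sigma, sigma_expected)
```

The two numbers agree to rounding error, and the permittivity drops out, as expected for a result that depends only on the wavelength.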
Example 13.4
The optical theorem: A more general result for the scattering cross section that does not assume anything about dipoles, and also applies to quantum scattering of massive particles, is known as the optical theorem. Consider a point-like scatterer in a plane wave, as shown in Fig. 13.13. The incident wave is a scalar monochromatic plane wave, propagating along the z axis, $E_i = E_0 e^{ikz}$ (neglecting the explicit time dependence). The scatterer behaves like a point source that, in the far field, radiates a spherical wave of the form
$$E_d = E_0 f(\theta) \frac{e^{ikr}}{r} , \tag{13.47}$$
where f(θ) contains information about the relative phase and angular distribution of the dipolar field, θ is the angle relative to the z axis, and we are assuming that the scattering is cylindrically symmetric.$^{18}$ The total field is
$$E = E_0 e^{ikz} + E_0 f(\theta) \frac{e^{ikr}}{r} . \tag{13.48}$$
$^{18}$ If the scatterer is an atomic dipole, cylindrical symmetry only applies to the case of excitation using circularly polarized light.
Fig. 13.13 A point-like scatterer illuminated by a plane wave, $E_0 e^{ikz}$, generates a scattered field, $E_0 f(\theta) e^{ikr}/r$. The total field is the sum of the incident field and the scattered field.
If we are only interested in 'forward scattering', in the paraxial regime we can make the following approximations, ($\theta \ll 1$; $x, y \ll z$), and write
$$\frac{E}{E_0} = e^{ikz} \left[ 1 + f(0) \frac{e^{ik\rho^2/2z}}{z} \right] , \tag{13.49}$$
where $\rho^2 = x^2 + y^2$. The normalized intensity is
$$\frac{I}{I_0} = 1 + f(0) \frac{e^{ik\rho^2/2z}}{z} + f^*(0) \frac{e^{-ik\rho^2/2z}}{z} + \ldots . \tag{13.50}$$
To find out how much light the dipole has removed we integrate over a disk with radius R in the far field ($R \ll z$). The integral of the quadratic phase factor gives
$$\frac{1}{z} \int_0^{R} e^{ik\rho^2/2z}\, 2\pi\rho\, d\rho = \frac{2\pi}{z} \left[ \frac{e^{ik\rho^2/2z}}{ik/z} \right]_0^{R} = i\frac{2\pi}{k} ,$$
where we have assumed that $kR^2/2z \gg 2\pi$ such that the exponential oscillates so fast around ρ = R that we can take the average, which is zero. The integral gives
$$\frac{P}{I_0} = \pi R^2 + i\frac{2\pi}{k} \left[ f(0) - f^*(0) + \ldots \right] . \tag{13.51}$$
Using $i[f(0) - f^*(0)] = -2\,\mathrm{Im}[f(0)]$ we can write
$$\frac{P}{I_0} = \pi R^2 - \frac{4\pi}{k}\, \mathrm{Im}[f(0)] . \tag{13.52}$$
This is known as the optical theorem. If $\mathrm{Im}[f(0)] > 0$ the effect of the dipole is to reduce the effective area of the beam by an amount
$$\sigma = \frac{4\pi}{k}\, \mathrm{Im}[f(0)] . \tag{13.53}$$
This is known as the optical or extinction cross section. The subtle part of this derivation is the factor of 1/i that crept in when we performed the integral; it is the same 1/i factor that arises in the Fresnel diffraction integral and the Gouy phase.
Resonant extinction: As above, we now apply this result to the case of a dipole driven on resonance. To find f(0) we use the equation for the induced dipolar field, eqn (13.40), with $d = \alpha E_0$ and $\alpha = i6\pi\epsilon_0/k^3$ on resonance, i.e.
$$E_d = i\frac{3}{2} \frac{1}{kr} E_0 e^{ikr} . \tag{13.54}$$
Comparing to eqn (13.47) we find $\mathrm{Im}[f(0)] = 3/2k$ and substituting in eqn (13.53)
$$\sigma = \frac{4\pi}{k} \frac{3}{2k} = \frac{3\lambda^2}{2\pi} , \tag{13.55}$$
as in Example 13.3.
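The averaging argument behind the factor i2π/k can be tested numerically. The sketch below assumes λ = 1 and a hypothetical propagation distance z = 1000λ. It evaluates the disk integral by the midpoint rule after substituting u = ρ², for eight disk radii whose edge phases are spread evenly over one full cycle, and then averages the results. The function name `disk_integral` is our own.

```python
import cmath
import math

def disk_integral(R, z, k, steps=20000):
    """(1/z) * int_0^R exp(i k rho^2 / 2z) 2 pi rho d rho, using u = rho^2."""
    U = R * R
    h = U / steps
    a = k / (2 * z)
    total = sum(cmath.exp(1j * a * (i + 0.5) * h) for i in range(steps))
    return math.pi * total * h / z

k, z = 2 * math.pi, 1000.0             # lambda = 1, hypothetical distance
period = 4 * math.pi * z / k           # change in R^2 that advances the edge phase by 2 pi
R2_start = 10 * period                 # ensures k R^2 / 2z >> 2 pi
samples = 8
avg = sum(disk_integral(math.sqrt(R2_start + j * period / samples), z, k)
          for j in range(samples)) / samples
print(avg, 2j * math.pi / k)
```

Each individual integral depends on R, but the average over one cycle of the edge phase reproduces i2π/k, which is the value used in eqn (13.51).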
13.8 Metals
Spieglein, Spieglein an der Wand,
Wer ist die Schönste im ganzen Land.
Schneewittchen, Jacob (1785–1863) and Wilhelm Grimm (1786–1859).

Why is a metal shiny?$^{20}$ In this section we shall look at the optical response of metals as a function of the frequency of the electromagnetic field. In a metal, electron motion generates a current which creates a source term in Maxwell equations (1.2)–(1.3). For linearly polarized light, we can rewrite the scalar wave equation (1.31) as
$$\nabla^2 E - \frac{1}{c^2} \frac{\partial^2 E}{\partial t^2} = \mu_0 \frac{\partial J}{\partial t} , \tag{13.59}$$
$^{20}$ The simple answer is that light is reflected because the electron motion inside the metal cancels the field at the surface. However, the field does not decay instantaneously to zero. The main difference between metals and insulators is that the electrons are free and the electromagnetic field drives oscillatory currents rather than dipoles. For small nanoscale metallic samples the charges are bound to the sample and a dipolar resonance does occur.
$$\nabla^2 E - \frac{1}{c^2} \frac{\partial^2 E}{\partial t^2} = \frac{\omega_p^2}{c^2} E . \tag{13.61}$$
Substituting $E = E_0 e^{-i\omega t}$ we obtain a Helmholtz equation,
$$\nabla^2 E_0 + \frac{1}{c^2} (\omega^2 - \omega_p^2) E_0 = 0 . \tag{13.62}$$
From this expression we see that $\omega_p$ corresponds to the light frequency at which the source field due to electron motion becomes as large as the driving field. For silver and copper, the plasma frequencies are approximately 2.2 × 10^15 Hz and 2.6 × 10^15 Hz, respectively, which are deep into the ultra-violet region of the spectrum (close to 100 nm), about four times larger than the frequency of visible light (red light with wavelength 600 nm has a frequency of 5 × 10^14 Hz). Consequently for visible light, $\omega < \omega_p$.$^{22}$
$^{22}$ In this model, $\omega < \omega_p$ implies that the field due to the electrons is larger than the driving field. For plane waves this would violate energy conservation. In practice, this does not happen when we include damping.
For a plane-wave solution of the form $E = E_0 e^{ik'z}$ we obtain
$$k' = \frac{1}{c} (\omega^2 - \omega_p^2)^{1/2} = nk , \tag{13.63}$$
where we have defined an effective refractive index,
$$n = \left( 1 - \frac{\omega_p^2}{\omega^2} \right)^{1/2} . \tag{13.64}$$
For $\omega < \omega_p$ the refractive index is imaginary, $n = in''$, where $n'' \approx \omega_p/\omega$ for $\omega \ll \omega_p$, which means that light is strongly attenuated over a characteristic length scale
$$\delta = \frac{1}{n'' k} = \frac{c}{\omega_p} . \tag{13.65}$$
Note that this is not the same as the low-frequency skin depth, as we will show.
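To get a feel for the length scale in eqn (13.65), the sketch below evaluates δ = c/ωp for the two metals mentioned above. It assumes that the plasma frequencies quoted in Hz are ordinary frequencies, so that ωp = 2πνp; if they are instead read as angular frequencies the result is 2π larger. The function name `penetration_depth` is our own.

```python
import math

C = 299792458.0                        # speed of light (m/s)

def penetration_depth(nu_p):
    """delta = c / omega_p, eqn (13.65), assuming nu_p is an ordinary frequency in Hz."""
    return C / (2 * math.pi * nu_p)

for metal, nu_p in [("silver", 2.2e15), ("copper", 2.6e15)]:
    print(metal, penetration_depth(nu_p))
```

Under this assumption both depths are of order 20 nm, far smaller than an optical wavelength, so visible light barely enters the metal before it is reflected.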
Example 13.5
Plasma dispersion: A plasma has well-characterized dispersion properties which are important in the study of radio waves propagating in the atmosphere, and light propagating at high frequencies through metals. The plasma dispersion relation is
$$\omega^2 = k^2 c^2 + \omega_p^2 . \tag{13.66}$$
Waves with angular frequencies $\omega > \omega_p$ propagate, whereas waves with angular frequencies $\omega < \omega_p$ are evanescent and decay. Recalling the definition of refractive index, $\omega n = kc$, and substituting into eqn (13.66), we find that it has the functional form
$$n^2 = 1 - \left( \frac{\omega_p}{\omega} \right)^2 . \tag{13.67}$$
The phase velocity is given by
$$v_p = \frac{c}{n} = \frac{c}{\sqrt{1 - \omega_p^2/\omega^2}} , \tag{13.68}$$
which always exceeds c for waves with angular frequencies $\omega > \omega_p$. Differentiating eqn (13.66) with respect to k allows us to calculate the group velocity,
$$2\omega \frac{d\omega}{dk} = 2c^2 k ,$$
$$\therefore\ \frac{d\omega}{dk} \frac{\omega}{k} = c^2 ,$$
$$\therefore\ v_{gp}\, v_p = c^2 . \tag{13.69}$$
Figure 13.15 shows the functional form of the frequency dependence of the refractive indices of electromagnetic waves in a plasma. The product of group and phase velocity is equal to the speed of light squared; therefore the group velocity is always less than c for waves with angular frequencies $\omega > \omega_p$.$^{23}$
Fig. 13.15 The real, $n'$, and imaginary, $n''$, parts of the refractive index of a metal or plasma as a function of the angular frequency of the light in the vicinity of the angular plasma frequency, $\omega_p$.
$^{23}$ The explicit expression for the group velocity is $v_{gp} = c\sqrt{1 - \omega_p^2/\omega^2}$.
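The relation between phase and group velocity in eqn (13.69) can be illustrated with a few lines of code. The sketch below works in units where c = 1 and uses an arbitrary frequency ω = 2ωp; the function name `plasma_velocities` is our own.

```python
import math

def plasma_velocities(omega, omega_p, c=1.0):
    """Phase and group velocity for omega > omega_p, eqns (13.68) and (13.69)."""
    if omega <= omega_p:
        raise ValueError("wave is evanescent for omega <= omega_p")
    n = math.sqrt(1 - (omega_p / omega) ** 2)   # eqn (13.67)
    return c / n, c * n

v_p, v_g = plasma_velocities(omega=2.0, omega_p=1.0)
print(v_p, v_g, v_p * v_g)
```

The phase velocity exceeds c, the group velocity is below c, and their product equals c² for any ω above the plasma frequency.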
Example 13.6
Drude model and skin depth: Missing from the above description is damping due to scattering by the lattice. In the Drude–Lorentz model the lattice imposes a friction-like force characterized by a phenomenological resistive damping rate γ, and the equation of electron motion becomes
where the friction coefficient is equal to the inverse of the damping time constant, $\gamma = 1/\tau$.$^{24}$ The addition of a damping term modifies the refractive index to, see Appendix C,
$$n^2 = 1 - \frac{\omega_p^2}{\omega^2 + i\omega\gamma} . \tag{13.70}$$
$^{24}$ The damping rate, γ, is related to the conductivity, σ. Consider the response in the low-frequency limit ω → 0, where the electron speed tends towards a steady-state drift with $\ddot{x} = 0$. In this case, $-eE = -m\gamma\dot{x}$ and the current density becomes $J = Ne\dot{x} = (Ne^2/m\gamma)E = \sigma E$, which gives $\sigma = Ne^2/m\gamma$, i.e. lower damping gives a higher conductivity.
The damping rate for copper is γ/2π = 6.5 THz. For visible light, ω > γ, the free-electron refractive index is a reasonable approximation. For lower frequencies, including microwave, ω < γ, we obtain $n^2 \approx i\omega_p^2/(\omega\gamma)$. Using $\sqrt{i} = \sqrt{e^{i\pi/2}} = e^{i\pi/4} = \frac{1}{\sqrt{2}}(1 + i)$, we find that
$$n = (1 + i) \left( \frac{\omega_p^2}{2\omega\gamma} \right)^{1/2} , \tag{13.71}$$
i.e. the real and imaginary parts are equal. The attenuation coefficient is given by the imaginary part times the magnitude of the wave vector,
$$n'' k = \left( \frac{\omega_p^2}{2\omega\gamma} \right)^{1/2} k = \left( \frac{\omega_p^2\, \omega}{2\gamma c^2} \right)^{1/2} = \left( \frac{\sigma\omega}{2\epsilon_0 c^2} \right)^{1/2} , \tag{13.72}$$
where we have used $\omega_p^2 = \sigma\gamma/\epsilon_0$ to write the index in terms of the conductivity. This gives a low-frequency skin depth,
$$\delta = \frac{1}{\kappa} = \left( \frac{2\epsilon_0 c^2}{\sigma\omega} \right)^{1/2} . \tag{13.73}$$
The skin depth in copper reduces from 2 mm at 1 kHz, to 70 microns at 1 MHz, and 2 microns at 1 GHz. For this reason waveguides rather than wires are used in the microwave domain for frequencies above around 20 GHz.
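The quoted skin depths follow directly from eqn (13.73). The sketch below assumes a room-temperature conductivity for copper of 5.96 × 10^7 S/m, which is a typical handbook value and is not taken from the text; the function name `skin_depth` is our own.

```python
import math

EPS0 = 8.8541878128e-12               # vacuum permittivity (F/m)
C = 299792458.0                       # speed of light (m/s)
SIGMA_CU = 5.96e7                     # copper conductivity (S/m), assumed handbook value

def skin_depth(freq_hz, sigma=SIGMA_CU):
    """Low-frequency skin depth, eqn (13.73), for an ordinary frequency in Hz."""
    omega = 2 * math.pi * freq_hz
    return math.sqrt(2 * EPS0 * C ** 2 / (sigma * omega))

for f in (1e3, 1e6, 1e9):
    print(f"{f:.0e} Hz: {skin_depth(f):.2e} m")
```

With this conductivity the results are close to 2 mm, 70 microns and 2 microns, and the skin depth falls as the inverse square root of the frequency.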
$$\chi = \chi^{(1)} + \chi^{(3)} E_0^2 , \tag{13.77}$$
where the first term is known as the linear susceptibility and the second
term as the optical Kerr non-linearity. As the non-linear term is
$$\chi^{(3)} = \frac{\chi^{(1)}}{E_{\rm at}^2} . \tag{13.79}$$
Using the Bohr model, the energy $\hbar\omega \sim \frac{1}{2} e^2/4\pi\epsilon_0 r$ and $|D_0| \sim e a_0$, giving $E_{\rm at} \sim \frac{1}{2} e/4\pi\epsilon_0 a_0^2 \sim 5 \times 10^{11}$ V m$^{-1}$. As $\chi^{(1)} \leq 1$, we obtain an upper limit of
$$\chi^{(3)} \lesssim 10^{-23}\ \mathrm{V^{-2}\,m^2} . \tag{13.80}$$
This rough estimate is not so far off typical values; for example, air is 1.7 × 10^−25 V^−2 m^2 and water is 2.5 × 10^−22 V^−2 m^2 (Boyd, 1992). Note that this is for off-resonant fields, and much higher values are obtained by exploiting resonances. If the medium has an internal field due to the crystal structure, then it is also possible to have a linear Stark effect giving rise to a $\chi^{(2)}$ non-linearity. This $\chi^{(2)}$ term can be used to frequency-double laser light, as we shall see in Example 13.7, or to convert between frequencies using parametric down-conversion.
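The order-of-magnitude estimate of eqn (13.80) is simple to reproduce. The sketch below uses CODATA values for the elementary charge, vacuum permittivity and Bohr radius, keeps the factor of one half from the Bohr-model estimate, and sets χ⁽¹⁾ = 1 as the upper limit; the variable names are our own.

```python
import math

E_CHARGE = 1.602176634e-19            # elementary charge (C)
EPS0 = 8.8541878128e-12               # vacuum permittivity (F/m)
A0 = 5.29177210903e-11                # Bohr radius (m)

# Characteristic atomic field, E_at ~ (1/2) e / (4 pi eps0 a0^2)
E_at = 0.5 * E_CHARGE / (4 * math.pi * EPS0 * A0 ** 2)

# Upper limit on chi3 from eqn (13.79) with chi1 = 1
chi3_limit = 1.0 / E_at ** 2

print(f"E_at ~ {E_at:.1e} V/m, chi3 limit ~ {chi3_limit:.1e} V^-2 m^2")
```

The atomic field comes out at a few times 10^11 V/m and the limit on χ⁽³⁾ at about 10^−23 V^−2 m^2, consistent with the estimate in the text.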
Example 13.7
Classical theory: In the classical Lorentz model, see Section C.4, the electron is treated as a mass on a spring. The linear susceptibility $\chi^{(1)}$ corresponds to Hooke's law, where the restoring force is linearly proportional to the displacement. In this case, the electron moves in a harmonic potential of the form $\frac{1}{2} m\omega_0^2 x^2$ for a field polarized along x. The non-linear terms correspond to higher-order anharmonic terms in the potential. If we expand the susceptibility as a power series in E,
$$\chi = \frac{N d}{\epsilon_0 E} = \chi^{(1)} + \chi^{(2)} E + \chi^{(3)} E^2 + \cdots , \tag{13.81}$$
then successive terms correspond to harmonic, cubic, and quartic terms in the binding potential. In the absence of any symmetry-breaking fields,$^{25}$ the first-order correction to the harmonic potential is $x^4$, which gives rise to the $\chi^{(3)}$ term. The effect of these non-linear terms on the electron motion and how this leads to the appearance of harmonics in the spectrum is illustrated in Fig. 13.17.
$^{25}$ In non-linear crystals there is an internal electric field imposed by the crystal structure.
Chapter summary
Exercises
(13.1) Dipole phase
Draw a phasor diagram at t = 0 for an electric field, $E/E_0 = e^{-i\omega t}$. Indicate the direction of rotation. Add a phasor corresponding to an induced dipole, d/|d|, with resonant angular frequency, $\omega_0$, for (i) $\omega = \omega_0$, (ii) $\omega \ll \omega_0$, and (iii) $\omega \gg \omega_0$.

(13.2) Refractive index of a thin slab
In Fig. 13.18 we show the phase of a harmonic wave propagating through a thin slab of medium (the interfaces of the medium are indicated by dashed lines) with refractive index n. In free space and inside the medium the phase change per unit distance is k and nk, respectively. Figure 13.18 also shows the Fourier transform for a short slab, and inset a slab that is 10 times longer.
(a) What is the length of the medium, in units of the wavelength, in both cases?
(b) What is the value of the refractive index?
(c) What two properties are neglected in the simulation? [Hint: Fresnel and Bouguer.]
(d) Comment on the uncertainty in the magnitude of the wave vector inside the medium for the short and long medium.
(e) To what extent does it make sense to define a refractive index for a medium of length less than or of order λ?

(13.3) Blue sky
Above the atmosphere the solar intensities of red (0.65 μm) and blue (0.45 μm) light are equal. If the vertical depth of the atmosphere is 10 km, and the average number density of molecules is N = 1.0 × 10^25 m^−3, estimate the difference between the intensity of red and blue light at the Earth's surface. At sunset the sunlight makes a tangent to the Earth's surface. What is the effective depth of the atmosphere if the Earth's radius is 6.4 × 10^6 m? Estimate the ratio of red to blue light at sunset.

(13.4) Wave propagation in a plasma
The dispersion relationship in a metal is $\omega^2 = k^2 c^2 + \omega_p^2$, where the plasma frequency $\omega_p$ is a constant. Do electromagnetic waves propagating through a plasma at a frequency higher than the plasma frequency exhibit normal or anomalous dispersion?

(13.5) Impossibility of a function that only absorbs one frequency component
Revisit the discussion of causality in Section 13.5. Consider the case of an optical wave that is zero until t = 0 incident on a device. (i) What would the output be if the device filtered out only one frequency component? (ii) Show that the wave with this modified spectrum would have to have a finite value for t < 0. (iii) Reason why such a filter is not compatible with causality.

(13.6) Frequency filtering and temporal convolution
The spectrum of the output wave with input spectrum F(ω) from a filter with profile G(ω) is F(ω)G(ω). Write an explicit expression for the time dependence of the output in terms of the functions f(t) and g(t), the inverse Fourier transforms of F(ω) and G(ω), respectively.

(13.7) Kramers–Kronig from Hilbert
By substituting $F(\omega) = F'(\omega) + iF''(\omega)$ into eqn (13.34) and equating real and imaginary parts, find the expression for $F'(\omega)$ in terms of $F''(\omega)$ and vice versa.

(13.8) Non-linearities
What are the units of $\chi^{(3)}$ and $n_2$? What is the conversion factor between them?
232 Exercises
The quantity A is known as the vector potential. From the scalar and
vector potentials we can also generate the electric field,
$$ \mathbf{E} = -\nabla\phi - \frac{\partial \mathbf{A}}{\partial t} , \tag{A.6} $$
where we recognize the electrostatic limit when the time derivative of
A is zero. From eqn (A.5) and eqn (A.6) it is evident that both vector
234 Electromagnetic scalar and vector potentials
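Coulomb gauge: The Coulomb gauge is defined by the condition^5
$$ \nabla\cdot\mathbf{A} = 0 , \tag{A.11} $$
in which case eqn (A.7) and eqn (A.8) simplify to
$$ -\nabla^2\mathbf{A} + \frac{1}{c^2}\frac{\partial}{\partial t}\nabla\phi + \frac{1}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2} = \mu_0\mathbf{J} \tag{A.12} $$
and
$$ \nabla^2\phi = -\frac{\rho}{\epsilon_0} . \tag{A.13} $$
^5 Note that it is the constraints on the functions $\phi$ and $\mathbf{A}$ that are most important; we don't particularly care about the explicit form of the function $\chi$.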
A.3 Application: Electric field of a dipole 235
charge oscillating along one axis. We shall come back to this point after
we have derived the field produced by a linear dipole.
Example A.1
Far field of a dipole: A 'quick' derivation proceeds from the general solution to Poisson's equation for the vector potential in the Lorenz gauge, eqn (A.15) (see e.g. Zangwill 2013):
$$ \mathbf{A}(\mathbf{r}, t) = \frac{\mu_0}{4\pi}\int \frac{\mathbf{J}(\mathbf{x}', t')}{r}\, d^3x' , \tag{A.17} $$
where $r = |\mathbf{x} - \mathbf{x}'|$ and $t' = t - |\mathbf{x} - \mathbf{x}'|/c$ is the retarded time, see Jackson (1999). For a localised dipole with charge $q$ and position $z = z_0 e^{-i\omega t}$, we have
$$ \int d^3x\, J_z(\mathbf{x}, t) = q\dot{z} = -i\omega q z_0 e^{-i\omega t} = -i\omega d\, e^{-i\omega t} $$
Example A.2
Near and far field: In the above we ignored the near field and although we do not use the result in the main text we include it here for completeness. We avoid using eqn (A.17) in order to demonstrate the near- and far-field matching directly.^9 Again, we consider a 'point' dipole oscillating at frequency $\omega$ along the z axis. As $\mathbf{J}$ is parallel to the z axis the only non-zero component of $\mathbf{A}$ is $A_z$. For a 'point' dipole,
$$ \nabla^2 A_z - \frac{1}{c^2}\frac{\partial^2 A_z}{\partial t^2} = 0 , $$
everywhere except at the origin. The solution is a spherical wave,
$$ A_z(r, t) = A_0 \frac{e^{i(kr-\omega t)}}{r} , $$
^9 A similar derivation is given in Souza (1983).
with an unknown amplitude, $A_0$, that we can find by matching the scalar and vector potential in the near field ($kr < 1$) using eqn (A.14),
$$ \nabla\cdot\mathbf{A} = \frac{\partial A_z}{\partial r}\cos\theta = A_0\cos\theta\left(-\frac{1}{r^2} + \frac{ik}{r}\right)e^{i(kr-\omega t)} . \tag{A.19} $$
For $kr < 1$ we can neglect the second term. Also in the near field, the potential follows that of the static dipole. The potential due to a charge $q$ at the origin is
$$ \phi(r, \theta) = \frac{q}{4\pi\epsilon_0 r} . $$
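Adding a negative charge displaced by a distance $z$ along the negative z axis, the potential becomes
$$ \phi(r, \theta) = \frac{q}{4\pi\epsilon_0}\left[\frac{1}{r} - \frac{1}{(r^2 + z^2 + 2rz\cos\theta)^{1/2}}\right] = \frac{q}{4\pi\epsilon_0 r}\left[1 - 1 + \frac{z\cos\theta}{r} + O\!\left(\frac{z^2}{r^2}\right)\right] \approx \frac{d\cos\theta}{4\pi\epsilon_0 r^2} , $$
where the static dipole moment is $d = qz$. For the oscillating dipole, the near-field potential therefore has the form
$$ \phi(r, \theta, t) = \frac{d\cos\theta}{4\pi\epsilon_0 r^2}\, e^{i(kr-\omega t)} . $$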
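Substituting in eqn (A.14) we find
$$ -A_0\cos\theta\,\frac{1}{r^2}\, e^{i(kr-\omega t)} - i\omega\frac{d\cos\theta}{4\pi\epsilon_0 c^2 r^2}\, e^{i(kr-\omega t)} = 0 , $$
so
$$ A_0 = -\frac{i\omega d}{4\pi\epsilon_0 c^2} = -\frac{i\mu_0\omega d}{4\pi} . \tag{A.20} $$
Substituting eqn (A.20) into eqn (A.19) and using eqn (A.14), we find
$$ -\frac{i\omega}{c^2}\phi - \frac{i\mu_0\omega d\cos\theta}{4\pi}\left(-\frac{1}{r^2} + \frac{ik}{r}\right)e^{i(kr-\omega t)} = 0 , $$
so
$$ \phi(r, \theta, t) = \frac{d\cos\theta}{4\pi\epsilon_0}\left(\frac{1}{r^2} - \frac{ik}{r}\right)e^{i(kr-\omega t)} . $$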
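The electric field is given by
$$ \mathbf{E} = -\nabla\phi - \frac{\partial\mathbf{A}}{\partial t} . $$
The gradient of the potential is
$$ \nabla\phi = \frac{\partial\phi}{\partial r}\hat{\mathbf{r}} + \frac{1}{r}\frac{\partial\phi}{\partial\theta}\hat{\boldsymbol{\theta}} = \frac{d}{4\pi\epsilon_0}\left[\cos\theta\left(\frac{k^2}{r} + \frac{2ik}{r^2} - \frac{2}{r^3}\right)\hat{\mathbf{r}} - \sin\theta\left(\frac{1}{r^3} - \frac{ik}{r^2}\right)\hat{\boldsymbol{\theta}}\right]e^{i(kr-\omega t)} . $$
Using
$$ \mathbf{A} = -\frac{i\mu_0\omega d}{4\pi r}\,(\cos\theta\,\hat{\mathbf{r}} - \sin\theta\,\hat{\boldsymbol{\theta}})\,e^{i(kr-\omega t)} , $$
we find that
$$ \frac{\partial\mathbf{A}}{\partial t} = -\frac{\mu_0\omega^2 d}{4\pi r}\,(\cos\theta\,\hat{\mathbf{r}} - \sin\theta\,\hat{\boldsymbol{\theta}})\,e^{i(kr-\omega t)} = -\frac{d}{4\pi\epsilon_0 r}\,k^2\,(\cos\theta\,\hat{\mathbf{r}} - \sin\theta\,\hat{\boldsymbol{\theta}})\,e^{i(kr-\omega t)} , $$
and so
$$ \mathbf{E} = \frac{d}{4\pi\epsilon_0}\left[2\left(\frac{1}{r^3} - \frac{ik}{r^2}\right)\cos\theta\,\hat{\mathbf{r}} + \left(\frac{1}{r^3} - \frac{ik}{r^2} - \frac{k^2}{r}\right)\sin\theta\,\hat{\boldsymbol{\theta}}\right]e^{i(kr-\omega t)} . $$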
In the far field (kr > 1) the 1/r term, known as the radiation term, dominates.
This term corresponds with a dipolar radiation pattern emitted by a dipole with
negligible spatial size. Note that there is no radial component of the radiation (1/r)
term, i.e. the radiated field is transverse. In the near field (kr < 1) the higher-order
terms dominate. Note that the imaginary part of the field remains finite at the origin.
The imaginary part of the $1/r^3$ and $1/r^2$ terms has the form
$$ \lim_{r\to 0}\left[\frac{1}{r^3}\sin kr - \frac{k}{r^2}\cos kr\right] = \frac{kr}{r^3} - \frac{k^3r^3}{6r^3} - \frac{k}{r^2} + \frac{k^3r^2}{2r^2} + O(r^2) = \frac{k^3}{3} . $$
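The finite limit above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `imag_near_field` and the value $k = 2$ are illustrative choices, not taken from the text.

```python
import math

def imag_near_field(r, k=1.0):
    """Imaginary part of (1/r^3 - i k/r^2) * exp(i k r)."""
    return math.sin(k * r) / r**3 - k * math.cos(k * r) / r**2

k = 2.0  # arbitrary wavenumber chosen for illustration
for r in (0.1, 0.03, 0.01):
    # The two columns should approach each other as r shrinks.
    print(r, imag_near_field(r, k), k**3 / 3)
```

The two divergent pieces, $k/r^2$ from the sine term and $-k/r^2$ from the cosine term, cancel, which is why the sum stays finite as $r \to 0$.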
It is often useful to look at a particular component of the dipolar field. Here,
we choose the component parallel to the direction of the induced dipole, i.e. the z-
component,
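$$ E_z = E_r\cos\theta - E_\theta\sin\theta = \frac{d}{4\pi\epsilon_0}\left[2\left(\frac{1}{r^3} - \frac{ik}{r^2}\right)\cos^2\theta - \left(\frac{1}{r^3} - \frac{ik}{r^2} - \frac{k^2}{r}\right)\sin^2\theta\right]e^{i(kr-\omega t)} . $$
Note that the minus sign arises because as $\theta$ increases this reduces the z-component. This last result gives a useful form for the scalar dipolar field:
$$ E_d = \frac{d}{4\pi\epsilon_0}\left[\left(\frac{1}{r^3} - \frac{ik}{r^2}\right)(3\cos^2\theta - 1) + \frac{k^2}{r}\sin^2\theta\right]e^{i(kr-\omega t)} , \tag{A.21} $$
from which we obtain the far field, $kr \gg 1$, radiation term as:
$$ E_d = \frac{d}{4\pi\epsilon_0}\frac{k^2}{r}\sin^2\theta\, e^{i(kr-\omega t)} , $$
in agreement with the quick derivation, eqn (A.18).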
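To see how quickly the radiation term takes over, eqn (A.21) can be evaluated directly. This is a sketch under stated assumptions: the prefactor $d/4\pi\epsilon_0$ is set to 1, $t = 0$, $k = 1$, and the function names are hypothetical.

```python
import cmath
import math

def dipole_Ez(r, theta, k=1.0, prefactor=1.0):
    """Eqn (A.21) at t = 0, with d/(4 pi eps0) replaced by `prefactor`."""
    near = (1 / r**3 - 1j * k / r**2) * (3 * math.cos(theta)**2 - 1)
    rad = (k**2 / r) * math.sin(theta)**2
    return prefactor * (near + rad) * cmath.exp(1j * k * r)

def radiation_term(r, theta, k=1.0, prefactor=1.0):
    """The 1/r term of eqn (A.21) on its own."""
    return prefactor * (k**2 / r) * math.sin(theta)**2 * cmath.exp(1j * k * r)

theta = math.pi / 3  # arbitrary observation angle
for kr in (0.1, 1.0, 10.0, 100.0):
    ratio = abs(dipole_Ez(kr, theta)) / abs(radiation_term(kr, theta))
    print(kr, ratio)  # tends to 1 for kr >> 1, large for kr << 1
```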
Dipoles in three dimensions: Finally, we discuss briefly the more complex case of three-dimensional charge distributions. In the far field, we can describe the three-dimensional oscillating charge distribution as a superposition of point-like dipoles along the x, y, and z axes. More convenient is to take a z dipole, $d_z$, and dipoles rotating in both directions around the z axis, $d_x + id_y$ and $d_x - id_y$ (these modes are all symmetric with respect to the z axis).^10 The dipolar modes $d_z$ and $d_x \pm id_y$ are associated with the transitions $\Delta m = 0$ and $\Delta m = \pm 1$ and are often labelled as $\pi$ and $\sigma^{\pm}$. Light emitted due to a $\pi$ transition is linearly polarized, whereas light emitted due to a $\sigma^{\pm}$ transition is either circularly polarized close to the z axis or linearly polarized in the xy plane, see Chapter 4. The intensity distributions for $\pi$ and $\sigma^{\pm}$ transitions are proportional to $\sin^2\theta$ and $\frac{1}{2}(1 + \cos^2\theta)$, respectively. If we add the radiation from all three modes together we find that the distribution is isotropic and the intensity is a factor of 2 larger than for $d_z$ alone.
^10 These different dipole modes are associated with different internal quantum states. The simplest quantum dipole corresponds with the superposition between quantum states with total angular momentum $J = 0$ and $J = 1$. The $J = 0$ and $J = 1$ states consist of sub-states labelled $m = 0$ and $m = -1, 0, +1$, respectively.
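The isotropy of the summed pattern follows in one line from the angular distributions quoted for the $\pi$ and $\sigma^{\pm}$ modes. As a short worked check (the labels $I_\pi$ and $I_{\sigma^\pm}$ are introduced here only for illustration):

```latex
I_\pi + I_{\sigma^+} + I_{\sigma^-}
  \;\propto\; \sin^2\theta + 2\times\tfrac{1}{2}\left(1 + \cos^2\theta\right)
  \;=\; \sin^2\theta + \cos^2\theta + 1
  \;=\; 2 .
```

The sum is independent of $\theta$, and equals twice the maximum value of $\sin^2\theta$ reached by the $d_z$ pattern at $\theta = \pi/2$.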
B Fourier transform toolkit
B.1 Executive summary
B.2 δ-function
B.3 Properties
B.4 Convolution
B.5 rect sinc
B.6 gauss gauss
B.7 δ-function constant
B.8 Phasor δ-function
B.9 comb comb
B.10 2D Fourier transforms
B.11 Cartesian separability
B.12 2D rect
B.13 circ jinc
B.14 Fourier on a computer
Exercises
B.1 Executive summary
In this appendix, we focus on the mathematical properties of Fourier transforms and develop a toolkit that can be applied to optics throughout the book.^1 The following dozen equations, relating to Fourier transforms, are particularly useful in optics. The first six relate to the mathematical properties of the Fourier transform. The second half dozen are particular examples of Fourier transform pairs. We use lower case letters for real space functions and capitals for functions in Fourier (or frequency) space. We define a Fourier transform using:
$$ F(u) = \mathcal{F}[f(x)](u) = \int_{-\infty}^{\infty} f(x)\,e^{-i2\pi ux}\,dx , \tag{B.1} $$
$$ f(x) = \mathcal{F}^{-1}[F(u)](x) = \int_{-\infty}^{\infty} F(u)\,e^{i2\pi ux}\,du . \tag{B.2} $$
^1 If you are familiar with Fourier transforms and are happy with the dozen equations, B.2–13, you could skip the further details.
The following six properties of Fourier transforms: the central ordinate theorem; linearity; translation; scaling; convolution (inverse convolution); and cartesian separability, can be expressed as:
$$ F(0) = \int_{-\infty}^{\infty} f(x)\,dx , \tag{B.3} $$
$$ \mathcal{F}\left[{\rm rect}\left(\frac{x}{a}\right)\right](u) = a\,{\rm sinc}(\pi ua) , \tag{B.9} $$
$$ \mathcal{F}\left[{\rm circ}\left(\frac{\rho}{D}\right)\right](u, v) = \frac{\pi D^2}{4}\,{\rm jinc}[\pi W D] , \tag{B.10} $$
$$ \mathcal{F}\left[{\rm gauss}\left(\frac{x}{w_0}\right)\right](u) = \sqrt{\pi}\,w_0\,{\rm gauss}(\pi u w_0) , \tag{B.11} $$
$$ \mathcal{F}\left[X_d^{(N)}(x)\right](u) = \frac{\sin N\pi ud}{\sin \pi ud} , \tag{B.13} $$
B.2 δ-function
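The Dirac δ-function is defined as
$$ \delta(x) = \begin{cases} \infty & x = 0 \\ 0 & x \neq 0 \end{cases} , \tag{B.15} $$
with the condition that^2
$$ \int_{-\infty}^{\infty} \delta(x)\,dx = 1 . \tag{B.16} $$
^2 This normalization condition introduces the scaling property $\delta(x/a) = a\,\delta(x)$.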
B.3 Properties
In this section, we consider some properties of the Fourier transform
operator.
Central ordinate theorem: From eqn (B.1), it follows that the amplitude of the zero spatial frequency component is equal to the area under the curve,
$$ F(0) = \int_{-\infty}^{\infty} f(x)\,dx . \tag{B.19} $$
Symmetry: For even functions where $f(x) = f(-x)$, $F(u) = F(-u)$, the 'forward' and inverse transforms are the same, and applying the forward transform twice returns the original function. For non-symmetric functions, we find
$$ \mathcal{F}[\mathcal{F}[f(x)]] = \mathcal{F}[F(u)] = \int_{-\infty}^{\infty} F(u)\,e^{-i2\pi ux}\,du = f(-x) , \tag{B.20} $$
B.4 Convolution
The convolution^6 of two functions of x, $g(x)$ and $h(x)$, is defined as^7
$$ (g * h)(x) = \int_{-\infty}^{\infty} g(x')\,h(x - x')\,dx' . \tag{B.26} $$
For a particular value of x this has the form of an overlap integral with h reflected about the x axis and re-centred at x, see Fig. B.1. In the
^6 The convolution concept is useful in optics because it is often possible to write the input field either in the form of a convolution integral, or as a product of two functions, in which case the field downstream can be expressed as a convolution. Convolution is also useful in electronics, where for example, the output of a filter is given by a convolution of the input function with the frequency response of the filter.
^7 Also sometimes written as $g(x) \otimes h(x)$.
Fig. B.1 The convolution of $g(x')$ and $h(x')$: The left panel shows $g(x')$ and $h(x - x')$ as a function of $x'$ for five values of x. The parameter x behaves as an offset: as x is varied, the function $h(x - x')$ moves along the $x'$ axis. Note that $h(-x')$ is a mirror reflection of $h(x')$ about the x axis. The overlap between $g(x')$ and $h(x - x')$ is shaded. The shaded area gives the value of the convolution as indicated by the black dots in the right-hand panel. The black dots trace out the full convolution function, $g * h$.
This equation is very useful in optics when we are able to write the input field in terms of a product of functions.^9 We consider another example, the convolution of two rect functions, in Exercise B.11, see Fig. B.15.
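The overlap-integral picture of eqn (B.26) can be tried out numerically. The sketch below convolves two unit-width rect functions with a midpoint rule; the helper names, the integration window and the number of sample points are illustrative assumptions.

```python
def rect(x):
    """rect(x): 1 for |x| <= 1/2 and 0 otherwise."""
    return 1.0 if abs(x) <= 0.5 else 0.0

def convolve_at(g, h, x, lo=-2.0, hi=2.0, n=4000):
    """Midpoint-rule estimate of (g*h)(x) = integral of g(x') h(x - x') dx'."""
    dx = (hi - lo) / n
    total = 0.0
    for j in range(n):
        xp = lo + (j + 0.5) * dx   # integration variable x'
        total += g(xp) * h(x - xp)  # overlap of g with the shifted, reflected h
    return total * dx

for x in (-1.0, -0.5, 0.0, 0.5, 1.0):
    print(x, round(convolve_at(rect, rect, x), 3))
```

The printed values follow the overlap area of the two rectangles, which rises linearly to a maximum at $x = 0$ and falls to zero at $|x| = 1$, i.e. a triangle.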
This is depicted in Fig. B.4. We could also write the Fourier transform in terms of angular spatial frequency,
$$ \mathcal{F}\left[{\rm gauss}\left(\frac{x}{w_0}\right)\right](k_x) = \sqrt{\pi}\,w_0\,{\rm gauss}\left(\frac{k_x w_0}{2}\right) . \tag{B.35} $$
Note that for $w_0^2 = 1/\pi$ we have
$$ \mathcal{F}\left[e^{-\pi x^2}\right](u) = e^{-\pi u^2} . \tag{B.36} $$
This result is important in laser physics, where we find that both gaussians and derivatives of gaussians (related to Hermite polynomials) can be used to describe the transverse-field profile of the field inside laser cavities, see Chapter 11.
We can also have a gaussian with a complex argument. For a purely imaginary argument the gaussian is a cosine and sine of a quadratic function, like the wave front of a circular wave in Fig. 2.1. The Fourier transform remains self-similar, for example with $w_0^2 = i\lambda z/\pi$,
$$ \mathcal{F}\left[e^{-\pi x^2/i\lambda z}\right](u) = \sqrt{i\lambda z}\; e^{-i\pi\lambda z u^2} , \tag{B.38} $$
which is relevant to the derivation of the Fresnel diffraction integral, see Chapter 5.
Fig. B.4 gauss gauss: (a) $f(x) = {\rm gauss}(x/w_0)$. (b) $F(u) = \mathcal{F}[f](u) = {\rm gauss}(\pi w_0 u)$.
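The gaussian transform pair can be verified by evaluating the Fourier integral of eqn (B.1) directly. This sketch assumes ${\rm gauss}(x) = e^{-x^2}$, which is the form implied by eqn (B.36); the width $w_0 = 1.5$, the integration window and the function names are arbitrary illustrative choices.

```python
import cmath
import math

def gauss(x):
    """gauss(x) = exp(-x^2), the form implied by eqn (B.36)."""
    return math.exp(-x * x)

def fourier_transform(f, u, half_width=8.0, n=4000):
    """Midpoint-rule estimate of F(u) = integral of f(x) exp(-i 2 pi u x) dx."""
    dx = 2 * half_width / n
    total = 0j
    for j in range(n):
        x = -half_width + (j + 0.5) * dx
        total += f(x) * cmath.exp(-2j * math.pi * u * x)
    return total * dx

w0 = 1.5  # illustrative width
for u in (0.0, 0.1, 0.3):
    numeric = fourier_transform(lambda x: gauss(x / w0), u)
    analytic = math.sqrt(math.pi) * w0 * gauss(math.pi * u * w0)
    print(u, numeric.real, analytic)
```

A wide gaussian in x gives a narrow gaussian in u and vice versa, which is the scaling property in action.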
Example B.1
Heisenberg uncertainty relationship: Consider a gaussian wave packet described by the probability distribution
$$ P(x) = \frac{1}{(2\pi\sigma_x^2)^{1/2}}\exp\left(-\frac{x^2}{2\sigma_x^2}\right) , $$
where $\sigma_x$ is the $1/\sqrt{e}$ width or standard deviation. In electromagnetism or quantum mechanics this probability distribution would correspond to the field amplitude or wave function,^12
$$ \psi(x) = \frac{1}{(2\pi\sigma_x^2)^{1/4}}\exp\left(-\frac{x^2}{4\sigma_x^2}\right) . $$
The amplitude distribution of angular spatial frequencies is given by the Fourier transform with respect to $k_x$,
$$ \tilde{\psi}(k_x) = \frac{1}{(2\pi\sigma_x^2)^{1/4}}\,\pi^{1/2}\,2\sigma_x\exp\left(-k_x^2\sigma_x^2\right) , $$
and the probability distribution is
$$ P(k_x) = 2\sqrt{2}\,\pi^{1/2}\sigma_x\exp\left(-2k_x^2\sigma_x^2\right) . $$
^12 In optics, the field amplitude squared determines the intensity or flux and hence the probability of detecting a photon. In quantum mechanics, for the ground state of the harmonic oscillator, $\sigma_x = a_0/\sqrt{2}$, where $a_0 = (\hbar/m\omega_{\rm osc})^{1/2}$ is the 1/e width and $\omega_{\rm osc}$ is the oscillation frequency.
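The width of $P(k_x)$ can be read off by comparing it with a standard gaussian of standard deviation $\sigma_k$ (a symbol introduced here for illustration). The following short worked step is a sketch of where this calculation leads, using $p_x = \hbar k_x$:

```latex
\exp\left(-2k_x^2\sigma_x^2\right) \equiv \exp\left(-\frac{k_x^2}{2\sigma_k^2}\right)
\quad\Rightarrow\quad
\sigma_k = \frac{1}{2\sigma_x}
\quad\Rightarrow\quad
\sigma_x\sigma_k = \frac{1}{2} ,
\qquad
\sigma_x\sigma_{p} = \frac{\hbar}{2} .
```

The gaussian therefore sits exactly at the minimum-uncertainty product.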
246 Fourier transform toolkit
i.e. $\delta(x)$ and 1 are a Fourier transform pair as illustrated in Fig. B.5. Similarly, for angular spatial frequencies,
$$ F(k_x) = \mathcal{F}[\delta(x)] = \int_{-\infty}^{\infty}\delta(x)\,e^{-ik_x x}\,dx = 1 . \tag{B.41} $$
The real space δ-function, $\delta(x)$, contains all angular spatial frequencies with an equal amplitude, i.e. $F(k_x) = 1$. Inserting $F(u) = 1$ into the inverse transform, eqn (6.9), we find
$$ \delta(x) = \int_{-\infty}^{\infty} e^{i2\pi ux}\,du . \tag{B.42} $$
Fig. B.5 δ-function constant: (a) $f(x) = \delta(x)$ (the vertical arrow indicates a δ-function) and (b) $F(u) = \mathcal{F}[f](u)$.
$$ \sum_{j=0}^{N-1} a r^j = \frac{a(1 - r^N)}{1 - r} , $$
to perform the sum analytically:
$$ \mathcal{F}\left[X_d^{(N)}(x)\right](u) = e^{-i(N-1)\pi ud}\sum_{n=0}^{N-1} e^{in2\pi ud} = \frac{\sin N\pi ud}{\sin \pi ud} . $$
For small N, it is more convenient to write the Fourier transform in terms of a discrete sum of phasors. For N = 2,
$$ \mathcal{F}\left[X_d^{(2)}(x)\right] = e^{-i\pi ud} + e^{i\pi ud} = 2\cos\pi ud ; \tag{B.52} $$
and so on. Figure B.7 shows the example of $X_d^{(3)}(x)$ and its transform.
The position of the first zero in the phasor sum is when the phasors
are evenly distributed around the clock face at angles of 2πn/N , with
n going from −(N − 1)/2 to +(N − 1)/2, see Fig. B.8. For N = 3,
Fig. B.8(i), the phasor angles are −2π/3, 0 and +2π/3. When we take
the modulus-squared, we obtain a principal maximum with height 9
and subsidiary maxima with height 1.
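The phasor picture lends itself to a quick numerical check. The sketch below sums N unit phasors arranged symmetrically about zero phase and compares the result with the closed form $\sin N\pi ud/\sin\pi ud$; the function names are hypothetical and the product $ud$ is passed as a single dimensionless argument.

```python
import cmath
import math

def comb_transform(N, ud):
    """Sum of N unit phasors with phases 2*pi*ud*(n - (N-1)/2), n = 0..N-1."""
    return sum(cmath.exp(2j * math.pi * ud * (n - (N - 1) / 2)) for n in range(N))

def closed_form(N, ud):
    """sin(N pi u d) / sin(pi u d); not defined where sin(pi u d) = 0."""
    return math.sin(N * math.pi * ud) / math.sin(math.pi * ud)

N = 3
print(abs(comb_transform(N, 0.0))**2)     # principal maximum, N^2
print(abs(comb_transform(N, 1 / N))**2)   # first zero at ud = 1/N
print(abs(comb_transform(N, 0.5))**2)     # subsidiary maximum for N = 3
print(comb_transform(N, 0.2).real, closed_form(N, 0.2))
```

At $ud = 1/N$ the phasors are spread evenly around the clock face and cancel, which is the first zero described above.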
The modulus-squared of the Fourier transform of $X_d^{(N)}(x)$ for N = 2–6 is illustrated in Fig. B.9. The generic properties of the N-phasor sum are:
For N rect functions with width a and spacing d, the Fourier transform is
$$ \mathcal{F}\left[{\rm rect}\left(\frac{x}{a}\right) * X_d^{(N)}(x)\right](u) = a\,{\rm sinc}(\pi ua)\,\frac{\sin N\pi ud}{\sin \pi ud} . $$
In Fig. B.10 the square of the Fourier transform is plotted for N = 1, 2, 3, 4, 5, and 12 with d = 2a. In optics, this expression is used to describe the diffraction pattern for N slits or a diffraction grating. As the number of rect functions increases the peaks become narrower but their height is still given by the envelope arising from a single rect function. For large N the spectrum approaches the discrete Fourier series that we saw in trying to build a square wave, see Fig. 6.3. In optics, the square-wave pattern with equal regions of on and off is known as a Ronchi grating.
Fig. B.8 Phasor diagrams corresponding to the first zero of the Fourier transform of $X_d^{(N)}(x)$ for (i) N = 3, (ii) 4, and (iii) 5.
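For the case d = 2a, the single-slit envelope can be sampled at the principal maxima, $u = m/d$, to see which orders survive. This is a sketch assuming ${\rm sinc}(x) = \sin x/x$, as implied by $\mathcal{F}[{\rm rect}(x/a)] = a\,{\rm sinc}(\pi ua)$; the function names and the unit slit width are illustrative.

```python
import math

def sinc(x):
    """sinc(x) = sin(x)/x with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(x) / x

def grating_order_amplitude(m, a, d):
    """Envelope a*sinc(pi*u*a) sampled at the m-th principal maximum, u = m/d."""
    return a * sinc(math.pi * m * a / d)

a, d = 1.0, 2.0  # equal on and off regions, d = 2a
for m in range(5):
    print(m, grating_order_amplitude(m, a, d)**2)
```

The even orders (other than m = 0) fall on zeros of the envelope, consistent with a square wave being built from odd harmonics only.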
B.9 comb comb 249
$$ X_d^{(N)}(x) = \frac{1}{d}\,X\!\left(\frac{x}{d}\right){\rm rect}\left(\frac{x}{Nd}\right) . $$
where u and v are the spatial frequencies corresponding with the x and
y directions, respectively. In optics there are two cases which occur
frequently: (i) cartesian separable where the function can be written
as a product of functions of x and y; and (ii) cylindrical symmetry,
where the function has cylindrical symmetry.
where G(u) = F [g(x)] and H(v) = F [h(y)] are the 1D transforms. Note
that 2D transforms have the same symmetry as the function, e.g. if the
function has two axes of symmetry then the Fourier transform will also
have two axes of symmetry.
B.12 2D rect
A simple example of a cartesian separable function is the two-dimensional rect function given by the product of two rect functions in the x and y directions, i.e.
$$ {\rm rect}\left(\frac{x}{a}\right){\rm rect}\left(\frac{y}{b}\right) = \begin{cases} 0 & |x| > a/2 \ \text{or}\ |y| > b/2 \\ 1 & |x| \le a/2 \ \text{and}\ |y| \le b/2 \end{cases} . \tag{B.57} $$
Using the one-dimensional Fourier transform of rect, eqn (B.31), and cartesian separability, eqn (B.56), we find that
$$ \mathcal{F}\left[{\rm rect}\left(\frac{x}{a}\right){\rm rect}\left(\frac{y}{b}\right)\right] = ab\,{\rm sinc}(\pi ua)\,{\rm sinc}(\pi vb) . \tag{B.58} $$
The Fourier transform of a two-dimensional rectangular function with width three times larger than the height, a = 3b, is shown in Fig. B.11.
Fig. B.11 Two-dimensional intensity map corresponding to the modulus-squared of the Fourier transform of a two-dimensional rect function with width three times larger than the height, a = 3b. Both the function and the transform have two axes of symmetry.
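cylindrical radial distance $\rho = \sqrt{x^2 + y^2}$ and its Fourier partner $W = \sqrt{u^2 + v^2}$.
The circ function is the cylindrically-symmetrical equivalent of the rect function, and in optics it describes a circular aperture. The circ function with diameter D is written as
$$ {\rm circ}\left(\frac{\rho}{D}\right) = \begin{cases} 0 & \rho > D/2 \\ 1 & \rho \le D/2 \end{cases} . \tag{B.59} $$
The Fourier transform is given by
$$ \mathcal{F}\left[{\rm circ}\left(\frac{\rho}{D}\right)\right](u, v) = \frac{\pi D^2}{4}\,{\rm jinc}(\pi W D) , \tag{B.60} $$
where jinc is the cylindrical analogue of sinc. The derivation proceeds as follows: we re-write the two-dimensional Fourier transform in polar coordinates,
$$ \mathcal{F}[f] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,e^{-i2\pi(ux+vy)}\,dx\,dy = \int_0^{2\pi}\int_0^{\infty} f(\rho, \theta)\,e^{-i2\pi W\rho(\sin\phi\sin\theta + \cos\phi\cos\theta)}\,\rho\,d\rho\,d\theta , $$
where W and $\phi$ are the Fourier space equivalents of $\rho$ and $\theta$. Using the identity $\sin\phi\sin\theta + \cos\phi\cos\theta = \cos(\phi - \theta)$, if $f(\rho, \theta)$ is independent of $\theta$, then the angular integral gives the Bessel function,
$$ J_0(2\pi W\rho) = \frac{1}{2\pi}\int_0^{2\pi} e^{-i2\pi W\rho\cos(\phi-\theta)}\,d\theta , $$
and we can rewrite the Fourier transform as an integral over $\rho$ only,
$$ \mathcal{F}\left[{\rm circ}\left(\frac{\rho}{D}\right)\right] = 2\pi\int_0^{D/2} J_0(2\pi W\rho)\,\rho\,d\rho = \frac{\pi D^2}{4}\,\frac{2J_1(\pi W D)}{\pi W D} , $$
using the Bessel function identity
$$ \alpha J_1(\alpha) = \int_0^{\alpha}\beta J_0(\beta)\,d\beta . $$
Comparison with eqn (B.60) gives ${\rm jinc}(x) = 2J_1(x)/x$, which tends to 1 as $x \to 0$, as required by the central ordinate theorem.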
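The radial integral can be checked numerically without any special-function library. The sketch below computes $J_0$ and $J_1$ from Bessel's integral representation, $J_n(z) = \frac{1}{\pi}\int_0^{\pi}\cos(n\tau - z\sin\tau)\,d\tau$, and compares the radial integral with $(\pi D^2/4)\,2J_1(\pi WD)/(\pi WD)$. The function names, the grid sizes and the value D = 2 are illustrative assumptions.

```python
import math

def bessel_j(order, z, n=400):
    """J_order(z) from (1/pi) * integral_0^pi cos(order*t - z*sin(t)) dt."""
    dt = math.pi / n
    return sum(math.cos(order * (j + 0.5) * dt - z * math.sin((j + 0.5) * dt))
               for j in range(n)) * dt / math.pi

def circ_transform(W, D, n=400):
    """2*pi * integral_0^{D/2} J0(2*pi*W*rho) * rho d(rho), by the midpoint rule."""
    drho = (D / 2) / n
    total = 0.0
    for j in range(n):
        rho = (j + 0.5) * drho
        total += bessel_j(0, 2 * math.pi * W * rho) * rho
    return 2 * math.pi * total * drho

def jinc(x):
    """jinc(x) = 2*J1(x)/x with jinc(0) = 1."""
    return 1.0 if x == 0 else 2 * bessel_j(1, x) / x

D = 2.0  # illustrative aperture diameter
for W in (0.0, 0.2, 0.61):
    print(W, circ_transform(W, D), math.pi * D**2 / 4 * jinc(math.pi * W * D))
```

At W = 0 both columns return the aperture area, $\pi D^2/4$, and near $W = 1.22/D$ both pass through the first zero of the Airy pattern.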
Example B.2
Arrays of identical shapes: Arrays of the same object are formed by a convolution with a replicating comb function, see Sections 6.7 and B.9. Here we consider the simple example of two circ functions separated by a distance, d, as shown in Fig. B.13(i). Two circ functions centred at positions $\pm d/2$ along the x axis are described by the function
$$ f(x, y) = {\rm circ}\left(\frac{\rho}{D}\right) * X_d^{(2)}(x) . \tag{B.62} $$
The Fourier transform is
$$ \mathcal{F}[f(x, y)](u, v) = 2\cos\pi ud\;\frac{\pi D^2}{4}\,{\rm jinc}(\pi W D) . \tag{B.63} $$
The modulus-squared of the Fourier transform corresponds to an Airy pattern with cosine-squared interference fringes, as observed in Young's two-hole experiment, see Fig. B.13(ii) plus Chapters 3 and 5.
$$ F[n] = \sum_{m=0}^{N-1} f[m]\,e^{-i2\pi nm/N} , \tag{B.64} $$
where n is also an integer between 0 and N − 1. In this expression, the real and Fourier space variables x and u are replaced by the integers m and n, respectively. This equation is not exactly the discrete variant of the Fourier transform because the sum is only over positive frequencies (because computers prefer positive indices), whereas the Fourier transform integrates over both positive and negative frequencies. We can see the consequence of this difference by looking at a specific example.^16
^16 One of the quickest routes to determining whether FFT delivers the output we expect is to find the Fourier transform pair for a simple case such as $\cos 2\pi u_0 x$, and checking that the Fourier transform returns peaks at the spatial frequencies $\pm u_0$.
Example B.3
Discrete FT: Using eqn (B.64) we implement a Fourier transform manually and compare it to an in-built routine. As an input we choose the binomial sequence with N = 9 values, i.e. f[m] = [1, 8, 29, 58, 72, 58, 29, 8, 1]. This has a similar shape to a gaussian, as illustrated in the top row of Fig. B.14. To map a gaussian centred at the origin onto the m axis, we would use x[m] = [−4, −3, −2, −1, 0, 1, 2, 3, 4].
The values of F[n] found using eqn (B.64) are plotted as points in the middle row. The values are complex so we plot the modulus. The grey boxes indicate the output of the 'python' open-source fft module. The fact that they agree shows that the in-built module is using an algorithm based on eqn (B.64). Note that the first term in our output array is F[0] which is the zero frequency component (or dc offset in electronics). Putting n = 0 in eqn (B.64), we recover the central ordinate theorem,
$$ F[0] = \sum_{m=0}^{N-1} f[m] , \tag{B.65} $$
which in our example is 264. For n > 0 the terms decrease and then increase again close to n = N − 1. This increase is not so surprising when we remember that for a discrete Fourier series the wave form repeats, and if we continued with values n ≥ N we would return an identical sequence of numbers. This cyclic nature of the sum means that we can shift the sequence such that zero is in the middle.^17 The shifted output is gaussian-like, Fig. B.14 (bottom row), as expected for a 'gaussian' input.
^17 In 'python', the fftshift function performs this operation.
Fig. B.14 The top row shows the input function. The middle row shows the prediction of eqn (B.64) (black circles) and the output of an open source fft code. The bottom row shows the effect of applying the function fftshift to the output.
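A manual implementation of eqn (B.64) along these lines might look like the following sketch. It uses only the standard library; the helper `shift` is a hypothetical stand-in that mimics the cyclic rotation performed by an fftshift routine, and the output could be compared against `numpy.fft.fft` and `numpy.fft.fftshift` if NumPy is available.

```python
import cmath
import math

def dft(f):
    """Direct evaluation of F[n] = sum_m f[m] exp(-i 2 pi n m / N), eqn (B.64)."""
    N = len(f)
    return [sum(f[m] * cmath.exp(-2j * math.pi * n * m / N) for m in range(N))
            for n in range(N)]

def shift(seq):
    """Rotate the sequence cyclically so the zero-frequency term is in the middle."""
    half = len(seq) // 2
    return seq[-half:] + seq[:-half] if half else list(seq)

f = [1, 8, 29, 58, 72, 58, 29, 8, 1]   # input sequence from the example
F = dft(f)
moduli = [abs(v) for v in F]

print(moduli[0])                        # zero-frequency term, the sum of f[m]
print([round(v, 2) for v in moduli])    # falls, then rises again towards n = N - 1
print([round(v, 2) for v in shift(moduli)])  # peak moved to the centre
```

The rise towards n = N − 1 reflects the fact that index N − n plays the role of the negative frequency −n.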
Exercises
(B.1) Fourier transform properties
Evaluate the Fourier transforms of the following functions, using $F(u) = \mathcal{F}[f(x)](u)$, $G(u) = \mathcal{F}[g(x)](u)$, $H(u) = \mathcal{F}[h(x)](u)$, and $H(v) = \mathcal{F}[h(y)](v)$, etc. where appropriate:
(i) $g(x) + h(y)$.
(ii) $f(x - d)$.
(iii) $f(x - d/a)$.
(iv) $f[(x - d)/a]$.
(v) $g(x)h(x)$.
(vi) $g(x)h(y)$.
(vii) $[f(x) * g(x)]h(x)$.
(viii) $[f(x) * g(x)]h(y)$.
(B.2) Fourier transforms
Evaluate the Fourier transforms of the following:
(i) ${\rm rect}(x/a) * \delta(x - d)$.
(ii) ${\rm rect}(x/a)\,e^{i2\pi u_0 x}$.
(iii) $X_d^{(4)}(x)$.
(iv) $X_d^{(2)}(x) * {\rm rect}(x/a)$.
(v) ${\rm gauss}(x/w_x)\,{\rm gauss}(y/w_y)$.
(vi) ${\rm gauss}(\rho/w_0)$.
(vii) ${\rm rect}(x/a)\,{\rm gauss}(y/w_0)$.
(viii) ${\rm circ}(\rho/D)\cos(2\pi u_0 x)$.
Comment on when you might encounter each of these functions in optics.
(B.3) rect
(a) Sketch the following functions: (i) rect(2x), (ii) (1/2)rect(x/2), and (iii) 5rect[(x − 5)/5]. (b) Write down the Fourier transforms of the following functions (i) 5rect(x), (ii) 3rect(x/3), and (iii) rect(x/5). Sketch the sum of the functions, and the Fourier transform of the sum.
(B.4) Convolution of different-width rect functions
Use a graphical technique to convolve rect(x/a) with rect(x/b), where b > a.
(B.5) sine and cosine
The Fourier transform of cosine is two δ functions,
$$ \mathcal{F}[\cos(2\pi u_0 x)](u) = \tfrac{1}{2}\left[\delta(u - u_0) + \delta(u + u_0)\right] . $$
Which two properties of Fourier transforms are used in deriving this result? What happens to the Fourier transform in the limit $u_0 \to 0$? What is the Fourier transform of $\sin(2\pi u_0 x)$?
(B.6) Gaussian functions
Write an equation for the Fourier transform of $f(x) = {\rm gauss}(x/a)$. In the propagation of light, such as the paraxial approximation to a plane wave at an angle, we encounter quadratic phase factors of the form, $H(k_x) = e^{ik_x^2 z/2k}$. The far-field light distribution can be written as a convolution of the inverse Fourier transform of this function, $h = \mathcal{F}^{-1}[H(k_x)](x)$, with the input field. Find the function h. [Hint: H is a gaussian with a complex width, a, given by $1/a^2 = -iz/k$.]
(B.7) Dirac combs and replicators
By substituting $\tilde{x} = x/d$ and $\tilde{u} = ud$ into
$$ \mathcal{F}[X(\tilde{x})] = \int_{-\infty}^{\infty}\sum_{m=-\infty}^{\infty}\delta(\tilde{x} - m)\,e^{-i2\pi\tilde{u}\tilde{x}}\,d\tilde{x} = X(\tilde{u}) , $$
show that
$$ \mathcal{F}\left[X\!\left(\frac{x}{d}\right)\right](u) = d\,X(ud) . $$
If we define the replicator functions:
$$ X_{1/d}(u) = \sum_{m=-\infty}^{\infty}\delta\!\left(u - \frac{m}{d}\right) = d\,X(ud) , $$
use the results above to show that
$$ \mathcal{F}[X_d(x)](u) = X_{1/d}(u) . $$
Comment on the unusual scaling property of this pair of functions.
(B.8) comb
An aperture consists of four narrow slits at x = −2d, −d, d, and 2d. Write an expression for the aperture function, f(x), both as a difference between a comb function and a δ-function, and as a convolution of two comb functions. Write expressions for the Fourier transform in both cases as a sum of phasors, and show that they are the same. How many subsidiary maxima are there between the principal maxima in the interference pattern far downstream of the aperture? Comment on your reasoning by describing or drawing a phasor diagram and giving the angles corresponding to each of the zeros.
(B.9) Spatial frequency or angular spatial frequency
Rewrite eqns (B.3) to (B.14) in terms of angular spatial frequencies $k_x = 2\pi u$ and $k_y = 2\pi v$.
(B.10) Two-dimensional Fourier transforms
If $F(k_x, k_y) = \mathcal{F}[f(x, y)]$, find an expression for F(0, 0) for f(x, y) = rect(x/D)rect(y/D) and
tri(x/a) = (1/a) [rect(x/a) ∗ rect(x/a)], see Fig. B.15. Comment on the dimensions of tri(x/a).
we find that
where
where
$$ u = \frac{1}{2}(\tilde{\rho}_{ab} + \tilde{\rho}_{ba}) , \quad\text{and}\quad v = \frac{1}{2i}(\tilde{\rho}_{ab} - \tilde{\rho}_{ba}) . \tag{C.8} $$
2 2i
For a constant field real amplitude E0 , both u and v are constant and
real. We find u and v by solving the optical Bloch equations.
where $\hat{\mathbf{D}} = -e\mathbf{r}$ is the dipole operator. Note that the electric field vector is directed from positive to negative, whereas the dipole vector is directed from negative to positive, as in Fig. C.1. The lower energy state is when $\hat{\mathbf{D}}$ and $\mathbf{E}$ are parallel, in which case we can write $H_{\rm int} = -\hat{D}E$, where $\hat{D} = -er_\lambda$ is the projection of $\mathbf{r}$ onto the polarization state of the light. Inserting $E = E_0\cos(kz - \omega t)$ and the two-state wave function, we have
$$ i\hbar\left(\dot{c}_a e^{-iE_a t/\hbar}|a\rangle + \dot{c}_b e^{-iE_b t/\hbar}|b\rangle\right) + c_a E_a e^{-iE_a t/\hbar}|a\rangle + c_b E_b e^{-iE_b t/\hbar}|b\rangle = \left(E_a|a\rangle\langle a| + E_b|b\rangle\langle b| - \hat{D}E_0\cos\omega t\right)\left(c_a e^{-iE_a t/\hbar}|a\rangle + c_b e^{-iE_b t/\hbar}|b\rangle\right) . $$
Fig. C.1 A classical dipole in an electric field.
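$$ \dot{\tilde{\rho}}_{ba} = -i\frac{\Omega}{2}(\rho_{aa} - \rho_{bb}) + i\Delta\tilde{\rho}_{ba} . $$
Finally, we add a damping term. If the only damping mechanism is spontaneous emission from the excited state then the coherence term decays at a rate equal to one half of the spontaneous emission rate $\Gamma$ and the optical Bloch equation for the coherence becomes
$$ \dot{\tilde{\rho}}_{ba} = -i\frac{\Omega}{2}(\rho_{aa} - \rho_{bb}) + i\Delta\tilde{\rho}_{ba} - \frac{\Gamma}{2}\tilde{\rho}_{ba} . $$
Similarly, we obtain the corresponding equation for the rate of change of the excited state population,
$$ \dot{\rho}_{bb} = -i\frac{\Omega}{2}(\tilde{\rho}_{ab} - \tilde{\rho}_{ba}) - \Gamma\rho_{bb} . $$
We can combine these equations into equations for u, v, and the population difference $w = \frac{1}{2}(\rho_{bb} - \rho_{aa})$:
$$ \dot{u} = -\frac{1}{2}\Gamma u + \Delta v , $$
$$ \dot{v} = -\Delta u - \frac{1}{2}\Gamma v - \Omega w , $$
$$ \dot{w} = \Omega v - \Gamma\left(w + \frac{1}{2}\right) . $$
In this form of the optical Bloch equations, the field acts as a torque $(\Omega, 0, -\Delta)$ on the Bloch vector $(u, v, w)$ (see Adams et al., 1994). The steady-state solutions of these equations are
$$ u_{\rm st} = \frac{\Delta}{\Omega}\frac{s}{1+s} , \quad v_{\rm st} = \frac{\Gamma}{2\Omega}\frac{s}{1+s} , \quad\text{and}\quad w_{\rm st} = -\frac{1}{2}\frac{1}{1+s} , $$
where $s = (\Omega^2/2)/(\Delta^2 + \Gamma^2/4)$ is known as the saturation parameter. Another useful result is the steady-state population in the excited state,
$$ \rho_{bb} = w_{\rm st} + \frac{1}{2} = \frac{1}{2}\frac{s}{1+s} = \frac{\Omega^2/4}{\Delta^2 + \Gamma^2/4 + \Omega^2/2} . $$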
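The steady-state expressions can be checked by integrating the three equations for u, v and w forward in time until the transients have decayed. The following is a minimal sketch using a hand-written fourth-order Runge–Kutta step; the parameter values (in units where $\Gamma = 1$), the time step and the function names are illustrative assumptions.

```python
def bloch_rhs(state, Omega, Delta, Gamma):
    """Right-hand sides of the optical Bloch equations for (u, v, w)."""
    u, v, w = state
    return (-0.5 * Gamma * u + Delta * v,
            -Delta * u - 0.5 * Gamma * v - Omega * w,
            Omega * v - Gamma * (w + 0.5))

def rk4_step(state, dt, *params):
    """One classical fourth-order Runge-Kutta step."""
    k1 = bloch_rhs(state, *params)
    k2 = bloch_rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)), *params)
    k3 = bloch_rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)), *params)
    k4 = bloch_rhs(tuple(s + dt * k for s, k in zip(state, k3)), *params)
    return tuple(s + dt * (a + 2 * b + 2 * c + d) / 6
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def steady_state(Omega, Delta, Gamma):
    """Analytic steady state in terms of the saturation parameter s."""
    s = (Omega**2 / 2) / (Delta**2 + Gamma**2 / 4)
    return (Delta / Omega * s / (1 + s),
            Gamma / (2 * Omega) * s / (1 + s),
            -0.5 / (1 + s))

Omega, Delta, Gamma = 1.0, 0.5, 1.0   # illustrative values
state = (0.0, 0.0, -0.5)              # all population in the ground state
dt = 0.01
for _ in range(4000):                 # integrate to t = 40 / Gamma
    state = rk4_step(state, dt, Omega, Delta, Gamma)

print(state)
print(steady_state(Omega, Delta, Gamma))
```

Plotting the intermediate states instead of only the final one would show the damped Rabi oscillations on the way to the steady state.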
258 Induced dipoles
$$ \alpha = -\frac{D_0^2}{\hbar}\frac{1}{\Delta + i\Gamma/2} , \tag{C.12} $$
match the terms in eqn (C.11). Note that far off resonance, $|\Delta| \sim \omega_0$, we can neglect the damping but we must include the counter-rotating $(\omega + \omega_0)$ terms, and the polarizability is
$$ \alpha = \frac{D_0^2}{\hbar(\omega_0 - \omega)} + \frac{D_0^2}{\hbar(\omega_0 + \omega)} . \tag{C.14} $$
The dc polarizability^3 is
$$ \alpha = \frac{2D_0^2}{\hbar\omega_0} . \tag{C.16} $$
In Example C.1 we derive a relationship between $\Gamma$ and the dipole matrix element, $D_0$.
^3 We have restricted our attention to a system with one excited state. However, in a multi-level system there will be many energy levels, with energies $E_k$, that contribute to the polarizability. In this case, we sum the contribution of each mode of the dipole to give a polarizability of state j,
$$ \alpha_j = \sum_{k \neq j}\frac{2|D_{jk}|^2}{E_k - E_j} . \tag{C.15} $$
C.4 Lorentz model 259
Example C.1
Scattering rate: To find an expression for the scattering (or spontaneous decay) rate, $\Gamma$, we can use the fact that the power radiated by the dipole must be equal to the energy loss rate of a single quantum emitter. For a two-level atom, the energy loss rate is equal to the probability of being in the excited state, $\rho_{bb}$, times the excited state decay rate, $\Gamma$, times the energy of one photon, $\hbar\omega_0$, i.e.
$$ P = \Gamma\rho_{bb}\hbar\omega_0 , \tag{C.17} $$
where on resonance ($\Delta = 0$) the steady-state excited-state probability is $\rho_{bb} = (D_0E_0/\hbar\Gamma)^2$. The power radiated by an oscillating dipole is derived in Chapter 13, eqn (13.43). Equating the energy loss, eqn (C.17), and the power radiated using the resonant polarizability, $\alpha = i2D_0^2/\hbar\Gamma$, we obtain
$$ \Gamma\left(\frac{D_0E_0}{\hbar\Gamma}\right)^2\hbar\omega_0 = \frac{1}{4\pi\epsilon_0}\left(\frac{2D_0^2}{\hbar\Gamma}\right)^2\frac{ck^4}{3}E_0^2 , \tag{C.18} $$
which gives
$$ \Gamma = \frac{k^3D_0^2}{3\pi\epsilon_0\hbar} = \frac{D_0^2}{3\pi\epsilon_0\hbar(\lambda/2\pi)^3} . \tag{C.19} $$
From this result we see that the coupling between a dipole and the vacuum (characterized by $\hbar\Gamma$) is approximately equal to the coupling between two dipoles separated by a distance $r = \lambda/2\pi$. The dipole–dipole interaction energy is $D_0E_d$, where $E_d$ is the dipolar field, eqn (A.21).
Example C.2
Classical equation of motion: The bound charge is subject to a driving field, $E = E_0e^{-i\omega t}$, which induces a displacement x in the centre of mass of the electric charge distribution. The displacement oscillates at the same frequency as the driving field, $x = x_0e^{-i\omega t}$. The amplitude of the oscillation depends on the amplitude of the driving field, $E_0$, the drive frequency, $\omega$, and the damping or scattering rate, $\Gamma$. For convenience, we assume that the optical response is dominated by one resonance, with resonant frequency $\omega_0$.^5 The difference between the drive frequency and the resonant frequency is called the detuning, $\Delta = \omega - \omega_0$.
Consider a dipole consisting of a single bound electron with resonant frequency $\omega_0$. If the charge displacement, x, is small compared to the optical wavelength, then the equation of motion has the form
$$ \ddot{x} + \gamma\dot{x} + \omega_0^2x = -\frac{e}{m}E_0e^{-i\omega t} , \tag{C.20} $$
^5 For real atoms or molecules the bound charge has multiple resonances, but often the light is closer to resonance with one particular resonance and we can neglect the others.
where $-e$ and m are the charge and mass of the electron. Substituting the trial solution, $x = x_0e^{-i\omega t}$, we find
$$ x_0 = -\frac{e}{m}\frac{1}{-\omega^2 + \omega_0^2 - i\omega\gamma}E_0 . \tag{C.21} $$
The induced dipole moment is defined as charge times displacement,
$$ d = -ex_0 = -\frac{e^2}{m}\frac{1}{\omega^2 - \omega_0^2 + i\omega\gamma}E_0 . \tag{C.22} $$
Example C.3
Refractive index: From the induced dipole, eqn (C.22), assuming that the dipoles do not interact, we obtain an expression for the refractive index,
$$ n^2 - 1 = \chi = \frac{Nd}{\epsilon_0E} = -\frac{Ne^2}{m\epsilon_0}\frac{1}{\omega^2 - \omega_0^2 + i\omega\gamma} , \tag{C.23} $$
which, as we will show, is similar to the quantum model. In a metal, there is no binding, $\omega_0 = 0$, and $\omega_{\rm p} = [Ne^2/(m\epsilon_0)]^{1/2}$ is known as the plasma frequency and the refractive index is given by
$$ n^2 = 1 - \frac{\omega_{\rm p}^2}{\omega^2 + i\omega\gamma} . \tag{C.24} $$
At high frequency, $\omega > \gamma$, this reduces to
$$ n^2 = 1 - \frac{\omega_{\rm p}^2}{\omega^2} . \tag{C.25} $$
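The behaviour of eqn (C.24) either side of the plasma frequency can be explored with a few lines of code. This is a sketch in units where $\omega_{\rm p} = 1$; the function name and the sampled frequencies are illustrative assumptions, and the default $\gamma = 0$ reproduces the high-frequency form, eqn (C.25).

```python
import cmath

def refractive_index(omega, omega_p, gamma=0.0):
    """n from n^2 = 1 - omega_p^2 / (omega^2 + i*omega*gamma), eqn (C.24)."""
    return cmath.sqrt(1 - omega_p**2 / (omega**2 + 1j * omega * gamma))

omega_p = 1.0  # frequencies measured in units of the plasma frequency
for omega in (0.5, 0.9, 1.1, 2.0):
    print(omega, refractive_index(omega, omega_p))
```

Below the plasma frequency $n^2$ is negative, so n is purely imaginary and the wave is evanescent; above it n is real and less than one.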
Example C.4
Comparison between quantum and classical dipole models: If the drive frequency is close to resonance, $\omega + \omega_0 \approx 2\omega_0$ and $\omega \approx \omega_0$, then
$$ d = \frac{e^2}{m\omega_0}\frac{1}{-2\Delta - i\Gamma}E_0 . \tag{C.26} $$
Rewriting the oscillation frequency in terms of the harmonic oscillator length, $a_0^2 = \hbar/2m\omega_0$, we obtain
$$ d = -\frac{e^2a_0^2}{\hbar}\frac{1}{\Delta + i\Gamma/2}E_0 . \tag{C.27} $$
We can define a semi-classical analogue of the dipole matrix element as $D_0 = -ea_0$,^6 and then
$$ d = -\frac{D_0^2}{\hbar}\frac{1}{\Delta + i\Gamma/2}E_0 = \alpha E_0 , \tag{C.28} $$
where
$$ \alpha = -\frac{D_0^2}{\hbar}\frac{1}{\Delta + i\Gamma/2} . \tag{C.29} $$
This is the same result derived for the two-level system in the linear optics regime, eqn (C.12). Note also the difference between the quantum and classical models: eqn (C.28) predicts that the induced dipole grows linearly in proportion to the applied field, whereas the (correct) quantum model, see eqn (C.10), shows that there is saturation.
^6 As previously, there is a clear distinction between the constant $D_0$ which only depends on the properties of the atom, and the induced dipole moment d which depends on the amplitude of the applied field.
References
Horváth, G., Barta, A., Pomozi, I., Suhai, B., Hegedüs, R., Åkesson,
S., Meyer-Rochow, B., and Wehner, R. (2011) On the trail of Vikings
with polarized skylight: experimental study of the atmospheric optical
prerequisites allowing polarimetric navigation by Viking seafarers
Philosophical Transactions of the Royal Society of London B: Biological
Sciences 366 772–82.
Hughes, I. G. and Hase, T. P. A. (2010) Measurements and their
Uncertainties: A Practical Guide to Modern Error Analysis Oxford
University Press, Oxford.
Hwang, J., Pototschnig, M., Lettow, R., Zumofen, G., Renn, A.,
Götzinger, S., and Sandoghdar, V. (2009) A single-molecule optical
transistor Nature 460 76–80.
Isham, C. J. (1995) Lectures On Quantum Theory: Mathematical And
Structural Foundations Imperial College Press, London.
Jackson, J. D. (1999) Classical Electrodynamics 3rd edition, John Wiley
& Sons, New York.
Jackson, J. D. and Okun, L. B. (2001) Historical roots of gauge
invariance Reviews of Modern Physics 73 663–80.
Jackson, J. D. (2002) From Lorenz to Coulomb and other explicit gauge
transformations American Journal of Physics 70 917–28.
Jacques, V., Lai, N. D., Dréau, A., Zheng, D., Chauvat, D., Treussart,
F., Grangier, P. and Roch, J.-F. (2008) Illustration of quantum
complementarity using single photons interfering on a grating New
Journal of Physics 10 123009.
Jacquinot, P. and Roizen-Dossier, B. (1964) Apodisation Progress in
Optics 3 29–186.
Jennewein, S., Sortais, Y. R. P., Greffet, J. J., and Browaeys, A. (2016)
Propagation of light through small clouds of cold interacting atoms
Physical Review A 94 053828.
Jones, P. H., Maragò, O. M., and Volpe, G. (2015) Optical Tweezers:
Principles and Applications Cambridge University Press, Cambridge.
Joos, E. and Zeh, H. D. (2003) Decoherence and the Appearance of a
Classical World in Quantum Theory Springer, Berlin.
Kasevich, M. and Chu, S. (1992) Laser cooling below a photon recoil
with three-level atoms Physical Review Letters 69 1741–4.
Kasdin, N. J., Vanderbei, R. J., Spergel, D. N., and Littman, M.
G. (2003) Extrasolar planet finding via optimal apodized-pupil and
shaped-pupil coronagraphs The Astrophysical Journal 582 1147–61.
Keaveney, J., Hughes, I. G., Sargsyan, A., Sarkisyan, D., and Adams,
C. S. (2012) Maximal refraction and superluminal propagation in a
gaseous nanolayer Physical Review Letters 109 233001.
Krist, J. E., Hook, R. N., and Stoehr, F. (2011) 20 years of Hubble Space
Telescope optical modeling using Tiny Tim Proceedings of the SPIE
8127 81270J.
complementary aperture, 76, 104, 105, 165, 225
complex
  beam parameter, 178, 180–1, 197
  notation, Fourier series, 93
  notation, polarizability, 214, 258
  notation, Fourier transform, 94, 113
  notation, polarization, 54–5, 68–9
  notation, plane waves, 16, 19, 32
  numbers, vii, 8–9
    modulus-squared, 8
conductivity, 227–8
conjugate planes, 30
connection formula, 190
continuity equation, 7
continuous, 93
  medium, 215
  spectrum, 131, 135
  sum, 93
continuous wave (cw), 112
convolution, 100, 102–3, 105–6, 112, 115, 118, 132, 148–50, 163, 168–70, 220, 239–40, 242–3, 248–9, 252–4
core, of a fibre, 188–92
corn syrup, 62
correlation
  and coherence, 127
  intensity, 136
  photon, 12
  spatial and van Cittert and Zernike, 140
  time, 133
  and Wiener–Khinchin–Einstein theorem, 132
cosine wave, 92–3, 108
COSTAR (Corrective optics space telescope axial replacement), 149
counter-propagating waves, 36, 65–6, 247
cross section, 213, 223–5
crystallography, 83, 89, 103
current density, 200, 226–7, 233–5
curved wave fronts, vii, 10, 15–16, 23, 26
cylindrical
  lens, 168
  symmetry, 27, 73–4, 78, 83, 88–9, 160, 162–3, 188, 250–1
  vector beam, 201, 203
  wave, 15, 23, 38, 40, 77, 87
  waveguide, 183–8

D

damping, 226–7, 257–9
de Broglie relation, 11, 13, 108
decay rate, 258
decoherence, 128
degree of coherence, 128, 136, 139–40
density matrix, 255
depth of
  field, 158
  focus, 158
dextro rotatory, 62
dichroism, 57–8, 63
dielectric, 4, 213, 219–220
diffraction, 57–8, 63
  Fraunhofer, vii, 77–85, 101–105, 148, 153, 181, 184, 241
  Fresnel, vii, 72–7, 85–8
  grating, 43–5, 77, 82, 87–8, 104, 109, 114, 170, 172, 174–5, 248–9
  limit, 147–8, 181
diode
  laser, 130, 182, 210
  optical, 63, 69
dipole, 22, 213–225, 228–31, 235–8, 255–8
Dirac δ-function, 95, 240, 246
discrete
  Fourier transform, 252–3
  medium, 215
  spectrum, 94, 96–7, 114–5, 131–2
  sum, 40, 73, 91–4, 96, 131–2, 184–5, 248
dispersion, vii, 20, 28, 32, 43, 116–24, 126, 213–4, 221, 227, 231
  anomalous, 119
  normal, 119
dispersionless, 117–9, 122
dispersive medium, 117
Doppler broadening, 128
double slit, 40, 49–50, 79, 81–2, 87, 90, 101, 113, 136–9, 144–6, 149, 152–3, 156, 172
doublet, achromatic, 28
Drude–Lorentz model, 226–7

E

edge detection, 168
eigenfunction, 59
eigenmode, 182
Einstein, Albert, 132
  special relativity, 123
electric
  current, 213, 225, 230, 234
  dipole, see dipole
  field, vii, 2–4, 14, 111, 127, 177, 196–8, 213, 228–9, 233–7, 256, 259
    scalar approximation, 9
    vector, 51–5, 57, 64–6, 198–200, 202, 204–7
  susceptibility, 215–21, 228–9
electromagnetic
  field, 2, 7, 11, 35, 61, 111, 196, 198, 201, 213–14, 225, 227, 233–5
  spectrum, 75
  wave, 4, 13
electromagnetically induced transparency (EIT), 122
electron, 4, 11, 37, 61, 156, 213
  bound, 20, 213
  free, 213, 225
ellipsoid, 26
elliptical polarization
  polarization, 55–6, 60, 68, 70
  wave fronts, 26
emission
  spontaneous, 128, 257–8
endoscope, 183
energy, 35, 42, 195, 223
  conservation, 21, 23, 37, 42, 50, 120, 180, 205, 226, 242
  density, 7, 14
  flux, 4, 7, 21, 37
  level, 12, 187, 258
  photon, 11
entrance pupil, 152
envelope function, 112
etalon, Fabry-Perot, see Fabry-Perot
Euler, Leonhard, viii
evanescent wave, 199, 227
Ewald, Paul, 218
Ewald–Oseen extinction theorem, 217–18
extinction, 58, 217–18
  cross section, 222, 224
  paradox, 105, 109, 225
extraordinary axis, 59
eye diagram, 121

F

f-number, 78, 158
Fabry, Charles, 45
Fabry-Perot, 45–47
false detail, 166
Faraday, Michael, 1
Faraday
  effect, 57, 61–3, 69
  rotation, 56
far field, 38–40, 78–84, 87, 101–2, 105–6, 178–9, 201, 222, 225
fast axis, 59–60, 68
fast light, 122–3
fibre
  graded index, 185–7
  multimode, 183, 190–2
  photonic crystal, 194
  single mode, 183, 189–92
  step index, 188–92
field, vii, 233
  electric and magnetic, 1–5, 233
  electromagnetic, 2
film
  thin, 22, 46–7, 49–50
filter, 242
Index 269
imaging, 28–30, 87, 172, 204
  phase-contrast, 171–2
incoherence, 127
index of refraction, see refractive index
in-quadrature, 8, 216, 256
intensity, 4, 7–9, 11–12, 20–3, 34–7
  correlations, 136
  fluctuations, 12
  map, 12
  maxima and minima, 40
  point spread function, 149, 161–2
  reflection coefficient, 22, 126
  saturation, 258
  time average, 7, 9
interference, vii, 1, 12, 22, 33–43, 64–5, 71, 74, 81–2, 87, 96, 113, 116–17, 128, 131, 136–41, 147, 156, 164, 172, 188, 216, 223, 252
interferogram, 131, 162
interferometer, 37, 45–8, 129–30, 135, 137
  Michelson, 130–2, 135, 162
  Michelson’s stellar, 141, 145
  Ramsey, 113
  Young’s, see Young’s two-hole experiment
interferometry, 45–8, 130, 135
inverse apodization, 159, 163–4
inverse Fourier transform, 94
iodoquinine sulfate, 58

J

Jamin interferometer, 45
Janssen, Hans and Zacharias, 26
jinc function, 78, 141, 149–50, 163–4, 240, 250–2
Jupiter, 26

K

Kelvin, see Thomson, William
Kerr effect, 59, 228
Khinchin, Aleksandr, 132
Kirchhoff, Gustav, 72
Kohlrausch, Rudolf, 2
Kramers, Hendrik, 180, 221
Kramers-Kronig relations, 57, 123, 220–1, 231
Kronig, Ralph, 221

L

laevo rotatory, 62
laser
  argon-ion, 115–16
  cavity, 115–16, 182–4
  coherence time, 130
  cooling, 64, 162
  diode, 130, 182
  He–Ne, 85, 130, 193, 197
  modes, see mode
  pointer, 14, 49, 90, 105, 108, 182
  ultra-stable, 128, 130
laser beam propagation, see propagation, laser beam
Law of refraction, 20–2, 25–6
Left-circularly polarized light, convention, 53
length, coherence, see coherence, length
lens
  aberrations, see aberration
  achromatic, 28, 43
  anaclastic, 26, 222
  angular resolution limit, 151
  aplanatic, 203, 204, 211
  beam divergence, 90, 97–8, 178, 189
  equation, 30
  f to f, 151–2
  f number, 78, 158
  finite size, consequence of, 78
  focal length, 27–8
  Fourier-transform property of, see f to f
  Fraunhofer diffraction realized with, 77–8
  Fresnel, 72, 86
  geometry, 27–8
  graded-index, 185–6
  history, 26, 147
  honey drop, 34, 71
  imprinting quadratic phase, 28, 29, 186
  modifying point-spread function, see apodization
  numerical aperture of, 148
  thin-lens approximation, 27–9
  two-lens system, see two-lens system
  zone plate as, 74–5
lifetime, of excited state, 214
light-matter interaction, vii, 4, 55, 213–29
limit, Abbe diffraction, 147–8
linearity of Fourier transform, 241
linearly-polarized light, see polarization, linear
longitudinal component of electric field in a light beam, 9, 196–8, 203, 206–8, 210
Lorentz, Hendrik, 219
Lorentz–Lorenz law, 219–20
Lorentz force, 2
Lorentz model, 259
Lorentzian lineshape, 47, 128–9, 143, 180
Lorentzian chaotic light, 133
Lorenz, Ludvig, 219
Lorenz gauge, 235–6
low-pass filter, 165–70

M

Mach–Zehnder interferometer, 45
magnetic field, 2, 4, 61–3, 65–7, 198–9
  derived from potential, 233–5
  in the paraxial limit, 210
  in phase with electric field, 17
  not in phase with electric field, 65
  for a plane wave, 17, 52
  wave equation, 3
magneto-optic media, 64
Malus, Étienne-Louis, 58
Malus’ Law, 58
matrix element for light–matter interaction, 214, 255, 258, 260
matter wave, 11–12, 126, 156, 183, 189
Maxwell, James, 1
Maxwell’s equations, 2, 3–4, 7, 17, 21, 23, 65, 72, 184, 195–8, 201, 221, 225, 233–4
meridional plane, 204
Michelson, Albert, 45
Michelson
  interferometry, 47–9, 70, 130–2, 135, 162
  stellar interferometry, see stellar interferometry
microscope, 26, 147–8, 155, 166
microwave, 4, 196, 227–8
mirror
  history, 26
  in laser cavity, 182–3
  in telescope, 141, 149
  metal, 1, 19, 43, 184, 213, 225–7, 260
mode, 177
  azimuthally polarized, 201–4
  cavity, 182–3
  fibre, 185–192
  Hermite–Gauss, 202, 211
  longitudinal of laser, 115–16
  matching, 221–2
  radially polarized, 201–2
  transverse of laser, 182–3
  transverse electric (TE) mode, see transverse electric (TE) mode
  transverse electric and magnetic (TEM) mode, see transverse electric and magnetic (TEM) mode
  transverse magnetic (TM) mode, see transverse magnetic (TM) mode
  of a waveguide, 183
  within a slit, 184–5
modulus squared, 8
momentum
  distribution, 83–4, 95, 97–8, 108, 184–5, 241, 246
quantum dipole model, 235, 238, 255, 260
quantum field, 2, 11, 246
quantum mechanics, 1, 11–12, 37, 55, 57, 59, 91, 113, 128, 136, 156, 183, 189, 224–5, 234–5, 244–5, 259–60
quantum optics, 1, 12, 136
quantum tunneling, 191
quarter-wave plate, 59–61
quartz, birefringence, 68
quasi-monochromatic light, 136, 139

R

Rabi frequency, 256
Rabi oscillations, 257–8
radiation
  field, 200–1, 215, 222, 238
  term, of electric dipole, 238
rainbow, 20
Raman transition, 162
Ramsey, Norman, 113
Ramsey, interferometer and fringes, 113–14
randomness, 62, 127–9, 132, 134, 136, 156
ray, 1, 26, 200, 204
Rayleigh
  criterion, 150, 159
  distance (length), 79, 80, 84
  limit, 151, 160
  range, 83, 105, 178, 179–81, 186, 222
  scattering, 223
  theorem, 242
real field, 8, 9, 50, 54–5, 68, 113, 125
reciprocal rotation, 61–2
reciprocity theorem, 64, 172
rect function, 79, 84, 101, 103–5, 112–15, 138, 141, 152–3, 161, 168–9, 240, 243–4, 247–51
reflection coefficient, see Fresnel coefficients
refraction, 20–2, 26, 75, 204, 213–4
refractive index, 19–22, 28, 46, 59–60, 117–22, 148, 186–8, 204, 208, 229
  for a medium of dipoles, 214–16, 218–20, 236, 259–60
  for a plasma, 226–7
relative phase, 9, 11, 33–4, 37, 39, 42, 48, 52, 56, 59, 61, 127, 129, 138, 202, 214, 224
replicating function, see comb function
resolution
  angular, 141–51
  limit, 147, 151, 160, 251
  wavelength, 45
resolving power, 45, 156–7
resonance frequency, 20, 62, 121–4, 214–6, 222–5, 228–9, 258–60
resonator, optical, see cavity
retardation, in wave plate, 59–60
retarded time, 236
Richards–Wolf vector diffraction integral, 204
Right-circularly polarized light, convention, 53
Ronchi grating, 87–8, 248–9
rotatory
  dextro, 62
  laevo, 62

S

s polarization, 21, 58
Sagnac interferometer, 45
Sahl, Ibn, 20, 26
Saturn, rings, 26
scalar approximation, 9
scalar potential, 2, 233, 235
scalar wave, see wave, scalar
scaling property of Fourier transforms, 84, 90, 155, 239, 241, 244, 247
scattered field (or light), 19, 41, 51, 62, 156, 223–5
scattering, 8, 41, 105, 122, 156, 213–17, 223–5, 258–9
scattering cross section, 213, 222, 223–5
Schrödinger equation, 12, 126, 184, 187, 256
secondary wave, 71–3, 100
second-order coherence, 136
selection rules, 57
self Fourier, 245, 247
self replicating, 87
shadow, 33–4, 71–2, 76, 86–7, 159, 225
short laser pulses, 113, 125
signal-to-noise, 123, 149, 151
signum function, 220
sinc function, 80–2, 84, 101–2, 104–6, 112–15, 138–41, 153, 161–2, 168, 184–5, 240, 243, 244, 248–51
sine condition (Abbe), 203
sine wave, 92–3, 108
single mode
  fibre, 189–92
  laser, 130
  waveguide, 183
skin depth, 184, 226–8
sky, colour of, 51, 223, 231
slit, 38, 45, 79, 80–8, 104
slow axis, 59, 60–1
slow light, 122–4, 126
slowly varying envelope, 116
small-angle approximation, 24, 36, 40, 95, 98, 139, 148
Snellius, Willebrord, 20
Snell’s law, 20, 22
Sommerfeld, Arnold, 123
spatial filter (4f), 164–72
spatial filtering, 91, 148, 159, 164, 165–72
spatial frequency, 4, 5, 6, 14, 16, 18, 24, 36, 92–7, 166–71, 184–5, 240–7, 251
spectral width, 121, 128–30
spectrometer, Fourier Transform, 131, 160–2
spectroscopy, Fourier Transform, 131, 135, 162
spectrum
  angular, 91, 95, 96–8, 111, 152–3, 178, 185
  frequency, 111, 112–13, 117, 124, 131, 134, 146, 162, 230
  power, 113, 132–3, 134, 135, 143–4
  vector angular spectrum, 195, 198, 199–204
speed of light, 2–4, 14, 16, 111, 123–4
sphere, field on, 198, 222
spherical aberration, see aberration, spherical
spherical wave, 15, 23, 25–30, 34, 36–42, 71–2, 179, 193, 201, 204, 214–15
spin, photon, 56
spontaneous emission, 128, 257
spot of Arago, see Arago, spot of
spot, focal, 89, 148, 158, 163, 206–7
square wave, 93, 108, 248–9
stability of laser cavity, 182–3
standing wave, 36, 64–6, 69, 93, 247
stationary phase, method of, 200
stationary random process, 134, 143
stellar interferometry, 141
step-index fibre, 185, 188–9, 190
step function, 123, 243
  Heaviside function, 220
stored light, 122
stress-induced birefringence, 59
subsidiary maxima, 42–4, 103–4, 160–4, 174, 248
sugar, 61–2
superluminal propagation, 122–4
superposition
  linear, vii, 15, 18–19, 55, 61, 64–5, 72, 91–6, 99, 111, 116–17, 195–9, 203, 213, 222, 238
  principle of, 1, 3, 10, 21, 33, 132
super resolution, 159, 163–4
susceptibility, electric, 215–21, 228–9
symmetry
  cartesian (cartesian separability), 76, 83–4, 90, 100–1, 103, 162, 167, 178, 187, 202, 250
  cylindrical, 27, 73, 160, 185, 188, 201, 204–5, 224, 250–1
  of Fourier transform, 241
  time-reversal, see time-reversal symmetry

T

Talbot effect, 87–8, 166
telescope, 26, 43, 147, 159, 251
  Hubble Space Telescope, point-spread function, 149, 157
temporal broadening of pulse, 121, 126
temporal coherence, 128, 128–9, 136
thermal light, 136
thin lens, 27–30, 186
thin slab, 213, 231–2
Thomson, William, 34, 62
tide, 33–4, 91
tide-predicting machine, 34
tilt fringes, 70
time average, 4, 7–9, 21, 35–7, 129, 132, 136, 222
time-reversal symmetry, 172
  see also reciprocity theorem
top hat, 243
  see also rect function
total internal reflection, 183, 202
transition, atomic, 57, 63, 162, 238, 255
translation property of Fourier transforms, 102, 150, 239, 241, 248
transmission coefficient, 21–2, 46
transmission function, 73, 160–2, 168–9
  see also aperture function
transverse coherence length, 139, 140–1
transverse electric (TE) mode, 198
transverse electric and magnetic (TEM) mode, 177
transverse magnetic (TM) mode, 198
transverse wave, 17–18, 195–6
triangle function, 125, 161–2, 254
triangular aperture, 103
Twiss, Richard, 136
two-lens system, 12, 147, 153–6, 159

U

ultra-stable laser, 130
ultra-violet region of spectrum, 20, 223, 226
uncertainty principle, 98, 241, 245–6
unpolarized light, 51, 58

V

van Cittert, Pieter, 137
van Cittert–Zernike theorem, 137–40
variance, 134
vector light field, 197–210
vector wave, 2–3, 199, 201
vector potential, 2, 69, 233, 234–7
velocity
  of electromagnetic waves, 2
  energy transport, 124
  front, 123
  group, 117, 116, 119–20, 122–4, 227
  of information, 123–4
  phase, 4, 116–17, 119, 124, 227
  signal, 124
Verdet coefficient, 63
Vikings, 51
visibility, 128, 130, 172
  and spatial coherence, 137–9
  and stellar interferometry, 141
  and temporal coherence, 131–3, 135
  and which-path information, 156, 172

W

waist of a gaussian beam, 83, 98, 177, 178–83, 193–4, 197
wave
  circular, see circular wave
  cylindrical, see cylindrical wave
  electromagnetic, 2, 7, 198
  electromagnetic wave in a metal, 225–6
  electromagnetic wave in a plasma, 227
  equation, 2, 3–4, 9–11, 225–6, 235
  evanescent, 199, 227
  front, 10, 15–17, 23, 26–8, 33, 35, 71, 201
  front curvature, 16, 25–6, 83, 151–3, 186
  front curvature of gaussian beam, 179, 182–3
  front division interferometry, 45
  harmonic, see harmonic wave
  longitudinal, see longitudinal wave
  matter, see matter wave
  packet, 92, 97, 120, 130, 244–6
  plane, see plane wave
  scalar, 9–11, 98, 111, 225
  secondary, see secondary wave
  sound, 34
  spherical, see spherical wave
  standing, see standing wave
  transverse, see transverse wave
  travelling, 10
  water, 16, 33, 37, 50
wavelength, 1, 3–6, 11, 18, 20, 28, 43–8, 111, 130, 222–3
wave number, 6
wave-particle duality, 11, 156
wave plate, 59, 60–1, 68–70
  half-wave plate, 59–60
  quarter-wave plate, 59–61
  segmented, 202
wave vector, 3, 6, 11, 16–17, 19, 23–5, 93–4, 96, 98, 111, 117, 197, 199, 201, 218
waveguide, 12, 183–4, 187, 196, 198, 228
Weber, Wilhelm, 2
wedge-shaped slit, 72
wedge fringes, 36, 49
white light, 46, 111, 223
  coherence time, 130
  fringes in Young’s double-slit experiment, 146
  interferometry, 130–1
width, of a spectral function, 134–5
Wiener, Norbert, 132
Wiener–Khinchin–Einstein theorem, 132, 134–5, 143–4
window function, 160
WKB approximation, 189–91

X

X-rays, 1, 33–4
  crystallography and diffraction, 83, 89
  diffraction and Babinet’s principle, 104
  focusing by zone plate, 75

Y

Young, Thomas, 1, 33–4
Young’s interferometer, see Young’s two-hole experiment
Young’s two-hole experiment, 34, 37, 38–40, 50, 87, 90, 136–40, 144–6, 152–3, 156, 172, 252

Z

Zeeman, Pieter, 219
Zeiss, Carl, 147
Zernike, Frits, 137, 171
zero
  first, 41, 43–4, 78, 80, 82, 84, 102, 140, 145, 163–4, 244, 248, 251
  frequency or spatial frequency, 6, 93, 97, 246, 253
  order, 45
zone, Fresnel, see Fresnel zone
zone plate, 74, 75, 86, 89