Physics of Light and Optics
Justin Peatross
Michael Ware
Brigham Young University
2015 Edition
July 20, 2015
Copyright 2015 Justin Peatross and Michael Ware
All rights reserved. The authors retain the copyright to this book. However, the content is
available free of charge at optics.byu.edu. This book may be downloaded, printed, and
distributed freely as long as this copyright notice is included. Any use of a portion of this
book's content as part of another work requires the express written permission of
the authors.
ISBN 978-1-312-92927-2
Preface
Contents
Preface
Table of Contents
0 Mathematical Tools
0.1 Vector Calculus
0.2 Complex Numbers
0.3 Linear Algebra
0.4 Fourier Theory
Appendix 0.A Table of Integrals and Sums
Exercises
1 Electromagnetic Phenomena
1.1 Gauss' Law
1.2 Gauss' Law for Magnetic Fields
1.3 Faraday's Law
1.4 Ampere's Law
1.5 Maxwell's Adjustment to Ampere's Law
1.6 Polarization of Materials
1.7 The Wave Equation
Exercises
Exercises
10 Diffraction
10.1 Huygens' Principle as Formulated by Fresnel
10.2 Scalar Diffraction Theory
10.3 Fresnel Approximation
10.4 Fraunhofer Approximation
10.5 Diffraction with Cylindrical Symmetry
Appendix 10.A Fresnel-Kirchhoff Diffraction Formula
Index
Mathematical Tools
Before moving on to Chapter 1, where our study of optics begins, it would be good to look over this chapter to make sure you are comfortable with the mathematical tools we'll be using. The vector calculus information in Section 0.1 is used straight away in Chapter 1, so you should review it now. In Section 0.2 we review complex numbers. You have probably had some exposure to complex numbers, but if you are like many students, you haven't yet fully appreciated their usefulness. Your life will be much easier if you understand the material in Section 0.2 by heart. Complex notation is pervasive throughout the book, beginning in Chapter 2.

You may safely procrastinate reviewing Sections 0.3 and 0.4 until they come up in the book. The linear algebra refresher in Section 0.3 is useful for Chapter 4, where we analyze multilayer coatings, and again in Chapter 6, where we discuss polarization. Section 0.4 provides an introduction to Fourier theory. Fourier transforms are used extensively in optics, and you should study Section 0.4 carefully before tackling Chapter 7.

René Descartes (1596-1650, French) was born in La Haye en Touraine (now Descartes), France. His mother died when he was an infant. His father was a member of parliament who encouraged Descartes to become a lawyer. Descartes graduated with a degree in law from the University of Poitiers in 1616. In 1619, he had a series of dreams that led him to believe that he should instead pursue science. Descartes became one of the greatest mathematicians, physicists, and philosophers of all time. He is credited with inventing the Cartesian coordinate system, which is named after him. For the first time, geometric shapes could be expressed as algebraic equations. (Wikipedia)

0.1 Vector Calculus

Each position in space corresponds to a unique vector $\mathbf{r} \equiv x\hat{\mathbf{x}} + y\hat{\mathbf{y}} + z\hat{\mathbf{z}}$, where $\hat{\mathbf{x}}$, $\hat{\mathbf{y}}$, and $\hat{\mathbf{z}}$ are unit vectors with length one, pointing along their respective axes. Boldface type distinguishes a variable as a vector quantity, and the use of $\hat{\mathbf{x}}$, $\hat{\mathbf{y}}$, and $\hat{\mathbf{z}}$ denotes a Cartesian coordinate system. Electric and magnetic fields are vectors whose magnitude and direction can depend on position, as denoted by $\mathbf{E}(\mathbf{r})$ or $\mathbf{B}(\mathbf{r})$. An example of such a field is $\mathbf{E}(\mathbf{r}) = q(\mathbf{r}-\mathbf{r}_0)/4\pi\epsilon_0|\mathbf{r}-\mathbf{r}_0|^3$, which is the static electric field surrounding a point charge located at position $\mathbf{r}_0$. The absolute-value brackets indicate the magnitude (or length) of the vector given by
\[
\begin{aligned}
|\mathbf{r}-\mathbf{r}_0| &= \left|(x-x_0)\hat{\mathbf{x}} + (y-y_0)\hat{\mathbf{y}} + (z-z_0)\hat{\mathbf{z}}\right| \\
&= \sqrt{(x-x_0)^2 + (y-y_0)^2 + (z-z_0)^2}
\end{aligned}
\tag{0.1}
\]
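Example 0.1
Compute the electric field at $\mathbf{r} = 2\hat{\mathbf{x}} + 2\hat{\mathbf{y}} + 2\hat{\mathbf{z}}$ due to a positive point charge $q$ positioned at $\mathbf{r}_0 = 1\hat{\mathbf{x}} + 1\hat{\mathbf{y}} + 2\hat{\mathbf{z}}$.
We have
\[ \mathbf{r}-\mathbf{r}_0 = (2-1)\hat{\mathbf{x}} + (2-1)\hat{\mathbf{y}} + (2-2)\hat{\mathbf{z}} = 1\hat{\mathbf{x}} + 1\hat{\mathbf{y}} \]
and
\[ |\mathbf{r}-\mathbf{r}_0| = \sqrt{(1)^2 + (1)^2} = \sqrt{2} \]
The electric field is then
\[ \mathbf{E} = \frac{q\left(1\hat{\mathbf{x}} + 1\hat{\mathbf{y}}\right)}{4\pi\epsilon_0\left(\sqrt{2}\right)^3} \]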
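The arithmetic in Example 0.1 can be checked numerically. The following is a minimal sketch, assuming a hypothetical helper `point_charge_geometry` (not from the text) that returns only the geometric factor $(\mathbf{r}-\mathbf{r}_0)/|\mathbf{r}-\mathbf{r}_0|^3$; multiplying it by $q/4\pi\epsilon_0$ gives the field.

```python
import math

def point_charge_geometry(r, r0):
    """Return (r - r0) / |r - r0|**3 as a list of Cartesian components.

    Hypothetical helper for illustration; multiply the result by
    q / (4 * pi * eps0) to obtain the electric field of a point charge.
    """
    d = [a - b for a, b in zip(r, r0)]           # separation vector r - r0
    dist = math.sqrt(sum(c * c for c in d))      # magnitude, as in (0.1)
    return [c / dist**3 for c in d]

# Positions from Example 0.1
geom = point_charge_geometry((2.0, 2.0, 2.0), (1.0, 1.0, 2.0))
print(geom)  # x and y components equal 1/(sqrt(2))**3; the z component vanishes
```

The field points along $\hat{\mathbf{x}} + \hat{\mathbf{y}}$, directly away from the charge, as expected for positive $q$.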
Consider the plane that contains the two vectors $\mathbf{k}$ and $\mathbf{r}$. Call it the $x'y'$-plane. In this coordinate system, the two vectors can be written as $\mathbf{k} = k\cos\theta\,\hat{\mathbf{x}}' + k\sin\theta\,\hat{\mathbf{y}}'$ and $\mathbf{r} = r\cos\alpha\,\hat{\mathbf{x}}' + r\sin\alpha\,\hat{\mathbf{y}}'$, where $\theta$ and $\alpha$ are the respective angles that the two vectors make with the $x'$-axis. The dot product gives $\mathbf{k}\cdot\mathbf{r} = kr(\cos\theta\cos\alpha + \sin\theta\sin\alpha)$. This simplifies to $\mathbf{k}\cdot\mathbf{r} = kr\cos(\theta-\alpha)$ (see (0.13)), where $\theta-\alpha$ is the angle between the vectors. Thus, the dot product between two vectors is the product of the magnitudes of each vector times the cosine of the angle between them.
Note that the cross product results in a vector, whereas the dot product mentioned
above results in a scalar (i.e. a number with appropriate units). The resultant
cross-product vector is always perpendicular to the two vectors that are cross
multiplied. If the fingers on your right hand curl from the first vector towards the
second, your thumb will point in the direction of the result. The magnitude of the
result equals the product of the magnitudes of the constituent vectors times the
sine of the angle between them.
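We label the plane containing $\mathbf{E}$ and $\mathbf{B}$ the $x'y'$-plane. In this coordinate system, the two vectors can be written as $\mathbf{E} = E\cos\theta\,\hat{\mathbf{x}}' + E\sin\theta\,\hat{\mathbf{y}}'$ and $\mathbf{B} = B\cos\alpha\,\hat{\mathbf{x}}' + B\sin\alpha\,\hat{\mathbf{y}}'$, where $\theta$ and $\alpha$ are the respective angles that the two vectors make with the $x'$-axis. The cross product, according to (0.3), gives $\mathbf{E}\times\mathbf{B} = EB(\cos\theta\sin\alpha - \sin\theta\cos\alpha)\,\hat{\mathbf{z}}'$. This simplifies to $\mathbf{E}\times\mathbf{B} = EB\sin(\alpha-\theta)\,\hat{\mathbf{z}}'$ (see (0.14)), where $\alpha-\theta$ is the angle between the vectors. The vectors $\mathbf{E}$ and $\mathbf{B}$, which both lie in the $x'y'$-plane, are both perpendicular to $\hat{\mathbf{z}}'$. If $0 < \alpha-\theta < \pi$, the result $\mathbf{E}\times\mathbf{B}$ points in the positive $z'$ direction, which is consistent with the right-hand rule.

Figure 0.2 Right-hand rule for cross product.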
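\[ \nabla f(x,y,z) = \frac{\partial f}{\partial x}\hat{\mathbf{x}} + \frac{\partial f}{\partial y}\hat{\mathbf{y}} + \frac{\partial f}{\partial z}\hat{\mathbf{z}} \tag{0.4} \]
\[ \nabla\cdot\mathbf{E} = \frac{\partial E_x}{\partial x} + \frac{\partial E_y}{\partial y} + \frac{\partial E_z}{\partial z} \tag{0.5} \]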
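Example 0.2
Derive the gradient (0.4) in cylindrical coordinates defined by the transformations $x = \rho\cos\phi$ and $y = \rho\sin\phi$. (The coordinate $z$ remains unchanged.)
Solution: By inspection of Fig. 0.3, the Cartesian unit vectors may be expressed as
\[ \hat{\mathbf{x}} = \cos\phi\,\hat{\boldsymbol{\rho}} - \sin\phi\,\hat{\boldsymbol{\phi}} \quad\text{and}\quad \hat{\mathbf{y}} = \sin\phi\,\hat{\boldsymbol{\rho}} + \cos\phi\,\hat{\boldsymbol{\phi}} \]
In accordance with the rules of calculus, the needed partial derivatives expressed in terms of the new variables are
\[ \frac{\partial}{\partial x} = \frac{\partial\rho}{\partial x}\frac{\partial}{\partial\rho} + \frac{\partial\phi}{\partial x}\frac{\partial}{\partial\phi} \quad\text{and}\quad \frac{\partial}{\partial y} = \frac{\partial\rho}{\partial y}\frac{\partial}{\partial\rho} + \frac{\partial\phi}{\partial y}\frac{\partial}{\partial\phi} \]
With $\rho = \sqrt{x^2+y^2}$ and $\phi = \tan^{-1}(y/x)$, the coefficients are $\partial\rho/\partial x = \cos\phi$, $\partial\rho/\partial y = \sin\phi$, $\partial\phi/\partial x = -\sin\phi/\rho$, and $\partial\phi/\partial y = \cos\phi/\rho$. The gradient then becomes
\[
\begin{aligned}
\nabla f &= \frac{\partial f}{\partial x}\hat{\mathbf{x}} + \frac{\partial f}{\partial y}\hat{\mathbf{y}} + \frac{\partial f}{\partial z}\hat{\mathbf{z}} \\
&= \left(\cos\phi\frac{\partial f}{\partial\rho} - \frac{\sin\phi}{\rho}\frac{\partial f}{\partial\phi}\right)\left(\cos\phi\,\hat{\boldsymbol{\rho}} - \sin\phi\,\hat{\boldsymbol{\phi}}\right) \\
&\quad + \left(\sin\phi\frac{\partial f}{\partial\rho} + \frac{\cos\phi}{\rho}\frac{\partial f}{\partial\phi}\right)\left(\sin\phi\,\hat{\boldsymbol{\rho}} + \cos\phi\,\hat{\boldsymbol{\phi}}\right) + \frac{\partial f}{\partial z}\hat{\mathbf{z}} \\
&= \frac{\partial f}{\partial\rho}\hat{\boldsymbol{\rho}} + \frac{1}{\rho}\frac{\partial f}{\partial\phi}\hat{\boldsymbol{\phi}} + \frac{\partial f}{\partial z}\hat{\mathbf{z}}
\end{aligned}
\]
2 See M. R. Spiegel, Schaum's Outline of Advanced Mathematics for Engineers and Scientists, pp.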
\[ \nabla^2\mathbf{E} \equiv \nabla(\nabla\cdot\mathbf{E}) - \nabla\times(\nabla\times\mathbf{E}) \tag{0.10} \]
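\[
\begin{aligned}
\nabla\times(\nabla\times\mathbf{E}) ={}& \left[\frac{\partial}{\partial y}\left(\frac{\partial E_y}{\partial x} - \frac{\partial E_x}{\partial y}\right) + \frac{\partial}{\partial z}\left(\frac{\partial E_z}{\partial x} - \frac{\partial E_x}{\partial z}\right)\right]\hat{\mathbf{x}} \\
& - \left[\frac{\partial}{\partial x}\left(\frac{\partial E_y}{\partial x} - \frac{\partial E_x}{\partial y}\right) - \frac{\partial}{\partial z}\left(\frac{\partial E_z}{\partial y} - \frac{\partial E_y}{\partial z}\right)\right]\hat{\mathbf{y}} \\
& + \left[-\frac{\partial}{\partial x}\left(\frac{\partial E_z}{\partial x} - \frac{\partial E_x}{\partial z}\right) - \frac{\partial}{\partial y}\left(\frac{\partial E_z}{\partial y} - \frac{\partial E_y}{\partial z}\right)\right]\hat{\mathbf{z}}
\end{aligned}
\]
After adding and subtracting $\frac{\partial^2 E_x}{\partial x^2}\hat{\mathbf{x}} + \frac{\partial^2 E_y}{\partial y^2}\hat{\mathbf{y}} + \frac{\partial^2 E_z}{\partial z^2}\hat{\mathbf{z}}$ and then rearranging, we get
\[
\begin{aligned}
\nabla\times(\nabla\times\mathbf{E}) ={}& \left[\frac{\partial^2 E_x}{\partial x^2} + \frac{\partial^2 E_y}{\partial x\partial y} + \frac{\partial^2 E_z}{\partial x\partial z}\right]\hat{\mathbf{x}} + \left[\frac{\partial^2 E_x}{\partial x\partial y} + \frac{\partial^2 E_y}{\partial y^2} + \frac{\partial^2 E_z}{\partial y\partial z}\right]\hat{\mathbf{y}} \\
& + \left[\frac{\partial^2 E_x}{\partial x\partial z} + \frac{\partial^2 E_y}{\partial y\partial z} + \frac{\partial^2 E_z}{\partial z^2}\right]\hat{\mathbf{z}} \\
& - \left[\frac{\partial^2 E_x}{\partial x^2} + \frac{\partial^2 E_x}{\partial y^2} + \frac{\partial^2 E_x}{\partial z^2}\right]\hat{\mathbf{x}} - \left[\frac{\partial^2 E_y}{\partial x^2} + \frac{\partial^2 E_y}{\partial y^2} + \frac{\partial^2 E_y}{\partial z^2}\right]\hat{\mathbf{y}} \\
& - \left[\frac{\partial^2 E_z}{\partial x^2} + \frac{\partial^2 E_z}{\partial y^2} + \frac{\partial^2 E_z}{\partial z^2}\right]\hat{\mathbf{z}}
\end{aligned}
\]
The first three brackets are the Cartesian components of $\nabla(\nabla\cdot\mathbf{E})$, and the last three are those of $\nabla^2\mathbf{E}$, in agreement with (0.10).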
Schaum's Outline of Advanced Mathematics for Engineers and Scientists, p. 154 (New York: McGraw-Hill 1971).
The integration on the left-hand side is over the closed surface $S$, which contains the volume $V$ associated with the integration on the right-hand side. The unit vector $\hat{\mathbf{n}}$ points outward, normal to the surface. The divergence theorem is especially useful in connection with Gauss' law, where the left-hand side is interpreted as the number of field lines exiting a closed surface.
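Example 0.3
Check the divergence theorem (0.11) for the vector function $\mathbf{F}(x,y,z) = y^2\hat{\mathbf{x}} + xy\,\hat{\mathbf{y}} + x^2 z\,\hat{\mathbf{z}}$ over the cube bounded by $x = \pm 1$, $y = \pm 1$, and $z = \pm 1$.

Figure 0.4 The function F (red arrows) plotted for several points on the surface S.

Solution: First, we evaluate the left side of (0.11) for the function:
\[
\begin{aligned}
\oint_S \mathbf{F}\cdot\hat{\mathbf{n}}\,da ={}& \int_{-1}^{1}\!\int_{-1}^{1} dx\,dy\,\left(x^2 z\right)_{z=1} - \int_{-1}^{1}\!\int_{-1}^{1} dx\,dy\,\left(x^2 z\right)_{z=-1} \\
& + \int_{-1}^{1}\!\int_{-1}^{1} dx\,dz\,\left(xy\right)_{y=1} - \int_{-1}^{1}\!\int_{-1}^{1} dx\,dz\,\left(xy\right)_{y=-1} \\
& + \int_{-1}^{1}\!\int_{-1}^{1} dy\,dz\,\left(y^2\right)_{x=1} - \int_{-1}^{1}\!\int_{-1}^{1} dy\,dz\,\left(y^2\right)_{x=-1} \\
={}& 2\int_{-1}^{1}\!\int_{-1}^{1} dx\,dy\,x^2 + 2\int_{-1}^{1}\!\int_{-1}^{1} dx\,dz\,x = 4\left.\frac{x^3}{3}\right|_{-1}^{1} + 4\left.\frac{x^2}{2}\right|_{-1}^{1} = \frac{8}{3}
\end{aligned}
\]
Next, we evaluate the right side of (0.11):
\[
\int_V \nabla\cdot\mathbf{F}\,dv = \int_{-1}^{1}\!\int_{-1}^{1}\!\int_{-1}^{1} dx\,dy\,dz\,\left(x + x^2\right) = 4\int_{-1}^{1} dx\,\left(x + x^2\right) = 4\left[\frac{x^2}{2} + \frac{x^3}{3}\right]_{-1}^{1} = \frac{8}{3}
\]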
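Both sides of the divergence theorem in Example 0.3 can also be estimated numerically. The sketch below is illustrative only: it assumes the cube $-1 \le x, y, z \le 1$ and uses a simple midpoint rule, with hypothetical helper names (`surface_flux`, `volume_integral`) chosen here for convenience.

```python
def F(x, y, z):
    """Vector function from Example 0.3: (y^2, x*y, x^2*z)."""
    return (y * y, x * y, x * x * z)

def div_F(x, y, z):
    """Analytic divergence of F: d(y^2)/dx + d(xy)/dy + d(x^2 z)/dz."""
    return x + x * x

def midpoints(n):
    """Midpoints of n equal cells on [-1, 1], plus the cell width."""
    h = 2.0 / n
    return [-1.0 + (i + 0.5) * h for i in range(n)], h

def surface_flux(n=40):
    """Midpoint-rule estimate of the outward flux through the six cube faces."""
    pts, h = midpoints(n)
    total = 0.0
    for u in pts:
        for v in pts:
            total += (F(1, u, v)[0] - F(-1, u, v)[0]) * h * h  # faces x = +1, -1
            total += (F(u, 1, v)[1] - F(u, -1, v)[1]) * h * h  # faces y = +1, -1
            total += (F(u, v, 1)[2] - F(u, v, -1)[2]) * h * h  # faces z = +1, -1
    return total

def volume_integral(n=40):
    """Midpoint-rule estimate of the integral of div F over the cube."""
    pts, h = midpoints(n)
    return sum(div_F(x, y, z) for x in pts for y in pts for z in pts) * h**3

print(surface_flux(), volume_integral())  # both should be close to 8/3
```

The two estimates agree with each other and approach $8/3$ as the grid is refined.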
The integration on the left-hand side is over an open surface $S$ (not enclosing a volume). The integration on the right-hand side is around the edge of the surface. Again, $\hat{\mathbf{n}}$ is a unit vector that always points normal to the surface. The vector $d\boldsymbol{\ell}$ points along the curve $C$ that bounds the surface $S$. If the fingers of your right hand point in the direction of integration around $C$, then your thumb points in the direction of $\hat{\mathbf{n}}$. Stokes' theorem is especially useful in connection with Ampere's law and Faraday's law. The right-hand side is an integration of a field around a loop.
The last line of (0.19) is seen to be the sum of the first two lines, from which Euler's formula directly follows.
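Example 0.4
Prove (0.13) and (0.14) as well as $\cos^2\alpha + \sin^2\alpha = 1$ by taking advantage of (0.16).
\[
\begin{aligned}
\sin\alpha\cos\beta + \sin\beta\cos\alpha &= \frac{e^{i\alpha} - e^{-i\alpha}}{2i}\,\frac{e^{i\beta} + e^{-i\beta}}{2} + \frac{e^{i\beta} - e^{-i\beta}}{2i}\,\frac{e^{i\alpha} + e^{-i\alpha}}{2} \\
&= \frac{e^{i(\alpha+\beta)} + e^{i(\alpha-\beta)} - e^{-i(\alpha-\beta)} - e^{-i(\alpha+\beta)}}{4i} \\
&\quad + \frac{e^{i(\alpha+\beta)} - e^{i(\alpha-\beta)} + e^{-i(\alpha-\beta)} - e^{-i(\alpha+\beta)}}{4i} \\
&= \frac{e^{i(\alpha+\beta)} - e^{-i(\alpha+\beta)}}{2i} = \sin(\alpha+\beta)
\end{aligned}
\]
Finally, we compute
\[
\begin{aligned}
\cos^2\alpha + \sin^2\alpha &= \left(\frac{e^{i\alpha} + e^{-i\alpha}}{2}\right)^2 + \left(\frac{e^{i\alpha} - e^{-i\alpha}}{2i}\right)^2 \\
&= \frac{e^{2i\alpha} + 2 + e^{-2i\alpha}}{4} - \frac{e^{2i\alpha} - 2 + e^{-2i\alpha}}{4} = 1
\end{aligned}
\]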
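The identities in Example 0.4 can be spot-checked numerically with Python's `cmath` module. This is only a sanity check at sample angles, not a proof; the helper names `sin_exp` and `cos_exp` are introduced here for illustration and build sine and cosine from complex exponentials in the manner of (0.16).

```python
import cmath
import math

def sin_exp(a):
    """Sine built from complex exponentials: (e^{ia} - e^{-ia}) / 2i."""
    return (cmath.exp(1j * a) - cmath.exp(-1j * a)) / 2j

def cos_exp(a):
    """Cosine built from complex exponentials: (e^{ia} + e^{-ia}) / 2."""
    return (cmath.exp(1j * a) + cmath.exp(-1j * a)) / 2

alpha, beta = 0.7, 1.9  # arbitrary sample angles in radians

# Angle-addition formula for the sine, as derived in Example 0.4
lhs = sin_exp(alpha) * cos_exp(beta) + sin_exp(beta) * cos_exp(alpha)
rhs = math.sin(alpha + beta)

# Pythagorean identity
unity = cos_exp(alpha) ** 2 + sin_exp(alpha) ** 2

print(abs(lhs - rhs), abs(unity - 1))  # both differences are at rounding level
```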
writing
\[ A\cos(\alpha+\beta) = \mathrm{Re}\left\{A e^{i(\alpha+\beta)}\right\} \tag{0.20} \]
have
\[ a = \rho\cos\phi, \qquad b = \rho\sin\phi \tag{0.24} \]
These equations can be inverted to yield
\[ \rho = \sqrt{a^2 + b^2}, \qquad \phi = \tan^{-1}\frac{b}{a} \quad (a > 0) \tag{0.25} \]
When $a < 0$, we must adjust $\phi$ by $\pi$ since the arctangent has a range only from $-\pi/2$ to $\pi/2$.
The transformations in (0.24) and (0.25) have a clear geometrical interpretation in the complex plane, and this makes it easier to remember them. They are just the usual connections between Cartesian and polar coordinates. As seen in Fig. 0.5, $\rho$ is the hypotenuse of a right triangle having legs with lengths $a$ and $b$, and $\phi$ is the angle that the hypotenuse makes with the x-axis. Again, you should be careful when $a$ is negative since the arctangent is defined in quadrants I and IV. An easy way to deal with the situation of a negative $a$ is to factor the minus sign out before proceeding (i.e. $a + ib = -(-a - ib)$). Then the transformation is made on $-a - ib$ where $-a$ is positive. The overall minus sign out in front is just carried along unaffected and can be factored back in at the end. Notice that $-\rho e^{i\phi}$ is the same as $\rho e^{i(\phi\pm\pi)}$.

Figure 0.5 A number in the complex plane can be represented either by Cartesian or polar representation.

Example 0.5
Write $-3 + 4i$ in polar format.
Solution: We must be careful with the negative real part since it indicates a quadrant (in this case II) outside of the domain of the inverse tangent (quadrants I and IV). Best to factor the negative out and deal with it separately.
\[ -3 + 4i = -(3 - 4i) = -\sqrt{3^2 + (-4)^2}\,e^{i\tan^{-1}\frac{(-4)}{3}} = e^{i\pi}\,5\,e^{-i\tan^{-1}\frac{4}{3}} = 5e^{i\left(\pi - \tan^{-1}\frac{4}{3}\right)} \]
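The quadrant issue in Example 0.5 is easy to reproduce numerically. The sketch below uses Python's standard `cmath.polar`, which computes the phase from both the real and imaginary parts (in the manner of a two-argument arctangent) and therefore lands in the correct quadrant, whereas a plain arctangent of $b/a$ does not when $a < 0$.

```python
import cmath
import math

z = complex(-3, 4)

# cmath.polar returns (magnitude, phase) with the phase in (-pi, pi]
rho, phi = cmath.polar(z)

# A plain arctangent of b/a ignores the sign of a and lands in quadrant IV
naive_phi = math.atan(z.imag / z.real)

# Result of Example 0.5: phase = pi - arctan(4/3)
expected_phi = math.pi - math.atan(4 / 3)

print(rho, phi, naive_phi, expected_phi)
```

The naive angle differs from the correct one by exactly $\pi$, which is the adjustment described in the text for $a < 0$.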
\[ z^* = (a + ib)^* \equiv a - ib \tag{0.26} \]
The complex conjugate is useful when computing the absolute value of a complex number:
\[ |z| = \sqrt{z^* z} = \sqrt{(a - ib)(a + ib)} = \sqrt{a^2 + b^2} = \rho \tag{0.27} \]

Figure 0.6 Geometric representation of $-3 + 4i$
Note that the absolute value of a complex number is the same as its magnitude $\rho$ as defined in (0.25). The complex conjugate is also useful for eliminating complex numbers from the denominator of expressions:
\[ \frac{a + ib}{c + id} = \frac{(a + ib)(c - id)}{(c + id)(c - id)} = \frac{ac + bd + i(bc - ad)}{c^2 + d^2} \tag{0.28} \]
No matter how complicated an expression, the complex conjugate is calculated by inserting a minus sign in front of all occurrences of $i$ in the expression, and placing an asterisk on all complex variables in the expression. For example, the complex conjugate of $\rho e^{i\phi}$ is $\rho e^{-i\phi}$ assuming $\rho$ and $\phi$ are real, as can be seen from Euler's formula (0.15). As another example consider
\[ \left[E_0\exp\{i(Kz - \omega t)\}\right]^* = E_0^*\exp\{-i(K^* z - \omega t)\} \tag{0.29} \]
\[ \mathrm{Re}\{z\} = \frac{1}{2}\left(z + z^*\right) \tag{0.30} \]
Notice that the expression for $\cos\phi$ in (0.16) is an example of this formula. Sometimes when a lengthy expression is added to its own complex conjugate, we let "C.C." represent the complex conjugate in order to avoid writing the expression twice.
In optics we sometimes encounter a complex angle, such as $Kz$ in (0.29). The imaginary part of $K$ governs exponential decay (or growth) when a light wave propagates in an absorptive (or amplifying) medium. Similarly, when we compute the transmission angle for light incident upon a surface beyond the critical angle for total internal reflection, we encounter the arcsine of a number greater than one in an effort to satisfy Snell's law. Even though such an angle does not exist in the physical sense, a complex value for the angle can be found, which satisfies (0.16) and describes evanescent waves.
Ax + B y = F and Cx + Dy = G (0.31)
where x and y are variables. A set of linear equations such as (0.31) can be
expressed using matrix notation as
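\[ \begin{bmatrix} A & B \\ C & D \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} Ax + By \\ Cx + Dy \end{bmatrix} = \begin{bmatrix} F \\ G \end{bmatrix} \tag{0.32} \]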
where the right-hand side is called the identity matrix. You can easily check that
the identity matrix leaves unchanged anything that it multiplies, and so (0.33)
simplifies to
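\[ \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1}\begin{bmatrix} F \\ G \end{bmatrix} \]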
Once the inverse matrix is found, the matrix multiplication on the right can be
performed and the answers for x and y obtained as the upper and lower elements
of the result.
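The inverse of a 2×2 matrix is given by
\[ \begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1} = \frac{1}{\begin{vmatrix} A & B \\ C & D \end{vmatrix}}\begin{bmatrix} D & -B \\ -C & A \end{bmatrix} \tag{0.35} \]
where
\[ \begin{vmatrix} A & B \\ C & D \end{vmatrix} \equiv AD - CB \]
is called the determinant. We can check that (0.35) is correct by direct substitution:
\[
\begin{aligned}
\begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1}\begin{bmatrix} A & B \\ C & D \end{bmatrix} &= \frac{1}{AD - BC}\begin{bmatrix} D & -B \\ -C & A \end{bmatrix}\begin{bmatrix} A & B \\ C & D \end{bmatrix} \\
&= \frac{1}{AD - BC}\begin{bmatrix} AD - BC & 0 \\ 0 & AD - BC \end{bmatrix} \\
&= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\end{aligned}
\tag{0.36}
\]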
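As a concrete illustration of (0.35), the following sketch solves a pair of linear equations of the form (0.31) by applying the 2×2 inverse directly. The function names and the sample coefficients are chosen here for illustration and do not come from the text.

```python
def inverse_2x2(A, B, C, D):
    """Return the inverse of [[A, B], [C, D]] as a tuple (a, b, c, d), per (0.35)."""
    det = A * D - B * C
    if det == 0:
        raise ValueError("matrix is singular; no inverse exists")
    return (D / det, -B / det, -C / det, A / det)

def solve_2x2(A, B, C, D, F, G):
    """Solve A*x + B*y = F and C*x + D*y = G by multiplying with the inverse."""
    a, b, c, d = inverse_2x2(A, B, C, D)
    return (a * F + b * G, c * F + d * G)

# Sample system (illustrative values): 2x + y = 5 and x + 3y = 10
x, y = solve_2x2(2.0, 1.0, 1.0, 3.0, 5.0, 10.0)
print(x, y)  # approximately x = 1, y = 3
```

Substituting back confirms the result: $2(1) + 3 = 5$ and $1 + 3(3) = 10$.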
The above review of linear algebra is very basic. In contrast, we next discuss Sylvester's theorem, which you probably have not previously encountered. Sylvester's theorem is useful when multiplying the same 2×2 matrix (with a determinant of unity) together many times (i.e. raising the matrix to a power). This situation occurs when modeling periodic multilayer mirror coatings or when considering light rays trapped in a laser cavity as they reflect many times.

James Joseph Sylvester (1814-1897, English) made fundamental contributions to matrix theory, invariant theory, number theory, partition theory and combinatorics. He played a leadership role in American mathematics in the later half of the 19th century as a professor at the Johns Hopkins University and as founder of the American Journal of Mathematics. (Wikipedia)

Sylvester's Theorem:$^4$ If the determinant of a 2×2 matrix is one (i.e. $AD - BC = 1$), then
\[ \begin{bmatrix} A & B \\ C & D \end{bmatrix}^N = \frac{1}{\sin\theta}\begin{bmatrix} A\sin N\theta - \sin(N-1)\theta & B\sin N\theta \\ C\sin N\theta & D\sin N\theta - \sin(N-1)\theta \end{bmatrix} \tag{0.37} \]
where
\[ \cos\theta = \frac{1}{2}(A + D) \tag{0.38} \]
4 The theorem presented here is a specific case. See A. A. Tovar and L. W. Casperson, "Generalized Sylvester theorems for periodic applications in matrix optics," J. Opt. Soc. Am. A 12, 578-590 (1995).
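\[
\begin{aligned}
\begin{bmatrix} A & B \\ C & D \end{bmatrix}^{N+1} &= \begin{bmatrix} A & B \\ C & D \end{bmatrix}\frac{1}{\sin\theta}\begin{bmatrix} A\sin N\theta - \sin(N-1)\theta & B\sin N\theta \\ C\sin N\theta & D\sin N\theta - \sin(N-1)\theta \end{bmatrix} \\
&= \frac{1}{\sin\theta}\begin{bmatrix} (A^2 + BC)\sin N\theta - A\sin(N-1)\theta & (AB + BD)\sin N\theta - B\sin(N-1)\theta \\ (AC + CD)\sin N\theta - C\sin(N-1)\theta & (D^2 + BC)\sin N\theta - D\sin(N-1)\theta \end{bmatrix}
\end{aligned}
\]
Now we inject the condition $AD - BC = 1$ into the diagonal elements and obtain
\[
\frac{1}{\sin\theta}\begin{bmatrix} (A^2 + AD - 1)\sin N\theta - A\sin(N-1)\theta & B\left[(A + D)\sin N\theta - \sin(N-1)\theta\right] \\ C\left[(A + D)\sin N\theta - \sin(N-1)\theta\right] & (D^2 + AD - 1)\sin N\theta - D\sin(N-1)\theta \end{bmatrix}
\]
and then
\[
\frac{1}{\sin\theta}\begin{bmatrix} A\left[(A + D)\sin N\theta - \sin(N-1)\theta\right] - \sin N\theta & B\left[(A + D)\sin N\theta - \sin(N-1)\theta\right] \\ C\left[(A + D)\sin N\theta - \sin(N-1)\theta\right] & D\left[(A + D)\sin N\theta - \sin(N-1)\theta\right] - \sin N\theta \end{bmatrix}
\]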
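Sylvester's theorem (0.37) can be compared against brute-force matrix multiplication. The sketch below is a numerical illustration under stated assumptions: the sample matrix is chosen here (it has unit determinant and $|A + D| < 2$, so that $\theta$ from (0.38) is real), and the helper names are hypothetical.

```python
import math

def matmul(M, N):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matpow(M, power):
    """Raise a 2x2 matrix to a positive integer power by repeated multiplication."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(power):
        result = matmul(result, M)
    return result

def sylvester(M, power):
    """Evaluate the right-hand side of (0.37); assumes det M = 1 and |A + D| < 2."""
    (A, B), (C, D) = M
    theta = math.acos((A + D) / 2.0)          # from (0.38)
    s = math.sin(theta)
    sN = math.sin(power * theta)
    sN1 = math.sin((power - 1) * theta)
    return [[(A * sN - sN1) / s, B * sN / s],
            [C * sN / s, (D * sN - sN1) / s]]

M = [[0.5, 0.75], [-1.0, 0.5]]  # illustrative matrix: AD - BC = 0.25 + 0.75 = 1
direct = matpow(M, 5)
formula = sylvester(M, 5)
print(direct)
print(formula)
```

For this matrix $\theta = \pi/3$, so the third power equals minus the identity, which both routes reproduce.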
through a system, we can also reassemble sinusoidal waves to see the effect on the
overall waveform. In fact, it will be possible to work simultaneously with infinitely
many sinusoidal waves, where the frequencies comprising a light field are spread
continuously over a range. Fourier transforms are also helpful for diffraction
problems where many waves (all with the same frequency) interfere spatially.
We begin with a derivation of the Fourier integral theorem. As asserted by
Fourier, a periodic function can be represented in terms of sines and cosines in
the following manner:
\[ f(t) = \sum_{n=0}^{\infty} a_n\cos(n\Delta\omega\,t) + b_n\sin(n\Delta\omega\,t) \tag{0.40} \]
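\[
\begin{aligned}
\int_{-\pi/\Delta\omega}^{\pi/\Delta\omega} f(t)\,e^{im\Delta\omega t}\,dt &= \sum_{n=-\infty}^{\infty} c_n\int_{-\pi/\Delta\omega}^{\pi/\Delta\omega} e^{i(m-n)\Delta\omega t}\,dt \\
&= \sum_{n=-\infty}^{\infty} c_n\left[\frac{e^{i(m-n)\Delta\omega t}}{i(m-n)\Delta\omega}\right]_{-\pi/\Delta\omega}^{\pi/\Delta\omega} \\
&= \sum_{n=-\infty}^{\infty} \frac{2\pi c_n}{\Delta\omega}\,\frac{e^{i(m-n)\pi} - e^{-i(m-n)\pi}}{2i(m-n)\pi} \\
&= \sum_{n=-\infty}^{\infty} \frac{2\pi c_n}{\Delta\omega}\,\frac{\sin[(m-n)\pi]}{(m-n)\pi}
\end{aligned}
\tag{0.44}
\]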
The function $\sin[(m-n)\pi]/[(m-n)\pi]$ is equal to zero for all $n \neq m$, and it is equal to one when $n = m$ (to see this, use L'Hospital's rule on the zero-over-zero situation, or just go back and re-perform the above integral for $n = m$). Thus, only one term contributes to the summation in (0.44). We now have
\[ c_m = \frac{\Delta\omega}{2\pi}\int_{-\pi/\Delta\omega}^{\pi/\Delta\omega} f(t)\,e^{im\Delta\omega t}\,dt \tag{0.45} \]
from which the coefficients c n can be computed, given a function f (t ). (Note that
m is a dummy index so we can change it back to n if we like.)
This completes the circle. If we know the function f (t ), we can find the
coefficients c n via (0.45), and, if we know the coefficients c n , we can generate the
function f (t ) via (0.42). If we are feeling a bit silly, we might combine these into a
single identity:
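\[ f(t) = \sum_{n=-\infty}^{\infty}\left[\frac{\Delta\omega}{2\pi}\int_{-\pi/\Delta\omega}^{\pi/\Delta\omega} f(t')\,e^{in\Delta\omega t'}\,dt'\right]e^{-in\Delta\omega t} \tag{0.46} \]
\[ f(t) = \frac{1}{2\pi}\lim_{\Delta\omega\to 0}\sum_{n=-\infty}^{\infty}\left[e^{-in\Delta\omega t}\int_{-\infty}^{\infty} f(t')\,e^{in\Delta\omega t'}\,dt'\right]\Delta\omega \tag{0.47} \]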
Recall that an integral is really a summation of rectangles under a curve with finely
spaced steps:
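\[
\begin{aligned}
\int_a^b g(\omega)\,d\omega &\equiv \lim_{\Delta\omega\to 0}\sum_{n=0}^{\frac{b-a}{\Delta\omega}} g(a + n\Delta\omega)\,\Delta\omega \\
&= \lim_{\Delta\omega\to 0}\sum_{n=-\frac{b-a}{2\Delta\omega}}^{\frac{b-a}{2\Delta\omega}} g\!\left(\frac{a+b}{2} + n\Delta\omega\right)\Delta\omega
\end{aligned}
\tag{0.48}
\]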
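The final expression has been manipulated so that the index ranges through both negative and positive numbers. If we set $a = -b$ and take the limit $b \to \infty$, then the above expression becomes
\[ \int_{-\infty}^{\infty} g(\omega)\,d\omega = \lim_{\Delta\omega\to 0}\sum_{n=-\infty}^{\infty} g(n\Delta\omega)\,\Delta\omega \tag{0.49} \]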
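Now, (0.47) has the same form as (0.49) if $g(n\Delta\omega)$ represents everything in the square brackets of (0.47). The result is the Fourier integral theorem:
\[ f(t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-i\omega t}\left[\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(t')\,e^{i\omega t'}\,dt'\right]d\omega \tag{0.50} \]
The piece in brackets is called the Fourier transform, and the rest of the operation is called the inverse Fourier transform. The Fourier integral theorem (0.50) is often written with the following (potentially confusing) notation:
\[
\begin{aligned}
f(\omega) &\equiv \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(t)\,e^{i\omega t}\,dt \\
f(t) &\equiv \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(\omega)\,e^{-i\omega t}\,d\omega
\end{aligned}
\tag{0.51}
\]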
tirely different, even taking on different units (e.g. the latter having extra units of
per frequency). The two functions are distinguished by their arguments, which
also have different units (e.g. time vs. frequency). Nevertheless, it is customary to
use the same letter to denote either function since they form a transform pair.
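Example 0.6
Compute the Fourier transform of $E(t) = E_0 e^{-t^2/2T^2} e^{-i\omega_0 t}$ followed by the inverse Fourier transform.
Solution: According to (0.51), the Fourier transform is
\[ E(\omega) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} E_0 e^{-t^2/2T^2} e^{-i\omega_0 t}\,e^{i\omega t}\,dt = \frac{E_0}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-\frac{t^2}{2T^2} + i(\omega - \omega_0)t}\,dt \]
The integration can be performed with the help of (0.55), which yields
\[ E(\omega) = \frac{E_0}{\sqrt{2\pi}}\sqrt{\frac{\pi}{1/2T^2}}\;e^{-\frac{(\omega-\omega_0)^2}{4\left(1/2T^2\right)}} = T E_0\,e^{-T^2(\omega-\omega_0)^2/2} \]
Similarly, the inverse Fourier transform of the above function is
\[
\begin{aligned}
E(t) &= \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} T E_0\,e^{-T^2(\omega-\omega_0)^2/2}\,e^{-i\omega t}\,d\omega = \frac{T E_0}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-\frac{T^2}{2}\omega^2 + \left(T^2\omega_0 - it\right)\omega - \frac{T^2\omega_0^2}{2}}\,d\omega \\
&= \frac{T E_0}{\sqrt{2\pi}}\sqrt{\frac{\pi}{T^2/2}}\;e^{\frac{\left(T^2\omega_0 - it\right)^2}{4\left(T^2/2\right)} - \frac{T^2\omega_0^2}{2}} = E_0\,e^{-t^2/2T^2}\,e^{-i\omega_0 t}
\end{aligned}
\]
which brings us back to where we started.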
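The closed-form result of Example 0.6 can be compared with a direct numerical evaluation of the transform defined in (0.51). The following is a rough sketch: the integral is truncated to a finite window and approximated with a midpoint sum, and the parameter values ($E_0 = 1$, $T = 2$, $\omega_0 = 3$) and function names are assumptions made here for illustration.

```python
import cmath
import math

E0, T, W0 = 1.0, 2.0, 3.0  # illustrative amplitude, duration, and carrier frequency

def pulse(t):
    """Gaussian pulse E(t) = E0 exp(-t^2 / 2T^2) exp(-i w0 t) from Example 0.6."""
    return E0 * math.exp(-t * t / (2 * T * T)) * cmath.exp(-1j * W0 * t)

def fourier_transform(f, omega, t_max=40.0, n=4000):
    """Midpoint-rule estimate of (1/sqrt(2 pi)) * integral of f(t) exp(i omega t) dt.

    The infinite range is truncated to [-t_max, t_max], which is adequate here
    because the Gaussian envelope is negligible far from t = 0.
    """
    dt = 2 * t_max / n
    total = 0j
    for k in range(n):
        t = -t_max + (k + 0.5) * dt
        total += f(t) * cmath.exp(1j * omega * t) * dt
    return total / math.sqrt(2 * math.pi)

def analytic(omega):
    """Closed-form transform T E0 exp(-T^2 (omega - w0)^2 / 2)."""
    return T * E0 * math.exp(-T * T * (omega - W0) ** 2 / 2)

numeric = fourier_transform(pulse, 3.5)
print(numeric, analytic(3.5))
```

The numerical value is essentially real and matches the analytic Gaussian centered at $\omega_0$.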
in such a way as to make the integral take on the value of the function $f(t)$. (You can think of $\delta(t'-t)\,dt'$ as an infinitely tall and infinitely thin rectangle centered at $t' = t$ with an area unity.) The integral only pays attention to the value of $f(t')$ at the point $t' = t$.
A remarkable attribute of the delta function can be seen from the Fourier
integral theorem. After rearranging the order of integration, the Fourier integral
theorem (0.50) can be written as
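\[ f(t) = \int_{-\infty}^{\infty} f(t')\left[\frac{1}{2\pi}\int_{-\infty}^{\infty} e^{i\omega(t'-t)}\,d\omega\right]dt' \tag{0.53} \]
A comparison between (0.52) and (0.53) shows that you may write the delta function as a uniform superposition of all frequency components:
\[ \delta(t'-t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{i\omega(t'-t)}\,d\omega \tag{0.54} \]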
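Example 0.7
Use (0.54) to prove Parseval's theorem:$^7$
\[ \int_{-\infty}^{\infty} |f(\omega)|^2\,d\omega = \int_{-\infty}^{\infty} |f(t)|^2\,dt \]
Solution:
\[
\begin{aligned}
\int_{-\infty}^{\infty} |f(\omega)|^2\,d\omega &= \int_{-\infty}^{\infty} f(\omega)\,f^*(\omega)\,d\omega \\
&= \int_{-\infty}^{\infty}\left[\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(t)\,e^{i\omega t}\,dt\right]\left[\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f^*(t')\,e^{-i\omega t'}\,dt'\right]d\omega
\end{aligned}
\]
The order of integration can be changed, and with the aid of (0.54) we get
\[
\begin{aligned}
\int_{-\infty}^{\infty} |f(\omega)|^2\,d\omega &= \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} f(t)\,f^*(t')\left[\frac{1}{2\pi}\int_{-\infty}^{\infty} e^{i\omega(t-t')}\,d\omega\right]dt\,dt' \\
&= \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} f(t)\,f^*(t')\,\delta(t'-t)\,dt\,dt' \\
&= \int_{-\infty}^{\infty} f(t)\,f^*(t)\,dt = \int_{-\infty}^{\infty} |f(t)|^2\,dt
\end{aligned}
\]
7 For a more general version of the relation, see G. B. Arfken and H. J. Weber, Mathematical Methods for Physicists 6th ed., Sect. 15.5 (San Diego: Elsevier Academic Press 2005).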
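Appendix 0.A Table of Integrals and Sums
\[ \int_{-\infty}^{\infty} e^{-ax^2 + bx + c}\,dx = \sqrt{\frac{\pi}{a}}\;e^{\frac{b^2}{4a} + c} \quad (\mathrm{Re}\{a\} > 0) \tag{0.55} \]
\[ \int_{-\infty}^{\infty} \frac{e^{iax}}{1 + x^2/b^2}\,dx = \pi|b|\,e^{-|ab|} \tag{0.56} \]
\[ \int_0^{2\pi} e^{\pm ia\cos(\theta-\theta')}\,d\theta = 2\pi J_0(a) \tag{0.57} \]
\[ \int_0^a J_0(bx)\,x\,dx = \frac{a}{b}J_1(ab) \tag{0.58} \]
\[ \int_0^{\infty} e^{-ax^2}J_0(bx)\,x\,dx = \frac{e^{-b^2/4a}}{2a} \tag{0.59} \]
\[ \int_0^{\infty} \frac{\sin^2(ax)}{(ax)^2}\,dx = \frac{\pi}{2a} \tag{0.60} \]
\[ \int \frac{dy}{\left(y^2 + c\right)^{3/2}} = \frac{y}{c\sqrt{y^2 + c}} \tag{0.61} \]
\[ \int \frac{dx}{x\sqrt{x^2 - c}} = -\frac{1}{\sqrt{c}}\sin^{-1}\frac{\sqrt{c}}{|x|} \tag{0.62} \]
\[ \int_0^{\pi} \sin(ax)\sin(bx)\,dx = \int_0^{\pi} \cos(ax)\cos(bx)\,dx = \frac{\pi}{2}\delta_{ab} \quad (a, b \text{ integer}) \tag{0.63} \]
\[ \sum_{n=0}^{N} r^n = \frac{1 - r^{N+1}}{1 - r} \tag{0.64} \]
\[ \sum_{n=1}^{N} r^n = \frac{r\left(1 - r^N\right)}{1 - r} \tag{0.65} \]
\[ \sum_{n=0}^{\infty} r^n = \frac{1}{1 - r} \quad (r < 1) \tag{0.66} \]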
Exercises
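P0.2 Use the dot product (0.2) to show that the cross product $\mathbf{E}\times\mathbf{B}$ is perpendicular to $\mathbf{E}$ and to $\mathbf{B}$.
\[ \nabla_{\mathbf{r}}\frac{1}{|\mathbf{r}-\mathbf{r}'|} = -\frac{\mathbf{r}-\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|^3} \]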
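\[ \nabla^2 = \frac{1}{\rho}\frac{\partial}{\partial\rho}\left(\rho\frac{\partial}{\partial\rho}\right) + \frac{1}{\rho^2}\frac{\partial^2}{\partial\phi^2} + \frac{\partial^2}{\partial z^2} \]
Solution: (Partial)
Continuing with the approach in Example 0.2, we have
\[
\begin{aligned}
\frac{\partial^2 f}{\partial x^2} &= \frac{\partial^2\rho}{\partial x^2}\frac{\partial f}{\partial\rho} + \frac{\partial\rho}{\partial x}\frac{\partial}{\partial x}\!\left(\frac{\partial f}{\partial\rho}\right) + \frac{\partial^2\phi}{\partial x^2}\frac{\partial f}{\partial\phi} + \frac{\partial\phi}{\partial x}\frac{\partial}{\partial x}\!\left(\frac{\partial f}{\partial\phi}\right) \\
&= \frac{\partial^2\rho}{\partial x^2}\frac{\partial f}{\partial\rho} + \frac{\partial\rho}{\partial x}\left(\frac{\partial\rho}{\partial x}\frac{\partial^2 f}{\partial\rho^2} + \frac{\partial\phi}{\partial x}\frac{\partial^2 f}{\partial\phi\,\partial\rho}\right) + \frac{\partial^2\phi}{\partial x^2}\frac{\partial f}{\partial\phi} + \frac{\partial\phi}{\partial x}\left(\frac{\partial\rho}{\partial x}\frac{\partial^2 f}{\partial\rho\,\partial\phi} + \frac{\partial\phi}{\partial x}\frac{\partial^2 f}{\partial\phi^2}\right)
\end{aligned}
\]
and
\[
\begin{aligned}
\nabla^2 f &= \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2} \\
&= \left(\frac{\partial^2\rho}{\partial x^2} + \frac{\partial^2\rho}{\partial y^2}\right)\frac{\partial f}{\partial\rho} + \left(\frac{\partial^2\phi}{\partial x^2} + \frac{\partial^2\phi}{\partial y^2}\right)\frac{\partial f}{\partial\phi} + 2\left(\frac{\partial\rho}{\partial x}\frac{\partial\phi}{\partial x} + \frac{\partial\rho}{\partial y}\frac{\partial\phi}{\partial y}\right)\frac{\partial^2 f}{\partial\rho\,\partial\phi} \\
&\quad + \left[\left(\frac{\partial\rho}{\partial x}\right)^2 + \left(\frac{\partial\rho}{\partial y}\right)^2\right]\frac{\partial^2 f}{\partial\rho^2} + \left[\left(\frac{\partial\phi}{\partial x}\right)^2 + \left(\frac{\partial\phi}{\partial y}\right)^2\right]\frac{\partial^2 f}{\partial\phi^2} + \frac{\partial^2 f}{\partial z^2}
\end{aligned}
\]
The needed first derivatives are given in Example 0.2. The needed second derivatives are
\[
\begin{aligned}
\frac{\partial^2\rho}{\partial x^2} &= \frac{1}{\sqrt{x^2+y^2}} - \frac{x^2}{\left(x^2+y^2\right)^{3/2}} = \frac{\sin^2\phi}{\rho} \\
\frac{\partial^2\phi}{\partial x^2} &= \frac{2xy}{\left(x^2+y^2\right)^2} = \frac{2\sin\phi\cos\phi}{\rho^2} \\
\frac{\partial^2\rho}{\partial y^2} &= \frac{1}{\sqrt{x^2+y^2}} - \frac{y^2}{\left(x^2+y^2\right)^{3/2}} = \frac{\cos^2\phi}{\rho} \\
\frac{\partial^2\phi}{\partial y^2} &= -\frac{2xy}{\left(x^2+y^2\right)^2} = -\frac{2\sin\phi\cos\phi}{\rho^2}
\end{aligned}
\]
Finish the derivation by substituting these derivatives into the above expression.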
P0.11 Verify Stokes' theorem (0.12) for the function given in Example 0.3. Take the surface to be a square in the $xy$-plane contained by $x = 0$, $x = 1$, $y = 0$, and $y = 1$, as illustrated in Fig. 0.7.

Figure 0.7

P0.12 Verify the following vector integral theorem for the same volume used in Example 0.3, but with $\mathbf{F} = y^2 x\,\hat{\mathbf{x}} + xy\,\hat{\mathbf{z}}$ and $\mathbf{G} = x^2\hat{\mathbf{x}}$:
\[ \int_V \left[\mathbf{F}(\nabla\cdot\mathbf{G}) + (\mathbf{G}\cdot\nabla)\mathbf{F}\right]dv = \oint_S \mathbf{F}\,(\mathbf{G}\cdot\hat{\mathbf{n}})\,da \]
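P0.13 Use the divergence theorem to show that the function in P0.5 is $4\pi$ times the three-dimensional delta function
\[ \delta^3(\mathbf{r}'-\mathbf{r}) \equiv \delta(x'-x)\,\delta(y'-y)\,\delta(z'-z) \]
which has the property
\[ \int_V \delta^3(\mathbf{r}'-\mathbf{r})\,dv = \begin{cases} 1 & \text{if } V \text{ contains } \mathbf{r}' \\ 0 & \text{otherwise} \end{cases} \]
Solution: By the divergence theorem,
\[ \oint_S \frac{\mathbf{r}-\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|^3}\cdot\hat{\mathbf{n}}\,da = \int_V \nabla_{\mathbf{r}}\cdot\frac{\mathbf{r}-\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|^3}\,dv \]
From P0.5, the argument in the integral on the right-hand side is zero except at $\mathbf{r} = \mathbf{r}'$. Therefore, if the volume $V$ does not contain the point $\mathbf{r} = \mathbf{r}'$, then the result of both integrals must be zero. Let us construct a volume between an arbitrary surface $S_1$ containing $\mathbf{r} = \mathbf{r}'$ and $S_2$, the surface of a tiny sphere centered on $\mathbf{r} = \mathbf{r}'$. Since the point $\mathbf{r} = \mathbf{r}'$ is excluded by the tiny sphere, the result of either integral in the divergence theorem is still zero. However, on the tiny sphere (of radius $r_\epsilon$, where the outward normal of the enclosed volume points toward $\mathbf{r}'$) we have
\[ \oint_{S_2} \frac{\mathbf{r}-\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|^3}\cdot\hat{\mathbf{n}}\,da = -\int_0^{2\pi}\!\int_0^{\pi}\left(\frac{1}{r_\epsilon^2}\right)r_\epsilon^2\sin\theta\,d\theta\,d\phi = -4\pi \]
Therefore, for the outer surface $S_1$ (containing $\mathbf{r} = \mathbf{r}'$) we must have the equal and opposite result:
\[ \oint_{S_1} \frac{\mathbf{r}-\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|^3}\cdot\hat{\mathbf{n}}\,da = 4\pi \]
This implies
\[ \int_V \nabla_{\mathbf{r}}\cdot\frac{\mathbf{r}-\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|^3}\,dv = \begin{cases} 4\pi & \text{if } V \text{ contains } \mathbf{r}' \\ 0 & \text{otherwise} \end{cases} \]
The integrand exhibits the same characteristics as the delta function. Therefore, $\nabla_{\mathbf{r}}\cdot\dfrac{\mathbf{r}-\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|^3} = 4\pi\delta^3(\mathbf{r}'-\mathbf{r})$. The delta function is defined in (0.52).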
P0.16 Invert (0.15) to get both formulas in (0.16). HINT: You can get a second equation by considering Euler's equation with a negative angle $-\phi$.
P0.20 Write $A\cos(\omega t) + 2A\sin(\omega t + \pi/4)$ as a single phase-shifted cosine wave (i.e. find the amplitude and phase of the resultant cosine wave).
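P0.21 Prove that Fourier transforms have the property of linear superposition:
\[ \mathcal{F}\{a\,g(t) + b\,h(t)\} = a\,g(\omega) + b\,h(\omega) \]
P0.22 Prove $\mathcal{F}\{g(at)\} = \dfrac{1}{|a|}\,g\!\left(\dfrac{\omega}{a}\right)$.
P0.23 Prove $\mathcal{F}\{g(t-\tau)\} = g(\omega)\,e^{i\omega\tau}$.
P0.24 Show that the Fourier transform of $E(t) = E_0 e^{-t^2/2T^2}\cos\omega_0 t$ is
\[ E(\omega) = \frac{T E_0}{2}\left(e^{-\frac{(\omega+\omega_0)^2}{2/T^2}} + e^{-\frac{(\omega-\omega_0)^2}{2/T^2}}\right) \]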
P0.25 Take the inverse Fourier transform of the result in P0.24. Check that it
returns exactly the original function.
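\[
\begin{aligned}
&= \sqrt{2\pi}\;\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} g(t)\,e^{i\omega t}\,dt\;\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} h(t')\,e^{i\omega t'}\,dt' \\
&= \sqrt{2\pi}\,g(\omega)\,h(\omega)
\end{aligned}
\]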
Electromagnetic Phenomena
Here $\mathbf{E}$ and $\mathbf{B}$ represent electric and magnetic fields, respectively. The charge density $\rho$ describes the charge per volume distributed through space.$^3$ The current density $\mathbf{J}$ describes the motion of charge density (in units of $\rho$ times velocity). The constant $\epsilon_0$ is called the permittivity, and the constant $\mu_0$ is called the permeability. Taken together, these are known as Maxwell's equations.
After introducing a key revision of Ampere's law, Maxwell realized that together these equations comprise a complete self-consistent theory of electromagnetic phenomena. Moreover, the equations imply the existence of electromagnetic waves, which travel at the speed of light. Since the speed of light had been measured before Maxwell's time, it was immediately apparent (as was already suspected) that light is a high-frequency manifestation of the same phenomena that govern the influence of currents and charges upon each other. Previously, optics had been considered a topic quite separate from electricity and magnetism. Once the connection was made, it became clear that Maxwell's equations form the theoretical foundations of optics, and this is where we begin our study of light.
1 In Maxwell's original notation, this set of equations was hardly concise, written without the convenience of modern vector notation or $\nabla$. His formulation wouldn't fit easily on a T-shirt!
2 See J. D. Jackson, Classical Electrodynamics, 3rd ed., p. 1 (New York: John Wiley, 1999) or the back cover of D. J. Griffiths, Introduction to Electrodynamics, 3rd ed. (New Jersey: Prentice-Hall, 1999).
3 In other parts of this book, we use $\rho$ for the radius in cylindrical coordinates, not to be confused
vector. We have written the force in terms of an electric field $\mathbf{E}(\mathbf{r})$, which is defined throughout space (regardless of whether a second charge $q$ is actually present). The permittivity $\epsilon_0$ amounts to a proportionality constant.

Figure 1.1 The geometry of Coulomb's law for a point charge.

The total force from a collection of charges is found by summing expression (1.5) over all charges $q'_n$ associated with their specific locations $\mathbf{r}'_n$. If the charges are distributed continuously throughout space, having density $\rho(\mathbf{r}')$ (units of charge per volume), the summation for finding the net electric field at $\mathbf{r}$ becomes an integral:
\[ \mathbf{E}(\mathbf{r}) = \frac{1}{4\pi\epsilon_0}\int_V \rho(\mathbf{r}')\,\frac{\mathbf{r}-\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|^3}\,dv' \tag{1.7} \]
This three-dimensional integral gives the net electric field produced by the charge density $\rho$ that exists in volume $V$.
Gauss' law (1.1), the first of Maxwell's equations, follows directly from (1.7) with some mathematical manipulation. No new physical phenomenon is introduced in this process.$^5$
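\[ \nabla_{\mathbf{r}}\cdot\frac{\mathbf{r}-\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|^3} = 4\pi\delta^3(\mathbf{r}'-\mathbf{r}) \equiv 4\pi\,\delta(x'-x)\,\delta(y'-y)\,\delta(z'-z) \tag{1.9} \]
\[ \nabla\cdot\mathbf{E}(\mathbf{r}) = \frac{\rho(\mathbf{r})}{\epsilon_0} \]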
The (perhaps more familiar) integral form of Gauss' law can be obtained by integrating (1.1) over a volume $V$ and applying the divergence theorem (0.11) to the left-hand side:
\[ \oint_S \mathbf{E}(\mathbf{r})\cdot\hat{\mathbf{n}}\,da = \frac{1}{\epsilon_0}\int_V \rho(\mathbf{r})\,dv \tag{1.10} \]
This form of Gauss' law shows that the total electric field flux extruding through a closed surface $S$ (i.e. the integral on the left side) is proportional to the net charge contained within it (i.e. within volume $V$ contained by $S$).

Figure 1.3 Gauss' law in integral form relates the flux of the electric field through a surface to the charge contained inside that surface.
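Example 1.1
Suppose we have an electric field given by $\mathbf{E} = (x^2 y^3\,\hat{\mathbf{x}} + z^4\,\hat{\mathbf{y}})\cos\omega t$. Use Gauss' law (1.1) to find the charge density $\rho(x,y,z,t)$.
Solution:
\[ \rho = \epsilon_0\nabla\cdot\mathbf{E} = \epsilon_0\left(\hat{\mathbf{x}}\frac{\partial}{\partial x} + \hat{\mathbf{y}}\frac{\partial}{\partial y} + \hat{\mathbf{z}}\frac{\partial}{\partial z}\right)\cdot\left(x^2 y^3\,\hat{\mathbf{x}} + z^4\,\hat{\mathbf{y}}\right)\cos\omega t = 2\epsilon_0\,x y^3\cos\omega t \]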
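The result of Example 1.1 can be cross-checked by estimating the divergence with central finite differences. This is a minimal sketch: the function names, the default $\omega = 1$, the step size, and the sample point are assumptions made here, and `EPS0` is the vacuum permittivity in SI units.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m

def E_field(x, y, z, t, omega=1.0):
    """Field from Example 1.1: (x^2 y^3, z^4, 0) * cos(omega t)."""
    c = math.cos(omega * t)
    return (x * x * y**3 * c, z**4 * c, 0.0)

def charge_density_numeric(x, y, z, t, omega=1.0, h=1e-5):
    """eps0 times a central-difference estimate of div E."""
    dEx = (E_field(x + h, y, z, t, omega)[0] - E_field(x - h, y, z, t, omega)[0]) / (2 * h)
    dEy = (E_field(x, y + h, z, t, omega)[1] - E_field(x, y - h, z, t, omega)[1]) / (2 * h)
    dEz = (E_field(x, y, z + h, t, omega)[2] - E_field(x, y, z - h, t, omega)[2]) / (2 * h)
    return EPS0 * (dEx + dEy + dEz)

def charge_density_analytic(x, y, z, t, omega=1.0):
    """Closed-form result of Example 1.1: 2 eps0 x y^3 cos(omega t)."""
    return 2 * EPS0 * x * y**3 * math.cos(omega * t)

point = (1.0, 2.0, 3.0, 0.5)  # illustrative (x, y, z, t)
print(charge_density_numeric(*point), charge_density_analytic(*point))
```

Only the $x$ component contributes, since $E_y$ does not depend on $y$ and $E_z$ is zero.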
where
\[ \mathbf{B}(\mathbf{r}) = \frac{\mu_0}{4\pi}\int_V \mathbf{J}(\mathbf{r}')\times\frac{\mathbf{r}-\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|^3}\,dv' \tag{1.12} \]
The latter equation is known as the Biot-Savart law. The permeability $\mu_0$ dictates the strength of the magnetic field, given the current distribution.
As with Coulomb's law, we can apply mathematics to the Biot-Savart law to obtain another of Maxwell's equations. Nevertheless, the essential physics is already inherent in the Biot-Savart law.$^7$ Using the result from P0.4, we can rewrite (1.12) as$^8$
\[ \mathbf{B}(\mathbf{r}) = -\frac{\mu_0}{4\pi}\int_V \mathbf{J}(\mathbf{r}')\times\nabla_{\mathbf{r}}\frac{1}{|\mathbf{r}-\mathbf{r}'|}\,dv' = \frac{\mu_0}{4\pi}\nabla\times\int_V \frac{\mathbf{J}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,dv' \tag{1.13} \]
Since the divergence of a curl is identically zero (see P0.6), we get straight away the second of Maxwell's equations (1.2)
\[ \nabla\cdot\mathbf{B} = 0 \]
which is known as Gauss' law for magnetic fields. (Two equations down; two to go.)
The similarity between $\nabla\cdot\mathbf{B} = 0$ and $\nabla\cdot\mathbf{E} = \rho/\epsilon_0$, Gauss' law for electric fields, is immediately apparent. In integral form, Gauss' law for magnetic fields looks the same as (1.10), only with zero on the right-hand side. If one were to imagine the existence of magnetic monopoles (i.e. isolated north or south "charges"), then the right-hand side would not be zero. The law implies that the total magnetic flux extruding through any closed surface balances, with as many field lines pointing inwards as pointing outwards.

Jean-Baptiste Biot (1774-1862, French) was born in Paris. He attended the École Polytechnique where mathematician Gaspard Monge recognized his academic potential. After graduating, Biot joined the military and then took part in an insurrection on the side of the Royalists. He was captured, and his career might have met a tragic ending there had Monge not successfully pleaded for his release from jail. Biot went on to become a professor of physics at the Collège de France. Among other contributions, Biot participated in the first hot-air balloon ride with Gay-Lussac and correctly deduced that meteorites that fell on L'Aigle, France in 1803 came from space. Later Biot collaborated with the younger Félix Savart (1791-1841) on the theory of magnetism and electrical currents. They formulated their famous law in 1820. (Wikipedia)
Example 1.2
The field surrounding a magnetic dipole is given by
\[ \mathbf{B} = \left[3xz\,\hat{\mathbf{x}} + 3yz\,\hat{\mathbf{y}} + \left(3z^2 - r^2\right)\hat{\mathbf{z}}\right]/r^5 \]
where $r \equiv \sqrt{x^2 + y^2 + z^2}$. Show that this field satisfies Gauss' law for magnetic fields (1.2).
7 Like Coulomb's law, the Biot-Savart law is incomplete since it also implies an instantaneous response of the magnetic field to a reconfiguration of the currents. The generalized version of the Biot-Savart law, another of Jefimenko's equations, incorporates the fact that electromagnetic news travels at the speed of light. Ironically, Gauss' law for magnetic fields and Maxwell's version of Ampere's law, derived from the Biot-Savart law, hold perfectly whether the currents are steady or vary in time. The Jefimenko equations, analogs of Coulomb and Biot-Savart, also embody Faraday's law, the only one of Maxwell's equations that cannot be derived from the usual forms of Coulomb's law and the Biot-Savart law. See D. J. Griffiths, Introduction to Electrodynamics, 3rd ed., Sect. 10.2.2 (New Jersey: Prentice-Hall, 1999).
8 Note that $\nabla_{\mathbf{r}}$ ignores the variable of integration $\mathbf{r}'$.
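Solution:
\[
\begin{aligned}
\nabla\cdot\mathbf{B} &= 3\frac{\partial}{\partial x}\!\left(\frac{xz}{r^5}\right) + 3\frac{\partial}{\partial y}\!\left(\frac{yz}{r^5}\right) + \frac{\partial}{\partial z}\!\left(\frac{3z^2}{r^5} - \frac{1}{r^3}\right) \\
&= 3\left(\frac{z}{r^5} - \frac{5xz}{r^6}\frac{\partial r}{\partial x}\right) + 3\left(\frac{z}{r^5} - \frac{5yz}{r^6}\frac{\partial r}{\partial y}\right) + \left(\frac{6z}{r^5} - \frac{15z^2}{r^6}\frac{\partial r}{\partial z} + \frac{3}{r^4}\frac{\partial r}{\partial z}\right) \\
&= \frac{12z}{r^5} - \frac{15z}{r^6}\left(x\frac{\partial r}{\partial x} + y\frac{\partial r}{\partial y} + z\frac{\partial r}{\partial z}\right) + \frac{3}{r^4}\frac{\partial r}{\partial z}
\end{aligned}
\]
The necessary derivatives are $\partial r/\partial x = x\big/\sqrt{x^2 + y^2 + z^2} = x/r$, $\partial r/\partial y = y/r$, and $\partial r/\partial z = z/r$, which lead to
\[ \nabla\cdot\mathbf{B} = \frac{12z}{r^5} - \frac{15z}{r^5} + \frac{3z}{r^5} = 0 \]

1.3 Faraday's Law

Michael Faraday discovered that changing magnetic fields induce electric fields. This distinct physical effect, called induction, can be observed when a magnet is waved by a loop of wire. Faraday's law says that a change in magnetic flux through a circuit loop (see Fig. 1.4) induces a voltage around the loop according to
\[ \oint_C \mathbf{E}\cdot d\boldsymbol{\ell} = -\frac{\partial}{\partial t}\int_S \mathbf{B}\cdot\hat{\mathbf{n}}\,da \tag{1.14} \]
The right side describes a change in the magnetic flux through a surface, and the left side describes the voltage around the loop containing the surface.
We apply Stokes' theorem (0.12) to the left-hand side of Faraday's law and obtain
\[ \int_S (\nabla\times\mathbf{E})\cdot\hat{\mathbf{n}}\,da = -\frac{\partial}{\partial t}\int_S \mathbf{B}\cdot\hat{\mathbf{n}}\,da \quad\text{or}\quad \int_S \left(\nabla\times\mathbf{E} + \frac{\partial\mathbf{B}}{\partial t}\right)\cdot\hat{\mathbf{n}}\,da = 0 \tag{1.15} \]
Since this must hold for any choice of the surface $S$, the integrand itself must vanish:
\[ \nabla\times\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t} \]
which is the differential form of Faraday's law (1.3) (three of Maxwell's equations down; one to go).

Michael Faraday (1791-1867, English) was one of the greatest experimental physicists in history. Born on the outskirts of London, his family was not well off, his father being a blacksmith. The young Michael Faraday only had access to a very basic education, and so he was mostly self taught and never did acquire much skill in mathematics. As a teenager, he obtained a seven-year apprenticeship with a book binder, during which time he read many books, including books on science and electricity. Given his background, Faraday's entry into the scientific community was very gradual, from servant to assistant and eventually to director of the laboratory at the Royal Institution. Faraday is perhaps best known for his work that established the law of induction and for the discovery that magnetic fields can interact with light, known as the Faraday effect. He also made many advances to chemistry during his career including figuring out how to liquify several gases. Faraday was a deeply religious man, serving as a Deacon in his church. (Wikipedia)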
Figure 1.4 Faraday's law.

Example 1.3
For the electric field given in Example 1.1, $\mathbf{E} = (\alpha x^2 y^3\,\hat{\mathbf{x}} + \beta z^4\,\hat{\mathbf{y}})\cos\omega t$, use Faraday's law (1.3) to find $\mathbf{B}(x, y, z, t)$.
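Solution:
$$\frac{\partial\mathbf{B}}{\partial t} = -\nabla\times\mathbf{E} = -\cos\omega t\begin{vmatrix}\hat{\mathbf{x}} & \hat{\mathbf{y}} & \hat{\mathbf{z}}\\ \partial/\partial x & \partial/\partial y & \partial/\partial z\\ \alpha x^2 y^3 & \beta z^4 & 0\end{vmatrix}$$
$$= -\cos\omega t\left[\hat{\mathbf{x}}\left(\frac{\partial}{\partial y}(0) - \frac{\partial}{\partial z}\beta z^4\right) - \hat{\mathbf{y}}\left(\frac{\partial}{\partial x}(0) - \frac{\partial}{\partial z}\alpha x^2 y^3\right) + \hat{\mathbf{z}}\left(\frac{\partial}{\partial x}\beta z^4 - \frac{\partial}{\partial y}\alpha x^2 y^3\right)\right]$$
$$= \left(4\beta z^3\,\hat{\mathbf{x}} + 3\alpha x^2 y^2\,\hat{\mathbf{z}}\right)\cos\omega t$$
Integrating in time gives
$$\mathbf{B} = \left(4\beta z^3\,\hat{\mathbf{x}} + 3\alpha x^2 y^2\,\hat{\mathbf{z}}\right)\frac{\sin\omega t}{\omega}$$
plus possibly a constant field.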
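The result of Example 1.3 can be spot-checked numerically. The sketch below is only an illustration: the values of `ALPHA`, `BETA`, and `OMEGA`, the helper names, and the finite-difference step are assumptions chosen here, not part of the text. It evaluates $\nabla\times\mathbf{E} + \partial\mathbf{B}/\partial t$ by central differences, which should vanish to within numerical error.

```python
import math

# Illustrative constants (assumed values, not taken from the text)
ALPHA, BETA, OMEGA = 1.3, 0.7, 2.0

def E(x, y, z, t):
    """Electric field of Example 1.1."""
    c = math.cos(OMEGA * t)
    return (ALPHA * x**2 * y**3 * c, BETA * z**4 * c, 0.0)

def B(x, y, z, t):
    """Magnetic field found in Example 1.3 (constant field set to zero)."""
    s = math.sin(OMEGA * t) / OMEGA
    return (4 * BETA * z**3 * s, 0.0, 3 * ALPHA * x**2 * y**2 * s)

def partial(f, comp, var, p, h=1e-5):
    """Central difference of component `comp` of f with respect to argument number `var`."""
    hi = list(p)
    lo = list(p)
    hi[var] += h
    lo[var] -= h
    return (f(*hi)[comp] - f(*lo)[comp]) / (2 * h)

def curl(f, p):
    """Curl of the vector field f at the point p = (x, y, z, t)."""
    return (
        partial(f, 2, 1, p) - partial(f, 1, 2, p),
        partial(f, 0, 2, p) - partial(f, 2, 0, p),
        partial(f, 1, 0, p) - partial(f, 0, 1, p),
    )

def faraday_residual(p):
    """Return curl E + dB/dt at p; each component should be close to zero."""
    c = curl(E, p)
    dB_dt = tuple(partial(B, i, 3, p) for i in range(3))
    return tuple(ci + di for ci, di in zip(c, dB_dt))
```

Calling `faraday_residual((0.4, -1.1, 0.8, 0.3))` at an arbitrary point returns three numbers that differ from zero only by finite-difference error.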
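The last term in (1.18) vanishes if we assume that the current density J is completely contained within the volume V so that it is zero at the surface S. Thus, the expression for the curl of B(r) reduces to
$$\nabla_{\mathbf{r}}\times\mathbf{B}(\mathbf{r}) = \mu_0\mathbf{J}(\mathbf{r}) - \frac{\mu_0}{4\pi}\int_V\left[\nabla_{\mathbf{r}'}\cdot\mathbf{J}(\mathbf{r}')\right]\frac{\mathbf{r} - \mathbf{r}'}{|\mathbf{r} - \mathbf{r}'|^3}\,dv' \tag{1.19}$$
$$\nabla\cdot\mathbf{J} \cong 0 \quad\text{(steady-state approximation)} \tag{1.20}$$
$$\nabla\times\mathbf{B} = \mu_0\mathbf{J} \tag{1.21}$$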
where $\hat{\mathbf{n}}$ is the outward normal to the surface. The units on this equation are those of current, or charge per time, leaving the volume.
Since we have considered a closed surface S, the net current leaving the enclosed volume V must be the same as the rate at which charge within the volume vanishes:
$$I = -\frac{\partial}{\partial t}\int_V \rho\,dv \tag{1.25}$$
Upon equating these two expressions for current, as well as applying the divergence theorem (0.11) to the former, we get
$$\int_V \nabla\cdot\mathbf{J}\,dv = -\frac{\partial}{\partial t}\int_V \rho\,dv \quad\text{or}\quad \int_V\left(\nabla\cdot\mathbf{J} + \frac{\partial\rho}{\partial t}\right)dv = 0 \tag{1.26}$$
well after Maxwell published his laws of electromagnetism, so in practice Maxwell accomplished much more than merely fixing Ampere's law.
10 Based on (1.27), one might think that the displacement current $\epsilon_0\,\partial\mathbf{E}/\partial t$ ought to be zero in a region of space with no charge density $\rho$. However, in (1.27) $\rho$ appears in a volume integral over a region of space sufficiently large (consistent with a previous supposition) to include any charges responsible for the field E; presumably, all fields arise from sources.
Example 1.4
(a) Use Gauss's law to find the electric field in a gap that interrupts a current-carrying wire, as shown in Fig. 1.6.
(b) Find the strength of the magnetic field on contour C using Ampere's law applied to surface $S_1$.
(c) Show that the displacement current in the gap leads to the identical magnetic field when using surface $S_2$.

Figure 1.6 Charging capacitor.

Solution: (a) We'll assume that the wire, with cross-sectional area A, is much wider than the gap separation. Then the electric field in the gap will be uniform, and the integral on the left-hand side of (1.10) reduces to $EA$ since there is essentially no field other than in the gap. If the accumulated charge on the plate is Q, then the right-hand side of (1.10) integrates to $Q/\epsilon_0$, and the electric field turns out to be $E = Q/(\epsilon_0 A)$.
(b) Let the contour C be a circle at radius r. The magnetic field points around the circumference with constant strength. The left-hand side of (1.22) becomes $2\pi r B$ while the right-hand side is
$$\mu_0\int_S \mathbf{J}\cdot\hat{\mathbf{n}}\,da = \mu_0 I = \mu_0\frac{\partial Q}{\partial t}$$
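Example 1.5
For the electric field $\mathbf{E} = (\alpha x^2 y^3\,\hat{\mathbf{x}} + \beta z^4\,\hat{\mathbf{y}})\cos\omega t$ (see Example 1.1) and the associated magnetic field $\mathbf{B} = (4\beta z^3\,\hat{\mathbf{x}} + 3\alpha x^2 y^2\,\hat{\mathbf{z}})\,(\sin\omega t)/\omega$ (see Example 1.3), find the current density $\mathbf{J}(x, y, z, t)$.
Solution:
$$\mathbf{J} = \frac{\nabla\times\mathbf{B}}{\mu_0} - \epsilon_0\frac{\partial\mathbf{E}}{\partial t} = \frac{\sin\omega t}{\mu_0\omega}\begin{vmatrix}\hat{\mathbf{x}} & \hat{\mathbf{y}} & \hat{\mathbf{z}}\\ \partial/\partial x & \partial/\partial y & \partial/\partial z\\ 4\beta z^3 & 0 & 3\alpha x^2 y^2\end{vmatrix} + \epsilon_0\omega\left(\alpha x^2 y^3\,\hat{\mathbf{x}} + \beta z^4\,\hat{\mathbf{y}}\right)\sin\omega t$$
$$= \frac{\sin\omega t}{\mu_0\omega}\left[6\alpha x^2 y\,\hat{\mathbf{x}} - 6\alpha x y^2\,\hat{\mathbf{y}} + 12\beta z^2\,\hat{\mathbf{y}}\right] + \epsilon_0\omega\left(\alpha x^2 y^3\,\hat{\mathbf{x}} + \beta z^4\,\hat{\mathbf{y}}\right)\sin\omega t$$
$$= \left[\left(\epsilon_0\omega\alpha x^2 y^3 + \frac{6\alpha x^2 y}{\mu_0\omega}\right)\hat{\mathbf{x}} + \left(\epsilon_0\omega\beta z^4 + \frac{12\beta z^2}{\mu_0\omega} - \frac{6\alpha x y^2}{\mu_0\omega}\right)\hat{\mathbf{y}}\right]\sin\omega t$$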
$$\mathbf{J} = \mathbf{J}_{\text{free}} + \mathbf{J}_m + \mathbf{J}_p \tag{1.28}$$
First, as you might expect, currents can arise from free charges in motion such as electrons in a metal, referred to as $\mathbf{J}_{\text{free}}$. Second, individual atoms can exhibit internal currents that give rise to paramagnetic and diamagnetic effects, denoted by $\mathbf{J}_m$. These are seldom important in optics problems, and so we will ignore these types of currents by writing $\mathbf{J}_m = 0$. Third, molecules in a material can elongate, becoming dipoles in response to an applied electric field. This gives rise to a polarization current $\mathbf{J}_p$.
The polarization current is associated with a dipole distribution function P, called the polarization11 (in units of dipoles per volume, or charge times length per volume). Physically, if the dipoles (depicted in Fig. 1.7) change their strength or orientation as a function of time, an effective current density arises in the medium. Note that the time-derivative of an individual dipole moment renders charge times velocity. Thus, the time derivative of sloshing dipoles per volume gives a current density equal to
$$\mathbf{J}_p = \frac{\partial\mathbf{P}}{\partial t} \tag{1.29}$$
Figure 1.7 A polarized medium with $\nabla\cdot\mathbf{P} = 0$.
Next, we turn our attention to the charge density, which is often decomposed as
$$\rho = \rho_{\text{free}} + \rho_p \tag{1.30}$$
We seldom consider the propagation of electromagnetic waveforms through electrically charged materials, and so in this book we will always write $\rho_{\text{free}} = 0$. One might be tempted in this case to assume that the overall charge density is zero, but this would be wrong. Even though a material is electrically neutral, the polarization P can vary in space, leading to local concentrations of positive or negative charges. This type of charge density is denoted by $\rho_p$. It arises from nonuniform arrangements of dipoles, as depicted in Fig. 1.8.
To connect $\rho_p$ with P, we write the continuity equation (1.23) for the current and charge densities associated with the polarization:
$$\nabla\cdot\mathbf{J}_p = -\frac{\partial\rho_p}{\partial t} \tag{1.31}$$
Substitution of (1.29) into this equation immediately yields
$$\rho_p = -\nabla\cdot\mathbf{P} \tag{1.32}$$
Figure 1.8 A polarized medium with $\nabla\cdot\mathbf{P} \neq 0$.
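A one-dimensional sketch may help make (1.32) concrete. The polarization profile below, $P_x = P_0 e^{-x^2}$, is a hypothetical choice made only for illustration, as are the function names and the integration limits. The bound charge density $\rho_p = -\partial P_x/\partial x$ is positive on one side and negative on the other, while the total bound charge integrates to zero, consistent with an electrically neutral medium.

```python
import math

P0 = 1.0  # peak polarization in arbitrary units (assumed value)

def P(x):
    """x-component of a hypothetical polarization profile P0 * exp(-x^2)."""
    return P0 * math.exp(-x * x)

def rho_p(x, h=1e-5):
    """Bound charge density rho_p = -div P; in 1D this is -dP/dx (central difference)."""
    return -(P(x + h) - P(x - h)) / (2 * h)

def total_bound_charge(a=-8.0, b=8.0, n=4000):
    """Trapezoid-rule integral of rho_p over [a, b], per unit cross-sectional area."""
    dx = (b - a) / n
    total = 0.5 * (rho_p(a) + rho_p(b))
    for i in range(1, n):
        total += rho_p(a + i * dx)
    return total * dx
```

Here `rho_p(0.5)` and `rho_p(-0.5)` have equal magnitude and opposite sign, and `total_bound_charge()` is zero to within rounding error.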
11 Unfortunately, the word polarization gets double usage. It also refers to the orientation of the
The left-hand side is a surface integral, which after integrating gives units of charge. Physically, it is the sum of the charges touching the inside of surface S (multiplied by a minus since by convention dipole vectors point from the negatively charged end of a molecule to the positively charged end). When $\nabla\cdot\mathbf{P}$ is zero, there are equal numbers of positive and negative charges touching S from within, as depicted in Fig. 1.7. When $\nabla\cdot\mathbf{P}$ is not zero, the positive and negative charges touching S are not balanced, as depicted in Fig. 1.8. Essentially, excess charge ends up within the volume because the non-uniform alignment of dipoles causes them to be cut preferentially at the surface.12
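$$\nabla\cdot\mathbf{E} = -\frac{\nabla\cdot\mathbf{P}}{\epsilon_0} \quad\text{(Gauss's law)} \tag{1.33}$$
$$\nabla\cdot\mathbf{B} = 0 \quad\text{(Gauss's law for magnetism)} \tag{1.34}$$
$$\nabla\times\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t} \quad\text{(Faraday's law)} \tag{1.35}$$
$$\nabla\times\frac{\mathbf{B}}{\mu_0} = \epsilon_0\frac{\partial\mathbf{E}}{\partial t} + \frac{\partial\mathbf{P}}{\partial t} + \mathbf{J}_{\text{free}} \quad\text{(Ampere's law; fixed by Maxwell)} \tag{1.36}$$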
cutting any dipoles. However, the function P(r) is continuous, while the figures depict crudely just a few dipoles. In a continuous material you can't draw a surface that avoids cutting dipoles.
13 It is not uncommon to see the macroscopic Maxwell equations written in terms of two auxiliary fields: H and D. The field H is useful in magnetic materials. In these materials, the combination $\mathbf{B}/\mu_0$ in Ampere's law is replaced by $\mathbf{H} \equiv \mathbf{B}/\mu_0 - \mathbf{M}$, where $\mathbf{J}_m = \nabla\times\mathbf{M}$ is the current associated with the material's magnetization. Since we only consider nonmagnetic materials (M = 0), there is little point in using H. The field D, called the displacement, is defined as $\mathbf{D} \equiv \epsilon_0\mathbf{E} + \mathbf{P}$. This combination of E and P occurs in Coulomb's law and Ampere's law. For physical clarity, the authors of this book elect to retain the prominence of the polarization P in the equations.
that $1/\sqrt{\epsilon_0\mu_0}$ gives the correct speed of light $c = 3\times 10^8$ m/s (which had previously been measured). Faraday and Kerr had observed that strong magnetic and electric fields affect light propagating in crystals. The time was right to suspect that light was an electromagnetic phenomenon taking place at high frequency.
At first glance, Maxwell's equations might not immediately suggest (to the inexperienced eye) that waves are solutions. However, we can manipulate the equations (first-order differential equations that couple E to B) into the familiar wave equation (decoupled second-order differential equations for either E or B). You should become familiar with this derivation. In what follows, we will derive the wave equation for E. The derivation of the wave equation for B is very similar (see problem P1.6).
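$$\nabla\times(\nabla\times\mathbf{E}) + \frac{\partial}{\partial t}(\nabla\times\mathbf{B}) = 0 \tag{1.37}$$
We may eliminate $\nabla\times\mathbf{B}$ by substitution from (1.4), which gives
$$\nabla\times(\nabla\times\mathbf{E}) + \mu_0\epsilon_0\frac{\partial^2\mathbf{E}}{\partial t^2} = -\mu_0\frac{\partial\mathbf{J}}{\partial t} \tag{1.38}$$
Next we apply the vector identity (0.10), $\nabla\times(\nabla\times\mathbf{E}) = \nabla(\nabla\cdot\mathbf{E}) - \nabla^2\mathbf{E}$, and use Gauss's law (1.1) to replace the term $\nabla\cdot\mathbf{E}$, which brings us to
$$\nabla^2\mathbf{E} - \mu_0\epsilon_0\frac{\partial^2\mathbf{E}}{\partial t^2} = \mu_0\frac{\partial\mathbf{J}}{\partial t} + \frac{\nabla\rho}{\epsilon_0} \tag{1.39}$$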
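Example 1.7
Show that the electric field
$$\mathbf{E} = (\alpha x^2 y^3\,\hat{\mathbf{x}} + \beta z^4\,\hat{\mathbf{y}})\cos\omega t$$
together with the associated charge density and current density
$$\rho = 2\epsilon_0\alpha x y^3\cos\omega t$$
$$\mathbf{J} = \left[\left(\epsilon_0\omega\alpha x^2 y^3 + \frac{6\alpha x^2 y}{\mu_0\omega}\right)\hat{\mathbf{x}} + \left(\epsilon_0\omega\beta z^4 + \frac{12\beta z^2}{\mu_0\omega} - \frac{6\alpha x y^2}{\mu_0\omega}\right)\hat{\mathbf{y}}\right]\sin\omega t$$
satisfy the wave equation (1.39).
Solution: We have
$$\nabla^2\mathbf{E} - \mu_0\epsilon_0\frac{\partial^2\mathbf{E}}{\partial t^2} = \left[\alpha\left(2y^3 + 6x^2 y\right)\hat{\mathbf{x}} + 12\beta z^2\,\hat{\mathbf{y}}\right]\cos\omega t + \mu_0\epsilon_0\omega^2\left(\alpha x^2 y^3\,\hat{\mathbf{x}} + \beta z^4\,\hat{\mathbf{y}}\right)\cos\omega t$$
$$= \left[\alpha\left(2y^3 + 6x^2 y + \mu_0\epsilon_0\omega^2 x^2 y^3\right)\hat{\mathbf{x}} + \beta\left(12z^2 + \mu_0\epsilon_0\omega^2 z^4\right)\hat{\mathbf{y}}\right]\cos\omega t$$
Similarly,
$$\mu_0\frac{\partial\mathbf{J}}{\partial t} + \frac{\nabla\rho}{\epsilon_0} = \left[\left(\mu_0\epsilon_0\omega^2\alpha x^2 y^3 + 6\alpha x^2 y\right)\hat{\mathbf{x}} + \left(\mu_0\epsilon_0\omega^2\beta z^4 + 12\beta z^2 - 6\alpha x y^2\right)\hat{\mathbf{y}}\right]\cos\omega t + \left(2\alpha y^3\,\hat{\mathbf{x}} + 6\alpha x y^2\,\hat{\mathbf{y}}\right)\cos\omega t$$
$$= \left[\alpha\left(2y^3 + 6x^2 y + \mu_0\epsilon_0\omega^2 x^2 y^3\right)\hat{\mathbf{x}} + \beta\left(12z^2 + \mu_0\epsilon_0\omega^2 z^4\right)\hat{\mathbf{y}}\right]\cos\omega t$$
The two expressions are identical, and the wave equation is satisfied.14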
The magnetic field B satisfies a similar wave equation, decoupled from E (see P1.6). However, the two waves are not independent. The fields for E and B must be chosen to be consistent with each other through Maxwell's equations. After solving the wave equation (1.40) for E, one can obtain the consistent B from E via Faraday's law (1.35).
In vacuum all of the terms on the right-hand side in (1.40) are zero, in which case the wave equation reduces to
$$\nabla^2\mathbf{E} - \mu_0\epsilon_0\frac{\partial^2\mathbf{E}}{\partial t^2} = 0 \quad\text{(vacuum)} \tag{1.41}$$
14 The expressions in Example 1.7 hardly look like waves. The (quite unlikely) current and charge distributions, which fill all space, would have to be artificially induced rather than arise naturally in response to a field disturbance on a medium.
Solutions to this equation can take on every imaginable functional shape (specified at a given instant, with the evolution thereafter being controlled by (1.41)). Moreover, since the differential equation is linear, any number of solutions can be added together to create other valid solutions. Consider the subclass of solutions that propagate in a particular direction. These waveforms preserve shape while traveling with speed
$$c \equiv \frac{1}{\sqrt{\epsilon_0\mu_0}} = 2.9979\times 10^8\ \text{m/s} \tag{1.42}$$
In this case, E depends on the argument $\hat{\mathbf{u}}\cdot\mathbf{r} - ct$, where $\hat{\mathbf{u}}$ is a unit vector specifying the direction of propagation. The shape is preserved since features occurring at a given position recur downstream at a distance $ct$ after a time $t$. By checking this solution in (1.41), one confirms that the speed of propagation is c (see P1.8). As mentioned previously, one may add together any combination of solutions (even with differing directions of propagation) to form other valid solutions.
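The shape-preserving property can be illustrated with a short numerical sketch. The Gaussian pulse, the choice of propagation along z, and the step sizes are assumptions made here for illustration; any smooth function of $z - ct$ would serve. The code estimates $\partial^2 E/\partial z^2 - (1/c^2)\,\partial^2 E/\partial t^2$ by finite differences, which should be nearly zero.

```python
import math

C = 2.9979e8  # speed of light in m/s, from (1.42)

def pulse(s):
    """Arbitrary smooth shape; a Gaussian of roughly 1 m width is an assumed example."""
    return math.exp(-s * s)

def field(z, t):
    """Waveform that depends only on the argument z - c t (propagation along z)."""
    return pulse(z - C * t)

def wave_residual(z, t, hz=1e-3, ht=1e-3 / C):
    """Finite-difference estimate of d2E/dz2 - (1/c^2) d2E/dt2 at (z, t)."""
    d2z = (field(z + hz, t) - 2 * field(z, t) + field(z - hz, t)) / hz**2
    d2t = (field(z, t + ht) - 2 * field(z, t) + field(z, t - ht)) / ht**2
    return d2z - d2t / C**2
```

For instance, `wave_residual(0.3, 1e-9)` is negligibly small, and `field(0.5 + C * 2e-9, 2e-9)` reproduces `pulse(0.5)`: the feature originally at z = 0.5 m has moved downstream by $ct$.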
Exercises
P1.1 Consider an infinitely long hollow cylinder with inner radius a and outer radius b as shown in Fig. 1.9. Assume that the cylinder has a charge density $\rho = k/s^2$ for $a < s < b$ and no charge elsewhere, where s is the radial distance from the axis of the cylinder. Use Gauss's law in integral form to find the electric field produced by this charge for each of the three regions: $s < a$, $a < s < b$, and $s > b$.
HINT: For each region first draw an appropriate Gaussian surface and integrate the charge density over the volume to figure out the enclosed charge. Then use Gauss's law in integral form and the symmetry of the problem to solve for the electric field.
Figure 1.9 A charged cylinder with charge located between a and b.
P1.3 A conducting cylinder with the same geometry as P1.1 carries a current density $\mathbf{J} = (k/s)\,\hat{\mathbf{z}}$ along the axis of the cylinder for $a < s < b$, where s is the radial distance from the axis of the cylinder. Using Ampere's law in integral form, find the magnetic field due to this current in regions (a) $s < a$, (b) $a < s < b$, and (c) $s > b$.
HINT: For each region first draw an appropriate Amperian loop and integrate the current density over the surface to figure out how much current passes through the loop. Then use Ampere's law in integral form and the symmetry of the problem to solve for the magnetic field.
P1.4 Check that the E and B fields in P1.2 satisfy the rest of Maxwell's equations:
(a) (1.1). What must $\rho$ be?
(b) (1.2).
(c) (1.4). What must J be?
P1.6 Derive the wave equation for the magnetic field B in vacuum (i.e. J = 0 and $\rho$ = 0).
P1.7 Show that the magnetic field in P1.2 is consistent with the wave equation derived in P1.6. What is the requirement on k and $\omega$?
P1.8 Verify that $\mathbf{E}(\hat{\mathbf{u}}\cdot\mathbf{r} - ct)$ satisfies the vacuum wave equation (1.41), where E has an arbitrary functional form.
(e) Use (1.33) to show that $\mathbf{E}_0$ and $\hat{\mathbf{u}}$ must be perpendicular to each other in vacuum.
L1.10 Measure the speed of light using a rotating mirror. Provide an estimate of the experimental uncertainty in your answer (not the percentage error from the known value). (video)
Figure 1.10 shows a simplified geometry for the optical path for light in this experiment. Laser light from A reflects from a rotating mirror at B towards C. The light returns to B, where the mirror has rotated, sending the light to point D. Notice that a mirror rotation of $\theta$ deflects the beam by $2\theta$.
Figure 1.10 Geometry for lab 1.10. (Labels: Laser, A; Rotating Mirror, B; Delay Path, C; Screen, D.)
Figure: Layout of the rotating-mirror measurement. (Labels: Laser; Rotating mirror; Collimation Telescope; Long Corridor; Retro-reflecting; Front of laser can serve as screen for returning light.)
P1.11 Ole Roemer made the first successful measurement of the speed of light in 1676 by observing the orbital period of Io, a moon of Jupiter with a period of 42.5 hours. When Earth is moving toward Jupiter, the period is measured to be slightly less, owing to decreasing Jupiter-Earth distance between successive Io orbits. When Earth is moving away from Jupiter, the situation is reversed, and the period is measured to be slightly longer.
(a) If you were to measure the time for 40 observed orbits of Io when Earth is moving directly toward Jupiter and then later measure the time for 40 observed orbits when Earth is moving directly away from Jupiter, what would you expect the difference between these two measurements to be? Take the Earth's orbital radius to be $1.5\times 10^{11}$ m. To simplify the geometry, just assume that Earth moves directly toward or away from Jupiter over the entire 40 orbits (see Fig. 1.12).
(b) Roemer did the experiment described in part (a), and experimentally measured a 22 minute difference. What speed of light would one deduce from that value?

Figure 1.12 Geometry for P1.11. (Labels: Sun, Earth, Io, Jupiter.)

Ole Roemer (1644-1710, Danish) was a man of many interests. In addition to measuring the speed of light, he created a temperature scale which with slight modification became the Fahrenheit scale, introduced a system of standard weights and measures, and was heavily involved in civic affairs (city planning, etc.). Scientists initially became interested in Io's orbit because its eclipse (when it went behind Jupiter) was an event that could be seen from many places on earth. By comparing accurate measurements of the local time when Io was eclipsed by Jupiter at two remote places on earth, scientists in the 1600s were able to determine the longitude difference between the two places.

P1.12 In an isotropic nonconducting medium (i.e. $\nabla\cdot\mathbf{P} = 0$, $\mathbf{J}_{\text{free}} = 0$), the polarization under certain assumptions can be written as a function of the electric field: $\mathbf{P} = \epsilon_0\chi(E)\,\mathbf{E}$, where $\chi(E) = \chi_1 + \chi_2 E + \chi_3 E^2 + \cdots$. The higher-order coefficients in the expansion (i.e. $\chi_2$, $\chi_3$, ...) are typically small, so only the first term is important unless the field is very strong. Nonlinear optics deals with the study of intense light-matter interactions, where the higher-order terms in the expansion become important. This can lead to phenomena such as harmonic generation.
Starting with (1.40), show that the wave equation becomes:
$$\nabla^2\mathbf{E} - \mu_0\epsilon_0\left(1 + \chi_1\right)\frac{\partial^2\mathbf{E}}{\partial t^2} = \mu_0\epsilon_0\frac{\partial^2\left[\left(\chi_2 E + \chi_3 E^2 + \cdots\right)\mathbf{E}\right]}{\partial t^2}$$
Chapter 2
Plane Waves and Refractive Index
$$\mathbf{E}(\mathbf{r}, t) = \mathbf{E}_0\cos\left(\mathbf{k}\cdot\mathbf{r} - \omega t + \phi\right) \tag{2.2}$$
Here $\phi$ represents an arbitrary (constant) phase term. The vector k, called the
where k has units of inverse length, $\hat{\mathbf{u}}$ is a unit vector defining the direction of propagation, and $\lambda_{\text{vac}}$ is the length by which r must vary (in the direction of $\hat{\mathbf{u}}$) to cause the cosine to go through a complete cycle. This distance is known as the (vacuum) wavelength. The frequency of oscillation is related to the wavelength via
$$\omega = \frac{2\pi c}{\lambda_{\text{vac}}} \quad\text{(vacuum)} \tag{2.4}$$
The frequency $\omega$ has units of radians per second. Frequency is also often expressed as $\nu \equiv \omega/2\pi$ in units of cycles per second or Hz. Notice that k and $\omega$ cannot be chosen independently; the wave equation requires them to be related through the dispersion relation
$$k = \frac{\omega}{c} \quad\text{(vacuum)} \tag{2.5}$$
Typical values for $\lambda_{\text{vac}}$ are given in Fig. 2.1. Sometimes the spatial period of the wave is expressed as $1/\lambda_{\text{vac}}$, in units of cm$^{-1}$, called the wave number.
A magnetic wave accompanies any electric wave, and it obeys a similar wave equation (see P1.6). The magnetic wave corresponding to (2.2) is
$$\mathbf{B}(\mathbf{r}, t) = \mathbf{B}_0\cos\left(\mathbf{k}\cdot\mathbf{r} - \omega t + \phi\right), \tag{2.6}$$
(2.6) must be identical. Therefore, in vacuum the electric and magnetic fields travel in phase. In addition, Faraday's law requires (see P1.2)
$$\mathbf{B}_0 = \frac{\mathbf{k}\times\mathbf{E}_0}{\omega} \tag{2.7}$$
The above cross product means that $\mathbf{B}_0$ is perpendicular to both $\mathbf{E}_0$ and k. Meanwhile, Gauss's law $\nabla\cdot\mathbf{E} = 0$ forces k to be perpendicular to $\mathbf{E}_0$. It follows that the magnitudes of the fields are related through $B_0 = kE_0/\omega$ or $B_0 = E_0/c$, in view of (2.5).
Figure 2.1 The electromagnetic spectrum. (Axis: Frequency (Hz). Regions labeled: AM, Radar, Microwave, Infrared, Visible, Ultraviolet, X-rays, Gamma Rays.)
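The vacuum relations (2.4) and (2.5), together with $B_0 = E_0/c$, can be collected in a small helper. This is a minimal sketch: the function name, the dictionary keys, and the example inputs are assumptions introduced here, not notation from the text.

```python
import math

C = 2.9979e8  # vacuum speed of light in m/s

def plane_wave_parameters(lambda_vac, E0):
    """Vacuum plane-wave quantities for wavelength lambda_vac (m) and field amplitude E0 (V/m)."""
    omega = 2 * math.pi * C / lambda_vac           # (2.4), radians per second
    return {
        "omega": omega,
        "nu": omega / (2 * math.pi),               # cycles per second (Hz)
        "k": omega / C,                            # dispersion relation (2.5), rad/m
        "wavenumber_cm": 1 / (lambda_vac * 100),   # 1/lambda_vac expressed in cm^-1
        "B0": E0 / C,                              # magnitude relation B0 = E0 / c, in tesla
    }
```

For green light, `plane_wave_parameters(500e-9, 1000.0)` gives a frequency near $6\times 10^{14}$ Hz and a wave number of 20000 cm$^{-1}$; the magnetic amplitude is smaller than the electric amplitude by the factor c.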
The influence of the magnetic field only becomes important (in comparison to the electric field) for charged particles moving near the speed of light. This typically takes place only for extremely intense lasers ($> 10^{18}$ W/cm$^2$, see P2.12) where the electric field is sufficiently strong to cause electrons to oscillate with velocities near the speed of light. We will be interested in optics problems that take place at far less intensity where the effects of the magnetic field can typically be safely ignored. Throughout the remainder of this book, we will focus our attention mainly on the electric field with the understanding that we can at any time deduce the (less important) magnetic field from the electric field via Faraday's law.
Figure 2.2 depicts the electric field (2.2) and the associated magnetic field (2.6). The figure is deceptive since the fields don't actually look like transverse waves on a string. The wave is composed of large planar sheets of uniform field strength (difficult to draw). The name plane wave is given since a constant argument in (2.2) at any moment describes a plane, which is perpendicular to k. A plane wave fills all space and may be thought of as a series of infinite sheets, each with a different uniform field strength, moving in the k direction.
$$\tilde{\mathbf{E}}_0 \equiv \mathbf{E}_0 e^{i\phi} \tag{2.9}$$
1 We have assumed that each vector component of the field propagates with the same phase. To
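Example 2.1
Verify that the complex plane wave (2.10) is a solution to the wave equation (2.1).
Solution: The Laplacian term gives
$$\nabla^2\mathbf{E}_0 e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} = \mathbf{E}_0\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right)e^{i(k_x x + k_y y + k_z z - \omega t)}$$
$$= -\mathbf{E}_0\left(k_x^2 + k_y^2 + k_z^2\right)e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} = -k^2\mathbf{E}_0 e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} \tag{2.11}$$
while the time-derivative term gives
$$\frac{1}{c^2}\frac{\partial^2}{\partial t^2}\mathbf{E}_0 e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} = -\frac{\omega^2}{c^2}\mathbf{E}_0 e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} \tag{2.12}$$
Upon insertion into (2.1) we obtain the vacuum dispersion relation (2.5), which specifies the connection between the wavenumber k and the frequency $\omega$.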
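$$\nabla^2\mathbf{E} - \mu_0\epsilon_0\frac{\partial^2\mathbf{E}}{\partial t^2} = \mu_0\frac{\partial^2\mathbf{P}}{\partial t^2} \tag{2.13}$$
Since we are considering sinusoidal waves, we consider solutions of the form
$$\mathbf{E} = \mathbf{E}_0 e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}, \qquad \mathbf{P} = \mathbf{P}_0 e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} \tag{2.14}$$
$$\mathbf{P}_0(\omega) = \epsilon_0\chi(\omega)\,\mathbf{E}_0(\omega) \tag{2.16}$$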
where an overall phase $\phi$ was formerly held in the complex vector $\tilde{\mathbf{E}}_0$.6 (The tilde had been suppressed.) Figure 2.3 shows a graph of (2.21). The imaginary part of the index $\kappa$ causes the wave to decay as it travels. This accounts for absorption. The real part of the index n is associated with the oscillations of the wave. By inspection of the cosine argument in (2.21), we see that the speed of the (diminishing) sinusoidal wave fronts is
$$v_{\text{phase}}(\omega) = \frac{c}{n(\omega)} \tag{2.22}$$
It is apparent that $n(\omega)$ is the ratio of the speed of the light in vacuum to the speed of the wave in the material.
In a dielectric material, the vacuum relations (2.3) and (2.4) are modified to read
$$\text{Re}\{\mathbf{k}\} \equiv \frac{2\pi}{\lambda}\hat{\mathbf{u}}, \tag{2.23}$$
where
$$\lambda \equiv \lambda_{\text{vac}}/n. \tag{2.24}$$
While the frequency $\omega$ is the same, whether in a material or in vacuum, the wavelength $\lambda$ varies with the real part of the index n.
Figure 2.3 Electric field of a decaying plane wave. For convenience in plotting, the direction of propagation is chosen to be in the z direction (i.e. $\hat{\mathbf{u}} = \hat{\mathbf{z}}$).
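Example 2.2
When $n = 1.5$, $\kappa = 0.1$, and $\nu = 5\times 10^{14}$ Hz, find (a) the wavelength inside the material, and (b) the propagation distance over which the amplitude of the wave diminishes by the factor $e^{-1}$ (called the skin depth).
Solution:
(a)
$$\lambda = \frac{\lambda_{\text{vac}}}{n} = \frac{2\pi c}{n\omega} = \frac{c}{n\nu} = \frac{3\times 10^8\ \text{m/s}}{1.5\left(5\times 10^{14}\ \text{Hz}\right)} = 400\ \text{nm}$$
(b)
$$e^{-\frac{\kappa\omega}{c}z} = e^{-1} \quad\Rightarrow\quad z = \frac{c}{\kappa\omega} = \frac{c}{2\pi\kappa\nu} = \frac{3\times 10^8\ \text{m/s}}{2\pi(0.1)\left(5\times 10^{14}\ \text{Hz}\right)} \approx 950\ \text{nm}$$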
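The arithmetic of Example 2.2 is easy to reproduce. In the sketch below the function names are assumptions introduced here, and $c$ is rounded to $3\times 10^8$ m/s as in the example.

```python
import math

def wavelength_in_material(n, nu, c=3e8):
    """Wavelength inside the material, lambda = c / (n * nu), in meters."""
    return c / (n * nu)

def skin_depth(kappa, nu, c=3e8):
    """Distance over which the amplitude falls by 1/e: z = c / (2 * pi * kappa * nu)."""
    return c / (2 * math.pi * kappa * nu)
```

With `n = 1.5`, `kappa = 0.1`, and `nu = 5e14`, the first function returns 400 nm and the second returns a little over 950 nm.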
$$(n + i\kappa)^2 = n^2 - \kappa^2 + i2n\kappa = 1 + \text{Re}\{\chi\} + i\,\text{Im}\{\chi\} = 1 + \chi \tag{2.25}$$
6 For the sake of simplicity in writing (2.21) we assume linearly polarized light. That is, all vector components of $\mathbf{E}_0$ have the same complex phase $\phi$. We will consider other possibilities, such as circularly polarized light, in chapter 6.
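The real parts and the imaginary parts in the above equation are separately equal:
$$n^2 - \kappa^2 = 1 + \text{Re}\{\chi\} \quad\text{and}\quad 2n\kappa = \text{Im}\{\chi\} \tag{2.26}$$
Solving the second relation for $\kappa$ gives
$$\kappa = \text{Im}\{\chi\}/2n \tag{2.27}$$
When this is substituted into the first equation of (2.26) we get a quadratic in $n^2$:
$$n^4 - \left(1 + \text{Re}\{\chi\}\right)n^2 - \frac{\left(\text{Im}\{\chi\}\right)^2}{4} = 0 \tag{2.28}$$
The positive7 real root to this equation is
$$n = \sqrt{\frac{\left(1 + \text{Re}\{\chi\}\right) + \sqrt{\left(1 + \text{Re}\{\chi\}\right)^2 + \left(\text{Im}\{\chi\}\right)^2}}{2}} \tag{2.29}$$
The imaginary part of the index is then obtained from (2.27).
When absorption is small we can neglect the imaginary part of $\chi(\omega)$, and (2.29) reduces to
$$n(\omega) = \sqrt{1 + \chi(\omega)} \quad\text{(negligible absorption)} \tag{2.30}$$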
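Equations (2.29) and (2.27) translate directly into code. The following is a minimal sketch; the function name is an assumption, and it presumes the resulting n is nonzero so that the division in (2.27) is defined.

```python
import math

def index_from_susceptibility(chi):
    """Return (n, kappa) for a complex susceptibility chi using (2.29) and (2.27)."""
    re1 = 1 + chi.real
    n = math.sqrt((re1 + math.sqrt(re1**2 + chi.imag**2)) / 2)   # (2.29)
    kappa = chi.imag / (2 * n)                                    # (2.27), assumes n > 0
    return n, kappa
```

As a consistency check, squaring `complex(n, kappa)` recovers `1 + chi`, and a purely real `chi = 1.25` gives `n = 1.5` with `kappa = 0`, in agreement with (2.30).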
Recall that polarization has units of dipoles per volume. Each dipole has strength $q_e\mathbf{r}_e$, where $\mathbf{r}_e$ is a microscopic displacement of the electron from equilibrium. At the time of Lorentz, atoms were thought to be clouds of positive charge wherein point-like electrons sat at rest unless stimulated by an applied electric field. In our modern quantum-mechanical viewpoint, $\mathbf{r}_e$ corresponds to an average displacement of the electronic cloud, which surrounds the nucleus (see Fig. 2.4). The displacement $\mathbf{r}_e$ of the electron charge in an individual atom depends on the local strength of the applied electric field E at the position of the atom. Since the diameter of the electronic cloud is tiny compared to a wavelength of (visible) light, we make the approximation that the electric field is uniform across any individual atom.
The Lorentz model uses Newton's equation of motion to describe an electron displacement from equilibrium within an atom. In accordance with the classical laws of motion, the electron mass $m_e$ times its acceleration is equal to the sum of the forces on the electron:
$$m_e\ddot{\mathbf{r}}_e = q_e\mathbf{E} - m_e\gamma\dot{\mathbf{r}}_e - k_{\text{Hooke}}\mathbf{r}_e \tag{2.32}$$
The electric field pulls on the electron with force $q_e\mathbf{E}$.8 A drag force (or friction) $-m_e\gamma\dot{\mathbf{r}}_e$ opposes the electron motion and accounts for absorption of energy. Without this term, it is only possible to describe optical index at frequencies away from where absorption takes place. Finally, $-k_{\text{Hooke}}\mathbf{r}_e$ is a force accounting for the fact that the electron is bound to the nucleus. This restoring force can be thought of as an effective spring that pulls the displaced electron back towards equilibrium with a force proportional to the amount of displacement, so this term is essentially the familiar Hooke's law. With some rearranging, (2.32) can be written as
$$\ddot{\mathbf{r}}_e + \gamma\dot{\mathbf{r}}_e + \omega_0^2\mathbf{r}_e = \frac{q_e}{m_e}\mathbf{E} \tag{2.33}$$
where $\omega_0 \equiv \sqrt{k_{\text{Hooke}}/m_e}$ is the natural oscillation frequency (or resonant frequency) associated with the electron mass and the spring constant.
Figure 2.4 A distorted electronic cloud becomes a dipole. (Panels: Unperturbed; In an electric field.)
There is a subtle problem with our analysis, which we will continue to neglect,
but which should be mentioned. The field E in (2.32) is the net field, which is
influenced by the presence of all of the dipoles. The actual field that a dipole
feels, however, does not include its own field. That is, we should remove from E
the field produced by each dipole in its own vicinity. This significantly modifies
the result if the density of the material is sufficiently high. This effect is described
by the Clausius-Mossotti formula, which is treated in appendix 2.B.
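In accordance with our examination of a single sinusoidal wave, we insert (2.14) into (2.33) and obtain
$$\ddot{\mathbf{r}}_e + \gamma\dot{\mathbf{r}}_e + \omega_0^2\mathbf{r}_e = \frac{q_e}{m_e}\mathbf{E}_0 e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} \tag{2.34}$$
8 The electron also experiences a force due to the magnetic field of the light, $\mathbf{F} = q_e\mathbf{v}_e\times\mathbf{B}$, but this force is tiny for typical optical fields.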
$$\mathbf{r}_e = \left(\frac{q_e}{m_e}\right)\frac{\mathbf{E}_0 e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}}{\omega_0^2 - i\omega\gamma - \omega^2} \tag{2.35}$$
The electron position $\mathbf{r}_e$ oscillates (not surprisingly) with the same frequency $\omega$ as the driving electric field. This solution illustrates the convenience of complex notation. The imaginary part in the denominator implies that the electron oscillates with a phase different from the electric field oscillations; the damping term $\gamma$ (the imaginary part in the denominator) causes the two to be out of phase somewhat. The complex algebra in (2.35) accomplishes quite easily what would otherwise be cumbersome (i.e. working out a trigonometric phase).
We are now able to write the polarization in terms of the electric field. By substituting (2.35) into (2.31) and rearranging, we obtain
$$\mathbf{P} = \epsilon_0\left(\frac{\omega_p^2}{\omega_0^2 - i\omega\gamma - \omega^2}\right)\mathbf{E}_0 e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} \tag{2.36}$$
where the plasma frequency $\omega_p$ has been introduced:9
$$\omega_p \equiv \sqrt{\frac{N q_e^2}{\epsilon_0 m_e}} \tag{2.37}$$
Comparing (2.36) with (2.16), we identify the susceptibility as
$$\chi(\omega) = \frac{\omega_p^2}{\omega_0^2 - i\omega\gamma - \omega^2} \tag{2.38}$$
Figure 2.5 Real and imaginary parts of the index for a single Lorentz oscillator dielectric with $\omega_p = 10\gamma$ and $\omega_0 = 100\gamma$.
The index of refraction is then found by substituting the susceptibility (2.38) into (2.18). The real and imaginary parts of the index are solved by equating separately the real and imaginary parts of (2.18), namely
$$(n + i\kappa)^2 = 1 + \chi(\omega) = 1 + \frac{\omega_p^2}{\omega_0^2 - i\omega\gamma - \omega^2} \tag{2.39}$$
Graphs of n and $\kappa$ are given in Figs. 2.5 and 2.6 for various parameters.
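The curves for a single Lorentz oscillator can be generated with a few lines of code. This sketch assumes frequencies are measured in units of $\gamma$ and takes the principal complex square root of (2.39), which yields $n \geq 0$ and, for an absorbing medium, $\kappa \geq 0$; the function name is an assumption introduced here.

```python
import cmath

def lorentz_index(omega, omega_p, omega_0, gamma):
    """Return (n, kappa) for a single Lorentz oscillator, from (2.38) and (2.39)."""
    chi = omega_p**2 / (omega_0**2 - 1j * omega * gamma - omega**2)   # (2.38)
    root = cmath.sqrt(1 + chi)                                         # (n + i kappa), (2.39)
    return root.real, root.imag
```

With $\omega_p = 10\gamma$ and $\omega_0 = 100\gamma$, the susceptibility on resonance is purely imaginary, so `lorentz_index(100.0, 10.0, 100.0, 1.0)` gives roughly n = 1.10 and kappa = 0.46, whereas far below resonance the absorption is negligible and n is close to 1.005.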
Most materials actually have more than one species of active electron, and different active electrons behave differently. The generalization of (2.39) in this case is
$$(n + i\kappa)^2 = 1 + \chi(\omega) = 1 + \sum_j \frac{f_j\,\omega_{p\,j}^2}{\omega_{0\,j}^2 - i\omega\gamma_j - \omega^2} \tag{2.40}$$
Figure 2.6 Real and imaginary parts of the index for a single Lorentz oscillator dielectric with $\omega_p = 10\gamma$ and $\omega_0 = 20\gamma$.
9 In a plasma, charges move freely so that both the Hooke restoring force and the dragging term can be neglected (i.e. $\omega_0 \cong 0$, $\gamma \cong 0$). For a plasma, $\omega_p$ is the dominant parameter.
where $f_j$ is the aptly named oscillator strength for the $j$th species of active electron, inserted into the model without justification to make results better agree with observation. Each species also has its own plasma frequency $\omega_{p\,j}$, natural frequency $\omega_{0\,j}$, and damping coefficient $\gamma_j$. For frequency ranges where $\gamma_j$ and $\kappa$ can be ignored (i.e. away from resonances $\omega_{0\,j}$), it is common to write Lorentz's refractive-index formula (2.40) in terms of $\lambda_{\text{vac}} = 2\pi c/\omega$, in which case it is known as the Sellmeier equation. (See P2.2.)
Lorentz introduced this model well before the development of quantum mechanics. Even though the model pays no attention to quantum physics, it works surprisingly well for describing frequency-dependent optical indices and absorption of light. As it turns out, the Schrödinger equation applied to two levels in an atom reduces in mathematical form to the Lorentz model in the limit of low-intensity light. Quantum mechanics also explains the oscillator strength, which before the development of quantum mechanics had to be inserted ad hoc to make the model agree with experiments. The friction term $\gamma$ turns out not to be associated with something internal to atoms but rather with collisions between atoms, which on average give rise to the same behavior.
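We will include the current density $\mathbf{J}_{\text{free}}$ while setting the medium polarization P to zero. The wave equation is
$$\nabla^2\mathbf{E} - \epsilon_0\mu_0\frac{\partial^2\mathbf{E}}{\partial t^2} = \mu_0\frac{\partial\mathbf{J}_{\text{free}}}{\partial t} \tag{2.42}$$
10 G. Burns, Solid State Physics, Sect. 9-5 (Orlando: Academic Press, 1985).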
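$$m_e\ddot{\mathbf{r}}_e = q_e\mathbf{E} - m_e\gamma\dot{\mathbf{r}}_e \tag{2.44}$$
$$\mathbf{v}_e \equiv \dot{\mathbf{r}}_e = \left(\frac{q_e}{m_e}\right)\frac{\mathbf{E}_0 e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}}{\gamma - i\omega} \tag{2.45}$$
where again we assume that the electron oscillation excursions described by $\mathbf{r}_e$ are small compared to the wavelength so that r can be treated as a constant in (2.44). The current density (2.43) in terms of the electric field is then
$$\mathbf{J}_{\text{free}} = \left(\frac{N q_e^2}{m_e}\right)\frac{\mathbf{E}_0 e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}}{\gamma - i\omega} \tag{2.46}$$
We substitute this together with the electric field into the wave equation (2.42) and get
$$-k^2\mathbf{E}_0 e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} + \frac{\omega^2}{c^2}\mathbf{E}_0 e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} = -i\omega\mu_0\left(\frac{N q_e^2}{m_e}\right)\frac{\mathbf{E}_0 e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}}{\gamma - i\omega} \tag{2.47}$$
This simplifies down to the dispersion relation
$$k^2 = \frac{\omega^2}{c^2}\left(1 - \frac{\omega_p^2}{i\gamma\omega + \omega^2}\right) \tag{2.48}$$
Figure 2.8 The electrons in a conductor can easily move in response to the applied field.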
index may be extracted from (2.48).
Note that in the low-frequency limit (i.e. $\omega \ll \gamma$), the current density (2.46) reduces to Ohm's law $\mathbf{J} = \sigma\mathbf{E}$, where $\sigma = N q_e^2/(m_e\gamma)$ is the DC conductivity. In the high-frequency limit (i.e. $\omega \gg \gamma$), the behavior changes over to that of a free plasma, where collisions, which are responsible for resistance, become less important since the excursions of the electrons during oscillations become very small. This formula captures the general behavior of metals, but actual values of the index vary from this somewhat (see P2.6).
In either the conductor or dielectric model, the damping term removes energy from electron oscillations. The damping term gives rise to an imaginary part of the index, which causes an exponential attenuation of the plane wave as it propagates.
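The two limits of (2.46) and the behavior of (2.48) can be explored numerically. In this sketch the function names are assumptions, and the frequencies passed in the examples are arbitrary dimensionless values chosen for illustration (with c set to 1 in the second function's usage).

```python
def drude_conductivity_ratio(omega, gamma):
    """Ratio J / (sigma E) implied by (2.46), with sigma = N q_e^2 / (m_e gamma)."""
    return gamma / (gamma - 1j * omega)

def conductor_k_squared(omega, omega_p, gamma, c=2.9979e8):
    """k^2 from the dispersion relation (2.48)."""
    return (omega / c) ** 2 * (1 - omega_p**2 / (1j * gamma * omega + omega**2))
```

For $\omega \ll \gamma$ the ratio approaches 1, which is Ohm's law, while for $\omega \gg \gamma$ it becomes small and nearly imaginary. With negligible damping, `conductor_k_squared(2.0, 1.0, 0.0, c=1.0)` is positive (a propagating wave above the plasma frequency), whereas `conductor_k_squared(0.5, 1.0, 0.0, c=1.0)` is negative, so k is imaginary and the wave does not propagate.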
John Henry Poynting (1852-1914, English) was the youngest son of a Unitarian minister who operated a school near Manchester, England, where John received his childhood education. He later attended Owens College in Manchester and then went on to Cambridge University where he distinguished himself in mathematics and worked under James Maxwell in the Cavendish Laboratory. Poynting joined the faculty of the University of Birmingham (then called Mason Science College) where he was a professor of physics from 1880 until his death. Besides developing his famous theorem on the conservation of energy in electromagnetic fields, he performed innovative measurements of Newton's gravitational constant and discovered that the Sun's radiation draws in small particles towards it, the Poynting-Robertson effect. Poynting was the principal author of a multi-volume undergraduate physics textbook, which was in wide use until the 1930s. (Wikipedia)

We require just two of Maxwell's Equations: (1.3) and (1.4). We take the dot product of $\mathbf{B}/\mu_0$ with the first equation and the dot product of E with the second equation. Then by subtracting the second equation from the first we obtain
$$\frac{\mathbf{B}}{\mu_0}\cdot\left(\nabla\times\mathbf{E}\right) - \mathbf{E}\cdot\left(\nabla\times\frac{\mathbf{B}}{\mu_0}\right) + \epsilon_0\mathbf{E}\cdot\frac{\partial\mathbf{E}}{\partial t} + \frac{\mathbf{B}}{\mu_0}\cdot\frac{\partial\mathbf{B}}{\partial t} = -\mathbf{E}\cdot\mathbf{J} \tag{2.49}$$
The first two terms can be simplified using the vector identity P0.8. The next two terms are the time derivatives of $\epsilon_0 E^2/2$ and $B^2/2\mu_0$, respectively. The relation (2.49) then becomes
$$\nabla\cdot\left(\mathbf{E}\times\frac{\mathbf{B}}{\mu_0}\right) + \frac{\partial}{\partial t}\left(\frac{\epsilon_0 E^2}{2} + \frac{B^2}{2\mu_0}\right) = -\mathbf{E}\cdot\mathbf{J} \tag{2.50}$$
This is Poynting's theorem. Each term in this equation has units of power per volume.
It is conventional to write Poynting's theorem as follows:11
$$\nabla\cdot\mathbf{S} + \frac{\partial}{\partial t}\left(u_{\text{field}} + u_{\text{medium}}\right) = 0 \tag{2.51}$$
where
$$\mathbf{S} \equiv \mathbf{E}\times\frac{\mathbf{B}}{\mu_0} \tag{2.52}$$
is called the Poynting vector, which has units of power per area, called irradiance. The expression
$$u_{\text{field}} \equiv \frac{\epsilon_0 E^2}{2} + \frac{B^2}{2\mu_0} \tag{2.53}$$
is the energy per volume stored in the electric and magnetic fields. Derivations of the electric field energy density and the magnetic field energy density are given in Appendices 2.C and 2.D. (See (2.80) and (2.87).) The derivative
$$\frac{\partial u_{\text{medium}}}{\partial t} \equiv \mathbf{E}\cdot\mathbf{J} \tag{2.54}$$
11 See D. J. Griffiths, Introduction to Electrodynamics, 3rd ed., Sect. 8.1.2 (New Jersey: Prentice-Hall,
1999).
2.6 Poyntings Theorem 55
describes the power per volume delivered to the medium from the field. Equation
(2.54) is reminiscent of the familiar circuit power law, Power = Voltage ×
Current. Power is delivered when a charged particle traverses a distance while
experiencing a force. This happens when currents flow in the presence of electric
fields.
Poynting's theorem is essentially a statement of the conservation of energy,
where $\mathbf{S}$ describes the flow of energy. To appreciate this, consider Poynting's
theorem (2.51) integrated over a volume $V$ (enclosed by surface $S$). If we also
apply the divergence theorem (0.11) to the term involving $\mathbf{S}$ we obtain

$$ \oint_S \mathbf{S}\cdot\hat{\mathbf{n}}\,da = -\frac{\partial}{\partial t}\int_V \left(u_{\text{field}} + u_{\text{medium}}\right)dv \tag{2.55} $$
Notice that the volume integral over energy densities u field and u medium gives
the total energy stored in V , whether in the form of electromagnetic field energy
density or as energy density that has been given to the medium. The integration
of the Poynting vector over the surface gives the net Poynting vector flux directed
outward. Equation (2.55) indicates that the outward Poynting vector flux matches
the rate that total energy disappears from the interior of V . Conversely, if the
Poynting vector is directed inward (negative), then the net inward flux matches
the rate that energy increases within V . The vector S defines the flow of energy
through space. Its units of power per area are just what is needed to describe the
brightness of light impinging on a surface.
Example 2.3
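(a) Find the Poynting vector $\mathbf{S}$ and energy density $u_{\text{field}}$ for the plane wave field $\mathbf{E} = \hat{\mathbf{x}}E_0\cos(kz - \omega t)$ traveling in vacuum. (b) Check that $\mathbf{S}$ and $u_{\text{field}}$ satisfy Poynting's
theorem.

Solution: (a) The magnetic field that accompanies this electric field is

$$ \mathbf{B} = \frac{\hat{\mathbf{z}}k\times\hat{\mathbf{x}}E_0}{\omega}\cos(kz-\omega t) = \hat{\mathbf{y}}\,\frac{kE_0}{\omega}\cos(kz-\omega t) $$

The Poynting vector is

$$ \mathbf{S} = \mathbf{E}\times\frac{\mathbf{B}}{\mu_0} = \hat{\mathbf{x}}E_0\cos(kz-\omega t)\times\hat{\mathbf{y}}\,\frac{kE_0}{\omega\mu_0}\cos(kz-\omega t) = \hat{\mathbf{z}}\,c\epsilon_0E_0^2\cos^2(kz-\omega t) $$

and the energy density is

$$ u_{\text{field}} = \frac{\epsilon_0E^2}{2} + \frac{B^2}{2\mu_0} = \frac{\epsilon_0E_0^2}{2}\cos^2(kz-\omega t) + \frac{k^2E_0^2}{2\mu_0\omega^2}\cos^2(kz-\omega t) = \epsilon_0E_0^2\cos^2(kz-\omega t) $$

Notice that $S = cu$. The energy density traveling at speed $c$ gives rise to the power
per area passing a surface (perpendicular to $z$).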
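(b) We have

$$ \nabla\cdot\mathbf{S} = c\epsilon_0E_0^2\,\frac{\partial}{\partial z}\cos^2(kz-\omega t) = -2kc\epsilon_0E_0^2\cos(kz-\omega t)\sin(kz-\omega t) $$

whereas

$$ \frac{\partial u_{\text{field}}}{\partial t} = \epsilon_0E_0^2\,\frac{\partial}{\partial t}\cos^2(kz-\omega t) = 2\omega\epsilon_0E_0^2\cos(kz-\omega t)\sin(kz-\omega t) $$

Poynting's theorem (2.50) is satisfied since $\omega = kc$.
It is common to replace the rapidly oscillating function $\cos^2(kz-\omega t)$ with its time
average 1/2, but this would have inhibited our ability to take the above derivatives
needed in this specific problem.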
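
The check in part (b) can also be done numerically. The following is a minimal sketch, assuming SI units and hypothetical helper names (`plane_wave_s_and_u`, `poynting_residual`); it estimates $\partial S_z/\partial z + \partial u_{\text{field}}/\partial t$ with central differences for the vacuum plane wave of this example and compares it with the size of $\partial S_z/\partial z$ alone.

```python
import math

EPS0 = 8.8541878128e-12    # vacuum permittivity (F/m)
C = 299792458.0            # speed of light (m/s)

def plane_wave_s_and_u(z, t, e0, k):
    """S_z and u_field for E = x E0 cos(kz - wt) in vacuum (Example 2.3)."""
    omega = k * C
    cos_sq = math.cos(k * z - omega * t) ** 2
    s_z = C * EPS0 * e0**2 * cos_sq
    u = EPS0 * e0**2 * cos_sq
    return s_z, u

def poynting_residual(z, t, e0, k, rel_step=1e-6):
    """Central-difference estimate of dS_z/dz + du/dt, plus |dS_z/dz| for scale."""
    omega = k * C
    dz = rel_step / k          # step much smaller than a wavelength
    dt = rel_step / omega      # step much smaller than a period
    ds_dz = (plane_wave_s_and_u(z + dz, t, e0, k)[0]
             - plane_wave_s_and_u(z - dz, t, e0, k)[0]) / (2 * dz)
    du_dt = (plane_wave_s_and_u(z, t + dt, e0, k)[1]
             - plane_wave_s_and_u(z, t - dt, e0, k)[1]) / (2 * dt)
    return ds_dz + du_dt, abs(ds_dz)

k = 2 * math.pi / 633e-9                       # illustrative wavelength of 633 nm
residual, scale = poynting_residual(1e-7, 1e-16, 1.0, k)
print(residual / scale)                        # tiny compared with 1
```

The two derivatives cancel only because $\omega = kc$ is built into the helper; using any other relation between $\omega$ and $k$ leaves a residual comparable to the individual terms.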
$$ \mathbf{B}(\mathbf{r},t) = \frac{\mathbf{k}\times\mathbf{E}_0}{\omega}\,e^{i(\mathbf{k}\cdot\mathbf{r}-\omega t)} \tag{2.56} $$
When k is complex, B is out of phase with E, and this occurs when absorption
takes place. On the other hand, when there is no absorption, then k is real, and B
and E carry the same complex phase.
Before computing the Poynting vector (2.52), which involves multiplication,
we must remember our unspoken agreement that only the real parts of the fields
are relevant. We necessarily remove the imaginary parts before multiplying (see
(0.22)). To obtain the real parts of the fields, we add their respective complex
conjugates and divide the result by 2 (see (0.30)). The real field associated with
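the plane-wave electric field is

$$ \mathbf{E}(\mathbf{r},t) = \frac{1}{2}\left[\mathbf{E}_0 e^{i(\mathbf{k}\cdot\mathbf{r}-\omega t)} + \mathbf{E}_0^* e^{-i(\mathbf{k}^*\cdot\mathbf{r}-\omega t)}\right] \tag{2.57} $$

and the real field associated with (2.56) is

$$ \mathbf{B}(\mathbf{r},t) = \frac{1}{2}\left[\frac{\mathbf{k}\times\mathbf{E}_0}{\omega}\,e^{i(\mathbf{k}\cdot\mathbf{r}-\omega t)} + \frac{\mathbf{k}^*\times\mathbf{E}_0^*}{\omega}\,e^{-i(\mathbf{k}^*\cdot\mathbf{r}-\omega t)}\right] \tag{2.58} $$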
Now we are ready to calculate the Poynting vector. The algebra is a little messy
in general, so we restrict the analysis to the case of an isotropic medium for the
sake of simplicity.
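$$ \begin{aligned}
\mathbf{S} &\equiv \mathbf{E}\times\frac{\mathbf{B}}{\mu_0} \\
&= \frac{1}{2}\left[\mathbf{E}_0 e^{i(\mathbf{k}\cdot\mathbf{r}-\omega t)} + \mathbf{E}_0^* e^{-i(\mathbf{k}^*\cdot\mathbf{r}-\omega t)}\right]\times\frac{1}{2\mu_0}\left[\frac{\mathbf{k}\times\mathbf{E}_0}{\omega}\,e^{i(\mathbf{k}\cdot\mathbf{r}-\omega t)} + \frac{\mathbf{k}^*\times\mathbf{E}_0^*}{\omega}\,e^{-i(\mathbf{k}^*\cdot\mathbf{r}-\omega t)}\right] \\
&= \frac{1}{4\mu_0}\left[\frac{\mathbf{E}_0\times(\mathbf{k}\times\mathbf{E}_0)}{\omega}\,e^{2i(\mathbf{k}\cdot\mathbf{r}-\omega t)} + \frac{\mathbf{E}_0^*\times(\mathbf{k}\times\mathbf{E}_0)}{\omega}\,e^{i(\mathbf{k}-\mathbf{k}^*)\cdot\mathbf{r}} + \frac{\mathbf{E}_0\times(\mathbf{k}^*\times\mathbf{E}_0^*)}{\omega}\,e^{i(\mathbf{k}-\mathbf{k}^*)\cdot\mathbf{r}} + \frac{\mathbf{E}_0^*\times(\mathbf{k}^*\times\mathbf{E}_0^*)}{\omega}\,e^{-2i(\mathbf{k}^*\cdot\mathbf{r}-\omega t)}\right]
\end{aligned} \tag{2.59} $$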
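Very often, we are interested in the time-average of the Poynting vector, denoted
by $\langle\mathbf{S}\rangle_t$, since there are no electronics that can keep up with the rapid oscillation of
visible light (i.e. $> 10^{14}$ Hz). The first and last terms in (2.59) rapidly oscillate and
vanish under time averaging.
Additionally, we can use the BAC-CAB rule P0.3 to write $\mathbf{E}_0\times(\mathbf{k}^*\times\mathbf{E}_0^*) = \mathbf{k}^*(\mathbf{E}_0\cdot\mathbf{E}_0^*)$
and similarly $\mathbf{E}_0^*\times(\mathbf{k}\times\mathbf{E}_0) = \mathbf{k}(\mathbf{E}_0\cdot\mathbf{E}_0^*)$, where we have employed $\mathbf{k}\cdot\mathbf{E}_0 = 0$. The time-averaged Poynting vector then becomes

$$ \langle\mathbf{S}\rangle_t = \frac{\mathbf{k}+\mathbf{k}^*}{4\mu_0\omega}\,(\mathbf{E}_0\cdot\mathbf{E}_0^*)\,e^{i(\mathbf{k}-\mathbf{k}^*)\cdot\mathbf{r}} \quad \text{(isotropic medium)} \tag{2.60} $$

Writing $\mathbf{k} = (n + i\kappa)(\omega/c)\,\hat{\mathbf{u}}$, this is

$$ \langle\mathbf{S}\rangle_t = \hat{\mathbf{u}}\,\frac{n\epsilon_0 c}{2}\,(\mathbf{E}_0\cdot\mathbf{E}_0^*)\,e^{-2\frac{\kappa\omega}{c}\hat{\mathbf{u}}\cdot\mathbf{r}} \quad \text{(isotropic medium)} \tag{2.61} $$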
This expression shows that (in an isotropic medium) the flow of energy is in the
direction of u (or k). This agrees with our intuition that energy flows in the direction
that the wave propagates.
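
To see (2.61) emerge from the real fields, the sketch below averages $E_x B_y/\mu_0$ over one optical period for a wave traveling along $z$ with a real amplitude $E_0$ along $x$, and compares the result with the closed-form expression. The function names, the sample count, and the illustrative values of $n$, $\kappa$, and wavelength are assumptions made for this example only.

```python
import cmath
import math

EPS0 = 8.8541878128e-12    # vacuum permittivity (F/m)
MU0 = 1.25663706212e-6     # vacuum permeability (H/m)
C = 1.0 / math.sqrt(EPS0 * MU0)

def averaged_poynting(z, e0, n, kappa, omega, samples=400):
    """Time-average S_z = E_x B_y / mu0 over one period using the real fields."""
    k = (n + 1j * kappa) * omega / C          # complex wave number
    period = 2 * math.pi / omega
    total = 0.0
    for j in range(samples):
        t = j * period / samples
        phase = cmath.exp(1j * (k * z - omega * t))
        e_x = (e0 * phase).real               # real part of E, as in (2.57)
        b_y = (k / omega * e0 * phase).real   # real part of B, as in (2.58)
        total += e_x * b_y / MU0
    return total / samples

def formula_2_61(z, e0, n, kappa, omega):
    """Magnitude of <S>_t from (2.61) for real E0 along x and u along z."""
    return 0.5 * n * EPS0 * C * e0**2 * math.exp(-2 * kappa * omega * z / C)

omega = 2 * math.pi * C / 633e-9              # illustrative optical frequency
print(averaged_poynting(2e-7, 1.0, 1.5, 0.1, omega),
      formula_2_61(2e-7, 1.0, 1.5, 0.1, omega))
```

Taking the real parts before multiplying is essential here; multiplying the complex fields first and then taking the real part would give a different (incorrect) average.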
so that they correspond to the local electric field. Equation (2.62) agrees with $\mathbf{S}$ in
Example 2.3 where $n = 1$ and $\mathbf{E}_0 = \hat{\mathbf{x}}E_0$ is real; the cosine squared averages to 1/2.
comparing sources to burning candles with prescribed dimensions made from
whale tallow. Today, the procedure for measuring luminous power is essentially
to measure the radiometric power spectrum $I(\lambda)$, and then calculate

$$ \text{Lumens} = \int R(\lambda)\,I(\lambda)\,d\lambda \tag{2.63} $$

where $R(\lambda)$ is the photometric response function plotted in Fig. 2.9.
Photometric units are often used to characterize room lighting as well as
photographic, projection, and display equipment. For example, both a 60 W
incandescent bulb and a 13 W compact fluorescent bulb emit about 800 lumens
of light, while their radiometric output is much closer to their power rating.
The difference in photometric output versus radiometric output reflects the fact
that most of the energy radiated from an incandescent bulb is emitted in the
infrared, where our eyes are not sensitive. Table 2.2 gives the names of the various
photometric quantities, which parallel the entries for radiometric quantities in
Table 2.1.

Table 2.2 Photometric quantities and units.
Luminous Power (of a source): Visible light energy emitted per time from a source. Units: lumens (lm); 1 lm = (1/683) W at 555 nm.
Luminous Solid-Angle Intensity (of a source): Luminous power per steradian emitted from a point-like source. Units: candelas (cd); cd = lm/sr.
Luminance (of a source): Luminous solid-angle intensity per projected area of an extended source. (The projected area foreshortens by $\cos\theta$, where $\theta$ is the observation angle relative to the surface normal.) Units: cd/cm² = stilb, cd/m² = nit; 1 lambert = 3183 nit, 1 footlambert = 3.4 nit.
Luminous Emittance or Exitance (from a source): Luminous power emitted per unit surface area of an extended source. Units: lm/cm².
Illuminance (to a receiver): Incident luminous power delivered per area to a receiver. Units: lm/m² = lux, lm/cm² = phot, lm/ft² = footcandle.

Color
In addition to brightness, our eye-brain measures some basic information about
the spectral content of light. We interpret this spectral information as the color
of the light. Color information arises from the cone receptors in the eye, which
come in three varieties, each sensitive to light in a different wavelength band.
Figure 2.10 plots the normalized sensitivity curves13 for short (S), medium (M),
and long (L) wavelength cones. When the three types of cones are stimulated
equally the light appears white, and when they are stimulated differently the light
appears colored.
Light with different spectral distributions can produce the exact same color
sensation, so our perception of color only gives very general information about the
spectral content of light. For example, light coming from a computer display has a
different spectral composition than the light incident on the camera that recorded
the image, but both can produce the same color sensation. This ambiguity can
lead to a potentially dangerous situation in the lab. For example, lasers from
670 nm to 800 nm all appear the same color. (They all stimulate the L and M
cones in essentially the same ratio.) However, your eye's response falls off quickly
in the near-infrared, so a dangerous 800 nm high-intensity beam can appear
about the same brightness as an innocuous 670 nm laser pointer.

Figure 2.10 Normalized cone sensitivity functions for the S, M, and L cones, plotted versus wavelength (nm).

Because we have three types of cones, our perception of color can be well-
represented using a three-dimensional vector space referred to as a color space.14
13 A. Stockman, L. Sharpe, and C. Fach, "The spectral sensitivity of the human short-wavelength
cones," Vision Research, 39, 2901-2927 (1999); A. Stockman and L. Sharpe, "Spectral sensitivities
of the middle- and long-wavelength sensitive cones derived from measurements in observers of
known genotype," Vision Research, 40, 1711-1737 (2000).
14 The methods we use to represent color are very much tied to human physiology. Other species
have photoreceptors that sense different wavelength ranges or do not sense color at all. For instance,
Papilio butterflies have six types of cone-like photoreceptors and certain types of shrimp have
twelve. Reptiles have four-color vision for visible light, and pit vipers have an additional set of "eyes"
that look like pits on the front of their face. These pits are essentially pinhole cameras sensitive
to infrared light, and give these reptiles crude night-vision capabilities. On the other hand, some
insects can perceive markings on flowers that are only visible in the ultraviolet. Each of these
species would find the color spaces we use to record and display colors to be very inaccurate.

A color space can be defined in terms of three "basis" light sources referred to
as primaries. Different colors (i.e. the "vectors" in the color space) are created
by mixing the primary light in different ratios. If we had three primaries that
separately stimulated each type of cone (S, M, and L), we could recreate any color
sensation exactly by mixing those primaries. However, by inspecting Fig. 2.10 you
can see that this ideal set of primaries cannot be found because of the overlap
between the S, M, and L curves. Any light that will stimulate one type of cone will
also stimulate another. This overlap makes it impossible to display every possible
color with three primaries. However, it is possible to quantify all colors with three
primaries, even if the primaries can't display the colors; we'll see how shortly.
The range of colors that can be displayed with a given set of primaries is referred
to as the gamut of that color space. As your experience with computers suggests,
we are able to engineer devices with a very broad gamut, but there are always colors
visible in nature that cannot be recreated by a three-primary device.

The CIE1931 RGB15 color space is a very commonly encountered color space
based on a series of experiments performed by W. David Wright and John Guild
in 1931. In these experiments, test subjects were asked to match the color of a
monochromatic test light source by mixing monochromatic primaries at 700 nm
(R), 546.1 nm (G), and 435.8 nm (B). The relative amount of R, G, and B light
required to match the color at each test wavelength was recorded as the color
matching functions $\bar{r}(\lambda)$, $\bar{g}(\lambda)$, and $\bar{b}(\lambda)$, shown in Fig. 2.11. Note that the color
matching functions sometimes go negative. This is most noticeable for $\bar{r}(\lambda)$, but
all three have negative values. These negative values indicate that the test color
was outside the gamut of the primaries (i.e. the color of the test source could not
be matched by adding primaries). In these cases, the observers matched the test
light as closely as possible by mixing primaries, and then they added some of the
primary light to the test light until the colors matched. The amount of primary
light that had to be added to the test light was recorded as a negative number. In
this way they were able to quantify the color, even though it couldn't be displayed
using their primaries.

Figure 2.11 The CIE 1931 RGB color-matching functions, plotted versus test wavelength (nm).

15 CIE is an abbreviation for the French "Commission Internationale de l'Éclairage," an interna-
tional commission that defines lighting and color standards. This standard was adopted in 1931,
and hence the name. Note that CIE1931 is not the RGB space most commonly encountered on a
computer to define colors on webpages and in photos; that space is referred to as sRGB and uses a
different set of primaries.

To calculate the color components of an arbitrary light source with a radio-
metric spectrum $I(\lambda)$, we integrate the spectrum against the color matching
functions:

$$ R = \int I(\lambda)\,\bar{r}\,d\lambda \qquad G = \int I(\lambda)\,\bar{g}\,d\lambda \qquad B = \int I(\lambda)\,\bar{b}\,d\lambda \tag{2.64} $$
The triplet of numbers (R, G, B) then uniquely defines the color of the light source.
If R, G, or B turn out to be negative for a given I (), then that color of light
falls outside the gamut of these particular primaries. However, the negative
coordinates still provide a valid abstract representation of that color.
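
In practice the integrals in (2.64) are evaluated from tabulated samples. The sketch below shows one way to do this with the trapezoid rule, assuming the spectrum and the three color matching functions are sampled on the same wavelength grid. The function names are hypothetical, and the short tables at the end are placeholders chosen so the integrals are easy to verify by hand; they are not real CIE data.

```python
def trapezoid(x, y):
    """Trapezoid-rule integral of samples y over the grid x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i])
               for i in range(len(x) - 1))

def rgb_coordinates(wavelengths, spectrum, r_bar, g_bar, b_bar):
    """Approximate (R, G, B) of (2.64) from samples on a shared wavelength grid."""
    def weighted(cmf):
        product = [s_val * c_val for s_val, c_val in zip(spectrum, cmf)]
        return trapezoid(wavelengths, product)
    return weighted(r_bar), weighted(g_bar), weighted(b_bar)

# Placeholder tables only (NOT real color matching functions)
wavelengths = [400.0, 500.0, 600.0, 700.0]   # nm
spectrum = [1.0, 1.0, 1.0, 1.0]              # flat I(lambda)
r_bar = [0.0, 0.0, 1.0, 1.0]
g_bar = [0.0, 1.0, 1.0, 0.0]
b_bar = [1.0, 1.0, 0.0, 0.0]
print(rgb_coordinates(wavelengths, spectrum, r_bar, g_bar, b_bar))
```

With real tabulated color matching functions, a negative value returned for any coordinate signals a color outside the gamut of the primaries, exactly as described above.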
The RGB color space is an additive color model, where light emitting pri-
maries are added together to produce color and the absence of light gives black.
Subtractive color models produce color starting with a white reflective substrate
(i.e. something that reflects all frequencies of visible light equally like a piece
of paper or canvas) and then placing absorbing pigments over the substrate to
remove portions of the reflected spectrum.
Some schemes for displaying colors employ more than three basis vectors. For
example, color printers typically use the subtractive cyan, magenta, yellow, and
black (CMYK) color space, and some display manufacturers add a fourth additive
primary, such as yellow, to the typical set of red, green, and blue primaries. The
extra basis vector increases the range of colors that can be displayed by these
systems (i.e. it increases the gamut). However, the fourth basis vector makes
the color space overdetermined and only helps in displaying colors; we can
abstractly represent all colors using just three coordinates in an appropriately
chosen basis.
Example 2.4
The CIE1931 XYZ color space is derived from the CIE1931 RGB space by the trans-
formation

$$ \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \frac{1}{0.17697}\begin{bmatrix} 0.49 & 0.31 & 0.20 \\ 0.17697 & 0.81240 & 0.01063 \\ 0.00 & 0.01 & 0.99 \end{bmatrix}\begin{bmatrix} R \\ G \\ B \end{bmatrix} \tag{2.65} $$

where X, Y, and Z are the color coordinates in the new basis. The matrix elements
in (2.65) were carefully chosen to give this color space some desirable properties:
the new coordinates (X, Y, or Z) are always positive; the Y coordinate gives the
photometric brightness of the light while the X and Z coordinates describe the
color part (i.e. the chromaticity) of the light; and the coordinate (1/3, 1/3, 1/3) gives
the color white.
The XYZ coordinates do not represent new primaries, but rather linear combina-
tions of the original primaries. Find the representation in the CIE1931 RGB basis
for each of the basis vectors in the XYZ space.

Solution: The RGB components of each XYZ basis vector are the columns of the inverse of the matrix in (2.65).
Thus, the X basis vector RGB components are (0.4185, −0.09117, +0.0009209). Sim-
ilar calculations for the Y and Z basis vectors give (−0.1587, 0.2524, −0.002550) and
(−0.08283, 0.01571, 0.1786), respectively. Because the XYZ basis vectors contain
negative amounts of the physical RGB primaries, the XYZ basis is not physically
realizable. However, it is extensively used because it can abstractly represent all
colors using a triplet of positive numbers.
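
The numbers quoted in this example can be reproduced by inverting the matrix in (2.65). The sketch below does so with a small cofactor-based inverse so that it needs no external libraries; the helper name `inverse_3x3` and the variable names are choices made for this illustration.

```python
def inverse_3x3(m):
    """Invert a 3x3 matrix (given as a list of rows) using cofactors."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adjugate = [
        [e * i - f * h, c * h - b * i, b * f - c * e],
        [f * g - d * i, a * i - c * g, c * d - a * f],
        [d * h - e * g, b * g - a * h, a * e - b * d],
    ]
    return [[entry / det for entry in row] for row in adjugate]

SCALE = 1 / 0.17697
XYZ_FROM_RGB = [[SCALE * v for v in row] for row in [
    [0.49, 0.31, 0.20],
    [0.17697, 0.81240, 0.01063],
    [0.00, 0.01, 0.99],
]]
RGB_FROM_XYZ = inverse_3x3(XYZ_FROM_RGB)

# Column j of RGB_FROM_XYZ holds the RGB components of the j-th XYZ basis vector.
for name, j in (("X", 0), ("Y", 1), ("Z", 2)):
    column = [RGB_FROM_XYZ[row][j] for row in range(3)]
    print(name, ["%.4g" % v for v in column])
```

Each printed column contains at least one negative entry, which is the numerical form of the statement that the XYZ basis vectors are not physically realizable as mixtures of the RGB primaries.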
senting their influence with the macroscopic field. However, if they are symmetrically distributed
the result is the same. See J. D. Jackson, Classical Electrodynamics, 3rd ed., Sect. 4.5 (New York: John
Wiley, 1999).
$$ n^2 - 1 = \frac{N\alpha/\epsilon_0}{1 - N\alpha/3\epsilon_0} \tag{2.72} $$

In this case, we may invert the relation to write $N\alpha/\epsilon_0$ in terms of the index:17

$$ \frac{N\alpha}{\epsilon_0} = 3\,\frac{n^2-1}{n^2+2} \tag{2.73} $$
Example 2.5
Xenon vapor at STP (density $4.46\times10^{-5}$ mol/cm³) has index $n = 1.000702$ measured
at wavelength 589 nm. Use (a) the Clausius-Mossotti relation (2.71) and (b) the
uncorrected formula (i.e. numerator only) to predict the index for liquid xenon
with density $2.00\times10^{-2}$ mol/cm³. Compare with the measured value of $n = 1.332$.18
Solution: At the low density, we may safely neglect the correction in the denom-
inator of (2.72) and simply write $N_{\text{atm}}\alpha/\epsilon_0 = 1.000702^2 - 1 = 1.404\times10^{-3}$. The
liquid density $N_{\text{liquid}}$ is $2.00\times10^{-2}/4.46\times10^{-5} = 449$ times greater. Therefore,
$N_{\text{liquid}}\alpha/\epsilon_0 = 449\times1.404\times10^{-3} = 0.630$. (a) According to Clausius-Mossotti (2.72),
the index is

$$ n = \sqrt{1 + \frac{0.630}{1 - 0.630/3}} = 1.341 $$

(b) On the other hand, without the correction in the denominator, we get

$$ n = \sqrt{1 + 0.630} = 1.277 $$
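
The arithmetic of this example is easy to script. The sketch below follows the same steps (dilute-gas estimate of $N\alpha/\epsilon_0$, scaling by the density ratio, then (2.72) with and without the denominator); the function names are hypothetical. Because it carries the unrounded density ratio, the last digit may differ slightly from the rounded values quoted above.

```python
import math

def index_clausius_mossotti(n_alpha_over_eps0):
    """Index from (2.72), including the local-field correction."""
    x = n_alpha_over_eps0
    return math.sqrt(1 + x / (1 - x / 3))

def index_uncorrected(n_alpha_over_eps0):
    """Index from the numerator of (2.72) only."""
    return math.sqrt(1 + n_alpha_over_eps0)

n_vapor = 1.000702
density_vapor = 4.46e-5      # mol/cm^3
density_liquid = 2.00e-2     # mol/cm^3

x_vapor = n_vapor**2 - 1                               # dilute gas: denominator ~ 1
x_liquid = x_vapor * density_liquid / density_vapor    # scales with number density
print(index_clausius_mossotti(x_liquid), index_uncorrected(x_liquid))
```

Comparing the two printed values with the measured liquid index 1.332 shows how much of the discrepancy the denominator in (2.72) removes.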
$$ \mathbf{E} = \frac{q_e}{4\pi\epsilon_0}\,\frac{\mathbf{r}-\hat{\mathbf{z}}d/2}{\left|\mathbf{r}-\hat{\mathbf{z}}d/2\right|^3} - \frac{q_e}{4\pi\epsilon_0}\,\frac{\mathbf{r}+\hat{\mathbf{z}}d/2}{\left|\mathbf{r}+\hat{\mathbf{z}}d/2\right|^3} $$

Figure 2.12 The field lines surrounding a dipole.

17 This form of Clausius-Mossotti relation, in terms of the refractive index, was renamed the
Lorentz-Lorenz formula, but probably undeservedly so, since it is essentially the same formula.
18 D. H. Garside, H. V. Molgaard, and B. L. Smith, "Refractive Index and Lorentz-Lorenz function
of Xenon Liquid and Vapour," J. Phys. B: At. Mol. Phys. 1, 449-457 (1968).
We wish to compute the average field within a cubic volume V = L 3 that symmet-
rically encompasses the dipole.19 We take the volume dimension L to be large
compared to the dipole dimension d . Integrating the field over this volume yields
The terms multiplying x and y vanish since they involve odd functions integrated
over even limits on either x or y, respectively. On the remaining term, the integra-
tion on z has been executed. Before integrating the remaining expression over x
and y, we make the following approximation based on L >> d :
$$ \frac{1}{\sqrt{x^2+y^2+(L\mp d)^2/4}} \cong \frac{1}{\sqrt{x^2+y^2+L^2/4}\,\sqrt{1\mp\dfrac{Ld/2}{x^2+y^2+L^2/4}}} \cong \frac{1}{\sqrt{x^2+y^2+L^2/4}}\left[1\pm\frac{Ld/4}{x^2+y^2+L^2/4}\right] $$
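which will make integration considerably easier.20 Then integration over the y
dimension brings us to21

$$ \int \mathbf{E}\,dv = -\hat{\mathbf{z}}\,\frac{q_e d}{4\pi\epsilon_0}\int_{-L/2}^{L/2}dx\int_{-L/2}^{L/2}\frac{L\,dy}{\left(x^2+y^2+L^2/4\right)^{3/2}} = -\hat{\mathbf{z}}\,\frac{q_e d}{4\pi\epsilon_0}\int_{-L/2}^{L/2}\frac{L^2\,dx}{\left(x^2+L^2/4\right)\sqrt{x^2+L^2/2}} $$

The final integral is the same as twice the integral from 0 to $L/2$. Then, with $x > 0$,
we can employ the variable change $s = x^2 + L^2/4 \Rightarrow 2\,dx = ds/\sqrt{s - L^2/4}$ and obtain

$$ \int \mathbf{E}\,dv = -\hat{\mathbf{z}}\,\frac{q_e d}{4\pi\epsilon_0}\int_{L^2/4}^{L^2/2}\frac{L^2\,ds}{s\sqrt{s^2 - L^4/16}} = -\hat{\mathbf{z}}\,\frac{q_e d}{4\pi\epsilon_0}\,\frac{4\pi}{3} $$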
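
As a quick numerical cross-check of the factor $4\pi/3$, the sketch below integrates the remaining $x$ integrand with Simpson's rule for a cube of side $L = 1$ (the result is independent of $L$). The function names and the number of intervals are choices made for this illustration.

```python
import math

def simpson(f, a, b, intervals=2000):
    """Composite Simpson rule; `intervals` must be even."""
    h = (b - a) / intervals
    total = f(a) + f(b)
    for j in range(1, intervals):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return total * h / 3

def x_integrand(x, side=1.0):
    """L^2 / [(x^2 + L^2/4) sqrt(x^2 + L^2/2)], left after the y integration."""
    return side**2 / ((x**2 + side**2 / 4) * math.sqrt(x**2 + side**2 / 2))

value = simpson(x_integrand, -0.5, 0.5)
print(value, 4 * math.pi / 3)
```

The two printed numbers should agree to many digits, confirming the closed-form result of the variable change.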
19 Authors often obtain the same result using a spherical volume with the (usually unmentioned)
conceptual awkwardness that spheres cannot be closely packed to form a macroscopic medium
without introducing voids.
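20 One might be tempted to begin this calculation with the well-known dipole field

$$ \mathbf{E} = \frac{q_e}{4\pi\epsilon_0 r^3}\left[\frac{\mathbf{r}-\hat{\mathbf{z}}d/2}{\left[1-\hat{\mathbf{z}}\cdot\hat{\mathbf{r}}\,\frac{d}{r}+\frac{d^2}{4r^2}\right]^{3/2}} - \frac{\mathbf{r}+\hat{\mathbf{z}}d/2}{\left[1+\hat{\mathbf{z}}\cdot\hat{\mathbf{r}}\,\frac{d}{r}+\frac{d^2}{4r^2}\right]^{3/2}}\right] \cong \frac{q_e d}{4\pi\epsilon_0 r^3}\left[3\hat{\mathbf{r}}\,(\hat{\mathbf{z}}\cdot\hat{\mathbf{r}})-\hat{\mathbf{z}}\right] $$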
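$$ U = \frac{1}{2}\int_V \phi(\mathbf{r})\,\rho(\mathbf{r})\,dv \tag{2.75} $$

We consider the potential to arise from the charges themselves. The factor 1/2
is necessary to avoid double counting. To appreciate this factor consider just
two point charges: We only need to count the energy due to one charge in the
presence of the other's potential to obtain the energy required to bring the charges
together.
A substitution of (1.1) for $\rho(\mathbf{r})$ into (2.75) gives

$$ U = \frac{\epsilon_0}{2}\int_V \phi(\mathbf{r})\,\nabla\cdot\mathbf{E}(\mathbf{r})\,dv \tag{2.76} $$

$$ U = \frac{\epsilon_0}{2}\int_V \nabla\cdot\left[\phi(\mathbf{r})\,\mathbf{E}(\mathbf{r})\right]dv - \frac{\epsilon_0}{2}\int_V \mathbf{E}(\mathbf{r})\cdot\nabla\phi(\mathbf{r})\,dv \tag{2.77} $$

An application of the divergence theorem (0.11) on the first integral and a substi-
tution of (2.74) into the second integral yields

$$ U = \frac{\epsilon_0}{2}\oint_S \phi(\mathbf{r})\,\mathbf{E}(\mathbf{r})\cdot\hat{\mathbf{n}}\,da + \frac{\epsilon_0}{2}\int_V \mathbf{E}(\mathbf{r})\cdot\mathbf{E}(\mathbf{r})\,dv \tag{2.78} $$

where

$$ u_E(\mathbf{r}) \equiv \frac{\epsilon_0 E^2}{2} \tag{2.80} $$

is interpreted as the energy density of the electric field.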
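$$ U = \frac{1}{2}\int_V \mathbf{J}(\mathbf{r})\cdot\mathbf{A}(\mathbf{r})\,dv \tag{2.82} $$

As in (2.75), the factor 1/2 is necessary to avoid double counting the influence of
the currents on each other.
Under the assumption of steady currents (no variations in time), we may
substitute Ampere's law (1.21) into (2.82), which yields

$$ U = \frac{1}{2\mu_0}\int_V \left[\nabla\times\mathbf{B}(\mathbf{r})\right]\cdot\mathbf{A}(\mathbf{r})\,dv \tag{2.83} $$

Next we employ the vector identity P0.8 from which the previous expression
becomes

$$ U = \frac{1}{2\mu_0}\int_V \mathbf{B}(\mathbf{r})\cdot\left[\nabla\times\mathbf{A}(\mathbf{r})\right]dv - \frac{1}{2\mu_0}\int_V \nabla\cdot\left[\mathbf{A}(\mathbf{r})\times\mathbf{B}(\mathbf{r})\right]dv \tag{2.84} $$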
23 J. R. Reitz, F. J. Milford, and R. W. Christy, Foundations of Electromagnetic Theory 3rd ed., Sect.
Upon substituting (2.81) into the first equation and applying the divergence
theorem (0.11) on the second integral, this expression for total energy becomes

$$ U = \frac{1}{2\mu_0}\int_V \mathbf{B}(\mathbf{r})\cdot\mathbf{B}(\mathbf{r})\,dv - \frac{1}{2\mu_0}\oint_S \left[\mathbf{A}(\mathbf{r})\times\mathbf{B}(\mathbf{r})\right]\cdot\hat{\mathbf{n}}\,da \tag{2.85} $$

where

$$ u_B(\mathbf{r}) \equiv \frac{B^2}{2\mu_0} \tag{2.87} $$

is the energy density for a magnetic field.
Exercises
$$ n^2 = 1 + \frac{A\lambda_{\text{vac}}^2}{\lambda_{\text{vac}}^2 - \lambda_{0,\text{vac}}^2} $$

from (2.39) for a gas with negligible absorption (i.e. $\gamma \cong 0$, valid far
from resonance $\omega_0$), where $\lambda_{0,\text{vac}}$ corresponds to frequency $\omega_0$ and $A$ is
a constant. Many materials (e.g. glass, air) have strong resonances in
the ultraviolet. In such materials, do you expect the index of refraction
for blue light to be greater than that for red light? Make a sketch of $n$ as
a function of wavelength for visible light down to the ultraviolet (where
$\lambda_{0,\text{vac}}$ is located).
P2.3 In the Lorentz model, take $N = 10^{28}\ \text{m}^{-3}$ for the density of bound
electrons in an insulator, and a single transition at $\omega_0 = 6\times10^{15}$ rad/sec
(in the UV), and damping $\gamma = \omega_0/5$ (quite broad). Assume $E_0$ is $10^4$ V/m.
For three frequencies i) $\omega = \omega_0 - 2\gamma$, ii) $\omega = \omega_0$, and iii) $\omega = \omega_0 + 2\gamma$ find
the magnitude and phase (relative to the phase of $\mathbf{E}_0 e^{i(\mathbf{k}\cdot\mathbf{r}-\omega t)}$) of the
following quantities:
(a) the amplitude and phase of the charge displacement $\mathbf{r}_e$ (2.35).
(b) the amplitude and phase of the susceptibility $\chi(\omega)$. Does $\chi(\omega)$
depend on the strength of the E-field?
(c) $n$ and $\kappa$ at the three frequencies via (2.29) and (2.27).
Answer: i) n = 1.33, = 0.25, ii) n = 0, = 4.42, iii) n = 0.85, = 0.25.
(d) the three speeds of light in terms of c and how far light penetrates
into the material before only 1/e of the amplitude of E remains.
P2.6 Use (2.48) and expressions that follow (2.48) to calculate the index
of silver at $\lambda = 633$ nm. The density of free electrons in silver is $N = 5.86\times10^{28}\ \text{m}^{-3}$ and the DC conductivity is $\sigma = 6.62\times10^{7}\ \text{C}^2/(\text{J}\cdot\text{m}\cdot\text{s})$.25
Compare with the actual index given in P2.5.
Answer: $n + i\kappa = 0.02 + i4.5$
P2.9 In the case of a linearly-polarized plane wave, where the phase of each
vector component of $\mathbf{E}_0$ is the same, re-derive (2.62) directly from the
real field (2.21). For simplicity, you may ignore absorption (i.e.
$\kappa = 0$).
HINT: The time-average of $\cos^2(\mathbf{k}\cdot\mathbf{r} - \omega t + \phi)$ is 1/2.
P2.10 (a) Find the intensity (in W/cm²) produced by a short laser pulse with
duration $\Delta t = 2.5\times10^{-14}$ s and energy $E = 100$ mJ, focused in vacuum
to a round spot with radius $r = 5\ \mu$m.
24 Handbook of Optical Constants of Solids, Edited by E. D. Palik (Elsevier, 1997).
25 G. Burns, Solid State Physics, p. 194 (Orlando: Academic Press, 1985).
P2.11 (a) What is the intensity (in W/cm²) on the retina when looking directly
at the sun? Assume that the eye's pupil has a radius $r_{\text{pupil}} = 1$ mm.
Take the Sun's irradiance at the earth's surface to be 1.4 kW/m², and
neglect refractive index (i.e. set $n = 1$). HINT: The Earth-Sun distance
is $d_o = 1.5\times10^{8}$ km and the pupil-retina distance is $d_i = 22$ mm. The
radius of the Sun $r_{\text{Sun}} = 7.0\times10^{5}$ km is de-magnified on the retina
according to the ratio $d_i/d_o$.
(b) What is the intensity at the retina when looking directly into a
1 mW HeNe laser? Assume that the smallest radius of the laser beam
is $r_{\text{waist}} = 0.5$ mm positioned $d_o = 2$ m in front of the eye, and that the
entire beam enters the pupil. Compare with part (a).
P2.12 Show that the magnetic field of an intense laser with $\lambda = 1\ \mu$m becomes
important for a free electron oscillating in the field at intensities above
$10^{18}$ W/cm². This marks the transition to relativistic physics. Neverthe-
less, for convenience, use classical physics in making the estimate.
HINT: At lower intensities, the oscillating electric field dominates, so
the electron motion can be thought of as arising solely from the electric
field. Use this motion to calculate the magnetic force on the mov-
ing electron, and compare it to the electric force. The forces become
comparable at $10^{18}$ W/cm².
P2.13 The CIE1931 RGB color matching functions $\bar{r}(\lambda)$, $\bar{g}(\lambda)$, and $\bar{b}(\lambda)$ can be
transformed using (2.65) to obtain color matching functions for the
XYZ basis: $\bar{x}(\lambda)$, $\bar{y}(\lambda)$, and $\bar{z}(\lambda)$, plotted in Fig. 2.13. As with the RGB
color matching functions, the XYZ color matching functions can be
used to calculate the color coordinates in the XYZ basis for an arbitrary
spectrum:

$$ X = \int I(\lambda)\,\bar{x}\,d\lambda \qquad Y = \int I(\lambda)\,\bar{y}\,d\lambda \qquad Z = \int I(\lambda)\,\bar{z}\,d\lambda \tag{2.88} $$

The function $\bar{y}(\lambda)$ was chosen to be exactly the photometric response
curve (shown in Fig. 2.9), so that Y describes the photometric bright-
ness of the light.

Figure 2.13 Color matching functions for the CIE XYZ color space, plotted versus wavelength (nm).

(a) Obtain the XYZ color matching functions from www.cvrl.org and
calculate the luminous power for a light source with a radiometric
spectrum

$$ I(\lambda) = I_0\,e^{-\left[(\lambda - 500\ \text{nm})/(20\ \text{nm})\right]^2} $$

with $I_0 = 1$ W/nm. HINT: Remember that the response function $\bar{y}$ is
the photometric response function of the eye. The standard units from
the website give $\bar{y}$ with a peak value of one, so you'll need to use the
fact that 1 lm = 1/683 W at the peak of the response curve to get the
units right.
(b) Calculate the XYZ color coordinates for the light source in (a).
(c) Calculate the normalized x, y, and z components defined by

$$ x = \frac{X}{X+Y+Z} \qquad y = \frac{Y}{X+Y+Z} \qquad z = \frac{Z}{X+Y+Z} = 1 - x - y $$

Locate this color on the chromaticity diagram in Fig. 2.14. Describe
what color light with this spectrum would appear, and how it is possible
to represent it using just two coordinates (x and y) as on the diagram.
(HINT: You can display a color with bright primaries or dim primaries
without changing the color as long as the color of the primaries doesn't
change.)

Figure 2.14 A chromaticity diagram plotting the colors visible to the human eye versus x and y, as defined in P2.13. The colors of single-wavelength light lie along the dark line around the edge of the diagram, while the colors that can be displayed by standard computer and TV displays fall inside the white sRGB triangle. This image was created using sRGB encoding so colors outside the sRGB triangle can only be approximated in the chart. All display systems suffer from a limited gamut like this, so the only way to experience the vivid sensation of single-wavelength light is to view the scattered light from a laser (or a gas discharge with a grating) in person.

P2.14 LEDs used in home lighting typically have a power spectrum that looks
similar to this function:

$$ I(\lambda) = I_0\left[e^{-\left(\frac{\lambda - 460\ \text{nm}}{15\ \text{nm}}\right)^2} + 0.4\,e^{-\left(\frac{\lambda - 560\ \text{nm}}{90\ \text{nm}}\right)^2}\right] $$

where the narrow peak is a blue LED and the broad peak is a yellow
phosphor coating that is deposited over the blue LED. Use the process
below to display the color of this LED on a computer display.
(a) Obtain a copy of the XYZ color matching functions (available at
www.cvrl.org) and calculate the XYZ coordinates using the process
described in P2.13.
(b) Now transform the XYZ coordinates into the (R, G, B ) basis using
allows you to view faint and bright stars in the night sky at the same
time, even though their brightness differs by several orders of magnitude.
For the sake of easy display, the sRGB standard approximates the
response of the eye by taking each (R, G, B) value, represented by $C$,
and mapping it to the corresponding component in the sRGB basis like
this:

$$ C_{\text{sRGB}} = \begin{cases} 12.92\,C & C \le 0.0031308 \\ 1.055\,C^{1/2.4} - 0.055 & C > 0.0031308 \end{cases} $$
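
For reference, the piecewise mapping above translates directly into a short function. This is a minimal sketch; the name `srgb_encode` is hypothetical, and the input is assumed to be a linear component already scaled to the range 0 to 1.

```python
def srgb_encode(c):
    """Map a linear component C (assumed to lie in [0, 1]) to its sRGB value."""
    if c <= 0.0031308:
        return 12.92 * c                       # linear segment near black
    return 1.055 * c ** (1 / 2.4) - 0.055      # power-law segment

for c in (0.0, 0.0031308, 0.18, 0.5, 1.0):
    print(c, round(srgb_encode(c), 4))
```

The two branches meet (to within rounding of the published constants) at the threshold, so the mapping is continuous; values outside the assumed range would need to be clipped before encoding.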
1999).
74 Chapter 3 Reflection and Refraction
material on the right, as depicted in Fig. 3.1. When a plane wave traveling
in the direction $\mathbf{k}_i$ is incident on the boundary from the left, it gives rise to a
reflected plane wave traveling in the direction $\mathbf{k}_r$ and a transmitted plane wave
traveling in the direction $\mathbf{k}_t$. The incident and reflected waves exist only to the
left of the material interface, and the transmitted wave exists only to the right of
the interface. The angles $\theta_i$, $\theta_r$, and $\theta_t$ give the angles that each respective wave
vector ($\mathbf{k}_i$, $\mathbf{k}_r$, and $\mathbf{k}_t$) makes with the normal to the interface.
For simplicity, we'll assume that both of the materials are isotropic here.
(Chapter 5 discusses refraction for anisotropic materials.) In this case, $\mathbf{k}_i$, $\mathbf{k}_r$, and
$\mathbf{k}_t$ all lie in a single plane, referred to as the plane of incidence (i.e. the plane
represented by the surface of this page). We are free to orient our coordinate
system in many different ways (and every textbook seems to do it differently!).2
We choose the y-z plane to be the plane of incidence, with the z-direction normal
to the interface and the x-axis pointing into the page.
The electric field vector for each plane wave is confined to a plane perpendic-
ular to its wave vector. We are free to decompose the field vector into arbitrary
components as long as they are perpendicular to the wave vector. It is customary
to choose one of the electric field vector components to be that which lies within
the plane of incidence. We call this p-polarized light, where p stands for parallel to
the plane of incidence. The remaining electric field vector component is directed
normal to the plane of incidence and is called s-polarized light. The s stands for
senkrecht, a German word meaning perpendicular.
Using this system, we can decompose the electric field vector $\mathbf{E}_i$ into its p-
polarized component $E_i^{(p)}$ and its s-polarized component $E_i^{(s)}$, as depicted in
Fig. 3.1. The s component $E_i^{(s)}$ is represented by the tail of an arrow pointing
into the page, or the x-direction in our convention. The other fields $\mathbf{E}_r$ and $\mathbf{E}_t$
are similarly split into s and p components as indicated in Fig. 3.1. All field
components are considered to be positive when they point in the direction of
their respective arrows.3 Note that the s-polarized components are parallel for
all three plane waves, whereas the p-polarized components are not (except at
normal incidence) because each plane wave travels in a different direction.

Figure 3.1 Incident, reflected, and transmitted plane wave fields at a material interface. (Axis labels: z-axis; x-axis directed into page.)
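By inspection of Fig. 3.1, we can write the various wave vectors in terms of the
$\hat{\mathbf{y}}$ and $\hat{\mathbf{z}}$ unit vectors:

$$ \begin{aligned}
\mathbf{k}_i &= k_i\left(\hat{\mathbf{y}}\sin\theta_i + \hat{\mathbf{z}}\cos\theta_i\right) \\
\mathbf{k}_r &= k_r\left(\hat{\mathbf{y}}\sin\theta_r - \hat{\mathbf{z}}\cos\theta_r\right) \\
\mathbf{k}_t &= k_t\left(\hat{\mathbf{y}}\sin\theta_t + \hat{\mathbf{z}}\cos\theta_t\right)
\end{aligned} \tag{3.1} $$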
Also by inspection of Fig. 3.1 (following the conventions for the electric fields
indicated by the arrows), we can write the incident, reflected, and transmitted
2 For example, our convention is different than that used by E. Hecht, Optics, 3rd ed., Sect. 4.6.2
Each field has the form (2.8). We have utilized the k-vectors (3.1) in the exponents
of (3.2).

Figure 3.2 Animation of s- and p-polarized fields incident on an interface as the angle of incidence is varied.

Now we are ready to connect the fields on one side of the interface to the
fields on the other side. This is done using boundary conditions. As explained in
Appendix 3.A, Maxwell's equations require the components of $\mathbf{E}$ that are parallel
to the interface to be the same on either side of the boundary. In our coordinate
system, the $\hat{\mathbf{x}}$ and $\hat{\mathbf{y}}$ components are parallel to the interface, whereas $z = 0$ defines
the interface. This means that at $z = 0$ the $\hat{\mathbf{x}}$ and $\hat{\mathbf{y}}$ components of the combined
incident and reflected fields must equal the corresponding components of the
transmitted field:

$$ \left[E_i^{(p)}\hat{\mathbf{y}}\cos\theta_i + \hat{\mathbf{x}}E_i^{(s)}\right]e^{i(k_i y\sin\theta_i - \omega_i t)} + \left[E_r^{(p)}\hat{\mathbf{y}}\cos\theta_r + \hat{\mathbf{x}}E_r^{(s)}\right]e^{i(k_r y\sin\theta_r - \omega_r t)} = \left[E_t^{(p)}\hat{\mathbf{y}}\cos\theta_t + \hat{\mathbf{x}}E_t^{(s)}\right]e^{i(k_t y\sin\theta_t - \omega_t t)} \tag{3.3} $$
Since this equation must hold for all conceivable values of t and y, we are compelled to set all the phase factors in the complex exponentials equal to each other. The time portion of the phase factors requires the frequency of all waves to be the same:
$$\omega_i = \omega_r = \omega_t \equiv \omega \tag{3.4}$$
(We could have guessed that all frequencies would be the same; otherwise wave fronts would be annihilated or created at the interface.) Similarly, equating the spatial terms in the exponents of (3.3) requires
$$k_i\sin\theta_i = k_r\sin\theta_r = k_t\sin\theta_t \tag{3.5}$$
Now recall from (2.19) the relations $k_i = k_r = n_i\omega/c$ and $k_t = n_t\omega/c$. With these relations, (3.5) yields the law of reflection
$$\theta_r = \theta_i \tag{3.6}$$
and Snell's law
$$n_i\sin\theta_i = n_t\sin\theta_t \tag{3.7}$$
The three angles $\theta_i$, $\theta_r$, and $\theta_t$ are not independent. The reflected angle matches the incident angle, and the transmitted angle obeys Snell's law. The phenomenon of refraction refers to the fact that $\theta_i$ and $\theta_t$ are different. That is, light bends as it transmits through an interface.

Willebrord Snell (or Snellius) (1580-1626, Dutch) was an astronomer and mathematician born in Leiden, Netherlands. In 1613 he succeeded his father as professor of mathematics at the University of Leiden. He was an accomplished mathematician, developing a new method for calculating π as well as an improved method for measuring the circumference of the earth. He is most famous for his rediscovery of the law of refraction in 1621. (The law was known (in table form) to the ancient Greek mathematician Ptolemy, to Persian engineer Ibn Sahl (900s), and to Polish philosopher Witelo (1200s).) Snell authored several books, including one on trigonometry, published a year after his death. (Wikipedia)
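As a quick numerical illustration of Snell's law (3.7), the following minimal Python sketch computes the transmitted angle from the incident angle. The function name `refraction_angle` and the sample indices are illustrative choices, not part of the text, and the sketch assumes real refractive indices.

```python
import math

def refraction_angle(theta_i, n_i, n_t):
    """Transmitted angle (radians) from Snell's law: n_i sin(theta_i) = n_t sin(theta_t)."""
    s = n_i * math.sin(theta_i) / n_t
    if abs(s) > 1.0:
        # No real transmitted angle exists in this case.
        raise ValueError("no real transmitted angle for these parameters")
    return math.asin(s)

# Illustrative values: light in air (n = 1) striking glass (n = 1.5) at 45 degrees.
theta_i = math.radians(45.0)
theta_t = refraction_angle(theta_i, 1.0, 1.5)
print(f"theta_t = {math.degrees(theta_t):.2f} degrees")
```

The law of reflection (3.6) needs no computation, since the reflected angle simply equals the incident angle.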
76 Chapter 3 Reflection and Refraction
Because the exponents are all identical, (3.3) reduces to two relatively simple equations (one for each dimension, x and y):
$$E_i^{(s)} + E_r^{(s)} = E_t^{(s)} \tag{3.8}$$
and
$$\left(E_i^{(p)} + E_r^{(p)}\right)\cos\theta_i = E_t^{(p)}\cos\theta_t \tag{3.9}$$
We have derived these equations from the boundary condition (3.54) on the parallel component of the electric field. This set of equations has four unknowns ($E_r^{(p)}$, $E_r^{(s)}$, $E_t^{(p)}$, and $E_t^{(s)}$), assuming that we pick the incident fields. We require two additional equations to solve the system. These are obtained using the separate boundary condition on the parallel component of magnetic fields given in (3.58) (also discussed in appendix 3.A).
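From Faraday's law (1.3), we have for a plane wave (see (2.56))
$$\mathbf{B} = \frac{\mathbf{k}\times\mathbf{E}}{\omega} = \frac{n}{c}\,\hat{\mathbf{u}}\times\mathbf{E} \tag{3.10}$$
where $\hat{\mathbf{u}} \equiv \mathbf{k}/k$ is a unit vector in the direction of $\mathbf{k}$. We have also utilized (2.19) for a real index. This expression is useful for writing $\mathbf{B}_i$, $\mathbf{B}_r$, and $\mathbf{B}_t$ in terms of the electric field components that we have already introduced. When injecting (3.1) and (3.2) into (3.10), the incident, reflected, and transmitted magnetic fields turn out to be
$$
\begin{aligned}
\mathbf{B}_i &= \frac{n_i}{c}\left[-\hat{\mathbf{x}}E_i^{(p)} + E_i^{(s)}\left(-\hat{\mathbf{z}}\sin\theta_i + \hat{\mathbf{y}}\cos\theta_i\right)\right]e^{i[k_i(y\sin\theta_i + z\cos\theta_i) - \omega_i t]}\\
\mathbf{B}_r &= \frac{n_r}{c}\left[\hat{\mathbf{x}}E_r^{(p)} + E_r^{(s)}\left(-\hat{\mathbf{z}}\sin\theta_r - \hat{\mathbf{y}}\cos\theta_r\right)\right]e^{i[k_r(y\sin\theta_r - z\cos\theta_r) - \omega_r t]}\\
\mathbf{B}_t &= \frac{n_t}{c}\left[-\hat{\mathbf{x}}E_t^{(p)} + E_t^{(s)}\left(-\hat{\mathbf{z}}\sin\theta_t + \hat{\mathbf{y}}\cos\theta_t\right)\right]e^{i[k_t(y\sin\theta_t + z\cos\theta_t) - \omega_t t]}
\end{aligned}
\tag{3.11}
$$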
Next, we apply the boundary condition (3.58), namely that the components of $\mathbf{B}$ parallel to the interface (i.e. in the x and y dimensions) are the same⁴ on either side of the plane z = 0. Since we already know that the exponents are all equal and that $\theta_r = \theta_i$ and $n_i = n_r$, the boundary condition gives
$$
\frac{n_i}{c}\left[-\hat{\mathbf{x}}E_i^{(p)} + E_i^{(s)}\hat{\mathbf{y}}\cos\theta_i\right]
+ \frac{n_i}{c}\left[\hat{\mathbf{x}}E_r^{(p)} - E_r^{(s)}\hat{\mathbf{y}}\cos\theta_i\right]
= \frac{n_t}{c}\left[-\hat{\mathbf{x}}E_t^{(p)} + E_t^{(s)}\hat{\mathbf{y}}\cos\theta_t\right]
\tag{3.12}
$$
As before, (3.12) reduces to two relatively simple equations (one for the x dimension and one for the y dimension):
$$n_i\left(E_i^{(p)} - E_r^{(p)}\right) = n_t E_t^{(p)} \tag{3.13}$$
and
$$n_i\left(E_i^{(s)} - E_r^{(s)}\right)\cos\theta_i = n_t E_t^{(s)}\cos\theta_t \tag{3.14}$$
These two equations together with (3.8) and (3.9) allow us to solve for the reflected fields $\mathbf{E}_r$ and transmitted fields $\mathbf{E}_t$ for the s and p polarization components. However, (3.8), (3.9), (3.13), and (3.14) are not yet in their most convenient form.
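The four boundary-condition equations decouple into one 2×2 linear system per polarization, so they can be solved directly. The sketch below is one hedged way to do this in Python: it sets the incident amplitude to 1, so the unknowns are the ratios $E_r/E_i$ and $E_t/E_i$ (called `r` and `t` here). The helper names `solve_2x2` and `fresnel_from_boundary_conditions` are hypothetical, and real indices with a real transmitted angle are assumed.

```python
import math

def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve [[a11, a12], [a21, a22]] (x, y) = (b1, b2) with Cramer's rule."""
    det = a11 * a22 - a12 * a21
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y

def fresnel_from_boundary_conditions(theta_i, n_i, n_t):
    """Return (r_s, t_s, r_p, t_p) with the incident amplitude set to 1."""
    theta_t = math.asin(n_i * math.sin(theta_i) / n_t)  # Snell's law (3.7)
    ci, ct = math.cos(theta_i), math.cos(theta_t)
    # s polarization:
    #   (3.8)   1 + r = t                      ->  r - t = -1
    #   (3.14)  n_i (1 - r) ci = n_t t ct      ->  n_i ci r + n_t ct t = n_i ci
    r_s, t_s = solve_2x2(1.0, -1.0, n_i * ci, n_t * ct, -1.0, n_i * ci)
    # p polarization:
    #   (3.9)   (1 + r) ci = t ct              ->  ci r - ct t = -ci
    #   (3.13)  n_i (1 - r) = n_t t            ->  n_i r + n_t t = n_i
    r_p, t_p = solve_2x2(ci, -ct, n_i, n_t, -ci, n_i)
    return r_s, t_s, r_p, t_p

# Illustrative values: air to glass at 30 degrees.
print(fresnel_from_boundary_conditions(math.radians(30.0), 1.0, 1.5))
```

Solving the system numerically in this way gives the same ratios that the closed-form Fresnel coefficients of the next section express analytically.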
4 We assume the permeability $\mu_0$ is the same everywhere (no magnetic effects).
3.2 The Fresnel Coefficients
The ratios of the reflected and transmitted field components to the incident field components are specified by the Fresnel coefficients, which are defined as follows:
All of the above forms of the Fresnel coefficients are potentially useful, depending on the problem at hand. Remember that the angles in the coefficients are not independently chosen, but are subject to Snell's law (3.7). Snell's law has been used to produce the alternative expressions from the first.
The Fresnel coefficients pin down the electric field amplitudes on the two sides of the boundary. They also keep track of phase shifts at a boundary. In Fig. 3.3 we have plotted the Fresnel coefficients for the case of an air-glass interface. Notice that the reflection coefficients are sometimes negative in this plot, which corresponds to a phase shift of $\pi$ upon reflection (note $e^{i\pi} = -1$). Later we will see that when absorbing materials are encountered, more complicated phase shifts can arise due to the complex index of refraction.
3.3 Reflectance and Transmittance
$$P_i = P_r + P_t \tag{3.24}$$
Moreover, the power separates cleanly into power associated with s- and p-polarized fields:
$$P_i^{(s)} = P_r^{(s)} + P_t^{(s)} \quad\text{and}\quad P_i^{(p)} = P_r^{(p)} + P_t^{(p)} \tag{3.25}$$
Since power is proportional to intensity (i.e. power per area) and intensity is proportional to the square of the field amplitude, we can write the fraction of reflected power, called reflectance, in terms of our previously defined Fresnel coefficients:
$$
R_s \equiv \frac{P_r^{(s)}}{P_i^{(s)}} = \frac{I_r^{(s)}}{I_i^{(s)}} = \frac{\left|E_r^{(s)}\right|^2}{\left|E_i^{(s)}\right|^2} = |r_s|^2
\quad\text{and}\quad
R_p \equiv \frac{P_r^{(p)}}{P_i^{(p)}} = \frac{I_r^{(p)}}{I_i^{(p)}} = \frac{\left|E_r^{(p)}\right|^2}{\left|E_i^{(p)}\right|^2} = \left|r_p\right|^2
\tag{3.26}
$$
The total reflected intensity is therefore
$$I_r = I_r^{(s)} + I_r^{(p)} = R_s I_i^{(s)} + R_p I_i^{(p)} \tag{3.27}$$
where, according to (2.62), the total incident intensity is given by
$$I_i = I_i^{(s)} + I_i^{(p)} = \frac{1}{2} n_i \epsilon_0 c\left[\left|E_i^{(s)}\right|^2 + \left|E_i^{(p)}\right|^2\right] \tag{3.28}$$
From (3.25) and (3.26), the transmitted power is
$$
P_t^{(s)} = P_i^{(s)} - P_r^{(s)} = (1 - R_s)\,P_i^{(s)}
\quad\text{and}\quad
P_t^{(p)} = P_i^{(p)} - P_r^{(p)} = \left(1 - R_p\right)P_i^{(p)}
\tag{3.29}
$$
From this expression we see that the fraction of the power that transmits, called the transmittance, is
$$
T_s \equiv \frac{P_t^{(s)}}{P_i^{(s)}} = 1 - R_s
\quad\text{and}\quad
T_p \equiv \frac{P_t^{(p)}}{P_i^{(p)}} = 1 - R_p
\tag{3.30}
$$
Figure 3.4 shows typical reflectance and transmittance values for an air-glass interface.

Figure 3.4 The reflectance and transmittance plotted versus $\theta_i$ for the case of an air-glass interface with $n_i = 1$ and $n_t = 1.5$.

You might be surprised at first to learn that
$$T_s \neq |t_s|^2 \quad\text{and}\quad T_p \neq \left|t_p\right|^2 \tag{3.31}$$
However, recall that the transmitted intensity (in terms of the transmitted fields) depends also on the refractive index. The Fresnel coefficients $t_s$ and $t_p$ relate the bare electric fields to each other, whereas the transmitted intensity is
$$I_t = I_t^{(s)} + I_t^{(p)} = \frac{1}{2} n_t \epsilon_0 c\left[\left|E_t^{(s)}\right|^2 + \left|E_t^{(p)}\right|^2\right] \tag{3.32}$$
In view of (3.28) and (3.32), we expect $T_s$ and $T_p$ to depend on the ratio of the refractive indices $n_t$ and $n_i$ in addition to $|t_s|^2$ or $\left|t_p\right|^2$.
There is another more subtle reason for the inequalities in (3.31). Consider a lateral strip of light associated with a plane wave incident upon the material interface in Fig. 3.5. Upon refraction into the second medium, the strip is seen to change its width by the factor $\cos\theta_t/\cos\theta_i$. This is a purely geometrical effect, owing to the change in propagation direction at the interface. Since power is intensity times area, the transmittance picks up this geometrical factor via the ratio of the areas $A_t/A_i$ as follows:
$$
\begin{aligned}
T_s &= \frac{P_t^{(s)}}{P_i^{(s)}} = \frac{I_t^{(s)} A_t}{I_i^{(s)} A_i} = \frac{n_t\cos\theta_t}{n_i\cos\theta_i}\,|t_s|^2\\
T_p &= \frac{P_t^{(p)}}{P_i^{(p)}} = \frac{I_t^{(p)} A_t}{I_i^{(p)} A_i} = \frac{n_t\cos\theta_t}{n_i\cos\theta_i}\,\left|t_p\right|^2
\end{aligned}
\qquad\text{(not valid if total internal reflection)}
\tag{3.33}
$$

Figure 3.5 Light refracting into a surface.
Note that (3.33) is valid only if a real angle $\theta_t$ exists; it does not hold when the incident angle exceeds the critical angle for total internal reflection, discussed in section 3.5. In that situation, we must stick with (3.30).
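The relationship between the Fresnel coefficients and the power fractions can be checked numerically. The following Python sketch evaluates the reflectances as $|r|^2$ and the transmittances with the geometrical factor $n_t\cos\theta_t/(n_i\cos\theta_i)$, then confirms that each pair sums to one. The closed forms for `r_s` and `t_s` used here are the ones that follow from solving (3.8) and (3.14). The function name and sample values are illustrative assumptions, and the sketch only applies when a real $\theta_t$ exists.

```python
import math

def reflectance_transmittance(theta_i, n_i, n_t):
    """Return R and T for s and p polarization (real indices, real theta_t)."""
    theta_t = math.asin(n_i * math.sin(theta_i) / n_t)  # Snell's law (3.7)
    ci, ct = math.cos(theta_i), math.cos(theta_t)
    r_s = (n_i * ci - n_t * ct) / (n_i * ci + n_t * ct)
    t_s = 2.0 * n_i * ci / (n_i * ci + n_t * ct)
    r_p = (n_i * ct - n_t * ci) / (n_i * ct + n_t * ci)
    t_p = 2.0 * n_i * ci / (n_i * ct + n_t * ci)
    geometry = (n_t * ct) / (n_i * ci)  # index ratio times the area ratio A_t / A_i
    return {
        "R_s": r_s ** 2, "T_s": geometry * t_s ** 2,
        "R_p": r_p ** 2, "T_p": geometry * t_p ** 2,
    }

# Illustrative values: air to glass at 30 degrees.
result = reflectance_transmittance(math.radians(30.0), 1.0, 1.5)
print(result, result["R_s"] + result["T_s"], result["R_p"] + result["T_p"])
```

Dropping the `geometry` factor makes the sums differ from one, which is the point of the inequalities (3.31).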
Example 3.2
Show analytically that $R_p + T_p = 1$, where $R_p$ is given by (3.26) and $T_p$ is given by (3.33).

Solution: From (3.22) and (3.26) we have
$$
R_p = \left|\frac{n_i\cos\theta_t - n_t\cos\theta_i}{n_i\cos\theta_t + n_t\cos\theta_i}\right|^2
= \frac{n_i^2\cos^2\theta_t - 2 n_i n_t\cos\theta_i\cos\theta_t + n_t^2\cos^2\theta_i}{\left(n_i\cos\theta_t + n_t\cos\theta_i\right)^2}
$$
From (3.23) and (3.33) we have
$$
T_p = \frac{n_t\cos\theta_t}{n_i\cos\theta_i}\left[\frac{2 n_i\cos\theta_i}{n_i\cos\theta_t + n_t\cos\theta_i}\right]^2
= \frac{4 n_i n_t\cos\theta_i\cos\theta_t}{\left(n_i\cos\theta_t + n_t\cos\theta_i\right)^2}
$$
Then
$$
R_p + T_p = \frac{n_i^2\cos^2\theta_t + 2 n_i n_t\cos\theta_i\cos\theta_t + n_t^2\cos^2\theta_i}{\left(n_i\cos\theta_t + n_t\cos\theta_i\right)^2}
= \frac{\left(n_i\cos\theta_t + n_t\cos\theta_i\right)^2}{\left(n_i\cos\theta_t + n_t\cos\theta_i\right)^2} = 1
$$

David Brewster (1781-1868, Scottish) was born in Jedburgh, Scotland. His father was a teacher and wanted David to become a clergyman. At age twelve, David went to the University of Edinburgh for that purpose, but his inclination for natural science soon became apparent. He became licensed to preach, but his interests in science distracted him from that profession, and he spent much of his time studying diffraction. Taking an empirical approach, Brewster independently discovered many of the same things usually credited to Fresnel. He even made a dioptric apparatus for lighthouses before Fresnel developed his. Brewster became somewhat famous in his day for the development of the kaleidoscope and stereoscope for enjoyment by the general public. Brewster was a prolific science writer and editor throughout his life. Among his works is an important biography of Isaac Newton. He was knighted for his accomplishments in 1831. (Wikipedia)
3.4 Brewster's Angle
Notice $r_p$ and $R_p$ go to zero at a certain angle in Figs. 3.3 and 3.4, indicating that no p-polarized light is reflected at this angle. This behavior is quite general, as we can see from the final form of the Fresnel coefficient formula for $r_p$ in (3.22), which has $\tan(\theta_i + \theta_t)$ in the denominator. Since the tangent blows up at $\pi/2$, the reflection coefficient goes to zero when
$$\theta_i + \theta_t = \frac{\pi}{2} \qquad\text{(requirement for zero p-polarized reflection)} \tag{3.34}$$
By inspecting Fig. 3.1, we see that this condition occurs when the reflected and transmitted wave vectors, $\mathbf{k}_r$ and $\mathbf{k}_t$, are perpendicular to each other (see also Fig. 3.6). If we insert (3.34) into Snell's law (3.7), we can solve for the incident angle $\theta_i$ that gives rise to this special circumstance:
$$n_i\sin\theta_i = n_t\sin\left(\frac{\pi}{2} - \theta_i\right) = n_t\cos\theta_i \tag{3.35}$$

Figure 3.6 Brewster's angle coincides with the situation where $\mathbf{k}_r$ and $\mathbf{k}_t$ are perpendicular. At this angle the reflection is completely s-polarized and the p-polarized light is 100% transmitted.
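The zero of the p-polarized reflection can also be located numerically, without any closed-form solution, which gives an independent check on the condition (3.34). The sketch below bisects on the sign of $r_p$, written in the index form $(n_i\cos\theta_t - n_t\cos\theta_i)/(n_i\cos\theta_t + n_t\cos\theta_i)$. The function names, the tolerance, and the sample index 1.33 are illustrative assumptions, and the bracket used assumes $n_i < n_t$ with real indices.

```python
import math

def r_p(theta_i, n_i, n_t):
    """p-polarized Fresnel reflection coefficient for real indices."""
    theta_t = math.asin(n_i * math.sin(theta_i) / n_t)  # Snell's law (3.7)
    num = n_i * math.cos(theta_t) - n_t * math.cos(theta_i)
    den = n_i * math.cos(theta_t) + n_t * math.cos(theta_i)
    return num / den

def zero_reflection_angle(n_i, n_t, tol=1e-12):
    """Bisect for the incident angle where r_p changes sign (assumes n_i < n_t)."""
    lo, hi = 0.0, math.pi / 2  # r_p < 0 at normal incidence, r_p -> +1 at grazing
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if r_p(mid, n_i, n_t) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

theta_zero = zero_reflection_angle(1.0, 1.33)
theta_t = math.asin(math.sin(theta_zero) / 1.33)
print(math.degrees(theta_zero), math.degrees(theta_zero + theta_t))
```

The sum of the incident and transmitted angles at the root comes out at 90 degrees, in agreement with (3.34).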
The angle that satisfies this equation, in terms of the refractive indices, is readily found to be
$$\theta_B = \tan^{-1}\frac{n_t}{n_i} \tag{3.36}$$
We have replaced the specific $\theta_i$ with $\theta_B$ in honor of Sir David Brewster who first discovered the phenomenon. The angle $\theta_B$ is called Brewster's angle. At Brewster's angle, no p-polarized light reflects (see L3.4). Physically, the p-polarized light cannot reflect because $\mathbf{k}_r$ and $\mathbf{k}_t$ are perpendicular. A reflection would require the microscopic dipoles at the surface of the second material to radiate along their axes, which they cannot do. Maxwell's equations know about this, and so everything is nicely consistent.

Figure 3.7 The intensity radiation pattern of an oscillating dipole as a function of angle. Note that the dipole does not radiate along the axis of oscillation, giving rise to Brewster's angle for reflection.

3.5 Total Internal Reflection
From Snell's law (3.7), we can compute the transmitted angle in terms of the incident angle:
$$\theta_t = \sin^{-1}\left(\frac{n_i}{n_t}\sin\theta_i\right) \tag{3.37}$$
The angle $\theta_t$ is real only if the argument of the inverse sine is less than or equal to one. If $n_i > n_t$, we can find a critical angle beyond which the argument begins to exceed one:
$$\theta_c \equiv \sin^{-1}\frac{n_t}{n_i} \tag{3.38}$$
When $\theta_i > \theta_c$, then there is total internal reflection and we can directly show that $R_s = 1$ and $R_p = 1$ (see P3.9).⁵ To demonstrate this, one computes the Fresnel coefficients (3.20) and (3.22) while employing the following substitution:
$$\cos\theta_t = \sqrt{1 - \sin^2\theta_t} = i\sqrt{\frac{n_i^2}{n_t^2}\sin^2\theta_i - 1} \qquad (\theta_i > \theta_c) \tag{3.39}$$
(see P0.19).
In this case, $\theta_t$ is a complex number. However, we do not assign geometrical significance to it in terms of any direction. Actually, we don't even need to know the value for $\theta_t$; we need only the values for $\sin\theta_t$ and $\cos\theta_t$, as specified by Snell's law (3.7) and (3.39). Even though $\sin\theta_t$ is greater than one and $\cos\theta_t$ is imaginary, we can use their values to compute $r_s$, $r_p$, $t_s$, and $t_p$. (Complex notation is wonderful!)
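To make the substitution (3.39) concrete, the following Python sketch evaluates the reflection coefficients beyond the critical angle using complex arithmetic. It uses the index forms $r_s = (n_i\cos\theta_i - n_t\cos\theta_t)/(n_i\cos\theta_i + n_t\cos\theta_t)$ and $r_p = (n_i\cos\theta_t - n_t\cos\theta_i)/(n_i\cos\theta_t + n_t\cos\theta_i)$. The function name `tir_reflection` and the sample values (glass to air at 60 degrees) are illustrative assumptions.

```python
import cmath
import math

def tir_reflection(theta_i, n_i, n_t):
    """Return (r_s, r_p) for an incident angle beyond the critical angle."""
    arg = (n_i / n_t) ** 2 * math.sin(theta_i) ** 2 - 1.0
    if arg <= 0.0:
        raise ValueError("theta_i does not exceed the critical angle")
    cos_t = 1j * math.sqrt(arg)  # imaginary cos(theta_t), as in (3.39)
    ci = math.cos(theta_i)
    r_s = (n_i * ci - n_t * cos_t) / (n_i * ci + n_t * cos_t)
    r_p = (n_i * cos_t - n_t * ci) / (n_i * cos_t + n_t * ci)
    return r_s, r_p

n_i, n_t = 1.5, 1.0
theta_c = math.asin(n_t / n_i)  # critical angle (3.38)
r_s, r_p = tir_reflection(math.radians(60.0), n_i, n_t)
print(math.degrees(theta_c), abs(r_s), abs(r_p), cmath.phase(r_s), cmath.phase(r_p))
```

Both magnitudes equal one, so all of the power reflects, while the two phases differ from each other.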
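Upon substitution of (3.39) into the Fresnel reflection coefficients (3.20) and (3.22) we obtain
$$
r_s = \frac{n_i\cos\theta_i - i\,n_t\sqrt{\dfrac{n_i^2}{n_t^2}\sin^2\theta_i - 1}}{n_i\cos\theta_i + i\,n_t\sqrt{\dfrac{n_i^2}{n_t^2}\sin^2\theta_i - 1}} \qquad (\theta_i > \theta_c) \tag{3.40}
$$
and
$$
r_p = -\frac{n_t\cos\theta_i - i\,n_i\sqrt{\dfrac{n_i^2}{n_t^2}\sin^2\theta_i - 1}}{n_t\cos\theta_i + i\,n_i\sqrt{\dfrac{n_i^2}{n_t^2}\sin^2\theta_i - 1}} \qquad (\theta_i > \theta_c) \tag{3.41}
$$
These Fresnel coefficients can be manipulated (see P3.9) into the forms
$$
r_s = \exp\left\{-2i\tan^{-1}\left[\frac{n_t}{n_i\cos\theta_i}\sqrt{\frac{n_i^2}{n_t^2}\sin^2\theta_i - 1}\right]\right\} \qquad (\theta_i > \theta_c) \tag{3.42}
$$
and
$$
r_p = -\exp\left\{-2i\tan^{-1}\left[\frac{n_i}{n_t\cos\theta_i}\sqrt{\frac{n_i^2}{n_t^2}\sin^2\theta_i - 1}\right]\right\} \qquad (\theta_i > \theta_c) \tag{3.43}
$$
5 M. Born and E. Wolf, Principles of Optics, 7th ed., Sect. 1.5.4 (Cambridge University Press, 1999).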
Figure 3.9 plots the evanescent wave described by (3.44) along with the associ-
ated incident wave. The phase of the evanescent wave indicates that it propagates
parallel to the boundary (in the y-dimension). Its strength decays exponentially
away from the boundary (in the z-dimension). We leave the calculation of t s and
t p as an exercise (P3.10).
6 G. R. Fowles, Introduction to Modern Optics, 2nd ed., Sect 2.9 (New York: Dover, 1975).
3.6 Reflections from Metal
$$r_s = |r_s|\,e^{i\phi_s} \tag{3.49}$$
and
$$r_p = \left|r_p\right|e^{i\phi_p} \tag{3.50}$$
7 See M. Born and E. Wolf, Principles of Optics, 7th ed., Sect. 14.2 (Cambridge University Press, 1999).
We refrain from putting (3.47) and (3.48) into this form using the general expressions; we would get a big mess. It is a good idea to let your calculator or a computer do it after a specific value for $N \equiv n + i\kappa$ is chosen. An important point to notice is that the phases upon reflection can be very different for s- and p-polarization components (i.e. $\phi_p$ and $\phi_s$ can be very different). This is true in general, even when the reflectivity is high (i.e. $|r_s|$ and $\left|r_p\right|$ on the order of unity).
Brewster's angle exists also for surfaces with complex refractive index. However, in general the expressions (3.48) and (3.50) do not go to zero at any incident angle $\theta_i$. Rather, the reflection of p-polarized light can go through a minimum at some angle $\theta_i$, which we refer to as Brewster's angle (see Fig. 3.10). This minimum is best found numerically since the general expression for $r_p$ in terms of n and
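$$\oint_C \mathbf{B}\cdot d\boldsymbol{\ell} = \mu_0\int_S\left(\mathbf{J} + \epsilon_0\frac{\partial\mathbf{E}}{\partial t}\right)\cdot\hat{\mathbf{n}}\,da \tag{3.55}$$
As before, we are able to perform the path integration on the left-hand side for the geometry depicted in the figure, which gives
$$
\oint \mathbf{B}\cdot d\boldsymbol{\ell} = B_{1\parallel}d - B_{1\perp}\ell_1 - B_{2\perp}\ell_2 - B_{2\parallel}d + B_{2\perp}\ell_2 + B_{1\perp}\ell_1 = \left(B_{1\parallel} - B_{2\parallel}\right)d \tag{3.56}
$$
The notation for parallel and perpendicular components on either side of the interface is similar to that used in (3.52).
Again, we can shrink the loop down until it has zero surface area by letting the lengths $\ell_1$ and $\ell_2$ go to zero. In this situation, the right-hand side of (3.55) goes to zero (ignoring the possibility of surface currents):
$$\int_S\left(\mathbf{J} + \epsilon_0\frac{\partial\mathbf{E}}{\partial t}\right)\cdot\hat{\mathbf{n}}\,da \rightarrow 0 \tag{3.57}$$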
8 This form can be obtained from (1.4) by integration over the surface S in Fig. 3.11 and applying
Exercises
P3.1 Derive the Fresnel coefficients (3.22) and (3.23) for p-polarized light.
P3.2 Verify that each of the alternative forms given in (3.20)-(3.23) is equivalent. Show that at normal incidence (i.e. $\theta_i = \theta_t = 0$) the Fresnel coefficients reduce to
$$
\lim_{\theta_i\to 0} r_s = \lim_{\theta_i\to 0} r_p = -\frac{n_t - n_i}{n_t + n_i}
\quad\text{and}\quad
\lim_{\theta_i\to 0} t_s = \lim_{\theta_i\to 0} t_p = \frac{2 n_i}{n_t + n_i}
$$
L3.4 (a) In the laboratory, measure the reflectance for both s- and p-polarized light from a flat glass surface at about ten angles. Especially watch for Brewster's angle (described in section 3.4). You can normalize the detector by measuring the beam before the glass surface. Figure 3.12 illustrates the experimental setup. (video)

[Figure 3.12: experimental setup showing the laser, the polarizer, the uncoated glass on a rotation stage, and a high-sensitivity detector that slides with the beam.]
P3.7 (a) Find Brewster's angle for an air-glass interface with $n_{\rm glass} = 1.5$.
(b) Compute $R_s$ and $R_p$ at this angle.
P3.8 Diamonds have an index of refraction of n = 2.42, which allows total internal reflection to occur at relatively shallow angles of incidence. Gem cutters choose facet angles that ensure most of the light entering the top of the diamond will reflect back out to give the stone its expensive sparkle. One such cut, the "Eulitz Brilliant" cut, is shown in Fig. 3.14.
(a) What is the critical angle for diamond?
(b) What fraction of the light reflects for internal angles $\theta_i = 40.5°$ and $\theta_i = 50.6°$? One way to spot a fake diamond is by noticing reduced brilliance in the sparkle. Are these angles both beyond the critical angle for fused quartz (n = 1.46)?
(c) For each angle and assuming s-polarized light, find the phase shift upon reflection $\phi_s$ where $r_s = |r_s|\,e^{i\phi_s}$.

Figure 3.14 A Eulitz Brilliant cut diamond.

P3.9 Derive (3.42) and (3.43) and show that $R_s = 1$ and $R_p = 1$. HINT: See problem P0.15.
P3.12 Light ($\lambda_{\rm vac} = 500$ nm) reflects internally from a glass surface (n = 1.5) surrounded by air. The incident angle is $\theta_i = 45°$. An evanescent wave travels parallel to the surface on the air side. At what distance from the surface is the amplitude of the evanescent wave 1/e of its value at the surface?
P3.13 The complex index for silver is given by n = 0.13 and $\kappa = 4.0$.⁹ Find $r_s$ and $r_p$ when reflecting at $\theta_i = 80°$ and put them into the forms (3.49) and (3.50). Assume the light propagates in vacuum on the incident side.
Answer: $r_s = 0.997\,e^{-i3.057}$, $r_p = 0.969\,e^{-i1.187}$

Figure 3.15 Geometry for P3.13

P3.14 (a) Using a computer, plot $|r_s|$, $|r_p|$ versus $\theta_i$ for silver (n = 0.13 and $\kappa = 4.0$). Make a separate plot of the phases $\phi_s$ and $\phi_p$ from (3.49) and (3.50). Clearly label each plot.
(b) Can you identify Brewster's angle (i.e. where $|r_p|$ is minimum)?
90 Chapter 4 Multiple Parallel Interfaces
also have the advantage of being more durable and less prone to damage from
high-intensity lasers.
middle region.1
As of yet, we do not know the amplitudes or phases of the net forward and net backward traveling plane waves in the middle layer. We denote them by $E_{\rightarrow 1}^{(s)}$ and $E_{\leftarrow 1}^{(s)}$ or by $E_{\rightarrow 1}^{(p)}$ and $E_{\leftarrow 1}^{(p)}$, separated into their s and p components as usual. Similarly, $E_{\leftarrow 0}^{(s)}$ and $E_{\leftarrow 0}^{(p)}$ as well as $E_{\rightarrow 2}^{(s)}$ and $E_{\rightarrow 2}^{(p)}$ are understood to include light that leaks through the boundaries from the middle region. Thus, we need only concern ourselves with the five plane waves depicted in Fig. 4.1.
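The various plane-wave fields are connected to each other at the boundaries via the single-boundary Fresnel coefficients (3.20)-(3.23). At the first surface we define
$$
\begin{aligned}
r_s^{0\rightarrow 1} &\equiv \frac{\sin\theta_1\cos\theta_0 - \sin\theta_0\cos\theta_1}{\sin\theta_1\cos\theta_0 + \sin\theta_0\cos\theta_1}
&\qquad
r_p^{0\rightarrow 1} &\equiv \frac{\sin\theta_1\cos\theta_1 - \sin\theta_0\cos\theta_0}{\sin\theta_1\cos\theta_1 + \sin\theta_0\cos\theta_0}\\
t_s^{0\rightarrow 1} &\equiv \frac{2\sin\theta_1\cos\theta_0}{\sin\theta_1\cos\theta_0 + \sin\theta_0\cos\theta_1}
&\qquad
t_p^{0\rightarrow 1} &\equiv \frac{2\sin\theta_1\cos\theta_0}{\sin\theta_1\cos\theta_1 + \sin\theta_0\cos\theta_0}
\end{aligned}
\tag{4.1}
$$
The notation $0\rightarrow 1$ indicates the first surface from the perspective of starting on the incident side and propagating towards the middle layer. The Fresnel coefficients for the backward traveling light approaching the first interface from within the middle layer are given by
$$
\begin{aligned}
r_s^{0\leftarrow 1} &= -r_s^{0\rightarrow 1}
&\qquad
r_p^{0\leftarrow 1} &= -r_p^{0\rightarrow 1}\\
t_s^{0\leftarrow 1} &\equiv \frac{2\sin\theta_0\cos\theta_1}{\sin\theta_0\cos\theta_1 + \sin\theta_1\cos\theta_0}
&\qquad
t_p^{0\leftarrow 1} &\equiv \frac{2\sin\theta_0\cos\theta_1}{\sin\theta_0\cos\theta_0 + \sin\theta_1\cos\theta_1}
\end{aligned}
\tag{4.2}
$$
where $0\leftarrow 1$ indicates connections at the first interface, but from the perspective of beginning inside the middle layer. Finally, the single-boundary coefficients for light approaching the second interface are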
The forward-traveling wave in the middle region arises from both a transmission of the incident wave and a reflection of the backward-traveling wave in the middle region at the first interface. Using the Fresnel coefficients, we can write $E_{\rightarrow 1}^{(s)}$ as the sum of fields arising from $E_{\rightarrow 0}^{(s)}$ and $E_{\leftarrow 1}^{(s)}$ as follows:
$$E_{\rightarrow 1}^{(s)} = t_s^{0\rightarrow 1}E_{\rightarrow 0}^{(s)} + r_s^{0\leftarrow 1}E_{\leftarrow 1}^{(s)} \tag{4.4}$$
The factors $t_s^{0\rightarrow 1}$ and $r_s^{0\leftarrow 1}$ are the single-boundary Fresnel coefficients selected from (4.1) and (4.2). Similarly, the overall reflected field $E_{\leftarrow 0}^{(s)}$ is given by the reflection of the incident field and the transmission of the backward-traveling field in the middle region according to
$$E_{\leftarrow 0}^{(s)} = r_s^{0\rightarrow 1}E_{\rightarrow 0}^{(s)} + t_s^{0\leftarrow 1}E_{\leftarrow 1}^{(s)} \tag{4.5}$$
Two connections made; two to go.
Before we continue, we need to specify an origin so that we can calculate phase shifts associated with propagation in the middle region. Propagation was not an issue in the single-boundary problem studied back in chapter 3. However, in the double-boundary problem, the thickness of the middle region dictates phase variations that strongly influence the result. We take the origin to be located on the first interface, as shown in Fig. 4.1. Since all fields in (4.4) and (4.5) are evaluated at the origin (y, z) = (0, 0), there were no phase factors needed.
We will connect the plane-wave fields across the second interface at the point $\mathbf{r} = \hat{\mathbf{z}}d$. The appropriate phase-adjusted² field at (y, z) = (0, d) is $E_{\rightarrow 1}^{(s)}e^{i\mathbf{k}_{\rightarrow 1}\cdot\mathbf{r}} = E_{\rightarrow 1}^{(s)}e^{ik_1 d\cos\theta_1}$, since $E_{\rightarrow 1}^{(s)}$ is the field at the origin (y, z) = (0, 0). The transmitted field in the final medium arises only from the forward-traveling field in the middle region, and at our selected point it is
$$E_{\rightarrow 2}^{(s)} = t_s^{1\rightarrow 2}E_{\rightarrow 1}^{(s)}e^{ik_1 d\cos\theta_1} \tag{4.6}$$
Note that $E_{\rightarrow 2}^{(s)}$ stands for the transmitted field at the point (y, z) = (0, d); its local phase can be built into its definition, so there is no need to write an explicit phase.
The backward-traveling plane wave in the middle region arises from the reflection of the forward-traveling plane wave in that region:
$$E_{\leftarrow 1}^{(s)}e^{-ik_1 d\cos\theta_1} = r_s^{1\rightarrow 2}E_{\rightarrow 1}^{(s)}e^{ik_1 d\cos\theta_1} \tag{4.7}$$
Like before, $E_{\leftarrow 1}^{(s)}$ is referenced to the origin (y, z) = (0, 0). Therefore, the factor $e^{i\mathbf{k}_{\leftarrow 1}\cdot\mathbf{r}} = e^{-ik_1 d\cos\theta_1}$ is needed at (y, z) = (0, d).
The relations (4.4)-(4.7) permit us to find overall transmission and reflection coefficients for the two-interface problem.
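The four connections can be solved for the four unknown fields once the single-boundary coefficients are known. The Python sketch below does this for a unit incident field by eliminating the middle-layer fields by hand. The connections it encodes, in words, are: the forward middle field is the transmitted incident field plus the reflected backward field; the overall reflected field is the reflected incident field plus the transmitted backward field; and at the second interface the forward field, after propagating, transmits and reflects. The function names are hypothetical, and the example uses normal incidence (an assumption made only to keep the single-boundary coefficients short).

```python
import cmath
import math

def double_interface_fields(r01, t01, r10, t10, r12, t12, phase, e0=1.0):
    """Solve the double-interface connections for incident field e0.

    phase is k1 * d * cos(theta1); r10 and t10 describe light approaching
    the first interface from inside the middle layer.
    Returns (e0_backward, e1_forward, e1_backward, e2).
    """
    prop = cmath.exp(1j * phase)                      # one-way propagation factor
    e2 = e0 * t01 * prop * t12 / (1.0 - r10 * r12 * prop ** 2)
    e1_forward = e2 / (t12 * prop)                    # inverts e2 = t12 * e1_forward * prop
    e1_backward = r12 * e1_forward * prop ** 2        # reflection at the second interface
    e0_backward = r01 * e0 + t10 * e1_backward        # overall reflected field
    return e0_backward, e1_forward, e1_backward, e2

def normal_incidence(n_a, n_b):
    """Single-boundary (r, t) going from index n_a into n_b at normal incidence."""
    return (n_a - n_b) / (n_a + n_b), 2.0 * n_a / (n_a + n_b)

# Illustrative case: quarter-wave layer (phase = pi/2) with n0 = 1, n1 = 1.38, n2 = 1.5.
n0, n1, n2 = 1.0, 1.38, 1.5
r01, t01 = normal_incidence(n0, n1)
r10, t10 = normal_incidence(n1, n0)
r12, t12 = normal_incidence(n1, n2)
e0b, e1f, e1b, e2 = double_interface_fields(r01, t01, r10, t10, r12, t12, math.pi / 2)
R = abs(e0b) ** 2
T = (n2 / n0) * abs(e2) ** 2
print(R, T, R + T)
```

The reflected and transmitted power fractions sum to one, as they must when the indices are real.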
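Example 4.1
Derive the transmission coefficient that connects the final transmitted field to the incident field for the double-interface problem according to $t_s^{\rm tot} \equiv E_{\rightarrow 2}^{(s)}/E_{\rightarrow 0}^{(s)}$.
2 In the middle region, $\mathbf{k}_{\rightarrow 1} = k_1\left(\hat{\mathbf{y}}\sin\theta_1 + \hat{\mathbf{z}}\cos\theta_1\right)$ and $\mathbf{k}_{\leftarrow 1} = k_1\left(\hat{\mathbf{y}}\sin\theta_1 - \hat{\mathbf{z}}\cos\theta_1\right)$.
$$E_{\rightarrow 1}^{(s)} = \frac{E_{\rightarrow 2}^{(s)}}{t_s^{1\rightarrow 2}}\,e^{-ik_1 d\cos\theta_1} \tag{4.8}$$
$$E_{\leftarrow 1}^{(s)} = E_{\rightarrow 2}^{(s)}\,\frac{r_s^{1\rightarrow 2}}{t_s^{1\rightarrow 2}}\,e^{ik_1 d\cos\theta_1} \tag{4.9}$$
Next, substituting both (4.8) and (4.9) into (4.4) yields the connection we seek between the incident and transmitted fields: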
The coefficient $t_s^{\rm tot}$ derived in Example 4.1 connects the amplitude and phase of the incident field to the amplitude and phase of the transmitted field in a manner similar to the single-boundary Fresnel coefficients. The numerator of (4.11) reminds us of the physics of the situation: the field transmits through the first interface, acquires a phase due to propagating through the middle layer, and then transmits through the second interface. The denominator of (4.11) modifies the result to account for feedback from multiple reflections in the middle region.³
The overall reflection coefficient is found to be (see P4.1)
$$
r_s^{\rm tot} \equiv \frac{E_{\leftarrow 0}^{(s)}}{E_{\rightarrow 0}^{(s)}}
= r_s^{0\rightarrow 1} + \frac{t_s^{0\rightarrow 1}\,e^{ik_1 d\cos\theta_1}\,r_s^{1\rightarrow 2}\,e^{ik_1 d\cos\theta_1}\,t_s^{0\leftarrow 1}}{1 - r_s^{0\leftarrow 1}r_s^{1\rightarrow 2}\,e^{i2k_1 d\cos\theta_1}}
\qquad\text{(p can be switched for s)}
\tag{4.12}
$$
The initial reflection from the first interface is described by the first term $r_s^{0\rightarrow 1}$. The numerator in (4.12) can be simplified algebraically, but we have left it in this longer form to emphasize the physics of the situation: light transmits through the first interface, propagates through the middle layer, reflects from the second interface, propagates back through the middle layer, and transmits back through the first interface to interfere with the initial reflection. The denominator of the second term accounts for the effects of multiple-reflection feedback.
Figure 4.2 shows the magnitudes of the overall reflection and transmission coefficients for the case of a quarter-wave thickness coating of magnesium fluoride on glass with $k_1 d = \pi/2$. This coating is meant to reduce reflections by having the initial reflection described by the first term in (4.12) and the secondary reflection described by the second term add out of phase (i.e. have a relative phase shift of

Figure 4.2 Plots of the magnitudes of the overall reflection and transmission coefficients for a quarter-wave thickness ($k_1 d = \pi/2$) of MgF2 ($n_1 = 1.38$) on glass ($n_2 = 1.5$) in air ($n_0 = 1$).

3 Our derivation method avoids the need for explicit accounting of multiple reflections. For an alternative approach arriving at the same result via an infinite geometric series, see M. Born and E. Wolf, Principles of Optics, 7th ed., Sect. 7.6.1 (Cambridge University Press, 1999) or G. R. Fowles, Introduction to Modern Optics, 2nd ed., Sect 4.1 (New York: Dover, 1975).
4.2 Transmittance through Double-Interface at Sub Critical Angles
We are now in a position to calculate the fraction of power that transmits through or reflects from a double-interface arrangement. Because the transmission coefficient (4.11) has a simpler form than the reflection coefficient (4.12), it is easier to calculate the total transmittance $T_s^{\rm tot}$ and obtain the reflectance, if desired, from the relationship (see (3.30))
$$T_s^{\rm tot} + R_s^{\rm tot} = 1 \tag{4.13}$$

Figure 4.3 Plots of the phases of the overall reflection and transmission coefficients for a quarter-wave thickness ($k_1 d = \pi/2$) of MgF2 ($n_1 = 1.38$) on glass ($n_2 = 1.5$) in air ($n_0 = 1$).
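When the transmitted angle $\theta_2$ is real (i.e. $\theta_1$ does not exceed the critical angle), we may write the fraction of the transmitted power as in (3.33):
$$
T_s^{\rm tot} = \frac{n_2\cos\theta_2}{n_0\cos\theta_0}\left|t_s^{\rm tot}\right|^2
= \frac{n_2\cos\theta_2}{n_0\cos\theta_0}\,
\frac{\left|t_s^{0\rightarrow 1}\right|^2\left|t_s^{1\rightarrow 2}\right|^2}{\left|e^{-ik_1 d\cos\theta_1} - r_s^{0\leftarrow 1}r_s^{1\rightarrow 2}\,e^{ik_1 d\cos\theta_1}\right|^2}
\qquad(\theta_2\ \text{real; p can be switched for s})
\tag{4.14}
$$
$$
T_s^{\rm tot} = \frac{T_s^{\rm max}}{1 + F_s\sin^2\dfrac{\Phi_s}{2}}
\qquad(\theta_1\ \text{and}\ \theta_2\ \text{real; p can be switched for s})
\tag{4.15}
$$
where
$$
T_s^{\rm max} \equiv \frac{T_s^{0\rightarrow 1}\,T_s^{1\rightarrow 2}}{\left(1 - \sqrt{R_s^{0\leftarrow 1}R_s^{1\rightarrow 2}}\right)^2}
\tag{4.16}
$$
4 M. Born and E. Wolf, Principles of Optics, 7th ed., Sect. 7.6.1 (Cambridge University Press, 1999).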
The quantity $T_s^{\rm max}$ is the maximum possible transmittance of power through the two surfaces. The single-interface transmittances ($T_s^{0\rightarrow 1}$ and $T_s^{1\rightarrow 2}$) and reflectances ($R_s^{0\leftarrow 1}$ and $R_s^{1\rightarrow 2}$) are calculated from the single-interface Fresnel coefficients in the usual way as described in chapter 3. The numerator of $T_s^{\rm max}$ represents the combined transmittances for the two interfaces without considering feedback due to multiple reflections. The denominator enhances this value to account for reinforcing feedback in the middle layer.
The exact argument of the sine function, $\Phi_s$, can strongly influence the transmission. The term $2k_1 d\cos\theta_1$ represents the phase delay acquired during round-trip propagation in the middle region. The terms $\phi_{r_s^{0\leftarrow 1}}$ and $\phi_{r_s^{1\rightarrow 2}}$ account for possible phase shifts upon reflection from each interface. They are defined indirectly by writing the single-boundary Fresnel reflection coefficients in polar format:
$$
r_s^{0\leftarrow 1} = \left|r_s^{0\leftarrow 1}\right|e^{i\phi_{r_s^{0\leftarrow 1}}}
\quad\text{and}\quad
r_s^{1\rightarrow 2} = \left|r_s^{1\rightarrow 2}\right|e^{i\phi_{r_s^{1\rightarrow 2}}}
\tag{4.19}
$$
If the indices of refraction in all regions are real, $\phi_{r_s^{0\leftarrow 1}}$ and $\phi_{r_s^{1\rightarrow 2}}$ take on values of either zero or $\pi$ (i.e. the coefficients are positive or negative real numbers). When the indices are complex, other phase values are possible.
$F_s$ is called the coefficient of finesse, which determines how strongly the transmittance is influenced when $\Phi_s$ is varied (for example, through varying d or the wavelength $\lambda_{\rm vac}$).
Example 4.2
Consider a beam splitter designed for s-polarized light incident on a substrate of glass (n = 1.5) at 45° as shown in Fig. 4.4. A thin coating of zinc sulfide (n = 2.32) is applied to the front of the glass to cause about half of the light to reflect. A magnesium fluoride (n = 1.38) coating is applied to the back surface of the glass to minimize reflections at that surface.⁵ Each coating constitutes a separate double-interface problem. The front coating is deferred to problem P4.5. In this example, find the highest transmittance possible through the antireflection film at the back of the beam splitter and the smallest possible d that accomplishes this for light with wavelength $\lambda_{\rm vac} = 633$ nm.

Figure 4.4 Side view of a beamsplitter. (Figure labels: partial-reflection coating, glass, anti-reflection coating; 54% and 46%.)

Solution: For the back coating, we have $n_0 = 1.5$, $n_1 = 1.38$, and $n_2 = 1$. We can find $\theta_0$ and $\theta_1$ from $\theta_2 = 45°$ using Snell's law
$$n_1\sin\theta_1 = \sin\theta_2 \;\Rightarrow\; \theta_1 = \sin^{-1}\left(\frac{\sin 45°}{1.38}\right) = 30.82°$$
$$n_0\sin\theta_0 = \sin\theta_2 \;\Rightarrow\; \theta_0 = \sin^{-1}\left(\frac{\sin 45°}{1.5}\right) = 28.13°$$
5 We ignore possible feedback between the front and rear coatings. Since the antireflection films are usually imperfect, beam splitter substrates are often slightly wedged so that unwanted reflections from the second surface travel in a different direction.
$$\phi_{r_s^{0\leftarrow 1}} = \pi, \qquad \phi_{r_s^{1\rightarrow 2}} = 0$$
$$R_s^{1\rightarrow 2} \equiv \left|r_s^{1\rightarrow 2}\right|^2 = |0.253|^2 = 0.0640$$
$$T_s^{0\rightarrow 1} = T_s^{0\leftarrow 1} = 1 - R_s^{0\leftarrow 1} = 1 - 0.0030 = 0.997$$
$$T_s^{1\rightarrow 2} = 1 - R_s^{1\rightarrow 2} = 1 - 0.0640 = 0.936$$
$$T_s^{\rm tot} = \frac{0.960}{1 + 0.0570\,\sin^2\left(\dfrac{2k_1 d\cos\theta_1 + \pi}{2}\right)}$$
The maximum transmittance occurs when the sine is zero. In that case, $T_s^{\rm tot} = 0.960$, meaning that 96% of the light is transmitted. Without the coating, a situation we can recover by temporarily setting d = 0, the transmittance would be 90.8%, so the coating gives a significant improvement.
We find the smallest thickness d that minimizes reflection by setting the argument of the sine to $\pi$:
$$2k_1 d\cos\theta_1 + \pi = 2\pi$$
Since $k_1 = 2\pi n_1/\lambda_{\rm vac}$, we have
$$d = \frac{\lambda_{\rm vac}}{4 n_1\cos\theta_1} = \frac{633\ \text{nm}}{4(1.38)\cos 30.82°} = 134\ \text{nm}$$
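The numbers in this example can be cross-checked by evaluating the double-interface transmittance directly from the single-boundary coefficients, in the form of (4.14). The Python sketch below is one hedged way to do so. It uses the index form of the s-polarized coefficients, $r = (n_a\cos\theta_a - n_b\cos\theta_b)/(n_a\cos\theta_a + n_b\cos\theta_b)$ and $t = 2n_a\cos\theta_a/(n_a\cos\theta_a + n_b\cos\theta_b)$. The function names are hypothetical, and real angles in all three regions are assumed.

```python
import cmath
import math

def s_coefficients(n_a, cos_a, n_b, cos_b):
    """Single-boundary s-polarized (r, t) going from medium a into medium b."""
    den = n_a * cos_a + n_b * cos_b
    return (n_a * cos_a - n_b * cos_b) / den, 2.0 * n_a * cos_a / den

def coating_transmittance(d, wavelength, n0, n1, n2, theta2):
    """Total s-polarized transmittance through a single layer of thickness d."""
    sin2 = math.sin(theta2)
    c0 = math.sqrt(1.0 - (n2 * sin2 / n0) ** 2)  # cos(theta0) via Snell's law
    c1 = math.sqrt(1.0 - (n2 * sin2 / n1) ** 2)  # cos(theta1) via Snell's law
    c2 = math.cos(theta2)
    _, t01 = s_coefficients(n0, c0, n1, c1)
    r10, _ = s_coefficients(n1, c1, n0, c0)      # first interface, seen from inside
    r12, t12 = s_coefficients(n1, c1, n2, c2)
    phase = 2.0 * math.pi * n1 / wavelength * d * c1   # k1 d cos(theta1)
    den = cmath.exp(-1j * phase) - r10 * r12 * cmath.exp(1j * phase)
    return (n2 * c2) / (n0 * c0) * abs(t01 * t12) ** 2 / abs(den) ** 2

n0, n1, n2 = 1.5, 1.38, 1.0
theta2 = math.radians(45.0)
cos1 = math.sqrt(1.0 - (n2 * math.sin(theta2) / n1) ** 2)
d_best = 633.0 / (4.0 * n1 * cos1)               # thickness in nm
print(d_best,
      coating_transmittance(d_best, 633.0, n0, n1, n2, theta2),
      coating_transmittance(0.0, 633.0, n0, n1, n2, theta2))
```

Evaluating the function over a range of thicknesses is a simple way to see how quickly the transmittance falls away from its maximum.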
4.3 Beyond Critical Angle: Tunneling of Evanescent Waves
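Example 4.3
Calculate the transmittance of p-polarized light through the region between two closely spaced 45° right prisms, as shown in Fig. 4.6, as a function of $\lambda_{\rm vac}$ and the prism spacing d. Take the index of refraction of the prisms to be n = 1.5 surrounded by index n = 1, and use $\theta_0 = \theta_2 = 45°$. Neglect possible reflections from the exterior surfaces of the prisms.
$$
\left|t_p^{1\rightarrow 2}\right|^2 = \left|\frac{2\cos\theta_1\sin\theta_2}{\cos\theta_2\sin\theta_2 + \cos\theta_1\sin\theta_1}\right|^2
= \left|\frac{2\,(i0.3536)\left(\frac{1}{\sqrt{2}}\right)}{\left(\frac{1}{\sqrt{2}}\right)\left(\frac{1}{\sqrt{2}}\right) + (i0.3536)(1.061)}\right|^2 = 0.640
$$
For the last step in the $r_p^{1\rightarrow 2}$ calculation, see problem P0.15. Also note that $r_p^{1\rightarrow 2} = r_p^{0\leftarrow 1} = -r_p^{0\rightarrow 1}$ since $n_0 = n_2$. We also need
$$
k_1 d\cos\theta_1 = \frac{2\pi}{\lambda_{\rm vac}}\,d\cos\theta_1 = \frac{2\pi}{\lambda_{\rm vac}}\,d\,(i0.3536) = i\,2.22\left(\frac{d}{\lambda_{\rm vac}}\right)
$$
We are now ready to compute the total transmittance (4.14). The factors out in front cancel since $\theta_0 = \theta_2$ and $n_0 = n_2$, and we have
$$
T_p^{\rm tot} = \frac{\left|t_p^{0\rightarrow 1}\right|^2\left|t_p^{1\rightarrow 2}\right|^2}{\left|e^{-ik_1 d\cos\theta_1} - r_p^{0\leftarrow 1}r_p^{1\rightarrow 2}\,e^{ik_1 d\cos\theta_1}\right|^2}
$$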
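The dependence of this transmittance on the gap width can be explored numerically. The Python sketch below evaluates the expression above with a complex $\cos\theta_1$, taking the root with positive imaginary part. It uses the index form of the p-polarized coefficients, $r = (n_a\cos\theta_b - n_b\cos\theta_a)/(n_a\cos\theta_b + n_b\cos\theta_a)$ and $t = 2n_a\cos\theta_a/(n_a\cos\theta_b + n_b\cos\theta_a)$. The function names and default arguments are illustrative assumptions matching the example's values.

```python
import cmath
import math

def p_coefficients(n_a, cos_a, n_b, cos_b):
    """Single-boundary p-polarized (r, t) going from medium a into medium b."""
    den = n_a * cos_b + n_b * cos_a
    return (n_a * cos_b - n_b * cos_a) / den, 2.0 * n_a * cos_a / den

def tunneling_transmittance(d_over_lambda, n_prism=1.5, n_gap=1.0,
                            theta0=math.radians(45.0)):
    """p-polarized transmittance across a gap of width d (in vacuum wavelengths)."""
    sin1 = n_prism * math.sin(theta0) / n_gap
    c0 = math.cos(theta0)
    # Beyond the critical angle 1 - sin1**2 is negative; cmath.sqrt of a negative
    # real number returns the root with positive imaginary part.
    c1 = cmath.sqrt(1.0 - sin1 ** 2)
    _, t01 = p_coefficients(n_prism, c0, n_gap, c1)
    r10, _ = p_coefficients(n_gap, c1, n_prism, c0)
    r12, t12 = p_coefficients(n_gap, c1, n_prism, c0)  # second prism matches the first
    phase = 2.0 * math.pi * n_gap * d_over_lambda * c1  # k1 d cos(theta1), imaginary here
    den = cmath.exp(-1j * phase) - r10 * r12 * cmath.exp(1j * phase)
    return abs(t01 * t12) ** 2 / abs(den) ** 2

for ratio in (0.0, 0.25, 0.5, 1.0, 2.0):
    print(ratio, tunneling_transmittance(ratio))
```

The printed values start at full transmission for zero gap and fall off rapidly once the gap approaches a wavelength.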
Figure 4.7 shows a plot of the transmittance (4.22) calculated in Example 4.3. Notice that the transmittance is 100% when the two prisms are brought together as expected. That is, $T_p^{\rm tot}(d = 0) = 1$. When the prisms are about a wavelength apart, the transmittance is significantly reduced, and as the distance gets large compared to a wavelength, the transmittance quickly goes to zero ($T_p^{\rm tot}(d/\lambda_{\rm vac} \gg 1) \approx 0$).

4.4 Fabry-Perot Instrument
In the 1890s, Charles Fabry realized that a double interface could be used to distinguish wavelengths of light that are very close together. He and a talented experimentalist colleague, Alfred Perot, constructed an instrument and began to use it to make measurements on various spectral sources. The Fabry-Perot instrument⁶ consists of two identical (parallel) surfaces separated by spacing d. We can use our analysis in section 4.2 to describe this instrument. For simplicity, we choose the refractive index before the initial surface and after the final surface to be the same (i.e. $n_0 = n_2$). We assume that the transmission angles are such that total internal reflection is avoided. The transmission through the device depends on the exact spacing between the two surfaces, the reflectance of the surfaces, as well as on the wavelength of the light.

Maurice Paul Auguste Charles Fabry (1867-1945, French) was born in Marseille, France. At age 18, he entered the École Polytechnique in Paris where he studied for two years. Following that, he spent a number of years teaching state secondary school while simultaneously working on a doctoral dissertation on interference phenomena. After completing his doctorate, he began working as a lecturer and laboratory assistant at the University of Marseille where a decade later he was appointed a professor of physics. Soon after his arrival to the University of Marseille, Fabry began a long and fruitful collaboration with Alfred Perot (1863-1925). Fabry focused on theoretical analysis and measurements while his colleague did the design work and construction of their new interferometer, which they continually improved over the years. During his career, Fabry made significant contributions to spectroscopy and astrophysics and is credited with co-discovery of the ozone layer. See J. F. Mulligan, "Who were Fabry and Perot?," Am. J. Phys. 66, 797-802 (1998).
6 M. Born and E. Wolf, Principles of Optics, 7th ed., Sect. 7.6.2 (Cambridge University Press, 1999).
If the spacing d separating the two parallel surfaces is adjustable, the instru-
ment is called a Fabry-Perot interferometer. If the spacing is fixed while the angle
of the incident light is varied, the instrument is called a Fabry-Perot etalon. An
etalon can therefore be as simple as a piece of glass with parallel surfaces. Some-
times, a thin optical membrane called a pellicle is used as an etalon (occasionally
inserted into laser cavities to discriminate against certain wavelengths). However,
to achieve sharp discrimination between closely-spaced wavelengths, a relatively
large spacing d is desirable.
As we previously derived (4.15), the transmittance through a double boundary
is
$$T^{tot} = \frac{T^{max}}{1 + F\sin^2\left(\frac{\Phi}{2}\right)} \tag{4.23}$$
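To make (4.23) concrete, here is a short Python sketch of the transmittance function. It borrows the definitions $F = 4R/(1-R)^2$ and $\Phi = 4\pi n_1 d\cos\theta_1/\lambda_{vac} + 2\delta_r$ that this chapter quotes in its review problems; the function names and the sample numbers (F = 100, 500 nm light, 1 cm spacing, $\delta_r = 0$) are illustrative assumptions.

```python
import math

def coefficient_of_finesse(R):
    """F = 4R / (1 - R)^2 for surfaces with reflectance R."""
    return 4 * R / (1 - R)**2

def round_trip_phase(d, wavelength_vac, n1=1.0, theta1=0.0, delta_r=0.0):
    """Phi = 4 pi n1 d cos(theta1) / lambda_vac + 2 delta_r."""
    return 4 * math.pi * n1 * d * math.cos(theta1) / wavelength_vac + 2 * delta_r

def transmittance(phi, F, T_max=1.0):
    """Equation (4.23): T_tot = T_max / (1 + F sin^2(Phi / 2))."""
    return T_max / (1 + F * math.sin(phi / 2)**2)

if __name__ == "__main__":
    lam, d0, F = 500e-9, 1e-2, 100.0          # assumed sample values
    for shift in (0.0, lam / 4, lam / 2):     # change the spacing by fractions of a wavelength
        phi = round_trip_phase(d0 + shift, lam)
        print(f"extra spacing {shift * 1e9:6.1f} nm   T = {transmittance(phi, F):.4f}")
```

With these numbers the transmittance starts at a peak, drops to $1/(1+F)$ after a quarter-wavelength change in spacing, and returns to a peak after a half wavelength.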
details of the coatings, we can say that each coating has a certain reflectance R
and transmittance T . However, as light goes through a coating, it can also be
attenuated because of absorption. In this case, we have
$$R + T + A = 1 \tag{4.27}$$
that it is normal to the surfaces. It is critical for the two surfaces of the interferometer
to be extremely close to parallel. When aligned correctly, the transmission
of a collimated beam will blink all together as the spacing d is changed (by tiny
amounts). A mechanical actuator can be used to vary the spacing between the
plates while the transmittance is observed on a detector. To make the alignment
of the instrument somewhat less critical, a small aperture can be placed in front
of the detector so that it observes only a small portion of the beam.

The transmittance as a function of plate separation is shown in Fig. 4.11. In
this case, $\Phi$ varies via changes in d (see (4.26) with $\cos\theta_1 = 1$ and fixed wavelength).
As the spacing is increased by only a half wavelength, the transmittance
changes through a complete period. The various peaks in the figure are called
fringes.

Figure 4.11 Transmittance as the separation d is varied (F = 100).

The setup for a Fabry-Perot etalon is similar to that of the interferometer
except that the spacing d remains fixed. Often the two surfaces in the etalon are
held parallel to each other by a precision spacer. An advantage to the Fabry-Perot
etalon (as opposed to the interferometer) is that no moving parts are needed. To
make measurements with an etalon, the angle of the light is varied rather than the
plate separation. After all, to see fringes, we just need to cause $\Phi$ in (4.23) to vary
in some way. According to (4.26), we can do that as easily by varying $\theta_1$ as we can
by varying d. One way to obtain a range of angles is to observe light from a point
source, as depicted in Fig. 4.12. Different portions of the beam go through the
device at different angles. When aligned straight on, the transmitted light forms a
bull's-eye pattern on a screen.

Figure 4.12 A diverging monochromatic beam traversing a Fabry-Perot etalon.
(The angle of divergence is exaggerated.)
In Fig. 4.13 we graph the transmittance $T^{tot}$ (4.23) as a function of angle
(holding $\lambda_{vac}$ = 500 nm and d = 1 cm fixed). Since $\cos\theta_1$ is not a linear function,
the spacing of the peaks varies with angle. As $\theta_1$ increases from zero, the cosine
steadily decreases, causing $\Phi$ to decrease. Each time $\Phi$ decreases by $2\pi$ we get a
new peak. Not surprisingly, only a modest change in angle is necessary to cause
the transmittance to vary from maximum to minimum, or vice versa.

Figure 4.13 Transmittance through a Fabry-Perot etalon (F = 10) as the angle $\theta_1$
is varied. It is assumed that the distance d is chosen such that $\Phi$ is a multiple of
$2\pi$ when the angle is zero.

The bull's-eye pattern in Fig. 4.12 can be understood as the curve in Fig. 4.13
rotated about a circle. Depending on the exact spacing between the plates, the
angles where the fringes occur can be different. For example, the center spot
could be dark.

Spectroscopic samples often are not compact point-like sources. Rather, they
are extended diffuse sources. The point-source setup shown in Fig. 4.12 won't
work for extended sources unless all of the light at the sample is blocked except
for a tiny point. This is impractical if there remains insufficient illumination at
the final screen for observation.
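The angular positions of the bull's-eye rings described above can be estimated with a few lines of Python. This is a sketch under stated assumptions: the center of the pattern is taken to be a bright fringe, and the reflection phase $\delta_r$ is treated as independent of angle, so that each successive ring corresponds to $\Phi$ dropping by $2\pi$, i.e. $\cos\theta_m = 1 - m\lambda_{vac}/(2 n_1 d)$. The function name is hypothetical.

```python
import math

def bright_fringe_angles(wavelength_vac, d, n1=1.0, count=5):
    """Angles (radians) of the first `count` bright rings, m = 0 being the center.

    Assumes Phi is a multiple of 2*pi at theta = 0 and that the reflection
    phase does not change with angle.
    """
    angles = []
    for m in range(count):
        cos_theta = 1 - m * wavelength_vac / (2 * n1 * d)
        angles.append(math.acos(cos_theta))
    return angles

if __name__ == "__main__":
    # Parameters quoted in the text for Fig. 4.13: 500 nm light, d = 1 cm.
    for m, theta in enumerate(bright_fringe_angles(500e-9, 1e-2)):
        print(f"ring {m}: {math.degrees(theta):.4f} degrees")
```

For small angles the ring angles grow as the square root of the ring number, which is why the fringes crowd together farther from the center.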
In order to preserve as much light as possible, we can sandwich the etalon
between two lenses. We place the diffuse source at the focal plane of the first lens.
We place the screen at the focal plane of the second lens. This causes an image of
the source to appear on the screen.⁷ Each point of the diffuse source is mapped
to a corresponding point on the screen. Moreover, the light associated with any
particular point of the source travels as a unique collimated beam in the region
between the lenses. Each collimated beam traverses the etalon with a specific
angle. Thus, light associated with each emission point traverses the etalon with
higher or lower transmittance, according to the differing angles. The result is that
a bull's-eye pattern becomes superimposed on the image of the diffuse source.
The lens and retina of your eye can be used for the final lens and screen.

Figure 4.14 Setup of a Fabry-Perot etalon for looking at a diffuse source
(diffuse source, lens, etalon, lens, screen).

7 If the diffuse source has the shape of Mickey Mouse, then an image of Mickey Mouse appears
on the screen. Imaging techniques are discussed in chapter 9.
the wavelength $\lambda_{vac} + \Delta\lambda$ (all else remaining the same), (4.26) shifts to
$$\Phi - \Delta\Phi = \frac{4\pi n_1 d\cos\theta_1}{\lambda_{vac} + \Delta\lambda} + 2\delta_r \tag{4.28}$$
The change in wavelength $\Delta\lambda$ is usually very small compared to $\lambda_{vac}$, so we can
represent the denominator with a truncated Taylor-series expansion:
$$\frac{1}{\lambda_{vac} + \Delta\lambda} = \frac{1}{\lambda_{vac}\left(1 + \Delta\lambda/\lambda_{vac}\right)} \cong \frac{1 - \Delta\lambda/\lambda_{vac}}{\lambda_{vac}} \tag{4.29}$$
The amount that $\Phi$ changes is then seen to be
$$\Delta\Phi = \frac{4\pi n_1 d\cos\theta_1}{\lambda_{vac}^2}\,\Delta\lambda \tag{4.30}$$
If the change in wavelength is enough to cause $\Delta\Phi = 2\pi$, the fringes in Fig. 4.15
shift through a whole period, and the picture looks the same.

Figure 4.15 Transmittance as the spacing d is varied for two different wavelengths
(F = 100). The solid line plots the transmittance of light with a wavelength
of $\lambda_{vac}$, and the dashed line plots the transmittance of a wavelength
shorter than $\lambda_{vac}$. Note that the fringes shift positions for different wavelengths.

This brings up an important limitation of the instrument. If the fringes shift
by too much, we might become confused as to what exactly has changed, owing
to the periodic nature of the fringes. If two wavelengths aren't sufficiently close,
the fringes of one wavelength may be shifted past several fringes of the other
wavelength, and we will not be able to tell by how much they differ.
This introduces the concept of free spectral range, which is the wavelength
change $\Delta\lambda_{FSR}$ that causes the fringes to shift through one period. We find this by
setting (4.30) equal to $2\pi$. After rearranging, we get
$$\Delta\lambda_{FSR} = \frac{\lambda_{vac}^2}{2 n_1 d\cos\theta_1} \tag{4.31}$$
The free spectral range tends to be extremely narrow; a Fabry-Perot instrument is
not well suited for measuring wavelength ranges wider than this. In summary, the
free spectral range is the largest change in wavelength permissible while avoiding
confusion. To convert this wavelength difference $\Delta\lambda_{FSR}$ into a corresponding
frequency difference, one differentiates $\nu = c/\lambda_{vac}$ to get
$$\left|\Delta\nu_{FSR}\right| = \frac{c\,\Delta\lambda_{FSR}}{\lambda_{vac}^2} \tag{4.32}$$
8 M. Born and E. Wolf, Principles of Optics, 7th ed., Sect. 7.6.3 (Cambridge University Press, 1999).
Example 4.4
A Fabry-Perot interferometer has plate spacing d = 1 cm and index $n_1 = 1$. If it
is used in the neighborhood of $\lambda_{vac}$ = 500 nm, find the free spectral range of the
instrument.
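A quick way to evaluate (4.31) and (4.32) for these numbers is sketched below in Python. Normal incidence ($\cos\theta_1 = 1$) is assumed here because the example does not specify an angle; the function names are illustrative.

```python
import math

C = 299_792_458.0  # speed of light in vacuum (m/s)

def free_spectral_range(wavelength_vac, d, n1=1.0, theta1=0.0):
    """Equation (4.31): wavelength change that shifts the fringes by one period."""
    return wavelength_vac**2 / (2 * n1 * d * math.cos(theta1))

def fsr_in_frequency(wavelength_vac, delta_lambda_fsr):
    """Equation (4.32): the same range expressed as a frequency difference."""
    return C * delta_lambda_fsr / wavelength_vac**2

if __name__ == "__main__":
    lam, d = 500e-9, 1e-2
    fsr = free_spectral_range(lam, d)
    print(f"free spectral range: {fsr * 1e9:.4f} nm")
    print(f"in frequency:        {fsr_in_frequency(lam, fsr) / 1e9:.2f} GHz")
```

The wavelength result, 0.0125 nm, is the value carried into Example 4.5.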
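In solving (4.34) for $\Phi_{FWHM}$, we see that this equation requires
$$F\sin^2\left(\frac{\Phi_{FWHM}}{4}\right) = 1 \tag{4.35}$$
where we have taken advantage of the fact that $\Phi$ at the center of a peak is assumed
to be a multiple of $2\pi$. Next, we suppose that $\Phi_{FWHM}/4$ is rather small so that we may
represent the sine by its argument. With this approximation, (4.35) simplifies to
$$\Phi_{FWHM} = \frac{4}{\sqrt{F}}. \tag{4.36}$$
The ratio of the period between peaks $2\pi$ to the width $\Phi_{FWHM}$ of individual peaks
is called the reflecting finesse (or just finesse):
$$f \equiv \frac{2\pi}{\Phi_{FWHM}} = \frac{\pi\sqrt{F}}{2} \tag{4.37}$$
$$\Delta\lambda_{FWHM} = \frac{\Delta\lambda_{FSR}}{f} = \frac{\lambda_{vac}^2}{\pi n_1 d\cos\theta_1\sqrt{F}} \tag{4.38}$$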
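As a final note, the ratio of $\lambda_{vac}$ to $\Delta\lambda_{FWHM}$, where $\Delta\lambda_{FWHM}$ is the minimum
change of wavelength that the instrument can distinguish in the neighborhood of
$\lambda_{vac}$, is called the resolving power:
$$RP \equiv \frac{\lambda_{vac}}{\Delta\lambda_{FWHM}} \tag{4.39}$$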
Fabry-Perot instruments tend to have very high resolving powers as the following
example illustrates.
Example 4.5
If the Fabry-Perot interferometer in Example 4.4 has reflectivity R = 0.85, find the
finesse, the minimum distinguishable wavelength separation, and the resolving
power.
$$\Delta\lambda_{FWHM} = \frac{\Delta\lambda_{FSR}}{f} = \frac{0.0125\ \text{nm}}{19} = 0.00065\ \text{nm} \tag{4.40}$$
The instrument can distinguish two wavelengths separated by this tiny amount,
which gives an impressive resolving power of
$$RP = \frac{\lambda_{vac}}{\Delta\lambda_{FWHM}} = \frac{500\ \text{nm}}{0.00065\ \text{nm}} = 772{,}000$$
For comparison, the resolving power of a typical grating spectrometer is much less
(a few thousand). However, a grating spectrometer has the advantage that it can
simultaneously observe wavelengths over hundreds of nanometers, whereas the
Fabry-Perot instrument is confined to the extremely narrow free spectral range.
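The numbers in Example 4.5 can be reproduced with the short Python sketch below. It uses (4.37) through (4.39) together with $F = 4R/(1-R)^2$, the definition quoted in this chapter's review problems; the function name is illustrative. Carrying unrounded intermediate values explains why the quoted resolving power is 772,000 rather than the 769,000 obtained from the rounded 0.00065 nm.

```python
import math

def finesse(R):
    """Reflecting finesse f = pi sqrt(F) / 2 with F = 4R / (1 - R)^2."""
    F = 4 * R / (1 - R)**2
    return math.pi * math.sqrt(F) / 2

if __name__ == "__main__":
    R = 0.85                 # reflectivity from Example 4.5
    wavelength_nm = 500.0    # from Example 4.4
    fsr_nm = 0.0125          # free spectral range from Example 4.4
    f = finesse(R)
    fwhm_nm = fsr_nm / f                 # equation (4.38)
    rp = wavelength_nm / fwhm_nm         # equation (4.39)
    print(f"finesse          f  = {f:.1f}")
    print(f"minimum separation  = {fwhm_nm:.5f} nm")
    print(f"resolving power  RP = {rp:,.0f}")
```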
where N denotes the number of layers in the coating. The subscript 0 represents
the initial medium outside of the multilayer, and the subscript N + 1 represents
the final material, or the substrate on which the layers are deposited.
9 G. R. Fowles, Introduction to Modern Optics, 2nd ed., Sect 4.4 (New York: Dover, 1975); E. Hecht,
In each layer, only two plane waves exist, each of which is composed of light
arising from the many possible bounces from various layer interfaces. The arrows
pointing right indicate plane wave fields in individual layers that travel roughly
in the forward (incident) direction, and the arrows pointing left indicate plane
wave fields that travel roughly in the backward (reflected) direction. In the final
region, there is only one plane wave traveling with a forward direction ($\overrightarrow{E}_{N+1}^{(p)}$),
which gives the overall transmitted field.
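As we have studied in chapter 3 (see (3.9) and (3.13)), the boundary conditions
for the parallel components of the E field and for the parallel components of the
B field lead respectively to
$$\cos\theta_0\left(\overrightarrow{E}_0^{(p)} + \overleftarrow{E}_0^{(p)}\right) = \cos\theta_1\left(\overrightarrow{E}_1^{(p)} + \overleftarrow{E}_1^{(p)}\right) \tag{4.42}$$
and
$$n_0\left(\overrightarrow{E}_0^{(p)} - \overleftarrow{E}_0^{(p)}\right) = n_1\left(\overrightarrow{E}_1^{(p)} - \overleftarrow{E}_1^{(p)}\right) \tag{4.43}$$
Similar equations give the field connection for s-polarized light (see (3.8) and
(3.14)).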
We have applied these boundary conditions at the first interface only. Of
course there are many more interfaces in the multilayer. For the connection
between the j th layer and the next, we may similarly write
$$\cos\theta_j\left(\overrightarrow{E}_j^{(p)} e^{i k_j d_j\cos\theta_j} + \overleftarrow{E}_j^{(p)} e^{-i k_j d_j\cos\theta_j}\right) = \cos\theta_{j+1}\left(\overrightarrow{E}_{j+1}^{(p)} + \overleftarrow{E}_{j+1}^{(p)}\right) \tag{4.44}$$
and
$$n_j\left(\overrightarrow{E}_j^{(p)} e^{i k_j d_j\cos\theta_j} - \overleftarrow{E}_j^{(p)} e^{-i k_j d_j\cos\theta_j}\right) = n_{j+1}\left(\overrightarrow{E}_{j+1}^{(p)} - \overleftarrow{E}_{j+1}^{(p)}\right) \tag{4.45}$$
Here we have set the origin within each layer at the left surface. Then when
making the connection with the subsequent layer at the right surface, we must
specifically take into account the phase $\mathbf{k}_j\cdot\left(d_j\hat{\mathbf{z}}\right) = k_j d_j\cos\theta_j$. This corresponds
to the phase acquired by the plane wave field in traversing the layer with thickness
$d_j$. The right-hand sides of (4.44) and (4.45) need no phase adjustment since the
$(j+1)$th field is evaluated on the left side of its layer.
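and
$$n_N\left(\overrightarrow{E}_N^{(p)} e^{i k_N d_N\cos\theta_N} - \overleftarrow{E}_N^{(p)} e^{-i k_N d_N\cos\theta_N}\right) = n_{N+1}\overrightarrow{E}_{N+1}^{(p)} \tag{4.47}$$
where
$$\beta_j \equiv \begin{cases} 0 & j = 0 \\ k_j d_j\cos\theta_j & 1 \le j \le N \end{cases} \tag{4.49}$$
and
$$\overleftarrow{E}_{N+1}^{(p)} \equiv 0 \tag{4.50}$$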
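(It would be good to take a moment to convince yourself that this set of matrix
equations properly represents (4.42) through (4.47) before proceeding.) We rewrite (4.48)
as
$$\begin{bmatrix} \overrightarrow{E}_j^{(p)} \\ \overleftarrow{E}_j^{(p)} \end{bmatrix} = \begin{bmatrix} \cos\theta_j e^{i\beta_j} & \cos\theta_j e^{-i\beta_j} \\ n_j e^{i\beta_j} & -n_j e^{-i\beta_j} \end{bmatrix}^{-1} \begin{bmatrix} \cos\theta_{j+1} & \cos\theta_{j+1} \\ n_{j+1} & -n_{j+1} \end{bmatrix} \begin{bmatrix} \overrightarrow{E}_{j+1}^{(p)} \\ \overleftarrow{E}_{j+1}^{(p)} \end{bmatrix} \tag{4.51}$$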
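Keep in mind that (4.51) represents a distinct matrix equation for each different
j. We can substitute the j = 1 equation into the j = 0 equation to get
$$\begin{bmatrix} \overrightarrow{E}_0^{(p)} \\ \overleftarrow{E}_0^{(p)} \end{bmatrix} = \begin{bmatrix} \cos\theta_0 & \cos\theta_0 \\ n_0 & -n_0 \end{bmatrix}^{-1} M_1^{(p)} \begin{bmatrix} \cos\theta_2 & \cos\theta_2 \\ n_2 & -n_2 \end{bmatrix} \begin{bmatrix} \overrightarrow{E}_2^{(p)} \\ \overleftarrow{E}_2^{(p)} \end{bmatrix} \tag{4.52}$$
where we have grouped the matrices related to the j = 1 layer together via
$$M_1^{(p)} \equiv \begin{bmatrix} \cos\theta_1 & \cos\theta_1 \\ n_1 & -n_1 \end{bmatrix} \begin{bmatrix} \cos\theta_1 e^{i\beta_1} & \cos\theta_1 e^{-i\beta_1} \\ n_1 e^{i\beta_1} & -n_1 e^{-i\beta_1} \end{bmatrix}^{-1} \tag{4.53}$$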
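We can continue to substitute into this equation progressively higher order equations
(i.e. for j = 2, j = 3, ... ) until we reach the j = N layer. All together this will
give
$$\begin{bmatrix} \overrightarrow{E}_0^{(p)} \\ \overleftarrow{E}_0^{(p)} \end{bmatrix} = \begin{bmatrix} \cos\theta_0 & \cos\theta_0 \\ n_0 & -n_0 \end{bmatrix}^{-1} \left( \prod_{j=1}^{N} M_j^{(p)} \right) \begin{bmatrix} \cos\theta_{N+1} & \cos\theta_{N+1} \\ n_{N+1} & -n_{N+1} \end{bmatrix} \begin{bmatrix} \overrightarrow{E}_{N+1}^{(p)} \\ 0 \end{bmatrix} \tag{4.54}$$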
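where the matrices related to the jth layer are grouped together according to
$$M_j^{(p)} \equiv \begin{bmatrix} \cos\theta_j & \cos\theta_j \\ n_j & -n_j \end{bmatrix} \begin{bmatrix} \cos\theta_j e^{i\beta_j} & \cos\theta_j e^{-i\beta_j} \\ n_j e^{i\beta_j} & -n_j e^{-i\beta_j} \end{bmatrix}^{-1} = \begin{bmatrix} \cos\beta_j & -i\sin\beta_j\cos\theta_j/n_j \\ -i n_j\sin\beta_j/\cos\theta_j & \cos\beta_j \end{bmatrix} \tag{4.55}$$
The matrix inversion in the first line was performed using (0.35). The symbol $\prod$
signifies the product of the matrices with the lowest subscripts on the left:
$$\prod_{j=1}^{N} M_j^{(p)} \equiv M_1^{(p)} M_2^{(p)} \cdots M_N^{(p)} \tag{4.56}$$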
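As a finishing touch, we divide (4.54) by the incident field $\overrightarrow{E}_0^{(p)}$ as well as perform
the matrix inversion on the right-hand side to obtain
$$\begin{bmatrix} 1 \\ \overleftarrow{E}_0^{(p)} / \overrightarrow{E}_0^{(p)} \end{bmatrix} = A^{(p)} \begin{bmatrix} \overrightarrow{E}_{N+1}^{(p)} / \overrightarrow{E}_0^{(p)} \\ 0 \end{bmatrix} \tag{4.57}$$
where
$$A^{(p)} \equiv \begin{bmatrix} a_{11}^{(p)} & a_{12}^{(p)} \\ a_{21}^{(p)} & a_{22}^{(p)} \end{bmatrix} = \frac{1}{2 n_0\cos\theta_0} \begin{bmatrix} n_0 & \cos\theta_0 \\ n_0 & -\cos\theta_0 \end{bmatrix} \left( \prod_{j=1}^{N} M_j^{(p)} \right) \begin{bmatrix} \cos\theta_{N+1} & 0 \\ n_{N+1} & 0 \end{bmatrix} \tag{4.58}$$
In the final matrix in (4.58) we have replaced the entries in the right column with
zeros. This is permissible since it operates on a column vector with zero in the
bottom component.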
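Equation (4.57) represents two equations, which must be solved simultaneously
to find the ratios $\overleftarrow{E}_0^{(p)}/\overrightarrow{E}_0^{(p)}$ and $\overrightarrow{E}_{N+1}^{(p)}/\overrightarrow{E}_0^{(p)}$. Once the matrix $A^{(p)}$ is computed,
this is a relatively simple task:
$$t_p^{tot} \equiv \frac{\overrightarrow{E}_{N+1}^{(p)}}{\overrightarrow{E}_0^{(p)}} = \frac{1}{a_{11}^{(p)}} \quad \text{(Multilayer)} \tag{4.59}$$
$$r_p^{tot} \equiv \frac{\overleftarrow{E}_0^{(p)}}{\overrightarrow{E}_0^{(p)}} = \frac{a_{21}^{(p)}}{a_{11}^{(p)}} \quad \text{(Multilayer)} \tag{4.60}$$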
The convenience of this notation lies in the fact that we can deal with an
arbitrary number of layers N with varying thickness and index. The essential
information for each layer is contained succinctly in its respective 2 × 2 characteristic
matrix M. To find the overall effect of the many layers, we need only
multiply the matrices for each layer together to find A from which we compute
the reflection and transmission coefficients for the whole system.
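The recipe above (build each $M_j^{(p)}$ from (4.55), multiply the matrices, form $A^{(p)}$ from (4.58), then read off (4.59) and (4.60)) can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function names and argument layout are assumptions, and the 2 × 2 products are written out by hand so that only the standard library is needed.

```python
import cmath
import math

def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]],
            [a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]]]

def multilayer_p(n, d, theta0, wavelength_vac):
    """Return (t_p_tot, r_p_tot) for p-polarized light.

    n: indices [n_0, n_1, ..., n_N, n_{N+1}]; d: thicknesses [d_1, ..., d_N].
    """
    sin0 = math.sin(theta0)
    # Snell's law fixes cos(theta_j) in every medium (complex if evanescent).
    cos = [cmath.sqrt(1 - (n[0] * sin0 / nj)**2) for nj in n]
    product = [[1, 0], [0, 1]]
    for j in range(1, len(n) - 1):
        beta = 2 * math.pi * n[j] * d[j - 1] * cos[j] / wavelength_vac   # (4.49)
        m_j = [[cmath.cos(beta), -1j * cmath.sin(beta) * cos[j] / n[j]],
               [-1j * n[j] * cmath.sin(beta) / cos[j], cmath.cos(beta)]]  # (4.55)
        product = matmul2(product, m_j)            # lowest subscripts on the left
    left = [[n[0], cos[0]], [n[0], -cos[0]]]
    right = [[cos[-1], 0], [n[-1], 0]]
    a = matmul2(matmul2(left, product), right)     # (4.58) without the prefactor
    scale = 2 * n[0] * cos[0]
    a11, a21 = a[0][0] / scale, a[1][0] / scale
    return 1 / a11, a21 / a11                      # (4.59) and (4.60)

if __name__ == "__main__":
    lam = 550e-9
    n1 = math.sqrt(1.5)                            # hypothetical quarter-wave coating on glass
    t, r = multilayer_p([1.0, n1, 1.5], [lam / (4 * n1)], 0.0, lam)
    print(f"|r|^2 with coating = {abs(r)**2:.2e}")
    t, r = multilayer_p([1.0, 1.5], [], 0.0, lam)
    print(f"|r|^2 bare glass   = {abs(r)**2:.4f}")
```

With no layers at all, the function reduces to the single-interface Fresnel coefficients, which is a convenient sanity check before adding layers.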
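The derivation for s-polarized light is similar to the above derivation for p-polarized
light. The equation corresponding to (4.57) for s-polarized light turns
out to be
$$\begin{bmatrix} 1 \\ \overleftarrow{E}_0^{(s)} / \overrightarrow{E}_0^{(s)} \end{bmatrix} = A^{(s)} \begin{bmatrix} \overrightarrow{E}_{N+1}^{(s)} / \overrightarrow{E}_0^{(s)} \\ 0 \end{bmatrix} \tag{4.61}$$
where
$$A^{(s)} \equiv \begin{bmatrix} a_{11}^{(s)} & a_{12}^{(s)} \\ a_{21}^{(s)} & a_{22}^{(s)} \end{bmatrix} = \frac{1}{2 n_0\cos\theta_0} \begin{bmatrix} n_0\cos\theta_0 & 1 \\ n_0\cos\theta_0 & -1 \end{bmatrix} \left( \prod_{j=1}^{N} M_j^{(s)} \right) \begin{bmatrix} 1 & 0 \\ n_{N+1}\cos\theta_{N+1} & 0 \end{bmatrix} \tag{4.62}$$
and
$$M_j^{(s)} = \begin{bmatrix} \cos\beta_j & -i\sin\beta_j/\left(n_j\cos\theta_j\right) \\ -i n_j\cos\theta_j\sin\beta_j & \cos\beta_j \end{bmatrix} \tag{4.63}$$
The transmission and reflection coefficients are found (as before) from
$$t_s^{tot} \equiv \frac{\overrightarrow{E}_{N+1}^{(s)}}{\overrightarrow{E}_0^{(s)}} = \frac{1}{a_{11}^{(s)}} \quad \text{(Multilayer)} \tag{4.64}$$
$$r_s^{tot} \equiv \frac{\overleftarrow{E}_0^{(s)}}{\overrightarrow{E}_0^{(s)}} = \frac{a_{21}^{(s)}}{a_{11}^{(s)}} \quad \text{(Multilayer)} \tag{4.65}$$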
Sometimes multilayer coatings are made with repeated stacks of layers. If
the same series of layers in (4.66) is repeated many times, say q times, Sylvester's
theorem (see section 0.3) can come in handy. A block of matrices, corresponding
to a repeated pattern within the stack, can be conveniently taken to any power.
Sylvester's theorem requires that the determinant of the matrix be equal to one,
which is true for matrices of the form (4.55) and (4.63) or any product of them.

Figure 4.18 A repeated multilayer structure with alternating high and low indexes
where each layer is a quarter wavelength in thickness. This structure can achieve
very high reflectance.

It is common for high-reflection coatings to be designed with alternating high
and low refractive indices. For high reflectivity, each layer should have a quarter-wave
thickness. Since the layers alternate high and low indices, at every other
boundary there is a phase shift of $\pi$ upon reflection from the interface. Hence,
the quarter wavelength spacing is appropriate to give constructive interference in
the reflected direction.
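Example 4.6

Derive the reflection and transmission coefficients for p-polarized light interacting
with a high reflector constructed using a $\lambda/4$ stack.

Each layer in a $\lambda/4$ stack has thickness
$$d_j = \frac{\lambda_{vac}}{4 n_j\cos\theta_j}$$
so that $\beta_j = \pi/2$ and the characteristic matrix (4.55) becomes
$$M_j^{(p)} = \begin{bmatrix} 0 & -i\cos\theta_j/n_j \\ -i n_j/\cos\theta_j & 0 \end{bmatrix}$$
The matrices for a high and a low refractive index layer are multiplied together in
the usual manner. Each layer pair takes the form
$$\begin{bmatrix} 0 & -\frac{i\cos\theta_H}{n_H} \\ -\frac{i n_H}{\cos\theta_H} & 0 \end{bmatrix} \begin{bmatrix} 0 & -\frac{i\cos\theta_L}{n_L} \\ -\frac{i n_L}{\cos\theta_L} & 0 \end{bmatrix} = \begin{bmatrix} -\frac{n_L\cos\theta_H}{n_H\cos\theta_L} & 0 \\ 0 & -\frac{n_H\cos\theta_L}{n_L\cos\theta_H} \end{bmatrix}$$
For q identical layer pairs (N = 2q), the product of matrices is
$$\prod_{j=1}^{N} M_j^{(p)} = \begin{bmatrix} -\frac{n_L\cos\theta_H}{n_H\cos\theta_L} & 0 \\ 0 & -\frac{n_H\cos\theta_L}{n_L\cos\theta_H} \end{bmatrix}^q = \begin{bmatrix} \left(-\frac{n_L\cos\theta_H}{n_H\cos\theta_L}\right)^q & 0 \\ 0 & \left(-\frac{n_H\cos\theta_L}{n_L\cos\theta_H}\right)^q \end{bmatrix}$$
With $A^{(p)}$ in hand, we can now calculate the transmission coefficient from (4.59)
$$t_p^{tot} = \frac{2}{\left(-\frac{n_L\cos\theta_H}{n_H\cos\theta_L}\right)^q \frac{\cos\theta_{N+1}}{\cos\theta_0} + \left(-\frac{n_H\cos\theta_L}{n_L\cos\theta_H}\right)^q \frac{n_{N+1}}{n_0}} \quad (\lambda/4 \text{ stack, p-polarized}) \tag{4.67}$$
and the reflection coefficient from (4.60)
$$r_p^{tot} = \frac{\left(-\frac{n_L\cos\theta_H}{n_H\cos\theta_L}\right)^q \frac{\cos\theta_{N+1}}{\cos\theta_0} - \left(-\frac{n_H\cos\theta_L}{n_L\cos\theta_H}\right)^q \frac{n_{N+1}}{n_0}}{\left(-\frac{n_L\cos\theta_H}{n_H\cos\theta_L}\right)^q \frac{\cos\theta_{N+1}}{\cos\theta_0} + \left(-\frac{n_H\cos\theta_L}{n_L\cos\theta_H}\right)^q \frac{n_{N+1}}{n_0}} \quad (\lambda/4 \text{ stack, p-polarized}) \tag{4.68}$$

Figure 4.19 The transmission and reflection coefficients for a quarter wave stack
as q is varied ($n_L = 1.38$ and $n_H = 2.32$).

The quarter-wave multilayer considered in Example 4.6 can achieve extraordinarily
high reflectivity. In the limit of $q \to \infty$, we have $t_p \to 0$ and $r_p \to -1$ (see
Fig. 4.19), giving 100% reflection with a $\pi$ phase shift.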
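The approach to total reflection can be seen numerically by evaluating (4.67) and (4.68) for increasing q. The Python sketch below does this at normal incidence (all cosines equal to one) with the layer indices from Fig. 4.19. The incident index of 1 and the substrate index of 1.5 are assumptions made for illustration, as is the function name.

```python
def quarter_wave_stack_p(q, n_H=2.32, n_L=1.38, n_0=1.0, n_sub=1.5):
    """Return (t, r) from (4.67) and (4.68) at normal incidence.

    n_0 and n_sub (incident medium and substrate) are assumed values.
    """
    a = (-n_L / n_H) ** q            # upper-left element of the matrix product
    b = (-n_H / n_L) ** q            # lower-right element of the matrix product
    denom = a + b * n_sub / n_0
    t = 2 / denom
    r = (a - b * n_sub / n_0) / denom
    return t, r

if __name__ == "__main__":
    for q in (1, 2, 4, 8):
        t, r = quarter_wave_stack_p(q)
        print(f"q = {q}:  t = {t:+.4f}   r = {r:+.4f}")
```

At normal incidence the two results satisfy $r^2 + (n_{N+1}/n_0)\,t^2 = 1$, so the reflected and transmitted power always account for all of the incident light.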
Exercises
P4.3 Verify that (4.14) simplifies to (4.15) assuming $\theta_1$ and $\theta_2$ are real.
P4.4 A light wave impinges at normal incidence on a thin glass plate with
index n and thickness d .
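(a) Show that the transmittance through the plate is
$$T^{tot} = \frac{1}{1 + \frac{\left(n^2 - 1\right)^2}{4n^2}\sin^2\left(\frac{2\pi n d}{\lambda_{vac}}\right)}$$
HINT: Find
$$r^{1\to 2} = r^{1\to 0} = -r^{0\to 1} = \frac{n - 1}{n + 1}$$
and then use
$$T^{0\to 1} = 1 - R^{0\to 1}, \qquad T^{1\to 2} = 1 - R^{1\to 2}$$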
(b) If n = 1.5, what is the maximum and minimum possible transmit-
tance through the plate?
(c) If the plate thickness is d = 150 μm (same index as part (b)), what
wavelengths transmit with maximum throughput? Express your answer
as a formula involving an integer m.
P4.5 Show that the maximum reflectance possible from the front coating
in Example 4.2 is 46%. Find the smallest possible d that accomplishes
this for light with wavelength $\lambda_{vac}$ = 633 nm.
P4.6 Re-compute (4.22) for the case of s-polarized light. Write the result in
the same form as the final expression in (4.22).
Answer: $T_s^{tot} = \dfrac{1.44}{e^{4.44 d/\lambda} + e^{-4.44 d/\lambda} - 0.560}$

Figure 4.21

(a) Use a computer to plot the transmittance through the gap (i.e. the
result of P4.6) as a function of separation d (normal to gap surface).
Neglect reflections from other surfaces of the prisms.
(b) Measure the transmittance of the microwaves through the gap as
a function of spacing d (normal to the surface) and superimpose the
results on the graph of part (a). Figure 4.20 shows a plot of typical data
taken with this setup. HINT: Ignore surface reflections by normalizing
the measured power to a value of 1 when d = 0.

Figure 4.20 Theoretical vs. measured microwave transmission through wax prisms
(horizontal axis: separation in cm). Mismatch is presumably due to imperfections
in microwave collimation and/or extraneous reflections.
P4.9 Generate a plot like Fig. 4.13, showing the fringes you get in a Fabry-Perot
etalon when $\theta_1$ is varied. Let $T^{max} = 1$, F = 10, $\lambda_{vac}$ = 500 nm,
d = 1 cm, and $n_1 = 1$.
(a) Plot T vs. $\theta_1$ over the angular range used in Fig. 4.13.
(b) Suppose d is slightly different, say 1.00001 cm. Make a plot of T
vs. $\theta_1$ for this situation.
P4.10 Consider the configuration depicted in Fig. 4.12, where the center of the
diverging light beam ($\lambda_{vac}$ = 633 nm) approaches the plates at normal
incidence. Suppose that the spacing of the plates (near d = 0.5 cm) is
just right to cause a bright fringe to occur at the center. Let $n_1 = 1$. Find
the angle for the mth circular bright fringe surrounding the central spot
(the 0th fringe corresponding to the center). HINT: $\cos\theta \cong 1 - \theta^2/2$. The
answer has the form $a\sqrt{m}$; find the value of a.

Figure 4.22 (diagram labels: laser, diverging lens, Fabry-Perot etalon, filter, CCD camera)
P4.14 Show that (4.64) for a single layer (i.e. two interfaces), is equivalent to
(4.11). WARNING: This is more work than it may appear at first.
P4.15 (a) What should be the thickness of the high and the low index layers in
a periodic high-reflector mirror? Let the light be p-polarized and strike
the mirror surface at 45°. Take the indices of the layers to be $n_H = 2.32$
and $n_L = 1.38$, deposited on a glass substrate with index n = 1.5. Let
the wavelength be $\lambda_{vac}$ = 633 nm.
(b) Find the reflectance R with 1, 2, 4, and 8 periods in the high-low
stack.
P4.16 Find the high-reflector matrix for s-polarized light that corresponds to
(4.66).
(b) Show that for a two-coating arrangement (where $n_1$ and $n_2$ are each
a $\lambda/4$ film), that
$$R = \left(\frac{n_2^2 - n_g n_1^2}{n_2^2 + n_g n_1^2}\right)^2$$
(c) Design anti-reflection coatings using the scheme in (a) and the
scheme in (b). You have a choice of these common coating materials:
ZnS (n = 2.32), CeF₃ (n = 1.63) and MgF₂ (n = 1.38). Find the recipe that
gives you the lowest R in each case. (When considering scheme (b), be
sure to specify which material is $n_1$ and which is $n_2$.)
P4.18 In this problem, we will see that the trick used in P4.17, employing
a bilayer to improve anti-reflection, doesn't get better with repeated
bilayers. Consider a bilayer anti-reflection coating (each coating set for
$\lambda/4$) using $n_1 = 1.38$ and $n_2 = 1.38$ applied to a glass substrate $n_g = 1.50$
at normal incidence. Suppose the coating thicknesses are optimized for
$\lambda_{vac}$ = 550 nm (in the middle of the visible range) and ignore possible
variations of the indices with $\lambda$. Use a computer to plot $R(\lambda_{air})$ for 400
to 700 nm (visible range). Do this for a single bilayer (one layer of each
coating), two bilayers, four bilayers, and 25 bilayers. HINT: You will see
the good AR coating turn into a good HR coating.
Review, Chapters 1-4
R4 T or F: The real part of the refractive index cannot be less than one.
R9 T or F: The critical angle for total internal reflection exists on both sides
of a material interface.
R12 T or F: For incident angles beyond the critical angle for total internal
reflection, the Fresnel coefficients t s and t p are both zero.
R14 T or F: For a given incident angle and value of n, there is only one
single-layer coating thickness d that will minimize reflections.
Problems
R18 Consider an interface between two isotropic media where the incident
field is defined by
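$$\mathbf{E}_i = \left[ E_i^{(p)}\left(\hat{\mathbf{y}}\cos\theta_i - \hat{\mathbf{z}}\sin\theta_i\right) + \hat{\mathbf{x}} E_i^{(s)} \right] e^{i\left[k_i\left(y\sin\theta_i + z\cos\theta_i\right) - \omega_i t\right]}$$
(a) Make substitutions from Snell's law to show what each of these
equations reduces to when $\theta_i = 0$. Express your answers in terms of $n_i$
and $n_t$.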
(b) What percent of light (intensity) reflects from a glass surface (n =
1.5) when light enters from air (n = 1) at normal incidence?
(c) What percent of light reflects from the glass surface when light exits
into air at normal incidence?
R20 Light goes through a glass prism with optical index n = 1.55. The light
enters at Brewster's angle and exits at normal incidence as shown in
Fig. 4.25.
(a) Derive and calculate Brewster's angle $\theta_B$. You may use the results of
R18 (c).
(b) Calculate .
(c) What percent of the light (power) goes all the way through the prism
if it is p-polarized? You may use the expression employed in R19(c).
Figure 4.25
(d) Repeat part (c) for s-polarized light.
R22 A thin glass plate with index n = 1.5 is oriented at Brewster's angle so
that p-polarized light with wavelength $\lambda_{vac}$ = 500 nm goes through
with 100% transmittance.
(a) What is the minimum thickness that will make the reflection of
s-polarized light be maximum?
(b) What is the total transmittance $T_s^{tot}$ for this thickness assuming
s-polarized light?
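$$T^{tot} = \frac{T^{max}}{1 + F\sin^2\left(\frac{\Phi}{2}\right)}, \qquad T^{max} = \frac{T^2}{(1 - R)^2}, \qquad F = \frac{4R}{(1 - R)^2}$$
and
$$\Phi = \frac{4\pi n_1 d}{\lambda_{vac}}\cos\theta_1 + 2\delta_r$$
(a) Derive the free spectral range
$$\Delta\lambda_{FSR} = \frac{\lambda_{vac}^2}{2 n_1 d\cos\theta_1}$$
$$\Delta\lambda_{FWHM} = \frac{\lambda_{vac}^2}{\pi\sqrt{F}\,n_1 d\cos\theta_1}$$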
R24 For a Fabry-Perot etalon, let R = 0.90, $\lambda_{vac}$ = 500 nm, n = 1, and d =
5.0 mm.
(a) Suppose that a maximum transmittance occurs at the angle $\theta = 0$.
What is the nearest angle where the transmittance will be half of the
maximum transmittance? You may assume that $\cos\theta \cong 1 - \theta^2/2$.
(b) You desire to use a Fabry-Perot etalon to view the light from a large
diffuse source rather than a point source. Draw a diagram depicting
where lenses should be placed, indicating relevant distances. Explain
briefly how it works.
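$$\begin{bmatrix} 1 \\ \overleftarrow{E}_0^{(p)} / \overrightarrow{E}_0^{(p)} \end{bmatrix} = A^{(p)} \begin{bmatrix} \overrightarrow{E}_{N+1}^{(p)} / \overrightarrow{E}_0^{(p)} \\ 0 \end{bmatrix}$$
where
$$A^{(p)} = \frac{1}{2 n_0\cos\theta_0} \begin{bmatrix} n_0 & \cos\theta_0 \\ n_0 & -\cos\theta_0 \end{bmatrix} \left( \prod_{j=1}^{N} M_j^{(p)} \right) \begin{bmatrix} \cos\theta_{N+1} & 0 \\ n_{N+1} & 0 \end{bmatrix}$$
$$M_j^{(p)} = \begin{bmatrix} \cos\beta_j & -i\sin\beta_j\cos\theta_j/n_j \\ -i n_j\sin\beta_j/\cos\theta_j & \cos\beta_j \end{bmatrix}, \qquad \beta_j = k_j d_j\cos\theta_j$$
Figure 4.27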
(c) Assuming the parameters in part (b), find the index of refraction n 1
that will make the reflectance be zero.
Selected Answers
To this point, we have considered only isotropic media where the susceptibility
$\chi(\omega)$ (and hence the index of refraction) is the same for all propagation directions
and polarizations. In anisotropic materials, such as crystals, it is possible for
light to experience a different index of refraction depending on the alignment of
the electric field E (i.e. polarization). This difference in the index of refraction
occurs when the direction and strength of the induced dipoles depends on the
lattice structure of the material in addition to the propagating field.1 The unique
properties of anisotropic materials make them important elements in many
optical systems.
In section 5.1 we discuss how to connect E and P in anisotropic media using a
susceptibility tensor. In section 5.2 we apply Maxwell's equations to a plane wave
traveling in a crystal. The analysis leads to Fresnel's equation, which relates the
components of the k-vector to the components of the susceptibility tensor. In
section 5.3 we apply Fresnel's equation to a uniaxial crystal (e.g. quartz, sapphire)
where $\chi_x = \chi_y \neq \chi_z$. In the context of a uniaxial crystal, we show that the Poynting
vector and the k-vector are generally not parallel.
More than a century before Fresnel, Christian Huygens successfully described
birefringence in crystals using the idea of elliptical wavelets. His method gives
the direction of the Poynting vector associated with the extraordinary ray in a
crystal. It was Huygens who coined the term "extraordinary" since one of the
rays in a birefringent material appeared not to obey Snell's law. Actually, the
k-vector always obeys Snell's law, but in a crystal, the k-vector points in a different
direction than the Poynting vector, which delivers the energy seen by an observer.
Huygens' approach is outlined in Appendix 5.D.
NaCl) are highly symmetric and respond to electric fields the same in any direction.
However, at low intensities the response of materials is still linear (or proportional)
to the strength of the electric field. The linear constitutive relation which
connects P to E in a crystal can be expressed in its most general form as
$$\begin{bmatrix} P_x \\ P_y \\ P_z \end{bmatrix} = \epsilon_0 \begin{bmatrix} \chi_{xx} & \chi_{xy} & \chi_{xz} \\ \chi_{yx} & \chi_{yy} & \chi_{yz} \\ \chi_{zx} & \chi_{zy} & \chi_{zz} \end{bmatrix} \begin{bmatrix} E_x \\ E_y \\ E_z \end{bmatrix} \tag{5.1}$$
The matrix in (5.1) is called the susceptibility tensor. To visualize the behavior
of electrons in such a material, we imagine each electron bound as though by
tiny springs with different strengths in different dimensions to represent the
anisotropy (see Fig. 5.1). When an external electric field is applied, the electron
experiences a force that moves it from its equilibrium position. The "springs"
(actually the electric force from ions bound in the crystal lattice) exert a restoring
force, but the restoring force is not equal in all directions; the electron tends to
move more along the dimension of the weaker spring. The displaced electron
creates a microscopic dipole, but the asymmetric restoring force causes P to be in
a direction different than E as depicted in Fig. 5.2.

Figure 5.1 A physical model of an electron bound in a crystal lattice with the
coordinate system specially chosen along the principal axes so that the susceptibility
tensor takes on a simple form.
To understand the geometrical interpretation of the many coefficients $\chi_{ij}$,
assume, for example, that the electric field is directed along the x-axis (i.e. $E_y =
E_z = 0$) as depicted in Fig. 5.2. In this case, the three equations encapsulated in
(5.1) reduce to
$$P_x = \epsilon_0\chi_{xx}E_x$$
$$P_y = \epsilon_0\chi_{yx}E_x$$
$$P_z = \epsilon_0\chi_{zx}E_x$$
Notice that the coefficient $\chi_{xx}$ connects the strength of P in the x direction with
the strength of E in that same direction, just as in the isotropic case. The other two
coefficients ($\chi_{yx}$ and $\chi_{zx}$) describe the amount of polarization P produced in the
y and z directions by the electric field component in the x-dimension. Likewise,
the other coefficients with mixed subscripts in (5.1) describe the contribution to
P in one dimension made by an electric field component in another dimension.
As you might imagine, working with nine susceptibility coefficients can get
complicated. Fortunately, we can greatly reduce the complexity of the description
by a judicious choice of coordinate system. In Appendix 5.A we explain how
conservation of energy requires that the susceptibility tensor (5.1) for typical
non-absorbing crystals be real and symmetric (i.e. $\chi_{ij} = \chi_{ji}$).²

Figure 5.2 The applied field E and the induced polarization P in general are not
parallel in a crystal lattice.

Appendix 5.B shows that, given a real symmetric tensor, it is always possible
to choose a coordinate system for which off-diagonal elements vanish. This is
true even if the lattice planes in the crystal are not mutually orthogonal (e.g.
rhombus, hexagonal, etc.). We will imagine that this rotation of coordinates
has been accomplished. In other words, we can let the crystal itself dictate the
orientation of the coordinate system, aligned to the principal axes of the crystal
for which the off-diagonal elements of (5.1) are zero.

2 By "typical" we mean that the crystal does not exhibit optical activity. Optically active crystals
have a complex susceptibility tensor, even when no absorption takes place. Conservation of energy
in this more general case requires that the susceptibility tensor be Hermitian ($\chi_{ij} = \chi_{ji}^*$).

With the coordinate system aligned to the principal axes, the constitutive
relation for a non-absorbing crystal simplifies to
$$\begin{bmatrix} P_x \\ P_y \\ P_z \end{bmatrix} = \epsilon_0 \begin{bmatrix} \chi_x & 0 & 0 \\ 0 & \chi_y & 0 \\ 0 & 0 & \chi_z \end{bmatrix} \begin{bmatrix} E_x \\ E_y \\ E_z \end{bmatrix} \tag{5.2}$$
or
$$\mathbf{P} = \hat{\mathbf{x}}\epsilon_0\chi_x E_x + \hat{\mathbf{y}}\epsilon_0\chi_y E_y + \hat{\mathbf{z}}\epsilon_0\chi_z E_z \tag{5.3}$$
By assumption, $\chi_x$, $\chi_y$, and $\chi_z$ are all real. (We have dropped the double subscript;
$\chi_x$ stands for $\chi_{xx}$, etc.)
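A small numerical illustration of (5.3) shows how unequal susceptibilities tilt P away from E. The Python sketch below is only an illustration: the susceptibility values are hypothetical numbers chosen for convenience, not data for any real crystal, and the function names are made up for this example.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity (F/m)

def polarization(E, chi):
    """Equation (5.3) in the principal-axis frame: P_i = eps0 * chi_i * E_i."""
    return tuple(EPS0 * c * e for c, e in zip(chi, E))

def angle_between(u, v):
    """Angle in radians between two 3-vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return math.acos(max(-1.0, min(1.0, dot / norm)))

if __name__ == "__main__":
    E = (1.0, 1.0, 0.0)                  # field halfway between the x and y axes
    for chi in ((1.5, 1.5, 1.5), (1.0, 2.0, 2.0)):   # hypothetical susceptibilities
        P = polarization(E, chi)
        print(f"chi = {chi}:  angle between P and E = "
              f"{math.degrees(angle_between(E, P)):.2f} degrees")
```

When all three susceptibilities are equal the angle is zero, recovering the isotropic case; otherwise P leans toward the axis with the larger susceptibility.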
$$\nabla\cdot\left(\epsilon_0\mathbf{E} + \mathbf{P}\right) = i\mathbf{k}\cdot\left(\epsilon_0\mathbf{E} + \mathbf{P}\right) = 0 \tag{5.5}$$
$$\nabla\cdot\mathbf{B} = i\mathbf{k}\cdot\mathbf{B} = 0 \tag{5.6}$$
We immediately notice the following peculiarity: From its definition, the Poynting
vector $\mathbf{S} \equiv \mathbf{E}\times\mathbf{B}/\mu_0$ is perpendicular to both E and B, and by (5.6) the k-vector is
perpendicular to B. However, by (5.5) the k-vector is not necessarily perpendicular
to E, since in general $\mathbf{k}\cdot\mathbf{E} \neq 0$ if P points in a direction other than E. Therefore,
k and S are not necessarily parallel in a crystal. In other words, the flow of energy
and the direction of the phase-front propagation can be different in anisotropic
media.
124 Chapter 5 Propagation in Anisotropic Media
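Our main goal here is to relate the k-vector to the susceptibility parameters $\chi_x$, $\chi_y$, and $\chi_z$. To do this, we plug our trial plane-wave fields into the wave equation (1.40). Under the assumption $\mathbf{J}_{\text{free}} = 0$, we have
$$\nabla^2 \mathbf{E} - \mu_0 \epsilon_0 \frac{\partial^2 \mathbf{E}}{\partial t^2} = \mu_0 \frac{\partial^2 \mathbf{P}}{\partial t^2} + \nabla(\nabla \cdot \mathbf{E}) \tag{5.7}$$
We begin by substituting the trial solutions (5.4) into the wave equation (5.7). After carrying out the derivatives we find
$$k^2 \mathbf{E} - \omega^2 \mu_0 (\epsilon_0 \mathbf{E} + \mathbf{P}) = \mathbf{k}(\mathbf{k} \cdot \mathbf{E}) \tag{5.8}$$
Inserting the constitutive relation (5.3) for crystals into (5.8) yields
$$k^2 \mathbf{E} - \omega^2 \mu_0 \epsilon_0 \left[ (1 + \chi_x) E_x \hat{\mathbf{x}} + (1 + \chi_y) E_y \hat{\mathbf{y}} + (1 + \chi_z) E_z \hat{\mathbf{z}} \right] = \mathbf{k}(\mathbf{k} \cdot \mathbf{E}) \tag{5.9}$$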
This relationship is unwieldy because of the mix of electric field components that
appear in the expression. This was not a problem when we investigated isotropic
materials for which the k-vector is perpendicular to E, making the right-hand side
of the equations zero. However, there is a trick for dealing with this.
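Relation (5.9) actually contains three equations, one for each dimension. Explicitly, these equations are
$$\left[ k^2 - \frac{\omega^2}{c^2}(1 + \chi_x) \right] E_x = k_x (\mathbf{k} \cdot \mathbf{E}) \tag{5.10}$$
$$\left[ k^2 - \frac{\omega^2}{c^2}(1 + \chi_y) \right] E_y = k_y (\mathbf{k} \cdot \mathbf{E}) \tag{5.11}$$
and
$$\left[ k^2 - \frac{\omega^2}{c^2}(1 + \chi_z) \right] E_z = k_z (\mathbf{k} \cdot \mathbf{E}) \tag{5.12}$$
We have replaced the constants $\mu_0 \epsilon_0$ with $1/c^2$ in accordance with (1.42). We multiply (5.10) through (5.12) respectively by $k_x$, $k_y$, and $k_z$ and also move the factor in square brackets in each equation to the denominator on the right-hand side. Then by adding the three equations together we get
$$\frac{k_x^2 (\mathbf{k} \cdot \mathbf{E})}{k^2 - \frac{\omega^2}{c^2}(1 + \chi_x)} + \frac{k_y^2 (\mathbf{k} \cdot \mathbf{E})}{k^2 - \frac{\omega^2}{c^2}(1 + \chi_y)} + \frac{k_z^2 (\mathbf{k} \cdot \mathbf{E})}{k^2 - \frac{\omega^2}{c^2}(1 + \chi_z)} = k_x E_x + k_y E_y + k_z E_z = (\mathbf{k} \cdot \mathbf{E}) \tag{5.13}$$
Now $\mathbf{k} \cdot \mathbf{E}$ appears in every term and can be divided away. This gives the dispersion relation (unencumbered by field components):
$$\frac{k_x^2}{k^2 c^2/\omega^2 - (1 + \chi_x)} + \frac{k_y^2}{k^2 c^2/\omega^2 - (1 + \chi_y)} + \frac{k_z^2}{k^2 c^2/\omega^2 - (1 + \chi_z)} = \frac{\omega^2}{c^2} \tag{5.14}$$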
The dispersion relation (5.14) allows us to find a suitable $k$, given values for $\omega$, $\chi_x$, $\chi_y$, and $\chi_z$. Actually, it only restricts the magnitude of $\mathbf{k}$; we must still decide on a direction for the wave to travel (i.e. we must choose the ratios between $k_x$, $k_y$, and $k_z$). To remind ourselves of this fact, we introduce a unit vector that points in the direction of $\mathbf{k}$
$$\mathbf{k} = k_x \hat{\mathbf{x}} + k_y \hat{\mathbf{y}} + k_z \hat{\mathbf{z}} = k \left( u_x \hat{\mathbf{x}} + u_y \hat{\mathbf{y}} + u_z \hat{\mathbf{z}} \right) = k \hat{\mathbf{u}} \tag{5.15}$$
With this unit vector inserted, the dispersion relation (5.14) for plane waves in a crystal becomes
$$\frac{u_x^2}{k^2 c^2/\omega^2 - (1 + \chi_x)} + \frac{u_y^2}{k^2 c^2/\omega^2 - (1 + \chi_y)} + \frac{u_z^2}{k^2 c^2/\omega^2 - (1 + \chi_z)} = \frac{\omega^2}{k^2 c^2} \tag{5.16}$$
We may define refractive index as the ratio of the speed of light in vacuum $c$ to the speed of phase propagation in a material $\omega/k$ (see P1.9). The relation introduced for isotropic media (i.e. (2.19) for real index) remains appropriate. That is
$$n = \frac{kc}{\omega} \tag{5.17}$$
This familiar relationship between $k$ and $\omega$, in the case of a crystal, depends on the direction of propagation in accordance with (5.16).
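Inspired by (2.30), we will find it helpful to introduce several refractive-index parameters:
$$n_x \equiv \sqrt{1 + \chi_x}, \qquad n_y \equiv \sqrt{1 + \chi_y}, \qquad n_z \equiv \sqrt{1 + \chi_z} \tag{5.18}$$
With definitions (5.17) and (5.18), the dispersion relation (5.16) becomes
$$\frac{u_x^2}{n^2 - n_x^2} + \frac{u_y^2}{n^2 - n_y^2} + \frac{u_z^2}{n^2 - n_z^2} = \frac{1}{n^2} \tag{5.19}$$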
This is called Fresnel's equation³ (not to be confused with the Fresnel coefficients studied in chapter 3). The relationship contains the yet unknown index $n$ that varies with the direction of the k-vector (i.e. the direction of the unit vector $\hat{\mathbf{u}}$).
After multiplying through by all of the denominators (and after a fortuitous cancelation owing to $u_x^2 + u_y^2 + u_z^2 = 1$), Fresnel's equation (5.19) can be rewritten as a quadratic in $n^2$. The two solutions are
$$n^2 = \frac{B \pm \sqrt{B^2 - 4AC}}{2A} \tag{5.20}$$

³ To better distinguish from the Fresnel coefficients, sometimes this is called Fresnel's equation of wave normals. See Principles of Optics, 7th Ed., Born and Wolf, p. 796.
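where
$$A \equiv u_x^2 n_x^2 + u_y^2 n_y^2 + u_z^2 n_z^2 \tag{5.21}$$
$$B \equiv u_x^2 n_x^2 \left( n_y^2 + n_z^2 \right) + u_y^2 n_y^2 \left( n_x^2 + n_z^2 \right) + u_z^2 n_z^2 \left( n_x^2 + n_y^2 \right) \tag{5.22}$$
$$C \equiv n_x^2 n_y^2 n_z^2 \tag{5.23}$$
The upper and lower signs ($+$ and $-$) in (5.20) give two positive solutions for $n^2$.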
The positive square root of these solutions yields two physical values for n. It turns
out that each of the two values for n is associated with a polarization direction
of the electric field, given a propagation direction k. A broader analysis carried
out in appendix 5.C renders the orientation of the electric fields, whereas here we
only show how to find the two values of n. We refer to the two indices as the slow
and fast index, since the waves associated with each propagate at speed v = c/n.
In the special cases of propagation along one of the principal axes of the
crystal, the index n takes on two of the values n x , n y , or n z , depending on which
are orthogonal to the direction of propagation.
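The recipe in (5.20) through (5.23) is straightforward to evaluate numerically. The following is a minimal sketch rather than anything taken from the text: the function name `fresnel_indices` and its argument layout are illustrative assumptions, and the direction vector is normalized inside the function purely as a convenience.

```python
import math

def fresnel_indices(u, nx, ny, nz):
    """Return (n_fast, n_slow) from Fresnel's equation, Eqs. (5.20)-(5.23).

    u is a propagation direction in the principal-axis frame. It is
    normalized here, so any nonzero 3-vector is accepted (a convenience
    of this sketch, not something the text requires).
    """
    norm = math.sqrt(sum(c * c for c in u))
    ux, uy, uz = (c / norm for c in u)
    A = ux**2 * nx**2 + uy**2 * ny**2 + uz**2 * nz**2
    B = (ux**2 * nx**2 * (ny**2 + nz**2)
         + uy**2 * ny**2 * (nx**2 + nz**2)
         + uz**2 * nz**2 * (nx**2 + ny**2))
    C = nx**2 * ny**2 * nz**2
    # Clip tiny negative values caused by round-off when the two roots coincide.
    root = math.sqrt(max(B * B - 4.0 * A * C, 0.0))
    n_fast = math.sqrt((B - root) / (2.0 * A))
    n_slow = math.sqrt((B + root) / (2.0 * A))
    return n_fast, n_slow

# Propagation along z in potassium niobate (index values quoted in Fig. 5.4).
n_fast, n_slow = fresnel_indices((0.0, 0.0, 1.0), 2.22, 2.35, 2.41)
print(n_fast, n_slow)
```

For propagation along the z-axis the two returned values should reproduce $n_x$ and $n_y$ to within round-off, in line with the special case described above.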
Example 5.1
Calculate the two possible values for the index of refraction when $\mathbf{k}$ is in the $\hat{\mathbf{z}}$ direction (in the crystal principal frame).
Solution: With $u_z = 1$ and $u_x = u_y = 0$, definitions (5.21) through (5.23) give $A = n_z^2$, $B = n_z^2 (n_x^2 + n_y^2)$, and $C = n_x^2 n_y^2 n_z^2$. Inserting these expressions into (5.20), we find the two values for the index:
$$n = n_x,\; n_y$$
The index $n_x$ is experienced by light whose electric field points in the x-dimension, and the index $n_y$ is experienced by light whose electric field points in the y-dimension (see appendix 5.C).
Before moving on, let us briefly summarize what has been accomplished so far. Given values for $\chi_x$, $\chi_y$, and $\chi_z$ associated with light in a crystal at a given frequency, you can define the indices $n_x$, $n_y$, and $n_z$, according to (5.18). Next, a direction for the k-vector is chosen (i.e. $u_x$, $u_y$, and $u_z$). This direction generally has two values for the index of refraction associated with it, found using Fresnel's equation.

Figure 5.3 Spherical coordinates.
5.3 Biaxial and Uniaxial Crystals 127
While finding the direction of the optic axes in a biaxial crystal is straightforward, obtaining an expression for the associated indices of refraction is messy. The smaller value is commonly referred to as the fast index and the larger value the slow index. Figure 5.4 shows the two refractive indices (i.e. the solutions to Fresnel's equation (5.20)) for a biaxial crystal plotted with color shading on the surface of a sphere. Each point on the sphere represents a different $\theta$ and $\phi$. The two optic axes are apparent in the plot of the difference between $n_{\text{slow}}$ and $n_{\text{fast}}$; the two indices coincide when propagation is along either optic axis. When propagating in these directions, either polarization experiences the same index.
For the remainder of this chapter, we will focus on the simpler case of uniaxial crystals.

Figure 5.4 The fast and slow refractive indices (and their difference) as a function of direction for potassium niobate (KNbO3) at $\lambda = 500$ nm ($n_x = 2.22$, $n_y = 2.35$, and $n_z = 2.41$).
$$n_z = n_e \tag{5.26}$$
$$n_x = n_y = n_o \tag{5.27}$$
These names were coined by Huygens, one of the early scientists to study light in crystals (see appendix 5.D). A uniaxial crystal with $n_e > n_o$ is referred to as a positive crystal, and one with $n_e < n_o$ is referred to as a negative crystal.
To calculate the index of refraction for a wave propagating in a uniaxial crystal, we use definitions (5.26) and (5.27) along with the spherical representation of $\hat{\mathbf{u}}$ (5.24) in Fresnel's equation (5.20) to find the following two values for $n$ (see P5.5):
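A sketch of that calculation, using the intermediate quantities listed in the hint of P5.5 and assuming that $\theta$ is the polar angle of $\hat{\mathbf{u}}$ measured from the optic axis (the usual spherical convention for (5.24)). The two results are presumably what the text labels (5.28) and (5.29):

```latex
\begin{aligned}
A &= n_o^2\sin^2\theta + n_e^2\cos^2\theta, \qquad
B = n_o^2\left(A + n_e^2\right), \qquad
C = n_o^4 n_e^2, \\[4pt]
B^2 - 4AC &= n_o^4\left[\left(A + n_e^2\right)^2 - 4 A n_e^2\right]
           = n_o^4\left(A - n_e^2\right)^2, \\[4pt]
n^2 &= \frac{B \pm \sqrt{B^2 - 4AC}}{2A}
     = \frac{n_o^2\left[\left(A + n_e^2\right) \pm \left(A - n_e^2\right)\right]}{2A}
\;\;\Longrightarrow\;\;
n = n_o
\quad\text{or}\quad
n = n_e(\theta) \equiv \frac{n_o n_e}{\sqrt{n_o^2\sin^2\theta + n_e^2\cos^2\theta}} .
\end{aligned}
```

The first root does not depend on direction (the ordinary index), while the second varies from $n_o$ at $\theta = 0$ to $n_e$ at $\theta = \pi/2$ (the extraordinary index).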
through the surface. We must consider separately the portion of the light that
experiences the ordinary index and the portion that experiences the extraordinary
index. Because of the different indices, the ordinary and extraordinary polarized
light refract into the crystal at two different angles; they travel at two different
velocities in the crystal; and they have two different wavelengths in the crystal.
If we assume that the index outside of the crystal is one, Snell's law for the ordinary polarization is
$$\sin\theta_i = n_o \sin\theta_t \quad \text{(ordinary polarized)} \tag{5.30}$$
where $n_o$ is the ordinary index inside the crystal. The extraordinary polarized light also obeys Snell's law, but now the index of refraction in the crystal depends on direction of propagation inside the crystal relative to the optic axis. Snell's law for the extraordinary polarization is
$$\sin\theta_i = n_e(\theta') \sin\theta_t \quad \text{(extraordinary polarized)} \tag{5.31}$$
where $\theta'$ is the angle between the optic axis inside the crystal and the direction of propagation in the crystal (given by $\theta_t$ in the plane of incidence). When the optic axis is at an arbitrary angle with respect to the surface the relationship between $\theta'$ and $\theta_t$ is cumbersome. We will examine Snell's law only for the specific case when the optic axis is perpendicular to the crystal surface, for which $\theta_t = \theta'$.
Example 5.2
Examine Snell's law for a uniaxial crystal with optic axis perpendicular to the surface.
Solution: Refer to Fig. 5.6. With the optic axis perpendicular to the surface, if the light hits the crystal surface straight on, the index of refraction is $n_o$, regardless of the orientation of polarization since $\theta' = \theta_t = 0$. When the light strikes the surface at an angle, s-polarized light continues to experience the index $n_o$, while p-polarized light experiences the extraordinary index $n_e(\theta_t)$.⁴
When we insert (5.29) into Snell's law (5.31) with $\theta' = \theta_t$, the expression can be inverted to find the transmitted angle $\theta_t$ in terms of $\theta_i$ (see P5.6):
$$\tan\theta_t = \frac{n_e \sin\theta_i}{n_o \sqrt{n_e^2 - \sin^2\theta_i}} \quad (\text{extraordinary polarized, optic axis} \perp \text{surface}) \tag{5.32}$$
As strange as this formula may appear, it is Snell's law, but with an angularly dependent index.

Figure 5.6 Propagation of light in a uniaxial crystal with its optic axis perpendicular to the surface. (Axes shown: y-axis, z-axis, and x-axis directed into the page.)
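A short numerical check can make the claim concrete that (5.32) is Snell's law with an angle-dependent index. The sketch below assumes the standard form of the extraordinary index for (5.29), $n_e(\theta) = n_o n_e / \sqrt{n_o^2 \sin^2\theta + n_e^2 \cos^2\theta}$; the function names and the 30° incidence angle are illustrative choices, and the index values are the calcite numbers quoted in P5.9.

```python
import math

def n_extraordinary(theta, n_o, n_e):
    """Direction-dependent index; assumed form of (5.29)."""
    return n_o * n_e / math.sqrt((n_o * math.sin(theta))**2
                                 + (n_e * math.cos(theta))**2)

def transmitted_angle_extraordinary(theta_i, n_o, n_e):
    """Refraction angle of the k-vector from (5.32); angles in radians."""
    s = math.sin(theta_i)
    return math.atan(n_e * s / (n_o * math.sqrt(n_e**2 - s**2)))

# Calcite values quoted in P5.9; the incidence angle is arbitrary.
n_o, n_e = 1.658, 1.486
theta_i = math.radians(30.0)
theta_t = transmitted_angle_extraordinary(theta_i, n_o, n_e)

# Snell's law with the angle-dependent index should be satisfied.
mismatch = math.sin(theta_i) - n_extraordinary(theta_t, n_o, n_e) * math.sin(theta_t)
print(math.degrees(theta_t), mismatch)
```

The printed mismatch should vanish to within round-off, which is the numerical counterpart of inverting (5.31) to obtain (5.32).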
4 The correspondence between s and p and ordinary and extraordinary polarization components
is specific to the orientation of the optic axis in this example. For arbitrary orientations of the
optic axis with respect to the surface, the ordinary and extraordinary components will generally be
mixtures of s and p polarized light.
Example 5.3
Find a relationship between direction of the Poynting Vector in a uniaxial crystal
and the angle of incidence in the special case where the optic axis is perpendicular
to the surface.
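Let the k-vector lie in the y-z plane. We may write it as $\mathbf{k} = k \left( \hat{\mathbf{y}} \sin\theta_t + \hat{\mathbf{z}} \cos\theta_t \right)$. Then the ordinary component of the field points in the x-direction, while the extraordinary component lies in the y-z plane.
Equation (5.33) requires
$$\begin{aligned} \mathbf{k} \cdot (\epsilon_0 \mathbf{E} + \mathbf{P}) &= k \left( \hat{\mathbf{y}} \sin\theta_t + \hat{\mathbf{z}} \cos\theta_t \right) \cdot \epsilon_0 \left( n_o^2 E_x \hat{\mathbf{x}} + n_o^2 E_y \hat{\mathbf{y}} + n_e^2 E_z \hat{\mathbf{z}} \right) \\ &= \epsilon_0 k \left( n_o^2 E_y \sin\theta_t + n_e^2 E_z \cos\theta_t \right) \\ &= 0 \end{aligned} \tag{5.34}$$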
5.5 Poynting Vector in a Uniaxial Crystal 131
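Therefore, the y and z components of the extraordinary field are related through
$$E_z = -\frac{n_o^2 E_y}{n_e^2} \tan\theta_t \tag{5.35}$$
The electric field can therefore be written as
$$\mathbf{E} = E_y \left( \hat{\mathbf{y}} - \hat{\mathbf{z}} \frac{n_o^2}{n_e^2} \tan\theta_t \right) e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)} \quad \text{(extraordinary polarized)} \tag{5.36}$$
The associated magnetic field is
$$\begin{aligned} \mathbf{B} = \frac{\mathbf{k} \times \mathbf{E}}{\omega} &= \frac{k \left( \hat{\mathbf{y}} \sin\theta_t + \hat{\mathbf{z}} \cos\theta_t \right) \times E_y \left( \hat{\mathbf{y}} - \hat{\mathbf{z}} \frac{n_o^2}{n_e^2} \tan\theta_t \right)}{\omega} e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)} \\ &= -\hat{\mathbf{x}} \frac{k E_y}{\omega} \left( \frac{n_o^2}{n_e^2} \sin\theta_t \tan\theta_t + \cos\theta_t \right) e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)} \end{aligned} \quad \text{(extraordinary polarized)} \tag{5.37}$$
The time-averaged Poynting vector then becomes
$$\begin{aligned} \langle \mathbf{S} \rangle_t &= \left\langle \mathrm{Re}\{\mathbf{E}\} \times \mathrm{Re}\left\{ \frac{\mathbf{B}}{\mu_0} \right\} \right\rangle_t \\ &= -\frac{k |E_y|^2}{\mu_0 \omega} \left( \hat{\mathbf{y}} - \hat{\mathbf{z}} \frac{n_o^2}{n_e^2} \tan\theta_t \right) \times \hat{\mathbf{x}} \left( \frac{n_o^2}{n_e^2} \sin\theta_t \tan\theta_t + \cos\theta_t \right) \left\langle \cos^2 \left( \mathbf{k} \cdot \mathbf{r} - \omega t + \phi_{E_y} \right) \right\rangle_t \\ &= \frac{k |E_y|^2}{2 \mu_0 \omega} \left( \frac{n_o^2}{n_e^2} \sin\theta_t \tan\theta_t + \cos\theta_t \right) \left( \hat{\mathbf{z}} + \hat{\mathbf{y}} \frac{n_o^2}{n_e^2} \tan\theta_t \right) \end{aligned} \quad \text{(extraordinary polarized)} \tag{5.38}$$
Let us label the direction of the Poynting vector with the angle $\theta_S$. By definition, the tangent of this angle is the ratio of the two vector components of $\mathbf{S}$:
$$\tan\theta_S \equiv \frac{S_y}{S_z} = \frac{n_o^2}{n_e^2} \tan\theta_t \quad \text{(extraordinary polarized)} \tag{5.39}$$
While the k-vector is characterized by the angle $\theta_t$, the Poynting vector is characterized by the angle $\theta_S$. Combining (5.32) and (5.39), we can connect $\theta_S$ to the incident angle $\theta_i$:
$$\tan\theta_S = \frac{n_o \sin\theta_i}{n_e \sqrt{n_e^2 - \sin^2\theta_i}} \quad \text{(extraordinary polarized)} \tag{5.40}$$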
As we noted in the last example, we have the case where ordinary polarized light is
s-polarized light, and extraordinary polarized light is p-polarized light due to our
specific choice of orientation for the optic axis in this section. In general, the s- and
p-polarized portions of the incident light can each give rise to both extraordinary
and ordinary rays.
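The walk-off between the k-vector and the Poynting vector can be checked directly from the field directions. The sketch below is an illustration only: it builds $\mathbf{E}$ from (5.36) with $E_y = 1$, drops all overall constants (only directions matter), and uses hypothetical helper names. The calcite indices are those quoted in P5.9, and the 20° internal angle is an arbitrary choice.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def poynting_angle(theta_t, n_o, n_e):
    """Angle of S from the z-axis for the extraordinary wave.

    E follows (5.36) with E_y = 1; then B ~ k x E and S ~ E x B.
    Overall positive constants are dropped.
    """
    k_hat = (0.0, math.sin(theta_t), math.cos(theta_t))
    E = (0.0, 1.0, -(n_o**2 / n_e**2) * math.tan(theta_t))
    B = cross(k_hat, E)
    S = cross(E, B)
    return math.atan2(S[1], S[2]), k_hat, E

n_o, n_e = 1.658, 1.486
theta_t = math.radians(20.0)
theta_S, k_hat, E = poynting_angle(theta_t, n_o, n_e)

# D is proportional to (n_o^2 E_x, n_o^2 E_y, n_e^2 E_z) in the principal frame.
D = (n_o**2 * E[0], n_o**2 * E[1], n_e**2 * E[2])
print(math.degrees(theta_S), dot(k_hat, E), dot(k_hat, D))
```

In this sketch $\mathbf{k} \cdot \mathbf{D}$ vanishes while $\mathbf{k} \cdot \mathbf{E}$ does not, and the computed $\theta_S$ satisfies (5.39), so the energy flow is tilted away from the k-vector.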
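$$N q_e \begin{pmatrix} x_e \\ y_e \\ z_e \end{pmatrix} = \frac{\epsilon_0}{q_e} \begin{pmatrix} \chi_{xx} & \chi_{xy} & \chi_{xz} \\ \chi_{yx} & \chi_{yy} & \chi_{yz} \\ \chi_{zx} & \chi_{zy} & \chi_{zz} \end{pmatrix} \begin{pmatrix} F_x \\ F_y \\ F_z \end{pmatrix} \tag{5.41}$$
The column vector on the left represents the components of the displacement $\mathbf{r}_e$. We next invert (5.41) to find the force of the electric field on an electron as a function of its displacement⁵
$$\begin{pmatrix} F_x \\ F_y \\ F_z \end{pmatrix} = \begin{pmatrix} k_{xx} & k_{xy} & k_{xz} \\ k_{yx} & k_{yy} & k_{yz} \\ k_{zx} & k_{zy} & k_{zz} \end{pmatrix} \begin{pmatrix} x_e \\ y_e \\ z_e \end{pmatrix} \tag{5.42}$$
where
$$\begin{pmatrix} k_{xx} & k_{xy} & k_{xz} \\ k_{yx} & k_{yy} & k_{yz} \\ k_{zx} & k_{zy} & k_{zz} \end{pmatrix} \equiv \frac{N q_e^2}{\epsilon_0} \begin{pmatrix} \chi_{xx} & \chi_{xy} & \chi_{xz} \\ \chi_{yx} & \chi_{yy} & \chi_{yz} \\ \chi_{zx} & \chi_{zy} & \chi_{zz} \end{pmatrix}^{-1} \tag{5.43}$$
Here the various $k_{ij}$ represent spring constants as opposed to components of wave vectors.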
The total work done on an electron in moving it to its displaced position is given by
$$W = \int_{\text{path}} \mathbf{F}(\mathbf{r}') \cdot d\mathbf{r}' \tag{5.44}$$
While there are many possible paths for getting the electron to any specific displacement (each path specified by a different history of the electric field), the work done along any of these paths must be the same if the system is conservative (i.e. no absorption). For example, if the final displacement is $\mathbf{r}_e = x_e \hat{\mathbf{x}} + y_e \hat{\mathbf{y}}$, we could have the following two paths:

[Figure: two paths from (0,0,0) to the final displacement. Path 1 moves first along x and then along y; Path 2 moves first along y and then along x.]
⁵ This inversion assumes the field changes slowly so the forces on the electron are always essentially balanced. This is not true for optical fields, but the proof gives the right flavor for why conservation of energy results in the symmetry. A more formal proof that doesn't make this assumption can be found in Principles of Optics, 7th Ed., Born and Wolf, pp. 790-791.
5.B Rotation of Coordinates 133
We can use (5.42) in (5.44) to calculate the total work done on the electron along path 1:
$$\begin{aligned} W &= \int_0^{x_e} F_x(x', y' = 0, z' = 0)\, dx' + \int_0^{y_e} F_y(x' = x_e, y', z' = 0)\, dy' \\ &= \int_0^{x_e} k_{xx} x'\, dx' + \int_0^{y_e} \left( k_{yx} x_e + k_{yy} y' \right) dy' \\ &= \frac{k_{xx}}{2} x_e^2 + k_{yx} x_e y_e + \frac{k_{yy}}{2} y_e^2 \end{aligned}$$
If we take path 2, the total work is
$$\begin{aligned} W &= \int_0^{y_e} F_y(x' = 0, y', z' = 0)\, dy' + \int_0^{x_e} F_x(x', y' = y_e, z' = 0)\, dx' \\ &= \int_0^{y_e} k_{yy} y'\, dy' + \int_0^{x_e} \left( k_{xx} x' + k_{xy} y_e \right) dx' \\ &= \frac{k_{yy}}{2} y_e^2 + k_{xy} x_e y_e + \frac{k_{xx}}{2} x_e^2 \end{aligned}$$
Since the work must be the same for these two paths, we clearly have $k_{xy} = k_{yx}$.
Similar arguments for other pairs of dimensions ensure that the matrix of k
coefficients is symmetric. From linear algebra, we learn that if the inverse of a
matrix is symmetric then the matrix itself is also symmetric. When we combine
this result with the definition (5.43), we see that the assumption of no absorption
requires the susceptibility matrix to be symmetric.
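The two-path argument can be illustrated numerically. The sketch below is an assumption-laden illustration, not part of the original proof: the spring constants are made-up numbers in arbitrary units, the function name is hypothetical, and the midpoint rule is used because it is exact (up to round-off) for the linear integrands that appear here.

```python
def work_two_paths(k, x_e, y_e, steps=1000):
    """Work done moving an electron from the origin to (x_e, y_e, 0).

    k is a 3x3 nested list of spring constants as in (5.42). Path 1 goes
    along x and then y; path 2 goes along y and then x.
    """
    def integrate(f, upper):
        h = upper / steps
        return sum(f((i + 0.5) * h) for i in range(steps)) * h

    # Path 1: F_x along (x', 0, 0), then F_y along (x_e, y', 0).
    w1 = (integrate(lambda x: k[0][0] * x, x_e)
          + integrate(lambda y: k[1][0] * x_e + k[1][1] * y, y_e))
    # Path 2: F_y along (0, y', 0), then F_x along (x', y_e, 0).
    w2 = (integrate(lambda y: k[1][1] * y, y_e)
          + integrate(lambda x: k[0][0] * x + k[0][1] * y_e, x_e))
    return w1, w2

# Illustrative numbers only (arbitrary units).
k_sym = [[2.0, 0.5, 0.0], [0.5, 3.0, 0.0], [0.0, 0.0, 1.0]]
k_asym = [[2.0, 0.5, 0.0], [0.8, 3.0, 0.0], [0.0, 0.0, 1.0]]
print(work_two_paths(k_sym, 1.0, 2.0))
print(work_two_paths(k_asym, 1.0, 2.0))
```

With the symmetric matrix the two results agree. With the asymmetric one they differ by $(k_{yx} - k_{xy})\, x_e y_e$, which is the path dependence that a conservative system cannot have.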
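$$\mathbf{P} = \epsilon_0 \boldsymbol{\chi} \mathbf{E} \tag{5.45}$$
where
$$\mathbf{E} \equiv \begin{pmatrix} E_x \\ E_y \\ E_z \end{pmatrix}, \qquad \mathbf{P} \equiv \begin{pmatrix} P_x \\ P_y \\ P_z \end{pmatrix}, \qquad \boldsymbol{\chi} \equiv \begin{pmatrix} \chi_{xx} & \chi_{xy} & \chi_{xz} \\ \chi_{xy} & \chi_{yy} & \chi_{yz} \\ \chi_{xz} & \chi_{yz} & \chi_{zz} \end{pmatrix} \tag{5.46}$$
Our task is to find a new coordinate system $x'$, $y'$, and $z'$ for which the susceptibility tensor is diagonal. That is, we want to choose $x'$, $y'$, and $z'$ such that
$$\mathbf{P}' = \epsilon_0 \boldsymbol{\chi}' \mathbf{E}', \tag{5.47}$$
where
$$\mathbf{E}' \equiv \begin{pmatrix} E'_{x'} \\ E'_{y'} \\ E'_{z'} \end{pmatrix}, \qquad \mathbf{P}' \equiv \begin{pmatrix} P'_{x'} \\ P'_{y'} \\ P'_{z'} \end{pmatrix}, \qquad \boldsymbol{\chi}' \equiv \begin{pmatrix} \chi'_{x'x'} & 0 & 0 \\ 0 & \chi'_{y'y'} & 0 \\ 0 & 0 & \chi'_{z'z'} \end{pmatrix} \tag{5.48}$$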
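To arrive at the new coordinate system, we are free to make pure rotation transformations. In a manner similar to (6.29), a rotation through an angle $\gamma$ about the z-axis, followed by a rotation through an angle $\beta$ about the resulting y-axis, and finally a rotation through an angle $\alpha$ about the new x-axis, can be written as
$$\begin{aligned} \mathbf{R} &\equiv \begin{pmatrix} R_{11} & R_{12} & R_{13} \\ R_{21} & R_{22} & R_{23} \\ R_{31} & R_{32} & R_{33} \end{pmatrix} \\ &= \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha & \sin\alpha \\ 0 & -\sin\alpha & \cos\alpha \end{pmatrix} \begin{pmatrix} \cos\beta & 0 & \sin\beta \\ 0 & 1 & 0 \\ -\sin\beta & 0 & \cos\beta \end{pmatrix} \begin{pmatrix} \cos\gamma & \sin\gamma & 0 \\ -\sin\gamma & \cos\gamma & 0 \\ 0 & 0 & 1 \end{pmatrix} \\ &= \begin{pmatrix} \cos\beta\cos\gamma & \cos\beta\sin\gamma & \sin\beta \\ -\cos\alpha\sin\gamma - \sin\alpha\sin\beta\cos\gamma & \cos\alpha\cos\gamma - \sin\alpha\sin\beta\sin\gamma & \sin\alpha\cos\beta \\ \sin\alpha\sin\gamma - \cos\alpha\sin\beta\cos\gamma & -\sin\alpha\cos\gamma - \cos\alpha\sin\beta\sin\gamma & \cos\alpha\cos\beta \end{pmatrix} \end{aligned} \tag{5.49}$$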
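In terms of the rotated vectors $\mathbf{E}' = \mathbf{R}\mathbf{E}$ and $\mathbf{P}' = \mathbf{R}\mathbf{P}$, the original vectors are
$$\mathbf{E} = \mathbf{R}^{-1} \mathbf{E}', \qquad \mathbf{P} = \mathbf{R}^{-1} \mathbf{P}' \tag{5.51}$$
where
$$\begin{aligned} \mathbf{R}^{-1} &= \begin{pmatrix} \cos\beta\cos\gamma & -\cos\alpha\sin\gamma - \sin\alpha\sin\beta\cos\gamma & \sin\alpha\sin\gamma - \cos\alpha\sin\beta\cos\gamma \\ \cos\beta\sin\gamma & \cos\alpha\cos\gamma - \sin\alpha\sin\beta\sin\gamma & -\sin\alpha\cos\gamma - \cos\alpha\sin\beta\sin\gamma \\ \sin\beta & \sin\alpha\cos\beta & \cos\alpha\cos\beta \end{pmatrix} \\ &= \begin{pmatrix} R_{11} & R_{21} & R_{31} \\ R_{12} & R_{22} & R_{32} \\ R_{13} & R_{23} & R_{33} \end{pmatrix} = \mathbf{R}^{T} \end{aligned} \tag{5.52}$$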
Note that the inverse of the rotation matrix is the same as its transpose, an impor-
tant feature that we exploit in what follows.
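The statement that the inverse of the rotation matrix equals its transpose can be spot-checked numerically. In the sketch below the sign conventions of the three elementary rotations are one common choice and should be treated as an assumption; the function names and the test angles are arbitrary.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][m] * b[m][j] for m in range(3)) for j in range(3)]
            for i in range(3)]

def rotation(alpha, beta, gamma):
    """R = Rx(alpha) Ry(beta) Rz(gamma): rotate about z, then y, then x."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    rx = [[1.0, 0.0, 0.0], [0.0, ca, sa], [0.0, -sa, ca]]
    ry = [[cb, 0.0, sb], [0.0, 1.0, 0.0], [-sb, 0.0, cb]]
    rz = [[cg, sg, 0.0], [-sg, cg, 0.0], [0.0, 0.0, 1.0]]
    return matmul(rx, matmul(ry, rz))

alpha, beta, gamma = 0.3, -1.1, 2.0   # arbitrary test angles in radians
R = rotation(alpha, beta, gamma)
R_T = [[R[j][i] for j in range(3)] for i in range(3)]
product = matmul(R, R_T)              # should be the identity matrix
for row in product:
    print([round(v, 12) for v in row])
```

Because each factor is orthogonal, the product is orthogonal for any choice of the three angles, which is why no explicit matrix inversion is ever needed in what follows.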
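Upon inserting (5.51) into (5.45) we have
$$\mathbf{R}^{-1} \mathbf{P}' = \epsilon_0 \boldsymbol{\chi} \mathbf{R}^{-1} \mathbf{E}' \tag{5.53}$$
or
$$\mathbf{P}' = \epsilon_0 \mathbf{R} \boldsymbol{\chi} \mathbf{R}^{-1} \mathbf{E}' \tag{5.54}$$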
5.C Electric Field in a Crystal 135
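From this equation we see that the new susceptibility tensor we seek for (5.47) is
$$\begin{aligned} \boldsymbol{\chi}' \equiv \mathbf{R} \boldsymbol{\chi} \mathbf{R}^{-1} &= \begin{pmatrix} R_{11} & R_{12} & R_{13} \\ R_{21} & R_{22} & R_{23} \\ R_{31} & R_{32} & R_{33} \end{pmatrix} \begin{pmatrix} \chi_{xx} & \chi_{xy} & \chi_{xz} \\ \chi_{xy} & \chi_{yy} & \chi_{yz} \\ \chi_{xz} & \chi_{yz} & \chi_{zz} \end{pmatrix} \begin{pmatrix} R_{11} & R_{21} & R_{31} \\ R_{12} & R_{22} & R_{32} \\ R_{13} & R_{23} & R_{33} \end{pmatrix} \\ &= \begin{pmatrix} \chi'_{x'x'} & \chi'_{x'y'} & \chi'_{x'z'} \\ \chi'_{x'y'} & \chi'_{y'y'} & \chi'_{y'z'} \\ \chi'_{x'z'} & \chi'_{y'z'} & \chi'_{z'z'} \end{pmatrix} \end{aligned} \tag{5.55}$$
We have expressly indicated that the off-diagonal terms of $\boldsymbol{\chi}'$ are symmetric (i.e. $\chi'_{ij} = \chi'_{ji}$). This can be verified by performing the multiplication in (5.55). It is a consequence of $\boldsymbol{\chi}$ being symmetric and $\mathbf{R}^{-1}$ being equal to $\mathbf{R}^{T}$.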
The three off-diagonal elements of $\boldsymbol{\chi}'$ (appearing both above and below the diagonal) are found by performing the matrix multiplication in the second line of (5.55). The specific expressions for these three elements are not particularly enlightening. The important point is that we can make all three of them equal to zero since we have three degrees of freedom in the angles $\alpha$, $\beta$, and $\gamma$. Although we do not expressly solve for the angles, we have demonstrated that it is always possible to set
$$\chi'_{x'y'} = 0, \qquad \chi'_{x'z'} = 0, \qquad \chi'_{y'z'} = 0 \tag{5.56}$$
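The text argues that suitable angles exist without solving for them. One standard numerical route, swapped in here purely for illustration and not the authors' method, is the cyclic Jacobi method: it applies a sequence of plane rotations, each chosen to zero one off-diagonal element, and accumulates them into a single rotation matrix. The susceptibility values below are made up.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][m] * b[m][j] for m in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def diagonalize(chi, sweeps=50, tol=1e-14):
    """Return (chi_prime, R) with chi_prime = R chi R^T nearly diagonal."""
    a = [row[:] for row in chi]
    R = [[float(i == j) for j in range(3)] for i in range(3)]
    for _ in range(sweeps):
        if max(abs(a[0][1]), abs(a[0][2]), abs(a[1][2])) < tol:
            break
        for p, q in ((0, 1), (0, 2), (1, 2)):
            if a[p][q] == 0.0:
                continue
            # Rotation angle in the p-q plane that zeroes element (p, q).
            if a[q][q] == a[p][p]:
                theta = math.pi / 4.0
            else:
                theta = 0.5 * math.atan(2.0 * a[p][q] / (a[q][q] - a[p][p]))
            c, s = math.cos(theta), math.sin(theta)
            G = [[float(i == j) for j in range(3)] for i in range(3)]
            G[p][p], G[q][q], G[p][q], G[q][p] = c, c, s, -s
            Gt = transpose(G)
            a = matmul(matmul(Gt, a), G)   # similarity transform
            R = matmul(Gt, R)              # accumulate the total rotation
    return a, R

# Made-up real symmetric susceptibility tensor (dimensionless).
chi = [[1.2, 0.3, 0.1], [0.3, 1.5, 0.2], [0.1, 0.2, 1.9]]
chi_prime, R = diagonalize(chi)
for row in chi_prime:
    print([round(v, 10) for v in row])
```

The diagonal entries of `chi_prime` play the role of $\chi_x$, $\chi_y$, and $\chi_z$ in (5.2), and the rows of `R` give the principal axes expressed in the original coordinates.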
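$$\begin{pmatrix} \frac{\omega^2}{c^2}(1 + \chi_x) - k_y^2 - k_z^2 & k_x k_y & k_x k_z \\ k_x k_y & \frac{\omega^2}{c^2}(1 + \chi_y) - k_x^2 - k_z^2 & k_y k_z \\ k_x k_z & k_y k_z & \frac{\omega^2}{c^2}(1 + \chi_z) - k_x^2 - k_y^2 \end{pmatrix} \begin{pmatrix} E_x \\ E_y \\ E_z \end{pmatrix} = 0 \tag{5.57}$$
In terms of the unit vector $\hat{\mathbf{u}}$ and the index parameters (5.17) and (5.18), this becomes slightly nicer:
$$\begin{pmatrix} \frac{n_x^2}{n^2} - u_y^2 - u_z^2 & u_x u_y & u_x u_z \\ u_x u_y & \frac{n_y^2}{n^2} - u_x^2 - u_z^2 & u_y u_z \\ u_x u_z & u_y u_z & \frac{n_z^2}{n^2} - u_x^2 - u_y^2 \end{pmatrix} \begin{pmatrix} E_x \\ E_y \\ E_z \end{pmatrix} = 0 \tag{5.58}$$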
For (5.58) to have a non-trivial solution (i.e. nonzero fields), the determinant of the matrix must be zero. Imposing this requirement is an equivalent way to derive Fresnel's equation (5.19) for $n$.
Given a direction for $\hat{\mathbf{u}}$ and a value for $n$ (from Fresnel's equation), we can use (5.58) to determine the direction of the electric field associated with that index. It is left as an exercise to show that in non-degenerate cases⁷ (i.e. $n \neq n_x, n_y, n_z$), the appropriate field direction for a value of $n$ is given by
$$\begin{pmatrix} E_x \\ E_y \\ E_z \end{pmatrix} \propto \begin{pmatrix} \dfrac{u_x}{n^2 - n_x^2} \\[2ex] \dfrac{u_y}{n^2 - n_y^2} \\[2ex] \dfrac{u_z}{n^2 - n_z^2} \end{pmatrix} \qquad (n \neq n_x, n_y, n_z) \tag{5.59}$$
This is a proportionality rather than an equation because Maxwell's equations only specify the direction of $\mathbf{E}$; we are free to choose the amplitude. Because Fresnel's equation gives two values for $n$, (5.59) specifies two distinct polarization components associated with each propagation direction $\hat{\mathbf{u}}$. These polarization components form a natural basis for describing light propagation in a crystal. When light is composed of a mixture of these two polarizations, the two polarization components experience different indices of refraction.
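As a numerical illustration of (5.58) and (5.59), the sketch below picks a direction, solves Fresnel's equation for the two values of $n^2$, forms the field direction from (5.59), and checks that it satisfies (5.58). The function names and the chosen direction are assumptions of this sketch, and the index values are those quoted for potassium niobate in Fig. 5.4.

```python
import math

def fresnel_n_squared(u, n_axes):
    """Two roots for n^2 from (5.20)-(5.23); u must be a unit vector."""
    ux2, uy2, uz2 = (c * c for c in u)
    nx2, ny2, nz2 = (n * n for n in n_axes)
    A = ux2 * nx2 + uy2 * ny2 + uz2 * nz2
    B = ux2 * nx2 * (ny2 + nz2) + uy2 * ny2 * (nx2 + nz2) + uz2 * nz2 * (nx2 + ny2)
    C = nx2 * ny2 * nz2
    root = math.sqrt(max(B * B - 4.0 * A * C, 0.0))
    return (B - root) / (2.0 * A), (B + root) / (2.0 * A)

def field_direction(u, n2, n_axes):
    """Unnormalized E from (5.59); requires n different from n_x, n_y, n_z."""
    return [ui / (n2 - ni * ni) for ui, ni in zip(u, n_axes)]

def residual(u, n2, n_axes, E):
    """Matrix of (5.58) applied to E; each entry should vanish for a valid mode."""
    u_dot_E = sum(ui * Ei for ui, Ei in zip(u, E))
    return [(ni * ni / n2 - 1.0) * Ei + ui * u_dot_E
            for ui, ni, Ei in zip(u, n_axes, E)]

n_axes = (2.22, 2.35, 2.41)
u = tuple(1.0 / math.sqrt(3.0) for _ in range(3))   # arbitrary generic direction
modes = []
for n2 in fresnel_n_squared(u, n_axes):
    E = field_direction(u, n2, n_axes)
    modes.append((n2, E))
    print(math.sqrt(n2), E, residual(u, n2, n_axes, E))
```

Each residual should vanish to within round-off. Multiplying the components of each $\mathbf{E}$ by $n_x^2$, $n_y^2$, $n_z^2$ gives the corresponding direction of $\epsilon_0 \mathbf{E} + \mathbf{P}$, which is perpendicular to $\hat{\mathbf{u}}$ even though $\mathbf{E}$ itself generally is not.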
If any of the components of u (i.e. u x , u y , or u z ) is precisely zero, the corre-
sponding entry in (5.59) yields a zero-over-zero situation. This happens when at
least one of the dimensions in (5.58) becomes decoupled from the others. In these
cases, one can re-solve (5.58) for the polarization directions as in the following
example.
Example 5.4
Determine the directions of the two polarization components associated with light propagating in the $\hat{\mathbf{u}} = \hat{\mathbf{z}}$ direction. (Compare with Example 5.1.)
Notice that all three dimensions are decoupled in this system (i.e. there are no off-diagonal terms). In Example 5.1 we found that the two values of $n$ associated with $\hat{\mathbf{u}} = \hat{\mathbf{z}}$ are $n_x$ and $n_y$. If we use $n = n_x$ in our set of equations, we have
$$\begin{pmatrix} 0 & 0 & 0 \\ 0 & \dfrac{n_y^2}{n_x^2} - 1 & 0 \\ 0 & 0 & \dfrac{n_z^2}{n_x^2} \end{pmatrix} \begin{pmatrix} E_x \\ E_y \\ E_z \end{pmatrix} = 0$$

We can use (5.59) to study the behavior of polarization direction as the direction of propagation varies. Figure 5.7 shows plots of the polarization direction.

Figure 5.7 Polarization direction associated with the two values of $n$ in potassium niobate (KNbO3) at $\lambda = 500$ nm ($n_x = 2.22$, $n_y = 2.34$, and $n_z = 2.41$) with one of the spherical angles held at $\pi/4$. Panels: (a) Polarization Direction for Slow Index; (b) Polarization Direction for Fast Index. Frame (c) shows the angle between the two polarization components.

To find the directions of the electric field for light that experiences the ordinary index of refraction in a uniaxial crystal, we insert $n = n_o$ into the requirement (5.58), and solve for the allowed fields (see P5.11) to find
$$\mathbf{E}_o(\hat{\mathbf{u}}) \propto \begin{pmatrix} -\sin\phi \\ \cos\phi \\ 0 \end{pmatrix} \tag{5.61}$$
This field component is associated with the ordinary wave. Just as in an isotropic medium such as glass, the index of refraction for light with this polarization does not vary with $\theta$. The polarization component associated with $n_e(\theta)$ is found by using (5.59):
$$\mathbf{E}_e(\hat{\mathbf{u}}) \propto \begin{pmatrix} \dfrac{\sin\theta\cos\phi}{n_e^2(\theta) - n_o^2} \\[2ex] \dfrac{\sin\theta\sin\phi}{n_e^2(\theta) - n_o^2} \\[2ex] \dfrac{\cos\theta}{n_e^2(\theta) - n_e^2} \end{pmatrix} \tag{5.62}$$

⁸ The two components of the electric displacement vector $\mathbf{D} = \epsilon_0 \mathbf{E} + \mathbf{P}$ remain perpendicular.
Notice that this polarization component is partially directed along the optic axis (i.e. it has a z-component), and it is not perpendicular to $\mathbf{k}$ since $\hat{\mathbf{u}} \cdot \mathbf{E}_e(\hat{\mathbf{u}}) \neq 0$ (see P5.12). It is, however, perpendicular to the ordinary polarization component, since $\mathbf{E}_e \cdot \mathbf{E}_o = 0$.
Notice that when $\theta = 0$, (5.29) reduces to $n = n_o$ so that both indices are the same. On the other hand, if $\theta = \pi/2$ then (5.29) reduces to $n = n_e$. These limits must be approached carefully in (5.62).
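by the hypotenuse of the right triangle seen in Fig. 5.8. Let the point where the wave front touches the ellipse be denoted by $(y, z) = (z \tan\theta_S, z)$. The slope (rise over run) of the line that connects these two points is then
$$\frac{dz}{dy} = -\frac{z}{c\Delta t/\sin\theta_i - z\tan\theta_S} \tag{5.65}$$
At the point where the wave front touches the ellipse (i.e. $(y, z) = (z \tan\theta_S, z)$), the slope of the ellipse is
$$\frac{dz}{dy} = \frac{-y\, n_e^2}{n_o\, c\Delta t \sqrt{1 - \dfrac{y^2}{(c\Delta t/n_e)^2}}} = -\frac{n_e^2\, y}{n_o^2\, z} = -\frac{n_e^2}{n_o^2} \tan\theta_S \tag{5.66}$$
We would like these two slopes to be the same. We therefore set them equal to each other:
$$-\frac{n_e^2}{n_o^2} \tan\theta_S = -\frac{z}{c\Delta t/\sin\theta_i - z\tan\theta_S} \quad \Rightarrow \quad \frac{c\Delta t}{z} \frac{n_e^2 \tan\theta_S}{n_o^2 \sin\theta_i} = \frac{n_e^2}{n_o^2} \tan^2\theta_S + 1 \tag{5.67}$$
Since the point $(z \tan\theta_S, z)$ lies on the ellipse, we also have
$$\frac{c\Delta t}{z} = n_o \sqrt{\frac{n_e^2}{n_o^2} \tan^2\theta_S + 1} \tag{5.68}$$
Substituting (5.68) into (5.67) and solving for $\tan\theta_S$ gives $\tan\theta_S = n_o \sin\theta_i / \left( n_e \sqrt{n_e^2 - \sin^2\theta_i} \right)$. This agrees with (5.40) as anticipated. Again, Huygens' approach obtained the correct direction of the Poynting vector associated with the extraordinary wave.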
Exercises
P5.1 (a) Solve Fresnel's equation (5.19) to find the two values of $n^2$ associated with a given $\hat{\mathbf{u}}$. In other words, fill in the steps leading to (5.20) through (5.23).
(b) Point out that both solutions for $n^2$ are real and positive, when $n_x$, $n_y$, and $n_z$ are real and $B^2 - 4AC \geq 0$ in (5.20). Show that $B^2 - 4AC \geq 0$ in the following special cases: Case I: $u_x = 1$, $u_y = 0$, $u_z = 0$; Case II: $u_x = 1/\sqrt{3}$, $u_y = 1/\sqrt{3}$, $u_z = 1/\sqrt{3}$.
HINT: First manipulate (5.19) into the form
$$\begin{aligned} &\left[ u_x^2 + u_y^2 + u_z^2 - 1 \right] n^6 \\ &+ \left[ \left( n_x^2 + n_y^2 + n_z^2 \right) - u_x^2 \left( n_y^2 + n_z^2 \right) - u_y^2 \left( n_x^2 + n_z^2 \right) - u_z^2 \left( n_x^2 + n_y^2 \right) \right] n^4 \\ &- \left[ \left( n_x^2 n_y^2 + n_x^2 n_z^2 + n_y^2 n_z^2 \right) - u_x^2 n_y^2 n_z^2 - u_y^2 n_x^2 n_z^2 - u_z^2 n_x^2 n_y^2 \right] n^2 + n_x^2 n_y^2 n_z^2 = 0 \end{aligned}$$
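P5.2 Show that Fresnel's equation (5.19) may equivalently be written as
$$\frac{u_x^2}{\dfrac{1}{n^2} - \dfrac{1}{n_x^2}} + \frac{u_y^2}{\dfrac{1}{n^2} - \dfrac{1}{n_y^2}} + \frac{u_z^2}{\dfrac{1}{n^2} - \dfrac{1}{n_z^2}} = 0$$
HINT: Use $1 = u_x^2 + u_y^2 + u_z^2$.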
P5.3 Suppose you have a crystal with $n_x = 1.5$, $n_y = 1.6$, and $n_z = 1.7$. Use Fresnel's equation to determine what the two indices of refraction are for a k-vector in the crystal along the $\hat{\mathbf{u}} = (\hat{\mathbf{x}} + 2\hat{\mathbf{y}} + 3\hat{\mathbf{z}})/\sqrt{14}$ direction.
P5.4 (a) Show that for a biaxial crystal, the direction of the optic axes are
given by (5.25) in the x-z plane.
(b) Show that (5.25) only makes sense if the axes are chosen such that
n y is in between n x and n z . Does the formula work if n x n y n z ?
How do the values of relate to the case when n z n y n x ?
HINT: Use spherical coordinates as in (5.24). The two indices are the same when $B^2 - 4AC = 0$. Under the assumption that $n_y$ lies between $n_x$ and $n_z$, $B^2 - 4AC = 0$ can only be satisfied when $\phi = 0$.
Exercises 141
P5.5 Use definitions (5.26) and (5.27) along with the spherical representation of $\hat{\mathbf{u}}$ (5.24) in Fresnel's equation (5.20) to calculate the two values for the index in a uniaxial crystal (i.e. (5.28) and (5.29)).
HINT: First show that
$$\begin{aligned} A &= n_o^2 \sin^2\theta + n_e^2 \cos^2\theta \\ B &= n_o^2 n_e^2 + n_o^4 \sin^2\theta + n_e^2 n_o^2 \cos^2\theta \\ C &= n_o^4 n_e^2 \end{aligned}$$
P5.7 Suppose you have a quartz plate (a uniaxial crystal) with its optic axis oriented perpendicular to the surfaces. The indices of refraction for quartz are $n_o = 1.54424$ and $n_e = 1.55335$. A plane wave with wavelength $\lambda_{\text{vac}} = 633$ nm passes through the plate. After emerging from the crystal, there is a phase difference $\Delta$ between the two polarization components of the plane wave, and this phase difference depends on incident angle $\theta_i$. Use a computer to plot $\Delta$ as a function of incident angle from zero to 90° for a plate with thickness $d = 0.96$ mm.
HINT: For s-polarized light, show that the number of wavelengths that fit in the plate is $\dfrac{d}{(\lambda_{\text{vac}}/n_o) \cos\theta_t^{(s)}}$. For p-polarized light, show that the number of wavelengths that fit in the plate and the extra leg $\delta$ outside of the plate (see Fig. 5.9) is $\dfrac{d}{(\lambda_{\text{vac}}/n_p) \cos\theta_t^{(p)}} + \dfrac{\delta}{\lambda_{\text{vac}}}$, where
$$\delta = d \left[ \tan\theta_t^{(s)} - \tan\theta_t^{(p)} \right] \sin\theta_i$$

Figure 5.9 Diagram for P5.7.
L5.8 In the laboratory, send a HeNe laser ($\lambda_{\text{vac}} = 633$ nm) through two crossed polarizers, oriented at 45° and 135°. Place the quartz plate described in P5.7 between the polarizers on a rotation stage. Now equal amounts of s- and p-polarized light strike the crystal as it is rotated from normal incidence.
If the phase shift between the two paths discussed in P5.7 is an odd integer times $\pi$, the polarization direction of the light transmitted through the crystal is rotated by 90°, and the maximum transmission through the second polarizer results. (In this configuration, the crystal acts as a half wave plate, which we discuss in Chapter 6.) If the phase shift is an even integer times $\pi$, then the polarization is rotated by 180° and minimum transmission through the second polarizer results. Plot these measured maximum and minimum points on your computer-generated graph of the previous problem.

[Figure: transmitted pattern showing dim spots and bright spots.]
P5.9 A calcite crystal is cut and polished such that the optic axis is perpendicular to the surface.⁹ If 590 nm light enters with incident angle $\theta_i = 45°$, what is the difference between the transmitted angles of the Poynting vector for s- and p-polarized light? Calcite is a uniaxial crystal with $n_o = 1.658$ and $n_e = 1.486$ at this wavelength.
P5.11 (a) Show that the field polarization component associated with $n = n_o$ in a uniaxial crystal is given by (5.61) by substituting this value for $n$ into (5.58) and determining what combination of field components are allowable.
(b) Show that the field is directed perpendicular to the plane containing
u and z.
P5.12 (a) Show that the electric field for extraordinary polarized light Ee (u) in
a uniaxial crystal is not perpendicular to k (i.e. u).
(b) Show that the ordinary polarization component Eo (u) is perpendic-
ular to k.
9 This is called an a-cut. Calcite cleaves naturally along its rhombohedron form, which is not the
same as an a-cut.
Chapter 6
Polarization of Light
When the direction of the electric field of light oscillates in a regular, predictable fashion, we say that the light is polarized. Polarization describes the direction of the oscillating electric field, a distinct concept from dipoles per volume in a material $\mathbf{P}$, also called polarization. In this chapter, we develop a formalism for describing polarized light and the effect of devices that modify polarization. If the electric field oscillates in a plane, we say that it is linearly polarized. The electric field can also spiral around while a plane wave propagates, and this is called circular or elliptical polarization. There is a convenient way for keeping track of polarization using a two-dimensional Jones vector.
Many devices can affect polarization such as polarizers and wave plates. Their effects on a light field can be represented by 2 × 2 Jones matrices that operate on the Jones vector representing the light. A Jones matrix can describe, for example, a polarizer oriented at an arbitrary angle or it can characterize the influence of a wave plate, which is a device that introduces a relative phase between two components of the electric field.
In this chapter, we will also see how reflection and transmission at a material interface influences field polarization. As we saw previously, s-polarized light can acquire a phase lag or phase advance relative to p-polarized light. This is especially true at metal surfaces, which have complex indices of refraction. The Fresnel coefficients studied in chapters 3 and 4 can be conveniently incorporated into a Jones matrix to keep track of their influence on polarization. Ellipsometry, outlined in appendix 6.A, is the science of characterizing optical properties of materials through an examination of these effects.

Figure 6.1 Animation showing different polarization states of light.
Throughout this chapter, we consider light to have well characterized polar-
ization. However, most common sources of light (e.g. sunlight or a light bulb)
have an electric-field direction that varies rapidly and randomly. Such sources
are commonly referred to as unpolarized. It is common to have a mixture of
unpolarized and polarized light, called partially polarized light. The Jones vector
formalism used in this chapter is inappropriate for describing the unpolarized
portions of the light. In appendix 6.B we describe a more general formalism for
dealing with light having an arbitrary degree of polarization.
144 Chapter 6 Polarization of Light
$$\mathbf{E}(z, t) = \left( E_x \hat{\mathbf{x}} + E_y \hat{\mathbf{y}} \right) e^{i(kz - \omega t)} \tag{6.2}$$
As always, only the real part of (6.2) is physically relevant. The complex amplitudes of $E_x$ and $E_y$ keep track of the phase of the oscillating field components. In general the complex phases of $E_x$ and $E_y$ can differ, so that the wave in one of the dimensions lags or leads the wave in the other dimension.
The relationship between $E_x$ and $E_y$ describes the polarization of the light. For example, if $E_y$ is zero, the plane wave is said to be linearly polarized along the x-dimension. Linearly polarized light can have any orientation in the x-y plane, and it occurs whenever $E_x$ and $E_y$ have the same complex phase (or a phase differing by $\pi$). For our purposes, we will take the x-dimension to be horizontal and the y-dimension to be vertical unless otherwise noted.
As an example, suppose $E_y = i E_x$, where $E_x$ is real. The y-component of the field is then out of phase with the x-component by the factor $i = e^{i\pi/2}$. Taking the real part of the field (6.2) we get
$$\begin{aligned} \mathbf{E}(z, t) &= \mathrm{Re}\left[ E_x e^{i(kz - \omega t)} \right] \hat{\mathbf{x}} + \mathrm{Re}\left[ e^{i\pi/2} E_x e^{i(kz - \omega t)} \right] \hat{\mathbf{y}} \\ &= E_x \cos(kz - \omega t)\, \hat{\mathbf{x}} + E_x \cos(kz - \omega t + \pi/2)\, \hat{\mathbf{y}} \\ &= E_x \left[ \cos(kz - \omega t)\, \hat{\mathbf{x}} - \sin(kz - \omega t)\, \hat{\mathbf{y}} \right] \end{aligned} \quad \text{(left circular)} \tag{6.3}$$
In this example, the field in the y-dimension lags behind the field in the x-dimension by a quarter cycle. That is, the behavior seen in the x-dimension happens in the y-dimension a quarter cycle later. The field never goes to zero simultaneously in both dimensions. In fact, in this example the strength of the electric field is constant, and it rotates in a circular pattern in the x-y plane. For this reason, this type of field is called circularly polarized. Figure 6.2 graphically shows the two linearly polarized pieces in (6.3) adding to make circularly polarized light.

Figure 6.2 The combination of two orthogonally polarized plane waves that are out of phase results in elliptically polarized light. Here we have left circularly polarized light created as specified by (6.3).
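The two properties claimed for this example, constant field strength and a quarter-cycle lag of the y-component, can be checked with a few lines of code. This is a minimal sketch: the function name is hypothetical, and the units are chosen (as an assumption) so that the wavelength and the period are both one.

```python
import cmath
import math

def field(z, t, E_x=1.0, k=2.0 * math.pi, omega=2.0 * math.pi):
    """Real (x, y) field components for E_y = i E_x in (6.2).

    The defaults make the wavelength and the period both equal to 1
    (an arbitrary choice of units for this sketch).
    """
    phase = cmath.exp(1j * (k * z - omega * t))
    return (E_x * phase).real, (1j * E_x * phase).real

z = 0.3
for t in (0.0, 0.1, 0.37, 0.8):
    ex, ey = field(z, t)
    ey_later = field(z, t + 0.25)[1]   # y-component a quarter period later
    print(round(math.hypot(ex, ey), 12), round(ex - ey_later, 12))
```

Each printed line should show a field strength of 1 and a zero difference, meaning that what the x-component does at time $t$ the y-component repeats a quarter period later.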
If we view a circularly polarized light field throughout space at a frozen instant
in time (as in Fig. 6.2), the electric field vector spirals as we move along the z-
dimension. If the sense of the spiral (with time frozen) matches that of a common
wood screw oriented along the z-axis, the polarization is called right handed. (It
makes no difference whether the screw is flipped end for end.) If instead the field
6.2 Jones Vectors for Representing Polarization 145
spirals in the opposite sense, then the polarization is called left handed. The field
shown in Fig. 6.2 is an example of left-handed circularly polarized light.
An equivalent way to view the handedness convention is to imagine the light
impinging on a screen as a function of time. The field of a right-handed circularly
polarized wave rotates counter clockwise at the screen, when looking along the k
direction (towards the front side of the screen). The field rotates clockwise for a
left-handed circularly polarized wave.
Linearly polarized light can become circularly or, in general, elliptically po-
larized after reflection from a metal surface if the incident light has both s- and
p-polarized components. A good experimentalist working with light needs to
know this. Reflections from multilayer dielectric mirrors can also exhibit these
phase shifts.
Please notice that $A$ and $B$ are real non-negative dimensionless numbers that
satisfy $A^2 + B^2 = 1$. If $E_y$ is zero, then $B = 0$ and everything is well-defined. On the
1 E. Hecht, Optics, 3rd ed., Sect. 8.12.2 (Massachusetts: Addison-Wesley, 1998).
\[ \alpha = \frac{1}{2}\tan^{-1}\!\left( \frac{2AB\cos\delta}{A^2 - B^2} \right) \tag{6.12} \]
with respect to the x-axis (see P6.8). This angle sometimes corresponds to the
minor axis and sometimes to the major axis of the ellipse, depending on the exact
values of $A$, $B$, and $\delta$. The other axis of the ellipse (major or minor) then occurs at
$\alpha \pm \pi/2$ (see Fig. 6.3).
We can deduce whether (6.12) corresponds to the major or minor axis of the
ellipse by comparing the strength of the electric field when it spirals through the
direction specified by $\alpha$ and when it spirals through $\alpha \pm \pi/2$. The strength of the
electric field at $\alpha$ is given by (see P6.8)
\[ E_{\alpha} = |E_{\text{eff}}| \sqrt{A^2\cos^2\alpha + B^2\sin^2\alpha + AB\cos\delta\,\sin 2\alpha} \qquad (E_{\max}\ \text{or}\ E_{\min}) \tag{6.13} \]
and the strength of the field when it spirals through the orthogonal direction
($\alpha \pm \pi/2$) is given by
\[ E_{\alpha \pm \pi/2} = |E_{\text{eff}}| \sqrt{A^2\sin^2\alpha + B^2\cos^2\alpha - AB\cos\delta\,\sin 2\alpha} \qquad (E_{\max}\ \text{or}\ E_{\min}) \tag{6.14} \]
After computing (6.13) and (6.14), we decide which represents $E_{\min}$ and which
$E_{\max}$ according to
\[ E_{\max} \ge E_{\min} \tag{6.15} \]
We could predict in advance which of (6.13) or (6.14) corresponds to the major
axis and which corresponds to the minor axis. However, making this prediction is
as complicated as simply evaluating (6.13) and (6.14) and determining which is
greater.
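The "evaluate both and compare" recipe can be sketched in a few lines. This is a minimal illustration that assumes the forms of (6.12) through (6.14) given above; the function name and the use of `atan2` (to avoid dividing by zero when $A = B$) are choices made here, not the text's.

```python
import math

def ellipse_axes(a, b, delta, e_eff=1.0):
    """Return (alpha, e_max, e_min) for Jones amplitudes a, b and relative phase delta."""
    # Orientation angle in the spirit of (6.12). atan2 replaces a bare arctangent;
    # the two can differ by pi/2 in alpha, which only swaps the roles of the axes.
    alpha = 0.5 * math.atan2(2.0 * a * b * math.cos(delta), a * a - b * b)
    cross = a * b * math.cos(delta) * math.sin(2.0 * alpha)
    along = a * a * math.cos(alpha) ** 2 + b * b * math.sin(alpha) ** 2 + cross   # radicand of (6.13)
    across = a * a * math.sin(alpha) ** 2 + b * b * math.cos(alpha) ** 2 - cross  # radicand of (6.14)
    # Clamp tiny negative round-off before taking square roots.
    e_along = abs(e_eff) * math.sqrt(max(along, 0.0))
    e_across = abs(e_eff) * math.sqrt(max(across, 0.0))
    # Decide which is which by direct comparison, as in (6.15).
    return alpha, max(e_along, e_across), min(e_along, e_across)

linear = ellipse_axes(math.cos(math.pi / 6), math.sin(math.pi / 6), 0.0)
circular = ellipse_axes(1 / math.sqrt(2), 1 / math.sqrt(2), math.pi / 2)
print(linear, circular)
```

For light linearly polarized at 30 degrees the smaller field strength collapses to zero, while for circular light the two strengths coincide.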
Elliptically polarized light is often characterized by the ellipticity, given by the
ratio of the minor axis to the major axis:
\[ e \equiv \frac{E_{\min}}{E_{\max}} \tag{6.16} \]
The ellipticity $e$ ranges between zero (corresponding to linearly polarized light)
and one (corresponding to circularly polarized light). Finally, the helicity or
handedness of elliptically polarized light is as follows (see P6.2):
\[ 0 < \delta < \pi \qquad \text{left-handed helicity} \tag{6.17} \]
\[ \pi < \delta < 2\pi \qquad \text{right-handed helicity} \tag{6.18} \]
Figure 6.3 The electric field of elliptically polarized light traces an ellipse in the plane
perpendicular to its propagation direction. The two plots are for different values of $A$, $B$,
and $\delta$. The angle $\alpha$ can describe the major axis (top) or the minor axis (bottom),
depending on the values of these parameters.

6.4 Linear Polarizers and Jones Matrices

In 1928, Edwin Land invented an inexpensive polarizing device. He did it by
stretching a polymer sheet and infusing it with iodine. The stretching caused the
polymer chains to align along a common direction, whereupon the sheet was
cemented to a substrate. The infusion of iodine caused the individual chains to
become conductive, like microscopic wires.
When light impinges upon Land's Polaroid sheet, the component of electric
field that is parallel to the polymer chains causes a current $\mathbf{J}_{\text{free}}$ to oscillate in
that dimension. The resistance to the current quickly dissipates the energy (i.e.
the refractive index is complex) and the light is absorbed. The thickness of the
Polaroid sheet is chosen sufficiently large to ensure that virtually none of the light
with electric field component oscillating along the chains makes it through the
device.
The component of electric field that is orthogonal to the polymer chains
encounters electrons that are essentially bound to the narrow width of individual
polymer molecules. For this polarization component, the wave passes through
the material much like it does through typical dielectrics such as glass (i.e. the
refractive index is real). Today, there is a wide variety of technologies for making
polarizers, many very different from Polaroid.
As expected, the polarization state is unaffected by the polarizer. (We have ignored
possible attenuation from surface reflections.)
Now consider vertically polarized light traversing the same horizontal polarizer. In
this case, we have:
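\[ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \qquad \text{(horizontal polarizer on vertical linear polarization)} \]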
While you might readily agree that the matrices given in (6.19) and (6.20)
can be used to get the right result for light traversing a horizontal or a vertical
polarizer, you probably aren't very impressed as of yet. In the next few sections,
we will derive Jones matrices for a number of optical elements that can modify
polarization: polarizers at arbitrary angle, wave plates at arbitrary angle, and
reflection or transmission at an interface. Table 6.2 shows Jones matrices for
each of these devices. Before deriving these specific Jones matrices, however, we
take a moment to appreciate why the Jones matrix formulation is useful.

Table 6.2 Common Jones Matrices. The angle $\theta$ is measured with respect to the x-axis and
specifies the transmission axis of a linear polarizer or the fast axis of a wave plate.
  Linear polarizer:                   $\begin{bmatrix} \cos^2\theta & \sin\theta\cos\theta \\ \sin\theta\cos\theta & \sin^2\theta \end{bmatrix}$
  Half wave plate:                    $\begin{bmatrix} \cos 2\theta & \sin 2\theta \\ \sin 2\theta & -\cos 2\theta \end{bmatrix}$
  Quarter wave plate:                 $\begin{bmatrix} \cos^2\theta + i\sin^2\theta & (1-i)\sin\theta\cos\theta \\ (1-i)\sin\theta\cos\theta & \sin^2\theta + i\cos^2\theta \end{bmatrix}$
  Reflection from an interface:       $\begin{bmatrix} -r_p & 0 \\ 0 & r_s \end{bmatrix}$
  Transmission through an interface:  $\begin{bmatrix} t_p & 0 \\ 0 & t_s \end{bmatrix}$

The real power of the formalism becomes clear as we consider situations
where light encounters multiple polarization elements in sequence. In these
situations, we use a product of Jones matrices to represent the effect of the compound
the light encounters the devices. Therefore, the matrix for the first device ($J_1$) is
written on the right, and so on until the last device encountered, which is written
on the left, farthest from the Jones vector.
When part of the light is absorbed by passing through one or more polarizers
in a system, the Jones vector of the exiting light does not necessarily remain
normalized to magnitude one (see Example 6.1). The factor by which the intensity
of the light decreases is given by
$(A'\hat{\mathbf{x}} + B'\hat{\mathbf{y}}) \cdot (A'\hat{\mathbf{x}} + B'\hat{\mathbf{y}})^* = |A'|^2 + |B'|^2$. The
intensity exiting from the system is then
\[ I' = \frac{1}{2} n c \epsilon_0 \left|E'_{\text{eff}}\right|^2 \quad \text{where} \quad \left|E'_{\text{eff}}\right|^2 = |E_{\text{eff}}|^2 \left( |A'|^2 + |B'|^2 \right) \tag{6.23} \]
Here, $E_{\text{eff}}$ is the original effective field before entering the system (see (6.10)), and
$E'_{\text{eff}}$ is the final effective field.
For the sake of further analysis, if desired, one can renormalize the final Jones
vector and write it in standard form as follows:
\[ \frac{1}{\sqrt{|A'|^2 + |B'|^2}} \begin{bmatrix} A' \\ B' \end{bmatrix} = \frac{e^{i\delta_{A'}}}{\sqrt{|A'|^2 + |B'|^2}} \begin{bmatrix} |A'| \\ |B'| e^{i\delta'} \end{bmatrix} \]
This is the Jones vector that is consistent with $E'_{\text{eff}}$. The uninteresting overall
phase factor $e^{i\delta_{A'}}$ can be incorporated into $E'_{\text{eff}}$, making $A'$ real and positive. $\delta'$ is
Let the transmission axis of the polarizer be specified by the unit vector $\hat{\mathbf{e}}_1$
and the absorption axis of the polarizer be specified by $\hat{\mathbf{e}}_2$ (orthogonal to the
transmission axis). The vector $\hat{\mathbf{e}}_1$ is oriented at an angle $\theta$ from the x-axis, as
shown in Fig. 6.6. We need to write the electric field components in terms of the
new basis specified by $\hat{\mathbf{e}}_1$ and $\hat{\mathbf{e}}_2$. By inspection of the geometry, the x-y unit
vectors are connected to the new coordinate system via:
\[ \begin{aligned} \hat{\mathbf{x}} &= \cos\theta\,\hat{\mathbf{e}}_1 - \sin\theta\,\hat{\mathbf{e}}_2 \\ \hat{\mathbf{y}} &= \sin\theta\,\hat{\mathbf{e}}_1 + \cos\theta\,\hat{\mathbf{e}}_2 \end{aligned} \tag{6.25} \]
Figure 6.5 Light transmitting through a polarizer oriented with transmission axis at angle
$\theta$ from x-axis.
where
\[ \begin{aligned} E_1 &\equiv E_x\cos\theta + E_y\sin\theta \\ E_2 &\equiv -E_x\sin\theta + E_y\cos\theta \end{aligned} \tag{6.27} \]
Now we introduce the effect of the polarizer on the field: $E_1$ is transmitted
unaffected, while $E_2$ is extinguished. To account for the effect of the device, we
multiply $E_2$ by a parameter $\xi$. In the case of the polarizer, $\xi$ is zero, but when we
consider wave plates we will use other values for $\xi$. After traversing the polarizer,
the field becomes
\[ \mathbf{E}_{\text{after}}(z,t) = \left( E_1\hat{\mathbf{e}}_1 + \xi E_2\hat{\mathbf{e}}_2 \right) e^{i(kz - \omega t)} \tag{6.28} \]
Figure 6.6 Electric field components written in the $\hat{\mathbf{e}}_1$-$\hat{\mathbf{e}}_2$ basis.
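We now have the field after the polarizer, but it would be nice to rewrite it in terms
of the original x-y basis. By inverting (6.25), or again by inspection of Fig. 6.6, we
see that
\[ \begin{aligned} \hat{\mathbf{e}}_1 &= \cos\theta\,\hat{\mathbf{x}} + \sin\theta\,\hat{\mathbf{y}} \\ \hat{\mathbf{e}}_2 &= -\sin\theta\,\hat{\mathbf{x}} + \cos\theta\,\hat{\mathbf{y}} \end{aligned} \tag{6.29} \]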
Substitution of these relationships into (6.28) together with the definitions (6.27)
(6.30)
Notice that if $\xi = 1$ (i.e. no polarizer), then we get back exactly what we started
with (i.e. (6.30) reduces to (6.24)).
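The three steps of this derivation (project onto the rotated basis as in (6.27), multiply the second component by $\xi$, and return to the x-y basis as in (6.29)) can be checked numerically. The sketch below is illustrative only: the helper names are made up here, and the closed-form matrix is what the substitution should produce, reconstructed from the steps above rather than quoted from the text.

```python
import math

def matmul2(m, n):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(m[r][k] * n[k][c] for k in range(2)) for c in range(2)] for r in range(2)]

def rotation(theta):
    """Matrix taking (E_x, E_y) to (E_1, E_2), following (6.27)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]

def element_by_steps(theta, xi):
    """Rotate into the e1-e2 basis, scale the e2 component by xi, rotate back."""
    scale = [[1.0, 0.0], [0.0, xi]]
    return matmul2(rotation(-theta), matmul2(scale, rotation(theta)))

def element_closed_form(theta, xi):
    """Closed form expected from carrying out the substitution by hand (an assumption here)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * c + xi * s * s, (1 - xi) * s * c],
            [(1 - xi) * s * c, s * s + xi * c * c]]

def max_difference(m, n):
    """Largest absolute entry-wise difference between two 2x2 matrices."""
    return max(abs(m[r][c] - n[r][c]) for r in range(2) for c in range(2))

polarizer = element_by_steps(0.7, 0.0)   # xi = 0: ideal polarizer at 0.7 rad
no_device = element_by_steps(0.7, 1.0)   # xi = 1: nothing happens
print(polarizer, no_device)
```

Setting `xi` to 1 returns the identity matrix, which mirrors the remark above that the field is then unchanged.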
To get to the Jones matrix for the polarizer, we note that (6.30) is a linear mix-
ture of E x and E y which can be represented with matrix algebra. If we represent
the electric field as a two-dimensional column vector with its x component in the
top and its y component in the bottom (like a Jones vector), then we can rewrite
(6.30) as
By adjusting the thickness of the wave plate, one can introduce any desired phase
difference.
The most common types of wave plates are the quarter-wave plate and the
half-wave plate. The quarter-wave plate introduces a phase difference of
where m is an integer. This means that the polarization component along the
slow axis is delayed spatially by a half wavelength (or three halves, etc.). When
m = 0 in either (6.34) or (6.35), the wave plate is said to be zero order.
The derivation of the Jones matrix for the two wave plates is essentially the
same as the derivation for the polarizer in the previous section. Let $\hat{\mathbf{e}}_1$ correspond
to the fast axis, and let $\hat{\mathbf{e}}_2$ correspond to the slow axis, as illustrated in Fig. 6.7. We
proceed as before. However, instead of setting $\xi$ equal to zero in (6.31), we must
choose values for $\xi$ appropriate for each wave plate. Since nothing is absorbed,
$\xi$ should have a magnitude equal to one. The important feature is the phase of
$\xi$. As seen in (6.33), the field component along the slow axis accumulates excess
phase relative to the component along the fast axis, and we let $\xi$ account for this.
In the case of the quarter-wave plate, the appropriate factor from (6.34) is
\[ \xi = e^{i\pi/2} = i \qquad \text{(quarter-wave plate)} \tag{6.36} \]
This describes a relative phase delay for the light emerging with polarization along
the slow axis. Substituting (6.36) into (6.30) yields the Jones matrix for a quarter
wave plate:
\[ \begin{bmatrix} \cos^2\theta + i\sin^2\theta & \sin\theta\cos\theta - i\sin\theta\cos\theta \\ \sin\theta\cos\theta - i\sin\theta\cos\theta & \sin^2\theta + i\cos^2\theta \end{bmatrix} \qquad \text{(quarter-wave plate)} \tag{6.37} \]
Figure 6.7 Wave plate interacting with a plane wave. (Figure labels: slow axis, fast axis,
wave plate; transmitted polarization components have altered relative phase.)
Example 6.2
Calculate the Jones matrix for a quarter-wave plate at $\theta = 45^\circ$, and determine its
effect on horizontally polarized light.
Solution: At $\theta = 45^\circ$, the Jones matrix for the quarter-wave plate (6.37) reduces to
\[ \frac{e^{i\pi/4}}{\sqrt{2}} \begin{bmatrix} 1 & -i \\ -i & 1 \end{bmatrix} \qquad \text{(quarter-wave plate, fast axis at } \theta = 45^\circ\text{)} \tag{6.40} \]
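The reduction to (6.40) and its action on horizontally polarized light can be verified with a short script. This is a sketch under the assumption that the quarter-wave plate matrix has the form of (6.37) with $\xi = i$; the helper names are illustrative.

```python
import cmath
import math

def wave_plate(theta, xi):
    """Jones matrix for a retarder with fast axis at theta and slow-axis factor xi."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * c + xi * s * s, (1 - xi) * s * c],
            [(1 - xi) * s * c, s * s + xi * c * c]]

def apply(matrix, vector):
    """Multiply a 2x2 Jones matrix onto a Jones vector."""
    return [matrix[0][0] * vector[0] + matrix[0][1] * vector[1],
            matrix[1][0] * vector[0] + matrix[1][1] * vector[1]]

quarter_45 = wave_plate(math.pi / 4, 1j)        # xi = i from (6.36)
out = apply(quarter_45, [1.0, 0.0])             # horizontally polarized input

prefactor = cmath.exp(1j * math.pi / 4) / math.sqrt(2)
expected = [prefactor, -1j * prefactor]         # first column of (6.40)
relative_phase = cmath.phase(out[1] / out[0])   # phase of the y component relative to x
print(out, relative_phase)
```

The two output components have equal magnitude and a relative phase of $-\pi/2$ (equivalently $3\pi/2$), so the output is circularly polarized; by the helicity rule quoted in P6.2 this corresponds to right-handed light.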
Example 6.3
Calculate the effect of a half wave plate at an arbitrary $\theta$ on horizontally polarized
light.
The resulting Jones vector describes linearly polarized light at an angle $\alpha = 2\theta$ from
the x-axis (see Table 6.1).
To the extent that the s and p components of the field behave differently,
the overall polarization state is altered. For example, a linearly-polarized field
upon reflection can become elliptically polarized (see L 6.9). Even when a wave
reflects at normal incidence so that the s and p components are indistinguishable,
right-circular polarized light becomes left-circular polarized. This is the same
effect that causes a right-handed person to appear left-handed when viewed in a
mirror.
We can use Jones calculus to keep track of how reflection and transmission
influence polarization. However, before proceeding, we emphasize that in this
context we do not strictly adhere to a single coordinate system as we did in
chapter 3, for example in Fig. 3.1. Instead, we consider each plane wave, whether
incident, reflected or transmitted, to propagate in the z-direction of its own frame,
regardless of the relative angles between the incident and reflected wave. This
loose manner of defining coordinate systems, depicted in Fig. 6.9, has a great
advantage. The x and y dimensions in each individual frame are aligned parallel
to their respective p and s field components. We will adopt the convention that
p-polarized light in all cases is associated with the x-dimension (horizontal, say).
The s-polarized component then lies along the y-dimension (vertical). These
conventions are different from those used in chapter 3 but will do us no harm.
Figure 6.9 Incident, reflected and transmitted plane waves, each propagating along the
z-axis of its own reference frame.
We are now in a position to see why there is a handedness inversion upon
reflection from a mirror. Notice in Fig. 6.9 that for the incident light, the
s-component of the field crossed into the p-component of the field yields a vector
pointing along the beam's propagation direction. However, for the reflected light,
the s-component crossed into the p-component points opposite to that beam's
propagation direction.
The Jones matrix corresponding to reflection from a surface is
\[ \begin{bmatrix} -r_p & 0 \\ 0 & r_s \end{bmatrix} \qquad \text{(Jones matrix for reflection)} \tag{6.43} \]
and p-components differ markedly, which can cause linearly polarized light to
become elliptically polarized (see P6.14).
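The handedness inversion at normal incidence can be illustrated numerically. The sketch assumes that the reflection matrix (6.43) carries a minus sign on $r_p$, that $r_s = r_p$ at normal incidence, and that $[1, -i]/\sqrt{2}$ represents right-circular light; the value $r = -0.2$ and the helper names are purely illustrative.

```python
import math

def reflect(jones, r_p, r_s):
    """Apply a diagonal reflection matrix diag(-r_p, r_s) to a Jones vector."""
    return [-r_p * jones[0], r_s * jones[1]]

def helicity(jones):
    """Classify helicity from the sign of sin(delta), where B/A is proportional to exp(i*delta)."""
    a, b = complex(jones[0]), complex(jones[1])
    s = (a.conjugate() * b).imag
    if s > 0:
        return "left"    # 0 < delta < pi
    if s < 0:
        return "right"   # pi < delta < 2*pi
    return "linear"

r = -0.2                                             # same coefficient for s and p (normal incidence)
right_circular = [1 / math.sqrt(2), -1j / math.sqrt(2)]
reflected = reflect(right_circular, r, r)
print(helicity(right_circular), "->", helicity(reflected))
```

Because the two diagonal entries differ in sign when $r_s = r_p$, the relative phase between the components shifts by $\pi$, which flips the helicity.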
Transmission through a material interface can also influence the polarization
of the field, although typically to a lesser degree. However, there is no handedness
inversion, since the light continues on in a forward sense. The Jones matrix for
transmission is
\[ \begin{bmatrix} t_p & 0 \\ 0 & t_s \end{bmatrix} \qquad \text{(Jones matrix for transmission)} \tag{6.44} \]
If a beam of light encounters a series of mirrors, the final polarization is
determined by multiplying the sequence of appropriate Jones matrices (6.43)
onto the initial polarization. This procedure is straightforward if the normals
to all of the mirrors lie in a single plane (say parallel to the surface of an optical
bench). However, if the beam path deviates from this plane (due to vertical
tilt on the mirrors), then we must reorient our coordinate system before each
mirror to have a new horizontal (p-polarized dimension) and the new vertical
(s-polarized dimension). Earlier in this chapter we performed a rotation of a
coordinate system through an angle $\theta$, described in (6.27), which is also useful
here. The rotation can be accomplished by multiplying the following matrix onto
the incident Jones vector:
\[ \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \qquad \text{(rotation of coordinates through an angle } \theta\text{)} \tag{6.45} \]
This is understood as a rotation about the z-axis. The angle of rotation $\theta$ is
chosen such that the rotated x-axis lies in the plane of incidence for the mirror.
When such a reorientation of coordinates is necessary, the two orthogonal field
components in the initial coordinate system are stirred together to form the field
components in the new system. This does not change the intrinsic characteristics
of the polarization, just its representation.
Figure 6.10 If the plane of incidence does not coincide for successive elements in an
optical system, a rotation matrix must be applied to rotate the x-axis to the plane of
incidence before computing the effect of each element. (Figure labels: original x-axis,
original y-axis, rotated x-axis in the plane of incidence, rotated y-axis.)
one after the sample, where s and p-polarized reflections take place. The first
polarizer ensures that linearly polarized light arrives at the test surface (polarized
at angle $\alpha$ to give both s and p-components). The Jones matrix for the test surface
reflection is given by (6.43), and the Jones matrix for the analyzing polarizer
oriented at angle $\theta$ is given by (6.32). The Jones vector for the light arriving at the
detector is then
where
\[ \kappa_1 \equiv \frac{2\tan\Psi\cos\Delta\tan\alpha}{\tan^2\Psi + \tan^2\alpha} \quad \text{and} \quad \kappa_2 \equiv \frac{\tan^2\Psi - \tan^2\alpha}{\tan^2\Psi + \tan^2\alpha} \tag{6.50} \]
In commercial ellipsometers, the angle $\theta$ of the analyzing polarizer often rotates at
a high speed, and the time dependence of the light reaching a detector is analyzed.
From this type of measurement, the coefficients $\kappa_1$ and $\kappa_2$ can be extracted with
high precision. Then equations (6.50) can be inverted (see problem P6.15) to
reveal
\[ \tan\Psi = \sqrt{\frac{1 + \kappa_2}{1 - \kappa_2}}\,|\tan\alpha| \quad \text{and} \quad \cos\Delta = \frac{\kappa_1}{\sqrt{1 - \kappa_2^2}}\,\text{sign}(\alpha) \tag{6.51} \]
From a series of these types of measurements, it is possible to extract the values
of $n$ and $\kappa$ for materials from the expressions for $r_s$ and $r_p$ (with the aid of a
computer!). With a sufficiently large number of unique measurements, it is
possible even to characterize multilayer coatings involving layers with varying
thicknesses and indices.
unpolarized. The transverse electric field direction in natural light varies rapidly
(and quasi randomly). Such variations imply the superposition of multiple fre-
quencies as opposed to the single frequency assumed in the formulation of Jones
calculus earlier in this chapter. Unpolarized light can become partially polarized
when it, for example, reflects from a surface at oblique incidence, since s and p
components of the polarization might reflect with differing strength.
Stokes vectors are used to keep track of the partial polarization (and atten-
uation) of a light beam as the light progresses through an optical system.5 In
contrast, Jones vectors are designed for pure polarization states. We can consider
any light beam as an intensity sum of completely unpolarized light and perfectly
polarized light:
I = I pol + I un (6.52)
It is assumed that both types of light propagate in the same direction.
The main characteristic of unpolarized light is that it cannot be extinguished
by a single polarizer (even in combination with a wave plate). Moreover, the
transmission of unpolarized light through an ideal polarizer is always 50%. On the
other hand, polarized light (be it linearly, circularly, or elliptically polarized) can
always be represented by a Jones vector, and it is always possible to extinguish it
using a quarter wave plate and a single polarizer.
We may introduce the degree of polarization as the fraction of the intensity
that is in a definite polarization state:
\[ \Pi_{\text{pol}} \equiv \frac{I_{\text{pol}}}{I_{\text{pol}} + I_{\text{un}}} \tag{6.53} \]
The degree of polarization takes on values between zero and one. Thus, if the light
is completely unpolarized (such that $I_{\text{pol}} = 0$), the degree of polarization is zero,
and if the beam is fully polarized (such that $I_{\text{un}} = 0$), the degree of polarization is
one.
A Stokes vector, which characterizes a partially polarized beam, is written as
\[ \begin{bmatrix} S_0 \\ S_1 \\ S_2 \\ S_3 \end{bmatrix} \]

Sir George Gabriel Stokes (1819-1903, Irish) was born in Skreen, Ireland. He entered
Cambridge University at age 18 and graduated four years later with the distinction of
senior wrangler. In 1849, he became a professor of mathematics at Cambridge where he
later worked with James Clerk Maxwell and Lord Kelvin to form the Cambridge School of
Mathematical Physics. Stokes was a powerful mathematician as well as a good
experimentalist, often testing his theoretical solutions in the laboratory. In addition to
his contributions to optics, Stokes made important contributions to fluid dynamics (e.g.
the Navier-Stokes equations) and to mathematical physics; Stokes' theorem is employed in
several places in this book. (Wikipedia)

The parameter
\[ S_0 \equiv \frac{I}{I_{\text{in}}} = \frac{I_{\text{pol}} + I_{\text{un}}}{I_{\text{in}}} \tag{6.54} \]
is a comparison of the beam's intensity (or power) to a benchmark or input
intensity, $I_{\text{in}}$, measured before the beam enters the optical system under
consideration. $I$ represents the intensity at the point of investigation, where one wishes
to characterize the beam. Thus, the value $S_0 = 1$ represents the input intensity,
and $S_0$ can drop to values less than one, to account for attenuation of light by
polarizers in the system. ($S_0$ could increase in the atypical case of amplification.)
5 E. Hecht, Optics, 3rd ed., Sect. 8.12.1 (Massachusetts: Addison-Wesley, 1998).
The next parameter, $S_1$, describes how much the light looks either horizontally
or vertically polarized, and it is defined as
\[ S_1 \equiv \frac{2 I_{\text{hor}}}{I_{\text{in}}} - S_0 \tag{6.55} \]
Here, $I_{\text{hor}}$ represents the amount of light detected if an ideal linear polarizer is
placed with its axis aligned horizontally directly in front of the detector (inserted
where the light is characterized). $S_1$ ranges between negative one and one, taking
on its extremes when the light is linearly polarized either vertically or horizontally,
respectively. If the light has been attenuated, it may still be perfectly horizontally
polarized even if $S_1$ has a magnitude less than one. (Alternatively, one might
examine $S_1/S_0$, which is guaranteed to range between negative one and one.)
The parameter $S_2$ describes how much the light looks linearly polarized along
the diagonals. It is given by
\[ S_2 \equiv \frac{2 I_{45^\circ}}{I_{\text{in}}} - S_0 \tag{6.56} \]
Similar to the previous case, $I_{45^\circ}$ represents the amount of light detected if an
ideal linear polarizer is placed with its axis at $45^\circ$ directly in front of the detector
(inserted where the light is characterized). As before, $S_2$ ranges between negative
one and one, taking on extremes when the light is linearly polarized either at $45^\circ$
or $135^\circ$.
Finally, $S_3$ characterizes the extent to which the beam is either right or left
circularly polarized:
\[ S_3 \equiv \frac{2 I_{\text{r-cir}}}{I_{\text{in}}} - S_0 \tag{6.57} \]
Here, $I_{\text{r-cir}}$ represents the amount of light detected if an ideal right-circular
polarizer is placed directly in front of the detector. A right-circular polarizer is
one that passes right-handed polarized light, but blocks left-handed polarized
light. One way to construct such a polarizer is a quarter wave plate, followed
by a linear polarizer with the transmission axis aligned $45^\circ$ from the wave-plate
fast axis, followed by another quarter wave plate at $45^\circ$ from the polarizer (see
P6.12).6 Again, this parameter ranges between negative one and one, taking on
the extremes for left and right circular polarization, respectively.
Importantly, if any of the parameters $S_1$, $S_2$, or $S_3$ take on their extreme values
(i.e. a magnitude equal to $S_0$), the other two parameters necessarily equal zero.
As an example, if a beam is linearly polarized in the horizontal direction with
$I = I_{\text{in}}$, then we have $I_{\text{hor}} = I_{\text{in}}$, $I_{45^\circ} = I_{\text{in}}/2$, and $I_{\text{r-cir}} = I_{\text{in}}/2$. This yields $S_0 = 1$,
$S_1 = 1$, $S_2 = 0$, and $S_3 = 0$. As a second example, suppose that the light has
been attenuated to $I = I_{\text{in}}/3$ but is purely left circularly polarized. Then we have
$I_{\text{hor}} = I_{\text{in}}/6$, $I_{45^\circ} = I_{\text{in}}/6$, and $I_{\text{r-cir}} = 0$. The Stokes parameters are then $S_0 = 1/3$,
$S_1 = 0$, $S_2 = 0$, and $S_3 = -1/3$.
Another interesting case is completely unpolarized light, which transmits 50%
through all of the polarizers discussed above. In this case, $I_{\text{hor}} = I_{45^\circ} = I_{\text{r-cir}} = I/2$
and $S_1 = S_2 = S_3 = 0$.
6 The final quarter wave plate is to put the light back into the original circular state not needed
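The three worked cases above follow directly from definitions (6.54) through (6.57), as the following sketch shows. The function name and argument order are illustrative choices; intensities are expressed in units where $I_{\text{in}} = 1$.

```python
def stokes_from_intensities(i_total, i_hor, i_45, i_rcir, i_in=1.0):
    """Stokes parameters from four detected intensities, per (6.54)-(6.57)."""
    s0 = i_total / i_in
    s1 = 2.0 * i_hor / i_in - s0
    s2 = 2.0 * i_45 / i_in - s0
    s3 = 2.0 * i_rcir / i_in - s0
    return (s0, s1, s2, s3)

# Horizontally polarized light at full input intensity.
horizontal = stokes_from_intensities(1.0, 1.0, 0.5, 0.5)
# Left-circular light attenuated to one third of the input intensity.
left_attenuated = stokes_from_intensities(1 / 3, 1 / 6, 1 / 6, 0.0)
# Completely unpolarized light: every test polarizer passes half.
unpolarized = stokes_from_intensities(1.0, 0.5, 0.5, 0.5)
print(horizontal, left_attenuated, unpolarized)
```

Note the negative $S_3$ in the second case: a right-circular analyzer passes nothing, so $2 I_{\text{r-cir}}/I_{\text{in}} - S_0$ reduces to $-S_0$.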
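Example 6.4
Find the Stokes parameters for perfectly polarized light, represented by an arbitrary
Jones vector $\begin{bmatrix} A \\ B \end{bmatrix}$ where $A$ and $B$ are complex.7 Depending on the values $A$ and $B$,
Solution: The input intensity of this polarized beam is $I_{\text{in}} = I_{\text{pol}} = |A|^2 + |B|^2$,
according to Eq. (6.23), where we absorb the factor $\frac{1}{2}\epsilon_0 c |E_{\text{eff}}|^2$ into $|A|^2$ and $|B|^2$
for convenience. The Jones vector for the light that passes through a horizontal
polarizer is
\[ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} A \\ B \end{bmatrix} = \begin{bmatrix} A \\ 0 \end{bmatrix} \]
which gives a measured intensity of $I_{\text{hor}} = |A|^2$. Similarly, the Jones vector when
the beam is passed through a polarizer oriented at $45^\circ$ is
\[ \frac{1}{2} \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} A \\ B \end{bmatrix} = \frac{A + B}{2} \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]
leading to an intensity of
\[ I_{45^\circ} = \frac{|A + B|^2}{2} = \frac{|A|^2 + |B|^2 + A^*B + AB^*}{2} \]
Finally, the Jones vector for light passing through a right-circular polarizer (see
P6.12) is
\[ \frac{1}{2} \begin{bmatrix} 1 & i \\ -i & 1 \end{bmatrix} \begin{bmatrix} A \\ B \end{bmatrix} = \frac{A + iB}{2} \begin{bmatrix} 1 \\ -i \end{bmatrix} \]
giving an intensity of
\[ I_{\text{r-cir}} = \frac{|A + iB|^2}{2} = \frac{|A|^2 + |B|^2 + i(A^*B - AB^*)}{2} \]
Thus, the Stokes parameters become
\[ \begin{aligned}
S_0 &= \frac{|A|^2 + |B|^2}{I_{\text{in}}} = 1 \\
S_1 &= \frac{2|A|^2}{I_{\text{in}}} - \frac{|A|^2 + |B|^2}{I_{\text{in}}} = \frac{|A|^2 - |B|^2}{I_{\text{in}}} \\
S_2 &= \frac{|A|^2 + |B|^2 + A^*B + AB^*}{I_{\text{in}}} - \frac{|A|^2 + |B|^2}{I_{\text{in}}} = \frac{A^*B + AB^*}{I_{\text{in}}} \\
S_3 &= \frac{|A|^2 + |B|^2 + i(A^*B - AB^*)}{I_{\text{in}}} - \frac{|A|^2 + |B|^2}{I_{\text{in}}} = i\,\frac{A^*B - AB^*}{I_{\text{in}}}
\end{aligned} \]
7 We will find it easier in this appendix to write $\begin{bmatrix} A \\ B \end{bmatrix}$ instead of $\begin{bmatrix} |A| \\ |B| e^{i\delta} \end{bmatrix}$, where $\delta$ is the phase
difference between $B$ and $A$.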
Note that the unpolarized portion of the light does not contribute to $S_1$, $S_2$,
or $S_3$. Half of the unpolarized light survives any of the test filters, which cancels
neatly with the unpolarized portion of $S_0 = (I_{\text{pol}} + I_{\text{un}})/I_{\text{in}}$ in Eqs. (6.55)-(6.57).
With the aid of the results in Example 6.4, a completely general form of the
Stokes vector may then be written as
\[ \begin{bmatrix} S_0 \\ S_1 \\ S_2 \\ S_3 \end{bmatrix} = \frac{1}{I_{\text{in}}} \begin{bmatrix} I_{\text{pol}} + I_{\text{un}} \\ |A|^2 - |B|^2 \\ A^*B + AB^* \\ i(A^*B - AB^*) \end{bmatrix} \tag{6.58} \]
where the Jones vector for the polarized portion of the light is
\[ \begin{bmatrix} A \\ B \end{bmatrix} \]
and the intensity of the polarized portion of the light is
\[ I_{\text{pol}} = |A|^2 + |B|^2 \tag{6.59} \]
Again, we have hidden the factor $\frac{1}{2}\epsilon_0 c |E_{\text{eff}}|^2$ for the polarized portion of the light
inside $|A|^2$ and $|B|^2$.
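We would like to express the degree of polarization in terms of the Stokes
parameters. We first note that the quantity $\sqrt{S_1^2 + S_2^2 + S_3^2}$ can be expressed as
\[ \begin{aligned}
\sqrt{S_1^2 + S_2^2 + S_3^2} &= \sqrt{ \left( \frac{|A|^2 - |B|^2}{I_{\text{in}}} \right)^2 + \left( \frac{A^*B + AB^*}{I_{\text{in}}} \right)^2 + \left( \frac{i(A^*B - AB^*)}{I_{\text{in}}} \right)^2 } \\
&= \frac{|A|^2 + |B|^2}{I_{\text{in}}} \\
&= \frac{I_{\text{pol}}}{I_{\text{in}}}
\end{aligned} \tag{6.60} \]
Substituting (6.54) and (6.60) into the expression for the degree of polarization
(6.53) yields
\[ \Pi_{\text{pol}} = \frac{1}{S_0} \sqrt{S_1^2 + S_2^2 + S_3^2} \tag{6.61} \]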
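The collapse of the square root in (6.60) rests on a short piece of algebra, sketched here for completeness (the intermediate grouping is added for illustration and is not taken from the text). The second and third terms combine as a difference of squares:

\[ (A^*B + AB^*)^2 + \left[ i(A^*B - AB^*) \right]^2 = (A^*B + AB^*)^2 - (A^*B - AB^*)^2 = 4|A|^2|B|^2 \]

Adding the first term then completes a perfect square:

\[ \left( |A|^2 - |B|^2 \right)^2 + 4|A|^2|B|^2 = \left( |A|^2 + |B|^2 \right)^2 \]

Taking the square root and dividing by $I_{\text{in}}$ gives the middle line of (6.60).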
If the light is polarized such that it perfectly transmits through or is perfectly
extinguished by one of the three test polarizers associated with S 1 , S 2 , or S 3 , then
the degree of polarization will be unity. Obviously, it is possible to have pure
polarization states that are not aligned with the axes of any one of these test
polarizers. In this situation, the degree of polarization is still one, although the
values S 1 , S 2 , and S 3 may all three contribute to (6.61).
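As an illustration of the last point, the sketch below builds the Stokes vector of (6.58) from a Jones vector plus an optional unpolarized intensity and evaluates the degree of polarization (6.61). The function names, and the choice to default $I_{\text{in}}$ to the total intensity, are assumptions made for this example.

```python
import cmath
import math

def stokes_vector(a, b, i_un=0.0, i_in=None):
    """Stokes vector per (6.58) for Jones components a, b plus unpolarized intensity i_un."""
    a, b = complex(a), complex(b)
    i_pol = abs(a) ** 2 + abs(b) ** 2
    if i_in is None:
        i_in = i_pol + i_un            # assumption: benchmark equals the total intensity
    cross = a.conjugate() * b          # A* B
    return [
        (i_pol + i_un) / i_in,
        (abs(a) ** 2 - abs(b) ** 2) / i_in,
        2.0 * cross.real / i_in,       # A*B + AB* = 2 Re(A*B)
        -2.0 * cross.imag / i_in,      # i(A*B - AB*) = -2 Im(A*B)
    ]

def degree_of_polarization(s):
    """Degree of polarization from a Stokes vector, per (6.61)."""
    return math.sqrt(s[1] ** 2 + s[2] ** 2 + s[3] ** 2) / s[0]

elliptical = stokes_vector(0.6, 0.8 * cmath.exp(0.7j))
diluted = stokes_vector(0.6, 0.8 * cmath.exp(0.7j), i_un=1.0)
right_circular = stokes_vector(1 / math.sqrt(2), -1j / math.sqrt(2))
print(degree_of_polarization(elliptical), degree_of_polarization(diluted))
```

The elliptical state has all three of $S_1$, $S_2$, $S_3$ nonzero yet a degree of polarization of one, while mixing in an equal intensity of unpolarized light lowers it to one half.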
Finally, it is possible to represent polarizing devices as matrices that operate
on the Stokes vectors in much the same way that Jones matrices operate on
Jones vectors. Since Stokes vectors are four-dimensional, the matrices used are
four-by-four. These are known as Mueller matrices.8
We know that 50% of the unpolarized light transmits through a polarizer, ending
up as polarized light with Jones vector
\[ \begin{bmatrix} A_1' \\ B_1' \end{bmatrix} = \sqrt{\frac{I_{\text{un}}}{2}} \begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix} \]
(see table 6.1). As usual, let $\theta$ give the angle of the transmission axis relative to the
horizontal. The Jones matrix (6.32) acts on the polarized portion of the light as
follows
One might be tempted to add $\begin{bmatrix} A_1' \\ B_1' \end{bmatrix}$ and $\begin{bmatrix} A_2' \\ B_2' \end{bmatrix}$, but this would be wrong, since
the two beams are not coherent. As mentioned previously, unpolarized light
necessarily contains multiple frequencies, and so the fields from the polarized and
unpolarized beams destructively interfere as often as they constructively interfere.
In this case, we simply add intensities rather than fields. That is, we have
Similarly,
Since the light has gone through a linear polarizer, we are guaranteed that $A'$ and
$B'$ have the same phase. Therefore, $A'^*B' = A'B'^* = |A'||B'|$. In view of (6.58), these
results lead to
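\[ \begin{aligned}
S_0' &= \frac{|A'|^2 + |B'|^2}{I_{\text{in}}} = \frac{S_0}{2} + S_1\frac{\cos 2\theta}{2} + S_2\frac{\sin 2\theta}{2} \\
S_1' &= \frac{|A'|^2 - |B'|^2}{I_{\text{in}}} = \left( \frac{S_0}{2} + S_1\frac{\cos 2\theta}{2} + S_2\frac{\sin 2\theta}{2} \right) \left( \cos^2\theta - \sin^2\theta \right) \\
&= S_0\frac{\cos 2\theta}{2} + S_1\frac{\cos^2 2\theta}{2} + S_2\frac{\sin 4\theta}{4} \\
S_2' &= \frac{A'^*B' + A'B'^*}{I_{\text{in}}} = 2\left( \frac{S_0}{2} + S_1\frac{\cos 2\theta}{2} + S_2\frac{\sin 2\theta}{2} \right) \cos\theta\sin\theta \\
&= S_0\frac{\sin 2\theta}{2} + S_1\frac{\sin 4\theta}{4} + S_2\frac{\sin^2 2\theta}{2} \\
S_3' &= i\,\frac{A'^*B' - A'B'^*}{I_{\text{in}}} = 0
\end{aligned} \]
These results can be collected into matrix form:
\[ \begin{bmatrix} S_0' \\ S_1' \\ S_2' \\ S_3' \end{bmatrix} = \frac{1}{2} \begin{bmatrix} 1 & \cos 2\theta & \sin 2\theta & 0 \\ \cos 2\theta & \cos^2 2\theta & \frac{1}{2}\sin 4\theta & 0 \\ \sin 2\theta & \frac{1}{2}\sin 4\theta & \sin^2 2\theta & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} S_0 \\ S_1 \\ S_2 \\ S_3 \end{bmatrix} \]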
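One way to gain confidence in this Mueller matrix is to compare it against a direct Jones calculation for fully polarized light, and to check that unpolarized light comes out at half intensity. The sketch below does both; all function names and the sample input values are illustrative assumptions.

```python
import cmath
import math

def mueller_polarizer(theta):
    """Mueller matrix of an ideal linear polarizer with transmission axis at theta."""
    c2, s2, s4 = math.cos(2 * theta), math.sin(2 * theta), math.sin(4 * theta)
    return [
        [0.5, 0.5 * c2, 0.5 * s2, 0.0],
        [0.5 * c2, 0.5 * c2 * c2, 0.25 * s4, 0.0],
        [0.5 * s2, 0.25 * s4, 0.5 * s2 * s2, 0.0],
        [0.0, 0.0, 0.0, 0.0],
    ]

def apply_mueller(m, s):
    """Multiply a 4x4 Mueller matrix onto a Stokes vector."""
    return [sum(m[r][k] * s[k] for k in range(4)) for r in range(4)]

def stokes_from_jones(a, b, i_in):
    """Stokes vector of fully polarized light with Jones components a, b, per (6.58)."""
    a, b = complex(a), complex(b)
    cross = a.conjugate() * b
    return [(abs(a) ** 2 + abs(b) ** 2) / i_in, (abs(a) ** 2 - abs(b) ** 2) / i_in,
            2.0 * cross.real / i_in, -2.0 * cross.imag / i_in]

def jones_polarizer(a, b, theta):
    """Project the field onto the transmission axis and express it in the x-y basis."""
    p = a * math.cos(theta) + b * math.sin(theta)
    return p * math.cos(theta), p * math.sin(theta)

theta = 0.4
a, b = 0.6, 0.8 * cmath.exp(0.9j)
i_in = abs(a) ** 2 + abs(b) ** 2

via_mueller = apply_mueller(mueller_polarizer(theta), stokes_from_jones(a, b, i_in))
via_jones = stokes_from_jones(*jones_polarizer(a, b, theta), i_in)
unpolarized_out = apply_mueller(mueller_polarizer(theta), [1.0, 0.0, 0.0, 0.0])
print(via_mueller, via_jones, unpolarized_out)
```

The two routes agree for the polarized input, and the unpolarized input emerges with $S_0' = 1/2$ and a Stokes vector describing light linearly polarized along the transmission axis.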
The Mueller matrix for a half wave plate is worked out below. The Mueller
matrix for a quarter wave plate is deferred to problem P6.16.
We know that all of the light transmits through the wave plate. This immediately
gives
S 00 = S 0
The wave plate does nothing to unpolarized light. On the other hand, the polarized
portion of the light is influenced by the wave plate as follows (see (6.39)):
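\[ \begin{bmatrix} A' \\ B' \end{bmatrix} = \begin{bmatrix} \cos 2\theta & \sin 2\theta \\ \sin 2\theta & -\cos 2\theta \end{bmatrix} \begin{bmatrix} A \\ B \end{bmatrix} = \begin{bmatrix} A\cos 2\theta + B\sin 2\theta \\ A\sin 2\theta - B\cos 2\theta \end{bmatrix} \]
As usual, $\theta$ is the angle of the fast axis relative to the horizontal. (As expected,
$|A'|^2 + |B'|^2 = |A|^2 + |B|^2$; the intensity of the light is unaltered.) Using (6.58) we get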
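\[ \begin{aligned}
S_1' &= \frac{|A'|^2 - |B'|^2}{I_{\text{in}}} = \frac{|A\cos 2\theta + B\sin 2\theta|^2 - |A\sin 2\theta - B\cos 2\theta|^2}{I_{\text{in}}} \\
&= \frac{\left( |A|^2 - |B|^2 \right)\cos 4\theta + (A^*B + AB^*)\sin 4\theta}{I_{\text{in}}} = S_1\cos 4\theta + S_2\sin 4\theta \\
S_2' &= \frac{A'^*B' + A'B'^*}{I_{\text{in}}} \\
&= \frac{(A\cos 2\theta + B\sin 2\theta)^*(A\sin 2\theta - B\cos 2\theta)}{I_{\text{in}}} + \frac{(A\cos 2\theta + B\sin 2\theta)(A\sin 2\theta - B\cos 2\theta)^*}{I_{\text{in}}} \\
&= \frac{|A|^2 - |B|^2}{I_{\text{in}}}\sin 4\theta - \frac{AB^* + A^*B}{I_{\text{in}}}\cos 4\theta = S_1\sin 4\theta - S_2\cos 4\theta \\
S_3' &= i\,\frac{A'^*B' - A'B'^*}{I_{\text{in}}} \\
&= i\,\frac{(A\cos 2\theta + B\sin 2\theta)^*(A\sin 2\theta - B\cos 2\theta)}{I_{\text{in}}} - i\,\frac{(A\cos 2\theta + B\sin 2\theta)(A\sin 2\theta - B\cos 2\theta)^*}{I_{\text{in}}} \\
&= -i\,\frac{A^*B - AB^*}{I_{\text{in}}} = -S_3
\end{aligned} \]

Hans Mueller (1900-1965, Swiss) was a shepherd until his late teens. As a physics
professor at MIT, he built on the work of Stokes and in 1943 formulated a matrix method
for manipulating Stokes vectors. He was an engaging lecturer into the 1950s and was
known for his exciting demonstrations. He was a student of Arnold Sommerfeld, and did
seminal work on ferroelectricity (he is reported to have coined the term). See Laszlo
Tisza, "Adventures of a Theoretical Physicist, Part II: America," Phys. Perspect. 11,
120-168 (2009).
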
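\[ \begin{bmatrix} S_0' \\ S_1' \\ S_2' \\ S_3' \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos 4\theta & \sin 4\theta & 0 \\ 0 & \sin 4\theta & -\cos 4\theta & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix} \begin{bmatrix} S_0 \\ S_1 \\ S_2 \\ S_3 \end{bmatrix} \]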
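A few properties of this matrix are easy to confirm numerically: it flips the sign of $S_3$, it turns horizontal polarization into linear polarization at angle $2\theta$, and applying it twice restores the original state. The sketch below is illustrative; the function names and the test angle are assumptions.

```python
import math

def mueller_half_wave(theta):
    """Mueller matrix of a half wave plate with fast axis at angle theta."""
    c4, s4 = math.cos(4 * theta), math.sin(4 * theta)
    return [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, c4, s4, 0.0],
        [0.0, s4, -c4, 0.0],
        [0.0, 0.0, 0.0, -1.0],
    ]

def apply_mueller(m, s):
    """Multiply a 4x4 Mueller matrix onto a Stokes vector."""
    return [sum(m[r][k] * s[k] for k in range(4)) for r in range(4)]

def matmul4(m, n):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(m[r][k] * n[k][c] for k in range(4)) for c in range(4)] for r in range(4)]

theta = 0.3
plate = mueller_half_wave(theta)

circular_out = apply_mueller(plate, [1.0, 0.0, 0.0, 1.0])     # right-circular input
horizontal_out = apply_mueller(plate, [1.0, 1.0, 0.0, 0.0])   # horizontal input
twice = matmul4(plate, plate)
print(circular_out, horizontal_out)
```

The horizontal input emerges with $S_1' = \cos 4\theta$ and $S_2' = \sin 4\theta$, which is the Stokes description of linear polarization at angle $2\theta$, in agreement with Example 6.3.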
Exercises
P6.2 Prove that if $0 < \delta < \pi$, the helicity is left-handed, and if $\pi < \delta < 2\pi$ the
helicity is right-handed.
HINT: Write the relevant real field associated with (6.5)
(Figure labels: Laser, Polarizer, Polarizer)
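HINT: Linearly polarized light contains equal amounts of right and left
circularly polarized light. Consider
\[ \frac{1}{2} \begin{bmatrix} 1 \\ i \end{bmatrix} + \frac{e^{i\phi}}{2} \begin{bmatrix} 1 \\ -i \end{bmatrix} \]
where $\phi$ is the phase delay of the right circular polarization. Show that
this can be written as
\[ e^{i\phi/2} \begin{bmatrix} \cos\phi/2 \\ \sin\phi/2 \end{bmatrix} \]
\[ \begin{bmatrix} \cos\alpha \\ \sin\alpha \end{bmatrix} \]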
P6.5 (a) Suppose that linearly polarized light is oriented at an angle $\alpha$ with
respect to the horizontal or x-axis (see table 6.1). What fraction of the
original intensity gets through a vertically oriented polarizer?
(b) If the original light is right-circularly polarized, what fraction of the
original intensity gets through the same polarizer?
P6.7 (a) Suppose that linearly polarized light is oriented at an angle $\alpha$ with
respect to the horizontal or x-axis. What fraction of the original intensity
emerges from a polarizer oriented with its transmission at angle
$\theta$ from the x-axis?
Answer: $\cos^2(\theta - \alpha)$; compare with P6.5.
(b) If the original light is right circularly polarized, what fraction of the
original intensity emerges from the same polarizer?
HINT: A polarizer alone can reveal the direction of the major and minor
axes and the ellipticity, but it does not reveal the helicity. Use a
quarter-wave plate (oriented at a special angle $\theta$) to convert the unknown
elliptically polarized light into linearly polarized light. A subsequent
polarizer can then extinguish the light, from which you can determine
the Jones vector of the light coming through the wave plate. This must
equal the original (unknown) Jones vector (6.11) operated on by the
wave plate (6.37). As you solve the matrix equation, it is helpful to note
that the inverse of (6.37) is its own complex conjugate.
(b) What fraction of the original intensity transmits through the system?
(b) Check that the device leaves right-circularly polarized light unal-
tered while killing left-circularly polarized light.
Set the difference equal to $\pi/4$ for each bounce. The equation you get
does not have a clean analytic solution, but you can plot it to find a
numerical solution.
(Figure label: Side View)
Answer: There are two angles that work: $\theta = 50^\circ$ and $\theta = 53^\circ$.
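P6.15 Derive (6.49) and (6.51), often used for ellipsometry measurements.
HINT: Using $\sin^2\theta = \frac{1 - \cos 2\theta}{2}$ and $\cos^2\theta = \frac{1 + \cos 2\theta}{2}$, first show
\[ I \propto 1 - \frac{ \dfrac{r_p r_s^* + r_s r_p^*}{|r_s|^2} \tan\alpha }{ \dfrac{|r_p|^2}{|r_s|^2} + \tan^2\alpha } \sin 2\theta + \frac{ \dfrac{|r_p|^2}{|r_s|^2} - \tan^2\alpha }{ \dfrac{|r_p|^2}{|r_s|^2} + \tan^2\alpha } \cos 2\theta \]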
Chapter 7
Superposition of Quasi-Parallel Plane Waves
It turns out that group velocity can also become superluminal when signif-
icant absorption and/or amplification of the light pulse is involved. This is no
cause for alarm (nor is it cause for an abundance of gee-whiz papers on the
subject). Absorption and amplification can cause a pulse to appear to move
unexpectedly fast through a reshaping effect. Group velocity, or rather its inverse
group delay, takes this into account, which makes it remarkably general. In such
a scenario, energy can be lost from the back of a pulse or perhaps added to an
already-present forward portion of a pulse such that the average pulse position
appears to advance superluminally. When all energy is accounted for (both the
energy in the medium and in the light pulse), however, nothing advances faster
than the universal speed limit c. Appendix 7.B provides analysis of how a medium
exchanges energy with a pulse to produce these eye-catching effects.
(Recall the conspiracy that only the real parts of the fields are relevant, which is
crucial before multiplying.) The above expression is cumbersome because of the many
cross terms that arise when the two summations are multiplied. We need some
simplifying assumptions before we can make any real progress on this expression.
For example, we can time-average the rapid fluctuations in the expression that
vary on the scale of optical frequencies. Additionally, it is common to encounter
the situation where all plane-wave components travel roughly parallel to each
other, which will be a big help in simplifying (7.3). Let us further assume that the
km vectors are real.1
1 If the wave vectors are complex, the result is essentially the same, but, as in (2.62), the field
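$$
\mathbf{S}(\mathbf{r},t) = \sum_{j,m} \frac{1}{\omega_m \mu_0} \Big[
\mathbf{k}_m \Big( \mathrm{Re}\big\{\mathbf{E}_j e^{i(\mathbf{k}_j\cdot\mathbf{r}-\omega_j t)}\big\}
\cdot \mathrm{Re}\big\{\mathbf{E}_m e^{i(\mathbf{k}_m\cdot\mathbf{r}-\omega_m t)}\big\} \Big)
- \mathrm{Re}\big\{\mathbf{E}_m e^{i(\mathbf{k}_m\cdot\mathbf{r}-\omega_m t)}\big\}
\Big( \mathrm{Re}\big\{\mathbf{E}_j e^{i(\mathbf{k}_j\cdot\mathbf{r}-\omega_j t)}\big\} \cdot \mathbf{k}_m \Big)
\Big] \tag{7.4}
$$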
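The last term in (7.4) can be dismissed if all k-vectors are approximately parallel to
each other, in which case all of the $\mathbf{k}_m$ are essentially perpendicular to each of the
$\mathbf{E}_j$. We will make this rather stringent assumption and kill the last line in (7.4). The
magnitude of the Poynting vector then becomes (with the help of (0.30))
$$
\begin{aligned}
S(\mathbf{r},t) &= \sum_{j,m} \frac{k_m}{\omega_m \mu_0}
\left( \frac{\mathbf{E}_j e^{i(\mathbf{k}_j\cdot\mathbf{r}-\omega_j t)} + \mathbf{E}_j^* e^{-i(\mathbf{k}_j\cdot\mathbf{r}-\omega_j t)}}{2} \right)
\cdot
\left( \frac{\mathbf{E}_m e^{i(\mathbf{k}_m\cdot\mathbf{r}-\omega_m t)} + \mathbf{E}_m^* e^{-i(\mathbf{k}_m\cdot\mathbf{r}-\omega_m t)}}{2} \right) \\
&= \sum_{j,m} \frac{k_m}{4\omega_m \mu_0} \Big\{
\mathbf{E}_j\cdot\mathbf{E}_m\, e^{i[(\mathbf{k}_j+\mathbf{k}_m)\cdot\mathbf{r}-(\omega_j+\omega_m)t]}
+ \mathbf{E}_j^*\cdot\mathbf{E}_m^*\, e^{-i[(\mathbf{k}_j+\mathbf{k}_m)\cdot\mathbf{r}-(\omega_j+\omega_m)t]} \\
&\qquad\qquad + \mathbf{E}_j\cdot\mathbf{E}_m^*\, e^{i[(\mathbf{k}_j-\mathbf{k}_m)\cdot\mathbf{r}-(\omega_j-\omega_m)t]}
+ \mathbf{E}_j^*\cdot\mathbf{E}_m\, e^{-i[(\mathbf{k}_j-\mathbf{k}_m)\cdot\mathbf{r}-(\omega_j-\omega_m)t]} \Big\}
\quad \text{(parallel k-vectors)}
\end{aligned} \tag{7.5}
$$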
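The terms involving $(\omega_j + \omega_m)t$ oscillate rapidly and time-average to zero. By
comparison, the terms involving $(\omega_j - \omega_m)t$ oscillate slowly (especially when the
$\omega_j$ are all in the neighborhood of the $\omega_m$) or not at all when $j = m$. We retain the
slower fluctuations and discard the rapid oscillations. For purposes of computing
the intensity we can treat the index as approximately constant, and write
$k_m/(\omega_m \mu_0) \cong n\epsilon_0 c$. With these simplifications, (7.5) becomes
$$
\begin{aligned}
\langle S(\mathbf{r},t) \rangle_{\rm osc} &= \frac{n\epsilon_0 c}{2} \sum_{j,m}
\frac{\mathbf{E}_j\cdot\mathbf{E}_m^*\, e^{i[(\mathbf{k}_j-\mathbf{k}_m)\cdot\mathbf{r}-(\omega_j-\omega_m)t]}
+ \mathbf{E}_j^*\cdot\mathbf{E}_m\, e^{-i[(\mathbf{k}_j-\mathbf{k}_m)\cdot\mathbf{r}-(\omega_j-\omega_m)t]}}{2} \\
&= \frac{n\epsilon_0 c}{2}\, \mathrm{Re}\Big\{ \sum_j \mathbf{E}_j e^{i(\mathbf{k}_j\cdot\mathbf{r}-\omega_j t)}
\cdot \sum_m \mathbf{E}_m^* e^{-i(\mathbf{k}_m\cdot\mathbf{r}-\omega_m t)} \Big\} \\
&= \frac{n\epsilon_0 c}{2}\, \mathrm{Re}\big\{ \mathbf{E}(\mathbf{r},t)\cdot\mathbf{E}^*(\mathbf{r},t) \big\}
\quad \text{(parallel k-vectors)}
\end{aligned} \tag{7.6}
$$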
As we previously studied (see P1.9), the velocities of the wave crests for these two
waves are
$$
v_{p1} = \omega_1/k_1 \quad \text{and} \quad v_{p2} = \omega_2/k_2 \tag{7.9}
$$
These are known as the phase velocities of the individual plane waves.
Next consider a composite wave created from the superposition of the above
two plane waves:
$$
\mathbf{E}(\mathbf{r},t) = \mathbf{E}_0 e^{i(\mathbf{k}_1\cdot\mathbf{r}-\omega_1 t)} + \mathbf{E}_0 e^{i(\mathbf{k}_2\cdot\mathbf{r}-\omega_2 t)} \tag{7.10}
$$
Figure 7.1 Animation showing superposition of two plane waves (electric fields)
with different frequencies and traveling at different speeds.
The two plane waves interfere, producing regions of higher and lower intensity
that move in time. Remarkably, these intensity peaks can propagate at speeds
quite different from either of the phase velocities in (7.9). The intensity (7.7) for
the field (7.10) is computed as follows:
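$$
\begin{aligned}
I(\mathbf{r},t) &= \frac{n\epsilon_0 c}{2}\, \mathbf{E}_0\cdot\mathbf{E}_0^*
\left[ e^{i(\mathbf{k}_1\cdot\mathbf{r}-\omega_1 t)} + e^{i(\mathbf{k}_2\cdot\mathbf{r}-\omega_2 t)} \right]
\left[ e^{-i(\mathbf{k}_1\cdot\mathbf{r}-\omega_1 t)} + e^{-i(\mathbf{k}_2\cdot\mathbf{r}-\omega_2 t)} \right] \\
&= \frac{n\epsilon_0 c}{2}\, \mathbf{E}_0\cdot\mathbf{E}_0^*
\left[ 2 + e^{i[(\mathbf{k}_2-\mathbf{k}_1)\cdot\mathbf{r}-(\omega_2-\omega_1)t]} + e^{-i[(\mathbf{k}_2-\mathbf{k}_1)\cdot\mathbf{r}-(\omega_2-\omega_1)t]} \right] \\
&= n\epsilon_0 c\, \mathbf{E}_0\cdot\mathbf{E}_0^* \left[ 1 + \cos\left[ (\mathbf{k}_2-\mathbf{k}_1)\cdot\mathbf{r} - (\omega_2-\omega_1)t \right] \right] \\
&= n\epsilon_0 c\, \mathbf{E}_0\cdot\mathbf{E}_0^* \left[ 1 + \cos\left( \Delta\mathbf{k}\cdot\mathbf{r} - \Delta\omega\, t \right) \right]
\end{aligned} \tag{7.11}
$$
where
$$
\Delta\mathbf{k} \equiv \mathbf{k}_2 - \mathbf{k}_1, \qquad \Delta\omega \equiv \omega_2 - \omega_1 \tag{7.12}
$$
Figure 7.2 Intensity of two interfering plane waves (intensity plotted against
position). The solid line shows intensity averaged over rapid oscillations.
2 At extreme intensities, when the influence of the magnetic field becomes comparable to that of
the electric field, the distinction between propagating and standing fields becomes important to
the behavior of charged particles in that field.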
7.2 Group vs. Phase Velocity: Sum of Two Plane Waves 173
The darker line in Fig. 7.2 shows the intensity computed with (7.11). Keep in
mind that this intensity is averaged over rapid oscillations. For comparison, the
lighter line shows the Poynting flux with the rapid oscillations retained, according
to (7.5). It is left as an exercise (see P7.3) to show that the rapid-oscillation peaks
in Fig. 7.2 move with a phase velocity derived from the average $k$ and average
$\omega$ of the two plane waves:
$$
v_p \equiv \frac{\bar{\omega}}{\bar{k}} \tag{7.13}
$$
An examination of the cosine argument in (7.11) reveals that the time-averaged
curve in Fig. 7.2 (dark) travels with speed
$$
v_g \equiv \frac{\Delta\omega}{\Delta k} \cong \frac{d\omega}{dk} \tag{7.14}
$$
This is known as the group velocity. Essentially, $v_g$ may be thought of as the
velocity for the envelope that encloses the rapid oscillations. As noted, the group
velocity is often written as a derivative rather than a ratio of finite differences; the
derivative will be more natural when dealing with a continuum of plane waves
rather than a pair of plane waves.
In general, $v_g$ and $v_p$ are not the same. This means that as the waveform
propagates, the rapid oscillations move within the larger modulation pattern, for
example, continually disappearing at the front and reappearing at the back of
each modulation. The group velocity is identified with the propagation of overall
waveforms. The presence of intensity in a waveform is clearly tied more to $v_g$
than to $v_p$.

Sir William Rowan Hamilton (1805-1865, Irish) was born in Dublin, Ireland,
the fourth of nine children. At a very early age, he showed a remarkable
ability to learn languages while living with his uncle who was a linguist. He
became proficient in nearly a dozen languages and in later life enjoyed reading
in various languages as a means of relaxation. At age eight, Hamilton entered
a mental arithmetic contest against a nine-year-old prodigy from America.
Hamilton lost and as a result determined to spend much more time on mathematics
instead of languages. Hamilton went on to make enormous contributions
to mathematical physics. His reformulation of classical dynamics proved to
be the ideal framework for later developments in electrodynamics, quantum
mechanics, and quantum field theory. Ironically, Hamilton was originally
employed as an observational astronomer at Dunsink Observatory, a post for which
he was not particularly well suited. The University of Dublin didn't mind, however,
owing to the outstanding quality of his theoretical pursuits. Hamilton is credited
with first articulating the concept of group velocity, although only abstracts
of his lectures on the subject have been preserved: "Researches respecting
vibration, connected with the theory of light," Proc. Roy. Irish Acad. 1, 267, 341
(1839). (Wikipedia)

Example 7.1
Determine the phase velocity and group velocity for the superposition of two plane
waves in a plasma (see P2.7).

Solution: The index of refraction is given by
$$
n_{\rm plasma}(\omega) = \sqrt{1 - \omega_p^2/\omega^2} < 1 \quad (\text{assuming } \omega > \omega_p) \tag{7.15}
$$
The phase velocity for each frequency is computed as
$$
v_p = \frac{\omega_1 + \omega_2}{n_{\rm plasma}(\omega_1)\,\omega_1/c + n_{\rm plasma}(\omega_2)\,\omega_2/c} \cong \frac{c}{n_{\rm plasma}(\omega)} \tag{7.16}
$$
For convenience, we have taken $\omega_1$ and $\omega_2$ to lie very close to each other. Since
$n_{\rm plasma} < 1$, both of these velocities exceed $c$. However, the group velocity is
$$
v_g = \frac{\Delta\omega}{\Delta k} \cong \frac{d\omega}{dk} = \left( \frac{dk}{d\omega} \right)^{-1}
= \left( \frac{d}{d\omega} \frac{n_{\rm plasma}(\omega)\,\omega}{c} \right)^{-1} = n_{\rm plasma}(\omega)\, c \tag{7.17}
$$
which is clearly less than $c$. The derivation of the final expression in (7.17) from
the previous one is left as an exercise.
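The result of Example 7.1 can be checked numerically. The following is a minimal sketch, assuming hypothetical values for the plasma frequency and the two carrier frequencies (none of these numbers come from the text). It evaluates the averaged phase velocity (7.13) and the finite-difference group velocity (7.14) for two closely spaced plane waves and compares them with $c/n_{\rm plasma}$ and $n_{\rm plasma}c$.

```python
import math

C = 299_792_458.0  # speed of light in vacuum (m/s)

def n_plasma(omega, omega_p):
    """Plasma index (7.15); valid only for omega > omega_p."""
    return math.sqrt(1.0 - (omega_p / omega) ** 2)

def k_plasma(omega, omega_p):
    """Wave number k = n(omega) * omega / c."""
    return n_plasma(omega, omega_p) * omega / C

def velocities(omega1, omega2, omega_p):
    """Return (v_p, v_g) for a pair of plane waves, per (7.13) and (7.14)."""
    k1, k2 = k_plasma(omega1, omega_p), k_plasma(omega2, omega_p)
    v_p = (omega1 + omega2) / (k1 + k2)   # average omega over average k
    v_g = (omega2 - omega1) / (k2 - k1)   # ratio of finite differences
    return v_p, v_g

omega_p = 1.0e15                # hypothetical plasma frequency (rad/s)
omega1 = 2.0e15                 # hypothetical first frequency (rad/s)
omega2 = omega1 * (1 + 1e-6)    # second wave, very close in frequency

v_p, v_g = velocities(omega1, omega2, omega_p)
n = n_plasma(omega1, omega_p)
print(v_p / C, 1 / n)   # phase velocity in units of c, compared with 1/n
print(v_g / C, n)       # group velocity in units of c, compared with n
```

In this sketch the product $v_p v_g$ comes out very close to $c^2$, which is what (7.16) and (7.17) together imply for a plasma.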
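7.3 Frequency Spectrum of Light
$$
\mathbf{E}(\mathbf{r},\omega) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \mathbf{E}(\mathbf{r},t)\, e^{i\omega t}\, dt \tag{7.19}
$$
This operation is called the Fourier transform. It is used to generate the spectrum
$\mathbf{E}(\mathbf{r},\omega)$ from the field $\mathbf{E}(\mathbf{r},t)$ in much the same way that (7.18) is used to generate
the field $\mathbf{E}(\mathbf{r},t)$ from the spectrum $\mathbf{E}(\mathbf{r},\omega)$.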
Although only the real part of $\mathbf{E}(\mathbf{r},t)$ is physically relevant, we can continue
our habit of working with the complex field and taking the real part of $\mathbf{E}(\mathbf{r},t)$ at
our leisure.3 In fact, we will find it advantageous to work with the complex field
instead of only the real part. We will not run into trouble as long as we remember
never to discard the imaginary part of $\mathbf{E}(\mathbf{r},\omega)$, only the imaginary part of $\mathbf{E}(\mathbf{r},t)$.
The intensity formula (7.7) remains useful for continuous superpositions of
plane waves (i.e. a field defined by the inverse Fourier transform (7.18)):
$$
I(\mathbf{r},t) \equiv \frac{n\epsilon_0 c}{2}\, \mathbf{E}(\mathbf{r},t)\cdot\mathbf{E}^*(\mathbf{r},t) \tag{7.20}
$$
Remember, this formula specifically requires the fields to be in complex format,
and it takes care of the time-average over rapid oscillations automatically.4
Moreover, the above expression for $I(\mathbf{r},t)$ assumes that all relevant k-vectors are
essentially parallel.
Similarly, we will define the power spectrum produced from $\mathbf{E}(\mathbf{r},\omega)$, which we
write as
$$
I(\mathbf{r},\omega) \equiv \frac{n\epsilon_0 c}{2}\, \mathbf{E}(\mathbf{r},\omega)\cdot\mathbf{E}^*(\mathbf{r},\omega) \tag{7.21}
$$
The power spectrum $I(\mathbf{r},\omega)$ is what one observes when the waveform is sent into
a spectral analyzer or spectrometer. We must apologize again for the potentially
confusing notation (in wide usage): $I(\mathbf{r},\omega)$ is not the Fourier transform of $I(\mathbf{r},t)$!
They are defined exclusively through (7.20) and (7.21).
Parseval's theorem (see Example 0.7) imposes an interesting connection between
the time-integral of the intensity and the frequency-integral of the power
spectrum:
$$
\int_{-\infty}^{\infty} I(\mathbf{r},t)\, dt = \int_{-\infty}^{\infty} I(\mathbf{r},\omega)\, d\omega \tag{7.22}
$$
With the above formalities out of the way, we will illustrate the use of Fourier
transforms through some examples.
Example 7.2
Find $\mathbf{E}(\mathbf{r},\omega)$ associated with the field
$$
\mathbf{E}(\mathbf{r},t) = \mathbf{E}_0(\mathbf{r})\, e^{-t^2/2T^2} e^{-i\omega_0 t} \tag{7.23}
$$
Solution: The real part of this field is shown in Fig. 7.3 for two different durations $T$. The
intensity profile computed by (7.20) is shown in Fig. 7.4. Inserting (7.23) into (7.19) gives
$$
\mathbf{E}(\mathbf{r},\omega) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \mathbf{E}_0(\mathbf{r})\, e^{-t^2/2T^2} e^{-i\omega_0 t} e^{i\omega t}\, dt \tag{7.24}
$$
This integral can be performed with the help of (0.55), and we obtain
$$
\mathbf{E}(\mathbf{r},\omega) = T\, \mathbf{E}_0(\mathbf{r})\, e^{-T^2(\omega-\omega_0)^2/2} \tag{7.25}
$$
Notice that $\mathbf{E}(\mathbf{r},\omega)$ has units of field multiplied by time, or in other words, field per
frequency.
Figure 7.3 Real part of electric field (7.23) with $T = 4/\omega_0$ and $T = 10/\omega_0$,
where $2\pi/\omega_0$ is the period of the carrier frequency.
3 Since Fourier transforms are linear, one can take the Fourier transform of the real and imaginary
parts of a field separately. Appropriate modifications to $\mathbf{E}(\mathbf{r},\omega)$ in the frequency domain will not
cause the two parts to become mingled. Upon taking the inverse Fourier transform to obtain $\mathbf{E}(\mathbf{r},t)$
again, the original real part remains purely real, and the original imaginary part remains purely
imaginary.
4 To use this expression there needs to be a sufficient number of oscillations within the waveform
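Example 7.3
Check Parseval's theorem for the field and spectrum in Example 7.2.
Solution: With (7.20) and (7.23), the time integration in (7.22) yields
$$
\int_{-\infty}^{\infty} I(\mathbf{r},t)\, dt = \frac{n\epsilon_0 c}{2}\, \mathbf{E}_0(\mathbf{r})\cdot\mathbf{E}_0^*(\mathbf{r}) \int_{-\infty}^{\infty} e^{-t^2/T^2}\, dt
= \frac{n\epsilon_0 c}{2}\, \mathbf{E}_0(\mathbf{r})\cdot\mathbf{E}_0^*(\mathbf{r})\, T\sqrt{\pi}
$$
where we have used (0.55) to perform the integration. This result has units of
energy per area, called fluence. It is the energy per area absorbed by a detector
after the pulse has concluded. The frequency integration in (7.22) yields
$$
\begin{aligned}
\int_{-\infty}^{\infty} I(\mathbf{r},\omega)\, d\omega &= \frac{n\epsilon_0 c}{2}\, \mathbf{E}_0(\mathbf{r})\cdot\mathbf{E}_0^*(\mathbf{r})\, T^2 \int_{-\infty}^{\infty} e^{-T^2(\omega-\omega_0)^2}\, d\omega \\
&= \frac{n\epsilon_0 c}{2}\, \mathbf{E}_0(\mathbf{r})\cdot\mathbf{E}_0^*(\mathbf{r})\, T^2\, \frac{\sqrt{\pi}}{T}
\end{aligned}
$$
which is the same answer.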
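The same check can be done numerically. The sketch below assumes unit field amplitude and hypothetical values of $T$ and $\omega_0$ in arbitrary units. It approximates the Fourier transform (7.19) by a Riemann sum rather than using the closed form (7.25), then compares the two sides of (7.22) with the common factor $n\epsilon_0 c/2$ omitted.

```python
import cmath
import math

def field(t, T, omega0):
    """Complex Gaussian pulse (7.23) with E0 = 1."""
    return math.exp(-t**2 / (2 * T**2)) * cmath.exp(-1j * omega0 * t)

def spectrum(omega, ts, fs, dt):
    """Fourier transform (7.19) approximated by a Riemann sum over the grid ts."""
    total = sum(f * cmath.exp(1j * omega * t) for t, f in zip(ts, fs))
    return total * dt / math.sqrt(2 * math.pi)

T, omega0 = 1.0, 10.0          # hypothetical values, arbitrary units
dt, dw = 0.02, 0.05            # grid spacings in time and frequency
ts = [-8.0 * T + i * dt for i in range(801)]           # covers the whole pulse
fs = [field(t, T, omega0) for t in ts]
ws = [omega0 - 8.0 / T + i * dw for i in range(321)]   # covers the whole spectrum

# Both sides of (7.22), omitting the common factor n*eps0*c/2.
time_integral = sum(abs(f)**2 for f in fs) * dt
freq_integral = sum(abs(spectrum(w, ts, fs, dt))**2 for w in ws) * dw
expected = T * math.sqrt(math.pi)   # analytic result from Example 7.3

print(time_integral, freq_integral, expected)
```

The direct double loop is slow compared with an FFT, but it keeps the sketch dependency-free and mirrors the integral definitions one-to-one.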
Example 7.4
Take the inverse Fourier transform of (7.25) to recover the original waveform (7.23).
Since only the real part of the time profile E(r, t ) is physically relevant, you
might be curious about how the Fourier transform of the real part of the field
compares with that of the complex version of the field that we have been using.
Indeed, there are situations where it is more appropriate to use the real version
of the field rather than its complex form. For example, if a waveform includes
multiple propagation directions or if a waveform contains only a few cycles, then
the motivation/interpretation behind (7.20) and the convenience of the complex
format begin to wane.
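Example 7.5
Take the Fourier transform of just the real part of waveform (7.23).
Solution: The real part of the field is
$$
\begin{aligned}
\mathbf{E}_r(\mathbf{r},t) &= \frac{\mathbf{E}(\mathbf{r},t) + \mathbf{E}^*(\mathbf{r},t)}{2} \\
&= e^{-t^2/2T^2}\, \frac{\mathbf{E}_0(\mathbf{r})\, e^{-i\omega_0 t} + \mathbf{E}_0^*(\mathbf{r})\, e^{i\omega_0 t}}{2}
\end{aligned} \tag{7.27}
$$
If $\mathbf{E}_0(\mathbf{r})$ is real, then this field can be written as $\mathbf{E}_0(\mathbf{r})\, e^{-t^2/2T^2} \cos(\omega_0 t)$. The Fourier
transform (7.19) yields (see P0.24)
$$
\mathbf{E}_r(\mathbf{r},\omega) = T\, \frac{\mathbf{E}_0^*(\mathbf{r})\, e^{-T^2(\omega+\omega_0)^2/2} + \mathbf{E}_0(\mathbf{r})\, e^{-T^2(\omega-\omega_0)^2/2}}{2} \tag{7.28}
$$
The spectrum is shown in Fig. 7.7.
Figure 7.7 Spectrum based on (7.28) with $T = 10/\omega_0$. Compare with the lower
curve in Fig. 7.5.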
From the above example, you might notice that the transform of the real
part of a field tends to be more cumbersome than the transform of the entire
complex field. For the real field, both positive and negative frequency components
contribute to the overall spectrum.5 Moreover, the Fourier transform of a real
function $\mathbf{E}_r(\mathbf{r},t)$ obeys the symmetry relation
$$
\mathbf{E}_r(\mathbf{r},-\omega) = \mathbf{E}_r^*(\mathbf{r},\omega) \tag{7.29}
$$
whereas the Fourier transform of the complex field depicted in Fig. 7.5 does not.
twice the spectrum of the real representation, but plotted only for the positive frequencies.
6 See J. D. Jackson, Classical Electrodynamics, 3rd ed., Sect. 7.8 (New York: John Wiley, 1999).
7.4 Wave Packet Propagation and Group Delay
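The exponent in (7.30) is called the phase delay for the pulse propagation. It
is often expanded in a Taylor series about the pulse carrier frequency $\omega_0$:
$$
\mathbf{k}\cdot\Delta\mathbf{r} = \left[ \mathbf{k}\big|_{\omega_0} + \frac{\partial\mathbf{k}}{\partial\omega}\bigg|_{\omega_0} (\omega-\omega_0)
+ \frac{1}{2} \frac{\partial^2\mathbf{k}}{\partial\omega^2}\bigg|_{\omega_0} (\omega-\omega_0)^2 + \cdots \right] \cdot \Delta\mathbf{r} \tag{7.33}
$$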
The integral in (7.34) is recognized as the Fourier transform of the original pulse
with a new time argument:
$$
\mathbf{E}(\mathbf{r}_0+\Delta\mathbf{r}, t) = \mathbf{E}(\mathbf{r}_0, t-t')\, e^{i[\mathbf{k}(\omega_0)\cdot\Delta\mathbf{r} - \omega_0 t']} \tag{7.36}
$$
Notice that (7.32) for propagating in vacuum agrees with this result, since
$\mathbf{k}_{\rm vac}(\omega_0)\cdot\Delta\mathbf{r} = \omega_0 \Delta z/c$. The second factor in (7.36) merely gives a phase shift
governed by the phase velocity of the carrier frequency (see (7.9)):
$$
v_p(\omega_0) = \frac{\omega_0}{k(\omega_0)} \tag{7.37}
$$
The phase shift vanishes for propagation in vacuum. Ignoring the phase shift,
(7.36) is only altered by a delay $t'$, the time required for the pulse to traverse the
displacement $\Delta\mathbf{r}$.
The function $\partial\mathbf{k}/\partial\omega\cdot\Delta\mathbf{r}$ is known as the group delay function, and in (7.35) it
$$
v_g^{-1}(\omega_0) = \frac{\partial k(\omega)}{\partial\omega}\bigg|_{\omega_0} \tag{7.38}
$$
Group delay (or group velocity) essentially tracks the center of the pulse.
In our derivation we have assumed that the phase delay $\mathbf{k}(\omega)\cdot\Delta\mathbf{r}$ could be
well-represented by the first two terms of the expansion (7.33). While this assumption
gives results that are often useful, higher-order terms can also play a role. In
section 7.5 we'll find that the next term in the expansion controls the rate at which
the pulse spreads as it travels. We should also note that there are times when the
expansion (7.33) fails to converge (when $\omega_0$ is near a resonance of the medium),
and the above expansion approach is not valid. We'll analyze pulse propagation
in this sticky situation in section 7.6.
7.5 Quadratic Dispersion
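$$
k(\omega)\, z \cong k_0 z + v_g^{-1} (\omega-\omega_0)\, z + \alpha\, (\omega-\omega_0)^2 z + \cdots \tag{7.39}
$$
where
$$
k_0 \equiv k(\omega_0) = \frac{\omega_0\, n(\omega_0)}{c} \tag{7.40}
$$
$$
v_g^{-1} \equiv \frac{\partial k}{\partial\omega}\bigg|_{\omega_0} = \frac{n(\omega_0)}{c} + \frac{\omega_0\, n'(\omega_0)}{c} \tag{7.41}
$$
$$
\alpha \equiv \frac{1}{2} \frac{\partial^2 k}{\partial\omega^2}\bigg|_{\omega_0} = \frac{n'(\omega_0)}{c} + \frac{\omega_0\, n''(\omega_0)}{2c} \tag{7.42}
$$
8 See J. D. Jackson, Classical Electrodynamics, 3rd ed., Sect. 7.9 (New York: John Wiley, 1999).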
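Example 7.7
A Gaussian waveform similar to that in Example 7.6 propagates through a piece of
glass with thickness $\Delta\mathbf{r} = z\hat{\mathbf{z}}$. Compute the waveform exiting the glass.
Solution: Again, the Fourier transform of the Gaussian pulse before propagation
is given by (7.25):
$$
\mathbf{E}(0,\omega) = T\, \mathbf{E}_0\, e^{-T^2(\omega-\omega_0)^2/2}
$$
With the aid of expansion (7.39), the inverse Fourier transform (7.31) (which yields
the pulse after propagation) becomes
$$
\begin{aligned}
\mathbf{E}(z,t) &= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \mathbf{E}_0\, T\, e^{-T^2(\omega-\omega_0)^2/2}\,
e^{i k_0 z + i v_g^{-1}(\omega-\omega_0) z + i\alpha(\omega-\omega_0)^2 z}\, e^{-i\omega t}\, d\omega \\
&= \frac{T\, \mathbf{E}_0\, e^{i(k_0 z - \omega_0 t)}}{\sqrt{2\pi}} \int_{-\infty}^{\infty}
e^{-(T^2/2 - i\alpha z)(\omega-\omega_0)^2}\, e^{i v_g^{-1}(\omega-\omega_0) z - i(\omega-\omega_0) t}\, d\omega
\end{aligned} \tag{7.43}
$$
The above integral can be performed with the aid of (0.55). The result is
$$
\begin{aligned}
\mathbf{E}(z,t) &= \frac{T\, \mathbf{E}_0\, e^{i(k_0 z - \omega_0 t)}}{\sqrt{2\pi}}
\sqrt{\frac{\pi}{\frac{T^2}{2}\left(1 - i\,2\alpha z/T^2\right)}}\;
\exp\left[ -\frac{(t - z/v_g)^2}{4\,\frac{T^2}{2}\left(1 - i\,2\alpha z/T^2\right)} \right] \\
&= \mathbf{E}_0\, e^{i(k_0 z - \omega_0 t)}\,
\frac{e^{\frac{i}{2}\tan^{-1}\left(2\alpha z/T^2\right)}}{\sqrt[4]{1 + \left(2\alpha z/T^2\right)^2}}\;
\exp\left[ -\frac{(t - z/v_g)^2 \left(1 + i\,2\alpha z/T^2\right)}{2T^2\left[1 + \left(2\alpha z/T^2\right)^2\right]} \right]
\end{aligned} \tag{7.45}
$$
This can be written more compactly as
$$
\mathbf{E}(z,t) = \mathbf{E}_0\, e^{i(k_0 z - \omega_0 t)}\, \sqrt{\frac{T}{T(z)}}\;
e^{\frac{i}{2}\tan^{-1}\Phi(z)}\; \exp\left[ -\frac{(t - z/v_g)^2}{2T^2(z)} \big(1 + i\Phi(z)\big) \right] \tag{7.46}
$$
where
$$
\Phi(z) \equiv \frac{2\alpha}{T^2}\, z \tag{7.47}
$$
and
$$
T(z) \equiv T \sqrt{1 + \Phi^2(z)} \tag{7.48}
$$
Figure 7.8 A 25 fs pulse traversing an ℓ = 1 cm piece of BK7 glass (the pulse
duration grows from 25 fs to 56 fs).
Figure 7.9 Animation of a Gaussian-envelope pulse (electric field) undergoing
dispersion during transit.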
We can immediately make a few observations about (7.46). First, note that
at $z = 0$ (i.e. zero thickness of glass), (7.46) reduces to the input pulse
$\mathbf{E}(0,t) = \mathbf{E}_0\, e^{-t^2/2T^2} e^{-i\omega_0 t}$, as it should. Secondly, the peak of the pulse moves at
speed $v_g$ since the factor $e^{-(t-z/v_g)^2/2T^2(z)}$ controls the pulse amplitude, while the other
terms (multiplied by $i$) in the exponent of (7.46) merely alter the phase. Also
note that the duration of the pulse increases and its peak intensity decreases as it
travels, since $T(z)$ increases with $z$. In P7.8 we will find that (7.46) also predicts
that for large $z$, the field of the spread-out pulse oscillates less rapidly at the
beginning of the pulse than at the end (assuming $\alpha > 0$). This phenomenon, known as
pulse chirping, means that red frequencies get ahead of blue frequencies during
propagation since the red frequencies experience a lower index of refraction.
While Example 7.7 is worked out for the specific case of a Gaussian pulse,
the results are qualitatively similar for all pulses. The exact details vary with
pulse shape, but all short pulses eventually broaden and chirp as they propagate
through a dispersive medium such as glass. Higher-order terms in the expansion
(7.33) that were neglected cause additional spreading, chirping, and other
deformations to the pulses as they propagate. The influence of each order becomes
progressively more cumbersome to study analytically. It is easier to perform the
inverse Fourier transform numerically; there is no need to resort to the expansion
of $k(\omega)$ if the integration is done numerically.
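As an illustration of doing the inverse transform numerically, here is a minimal sketch that propagates the Gaussian spectrum (7.25) through purely quadratic dispersion and compares the resulting duration with (7.48). The values of $T$ and $\alpha z$ are hypothetical and in arbitrary units, the time $\tau = t - z/v_g$ is measured in the frame moving with the pulse, and the unit-magnitude carrier factor is dropped. In a real calculation one would replace the quadratic phase with the full $k(\omega)z$.

```python
import cmath
import math

def envelope(tau, T, alpha_z, omegas, d_omega):
    """Envelope after propagation, from a Riemann sum over W = omega - omega0.

    tau = t - z/v_g; the carrier factor exp[i(k0 z - omega0 t)] is omitted
    because it does not affect the intensity.
    """
    total = 0j
    for W in omegas:
        spectrum = T * math.exp(-T**2 * W**2 / 2)   # (7.25) with E0 = 1
        total += spectrum * cmath.exp(1j * alpha_z * W**2 - 1j * W * tau)
    return total * d_omega / math.sqrt(2 * math.pi)

T = 1.0          # initial duration (arbitrary units)
alpha_z = 1.0    # hypothetical alpha*z in units of T**2, giving Phi = 2
d_omega, d_tau = 0.02, 0.1
omegas = [-8.0 + i * d_omega for i in range(801)]
taus = [-12.0 + i * d_tau for i in range(241)]

intensity = [abs(envelope(tau, T, alpha_z, omegas, d_omega))**2 for tau in taus]
norm = sum(intensity)
mean = sum(t * I for t, I in zip(taus, intensity)) / norm
variance = sum((t - mean)**2 * I for t, I in zip(taus, intensity)) / norm

# For an intensity profile exp(-tau^2 / T(z)^2) the rms width is T(z)/sqrt(2).
T_numeric = math.sqrt(2 * variance)
T_analytic = T * math.sqrt(1 + (2 * alpha_z / T**2)**2)   # (7.47) and (7.48)
print(T_numeric, T_analytic)
```

The peak of the computed intensity also drops by the factor $T/T(z)$, consistent with the observation that the pulse lengthens and weakens as it travels.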
Figure 7.11 Transit time defined as the difference between arrival time at two points.
For simplification, we have assumed that the light travels in a uniform direction
by using intensity rather than the Poynting vector.
Consider a pulse as it travels from point $\mathbf{r}_0$ to point $\mathbf{r} = \mathbf{r}_0 + \Delta\mathbf{r}$ in a
homogeneous medium. The difference in arrival times at the two points is
$$
\Delta t \equiv \langle t \rangle_{\mathbf{r}} - \langle t \rangle_{\mathbf{r}_0} \tag{7.50}
$$
The pulse shape can evolve in complicated ways between the two points, spreading
with different portions being absorbed (or amplified) during transit as depicted
in Fig. 7.11. Nevertheless, (7.50) renders an unambiguous time interval
between the passage of the pulse center at each point.
This difference in arrival time can be shown to consist of two terms (see
P7.11):9
$$
\Delta t = \Delta t_G(\mathbf{r}) + \Delta t_R(\mathbf{r}_0) \tag{7.51}
$$
The first term, called the net group delay, dominates if the field waveform is
initially symmetric in time (e.g. an unchirped Gaussian). It amounts to a spectral
average of the group delay function taken with respect to the spectral content of
the pulse arriving at the final point $\mathbf{r} = \mathbf{r}_0 + \Delta\mathbf{r}$:
$$
\Delta t_G(\mathbf{r}) = \frac{\displaystyle\int_{-\infty}^{\infty} I(\mathbf{r},\omega) \left( \frac{\partial\, \mathrm{Re}\,\mathbf{k}}{\partial\omega}\cdot\Delta\mathbf{r} \right) d\omega}
{\displaystyle\int_{-\infty}^{\infty} I(\mathbf{r},\omega)\, d\omega} \tag{7.52}
$$
where $I(\mathbf{r},\omega)$ is given in (7.21). The two curves in Fig. 7.12 show $I(\mathbf{r}_0,\omega)$ (before
propagation) and $I(\mathbf{r},\omega)$ (after propagation) for an initially Gaussian pulse. As
seen in (7.52), the pulse travel time depends on the spectral shape of the pulse at
the end of propagation.
Figure 7.12 Normalized power spectrum of a broadband pulse before and after
propagation through an absorbing medium with the complex index shown in
Fig. 7.10. The absorption line eats a hole in the spectrum.
9 M. Ware, S. A. Glasgow, and J. Peatross, "The Role of Group Velocity in Tracking Field Energy in
Linear Dielectrics," Opt. Express 9, 506-518 (2001).
7.6 Generalized Context for Group Delay
Note the close resemblance between the formulas (7.49) and (7.52). Both are
expectation integrals. The former is executed as a "center-of-mass" integral on
time; the latter is executed in the frequency domain on $\partial\,\mathrm{Re}\,\mathbf{k}\cdot\Delta\mathbf{r}/\partial\omega$, the group
delay function (7.38). The group delay at every frequency present in the pulse
influences the result. If the pulse has a narrow bandwidth in the neighborhood
of $\omega_0$, the integral reduces to $\partial\,\mathrm{Re}\,\mathbf{k}/\partial\omega|_{\omega_0}\cdot\Delta\mathbf{r}$, in agreement with (7.38) (see P7.9).
The net group delay depends only on the spectral content of the pulse, independent
of its temporal organization (i.e., the phase of $\mathbf{E}(\mathbf{r},\omega)$ has no influence). Only
the real part of the k-vector plays a direct role in (7.52).
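The spectral average (7.52) is straightforward to evaluate numerically. The sketch below is only illustrative: it assumes a single-resonance Lorentz susceptibility of the form $\chi = \omega_p^2/(\omega_{\rm res}^2 - \omega^2 - i\gamma\omega)$, works in units where $c = 1$, and uses hypothetical medium parameters that are not taken from the text. It checks two limits discussed above: a narrowband pulse reproduces the group delay function evaluated at the carrier frequency, and a medium with no resonance gives the vacuum delay $\Delta r/c$.

```python
import cmath
import math

C = 1.0  # units with c = 1 (an assumption of this sketch)

def k_lorentz(omega, omega_res, gamma, omega_p2):
    """Complex wave number for an assumed single-resonance Lorentz medium."""
    chi = omega_p2 / (omega_res**2 - omega**2 - 1j * gamma * omega)
    return omega * cmath.sqrt(1 + chi) / C

def group_delay_function(omega, dr, omega_res, gamma, omega_p2):
    """d(Re k)/d(omega) * dr, using a central finite difference."""
    h = 1e-4 * gamma
    k_plus = k_lorentz(omega + h, omega_res, gamma, omega_p2).real
    k_minus = k_lorentz(omega - h, omega_res, gamma, omega_p2).real
    return (k_plus - k_minus) / (2 * h) * dr

def net_group_delay(T, omega_c, dr, omega_res, gamma, omega_p2, n_pts=2001):
    """Spectral average (7.52) for a Gaussian pulse with spectrum (7.25)."""
    half_width = 6.0 / T
    d_omega = 2 * half_width / (n_pts - 1)
    num = den = 0.0
    for i in range(n_pts):
        w = omega_c - half_width + i * d_omega
        k = k_lorentz(w, omega_res, gamma, omega_p2)
        # Power spectrum at the final point: initial Gaussian times attenuation.
        power = math.exp(-T**2 * (w - omega_c)**2) * math.exp(-2 * k.imag * dr)
        num += power * group_delay_function(w, dr, omega_res, gamma, omega_p2)
        den += power
    return num / den

gamma, omega_res, dr = 1.0, 100.0, 0.1   # hypothetical medium and slab thickness
omega_p2 = 100.0                          # hypothetical absorption strength

narrow = net_group_delay(200.0 / gamma, omega_res, dr, omega_res, gamma, omega_p2)
point = group_delay_function(omega_res, dr, omega_res, gamma, omega_p2)
vacuum = net_group_delay(1.0 / gamma, omega_res, dr, omega_res, gamma, 0.0)
print(narrow, point, vacuum)
```

With these assumed parameters the narrowband delay on resonance is negative, which is the superluminal behavior discussed in this section.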
The second term in (7.51) is the reshaping delay $\Delta t_R$. It represents a delay
that arises solely from a reshaping of the spectral amplitude. Often this term is
negligible. The term takes into account how the pulse time center-of-mass shifts
as portions of the spectrum are removed (or added), as illustrated in Fig. 7.13. It
is computed at $\mathbf{r}_0$ before propagation takes place:10
$$
\Delta t_R(\mathbf{r}_0) = \langle t \rangle_{\mathbf{r}_0}^{\rm altered} - \langle t \rangle_{\mathbf{r}_0} \tag{7.53}
$$
Here $\langle t \rangle_{\mathbf{r}_0}$ represents the usual arrival time of the pulse at the initial point $\mathbf{r}_0$,
according to (7.49). The intensity at this point is associated with a field $\mathbf{E}(\mathbf{r}_0,t)$
whose spectrum is $\mathbf{E}(\mathbf{r}_0,\omega)$. On the other hand, $\langle t \rangle_{\mathbf{r}_0}^{\rm altered}$ is the arrival time of
a pulse with modified spectrum $\mathbf{E}(\mathbf{r}_0,\omega)\, e^{-\mathrm{Im}\,\mathbf{k}\cdot\Delta\mathbf{r}}$. Notice that $\mathbf{E}(\mathbf{r}_0,\omega)\, e^{-\mathrm{Im}\,\mathbf{k}\cdot\Delta\mathbf{r}}$ is
still evaluated at the initial point $\mathbf{r}_0$. Only the spectral amplitude (not the phase)
is modified, according to what is anticipated to be lost (or gained) during the trip.
In contrast to the net group delay, the reshaping delay is sensitive to how a pulse
is organized. The reshaping delay is negligible if the pulse is initially symmetric
(in amplitude and phase) before propagation. The reshaping delay also goes to
zero in the narrowband limit, and the total delay reduces to the net group delay.
Figure 7.13 The center of a chirped pulse can shift owing to the reshaping effect
when spectrum is removed.
Example 7.8
Find the time required for a Gaussian pulse (7.23) to traverse a slab of absorption
material (neglecting possible surface reflections). Let the material response be
described by the Lorentz model described in section 2.3 with the carrier frequency
of the pulse 0 , coinciding with the material resonance frequency. Let the slab
have thickness r = c1 /10 and absorption strength 2p = 10.
net group delay should be computed with the initial rather than final spectrum.
11 In general, one should write to distinguish the carrier frequency of the pulse from the
0
resonance frequency of the material 0 ; in practice, these are often different.
The index of refraction $n + i\kappa$ is given by (2.39) (see also (2.27) and (2.29)). Since
the expressions for $n$ and $\kappa$ are complicated, the integration in the above formula
must be performed numerically.
Figure 7.15 Delay as a function of pulse duration.
The result when $T = T_1 = 10\gamma^{-1}/\sqrt{2}$ (narrowband) is
$$
\Delta t_G = -5.1/\gamma = -51\,\Delta r/c = -0.72\, T_1
$$
and the result when $T = T_2 = \gamma^{-1}/\sqrt{2}$ (broadband) is
The narrowband pulse (with duration T1 ) in Example 7.8 traverses the ab-
sorbing medium superluminally (i.e. faster than c). The negative transit time
means that the center-of-mass of the exiting pulse emerges even before the
center-of-mass of the entering pulse reaches the medium! On the other hand,
the broadband pulse (with the shorter duration T2 ) has a large positive delay time,
indicating that the exiting pulse emerges subluminally.
Figure 7.14 shows the intensity profiles for these two pulses as they traverse
the absorption slab, calculated with the aid of (7.31). By eye, one can see how
the centers of the two pulses are either advanced or delayed as they go through
the absorption medium. In both cases, the pulse that emerges is well within
the envelope of the original pulse propagated forward at c. In the case of the
broadband pulse, the absorption peak eats a hole in the center of the spectrum
as shown in Fig. 7.12, causing the emerging pulse to be distorted in time. The
analysis in this section predicts the center of pulses, whereas to see the shape of
pulses one needs to calculate (7.31).
Figure 7.16 Narrowband pulse traversing an absorbing medium.
The results for the two pulse durations in Example 7.8 indicate a trend. Su-
perluminal behavior only occurs for long boring pulses. In the case of a single
absorption resonance, this comes with a severe cost of attenuation. Figure 7.15
shows the delay time as a function of pulse duration. As the injected pulse be-
comes more sharply defined in time, the superluminal behavior does not persist.
Sharply defined waveforms (i.e. broadband) cannot propagate superluminally
precisely because much of their bandwidth lies away from the frequencies with
superluminal group delays.
We should mention that superluminal propagation cannot persist for indefinite
distances since the medium eventually removes the superluminal spectral
components through absorption (or else adds subluminal spectral components
in the case of amplification). This limits the amount by which a pulse center can be
advanced to the scale of the pulse's own duration.
Figure 7.17 Animation comparing narrowband vs. broadband Gaussian pulses
traversing an amplifying slab (green stripe) slightly off resonance.
7.A Pulse Chirping in a Grating Pair 187
As we saw for the absorption situation, the exiting pulse is tiny and resides
well within the original envelope of the pulse propagated forward at speed c, as
depicted in Fig. 7.16. Without the absorbing material in place, the signal would
be detectable just as early. This statement is also true for amplifying media.12
Figure 7.17 shows narrowband and broadband pulses traversing an amplifying
medium. In this case, superluminal behavior occurs for spectra nearby but not
on an amplifying resonance. If the pulse is too broadband, its spectrum will be
amplified, which adds slower components to the overall group delay.
While it may be surprising at first to realize that group velocity can become
superluminal, it is to be expected for pulses whose spectra lie in the vicinity of a
medium resonance. Group velocity $v_g$ tracks the presence of field energy, whether
that energy propagates or is extracted from the medium at a point downstream.
Energy is never transported faster than the universal speed limit c. A detailed
analysis of energy flow is given in Appendix 7.B.
oscillator strength f .
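diffraction formula13
$$
\theta_r(\omega) = \sin^{-1}\left( \frac{2\pi c}{\omega d} - \sin\theta_i \right) \tag{7.54}
$$
where $d$ is the grating groove spacing. By examining the geometry of the figure,
we see that the reflected k-vector is given by $\mathbf{k} = (\omega/c)\left( \hat{\mathbf{x}}\cos\theta_r + \hat{\mathbf{y}}\sin\theta_r \right)$.
Suppose we know the pulse at a point $\mathbf{r}_0$ on the first grating. Next we choose a
point $\mathbf{r}_0 + \Delta\mathbf{r}$ on the second grating where we will determine the outgoing pulse.
Since we are considering an infinitely wide plane-wave pulse, it doesn't matter
where we choose that point as long as it lies on the surface of the second grating.
The waveform will be the same everywhere on the second grating, only with a
different arrival time. For convenience, we might as well take the second point to
be $\mathbf{r}_0 + \Delta\mathbf{r} = \mathbf{r}_0 + L\hat{\mathbf{x}}$ as shown in Fig. 7.19.
Figure 7.19 Direction of k-vector between parallel gratings (top view), showing
the first and second gratings. Grating rulings run in and out of the page.
The phase delay needed for (7.30) becomes
$$
\mathbf{k}(\omega)\cdot\Delta\mathbf{r} = \frac{\omega L}{c} \cos\theta_r \tag{7.55}
$$
We will express this as a Taylor-series expansion similar to (7.39) so that we can
perform the inverse Fourier transform analytically. We will approximate (7.55) as
$$
\mathbf{k}(\omega)\cdot\Delta\mathbf{r} \cong k_0 L + v_g^{-1} (\omega-\omega_0) L + \alpha\, (\omega-\omega_0)^2 L + \cdots \tag{7.56}
$$
so that we can take advantage of formula (7.46). To calculate the terms in this
expansion we will need the derivative of (7.54):
$$
\begin{aligned}
\frac{d\theta_r}{d\omega} &= -\frac{1}{\sqrt{1 - \left( \frac{2\pi c}{\omega d} - \sin\theta_i \right)^2}}\, \frac{2\pi c}{\omega^2 d}
= -\frac{1}{\sqrt{1 - \sin^2\theta_r}}\, \frac{2\pi c}{\omega^2 d} \\
&= -\frac{2\pi c}{\omega^2 d \cos\theta_r} = -\frac{\sin\theta_i + \sin\theta_r}{\omega\cos\theta_r}
\end{aligned} \tag{7.57}
$$
The linear and quadratic coefficients of the expansion (7.56) are then, with
$\theta_{r_0} \equiv \theta_r(\omega_0)$,
$$
v_g^{-1} = \frac{1}{L} \frac{d\left( \mathbf{k}\cdot\Delta\mathbf{r} \right)}{d\omega}\bigg|_{\omega_0} = \frac{1 + \sin\theta_{r_0}\sin\theta_i}{c\cos\theta_{r_0}} \tag{7.61}
$$
$$
\alpha = \frac{1}{2L} \frac{d^2\left( \mathbf{k}\cdot\Delta\mathbf{r} \right)}{d\omega^2}\bigg|_{\omega_0} = -\frac{\left( \sin\theta_i + \sin\theta_{r_0} \right)^2}{2c\,\omega_0 \cos^3\theta_{r_0}} \tag{7.62}
$$
In the case of a Gaussian pulse, we can employ (7.46), where $L$ takes the place of
$z$, and $k_0$, $v_g^{-1}$ and $\alpha$ are defined by (7.60)-(7.62). The duration of the pulse is
controlled by (7.62) and the spacing between the gratings $L$.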
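The closed forms (7.61) and (7.62) can be cross-checked against direct finite differences of the phase delay (7.55). The sketch below assumes a hypothetical setup (800 nm light, 1200 grooves/mm, 30° incidence, 10 cm grating separation); none of these numbers come from the text.

```python
import math

C = 299_792_458.0  # speed of light in vacuum (m/s)

def theta_r(omega, d, theta_i):
    """Diffraction angle from (7.54)."""
    return math.asin(2 * math.pi * C / (omega * d) - math.sin(theta_i))

def phase_delay(omega, d, theta_i, L):
    """Phase delay k(omega) . Delta r from (7.55)."""
    return omega * L / C * math.cos(theta_r(omega, d, theta_i))

def grating_coefficients(omega0, d, theta_i):
    """Closed forms (7.61) and (7.62); returns (1/v_g, alpha)."""
    tr = theta_r(omega0, d, theta_i)
    s = math.sin(theta_i) + math.sin(tr)
    inv_vg = (1 + math.sin(tr) * math.sin(theta_i)) / (C * math.cos(tr))
    alpha = -s**2 / (2 * C * omega0 * math.cos(tr)**3)
    return inv_vg, alpha

# Hypothetical setup, chosen only for illustration.
omega0 = 2 * math.pi * C / 800e-9            # carrier frequency for 800 nm light
d, theta_i, L = 1e-3 / 1200, math.radians(30), 0.10

inv_vg, alpha = grating_coefficients(omega0, d, theta_i)

# Independent check: central finite differences of the phase delay.
h = 1e-4 * omega0
f_plus = phase_delay(omega0 + h, d, theta_i, L)
f_zero = phase_delay(omega0, d, theta_i, L)
f_minus = phase_delay(omega0 - h, d, theta_i, L)
inv_vg_fd = (f_plus - f_minus) / (2 * h) / L
alpha_fd = (f_plus - 2 * f_zero + f_minus) / h**2 / (2 * L)

print(inv_vg, inv_vg_fd)
print(alpha, alpha_fd)
```

Note that $\alpha$ is negative for a grating pair, opposite in sign to the typical value for glass, which is why such a pair can be used to undo the chirp acquired in a dispersive material.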
$$
u_{\rm med}(\mathbf{r},t) = \int_{-\infty}^{t} \mathbf{E}(\mathbf{r},t')\cdot\frac{\partial\mathbf{P}(\mathbf{r},t')}{\partial t'}\, dt' \tag{7.64}
$$
The expression (7.63) for the energy density includes all (relevant) forms of energy,
including a non-zero integration constant $u(\mathbf{r},-\infty)$ corresponding to energy
stored in the medium before the arrival of any pulse (important in the case of an
amplifying medium). $u_{\rm field}(\mathbf{r},t)$ and $u_{\rm med}(\mathbf{r},t)$ are both zero before the arrival of
the pulse (i.e. at $t = -\infty$). In addition, $u_{\rm field}(\mathbf{r},t)$, given by (2.53), returns to zero
after the pulse has passed (i.e. at $t = +\infty$).
14 M. Ware, S. A. Glasgow, and J. Peatross, "Energy Transport in Linear Dielectrics," Opt. Express 9,
519-532 (2001).
As $u_{\rm med}$ increases, the energy in the medium increases. Conversely, as $u_{\rm med}$
decreases, the medium surrenders energy to the electromagnetic field. While it is
possible for $u_{\rm med}$ to become negative, the combination $u_{\rm med} + u(-\infty)$ (i.e. the net
energy in the medium) can never go negative since a material cannot surrender
more energy than it possesses to begin with.
Poynting's theorem (2.51) has the form of a continuity equation which when
integrated spatially over a small volume $V$ yields
$$
\oint_A \mathbf{S}\cdot d\mathbf{a} = -\frac{\partial}{\partial t} \int_V u\, dV \tag{7.65}
$$
where the left-hand side has been transformed into a surface integral (via the
divergence theorem (0.11)) representing the power leaving the volume. Let the
volume be small enough to take $\mathbf{S}$ to be uniform throughout $V$.
We can define an energy transport velocity (directed along $\mathbf{S}$) as the effective
speed at which all of the energy density would need to travel in order to achieve
the Poynting flux:
$$
\mathbf{v}_E \equiv \frac{\mathbf{S}}{u} \tag{7.66}
$$
Note that this ratio of the Poynting flux to the energy density has units of velocity.
When the total energy density $u$ is used in computing (7.66), the energy transport
velocity has a fictitious nature; it is not the actual velocity of the total energy
(since part is stationary), but rather the effective velocity necessary to achieve
the same energy transport that the electromagnetic flux alone delivers. If we
reduce the denominator to the subset of the energy that can move, namely $u_{\rm field}$,
the Cauchy-Schwarz inequality (i.e. $\alpha^2 + \beta^2 \ge 2\alpha\beta$) ensures that the energy transport
velocity $v_E$ remains strictly bounded by the speed of light in vacuum $c$. The total
energy density $u$ is at least as great as the field energy density $u_{\rm field}$. Hence, this
strict luminality is maintained.
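The bound can be made explicit with a short worked step. This is a sketch that assumes the standard vacuum-field forms $\mathbf{S} = \mathbf{E}\times\mathbf{B}/\mu_0$ and $u_{\rm field} = \epsilon_0 E^2/2 + B^2/2\mu_0$ for the real fields (the explicit form of (2.53) is taken as an assumption here), and chooses $\alpha = \sqrt{\epsilon_0/2}\,|\mathbf{E}|$ and $\beta = |\mathbf{B}|/\sqrt{2\mu_0}$ in the inequality:

```latex
\begin{align*}
u_{\rm field} &= \frac{\epsilon_0 E^2}{2} + \frac{B^2}{2\mu_0}
  = \alpha^2 + \beta^2 \;\ge\; 2\alpha\beta
  = \sqrt{\frac{\epsilon_0}{\mu_0}}\,|\mathbf{E}|\,|\mathbf{B}|
  = \frac{|\mathbf{E}|\,|\mathbf{B}|}{\mu_0 c} \\
|\mathbf{S}| &= \frac{|\mathbf{E}\times\mathbf{B}|}{\mu_0}
  \;\le\; \frac{|\mathbf{E}|\,|\mathbf{B}|}{\mu_0}
  \;\le\; c\,u_{\rm field}
\qquad\Longrightarrow\qquad
\frac{|\mathbf{S}|}{u_{\rm field}} \le c
\end{align*}
```

Since $u \ge u_{\rm field}$, dividing by the total energy density can only make the ratio smaller.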
Centroid of Energy
$$
\langle \mathbf{v}_E \rangle \equiv \frac{\int \mathbf{v}_E\, u\, d^3r}{\int u\, d^3r} = \frac{\int \mathbf{S}\, d^3r}{\int u\, d^3r} \tag{7.67}
$$
where we have assumed that the volume for the integration encloses all energy in
the system and that the field near the edges of this volume is zero. Since we have
included all energy, Poynting's theorem (2.51) can be written with no source terms
(i.e. $\nabla\cdot\mathbf{S} + \partial u/\partial t = 0$). This means that the total energy in the system is conserved
and is given by the integral in the denominator of (7.68). This allows the derivative
to be brought out in front of the entire expression giving
$$
\langle \mathbf{v}_E \rangle = \frac{\partial \langle \mathbf{r} \rangle}{\partial t} \quad \text{where} \quad
\langle \mathbf{r} \rangle \equiv \frac{\int \mathbf{r}\, u\, d^3r}{\int u\, d^3r} \tag{7.69}
$$
The latter expression represents the "center-of-mass" or centroid of the total
energy in the system, which is guaranteed to evolve strictly luminally since $\mathbf{v}_E$ is
everywhere luminal.15
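$$
u_{\rm med}(\mathbf{r},\infty) = \int_{-\infty}^{\infty} \left[ \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \mathbf{E}(\mathbf{r},\omega')\, e^{-i\omega' t'}\, d\omega' \right]
\cdot \left[ \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} (-i\omega)\, \epsilon_0\, \chi(\mathbf{r},\omega)\, \mathbf{E}(\mathbf{r},\omega)\, e^{-i\omega t'}\, d\omega \right] dt' \tag{7.72}
$$
where we have incorporated (7.70) and evaluated $u_{\rm med}$ after the pulse is over at
$t = \infty$. We may change the order of integration and write
$$
u_{\rm med}(\mathbf{r},\infty) = -i\epsilon_0 \int_{-\infty}^{\infty} d\omega\, \omega\, \chi(\mathbf{r},\omega)\, \mathbf{E}(\mathbf{r},\omega) \cdot
\int_{-\infty}^{\infty} d\omega'\, \mathbf{E}(\mathbf{r},\omega')\, \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-i(\omega+\omega') t'}\, dt' \tag{7.73}
$$
The final integral is a delta function similar to (0.54), which allows
the middle integral also to be performed. The expression for $u_{\rm med}$ then reduces to
$$
u_{\rm med}(\mathbf{r},\infty) = -i\epsilon_0 \int_{-\infty}^{\infty} \omega\, \chi(\mathbf{r},\omega)\, \mathbf{E}(\mathbf{r},\omega)\cdot\mathbf{E}(\mathbf{r},-\omega)\, d\omega \tag{7.74}
$$
In this derivation, we take $\mathbf{E}(\mathbf{r},t)$ and $\mathbf{P}(\mathbf{r},t)$ to be real functions, so we can employ
the symmetry (7.29) along with the corresponding symmetry of the susceptibility,
$\chi^*(\mathbf{r},\omega) = \chi(\mathbf{r},-\omega)$. Then we obtain
$$
u_{\rm med}(\mathbf{r},\infty) = \epsilon_0 \int_{-\infty}^{\infty} \omega\, \mathrm{Im}\,\chi(\mathbf{r},\omega)\, \mathbf{E}(\mathbf{r},\omega)\cdot\mathbf{E}^*(\mathbf{r},\omega)\, d\omega \tag{7.75}
$$
15 Although (7.69) guarantees that the centroid of the total energy moves strictly luminally, there is
no such limitation on the centroid of field energy alone. The steps leading to (7.69) are not possible
if $u_{\rm field}$ is used in place of $u$. Explicitly, that is
$$
\left\langle \frac{\mathbf{S}}{u_{\rm field}} \right\rangle \ne \frac{\partial}{\partial t}\, \frac{\int \mathbf{r}\, u_{\rm field}\, d^3r}{\int u_{\rm field}\, d^3r}
$$
As was pointed out, the left-hand side is strictly luminal. However, the right-hand side can easily
exceed c as the medium exchanges energy with the field. In an amplifying medium, for example, the
rapid appearance of a pulse downstream can occur when the leading portion of a pulse stimulates
energy already present in the medium to convert to the form of field energy. Group velocity is
related to this method of accounting, which is why it also can become superluminal.
16 We assume that the real forms of the fields in the time domain are used for the sake of this
multiplication.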
The expression (7.75) describes the net energy density transferred to a point
in the medium after all action has finished (i.e. at $t = +\infty$). It involves the power
spectrum of the pulse. We can modify this formula in an intuitive way so that it
describes the transfer of energy density to the medium for any time during the
pulse.
Since the medium is unable to anticipate the spectrum of the entire pulse
before experiencing it, the material responds to the pulse according to the history
of the field up to each instant. In particular, the material has to be prepared for
the possibility of an abrupt cessation of the pulse at any moment, in which case
all exchange of energy with the medium immediately ceases. In this extreme sce-
nario, there is no possibility for the medium to recover from previously incorrect
attenuation or amplification, so it must have gotten it right already.
If the pulse were in fact to abruptly terminate at a given instant, it would
not be necessary to integrate the inverse Fourier transform (7.19) beyond the
termination time t after which all contributions are zero. Causality requires that
the medium be indifferent to whether a pulse actually terminates if that possibility
lies in the future. Therefore, (7.75) can apply for any time t (not just for $t = +\infty$)
if the spectrum (7.19) is evaluated just for that portion of the field previously
experienced by the medium (up to time t ).
The following is then an exact representation for the energy density (7.64)
transferred to the medium:
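$$u_{\text{med}}(\mathbf{r},t) = \varepsilon_0 \int \omega\, \mathrm{Im}\,\chi(\mathbf{r},\omega)\, \mathbf{E}_t(\mathbf{r},\omega) \cdot \mathbf{E}_t^*(\mathbf{r},\omega)\, d\omega \tag{7.76}$$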
7.B Causality and Exchange of Energy with the Medium 193
where
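$$\mathbf{E}_t(\mathbf{r},\omega) \equiv \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{t} \mathbf{E}(\mathbf{r},t')\, e^{i\omega t'}\, dt' \tag{7.77}$$
This time dependence enters only through $\mathbf{E}_t(\mathbf{r},\omega) \cdot \mathbf{E}_t^*(\mathbf{r},\omega)$, known as the
instantaneous power spectrum.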
The expression (7.76) gives physical insight into the manner in which causal
dielectric materials exchange energy with different parts of an electromagnetic
pulse. Since the function $E_t(\omega)$ is the Fourier transform of the pulse truncated
at the current time t and set to zero thereafter, it can include many frequency
components that are not present in the pulse taken in its entirety. This explains
why the medium can respond differently to the front of a pulse compared to the
back. Even though absorption or amplification resonances may lie outside of
the spectral envelope of a pulse taken in its entirety, the instantaneous spectrum
on a portion of the pulse can momentarily lap onto or off of resonances in the
medium.
Figure 7.20 Real and imaginary parts of the refractive index for an amplifying medium.
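As a rough numerical illustration of the truncated transform (7.77), the sketch below compares the instantaneous power spectrum of a Gaussian pulse cut off on its leading edge with the spectrum of the complete pulse. The pulse parameters, the frequency offset, and the helper names are assumptions chosen only for illustration; the integral is approximated by a simple sum on a uniform time grid.

```python
import numpy as np

# Assumed, illustrative parameters (arbitrary units): duration T and carrier w0.
T, w0 = 1.0, 20.0
t_grid = np.arange(-8 * T, 8 * T, 0.005)
field = np.exp(-t_grid**2 / (2 * T**2)) * np.cos(w0 * t_grid)  # real field E(t)

def instantaneous_spectrum(omega, t_stop):
    """Approximate E_t(omega): the transform of E(t') for t' <= t_stop only."""
    keep = t_grid <= t_stop
    dt = t_grid[1] - t_grid[0]
    kernel = np.exp(1j * np.outer(omega, t_grid[keep]))
    return kernel @ field[keep] * dt / np.sqrt(2 * np.pi)

def offset_to_carrier_ratio(t_stop, offset=4.0):
    """Power at w0 + offset relative to the power at the carrier w0."""
    e_t = instantaneous_spectrum(np.array([w0, w0 + offset]), t_stop)
    power = np.abs(e_t) ** 2
    return power[1] / power[0]

ratio_front = offset_to_carrier_ratio(-T)     # pulse truncated on its leading edge
ratio_full = offset_to_carrier_ratio(8 * T)   # essentially the entire pulse
print(ratio_front, ratio_full)
```

The truncated pulse carries far more relative weight at the offset frequency than the complete pulse does, which is the sense in which the front of a pulse can lap onto a resonance that lies outside the full spectrum.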
In view of (7.76) and (7.77) it is straightforward to predict when the electro-
magnetic energy of a pulse will exhibit superluminal or subluminal behavior. In
section 7.5, we saw that this behavior is controlled by the group velocity function.
However, in (7.76) and (7.77), we see that it is also predictable from the imaginary
part of the susceptibility $\chi(\mathbf{r},\omega)$.
If the entire pulse passing through point r has a spectrum in the neighborhood
of an amplifying resonance, but not on the resonance, superluminal behavior
can result. The instantaneous spectrum during the front portion of the pulse is
generally wider and can therefore lap onto the nearby gain peak. The medium
accordingly amplifies this perceived spectrum, and the front of the pulse grows.
The energy is then returned to the medium from the latter portion of the pulse
as the instantaneous spectrum narrows and withdraws from the gain peak. The
effect is not only consistent with the principle of causality, it is a direct and general
consequence of causality as demonstrated by (7.76) and (7.77). p
As an illustration, consider the broadband waveform with T2 = 1 / 2 de-
scribed in Example 7.8. Consider an amplifying medium with index shown in
Fig. 7.20 with the amplifying resonance (negative oscillator strength) set on the
frequency 0 = 0 + 2, where 0 is the carrier frequency. Thus, the resonance
structure is centered a modest distance above the carrier frequency, and there is
only minor spectral overlap between the pulse and the resonance structure.
Fig. 7.21 shows how the early portion of a pulse has a wide instantaneous
spectrum computed by (7.77) that laps onto the amplifying resonance. As the
wings grow and access the neighboring resonance, the pulse extracts more energy
from the medium. As the wings diminish, the pulse surrenders much of that
energy back to the medium, which shifts the center of the pulse forward producing
a superluminal effect.
Figure 7.21 Animation of a narrowband pulse traversing an amplifying medium off resonance.
The black dot shows the movement of the center of all energy. The red line inside the medium
shows the energy held in that medium, which cannot go negative. The lower figure shows the
instantaneous spectrum of the pulse at the front of the medium relative to the narrow amplifying
resonance.
In this appendix we have indirectly proven that a sharply defined signal edge
cannot propagate faster than c. If a signal edge begins abruptly at time $t_0$, the
instantaneous spectrum $E_t(\omega)$ clearly remains identically zero until that time. In
other words, no energy may be exchanged with the medium until the field energy
from the pulse arrives. Since, as was pointed out in connection with (7.66), the
Cauchy-Schwarz inequality prevents the field energy from traveling faster than c,
at no point in the medium can a signal front exceed c.
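$$P(\omega) = \varepsilon_0\,\chi(\omega)\,E(\omega) \tag{7.78}$$
They made an argument based on causality (i.e. effect cannot precede cause),
which allows one to obtain the real part of $\chi(\omega)$ from the imaginary part of $\chi(\omega)$,
if it is known for all $\omega$. Similarly, one can obtain the imaginary part of $\chi(\omega)$ from
the real part of $\chi(\omega)$. We develop the Kramers-Kronig formulas below.17
We can replace $E(\omega)$ in (7.78) with the Fourier transform of $E(t)$ in accordance
with (7.19). In addition, we take the inverse Fourier transform (7.19) of both sides
of (7.78) and obtain
$$P(t) = \frac{\varepsilon_0}{\sqrt{2\pi}} \int \chi(\omega) \left[ \frac{1}{\sqrt{2\pi}} \int E(t')\, e^{i\omega t'}\, dt' \right] e^{-i\omega t}\, d\omega \tag{7.79}$$
Interchanging the order of integration, we may write this as
$$P(t) = \varepsilon_0 \int E(t') \left[ \frac{1}{2\pi} \int \chi(\omega)\, e^{-i\omega(t-t')}\, d\omega \right] dt' \tag{7.80}$$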
Now for the causality argument: The polarization of the medium $P(t)$ cannot
depend on the field $E(t')$ at future times $t' > t$. Therefore the expression in square
brackets must be identically zero unless $t - t' > 0$. This places a restriction on the
functional form of $\chi(\omega)$ as we shall see.
The causality argument comes explicitly into play when we employ the following
integral formula:18
$$e^{-i\omega(t-t')} = \frac{\operatorname{sign}\{t-t'\}}{i\pi} \int_{-\infty}^{\infty} \frac{e^{-i\omega'(t-t')}}{\omega-\omega'}\, d\omega' \tag{7.81}$$
17 See J. D. Jackson, Classical Electrodynamics, 3rd ed., Sect. 7.10 (New York: John Wiley, 1999).
Also B. Y.-K. Hu, Kramers-Kronig in two lines, Am. J. Phys. 57, 821 (1989).
18 This integral, which is a specific instance of Cauchy's theorem, is tricky because it involves two
diverging pieces, to either side of the singularity $\omega = \omega'$. The divergences have opposite sign so that
they cancel. The integration must approach the singularity in the same manner from either side, in
which case the result is called the principal value. In practical terms, if the integral is performed
numerically, the sampling of points should straddle the singularity symmetrically; other sampling
schemes can change the result dramatically and give an incorrect value.
7.C Kramers-Kronig Relations 195
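Apparently, we require the positive sign, since causality restricts the response to $t > t'$. Here
$$\operatorname{sign}\{t-t'\} \equiv \begin{cases} +1 & (t > t') \\ -1 & (t < t') \end{cases}$$
Upon substitution of (7.81) into (7.80) and after changing the order of integration
within the square brackets we obtain
$$P(t) = \varepsilon_0 \int E(t') \left[ \frac{1}{2\pi} \int \left\{ \frac{1}{i\pi} \int \frac{\chi(\omega)}{\omega-\omega'}\, d\omega \right\} e^{-i\omega'(t-t')}\, d\omega' \right] dt' \tag{7.82}$$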
These are known as the Kramers-Kronig relations on real and imaginary parts of
$\chi$.19 If the real part of $\chi$ is known at all frequencies, we can use the Kramers-Kronig
relations to generate the imaginary part, and vice versa. We see that the real and
imaginary parts of $\chi$ cannot be chosen independently, if we are to respect the
principle of causality.
Example 7.9
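Show that the expression in square brackets of (7.80) is zero when $t' > t$, if $\chi(\omega)$
satisfies the Kramers-Kronig relations (7.85).
$$\begin{aligned}
\int \chi(\omega)\, e^{-i\omega(t-t')}\, d\omega &= \int \mathrm{Re}\,\chi(\omega)\, e^{-i\omega(t-t')}\, d\omega + i \int \mathrm{Im}\,\chi(\omega)\, e^{-i\omega(t-t')}\, d\omega \\
&= \int \mathrm{Re}\,\chi(\omega)\, e^{-i\omega(t-t')}\, d\omega + i \int \left[ -\frac{1}{\pi} \int \frac{\mathrm{Re}\,\chi(\omega')}{\omega'-\omega}\, d\omega' \right] e^{-i\omega(t-t')}\, d\omega \\
&= \int \mathrm{Re}\,\chi(\omega)\, e^{-i\omega(t-t')}\, d\omega + \int \mathrm{Re}\,\chi(\omega') \left[ \frac{1}{i\pi} \int \frac{e^{-i\omega(t-t')}}{\omega'-\omega}\, d\omega \right] d\omega'
\end{aligned} \tag{7.86}$$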
19 As with (7.81), the principal value of the integral must be calculated. If the integral is performed
numerically, the sampling of points should straddle the singularity symmetrically. Separately, the
integral on each side of $\omega' = \omega$ diverges, but with opposite sign.
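where we have invoked the Kramers-Kronig relation for $\mathrm{Im}\,\chi(\omega)$ (7.85) and
interchanged the order of integration in the final expression. Since we are specifically
considering future times $t' > t$, we have by (7.81)
$$\frac{1}{i\pi} \int \frac{e^{-i\omega(t-t')}}{\omega'-\omega}\, d\omega = -e^{-i\omega'(t-t')}$$
Hence
$$\int \chi(\omega)\, e^{-i\omega(t-t')}\, d\omega = \int \mathrm{Re}\,\chi(\omega)\, e^{-i\omega(t-t')}\, d\omega - \int \mathrm{Re}\,\chi(\omega')\, e^{-i\omega'(t-t')}\, d\omega' = 0 \tag{7.87}$$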
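Finally, it is worth noting that the Kramers-Kronig relations also apply to the
real and imaginary parts of the index of refraction (subtract one).20
$$n(\omega) - 1 = \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{\kappa(\omega')}{\omega'-\omega}\, d\omega' \quad \text{and} \quad \kappa(\omega) = -\frac{1}{\pi} \int_{-\infty}^{\infty} \frac{n(\omega')-1}{\omega'-\omega}\, d\omega' \tag{7.88}$$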
One can use the Kramers-Kronig relations to find the real part of the index from
a measurement of absorption, if the measurement is done over a broad enough
range of the spectrum. This is the most useful form of the Kramers-Kronig rela-
tions.
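The following sketch tries out the first Kramers-Kronig relation for the index numerically: it rebuilds $n(\omega) - 1$ from $\kappa(\omega)$ using a principal-value sum whose grid straddles the singularity symmetrically, as the footnotes recommend. The single-resonance Lorentz susceptibility, its parameter values, the finite integration range, and the function names are all assumptions made for illustration only.

```python
import numpy as np

def complex_index(w, w0=10.0, gamma=1.0, wp2=1.0):
    """n + i*kappa for an assumed single-resonance Lorentz susceptibility."""
    chi = wp2 / (w0**2 - w**2 - 1j * gamma * w)
    return np.sqrt(1.0 + chi)

def index_from_kappa(w_eval, kappa, half_width=200.0, step=0.01):
    """n(w) = 1 + (1/pi) PV integral of kappa(w') / (w' - w) dw'.

    The integration points sit at odd multiples of step/2, so every
    evaluation frequency that is a multiple of `step` lies exactly midway
    between two grid points (symmetric straddling of the singularity).
    """
    n_pts = int(round(2 * half_width / step))
    w_prime = (np.arange(n_pts) + 0.5) * step - half_width
    kernel = 1.0 / (w_prime[None, :] - w_eval[:, None])
    return 1.0 + (kernel @ kappa(w_prime)) * step / np.pi

w_eval = np.arange(5.0, 15.25, 0.25)   # multiples of the grid step
n_kk = index_from_kappa(w_eval, lambda w: complex_index(w).imag)
n_exact = complex_index(w_eval).real
max_error = np.max(np.abs(n_kk - n_exact))
print(max_error)
```

Shifting the grid so that it no longer straddles each evaluation frequency symmetrically is an easy way to see how sensitive the principal value is to the sampling scheme.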
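It is sometimes convenient to multiply the numerator and denominator inside
the integrands of (7.88) by $\omega' + \omega$. Then noting that $n$ is an even function and
$\kappa$ is an odd function allows us to dismiss either $\omega'$ or $\omega$ in the numerator and
integrate21 over positive frequencies only:
$$n(\omega) - 1 = \frac{2}{\pi} \int_0^{\infty} \frac{\omega'\,\kappa(\omega')}{\omega'^2-\omega^2}\, d\omega' \quad \text{and} \quad \kappa(\omega) = -\frac{2\omega}{\pi} \int_0^{\infty} \frac{n(\omega')-1}{\omega'^2-\omega^2}\, d\omega' \tag{7.89}$$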
20 This follows from Cauchy's theorem since the complex index is the square root of $1 + \chi(\omega)$.
The Kramers-Kronig relations for $\chi(\omega)$ guarantee that $\chi(\omega)$ has no poles in the upper half complex
plane, when $\omega$ is considered (for mathematical purposes) to be a complex variable. Taking the
square root does not introduce poles into the upper half plane.
21 The integrals (7.88) and (7.89) diverge to either side of $\omega' = \omega$, but with opposite sign. Again,
the principal value of the integral is required, which means a numeric grid should straddle the
singularity symmetrically.
Exercises 197
Exercises
P7.2 Equation (7.7) implies that there is no interference between fields that
are polarized along orthogonal dimensions. That is, the intensity of
Exercises for 7.2 Group vs. Phase Velocity: Sum of Two Plane Waves
P7.6 (a) Consider the Gaussian pulse defined in (7.25). Determine the Full-
Width-at-Half-Maximum (FWHM) of the intensity I (r, t ), represented
by TFWHM (or t FWHM if you wish), and FWHM of the power spectrum
I (r, ), represented by FWHM (or FWHM if you wish).
HINT: Both answers are in terms of T .
(b) Give an uncertainty principle for the product of t FWHM FWHM .
$$T = \frac{T_{\text{FWHM}}}{2\sqrt{\ln 2}}$$
(see P7.6).
P7.8 If the pulse defined in (7.46) travels through the material for a very
long distance z such that T (z) T (z), show that the instantaneous
frequency of the pulse (defined to be the time derivative of the overall
phase) is
t z/v g
0 +
2z
COMMENT: As the wave travels, the earlier part of the pulse oscillates
more slowly than the later part. This is called chirp, and it means
that the red frequencies get ahead of the blue ones since they experi-
ence a lower index. The instantaneous frequency is the effective local
frequency.
P7.10 When the spectrum of a pulse is very broad, the reshaping delay (7.53) is
negligible. Show that in this case the net delay reduces to
r
lim tG (r) =
T 0 c
assuming k and r are parallel. This implies that a sharply defined
signal cannot travel faster than c.
HINT: The real index of refraction n goes to unity far from resonance,
and the imaginary part goes to zero.
R h
i
d E (r, ) E (r, )
t = i T [E (r, )]
R
d E (r, ) E (r, )
t = T [E (r0 + r, )] T [E (r0 , )] .
Coherence Theory
Coherence theory is the study of correlations that exist between different parts
of a light field. Temporal coherence indicates a correlation between fields offset
in time, $\mathbf{E}(\mathbf{r},t)$ and $\mathbf{E}(\mathbf{r},t-\tau)$. Spatial coherence has to do with correlations
between fields at different spatial locations, $\mathbf{E}(\mathbf{r},t)$ and $\mathbf{E}(\mathbf{r}+\Delta\mathbf{r},t)$. Because light
oscillations are too fast to resolve directly, we usually need to study optical
coherence using interference techniques. In these techniques, light from different
times or places in the light field is brought together at a detection point. If
the two fields have a high degree of coherence, they consistently interfere either
constructively or destructively at the detection point. If the two fields are not
coherent, the interference at the detection point rapidly fluctuates between con-
structive and destructive interference, so that a time-averaged signal does not
show interference.
You are probably already familiar with two instruments that measure coher-
ence: the Michelson interferometer, which measures temporal coherence, and
Young's two-slit interferometer, which measures spatial coherence. Your pre-
liminary understanding of these instruments was probably gained in terms of
single-frequency plane waves, which are perfectly coherent for all separations in
time and space. In this chapter, we build on that foundation and derive descrip-
tions that are appropriate when light with imperfect coherence is sent through
these instruments. We also discuss a practical application known as Fourier spec-
troscopy (Section 8.4) which allows us to measure the spectrum of light using a
Michelson interferometer rather than a grating spectrometer.
8.1 Michelson Interferometer
A Michelson interferometer employs a 50:50 beamsplitter to divide an initial
beam into two identical beams and then delays one beam with respect to the
other before bringing them back together (see Fig. 8.1). Depending on the relative
path difference $d$ (roundtrip by our convention) between the two arms of the
system, the light can interfere constructively or destructively in the direction of
the detector. The relative path difference $d$ introduces a time delay $\tau$, defined by
$\tau \equiv d/c$.
Figure 8.1 Michelson interferometer.
If the input light is a plane wave, the net field at the detector consists of the
field coming from one arm of the interferometer $\mathbf{E}_0 e^{i(kz-\omega t)}$ added to the field
coming from the other arm $\mathbf{E}_0 e^{i(kz-\omega(t-\tau))}$. These two fields are identical except
for the delay $\tau$. The intensity seen at the detector as a function of path difference
is computed to be
$$\begin{aligned}
I_{\text{tot}}(\tau) &= \frac{c\varepsilon_0}{2} \left[ \mathbf{E}_0 e^{i(kz-\omega t)} + \mathbf{E}_0 e^{i(kz-\omega(t-\tau))} \right] \cdot \left[ \mathbf{E}_0 e^{i(kz-\omega t)} + \mathbf{E}_0 e^{i(kz-\omega(t-\tau))} \right]^* \\
&= \frac{c\varepsilon_0}{2} \left[ 2\mathbf{E}_0 \cdot \mathbf{E}_0^* + 2\mathbf{E}_0 \cdot \mathbf{E}_0^* \cos(\omega\tau) \right] \\
&= 2 I_0 \left[ 1 + \cos(\omega\tau) \right] \qquad \text{(Plane Wave Input)}
\end{aligned} \tag{8.1}$$
where $I_0 \equiv \frac{c\varepsilon_0}{2} \mathbf{E}_0 \cdot \mathbf{E}_0^*$ is the intensity from one beam alone (when the other arm of
the interferometer is blocked). This formula is probably familiar. It describes how
the intensity at the detector oscillates between zero and four times the intensity
from one beam,1 as plotted in Fig. 8.2.
Figure 8.2 The intensity seen at the detector of a Michelson interferometer with a plane-wave input.
Because the plane wave is coherent over an infinite distance, the output oscillates without
diminishing as the delay $\tau$ is adjusted in either direction. When the intensity at the detector is
zero, all of the light is reflected back to the source.
When light containing a continuous band of frequencies is sent through
the interferometer, (8.1) no longer holds. Instead of repeating indefinitely, the
oscillations at the detector become less pronounced as $\tau$ increases. The concept
of temporal coherence describes how fast fringe visibility diminishes as delay is
introduced in an arm of the Michelson interferometer. The less coherent the
light source, the faster the fringes die out as the delay increases. To model this
behavior, we need to expand our analysis beyond (8.1).
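Consider an arbitrary waveform $\mathbf{E}(t)$ (comprised of many frequency components)
that has traveled through the first arm of a Michelson interferometer to
arrive at the detector in Fig. 8.1. The beam that travels through the second arm
of the interferometer is identical, but delayed by the round-trip delay $\tau$: $\mathbf{E}(t-\tau)$.
The total field at the detector is the sum of these two fields:
$$\mathbf{E}_{\text{tot}}(t,\tau) = \mathbf{E}(t) + \mathbf{E}(t-\tau) \tag{8.2}$$
The total intensity $I_{\text{tot}}$ at the detector is found using (7.21) (with n = 1):
$$\begin{aligned}
I_{\text{tot}}(t,\tau) &= \frac{c\varepsilon_0}{2} \mathbf{E}_{\text{tot}}(t,\tau) \cdot \mathbf{E}_{\text{tot}}^*(t,\tau) \\
&= \frac{c\varepsilon_0}{2} \left[ \mathbf{E}(t) \cdot \mathbf{E}^*(t) + \mathbf{E}(t) \cdot \mathbf{E}^*(t-\tau) + \mathbf{E}(t-\tau) \cdot \mathbf{E}^*(t) + \mathbf{E}(t-\tau) \cdot \mathbf{E}^*(t-\tau) \right] \\
&= I(t) + I(t-\tau) + \frac{c\varepsilon_0}{2} \left[ \mathbf{E}(t) \cdot \mathbf{E}^*(t-\tau) + \mathbf{E}(t-\tau) \cdot \mathbf{E}^*(t) \right] \\
&= I(t) + I(t-\tau) + c\varepsilon_0\, \mathrm{Re} \left[ \mathbf{E}(t) \cdot \mathbf{E}^*(t-\tau) \right]
\end{aligned} \tag{8.3}$$
As a reminder, the function $I(t) = \frac{c\varepsilon_0}{2} \mathbf{E}(t) \cdot \mathbf{E}^*(t)$ corresponds to the intensity
of the first beam at the detector when the second arm of the interferometer is blocked.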
1 Keep in mind that if a 50:50 beam splitter is used, then the intensity arriving to the detector
from one arm alone (with other arm blocked) is one fourth of the original beam, since the light
meets the beam splitter twice.
The rapid oscillations of the light are automatically averaged away in I (t ) since
we used (7.21), but the slowly varying envelope of the arbitrary pulse is retained.
The intensity of the combined beams $I_{\text{tot}}(t,\tau)$ varies with $t$ and also depends on
the path delay $\tau$.
We consider $I(t)$ to be a pulse with a finite duration. We will be interested
in the total amount of energy (per area) that the pulse deposits on a detector.2
The detected signal, which we'll denote by $\mathrm{Sig}(\tau)$, is the time-integrated intensity,
called fluence in units of energy per area:
$$\mathrm{Sig}(\tau) \propto \int_{-\infty}^{\infty} I_{\text{tot}}(t,\tau)\, dt \tag{8.4}$$
The proportionality accounts for the calibration of the detector, which might
report in volts or current, etc. The fluence arriving at the detector is sensitive
to the delay $\tau$ between the arms of the interferometer. Presumably, we can
repeatedly send identical pulses into the interferometer and record $\mathrm{Sig}(\tau)$ for
many different delays $\tau$. We can manipulate the fluence integral in (8.4) into a
more useful form that will make the coherence properties more evident.

Manipulation of the fluence integral
Inserting (8.3) into the fluence integral, we have
$$\int I_{\text{tot}}(t,\tau)\, dt = \int I(t)\, dt + \int I(t-\tau)\, dt + c\varepsilon_0\, \mathrm{Re} \int \mathbf{E}(t) \cdot \mathbf{E}^*(t-\tau)\, dt \tag{8.5}$$
The first two integrals on the right-hand side of (8.5) are equal,3 and give the
fluence $\mathcal{E}$ from either arm of the interferometer when the other arm is blocked:
$$\mathcal{E} \equiv \int I(t)\, dt = \int I(t-\tau)\, dt \tag{8.6}$$
The final integral in (8.5) remains unchanged if we take a Fourier transform
followed by an inverse Fourier transform:
$$\int \mathbf{E}(t) \cdot \mathbf{E}^*(t-\tau)\, dt = \frac{1}{\sqrt{2\pi}} \int d\omega\, e^{-i\omega\tau} \left[ \frac{1}{\sqrt{2\pi}} \int d\tau\, e^{i\omega\tau} \int \mathbf{E}(t) \cdot \mathbf{E}^*(t-\tau)\, dt \right] \tag{8.7}$$

Albert Abraham Michelson (1852-1931, United States) was born in Poland,
but he immigrated to the US with his parents and grew up in the rough mining
towns of California and Nevada where his father was a merchant. Michelson
attended high school in San Francisco. He entered the US Naval Academy in
1869 (with intervention from US President Grant after Michelson pleaded his
case when the president was walking near the White House). After two years
at sea, Michelson returned to the Naval Academy to teach physics and mathematics
for several years. Michelson was fascinated by the problem of determining
the speed of light, and developed successive experiments to measure it more
accurately. He is probably most famous for his experiment conducted at Case
School of Applied Science in Cleveland with Edward Morley to detect the motion
of the earth through the ether. Michelson later was a professor at the University of
Chicago and then at Caltech. In 1907, he became the first American to win the
Nobel prize, for his contributions to optics. Michelson married late in life and
was the father of four. (Wikipedia)
The reason for this procedure is so that we can take advantage of the autocorrelation
theorem described in P0.27. With it, the expression in square brackets
simplifies to $\sqrt{2\pi}\, \mathbf{E}(\omega) \cdot \mathbf{E}^*(\omega) = 2\sqrt{2\pi}\, I(\omega)/c\varepsilon_0$. Then with the aid of (8.6) and (8.7),
the overall fluence (8.5) becomes
$$\int I_{\text{tot}}(t,\tau)\, dt = 2\mathcal{E} \left[ 1 + \mathrm{Re}\, \frac{1}{\mathcal{E}} \int I(\omega)\, e^{-i\omega\tau}\, d\omega \right] \tag{8.8}$$
2 For sub-nanosecond laser pulses, a detector automatically integrates the entire energy of the
pulse since a detector cannot keep up with temporal variations on such a rapid time scale.
3 Note that the second integral is insensitive to $\tau$ since a change of variables $t' = t - \tau$ converts it
into the first integral.
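With (8.8), we can rewrite the physical signal (8.4) in the more useful form
$$\mathrm{Sig}(\tau) \propto 2\mathcal{E} \left[ 1 + \mathrm{Re}\, \gamma(\tau) \right] \tag{8.9}$$
where the dependence on the path delay $\tau$ is entirely contained in the degree of
coherence function $\gamma(\tau)$:4
$$\gamma(\tau) \equiv \frac{\int I(\omega)\, e^{-i\omega\tau}\, d\omega}{\int I(\omega)\, d\omega} \tag{8.10}$$
The denominator of (8.10) was rewritten with the help of Parseval's theorem
$\mathcal{E} \equiv \int I(t)\, dt = \int I(\omega)\, d\omega$. Remarkably, the signal out of the Michelson
interferometer does not depend on the phase of $\mathbf{E}(\omega)$. It depends only on the amount
of light associated with each frequency through $I(\omega) \equiv \frac{\varepsilon_0 c}{2} \mathbf{E}(\omega) \cdot \mathbf{E}^*(\omega)$.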
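To make the degree of coherence function concrete, here is a minimal sketch that evaluates the ratio of integrals in (8.10) by direct summation for a Gaussian power spectrum $I(\omega) \propto e^{-T^2(\omega-\omega_0)^2}$ and then forms the normalized detector signal of (8.9). The parameter values, grids, and function name are assumptions for illustration; the comparison curve is the standard Fourier transform of a Gaussian.

```python
import numpy as np

def degree_of_coherence(omega, spectrum, tau):
    """gamma(tau): ratio of sums approximating the two integrals in (8.10).

    A uniform frequency grid is assumed, so the grid spacing cancels
    between numerator and denominator.
    """
    kernel = np.exp(-1j * np.outer(tau, omega))
    return (kernel @ spectrum) / np.sum(spectrum)

# Assumed, illustrative parameters (arbitrary units).
T, w0 = 1.0, 20.0
omega = np.linspace(w0 - 10 / T, w0 + 10 / T, 4001)
spectrum = np.exp(-T**2 * (omega - w0) ** 2)      # Gaussian power spectrum
tau = np.linspace(-6 * T, 6 * T, 241)

gamma = degree_of_coherence(omega, spectrum, tau)
signal = 2.0 * (1.0 + gamma.real)                 # Sig(tau) per unit fluence

# Closed form for this spectrum, used here only as a cross-check.
expected = np.exp(-tau**2 / (4 * T**2)) * np.exp(-1j * w0 * tau)
print(np.max(np.abs(gamma - expected)))
```

Only the power spectrum enters the calculation, which mirrors the remark that the signal is insensitive to the spectral phase.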
We could have derived (8.9) using another strategy, which may seem more intuitive
than the approach above. Equation (8.1) gives the intensity at the detector when a
single plane wave of frequency $\omega$ goes through the interferometer. Now suppose
that a waveform composed of many frequencies is sent through the interferometer.
The intensity associated with each frequency acts independently, obeying (8.1)
individually.
The total energy (per area) accumulated at the detector is then a linear superposition
of the spectral intensities of all frequencies present:
$$\int I_{\text{tot}}(\omega,\tau)\, d\omega = \int 2 I(\omega) \left[ 1 + \cos(\omega\tau) \right] d\omega \tag{8.11}$$
While this procedure may seem obvious, the fact that we can do it is remarkable!
Remember that it is usually the fields that we must add together before finding the
intensity of the resulting superposition. The formula (8.11) with its superposition
of intensities relies on the fact that the different frequencies inside the interferom-
eter when time-averaged (over all time) do not interfere. Certainly, the fields at
different frequencies do interfere (or beat in time). However, they constructively
interfere as often as they destructively interfere, and in a time-averaged picture it
is as though the individual frequency components transmit independently. Again,
in writing (8.11) we considered the light to be pulsed rather than continuous so
that the integrals converge.
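The claim that intensities at different frequencies add independently once the signal is integrated over all time can be checked in a discrete setting. The sketch below is an assumed sampled analogue of (8.11): the field is a set of Fourier components with random spectral phases, and the delay is a whole number of samples with periodic wraparound (via np.roll), which is a simplification of a real interferometer. Under those assumptions the time-domain fluence and the frequency-by-frequency sum agree to rounding error.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4096

# Bin numbers k of the discrete Fourier components (0, 1, ..., -2, -1).
k = np.fft.fftfreq(n, d=1.0 / n)
amplitude = np.exp(-((k - 400.0) / 60.0) ** 2)    # assumed spectral envelope
phase = rng.uniform(0.0, 2 * np.pi, n)            # arbitrary spectral phase
field_spectrum = amplitude * np.exp(1j * phase)
field_time = np.fft.ifft(field_spectrum)

def fluence_time_domain(shift):
    """Sum over time of |E(t) + E(t - shift)|^2, with periodic delay."""
    delayed = np.roll(field_time, shift)
    return np.sum(np.abs(field_time + delayed) ** 2)

def fluence_from_spectrum(shift):
    """Sum over frequency of 2 I_k [1 + cos(theta_k)], as in (8.11)."""
    power = np.abs(field_spectrum) ** 2 / n        # discrete Parseval factor
    return np.sum(2 * power * (1 + np.cos(2 * np.pi * k * shift / n)))

for shift in (0, 3, 17, 250):
    print(shift, fluence_time_domain(shift), fluence_from_spectrum(shift))
```

Changing the random seed alters the pulse shape in time but not the two fluence values, since only the spectral power enters the frequency-domain sum.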
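We can manipulate (8.11) as follows:
$$\int I_{\text{tot}}(\omega,\tau)\, d\omega = 2 \int I(\omega)\, d\omega \left[ 1 + \frac{\int I(\omega) \cos(\omega\tau)\, d\omega}{\int I(\omega)\, d\omega} \right] \tag{8.12}$$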
4 M. Born and E. Wolf, Principles of Optics, 7th ed., p. 570 (Cambridge University Press, 1999).
8.2 Coherence Time and Fringe Visibility 205
This is the same as (8.8) since we can replace $\cos(\omega\tau)$ with $\mathrm{Re}\, e^{-i\omega\tau}$, and we can
apply Parseval's theorem (8.6) to the other integrals. Thus, the above arguments
lead to (8.9) and (8.10).
Example 8.1
Compute the output signal when a Gaussian pulse with spectrum (7.25) is sent
into a Michelson interferometer.
Formula (0.55) was used to complete the integration. According to (8.9), the signal
at the detector is then
$$\mathrm{Sig}(\tau) \propto 2\mathcal{E} \left[ 1 + \mathrm{Re}\, \gamma(\tau) \right] = 2\mathcal{E} \left[ 1 + e^{-\tau^2/4T^2} \cos(\omega_0\tau) \right]$$
Figure 8.3 shows this signal for a given T. As delay is added (or subtracted), the
output signal oscillates. Eventually enough delay is introduced such that the
very short pulses no longer interfere (arriving sequentially), and the output signal
becomes steady.
The coherence length is the distance that light travels in this time:
$$\ell_c \equiv c\,\tau_c \tag{8.14}$$
Example 8.2
Find the fringe visibility and the coherence time for the Gaussian pulse studied in
Example 8.1.
6 M. Born and E. Wolf, Principles of Optics, 7th ed., p. 570 (Cambridge University Press, 1999).
8.3 Temporal Coherence of Continuous Sources 207
This is shown as the dashed line in Fig. 8.4. As expected, the fringe visibility dies
off as delay gets farther from the origin (i.e. where the interferometer arms are
equidistant). From (8.13) the coherence time is
$$\tau_c = \int_{-\infty}^{\infty} |\gamma(\tau)|^2\, d\tau = \int_{-\infty}^{\infty} e^{-\tau^2/2T^2}\, d\tau = \sqrt{2\pi}\, T$$
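The coherence-time integral above is easy to reproduce numerically, and the coherence length then follows from (8.14). The sketch below assumes the Gaussian degree of coherence of Example 8.1; the 30 fs duration, the carrier frequency, and the function name are hypothetical values picked only for illustration.

```python
import numpy as np

SPEED_OF_LIGHT = 299_792_458.0   # m/s

def coherence_time(tau, gamma):
    """Integral of |gamma(tau)|^2 over delay, by summation on a uniform grid."""
    d_tau = tau[1] - tau[0]
    return np.sum(np.abs(gamma) ** 2) * d_tau

T = 30e-15                        # assumed pulse duration parameter (s)
w0 = 2.4e15                       # assumed carrier frequency (rad/s)
tau = np.linspace(-12 * T, 12 * T, 4801)
gamma = np.exp(-tau**2 / (4 * T**2)) * np.exp(-1j * w0 * tau)

tau_c = coherence_time(tau, gamma)
length_c = SPEED_OF_LIGHT * tau_c   # coherence length from (8.14)
print(tau_c, np.sqrt(2 * np.pi) * T, length_c)
```

The carrier phase drops out of $|\gamma(\tau)|^2$, so the result depends only on the envelope of the degree of coherence.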
The duration T must be large enough to average over any fluctuations that are
present in the light source.
For a continuous light source, the signal at the detector (8.9) becomes
$$\mathrm{Sig}(\tau) \propto 2 \langle I(t) \rangle_t \left[ 1 + \mathrm{Re}\, \gamma(\tau) \right] \qquad \text{(continuous source)} \tag{8.18}$$
Although technically the integrals used in (8.10) to compute $\gamma(\tau)$ also diverge in
the case of continuous light, the numerator and the denominator diverge in the
same way. Therefore, we may renormalize $I(\omega)$ in a similar fashion to deal with
this problem. The units in the numerator and denominator cancel so that $\gamma(\tau)$
always remains dimensionless. Once we have the degree of coherence function
$\gamma(\tau)$, we can calculate the coherence time and fringe visibility just as we did for
pulsed sources.
Extracting $I(\omega)$
The left-hand side is known since it is the measured data, and a computer can be
employed to take the Fourier transform of it. The first term on the right-hand side
is the Fourier transform of a constant:
$$F\{2\mathcal{E}\} = 2\mathcal{E}\, \frac{1}{\sqrt{2\pi}} \int e^{i\omega\tau}\, d\tau = 2\mathcal{E} \sqrt{2\pi}\, \delta(\omega) \tag{8.21}$$
Notice that (8.21) is zero everywhere except where $\omega = 0$, where a spike occurs.
This represents the DC component of $F\{\mathrm{Sig}(\tau)\}$.
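which we rearrange to
$$\sqrt{2\pi} \left[ \int I(\omega') \left( \frac{1}{2\pi} \int e^{i(\omega-\omega')\tau}\, d\tau \right) d\omega' + \int I(\omega') \left( \frac{1}{2\pi} \int e^{i(\omega'+\omega)\tau}\, d\tau \right) d\omega' \right]$$
From (0.52) we note that the terms in parentheses are delta functions, so we have
$$\sqrt{2\pi} \left[ \int I(\omega')\, \delta(\omega'-\omega)\, d\omega' + \int I(\omega')\, \delta(\omega'+\omega)\, d\omega' \right]$$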
7 J. Peatross and S. Bergeson, Fourier Spectroscopy of Ultrashort Laser Pulses, Am. J. Phys. 74,
842-845 (2006).
8 This is weird since normally we take Fourier transforms on fields rather than expressions
involving intensity!
8.5 Youngs Two-Slit Setup and Spatial Coherence 209
$$\frac{F\{\mathrm{Sig}(\tau)\}}{\sqrt{2\pi}} \propto 2\mathcal{E}\, \delta(\omega) + I(\omega) + I(-\omega) \tag{8.23}$$
The Fourier transform of the measured signal is seen to contain three terms, one
of which is the power spectrum $I(\omega)$ that we are after. Fortunately, when graphed
as a function of $\omega$ (shown in Fig. 8.5), the three pieces on the right-hand side
of (8.20) do not overlap. As a reminder, the measured signal as a function of $\tau$
looks something like that in Fig. 8.3. The oscillation frequency of the fringes
lies in the neighborhood of $\omega_0$. In summary, to obtain $I(\omega)$ using a Michelson
interferometer, 1) record $\mathrm{Sig}(\tau)$; 2) take its Fourier transform; and 3) extract the
curve at positive frequencies.
Figure 8.5 A graphical depiction of $F\{\mathrm{Sig}(\tau)\}/\sqrt{2\pi}$.
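The three-step recipe can be rehearsed on synthetic data. In the sketch below, the "recorded" signal is the Gaussian-pulse result of Example 8.1 with the fluence set to one; the sampling parameters and variable names are assumptions, and NumPy's FFT (which uses the opposite sign convention in the exponent, immaterial here because the signal is real and even in $\tau$) stands in for the Fourier transform.

```python
import numpy as np

# Assumed, illustrative parameters (arbitrary units).
T, w0 = 1.0, 20.0
n_samples, d_tau = 4096, 0.02

# Step 1: "record" Sig(tau) on a delay grid centered on tau = 0.
tau = (np.arange(n_samples) - n_samples // 2) * d_tau
signal = 2.0 * (1.0 + np.exp(-tau**2 / (4 * T**2)) * np.cos(w0 * tau))

# Step 2: Fourier transform. ifftshift moves tau = 0 to the first sample so
# the FFT approximates the continuous transform without a linear phase.
omega = 2 * np.pi * np.fft.fftfreq(n_samples, d=d_tau)
transform = np.abs(np.fft.fft(np.fft.ifftshift(signal))) * d_tau

# Step 3: keep the piece at positive frequencies, away from the DC spike.
positive = omega > w0 / 2
omega_pos = omega[positive]
recovered = transform[positive]
omega_peak = omega_pos[np.argmax(recovered)]

# Gaussian power spectrum shape expected for this synthetic signal.
expected = 2 * np.sqrt(np.pi) * T * np.exp(-T**2 * (omega_pos - w0) ** 2)
print(omega_peak, np.max(np.abs(recovered - expected)))
```

With measured data the delay range and step size set the frequency resolution and the highest recoverable frequency, so the scan must be fine enough to resolve the fringes near $\omega_0$.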
Figure 8.6 A point source produces coherent (locked phases) light. When this light
traverses two slits and arrives at a screen, it produces a fringe pattern.
screen. In close analogy with (8.1), the resulting intensity pattern on a far-away
screen is
$$I_{\text{tot}}(h) = 2 I_0 \left[ 1 + \cos(k d_2 - k d_1) \right] \cong 2 I_0 \left[ 1 + \cos(k h y / D) \right] \tag{8.24}$$
Notice the close similarity between this expression and the output from a Michelson
interferometer for a plane wave (8.1). We will consider $h$ (the separation of
the slits) to be the counterpart of $\tau$ (the delay introduced by moving a mirror in
the Michelson interferometer). To obtain the final expression in (8.24) we made
use of the following Taylor expansions:
$$d_1(y) = \sqrt{(y - h/2)^2 + D^2} = D \sqrt{1 + \frac{(y - h/2)^2}{D^2}} \cong D \left[ 1 + \frac{(y - h/2)^2}{2D^2} + \cdots \right] \tag{8.25}$$
and
$$d_2(y) = \sqrt{(y + h/2)^2 + D^2} = D \sqrt{1 + \frac{(y + h/2)^2}{D^2}} \cong D \left[ 1 + \frac{(y + h/2)^2}{2D^2} + \cdots \right] \tag{8.26}$$
Figure 8.7 Light from an extended source is only partially coherent. Fringes are still
possible, but they exhibit less contrast.
that its frequency is approximately $\omega$ with a phase that fluctuates randomly over
time intervals much longer than the period of oscillation $2\pi/\omega$.10
The light emerging from the $j$th point at $y'_j$ travels by means of two very
narrow slits to a point $y$ on a screen. Let $E_1(y'_j)$ and $E_2(y'_j)$ be the fields on the
screen at $y$, both originating from the point $y'_j$, but traveling respectively through
the two different slits. We assume that these fields have the same polarization,
and we will suppress the vectorial nature of the fields. For simplicity, we assume
the two fields have the same (real) amplitude at the screen $E_0(y'_j)$. Thus, we write
the two fields as
$$E_1(y'_j) = E_0(y'_j)\, e^{i \left\{ k \left[ r_1(y'_j) + d_1(y) \right] - \omega t + \phi(y'_j) \right\}} \tag{8.27}$$
and
$$E_2(y'_j) = E_0(y'_j)\, e^{i \left\{ k \left[ r_2(y'_j) + d_2(y) \right] - \omega t + \phi(y'_j) \right\}} \tag{8.28}$$
We have explicitly included an arbitrary phase $\phi(y'_j)$ assigned to each emission
point at the source.
We now set about finding the cumulative field at $y$ arising from the many
points indexed by the subscript $j$. The total field on the screen at point $y$ is
$$E_{\text{tot}}(h) = \sum_j \left[ E_1(y'_j) + E_2(y'_j) \right] \tag{8.29}$$
10 Random phase fluctuations necessarily imply some frequency bandwidth, however small.
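the dependence on the slit separation $h$. The intensity associated with (8.29) is
$$\begin{aligned}
I_{\text{tot}}(h) &= \frac{\varepsilon_0 c}{2} \left| E_{\text{tot}}(h) \right|^2 \\
&= \frac{\varepsilon_0 c}{2} \left[ \sum_j \left( E_1(y'_j) + E_2(y'_j) \right) \right] \left[ \sum_m \left( E_1(y'_m) + E_2(y'_m) \right) \right]^* \\
&= \frac{\varepsilon_0 c}{2} \sum_{j,m} \left[ E_1(y'_j) E_1^*(y'_m) + E_2(y'_j) E_2^*(y'_m) + E_1(y'_j) E_2^*(y'_m) + E_2(y'_j) E_1^*(y'_m) \right] \\
&= \frac{\varepsilon_0 c}{2} \sum_{j,m} E_0(y'_j) E_0(y'_m) \Big\{ e^{ik \left[ r_1(y'_j) - r_1(y'_m) \right]} + e^{ik \left[ r_2(y'_j) - r_2(y'_m) \right]} \\
&\qquad\qquad + 2\, \mathrm{Re} \left[ e^{ik \left[ r_1(y'_j) - r_2(y'_m) \right]} e^{ik \left[ d_1(y) - d_2(y) \right]} \right] \Big\}\, e^{i \left[ \phi(y'_j) - \phi(y'_m) \right]}
\end{aligned} \tag{8.30}$$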
At this juncture we make a critical assumption: the phase of the emission
$\phi(y'_j)$ varies in time independently at every point on the source. This is sometimes
called the stochastic assumption, and it is appropriate for the emission from
thermal sources such as starlight (filtered to a narrow frequency range), a glowing
filament, or spontaneous emission from an excited gas or plasma. However, it is
not appropriate for coherent sources like lasers (more on that in appendix 8.B).
A wonderful simplification happens to (8.30) when the phase difference
$\phi(y'_j) - \phi(y'_m)$ varies randomly. If $j \neq m$, then $\exp\{i(\phi(y'_j) - \phi(y'_m))\}$ time-averages
to zero. On the other hand, if $j = m$, then the factor reduces to $e^0 = 1$. Formally,
this is written
$$\left\langle e^{i \left[ \phi(y'_j) - \phi(y'_m) \right]} \right\rangle_t = \delta_{j,m} \equiv \begin{cases} 1 & \text{if } j = m, \\ 0 & \text{if } j \neq m. \end{cases} \qquad \text{(random phase assumption)} \tag{8.31}$$
where $\delta_{j,m}$ is known as the Kronecker delta function. The time-averaged intensity
under the stochastic assumption (8.31) then reduces to
$$\langle I_{\text{tot}}(h) \rangle_t = \sum_j I(y'_j) + \sum_j I(y'_j) + 2\, \mathrm{Re} \left\{ \sum_j I(y'_j)\, e^{ik \left[ r_1(y'_j) - r_2(y'_j) \right]} e^{ik \left[ d_1(y) - d_2(y) \right]} \right\} \tag{8.32}$$
We may use (8.25) to simplify $d_1(y) - d_2(y) \cong -hy/D$. Very similarly, we may also
write $r_1(y'_j) - r_2(y'_j) \cong -h y'_j / R$. The only thing left to do is to put (8.32) into a slightly
more familiar form:
$$\langle I_{\text{tot}}(h) \rangle_t = 2 \sum_j I(y'_j) \left[ 1 + \mathrm{Re}\, \gamma(h) \right] \qquad \text{(random phase assumption)} \tag{8.33}$$
We have introduced
$$\gamma(h) \equiv e^{-i \frac{khy}{D}}\, \frac{\sum_j I(y'_j)\, e^{-i \frac{kh y'_j}{R}}}{\sum_j I(y'_j)} \tag{8.34}$$

Thomas Young (1773-1829, English) was born in Milverton, Somerset, and
was the oldest of ten children. By age fourteen, he had become proficient at a
dozen different languages. As a young adult, he studied medicine and then
went to Göttingen, Germany where he earned a doctoral degree in physics. In
1801, he was appointed professor of natural philosophy at the Royal Institute,
but he also maintained an active medical practice on the side. He contributed
to a wide variety of fields and helped to decipher ancient Egyptian hieroglyphs,
including the Rosetta Stone. He published descriptions of the heart and arteries
as well as how the eye accommodates to see at different depths and how the eye
perceives color. In engineering fields, Young is well known for his analysis of
stresses and strains in elastic media. Young's double-slit experiment gave
convincing evidence of the wave nature of light, overturning Newton's corpuscular
theory. Regarding this, Thomas Young traded ideas with Augustin Fresnel through
correspondence. (Wikipedia)
8.A Spatial Coherence for a Continuous Spatial Distribution 213
which is known as the degree of coherence. It controls the fringe pattern seen at
the screen.
We can generalize (8.33) so that it applies to the case of a continuous distribution
of light as opposed to a collection of discrete point sources. In Appendix 8.A
we show how summations in (8.33) and (8.34) become integrals over the source
intensity distribution, and we write
$$\langle I_{\text{tot}}(h) \rangle_t = 2 \int I(y')\, dy' \left[ 1 + \mathrm{Re}\, \gamma(h) \right] \tag{8.35}$$
where
$$\gamma(h) \equiv e^{-i \frac{khy}{D}}\, \frac{\int I(y')\, e^{-i \frac{kh y'}{R}}\, dy'}{\int I(y')\, dy'} \tag{8.36}$$
Here $I(y')$ has units of intensity per length of source.
The factor $\exp(-ikhy/D)$ defines the locations of the periodic fringes on the screen. The rest of (8.36) controls the (more interesting) depth of the fringes as the slit separation $h$ is varied. When the slit separation $h$ increases, the amplitude of $\gamma(h)$ tends to diminish until the intensity at the screen becomes uniform. When the two slits have very small separation (such that $e^{-ikhy'/R} \cong 1$) then we have $|\gamma(h)| = 1$ and very good fringe visibility results. The function $\gamma(h)$ dictates the degree of spatial coherence in much the same way that $\gamma(\tau)$ dictates the degree of temporal coherence. Notice the close similarity between (8.36) and (8.10).
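As a rough numerical illustration of how (8.36) behaves, the sketch below evaluates the degree of coherence for a uniformly bright source of width a by a simple midpoint sum and compares its magnitude with sinc(kha/2R). The wavelength, distances, source width, and all function names are hypothetical choices made only for this example, and the sign of the exponents is an assumed convention (the magnitude of the result does not depend on it).

```python
import cmath
import math


def degree_of_coherence(h, y, k, R, D, intensity, y_min, y_max, n=4000):
    """Midpoint-rule estimate of gamma(h) in (8.36) for a source profile I(y')."""
    dy = (y_max - y_min) / n
    numerator = 0j
    denominator = 0.0
    for j in range(n):
        y_src = y_min + (j + 0.5) * dy            # source coordinate y'
        weight = intensity(y_src) * dy
        numerator += weight * cmath.exp(-1j * k * h * y_src / R)
        denominator += weight
    return cmath.exp(-1j * k * h * y / D) * numerator / denominator


def sinc(u):
    """sinc(u) = sin(u)/u with the limiting value sinc(0) = 1."""
    return 1.0 if u == 0 else math.sin(u) / u


# Hypothetical numbers, chosen only for illustration.
WAVELENGTH = 633e-9                 # metres
K = 2 * math.pi / WAVELENGTH        # wave number
R_DIST = 1.0                        # source-to-slits distance R (metres)
D_DIST = 2.0                        # slits-to-screen distance D (metres)
SOURCE_WIDTH = 1.0e-3               # source width a (metres)


def uniform(y_src):
    """Uniform emission profile across the source."""
    return 1.0


def visibility(h):
    """Fringe visibility |gamma(h)| for the uniform source, evaluated at y = 0."""
    gamma = degree_of_coherence(h, 0.0, K, R_DIST, D_DIST, uniform,
                                -SOURCE_WIDTH / 2, SOURCE_WIDTH / 2)
    return abs(gamma)


if __name__ == "__main__":
    first_zero = WAVELENGTH * R_DIST / SOURCE_WIDTH
    for h in (0.0, 0.2e-3, 0.4e-3, first_zero):
        expected = abs(sinc(K * h * SOURCE_WIDTH / (2 * R_DIST)))
        print(f"h = {h:.3e} m  |gamma| = {visibility(h):.6f}  |sinc| = {expected:.6f}")
```

With these numbers the fringes wash out completely when the slit separation reaches h = λR/a, which is the first zero of the sinc function.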
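As the slit separation $h$ increases, the fringe visibility

\[ V(h) = |\gamma(h)| \tag{8.37} \]

diminishes. A characteristic slit separation $h_c$ can be defined by

\[ h_c \equiv 2\int_0^\infty |\gamma(h)|^2\,dh \tag{8.38} \]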
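\[ \begin{aligned} \sum_j E_1(y'_j) &\rightarrow \int E_1(y')\,dy' &\quad\text{and}\quad \sum_m E_1^*(y'_m) &\rightarrow \int E_1^*(y'')\,dy'' \\ \sum_j E_2(y'_j) &\rightarrow \int E_2(y')\,dy' &\quad\text{and}\quad \sum_m E_2^*(y'_m) &\rightarrow \int E_2^*(y'')\,dy'' \end{aligned} \tag{8.39} \]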
Rather than deal with a time average of randomly varying phases, we will instead work with a linear superposition of all conceivable phase factors. That is, we will write the phase $\phi(y')$ as $Ky'$, where $K$ is a parameter with units of inverse length, which we allow to take on all possible real values with uniform likelihood. The way we modify (8.31) for the continuous case is then

\[ \left\langle e^{i\left[\phi(y'_j)-\phi(y'_m)\right]} \right\rangle_t = \delta_{j,m} \;\rightarrow\; \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{iK(y'-y'')}\,dK = \delta(y''-y') \tag{8.40} \]
\[ I_{tot}(h) = \frac{\epsilon_0 c}{2}\int dy'\,E(y')\int dy''\,E^*(y'')\Big[ e^{ik\left(r_1(y')-r_1(y'')\right)} + e^{ik\left(r_2(y')-r_2(y'')\right)} + 2\,\mathrm{Re}\left\{ e^{ik\left(d_1(y)-d_2(y)\right)}\, e^{ik\left(r_1(y')-r_2(y'')\right)} \right\} \Big]\,\delta(y''-y') \tag{8.41} \]

Again, consistent with (8.25), we may write $d_1(y) - d_2(y) \cong -hy/D$ and $r_1(y') - r_2(y') \cong -hy'/R$, and (8.41) reduces to

\[ I_{tot}(h) = 2\int I(y')\,dy' + 2\,\mathrm{Re}\left\{ e^{-i\frac{khy}{D}} \int I(y')\, e^{-i\frac{khy'}{R}}\,dy' \right\} \tag{8.42} \]

where

\[ I(y') \equiv \frac{1}{2}\epsilon_0 c\,|E(y')|^2 \tag{8.43} \]

For $I_{tot}$ to have normal units of intensity, $I(y')$ must have units of intensity per length of source, implying that $E(y')$ has units of field per square root of length. Hence, $\int I(y')\,dy'$ is the intensity at the screen caused by the entire extended source when only one slit is open. We see that (8.42) is equivalent to (8.35) and (8.36).
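\[ \begin{aligned} I_{tot}(h) \propto{} & \left| \int E(y')\, e^{i\phi(y') + i\frac{ky'^2}{2R}}\, e^{-i\frac{khy'}{2R}}\,dy' \right|^2 + \left| \int E(y')\, e^{i\phi(y') + i\frac{ky'^2}{2R}}\, e^{i\frac{khy'}{2R}}\,dy' \right|^2 \\ & + 2\,\mathrm{Re}\; e^{-i\frac{khy}{D}} \left[ \int E(y')\, e^{i\phi(y') + i\frac{ky'^2}{2R}}\, e^{-i\frac{khy'}{2R}}\,dy' \right] \left[ \int E(y')\, e^{i\phi(y') + i\frac{ky'^2}{2R}}\, e^{i\frac{khy'}{2R}}\,dy' \right]^* \end{aligned} \tag{8.44} \]

where we have employed (8.25) and (8.26) and similar expressions involving $R$ and $y'$.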
The first term on the right-hand side of (8.44) is the intensity on the screen
when the lower slit is covered. The second term is the intensity on the screen
when the upper slit is covered. The last term is the interference term, which
modifies the sum of the individual intensities when light goes through both slits.
Notice the occurrence of Fourier transforms (over position) on the quantities
inside of the square brackets. Later, when we study diffraction theory, we will
recognize these transforms as determining the strength of fields impinging on
the individual slits. This corresponds to a major difference between a spatially
coherent source and a random-phase source. With the random-phase source, the
slits are always illuminated with the same strength regardless of the separation.
However, with a coherent source, beaming can occur such that the strength as
well as phase of the field at each slit depends on the slit separation.
A beautiful simplification occurs when the phase of the emitted light has the following distribution:

\[ \phi(y') = -\frac{ky'^2}{2R} \qquad \text{(converging spherical wave)} \tag{8.45} \]
Equation (8.45) is not as arbitrary as it may first appear. This particular phase
is an approximation to a concave spherical wave front converging to the center
between the two slits. This type of wave front is created when a plane wave passes
through a lens. With the special phase (8.45), the intensity (8.44) reduces to
\[ I_{tot}(h) \propto 2\left| \int E(y')\, e^{-i\frac{khy'}{2R}}\,dy' \right|^2 \left[ 1 + \mathrm{Re}\left\{ e^{-i\frac{khy}{D}} \right\} \right] \qquad \text{(converging spherical wave)} \tag{8.46} \]

The factor

\[ \int E(y')\, e^{-i\frac{khy'}{2R}}\,dy' \]

corresponds to the field impinging on the screen, which arises from either slit when positioned at $h/2$. Let this field be denoted by $|E_1(h/2)|$. The field strength when the single slit is positioned at $h$ compared to that when it is positioned at zero is

\[ \left| \frac{E_1(h)}{E_1(0)} \right| = \frac{\left| \int E(y')\, e^{-i\frac{khy'}{R}}\,dy' \right|}{\left| \int E(y')\,dy' \right|} \qquad \text{(converging spherical wave assumption)} \tag{8.47} \]
This looks very much like the fringe visibility $V(h)$ given by (8.37) and (8.36), except that the magnitude of the field appears in (8.47), whereas the intensity appears in (8.36).
This may seem rather contrived, but at least it is cute, and it is known as the
van Cittert-Zernike theorem.11 It says that the spatial coherence of an extended
source with randomly varying phase drops off with lateral slit separation in the
same way that the field pattern at the focus of a converging spherical wave would
drop off, whose field amplitude distribution is the same as the original intensity
distribution.
11 M. Born and E. Wolf, Principles of Optics, 7th ed., p. 574 (Cambridge University Press, 1999).
Exercises
P8.2 (a) Show that the fringe visibility of a Gaussian spectral distribution (see Example 8.2) goes from 1 to $e^{-\pi/2} = 0.21$ as the round-trip delay increases from zero to the coherence length.
(b) Derive an expression for the FWHM wavelength bandwidth $\Delta\lambda_{FWHM}$ in terms of the coherence length $\ell_c$ and the center wavelength $\lambda_0$.
HINT: First determine $\Delta\omega_{FWHM}$, defined to be the width of $I(\omega)$ at half of its peak (see P7.6). To convert to a wavelength difference, use $\omega = \frac{2\pi c}{\lambda} \Rightarrow |\Delta\omega_{FWHM}| \cong \frac{2\pi c}{\lambda_0^2}\,\Delta\lambda_{FWHM}$.
P8.3 Show that $\mathrm{Re}\{\gamma(\tau)\}$ defined in (8.10) reduces to $\cos(\omega_0\tau)$ in the case of a plane wave $E(t) = E_0 e^{i(k_0 z - \omega_0 t)}$ being sent through a Michelson interferometer. In other words, the output intensity from the interferometer reduces to
\[ I = 2I_0\left[1 + \cos(\omega_0\tau)\right] \]
as you already expect.
HINT: Don't be afraid of delta functions. After integration, the left-over delta functions cancel.
P8.4 Light emerging from a dense hot gas has a collisionally broadened power spectrum described by the Lorentzian function
\[ I(\omega) = \frac{I(\omega_0)}{1 + \left( \frac{\omega-\omega_0}{\Delta\omega_{FWHM}/2} \right)^2} \]
P8.5 (a) The spectral phase of the light in P8.4 is randomly organized. Describe qualitatively how the light behaves as a function of time.
(b) Now suppose that the phase of the light is somehow neatly organized such that
\[ E(\omega) = \frac{i\,E(\omega_0)\, e^{i\frac{\omega}{c}z}}{i + \frac{\omega-\omega_0}{\Delta\omega_{FWHM}/2}} \]
Perform the inverse Fourier transform on the field and determine the light intensity as a function of time. Make a sketch.
HINT:
\[ \int_{-\infty}^{\infty} \frac{e^{-iax}}{x+\beta}\,dx = \begin{cases} -2\pi i\, e^{ia\beta} & \text{if } a > 0 \\ 0 & \text{if } a < 0 \end{cases} \qquad \left(\mathrm{Im}\,\beta > 0\right) \]
Figure 8.8
Exercises for 8.5 Young's Two-Slit Setup and Spatial Coherence
P8.7 (a) A point source with wavelength λ = 500 nm illuminates two parallel slits separated by h = 1.0 mm. If the screen is D = 2 m away, what is the separation between the diffraction peaks on the screen? Make a sketch.
12 J. Peatross and S. Bergeson, Fourier Spectroscopy of Ultrashort Laser Pulses, Am. J. Phys. 74,
842-845 (2006).
(b) A thin piece of glass with thickness d = 0.01 mm and index n = 1.5 is
placed in front of one of the slits. By how many fringes does the pattern
at the screen move?
HINT: Add to $k(d_2 - d_1)$ in (8.24) the phase difference between traversing the glass and traversing an empty region of the same thickness.
L8.8 (a) Carefully measure the separation of a double slit in the lab (h ≈ 0.1 mm separation) by shining a HeNe laser (λ = 633 nm) through it and measuring the diffraction peak separations on a distant wall (say, 2 m from the slits).
HINT: For better accuracy, measure across several fringes and divide.
Figure 8.9 (labels: laser; rotating diffuser to create phase variation; single slit, width a; double slit, separation h; filter; CCD camera)
(b) Create an extended light source with a HeNe laser using a time-
varying diffuser followed by an adjustable single slit. (The diffuser
must rotate rapidly to create random time variation of the phase at
each point as would occur automatically for a natural source such
as a star.) Place the double slit at a distance of R ≈ 100 cm after the
first slit. (Take note of the exact value of R, as you will need it for the
next problem.) Use a lens to image the diffraction pattern that would
have appeared on a far-away screen into a video camera. Observe
the visibility of the fringes. Adjust the width of the source with the
single slit until the visibility of the fringes disappears. After making the
source wide enough to cause the fringe pattern to degrade, measure
the single slit width a by shining a HeNe laser through it and observing
the diffraction pattern on the distant wall. (video)
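HINT: As we will study later, a single slit of width a produces an intensity pattern on a screen a distance L away described by
\[ I(x) = I_{peak}\,\mathrm{sinc}^2\left( \frac{\pi a}{\lambda L}x \right) \]
where $\mathrm{sinc}\,\alpha \equiv \frac{\sin\alpha}{\alpha}$ and $\lim_{\alpha\to 0}\frac{\sin\alpha}{\alpha} = 1$.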
NOTE: It would have been nicer to vary the separation of the two slits
to determine the width of a fixed source. However, because it is hard to
make an adjustable double slit, we vary the size of the source until the
spatial coherence of the light matches the slit separation.
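\[ \gamma(h) = \frac{\int_{-a/2}^{a/2} I_0 \exp\left[ -ikh\left( \frac{y'}{R} + \frac{y}{D} \right) \right] dy'}{\int_{-a/2}^{a/2} I_0\,dy'} = \frac{e^{-ikh\frac{y}{D}}}{a} \int_{-a/2}^{a/2} e^{-ikh\frac{y'}{R}}\,dy' = \frac{e^{-ikh\frac{y}{D}}}{a} \left[ \frac{e^{-ikh\frac{y'}{R}}}{-i\frac{kh}{R}} \right]_{-a/2}^{a/2} \]
\[ = e^{-ikh\frac{y}{D}}\, \frac{e^{i\frac{kha}{2R}} - e^{-i\frac{kha}{2R}}}{2i\,\frac{kha}{2R}} = e^{-ikh\frac{y}{D}}\, \mathrm{sinc}\left( \frac{kha}{2R} \right) \]
Note that
\[ \int_0^\infty \frac{\sin^2 \alpha x}{(\alpha x)^2}\,dx = \frac{\pi}{2\alpha} \]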
Review, Chapters 5–8
R26 T or F: As light enters a crystal, the Poynting vector always obeys Snell's law.
R27 T or F: As light enters a crystal, the k-vector obeys Snell's law for the extraordinary wave.
R29 T or F: The integral of I(t) over all t equals the integral of I(ω) over all ω.
R37 T or F: The Fourier transform (or inverse Fourier transform if you prefer) of I(ω) is proportional to the degree of temporal coherence.
R38 T or F: Young's two-slit setup is ideal for measuring the temporal coherence of light.
Problems
where θ is the angle made with the optic axis. At the frequency of a ruby laser, KDP has indices $n_o(\omega) = 1.505$ and $n_e(\omega) = 1.465$. At the frequency of the second harmonic, the indices are $n_o(2\omega) = 1.534$ and $n_e(2\omega) = 1.487$.
In order to make the indices at the two frequencies the same, decide which frequency should propagate as an ordinary wave and which should propagate as an extraordinary one. What angle θ will make the indices the same?
R42 (a) Derive the Jones matrix for a half wave plate with its fast axis making an arbitrary angle θ with the x-axis.
HINT: Project an arbitrary polarization with $E_x$ and $E_y$ onto the fast and slow axes of the wave plate. Shift the slow axis phase by π, and then project the field components back onto the horizontal and vertical axes.
The answer is
exiting the polarizer to the incoming intensity as a function of θ?
R43 (a) What is the spectral content (i.e. I(ω)) of a square laser pulse
\[ E(t) = \begin{cases} E_0 e^{-i\omega_0 t}, & |t| \le \tau/2 \\ 0, & |t| > \tau/2 \end{cases} \]
Make a sketch of I(ω), indicating the location of the first zeros.
Figure 8.11 Polarizing Elements
(b) What is the temporal shape (i.e. I(t)) of a light pulse with frequency content
\[ E(\omega) = \begin{cases} E_0, & |\omega - \omega_0| \le \Delta\omega/2 \\ 0, & |\omega - \omega_0| > \Delta\omega/2 \end{cases} \]
where in this case $E_0$ has units of E-field per frequency. Make a sketch of I(t), indicating the location of the first zeros.
(c) If E(ω) is given (not necessarily the same function as above), and the light passes through a material with index n(ω) and thickness ℓ, how would you find E(t) after exiting the material? Please set up the integral without performing it.
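50 nm, centered at $\lambda_0 = 800$ nm. Assume that the light has a Gaussian frequency profile
\[ I(\omega) = I(\omega_0)\, e^{-\left( \frac{\omega-\omega_0}{\Delta\omega} \right)^2} \]
\[ \int_{-\infty}^{\infty} e^{-Ax^2 + Bx + C}\,dx = \sqrt{\frac{\pi}{A}}\; e^{B^2/4A + C} \qquad \mathrm{Re}\{A\} > 0 \]
Find the fringe visibility $V \equiv (I_{max} - I_{min})/(I_{max} + I_{min})$ as a function of τ (i.e. the round-trip delay due to moving one of the mirrors).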
R47 A diffuse source of light impinges on a Young's double slit (with slit separation h) positioned a distance R from the source. A screen is placed a distance D following the slit. The degree of coherence is given by
\[ \gamma(h) \equiv e^{-i\frac{khy}{D}}\, \frac{\int I(y')\, e^{-i\frac{khy'}{R}}\,dy'}{\int I(y')\,dy'} \]
emission distribution with the form $I(y') = \left(I_0/\Delta y'\right) e^{-y'^2/\Delta y'^2}$.
(a) Compute the function γ(h). HINT: See integral provided in R44(b).
(b) The intensity on the screen oscillates as a function of y. As h grows wider, the amplitude of oscillations decreases. How wide must the slit separation h become (in terms of R, k, and Δy') to reduce the visibility to
\[ V \equiv \frac{I_{max} - I_{min}}{I_{max} + I_{min}} = \frac{1}{3} \]
(c) Sketch the intensity at the screen $I \propto 1 + \mathrm{Re}\,\gamma(h)$ when the visibility is 1/3.
Selected Answers
R40: 51.12°.
R41: (b) 1/4, (c) 1/2.
2
2 (0 )
(1i )
R46: (b) partial: E () = T E 0 e T 2 .
Chapter 9
Light as Rays
So far in our study of optics, we have described light in terms of waves, which satisfy Maxwell's equations. However, as you are probably aware, in many situations
light can be thought of as rays pointing along the direction of wave propagation. A
ray picture is useful when one is interested in the macroscopic flow of light energy,
but rays fail to reveal fine details, in particular wave and diffraction phenomena.
For example, simple ray theory suggests that a lens can focus light down to a point.
However, if a beam of light were concentrated onto a true point, the intensity
would be infinite! Nevertheless, ray theory is useful for predicting where a focus
occurs. It is also useful for describing imaging properties of optical systems (e.g.
lenses and mirrors).
Beginning in section 9.3 we study the details of ray theory and the imaging properties of optical systems. First, however, we examine the justification for ray theory starting from Maxwell's equations. In the short-wavelength limit, Maxwell's equations give rise to the eikonal equation, which governs the direction of rays in a medium with an index of refraction that varies with position. The German word "eikonal" comes from the Greek "εἰκών", from which the modern word "icon" derives. The eikonal equation therefore has a descriptive title since it controls the formation of images. The eikonal equation provides an adequate description as long as the features of interest are large compared to a wavelength.
The eikonal equation describes the direction of ray propagation, even in com-
plicated situations such as desert mirages where air is heated near the ground and
has a different index than the air farther from the ground. Rays of light from the
sky that initially are directed toward the ground can be bent such that they travel
parallel to or even up from the ground, owing to the inhomogeneous refractive
index. The eikonal equation can also be used to deduce Fermat's principle, which in short says that light travels from point A to point B following a path that takes the minimum time. This principle can be used, for example, to derive Snell's law. Fermat asserted this principle more than a century before Maxwell's equations were known, but it is nice to give justification retroactively using the modern perspective.
Much of this chapter is devoted to the propagation of rays through optical

\[ \frac{1}{f} = \frac{1}{d_o} + \frac{1}{d_i} \tag{9.1} \]

Even a complicated multi-element optical system obeys (9.1) if $d_o$ and $d_i$ are measured from principal planes rather than the single plane of, for example, a thin lens.

Paraxial ray theory can also be used to study the stability of laser cavities. The formalism predicts whether a ray, after many round trips in the cavity, remains near the optical axis (trapped and therefore stable) or if it drifts endlessly away from the axis of the cavity on successive round trips.

In appendix 9.A we address deviations from the paraxial ray theory known as aberrations. We also comment on ray-tracing techniques, used for designing optical systems that minimize such aberrations.

Sir Isaac Newton (1643–1727, English) was born in Lincolnshire, England three months after the death of his father who was a farmer. Newton spent much of his childhood with his maternal grandmother, after his mother remarried. (Newton did not like his stepfather.) In his teenage years, Newton's mother tried to persuade him to take up farming, but his love for education won out. He became the top-ranked student at his school and was admitted into Trinity College, Cambridge at age 18. Newton was influenced by the works of Descartes, Copernicus, Galileo, and Kepler. Upon graduation four years later, the university closed for two years because of a plague. Newton's return to farm life coincided with a remarkable period when he first developed ideas on calculus, gravitation, and optics. Newton later returned to Cambridge where he spent his extraordinarily prolific career and became the first scientist to be knighted. In optics, Newton advanced the ray theory of light and image formation. He wrote a landmark textbook on the subject. He also showed that white light is comprised of many colors and that the amount of refraction depends on color. He built the first reflecting telescope, which avoids chromatic aberration. Newton advocated against the wave theory of light in favor of his corpuscular theory. (Imagining that Newton foresaw the quantized nature of light energy gives too much credit!) (Wikipedia)

9.1 The Eikonal Equation

For simplicity, consider light consisting of only a single frequency ω. The wave equation (2.13) for an isotropic medium with a real refractive index may be written as

\[ \nabla^2 \mathbf{E}(\mathbf{r},t) + \frac{[n(\mathbf{r})]^2\,\omega^2}{c^2}\,\mathbf{E}(\mathbf{r},t) = 0 \tag{9.2} \]

where we have already performed the time differentiation on the assumed single-frequency time dependence $e^{-i\omega t}$. Although in chapter 2 we considered solutions to the wave equation in a homogeneous material, the wave equation remains valid when the index of refraction varies throughout space (i.e. if $n(\mathbf{r})$ is an arbitrary function of $\mathbf{r}$). In this case, the usual plane-wave solutions no longer satisfy the wave equation.

As a trial solution for (9.2), we take

\[ \mathbf{E}(\mathbf{r},t) = \mathbf{E}_0(\mathbf{r})\, e^{i\left[k_{vac}R(\mathbf{r}) - \omega t\right]} \tag{9.3} \]

where

\[ k_{vac} = \frac{\omega}{c} = \frac{2\pi}{\lambda_{vac}} \tag{9.4} \]
Here R (r) is a real scalar function (which depends on position) having the dimen-
sion of length. By taking R (r) to be real, there is no absorption or amplification.
Even though the trial solution (9.3) looks somewhat like a plane wave,1 the func-
tion R (r) accommodates wave fronts that can be curved or distorted as depicted
in Fig. 9.1. At any given instant t , the phase of the curved surfaces described by
$R(\mathbf{r}) = \text{constant}$ can be interpreted as wave fronts. The wave fronts travel in the direction for which $R(\mathbf{r})$ varies the fastest. This direction is aligned with $\nabla R(\mathbf{r})$, which lies in the direction perpendicular to surfaces of constant phase.
The substitution of the trial solution (9.3) into the wave equation (9.2) gives

\[ \frac{1}{k_{vac}^2}\,\nabla^2\left[ \mathbf{E}_0(\mathbf{r})\, e^{ik_{vac}R(\mathbf{r})} \right] + [n(\mathbf{r})]^2\, \mathbf{E}_0(\mathbf{r})\, e^{ik_{vac}R(\mathbf{r})} = 0 \tag{9.5} \]

where we have divided each term by $e^{-i\omega t}$.

Figure 9.1 Wave fronts (i.e. surfaces of constant phase given by R(r)) distributed throughout space in the presence of a spatially inhomogeneous refractive index. The gradient of R gives the direction of travel for a wavefront.

Computing the Laplacian in (9.5)

The gradient of the x component of the field is
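\[ \nabla\left[ E_{0x}(\mathbf{r})\, e^{ik_{vac}R(\mathbf{r})} \right] = \left[\nabla E_{0x}(\mathbf{r})\right] e^{ik_{vac}R(\mathbf{r})} + ik_{vac}\,E_{0x}(\mathbf{r})\left[\nabla R(\mathbf{r})\right] e^{ik_{vac}R(\mathbf{r})} \]

Upon combining the result for each vector component of $\mathbf{E}_0(\mathbf{r})$, the required spatial derivative can be written as

\[ \begin{aligned} \nabla^2\left[ \mathbf{E}_0(\mathbf{r})\, e^{ik_{vac}R(\mathbf{r})} \right] = \Big( & \nabla^2\mathbf{E}_0(\mathbf{r}) - k_{vac}^2\,\mathbf{E}_0(\mathbf{r})\left[\nabla R(\mathbf{r})\right]\cdot\left[\nabla R(\mathbf{r})\right] + ik_{vac}\,\mathbf{E}_0(\mathbf{r})\,\nabla^2 R(\mathbf{r}) \\ & + 2ik_{vac}\Big\{ \hat{\mathbf{x}}\left[\nabla E_{0x}(\mathbf{r})\right]\cdot\left[\nabla R(\mathbf{r})\right] + \hat{\mathbf{y}}\left[\nabla E_{0y}(\mathbf{r})\right]\cdot\left[\nabla R(\mathbf{r})\right] \\ & + \hat{\mathbf{z}}\left[\nabla E_{0z}(\mathbf{r})\right]\cdot\left[\nabla R(\mathbf{r})\right] \Big\} \Big)\, e^{ik_{vac}R(\mathbf{r})} \end{aligned} \]

After performing the Laplacian and after some rearranging, (9.5) becomes

\[ \begin{aligned} \left( \nabla R(\mathbf{r})\cdot\nabla R(\mathbf{r}) - [n(\mathbf{r})]^2 \right)\mathbf{E}_0(\mathbf{r}) = {} & \frac{\nabla^2\mathbf{E}_0(\mathbf{r})}{k_{vac}^2} + \frac{i}{k_{vac}}\,\mathbf{E}_0(\mathbf{r})\,\nabla^2 R(\mathbf{r}) + \frac{2i}{k_{vac}}\,\hat{\mathbf{x}}\,\nabla E_{0x}(\mathbf{r})\cdot\nabla R(\mathbf{r}) \\ & + \frac{2i}{k_{vac}}\,\hat{\mathbf{y}}\,\nabla E_{0y}(\mathbf{r})\cdot\nabla R(\mathbf{r}) + \frac{2i}{k_{vac}}\,\hat{\mathbf{z}}\,\nabla E_{0z}(\mathbf{r})\cdot\nabla R(\mathbf{r}) \end{aligned} \tag{9.6} \]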
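1 If the index is spatially independent (i.e. $n(\mathbf{r}) \rightarrow n$), then (9.3) reduces to the usual plane-wave solution of the wave equation. In this case, we have $R(\mathbf{r}) = \mathbf{k}\cdot\mathbf{r}/k_{vac}$ and the field amplitude becomes constant (i.e. $\mathbf{E}_0(\mathbf{r}) \rightarrow \mathbf{E}_0$).

In the short-wavelength limit ($k_{vac} \rightarrow \infty$), the right-hand side of (9.6) becomes negligible, which leaves $\nabla R(\mathbf{r})\cdot\nabla R(\mathbf{r}) = [n(\mathbf{r})]^2$, or equivalently

\[ \nabla R(\mathbf{r}) = n(\mathbf{r})\,\hat{\mathbf{s}}(\mathbf{r}) \tag{9.8} \]

where $\hat{\mathbf{s}}$ is a unit vector pointing in the direction $\nabla R(\mathbf{r})$, the direction normal to wave front surfaces. Equation (9.8) is called the eikonal equation.2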
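Example 9.1
Suppose that a region of air above the desert on a hot day has an index of refraction that varies with height $y$ according to $n(y) = n_0\sqrt{1 + y^2/h^2}$. Verify that $R(x,y) = n_0\left(x \pm y^2/2h\right)$ is a solution to the eikonal equation. (See problem P9.1 for a more

\[ \hat{\mathbf{s}}(h) = \frac{\hat{\mathbf{x}} \pm \hat{\mathbf{y}}}{\sqrt{2}} \qquad \hat{\mathbf{s}}(h/2) = \frac{\hat{\mathbf{x}} \pm \hat{\mathbf{y}}/2}{\sqrt{5/4}} \qquad \hat{\mathbf{s}}(h/4) = \frac{\hat{\mathbf{x}} \pm \hat{\mathbf{y}}/4}{\sqrt{17/16}} \]

These are represented in Fig. 9.2. In a desert mirage, light from the sky can appear to come from a lower position. We can determine a path for the rays by setting $dy/dx$ equal to the slope of $\hat{\mathbf{s}}$:

\[ \frac{dy}{dx} = \pm\frac{y}{h} \quad\Rightarrow\quad y = y_0\, e^{\pm(x-x_0)/h} \]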
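The claim in Example 9.1 can be spot-checked numerically. The following sketch, in which the values of n_0 and h and every function name are hypothetical, differentiates R(x, y) = n_0(x ± y²/2h) by central differences, compares the magnitude of the gradient with n(y), and then steps a ray along the gradient direction to compare with the exponential path. It is only an illustration under these assumptions, not part of the original solution.

```python
import math

N0 = 1.0003      # hypothetical index at ground level
H = 10.0         # hypothetical length scale h (metres)


def index(y):
    """Index profile n(y) = n0 * sqrt(1 + y^2/h^2) from Example 9.1."""
    return N0 * math.sqrt(1.0 + (y / H) ** 2)


def eikonal_R(x, y, sign):
    """Candidate solution R(x, y) = n0 * (x + sign * y^2 / (2h)), sign = +1 or -1."""
    return N0 * (x + sign * y * y / (2.0 * H))


def grad_R(x, y, sign, eps=1e-4):
    """Central-difference gradient of R (exact for this quadratic up to rounding)."""
    dR_dx = (eikonal_R(x + eps, y, sign) - eikonal_R(x - eps, y, sign)) / (2 * eps)
    dR_dy = (eikonal_R(x, y + eps, sign) - eikonal_R(x, y - eps, sign)) / (2 * eps)
    return dR_dx, dR_dy


def trace_ray(x0, y0, x_end, sign, steps=20000):
    """Integrate dy/dx = (dR/dy)/(dR/dx) with the midpoint method."""
    dx = (x_end - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        gx, gy = grad_R(x, y, sign)
        y_half = y + 0.5 * dx * gy / gx
        gx, gy = grad_R(x + 0.5 * dx, y_half, sign)
        y += dx * gy / gx
        x += dx
    return y


if __name__ == "__main__":
    for y in (0.0, 2.5, 5.0, 10.0):
        gx, gy = grad_R(3.0, y, -1)
        print(f"y = {y:5.2f}  |grad R| = {math.hypot(gx, gy):.8f}  n(y) = {index(y):.8f}")
    print("ray height at x = 20 m:", trace_ray(0.0, 10.0, 20.0, -1))
    print("analytic y0*exp(-x/h): ", 10.0 * math.exp(-2.0))
```

Choosing the lower sign gives a ray that descends toward the ground while flattening out, which is the mirage-like behavior described in the example.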
2 M. Born and E. Wolf, Principles of Optics, 7th ed., Sect. 3.1.1 (Cambridge University Press, 1999).
It can be shown that the Poynting vector is directed along $\hat{\mathbf{s}}$ (see P9.2). In other words, the direction of $\hat{\mathbf{s}}$ specifies the direction of energy flow. The unit vector $\hat{\mathbf{s}}$ at each location in space points perpendicular to the wave fronts and indicates the direction that the waves travel as seen in Fig. 9.1. A collection of vectors $\hat{\mathbf{s}}$ distributed throughout space are called rays.

In retrospect, we might have jumped straight to (9.8) without going through the above derivation. After all, we know that each part of a wave front advances in the direction of its gradient $\nabla R(\mathbf{r})$ (i.e. in the direction that $R(\mathbf{r})$ varies most rapidly). We also know that each part of a wave front defined by $R(\mathbf{r}) = \text{constant}$ travels at speed $c/n(\mathbf{r})$. The slower a given part of the wave front advances, the more rapidly $R(\mathbf{r})$ changes with position $\mathbf{r}$ and the closer the contours of constant phase. It follows that $\nabla R(\mathbf{r})$ must be proportional to $n(\mathbf{r})$ since $\nabla R(\mathbf{r})$ denotes the rate of change in $R(\mathbf{r})$.

9.2 Fermat's Principle

\[ \int_A^B n\,\hat{\mathbf{s}}\cdot d\boldsymbol{\ell} \quad \text{is independent of path from A to B.} \tag{9.11} \]
path that connects A and B, the cosine associated with the dot product is less than
one at most points along that path, whereas the result of the integral is the same.
Therefore, if we artificially remove the dot product from the integral (i.e. exclude
the cosine factor), the result of the integral will exceed the true value unless the
path chosen follows the direction of s (i.e. the path that corresponds to the one
that light rays actually follow).
In mathematical form, this argument can be expressed as

\[ \int_A^B n\,\hat{\mathbf{s}}\cdot d\boldsymbol{\ell} = \min\left\{ \int_A^B n\,d\ell \right\} \tag{9.12} \]

The integral on the right is called the optical path length (OPL) between points A and B:

\[ OPL\big|_A^B \equiv \int_A^B n\,d\ell \tag{9.13} \]

The conclusion is that the true path that light follows between two points (i.e. the one that stays parallel to $\hat{\mathbf{s}}$) is the one with the shortest optical path length. The index n may vary with position and therefore can be different for each of the incremental distances $d\ell$.

Figure 9.3 A ray of light leaving point A arriving at B.
Fermat's principle is usually stated in terms of the time it takes light to travel between points. The travel time $\Delta t$ depends not only on the path taken by the light but also on the velocity of the light $v(\mathbf{r})$, which varies spatially with the refractive index:

\[ \Delta t\big|_A^B = \int_A^B \frac{d\ell}{v(\mathbf{r})} = \int_A^B \frac{d\ell}{c/n(\mathbf{r})} = \frac{OPL\big|_A^B}{c} \tag{9.14} \]
To find the correct path for the light ray that leaves point A and crosses point
B, we need only minimize the optical path length between the two points (pro-
portional to travel time). The optical path length is not the actual distance that
the light travels; it is proportional to the number of wavelengths that fit into that
distance (see (2.24)). Thus, as the wavelength shortens due to a higher index of
refraction, the optical path length increases. The correct ray traveling from A to
B does not necessarily follow a straight line but can follow a complicated curve
according to how the index varies.
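To make the relation between optical path length and travel time concrete, here is a small sketch that approximates the integral in (9.13) as a sum over straight segments and divides by c as in (9.14). The helper names and the sample index function are hypothetical, and sampling the index at segment midpoints is simply one convenient discretization.

```python
import math

C_VACUUM = 299_792_458.0  # speed of light in vacuum (m/s)


def optical_path_length(points, index):
    """Approximate OPL = integral of n dl along a polyline of (x, y) points.

    The index function is sampled at the midpoint of each straight segment.
    """
    total = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        x_mid, y_mid = 0.5 * (x0 + x1), 0.5 * (y0 + y1)
        total += index(x_mid, y_mid) * math.hypot(x1 - x0, y1 - y0)
    return total


def travel_time(points, index):
    """Travel time along the path, equal to OPL / c as in (9.14)."""
    return optical_path_length(points, index) / C_VACUUM


def straight_path(start, end, segments=1000):
    """Evenly spaced points on the straight line from start to end."""
    return [(start[0] + (end[0] - start[0]) * j / segments,
             start[1] + (end[1] - start[1]) * j / segments)
            for j in range(segments + 1)]


def glass(x, y):
    """Hypothetical uniform medium with n = 1.5."""
    return 1.5


if __name__ == "__main__":
    direct = straight_path((0.0, 0.0), (2.0, 0.0))
    detour = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
    print("OPL, direct path:", optical_path_length(direct, glass))
    print("OPL, detour path:", optical_path_length(detour, glass))
    print("travel time, direct path (s):", travel_time(direct, glass))
```

In a uniform medium the straight line has the smaller optical path length, as expected; with a position-dependent index function the same routine can be used to compare candidate curved paths.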
An imaging situation occurs when many paths from point A to point B have the same optical path length. An example of this occurs when a lens causes an image to form. In this case all rays leaving point A (on an object) and traveling through the system to point B (on the image) experience equal optical path lengths. Although the ray traveling through the center of the lens depicted in Fig. 9.4 has a shorter geometric path length, it goes through more material so that the optical path length is the same as for the outer rays.

Figure 9.4 Rays of light leaving point A with the same optical path length to B.

To summarize Fermat's principle,5 of the many rays that might emanate from a point A, the ray that crosses a second point B is the one that follows the shortest optical path length. If many rays tie for having the shortest optical path, we say that an image of point A forms at point B.

5 The minimization of (9.14) does not give the correct path in anisotropic media such as crystals where n depends on the direction of a ray as well as on its location (see P9.5).
Example 9.2
Use Fermat's principle to derive Snell's law.
Solution: Consider the many rays of light that leave point A seen in Fig. 9.5. Only one of the rays passes through point B. Within each medium we expect the light to travel in a straight line since the index is uniform. However, at the boundary we must allow for bending since the index changes.
The optical path length between points A and B may be written

\[ OPL = n_i\sqrt{x_i^2 + y_i^2} + n_t\sqrt{x_t^2 + y_t^2} \tag{9.15} \]

We need to minimize this optical path length to find the correct one according to Fermat's principle.
Since points A and B are fixed, we may regard $x_i$ and $x_t$ as constants. The distances $y_i$ and $y_t$ are not constants although the combination

\[ y_{tot} = y_i + y_t \tag{9.16} \]

is constant.
Notice that

\[ \sin\theta_i = \frac{y_i}{\sqrt{x_i^2 + y_i^2}} \quad\text{and}\quad \sin\theta_t = \frac{y_t}{\sqrt{x_t^2 + y_t^2}} \tag{9.19} \]
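The minimization described in Example 9.2 can also be carried out numerically. The sketch below searches for the crossing height y_i that minimizes the optical path length (9.15) subject to (9.16), and then evaluates n_i sin θ_i − n_t sin θ_t using (9.19). The ternary search, the function names, and the sample numbers are assumptions made for illustration; the example in the text proceeds analytically.

```python
import math


def optical_path_length(y_i, n_i, n_t, x_i, x_t, y_tot):
    """OPL of (9.15) with y_t eliminated through y_tot = y_i + y_t."""
    y_t = y_tot - y_i
    return n_i * math.hypot(x_i, y_i) + n_t * math.hypot(x_t, y_t)


def crossing_point(n_i, n_t, x_i, x_t, y_tot, iterations=200):
    """Ternary search for the y_i in [0, y_tot] that minimizes the OPL.

    The OPL is a convex function of y_i, so the search converges to the
    unique minimum.
    """
    low, high = 0.0, y_tot
    for _ in range(iterations):
        third = (high - low) / 3.0
        left, right = low + third, high - third
        opl_left = optical_path_length(left, n_i, n_t, x_i, x_t, y_tot)
        opl_right = optical_path_length(right, n_i, n_t, x_i, x_t, y_tot)
        if opl_left < opl_right:
            high = right
        else:
            low = left
    return 0.5 * (low + high)


def snell_residual(n_i, n_t, x_i, x_t, y_tot):
    """n_i sin(theta_i) - n_t sin(theta_t) evaluated at the minimizing path."""
    y_i = crossing_point(n_i, n_t, x_i, x_t, y_tot)
    y_t = y_tot - y_i
    sin_i = y_i / math.hypot(x_i, y_i)
    sin_t = y_t / math.hypot(x_t, y_t)
    return n_i * sin_i - n_t * sin_t


if __name__ == "__main__":
    # Hypothetical geometry: air (n = 1.0) into glass (n = 1.5).
    print("Snell residual:", snell_residual(1.0, 1.5, 1.0, 1.0, 1.0))
```

The residual comes out at the level of numerical round-off, which is the numerical counterpart of the statement that the path of least optical path length obeys Snell's law.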
Example 9.3
Use Fermat's principle to derive the equation of curvature for a reflective surface that causes all rays leaving one point to image to another. Do the calculation in two dimensions rather than in three.6
Solution: We adopt the convention that the origin is half way between the points, which are separated by a distance 2a, as shown in Fig. 9.6. If the points are to image to each other, Fermat's principle requires that the total path length be a constant; call it b. If the total path from the object point to the image point includes a reflection from an arbitrary point (x, y), we may write constant path length as

\[ \sqrt{(x+a)^2 + y^2} + \sqrt{(x-a)^2 + y^2} = b \tag{9.21} \]

To get (9.21) into a more recognizable form, we isolate the first square root and square both sides of the equation, which gives

\[ (x+a)^2 + y^2 = b^2 + (x-a)^2 + y^2 - 2b\sqrt{(x-a)^2 + y^2} \]

After squaring the two binomial terms, some nice cancelations occur, and we get

\[ 4ax - b^2 = -2b\sqrt{(x-a)^2 + y^2} \]

Squaring both sides once more and collecting terms gives

\[ \left(16a^2 - 4b^2\right)x^2 - 4b^2y^2 = 4a^2b^2 - b^4 \]

Finally, we divide both sides by the term on the right to obtain the (hopefully) familiar form of an ellipse

\[ \frac{x^2}{\frac{b^2}{4}} + \frac{y^2}{\frac{b^2}{4} - a^2} = 1 \tag{9.22} \]

Figure 9.6
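As a quick consistency check on (9.22), the sketch below samples points on the ellipse and confirms that the summed distance to the two points at x = ±a equals b, as (9.21) requires. The parametrization by an angle t and the chosen values of a and b are hypothetical and serve only as an illustration.

```python
import math


def ellipse_point(a, b, t):
    """Point on x^2/(b^2/4) + y^2/(b^2/4 - a^2) = 1 at parameter angle t."""
    semi_major = b / 2.0
    semi_minor = math.sqrt(b * b / 4.0 - a * a)
    return semi_major * math.cos(t), semi_minor * math.sin(t)


def total_path(a, x, y):
    """Left-hand side of (9.21): distance from (-a, 0) to (x, y) to (+a, 0)."""
    return math.hypot(x + a, y) + math.hypot(x - a, y)


if __name__ == "__main__":
    a, b = 1.0, 3.0          # hypothetical half-separation and path length (b > 2a)
    for j in range(8):
        t = 2.0 * math.pi * j / 8
        x, y = ellipse_point(a, b, t)
        print(f"t = {t:5.3f}  path length = {total_path(a, x, y):.12f}")
```

Every sampled point returns the same path length b, so each ray leaving one focus and reflecting from the surface arrives at the other focus with equal optical path.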
Fig. 9.6 represents the end of an amplifier rod while the other represents the end of a thin flash-lamp
tube.
9.3 Paraxial Rays and ABCD Matrices
perceive in our day-to-day world, the ray approximation is often a very good one.
This is the reason that ray optics was developed long before light was understood
as a wave.
We consider ray theory within the paraxial approximation, meaning that
we restrict our attention to rays that are near and almost parallel to an optical
axis of a system, say the z-axis. It is within this approximation that the familiar
imaging properties of lenses occur. An image occurs when all rays from a point
on an object converge to a corresponding point on what is referred to as the
image. To the extent that the paraxial approximation is violated, the clarity of
an image can suffer, and we say that there are aberrations present. The field
of optical engineering is often concerned with minimizing aberrations in cases
where the paraxial approximation is not strictly followed. This is done so that, for
example, a camera can take pictures of objects that occupy a fairly wide angular
field of view, where rays violate the paraxial approximation. Optical systems are
typically engineered using the science of ray tracing, which is described briefly in
section 9.A.
As we develop paraxial ray theory, we should remember that rays impinging on devices such as lenses or curved mirrors should strike the optical component at near normal incidence. To quantify this statement, the paraxial approximation is valid to the extent that

\[ \sin\theta \cong \theta \tag{9.23} \]

is a good approximation, and similarly

\[ \tan\theta \cong \theta \tag{9.24} \]

Here, the angle θ (in radians) represents the angle that a particular ray makes with respect to the optical axis. There is an important mathematical reason for this approximation. The sine is a nonlinear function, but at small angles it is approximately linear and can be represented by its argument. This linearity greatly simplifies the analysis since it reduces the problem to linear algebra. Conveniently, we will be able to keep track of imaging effects with a 2×2 matrix formalism.
Consider a ray propagating in the y–z plane where the optical axis is in the z-direction. Let us specify a ray at position $z_1$ by two coordinates: the displacement from the axis $y_1$ and the orientation angle $\theta_1$ (see Fig. 9.7). If the index is uniform everywhere, the ray travels along a straight path. It is straightforward to predict the coordinates of the same ray downstream, say at $z_2$. First, since the ray continues in the same direction, we have

\[ \theta_2 = \theta_1 \tag{9.25} \]

By referring to Fig. 9.7 we can write $y_2$ in terms of $y_1$ and $\theta_1$:

\[ y_2 = y_1 + d\tan\theta_1 \tag{9.26} \]

where $d \equiv z_2 - z_1$. Equation (9.26) is nonlinear in $\theta_1$, but within the paraxial approximation it becomes simply

\[ y_2 = y_1 + \theta_1 d \tag{9.27} \]

Figure 9.7 The behavior of a ray as light traverses a distance d.
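To get a feel for how good the linearization is, the following sketch compares the exact propagation rule (9.26) with the paraxial rule (9.27) for a small and a large angle. The function names and sample angles are hypothetical choices for illustration.

```python
import math


def propagate_exact(y1, theta1, d):
    """Exact straight-line propagation, (9.25) and (9.26)."""
    return y1 + d * math.tan(theta1), theta1


def propagate_paraxial(y1, theta1, d):
    """Paraxial propagation, (9.25) and (9.27)."""
    return y1 + theta1 * d, theta1


def paraxial_error(theta1, d=1.0):
    """Difference in ray height between the exact and paraxial rules."""
    y_exact, _ = propagate_exact(0.0, theta1, d)
    y_paraxial, _ = propagate_paraxial(0.0, theta1, d)
    return abs(y_exact - y_paraxial)


if __name__ == "__main__":
    for theta in (0.01, 0.1, 0.5):
        print(f"theta = {theta:4.2f} rad  height error over d = 1: {paraxial_error(theta):.3e}")
```

The error grows roughly as θ³/3 per unit distance, so rays within a few degrees of the axis are handled very accurately while steep rays are not.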
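Example 9.4
Let the distance d be subdivided into two distances, a and b, such that d = a + b. Show that an application of the ABCD matrix for distance a followed by an application of the ABCD matrix for b renders the same result as an application of the ABCD matrix for distance d.
where the subscript "mid" refers to the ray in the middle position after traversing the distance a. If we combine the equations, we get

\[ \begin{bmatrix} y_2 \\ \theta_2 \end{bmatrix} = \begin{bmatrix} 1 & b \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & a \\ 0 & 1 \end{bmatrix} \begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix} \tag{9.30} \]

which is in agreement with (9.28) since the ABCD matrix for both displacements is

\[ \begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} 1 & b \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & a \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & a+b \\ 0 & 1 \end{bmatrix} \tag{9.31} \]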
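The matrix product in (9.31) is easy to reproduce with a few lines of code. In this sketch the 2×2 matrices are stored as nested tuples and the helper names are hypothetical; the point is only that the matrix for the later element multiplies from the left.

```python
def matmul(m2, m1):
    """Product m2 * m1 of two 2x2 matrices stored as ((A, B), (C, D))."""
    (a2, b2), (c2, d2) = m2
    (a1, b1), (c1, d1) = m1
    return ((a2 * a1 + b2 * c1, a2 * b1 + b2 * d1),
            (c2 * a1 + d2 * c1, c2 * b1 + d2 * d1))


def translation(d):
    """ABCD matrix for propagation through a distance d."""
    return ((1.0, d), (0.0, 1.0))


def apply_matrix(m, ray):
    """Apply an ABCD matrix to a ray given as (y, theta)."""
    (a, b), (c, d) = m
    y, theta = ray
    return (a * y + b * theta, c * y + d * theta)


if __name__ == "__main__":
    a, b = 0.25, 0.5
    combined = matmul(translation(b), translation(a))   # a first, then b
    print("combined matrix:", combined)
    print("single matrix:  ", translation(a + b))
    ray = (1.0, 0.5)
    print("two steps:", apply_matrix(translation(b), apply_matrix(translation(a), ray)))
    print("one step: ", apply_matrix(combined, ray))
```

The same `matmul` pattern extends to any sequence of elements, with each additional element's matrix placed to the left of those already traversed.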
\[ y_2 = y_1 \tag{9.32} \]

We adopt the widely used convention that, upon reflection, the positive z-direction is reoriented so that we consider the rays still to travel in the positive z sense. An easy way to remember this is that the positive z direction is always taken to be downstream, toward where the light is headed. Notice that in Fig. 9.8, the reflected ray approaches the z-axis. In this case $\theta_2$ is a negative angle (as opposed to $\theta_1$ which is drawn as a positive angle) and is equal to

\[ \theta_2 = -\left(\theta_1 + 2\theta_i\right) \tag{9.33} \]

where $\theta_i$ is the angle of incidence with respect to the normal to the spherical mirror surface. By the law of reflection, the incident and reflected ray both occur at an angle $\theta_i$, referenced to the surface normal. The surface normal points towards the center of curvature of the mirror surface, which we assume is on the z-axis a distance R away. By convention, the radius of curvature R is positive if the mirror surface is concave and negative if the mirror surface is convex.
= 1 + i (9.35)
and when this is combined with (9.34), we get Figure 9.8 A ray depicted in the
act of reflection from a spherical
y1 surface.
i = 1 (9.36)
R
With this we are able to put (9.33) into a useful linear form:
2
2 = y 1 + 1 (9.37)
R
Equations (9.32) and (9.37) describe a linear transformation that can be con-
cisely formulated as
y2 1 0 y1
= (9.38) ABCD matrix for a curved mirror
2 2/R 1 1
The ABCD matrix in this transformation describes the act of reflection from a
concave mirror with radius of curvature R. As noted, the radius R is negative
when the mirror is convex.
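As a quick illustration of the mirror matrix (9.38), the sketch below sends a ray parallel to the axis onto a concave mirror and finds where the reflected ray crosses the axis. The helper names and the numerical values are assumptions chosen for illustration; the result R/2 anticipates the focal length discussed later in the chapter.

```python
def apply(m, ray):
    """Apply a 2x2 ABCD matrix to a ray (y, theta)."""
    (a, b), (c, d) = m
    y, theta = ray
    return (a * y + b * theta, c * y + d * theta)


def mirror(R):
    """ABCD matrix for reflection from a mirror (R > 0 concave, R < 0 convex)."""
    return ((1.0, 0.0), (-2.0 / R, 1.0))


R = 200.0              # radius of curvature (illustrative, e.g. in mm)
incoming = (1.0, 0.0)  # ray parallel to the axis at height y1 = 1

y2, theta2 = apply(mirror(R), incoming)
# A ray at height y2 with slope theta2 < 0 reaches the axis after a distance -y2/theta2.
crossing = -y2 / theta2
print(crossing)  # close to R/2

# For a convex mirror (negative R) the same ray is deflected away from the axis.
_, theta2_convex = apply(mirror(-R), incoming)
```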
238 Chapter 9 Light as Rays
The final basic element that we shall consider is a spherical interface between
two materials with indices $n_i$ and $n_t$ (see Fig. 9.9). This has an effect similar to
that of the curved mirror, which changes the direction of a ray without altering
its distance $y_1$ from the optical axis. Please note that here the radius of curvature
is considered to be positive for a convex surface (opposite convention from that
of the mirror). In this way, if the lower index is on the left, a positive radius R for
the interface tends to deflect rays towards the axis just as a positive radius for a
mirror does. Again, we are interested only in the act of transmission without any
travel before or after the interface. As before, (9.32) applies (i.e. $y_2 = y_1$).
Figure 9.9 A ray depicted in the act of transmission at a curved material interface.
At the interface, the rays obey Snell's law, which in the paraxial approximation
is written
$$n_i \theta_i = n_t \theta_t \tag{9.39}$$
The angles $\theta_i$ and $\theta_t$ are referenced from the surface normal, as seen in Fig. 9.9.
$$\theta_i = \theta_1 + \phi \tag{9.40}$$
and
$$\theta_t = \theta_2 + \phi \tag{9.41}$$
where $\phi$ is the angle that the surface normal makes with the z-axis. As before (see
(9.34)), within the paraxial approximation we may write
interface between regions with different indices (9.43). All other ABCD matrices
that we will use are composites of these three. For example, one can construct the
ABCD matrix for a lens by using two matrices like those in (9.43) to represent the
entering and exiting surfaces of the lens. A distance matrix (9.28) can be inserted
to account for the thickness of the lens. It is left as an exercise to derive the ABCD
matrix for a thick lens (see P9.7).
Table 9.1 Summary of ABCD matrices for common optical elements.
9.5 ABCD Matrices for Combined Optical Elements 239
Example 9.5
Derive the ABCD matrix for a thin lens, where the thickness between the two lens
surfaces is ignored. (See P 9.7 for the more general case of a thick lens.)
Solution: A thin lens is depicted in Fig. 9.10. $R_1$ is the radius of curvature for the
first surface (which is positive if convex as drawn), and $R_2$ is the radius of curvature
for the second surface (which is negative as drawn).
We take the index outside of the lens to be unity while that of the lens material to
be n. We apply the ABCD matrix (9.43) in sequence, once for entering the lens and
once for exiting:
$$\begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ \frac{1}{R_2}(n - 1) & n \end{bmatrix} \begin{bmatrix} 1 & 0 \\ \frac{1}{R_1}\left(\frac{1}{n} - 1\right) & \frac{1}{n} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -(n - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right) & 1 \end{bmatrix} \quad \text{(ABCD matrix for a thin lens)} \tag{9.44}$$
The matrix for the first interface is written on the right, where it operates first on
an incoming ray vector. In this case, $n_i = 1$ and $n_t = n$. The matrix for the second
surface is written on the left so that it operates afterwards. For the second surface,
$n_i = n$ and $n_t = 1$.
Notice the close similarity between the ABCD matrix for a thin lens (9.44) and
the ABCD matrix for a curved mirror (9.38). The ABCD matrix for either the thin
lens or the mirror can be written as
$$\begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -1/f & 1 \end{bmatrix} \tag{9.45}$$
where in the case of the thin lens the focal length f is given by the lens maker's formula
$$\frac{1}{f} = (n - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right) \quad \text{(focal length of thin lens)} \tag{9.46}$$
The reason for calling f the focal length will become apparent later. Table 9.1
gives a summary of ABCD matrices of common optical elements.
Figure 9.10 Thin lens.
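The thin-lens result can be reproduced numerically by multiplying two interface matrices. The sketch below assumes the curved-interface matrix (9.43) has the form implied by the two factors in (9.44), namely lower-left element $(n_i/n_t - 1)/R$ and lower-right element $n_i/n_t$; the function names and the sample values (n = 1.5, R = ±100) are illustrative assumptions.

```python
def matmul(m, n):
    """Product m @ n of two 2x2 matrices stored as nested tuples."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))


def interface(n_i, n_t, R):
    """Curved interface between indices n_i and n_t (R > 0 if convex)."""
    return ((1.0, 0.0), ((n_i / n_t - 1.0) / R, n_i / n_t))


def thin_lens_from_surfaces(n, R1, R2):
    """Entering surface on the right (acts first), exiting surface on the left."""
    return matmul(interface(n, 1.0, R2), interface(1.0, n, R1))


def lens_maker_focal_length(n, R1, R2):
    """Focal length from the lens maker's formula (9.46)."""
    return 1.0 / ((n - 1.0) * (1.0 / R1 - 1.0 / R2))


n, R1, R2 = 1.5, 100.0, -100.0  # symmetric biconvex lens (illustrative values)
M = thin_lens_from_surfaces(n, R1, R2)
f = lens_maker_focal_length(n, R1, R2)
print(M, f)  # the lower-left element of M should equal -1/f
```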
Example 9.6
Derive the ABCD matrix for a window with thickness d and index n.
Solution: We can again take advantage of the ABCD matrix for a curved interface
(9.43), only with $R_1 = \infty$ and $R_2 = \infty$ to provide flat surfaces. We take the index
outside of the window to be unity and the index inside the window to be n. We use
the ABCD matrix (9.43) twice, once for each interface, sandwiching matrix (9.31),
which endows the window with thickness:
$$\begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & n \end{bmatrix} \begin{bmatrix} 1 & d \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1/n \end{bmatrix} = \begin{bmatrix} 1 & d/n \\ 0 & 1 \end{bmatrix} \quad \text{(window)} \tag{9.48}$$
As far as rays are concerned, a window is effectively shorter to traverse than free
space.$^{8}$ Fig. 9.11 illustrates why this is the case. The displacement of the exiting ray
is not as great as it would have been without the window. The window impedes
the rate at which the ray can move away from or toward the optical axis.
Figure 9.11 Window.
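The window matrix can be checked by direct multiplication. This is a small sketch under the assumption that a flat interface is the $R \to \infty$ limit of the curved-interface matrix, leaving only the index ratio in the lower-right element; helper names and sample values are hypothetical.

```python
def matmul(m, n):
    """Product m @ n of two 2x2 matrices stored as nested tuples."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))


def translation(d):
    """ABCD matrix for propagation through a distance d."""
    return ((1.0, d), (0.0, 1.0))


def flat_interface(n_i, n_t):
    """Flat interface: the curved-interface matrix in the limit R -> infinity."""
    return ((1.0, 0.0), (0.0, n_i / n_t))


def window(d, n):
    """Enter the glass, travel a distance d, exit the glass."""
    inside = matmul(translation(d), flat_interface(1.0, n))
    return matmul(flat_interface(n, 1.0), inside)


W = window(3.0, 1.5)  # illustrative thickness and index
print(W)  # upper-right element is the effective distance d/n
```

With these numbers the window behaves like 2 units of free space rather than 3, which is the sense in which it is "shorter to traverse".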
Example 9.7
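Find ray $\begin{bmatrix} y_2 \\ \theta_2 \end{bmatrix}$ that results when $\begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix}$ propagates through a distance a, reflects from
a mirror of radius R, and then propagates through a distance b. See Fig. 9.12.
Solution: The final ray in terms of the initial one is computed as follows:
$$\begin{aligned} \begin{bmatrix} y_2 \\ \theta_2 \end{bmatrix} &= \begin{bmatrix} 1 & b \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -2/R & 1 \end{bmatrix} \begin{bmatrix} 1 & a \\ 0 & 1 \end{bmatrix} \begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix} \\ &= \begin{bmatrix} 1 - 2b/R & a + b - 2ab/R \\ -2/R & 1 - 2a/R \end{bmatrix} \begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix} \\ &= \begin{bmatrix} (1 - 2b/R)\, y_1 + (a + b - 2ab/R)\, \theta_1 \\ -(2/R)\, y_1 + (1 - 2a/R)\, \theta_1 \end{bmatrix} \end{aligned} \tag{9.49}$$
As always, the ordering of the matrices is important. The first effect that the ray
experiences is represented by the matrix on the right, which is in the position that
operates first on $\begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix}$.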
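The closed form in (9.49) can be compared against a direct numerical product. The sketch below is illustrative only; the function names and the test values a = 2, R = 4, b = 3 are assumptions.

```python
def matmul(m, n):
    """Product m @ n of two 2x2 matrices stored as nested tuples."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))


def translation(d):
    return ((1.0, d), (0.0, 1.0))


def mirror(R):
    return ((1.0, 0.0), (-2.0 / R, 1.0))


def system(a, R, b):
    """Travel a, reflect from a mirror of radius R, travel b (first step on the right)."""
    return matmul(translation(b), matmul(mirror(R), translation(a)))


def closed_form(a, R, b):
    """Middle line of (9.49)."""
    return ((1 - 2 * b / R, a + b - 2 * a * b / R),
            (-2 / R, 1 - 2 * a / R))


numeric = system(2.0, 4.0, 3.0)
analytic = closed_form(2.0, 4.0, 3.0)
print(numeric, analytic)
```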
8 In contrast, the optical path length OPL is effectively longer than free space by the factor n.
9.6 Image Formation 241
Imagine a ray contained within a plane that is parallel to the yz plane but for
which x > 0. One might be concerned that when the ray meets, for example, a
spherically concave mirror, the radius of curvature in the perspective of the yz
dimension might be different for x > 0 than for x = 0 (at the center of the mirror).
This concern is actually quite legitimate and is the source of what is known as
spherical aberration. Nevertheless, in the paraxial approximation the intersection
with the curved mirror of all planes that are parallel to the optical axis gives the
same curve.
To see why this is so, consider the curvature of the mirror in Fig. 9.8. As we
move away from the mirror center (in the x or y-dimension or some combination
thereof), the mirror curves to the left by the amount
$$\delta = R - R\cos\phi \tag{9.50}$$
In the paraxial approximation, we have $\cos\phi \cong 1 - \phi^2/2$. And since in this
approximation we may also write $\phi \cong \sqrt{x^2 + y^2}\,/R$, (9.50) becomes
However, when this matrix is used in succession to form a lens, the resulting matrix has determinant
equal to one.
where by (9.47) we have replaced 2/R with 1/f. Because of the similarity between
the behavior of a curved mirror and a thin lens, the above expression can also
represent a ray traveling a distance a, traversing a thin lens with focal length f,
and then traveling a distance b. The only difference is that, in the case of the thin
lens, f is given by the lens maker's formula (9.46).
As is well known, it is possible to form an image with either a curved mirror
or a lens. Suppose that the initial ray is one of many rays that leaves a particular
point on an object positioned $a = d_o$ before the mirror (or lens). In order for an
image to occur at $d_i = b$, it is essential that all rays leaving the particular point on
the object converge to a corresponding point on the image. That is, we want rays
leaving the point $y_1$ on the object (which may take on a range of angles $\theta_1$) all to
converge to a single point $y_2$ at the image. In the following equation we need $y_2$
to be independent of $\theta_1$:
$$\begin{bmatrix} y_2 \\ \theta_2 \end{bmatrix} = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix} = \begin{bmatrix} A y_1 + B \theta_1 \\ C y_1 + D \theta_1 \end{bmatrix} \tag{9.54}$$
This requires B = 0. With $a = d_o$ and $b = d_i$, the condition reads
$$d_o + d_i - \frac{d_o d_i}{f} = 0 \quad\Rightarrow\quad \frac{1}{f} = \frac{1}{d_o} + \frac{1}{d_i} \tag{9.56}$$
Figure 9.13 Image formation by a thin lens.
which is the familiar imaging formula (9.1). When the object is infinitely far away
(i.e. $d_o \to \infty$), the image appears at $d_i \to f$. This gives a physical interpretation
to the focal length f, as we have been calling it. Please note that $d_o$ and $d_i$ can
each be either positive (real, as depicted in Fig. 9.13) or negative (virtual, meaning
a screen cannot be inserted to display the image).
The magnification of the image is found by comparing the size of $y_2$ to $y_1$.
From (9.53)-(9.56), the magnification is found to be
$$M \equiv \frac{y_2}{y_1} = A = 1 - \frac{d_i}{f} = -\frac{d_i}{d_o} \tag{9.57}$$
The negative sign indicates that for positive distances $d_o$ and $d_i$ the image is
inverted.
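The imaging condition and the magnification can be illustrated numerically. The sketch below builds the object-to-image matrix for a thin lens and checks that B vanishes when $d_i$ satisfies (9.56), and that A then equals $-d_i/d_o$. Function names and the values f = 100, $d_o$ = 300 are assumptions made for the example.

```python
def matmul(m, n):
    """Product m @ n of two 2x2 matrices stored as nested tuples."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))


def translation(d):
    return ((1.0, d), (0.0, 1.0))


def thin_lens(f):
    return ((1.0, 0.0), (-1.0 / f, 1.0))


def image_distance(f, d_o):
    """Solve 1/f = 1/d_o + 1/d_i for d_i."""
    return 1.0 / (1.0 / f - 1.0 / d_o)


def object_to_image(f, d_o, d_i):
    """Travel d_o, pass through the lens, travel d_i."""
    return matmul(translation(d_i), matmul(thin_lens(f), translation(d_o)))


f, d_o = 100.0, 300.0  # illustrative values
d_i = image_distance(f, d_o)
(A, B), (C, D) = object_to_image(f, d_o, d_i)
print(d_i, A, B)  # B is (numerically) zero and A is the magnification
```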
Example 9.8
Beginning students are often taught to draw ray diagrams such as the one in Fig.
9.14, which shows a real image formed by a thin lens. Several key rays aid in a
graphic prediction of the location and size of the image. Use ABCD-matrix analysis
to describe the effect of the lens on the three rays drawn.
[Figure 9.14: ray diagram for a thin lens, with labels for the object, the image, and the individual rays.]
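Solution: Ray A is parallel to the axis with height $y_1$ before traversing the lens. Just
after the lens, ray A is described by
$$\begin{bmatrix} y_2 \\ \theta_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -1/f & 1 \end{bmatrix} \begin{bmatrix} y_1 \\ 0 \end{bmatrix} = \begin{bmatrix} y_1 \\ -y_1/f \end{bmatrix}$$
which crosses the axis at the focus d = f, since
$$\begin{bmatrix} 1 & f \\ 0 & 1 \end{bmatrix} \begin{bmatrix} y_1 \\ -y_1/f \end{bmatrix} = \begin{bmatrix} 0 \\ -y_1/f \end{bmatrix}.$$
Meanwhile, ray B traverses the lens just where it crosses the axis. The lens
does nothing to this ray:
$$\begin{bmatrix} y_2 \\ \theta_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -1/f & 1 \end{bmatrix} \begin{bmatrix} 0 \\ \theta_1 \end{bmatrix} = \begin{bmatrix} 0 \\ \theta_1 \end{bmatrix}$$
Ray B is un-deflected. For ray B, $\theta_1 = -y_1/d_o$.
Finally, ray C, which crosses the axis a distance d = f before the lens, becomes
parallel to the axis after traversing the lens:
$$\begin{bmatrix} 1 & 0 \\ -1/f & 1 \end{bmatrix} \begin{bmatrix} 1 & f \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 0 \\ \theta_1 \end{bmatrix} = \begin{bmatrix} f\theta_1 \\ 0 \end{bmatrix}$$
For ray C, $\theta_1 = (y_2 - y_1)/d_o$.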
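The three rays of this example can also be traced all the way from the object plane to the image plane. The sketch below is a hedged illustration: it assumes all three rays start from the same object point at height $y_1$, with ray C aimed at the front focal point (slope $-y_1/(d_o - f)$, which is equivalent to the expression given for ray C). Names and numbers are hypothetical.

```python
def matmul(m, n):
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))


def apply(m, ray):
    (a, b), (c, d) = m
    y, theta = ray
    return (a * y + b * theta, c * y + d * theta)


def translation(d):
    return ((1.0, d), (0.0, 1.0))


def thin_lens(f):
    return ((1.0, 0.0), (-1.0 / f, 1.0))


f, d_o, y1 = 100.0, 300.0, 10.0       # illustrative values
d_i = 1.0 / (1.0 / f - 1.0 / d_o)     # imaging formula (9.56)
M = matmul(translation(d_i), matmul(thin_lens(f), translation(d_o)))

rays = {
    "A": (y1, 0.0),               # parallel to the axis
    "B": (y1, -y1 / d_o),         # aimed at the center of the lens
    "C": (y1, -y1 / (d_o - f)),   # passes through the front focal point
}
image = {name: apply(M, ray) for name, ray in rays.items()}
for name, (y2, theta2) in image.items():
    print(name, y2, theta2)  # all three heights agree; ray C exits with zero slope
```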
A telescope consists of two lenses, the first called the objective lens with focal
length $f_o$ and the second called the eye piece with focal length $f_e$. The function of a
telescope is to map all rays having incident angle $\theta_1$ into corresponding rays all
with wider angle $\theta_2$. Importantly, $\theta_2$ should only depend on $\theta_1$ and not on where
each ray enters the objective lens (i.e. $y_1$).
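The ABCD matrix of the system is
$$\begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -\frac{1}{f_e} & 1 \end{bmatrix} \begin{bmatrix} 1 & L \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -\frac{1}{f_o} & 1 \end{bmatrix} = \begin{bmatrix} 1 - \frac{L}{f_o} & L \\ -\frac{1}{f_o} - \frac{1}{f_e} + \frac{L}{f_o f_e} & 1 - \frac{L}{f_e} \end{bmatrix}$$
In the ray transformation
$$\begin{bmatrix} y_2 \\ \theta_2 \end{bmatrix} = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix} = \begin{bmatrix} A y_1 + B \theta_1 \\ C y_1 + D \theta_1 \end{bmatrix}$$
we need C = 0 or $-\frac{1}{f_o} - \frac{1}{f_e} + \frac{L}{f_o f_e} = 0$ to ensure that $\theta_2$ depends only on $\theta_1$. This
reduces simply to
$$L = f_o + f_e \tag{9.61}$$
which is the required separation between the objective lens and the eye piece.
The angular magnification is defined to be
$$M_\theta \equiv \frac{\theta_2}{\theta_1} = D = 1 - \frac{L}{f_e} = -\frac{f_o}{f_e} \tag{9.62}$$
Figure 9.15 Basic telescope consisting of an objective lens and an eye piece. (Figure
labels: rays from a distant object; real image.)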
The angular magnification is governed by the ratio of the two focal lengths. When
looking through a telescope, the apparent angular separation between distant
objects is enhanced by this factor. The minus sign indicates that objects in the
field of view tend to be inverted.
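A short numerical check of the telescope condition: with the lenses separated by $L = f_o + f_e$, the element C should vanish and D should equal $-f_o/f_e$. The focal lengths below are hypothetical values chosen for illustration.

```python
def matmul(m, n):
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))


def translation(d):
    return ((1.0, d), (0.0, 1.0))


def thin_lens(f):
    return ((1.0, 0.0), (-1.0 / f, 1.0))


def telescope(f_o, f_e, L):
    """Objective first (rightmost), then the gap L, then the eye piece."""
    return matmul(thin_lens(f_e), matmul(translation(L), thin_lens(f_o)))


f_o, f_e = 500.0, 25.0  # illustrative focal lengths
(A, B), (C, D) = telescope(f_o, f_e, f_o + f_e)
print(C, D)  # C is (numerically) zero; D is the angular magnification
```

With these numbers the angular magnification is about -20: distant objects appear twenty times farther apart in angle, and inverted.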
of the system, the overall matrix simplifies to one that looks identical to the matrix
for a simple thin lens (9.45).
With knowledge of the positions of the principal planes, one can treat the
complicated imaging system in the same way that one treats a simple thin lens.
That is, we can simply use the common imaging formulas (9.56) and (9.57). The
only difference is that $d_o$ is the distance from the object to the first principal plane
and $d_i$ is the distance from the second principal plane to the image. In the case
of an actual thin lens, both principal planes are at $p_1 = p_2 = 0$. For a composite
system, $p_1$ and $p_2$ can be either positive or negative.
We assert that for any optical system,$^{11}$ $p_1$ and $p_2$ can always be selected such
that we can write
$$\begin{bmatrix} 1 & p_2 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} 1 & p_1 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} A + p_2 C & p_1 A + B + p_1 p_2 C + p_2 D \\ C & p_1 C + D \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -1/f_{\rm eff} & 1 \end{bmatrix} \tag{9.63}$$
The final matrix is that of a simple thin lens, and it takes the place of the composite
system including the distances to the principal planes.
Figure 9.16 A multi-element system represented as an ABCD matrix for which principal
planes always exist. (Figure labels: first principal plane; second principal plane.)
Our task is to find the values of $p_1$ and $p_2$ that make (9.63) true. We can straightaway
make the definition
$$f_{\rm eff} \equiv -1/C \tag{9.64}$$
We can also solve for $p_1$ and $p_2$ by setting the diagonal elements of the matrix to 1.
Explicitly, we get
$$p_1 C + D = 1 \quad\Rightarrow\quad p_1 = \frac{1 - D}{C} \tag{9.65}$$
and
$$A + p_2 C = 1 \quad\Rightarrow\quad p_2 = \frac{1 - A}{C} \tag{9.66}$$
It remains to be shown that the upper right element in (9.63) (i.e. $p_1 A + B +
p_1 p_2 C + p_2 D$) automatically goes to zero for our choices of $p_1$ and $p_2$. This may
seem unlikely at first, but watch what happens!
When (9.65) and (9.66) are substituted into the upper right matrix element of (9.63)
we get
$$\begin{aligned} p_1 A + B + p_1 p_2 C + p_2 D &= \frac{1 - D}{C} A + B + \frac{1 - D}{C}\,\frac{1 - A}{C}\, C + \frac{1 - A}{C} D \\ &= \frac{1}{C}\left[1 - AD + BC\right] \\ &= \frac{1}{C}\left[1 - \begin{vmatrix} A & B \\ C & D \end{vmatrix}\right] \end{aligned} \tag{9.67}$$
10 R. Guenther, Modern Optics, p. 186 (New York: Wiley, 1990).
11 The starting and ending refractive index must be the same.
This vanishes (as desired) since the determinant of the ABCD matrix is one, in
accordance with (9.52).
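To make the principal-plane construction concrete, the sketch below applies (9.64)-(9.66) to a hypothetical system of two thin lenses ($f_1$ = 100, $f_2$ = 50, separated by 30; all values assumed for illustration) and confirms that sandwiching the system between the distances $p_1$ and $p_2$ yields a thin-lens matrix.

```python
def matmul(m, n):
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))


def translation(d):
    return ((1.0, d), (0.0, 1.0))


def thin_lens(f):
    return ((1.0, 0.0), (-1.0 / f, 1.0))


def principal_planes(m):
    """Return (p1, p2, f_eff) from (9.65), (9.66), and (9.64)."""
    (A, B), (C, D) = m
    return (1.0 - D) / C, (1.0 - A) / C, -1.0 / C


# Hypothetical compound system: lens f1, gap s, lens f2 (first element on the right).
f1, f2, s = 100.0, 50.0, 30.0
system = matmul(thin_lens(f2), matmul(translation(s), thin_lens(f1)))

p1, p2, f_eff = principal_planes(system)
sandwich = matmul(translation(p2), matmul(system, translation(p1)))
print(p1, p2, f_eff)
print(sandwich)  # should look like a thin lens with focal length f_eff
```

Negative values of $p_1$ or $p_2$ simply mean the corresponding principal plane lies inside (or on the far side of) the system.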
9.8 Stability of Laser Cavities
The ABCD matrix formulation provides a powerful tool to analyze the stability of a
laser cavity.$^{12}$ The basic elements of a laser cavity include an amplifying medium
and mirrors to provide feedback. Presumably, at least one of the end mirrors is
partially transmitting so that energy is continuously extracted from the cavity.
Here, we dispense with the amplifying medium and concentrate our attention on
the optics providing the feedback.
As might be expected, the mirrors must be carefully aligned or successive
reflections might cause rays to walk continuously away from the optical axis,
so that they eventually leave the cavity out the side. If a simple cavity is formed
with two flat mirrors that are perfectly aligned parallel to each other, one might
suppose that the mirrors would provide ideal feedback. However, all rays except
for those that are perfectly aligned to the mirror surface normals would eventually
wander out of the side of the cavity as illustrated in Fig. 9.17a. Such a cavity is said
to be unstable. We would like to do a better job of trapping the light in the cavity.
To improve the situation, a cavity can be constructed with concave end mirrors
to help confine the beams within the cavity. Even so, one must choose
carefully the curvature of the mirrors and their separation L. If this is not done
correctly, the curved mirrors can overcompensate for the tendency of the rays
to wander out of the cavity and thus aggravate the problem. Such an unstable
scenario is depicted in Fig. 9.17b.
Figure 9.17c depicts a cavity made with curved mirrors where the separation
L is chosen appropriately to make the cavity stable. Although a ray, as it makes
successive bounces, can strike the end mirrors at a variety of points, the curvature
of the mirrors keeps the trajectories contained within a narrow region so that
they cannot escape out the sides of the cavity.
Figure 9.17 (a) A ray bouncing between two parallel flat mirrors. (b) A ray bouncing
between two curved mirrors in an unstable configuration. (c) A ray bouncing
between two curved mirrors in a stable configuration. (d) Stable cavity utilizing a
lens and two flat end mirrors.
There are many ways to make a stable laser cavity. For example, a stable cavity
can be made using a lens between two flat end mirrors as shown in Fig. 9.17d. Any
combination of lenses (perhaps more than one) and curved mirrors can be used
to create stable cavity configurations. Ring cavities can also be made to be stable
where in no place do the rays retro-reflect from a mirror but circulate through
a series of elements like cars going around a racetrack. The ABCD matrix for a
round trip in the cavity will be useful for this analysis.
Example 9.9
Find the round-trip ABCD matrix for the cavities shown in Figs. 9.17c and 9.17d.
12 P. W. Milonni and J. H. Eberly, Lasers, Sect. 14.3 (New York: Wiley, 1988).
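Solution: The round-trip ABCD matrix for the cavity shown in Fig. 9.17c is
$$\begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} 1 & L \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -2/R_2 & 1 \end{bmatrix} \begin{bmatrix} 1 & L \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -2/R_1 & 1 \end{bmatrix} \tag{9.68}$$
where we begin the round trip with a reflection from the first mirror.
The round-trip ABCD matrix for the cavity shown in Fig. 9.17d is
$$\begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} 1 & 2L_1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -1/f & 1 \end{bmatrix} \begin{bmatrix} 1 & 2L_2 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -1/f & 1 \end{bmatrix} \tag{9.69}$$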
where we begin the round trip with transmission through the lens moving to the
right. It is somewhat arbitrary where a round trip begins. Multiplication of the
above matrices will be necessary to do problems P9.17 and P9.18.
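Carrying out these products by hand is tedious, so a numerical version can be handy for checking algebra. The sketch below builds the round-trip matrices (9.68) and (9.69) for arbitrary illustrative parameters and confirms that each has unit determinant; function names and numbers are assumptions, not part of the text.

```python
def matmul(m, n):
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))


def translation(d):
    return ((1.0, d), (0.0, 1.0))


def mirror(R):
    return ((1.0, 0.0), (-2.0 / R, 1.0))


def thin_lens(f):
    return ((1.0, 0.0), (-1.0 / f, 1.0))


def round_trip_two_mirrors(R1, R2, L):
    """(9.68): reflect from mirror 1, travel L, reflect from mirror 2, travel L."""
    m = matmul(translation(L), mirror(R1))
    m = matmul(mirror(R2), m)
    return matmul(translation(L), m)


def round_trip_lens_cavity(f, L1, L2):
    """(9.69): lens, out and back over L2, lens again, out and back over L1."""
    m = matmul(translation(2.0 * L2), thin_lens(f))
    m = matmul(thin_lens(f), m)
    return matmul(translation(2.0 * L1), m)


def det(m):
    (a, b), (c, d) = m
    return a * d - b * c


M_c = round_trip_two_mirrors(R1=2.0, R2=3.0, L=1.0)   # illustrative values
M_d = round_trip_lens_cavity(f=0.5, L1=0.4, L2=0.6)   # illustrative values
print(M_c, det(M_c))
print(M_d, det(M_d))
```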
At this point you might be concerned that taking an ABCD matrix to the N th
power can be a lot of work. (It is already significant work just to compute the
ABCD matrix for a single round trip.) In addition, we are interested in letting N
be very large, perhaps even infinity. You can relax because we have a neat trick to
accomplish this daunting task.
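By Sylvester's theorem in appendix 0.3, we have
$$\begin{bmatrix} A & B \\ C & D \end{bmatrix}^N = \frac{1}{\sin\theta} \begin{bmatrix} A \sin N\theta - \sin (N-1)\theta & B \sin N\theta \\ C \sin N\theta & D \sin N\theta - \sin (N-1)\theta \end{bmatrix} \tag{9.71}$$
where
$$\cos\theta = \frac{1}{2}(A + D) \tag{9.72}$$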
This is valid as long as the determinant of the ABCD matrix is one. As noted
earlier (see (9.52)), we are in luck! The determinant is one any time a ray begins
and stops in the same refractive index, which by definition is guaranteed for any
round trip. We therefore can employ Sylvesters theorem for any N that we might
choose, including very large integers.
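Sylvester's theorem can be spot-checked by comparing (9.71) with brute-force repeated multiplication. The sketch below does this for one unimodular matrix with |A + D|/2 < 1; the matrix entries and the exponent N = 7 are arbitrary illustrative choices.

```python
import math


def matmul(m, n):
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))


def power(m, N):
    """N-th power by repeated multiplication."""
    result = ((1.0, 0.0), (0.0, 1.0))
    for _ in range(N):
        result = matmul(m, result)
    return result


def sylvester(m, N):
    """N-th power from (9.71); assumes det(m) = 1 and |A + D| / 2 < 1."""
    (A, B), (C, D) = m
    theta = math.acos((A + D) / 2.0)
    s = math.sin(theta)
    sN = math.sin(N * theta)
    sN1 = math.sin((N - 1) * theta)
    return (((A * sN - sN1) / s, B * sN / s),
            (C * sN / s, (D * sN - sN1) / s))


m = ((0.7, 30.0), (-0.024, 0.4))  # determinant 0.28 + 0.72 = 1
brute = power(m, 7)
closed = sylvester(m, 7)
print(brute)
print(closed)
```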
We would like the elements of (9.71) to remain finite as N becomes very large.
If this is the case, then we know that a ray remains trapped within the cavity
and stays reasonably close to the optical axis. Since N only appears within the
argument of a sine function, which is always bounded between -1 and 1 for
real arguments, it might seem that the elements of (9.71) always remain finite
as N approaches infinity. However, it turns out that $\theta$ can become imaginary
depending on the outcome of (9.72), in which case the sine becomes a hyperbolic
sine, which can blow up as N becomes large. In the end, the condition for cavity
stability is that a real $\theta$ must exist for (9.72), or in other words we need
$$-1 < \frac{1}{2}(A + D) < 1 \quad \text{(condition for a stable cavity)} \tag{9.73}$$
It is left as an exercise to apply this condition to (9.68) and (9.69) to find the
necessary relationships between the various element curvatures and spacing in
order to achieve cavity stability.
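The stability test (9.73) is straightforward to automate. The following sketch evaluates it for a two-mirror cavity using the round-trip matrix (9.68); the function names and the mirror radii and spacings are hypothetical, and flat mirrors are represented by an infinite radius.

```python
def matmul(m, n):
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))


def translation(d):
    return ((1.0, d), (0.0, 1.0))


def mirror(R):
    return ((1.0, 0.0), (-2.0 / R, 1.0))


def round_trip_two_mirrors(R1, R2, L):
    """Round-trip matrix (9.68) for two mirrors separated by L."""
    m = matmul(translation(L), mirror(R1))
    m = matmul(mirror(R2), m)
    return matmul(translation(L), m)


def is_stable(m):
    """Condition (9.73): -1 < (A + D)/2 < 1."""
    (A, _), (_, D) = m
    return -1.0 < (A + D) / 2.0 < 1.0


flat = float("inf")
stable = is_stable(round_trip_two_mirrors(1.0, 1.0, 0.5))      # short cavity
unstable = is_stable(round_trip_two_mirrors(1.0, 1.0, 2.5))    # mirrors too far apart
flat_flat = is_stable(round_trip_two_mirrors(flat, flat, 1.0)) # two flat mirrors
print(stable, unstable, flat_flat)
```

The flat-flat cavity sits exactly on the boundary, (A + D)/2 = 1, so the strict inequality classifies it as not stable, in line with the discussion of Fig. 9.17a.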
commercial software designed for this purpose. Such software packages are able
to develop and optimize designs for specific applications. A useful feature in many
software packages is that the user can specify that the design should employ only
standard optical components available from known optics companies. In any
case, it is typical to specify that all lenses in the system should have spherical
surfaces since these are much less expensive to manufacture. We mention briefly
several common classes of aberrations that can occur if a lens system is not
properly designed.
Chromatic aberration arises from the fact that the index of refraction for glass
varies with the wavelength of light. Since the focal length of a lens depends on
the index of refraction (see, for example, Eq. (9.46)), the focal length of a lens
varies with the wavelength of light. Chromatic aberration can be compensated
for by using a pair of lenses made from two types of glass as shown in Fig. 9.20
(the pair is usually cemented together to form a doublet lens). The lens with the
shortest focal length is made of the glass whose index has the lesser dependence
on wavelength. By properly choosing the prescription of the two lenses, you
can exactly compensate for chromatic aberration at two wavelengths and do a
good job for a wide range of others. Achromatic doublets can also be designed to
minimize spherical aberration (see below), so they are often a good choice when
you need a high quality lens.
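The size of the effect for a single lens can be estimated from the lens maker's formula (9.46). The sketch below uses two index values that are purely hypothetical (chosen only so that the index is slightly larger for blue light than for red, as is typical of glass) and an assumed symmetric biconvex shape.

```python
def focal_length(n, R1, R2):
    """Thin-lens focal length from the lens maker's formula (9.46)."""
    return 1.0 / ((n - 1.0) * (1.0 / R1 - 1.0 / R2))


R1, R2 = 100.0, -100.0        # symmetric biconvex lens (illustrative radii)
n_blue, n_red = 1.53, 1.51    # hypothetical indices, not data for a real glass

f_blue = focal_length(n_blue, R1, R2)
f_red = focal_length(n_red, R1, R2)
print(f_blue, f_red, f_red - f_blue)  # blue light focuses closer to the lens
```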
Monochromatic aberrations arise from the shape of the lens rather than the
variation of n with wavelength. Before the advent of computers facilitated the
widespread use of ray tracing, these aberrations had to be analyzed analytically.
The analytic results derived previously in this chapter were based on a
first-order approximation (e.g. $\sin\theta \cong \theta$). One can increase the accuracy of the
theory for non-paraxial rays by retaining higher-order terms in the polynomial
representation of $\sin\theta$. With higher-order terms included, the wave fronts
converging towards an image point are still approximately spherical, but have
aberration terms added in (shown conceptually in Fig. 9.18(b)). Without going into
detail, there are five aberration terms in the standard second-order analysis,
which represent a convenient basis for discussing aberration.
Figure 9.20 Chromatic aberration causes lenses to have different focal lengths for
different wavelengths. It can be corrected using an achromatic doublet lens.
(Figure labels: high dispersion glass; low dispersion glass.)
The first aberration term is known as spherical aberration. This type of aber-
ration results from the fact that rays traveling through a spherical lens at large
radii experience a different focal length than those traveling near the axis. For
a converging lens, this causes far-off-radius rays to focus before the near-axis
rays as shown in Fig. 9.21. This problem can be helped by orienting lenses so
that the face with the least curvature is pointed towards the side where the light
rays have the largest angle. This procedure splits the bending of rays more evenly
between the front and back surface of the lens. As mentioned above, you can also
cement two lenses made from different types of glass together so that spherical
aberrations from one lens are corrected by the other.
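The same effect is simplest to quantify for a concave spherical mirror rather than a lens, so the sketch below uses that case as a stand-in. It relies on an elementary geometric result that is not derived in the text: a ray parallel to the axis at height y strikes the mirror where the normal makes an angle $\phi$ with the axis ($\sin\phi = y/R$), and after reflection it crosses the axis a distance $R - R/(2\cos\phi)$ from the mirror vertex. The function name and sample heights are assumptions.

```python
import math


def mirror_axis_crossing(y, R):
    """Exact distance from the vertex at which a ray parallel to the axis,
    at height y, crosses the axis after reflecting from a concave spherical
    mirror of radius R (requires 0 < y < R)."""
    phi = math.asin(y / R)
    return R - R / (2.0 * math.cos(phi))


R = 1.0
paraxial_focus = R / 2.0
for y in (0.01, 0.1, 0.3, 0.5):
    print(y, mirror_axis_crossing(y, R), paraxial_focus)
```

Rays far from the axis cross it closer to the mirror than the paraxial focus R/2, which is the mirror analogue of the behavior described above for a converging lens.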
The aberration term referred to as astigmatism occurs when an off-axis object
point is imaged to an off-axis image point. In this case a spherical lens has a
different focal length in the horizontal and vertical dimensions. For a focusing lens
this causes the two dimensions to focus at different distances, producing a vertical
line at one image plane and a horizontal line at another (see Fig. 9.22). A lens
can also be inherently astigmatic even when viewed on axis if it is football shaped
rather than spherical. In this case, the astigmatic aberration can be corrected by
inserting a cylindrical lens at the correct orientation (this is a common correction
needed in eyeglasses).
Figure 9.21 Spherical aberration in a plano-convex lens.
Figure 9.22 Astigmatism causes the horizontal and vertical dimensions to focus at
different distances. (Figure labels: object; horizontal focus; vertical focus.)
Figure 9.23 Illustration of coma. Rays traveling through the center of the lens are
imaged to point a as predicted by paraxial theory. Rays that travel through the lens at
radius $\rho_b$ in the plane of the figure are imaged to point b. Rays that travel through the
lens at radius $\rho_b$, but outside the plane of the figure, are imaged to other points on the
circle (in the image plane) containing point b. Rays that travel through the lens at
other radii on the lens (e.g. $\rho_c$) also form circles in the image plane with radius
proportional to $\rho^2$ with the center offset from point a a distance proportional to $\rho^2$. When
light from each of these circles combines on the screen it produces an imaged point
with a comet tail. (Figure label: image on screen.)
A third aberration term is referred to as coma. This is observed when off-axis
points are imaged and produces a comet shaped tail with its head at the point
predicted by paraxial theory. (The term "coma" refers to the atmosphere of a
comet, which is how the aberration got its name.) This aberration is distinct from
astigmatism, which is also observed for off-axis points, since coma is observed
even when all of the rays are in one plane (see Fig. 9.23). You have probably seen
coma if you've ever played with a magnifying glass in the sun: just tilt the lens
slightly and you see a comet-like image rather than a point.
The curvature of the field aberration term arises from the fact that spherical
lenses image spherical surfaces to another spherical surface, rather than imaging
a plane to a plane. This is not so bad for your eyeball, which has a curved screen,
but for things like cameras and movie projectors we would like to image to a flat
screen. When a flat screen is used and the curvature of the field aberration is
present, the image will focus well near the center, but become progressively out
of focus as you move to the edge of the screen (i.e. the flat screen is farther from
the curved image surface as you move from the center).
The final aberration term is referred to as distortion. This aberration occurs
when the magnification of a lens depends on the distance from the center of
the screen. If magnification decreases as the distance from the center increases,
then barrel distortion is observed. When magnification increases with distance,
pincushion distortion is observed (see Fig. 9.24).
Figure 9.24 Distortion occurs when magnification is not constant across an extended
image. (Panels: undistorted; barrel distortion; pincushion distortion.)
All lenses will exhibit some combination of the aberrations listed above (i.e.
9.A Aberrations and Ray Tracing 251
Exercises
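P9.1 Consider the index described in Example 9.1. The solution given in
the example corresponds to rays that asymptotically approach y = 0. A
more general solution is given by
$$\nabla R = n_0 \left( \hat{\mathbf{x}} \sqrt{1 + \alpha} \pm \hat{\mathbf{y}} \sqrt{y^2/h^2 - \alpha} \right) \qquad \left(1 + \alpha > 0 \text{ and } y^2/h^2 - \alpha > 0\right)$$
This corresponds to rays that either hit the ground or return toward the
sky without reaching the ground, depending on the sign of $\alpha$.
(a) Verify that $\nabla R$ satisfies the eikonal equation and determine the
function $R(x, y)$.
HINT: $\int \sqrt{\xi^2 - \alpha}\; d\xi = \frac{\xi}{2}\sqrt{\xi^2 - \alpha} - \frac{\alpha}{2} \ln\left(\xi + \sqrt{\xi^2 - \alpha}\right)$ $(\xi > 0)$.
(b) Verify that the light path is given by $y = h\sqrt{\alpha}\, \cosh\left(\frac{x - x_0}{h\sqrt{1 + \alpha}}\right)$ when
$\alpha > 0$ or is given by $y = h\sqrt{|\alpha|}\, \sinh\left(\frac{x - x_0}{h\sqrt{1 + \alpha}}\right)$ when $\alpha < 0$. Consider only
the region y > 0 (i.e. above ground). Notice that these solutions can
make rays that travel either to the right or to the left.
HINT: $\cosh^2\phi - \sinh^2\phi = 1$, $\frac{d}{d\phi}\cosh\phi = \sinh\phi$, $\frac{d}{d\phi}\sinh\phi = \cosh\phi$.
(c) Make a sketch of these two solution classes in the case of $\alpha = \pm 1/4$.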
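P9.2 Prove that under the approximation of very short wavelength, the
Poynting vector is directed along $\nabla R(\mathbf{r})$ or $\hat{\mathbf{s}}$.
Solution: (partial)
We have $\mathbf{E}(\mathbf{r}, t) = \mathbf{E}_0(\mathbf{r})\, e^{i(k_{\rm vac} R(\mathbf{r}) - \omega t)}$, and from Faraday's law (1.35) we have
$$\begin{aligned} \mathbf{B}(\mathbf{r}, t) &= -\frac{i}{\omega} \nabla \times \left[ \mathbf{E}_0(\mathbf{r})\, e^{i(k_{\rm vac} R(\mathbf{r}) - \omega t)} \right] \\ &= -\frac{i}{\omega} \left\{ e^{i(k_{\rm vac} R(\mathbf{r}) - \omega t)} \left[ \nabla \times \mathbf{E}_0(\mathbf{r}) \right] + i k_{\rm vac}\, e^{i(k_{\rm vac} R(\mathbf{r}) - \omega t)} \left[ \nabla R(\mathbf{r}) \times \mathbf{E}_0(\mathbf{r}) \right] \right\} \\ &= -i \frac{\lambda_{\rm vac}}{2\pi c}\, e^{i[k_{\rm vac} R(\mathbf{r}) - \omega t]} \left[ \nabla \times \mathbf{E}_0(\mathbf{r}) \right] + \frac{1}{c}\, e^{i[k_{\rm vac} R(\mathbf{r}) - \omega t]} \left[ \nabla R(\mathbf{r}) \times \mathbf{E}_0(\mathbf{r}) \right] \end{aligned}$$
The first term vanishes in the limit of very short wavelength, so we simply write
$$\mathbf{B}(\mathbf{r}, t) \cong \frac{1}{c} \left[ \nabla R(\mathbf{r}) \right] \times \mathbf{E}_0(\mathbf{r})\, e^{i[k_{\rm vac} R(\mathbf{r}) - \omega t]}.$$
$$\begin{aligned} \mathbf{S} &= \frac{1}{\mu_0}\, \mathrm{Re}\{\mathbf{E}(\mathbf{r}, t)\} \times \mathrm{Re}\{\mathbf{B}(\mathbf{r}, t)\} \\ &= \frac{1}{4\mu_0} \left[ \mathbf{E}(\mathbf{r}, t) + \mathbf{E}^*(\mathbf{r}, t) \right] \times \left[ \mathbf{B}(\mathbf{r}, t) + \mathbf{B}^*(\mathbf{r}, t) \right] \end{aligned}$$
The BAC-CAB rule (P0.3) will come in handy along with the fact that $\nabla R(\mathbf{r}) \cdot \mathbf{E}_0(\mathbf{r}) \cong 0$. To confirm
the latter, we employ Gauss's law (1.33) and the constitutive relation (2.16) as follows:
$$\nabla \cdot \left\{ \left[1 + \chi(\mathbf{r})\right] \mathbf{E}_0(\mathbf{r})\, e^{i(k_{\rm vac} R(\mathbf{r}) - \omega t)} \right\} = 0$$
$$e^{i(k_{\rm vac} R(\mathbf{r}) - \omega t)}\, \nabla \cdot \left\{ \left[1 + \chi(\mathbf{r})\right] \mathbf{E}_0(\mathbf{r}) \right\} + i k_{\rm vac}\, e^{i(k_{\rm vac} R(\mathbf{r}) - \omega t)} \left[1 + \chi(\mathbf{r})\right] \left[ \nabla R(\mathbf{r}) \cdot \mathbf{E}_0(\mathbf{r}) \right] = 0$$
After canceling the common exponential factor, using $k_{\rm vac} = 2\pi/\lambda_{\rm vac}$, and performing some
algebra, we get
$$-i \frac{\lambda_{\rm vac}}{2\pi}\, \frac{\nabla \cdot \left\{ \left[1 + \chi(\mathbf{r})\right] \mathbf{E}_0(\mathbf{r}) \right\}}{1 + \chi(\mathbf{r})} + \nabla R(\mathbf{r}) \cdot \mathbf{E}_0(\mathbf{r}) = 0 \quad\Rightarrow\quad \nabla R(\mathbf{r}) \cdot \mathbf{E}_0(\mathbf{r}) \cong 0$$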
P9.3 A mirage can be created using a pan of ice water placed above a hotplate
with a small air gap,$^{13}$ as shown in Fig. 9.25. Suppose a thermocouple
measures a uniform temperature gradient from 100 °C to 300 °C over
a distance 4 mm. A narrow laser beam travels d = 16 cm through the
center of the gap where T = 200 °C. How far is the laser beam deflected
laterally (D) after traveling an additional distance L = 40 m? Assume
that the index of refraction follows $n = 1 + \alpha P/T$, where $\alpha = 7.8 \times 10^{-7}$ K/Pa
and $P = 1\ \text{atm} = 1.013 \times 10^{5}$ Pa and T is the temperature in Kelvin.
Figure 9.25 Setup to deflect a laser beam between a hotplate and pan of ice
water. (Figure labels: laser; ice water; hotplate; jack stand; screen; deflected beam;
undeflected beam; D; not to scale.)
HINT: Consider parallel rays within the laser beam, separated by a small
lateral displacement $\Delta y$. The difference in optical path length $\Delta OPL$
while crossing the hotplate compared to $\Delta y$ matches approximately
the lateral displacement D compared to L.
13 L. Richey, B. Stewart, and J. Peatross, Creating and Analyzing a Mirage, Phys. Teach. 44,
460-464 (2006).
P9.4 Use Fermat's Principle to derive the law of reflection (3.6) for a reflective
surface.
HINT: Do not consider light that goes directly from A to B; require a
single bounce.
P9.5 Show that Fermat's Principle fails to give the correct path for an extraordinary
ray entering a uniaxial crystal whose optic axis is perpendicular
to the surface.
HINT: With the index given by (5.29), show that Fermat's principle
leads to an answer that neither agrees with the direction of the k-vector
(5.32) nor with the direction of the Poynting vector (5.40).
P9.6 Derive the ABCD matrix that takes a ray on a round trip through a
simple laser cavity consisting of a flat mirror and a concave mirror of
radius R separated by a distance L. HINT: Start at the flat mirror. Use
the matrix in (9.28) to travel a distance L. Use the matrix in (9.38) to
represent reflection from the curved mirror. Then use the matrix in
(9.28) to return to the flat mirror. The matrix for reflection from the flat
mirror is the identity matrix (i.e. $R_{\rm flat} \to \infty$).
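P9.7 Derive the ABCD matrix for a thick lens made of material $n_2$ surrounded
by a liquid of index $n_1$. Let the lens have curvatures $R_1$ and $R_2$
and thickness d.
Answer:
$$\begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} 1 + \frac{d}{R_1}\left(\frac{n_1}{n_2} - 1\right) & d\, \frac{n_1}{n_2} \\ -\left(\frac{n_2}{n_1} - 1\right)\left(\frac{1}{R_1} - \frac{1}{R_2}\right) + \frac{d}{R_1 R_2}\left(2 - \frac{n_1}{n_2} - \frac{n_2}{n_1}\right) & 1 - \frac{d}{R_2}\left(\frac{n_1}{n_2} - 1\right) \end{bmatrix}$$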
P9.8 (a) Show that the ABCD matrix for a thick lens given in table 9.1 reduces
to that of a thin lens (9.45) when the thickness goes to zero.
(b) Starting from the ABCD matrix of a thick lens in table 9.1, deduce
the ABCD matrix for a thick window (thickness d ). HINT: A window
may be thought of as a thick lens with infinite radii of curvature.
P9.9 Show that the matrix for a thick lens can be derived by sandwiching a
window between two thin lenses.
HINT: All relevant formulas appear in table 9.1. Let the thin lenses each
have a planar side adjacent to the window. This gives focal lengths
$\frac{1}{f_1} = \frac{(n - 1)}{R_1}$ and $\frac{1}{f_2} = -\frac{(n - 1)}{R_2}$, where $R_1$ is the radius of the first surface of
lens 1, and $R_2$ is the radius of the second surface of lens 2 (negative if
convex).
P9.14 (a) Consider a thick lens (see Fig. 9.31) with d = 5 cm, $R_1$ = 5 cm,
$R_2$ = 10 cm, n = 1.5. Compute the ABCD matrix of the lens.
(b) Where are the principal planes located and what is the effective
focal length $f_{\rm eff}$ for this system?
L9.15 Deduce the positions of the principal planes and the effective focal
length of a compound lens system. Reference the positions of the
principal planes to the outside ends of the metal hardware that encloses
the lens assembly. (video)
HINT: Obtain three sets of distances to the object and image planes
and place the data into (9.59) to create three distinct equations for the
unknowns A, B, C, and D. Find A, B, and C in terms of D and place
the results into (9.52) to pin down D. The effective focal length and
principal planes can then be found through (9.64)-(9.66).
P9.16 Use a computer program to calculate the ABCD matrix for the compound
system shown in Fig. 9.33, known as the "Tessar lens." The
details of this lens are as follows (all distances are in the same units,
and only the magnitude of curvatures are given; you decide the sign):
Convex-convex lens 1 (thickness 0.357, $R_1$ = 1.628, $R_2$ = 27.57, n =
1.6116) is separated by 0.189 from concave-concave lens 2 (thickness
0.081, $R_1$ = 3.457, $R_2$ = 1.582, n = 1.6053), which is separated by 0.325
from plano-concave lens 3 (thickness 0.217, $R_1 = \infty$, $R_2$ = 1.920, n =
1.5123), which is directly followed by convex-convex lens 4 (thickness
0.396, $R_1$ = 1.920, $R_2$ = 2.400, n = 1.6116).
P9.17 (a) Show that the cavity depicted in Fig. 9.17c is stable if
$$0 < \left(1 - \frac{L}{R_1}\right)\left(1 - \frac{L}{R_2}\right) < 1$$
(b) The two concave mirrors have radii $R_1$ = 60 cm and $R_2$ = 100 cm.
Over what range of mirror separation L is it possible to form a stable
laser cavity?
HINT: There are two different stable ranges with an unstable range
between them.
P9.18 Find the stable ranges for $L_1 = L_2 = L$ for the laser cavity depicted in
Fig. 9.17d with focal length f = 50 cm.
L9.19 Experimentally determine the stability range of a HeNe laser with adjustable
end mirrors. Check that this agrees reasonably well with theory.
Can you think of reasons for any discrepancy? (video)
Chapter 10
Diffraction
In the 1600s, Christiaan Huygens developed a wave description for light. Unfortunately, his ideas were largely overlooked at the time because Sir Isaac Newton promoted a competing theory. Newton proposed that light should be thought of as many tiny bullets, or “corpuscles,” as he called them. Newton's ideas prevailed for more than a century, perhaps because he was right on so many other things, until 1807 when Thomas Young performed his famous two-slit experiment, conclusively demonstrating the wave nature of light. Even then, Young's conclusions were accepted only gradually by others, a notable exception being a young Frenchman named Augustin Fresnel. The two formed a close friendship through correspondence, and it was Fresnel who followed up on Young's conclusions and dedicated his life to a study of light.
Fresnel's skill as a mathematician allowed him to transform physical intuition into powerful and concise ideas. Perhaps Fresnel's greatest accomplishment was the adaptation of Huygens' principle of wavelet superposition into a mathematical formula. Ironically, he used Newton's calculus to achieve this. Huygens' principle asserts that a wave front can be thought of as many wavelets, which propagate and interfere to form new wave fronts. This is illustrated in Fig. 10.1. The phenomenon of diffraction is then understood as the spilling of wavelets around obstructions in the path of light.

After formulating Huygens' principle as a diffraction integral, Fresnel made an approximation to his own formula, called the Fresnel approximation, for the sake of making the integration easier to perform. As far as approximations go, the Fresnel approximation is surprisingly accurate in describing the light field in the region downstream from an aperture. The diffraction pattern can evolve in complicated ways as the distance from an aperture increases. At distances far downstream from an aperture, the diffraction pattern acquires a final form that no longer evolves, other than to grow in proportion to distance. This far-field limit is often of interest, and it turns out that the Fresnel diffraction formula can be simplified further in this case. The far-field limit of the Fresnel diffraction formula is called the Fraunhofer approximation.

From the modern perspective, Fresnel's diffraction formula needs justification starting from Maxwell's equations. The diffraction formula is based on scalar diffraction theory, which ignores polarization effects. In some situations, ignoring polarization is benign, but in other situations, ignoring polarization effects produces significant errors. These issues as well as the approximations leading to scalar diffraction theory are discussed in section 10.2.

Christiaan Huygens (1629–1695, Dutch) was born in The Hague, Netherlands. His father was friends with the mathematician René Descartes, which probably influenced his upbringing. Huygens studied law and mathematics at the University of Leiden, which preceded a very productive career as a scientist and mathematician. During mid career, Huygens held a position in the French Academy of Sciences in Paris for 15 years, but he spent the majority of his life in The Hague. Huygens was the first to advocate the wave theory of light. He was able to explain birefringence in terms of his wave theory assuming a refractive index that varied with direction. Huygens constructed a telescope with which he discovered Saturn's moon Titan. He also made the first detailed observations of the Orion nebula. Huygens made significant advancements in clock-making technology and wrote a book on probability theory. Huygens was one of the earliest science-fiction writers and speculated that life exists on other planets in his book Cosmotheoros. (Wikipedia)
$$E(x, y, z) = -\frac{i}{\lambda} \iint\limits_{\text{aperture}} E(x', y', 0)\, \frac{e^{ikR}}{R}\, dx'\, dy' \tag{10.1}$$
where
$$R = \sqrt{(x - x')^2 + (y - y')^2 + z^2} \tag{10.2}$$
is the radius of each wavelet as it individually intersects the point $(x, y, z)$. We will call (10.1) the Huygens-Fresnel² diffraction formula, although Fresnel is credited with this integral formulation. The factor $-i/\lambda$ in front of the integral in (10.1) ensures the right phase and field strength (not to mention correct units). Justification for this factor is given in section 10.3 and in appendix 10.A. To summarize, (10.1) tells us how to compute the field downstream given knowledge of the field in an aperture. The field at each point $(x', y')$ in the aperture, which may vary in strength and phase, is treated as the source for a spherical wave. The integral in (10.1) sums the contributions from all of these wavelets.
Figure 10.2
1 For simplicity, we use the term “spherical wave” in this book to refer to waves of the type imagined by Huygens (i.e. of the form $e^{ikR}/R$). There is a different family of waves based on spherical harmonics that are also sometimes referred to as spherical waves. These waves have angular as well as radial dependence, and they are solutions to Maxwell's equations. See J. D. Jackson, Classical Electrodynamics, 3rd ed., pp. 429–432 (New York: John Wiley, 1999).
2 M. Born and E. Wolf, Principles of Optics, 7th ed., p. 414 (Cambridge University Press, 1999).
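Example 10.1
Find the on-axis (i.e. $x, y = 0$) intensity following a circular aperture of diameter $D$ illuminated by a uniform plane wave.
$$E(0, 0, z) = -\frac{i E_0}{\lambda} \int_0^{2\pi} d\phi' \int_0^{D/2} \frac{e^{ik\sqrt{\rho'^2 + z^2}}}{\sqrt{\rho'^2 + z^2}}\, \rho'\, d\rho' = -\frac{2\pi i E_0}{\lambda} \left. \frac{e^{ik\sqrt{\rho'^2 + z^2}}}{ik} \right|_0^{D/2} = -E_0 \left( e^{ik\sqrt{(D/2)^2 + z^2}} - e^{ikz} \right)$$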
wide plane wave. One simply computes the diffraction of the blocked portion
of the field as though it came from an opening in a mask. The result is then
subtracted from the plane wave (no integration needed for the plane wave), as
depicted in Fig. 10.6. This is known as Babinet's principle.
When Fresnel first presented his diffraction formula to the French Academy of Sciences, a certain judge of scientific papers named Siméon Poisson noticed that Fresnel's formula predicted that there should be light in the center of the geometric shadow behind a circular obstruction. This seemed so absurd to Poisson that he initially disbelieved the theory, until the spot was shortly thereafter experimentally confirmed, much to Poisson's chagrin. Needless to say, Fresnel's paper was then awarded first prize, and this spot appearing behind circular blocks has since been known as Poisson's spot.
Figure 10.6 Side view of a circular block in a plane wave giving rise to diffraction in the geometric shadow.
Example 10.2
Find the on-axis (i.e. $x, y = 0$) intensity behind a circular block of diameter $D$ placed in a uniform plane wave.
Solution:
From Example 10.1, the on-axis field behind a circular aperture is $E_0 \left( e^{ikz} - e^{ik\sqrt{(D/2)^2 + z^2}} \right)$. Babinet's principle says to subtract this result from a plane wave to obtain the field behind the circular block. The situation is depicted in Fig. 10.6. The on-axis field is then
$$E(0, 0, z) = E_0 e^{ikz} - E_0 \left( e^{ikz} - e^{ik\sqrt{(D/2)^2 + z^2}} \right) = E_0 e^{ik\sqrt{(D/2)^2 + z^2}}$$
In the exact center of the shadow behind the circular obstruction, the intensity is the same as the illuminating plane wave for all distances $z$. A spot of light in the center forms right away; no wonder Poisson was astonished!
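As a quick numerical check of Examples 10.1 and 10.2, the sketch below evaluates the closed-form on-axis fields and compares intensities relative to the incident plane wave. The function names and the sample values (500 nm wavelength, 1 mm diameter) are illustrative assumptions, not values taken from the text.

```python
import cmath
import math

def aperture_field(z, D, wavelength, E0=1.0):
    """On-axis field behind a circular aperture (closed form of Example 10.1)."""
    k = 2 * math.pi / wavelength
    R_edge = math.sqrt((D / 2) ** 2 + z ** 2)  # distance from the aperture rim to the axis point
    return -E0 * (cmath.exp(1j * k * R_edge) - cmath.exp(1j * k * z))

def block_field(z, D, wavelength, E0=1.0):
    """On-axis field behind a circular block via Babinet's principle (Example 10.2)."""
    k = 2 * math.pi / wavelength
    plane_wave = E0 * cmath.exp(1j * k * z)
    return plane_wave - aperture_field(z, D, wavelength, E0)

def relative_intensity(field, E0=1.0):
    """Intensity divided by that of the unobstructed plane wave."""
    return abs(field) ** 2 / abs(E0) ** 2

wavelength = 500e-9  # assumed visible wavelength (m)
D = 1e-3             # assumed diameter (m)
for z in (0.01, 0.1, 0.5, 2.0):
    print(z,
          relative_intensity(aperture_field(z, D, wavelength)),
          relative_intensity(block_field(z, D, wavelength)))
```

Behind the block the relative intensity stays at one for every distance, while behind the aperture it swings between 0 and 4 as the two phase terms drift in and out of step.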
where $k \equiv n\omega/c$ is the magnitude of the usual wave vector (see also (9.2)). Equation (10.4) is called the Helmholtz equation. Again, it is merely the wave equation written for the case of a single frequency, where the trivial time dependence has been removed. To obtain the full wave solution, just append the factor $e^{-i\omega t}$ to the solution of (10.4).
At this point we take an egregious step: We ignore the vectorial nature of $\mathbf{E}(\mathbf{r})$ and write (10.4) using only the magnitude $E(\mathbf{r})$. When using scalar diffraction theory, we must keep in mind that it is based on this serious step. Under the scalar approximation, the vector Helmholtz equation (10.4) becomes the scalar Helmholtz equation:
$$\nabla^2 E(\mathbf{r}) + k^2 E(\mathbf{r}) = 0 \tag{10.5}$$
This equation of course is consistent with (10.4) in the case of a plane wave. However, we are interested in spherical waves of the form $E(r) = E_0 r_0 e^{ikr}/r$. It turns out that such spherical waves are exact solutions to the scalar Helmholtz equation (10.5). The proof is left as an exercise (see P10.3). Nevertheless, spherical waves of this form only approximately satisfy the vector Helmholtz equation (10.4). We can get away with this sleight of hand if the radius $r$ is large compared to a wavelength (i.e., $kr \gg 1$) and if we restrict $\mathbf{r}$ to a narrow range perpendicular to the polarization.

Significance of the Scalar Wave Approximation

The solution of the scalar Helmholtz equation is not completely unassociated with the solution to the vector Helmholtz equation. In fact, if $E_{\text{scalar}}(\mathbf{r})$ obeys the scalar Helmholtz equation (10.5), then
$$\mathbf{E}(\mathbf{r}) = \mathbf{r} \times \nabla E_{\text{scalar}}(\mathbf{r}) \tag{10.6}$$
obeys the vector Helmholtz equation (10.4).
Consider a spherical wave, which is a solution to the scalar Helmholtz equation:
$$E_{\text{scalar}}(\mathbf{r}) = E_0 r_0 e^{ikr}/r \tag{10.7}$$
Remarkably, when this expression is placed into (10.6) the result is zero. Although zero is in fact a solution to the vector Helmholtz equation, it is not very interesting. A more interesting solution to the scalar Helmholtz equation is
$$E_{\text{scalar}}(\mathbf{r}) = r_0 E_0 \left(1 + \frac{i}{kr}\right) \frac{e^{ikr}}{r} \cos\theta \tag{10.8}$$
which is one of an infinite number of unique “spherical” solutions that exist. Notice that in the limit of large $r$, this expression looks similar to (10.7), aside from the factor $\cos\theta$. The vector form of this field according to (10.6) is
$$\mathbf{E}(\mathbf{r}) = -\hat{\boldsymbol{\phi}}\, r_0 E_0 \left(1 + \frac{i}{kr}\right) \frac{e^{ikr}}{r} \sin\theta \tag{10.9}$$
This field looks approximately like the scalar spherical wave solution (10.7) in the limit of large $r$ if the angle is chosen to lie near $\theta \cong \pi/2$ (spherical coordinates). Since our use of the scalar Helmholtz equation is in connection with this spherical wave under these conditions, the results are close to those obtained from the vector Helmholtz equation.

François Jean Dominique Arago (1786–1853, French) was born in Catalan France, where his father was the Treasurer of the Mint. As a teenager, Arago was sent to a municipal college in Perpignan where he developed a deep interest in mathematics. In 1803, he entered the École Polytechnique in Paris, where he purportedly was disappointed that he was not presented with new knowledge at a higher rate. He associated with famous French mathematicians Siméon Poisson and Pierre-Simon Laplace. He later worked with Jean-Baptiste Biot to measure the meridian arc to determine the exact length of the meter. This work took him to the Balearic Islands, Spain, where he was imprisoned as a spy, being suspected because of lighting fires atop a mountain as part of his surveying efforts. After a heroic prison escape and a subsequent string of misfortunes, he eventually made it back to France where he took a strong interest in optics and the wave theory of light. Arago and Fresnel established a fruitful collaboration that extended for many years. It was Arago who demonstrated Poisson's spot (sometimes called Arago's spot). Arago also invented the first polarizing filter. In later life, he served a brief stint as the French prime minister. (Wikipedia)
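The claim that the spherical wave $e^{ikr}/r$ solves the scalar Helmholtz equation can be spot-checked numerically. The sketch below is only a finite-difference check at a few radii, not a proof; it uses the fact that for a function of $r$ alone the Laplacian reduces to $(1/r)\,d^2(rE)/dr^2$. The function names, the step size, and the choice $k = 2\pi$ are illustrative assumptions.

```python
import cmath
import math

def spherical_wave(r, k, E0=1.0, r0=1.0):
    """Scalar spherical wave E0 * r0 * exp(i k r) / r."""
    return E0 * r0 * cmath.exp(1j * k * r) / r

def helmholtz_residual(r, k, h=1e-4):
    """Finite-difference estimate of (laplacian + k^2) E at radius r.

    For a purely radial function, laplacian E = (1/r) d^2(r E)/dr^2.
    """
    def u(s):
        return s * spherical_wave(s, k)
    d2u = (u(r + h) - 2 * u(r) + u(r - h)) / h ** 2  # central second difference
    return d2u / r + k ** 2 * spherical_wave(r, k)

k = 2 * math.pi  # wavelength of one length unit (arbitrary choice)
for r in (0.5, 3.0, 20.0):
    scale = k ** 2 * abs(spherical_wave(r, k))
    print(r, abs(helmholtz_residual(r, k)) / scale)
```

The printed ratios are limited by the finite-difference step rather than by the wave itself, so they shrink as `h` is reduced (until round-off takes over).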
Fresnel developed his diffraction formula (10.1) a half century before Maxwell assembled the equations of electromagnetic theory. In 1887, Gustav Kirchhoff demonstrated that Fresnel's diffraction formula satisfies the scalar Helmholtz equation. In doing this he clearly showed the approximations implicit in the theory, and made a slight revision to the formula:
$$E(x, y, z) = -\frac{i}{\lambda} \iint\limits_{\text{aperture}} E(x', y', 0)\, \frac{e^{ikR}}{R} \left[ \frac{1 + \cos(\mathbf{R}, \hat{\mathbf{z}})}{2} \right] dx'\, dy' \tag{10.10}$$
The factor in square brackets, Kirchhoff's revision, is known as the obliquity factor. Here, $\cos(\mathbf{R}, \hat{\mathbf{z}})$ indicates the cosine of the angle between $\mathbf{R}$ and $\hat{\mathbf{z}}$. Notice that this factor is approximately equal to one when the point $(x, y, z)$ is chosen to be in the forward direction; we usually study diffraction under this circumstance. On the other hand, the obliquity factor equals zero for fields traveling in the reverse direction (i.e. in the $-\hat{\mathbf{z}}$ direction). This fixes a problem with Fresnel's version of the formula (10.1) based on Huygens' wavelets, which suggested that light could as easily diffract in the reverse direction as in the forward direction.
Figure 10.7
In honor of Kirchhoff's work, (10.10) is referred to as the Fresnel-Kirchhoff diffraction formula. The details of Kirchhoff's more rigorous derivation, including how the factor $-i/\lambda$ naturally arises, are given in appendix 10.A. Since the Fresnel-Kirchhoff formula can be understood as a superposition of spherical waves, it is not surprising that it satisfies the scalar Helmholtz equation (10.5).
$$R \cong z \quad \text{(denominator only; Fresnel approximation)} \tag{10.11}$$
Example 10.3
Compute the Fresnel diffraction field following a rectangular aperture (dimensions $\Delta x$ by $\Delta y$) illuminated by a uniform plane wave.
Figure 10.8 Field amplitude following a rectangular aperture computed in the Fresnel approximation.
If we assume that the light coming through the aperture is highly directional, such that it propagates mainly in the $z$-direction, we are motivated to write the field as $E(x, y, z) = \tilde{E}(x, y, z)\, e^{ikz}$. Upon substitution of this into the scalar Helmholtz equation (10.5), we arrive at
$$\frac{\partial^2 \tilde{E}}{\partial x^2} + \frac{\partial^2 \tilde{E}}{\partial y^2} + 2ik \frac{\partial \tilde{E}}{\partial z} + \frac{\partial^2 \tilde{E}}{\partial z^2} = 0 \tag{10.14}$$
At this point we make the paraxial wave approximation,⁵ which is $\left| 2k\, \partial \tilde{E}/\partial z \right| \gg \left| \partial^2 \tilde{E}/\partial z^2 \right|$. That is, we assume that the amplitude of the field varies slowly in the $z$-direction such that the wave looks much like a plane wave. We permit the amplitude to change as the wave propagates in the $z$-direction as long as it does so on a scale much longer than a wavelength. This leads to the paraxial wave equation:
$$\left( \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + 2ik \frac{\partial}{\partial z} \right) \tilde{E}(x, y, z) \cong 0 \quad \text{(paraxial wave equation)} \tag{10.15}$$
It turns out that the Fresnel approximation (10.13) is an exact solution to the paraxial wave equation (see P10.5). That is, (10.15) is satisfied by
$$\tilde{E}(x, y, z) \cong -\frac{i}{\lambda z} \iint E(x', y', 0)\, e^{i\frac{k}{2z}\left[ (x - x')^2 + (y - y')^2 \right]}\, dx'\, dy' \tag{10.16}$$
5 P. W. Milonni and J. H. Eberly, Lasers, Sect. 14.4 (New York: Wiley, 1988).
6 Since the Fraunhofer approximation is easier to use, many textbooks present it before the Fresnel approximation.
7 J. W. Goodman, Introduction to Fourier Optics, p. 61 (New York: McGraw-Hill, 1968).
Obviously, the removal of $e^{i\frac{k}{2z}(x'^2 + y'^2)}$ from the integrand improves our chances of being able to perform the integration analytically. In fact the integral can be interpreted as a two-dimensional (inverse) Fourier transform on the aperture field $E(x', y', 0)$, where $kx/z$ and $ky/z$ can be thought of as spatial frequencies.
Example 10.4
Compute the Fraunhofer diffraction pattern following a rectangular aperture (dimensions $\Delta x$ by $\Delta y$) illuminated by a uniform plane wave.
It is left as an exercise (see P10.6) to perform the integration and compute the intensity. The result turns out to be
$$I(x, y, z) = I_0 \frac{\Delta x^2 \Delta y^2}{\lambda^2 z^2}\, \text{sinc}^2\!\left( \frac{\pi \Delta x}{\lambda z} x \right) \text{sinc}^2\!\left( \frac{\pi \Delta y}{\lambda z} y \right) \tag{10.20}$$
where $\text{sinc}(\alpha) \equiv \sin\alpha / \alpha$. Note that $\lim_{\alpha \to 0} \text{sinc}(\alpha) = 1$.
Figure 10.9 Fraunhofer diffraction pattern (field amplitude) generated by a uniformly illuminated rectangular aperture with a height twice the width.
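The rectangular-aperture result (10.20) is easy to evaluate directly. The following is a minimal sketch, assuming the form of (10.20) with $\text{sinc}(\alpha) = \sin\alpha/\alpha$; the function names and the sample numbers (a 0.1 mm by 0.2 mm aperture, 500 nm light, a screen 10 m away) are illustrative choices rather than values from the text.

```python
import math

def sinc(alpha):
    """sinc(alpha) = sin(alpha)/alpha, with the limiting value 1 at alpha = 0."""
    return 1.0 if alpha == 0 else math.sin(alpha) / alpha

def rect_fraunhofer_intensity(x, y, z, dx, dy, wavelength, I0=1.0):
    """Evaluate (10.20) for a dx-by-dy aperture under uniform illumination."""
    prefactor = I0 * (dx * dy / (wavelength * z)) ** 2
    ax = math.pi * dx * x / (wavelength * z)
    ay = math.pi * dy * y / (wavelength * z)
    return prefactor * sinc(ax) ** 2 * sinc(ay) ** 2

wavelength, z = 500e-9, 10.0   # assumed wavelength (m) and screen distance (m)
dx, dy = 1e-4, 2e-4            # assumed aperture width and height (m)
first_zero_x = wavelength * z / dx  # sinc argument equals pi here
print(rect_fraunhofer_intensity(0.0, 0.0, z, dx, dy, wavelength))
print(rect_fraunhofer_intensity(first_zero_x, 0.0, z, dx, dy, wavelength))
```

The first zero along $x$ falls at $x = \lambda z/\Delta x$, so a narrower aperture dimension produces a wider pattern in that direction, consistent with the caption of Fig. 10.9.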
We are able to perform the integration over $\phi'$ with the help of the formula (0.57):
$$\int_0^{2\pi} e^{-i\frac{k\rho\rho'}{z}\cos(\phi - \phi')}\, d\phi' = 2\pi J_0\!\left( \frac{k\rho\rho'}{z} \right) \tag{10.26}$$
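The azimuthal integral (10.26) can be confirmed numerically. The sketch below compares a trapezoid-rule estimate of the left side with $2\pi J_0$ computed from the standard power series $J_0(a) = \sum_m (-1)^m (a/2)^{2m}/(m!)^2$. The helper names, the number of series terms, and the sample arguments are assumptions made for illustration; the symbol `a` stands for $k\rho\rho'/z$.

```python
import cmath
import math

def bessel_j0(a, terms=40):
    """Power series for J0(a); adequate for moderate |a| (roughly |a| < 15)."""
    total, term = 0.0, 1.0
    for m in range(terms):
        total += term
        term *= -((a / 2) ** 2) / (m + 1) ** 2  # ratio of successive series terms
    return total

def azimuthal_integral(a, phi=0.0, n=400):
    """Trapezoid-rule estimate of the left side of (10.26) with a = k*rho*rho'/z."""
    dphi = 2 * math.pi / n
    return sum(cmath.exp(-1j * a * math.cos(phi - j * dphi)) for j in range(n)) * dphi

for a in (0.5, 3.0, 7.5):
    print(a, azimuthal_integral(a, phi=0.7), 2 * math.pi * bessel_j0(a))
```

Because the integrand is periodic, the plain trapezoid rule converges very quickly here, and the result does not depend on the observation angle $\phi$, which is why the diffraction pattern of a cylindrically symmetric aperture is itself cylindrically symmetric.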
Example 10.5
Compute the Fresnel and Fraunhofer diffraction patterns following a circular aperture (diameter $D$) illuminated by a uniform plane wave.
Figure 10.10 Field amplitude following a circular aperture computed in the Fresnel approximation.
Solution: According to (10.27), the field downstream is
$$E(\rho, z) = -\frac{2\pi i E_0\, e^{ikz} e^{i\frac{k\rho^2}{2z}}}{\lambda z} \int_0^{D/2} \rho'\, d\rho'\, e^{i\frac{k\rho'^2}{2z}} J_0\!\left( \frac{k\rho\rho'}{z} \right)$$
In the Fraunhofer approximation (10.28), the factor $e^{i\frac{k\rho'^2}{2z}}$ is dropped from the integrand, and the resulting integral can be performed analytically (with the aid of (0.58)). It is left as an exercise to perform the integration and to show that the intensity of the Fraunhofer pattern is
$$I(\rho, z) = I_0 \left( \frac{\pi D^2}{4\lambda z} \right)^2 \left[ 2\, \frac{J_1(kD\rho/2z)}{kD\rho/2z} \right]^2 \tag{10.29}$$
I Z
U 2V V 2U d v
U V da = (10.30)
n n
S V
V e i kr /r
(10.31)
U E (r)
where E (r) is assumed to satisfy the scalar Helmholtz equation, (10.5). When
these functions are used in Greens theorem (10.30), we obtain
I " # Z " #
e i kr e i kr E 2e
i kr
e i kr 2
E da = E E dv (10.32)
n r r n r r
S V
e i kr e i kr 2 e i kr e i kr 2
E 2 E = k 2 E + k E =0 (10.33)
r r r r
8 Most authors define the jinc without the factor of 2, which gives the inconvenient normalization
where we have taken advantage of the fact that E (r) and e i kr /r both satisfy (10.5).
This is exactly the reason for our judicious choices of the functions V and U since
with them we were able to make half of (10.30) disappear. We are left with
I " #
e i kr e i kr E
E da = 0 (10.34)
n r r n
S
Now consider a volume between a small sphere of radius $\epsilon$ at the origin and an outer surface of arbitrary shape. The total surface that encloses the volume consists of two parts (i.e. $S = S_1 + S_2$ as depicted in Fig. 10.12).
When we apply (10.34) to the surface in Fig. 10.12, we have
$$\oint_{S_2} \left[ E \frac{\partial}{\partial n} \frac{e^{ikr}}{r} - \frac{e^{ikr}}{r} \frac{\partial E}{\partial n} \right] da = -\oint_{S_1} \left[ E \frac{\partial}{\partial n} \frac{e^{ikr}}{r} - \frac{e^{ikr}}{r} \frac{\partial E}{\partial n} \right] da \tag{10.35}$$
This geometry with multiple surfaces is motivated by the hope of finding the field at the origin (inside the little sphere) from knowledge of the field on the outside surface. To this end, we assume that $\epsilon$ is so small that $E(\mathbf{r})$ is approximately the same everywhere on the surface $S_1$. Then the integral over $S_1$ becomes
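where we have used spherical coordinates. Notice that we have employed the chain rule to execute the normal derivative $\partial/\partial n$. Since $\mathbf{r}$ always points opposite to the direction of the surface normal $\hat{\mathbf{n}}$, the normal derivative $\partial r/\partial n$ is always equal to $-1$. We can perform the angular integration in (10.36) as well as take the limit $\epsilon \to 0$:
$$\begin{aligned}
\lim_{\epsilon \to 0} \oint_{S_1} \left[ E \frac{\partial}{\partial n} \frac{e^{ikr}}{r} - \frac{e^{ikr}}{r} \frac{\partial E}{\partial n} \right] da
&= -4\pi \lim_{\epsilon \to 0} \left[ r^2 \left( -\frac{e^{ikr}}{r^2} + ik \frac{e^{ikr}}{r} \right) E - r^2\, \frac{e^{ikr}}{r} \frac{\partial E}{\partial r} \right]_{r = \epsilon} \\
&= -4\pi \lim_{\epsilon \to 0} \left[ \left( -e^{ik\epsilon} + ik\epsilon\, e^{ik\epsilon} \right) E - \epsilon\, e^{ik\epsilon} \frac{\partial E}{\partial r} \right]_{r = \epsilon} \\
&= 4\pi E(0)
\end{aligned} \tag{10.37}$$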
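With the aid of (10.37), Green's theorem applied to our specific geometry reduces to
$$E(0) = \frac{1}{4\pi} \oint_{S_2} \left[ \frac{e^{ikr}}{r} \frac{\partial E}{\partial n} - E \frac{\partial}{\partial n} \frac{e^{ikr}}{r} \right] da \tag{10.38}$$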
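where $\partial r/\partial n = \cos(\mathbf{r}, \hat{\mathbf{n}})$ indicates the cosine of the angle between $\mathbf{r}$ and $\hat{\mathbf{n}}$. We have also assumed that the distance $r$ is much larger than a wavelength in order to drop a term. Next, we assume that the field illuminating the aperture can be written as $E \cong \tilde{E}(x, y)\, e^{ikz}$. This represents a plane-wave field traveling through
$$\frac{\partial E}{\partial n} = \frac{\partial E}{\partial z} \frac{\partial z}{\partial n} \cong ik \tilde{E}(x, y)\, e^{ikz} (-1) = -ikE \tag{10.41}$$
Substituting (10.40) and (10.41) into (10.39) yields
$$E(0) = -\frac{i}{\lambda} \iint\limits_{\text{aperture}} E\, \frac{e^{ikr}}{r} \left[ \frac{1 + \cos(\mathbf{r}, \hat{\mathbf{n}})}{2} \right] da \tag{10.42}$$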
Finally, we wish to rearrange our coordinate system to that depicted in Fig. 10.2. In our derivation, it was less cumbersome to place the origin at a point of interest after the aperture. Now that we have completed our mathematics, we can switch around the coordinate system and place the origin in the plane of the aperture as in Fig. 10.2:
$$E(x, y, z) = -\frac{i}{\lambda} \iint\limits_{\text{aperture}} E(x', y', 0)\, \frac{e^{ikR}}{R} \left[ \frac{1 + \cos(\mathbf{R}, \hat{\mathbf{z}})}{2} \right] dx'\, dy' \tag{10.43}$$
where
$$R = \sqrt{(x - x')^2 + (y - y')^2 + z^2} \tag{10.44}$$
which brings us to the Fresnel-Kirchhoff diffraction formula (10.10).
12 Later Sommerfeld noticed that these two assumptions actually contradict each other, and he revised Kirchhoff's work to be more accurate. In practice this revision makes only a tiny difference as light spills onto the back of the aperture, over a length scale of a wavelength. We will ignore this effect and go with Kirchhoff's (slightly flawed) assumption. For further discussion see J. W. Goodman, Introduction to Fourier Optics, Sect. 3-4 (New York: McGraw-Hill, 1968).
The unit vector $\hat{\mathbf{n}}$ always points normal to the surface of volume $V$ over which the integral is taken. Let the vector function $\mathbf{f}$ be $U \nabla V$, where $U$ and $V$ are both analytical functions of the position coordinate $\mathbf{r}$. Then (10.45) becomes
$$\oint_S (U \nabla V) \cdot \hat{\mathbf{n}}\, da = \int_V \nabla \cdot (U \nabla V)\, dv \tag{10.46}$$
So far we haven't done much. Equation (10.49) is nothing more than the divergence theorem applied to the vector function $U \nabla V$. We can also write an equation similar to (10.49) where $U$ and $V$ are interchanged:
$$\oint_S V \frac{\partial U}{\partial n}\, da = \int_V \left[ \nabla V \cdot \nabla U + V \nabla^2 U \right] dv \tag{10.50}$$
We subtract (10.50) from (10.49), which leads to (10.30) known as Green's theorem.
Exercises
L10.2 (a) Why does the on-axis intensity behind a circular opening fluctuate (see Example 10.1) whereas the on-axis intensity behind a circular obstruction remains constant (see Example 10.2)?
(b) Create a collimated laser beam several centimeters wide. Observe the on-axis intensity on a movable screen (e.g. a hand-held card) behind a small circular aperture and behind a small circular obstruction placed in the beam. (video)
(c) In the case of the circular aperture, measure the distance to several on-axis minima and check that it agrees with (10.3).
Figure 10.15
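$$\nabla^2 = \frac{1}{r^2} \frac{\partial}{\partial r} \left( r^2 \frac{\partial}{\partial r} \right) + \frac{1}{r^2 \sin\theta} \frac{\partial}{\partial \theta} \left( \sin\theta \frac{\partial}{\partial \theta} \right) + \frac{1}{r^2 \sin^2\theta} \frac{\partial^2}{\partial \phi^2}$$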
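P10.4 (a) A vector field is needed to satisfy Maxwell's equations instead of the scalar field in P10.3, whose real part after appending $e^{-i\omega t}$ is
$$E(r) = \frac{A}{r} \cos(kr - \omega t)$$
Let's attempt to create a vector field from this scalar field in the simplest way possible. From experience, we expect a transverse wave, which we take to oscillate in the $\hat{\boldsymbol{\phi}}$ direction:
$$\mathbf{E}(r) = \hat{\boldsymbol{\phi}}\, \frac{A}{r} \cos(kr - \omega t)$$
(i) Show that $\mathbf{E}$ satisfies Gauss's Law (1.1). (ii) Compute the curl of $\mathbf{E}$ in Faraday's Law (1.3) to deduce $\mathbf{B}$. (iii) Show that this $\mathbf{B}$ satisfies Gauss's Law for magnetism (1.2). (iv) Finally, show that the above $\mathbf{E}$ and $\mathbf{B}$ do not satisfy Ampere's law (1.4).
HINT: In spherical coordinates
$$\nabla \cdot \mathbf{E} = \frac{1}{r^2} \frac{\partial}{\partial r} \left( r^2 E_r \right) + \frac{1}{r \sin\theta} \frac{\partial}{\partial \theta} \left( \sin\theta\, E_\theta \right) + \frac{1}{r \sin\theta} \frac{\partial E_\phi}{\partial \phi}$$
$$\nabla \times \mathbf{E} = \hat{\mathbf{r}}\, \frac{1}{r \sin\theta} \left[ \frac{\partial}{\partial \theta} \left( \sin\theta\, E_\phi \right) - \frac{\partial E_\theta}{\partial \phi} \right] + \hat{\boldsymbol{\theta}} \left[ \frac{1}{r \sin\theta} \frac{\partial E_r}{\partial \phi} - \frac{1}{r} \frac{\partial}{\partial r} \left( r E_\phi \right) \right] + \hat{\boldsymbol{\phi}}\, \frac{1}{r} \left[ \frac{\partial}{\partial r} \left( r E_\theta \right) - \frac{\partial E_r}{\partial \theta} \right]$$
$$\mathbf{E}(r, \theta) = \hat{\boldsymbol{\phi}}\, \frac{A \sin\theta}{r} \left[ \cos(kr - \omega t) - \frac{1}{kr} \sin(kr - \omega t) \right]$$
(i.e. the real part of (10.9) with time dependence appended) does satisfy Maxwell's equations. Describe how this wave behaves as a function of $r$ and $\theta$. What conditions need to be satisfied for this equation to be well approximated by the spherical wave in part (a)?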
P10.5 By direct substitution, show that (10.16) satisfies the paraxial wave equation (10.15).
P10.6 Calculate the Fraunhofer diffraction field and intensity patterns for a rectangular aperture (dimensions $\Delta x$ by $\Delta y$) illuminated by a plane wave $E_0$. In other words, derive (10.20).
P10.7 A single narrow slit has a mask placed over it so the aperture function is not a square profile but rather a cosine: $E(x', y', 0) = E_0 \cos(\pi x'/L)$ for $-L/2 < x' < L/2$ and $E(x', y', 0) = 0$ otherwise. Calculate the far-field (Fraunhofer) diffraction pattern. Make a plot of intensity as a function of $kLx/2z$; qualitatively compare the pattern to that of a regular single slit. Do not perform any integration in the $y$ dimension. Write the intensity as being proportional to an $x$-dependent expression.
P10.8 (a) Repeat Example 10.1 to find the on-axis intensity (i.e. $\rho = 0$) after a circular aperture in both the Fresnel approximation (10.27) and the Fraunhofer approximation (10.28).
(b) Make suitable approximations directly to (10.3) to obtain the same answers as in part (a).
(c) Check how well the Fresnel and Fraunhofer approximations work by graphing the Fresnel- and Fraunhofer-approximation results together with (10.3) on a single plot as a function of $z$. Take $D = 10\ \mu\text{m}$ and $\lambda = 500$ nm. To see the result better, use a log scale on the $z$-axis.
Answer:
[Plot: on-axis intensity (vertical axis, 0 to 4) versus $z$ in mm (log scale, $10^{-3}$ to $10^{0}$), with curves labeled Fresnel Approx., Fraunhofer Approximation, and Huygens-Fresnel.]
Figure 10.16 “The Fraunhofer Approximation” by Sterling Cornaby
Figure 10.17 On-axis intensity behind a circular aperture calculated using the Fresnel diffraction formula (10.1), the Fresnel approximation (10.27), and the Fraunhofer approximation (10.28).
P10.9 Calculate the Fraunhofer diffraction intensity pattern (10.29) for a circular aperture (diameter $D$) illuminated by a plane wave $E_0$. That is, repeat Example 10.5 while filling in the integration step. For added benefit, try to do it without peeking.
Chapter 11
Diffraction Applications
with a 1 cm radius (not necessarily circular) is used with visible light, the light must travel more than a kilometer in order to reach the Fraunhofer limit. It may therefore seem unlikely to reach the Fraunhofer limit in a typical optical system, especially if the aperture or beam size is relatively large. Nevertheless, spectrometers, which typically utilize diffraction gratings many centimeters wide, depend on achieving the Fraunhofer limit within the confines of a manageable instrument box. This is accomplished using imaging techniques. The Fraunhofer limit is also naturally reached in other instruments that employ lenses such as telescopes.
Figure 11.1 Diffraction in the far field.
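The kilometer figure quoted above can be reproduced with a one-line estimate. The sketch below assumes the far-field condition in the common form $z \gg k a^2/2 = \pi a^2/\lambda$, where $a$ is the aperture radius, and assumes a 500 nm wavelength to represent visible light; both the form of the condition and the function name are assumptions made for illustration.

```python
import math

def fraunhofer_scale_distance(aperture_radius, wavelength):
    """Distance pi * a^2 / lambda that z must greatly exceed in the far-field limit."""
    return math.pi * aperture_radius ** 2 / wavelength

# Assumed values: 1 cm aperture radius, 500 nm light.
z_scale = fraunhofer_scale_distance(aperture_radius=0.01, wavelength=500e-9)
print(f"z must be much greater than about {z_scale:.0f} m")
```

Since the scale distance grows as the square of the aperture radius, a grating several centimeters wide pushes the unaided far field out to many kilometers, which is why a lens is used instead.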
Consider a lens with focal length $f$ placed in the path of light following an aperture (see Fig. 11.2). Let the lens be placed an arbitrary distance $L$ after the aperture. The lens produces an image of the Fraunhofer pattern at a new location $d_i$ following the lens according to the imaging formula (see (9.56))
$$\frac{1}{f} = \frac{1}{-(z - L)} + \frac{1}{d_i}. \tag{11.2}$$
Keep in mind that the lens interrupts the light before the Fraunhofer pattern has a chance to form. This means that the Fraunhofer diffraction pattern may
Figure 11.2 Imaging of the Fraunhofer diffraction pattern to the focus of a lens.
$$d_i \cong f. \tag{11.3}$$
Thus, a lens makes it very convenient to observe the Fraunhofer diffraction pattern even from relatively large apertures. It is not necessary to let the light propagate for kilometers. We need only observe the pattern at the focus of the lens as shown in Fig. 11.2. Notice that the spacing $L$ between the aperture and the lens is unimportant to this conclusion.
Even though we know that the Fraunhofer diffraction pattern occurs at the focus of a lens, the question remains as to the size of the image. To find the answer, let us examine the magnification (9.57), which is given by
$$M = -\frac{d_i}{-(z - L)} \tag{11.4}$$
Taking the limit of very large $z$ and employing (11.3), the magnification becomes
$$M \to \frac{f}{z} \tag{11.5}$$
This is a remarkable result. When the lens is inserted, the size of the diffraction pattern decreases by the ratio of the lens focal length $f$ to the original distance $z$ to a far-away screen. Since in the Fraunhofer regime the diffraction pattern is proportional to distance (i.e. size $\propto z$), the image at the focus of the lens scales in proportion to the focal length (i.e. size $\propto f$). This means that the angular width of the pattern is preserved! With the lens in place, we can rewrite (11.1) straightaway as
$$I(x, y, L + f) \cong \frac{1}{2} c \epsilon_0 \left| \frac{1}{\lambda f} \iint\limits_{\text{aperture}} E(x', y', 0)\, e^{-i\frac{k}{f}(x x' + y y')}\, dx'\, dy' \right|^2 \tag{11.6}$$
which describes the intensity distribution pattern at the focus of the lens.
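The limiting behavior of the image distance and magnification can be checked with a few lines of arithmetic. This is a sketch under the assumption that the object distance in the imaging formula is $-(z - L)$, so that $1/d_i = 1/f + 1/(z - L)$ and $M = d_i/(z - L)$; the function names and the sample values of $f$, $L$, and $z$ are hypothetical.

```python
def image_distance(z, L, f):
    """Solve 1/f = 1/(-(z - L)) + 1/d_i for the image distance d_i."""
    return 1.0 / (1.0 / f + 1.0 / (z - L))

def magnification(z, L, f):
    """M = -d_i / (-(z - L)), i.e. d_i / (z - L)."""
    return image_distance(z, L, f) / (z - L)

f, L = 0.5, 0.2  # assumed focal length and aperture-to-lens spacing (m)
for z in (10.0, 1e3, 1e5):
    print(z, image_distance(z, L, f), magnification(z, L, f), f / z)
```

As $z$ grows, the image distance approaches $f$ and the magnification approaches $f/z$ regardless of the spacing $L$, which is the sense in which the far-field pattern is reproduced at the focus with its angular width preserved.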
Although (11.6) correctly describes the intensity at the focus of a lens, we cannot easily write the electric field since the imaging techniques that we have used do not easily render the phase information. To obtain an expression for the field, it will be necessary to employ the Fresnel diffraction formula, which we accomplish in the remainder of this section. Before doing so, we will need to know how a lens adjusts the phase fronts of the light passing through it.
Consider a monochromatic light field that goes through a thin lens with focal length $f$. In traversing the lens, the wavefront undergoes a phase shift that varies across the lens. We will reference the phase shift to that experienced by the light that goes through the center of the lens. We take the distances $\ell_1$ and $\ell_2$, as drawn in Fig. 11.3, to be positive.
The light passing through the off-axis portion of the lens experiences less material than the light passing through the center. The difference in optical path length is $(n - 1)(\ell_1 + \ell_2)$ (see discussion connected with (9.13)). This means that the phase of the field passing through the off-axis portion of the lens relative to the phase of the field passing through the center is
$$\Delta\phi = -k (n - 1)(\ell_1 + \ell_2). \tag{11.7}$$
The negative sign indicates a phase advance (i.e. same sign as $-\omega t$). In (11.7), $k$ represents the wave number in vacuum (i.e. $2\pi/\lambda_{\text{vac}}$), since $\ell_1$ and $\ell_2$ correspond to distances outside of the lens material.
Figure 11.3 A thin lens, which modifies the phase of a field passing through.
We can find expressions for $\ell_1$ and $\ell_2$ from the equations describing the spherical surfaces of the lens:
$$\begin{aligned}
(R_1 - \ell_1)^2 + x^2 + y^2 &= R_1^2 \\
(R_2 + \ell_2)^2 + x^2 + y^2 &= R_2^2
\end{aligned} \tag{11.8}$$
As drawn in Fig. 11.3, $R_1$ is a positive radius of curvature while $R_2$ is negative, in accordance with conventions in chapter 9. In the spirit of the Fresnel approximation, which takes place in the paraxial limit, it is appropriate to neglect the terms $\ell_1^2$ and $\ell_2^2$ in comparison to other terms present in (11.8). Within this approximation, the equations become
$$\ell_1 \cong \frac{x^2 + y^2}{2R_1} \quad \text{and} \quad \ell_2 \cong -\frac{x^2 + y^2}{2R_2} \tag{11.9}$$
Substitution into (11.7) yields
$$\Delta\phi = -k (n - 1) \left( \frac{1}{R_1} - \frac{1}{R_2} \right) \frac{x^2 + y^2}{2} = -\frac{k}{2f} \left( x^2 + y^2 \right) \tag{11.10}$$
where the focal length of a thin lens $f$ has been introduced according to the lens-maker's formula (9.46).
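To see how good the paraxial step is, the sketch below solves the surface equations (11.8) exactly and compares the resulting phase with the quadratic form $-k(x^2+y^2)/2f$. The function names and the sample lens ($n = 1.5$, $R_1 = 0.2$ m, $R_2 = -0.2$ m, 500 nm light) are illustrative assumptions; the sign convention $R_1 > 0$, $R_2 < 0$ follows Fig. 11.3.

```python
import math

def sag_exact(rho, R):
    """Exact gap from (11.8); only |R| enters, so either sign of R works."""
    return abs(R) - math.sqrt(R ** 2 - rho ** 2)

def sag_paraxial(rho, R):
    """Paraxial form (11.9), written with |R| so the distance is positive."""
    return rho ** 2 / (2 * abs(R))

def lens_phase(rho, n, R1, R2, wavelength, paraxial=True):
    """Relative phase (11.7) at transverse radius rho (R1 > 0, R2 < 0)."""
    k = 2 * math.pi / wavelength  # vacuum wave number
    sag = sag_paraxial if paraxial else sag_exact
    return -k * (n - 1) * (sag(rho, R1) + sag(rho, R2))

def focal_length(n, R1, R2):
    """Thin-lens focal length: 1/f = (n - 1)(1/R1 - 1/R2)."""
    return 1.0 / ((n - 1) * (1.0 / R1 - 1.0 / R2))

n, R1, R2, wavelength = 1.5, 0.2, -0.2, 500e-9  # assumed sample lens
f = focal_length(n, R1, R2)
for rho in (1e-3, 5e-3, 2e-2):
    print(rho, lens_phase(rho, n, R1, R2, wavelength),
          lens_phase(rho, n, R1, R2, wavelength, paraxial=False))
```

For radii small compared with the surface curvatures the two phases agree closely; the fractional difference grows roughly as $\rho^2/4R^2$, which indicates where the thin-lens quadratic phase starts to fail.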
In summary, the light traversing a lens experiences a relative phase shift given by
$$E(x, y, z_{\text{after lens}}) = E(x, y, z_{\text{before lens}})\, e^{-i\frac{k}{2f}(x^2 + y^2)} \tag{11.11}$$
Starting from the known field $E(x', y', 0)$ at the aperture, we compute the field
$$E(x, y, L + f) = -i\, \frac{e^{ikf} e^{i\frac{k}{2f}(x^2 + y^2)}}{\lambda f} \iint \left[ E(x'', y'', L)\, e^{-i\frac{k}{2f}(x''^2 + y''^2)} \right] e^{i\frac{k}{2f}(x''^2 + y''^2)}\, e^{-i\frac{k}{f}(x x'' + y y'')}\, dx''\, dy'' \tag{11.13}$$
Notice that at least the integration portion of this formula looks exactly like the Fraunhofer diffraction formula! This happened even though in the preceding discussion we did not at any time specifically make the Fraunhofer approximation. The result (11.14) implies the intensity distribution (11.6) as anticipated. However, the phase of the field is also revealed in (11.14).
In general, the field carries a wave front curvature as it passes through the focal plane of the lens. In the special case $L = f$, the diffraction formula takes a particularly simple form:
$$\left. E(x, y, L + f) \right|_{L = f} = -i\, \frac{e^{2ikf}}{\lambda f} \iint E(x', y', 0)\, e^{-i\frac{k}{f}(x x' + y y')}\, dx'\, dy' \tag{11.15}$$
When the lens is placed at this special distance following the aperture, the Fraunhofer diffraction pattern viewed at the focus of the lens carries a flat wave front.
Example 11.1
What minimum telescope-lens diameter is required to distinguish a Jupiter-like planet (orbital radius $8 \times 10^8$ km) from its star if they are 10 light-years away?
This seems like a piece of cake; a telescope with a diameter bigger than 7 cm will do the trick. However, the vastly unequal brightness of the star and the planet is the real technical challenge. The diffraction rings in the star's diffraction pattern completely swamp the faint signal from the planet.
2 often defined without the factor of 2
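The 7 cm figure can be reproduced with a short estimate. The sketch below assumes the standard Rayleigh criterion, $\theta_{\min} = 1.22\,\lambda/D$, and an observing wavelength of 500 nm; neither assumption is stated in the example as shown here, and the function and constant names are hypothetical.

```python
LIGHT_YEAR_M = 9.4607e15  # meters per light-year

def min_diameter(separation_m, distance_m, wavelength_m):
    """Smallest aperture that resolves two point sources by the Rayleigh criterion."""
    theta = separation_m / distance_m      # angular separation (small-angle approximation)
    return 1.22 * wavelength_m / theta

# Orbital radius 8e8 km, distance 10 light-years, assumed 500 nm light.
D_min = min_diameter(8e8 * 1e3, 10 * LIGHT_YEAR_M, 500e-9)
print(f"minimum diameter is roughly {100 * D_min:.1f} cm")
```

The estimate only addresses angular resolution; as the example points out, the contrast between star and planet is the harder problem.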
where the offset in the arguments shifts the location of the aperture. The field comprising all of the identical apertures is
$$E(x', y', 0) = \sum_{n=1}^{N} E_{\text{aperture}}(x' - x_n', y' - y_n', 0) \tag{11.19}$$
Figure 11.9 Array of identical apertures.
We next compute the Fraunhofer diffraction pattern for the above field. Upon inserting (11.19) into the Fraunhofer diffraction formula (10.19) we obtain
$$E(x, y, z) = -i\, \frac{e^{ikz} e^{i\frac{k}{2z}(x^2 + y^2)}}{\lambda z} \sum_{n=1}^{N} \int_{-\infty}^{\infty} dx' \int_{-\infty}^{\infty} dy'\, E_{\text{aperture}}(x' - x_n', y' - y_n', 0)\, e^{-i\frac{k}{z}(x x' + y y')} \tag{11.20}$$
where we have taken the summation out in front of the integral. We have also integrated over the entire (infinitely wide) mask, taking $E_{\text{aperture}}$ to be zero except inside each aperture.
Even without yet choosing the shape of the identical apertures, we can make some progress on (11.20) with the change of variables $x'' \equiv x' - x_n'$ and $y'' \equiv y' - y_n'$:
$$E(x, y, z) = -i\, \frac{e^{ikz} e^{i\frac{k}{2z}(x^2 + y^2)}}{\lambda z} \sum_{n=1}^{N} \int_{-\infty}^{\infty} dx'' \int_{-\infty}^{\infty} dy''\, E_{\text{aperture}}(x'', y'', 0)\, e^{-i\frac{k}{z}\left[ x (x'' + x_n') + y (y'' + y_n') \right]} \tag{11.21}$$
Next we pull the factor $\exp\{ -i\frac{k}{z}(x x_n' + y y_n') \}$ out in front of the integral to arrive
Example 11.2
Calculate the Fraunhofer diffraction pattern for two identical circular apertures
with diameter D whose centers are separated by a spacing h.
$$E_{\text{aperture}}(x'-x'_n,\,y'-y'_n,\,0) = \iint dx''\,dy''\,\delta(x''-x'_n)\,\delta(y''-y'_n)\,E_{\text{aperture}}(x'-x'',\,y'-y'',\,0)$$
The integral in (11.20) therefore may be viewed as a 2-D Fourier transform of a convolution, where
$kx/z$ and $ky/z$ play the role of spatial frequencies. The convolution theorem (see P0.26) indicates
that this is the same as the product of Fourier transforms. The 2-D Fourier transform for the delta
function (times $2\pi$) is
$$\iint dx''\,dy''\,\delta(x''-x'_n)\,\delta(y''-y'_n)\,e^{-i\frac{k}{z}(xx''+yy'')} = e^{-i\frac{k}{z}(xx'_n+yy'_n)}$$
The array theorem (11.22) exhibits this factor. It multiplies the single-slit Fraunhofer diffraction
integral, which is the Fourier transform of the other function.
$$x'_n = \left(n - \frac{N+1}{2}\right)h, \qquad y'_n = 0 \qquad (11.23)$$
where N is the total number of slits. Then the summation in the array theorem,
(11.22), becomes
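$$\sum_{n=1}^{N} e^{-i\frac{k}{z}(xx'_n+yy'_n)} = e^{i\frac{khx}{z}\frac{N+1}{2}}\sum_{n=1}^{N} e^{-i\frac{khx}{z}n} \qquad (11.24)$$
Figure 11.11 Transmission grating.
The remaining sum is a geometric series, which evaluates to
$$\sum_{n=1}^{N} e^{-i\frac{k}{z}(xx'_n+yy'_n)} = \frac{e^{-i\frac{khx}{2z}N} - e^{i\frac{khx}{2z}N}}{e^{-i\frac{khx}{2z}} - e^{i\frac{khx}{2z}}} = \frac{\sin\left(N\frac{khx}{2z}\right)}{\sin\left(\frac{khx}{2z}\right)} \qquad (11.25)$$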
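The closed form in (11.25) is easy to check numerically. The sketch below sums the phase factors directly for the slit positions of (11.23) and compares the result with the ratio of sines; the function names and the shorthand `alpha` for $khx/z$ are illustrative, not from the text.

```python
import cmath
import math

def array_factor_sum(alpha, N):
    """Direct sum over slits at x'_n = (n - (N+1)/2) h, with alpha = k h x / z."""
    return sum(cmath.exp(-1j * alpha * (n - (N + 1) / 2)) for n in range(1, N + 1))

def array_factor_closed(alpha, N):
    """Closed form sin(N alpha / 2) / sin(alpha / 2) from (11.25)."""
    return math.sin(N * alpha / 2) / math.sin(alpha / 2)

for N in (2, 5, 10):
    direct = array_factor_sum(0.7, N)
    closed = array_factor_closed(0.7, N)
    print(N, abs(direct - closed))  # differences at round-off level
```

Because the slit positions are symmetric about the origin, the direct sum is real up to round-off, which is why it can equal a ratio of real sines.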
The diffraction pattern for a single slit was previously calculated in example 10.4.
When (11.25) and (10.20) are installed in the array theorem (11.22), we get for the
intensity
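$$I(x,y,z) = I_0\,\frac{\sin^2\left(N\frac{khx}{2z}\right)}{\sin^2\left(\frac{khx}{2z}\right)}\,\frac{\Delta x^2\,\Delta y^2}{\lambda^2 z^2}\,\mathrm{sinc}^2\!\left(\frac{\pi\Delta x}{\lambda z}x\right)\mathrm{sinc}^2\!\left(\frac{\pi\Delta y}{\lambda z}y\right) \qquad (11.26)$$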
pattern in (11.26) for fixed y, say y = 0. The intensity pattern in the horizontal
dimension may be written as
$$I(x) = I_{peak}\,\mathrm{sinc}^2\!\left(\frac{\pi\Delta x}{\lambda z}x\right)\frac{\sin^2\left(N\frac{\pi hx}{\lambda z}\right)}{N^2\sin^2\left(\frac{\pi hx}{\lambda z}\right)} \qquad (11.27)$$
Note that $\lim_{\alpha\to 0}\frac{\sin N\alpha}{\sin\alpha} = N$, so we have placed $N^2$ in the denominator and absorbed
the same factor into the definition of $I_{peak}$, which represents the intensity on the
screen at $x = 0$. Again, the intensity $I_{peak}$ is associated with a given value of $y$.
It is left as an exercise to study the functional form of (11.27), especially how
the number of slits N influences the behavior. The case of N = 2 describes
the diffraction pattern for a Young's double slit experiment. We now have a
description of the Young's two-slit pattern in the case that the slits have finite
openings of width $\Delta x$ rather than infinitely narrow ones.
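To explore how (11.27) depends on N, it can help to evaluate it directly. The following is a minimal sketch, assuming the unnormalized convention sinc(u) = sin(u)/u; the function and parameter names are illustrative. The zero-over-zero points at the principal maxima are replaced by their limiting value.

```python
import math

def sinc(u):
    """Unnormalized sinc, sin(u)/u, with the limit 1 at u = 0."""
    return 1.0 if u == 0 else math.sin(u) / u

def grating_intensity(x, N, h, dx, wavelength, z, I_peak=1.0):
    """Intensity (11.27) for N slits of width dx and spacing h at distance z."""
    a = math.pi * h * x / (wavelength * z)
    s = math.sin(a)
    if abs(s) < 1e-12:
        ratio = 1.0  # limit of sin^2(N a) / (N^2 sin^2 a) at a principal maximum
    else:
        ratio = (math.sin(N * a) / (N * s)) ** 2
    envelope = sinc(math.pi * dx * x / (wavelength * z)) ** 2
    return I_peak * envelope * ratio

# Hypothetical geometry: 20-micron slit spacing, 10-micron slits, 500 nm light, 1 m screen.
for N in (2, 5, 10, 100):
    print(N, grating_intensity(1.0e-3, N, 20e-6, 10e-6, 500e-9, 1.0))
```

For N = 2 the interference factor reduces to $\cos^2(\pi hx/\lambda z)$, the familiar Young's double-slit result.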
11.5 Spectrometers
The formula (11.27) can be exploited to make wavelength measurements. This
forms the basis of a diffraction grating spectrometer. In order to achieve good
spatial separation between wavelengths, it is necessary to allow the light to
propagate a large distance. Optimal wavelength separation therefore occurs in
the Fraunhofer regime for which (11.27) applies.
A spectrometer has relatively poor resolving power compared to a Fabry-Perot
interferometer. Nevertheless, a spectrometer is not hampered by the serious
limitation imposed by free spectral range. A spectrometer is able to measure a
wide range of wavelengths simultaneously. The Fabry-Perot interferometer and
the grating spectrometer in this sense are complementary, the one being able
to make very precise measurements within a narrow wavelength range and the
other being able to characterize wide ranges of wavelengths simultaneously.
To appreciate how a spectrometer works, consider Fraunhofer diffraction
from a grating, as described by (11.27). The structure of the diffraction pattern
has various peaks. For example, Fig. 11.12a shows the diffraction peaks from a
Young's double slit (i.e. N = 2). The diffraction pattern consists of the typical
Young's double-slit pattern multiplied by the diffraction pattern of a single slit.
(Note that $\sin^2(2\pi hx/\lambda z)/[4\sin^2(\pi hx/\lambda z)] = \cos^2(\pi hx/\lambda z)$.)
As the number of slits N increases, the peaks tend to sharpen while staying in
the same location as the peaks in the Young's double-slit pattern. Figure 11.12b
shows the case for N = 5. The prominent peaks occur when $\sin(\pi hx/\lambda z)$ in the
denominator of (11.27) goes to zero. Keep in mind that the numerator goes to
zero at the same places, creating a zero-over-zero situation, so the peaks are not
infinitely tall.
Figure 11.12 Diffraction through various numbers of slits, each with $\Delta x = h/2$
(slit widths half the separation). The dotted line shows the single slit diffraction
pattern. (a) Diffraction from a double slit. (b) Diffraction from 5 slits.
(c) Diffraction from 10 slits. (d) Diffraction from 100 slits.
With larger values of N , the peaks can become extremely sharp, and the small
secondary peaks in between become tiny in comparison. Fig. 11.12c shows the
case of N = 10 and Fig. 11.12d shows the case of N = 100.
When very many slits are used, the resulting sharp diffraction peaks become
very useful for measuring spectra of light, since the position of the diffraction
peaks depends on wavelength (except for the center peak at x = 0). If light of differ-
ent wavelengths is simultaneously present, then the diffraction peaks associated
with different wavelengths appear in different locations.
Consider the inset in Fig. 11.12d, which gives a close-up view of the first-order
diffraction peak for N = 100. The location of this peak on a distant screen varies
with the wavelength of the light. How much must the wavelength change to cause
the peak to move by half of its width as marked in the inset of Fig. 11.12d? This
corresponds to the minimum wavelength separation that allows two associated
peaks to be distinguished.
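As mentioned, the main diffraction peaks occur when the denominator of (11.27)
goes to zero, i.e.
$$\frac{\pi hx}{\lambda z} = m\pi \qquad (11.28)$$
The numerator of (11.27) goes to zero at these same locations (i.e. $N\pi hx/\lambda z = Nm\pi$),
so the peaks remain finite. If two nearby wavelengths $\lambda_1$ and $\lambda_2$ are sent
through the grating simultaneously, their $m$th peaks are located at
$$x_1 = \frac{mz\lambda_1}{h} \quad\text{and}\quad x_2 = \frac{mz\lambda_2}{h} \qquad (11.29)$$
These are spatially separated by
$$\Delta x_\lambda \equiv x_2 - x_1 = \frac{mz}{h}\Delta\lambda \qquad (11.30)$$
where $\Delta\lambda \equiv \lambda_2 - \lambda_1$.
Figure 11.13 Animation showing diffraction through a number of slits.
Meanwhile, we can find the spatial width of, say, the first peak by considering the
change in $x_1$ that causes the sine in the numerator of (11.27) to reach the nearby
zero (see inset in Fig. 11.12d). This condition implies
$$N\frac{\pi h\left(x_1 + \Delta x_{peak}\right)}{\lambda_1 z} = Nm\pi + \pi \qquad (11.31)$$
We will say that two peaks, associated with $\lambda_1$ and $\lambda_2$, are barely distinguishable
when $\Delta x_\lambda = \Delta x_{peak}$. We also substitute from (11.29) to rewrite (11.31) as
$$N\frac{\pi h\left(mz\lambda_1/h + mz\Delta\lambda/h\right)}{\lambda_1 z} = Nm\pi + \pi \quad\Rightarrow\quad \frac{\Delta\lambda}{\lambda} = \frac{1}{Nm} \qquad (11.32)$$
$$RP \equiv \frac{\lambda}{\Delta\lambda} = mN \qquad (11.33)$$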
The resolving power is proportional to the number of slits illuminated on the
diffraction grating. The resolving power also improves for higher diffraction
orders m.
Example 11.3
What is the resolving power with m = 1 of a 2-cm-wide grating with 500 slits per
millimeter, and how wide is the 1st-order diffraction peak for 500-nm light after
1-m focusing?
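$$RP = mN = 2\,\text{cm}\times\frac{500}{0.1\,\text{cm}} = 10^4$$
and the minimum distinguishable wavelength separation is
$$\Delta\lambda = \lambda/RP = 500\,\text{nm}/10^4 = 0.05\,\text{nm}$$
For a 1-m focus, this corresponds to a first-order peak width of
$$\Delta x_\lambda = \frac{mf}{h}\Delta\lambda = \frac{1\,\text{m}}{2\times10^{-6}\,\text{m}}\,0.05\,\text{nm} = 25\,\mu\text{m}$$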
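The arithmetic of Example 11.3 follows directly from (11.30) and (11.33), with the focal length f standing in for z. A small sketch is given below; the function name and argument names are illustrative only.

```python
def grating_resolution(m, grating_width, slits_per_meter, wavelength, focal_length):
    """Return (resolving power, minimum wavelength separation, peak width).

    Uses RP = m N from (11.33) and the peak separation m f d_lambda / h
    from (11.30), with the focal length playing the role of z."""
    N = grating_width * slits_per_meter   # number of illuminated slits
    RP = m * N
    d_lambda = wavelength / RP            # smallest resolvable separation
    h = 1.0 / slits_per_meter             # slit spacing
    dx = m * focal_length * d_lambda / h  # width of the peak at the focus
    return RP, d_lambda, dx

# 2-cm grating, 500 slits/mm, 500-nm light, 1-m focusing, first order
RP, d_lambda, dx = grating_resolution(1, 0.02, 500e3, 500e-9, 1.0)
print(RP, d_lambda, dx)
```

The three returned values correspond to the resolving power of $10^4$, the 0.05 nm wavelength separation, and the 25 micron peak width found in the example.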
[Figure: grating spectrometer schematic; labels: Light in, Slit, Grating, Slit, Light out]
$$E(x,y,z) = -i\,\frac{e^{ikz}e^{i\frac{k}{2z}(x^2+y^2)}}{\lambda z}\iint dx'\,dy'\left[E_0\,e^{-(x'^2+y'^2)/w_0^2}\right]e^{i\frac{k}{2z}(x'^2+y'^2)}\,e^{-i\frac{k}{z}(xx'+yy')} \qquad (11.36)$$
The Gaussian profile itself limits the dimension of the aperture, so there is no
problem with integrating to infinity. Equation (11.36) can be rewritten as
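$$E(x,y,z) = -i\,\frac{E_0\,e^{ikz}e^{i\frac{k}{2z}(x^2+y^2)}}{\lambda z}\int_{-\infty}^{\infty} dx'\,e^{-\left(\frac{1}{w_0^2}-i\frac{k}{2z}\right)x'^2 - i\frac{kx}{z}x'}\int_{-\infty}^{\infty} dy'\,e^{-\left(\frac{1}{w_0^2}-i\frac{k}{2z}\right)y'^2 - i\frac{ky}{z}y'} \qquad (11.37)$$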
4 The beam would converge to narrower widths if instead we used a phase associated with
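The integrals over $x'$ and $y'$ have the identical form and can be done individually
with the help of the integral formula (0.55). The algebra is cumbersome, but the
integral in the $x'$ dimension becomes
$$\int_{-\infty}^{\infty} dx'\,e^{-\left(\frac{1}{w_0^2}-i\frac{k}{2z}\right)x'^2 - i\frac{kx}{z}x'} = \left[\frac{\pi}{\frac{1}{w_0^2}-i\frac{k}{2z}}\right]^{1/2}\exp\left[\frac{\left(-i\frac{kx}{z}\right)^2}{4\left(\frac{1}{w_0^2}-i\frac{k}{2z}\right)}\right]$$
$$= \left[\frac{\pi}{-i\frac{k}{2z}\left(1+i\frac{2z}{kw_0^2}\right)}\right]^{1/2}\exp\left[\frac{-\frac{kx^2}{2z}}{\frac{2z}{kw_0^2}-i}\right]$$
$$= \left[\frac{i\lambda z}{\sqrt{1+\left(\frac{2z}{kw_0^2}\right)^2}\;e^{i\tan^{-1}\frac{2z}{kw_0^2}}}\right]^{1/2}\exp\left[-\frac{kx^2}{2z}\,\frac{\frac{2z}{kw_0^2}+i}{1+\left(\frac{2z}{kw_0^2}\right)^2}\right] \qquad (11.38)$$
A similar expression results from the integration on $y'$.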
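The first line of (11.38) rests on the standard Gaussian integral $\int_{-\infty}^{\infty} e^{-Ax^2+Bx}\,dx = \sqrt{\pi/A}\,e^{B^2/4A}$ for $\mathrm{Re}\,A > 0$, which is what we assume formula (0.55) states. The sketch below checks it by brute-force quadrature for one complex choice of A and B. The dimensionless values are arbitrary illustrations (they correspond to $1/w_0^2 = 1$, $k/2z = 0.5$, $kx/z = 1.5$), and the function names are not from the text.

```python
import cmath

def gaussian_integral_numeric(A, B, half_width=10.0, n=4001):
    """Trapezoid-rule estimate of the integral of exp(-A x^2 + B x) over the real line."""
    dx = 2.0 * half_width / (n - 1)
    total = 0.0 + 0.0j
    for j in range(n):
        x = -half_width + j * dx
        weight = 0.5 if j in (0, n - 1) else 1.0
        total += weight * cmath.exp(-A * x * x + B * x)
    return total * dx

def gaussian_integral_closed(A, B):
    """sqrt(pi / A) * exp(B^2 / (4 A)), valid for Re(A) > 0."""
    return cmath.sqrt(cmath.pi / A) * cmath.exp(B * B / (4.0 * A))

A = 1.0 - 0.5j   # plays the role of 1/w0^2 - i k/(2z)
B = -1.5j        # plays the role of -i k x / z
print(abs(gaussian_integral_numeric(A, B) - gaussian_integral_closed(A, B)))
```

The integration range of plus or minus ten waist widths is ample here because the integrand's magnitude falls off as $e^{-x'^2}$.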
When (11.38) and the equivalent expression for the y-dimension are used in
(11.37), the result is
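$$E(x,y,z) = E_0\,\frac{e^{ikz}e^{i\frac{k}{2z}(x^2+y^2)}}{\sqrt{1+\left(\frac{2z}{kw_0^2}\right)^2}}\;\exp\left[-\frac{k\left(x^2+y^2\right)}{2z}\,\frac{\frac{2z}{kw_0^2}+i}{1+\left(\frac{2z}{kw_0^2}\right)^2}\right]e^{-i\tan^{-1}\frac{2z}{kw_0^2}} \qquad (11.39)$$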
This rather complicated-looking expression for the field distribution is in fact very
useful and can be directly interpreted, as discussed in the next section.
A Gaussian field profile is one of few diffraction problems that can be handled
conveniently in either Cartesian (as above) or cylindrical coordinates. In cylindrical
coordinates, the Fresnel diffraction integral (10.27) is
$$E(\rho,z) = -\frac{2\pi i\,e^{ikz}e^{i\frac{k\rho^2}{2z}}}{\lambda z}\int_0^\infty \rho'\,d\rho'\,E_0\,e^{-\rho'^2/w_0^2}\,e^{i\frac{k\rho'^2}{2z}}\,J_0\!\left(\frac{k\rho\rho'}{z}\right)$$
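$$E(\rho,z) = -i\,\frac{2\pi\,e^{ikz}e^{i\frac{k\rho^2}{2z}}}{\lambda z}\,E_0\,\frac{\exp\left[-\frac{\left(k\rho/z\right)^2}{4\left(\frac{1}{w_0^2}-i\frac{k}{2z}\right)}\right]}{2\left(\frac{1}{w_0^2}-i\frac{k}{2z}\right)}$$
$$= E_0\,\frac{e^{ikz}e^{i\frac{k\rho^2}{2z}}}{\sqrt{1+\left(\frac{2z}{kw_0^2}\right)^2}}\;\exp\left[-\frac{k\rho^2}{2z}\,\frac{\frac{2z}{kw_0^2}+i}{1+\left(\frac{2z}{kw_0^2}\right)^2}\right]e^{-i\tan^{-1}\frac{2z}{kw_0^2}}$$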
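$$E(\rho,z) = E_0\,\frac{w_0}{w(z)}\,e^{-\frac{\rho^2}{w^2(z)}}\,e^{ikz+i\frac{k\rho^2}{2R(z)}}\,e^{-i\tan^{-1}\frac{z}{z_0}} \qquad (11.40)$$
where
$$\rho^2 \equiv x^2 + y^2, \qquad (11.41)$$
$$w(z) \equiv w_0\sqrt{1+z^2/z_0^2}, \qquad (11.42)$$
$$R(z) \equiv z + z_0^2/z, \qquad (11.43)$$
$$z_0 \equiv \frac{kw_0^2}{2} \qquad (11.44)$$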
This formula describes the lowest-order Gaussian mode, the most common laser
beam profile.5
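The definitions (11.41)-(11.44) translate directly into a few helper functions. The sketch below evaluates the field (11.40) at a chosen point; the function names and the 1 mm waist and 633 nm wavelength are illustrative assumptions, not values from the text.

```python
import cmath
import math

def rayleigh_range(w0, wavelength):
    """z0 = k w0^2 / 2 from (11.44), with k = 2 pi / wavelength."""
    return math.pi * w0**2 / wavelength

def beam_radius(z, w0, z0):
    """w(z) from (11.42)."""
    return w0 * math.sqrt(1.0 + (z / z0) ** 2)

def curvature_radius(z, z0):
    """R(z) from (11.43); infinite (flat wave front) at the waist."""
    return math.inf if z == 0 else z + z0**2 / z

def gaussian_field(rho, z, w0, wavelength, E0=1.0):
    """Complex field E(rho, z) of (11.40)."""
    k = 2.0 * math.pi / wavelength
    z0 = rayleigh_range(w0, wavelength)
    w = beam_radius(z, w0, z0)
    R = curvature_radius(z, z0)
    phase = k * z + k * rho**2 / (2.0 * R) - math.atan(z / z0)
    return E0 * (w0 / w) * math.exp(-(rho / w) ** 2) * cmath.exp(1j * phase)

w0, lam = 1.0e-3, 633e-9   # assumed waist radius and wavelength
z0 = rayleigh_range(w0, lam)
print(f"z0 = {z0:.3f} m, w(z0) = {beam_radius(z0, w0, z0):.3e} m")
```

At z = 0 the function returns $E_0 e^{-\rho^2/w_0^2}$, and one Rayleigh range away the beam radius has grown by a factor of $\sqrt{2}$ while $R = 2z_0$.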
It turns out that (11.40) works equally well for negative values of z. The
expression can therefore be used to describe the field of a simple laser beam
everywhere (before and after it goes through a focus). In fact, the expression works
also near z = 0! At z = 0 the diffracted field (11.40) returns the exact expression for
the original field profile (11.34) (see P11.11). There is good reason for this since
the Fresnel diffraction integral is an exact solution to the paraxial wave equation
(10.15). The beam (11.40) satisfies the paraxial wave equation for positive and
negative z. In short, (11.40) may be used with impunity as long as the divergence
angle of the beam is not too wide.
5 Lasers can also be multimode, exhibiting more complicated structure through higher-order
modes.
represents spherical wave fronts as parabolic curves (same as the paraxial
approximation). As a reminder, to restore the temporal dependence of the field, we
append $e^{-i\omega t}$ to the solution, as discussed in connection with (10.4).
The phase $-i\tan^{-1}(z/z_0)$ is perhaps a bit more mysterious. It is called the Gouy
shift and is actually present for any light that goes through a focus, not just laser
beams. The Gouy shift is not overly dramatic since the expression $\tan^{-1}(z/z_0)$
ranges from $-\pi/2$ (at $z = -\infty$) to $\pi/2$ (at $z = +\infty$). Nevertheless, when light goes
through a focus, it experiences an overall phase shift of $-\pi$.
Figure 11.17 Real part of a Gaussian laser field at an instant in time. The radius of
curvature of wavefronts is apparent.
Example 11.4
Write the beam waist $w_0$ in terms of the f-number, defined to be the ratio of z to
the beam diameter $2w(z)$ far from the beam waist.
Solution: Far away from the beam waist (i.e. $z \gg z_0$) the laser beam expands
along a cone. That is, its diameter increases in proportion to distance.
$$w(z) = w_0\sqrt{1+z^2/z_0^2} \;\to\; w_0\,z/z_0$$
The cone angle is parameterized by the f-number, the ratio of the cone height to
its base:
$$f_\# \equiv \lim_{z\to\infty}\frac{z}{2w(z)} = \frac{z}{2w_0\,z/z_0} = \frac{z_0}{2w_0}$$
Substitution of (11.44) into this expression yields
$$w_0 = \frac{2\lambda f_\#}{\pi} \qquad (11.46)$$
Figure 11.18
Equation (11.46) gives a convenient way to predict the size of a laser focus.
One calculates the f-number by dividing the distance from the focus by the
diameter of the beam at that distance. In practice you may be very surprised at
how poorly a beam may focus in comparison with the theoretical prediction (due
to aberrations). It is always good practice to directly measure your focus if its size
is important to an experiment.
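Equation (11.46) and its inverse can be wrapped in two one-line helpers. This is a minimal sketch; the f-number of 10 and the 633 nm wavelength are hypothetical inputs chosen only to give a concrete number.

```python
import math

def waist_from_f_number(f_number, wavelength):
    """w0 = 2 lambda f# / pi, from (11.46)."""
    return 2.0 * wavelength * f_number / math.pi

def f_number_from_waist(w0, wavelength):
    """f# = z0 / (2 w0), with z0 = pi w0^2 / lambda from (11.44)."""
    z0 = math.pi * w0**2 / wavelength
    return z0 / (2.0 * w0)

w0 = waist_from_f_number(10.0, 633e-9)   # hypothetical f/10 focus of a HeNe beam
print(f"predicted waist radius = {w0 * 1e6:.2f} microns")
```

As the text cautions, this is the aberration-free prediction; a measured focus may well be larger.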
Figure 11.19 Gaussian laser beam traversing an optical system described by an ABCD
matrix. The dark lines represent the incoming and exiting beams. The gray line repre-
sents where the exiting beam appears to have been.
z, referenced to the position of the beam's waist as shown in Fig. 11.19. The beam
exiting from the system, in general, has a new Rayleigh range $z_0'$. The waist of the
new beam also occurs at a different location. Let $z'$ denote the location of the exit
of the optical system, referenced to the location of the waist of the new beam. If
the exiting beam diverges as in Fig. 11.19, then it emerges from a virtual beam
waist located before the exit point of the system. In this case, $z'$ is taken to be
positive. On the other hand, if the emerging beam converges to an actual waist,
then $z'$ is taken to be negative since the exit point of the system occurs before the
focus.
The ABCD law is embodied in the following relationship:6
$$z' + iz_0' = \frac{A\left(z+iz_0\right)+B}{C\left(z+iz_0\right)+D} \qquad (11.47)$$
where A, B, C, and D are the matrix elements of the optical system. The imaginary
number $i \equiv \sqrt{-1}$ imbues the law with complex arithmetic. It makes two equations
from one, since the real and imaginary parts of (11.47) must separately be equal.
We now prove the ABCD law. We begin by showing that the law holds for
two specific ABCD matrices. First, consider the matrix for propagation through a
distance d :
$$\begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} 1 & d \\ 0 & 1 \end{bmatrix} \qquad (11.48)$$
We know that simple propagation has minimal effect on a beam. The Rayleigh
range is unchanged, so we expect that the ABCD law should give $z_0' = z_0$. The
propagation through a distance d modifies the beam position by $z' = z + d$. We
now check that the ABCD law agrees with these results by inserting (11.48) into
(11.47):
$$z' + iz_0' = \frac{1\left(z+iz_0\right)+d}{0\left(z+iz_0\right)+1} = z + d + iz_0 \quad \text{(propagation through distance } d\text{)} \qquad (11.49)$$
Thus, the law holds in this case.
Next we consider the ABCD matrix of a thin lens (or a curved mirror):
$$\begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -1/f & 1 \end{bmatrix} \qquad (11.50)$$
6 The complex conjugate of this expression works equally well.
A beam that traverses a thin lens undergoes the phase shift $-k\rho^2/2f$, according
to (11.11). This modifies the original phase of the wave front $k\rho^2/2R(z)$, seen in
(11.40). The phase of the exiting beam is therefore
$$\frac{k\rho^2}{2R(z')} = \frac{k\rho^2}{2R(z)} - \frac{k\rho^2}{2f} \qquad (11.51)$$
where we do not keep track of unimportant overall phases such as $kz$ or $kz'$. With
(11.43) this relationship reduces to
$$\frac{1}{R(z')} = \frac{1}{R(z)} - \frac{1}{f} \quad\Rightarrow\quad \frac{1}{z' + {z_0'}^2/z'} = \frac{1}{z + z_0^2/z} - \frac{1}{f} \qquad (11.52)$$
In addition to this relationship, the local radius of the beam given by (11.42)
cannot change while traversing the thin lens. Therefore,
$$w(z') = w(z) \quad\Rightarrow\quad z_0'\left(1+\frac{z'^2}{{z_0'}^2}\right) = z_0\left(1+\frac{z^2}{z_0^2}\right) \qquad (11.53)$$
On the other hand, the ABCD law for the thin lens gives
$$z' + iz_0' = \frac{1\left(z+iz_0\right)+0}{-(1/f)\left(z+iz_0\right)+1} \quad \text{(traversing a thin lens with focal length } f\text{)} \qquad (11.54)$$
It is left as an exercise (see P11.14) to show that (11.54) is consistent with (11.52)
and (11.53).
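Before working the algebra, the consistency claimed above can be spot-checked numerically. The sketch below applies the thin-lens form of the ABCD law to one arbitrary beam and then tests the curvature condition (11.52) and the beam-size condition (11.53). The numerical values of z, z0, and f are hypothetical, and the function name is illustrative.

```python
def abcd_transform(q, A, B, C, D):
    """Apply the ABCD law (11.47) to the complex parameter q = z + i z0."""
    return (A * q + B) / (C * q + D)

# Hypothetical beam 0.3 m past its waist with z0 = 0.5 m, hitting an f = 0.2 m lens.
z, z0, f = 0.3, 0.5, 0.2
q_out = abcd_transform(complex(z, z0), 1.0, 0.0, -1.0 / f, 1.0)
zp, z0p = q_out.real, q_out.imag   # z' and z0' of the exiting beam

# (11.52): wave-front curvature changes by -1/f
curvature_residual = 1.0 / (zp + z0p**2 / zp) - (1.0 / (z + z0**2 / z) - 1.0 / f)
# (11.53): local beam radius is unchanged by the lens
size_residual = z0p * (1.0 + zp**2 / z0p**2) - z0 * (1.0 + z**2 / z0**2)

print(zp, z0p, curvature_residual, size_residual)
```

Both residuals vanish to round-off, and the negative z' signals a beam converging toward a real waist after the lens, in line with the sign convention described earlier.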
So far we have shown that the ABCD law works for two specific examples,
namely propagation through a distance d and transmission through a thin lens
with focal length f . From these elements we can derive more complicated systems.
However, the ABCD matrix for a thick lens cannot be constructed from just these
two elements. We can construct the matrix for a thick lens if we sandwich a thick
window (as opposed to empty space) between two thin lenses (see P9.9). The
proof that the matrix for a thick window obeys the ABCD law is left as an exercise
(see P11.17). With these relatively few elements, essentially any optical system
can be constructed, provided that the beam propagation begins and ends in the
same index of refraction.
To complete our proof of the general ABCD law, we need only show that when
it is applied to the compound element
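$$\begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} A_2 & B_2 \\ C_2 & D_2 \end{bmatrix}\begin{bmatrix} A_1 & B_1 \\ C_1 & D_1 \end{bmatrix} = \begin{bmatrix} A_2A_1 + B_2C_1 & A_2B_1 + B_2D_1 \\ C_2A_1 + D_2C_1 & C_2B_1 + D_2D_1 \end{bmatrix} \qquad (11.55)$$
it gives the same answer as when the law is applied sequentially, first on
$\begin{bmatrix} A_1 & B_1 \\ C_1 & D_1 \end{bmatrix}$ and then on $\begin{bmatrix} A_2 & B_2 \\ C_2 & D_2 \end{bmatrix}$.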
Explicitly, we have
$$z'' + iz_0'' = \frac{A_2\left(z'+iz_0'\right)+B_2}{C_2\left(z'+iz_0'\right)+D_2}$$
$$= \frac{A_2\left[\frac{A_1(z+iz_0)+B_1}{C_1(z+iz_0)+D_1}\right]+B_2}{C_2\left[\frac{A_1(z+iz_0)+B_1}{C_1(z+iz_0)+D_1}\right]+D_2}$$
$$= \frac{A_2\left[A_1(z+iz_0)+B_1\right]+B_2\left[C_1(z+iz_0)+D_1\right]}{C_2\left[A_1(z+iz_0)+B_1\right]+D_2\left[C_1(z+iz_0)+D_1\right]} \qquad (11.56)$$
$$= \frac{\left(A_2A_1+B_2C_1\right)(z+iz_0)+\left(A_2B_1+B_2D_1\right)}{\left(C_2A_1+D_2C_1\right)(z+iz_0)+\left(C_2B_1+D_2D_1\right)}$$
$$= \frac{A(z+iz_0)+B}{C(z+iz_0)+D}$$
Thus, we can construct any ABCD matrix that we wish from matrices that are
known to obey the ABCD law. The resulting matrix also obeys the ABCD law.
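The composition property in (11.56) can also be confirmed numerically. The sketch below sends one beam through a thin lens followed by free propagation, once element by element and once using the product matrix of (11.55). The helper names and the numerical values are illustrative assumptions.

```python
def abcd_transform(q, M):
    """Apply the ABCD law (11.47) to q = z + i z0 for matrix M = ((A, B), (C, D))."""
    (A, B), (C, D) = M
    return (A * q + B) / (C * q + D)

def compose(M2, M1):
    """Matrix product M2 M1 as in (11.55): M1 acts first, then M2."""
    (A1, B1), (C1, D1) = M1
    (A2, B2), (C2, D2) = M2
    return ((A2 * A1 + B2 * C1, A2 * B1 + B2 * D1),
            (C2 * A1 + D2 * C1, C2 * B1 + D2 * D1))

def propagate(d):
    return ((1.0, d), (0.0, 1.0))        # matrix (11.48)

def thin_lens(f):
    return ((1.0, 0.0), (-1.0 / f, 1.0))  # matrix (11.50)

q_in = complex(0.3, 0.5)                  # hypothetical z + i z0
lens, gap = thin_lens(0.2), propagate(0.15)
sequential = abcd_transform(abcd_transform(q_in, lens), gap)
compound = abcd_transform(q_in, compose(gap, lens))
print(abs(sequential - compound))
```

Note the ordering: the element encountered first sits on the right in the matrix product.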
Exercises
P11.1 Fill in the steps leading to (11.14) starting from (11.12) and (11.13).
Show that the intensity distribution (11.6) is consistent with (11.14).
L11.2 Set up a collimated plane wave in the laboratory using a HeNe laser
($\lambda = 633$ nm) and appropriate lenses.
(a) Choose a rectangular aperture ($\Delta x$ by $\Delta y$) and place it in the plane
wave. Observe the Fraunhofer diffraction on a very far away screen (i.e.,
where $z \gg \frac{k}{2}\left(\text{aperture radius}\right)^2$ is satisfied). Check that the location of
Figure 11.20 [diagram labels: Laser, Aperture, Removable mirror, Far-away mirror, Screen, Filters, CCD Camera]
P11.3 On the night of April 18, 1775, a signal was sent from the Old North
Church steeple to Paul Revere, who was 1.8 miles away: "One if by
land, two if by sea." If in the dark, Paul's pupils had 4 mm diameters,
what is the minimum possible separation between the two lanterns
that would allow him to correctly interpret the signal? Assume that the
predominant wavelength of the lanterns was 580 nm.
HINT: You don't need to worry about the refractive index inside the eye,
n = 1.33. This reduces the angular separation between the images by a
factor of 1.33 inside the eye. The wavelength also shortens to 580 nm/1.33,
causing a smaller diffraction pattern. As far as resolution is concerned,
the two effects exactly compensate.
L11.4 Simulate two stars using laser beams ($\lambda = 633$ nm). Align them nearly
parallel with a small lateral displacement. (A mirror can aid in getting
the beams very close.) Send the beams down a long corridor until
diffraction causes both beams to blend together so that it is no longer
apparent that they are from two distinct sources. Use a lens to image
the two sources onto a CCD camera. Use a variable iris near the lens to
create different diameters.
Figure 11.21 [diagram labels: Laser, Laser, Pupil, Filter, CCD Camera]
P11.8 For the case of N = 1000 in P11.7, you wish to position a narrow slit at
the focus of the lens so that it transmits only the first-order diffraction
peak (i.e. at $khx/2f = \pi$). (a) How wide should the slit be if it is to
match the width of the peak (as defined in (11.31))?
(b) What small change in wavelength (away from $\lambda = 500$ nm) will
cause the intensity peak to shift by the width of the slit found in part
(a)?
L11.9 (a) Use a HeNe laser to determine the period h of a reflective grating.
(b) Give an estimate of the blaze angle on the grating. HINT: Assume
that the blaze angle is optimized for first-order diffraction of the HeNe
laser (for one side) at normal incidence. The blaze angle enables a
mirror-like reflection of the diffracted light on each groove. (video)
Figure 11.23
(c) You have two mirrors of focal length 75 cm and the reflective grating
in the lab. You also have two very narrow adjustable slits and the ability
to tune the angle of the grating. Sketch how to use these items to make
a monochromator. If the beam that hits the grating is 5 cm wide, what
do you expect the ultimate resolving power of the monochromator to
be in the wavelength range of 500 nm? HINT: See Fig. 11.14.
L11.10 Study the Jarrell Ash monochromator. Use a tungsten lamp as a source
and observe how the instrument works by taking the entire top off. Do
not breathe on or touch the optical surfaces when you do this. In the
dark, trace the light inside of the instrument with a card and observe
what happens when you change the wavelength setting. Place the top
back on when you are done. (video)
(a) Predict the best theoretical resolving power that this instrument can
do assuming 1200 lines per millimeter. You will need to measure the
width of the grating, which is matched well to the width of each mirror.
(b) What should the width $\Delta x$ of the entrance and exit slits be to obtain
this resolving power? Assume $\lambda = 500$ nm.
P11.12 Use the Fraunhofer integral formula (either (10.19) or (10.28)) to deter-
mine the far-field pattern of a Gaussian laser focus (11.34).
HINT: The answer should agree with P11.11 part (b).
L11.13 Consider the following setup where a diverging laser beam is collimated
using an uncoated lens. A double reflection from the two surfaces of the
lens (known as a "ghost") comes out in the forward direction, focusing
after a short distance. Use a CCD camera to study this focused beam.
Figure 11.25 [diagram labels: Laser, Lens, Pin Hole, Filter, Uncoated Lens, Ghost Beam, CCD Camera; 150 cm]
$$I_t(\rho,z) = I_2 + I_1(\rho,z) + 2\sqrt{I_2\,I_1(\rho,z)}\,\cos\left[\frac{k\rho^2}{2R(z)} - \tan^{-1}\frac{z}{z_0} + \phi\right]$$
7 J. Peatross and M. V. Pack, Viewing the Mathematical Structure of Gaussian Laser Beams in a
both $R(z)$ and the Gouy shift $\tan^{-1}(z/z_0)$, which are not present in the
intensity distribution of a single beam (see (11.45)).
(a) Determine the f-number for the ghost beam (see example 11.4). Use
this measurement to predict a value for $w_0$. HINT: You know that at the
lens, the focusing beam is the same size as the collimated beam.
(b) Measure the actual spot size $w_0$ at the focus. How does it compare
to the prediction?
HINT: Before measuring the spot size, make a subtle adjustment to
the tilt of the lens. This incidentally causes the phase between the two
beams to vary by small amounts, which you can set to $\phi = \pm\pi/2$. Then
at the focus the cosine term vanishes and the two beams don't interfere
(i.e. the intensities simply add). This is accomplished if the center of
the interference pattern is as dark as possible either far before or far
after the focus.
(c) Observe the effect of the Gouy shift. Since $\tan^{-1}(z/z_0)$ varies over a
range of $\pi$, you should see that the ring pattern inverts as you move
the camera from before the focus to after the focus (i.e. the bright rings
exchange with the dark ones).
(d) Predict the Rayleigh range $z_0$ and check that the radius of curvature
$R(z) \equiv z + z_0^2/z$ agrees with measurement at a small distance from the
focus.
HINT: You should see interference rings similar to those in Fig. 11.26.
The only phase term that varies with $\rho$ is $k\rho^2/2R(z)$. If you count N
fringes out to a radius $\rho$, then $k\rho^2/2R(z)$ has varied by $2\pi N$.
Figure 11.26 [panels at $z = -3z_0$, $-2z_0$, $-z_0$, $0$, $+z_0$, $+2z_0$, $+3z_0$, $+4z_0$]
P11.14 Find the solutions to (11.54) (i.e. find $z'$ and $z_0'$ in terms of $z$ and $z_0$).
Show that the results are in agreement with (11.52) and (11.53).
P11.15 Assuming a collimated beam (i.e. $z = 0$ and beam waist $w_0$), find the
location $L = -z'$ and size $w_0'$ of the subsequent focus when the beam
goes through a thin lens with focal length f.
L11.16 Place a long-focal-length lens (e.g. f = 100 cm) in a HeNe laser beam
soon after the exit mirror of the cavity where the beam waist $w_0$ is
sub-millimeter. Characterize the focus of the resulting laser beam using
filters and a CCD camera, and compare the results with the expressions
derived in P11.15.
P11.17 Prove the ABCD law for a beam propagating through a thick window of
material with matrix
$$\begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} 1 & d/n \\ 0 & 1 \end{bmatrix}$$
Chapter 12
12.1 Interferograms
Consider the Michelson interferometer seen in Fig. 12.1. Suppose that the
beamsplitter divides the fields evenly, so that the overall output intensity is given by
(8.1):
$$I_{tot} = 2I_0\left[1+\cos(\omega\tau)\right] \qquad (12.1)$$
As a reminder, $\tau$ is the roundtrip delay time of one path relative to the other. This
equation is based on the idealized case, where the amplitude and phase of the two
beams are uniform and perfectly aligned to each other following the beamsplitter.
The entire beam "blinks" on and off as the delay path $\tau$ is varied.
Figure 12.1 Michelson interferometer.
1 See M. Born and E. Wolf, Principles of Optics, 7th ed., Sect. 7.5.5 (Cambridge: Cambridge
What happens if one of the retro-reflecting mirrors is misaligned by a small
angle $\theta$? The fringe patterns seen in Fig. 12.2 (a)-(c) are the result. By the law
of reflection, the beam returning from the misaligned mirror deviates from the
"ideal" path by an angle $2\theta$. This puts a relative phase variation of
$$I_{tot} = 2I_0\left[1+\cos(\omega\tau+\phi)\right] \qquad (12.3)$$
The phase term $\phi$ depends on the local position within the beam through x and
y. Regions of uniform phase, called fringes (in this case individual stripes), have
the same intensity. As the delay $\tau$ is varied, the fringes seem to "move" across the
detector. In this case, the fringes appear at one edge of the beam and disappear
at the other.
Another interesting situation arises when the beams in a Michelson
interferometer are diverging. A fringe pattern of concentric circles will be seen at the
detector when the two beam paths are unequal (see Fig. 12.2 (d)). The radius of
curvature for the beam traveling the longer path is increased by the added amount
of delay $d = c\tau$. Thus, if beam 1 has radius of curvature $R_1$ when returning to the
beam splitter, then beam 2 will have radius $R_2 = R_1 + d$ upon return (assuming flat
mirrors). The relative phase (see phase term in (11.40)) between the two beams is
12.3 Generating Holograms
In the late 1940s, Dennis Gabor developed the concept of holography, but it wasn't
until after the invention of the laser that this field really blossomed. Consider
a coherent monochromatic beam of light that is split in half by a beamsplitter,
similar to that in a Michelson interferometer. Let one beam, called the reference
beam, proceed directly to a recording film, and let the other beam scatter from
an arbitrary object back towards the same film. The two beams interfere at the
recording film. It is best to split the beam initially into unequal intensities such
that the light scattered from the object has an intensity similar to the reference
beam at the film.
Figure 12.4 Twyman-Green setup for testing lenses.
The purpose of the film is to record the interference pattern. It is important
that the coherence length of the light be much longer than the difference in
path length starting from the beam splitter and ending at the film. In addition,
during exposure to the film, it is important that the whole setup be stable against
vibrations on the scale of a wavelength since this will cause the fringes to wash
out. For simplicity, we neglect the vector nature of the electric field, assuming
that the scattering from the object for the most part preserves polarization and
that the angle between the two beams incident on the film is modest (so that the
electric fields of the two beams are close to parallel). To the extent that the light
scattered from the object contains the polarization component orthogonal to that
of the reference beam, it provides a uniform (unwanted) background exposure to
the film on top of which the fringe pattern is recorded.
Figure 12.5 Exposure of holographic film.
In general terms, we may write the electric field arriving at the film as5
$$E_{film}(\mathbf{r})\,e^{-i\omega t} = E_{object}(\mathbf{r})\,e^{-i\omega t} + E_{ref}(\mathbf{r})\,e^{-i\omega t} \qquad (12.5)$$
Here, the coordinate r indicates locations on the film surface, which may have
arbitrary shape but often is a plane. The field E object (r), which is scattered from
the object, is in general very complicated. The field E ref (r) may be equally compli-
cated, but typically it is convenient if it has a simple form such as a plane wave,
since this beam must be re-created later in order to view the hologram.
The intensity of the field (12.5) is given by
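$$I_{film}(\mathbf{r}) = \frac{1}{2}c\epsilon_0\left|E_{object}(\mathbf{r}) + E_{ref}(\mathbf{r})\right|^2$$
$$= \frac{1}{2}c\epsilon_0\left[\left|E_{object}(\mathbf{r})\right|^2 + \left|E_{ref}(\mathbf{r})\right|^2 + E_{ref}^*(\mathbf{r})\,E_{object}(\mathbf{r}) + E_{ref}(\mathbf{r})\,E_{object}^*(\mathbf{r})\right] \qquad (12.6)$$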
For typical photographic film, the exposure of the film is proportional to the
intensity of the light hitting it. This is known as the linear response regime. That
is, after the film is developed, the transmittance T of the light through the film is
proportional to the intensity of the light that exposed it ($I_{film}$). However, for low
exposure levels, or for film specifically designed for holography, the transmission
of the light through the film can be proportional to the square of the intensity
of the light that exposes the film. Thus, after the film is exposed to the fringe
pattern and developed, the film acquires a spatially varying transmission function
according to
$$T(\mathbf{r}) \propto I_{film}^2(\mathbf{r}) \qquad (12.7)$$
If at a later point in time light of intensity $I_{incident}$ is directed onto the film, it will
transmit according to $I_{transmitted} = T(\mathbf{r})\,I_{incident}$. In this case, the field, as it emerges
from the other side of the film, will be
$$E_{transmitted}(\mathbf{r}) = t(\mathbf{r})\,E_{incident}(\mathbf{r}) \propto I_{film}(\mathbf{r})\,E_{incident}(\mathbf{r}) \qquad (12.8)$$
where $t(\mathbf{r}) = \sqrt{T(\mathbf{r})}$.

Dennis Gabor (1900–1979, Hungarian) was born in Budapest. As a teenager,
he fought for Hungary in World War I. Following the war, he studied at the
Technical University of Budapest and later at the Technical University of Berlin.
In 1927, Gabor completed his doctoral dissertation on cathode ray tubes and
began a long career working on electron-beam devices such as oscilloscopes,
televisions, and electron microscopes. It was in the context of electron optics
that he invented the concept of holography, which relied on the wave nature
of electron beams. Gabor did this work while working for a British company,
after fleeing Germany when Hitler came to power. Holography did not become
practical until after the invention of the laser, which provided a bright coherent
light source. (Gabor had attempted to make holograms earlier using a spectral
line from a mercury lamp.) In 1964 the first hologram was produced. Soon
after, holograms became commercially available and were popularized. Gabor
accepted a post as professor of applied physics at the Imperial College
of London from 1958 until he retired in 1967. He was awarded the Nobel prize
in physics in 1971 for the invention of holography. (Wikipedia)

5 See P. W. Milonni and J. H. Eberly, Lasers, Sect. 16.4-16.5 (New York: Wiley, 1988); G. R. Fowles,
Introduction to Modern Optics, 2nd ed., Sect. 5.7 (Toronto: Dover, 1975).

12.4 Holographic Wavefront Reconstruction
To see a holographic image, we re-illuminate film (previously exposed and
developed) with the original reference beam. That is, we send in
$$E_{incident}(\mathbf{r}) = E_{ref}(\mathbf{r}) \qquad (12.9)$$
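and view the light that is transmitted. According to (12.6) and (12.8), the
transmitted field is proportional to
$$E_{transmitted}(\mathbf{r}) \propto I_{film}(\mathbf{r})\,E_{ref}(\mathbf{r}) \propto \left[\left|E_{object}(\mathbf{r})\right|^2 + \left|E_{ref}(\mathbf{r})\right|^2\right]E_{ref}(\mathbf{r}) + \left|E_{ref}(\mathbf{r})\right|^2 E_{object}(\mathbf{r}) + E_{ref}^2(\mathbf{r})\,E_{object}^*(\mathbf{r}) \qquad (12.10)$$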
Example 12.1
Analyze the three field terms in (12.10) for a hologram made from a point object,
as depicted in Fig. 12.7.
Figure 12.7 Exposure to holographic film by a point source and a reference plane
wave. The holographic fringe pattern for a point object and a plane wave reference
beam exposing a flat film is shown on the right.
Solution: Presumably, the point object is illuminated sufficiently brightly so as to
make the scattered light have an intensity similar to the reference beam at the film.
Let the reference plane wave strike the film at normal incidence. Then the reference
field will have constant amplitude and phase across it; call it $E_{ref}$. The field from
the point object can be treated as a spherical wave:
$$E_{object}(\rho) = \frac{E_{ref}\,L}{\sqrt{L^2+\rho^2}}\,e^{ik\sqrt{L^2+\rho^2}} \quad \text{(point source example)} \qquad (12.11)$$
Here $\rho$ represents the radial distance from the center of the film to some other
point on the film. We have taken the amplitude of the object field to match $E_{ref}$ in
the center of the film.
After the film is exposed, developed, and re-illuminated by the reference beam, the
field emerging from the right-hand-side of the film, according to (12.10), becomes
$$E_{transmitted} \propto \left[\frac{\left|E_{ref}\right|^2 L^2}{L^2+\rho^2} + \left|E_{ref}\right|^2\right]E_{ref} + \left|E_{ref}\right|^2\frac{E_{ref}\,L}{\sqrt{L^2+\rho^2}}\,e^{ik\sqrt{L^2+\rho^2}} + E_{ref}^2\,\frac{E_{ref}^*\,L}{\sqrt{L^2+\rho^2}}\,e^{-ik\sqrt{L^2+\rho^2}} \qquad (12.12)$$
We see the three distinct waves that emerge from the holographic film. The first
term in (12.12) represents the plane wave reference beam passing straight through
the film with some variation in amplitude (depicted in Fig. 12.8 (a)). The second
term in (12.12) has the identical form as the field from the original object (aside
from an overall amplitude factor). It describes an outward-expanding spherical
wave, which gives rise to a virtual image at the location of the original point object,
as depicted in Fig. 12.8 (b). The final term in (12.12) corresponds to a converging
spherical wave, which focuses to a point at a distance L from the observer's side of
the screen (depicted in Fig. 12.8 (c)).

[Figure 12.8: (a) reference beam passing through the film; (b) field associated with
the virtual image; (c) field associated with the real image.]
Exercises
P12.4 Consider a diffraction grating as a simple hologram. Let the light from
the object be a plane wave (object placed at infinity) directed onto a
flat film at angle $\theta$. Let the reference beam strike the film at normal
incidence, and take the wavelength to be $\lambda$.
(a) What is the period of the fringes?
(b) Show that when re-illuminated by the reference beam, the three
terms in (12.10) give rise to zero-order and $\pm$1st-order diffraction
(occurring on each side of zero-order).
P12.5 (a) Show that the phase of the real image in (12.12) may be approximated
as $\Phi = -k\rho^2/2L$, aside from a spatially independent overall
phase. Compare with (11.10) and comment.
(b) This hologram is similar to a Fresnel zone plate, sometimes used to
focus extreme ultraviolet light or x-rays, since it is difficult to make a
lens otherwise at those wavelengths.6 Graph the field transmission for
6 Tiny Fresnel zone plates can be made for this purpose using electron-beam lithography.
R51 T or F: The imaging relation $1/f = 1/d_o + 1/d_i$ relies on the paraxial ray
approximation.
R55 T or F: The spacing L between two flat mirrors can be chosen to make
a laser cavity stable.
Review, Chapters 9–12
R59 T or F: The central peak of the Fraunhofer diffraction from two narrow
slits separated by spacing $h$ has the same width as the central
diffraction peak from a single slit with width $\Delta x = h$.
R61 T or F: The array theorem is useful for deriving Fresnel diffraction from
a grating.
Problems
R67 (a) Consider a ray of light emitted from an object, which travels a
distance $d_o$ before traversing a lens of focal length $f$ and then traveling
a distance $d_i$.

Write a vector equation relating $\begin{bmatrix} y_2 \\ \theta_2 \end{bmatrix}$ to $\begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix}$. Be sure to simplify
the equation so that only one ABCD matrix is involved.

HINT: $\begin{bmatrix} 1 & 0 \\ -1/f & 1 \end{bmatrix}$, $\begin{bmatrix} 1 & d \\ 0 & 1 \end{bmatrix}$

[Figure 12.10: object and image]

(b) Explain the requirement on the ABCD matrix in part (a) that ensures
that an image appears for the distances chosen. From this requirement,
extract a familiar constraint on $d_o$ and $d_i$. Also, make a reasonable
definition for magnification $M$ in terms of $y_1$ and $y_2$, then substitute to
find $M$ in terms of $d_o$ and $d_i$.
(c) A telescope is formed with two thin lenses separated by the sum of
their focal lengths $f_1$ and $f_2$. The purpose of a telescope is to enlarge
the apparent angle between points in the distant field of view. All rays
entering the telescope with angle $\theta_1$ are mapped into a (presumably)
larger angle $\theta_2$.
Give a sensible definition for angular magnification in terms of $\theta_1$
and $\theta_2$ and use the ABCD-matrix formulation to derive the angular
magnification of the telescope in terms of $f_1$ and $f_2$.

[Figure 12.11]
R68 (a) Show that a system represented by a matrix $\begin{bmatrix} A & B \\ C & D \end{bmatrix}$ (beginning
and ending in the same index of refraction) can be made to look like
the matrix for a thin lens if suitable distances $p_1$ and $p_2$ are appended
before and after the ABCD system.

HINT: $\begin{vmatrix} A & B \\ C & D \end{vmatrix} = 1$.

(b) Where are the principal planes located and what is the effective
focal length for two identical thin lenses with focal lengths $f$ that are
separated by a distance $d = f$ (see Fig. 12.12)?
[Figure 12.12]

R69 Derive the on-axis intensity (i.e. $x, y = 0$) of a Gaussian laser beam if
you know that at $z = 0$ the electric field of the beam is

$$E(\rho', z=0) = E_0\, e^{-\rho'^2/w_0^2}$$

Fresnel approximation:

$$E(x,y,z) \cong -\frac{i\, e^{ikz} e^{i\frac{k}{2z}(x^2+y^2)}}{\lambda z} \iint E(x',y',0)\, e^{i\frac{k}{2z}(x'^2+y'^2)}\, e^{-i\frac{k}{z}(xx'+yy')}\, dx'\, dy'$$

$$\int_{-\infty}^{\infty} e^{-Ax^2+Bx+C}\, dx = \sqrt{\frac{\pi}{A}}\; e^{\frac{B^2}{4A}+C}.$$
R70 (a) You decide to construct a simple laser cavity with a flat mirror and
another mirror having concave curvature of $R = 100$ cm. What is the
longest possible stable cavity that you can make?
HINT: Sylvester's theorem is

$$\begin{bmatrix} A & B \\ C & D \end{bmatrix}^N = \frac{1}{\sin\theta} \begin{bmatrix} A\sin N\theta - \sin(N-1)\theta & B \sin N\theta \\ C \sin N\theta & D\sin N\theta - \sin(N-1)\theta \end{bmatrix}$$

(b) The amplifier is YLF crystal, which lases at $\lambda = 1054$ nm. You decide
to make the cavity 10 cm shorter than the longest possible (i.e. found in
part (a)). What is the value of $w_0$, and where is the beam waist located
inside the cavity (the place we assign to $z = 0$)?
HINT: For a mode to exist in a laser cavity, the radius of curvature of
each of the end mirrors matches the radius of curvature $R(z)$ of the
beam at that location.

$$E(\rho, z) = E_0 \frac{w_0}{w(z)}\, e^{-\frac{\rho^2}{w^2(z)}}\, e^{ikz + i\frac{k\rho^2}{2R(z)}}\, e^{-i\tan^{-1}\frac{z}{z_0}}$$

$$\rho^2 \equiv x^2 + y^2, \qquad w(z) \equiv w_0\sqrt{1 + z^2/z_0^2}, \qquad R(z) \equiv z + z_0^2/z, \qquad z_0 \equiv \frac{k w_0^2}{2}$$
R71 (a) Compute the Fraunhofer diffraction intensity pattern for a uniformly
illuminated circular aperture with diameter $D$.

HINT:

$$E(x,y,z) \cong -\frac{i\, e^{ikz} e^{i\frac{k}{2z}(x^2+y^2)}}{\lambda z} \iint E(x',y',0)\, e^{-i\frac{k}{z}(xx'+yy')}\, dx'\, dy',$$

$$J_0(\alpha) = \frac{1}{2\pi}\int_0^{2\pi} e^{\pm i\alpha\cos(\theta-\theta')}\, d\theta', \qquad \int_0^a J_0(bx)\, x\, dx = \frac{a}{b} J_1(ab)$$
R72 (a) Derive the Fraunhofer diffraction pattern for the field from a uniformly
illuminated single slit with width $\Delta x$. (Don't worry about the
y-dimension.)
(b) Find the Fraunhofer intensity pattern for a grating with $N$ slits
of width $\Delta x$ positioned on the mask at $x'_n = h\left(n - \frac{N+1}{2}\right)$ so that the

$$\sum_{n=1}^{N} r^n = r\,\frac{r^N - 1}{r - 1}$$
(c) Consider Fraunhofer diffraction from the grating in part (b). The
grating is 5.0 cm wide and is uniformly illuminated. For best resolution
in a monochromator with a 50 cm focal length, what should the width
of the exit slit be? Assume $\lambda = 500$ nm.
the field at the center of the focus found in part (a), and the width is
$w_0 = 2\lambda f^{\#}/\pi$ with $f^{\#} \equiv f/D$. Fig. 12.14 shows how well this Gaussian
approximation fits the actual curve. We have assumed that the first
aperture is a distance $f$ before the lens so that at the focus after the
lens the wave front is flat. To avoid integration, you may want to use
the field provided in R70 and take the far-field limit: $z \gg z_0$.

Figure 12.14 Diffraction pattern from a circular aperture (solid) that is chopped by a
pinhole to remove the diffraction rings (dotted). A Gaussian field (dashed) approximates
the center portion that transmits through the pinhole.

Selected Answers
R70: (a) 100 cm (b) 0.32 mm.
R71: (b) $4.8 \times 10^8$ km.
R72: (c) 5 μm.
Chapter 13
Blackbody Radiation
Hot objects glow. In 1860, Kirchhoff proposed that the radiation emitted by hot
objects as a function of frequency is approximately the same for all materials.1
The notion that all materials behave similarly led to the concept of an ideal
blackbody radiator. Most materials have a certain shininess that causes light to
reflect or scatter in addition to being absorbed and reemitted. However, light
that falls upon an ideal blackbody is absorbed perfectly before the possibility of
reemission, hence the name blackbody.
The distribution of frequencies emitted by a blackbody radiator is related
to its temperature. We often consider a blackbody radiator that is in thermal
equilibrium with the surrounding light that is absorbed and reemitted. If it is
not in thermal equilibrium, for example, if more light is emitted than absorbed,
then the object inevitably cools as light escapes to the environment, moving the
system toward thermal equilibrium.
The Sun is a good example of a blackbody radiator. The light emitted from the
Sun is associated with its surface temperature. Any light that arrives to the Sun
from outer space is virtually 100% absorbed, however little light that might be, so
the name blackbody aptly describes it. Mostly, light escapes to the much colder
surrounding space (i.e. it is not in thermal equilibrium), and the temperature of
the Sun's surface is maintained by the fusion process within. As another example,
a glowing tungsten filament in an ordinary light bulb may be reasonably described
as a blackbody radiator. However, surface reflections make it less than ideal both
for absorption and emission.

Gustav Kirchhoff (1824–1887, German) was born in Königsberg, the son of a lawyer.
Kirchhoff attended the University of Königsberg. While still a student, he developed
what are now called Kirchhoff's law for electrical circuits. During his career, Kirchhoff
was a professor in Breslau, Heidelberg, and finally Berlin. Kirchhoff was one of the first
to study the spectra emitted by various objects when heated. Not coincidentally, his
colleague in Heidelberg was Robert Bunsen, inventor of the Bunsen burner. Kirchhoff
coined the term blackbody radiation. He demonstrated that an excited gas gives off a
discrete spectrum, and that an unexcited gas surrounding a blackbody emitter produces
dark lines in the blackbody spectrum. Together Kirchhoff and Bunsen discovered caesium
and rubidium. Later in his career, Kirchhoff showed how to derive Fresnel's diffraction
formula starting from the wave equation. (Wikipedia)

1 An important exception is atomic vapors, which have relatively few discrete spectral lines.
However, Kirchhoff's assumption holds quite well for most solids, which are sufficiently complex.

Experimentally, a near perfect blackbody radiator can be constructed from
a hollow object. An example is shown in Fig. 13.1. As the interior of the object
is heated, the light present inside the internal cavity is in equilibrium with the
glowing walls. A small hole can be drilled through the wall to observe the radiation
inside without significantly disturbing the system. The observation hole can be
thought of as a perfect blackbody since any light entering the hole from the
outside is eventually absorbed (before being potentially reemitted), if not on the
plification.
13.2 Failure of the Equipartition Principle 319
Within the enclosed cavity, light travels at speed c isotropically in all directions. A
factor of 1/2 arises because only half of the energy travels towards the hole from
within the cavity as opposed to away. The remaining factor of 1/2 occurs because
the light emerging from the hole is directionally distributed over a hemisphere,
rather than flowing in the direction of the surface normal $\hat{\mathbf{n}}$. The average over the
hemisphere is carried out as follows:

$$\frac{\int_0^{2\pi} d\phi \int_0^{\pi/2} \hat{\mathbf{r}}\cdot\hat{\mathbf{n}}\, \sin\theta\, d\theta}{\int_0^{2\pi} d\phi \int_0^{\pi/2} \sin\theta\, d\theta}
= \frac{\int_0^{2\pi} d\phi \int_0^{\pi/2} \cos\theta \sin\theta\, d\theta}{\int_0^{2\pi} d\phi \int_0^{\pi/2} \sin\theta\, d\theta} = \frac{1}{2} \tag{13.3}$$
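The hemisphere average in (13.3) can be checked numerically. The following is a minimal sketch, not part of the original text: the function name `hemisphere_average` and the step count are illustrative choices, and a simple midpoint rule stands in for the analytic integral.

```python
import math

def hemisphere_average(f, steps=20000):
    """Average f(theta) over a hemisphere, weighted by the solid-angle factor sin(theta).

    The azimuthal integral contributes the same factor 2*pi to the numerator and
    the denominator, so it cancels and is omitted here.
    """
    d_theta = (math.pi / 2) / steps
    numerator = 0.0
    denominator = 0.0
    for i in range(steps):
        theta = (i + 0.5) * d_theta          # midpoint of the i-th slice
        weight = math.sin(theta) * d_theta   # solid-angle weight
        numerator += f(theta) * weight
        denominator += weight
    return numerator / denominator

# r-hat dot n-hat equals cos(theta) when theta is measured from the surface normal.
average_cos = hemisphere_average(math.cos)
print(f"hemisphere average of cos(theta) = {average_cos:.6f}")
```

Together with the factor of 1/2 for the light traveling toward the hole, this average accounts for the overall factor of 1/4 discussed above.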
Although (13.1) describes the total intensity of the light that leaves a blackbody
surface, it does not describe what frequencies make up the radiation field. This
frequency distribution was not fully described for another two decades, when
Max Planck developed his famous formula. Planck was first to arrive at the correct
formula for the spectrum of blackbody radiation, building on the work of others,
most notably Wien, who came very close. At first, Planck tweaked Wien's formula
to match newly available experimental data. When he attempted to explain
it, he was forced to introduce the concept of light quanta. Even Planck was
uncomfortable with and perhaps disbelieved the assumption that his formula
implied, but he deserves credit for recognizing and articulating it.
potential energy). The problem then reduces to that of finding the number of
unique modes for the radiation at each frequency.5 The idea is that requiring each
mode of electromagnetic energy to hold energy k B T should reveal the spectral
shape of blackbody radiation.
where each component of the wave number in any of the three dimensions is an
integer times

$$k_0 = 2\pi/L \tag{13.5}$$

Considering a box of size $L$ does not artificially restrict our analysis, since we may
later take the limit $L \to \infty$ so that our box represents the entire universe. Moreover,
$L$ will naturally disappear from our calculation when we later consider the density
of modes.
Figure 13.3 The volume of a thin spherical shell in $n$, $m$, $\ell$ space.

We can think of a given wave number $k$ as specifying the equation of a sphere in a
coordinate system with axes labeled $n$, $m$, and $\ell$:

$$n^2 + m^2 + \ell^2 = \left(\frac{k}{k_0}\right)^2 \tag{13.6}$$

The fact that the integers $n$, $m$, and $\ell$ range over both positive and negative values
automatically takes into account that the field may travel in the forwards or the
backwards direction.
We need to know how many more ways there are to choose $n$, $m$, and $\ell$ when the
wave number $k/k_0$ increases to $(k + dk)/k_0$. The answer is the difference in the
volume of the two spheres shown in Fig. 13.3:

$$\#\ \text{modes in}\ (k, k+dk) = 4\pi \frac{k^2}{k_0^2} \frac{dk}{k_0} \tag{13.7}$$

This is the number of terms in (13.4) associated with a wave number between $k$
and $k + dk$.
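The counting argument behind (13.7) can be illustrated by brute force: count the integer triples that fall inside a spherical shell and compare with the shell volume. This is a rough sketch under stated assumptions; the radius and thickness (in units of $k_0$) are arbitrary illustrative values, and the function names are hypothetical.

```python
import math

def count_modes_in_shell(radius, thickness):
    """Count integer triples (n, m, l) with radius <= sqrt(n^2 + m^2 + l^2) < radius + thickness."""
    outer = radius + thickness
    limit = int(math.ceil(outer))
    inner_sq = radius * radius
    outer_sq = outer * outer
    count = 0
    for n in range(-limit, limit + 1):
        for m in range(-limit, limit + 1):
            for l in range(-limit, limit + 1):
                r_sq = n * n + m * m + l * l
                if inner_sq <= r_sq < outer_sq:
                    count += 1
    return count

def shell_volume(radius, thickness):
    """Exact volume between two concentric spheres."""
    return 4.0 * math.pi / 3.0 * ((radius + thickness) ** 3 - radius ** 3)

radius, thickness = 30.0, 2.0          # k/k0 and dk/k0 (illustrative values)
counted = count_modes_in_shell(radius, thickness)
exact = shell_volume(radius, thickness)
thin_shell = 4.0 * math.pi * radius ** 2 * thickness   # the form used in (13.7)

print(f"lattice points counted : {counted}")
print(f"exact shell volume     : {exact:.1f}")
print(f"thin-shell estimate    : {thin_shell:.1f}")
```

The counted value tracks the exact shell volume closely; the thin-shell expression used in (13.7) approaches the same number as the thickness becomes small compared with the radius.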
5 See O. Svelto, Principles of Lasers, 4th ed., translated by D. C. Hanna, Sect. 2.2.1 (New York:
where the extra factor of 2 accounts for two independent polarizations, not speci-
fied in (13.4). As anticipated, the dependence on L has disappeared from (13.8)
after substituting from (13.5).
We can immediately see that (13.8) disagrees drastically with the Stefan-
Boltzmann law (13.2), since (13.8) is proportional to temperature rather than
to its fourth power. In addition, the integral in (13.8) is seen to diverge, meaning
that regardless of the temperature, the light carries infinite energy density! This
has since been named the ultraviolet catastrophe since the divergence occurs
on the short wavelength end of the spectrum. This is a clear failure of classical
physics to explain blackbody radiation. Nevertheless, Rayleigh emphasized the
fact that his formula works well for the longer wavelengths.
It is instructive to make the change of variables $k = \omega/c$ in the integral to write

$$u_{\text{field}} = k_B T \int_0^\infty \frac{\omega^2}{\pi^2 c^3}\, d\omega \tag{13.9}$$

The important factor $\omega^2/\pi^2 c^3$ can now be understood to be the number of modes
per frequency. Then (13.9) is rewritten as

$$u_{\text{field}} = \int_0^\infty \rho(\omega)\, d\omega \tag{13.10}$$

where

$$\rho_{\text{Rayleigh-Jeans}}(\omega) = k_B T \frac{\omega^2}{\pi^2 c^3} \tag{13.11}$$

describes (incorrectly) the spectral energy density of the radiation field associated
with blackbody radiation.

James Jeans (1877–1946, English) was born in Ormskirk, England. He attended Cambridge
University and later taught there for most of his career. He also taught at Princeton
University for a number of years. One of his major contributions was the development of
Jeans length, the critical radius for interstellar clouds, which determines whether
a cloud will collapse to form a star. In his later career, Jeans became somewhat well
known to the public for his lay-audience books highlighting scientific advances, in
particular relativity and cosmology. (Wikipedia)
became available over a fairly wide wavelength range. In keeping with Kirchhoff's
notion of an ideal blackbody radiator, the results were observed to be indepen-
dent of the material for most solids. The intensity per frequency depended only
on temperature and when integrated over all frequencies agreed with the Stefan-
Boltzmann law (13.1).
In 1896, Wilhelm Wien considered the known physical and mathematical
constraints on the spectrum of blackbody radiation and proposed a spectral
function that seemed to work:8

$$\rho_{\text{Wien}}(\omega) = \frac{\hbar\omega^3 e^{-\hbar\omega/k_B T}}{\pi^2 c^3} \tag{13.12}$$

An important feature of (13.12) is that it gives a result proportional to $T^4$ when
integrated over all frequency (i.e. the Stefan-Boltzmann law).
Wilhelm Wien (1864–1928, German) was born in Gaffken, Prussia (now Primorsk, Russia).
As a teenager, he attended schools in Rastenburg and then Heidelberg. He later attended
the University of Göttingen and then the University of Berlin. In 1886, he received
his Ph.D. after working under Hermann von Helmholtz where he studied the influence of
materials on the color of light. In 1896 Wien developed an empirical formula for the
spectral distribution of blackbody radiation. He collaborated with Planck, who gave the
law a foundation in electromagnetic and thermodynamic theory. Planck later improved the
formula, whereupon it became known by his name. However, Wien's formula for the peak
wavelength of the blackbody curve, called Wien's displacement law, remains valid. In
1898, Wien identified a positive particle equal in mass to the hydrogen atom, which was
later named the proton. Wien received the Nobel prize in 1911 for his work on heat and
radiation. (Wikipedia)

Wien's formula did a fairly good job of fitting the experimental data. However,
in 1900 Lummer and Pringsheim, colleagues of Max Planck, reported experimental
data that deviated from the Wien distribution at long wavelengths (infrared).
Planck was privy to this information early on and introduced a modest revision
to Wien's formula that fit the data beautifully everywhere:

$$\rho_{\text{Planck}}(\omega) = \frac{\hbar\omega^3}{\pi^2 c^3 \left(e^{\hbar\omega/k_B T} - 1\right)} \tag{13.13}$$

where $\hbar = 1.054 \times 10^{-34}$ J·s is an experimentally determined constant.9
Figure 13.4 shows the Planck spectral distribution curve together with the
Rayleigh-Jeans curve (13.11) and the Wien curve (13.12). As is apparent, the Wien
distribution does a good job nearly everywhere. However, at long wavelengths
it was off by just enough for the experimentalists to notice that something was
wrong.
At this point, it may seem fair to ask, "What did Planck do that was so great?"
After all, he simply guessed a function that was only a slight modification of
Wien's distribution. And he knew the answer from the back of the book, namely
Lummer's and Pringsheim's well-done experimental results. (At the time, Planck
was unaware of the work by Rayleigh.)
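The relationship among the three spectral formulas (13.11), (13.12), and (13.13) can be explored numerically. The sketch below is an illustration rather than anything from the original text: it assumes the dimensionless variable $x = \hbar\omega/k_B T$ and scales each density by $\pi^2 c^3 \hbar^2/(k_B T)^3$, so that no physical constants are needed. The function names are hypothetical.

```python
import math

# Each function returns rho(omega) * pi^2 c^3 hbar^2 / (k_B T)^3
# as a function of x = hbar * omega / (k_B T).

def rayleigh_jeans(x):
    return x ** 2

def wien(x):
    return x ** 3 * math.exp(-x)

def planck(x):
    return x ** 3 / math.expm1(x)   # expm1(x) = e^x - 1, accurate for small x

print(f"{'x':>6} {'Rayleigh-Jeans':>15} {'Wien':>12} {'Planck':>12}")
for x in (0.01, 0.1, 1.0, 5.0, 10.0):
    print(f"{x:>6} {rayleigh_jeans(x):>15.5g} {wien(x):>12.5g} {planck(x):>12.5g}")

low_ratio = planck(0.01) / rayleigh_jeans(0.01)   # low-frequency (long-wavelength) limit
high_ratio = wien(10.0) / planck(10.0)            # high-frequency (short-wavelength) limit
```

In this scaling, the Planck curve approaches the Rayleigh-Jeans form for small $x$ and the Wien form for large $x$, while the Wien form falls below Planck at small $x$, which is the long-wavelength discrepancy described above.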
Planck gets well-deserved credit for interpreting the meaning of his new
formula. His interpretation was what he called an "act of desperation." He did
not necessarily believe in the implications of his formula; in fact, he presented
them somewhat apologetically. It was several years later that the young Einstein
published his paper explaining the photoelectric effect in the context of Planck's
work.
Planck's insight was an enormous step toward understanding the quantum
nature of light. Nevertheless, it took another three decades to develop a more

Figure 13.4 Energy density per frequency according to Planck, Wien, and Rayleigh-Jeans.

8 The constant $\hbar$ had not yet been introduced by Planck. The actual way that Wien wrote his
distribution was $\rho_{\text{Wien}}(\omega) = a\omega^3 e^{-b\omega/T}$, where $a$ and $b$ were parameters used to fit the data.
9 Planck's constant was first introduced as $h = 6.626 \times 10^{-34}$ J·s, convenient for working with
frequency $\nu$, expressed in Hz. It is common to write $\hbar \equiv h/2\pi$ when working with frequency $\omega$,
expressed in rad/s.

13.3 Planck's Formula
Max Planck (1858–1947, German) was born in Kiel, the sixth child in his family.
His father was a law professor. When Max was about nine years old, his family
moved to Munich where he attended gymnasium. A mathematician, Herman Muller,
took an interest in his schooling and tutored him in mechanics and astronomy.
Planck was a gifted musician, but he decided to pursue a career in physics. At age 16
he enrolled in the University of Munich. By age 22, he had finished his doctoral
dissertation and habilitation thesis. He was initially ignored by the academic community
and worked for a time as an unpaid lecturer. He became an associate professor
of theoretical physics at the University of Kiel and then a few years later took
over Kirchhoff's post at the University of Berlin. After nearly twenty years of
idyllic and happy family life, a series of tragedies hit the Planck household.
Planck's first wife, and mother of four, died. Then his eldest son was killed
in action during World War I. Soon after, his twin daughters each died giving
birth to their first child. Later Planck's remaining son from his first marriage
was executed for participating in a failed attempt to assassinate Hitler. Planck
won the Nobel prize in 1918 for his introduction of energy quanta, but he had
serious reservations about the course that quantum mechanics theory took.
(Wikipedia)

The Boltzmann factor can be normalized by dividing by the sum of all such factors
to obtain the probability of having energy $n\hbar\omega$ in a particular mode:

$$P_n = \frac{e^{-n\hbar\omega/k_B T}}{\sum_{m=0}^{\infty} e^{-m\hbar\omega/k_B T}} = e^{-n\hbar\omega/k_B T}\left[1 - e^{-\hbar\omega/k_B T}\right] \tag{13.14}$$

We used (0.66) to accomplish the above sum, which is a geometric series.
The expected energy in a particular mode of the field is the sum of each possible
energy level (i.e. $n\hbar\omega$) times the probability of its occurrence:

$$\begin{aligned}
\sum_{n=0}^{\infty} n\hbar\omega P_n &= \hbar\omega\left[1 - e^{-\hbar\omega/k_B T}\right] \sum_{n=0}^{\infty} n\, e^{-n\hbar\omega/k_B T} \\
&= -\hbar\omega\left[1 - e^{-\hbar\omega/k_B T}\right] \frac{\partial}{\partial(\hbar\omega/k_B T)} \sum_{n=0}^{\infty} e^{-n\hbar\omega/k_B T} \\
&= -\hbar\omega\left[1 - e^{-\hbar\omega/k_B T}\right] \frac{\partial}{\partial(\hbar\omega/k_B T)} \frac{1}{1 - e^{-\hbar\omega/k_B T}} \\
&= \frac{\hbar\omega}{e^{\hbar\omega/k_B T} - 1}
\end{aligned} \tag{13.15}$$

We used (0.66) again as well as a clever derivative trick.

Equation (13.15) provides the expected energy in any of the modes of the radiation
field, as dictated by Planck's assumption. To obtain the Planck distribution
(13.13), we replace $k_B T$ in the Rayleigh-Jeans formula (13.10) with the correct
expected energy (13.15).10

10 See O. Svelto, Principles of Lasers, 4th ed., translated by D. C. Hanna, Sect. 2.2.2 (New York:

It is interesting that we are now able to derive the constant in the Stefan-Boltzmann
law (13.2) in terms of Planck's constant (see P13.3). The Stefan-Boltzmann law is
obtained by integrating the spectral density function (13.13)
over all frequencies to obtain the total field energy density, which is in thermal
equilibrium with the blackbody radiator:
$$u_{\text{field}} = \int_0^\infty \rho_{\text{Planck}}(\omega)\, d\omega = \frac{4}{c}\, \frac{\pi^2 k_B^4}{60 c^2 \hbar^3}\, T^4 = \frac{4\sigma}{c}\, T^4 \tag{13.16}$$

Since Planck's constant was not introduced until a couple decades after the Stefan-Boltzmann
law was developed, one might more appropriately say that the Stefan-Boltzmann
constant pins down Planck's constant.
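Two steps in this section lend themselves to a quick numerical check: the expected number of quanta per mode implied by (13.14) and (13.15), and the constant that appears in (13.16). The sketch below is illustrative only. The SI values of $\hbar$, $k_B$, and $c$ are standard reference values supplied here as assumptions (the text quotes only $\hbar$), and the function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34   # J s      (assumed standard value)
K_B = 1.380649e-23       # J / K    (assumed standard value)
C = 2.99792458e8         # m / s    (assumed standard value)

def mean_quanta_direct(x, terms=5000):
    """Sum n * P_n directly, with P_n from (13.14) and x = hbar*omega/(k_B T)."""
    norm = 1.0 - math.exp(-x)
    return sum(n * math.exp(-n * x) * norm for n in range(terms))

def mean_quanta_closed(x):
    """Closed form from (13.15), divided by hbar*omega."""
    return 1.0 / math.expm1(x)

def planck_integral(upper=60.0, steps=60000):
    """Midpoint estimate of the integral of x^3/(e^x - 1) from 0 to infinity.

    The integrand is negligible beyond `upper`, so the tail is dropped.
    """
    dx = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        total += x ** 3 / math.expm1(x) * dx
    return total

direct = mean_quanta_direct(0.5)
closed = mean_quanta_closed(0.5)

dimensionless = planck_integral()   # analytic value is pi^4 / 15
sigma_closed = math.pi ** 2 * K_B ** 4 / (60.0 * C ** 2 * HBAR ** 3)
sigma_numeric = K_B ** 4 * dimensionless / (4.0 * math.pi ** 2 * C ** 2 * HBAR ** 3)

print(f"<n> direct sum = {direct:.9f}, closed form = {closed:.9f}")
print(f"sigma (closed form)   = {sigma_closed:.6e} W m^-2 K^-4")
print(f"sigma (numerical int) = {sigma_numeric:.6e} W m^-2 K^-4")
```

The second expression follows from substituting $x = \hbar\omega/k_B T$ in (13.13) and comparing the result with the form $4\sigma T^4/c$ in (13.16).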
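Figure 13.5 Blackbody spectrum (13.17) plotted for the surface temperature of three
stars: Sirius (9900 K), the Sun (5750 K), and Betelgeuse (3300 K).

Example 13.1
Determine $\rho_{\text{Planck}}(\lambda)$ such that

$$u_{\text{field}} = \int_0^\infty \rho_{\text{Planck}}(\omega)\, d\omega = \int_0^\infty \rho_{\text{Planck}}(\lambda)\, d\lambda$$

where $\rho_{\text{Planck}}(\omega)$ and $\rho_{\text{Planck}}(\lambda)$ represent distinct functions distinguished by their
arguments.

Solution: The change of variables $\omega = 2\pi c/\lambda \Rightarrow d\omega = -2\pi c\, d\lambda/\lambda^2$ gives

$$u_{\text{field}} = -2\pi c \int_\infty^0 \frac{\hbar (2\pi c/\lambda)^3}{\pi^2 c^3 \left(e^{\hbar(2\pi c/\lambda)/k_B T} - 1\right)} \frac{d\lambda}{\lambda^2} = \int_0^\infty \frac{16\pi^2 \hbar c/\lambda^5}{e^{2\pi\hbar c/\lambda k_B T} - 1}\, d\lambda$$

Figure 13.6 Blackbody spectrum [curves labeled 330 K, 310 K, 294 K, 273 K, 184 K]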
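$$\begin{aligned}
\dot{N}_1 &= A_{21} N_2 - B_{12}\,\rho(\omega)\, N_1 + B_{21}\,\rho(\omega)\, N_2, \\
\dot{N}_2 &= -A_{21} N_2 + B_{12}\,\rho(\omega)\, N_1 - B_{21}\,\rho(\omega)\, N_2
\end{aligned} \tag{13.18}$$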
$$P = u_{\text{field}}/3 \tag{13.24}$$

on the walls of the container. This can be derived from the fact that radiation of
energy $\Delta E$ imparts a momentum

$$\Delta p = \frac{\Delta E}{c}\cos\theta \tag{13.25}$$
Derivation of (13.24)
13 See P. W. Milonni, The Quantum Vacuum An Introduction to Quantum Electrodynamics, Sect.
Consider a thin layer of space adjacent to a container wall with area $A$. If the layer
has thickness $\Delta z$, then the volume in the layer is $A\Delta z$. Half of the radiation inside
the layer flows toward the wall, where it is absorbed. The total energy in the layer
that will be absorbed is then $\Delta E = (A\Delta z)\,u_{\text{field}}/2$, which arrives during the interval
$\Delta t = \Delta z/(c\cos\theta)$, assuming for the moment that all light is directed with angle $\theta$;
we must average the angle of light propagation over a hemisphere.
The pressure on the wall due to absorption (i.e. force or $dp/dt$ per area) is then

$$P_{\text{abs}} = \frac{\int_0^{2\pi} d\phi \int_0^{\pi/2} \frac{\Delta p}{\Delta t}\frac{1}{A}\, \sin\theta\, d\theta}{\int_0^{2\pi} d\phi \int_0^{\pi/2} \sin\theta\, d\theta}
= \frac{u_{\text{field}}}{2} \int_0^{\pi/2} \cos^2\theta \sin\theta\, d\theta = \frac{u_{\text{field}}}{6} \tag{13.26}$$

In equilibrium, an equal amount of radiation is also emitted from the wall. This
gives an additional pressure $P_{\text{emit}} = P_{\text{abs}}$, which confirms that the total pressure is
given by (13.24).
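As a rough numerical illustration of (13.24) and (13.26), the sketch below evaluates the hemisphere average of $\cos^2\theta$ and then estimates the radiation pressure for a field in equilibrium at 5750 K, the solar surface temperature quoted in the exercises. The relation $u_{\text{field}} = 4\sigma T^4/c$ is taken from (13.16); the numerical value of $\sigma$ is a standard reference value supplied here as an assumption, and the function names are hypothetical.

```python
import math

SIGMA = 5.670374e-8   # W m^-2 K^-4, Stefan-Boltzmann constant (assumed standard value)
C = 2.99792458e8      # m / s

def hemisphere_average_cos2(steps=20000):
    """Average cos^2(theta) over a hemisphere with solid-angle weight sin(theta)."""
    d_theta = (math.pi / 2) / steps
    numerator = 0.0
    denominator = 0.0
    for i in range(steps):
        theta = (i + 0.5) * d_theta
        weight = math.sin(theta) * d_theta
        numerator += math.cos(theta) ** 2 * weight
        denominator += weight
    return numerator / denominator

def radiation_pressure(temperature_kelvin):
    """Total pressure P = u_field / 3, with u_field = 4 sigma T^4 / c."""
    u_field = 4.0 * SIGMA * temperature_kelvin ** 4 / C
    return u_field / 3.0

average_cos2 = hemisphere_average_cos2()
absorbed_fraction = average_cos2 / 2.0      # P_abs / u_field, expected to be 1/6
pressure_sun = radiation_pressure(5750.0)   # in pascals

print(f"hemisphere average of cos^2(theta) = {average_cos2:.6f}")
print(f"P_abs / u_field = {absorbed_fraction:.6f}")
print(f"radiation pressure at 5750 K = {pressure_sun:.4f} Pa")
```

The resulting pressure is a fraction of a pascal, tiny compared with atmospheric pressure, which is why radiation pressure goes unnoticed in everyday settings.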
Figure 13.8 Field inside a black-
body radiator.
We derive the Stefan-Boltzmann law using the concept of entropy, which is
defined in differential form by the quantity

$$dS \equiv \frac{dQ}{T} \tag{13.27}$$

where $dQ$ is the injection of heat (or energy) into the radiation field in the box
and $T$ is the temperature at which that injection takes place. We would like to
write $dQ$ in terms of $u_{\text{field}}$, $V$, and $T$. Then we may invoke the fact that $S$ is a state
variable, which implies

$$\frac{\partial^2 S}{\partial T\, \partial V} = \frac{\partial^2 S}{\partial V\, \partial T} \tag{13.28}$$

This is a mathematical statement of the fact that $S$ is fully defined if the internal
energy, temperature, and volume of a system are specified. That is, $S$ does not
depend on past temperature and volume history; it is dictated by the present
state of the system.
To obtain $dQ$ in the form that we need, we can use the 1st law of thermodynamics.
It states that a change in internal energy $dU = d(u_{\text{field}}V)$ can take place
by the injection of heat $dQ$ or by doing work $dW = P\,dV$ as the volume increases:

$$\begin{aligned}
dQ &= dU + P\,dV = d(u_{\text{field}}V) + P\,dV \\
&= V\,du_{\text{field}} + u_{\text{field}}\,dV + \frac{1}{3}u_{\text{field}}\,dV \\
&= V\frac{du_{\text{field}}}{dT}\,dT + \frac{4}{3}u_{\text{field}}\,dV
\end{aligned} \tag{13.29}$$

We have used energy density times volume to obtain the total energy $U$ in the radiation
field in the box. We have also used (13.24) to obtain the work accomplished
by pressure as the volume changes.
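$$dS = \frac{V}{T}\frac{du_{\text{field}}}{dT}\,dT + \frac{4u_{\text{field}}}{3T}\,dV \tag{13.30}$$

When we differentiate (13.30) with respect to temperature or volume we get

$$\begin{aligned}
\frac{\partial S}{\partial V} &= \frac{4u_{\text{field}}}{3T} \\
\frac{\partial S}{\partial T} &= \frac{V}{T}\frac{du_{\text{field}}}{dT}
\end{aligned} \tag{13.31}$$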
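The consistency condition (13.28) can be tested numerically against the two derivatives in (13.31). The sketch below is an illustration under an assumed trial form $u_{\text{field}} = aT^p$ (the trial form, the function name, and the finite-difference step are not from the text); it shows that the two mixed partial derivatives agree only for $p = 4$, which is the Stefan-Boltzmann temperature dependence.

```python
def mixed_partial_mismatch(power, T=300.0, V=1.0, a=1.0, h=1e-3):
    """Relative mismatch between the two mixed partials of S for u_field = a * T**power.

    Central finite differences stand in for the analytic derivatives.
    """
    def u(temp):
        return a * temp ** power

    def du_dT(temp):
        return (u(temp + h) - u(temp - h)) / (2.0 * h)

    def dS_dV(temp, vol):
        return 4.0 * u(temp) / (3.0 * temp)      # first line of (13.31)

    def dS_dT(temp, vol):
        return vol / temp * du_dT(temp)          # second line of (13.31)

    d_dT_of_dS_dV = (dS_dV(T + h, V) - dS_dV(T - h, V)) / (2.0 * h)
    d_dV_of_dS_dT = (dS_dT(T, V + h) - dS_dT(T, V - h)) / (2.0 * h)
    return (d_dT_of_dS_dV - d_dV_of_dS_dT) / d_dV_of_dS_dT

mismatch_by_power = {p: mixed_partial_mismatch(p) for p in (3, 4, 5)}
for p, mismatch in mismatch_by_power.items():
    print(f"u ~ T^{p}: relative mismatch of mixed partials = {mismatch:+.3e}")
```

Analytically, equating the two mixed partials gives $\frac{4}{3}(p-1) = p$, whose only solution is $p = 4$.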
We are now able to evaluate the partial derivatives in (13.28), which give
which depends on the number of configurations n obj for a given state (defined,
for example, by fixed energy and volume). Now imagine that the object is placed
in contact with a very large thermal reservoir. The object could be the electro-
magnetic radiation inside a hollow blackbody apparatus, and the reservoir could
be the walls of the apparatus, capable of holding far more energy than the light
field can hold. The condition for thermal equilibrium between the object and the
reservoir is
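$$\frac{\partial S_{\text{obj}}}{\partial U_{\text{obj}}} = \frac{\partial S_{\text{res}}}{\partial U_{\text{res}}} \equiv \frac{1}{T} \tag{13.35}$$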
where temperature has been introduced as a definition, which is consistent with
(13.27).
The total number of configurations for the combined system is N = n obj n res ,
where n obj and n res are the number of configurations available within the object
and the reservoir separately. A thermodynamic principle is that all possible
13.B Boltzmann Factor 329
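$$P \propto \frac{N}{n_{\text{obj}}} = n_{\text{res}} = e^{S_{\text{res}}/k_B} \tag{13.36}$$

$$S_{\text{res}}(U_{\text{res}}) = S_{\text{res}}\left(U_{\text{res}}^{\text{eq}}\right) + \left.\frac{\partial S_{\text{res}}}{\partial U_{\text{res}}}\right|_{U_{\text{res}}^{\text{eq}}} \left(U_{\text{res}} - U_{\text{res}}^{\text{eq}}\right) + \ldots \tag{13.37}$$

Higher order terms are not needed since we assume the reservoir to be very large
so that it is disturbed only slightly by variations in the object. Since the overall
energy of the system is fixed, we may write

$$U_{\text{res}} - U_{\text{res}}^{\text{eq}} = \Delta U_{\text{res}} = -\Delta U_{\text{obj}} \tag{13.38}$$

where $\Delta U_{\text{obj}}$ is a small change in energy in the object. When (13.35), (13.37), and
(13.38) are introduced into (13.36), the probability for the specific configuration
becomes $P \propto e^{\frac{1}{k_B} S_{\text{res}}\left(U_{\text{res}}^{\text{eq}}\right) - \frac{\Delta U_{\text{obj}}}{k_B T}}$, or simply

$$P \propto e^{-\frac{\Delta U_{\text{obj}}}{k_B T}} \tag{13.39}$$

since the first term in the exponent is constant. $\Delta U_{\text{obj}}$ represents an amount of
energy added to the object to establish a configuration. In the case of blackbody
radiation, a mode takes on energy $\Delta U_{\text{obj}} = n\hbar\omega$, where $n$ is the number of energy
quanta in the mode. The probability that a mode carries energy $n\hbar\omega$ is therefore
proportional to $e^{-\frac{n\hbar\omega}{k_B T}}$.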
Exercises
P13.1 The Sun has a radius of $R_S = 6.96 \times 10^8$ m. What is the total power that
it radiates, given a surface temperature of 5750 K?
P13.3 Derive (or try to derive) the Stefan-Boltzmann law by integrating the
(a) Rayleigh-Jeans energy density

$$u_{\text{field}} = \int_0^\infty \rho_{\text{Rayleigh-Jeans}}(\omega)\, d\omega$$

Please comment.
(b) Wien energy density

$$u_{\text{field}} = \int_0^\infty \rho_{\text{Wien}}(\omega)\, d\omega$$

Please evaluate $\sigma$.
HINT: $\int_0^\infty x^3 e^{-ax}\, dx = \frac{6}{a^4}$.
(c) Planck energy density

$$u_{\text{field}} = \int_0^\infty \rho_{\text{Planck}}(\omega)\, d\omega$$

$$\lambda_{\max} = \frac{0.00290\ \text{m}\cdot\text{K}}{T}$$

which gives the strongest wavelength present in the blackbody spectral
distribution.
HINT: See Example 13.1. You may like to know that the solution to the
transcendental equation $(5 - x)\,e^x = 5$ is $x = 4.965$.
(b) What is the strongest wavelength emitted by the Sun, which has a
surface temperature of 5750 K (see P13.1)?
(c) Repeat the problem to find $\omega_{\max}$ and show that $\omega_{\max} \neq 2\pi c/\lambda_{\max}$. (We
naturally observe $\lambda_{\max}$ when making a measurement using a grating
spectrometer.) HINT: The solution to $(3 - x)\,e^x = 3$ is $x = 2.821$.
Index
ABCD law for Gaussian beams, 292
ABCD matrices, 234, 238
ABCD matrix, 236
aberrations, 235, 248
absolute value, complex number, 10
absolute value, vector, 1
Airy pattern, 281
Ampere's law, 25, 30
Ampere-Maxwell, 31
angle-addition formula, 7
anisotropic medium, 121
aperture, 258
Arago, Francois Jean Dominique, 261
array theorem, 275, 282
arrival time, 183
astigmatism, 249
autocorrelation theorem, 24

Babinet's principle, 260
beam waist, 275, 288, 291
Bessel function, 19, 266, 280
biaxial crystal, 127
Biot, Jean-Baptiste, 28
Biot-Savart law, 28
birefringence, 121, 127, 130
blackbody radiation, 317
blaze angle, 298
Bohr, Niels, 324
Boltzmann factor, 328
boundary conditions at an interface, 75, 84
Brewster's angle, 80
Brewster, David, 80
broadband, 169

carrier frequency, 180
Cartesian coordinates, 1
causality, 189, 192, 194
centroid, 191
characteristic matrix, 108
chirp, 181, 183
chirped pulse amplification, 187
chromatic aberration, 249
circular polarization, 143, 144
circular polarizer, 158
Clausius-Mossotti, 50, 62
coefficient of finesse, 95, 99
coherence length, 205, 206
coherence time, 205, 206
color, 58
color matching function, 60
color space, 59
coma, 250
complex angle, 11
complex conjugate, 10
complex notation, 43, 45
complex numbers, 6
complex plane, 10
complex polar representation, 9
concave, 237
conductivity, 69
conductor, refractive index of, 52
constitutive relation, 47
constitutive relation in crystals, 121
continuity equation, 31
continuous source, temporal coherence, 207
convex, 237
convolution theorem, 23
cosine, complex representation, 7
Coulomb's law, 26
critical angle, 81
cross product, 2
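$$\int_{-\infty}^{\infty} e^{-ax^2+bx+c}\, dx = \sqrt{\frac{\pi}{a}}\; e^{\frac{b^2}{4a}+c} \qquad (\text{Re}\{a\} > 0) \tag{0.55}$$

$$\int_{-\infty}^{\infty} \frac{e^{iax}}{1 + x^2/b^2}\, dx = \pi |b|\, e^{-|ab|} \qquad (b > 0) \tag{0.56}$$

$$\int_0^{2\pi} e^{\pm ia\cos(\theta-\theta')}\, d\theta = 2\pi J_0(a) \tag{0.57}$$

$$\int_0^a J_0(bx)\, x\, dx = \frac{a}{b} J_1(ab) \tag{0.58}$$

$$\int_0^\infty e^{-ax^2} J_0(bx)\, x\, dx = \frac{e^{-b^2/4a}}{2a} \tag{0.59}$$

$$\int_0^\infty \frac{\sin^2(ax)}{(ax)^2}\, dx = \frac{\pi}{2a} \tag{0.60}$$

$$\int \frac{dy}{\left(y^2 + c\right)^{3/2}} = \frac{y}{c\sqrt{y^2+c}} \tag{0.61}$$

$$\int \frac{dx}{x\sqrt{x^2 - c}} = -\frac{1}{\sqrt{c}} \sin^{-1}\frac{\sqrt{c}}{|x|} \tag{0.62}$$

$$\int_0^\pi \sin(ax)\sin(bx)\, dx = \int_0^\pi \cos(ax)\cos(bx)\, dx = \frac{\pi}{2}\delta_{ab} \qquad (a, b\ \text{integer}) \tag{0.63}$$

$$\sum_{n=0}^{N} r^n = \frac{1 - r^{N+1}}{1 - r} \tag{0.64}$$

$$\sum_{n=1}^{N} r^n = \frac{r\left(1 - r^N\right)}{1 - r} \tag{0.65}$$

$$\sum_{n=0}^{\infty} r^n = \frac{1}{1 - r} \qquad (r < 1) \tag{0.66}$$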