Introduction To Paraxial Geometrical Optics

B.1 The Domain of Geometrical Optics

If the wavelength of light is imagined to become vanishingly small, we enter a domain
in which the concepts of geometrical optics suffice to analyze optical systems. While the
actual wavelength of light is always finite, nonetheless provided all variations or changes
of the amplitude and phase of a wavefield take place on spatial scales that are very large
compared with a wavelength, the predictions of geometrical optics will be accurate.
Examples of situations for which geometrical optics does not yield accurate predictions occur
when we insert a sharp edge or a sharply defined aperture in a beam of light, or when we
change the phase of a wave by a significant fraction of $2\pi$ radians over spatial scales that
are comparable with a wavelength.
Thus if we imagine a periodic phase grating for which a "smooth" change of phase
by $2\pi$ radians takes place only over a distance of many wavelengths, the predictions of
geometrical optics for the amplitude distribution behind the grating will be reasonably
accurate. On the other hand, if the changes of $2\pi$ radians take place in only a few
wavelengths, or take place very abruptly, then diffraction effects cannot be ignored, and a full
wave-optics (or "physical-optics") treatment of the problem is needed.
This appendix is not a complete introduction to the subject of geometrical optics.
Rather, we have selected several topics that will help the reader better understand the
relationship between geometrical optics and physical optics. In addition, several geometrical
concepts that are needed in formulating the physical-optics description of imaging and
spatial filtering systems are introduced.

The Concept of a Ray

Consider a monochromatic disturbance traveling in a medium with refractive index that
varies slowly on the scale of an optical wavelength. Such a disturbance can be described
by an amplitude and phase distribution

$$U(\vec{r}) = A(\vec{r}) \exp[j k_0 S(\vec{r})], \tag{B-1}$$

where $A(\vec{r})$ is the amplitude and $k_0 S(\vec{r})$ is the phase of the wave. Here $k_0$ is the free-space
wavenumber $2\pi/\lambda_0$; the refractive index $n$ of the medium is contained in the definition
of $S$. $S(\vec{r})$ is called the Eikonal function. We follow the argument presented in [271] (p. 52)
in finding the equation that must be satisfied by the Eikonal function.
Surfaces defined by

$$S(\vec{r}) = \text{constant}$$

are called wavefronts of the disturbance. The direction of power flow and the direction of
the wave vector $\vec{k}$ are both normal to the wavefronts at each point $\vec{r}$ in an isotropic medium.
A ray is defined as a trajectory or a path through space that starts at any particular point
on a wavefront and moves through space with the wave, always remaining perpendicular
to the wavefront at every point on the trajectory. Thus a ray traces out the path of power
flow in an isotropic medium. Substitution of (B-1) in the Helmholtz equation of Eq. (3-12)
yields the following equation that must be satisfied by both $A(\vec{r})$ and $S(\vec{r})$:

$$k_0^2 \left[ n^2 - |\nabla S|^2 \right] A + \nabla^2 A - j k_0 \left[ 2 \nabla S \cdot \nabla A + A \nabla^2 S \right] = 0.$$

The real and imaginary parts of this equation must vanish independently. For the real part
to vanish, we require

$$|\nabla S|^2 = n^2 + \left( \frac{\lambda_0}{2\pi} \right)^2 \frac{\nabla^2 A}{A}. \tag{B-2}$$

Using the artifice of allowing the wavelength to approach zero to recover the
geometrical-optics limit of this equation, the last term is seen to vanish, leaving the so-called Eikonal
equation, which is perhaps the most fundamental description of the behavior of light under
the approximations of geometrical optics,

$$|\nabla S(\vec{r})|^2 = n^2(\vec{r}). \tag{B-3}$$

This equation serves to define the wavefront S. Once the wavefronts are known, the trajec-
tories defining rays can be determined.

Rays and Local Spatial Frequency

Consider a monochromatic wave propagating in three-dimensional space defined by an
$(x, y, z)$ coordinate system, with propagation being in the positive $z$ direction. At each
point on a plane of constant $z$, there is a well-defined direction of the ray through that
point, a direction that coincides with the direction of the wave vector $\vec{k}$ at that point.
We have seen previously that an arbitrary distribution of complex field across a plane
can be decomposed by means of a Fourier transform into a collection of plane-wave
components traveling in different directions. Each such plane-wave component has a unique
wave vector with direction cosines $(\alpha, \beta, \gamma)$ defined in Fig. 3.9, and can be regarded as
one spatial frequency associated with the wave.
The spatial frequencies defined through the Fourier decomposition exist everywhere
in space and cannot be regarded as being localized. However, for complex functions with
a phase that does not vary too rapidly, the concept of a local spatial frequency can be
introduced, as was done in Section 2.2. The definitions of the local spatial frequencies
$(f_{lX}, f_{lY})$ given there can also be viewed as defining the local direction cosines $(\alpha_l, \beta_l, \gamma_l)$
of the wavefront through the relations

$$\alpha_l = \lambda f_{lX}, \qquad \beta_l = \lambda f_{lY}, \qquad \gamma_l = \sqrt{1 - (\lambda f_{lX})^2 - (\lambda f_{lY})^2}. \tag{B-4}$$

These local direction cosines are in fact the direction cosines of the ray through the $(x, y)$
plane at each point. This leads us to the following important observation:

The description of the local spatial frequencies of a wavefront is identical
with the description of that wavefront in terms of the rays of geometrical optics.
Ray direction cosines are found from local spatial frequencies simply by
multiplication by the wavelength.
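
As a small numerical illustration of this observation, the following sketch converts a pair of local spatial frequencies into ray direction cosines by multiplying by the wavelength. The function name, the units (millimeters), and the sample values are assumptions made here for illustration only; the third cosine is obtained from the requirement that the squares of the direction cosines sum to one.

```python
import math

def ray_direction_cosines(f_lx, f_ly, wavelength):
    """Return (alpha, beta, gamma) for local spatial frequencies f_lx, f_ly.

    Frequencies are in cycles per unit length and the wavelength is in the
    same length unit (an assumption of this sketch).
    """
    alpha = wavelength * f_lx
    beta = wavelength * f_ly
    transverse = alpha ** 2 + beta ** 2
    if transverse > 1.0:
        # No real propagating ray corresponds to such a high local frequency.
        raise ValueError("local spatial frequency too high for a real ray")
    gamma = math.sqrt(1.0 - transverse)
    return alpha, beta, gamma

# Illustrative values: 0.5 micrometer light, 100 cycles/mm along x.
wavelength_mm = 0.5e-3
alpha, beta, gamma = ray_direction_cosines(100.0, 0.0, wavelength_mm)
angle_deg = math.degrees(math.asin(alpha))
print(f"alpha = {alpha:.4f}, gamma = {gamma:.6f}, ray angle = {angle_deg:.3f} deg")
```

For these values the ray is tilted by a few degrees from the z axis, which is comfortably within the small-angle regime used later in this appendix.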

B.2 Refraction, Snell's Law, and the Paraxial Approximation

Rays traveling in a medium with constant index of refraction always travel in straight lines,
as can be derived from the Eikonal equation. However, when the wave travels through
a medium having an index of refraction that changes in space (i.e., an inhomogeneous
medium), the ray directions will undergo changes that depend on the changes of
refractive index. When the changes of refractive index are gradual, the ray trajectories will be
smoothly changing curves in space. Such bending of the rays is called refraction.
However, when a wave encounters an abrupt boundary between two media having
different refractive indices, the ray directions are changed suddenly as they pass through
the interface. The angles of incidence $\theta_1$ and refraction $\theta_2$, as shown in Fig. 3.1, are related
by Snell's law,

$$n_1 \sin\theta_1 = n_2 \sin\theta_2, \tag{B-5}$$

where $n_1$ and $n_2$ are the refractive indices of the first and second media, respectively. In
the problems of interest here, the changes of refractive index, as encountered, for example,
on passage through a lens, will always be abrupt, so Snell's law will form the basis for our
analysis.
A further simplifying approximation can be made if we restrict attention to rays that
are traveling close to the optical axis and at small angles to that axis, the geometrical-optics
version of the paraxial approximation. In such a case, Snell's law reduces to a simple linear
relationship between the angle of incidence and the angle of refraction,

$$n_1 \theta_1 = n_2 \theta_2, \tag{B-6}$$

and in addition the cosines of these angles can be replaced by unity.
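
To give a feel for how good the paraxial form of Snell's law is, the sketch below compares the exact refraction angle with its linearized counterpart at an air-to-glass interface. The function names and the index values (1.0 and 1.5) are illustrative assumptions, not values taken from the text.

```python
import math

def refract_exact(theta1, n1, n2):
    """Refraction angle from Snell's law, n1 sin(theta1) = n2 sin(theta2)."""
    return math.asin(n1 * math.sin(theta1) / n2)

def refract_paraxial(theta1, n1, n2):
    """Refraction angle from the paraxial form, n1 theta1 = n2 theta2."""
    return n1 * theta1 / n2

n1, n2 = 1.0, 1.5  # assumed indices: air into a typical glass
relative_errors = {}
for degrees in (1, 5, 10, 20):
    theta1 = math.radians(degrees)
    exact = refract_exact(theta1, n1, n2)
    approx = refract_paraxial(theta1, n1, n2)
    relative_errors[degrees] = abs(approx - exact) / exact
    print(f"{degrees:2d} deg: exact = {exact:.6f} rad, "
          f"paraxial = {approx:.6f} rad, "
          f"relative error = {relative_errors[degrees]:.2e}")
```

The relative error grows roughly as the square of the incidence angle, which is why the linear form is reserved for rays close to the axis.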

The product $\tilde{\theta} = n\theta$ of the refractive index $n$ and an angle $\theta$ within that medium is
called a reduced angle. Thus the paraxial version of Snell's law states that the reduced
angle remains constant as light passes through a sharp interface between media of different
refractive indices,

$$\tilde{\theta}_1 = \tilde{\theta}_2. \tag{B-7}$$


B.3 The Ray-Transfer Matrix

Under paraxial conditions, the properties of rays in optical systems can be treated with
an elegant matrix formalism, which in many respects is the geometrical-optics equivalent
of the operator methods of wave optics introduced in Section 5.4. Additional references
for this material are [271], [177], and [280]. To apply this methodology, it is necessary
to consider only meridional rays, which are rays traveling in paths that are completely
contained in a single plane containing the z axis. We call the transverse axis in this plane
the y axis, and therefore the plane of interest is the (y, z) plane.
Figure B.1 shows the typical kind of ray propagation problem that must be solved in
order to understand the effects of an optical system. On the left, at axial coordinate $z_1$, is an
input plane of an optical system, and on the right, at axial coordinate $z_2$, is an output plane.
A ray with transverse coordinate $y_1$ enters the system at angle $\theta_1$, and the same ray, now
with transverse coordinate $y_2$, leaves the system with angle $\theta_2$. The goal is to determine the
position $y_2$ and angle $\theta_2$ of the output ray for every possible $y_1$ and $\theta_1$ associated with an
input ray.

Figure B.1 Input and output of an optical system.

Under the paraxial condition, the relationships between $(y_2, \theta_2)$ and $(y_1, \theta_1)$ are linear
and can be written explicitly as

$$\begin{aligned} y_2 &= A y_1 + B \tilde{\theta}_1 \\ \tilde{\theta}_2 &= C y_1 + D \tilde{\theta}_1, \end{aligned} \tag{B-8}$$

where, for reasons that will become evident, we use reduced angles rather than just angles.
The above equation can be expressed more compactly in matrix notation,

$$\begin{bmatrix} y_2 \\ \tilde{\theta}_2 \end{bmatrix} = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} y_1 \\ \tilde{\theta}_1 \end{bmatrix}. \tag{B-9}$$

The matrix

$$M = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$$

is called the ray-transfer matrix or the ABCD matrix.

The ray-transfer matrix has an interesting interpretation in terms of local spatial
frequencies. In the $(y, z)$ plane under paraxial conditions, the reduced ray angle with respect
to the $z$ axis is related to local spatial frequency $f_l$ through

$$f_l = \frac{\theta}{\lambda} = \frac{\tilde{\theta}}{\lambda_0}.$$

Therefore the ray-transfer matrix can be regarded as specifying a transformation between
the spatial distribution of local spatial frequency at the input and the corresponding
distribution at the output.

Elementary Ray-Transfer Matrices

Certain simple structures are commonly encountered in ray-tracing problems. Here we
specify the ray-transfer matrices for the most important of these structures. They are all
illustrated in Fig. B.2.
1. Propagation through free space of index n. Geometrical rays travel in straight
lines in a medium with constant refractive index. Therefore the effect of propagation
through free space is to translate the location of the ray in proportion to the
angle at which it travels and to leave the angle of the ray unchanged. The
ray-transfer matrix describing propagation over distance $d$ is therefore

$$M = \begin{bmatrix} 1 & d/n \\ 0 & 1 \end{bmatrix}. \tag{B-10}$$

2. Refraction at a planar interface. At a planar interface the position of the ray is
unchanged but the angle of the ray is transformed according to Snell's law; the
reduced angle remains unchanged. Therefore the ray-transfer matrix for a planar
interface between a medium of refractive index $n_1$ and a medium of refractive index
$n_2$ is

$$M = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}. \tag{B-11}$$

Figure B.2 Elementary structures for ray-transfer matrix calculations. (a) Free space, (b) a planar
interface, (c) a spherical interface, and (d) a thin lens.

3. Refraction at a spherical interface. At a spherical interface between an initial
medium with refractive index $n_1$ and a final medium with refractive index $n_2$, the
position of a ray is again not changed, but the angle is changed. However, at a point
on the interface at distance $y$ from the optical axis, the normal to the interface is
not parallel to the optical axis, but rather is inclined with respect to the optical axis
by angle

$$\phi = \arcsin\frac{y}{R} \approx \frac{y}{R},$$

where $R$ is the radius of the spherical surface. Therefore if $\theta_1$ and $\theta_2$ are measured
with respect to the optical axis, Snell's law at transverse coordinate $y$ becomes

$$n_1 \theta_1 + n_1 \frac{y}{R} = n_2 \theta_2 + n_2 \frac{y}{R},$$

or, using reduced angles,

$$\tilde{\theta}_1 + n_1 \frac{y}{R} = \tilde{\theta}_2 + n_2 \frac{y}{R}.$$

Solving for $\tilde{\theta}_2$ yields

$$\tilde{\theta}_2 = \tilde{\theta}_1 + \frac{n_1 - n_2}{R}\, y.$$

The ray-transfer matrix for a spherical interface can now be written as

$$M = \begin{bmatrix} 1 & 0 \\ \dfrac{n_1 - n_2}{R} & 1 \end{bmatrix}. \tag{B-12}$$

Note that a positive value for $R$ signifies a convex surface encountered from left to
right, while a negative value for $R$ signifies a concave surface.
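4. Passage through a thin lens. A thin lens (index $n_2$ embedded in a medium of
index $n_1$) can be treated by cascading two spherical interfaces. The roles of $n_1$ and
$n_2$ are interchanged for the two surfaces. Representing the ray-transfer matrices of
the surfaces on the left and the right by $M_1$ and $M_2$, respectively, the ray-transfer
matrix for the sequence of two surfaces is

$$M = M_2 M_1 = \begin{bmatrix} 1 & 0 \\ \dfrac{n_2 - n_1}{R_2} & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ \dfrac{n_1 - n_2}{R_1} & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -(n_2 - n_1)\left( \dfrac{1}{R_1} - \dfrac{1}{R_2} \right) & 1 \end{bmatrix}.$$

We define the focal length of the lens by

$$\frac{1}{f} = \frac{n_2 - n_1}{n_1} \left( \frac{1}{R_1} - \frac{1}{R_2} \right), \tag{B-13}$$

in which case the ray-transfer matrix for a thin lens becomes

$$M = \begin{bmatrix} 1 & 0 \\ -\dfrac{n_1}{f} & 1 \end{bmatrix}. \tag{B-14}$$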
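
The cascade of two spherical interfaces into a thin lens can be checked numerically. The sketch below is a minimal illustration, assuming a symmetric biconvex lens of index 1.5 in air with surface radii of 100 mm; the helper names and all numerical values are hypothetical choices made for this example.

```python
def matmul(a, b):
    """Product a @ b of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def spherical_interface(n_in, n_out, radius):
    """Ray-transfer matrix (reduced angles) for a spherical interface."""
    return [[1.0, 0.0], [(n_in - n_out) / radius, 1.0]]

def thin_lens(f, n_medium=1.0):
    """Ray-transfer matrix of a thin lens of focal length f in index n_medium."""
    return [[1.0, 0.0], [-n_medium / f, 1.0]]

n1, n2 = 1.0, 1.5          # assumed: air outside, glass inside
R1, R2 = 100.0, -100.0     # assumed radii in mm (convex, then concave from the left)

# The first surface goes from n1 to n2; the second goes from n2 back to n1.
M1 = spherical_interface(n1, n2, R1)
M2 = spherical_interface(n2, n1, R2)
M_cascade = matmul(M2, M1)  # the surface met first is the rightmost factor

# Focal length from the definition in terms of the two radii.
f = 1.0 / ((n2 - n1) / n1 * (1.0 / R1 - 1.0 / R2))
M_lens = thin_lens(f, n1)

print("focal length (mm):", f)
print("cascade  :", M_cascade)
print("thin lens:", M_lens)
```

For this symmetric lens the two matrices agree to rounding error, with a focal length of about 100 mm.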

The most useful elementary ray-transfer matrices have now been presented.
Propagation through a system consisting of regions of free space separated by thin lenses can be
treated with these matrices. Note that, just as with the wave-optics operators presented in
Chapter 5, the ray-transfer matrices should be applied in the sequence in which the
structures are encountered. If light propagates first through a structure with ray-transfer matrix
$M_1$, then through a structure with ray-transfer matrix $M_2$, etc., with a final structure having
ray-transfer matrix $M_n$, then the overall ray-transfer matrix for the entire system is

$$M = M_n \cdots M_2 M_1.$$

We note also that, because we have chosen to use reduced angles, rather than the
angles themselves in the definition of the ray-transfer matrix, all of the elementary matrices
presented have a determinant that is unity.
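
The ordering rule and the unit-determinant property are easy to check numerically. The following sketch cascades free space, a thin lens, and more free space; the helper names, the choice of air as the surrounding medium, and the distances are assumptions made purely for illustration.

```python
def matmul(a, b):
    """Product a @ b of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def free_space(d, n=1.0):
    """Propagation over distance d in a medium of index n."""
    return [[1.0, d / n], [0.0, 1.0]]

def thin_lens(f, n_medium=1.0):
    """Thin lens of focal length f embedded in a medium of index n_medium."""
    return [[1.0, 0.0], [-n_medium / f, 1.0]]

def system_matrix(elements):
    """Overall matrix for elements listed in the order light meets them.

    Each new element multiplies from the left, giving M_n ... M_2 M_1.
    """
    m = [[1.0, 0.0], [0.0, 1.0]]
    for element in elements:
        m = matmul(element, m)
    return m

def determinant(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def trace_ray(m, y1, reduced_angle1):
    """Apply the linear ray-transfer relations to one input ray."""
    y2 = m[0][0] * y1 + m[0][1] * reduced_angle1
    reduced_angle2 = m[1][0] * y1 + m[1][1] * reduced_angle1
    return y2, reduced_angle2

# Assumed example: 50 mm of air, a 100 mm lens, then 200 mm of air.
elements = [free_space(50.0), thin_lens(100.0), free_space(200.0)]
M = system_matrix(elements)
M_reversed = system_matrix(list(reversed(elements)))

y2, angle2 = trace_ray(M, 1.0, 0.01)
print("M          =", M, " det =", determinant(M))
print("M reversed =", M_reversed, " det =", determinant(M_reversed))
print("output ray: y2 =", y2, " reduced angle =", angle2)
```

Reversing the order of the elements changes the overall matrix, but the determinant stays at unity in both cases.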

B.4 Conjugate Planes, Focal Planes, and Principal Planes

There exist certain planes within an optical system that play important conceptual and
practical roles. In this section we explain the three most important of these types of planes.

Conjugate Planes
Two planes within an optical system are said to be conjugate planes if the intensity distri-
bution across one plane is an image (generally magnified or demagnified) of the intensity
distribution across the other plane. Likewise, two points are said to be conjugate points if
one is the image of the other.
The properties that must be satisfied by the ray-transfer matrix between two conjugate
planes can be deduced by considering the relation between two conjugate points $y_1$ and $y_2$,
as implied by Eq. (B-8). The position of the point $y_2$ that is conjugate to $y_1$ should be
independent of the reduced angle $\tilde{\theta}_1$ of a ray through $y_1$, implying that the matrix element $B$
should be zero. The position $y_2$ should be related to the position $y_1$ only through the
transverse magnification $m_t$, which is the scale factor between coordinates in the two planes.
We conclude that the matrix element $A$ must equal $m_t$. In addition, the angles of the rays
passing through $y_2$ will generally be magnified or demagnified with respect to the angles of
the same rays passing through $y_1$. The magnification for reduced angles is represented by
$m_\alpha$, and we conclude that the matrix element $D$ must satisfy $D = m_\alpha$. There is no general
restriction on the matrix element $C$, so the ray-transfer matrix between conjugate planes
takes the general form

$$M = \begin{bmatrix} m_t & 0 \\ C & m_\alpha \end{bmatrix}.$$
Recalling that angles and positions are conjugate Fourier variables, the scaling theorem of
Fourier analysis implies that the transverse magnification and the angular magnification
must be related in a reciprocal fashion. The magnifications $m_t$ and $m_\alpha$ are in fact related
by

$$m_t m_\alpha = 1. \tag{B-16}$$

Thus the form of the ray-transfer matrix for conjugate planes is

$$M = \begin{bmatrix} m_t & 0 \\ C & m_t^{-1} \end{bmatrix}.$$

Note that both $m_t$ and $m_\alpha$ can be positive or negative (signifying image inversion), but they
must be of the same sign.
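
The conjugate-plane conditions can be illustrated with a single thin lens in air. The sketch below assumes a 100 mm lens and an object plane 150 mm in front of it (both hypothetical values), picks the output distance that makes the matrix element B vanish, and then reads the two magnifications from the diagonal.

```python
def matmul(a, b):
    """Product a @ b of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def free_space(d):
    """Propagation over distance d in air (index assumed to be 1)."""
    return [[1.0, d], [0.0, 1.0]]

def thin_lens(f):
    """Thin lens of focal length f in air."""
    return [[1.0, 0.0], [-1.0 / f, 1.0]]

f = 100.0    # assumed focal length, mm
z1 = 150.0   # assumed object distance, mm

# For free space, lens, free space: B = z1 + z2 - z1*z2/f, so B = 0 gives:
z2 = z1 * f / (z1 - f)

M = matmul(free_space(z2), matmul(thin_lens(f), free_space(z1)))
(A, B), (C, D) = M

m_t = A       # transverse magnification
m_alpha = D   # magnification of (reduced) angles
print("image distance z2 =", z2)
print("A =", A, " B =", B, " C =", C, " D =", D)
print("m_t * m_alpha =", m_t * m_alpha)
```

Here both magnifications are negative (an inverted image), and their product is unity to rounding error.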
The paraxial relation (B-16) has a more general nonparaxial form, known as the sine
condition, which states that for conjugate points $y_1$ and $y_2$ the following equation must be
satisfied:

$$n_1 y_1 \sin\theta_1 = n_2 y_2 \sin\theta_2.$$

Focal Planes
Consider a parallel bundle of rays traveling parallel to the optical axis and entering a lens.
Whether that lens is thick or thin, for paraxial rays there will exist a point on the optical
axis toward which that ray bundle will converge (positive lens) or from which it will appear
to diverge (negative lens). See Fig. B.3 for an illustration. Considering a positive lens for
the moment, the point behind the lens at which this originally parallel ray bundle crosses
in a focused point is called the rear focal point or the second focal point of the lens. A
plane constructed through that point perpendicular to the optical axis is called the rear
focal plane or the second focal plane. It has the property that a paraxial parallel bundle of
rays traveling into the lens at any angle with respect to the optical axis will be brought to a
focus at a point in the focal plane that depends on the initial angle of the bundle.

Figure B.3 Definition of focal points. (a) Rear focal point of a positive lens, (b) front focal point of a
positive lens, (c) front focal point of a negative lens, and (d) rear focal point of a negative lens.

In a similar fashion, consider a point source on the optical axis in front of a positive
lens, thick or thin. The particular point in front of the lens for which the diverging bundle
of rays is made to emerge as a parallel bundle traveling parallel to the optical axis behind
the lens is called the front focal point (or the first focal point) of the lens. A plane erected
through the front focal point normal to the optical axis is called the front focal plane (or
the first focal plane) of the lens.
For a negative lens, the roles of the front and rear focal points and planes are reversed.
The front focal point is now the point from which a bundle of rays, originally parallel to
the optical axis, appears to be diverging when viewed from the exit side of the lens. The
rear focal point is defined by the point of convergence of an incident bundle of rays that
emerges parallel or collimated after passage through the lens.
The mapping from the front focal plane to the rear focal plane is one that maps angles
into positions, and positions into angles. If $f$ is the focal length of the lens, then the
ray-transfer matrix between focal planes takes the form

$$M = \begin{bmatrix} 0 & f \\ -\dfrac{1}{f} & 0 \end{bmatrix},$$

as can be readily verified by multiplying together three matrices representing propagation
over distance $f$, passage through a thin lens with focal length $f$, and propagation over a
second distance $f$.
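
The three-matrix product just described can be carried out numerically. The sketch below assumes a lens in air with a focal length of 100 mm (an arbitrary illustrative value) and also traces two rays that share an input angle but start at different heights.

```python
def matmul(a, b):
    """Product a @ b of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def free_space(d):
    """Propagation over distance d in air (index assumed to be 1)."""
    return [[1.0, d], [0.0, 1.0]]

def thin_lens(f):
    """Thin lens of focal length f in air."""
    return [[1.0, 0.0], [-1.0 / f, 1.0]]

f = 100.0  # assumed focal length, mm

# Front focal plane -> lens -> rear focal plane; first element is rightmost.
M_focal = matmul(free_space(f), matmul(thin_lens(f), free_space(f)))
print("M between focal planes:", M_focal)

def output_position(m, y1, angle1):
    """Output height y2 = A*y1 + B*angle1."""
    return m[0][0] * y1 + m[0][1] * angle1

# Two rays with the same input angle but different input heights.
angle = 0.02
y_out_a = output_position(M_focal, -3.0, angle)
y_out_b = output_position(M_focal, 5.0, angle)
print("output heights:", y_out_a, y_out_b, " expected f*angle =", f * angle)
```

Both rays arrive at the same height in the rear focal plane, set only by the input angle, which is the angle-to-position mapping described above.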

Principal Planes
By the definition of a thin lens, a ray incident at input coordinate $y_1$ exits that lens at the
same coordinate $y_2 = y_1$. For a thick lens this simple idealization is no longer valid. A
ray entering the first spherical surface at coordinate $y_1$ will in general leave the second
spherical surface at a different coordinate $y_2 \neq y_1$, as can be seen in Fig. B.3.
Much of the simplicity of a thin lens can be retained for a thick lens by introducing
the concept of principal planes. Principal planes are planes where the focusing power of
the lens can be imagined to be concentrated.
To find the first principal plane of a lens, trace a ray from the front focal point to the
first lens surface, as shown in Fig. B.4. By definition of the focal point, that ray will exit
the second surface of the lens parallel to the optical axis, i.e., in a collimated beam. If
we project the incident ray forward and the exiting ray backward into the lens, retaining
their original angles, they will intersect at a point. A plane through this point normal to the
optical axis defines the first principal plane. For this geometry it is possible to imagine that
all the refraction associated with the lens takes place in this principal plane.
In the most general case, different rays diverging from the front focal point might
define different planes, which would be an indication that the principal plane is not a plane
at all, but rather is a curved surface. Such can be the case for lenses with very large aperture
or for special lenses such as wide-angle lenses, but for the lenses of interest to us in this
book the principal planes are indeed flat to an excellent approximation.
The second principal plane is found by starting with a ray that is parallel to the optical
axis, and tracing it through the rear focal point of the lens. The extension of the incident
ray and the exiting ray intersect in a point, which in turn defines the second principal plane
of the lens, again normal to the optical axis. For this geometry it is possible to imagine that
all of the power of the lens is concentrated in the second principal plane.
For more general geometries, ray bending can be imagined to take place in both of
the principal planes. As will be seen shortly, the two planes are in fact conjugate to one
another with unit magnification. A ray incident at particular transverse coordinates on the
first principal plane will exit from the second principal plane at those same coordinates,
but in general with a change of angle.

Figure B.4 Definitions of principal planes. (a) First principal plane $P_1$, (b) second principal plane $P_2$.

In general, the first and second principal planes are separate planes. However, the
definition of a thin lens implies that for such a lens the distinguishing characteristic is that
the first and second principal planes coincide, and all the focusing power can be imagined
to be concentrated in a single plane.
The relationship between the principal planes can be more fully understood if we
derive the ray-transfer matrix that holds for propagation between the two principal planes.
The derivation is based on the two geometries already introduced, namely that of a point
source at the front focal point that yields a collimated ray bundle leaving the second
principal plane, and that of a collimated bundle incident on the first principal plane that yields
a ray bundle converging from the second principal plane toward a focus at the rear focal
point. Considering the case of collimated input light passing through the rear focal point,
we find that the matrix element $A$ must be unity, and the matrix element $C$ must be $-n_1/f$.
Consideration of the case of input rays diverging from the front focal point shows that
$B = 0$ and $D = 1$. Thus the ray-transfer matrix for the passage between principal planes
is

$$M = \begin{bmatrix} 1 & 0 \\ -\dfrac{n_1}{f} & 1 \end{bmatrix}.$$

This matrix is identical with the ray-transfer matrix describing passage through a thin lens.
Thus by constructing the principal planes, and by tracing rays only up to the first principal
plane and away from the second principal plane, we are able to treat a complex lens system
as if it were a simpler thin lens. Note that the ray-transfer matrix above implies that the two
principal planes are conjugate to one another, and the magnification between them is unity.
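
As an illustration, the principal planes of a thick lens can be located directly from its vertex-to-vertex ray-transfer matrix. The sketch below assumes a biconvex lens in air (index 1.5, radii of 100 mm, center thickness 20 mm); these values, the helper names, and the sign convention for the plane offsets are all choices made for this example rather than anything specified in the text. The offsets follow from requiring A = 1 and D = 1 after adding free-space sections before and after the lens.

```python
def matmul(a, b):
    """Product a @ b of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def free_space(d, n=1.0):
    """Propagation over distance d in index n (a negative d backs the plane up)."""
    return [[1.0, d / n], [0.0, 1.0]]

def spherical_interface(n_in, n_out, radius):
    return [[1.0, 0.0], [(n_in - n_out) / radius, 1.0]]

# Assumed thick lens in air: index 1.5, R1 = +100 mm, R2 = -100 mm, 20 mm thick.
n_air, n_glass = 1.0, 1.5
R1, R2, thickness = 100.0, -100.0, 20.0

M_vertices = matmul(spherical_interface(n_glass, n_air, R2),
                    matmul(free_space(thickness, n_glass),
                           spherical_interface(n_air, n_glass, R1)))
(A, B), (C, D) = M_vertices

f = -n_air / C                   # focal length, from C = -n1/f
d1 = n_air * (1.0 - D) / C       # distance from first principal plane to first vertex
d2 = n_air * (1.0 - A) / C       # distance from second vertex to second principal plane

# Negative d1 or d2 means the principal plane lies inside the lens.
M_principal = matmul(free_space(d2, n_air),
                     matmul(M_vertices, free_space(d1, n_air)))

print("focal length:", f)
print("d1 =", d1, " d2 =", d2)
print("matrix between principal planes:", M_principal)
```

For this lens both offsets come out negative, placing the two principal planes a few millimeters inside the glass, and the matrix between them has the thin-lens form to rounding error.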
The focal length of a lens is by definition the distance of a principal plane from the
corresponding focal point that was used in its definition. Assuming that the refractive indices
of the media in front of and behind the lens are the same, the distance of the front focal
plane from the first principal plane is identical with the distance of the rear focal point from
the second principal plane. That is, the two focal lengths of the lens are the same. Note that
for some lenses the second principal plane may lie to the left of the first principal plane.
Such an occurrence does not change the definition of the focal length. It can also be shown
that the distances $z_1$ and $z_2$ in the lens law

$$\frac{1}{z_1} + \frac{1}{z_2} = \frac{1}{f}$$

are measured from the first and second principal planes. These various relations are
illustrated in Fig. B.5.

B.5 Entrance and Exit Pupils

Until now, we have not considered the effects of pupils (i.e., finite apertures) in optical
systems. Apertures, of course, give rise to diffraction effects. The concepts of entrance and
exit apertures are of great importance in calculations of the effects of diffraction on optical
systems.

Figure B.5 Relations between principal planes, focal lengths, and object/image distances.

A system of lenses may contain several or many different apertures, but one such
aperture always provides the severest limitation to the extent of the optical wavefront captured
at the input of the system, and to the extent of the optical wavefront leaving the system.
That aperture may lie deep within the system of lenses, but the single aperture that most
severely restricts the bundle of rays passing through the system is in effect the aperture that
limits the extent of the wavefront at both the input and at the output.

Figure B.6 Entrance and exit pupils. (a) Entrance and exit pupils coincide with the physical pupil, (b)
the exit pupil coincides with the physical pupil, and (c) the entrance pupil coincides with the physical
pupil.

The entrance pupil of the optical system is defined as the image of the most severely
limiting aperture, when viewed from the object space, looking through any optical ele-
ments that may precede the physical aperture. The exit pupil of the system is also defined
as the image of the physical aperture, but this time looking from the image space through
any optical elements that may lie between that aperture and the image plane.
Figure B.6 illustrates the entrance and exit pupils for a very simple system consisting
of a single lens, for three cases: a limiting pupil (1) in the plane of the lens, (2) following the
lens, and (3) preceding the lens. In the first case, the entrance and exit apertures coincide
with the real physical aperture in the plane of the lens. In the second case, the exit pupil
coincides with the physical pupil (which is assumed to limit the angle of the bundle of rays
more severely than does the lens aperture), and the entrance pupil is a virtual image of the
physical aperture, lying to the right of the lens. In the third case, the entrance pupil is the
real physical aperture lying to the left of the lens. In this case, the exit pupil is a virtual
image of the physical aperture, lying in a plane to the left of the lens.
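
The second case above can be made concrete with the thin-lens imaging relation 1/z1 + 1/z2 = 1/f. The sketch below is only an illustration under stated assumptions: a thin lens in air with a 100 mm focal length, and a physical aperture 10 mm in diameter placed 50 mm behind the lens, that is, closer to the lens than its focal length. The function and variable names are hypothetical.

```python
def image_distance(object_distance, f):
    """Solve 1/z1 + 1/z2 = 1/f for z2; a negative result means a virtual image
    on the same side of the lens as the object."""
    return 1.0 / (1.0 / f - 1.0 / object_distance)

f = 100.0             # assumed focal length, mm
stop_distance = 50.0  # assumed distance of the physical aperture behind the lens, mm
stop_diameter = 10.0  # assumed diameter of the physical aperture, mm

# To find the entrance pupil, image the aperture back through the lens,
# treating the aperture as the object.
z_image = image_distance(stop_distance, f)
magnification = -z_image / stop_distance
entrance_pupil_diameter = abs(magnification) * stop_diameter

side = "same side as the aperture (virtual)" if z_image < 0 else "opposite side (real)"
print("entrance pupil distance from lens:", abs(z_image), "mm,", side)
print("entrance pupil diameter:", entrance_pupil_diameter, "mm")
```

With these numbers the entrance pupil is a virtual, magnified image of the aperture located on the aperture's own side of the lens, consistent with the description of the second case.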
In a more complex optical system, containing many lenses and many apertures, it is
in general necessary to trace rays through the entire system in order to determine which
aperture constitutes the most severe restriction on the ray bundles and therefore which
aperture must be imaged to find the entrance and exit pupils.
Once the location and extent of the exit pupil are known, the effects of diffraction
on the image of a point-source object can be calculated. For an object point source, a
converging bundle of rays fills the exit pupil on its way to a geometrical image. If the optical
system has no aberrations, the geometrical image is an ideal point and the converging
bundle defines a perfect spherical wave. The exit pupil limits the angular extent of the
converging bundle. The Fraunhofer diffraction formula can now be applied at the exit pupil,
using the distance from that pupil to the image as the distance appearing in the formula.

