
PHYS20352 Statistical Physics

2021-2022

Judith McGovern

March 21, 2022


Contents

1 Revision of thermodynamics 2
1.1 States of a system . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 The Zeroth Law of Thermodynamics . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3 Internal energy and the First Law of Thermodynamics . . . . . . . . . . . . . . 3
1.4 Second law of Thermodynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.5 Thermodynamic potentials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.6 Variable particle number . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.6.1 Chemical Reactions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

2 Statistical Physics of isolated systems 12


2.1 Microcanonical Ensemble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.2 The statistical basis of entropy . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.3 The spin-half paramagnet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.4 From entropy to temperature . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.4.1 The isolated spin-half paramagnet in a magnetic field . . . . . . . . . . 20
2.4.2 The ideal gas, first attempt . . . . . . . . . . . . . . . . . . . . . . . . . 22

3 Statistical Physics of Non-isolated Systems 23


3.1 The Boltzmann Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.2 The Partition Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.3 Entropy, Helmholtz Free Energy and the Partition Function . . . . . . . . . . . 27
3.4 The Gibbs entropy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.5 The paramagnet at fixed temperature . . . . . . . . . . . . . . . . . . . . . . . . 30
3.6 Adiabatic demagnetisation and the third law of thermodynamics . . . . . . . . . 33
3.7 Vibrational and rotational energy of a diatomic molecule . . . . . . . . . . . . . 35
3.8 Translational energy of a molecule in an ideal gas . . . . . . . . . . . . . . . . . 36
3.8.1 The Density of States . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.8.2 The Maxwell-Boltzmann Distribution . . . . . . . . . . . . . . . . . . . 39
3.9 Factorisation of partition functions . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.10 The Equipartition Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.10.1 The heat capacity of a crystal . . . . . . . . . . . . . . . . . . . . . . . . 42
3.11 The N particle partition function for indistinguishable particles . . . . . . . . . 43
3.12 The ideal gas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.12.1 Chemical potential of ideal gas . . . . . . . . . . . . . . . . . . . . . . . 46
3.13 Using Classical Mechanics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.14 Systems with variable particle number —The Gibbs distribution . . . . . . . . 49
3.14.1 Two examples of the Gibbs Distribution . . . . . . . . . . . . . . . . . . 51

4 Quantum Gases 53
4.1 Bosons and fermions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.2 The ideal gas of bosons or fermions: beyond the classical approximation . . . . 54
4.2.1 The classical approximation again . . . . . . . . . . . . . . . . . . . . . 56
4.3 The ideal Fermi Gas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.3.1 Electrons in a metal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.3.2 White dwarf stars . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
4.4 The ideal Bose Gas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
4.4.1 Photons and Black-body radiation . . . . . . . . . . . . . . . . . . . . . 64
4.4.2 Phonons and the Debye model . . . . . . . . . . . . . . . . . . . . . . . 68
4.4.3 Bose-Einstein condensation . . . . . . . . . . . . . . . . . . . . . . . . . 71

A Miscellaneous background 78
A.1 Revision of ideal gas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
A.2 Lagrange multipliers for constrained minimisation . . . . . . . . . . . . . . . . . 79
A.3 Hyperbolic Trigonometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
Chapter 1

Revision of thermodynamics

In this chapter, the reader is referred to sections 1 & 2 of the on-line notes of the previous
version of this course for more details and references.

1.1 States of a system


Take-home message: Classical thermodynamics describes macroscopic systems in
equilibrium in terms of a few measurable variables
Thermodynamics as you studied it last year is largely a study of macroscopic systems in, or
approximately in, equilibrium. Mostly these were fluids, particularly gases, and “macroscopic”
refers to the fact that we were dealing with reasonably large quantities, measured in moles and
grams. Empirically, when such a system is allowed to reach thermal and mechanical (and if
necessary chemical) equilibrium, it is described by a very few properties such as temperature,
pressure, volume and composition (if more than one species is present). These properties
are termed functions of state or state variables, as it is empirically observed that they are
determined only by the present state of the system and not the history by which it reached
that state. Furthermore though one can add a few further state variables to the list, such as
internal energy and entropy, it turns out that they cannot all be varied independently. For
a given mass of a single-component fluid, for instance (not necessarily an ideal gas), if the
temperature and pressure are specified, everything else has a definite value which cannot be
altered. Relationships between these variables (which may or may not have a simple form) are
called equations of state.
Equilibrium is in practice never absolute: the most stable gas in the most unreactive vessel
will still eventually leak out or distort or corrode its container. But for all practical purposes,
we can talk about it being in equilibrium once there is no net observable flow of particles or
heat between different regions of the system. Gases come to equilibrium very quickly on a
human time scale; treacle rather less quickly. Some systems “never” do, it seems; glasses
are an example. Systems that exhibit hysteresis (dependence on past conditions, such as
permanent magnets) do not reach equilibrium in a reasonable time. But in this lecture course
we will assume that for systems of interest equilibrium can be reached, and that changes to the
conditions (eg pressure) can be made sufficiently slowly that the system remains in equilibrium
(sometimes termed “quasi-equilibrium”) at all times. A reversible process is one in which an
infinitesimal change in the external conditions is enough to reverse the direction of the process;
an example would be the compression of a gas in a cylinder with a frictionless piston by exerting
an external force only just sufficient to move the piston one way or the other. When we draw
the reversible compression of a gas on a P V plot, we are assuming that the system always has

a well-defined pressure, which would not be the case if there were turbulence or shock waves
arising from too-rapid movement of the piston. Such irreversible processes are drawn as dotted
lines between the initial and final states.
In the context of these lectures, the state of a macroscopic system in equilibrium specified
by a handful of macroscopically-manipulable variables is called the macrostate. It completely
ignores what is going on with the very many individual atoms that comprise the system. A
much richer description of the positions and momenta of all the atoms (or of their combined
quantum state) is in principle possible, and it is called a microstate of the system.¹ The
ability of classical thermodynamics to describe systems without reference to the microstates is
a consequence of the laws of probability and of large numbers. The goal of statistical physics
is to derive thermodynamics from the behaviour of atoms and molecules, and we will return
to this in the next chapter. The current chapter is largely revision of ideas already met in
Properties of Matter.

1.2 The Zeroth Law of Thermodynamics


Take-home message: Absolute zero is the point at which thermal motion of an
ideal gas vanishes
The zeroth law of thermodynamics says that if two bodies are separately in thermal equilibrium with a third body, they are also in thermal equilibrium with one another. All three
are then said to be at the same temperature.
If the third body changes visibly as it is heated, then it can be used as a “thermoscope” to
verify equality of temperature or to rank bodies according to temperature. This is independent
of any numerical scale.
A thermometer is a calibrated thermoscope. Any thermoscope can be used to define a
numerical temperature scale over some range. Thermoscopes based on the volume of gases led
finally to the ideal gas or absolute temperature scale, measured in Kelvin and defined to be
273.16 K at the triple point of water:

T = lim_{P→0} [ PV / (PV)_triple ] × 273.16 K   (1.1)

The low pressure limit is taken because real gases approach ideal behaviour in that limit.
The numerical value at the triple point was chosen so that the degree Kelvin matched the
degree Celsius to high accuracy. Unlike earlier temperature scales there is no need to define
the temperature at two points, because the zero of the Kelvin scale is absolute zero. This is
the temperature at which the pressure of an ideal gas would vanish, because (classically) the
motion of its molecules would cease.
In these notes, the symbol T will always refer to absolute temperature.

1.3 Internal energy and the First Law of Thermodynamics
Take-home message: Energy can be transferred to a system by adding heat or
doing work, but the net effect is the same
¹ A microstate is NOT necessarily the state of a microscopically small system! Systems of any size have microstates, though only large systems also have macrostates.
The first law of thermodynamics states that any change in internal energy E of a system is
due to the amount of heat added to it and the work done on it:

∆E = Q + W or dE = d̄Q + d̄W (1.2)

The internal energy is a function of state and changes by a finite (∆E) or infinitesimal (dE)
amount in any process. But heat and work are NOT functions of state; the same change of
state may be effected by different reversible or irreversible processes involving different amounts
of heat transfer and work done. d̄Q and d̄W are not true differentials but just infinitesimal
amounts of energy transfer in the form of heat and work.
For reversible processes we write

dE = d̄Qrev + d̄W rev (1.3)

In an adiabatic process, Q = 0.
For a system taken round a cycle, so that it returns to its initial state, it is obvious that
∆E = 0. But Q and W , not being functions of state, will not vanish in general. This is the
basis of a heat engine: the net work done by the system is equal to the net heat transferred to
the system. But we will see in a moment that one cannot simply add an amount of heat and
extract the same amount of work. Some of the heat must be discarded.
Expressions for the work done in reversible processes are as follows:
Compression of a fluid
d̄W rev = −P dV (1.4)
To reversibly stretch a wire of tension Γ (that’s a capital gamma) by dl requires

d̄W rev = Γ dl (1.5)

and to increase the area of a film of surface tension γ by dA requires

d̄W rev = γ dA (1.6)

Note in the last two cases the sign is different from the first; that’s because it takes work to
stretch a wire or a soap film (dl or dA positive) but to compress a gas (dV negative).
Lastly, to reversibly increase the magnetic field B imposed upon a paramagnetic sample
requires
d̄W rev = −m · dB = −V M · dB (1.7)
where M is the magnetisation per unit volume, and m is the total magnetic moment of the
sample.
To repeat, these hold only for reversible processes. To calculate the internal energy change
for irreversible processes such as the free expansion of a gas, it is necessary to find a reversible
process linking the same initial and final states of the system.
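To make the reversible case concrete, here is a short numerical sketch (the numbers are invented for illustration: one mole of ideal gas compressed isothermally at 300 K from 2 litres to 1 litre), integrating d̄W_rev = −P dV along the quasi-equilibrium path and comparing with the analytic result nRT ln(Vi/Vf):

    import numpy as np

    # Illustrative numbers: 1 mol of ideal gas, isothermal at 300 K,
    # compressed reversibly from 2.0 L to 1.0 L, so P = nRT/V all along the path.
    n, R, T = 1.0, 8.314, 300.0        # mol, J/(mol K), K
    Vi, Vf = 2.0e-3, 1.0e-3            # m^3

    V = np.linspace(Vi, Vf, 100001)    # the quasi-equilibrium path (dV < 0)
    P = n * R * T / V
    W = -np.sum(0.5 * (P[1:] + P[:-1]) * np.diff(V))   # W = -integral of P dV

    print(W)                            # ~1729 J of work done ON the gas
    print(n * R * T * np.log(Vi / Vf))  # analytic: nRT ln(Vi/Vf), the same

Since the gas is ideal and the process isothermal, ∆E = 0 here, so an equal amount of heat Q = −W leaves the gas.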

1.4 Second law of Thermodynamics


Take-home message: No process which would decrease the entropy of a system will
happen spontaneously. OR Heat won’t pass from a cooler to a hotter; You can try
it if you like but you far better notter
It is observed that when systems start out of equilibrium, they always evolve to the same
equilibrium state determined by the external constraints. For instance if a gas is isolated from
its surrounding in an insulating container and is initially confined only to one part of that
container, if it is then allowed to spread to the whole container it will always reach the same,
uniform, state. (Note that to even talk of a system being out of equilibrium there has to be
something about the system which is NOT specified by the functions of state such as internal
energy and total container volume; here it includes the relative amount of gas in different parts
of the container.) Or if two bodies not at the same temperature are brought into thermal
contact, heat will flow from the hotter to the cooler till they end up at the same temperature
and not evolve further. Heat will never be observed to flow in the other direction.
It is also observed that in any process that extracts energy from an out-of-equilibrium
system (such as a hot body with colder surroundings) in order to do work, it is not possible to
use all of the energy that would flow to the surroundings if it were left to cool naturally. Some
(in practice most) of the energy will always need to be discarded to the surroundings as waste
heat.
These two observations are summarised in two classic statements of the second law of thermodynamics.
One, due to Kelvin and Planck:
It is impossible to construct an engine which, operating in a cycle,
will produce no other effect than the extraction of heat from a
reservoir and the performance of an equivalent amount of work.

And another due to Clausius:


It is impossible to construct a refrigerator which, operating in a
cycle, will produce no other effect than the transfer of heat from a
cooler body to a hotter one.

Note the careful wording: of course it is possible to think of processes which convert heat
into work (expansion of a hot gas in a piston) or which pass heat from a cool to a hot body
(real fridges) but other things change as well (the gas ends up cooler; you have an electricity
bill to pay). The bit about “operating in a cycle” ensures that the engine is unchanged by the
process.
The two statements may not appear to have anything to do with one another, but in fact
each one implies the other: a hypothetical engine or pump which violates one statement can,
along with another normal engine or pump, form a new composite machine which violates the
other.
Clausius’s statement leads to the following. By consideration of a system taken round a
cycle, with heat added to and removed from the system while it or the relevant part of it is at
various temperatures, Clausius’s theorem says that the sum of the heat added weighted by the
inverse of the temperature at which it is added is less than or equal to zero:
∮ d̄Q/T ≤ 0   (1.8)
The inequality becomes an equality for reversible systems, for which we could take the cycle in
the opposite direction:
∮ d̄Q_rev/T = 0.   (1.9)
This is interesting because a quantity whose change vanishes over a cycle implies a function
of state. We know that heat itself isn’t a function of state, but it seems that in a reversible
process “heat over temperature” is a function of state. It is called entropy with the symbol S:

dS = d̄Q_rev/T   (1.10)

Hence we have, for a fluid, a new statement of the first law: the fundamental thermodynamic
relation
dE = T dS − P dV (1.11)
Note that reversible adiabatic processes are isentropic and dE = d̄W rev .
Furthermore, by considering a cycle in which a system undergoes an irreversible change,
followed by a reversible return to the initial state, we have

d̄Qirrev < T dS (1.12)

and so, for an isolated system, during any spontaneous change

dS ≥ 0. (1.13)

and once the entropy reaches a maximum, no further change can take place and the system is
in equilibrium.
An alternative statement of the second law is thus:

The entropy of an isolated system can never decrease.

This is a powerful principle, but it is empirically based and leaves many questions unan-
swered, principally about the nature of entropy.
If we start with entropy increase, we recover Clausius’ statement: when a (positive) amount
of heat Q flows from a hotter body to a cooler one, the entropy change of the hot body,
−Q/TH, is smaller in magnitude than the entropy increase of the cold body, Q/TC. So overall the entropy of
the hot and cold bodies together (or a hot body plus its cooler surroundings, together “the
universe”) increases.
We also recover Kelvin’s statement: to remove heat from a hot body and turn it entirely
into work will decrease the entropy of the universe. We need to add enough of the heat to the
surroundings so that their entropy increases to compensate.
In a Carnot engine QH is removed from a hot reservoir and QC is discarded to a cold one
in such a way that the combined entropy change is zero
QH/TH = QC/TC   (1.14)
and the difference is available to do work: W = QH − QC .
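As a minimal sketch of this balance (the reservoir temperatures and heat input below are invented for illustration):

    def carnot(q_hot, t_hot, t_cold):
        """Reversible (Carnot) engine: zero net entropy change, Q_H/T_H = Q_C/T_C."""
        q_cold = q_hot * t_cold / t_hot    # heat that MUST be discarded
        work = q_hot - q_cold              # first law over a full cycle (Delta E = 0)
        return q_cold, work

    q_cold, work = carnot(q_hot=1000.0, t_hot=500.0, t_cold=300.0)
    print(q_cold, work)   # 600.0 J discarded, 400.0 J of work: efficiency 1 - Tc/Th = 0.4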
Along with heating, other processes which increase entropy are free expansion of a gas,
or mixing of two species of gas initially confined to different parts of a chamber. It is not
immediately clear what these have in common....
1.5 Thermodynamic potentials
Take-home message: Depending on the external conditions, other “potentials” or
“free energies” are easier to work with than the internal energy as they account
for the entropy changes of the surroundings
The fundamental thermodynamic relation
dE = T dS − P dV (1.15)
implies that there is a function E(S, V ) with

(∂E/∂S)_V = T   and   (∂E/∂V)_S = −P   (1.16)
Equivalently we may use S(E, V ):

dS = (1/T) dE + (P/T) dV;   (∂S/∂E)_V = 1/T   and   (∂S/∂V)_E = P/T   (1.17)
In this view, temperature and pressure are derived properties of systems whose energy and
volume are fixed. This is appropriate for isolated systems.
However in practice we often want to work with temperature as the variable we control,
and systems which are not isolated but in contact with a heat bath of our choice.
Now we cannot say that the approach to equilibrium will maximise the entropy of the
system, because it can exchange heat with the heat bath at temperature T , and only when we
consider both together will entropy increase.
Imagine a spontaneous change at constant volume during which Q is absorbed by the system,
∆E = Q. The system starts and ends at the temperature T of the heat bath, though it may
change in between (due to a chemical reaction, for instance). The total change in entropy has
two parts, ∆S for the system and −Q/T for the surroundings. So
∆S_tot = ∆S − Q/T = (1/T)(T∆S − ∆E) ≥ 0
⇒ ∆(TS − E) ≥ 0   (1.18)
So if we concentrate only on the system, rather than its entropy being maximised in the
approach to equilibrium, the quantity T S − E is maximised. Conventionally we define the
Helmholtz free energy, F , as the negative of this, so it is minimised. We have
F = E − TS ⇒ dF = −SdT − P dV ; (1.19)
and
(∂F/∂T)_V = −S   and   (∂F/∂V)_T = −P.   (1.20)
The Helmholtz free energy will turn out to play a crucial role in statistical physics.
If we fix the pressure rather than the volume, we are led to define
G = E − TS + PV ⇒ dG = −SdT + V dP ; (1.21)
and the approach to equilibrium minimises the Gibbs free energy G of the system.
In the latter two cases, the principle is still that the system plus surroundings together
evolve to maximise their entropy. It only looks as if the system may be “trying to minimise its
energy” because an energy transfer to (possibly cooler) surroundings more than compensates
for the decrease in entropy of the system itself.
1.6 Variable particle number
Take-home message: The chemical potential is the same as the Gibbs free energy,
and it is the key to diffusive and chemical equilibrium
In all of the above we have assumed a fixed amount of material in the system. If however
it is in diffusive contact with a reservoir, we must add a term +µdN to each of dE, dF and
dG. µ is the so-called chemical potential, and just as heat flows from high to low temperature,
particles flow from high to low chemical potential, and equilibrium—no net flow—means equal
chemical potential. One of the three expressions for µ as a derivative of a thermodynamic
potential is

µ = (∂G/∂N)_{T,P}   (1.22)
But G is extensive, and P and T are intensive, so G must be directly proportional to N :
G(T, P, N ) = N g(T, P ) (where g is the Gibbs free energy per molecule), and so µ = g. So for
a single component system, µ is just the Gibbs free energy per molecule. Phase coexistence at
a given temperature and pressure, for instance, requires equal Gibbs free energy per molecule
in the two phases. Otherwise the system would evolve to minimise its Gibbs free energy and
one phase would convert into the other (melt, freeze, boil, condense....).
However for a two-component system, with separate chemical potentials µ1 and µ2 , G
depends not only on the extensive variable N = N1 + N2 but also on the ratio N1 /N2 , which is
intensive. The Gibbs free energy per molecule of one substance can depend on the concentration
of the other. If substance 1 is ethanol and substance 2 water, the chemical potential of ethanol
is different in beer (5%) and vodka (40%). Nonetheless an extension of extensivity—that G
must double if N1 and N2 both double—can be shown, rather surprisingly, to imply that

G = Σ_i µi Ni.   (1.23)

In other words the chemical potential of each species remains the Gibbs free energy per particle
of that species, even though it also depends on the relative concentrations of all species present.²
For an ideal gas,

S(T, P) = S(T0, P0) + Cp ln(T/T0) − nR ln(P/P0).   (1.24)
At constant temperature T0 , E and P V are also constant, and so from G = E − T S + P V we
have an expression comparing the chemical potential or Gibbs free energy per molecule at two
different pressures:
µ(T0, P2) − µ(T0, P1) = kB T0 ln(P2/P1)   (1.25)
For mixtures of ideal gases, the absence of interactions means that each species has the same
chemical potential as it would if the other species were absent, and hence the pressure were
equal to the partial pressure of that species, Pi = (Ni /N )P .
Eq. (1.25) says the chemical potential is higher at higher pressures and gas will diffuse from
higher to lower pressure. (A similar expression holds for solutions, with pressure replaced by
concentration.) That seems obvious, but it is different from the equalisation of mechanical
pressure. In particular if two ideal gases are at different concentrations on either side of a rigid
membrane, but only one can pass through, the partial pressure of the mobile one will equalise
even if that increases the mechanical total pressure on one side.
² We will prove this when we consider the Grand Potential later in the course; see section 3.14.
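A quick numerical sketch of eq. (1.25) (the temperature and partial pressures below are invented for illustration):

    import math

    kB, T0 = 1.381e-23, 300.0      # J/K, K

    def dmu(P1, P2):
        """mu(T0,P2) - mu(T0,P1) per molecule for an ideal gas, eq. (1.25)."""
        return kB * T0 * math.log(P2 / P1)

    # Partial pressures of the mobile species on either side of the rigid membrane:
    print(dmu(0.2e5, 0.8e5))   # ~5.7e-21 J: the 0.8 bar side has the higher mu,
                               # so molecules diffuse towards the 0.2 bar side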
1.6.1 Chemical Reactions
The treatment of chemical reactions is very like that of phase transitions. Again, we are
considering conditions of constant temperature and pressure, and the question is the following:
how far will a reaction go?
First consider the simplest case of a reaction with only one reactant and one product:
A ⇌ B. An example is the interconversion of n-pentane and isopentane (or pentane and
methylbutane, for those of us who learned our chemistry in the last fifty years).

[Structural formulas: pentane, a straight chain of five carbons, and methylbutane, a branched chain (hydrogens omitted).]

Spontaneous changes will minimise the Gibbs free energy. With temperature and pressure
fixed only the numbers of A and B can change. Since they can only interconvert, dNA = −dNB
and
dG = µA dNA + µB dNB = (µA − µB )dNA (1.26)
So if µA > µB , A will convert to B, but if µB > µA , the opposite will happen. So at equilibrium,
when no further changes happen, the chemical potentials must be equal. (Remember that the
chemical potentials are functions of concentration, so they will change as the reaction proceeds.)

In the figure “E” marks the equilibrium concentration, at the point where µA = µB .
If there are more reactants or products, say for the more general reaction in which a moles
of A and b moles of B react to form x moles of X and y moles of Y,

aA + bB ⇌ xX + yY,   (1.27)

the numbers of A, B, X and Y change together: dNA = (a/b)dNB = −(a/x)dNX = −(a/y)dNY .


So

dG = µA dNA + µB dNB + µX dNX + µY dNY = (1/a)(aµA + bµB − xµX − yµY) dNA   (1.28)
and equilibrium is when
aµA + bµB = xµX + yµY . (1.29)
This result is general: equilibrium is reached when the weighted sum of the chemical potentials
of the reactants equals that of the products.
Now consider the hugely simplified case where the species are in gaseous form and can be
treated as ideal gases, so that their chemical potential is just their Gibbs free energy per mole,
µi = gi(Pi, T). (Here g refers to G/n rather than G/N.)³
We define the molar Gibbs free energy of reaction as

gr = xgX + ygY − agA − bgB (1.30)

and this will be zero at equilibrium. If we know gr0 ≡ gr(T0, P0) at some reference temperature
and the same reference partial pressure for all reactants and products (usually T0 = 25 °C and
Pi = P0 = 1 bar), then at other partial pressures but the same temperature, from (1.25) it will be
 
gr(Pi, T0) = gr0 + RT0 [ x ln(PX/P0) + y ln(PY/P0) − a ln(PA/P0) − b ln(PB/P0) ]
           = gr0 + RT0 ln[ (PX/P0)^x (PY/P0)^y (P0/PA)^a (P0/PB)^b ]   (1.31)

Hence, since at equilibrium gr = 0,


(PX/P0)^x (PY/P0)^y (P0/PA)^a (P0/PB)^b = exp(−gr0/RT0) = Kp(T0).   (1.32)

If gr0 < 0 and so K is large, the reaction will tend to favour the products (X, Y) over the
reactants (A, B); if on the other hand gr0 > 0 (K small) the reactants will be favoured. But
the actual composition at equilibrium will depend on the initial conditions; if for instance B
is in very short supply, there will always be plenty of A left. If the LHS of equation (1.32) is
calculated for some non-equilibrium state it is called Q; the reaction will go forwards if Q < K
and backwards if Q > K.
Chemists know Kp(T) as the equilibrium constant, and using the symbol [A] for the ratio
of the concentration of species A in a gas or liquid to a standard concentration, write (1.32) as

[X]^x [Y]^y / ([A]^a [B]^b) = Kc(T).   (1.33)

The two equilibrium constants are not numerically the same, because they refer to different
standard states (n0 = 1 mol l^−1 for concentrations). They are related by

Kp = (n0 RT/P0)^{x+y−a−b} Kc.   (1.34)

We have predicted Kp for ideal gases and solutions, but only in terms of a known Gibbs free
energy of reaction at that temperature and standard pressures/concentrations. To predict the
behaviour as a function of temperature requires the statistical theory.
³ This is the only section in which we will use g to mean specific Gibbs free energy. In statistical physics g will always be a degeneracy factor, the number of microstates with the same energy.
For completeness, the general forms of these equations for N reactants and N′ products can
be written

gr = Σ_{i=1}^{N′} x_i g_{Xi} − Σ_{j=1}^{N} a_j g_{Aj}   (1.35)

and, at equilibrium,

Π_{i=1}^{N′} (P_{Xi}/P0)^{x_i} × Π_{j=1}^{N} (P0/P_{Aj})^{a_j} = exp(−gr0/RT0).   (1.36)
Note, as given, gr will depend on whether the reaction is written, e.g., O2 + 2H2 ⇌ 2H2O
or ½O2 + H2 ⇌ H2O, being twice as large in the first case as in the second. You should be
able to convince yourself that the partial pressures at equilibrium do not depend on this. For
a reaction with a single product like this one, though, it is usual to talk about the standard
Gibbs free energy of formation as referring to a single mole of product, i.e. the second case.
All of this is phrased in terms of chemical reactions. But not all reactions are chemical:
in a neutron star, neutrons, protons and electrons can interact via p + e⁻ ⇌ n; they reach an
equilibrium at which the chemical potentials on either side of the reaction are equal. This is
heavily biased toward neutrons, because the chemical potential of the light electron is much
higher for a given concentration than that of the heavy proton or neutron. The treatment
above doesn't reveal that; it would be hidden in the reference quantity gr0. But in fact, we can
and will demonstrate it in statistical physics.
Chapter 2

Statistical Physics of isolated systems

Take-home message: We hope to answer the following questions: what is entropy?
Why does it increase? Can we predict the equations of state of a system from first
principles?
References
• Mandl 2.1-2
• Bowley and Sánchez 4.1 (beware–B&S use W for Ω)
• Kittel and Kroemer 1,2
The heart of the method of Statistical Physics is to predict the macroscopic properties of a
system in equilibrium by averaging over all the microstates which are compatible with the
constraints—fixed volume, energy and particle number for an isolated ideal gas in a perfectly
insulating container, for instance, or (in subsequent sections) fixed temperature and chemical
potential instead of energy and particle number. Furthermore we have to assign a value for the
probability with which each microstate should contribute to the average. The set of all allowed
microstates is called an ensemble and the averages are ensemble averages.
When we measure the properties of a gas, we know instinctively that the momentary po-
sitions and momenta of all the particles are not important. They are changing constantly on
a timescale which is very short compared with the time it takes to make any measurement of
pressure, temperature etc. If the average time between collisions is of the order of nanoseconds,
and if there are 10^23 atoms in a container, the state of the system changes 10^32 times per
second! What we measure really is some kind of time-averaged effect of many microscopic motions.
(Think of the kinetic theory derivation of pressure, for instance.) Statistical Physics formalises
this, and says that for an isolated system we will obtain the correct macroscopic result if we
take an ensemble average over all accessible microstates, assigning them equal probability. This
would be justified in classical mechanics if something called the ergodic hypothesis were true:
this states that during its time evolution, a system visits all possible accessible states in
phase-space (specified by the positions and momenta of all particles) with equal probability.
A whole branch of mathematics is devoted to the ergodic hypothesis in systems which
include, but are not restricted to, Newtonian dynamics. Some flavour of the topic may be
covered in the third year course on non-linear systems. But we are not going to explore it
here. Quite apart from anything else, we know that quantum, not classical, mechanics is the
correct paradigm, and will be important for many of the systems we will look at. We will
briefly consider the ideal gas in classical mechanics, largely for historical interest. However we
will not start with a gas, but with something much easier to picture, something with a close
analogue in a simple counter problem: the ideal paramagnetic lattice.

2.1 Microcanonical Ensemble
Take-home message: The properties of a macrostate are averaged over many mi-
crostates.
The crucial link from microscopic to macroscopic properties is as follows. If the value of
some quantity X in the ith microstate is Xi , and the probability that the system is in that
microstate is pi , then the value of X in the macrostate is the ensemble average.
⟨X⟩ = Σ_i pi Xi   (2.1)

An ensemble is just a collection: we imagine a collection of copies of the system, each in


one of the allowed microstates. If the number of copies ν is much, much larger than Ω, then
each microstate will be represented with a frequency which reflects its probability: if νi is the
number of copies in state i, we have νi = νpi . (We use ν for numbers of copies, and n or N
for numbers of atoms. The former is hugely greater than the latter—it is just as well that it is
only a theoretical concept.)
Then if we use λ = 1 . . . ν to label the copies and i to label the microstates,
⟨X⟩ = (1/ν) Σ_λ X_λ = (1/ν) Σ_i νi Xi = Σ_i pi Xi   (2.2)

There are three kinds of ensembles commonly used in statistical physics. Where the real
system is isolated, that is at fixed energy and particle number, the copies in the ensemble are
also isolated from one another; this is called the microcanonical ensemble.
If the real system is in contact with a heat bath, that is at fixed temperature, the copies
are assumed to be in thermal contact, with all the rest of the copies acting as a heat bath of
any individual copy. This is called the canonical ensemble.
Finally, if the real system can exchange both heat and particles with a reservoir, at fixed
temperature and chemical potential, the copies are also assumed to be in diffusive contact.
This is called the grand canonical ensemble.
The idea that averaging over very many copies of a probabilistic system gives, as the number
tends to infinity, a fixed reproducible result, is an example of the law of large numbers. As an
example, if we roll many identical dice (or one die many times) the average score is predictable
(though it is only 3.5 if the die is fair). Though it sounds obvious, in probability theory it
does need to be proved, though we will not do so here. Crucial is the fact that each roll of the die
is independent of the others, so that in the jargon the results are “independent identically-
distributed random variables”. The usual laws of probability say that the probabilities of a set
of independent outcomes is just the product of the individual probabilities, so that for instance,
with a fair die, the probability of getting first a 6, then anything but a six, in two consecutive
rolls is just 1/6 × 5/6.
We start by considering an isolated system (constant energy, volume and particle number).
The fundamental principle that allows the averaging over microstate to be done is the postulate
of equal a priori probabilities or, in plain English, the assumption that all allowed microstates
are equally likely. (Allowed or accessible means having the same volume, particle number
and total energy as the macrostate.) We use Ω for the number of such microstates, so the
probability of the system being in any one microstate is
pi = 1/Ω   and   Σ_i pi = Ω × (1/Ω) = 1   (2.3)

It should be stressed that, as the name suggests, this postulate is not proved. It is assumed
to be true, but the validation relies on the success of the predictions of the theory.
Imagine we have counters, blue on one side and green on the other, and we toss them and
place them on a 6 × 6 checkerboard. Full information involves listing the colour at each site:
this is the equivalent of a microstate.
Many different patterns are possible, such as the following. Every configuration is equally
likely—or unlikely—to occur: there are Ω = 2^36 = 6.87 × 10^10 patterns and the probability
of each is (1/2)^36 = 1.46 × 10^−11. (This satisfies the “postulate of equal a priori probabilities”.)

Suppose from a distance we only knew, from the average colour, how many counters were
green and how many blue, without being able to distinguish different arrangements of the same
numbers of counters. Then a “macrostate” would be characterised simply by the overall hue,
determined by the total number of green counters (the rest being blue).

Clearly, most macrostates correspond to many microstates. If the macroscopic description


is “15 green”, the following are a few of the allowed microstates:
How many are there in total? This is the common problem of splitting a group of N into
two smaller groups, of n and N − n, without caring about the ordering in each group, and the
number of ways of doing it is
Ω = N! / (n! (N − n)!)   (2.4)
Think of counters on squares: there are N ! ways of putting N distinguishable counters on
N squares. However if n of the counters are green, there are n! ways of arranging the green
counters among themselves without changing the pattern, and (N − n)! ways of arranging the
blues.¹
Here, N = 36 and n = 15, so the total is 5.57 × 10^9. For n = 10 there are only 2.54 × 10^8,
whereas for n = 18, there are 9.08 × 10^9. This is the maximum.
The numbers N!/(n! (N − n)!) are called the binomial coefficients (since they enter the binomial
expansion) and they are written NCn or (N n).
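These counts are easy to check with the standard library (a minimal sketch):

    from math import comb

    N = 36
    print(2**N)                     # 68719476736 ~ 6.87e10 microstates in total
    for n in (10, 15, 18):
        print(n, comb(N, n))        # 254186856, 5567902560, 9075135300
    print(comb(N, 18) / 2**N)       # ~0.13: the most probable hue alone holds ~13%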

2.2 The statistical basis of entropy


Take-home message: The increase of entropy can be understood as an evolution
from less to more probable configurations
Suppose a system has an extra degree of freedom which isn’t specified by fixing the volume
and internal energy. For a mixture of two ideal gases, it could be the degree to which they are
fully mixed – for example, the ratio of the concentrations of one species on either side of the box.
If the gases started out separated by a partition the concentrations would start at 0:1; the
classical law of increase of entropy tells us they will evolve till the ratio reaches 0.5:0.5, and not
change thereafter. (You have previously calculated the increase in entropy to be 2nR ln 2, for
n moles initially in each half of the box.) At the classical level, we don’t understand this yet.
It is just a law of nature, deduced ultimately from observation.
Statistical physics can explain the spontaneous increase of entropy. There are many more mi-
crostates corresponding to the equilibrium configuration (fully mixed) than the non-equilibrium
configurations (not fully mixed). The number of microstates as a function of mixing looks
something like this, but really much sharper for systems consisting of 10^23 atoms:

[Figure: Ω against the mixing ratio, peaked at 0.5:0.5 and falling towards the extremes 0:1 and 1:0.]

If we start at a point of unequal mixing, the configurations which are rather better mixed
are more numerous than those which are rather less well mixed. So as interactions cause the
system to jump from one microstate to another, it is more likely to end up better mixed. This
continues till full mixing is reached, at which point there is no further direction to the changes.
¹ Here, there are just two sets, of size n and N − n. For more sets, of size p, q, r, ..., we have N!/(p! q! r! ...).
What has this to do with entropy? Classically, the system is evolving from a macrostate of
lower entropy to one of higher entropy. Statistically, it is evolving from less probable to more
probable macrostates, that is from macrostates corresponding to smaller numbers of microstates
to those corresponding to larger numbers of microstates.
So does the number of microstates, Ω, equal the entropy? No, because if we double the
size of a system, we have Ω², not 2Ω, microstates (think of the number of ways of choosing the
microstate of each half independently). So Ω isn't extensive. But ln Ω is. So if we make the
connection
S = kB ln Ω (2.5)
then we can understand both entropy and its increase.
This expression is due to Boltzmann, it is known as the Boltzmann entropy to distinguish
it from a more general expression we will meet later, and it is inscribed on his grave in Vienna
(though with W for Ω.)
In principle, if the increase of entropy is just a probabilistic thing, it might sometimes
decrease. However we will see that for macroscopic systems the odds are so overwhelmingly
against an observable decrease that we might as well say it will never happen.
What is kB , Boltzmann’s constant? It must be a constant with dimensions of entropy,
Joules/Kelvin, and it turns out that the correct numerical correspondence is given by the gas
constant R divided by Avogadro’s number:
kB = R/NA = 1.381 × 10^−23 J K^−1 = 8.617 × 10^−5 eV K^−1   (2.6)
This is validated, for instance, by the derivation of the ideal gas law (see section 2.4.2).
As an example, let’s consider the checkerboard example again. Imagine starting with a
perfectly ordered all-blue board, then choosing a counter at random, tossing it, and replacing
it. After repeating this a few times, there are highly likely to be some green counters on the
board—the chance of the board remaining blue is only about 1 in 2^n after n moves. As time
goes on, the number of greens will almost certainly increase—not on every move, but over the
course of a few moves. Here is a snapshot of the board taken once every 10 moves. The number
of greens is 0, 3, 5, 9, 12, 15, 15, 17, 18.
Here is a graph of the number of greens over 100 and 1000 moves.


We see that, after the first 100 moves, the system stayed within 18 ± 6 almost all of the
time. These fluctuations are quite large in percentage terms, ±33%, but then it is a very small
system—not really macroscopic at all.
If we now look at a larger system, 30 × 30, we see that fluctuations are still visible, but they
are much smaller in percentage terms–the number of greens is mostly 450 ± 30, or ±7%.


A 25-fold increase in the size of the system has reduced the percentage fluctuations by a
factor of 5. We will see later that an n-fold increase should indeed reduce the fluctuations
by √n. We can predict that a system with 10^23 counters—truly macroscopic—would have
fluctuations of only about 10^−10 %, which would be quite unobservable. The entropy of the
system would never appear to decrease.
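The experiment is easy to reproduce (a minimal sketch; the board sizes and move counts are chosen to echo the figures above):

    import random

    def greens_history(n_sites, n_moves, seed=1):
        """Start all blue; each move, pick a random counter and re-toss it."""
        rng = random.Random(seed)
        board = [0] * n_sites            # 0 = blue, 1 = green
        history = []
        for _ in range(n_moves):
            board[rng.randrange(n_sites)] = rng.randint(0, 1)
            history.append(sum(board))
        return history

    for n_sites in (36, 900):            # the 6x6 and 30x30 boards
        tail = greens_history(n_sites, 20 * n_sites)[5 * n_sites:]  # drop the approach
        mean = sum(tail) / len(tail)
        sigma = (sum((g - mean) ** 2 for g in tail) / len(tail)) ** 0.5
        print(n_sites, round(mean, 1), round(sigma, 1))  # ~N/2; sigma/mean shrinks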

2.3 The spin-half paramagnet


Take-home message: The paramagnet is the simplest illustration of statistical ideas

• Mandl 2.1
• Bowley and Sánchez 4.3
• Kittel and Kroemer 1,2

What is a spin-½ paramagnet? A paramagnet is a substance which can be magnetised by
an external magnetic field, and the magnetisation is aligned with the external field. Unlike a
ferromagnet the response is weak and does not remain if the field is switched off.
A crystal of atoms with non-zero spin will act as a paramagnet, as the spins will tend to line
up in an external field. From quantum mechanics we know the spin-projection along the field
can only take certain discrete values. For simplicity we consider spin-½, so that sz = ±½. In an
ideal paramagnet, the spins do not feel one another, but react independently to an external
field.
Thus the ideal paramagnet is a lattice of N sites at each of which the spin points either
up or down. Each of these has a magnetic moment ±µ. Only the total magnetic moment is
macroscopically measurable, and this is just the sum of the individual moments. If n↑ spins
are pointing up and n↓ = N − n↑ are pointing down, the total magnetic moment is
m = n↑ µ + n↓ (−µ) = µ(2n↑ − N ). (2.7)
In an external field, the spin-up atoms will have lower energy, −µB, and the spin-down atoms
have higher energy, µB, so the total energy is just
E = n↑ (−µB) + n↓ (µB) = −Bm (2.8)
However we are going to start with zero external magnetic field so that all states have
the same energy. The magnetisation is then an example of an extra degree of freedom not
specified by the energy, as discussed in the previous section.
All the pictures carry over if for “blue” you read spin-up. (The chequerboard is 2-D and a
crystal is 3-D, but in the absence of interaction the geometry is irrelevant; all that counts is
the total number N of atoms.)
So macrostates are characterised by their magnetic moment (or magnetisation, M = m/V ),
but microstates by the list of spins at each site. For N = 3 there are four macrostates and
eight microstates.
[Figure: for N = 3, the four macrostates m = 3µ, µ, −µ, −3µ and their eight microstates.]

In general the number of microstates for a given macrostate, with n↑ = ½(N + m/µ), is

Ω(n↑) = N! / (n↑! (N − n↑)!)   (2.9)
You should be able to prove that the sum of the Ω(N, n↑) over all n↑ is 2^N. (Write 2^N = (1 + 1)^N and use the binomial expansion.)
Below we plot Ω(n), normalised to 1 at the peak, as a function of n/N , for different values
of N .
[Figure: Ω(n)/Ω_max against n/N for N = 10, 100 and 10^4; the peak at n/N = 1/2 sharpens as N grows.]

As N gets larger, the function is more and more sharply peaked, and it is more and more
likely that in the absence of an external magnetic field there will be equal numbers of up and
down spins, giving zero magnetisation.
For large N, the curve is very well approximated by a Gaussian (see examples sheet),

Ω(n) ∝ e^{−(n−N/2)²/(N/2)}   (2.10)

with a mean of N/2 and a standard deviation σ = √N/2. Thus the fractional size of fluctuations
goes as 1/√N.
Since the probabilities of various sizes of fluctuations from the mean in a Gaussian are
known, we can show that in a macroscopic system, 100σ deviations are vanishingly unlikely,
with a probability of 1 in 10^2173. Even they would be undetectable (only one part in 10^10 for a
system of 10^24 particles), so the macroscopic magnetisation is very well defined indeed.
Furthermore the number of microstates of the whole system is dominated by those states
where the numbers of up and down spins are approximately equal. In comparison, there are so
few of the rest we can ignore them.
The fact that the binomial distribution tends to a Gaussian as the number becomes very
large is an example of the central limit theorem. It is not restricted to the binomial distribution:
for any N independent identically-distributed random variables with mean µ and standard
deviation σ, for large N the distribution of their sum is a Gaussian of mean Nµ and standard
deviation √N σ. We won't prove it here, but the examples ask you to test it for a particular
example.
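A quick numerical test of this for dice sums (a sketch along the lines of the examples-sheet exercise):

    import random

    rng = random.Random(2)
    N, trials = 1000, 5000               # dice per sum, number of sums

    sums = [sum(rng.randint(1, 6) for _ in range(N)) for _ in range(trials)]
    mean = sum(sums) / trials
    sigma = (sum((s - mean) ** 2 for s in sums) / trials) ** 0.5

    # One die: mu = 3.5, sigma = sqrt(35/12) ~ 1.708.  The CLT then predicts
    # mean ~ N*mu = 3500 and standard deviation ~ sqrt(N)*sigma ~ 54.
    print(round(mean, 1), round(sigma, 1))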

2.4 From entropy to temperature


Take-home message: From the entropy, we can calculate all the usual functions of
state of a system

• Mandl 2.4
• (Bowley and Sánchez 2.9)

[Figure: a system divided in two, the halves having energies E1, E2, volumes V1, V2 and particle numbers N1, N2.]

In the last section we deduced the existence of entropy, and the fact that at equilibrium the
entropy is a maximum, from statistical arguments. Now we would like to know if we could go
further and, even if we knew no classical thermodynamics, deduce the existence of temperature,
pressure and chemical potential.
By considering two systems in contact with one another we can indeed deduce the existence
of properties which determine whether they are in thermal, mechanical and diffusive equilibrium
even if we knew no classical thermodynamics: these are the three partial derivatives of the
entropy with respect to energy, volume and particle number.
Consider a system divided in two by a wall which can move, and through which energy
and particles can pass. The equilibrium division of the space into two volumes V1 and V2 ,
with energy and particle number similarly divided, will be the one which corresponds to the
maximum number of microstates, and hence to the maximum entropy. If we consider heat flow
only,

dS = (∂S/∂E1)_{V1,N1} dE1 + (∂S/∂E2)_{V2,N2} dE2.   (2.11)
But the microstates of each half can be counted independently, so the entropies add:

S(E1 , E2 , V1 , V2 , N1 , N2 ) = S1 (E1 , V1 , N1 ) + S2 (E2 , V2 , N2 ) (2.12)

and also, since the total energy is conserved, dE1 = −dE2. So

dS = [ (∂S1/∂E1)_{V1,N1} − (∂S2/∂E2)_{V2,N2} ] dE1   (2.13)

and the entropy will be maximised when a small energy flow no longer changes the entropy:
   
(∂S1/∂E1)_{V1,N1} = (∂S2/∂E2)_{V2,N2}   (2.14)

So we deduce there is some property of bodies which governs heat flow; this is clearly
related to temperature. By considering volume changes and particle flow we discover two
more properties which are clearly related to pressure and chemical potential. To discover the
relation we would have to calculate them for some system and see how they compared with
the temperature, pressure etc of classical thermodynamics. However the following assignments
clearly work:
     
(∂S/∂E)_{V,N} = 1/T,   (∂S/∂V)_{E,N} = P/T,   (∂S/∂N)_{E,V} = −µ/T   (2.15)

since they give


dS = (1/T) dE + (P/T) dV − (µ/T) dN   (2.16)
which is the fundamental thermodynamic relation rearranged.
In the following two subsections these ideas are applied to the paramagnet, and we can have
our first go at the ideal gas now, too.

2.4.1 The isolated spin-half paramagnet in a magnetic field


We can apply these to the spin-½ paramagnet in a magnetic field. There is just one difference:
we can’t derive pressure in this case, because the work is magnetic and not mechanical. Instead
of −P dV we have −mdB in the fundamental thermodynamic relation, so we have an expression
for m instead of P:

m/T = (∂S/∂B)_{E,N}   (2.17)
Now for an isolated system the energy is fixed, and therefore so is the number of up spins:
E = −µB(n↑ − n↓ ) = µB(N − 2n↑ ) (note µ is now the magnetic moment!) Then we have
 
S = kB ln Ω(E, B) = kB ln[ N! / (n↑! (N − n↑)!) ]   (2.18)

with n↑ = ½(N − E/µB).
For large numbers of spins, we can use Stirling’s approximation:

ln n! = n ln n − n (2.19)

giving
S = kB (N ln N − n↑ ln n↑ − (N − n↑ ) ln(N − n↑ )) . (2.20)
(Note that S is a maximum, S = N kB ln 2, when n↑ = n↓ = N/2, the point of maximum disorder.)²
So
 
1/T = (∂S/∂E)_{B,N} = (∂S/∂n↑)_N (∂n↑/∂E)_{B,N} = (kB/2µB) ln(n↑/n↓)   (2.21)
Differentiating the entropy with respect to B instead, and using the above result for T , we
get an expression for m:
 
m/T = −(kB E/2µB²) ln(n↑/n↓)   ⇒   m = −E/B
We knew that of course, but it’s good that it works.
We note that for T to be positive there have to be more spins aligned with the field than
against it. We will come back to this point later. The paramagnet is unusual in that there
is a maximum entropy when n↑ = n↓ , corresponding to T → ∞. Most systems can increase
their entropy indefinitely as the temperature increases. But for large N , so long as the energy
above the ground state E_excite is small (n↓ ≪ n↑, E_excite ≪ NµB/2), the number of microstates
of the paramagnet rises extremely rapidly with energy, and hence the entropy rises too. If
we have two paramagnetic samples sharing a fixed energy between them, then the number of
microstates of the whole system is Ω12 = Ω1 (N1 , E1 )Ω2 (N2 , E − E1 ). Ω1 rises with energy E1
but Ω2 falls. The above plot shows the two, together with the number of microstates of the
whole system as a function of E1 (for N1 = N2 = 105 and Eexcite = 2000µB). Ω12 is sharply
peaked at the point where the energy is shared equally between the two systems, which means
in this case that their temperature is the same. The system is overwhelmingly likely to be in
one of the microstates in the immediate vicinity of this maximum, where the entropy of the
combined system is maximised. Though a toy model, this basically illustrates what is going on
every time two macroscopic bodies reach thermal equilibrium.
² This is actually slightly surprising, because it seems to imply Ω = 2^N – but that is ALL microstates. For large N, the distribution is so sharply peaked that the overwhelming majority of the microstates ARE at the peak! Including subleading terms in Stirling's approximation gives S_max less than S_total by a term of order log N, which allows for a peak width which is of order √N.
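The two-sample argument is easy to check numerically (a sketch: here 1000 flipped spins, i.e. a total excitation energy of 2000µB as in the text, are shared between two N = 10^5 samples; log-gamma is used because the factorials themselves would overflow):

    from math import lgamma, exp

    def ln_omega(N, n):
        """ln[ N!/(n!(N-n)!) ], the log of eq. (2.9), via log-gamma."""
        return lgamma(N + 1) - lgamma(n + 1) - lgamma(N - n + 1)

    N1 = N2 = 10**5
    n_flips = 1000        # total flipped (excited) spins: E_excite = 2000 muB in all

    ln_w12 = [ln_omega(N1, n) + ln_omega(N2, n_flips - n) for n in range(n_flips + 1)]
    peak = max(range(n_flips + 1), key=lambda n: ln_w12[n])
    print(peak)                             # 500: the energy is shared equally
    print(exp(ln_w12[250] - ln_w12[peak]))  # a 25:75 split is ~10^-57 times less likely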
2.4.2 The ideal gas, first attempt
To do the ideal gas properly we need to know the quantum states of particles in a box. We
learned this in PHYS20101 last semester, so we will be able to tackle it properly later. However
with much less work we can at least discover how the entropy depends on volume at fixed particle
number and energy.
Consider an isolated system of N atoms in a box of volume V . Imagine the box subdivided
into many tiny cells of volume ∆V , so that there are V /∆V cells in all (this number should be
much greater than N ). Now each atom can be in any cell, so there are V /∆V microstates for
each atom, and (V/∆V)^N microstates for the gas as a whole. Thus

S = N kB ln(V/∆V).   (2.22)

As a result the change in entropy when the volume changes (at fixed energy) is
 
∆S = N kB ln(Vf/Vi).   (2.23)

This is exactly what we found from classical thermodynamics, A.7, and incidentally it validates
the kB in S = kB ln Ω. From the entropy we have
 
P/T = (∂S/∂V)_{E,N} = N kB/V   ⇒   P = N kB T/V   ⇒   P V = N kB T   (2.24)

So we have derived the ideal gas equation from first principles!
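The last two steps can even be checked symbolically (a minimal sketch using sympy):

    import sympy as sp

    N, kB, V, dV, T = sp.symbols('N k_B V DeltaV T', positive=True)

    S = N * kB * sp.log(V / dV)     # eq. (2.22)
    P = T * sp.diff(S, V)           # P = T (dS/dV)_{E,N}, as in eq. (2.24)
    print(sp.simplify(P * V))       # -> N*T*k_B, i.e. PV = N kB T, independent of DeltaV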


A problem with this expression for the entropy is that it depends on the size ∆V of the
imaginary cells into which we subdivided our box. This is clearly unsatisfactory (though at
least entropy changes are independent of it), but classical physics can’t do any better. Quantum
physics can though! If you want to jump ahead, the full expression (called the Sackur-Tetrode
equation) appears when we treat the ideal gas properly later in the course.
Chapter 3

Statistical Physics of Non-isolated Systems

Take-home message: For macroscopic systems results can be more easily obtained
by regarding the temperature, rather than the energy, as fixed.
In principle, with the tools of the last section we could tackle all the problems we want now.
But it turns out to be hard to calculate the entropy of any isolated system more complicated
than an ideal paramagnet. This is because in an isolated system the energy is fixed, and it
becomes complicated to work out all the possible ways the total energy can be split between
all the atoms of the system: we can’t treat each atom as independent of all the others, even if
they are non-interacting.
We don’t have to consider isolated systems though. In this section we will consider systems
in contact with a heat bath, so that their temperature, rather than their energy, is constant.
This has the advantage that if the atoms of a system don’t interact with one another, they can
be treated independently.
For a macroscopic system, there is very little difference in the results from the two ap-
proaches. If the temperature is held constant the energy will fluctuate, but the fractional size
of the fluctuations decreases as 1/√N and so, from a macroscopic point of view, the energy
does not appear to vary and it makes little difference whether the heat bath is there or not. So
lots of results we obtain in this section are also applicable to isolated, macroscopic systems.
We will introduce something called the partition function from which we can calculate the
energy, pressure etc. The heart of the partition function is the Boltzmann distribution,
already met last year, which gives the probability that a particle in contact with a heat bath
will have a given energy.

3.1 The Boltzmann Distribution


Take-home message: The form of the Boltzmann distribution!

• Mandl 2.5
• (Bowley and Sánchez 5.1)
• Kittel and Kroemer 3

If a system is in contact with a heat bath at temperature T , the probability that it is in the
ith microstate, with energy εi , is given by the Boltzmann distribution.

The details of the derivation are as follows. We consider a system S in contact with a heat
reservoir R, the whole forming a single isolated system with energy E0 which is fixed but which
can be differently distributed between its two parts. We can apply what we already know about
isolated systems to the whole, to obtain information about the probabilities of the microstates
of S.

[Figure: a system S with energy E_S = ε in contact with a reservoir R with energy E_R = E0 − ε.]

Heat can be exchanged between the system and reservoir, and the likelihood of a particular
partition depends on the number of microstates of the whole system S + R corresponding to
that partition. (The equilibrium partition will be the one which maximises the number of
microstates, but that is not what we are interested in here.) Since the system and reservoir are
independent, the total number of microstates factorises: Ω = ΩS ΩR.
Now suppose we specify the microstate of S that we are interested in, say the ith (with
energy εi ) and ask:

• what is the probability pi of finding the system in that microstate?

It will be proportional to the number of compatible microstates Ω(E0 , εi ) of the whole


system S + R. However ΩS = 1 as we’ve specified the state of S, so only the microstate of the
reservoir is unspecified: Ω(E0, εi) = ΩR(E0 − εi).
Using the relation between Ω and entropy, we can write

pi ∝ ΩR (E0 − εi ) = exp{SR (E0 − εi )/kB } (3.1)

If R is to be a good reservoir, it must be much bigger than S, so εi ≪ E0. Thus we can expand


SR about SR (E0 ) and keep only the lowest terms:
SR(E0 − εi) = SR(E0) − εi (∂SR/∂E)_{V,N} + ½ εi² (∂²SR/∂E²)_{V,N} + ...   (3.2)

where the derivatives are evaluated at E0 . But the derivative of S with respect to E is just the
inverse of the temperature. Dropping the third term as negligibly small,¹

pi ∝ exp{SR(E0)/kB − εi/(kB T)} ∝ exp{−εi/(kB T)}   (3.3)

(since SR(E0) is a constant, independent of the microstate we are interested in). Calling the
constant of proportionality 1/Z, this is our result:

pi = e^{−εi/kB T} / Z   (3.4)

¹ (∂²SR/∂E²)_{V,N} = (∂(1/T)/∂E)_{V,N} = −(1/T²)(1/CV), so for εi ∼ kB T ≪ CV T ∼ E0 the second-derivative contribution to (3.1) is suppressed by εi/E0, which is assumed to be very much less than 1. Higher derivatives will be further suppressed.

The normalisation constant Z is found by saying that the probability that the system is in some microstate is one: Σ_j p_j = 1, so

Z = Σ_j e^{−ε_j/k_B T}   (3.5)

To recap: if we specify the microstate (with a given energy) of S, the probability of this depends on the number of microstates of the reservoir with the remaining energy. This decreases with decreasing reservoir energy in just the way given by the Boltzmann distribution.
For an ideal gas or paramagnet, where interactions between atoms can be ignored, any
individual particle can be considered as the system S. In that case the Boltzmann distribution
holds for the state of an individual atom (hence typical first-year applications like the variation
of pressure with height in the atmosphere, and the distribution of velocities of atoms in a gas).
For the spin-½ paramagnet in a magnetic field B there are only two energy states: ε↑ = −µB and ε↓ = µB. So

p↑ = e^{µB/k_B T}/Z_1,   p↓ = e^{−µB/k_B T}/Z_1,   and   Z_1 = e^{µB/k_B T} + e^{−µB/k_B T}   (3.6)

(The label on Z_1 refers to the fact that we are talking about the state of a single particle.)
In the whole system of N atoms, the number of up-spins on average will be ⟨n↑⟩ = N p↑, so we have

⟨n↓/n↑⟩ = e^{−2µB/k_B T}   (3.7)
This is exactly consistent with the expression we found for the temperature of the isolated
system with a fixed number of up-spins (and hence energy).
[Figure: the ratio ⟨n↓⟩/⟨n↑⟩ against T in units of µB/k_B, rising from 0 towards 1]

Note that in thermal equilibrium, the average number of particles in the higher energy state
is always less than the number in the lower energy state. As the temperature tends to infinity
the ratio approaches, but never exceeds, one.
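Numerically this is immediate. A minimal sketch in Python (mine, not part of the notes), assuming a moment of one Bohr magneton in a 1 T field:

    import numpy as np

    kB = 1.380649e-23    # J/K
    mu = 9.274e-24       # J/T: assumed moment of one Bohr magneton (illustrative)
    B = 1.0              # T (illustrative field)

    for T in [0.5, 1.0, 5.0, 50.0]:
        x = mu * B / (kB * T)
        Z1 = np.exp(x) + np.exp(-x)               # single-spin Z, eq (3.6)
        p_up, p_down = np.exp(x) / Z1, np.exp(-x) / Z1
        print(T, p_up, p_down, p_down / p_up)     # ratio = exp(-2*mu*B/(kB*T)), eq (3.7)

The ratio climbs towards, but never reaches, one as T increases.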
3.2 The Partition Function
• Mandl 2.5
• Bowley and Sánchez 5.2
• Kittel and Kroemer 3
Take-home message: Far from being an uninteresting normalisation constant, Z
is the key to calculating all macroscopic properties of the system!
The normalisation constant in the Boltzmann distribution is also called the partition func-
tion: X
Z= e−εj /kB T (3.8)
j

where the sum is over all the microstates of the system.²


How can a constant be a function? Well, for a given system and reservoir, that is fixed temperature, particle number, and volume or magnetic field (as appropriate), Z is a constant. But if the temperature etc. are allowed to vary, then Z is a function of them: Z = Z(T, N, V) or Z = Z(T, N, B). (The dependence on V or B comes through the energies of the microstates ε_i.)
Why are we emphasising this? Because if we know Z, we can calculate all macroscopic
properties of the system – energy, pressure, magnetisation, entropy. . .
For instance the average energy ⟨E⟩ (actually an ensemble average) is

⟨E⟩ = Σ_i ε_i p_i = (Σ_i ε_i e^{−ε_i/k_B T})/(Σ_j e^{−ε_j/k_B T})   (3.9)

The top line is like the bottom line (the partition function) except that each term is multiplied by ε_i. We can get the top line from the bottom by differentiating with respect to 1/(k_B T). This is a bit awkward, so we introduce a new symbol

β ≡ 1/(k_B T)   (3.10)
giving

⟨E⟩ = −(1/Z)(∂Z/∂β)_{N,V}   (3.11)

or

⟨E⟩ = −∂ ln Z/∂β   (3.12)
(where, contrary to the strict instructions given earlier, we will take it for granted that it is particle number and volume or magnetic field that we are holding constant.)
When we calculate averages using the Boltzmann probabilities, at fixed temperature, the
corresponding ensemble is called the canonical ensemble.
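As a quick sanity check of (3.12) (my sketch; the four level energies are arbitrary, in units where k_B = 1), the direct average Σ_i ε_i p_i and the numerical β-derivative of ln Z agree:

    import numpy as np

    eps = np.array([0.0, 1.0, 2.5, 4.0])   # arbitrary microstate energies (kB = 1 units)
    beta = 1.3

    def lnZ(b):
        return np.log(np.sum(np.exp(-b * eps)))

    p = np.exp(-beta * eps - lnZ(beta))     # Boltzmann probabilities, eq (3.4)
    E_direct = np.sum(eps * p)

    h = 1e-4                                # step for the numerical derivative
    E_deriv = -(lnZ(beta + h) - lnZ(beta - h)) / (2 * h)   # -d(lnZ)/d(beta), eq (3.12)
    print(E_direct, E_deriv)                # agree to ~1e-8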
From the energy we can find the heat capacity:

C_V = (∂⟨E⟩/∂T)_{V,N}.   (3.13)
² Recall that the microstate is of the whole system, and so the energies ε_i are not necessarily small; for a mole of gas they could be of the order of kJ. It would be easy to forget this, because in a lot of the applications, like the last one, we will in fact apply it to single spins and particles with energies of the order of eV or smaller.
We have found the average energy, but there will be fluctuations as heat is randomly exchanged between the system and the heat bath. These are given by

⟨(ΔE)²⟩ = ⟨E²⟩ − ⟨E⟩²   (3.14)

⟨(ΔE)²⟩ is related to the heat capacity as follows. Since

⟨E²⟩ = (1/Z)(∂²Z/∂β²)_{N,V}   (3.15)

(which should be obvious by analogy with the corresponding expression for ⟨E⟩) we obtain

⟨(ΔE)²⟩ = ∂² ln Z/∂β² = −(∂T/∂β)(∂⟨E⟩/∂T) = (k_B T)² C_V/k_B   (3.16)
For a normal macroscopic system the average energy is of the order of N k_B T and the heat capacity is of the order of N k_B. Thus

ΔE/⟨E⟩ ≈ 1/√N   (3.17)
For a system of 10²⁴ atoms, ΔE/E ≈ 10⁻¹², and so fluctuations are unobservable. There is no practical difference between an isolated system of energy E and one in contact with a heat bath at the same temperature.
There are exceptions. Near a critical point, where the distinction between two phases disappears, the heat capacity becomes very large and the fluctuations do too. This can be observed as "critical opalescence", where the meniscus between the liquid and gas phases disappears and the substance becomes milky and opaque and scatters light. A video and explanation of an analogous phenomenon can be found online.

3.3 Entropy, Helmholtz Free Energy and the Partition Function
Take-home message: Once we have the Helmholtz free energy F we can calculate
everything else we want.

• Mandl 2.5
• (Bowley and Sánchez 5.3-6)
• Kittel and Kroemer 3

We can’t use an ensemble average directly for the entropy, because it doesn’t make sense
to talk about the entropy of a microstate. But we can talk about the entropy of the ensemble
since the many copies can be in many different microstates. So we define the entropy of the
system as the entropy of the ensemble divided by the number of copies, ν, in the ensemble:
hSi = Sν /ν.
The ensemble has ν_i copies in the ith microstate, so the number of ways of arranging these is

Ω_ν = ν!/(ν_1! ν_2! ν_3! . . .)   (3.18)
(compare the ways of arranging counters on the chequerboard).
So, using Stirling's approximation,

ln Ω_ν = ν ln ν − ν − Σ_i (ν_i ln ν_i − ν_i)
       = Σ_i ν_i (ln ν − ln ν_i)   (using ν = Σ_i ν_i in two places)
       = −Σ_i ν_i ln(ν_i/ν)
       = −ν Σ_i p_i ln p_i   (3.19)

So the ensemble entropy is S_ν = k_B ln Ω_ν and the system entropy is

⟨S⟩ = −k_B Σ_i p_i ln p_i   (3.20)

This expression is called the Gibbs entropy. (Note that as all pi lie between 0 and 1, the
entropy is positive.)
Note that we have not said anything about what distribution the probabilities pi follow.
For an isolated system, pi = 1/Ω for each of the Ω allowed microstates, giving the Boltzmann
entropy S = kB ln Ω as before. For a system in contact with a heat bath, pi is given by the
Boltzmann distribution, so
⟨S⟩ = −k_B Σ_i p_i ln p_i
    = −k_B Σ_i p_i (−ε_i β − ln Z)
    = k_B (⟨E⟩β + ln Z)
    = ⟨E⟩/T + k_B ln Z   (3.21)
Rearranging we get k_B T ln Z = −⟨E⟩ + T⟨S⟩ = −⟨F⟩ where F is the Helmholtz free energy, or

F = −k_B T ln Z.   (3.22)
Since F = E − TS, from the fundamental thermodynamic relation we obtain dF = −S dT − P dV + µ dN. Thus

S = −(∂F/∂T)_{V,N}   P = −(∂F/∂V)_{T,N}   µ = (∂F/∂N)_{T,V}   (3.23)

(You first met these in the derivation of Maxwell’s relations.) For a magnetic system, we have
m = − (∂F/∂B)T,N instead of the equation for P.
Remember, Z and hence F depend on V (or B) through the energies of the microstates. For
instance the energy levels of a particle in a box of side L are proportional to h̄2 /(mL2 ) ∝ V −2/3 .
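These identities are easy to verify numerically; a minimal sketch (mine), with an arbitrary three-level spectrum and units where k_B = 1:

    import numpy as np

    eps = np.array([0.0, 0.7, 1.9])   # arbitrary microstate energies (kB = 1)
    T = 0.8
    beta = 1.0 / T

    Z = np.sum(np.exp(-beta * eps))
    p = np.exp(-beta * eps) / Z
    E = np.sum(eps * p)
    S = -np.sum(p * np.log(p))        # Gibbs entropy (3.20), in units of kB

    print(-T * np.log(Z), E - T * S)  # F = -kB*T*ln(Z) equals E - T*S, eq (3.22)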
These relations are reminiscent of those we met in the case of an isolated system, but there the entropy was the key; here it is the Helmholtz free energy. We can make the following comparison:

    system                      isolated                  in contact with heat bath
    fixed                       E, N, V (or B)            T, N, V (or B)
    key microscopic function    no. of microstates Ω      partition function Z
    key macroscopic function    S = k_B ln Ω              F = −k_B T ln Z

It should not surprise us to find that the Helmholtz free energy is the key to a system at fixed temperature (in contrast to the entropy for an isolated system) as that is what we found classically (see here).

3.4 The Gibbs entropy


Take-home message: The state of maximum entropy can be thought of as the one in which a determination of the microstate would cause the most information gain.
We derived the expression S = −k_B Σ_i p_i ln p_i from an ensemble average, essentially using the law of large numbers and the ergodic hypothesis. We did not use any other result in statistical physics at that point, and so it is tempting to think that it tells us something about a probability distribution in a more general context.
In fact the expression was independently derived in the context of information theory by
Shannon (without the kB ). The idea is that if S is large, there must be many outcomes all
with non-vanishing probabilities, that is we do not know what the outcome is likely to be in
advance. On the other hand if one outcome is virtually certain, S ≈ 0. So the initial S,
reflecting our prior knowledge from which we have calculated the pi , gives us a measure of how
much information we gain on average by actually finding out the outcome.
The form was derived by considering a "surprise function" S(p) quantifying the information gain associated with a particular outcome of probability p. It must be positive for 0 < p ≤ 1 and satisfy S(1) = 0 (no surprise if the outcome is certain). Furthermore for two independent events the probability of a particular pair of outcomes with probabilities p and q is pq; but if they are independent it is reasonable that the information gained is additive: S(pq) = S(p) + S(q). Only S = −c ln p has these properties. Then for a distribution, S = −Σ_i p_i ln p_i is the "average surprise" (defined up to an arbitrary multiplicative constant).³
Modern statistical physics starts from the Gibbs / Shannon entropy and derives the equilib-
rium distributions (the set of pi ), from the following principle: the equilibrium entropy should
be maximised subject to constraints. For the microcanonical distribution the only constraint
is that the sum of the probabilities is 1. To obtain the Boltzmann distribution the constraint
is not on the temperature but the average energy, which amounts to the same thing.
How do we maximise with constraints? We use the method of Lagrange multipliers (section A.2). For a set of n constraints which are functions of the set of probabilities, u_n(p_i), and which are required to vanish, instead of extremising S alone we extremise S + Σ_n λ_n u_n with respect to all the p_i.
To obtain the microcanonical distribution, for which all Ω accessible microstates have the same energy, we need, for all j,

d/dp_j [−k_B Σ_i p_i ln p_i + λ (Σ_i p_i − 1)] = 0   ⇒   −k_B (ln p_j + 1) + λ = 0,   (3.24)

from which, rearranging, we get

pj = exp(−1 + λ/kB ) (3.25)


³ Shannon used logarithm to the base 2, which introduces a factor of 1/ln 2.
The exact form is uninformative (we don't immediately know λ) but the principle is not: all probabilities are equal. We have derived the principle of equal a priori probabilities by maximising the Gibbs entropy. (We can then say that p_i = 1/Ω and solve for λ if we want.)
To obtain the canonical distribution, for which the microstates have different energies ε_i, constraining the average energy to be E we need, for all j,

d/dp_j [−k_B Σ_i p_i ln p_i + λ_1 (Σ_i p_i − 1) + λ_2 (Σ_i ε_i p_i − E)] = 0   (3.26)

It is left as an exercise to show that we recover the Boltzmann distribution and to identify the
two Lagrange multipliers in terms of β and Z.
Finally the grand canonical distribution has microstates with variable particle number and
we fix the average N ; we obtain the Gibbs distribution which we will meet in the next chapter,
and the third Lagrange multiplier is related to the chemical potential.
In the context of information theory, the procedure of maximising S subject to the constraints is used as a way of finding an unbiased prior estimate of the probabilities. (It is not always possible to conduct infinite numbers of trials in order to assign probabilities in proportion to the frequencies!)
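The constrained maximisation can also be done numerically. The sketch below (mine, using scipy's SLSQP minimiser on −S, with an arbitrary three-level spectrum) recovers the Boltzmann form: ln p_i is linear in ε_i, with a common slope −β fixed by the energy constraint.

    import numpy as np
    from scipy.optimize import minimize

    eps = np.array([0.0, 1.0, 3.0])   # arbitrary level energies (kB = 1)
    E_target = 0.8                    # the constrained average energy

    def neg_entropy(p):
        return np.sum(p * np.log(p))  # -S/kB

    cons = ({'type': 'eq', 'fun': lambda p: np.sum(p) - 1.0},
            {'type': 'eq', 'fun': lambda p: np.dot(eps, p) - E_target})
    res = minimize(neg_entropy, x0=np.ones(3) / 3, method='SLSQP',
                   bounds=[(1e-9, 1.0)] * 3, constraints=cons)
    p = res.x

    # For a Boltzmann distribution ln(p_i/p_0) = -beta*eps_i, so both of these
    # ratios should print the same number, -beta:
    print(np.log(p[1] / p[0]) / eps[1], np.log(p[2] / p[0]) / eps[2])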

3.5 The paramagnet at fixed temperature


Take-home message: Understand the paramagnet and you are close to mastering
the subject!

• Mandl 3
• (Bowley and Sánchez 5.7)
• (Kittel and Kroemer 3)

First, recap previous sections on the isolated spin-½ paramagnet at zero and non-zero magnetic field.
The ideal paramagnet is a lattice of N sites at each of which the spin points either up or
down. Each of these has a magnetic moment ±µ. In an external field, these two states will
have different energy; spin-up has energy −µB, and spin-down, µB. As we saw previously the
partition function for a single atom is therefore
 
Z_1 = e^{µB/k_B T} + e^{−µB/k_B T} = 2 cosh(µB/k_B T) = 2 cosh(µBβ)   (3.27)

(Remember β = 1/kB T .)
Since the atoms are non-interacting, the total energy and magnetisation of the system are
just N times the average energy and magnetisation of a single spin. The energy is
⟨E⟩ = −N ∂ ln Z_1/∂β = −NµB tanh(µBβ)   (3.28)

(For a refresher on hyperbolic trig functions, see here.)


[Figure: ⟨E⟩ against 1/(µBβ) and against T in units of µB/k_B, rising from −NµB towards zero]

At low T, all the spins are aligned with the field and the energy per spin is close to −µB. However as T increases, thermal fluctuations start to flip some of the spins; this is noticeable

when k_B T is of the order of µB. As T gets very large, the energy tends to zero as the numbers of up and down spins become more nearly equal. Remember, ⟨n↓/n↑⟩ = exp(−2µB/k_B T), so it never exceeds one.
We can also calculate the heat capacity:

C_V = ∂⟨E⟩/∂T = N k_B (µBβ)² sech²(µBβ)   (3.29)

[Figure: C_V against T in units of µB/k_B, rising to a peak and falling to zero at both low and high T]

We see that the heat capacity tends to zero both at high and low T . At low T the heat
capacity is small because kB T is much smaller than the energy gap 2µB, so thermal fluctuations
which flip spins are rare and it is hard for the system to absorb heat. This behaviour is universal;
quantisation means that there is always a minimum excitation energy of a system and if the
temperature is low enough, the system can no longer absorb heat.
The high-T behaviour arises because the number of down-spins never exceeds the number of
up-spins, and the energy has a maximum of zero. As the temperature gets very high, that limit
is close to being reached, and raising the temperature still further makes very little difference.
This behaviour is not universal, but only occurs where there is a finite number of energy levels
(here, there are only two). Most systems have an infinite tower of energy levels, there is no
maximum energy and the heat capacity does not fall off.

[Figure: level occupations at low temperature (k_B T ≪ µB: n↑ ≫ n↓) and at high temperature (k_B T ≫ µB: n↑ ≈ n↓)]
Up to now we've cheated a bit (though the results are correct), in that we didn't calculate the partition function for the whole system, only for a single spin. It is easy to show, however, that the partition function for N non-interacting spins on a lattice is

ZN = (Z1 )N (3.30)

Let’s start with a system that has two single-particle energy levels, ε1 and ε2 . The single-
particle partition function is
Z1 = e−ε1 β + e−ε2 β . (3.31)
The partition function for two distinguishable particles is

Z2 = e−2ε1 β + 2e−(ε1 +ε2 )β + e−2ε2 β = (Z1 )2 , (3.32)

where the second state is multiplied by 2 because there are two ways that two distinguishable
particles can be in different levels.
In general, for N particles, the energies are nε_1 + (N − n)ε_2, for 0 ≤ n ≤ N, and there are N!/n!(N − n)! separate microstates of this energy. So
Z_N = Σ_{n=0}^{N} [N!/(n!(N−n)!)] e^{−(nε_1 + (N−n)ε_2)β}
    = Σ_{n=0}^{N} [N!/(n!(N−n)!)] e^{−nε_1 β} e^{−(N−n)ε_2 β}
    = Σ_{n=0}^{N} [N!/(n!(N−n)!)] (e^{−ε_1 β})^n (e^{−ε_2 β})^{N−n} = (Z_1)^N   (3.33)

where we've used the binomial expansion of (x + y)^N.


If there are more than two energy levels, Z1 has more terms, but a similar derivation can
be done. However we won’t show it because it is just a special case of a future section.
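The result (3.33) is easy to confirm by brute force; a sketch (mine), enumerating all 2^N microstates of N distinguishable two-level particles with arbitrary energies:

    import itertools
    import numpy as np

    e1, e2, beta, N = 0.0, 1.3, 0.9, 6   # arbitrary levels, inverse temperature, N = 6
    Z1 = np.exp(-beta * e1) + np.exp(-beta * e2)

    ZN = sum(np.exp(-beta * sum(state))              # sum over all 2^N microstates
             for state in itertools.product([e1, e2], repeat=N))

    print(ZN, Z1**N)   # identical up to rounding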
Since hEN i is derived from ln ZN , we see immediately that the results for N particles will
just be N times the single particle values, as we assumed at the start of this section. We can also
calculate the Helmholtz free energy, F = −kB T ln ZN . The magnetisation, m = −(∂F/∂B)T,N ,
gives m = −E/B as expected. We can find the entropy, from S = −(∂F/∂T )B,N or from
S = (E − F)/T:

⟨S⟩ = N k_B [ln(2 cosh(µBβ)) − µBβ tanh(µBβ)]   (3.34)
Below we plot S and m against temperature for several different external fields.

[Figure: S (rising towards N k_B ln 2) and m (falling from Nµ) against T for fields B_0/2, B_0 and 2B_0]

At zero
temperature, the magnetisation goes to N µ: all the spins are up. There is no disorder, and so
the entropy is zero.
The stronger the field, the higher the temperature has to be before the spins start to be
appreciably disordered.
At high temperatures the spins are nearly as likely to be up as down; the magnetisation
falls to zero and the entropy reaches a maximum. The entropy of this state is N kB ln 2, as we
have already seen.
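All of these curves follow from the formulas above; a sketch (mine) of the per-spin quantities, with temperature in units of µB/k_B:

    import numpy as np

    t = np.linspace(0.05, 20.0, 400)   # t = kB*T/(mu*B)
    x = 1.0 / t                        # x = mu*B*beta

    E = -np.tanh(x)                               # energy per spin in units of mu*B, eq (3.28)
    C = x**2 / np.cosh(x)**2                      # heat capacity per spin in units of kB, eq (3.29)
    S = np.log(2 * np.cosh(x)) - x * np.tanh(x)   # entropy per spin in units of kB, eq (3.34)

    print(S[0], S[-1], np.log(2))   # S: ~0 at low T, -> kB ln 2 at high T
    print(t[np.argmax(C)])          # ~0.4: the heat-capacity peak sits at kB*T of order mu*B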
There is a caveat to the formula ZN = (Z1 )N . The argument says that there are a number of
different microstates with the same number of up and down spins. Since the spins are arranged
on a lattice, this is correct; every spin can be distinguished from every other spin by its position.
When we go on to consider a gas, however, this is no longer so, and the relation between Z1
and ZN changes. The treatment for indistinguishable particles is here.
Returning to the expression for the partition function for two distinguishable two-state particles, (3.32), we see that two microstates have the same energy ε_1 + ε_2, and instead of listing them twice we have written it once and multiplied by two. We say that this energy is "doubly degenerate" or has a degeneracy of two. In general we can write the partition function as a sum over distinct energies ε_n rather than over all microstates, if we include the degeneracies g_n:

Z = Σ_j e^{−ε_j β} = Σ_n g_n e^{−ε_n β}.

The first sum runs over microstates, the second over energies.

3.6 Adiabatic demagnetisation and the third law of thermodynamics
Take-home message: The properties of a paramagnet can be put to practical use
to achieve low temperatures, but we can never get to absolute zero.

• Mandl 5.6

By magnetising and demagnetising a paramagnetic sample while controlling the heat flow,
we can lower its temperature.
[Figure: S against T for fields B_1 and B_2 > B_1, with the path a → b (isothermal, B_1 → B_2) → c (adiabatic, B_2 → B_1)]
We start with the sample in a magnetic field B1 at an (already fairly low) temperature T1 .
a→ b: With the sample in contact with a heat bath at T1 , we increase the magnetic field
to B2 .
b→ c: With the sample now isolated, we slowly decrease the field to B1 again. This is
the adiabatic demagnetisation step; because the process is slow and adiabatic, the entropy is
unchanged.
By following these steps on a T − S plot, we see that the second, constant entropy, step,
reduces the temperature. The entropy is a function of B/T only, not B or T separately (see
here) so if we reduce B at constant S, we reduce T also.
The following figure shows what is happening to the spins.

[Figure: spin level populations, with spacing 2µB_1 at k_B T_1; after isothermal magnetisation the spacing is 2µB_2, still at k_B T_1; after adiabatic demagnetisation it is 2µB_1 again, now at k_B T_2]

In the first step we increase the
level spacing while keeping the temperature constant, so the population of the upper level falls.
In the second step we reduce the level spacing again, but as the spins are isolated there is no
change in level occupation. The new, lower level occupation is now characteristic of a lower
temperature than the original one.
If we start with a large sample, we could repeat the process with a small sub-sample, the remaining material acting as a heat bath during the next magnetisation. By this method temperatures of a fraction of a kelvin can be reached. However after a few steps less and less is gained each time, as the curves come together as T → 0. (Once the electron spins are all ordered, one can start to order the nuclear spins, and reach even lower temperatures; the magnetic moment of the nucleus is around a two-thousandth of that of the atom.) But even that has its limits.
This is an important and general result. There is always a minimum excitation energy ε of the system, and once k_B T ≪ ε there is no further way of lowering the temperature. The unattainability of absolute zero is the third law of thermodynamics.

The laws of Thermodynamics


1: You can’t win, you can only break even.
2: You can only break even at T=0.
3: You can’t reach T=0....

In the process above, the lowest temperature attainable is obviously proportional to µB1 /kB .
You might wonder why we can’t just take B1 → 0. But in any real paramagnet, there is a weak
coupling between the spins which means that they prefer to be aligned with one another. If
we remove the external field, this coupling acts like a weak internal field, and at low enough
temperatures the spins will still be ordered. The strength of this coupling is then what governs
the lowest attainable temperature.

[Figure: S against T at low field for an ideal paramagnet (S stays at N k_B ln 2 as T → 0) and for a real paramagnet (S falls to zero at low T because of the residual spin-spin coupling)]

3.7 Vibrational and rotational energy of a diatomic molecule


Take-home message: Along with the next section, the partition function allows
us to predict the temperature-dependent internal energy and heat capacities of
diatomic gases, based on quantum mechanics

• (Bowley and Sánchez 5.11,5.12)

So far we have only looked at two-level systems such as the paramagnet. More usually there
are many or even infinitely many levels, and hence terms in the partition function. In some
special cases the partition function can still be expressed in closed form.
Vibrational energy of a diatomic molecule
The energy levels of a quantum simple harmonic oscillator of frequency ω are

ε_n = (n + ½)ħω,   n = 0, 1, 2, . . .   (3.35)

so

Z_1 = Σ_{n=0}^{∞} e^{−ε_n β} = e^{−ħωβ/2} (e^0 + e^{−ħωβ} + e^{−2ħωβ} + . . .)
    = e^{−ħωβ/2}/(1 − e^{−ħωβ})
    = [2 sinh(½ħωβ)]^{−1}   (3.36)

where we have used the expression for the sum of a geometric series, Σ_n x^n = (1 − x)^{−1}, with x = e^{−ħωβ}.
From this we obtain

⟨E_1⟩ = −∂ ln Z_1/∂β = ½ħω coth(½ħωβ).   (3.37)

The low temperature limit of this (k_B T ≪ ħω; ħωβ → ∞) is ½ħω, which is what we expect if only the ground state is populated. The high temperature limit (k_B T ≫ ħω; ħωβ → 0) is k_B T, which should ring bells! (See here for more on limits.)
Typically the high temperature limit is only reached around 1000 K.
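Both limits are easy to check; a sketch (mine), in units where ħω = 1:

    import numpy as np

    def E_vib(kT):                     # eq (3.37), with hbar*omega = 1
        return 0.5 / np.tanh(0.5 / kT)

    print(E_vib(0.01))   # -> 0.5: only the ground state (zero-point energy) contributes
    print(E_vib(100.0))  # -> ~100.0 = kB*T: the classical equipartition value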
Rotational energy of a diatomic molecule
The energy levels of a rigid rotor of moment of inertia I are
ε_l = l(l + 1)ħ²/2I,   l = 0, 1, 2, . . .   (3.38)
but there is a complication: as well as the quantum number l there is m_l, with −l ≤ m_l ≤ l, and the energy doesn't depend on m_l. Thus the lth energy level occurs 2l + 1 times in the partition function, giving

Z_1 = Σ_{l=0}^{∞} Σ_{m_l=−l}^{l} e^{−l(l+1)ħ²β/2I} = Σ_{l=0}^{∞} (2l + 1) e^{−l(l+1)ħ²β/2I}.   (3.39)

The term 2l + 1 is called a degeneracy factor, since "degenerate" levels are levels with the same energy. (I can't explain this bizarre usage, but it is standard.) For general β this cannot be further simplified. At low temperatures successive terms in Z_1 fall off quickly; only the lowest levels will have any significant occupation probability and the average energy will tend to zero.
At high temperatures (k_B T ≫ ħ²/2I) there are many accessible levels and the fact that they are discrete rather than continuous is unimportant; we can replace the sum over l with an integral ∫dl; changing variables to x = l(l + 1) gives

Z_1 = 2I/(ħ²β),   ⟨E_1⟩ = k_B T   (3.40)

Typically h̄2 /2I is around 10−3 eV, so the high-temperature limit is reached well below room
temperature.
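The sum (3.39) converges quickly, so the crossover to the classical limit is easy to see numerically; a sketch (mine), in units where ħ²/2I = 1 (so the high-temperature limit is Z_1 → k_B T):

    import numpy as np

    def Z_rot(kT, lmax=400):
        l = np.arange(lmax + 1)
        return np.sum((2 * l + 1) * np.exp(-l * (l + 1) / kT))   # eq (3.39)

    for kT in [0.1, 1.0, 5.0, 50.0]:
        print(kT, Z_rot(kT))   # approaches kT (= 2I/(hbar^2*beta) in these units) as kT grows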
It is not an accident that the high-temperature limit of the energy was kB T in both cases!
These are examples of equipartition which is the subject of a future section.

3.8 Translational energy of a molecule in an ideal gas


Take-home message: We can calculate the translational energy too. We are not done with the ideal gas though....

• Mandl 7.1-4, Appendix B


• Bowley and Sánchez 5.9,7.2
• Kittel and Kroemer 3

This example is rather more complicated than the preceding ones, but the result is simple and
powerful.
The non-interacting atoms of the gas are in a cuboidal box of side lengths Lx , Ly and Lz ,
and volume V ≡ Lx Ly Lz . The sides of the box are impenetrable, so the wave function ψ must
vanish there, but inside the box the atom is free and so ψ satisfies the free Schrödinger equation

−(ħ²/2m) ∇²ψ(x, y, z) = Eψ(x, y, z).   (3.41)

The equation, and the boundary conditions, are satisfied by

ψ(x, y, z) = A sin(n_x πx/L_x) sin(n_y πy/L_y) sin(n_z πz/L_z)   (3.42)

with n_x, n_y and n_z integers greater than zero. The corresponding energy is

ε(n_x, n_y, n_z) = [(n_x π/L_x)² + (n_y π/L_y)² + (n_z π/L_z)²] ħ²/2m ≡ k²ħ²/2m   (3.43)

where k² = k_x² + k_y² + k_z² and k_x = πn_x/L_x etc. So the one-particle partition function is

Z_1 = Σ_{n_x,n_y,n_z} e^{−ε(n_x,n_y,n_z)β}.   (3.44)

In general this cannot be further simplified. However there can be simplifications if k_B T is much greater than the spacing between the energy levels, as we saw in the rotational case. For a volume of 1 litre, that spacing is of order ħ²π²/2mL² ≈ 10⁻²⁰ eV, truly tiny. Even at the lowest temperatures ever reached, we are in the high-temperature regime! Thus we can replace the sum over levels by an integral. (This is called the continuum approximation.) We choose k_x, k_y and k_z as the variables, and replace Σ_{n_x} with (L_x/π) ∫dk_x, giving

Z_1 = (L_x L_y L_z/π³) ∫₀^∞∫₀^∞∫₀^∞ dk_x dk_y dk_z e^{−ε(k)β}
    = (V/π³) ∫₀^∞∫₀^{π/2}∫₀^{π/2} k² sin θ_k dk dθ_k dφ_k e^{−ε(k)β}   (converting to spherical polar coordinates)
    = (1/8) × 4π × (V/π³) ∫₀^∞ k² dk e^{−ε(k)β}
    ≡ ∫₀^∞ g(k) e^{−ε(k)β} dk   where g(k) ≡ V k²/2π²   (3.45)

The factor of 1/8 in the penultimate line comes from the fact that we only integrated over
positive values of kx etc, that is over the positive octant of k-space. g(k) is called the density
of states in k-space; g(k)dk is the number of states within the range k → k + dk. See here for
more on this concept.
This section only depended on the fact that the energy is independent of the direction of k.
Now we use the actual form of ε(k) to complete the calculation:
Z_1 = (V/2π²) ∫₀^∞ k² e^{−ħ²k²β/2m} dk = V (m/2πħ²β)^{3/2} ≡ V n_Q.   (3.46)
Z1 is a pure number, so “nQ ” must have dimensions of 1/V like a number density; it is called the
quantum concentration and is temperature-dependent. From Z1 we can obtain the average
single particle energy:
⟨E_1⟩ = −∂ ln Z_1/∂β = (3/2) k_B T   (3.47)
as we should have expected.
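To put a number on n_Q, a sketch (mine) for helium at room temperature:

    import numpy as np

    hbar = 1.054571817e-34   # J s
    kB = 1.380649e-23        # J/K
    m = 6.646e-27            # kg, mass of a helium-4 atom
    T = 300.0                # K

    nQ = (m * kB * T / (2 * np.pi * hbar**2))**1.5
    print(nQ)   # ~8e30 m^-3, far above the ~3e25 m^-3 number density of a gas at STP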

3.8.1 The Density of States


Take-home message: This way of presenting the calculation for the ideal gas makes
the partition function as a sum over states more intuitive.
[Figure: the lattice of allowed k values in the positive octant of k-space (spacing π/L_x etc.), with the octant of a spherical shell between k and k + dk]

Going through the algebra to calculate the translational partition function we turned a sum over the integers n_x, n_y and n_z, which count the number of half wavelengths along the three sides, into an integral over k. Since the energy depends only on k = |k|, we could do the integral over the direction of k, leaving only the integral over k; in this process we collected a number of factors and called them the density of states: g(k) = V k²/2π², so that

Z_1 = ∫₀^∞ g(k) e^{−ε(k)β} dk   (3.48)

We see that g(k) is acting as a “degeneracy factor”, which we first met in the context of the
rotor. If there is more than one state with the same energy, and we replace the sum over
individual states with a sum over allowed energies, we need to include a factor in front of the
Boltzmann factor for degenerate levels so that they are counted often enough.
The picture above shows a graphical representation of the allowed states in k-space. Since

k = (πn_x/L_x, πn_y/L_y, πn_z/L_z),   (3.49)
with nx etc positive, the allowed values of k form a three-dimensional lattice. The density of
states is the number of states within an infinitesimal range of k, and hence of energy. This is
just the volume of an octant of a spherical shell, (1/8) × 4πk 2 × dk, divided by the volume of
k-space per state, π 3 /V , giving
g(k) dk = (V k²/2π²) dk.   (3.50)
We will meet this expression many times in the rest of the course. Later, we will apply it to systems of particles with spin s, where the full description of every single-particle state includes the spin projection m_s, which takes on 2s + 1 values at integer steps between −s and s. So for a spin-½ particle, m_s = −½ or ½; for spin-1, m_s = −1, 0 or 1. In these cases the spatial state specified by (3.42) corresponds to 2s + 1 distinct quantum states (all of the same energy in the absence of a magnetic field) and so the density of states has an extra degeneracy factor g_s ≡ 2s + 1:

g(k) = g_s V k²/2π².   (3.51)
and

Z_1 = g_s V (mk_B T/2πħ²)^{3/2} ≡ V g_s n_Q.   (3.52)
If expressions are given without the gs , they apply to spin-zero particles only. It can be restored
by nQ → gs nQ in any expression missing it.
Later on we will also use g(ε), where ε(k) is the energy of a particle with momentum h̄k.
g(ε) is defined so that the number of states with wave numbers between k and k + dk has to
be the same as the number with energies between ε(k) and ε(k + dk) = ε(k) + dε, i.e.

g(k) dk = g(ε) dε,   where dε = (dε/dk) dk.

So in 3D, for non-relativistic particles with ε = ħ²k²/2m,

g(ε) = [g_s V (2m)^{3/2}/4π²ħ³] ε^{1/2}.

Finally, a comment on notation: Mandl uses D(k) rather than g(k), as I did in the notes
for Thermal and Statistical Physics. Dr Galla used both! Strictly, g(k) is not a degeneracy but
a degeneracy density so a stricter notation would be something like dg/dk. That is what D(k)
is trying to indicate. The lecturers of PHYS30151, whose past exams you will be looking at in
due course, used dn/dk. But the symbol n is overworked in this course as it is! Warning: Dr
Xian in PHYS20252 uses g(ε) to mean a double density, the number of states per unit energy
and per unit (physical) volume. Hence in his notation there is no V in the density of states.

3.8.2 The Maxwell-Boltzmann Distribution


Take-home message: The density of states can be used to derive the Maxwell-
Boltzmann distribution of molecular speeds in a gas.

• Mandl 7.7
• Bowley and Sánchez 7.4
• Kittel and Kroemer 14
The speed v of a particle is related to the wave number k by mv = h̄k. We already know
the probability of a particle having k in the range k → k + dk, and so we can immediately write
down the corresponding probability of the speed being in the range v → v + dv:

P(k → k + dk) = [g(k) e^{−ε(k)β}/Z_1] dk   where ε(k) = ħ²k²/2m
P(v → v + dv) = [V e^{−ε(v)β}/(2π² Z_1)] (m/ħ)³ v² dv   where ε(v) = mv²/2

⇒ P(v) = √(2/π) (m/k_B T)^{3/2} v² e^{−mv²/2k_B T}   (3.53)
This is called the Maxwell-Boltzmann distribution, and it is plotted below.
[Figure: the Maxwell-Boltzmann speed distribution P(v), with v in units of √(k_B T/m); v_p, ⟨v⟩ and v_rms are marked]

We can find the most probable speed (from dP (v)/dv = 0), as well as the mean speed and
the rms speed:
v_p = √(2k_B T/m) ≈ 1.41 √(k_B T/m)
⟨v⟩ = √(8k_B T/πm) ≈ 1.60 √(k_B T/m)
v_rms = √⟨v²⟩ = √(3k_B T/m) ≈ 1.73 √(k_B T/m)   (3.54)
These are marked on the graph above.
Note that h̄ has disappeared from P (v), which can be derived from the Boltzmann distri-
bution in a purely classical theory provided the normalisation is obtained from requiring the
integral of P (v) to be one.
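The three characteristic speeds can be recovered by numerical integration of P(v); a sketch (mine), in units where k_B T/m = 1:

    import numpy as np

    v = np.linspace(0.0, 12.0, 200001)                  # speeds in units of sqrt(kB*T/m)
    P = np.sqrt(2 / np.pi) * v**2 * np.exp(-v**2 / 2)   # eq (3.53) in these units
    dv = v[1] - v[0]

    print(np.sum(P) * dv)                    # normalisation: 1
    print(v[np.argmax(P)])                   # most probable speed: sqrt(2) ~ 1.414
    print(np.sum(v * P) * dv)                # mean speed: sqrt(8/pi) ~ 1.596
    print(np.sqrt(np.sum(v**2 * P) * dv))    # rms speed: sqrt(3) ~ 1.732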

3.9 Factorisation of partition functions


Take-home message: Where degrees of freedom are independent, the full partition
function is a product of individual partition functions.

In lectures, we repeatedly use Z_N = (Z_1)^N for independent distinguishable particles, and we also used Z_1 = Z_1^tr Z_1^rot Z_1^vib for the independent contributions of the translational, rotational and vibrational degrees of freedom to a single particle's partition function. In these notes we prove that where the energy of a system separates into independent contributions like this, the partition function factorises.
In full generality, let us suppose that a microstate has N independent contributions to its energy, the allowed values of the first being ε_1^(1), ε_2^(1), ε_3^(1), . . ., and similarly for the others, with ε_i^(n) being the ith allowed value of the nth contribution. Also, let Z^(n) be the partition function for the nth contribution:

Z^(n) = Σ_i exp(−ε_i^(n) β).   (3.55)

Then the full partition function is

Z = Σ_{i,j,k,...,p} exp(−(ε_i^(1) + ε_j^(2) + ε_k^(3) + . . . + ε_p^(N)) β)
  = Σ_{i,j,k,...,p} exp(−ε_i^(1) β) exp(−ε_j^(2) β) exp(−ε_k^(3) β) . . . exp(−ε_p^(N) β)
  = (Σ_i exp(−ε_i^(1) β)) (Σ_j exp(−ε_j^(2) β)) (Σ_k exp(−ε_k^(3) β)) . . . (Σ_p exp(−ε_p^(N) β))
  = Z^(1) Z^(2) Z^(3) . . . Z^(N).   (3.56)

It is the step between the second and third lines, in which we interchange the order of addition
and multiplication, that is tricky at first! But it is no harder than the following (in reverse):

(a+b+c)(p+q+r)(x+y+z) = apx+apy+apz+aqx+aqy+aqz+arx+ary+arz+bpx+. . .+crz


(3.57)
More compactly,

Z = Σ_{i_1,i_2,...,i_N} exp(−Σ_{n=1}^N ε_{i_n}^(n) β)
  = Σ_{i_1,i_2,...,i_N} Π_{n=1}^N exp(−ε_{i_n}^(n) β)
  = Π_{n=1}^N Σ_{i_n} exp(−ε_{i_n}^(n) β) = Π_{n=1}^N Z^(n).   (3.58)
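The interchange of summation and multiplication can be checked directly; a sketch (mine) for two independent contributions with small arbitrary spectra:

    import itertools
    import numpy as np

    eps_a = [0.0, 1.2, 2.0]   # arbitrary allowed values of contribution (1)
    eps_b = [0.3, 0.9]        # arbitrary allowed values of contribution (2)
    beta = 0.7

    Za = sum(np.exp(-beta * e) for e in eps_a)
    Zb = sum(np.exp(-beta * e) for e in eps_b)

    Z = sum(np.exp(-beta * (ea + eb))                 # full sum over joint microstates
            for ea, eb in itertools.product(eps_a, eps_b))

    print(Z, Za * Zb)   # identical, eq (3.56)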

3.10 The Equipartition Theorem


Take-home message: The classical theory of equipartition holds in the high-temperature
limit

• Mandl 7.9
• Bowley and Sánchez 5.14
• (Kittel and Kroemer 3)

The results for vibrational, rotational and translational energies demonstrate that, at high enough temperatures, the law of equipartition of energy holds: each quadratic term in the classical expression for the energy contributes ½k_B T to the average energy and ½k_B to the heat capacity. The oscillator has quadratic kinetic and potential terms:

E_vib = ½mẋ² + ½mω²x²   2 d.o.f.,   E → k_B T.   (3.59)

The rotor has two perpendicular axes about which it can rotate, each with a quadratic kinetic energy (rotations about the axis have no effect in quantum mechanics; classically the moment of inertia is tiny):

E_rot = ½Iω_1² + ½Iω_2²   2 d.o.f.,   E → k_B T.   (3.60)

The translational kinetic energy has three terms for the three dimensions of space:

E_tr = ½mẋ² + ½mẏ² + ½mż²   3 d.o.f.,   E → (3/2)k_B T.   (3.61)

Now we understand what governs “high enough”: kB T has to be much greater than the spacing
between the quantum energy levels. If this is not satisfied, the heat capacity will be reduced,
dropping to zero at low temperatures. The corresponding degree of freedom is said to be frozen
out; this is the situation for the vibrational degrees of freedom at room temperature.
Here is an idealised graph of the heat capacity of hydrogen against temperature (© P. Eyland, University of New South Wales):

As the moment of inertia for H2 is small, the temperature by which equipartition holds for
rotational modes is actually quite high. Bowley and Sánchez have a graph taken from data
(Fig. 5.8).
We can predict the specific heat of other substances based on equipartition, simply by counting the degrees of freedom. For a solid, we expect the molar heat capacity to be 3R, since each atom is free to vibrate in three directions. This is the law of Dulong and Petit, and it works well for a variety of solids at room temperature. (More details here.)
Equipartition does not hold, even at high temperatures, if the energy is not quadratic. For instance the gravitational potential energy is linear in height, and the average potential energy of a molecule in an isothermal atmosphere is k_B T, not ½k_B T.
Similarly the kinetic energy of a highly relativistic particle is given by the non-quadratic c√(p_x² + p_y² + p_z²) (= ħck), not by the quadratic (p_x² + p_y² + p_z²)/2m, and the average kinetic energy is 3k_B T, not (3/2)k_B T.

3.10.1 The heat capacity of a crystal


Take-home message: The Einstein model of independent oscillators only works
quantitatively in the high-temperature limit
Based on equipartition, we expect the molar heat capacity of a solid to be 3R, since each atom is free to vibrate in three directions. This is the law of Dulong and Petit, and it works well for a variety of solids at room temperature. It is reproduced, as might be expected, by the Einstein model of a crystal, which considers each atom linked to its neighbours by six springs (3N in total); the algebra is just like that of the vibrations of a diatomic molecule, giving

⟨E⟩ = (3/2)Nħω coth(½ħωβ)
C_V = (3/4)Nk_B (ħωβ)² sinh⁻²(½ħωβ)   (3.62)

At low temperature (β → ∞) the heat capacity tends to 3Nk_B (ħωβ)² e^{−ħωβ}. Although this tends to zero, it does not agree with the observed low temperature behaviour, which is proportional to T³. More sophisticated models, such as that of Debye, allow for collective vibrations of many atoms which have much lower frequency, and hence contribute to the internal energy and heat capacity at much lower temperatures. We will revisit this when we consider gases of bosons, not that the connection is obvious right now.
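A sketch (mine) of the Einstein heat capacity (3.62) per mole, written in terms of an assumed Einstein temperature θ_E = ħω/k_B:

    import numpy as np

    R = 8.314          # J/(K mol): N*kB for one mole
    thetaE = 300.0     # K, an illustrative Einstein temperature

    def C_V(T):
        x = thetaE / (2 * T)                  # = hbar*omega*beta/2
        return 3 * R * x**2 / np.sinh(x)**2   # eq (3.62), per mole

    for T in [30.0, 100.0, 300.0, 1000.0]:
        print(T, C_V(T))   # -> 3R ~ 24.9 (Dulong-Petit) at high T; -> 0 exponentially at low T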

3.11 The N particle partition function for indistinguishable particles
Take-home message: atoms are fundamentally indistinguishable, and this has mea-
surable macroscopic consequences

In the previous section we assumed that the average energy of N non-interacting atoms was
the same as N times the average energy of one atom, an obvious (and correct) consequence of
the law of large numbers and indeed almost the definition of “non-interacting”. It could also
be derived from ZN = (Z1 )N . To continue our study of the ideal gas, we want to calculate F ,
and from that the pressure and entropy. But recalling that
Z_1 = V g_s (mk_B T/2πħ²)^{3/2} ≡ V g_s n_Q.   (3.63)
and using F = −kB T ln ZN would give

F = −N kB T ln(V gs nQ ) warning, incorrect! (3.64)

and that doesn’t make sense. If one has double the number of particles, in double the volume,
the Helmholtz free energy, like the energy, should double. They are both extensive variables.
But the expression above is not extensive.
The solution comes from a very surprising quarter: quantum mechanics. Quantum mechanics says that atoms of the same element are fundamentally indistinguishable, exactly the same. As a result, for instance, all observables have to be unchanged when we interchange two identical atoms. We can't give them labels and know which one is in which state. But if we recall the derivation of Z_N = (Z_1)^N which held for the paramagnet, crucially all N particles were distinguishable (by their position in the lattice). This is not the case for a gas. So what is the N-particle partition function for indistinguishable particles?
Consider first the partition function for the simplest case, of two particles and two energy
levels. If the particles are distinguishable, as in the upper picture below, there are four states,
two of which have energy ε, and the two-particle partition function is

Z2 = e0 + 2e−εβ + e−2εβ = (Z1 )2 (3.65)


[Figure: the four microstates of two distinguishable particles in levels 0 and ε, and the three microstates of two indistinguishable particles]

However for indistinguishable atoms, we can't give them labels and know which one is in which state. Thus there are only three states, as in the lower picture, and the partition function is

Z_2 = e^0 + e^{−εβ} + e^{−2εβ} ≠ (Z_1)²   (3.66)
If we use (Z1 )2 , we over-count the state in which the particles are in different energy levels. In
general there is no simple expression for the N -particle partition function for indistinguishable
particles.
However we note that (Z1 )N over-counts the states in which all N particles are in different
energy levels by exactly N !. So if we are in a position where there are many more accessible
energy levels (that is, levels with energy less than a few kB T ) than there are particles, the
probability of any two particles being in the same energy level is small, and almost all states
will have all the particles in different levels. Hence a good approximation is

Z_N = (Z_1)^N/N!   (3.67)

It turns out that this is exactly what we need to fix the ideal gas.
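The counting is easy to check for the two-level example above; a sketch (mine):

    import numpy as np

    eps, beta = 1.0, 0.8
    x = np.exp(-beta * eps)
    Z1 = 1 + x

    Z2_dist = 1 + 2 * x + x**2     # eq (3.65): equals Z1**2
    Z2_indist = 1 + x + x**2       # eq (3.66)

    print(Z2_indist, Z1**2 / 2)    # NOT equal here: Z1^2/2! also halves the doubly-occupied
                                   # states, so it is only a good approximation when double
                                   # occupancy is unlikely (many levels, few particles)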

3.12 The ideal gas


Take-home message: We can now derive the equation of state and other properties
of the ideal gas.

• Mandl 7.1,7.4-6
• Bowley and Sánchez 6.5
• Kittel and Kroemer 3

We are now reaching the most important test of statistical physics: the ideal gas. For the
moment we assume it is monatomic; the extra work for a diatomic gas is minimal.
The one-particle translational partition function, at any attainable temperature, is
Z_1 = V g_s n_Q,   where n_Q ≡ (mk_B T/2πħ²)^{3/2}.   (3.68)
We already saw that assuming the atoms to be distinguishable yields a non-extensive Helmholtz free energy, and argued that we had to treat them as indistinguishable. So we want to see if we can use the approximate form from the previous section,

Z_N = (Z_1)^N/N!.   (3.69)
We can, if we can convince ourselves that it is very unlikely that any two atoms are in the same energy level. In the ideal gas, we can calculate the number of levels below, say, 2k_B T, from ∫₀^{k_max} g(k) dk with ħ²k_max²/2m = 2k_B T, giving 2.1 n_Q V. So we see that n_Q, the "quantum concentration", is a measure of the number of states available, and we can use the approximation Z_N = (Z_1)^N/N! provided N ≪ n_Q V (or n ≪ n_Q). This is the classical limit.
We also note that n_Q ≈ 1/λ³, where λ is the thermal de Broglie wavelength (the wavelength of a particle of energy of order k_B T). So the condition n ≪ n_Q is equivalent to saying that the separation of the atoms is much greater than their wavelength, exactly the condition given in last semester's quantum mechanics course for classical behaviour.
The energy and heat capacity derived from (3.69) are unchanged from those of section 3.8,
as N ! doesn’t depend on β.
For the Helmholtz free energy, using Stirling's approximation ln(N!) ≈ N ln N − N, we find

⟨F⟩ = −k_B T ln Z_N = −N k_B T (ln Z_1 − ln N + 1)
    = −N k_B T [ln(V/N) + ln(g_s n_Q) + 1]
    = N k_B T [ln(n/g_s n_Q) − 1].   (3.70)
Since nQ is composed only of constants and T , it is intensive; the number density n ≡ N/V is
the ratio of extensive quantities and so is also intensive. Hence F is clearly simply proportional
to N , and so extensive as required.
Then

P = −(∂F/∂V)_{T,N} = N k_B T/V   (3.71)
and for the entropy, the Sackur-Tetrode equation

S = −(∂F/∂T)_{V,N}
  = −N k_B [ln(n/g_s n_Q) − 1] + N k_B T (1/n_Q)(dn_Q/dT)
  = N k_B [ln(g_s n_Q/n) + 5/2]   (3.72)

Since n ≪ n_Q if the result is to be valid, S is also positive, as it should be!
The expression for P is clearly experimentally verifiable: it is the ideal gas law. That’s
good, but we expected to get that. More interestingly the Sackur-Tetrode equation for S can
also be checked. First, if we unpick the dependence on V and T , we get
S = N k_B (ln V + (3/2) ln T + const.)   (3.73)
which is in accord with the form derived from classical thermodynamics (see here). But more
importantly it predicts the absolute entropy of a gas at a certain temperature, and this can
be checked experimentally too. If we start with the solid at some very low temperature T0 , at
which the entropy can be assumed to be very small, and we know the experimental specific
heat capacity as a function of temperature and the latent heats of melting and vaporisation,
we can numerically calculate the integral
∫_{T_0}^{T} d̄Q/T = S(T) − S(T_0) ≈ S(T)   (3.74)

Good agreement is found. An example with numerical details can be found here, from Edward
J. Groth of Princeton University.
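The classic test case is neon at its normal boiling point; a sketch (mine) of the Sackur-Tetrode prediction (g_s = 1 for a spin-zero atom):

    import numpy as np

    hbar, kB, u, R = 1.054571817e-34, 1.380649e-23, 1.66053907e-27, 8.314
    m = 20.18 * u      # kg, mass of a neon atom
    T = 27.2           # K, normal boiling point of neon
    P = 101325.0       # Pa

    n = P / (kB * T)                                   # number density from PV = N*kB*T
    nQ = (m * kB * T / (2 * np.pi * hbar**2))**1.5
    S_molar = R * (np.log(nQ / n) + 2.5)               # eq (3.72) per mole

    print(S_molar)   # ~96 J/(K mol), close to the measured calorimetric entropy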
Finally, we include vibrations and rotations as well as translations: since the one-particle
energies are independent and add, ε = εtr + εrot + εvib , the partition functions multiply: Z1 =
Z1tr Z1rot Z1vib (the argument is like that for the N -particle partition function for distinguishable
particles and is given here) and so
Z_N = (Z_1^tr)^N (Z_1^rot)^N (Z_1^vib)^N/N! = Z_N^tr (Z_1^rot)^N (Z_1^vib)^N
F = F_tr + F_rot + F_vib   (3.75)
and the energy and entropy also add.
It is important to note that, assuming a truly ideal gas which never condenses or solidifies,
the Sackur-Tetrode equation is not valid for indefinitely low temperatures. It must be wrong,
because as T → 0, nQ → 0 and S → −∞. But we know that S → 0 as T → 0, because all
the particles occupy the lowest energy level. But of course that is exactly the regime in which
ZN = (Z1 )N /N ! is no longer valid.
For a gas with the density of air at STP, n ≈ 3 × 1025 m−3 . We have nQ ≈ n for T ≈ 10−2 K,
so real gases are essentially always classical.
An example of a non-classical gas is the conduction electrons in a metal; they are free
to move within the metal and can be treated as a dense gas (n ≈ 1029 m−3 ), but at room
temperature nQ ≈ 1027 m−3 . So the quantum nature of the electron (specifically the fact that
it is a fermion) becomes all important.

3.12.1 Chemical potential of ideal gas


We can find the chemical potential from the Gibbs free energy, which is given by G = E − TS + PV.
For a monatomic ideal gas, E = (3/2)N k_B T and PV = N k_B T. So the sum of these exactly cancels the constant term in the Sackur-Tetrode entropy, to give

G = −TS + (5/2)N k_B T = −N k_B T ln(g_s n_Q/n)   (3.76)
and the chemical potential, the Gibbs free energy per particle, is

µ = −k_B T ln(g_s n_Q/n)   (3.77)
So in fact, perhaps surprisingly, for an ideal gas in the classical limit, the chemical potential is
large and negative. And unlike in classical thermodynamics, we can actually calculate it if we
know the temperature and number density.⁴
⁴ Note we could also get the same result from F, (3.70), using µ = (∂F/∂N)_{T,V}.
It is important to point out that this chemical potential is relative to the lowest energy state of the system. The zero-point energy of a particle in a box is truly negligible. But in a reaction in which mass is not conserved (which to a varying degree is true of all reactions!) we should write

µ = mc² − k_B T ln(g_s n_Q/n)   (3.78)
In cosmological applications, we should note that the Sackur-Tetrode entropy, and hence the expression above, is only correct for non-relativistic particles.
Consider reactions such as the ionisation of atomic hydrogen

H ⇌ p + e⁻.

Equilibrium requires µH − µp − µe = 0, or
ln(2n_Q^(e)/n_e) + ln(2n_Q^(p)/n_p) − ln(4n_Q^(H)/n_H) + (m_H − m_p − m_e)c²β = 0

⇒ n_e n_p/n_H = (n_Q^(p) n_Q^(e)/n_Q^(H)) e^{εβ} ≈ n_Q^(e) e^{εβ}   (3.79)
We have defined ε as the (negative) energy of the bound state of the electron in hydrogen, −13.6 eV, and given the closeness in mass of p and H we have set the ratio of their quantum concentrations to 1 in the last step. The factors of 2 in the chemical potentials of the proton and electron are the usual spin degeneracies g_s = 2 for spin-½ particles. The factor of 4 for hydrogen is also a degeneracy factor: the total spin of the electron and proton can be S = 1 or S = 0, with 4 spin states (3 + 1) in all.
One can take this formula in various directions, but if one assumes the electron density and temperature are known, it gives the ionisation fraction n_p/n_H. It can be rewritten in terms of the chemical potential of the electrons (but with the conventional zero of energy, which we will here, but not in general, write as µ̃_e) as

n_p/n_H = ½ (2n_Q^(e)/n_e) e^{εβ} = ½ e^{(−µ̃_e + ε)β}.   (3.80)
Astrophysics students will have met (3.79) as the Saha equation. We will meet this problem
again in section 3.14.1.
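A sketch (mine) of (3.79) in use, with an assumed electron density, showing how sharply hydrogen ionises as the temperature rises through ~10⁴ K:

    import numpy as np

    hbar, kB, me = 1.054571817e-34, 1.380649e-23, 9.1093837e-31
    eV = 1.602176634e-19
    eps0 = -13.6 * eV       # binding energy of hydrogen (negative)
    ne = 1e20               # m^-3, an assumed electron density

    def np_over_nH(T):
        beta = 1.0 / (kB * T)
        nQe = (me * kB * T / (2 * np.pi * hbar**2))**1.5
        return (nQe / ne) * np.exp(eps0 * beta)     # from eq (3.79)

    for T in [5000.0, 10000.0, 20000.0]:
        print(T, np_over_nH(T))   # ~2e-7 at 5000 K, ~3 at 10000 K, ~3e4 at 20000 K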

3.13 Using Classical Mechanics


Take-home message: The classical or high-temperature limit of the ideal gas can
(nearly) be derived from classical mechanics.
In all of the above, we have counted microstates by considering quantum states of a system (whether 2-state spin-½ atoms, rotational and vibrational energy levels, states of a particle in a box....). But statistical physics predates QM, and so do results such as equipartition and the equation of state of an ideal gas. How were microstates originally counted?
We got an idea of the difficulty back in our very first look at the ideal gas, where we had
to confront the fact that the position of a particle in a box is, classically, continuous and hence
uncountable. The solution there was to divide the volume into tiny cells and accept that our
result would depend on the cell size, removing predictive power for absolute entropies but
effectively predicting changes.
This is close to the original formulation of the counting of microstates. We work in “phase
space”, in which the state of a system is specified by the (3D) position and momentum of each
particle, 6N independent coordinates in total. Instead of summing over states we will integrate
over phase-space, but we will introduce the concept of an elementary volume in phase-space,
so that the number of microstates in a phase-space volume (positions and momenta in some
range) is

Ω = (∫d^{3N}x ∫d^{3N}p)/h^{3N}   (3.81)
Somewhat mysteriously, we have used h3 for the elementary phase-space volume for a single
particle. Note that h has the dimensions of momentum times length.
The energy of the state will be the sum of the kinetic and potential energies; in principle that
allows for interactions between particles but we will ignore them, so that the only potential
energy comes from an external trap (Warning: V means volume but V (x) is a potential.
Sorry....) Then
Z_N = (1/h^{3N} N!) ∫d^{3N}x d^{3N}p exp(−β [Σ_{i=1}^{3N} p_i²/2m + V({x_i})])   (3.82)

The justification for (and limitations of) the 1/N ! are the same as before, and since true
indistinguishability is a quantum property, we cannot entirely get away from QM.
For particles in a box, the potential within the box vanishes and positions outside the box are inaccessible, so ∫d^{3N}x = V^N. So

Z_N = (V^N/h^{3N} N!) ∫d^{3N}p exp(−β Σ_{i=1}^{3N} p_i²/2m)   (3.83)

There are three ways to tackle this. The simplest is to see it as a product of Gaussians in the p_i:

Z_N = (V^N/h^{3N} N!) Π_{i=1}^{3N} ∫_{−∞}^{∞} dp_i exp(−p_i² β/2m) = (V^N/h^{3N} N!) (2πmk_B T)^{3N/2} = (Z_1)^N/N!   (3.84)

where Z_1 = V n_Q, as obtained previously.


This wouldn't work for relativistic particles, for which the energy doesn't factorise into a sum of functions of the three components of momentum. So we can instead, for each particle, write d³p as 4πp² dp, then express the 3N-dimensional integral as a product of N terms, here:

Z_N = (1/N!) [(4πV/h³) ∫₀^∞ dp p² exp(−p² β/2m)]^N = (Z_1)^N/N!   (3.85)

This approach is just like the k-space one we used above, with p = hk/2π and g(k) dk = g(p) dp giving g(p) = 4πV p²/h³.
The final approach, which however is only valid in the same case as the first, where each p_i² is summed for all N particles and all 3 directions, is to consider k-space (or rather p-space) in
3N dimensions, using the expression for the surface area of a hypersphere. I will leave it as an
exercise for the mathematically-minded reader to check the same result is obtained. (See also
the lecture notes of the previous lecturer.)
So we have recovered the same result as before, and all our results for the ideal gas, including the Helmholtz free energy and Sackur-Tetrode entropy (3.72), are recovered, with all the same restrictions on validity (N/V ≪ n_Q). But I have pulled the wool over your eyes. They only agree if the arbitrary parameter h I introduced in the elementary volume of phase space is actually Planck's constant! Any other choice would correctly predict changes in S and F, but not absolute values.
On the problem sheet you will be asked to show that the partition function for a single
harmonic oscillator is also reproduced in the high-temperature limit by this classical phase
space approach. In this case, because there is a potential, the integral d3N ~x has to be done
explicitly and does not simply yield a power of the volume. Since the potential is quadratic
in each xi , though, we just get more Gaussian integrals, and the result is the same as in the
high-temperature limit of the QM approach, (3.36): in one dimension
Z_1 = 1/(ħωβ)   and   E = k_B T.   (3.86)
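For the oscillator the phase-space integral is two Gaussian integrals; a sketch (mine) doing them numerically on a grid, in units where m = ω = ħ = 1 (so h = 2π):

    import numpy as np

    m, omega, beta = 1.0, 1.0, 2.0
    h = 2 * np.pi                      # hbar = 1 in these units

    q = np.linspace(-20, 20, 40001)    # position grid (integrand negligible beyond this)
    p = np.linspace(-20, 20, 40001)    # momentum grid
    dq, dp = q[1] - q[0], p[1] - p[0]

    Iq = np.sum(np.exp(-beta * 0.5 * m * omega**2 * q**2)) * dq
    Ip = np.sum(np.exp(-beta * p**2 / (2 * m))) * dp

    print(Iq * Ip / h, 1.0 / beta)     # Z1 = 1/(hbar*omega*beta), eq (3.86)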
Finally, if you are interested, the concept of the hypersphere can be used to tackle the ideal
gas in the microcanonical ensemble. The problem there, if you recall, is that the energy of the
whole system is constrained to be fixed, so the particles, while not interacting, cannot simply be
treated as independent as they can if they are in contact with a heat bath. In calculating Ω, the
integral d3N p~ has to be carried out with a δ function on the total kinetic energy: effectively this
restricts |~p| to the surface of the hypersphere. Using the surprising result that, for very large
N , essentially all the volume of an N -dimensional hypersphere resides at the surface (because
of the factor rN −1 in the volume element), one integrates instead over all states with energies
up to energy E, and the result for S = kB ln Ω is indeed just the Sackur Tetrode one (as a
function of E = 3/2N kB T rather than T ). The previous lecturer’s notes (section 3.6) give more
detail, but I simply note the result as further proof that for a large system, the microcanonical
and canonical approaches give the same result.

3.14 Systems with variable particle number — The Gibbs distribution
• Mandl 11.1
• Bowley and Sánchez 9.7-9
• Kittel and Kroemer 5

Take-home message: The Boltzmann distribution is modified if the particle number can change.
First we tackled isolated systems, then we considered systems in contact with a heat bath
at temperature T . Now we consider systems which are also in diffusive contact with a particle
reservoir at chemical potential µ. In this case the Boltzmann distribution is modified and is
called the Gibbs distribution.
This is useful in itself. However just as using the Boltzmann distribution freed us from
the constraint that the total energy of all the particles had to add up to a given total energy,
and allowed us to consider each particle independently, so using the Gibbs distribution frees us
from the constraint that the total numbers of particles in each energy level has to add up to a
fixed total, and allows us to treat each energy level independently. Once again we will use the
fact that fluctuations in a macroscopic system are negligible to draw conclusions for isolated
systems as well.
The temperature is a measure of the decrease in entropy of the reservoir from giving up
heat to the system; the chemical potential is a measure of the energy decrease (and entropy
increase) of the reservoir from giving up particles to the system (see (2.15)).
We want to find the probability that our system, in contact with the reservoir, will be in a
certain microstate i with an energy εi and particle number Ni . As usual, recall that neither εi
nor Ni will be small for a typical system.
The derivation follows that of the Boltzmann distribution closely. Again the probability
of the system being in the given microstate depends on the number of microstates available
to the reservoir with energy E0 − εi and particle number N0 − Ni . Expressing the number of
microstates as the exponential of the entropy, making a Taylor expansion of the entropy about
S_R(E_0, N_0), and expressing the derivatives of the entropy in terms of T and µ thus,

(∂S_R/∂E)_{V,N} = 1/T,   (∂S_R/∂N)_{E,V} = −µ/T,   (3.87)

gives

p_i = e^{(µN_i − ε_i)/k_B T}/Z   (3.88)

with

Z = Σ_j e^{(µN_j − ε_j)/k_B T}   (3.89)

The new normalisation constant Z is called the grand partition function. Macroscopic
functions of state are calculated via ensemble averages as usual; the relevant ensemble in this
case is called the grand canonical ensemble.
The following properties are easily proved by analogy with the corresponding ones for the
Boltzmann distribution (see here and here):

⟨N⟩ = k_B T (∂ ln Z/∂µ)_β
⟨E⟩ = −(∂ ln Z/∂β)_µ + µ⟨N⟩
⟨S⟩ = −k_B Σ_i p_i ln p_i = (⟨E − µN⟩ + k_B T ln Z)/T

⇒ Φ_G ≡ −k_B T ln Z = ⟨E − TS − µN⟩.   (3.90)

The quantity −kB T ln Z is a new thermodynamic potential called the grand potential: Mandl
gives it the unfortunate symbol Ω but we will use ΦG like Bowley and Sánchez. (They use Ξ–
“Xi”–for Z.) From the fundamental thermodynamic relation we get

dΦG = −SdT − P dV − N dµ (3.91)

and hence

S = −(∂Φ_G/∂T)_{V,µ},   P = −(∂Φ_G/∂V)_{T,µ},   N = −(∂Φ_G/∂µ)_{T,V}.   (3.92)
    system                      isolated                 in contact with heat bath    heat and particle bath
    fixed                       E, N, V (or B)           T, N, V (or B)               T, µ, V (or B)
    key microscopic function    no. of microstates Ω     partition function Z         grand partition function Z
    key macroscopic function    S = k_B ln Ω             F = −k_B T ln Z              Φ_G = −k_B T ln Z

So whereas in an isolated system the entropy is the key (compare here) and in a system at
constant temperature it is the Helmholtz free energy (compare here), here the grand potential
is the key to the other functions of state.
The natural variables of the grand potential are T, V and µ. But of these, T and µ are inten-
sive. Like any thermodynamic potential ΦG itself is extensive, so it must be simply proportional
to V , the only extensive one: ΦG = V φG (T, µ).
But

φ_G(T, µ) = (∂Φ_G/∂V)_{T,µ} = −P
⇒ Φ_G = −PV.   (3.93)

This explains why it is not greatly used in thermodynamics. But the fact that ΦG is so
simple doesn’t lessen its formal utility in statistical mechanics.
The grand potential in the case that more than one species is present is
    \Phi_G \equiv E - TS - \sum_i \mu_i N_i \qquad\text{so}\qquad d\Phi_G = -S\,dT - P\,dV - \sum_i N_i\,d\mu_i.   (3.94)

We can use this to prove that µi is the Gibbs free energy per particle of species i, as claimed
previously in (1.23):
    G = E - TS + PV = E - TS - \left(E - TS - \sum_i \mu_i N_i\right) = \sum_i \mu_i N_i.   (3.95)

3.14.1 Two examples of the Gibbs Distribution


Take-home message: There is an important distinction between systems in which
energy levels can only have one particle in them, and those where they can have
many.
Example 1: Sites which bind a single molecule
We first consider sites on a surface which can bind a single molecule only; the energy of the
empty site is 0 and that of the occupied site is ε0 (which can have either sign, but is negative
for binding). This example is common in biology, where receptor molecules can be occupied or
unoccupied.
The surface is in contact with a gas or a solution with chemical potential µ (the energy
drop of the solution when it loses a molecule). What is the grand partition function, and the
average occupancy of a site?
There are only two microstates here: unoccupied, with N = 0 and ε = 0, and occupied,
with N = 1 and ε = ε0 , so there are only two terms in the grand partition function:

    \mathcal{Z} = e^0 + e^{(\mu-\varepsilon_0)\beta} = 1 + e^{(\mu-\varepsilon_0)\beta}   (3.96)


Then
    \langle N\rangle = -\left(\frac{\partial(-k_B T \ln \mathcal{Z})}{\partial \mu}\right)_\beta = \frac{1}{e^{(\varepsilon_0-\mu)\beta} + 1}.   (3.97)
Below we plot the average occupancy as a function of ε0 , the energy of the level in question.
[Figure: average occupancy ⟨N⟩ vs ε0, for low and high T; the curves pass through ⟨N⟩ = 0.5 at ε0 = µ and sharpen towards a step as T falls.]
We see that ⟨N⟩ is always less than 1, as it must be. If a level lies above the chemical potential,
ε0 > µ then it is less likely to be occupied, since it is energetically more favourable for the
molecule to remain in solution. Conversely if ε0 < µ then it is more likely to be occupied,
since that is the energetically favourable configuration. As always, it is the temperature which
determines the likelihood of the less favourable configuration obtaining. At zero temperature,
the distribution becomes a step function, with hN i = 1 if ε0 < µ and hN i = 0 if ε0 > µ.
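A quick numerical sketch (mine, not from the notes; it uses units in which kB T = 1) makes the two routes to ⟨N⟩ explicit: the direct ensemble average over the two microstates, and the thermodynamic route ⟨N⟩ = −∂ΦG/∂µ of (3.92), here taken by a finite difference:

    import math

    def grand_Z(mu, eps0):
        # Two microstates: empty (N = 0, E = 0) and occupied (N = 1, E = eps0).
        # Units with beta = 1/(kB T) = 1.
        return 1.0 + math.exp(mu - eps0)

    def N_direct(mu, eps0):
        Z = grand_Z(mu, eps0)
        p_occ = math.exp(mu - eps0) / Z      # p_i = e^{(mu N_i - eps_i)beta}/Z
        return 0 * (1.0 / Z) + 1 * p_occ     # sum_i p_i N_i

    def N_from_potential(mu, eps0, h=1e-6):
        # Phi_G = -kB T ln Z; <N> = -dPhi_G/dmu, by central difference
        phi = lambda m: -math.log(grand_Z(m, eps0))
        return -(phi(mu + h) - phi(mu - h)) / (2 * h)

    mu, eps0 = 0.3, 1.0
    print(N_direct(mu, eps0))          # 0.3318... = 1/(e^{(eps0-mu)} + 1), Eq. (3.97)
    print(N_from_potential(mu, eps0))  # agrees to numerical accuracy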

Example 2: sites which bind many molecules


This is less realistic, but we imagine a site which can have any number of molecules occupying
it, with energy ε0 per molecule. There are then infinitely many terms in the grand partition
function which form a geometric series:
    \mathcal{Z} = e^0 + e^{(\mu-\varepsilon_0)\beta} + e^{2(\mu-\varepsilon_0)\beta} + \dots = \frac{1}{1 - e^{(\mu-\varepsilon_0)\beta}}
    \langle N\rangle = \frac{1}{e^{(\varepsilon_0-\mu)\beta} - 1}.   (3.98)
Below we plot the average occupancy as a function of ε0, the energy of the level in question.
[Figure: average occupancy ⟨N⟩ vs ε0, for low and high T; ⟨N⟩ diverges as ε0 approaches µ from above.]
Unlike the first example, there is no limit to ⟨N⟩. Thus it doesn't make sense to consider states
with ε0 < µ, as their occupancy will be infinite. (The formula above for hN i is no longer valid
in that case.) For ε0 close to µ the occupancy will be high, and it falls off as ε0 increases. The
rapidity of the drop depends on temperature; for T = 0 only a level with ε0 = µ would have
non-zero occupancy.
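A companion sketch (again mine, with β = 1) checks (3.98) by summing the geometric series explicitly; the weight e^{(µ−ε0)β} must be less than 1, i.e. ε0 > µ, for the sums to converge:

    import math

    def check(mu, eps0, nmax=2000):
        w = math.exp(mu - eps0)                  # weight per added molecule, beta = 1
        Z_sum = sum(w**n for n in range(nmax))   # truncated sum over occupancies
        N_sum = sum(n * w**n for n in range(nmax)) / Z_sum
        Z_closed = 1.0 / (1.0 - w)               # geometric series, Eq. (3.98)
        N_closed = 1.0 / (math.exp(eps0 - mu) - 1.0)
        print(f"Z: {Z_sum:.6f} vs {Z_closed:.6f}; <N>: {N_sum:.6f} vs {N_closed:.6f}")

    check(mu=0.0, eps0=0.5)   # <N> = 1/(e^0.5 - 1) = 1.5415...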
Chapter 4

Quantum Gases

4.1 Bosons and fermions


Take-home message: All particles in nature are either bosons or fermions. Their
statistical properties are very different: no two fermions can be in the same state,
but there is no such restriction on bosons.

• Mandl 9.2
• Bowley and Sánchez 10.2

Bosons are particles with integer spin:


spin 0: 1 H and 4 He in ground state, pion, Higgs boson
spin 1: 1 H and 4 He in first excited state, ρ meson, photon, W and Z bosons, gluons
spin 2: 16 O in ground state, graviton.
Fermions are particles with half-integer spin:
spin 1/2: 3He in ground state, proton, neutron, quark, electron, neutrino
spin 3/2: 5He in ground state, ∆ baryons (excitations of the proton and neutron)

First a note on notation. When we talk about the “spin” of a composite particle we mean
its total angular momentum, and we usually use the symbol J rather than S. Last semester you
met J as the (quantised, vector) sum of orbital and spin angular momentum for the electron,
but more generally it is the (quantised, vector) sum of all the spins and all the orbital angular
momenta of all the constituents.
The basic building blocks of atoms are all fermions, while the force carriers (photon, gluon,
W,Z) and the Higgs are bosons. The rules of addition of angular momentum mean that two
spin-1/2 particles with no orbital angular momentum can either have total angular momentum
J = 0 or 1; if we add a third, we have J = 1/2 or 3/2; a fourth gives J = 0, 1 or 2, and so on.
Adding in orbital angular momentum, which is integer, gives more possibilities, but J is still
always half-integer for an odd number of spin-1/2 particles and integer for an even number. So
composite particles (nuclei, atoms, molecules) made of an odd number of protons, neutrons
and electrons are also fermions, whereas those made of an even number are bosons. Note that
a particle is either a fermion or a boson. Excitations of composite particles (nuclei, atoms) can
change the spin only by an integer amount and so don't change its nature.
Fermions obey the Pauli exclusion principle: no more than one fermion can occupy a
single quantum state. (The value of the spin quantum number ms is part of the description of
the state; if that is ignored then two spin-1/2 or four spin-3/2 particles can occupy the same spatial

state.) This is the basis of atomic structure and the periodic table, and it explains the properties
of metals and of white dwarfs and neutron stars.
The “grown-up” version of the Pauli exclusion principle is that the overall wave function
of a system of identical fermions must be antisymmetric under exchange of any pair. For two
spin-1/2 particles in the same spatial state (say the 1s state of helium) the overall wave function
must be
    \Psi(r_1, r_2, m_1, m_2) = \tfrac{1}{\sqrt{2}}\,\phi_{1s}(r_1)\phi_{1s}(r_2)\,(\uparrow\downarrow - \downarrow\uparrow)   (4.1)
The spatial part is symmetric but the spin part is antisymmetric. (This corresponds to overall
spin 0.) If we try to construct a state with three particles in the same spatial state we can't do
it: there is no state (↑↓↑ − ...) which changes sign when we interchange any pair of particles.
So the Pauli exclusion principle follows from the requirement for antisymmetry.
It is possible to have a 2-fermion spin-state such as ↑↑ (spin-1) but then the particles have
to be in different spatial states, eg
 
    \Psi(r_1, r_2, m_1, m_2) = \tfrac{1}{\sqrt{2}}\left(\phi_{1s}(r_1)\phi_{2s}(r_2) - \phi_{2s}(r_1)\phi_{1s}(r_2)\right)\uparrow\uparrow   (4.2)

There is no exclusion property for identical bosons, but there is a restriction on their wave
function: it must be symmetric under exchange of any pair. So for spinless bosons

    \Psi(r_1, r_2, r_3, \dots) = \phi(r_1)\phi(r_2)\phi(r_3)\dots   (4.3)

is a perfectly acceptable wave function and there is no Pauli exclusion principle. For spin-1
bosons we have to ensure that the overall space-spin wave function is symmetric, but the details
are not important here.
So bosons are free to (indeed, other things being equal, “prefer” to) crowd into the same
quantum state. This explains the spectrum of black-body radiation and the operation of lasers,
the properties of liquid 4 He and superconductors.
The need for the wave function to be either symmetric or antisymmetric for identical parti-
cles stems from the very meaning of “identical”: nothing observable can change if we swap the
particles. In particular

|Ψ(r1 , r2 , m1 , m2 )|2 = |Ψ(r2 , r1 , m2 , m1 )|2 (4.4)

This implies that


    \Psi(r_1, r_2, m_1, m_2) = e^{i\alpha}\,\Psi(r_2, r_1, m_2, m_1)   (4.5)
for some real phase α. But two swaps have to return us to the same state without any phase
change at all, so 2α = 2nπ. Only α = π and α = 2π are possible.

4.2 The ideal gas of bosons or fermions: beyond the classical approximation
Take-home message: Where the number of available states approaches the number
of particles in the system, the properties of the gas will depend on whether multiple
occupancy is allowed (bosons) or not (fermions).

• Mandl 11.2,11.4,11.5
• Bowley and Sánchez 10.2-3
When we derived the properties of the ideal gas previously, using the classical approximation
for the partition function (Z1)^N/N!, our results were only valid if the number of available single-
particle levels greatly exceeded the number of particles in the gas (nQ ≫ n). This was because
we knew that we were not treating states with more than one particle in them correctly. Now we
know that if the gas particles are fermions, that isn’t even possible, so we need a new approach.
What we do is lift the restriction that the number of particles in the gas is fixed, and use the
Gibbs distribution instead of Boltzmann.
In a previous section we looked at the grand partition function for a single state which could
accept either only one, or many, particles. These will be our starting points for the consideration
of gases of fermions or bosons in the regime in which we cannot ignore the possibility of multiple
occupancy of states (bosons) or restrictions in the available states because they are already
occupied (fermions).
We then find that rather than focus on a single particle in the gas, it is easier to focus on
what is happening in a single energy level. Then we can write, using r to label the energy level,
not the particle
    \mathcal{Z} = \mathcal{Z}_1 \mathcal{Z}_2 \mathcal{Z}_3 \dots = \prod_r \mathcal{Z}_r \qquad\text{where}\qquad \mathcal{Z}_r = 1 + e^{(\mu-\varepsilon_r)\beta} + e^{2(\mu-\varepsilon_r)\beta} + \dots   (4.6)
(For fermions the sum in \mathcal{Z}_r is restricted to the first two terms.)


Perhaps this step is not obvious. Each term in each Zr corresponds to 0, 1, 2, 3... particles
in that state. So Z is then a sum of many terms, each one with a particular occupancy of
each level (n1 , n2 , n3 ) ≡ {ni }. These are exactly the microstates of the whole system when N
is not fixed. The argument is closely analogous to that for the factorisation of the (ordinary)
partition function for independent particles, section 3.9.
Note that we are ignoring conventional interactions, in that we are summing over the single-
particle states which by definition are calculated without considering interactions between par-
ticles. But it wouldn't be quite correct to say that we are treating the particles as independent,
because their boson or fermion nature will have an important influence on the multi-particle
system.
Since the log of a product is the sum of logs of the individual terms, the grand potential
ΦG, the energy, the particle number and the entropy all consist of sums of the contributions
from each level: \langle N\rangle = \sum_r \langle N_r\rangle, \langle E\rangle = \sum_r \langle N_r\rangle\,\varepsilon_r, etc.
We are going to introduce a new notation here. To reflect the fact that the ⟨Nr⟩, the occupancies
of individual levels, are microscopic, we are going to use small letters, and because the ⟨...⟩
notation is clumsy we will use an overline, also commonly used for averages: so ⟨Nr⟩ ≡ n̄r and
⟨N⟩ = Σr n̄r etc. This is the same notation as Mandl, and as Dr Galla. Dr Xian in PHYS20252
uses f. There is potential for confusion with the particle density (which Dr Galla called ρ!)
but I hope forewarned is forearmed.
Furthermore we have already found the single-level grand partition functions \mathcal{Z}_r and the
average occupancies n̄r: for fermions, which obey the Pauli exclusion principle,
    \mathcal{Z}_r = 1 + e^{(\mu-\varepsilon_r)\beta} \qquad \bar{n}_r = \frac{1}{e^{(\varepsilon_r-\mu)\beta} + 1}   (4.7)
and for bosons, which don't,
    \mathcal{Z}_r = \frac{1}{1 - e^{(\mu-\varepsilon_r)\beta}} \qquad \bar{n}_r = \frac{1}{e^{(\varepsilon_r-\mu)\beta} - 1}.   (4.8)
(see section 3.14.1).
For a gas the sum over discrete energy levels is replaced by an integral over the wave number
k, weighted by the density of states.

    \langle N\rangle = \int_0^\infty g(k)\,\bar{n}(k)\,dk \qquad \langle E\rangle = \int_0^\infty g(k)\,\varepsilon(k)\,\bar{n}(k)\,dk   (4.9)

where ε(k) = ħ²k²/2m for non-relativistic particles and
    \bar{n}(k) = \frac{1}{e^{(\varepsilon(k)-\mu)\beta} \pm 1} \qquad \begin{cases} + & \text{fermions} \\ - & \text{bosons} \end{cases}   (4.10)

Note that for bosons, µ must be less than the energy of the lowest level (zero for most purposes)
but for fermions µ can be (and often will be) greater than 0.
The Gibbs distribution assumes variable particle number and constant chemical potential
(as well as variable energy and constant temperature). However we know that for a large system
fluctuations are small, and the results will be essentially the same as for the more difficult problem
of a gas with fixed particle number. To cast it in this form, we use the expressions of (4.9) to
find the value of µ which gives the desired N. Then we can also find the average energy per
particle. This is conceptually simple, but not usually possible analytically except in certain
limits. In the next subsection we will recover the classical ideal gas.

4.2.1 The classical approximation again


Take-home message: The Gibbs distribution gives an alternative way of treating
the ideal gas in the classical limit.
Here we rederive our previous results for the ideal gas regarding the particle number as
variable also, fixing instead the chemical potential.
From the previous sections on the Gibbs distribution and the ideal gas of bosons or fermions,
we have

    \Phi_G \equiv -k_B T \ln \mathcal{Z}
          = -k_B T \sum_r \ln \mathcal{Z}_r \qquad\text{using } \mathcal{Z} = \prod_r \mathcal{Z}_r
          = \mp k_B T \sum_r \ln\left(1 \pm e^{(\mu-\varepsilon_r)\beta}\right)   (4.11)

where r labels the single particle energy levels, and the signs are for fermions and bosons
respectively.
Now imagine that e^{µβ} ≪ 1, which requires µ to be large and negative. Never mind for a
moment what that means physically. Then, using ln(1 + x) ≈ x for small x, we get
    \Phi_G = -k_B T\, e^{\mu\beta} \sum_r e^{-\varepsilon_r\beta} = -k_B T\, e^{\mu\beta}\, Z_1(T)   (4.12)
where Z1 is the one-particle translational partition function (not grand p.f.) for an atom in an
ideal gas. As we calculated previously, Z1(T) = V gs nQ(T).
From ΦG we can find the average particle number:
    N = -\left(\frac{\partial \Phi_G}{\partial \mu}\right)_{T,V} = e^{\mu\beta}\, Z_1   (4.13)

and solving for µ we get
    \mu = -k_B T \ln(Z_1/N) = -k_B T \ln\left(\frac{g_s n_Q}{n}\right).   (4.14)
So now we see that µ large and negative requires n ≪ nQ, or far fewer particles than states:
exactly the classical limit as defined before. We obtained this result previously, see (3.77).
Finally, since ΦG = E − TS − µN = F − µN, we have
    F = \Phi_G + \mu N
      = -N k_B T - N k_B T \ln(Z_1/N)
      = -N k_B T \left(\ln(Z_1/N) + 1\right)   (4.15)
However this is exactly what we get from F = −kB T ln ZN with ZN = (Z1)^N/N!. Thus we
recover all our previous results.
We can also look at the occupancy n̄(ε): for large negative µ,
    \bar{n}(\varepsilon) = \frac{1}{e^{(\varepsilon-\mu)\beta} \pm 1} \approx e^{\mu\beta}\, e^{-\varepsilon\beta} = N\,\frac{e^{-\varepsilon\beta}}{Z_1}   (4.16)
which is the Boltzmann distribution as expected. The three distributions, Bose-Einstein (or-
ange), Boltzmann (blue) and Fermi-Dirac (green) are plotted as a function of (ε − µ)β below:
[Figure: the three occupancies plotted against β(ε − µ) from −1 to 3; the B-E curve lies above the Boltzmann curve, which lies above the F-D curve, and all three merge as β(ε − µ) grows.]
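For concrete numbers behind that plot, a small sketch (mine): the only difference between the three occupancies is the +1, 0 or −1 in the denominator.

    import math

    def occupancy(y, a):
        # y = beta*(eps - mu); a = +1 (Fermi-Dirac), 0 (Boltzmann), -1 (Bose-Einstein)
        return 1.0 / (math.exp(y) + a)

    for y in [0.5, 1.0, 2.0, 3.0]:
        print(f"y = {y:3.1f}: F-D = {occupancy(y, 1):.4f}, "
              f"Boltzmann = {occupancy(y, 0):.4f}, B-E = {occupancy(y, -1):.4f}")
    # For y of a few and above, all three agree: the sparse-occupancy classical limit.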

It is reassuring that we can recover the classical ideal gas predictions of course. We can also
look at the first corrections in an expansion in the density (n/nQ ). For a classical gas this is
called a virial expansion; for a van der Waals gas, for instance,
    P V = N k_B T \left(1 + \left(b - \frac{a}{RT}\right)\frac{N}{V} + \dots\right)
where b is the excluded volume due to finite molecular size and a arises from attractive inter-
actions.
But even for an ideal (zero electrostatic interactions, point-like) Bose or Fermi gas such
terms appear. An example on the problem sheets asks you to show that, for fermions and
bosons respectively,
    P V = N k_B T \left(1 \pm \frac{n}{4\sqrt{2}\,g_s n_Q} + \dots\right).   (4.17)
So for fermions the pressure is larger than expected, consistent with a reluctance to occupy
the same space (like a non-zero b). This isn't too surprising. More surprising perhaps is the
fact that for bosons the pressure is smaller, as if there were attractive interactions. Whereas
fermions "like" to keep apart, bosons are gregarious!

4.3 The ideal Fermi Gas


4.3.1 Electrons in a metal
Take-home message: The properties of Fermi gases such as electrons in metals and
neutron stars are dramatically different from ideal classical gases.

• Mandl 11.5
• Bowley and Sánchez 10.4.2

[Figure: Fermi-Dirac occupancy ⟨N⟩ vs ε: a step at ε = µ, smeared over a width of order kB T.]
We have already seen that for electrons in a metal, the number of states with energies of
order kB T is much less than the number of electrons to be accommodated. Because electrons
are fermions, they can't occupy the same levels, so levels up to an energy far above kB T will
need to be filled. The occupancy is given by (4.10) for fermions,
    \bar{n}(\varepsilon) = \frac{1}{e^{(\varepsilon-\mu)\beta} + 1}   (4.18)
which is plotted above.
At zero temperature, it is clear what the ground state—the state of lowest possible energy—
of a fermion gas must be. All energy levels will be occupied (singly-occupied if we regard the
spin as part of the specification of the state) up to a maximum, and all higher levels will be
unoccupied. The occupation function n(ε) becomes a step function—one up to a certain value
of ε and zero thereafter. Do our results bear this out?
Considering n̄(ε), we see that the limit T → 0, β → ∞ needs to be taken rather carefully.
Clearly it will depend on the sign of ε − µ. If ε < µ then the argument of the exponential is
very large and negative and the exponential itself can be ignored in the denominator, simply
giving n̄(ε) = 1. But if ε > µ, the argument of the exponential is very large and positive and
the "+1" can be ignored in the denominator, so that \bar{n}(\varepsilon) \to e^{-(\varepsilon-\mu)\beta} \to 0. So
    \bar{n}(\varepsilon) \to \begin{cases} 1 & \text{for } \varepsilon < \mu \\ 0 & \text{for } \varepsilon > \mu \end{cases}   (4.19)

So in fact µ is the energy of the highest occupied state at zero temperature. This is also known
as the Fermi energy, εF , and indeed the two terms, “chemical potential” and “Fermi energy”
are used interchangeably for a Fermi gas.1 The value of k corresponding to this energy, kF ,
is referred to as the “Fermi momentum” (albeit that should really be h̄kF ). A gas like this
where only lowest levels are occupied is called a degenerate gas: this is a different usage from
“degenerate” to mean “equal energy”. The filled levels are called the Fermi sea and the top of
the sea is called the Fermi surface.
So what is the value of the Fermi energy at zero temperature? It is simply fixed by N ,
which we set equal to hN i:
    N = \int_0^\infty g(k)\,\bar{n}(k)\,dk = \int_0^{k_F} g(k)\,dk = \frac{g_s V}{2\pi^2}\int_0^{k_F} k^2\,dk = \frac{g_s V}{2\pi^2}\,\frac{k_F^3}{3}   (4.20)
so, with gs = 2 and εF = ħ²kF²/2m for non-relativistic electrons, and n = N/V,
    k_F = (3\pi^2 n)^{1/3} \qquad\text{and}\qquad \varepsilon_F = \frac{\hbar^2 (3\pi^2 n)^{2/3}}{2m}   (4.21)
The energy is given by
    E = \int_0^\infty g(k)\,\bar{n}(k)\,\varepsilon(k)\,dk = \frac{\hbar^2}{2m}\int_0^{k_F} k^2 g(k)\,dk = \frac{g_s V \hbar^2}{4\pi^2 m}\,\frac{k_F^5}{5} = \tfrac{3}{5} N \varepsilon_F   (4.22)

Note all these depend, as we expect, only on N/V and not on V alone. εF is intensive, and
E ∝ N . For copper, εF = 7 eV.
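As a check on that copper figure, a sketch (mine; it assumes the standard textbook value n ≈ 8.5 × 10²⁸ m⁻³, one conduction electron per atom) evaluating (4.21):

    import math

    hbar = 1.054571817e-34   # J s
    m_e  = 9.1093837e-31     # kg
    eV   = 1.602176634e-19   # J

    n  = 8.5e28                           # electron density of copper, m^-3 (assumed)
    kF = (3 * math.pi**2 * n) ** (1 / 3)  # Fermi wave number, Eq. (4.21)
    eF = (hbar * kF) ** 2 / (2 * m_e)     # Fermi energy
    print(f"kF = {kF:.2e} m^-1, eF = {eF / eV:.2f} eV")   # ~1.4e10 m^-1, ~7.0 eV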
Note too that though we have written our integrals over wave number k, we can equally
switch to ε as the variable, using the density of states in energy from (3.51), e.g.
    N = \int_0^\infty g(\varepsilon)\,\bar{n}(\varepsilon)\,d\varepsilon = \frac{g_s V (2m)^{3/2}}{4\pi^2 \hbar^3}\int_0^{\varepsilon_F} \varepsilon^{1/2}\,d\varepsilon = \frac{g_s V}{6\pi^2}\left(\frac{2m\varepsilon_F}{\hbar^2}\right)^{3/2}   (4.23)

which is the same as before. Generally I prefer only to remember g(k) and to work out g(ε) if
required. After all g(k) only depends on the dimension of space, usually 3D in practice, while
g(ε) differs depending on whether the particles are relativistic or not.2
¹ Actually sometimes the "Fermi energy" is used exclusively for the zero-temperature chemical potential,
the highest filled energy, and "Fermi level" is used for the energy with an occupancy of 0.5, which is ε = µ.
We will use Fermi energy at or near zero temperature, and chemical potential where the occupancy deviates
substantially from a step function.
² Note the warning about the difference in notation for g(ε) between Dr Xian and myself contained at the
end of section 3.8.1. He divides out the volume V.
All of the above is at zero temperature. At finite temperatures, the picture will change—
but not by as much as you might expect. Thermal excitations can only affect levels within a
few kB T of εF . But at room temperature, kB T = 0.025 eV. For kB T to equal εF would need
T ∼ 80, 000 K, a temperature at which the metal would have vaporised. (TF = εF /kB is called
the Fermi temperature, but in no sense is it a real temperature, it’s just a way of expressing
the Fermi energy in other units.) We can see from (4.18) that n(k) will still essentially be 0 or
1 unless |ε − εF | is of the order of kB T . This is shown in the figure at the top of the section.
As the temperature rises, the Fermi energy doesn’t remain constant, though it doesn’t
change much initially. Again we find it from requiring the electron density to be correct,
    \frac{N}{V} = \frac{1}{V}\int_0^\infty g(k)\,\bar{n}(k)\,dk = \frac{g_s}{2\pi^2}\int_0^\infty \frac{k^2}{e^{(\hbar^2k^2/2m-\mu)\beta}+1}\,dk = \frac{g_s}{4\pi^2}\left(\frac{2m}{\hbar^2\beta}\right)^{3/2}\int_0^\infty \frac{x^{1/2}}{z\,e^x + 1}\,dx   (4.24)

where z = e−µβ and we have made the change of variable to the dimensionless x ≡ ε(k)β:
    x = \frac{\hbar^2\beta}{2m}\,k^2, \qquad k = \left(\frac{2m}{\hbar^2\beta}\right)^{1/2} x^{1/2}, \qquad dk = \frac{1}{2}\left(\frac{2m}{\hbar^2\beta}\right)^{1/2} x^{-1/2}\,dx   (4.25)
Rearranging, and setting gs = 2, we get
    F(z) \equiv \int_0^\infty \frac{x^{1/2}}{z\,e^x + 1}\,dx = 2\pi^2 n \left(\frac{\hbar^2\beta}{2m}\right)^{3/2} = \frac{\sqrt{\pi}}{2}\,\frac{n}{2 n_Q}.   (4.26)

The integrand is shown, as a function of energy at fixed temperature, on the left for three
positive values of µ and on the right for µ = 0 and two negative values of µ; the area under the
curve is proportional to N/V . In each plot blue, orange and green are in decreasing order of
µ; the horizontal and vertical scales on the left are much larger than on the right.
[Figure: g(ε)n̄(ε) vs ε at fixed temperature; left, three positive values of µ; right, µ = 0 and two negative values.]

The area under the curves, hence the function F(z), can be obtained by numerical integration.
Then z, and hence µ, can be chosen to obtain the correct particle number density.³
Note that n/nQ is greater than 1 for copper at room temperature. F(z) becomes large
as z → 0, µ ≫ kB T, which fits this situation. Conversely for z ≫ 1, corresponding to large
negative µ, the integral tends to √π/(2z) and we recover the classical limit as in the last section,
(4.14).
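A sketch of that numerical procedure (mine, using SciPy; the upper limit 60 stands in for infinity, and the target follows from (4.26) with gs = 2):

    import numpy as np
    from scipy.integrate import quad
    from scipy.optimize import brentq

    def F(z):
        # F(z) of Eq. (4.26); 60 is effectively infinity for this integrand
        return quad(lambda x: np.sqrt(x) / (z * np.exp(x) + 1.0), 0, 60.0)[0]

    def mu_over_kT(n_over_nQ):
        target = np.sqrt(np.pi) / 4 * n_over_nQ          # sqrt(pi)/2 * n/(2 nQ)
        z = brentq(lambda z: F(z) - target, 1e-12, 1e12)  # z = e^{-mu beta}
        return -np.log(z)

    for r in [0.01, 1.0, 100.0]:
        print(f"n/nQ = {r:6.2f}: mu/kBT = {mu_over_kT(r):+.2f}")
    # Dilute (n << nQ): mu large and negative, the classical limit;
    # dense (n >> nQ): mu >> kB T, the degenerate limit.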
Below is plotted (in blue, labelled “F-D”) the ratio of the chemical potential to the zero-
temperature Fermi energy, as a function of temperature in units of the Fermi temperature. Also
shown (in orange, labelled “B”) is the classical approximation (4.14) (ignore for the moment
the green curve labelled “B-E”):
³ Why have I used z as the variable? Well, it is more common to define z = e^{µβ}; it even has a name, the
fugacity. My variable is the reciprocal of this, but I thought that writing z⁻¹ everywhere would be ugly....
F(z) can be expressed in terms of so-called polylogarithms, but the details will not concern us.
[Figure: µ/εF vs T/TF: the Fermi-Dirac curve (F-D) falls from 1 and goes negative for T of order TF; the classical approximation (B) is large and negative at low T; an inset extending to T/TF = 7 shows the two converging. The B-E curve is discussed in section 4.4.3.]

The number density n doesn’t appear because it has been eliminated in terms of εF . The
classical approximation slowly approaches the full expression as T → ∞, as can be seen in the
inset panel.4
The fact that thermal fluctuations affect only a small fraction of all the electrons has a
number of consequences. For instance the electronic heat capacity is much less than the (3/2)NkB
predicted by equipartition. Thermal excitations can only affect states with energies of the order
of kB T below the Fermi surface, roughly a fraction kB T/εF of the total, and their excess energy
is about kB T. So the extra energy above the zero-temperature value of (3/5)NεF is of order
N(kB T)²/εF and the electronic heat capacity is of order
    C_V \propto N k_B\, \frac{T}{T_F}.   (4.27)
A more careful calculation gives the constant of proportionality to be π 2 /2. This linear rise with
temperature can be seen at very low temperatures; at higher temperatures the contribution of
lattice vibrations dominates, and we will explore this later.

[Figure: electronic heat capacity data from Lien and Phillips, Phys. Rev. 133 (1964) A1370]

The figure above shows the molar heat capacity as a function of temperature for potassium.5
What is actually plotted is C/T against T 2 , so the straight-line implies

C = (2.08 mJK−2 )T + (2.57 mJK−4 )T 3 . (4.28)


⁴ For a non-relativistic gas in 2D, the relation between µ and N can be solved analytically, and a similar-
looking graph is obtained.
⁵ The bottom line shows the range 0 < T² < 0.3, with the scale on the bottom of the figure; the upper line
shows the continuation of the same function for 0.3 < T² < 1.8, with the scale on the top.
The prediction of the model above would give the first number as 1.7 mJK−2 for potassium
with a Fermi temperature of 2.4 × 104 K, which is good to 20% — not bad considering how
simple the model is.
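That comparison is a one-liner (my sketch; per mole of atoms with one conduction electron each, so NA kB = R):

    import math

    R   = 8.314       # J K^-1 mol^-1
    T_F = 2.4e4       # Fermi temperature of potassium, K (value quoted above)

    gamma = (math.pi**2 / 2) * R / T_F           # CV = gamma*T, from (4.27)
    print(f"gamma = {gamma * 1e3:.2f} mJ K^-2")  # 1.71, vs the measured 2.08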
Similar considerations allow predictions to be made about the thermal conductivity; nor-
malising by the electrical conductivity to cancel the dependence on the collision time gives the
Wiedemann-Franz law:
κ
∝T (4.29)
σ
These will be covered in much more detail in PHYS20252.
Most of the resistance to compression of metals is due to the fact that, if the volume is
reduced, the energy of all the single-electron states increases, increasing the electronic internal
energy. This same effect turns out to stabilise white dwarf stars against gravitational collapse,
as we will now see.
A very different substance whose behaviour is pretty well described as a degenerate Fermi
"gas" is liquid 3He at sub-Kelvin temperatures, and indeed the heat capacity is nicely linear.
However below 2 mK the behaviour changes and is better described by a bosonic liquid ex-
hibiting superfluidity! The picture is that the fermion atoms pair up to form bosons, which can
condense—as we will see later.

4.3.2 White dwarf stars


In the formation of a main sequence star, a cloud of hydrogen collapses till it is dense enough for
nuclear fusion to occur; equilibrium is obtained when the outward pressure of the hot plasma
balances the gravitational attraction. Energy is radiated but fusion generates more. When the
hydrogen fuel is mostly used up, however, this equilibrium is lost. Further nuclear processes can
also occur, raising the central temperature and causing the star to become a red giant; however
when all such processes are exhausted the cooling star will start to collapse. For a star of the
size of the sun, the collapse continues until the increasing density and decreasing temperature
cause the electrons to cease to be a classical gas, being better described by the Fermi-Dirac
distribution. Eventually, though still very hot ("white hot"), the density is such that µ ≫ kB T
and the electron gas is well described by a “zero-temperature” degenerate electron gas, with
the gravitational attraction balanced by the degeneracy pressure.
The pressure of a degenerate Fermi gas can be obtained from the grand potential; since at
zero temperature ΦG = E − µN = E − NεF, from (4.22) we have
    P V = -\Phi_G = -\left(\tfrac{3}{5} - 1\right) N\varepsilon_F = \tfrac{2}{5} N\varepsilon_F \qquad\Rightarrow\qquad P = \tfrac{2}{5}\, n\varepsilon_F = \frac{\hbar^2 (3\pi^2)^{2/3}}{5 m_e}\, n^{5/3}.   (4.30)

Note in passing that P V = (2/3)E, as for a classical ideal gas! This holds even at finite
temperature, as you will be asked to show on the problem sheets. The origin of both the
internal energy and the pressure is quite different though.
We will assume that the composition of the star is constant, so that there is a constant
ratio between the electron density and the matter density, ρ ∝ n. For hydrogen the constant of
proportionality would just be mH , whereas for heavier nuclei it will be around 2 amu or slightly
more (roughly, one proton and one neutron per electron). It turns out that, apart from the
total mass, the composition is the only thing that distinguishes one white dwarf from another.
Lighter stars are mostly carbon and oxygen, heavier ones have heavier elements up to iron.
Within a star the density, Fermi energy and pressure will vary with radius. For equi-
librium, the pressure difference across a spherical shell at r has to exactly balance the weight:
    4\pi r^2 \left(P(r) - P(r+dr)\right) = \frac{G\rho(r)M(r)}{r^2}\,4\pi r^2\,dr
    \Rightarrow \frac{dP}{dr} = -\frac{G\rho(r)}{r^2}\int_0^r 4\pi r'^2 \rho(r')\,dr'   (4.31)

Writing P(r) = Cρ(r)^{5/3} allows us to turn this into a second-order non-linear differential
equation for the density ρ(r). And with some manipulation that we won't go into here (Google
"polytrope" if you are interested), it can be shown that the solution can be cast in terms of
the average density ρ̄ and a universal dimensionless function of the scaled variable r/R, where R
is the radius of the star:
    \rho(r) = \bar{\rho}\, f(r/R)   (4.32)
Without going into the details of that function, we can none-the-less derive an interesting
consequence: a mass-radius relation for white dwarf stars. We return to Eq. (4.31) and integrate
to obtain the pressure difference from the centre to the surface r = R at which P (R) = 0:
    -\int_0^R \frac{dP}{dr}\,dr = P(0) = \int_0^R \frac{G\rho(r)M(r)}{r^2}\,dr   (4.33)

Then using (4.32), which also implies a universal function g for the mass M (r) (which is
obviously related to f but we don’t need the details):
    M(r) = M\, g(r/R), \qquad\text{where}\quad \tfrac{4}{3}\pi R^3 \bar{\rho} = M;   (4.34)

and looking back to (4.30) we see that the central pressure is proportional to \bar{\rho}^{5/3}, giving
    \bar{\rho}^{5/3} \propto G M \bar{\rho} \int_0^R \frac{f(r/R)\,g(r/R)}{r^2}\,dr = \frac{G M \bar{\rho}}{R}\int_0^1 \frac{f(x)\,g(x)}{x^2}\,dx   (4.35)

The integral is universal and dimensionless, i.e. just a number which is the same for all white
dwarfs of a similar composition. Then we cancel one power of ρ̄ from either side to get
    (M/R^3)^{2/3} \propto M/R \qquad\Rightarrow\qquad M R^3 = \text{constant}   (4.36)
This relationship is pretty well satisfied by white dwarf stars of masses less than one solar mass
(see table below). Note that it implies that the more massive the star, the smaller the radius.
For 1 solar mass, the radius is about that of the earth!
It looks like there would be no upper limit on the mass, but we have assumed that the
electrons are non-relativistic, and as the Fermi energy approaches me c2 this is no longer valid.
In the highly relativistic regime the pressure is proportional to n4/3 (see problem sheets), and
doesn’t grow fast enough as the star shrinks to stabilise it (check it in (4.36): the radius
cancels). So there is an upper bound to the mass of a white dwarf of about 1.4 solar masses—
the Chandrasekhar limit. This involves a pleasing combination of microscopic and macroscopic
parameters with a numerical factor which is of order 1:
    M \sim \frac{(\hbar c/G)^{3/2}}{(2m_p)^2}
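Putting numbers into that combination (a sketch, mine):

    hbar  = 1.054571817e-34   # J s
    c     = 2.99792458e8      # m s^-1
    G     = 6.67430e-11       # m^3 kg^-1 s^-2
    m_p   = 1.67262192e-27    # kg
    M_sun = 1.989e30          # kg

    M = (hbar * c / G) ** 1.5 / (2 * m_p) ** 2
    # ~0.46 solar masses; the order-1 numerical factor from the full
    # calculation brings this up to the 1.4 of the Chandrasekhar limit.
    print(f"M ~ {M / M_sun:.2f} solar masses")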
Above that an even more compact object sustained by the degeneracy pressure of neutrons
can form, but that again has an upper mass limit of about 2 solar masses (and radius of about
10 km). Beyond that, only black holes are possible...

[Table: white dwarf masses and radii, from J Forshaw, PHYS30151]

4.4 The ideal Bose Gas


4.4.1 Photons and Black-body radiation
Take-home message: Black-body radiation is an example of a Bose gas

• Mandl 10.3-5
• Bowley and Sánchez 8.5

Classically, black-body or cavity radiation would arise from considering the modes (or stand-
ing waves) of a conducting cavity. These are solutions to the wave equation for the EM fields
subject to suitable boundary conditions at the walls, as discussed last semester. As the fields
are vector rather than scalar, the picture is a little more complicated than the solutions to the
Schrödinger equation (3.42) but the end result is essentially the same: modes of a cuboidal
cavity of sides Lx, Ly and Lz are characterised by the three discrete wave-vector components
k = (nx π/Lx, ny π/Ly, nz π/Lz) with integer ni. Furthermore for each mode the restriction
k · E = 0 allows for two polarisation states for each k.⁶ But classically there is a big difference
for EM fields: these are the modes, which are discrete, but the amplitudes of the fields of any
mode can take any value. We might expect that in thermal equilibrium the energy in each
mode would be kB T (the energy density is quadratic in E and B, so two degrees of freedom).
Though the modes are discrete it is an excellent approximation for a macroscopic box to replace
the sum with an integral, to give the energy per unit volume in the field for a frequency range
ω → ω + dω, where ω = ck:

    u(\omega)\,d\omega = \frac{g(\omega)}{V}\,k_B T\,d\omega \qquad\Rightarrow\qquad u(\omega) = \frac{k_B T}{\pi^2 c^3}\,\omega^2\ ?   (4.37)
⁶ A nice introduction to cavity modes is given here, but of course it is not examinable.
Experimentally this matches the low-frequency (long-wavelength) spectrum of black-body radi-
ation well and is called the Rayleigh-Jeans law. But clearly it cannot hold for indefinitely large
frequency, as it increases without bound: integrating over all frequencies, this would predict
that the energy in the EM field of a cavity is infinite, and even the coolest black body would
be more than white hot! And indeed, experimentally, deviations are seen at higher frequency:
the observed spectrum reaches a maximum and then falls away exponentially. The failure of
the Rayleigh-Jeans law has been termed the “ultraviolet catastrophe”.7
The correct radiation formula was found in 1900 by Max Planck who, in what he described as
“an act of desperation”, proposed that the amplitude of the fields was not arbitrary, but that the
energy in any given mode was quantised, that is it had to be an integer multiple of an elementary
energy which is proportional to the frequency—in modern terminology, E = nhf = nh̄ω.
This was the original introduction of h, Planck’s constant. Now the single quantum harmonic
oscillator is a problem we solved long back, finding (ignoring the zero-point energy)

    \langle E\rangle = \frac{\hbar\omega}{e^{\hbar\omega\beta} - 1}.   (4.38)
(See (3.37), subtracting \tfrac{1}{2}\hbar\omega.)
Including the density of states to account for the degeneracy at a given ω gives the energy
per unit volume per unit frequency:
    u(\omega) = \frac{g(\omega)}{V}\,\frac{\hbar\omega}{e^{\hbar\omega\beta} - 1} = \frac{\hbar}{\pi^2 c^3}\,\frac{\omega^3}{e^{\hbar\omega\beta} - 1}   (4.39)

For low frequencies, ħω ≪ kB T, the denominator can be approximated by ħωβ and we recover
the Rayleigh-Jeans result (4.37). This is exactly analogous to recovering equipartition for an
oscillator. But for high frequencies, the exponential in the denominator dominates and u falls
off as ω³e^{−ħωβ}, curing the ultraviolet catastrophe—and matching experiment nicely with an
appropriately chosen value of Planck's constant.⁸

[Figure: left, u(ω) for the Rayleigh-Jeans (R-J) and Planck laws at one temperature; right, the Planck distribution at three different temperatures.]

⁷ It should be noted that the generally accepted radiation law before Planck was the Wien law u ∼ ω³e^{−bω/T},
which in fact is correct at high frequencies. The Rayleigh-Jeans law, though from the start clearly not general,
was observed to do better at low energies. The Planck formula we are about to discuss interpolates between
the two. As a further interesting note, it was Einstein in 1905 who derived the law together with all the
corresponding factors from equipartition, and fully articulated the conflict between this and Planck's law.
⁸ Even better, since u depends on ħ and kB separately, he could determine both—the latter was not actually
known at that point. The gas constant R was, so that immediately gave Avogadro's number NA. And the
product NA e (the Faraday constant) was also known from electrolysis, so he found e too, all to within a few
percent of their currently accepted values, and well before any accurate determination from any other method.
Not bad for someone who originally rejected the atomic hypothesis...
Not bad for someone who originally rejected the atomic hypothesis...
The plot above shows, on the left, the Rayleigh-Jeans distribution and the Planck dis-
tribution for a given temperature, and on the right, the Planck distribution for three differ-
ent temperatures. The maximum of the Planck distribution—the frequency with the highest
intensity—is at ω = 2.82kB T /h̄, a relation which is called the Wien displacement law. The ob-
servation of scaling of the maximum with temperature predates Planck, and is obtained from
du/dω = 0 with the numerical solution of 3(1 − e−x ) = x at x = 2.8214. The sun’s surface
temperature of 5778 K means that its spectrum peaks in the visible range and its light pretty
much defines white. Betelgeuse at 3500 K peaks in the near infrared and appears red, while
Sirius at 10000 K peaks in the UV and appears blue.
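The Wien constant 2.8214 is easily reproduced (a sketch, mine): as stated above, du/dω = 0 rearranges to the fixed-point form x = 3(1 − e⁻ˣ), which converges rapidly from any positive start.

    import math

    x = 3.0
    for _ in range(50):
        x = 3.0 * (1.0 - math.exp(-x))   # fixed-point iteration for 3(1 - e^-x) = x
    print(f"x = {x:.4f}")                # 2.8214

    hbar, kB = 1.054571817e-34, 1.380649e-23
    T = 5778.0                           # K, solar surface temperature
    print(f"peak omega = {x * kB * T / hbar:.2e} rad/s")   # ~2.1e15 rad/s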
At this point you might be wondering why we waited till now to cover this application; we
could have done it after the ideal gas in section 3. True. But look again at (4.39). In the context
of the current section, it should look very familiar: in fact it is just what we expect from an
ultrarelativistic Bose gas with chemical potential µ = 0. And indeed we can so interpret it: we
can switch from the classical picture of cavity modes of the EM fields with the energy arbitrarily
quantised, to a picture of a gas of massless photons of energy h̄ω. The two polarisation states
translate to gs = 2 (unexpected for a spin-1 particle but things work a bit differently when they
are massless). Photons, being spin-1, are bosons, so there can be many in any given state. And
so the number of photons for a frequency range ω → ω + dω, where ω = ck and ε = h̄ω, is

hN (ε)i dε = g(ε)n(ε)dε

and the energy density is (see the end of section 3.8.1 and the problem sheets for more on
changing variables):

hE(ε)i
u(ω)dω ≡ dε
V
g(ε) gs ε2 1
=ε n(ε) dε = ε 2 dε
V 2π (h̄c)3 eεβ − 1
h̄ ω3
= 2 3 h̄ωβ dω (4.40)
π c e −1
in agreement with Planck.
Why is the chemical potential for photons zero? The walls of the cavity act as a heat bath
but not in any meaningful sense a particle reservoir: photons don't exist in the walls, only
energy does. The chemical potential is given by µ = −T(∂S_walls/∂N)_E = 0. This situation occurs
wherever particle number is not a conserved quantity.
We should now clarify the relation between cavity and blackbody radiation. Perfect black-
body radiation is obtained as the emission from a small hole in a cavity, and the relation
between the two is that the flux F (ω) of the emitted radiation, that is, the power emitted per
unit emitting area, per unit frequency, is related to the energy density in the box by
    F(\omega) = \frac{c}{4}\,u(\omega).   (4.41)
The factor of 4 is the same one that enters in the formula for effusion of gas from a small hole. If we
integrate over frequency, we get the total flux from an area A of
    L = \frac{A}{4\pi^2 c^2}\int_0^\infty \frac{\hbar\omega^3}{e^{\hbar\omega\beta}-1}\,d\omega = \frac{A (k_B T)^4}{4\pi^2\hbar^3 c^2}\int_0^\infty \frac{x^3}{e^x-1}\,dx = \frac{\pi^2 k_B^4}{60\hbar^3 c^2}\,A T^4   (4.42)
The x-integral has the exact value of π⁴/15.⁹
But this is exactly Stefan's law, L = AσT⁴, except that we have now predicted the value of
the Stefan-Boltzmann constant σ in terms of Planck's and Boltzmann's constants:
    \sigma = \frac{\pi^2 k_B^4}{60\hbar^3 c^2}.   (4.43)
And indeed it does (of course) match with the empirical value.
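Checking (4.43) against the accepted value takes three lines (a sketch, mine):

    import math

    hbar, kB, c = 1.054571817e-34, 1.380649e-23, 2.99792458e8
    sigma = math.pi**2 * kB**4 / (60 * hbar**3 * c**2)
    print(f"sigma = {sigma:.4e} W m^-2 K^-4")   # 5.670e-08, the empirical value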
Stars are not perfect black bodies by any means. The most perfect black body we know is
the cosmic microwave background. If you are familiar with cosmology and the results of recent
experiments such as WMAP and Planck, you may have a picture like the one below on the
right in your head:

But wonderfully informative though that is, it actually shows the angular scale of deviations,
of the order of a few parts in 10⁵, from a nearly perfect Planck spectrum at a temperature
of 2.728 ± 0.004 K. The figure on the left shows the actual spectrum obtained by the FIRAS
instrument on the COBE satellite. In the original paper10 there are no error bars shown because
they are “a small fraction of the line thickness”; those in the figure are scaled up by 200 times
so they can be seen.
Returning to photons as a Bose gas, we can derive some more properties of radiation. The
total energy is¹¹
    \langle E\rangle = \frac{V\hbar}{\pi^2 c^3}\int_0^\infty \frac{\omega^3}{e^{\hbar\omega\beta}-1}\,d\omega = \frac{V (k_B T)^4}{\pi^2(\hbar c)^3}\int_0^\infty \frac{x^3}{e^x-1}\,dx = \frac{\pi^2 k_B^4}{15(\hbar c)^3}\,V T^4 = \frac{4\sigma}{c}\,V T^4   (4.44)

The grand potential for an ultrarelativistic gas is (see problem sheet 8)
    \Phi_G = -\tfrac{1}{3}\langle E\rangle,
so, since P V = −ΦG,
    P = \frac{4\sigma}{3c}\,T^4 \qquad\text{and}\qquad S = -\frac{\partial \Phi_G}{\partial T} = \frac{16\sigma}{3c}\,V T^3.   (4.45)
These turn out to be of great importance in cosmology.
In passing we note that for µ = 0, ΦG = F and E = F + T S = T S − P V .
⁹ The integral \int_0^\infty x^3 (z^{-1}e^x - 1)^{-1}\,dx = 6\sum_{n=1}^\infty z^n/n^4, so for z = 1 as here we have the sum of the reciprocals of
the 4th powers of the integers. Those of you doing Complex Variables will have learned a method to sum such
series, using the residues of the function coth(πz)/z⁴.
¹⁰ Fixsen et al., Astrophysical Journal 473 (1996) 576; the figure as shown is widely available on the web,
attributed to E. L. Wright of UCLA. The right-hand figure is Planck data from ESA.
¹¹ Many texts use a for 4σ/c. Dr Galla used σ! Don't confuse σ the Stefan-Boltzmann constant with the
number density per unit area which often appears on the problem sheets....
4.4.2 Phonons and the Debye model
• Mandl 6.3
• Bowley and Sánchez 8.7

The switch in perspective from modes of the EM fields in a cavity, to particles—photons—is


rather a profound one. Familiarity has perhaps obscured that; most physicists I think would say
that photons are “real”. As well as the Planck distribution, processes such as the photoelectric
effect and Compton scattering amply attest to the particle nature of light. A similar switch
from fields to particles is made for all particles in quantum field theory, in that the theory is
formulated in terms of classical-looking fields and then the particles are treated as quantised
excitations of these fields. An example you might have heard of is the Higgs: the background
field couples to all matter particles giving them mass, but excitations of the field are the short-
lived Higgs bosons produced at the LHC. (This is not examinable!)
Condensed matter (solid state) physicists deal with complicated systems such as crystals
with many excitation modes. Where the energy in these modes is quantised, they also treat
the excitations as particles; phonons (from lattice vibrations) are the best known, but magnons
(collective excitations of the electrons’ spin), plasmons, rotons (excitations of superfluid 4 He)
and several more are also discussed. They are known as quasiparticles; they may not be
considered “real” but they do exhibit some particle-like behaviours, notably in scattering light
and other particles.
In this course we are only, briefly, going to look at phonons.
In the earlier parts of the course we have repeatedly mentioned Einstein's model of a
crystal, in which every atom vibrates independently with a single frequency ωE (which can be
fit to data). We found (ignoring zero-point energy)
    \langle E\rangle = \frac{3N\hbar\omega_E}{e^{\hbar\omega_E\beta} - 1} \qquad C_V = 3N k_B \left(\frac{\hbar\omega_E}{k_B T}\right)^2 \frac{e^{\hbar\omega_E\beta}}{(e^{\hbar\omega_E\beta} - 1)^2}   (4.46)

The figure below is reproduced from Einstein’s paper Ann. Phys. 22, 180, (1907), with the
frequency parameter expressed as a temperature, h̄ωE /kB = 1320 K.

This gives the correct high-temperature limit of the internal energy and heat capacity, but it
can be seen to deviate at low temperatures, and indeed is clearly fundamentally flawed. If one
atom is displaced, it pulls on its neighbours which will also be displaced, and so on. We are
reminded of coupled pendulums; you can set one swinging, but the other will start to swing
too. To analyse the subsequent motion we can use the normal modes of the whole system,
which in this case involve the two pendulums swinging together, either in phase or 180◦ out
of phase; these modes have different frequencies with the in-phase mode being lower that the
out-of-phase one. For N pendulums there will be N modes, the lowest-frequency one having all
swinging together and the highest frequency one having each out of phase with its neighbour,
but with another N − 2 modes of intermediate frequency. Note that the modes (and hence
frequencies) are discrete because the number of pendulums is finite. This is reminiscent of the
modes of the EM field in a cavity, which were discrete because the volume was finite, and the
allowed wave numbers were multiples of π/L.
For a 1D chain of identical atoms of mass m, with Ks being the effective spring constant
between each pair (actually the curvature at the minimum of the interatomic potential), there
are two kinds of modes, longitudinal and transverse. For both, the modes are characterised
by the distance λ over which the pattern of atomic displacements repeats. The length of the chain
sets the maximum λ,¹² and the distance a between the atoms sets the minimum; in terms
of wave number, k = nπ/L for n = 1, 2, ..., L/a. The frequency for longitudinal modes was
shown in first year to be ω(k) = 2ω₀ sin(ka/2) where ω₀ = √(Ks/m), which at low frequency is
ω = kaω₀, a linear dispersion relation reminiscent of ω = ck for photons, with vs = aω₀ being
the speed of sound in the chain. In a simple model, transverse modes have a lower frequency
than longitudinal modes of the same k; there are two of these for vibrations in the two planes
perpendicular to the direction of the chain.
Introducing quantum mechanics, the energy in each mode will be restricted to multiples
of h̄ω (again ignoring unobservable zero-point energy), and we see that we have a picture of
massless (quasi-)particles corresponding to the excitations in each mode. And either from the
quantum oscillator approach of Section 3.7, or by treating the phonons as a bosonic gas with
zero chemical potential (same argument as for photons, phonons aren’t conserved), we get the
number of phonons in a mode as
    \bar{n}(k) = \frac{1}{e^{\hbar\omega(k)\beta} - 1}   (4.47)
Even for 3D monatomic crystals with a cubic structure the possible modes of vibration might
seem overwhelming. But in fact we can, again, characterise the modes by the wave vector in 3D,
k = (nx π/Lx , ny π/Ly , nz π/Lz ) with integer ni , this time though with a maximum ni = Li /a.
In 3D it is planes of atoms which move together, and k is parallel to the normal to the plane.
The total number of modes is three (for the three polarisations) times the number of ways of choosing
nx, ny, nz with each ni in the allowed range, which is just the product of the three n_i^max. So
the number of modes is 3LxLyLz/a³ = 3V/a³ = 3N, where N is the number of atoms. That
is gratifying, because it is, as it must be, the same as if we treated each atom as vibrating
independently in 3D.
This also means that if we are in the high-temperature regime, in which the energy in every
mode is kB T , we will recover the law of Dulong and Petit, E = 3N kB T and CV = 3N kB , as in
the Einstein model.
To find the internal energy and heat capacity due to lattice vibrations at lower temperatures,
we would like to use an approach like that for photons, integrating the quantity of interest
weighted by g(k)n(k), albeit with a cut-off in k. There are some issues with this. Strictly
speaking the cut-off on k is in cartesian coordinates, and furthermore the frequency is not a
simple function of |k|, independent of direction. In addition the longitudinal and transverse
¹² The existence of a maximum λ of course depends on the boundary condition. If the end atoms are fixed,
then λmax = L/2. In fact the modes have the end atoms with maximum displacement (cosines rather than
sines) and again λmax = N a/2 = L/2. If periodic boundary conditions are applied, as is common, λmax = L
but there are two modes for each λ and the net effect is the same.
modes do not have the same dispersion relation (wave speed).13 The Debye model ignores these
subtleties, and further assumes a linear dispersion relation ω = vs k. It imposes a cut-off kD on
|k| such that the number of modes is correct:
    3N = \int_0^{k_D} g(k)\,dk = \frac{3V}{2\pi^2}\,\frac{k_D^3}{3} \qquad\Rightarrow\qquad k_D = (6\pi^2 n)^{1/3}   (4.48)

Then the energy is
    \langle E\rangle = \int_0^{k_D} \varepsilon(k)\,g(k)\,\bar{n}(k)\,dk = \frac{3V}{2\pi^2}\,\hbar v_s \int_0^{k_D} \frac{k^3}{e^{\hbar v_s k\beta} - 1}\,dk   (4.49)

At very high temperatures, β → 0, n̄(k) → kB T/ħkvs and we recover ⟨E⟩ = 3N kB T. This is
only a consequence of equipartition for each mode and having ensured the correct number of
modes; it does not constitute an independent check of the model.
More generally though, defining TD = h̄vs kD /kB (often ΘD is used instead):

    \langle E\rangle = \frac{3V}{2\pi^2}\,\hbar v_s \int_0^{k_D} \frac{k^3}{e^{\hbar v_s k\beta} - 1}\,dk = \frac{3V}{2\pi^2}\,\hbar v_s \left(\frac{k_B T}{\hbar v_s}\right)^4 \int_0^{T_D/T} \frac{x^3}{e^x - 1}\,dx
    \Rightarrow C_V = -\frac{3V}{2\pi^2}\,\hbar v_s\,\frac{d\beta}{dT}\int_0^{k_D} \frac{k^3\,\hbar v_s k\, e^{\hbar v_s k\beta}}{(e^{\hbar v_s k\beta} - 1)^2}\,dk = \frac{3V}{2\pi^2}\,k_B \left(\frac{k_B T}{\hbar v_s}\right)^3 \int_0^{T_D/T} \frac{x^4 e^x}{(e^x - 1)^2}\,dx   (4.50)

For very low temperatures the upper limit on the x integrals in (4.50) becomes large and, as
the integrand falls off exponentially at large x, we can take the cut-off to infinity. Then the
integral in ⟨E⟩ is the same as we met in the context of the black-body spectrum and equals
π⁴/15. So
    \langle E\rangle \to \frac{V\pi^2}{10}\,\frac{(k_B T)^4}{(\hbar v_s)^3}
    \Rightarrow C_V = \frac{2\pi^2}{5}\,k_B V \left(\frac{k_B T}{\hbar v_s}\right)^3 = \frac{12\pi^4}{5}\,k_B N \left(\frac{T}{T_D}\right)^3   (4.51)
Note that in the first form, the dependence on N has disappeared, which does make
sense: only long-wavelength, low-frequency modes are contributing; the energy per unit volume
only depends on vs, a bulk property which is insensitive to the atomic substructure. For that
reason we expect the result to be reasonably robust, even if some of the model assumptions
are somewhat suspect. And so we recover the CV ∝ T³ contribution from lattice vibrations
to the specific heat which we saw in the data for potassium in Eq. (4.28). Though TD can
be predicted, it can also be fit to data, as in the figure below on the left, where it can be
shown that the different heat capacity curves for a number of metals collapse to a single curve
if plotted as a function of T/TD. (Note: the numbers given for the Debye temperatures are just
fit parameters, and other sources using different data will not give identical numbers—see for
example Mandl's fig. 6.7.)
¹³ That particular point can be circumvented by using an average speed defined by 3/vs³ = 2/vt³ + 1/vl³, where
vl and vt are the speeds of sound for transverse and longitudinal waves respectively; in the expression for the heat
capacity (4.50) that is equivalent to calculating the contributions from the two types of modes independently.
[Figure: left, CV/NkB vs T/TD, with data for several metals collapsing onto the single Debye curve; right, the Debye and Einstein predictions compared, with an inset showing the low-temperature region.]

The difference between the Debye and Einstein predictions is shown above on the right. Since
they both have one free parameter, I have chosen ωE = 0.75ωD to get the best agreement.
Clearly experiment is not going to tell the difference between them except at low temperatures
(shown in the inset panel for clarity).
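The Debye curve above is straightforward to generate (a sketch, mine): eliminating V from (4.50) using (4.48) gives CV/(N kB) = 9 (T/TD)³ ∫₀^{TD/T} x⁴eˣ/(eˣ−1)² dx, which interpolates between the T³ law and Dulong-Petit.

    import numpy as np
    from scipy.integrate import quad

    def cv_debye(t):
        # t = T/T_D; returns C_V/(N k_B) in the Debye model
        integrand = lambda x: x**4 * np.exp(x) / np.expm1(x)**2
        return 9 * t**3 * quad(integrand, 0, 1.0 / t)[0]

    for t in [0.05, 0.1, 0.2, 0.5, 1.0]:
        print(f"T/TD = {t:4.2f}: CV/NkB = {cv_debye(t):.3f}")
    # Falls as (12 pi^4/5)(T/TD)^3 at low T and tends to 3 (Dulong-Petit) at high T.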
In reality, monatomic simple cubic crystals are not common. However the Debye model
works pretty well for more complicated cases too (none of the metals in the figure above are
simple cubic). Just for the record, we make a few comments. The density of states in k space,
though always derived for a cuboidal box, is independent of the shape of the box. The basic idea
that vibrational modes are countable and that a system of N atoms will have 3N distinct
modes is always valid. The cut-off or maximum value of k is the edge of the Brillouin zone, the
set of vectors that cannot be reduced in length by subtracting a reciprocal lattice vector (just
as in 1D, a standing wave with wave number 3π/4a cannot be distinguished from one of π/4a
because the displacements of the atoms are identical in both). For most structures the Brillouin
zone is a polyhedron which is closer to a sphere than the cube is, so that is actually an advantage
for the Debye model. Finally for a structure with more than one atom per unit cell (as for
any non-monatomic crystal) it is usual to choose the Debye cut-off to reproduce the number of
unit cells in the crystal, not the number of atoms. There will be two types of vibration. Three
for each k are like the ones considered above for which ω ∼ k at low frequencies, which are
termed acoustic modes and which give the dominant low-temperature heat capacity. The rest
are those in which the different types of atom vibrate against each other, termed optical modes
(since if the atoms are charged the oscillating dipoles will interact with EM radiation); their
frequencies do not go to zero as k → 0. If a crude model is to be used, the constant-frequency
Einstein model is more appropriate for these; the correct high-temperature heat capacity will
then be recovered. None of these details of real crystals are examinable in this course.

4.4.3 Bose-Einstein condensation


• Mandl 11.6
• Bowley and Sánchez 10.5

Let us return to matter, and bosons such as 4 He and 87 Rb, though ignoring any interactions
between them (an approximation the validity of which will, broadly, depend on the density—
but it should be said at the outset that there are some qualitative changes when interactions
are present).
Now for bosons, the occupancy is
    \bar{n}(\varepsilon) = \frac{1}{e^{(\varepsilon-\mu)\beta} - 1}   (4.52)
and as we have seen, this does not make sense for ε < µ. So for a Bose gas the chemical
potential must be less than the lowest energy level (often taken for convenience to be 0). For a
sufficiently warm dilute gas µ will be large and negative and we will be in the classical regime,
but as n/nQ grows, µ will increase (ie become less negative), initially just as in the Fermi case.
And as in that case, in practice we find the chemical potential by requiring the density to be
correct: for non-relativistic particles in 3D we have
    \frac{N}{V} = \frac{1}{V}\int_0^\infty g(k)\,\bar{n}(k)\,dk = \frac{g_s}{2\pi^2}\int_0^\infty \frac{k^2}{e^{(\hbar^2k^2/2m-\mu)\beta} - 1}\,dk = \frac{g_s}{4\pi^2}\left(\frac{2m}{\hbar^2\beta}\right)^{3/2}\int_0^\infty \frac{x^{1/2}}{z\,e^x - 1}\,dx   (4.53)

where as before z = e^{−µβ} and we have made the change of variable
    x = \frac{\hbar^2\beta}{2m}\,k^2, \qquad k = \left(\frac{2m}{\hbar^2\beta}\right)^{1/2} x^{1/2}, \qquad dk = \frac{1}{2}\left(\frac{2m}{\hbar^2\beta}\right)^{1/2} x^{-1/2}\,dx   (4.54)
Just as in the finite-temperature Fermi gas, the integral, which can be done numerically, is only
a function of z, which can therefore be adjusted to obtain the correct density:
    G(z) \equiv \int_0^\infty \frac{x^{1/2}}{z\,e^x - 1}\,dx = \frac{4\pi^2}{g_s}\,n\left(\frac{\hbar^2}{2mk_B T}\right)^{3/2} = \frac{\sqrt{\pi}}{2}\,\frac{n}{g_s n_Q}.   (4.55)

Here, in contrast to the fermion case, z > 1 always. (Always remember that nQ ∝ T 3/2 .)
As required, the integral grows as z decreases. But it reaches a maximum at z = 1 (µ = 0)
and so there would seem to be a maximum possible value of n/nQ —for a given density, a
minimum temperature! We will return to this, of course. But remaining below this limit on
n/nQ , we obtain the relation between the chemical potential and the number density in the
same way as for fermions; the results are shown in green in the second figure of section 4.3.1.
(The use of units of εF to eliminate explicit density dependence in that figure is purely a matter
of convenience of course, it has no physical meaning for bosons.) The average energy, entropy,
pressure and other properties are similarly calculable numerically in terms of integrals like that
of (4.55), once the chemical potential is fixed.
Below on the left we show plots of g(k)n̄(k) at fixed temperature for a variety of small
(negative) chemical potentials. (We switch to k rather than ε because the low-k curve is better
behaved.)¹⁴ The area under the curve gives the corresponding N; as expected we see that it
grows as |µ| falls and µ rises towards zero from below. And as claimed above, the area reaches a
maximum as µ → 0. No more particles can be accommodated.
¹⁴ It takes a little thought to convince one's self that G(z) in (4.55) tends to a finite maximum as z → 1,
since for z = 1 the integrand diverges as x^{−1/2} for x → 0. But the integral does not diverge; it goes as x^{1/2}
for x → 0, and so the lower limit contributes zero. (Note \int_0^a x^{-1/2}\,dx = 2a^{1/2}.) For the full integrand, of
course, the upper limit of the integral and hence the final result is finite. Switching to k, at low k we have
g(k)n̄(k) ∼ k²/(ħ²k²β/2m) = constant, so the problem does not arise.
[Figure: g(k)n̄(k) vs k. Left, T fixed, for µβ = −0.1, −0.01, −10⁻³ and −10⁻⁶; right, N fixed, for T = 1.8, 1.4, 1.1 and 1.01 TC.]

On the right we show fixed N for various T (in units of “TC ”, here just a shorthand for a
quantity that depends on N with units of temperature; defined below in Eq. (4.57)). Again,
once T = TC and µ = 0, no further adjustment can allow the temperature to be lowered further
while keeping N fixed.
From all of this, we see that the maximum N or minimum T is obtained when µ → 0:
    \frac{N_C}{V} = \frac{1}{V}\int_0^\infty g(k)\,\bar{n}(k)\,dk = \frac{g_s}{2\pi^2}\int_0^\infty \frac{k^2}{e^{(\hbar^2k^2/2m)\beta} - 1}\,dk
    = \left(\frac{2m}{\hbar^2\beta}\right)^{3/2}\frac{g_s}{4\pi^2}\int_0^\infty \frac{x^{1/2}}{e^x - 1}\,dx = \left(\frac{2mk_B T}{\hbar^2}\right)^{3/2}\frac{2.31516\, g_s}{4\pi^2}
    = 2.61238\, g_s n_Q   (4.56)

Or in terms of temperature,
    T_C = 3.3125\,\frac{\hbar^2}{m k_B}\left(\frac{n}{g_s}\right)^{2/3}.   (4.57)

The subscript C could stand for “critical”, though for reasons we will see below it usually
stands for “condensation”.
But what is going on here? Suppose we just have one state, of energy ε. Then for a given
µ the occupancy is
    N = \frac{1}{e^{(\varepsilon-\mu)\beta} - 1} \qquad\Rightarrow\qquad \mu = \varepsilon - k_B T \ln\left(1 + \frac{1}{N}\right) \approx \varepsilon - \frac{k_B T}{N} \quad\text{for } N \gg 1.   (4.58)
There is no limit on N ! µ just gets closer and closer to ε, from below. So why are we having
problems when we allow more than one state (for a particle in a box, using the density of
states)?
The problem is simply that we have said that everything varies smoothly with k, so that
we have replaced a sum over discrete states with a weighted integral over k. But as µ → 0,
n̄(k) is varying extremely rapidly at low k. And the weighting, the density of states g(k) ∝ k²,
vanishes at the lowest energies, so exactly the states we expect to have the most occupancy
are given a zero weighting! The apparent limit on n/nQ, or on T given n, is an artefact of our
approximation.
In principle, therefore, we should just switch back to a sum over modes. Recall there is
a single ground state, with energy (taking a cube for simplicity) E0 = 3ħ²π²/(2mV^{2/3}), then three states
with energy 6ħ²π²/(2mV^{2/3}), three with 9ħ²π²/(2mV^{2/3}), and so on. We can show (it is set as an exercise on the
problem sheet) that as we lower the temperature below the critical point, n̄(ε) continues to
vary smoothly over all states except the ground state. So in fact we can continue to use the
density of states, so long as we treat the ground state separately:
    N = N_0 + \int_0^\infty g(k)\,\bar{n}(k)\,dk = N_0 + \frac{g_s V}{2\pi^2}\int_0^\infty \frac{k^2}{e^{(\hbar^2k^2/2m)\beta} - 1}\,dk
      = N_0 + \left(\frac{2m}{\hbar^2\beta}\right)^{3/2}\frac{g_s V}{4\pi^2}\int_0^\infty \frac{x^{1/2}}{e^x - 1}\,dx
      = N_0 + 2.61238\, g_s V n_Q = N_0 + N\left(\frac{T}{T_C}\right)^{3/2}   (4.59)
    \Rightarrow N_0 = N\left(1 - \left(\frac{T}{T_C}\right)^{3/2}\right)   (4.60)

(where TC is itself a function of N ).
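As an illustration (a sketch, mine), Eq. (4.57) applied to liquid 4He, with gs = 1 and n ≈ 2.2 × 10²⁸ m⁻³ estimated from the liquid density; the liquid is of course strongly interacting, so only rough agreement with the 2.17 K λ-point can be expected:

    hbar = 1.054571817e-34   # J s
    kB   = 1.380649e-23      # J K^-1
    m    = 6.6465e-27        # kg, mass of a 4He atom

    n  = 2.2e28                                   # number density, m^-3 (estimate)
    TC = 3.3125 * hbar**2 / (m * kB) * n**(2/3)   # Eq. (4.57), gs = 1
    print(f"TC = {TC:.2f} K")                     # ~3.1 K

    T = 1.5                                       # K
    print(f"condensate fraction at {T} K: {1 - (T/TC)**1.5:.2f}")   # Eq. (4.60)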


The really important thing to note is that for any T even just a little below T_C, N₀ is of the order of N. Maybe a hundredth or a tenth of N, but for macroscopic N that is very much bigger than, say, the fluctuations of order √N that we ignore when using the grand canonical ensemble and applying it to a fixed particle number. As the temperature drops below T_C, a macroscopic number of particles “condense” into the ground state: Bose-Einstein condensation. This is in comparison with a typical state of higher energy (say ε ∼ k_BT), where the occupancy is “a few”, of order 1.
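To put numbers on “macroscopic” (an illustrative sketch, not part of the notes; N = 10²⁰ is an arbitrary choice):

import numpy as np

N = 1e20
for t in (0.99, 0.9, 0.5):                 # t = T/T_C
    N0 = N * (1 - t**1.5)                  # condensate number, from (4.60)
    print(t, N0, N0 / np.sqrt(N))          # compare with sqrt(N) fluctuations

Even at T = 0.99 T_C about 1.5% of the atoms, N₀ ≈ 1.5 × 10¹⁸ of them, sit in the one ground state — some eight orders of magnitude above the √N ∼ 10¹⁰ fluctuation scale.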
[Figure: left panel, µ against T/T_C, zero below T_C and falling below zero above it; middle panel, the condensate fraction N₀/N against T/T_C, rising from zero at T = T_C to one at T = 0; right panel, g(k)n(k) against k for T/T_C = 0.9, 0.7, 0.5 and 0.3.]

As the temperature drops, the fraction of the particles in the condensate rises steadily, till as T → 0, N₀ → N, as shown in the middle panel above. The remainder (N − N₀) corresponds to the area under the curves in the right-hand panel. The condensate is a single, macroscopic, collective quantum object which (for cold trapped atomic gases) can actually be seen by the naked eye.
The energy of the gas is given by
\[
E = N\varepsilon_0 + \frac{g_s V}{2\pi^2}\int_0^\infty \frac{k^2\,\varepsilon(k)}{e^{(\hbar^2 k^2/2m)\beta}-1}\,dk
= N\varepsilon_0 + \left(\frac{2m}{\hbar^2}\right)^{3/2}\frac{g_s V}{4\pi^2}\,(k_B T)^{5/2}\int_0^\infty \frac{x^{3/2}}{e^x-1}\,dx
\]
\[
\Rightarrow\quad C_V = \frac{5}{2}\,k_B\,\frac{g_s V}{4\pi^2}\left(\frac{2m k_B T}{\hbar^2}\right)^{3/2}\int_0^\infty \frac{x^{3/2}}{e^x-1}\,dx, \tag{4.61}
\]
where the x integral is just another number. So the heat capacity below T_C is proportional to T^{3/2}.¹⁵

¹⁵ The Nε₀ in E is not a misprint for N₀ε₀; see below.
It is also interesting to know what it is just above T_C, because structure in the heat capacity is often an experimental sign of a phase transition, such as “condensation” in the vapour-liquid context. This is more complicated, because the energy will depend on the non-vanishing µ, which is implicitly a function of T. As before, writing z = e^{−µβ},

\[
E = \left(\frac{2m}{\hbar^2}\right)^{3/2}\frac{g_s V}{4\pi^2}\,(k_B T)^{5/2}\int_0^\infty \frac{x^{3/2}}{z e^x-1}\,dx
\]
\[
\Rightarrow\quad C_V = \frac{5}{2}\,k_B\,\frac{g_s V}{4\pi^2}\left(\frac{2m k_B T}{\hbar^2}\right)^{3/2}\int_0^\infty \frac{x^{3/2}}{z e^x-1}\,dx \tag{4.62}
\]
\[
\qquad\qquad + \left(\frac{2m}{\hbar^2}\right)^{3/2}\frac{g_s V}{4\pi^2}\,(k_B T)^{5/2}\,\frac{1}{k_B T^2}\left(T\frac{\partial\mu}{\partial T}-\mu\right)\int_0^\infty \frac{z e^x\,x^{3/2}}{(z e^x-1)^2}\,dx \tag{4.63}
\]

If we now set µ → 0, z → 1 we get


\[
C_V \to \frac{g_s V}{4\pi^2}\left(\frac{2m k_B T}{\hbar^2}\right)^{3/2} k_B\left(\frac{5}{2}\int_0^\infty \frac{x^{3/2}}{e^x-1}\,dx + \frac{1}{k_B}\frac{\partial\mu}{\partial T}\int_0^\infty \frac{e^x\,x^{3/2}}{(e^x-1)^2}\,dx\right) \tag{4.64}
\]

The first term is the same as what we found for T < T_C, (4.61), so it is continuous at the transition. But the second term vanishes below T = T_C, while just above it it is negative, growing in magnitude linearly in T − T_C. We could go further in the calculation to get an expression for ∂µ/∂T, but instead we simply plot C_V; it does indeed have a cusp at T = T_C. This is indicative of a second-order phase transition (like that of the Ising paramagnet).
[Figure: C_V/(N k_B) against T/T_C, rising as T^{3/2} from zero to a cusp at T = T_C and falling back towards the classical value 3/2 at high temperature.]
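The height of the cusp can be checked numerically (a sketch, not part of the notes): dividing (4.61) by N k_B, and using (4.56) at T = T_C, gives C_V/(N k_B) = (5/2) [∫x^{3/2}/(e^x−1)dx]/[∫x^{1/2}/(e^x−1)dx] = 15ζ(5/2)/4ζ(3/2) ≈ 1.93, comfortably above the classical 3/2.

import numpy as np
from scipy.integrate import quad
from scipy.special import zeta

I12, _ = quad(lambda x: x**0.5 / np.expm1(x), 0, np.inf)   # 2.31516...
I32, _ = quad(lambda x: x**1.5 / np.expm1(x), 0, np.inf)   # 1.78329...
print(2.5 * I32 / I12)                    # 1.9257: C_V/(N k_B) at T = T_C
print(15 * zeta(2.5) / (4 * zeta(1.5)))   # the same, via Gamma-zeta identities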

The relation P = (2/3)E/V continues to hold below T_C if the contribution of the condensate is ignored, which is a good approximation. But with the vanishing of µ, which above T_C depends on N/V, P depends only on the temperature and not on the volume. This is not in fact physical, as the system would not be stable against collapse. But we have ignored interactions; Bose-Einstein gases of real atoms will have repulsive interactions at short distances.
There are a couple of things in the set-up that, on a second reading, might bother you a little. Firstly, the ground state does not have zero energy, so the chemical potential below T_C is not actually zero, but is given by (from (4.58))
 
\[
\mu = \varepsilon_0 - k_B T \ln\left(1 + \frac{1}{N_0}\right) \approx \varepsilon_0 - \frac{k_B T}{N_0}. \tag{4.65}
\]

There are two aspects to this: the ε₀ is just a resetting of the zero of energy. In the continuum contribution one likewise shifts the energy so that the variable of integration x is proportional to ε − ε₀. (It is this shift that makes the first term in the energy expression above (4.61) equal to Nε₀, the zero-point energy of all the particles.) The other aspect is the part that falls off as 1/N₀; for macroscopic occupancy of the ground state that is the part we are truly ignoring when we say µ = 0, and it makes no appreciable difference to the continuum part of the distribution. Second, if the ground state is macroscopically occupied, what about the states just above the ground state? It is true that the continuum approximation may not get their contribution quite right. (It is, after all, only an approximation.) But as you will show on the problem sheet, the occupancy of the first excited state scales as N^{2/3}, which for a macroscopic system is many orders of magnitude below N. Only the ground state is macroscopically occupied.
Eric A. Cornell, Wolfgang Ketterle, and Carl E. Wieman received the 2001 Nobel Prize in Physics “for the achievement of Bose-Einstein condensation in dilute gases of alkali atoms, and for early fundamental studies of the properties of the condensates”.¹⁶ The JILA paper demonstrated that “a condensate was produced in a vapor of rubidium-87 atoms that was confined by magnetic fields and evaporatively cooled. The condensate fraction first appeared near a temperature of 170 nanokelvin and a number density of 2.5 × 10¹² cm⁻³”.¹⁷ The other, from MIT, used ²³Na. “The condensates contained up to 5 × 10⁵ atoms at densities exceeding 10¹⁴ cm⁻³. The striking signature of Bose condensation was the sudden appearance of a bimodal velocity distribution below the critical temperature of ∼2 µK. The distribution consisted of an isotropic thermal distribution and an elliptical core attributed to the expansion of a dense condensate.”¹⁸
The famous image below is from the MIT group and shows the velocity distribution above,
just below and well below TC , with the condensate standing out as a spike at zero momentum,
distinguished from the more diffuse cloud with a thermal distribution.

¹⁶ The information accompanying the prize citation is here.
¹⁷ Observation of Bose-Einstein Condensation in a Dilute Atomic Vapor, M. H. Anderson, J. R. Ensher, M. R. Matthews, C. E. Wieman and E. A. Cornell, Science 269 (1995) 198.
¹⁸ Bose-Einstein Condensation in a Gas of Sodium Atoms, K. B. Davis, M.-O. Mewes, M. R. Andrews, N. J. van Druten, D. S. Durfee, D. M. Kurn and W. Ketterle, Phys. Rev. Lett. 75 (1995) 3969.

Another system of bosons that exhibits interesting behaviour at low temperatures has been known for much longer. ⁴He liquefies at 4.2 K, and never solidifies at normal pressures. At 2.2 K, though, it starts to show very peculiar behaviour, which is known as superfluidity: it behaves as if it had two fluid components, one normal, the other able to flow without viscosity, to flow up and over the edge of an open container, and unable to rotate as a bulk fluid. This component is called the superfluid, and as the temperature drops, the superfluid component tends to 100%. At 2.17 K, the heat capacity shows a pronounced spike (see below). ³He, on the other hand, shows no such strange behaviour until the temperature drops to 2 mK. Fritz London in 1938 suggested that the superfluid is in fact a Bose-Einstein condensate; a rough estimate of the BEC transition temperature gives about 3 K, not too far from the observed value, and if you don't look too closely the heat capacity curve (below) is similar.¹⁹

In fact non-interacting bosons do not exhibit superfluidity (a mathematical statement, of course, since strictly non-interacting bosons are not experimentally achievable). Dilute cold-atom BECs, which are weakly interacting, do show superfluidity, and indeed it has been demonstrated that there is a BEC in liquid ⁴He. But the condensate fraction never rises above 10%, and it is clear that the interactions (which are strong enough to make it a liquid, after all) are too strong for the treatment we have used to be even an approximate description of the relevant physics.
What about the (fermionic) ³He? How can that show any kind of condensation, even at very low temperatures indeed? In fact this is an example of a phenomenon called pairing: if attractive interactions exist, then pairs of fermions can form composite bosons (and of course all material bosons are composite); so long as k_BT is well below the binding energy, the pairs can form a BEC. Pairing is similarly behind the phenomenon of superconductivity and the superfluidity of the matter of neutron stars. But that is well beyond the scope of this course.

¹⁹ Figure from F. London, Superfluidity, Wiley 1954. In fact the helium heat capacity diverges, albeit only logarithmically, at the critical temperature.
Appendix A

Miscellaneous background

A.1 Revision of ideal gas


Just as important as knowing these equations is knowing that they only apply to ideal gases!
An “ideal” gas is one with point-like, non-interacting molecules. However the molecules are
allowed to be poly-atomic, and so have internal degrees of freedom (rotational, vibrational).
The behaviour of all gases tends to that of an ideal gas at low enough pressures; at STP noble
gases such as argon are very close to ideal, and even air is reasonably approximated as ideal.
Ideal gases obey the ideal gas law
\[
PV = nRT \quad\text{or}\quad PV = N k_B T \tag{A.1}
\]
where N is the number of molecules, n = N/N_A is the number of moles (not to be confused with the number density, N/V, also denoted by n), R = 8.314 J K⁻¹ mol⁻¹ is the gas constant and k_B = R/N_A = 1.381 × 10⁻²³ J K⁻¹ is Boltzmann's constant. The ideal gas law encompasses Boyle's Law and Charles' Law. It requires the temperature to be measured on an absolute scale such as the Kelvin scale.
Ideal gases have internal energies which depend only on temperature: if C_V is the heat capacity at constant volume,
\[
E = E(T) \quad\text{and}\quad dE = C_V\,dT \quad\Rightarrow\quad E = C_V T \;\text{ if } C_V \text{ is constant.} \tag{A.2}
\]

In general the heat capacity may change with temperature; however at STP it is usually adequate to consider it as constant and equal to
\[
C_V = \tfrac{1}{2} n_f R \tag{A.3}
\]
per mole, where n_f is the number of active degrees of freedom. For monatomic gases n_f = 3 (translational) and for diatomic gases n_f = 5 (translational and rotational; vibrational modes are “frozen out”).
The heat capacities at constant pressure and at constant volume differ by a constant for
ideal gases:
CP − CV = nR. (A.4)

During reversible adiabatic compression or expansion of an ideal gas the pressure and volume change together in such a way that
\[
PV^\gamma = \text{constant}, \quad\text{where}\quad \gamma \equiv \frac{C_P}{C_V}. \tag{A.5}
\]
For a monatomic gas at STP, γ = 5/3 = 1.67; for a diatomic gas, γ = 7/5 = 1.4. Using the ideal gas law, we also have
\[
T V^{\gamma-1} = \text{constant} \quad\text{and}\quad T P^{\frac{1}{\gamma}-1} = \text{constant}. \tag{A.6}
\]
Note that γ − 1 = nR/C_V.
Starting from the fundamental thermodynamic relation (1.11) together with the equation of state (A.1) and the energy–temperature relation (A.2), we can show that the entropy change of n moles of ideal gas is
\[
\Delta S = nR \ln\!\left(\left(\frac{T_f}{T_i}\right)^{n_f/2}\frac{V_f}{V_i}\right). \tag{A.7}
\]

Since dS = (C_V/T) dT at constant volume, this can be checked experimentally for any gas which is close enough to ideal. But as the expression is ill-defined as T_i → 0, classical thermodynamics cannot predict the absolute entropy even of an ideal gas.
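As a consistency check (a sketch, not part of the notes), (A.7) should give ΔS = 0 along a reversible adiabat, where T and V are related by (A.6):

import numpy as np

R, n, nf = 8.314, 1.0, 3                 # one mole of a monatomic ideal gas
gamma = (nf + 2) / nf                    # C_P/C_V = 5/3

Ti, Vi, Vf = 300.0, 1.0e-3, 2.0e-3       # expand to twice the volume
Tf = Ti * (Vi / Vf)**(gamma - 1)         # T V^(gamma-1) = constant, from (A.6)

dS = n * R * np.log((Tf / Ti)**(nf / 2) * (Vf / Vi))   # (A.7)
print(Tf, dS)                            # ~189 K, and dS = 0 up to rounding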

A.2 Lagrange multipliers for constrained minimisation


Arfken 22.3
Riley 5.9
Consider a hill with height h(x, y) (atypically, here, we use y as an independent variable).
To find the highest point, we want to simultaneously satisfy
\[
\frac{\partial h}{\partial x} = 0 \quad\text{and}\quad \frac{\partial h}{\partial y} = 0 \tag{A.8}
\]

(checking that it really is a maximum that we’ve found). But consider a different problem: on a
particular path across the hill (which does not necessarily reach the summit) what is the highest
point reached? The path may be specified as y = g(x) or more symmetrically as u(x, y) = 0.
This is constrained maximisation: we are constrained to stay on the path.
The trick is to extremize h(x, y) + λu(x, y) with respect to x and y; these two equations
together with the constraint u(x, y) = 0 are enough to fix the three unknowns xm , ym and
λ (though the value of the last is uninteresting and not usually found explicitly; this is also
called the method of “undetermined multipliers”.) So for example with a hemispherical hill
h = h₀(1 − x² − y²) and a straight-line path u(x, y) = y − mx − c = 0 we have¹

\[
\frac{\partial(h+\lambda u)}{\partial x} = 0 \;\Rightarrow\; -2h_0 x - \lambda m = 0 \;\Rightarrow\; x = -\frac{\lambda m}{2h_0}
\]
\[
\frac{\partial(h+\lambda u)}{\partial y} = 0 \;\Rightarrow\; -2h_0 y + \lambda = 0 \;\Rightarrow\; y = \frac{\lambda}{2h_0} = -\frac{x}{m}. \tag{A.9}
\]

Combining the constraint y − mx − c = 0 with y = −x/m gives x = −c/(m⁻¹ + m), so the maximum is reached at
\[
(x_m, y_m) = \frac{c}{1+m^2}\,(-m, 1), \qquad h_m = h_0\,\frac{1+m^2-c^2}{1+m^2}. \tag{A.10}
\]
As promised we didn't find λ, though we could. In this case we could simply have substituted y = mx + c into h(x, y) and maximised with respect to x alone, which would have been easier (check!), but for a more complicated hill and/or path Lagrange's method is simpler.

¹ Some presentations of this subject add the equation ∂(h + λu)/∂λ = 0 to the list, but from that we just recover the imposed constraint u = 0.
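The same example can also be checked symbolically (an illustrative sketch, not part of the notes; the Lagrange conditions are solved exactly as in (A.9)):

import sympy as sp

x, y, lam = sp.symbols('x y lam')
h0, m, c = sp.symbols('h0 m c', positive=True)

h = h0 * (1 - x**2 - y**2)       # the hemispherical hill
u = y - m*x - c                  # the path: u(x, y) = 0

# Extremise h + lam*u with respect to x and y, together with the constraint
sol = sp.solve([sp.diff(h + lam*u, x), sp.diff(h + lam*u, y), u],
               [x, y, lam], dict=True)[0]
print(sol[x], sol[y])            # -c*m/(m**2 + 1), c/(m**2 + 1)
print(sp.simplify(h.subs(sol)))  # h0*(m**2 - c**2 + 1)/(m**2 + 1), as in (A.10)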

A.3 Hyperbolic Trigonometry


Remember that ordinary trig functions are defined as follows:
\[
\cos\theta = \tfrac{1}{2}\left(e^{i\theta} + e^{-i\theta}\right), \qquad \sin\theta = \tfrac{1}{2i}\left(e^{i\theta} - e^{-i\theta}\right) \tag{A.11}
\]
and it is useful sometimes to use the extra functions
\[
\sec\theta \equiv \frac{1}{\cos\theta}, \qquad \operatorname{cosec}\theta \equiv \frac{1}{\sin\theta}, \qquad \cot\theta \equiv \frac{1}{\tan\theta}. \tag{A.12}
\]
Hyperbolic trig functions are defined similarly:
\[
\cosh x = \tfrac{1}{2}\left(e^{x} + e^{-x}\right), \qquad \sinh x = \tfrac{1}{2}\left(e^{x} - e^{-x}\right) \tag{A.13}
\]
\[
\operatorname{sech} x \equiv \frac{1}{\cosh x}, \qquad \operatorname{cosech} x \equiv \frac{1}{\sinh x}, \qquad \coth x \equiv \frac{1}{\tanh x}. \tag{A.14}
\]
From the definitions above it is easy to show that
\[
\frac{d\cosh x}{dx} = \sinh x, \qquad \frac{d\sinh x}{dx} = \cosh x, \qquad \frac{d\tanh x}{dx} = \operatorname{sech}^2 x. \tag{A.15}
\]
Often we are interested in the small- or large-x limits of these functions. What we want is to find a simple function which approximates to a more complicated one in these limits. So while it is true that as x → 0, sinh x → 0, that is not usually what we want; what we want is how it tends to zero.

From the small-x expansion of the exponential, e^x = 1 + x + ½x² + …, we get
\[
\sinh x \xrightarrow{x\to0} x, \qquad \tanh x \xrightarrow{x\to0} x, \qquad \cosh x \xrightarrow{x\to0} 1 + \tfrac{1}{2}x^2. \tag{A.16}
\]

The limit of cosh x often causes problems; whether we keep the x² term depends on the context, given that we want to be able to say more than “tends to 0” or “tends to ∞”. It may be useful to remember instead
\[
\cosh x \xrightarrow{x\to0} 1 \quad\text{but}\quad \cosh x - 1 \xrightarrow{x\to0} \tfrac{1}{2}x^2. \tag{A.17}
\]

The same is true of the exponential:
\[
e^x \xrightarrow{x\to0} 1 \quad\text{but}\quad e^x - 1 \xrightarrow{x\to0} x. \tag{A.18}
\]

In a particular problem we find that the energy of a system is
\[
\langle E \rangle = \frac{\hbar\omega}{e^{\hbar\omega\beta} - 1}. \tag{A.19}
\]
Naively we would say that at high temperatures, as β → 0, the denominator vanishes and the energy tends to infinity. That is true but not very helpful. If we are more sophisticated we see that the denominator actually tends to h̄ωβ and ⟨E⟩ → 1/β = k_BT. That is a much more useful prediction, since it can be verified experimentally.
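A few lines of numerics make the point (a sketch; everything is in dimensionless form, with a = h̄ωβ, so ⟨E⟩/k_BT = a/(e^a − 1)):

import numpy as np

# <E>/(k_B T) = a/(e^a - 1), where a = hbar*omega*beta;
# the limit a -> 0 is the high-temperature limit
for a in (2.0, 1.0, 0.1, 0.01):
    print(a, a / np.expm1(a))    # 0.313, 0.582, 0.951, 0.995 -> 1: <E> -> k_B T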
The high-x limits are easier; e⁻ˣ → 0 and so
\[
\sinh x \xrightarrow{x\to\infty} \tfrac{1}{2}e^x, \qquad \cosh x \xrightarrow{x\to\infty} \tfrac{1}{2}e^x, \qquad \tanh x \xrightarrow{x\to\infty} 1. \tag{A.20}
\]
