Stat Phys
2021-2022
Judith McGovern
1 Revision of thermodynamics 2
1.1 States of a system . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 The Zeroth Law of Thermodynamics . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3 Internal energy and the First Law of Thermodynamics . . . . . . . . . . . . . . 3
1.4 Second law of Thermodynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.5 Thermodynamic potentials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.6 Variable particle number . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.6.1 Chemical Reactions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
4 Quantum Gases 53
4.1 Bosons and fermions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.2 The ideal gas of bosons or fermions: beyond the classical approximation . . . . 54
4.2.1 The classical approximation again . . . . . . . . . . . . . . . . . . . . . 56
4.3 The ideal Fermi Gas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.3.1 Electrons in a metal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.3.2 White dwarf stars . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
4.4 The ideal Bose Gas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
4.4.1 Photons and Black-body radiation . . . . . . . . . . . . . . . . . . . . . 64
4.4.2 Phonons and the Debye model . . . . . . . . . . . . . . . . . . . . . . . 68
4.4.3 Bose-Einstein condensation . . . . . . . . . . . . . . . . . . . . . . . . . 71
A Miscellaneous background 78
A.1 Revision of ideal gas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
A.2 Lagrange multipliers for constrained minimisation . . . . . . . . . . . . . . . . . 79
A.3 Hyperbolic Trigonometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
Chapter 1
Revision of thermodynamics
In this chapter, the reader is referred to sections 1 & 2 of the on-line notes of the previous
version of this course for more details and references.
a well-defined pressure, which would not be the case if there were turbulence or shock waves
arising from too-rapid movement of the piston. Such irreversible processes are drawn as dotted
lines between the initial and final states.
In the context of these lectures, the state of a macroscopic system in equilibrium specified
by a handful of macroscopically-manipulable variables is called the macrostate. It completely
ignores what is going on with the very many individual atoms that comprise the system. A
much richer description of the positions and momenta of all the atoms (or of their combined
quantum state) is in principle possible, and it is called a microstate of the system.1 The
ability of classical thermodynamics to describe systems without reference to the microstates is
a consequence of the laws of probability and of large numbers. The goal of statistical physics
is to derive thermodynamics from the behaviour of atoms and molecules, and we will return
to this in the next chapter. The current chapter is largely revision of ideas already met in
Properties of Matter.
The low pressure limit is taken because real gases approach ideal behaviour in that limit.
The numerical value at the triple point was chosen so that the degree Kelvin matched the
degree Celsius to high accuracy. Unlike earlier temperature scales there is no need to define
the temperature at two points, because the zero of the Kelvin scale is absolute zero. This is
the temperature at which the pressure of an ideal gas would vanish, because (classically) the
motion of its molecules would cease.
In these notes, the symbol T will always refer to absolute temperature.
The internal energy is a function of state and changes by a finite (∆E) or infinitesimal (dE)
amount in any process. But heat and work are NOT functions of state; the same change of
state may be effected by different reversible or irreversible processes involving different amounts
of heat transfer and work done. d̄Q and d̄W are not true differentials but just infinitesimal
amounts of energy transfer in the form of heat and work.
For reversible processes we write
In an adiabatic process, Q = 0.
For a system taken round a cycle, so that it returns to its initial state, it is obvious that
∆E = 0. But Q and W , not being functions of state, will not vanish in general. This is the
basis of a heat engine: the net work done by the system is equal to the net heat transferred to
the system. But we will see in a moment that one cannot simply add an amount of heat and
extract the same amount of work. Some of the heat must be discarded.
Expressions for the work done in reversible processes are as follows:
Compression of a fluid
d̄W rev = −P dV (1.4)
To reversibly stretch a wire of tension Γ (that’s a capital gamma) by dl requires
d̄W rev = Γ dl (1.5)
and to reversibly increase the area of a soap film of surface tension γ by dA requires
d̄W rev = γ dA (1.6)
Note in the last two cases the sign is different from the first; that’s because it takes work to
stretch a wire or a soap film (dl or dA positive) but to compress a gas (dV negative).
Lastly, to reversibly increase the magnetic field B imposed upon a paramagnetic sample
requires
d̄W rev = −m · dB = −V M · dB (1.7)
where M is the magnetisation per unit volume, and m is the total magnetic moment of the
sample.
To repeat, these hold only for reversible processes. To calculate the internal energy change
for irreversible processes such as the free expansion of a gas, it is necessary to find a reversible
process linking the same initial and final states of the system.
Note the careful wording: of course it is possible to think of processes which convert heat
into work (expansion of a hot gas in a piston) or which pass heat from a cool to a hot body
(real fridges) but other things change as well (the gas ends up cooler; you have an electricity
bill to pay). The bit about “operating in a cycle” ensures that the engine is unchanged by the
process.
The two statements may not appear to have anything to do with one another, but in fact
each one implies the other: a hypothetical engine or pump which violates one statement can,
along with another normal engine or pump, form a new composite machine which violates the
other.
Clausius’s statement leads to the following. By consideration of a system taken round a
cycle, with heat added to and removed from the system while it or the relevant part of it is at
various temperatures, Clausius’s theorem says that the sum of the heat added weighted by the
inverse of the temperature at which it is added is less than or equal to zero:
∮ d̄Q/T ≤ 0 (1.8)
The inequality becomes an equality for reversible systems, for which we could take the cycle in
the opposite direction:
∮ d̄Qrev/T = 0. (1.9)
This is interesting because a quantity whose change vanishes over a cycle implies a function
of state. We know that heat itself isn’t a function of state, but it seems that in a reversible
process “heat over temperature” is a function of state. It is called entropy with the symbol S:
dS = d̄Qrev/T (1.10)
Hence we have, for a fluid, a new statement of the first law: the fundamental thermodynamic
relation
dE = T dS − P dV (1.11)
Note that reversible adiabatic processes are isentropic and dE = d̄W rev .
Furthermore, by considering a cycle in which a system undergoes an irreversible change,
followed by a reversible return to the initial state, we have dS ≥ d̄Q/T, and hence for a
thermally isolated system
dS ≥ 0. (1.13)
and once the entropy reaches a maximum, no further change can take place and the system is
in equilibrium.
An alternative statement of the second law is thus:
This is a powerful principle, but it is empirically based and leaves many questions unan-
swered, principally about the nature of entropy.
If we start with entropy increase, we recover Clausius’ statement: when a (positive) amount
of heat Q flows from a hotter body to a cooler one, the entropy decrease of the hot body,
Q/TH, is smaller in magnitude than the entropy increase of the cold body, Q/TC . So overall the entropy of
the hot and cold bodies together (or a hot body plus its cooler surroundings, together “the
universe”) increases.
We also recover Kelvin’s statement: to remove heat from a hot body and turn it entirely
into work will decrease the entropy of the universe. We need to add enough of the heat to the
surroundings so that their entropy increases to compensate.
In a Carnot engine QH is removed from a hot reservoir and QC is discarded to a cold one
in such a way that the combined entropy change is zero
QH/TH = QC/TC (1.14)
and the difference is available to do work: W = QH − QC .
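As a concrete sketch of this bookkeeping, with illustrative reservoir temperatures and heat input (not values from the notes), the zero-net-entropy condition fixes how much heat must be discarded:

```python
# Carnot engine: discarded heat is fixed by zero net entropy change, Q_H/T_H = Q_C/T_C
T_H, T_C = 500.0, 300.0    # reservoir temperatures in kelvin (illustrative)
Q_H = 1000.0               # heat extracted from the hot reservoir, in joules
Q_C = Q_H * T_C / T_H      # heat that must be discarded to the cold reservoir
W = Q_H - Q_C              # work available from the cycle
efficiency = W / Q_H       # equals 1 - T_C/T_H
print(Q_C, W, efficiency)  # 600.0 400.0 0.4
```

The efficiency 1 − TC/TH is the best the Second Law permits; a real engine discards more heat than this.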
Along with heating, other processes which increase entropy are free expansion of a gas,
or mixing of two species of gas initially confined to different parts of a chamber. It is not
immediately clear what these have in common....
1.5 Thermodynamic potentials
Take-home message: Depending on the external conditions, other “potentials” or
“free energies” are easier to work with than the internal energy as they account
for the entropy changes of the surroundings
The fundamental thermodynamic relation
dE = T dS − P dV (1.15)
implies that there is a function E(S, V ) with
(∂E/∂S)_V = T and (∂E/∂V)_S = −P (1.16)
Equivalently we may use S(E, V ):
dS = (1/T) dE + (P/T) dV; (∂S/∂E)_V = 1/T and (∂S/∂V)_E = P/T (1.17)
In this view, temperature and pressure are derived properties of systems whose energy and
volume are fixed. This is appropriate for isolated systems.
However in practice we often want to work with temperature as the variable we control,
and systems which are not isolated but in contact with a heat bath of our choice.
Now we cannot say that the approach to equilibrium will maximise the entropy of the
system, because it can exchange heat with the heat bath at temperature T , and only when we
consider both together will entropy increase.
Imagine a spontaneous change at constant volume during which Q is absorbed by the system,
∆E = Q. The system starts and ends at the temperature T of the heat bath, though it may
change in between (due to a chemical reaction, for instance). The total change in entropy has
two parts, ∆S for the system and −Q/T for the surroundings. So
∆Stot = ∆S − Q/T = (1/T)(T∆S − ∆E) ≥ 0
⇒ ∆(T S − E) ≥ 0 (1.18)
So if we concentrate only on the system, rather than its entropy being maximised in the
approach to equilibrium, the quantity T S − E is maximised. Conventionally we define the
Helmholtz free energy, F , as the negative of this, so it is minimised. We have
F = E − TS ⇒ dF = −SdT − P dV ; (1.19)
and
(∂F/∂T)_V = −S and (∂F/∂V)_T = −P. (1.20)
The Helmholtz free energy will turn out to play a crucial role in statistical physics.
If we fix the pressure rather than the volume, we are led to define
G = E − TS + PV ⇒ dG = −SdT + V dP ; (1.21)
and the approach to equilibrium minimises the Gibbs free energy G of the system.
In the latter two cases, the principle is still that the system plus surroundings together
evolve to maximise their entropy. It only looks as if the system may be “trying to minimise its
energy” because an energy transfer to (possibly cooler) surroundings more than compensates
for the decrease in entropy of the system itself.
1.6 Variable particle number
Take-home message: The chemical potential is the same as the Gibbs free energy,
and it is the key to diffusive and chemical equilibrium
In all of the above we have assumed a fixed amount of material in the system. If however
it is in diffusive contact with a reservoir, we must add a term +µdN to each of dE, dF and
dG. µ is the so-called chemical potential, and just as heat flows from high to low temperature,
particles flow from high to low chemical potential, and equilibrium—no net flow—means equal
chemical potential. One of the three expressions for µ as a derivative of a thermodymamic
potential is
µ = (∂G/∂N)_{T,P} (1.22)
But G is extensive, and P and T are intensive, so G must be directly proportional to N :
G(T, P, N ) = N g(T, P ) (where g is the Gibbs free energy per molecule), and so µ = g. So for
a single component system, µ is just the Gibbs free energy per molecule. Phase coexistence at
a given temperature and pressure, for instance, requires equal Gibbs free energy per molecule
in the two phases. Otherwise the system would evolve to minimise its Gibbs free energy and
one phase would convert into the other (melt, freeze, boil, condense....).
However for a two-component system, with separate chemical potentials µ1 and µ2 , G
depends not only on the extensive variable N = N1 + N2 but also on the ratio N1 /N2 , which is
intensive. The Gibbs free energy per molecule of one substance can depend on the concentration
of the other. If substance 1 is ethanol and substance 2 water, the chemical potential of ethanol
is different in beer (5%) and vodka (40%). Nonetheless an extension of extensivity—that G
must double if N1 and N2 both double—can be shown, rather surprisingly, to imply that
G = Σ_i µi Ni. (1.23)
In other words the chemical potential of each species remains the Gibbs free energy per particle
of that species, even though it also depends on the relative concentrations of all species present.2
For an ideal gas,
S(T, P) = S(T0, P0) + Cp ln(T/T0) − nR ln(P/P0). (1.24)
At constant temperature T0 , E and P V are also constant, and so from G = E − T S + P V we
have an expression comparing the chemical potential or Gibbs free energy per molecule at two
different pressures:
µ(T0, P2) − µ(T0, P1) = kB T0 ln(P2/P1) (1.25)
For mixtures of ideal gases, the absence of interactions means that each species has the same
chemical potential as it would if the other species were absent, and hence the pressure were
equal to the partial pressure of that species, Pi = (Ni /N )P .
Eq. (1.25) says the chemical potential is higher at higher pressures and gas will diffuse from
higher to lower pressure. (A similar expression holds for solutions, with pressure replaced by
concentration.) That seems obvious, but it is different from the equalisation of mechanical
pressure. In particular if two ideal gases are at different concentrations on either side of a rigid
membrane, but only one can pass through, the partial pressure of the mobile one will equalise
even if that increases the mechanical total pressure on one side.
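Equation (1.25) is easy to evaluate; the following sketch (with illustrative temperature and pressures) gives the chemical potential difference for a doubling of pressure:

```python
import math

k_B = 1.380649e-23     # Boltzmann constant, J/K
T0 = 300.0             # temperature, K (illustrative)
P1, P2 = 1.0e5, 2.0e5  # two pressures, Pa (illustrative)

# eq (1.25): mu(T0, P2) - mu(T0, P1) = k_B * T0 * ln(P2/P1)
dmu = k_B * T0 * math.log(P2 / P1)
# dmu > 0: the chemical potential is higher at the higher pressure,
# so particles diffuse from the high-pressure side to the low-pressure side
```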
² We will prove this when we consider the Grand Potential later in the course; see section 3.14.
1.6.1 Chemical Reactions
The treatment of chemical reactions is very like that of phase transitions. Again, we are
considering conditions of constant temperature and pressure, and the question is the following:
how far will a reaction go?
First consider the simplest case of a reaction with only one reactant and one product:
A ⇌ B. An example is the interconversion of n-pentane and isopentane (or pentane and
methylbutane, for those of us who learned our chemistry in the last fifty years).
[Skeletal formulae of pentane and methylbutane (hydrogens omitted).]
Spontaneous changes will minimise the Gibbs free energy. With temperature and pressure
fixed only the numbers of A and B can change. Since they can only interconvert, dNA = −dNB
and
dG = µA dNA + µB dNB = (µA − µB )dNA (1.26)
So if µA > µB , A will convert to B, but if µB > µA , the opposite will happen. So at equilibrium,
when no further changes happen, the chemical potentials must be equal. (Remember that the
chemical potentials are functions of concentration, so they will change as the reaction proceeds.)
In the figure “E” marks the equilibrium concentration, at the point where µA = µB .
If there are more reactants or products, say for the more general reaction in which a moles
of A and b moles of B react to form x moles of X and y moles of Y,
aA + bB ⌌ xX + yY, (1.27)
then since dNB = (b/a) dNA, dNX = −(x/a) dNA and dNY = −(y/a) dNA,
dG = µA dNA + µB dNB + µX dNX + µY dNY = (1/a)(aµA + bµB − xµX − yµY) dNA (1.28)
and equilibrium is when
aµA + bµB = xµX + yµY . (1.29)
This result is general: equilibrium is reached when the weighted sum of the chemical potentials
of the reactants equals that of the products.
Now consider the hugely simplified case where the species are in gaseous form and can be
treated as ideal gases, so that their chemical potential is just their Gibbs free energy per mole,
µi = gi (Pi , T ). (Here g refers to G/n rather than G/N .)3
We define the molar Gibbs free energy of reaction as
gr(Pi, T) = x gX + y gY − a gA − b gB (1.30)
and this will be zero at equilibrium. If we know gr0 ≡ gr (T0 , P0 ) at some reference temperature
and the same reference partial pressure for all reactants and products (usually T0 = 25 °C and
Pi = P0 = 1 bar), then at other partial pressures but the same temperature, from (1.25) it will
be
gr(Pi, T0) = gr0 + RT0 [ x ln(PX/P0) + y ln(PY/P0) − a ln(PA/P0) − b ln(PB/P0) ]
= gr0 + RT0 ln[ (PX/P0)^x (PY/P0)^y (P0/PA)^a (P0/PB)^b ] (1.31)
Setting gr(Pi, T0) = 0 then gives the equilibrium condition
(PX/P0)^x (PY/P0)^y (P0/PA)^a (P0/PB)^b = exp(−gr0/RT0) ≡ Kp(T0). (1.32)
If gr0 < 0 and so K is large, the reaction will tend to favour the products (X, Y) over the
reactants (A, B); if on the other hand gr0 > 0 (K small) the reactants will be favoured. But
the actual composition at equilibrium will depend on the initial conditions; if for instance B
is in very short supply, there will always be plenty of A left. If the LHS of equation (1.32) is
calculated for some non-equilibrium state it is called Q; the reaction will go forwards if Q < K
and backwards if Q > K.
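A numerical sketch of this criterion, for a hypothetical reaction A + B ⇌ X + Y with an assumed standard Gibbs free energy of reaction (all numbers illustrative):

```python
import math

R = 8.314        # gas constant, J/(mol K)
T0 = 298.15      # reference temperature, K
g_r0 = -10.0e3   # assumed standard Gibbs free energy of reaction, J/mol

K = math.exp(-g_r0 / (R * T0))   # equilibrium constant from g_r0

# partial pressures in bar, for a = b = x = y = 1 (a hypothetical non-equilibrium state):
P_A, P_B, P_X, P_Y, P0 = 0.5, 0.5, 0.1, 0.1, 1.0
Q = (P_X / P0) * (P_Y / P0) / ((P_A / P0) * (P_B / P0))  # reaction quotient

goes_forward = Q < K   # reaction proceeds towards the products if Q < K
```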
Chemists know Kp (T ) as the equilibrium constant, and using the symbol [A] for the ratio
of the concentration of species A in a gas or liquid to a standard concentration, write (1.32) as
[X]^x [Y]^y / ([A]^a [B]^b) = Kc(T). (1.33)
The two equilibrium constants are not numerically the same, because they refer to different
standard states (n0 = 1 mol l−1 for concentrations). They are related by
Kp(T) = Kc(T) (n0 RT/P0)^(x+y−a−b). (1.34)
We have predicted Kp for ideal gases and solutions, but only in terms of a known Gibbs free
energy of reaction at that temperature and standard pressures/concentrations. To predict the
behaviour as a function of temperature requires the statistical theory.
³ This is the only section in which we will use g to mean specific Gibbs free energy. In statistical physics g
will always be a degeneracy factor, the number of microstates with the same energy.
For completeness, the general forms of these equations for N reactants and N′ products can
be written
gr = Σ_{i=1}^{N′} xi gXi − Σ_{j=1}^{N} aj gAj (1.35)
and, at equilibrium,
∏_{i=1}^{N′} (PXi/P0)^{xi} ∏_{j=1}^{N} (P0/PAj)^{aj} = exp(−gr0/RT0). (1.36)
Note, as given, gr will depend on whether the reaction is written, e.g., O2 + 2H2 ⇌ 2H2O
or ½O2 + H2 ⇌ H2O, being twice as large in the first case as in the second. You should be
able to convince yourself that the partial pressures at equilibrium do not depend on this. For
a reaction with a single product like this one, though, it is usual to talk about the standard
Gibbs free energy of formation as referring to a single mole of product, i.e. the second case.
All of this is phrased in terms of chemical reactions. But not all reactions are chemical:
in a neutron star, neutrons, protons and electrons can interact via p + e− ⇌ n; they reach an
equilibrium at which the chemical potentials on either side of the reaction are equal. This is
heavily biased toward neutrons, because the chemical potential of the light electron is much
higher for a given concentration than that of the heavy proton or neutron. The treatment
above doesn’t reveal that; it would be hidden in the reference quantity gr0 . But in fact, we can
and will demonstrate it in statistical physics.
Chapter 2
2.1 Microcanonical Ensemble
Take-home message: The properties of a macrostate are averaged over many mi-
crostates.
The crucial link from microscopic to macroscopic properties is as follows. If the value of
some quantity X in the ith microstate is Xi , and the probability that the system is in that
microstate is pi , then the value of X in the macrostate is the ensemble average
⟨X⟩ = Σ_i pi Xi (2.1)
There are three kinds of ensembles commonly used in statistical physics. Where the real
system is isolated, that is at fixed energy and particle number, the copies in the ensemble are
also isolated from one another; this is called the microcanonical ensemble.
If the real system is in contact with a heat bath, that is at fixed temperature, the copies
are assumed to be in thermal contact, with all the rest of the copies acting as a heat bath for
any individual copy. This is called the canonical ensemble.
Finally, if the real system can exchange both heat and particles with a reservoir, at fixed
temperature and chemical potential, the copies are also assumed to be in diffusive contact.
This is called the grand canonical ensemble.
The idea that averaging over very many copies of a probabilistic system gives, as the number
tends to infinity, a fixed reproducible result, is an example of the law of large numbers. As an
example, if we roll many identical dice (or one die many times) the average score is predictable
(though it is only 3.5 if the die is fair). Though it sounds obvious, in probability theory it
does need to be proved, though we will not do so here. Crucial is the fact that each roll of the die
is independent of the others, so that in the jargon the results are “independent identically-
distributed random variables”. The usual laws of probability say that the probability of a set
of independent outcomes is just the product of the individual probabilities, so that for instance,
with a fair die, the probability of getting first a 6, then anything but a six, in two consecutive
rolls is just 1/6 × 5/6.
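A minimal simulation of the dice example (arbitrary seed and sample sizes) shows both the law of large numbers and the product rule for independent rolls:

```python
import random

random.seed(1)
N = 100_000
mean = sum(random.randint(1, 6) for _ in range(N)) / N
# the sample mean approaches 3.5 as N grows (law of large numbers)

# product rule: probability of a 6 followed by anything but a 6
p = (1 / 6) * (5 / 6)

# empirical check of the same probability
hits = 0
trials = 100_000
for _ in range(trials):
    a, b = random.randint(1, 6), random.randint(1, 6)
    if a == 6 and b != 6:
        hits += 1
# hits/trials is close to 5/36
```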
We start by considering an isolated system (constant energy, volume and particle number).
The fundamental principle that allows the averaging over microstate to be done is the postulate
of equal a priori probabilities or, in plain English, the assumption that all allowed microstates
are equally likely. (Allowed or accessible means having the same volume, particle number and
total energy as the macrostate.) We use Ω for the number of such microstates, so the
probability of the system being in any one microstate is
pi = 1/Ω and Σ_i pi = Ω × (1/Ω) = 1 (2.3)
It should be stressed that, as the name suggests, this postulate is not proved. It is assumed
to be true, but the validation relies on the success of the predictions of the theory.
Imagine we have counters, blue on one side and green on the other, and we toss them and
place them on a 6 × 6 checkerboard. Full information involves listing the colour at each site:
this is the equivalent of a microstate.
Many different patterns are possible, such as the following. Every configuration is equally
likely—or unlikely—to occur: there are Ω = 2^36 = 6.87 × 10^10 patterns and the probability
of each is (1/2)^36 = 1.46 × 10^−11. (This satisfies the “postulate of equal a priori probabilities”.)
Suppose from a distance we only knew, from the average colour, how many counters were
green and how many blue, without being able to distinguish different arrangements of the same
numbers of counters. Then a “macrostate” would be characterised simply by the overall hue,
determined by the total number of green counters (the rest being blue).
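Counting the checkerboard macrostates is a one-liner with binomial coefficients; this sketch confirms the numbers quoted above and shows that the best-mixed hue is the most numerous:

```python
from math import comb

TOTAL = 2**36                              # all microstates of the 6x6 board
counts = [comb(36, n) for n in range(37)]  # microstates per macrostate (n green counters)

assert sum(counts) == TOTAL                # every microstate belongs to one macrostate
p_each = 1 / TOTAL                         # equal a priori probability, ~1.46e-11
most_likely = counts.index(max(counts))    # the fully mixed macrostate, n = 18
```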
If we start at a point of unequal mixing, the configurations which are rather better mixed
are more numerous than those which are rather less well mixed. So as interactions cause the
system to jump from one microstate to another, it is more likely to end up better mixed. This
continues till full mixing is reached, at which point there is no further direction to the changes.
¹ Here, there are just two sets of size n and N − n. For more sets of size p, q, r, ... we have N!/(p! q! r! ...).
What has this to do with entropy? Classically, the system is evolving from a macrostate of
lower entropy to one of higher entropy. Statistically, it is evolving from less probable to more
probable macrostates, that is from macrostates corresponding to smaller numbers of microstates
to those corresponding to larger numbers of microstates.
So does the number of microstates, Ω, equal the entropy? No, because if we double the
size of a system, we have Ω², not 2Ω, microstates (think of the number of ways of choosing the
microstate of each half independently). So Ω isn’t extensive. But ln Ω is. So if we make the
connection
S = kB ln Ω (2.5)
then we can understand both entropy and its increase.
This expression is due to Boltzmann; it is known as the Boltzmann entropy to distinguish
it from a more general expression we will meet later, and it is inscribed on his grave in Vienna
(though with W for Ω).
In principle, if the increase of entropy is just a probabilistic thing, it might sometimes
decrease. However we will see that for macroscopic systems the odds are so overwhelmingly
against an observable decrease that we might as well say it will never happen.
What is kB , Boltzmann’s constant? It must be a constant with dimensions of entropy,
Joules/Kelvin, and it turns out that the correct numerical correspondence is given by the gas
constant R divided by Avogadro’s number:
kB = R/NA = 1.381 × 10^−23 J K^−1 = 8.617 × 10^−5 eV K^−1 (2.6)
This is validated, for instance, by the derivation of the ideal gas law (see section 2.4.2).
As an example, let’s consider the checkerboard example again. Imagine starting with a
perfectly ordered all-blue board, then choosing a counter at random, tossing it, and replacing
it. After repeating this a few times, there are highly likely to be some green counters on the
board—the chance of the board remaining all blue is only about 1 in 2^n after n moves. As time
goes on, the number of greens will almost certainly increase—not on every move, but over the
course of a few moves. Here is a snapshot of the board taken once every 10 moves. The number
of greens is 0, 3, 5, 9, 12, 15, 15, 17, 18.
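The tossing process described above is easy to simulate (arbitrary seed; snapshots every 10 moves as in the text):

```python
import random

random.seed(0)
board = [0] * 36          # 0 = blue, 1 = green; start with a perfectly ordered board
greens = []
for move in range(1, 91):
    i = random.randrange(36)          # choose a counter at random
    board[i] = random.randint(0, 1)   # toss it and replace it
    if move % 10 == 0:
        greens.append(sum(board))     # snapshot every 10 moves
# the number of greens drifts up towards ~18, not on every move but over a few moves
```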
Here is a graph of the number of greens over 100 and 1000 moves.
[Graphs: number of green counters against move number, over 100 moves (left) and 1000 moves (right).]
We see that, after the first 100 moves, the system stayed within 18 ± 6 almost all of the
time. These fluctuations are quite large in percentage terms, ±33%, but then it is a very small
system—not really macroscopic at all.
If we now look at a larger system, 30 × 30, we see that fluctuations are still visible, but they
are much smaller in percentage terms–the number of greens is mostly 450 ± 30, or ±7%.
[Graphs: number of green counters against move number for the 30 × 30 board.]
A 25-fold increase in the size of the system has reduced the percentage fluctuations by a
factor of 5. We will see later that an n-fold increase should indeed reduce the fluctuations
by √n. We can predict that a system with 10^23 counters—truly macroscopic—would have
fluctuations of only about 10^−10 %, which would be quite unobservable. The entropy of the
system would never appear to decrease.
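The same binomial counting that gives these percentages (standard deviation √N/2 about a mean of N/2) can be packaged in a small helper that reproduces the numbers above:

```python
from math import sqrt

def pct_fluctuation(N):
    """Two-sigma fractional fluctuation of the count, in percent, for sigma = sqrt(N)/2."""
    sigma = sqrt(N) / 2
    mean = N / 2
    return 100 * 2 * sigma / mean   # = 200/sqrt(N)

# pct_fluctuation(36)   -> ~33%     (the 6x6 board, 18 +/- 6)
# pct_fluctuation(900)  -> ~7%      (the 30x30 board, 450 +/- 30)
# pct_fluctuation(1e23) -> ~1e-10 % (a truly macroscopic system)
```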
• Mandl 2.1
• Bowley and Sánchez 4.3
• Kittel and Kroemer 1,2
[Figures: the possible total moments m = 3µ, µ, −µ, −3µ for N = 3 spins; and the normalised Ω(n) against n/N for N = 10, 100 and 10^4, showing the peak sharpening as N increases.]
As N gets larger, the function is more and more sharply peaked, and it is more and more
likely that in the absence of an external magnetic field there will be equal numbers of up and
down spins, giving zero magnetisation.
For large N , the curve is very well approximated by a Gaussian (see examples sheet),
Ω(n) ∝ e^{−(n−N/2)²/(N/2)} (2.10)
with a mean of N/2 and a standard deviation σ = √N/2. Thus the fractional size of fluctuations
goes as 1/√N.
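One can check the Gaussian approximation (2.10) directly against exact binomial counting; here for N = 100 spins (the normalisation 1/√(πN/2) follows from σ = √N/2):

```python
from math import comb, exp, sqrt, pi

N = 100
n = 60  # a point about two sigma from the mean

binom = comb(N, n) / 2**N  # exact probability of n up-spins
# eq (2.10), normalised: (1/sqrt(pi*N/2)) * exp(-(n - N/2)**2 / (N/2))
gauss = exp(-(n - N / 2) ** 2 / (N / 2)) / sqrt(pi * N / 2)
# the two agree to better than 1% already at N = 100
```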
Since the probabilities of various sizes of fluctuations from the mean in a Gaussian are
known, we can show that in a macroscopic system 100σ deviations are vanishingly unlikely,
with a probability of 1 in 10^2173. Even such deviations would be undetectable (only one part in
10^10 for a system of 10^24 particles), so the macroscopic magnetisation is very well defined indeed.
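The quoted odds can be reproduced from the asymptotic Gaussian tail formula P(|x| > kσ) ≈ 2e^(−k²/2)/(k√(2π)), working in logarithms to avoid underflow:

```python
from math import log, log10, sqrt, pi

k = 100.0  # a 100-sigma deviation
# log10 of the two-sided Gaussian tail probability, via the asymptotic expansion
log10_P = log10(2 / (k * sqrt(2 * pi))) - (k**2 / 2) / log(10)
# log10_P is about -2173.6, i.e. roughly 1 chance in 10**2173
```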
Furthermore the number of microstates of the whole system is dominated by those states
where the numbers of up and down spins are approximately equal. In comparison, there are so
few of the rest we can ignore them.
The fact that the binomial distribution tends to a Gaussian as the number becomes very
large is an example of the central limit theorem. It is not restricted to the binomial distribution:
for any N independent identically-distributed random variables with mean µ and standard
deviation σ, for large N the distribution of their sum is a Gaussian of mean Nµ and standard
deviation √N σ. We won’t prove it here, but the examples ask you to test it for a particular
example.
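A quick test of the central limit theorem for uniform random variables (µ = 1/2, σ = 1/√12; seed and sample sizes arbitrary), checking the predicted mean Nµ and standard deviation √N σ of the sum:

```python
import random
from math import sqrt

random.seed(2)
N, trials = 1000, 2000
sums = [sum(random.random() for _ in range(N)) for _ in range(trials)]

mean_of_sums = sum(sums) / trials                          # predicted: N*mu = 500
var = sum((s - mean_of_sums) ** 2 for s in sums) / trials
std_of_sums = sqrt(var)                                    # predicted: sqrt(N)*sigma ~ 9.13
```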
• Mandl 2.4
• (Bowley and Sánchez 2.9)
[Figure: a system divided in two by a movable, permeable wall, with energies E1, E2, volumes V1, V2, and particle numbers N1, N2.]
In the last section we deduced the existence of entropy, and the fact that at equilibrium the
entropy is a maximum, from statistical arguments. Now we would like to know if we could go
further and, even if we knew no classical thermodynamics, deduce the existence of temperature,
pressure and chemical potential.
By considering two systems in contact with one another we can indeed deduce the existence
of properties which determine whether they are in thermal, mechanical and diffusive equilibrium
even if we knew no classical thermodynamics: these are the three partial derivatives of the
entropy with respect to energy, volume and particle number.
Consider a system divided in two by a wall which can move, and through which energy
and particles can pass. The equilibrium division of the space into two volumes V1 and V2 ,
with energy and particle number similarly divided, will be the one which corresponds to the
maximum number of microstates, and hence to the maximum entropy. If we consider heat flow
only,
dS = (∂S/∂E1)_{V1,N1} dE1 + (∂S/∂E2)_{V2,N2} dE2 (2.11)
But the microstates of each half can be counted independently, so the entropies add:
S = S1(E1, V1, N1) + S2(E2, V2, N2). Since the total energy is fixed, dE2 = −dE1,
and the entropy will be maximised when a small energy flow no longer changes the entropy:
(∂S1/∂E1)_{V1,N1} = (∂S2/∂E2)_{V2,N2} (2.14)
So we deduce there is some property of bodies which governs heat flow; this is clearly
related to temperature. By considering volume changes and particle flow we discover two
more properties which are clearly related to pressure and chemical potential. To discover the
relation we would have to calculate them for some system and see how they compared with
the temperature, pressure etc of classical thermodynamics. However the following assignments
clearly work:
(∂S/∂E)_{V,N} = 1/T, (∂S/∂V)_{E,N} = P/T, (∂S/∂N)_{E,V} = −µ/T (2.15)
with n↑ = ½(N − E/µB).
For large numbers of spins, we can use Stirling’s approximation:
ln n! = n ln n − n (2.19)
giving
S = kB (N ln N − n↑ ln n↑ − (N − n↑ ) ln(N − n↑ )) . (2.20)
(Note that S is maximal, S = N kB ln 2, when n↑ = n↓ = N/2, the point of maximum disorder.)²
So
1/T = (∂S/∂E)_{B,N} = (∂S/∂n↑)_N (∂n↑/∂E)_{B,N} = (kB/2µB) ln(n↑/n↓) (2.21)
Differentiating the entropy with respect to B instead, and using the above result for T , we
get an expression for m:
m/T = −(kB E/2µB²) ln(n↑/n↓) ⇒ m = −E/B
We knew that of course, but it’s good that it works.
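The closed form (2.21) can also be checked numerically against a finite-difference derivative of the Stirling entropy (2.20), in units where kB = 1 and µB = 1 (an arbitrary choice for the check):

```python
from math import log

N = 10**6   # number of spins
muB = 1.0   # mu*B in our units; k_B = 1

def S(n):
    """Stirling entropy (2.20) as a function of the number of up-spins n."""
    return N * log(N) - n * log(n) - (N - n) * log(N - n)

def n_up(E):
    """Number of up-spins at energy E: n_up = (N - E/(mu B))/2."""
    return 0.5 * (N - E / muB)

E = -2.0e5  # more spins up than down, so T > 0
dE = 1.0
invT_numeric = (S(n_up(E + dE)) - S(n_up(E - dE))) / (2 * dE)   # central difference for dS/dE
invT_formula = (1 / (2 * muB)) * log(n_up(E) / (N - n_up(E)))   # eq (2.21)
# the two agree to high accuracy
```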
We note that for T to be positive there have to be more spins aligned with the field than
against it. We will come back to this point later. The paramagnet is unusual in that there
is a maximum entropy when n↑ = n↓ , corresponding to T → ∞. Most systems can increase
their entropy indefinitely as the temperature increases. But for large N , so long as the energy
above the ground state Eexcite is small (n↓ ≪ n↑, Eexcite ≪ N µB/2), the number of microstates
of the paramagnet rises extremely rapidly with energy, and hence the entropy rises too. If
we have two paramagnetic samples sharing a fixed energy between them, then the number of
microstates of the whole system is Ω12 = Ω1 (N1 , E1 )Ω2 (N2 , E − E1 ). Ω1 rises with energy E1
but Ω2 falls. The above plot shows the two, together with the number of microstates of the
whole system as a function of E1 (for N1 = N2 = 105 and Eexcite = 2000µB). Ω12 is sharply
peaked at the point where the energy is shared equally between the two systems, which means
in this case that their temperature is the same. The system is overwhelmingly likely to be in
one of the microstates in the immediate vicinity of this maximum, where the entropy of the
combined system is maximised. Though a toy model, this basically illustrates what is going on
every time two macroscopic bodies reach thermal equilibrium.
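The sharply peaked product Ω12 described above can be reproduced with log-binomials (smaller samples than in the plot, and the energy parametrised by the number of excited spins, but the same physics):

```python
from math import lgamma

def lnC(N, n):
    """ln of the binomial coefficient C(N, n), via log-gamma to avoid overflow."""
    return lgamma(N + 1) - lgamma(n + 1) - lgamma(N - n + 1)

N1 = N2 = 1000   # two identical paramagnetic samples
n_tot = 200      # total number of excited (flipped) spins, shared between them

# ln Omega_12(n1) = ln Omega_1(n1) + ln Omega_2(n_tot - n1)
lnOmega12 = [lnC(N1, n1) + lnC(N2, n_tot - n1) for n1 in range(n_tot + 1)]
peak = max(range(n_tot + 1), key=lnOmega12.__getitem__)
# peak == 100: the overwhelmingly likely division shares the energy equally,
# i.e. the two identical samples end up at the same temperature
```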
2
This is actually slightly surprising, because it seems to imply Ω = 2^N – but that is ALL microstates. For
large N, the distribution is so sharply peaked that the overwhelming majority of the microstates ARE at the
peak! Including subleading terms in Stirling’s approximation gives Smax less than Stotal by a term of order log N,
which allows for a peak width of order √N.
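The sharpness of the peak in the two-sample argument is easy to see numerically. A minimal sketch with illustrative numbers (much smaller than the N1 = N2 = 10^5 used for the plot in the notes):

```python
import math

# Small-scale numerical version of the two-paramagnet argument.
# Each sample's microstate count at fixed energy is a binomial coefficient.
N1 = N2 = 1000
n_total = 200                        # total number of excited (flipped) spins

def ln_omega(N, n):                  # ln C(N, n), via log-gamma to avoid overflow
    return math.lgamma(N + 1) - math.lgamma(n + 1) - math.lgamma(N - n + 1)

ln_O12 = [ln_omega(N1, n1) + ln_omega(N2, n_total - n1)
          for n1 in range(n_total + 1)]
peak = max(range(n_total + 1), key=lambda n1: ln_O12[n1])
assert peak == n_total // 2          # Omega_12 peaks at equal energy sharing
# Even 20 quanta away from the peak, microstates are already much rarer:
assert math.exp(ln_O12[peak] - ln_O12[peak - 20]) > 10
```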
2.4.2 The ideal gas, first attempt
To do the ideal gas properly we need to know the quantum states of particles in a box. We
learned this in PHYS20101 last semester, so we will be able to tackle it properly later. However
with much less work we can at least discover how the entropy depends on volume at fixed particle
number and energy.
Consider an isolated system of N atoms in a box of volume V . Imagine the box subdivided
into many tiny cells of volume ∆V , so that there are V /∆V cells in all (this number should be
much greater than N ). Now each atom can be in any cell, so there are V /∆V microstates for
each atom, and (V /∆V )N microstates for the gas as a whole. Thus
S = N kB ln(V /∆V ). (2.22)
As a result the change in entropy when the volume changes (at fixed energy) is
∆S = N kB ln(Vf /Vi ). (2.23)
This is exactly what we found from classical thermodynamics, A.7, and incidentally it validates
the kB in S = kB ln Ω. From the entropy we have
P/T = (∂S/∂V)_{E,N}  ⇒  P = N kB T /V  ⇒  P V = N kB T (2.24)
Take-home message: For macroscopic systems results can be more easily obtained
by regarding the temperature, rather than the energy, as fixed.
In principle, with the tools of the last section we could tackle all the problems we want now.
But it turns out to be hard to calculate the entropy of any isolated system more complicated
than an ideal paramagnet. This is because in an isolated system the energy is fixed, and it
becomes complicated to work out all the possible ways the total energy can be split between
all the atoms of the system: we can’t treat each atom as independent of all the others, even if
they are non-interacting.
We don’t have to consider isolated systems though. In this section we will consider systems
in contact with a heat bath, so that their temperature, rather than their energy, is constant.
This has the advantage that if the atoms of a system don’t interact with one another, they can
be treated independently.
For a macroscopic system, there is very little difference in the results from the two ap-
proaches. If the temperature is held √ constant the energy will fluctuate, but the fractional size
of the fluctuations decreases as 1/ N and so, from a macroscopic point of view, the energy
does not appear to vary and it makes little difference whether the heat bath is there or not. So
lots of results we obtain in this section are also applicable to isolated, macroscopic systems.
We will introduce something called the partition function from which we can calculate the
energy, pressure etc. The heart of the partition function is the Boltzmann distribution,
already met last year, which gives the probability that a particle in contact with a heat bath
will have a given energy.
• Mandl 2.5
• (Bowley and Sánchez 5.1)
• Kittel and Kroemer 3
If a system is in contact with a heat bath at temperature T , the probability that it is in the
ith microstate, with energy εi , is given by the Boltzmann distribution.
The details of the derivation are as follows. We consider a system S in contact with a heat
reservoir R, the whole forming a single isolated system with energy E0 which is fixed but which
can be differently distributed between its two parts. We can apply what we already know about
isolated systems to the whole, to obtain information about the probabilities of the microstates
of S.
[Figure: a system S, with energy ES = ε, inside a heat reservoir R with energy ER = E0 − ε.]
Heat can be exchanged between the system and reservoir, and the likelihood of a particular
partition depends on the number of microstates of the whole system S + R corresponding to
that partition. (The equilibrium partition will be the one which maximises the number of
microstates, but that is not what we are interested in here.) Since the system and reservoir are
independent, the total number of microstates factorises: Ω = ΩS ΩR
Now suppose we specify the microstate of S that we are interested in, say the ith (with
energy εi), and ask: what is the probability pi of finding S in it? This is proportional to the
number of microstates of the reservoir with the remaining energy, ΩR(E0 − εi), and hence to
exp[SR(E0 − εi)/kB]. Expanding, SR(E0 − εi) = SR(E0) − εi (∂SR/∂E) + ½ εi² (∂²SR/∂E²) + . . . ,
where the derivatives are evaluated at E0. But the derivative of S with respect to E is just the
inverse of the temperature. Dropping the third term as negligibly small,1
pi = e^{−εi/kB T}/Z (3.4)
To recap: if we specify the microstate (with a given energy) of S, the probability of this
depends of the number of microstates of the reservoir with the remaining energy. This decreases
with decreasing reservoir energy in just the way given by the Boltzmann distribution.
For an ideal gas or paramagnet, where interactions between atoms can be ignored, any
individual particle can be considered as the system S. In that case the Boltzmann distribution
holds for the state of an individual atom (hence typical first-year applications like the variation
of pressure with height in the atmosphere, and the distribution of velocities of atoms in a gas).
For the spin-½ paramagnet in a magnetic field B there are only two energy states: ε↑ = −µB
and ε↓ = µB. So
p↑ = e^{µB/kB T}/Z1 ,   p↓ = e^{−µB/kB T}/Z1 ,   and Z1 = e^{µB/kB T} + e^{−µB/kB T} (3.6)
(The label on Z1 refers to the fact that we are talking about the state of a single particle.)
In the whole system of N atoms, the number of up-spins on average will be ⟨n↑⟩ = N p↑, so
we have
⟨n↓⟩/⟨n↑⟩ = e^{−2µB/kB T} (3.7)
This is exactly consistent with the expression we found for the temperature of the isolated
system with a fixed number of up-spins (and hence energy).
[Plot: ⟨n↓⟩/⟨n↑⟩ against T (in units of µB/kB), rising from 0 towards 1.]
Note that in thermal equilibrium, the average number of particles in the higher energy state
is always less than the number in the lower energy state. As the temperature tends to infinity
the ratio approaches, but never exceeds, one.
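A small numerical illustration of eq. (3.7), taking µ equal to the Bohr magneton and an assumed field of 1 T (both illustrative choices, not from the notes):

```python
import math

kB = 1.380649e-23      # Boltzmann constant, J/K
mu = 9.274e-24         # Bohr magneton, J/T (assumed value of the moment)
B = 1.0                # assumed field, tesla

def populations(T):
    up, dn = math.exp(mu*B/(kB*T)), math.exp(-mu*B/(kB*T))
    Z1 = up + dn
    return up/Z1, dn/Z1

for T in (0.1, 1.0, 10.0, 1000.0):
    p_up, p_dn = populations(T)
    assert p_dn < p_up                 # lower level always more populated
    # the ratio agrees with eq. (3.7) and approaches (never exceeds) one
    assert abs(p_dn/p_up - math.exp(-2*mu*B/(kB*T))) < 1e-9
```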
3.2 The Partition Function
• Mandl 2.5
• Bowley and Sánchez 5.2
• Kittel and Kroemer 3
Take-home message: Far from being an uninteresting normalisation constant, Z
is the key to calculating all macroscopic properties of the system!
The normalisation constant in the Boltzmann distribution is also called the partition function:
Z = Σj e^{−εj/kB T} (3.8)
The average energy of the system is
⟨E⟩ = Σi εi pi = (Σi εi e^{−εi/kB T}) / (Σj e^{−εj/kB T}) (3.9)
The top line is like the bottom line (the partition function) except that each term is multiplied
by εi. We can get the top line from the bottom by differentiating by “1/(kB T)”. This is a bit
awkward, so we introduce a new symbol
β ≡ 1/(kB T) (3.10)
giving
⟨E⟩ = −(1/Z) (∂Z/∂β)_{N,V} (3.11)
or
⟨E⟩ = −∂ln Z/∂β (3.12)
(where, contrary to the strict instructions given earlier, we will take it for granted that it is
the particle number and the volume or magnetic field that we are holding constant.)
When we calculate averages using the Boltzmann probabilities, at fixed temperature, the
corresponding ensemble is called the canonical ensemble.
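A quick numerical check of eq. (3.12) for an arbitrary, made-up set of microstate energies, in units with kB = 1:

```python
import math

# Check <E> = -d(ln Z)/d(beta) against the direct Boltzmann-weighted average.
energies = [0.0, 0.5, 1.3, 2.0]          # illustrative microstate energies (kB = 1)
beta = 2.0

def lnZ(b):
    return math.log(sum(math.exp(-b*e) for e in energies))

# direct ensemble average with Boltzmann weights
Z = sum(math.exp(-beta*e) for e in energies)
E_avg = sum(e*math.exp(-beta*e) for e in energies) / Z

# numerical derivative of ln Z with respect to beta
h = 1e-6
E_from_lnZ = -(lnZ(beta + h) - lnZ(beta - h)) / (2*h)
assert abs(E_avg - E_from_lnZ) < 1e-8
```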
From the energy we can find the heat capacity:
CV = (∂⟨E⟩/∂T)_{V,N}. (3.13)
2
Recall that the microstate is of the whole system, and so the energies εi are not necessarily small; for a
mole of gas they could be of the order of kJ. It would be easy to forget this, because in a lot of the applications,
like the last one, we will in fact apply it to single spins and particles with energies of the order of eV or smaller.
We have found the average energy, but there will be fluctuations as heat is randomly ex-
changed between the system and the heat bath. These are given by
(∆E)² = ⟨E²⟩ − ⟨E⟩² (3.14)
Using
⟨E²⟩ = (1/Z) (∂²Z/∂β²)_{N,V} (3.15)
(which should be obvious by analogy with the corresponding expression for ⟨E⟩) we obtain
(∆E)² = ∂²ln Z/∂β² = −(∂T/∂β)(∂⟨E⟩/∂T) = (kB T)² CV/kB (3.16)
For a normal macroscopic system the average energy is of the order of N kB T and the heat
capacity is of the order of N kB . Thus
∆E/E ≈ 1/√N (3.17)
For a system of 10²⁴ atoms, ∆E/E ≈ 10⁻¹² and so fluctuations are unobservable. There is
no practical difference between an isolated system of energy E and one in contact with a heat
bath at the same temperature.
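Eq. (3.16) is easy to verify numerically; here is a sketch for a single two-level spin, in units with kB = µB = 1 (so the levels sit at ∓1):

```python
import math

# Verify (Delta E)^2 = kB*T^2*CV for a two-level system (levels -1 and +1, kB = 1).
def averages(beta):
    Z = math.exp(beta) + math.exp(-beta)
    E = (-math.exp(beta) + math.exp(-beta)) / Z   # <E> = -tanh(beta)
    return E, 1.0 - E*E                           # <E^2> = 1 in both states

beta = 0.7
E, varE = averages(beta)
# CV = dE/dT = -beta^2 * dE/dbeta, since T = 1/beta with kB = 1
h = 1e-6
dE_dbeta = (averages(beta + h)[0] - averages(beta - h)[0]) / (2*h)
CV = -beta**2 * dE_dbeta
T = 1/beta
assert abs(varE - T*T*CV) < 1e-8
```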
There are exceptions. Near a critical point—where the distinction between two phases
disappears—the heat capacity becomes very large and the fluctuations do too. This can be
observed as “critical opalescence” where the meniscus between the liquid and gas phases disap-
pears and the substance becomes milky and opaque and scatters light. A video and explanation
of an analogous phenomenon can be found here.
• Mandl 2.5
• (Bowley and Sánchez 5.3-6)
• Kittel and Kroemer 3
We can’t use an ensemble average directly for the entropy, because it doesn’t make sense
to talk about the entropy of a microstate. But we can talk about the entropy of the ensemble
since the many copies can be in many different microstates. So we define the entropy of the
system as the entropy of the ensemble divided by the number of copies, ν, in the ensemble:
hSi = Sν /ν.
The ensemble has νi copies in the ith microstate, so the number of ways of arranging these
is
Ων = ν!/(ν1! ν2! ν3! . . .) (3.18)
(compare the ways of arranging counters on the chequerboard).
So, using Stirling’s approximation,
ln Ων = ν ln ν − ν − Σi (νi ln νi − νi)
      = Σi νi (ln ν − ln νi)    (using ν = Σi νi in two places)
      = −Σi νi ln(νi/ν)
      = −ν Σi pi ln pi (3.19)
so that
⟨S⟩ = (kB ln Ων)/ν = −kB Σi pi ln pi (3.20)
This expression is called the Gibbs entropy. (Note that as all pi lie between 0 and 1, the
entropy is positive.)
Note that we have not said anything about what distribution the probabilities pi follow.
For an isolated system, pi = 1/Ω for each of the Ω allowed microstates, giving the Boltzmann
entropy S = kB ln Ω as before. For a system in contact with a heat bath, pi is given by the
Boltzmann distribution, so
⟨S⟩ = −kB Σi pi ln pi
    = −kB Σi pi (−εi β − ln Z)
    = kB (⟨E⟩ β + ln Z)
    = ⟨E⟩/T + kB ln Z (3.21)
Rearranging we get kB T ln Z = −⟨E⟩ + T ⟨S⟩ = −F where F is the Helmholtz free energy,
or
F = −kB T ln Z. (3.22)
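A numerical check of eqs. (3.21) and (3.22) for an arbitrary made-up level scheme (units with kB = 1):

```python
import math

# The Gibbs entropy of the Boltzmann distribution equals <E>/T + kB ln Z,
# and F = -kB*T*ln Z equals <E> - T*<S> (illustrative levels, kB = 1).
energies = [0.0, 1.0, 1.0, 2.5]      # arbitrary microstates (a degenerate pair)
T = 0.8
beta = 1/T

Z = sum(math.exp(-beta*e) for e in energies)
p = [math.exp(-beta*e)/Z for e in energies]
S_gibbs = -sum(pi*math.log(pi) for pi in p)
E_avg = sum(pi*e for pi, e in zip(p, energies))
assert abs(S_gibbs - (E_avg/T + math.log(Z))) < 1e-12
F = -T*math.log(Z)
assert abs(F - (E_avg - T*S_gibbs)) < 1e-12
```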
Since F = E − T S, from the fundamental thermodynamic relation we obtain dF = −SdT −
P dV + µdN . Thus
S = −(∂F/∂T)_{V,N} ,   P = −(∂F/∂V)_{T,N} ,   µ = (∂F/∂N)_{T,V} (3.23)
(You first met these in the derivation of Maxwell’s relations.) For a magnetic system, we have
m = − (∂F/∂B)T,N instead of the equation for P.
Remember, Z and hence F depend on V (or B) through the energies of the microstates. For
instance the energy levels of a particle in a box of side L are proportional to h̄²/(mL²) ∝ V^(−2/3).
These relations are reminiscent of those we met in the case of an isolated system, but there
the entropy was the key; here it is the Helmholtz free energy. We can make the following
comparison:
It should not surprise us to find that the Helmholtz free energy is the key to a system at
fixed temperature (in contrast to the entropy for an isolated system) as that is what we found
classically (see here.)
system                       isolated                    in contact with heat bath
fixed                        E, N, V (or B)              T, N, V (or B)
key microscopic function     number of microstates Ω     partition function Z
key macroscopic function     S = kB ln Ω                 F = −kB T ln Z
It is left as an exercise to show that we recover the Boltzmann distribution and to identify the
two Lagrange multipliers in terms of β and Z.
Finally the grand canonical distribution has microstates with variable particle number and
we fix the average N ; we obtain the Gibbs distribution which we will meet in the next chapter,
and the third Lagrange multiplier is related to the chemical potential.
In the context of information theory, the procedure of maximising S subject to the
constraints is used as a way of finding an unbiased prior estimate of the probabilities. (It is not
always possible to conduct infinite numbers of trials in order to assign probabilities in proportion
to the frequencies!)
• Mandl 3
• (Bowley and Sánchez 5.7)
• (Kittel and Kroemer 3)
First, recap previous sections on the isolated spin-½ paramagnet at zero and non-zero magnetic
field.
The ideal paramagnet is a lattice of N sites at each of which the spin points either up or
down. Each of these has a magnetic moment ±µ. In an external field, these two states will
have different energy; spin-up has energy −µB, and spin-down, µB. As we saw previously the
partition function for a single atom is therefore
Z1 = e^{µB/kB T} + e^{−µB/kB T} = 2 cosh(µB/kB T) = 2 cosh(µBβ) (3.27)
(Remember β = 1/kB T .)
Since the atoms are non-interacting, the total energy and magnetisation of the system are
just N times the average energy and magnetisation of a single spin. The energy is
⟨E⟩ = −N ∂ln Z1/∂β = −N µB tanh(µBβ) (3.28)
[Plot: ⟨E⟩ against kB T/µB, rising from −N µB towards zero.]
The energy tends to −N µB as T → 0, and changes most rapidly
when kB T is of the order of µB. As T gets very large, the energy tends to zero as the number
of up and down spins become more nearly equal. Remember, ⟨n↓⟩/⟨n↑⟩ = exp(−2µB/kB T), so
it never exceeds one.
We can also calculate the heat capacity:
CV = ∂⟨E⟩/∂T = N kB (µBβ)² sech²(µBβ) (3.29)
[Plot: CV against T (in units of µB/kB), rising to a peak and falling to zero at both high and low T.]
We see that the heat capacity tends to zero both at high and low T . At low T the heat
capacity is small because kB T is much smaller than the energy gap 2µB, so thermal fluctuations
which flip spins are rare and it is hard for the system to absorb heat. This behaviour is universal;
quantisation means that there is always a minimum excitation energy of a system and if the
temperature is low enough, the system can no longer absorb heat.
The high-T behaviour arises because the number of down-spins never exceeds the number of
up-spins, and the energy has a maximum of zero. As the temperature gets very high, that limit
is close to being reached, and raising the temperature still further makes very little difference.
This behaviour is not universal, but only occurs where there is a finite number of energy levels
(here, there are only two). Most systems have an infinite tower of energy levels, there is no
maximum energy and the heat capacity does not fall off.
[Figure: level occupations for kB T ≫ µB (populations nearly equal) and kB T ≪ µB (almost all spins in the lower level).]
Up to now we’ve cheated a bit (though the results are correct), in that we didn’t calculate
the partition function for the whole system, only for a single spin. It is easy to show however
that the partition function for N non-interacting spins on a lattice is
ZN = (Z1 )N (3.30)
Let’s start with a system that has two single-particle energy levels, ε1 and ε2 . The single-
particle partition function is
Z1 = e−ε1 β + e−ε2 β . (3.31)
The partition function for two distinguishable particles is
Z2 = e^{−2ε1 β} + 2 e^{−(ε1+ε2)β} + e^{−2ε2 β} = (Z1)² (3.32)
where the second term is multiplied by 2 because there are two ways that two distinguishable
particles can be in different levels.
In general, for N particles, the energies are nε1 + (N − n)ε2, for 0 ≤ n ≤ N, and there are
N!/n!(N − n)! separate microstates of this energy. So
ZN = Σ_{n=0}^{N} [N!/n!(N − n)!] e^{−(nε1+(N−n)ε2)β}
   = Σ_{n=0}^{N} [N!/n!(N − n)!] e^{−nε1 β} e^{−(N−n)ε2 β}
   = Σ_{n=0}^{N} [N!/n!(N − n)!] (e^{−ε1 β})^n (e^{−ε2 β})^{N−n} = (Z1)^N (3.33)
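The binomial sum in eq. (3.33) can be checked directly (with illustrative values of ε1, ε2, β and N):

```python
import math

# Check eq. (3.33): the sum over microstates, grouped by the number n of
# particles in level 1, equals (Z1)^N for distinguishable particles.
e1, e2, beta, N = 0.0, 1.0, 0.7, 20
Z1 = math.exp(-e1*beta) + math.exp(-e2*beta)

ZN = sum(math.comb(N, n) * math.exp(-(n*e1 + (N - n)*e2)*beta)
         for n in range(N + 1))
assert abs(ZN - Z1**N) / Z1**N < 1e-12
```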
[Plots: magnetisation and entropy against T, for fields B0 and 2B0.]
At zero temperature, the magnetisation goes to N µ: all the spins are up. There is no disorder, and so
the entropy is zero.
The stronger the field, the higher the temperature has to be before the spins start to be
appreciably disordered.
At high temperatures the spins are nearly as likely to be up as down; the magnetisation
falls to zero and the entropy reaches a maximum. The entropy of this state is N kB ln 2, as we
have already seen.
There is a caveat to the formula ZN = (Z1 )N . The argument says that there are a number of
different microstates with the same number of up and down spins. Since the spins are arranged
on a lattice, this is correct; every spin can be distinguished from every other spin by its position.
When we go on to consider a gas, however, this is no longer so, and the relation between Z1
and ZN changes. The treatment for indistinguishable particles is here.
Returning to the expression for the partition function for two distinguishable two-state
particles, (3.32), we see that two microstates have the same energy ε1 + ε2 and instead of listing
them twice we have written it once and multiplied by two. We say that this energy is “doubly
degenerate” or has a degeneracy of two. In general we can write the partition function as a
sum over distinct energies εn rather than over all microstates if we include the degeneracies gn:
Z = Σj e^{−εj β} = Σn gn e^{−εn β}.
The first sum runs over microstates, the second over energies.
• Mandl 5.6
By magnetising and demagnetising a paramagnetic sample while controlling the heat flow,
we can lower its temperature.
[Figure: S against T, with curves for fields B1 and B2 > B1; the path runs from a (on the B1 curve at T1) isothermally to b (on the B2 curve), then at constant S to c (on the B1 curve at a lower T).]
We start with the sample in a magnetic field B1 at an (already fairly low) temperature T1 .
a→ b: With the sample in contact with a heat bath at T1 , we increase the magnetic field
to B2 .
b→ c: With the sample now isolated, we slowly decrease the field to B1 again. This is
the adiabatic demagnetisation step; because the process is slow and adiabatic, the entropy is
unchanged.
By following these steps on a T − S plot, we see that the second, constant entropy, step,
reduces the temperature. The entropy is a function of B/T only, not B or T separately (see
here) so if we reduce B at constant S, we reduce T also.
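The cooling step can be sketched in code, using the standard result (derivable from F = −N kB T ln Z1) that S/N kB = ln(2 cosh x) − x tanh x with x = µB/(kB T); units with kB = µ = 1:

```python
import math

# Entropy per spin of the ideal paramagnet depends only on x = B/T here.
def s(B, T):
    x = B/T
    return math.log(2*math.cosh(x)) - x*math.tanh(x)

B1, B2, T1 = 1.0, 3.0, 0.5           # illustrative fields and start temperature
# a -> b: isothermal increase of the field to B2 at T1.
# b -> c: adiabatic (constant S) reduction back to B1; since S depends on
# B/T only, the final temperature is T2 = T1*B1/B2.
T2 = T1 * B1 / B2
assert abs(s(B2, T1) - s(B1, T2)) < 1e-12   # entropy unchanged in step b -> c
assert T2 < T1                              # the sample has cooled
```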
The following figure shows what is happening to the spins.
[Figure: the two levels, with spacing 2µB1 at temperature T1 (a); isothermal magnetisation increases the spacing to 2µB2 at T1 (b); adiabatic demagnetisation returns the spacing to 2µB1 with the occupations unchanged, corresponding to the lower temperature T2 (c).]
In the first step we increase the
level spacing while keeping the temperature constant, so the population of the upper level falls.
In the second step we reduce the level spacing again, but as the spins are isolated there is no
change in level occupation. The new, lower level occupation is now characteristic of a lower
temperature than the original one.
If we start with a large sample, we could repeat the process with a small sub-sample, the
remaining material acting as a heat bath during the next magnetisation. By this method
temperatures of a fraction of a Kelvin can be reached. However after a few steps less and
less is gained each time, as the curves come together as T → 0. (Once the electron spins are
all ordered, one can start to order the nuclear spins and reach even lower temperatures, since
the magnetic moment of the nucleus is around a two-thousandth of that of the atom; but even
that has its limits.)
This is an important and general result. There is always a minimum excitation energy ε
of the system, and once kB T ≪ ε there is no further way of lowering the temperature. The
unattainability of absolute zero is the third law of thermodynamics.
In the process above, the lowest temperature attainable is obviously proportional to µB1 /kB .
You might wonder why we can’t just take B1 → 0. But in any real paramagnet, there is a weak
coupling between the spins which means that they prefer to be aligned with one another. If
we remove the external field, this coupling acts like a weak internal field, and at low enough
temperatures the spins will still be ordered. The strength of this coupling is then what governs
the lowest attainable temperature.
[Plots: S against T for two values of B, each rising to the saturation value N kB ln 2.]
So far we have only looked at two-level systems such as the paramagnet. More usually there
are many or even infinitely many levels, and hence terms in the partition function. In some
special cases the partition function can still be expressed in closed form.
Vibrational energy of a diatomic molecule
The energy levels of a quantum simple harmonic oscillator of frequency ω are
εn = (n + ½)h̄ω,  n = 0, 1, 2 . . . (3.35)
so
Z1 = Σ_{n=0}^{∞} e^{−εn β} = e^{−h̄ωβ/2} (e^0 + e^{−h̄ωβ} + e^{−2h̄ωβ} + . . .)
   = e^{−h̄ωβ/2}/(1 − e^{−h̄ωβ})
   = (2 sinh(h̄ωβ/2))^{−1} (3.36)
where we have used the expression for the sum of a geometric series, Σn x^n = (1 − x)^{−1}, with
x = e^{−h̄ωβ}.
From this we obtain
⟨E1⟩ = −∂ln Z1/∂β = (h̄ω/2) coth(h̄ωβ/2). (3.37)
The low temperature limit of this (kB T ≪ h̄ω; h̄ωβ → ∞) is h̄ω/2, which is what we expect
if only the ground state is populated. The high temperature limit (kB T ≫ h̄ω; h̄ωβ → 0) is
kB T, which should ring bells! (See here for more on limits.)
Typically the high temperature limit is only reached around 1000 K.
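Both limits of eq. (3.37) can be checked numerically, in units with h̄ω = kB = 1:

```python
import math

# <E1> = (1/2) coth(1/(2T)) in units with hbar*omega = kB = 1.
def E1(T):
    return 0.5 / math.tanh(0.5/T)

assert abs(E1(0.01) - 0.5) < 1e-9      # low T: zero-point energy hbar*omega/2
assert abs(E1(100.0) - 100.0) < 0.01   # high T: equipartition value kB*T
```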
Rotational energy of a diatomic molecule
The energy levels of a rigid rotor of moment of inertia I are
εl = l(l + 1)h̄²/2I,  l = 0, 1, 2 . . . (3.38)
but there is a complication; as well as the quantum number l there is ml, −l ≤ ml ≤ l, and
the energy doesn’t depend on ml . Thus the lth energy level occurs 2l + 1 times in the partition
function, giving
Z1 = Σ_{l=0}^{∞} Σ_{ml=−l}^{l} e^{−l(l+1)h̄²β/2I} = Σ_{l=0}^{∞} (2l + 1) e^{−l(l+1)h̄²β/2I}. (3.39)
The term 2l + 1 is called a degeneracy factor since “degenerate” levels are levels with the
same energy. (I can’t explain this bizarre usage, but it is standard.) For general β this cannot
be further simplified. At low temperatures successive terms in Z1 fall off quickly; only the
lowest levels will have any significant occupation probability and the average energy will tend
to zero.
At high temperatures (kB T ≫ h̄²/2I) there are many accessible levels and the fact that
they are discrete rather than continuous is unimportant; we can replace the sum over l with an
integral over l; changing variables to x = l(l + 1) gives
Z1 = 2I/(h̄²β)   and hence   ⟨E1⟩ = kB T (3.40)
Typically h̄²/2I is around 10⁻³ eV, so the high-temperature limit is reached well below room
temperature.
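A numerical check that the level sum (3.39) approaches 2I/(h̄²β) at high temperature, in units where h̄²/2I = kB = 1 (so the high-T limit is just T):

```python
import math

# Rotational partition function: direct sum over levels vs the high-T result.
def Z1_rot(T):
    return sum((2*l + 1) * math.exp(-l*(l + 1)/T) for l in range(2000))

T = 100.0                                 # kB*T >> hbar^2/2I
assert abs(Z1_rot(T) - T) / T < 0.01      # agrees with Z1 = 2I/(hbar^2 beta) = T
assert abs(Z1_rot(0.05) - 1.0) < 1e-12    # low T: only l = 0 contributes
```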
It is not an accident that the high-temperature limit of the energy was kB T in both cases!
These are examples of equipartition which is the subject of a future section.
This example is rather more complicated than the preceding ones, but the result is simple and
powerful.
The non-interacting atoms of the gas are in a cuboidal box of side lengths Lx , Ly and Lz ,
and volume V ≡ Lx Ly Lz . The sides of the box are impenetrable, so the wave function ψ must
vanish there, but inside the box the atom is free and so ψ satisfies the free Schrödinger equation
−(h̄²/2m) ∇²ψ(x, y, z) = Eψ(x, y, z). (3.41)
The equation, and the boundary conditions, are satisfied by
ψ(x, y, z) = A sin(nx πx/Lx) sin(ny πy/Ly) sin(nz πz/Lz) (3.42)
with energy ε(k) = h̄²k²/2m, (3.43)
where k² = kx² + ky² + kz² and kx = πnx/Lx etc. So the one-particle partition function is
Z1 = Σ_{nx,ny,nz} e^{−ε(nx,ny,nz)β}. (3.44)
Since the allowed values of k are very closely spaced, we can replace the sum by an integral:
Z1 = (Lx Ly Lz/π³) ∫₀^∞ ∫₀^∞ ∫₀^∞ dkx dky dkz e^{−ε(k)β}
   = (V/π³) ∫₀^∞ ∫₀^{π/2} ∫₀^{π/2} k² sin θk dk dθk dφk e^{−ε(k)β}    (converting to spherical polar coordinates)
   = (1/8) 4π (V/π³) ∫₀^∞ k² dk e^{−ε(k)β}
   ≡ ∫₀^∞ g(k) e^{−ε(k)β} dk    where g(k) ≡ V k²/2π² (3.45)
0
The factor of 1/8 in the penultimate line comes from the fact that we only integrated over
positive values of kx etc, that is over the positive octant of k-space. g(k) is called the density
of states in k-space; g(k)dk is the number of states within range of k → k + dk. See here for
more on this concept.
This section only depended on the fact that the energy is independent of the direction of k.
Now we use the actual form of ε(k) to complete the calculation:
Z1 = (V/2π²) ∫₀^∞ k² e^{−h̄²k²β/2m} dk = V (m/2πh̄²β)^{3/2} ≡ V nQ. (3.46)
Z1 is a pure number, so “nQ ” must have dimensions of 1/V like a number density; it is called the
quantum concentration and is temperature-dependent. From Z1 we can obtain the average
single particle energy:
⟨E1⟩ = −∂ln Z1/∂β = (3/2) kB T (3.47)
as we should have expected.
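As a consistency check (not part of the notes), the integral over g(k) with the non-relativistic ε(k) can be done numerically and compared with the closed form (3.46); units with h̄ = m = V = 1 and gs = 1:

```python
import math

# Numerically integrate g(k)*exp(-eps(k)*beta) and compare with V*nQ.
beta = 2.0

def integrand(k):
    # g(k) = k^2/(2 pi^2) (V = 1), eps(k) = k^2/2 (hbar = m = 1)
    return (k*k / (2*math.pi**2)) * math.exp(-k*k*beta/2)

# simple trapezoidal rule on a fine grid
n, kmax = 200000, 20.0
h = kmax/n
Z1_num = h*(sum(integrand(i*h) for i in range(1, n)) + 0.5*integrand(kmax))
Z1_exact = (1/(2*math.pi*beta))**1.5           # V*(m/(2 pi hbar^2 beta))^(3/2)
assert abs(Z1_num - Z1_exact)/Z1_exact < 1e-6
```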
[Figure: the allowed values of k form a lattice of points in the positive octant of k-space; the states with wave number between k and k + dk lie in the octant of a spherical shell.]
Going through the algebra to calculate the translational partition function we turned a sum
over the integers nx , ny and nz which count the number of half wavelengths along the three
sides, to an integral over k. Since the energy depends only on k = |k|, we could do the integral
over the direction of k leaving only the integral over k; in this process we collected a number
of factors and called them the density of states: g(k) = V k²/2π², so that
Z1 = ∫₀^∞ g(k) e^{−ε(k)β} dk (3.48)
We see that g(k) is acting as a “degeneracy factor”, which we first met in the context of the
rotor. If there is more than one energy level with the same energy, and we replace the sum over
individual states with a sum over allowed energies, we need to include a factor in front of the
Boltzmann factor for degenerate levels so that they are counted often enough.
The picture above shows a graphical representation of the allowed states in k-space. Since
k = (πnx/Lx , πny/Ly , πnz/Lz), (3.49)
with nx etc positive, the allowed values of k form a three-dimensional lattice. The density of
states is the number of states within an infinitesimal range of k, and hence of energy. This is
just the volume of an octant of a spherical shell, (1/8) × 4πk 2 × dk, divided by the volume of
k-space per state, π 3 /V , giving
g(k) dk = (V k²/2π²) dk. (3.50)
We will meet this expression many times in the rest of the course. Later, we will apply
it to systems of particles with spin s, where the full description of every single particle state
includes the spin projection ms , where ms takes on 2s + 1 values at integer steps between −s
and s. So for a spin-½ particle, ms = −½ or ½; for spin-1, ms = −1, 0 or 1. In these cases the
spatial states specified by (3.42) correspond to 2s + 1 distinct quantum states (all of the same
energy in the absence of a magnetic field) and so the density of states has an extra degeneracy
factor gs ≡ 2s + 1:
g(k) = gs V k²/2π². (3.51)
and
Z1 = gs V (mkB T/2πh̄²)^{3/2} ≡ V gs nQ. (3.52)
If expressions are given without the gs , they apply to spin-zero particles only. It can be restored
by nQ → gs nQ in any expression missing it.
Later on we will also use g(ε), where ε(k) is the energy of a particle with momentum h̄k.
g(ε) is defined so that the number of states with wave numbers between k and k + dk has to
be the same as the number with energies between ε(k) and ε(k + dk) = ε(k) + dε, i.e.
g(k) dk = g(ε) dε,   where dε = (dε/dk) dk
So in 3D, for non-relativistic particles with ε = h̄²k²/2m,
g(ε) = (gs V (2m)^{3/2}/4π²h̄³) ε^{1/2}.
Finally, a comment on notation: Mandl uses D(k) rather than g(k), as I did in the notes
for Thermal and Statistical Physics. Dr Galla used both! Strictly, g(k) is not a degeneracy but
a degeneracy density so a stricter notation would be something like dg/dk. That is what D(k)
is trying to indicate. The lecturers of PHYS30151, whose past exams you will be looking at in
due course, used dn/dk. But the symbol n is overworked in this course as it is! Warning: Dr
Xian in PHYS20252 uses g(ε) to mean a double density, the number of states per unit energy
and per unit (physical) volume. Hence in his notation there is no V in the density of states.
• Mandl 7.7
• Bowley and Sánchez 7.4
• Kittel and Kroemer 14
The speed v of a particle is related to the wave number k by mv = h̄k. We already know
the probability of a particle having k in the range k → k + dk, and so we can immediately write
down the corresponding probability of the speed being in the range v → v + dv:
P(k → k + dk) = (g(k) e^{−ε(k)β}/Z1) dk    where ε(k) = h̄²k²/2m
P(v → v + dv) = (V/2π²Z1)(m/h̄)³ v² e^{−ε(v)β} dv    where ε(v) = mv²/2
⇒ P(v) = √(2/π) (m/kB T)^{3/2} v² e^{−mv²/2kB T} (3.53)
This is called the Maxwell-Boltzmann distribution, and it is plotted below.
[Plot: P(v) against v, in units of √(kB T/m); the most probable, mean and rms speeds are marked.]
We can find the most probable speed (from dP (v)/dv = 0), as well as the mean speed and
the rms speed:
vp = √(2kB T/m) ≈ 1.41 √(kB T/m)
⟨v⟩ = √(8kB T/πm) ≈ 1.60 √(kB T/m)
vrms = √⟨v²⟩ = √(3kB T/m) ≈ 1.73 √(kB T/m) (3.54)
These are marked on the graph above.
Note that h̄ has disappeared from P (v), which can be derived from the Boltzmann distri-
bution in a purely classical theory provided the normalisation is obtained from requiring the
integral of P (v) to be one.
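The characteristic speeds (3.54) can be recovered by integrating P(v) numerically; a sketch in units with m = kB T = 1:

```python
import math

# Moments of the Maxwell-Boltzmann speed distribution (m = kB*T = 1).
def P(v):
    return math.sqrt(2/math.pi) * v*v * math.exp(-v*v/2)

n, vmax = 100000, 12.0
h = vmax/n
grid = [i*h for i in range(n + 1)]
norm  = h*sum(P(v) for v in grid)          # should integrate to 1
v_avg = h*sum(v*P(v) for v in grid)        # mean speed
v2    = h*sum(v*v*P(v) for v in grid)      # mean squared speed

assert abs(norm - 1.0) < 1e-6
assert abs(v_avg - math.sqrt(8/math.pi)) < 1e-6     # <v>
assert abs(math.sqrt(v2) - math.sqrt(3.0)) < 1e-6   # v_rms
```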
It is the step between the second and third lines, in which we interchange the order of addition
and multiplication, that is tricky at first! But it is no harder than the following (in reverse):
ZN = Σ_{i1,i2,...,iN} Π_{n=1}^{N} exp(−ε^{(n)}_{in} β)
   = Π_{n=1}^{N} Σ_{in} exp(−ε^{(n)}_{in} β) = Π_{n=1}^{N} Z^{(n)}. (3.58)
• Mandl 7.9
• Bowley and Sánchez 5.14
• (Kittel and Kroemer 3)
The results for vibrational, rotational and translational energies demonstrate that, at high
enough temperatures, the law of equipartition of energy holds: each quadratic term in the
classical expression for the energy contributes ½kB T to the average energy and ½kB to the heat
capacity. The oscillator has quadratic kinetic and potential terms, E = p²/2m + ½mω²x², and
hence a high-temperature average energy of kB T.
Now we understand what governs “high enough”: kB T has to be much greater than the spacing
between the quantum energy levels. If this is not satisfied, the heat capacity will be reduced,
dropping to zero at low temperatures. The corresponding degree of freedom is said to be frozen
out; this is the situation for the vibrational degrees of freedom at room temperature.
Here is an idealised graph of the heat capacity of hydrogen with temperature (© P. Eyland,
University of New South Wales).
[Graph: CV of H2 against T, rising in steps as first the rotational and then the vibrational degrees of freedom unfreeze.]
As the moment of inertia for H2 is small, the temperature by which equipartition holds for
rotational modes is actually quite high. Bowley and Sánchez have a graph taken from data
(Fig. 5.8).
We can predict the specific heat of other substances based on equipartition, simply by
counting the degrees of freedom. For a solid, we expect the molar heat capacity to be 3R,
since each atom is free to vibrate in three directions, with each direction contributing both a
kinetic and a potential quadratic term. This is the law of Dulong and Petit, and it works well
for a variety of solids at room temperature. (More details here.)
Equipartition does not hold, even at high temperatures, if the energy is not quadratic. For
instance the gravitational potential energy is linear in height, and the average potential energy
of a molecule in an isothermal atmosphere is kB T, not ½kB T.
Similarly the kinetic energy of a highly relativistic particle is given by the non-quadratic
c√(px² + py² + pz²) (= h̄ck), not by the quadratic (px² + py² + pz²)/2m, and the average kinetic
energy is 3kB T, not (3/2)kB T.
At low temperature (β → ∞) the heat capacity tends to 3N kB (h̄ωβ)² e^{−h̄ωβ}. Although this
tends to zero, it does not agree with the observed low temperature behaviour, which is proportional to
T³. More sophisticated models, such as that of Debye, allow for collective vibrations of many
atoms which have much lower frequency, and hence contribute to the internal energy and heat
capacity at much lower temperatures. We will revisit this when we consider gases of bosons,
not that the connection is obvious right now.
In the previous section we assumed that the average energy of N non-interacting atoms was
the same as N times the average energy of one atom, an obvious (and correct) consequence of
the law of large numbers and indeed almost the definition of “non-interacting”. It could also
be derived from ZN = (Z1 )N . To continue our study of the ideal gas, we want to calculate F ,
and from that the pressure and entropy. But recalling that
Z1 = V gs (mkB T/2πh̄²)^{3/2} ≡ V gs nQ. (3.63)
and using F = −kB T ln ZN would give
F = −N kB T ln(gs V nQ),
and that doesn’t make sense. If one has double the number of particles, in double the volume,
the Helmholtz free energy, like the energy, should double. They are both extensive variables.
But the expression above is not extensive.
The solution comes from a very surprising quarter—quantum mechanics. Quantum me-
chanics says that atoms of the same element are fundamentally indistinguishable, exactly the
same. As a result, for instance, all observables have to be unchanged when we interchange two
identical atoms. We can’t give them labels and know which one is in which state. But if we
recall the derivation of ZN = (Z1)^N, which held for the paramagnet, crucially all
N particles were distinguishable (by their position in the lattice). This is not the case for a
gas. So what is the N -particle partition function for indistinguishable particles?
Consider first the partition function for the simplest case, of two particles and two energy
levels. If the particles are distinguishable, as in the upper picture below, there are four states,
two of which have energy ε, and the two-particle partition function is
Z2 = e0 + 2e−εβ + e−2εβ = (Z1 )²

[Figure: the four states of two distinguishable particles (upper picture) and the three states of two indistinguishable particles (lower picture) in a two-level system with energies 0 and ε.]
However for indistinguishable atoms, we can’t give them labels and know which one is in
which state. Thus there are only three states, as in the lower picture, and the partition function
is
Z2 = e0 + e−εβ + e−2εβ 6= (Z1 )2 (3.66)
If we use (Z1 )2 , we over-count the state in which the particles are in different energy levels. In
general there is no simple expression for the N -particle partition function for indistinguishable
particles.
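The two-particle counting above is easy to verify by direct enumeration; a short sketch (with illustrative values β = ε = 1):

```python
import math
from itertools import product, combinations_with_replacement

beta, eps = 1.0, 1.0
levels = [0.0, eps]          # two energy levels, 0 and ε

# Distinguishable: each labelled particle independently picks a level (4 states).
Z2_dist = sum(math.exp(-beta * (e1 + e2)) for e1, e2 in product(levels, repeat=2))

# Indistinguishable: only the multiset of occupied levels matters (3 states).
Z2_indist = sum(math.exp(-beta * (e1 + e2))
                for e1, e2 in combinations_with_replacement(levels, 2))

Z1 = sum(math.exp(-beta * e) for e in levels)
print(Z2_dist, Z1**2)    # equal: (Z1)² is the distinguishable count
print(Z2_indist)         # e⁰ + e^{−εβ} + e^{−2εβ}, as in Eq. (3.66)
```

The discrepancy between Z2_indist and (Z1)²/2 comes entirely from the two states with both particles in the same level, which (Z1)²/2 counts with weight ½.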
However we note that (Z1 )N over-counts the states in which all N particles are in different
energy levels by exactly N !. So if we are in a position where there are many more accessible
energy levels (that is, levels with energy less than a few kB T ) than there are particles, the
probability of any two particles being in the same energy level is small, and almost all states
will have all the particles in different levels. Hence a good approximation is
ZN = (Z1 )N / N! (3.67)
It turns out that this is exactly what we need to fix the ideal gas.
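We can test the claim numerically (this sketch is not in the notes): for a few bosons spread over many closely spaced levels, the exact indistinguishable partition function, summed over multisets of occupied levels, approaches (Z1)^N/N!, while for only a handful of levels it does not. The level spacing of 0.02 kB T is an assumed illustrative value.

```python
import math
from itertools import combinations_with_replacement

def exact_boson_Z(levels, N, beta):
    # exact indistinguishable partition function: sum over multisets of levels
    return sum(math.exp(-beta * sum(s))
               for s in combinations_with_replacement(levels, N))

beta, N = 1.0, 3
for M in (5, 200):                          # few vs. many accessible levels
    levels = [0.02 * i for i in range(M)]   # spacing 0.02 kB T (assumed)
    Z1 = sum(math.exp(-beta * e) for e in levels)
    approx = Z1**N / math.factorial(N)
    print(M, exact_boson_Z(levels, N, beta) / approx)
```

With 200 levels the ratio is within a few percent of 1; with 5 levels (all within kB T of each other) multiple occupancy is common and (Z1)³/3! badly undercounts.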
• Mandl 7.1,7.4-6
• Bowley and Sánchez 6.5
• Kittel and Kroemer 3
We are now reaching the most important test of statistical physics: the ideal gas. For the
moment we assume it is monatomic; the extra work for a diatomic gas is minimal.
The one-particle translational partition function, at any attainable temperature, is
Z1 = V gs nQ , where nQ ≡ ( m kB T / 2πh̄² )^(3/2). (3.68)
We already saw that assuming the atoms to be distinguishable yields a non-extensive
Helmholtz free energy, and argued that we had to treat them as indistinguishable. So we
want to see if we can use the approximate form from the previous section,
ZN = (Z1 )N / N!. (3.69)
We can, if we can convince ourselves that it is very unlikely that any two atoms are in
the same energy level. In the ideal gas, we can calculate the number of levels below, say,
2kB T, from ∫0^kmax g(k) dk with h̄²k²max/2m = 2kB T, giving 2.1 nQ V. So we see that nQ , the
“quantum concentration”, is a measure of the number of states available, and we can use the
approximation ZN = (Z1 )N /N! provided N ≪ nQ V (or n ≪ nQ ). This is the classical limit.
We also note that nQ ≈ 1/λ³, where λ is the thermal de Broglie wavelength (the wavelength
of a particle of energy of order kB T ). So the condition n ≪ nQ is equivalent to saying that the
separation of the atoms is much greater than their wavelength, exactly the condition given in
last semester's quantum mechanics course for classical behaviour.
The energy and heat capacity derived from (3.69) are unchanged from those of section 3.8,
as N ! doesn’t depend on β.
For the Helmholz free energy, using Stirling’s approximation ln(N !) ≈ N ln N − N , we find
F = −kB T ln ZN = −N kB T ( ln Z1 − ln N + 1 )
  = −N kB T [ ln(V/N) + ln(gs nQ ) + 1 ]
  = N kB T [ ln( n/(gs nQ ) ) − 1 ]. (3.70)
Since nQ is composed only of constants and T , it is intensive; the number density n ≡ N/V is
the ratio of extensive quantities and so is also intensive. Hence F is clearly simply proportional
to N , and so extensive as required.
Then
P = − ( ∂F/∂V )T,N = N kB T / V (3.71)
and for the entropy, the Sackur-Tetrode equation
S = − ( ∂F/∂T )V,N
  = − N kB [ ln( n/(gs nQ ) ) − 1 ] + N kB T (1/nQ )(dnQ /dT)
  = N kB [ ln( gs nQ /n ) + 5/2 ] (3.72)
Since n ≪ nQ if the result is to be valid, S is also positive, as it should be!
The expression for P is clearly experimentally verifiable: it is the ideal gas law. That’s
good, but we expected to get that. More interestingly the Sackur-Tetrode equation for S can
also be checked. First, if we unpick the dependence on V and T , we get
S = N kB ( ln V + (3/2) ln T + const. ) (3.73)
which is in accord with the form derived from classical thermodynamics (see here). But more
importantly it predicts the absolute entropy of a gas at a certain temperature, and this can
be checked experimentally too. If we start with the solid at some very low temperature T0 , at
which the entropy can be assumed to be very small, and we know the experimental specific
heat capacity as a function of temperature and the latent heats of melting and vaporisation,
we can numerically calculate the integral
∫T0^T đQ/T = S(T) − S(T0 ) ≈ S(T) (3.74)
Good agreement is found. An example with numerical details can be found here, from Edward
J. Groth of Princeton University.
Finally, we include vibrations and rotations as well as translations: since the one-particle
energies are independent and add, ε = εtr + εrot + εvib , the partition functions multiply: Z1 =
Z1tr Z1rot Z1vib (the argument is like that for the N -particle partition function for distinguishable
particles and is given here) and so
ZN = (Z1tr )N (Z1rot )N (Z1vib )N / N! = ZNtr (Z1rot )N (Z1vib )N , F = Ftr + Frot + Fvib (3.75)
and the energy and entropy also add.
It is important to note that, assuming a truly ideal gas which never condenses or solidifies,
the Sackur-Tetrode equation is not valid for indefinitely low temperatures. It must be wrong,
because as T → 0, nQ → 0 and S → −∞. But we know that S → 0 as T → 0, because all
the particles occupy the lowest energy level. But of course that is exactly the regime in which
ZN = (Z1 )N /N ! is no longer valid.
For a gas with the density of air at STP, n ≈ 3 × 10^25 m^−3. We have nQ ≈ n for T ≈ 10^−2 K,
so real gases are essentially always classical.
An example of a non-classical gas is the conduction electrons in a metal; they are free
to move within the metal and can be treated as a dense gas (n ≈ 10^29 m^−3), but at room
temperature nQ ≈ 10^25 m^−3 from (3.68). So the quantum nature of the electron (specifically
the fact that it is a fermion) becomes all-important.
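These orders of magnitude are quick to check numerically (molecular mass and densities below are illustrative round numbers, assumed rather than taken from the notes):

```python
import math

hbar, kB = 1.0546e-34, 1.381e-23

def n_Q(m, T):
    # quantum concentration n_Q = (m kB T / 2π ħ²)^{3/2}, Eq. (3.68)
    return (m * kB * T / (2 * math.pi * hbar**2)) ** 1.5

# Air-like molecule (N2, m ≈ 4.65e-26 kg) at 300 K vs. n ≈ 3e25 m⁻³:
print(n_Q(4.65e-26, 300) / 3e25)   # >> 1: n << n_Q, so classical

# Conduction electrons (m ≈ 9.11e-31 kg), n ≈ 1e29 m⁻³ at 300 K:
print(1e29 / n_Q(9.11e-31, 300))   # >> 1: degenerate, quantum behaviour
```

The huge mass ratio between a molecule and an electron, raised to the power 3/2, is what pushes the electron gas deep into the quantum regime at the same temperature.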
H ⇌ p + e−.
Equilibrium requires µH − µp − µe = 0, or

ln( 2nQ(e)/ne ) + ln( 2nQ(p)/np ) − ln( 4nQ(H)/nH ) + (mH − mp − me )c²β = 0

⇒ ne np /nH = ( nQ(p) nQ(e) / nQ(H) ) e^(εβ) ≈ nQ(e) e^(εβ) (3.79)
We have defined ε as the (negative) energy of the bound state of the electron in hydrogen,
−13.6 eV, and given the closeness in mass of p and H we have set the ratio of their quantum
concentrations to 1 in the last step. The factors of 2 in the chemical potentials of the proton
and electron are the usual spin degeneracies gs = 2 for spin-½ particles. The factor of 4 for
hydrogen is also a degeneracy factor: the total spin of the electron and proton can be S = 1 or
S = 0, with 4 spin states (3 + 1) in all.
One can take this formula in various directions, but if one assumes the electron density and
temperature are known, it gives the ionization fraction np /nH . It can be rewritten in terms of
the chemical potential of the electrons (but with the conventional zero of energy, which we will
here but not in general write as µ̃e ) as
np /nH = ½ ( 2nQ(e)/ne ) e^(εβ) = ½ e^((−µ̃+ε)β). (3.80)
Astrophysics students will have met (3.79) as the Saha equation. We will meet this problem
again in section 3.14.1.
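As an illustration of (3.79) (the electron density below is an assumed round number, not from the notes), the ionization fraction switches from tiny to large over a factor of two in temperature, because the binding energy sits in an exponential:

```python
import math

hbar, kB, eV = 1.0546e-34, 1.381e-23, 1.602e-19
m_e, eps = 9.11e-31, -13.6 * eV      # ε: (negative) binding energy of hydrogen

def ionization_ratio(n_e, T):
    # n_p/n_H = (n_Q^(e)/n_e) e^{εβ}, Eq. (3.79) with n_Q^(p)/n_Q^(H) ≈ 1
    n_Q = (m_e * kB * T / (2 * math.pi * hbar**2)) ** 1.5
    return (n_Q / n_e) * math.exp(eps / (kB * T))

# Illustrative electron density (assumed): n_e = 1e20 m⁻³
print(ionization_ratio(1e20, 6000))    # tiny: hydrogen mostly neutral
print(ionization_ratio(1e20, 12000))   # large: mostly ionized
```

Since kB T is well below 13.6 eV in both cases, the sharpness comes from e^(εβ), only partly offset by the T^(3/2) in nQ.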
The justification for (and limitations of) the 1/N ! are the same as before, and since true
indistinguishability is a quantum property, we cannot entirely get away from QM.
For particles in a box, the potential within the box vanishes and positions outside the box
are inaccessible, so ∫d^3N x = V^N. So

ZN = ( V^N / h^3N N! ) ∫ d^3N p exp( −β Σi=1^3N pi²/2m ) (3.83)
There are three ways to tackle this. The simplest is to see it as a product of Gaussians in
pi
3N Z∞
VN Y VN ZN
2 β
ZN = 3N dpi exp −pi = 3N (2πmkB T )3N/2 = 1 (3.84)
h N ! i=1 2m h N! N!
−∞
The second is to group the integrals particle by particle and use the density of states in p:

ZN = (1/N!) [ (V/h³) ∫0^∞ dp 4πp² exp( −β p²/2m ) ]^N = (Z1 )^N / N! (3.85)
This approach is just like the k-space one we used above, with p = hk/2π and g(k)dk = g(p)dp
giving g(p) = 4πV p2 /h3 .
The final approach, which however is only valid in the same case as the first, where each p2i
is summed for all N particles and all 3 directions, is to consider k-space (or rather p-space) in
3N dimensions, using the expression for the surface area of a hypersphere. I will leave it as an
exercise for the mathematically-minded reader to check the same result is obtained. (See also
the lecture notes of the previous lecturer.)
So we have recovered the same result as before, and all our results for the ideal gas, including
the Helmholtz free energy and Sackur-Tetrode entropy (3.72), are recovered, with all the same
restrictions on validity (N/V ≪ nQ ). But I have pulled the wool over your eyes. They only
agree if the arbitrary parameter h I introduced in the elementary volume of phase space is
actually Planck’s constant! Any other choice would correctly predict changes in S and F , but
not absolute values.
On the problem sheet you will be asked to show that the partition function for a single
harmonic oscillator is also reproduced in the high-temperature limit by this classical phase
space approach. In this case, because there is a potential, the integral ∫d^3N x has to be done
explicitly and does not simply yield a power of the volume. Since the potential is quadratic
in each xi , though, we just get more Gaussian integrals, and the result is the same as in the
high-temperature limit of the QM approach, (3.36): in one dimension
Z1 = 1/(h̄ωβ) and E = kB T. (3.86)
Finally, if you are interested, the concept of the hypersphere can be used to tackle the ideal
gas in the microcanonical ensemble. The problem there, if you recall, is that the energy of the
whole system is constrained to be fixed, so the particles, while not interacting, cannot simply be
treated as independent as they can if they are in contact with a heat bath. In calculating Ω, the
integral ∫d^3N p has to be carried out with a δ function on the total kinetic energy: effectively this
restricts |~p| to the surface of the hypersphere. Using the surprising result that, for very large
N , essentially all the volume of an N -dimensional hypersphere resides at the surface (because
of the factor rN −1 in the volume element), one integrates instead over all states with energies
up to energy E, and the result for S = kB ln Ω is indeed just the Sackur-Tetrode one (as a
function of E = (3/2)N kB T rather than T ). The previous lecturer's notes (section 3.6) give more
detail, but I simply note the result as further proof that for a large system, the microcanonical
and canonical approaches give the same result.
gives
pi = e^((µNi −εi )/kB T) / Z (3.88)
with
Z = Σj e^((µNj −εj )/kB T) (3.89)
The new normalisation constant Z is called the grand partition function. Macroscopic
functions of state are calculated via ensemble averages as usual; the relevant ensemble in this
case is called the grand canonical ensemble.
The following properties are easily proved by analogy with the corresponding ones for the
Boltzmann distribution (see here and here):
⟨N⟩ = kB T ( ∂ ln Z/∂µ )β
⟨E⟩ = − ( ∂ ln Z/∂β )µ + µ⟨N⟩
⟨S⟩ = −kB Σi pi ln pi = (1/T)( ⟨E − µN⟩ + kB T ln Z )
⇒ ΦG ≡ −kB T ln Z = ⟨E − T S − µN⟩. (3.90)
The quantity −kB T ln Z is a new thermodynamic potential called the grand potential: Mandl
gives it the unfortunate symbol Ω but we will use ΦG like Bowley and Sánchez. (They use Ξ–
“Xi”–for Z.) From the fundamental thermodynamic relation we get

dΦG = −S dT − P dV − N dµ

and hence

S = −( ∂ΦG /∂T )V,µ , P = −( ∂ΦG /∂V )T,µ , N = −( ∂ΦG /∂µ )T,V . (3.92)
system:                  isolated               in contact with heat bath   heat and particle bath
fixed:                   E, N, V or B           T, N, V or B                T, µ, V or B
key microscopic function: no. of microstates Ω  partition function Z        grand partition function Z
key macroscopic function: S = kB ln Ω           F = −kB T ln Z              ΦG = −kB T ln Z
So whereas in an isolated system the entropy is the key (compare here) and in a system at
constant temperature it is the Helmholtz free energy (compare here), here the grand potential
is the key to the other functions of state.
The natural variables of the grand potential are T, V and µ. But of these, T and µ are inten-
sive. Like any thermodynamic potential ΦG itself is extensive, so it must be simply proportional
to V , the only extensive one: ΦG = V φG (T, µ).
But
φG (T, µ) = ( ∂ΦG /∂V )T,µ = −P ⇒ ΦG = −P V. (3.93)
This explains why it is not greatly used in thermodynamics. But the fact that ΦG is so
simple doesn’t lessen its formal utility in statistical mechanics.
The grand potential in the case that more than one species is present is
ΦG ≡ E − T S − Σi µi Ni so dΦG = −S dT − P dV − Σi Ni dµi . (3.94)
We can use this to prove that µi is the Gibbs free energy per particle of species i, as claimed
previously (1.23):
G = E − T S + P V = E − T S − ( E − T S − Σi µi Ni ) = Σi µi Ni . (3.95)
[Figure: mean occupancy ⟨N⟩ of the single state as a function of ε0 , at low and high temperature; the curves cross at ⟨N⟩ = 0.5 where ε0 = µ.]
If ε0 > µ then it is less likely to be occupied, since it is energetically more favourable for the
molecule to remain in solution. Conversely if ε0 < µ then it is more likely to be occupied,
since that is the energetically favourable configuration. As always, it is the temperature which
determines the likelihood of the less favourable configuration obtaining. At zero temperature,
the distribution becomes a step function, with hN i = 1 if ε0 < µ and hN i = 0 if ε0 > µ.
[Figure: mean occupancy ⟨N⟩ of the single bosonic state as a function of ε0 > µ, at low and high temperature.]
For bosons there can be no levels with ε0 < µ, as their occupancy would be infinite. (The formula above for ⟨N⟩ is no longer valid
in that case.) For ε0 close to µ the occupancy will be high, and it falls off as ε0 increases. The
rapidity of the drop depends on temperature; for T = 0 only a level with ε0 = µ would have
non-zero occupancy.
Chapter 4
Quantum Gases
• Mandl 9.2
• Bowley and Sánchez 10.2
First a note on notation. When we talk about the “spin” of a composite particle we mean
its total angular momentum, and we usually use the symbol J rather than S. Last semester you
met J as the (quantised, vector) sum of orbital and spin angular momentum for the electron,
but more generally it is the (quantised, vector) sum of all the spins and all the orbital angular
momenta of all the constituents.
The basic building blocks of atoms are all fermions, while the force carriers (photon, gluon,
W, Z) and the Higgs are bosons. The rules of addition of angular momentum mean that two
spin-½ particles with no orbital angular momentum can either have total angular momentum
J = 0 or 1; if we add a third, we have J = ½ or 3/2; a fourth gives J = 0, 1 or 2, and so on.
Adding in orbital angular momentum, which is integer, gives more possibilities, but J is still
always half-integer for an odd number of spin-½ particles and integer for an even number. So
composite particles (nuclei, atoms, molecules) made of an odd number of protons, neutrons
and electrons are also fermions, whereas those made of an even number are bosons. Note that
a particle is either a fermion or a boson. Excitations of composite particles (nuclei, atoms) can
change the spin only by an integer amount and so don't change its nature.
Fermions obey the Pauli exclusion principle: no more than one fermion can occupy a
single quantum state. (The value of the spin quantum number ms is part of the description of
the state; if that is ignored then two spin-½ or four spin-3/2 particles can occupy the same spatial
state.) This is the basis of atomic structure and the periodic table; it explains the properties
of metals and of white dwarfs and neutron stars.
The “grown-up” version of the Pauli exclusion principle is that the overall wave function
of a system of identical fermions must be antisymmetric under exchange of any pair. For two
spin-½ particles in the same spatial state (say the 1s state of helium) the overall wave function
must be
Ψ(r1 , r2 , m1 , m2 ) = φ1s (r1 )φ1s (r2 ) (1/√2)(↑↓ − ↓↑) (4.1)
The spatial part is symmetric but the spin part is antisymmetric. (This corresponds to overall
spin 0). If we try to construct a state with three particles in the same spatial state we can’t do
it, there is no state (↑↓↑ − . . .) which changes sign when we interchange any pair of particles.
So the Pauli exclusion principle follows from the requirement for antisymmetry.
It is possible to have a 2-fermion spin state such as ↑↑ (spin-1) but then the particles have
to be in different spatial states, eg
Ψ(r1 , r2 , m1 , m2 ) = (1/√2)[ φ1s (r1 )φ2s (r2 ) − φ2s (r1 )φ1s (r2 ) ] ↑↑ (4.2)
There is no exclusion property for identical bosons, but there is a restriction on their wave
function: it must be symmetric under exchange of any pair. So for spinless bosons

Ψ(r1 , r2 ) = φ1s (r1 )φ1s (r2 )

is a perfectly acceptable wave function and there is no Pauli exclusion principle. For spin-1
bosons we have to ensure that the overall space-spin wave function is symmetric, but the details
are not important here.
So bosons are free to (indeed, other things being equal, “prefer” to) crowd into the same
quantum state. This explains the spectrum of black-body radiation and the operation of lasers,
the properties of liquid 4 He and superconductors.
The need for the wave function to be either symmetric or antisymmetric for identical
particles stems from the very meaning of “identical”: nothing observable can change if we swap
the particles. In particular |Ψ|² must be unchanged, so under exchange Ψ can change at most
by a phase.
• Mandl 11.2,11.4,11.5
• Bowley and Sánchez 10.2-3
When we derived the properties of the ideal gas previously, using the classical approximation
for the partition function (Z1 )N /N!, our results were only valid if the number of available
single-particle levels greatly exceeded the number of particles in the gas (nQ ≫ n). This was because
we knew that we were not treating states with more than one particle in them correctly. Now we
know that if the gas particles are fermions, that isn’t even possible, so we need a new approach.
What we do is lift the restriction that the number of particles in the gas is fixed, and use the
Gibbs distribution instead of Boltzmann.
In a previous section we looked at the grand partition function for a single state which could
accept either only one, or many, particles. These will be our starting points for the consideration
of gases of fermions or bosons in the regime in which we cannot ignore the possibility of multiple
occupancy of states (bosons) or restrictions in the available states because they are already
occupied (fermions).
We then find that rather than focus on a single particle in the gas, it is easier to focus on
what is happening in a single energy level. Then we can write, using r to label the energy level,
not the particle
Z = Z1 Z2 Z3 . . . = Πr Zr where Zr = 1 + e^((µ−εr )β) + e^(2(µ−εr )β) + . . . (4.6)
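For bosons the series for Zr is geometric, Zr = 1/(1 − e^((µ−εr)β)), and the occupancy ⟨nr⟩ = kB T ∂ln Zr/∂µ should reproduce the Bose-Einstein form quoted below in (4.10). A quick numerical check (values of µ, εr and β are illustrative):

```python
import math

def Zr(mu, eps, beta, nmax=500):
    # grand partition function of one level that can hold any number of bosons:
    # sum of the geometric series 1 + e^{(µ−ε)β} + e^{2(µ−ε)β} + ...
    return sum(math.exp(n * (mu - eps) * beta) for n in range(nmax))

def occupancy_numeric(mu, eps, beta, dmu=1e-6):
    # <n> = kB T ∂ ln Zr/∂µ, evaluated by a central finite difference
    return (math.log(Zr(mu + dmu, eps, beta))
            - math.log(Zr(mu - dmu, eps, beta))) / (2 * dmu * beta)

mu, eps, beta = -0.5, 1.0, 2.0   # illustrative: µ safely below the level energy
print(occupancy_numeric(mu, eps, beta))
print(1 / (math.exp((eps - mu) * beta) - 1))   # Bose-Einstein form
```

The two printed numbers agree, illustrating that the occupancy formula is just a µ-derivative of the single-level grand partition function.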
⟨N⟩ = ∫0^∞ g(k) n(k) dk, ⟨E⟩ = ∫0^∞ g(k) ε(k) n(k) dk (4.9)

n(k) = 1 / ( e^((ε(k)−µ)β) ± 1 ) (fermions: +, bosons: −) (4.10)
Note that for bosons, µ must be less than the energy of the lowest level (zero for most purposes)
but for fermions µ can be (and often will be) greater than 0.
The Gibbs distribution assumes variable particle number and constant chemical potential
(as well as variable energy and constant temperature). However we know that for a large system,
fluctuations are small, and the results will be essentially the same as for the more difficult problem
of a gas with fixed particle number. To cast it in this form, we use the expressions of (4.9) to
find the value of µ which gives the desired N . Then we can also find the average energy per
particle. This is conceptually simple, but not usually possible analytically except in certain
limits. In the next subsection we will recover the classical ideal gas.
ΦG ≡ −kB T ln Z
  = −kB T Σr ln Zr (using Z = Πr Zr )
  = ∓kB T Σr ln( 1 ± e^((µ−εr )β) ) (4.11)

where r labels the single-particle energy levels, and the upper and lower signs are for fermions
and bosons respectively.
Now imagine that e^(µβ) ≪ 1, which requires µ to be large and negative. Never mind for a
moment what that means physically. Then, using ln(1 + x) ≈ x for small x, we get
ΦG = −kB T e^(µβ) Σr e^(−εr β) = −kB T e^(µβ) Z1 (T) (4.12)

where Z1 is the one-particle translational partition function (not grand p.f.) for an atom in an
ideal gas. As we calculated previously, Z1 (T) = V gs nQ (T).
From ΦG we can find the average particle number:

N = −( ∂ΦG /∂µ )T,V = e^(µβ) Z1 (4.13)

Setting this equal to the actual particle number fixes the chemical potential,

µ = −kB T ln(Z1 /N), (4.14)

and hence ΦG = −N kB T and

F = ΦG + µN = −N kB T − N kB T ln(Z1 /N) = −N kB T ( ln(Z1 /N) + 1 ) (4.15)
However this is exactly what we get from F = −kB T ln ZN with ZN = (Z1 )N /N !. Thus we
recover all our previous results.
We can also look at the occupancy n(ε): for large negative µ,

n(ε) = 1/( e^((ε−µ)β) ± 1 ) ≈ e^(µβ) e^(−εβ) = N e^(−εβ)/Z1 (4.16)
which is the Boltzmann distribution as expected. The three distributions, Bose-Einstein (or-
ange), Boltzmann (blue) and Fermi-Dirac (green) are plotted as a function of (ε − µ)β below:
[Figure: ⟨N⟩ against β(ε − µ) for the three distributions.]
It is reassuring that we can recover the classical ideal gas predictions of course. We can also
look at the first corrections in an expansion in the density (n/nQ ). For a classical gas this is
called a virial expansion; for a van der Waal gas, for instance,
a N
P V = N kB T 1 + b − + ...
RT V
where b is the excluded volume due to finite molecular size and a arises from attractive inter-
actions.
But even for an ideal (zero electrostatic interactions, point-like) Bose or Fermi gas such
terms appear. An example on the problem sheets asks you to show that, for fermions and
bosons respectively,

P V = N kB T ( 1 ± n/(4√2 gs nQ ) + . . . ). (4.17)
So for fermions the pressure is larger than expected, consistent with a reluctance to occupy
the same space (like a non-zero b). This isn’t too surprising. More surprising perhaps is the
fact that for bosons, the pressure is smaller, as if there were attractive interactions. Whereas
fermions “like” to keep apart, bosons are gregarious!
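To see how tiny this quantum correction is for a real gas, we can evaluate n/(4√2 gs nQ) from (4.17) for an air-like molecule at STP (mass and density are illustrative round numbers, assumed here):

```python
import math

hbar, kB = 1.0546e-34, 1.381e-23

def quantum_virial_correction(n, m, T, gs=1):
    # fractional pressure shift ± n/(4√2 gs n_Q) from Eq. (4.17)
    n_Q = (m * kB * T / (2 * math.pi * hbar**2)) ** 1.5
    return n / (4 * math.sqrt(2) * gs * n_Q)

# N2-like molecule at STP (m ≈ 4.65e-26 kg, n ≈ 2.5e25 m⁻³, T = 273 K):
print(quantum_virial_correction(2.5e25, 4.65e-26, 273))
```

The result is of order 10^−8, many orders of magnitude below the van der Waals corrections, which is why quantum statistics is invisible in ordinary gases.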
• Mandl 11.5
• Bowley and Sánchez 10.4.2
[Figure: the Fermi-Dirac occupancy ⟨N⟩ against ε, equal to 0.5 at ε = µ and falling from 1 to 0 over a width of order kB T.]
We have already seen that for electrons in a metal, the number of states with energies of
order kB T is much less than the number of electrons to be accommodated. Because electrons
are fermions, they can't occupy the same levels, so levels up to an energy far above kB T will
need to be filled. The occupancy is given by (4.10) for fermions
n(ε) = 1/( e^((ε−µ)β) + 1 ) (4.18)
which is plotted above.
At zero temperature, it is clear what the ground state—the state of lowest possible energy—
of a fermion gas must be. All energy levels will be occupied (singly-occupied if we regard the
spin as part of the specification of the state) up to a maximum, and all higher levels will be
unoccupied. The occupation function n(ε) becomes a step function—one up to a certain value
of ε and zero thereafter. Do our results bear this out?
Considering n(ε), we see that the limit T → 0, β → ∞ needs to be taken rather carefully.
Clearly it will depend on the sign of ε − µ. If ε < µ then the argument of the exponential is
very large and negative and the exponential itself can be ignored in the denominator, simply
giving n(ε) = 1. But if ε > µ, the argument of the exponential is very large and positive and
the “+1” can be ignored in the denominator, so that n(ε) → e^(−(ε−µ)β) → 0. So
n(ε) → { 1 for ε < µ ; 0 for ε > µ } (4.19)
So in fact µ is the energy of the highest occupied state at zero temperature. This is also known
as the Fermi energy, εF , and indeed the two terms, “chemical potential” and “Fermi energy”
are used interchangeably for a Fermi gas.1 The value of k corresponding to this energy, kF ,
is referred to as the “Fermi momentum” (albeit that should really be h̄kF ). A gas like this
where only lowest levels are occupied is called a degenerate gas: this is a different usage from
“degenerate” to mean “equal energy”. The filled levels are called the Fermi sea and the top of
the sea is called the Fermi surface.
So what is the value of the Fermi energy at zero temperature? It is simply fixed by N ,
which we set equal to ⟨N⟩:

N = ∫0^∞ g(k) n(k) dk = ∫0^kF g(k) dk = (gs V/2π²) ∫0^kF k² dk = gs V kF³/6π² (4.20)
so, with gs = 2 and εF = h̄²kF²/2m for non-relativistic electrons, and n = N/V,

kF = (3π²n)^(1/3), εF = (h̄²/2m)(3π²n)^(2/3), and E = ∫0^kF g(k) ε(k) dk = (3/5)N εF .
Note all these depend, as we expect, only on N/V and not on V alone. εF is intensive, and
E ∝ N . For copper, εF = 7 eV.
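These formulae are easy to evaluate; a sketch using an assumed copper conduction-electron density of about 8.5 × 10^28 m^−3 (one electron per atom, a standard round number not given in the notes):

```python
import math

hbar, kB, eV, m_e = 1.0546e-34, 1.381e-23, 1.602e-19, 9.11e-31

def fermi_energy(n):
    # ε_F = ħ² k_F²/2m with k_F = (3π² n)^{1/3}, from Eq. (4.20) with gs = 2
    kF = (3 * math.pi**2 * n) ** (1 / 3)
    return hbar**2 * kF**2 / (2 * m_e)

n_Cu = 8.5e28                       # assumed copper electron density, m⁻³
print(fermi_energy(n_Cu) / eV)      # ≈ 7 eV, as quoted in the text
print(fermi_energy(n_Cu) / kB)      # Fermi temperature T_F ≈ 8 × 10⁴ K
```

The second number reproduces the T ∼ 80,000 K quoted later for kB T to reach εF.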
Note too that though we have written our integrals over wave number k, we can equally
switch to ε as the variable, using the density of states in energy from section 3.51, eg:
N = ∫0^∞ g(ε) n(ε) dε = ( gs V (2m)^(3/2) / 4π²h̄³ ) ∫0^εF ε^(1/2) dε = (gs V/6π²) ( 2mεF /h̄² )^(3/2) (4.23)
which is the same as before. Generally I prefer only to remember g(k) and to work out g(ε) if
required. After all g(k) only depends on the dimension of space, usually 3D in practice, while
g(ε) differs depending on whether the particles are relativistic or not.2
1
Actually sometimes the “Fermi energy” is used exclusively for the zero-temperature chemical potential,
the highest filled energy, and “Fermi level” is used for the energy with an occupancy of 0.5, which is ε = µ.
We will use Fermi energy at or near zero temperature, and chemical potential where the occupancy deviates
substantially from a step function.
2
Note the warning about the difference in notation for g(ε) between Dr Xian and myself contained at the
end of section 3.8.1. He divides out the volume V .
All of the above is at zero temperature. At finite temperatures, the picture will change—
but not by as much as you might expect. Thermal excitations can only affect levels within a
few kB T of εF . But at room temperature, kB T = 0.025 eV. For kB T to equal εF would need
T ∼ 80, 000 K, a temperature at which the metal would have vaporised. (TF = εF /kB is called
the Fermi temperature, but in no sense is it a real temperature, it’s just a way of expressing
the Fermi energy in other units.) We can see from (4.18) that n(k) will still essentially be 0 or
1 unless |ε − εF | is of the order of kB T . This is shown in the figure at the top of the section.
As the temperature rises, the Fermi energy doesn’t remain constant, though it doesn’t
change much initially. Again we find it from requiring the electron density to be correct,
N/V = (1/V) ∫0^∞ g(k) n(k) dk = (gs /2π²) ∫0^∞ k² dk / ( e^((h̄²k²/2m−µ)β) + 1 )
    = (gs /4π²) ( 2m/h̄²β )^(3/2) ∫0^∞ x^(1/2) dx / ( z e^x + 1 ) (4.24)

where z = e^(−µβ) and we have made the change of variable to the dimensionless x ≡ ε(k)β:

x = ( h̄²β/2m ) k², k = ( 2m/h̄²β )^(1/2) x^(1/2), dk = ½ ( 2m/h̄²β )^(1/2) x^(−1/2) dx (4.25)
Rearranging, and setting gs = 2, we get
F(z) ≡ ∫0^∞ x^(1/2) dx / ( z e^x + 1 ) = 2π² ( h̄²β/2m )^(3/2) n = (√π/2) (n/2nQ ). (4.26)
The integrand is shown, as a function of energy at fixed temperature, on the left for three
positive values of µ and on the right for µ = 0 and two negative values of µ; the area under the
curve is proportional to N/V . In each plot blue, orange and green are in decreasing order of
µ; the horizontal and vertical scales on the left are much larger than on the right.
[Figure: two panels of g(ε)n(ε) against ε.]
The area under the curves, hence the function F (z), can be obtained by numerical integration.
Then z, and hence µ, can be chosen to obtain the correct particle number density.3
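The numerical procedure is straightforward; a sketch (integration cutoff and step are assumed adequate for the values used) that evaluates F(z) by the midpoint rule and bisects for z:

```python
import math

def F(z, xmax=60.0, n=20000):
    # F(z) = ∫₀^∞ x^{1/2} dx / (z e^x + 1), Eq. (4.26), by the midpoint rule
    dx = xmax / n
    return sum(math.sqrt((i + 0.5) * dx) / (z * math.exp((i + 0.5) * dx) + 1) * dx
               for i in range(n))

def solve_z(n_over_nQ):
    # choose z = e^{−µβ} so that F(z) = (√π/2)(n/2n_Q); F is decreasing in z
    target = math.sqrt(math.pi) / 2 * n_over_nQ / 2
    lo, hi = 1e-12, 1e12
    for _ in range(60):
        mid = math.sqrt(lo * hi)               # bisect on log z
        lo, hi = (mid, hi) if F(mid) > target else (lo, mid)
    return math.sqrt(lo * hi)

print(solve_z(0.01))   # dilute gas: z >> 1, i.e. µ large and negative (classical)
print(solve_z(10))     # dense gas: z < 1, i.e. µ > 0 (degenerate)
```

In the dilute case z is close to the classical value √π/(2F), with only a small Fermi-Dirac correction; in the dense case z drops below 1, signalling a positive chemical potential.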
Note that n/nQ is greater than 1 for copper at room temperature. F(z) becomes large
as z → 0, i.e. µ ≫ kB T, which fits this situation. Conversely for z ≫ 1, corresponding to large
negative µ, the integral tends to √π/(2z) and we recover the classical limit as in the last section,
(4.14).
Below is plotted (in blue, labelled “F-D”) the ratio of the chemical potential to the zero-
temperature Fermi energy, as a function of temperature in units of the Fermi temperature. Also
shown (in orange, labelled “B”) is the classical approximation (4.14) (ignore for the moment
the green curve labelled “B-E”):
3
Why have I used z as the variable? Well it is more common to define z = e^(µβ)—it even has a name, the
fugacity—so my variable is the reciprocal of the fugacity, z = e^(−µβ). But I thought that to write z^−1 would be ugly.... F(z) is defined in terms
of so-called polylogarithms of z, but the details will not concern us.
[Figure: µ/εF against T/TF : the full Fermi-Dirac result (blue, “F-D”) and the classical approximation (4.14) (orange, “B”); an inset extends the plot to T/TF ≈ 7. The green curve (“B-E”) is discussed later.]
The number density n doesn’t appear because it has been eliminated in terms of εF . The
classical approximation slowly approaches the full expression as T → ∞, as can be seen in the
inset panel.4
The fact that thermal fluctuations affect only a small fraction of all the electrons has a
number of consequences. For instance the electronic heat capacity is much less than the (3/2)N kB
predicted by equipartition. Thermal excitations can only affect states with energies of the order
of kB T below the Fermi surface, roughly a fraction kB T/EF of the total, and their excess energy
is about kB T. So the extra energy above the zero-temperature value of (3/5)N EF is of order
N (kB T)²/EF and the electronic heat capacity is of order
CV ∝ N kB (T/TF ). (4.27)
A more careful calculation gives the constant of proportionality to be π 2 /2. This linear rise with
temperature can be seen at very low temperatures; at higher temperatures the contribution of
lattice vibrations dominates, and we will explore this later.
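Putting in numbers for a copper-like metal (TF assumed to be about 8 × 10^4 K, as estimated above) shows just how small the electronic term is at room temperature compared with the equipartition value:

```python
import math

def electron_cv_per_mole(T, T_F):
    # C_V ≈ (π²/2) N kB (T/T_F), Eq. (4.27) with constant π²/2; per mole N kB → R
    R = 8.314
    return (math.pi**2 / 2) * R * T / T_F

# Copper-like metal, assumed T_F ≈ 8.2e4 K:
print(electron_cv_per_mole(300, 8.2e4))   # ~0.15 J K⁻¹ mol⁻¹
print(1.5 * 8.314)                        # classical (3/2)R ≈ 12.5 J K⁻¹ mol⁻¹
```

The electronic contribution is nearly two orders of magnitude below the classical prediction, which is why it is only visible once the T³ lattice term has frozen out at low temperature.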
Data from Lien and Phillips, Phys. Rev. 133 (1964) A1370
The figure above shows the molar heat capacity as a function of temperature for potassium.5
What is actually plotted is C/T against T², so the straight line implies C = γT + AT³, with
the intercept γ giving the electronic contribution and the slope A the lattice one.
Note in passing that P V = (2/3)E, as for a classical ideal gas! This holds even at finite
temperature, as you will be asked to show on the problem sheets. The origin of both the
internal energy and the pressure is quite different though.
We will assume that the composition of the star is constant, so that there is a constant
ratio between the electron density and the matter density, ρ ∝ n. For hydrogen the constant of
proportionality would just be mH , whereas for heavier nuclei it will be around 2 amu or slightly
more (roughly, one proton and one neutron per electron). It turns out that, apart from the
total mass, the composition is the only thing that distinguishes one white dwarf from another.
Lighter stars are mostly carbon and oxygen, heavier ones have heavier elements up to iron.
Within a star the density, Fermi energy and pressure will vary with radius. For equilibrium,
the pressure difference across a spherical shell at r has to exactly balance the weight:

4πr² ( P(r) − P(r + dr) ) = 4πr² dr Gρ(r)M(r)/r²

⇒ dP/dr = − ( Gρ(r)/r² ) ∫0^r 4πr′² ρ(r′) dr′ (4.31)
0
Writing P (r) = Cρ(r)5/3 allows us to turn this into a second-order non-linear differential
equation for the density ρ(r). And with some manipulation that we won’t go into here (Google
“polytrope” if you are interested), it can be shown that the solution can be cast in terms of
the average density ρ̄ and a universal dimensionless function of the scaled variable r/R, where R
is the radius of the star:

ρ(r) = ρ̄ f(r/R) (4.32)
Without going into the details of that function, we can nonetheless derive an interesting
consequence: a mass-radius relation for white dwarf stars. We return to Eq. (4.31) and integrate
to obtain the pressure difference from the centre to the surface r = R at which P (R) = 0:
− ∫0^R (dP/dr) dr = P(0) = ∫0^R ( Gρ(r)M(r)/r² ) dr (4.33)
Then using (4.32), which also implies a universal function g for the mass M (r) (which is
obviously related to f but we don’t need the details):
M(r) = M g(r/R), where (4/3)πR³ ρ̄ = M; (4.34)
and looking back to (4.30) we see that the central pressure is proportional to ρ̄^(5/3), giving

ρ̄^(5/3) ∝ GM ρ̄ ∫0^R ( f(r/R)g(r/R)/r² ) dr = ( GM ρ̄/R ) ∫0^1 ( f(x)g(x)/x² ) dx (4.35)
The integral is universal and dimensionless, i.e. just a number which is the same for all white
dwarfs of a similar composition. Then we cancel one power of ρ̄ from either side to get

(M/R³)^(2/3) ∝ M/R ⇒ M R³ = constant (4.36)
This relationship is pretty well satisfied by white dwarf stars of masses less than one solar mass
(see table below). Note that it implies that the more massive the star, the smaller the radius.
For 1 solar mass, the radius is about that of the earth!
It looks like there would be no upper limit on the mass, but we have assumed that the
electrons are non-relativistic, and as the Fermi energy approaches me c2 this is no longer valid.
In the highly relativistic regime the pressure is proportional to n4/3 (see problem sheets), and
doesn’t grow fast enough as the star shrinks to stabilise it (check it in (4.36): the radius
cancels). So there is an upper bound to the mass of a white dwarf of about 1.4 solar masses—
the Chandrasekhar limit. This involves a pleasing combination of microscopic and macroscopic
parameters with a numerical factor which is of order 1:
$$M \sim \frac{(\hbar c/G)^{3/2}}{(2m_p)^2}$$
Above that an even more compact object sustained by the degeneracy pressure of neutrons
can form, but that again has an upper mass limit of about 2 solar masses (and radius of about
10 km). Beyond that, only black holes are possible...
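As a quick order-of-magnitude check, the Chandrasekhar combination above can be evaluated numerically; a minimal sketch in Python, using CODATA values for the constants (the order-1 numerical factor is omitted, so only the scale should be trusted):

```python
# Order-of-magnitude Chandrasekhar mass, M ~ (hbar*c/G)^(3/2) / (2*m_p)^2.
hbar  = 1.054571817e-34   # J s
c     = 2.99792458e8      # m/s
G     = 6.67430e-11       # m^3 kg^-1 s^-2
m_p   = 1.67262192e-27    # kg
M_sun = 1.989e30          # kg

M = (hbar * c / G)**1.5 / (2 * m_p)**2
print(f"M ~ {M:.2e} kg = {M / M_sun:.2f} solar masses")  # about half a solar mass
```

The missing numerical factor of order 1 brings this up to the quoted 1.4 solar masses.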
• Mandl 10.3-5
• Bowley and Sánchez 8.5
4.4 The ideal Bose Gas

4.4.1 Photons and Black-body radiation
Classically, black-body or cavity radiation would arise from considering the modes (or stand-
ing waves) of a conducting cavity. These are solutions to the wave equation for the EM fields
subject to suitable boundary conditions at the walls, as discussed last semester. As the fields
are vector rather than scalar, the picture is a little more complicated than the solutions to the
Schrödinger equation (3.42) but the end result is essentially the same: modes of a cuboidal
cavity of sides Lx , Ly and Lz are characterised by the three discrete wave-vector components
k = (nx π/Lx , ny π/Ly , nz π/Lz ) with integer ni . Furthermore for each mode the restriction
k · E = 0 allows for two polarisation states for each k.6 But classically there is a big difference
for EM fields: these are the modes, which are discrete, but the amplitudes of the fields of any
mode can take any value. We might expect that in thermal equilibrium the energy in each
mode would be kB T (the energy density is quadratic in E and B, so two degrees of freedom).
Though the modes are discrete it is an excellent approximation for a macroscopic box to replace
the sum with an integral, to give the energy per unit volume in the field for a frequency range
ω → ω + dω, where ω = ck:
$$u(\omega)\,d\omega = \frac{g(\omega)}{V}\,k_BT\,d\omega \quad\Rightarrow\quad u(\omega) = \frac{k_BT}{\pi^2c^3}\,\omega^2 \tag{4.37}$$
6. A nice introduction to cavity modes is given here, but of course it is not examinable.
Experimentally this matches the low-frequency (long-wavelength) spectrum of black-body radi-
ation well and is called the Rayleigh-Jeans law. But clearly it cannot hold for indefinitely large
frequency, as it increases without bound: integrating over all frequencies, this would predict
that the energy in the EM field of a cavity is infinite, and even the coolest black body would
be more than white hot! And indeed, experimentally, deviations are seen at higher frequency:
the observed spectrum reaches a maximum and then falls away exponentially. The failure of
the Rayleigh-Jeans law has been termed the “ultraviolet catastrophe”.7
The correct radiation formula was found in 1900 by Max Planck who, in what he described as
“an act of desperation”, proposed that the amplitude of the fields was not arbitrary, but that the
energy in any given mode was quantised, that is it had to be an integer multiple of an elementary
energy which is proportional to the frequency—in modern terminology, E = nhf = nh̄ω.
This was the original introduction of h, Planck’s constant. Now the single quantum harmonic
oscillator is a problem we solved long back, finding (ignoring the zero-point energy)
$$\langle E\rangle = \frac{\hbar\omega}{e^{\hbar\omega\beta}-1}. \tag{4.38}$$
(See (3.37), subtracting $\tfrac12\hbar\omega$.)
Including the density of states to account for the degeneracy at a given ω gives the energy
per unit volume per unit frequency:
$$u(\omega) = \frac{g(\omega)}{V}\,\frac{\hbar\omega}{e^{\hbar\omega\beta}-1} = \frac{\hbar}{\pi^2c^3}\,\frac{\omega^3}{e^{\hbar\omega\beta}-1} \tag{4.39}$$
For low frequencies, h̄ω ≪ kB T , the denominator can be approximated by h̄ωβ and we recover
the Rayleigh-Jeans result (4.37). This is exactly analogous to recovering equipartition for an
oscillator. But for high frequencies, the exponential in the denominator dominates and u falls
off as ω 3 e−h̄ωβ , curing the ultraviolet catastrophe—and matching experiment nicely with an
appropriately-chosen value of Planck's constant.8
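The crossover between the two regimes is easy to check numerically. A minimal sketch, with the Planck and Rayleigh-Jeans densities coded directly from (4.39) and (4.37) (the chosen temperature and frequencies are merely illustrative):

```python
import math

kB, hbar, c = 1.380649e-23, 1.054571817e-34, 2.99792458e8

def u_planck(w, T):
    """Planck spectral energy density, Eq. (4.39)."""
    return hbar / (math.pi**2 * c**3) * w**3 / math.expm1(hbar * w / (kB * T))

def u_rj(w, T):
    """Rayleigh-Jeans law, Eq. (4.37)."""
    return kB * T / (math.pi**2 * c**3) * w**2

T = 5778.0                     # K
w_low = 1e11                   # rad/s: hbar*w/(kB*T) ~ 1e-4, deep R-J regime
w_high = 20 * kB * T / hbar    # far past the peak
print(u_planck(w_low, T) / u_rj(w_low, T))    # very close to 1
print(u_planck(w_high, T) / u_rj(w_high, T))  # tiny: exponential suppression
```

`math.expm1` keeps the denominator accurate when h̄ωβ is tiny.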
[Figure: left: the Rayleigh-Jeans and Planck distributions u(ω) at a given temperature; right: the Planck distribution for three different temperatures.]
7. It should be noted that the generally accepted radiation law before Planck was the Wien law u ∼ ω 3 e−bω/T ,
which in fact is correct at high frequencies. The Rayleigh-Jeans law, though from the start clearly not general,
was observed to do better at low frequencies. The Planck formula we are about to discuss interpolates between
the two. As a further interesting note, it was Einstein in 1905 who derived the law together with all the
corresponding factors from equipartition, and fully articulated the conflict between this and Planck's law.
8. Even better, since u depends on h̄ and kB separately, he could determine both—the latter was not actually
known at that point. The gas constant R was, so that immediately gave Avogadro's number NA . And the
product NA e (the Faraday constant) was also known from electrolysis, so he found e too, all to within a few
percent of their currently accepted values, and well before any accurate determination from any other method.
Not bad for someone who originally rejected the atomic hypothesis...
The plot above shows, on the left, the Rayleigh-Jeans distribution and the Planck dis-
tribution for a given temperature, and on the right, the Planck distribution for three differ-
ent temperatures. The maximum of the Planck distribution—the frequency with the highest
intensity—is at ω = 2.82kB T /h̄, a relation which is called the Wien displacement law. The ob-
servation of scaling of the maximum with temperature predates Planck, and is obtained from
du/dω = 0 with the numerical solution of 3(1 − e^{−x}) = x at x = 2.8214. The sun's surface
temperature of 5778 K means that its spectrum peaks in the visible range and its light pretty
much defines white. Betelgeuse at 3500 K peaks in the near infrared and appears red, while
Sirius at 10000 K peaks in the UV and appears blue.
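The quoted root of 3(1 − e^{−x}) = x is easy to reproduce by fixed-point iteration; a small sketch (the starting value is arbitrary):

```python
import math

# Solve 3*(1 - exp(-x)) = x for the nonzero root by fixed-point iteration.
x = 3.0
for _ in range(100):
    x = 3.0 * (1.0 - math.exp(-x))
print(x)   # ≈ 2.8214: peak of u(omega) at hbar*omega = 2.8214 kB*T

kB, hbar = 1.380649e-23, 1.054571817e-34
T_sun = 5778.0
w_peak = x * kB * T_sun / hbar
print(w_peak)   # peak angular frequency for the solar surface, ≈ 2.1e15 rad/s
```

The iteration converges because the map has slope 3e^{−x} ≈ 0.18 near the root.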
At this point you might be wondering why we waited till now to cover this application; we
could have done it after the ideal gas in section 3. True. But look again at (4.39). In the context
of the current section, it should look very familiar: in fact it is just what we expect from an
ultrarelativistic Bose gas with chemical potential µ = 0. And indeed we can so interpret it: we
can switch from the classical picture of cavity modes of the EM fields with the energy arbitrarily
quantised, to a picture of a gas of massless photons of energy h̄ω. The two polarisation states
translate to gs = 2 (unexpected for a spin-1 particle but things work a bit differently when they
are massless). Photons, being spin-1, are bosons, so there can be many in any given state. And
so the number of photons for a frequency range ω → ω + dω, where ω = ck and ε = h̄ω, is
$$\langle N(\varepsilon)\rangle\,d\varepsilon = g(\varepsilon)\,n(\varepsilon)\,d\varepsilon$$
and the energy density is (see the end of section 3.8.1 and the problem sheets for more on
changing variables):
$$u(\omega)\,d\omega \equiv \frac{\langle E(\varepsilon)\rangle}{V}\,d\varepsilon = \varepsilon\,\frac{g(\varepsilon)}{V}\,n(\varepsilon)\,d\varepsilon = \varepsilon\,\frac{g_s\,\varepsilon^2}{2\pi^2(\hbar c)^3}\,\frac{1}{e^{\varepsilon\beta}-1}\,d\varepsilon = \frac{\hbar}{\pi^2c^3}\,\frac{\omega^3}{e^{\hbar\omega\beta}-1}\,d\omega \tag{4.40}$$
in agreement with Planck.
Why is the chemical potential for photons zero? The walls of the cavity act as a heat bath
but not in any meaningful sense a particle reservoir: photons don’t exist in the walls, only
energy does. The chemical potential is given by µ = −T (∂Swalls /∂N )E = 0. This situation
occurs wherever particle number is not a conserved quantity.
We should now clarify the relation between cavity and blackbody radiation. Perfect black-
body radiation is obtained as the emission from a small hole in a cavity, and the relation
between the two is that the flux F (ω) of the emitted radiation, that is, the power emitted per
unit emitting area, per unit frequency, is related to the energy density in the box by
$$F(\omega) = \frac{c}{4}\,u(\omega). \tag{4.41}$$
The factor of 4 is the same as enters in the formula for effusion of gas from a small hole. If we
integrate over frequency, we get the total flux from an area A of
$$L = \frac{A\hbar}{4\pi^2c^2}\int_0^\infty \frac{\omega^3}{e^{\hbar\omega\beta}-1}\,d\omega = \frac{A(k_BT)^4}{4\pi^2\hbar^3c^2}\int_0^\infty \frac{x^3}{e^x-1}\,dx = \frac{\pi^2k_B^4}{60\,\hbar^3c^2}\,AT^4 \tag{4.42}$$
The x-integral has the exact value of π 4 /15.9
But this is exactly Stefan's law, L = AσT 4 , except that we have now predicted the value of
the Stefan-Boltzmann constant σ in terms of Planck and Boltzmann's constants:
$$\sigma = \frac{\pi^2k_B^4}{60\,\hbar^3c^2}. \tag{4.43}$$
And indeed it does (of course) match the empirical value.
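A two-line check of (4.43), using CODATA values for the constants:

```python
import math

kB, hbar, c = 1.380649e-23, 1.054571817e-34, 2.99792458e8
sigma = math.pi**2 * kB**4 / (60 * hbar**3 * c**2)
print(sigma)   # ≈ 5.670e-8 W m^-2 K^-4, the Stefan-Boltzmann constant
```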
Stars are not perfect black bodies by any means. The most perfect black body we know is
the cosmic microwave background. If you are familiar with cosmology and the results of recent
experiments such as WMAP and Planck, you may have a picture like the one below on the
right in your head:
But wonderfully informative though that is, it actually shows the angular scale of deviations,
of the order of a few parts in 105 , from a nearly perfect Planck spectrum at a temperature
of 2.728 ± 0.004 K. The figure on the left shows the actual spectrum obtained by the FIRAS
instrument on the COBE satellite. In the original paper10 there are no error bars shown because
they are “a small fraction of the line thickness”; those in the figure are scaled up by 200 times
so they can be seen.
Returning to photons as a Bose gas, we can derive some more properties of radiation. The
total energy is
$$\langle E\rangle = \frac{V\hbar}{\pi^2c^3}\int_0^\infty \frac{\omega^3}{e^{\hbar\omega\beta}-1}\,d\omega = \frac{V(k_BT)^4}{\pi^2(\hbar c)^3}\int_0^\infty \frac{x^3}{e^x-1}\,dx = \frac{\pi^2 k_B^4}{15(\hbar c)^3}\,VT^4 = \frac{4\sigma}{c}\,VT^4 \tag{4.44}$$
4.4.2 Phonons and the Debye model
The figure below is reproduced from Einstein’s paper Ann. Phys. 22, 180, (1907), with the
frequency parameter expressed as a temperature, h̄ωE /kB = 1320 K.
This gives the correct high-temperature limit of the internal energy and heat capacity, but it
can be seen to deviate at low temperatures, and indeed is clearly fundamentally flawed. If one
atom is displaced, it pulls on its neighbours which will also be displaced, and so on. We are
reminded of coupled pendulums; you can set one swinging, but the other will start to swing
too. To analyse the subsequent motion we can use the normal modes of the whole system,
which in this case involve the two pendulums swinging together, either in phase or 180◦ out
of phase; these modes have different frequencies with the in-phase mode being lower than the
out-of-phase one. For N pendulums there will be N modes, the lowest-frequency one having all
swinging together and the highest frequency one having each out of phase with its neighbour,
but with another N − 2 modes of intermediate frequency. Note that the modes (and hence
frequencies) are discrete because the number of pendulums is finite. This is reminiscent of the
modes of the EM field in a cavity, which were discrete because the volume was finite, and the
allowed wave numbers were multiples of π/L.
For a 1D chain of identical atoms of mass m, with Ks being the effective spring constant
between each pair (actually the curvature at the minimum of the interatomic potential), there
are two kinds of modes, longitudinal and transverse. For both, the modes are characterised
by the distance λ over which the pattern of atomic displacements repeats. The length of the
chain sets the maximum λ,12 and the distance a between the atoms sets the minimum; in terms
of wave number, k = nπ/L for n = 1, 2 . . . L/a. The frequency for longitudinal modes was
shown in first year to be ω(k) = 2ω0 sin(ka/2) where ω0 = √(Ks /m), which at low frequency is
ω = kaω0 , a linear dispersion relation reminiscent of ω = ck for photons with vs = aω0 being
the speed of sound in the chain. In a simple model, transverse modes have a lower frequency
than longitudinal modes of the same k; there are two of these for vibrations in the two planes
perpendicular to the direction of the chain.
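The low-k limit of the chain dispersion can be verified numerically; a sketch with illustrative values of Ks , m and a (not taken from the notes or from any particular material):

```python
import math

# Dispersion of the 1D monatomic chain, omega(k) = 2*omega0*sin(k*a/2).
Ks, m, a = 10.0, 4e-26, 3e-10        # illustrative spring constant, mass, spacing
omega0 = math.sqrt(Ks / m)
vs = a * omega0                       # low-k sound speed

def omega(k):
    return 2 * omega0 * math.sin(k * a / 2)

k_small = 0.001 * math.pi / a
print(omega(k_small) / k_small, vs)   # nearly equal: linear dispersion at low k
k_max = math.pi / a
print(omega(k_max) / (2 * omega0))    # 1.0: the band edge at k = pi/a
```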
Introducing quantum mechanics, the energy in each mode will be restricted to multiples
of h̄ω (again ignoring unobservable zero-point energy), and we see that we have a picture of
massless (quasi-)particles corresponding to the excitations in each mode. And either from the
quantum oscillator approach of Section 3.7, or by treating the phonons as a bosonic gas with
zero chemical potential (same argument as for photons, phonons aren’t conserved), we get the
number of phonons in a mode as
$$n(k) = \frac{1}{e^{\hbar\omega(k)\beta}-1} \tag{4.47}$$
Even for 3D monatomic crystals with a cubic structure the possible modes of vibration might
seem overwhelming. But in fact we can, again, characterise the modes by the wave vector in 3D,
k = (nx π/Lx , ny π/Ly , nz π/Lz ) with integer ni , this time though with a maximum ni = Li /a.
In 3D it is planes of atoms which move together, and k is parallel to the normal to the plane.
The total number of modes is three (one longitudinal and two transverse) times the number of
ways of choosing nx , ny , nz with each ni in the allowed range, which is just the product of the
three nmax
i . So the number of modes is 3Lx Ly Lz /a3 = 3V /a3 = 3N , where N is the number of atoms. That
is gratifying, because it is, as it must be, the same as if we treated each atom as vibrating
independently in 3D.
This also means that if we are in the high-temperature regime, in which the energy in every
mode is kB T , we will recover the law of Dulong and Petit, E = 3N kB T and CV = 3N kB , as in
the Einstein model.
To find the internal energy and heat capacity due to lattice vibrations at lower temperatures,
we would like to use an approach like that for photons, integrating the quantity of interest
weighted by g(k)n(k), albeit with a cut-off in k. There are some issues with this. Strictly
speaking the cut-off on k is in cartesian coordinates, and furthermore the frequency is not a
simple function of |k|, independent of direction. In addition the longitudinal and transverse
12. The existence of a maximum λ of course depends on the boundary condition. If the end atoms are fixed,
then λmax = 2L. In fact the modes have the end atoms with maximum displacement (cosines rather than
sines) and again λmax = 2N a = 2L. If periodic boundary conditions are applied, as is common, λmax = L
but there are two modes for each λ and the net effect is the same.
modes do not have the same dispersion relation (wave speed).13 The Debye model ignores these
subtleties, and further assumes a linear dispersion relation ω = vs k. It imposes a cut-off kD on
|k| such that the number of modes is correct:
$$3N = \int_0^{k_D} g(k)\,dk = \frac{3V}{2\pi^2}\,\frac{k_D^3}{3} \quad\Rightarrow\quad k_D = (6\pi^2 n)^{1/3} \tag{4.48}$$
$$\langle E\rangle = \frac{3V}{2\pi^2}\int_0^{k_D} \frac{\hbar v_s\,k^3}{e^{\hbar v_s k\beta}-1}\,dk = \frac{3V}{2\pi^2}\,\frac{(k_BT)^4}{(\hbar v_s)^3}\int_0^{T_D/T}\frac{x^3}{e^x-1}\,dx$$
$$\Rightarrow\quad C_V = -\frac{d\beta}{dT}\,\frac{3V}{2\pi^2}\int_0^{k_D} \frac{(\hbar v_s)^2\,k^4\,e^{\hbar v_s k\beta}}{(e^{\hbar v_s k\beta}-1)^2}\,dk = \frac{3V}{2\pi^2}\,k_B\left(\frac{k_BT}{\hbar v_s}\right)^3\int_0^{T_D/T}\frac{x^4e^x}{(e^x-1)^2}\,dx \tag{4.50}$$
where x = h̄vs kβ and the Debye temperature TD is defined by kB TD = h̄vs kD .
For very low temperatures the upper limit on the x integrals in (4.50) becomes large and as
the integrand falls off exponentially at large x, we can take the cut-off to infinity. Then the
integral in ⟨E⟩ is the same as we met in the context of the black body spectrum and equals
π 4 /15. So
$$\langle E\rangle \to \frac{V\pi^2}{10}\,\frac{(k_BT)^4}{(\hbar v_s)^3} \quad\Rightarrow\quad C_V = \frac{2\pi^2}{5}\,k_BV\left(\frac{k_BT}{\hbar v_s}\right)^3 = \frac{12\pi^4}{5}\,Nk_B\left(\frac{T}{T_D}\right)^3 \tag{4.51}$$
Note that in the first form, the dependence on N has disappeared, which does make
sense: only long-wavelength, low-frequency modes are contributing; the energy per unit volume
only depends on vs , a bulk property which is insensitive to the atomic substructure. For that
reason we expect the result to be reasonably robust, even if some of the model assumptions
are somewhat suspect. And so we recover the CV ∝ T 3 contribution from lattice vibrations
to the specific heat which we saw in the data for potassium in Eq. (4.28). Though TD can
be predicted, it can also be fit to data, as in the figure below on the left, where it can be
shown that the different heat capacity curves for a number of metals collapse to a single curve
if plotted as a function of T /TD . (Note: the numbers given for the Debye temperatures are just
fit parameters, and other sources using different data will not give identical numbers—see for
example Mandl’s fig. 6.7.)
13. That particular point can be circumvented by using an average speed defined by 3/vs3 = 2/vt3 + 1/vl3 , where
vl and vt are the speeds of sound for longitudinal and transverse waves respectively; in the expression for the heat
capacity (4.50) that is equivalent to calculating the contributions from the two types of modes independently.
[Figure: CV /N kB against T /TD . Left: heat-capacity data for several metals collapsing onto the Debye curve. Right: the Debye and Einstein predictions, with the low-temperature region shown in an inset.]
The difference between the Debye and Einstein predictions is shown above on the right. Since
they both have one free parameter, I have chosen ωE = 0.75ωD to get the best agreement.
Clearly experiment is not going to tell the difference between them except at low temperatures
(shown in the inset panel for clarity).
In reality, monatomic simple cubic crystals are not common. However the Debye model
works pretty well for more complicated cases too (none of the metals in the figure above are
simple cubic). Just for the record, we make a few comments. The density of states in k space,
though always derived for a cuboidal box, is independent of the shape of the box. The basic idea
that vibrational modes are countable and that a system of N atoms will have 3N distinct
modes is always valid. The cut-off or maximum value of k is the edge of the Brillouin zone, the
set of vectors that cannot be reduced in length by subtracting a reciprocal lattice vector (just
as in 1D, a standing wave with wave number 3π/4a cannot be distinguished from one of π/4a
because the displacements of the atoms are identical in both). For most structures the Brillouin
zone is a polyhedron which is closer to a sphere than the cube is, so that is actually an advantage
for the Debye model. Finally for a structure with more than one atom per unit cell (as for
any non-monatomic crystal) it is usual to choose the Debye cut-off to reproduce the number of
unit cells in the crystal, not the number of atoms. There will be two types of vibration. Three
for each k are like the ones considered above for which ω ∼ k at low frequencies, which are
termed acoustic modes and which give the dominant low-temperature heat capacity. The rest
are those in which the different types of atom vibrate against each other, termed optical modes
(since if the atoms are charged the oscillating dipoles will interact with EM radiation); their
frequencies do not go to zero as k → 0. If a crude model is to be used, the constant-frequency
Einstein model is more appropriate for these; the correct high-temperature heat capacity will
then be recovered. None of these details of real crystals are examinable in this course.
4.4.3 Bose-Einstein condensation
Let us return to matter, and bosons such as 4 He and 87 Rb, though ignoring any interactions
between them (an approximation the validity of which will, broadly, depend on the density—
but it should be said at the outset that there are some qualitative changes when interactions
are present).
Now for bosons, the occupancy is
$$n(\varepsilon) = \frac{1}{e^{(\varepsilon-\mu)\beta}-1} \tag{4.52}$$
and as we have seen, this does not make sense for ε < µ. So for a Bose gas the chemical
potential must be less than the lowest energy level (often taken for convenience to be 0). For a
sufficiently warm dilute gas µ will be large and negative and we will be in the classical regime,
but as n/nQ grows, µ will increase (ie become less negative), initially just as in the Fermi case.
And as in that case, in practice we find the chemical potential by requiring the density to be
correct: for non-relativistic particles in 3D we have
$$\frac{N}{V} = \int_0^\infty \frac{1}{V}\,g(k)\,n(k)\,dk = \frac{g_s}{2\pi^2}\int_0^\infty \frac{k^2}{e^{(\hbar^2k^2/2m-\mu)\beta}-1}\,dk = \frac{g_s}{4\pi^2}\left(\frac{2m}{\hbar^2\beta}\right)^{3/2}\int_0^\infty\frac{x^{1/2}}{ze^x-1}\,dx \tag{4.53}$$
where z ≡ e−µβ , so that
$$G(z) \equiv \int_0^\infty \frac{x^{1/2}}{ze^x-1}\,dx = 4\pi^2\,\frac{n}{g_s}\left(\frac{\hbar^2}{2mk_BT}\right)^{3/2} = \frac{\sqrt{\pi}}{2}\,\frac{n}{g_s\,n_Q}. \tag{4.55}$$
Here, in contrast to the fermion case, z > 1 always. (Always remember that nQ ∝ T 3/2 .)
As required, the integral grows as z decreases. But it reaches a maximum at z = 1 (µ = 0)
and so there would seem to be a maximum possible value of n/nQ —for a given density, a
minimum temperature! We will return to this, of course. But remaining below this limit on
n/nQ , we obtain the relation between the chemical potential and the number density in the
same way as for fermions; the results are shown in green in the second figure of section 4.3.1.
(The use of units of εF to eliminate explicit density dependence in that figure is purely a matter
of convenience of course, it has no physical meaning for bosons.) The average energy, entropy,
pressure and other properties are similarly calculable numerically in terms of integrals like that
of (4.55), once the chemical potential is fixed.
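Fixing µ numerically amounts to inverting G(z). A sketch using the substitution x = t² (so the integrand is smooth at the origin) and bisection in ln z (the truncation point and grid size are ad hoc):

```python
import math

def G(z, n=4000, tmax=8.0):
    """G(z) = integral_0^inf x^(1/2)/(z e^x - 1) dx, via x = t^2 so the
    integrand 2 t^2/(z e^{t^2} - 1) is smooth at the origin (midpoint rule)."""
    h = tmax / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += 2.0 * t * t / (z * math.exp(t * t) - 1.0)
    return total * h

print(G(1.0))   # maximum value, Gamma(3/2)*zeta(3/2) ≈ 2.315

def solve_z(target, lo=1.0, hi=1e6):
    """Bisect for z = exp(-mu*beta) with G(z) = target (needs target < G(1)).
    G is monotonically decreasing in z, so geometric bisection converges."""
    for _ in range(60):
        mid = math.sqrt(lo * hi)
        if G(mid) > target:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

z = solve_z(0.5)
print(z, G(z))   # G at the returned z reproduces the target 0.5
```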
Below on the left we show plots of g(k)n(k) at fixed temperature for a variety of small
(negative) chemical potentials. (We switch to k rather than ε because the low-k curve is better
behaved.)14 The area under the curve gives the corresponding N ; as expected we see that it
grows as µ rises towards zero from below. And as claimed above, the area reaches a
maximum as µ → 0. No more particles can be accommodated.
14. It takes a little thought to convince one's self that G(z) in (4.55) tends to a finite maximum as z → 1,
since for z = 1 the integrand diverges as x−1/2 for x → 0. But the integral does not diverge; it goes as x1/2
for x → 0, and so the lower limit contributes zero. (Note that ∫₀ᵃ x−1/2 dx = 2a1/2 .) For the full integrand, of
course, the upper limit of the integral and hence the final result is finite. Switching to k, at low k we have
g(k)n(k) ∼ k 2 /(h̄2 k 2 β/2m) = constant, so the problem does not arise.
[Figure: g(k)n(k) against k. Left: fixed T , for µβ = −10⁻⁶, −10⁻³, −0.01 and −0.1. Right: fixed N , for T /TC = 1.01, 1.1, 1.4 and 1.8.]
On the right we show fixed N for various T (in units of “TC ”, here just a shorthand for a
quantity that depends on N with units of temperature; defined below in Eq. (4.57)). Again,
once T = TC and µ = 0, no further adjustment can allow the temperature to be lowered further
while keeping N fixed.
From all of this, we see that the maximum N or minimum T is obtained when µ → 0,
$$\frac{N_C}{V} = \int_0^\infty \frac{1}{V}\,g(k)\,n(k)\,dk = \frac{g_s}{2\pi^2}\int_0^\infty \frac{k^2}{e^{\hbar^2k^2\beta/2m}-1}\,dk = \frac{g_s}{4\pi^2}\left(\frac{2m}{\hbar^2\beta}\right)^{3/2}\int_0^\infty \frac{x^{1/2}}{e^x-1}\,dx$$
$$= \frac{2.31516\,g_s}{4\pi^2}\left(\frac{2mk_BT}{\hbar^2}\right)^{3/2} = 2.61238\,g_s\,n_Q \tag{4.56}$$
Or in terms of temperature,
$$T_C = 3.3125\,\frac{\hbar^2}{mk_B}\left(\frac{n}{g_s}\right)^{2/3}. \tag{4.57}$$
The subscript C could stand for “critical”, though for reasons we will see below it usually
stands for “condensation”.
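Evaluating (4.57) for the rubidium vapour quoted later in this section (n = 2.5 × 10¹² cm⁻³ of ⁸⁷Rb, with gs = 1 assumed for a single trapped spin state) gives the right order of magnitude; the experiment used a harmonic trap, for which the precise formula differs, so only the scale should be compared:

```python
# T_C = 3.3125 * hbar^2/(m kB) * (n/g_s)^(2/3), Eq. (4.57), for 87Rb at the
# JILA density. Uniform-gas formula only: the real gas was harmonically trapped.
hbar, kB, u = 1.054571817e-34, 1.380649e-23, 1.66053907e-27
m = 87 * u               # kg, mass of 87Rb
n = 2.5e12 * 1e6         # m^-3
TC = 3.3125 * hbar**2 / (m * kB) * n**(2 / 3)
print(TC)                # a few tens of nK, same order as the observed 170 nK
```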
But what is going on here? Suppose we just have one state, of energy ε. Then for a given
µ the occupancy is
$$N = \frac{1}{e^{(\varepsilon-\mu)\beta}-1} \quad\Rightarrow\quad \mu = \varepsilon - k_BT\ln\!\left(1+\frac{1}{N}\right) \approx \varepsilon - \frac{k_BT}{N} \;\text{ for } N\gg1. \tag{4.58}$$
There is no limit on N ! µ just gets closer and closer to ε, from below. So why are we having
problems when we allow more than one state (for a particle in a box, using the density of
states)?
The problem is simply that we have said that everything varies smoothly with k, so that
we have replaced a sum over discrete states with a weighted integral over k. But as µ → 0,
n(k) is varying extremely rapidly at low k. And the weighting, the density of states g(k) ∝ k 2 ,
vanishes at the lowest energies, so exactly the states we expect to have the most occupancy
are given a zero weighting! The apparent limit on n/nQ , or on T given n, is an artefact of our
approximation.
In principle, therefore, we should just switch back to a sum over modes. Recall there is
a single ground state, with energy (taking a cube for simplicity) E0 = 3h̄2 π 2 /(2mV 2/3 ), then
three states with energy 6h̄2 π 2 /(2mV 2/3 ), three with 9h̄2 π 2 /(2mV 2/3 ) and so on. We can show (it is set as an exercise on the
problem sheet) that as we lower the temperature below the critical point, n(ε) continues to
vary smoothly over all states except the ground state. So in fact we can continue to use the
density of states, so long as we treat the ground state separately:
$$N = N_0 + \int_0^\infty g(k)\,n(k)\,dk = N_0 + \frac{g_sV}{2\pi^2}\int_0^\infty \frac{k^2}{e^{\hbar^2k^2\beta/2m}-1}\,dk$$
$$= N_0 + \frac{g_sV}{4\pi^2}\left(\frac{2m}{\hbar^2\beta}\right)^{3/2}\int_0^\infty\frac{x^{1/2}}{e^x-1}\,dx = N_0 + 2.61238\,g_s V n_Q = N_0 + N\left(\frac{T}{T_C}\right)^{3/2} \tag{4.59}$$
$$\Rightarrow\quad N_0 = N\left(1-\left(\frac{T}{T_C}\right)^{3/2}\right). \tag{4.60}$$
[Figure: middle panel: condensate fraction N₀/N against T /TC ; right panel: g(k)n(k) against k for T /TC = 0.3, 0.5 and 0.7.]
As the temperature drops, the fraction of the particles in the condensate rises steadily, till as
T → 0, N0 → N , as shown in the middle panel above. The remainder (N − N0 ) corresponds to
the area under the curves in the right hand panel. The condensate is a single, macroscopic,
collective quantum object which (for cold trapped atomic gases) can actually be seen by the
naked eye.
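The condensate fraction (4.60) is trivial to tabulate:

```python
# Condensate fraction N0/N = 1 - (T/T_C)^(3/2), Eq. (4.60), zero above T_C.
def condensate_fraction(t):          # t = T/T_C
    return 1.0 - t**1.5 if t < 1.0 else 0.0

for t in (0.0, 0.3, 0.5, 0.7, 1.0):
    print(t, condensate_fraction(t))
```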
The energy of the gas is given by
$$E = N\varepsilon_0 + \frac{g_sV}{2\pi^2}\int_0^\infty \frac{k^2\,\varepsilon(k)}{e^{\hbar^2k^2\beta/2m}-1}\,dk = N\varepsilon_0 + \frac{g_sV}{4\pi^2}\left(\frac{2m}{\hbar^2}\right)^{3/2}(k_BT)^{5/2}\int_0^\infty\frac{x^{3/2}}{e^x-1}\,dx$$
$$\Rightarrow\quad C_V = \frac{5}{2}\,k_B\,\frac{g_sV}{4\pi^2}\left(\frac{2mk_BT}{\hbar^2}\right)^{3/2}\int_0^\infty\frac{x^{3/2}}{e^x-1}\,dx, \tag{4.61}$$
0
where the x integral is just another number. So the heat capacity below TC is proportional to
T 3/2 .15
It is also interesting to know what it is just above TC , because interesting behaviour in the
heat capacity is often an experimental sign of a phase transition such as “condensation” in
the vapour-liquid context. This is more complicated, because the energy will depend on the
non-vanishing µ which is implicitly a function of T . As before writing z = e−µβ ,
$$E = \frac{g_sV}{4\pi^2}\left(\frac{2m}{\hbar^2}\right)^{3/2}(k_BT)^{5/2}\int_0^\infty\frac{x^{3/2}}{ze^x-1}\,dx$$
$$\Rightarrow\quad C_V = \frac{5}{2}\,k_B\,\frac{g_sV}{4\pi^2}\left(\frac{2mk_BT}{\hbar^2}\right)^{3/2}\int_0^\infty\frac{x^{3/2}}{ze^x-1}\,dx \tag{4.62}$$
$$\qquad + \frac{g_sV}{4\pi^2}\left(\frac{2m}{\hbar^2}\right)^{3/2}(k_BT)^{5/2}\,\frac{1}{k_BT^2}\left(T\frac{\partial\mu}{\partial T}-\mu\right)\int_0^\infty\frac{z\,x^{3/2}\,e^x}{(ze^x-1)^2}\,dx \tag{4.63}$$
The first term is the same as what we found for T < TC , (4.61), so it is continuous. But the
second term vanishes below T = TC and is negative and grows linearly at T > TC . We could
go further in the calculation to get an expression for ∂µ/∂T , but instead we simply plot CV ; it has
indeed a cusp at T = TC . This is indicative of a second-order phase transition (like that of the
Ising paramagnet).
[Figure: CV /N kB against T /TC , showing a cusp at T = TC .]
The relation P = (2/3)E/V continues to hold below TC if the contribution of the condensate is
ignored, which is a good approximation. But with the vanishing of µ, which above TC depends
on N/V , P depends only on the temperature and not on the volume. This in fact is not
physical as the system would not be stable against collapse. But we have ignored interactions;
Bose-Einstein gases of real atoms will have repulsive interactions at short distances.
There are a couple of things in the set-up that, on a second reading, might bother you a
little. Firstly, the ground state does not have zero energy, so the chemical potential below TC
15. The N ε0 in E is not a misprint for N0 ε0 ; see below.
is not actually zero, but given by (from (4.58))
$$\mu = \varepsilon_0 - k_BT\ln\!\left(1+\frac{1}{N_0}\right) \approx \varepsilon_0 - \frac{k_BT}{N_0} \tag{4.65}$$
There are two aspects to this: the ε0 is just a resetting of the zero of energy. In the continuum
contribution one likewise shifts the energy so that the variable of integration x is proportional
to ε − ε0 . (It is this shift that makes the first term in the energy in (4.61) exactly N ε0 , the
zero-point energy of all the particles.) The other aspect is the part that falls as 1/N0 ; for
macroscopic occupancy of the ground state that is the part we are truly ignoring when we
say µ = 0, and it makes no appreciable difference to the continuum part of the distribution.
Second, if the ground state is macroscopically occupied, what about the states just above the
ground state? It is true that the continuum approximation may not get their contribution quite
right. (It is the continuum approximation.) But as you will show on the problem sheet, the
occupancy of the first excited state scales as N 2/3 , which for a macroscopic system is many
orders of magnitude below N . Only the ground state is macroscopically occupied.
Eric A. Cornell, Wolfgang Ketterle, and Carl E. Wieman received the 2001 Nobel Prize
in Physics “for the achievement of Bose-Einstein condensation in dilute gases of alkali atoms,
and for early fundamental studies of the properties of the condensates”.16 The JILA paper
demonstrated that “a condensate was produced in a vapor of rubidium-87 atoms that was
confined by magnetic fields and evaporatively cooled. The condensate fraction first appeared
near a temperature of 170 nanokelvin and a number density of 2.5 × 1012 cm−3 ”.17 The other,
from MIT, used 23 Na. “The condensates contained up to 5 × 105 atoms at densities exceeding
1014 cm−3 . The striking signature of Bose condensation was the sudden appearance of a bimodal
velocity distribution below the critical temperature of ∼ 2 µK. The distribution consisted of
an isotropic thermal distribution and an elliptical core attributed to the expansion of a dense
condensate.”18
The famous image below is from the MIT group and shows the velocity distribution above,
just below and well below TC , with the condensate standing out as a spike at zero momentum,
distinguished from the more diffuse cloud with a thermal distribution.
Another system of bosons that exhibits interesting behaviour at low temperatures has been
known for much longer. 4 He liquefies at 4.2 K, and never solidifies at normal pressures. At 2.2 K,
though, it starts to show very peculiar behaviour, which is known as superfluidity: it behaves
as if it had two fluid components, one normal but the other which can flow without viscosity,
16. The information accompanying the prize citation is here.
17. Observation of Bose-Einstein Condensation in a Dilute Atomic Vapor, M. H. Anderson, J. R. Ensher,
M. R. Matthews, C. E. Wieman, E. A. Cornell, Science 269 (1995) 198.
18. Bose-Einstein Condensation in a Gas of Sodium Atoms, K. B. Davis, M.-O. Mewes, M. R. Andrews,
N. J. van Druten, D. S. Durfee, D. M. Kurn, and W. Ketterle, Phys. Rev. Lett. 75 (1995) 3969.
which will flow up and over the edge of an open container, and cannot rotate as a bulk fluid.
This component is called a superfluid, and as the temperature drops, the superfluid component
tends to 100%. At 2.17 K, the heat capacity shows a pronounced spike (see below). 3 He, on
the other hand, shows no such strange behaviour until the temperature drops to 2 mK. Fritz
London in 1938 suggested that the superfluid is in fact a Bose-Einstein condensate; a rough
estimate of the BEC transition temperature gives about 3 K, not too far off from the observed
one, and if you don’t look too closely the heat capacity curve (below) is similar.19
In fact non-interacting bosons do not exhibit superfluidity (mathematically, that is; they
are not experimentally achievable). Dilute cold atom BECs, which are weakly interacting, do
show superfluidity, and indeed it has been demonstrated that there is a BEC in liquid 4 He. But
the condensate fraction never rises above 10%, and it is clear that the interactions (which are
strong enough to make it a liquid, after all) are too strong for the treatment we have used to
be even an approximate description of the relevant physics.
What about the (fermionic) 3 He? How can that show any kind of condensation, even at
very low temperatures indeed? In fact this is an example of a phenomenon called pairing; if
attractive interactions exist then pairs of fermions can form composite bosons (and of course
all material bosons are composite); so long as kB T is well below the binding energy the pairs
can form a BEC. Pairing is similarly behind the phenomenon of superconductivity and the
superfluidity of the matter of neutron stars. But that is well beyond the scope of this course.
19. Figure from F London, Superfluidity, Wiley 1954. In fact the helium heat capacity diverges, albeit only
logarithmically, at the critical temperature.
Appendix A
Miscellaneous background
A.1 Revision of ideal gas
$$PV = nRT \quad\text{or}\quad PV = Nk_BT \tag{A.1}$$
where N is the number of molecules, n = N/NA is the number of moles (not to be confused
with the number density, N/V , also denoted by n), R = 8.314 J K−1 mol−1 and kB = R/NA =
1.381 × 10−23 J K−1 is Boltzmann's constant. The ideal gas law encompasses Boyle's Law and
Charles’ Law. It requires the temperature to be measured on an absolute scale like Kelvin’s.
Ideal gases have internal energies which depend only on temperature: if CV is the heat
capacity at constant volume,
$$E = E(T) \quad\text{and}\quad dE = C_V\,dT \quad\Rightarrow\quad E = C_VT \;\text{ if } C_V \text{ is constant.} \tag{A.2}$$
In general the heat capacity may change with temperature; however at STP it is usually
adequate to consider it as constant and equal to
$$C_V = \tfrac{1}{2}\,n_f R \tag{A.3}$$
per mole, where nf is the number of active degrees of freedom. For monatomic gases nf = 3
(translational) and for diatomic gases nf = 5 (translational and rotational; vibrational modes
are “frozen out”.)
The heat capacities at constant pressure and at constant volume differ by a constant for
ideal gases:
C_P − C_V = nR. (A.4)
During reversible adiabatic compression or expansion of an ideal gas the pressure and volume
change together in such a way that
P V^γ = constant where γ ≡ C_P /C_V . (A.5)
For a monatomic gas at STP, γ = 5/3 ≈ 1.67; for a diatomic gas, γ = 7/5 = 1.4. Using the
ideal gas law, we also have
T V^(γ−1) = constant and T P^(1/γ−1) = constant. (A.6)
Note that γ − 1 = nR/C_V .
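The three adiabatic relations are equivalent under the ideal gas law, which is easy to confirm numerically. In the Python sketch below (the initial state and γ are illustrative values, not from the notes), two states are linked via T V^(γ−1) = constant, and the other two combinations come out constant as well:

```python
# Link two states of a monatomic ideal gas by T V^(gamma-1) = constant,
# then verify that P V^gamma and T P^(1/gamma - 1) are also constant.
R, n = 8.314, 1.0
gamma = 5.0 / 3.0                    # monatomic ideal gas

T1, V1 = 300.0, 1.0e-3               # initial state (illustrative)
V2 = 2.0e-3                          # adiabatic expansion to double the volume
T2 = T1 * (V1 / V2) ** (gamma - 1)   # from T V^(gamma-1) = constant

P1 = n * R * T1 / V1                 # pressures from PV = nRT
P2 = n * R * T2 / V2

print(P1 * V1**gamma, P2 * V2**gamma)                  # equal
print(T1 * P1**(1/gamma - 1), T2 * P2**(1/gamma - 1))  # equal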
Starting from the fundamental thermodynamic relation 1.11 together with the equation of
state A.1 and the energy-temperature relation A.2, we can show that the entropy change of
n moles of ideal gas is
∆S = nR ln[ (T_f /T_i)^(n_f /2) (V_f /V_i) ]. (A.7)
Since dS = (C_V /T ) dT at constant volume, this can be checked experimentally for any gas which is
close enough to ideal. But as the expression is ill-defined as Ti → 0, classical thermodynamics
cannot predict the absolute entropy even of an ideal gas.
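Equation A.7 can also be verified numerically: integrate dS = (C_V /T) dT along a constant-volume step, then dS = (nR/V) dV along an isothermal step, and compare with the closed form. A Python sketch with illustrative values (not taken from the notes):

```python
# Numerical check of A.7 for one mole of a monatomic ideal gas:
# integrate dS along a heat-then-expand path and compare to the formula.
import math

R, n, nf = 8.314, 1.0, 3          # one mole, n_f = 3 (monatomic)
Cv = 0.5 * nf * n * R             # C_V = (n_f/2) nR

Ti, Tf = 300.0, 600.0             # illustrative temperatures (K)
Vi, Vf = 1.0e-3, 3.0e-3           # illustrative volumes (m^3)

# closed form (A.7)
dS_formula = n * R * math.log((Tf / Ti) ** (nf / 2) * (Vf / Vi))

# midpoint-rule integration over a two-step path
steps = 100_000
dS_num = 0.0
for i in range(steps):            # constant-volume heating: dS = (C_V/T) dT
    T = Ti + (Tf - Ti) * (i + 0.5) / steps
    dS_num += Cv / T * (Tf - Ti) / steps
for i in range(steps):            # isothermal expansion: dS = (nR/V) dV
    V = Vi + (Vf - Vi) * (i + 0.5) / steps
    dS_num += n * R / V * (Vf - Vi) / steps

print(dS_formula, dS_num)         # the two agree closely
```

Because entropy is a state function, any other path between the same endpoints would give the same answer.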
(checking that it really is a maximum that we’ve found). But consider a different problem: on a
particular path across the hill (which does not necessarily reach the summit) what is the highest
point reached? The path may be specified as y = g(x) or more symmetrically as u(x, y) = 0.
This is constrained maximisation: we are constrained to stay on the path.
The trick is to extremise h(x, y) + λu(x, y) with respect to x and y; these two equations
together with the constraint u(x, y) = 0 are enough to fix the three unknowns x_m, y_m and
λ (though the value of the last is uninteresting and not usually found explicitly; this is also
called the method of “undetermined multipliers”.) So for example with a hemispherical hill
h = h_0 (1 − x² − y²) and a straight-line path u(x, y) = y − mx − c = 0 we have¹
∂(h + λu)/∂x = 0 ⇒ −2h_0 x − λm = 0 ⇒ x = −λm/2h_0
∂(h + λu)/∂y = 0 ⇒ −2h_0 y + λ = 0 ⇒ y = λ/2h_0 = −x/m. (A.9)
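The multiplier result can be checked by brute force: eliminating λ between A.9 and the constraint gives x = −mc/(1 + m²), y = c/(1 + m²), and this should coincide with the highest point found by simply scanning along the path. A Python sketch (the values of h_0, m and c are illustrative, not from the notes):

```python
h0, m, c = 1.0, 2.0, 0.5           # illustrative numbers

def h(x, y):                       # height of the hemispherical hill
    return h0 * (1.0 - x**2 - y**2)

# analytic constrained maximum: combine y = -x/m (from A.9) with y = m x + c
x_star = -m * c / (1 + m**2)
y_star = c / (1 + m**2)

# brute-force scan along the path y = m x + c, for x in [-1, 1]
best_h, best_x = max(
    (h(x, m * x + c), x) for x in (-1 + 2 * i / 200_000 for i in range(200_001))
)
print(best_h, h(x_star, y_star))   # the two maxima agree
```

Geometrically the constrained maximum is the point on the line closest to the summit at the origin, which is what the formula for (x_star, y_star) describes.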
sech x ≡ 1/cosh x, cosech x ≡ 1/sinh x, coth x ≡ 1/tanh x. (A.14)
From the definitions above it is easy to show that
d(cosh x)/dx = sinh x, d(sinh x)/dx = cosh x, d(tanh x)/dx = sech² x. (A.15)
Often we are interested in the small- or large-x limits of these functions. What we want is
to find a simple function which approximates to a more complicated one in these limits. So
while it is true that as x → 0, sinh x → 0, that is not usually what we want; what we want is
how it tends to zero.
¹Some presentations of this subject add the equation ∂(h + λu)/∂λ = 0 to the list, but from that we just
recover the imposed constraint u = 0.
From the small-x expansion of the exponential e^x = 1 + x + ½x² + . . . we get
sinh x −→ x, tanh x −→ x, cosh x −→ 1 + ½x² as x → 0. (A.16)
The limit of cosh x often causes problems; whether we keep the x2 term depends on the context,
given that we want to be able to say more than “tends to 0” or “tends to ∞”. It may be useful
to remember instead
cosh x −→ 1 but cosh x − 1 −→ ½x² as x → 0. (A.17)
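These leading behaviours are easy to confirm numerically: each of the ratios printed in the Python sketch below tends to 1 as x → 0, which shows not just the limit but how each function approaches it.

```python
# Ratios of each hyperbolic function to its leading small-x form;
# all three approach 1 as x -> 0, confirming A.16 and A.17.
import math

for x in (0.1, 0.01, 0.001):
    print(f"x={x}: sinh(x)/x = {math.sinh(x)/x:.8f}, "
          f"tanh(x)/x = {math.tanh(x)/x:.8f}, "
          f"(cosh(x)-1)/(x**2/2) = {(math.cosh(x)-1)/(x**2/2):.8f}")
```

Note that simply printing cosh(x) for small x would show a value near 1 and hide the ½x² term; dividing out the expected leading behaviour is what makes the check informative.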