Lecture Notes On Statistical Mechanics and Thermodynamics
Universität Leipzig
Contents

List of Figures
3. Time-evolving ensembles
3.1. Boltzmann Equation in Classical Mechanics
3.2. Boltzmann Equation, Approach to Equilibrium in Quantum Mechanics
4. Equilibrium Ensembles
4.1. Generalities
4.2. Micro-Canonical Ensemble
4.2.1. Micro-Canonical Ensemble in Classical Mechanics
4.2.2. Micro-Canonical Ensemble in Quantum Mechanics
4.2.3. Mixing entropy of the ideal gas
4.3. Canonical Ensemble
4.3.1. Canonical Ensemble in Quantum Mechanics
4.3.2. Canonical Ensemble in Classical Mechanics
4.3.3. Equidistribution Law and Virial Theorem in the Canonical Ensemble
4.4. Grand Canonical Ensemble
4.5. Summary of different equilibrium ensembles
4.6. Approximation methods

List of Figures

1.1. Boltzmann's tomb with his famous entropy formula engraved at the top.
6.1. The triple point of ice, water and vapor in the (P, T) phase diagram
6.2. A large system divided into subsystems I and II by an imaginary wall.
6.3. Change of system from initial state i to final state f along two different paths.
6.4. A curve γ ∶ [0, 1] → R²
6.5. Sketch of the submanifolds A.
6.6. Adiabatics of the ideal gas
6.7. Carnot cycle for an ideal gas. The solid lines indicate isotherms and the dashed lines indicate adiabatics.
6.8. The Carnot cycle in the (T, S)-diagram.
6.9. A generic cyclic process in the (T, S)-diagram.
6.10. A generic cyclic process divided into two parts by an isotherm at temperature TI.
6.11. The process describing the Diesel engine in the (P, V)-diagram.
6.12. Imaginary phase diagram for the case of 6 different phases. At each point on a phase boundary which is not an intersection point, ϕ = 2 phases are supposed to coexist. At each intersection point ϕ = 4 phases are supposed to coexist.
6.13. The phase boundary between solution and a solute.
6.14. Phase boundary of a vapor-solid system in the (P, T)-diagram
1. Introduction and Historical Overview
¹ Of course this theory turned out to be incorrect. Nevertheless, we nowadays know that heat can be
radiated away by particles which we call “photons”. This shows that, in science, even a wrong idea
can contain a germ of truth.
² It seems that Lavoisier's foresight in political matters did not match his superb scientific insight. He
became very wealthy owing to his position as a tax collector during the “Ancien Régime” but got in
trouble for this lucrative but highly unpopular job during the French Revolution and was eventually
sentenced to death by a revolutionary tribunal. After his execution, one onlooker famously remarked:
“It takes one second to chop off a head like this, but centuries to grow a similar one.”
Parallel to this largely phenomenological view of heat, there were also early attempts
to understand this phenomenon from a microscopic angle. This viewpoint seems to
have been first stated in a transparent fashion by D. Bernoulli in 1738 in his work on
hydrodynamics, in which he proposed that heat is transferred from regions with energetic
molecules (high internal energy) to regions with less energetic molecules (low energy).
The microscopic viewpoint ultimately led to the modern "bottom-up" view of heat by
J.C. Maxwell, J. Stefan and especially L. Boltzmann. According to Boltzmann, heat is
associated with a quantity called “entropy” which increases in irreversible processes. In
the context of equilibrium states, entropy can be understood as a measure of the number
of accessible states at a defined energy according to his famous formula
S = kB log W (E) ,
Figure 1.1.: Boltzmann’s tomb with his famous entropy formula engraved at the top.
• Financial markets
• Astronomy
and many more. Here is an (obviously incomplete) list of some key innovations in the
subject:
Timeline
17th century:
18th century:
19th century:
1850 W. Thomson and H. von Helmholtz: impossibility of a perpetuum mobile (2nd law)
1876 (as well as 1896 and 1909) controversy concerning entropy, Poincaré recurrence is
not compatible with macroscopic behavior
20th century:
2. Basic Statistical Notions
• A random variable x can have different outcomes forming a set Ω = {x1, x2, . . .}, e.g. for tossing a coin Ωcoin = {head, tail}, for a die Ωdice = {1, 2, 3, 4, 5, 6}, or for the velocity of a particle Ωvelocity = {v⃗ = (vx, vy, vz) ∈ R³}, etc.
(i) P (E) ≥ 0.
(ii) P (Ω) = 1.
(iii) If E ∩ E′ = ∅, then P(E ∪ E′) = P(E) + P(E′).
In mathematics, the data (Ω, P, {E}) is called a probability space and the above
axioms basically correspond to the axioms for such spaces. For instance, for a fair die
the probabilities would be Pdice({1}) = . . . = Pdice({6}) = 1/6 and E would be any subset
of {1, 2, 3, 4, 5, 6}. In practice, probabilities are determined by repeating the experiment
(independently) many times, e.g. throwing the dice very often. Thus, the “empirical
definition” of the probability of an event E is
P(E) = lim_{N→∞} N_E / N ,   (2.1)
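The empirical definition (2.1) is easy to watch at work in a short simulation; the function name and trial counts below are merely illustrative:

```python
import random

random.seed(0)

# Empirical probability of the event E = {6} for a fair die:
# the relative frequency N_E / N approaches P(E) = 1/6 as N grows.
def empirical_probability(n_throws, outcome=6):
    hits = sum(1 for _ in range(n_throws) if random.randint(1, 6) == outcome)
    return hits / n_throws

for n in (100, 10_000, 1_000_000):
    print(n, empirical_probability(n))
```

For small N the relative frequency fluctuates noticeably; the fluctuations shrink like 1/√N, in line with the discussion of sums of independent variables below.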
∫_{−∞}^{∞} p(x) dx = 1,   0 ≤ p(x) ≤ ∞.
A mathematically more precise way to think about the quantity p(x)dx is provided by
measure theory, i.e. we should really think of p(x)dx = dµ(x) as defining a measure and
of {E} as the corresponding collection of measurable subsets. A typical case is that p is
a smooth (or even just integrable) function on R and that dx is the Lebesgue measure,
with E from the set of all Lebesgue measurable subsets of R. However, we can also
consider more pathological cases, e.g. by allowing p to have certain singularities. It is
possible to define "singular" measures dµ relative to the Lebesgue measure dx, which
cannot be written as p(x)dx with p an integrable function that is non-negative almost
everywhere, such as e.g. the Dirac measure, which is formally written as
p(x) = ∑_{i=1}^{N} p_i δ(x − y_i),   (2.3)
Let us collect some standard notions and terminology associated with probability
spaces:
⟨F(x)⟩ := ∫_{−∞}^{∞} F(x) p(x) d^N x.   (2.4)
Here, the function F (x) should be such that this expression is actually well-defined,
i.e. F should be integrable with respect to the probability measure dµ = p(x)dN x.
Note that it is not automatically guaranteed that the moments are well-defined,
and the same remark applies to the expressions given below. The probability
distribution p can be reconstructed from the moments under certain conditions.
This is known as the “Hamburger moment problem”.
p̃(k) = ∫_{−∞}^{∞} dx e^{−ikx} p(x) = ⟨e^{−ikx}⟩ = ∑_{n=0}^{∞} ((−ik)^n / n!) ⟨x^n⟩.   (2.6)
p(x) = (1/2π) ∫_{−∞}^{∞} dk e^{ikx} p̃(k).   (2.7)
log p̃(k) = ∑_{n=1}^{∞} ((−ik)^n / n!) ⟨x^n⟩_c.   (2.8)
⟨x⟩_c = ⟨x⟩, and conversely the moments are expressed through the connected moments as:
⟨x⟩ = ⟨x⟩_c
⟨x²⟩ = ⟨x²⟩_c + ⟨x⟩_c²
⟨x³⟩ = ⟨x³⟩_c + 3⟨x²⟩_c ⟨x⟩_c + ⟨x⟩_c³
⟨x⁴⟩ = ⟨x⁴⟩_c + 4⟨x³⟩_c ⟨x⟩_c + 3⟨x²⟩_c² + 6⟨x²⟩_c ⟨x⟩_c² + ⟨x⟩_c⁴
Diagrammatically, each connected moment ("cluster") is represented by a blob. The linked cluster theorem
states that the numerical coefficients in front of the various terms can be obtained by
counting the number of ways to break the points into clusters of the given type. A proof of the
linked cluster theorem can be obtained as follows: we write
∑_{m=0}^{∞} ((−ik)^m / m!) ⟨x^m⟩ = e^{∑_{n=1}^{∞} ((−ik)^n / n!) ⟨x^n⟩_c} = ∏_n ∑_{i_n}′ (1/i_n!) ( (−ik)^n ⟨x^n⟩_c / n! )^{i_n},   (2.9)
⟨x^m⟩ = ∑′_{{i_n}} m! ∏_n ⟨x^n⟩_c^{i_n} / ( i_n! (n!)^{i_n} ),   (2.10)
where the primed sum runs over all sets {i_n} with ∑_n n i_n = m.
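The moment–cumulant relations can be checked numerically. As an illustration (not part of the lecture), take the exponential density p(x) = e^{−x} on x ≥ 0, whose connected moments are known to be ⟨x^n⟩_c = (n−1)!:

```python
import numpy as np

# Raw moments <x^n> of p(x) = e^{-x} on x >= 0, computed by quadrature.
x = np.linspace(0.0, 60.0, 200_001)
p = np.exp(-x)
m = [np.trapz(x**n * p, x) for n in range(5)]

# Connected moments (cumulants) of the exponential distribution: (n-1)!
c1, c2, c3, c4 = 1.0, 1.0, 2.0, 6.0

# The linked-cluster combinations reproduce the raw moments.
assert abs(m[2] - (c2 + c1**2)) < 1e-3
assert abs(m[3] - (c3 + 3*c2*c1 + c1**3)) < 1e-3
assert abs(m[4] - (c4 + 4*c3*c1 + 3*c2**2 + 6*c2*c1**2 + c1**4)) < 1e-2
print(m)
```

The coefficients 3 and 4, 3, 6 appearing here are exactly the cluster-counting factors of the linked cluster theorem.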
p(x) = (1/√(2πσ²)) e^{−(x−µ)²/(2σ²)}.   (2.11)
We find µ = ⟨x⟩ and σ² = ⟨x²⟩ − ⟨x⟩² = ⟨x²⟩_c. The higher moments are all expressible
in terms of µ and σ in a systematic fashion. For example:
⟨x2 ⟩ = σ 2 + µ2
⟨x3 ⟩ = 3σ 2 µ + µ3
⟨x4 ⟩ = 3σ 4 + 6σ 2 µ2 + µ4
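These Gaussian moment formulas are easy to verify by numerical integration (a quick sanity check with arbitrarily chosen µ and σ):

```python
import numpy as np

mu, sigma = 0.7, 1.3
x = np.linspace(mu - 12*sigma, mu + 12*sigma, 200_001)
p = np.exp(-(x - mu)**2 / (2*sigma**2)) / np.sqrt(2*np.pi*sigma**2)

def moment(n):
    # n-th raw moment <x^n> of the Gaussian density, by quadrature
    return np.trapz(x**n * p, x)

assert abs(moment(2) - (sigma**2 + mu**2)) < 1e-5
assert abs(moment(3) - (3*sigma**2*mu + mu**3)) < 1e-5
assert abs(moment(4) - (3*sigma**4 + 6*sigma**2*mu**2 + mu**4)) < 1e-4
```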
Fix N and let Ω = {0, 1, . . . , N }. Then the events are subsets of Ω, such as {n}. We
think of n = NA as the number of times an outcome A occurs in N trials, where
0 ≤ q ≤ 1 is the probability for the event A.
P_N({n}) = \binom{N}{n} q^n (1 − q)^{N−n}   (2.13)
⇒ p̃_N(k) = ⟨e^{−ikn}⟩ = ( q e^{−ik} + (1 − q) )^N   (2.14)
This is the limit of the binomial distribution for N → ∞ when x = n and α are fixed,
where q = α/N (rare events). It is given by (x ∈ R₊ = Ω):
p(x) = e^{−α} α^x / Γ(x + 1),   (2.15)
where Γ is the Gamma function¹. In order to derive this as a limit of the binomial
¹ For natural numbers n, we have Γ(n + 1) = n!. For x ≥ 0, we have Γ(x + 1) = ∫_0^∞ dt t^x e^{−t}.
distribution, we start with the characteristic function of the latter, given by:
p̃_N(k) = ( (α/N) e^{−ik} + (1 − α/N) )^N → e^{α(e^{−ik}−1)} = p̃(k), as N → ∞.   (2.16)
The formula for the Poisson distribution then follows from p(x) = (1/2π) ∫ dk p̃(k)eikx
(one might use the residue theorem to evaluate this integral). Alternatively, one
may start from
p_N(x) = ( N(N − 1)⋯(N − x + 1) / (Γ(x + 1) N^x) ) α^x (1 − α/N)^{N−x} → e^{−α} α^x / Γ(x + 1), as N → ∞.   (2.17)
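The limit (2.17) can be observed numerically; the parameter values below are arbitrary:

```python
from math import comb, exp, factorial

alpha = 2.0

def binom_pmf(n, N):
    # Binomial probability with q = alpha/N, eq. (2.13)
    q = alpha / N
    return comb(N, n) * q**n * (1 - q)**(N - n)

def poisson_pmf(n):
    # Poisson probability, eq. (2.15), for integer n
    return exp(-alpha) * alpha**n / factorial(n)

for N in (10, 100, 10_000):
    print(N, [round(binom_pmf(n, N), 5) for n in range(4)])
print("Poisson", [round(poisson_pmf(n), 5) for n in range(4)])
```

Already for moderate N the binomial probabilities are close to their Poisson limit; the error is of order α²/N.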
A standard application of the Poisson distribution is radioactive decay: let q = λ∆t be
the decay probability in a time interval ∆t = T/N. If x denotes the number of decays
during the total time T, then the probability is obtained as:
p(x) = (λT)^x e^{−λT} / Γ(x + 1).   (2.18)
H({σi}) = −J ∑_{⟨ik⟩} σi σk − h ∑_i σi,   (2.19)
where J, h are parameters, and where the first sum is over all lattice bonds ⟨ik⟩ in
the volume V. The second sum is over all lattice sites in V. The probability of a
configuration is then given by the Boltzmann weight
P({σi}) = (1/Z) exp[−βH({σi})].   (2.20)
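For very small systems the Boltzmann weight (2.20) can be evaluated exactly by enumerating all 2^L spin configurations. The 1-dimensional chain with free boundary conditions below, with invented parameter values, is only meant as an illustration:

```python
from itertools import product
from math import exp

J, h, beta, L = 1.0, 0.5, 0.7, 8   # illustrative parameter values

def energy(s):
    # 1d Ising chain with free boundary conditions, cf. eq. (2.19)
    return -J * sum(s[i] * s[i + 1] for i in range(L - 1)) - h * sum(s)

# Boltzmann weights and normalized probabilities for all 2^L configurations
weights = {s: exp(-beta * energy(s)) for s in product((-1, +1), repeat=L)}
Z = sum(weights.values())
prob = {s: w / Z for s, w in weights.items()}

# Mean magnetization per spin in this ensemble (positive, since h > 0)
m = sum(p * sum(s) / L for s, p in prob.items())
print(Z, m)
```

Exact enumeration scales as 2^L, which is precisely why approximation methods (chapter 4) and clever probabilistic tricks are needed for realistic system sizes.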
P(ω) = (1/Z) e^{−µ l(ω) − g n(ω)}.   (2.21)
Here, µ, g are positive constants. For µ ≫ 1, short walks between x and y are
favored, and for g ≫ 1, self-avoiding walks are favored. Z = Z_{x,y}(V, µ, g) is a
normalization constant ensuring that the probabilities add up to unity. Of interest
are e.g. the “free energy density” f = ∣V ∣−1 log Z, or the average number of steps
the walk spends in a given subset S ⊂ V , given by ⟨#{S ∩ ω}⟩.
In general, such observables are very difficult to calculate, but for g = 0 (uncon-
strained walks) there is a nice connection between Z and the Gaussian distribu-
tion, which is the starting point to obtain many further results. Let ∂α f (i) =
f (i + e⃗α ) − f (i) be the “lattice partial derivative” of a function f (i) defined on the
lattice sites i ∈ V , in the direction of the α-th unit vector, e⃗α , α = 1, . . . , d. Let
∑ ∂α2 = ∆ be the “lattice Laplacian”. The lattice Laplacian can be identified with
a matrix ∆ij of size ∣V ∣ × ∣V ∣ defined by ∆f (i) = ∑j ∆ij f (j). Define the covariance
matrix as C = (−∆ + m2 )−1 and consider the corresponding Gaussian measure for
the variables {φi } ∈ R∣V ∣ (one real variable per lattice site in V ). One shows that
Z_{x,y} = ⟨φ_x φ_y⟩ ≡ ( 1 / ((2π)^{|V|/2} (det C)^{1/2}) ) ∫ φ_x φ_y e^{−(1/2) ∑_{ij} φ_i (−∆ + m²)_{ij} φ_j} d^{|V|}φ   (2.22)
then we say that the variables x = (x1 , . . . , xN ) are independent. This notion can
be generalized immediately to any “Cartesian product” Ω = Ω1 × ... × ΩN of proba-
bility spaces. In the case of independent identically distributed real random variables
xi , i = 1, ..., N , there is an important theorem characterizing the limit as N → ∞, which
is treated in more detail in the homework assignments. Basically it says that (under
certain assumptions about p) the random variable y = (1/N) ∑_i (x_i − µ) has, for large N, a
Gaussian distribution with mean 0 and spread σ/√N. Thus, in this sense, a sum of a large number
of independent random variables is approximately Gaussian.
In the context of computer science, the factor kB is dropped, and the natural log is
replaced by the logarithm with base 2, which is natural to use if we think of information
encoded in bits (kB is merely inserted here to be consistent with the conventions in
statistical physics).
More or less evident generalizations exist for more general probability spaces. For
example, for the discrete probability space such as Ω = {1, ..., N } with probabilities
{p1 , . . . , pN } for the elementary events, i.e. P ({i}) = pi , the information entropy is given
by Sinf = −kB ∑_i pi log pi. It can be shown that the information entropy (in computer
science normalization) is roughly equal to the average (with respect to the given proba-
bility distribution) number of yes/no questions necessary to determine whether a given
event has occurred (cf. exercises).
A practical application of information entropy is as follows: suppose one has an en-
semble whose probability distribution p(x) is not completely known. One would like to
make a good guess about p(x) based on some partial information such as a finite number
of moments, or other observables. Thus, suppose that Fi (x), i = 1, ..., n are observables
for which ⟨Fi(x)⟩ = fi are known. Then a good guess, representing in some sense a
minimal bias about p(x), is to maximize Sinf, subject to the n constraints ⟨Fi(x)⟩ = fi.
In the case when the observables are µ and σ, the distribution obtained in this way is
the Gaussian. So the Gaussian is, in this sense, our best guess if we only know µ and σ
(cf. exercises).
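For instance, comparing the differential entropies of a Gaussian and of a uniform density with the same µ and σ confirms that the Gaussian wins; both entropies are known in closed form, so the check is one line (any other matched-variance density would do):

```python
from math import e, log, pi, sqrt

sigma = 2.0  # arbitrary spread; mu drops out of both entropies

# Differential entropy of a Gaussian with spread sigma
h_gauss = 0.5 * log(2 * pi * e * sigma**2)
# Differential entropy of a uniform density with the same variance,
# i.e. of width sqrt(12) * sigma
h_unif = log(sqrt(12.0) * sigma)

print(h_gauss, h_unif)
assert h_gauss > h_unif
```

The difference h_gauss − h_unif = (1/2) log(πe/6) is independent of σ, as it must be, since a shift and rescaling changes both entropies by the same amount.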
∫_Ω ρ(P, Q) d^{3N}P d^{3N}Q = 1,   0 ≤ ρ(P, Q) ≤ ∞.   (2.26)
According to the basic concepts of probability theory, the ensemble average of an observable F(P, Q) is then simply
⟨F(P, Q)⟩ = ∫_Ω F(P, Q) ρ(P, Q) d^{3N}P d^{3N}Q.   (2.27)
The probability distribution ρ(P , Q) represents our limited knowledge about the system
which, in reality, is of course supposed to be described by a single trajectory (P (t), Q(t))
in phase space. In practice, we cannot know what this trajectory is precisely other than
for a very small number of particles N and, in some sense, we do not really want to know
the precise trajectory at all. The idea behind ensembles is rather that the time evolution
(=phase space trajectory (Q(t), P (t))) typically scans the entire accessible phase space
(or sufficiently large parts of it) such that the time average of F equals the ensemble
average of F , i.e. in many cases we expect to have:
lim_{T→∞} (1/T) ∫_0^T F(P(t), Q(t)) dt = ⟨F(P, Q)⟩,   (2.28)
0
for a suitable (stationary) probability density function. This is closely related to the
"ergodic theorem" and to the fact that the equations of motion are derivable
from a (time-independent) Hamiltonian. Hamilton's equations are
ẋ_{iα} = ∂H/∂p_{iα},   ṗ_{iα} = −∂H/∂x_{iα},   (2.29)
² This description is not always appropriate, as the example of a rigid body shows. Here the phase
space coordinates take values in the co-tangent space of the space of all orthogonal frames describing
the configuration of the body, i.e. Ω ≅ T ∗ SO(3), with SO(3) the group of orientation preserving
rotations.
H = ∑_i p⃗i²/2m + ∑_{i<j} V(x⃗i − x⃗j) + ∑_j W(x⃗j),   (2.30)
where the three terms are the kinetic energy, the interaction, and the external potential, respectively.
Figure 2.3.: Evolution of a phase space volume under the flow map Φt .
Proof of the theorem: Let (P′, Q′) = (P(t), Q(t)), such that (P(0), Q(0)) = (P, Q).
Then we have
d^{3N}P′ d^{3N}Q′ = |∂(P′, Q′)/∂(P, Q)| d^{3N}P d^{3N}Q,   (2.31)
and we would like to show that the Jacobian J_{P,Q}(t) := ∂(P′, Q′)/∂(P, Q) equals 1 for
all t. Since the flow evidently satisfies Φ_{t+t′}(P, Q) = Φ_{t′}(Φ_t(P, Q)), the
chain rule and the properties of the Jacobian imply JP ,Q (t + t′ ) = JP ,Q (t)JP ′ ,Q′ (t′ ). We
now show that ∂JP ,Q (0)/∂t = 0. For small t, we can expand as follows:
P′ = P + tṖ + O(t²) = P − t ∂H/∂Q + O(t²),
Q′ = Q + tQ̇ + O(t²) = Q + t ∂H/∂P + O(t²).
It follows that
J_{P,Q}(t) = ∂(P′, Q′)/∂(P, Q)
= det[ 1 + t ( −∂_P ∂_Q H   −∂_Q² H
                 ∂_P² H      ∂_Q ∂_P H ) + O(t²) ]
= 1 + t ∑_{iα} ( −∂²H/(∂p_{iα} ∂x_{iα}) + ∂²H/(∂x_{iα} ∂p_{iα}) ) + O(t²)
= 1 + O(t²),
since the coefficient of the term linear in t (the trace) vanishes by the symmetry of second derivatives.
This implies ∂J_{P,Q}(0)/∂t = 0 (and J_{P,Q}(0) = 1). The functional equation for the
Jacobian then implies that the time derivative vanishes for arbitrary t:
∂J_{P,Q}(t)/∂t = (∂/∂t′) J_{P,Q}(t + t′) |_{t′=0} = J_{P,Q}(t) (∂/∂t′) J_{P′,Q′}(t′) |_{t′=0} = 0.   (2.32)
Together with JP ,Q (0) = 1, this gives the result JP ,Q (t) = 1 for all t, i.e. the flow is
area-preserving.
The flow Φt is not only area preserving on the entire phase-space, but also on the energy
surface ΩE (with the natural integration element understood). Such area-preserving
flows under certain conditions imply that the phase space average equals the time aver-
age, cf. (2.28). This is expressed by the ergodic theorem:
Theorem: Let the orbit (P(t), Q(t)) be dense in Ω_E and let F be continuous. Then the time average
is equal to the ensemble average:
lim_{T→∞} (1/T) ∫_0^T F(P(t), Q(t)) dt = ∫_{Ω_E} F(P, Q) dµ_E,   (2.33)
with dµ_E the normalized natural integration element on Ω_E.
The key hypothesis is that the orbit lies dense in ΩE and that this surface is com-
pact. The first is clearly not the case if there are further constants of motion, since the
orbit must then lie on a submanifold of ΩE corresponding to particular values of these
constants. The Kolmogorov-Arnold-Moser (KAM) theorem shows that small perturba-
tions of systems with sufficiently many constants of motion again possess such invariant
submanifolds, i.e. the ergodic theorem does not hold in such cases. Nevertheless, the
ergodic theorem still remains an important motivation for studying ensembles.
One puzzling consequence of Liouville's theorem is that a trajectory starting at (P0, Q0)
eventually comes back arbitrarily close to that point, a phenomenon called Poincaré recurrence.
An intuitive “proof” of this statement can be given as follows:
Figure 2.4.: Sketch of the situation described in the proof of Poincaré recurrence.
B0 ∩ Bk = ∅ ∀k ∈ N.
This clearly contradicts the assumption that ΩE is compact and therefore the statement
of the theorem has to be true.
Historically, the recurrence argument played an important role in early discussions of the
notion of irreversibility, i.e. the fact that systems generically tend to approach an equi-
librium state, whereas they never seem to spontaneously leave an equilibrium state and
evolve back to the (non-equilibrium) initial conditions. To explain the origin of, and the
mechanisms behind, this irreversibility is one of the major challenges of non-equilibrium
thermodynamics and we shall briefly come back to this point later. For the moment,
we simply note that in practice the recurrence time τrecurrence would be extremely large
compared to the natural scales of the system such as the equilibration time. We will
verify this by investigating the dynamics of a toy model in the appendix. Here we only
give a heuristic explanation. Consider a gas of N particles in a volume V . The volume
is partitioned into sub volumes V1 , V2 of equal size. We start the system in a state where
the atoms only occupy V1 . By the ergodic theorem we estimate that the fraction of time
the system spends in such a state is ⟨χQ∈V1 ⟩ = 2−3N (for an ideal gas), where χQ∈V1 gives
1 if all particles are in V1 , and zero otherwise. For N = 1 mol, i.e. N = O(1023 ), this
fraction is astronomically small. So there is no real puzzle!
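For particles placed independently and uniformly in V, the probability that all N of them sit in the half-volume V1 is 2^{−N}, which a small simulation confirms for modest N (sample sizes here are illustrative):

```python
import random

random.seed(0)

def fraction_all_in_V1(N, samples=200_000):
    # Place N particles uniformly in V; each lands in the half-volume V1
    # with probability 1/2, independently of the others.
    hits = sum(all(random.random() < 0.5 for _ in range(N)) for _ in range(samples))
    return hits / samples

for N in (1, 5, 10):
    print(N, fraction_all_in_V1(N), 2.0 ** -N)
```

Already at N = 50 one would essentially never observe such a configuration in any feasible number of samples, which makes the 10²³-particle case vivid.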
its spectral decomposition3 , the probability for measuring the outcome ai is given by
³ A general self-adjoint operator on a Hilbert space will have a spectral decomposition A = ∫_{−∞}^{∞} a dE_A(a). The spectral measure does not have to be atomic, as suggested by the formula (2.34). The corresponding probability measure is in general dµ(a) = ⟨Ψ∣dE_A(a)Ψ⟩.
Thus, if we assign the state ∣Ψ⟩ to the system, the set of possible measuring outcomes
for A is the probability space Ω = {a1 , a2 , . . . } with (discrete) probability distribution
given by {p1 , p2 , . . .}.
In statistical mechanics we are in a situation where we have incomplete information
about the state of a quantum mechanical system. In particular, we do not want to
prejudice ourselves by ascribing a pure state ∣Ψ⟩ to the system. Instead, we describe it
by a statistical ensemble. Suppose we believe that the system is in the state ∣Ψi ⟩ with
probability pi , where, as usual, ∑ pi = 1, pi ≥ 0. The states ∣Ψi ⟩ should be normalized, i.e.
⟨Ψi ∣Ψi ⟩ = 1, but they do not have to be orthogonal or complete. Then the expectation
value ⟨A⟩ of an operator is defined as
⟨A⟩ = ∑_i p_i ⟨Ψi∣A∣Ψi⟩.
Introducing the density matrix ρ = ∑_i p_i ∣Ψi⟩⟨Ψi∣ this may also be written as
⟨A⟩ = tr(ρA).
The density matrix has the properties trρ = ∑i pi = 1, as well as ρ† = ρ. Furthermore, for
any state ∣Φ⟩ we have
⟨Φ∣ρ∣Φ⟩ = ∑ pi ∣⟨Ψi ∣Φ⟩∣2 ≥ 0.
i
iℏ (d/dt) ∣Ψ(t)⟩ = H∣Ψ(t)⟩
where ∑_i f(Ei) = 1 and pi = f(Ei) > 0 (here, Ei label the eigenvalues of the Hamiltonian
H and ∣Ψi⟩ its eigenstates, i.e. H∣Ψi⟩ = Ei∣Ψi⟩). The characteristic example is given by
f(H) = (1/Z_β) e^{−βH},   (2.38)
where Z_β = ∑_i e^{−βEi}. More generally, if {Qα} are operators commuting with H, then
another choice is
ρ = (1/Z(β, µα)) e^{−βH − ∑_α µα Qα}.   (2.39)
We will come back to discuss such ensembles below in chapter 4.
One often deals with situations in which a system is comprised of two sub-systems A
and B described by Hilbert spaces HA , HB . The total Hilbert space is then H = HA ⊗ HB
(⊗ is the tensor product). If {∣i⟩A } and {∣j⟩B } are orthonormal bases of HA and HB ,
an orthonormal basis of H is given by {∣i, j⟩ = ∣i⟩A ⊗ ∣j⟩B }.
Consider a (pure) state ∣Ψ⟩ in H, i.e. a pure state of the total system. It can be
expanded as
∣Ψ⟩ = ∑_{i,j} c_{i,j} ∣i, j⟩,
with the normalization
∑_{i,j} ∣c_{i,j}∣² = 1.   (2.40)
⟨Ψ∣ (a ⊗ 1_B) ∣Ψ⟩ = ∑_{i,j} ( ∑_k c̄_{i,k} c_{j,k} ) ⟨i∣a∣j⟩_A = tr_A (a ρ_A),
where (ρ_A)_{ji} := ∑_k c̄_{i,k} c_{j,k} defines the reduced density matrix ρ_A of subsystem A.
The reduced density matrix reflects the limited information of an observer only having
access to a subsystem. The quantity
S_{v.N.}(ρ_A) = −k_B tr_A (ρ_A log ρ_A)
is called the entanglement entropy of subsystem A. One shows that S_{v.N.}(ρ_A) =
S_{v.N.}(ρ_B), so it does not matter which of the two subsystems we use to define it.
Example: Let HA = C2 = HB with orthonormal basis {∣ ↑⟩, ∣ ↓⟩} for either system
A or B. The orthonormal basis of H is then given by {∣ ↑↑⟩, ∣ ↑↓⟩, ∣ ↓↑⟩, ∣ ↓↓⟩}.
For the product state ∣Ψ⟩ = ∣ ↑↑⟩ it follows that the reduced density matrix of subsystem A is given by
ρ_A = ∣ ↑⟩⟨↑ ∣.   (2.43)
Consider now the entangled state ∣Ψ⟩ = (1/√2)(∣ ↑↓⟩ − ∣ ↓↑⟩). For an observable of the form ã = a ⊗ 1_B we get
⟨Ψ∣ã∣Ψ⟩ = (1/2) (⟨↑↓ ∣ − ⟨↓↑ ∣) (a ⊗ 1_B) (∣ ↑↓⟩ − ∣ ↓↑⟩)
= (1/2) (⟨↑ ∣a∣ ↑⟩ + ⟨↓ ∣a∣ ↓⟩),   (2.45)
from which it follows that the reduced density matrix of subsystem A is given by
ρ_A = (1/2) (∣ ↑⟩⟨↑ ∣ + ∣ ↓⟩⟨↓ ∣).   (2.46)
S_ent = −k_B tr(ρ_A log ρ_A) = −k_B ( (1/2) log(1/2) + (1/2) log(1/2) ) = k_B log 2.   (2.47)
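The computation (2.45)–(2.47) can be reproduced with a few lines of linear algebra (entropy here in units of kB):

```python
import numpy as np

# Coefficients c_ij of the state (|up,down> - |down,up>)/sqrt(2)
c = np.zeros((2, 2))
c[0, 1] = 1 / np.sqrt(2)
c[1, 0] = -1 / np.sqrt(2)

# (rho_A)_ji = sum_k conj(c_ik) c_jk, i.e. rho_A = c c^dagger for real c
rho_A = c @ c.T

# Entanglement entropy from the eigenvalues of rho_A
evals = np.linalg.eigvalsh(rho_A)
S_ent = -sum(p * np.log(p) for p in evals if p > 0)
print(rho_A)
print(S_ent, np.log(2))
```

For the product state the same code would give a rank-one ρ_A and S_ent = 0, the other extreme of the entanglement entropy on C² ⊗ C².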
3. Time-evolving ensembles
(∂/∂t) ρ_t(P, Q) = (∂ρ_t/∂P)(∂P/∂t) + (∂ρ_t/∂Q)(∂Q/∂t) = {ρ_t, H}(P, Q),   (3.2)
using ∂P/∂t = −∂H/∂Q and ∂Q/∂t = ∂H/∂P,
where {⋅, ⋅} denotes the Poisson bracket. Let us define the 1-particle density f1 by
f1(p⃗1, x⃗1; t) := ⟨ ∑_i δ³(p⃗1 − p⃗i) δ³(x⃗1 − x⃗i) ⟩
= N ∫ ρ_t(p⃗1, x⃗1, p⃗2, x⃗2, . . . , p⃗N, x⃗N) ∏_{i=2}^{N} d³xi d³pi.   (3.3)
Similarly, the 2-particle density is
f2(p⃗1, x⃗1, p⃗2, x⃗2; t) = N(N − 1) ∫ ρ_t(p⃗1, x⃗1, p⃗2, x⃗2, . . . , p⃗N, x⃗N) ∏_{i=3}^{N} d³xi d³pi.   (3.4)
i=3
H_s = ∑_{i=1}^{s} p⃗i²/2m + ∑_{1≤i<j≤s} V(x⃗i − x⃗j) + ∑_{i=1}^{s} W(x⃗i),   (3.5)
∂f_s/∂t − {H_s, f_s} = ∑_{i=1}^{s} ∫ d³p_{s+1} d³x_{s+1} ( ∂V(x⃗i − x⃗_{s+1})/∂x⃗i ) · ( ∂f_{s+1}/∂p⃗i ),   (3.6)
where the left hand side is the streaming term and the right hand side the collision term.
[ ∂/∂t − (∂W/∂x⃗1) · (∂/∂p⃗1) + (p⃗1/m) · (∂/∂x⃗1) ] f1 = ∫ d³p2 d³x2 ( ∂V(x⃗1 − x⃗2)/∂x⃗2 ) · ( ∂f2/∂p⃗1 ),   (3.7)
where F⃗ = −∂W/∂x⃗1 is the external force, v⃗ = p⃗1/m the velocity, and f2 on the right hand side is unknown.
An obvious feature of the BBGKY hierarchy is that the equation for f1 involves f2 , that
for f2 involves f3 , etc. In this sense the equations for the individual fi are not closed.
To get a manageable system, some approximations/truncations are necessary.
In order to derive the Boltzmann equation, the BBGKY-hierarchy is approximated/truncated in the following way:
(a) we set f3 ≈ 0.
(b) we assume that f2 factorizes: f2(p⃗1, x⃗1, p⃗2, x⃗2; t) ≈ f1(p⃗1, x⃗1; t) f1(p⃗2, x⃗2; t).
Let us discuss the conditions under which such assumptions are (approximately) valid.
Basically, one needs to have a sufficiently wide separation of the time-scales of the system.
The relevant time scales are described as follows.
(i) Let v be the typical velocity of gas particles (e.g. v ≈ 100 m/s at room temperature and 1 atm) and let L be the scale over which W(x⃗) varies, i.e. the box size. Then τv := L/v is the extrinsic time scale (e.g. τv ≈ 10⁻⁵ s for L ≈ 1 mm).
(ii) Let d be the range of the interaction V, of the order of the particle diameter; the duration of a single collision is then τc ≈ d/v.
(iii) We can also define the mean free time τx ≈ τc/(nd³) ≈ 1/(nvd³), with n = N/V, which is the average time between subsequent collisions. We have τx ≈ 10⁻⁸ s ≫ τc in our example.
The Boltzmann equation may now be “derived” by looking at the second equation in
the BBGKY hierarchy and neglecting time derivatives. This gives
[ v⃗1 · ∂/∂x⃗1 + v⃗2 · ∂/∂x⃗2 − F⃗(x⃗1 − x⃗2) · ( ∂/∂p⃗1 − ∂/∂p⃗2 ) ] f2 = 0,   (3.8)
The derivation of the Boltzmann equation from this is still rather complicated and we
only state the result, which is:
[ ∂/∂t − F⃗ · ∂/∂p⃗1 + v⃗1 · ∂/∂x⃗1 ] f1(p⃗1, x⃗1; t) =
− ∫ d³p2 d²Ω ∣dσ/dΩ∣ · ∣v⃗1 − v⃗2∣ · [ f1(p⃗1, x⃗1; t) f1(p⃗2, x⃗1; t) − f1(p⃗1′, x⃗1; t) f1(p⃗2′, x⃗1; t) ],   (3.9)
where ∣dσ/dΩ∣ is the differential cross-section and the factor ∣v⃗1 − v⃗2∣ is proportional to the flux.
where Ω = (θ, φ) is the solid angle between p⃗ = p⃗1 − p⃗2 and p⃗′ = p⃗′1 − p⃗′2 , and d2 Ω = sin θdθdφ.
The meaning of the differential cross section ∣dσ/dΩ∣ is shown in the following picture
representing a classical 2-particle scattering process:
(Figure: a classical 2-particle scattering process; the impact parameter b⃗ of the incoming relative momentum p⃗ determines the outgoing direction Ω̂(θ, φ).)
∣dσ/dΩ∣ := Jacobian between b⃗ and Ω̂ = (θ, φ)   (3.10)
= (D/2)² for hard spheres with diameter D.
The integral expression on the right side of the Boltzmann equation (3.9) is called the
collision operator, and is often denoted as C[f1](t, p⃗1, x⃗1). It represents the change in
the 1-particle distribution due to collisions of particles. The two terms in the brackets
[. . .] under the integral in (3.9) can be viewed as taking into account that particles
with momentum p⃗1 are lost or gained, respectively, when momentum is transferred
in a collision process.
It is important to know whether f1(p⃗, x⃗; t) is stationary, i.e. time-independent. Intuitively, this should be the case when the collision term C[f1] vanishes. This in turn should happen if
f1(p⃗1, x⃗; t) f1(p⃗2, x⃗; t) = f1(p⃗1′, x⃗; t) f1(p⃗2′, x⃗; t).   (3.11)
As we will now see, one can derive the functional form of the 1-particle density from this
condition. Taking the logarithm on both sides of (3.11) gives, with F1 = log f1 etc.,
F1 + F2 = F1′ + F2′,   (3.12)
whence F must be a conserved quantity of the collision, i.e. a linear combination of the
collision invariants p⃗²/2m, p⃗ and 1: F = −β p⃗²/2m + α⃗ · p⃗ + γ. It follows, after renaming
constants, that
f1 = c · e^{−β (p⃗ − p⃗0)²/2m}.   (3.13)
The mean kinetic energy is found to be ⟨p⃗²/2m⟩ = 3/(2β), so β = 1/(kB T) is identified with the
inverse temperature of the gas.
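The value ⟨p⃗²/2m⟩ = 3/(2β) follows from three identical one-component Gaussian integrals; a quick numerical check (arbitrary β, m):

```python
import numpy as np

beta, m = 2.0, 1.0
p = np.linspace(-30.0, 30.0, 100_001)
w = np.exp(-beta * p**2 / (2 * m))   # one-component Maxwell-Boltzmann weight

# <p_x^2> = m/beta for a single component; three components give 3/(2 beta)
mean_p2 = np.trapz(p**2 * w, p) / np.trapz(w, p)
mean_kinetic = 3 * mean_p2 / (2 * m)
print(mean_kinetic, 3 / (2 * beta))
```

Each momentum component contributes kB T/2 to the mean kinetic energy, an instance of the equidistribution law discussed in chapter 4.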
This interpretation of β is reinforced by considering a gas of N particles confined to
a box of volume V . The pressure of the gas results from a force K acting on a wall
element of area A, as depicted in the figure below. The force is equal to:
K = (1/∆t) ∫ d³p · #( particles impacting A during ∆t with momenta between p⃗ and p⃗ + dp⃗ ) × ( momentum transfer in x-direction )
= (1/∆t) ∫_{−∞}^{0} dpx ∫_{−∞}^{∞} dpy ∫_{−∞}^{∞} dpz f1(p⃗) (A vx ∆t) · (2px),
where the number of impacting particles is (f1(p⃗) d³p) · (A vx ∆t) and the momentum transfer per particle is 2px.
Note that the first integral is just over half of the range of px, which is due to the fact
that only particles moving in the direction of the wall will hit it.
Together with (3.13) it follows that the pressure P is given by
P = K/A = ∫ d³p f1(p⃗) px²/m = n/β.   (3.15)
Comparing with the equation of state for an ideal gas, P V = N kB T, we get β = 1/(kB T).
(Figure: a particle with momentum p⃗ reflected into p⃗′ by a wall element of area A; during ∆t all suitable particles within a cylinder of length vx ∆t hit the wall.)
In this case we have to deal with a much more complicated f1 , not equal to the
Maxwell-Boltzmann distribution. As the example of an air-flow suggests, the Boltzmann
equation is also closely related to other equations for fluids such as the Euler- or Navier-
Stokes equation, which can be seen to arise as approximations of the Boltzmann equation.
The Boltzmann equation can easily be generalized to a gas consisting of several species
α, β, . . . which are interacting via the 2-body potentials V_{α,β}(x⃗(α) − x⃗(β)). As before,
we can define the 1-particle density f1^{(α)}(p⃗, x⃗; t) for each species. The same derivation
leading to the Boltzmann equation now gives the system of equations
leading to the Boltzmann equation now gives the system of equations
[ ∂/∂t − F⃗^{(α)} · ∂/∂p⃗ + v⃗ · ∂/∂x⃗ ] f1^{(α)} = ∑_β C^{(α,β)},   (3.16)
where
C^{(α,β)} = − ∫ d³p2 d²Ω ∣dσ_{α,β}/dΩ∣ ∣v⃗1 − v⃗2∣ × [ f1^{(α)}(p⃗1, x⃗1; t) f1^{(β)}(p⃗2, x⃗1; t) − f1^{(α)}(p⃗1′, x⃗1; t) f1^{(β)}(p⃗2′, x⃗1; t) ].   (3.17)
This system of equations has great importance in practice e.g. for the evolution of the
abundances of different particle species in the early universe. In this case
f1^{(α)}(p⃗, x⃗; t) ≈ f1^{(α)}(p⃗, t)   (3.18)
are homogeneous distributions and the external force F⃗ on the left hand side of equations
(3.16) is related to the expansion of the universe.
Demanding equilibrium now amounts to
f1^{(α)} ∝ e^{−β (p⃗ − p⃗0^{(α)})²/2m},   (3.20)
i.e. we have the same temperature T for all α. In the context of the early universe
it is essential to study deviations from equilibrium in order to explain the observed
abundances.
By contrast to the original system of equations (Hamilton’s equations or the BBGKY
hierarchy), the Boltzmann equation is irreversible. This can be seen for example by
introducing the function
h(t) = − ∫ d³x d³p f1(p⃗, x⃗; t) log f1(p⃗, x⃗; t),
which is called the Boltzmann H-function. It can be shown (cf. exercises) that ḣ(t) ≥ 0,
with equality if
f1(p⃗1, x⃗; t) f1(p⃗2, x⃗; t) = f1(p⃗1′, x⃗; t) f1(p⃗2′, x⃗; t),
a result which is known as the H-theorem. We just showed that this equality holds if and
only if f1 is given by the Maxwell-Boltzmann distribution. Thus, we conclude that h(t) is
an increasing function, as long as f1 is not equal to the Maxwell-Boltzmann distribution.
In particular, the evolution of f1 , as described by the Boltzmann equation, is irreversible.
Since the Boltzmann equation is only an approximation to the full BBGKY hierarchy,
which is reversible, there is no mathematical inconsistency. However, it is not clear,
a priori, at which stage of the derivation the irreversibility has been allowed to enter.
Looking at the approximations (a) and (b) made above, it is clear that the assumption
that the 2-particle correlations f2 are factorized, as in (b), cannot be exactly true,
since the outgoing momenta of the particles are correlated. Although this correlation is
extremely small after several collisions, it is not exactly zero. Our decision to neglect it
can be viewed as one reason for the emergence of irreversibility on a macroscopic scale.
The close analogy between the definition of the Boltzmann H-function and the infor-
mation entropy Sinf , as defined in (2.24), together with the monotonicity of h(t) suggest
that h should represent some sort of entropy of the system. The H-theorem is then
viewed as a “derivation” of the 2nd law of thermodynamics (see Chapter 6). However,
this point of view is not entirely correct, since h(t) only depends on the 1-particle density
f1 and not on the higher particle densities fs , which in general should also contribute
to the entropy. It is not clear how an entropy with sensible properties has to be defined
in a completely general situation, in particular when the above approximations (a) and
(b) are not justified.
3.2. Boltzmann Equation, Approach to Equilibrium in Quantum Mechanics

For the quantum mechanical analog of the Boltzmann equation, one can again derive a
corresponding H-theorem. Rather than explaining the details, we give a simplified
“derivation” of the H-theorem, which will also allow us to introduce a simple-minded
but very useful approximation of the dynamics of probabilities, discussed in more
detail in the Appendix.
The basic idea is to ascribe the approach to equilibrium to an incomplete knowledge
of the true dynamics due to perturbations. The true Hamiltonian is written as
H = H0 + H1 , (3.22)
where H1 is a tiny perturbation over which we do not have control. For simplicity, we
assume that the spectrum of the unperturbed Hamiltonian H0 is discrete and we write
H0 ∣n⟩ = En ∣n⟩. For a typical eigenstate ∣n⟩ we then have
⟨n∣H₁∣n⟩ / E_n ≪ 1.   (3.23)
Let pn be the probability that the system is in the state ∣n⟩, i.e. we ascribe to the system
the density matrix ρ = ∑n pn ∣n⟩⟨n∣. For generic perturbations H1 , this ensemble is not
stationary with respect to the true dynamics because [ρ, H] ≠ 0. Consequently, the von
Neumann entropy Sv.N. of ρ(t) = eitH ρe−itH depends upon time. We define this to be
the H-function
h(t) ∶= Sv.N. (ρ(t)). (3.24)
Next, we approximate the dynamics by imagining that our perturbation H₁ will cause
jumps from state ∣i⟩ to state ∣j⟩ leading to time-dependent probabilities as described by
the master equation1

dp_i(t)/dt = ∑_j T_ij [ p_j(t) − p_i(t) ],   (3.25)

where T_ij is the transition amplitude2 of going from state ∣i⟩ to state ∣j⟩. Thus, the
approximated, time-dependent density matrix is ρ(t) = ∑_n p_n(t) ∣n⟩⟨n∣, with p_n(t) obeying
the master equation. Under these approximations it is straightforward to calculate that

ḣ(t) = (k_B/2) ∑_{i,j} T_ij [ p_i(t) − p_j(t) ][ log p_i(t) − log p_j(t) ] ≥ 0.   (3.26)
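The monotonicity expressed by (3.26) is easy to check numerically. The following sketch (with illustrative rates T_ij and an illustrative initial distribution, not taken from the text) integrates the master equation for a symmetric rate matrix and verifies that the entropy −∑ p_i log p_i (k_B = 1) never decreases and that p approaches the uniform distribution:

```python
import numpy as np

rng = np.random.default_rng(0)

# Symmetric transition rates T_ij = T_ji >= 0 (hypothetical values for illustration)
n = 5
T = rng.random((n, n))
T = 0.5 * (T + T.T)
np.fill_diagonal(T, 0.0)

# Initial probabilities, far from equilibrium
p = np.array([0.9, 0.05, 0.03, 0.01, 0.01])

def entropy(p):
    """Entropy -sum p log p, with k_B = 1."""
    q = p[p > 0]
    return -np.sum(q * np.log(q))

dt = 1e-3
entropies = [entropy(p)]
for _ in range(20000):
    # master equation (3.25): dp_i/dt = sum_j T_ij (p_j - p_i)
    dp = T @ p - T.sum(axis=1) * p
    p = p + dt * dp
    entropies.append(entropy(p))

# h(t) is monotonically non-decreasing; p tends to the uniform distribution
assert all(b >= a - 1e-12 for a, b in zip(entropies, entropies[1:]))
assert np.allclose(p, 1.0 / n, atol=1e-3)
```

Note that the symmetry T_ij = T_ji is essential here: it guarantees both conservation of ∑ p_i and the sign of (3.26).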
The latter inequality follows from the fact that both terms in parentheses [...] have the
same sign, just as in the proof of the classical H-theorem (exercises).

1 This equation can be viewed as a discretized analog of the Boltzmann equation in the present context.
See the Appendix for further discussion of this equation.
2 According to Fermi’s golden rule, the transition amplitude is given by

T_ij = (2π n / ħ) ∣⟨i∣H₁∣j⟩∣² ≥ 0,

where n is the density of final states.

Note that if we
had defined h(t) as the von Neumann entropy, using a density matrix ρ that is diagonal
in an eigenbasis of the full Hamiltonian H (rather than the unperturbed Hamiltonian),
then we would have obtained [ρ, H] = 0 and consequently ρ(t) = ρ, i.e. a constant
h(t). Thus, in this approach, the H-theorem is viewed as a consequence of our partial
ignorance about the system, which prompts us to ascribe to it a density matrix ρ(t)
which is diagonal with respect to H0 . In order to justify working with a density matrix
ρ that is diagonal with respect to H0 (and therefore also in order to explain the approach
to equilibrium), one may argue very roughly as follows: suppose that we start with a
system in a state ∣Ψ⟩ = ∑_n γ_n ∣n⟩ that is not an eigenstate of the true Hamiltonian H. Let
us write

∣Ψ(t)⟩ = ∑_n γ_n(t) e^{iE_n t/ħ} ∣n⟩ ≡ e^{iHt} ∣Ψ⟩.

For H₁ = 0 we would have γ_n(t) = γ_n = const.,
but for H1 ≠ 0 this is typically not the case. The time average of an operator (observable)
A is given by
lim_{T→∞} (1/T) ∫₀^T ⟨Ψ(t)∣A∣Ψ(t)⟩ dt = lim_{T→∞} tr( ρ(T) A ),   (3.27)
with
⟨n∣ρ(T)∣m⟩ = (1/T) ∫₀^T γ_n(t) γ̄_m(t) e^{it(E_n − E_m)/ħ} dt.   (3.28)
For T → ∞ the oscillating phase factor e^{it(E_n−E_m)/ħ} is expected to cause the integral to
vanish for E_n ≠ E_m, such that ⟨n∣ρ(T)∣m⟩ → p_n δ_{n,m} as T → ∞. It follows that
lim_{T→∞} (1/T) ∫₀^T ⟨Ψ(t)∣A∣Ψ(t)⟩ dt = tr(Aρ),   (3.29)
where the density matrix ρ is ρ = ∑_n p_n ∣n⟩⟨n∣. Since [ρ, H₀] = 0, the ensemble described
by ρ is stationary with respect to H₀. The underlying reason for the equilibration is that while ⟨n∣H₁∣n⟩ is
≪ E_n, it can be large compared to ∆E_n = E_n − E_{n+1} = O(e^{−N}) (where N is the particle
number) and can therefore induce transitions causing the system to equilibrate.
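The dephasing mechanism behind (3.28) can be illustrated numerically: for non-degenerate energies, the time average of the off-diagonal matrix elements tends to zero, while the diagonal retains p_n = ∣γ_n∣². A sketch with illustrative energies and fixed coefficients γ_n (i.e. taking H₁ = 0 for simplicity):

```python
import numpy as np

# Energy levels and coefficients gamma_n (arbitrary illustrative values)
E = np.array([0.3, 1.1, 2.7])
gamma = np.array([0.6, 0.5, 0.4 + 0.3j])
gamma = gamma / np.linalg.norm(gamma)

def rho_bar(T, steps=200000):
    """Time-averaged density matrix <n|rho(T)|m> = (1/T) int_0^T
    gamma_n gamma_m^* exp(i t (E_n - E_m)) dt, evaluated numerically
    (units with hbar = 1)."""
    t = np.linspace(0.0, T, steps)
    rho = np.zeros((3, 3), dtype=complex)
    for n in range(3):
        for m in range(3):
            phase = np.exp(1j * t * (E[n] - E[m]))
            rho[n, m] = gamma[n] * np.conj(gamma[m]) * phase.mean()
    return rho

rho = rho_bar(2000.0)
# Off-diagonal elements dephase away; the diagonal gives p_n = |gamma_n|^2
off = rho - np.diag(np.diag(rho))
assert np.max(np.abs(off)) < 1e-2
assert np.allclose(np.diag(rho).real, np.abs(gamma) ** 2, atol=1e-6)
```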
4. Equilibrium Ensembles
4.1. Generalities
In the probabilistic description of a system with a large number of constituents one
considers probability distributions (=ensembles) ρ(P , Q) on phase space, rather than
individual trajectories. In the previous section, we have given various arguments leading
to the expectation that the time evolution of an ensemble will generally lead to an equi-
librium ensemble. The study of such ensembles is the subject of equilibrium statistical
mechanics. Standard equilibrium ensembles are the micro-canonical, the canonical, and
the grand canonical ensemble, which we now discuss in turn.

4.2. Micro-Canonical Ensemble

4.2.1. Micro-Canonical Ensemble in Classical Mechanics

Recall that in classical mechanics the phase space Ω of a system consisting of N particles
without internal degrees of freedom is given by
Ω = R^{6N}.   (4.1)

The states of energy E form the energy surface

Ω_E = { (P, Q) ∈ Ω ∣ H(P, Q) = E },   (4.2)

where H denotes the Hamiltonian of the system. In the micro-canonical ensemble each
point of Ω_E is considered to be equally likely. In order to write down the corresponding
ensemble, i.e. the density function ρ(P, Q), we define the invariant volume ∣Ω_E∣ of Ω_E by

∣Ω_E∣ := lim_{∆E→0} (1/∆E) ∫_{E−∆E ≤ H(P,Q) ≤ E} d^{3N}P d^{3N}Q,   (4.3)
or equivalently

∣Ω_E∣ = ∂Φ(E)/∂E,  with  Φ(E) = ∫_{H(P,Q) ≤ E} d^{3N}P d^{3N}Q.   (4.4)

The micro-canonical density function is then

ρ(P, Q) = (1/∣Ω_E∣) δ( H(P, Q) − E ).   (4.5)
To avoid subtleties coming from the δ-function for sharp energy one sometimes replaces
this expression by

ρ(P, Q) = (1/∣{E−∆E ≤ H(P,Q) ≤ E}∣) ⋅ { 1 if E−∆E ≤ H(P, Q) ≤ E;  0 otherwise }.   (4.6)
Strictly speaking, this depends not only on E but also on ∆E. But in typical cases
∣ΩE ∣ depends exponentially on E, so there is practically no difference between these
two expressions for ρ(P , Q) as long as ∆E ≲ E. We may alternatively write the second
definition as:
ρ = (1/W(E)) [ Θ(H − E + ∆E) − Θ(H − E) ],  where  W(E) = ∣{E−∆E ≤ H(P,Q) ≤ E}∣.   (4.7)
Here we have used the Heaviside step function Θ, defined by
Θ(E) = { 1 for E > 0;  0 otherwise }.
Boltzmann defined the entropy of the micro-canonical ensemble as

S(E) := k_B log W(E).   (4.8)

As we have already said, in typical cases, changing W(E) in this definition to ∣Ω_E∣
will not significantly change the result. It is not hard to see that we may equivalently
write in either case

S(E) = −k_B ∫ ρ(P, Q) log ρ(P, Q) d^{3N}P d^{3N}Q = S_inf(ρ),   (4.10)
i.e. Boltzmann’s definition of entropy coincides with the definition of the information
entropy (2.24) of the microcanonical ensemble ρ. As defined, S is a function of E
and implicitly V , N , since these enter the definition of the Hamiltonian and phase space.
Sometimes one also specifies other constants of motion or parameters of the system other
than E when defining S. Denoting these constants collectively as {Iα }, one defines W
accordingly with respect to E and {I_α} by restricting the energy surface to those points
that are also compatible with the prescribed values of {I_α}.

Example:
The ideal gas of N particles in a box has the Hamiltonian

H = ∑_{i=1}^N ( p⃗_i²/2m + W(x⃗_i) ),

where the external potential W represents the walls of a box of volume V. For a box with
hard walls we take, for example,

W(x⃗) = { 0 inside V;  ∞ outside V }.   (4.12)
The energy surface is then

Ω_E = { (P, Q) ∈ Ω ∣ x⃗_i inside the box,  ∑_{i=1}^N p⃗_i² = 2Em },   (4.13)

where the positions contribute a factor V^N and the momenta lie on a sphere of dimension
3N − 1 and radius √(2Em). This gives

∣Ω_E∣ = V^N ⋅ m (2mE)^{(3N−2)/2} ⋅ area(S^{3N−1}),  with  area(S^{d−1}) = 2π^{d/2}/Γ(d/2).   (4.14)
Here, Γ(x) = (x − 1)! denotes the Γ-function. The entropy S(E, V, N) is therefore given by

S(E, V, N) ≈ k_B [ N log V + (3N/2) log(2πmE) − (3N/2) log(3N/2) + 3N/2 ],   (4.15)
where we used Stirling’s approximation

log x! ≈ ∑_{i=1}^x log i ≈ ∫₁^x log y dy = x log x − x + 1   ⇒   x! ≈ e^{−x} x^x.

This can be rewritten as

S(E, V, N) ≈ N k_B log [ V ( 4πemE/3N )^{3/2} ].   (4.16)
Given the function S(E, V , N ) for a system, one can define the corresponding tem-
perature, pressure and chemical potential as follows:
1/T := ∂S/∂E ∣_{V,N},   P := T ∂S/∂V ∣_{E,N},   µ := −T ∂S/∂N ∣_{E,V}.   (4.17)
For the ideal classical gas this definition, together with (4.16), yields for instance
1/T = ∂S/∂E = (3/2) N k_B / E,   (4.18)

i.e.

E = (3/2) N k_B T.   (4.19)
2
This formula states that for the ideal gas we have the equidistribution law
(average energy) / (degree of freedom) = (1/2) k_B T.   (4.20)
One can similarly verify that the abstract definition of P in (4.17) above gives
P V = N k_B T,   (4.21)
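Both relations can be checked directly from (4.16) and the definitions (4.17) by numerical differentiation. The parameter values below (a helium-like atomic mass, and arbitrary V, E) are purely illustrative:

```python
import numpy as np

kB = 1.380649e-23   # J/K
m = 6.6e-27         # kg, roughly a helium atom (illustrative)

def S(E, V, N):
    """Micro-canonical ideal gas entropy, eq. (4.16)."""
    return N * kB * np.log(V * (4 * np.pi * np.e * m * E / (3 * N)) ** 1.5)

N = 1e23
V = 1e-3            # m^3
E = 300.0           # J

# 1/T = dS/dE at fixed V, N (central finite difference)
dE = E * 1e-6
invT = (S(E + dE, V, N) - S(E - dE, V, N)) / (2 * dE)
T = 1.0 / invT

# P = T dS/dV at fixed E, N
dV = V * 1e-6
P = T * (S(E, V + dV, N) - S(E, V - dV, N)) / (2 * dV)

assert np.isclose(E, 1.5 * N * kB * T, rtol=1e-4)   # E = (3/2) N kB T
assert np.isclose(P * V, N * kB * T, rtol=1e-4)     # P V = N kB T
```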
[Figure: gas of N particles in a cylinder of cross-section A, closed on top by a piston of
weight F = mg at height z.]

Here, we obviously have P V = (mg/A) ⋅ Az = mgz. From the microcanonical ensemble for
the combined piston-gas system, the total energy is obtained as

E_total = E_gas + mgz = E_gas + P V,   (4.22)
where we have neglected the kinetic energy p2 /2m of the piston (this could be made
more rigorous by letting m → ∞, g → 0). Next, we calculate
W_total ∝ ∫ dz W_gas(E_total − P V, V, N),

with V = Az. We evaluate the integral through its value at the maximum, which is
located at the point at which

0 = (d/dz) W_gas(E_total − P V, V, N) ∝ (d/dV) W_gas(E_total − P V, V, N)

  = (∂W_gas/∂E)(−P) + ∂W_gas/∂V = ( −(P/k_B) ∂S_gas/∂E + (1/k_B) ∂S_gas/∂V ) e^{S_gas/k_B}.

This yields

∂S_gas/∂V ∣_{E,N} = P ∂S_gas/∂E = P/T,   (4.23)

consistent with the definition of P in (4.17).
Next, we want to relate the temperature T defined in (4.17) to the parameter β that arose
in the Maxwell-Boltzmann distribution (3.13), which we also interpreted
as temperature there. We first ask the following question: What is the probability for
finding particle number 1 with momentum lying between p⃗₁ and p⃗₁ + dp⃗₁? The answer
is: W(p⃗₁) d³p₁, where W(p⃗₁) is given by

W(p⃗₁) = ∫ ρ(P, Q) d³p₂ . . . d³p_N d³x₁ . . . d³x_N.   (4.25)
We wish to calculate this for the ideal gas. To this end we introduce the Hamiltonian
H ′ and the kinetic energy E ′ for the remaining atoms:
H′ = ∑_{i=2}^N ( p⃗_i²/2m + W(x⃗_i) ),   (4.26)

E′ = E − p⃗₁²/2m,  so that  E − H = E′ − H′.   (4.27)
For the ideal gas we then get

W(p⃗₁) = (V/∣Ω_{E,N}∣) ∫ δ(E − H) ∏_{i=2}^N d³p_i d³x_i = V ∣Ω_{E′,N−1}∣ / ∣Ω_{E,N}∣

       = ( (3N/2 − 1)! / ( π^{3/2} (3N/2 − 5/2)! (2mE)^{3/2} ) ) (E′/E)^{3N/2 − 5/2}.   (4.28)
With the approximation

(3N/2 + a)! / (3N/2 + b)! ≈ (3N/2)^{a−b},  for a, b ≪ 3N/2,

this becomes

W(p⃗₁) ≈ ( 3N/(4πmE) )^{3/2} ( 1 − p⃗₁²/(2mE) )^{3N/2 − 5/2}.   (4.29)
Using
(1 − a/N)^{bN} → e^{−ab}  as N → ∞,

and β = 3N/(2E) (⇔ E = (3/2) N k_B T), we find that

( 1 − p⃗₁²/(2mE) )^{3N/2 − 5/2} → e^{−(3N/2) ⋅ p⃗₁²/(2mE)} = e^{−β p⃗₁²/2m}  as N → ∞,   (4.30)
which confirms our interpretation of β as β = 1/(k_B T).
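The emergence of the Maxwell-Boltzmann marginal from the uniform distribution on the energy sphere can also be seen in a small numerical experiment: sampling P uniformly on the sphere ∑ p⃗_i² = 2mE (via normalized Gaussian vectors) and looking at the components of p⃗₁, one finds a Gaussian of variance m k_B T with k_B T = 2E/(3N). The parameters below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

m = 1.0
N = 2000          # particles
E = 3000.0        # total kinetic energy, so kB T = 2E/(3N) = 1
kBT = 2 * E / (3 * N)

# Sample P uniformly on the sphere sum p_i^2 = 2mE in 3N dimensions:
# draw a standard Gaussian vector and rescale it to the correct radius.
samples = 2000
p1_components = []
for _ in range(samples):
    g = rng.standard_normal(3 * N)
    P = g * np.sqrt(2 * m * E) / np.linalg.norm(g)
    p1_components.extend(P[:3])       # momentum of particle 1

p1 = np.asarray(p1_components)
# Marginal of one momentum component is Gaussian with variance m kB T
assert abs(p1.mean()) < 0.1
assert np.isclose(p1.var(), m * kBT, rtol=0.1)
```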
We can also confirm the interpretation of β by the following consideration: consider
two initially isolated systems and put them in thermal contact. The resulting joint
probability distribution is given by
ρ(P, Q) = (1/∣Ω_E∣) δ( H₁(P₁, Q₁) + H₂(P₂, Q₂) − E ),   (4.32)

where the two terms in the δ-function are the Hamiltonians of system 1 and system 2.
Since only the overall energy is fixed, we may write for the total allowed phase space
volume (exercise):

∣Ω_E∣ = ∫ dE₁ ∣Ω_{E₁}∣ ⋅ ∣Ω_{E−E₁}∣.   (4.33)
For typical systems, the integrand is very sharply peaked at the maximum (E1∗ , E2∗ ), as
depicted in the following figure:
[Plot: ∣Ω_{E₁}∣ ⋅ ∣Ω_{E−E₁}∣ as a function of E₁, sharply peaked at E₁ = E₁*.]
Figure 4.2.: The joint number of states for two systems in thermal contact.
At the maximum we have ∂S₁/∂E₁ (E₁*) = ∂S₂/∂E₂ (E₂*), from which we get the relation:

1/T₁ = 1/T₂ = 1/T  (uniformity of temperature).   (4.34)
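A quick numerical illustration of (4.34): maximizing ∣Ω_{E₁}∣ ⋅ ∣Ω_{E−E₁}∣ for two ideal gases, using ∣Ω_E∣ ∝ E^{3N/2} (up to E-independent factors), puts the maximum exactly where the energy per particle, and hence the temperature, agrees in both subsystems. The particle numbers are illustrative:

```python
import numpy as np

# Ideal-gas density of states, up to E-independent factors: |Omega_E| ~ E^(3N/2)
N1, N2 = 40, 80
E = 1.0

E1 = np.linspace(1e-4, E - 1e-4, 200001)
# log of the integrand |Omega_{E1}| * |Omega_{E-E1}|
logW = 1.5 * N1 * np.log(E1) + 1.5 * N2 * np.log(E - E1)

E1_star = E1[np.argmax(logW)]
# At the maximum, (3/2) N1 kB / E1* = (3/2) N2 kB / E2*,
# i.e. E1*/N1 = E2*/N2: equal energy per particle, equal temperature
assert np.isclose(E1_star, E * N1 / (N1 + N2), atol=1e-4)
```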
Since one expects the function to be very sharply peaked at (E₁*, E₂*), the integral over E₁
is well approximated by the value of the integrand at the maximum, ∣Ω_E∣ ≈ ∣Ω_{E₁*}∣ ⋅ ∣Ω_{E₂*}∣,
so that S(E) ≈ S₁(E₁*) + S₂(E₂*),
which means that the entropy is (approximately) additive. Note that from the condition
of (E₁*, E₂*) being a genuine maximum (not just a stationary point), one gets the
important stability condition
∂²S₁/∂E₁² + ∂²S₂/∂E₂² ≤ 0,   (4.35)
implying ∂²S/∂E² ≤ 0 if applied to two copies of the same system. We can apply the same
considerations if S depends on additional parameters, such as other constants of motion.
Denoting the parameters collectively as X = (X1 , ..., Xn ), the stability condition becomes
∑_{i,j} v_i v_j ∂²S/(∂X_i ∂X_j) ≤ 0,   (4.36)
for any choice of displacements vi (negativity of the Hessian matrix). Thus, in this case,
S is a concave function of its arguments. Otherwise, if the Hessian matrix has a positive
eigenvalue e.g. in the i-th coordinate direction, then the corresponding displacement vi
will drive the system to an inhomogeneous state, i.e. one where the quantity Xi takes
different values in different parts of the system (different phases).
4.2.2. Microcanonical Ensemble in Quantum Mechanics

Let H be the Hamiltonian of a system with eigenstates ∣n⟩ and eigenvalues E_n, i.e.
H∣n⟩ = En ∣n⟩, and consider the density matrix
ρ = (1/W) ∑_{n: E−∆E ≤ E_n ≤ E} ∣n⟩⟨n∣,   (4.37)
where the normalization constant W is chosen such that trρ = 1. The density matrix ρ is
analogous to the distribution function ρ(P , Q) in the classical microcanonical ensemble,
eq. (4.6), since it effectively amounts to giving equal probability to all eigenstates with
energies lying between E and E − ∆E. By analogy with the classical case we again define the entropy by S := k_B log W(E).
Since W (E) is equal to the number of states with energies lying between E − ∆E and E,
it also depends, strictly speaking, on ∆E. But for ∆E ≲ E and large N, this dependency
can be neglected (cf. Homework 3). Note that
S_{v.N.}(ρ) = −k_B tr(ρ log ρ) = −k_B ∑_{n: E−∆E ≤ E_n ≤ E} (1/W) log(1/W)

            = k_B log W ⋅ (1/W) ∑_{n: E−∆E ≤ E_n ≤ E} 1

            = k_B log W,
so S = kB log W is equal to the von Neumann entropy for the statistical operator ρ,
defined in (4.37) above. Let us illustrate this definition in an example: a single atom in a
box with side lengths L_x, L_y, L_z. Its energy eigenvalues are

E_n = (ħ²/2m) (k_x² + k_y² + k_z²),  k_i = π n_i / L_i,  n_i ∈ N,   (4.41)

since p_x = (ħ/i) ∂/∂x, etc. Recall that W was defined by
[Figure: lattice of allowed wave vectors (k_x, k_y) with spacing π/L_i; the states counted
by W lie between the spheres k_x² + k_y² + k_z² = 2mE/ħ² and k_x² + k_y² + k_z² = 2m(E−∆E)/ħ².]
In the continuum approximation we have (recalling that ħ = h/2π):

W = ∑_{E−∆E ≤ E_n ≤ E} 1 ≈ ∫_{E−∆E ≤ E_n ≤ E} d³n

  = (L_x L_y L_z / π³) ∫_{E−∆E ≤ (ħ²/2m)(k_x²+k_y²+k_z²) ≤ E} d³k

  = (2m/ħ²)^{3/2} (V/π³) (1/2) ∫_{E−∆E}^{E} √E′ dE′ ∫_{1/8 of S²} d²Ω

  = (4π/3) (V/(2π)³) (2mE/ħ²)^{3/2} ∣_{E−∆E}^{E}

  ≈ (4π/3) V (2mE)^{3/2} / h³,  for ∆E ≈ E.   (4.42)
Let us compare this with the corresponding classical phase space volume for one particle:

W^{cl} = ∫_{E−∆E ≤ H ≤ E} d³p d³x = V ∫_{E−∆E ≤ p⃗²/2m ≤ E} d³p

       = V (2m)^{3/2} (1/2) ∫_{E−∆E}^{E} √E′ dE′ ∫_{S²} d²Ω

       = (4π/3) V (2mE)^{3/2} ∣_{E−∆E}^{E}.   (4.43)
This is just h³ times the quantum mechanical result. For the case of N particles, this
suggests the following relation1:

W_N^{qm} ≈ (1/h^{3N}) W_N^{cl}.   (4.44)
4.2.3. Mixing entropy of the ideal gas

A puzzle concerning the definition of entropy in the micro-canonical ensemble (e.g. for
an ideal gas) is revealed if we consider the following situation of two chambers, each of
which is filled with an ideal gas:
[Figure: two chambers separated by a wall; gas 1 with (N₁, V₁, E₁) and gas 2 with
(N₂, V₂, E₂), at equal temperature T₁ = T = T₂.]

By (4.16), the entropy of each subsystem is
S_i(N_i, V_i, E_i) = N_i k_B log [ V_i ( 4πem_i E_i / 3N_i )^{3/2} ].   (4.45)
The wall is now removed and the gases can mix. The temperature of the resulting ideal
gas is determined by
(3/2) k_B T = (E₁ + E₂)/(N₁ + N₂) = E_i/N_i.   (4.46)
1 The quantity W^{cl} is for this reason often defined with an extra factor 1/h^{3N} included.
Also, one often includes further combinatorial factors to account for the distinction between
distinguishable and indistinguishable particles, cf. (4.49).
Assuming for simplicity equal masses m₁ = m₂ = m, the entropy after removal of the wall
is, using (4.46),

S = N k_B log [ V ( 2πem k_B T )^{3/2} ],   (4.47)

so the mixing entropy is

∆S = N₁ k_B log(V/V₁) + N₂ k_B log(V/V₂) = −N k_B ∑_i c_i log v_i,   (4.48)

with c_i = N_i/N and v_i = V_i/V.
This holds also for an arbitrary number of components
and raises the following paradox: if both gases are identical, with the same density
N₁/V₁ = N₂/V₂ = N/V, then from a macroscopic viewpoint clearly “nothing happens” as the wall is removed.
Yet, ∆S ≠ 0. The resolution of this paradox is that the particles have been treated as
distinguishable, i.e. configurations that differ only by which individual particles sit in which
chamber have been counted as microscopically different. However, if both gases are the same,
they ought to be treated as indistinguishable. This change results in a different
definition of W in both cases. Namely, depending on the case considered, the correct
definition of W should be:
W(E, V, {N_i}) := { ∣Ω(E, V, {N_i})∣ if distinguishable;
                    (1/∏_i N_i!) ∣Ω(E, V, {N_i})∣ if indistinguishable },   (4.49)
where Ni is the number of particles of species i. Thus, the second definition is the
physically correct one in our case. With this change (which in turn results in a different
definition of the entropy S), the mixing entropy of two identical gases is now ∆S = 0. In
quantum mechanics the symmetry factor 1/N! in W^{qm} (for each species of indistinguishable
particles) is automatically included due to the Bose/Fermi alternative, which we shall
discuss later, leading to an automatic resolution of the paradox.
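The bookkeeping can be made explicit in a few lines: with the Stirling approximation log N! ≈ N log N − N, the 1/(N₁! N₂!) → 1/N! correction cancels the mixing entropy (4.48) exactly for identical gases at equal density. The particle numbers are illustrative, and we set k_B = 1:

```python
import numpy as np

kB = 1.0
N1 = N2 = 10**5
V1 = V2 = 1.0
N, V = N1 + N2, V1 + V2

# Mixing entropy of distinguishable components, eq. (4.48):
# Delta S = -N kB sum_i c_i log v_i, with c_i = N_i/N, v_i = V_i/V
c = np.array([N1, N2]) / N
v = np.array([V1, V2]) / V
dS_distinct = -N * kB * np.sum(c * np.log(v))

# For identical particles, replacing 1/(N1! N2!) by 1/N! shifts the
# entropy by kB log(N1! N2! / N!); with Stirling this cancels Delta S
# exactly at equal densities.
def log_fact(n):
    return n * np.log(n) - n   # Stirling approximation

dS_identical = dS_distinct + kB * (log_fact(N1) + log_fact(N2) - log_fact(N))

assert np.isclose(dS_distinct, 2 * N1 * kB * np.log(2))   # = N kB log 2
assert abs(dS_identical) < 1e-6 * dS_distinct             # Gibbs paradox resolved
```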
The non-zero mixing entropy of two identical gases is seen to be unphysical also at
the classical level because the entropy should be an extensive quantity. Indeed, the
arguments of the previous subsection suggest that for V₁ = V₂ = V/2 and N₁ = N₂ = N/2
we have

∣Ω(E, V, N)∣ = ∫ dE′ ∣Ω(E − E′, V/2, N/2)∣ ⋅ ∣Ω(E′, V/2, N/2)∣ ≈ ∣Ω(E/2, V/2, N/2)∣²,

and hence

S(E, N, V) = 2 S(E/2, N/2, V/2).   (4.50)

More generally, extensivity means

S(E, N, V) = ν S(E/ν, N/ν, V/ν),   (4.51)

and thus

S(E, N, V) = N ⋅ σ(ε, n),   (4.52)

where ε = E/N is the energy per particle and n = N/V is the particle density.
4.3. Canonical Ensemble

4.3.1. Canonical Ensemble in Quantum Mechanics

In the canonical ensemble one considers a small system A in thermal contact with a much
larger system B (a reservoir, e.g. an ideal gas), with which it can exchange heat:

[Figure: a small system A inside a large reservoir B, exchanging heat.]
The overall energy E = EA + EB of the combined system is fixed, as are the particle
numbers NA , NB of the subsystems. We think of NB as much larger than NA ; in fact
we shall let NB → ∞ at the end of our derivation. We accordingly describe the total
Hilbert space of the system by a tensor product, H = HA ⊗ HB . The total Hamiltonian
of the combined system is
H = H_A + H_B + H_AB,   (4.53)

where the interaction term H_AB is needed in order that the subsystems can exchange
energy. Its precise form is not needed, as we shall assume that the interaction strength is
arbitrarily small. The Hamiltonians H_A and H_B of the subsystems A and B act on the
Hilbert spaces H_A and H_B, and we choose bases so that H_A ∣n⟩_A = E_n^{(A)} ∣n⟩_A and
H_B ∣m⟩_B = E_m^{(B)} ∣m⟩_B, writing ∣n, m⟩ = ∣n⟩_A ∣m⟩_B.
Since E is conserved, the quantum mechanical statistical operator of the combined sys-
tem is given by the micro canonical ensemble with density matrix
ρ = (1/W) ∑_{n,m: E−∆E ≤ E_n^{(A)} + E_m^{(B)} ≤ E} ∣n, m⟩⟨n, m∣.   (4.54)

Taking the partial trace over H_B gives the reduced density matrix of system A:

ρ_A = (1/W) ∑_n ( ∑_{m: E−E_n^{(A)}−∆E ≤ E_m^{(B)} ≤ E−E_n^{(A)}} 1 ) ∣n⟩_A ⟨n∣_A
    = (1/W) ∑_n W_B(E − E_n^{(A)}) ∣n⟩_A ⟨n∣_A.
Now, using the extensivity of the entropy S_B of system B we find (with n_B = N_B/V_B):

(1/k_B) log W_B(E − E_n^{(A)}) = (1/k_B) S_B(E − E_n^{(A)})

  = (N_B/k_B) σ_B(E/N_B, n_B) − (E_n^{(A)}/k_B) (∂σ_B/∂ε)(E/N_B, n_B)
    + (N_B/k_B) (E_n^{(A)}/N_B)² (∂²σ_B/∂ε²)(E/N_B, n_B) + . . . ,

where the quadratic term is O(1/N_B) → 0 as N_B → ∞, i.e. for an infinitely large reservoir.
Thus, using β = 1/(k_B T) and 1/T = ∂S/∂E, we have for an infinite reservoir

log W_B(E − E_n^{(A)}) = (1/k_B) S_B(E) − β E_n^{(A)},   (4.55)
which means
W_B(E − E_n^{(A)}) = const ⋅ e^{−β E_n^{(A)}}.   (4.56)
Therefore, we find the following expression for the reduced density matrix for system A:

ρ_A = (1/Z) ∑_n e^{−β E_n^{(A)}} ∣n⟩_A ⟨n∣_A,   (4.57)

where the canonical partition function Z is fixed by the normalization tr ρ_A = 1:

Z = ∑_n e^{−β E_n^{(A)}} = tr e^{−β H_A}.   (4.58)
In the following we drop the subscripts “A” referring to our subsystem, since we can at this
point forget about the role of the reservoir B (so H = H_A, V = V_A etc.).
This finally leads to the statistical operator of the canonical ensemble:
ρ = (1/Z(β, N, V)) e^{−β H(N,V)}.   (4.59)
In particular, the only quantity characterizing the reservoir that enters this formula is the
temperature T.
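The appearance of the Boltzmann factor from mere state counting in the reservoir can be checked numerically: with W_B(E) ∝ E^{3N_B/2} for an ideal-gas reservoir, the probability ratio of two system states approaches e^{−β ∆E} with β = ∂(log W_B)/∂E. A sketch with illustrative numbers:

```python
import numpy as np

# Reservoir: ideal gas with W_B(E) ~ E^(3 NB / 2); system A: two levels 0 and eps
NB = 10**6
E_total = 1.0 * NB        # total energy (illustrative)
eps = 2.0                 # level spacing of system A (illustrative)

def log_WB(E):
    return 1.5 * NB * np.log(E)

# Microcanonical counting gives p_n ~ W_B(E - E_n), so
log_ratio = log_WB(E_total - eps) - log_WB(E_total)

# Reservoir temperature: beta = d(log W_B)/dE = (3 NB / 2) / E
beta = 1.5 * NB / E_total
# p_eps / p_0 -> exp(-beta * eps) for a large reservoir
assert np.isclose(log_ratio, -beta * eps, rtol=1e-5)
```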
4.3.2. Canonical Ensemble in Classical Mechanics

In the classical case we can make similar considerations as in the quantum mechanical
case. Consider the same situation as above. The phase space of the combined system is
parametrized by (P, Q) = (P_A, Q_A, P_B, Q_B), and the Hamiltonian again decomposes as
H = H_A + H_B + H_AB, where H_AB accounts for the interaction between the particles from
both systems and is neglected in the following. By analogy with the quantum mechanical
case we get a reduced probability distribution ρ_A for subsystem A:

ρ_A(P_A, Q_A) = ∫ ρ(P, Q) d^{3N_B}P_B d^{3N_B}Q_B,   (4.60)
with
ρ = (1/W) ⋅ { 1 if E−∆E ≤ H(P, Q) ≤ E;  0 otherwise }.
From this it follows that

ρ_A(P_A, Q_A) = (1/W) ∫_{E−∆E ≤ H_A + H_B ≤ E} d^{3N_B}P_B d^{3N_B}Q_B

  = (1/W) ∫_{E − H_A(P_A,Q_A) − ∆E ≤ H_B(P_B,Q_B) ≤ E − H_A(P_A,Q_A)} d^{3N_B}P_B d^{3N_B}Q_B

  = (1/W(E)) W_B(E − H_A(P_A, Q_A)).
It is then demonstrated precisely as in the quantum mechanical case that the reduced
distribution function ρ ≡ ρ_A for system A is given by (for an infinitely large system B):
1 −βH(P ,Q)
ρ(P , Q) = e , (4.61)
Z
where the classical canonical partition function Z is defined by

Z := (1/(N! h^{3N})) ∫ d^{3N}P d^{3N}Q e^{−β H(P,Q)}
   = (1/(N! h^{3N})) (2πm/β)^{3N/2} ∫_{V^N} d^{3N}Q e^{−β V_N(Q)},   (4.62)

where in the second step the Gaussian momentum integrals were carried out.
The quantity λ := h/√(2πm k_B T) is sometimes called the “thermal deBroglie wavelength”.
As a rule of thumb, quantum effects start being significant if λ exceeds the typical
dimensions of the system, such as the mean free path length or system size. Using this
definition, we can write
Z(β, N, V) = (1/(N! λ^{3N})) ∫_{V^N} d^{3N}Q e^{−β V_N(Q)}.   (4.63)
Of course, this form of the partition function applies to classical, not quantum, systems.
The unconventional factor of h3N is nevertheless put in by analogy with the quantum
mechanical case because one imagines that the “unit” of phase space for N particles (i.e.
the phase space measure) is given by d3N P d3N Q/(N !h3N ), inspired by the uncertainty
principle ∆Q∆P ∼ h, see e.g. our discussion of the atom in a cube for why the normalized
classical partition function then approximates the quantum partition function. The
motivation of the factor N ! is due to the fact that we want to treat the particles as
indistinguishable. Therefore, a permuted phase space configuration should be viewed as
equivalent to the unpermuted one, and since there are N ! permutations, the factor 1/N !
effectively compensates a corresponding overcounting (here we implicitly assume that
VN is symmetric under permutations). For the discussion of the N !-factor, see also our
discussion on mixing entropy. In practice, these factors often do not play a major role
because the quantities most directly related to thermodynamics are derivatives of the quantity

F := −k_B T log Z(β, N, V),   (4.64)

for instance P = −∂F/∂V ∣_{T,N}; see chapter 6.5 for a detailed discussion of such relations.
F is also called the free energy.
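The Gaussian momentum integration used in the second line of (4.62), one factor √(2πm/β) per momentum component, can be verified numerically; the values of β and m below are illustrative:

```python
import numpy as np

beta, m = 2.0, 3.0   # illustrative values

# One momentum component: int dp exp(-beta p^2 / 2m) = sqrt(2 pi m / beta)
p = np.linspace(-60.0, 60.0, 400001)
dp = p[1] - p[0]
I = np.sum(np.exp(-beta * p**2 / (2 * m))) * dp

assert np.isclose(I, np.sqrt(2 * np.pi * m / beta), rtol=1e-8)
```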
Example:
One may use the formula (4.63) to obtain the barometric formula for the average
⃗ in a given external potential. In this case the Hamiltonian
particle density at a position x
H is given by
H = ∑_{i=1}^N p⃗_i²/2m + ∑_{i=1}^N W(x⃗_i),

where W is the external potential and there is no interaction between the particles.
ρ(P, Q) = (1/Z) e^{−β H(P,Q)} = (1/Z) ∏_{i=1}^N e^{−β( p⃗_i²/2m + W(x⃗_i) )}.   (4.65)
The particle density n(x⃗) is given by

n(x⃗) = ⟨ ∑_{i=1}^N δ³(x⃗_i − x⃗) ⟩ = (N/Z₁) ∫ d³p e^{−β( p⃗²/2m + W(x⃗) )},   (4.66)
where

Z₁ = ∫ d³p d³x e^{−β( p⃗²/2m + W(x⃗) )} = (2πm/β)^{3/2} ∫ d³x e^{−β W(x⃗)}.   (4.67)
From this we obtain the barometric formula

n(x⃗) = n₀ e^{−β W(x⃗)},   (4.68)

with n₀ given by

n₀ = N / ∫ d³x e^{−β W(x⃗)}.   (4.69)

In particular, for the gravitational potential W(x, y, z) = mgz, we find

n(z) = n₀ e^{−mgz/(k_B T)}.   (4.70)
Alternatively, (4.70) follows from the condition of mechanical (hydrostatic) equilibrium:
the force density must be balanced by the gradient of the pressure,

f⃗(x⃗) = n(x⃗) F⃗(x⃗) = −n(x⃗) ∇⃗W(x⃗) = ∇⃗P(x⃗).   (4.71)

Together with P(x⃗) = n(x⃗) k_B T it follows that

k_B T ∇⃗n(x⃗) = −n(x⃗) ∇⃗W(x⃗)   (4.72)

and thus

k_B T ∇⃗ log n(x⃗) = −∇⃗W(x⃗),   (4.73)

which is solved by

n(x⃗) = n₀ e^{−β W(x⃗)}.   (4.74)
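The barometric distribution (4.70) can also be recovered by direct Monte Carlo sampling of the canonical position distribution. The sketch below uses a simple Metropolis walk in the height z (all parameters illustrative) and checks the resulting mean height against the exact value ⟨z⟩ = k_B T/(mg) for an exponential profile:

```python
import numpy as np

rng = np.random.default_rng(2)

kBT = 1.0
mg = 0.5     # weight of one particle (illustrative units)

# Metropolis sampling of heights z >= 0 with weight exp(-mg z / kBT)
z, samples = 1.0, []
for step in range(200000):
    z_new = z + rng.normal(0.0, 1.0)
    # reject moves below the floor; otherwise standard Metropolis acceptance
    if z_new >= 0 and rng.random() < np.exp(-mg * (z_new - z) / kBT):
        z = z_new
    samples.append(z)

samples = np.asarray(samples[20000:])   # discard burn-in
# For n(z) ~ exp(-mg z / kBT) on z >= 0, the mean height is kBT/(mg)
assert np.isclose(samples.mean(), kBT / mg, rtol=0.05)
```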
4.3.3. Equidistribution Law and Virial Theorem in the Canonical Ensemble

We first derive the equidistribution law for classical systems with a Hamiltonian of the
form
H = ∑_{i=1}^N p⃗_i²/2m_i + V(Q),  Q = (x⃗₁, . . . , x⃗_N).   (4.75)
We take as the probability distribution the canonical ensemble as discussed in the pre-
vious subsection, with probability distribution given by
ρ(P, Q) = (1/Z) e^{−β H(P,Q)}.   (4.76)
Let A(P, Q) be an observable for which the boundary terms in the following partial
integration vanish. Then

0 = ∫ d^{3N}P d^{3N}Q ∂/∂p_{iα} ( A(P, Q) ρ(P, Q) )

  = ∫ d^{3N}P d^{3N}Q ( ∂A/∂p_{iα} − β A ∂H/∂p_{iα} ) ρ(P, Q)

  = ⟨ ∂A/∂p_{iα} ⟩ − β ⟨ A ∂H/∂p_{iα} ⟩,  i = 1, . . . , N,  α = 1, 2, 3,   (4.77)

hence

k_B T ⟨ ∂A/∂p_{iα} ⟩ = ⟨ A ∂H/∂p_{iα} ⟩,   (4.78)
and similarly
k_B T ⟨ ∂A/∂x_{iα} ⟩ = ⟨ A ∂H/∂x_{iα} ⟩.   (4.79)
The function A should be chosen such that the integrand falls off sufficiently rapidly.
For A(P , Q) = piα and A(P , Q) = xiα , respectively, we find
⟨ p_{iα} ∂H/∂p_{iα} ⟩ = ⟨ p_{iα}²/m_i ⟩ = k_B T,   (4.80)

⟨ x_{iα} ∂H/∂x_{iα} ⟩ = ⟨ x_{iα} ∂V/∂x_{iα} ⟩ = k_B T.   (4.81)
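For a harmonic Hamiltonian both expectation values factorize into Gaussian integrals, so (4.80) and (4.81) can be verified by direct sampling. The parameters are illustrative, and k_B = 1:

```python
import numpy as np

rng = np.random.default_rng(3)

kBT, m, k = 1.5, 2.0, 4.0     # temperature, mass, spring constant (illustrative)
n = 400000

# Canonical ensemble for H = p^2/2m + (k/2) x^2 factorizes into Gaussians:
# p ~ N(0, m kBT), x ~ N(0, kBT/k)
p = rng.normal(0.0, np.sqrt(m * kBT), n)
x = rng.normal(0.0, np.sqrt(kBT / k), n)

# Equidistribution, eqs. (4.80) and (4.81): <p dH/dp> = <x dH/dx> = kBT
assert np.isclose(np.mean(p * p / m), kBT, rtol=0.02)
assert np.isclose(np.mean(x * k * x), kBT, rtol=0.02)
```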
Consider now a potential of the form

V(Q) = ∑_{i<j} V(x⃗_i − x⃗_j) + ∑_i W(x⃗_i),   (4.82)

where the first sum describes the pair interactions (note ∑_{i<j} V = (1/2) ∑_{i≠j} V) and the
second the external potential.
Writing x⃗_{kl} ≡ x⃗_k − x⃗_l for the relative distance between the k-th and the l-th particle, we
find by a lengthy calculation:

∑_{i,α} ⟨ x_{iα} ∂V(Q)/∂x_{iα} ⟩

  = (1/2) ∑_{k,l} ⟨ (x⃗_k − x⃗_l) ⋅ ∇⃗V(x⃗_k − x⃗_l) ⟩ + ∫ d³x ⟨ ∑_k δ³(x⃗ − x⃗_k) ⟩ x⃗ ⋅ ∇⃗W(x⃗)

  = (1/2) ∑_{k,l} ⟨ x⃗_{kl} ⋅ ∂V/∂x⃗_{kl} ⟩ − ∫ d³x x⃗ ⋅ ∇⃗p

  = (1/2) ∑_{k,l} ⟨ x⃗_{kl} ⋅ ∂V/∂x⃗_{kl} ⟩ + ∫ d³x (∇⃗ ⋅ x⃗) p

  = (1/2) ∑_{k,l} ⟨ x⃗_{kl} ⋅ ∂V/∂x⃗_{kl} ⟩ + 3 P V.

Here we used the particle density n(x⃗) = ⟨∑_k δ³(x⃗ − x⃗_k)⟩, the balance of force and pressure
densities n(x⃗)F⃗(x⃗) = −n(x⃗)∇⃗W(x⃗) = ∇⃗p(x⃗) (cf. (4.71)), a partial integration, ∇⃗ ⋅ x⃗ = 3, and
∫ p d³x = P V. Together with the equidistribution law (4.81), summed over i and α (which
gives 3N k_B T for the left hand side), this yields the virial law

P V = N k_B T − (1/6) ∑_{k,l} ⟨ x⃗_{kl} ⋅ ∂V/∂x⃗_{kl} ⟩,   (4.83)

where the sum vanishes for the ideal gas.
Thus, interactions tend to increase P when they are repulsive, and tend to decrease P
when they are attractive. This is of course consistent with our intuition.
A well-known application of the virial law is the following example:
Applying the equidistribution relations to star number 1 (summed over the three components)
gives

⟨ p⃗₁²/m₁ ⟩ = ⟨ x⃗₁ ⋅ ∂V/∂x⃗₁ ⟩ = 3 k_B T,

assuming that the stars in the outer region of the galaxy have reached thermal equilibrium,
so that they can be described by the canonical ensemble.

[Figure 4.6.: Distribution and velocity ⟨v⃗⟩ of stars in a galaxy.]

We put v⃗ = p⃗₁/m₁, v = ∣v⃗∣ and R = ∣x⃗₁∣, and assume that ⟨v⃗²⟩ ≈ ⟨v⟩², as well as

⟨ x⃗₁ ⋅ ∂V/∂x⃗₁ ⟩ = m₁ G ∑_{j≠1} ⟨ m_j / ∣x⃗₁ − x⃗_j∣ ⟩ ≈ m₁ M G ⟨1/R⟩ ≈ m₁ M G / ⟨R⟩,   (4.84)

supposing that the potential felt by star 1 is dominated by the Newton potential created
by the core of the galaxy containing most of the mass M ≈ ∑_j m_j. Under these
approximations, we conclude that

M/⟨R⟩ ≈ ⟨v⟩²/G.   (4.85)
This relation is useful for estimating M because ⟨R⟩ and ⟨v⟩ can be measured or estimated.
Typically ⟨v⟩ = O(10² km/s).
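Plugging illustrative numbers into (4.85) (a velocity of order 10² km/s and a radius of order 10 kpc, both hypothetical here) gives a mass of the expected galactic order of magnitude:

```python
# Virial mass estimate M ~ <v>^2 <R> / G with illustrative numbers
G = 6.674e-11          # m^3 kg^-1 s^-2
v = 2.0e5              # m/s, i.e. 200 km/s (illustrative)
R = 3.1e20             # m, roughly 10 kpc (illustrative)

M = v**2 * R / G       # ~ 2e41 kg, i.e. ~ 1e11 solar masses
M_sun = 1.989e30
print(f"M ~ {M:.2e} kg ~ {M / M_sun:.1e} solar masses")
```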
Another application concerns atoms in a solid oscillating around the minimum Q₀ of the
potential V, with V₀ = V(Q₀). Writing ∆x_{iα} for the displacements and expanding,

V(Q) = V₀ + (1/2) ∑ (∂²V/∂x_{iα}∂x_{jβ})(Q₀) ∆x_{iα} ∆x_{jβ} + . . . ,   (4.86)

so that in the harmonic approximation (dropping the constant V₀) the equidistribution
law gives

∑_{i,α} ⟨ x_{iα} ∂V/∂x_{iα} ⟩ ≈ 2 ⟨V⟩ = ∑_{i,α} k_B T = 3N k_B T,   (4.87)

∑_{i,α} ⟨ p_{iα}²/m_i ⟩ = 2 ⟨ ∑_i p⃗_i²/2m_i ⟩ = 3N k_B T,   (4.88)

hence

⟨H⟩ = 3N k_B T.   (4.89)
This relation is called the Dulong-Petit law. For real lattice systems there are devia-
tions from this law at low temperature T through quantum effects and at high temper-
ature T through non-linear effects, which are not captured by the approximation (4.86).
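The potential part ⟨V⟩ = (3N/2) k_B T behind the Dulong-Petit law holds for any positive-definite force matrix f, since the canonical position distribution is then Gaussian with covariance k_B T f⁻¹. A sampling check with a random illustrative f (k_B = 1):

```python
import numpy as np

rng = np.random.default_rng(4)

kBT = 0.7
N = 5                     # atoms -> d = 3N = 15 coordinates
d = 3 * N

# Random positive-definite force matrix f (illustrative)
A = rng.standard_normal((d, d))
f = A @ A.T + d * np.eye(d)

# Canonical ensemble for V = (1/2) dx^T f dx is Gaussian with cov = kBT f^{-1}
cov = kBT * np.linalg.inv(f)
dx = rng.multivariate_normal(np.zeros(d), cov, size=200000)

V_mean = 0.5 * np.einsum('ni,ij,nj->n', dx, f, dx).mean()
# Equipartition: <V> = (3N/2) kBT; together with <H_kin> = (3N/2) kBT
# this gives the Dulong-Petit law <H> = 3 N kBT
assert np.isclose(V_mean, 0.5 * d * kBT, rtol=0.02)
```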
Our discussion for classical systems can be adapted to the quantum mechanical context,
but there are some changes. Consider the canonical ensemble with statistical
operator ρ = (1/Z) e^{−βH}. From this it immediately follows that

[ρ, H] = 0,   (4.90)

and hence ⟨[H, A]⟩ = tr(ρ [H, A]) = 0 for any observable A. We choose A = ∑_j x⃗_j ⋅ p⃗_j and take

H = ∑_i p⃗_i²/2m_i + V_N(Q).   (4.92)
By using [a, bc] = [a, b]c + b[a, c] and p⃗_j = (ħ/i) ∂/∂x⃗_j we obtain

[H, A] = ∑_{i,j} [ p⃗_i²/2m_i, x⃗_j ] ⋅ p⃗_j + ∑_j x⃗_j ⋅ [V(Q), p⃗_j]

       = (ħ/i) ∑_j p⃗_j²/m_j + iħ ∑_j x⃗_j ⋅ ∂_{x⃗_j} V(Q),

which, upon taking the expectation value ⟨[H, A]⟩ = 0, gives

∑_j ⟨ x⃗_j ⋅ ∂V/∂x⃗_j ⟩ = 2 ⟨H_kin⟩  (ħ cancels out).   (4.93)
Applying now the same arguments as in the classical case to evaluate the left hand side
leads to
P V = (2/3) ⟨H_kin⟩ − (1/6) ∑_{k,l} ⟨ x⃗_{kl} ⋅ ∂V/∂x⃗_{kl} ⟩,   (4.94)

where now, unlike in the classical case, (2/3)⟨H_kin⟩ ≠ N k_B T even for the ideal gas
⇒ quantum effects!
For an ideal gas the contribution from the potential is by definition absent, but the
contribution from the kinetic piece does not give the same formula as in the classical
case, as we will discuss in more detail below in chapter 5. Thus, even for an ideal
quantum gas (V = 0), the classical formula P V = N kB T receives corrections!
4.4. Grand Canonical Ensemble

In the grand canonical ensemble a small system A is coupled to a large reservoir B with
which it can exchange both energy and particles:

[Figure 4.8.: A small system A (N_A, V_A, E_A) coupled to a large heat and particle
reservoir B (N_B, V_B, E_B).]
The treatment of this ensemble is similar to that of the canonical ensemble. For
definiteness, we consider the quantum mechanical case. We have E = EA + EB for the
total energy, and N = NA + NB for the total particle number. The total system A+B is
described by the microcanonical ensemble, since E and N are conserved. The Hilbert
space for the total system is again a tensor product, and the statistical operator ρ of the
total system is accordingly given by
ρ = (1/W) ∑_{n,m: E−∆E ≤ E_n^{(A)} + E_m^{(B)} ≤ E,  N_n^{(A)} + N_m^{(B)} = N} ∣n, m⟩⟨n, m∣,   (4.95)

where the total Hamiltonian is again

H = H_A + H_B + H_AB,   (4.96)

with the interaction H_AB neglected.
We are using notations similar to the canonical ensemble, such as ∣n, m⟩ = ∣n⟩_A ∣m⟩_B.
Note that the particle numbers of the individual subsystems fluctuate, so we describe
them by number operators N̂_A, N̂_B acting on H_A, H_B, with eigenvalues N_n^{(A)} and N_m^{(B)}.
The statistical operator for system A is described by the reduced density matrix ρA
for this system, namely by
ρ_A = (1/W) ∑_n W_B(E − E_n^{(A)}, N − N_n^{(A)}, V_B) ∣n⟩_A ⟨n∣_A.   (4.99)
As before, in the canonical ensemble, we use that the entropy is an extensive quantity
to write

(1/k_B) log W_B(E_B, N_B, V_B) = (1/k_B) S_B(E_B, N_B, V_B) = (V_B/k_B) σ_B(E_B/V_B, N_B/V_B).

Expanding to first order in E_n^{(A)} and N_n^{(A)}, and using ∂S_B/∂E = 1/T and
∂S_B/∂N = −µ/T, gives

(1/k_B) log W_B(E − E_n^{(A)}, N − N_n^{(A)}, V_B) ≈ (1/k_B) S_B(E, N, V_B) − β (E_n^{(A)} − µ N_n^{(A)}),   (4.100)

where the higher-order terms vanish in the limit of an infinitely large reservoir. Therefore,

ρ_A = (1/Y) ∑_n e^{−β(E_n^{(A)} − µ N_n^{(A)})} ∣n⟩_A ⟨n∣_A,   (4.101)

where the constant Y is fixed by tr ρ_A = 1.
Thus, only the quantities β and µ characterizing the reservoir (system B) have an
influence on system A. Dropping from now on the reference to “A”, we can write the
statistical operator of the grand canonical ensemble as

ρ = (1/Y) e^{−β( H(V) − µ N̂(V) )},   (4.102)

where the grand canonical partition function is

Y(β, µ, V) = tr e^{−β( H(V) − µ N̂(V) )}.   (4.103)
The analog of the free energy for the grand canonical ensemble is the Gibbs free
energy. It is defined by
G ∶= −β −1 log Y (β, µ, V ) . (4.104)
The grand canonical partition function can be related to the canonical partition function.
The Hilbert space of our system (i.e., system A) can be decomposed as

H = C ⊕ H₁ ⊕ H₂ ⊕ H₃ ⊕ . . .   (4.105)

(vacuum, 1 particle, 2 particles, . . . ), with H_N the Hilbert space for a fixed number N
of particles2, and the total Hamiltonian is given by
H = H₁ ⊕ H₂ ⊕ H₃ ⊕ . . . ,  i.e. H acts on H_N as

H_N = ∑_{i=1}^N p⃗_i²/2m + V_N(x⃗₁, . . . , x⃗_N).
Then [H, N̂] = 0 (N̂ has eigenvalue N on H_N), and H and N̂ are simultaneously diago-
nalized, so that (assuming a discrete spectrum of H)

Y(µ, V, β) = tr_H(e^{−β(H−µN̂)}) = Σ_{N≥0} e^{βµN} tr_{H_N}(e^{−βH_N}) = Σ_{N≥0} e^{βµN} Z(N, V, β) ,   (4.107)
² For distinguishable particles, this would be H_N = L²(ℝ^{3N}). However, in real life, quantum mechanical particles are either bosons or fermions, and the corresponding definition of the N-particle Hilbert space has to take this into account, see Ch. 5.
which is the desired relation between the canonical and the grand canonical partition
function.
We also note that for a potential of the standard form
V_N = Σ_{1≤i<j≤N} V(x⃗_i − x⃗_j) + Σ_{1≤i≤N} W(x⃗_i) ,
Ensemble | Defining property | Statistical operator ρ | Partition function
Microcanonical ensemble | no energy exchange, no particle exchange | ρ = (1/W)[Θ(H − E + ∆E) − Θ(H − E)] | W(E, N, V)
Canonical ensemble | energy exchange | ρ = (1/Z) e^{−βH} | Z(β, N, V)
Grand canonical ensemble | energy and particle exchange | ρ = (1/Y) e^{−β(H−µN̂)} | Y(β, µ, V)
The relationship between the partition functions W, Z, Y and the corresponding nat-
ural thermodynamic “potentials” is summarized in the following table. Further explanations
regarding the various thermodynamic potentials are given below in section 6.7.
H_N = Σ_{i=1}^N p⃗_i²/2m + Σ_{1≤i<j≤N} V_ij + Σ_{j=1}^N W_j ,   (4.108)

where Σ_{i<j} V_ij ≡ V_N is the interaction, Σ_j W_j ≡ W_N is the external potential due to the box of volume V, and V_ij = V(x⃗_i − x⃗_j) is the two-particle interaction between the i-th and the j-th particle. The partition function for the grand canonical ensemble is (see (4.103)):
Y(µ, β, V) = Σ_{N=0}^∞ e^{βµN} Z(β, V, N) = Σ_{N=0}^∞ e^{βµN} · (1/(N! λ^{3N})) ∫_{V^N} d^{3N}Q e^{−βV_N(Q)} .   (4.109)

Here, λ = h/√(2πm k_B T) is the thermal de Broglie wavelength. Computing the remaining
integral over Q = (x⃗₁, …, x⃗_N) exactly is generally impossible, but one can derive an expansion
of which the first few terms may often be evaluated exactly. For this we write the
integrand as
e^{−βV_N} = exp(−β Σ_{i<j} V_ij) = ∏_{i<j} e^{−βV_ij} ≡ ∏_{i<j} (1 + f_ij) ,   (4.110)

where we have set f_ij ≡ f(x⃗_i − x⃗_j) = e^{−βV_ij} − 1. The idea is that we can think of ∣f_ij∣ as
small in some situations of interest, e.g. when the gas is dilute (such that β∣V_ij∣ ≪ 1 in
“most of phase space”), or when β is small (i.e. for large temperature T). With this in
mind, we expand the above product as

∏_{i<j} (1 + f_ij) = 1 + Σ_{i<j} f_ij + Σ f_ij f_kl + … ,   (4.111)

and substitute the result into the integral ∫_{V^N} d^{3N}Q e^{−βV_N(Q)}. The general form of the
resulting integrals that we need to evaluate is suggested by the following representative
example for N = 6 particles:
∫ d³x₁ … d³x₆ f₁₂ f₃₅ f₄₅ f₃₆ ,   (4.112)

which corresponds to a graph with vertices 1, …, 6 and edges {1,2}, {3,5}, {4,5}, {3,6}; the integral factorizes into the integral for the cluster {1, 2} times the integral for the cluster {3, 4, 5, 6}.   (4.113)

To keep track of all the integrals that come up, we introduced the following convenient
graphical notation: each circle (vertex) corresponds to an integration, e.g. vertex 2 stands for
∫ d³x₂ (4.114), and each line (edge) between vertices i and j corresponds to a factor f_ij, e.g.
the edge between 2 and 4 stands for f₂₄ (4.115).
The connected parts of a diagram are called “clusters”. Obviously, the integral associated
with a graph factorizes into the corresponding integrals for the clusters. Therefore, the
“cluster integrals” are the building blocks, and we define
b_l(V, β) = (1/(l! λ^{3l−3} V)) · (sum of all l-cluster integrals) .   (4.116)
The main result in this context, known as the linked cluster theorem3 , is that
(1/V) log Y(µ, V, β) = (1/λ³) Σ_{l=1}^∞ b_l(V, β) z^l ,   (4.117)

³ The proof of the linked cluster theorem is very similar to that of the formula (2.10) for the cumulants ⟨xⁿ⟩_c, see section 2.1.
where z = eβµ is sometimes called the fugacity. If the fij are sufficiently small, the first
few terms (b1 , b2 , b3 , . . .) will give a good approximation. Explicitly, one finds (exercise):
b₁ = (1/(1! λ⁰ V)) ∫_V d³x = 1 ,   (4.118)

b₂ = (1/(2! λ³ V)) ∫_{V²} d³x₁ d³x₂ f₁₂ ,   (4.119)

b₃ = (1/(3! λ⁶ V)) ∫_{V³} d³x₁ d³x₂ d³x₃ (f₁₂f₂₃ + f₁₃f₁₂ + f₁₃f₂₃ + f₁₂f₁₃f₂₃) ,   (4.120)

where the first three terms in b₃ give three times the same integral. The corresponding clusters are: for b₁ the single vertex 1; for b₂ the single edge {1, 2}; for b₃ the three graphs on vertices {1, 2, 3} with two edges (which all give the same integral) and the triangle graph with all three edges.
As exemplified by the first 3 terms in b3 , topologically identical clusters (i.e. ones that
differ only by a permutation of the particles) give the same cluster integral. Thus, we
only need to evaluate the cluster integrals for topologically distinct clusters.
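The 2-cluster integral can be made concrete for a specific interaction. The following sketch (not part of the notes; the hard-sphere potential and parameter values are assumptions for illustration) evaluates b₂ of (4.119) numerically for a hard-sphere gas, where the Mayer function f(r) = e^{−βV(r)} − 1 equals −1 for r < σ and 0 otherwise, and compares with the closed form b₂ = −2πσ³/(3λ³):

```python
import math

def b2_hard_sphere(sigma, lam, shells=4000):
    # b2 = 1/(2! lam^3 V) * Int d3x1 d3x2 f12 = 1/(2 lam^3) * Int d3r f(r);
    # for hard spheres f(r) = -1 inside r < sigma, 0 outside.
    dr = sigma / shells
    integral = 0.0
    for i in range(shells):
        r = (i + 0.5) * dr                      # midpoint rule in the radius
        integral += 4.0 * math.pi * r * r * (-1.0) * dr
    return integral / (2.0 * lam**3)

sigma, lam = 1.0, 0.5                           # arbitrary illustrative values
numeric = b2_hard_sphere(sigma, lam)
exact = -2.0 * math.pi * sigma**3 / (3.0 * lam**3)
```

Here b₂ < 0, reflecting the excluded volume of the purely repulsive core.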
Given an approximation for (1/V) log Y, one obtains approximations for the equations of
state etc. by the general methods described in more detail in section 6.5.
5. The Ideal Quantum Gas
σ : (1, 2, 3, …, N−1, N) ↦ (σ(1), σ(2), σ(3), …, σ(N−1), σ(N)) .

From U_σ² = 1 it then follows that η_σ² = 1, hence η_σ ∈ {±1}, and from U_σ U_σ′ = U_σσ′ it follows
that η_σ η_σ′ = η_σσ′. The only possible constant assignments for η_σ are therefore given by

η_σ = 1 for all σ (Bosons),   η_σ = sgn(σ) for all σ (Fermions).   (5.1)
The second characterization also makes plausible the fact that sgn(σ) is an invariant
satisfying sgn(σ)sgn(σ ′ ) = sgn(σσ ′ ).
Example: σ : (1, 2, 3, 4, 5) ↦ (2, 4, 1, 5, 3).

To the N-particle Hilbert space

H_N = H₁ ⊗ … ⊗ H₁   (N factors)

one can apply, to obtain the Hilbert space for Bosons/Fermions, the projection operators

P₊ = (1/N!) Σ_{σ∈S_N} U_σ ,   P₋ = (1/N!) Σ_{σ∈S_N} sgn(σ) U_σ ,

which satisfy P±² = P±, P±† = P±, P₊P₋ = P₋P₊ = 0. One then sets

H_N^± = P₊ H_N for Bosons,   P₋ H_N for Fermions.   (5.3)
For the ideal gas, the Hamiltonian is

H_N = Σ_{i=1}^N p⃗_i²/2m = −Σ_{i=1}^N (ħ²/2m) ∂²_{x⃗_i} .   (5.4)
The eigenstates for a single particle are given by the wave functions
Ψ_k⃗(x⃗) = √(8/V) sin(k_x x) sin(k_y y) sin(k_z z) ,   (5.5)

where k_x = πn_x/L, with n_x = 1, 2, 3, …, and similarly for the y- and z-components. The
product wave functions Ψ_{k⃗₁}(x⃗₁)⋯Ψ_{k⃗_N}(x⃗_N) do not satisfy the symmetry requirements
for Bosons/Fermions. To obtain these we have to apply the projectors P± to the states
∣k⃗₁⟩ ⊗ ⋯ ⊗ ∣k⃗_N⟩ ∈ H_N. We define:
N!
∣k⃗1 , . . . , k⃗N ⟩± ∶= √ P± (∣k1 ⟩ ⊗ . . . ⊗ ∣kN ⟩), (5.6)
c± ´¹¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¸¹¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¹ ¶
∣k1 ,...,kN ⟩
where c± is a normalization constant, defined by demanding that ± ⟨k⃗1 , . . . , k⃗N ∣k⃗1 , . . . , k⃗N ⟩± =
⃗⃗
1. (We have used the Dirac notation ⟨x∣k⟩ ≡ Ψ ⃗ (⃗
x).) Explicitly, we have: k
1
∣k⃗1 , . . . , k⃗N ⟩+ = √ ∑ ∣k⃗σ(1) , . . . , k⃗σ(N ) ⟩ for Bosons, (5.7)
c+ σ∈SN
1
∣k⃗1 , . . . , k⃗N ⟩− = √ ∑ sgn(σ)∣k⃗σ(1) , . . . , k⃗σ(N ) ⟩ for Fermions. (5.8)
c− σ∈SN
1
Note, that the factor N! coming from P± has been absorbed into c± .
Examples:
∣k⃗₁, k⃗₂⟩₋ = (1/√2)(∣k⃗₁, k⃗₂⟩ − ∣k⃗₂, k⃗₁⟩) ,

with ∣k⃗₁, k⃗₂⟩₋ = 0 if k⃗₁ = k⃗₂. This implements the Pauli principle. Similarly, for a
two-particle boson state (with k⃗₁ ≠ k⃗₂) we have

∣k⃗₁, k⃗₂⟩₊ = (1/√2)(∣k⃗₁, k⃗₂⟩ + ∣k⃗₂, k⃗₁⟩) .   (5.10)
63
5. The Ideal Quantum Gas
(a) Bosons: Let n_k⃗ be the number of appearances of the mode k⃗ in ∣k⃗₁, …, k⃗_N⟩₊, i.e.
n_k⃗ = Σ_i δ_{k⃗,k⃗_i}. Then c₊ is given by

c₊ = N! ∏_k⃗ n_k⃗! .   (5.12)

For example, for N = 3 with one mode occupied twice and another once,

c₊ = 3!·2!·1! = 12.
Similarly, one computes the norm (here for fermions):

₋⟨{k⃗}∣{k⃗}⟩₋ = (1/c₋) Σ_{σ,σ′∈S_N} sgn(σ)sgn(σ′) ⟨k⃗_{σ(1)}, …, k⃗_{σ(N)}∣k⃗_{σ′(1)}, …, k⃗_{σ′(N)}⟩
= (N!/c₋) Σ_{σ∈S_N} sgn(σ) ⟨k⃗₁, …, k⃗_N∣k⃗_{σ(1)}, …, k⃗_{σ(N)}⟩
= N! n_{k⃗₁}! n_{k⃗₂}! ⋯ n_{k⃗_N}! / c₋ = N!/c₋ = 1 ,

because the term under the second sum is zero unless the permuted {k⃗}’s are
identical (this happens ∏_k⃗ n_k⃗! times for either bosons or fermions), and because for
fermions the occupation numbers n_k⃗ can be either zero or one.
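The combinatorial formula c₊ = N! ∏ n_k⃗! can be checked by brute force for small N. The following sketch (an illustration, not from the notes) builds the unnormalized symmetrized state Σ_σ ∣k⃗_{σ(1)}, …, k⃗_{σ(N)}⟩ and computes its norm squared, which is exactly c₊:

```python
import math
from itertools import permutations
from collections import Counter

def c_plus(ks):
    # amplitude of each basis tuple in the unnormalized sum over permutations
    state = Counter()
    for sigma in permutations(range(len(ks))):
        state[tuple(ks[i] for i in sigma)] += 1
    # norm^2, using orthonormality of the distinct basis tuples
    return sum(amp * amp for amp in state.values())

ks = ('a', 'a', 'b')                 # N = 3, occupation numbers 2 and 1
predicted = math.factorial(len(ks)) * math.prod(
    math.factorial(n) for n in Counter(ks).values())
# c_plus(ks) == predicted == 12, as in the example above
```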
The canonical partition function is

Z±(N, V, β) := tr_{H_N^±}(e^{−βH}) .   (5.13)

The full Hilbert space (Fock space) is

H^± = ⊕_{N≥0} H_N^± = ℂ ⊕ H₁^± ⊕ …   (5.14)

On H_N^± the particle number operator N̂ has eigenvalue N. The grand canonical
partition function Y± is then defined as before as (cf. (4.103) and (4.107)):

Y±(µ, V, β) := tr_{H^±}(e^{−β(H−µN̂)}) = Σ_{N=0}^∞ e^{βµN} Z±(N, V, β) .   (5.15)
Another representation of the states in H^± is the one based on the occupation num-
bers n_k⃗:

a†_k⃗ ∣…, n_k⃗, …⟩± = √(n_k⃗ + 1) ∣…, n_k⃗ + 1, …⟩± ,   (5.16)

a_k⃗ ∣…, n_k⃗, …⟩± = √(n_k⃗) ∣…, n_k⃗ − 1, …⟩± ,   (5.17)

where the creation and destruction operators satisfy

(a) Bosons: [a_k⃗, a†_{k⃗′}] = δ_{k⃗,k⃗′} , [a_k⃗, a_{k⃗′}] = 0 ;
(b) Fermions: {a_k⃗, a†_{k⃗′}} = δ_{k⃗,k⃗′} , {a_k⃗, a_{k⃗′}} = 0 .

In terms of these, the Hamiltonian and number operators are

H = Σ_k⃗ ε(k⃗) N̂_k⃗ = Σ_k⃗ ε(k⃗) a†_k⃗ a_k⃗ ,   (5.18)
where ε(k⃗) = ħ²k⃗²/2m for non-relativistic particles. With the formalism of creation and
destruction operators at hand, the grand canonical partition function for bosons and
fermions, respectively, may now be calculated as follows. For bosons:

Y⁺ = Σ_{{n_k⃗}} e^{−β Σ_k⃗ n_k⃗ (ε(k⃗)−µ)} = ∏_k⃗ ( Σ_{n=0}^∞ e^{−β(ε(k⃗)−µ)n} ) = ∏_k⃗ (1 − e^{−β(ε(k⃗)−µ)})⁻¹ .   (5.19)

For fermions, where n_k⃗ ∈ {0, 1}:

Y⁻ = ∏_k⃗ (1 + e^{−β(ε(k⃗)−µ)}) .   (5.20)
The mean occupation numbers

n̄_k⃗ := ⟨N̂_k⃗⟩± = tr_{H^±} ( (N̂_k⃗/Y±) e^{−β(H−µN̂)} ) ,   (5.21)

can be calculated by means of a trick. Let us consider the bosonic case (“+”). From the
above commutation relations we obtain

n̄_k⃗ = tr_{H⁺} ( (1/Y⁺) a†_k⃗ a_k⃗ e^{−β(H−µN̂)} ) = tr_{H⁺} ( (1/Y⁺) a†_k⃗ a_k⃗ e^{−Σ_p⃗ β(ε(p⃗)−µ)N̂_p⃗} )
= tr_{H⁺} ( (1/Y⁺) a†_k⃗ e^{−Σ_p⃗ β(ε(p⃗)−µ)(N̂_p⃗+δ_{k⃗,p⃗})} a_k⃗ )
= tr_{H⁺} ( (1/Y⁺) a_k⃗ a†_k⃗ e^{−Σ_p⃗ β(ε(p⃗)−µ)(N̂_p⃗+δ_{k⃗,p⃗})} )
= e^{−β(ε(k⃗)−µ)} tr_{H⁺} ( (1/Y⁺) a_k⃗ a†_k⃗ e^{−Σ_p⃗ β(ε(p⃗)−µ)N̂_p⃗} )   [using a_k⃗ a†_k⃗ = 1 + N̂_k⃗]
= e^{−β(ε(k⃗)−µ)} (1 + n̄_k⃗) .
Applying similar arguments in the fermionic case we find for the expected number den-
sities:

n̄_k⃗ = 1/(e^{β(ε(k⃗)−µ)} − 1)   for bosons,   (5.23)

n̄_k⃗ = 1/(e^{β(ε(k⃗)−µ)} + 1)   for fermions.   (5.24)

The mean energy is then

E± = ⟨H⟩± = Σ_k⃗ ε(k⃗) ⟨N̂_k⃗⟩± = Σ_k⃗ ε(k⃗) n̄_k⃗^± .   (5.25)
ρ± = (1/Z_N^±) P± e^{−βH_N} .   (5.26)
Let ∣{x⃗}⟩_η be an eigenbasis of the position operators. Then, with η ∈ {+, −}:

η⟨{x⃗′}∣ρ∣{x⃗}⟩η = Σ′_{{k⃗}} Σ_{σ,σ′∈S_N} (1/(c_η Z_N)) η_σ η_σ′ e^{−β Σ_{i=1}^N ħ²k⃗_i²/2m} Ψ̄_{σ{k⃗}}({x⃗′}) Ψ_{σ′{k⃗}}({x⃗}) ,   (5.27)

where Ψ_{σ{k⃗}}({x⃗}) ≡ ∏_{i=1}^N Ψ_{k⃗_{σ(i)}}(x⃗_i), the Ψ_k⃗(x⃗) ∈ H₁ are the 1-particle wave functions, and

η_σ = 1 for bosons,   η_σ = sgn(σ) for fermions.   (5.28)
The sum Σ′_{{k⃗₁,…,k⃗_N}} is restricted in order to ensure that each identical particle state
appears only once. We may equivalently work in terms of the occupation number rep-
resentation, using

Σ′_{{k⃗}} = Σ_{{k⃗}} ( ∏_{k⃗} n_k⃗! ) / N! ,   (5.29)

where the factor in the unrestricted sum compensates the over-counting. This gives, with
the formulas for c_η derived above, an expression for the matrix elements of ρ as an
unrestricted sum (5.30).
For V → ∞ we may replace the sum Σ_k⃗ by (V/(2π)³) ∫ d³k, which yields

η⟨{x⃗′}∣ρ∣{x⃗}⟩η = (1/(Z_N N!)) Σ_{σ,σ′} η_σ η_σ′ ∏_j ∫ (d³k/(2π)³) e^{−ik⃗·(x⃗_{σj} − x⃗′_{σ′j})} e^{−βħ²k⃗²/2m} .

Performing the Gaussian integrals (each giving λ⁻³ e^{−(π/λ²)(x⃗_{σj}−x⃗′_{σ′j})²}) and relabeling
the permutations yields

η⟨{x⃗′}∣ρ∣{x⃗}⟩η = (1/(Z_N λ^{3N} N!)) Σ_σ η_σ e^{−(π/λ²) Σ_j (x⃗_j − x⃗′_{σj})²} .   (5.31)
Setting x⃗′ = x⃗, integrating ∫ d^{3N}x on both sides, and using tr ρ = 1 gives:

Z_N = (1/(N! λ^{3N})) ∫ d^{3N}x Σ_{σ∈S_N} η_σ e^{−(π/λ²) Σ_j (x⃗_j − x⃗_{σj})²} .   (5.32)
The terms with σ ≠ id are suppressed for λ → 0 (i.e. for h → 0 or T → ∞), so the leading
order contribution comes from σ = id. The next-to leading order corrections come from
those σ having precisely one transposition (there are N(N−1)/2 of them). A permutation
with precisely one transposition corresponds to an exchange of two particles. Neglecting
higher permutations, we get

Z_N = (1/(N! λ^{3N})) ∫ d^{3N}x [1 + η (N(N−1)/2) e^{−(2π/λ²)(x⃗₁−x⃗₂)²} + …]
= (1/N!) (V/λ³)^N [1 + η (N(N−1)/(2V)) ∫ d³r e^{−(2π/λ²)r⃗²} + …]   [using ∫ d³x = V]
= (1/N!) (V/λ³)^N [1 + η (N(N−1)/(2V)) (λ²/2)^{3/2} + …] .   (5.33)
Using N! ≈ N^N e^{−N} and log(1 + ε) ≈ ε, the free energy F = −k_B T log Z_N becomes

F = −N k_B T log[ (eV)/(λ³N) ] − η (k_B T N² λ³)/(2^{5/2} V) + … .   (5.34)
Together with the following relation for the pressure (cf. (4.17)),
P = − ∂F/∂V ∣_T ,   (5.35)
it follows that
P = n k_B T (1 − η n λ³/2^{5/2} + …) ,   (5.36)

where n = N/V is the particle density. Comparing to the classical ideal gas, where we had
P = nkB T , we see that when nλ3 is of order 1, quantum effects significantly increase the
pressure for fermions (η = −1) while they decrease the pressure for bosons (η = +1). As
we can see comparing the expression (5.33) with the leading order term in the cluster
expansion of the classical gas (see chapter 4.6), this effect is also present for a classical
gas to leading order if we include a 2-body potential V(r⃗) such that

e^{−βV(r⃗)} − 1 = η e^{−2πr⃗²/λ²}   (from (5.33)),   (5.37)

i.e.

V(r⃗) = −k_B T log[1 + η e^{−2πr⃗²/λ²}] ≈ −k_B T η e^{−2πr⃗²/λ²} ,   for r ≳ λ.   (5.38)
A sketch of V(r⃗): for fermions (η = −1) the effective potential is repulsive, for bosons
(η = +1) it is attractive.
Thus, we can say that quantum effects lead to an effective potential. For fermions the
resulting correction to the pressure P in (5.36) is called degeneracy pressure. Note
that according to (5.36) the degeneracy pressure is proportional to k_B T n² λ³ for fermions,
which increases strongly for increasing density n. It provides a mechanism to support
very dense objects against gravitational collapse, e.g. in neutron stars.
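The size of the leading correction in (5.36) is controlled by the degeneracy parameter nλ³. A small sketch (the mass, temperature and density below are hypothetical values chosen only to make nλ³ of order 10⁻¹):

```python
import math

KB = 1.380649e-23    # J/K
H = 6.62607015e-34   # J s

def thermal_wavelength(m, T):
    return H / math.sqrt(2.0 * math.pi * m * KB * T)

def pressure_ratio(n, m, T, eta):
    # P / (n kB T) to first order in n*lambda^3, eq. (5.36);
    # eta = +1 for bosons, eta = -1 for fermions
    lam = thermal_wavelength(m, T)
    return 1.0 - eta * n * lam**3 / 2**2.5

m, T, n = 6.6e-27, 4.0, 1e27              # hypothetical helium-like numbers
r_fermi = pressure_ratio(n, m, T, -1.0)   # > 1: degeneracy pressure
r_bose = pressure_ratio(n, m, T, +1.0)    # < 1: effective attraction
```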
For particles with spin S, each mode carries an additional label s, and the Hamiltonian reads

H = Σ_{k⃗,s} ε(k⃗) a†_{k⃗,s} a_{k⃗,s} ,   s = 1, …, g = 2S + 1.   (5.39)

For the grand canonical ensemble the Hilbert space of particles with spin is given by

H^± = ⊕_{N≥0} H_N^± ,   H₁ = L²(V, d³x) ⊗ ℂ^g .   (5.41)

It is easy to see that for the grand canonical ensemble this results in the following
expressions for the expected number densities n̄_k⃗ and the mean energy E±:

n̄_k⃗^± = ⟨N̂_k⃗⟩± = g/(e^{β(ε(k⃗)−µ)} ∓ 1) ,   (5.42)

E± = ⟨H⟩± = g Σ_k⃗ ε(k⃗)/(e^{β(ε(k⃗)−µ)} ∓ 1) .   (5.43)
In the canonical ensemble we find similar expressions. For a non-relativistic gas we get,
with Σ_k⃗ → V ∫ d³k/(2π)³ for V → ∞:

ε± := E±/V = g ∫ (d³k/(2π)³) (ħ²k²/2m) · 1/(e^{β(ħ²k²/2m − µ)} ∓ 1) .   (5.44)

Setting x = ħ²k²/(2m k_B T), or equivalently k = (2√π/λ) x^{1/2}, and defining the fugacity z := e^{βµ}, we find

ε±/(k_B T) = (g/λ³) (2/√π) ∫₀^∞ dx x^{3/2}/(z⁻¹e^x ∓ 1) ,   (5.45)

or similarly

n̄± = ⟨N̂⟩±/V = (g/λ³) (2/√π) ∫₀^∞ dx x^{1/2}/(z⁻¹e^x ∓ 1) .   (5.46)
Furthermore, we also have the following relation for the pressure P± and the grand
canonical potential G± = −kB T log Y ± (cf. section 4.4):
P± = − ∂G±/∂V ∣_{T,µ} .   (5.47)

From (5.19) and (5.20) it follows that in the case of spin degeneracy the grand canonical partition
function Y± is given by

Y± = [ ∏_k⃗ (1 ∓ z e^{−βε(k⃗)}) ]^{∓g} .   (5.48)
Taking the logarithm on both sides and taking a large volume V → ∞ to approximate
the sum by an integral as before yields

P±/(k_B T) = ∓ g ∫ (d³k/(2π)³) log[1 ∓ z e^{−βħ²k²/2m}] = (g/λ³) (4/(3√π)) ∫₀^∞ dx x^{3/2}/(z⁻¹e^x ∓ 1) .   (5.49)
Here one uses the expansion

∫₀^∞ dx x^{m−1}/(z⁻¹e^x − η) = η Γ(m) Σ_{n=1}^∞ (ηz)^n/n^m .
This gives

n̄± λ³/g = z ± z²/2^{3/2} + z³/3^{3/2} ± z⁴/4^{3/2} + … ,   (5.50)

βP± λ³/g = z ± z²/2^{5/2} + z³/3^{5/2} ± z⁴/4^{5/2} + … ,   (5.51)

and hence

P± = n̄± k_B T [ 1 ∓ (1/2^{5/2}) (n̄± λ³/g) + … ] ,   (5.52)
which for g = 1 gives the same result for the degeneracy pressure we obtained previously
in (5.36). Note again the “+” sign for fermions.
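The step from (5.50)/(5.51) to (5.52) amounts to eliminating the fugacity between the two series; this can be verified numerically for a small z (a sketch, not from the notes):

```python
def density_series(z, eta, terms=50):
    # eq. (5.50): n lam^3 / g; eta = +1 bosons, -1 fermions
    return sum(eta**(n - 1) * z**n / n**1.5 for n in range(1, terms + 1))

def pressure_series(z, eta, terms=50):
    # eq. (5.51): beta P lam^3 / g
    return sum(eta**(n - 1) * z**n / n**2.5 for n in range(1, terms + 1))

z = 0.05
x_b = density_series(z, +1.0)
x_f = density_series(z, -1.0)
# eq. (5.52): upper sign ("-") for bosons, lower sign ("+") for fermions
diff_bose = abs(pressure_series(z, +1.0) - x_b * (1.0 - x_b / 2**2.5))
diff_fermi = abs(pressure_series(z, -1.0) - x_f * (1.0 + x_f / 2**2.5))
```

The residuals are of order z³, confirming that (5.52) captures the full first-order correction.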
There are two possibilities for the helicity (“spin”) of a photon which is either parallel or
anti-parallel to p⃗, corresponding to the polarization of the light. Hence, the degeneracy
factor for photons is g = 2 and the Hamiltonian is given by
H = Σ_{p⃗, s=±1} ε(p⃗) a†_{p⃗,s} a_{p⃗,s} + (interaction) ,   (p⃗ ≠ 0),   (5.54)

where ε(p⃗) = c∣p⃗∣. The photon-photon interaction proceeds via virtual electron-positron
pairs (the “light-by-light” box diagram γγ → γγ), with a tiny cross section σ ≈ 10⁻⁵⁰ cm².
The resulting collision rate is

1/τ = cσN/V = cσn ≈ 10⁻⁴⁴ (cm³/s) × n ,   (5.55)
where N = ⟨N̂ ⟩ is the average number of photons inside V and n = N/V their density.
Even in extreme places like the interior of the sun, where T ≈ 10⁷ K, this leads to a mean
collision time of 10¹⁸ s. This is more than the age of the universe, which is approximately
10¹⁷ s. From this we conclude that we can safely treat the photons as an ideal gas!
By the methods of the previous subsection we find for the grand canonical partition
function, with µ = 0:
Y = tr(e^{−βH}) = [ ∏_{p⃗≠0} 1/(1 − e^{−βε(p⃗)}) ]² ,   (5.56)
since the degeneracy factor is g = 2 and photons are bosons. For the Gibbs free energy
(in the limit V → ∞) we get

G = −k_B T log Y = (2V/β) ∫ (d³p/(2πħ)³) log(1 − e^{−βcp}) = (V(k_B T)⁴/(π²(ħc)³)) ∫₀^∞ dx x² log(1 − e^{−x})
= (V(k_B T)⁴/(π²(ħc)³)) (−1/3) ∫₀^∞ dx x³/(e^x − 1) = − (V(k_B T)⁴/(ħc)³) (π²/45) ,

using ∫₀^∞ dx x² log(1 − e^{−x}) = −2ζ(4) = −π⁴/45. With the Stefan-Boltzmann constant
σ = π²k_B⁴/(60ħ³c²) this can be written as

⇒ G = −(4σ/3c) V T⁴ .   (5.58)
From S = −∂G/∂T one then gets the entropy

⇒ S = (16σ/3c) V T³ ,   (5.60)

and the energy

E = ⟨H⟩ = 2 Σ_{p⃗≠0} ε(p⃗)/(e^{βε(p⃗)} − 1) = 2V ∫ (d³p/(2πħ)³) c∣p⃗∣/(e^{βc∣p⃗∣} − 1) ,

⇒ E = (4σ/c) V T⁴ .   (5.61)
Finally, the pressure P can be calculated as
P = − ∂G/∂V ∣_{T, µ=0} = ∂(k_B T log Y)/∂V ∣_{T, µ=0} ,   (5.62)

see again chapter 6.5 for a systematic review of such formulas. This gives

⇒ P = (4σ/3c) T⁴ .   (5.63)
As an example, for the sun, with T_sun = 10⁷ K, the pressure is P ≈ 2.5 × 10⁷ atm, and
for a H-bomb, with T_bomb = 10⁵ K, the pressure is P ≈ 0.25 atm.
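These numbers follow directly from (5.63); a quick sketch using the SI value of the Stefan-Boltzmann constant:

```python
SIGMA = 5.670374419e-8   # Stefan-Boltzmann constant, W m^-2 K^-4
C = 2.99792458e8         # speed of light, m/s
ATM = 1.01325e5          # one atmosphere, Pa

def radiation_pressure_atm(T):
    # eq. (5.63): P = 4 sigma T^4 / (3 c), converted to atmospheres
    return 4.0 * SIGMA * T**4 / (3.0 * C) / ATM

p_sun = radiation_pressure_atm(1e7)    # ~ 2.5e7 atm
p_bomb = radiation_pressure_atm(1e5)   # ~ 0.25 atm
```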
Note that for photons we have

P = (1/3)(E/V) ⟺ E = 3PV .   (5.64)
Photons in a cavity: Consider now a setup where photons can leave a cavity Ω through
a small hole, escaping with speed c. The intensity of the radiation which goes through
the opening is given by

I(ν, T) = ∫ (dΩ/4π) c u(ν) cos ϑ = (1/4π) ∫₀^{2π} dφ ∫₀^1 d cos ϑ cos ϑ c u(ν) = (c/4) u(ν) ,

where c is the speed of light, and where u(ν)dν is the average radiation energy per unit
volume in the frequency range ν … ν + dν. We thus have

I_total = ∫₀^∞ dν I(ν, T) = σ T⁴ .   (5.65)
We now find u(ν). For the mean particle number ⟨N̂_p⃗⟩ we first find

⟨N̂_p⃗⟩ = 2/(e^{βc∣p⃗∣} − 1)   for momentum p⃗ = ħk⃗ .   (5.66)

The number of photons with momentum in the range d³p is therefore

(2/(e^{βc∣p⃗∣} − 1)) · V d³p/(2πħ)³ ,

i.e. the number of photons with ∣p⃗∣ in the range p … p + dp is

(V/(π²ħ³)) p² dp/(e^{βcp} − 1) .

Setting p = hν/c and multiplying by the energy hν per photon, one obtains

u(ν) = (8πh/c³) ν³/(e^{hν/(k_B T)} − 1) .   (5.67)
This is the famous law found by Planck in 1900 which led to the development of
quantum theory! The Planck distribution u(ν) rises like ν² for small ν, falls off
exponentially for large ν, and has its maximum at hν_max ≈ 2.82 k_B T.
Solving u′ (νmax ) = 0 one finds that the maximum of u(ν) lies at hνmax ≈ 2.82kB T , a
relation also known as Wien’s law. The following limiting cases are noteworthy:
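Writing x = hν/(k_B T), the condition u′(ν_max) = 0 becomes the transcendental equation 3(1 − e^{−x}) = x, which can be solved by bisection (a sketch):

```python
import math

def wien_x(lo=1e-6, hi=10.0, iters=200):
    # root of 3*(1 - exp(-x)) - x = 0 in (0, 10); f(lo) > 0, f(hi) < 0
    f = lambda x: 3.0 * (1.0 - math.exp(-x)) - x
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

x_max = wien_x()   # h nu_max = x_max * kB T, with x_max ~ 2.82
```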
(i) hν ≪ k_B T:
In this case we have

u(ν) ≈ 8π ν² k_B T/c³ .   (5.68)

This formula (the Rayleigh-Jeans law) is valid in particular for h → 0, i.e. it represents the classical limit. It
was known before the Planck formula. It is not only inaccurate for larger frequen-
cies but also fundamentally problematic since it suggests ⟨H⟩ = E ∝ ∫ dν u(ν) = ∞,
which indicates an instability not seen in reality.

(ii) hν ≫ k_B T:
In this case we have

u(ν) ≈ (8πh/c³) ν³ e^{−hν/(k_B T)} .   (5.69)
This formula had been found empirically by Wien without proper interpretation
of the constants (and in particular without identifying h).
The total photon number is

⟨N̂⟩ = Σ_{p⃗≠0} 2/(e^{βc∣p⃗∣} − 1) ≈ 2V ∫ (d³p/(2πħ)³) 1/(e^{βc∣p⃗∣} − 1) = (2ζ(3)/π²) V (k_B T/(ħc))³ .   (5.70)
Combining this formula with that for the entropy S, eq. (5.60), gives the relation

S = (2π⁴/(45ζ(3))) k_B N ≈ 3.6 N k_B ,   (5.71)

where N ≡ ⟨N̂⟩ is the mean total particle number from above. Thus, for an ideal photon
gas, the entropy is, up to a factor of order one, simply the mean number of photons.
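The numerical coefficient in (5.71) is easy to check (a sketch; the zeta function is approximated by a partial sum):

```python
import math

def zeta(s, terms=200000):
    # crude partial sum of the Riemann zeta function; the tail error is
    # negligible here for s = 3
    return sum(1.0 / n**s for n in range(1, terms + 1))

s_per_photon = 2.0 * math.pi**4 / (45.0 * zeta(3.0))   # ~ 3.60
```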
For a gas of massive bosons with degeneracy g, the particle density is

n = ⟨N̂⟩/V = (g/V) Σ_k⃗ 1/(e^{β(ε(k⃗)−µ)} − 1) .   (5.72)
The sum is calculated for sufficiently large volumes again by replacing Σ_k⃗ by V ∫ d³k/(2π)³,
which yields

n ≈ g ∫ (d³k/(2π)³) 1/(e^{β(ε(k⃗)−µ)} − 1) = (g/2π²) ∫ dk k²/(e^{β(ε(k⃗)−µ)} − 1) .   (5.73)
The particle density is clearly maximal for µ → 0 and its maximal value is given by n_c
where, with ε(k) = ħ²k²/2m,

n_c = (g/2π²) ∫ dk k²/(e^{βε(k)} − 1)
= (g/2π²) (2m/(βħ²))^{3/2} ∫₀^∞ dx x²/(e^{x²} − 1)
= (g/2π²) (2m/(βħ²))^{3/2} Σ_{n=1}^∞ ∫₀^∞ dx x² e^{−nx²}
= (g/λ³) ζ(3/2) ,

and where λ = h/√(2πm k_B T) is the thermal de Broglie wavelength. From this we see that
n ≤ n_c, and the limiting density is achieved for the limiting temperature

T_c = (h²/(2πm k_B)) (n/(g ζ(3/2)))^{2/3} .   (5.74)
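Eq. (5.74) can be evaluated for typical dilute-gas numbers; the mass and density below are hypothetical illustrative values (roughly an atom of 87 atomic mass units at n = 10²⁰ m⁻³):

```python
import math

H = 6.62607015e-34    # Planck constant, J s
KB = 1.380649e-23     # Boltzmann constant, J/K
ZETA_3_2 = 2.612375   # zeta(3/2)

def t_critical(n, m, g=1):
    # eq. (5.74): Tc = h^2/(2 pi m kB) * (n / (g zeta(3/2)))**(2/3)
    return H**2 / (2.0 * math.pi * m * KB) * (n / (g * ZETA_3_2))**(2.0 / 3.0)

m = 87 * 1.66053906660e-27   # hypothetical atomic mass, kg
tc = t_critical(1e20, m)     # of the order of a microkelvin
```

The sub-microkelvin scale of T_c is why Bose-Einstein condensation in dilute gases requires extreme cooling.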
Equilibrium states with higher densities n > n_c are not possible at finite volume. A
new phenomenon happens, however, for infinite volume, i.e. in the thermodynamic
limit V → ∞. Here, we must be careful because the density matrices are only formal
in this limit. The equilibrium condition can instead be expressed through correlation
functions: one has

⟨a†_p⃗ a_k⃗⟩ = e^{−β(ε(k⃗)−µ)} ⟨a_k⃗ a†_p⃗⟩ ,   (5.75)

and therefore

(1 − e^{−β(ε(k⃗)−µ)}) ⟨a†_p⃗ a_k⃗⟩ = e^{−β(ε(k⃗)−µ)} δ_{k⃗,p⃗} .   (5.76)
In the thermodynamic limit we replace

finite volume: k⃗ ∈ (π/L)ℤ³, with a_k⃗ and δ_{k⃗,p⃗}   ⟶   infinite volume: k⃗ ∈ ℝ³, with a(k⃗) and δ³(k⃗ − p⃗),

so that (5.76) becomes

(1 − e^{−β(ħ²k⃗²/2m − µ)}) ⟨a†(p⃗)a(k⃗)⟩ = e^{−β(ħ²k⃗²/2m − µ)} δ³(p⃗ − k⃗) .   (5.77)
In that limit, the statistical operator ρ of the grand canonical ensemble does not make
mathematical sense, because e−βH+βµN̂ does not have a finite trace (i.e. Y = ∞). Nev-
ertheless, the condition (5.77), called the “KMS condition” in this context, still makes
sense. We view it as the appropriate substitute for the notion of Gibbs state in the
thermodynamic limit.
What are the solutions of the KMS-condition? For µ < 0 the unique solution is the
usual Bose-Einstein distribution:

⟨a†(k⃗)a(p⃗)⟩ = δ³(p⃗ − k⃗)/(e^{β(ħ²k⃗²/2m − µ)} − 1) .
The point is that for µ = 0 other solutions are also possible, for instance

⟨a†(p⃗)a(k⃗)⟩ = δ³(p⃗ − k⃗)/(e^{βħ²k⃗²/2m} − 1) + (2π)³ n₀ δ³(k⃗) δ³(p⃗)

for some n₀ ≥ 0 (this follows from ⟨A†A⟩ ≥ 0 for operators A in any state). The particle
number density in the thermodynamic limit (V → ∞) is best expressed in terms of the
operators at sharp position x⃗:
a(p⃗) = (1/(2π)^{3/2}) ∫ d³x e^{−ip⃗·x⃗} a(x⃗) .   (5.78)

One finds

n = ⟨N̂(x⃗)⟩ = (1/(2π)³) ∫ d³p d³k ⟨a†(p⃗)a(k⃗)⟩ e^{−i(p⃗−k⃗)·x⃗} = n_c + n₀   (5.79)
for T below Tc , and n0 = 0 above Tc . The formation of the condensate can thereby be
seen as a phase transition at T = Tc .
We can also write down more general solutions to the KMS-condition, for example:
⟨a†(x⃗)a(y⃗)⟩ = ∫ (d³k/(2π)³) e^{ik⃗·(x⃗−y⃗)}/(e^{βħ²k⃗²/2m} − 1) + f̄(x⃗)f(y⃗) ,   (5.81)

where f is any harmonic function, i.e. a function such that ∇⃗²f = 0. To understand the
physical meaning of these states, we define the particle current operator j⃗(x⃗) as

j⃗(x⃗) := (−i/2m) (a†(x⃗)∇⃗a(x⃗) − a(x⃗)∇⃗a†(x⃗)) .   (5.82)
For the state (5.81) this gives

⟨j⃗(x⃗)⟩ = (−i/2m) (f̄(x⃗)∇⃗f(x⃗) − f(x⃗)∇⃗f̄(x⃗)) .   (5.83)

For a suitable harmonic f (e.g. linear in x⃗ with complex coefficients) this is a constant
vector v⃗. This means that the condensate flows in the direction of v⃗ without leaving
equilibrium. Another solution is f(x⃗) = f(x, y, z) = x + iy. In this case one finds

⟨j⃗(x⃗)⟩ = (1/m)(−y, x, 0) ,

describing a circular motion around the origin (vortex). The condensate can hence flow
or form vortices without leaving equilibrium. This phenomenon goes under the name of
superfluidity.
6. The Laws of Thermodynamics
The laws of thermodynamics predate the ideas and techniques from statistical mechanics,
and are, to some extent, simply consequences of more fundamental ideas derived in
statistical mechanics. However, they are still in use today, not least because microscopic
descriptions are sometimes not known (e.g. black hole thermodynamics) or are not
well-developed (non-equilibrium situations). The thermodynamic framework rests on the
following ideas:
(i) The empirical evidence that, for a very large class of macroscopic systems, equilib-
rium states can generally be characterized by very few parameters. These thermo-
dynamic parameters, often called X1 , ..., Xn in the following, can hence be viewed
as “coordinates” on the space of equilibrium systems.
(ii) The idea to perform mechanical work on a system, or to bring equilibrium systems
into “thermal contact” with reservoirs in order to produce new equilibrium states
in a controlled way. The key idea here is that these changes (e.g. by “heating up
a system” through contact with a reservoir system) should be extremely gentle so
that the system is not pushed out of equilibrium too much. One thereby imagines
that one can describe such a gradual change of the system by a succession of
equilibrium states, i.e. a curve in the space of coordinates X1 , ..., Xn characterizing
the different equilibrium states. This idealized notion of an infinitely gentle/slow
change is often referred to as “quasi-static”.
(iii) Given the notions of quasi-static changes in the space of equilibrium states, one can
then postulate certain rules guided by empirical evidence that tell us which kind
of changes should be possible, and which ones should not. These are, in essence,
the laws of thermodynamics. For example, one knows that if one has access to
equilibrium systems at different temperature, then one system can perform work
on the other system. The first and second law state more precise conditions about
such processes and imply, respectively, the existence of an energy- and entropy
function on equilibrium states. The zeroth law just states that being in thermal
equilibrium with each other is an equivalence relation for systems, i.e. in particular
transitive. It implies the existence of a temperature function labelling the different
equivalence classes.
That is, there exists a function Θ : {equilibrium systems} → ℝ,
such that Θ is equal for systems in thermal equilibrium with each other. To see this,
let us imagine that the equilibrium states of the systems I,II and III are parametrized
by some coordinates {A1 , A2 , . . .} , {B1 , B2 , . . .} and {C1 , C2 , . . .}. Since a change in I
implies a corresponding change in III, there must be a constraint1
Since, according to the 0th law, we also must have the constraint
we can proceed by noting that for {A1 , A2 , . . . , B1 , B2 , . . .} which satisfy the last equation,
(6.3) must be satisfied for any {C2 , C3 , . . .}! Thus, we let III be our reference system
and set {C2 , C3 , . . .} to any convenient but fixed value. This reduces the condition (6.4)
¹ This is how one could actually mathematically implement the idea of “thermal contact”.
PV/(N k_B) = const. = T [K] =: Θ .   (6.6)

By bringing this system (for V → ∞) in contact with any other system, we can measure
the (absolute) temperature of the latter. For example, one can define the triple point
of the system water-ice-vapor to be at 273.16 K. Together with the definition of
k_B = 1.4 × 10⁻²³ J/K, this then defines, in principle, the Kelvin temperature scale. Of
course in practice the situation is more complicated because ideal gases do not exist.
Figure 6.1.: The triple point of ice, water and vapor in the (P, T) phase diagram.
Figure 6.2.: A large system divided into subsystems I and II by an imaginary wall.
Figure 6.3.: Change of system from initial state i to final state f along two different
paths γ and γ′ in the (X₁, X₂) plane.
Here, by an “adiabatic change”, one means a change without heat exchange. Consider
a particle moving in a potential. By fixing an arbitrary reference point X₀, we can define
an energy landscape

E(X) = ∫_{X₀}^X δW ,   (6.7)

where the integral is along any path connecting X₀ with X, and where X₀ is a refer-
ence point corresponding to the zero of energy. δW is the infinitesimal change of work
done along the path. In order to define more properly the notion of such integrals of
“infinitesimals”, we will now make a short mathematical digression on differential forms.
A 1-form is an expression

α = Σ_{i=1}^N α_i(X₁, …, X_N) dX_i .   (6.8)

For a curve γ : [0, 1] → ℝ^N, t ↦ (X₁(t), …, X_N(t)), we define

∫_γ α := ∫₀¹ Σ_{i=1}^N α_i(X₁(t), …, X_N(t)) (dX_i(t)/dt) dt .   (6.9)

For a function f one has the differential

df(X₁, …, X_N) = (∂f/∂X₁)(X₁, …, X_N) dX₁ + … + (∂f/∂X_N)(X₁, …, X_N) dX_N .   (6.10)

df is called an “exact” 1-form. From the definition of the path integral along γ it is
obvious that

∫_γ df = ∫₀¹ (d/dt){f(X₁(t), …, X_N(t))} dt = f(γ(1)) − f(γ(0)) ,   (6.11)
so the integral of an exact 1-form only depends on the beginning and endpoint of the
path. An example of a curve γ : [0, 1] → ℝ² is shown in Figure 6.4.
The converse is also true: the integral is independent of the path γ if and only if there
exists a function f on ℝ^N, such that df = α, or equivalently, if and only if α_i = ∂f/∂X_i.
A general p-form is an expression of the form

α = Σ_{i₁,…,i_p} α_{i₁…i_p}(X₁, …, X_N) dX_{i₁} ⋯ dX_{i_p} ,   (6.12)

where α_{i₁…i_p} are (smooth) functions of the coordinates X_i. We declare the dX_i to
anti-commute,

dX_i dX_j = −dX_j dX_i .   (6.13)

Then we may think of the coefficient tensors as totally anti-symmetric, i.e. we can
assume without loss of generality that

α_{i_{σ(1)}…i_{σ(p)}} = sgn(σ) α_{i₁…i_p} ,   (6.14)

where σ is any permutation of p elements and sgn is its signum (see the discussion of
fermions in the chapter on the ideal quantum gas). We may now introduce an operator
d with the following properties:

(i) d(f g) = (df) g + (−1)^p f (dg),
(ii) df = Σ_i (∂f/∂X_i) dX_i for functions (0-forms) f,
(iii) d(dX_i) = 0,
(iv) d(f + g) = df + dg,

where in (i), (iv) f is any p-form and g is any q-form. On scalars (i.e. 0-forms) the
operator is defined by (ii) as before, and the remaining rules then determine it
for any p-form. The relation (6.13) can be interpreted as saying that we should think
of the differentials dX_i, i = 1, …, N as “fermionic” or “anti-commuting” variables.² For
instance, we then get for a 1-form α:

dα = Σ_{i,j} (∂α_i/∂X_j) dX_j dX_i   (6.15)
= (1/2) Σ_{i,j} (∂α_i/∂X_j − ∂α_j/∂X_i) dX_j dX_i .   (6.16)

The expression for dα of a general p-form follows similarly by applying the rules (i)-(iv). The
rules imply the most important relation for p-forms,
d²α = d(dα) = 0 .   (6.17)
Conversely, it can be shown that for any p + 1 form f on RN such that df = 0 we must
have f = dα for some p-form α. This result is often referred to as the Poincaré lemma.
2
Mathematically, the differentials dXi are the generators of a Grassmann algebra of dimension N .
An important and familiar example for this from field theory is provided by force fields f⃗
on ℝ³. The components f_i of the force field may be identified with the components of a
1-form F = Σ f_i dX_i. The condition dF = 0 is seen to be equivalent to ∇⃗ × f⃗ = 0, i.e.
we have a conservative force field. Poincaré’s lemma implies the existence of a potential
−W, such that F = −dW; in vector notation, f⃗ = −∇⃗W. A similar statement holds
for p-forms.
Just as a 1-form can be integrated over oriented curves (1-dimensional surfaces), a
p-form can be integrated over an oriented p-dimensional surface Σ. If that surface is
parameterized by N functions X_i(t₁, …, t_p) of p parameters (t₁, …, t_p) ∈ U ⊂ ℝ^p (the
ordering of which defines an orientation of the surface), we define the corresponding
integral as

∫_Σ α = ∫_U dt₁…dt_p Σ_{i₁,…,i_p} α_{i₁…i_p}(X(t₁, …, t_p)) (∂X_{i₁}/∂t₁) ⋯ (∂X_{i_p}/∂t_p) .   (6.18)
Using the language of differentials, the 1st law of thermodynamics may also be stated
as saying that, in the absence of heat exchange, the infinitesimal work is an exact 1-form,
dE = δW , (6.20)
or alternatively,
dδW = 0 . (6.21)
We can break up the infinitesimal work change into the various forms of possible work,
such as in

dE = δQ + Σ_i J_i dX_i ,   (6.23)

where the J_i are the “forces” and the dX_i the “displacements”.
This relation is best viewed as the definition of the infinitesimal heat change δQ. Thus,
we could say that the first law is just energy conservation, where energy can consist of
either mechanical work or heat. We may then write
δQ = dE − Σ_i J_i dX_i ,   (6.24)

with J_i the forces and dX_i the displacements, from which it can be seen that δQ is a
1-form depending on the variables (E, X₁, …, X_n). An overview over several thermodynamic
forces and displacements is given in the following table:
Table 6.1.: Some thermodynamic forces and displacements for various types of systems.
Since δQ is not exact, its integral depends on the path:

∆Q₁ = ∫_{γ₁} δQ ≠ ∫_{γ₂} δQ = ∆Q₂ .

So, there does not exist a function Q = Q(V, A, N, …) such that δQ = dQ! Traditionally,
one refers to processes where δQ ≠ 0 as “non-adiabatic”, i.e. heat is transferred.
One important consequence of the 2nd law is the existence of a state function S,
called entropy. As before, we denote the n “displacement variables” generically by
Xi ∈ {V , N , . . .} and the “forces” by Ji ∈ {−P , µ, . . .}, and consider equilibrium states
labeled by (E, {Xi}) in an (n + 1)-dimensional space. We consider within this space the
“adiabatic” submanifold A of all states that can be reached from a given state (E ∗ , {Xi∗ })
by means of a reversible and quasi-static (i.e. sufficiently slowly performed) process.
On this submanifold we must have
$$dE - \sum_{i=1}^n J_i\, dX_i = 0\,, \tag{6.25}$$
since otherwise there would exist processes disturbing the energy balance (through the
exchange of heat), and we could then choose a sign of δQ such that work is performed
on a system by converting heat energy into work, which is impossible by the 2nd law.
We choose a (not uniquely defined) function S labeling different submanifolds A:
[Figure: the submanifold A through the point (E ∗ , X1∗ ) in the (E, X1)-plane (X1 = e.g. V); A is called an adiabatic, and S = const. on A.]
This means that dS is proportional to $dE - \sum_{i=1}^n J_i\, dX_i$. Thus, at each point (E, {Xi})
there is a function Θ(E, X1, ..., Xn) such that
$$\Theta\, dS = dE - \sum_{i=1}^n J_i\, dX_i\,. \tag{6.26}$$
Θ can be identified with the temperature T [K] for suitable choice of S = S(E, X1 , ..., Xn ),
which then uniquely defines S. This is seen for instance by comparing the coefficients in
$$T\, dS = T \left( \frac{\partial S}{\partial E}\, dE + \sum_{i=1}^n \frac{\partial S}{\partial X_i}\, dX_i \right) = dE - \sum_{i=1}^n J_i\, dX_i\,, \tag{6.27}$$
which yields
$$0 = \underbrace{\left( T\, \frac{\partial S}{\partial E} - 1 \right)}_{=0} dE + \sum_{i=1}^n \underbrace{\left( T\, \frac{\partial S}{\partial X_i} + J_i \right)}_{=0} dX_i\,, \tag{6.28}$$
hence
$$\frac{1}{T} = \frac{\partial S}{\partial E} \qquad \text{and} \qquad J_i = -T\, \frac{\partial S}{\partial X_i}\,. \tag{6.29}$$
We recognize the first of those relations as the defining relation for temperature which
was stated in the microcanonical ensemble (cf. section 4.2.1). We can now rewrite (6.26)
as
$$dE = T\, dS + \sum_{i=1}^n J_i\, dX_i = T\, dS - P\, dV + \mu\, dN + \dots\,. \tag{6.30}$$
By comparing this formula with that for energy conservation for a process without heat
transfer, we identify
$$\delta Q = \text{heat transfer} = T\, dS \quad \Rightarrow \quad dS = \frac{\delta Q}{T} \quad (\text{noting that } d(\delta Q) \neq 0\,!)\,. \tag{6.31}$$
Equation (6.30), which was derived for quasi-static processes, is the most important
equation in thermodynamics.
As an example, consider an adiabatic process (δQ = 0) of the monatomic ideal gas, for which
$$0 = dE + P\, dV\,.$$
Since here
$$P = P(E, V) = \frac{2E}{3V}\,, \tag{6.32}$$
we therefore have
$$0 = dE + \frac{2E}{3V}\, dV\,. \tag{6.33}$$
Thus, we can parametrize the adiabatic A by E = E(V), such that $dE = \frac{\partial E(V)}{\partial V}\, dV$ on
A. We then obtain
$$0 = \left( \frac{\partial E}{\partial V} + \frac{2}{3}\, \frac{E}{V} \right) dV \quad \Rightarrow \quad E(V) = E^* \left( \frac{V^*}{V} \right)^{2/3}.$$
[Figure: the adiabatic A through (E ∗ , V ∗ ) in the (V, E)-plane.]
Of course, we may also switch to other thermodynamic variables, like (S, V ), such
that E now becomes a function of (S, V ):
$$dE = T\, dS - P\, dV = \left( \frac{\partial E}{\partial V} \right)_S dV + \left( \frac{\partial E}{\partial S} \right)_V dS \tag{6.34}$$
$$0 = \underbrace{\left( \frac{\partial E}{\partial V} + P \right)}_{=0} dV + \underbrace{\left( \frac{\partial E}{\partial S} - T \right)}_{=0} dS \tag{6.35}$$
$$T = \left. \frac{\partial E}{\partial S} \right|_V \qquad \text{and} \qquad P = -\left. \frac{\partial E}{\partial V} \right|_S\,, \tag{6.36}$$
which hold generally (cf. section 4.2.1, eq. (4.17)). For an ideal gas ($PV = N k_B T$ and
$E = \frac{3}{2} N k_B T$) we thus find
$$-\frac{\partial E}{\partial V}\, V = k_B N\, \frac{\partial E}{\partial S}\,, \qquad E = \frac{3}{2}\, k_B N\, \frac{\partial E}{\partial S}\,.$$
This coincides with the expression (4.16), found in section 4.2.1 with the help of classical
statistical mechanics, provided we set $c^* = (4\pi m)^{3/2} \left( \frac{e}{N} \right)^{5/2}$. Indeed, we find in that case
$$S = N k_B \log \left[ \frac{V}{N} \left( \frac{4\pi e m}{3}\, \frac{E}{N} \right)^{3/2} \right]. \tag{6.41}$$
This coincides with the formula found before in the context of the microcanonical
ensemble. (Note that we must treat the particles there as indistinguishable and include
the $\frac{1}{N!}$ into the definition of the microcanonical partition function W(E, N, V) for
indistinguishable particles, cf. section 4.2.3.)
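The consistency of (6.41) with the relation 1/T = ∂S/∂E can be checked numerically. The following sketch (units with k_B = m = 1, and hypothetical state values) differentiates S from (6.41) by finite differences and confirms E = (3/2)N k_B T:

```python
import math

def S(E, V, N, m=1.0, kB=1.0):
    """Entropy (6.41) of the monatomic ideal gas (units with kB = m = 1)."""
    return N * kB * math.log((V / N) * (4 * math.pi * math.e * m * E / (3 * N)) ** 1.5)

E, V, N = 7.5, 2.0, 5.0                              # hypothetical state
h = 1e-5
dSdE = (S(E + h, V, N) - S(E - h, V, N)) / (2 * h)   # finite-difference dS/dE = 1/T
T = 1.0 / dSdE
print(T, 2 * E / (3 * N))   # both 1.0, consistent with E = (3/2) N kB T
```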
We next discuss the Carnot engine for an ideal (monatomic) gas. As discussed in
section 4.2, the ideal gas is characterized by the relations
$$E = \frac{3}{2}\, N k_B T = \frac{3}{2}\, P V\,. \tag{6.42}$$
The Carnot cycle consists of the following 4 steps (cf. Figure 6.7):
I → II: isothermal expansion at temperature TH
II → III: adiabatic expansion
III → IV: isothermal compression at temperature TC
IV → I: adiabatic compression
Its efficiency is defined as
$$\eta := \frac{\Delta W}{\Delta Q_{in}}\,, \tag{6.43}$$
where
$$\Delta Q_{in} = \int_I^{II} \delta Q$$
is the total heat added to the system (analogously, $\Delta Q_{out} = \int_{III}^{IV} \delta Q$ is the total heat
given off by the system into a colder reservoir), and where
$$\Delta W = \oint \delta W = \left( \int_I^{II} + \int_{II}^{III} + \int_{III}^{IV} + \int_{IV}^{I} \right) \delta W$$
is the total work done by the system. We may also write $\delta Q = T\, dS$ and $\delta W = P\, dV$
(or more generally $\delta W = -\sum_{i=1}^n J_i\, dX_i$ if other types of mechanical/chemical work are
performed by the system). By definition no heat exchange takes place during II → III
and IV → I.
We now wish to calculate ηCarnot . We can for instance take P and V as the variables
to describe the process. We have P V = const. for isothermal processes by (6.42). To
calculate the adiabatics, we could use the results from above and change the variables
from (E, V) → (P, V) using (6.42), but it is just as easy to do this from scratch: We
start with δQ = 0 for an adiabatic process. From this it follows that
$$0 = dE + P\, dV \tag{6.44}$$
with
$$dE = \frac{3}{2}\, d(PV) = \frac{3}{2} \left( V\, \frac{\partial P}{\partial V} + P \right) dV\,, \tag{6.45}$$
and therefore
$$0 = \frac{3}{2}\, d(PV) + P\, dV = \left( \frac{3}{2}\, V\, \frac{\partial P}{\partial V} + \frac{5}{2}\, P \right) dV\,. \tag{6.46}$$
It follows that
$$V\, \frac{\partial P}{\partial V} = -\frac{5}{3}\, P \quad \Rightarrow \quad P V^\gamma = \text{const.}\,, \qquad \gamma = \frac{5}{3}\,. \tag{6.47}$$
Figure 6.7.: Carnot cycle for an ideal gas. The solid lines indicate isotherms and the
dashed lines indicate adiabatics.
From $E = \frac{3}{2} P V$, which gives dE = 0 on isotherms, it follows that the total heat added to
the system is given by
$$\Delta Q_{in} = \int_I^{II} (dE + P\, dV) = \int_I^{II} P\, dV = N k_B T_H \int_I^{II} V^{-1}\, dV = N k_B T_H \log \frac{V_{II}}{V_I}\,, \tag{6.48}$$
using the 1st law ($\delta Q = dE + P\, dV$), dE = 0 on isotherms, and $P V = N k_B T_H$ on the isotherm.
Using this result together with $P\, dV = -dE$ on adiabatics we find for the total mechanical
work done by the system:
$$\begin{aligned}
\Delta W &= \int_I^{II} P\, dV + \int_{II}^{III} P\, dV + \int_{III}^{IV} P\, dV + \int_{IV}^{I} P\, dV \\
&= N k_B T_H \log \frac{V_{II}}{V_I} - \int_{II}^{III} dE - N k_B T_C \log \frac{V_{III}}{V_{IV}} - \int_{IV}^{I} dE \\
&= E_{II} - E_{III} + E_{IV} - E_I + N k_B \left( T_H \log \frac{V_{II}}{V_I} - T_C \log \frac{V_{III}}{V_{IV}} \right).
\end{aligned}$$
Since $E = \frac{3}{2} N k_B T$ depends only on the temperature, we have $E_{II} = E_I$ and $E_{III} = E_{IV}$,
so these terms cancel:
$$\Delta W = N k_B \left( T_H \log \frac{V_{II}}{V_I} - T_C \log \frac{V_{III}}{V_{IV}} \right). \tag{6.49}$$
$$\eta_{Carnot} = \frac{\Delta W}{\Delta Q_{in}} = 1 - \frac{T_C \log(V_{III}/V_{IV})}{T_H \log(V_{II}/V_I)}\,. \tag{6.50}$$
The relation (6.47) for the adiabatics, together with the ideal gas condition (6.42), implies
$$P_{II}\, V_{II}^{\gamma} = P_{III}\, V_{III}^{\gamma} \quad \Rightarrow \quad T_H\, V_{II}^{\gamma - 1} = T_C\, V_{III}^{\gamma - 1}\,,$$
and similarly $T_H\, V_I^{\gamma - 1} = T_C\, V_{IV}^{\gamma - 1}$. Dividing these relations gives $V_{II}/V_I = V_{III}/V_{IV}$,
so the logarithms in (6.50) cancel and
$$\eta = 1 - \frac{T_C}{T_H}\,. \tag{6.51}$$
This fundamental relation for the efficiency of a Carnot cycle can be derived also using
the variables (T , S) instead of (P , V ), which also reveals the distinguished role played
by this process. As dT = 0 for isotherms and dS = 0 for adiabatic processes, the Carnot
cycle is just a rectangle in the T -S-diagram:
[Figure: the Carnot cycle as a rectangle in the (S, T)-diagram: the isotherm I → II at T = TH between SI and SII, and the isotherm III → IV at T = TC; A denotes the enclosed area.]
$$\Delta Q_{in} = \int_I^{II} \delta Q = \int_I^{II} T\, dS = T_H\, (S_{II} - S_I)\,. \tag{6.52}$$
To compute ∆W , the total mechanical work done by the system, we observe that (as
∮ dE = 0)
$$\Delta W = \oint \delta W = \oint P\, dV = \oint (P\, dV + dE) = \oint T\, dS\,.$$
If A is the domain enclosed by the rectangular curve describing the process in the T-S
diagram, Stokes' theorem gives
$$\Delta W = \oint T\, dS = \int_A d(T\, dS) = \int_A dT\, dS = (T_H - T_C)(S_{II} - S_I)\,,$$
hence
$$\eta_{Carnot} = \frac{\Delta W}{\Delta Q_{in}} = \frac{(T_H - T_C)\, \Delta S}{T_H\, \Delta S} = 1 - \frac{T_C}{T_H} < 1\,. \tag{6.53}$$
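The efficiency (6.51)/(6.53) can also be verified by brute-force integration of the Carnot cycle in the (P, V)-plane for the monatomic ideal gas, using PV = const. on isotherms and PV^(5/3) = const. on adiabatics. The numerical values below (temperatures, volumes, N k_B = 1) are hypothetical:

```python
def integrate(P, Va, Vb, steps=20000):
    """Midpoint-rule integral of P(V) dV from Va to Vb (Vb < Va allowed)."""
    h = (Vb - Va) / steps
    return sum(P(Va + (k + 0.5) * h) for k in range(steps)) * h

N_kB, TH, TC = 1.0, 2.0, 1.0
V1, V2 = 1.0, 2.0                          # volumes at states I and II
V3 = V2 * (TH / TC) ** 1.5                 # adiabat: T V^(2/3) = const
V4 = V1 * (TH / TC) ** 1.5

iso = lambda T: (lambda V: N_kB * T / V)   # isotherm: P = N kB T / V
def adiabat(P0, V0):                       # adiabat: P V^(5/3) = const
    return lambda V: P0 * (V0 / V) ** (5.0 / 3.0)

Q_in = integrate(iso(TH), V1, V2)                       # I -> II (heat absorbed)
W = (Q_in
     + integrate(adiabat(iso(TH)(V2), V2), V2, V3)      # II -> III
     + integrate(iso(TC), V3, V4)                       # III -> IV (negative)
     + integrate(adiabat(iso(TC)(V4), V4), V4, V1))     # IV -> I (negative)

eta = W / Q_in
print(eta, 1 - TC / TH)   # both 0.5
```

The two adiabatic work contributions cancel, so the efficiency reduces to 1 − TC/TH as derived above.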
Consider now the more general cycle given by the curve C in the (T , S)-diagram depicted
in the figure below:
[Figure: a general cyclic process C in the (S, T)-diagram between temperatures TC and TH; on the part C+ the heat ∆Qin is injected, on the part C− the heat ∆Qout is given off.]
We define C± to be the part of the boundary curve C where heat is injected resp.
given off. Then we have dS > 0 on C+ and dS < 0 on C− . For such a process, we define
the efficiency η = η(C) as before by the ratio of net work ∆W and injected heat ∆Qin :
$$\eta = \frac{\Delta W}{\Delta Q_{in}}\,. \tag{6.54}$$
We have
$$\Delta W = \oint_C \delta W = \oint_C (T\, dS - dE) = \oint_C T\, dS\,, \qquad \Delta Q_{in} = \int_{C_+} T\, dS\,,$$
hence
$$\eta = \frac{\oint_C T\, dS}{\int_{C_+} T\, dS} = 1 + \frac{\int_{C_-} T\, dS}{\int_{C_+} T\, dS} = 1 - \frac{\Delta Q_{out}}{\Delta Q_{in}}\,. \tag{6.55}$$
Furthermore,
$$0 \le \int_{C_+} T\, dS \le T_H \int_{C_+} dS \qquad (\text{as } dS > 0 \text{ on } C_+)\,,$$
$$\int_{C_-} T\, dS \le T_C \int_{C_-} dS \le 0 \qquad (\text{as } dS \le 0 \text{ on } C_-)\,,$$
so that
$$\eta_C = 1 + \frac{\int_{C_-} T\, dS}{\int_{C_+} T\, dS} \le 1 + \frac{T_C \int_{C_-} dS}{T_H \int_{C_+} dS} = 1 - \frac{T_C}{T_H} = \eta_{Carnot}\,, \tag{6.56}$$
where we used the above inequalities as well as 0 = ∮ dS = ∫C+ dS + ∫C− dS. Thus, we
conclude that an arbitrary process is always less efficient than the Carnot process. This
is why the Carnot process plays a distinguished role.
We can get a more intuitive understanding of this important finding by considering
the following process:
[Figure: a cyclic process C in the (S, T)-diagram between TC and TH, enclosing the region A, compared with the circumscribed Carnot rectangle of width ∆S (dashed).]
The heat ∆Qin is given by ∆Qin = TH ∆S, and as before $\Delta W = \oint_C T\, dS = \int_A dT\, dS$. Thus,
∆W is the area of the region A enclosed by the closed curve C. This is clearly smaller than the area
enclosed by the corresponding Carnot cycle (dashed rectangle). Now divide a general
cyclic process into C = C1 ∪ C2 , as sketched in the following figure:
Figure 6.10.: A generic cyclic process divided into two parts by an isotherm at temper-
ature TI .
This process describes two cyclic processes acting one after the other, where the heat
dropped during cycle C1 is injected during cycle C2 at temperature TI . It follows from
the discussion above that
$$\eta(C_2) = \frac{\Delta W_2}{\Delta Q_{2,in}} \le \frac{T_I - T_C}{T_I} = 1 - \frac{T_C}{T_I}\,, \tag{6.57}$$
which means that the cycle C2 is less efficient than the Carnot process acting between
temperatures TI and TC . It remains to show that the cycle C1 is also less efficient than
the Carnot cycle acting between temperatures TH and TI . The work ∆W1 done along
C1 is again smaller than the area enclosed by the latter Carnot cycle, i.e. we have
∆W1 ≤ (TH − TI )∆S. Furthermore, we must have ∆Q1,in = ∆W1 + ∆Q1,out with
∆Q1,out ≥ TI ∆S, which yields
$$\eta(C_1) = \frac{\Delta W_1}{\Delta Q_{1,in}} = \frac{\Delta W_1}{\Delta W_1 + \Delta Q_{1,out}} \le \frac{(T_H - T_I)\, \Delta S}{(T_H - T_I)\, \Delta S + T_I\, \Delta S} = 1 - \frac{T_I}{T_H}\,.$$
Thus, the cycle C1 is less efficient than the Carnot cycle acting between temperatures
TH and TI . It follows that the cycle C = C1 ∪ C2 must be less efficient than the Carnot
cycle acting between temperatures TH and TC .
Another example of a cyclic process is the Diesel engine. The idealized version of this
process consists of the following 4 steps (cf. Figure 6.11):
I → II: adiabatic compression
II → III: isobaric expansion at pressure PII = PIII , with heat ∆Qin injected
III → IV: adiabatic expansion with work done by the expanding fluid
IV → I: isochoric decrease of pressure at volume VIV = VI , with heat ∆Qout given off
Figure 6.11.: The process describing the Diesel engine in the (P , V )-diagram.
The efficiency is
$$\eta_{Diesel} = \frac{\Delta W}{\Delta Q_{in}} = \frac{\left( \int_I^{II} + \int_{II}^{III} + \int_{III}^{IV} + \int_{IV}^{I} \right) T\, dS}{\int_{II}^{III} T\, dS}\,,$$
and since no heat is exchanged on the adiabatics I → II and III → IV,
$$\eta_{Diesel} = 1 + \frac{\int_{IV}^{I} T\, dS}{\int_{II}^{III} T\, dS}\,. \tag{6.58}$$
$$\int_{IV}^{I} T\, dS = \int_{IV}^{I} (dE + P\, dV) = \int_{IV}^{I} \Big( \frac{3}{2}\, V\, dP + \frac{5}{2}\, P\, \underbrace{dV}_{=0} \Big) = \frac{3}{2}\, N k_B\, (T_I - T_{IV})\,,$$
$$\int_{II}^{III} T\, dS = \int_{II}^{III} (dE + P\, dV) = \int_{II}^{III} \Big( \frac{3}{2}\, V\, \underbrace{dP}_{=0} + \frac{5}{2}\, P\, dV \Big) = \frac{5}{2}\, N k_B\, (T_{III} - T_{II})\,,$$
hence
$$\eta_{Diesel} = 1 - \frac{3}{5}\, \frac{T_{IV} - T_I}{T_{III} - T_{II}}\,. \tag{6.59}$$
We now turn to the thermodynamic potentials. Recall the first law in the form
$$dE = T\, dS - P\, dV + \mu\, dN + \dots \quad \Big( = T\, dS + \sum_{i=1}^n J_i\, dX_i \Big)\,. \tag{6.60}$$
To obtain a potential whose natural variables are (T, V, N), one defines the free energy
$$F = E - TS\,. \tag{6.61}$$
Its differential is
$$\begin{aligned}
dF &= dE - S\, dT - T\, dS \\
&= T\, dS - P\, dV + \mu\, dN + \dots - S\, dT - T\, dS \\
&= -S\, dT - P\, dV + \mu\, dN + \dots \quad \Big( = -S\, dT + \sum_{i=1}^n J_i\, dX_i \Big)\,.
\end{aligned}$$
$$S = -\left. \frac{\partial F}{\partial T} \right|_{V,N}\,, \qquad P = -\left. \frac{\partial F}{\partial V} \right|_{T,N}\,, \qquad \mu = \left. \frac{\partial F}{\partial N} \right|_{T,V}\,, \quad \dots \tag{6.64}$$
Indeed, with $F = -k_B T \log \mathrm{tr}\, e^{-\frac{H}{k_B T}}$:
$$\frac{\partial F}{\partial T} = -k_B \left\{ \log \mathrm{tr}\, e^{-\frac{H}{k_B T}} + \frac{1}{k_B T}\, \frac{\mathrm{tr}\, H\, e^{-\frac{H}{k_B T}}}{\mathrm{tr}\, e^{-\frac{H}{k_B T}}} \right\} = k_B\, \mathrm{tr}\, \rho \log \rho = -S\,.$$
In the same way, we may look for a function G of the variables (T , µ, V ). To this end,
we form the grand potential
G = E − T S − µN = F − µN (6.67)
The differential of G is
dG = dF − µdN − N dµ
= −SdT − P dV − N dµ
Writing out dG as
$$dG = \left. \frac{\partial G}{\partial T} \right|_{V,\mu} dT + \left. \frac{\partial G}{\partial V} \right|_{T,\mu} dV + \left. \frac{\partial G}{\partial \mu} \right|_{T,V} d\mu$$
and comparing the coefficients, we get
$$0 = \left( \left. \frac{\partial G}{\partial T} \right|_{V,\mu} + S \right) dT + \left( \left. \frac{\partial G}{\partial V} \right|_{T,\mu} + P \right) dV + \left( \left. \frac{\partial G}{\partial \mu} \right|_{T,V} + N \right) d\mu\,,$$
hence
$$S = -\left. \frac{\partial G}{\partial T} \right|_{V,\mu}\,, \qquad N = -\left. \frac{\partial G}{\partial \mu} \right|_{T,V}\,, \qquad P = -\left. \frac{\partial G}{\partial V} \right|_{T,\mu}\,. \tag{6.68}$$
Recall that in the grand canonical ensemble
$$\rho(T, \mu, V) = \frac{1}{Y}\, e^{-\frac{H(V) - \mu \hat N}{k_B T}} \qquad \text{and} \qquad S = -k_B\, \mathrm{tr}\, \rho \log \rho\,. \tag{6.69}$$
Indeed, with $G = -k_B T \log \mathrm{tr}\, e^{-\frac{H - \mu \hat N}{k_B T}}$:
$$\frac{\partial G}{\partial T} = -k_B \left\{ \log \mathrm{tr}\, e^{-\frac{H - \mu \hat N}{k_B T}} + \frac{1}{k_B T}\, \frac{\mathrm{tr}\, (H - \mu \hat N)\, e^{-\frac{H - \mu \hat N}{k_B T}}}{\mathrm{tr}\, e^{-\frac{H - \mu \hat N}{k_B T}}} \right\} = k_B\, \mathrm{tr}\, \rho \log \rho = -S\,.$$
The second relation can be demonstrated in a similar way (with $N = \langle \hat N \rangle$). To get
a function H which naturally depends on the variables (P, T, N), we form the free
enthalpy H = E − TS + PV, whose derivatives are
$$S = -\left. \frac{\partial H}{\partial T} \right|_{P,N}\,, \qquad \mu = \left. \frac{\partial H}{\partial N} \right|_{P,T}\,, \qquad V = \left. \frac{\partial H}{\partial P} \right|_{N,T}\,, \tag{6.72}$$
or equivalently
dH = −SdT + V dP + µdN . (6.73)
The free enthalpy³ is often used in the context of chemical processes, because these
naturally occur at constant atmospheric pressure. For processes at constant pressure P
(isobaric processes) we then have $\delta Q = dE + P\, dV = d(E + PV)$, i.e. the heat transfer
equals the change of the enthalpy.
Since S is extensive, i.e. homogeneous of degree one in (E, V, {Ni}), Euler's theorem gives
$$S = \left. \frac{\partial S}{\partial E} \right|_{V, N_i} E + \left. \frac{\partial S}{\partial V} \right|_{E, N_i} V + \sum_i \left. \frac{\partial S}{\partial N_i} \right|_{V, E} N_i\,. \tag{6.76}$$
Using $\partial S/\partial E = 1/T$, $\partial S/\partial V = P/T$ and $\partial S/\partial N_i = -\mu_i/T$, this gives
$$E + P V - \sum_i \mu_i N_i - T S = 0\,, \tag{6.78}$$
or equivalently
$$H = \sum_i \mu_i N_i\,. \tag{6.79}$$
³ One also uses the enthalpy, defined as E + PV. Its natural variables are S, P, N, which is more useful
for processes leaving S unchanged.
Let us summarize the properties of the potentials we have discussed so far in a table:

Thermodynamic potential | Definition | Natural variables | Fundamental equation
energy E | — | (S, V, N) | dE = T dS − P dV + µ dN
free energy F | F = E − TS | (T, V, N) | dF = −S dT − P dV + µ dN
grand potential G | G = E − TS − µN | (T, V, µ) | dG = −S dT − P dV − N dµ
free enthalpy H | H = E − TS + PV | (T, P, N) | dH = −S dT + V dP + µ dN
The relationship between the various potentials can be further elucidated by means
of the Legendre transform (cf. exercises). This characterization is important because
it makes transparent the convexity and concavity properties of G and F, respectively,
which follow from the convexity of S.
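As an illustration of the Legendre transform, one may compute F(T, V) = min_S [E(S, V) − TS] numerically for the monatomic ideal gas, with E(S, V) obtained by inverting the entropy (6.41); at the minimum ∂E/∂S = T, which for this gas means E = (3/2)T. The sketch below uses units with k_B = m = 1, N = 1, and hypothetical values of T and V:

```python
import math

def E_of_S(S, V, N=1.0):
    """Monatomic ideal gas energy E(S, V, N), obtained by inverting the
    entropy (6.41) (units with kB = m = 1)."""
    return (3 * N / (4 * math.pi * math.e)) * (N / V) ** (2 / 3) * math.exp(2 * S / (3 * N))

T, V = 1.0, 2.0
# Legendre transform F(T, V) = min_S [ E(S, V) - T S ], here by a grid search
grid = [s * 1e-3 for s in range(-5000, 15000)]
S_star = min(grid, key=lambda S: E_of_S(S, V) - T * S)
F = E_of_S(S_star, V) - T * S_star

# at the minimum T = (dE/dS)|_V, i.e. E(S_star, V) = (3/2) T:
print(E_of_S(S_star, V), 1.5 * T)
print(F)
```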
A chemical reaction among k compounds can be written in the form $\sum_{i=1}^k r_i\, \chi_i \rightleftharpoons 0$
with integer coefficients $r_i$, where χi is the chemical symbol of the i-th compound. For example, the reaction
C + O2 ⇆ CO2
is described by χ1 =C, χ2 =O2 , χ3 =CO2 and r1 = −1, r2 = −1, r3 = +1, or r = (−1, −1, +1).
The full system is described by some complicated Hamiltonian H(V ) and number op-
erators N̂i for the i-th compound. Since the dynamics can change the particle number,
we will have [H(V ), N̂i ] ≠ 0 in general. We imagine that an entropy S(E, V , {Ni }) can
be assigned to an ensemble of states with energy between E − ∆E and E, and average
particle numbers {Ni = ⟨N̂i ⟩}, but we note that the definition of S in microscopic terms
is far from obvious because N̂i is not a constant of motion.
The entropy should be maximized in equilibrium. Since N = (N1 , . . . , Nk ) changes by
integer multiples of the vector r in each reaction, the equilibrium condition is
$$\frac{d}{dn}\, S(E, V, N + n r) \Big|_{n=0} = 0\,. \tag{6.81}$$
Since $\partial S/\partial N_i = -\mu_i/T$, this is equivalent to
$$0 = \mu \cdot r = \sum_{i=1}^k \mu_i\, r_i\,. \tag{6.82}$$
i=1
Let us now assume that in equilibrium we can use the expression for µi of an ideal
gas with k distinguishable components and Ni indistinguishable particles of the i-th
component. This is basically the assumption that interactions contribute negligibly to
the entropy of the equilibrium state. According to the discussion in section 4.2.3 the
total entropy is given by
$$S = \sum_{i=1}^k S_i + \Delta S\,, \tag{6.83}$$
where Si = S(Ei , Vi , Ni ) is the entropy of the i-th species, ∆S is the mixing entropy, and
we have
$$\frac{N_i}{V_i} = \frac{N}{V}\,, \qquad \sum_i N_i = N\,, \qquad \sum_i V_i = V\,, \qquad \sum_i E_i = E\,. \tag{6.84}$$
The entropy of the i-th species is given by
$$S_i = N_i k_B \left[ \log \frac{e V_i}{N_i} + \log \left( \frac{4}{3}\, \pi e m_i\, \frac{E_i}{N_i} \right)^{3/2} \right]. \tag{6.85}$$
The mixing entropy is
$$\Delta S = -N k_B \sum_{i=1}^k (c_i \log c_i - c_i)\,, \tag{6.86}$$
where $c_i = \frac{N_i}{N}$ is the concentration of the i-th component. Let $\bar\mu_i$ be the chemical potential
of the i-th species without taking into account the contribution due to the mixing:
$$\frac{\bar\mu_i}{T} = -\left. \frac{\partial S_i}{\partial N_i} \right|_{V_i, E_i} = -k_B \log \left[ \frac{V_i}{N_i} \left( \frac{4 \pi m_i}{3}\, \frac{E_i}{N_i} \right)^{3/2} \right] = -\frac{S_i}{N_i} + \frac{5}{2}\, k_B\,.$$
The full chemical potential of the i-th species is then
$$\begin{aligned}
\mu_i &= \bar\mu_i + k_B T \log c_i = \frac{5}{2}\, k_B T - \frac{S_i T}{N_i} + k_B T \log c_i \\
&= \frac{1}{N_i}\, (E_i + P V_i - T S_i) + k_B T \log c_i = h_i + k_B T \log c_i\,,
\end{aligned}$$
with $h_i = H_i / N_i$ the free enthalpy per particle for species i,
where we have used the equations of state for the ideal gas for each species. From this
it follows that the condition (6.82) for equilibrium becomes
$$0 = \sum_i r_i\, \mu_i = \sum_i r_i\, (h_i + k_B T \log c_i)\,, \tag{6.87}$$
which yields
$$1 = e^{\frac{\Delta h}{k_B T}} \prod_i c_i^{r_i}\,, \tag{6.88}$$
or equivalently
$$e^{-\frac{\Delta h}{k_B T}} = \frac{\prod_{r_i > 0} c_i^{|r_i|}}{\prod_{r_i < 0} c_i^{|r_i|}}\,, \tag{6.89}$$
with $\Delta h = \sum_i r_i\, h_i$ the free enthalpy increase for one reaction. The above relation is sometimes
called the “mass-action law”. It is clearly in general not an exact relation, because we
have treated the constituents as ideal gases. Nevertheless, it is often a surprisingly good
approximation.
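The mass-action law can be checked numerically by minimizing the total free enthalpy ∑_i N_i (h_i + k_B T log c_i) over the reaction extent, for the example C + O₂ ⇌ CO₂ treated above. The per-particle free enthalpies h_i below are hypothetical, and units are chosen so that k_B T = 1:

```python
import math

kT = 1.0
h = {"C": -1.0, "O2": -1.5, "CO2": -4.0}     # hypothetical free enthalpies per particle
r = {"C": -1, "O2": -1, "CO2": +1}           # stoichiometry r = (-1, -1, +1)
N0 = {"C": 1.0, "O2": 1.0, "CO2": 0.0}       # initial particle numbers

def G(x):
    """Total free enthalpy sum_i N_i (h_i + kT log c_i) at reaction extent x."""
    N = {s: N0[s] + r[s] * x for s in N0}
    Ntot = sum(N.values())
    return sum(N[s] * (h[s] + kT * math.log(N[s] / Ntot)) for s in N if N[s] > 0)

# minimize G over the reaction extent x on a grid
xs = [k * 1e-5 for k in range(1, 100000)]
x_eq = min(xs, key=G)

N = {s: N0[s] + r[s] * x_eq for s in N0}
Ntot = sum(N.values())
c = {s: N[s] / Ntot for s in N}
lhs = math.exp(-(h["CO2"] - h["C"] - h["O2"]) / kT)   # e^(-Δh / kT)
rhs = c["CO2"] / (c["C"] * c["O2"])                    # mass-action ratio (6.89)
print(lhs, rhs)   # should agree at equilibrium
```

The minimizer of the total free enthalpy indeed satisfies (6.89), since ∂G/∂x = ∑_i r_i µ_i vanishes there.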
chemical potentials µi must have the same value in each phase, i.e. we have for all α:
$$\frac{\partial S}{\partial E}(X^{(\alpha)}) = \frac{1}{T}\,, \qquad \frac{\partial S}{\partial V}(X^{(\alpha)}) = \frac{P}{T}\,, \qquad \frac{\partial S}{\partial N_i}(X^{(\alpha)}) = -\frac{\mu_i}{T}\,. \tag{6.90}$$
It is convenient to collect these intensive quantities in the vector
$$\xi = \left( \frac{1}{T}, \frac{P}{T}, -\frac{\mu_1}{T}, \dots, -\frac{\mu_k}{T} \right). \tag{6.91}$$
Figure 6.12.: Imaginary phase diagram for the case of 6 different phases. At each point
on a phase boundary which is not an intersection point, ϕ = 2 phases are
supposed to coexist. At each intersection point ϕ = 4 phases are supposed
to coexist.
$$\sum_\alpha \lambda^{(\alpha)}\, S(X^{(\alpha)}) \le S\Big( \sum_\alpha \lambda^{(\alpha)}\, X^{(\alpha)} \Big)\,, \tag{6.93}$$
as long as $\sum_\alpha \lambda^{(\alpha)} = 1$, $\lambda^{(\alpha)} \ge 0$. Since the coexisting phases are in equilibrium with
each other, we must have “=” rather than “<” in the above inequality. Otherwise,
the entropy could be increased by passing to the homogeneous state $X^{max} = \sum_\alpha \lambda^{(\alpha)} X^{(\alpha)}$,
and only the single homogeneous phase given by this maximizer $X^{max}$ could
be realized.
By (1) and (2) it follows that in the region C ⊂ R2+k , where several phases can coexist,
$$0 = \frac{d}{d\lambda}\, \xi_I(X + \lambda X^{(\alpha)}) \Big|_{\lambda = 0} = \sum_J X_J^{(\alpha)}\, \frac{\partial}{\partial X_J}\, \xi_I(X) = \sum_J X_J^{(\alpha)}\, \frac{\partial^2}{\partial X_J \partial X_I}\, S(X) = \sum_J X_J^{(\alpha)}\, \frac{\partial}{\partial X_I}\, \xi_J(X)\,,$$
in other words
$$X^{(\alpha)} \cdot d\xi = 0\,, \tag{6.95}$$
which must hold in the coexistence region C. Since the equation must hold for all
α = 1, . . . , ϕ, the coexistence region is subject to ϕ constraints, and we therefore need
f = (2 + k − ϕ) parameters to describe the coexistence region in the phase diagram. This
statement is sometimes called the Gibbs phase rule.
Example:
Consider the following example of a phase boundary between coffee and sugar (solute = sugar):
$$E^{(1)}\, d\left( \frac{1}{T} \right) + V^{(1)}\, d\left( \frac{P}{T} \right) - N^{(1)}\, d\left( \frac{\mu}{T} \right) = 0\,,$$
$$E^{(2)}\, d\left( \frac{1}{T} \right) + V^{(2)}\, d\left( \frac{P}{T} \right) - N^{(2)}\, d\left( \frac{\mu}{T} \right) = 0\,.$$
We assume that the particle numbers are equal in both phases, N (1) = N (2) ≡ N, which
means that f = 2 + k − ϕ = 1. Thus,
$$\left[ E^{(1)} - E^{(2)} + P\, (V^{(1)} - V^{(2)}) \right] \frac{dT}{T^2} = (V^{(1)} - V^{(2)})\, \frac{dP}{T}\,, \tag{6.96}$$
or, equivalently,
$$\frac{dP(T)}{dT} = \frac{\Delta E + P\, \Delta V}{T\, \Delta V}\,. \tag{6.97}$$
Together with the relation $\Delta E = T\, \Delta S - P\, \Delta V$ we find the Clausius-Clapeyron equation
$$\frac{dP}{dT} = \frac{\Delta S}{\Delta V}\,. \tag{6.98}$$
As an application, consider a solid (phase 1) in equilibrium with its vapor (phase 2). For
the volume we should have V (1) ≪ V (2) , from which it follows that ∆V = V (1) − V (2) ≈
−V (2) . For the vapor phase, we assume the relations for an ideal gas, P V (2) = kB T N (2) =
kB T N . Substitution for P gives
dP ∆Q P
= , with ∆Q = −∆S ⋅ T . (6.99)
dT N kB T 2
Assuming $\Delta q = \frac{\Delta Q}{N}$ to be roughly independent of T, we obtain
$$P(T) = P_0\, e^{-\frac{\Delta q}{k_B T}}\,. \tag{6.100}$$
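A quick numerical cross-check of (6.99)-(6.100): integrating dP/dT = ∆q P/(k_B T²) by small Euler steps reproduces the closed form P(T) ∝ e^(−∆q/(k_B T)). Units are chosen with k_B = 1, and the value of ∆q is hypothetical:

```python
import math

kB, dq = 1.0, 2.0        # dq = latent heat per particle (hypothetical value)
T0, T1 = 1.0, 2.0
P = 1.0                  # vapor pressure P(T0)

# Euler integration of dP/dT = dq * P / (kB * T^2)
steps = 100000
hT = (T1 - T0) / steps
T = T0
for _ in range(steps):
    P += hT * dq * P / (kB * T ** 2)
    T += hT

# closed form (6.100): P(T) = P(T0) * exp(-(dq/kB) * (1/T - 1/T0))
P_exact = 1.0 * math.exp(-(dq / kB) * (1 / T1 - 1 / T0))
print(P, P_exact)
```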
The corresponding chemical potentials are denoted µ1 and µ2 . The grand canonical
partition function can be written as
$$Y(\mu_1, \mu_2, V, \beta) = \sum_{N_1 = 0}^{\infty} Y_{N_1}(\mu_2, \beta, V)\, e^{\beta \mu_1 N_1}\,, \tag{6.101}$$
where $Y_{N_1}$ is the grand canonical partition function for substance 2 with a fixed number
N1 of particles of substance 1.⁴ Let now $y_{N_1} := \frac{1}{V}\, \frac{Y_{N_1}}{Y_0}$. It then follows that
$$\log Y \equiv \log Y_0 + \log \frac{Y}{Y_0} = \log Y_0 + \log \left[ 1 + \sum_{N_1 > 0} V\, y_{N_1}\, e^{\beta \mu_1 N_1} \right], \tag{6.102}$$
hence
$$\log Y = \log Y_0 + V\, \underbrace{y_1(\mu_2, \beta)}_{\substack{\text{no } V \text{ dependence for large} \\ \text{systems, as the free energy} \\ G = -k_B T \log Y \,\sim\, V}}\, e^{\beta \mu_1} + O(e^{2 \beta \mu_1})\,. \tag{6.103}$$
⁴ Here we assume implicitly that $[H, \hat N_1] = 0$, so that H maps subspaces of N1 particles to themselves.
The number of particles of substance 1 is obtained as
$$N_1 = \frac{1}{\beta}\, \frac{\partial}{\partial \mu_1} \log Y(\mu_1, \mu_2, V, \beta)\,, \tag{6.104}$$
using the manipulations with thermodynamic potentials reviewed in section 6.5. Because
log Y0 does not depend on µ1 , we find
$$N_1 = V\, y_1\, e^{\beta \mu_1} + O(e^{2 \beta \mu_1})\,, \tag{6.105}$$
i.e. for the density $n_1 = N_1 / V$:
$$n_1 = y_1\, e^{\beta \mu_1} + O(e^{2 \beta \mu_1})\,. \tag{6.106}$$
On the other hand, we have for the pressure (see section 6.5)
$$P = \frac{1}{\beta}\, \frac{\partial}{\partial V} \log Y(\mu_1, \mu_2, V, \beta)\,, \tag{6.107}$$
which follows again from (6.105). Using that y1 is approximately independent of V for
large volume, we obtain the following relation:
$$P = P_0 + k_B T\, y_1\, e^{\beta \mu_1} + O(e^{2 \beta \mu_1})\,,$$
where $P_0 = \frac{1}{\beta} \frac{\partial}{\partial V} \log Y_0$ is the pressure without particles of substance 1. Using
$e^{\beta \mu_1} = \frac{n_1}{y_1} + O(n_1^2)$, which follows from (6.106), we get
$$P = P_0 + n_1 k_B T + O(n_1^2)\,. \tag{6.108}$$
Here we note that y1 , which in general is hard to calculate, fortunately does not appear
on the right hand side at this order of approximation.
Consider now two copies of the system called A and B, separated by a wall which
lets through water, but not the ions of the solute. The concentration $n_1^{(A)}$ of ions on
one side of the wall need not be equal to the concentration $n_1^{(B)}$ on the other side. So
we have different pressures $P^{(A)}$ and $P^{(B)}$. Their difference is
$$\Delta P = P^{(A)} - P^{(B)} = k_B T\, (n_1^{(A)} - n_1^{(B)})\,,$$
hence, writing $\Delta n = n_1^{(A)} - n_1^{(B)}$, we obtain the osmotic formula, due to van ’t Hoff:
$$\Delta P = k_B T\, \Delta n\,. \tag{6.109}$$
In the derivation of this formula we neglected terms of the order n21 , which means that
the formula is valid only for dilute solutions!
A. Dynamical Systems and Approach to Equilibrium
Consider a system with a discrete set of states i = 1, . . . , N, and let $p_i(t)$ denote the
probability of finding the system in state i at time t. We postulate the evolution equation
$$\frac{dp_i(t)}{dt} = \sum_{j \neq i} \left[ T_{ij}\, p_j(t) - T_{ji}\, p_i(t) \right], \tag{A.1}$$
where Tij > 0 is the transition amplitude for going from state j to the state i per unit
of time. We call this law the “master equation.” As already discussed in sec. 3.2, the
master equation can be thought of as a version of the Boltzmann equation. In the context
of quantum mechanics, the transition amplitudes Tij induced by some small perturbation
of the dynamics H1 would e.g. be given by Fermi's golden rule, $T_{ij} = \frac{2\pi n}{h}\, |\langle i | H_1 | j \rangle|^2$,
and would therefore be symmetric in i and j, Tij = Tji . In this section, we do not assume
that the transition amplitude is symmetric as this would exclude interesting examples.
It is instructive to check that the master equation has the desired property of keeping
$p_i(t) \ge 0$ and $\sum_i p_i(t) = 1$. The first property is seen as follows. Suppose that t0 is the first
time that some $p_i(t_0) = 0$. From the structure of the master equation, it then follows
that $dp_i(t_0)/dt > 0$, unless in fact all $p_j(t_0) = 0$. This is impossible, because the sum of
the probabilities is conserved:
$$\frac{d}{dt} \sum_i p_i = \sum_i \frac{d}{dt}\, p_i = \sum_i \sum_{j: j \neq i} (T_{ij}\, p_j - T_{ji}\, p_i) = \sum_{i,j: i \neq j} T_{ij}\, p_j - \sum_{i,j: j \neq i} T_{ji}\, p_i = 0\,.$$
An equilibrium (stationary) distribution $p^{eq}$ must satisfy
$$\sum_{j: j \neq i} T_{ij}\, p_j^{eq} = p_i^{eq} \sum_{j: j \neq i} T_{ji}\,. \tag{A.2}$$
An important special case is that of symmetric transition amplitudes, which occurs for
example if the underlying microscopic dynamics is reversible. In that case, the uniform
distribution $p_i^{eq} = \frac{1}{N}$ is always stationary (microcanonical ensemble).
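This behaviour is easy to verify numerically: the sketch below (with hypothetical symmetric rates) integrates the master equation (A.1) by small Euler steps and checks that total probability is conserved and that the distribution relaxes to the uniform one.

```python
import random

random.seed(0)
N = 5
# symmetric transition amplitudes T_ij = T_ji > 0 (hypothetical values)
T = [[0.0] * N for _ in range(N)]
for i in range(N):
    for j in range(i + 1, N):
        T[i][j] = T[j][i] = random.uniform(0.5, 1.5)

p = [1.0, 0.0, 0.0, 0.0, 0.0]      # start far from equilibrium
dt = 0.001
for _ in range(20000):             # evolve to t = 20
    dp = [sum(T[i][j] * p[j] - T[j][i] * p[i] for j in range(N) if j != i)
          for i in range(N)]
    p = [p[i] + dt * dp[i] for i in range(N)]

print(sum(p))                              # total probability is conserved
print(max(abs(pi - 1.0 / N) for pi in p))  # distance to uniform distribution (tiny)
```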
As an example, consider a colony of bacteria which reproduce with rate R and die with
rate M. Let $p_n(t)$ be the probability of having n bacteria at time t. The master equation
reads, for n ≥ 1,
$$\frac{d}{dt}\, p_n = R\, (n-1)\, p_{n-1} + M\, (n+1)\, p_{n+1} - (M + R)\, n\, p_n\,. \tag{A.3}$$
The first term describes the increase in probability for n bacteria due to reproduction
among a group of (n − 1) bacteria; the second term the increase due to a death among a
group of (n + 1) bacteria; and the third term the decrease due to either a reproduction
(leading to more bacteria) or a death (leading to fewer bacteria). For n = 0 we have
$$\frac{d}{dt}\, p_0 = M\, p_1\,. \tag{A.4}$$
This corresponds to the transition amplitudes
$$T_{n(n+1)} = M\, (n+1)\,, \qquad T_{n(n-1)} = R\, (n-1)\,, \qquad T_{ij} = 0 \ \text{otherwise.} \tag{A.5}$$
The equilibrium condition (A.2) becomes
$$R\, (n-1)\, p^{eq}_{n-1} + M\, (n+1)\, p^{eq}_{n+1} = (R + M)\, n\, p^{eq}_n\,, \quad n \ge 1\,, \qquad \text{with } p_1^{eq} = 0\,. \tag{A.6}$$
It follows by induction that in this example the only possible equilibrium state is given
by
$$p_n^{eq} = \begin{cases} 1 & \text{if } n = 0 \\ 0 & \text{if } n \ge 1\,, \end{cases} \tag{A.7}$$
i.e. we have equilibrium if and only if all bacteria are dead.
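One can also see this dynamically: the following sketch integrates the master equation (A.3)-(A.4) on a truncated state space (hypothetical rates with M > R, so that relaxation is fast) and confirms that the extinction probability p₀(t) tends to 1.

```python
M, R = 1.0, 0.5            # death and reproduction rates (hypothetical; M > R)
nmax = 60                  # truncate the state space at nmax bacteria
p = [0.0] * (nmax + 1)
p[5] = 1.0                 # start with exactly 5 bacteria

dt = 0.001
for _ in range(30000):     # evolve to t = 30 by Euler steps
    dp = [0.0] * (nmax + 1)
    dp[0] = M * p[1]                                   # eq. (A.4)
    for n in range(1, nmax):                            # eq. (A.3)
        dp[n] = R * (n - 1) * p[n - 1] + M * (n + 1) * p[n + 1] - (M + R) * n * p[n]
    dp[nmax] = R * (nmax - 1) * p[nmax - 1] - (M + R) * nmax * p[nmax]
    p = [p[n] + dt * dp[n] for n in range(nmax + 1)]

print(p[0])                # extinction probability, very close to 1
```

The truncation at nmax only leaks a negligible amount of probability here, since the population is very unlikely to grow when M > R.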
The master equation can be written in matrix form as
$$\frac{dp_i(t)}{dt} = \sum_j X_{ij}\, p_j(t)\,, \tag{A.8}$$
where
$$X_{ij} = \begin{cases} T_{ij} & \text{if } i \neq j \\ -\sum_{k \neq i} T_{ki} & \text{if } i = j\,. \end{cases} \tag{A.9}$$
We immediately find that Xij ≥ 0 for all i ≠ j and Xii ≤ 0 for all i. We can obtain
Xii < 0 if we assume that for each i there is at least one state j with nonzero transition
amplitude Tij . We make this assumption from now on. The formal solution of (A.8) is
given by the following matrix exponential:
$$p(t) = e^{tX}\, p(0)\,. \tag{A.10}$$
In general X is not symmetric, so we cannot simply diagonalize it! Nevertheless, it turns
out that the master equation gives us a sufficient amount of information to understand
the key features of the eigenvalue distribution. If we define the evolution matrix A(t)
by $A(t) = e^{tX}$, then, since A(t) maps element-wise positive vectors p = (p1 , . . . , pN ) to
vectors with the same property, it easily follows that $A_{ij}(1) \ge 0$ for all i, j. Hence, by the
Perron-Frobenius theorem, the eigenvector v of A(1) whose eigenvalue λmax has the largest
real part must be element-wise positive, $v_i \ge 0$ for all i, and λmax must be real and positive.
This (up to a rescaling) unique vector v must also be an eigenvector of X , with real
eigenvalue log λmax = Emax . We next show that any eigenvalue E of X (possibly ∈ C)
has Re(E) ≤ 0 by arguing as follows: Let w be an eigenvector of X with eigenvalue E,
i.e. X w = Ew. Then
$$\sum_{j \neq i} X_{ij}\, w_j = (E - X_{ii})\, w_i\,, \tag{A.13}$$
and therefore
$$\sum_{j \neq i} X_{ij}\, |w_j| \ge |E - X_{ii}|\, |w_i|\,, \tag{A.14}$$
which follows from the triangle inequality and Xij ≥ 0 for i ≠ j. Taking the sum ∑i and
using (A.9) then yields ∑i (Xii + ∣E − Xii ∣) ∣wi ∣ ≤ 0 and therefore (Xii + ∣E − Xii ∣) ∣wi ∣ ≤ 0
for at least one i. Since Xii < 0, this is impossible unless Re(E) ≤ 0. Then it follows that
Emax ≤ 0 and then also λmax ≤ 1. We would now like to argue that Emax = 0, in fact.
Assume on the contrary that Emax < 0. Then every distribution would decay,
$p(t) = e^{tX}\, p(0) \to 0$ as t → ∞, which is impossible as the evolution preserves
$\sum_i p_i(t) = 1$. From this we conclude that Emax = 0, or Xv = 0, and thus
$$p_j^{eq} = \frac{v_j}{\sum_i v_i} \tag{A.15}$$
is an equilibrium distribution. This equilibrium distribution is unique (from the Perron-
Frobenius theorem). Since any other eigenvalue E of X must have Re(E) < 0, any
distribution {pi (t)} must approach this equilibrium state. We summarize our findings:
2. Any distribution {pi (t)} obeying the master equation must approach equilibrium
as $|p_j(t) - p_j^{eq}| = O(e^{-t/\tau_{relax}})$ for all states j, where the relaxation timescale τrelax
is set by the non-zero eigenvalue of X with the largest real part.
Suppose now that the transition amplitudes satisfy
$$T_{ij}\, e^{-\beta \epsilon_j} = T_{ji}\, e^{-\beta \epsilon_i}\,, \tag{A.16}$$
where $\epsilon_i$ is the energy of the state i. Equation (A.16) is called the detailed balance
condition. It is easy to see that it implies
$$p_i^{eq} = e^{-\beta \epsilon_i} / Z\,.$$
Thus, in this case, the unique equilibrium distribution is the canonical ensemble, which
was motivated already in chapter 4.
If the detailed balance condition is fulfilled, we may pass from Xij , which need not
be symmetric, to a symmetric (hence diagonalizable) matrix by a change of basis as
follows. If we set $q_i(t) = p_i(t)\, e^{\beta \epsilon_i / 2}$, we get
$$\frac{dq_i(t)}{dt} = \sum_{j=1}^N \tilde X_{ij}\, q_j(t)\,, \tag{A.17}$$
where
$$\tilde X_{ij} = e^{\frac{\beta \epsilon_i}{2}}\, X_{ij}\, e^{-\frac{\beta \epsilon_j}{2}}$$
is now symmetric. We can diagonalize it with real eigenvalues λn ≤ 0 and real eigenvectors
$w^{(n)}$, so that $\tilde X w^{(n)} = \lambda_n w^{(n)}$. The eigenvalue λ0 = 0 again corresponds to
equilibrium, with $w_i^{(0)} \propto e^{-\beta \epsilon_i / 2}$. Then we can write
$$p_i(t) = p_i^{eq} + e^{-\frac{\beta \epsilon_i}{2}} \sum_{n \ge 1} c_n\, e^{t \lambda_n}\, w_i^{(n)}\,, \tag{A.18}$$
where $c_n = q(0) \cdot w^{(n)}$ are the Fourier coefficients. We see again that $p_i(t)$ converges to
the equilibrium state exponentially, with relaxation time $-\frac{1}{\lambda_1} < \infty$, where λ1 < 0 is the
largest non-zero eigenvalue of $\tilde X$.
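The following sketch illustrates this for a three-state system with hypothetical energies: it builds rates obeying detailed balance (A.16), evolves the master equation, and checks that the distribution relaxes to the canonical ensemble e^(−βε_i)/Z.

```python
import math, random

random.seed(1)
beta = 1.0
eps = [0.0, 0.5, 1.3]               # energies of the three states (hypothetical)
N = len(eps)

# rates obeying detailed balance: T_ij = w_ij exp(-beta (eps_i - eps_j) / 2),
# with w_ij symmetric
w = [[0.0] * N for _ in range(N)]
for i in range(N):
    for j in range(i + 1, N):
        w[i][j] = w[j][i] = random.uniform(0.5, 1.5)
T = [[w[i][j] * math.exp(-beta * (eps[i] - eps[j]) / 2) for j in range(N)]
     for i in range(N)]

# check the detailed balance condition (A.16)
for i in range(N):
    for j in range(N):
        if i != j:
            assert abs(T[i][j] * math.exp(-beta * eps[j])
                       - T[j][i] * math.exp(-beta * eps[i])) < 1e-12

# evolve the master equation by Euler steps; the limit is the canonical ensemble
p = [1.0, 0.0, 0.0]
dt = 0.001
for _ in range(20000):
    dp = [sum(T[i][j] * p[j] - T[j][i] * p[i] for j in range(N) if j != i)
          for i in range(N)]
    p = [p[i] + dt * dp[i] for i in range(N)]

Z = sum(math.exp(-beta * e) for e in eps)
p_eq = [math.exp(-beta * e) / Z for e in eps]
print(max(abs(p[i] - p_eq[i]) for i in range(N)))   # small
```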
As a further example, consider a system of N spins σi = ±1. The system has $2^N$ possible
states C = (σ1 , . . . , σN ), and we let $p_C(t)$ be the probability that the system is in the
state C at time t. Furthermore, let τ0 be the time scale for one update of the system,
i.e. a spin flip occurs with probability $\frac{dt}{\tau_0}$ during the time interval [t, t + dt]. We assume
that all spin flips are equally likely in our model. This leads to a master equation (A.1)
of the form
$$\frac{dp_C(t)}{dt} = \frac{1}{\tau_0} \left\{ \frac{1}{N} \sum_{i=1}^N p_{C_i}(t) - p_C(t) \right\} = \sum_{C'} X_{C C'}\, p_{C'}(t)\,. \tag{A.20}$$
Here, the first term in the brackets {. . .} describes the increase in probability due to a
change Ci → C, where Ci differs from C by flipping the i-th spin. This change occurs with
probability $\frac{1}{N}$ per time τ0 . The second term in the brackets {. . .} describes the decrease
in probability due to the change C → Ci for any i. It can be checked from the definition
of X that
$$\sum_C X_{C C'} = 0 \quad \Rightarrow \quad \sum_C p_C(t) = 1 \quad \forall\, t\,. \tag{A.21}$$
The equilibrium distribution is the uniform one,
$$p_C^{eq} = \frac{1}{2^N} \qquad \forall\, C \in \{-1, +1\}^N\,. \tag{A.22}$$
Consider now
$$p_\pm(t) = \sum_{C:\, \sigma_1 = \pm 1} p_C(t) = \text{probability for finding the 1st spin up/down at time } t\,. \tag{A.23}$$
The master equation implies an evolution equation for p+ (and similarly p− ), which is
obtained by simply summing (A.20) over all C with σ1 = +1. This gives:
$$\frac{dp_+}{dt} = \frac{1}{\tau_0} \left\{ \frac{1}{N}\, (1 - p_+) - \frac{1}{N}\, p_+ \right\}\,, \tag{A.24}$$
with solution
$$p_+(t) = \frac{1}{2} + \left( p_+(0) - \frac{1}{2} \right) e^{-2t/(N \tau_0)}\,.$$
So for t → ∞, we have $p_+(t) \to \frac{1}{2}$ at an exponential rate; $\frac{1}{2}$ is the equilibrium
value of p+ . Since this holds for any chosen spin, we expect that the relaxation time
towards equilibrium is $\tau_{relax} \approx \frac{N}{2}\, \tau_0$.
A more precise analysis of the relaxation time involves finding the eigenvalues of the $2^N$-dimensional
matrix $X_{C C'}$: we think of the eigenvectors u0 , u1 , u2 , . . . with eigenvalues
λ0 = 0, λ1 , λ2 , . . . as functions u0 (C), u1 (C), . . . where C = (σ1 , . . . , σN ). Then the
eigenvalue equation is
$$\sum_{C'} X_{C C'}\, u_n(C') = \lambda_n\, u_n(C)\,, \tag{A.27}$$
and we have
$$u_0(C) \equiv u_0(\sigma_1, \dots, \sigma_N) = p_C^{eq} = \frac{1}{2^N} \quad \forall\, C\,. \tag{A.28}$$
Now we define the next N eigenvectors $u_1^j$, j = 1, . . . , N by
$$u_1^j(\sigma_1, \dots, \sigma_N) = \begin{cases} \alpha & \text{if } \sigma_j = +1 \\ \beta & \text{if } \sigma_j = -1\,. \end{cases} \tag{A.29}$$
Imposing the eigenvalue equation gives α = −β, and then $\lambda_1 = -\frac{2}{N}$. The eigenvectors are
orthogonal to each other. The next set of eigenvectors $u_2^{ij}$, 1 ≤ i < j ≤ N, is
$$u_2^{ij}(\sigma_1, \dots, \sigma_N) = \begin{cases} \alpha & \text{if } \sigma_i = 1,\ \sigma_j = 1 \\ -\alpha & \text{if } \sigma_i = 1,\ \sigma_j = -1 \\ -\alpha & \text{if } \sigma_i = -1,\ \sigma_j = 1 \\ \alpha & \text{if } \sigma_i = -1,\ \sigma_j = -1\,. \end{cases} \tag{A.30}$$
The vectors $u_2^{ij}$ are again found to be orthogonal, with the eigenvalue $\lambda_2 = -\frac{4}{N}$. The
subsequent vectors are constructed in the same fashion, and we find $\lambda_k = -\frac{2k}{N}$ for the
k-th set. The general solution of the master equation is given by (A.10):
$$p_C(t) = p_C^{eq} + \sum_{k=1}^N \sum_{i_1 \dots i_k} a_{i_1 \dots i_k}(t)\, u_k^{i_1 \dots i_k}(C)\,.$$
we get
$$a_{i_1 \dots i_k}(t) = a_{i_1 \dots i_k}(0)\, e^{-2kt/(N \tau_0)}\,.$$
This gives the relaxation time for a general distribution. We see that the relaxation time
is given by the exponential with the smallest decay (the term with k = 1 in the sum),
leading to the relaxation time τrelax = N τ0 /2 already guessed before. This is exponentially
small compared to the ergodic time! For N = 1 mol we have, approximately
$$\frac{\tau_{ergodic}}{\tau_{relax}} = O\left( e^{10^{23}} \right). \tag{A.32}$$
Suppose now that we want to compute the thermal expectation value of an observable F,
$$\langle F \rangle = \sum_C F(C)\, \frac{e^{-\beta E(C)}}{Z(\beta)}\,. \tag{A.33}$$
The idea is to choose transition amplitudes satisfying the detailed balance condition
$$T_{C, C'}\, e^{-\beta E(C')} = T_{C', C}\, e^{-\beta E(C)}\,, \tag{A.34}$$
as well as $\sum_{C'} T_{C', C} = 1$ for all C. The discretized version of the master equation then
becomes
$$p_{C'}(t + 1) = \sum_C T_{C', C}\, p_C(t)\,.$$
In the simplest case, the sum is over all configurations C differing from C ′ by flipping
precisely one spin. If Ci′ is the configuration obtained from some configuration C ′ by
flipping spin i, we therefore assume that TC ′ ,C is non-zero only if C = Ci′ for some i.
One expects, based on the above arguments, that this process will converge to the
Metropolis Algorithm
(1) Start from some (arbitrary) initial configuration C.
(2) Choose randomly a spin i and determine the change in energy $\delta_i E = E(C_i) - E(C)$
for the new configuration Ci obtained by flipping the spin i.
(3) Choose a uniformly distributed random number u ∈ [0, 1]. If $u < e^{-\beta \delta_i E}$, change
σi → −σi , otherwise leave σi unchanged.
(4) Rename Ci → C if the spin was flipped, and repeat from (2).
(4) Rename Ci → C.
Running the algorithm m times, going through approximately N iterations each time,
gives the desired sample C1 , . . . , Cm , distributed approximately according to $e^{-\beta E(C)}/Z$.
The expectation value ⟨F⟩ is then computed as the average of F(C) over the sample
C1 , . . . , Cm . For example, in the case of the Ising model (in one dimension) describing a
chain of N spins with σi ∈ {±1}, the energy is given by:
$$E(C) = -J \sum_{0 < i < N} \sigma_i\, \sigma_{i+1}\,,$$
where J is the strength of the interaction between the i-th and the (i + 1)-th spin in the
chain. The change in energy if we flip one spin is very easy to calculate in this example
because the interaction is local, as it is in most models.
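A minimal implementation of the Metropolis algorithm for this one-dimensional Ising chain might look as follows. It assumes free boundary conditions (for which the exact result ⟨E⟩ = −J(N − 1) tanh(βJ) is known) so that the Monte Carlo average can be checked; the parameter values are hypothetical.

```python
import math, random

random.seed(42)
J, beta, N = 1.0, 0.5, 20
sigma = [random.choice([-1, 1]) for _ in range(N)]

def energy():
    """E(C) = -J sum_i sigma_i sigma_{i+1} for the open chain."""
    return -J * sum(sigma[i] * sigma[i + 1] for i in range(N - 1))

def sweep():
    """One Monte Carlo sweep: N attempted single-spin flips."""
    for _ in range(N):
        i = random.randrange(N)
        # local energy change delta_i E = E(C_i) - E(C); only the bonds at i matter
        dE = 0.0
        if i > 0:
            dE += 2 * J * sigma[i] * sigma[i - 1]
        if i < N - 1:
            dE += 2 * J * sigma[i] * sigma[i + 1]
        # Metropolis acceptance: always accept if dE <= 0, else with prob e^(-beta dE)
        if dE <= 0 or random.random() < math.exp(-beta * dE):
            sigma[i] = -sigma[i]

for _ in range(500):          # thermalization
    sweep()
samples = []
for _ in range(4000):         # measurement sweeps
    sweep()
    samples.append(energy())

E_mc = sum(samples) / len(samples)
E_exact = -J * (N - 1) * math.tanh(beta * J)
print(E_mc, E_exact)          # Monte Carlo estimate vs. exact open-chain result
```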
Acknowledgements
These lecture notes are based on lectures given by Prof. Dr. Stefan Hollands at the
University of Leipzig. The typesetting was done by Stefanie Riedel and Michael Gransee.