Statistical Thermodynamics
Gunnar Jeschke
Copyright 2015 Gunnar Jeschke
Title image: Billard by No-w-ay in collaboration with H. Caps (own work). Licensed under the GFDL via Wikimedia Commons:
https://commons.wikimedia.org/wiki/File:Billard.JPG
Chapter 2 Word cloud: http://de.123rf.com/profile_radiantskies
Chapter 3 Dice image: http://de.123rf.com/profile_whitetag
Chapter 4 Matryoshka image: http://de.123rf.com/profile_mikewaters
Chapter 5 Word cloud: http://de.123rf.com/profile_radiantskies
Chapter 7 Dumbbell image: http://de.123rf.com/profile_filipobr
Chapter 8 Spaghetti image: http://de.123rf.com/profile_winterbee
http://www.epr.ethz.ch
Licensed under the Creative Commons Attribution-NonCommercial 3.0 Unported License (the
License). You may not use this file except in compliance with the License. You may obtain a
copy of the License at http://creativecommons.org/licenses/by-nc/3.0.
Design and layout of the lecture notes are based on the Legrand Orange Book available at
http://latextemplates.com/template/the-legrand-orange-book.
1 Introduction
1.1 General Remarks
1.2 Suggested Reading
1.3 Acknowledgment
3 Probability Theory
3.1 Discrete Probability Theory
3.1.1 Discrete Random Variables
3.1.2 Multiple Discrete Random Variables
3.1.3 Functions of Discrete Random Variables
3.1.4 Discrete Probability Distributions
3.1.5 Probability Distribution of a Sum of Random Numbers
3.1.6 Binomial Distribution
3.1.7 Stirling's Formula
3.2 Continuous Probability Theory
3.2.1 Probability Density
3.2.2 Selective Integration of Probability Densities
3.2.3 Sum of Two Continuous Random Numbers
4 Classical Ensembles
4.1 Statistical Ensembles
4.1.1 Concept of an Ensemble
4.1.2 Ergodicity
5 Entropy
5.1 Swendsen's Postulates of Thermodynamics
5.1.1 Cautionary Remarks on Entropy
5.1.2 Swendsen's Postulates
5.1.3 Entropy in Phenomenological Thermodynamics
5.1.4 Boltzmann's Entropy Definition
5.3 Irreversibility
5.3.1 Historical Discussion
5.3.2 Irreversibility as an Approximation
6 Quantum Ensembles
6.1 Quantum Canonical Ensemble
6.1.1 Density Matrix
6.1.2 Quantum Partition Function
8 Macromolecules
8.1 Thermodynamics of Mixing
8.1.1 Entropy of Binary Mixing
8.1.2 Energy of Binary Mixing
Bibliography
Books
Articles
Web Pages
Index
1 Introduction
Today, Mr. K. complained: "Scores of people claim in public that they can type sizeable books
entirely on their own, and this is universally accepted. The Chinese philosopher Chuan-Tzu,
when already at his prime age, brushed a tome of one hundred thousand words with nine out of
ten being citations. Such books cannot be written anymore, since wit is missing. Hence, thoughts
are produced only by the tools of a single man, whereas he feels lazy if their number is low.
Indeed, no thought can be reused and no expression be cited. How little they all need for their
doings! A pen and some paper is all they have to show! And without any help, with only the puny
material that a single man can carry on his arms, they set up their huts! They don't know of
larger buildings than a loner can raise."
required connection. Chapter 2 will discuss this basic idea in some more detail and will present a
set of postulates due to Oliver Penrose [Pen70]. The discussion of these postulates clarifies what
the remaining mathematical problem is and how we avoid it in applications.
In this course we do not assume that students are already familiar with probability theory;
rather, we will introduce its most important concepts in Chapter 3. We do assume that the concepts
of phenomenological thermodynamics are known, although we shall briefly explain them on
first use in these lecture notes. The most important new concept in this course is that of an
ensemble description, which will be introduced in Chapter 4 first only for classical particles.
This will set the stage for discussing the concepts of irreversibility and entropy in Chapter 5. We
will complete the foundations part with a discussion of quantum ensembles in Chapter 6. This
Chapter will also make the transition to applications, by treating first the harmonic oscillator and
second the Einstein model of a crystal with the apparatus that we command at that point.
We shall then illustrate the relation to phenomenological thermodynamics by discussing
the partition functions of gases and by computing thermodynamical state functions from these
partition functions in Chapter 7. The final Chapter 8 will briefly discuss the consequences of
statistical thermodynamics for macromolecular systems and introduce the concepts of lattice
models, random walks, and entropic elasticity.
The time available for this course does not permit treating all aspects of statistical
thermodynamics and statistical mechanics that are important in physical chemistry, chemical physics,
polymer physics, and biophysics, let alone in solid-state physics. The most important omissions
are probably kinetic aspects of chemical reactions, which are treated in detail in a lecture course
on Advanced Kinetics, and the topic of phase transitions, including the famous Ising chain model.
We believe that the foundations laid in the present course will allow students to understand these
topics from reading in the textbooks listed in the following Section.
For many of the central concepts I have looked up (English) Wikipedia articles and have
found that these articles are, on average, of rather good quality. They do differ quite strongly
from each other in style and notation. When using only Wikipedia or other internet resources it
is difficult to fit the pieces of information together. If, on the other hand, you already do have a
basic level of understanding, but some difficulties with a particular concept, such sources may
provide just the missing piece of information. The NIST guide for computing thermodynamical
state functions from the results of ab initio computations is a particularly good example of a
useful internet resource [Iri98].
1.3 Acknowledgment
I am grateful to M. Schäfer, U. Hollenstein, and F. Merkt for making their lecture notes for this
course available and to Takuya Segawa for thorough proofreading of the first manuscript of these
notes. All remaining errors are my own sole responsibility.
2 Basics of Statistical Mechanics
Concept 2.1.1 Newtonian equations of motion. With particle mass m, Cartesian coordinates
q_i (i = 1, 2, \ldots, 3N) and velocity coordinates \dot{q}_i, a system of N identical classical point
particles evolves by

m \frac{d^2 q_i}{dt^2} = -\frac{\partial}{\partial q_i} V(q_1, \ldots, q_{3N}) ,  (2.1)
Notation 2.1. The dynamical state or microstate of the system at any instant is defined by the
6N Cartesian and velocity coordinates, which span the dynamical space of the system. The
curve of the system in dynamical space is called a trajectory.
The concept extends easily to atoms with different masses mi . If we could, at any instant,
precisely measure all 6N dynamical coordinates, i.e., spatial coordinates and velocities, we could
precisely predict the future trajectory. The system as described by the Newtonian equations of
motions behaves deterministically.
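A minimal numerical sketch (added here for illustration; parameters and time step are arbitrary assumptions) shows this determinism in practice: integrating Eq. (2.1) for a single particle in a harmonic potential V(q) = kq^2/2 with fixed initial conditions reproduces exactly the same trajectory on every run.

% velocity-Verlet integration of m*q'' = -dV/dq for V(q) = 0.5*k*q^2 (assumed units)
m = 1; k = 1; dt = 0.01; nsteps = 1000;
q = 1; v = 0;                        % initial position and velocity
traj = zeros(1, nsteps);
for step = 1:nsteps
    a = -k*q/m;                      % acceleration from the force -dV/dq
    q = q + v*dt + 0.5*a*dt^2;       % position update
    anew = -k*q/m;
    v = v + 0.5*(a + anew)*dt;       % velocity update
    traj(step) = q;
end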
For any system that humans can see and handle directly, i.e., without complicated technical
devices, the number N of particles is too large (at least of the order of 10^18) for such complete
measurements to be possible. Furthermore, for such large systems even tiny measurement errors
would make the trajectory prediction useless after a rather short time. In fact, atoms are quantum
objects and the measurements are subject to the Heisenberg uncertainty principle, and even the
small uncertainty introduced by that would make a deterministic description futile.
We can only hope for a theory that describes what we can observe. The number of
observational states or macrostates that can be distinguished by the observer is much smaller than
the number of dynamical states. Two classical systems in the same dynamical state are necessarily
also in the same observational state, but the converse is not generally true. Furthermore, the
observational state also evolves with time, but we have no equations of motion for this state (but
see Section 2.2.2). In fact we cannot have deterministic equations of motion for the observational
state of an individual system, precisely because the same observational state may correspond to
different dynamical states that will follow different trajectories.
Still we can make predictions, only these predictions are necessarily statistical in nature. If
we consider a large ensemble of identical systems in the same observational state we can even
make fairly precise predictions about the outcome. Penrose [Pen70] gives the example of a
woman at a time when ultrasound diagnosis can detect pregnancy, but not the sex of the foetus. The
observational state is pregnancy; the two possible dynamical states are being on the path to a boy or to a girl.
We have no idea what will happen in the individual case, but if the same diagnosis is performed
on a million women, we know that about 51-52% will give birth to a boy.
How then can we derive stable predictions for an ensemble of systems of molecules? We
need to consider probabilities of the outcome and these probabilities will become exact numbers
in the limit where the number N of particles (or molecules) tends to infinity. The theory required
for computing such probabilities will be treated in Chapter 3.
R Our current usage of the term ensemble is loose. We will devote the whole Chapter 4 to
clarifying what types of ensembles we use in computations and why.
P(A|T) = \lim_{N\to\infty} \frac{n(A, N, T)}{N} ,  (2.2)
where n(A, N , T ) is the number of times the outcome A is observed in the first N trials.
A trial T conforming to this definition is statistically regular, i.e., the limit exists and is the
same for all infinite series of the same trial. If the physical probability is assumed to be a stable
property of the system under study, it can be measured with some experimental error. This
experimental error has two contributions: (i) the actual error of the measurement of the quantity
A and (ii) the deviation of the experimental frequency of observing A from the limit defined in
Eq. (2.2). Contribution (ii) arises from the experimental number of trials N not being infinite.
We need some criterion that tells us whether T is statistically regular. For this we split the
trial into a preparation period, an evolution period, and the observation itself. The evolution
period is a waiting time during which the system is under controlled conditions. Together with
the preparation period it needs to fulfill the Markovian postulate.
Concept 2.1.2 Markovian postulate. A trial T that invariably ends up in the observational
state O of the system after the preparation stage is called statistically regular. The start of the
evolution period is assigned a time t = 0.
Note that the system can be in different observational states at the time of observation;
otherwise the postulate would correspond to a trivial experiment. The Markovian postulate is
related to the concept of a Markovian chain of events. In such a chain the outcome of the next
event depends only on the current state of the system, but not on states that were encountered
earlier in the chain. Processes that lead to a Markovian chain of events can thus be considered as
memoryless.
The Newtonian equations of motion (2.1) are very convenient for atomistic molecular dynamics
(MD) computations. Trajectories encountered during such MD simulations can be analyzed
statistically in terms of thermodynamic quantities, such as free energy. However, for
analyzing evolution of the system in terms of spectroscopic properties, the Newtonian description
is very inconvenient. Since spectroscopic measurements can provide the most stringent tests of
theory, we shall use the Hamiltonian formulation of mechanics in the following. This formulation
is particularly convenient for molecules that also have rotational degrees of freedom. For that,
we replace the velocity coordinates by momentum coordinates p_j = m_j \dot{q}_j, where index j runs
over all atoms. Furthermore, we assume M identical molecules, with each of them having f
degrees of freedom, so that the total number of degrees of freedom is F = f M. Such a system
can be described by 2F differential equations
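The 2F differential equations referred to here are the Hamiltonian equations of motion; their explicit form and equation numbers are not reproduced in this excerpt, but they read, in standard notation,

\dot{q}_k = \frac{\partial H}{\partial p_k} , \qquad \dot{p}_k = -\frac{\partial H}{\partial q_k} \qquad (k = 1, \ldots, F) ,

where H(p, q) is the Hamiltonian of the system.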
Definition 2.2.1 Phase space and state space. Phase space is the space where microstates
of a system reside. Sometimes the term is used only for problems that can be described
in spatial and momentum coordinates, sometimes for all problems where some type of a
Hamiltonian equation of motion applies. Sometimes the term state space is used for the
space of microstates in problems that cannot be described by (only) spatial and momentum
coordinates.
If the molecule is just a single atom, we have only f = 3 translational degrees of freedom
and the Hamiltonian is given by
H(p_i, q_i) = \frac{1}{2m} \left( p_{x,i}^2 + p_{y,i}^2 + p_{z,i}^2 \right) ,  (2.5)
describing translation. For molecules with n atoms, three of the f = 3n degrees of freedom are
translational, two or three are rotational for linear and non-linear molecules, respectively, and the
remaining 3n - 5 or 3n - 6 degrees of freedom are vibrational.
\frac{\partial \rho}{\partial t} = \{ H, \rho \} .  (2.8)
For the probability density along a phase space trajectory, i.e., along a trajectory that is taken
by microstates, we find
\frac{d}{dt} \rho\left( q(t), p(t), t \right) = 0 .  (2.9)
If we consider a uniformly distributed number dN of ensemble members in a volume element
d\Gamma_0 in phase space at time t = 0 and ask about the volume element d\Gamma in which these ensemble
members are distributed at a later time, we find

d\Gamma = d\Gamma_0 .  (2.10)
For quantum systems, the probability density in phase space is replaced by the density operator \hat{\rho} and the Liouville equation by the Liouville-von Neumann equation

\frac{\partial \hat{\rho}}{\partial t} = -\frac{i}{\hbar} \left[ \hat{H}, \hat{\rho} \right] .  (2.11)
In quantum mechanics, observables are represented by operators \hat{A}. The expectation value of
an observable can be computed from the density operator that represents the distribution of the
ensemble in phase space,

\langle \hat{A} \rangle = \mathrm{Trace}\left\{ \hat{\rho} \hat{A} \right\} .  (2.12)
We note that the Heisenberg uncertainty relation does not introduce an additional complication
in statistical mechanics. Determinism had been lost before and the statistical character of the
measurement on an individual system is unproblematic, as we seek only statistical predictions
for a large ensemble. In the limit of an infinite ensemble, N \to \infty, there is no uncertainty
and the expectation values of incompatible observables are well defined and can be measured
simultaneously. Such an infinitely large system is not perturbed by the act of observing it. The
only difference between the description of classical and quantum systems arises from their
statistical behavior on permutation of the coordinates of two particles, see Section 6.2.
Based on the Penrose postulates it can be shown [Pen70] that the definition of Boltzmann entropy
(Chapter 5) ensures both properties, but that statistical expressions for entropy ensure only the
non-decrease property, not in general the additivity property. This appears to leave us in an
inconvenient situation. However, it can also be shown that for large systems, in the sense that the
number of macrostates is much smaller than the number of microstates, the term that quantifies
non-additivity is negligibly small compared to the total entropy [Pen70]. The problem is thus
rather a mathematical beauty spot than a serious difficulty in application of the theory.
3 Probability Theory
A = {aj } , (3.1)
where in our example index j runs from 1 to 6, whereas in general it runs from 1 to the number
NA of possible events. Each of the events is assigned a probability 0 P (aj ) 1. Impossible
events (for a given preparation) have probability zero and a certain event has probability 1.
Since one and only one of the events must happen in each trial, the probabilities are normalized,
\sum_{j=1}^{N_A} P(a_j) = 1. A simplified model of our example trial is the rolling of a die. If the die is fair,
we have the special situation of a uniform probability distribution, i.e., P (aj ) = 1/6 for all j.
Concept 3.1.1 Random variable. A set of random events with their associated probabilities is
called a random variable. If the number of random events is countable, the random variable is
called discrete. In a computer, numbers can be assigned to the events, which makes the random
variable a random number. A series of trials can then be simulated by generating a series of N
pseudo-random numbers that determine the events observed in the N trials. Such simulations are
called Monte Carlo simulations. Pseudo-random numbers obtained from a computer function
need to be adjusted so that they reproduce the given or assumed probabilities of the events.
Problem 3.1 Using the Matlab function rand, which provides uniformly distributed random
numbers in the open interval (0, 1), write a program that simulates throwing a die with six
faces. The outer function should have trial number N as an input and a vector of the numbers of
encountered ones, twos, ... and sixes as an output. It should be based on an inner function that
simulates a single throw of the die. Test the program by determining the difference from the
expectation P (aj ) = 1/6 for ever larger numbers of trials.
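A minimal sketch of such a simulation (one possible implementation with freely chosen function names, not the official solution of Problem 3.1) could look as follows in Matlab:

function counts = throw_dice(N)
% simulate N throws of a fair six-sided die and count the outcomes
counts = zeros(1, 6);
for k = 1:N
    face = single_throw();
    counts(face) = counts(face) + 1;
end
end

function face = single_throw()
% map a uniform pseudo-random number from (0,1) to one of six equally probable faces
face = ceil(6*rand);
end

Calling counts = throw_dice(1e5); and inspecting counts/1e5 - 1/6 shows how the deviation from P(a_j) = 1/6 shrinks with growing trial number N.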
Note that we have introduced a brief notation that suppresses indices j and k. This notation is
often encountered because of its convenience in writing.
If we know the probabilities P(a, b) for all N_A \cdot N_B possible combinations of the two events,
we can compute the probability of a single event, for instance a,

P_A(a) = \sum_b P(a, b) ,  (3.3)
R The unfortunate term marginal does not imply a small probability. Historically, these
probabilities were calculated in the margins of probability tables [Swe12].
Another quantity of interest is the conditional probability P (a|b) of an event a, provided that
b has happened. For instance, if we draw two cards from a full deck, the probability of the second
card being a Queen is conditional on the first card having been a Queen. With the definition for
the conditional probability we have
Theorem 3.1.2 Bayes' theorem. If the marginal probability of event b is not zero, the
conditional probability of event a given b is

P(a|b) = \frac{P(b|a)\, P_A(a)}{P_B(b)} .  (3.6)
Bayes' theorem is the basis of Bayesian inference, where the probability of proposition a
is sought given prior knowledge (short: the prior) b. Often Bayesian probability is interpreted
subjectively, i.e., different persons, because they have different prior knowledge b, will come to
different assessments for the probability of proposition a. This interpretation is incompatible
with theoretical physics, where, quite successfully, an objective reality is assumed. Bayesian
probability theory can also be applied with an objective interpretation in mind and is nowadays
used, among other applications, in structural modeling of biomacromolecules to assess agreement of a model
(the proposition) with experimental data (the prior).
In experimental physics, biophysics, and physical chemistry, Bayes' theorem can be used
to assign experimentally informed probabilities to different models for reality. For example,
assume that a theoretical modeling approach, for instance an MD simulation, has provided a
assume that a theoretical modeling approach, for instance an MD simulation, has provided a
set of conformations A = {aj } of a protein molecule and associated probabilities PA (aj ). The
probabilities are related, via the Boltzmann distribution, to the free energies of the conformations
(this point is discussed later in the lecture course). We further assume that we have a measurement
B with output bk and we know the marginal probability PB (b) of encountering this output for a
random set of conformations of the protein molecule. Then we need only a physical model that
provides the conditional probabilities P (bk |aj ) of measuring bk given the conformations aj and
can compute the probability P(a_j|b_k) that the true conformation is a_j, given the result of our
measurement, via Bayes' theorem, Eq. (3.6). This procedure can be generalized to multiple
measurements. The required P (bk |aj ) depend on measurement errors. The approach allows for
combining possibly conflicting modeling and experimental results to arrive at a best estimate
for the distribution of conformations.
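The following minimal sketch (with freely invented numbers, not data from the text) illustrates this Bayesian update for three hypothetical conformations and a single measurement output:

P_A         = [0.5 0.3 0.2];            % P_A(a_j) from modeling (assumed values)
P_b_given_a = [0.1 0.4 0.8];            % P(b_k|a_j) from a physical model (assumed values)
P_B         = sum(P_b_given_a .* P_A);  % marginal probability of the output b_k
P_a_given_b = P_b_given_a .* P_A / P_B  % Bayes' theorem, Eq. (3.6)

The conformation that agrees best with the measurement gains probability weight relative to its prior value.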
The events associated with two random variables can occur completely independently of each
other. This is the case for throwing two dice: the number shown on the black die does not depend
on the number shown on the red die. Hence, the probability to observe a 2 on the black and a
3 on the red die is (1/6) \cdot (1/6) = 1/36. In general, joint probabilities of independent events
factorize into the individual (or marginal) probabilities, which leads to huge simplifications in
computations. In the example of two coupled spins SA = 5/2 and SB = 5/2 the two random
variables mS,A and mS,B may or may not be independent. This is decided by the strength of the
coupling, the preparation of trial T , and the evolution time t before observation.
Concept 3.1.3 Independent variables. If two random variables are independent, the joint
probability of two associated events is the product of the two marginal probabilities,

P(a, b) = P_A(a) \cdot P_B(b) .

As a consequence, the conditional probability P(a|b) equals the marginal probability of a (and
vice versa),

P(a|b) = P_A(a) .
For a set of more than two random variables two degrees of independence can be established,
a weak type of pairwise independence and a strong type of mutual independence. The set
is mutually independent if the marginal probability distribution in any subset, i.e. the set of
marginal probabilities for all event combinations in this subset, is given by the product of the
corresponding marginal distributions for the individual events.1 This corresponds to complete
independence. Weaker pairwise independence implies that the marginal distributions for any pair
of random variables are given by the product of the two corresponding distributions. Note that
even weaker independence can exist within the set, but not throughout the set. Some, but not all
pairs or subsets of random variables can exhibit independence.
Another important concept for multiple random variables is whether or not they are distin-
guishable. In the example above we used a black and a red die to specify our events. If both
dice were black, the event combinations (a_2, b_3) and (a_3, b_2) would be indistinguishable
and the corresponding composite event of observing a 2 and a 3 would have a probability of
1/18, i.e. the product of the probability 1/36 of the basic composite event with its multiplicity 2.
In general, if n random variables are indistinguishable, the multiplicity equals the number of
permutations of the n variables, which is n! = 1 \cdot 2 \cdots (n-1) \cdot n.
^1 As the distributions are vectors and all combinations have to be considered, an outer product must be taken.
where G(a, b) is an arbitrary function of a and b and the Kronecker delta \delta_{g,G(a,b)} assumes
the value one if g = G(a, b) and zero otherwise. In our example, g = G(a, b) = a + b will
assume the value of 5 for the event combinations (1, 4), (2, 3), (3, 2), (4, 1) and no others. Hence,
PG (5) = 4/36 = 1/9. There is only a single combination for g = 2, hence PG (2) = 1/36, and
there are 6 combinations for g = 7, hence PG (7) = 1/6. Although the probability distributions
for the individual random numbers A and B are uniform, the one for G is not. It peaks at the
value of g = 7 that has the most realizations. Such peaking of probability distributions that
depend on multiple random variables occurs very frequently in statistical mechanics. The peaks
tend to become sharper as the number of random variables that contribute to the sum increases.
If this number N tends to infinity, the distribution of the sum g is so sharp that the distribution
width (to be specified below) is smaller than the error in the measurement of the mean value
g/N (see Section 3.1.5). This effect is the very essence of statistical thermodynamics: Although
quantities for a single molecule may be broadly distributed and unpredictable, the mean value for
a large number of molecules, let's say 10^18 of them, is very well defined and perfectly predictable.
In a numerical computer program, Eq. (3.9) for only two random variables can be implemented
very easily by a loop over all possible values of g with inner loops over all possible values of a
and b. Inside the innermost loop, G(a, b) is computed and compared to loop index g to add or
not add P(a, b) to the bin corresponding to value g. Note, however, that such an approach does
not carry over to large numbers of random variables, as the number of nested loops increases with
the number of random variables and computation time thus increases exponentially. Analytical
computations are simplified by the fact that \delta_{g,G(a,b)} usually deviates from zero only within
certain ranges of the summation indices j (for a) and k (for b). The trick is then to find the proper
combinations of index ranges.
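A minimal sketch of this loop structure for the sum of the numbers shown by two fair dice (one possible implementation, anticipating the first part of Problem 3.2 below):

P  = ones(6, 6)/36;           % joint probabilities P(a,b) of the two dice
PG = zeros(1, 12);            % probability distribution of the sum g = a + b
for g = 2:12
    for a = 1:6
        for b = 1:6
            if a + b == g                  % Kronecker delta condition
                PG(g) = PG(g) + P(a, b);   % add P(a,b) to the bin of value g
            end
        end
    end
end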
Problem 3.2 Compute the probability distribution for the sum g of the numbers shown by two
dice in two ways. First, write a computer program using the approach sketched above. Second,
compute the probability distribution analytically by making use of the uniform distribution for
the individual events (P(a, b) = 1/36 for all a, b). For this, consider index ranges that lead to a
given value of the sum g.
Concept 3.1.4 Mean value and standard deviation. For any function F(A) of a random
variable A, the mean value is given by

\langle F \rangle = \sum_a F(a) P_A(a) .  (3.10)

The standard deviation, which characterizes the width of the distribution of the function values
f(a), is given by

\sigma = \sqrt{ \sum_a \left( F(a) - \langle F \rangle \right)^2 P_A(a) } .  (3.11)

The mean value is the first moment of the distribution, with the n-th moment being defined by

\langle F^n \rangle = \sum_a F^n(a) P_A(a) .  (3.12)
\sigma^2 = \langle F^2 \rangle - \langle F \rangle^2 .  (3.14)
Assume that we know the mean values for functions F(A) and G(B) of two random variables
as well as the mean value \langle FG \rangle of their product, which we can compute if the joint probability
function P(a, b) is known. We can then compute a correlation function

R_{FG} = \langle FG \rangle - \langle F \rangle \langle G \rangle ,  (3.15)

which takes the value of zero if F and G are independent random numbers.
Problem 3.3 Compute the probability distribution for the normalized sum g/M of the numbers
obtained on throwing M dice in a single trial. Start with M = 1 and proceed via M =
10, 100, 1000 to M = 10000. Find out how many Monte Carlo trials N you need to guess the
converged distribution. What is the mean value \langle g/M \rangle? What is the standard deviation \sigma_g? How
do they depend on N?
S = \sum_{j=1}^{N} F_j ,  (3.16)

\langle S \rangle = \sum_{j=1}^{N} \langle F_j \rangle .  (3.17)
If motion of the individual molecules is uncorrelated, the individual random numbers Fj are
independent. It can then be shown that the variances add [Swe12],
\sigma_S^2 = \sum_{j=1}^{N} \sigma_j^2  (3.18)
For identical molecules, all random numbers have the same mean \langle F \rangle and variance \sigma_F^2 and
we find

\langle S \rangle = N \langle F \rangle  (3.19)
\sigma_S^2 = N \sigma_F^2  (3.20)
\sigma_S = \sqrt{N}\, \sigma_F .  (3.21)
This result relates to the concept of peaking of probability distributions for a large number of
molecules that was introduced above on the example of the probability distribution for the sum of
the numbers shown by two dice. The width of the distribution normalized to its mean value,

\frac{\sigma_S}{\langle S \rangle} = \frac{1}{\sqrt{N}} \cdot \frac{\sigma_F}{\langle F \rangle} ,  (3.22)

scales with the inverse square root of N. For 10^18 molecules, this relative width of the distribution
is one billion times smaller than for a single molecule. Assume that for a certain physical quantity
of a single molecule the standard deviation is as large as the mean value. Then no useful prediction
can be made for the individual molecule, but for a macroscopic sample the same quantity can be
predicted with an accuracy better than the precision that can be expected in a measurement.
Hence, we need to divide the total number of permutations N! by the numbers of permutations
in each subset, n! and (N - n)! for the first and second subset, respectively. The multiplicity that
we need is the number of combinations of N elements taken n at a time, which is thus given by
the binomial coefficient,

\binom{N}{n} = \frac{N!}{n!\,(N-n)!} ,  (3.23)

providing the probability distribution

P_S(n) = \binom{N}{n} P^n (1 - P)^{N-n} .  (3.24)
Figure 3.1: Gaussian approximation of the binomial distribution. (A) Gaussian approximation (red
dashed line) and binomial distribution (black solid line) for P = 0.37 and N = 1000. (B) Error of the
Gaussian approximation relative to the maximum value of the binomial distribution.
Concept 3.1.5 Central limit theorem. Suppose that a large number N of observations has
been made, with each observation corresponding to a random number that is independent
from the random numbers of the other observations. According to the central limit theorem,
the mean value \langle S \rangle/N of the sum of all these random numbers is approximately normally
distributed, regardless of the probability distribution of the individual random numbers, as
long as the probability distributions of all individual random numbers are identical.^a The
central limit theorem applies if each individual random variable has a well-defined mean value
(expectation value) and a well-defined variance. These conditions are fulfilled for statistically
regular trials T.

^a If the individual random numbers are not identically distributed, the theorem still applies if Lyapunov's
condition or Lindeberg's condition is fulfilled. See the very useful and detailed Wikipedia article on the central
limit theorem for more information and proofs.
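As a numerical illustration of the theorem (a sketch with freely chosen sample sizes), the mean of N independent uniform random numbers is already close to normally distributed for moderate N:

N = 50; trials = 1e5;                   % assumed sample sizes
m = mean(rand(N, trials));              % each column: mean of N uniform random numbers
histogram(m, 100, 'Normalization', 'pdf');
hold on;
x = linspace(min(m), max(m), 200);
mu = mean(m); s = std(m);
plot(x, exp(-(x - mu).^2/(2*s^2))/(s*sqrt(2*pi)));   % Gaussian with matching moments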
Concept 3.1.6 Stirling's formula. For large numbers N the natural logarithm of the factorial
can be approximated by Stirling's formula

\ln N! \approx N \ln N - N + 1 ,  (3.28)

corresponding to

N! \approx N^N \exp(1 - N)  (3.29)

for the factorial itself. For large numbers N it is further possible to neglect the 1 in the sum and
approximate \ln N! \approx N \ln N - N.
The absolute error of this approximation for N ! looks gross and increases fast with increasing
N , but because N ! grows much faster, the relative error becomes insignificant already at moderate
N . For ln N ! it is closely approximated by 0.55/N . In fact, an even better approximation has
been found by Gosper [Swe12],
\ln N! \approx N \ln N - N + \frac{1}{2} \ln\left[ \left( 2N + \frac{1}{3} \right) \pi \right] .  (3.30)
Gosper's approximation is useful for considering moderately sized systems, but note that several
of our other assumptions and approximations become questionable for such systems and much
care needs to be taken in interpreting results. For the macroscopic systems, in which we are
mainly interested here, Stirlings formula is often sufficiently precise and Gospers is not needed.
Slightly better than Stirling's original formula, but still a simple approximation, is

N! \approx \sqrt{2\pi N} \left( \frac{N}{e} \right)^N .  (3.31)
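A quick numerical comparison of these approximations (a sketch; gammaln(N+1) = ln N! is used to avoid overflow) shows how fast the relative error of the simple formula decays and how much better Gosper's approximation, Eq. (3.30), performs:

N        = [10 100 1000 10000];
exact    = gammaln(N + 1);                           % ln N!
stirling = N.*log(N) - N;                            % simplified Stirling formula
gosper   = N.*log(N) - N + 0.5*log((2*N + 1/3)*pi);  % Gosper, Eq. (3.30)
relerr   = [(stirling - exact); (gosper - exact)]./exact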
quantities are assumed to be continuous. For instance, spatial coordinates in phase space are
assumed to be continuous, as are the momentum coordinates for translational motion in free
space.
To work with continuous variables, we assume that an event can return a real number instead
of an integer index. The real number with its associated probability density is a continuous
random number. Note the change from assigning a probability to an event to assigning a
probability density. This is necessary as real numbers are not countable and thus the number of
possible events is infinite. If we want to infer a probability in the usual sense, we need to specify
an interval [l, u] between a lower bound l and an upper bound u. The probability that trial T will
turn up a real number in this closed interval is given by
P([l, u]) = \int_l^u \rho(x)\, dx .  (3.32)
Concept 3.2.1 Moment analysis. The n-th moment of a probability density distribution is
defined as

\langle x^n \rangle = \int x^n \rho(x)\, dx .  (3.34)

The first moment is the mean of the distribution. With the mean \langle x \rangle, the central moments are
defined as

\langle (x - \langle x \rangle)^n \rangle = \int (x - \langle x \rangle)^n \rho(x)\, dx .  (3.35)

The second central moment is the variance \sigma_x^2 and its square root \sigma_x is the standard deviation.
R In many books and articles, the same symbol P is used for probabilities and probability
densities. This is pointed out by Swendsen [Swe12], who decided to do the same, remarking
that the reader must learn to deal with this. In the next section he goes on to confuse
marginal and conditional probability densities with probabilities himself. In these lecture
notes we use P for probabilities, which are always unitless, finite numbers in the interval
[0, 1], and \rho for probability densities, which are always infinitesimally small and may have
a unit. Students are advised to keep the two concepts apart, which means using different
symbols.
Figure 3.2: Generation of random numbers that conform to a given probability density distribution. (A)
Cumulative probability distribution P(x) = \int_{-\infty}^{x} \rho(\xi)\, d\xi for \rho(x) = \exp(-x^4) (blue). A pseudo-random
number with uniform distribution in (0, 1), here 0.75, selects the ordinate of P(x) (red dashed horizontal
line). The corresponding abscissa, here x = 0.45 (red dashed vertical line), is an instance of a random
number with probability density distribution \rho(x). (B) Distribution of 10^5 random numbers (grey line)
and target probability density distribution \rho(x) = \exp(-x^4) (black line).
^5 This one-liner may cause efficiency problems if the computational effort per trial besides random number generation
is small.
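A minimal sketch of this inverse-transform procedure (grid, range, and sample size are arbitrary assumptions):

x   = linspace(-2, 2, 2001);          % grid; the truncated tails carry negligible weight
rho = exp(-x.^4);
rho = rho/trapz(x, rho);              % normalize the probability density
P   = cumtrapz(x, rho);               % cumulative probability distribution P(x)
u   = rand(1, 1e5);                   % uniform pseudo-random numbers in (0,1)
r   = interp1(P, x, u);               % map ordinates u back to abscissas x
histogram(r, 100, 'Normalization', 'pdf');   % compare with rho(x), cf. panel (B)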
Likewise, the conditional probability density \rho(y|x) is defined at all points where \rho_x(x) \neq 0,

\rho(y|x) = \frac{\rho(x, y)}{\rho_x(x)} .  (3.38)
If two continuous random numbers are independent, their joint probability density is the product
of the two individual probability densities,

\rho(x, y) = \rho_x(x) \cdot \rho_y(y) .  (3.39)
Figure 3.3: Monte Carlo simulation of a two-dimensional probability density distribution. (A) Two-
dimensional probability density distribution corresponding to the first-order membrane function used
in the Matlab logo. (B) Distribution of 10^7 random numbers conforming to the probability density
distribution shown in (A).
Problem 3.4 Write a Matlab program that generates random numbers conforming to a two-
dimensional probability density distribution mem that resembles the Matlab logo (see Figure
3.3). The (not yet normalized) distribution mem is obtained with the function call L =
membrane(1,resolution,9,9);. Hint: You can use the reshape function to generate a
vector from a two-dimensional array as well as for reshaping a vector into a two-dimensional
array. That way the two-dimensional problem (or, in general, a multi-dimensional problem) can
be reduced to the problem of a one-dimensional probability density distribution.
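One possible way to follow the hint (a sketch; grid size and sample number are assumptions, not the official solution):

L   = membrane(1, 50, 9, 9);          % not yet normalized 2D density on a 101 x 101 grid
rho = L - min(L(:));                  % shift so that the density is non-negative
rho = rho/sum(rho(:));                % normalize: all bin probabilities sum to one
p   = reshape(rho, 1, []);            % flatten the 2D problem to a 1D one
cdf = cumsum(p);
idx = arrayfun(@(u) find(cdf >= u, 1), rand(1, 1e5));  % draw linear bin indices
[row, col] = ind2sub(size(rho), idx);                  % map back to the 2D grid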
Concept 3.2.2 Dirac delta function. The Dirac delta function is a generalized function with
the following properties:
1. The function \delta(x) is zero everywhere except at x = 0.
2. \int_{-\infty}^{\infty} \delta(x)\, dx = 1.
The function can be used to select the value f(x_0) of another continuous function f(x),

f(x_0) = \int f(x)\, \delta(x - x_0)\, dx .  (3.40)
This concept can be used, for example, to compute the probability density of a new random
variable s that is a function of two given random variables x and y with given joint probability
density \rho(x, y). The probability density \rho(s) corresponding to s = f(x, y) is given by

\rho(s) = \int \int \rho(x, y)\, \delta\left( s - f(x, y) \right)\, dx\, dy .  (3.41)

Note that the probability density \rho(s) computed that way is automatically normalized.
Figure 3.4: Probability density distributions for two continuous random numbers x and y that are
uniformly distributed in the interval [0, 6] and have zero probability density outside this interval. a)
Marginal probability density \rho_x(x). b) Marginal probability density \rho_y(y). c) Joint probability density
\rho(x, y). In the light blue area, \rho = 1/36, outside \rho = 0. The orange line corresponds to s = 4 and the
green line to s = 8.
We now use the concept of selective integration to compute the probability density \rho(s) for
the sum s = x + y of the numbers shown by two continuous dice, with each of them having a
uniform probability density in the interval [0, 6] (see Fig. 3.4). We have

\rho(s) = \int \int \rho(x, y)\, \delta\left( s - (x + y) \right)\, dy\, dx  (3.42)
= \frac{1}{36} \int_0^6 \int_0^6 \delta\left( s - (x + y) \right)\, dy\, dx .  (3.43)
The argument of the delta function in the inner integral over y can be zero only for
0 \leq s - x \leq 6, since otherwise no value of y exists that leads to s = x + y. It follows that x \leq s
and x \geq s - 6. For s = 4 (orange line in Fig. 3.4c) the former condition sets the upper limit of
the integration. Obviously, this is true for any s with 0 \leq s \leq 6. For s = 8 (green line in Fig.
3.4c) the condition x \geq s - 6 sets the lower limit of the integration, as is also true for any s with
6 \leq s \leq 12. The lower limit is 0 for 0 \leq s \leq 6 and the upper limit is 6 for 6 \leq s \leq 12. Hence,
\rho(s) = \frac{1}{36} \int_0^s dx = \frac{s}{36} \quad \text{for } s \leq 6 ,  (3.44)

and

\rho(s) = \frac{1}{36} \int_{s-6}^{6} dx = \frac{12 - s}{36} \quad \text{for } s \geq 6 .  (3.45)
From the graphical representation in Fig. 3.4c it is clear that \rho(s) is zero at s = 0 and s = 12,
assumes a maximum of 1/6 at s = 6, increases linearly between s = 0 and s = 6 and decreases
linearly between s = 6 and s = 12.
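The result can be checked by a short Monte Carlo simulation (a sketch with an assumed sample size):

s = 6*rand(1, 1e6) + 6*rand(1, 1e6);          % sum of two uniform continuous dice
histogram(s, 0:0.25:12, 'Normalization', 'pdf');
hold on;
x = linspace(0, 12, 200);
plot(x, min(x, 12 - x)/36);                   % rho(s) from Eqs. (3.44) and (3.45)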
4 Classical Ensembles
4.1.2 Ergodicity
Instead of considering a large ensemble of systems at the same time (ensemble average), we
could also consider a long trajectory of a single system in phase space. The single system
will go through different microstates and if we observe it for a sufficiently long time, we might
expect that it visits all accessible points in phase space with a frequency that corresponds to the
associated probability density. This idea is the basis of analyzing MD trajectories in terms of
thermodynamic state functions. The ensemble average \langle A \rangle is replaced by the time average \overline{A}.
We assume

\langle A \rangle = \overline{A} .  (4.1)
to temperature and to the energy assigned to points in phase space. Points are accessible if
their energy is not too much higher than the energy minimum in phase space. Whether a single
dynamic system visits all these points at the given temperature, and what time it needs
to sample phase space, depends on energy barriers. In MD simulations sampling problems are
often encountered, where molecular conformations that are thermodynamically accessible are
not accessed within reasonable simulation times. A multitude of techniques exists for alleviating
such sampling problems, none of them perfect. In general, time-average methods, be they
computational or experimental, should be interpreted in terms of thermodynamics only with care.
In this lecture course we focus on ensemble-average methods, which suffer from a loss in dynamic
information, but get the thermodynamic state functions right.
The probability density in phase space of the microcanonical ensemble is thus relatively
easy to compute. However, the restriction to constant energy, i.e. to an isolated system,
severely limits application of the microcanonical ensemble. To see this, we consider the simplest
system, an electron spin S = 1/2 in an external magnetic field B0 . This system is neither
classical nor describable in phase space, but it will nicely serve our purpose. The system has
a state space consisting of only two states |\alpha\rangle and |\beta\rangle with energies \epsilon_\alpha = -\hbar g_e \mu_B B_0/2 and
\epsilon_\beta = +\hbar g_e \mu_B B_0/2.^2 In magnetic resonance spectroscopy, one would talk of an ensemble of
^1 It is more tricky to argue that it will only vanish if \rho is uniform. However, as the individual particles follow
random phase space trajectories, it is hard to imagine that the right-hand side could be stationary zero unless \rho is
uniform.
^2 Where g_e is the g value of the free electron and \mu_B the Bohr magneton.
isolated spins, if the individual spins do not interact with each other. We shall see shortly
that this ensemble is not isolated in a thermodynamical sense, and hence not a microcanonical
ensemble.
The essence of the microcanonical ensemble is that all systems in the ensemble have the same
energy E; this restricts the probability density to the hypersurface with constant E. If our ensemble
of N spins were a microcanonical ensemble, this energy would be either E = -\hbar g_e \mu_B B_0/2
or E = +\hbar g_e \mu_B B_0/2 and all spins in the ensemble would have to be in the same state, i.e., the
ensemble would be in a pure state. In almost any experiment on spins S = 1/2 the ensemble
is in a mixed state and the populations of states |i and |i are of interest. The system is not
isolated, but, via spin relaxation processes, in thermal contact with its environment. To describe
this situation, we need another type of ensemble.
R Because each system can exchange heat with the bath and thus change its energy, systems
will transfer between subensembles during evolution. This does not invalidate the idea of
microcanonical subensembles with constant particle numbers Ni . For a sufficiently large
ensemble at thermal equilibrium the ni are constants of motion.
There are different ways of deriving the Boltzmann distribution. Most of them are rather
abstract and rely on a large mathematical apparatus. The derivation gets lengthy if one wants
to create the illusion that we know why the constant introduced below always equals 1/kB T ,
where kB = R/NAv is the Boltzmann constant, which in turn is the ratio of the universal gas
constant R and the Avogadro constant NAv . Here we follow a derivation [WF12] that is physically
transparent and relies on a minimum of mathematical apparatus that we have already introduced.
and

\sum_{i=0}^{r-1} N_i \epsilon_i = E ,  (4.5)
where E is a constant total energy of the system. We need to be careful in interpreting the
latter equation in the ensemble picture. The quantity E corresponds to the energy of the whole
canonical ensemble, which is indeed a constant of motion, if we consider a sufficiently large
number of systems in contact with a thermal bath. We can thus use our simple model of N
particles for guessing the probability density distribution in the canonical ensemble.
What we are looking for is the most likely distribution of the N particles on the r energy
levels. This is equivalent to putting N distinguishable balls into r boxes. We did already solve
the problem of distributing N objects to 2 states when considering the binomial distribution in
Section 3.1.6. The statistical weight of a configuration with n objects in the first state and N - n
objects in the second state was \binom{N}{n}. With this information we would already be able to solve the
problem of a canonical ensemble of N spins S = 1/2 in thermal contact with the environment,
disregarding for the moment differences between classical and quantum statistics (see Section
6.2).
Coming back to N particles and r energy levels, we still have N! permutations. If we assign
the first N_0 particles to the state with energy \epsilon_0, the next N_1 particles to \epsilon_1 and so on, we need
to divide each time by the number of permutations N_i! within the same energy state, because the
sequence of particles with the same energy does not matter. We call the vector of the occupation
numbers N_i a configuration. The configuration specifies one particular macrostate of the system
and the relative probability of the macrostates is given by their statistical weights,

\Omega = \frac{N!}{N_0!\, N_1! \cdots N_{r-1}!} .  (4.6)
The most probable macrostate is the one with maximum statistical weight \Omega. Because of the
peaking of probability distributions for large N, we need to compute only this most probable
macrostate; it is representative for the whole ensemble. Instead of maximizing \Omega we can as well
maximize \ln \Omega, as the natural logarithm is a strictly monotonous function. This allows us to
apply Stirling's formula,

\ln \Omega = \ln N! - \sum_{i=0}^{r-1} \ln N_i!  (4.7)
= N \ln N - N + 1 - \sum_{i=0}^{r-1} N_i \ln N_i + \sum_{i=0}^{r-1} N_i - r .  (4.8)
Note that the second term on the right-hand side of Eq. (4.9) has some similarity to the entropy
of mixing, which suggests that \ln \Omega is related to entropy.
In addition, we need to consider the boundary conditions of constant particle number, Eq. (4.4),
\delta N = \sum_i \delta N_i = 0 ,  (4.11)
It might appear that Eq. (4.11) could be used to cancel a term in Eq. (4.10), but this would
be wrong as Eq. (4.11) is a constraint that must be fulfilled separately. For the constrained
maximization we can use the method of Lagrange multipliers.
We now consider the case where the possible sets of the n variables are constrained by c
additional equations
gj (x1 , x2 , . . . , xn ) = 0 , (4.14)
where index j runs over the c constraints (j = 1 . . . c). Each constraint introduces another
equation of the same form as the one of Eq. (4.13),
\delta g_j = \sum_{i=1}^{n} \left( \frac{\partial g_j}{\partial x_i} \right)_{x_k \neq x_i} \delta x_i = 0 .  (4.15)
If a set of variables \{x_{0,1}, \ldots, x_{0,n}\} solves the constrained problem, then there exists a set
\{\lambda_{0,1}, \ldots, \lambda_{0,c}\} for which \{x_{0,1}, x_{0,2}, \ldots, x_{0,n}\} also corresponds to a stationary point of the
Lagrangian function L(x_1, \ldots, x_n, \lambda_1, \ldots, \lambda_c). Note that not all stationary points of the
Lagrangian function are necessarily solutions of the constrained problem. This needs to be
checked separately.
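As a brief illustration (an example added here, not taken from the original text): maximize f(x, y) = xy under the single constraint g(x, y) = x + y - 1 = 0. The Lagrangian is L(x, y, \lambda) = xy + \lambda(x + y - 1). The stationarity conditions \partial L/\partial x = y + \lambda = 0 and \partial L/\partial y = x + \lambda = 0 give x = y, and the constraint then yields x = y = 1/2 with \lambda = -1/2. Checking the values of f along the constraint confirms that this stationary point is indeed the constrained maximum.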
The two boundary conditions fix only two of the population numbers N_i. We can choose the
multipliers \alpha and \beta in a way that (1 + \ln N_i + \alpha + \beta\epsilon_i) = 0 for these two N_i, which ensures that
the partial derivatives of \ln \Omega with respect to these two N_i vanish. The other r - 2 population
numbers can, in principle, be chosen freely, but again we must have

1 + \ln N_i + \alpha + \beta\epsilon_i = 0  (4.19)

for all i to make sure that we find a maximum with respect to variation of any of the r population
numbers. This gives
N_i = e^{-1-\alpha-\beta\epsilon_i} ,  (4.20)

giving, with the constraint of constant total particle number,

e^{-1-\alpha} = \frac{N}{\sum_i e^{-\beta\epsilon_i}} ,  (4.22)

P(i) = \frac{N_i}{N} = \frac{e^{-\beta\epsilon_i}}{\sum_i e^{-\beta\epsilon_i}} .  (4.23)
Concept 4.3.3 Boltzmann distribution. For a classical canonical ensemble with energy levels
\epsilon_i the probability distribution for the level populations is given by the Boltzmann distribution

P_i = \frac{N_i}{N} = \frac{e^{-\epsilon_i/k_B T}}{\sum_i e^{-\epsilon_i/k_B T}} .  (4.25)

The sum over states required for normalization is called the canonical partition function.^a The
partition function is a thermodynamical state function.

^a The dependence on N and V arises because these parameters influence the energy levels.
For the partition function, we use the symbol Z relating to the German term Zustandssumme,
which is a more lucid description of this quantity.
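A minimal numerical sketch of Eq. (4.25) and the partition function (the energy levels below are freely assumed, not taken from the text):

kB = 1.380649e-23;              % Boltzmann constant in J/K
T  = 298;                       % temperature in K (assumed)
En = [0 2 5]*1e-21;             % three energy levels in J (assumed values)
Z  = sum(exp(-En/(kB*T)));      % canonical partition function
P  = exp(-En/(kB*T))/Z          % level populations according to Eq. (4.25)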
\epsilon_\mathrm{trans} = \frac{1}{2} m v^2 = \frac{1}{2m} p^2 ,  (4.27)
i.e., it is quadratic in the velocity coordinates of dynamic space or the momentum coordinates of
phase space. Translational energy is distributed via three degrees of freedom, as the velocities or
momenta have components along three pairwise orthogonal directions in space. Each quadratic
degree of freedom thus contributes a mean energy of kB T /2.
If we accept that the Lagrange multiplier \beta assumes the value 1/k_B T, we find a mean energy
k_B T of a harmonic oscillator in the high-temperature limit [WF12]. Such an oscillator has two
degrees of freedom that contribute quadratically to the energy,

\epsilon_\mathrm{vib} = \frac{1}{2} \mu v^2 + \frac{1}{2} f x^2 ,  (4.28)

where \mu is the reduced mass and f the force constant. The first term contributes to kinetic energy,
the second to potential energy. In the time average, each term contributes the same energy and
assuming ergodicity this means that each of the two degrees of freedom contributes with kB T /2
to the average energy of a system at thermal equilibrium.
The same exercise can be performed for rotational degrees of freedom with energy

\epsilon_\mathrm{rot} = \frac{1}{2} I \omega^2 ,  (4.29)

where I is the moment of inertia and \omega the angular frequency. Each rotational degree of freedom,
being quadratic in \omega, again contributes a mean energy of k_B T/2.
Based on Eq. (4.23) it can be shown [WF12] that for an energy

\epsilon_i = \epsilon_1 + \epsilon_2 + \ldots = \sum_{k=1}^{f} \epsilon_k ,  (4.30)

where index k runs over the individual degrees of freedom, the number of molecules that
contribute energy \epsilon_k does not depend on the terms \epsilon_j with j \neq k. It can be further shown that

\langle \epsilon_k \rangle = \frac{1}{2\beta}  (4.31)

^3 The condition of a quadratic contribution arises from an assumption that is made when integrating over the
corresponding coordinate.
R The equipartition theorem applies to all degrees of freedom that are activated. Translational
degrees of freedom are always activated and rotational degrees of freedom are activated
at ambient temperature, which corresponds to the high-temperature limit of rotational
dynamics. To vibrational degrees of freedom the equipartition theorem applies only in the
high-temperature limit. In general, the equipartition theorem fails for quantized degrees of
freedom if the quantum energy spacing is comparable to kB T /2 or exceeds this value. We
shall come back to this point when discussing the vibrational partition function.
Thus we obtain

u = N k_B T^2 \frac{1}{Z} \frac{dZ}{dT} = N k_B T^2 \frac{d \ln Z}{dT} .  (4.34)
Again the analogy of our simple system to the canonical ensemble holds. At this point we
have computed one of the state functions of phenomenological thermodynamics from the set of
energy levels. The derivation of the Boltzmann distribution has also indicated that \ln \Omega, and thus
the partition function Z, are probably related to entropy. We shall see in Section 5.2 that this is
indeed the case and that we can compute all thermodynamic state functions from Z.
Here we can still derive the heat capacity cV at constant volume, which is the partial derivative
of internal energy with respect to temperature. To that end we note that the partition function for
the canonical ensemble relates to constant volume and constant number of particles.
c_V = \left( \frac{\partial u}{\partial T} \right)_V = N \frac{\partial}{\partial T} \left[ k_B T^2 \left( \frac{\partial \ln Z}{\partial T} \right)_V \right] = -N k_B \frac{\partial}{\partial T} \left( \frac{\partial \ln Z}{\partial (1/T)} \right)_V  (4.35)

= -N k_B \left( \frac{\partial \left[ \partial \ln Z / \partial (1/T) \right]}{\partial T} \right)_V = \frac{N k_B}{T^2} \left( \frac{\partial \left[ \partial \ln Z / \partial (1/T) \right]}{\partial (1/T)} \right)_V  (4.36)

= \frac{k_B}{T^2} \left( \frac{\partial^2 \ln z}{\partial (1/T)^2} \right)_V .  (4.37)
In the last line of Eq. (4.37) we have substituted the molecular partition function Z by the partition
function z for the whole system, \ln z = N \ln Z. Note that this implies a generalization. Before, we
were considering a system of N identical particles. Now we implicitly assume that Eq. (4.37),
as well as u = k_B T^2 \frac{d \ln z}{dT}, will hold for any system, as long as we correctly derive the system
partition function z.
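A minimal sketch (a two-level system with assumed parameters) of how u and c_V follow from the system partition function by numerical differentiation:

kB  = 1.380649e-23; NA = 6.02214076e23;       % J/K; one mole of particles (assumed)
de  = 4e-21;                                  % level spacing in J (assumed)
lnz = @(T) NA*log(1 + exp(-de/(kB*T)));       % ln z = N ln Z for levels 0 and de
dT  = 1e-3; T = 300;
u   = @(T) kB*T.^2.*(lnz(T + dT) - lnz(T - dT))/(2*dT);   % u = kB T^2 d(ln z)/dT
cV  = (u(T + dT) - u(T - dT))/(2*dT)          % heat capacity at constant volume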
We note here that the canonical ensemble describes a closed system that can exchange heat
with its environment, but by definition it cannot exchange work, because its volume V is constant.
This does not present a problem, since the state functions can be computed at different V . In
particular, pressure p can be computed from the partition function as well (see Section 5.2).
However, because the canonical ensemble is closed, it cannot easily be applied to all problems
that involve chemical reactions. For this we need to remove the restriction of a constant number
of particles in the systems that make up the ensemble.
Concept 4.4.1 Grand canonical ensemble. An ensemble with constant chemical potential
\mu_k of all components, and constant volume V that is at thermal equilibrium with a heat
bath at constant temperature T and in chemical equilibrium with its environment is called
a grand canonical ensemble. It can be considered as consisting of canonical subensembles
with different particle numbers N . The grand canonical state energies and partition function
contain an additional chemical potential term. With this additional term the results obtained
for the canonical ensemble apply to the grand canonical ensemble, too.
whereas the probability distribution over the levels and particle numbers is

P_i = \frac{ e^{\left( \sum_k \mu_k N_{i,k} - \epsilon_i \right)/k_B T} }{ Z_\mathrm{gc} } .  (4.39)
Note that the index range i is much larger than for a canonical ensemble, because each microstate
is now characterized by a set of particle numbers Ni,k , where k runs over the components.
R At this point we are in conflict with the notation that we used in the lecture course
on phenomenological thermodynamics (PC I, http://www.epr.ethz.ch/education/Thermodynamics_PCI/).
In that course we defined the chemical potential as a molar quantity; here it is a molecular
quantity. The relation is \mu_\mathrm{PCI} = N_\mathrm{Av}\, \mu_\mathrm{PCVI}. Using the PC
I notation in the current lecture notes would be confusing in other ways, as \mu is generally
used in statistical thermodynamics for the molecular chemical potential.
applies to capital letters for state functions. Capital letters denote either a molecular
quantity or a molar quantity. The difference will be clear from the context. We note that in
general small letters for state functions (except for pressure p) denote extensive quantities
and capital letters (except for volume V ) denote intensive quantities.
5 Entropy
dqrev = cV dT . (5.4)
We see that under the assumptions that we have made the entropy can be computed from the
partition function. In fact, there should be a unique mapping between the two quantities, as both
the partition function and the entropy are state functions and thus must be uniquely defined by
the state of the system.
We now proceed with computing the constant k in the mathematical definition of Boltzmann
entropy, Eq. (5.3). By inserting Eq. (4.9) into Eq. (5.3) we have
s = k \left( N \ln N - \sum_{i=0}^{r-1} N_i \ln N_i \right) . \qquad (5.12)
We have neglected the term r on the right-hand side of Eq. (4.9), as is permissible if the number N of particles is much larger than the number r of energy levels. Furthermore, according to Eq. (4.25) and the definition of the partition function, we have N_i = N e^{-\epsilon_i/k_B T}/Z. Hence,
s = k \left[ N \ln N - \sum_{i=0}^{r-1} \left( N \frac{e^{-\epsilon_i/k_B T}}{Z} \ln\!\left( N \frac{e^{-\epsilon_i/k_B T}}{Z} \right) \right) \right] \qquad (5.13)

= k \left[ N \ln N - N \ln N \sum_{i=0}^{r-1} \frac{e^{-\epsilon_i/k_B T}}{Z} + N \ln Z \sum_{i=0}^{r-1} \frac{e^{-\epsilon_i/k_B T}}{Z} + N \sum_{i=0}^{r-1} \frac{e^{-\epsilon_i/k_B T}}{Z} \frac{\epsilon_i}{k_B T} \right] \qquad (5.14)

= k \left[ N \ln N - N \ln N + N \ln Z + \frac{N}{k_B T} \frac{\sum_{i=0}^{r-1} \epsilon_i e^{-\epsilon_i/k_B T}}{Z} \right] , \qquad (5.15)
where we have used the definition of the partition function in going from Eq. (5.14) to (5.15). Using Eq. (4.32) for substitution in the last term on the right-hand side of Eq. (5.15), we find
s = k \left( N \ln Z + \frac{u}{k_B T} \right) . \qquad (5.16)
Comparison of Eq. (5.16) with Eq. (5.11) gives two remarkable results. First, the multiplicative constant k in Boltzmann's entropy definition can be identified as k = k_B = R/N_\mathrm{Av}. Second, for the system of N identical, distinguishable classical particles, we must have

z_\mathrm{dist} = Z^N . \qquad (5.17)
identical ideal gases is called the Gibbs paradox. The Gibbs paradox can be healed by treating the particles as indistinguishable. This reduces the statistical weight by a factor N! for the total system and by a factor (N/2)! for each subsystem, which just offsets the volume effect. Hence, for an ideal gas we have

z_\mathrm{indist} = \frac{1}{N!} Z^N . \qquad (5.18)
It may appear artificial to treat classical particles as indistinguishable, because the trajectory of each particle could, in principle, be followed if they obeyed the classical equations of motion, as we had assumed. Note, however, that we discuss a macrostate and that we have
explicitly assumed that we cannot have information on the microstates, i.e., on the trajectories.
In the macrostate picture, particles in an ideal gas are, indeed, indistinguishable. For an ideal
crystal, on the other hand, each particle could be individually addressed, for instance, by high
resolution microscopy. In this case, we need to use Eq. (5.17).
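The role of the 1/N! correction can be illustrated numerically. The following Python sketch (an added illustration, not part of the original derivation; the particle mass and state point are arbitrary choices) computes s = u/T + k_B ln z for a monoatomic ideal gas with and without the factor 1/N! and checks whether entropy doubles when both N and V are doubled:

import numpy as np
from math import lgamma     # lgamma(N + 1) = ln N!

kB, h = 1.380649e-23, 6.62607015e-34
m = 6.63e-26                # kg, roughly the mass of an argon atom (illustrative)

def entropy(N, V, T, indistinguishable):
    # s = u/T + kB ln z with u = (3/2) N kB T and Z the translational molecular partition function
    Z = (2.0 * np.pi * m * kB * T / h**2)**1.5 * V
    ln_z = N * np.log(Z) - (lgamma(N + 1.0) if indistinguishable else 0.0)
    return 1.5 * N * kB + kB * ln_z

N, V, T = 1.0e4, 1.0e-22, 298.0
for indist in (False, True):
    ratio = entropy(2*N, 2*V, T, indist) / (2.0 * entropy(N, V, T, indist))
    print(indist, ratio)    # equals 1 (to within Stirling accuracy) only with the 1/N! correction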
f = u - Ts . \qquad (5.19)

f = u - T\left(u/T + k_B \ln z\right) \qquad (5.20)
= -k_B T \ln z . \qquad (5.21)
We note that this value of f , which can be computed from only the canonical partition function
and temperature, corresponds to the global minimum over all macrostates. This is not surprising.
After all, the partition function was found in a maximization of the probability of the macrostate.
where we have skipped the lower index n indicating constant molar amount. This is permissible
for the canonical ensemble, where the number of particles is constant by definition. From Eq.
(5.22) it follows that
pV = k_B T \left(\frac{\partial \ln z}{\partial \ln V}\right)_T \qquad (5.23)

and

h = u + pV = k_B T \left[ \left(\frac{\partial \ln z}{\partial \ln T}\right)_V + \left(\frac{\partial \ln z}{\partial \ln V}\right)_T \right] . \qquad (5.24)
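As a quick plausibility check (an added sketch, not part of the notes), Eq. (5.23) can be evaluated numerically. For an ideal gas, ln z equals N ln V plus terms that do not depend on V, and the logarithmic derivative then reproduces pV = N k_B T:

import numpy as np

kB = 1.380649e-23
N, T, V = 1.0e20, 298.0, 1.0e-3    # arbitrary illustrative values

def ln_z(volume):
    # only the V-dependent part of ln z matters for the derivative in Eq. (5.23)
    return N * np.log(volume)

d = 1e-6                           # step in ln V for a central difference
dlnz_dlnV = (ln_z(V * np.exp(d)) - ln_z(V * np.exp(-d))) / (2 * d)
p = kB * T * dlnz_dlnV / V
print(p, N * kB * T / V)           # both expressions give the ideal-gas pressure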
5.3 Irreversibility
5.3.1 Historical Discussion
Daily experience tells us that some processes are irreversible. Phenomenological thermodynamics
had provided recipes for recognizing such processes by an increase in entropy for an isolated
system or decrease of free energy for a closed system. When Boltzmann suggested a link between
classical mechanics of molecules on a microscopic level and irreversibility of processes on the
macroscopic level, many physicists were irritated nevertheless. In retrospect it is probably fair to say that a controversial discussion of Boltzmann's result could only ensue because the atomistic or molecular theory of matter was not yet universally accepted at the time. It is harder to understand why this discussion is still going on in textbooks. Probably this is related to the fact that physicists in the second half of the 19th and the first half of the 20th century believed that pure physics has implications in philosophy, beyond the obvious ones in epistemology applied to experiments in the sciences.
If statistical mechanics is used to predict the future of the universe into infinite times, problems
ensue. If statistical mechanics is properly applied to well-defined experiments there are no such
problems.
Classical mechanics of particles does not involve irreversibility. The equations of motion
have time reversal symmetry and the same applies to quantum-mechanical equations of motion.
If the sign of the Hamiltonian can be inverted, the system will evolve backwards along the
same trajectory in phase space (or state space) that it followed to the point of inversion. This
argument is called Umkehreinwand or Loschmidt paradox and was brought up (in its classical
form) by Loschmidt. The argument can be refined and is then known as the central paradox:
Each microstate can be assigned a time-reversed state that evolves, under the same Hamiltonian,
backwards along the same trajectory. The two states should have the same probability. The
central paradox confuses equilibrium and non-equilibrium dynamics. At equilibrium a state and
the corresponding time-reversed state indeed have the same probability, which explains that the
macrostate of the system does not change and why processes that can be approximated by a series
of equilibrium states are reversible. If, on the other hand, we are not at equilibrium, there is no reason for assuming that the probabilities of any two microstates are related. The system is at some initial condition with a given set of probabilities, and we are not allowed to impose symmetry requirements on this initial condition.
The original Umkehreinwand, which is based on sign inversion of the Hamiltonian rather than the momenta of microstates, is more serious than the central paradox. Time-reversal experiments
of this type can be performed, for instance, echo experiments in magnetic resonance spectroscopy
and optical spectroscopy. In some of these echo experiments, indeed the Hamiltonian is sign-
inverted, in most of these experiments application of a perturbation Hamiltonian for a short
time (pulse experiment) causes sign inversion of the density matrix. Indeed, the first paper on
observation of such a spin echo by Erwin Hahn was initially rejected with the argument that
he could not have observed what he claimed, as this would have violated the Second Law of
Thermodynamics. A macroscopic time-reversal experiment that creates a colorant echo in
corn syrup can be based on laminar flow [UNM12]. We note here that all these time-reversal
experiments are based on preparing a system in a non-equilibrium state. To analyze them, changes
in entropy or Helmholtz free energy must be considered during the evolution that can be reversed.
These experiments do not touch the question whether or not the same system will irreversibly
approach an equilibrium state if left to itself for a sufficiently long time. We can see this easily for
the experiment with colorants and corn syrup. If, after setup of the initial state and evolution to the point of time reversal, a long time passed, the colorant echo would no longer be observed, because diffusion of the colorants in corn syrup would destroy spatial correlation. The echo
relies on the fact that diffusion of the colorants in corn syrup can be neglected on the time scale
of the experiment, i.e., that equilibrium cannot be reached. The same is true for the spin echo
experiment, which fails if the evolution time is much longer than the transverse relaxation time
of the spins.
Another argument against irreversibility was raised by Zermelo, based on a theorem by Poincaré. The theorem states that any isolated classical system will return repeatedly to a
point in phase space that is arbitrarily close to the starting point. This argument is known as
Wiederkehreinwand or Zermelo paradox. We note that such quasi-periodicity is compatible with
the probability density formalism of statistical mechanics. The probability density distribution is
very sharply peaked at the equilibrium state, but it is not zero at the starting point in phase space.
The system fluctuates around the equilibrium state and, because the distribution is sharply peaked,
these fluctuations are very small most of the time. Once in a while the fluctuation is sufficiently
large to revisit even a very improbable starting point in phase space, but for a macroscopic system
this while is much longer than the lifetime of our galaxy. For practical purposes such large
fluctuations can be safely neglected, because they occur so rarely. That a system will never evolve far from the equilibrium state once it has attained equilibrium is an approximation, but the
approximation is better than many other approximations that we use in physics. The statistical
error that we make is certainly much smaller than our measurement errors.
This is the definition of Gibbs entropy, while Boltzmann entropy is assigned to an individual
microstate. Note that we have used a capital S because Gibbs entropy is a molecular entropy.
Using Eq. (4.25), we obtain for the system entropy s = N S,
s = -k_B N \sum_i P_i \left( -\frac{\epsilon_i}{k_B T} - \ln Z \right) \qquad (5.27)
= \frac{u}{T} + k_B \ln z , \qquad (5.28)
where we have assumed distinguishable particles, so that ln z = N ln Z. We have recovered Eq.
(5.11) that we had derived for the system entropy starting from Boltzmann entropy and assuming
a canonical ensemble. For a canonical ensemble of distinguishable particles, either concept can
be used. As noted above, Gibbs entropy leads to the paradox of a positive mixing entropy for
combination of two subsystems made up by the same ideal gas. More generally, Gibbs entropy is
not extensive if the particles are indistinguishable. The problem can be solved by redefining the
system partition function as in Eq. (5.18).
This problem suggests that entropy is related to the information we have on the system.
Consider mixing of ^{13}CO_2 with ^{12}CO_2.^4 At a time when nuclear isotopes were unknown, the two gases could not be distinguished and the mixing entropy was zero. With a sufficiently sensitive spectrometer we could nowadays observe the mixing process by ^{13}C NMR. We will observe
spontaneous mixing. Quite obviously, the mixing entropy is not zero anymore.
This paradox cautions against philosophical interpretation of entropy. Entropy is a quantity
that can be used for predicting the outcome of physical experiments. It presumes an observer and
depends on the information that the observer has or can obtain.5 Statistical mechanics provides
general recipes for defining entropy, but the details of a proper definition depend on experimental
context.
Unlike the system entropy derived from Boltzmann entropy via the canonical ensemble,
Gibbs entropy is, in principle, defined for non-equilibrium states. Because it is based on the same
probability concept, Gibbs entropy in an isolated system is smaller for non-equilibrium states
than for equilibrium states.
S = -k_B\, \mathrm{Trace}\left\{ \rho \ln \rho \right\} , \qquad (5.29)

where \rho is the density matrix. Some physics textbooks don't distinguish von Neumann entropy
from Gibbs entropy [Sch06]. Von Neumann entropy is a constant of motion if an ensemble of
classical systems evolves according to the Liouville equation or a quantum mechanical system
evolves according to the Liouville-von-Neumann equation. It cannot describe the approach of an
isolated system to equilibrium. Coupling of the quantum mechanical system to an environment
can be described by the stochastic Liouville equation [Kub63; Tan06; VF74]

\frac{\partial \hat{\rho}}{\partial t} = -\frac{i}{\hbar}\left[\hat{H}, \hat{\rho}\right] - \hat{\hat{\Gamma}}\left(\hat{\rho} - \hat{\rho}_\mathrm{eq}\right) , \qquad (5.30)
4This thought experiment was suggested to me by Roland Riek.
5One can speculate on philosophical interpretations. Irreversibility could be a consequence of partitioning the
universe into an observer and all the rest, a notion that resonates with intuitions of some mystical thinkers across
different religious traditions. Although the idea is appealing, it cannot be rationally proved. Rational thought already
implies that an observer exists.
where \hat{\hat{\Gamma}} is a Markovian operator and \hat{\rho}_\mathrm{eq} the density matrix at equilibrium. This equation of motion can describe quantum dissipative systems, i.e., the approach to equilibrium, without relying explicitly on the concept of entropy, except for the computation of \hat{\rho}_\mathrm{eq}, which relies on generalization of the Boltzmann distribution (see Section 6.1.2). However, to derive the Markovian operator \hat{\hat{\Gamma}}, explicit assumptions on the coupling between the quantum mechanical system and its environment must be made, which is beyond the scope of this lecture course.
A logarithm to the base of 2 is used here as the information is assumed to be coded by binary numbers. Unlike for discrete states in statistical mechanics, an event may be in the set but still have a probability P(a_j) = 0. In such cases, P(a_j) \log_2 P(a_j) is set to zero. Shannon entropy is larger the more random the distribution is or, more precisely, the closer the distribution is to a uniform distribution. Information is considered as deviation from a random stream of numbers or characters. The higher the information content, the lower the entropy.
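As a small illustration (added here; the probability vectors are arbitrary examples), the Shannon entropy of a discrete distribution can be computed as follows, with the convention that terms with P(a_j) = 0 contribute zero:

import numpy as np

def shannon_entropy(P):
    # H = -sum_j P(a_j) log2 P(a_j), with 0 log2 0 taken as 0
    P = np.asarray(P, dtype=float)
    P = P[P > 0]
    return -np.sum(P * np.log2(P))

print(shannon_entropy([0.25, 0.25, 0.25, 0.25]))   # uniform distribution: 2 bits (maximal)
print(shannon_entropy([0.9, 0.1, 0.0, 0.0]))       # less random: fewer bits
print(shannon_entropy([1.0, 0.0, 0.0, 0.0]))       # deterministic: 0 bits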
Shannon entropy can be related to the reduced Gibbs entropy S/k_B. It is the amount of
Shannon information that is required to specify the microstate of the system if the macrostate
is known. When expressed with the binary logarithm, this amount of Shannon information
specifies the number of yes/no questions that would have to be answered to specify the microstate.
We note that this is exactly the type of experiment presumed in the second Penrose postulate
(Section 2.3.1). The more microstates are consistent with the observed macrostate, the larger is
this number of questions and the larger are Shannon and Gibbs entropy. The concept applies
to non-equilibrium states as well as to equilibrium states. This is what G. N. Lewis stated before Shannon: "Gain in entropy always means loss of information, and nothing more".
The equilibrium state is the macrostate that lacks most information on the underlying microstate.
We can further associate order with information, as any ordered arrangement of objects
contains information on how they are ordered. In that sense, loss of order is loss of information
and increase of disorder is an increase in entropy. The link arises via probability, as the total
number of arrangements is much larger than the number of arrangements that conform to a
certain order principle. Nevertheless, the association of entropy with disorder is only colloquial,
because in most cases we do not have quantitative descriptions of order.
6 Quantum Ensembles
Concept 6.1.1 Density Matrix. The microstates that can be assumed by a system in a quantum ensemble are specified by a possible set of wavefunctions \psi_i (i = 1 \ldots r). The probability or population of the ith microstate is denoted as P_i, and for the continuous case the probability density for a given wavefunction \psi is denoted as p(\psi). The density operator is then given by
\hat{\rho} = \sum_{i=0}^{r-1} P(i)\, |\psi_i\rangle\langle\psi_i| \quad \text{(discrete)} \qquad (6.1)

\hat{\rho} = \int p(\psi)\, |\psi\rangle\langle\psi|\, \mathrm{d}\psi \quad \text{(continuous)} . \qquad (6.2)
The density operator can be expressed as a density matrix with respect to a set of basis functions |k\rangle. For exact computations the basis functions must form a countable complete set that allows for expressing the system wavefunctions \psi_i as linear combinations of basis functions. For approximate computations, it suffices that this linear combination is a good approximation. The matrix elements of the density matrix are then given by
\rho_{nm} = \sum_{i=0}^{r-1} P(i)\, \langle m|\psi_i\rangle\langle\psi_i|n\rangle \quad \text{(discrete)} \qquad (6.3)

\rho_{nm} = \int p(\psi)\, \langle m|\psi\rangle\langle\psi|n\rangle\, \mathrm{d}\psi \quad \text{(continuous)} . \qquad (6.4)
With the complex coefficients c_k in the linear combination representation |\psi\rangle = \sum_k c_k |k\rangle, the matrix elements are

\rho_{nm} = \overline{c_m c_n^*} , \qquad (6.5)
where the asterisk denotes the complex conjugate and the bar for once denotes the ensemble average. It follows that diagonal elements (m = n) are necessarily real, \rho_{nn} = \overline{|c_n|^2}, and that \rho_{mn} is the complex conjugate of \rho_{nm}. Therefore, the density matrix is Hermitian and the density operator is self-adjoint. The matrix dimension is the number of basis functions. It is often convenient to use the eigenfunctions of the system Hamiltonian \hat{H} as the basis functions, but the concept of the density matrix is not limited to this choice.
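The construction of Eq. (6.5) is easy to mimic numerically. The following sketch (an added illustration; the ensemble of random state vectors is an arbitrary choice) averages c_m c_n^* over many systems and checks the properties stated above:

import numpy as np

rng = np.random.default_rng(0)
dim, n_systems = 3, 100000

# random normalized coefficient vectors c_k for each system of the ensemble
c = rng.normal(size=(n_systems, dim)) + 1j * rng.normal(size=(n_systems, dim))
c /= np.linalg.norm(c, axis=1, keepdims=True)

# rho_nm = ensemble average of c_m c_n^*, Eq. (6.5)
rho = (c.conj().T @ c) / n_systems

print(np.allclose(rho, rho.conj().T))   # Hermitian
print(np.diag(rho).real)                # real populations; for this featureless ensemble close to 1/dim
print(np.trace(rho).real)               # populations sum to 1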
R That the density matrix can be expressed in the basis of eigenstates does not imply that the ensemble can be represented as consisting of only eigenstates, as erroneously stated by Swendsen [Swe12]. Off-diagonal elements of the density matrix denote coherent superpositions of eigenstates, or coherences for short. This is not apparent in Swendsen's simple example, where coherence is averaged to zero by construction. The ensemble can be represented as consisting of only eigenstates if coherence is absent. In that case the density matrix is diagonal in the eigenbasis. Diagonal elements of the density matrix denote populations of basis states.
In quantum mechanics, it is well defined what information we can have about the macrostate
of a system, because quantum measurements are probabilistic even for a microstate. We can
observe only quantities that are quantum-mechanical observables, and these observables are represented by operators \hat{A}. It can be shown that the expectation value \langle \hat{A} \rangle of any observable can be computed from the density matrix by [SJ01; Swe12]

\langle \hat{A} \rangle = \mathrm{Trace}\left\{ \hat{\rho} \hat{A} \right\} , \qquad (6.6)

where we have used operator notation for \hat{\rho} to point out that \hat{\rho} and \hat{A} must be expressed in the same basis.
Since the expectation values of all observables are the full information that we can have
on a quantum system, the density matrix specifies the full information that we can have on
the ensemble. However, the density matrix does not fully specify the ensemble itself, i.e., we cannot infer the probabilities P(i) or the probability density function p(\psi) from the density matrix (Swendsen gives a simple example [Swe12]). This is another example of the information loss on microstates that comes about when we can only observe macrostates and that is conceptually
equivalent to entropy. The von-Neumann entropy can be computed from the density matrix by
Eq. (5.29).
We note that there is one important distinction between classical and quantum-mechanical
observations for an individual system. In the quantum case we can specify only an expectation
value, and the second and third Penrose postulates (Section 2.3.1) do not apply: neither can
we simultaneously measure all observables (they may be incompatible), nor is the outcome of
a later measurement independent of the current measurement. However, quantum uncertainty
is much smaller than measurement errors for the large ensembles that we treat by statistical
thermodynamics. Hence, the Penrose postulates apply to the quantum-mechanical ensembles
that represent macrostates, although they do not apply to the microstates.
If all systems in a quantum ensemble populate the same microstate, i.e., they correspond to
the same wavefunction, the ensemble is said to be in a pure state. A pure state corresponds to
minimum rather than maximum entropy. Otherwise the system is said to be in a mixed state.
distribution, partition function, entropy and all other state functions for classical systems from
the canonical ensemble anyway, we can simply ignore this problem. The canonical ensemble is
considered to be at thermal equilibrium with a heat bath (environment) of infinite size. It does
not matter whether this heat bath is of classical or quantum mechanical nature. For an infinitely
sized quantum system, the energy spectrum is continuous, which allows us to exchange energy
between the bath and any constituent system of the quantum canonical ensemble at will.
We can derive Boltzmann distribution and partition function for the density matrix by analogy
to the classical case. For that we consider the density matrix in the eigenbasis. The energies of the eigenstates are the eigenvalues \epsilon_i of the Hamiltonian \hat{H}. All arguments and mathematical steps from Section 4.3.1 still apply, with a single exception: Quantum mechanics allows for microstates that are coherent superpositions of eigenstates. The classical derivation carries over if and only if we can be sure that the equilibrium density matrix can be expressed without contributions from such microstates, which would lead to off-diagonal elements in the representation in the eigenbasis of \hat{H}. This argument can indeed be made. Any superposition of two eigenstates |n\rangle and |m\rangle with amplitudes |c_n| and |c_m| can be realized with arbitrary phase difference between the two eigenfunctions. The microstates with the same |c_n| and |c_m| but different phase difference all have the same energy. The entropy of a subensemble that populates these microstates is maximal if the distribution of phase differences is uniform in the interval [0, 2\pi). In that case \overline{c_m c_n^*} vanishes, i.e., such subensembles will not contribute off-diagonal elements to the equilibrium density matrix.
We can now arrange the e^{-\epsilon_i/k_B T} in matrix form,

\rho = e^{-\hat{H}/k_B T} , \qquad (6.7)

with the matrix elements \rho_{ii} = e^{-\epsilon_i/k_B T} and \rho_{ij} = 0 for i \neq j. The partition function is the sum of all the diagonal elements of this matrix, i.e. the trace of \rho. Hence,

\hat{\rho}_\mathrm{eq} = \frac{e^{-\hat{H}/k_B T}}{\mathrm{Trace}\left\{ e^{-\hat{H}/k_B T} \right\}} , \qquad (6.8)

where we have used operator notation. This implies that Eq. (6.8) can be evaluated in any basis, not only the eigenbasis of \hat{H}. In a different basis, e^{-\hat{H}/k_B T} needs to be computed as a matrix exponential and, in general, the density matrix \rho_\mathrm{eq} will have non-zero off-diagonal elements in such a different basis.
The quantum-mechanical partition function,

Z = \mathrm{Trace}\left\{ e^{-\hat{H}/k_B T} \right\} , \qquad (6.9)

is independent of the choice of basis, as the trace of a matrix is invariant under unitary transformations. Note that we have used a capital Z for a molecular partition function. This is appropriate, as the trace of \hat{\rho}_\mathrm{eq} in Eq. (6.8) is unity. In the eigenbasis, the diagonal elements of \rho_\mathrm{eq} are the populations of the eigenstates at thermal equilibrium. There is no coherence for a sufficiently large quantum ensemble at thermal equilibrium.
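A short numerical sketch (added here; the spin-1/2 Zeeman Hamiltonian and the field and temperature values are just an illustrative choice) evaluates Eqs. (6.8) and (6.9) with a matrix exponential and confirms that Z does not depend on the basis:

import numpy as np
from scipy.linalg import expm

kB = 1.380649e-23
T, g, muB, B0 = 4.2, 2.0023, 9.274e-24, 3.5     # K, g-factor, J/T, T
H = g * muB * B0 * 0.5 * np.diag([1.0, -1.0])   # electron Zeeman Hamiltonian in its eigenbasis

rho = expm(-H / (kB * T))
Z = np.trace(rho).real                          # Eq. (6.9)
rho_eq = rho / Z                                # Eq. (6.8)
print(np.diag(rho_eq))                          # equilibrium populations, no coherence

# Z is invariant under a unitary (here orthogonal) change of basis
U, _ = np.linalg.qr(np.random.default_rng(1).normal(size=(2, 2)))
print(Z, np.trace(expm(-(U @ H @ U.T) / (kB * T))).real)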
We note that the density matrix at thermal equilibrium can be derived in a stricter manner by explicitly considering a system that includes both the canonical ensemble and the heat bath and by either tracing out the degrees of freedom of the heat bath [Sch06] or relying on a series expansion that reduces to only two terms in the limit of an infinite heat bath [Sch06; Swe12]. When approaching zero absolute temperature, the matrix element of \rho in the eigenbasis that corresponds to the lowest energy \epsilon_i becomes much larger than all the others. At T = 0, the corresponding ground state is exclusively populated and the ensemble is in a pure state if there is
just one state with this energy. For T \to \infty, on the other hand, the differences between the diagonal matrix elements vanish and all states are equally populated. The ensemble is in a maximally mixed state.
{P_1, P_2, A_1, P_3, A_2, A_3} for three particles and three levels denotes a state where level A_1 is occupied by particles P_1 and P_2, level A_2 is occupied by particle P_3 and level A_3 is empty. With this convention the last energy level is necessarily the last element of the set (any particle standing right from it would not have an associated level), hence only (N_i + A_i - 1)! such permutations exist. Each permutation also encodes a sequence of particles, but the particles are indistinguishable. Thus we have to divide by N_i! in order to not double count configurations that we cannot distinguish. We also have to divide by (A_i - 1)!, since sets with a different sequence of the levels but the same occupation numbers of each individual level are not distinguishable. Alternatively, the division by (A_i - 1)! can be understood by noting that the energy levels are ordered by increasing energy and of all (A_i - 1)! permutations of the lowest A_i - 1 levels we consider only the one that is properly ordered. Hence, the number of configurations and thus the
number of microstates in the interval between \epsilon_i and \epsilon_i + \mathrm{d}\epsilon is

C_i = \frac{(N_i + A_i - 1)!}{N_i!\,(A_i - 1)!} . \qquad (6.10)
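Eq. (6.10) is easy to verify by brute force for small numbers (an added check, not part of the derivation): enumerate all occupation patterns of A_i levels by N_i indistinguishable bosons and compare with the closed formula:

import math
from itertools import product

def count_boson_configs(N, A):
    # number of ways to place N indistinguishable bosons on A distinguishable levels
    return sum(1 for occ in product(range(N + 1), repeat=A) if sum(occ) == N)

N, A = 3, 4
print(count_boson_configs(N, A))       # direct enumeration
print(math.comb(N + A - 1, N))         # (N + A - 1)! / (N! (A - 1)!), Eq. (6.10)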
The configurations in energy intervals with different indices i are independent of each other.
Hence, the statistical weight of a macrostate is

\Omega = \prod_i \frac{(N_i + A_i - 1)!}{N_i!\,(A_i - 1)!} . \qquad (6.11)

As the number of energy levels is, in practice, infinite, we can choose the A_i sufficiently large for neglecting the 1 in A_i - 1. In an exceedingly good approximation we can thus write

\Omega = \prod_i \frac{(N_i + A_i)!}{N_i!\,A_i!} . \qquad (6.12)
i
The next part of the derivation is the same as for the Boltzmann distribution in Section 4.3.1,
i.e., it relies on maximization of \ln\Omega using the Stirling formula and considering the constraints of conserved total particle number N = \sum_i N_i and conserved total energy of the system [WF12].
The initial result is of the form
\frac{N_i}{A_i} = \frac{1}{B e^{\beta\epsilon_i} - 1} , \qquad (6.13)
where B is related to the Lagrange multiplier \alpha by B = e^{\alpha} and thus to the chemical potential \mu by B = e^{-\mu/(k_B T)}. After a rather tedious derivation using the definitions of Boltzmann entropy and (\partial u/\partial s)_V = T we can identify \beta with 1/k_B T [WF12]. We refrain from reproducing this derivation here, as the argument is circular: it uses the identification of k with k_B in the definition of Boltzmann entropy that we had made earlier on somewhat shaky grounds. We accept the identification of |\beta| with 1/k_B T as general for this type of derivation, so that we finally have
\frac{N_i}{A_i} = \frac{1}{B e^{\epsilon_i/k_B T} - 1} . \qquad (6.14)
Up to this point we have supposed nothing else than a continuous, or at least sufficiently
dense, energy spectrum and identical bosons. To identify B we must have information on this
energy spectrum and thus specify a concrete physical problem. When using the density of states
for an ideal gas consisting of quantum particles with mass m in a box with volume V (see Section
7.2 for derivation),
V
D() = 4 2 3 m3/2 1/2 , (6.15)
h
we find, for the special case Bei /kB T 1,
(2mkB T )3/2 V
B= . (6.16)
h3 N
C_i = \frac{A_i!}{N_i!\,(A_i - N_i)!} \qquad (6.17)

and, considering mutual independence of the configurations in the individual energy intervals, the statistical weight of a macrostate for fermions is

\Omega = \prod_i \frac{A_i!}{N_i!\,(A_i - N_i)!} . \qquad (6.18)
Again, the next step of the derivation is analogous to derivation of the Boltzmann distribution
in Section 4.3.1 [WF12]. We find
\frac{N_i}{A_i} = \frac{1}{B e^{\epsilon_i/k_B T} + 1} . \qquad (6.19)
For the special case B e^{\epsilon_i/k_B T} \gg 1, B is again given by Eq. (6.16). Comparison of Eq. (6.19) with Eq. (6.14) reveals as the only difference the sign of the additional number 1 in the denominator on the right-hand side of the equations. In the regime B e^{\epsilon_i/k_B T} \gg 1, for which we have specified B, this difference is negligible.
It is therefore of interest when this regime applies. As \epsilon_i \geq 0 in the ideal gas problem, we have e^{\epsilon_i/k_B T} \geq 1, so that B \gg 1 is sufficient for the regime to apply. Wedler and Freund [WF12] have computed values of B according to Eq. (6.16) for the lightest ideal gas, H_2, and have found B \gg 1 for p = 1 bar down to T = 20 K and at ambient temperature for pressures up to p = 100 bar. For heavier molecules, B is larger under otherwise identical conditions. Whether a gas atom or molecule is a composite boson or fermion thus does not matter, except at very low temperatures and very high pressures. However, if conduction electrons in a metal, for instance in sodium, are considered as a gas, their much lower mass and higher number density N/V leads to B \ll 1 at ambient temperature and even at temperatures as high as 1000 K. Therefore, a gas model for conduction electrons (spin 1/2) must be set up with Fermi-Dirac statistics.
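Eq. (6.16) can be evaluated directly. The following sketch (added here; it assumes ideal-gas behavior, so that N/V = p/(k_B T)) reproduces the order of magnitude of B for H_2 under the conditions mentioned above:

import numpy as np

kB, h, NAv = 1.380649e-23, 6.62607015e-34, 6.02214076e23
m = 2.016e-3 / NAv                    # kg, mass of one H2 molecule

def B(T, p):
    # Eq. (6.16) with the ideal-gas number density N/V = p/(kB T)
    return (2.0 * np.pi * m * kB * T)**1.5 / h**3 * kB * T / p

print(B(20.0, 1.0e5))     # H2 at 20 K and 1 bar
print(B(298.0, 1.0e7))    # H2 at ambient temperature and 100 bar
# both values come out much larger than 1, consistent with the statement above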
configurations can be distinguished in the energy interval between \epsilon_i and \epsilon_i + \Delta\epsilon_i. Because the
particles are distinguishable (tagged), the configurations in the individual intervals are generally
not independent from each other, i.e. the total number of microstates does not factorize into
the individual numbers of microstates in the intervals. We obtain more configurations than that
because we have the additional choice of distributing the N tagged particles to r intervals. We
have already solved this problem in Section 4.3.1, the solution is Eq. (4.6). By considering the
additional number of choices, which enters multiplicatively, we find for the statistical weight of a
macrostate
\Omega = \frac{N!}{N_0!\,N_1!\,\ldots\,N_{r-1}!}\, A_0^{N_0} A_1^{N_1} \ldots A_{r-1}^{N_{r-1}} \qquad (6.21)

= N! \prod_i \frac{A_i^{N_i}}{N_i!} . \qquad (6.22)
It appears that we have assumed a countable number r of intervals, but as in the derivations
for the Bose-Einstein and Fermi-Dirac statistics, nothing prevents us from making the intervals
arbitrarily narrow and their number arbitrarily large.
Again, the next step in the derivation is analogous to derivation of the Boltzmann distribution in Section 4.3.1 [WF12]. All the different statistics differ only in the expressions for \Omega; constrained maximization of \ln\Omega uses the same Lagrange ansatz. We end up with
\frac{N_i}{A_i} = \frac{1}{B e^{\epsilon_i/k_B T}} . \qquad (6.23)
Comparison of Eq. (6.23) with Eqs. (6.14) and (6.19) reveals that, again, only the 1 in the denominator on the right-hand side makes the difference; now it is missing. In the regime where Bose-Einstein and Fermi-Dirac statistics coincide to a good approximation, both of them also coincide with Maxwell-Boltzmann statistics.
There are two caveats. First, we already know that the assumption of distinguishable particles leads to an artificial mixing entropy for two subsystems consisting of the same ideal gas or, in other words, to entropy not being extensive. This problem does not, however, influence the probability distribution; it only influences scaling of entropy with system size. We can solve it by an ad hoc correction when computing the system partition function from the molecular partition function. Second, to be consistent we should not use the previous expression for B, because it was derived under explicit consideration of quantization of momentum.^1 However, for Maxwell-Boltzmann statistics B can be eliminated easily. With \sum_i N_i = N we have from Eq. (6.23)
N = \frac{1}{B} \sum_i A_i e^{-\epsilon_i/k_B T} , \qquad (6.24)

which gives

\frac{1}{B} = \frac{N}{\sum_i A_i e^{-\epsilon_i/k_B T}} . \qquad (6.25)

P_i = \frac{N_i}{N} = \frac{A_i e^{-\epsilon_i/k_B T}}{\sum_i A_i e^{-\epsilon_i/k_B T}} . \qquad (6.26)
1 This is more a matter of taste than of substance. As long as B e^{\epsilon_i/k_B T} \gg 1, we can approximate any type of quantum statistics by Maxwell-Boltzmann statistics before solving for B. We are thus permitted to freely mix Maxwell-Boltzmann statistics with quantum-mechanical equations of motion.
Comparison of Eq. (6.26) with the Boltzmann distribution given by Eq. (4.25) reveals the
factors Ai as the only difference. Thus, the probability distribution for Maxwell-Boltzmann
statistics deviates from the most common form by the degree of degeneracy Ai of the individual
levels. This degeneracy entered the derivation because we assumed that within the intervals between \epsilon_i and \epsilon_i + \Delta\epsilon_i several levels exist. If \Delta\epsilon_i is finite, we speak of near degeneracy. For
quantum systems, degeneracy of energy levels is a quite common phenomenon even in small
systems where the energy spectrum is discrete. In order to describe such systems, the influence
of degeneracy on the probability distribution must be taken into account.
Concept 6.2.1 Degeneracy. In quantum systems with discrete energy levels there may exist
gi quantum states with the same energy i that do not coincide in all their quantum numbers.
This phenomenon is called degeneracy and gi the degree of degeneracy. A set of gi degenerate
levels can be populated by up to gi fermions. In the regime, where Boltzmann statistics is
applicable to the quantum system, the probability distribution considering such degeneracy is
given by
P_i = \frac{N_i}{N} = \frac{g_i e^{-\epsilon_i/k_B T}}{\sum_i g_i e^{-\epsilon_i/k_B T}} . \qquad (6.27)
The condition that degenerate levels do not coincide in all quantum numbers makes sure that the
Pauli exclusion principle does not prevent their simultaneous population with fermions.
At this point we can summarize the expected number of particles with chemical potential \mu at level i with energy \epsilon_i and arbitrary degeneracy g_i for Bose-Einstein, Fermi-Dirac, and Boltzmann statistics:

N_i = \frac{g_i}{e^{(\epsilon_i - \mu)/(k_B T)} - 1} \qquad \text{Bose-Einstein statistics} \qquad (6.29)

N_i = \frac{g_i}{e^{(\epsilon_i - \mu)/(k_B T)} + 1} \qquad \text{Fermi-Dirac statistics} \qquad (6.30)

N_i = \frac{g_i}{e^{(\epsilon_i - \mu)/(k_B T)}} \qquad \text{Boltzmann statistics} . \qquad (6.31)

Note that the chemical potential \mu in these equations is determined by the condition N = \sum_i N_i. The constant B in the derivations above is given by B = e^{-\mu/(k_B T)}. If N is not constant, we have \mu = 0 and thus B = 1.
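The three distributions are compared numerically below (an added illustration; energies are given in units of k_B T and the example values of (epsilon_i - mu)/k_B T are arbitrary):

import numpy as np

def occupation(x, g=1.0, statistics="Boltzmann"):
    # Eqs. (6.29)-(6.31) with x = (eps_i - mu)/(kB T)
    if statistics == "Bose-Einstein":
        return g / (np.exp(x) - 1.0)
    if statistics == "Fermi-Dirac":
        return g / (np.exp(x) + 1.0)
    return g * np.exp(-x)

x = np.array([0.5, 1.0, 2.0, 5.0, 10.0])       # (eps_i - mu)/(kB T)
for stat in ("Bose-Einstein", "Fermi-Dirac", "Boltzmann"):
    print(stat, occupation(x, statistics=stat))
# for (eps_i - mu) >> kB T the three expressions become practically indistinguishable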
\hat{H} = \frac{1}{2} f \hat{x}^2 + \frac{\hat{p}^2}{2\mu} , \qquad (6.39)

where the reduced mass is

\mu = \frac{m_A m_B}{m_A + m_B} \qquad (6.40)

and where the first term on the right-hand side of Eq. (6.39) corresponds to potential energy and the second term to kinetic energy.
Eq. (6.39) can be cast in the form

\hat{H} = \frac{1}{2} \mu\omega^2 (R - R_E)^2 - \frac{\hbar^2}{2\mu} \frac{\mathrm{d}^2}{\mathrm{d}R^2} , \qquad (6.41)

where we have substituted \hat{x} by the deviation of the atom-atom distance R from the bond length R_E and introduced the angular oscillation frequency of a classical oscillator with

\omega = \sqrt{\frac{f}{\mu}} . \qquad (6.42)
where we have substituted x = e^{-\hbar\omega/k_B T} and n = v to obtain the last line. Since for finite temperatures 0 < e^{-\hbar\omega/k_B T} < 1, the infinite series \sum_{n=0}^{\infty} x^n converges to 1/(1-x). Hence,

Z = \frac{e^{-\hbar\omega/2k_B T}}{1 - e^{-\hbar\omega/k_B T}} . \qquad (6.48)
We can again discuss the behavior for T \to 0. In the denominator, the argument of the exponential function approaches -\infty, so that the denominator approaches unity. In the numerator the argument of the exponential function also approaches -\infty, so that the partition function approaches zero and the Helmholtz free energy f = -k_B T \ln Z can only be computed as a limiting value. The term k_B \ln Z in the entropy equation (5.11) approaches -\infty.
This problem can be healed by shifting the energy scale by \Delta E = -\hbar\omega/2. We then have^2

Z = \frac{1}{1 - e^{-\hbar\omega/k_B T}} . \qquad (6.49)
With this shift, the partition function and the population of the ground state v = 0 both approach 1 when the temperature approaches zero. For the term u/T in the entropy expression we still need to consider a limiting value, but it can be shown that \lim_{T\to 0} u/T = 0. Since k_B \ln Z = 0 for Z = 1, the entropy of an ensemble of harmonic oscillators vanishes at the zero point, in agreement with Nernst's theorem. The Helmholtz free energy f = -k_B T \ln Z approaches zero.
R For computing a Boltzmann distribution we can shift all energy levels by the same offset \Delta E without influencing the P(i), as such a shift leads to a multiplication of the numerator and of all terms contributing to the partition function by the same factor. Such a shift can remove an infinity from the partition function.
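The remark can be verified numerically. The sketch below (added here; the vibrational frequency is an arbitrary assumed value) computes harmonic-oscillator populations with and without the zero-point shift and finds them identical:

import numpy as np

kB, hbar = 1.380649e-23, 1.0545718e-34
omega = 2.0 * np.pi * 3.0e13          # rad/s, assumed vibrational angular frequency
T = 300.0

def populations(shift, n_max=200):
    E = hbar * omega * (np.arange(n_max) + 0.5) + shift   # level energies, possibly shifted
    w = np.exp(-E / (kB * T))
    return w / w.sum()

print(np.allclose(populations(0.0), populations(-0.5 * hbar * omega)))   # True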
This partition function can also be expressed with a characteristic vibrational temperature

\theta_\mathrm{vib} = \frac{\hbar\omega}{k_B} . \qquad (6.50)
2The shift does not influence the denominator, as it merely removes the first factor on the right-hand side of Eq.
(6.45).
= \frac{3}{2} N\hbar\omega + \frac{3N\hbar\omega}{e^{\hbar\omega/k_B T} - 1} . \qquad (6.56)
With the characteristic vibrational temperature \theta_\mathrm{vib} introduced in Eq. (6.50) and by setting N = N_\mathrm{Av} to obtain a molar quantity, we find

U_\mathrm{vib} = \frac{3}{2} R\theta_\mathrm{vib} + \frac{3R\theta_\mathrm{vib}}{e^{\theta_\mathrm{vib}/T} - 1} . \qquad (6.57)
The molar heat capacity of an Einstein solid is the derivative of Uvib with respect to T . We
note that we do not need to specify constant volume or constant pressure, since this simple model
depends on neither of these quantities. We find
C_\mathrm{vib} = 3R \frac{(\theta_\mathrm{vib}/T)^2\, e^{\theta_\mathrm{vib}/T}}{\left(e^{\theta_\mathrm{vib}/T} - 1\right)^2} . \qquad (6.58)
According to the rule of Dulong and Petit we should obtain the value 3R for T \to \infty. Since the expression becomes indeterminate (0/0), we need to compute a limiting value, which is possible with the approach of de l'Hospital, where we separately differentiate the numerator and denominator [WF12]. The derivation is lengthy but it indeed yields the limiting value 3R:
\lim_{T\to\infty} C_\mathrm{vib} = \lim_{T\to\infty} 3R \frac{(\theta_\mathrm{vib}/T)^2\, e^{\theta_\mathrm{vib}/T}}{\left(e^{\theta_\mathrm{vib}/T} - 1\right)^2} \qquad (6.59)

= 3R \lim_{T\to\infty} \frac{2(\theta_\mathrm{vib}/T)\left(-\theta_\mathrm{vib}/T^2\right)}{2\left(e^{\theta_\mathrm{vib}/T} - 1\right) e^{\theta_\mathrm{vib}/T}\left(-\theta_\mathrm{vib}/T^2\right)} \qquad (6.60)

= 3R \lim_{T\to\infty} \frac{\theta_\mathrm{vib}/T}{e^{\theta_\mathrm{vib}/T} - 1} \qquad (6.61)

= 3R \lim_{T\to\infty} \frac{-\theta_\mathrm{vib}/T^2}{e^{\theta_\mathrm{vib}/T}\left(-\theta_\mathrm{vib}/T^2\right)} \qquad (6.62)

= 3R . \qquad (6.63)
In Eq. (6.59) in the numerator and in going from Eq. (6.60) to (6.61) we have set e^{\theta_\mathrm{vib}/T} to 1, as we may for T \to \infty. As the expression was still indeterminate, we have computed the derivatives of numerator and denominator once again in going from Eq. (6.61) to (6.62) and finally we have once more set e^{\theta_\mathrm{vib}/T} to 1 in going from Eq. (6.62) to (6.63). We see that Einstein's very simple model agrees with the rule of Dulong and Petit.
R The model of the Einstein solid differs from a model of NAv one-dimensional harmonic
oscillators according to Section 6.3.2 only by a power of 3 in the partition function, which,
after computing the logarithm, becomes a factor of 3 in the temperature-dependent term of
Uvib and thus in Cvib . Hence, in the high-temperature limit the vibrational contribution to
the molar heat capacity of a gas consisting of diatomic molecules is equal to R. It follows
that, in this limit, each molecule contributes an energy kB T to the internal energy, i.e.
each of the two degrees of freedom (potential and kinetic energy of the vibration) that are
quadratic in the coordinates contributes a term kB T /2. This agrees with the equipartition
theorem. Likewise, the Einstein solid agrees with this theorem.
From experiments it is known that the molar heat capacity approaches zero when temperature approaches zero. Again the limiting value can be computed by the approach of de l'Hospital [WF12], where this time we can neglect the 1 in e^{\theta_\mathrm{vib}/T} - 1, as e^{\theta_\mathrm{vib}/T} tends to infinity for T \to 0. In the last step we obtain

\lim_{T\to 0} C_\mathrm{vib} = 6R \lim_{T\to 0} \frac{1}{e^{\theta_\mathrm{vib}/T}} = 0 . \qquad (6.64)
Thus, the Einstein solid also agrees with the limiting behavior of heat capacity at very low
temperatures.
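Both limits of Eq. (6.58) are easy to check numerically (an added sketch; the Einstein temperature chosen below is an arbitrary but typical value):

import numpy as np

R = 8.314462618

def C_einstein(T, theta_vib):
    # Eq. (6.58)
    x = theta_vib / T
    return 3.0 * R * x**2 * np.exp(x) / (np.exp(x) - 1.0)**2

theta_vib = 240.0                      # K, assumed Einstein temperature
for T in (10.0, 100.0, 300.0, 3000.0):
    print(T, C_einstein(T, theta_vib))
# the values approach 0 for T << theta_vib and 3R = 24.9 J mol^-1 K^-1 for T >> theta_vib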
Nevertheless the model is too simple, and Einstein was well aware of that. Vibrations of
the individual atoms are not independent, but rather collective. The lattice vibrations, called
phonons, have a spectrum whose computation is outside the scope of the Einstein model. A
model that can describe this spectrum has been developed by Debye based on the density of states
of frequencies \nu. This density of states in turn has been derived by Rayleigh and Jeans based on the idea that the phonons are a system of standing waves in the solid. It is given by [WF12]

D(\nu) = \frac{4\pi\nu^2}{c^3} V . \qquad (6.65)
Debye replaced c by a mean velocity of wave propagation in the solid, considered one longitudinal and two transverse waves and only the 3N states with the lowest frequencies, as the solid has only 3N vibrational degrees of freedom. The maximum frequency \nu_\mathrm{max} defines the Debye temperature
\theta_\mathrm{D} = \frac{h\nu_\mathrm{max}}{k_B} . \qquad (6.66)
In this model the molar heat capacity of the solid becomes

C_\mathrm{vib} = 9R \left(\frac{T}{\theta_\mathrm{D}}\right)^3 \int_0^{\theta_\mathrm{D}/T} \frac{x^4 e^x}{(e^x - 1)^2}\, \mathrm{d}x . \qquad (6.67)
The integral can be evaluated numerically after series expansion and finally Debye's T^3 law,

\lim_{T\to 0} C_\mathrm{vib} = 233.8\, R\, \frac{T^3}{\theta_\mathrm{D}^3} , \qquad (6.68)
results. This law not only correctly describes that the heat capacity vanishes at absolute zero, it also correctly reproduces the scaling law, i.e., the T^3 dependence that is found experimentally. The high-temperature limit can also be obtained by series expansion and is again the Dulong-Petit value of 3R.
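The Debye integral in Eq. (6.67) can also be evaluated by straightforward numerical quadrature instead of a series expansion. The sketch below (added here; the Debye temperature is an assumed example value) recovers both limiting laws:

import numpy as np
from scipy.integrate import quad

R = 8.314462618

def C_debye(T, theta_D):
    # Eq. (6.67); the integrand is regular, its value at x = 0 is 0
    f = lambda x: 0.0 if x == 0.0 else x**4 * np.exp(x) / np.expm1(x)**2
    integral, _ = quad(f, 0.0, theta_D / T)
    return 9.0 * R * (T / theta_D)**3 * integral

theta_D = 428.0                                   # K, assumed Debye temperature
print(C_debye(10.0 * theta_D, theta_D))           # high T: close to 3R
print(C_debye(10.0, theta_D), 233.8 * R * (10.0 / theta_D)**3)   # low T: matches Debye's T^3 law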
The Debye model is still an approximation. Phonon spectra of crystalline solids are not featureless. They are approximated, but not fully reproduced, by a \nu^2 dependence. The deviations from the Debye model depend on the specific crystal structure.
7 Partition Functions of Gases
In many cases, the individual contributions are separable, i.e. the modes corresponding to
different types of motions can be treated independently. Roughly speaking, this results from a
separation of energy ranges (frequency bands) of the modes and a corresponding separation of
time scales. Nuclear spin degrees of freedom have much lower energy than rotational degrees
of freedom which usually have much lower energy than vibrational degrees of freedom which
have much lower energies than electronic excitation. The independence of nuclear and electron motion is the basis of the Born-Oppenheimer approximation, and the independence of rotational and vibrational motion is invoked when treating a molecule as a rigid rotor. Separability of
energy modes leads to a sum rule for the energy contributions for a single closed-shell molecule [Mac98],

\epsilon_j = \epsilon_{j,\mathrm{trs}} + \epsilon_{j,\mathrm{ns}} + \epsilon_{j,\mathrm{rot}} + \epsilon_{j,\mathrm{vib}} + \epsilon_{j,\mathrm{el}} , \qquad (7.1)

where \epsilon_{j,\mathrm{trs}}, \epsilon_{j,\mathrm{ns}}, \epsilon_{j,\mathrm{rot}}, \epsilon_{j,\mathrm{vib}}, and \epsilon_{j,\mathrm{el}} are the translational, nuclear spin, rotational, vibrational, and electronic contributions, respectively. For a monoatomic molecule (atom) \epsilon_{j,\mathrm{rot}} and \epsilon_{j,\mathrm{vib}} vanish. If both the number of neutrons and of protons in the nucleus is even, the nucleus has spin I = 0. In that case the nuclear spin contribution vanishes for an atom, even in the presence of an external magnetic field. If all nuclei have spin zero, the nuclear spin contribution also vanishes for a diatomic or multi-atomic molecule.
We see that the total partition function is the product of the partition functions corresponding to the individual modes. This consideration can be extended to multiple modes. With Eq. (7.1) it follows that

Z = Z_\mathrm{trs} Z_\mathrm{ns} Z_\mathrm{rot} Z_\mathrm{vib} Z_\mathrm{el} . \qquad (7.7)
By considering Eq. (5.17) or Eq. (5.18) we see that we can also compute the partition function for a given mode for all N particles before multiplying the modes. We have already seen that we must set z_\mathrm{trs} = Z_\mathrm{trs}^N/N! to heal the Gibbs paradox. What about the other, internal degrees of freedom? If two particles with different internal states are exchanged, they must be considered distinguishable, exactly because their internal state tags them. Hence, for all the other modes we have z = Z^N. Thus,
z = \frac{1}{N!} \left( Z_\mathrm{trs} Z_\mathrm{ns} Z_\mathrm{rot} Z_\mathrm{vib} Z_\mathrm{el} \right)^N \qquad (7.8)

= \frac{Z_\mathrm{trs}^N}{N!}\, Z_\mathrm{ns}^N Z_\mathrm{rot}^N Z_\mathrm{vib}^N Z_\mathrm{el}^N . \qquad (7.9)
Accordingly, we can consider each of the partition functions in turn. We also note that separability
of the energies implies factorization of the molecular wavefunction,
\epsilon_\mathrm{trs} = \frac{h^2}{8ma^2}\left( n_x^2 + n_y^2 + n_z^2 \right) \qquad (7.11)

= \frac{1}{2m}\left( p_x^2 + p_y^2 + p_z^2 \right) . \qquad (7.12)
It follows that momentum is also quantized with |pi | = (h/2a)ni (i = x, y, z). It is convenient
to consider momentum in a Cartesian frame where h/2a is the unit along the x, y, and z axes.
Each state characterized by a unique set of translational quantum numbers (nx , ny , nz ) owns a
small cube with volume h^3/8a^3 in the octant with x \geq 0, y \geq 0, and z \geq 0. Since momentum
can also be negative, we need to consider all eight octants, so that each state owns a cell in
momentum space with volume h3 /a3 . In order to go to phase space, we need to add the spatial
coordinates. The particle can move throughout the whole cube with volume a3 . Hence, each
state owns a phase space volume of h3 .
By rearranging Eq. (7.11) we can obtain an equation that must be fulfilled by the quantum numbers,

\frac{h^2}{8ma^2\epsilon}\left(n_x^2 + n_y^2 + n_z^2\right) \leq 1 , \qquad (7.13)

and by using Eq. (7.12) we can convert it to an equation that must be fulfilled by the components of the momentum vector,

\frac{1}{2m\epsilon}\left(p_x^2 + p_y^2 + p_z^2\right) \leq 1 . \qquad (7.14)

All states with quantum numbers that make the expression on the left-hand side of Eq. (7.13) or Eq. (7.14) smaller than 1 correspond to energies that are smaller than \epsilon. The momentum associated with these states lies in the sphere defined by Eq. (7.14) with radius \frac{1}{2}\sqrt{8m\epsilon} and
volume \frac{\pi}{6}(8m\epsilon)^{3/2}. With cell size h^3/a^3 in momentum space the number of cells with energies smaller than \epsilon is

N(\epsilon) = \frac{8\sqrt{2}\,\pi}{3} \frac{V}{h^3} (m\epsilon)^{3/2} , \qquad (7.15)

where we have substituted a^3 by the volume V of the box. The number of states in an energy interval between \epsilon and \epsilon + \mathrm{d}\epsilon is the first derivative of N(\epsilon) with respect to \epsilon and is the sought density of states,

D(\epsilon) = 4\sqrt{2}\,\pi \frac{V}{h^3} m^{3/2} \epsilon^{1/2} . \qquad (7.16)
The contributions along orthogonal spatial coordinates are also independent of each other and factorize. Hence,

Z_\mathrm{trs} = Z_{\mathrm{trs},x} Z_{\mathrm{trs},y} Z_{\mathrm{trs},z} = \left(\frac{2\pi m k_B T}{h^2}\right)^{3/2} V , \qquad (7.19)

where we have again substituted a^3 by V and, as by now usual, also \beta by 1/k_B T. The corresponding molar partition function is

z_\mathrm{trs} = \frac{1}{N_\mathrm{Av}!}\left[\left(\frac{2\pi m k_B T}{h^2}\right)^{3/2} V\right]^{N_\mathrm{Av}} . \qquad (7.20)
Concept 7.2.1 Number of accessible states. The molecular canonical partition function Z is
a measure for the number of states that are accessible to the molecule at a given temperature.
P_i = \frac{N_i}{N} = \frac{g_i e^{-\epsilon_i/k_B T}}{Z} \qquad (7.21)

R For T \to 0 only the g_0 lowest energy states are populated. In the absence of ground-state degeneracy, g_0 = 1, we find Z = 1 and with an energy scale where U(T = 0) = 0 we have S(0) = 0 in agreement with Nernst's theorem.
An expression for the translational contribution to the entropy of an ideal gas can be derived
from Eq. (7.19), Eq. (5.18), and Eq. (5.11). We know that u = 3N kB T /2, so that we only need
to compute ln ztrs ,
\ln z_\mathrm{trs} = \ln \frac{1}{N!} Z_\mathrm{trs}^N \qquad (7.22)

= -\ln N! + N \ln Z_\mathrm{trs} \qquad (7.23)

= -N \ln N + N + N \ln Z_\mathrm{trs} \qquad (7.24)

= N \left( 1 + \ln \frac{Z_\mathrm{trs}}{N} \right) , \qquad (7.25)

where we have used Stirling's formula to resolve the factorial. Thus we find
s = \frac{u}{T} + k_B \ln z \qquad (7.26)

= \frac{3}{2} N k_B + k_B N \left( 1 + \ln \frac{Z_\mathrm{trs}}{N} \right) \qquad (7.27)

= N k_B \left( \frac{5}{2} + \ln \frac{Z_\mathrm{trs}}{N} \right) . \qquad (7.28)
By using Eq. (7.19) we finally obtain the Sackur-Tetrode equation

s = N k_B \left\{ \frac{5}{2} + \ln\left[ \left(\frac{2\pi m k_B T}{h^2}\right)^{3/2} \frac{V}{N} \right] \right\} . \qquad (7.29)
To obtain the molar entropy S_\mathrm{m}, N has to be replaced by N_\mathrm{Av}. Volume can be substituted by pressure and temperature by noting that the molar volume is given by V_\mathrm{m} = RT/p = N_\mathrm{Av} V/N. With N_\mathrm{Av} k_B = R and the molar mass M = N_\mathrm{Av} m we obtain

S_\mathrm{m} = R \left\{ \frac{5}{2} + \ln\left[ \left(\frac{2\pi M k_B T}{N_\mathrm{Av} h^2}\right)^{3/2} \frac{RT}{N_\mathrm{Av}\, p} \right] \right\} . \qquad (7.30)
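Eq. (7.30) can be evaluated with a few lines of code (an added example; argon at 298.15 K and 1 bar is chosen because its entropy is almost purely translational):

import numpy as np

kB, h, NAv, R = 1.380649e-23, 6.62607015e-34, 6.02214076e23, 8.314462618

def S_m_trans(M, T, p):
    # Sackur-Tetrode equation in the form of Eq. (7.30); M is the molar mass in kg/mol
    arg = (2.0 * np.pi * M * kB * T / (NAv * h**2))**1.5 * R * T / (NAv * p)
    return R * (2.5 + np.log(arg))

print(S_m_trans(39.948e-3, 298.15, 1.0e5))   # about 155 J mol^-1 K^-1 for argon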
where the Ii are the nuclear spin quantum numbers for nuclei in the molecule. Magnetic
equivalence leads to degeneracy of nuclear spin levels, but does not influence the total number of
nuclear spin states. Since the term u/T in Eq. (5.11) is negligible and z_\mathrm{ns} = Z_\mathrm{ns}^N, we have

s = N k_B \sum_i \ln (2I_i + 1) . \qquad (7.32)
This contribution to entropy is not generally negligible. Still it is generally ignored in textbooks,
which usually does not cause problems, as the contribution is constant under most conditions
where experiments are conducted and does not change during chemical reactions.
\psi_{\mathrm{ns,T}+} = |\alpha\alpha\rangle \qquad (7.34)

\psi_{\mathrm{ns,T}0} = \frac{1}{\sqrt{2}}\left(|\alpha\beta\rangle + |\beta\alpha\rangle\right) \qquad (7.35)

\psi_{\mathrm{ns,T}-} = |\beta\beta\rangle . \qquad (7.36)
The translational, vibrational, and electronic wavefunctions are generally symmetric with respect to the exchange of the two nuclei. The rotational wavefunction is symmetric for even rotational
quantum numbers J and antisymmetric for odd quantum numbers. Hence, to ensure that the
generalized Pauli principle holds and the total wavefunction is antisymmetric with respect to
exchange of indistinguishable nuclei, even J can only be combined with the antisymmetric
nuclear spin singlet state and odd J only with the symmetric triplet state. The partition functions
for these two cases must be considered separately. For H2 we have
Z_\mathrm{para} = \sum_{J\ \mathrm{even}} (2J + 1)\, e^{-J(J+1)\hbar^2/2Ik_B T} , \qquad (7.37)

where g_J = 2J + 1 is the degeneracy of the rotational states and I is the moment of inertia, and

Z_\mathrm{ortho} = 3 \sum_{J\ \mathrm{odd}} (2J + 1)\, e^{-J(J+1)\hbar^2/2Ik_B T} , \qquad (7.38)
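The two partition functions are easily evaluated numerically. In the sketch below (an added illustration; the characteristic rotational temperature of H_2 is taken as roughly 88 K, a literature value), the equilibrium ortho fraction approaches the familiar 3:1 ratio at high temperature and vanishes at low temperature:

import numpy as np

theta_rot = 87.6                        # K, approximate characteristic rotational temperature of H2

def Z_para(T, J_max=40):
    J = np.arange(0, J_max, 2)          # even J, Eq. (7.37)
    return np.sum((2 * J + 1) * np.exp(-J * (J + 1) * theta_rot / T))

def Z_ortho(T, J_max=41):
    J = np.arange(1, J_max, 2)          # odd J, Eq. (7.38)
    return 3.0 * np.sum((2 * J + 1) * np.exp(-J * (J + 1) * theta_rot / T))

for T in (20.0, 77.0, 300.0):
    zo, zp = Z_ortho(T), Z_para(T)
    print(T, zo / (zo + zp))            # equilibrium fraction of ortho-H2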
Each rotamer has its own moment of inertia and, hence, its own set of states with respect to
total rotation of the molecule. Because the energy scales of internal and total rotations are not
well separated and because in larger molecules some vibrational modes may also have energies
in this range, the partition function is not usually separable for large molecules. In such cases,
insight into statistical thermodynamics can be best obtained by MD simulations. In the following,
we consider small molecules that can be assumed to behave as a rigid rotor. We first consider
diatomic molecules, where it certainly applies on the level of precision that we seek here.
The energy levels of a rigid rotor of a diatomic molecule are quantized by the rotational
quantum number J and given by
\epsilon_J = \frac{h^2}{8\pi^2 I} J(J+1) = hcB\, J(J+1) , \qquad (7.41)

where

I = \mu r^2 \qquad (7.42)

is the moment of inertia and

B = \frac{h}{8\pi^2 I c} \qquad (7.43)

is the rotational constant. After introducing the characteristic rotational temperature,

\theta_\mathrm{rot} = \frac{h^2}{8\pi^2 I k_B} = \frac{hcB}{k_B} , \qquad (7.44)
72 Partition Functions of Gases
we have

Z_\mathrm{rot} = \sum_{J=0}^{\infty} (2J + 1)\, e^{-J(J+1)\theta_\mathrm{rot}/T} . \qquad (7.46)

For sufficiently high temperatures and a sufficiently large moment of inertia, the density of states is sufficiently large for replacing the sum by an integral,

Z_\mathrm{rot} \approx \int_0^{\infty} (2J + 1)\, e^{-J(J+1)\theta_\mathrm{rot}/T}\, \mathrm{d}J = \frac{T}{\theta_\mathrm{rot}} . \qquad (7.47)
Deviations between the partition functions computed by Eq. (7.46) and Eq. (7.47) are visualized
in Figure 7.1. As state functions depend on ln Z, the continuum approximation is good for
T/\theta_\mathrm{rot} \gg 1, which applies to all gases, except at low temperatures for those that contain
hydrogen. At ambient temperature it can be used in general, except for a further correction that
we need to make because of symmetry considerations.
Figure 7.1: Continuum approximation for the rotational partition function. (A) Rotational partition
function obtained by the sum expression (7.46) (black line) and by the integral expression (7.47)
corresponding to the continuum approximation (red line). (B) Logarithm of the ratio between the rotational
partition functions obtained by the continuum approximation (7.47) and the exact sum formula (7.46).
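The comparison shown in Figure 7.1 can be reproduced with a few lines of code (an added sketch; the sum is truncated once further terms are negligible):

import numpy as np

def Z_rot_sum(T_over_theta, J_max=200):
    J = np.arange(J_max)
    return np.sum((2 * J + 1) * np.exp(-J * (J + 1) / T_over_theta))   # Eq. (7.46)

for x in (0.1, 0.3, 1.0, 3.0, 10.0):      # x = T/theta_rot
    exact = Z_rot_sum(x)
    approx = x                            # Eq. (7.47)
    print(x, exact, approx, np.log(approx / exact))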
counts accessible states, by two. In this example, we have deliberately chosen a case with only
one nuclear spin state. If nuclear spin states with different symmetry exist, all rotational states
are accessible, but correlated to the nuclear spin states, as we have seen before for dihydrogen. In
the following we consider the situation with only one nuclear spin state or for a fixed nuclear spin
state.
Table 7.1: Symmetry numbers corresponding to symmetry point groups [Her45; Iri98].
Although we have not yet discussed other complications for multi-atomic molecules, we generalize this concept of symmetry-accessible states by introducing a symmetry number \sigma. In general, \sigma is the number of distinct orientations of a rigid molecule that are distinguished only by interchange of identical atoms. For an NH_3 molecule, rotation about the C_3 axis by 120° generates one such orientation from the other. No other rotation axis exists. Hence, \sigma = 3 for NH_3. In general, the symmetry number can be obtained from the molecule's point group [Her45; Iri98], as shown in Table 7.1.
With the symmetry number, the continuum approximation (7.47) becomes

Z_\mathrm{rot} = \frac{T}{\sigma\theta_\mathrm{rot}} , \qquad (7.48)
where we still assume that symmetry is sufficiently high for assigning a single characteristic
temperature.
We further find

\ln Z_\mathrm{rot} = \ln T + \ln \frac{1}{\sigma\theta_\mathrm{rot}} , \qquad (7.49)

and with this

U_\mathrm{rot} = N_\mathrm{Av} k_B T^2 \left(\frac{\partial \ln Z_\mathrm{rot}}{\partial T}\right)_V \qquad (7.50)

= RT^2 \frac{\partial}{\partial T} \ln T \qquad (7.51)

= RT \quad \text{(linear)} . \qquad (7.52)
On going from Eq. (7.50) to (7.51) we could drop the second term on the right-hand side of Eq.
(7.49), as this term does not depend on temperature. This is a general principle: Constant factors
in the partition function do not contribute to internal energy. The result can be generalized to
multi-atomic linear molecules that also have two rotational degrees of freedom. This result is
expected from the equipartition theorem, as each degree of freedom should contribute a term
kB T /2 to the molecular or a term RT /2 to the molar internal energy. However, if we refrain
from the continuum approximation in Eq. (7.47) and numerically evaluate Eq. (7.46) instead,
we find a lower contribution for temperatures lower than or comparable to \theta_\mathrm{rot}. This is also a
general principle: Contributions of modes to internal energy and, by inference, to heat capacity,
are fully realized only at temperatures much higher than their characteristic temperature and are
negligible at temperatures much lower than their characteristic temperature.2
2The zero-point vibrational energy is an exception from this principle with respect to internal energy, but not heat
capacity.
For the rotation contribution to molar heat capacity at constant volume of a linear molecule
we obtain
C_{\mathrm{rot},V} = \frac{\partial U_\mathrm{rot}}{\partial T} = R . \qquad (7.53)
A non-linear multi-atomic molecule has, in general, three independent rotational moments of
inertia corresponding to three pairwise orthogonal directions a, b, c. With
\theta_{\mathrm{rot},\alpha} = \frac{h^2}{8\pi^2 I_\alpha k_B} \qquad (\alpha = a, b, c) \qquad (7.54)

one finds for the partition function

Z_\mathrm{rot} = \frac{\sqrt{\pi}}{\sigma}\left(\frac{T^3}{\theta_{\mathrm{rot},a}\,\theta_{\mathrm{rot},b}\,\theta_{\mathrm{rot},c}}\right)^{1/2} . \qquad (7.55)
For spherical-top molecules, all three moments of inertia are equal, Ia = Ib = Ic , and hence all
three characteristic temperatures are equal. For symmetric-top molecules, Ia = Ib 6= Ic .
The general equations for U_\mathrm{rot} and C_{\mathrm{rot},V} at sufficiently high temperature T \gg \theta_\mathrm{rot} are

U_\mathrm{rot} = \frac{d}{2} RT \qquad (7.56)

C_{\mathrm{rot},V} = \frac{d}{2} R , \qquad (7.57)
where d = 1 for a free internal rotation (e.g., about a CC bond), d = 2 for linear, and d = 3 for
non-linear molecules. We note that the contribution of a free internal rotation needs to be added
to the contribution from total rotation of the molecule.
The expressions for the rotational contributions to molar entropy are a bit lengthy and do not
provide additional physical insight. They can be easily derived from the appropriate expressions
for the rotational partition function and internal energy using Eq. (5.11). We note, however,
that the contribution from the u/T term in the entropy expression is dR/2 and the contribution
from ln Z is positive. Hence, at temperatures where the continuum approximation is valid, the
rotational contribution to entropy is larger than dR/2.
C_{\mathrm{vib},V} = \left(\frac{\partial U_\mathrm{vib}}{\partial T}\right)_V = R \left(\frac{\theta_\mathrm{vib}}{T}\right)^2 \frac{e^{\theta_\mathrm{vib}/T}}{\left(e^{\theta_\mathrm{vib}/T} - 1\right)^2} , \qquad (7.63)

which is called the Einstein equation. With the Einstein function,

F_\mathrm{E}(u) = \frac{u^2 e^u}{(e^u - 1)^2} , \qquad (7.64)

it can be written as

C_{\mathrm{vib},V} = R\, F_\mathrm{E}\!\left(\frac{\theta_\mathrm{vib}}{T}\right) . \qquad (7.65)
For computing the vibrational contribution to molar entropy we revert to the shifted energy scale. This is required, as inclusion of the zero-point contribution to u would leave us with an infinity. We find

S_{\mathrm{vib},i} = R\left[ \frac{\theta_{\mathrm{vib},i}/T}{e^{\theta_{\mathrm{vib},i}/T} - 1} - \ln\left(1 - e^{-\theta_{\mathrm{vib},i}/T}\right) \right] . \qquad (7.66)

Again contributions from individual normal modes add up. For \theta_{\mathrm{vib},i}/T \gg 1, which is the usual case, both terms in the brackets are much smaller than unity, so that the contribution of any
individual normal mode to entropy is much smaller than R. Hence, at ambient temperature the
vibrational contribution to entropy is negligible compared to the rotational contribution unless
the molecule contains heavy nuclei.
where n is the principal quantum number, z_q the nuclear charge, and R_E the Rydberg
constant. However, this is an exception. For molecules and all other neutral atoms closed
expressions for the energy levels cannot be found.
In most cases, the problem can be reduced to considering either only the electronic ground state with energy \epsilon_{\mathrm{el},0} or to considering only excitation to the first excited state with energy \epsilon_{\mathrm{el},1}. If we use an energy shift to redefine \epsilon_{\mathrm{el},0} = 0, we can define a characteristic electronic temperature

\theta_\mathrm{el} = \frac{\epsilon_{\mathrm{el},1}}{k_B} . \qquad (7.68)
Characteristic electronic temperatures are usually of the order of several thousand Kelvin. Hence, in most cases \theta_\mathrm{el} \gg T applies, only the electronic ground state is accessible, and thus

Z_\mathrm{el} = g_{\mathrm{el},0} , \qquad (7.69)

where g_{\mathrm{el},0} is the degeneracy of the electronic ground state. We note that spatial degeneracy of
the electronic ground state cannot exist in non-linear molecules, according to the Jahn-Teller
theorem. However, a spatially non-degenerate ground state can still be spin-degenerate.
In molecules, total orbital angular momentum is usually quenched (\Lambda = 0, \Sigma ground state). In that case

Z_\mathrm{el} = 2S + 1 , \qquad (7.70)

where S is the electron spin quantum number. For the singlet ground state of a closed-shell molecule (S = 0) we have Z_\mathrm{el} = 1, which means that the electronic contribution to the partition function is negligible. The contribution to internal energy and heat capacity is generally negligible for \theta_\mathrm{el} \gg T. The electronic contribution to molar entropy,

S_\mathrm{el} = R \ln (2S + 1) , \qquad (7.71)
is not negligible for open-shell molecules or atoms with S > 0. At high magnetic fields and low temperatures, e.g. at T < 4.2 K and B_0 = 3.5 T, where the high-temperature approximation for electron spin states no longer applies, the electronic partition function and corresponding energy contribution are smaller than given in Eq. (7.70). For a doublet ground state (S = 1/2)
the problem can be solved with the treatment that we have given in Section 6.3.1. For S > 1/2
the spin substates of the electronic ground state are not strictly degenerate even at zero magnetic
field, but split by the zero-field splitting, which may exceed thermal energy in some cases. In that
case Eq. (7.70) does not apply and the electronic contribution to the partition function depends
on temperature. Accordingly, there is a contribution to internal energy and to heat capacity.
For a \Lambda > 0 species with term symbol ^{2S+1}\Lambda, each component is doubly degenerate [SHM12]. For instance, for NO with a \Pi ground state (\Lambda = 1), both the ^2\Pi_{1/2} and the ^2\Pi_{3/2} state are doubly degenerate. As the ^2\Pi_{3/2} state is only 125 cm^{-1} above the ground state, the characteristic temperature for electronic excitation is \theta_\mathrm{el} = 178 K. In this situation, Eq. (7.69) does not apply at ambient temperature. The energy gap to the next excited state, on the other hand, is very large. Thus, we have

Z_\mathrm{el} = 2 + 2 e^{-\theta_\mathrm{el}/T} . \qquad (7.72)
This equation is fairly general, since higher excited states rarely need to be considered. The electronic
contribution to the heat capacity of NO derived from Eq. (7.72) is in good agreement with
experimental data from Eucken and d'Or [Mac98].
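As a numerical illustration of this two-level situation, the sketch below evaluates Eq. (7.72) for NO with $\Theta_\mathrm{el} = 178$ K and $g_{\mathrm{el},0} = g_{\mathrm{el},1} = 2$ and computes the corresponding electronic heat capacity. The closed Schottky-type expression used for the heat capacity is a standard result for a two-level system and is added here for illustration; it is not derived in the text.

import numpy as np

R = 8.314462618   # J mol^-1 K^-1
THETA_EL = 178.0  # characteristic electronic temperature of NO in K (from the text)
G0, G1 = 2, 2     # both the 2Pi_1/2 and the 2Pi_3/2 components are doubly degenerate

def z_el(T):
    # Two-level electronic partition function, Eq. (7.72)
    return G0 + G1 * np.exp(-THETA_EL / T)

def c_el(T):
    # Schottky heat capacity of the two-level system (standard closed form, assumed here)
    x = THETA_EL / T
    b = np.exp(-x)
    return R * x**2 * G0 * G1 * b / (G0 + G1 * b)**2

T = np.array([50.0, 100.0, 178.0, 300.0, 600.0])
print(np.column_stack([T, z_el(T), c_el(T)]))

The electronic heat capacity contribution passes through a maximum at a temperature below $\Theta_\mathrm{el}$ and vanishes both for $T \ll \Theta_\mathrm{el}$ and $T \gg \Theta_\mathrm{el}$.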
7.7 Equilibrium Constant for Gas Reactions
We consider the general gas-phase reaction
$$|\nu_\mathrm{A}|\,\mathrm{A} + |\nu_\mathrm{B}|\,\mathrm{B} \rightleftharpoons |\nu_\mathrm{C}|\,\mathrm{C} + |\nu_\mathrm{D}|\,\mathrm{D}\,. \qquad (7.73)$$
At equilibrium,
$$\Delta G = 0\,, \qquad (7.74)$$
hence
$$\sum_i \nu_i \mu_i = \nu_\mathrm{A}\mu_\mathrm{A} + \nu_\mathrm{B}\mu_\mathrm{B} + \nu_\mathrm{C}\mu_\mathrm{C} + \nu_\mathrm{D}\mu_\mathrm{D} = 0\,, \qquad (7.75)$$
where the $\mu_i$ are molar chemical potentials. To solve this problem, we do not need to explicitly
work with the grand canonical ensemble, as we can compute the $\mu_i$ from the results that we have
already obtained for the canonical ensemble. According to one of Gibbs' fundamental equations,
which we derived in the lecture course on phenomenological thermodynamics, we have
$$\mathrm{d}f = -s\,\mathrm{d}T - p\,\mathrm{d}V + \sum_i \mu_i\,\mathrm{d}n_i\,. \qquad (7.76)$$
It follows that
$$\mu_i = \left(\frac{\partial f}{\partial n_i}\right)_{T,V}\,, \qquad (7.77)$$
a result that we had also obtained in the lecture course on phenomenological thermodynamics.
Using Eq. (5.21), Eq. (5.18), and Stirling's formula, we obtain for the contribution $f_i$ of an
individual chemical species to the Helmholtz free energy
$$f_i = -k_\mathrm{B} T \ln z_i \qquad (7.78)$$
$$= -k_\mathrm{B} T \ln \frac{1}{N_i!} Z_i^{N_i} \qquad (7.79)$$
$$= -k_\mathrm{B} T \left( N_i \ln Z_i - N_i \ln N_i + N_i \right) \qquad (7.80)$$
$$= -n_i R T \ln \frac{Z_i}{n_i N_\mathrm{Av}} - n_i R T\,, \qquad (7.81)$$
where $n_i$ is the amount of substance (mol). Eq. (7.77) then gives
$$\mu_i = n_i R T \frac{1}{n_i} - R T \ln \frac{Z_i}{n_i N_\mathrm{Av}} - R T \qquad (7.82)$$
$$= -R T \ln \frac{Z_i}{n_i N_\mathrm{Av}} \qquad (7.83)$$
$$= -R T \ln \frac{Z_i}{N_i}\,. \qquad (7.84)$$
Eq. (7.84) expresses the dependence of the chemical potential, a molar property, on the
molecular partition function. It may appear odd that this property depends on the absolute
number of molecules $N_i$, but it is exactly this dependence that introduces the contribution of
mixing entropy that counterbalances the differences in the standard chemical potentials $\mu_i^\ominus$.
Because of our habit of shifting energies by $\epsilon_{\mathrm{el},0}$ and by the zero-point vibrational energies,
we cannot directly apply Eq. (7.84). We can avoid explicit dependence on the $\epsilon_{\mathrm{el},0,i}$ and the
zero-point vibrational energies by
relying on Hess's law and referencing the energies of all molecules to the state where they are fully
dissociated into atoms. The energies $\epsilon_{i,\mathrm{diss}}$ for the dissociated states can be defined at 0 K. We
find
$$|\nu_\mathrm{A}|\,\mu_\mathrm{A} + |\nu_\mathrm{B}|\,\mu_\mathrm{B} = |\nu_\mathrm{C}|\,\mu_\mathrm{C} + |\nu_\mathrm{D}|\,\mu_\mathrm{D}\,, \qquad (7.89)$$
which gives, after inserting Eq. (7.84) for the chemical potentials, an equilibrium constant expressed
in terms of the particle numbers $N_i$ and the molecular partition functions $Z_i$.
The dependence on volume arises from the dependence of the canonical partition functions on
volume.
By dividing each particle number $N_i$ by $N_\mathrm{Av}$ and by the volume $V$, i.e. by converting to molar
concentrations $c_i = N_i/(N_\mathrm{Av} V)$, we obtain the equilibrium constant $K_c(T)$ in molar concentrations,
$$K_c(T) = \frac{Z_\mathrm{C}^{|\nu_\mathrm{C}|}\, Z_\mathrm{D}^{|\nu_\mathrm{D}|}}{Z_\mathrm{A}^{|\nu_\mathrm{A}|}\, Z_\mathrm{B}^{|\nu_\mathrm{B}|}}\, \left(N_\mathrm{Av} V\right)^{-\sum_i \nu_i}\, e^{-\Delta U_0/R T}\,. \qquad (7.96)$$
By dividing them by the total particle number $N = \sum_i N_i$ to the power of $\nu_i$ we obtain
$$K_x(V,T) = \frac{Z_\mathrm{C}^{|\nu_\mathrm{C}|}\, Z_\mathrm{D}^{|\nu_\mathrm{D}|}}{Z_\mathrm{A}^{|\nu_\mathrm{A}|}\, Z_\mathrm{B}^{|\nu_\mathrm{B}|}}\, N^{-\sum_i \nu_i}\, e^{-\Delta U_0/R T}\,, \qquad (7.97)$$
which coincides with the thermodynamical equilibrium constant $K^\ominus$ at the standard pressure $p^\ominus$.
The most useful equilibrium constant for gas-phase reactions is obtained by inserting $p_i = c_i R T$
into Eq. (7.96):
$$K_p(T) = \frac{Z_\mathrm{C}^{|\nu_\mathrm{C}|}\, Z_\mathrm{D}^{|\nu_\mathrm{D}|}}{Z_\mathrm{A}^{|\nu_\mathrm{A}|}\, Z_\mathrm{B}^{|\nu_\mathrm{B}|}}\, \left(\frac{R T}{N_\mathrm{Av} V}\right)^{\sum_i \nu_i}\, e^{-\Delta U_0/R T}\,. \qquad (7.98)$$
For each molecular species, the molecular partition function is a product of the contributions
from individual modes, Eq. (7.7), that we have discussed above. In the expression for equilibrium
constants, the nuclear-spin contribution cancels out since the number of nuclei and their spins
are the same on both sides of the reaction equation. Symmetry requirements on the nuclear
wavefunction are considered in the symmetry numbers $\sigma_i$ for the rotational partition function.
The electronic contribution often reduces to the degeneracy of the electronic ground state, and in
the vibrational contribution, normal modes with $\Theta_{\mathrm{vib},i} > 5T$ can be neglected.
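The following sketch indicates how Eq. (7.98) could be assembled numerically once the molecular partition functions of all species are available. The function signature, the dictionary bookkeeping, and the numbers in the example call are illustrative assumptions; the text itself does not prescribe an implementation.

import numpy as np

R = 8.314462618        # molar gas constant, J mol^-1 K^-1
N_AV = 6.02214076e23   # Avogadro constant, mol^-1

def kp(T, V, Z, nu, delta_U0):
    # Z        : dict, species -> molecular partition function Z_i evaluated for volume V (m^3)
    # nu       : dict, species -> signed stoichiometric coefficient (products > 0, educts < 0)
    # delta_U0 : molar reaction energy at 0 K in J mol^-1, referenced according to Hess's law
    ln_k = sum(n * np.log(Z[s]) for s, n in nu.items())
    ln_k += sum(nu.values()) * np.log(R * T / (N_AV * V))  # factor (RT/(N_Av V))^sum(nu_i)
    ln_k += -delta_U0 / (R * T)
    return np.exp(ln_k)

# Hypothetical isomerization A <-> C (sum of nu_i = 0, so Kp is dimensionless):
print(kp(300.0, 1e-3, {"A": 1.2e30, "C": 3.5e30}, {"A": -1, "C": 1}, 5.0e3))

Note that each $Z_i$ must be computed for the same volume $V$ that enters the prefactor, since the translational contribution to the molecular partition function is proportional to $V$.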
8 Macromolecules
$$\phi_\mathrm{A} = \frac{V_\mathrm{A}}{V_\mathrm{A} + V_\mathrm{B}} \qquad (8.1)$$
$$\phi_\mathrm{B} = \frac{V_\mathrm{B}}{V_\mathrm{A} + V_\mathrm{B}} = 1 - \phi_\mathrm{A}\,. \qquad (8.2)$$
In our example we assign the lattice site a volume v0 , which cannot be larger than the volume
required for one molecule of the smaller component in the mixture. The other component
may then also occupy a single site (similarly sized components) or several lattice sites. A
macromolecule with a large degree of polymerization consists of a large number of monomers
and will thus occupy a large number of lattice sites. The molecular volumes of the species are
$$v_\mathrm{A} = N_\mathrm{A} v_0 \qquad (8.3)$$
$$v_\mathrm{B} = N_\mathrm{B} v_0\,, \qquad (8.4)$$
where $N_\mathrm{A}$ and $N_\mathrm{B}$ are the numbers of sites occupied by one molecule of species A and B,
respectively. We consider the three simple cases listed in Table 8.1. Regular solutions are
mixtures of two low molecular weight species with $N_\mathrm{A} = N_\mathrm{B} = 1$. Polymer solutions are
mixtures of one type of macromolecule ($N_\mathrm{A} = N \gg 1$) with a solvent, whose molecular
volume defines the lattice site volume $v_0$ ($N_\mathrm{B} = 1$). Polymer blends correspond to the general
case $1 \neq N_\mathrm{A} \neq N_\mathrm{B} \neq 1$. They are mixtures of two different species of macromolecules, so that
$N_\mathrm{A}, N_\mathrm{B} \gg 1$.
                      $N_\mathrm{A}$     $N_\mathrm{B}$
Regular solutions     1                  1
Polymer solutions     $N$                1
Polymer blends        $N_\mathrm{A}$     $N_\mathrm{B}$
Table 8.1: Number of lattice sites occupied per molecule in different types of mixtures [RC03].
$$S = k_\mathrm{B} \ln \Omega\,, \qquad (8.6)$$
where $\Omega$ is the number of ways in which the molecules can be arranged on the lattice (number of
microstates). In a homogeneous mixture, a molecule or monomer of component A can occupy
any of the $n$ lattice sites. Before mixing, it can occupy only the $\phi_\mathrm{A} n$ lattice sites in volume $V_\mathrm{A}$.
Hence, the entropy change for one molecule of species A is
$$\Delta S_\mathrm{A} = k_\mathrm{B} \ln n - k_\mathrm{B} \ln \phi_\mathrm{A} n \qquad (8.7)$$
$$= k_\mathrm{B} \ln \frac{n}{\phi_\mathrm{A} n} \qquad (8.8)$$
$$= -k_\mathrm{B} \ln \phi_\mathrm{A}\,. \qquad (8.9)$$
Summing the contributions of all molecules of both species yields the total entropy of mixing,
$$\Delta S_\mathrm{mix} = -k_\mathrm{B} \left( n_\mathrm{A} \ln \phi_\mathrm{A} + n_\mathrm{B} \ln \phi_\mathrm{B} \right)\,, \qquad (8.10)$$
where $n_\mathrm{A}$ and $n_\mathrm{B}$ are the numbers of molecules of the two species. We note the analogy with
the expression that we had obtained in phenomenological thermodynamics for an ideal mixture
of ideal gases, where we had used the molar fraction $x_i$ instead of the volume fraction $\phi_i$. For
ideal gases, $V_i \propto n_i$ and thus $\phi_i = x_i$. Eq. (8.10) generalizes the result to any ideal mixture in
condensed phase. The mixture is ideal because we did not yet consider the energy of mixing and
thus could get away with using a microcanonical ensemble.
For discussion it is useful to convert the extensive quantity $\Delta S_\mathrm{mix}$ to the intensive entropy of
mixing per lattice site,
$$\Delta \bar{S}_\mathrm{mix} = -k_\mathrm{B} \left[ \frac{\phi_\mathrm{A}}{N_\mathrm{A}} \ln \phi_\mathrm{A} + \frac{\phi_\mathrm{B}}{N_\mathrm{B}} \ln \phi_\mathrm{B} \right]\,, \qquad (8.11)$$
where we have used the number of molecules per species $n_i = n \phi_i / N_i$ and normalized by the
total number $n$ of lattice sites.
For a regular solution with $N_\mathrm{A} = N_\mathrm{B} = 1$ we obtain the largest entropy of mixing at given
volume fractions of the components,
$$\Delta \bar{S}_\mathrm{mix} = -k_\mathrm{B} \left( \phi_\mathrm{A} \ln \phi_\mathrm{A} + \phi_\mathrm{B} \ln \phi_\mathrm{B} \right)\,. \qquad (8.12)$$
For a polymer solution ($N_\mathrm{A} = N$, $N_\mathrm{B} = 1$),
$$\Delta \bar{S}_\mathrm{mix} = -k_\mathrm{B} \left( \frac{\phi_\mathrm{A}}{N} \ln \phi_\mathrm{A} + \phi_\mathrm{B} \ln \phi_\mathrm{B} \right) \qquad (8.13)$$
$$\approx -k_\mathrm{B}\, \phi_\mathrm{B} \ln \phi_\mathrm{B}\,, \qquad (8.14)$$
where the approximation by Eq. (8.14) holds for $\phi_\mathrm{B} \gg 1/N$, i.e. for dissolving a polymer and
even for any appreciable swelling of a high-molecular-weight polymer by a solvent. For polymer
blends, Eq. (8.11) holds with $N_\mathrm{A}, N_\mathrm{B} \gg 1$. Compared to formation of a regular solution or a
polymer solution, the mixing entropy for a polymer blend is negligibly small, which qualitatively
explains the difficulty of producing such polymer blends. Nevertheless, the entropy of mixing is
always positive, and thus the Helmholtz free energy of mixing $\Delta \bar{F}_\mathrm{mix} = -T \Delta \bar{S}_\mathrm{mix}$ is always
negative, so that an ideal mixture of two polymers should form spontaneously. To see what happens
in real mixtures, we have to consider the energetics of mixing.
Before doing so, we note the limitations of the simple lattice model. We have neglected
conformational entropy of the polymer, which will be discussed in Section 8.2.3. This amounts to
the assumption that conformational entropy does not change on mixing. For blends of polymers,
this is a very good assumption, whereas in polymer solutions there is often an excluded volume
that reduces conformational space. We have also neglected the small volume change that occurs
on mixing, most notably for regular solutions. For polymer solutions and blends this volume
change is very small.
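Returning to Eq. (8.11), a minimal numerical sketch makes the comparison between the three cases of Table 8.1 explicit; the degree of polymerization N = 1000 is an arbitrary illustrative value.

import numpy as np

def delta_s_mix_per_site(phi_A, N_A, N_B):
    # Entropy of mixing per lattice site, Eq. (8.11), in units of k_B
    phi_B = 1.0 - phi_A
    return -(phi_A / N_A * np.log(phi_A) + phi_B / N_B * np.log(phi_B))

phi = 0.5
print("regular solution :", delta_s_mix_per_site(phi, 1, 1))        # about 0.69 k_B
print("polymer solution :", delta_s_mix_per_site(phi, 1000, 1))     # about 0.35 k_B
print("polymer blend    :", delta_s_mix_per_site(phi, 1000, 1000))  # about 7e-4 k_B

The mixing entropy per site of the blend is three orders of magnitude smaller than that of the regular solution, in line with the discussion above.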
a macroscopic system. The mean-field interaction energy per lattice site occupied by an A unit is
thus
To continue, we need to specify the lattice, as the number of sites a adjacent to the site
under consideration depends on that. For a cubic lattice we would have a = 6. We keep a as a
parameter in the hope that we can eliminate it again at a later stage. If we compute a weighted
sum of the expressions (8.15) and (8.16), we double count each pairwise interaction, as we will
encounter it twice. Hence, the total interaction energy of the mixture is
$$u = \frac{a n}{2} \left[ \phi_\mathrm{A} U_\mathrm{A} + (1 - \phi_\mathrm{A})\, U_\mathrm{B} \right]\,, \qquad (8.17)$$
where we have used the probability $\phi_\mathrm{A}$ of encountering a site occupied by a unit A and $(1-\phi_\mathrm{A})$
of encountering a site occupied by a unit B. By inserting Eqs. (8.15) and (8.16) into Eq. (8.17)
and abbreviating $\phi_\mathrm{A} = \phi$, we obtain
$$u = \frac{a n}{2} \left\{ \phi \left[ \phi\, u_\mathrm{AA} + (1-\phi)\, u_\mathrm{AB} \right] + (1-\phi) \left[ \phi\, u_\mathrm{AB} + (1-\phi)\, u_\mathrm{BB} \right] \right\} \qquad (8.18)$$
$$= \frac{a n}{2} \left[ \phi^2 u_\mathrm{AA} + 2 \phi (1-\phi)\, u_\mathrm{AB} + (1-\phi)^2 u_\mathrm{BB} \right]\,. \qquad (8.19)$$
Before mixing, the interaction energy per site in pure A is $a u_\mathrm{AA}/2$ and in pure B it is $a u_\mathrm{BB}/2$.
Hence, the total interaction energy before mixing is
$$u_0 = \frac{a n}{2} \left[ \phi\, u_\mathrm{AA} + (1-\phi)\, u_\mathrm{BB} \right]\,, \qquad (8.20)$$
so that we obtain for the energy change $\Delta u = u - u_0$ on mixing
$$\Delta u = \frac{a n}{2} \left[ \phi^2 u_\mathrm{AA} + 2\phi(1-\phi)\, u_\mathrm{AB} + (1-\phi)^2 u_\mathrm{BB} - \phi\, u_\mathrm{AA} - (1-\phi)\, u_\mathrm{BB} \right] \qquad (8.21)$$
$$= \frac{a n}{2} \left[ \left(\phi^2 - \phi\right) u_\mathrm{AA} + 2\phi(1-\phi)\, u_\mathrm{AB} + \left(1 - 2\phi + \phi^2 - 1 + \phi\right) u_\mathrm{BB} \right] \qquad (8.22)$$
$$= \frac{a n}{2} \left[ \phi\left(\phi - 1\right) u_\mathrm{AA} + 2\phi(1-\phi)\, u_\mathrm{AB} + \phi\left(\phi - 1\right) u_\mathrm{BB} \right] \qquad (8.23)$$
$$= \frac{a n}{2}\, \phi (1-\phi) \left( 2 u_\mathrm{AB} - u_\mathrm{AA} - u_\mathrm{BB} \right)\,. \qquad (8.24)$$
We again normalize by the number n of lattice sites to arrive at the energy change per site on
mixing:
$$\Delta \bar{U}_\mathrm{mix} = \frac{a}{2}\, \phi (1-\phi) \left( 2 u_\mathrm{AB} - u_\mathrm{AA} - u_\mathrm{BB} \right)\,. \qquad (8.25)$$
For discussion we need an expression that characterizes the mixing energy per lattice site as
a function of composition and that can easily be combined with the mixing entropy to a free
energy. The Flory interaction parameter,
$$\chi = \frac{a}{2}\, \frac{2 u_\mathrm{AB} - u_\mathrm{AA} - u_\mathrm{BB}}{k_\mathrm{B} T}\,, \qquad (8.26)$$
elegantly eliminates the number of adjacent lattice sites and provides just such an expression:
$$\Delta \bar{U}_\mathrm{mix} = \chi\, \phi (1-\phi)\, k_\mathrm{B} T\,. \qquad (8.27)$$
Introducing such a parameter is an often-used trick when working with crude models. If the
parameter is determined experimentally, the expression may fit data quite well, because part of
the deviations of reality from the model can be absorbed by the parameter and its dependence on
state variables. We finally obtain the Flory-Huggins equation for the Helmholtz free energy of
mixing, $\Delta \bar{F}_\mathrm{mix} = \Delta \bar{U}_\mathrm{mix} - T \Delta \bar{S}_\mathrm{mix}$,
$$\Delta \bar{F}_\mathrm{mix} = k_\mathrm{B} T \left[ \frac{\phi}{N_\mathrm{A}} \ln \phi + \frac{1-\phi}{N_\mathrm{B}} \ln (1-\phi) + \chi\, \phi (1-\phi) \right]\,. \qquad (8.28)$$
As the entropy contribution (the first two terms in the brackets on the right-hand side of Eq.
(8.28)) to $\Delta \bar{F}_\mathrm{mix}$ is always negative, entropy always favors mixing. The sign of $\Delta \bar{F}_\mathrm{mix}$ depends
on the sign of the Flory parameter $\chi$ and on the ratio between the energy and entropy terms. The
Flory parameter is negative, and thus favors mixing, if $2 u_\mathrm{AB} < u_\mathrm{AA} + u_\mathrm{BB}$, i.e., if the interaction in
AB pairs is more attractive than the mean interaction in AA and BB pairs. Such cases occur,
but are rare. In most cases, the Flory parameter is positive. Since the entropy terms are very
small for polymer blends, such blends tend to phase separate. In fact, high molecular weight
poly(styrene) with natural isotope abundance phase separates from deuterated poly(styrene).
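A short sketch of Eq. (8.28) for a symmetric blend shows how a modest positive Flory parameter overwhelms the tiny mixing entropy; the values of N and chi below are illustrative assumptions.

import numpy as np

def f_mix_per_site(phi, N_A, N_B, chi):
    # Flory-Huggins free energy of mixing per lattice site, Eq. (8.28), in units of k_B T
    return (phi / N_A * np.log(phi)
            + (1.0 - phi) / N_B * np.log(1.0 - phi)
            + chi * phi * (1.0 - phi))

phi = np.linspace(0.01, 0.99, 99)
for chi in (0.0, 0.01, 0.05):
    # the largest value over the composition range turns positive once chi is large enough
    print(chi, f_mix_per_site(phi, 100, 100, chi).max())

Wherever $\Delta \bar{F}_\mathrm{mix}$ is positive, the homogeneous mixture at that composition is disfavored relative to the unmixed components.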
This scalar product is of interest, as we can use it to compute the mean-square end-to-end distance
$\langle R^2 \rangle$ of an ensemble of chains, which is the simplest parameter that characterizes the spatial
dimension of the chain. With the end-to-end distance vector of a chain with $n$ bonds,
$$\vec{R}_n = \sum_{i=1}^{n} \vec{r}_i\,, \qquad (8.30)$$
we have
$$\langle R^2 \rangle = \langle \vec{R}_n^2 \rangle \qquad (8.31)$$
$$= \langle \vec{R}_n \cdot \vec{R}_n \rangle \qquad (8.32)$$
$$= \left\langle \left( \sum_{i=1}^{n} \vec{r}_i \right) \cdot \left( \sum_{j=1}^{n} \vec{r}_j \right) \right\rangle \qquad (8.33)$$
$$= \sum_{i=1}^{n} \sum_{j=1}^{n} \langle \vec{r}_i \cdot \vec{r}_j \rangle\,. \qquad (8.34)$$
Expressing the scalar products through the angles $\theta_{ij}$ between bond vectors of uniform length $l$,
$\vec{r}_i \cdot \vec{r}_j = l^2 \cos \theta_{ij}$, Eq. (8.34) becomes
$$\langle R^2 \rangle = l^2 \sum_{i=1}^{n} \sum_{j=1}^{n} \langle \cos \theta_{ij} \rangle\,. \qquad (8.35)$$
In the freely jointed chain model we further assume that there are no correlations between
the directions of different bond vectors, $\langle \cos \theta_{ij} \rangle = 0$ for $i \neq j$. Then, the double sum in Eq.
(8.35) has only $n$ non-zero terms for $i = j$ with $\cos \theta_{ii} = 1$. Hence,
$$\langle R^2 \rangle = n l^2\,. \qquad (8.36)$$
This again appears to be a crude model, but we shall now rescue it by redefining $l$. In an ideal
polymer chain we can at least assume that there is no interaction between monomers that are
separated by many other monomers, so that
$$\lim_{|i-j| \to \infty} \langle \cos \theta_{ij} \rangle = 0\,. \qquad (8.37)$$
Furthermore, for a given bond vector $\vec{r}_i$ the sum over all correlations with other bond vectors
converges to some finite number that depends on $i$,
$$\sum_{j=1}^{n} \langle \cos \theta_{ij} \rangle = C'(i)\,. \qquad (8.38)$$
Therefore, when including the correlations, Eq. (8.35) can still be simplified to
$$\langle R^2 \rangle = l^2 \sum_{i=1}^{n} C'(i) = C_n\, n\, l^2\,, \qquad (8.39)$$
where Flory's characteristic ratio $C_n$ is the average value of $C'(i)$ over all backbone bonds of
the chain.
In general, $C_n$ depends on $n$, but for very long chains it converges to a value $C_\infty$. For
sufficiently long chains, we can thus approximate
$$\langle R^2 \rangle \approx n\, C_\infty\, l^2\,, \qquad (8.40)$$
which has the same dependence on $n$ and $l$ as the crude model of the freely jointed chain, Eq.
(8.36). Hence, we can define an equivalent freely jointed chain with $N$ Kuhn segments of length
$b$. From
$$\langle R^2 \rangle = N b^2 \approx n\, C_\infty\, l^2 \qquad (8.41)$$
and the length of the maximally stretched equivalent chain, the contour length $R_\mathrm{max}$,
$$R_\mathrm{max} = N b\,, \qquad (8.42)$$
we obtain
$$N = \frac{R_\mathrm{max}^2}{C_\infty\, n\, l^2} \qquad (8.43)$$
and the Kuhn length
$$b = \frac{\langle R^2 \rangle}{R_\mathrm{max}} = \frac{C_\infty\, n\, l^2}{R_\mathrm{max}}\,. \qquad (8.44)$$
Typical values of $C_\infty$ for synthetic polymers range from 4.6 for 1,4-poly(isoprene) to 9.5 for
atactic poly(styrene), with corresponding Kuhn lengths of 8.2 Å and 18 Å, respectively.
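The following back-of-the-envelope sketch applies Eqs. (8.41), (8.43) and (8.44). Only $C_\infty = 9.5$ is taken from the text (atactic poly(styrene)); the backbone bond length, the number of backbone bonds, and the all-trans estimate of the contour length are assumptions introduced purely for illustration.

import numpy as np

C_inf = 9.5           # Flory characteristic ratio of atactic poly(styrene) (from the text)
l = 1.54e-10          # C-C backbone bond length in m (assumed)
n = 10_000            # number of backbone bonds (assumed)
R_max = 0.83 * n * l  # contour length of the all-trans backbone (assumed geometric factor)

R2 = C_inf * n * l**2                    # Eq. (8.41): mean-square end-to-end distance
N_kuhn = R_max**2 / (C_inf * n * l**2)   # Eq. (8.43): number of Kuhn segments
b = C_inf * n * l**2 / R_max             # Eq. (8.44): Kuhn length

print(f"sqrt(<R^2>) = {np.sqrt(R2) * 1e9:.1f} nm")
print(f"N = {N_kuhn:.0f} Kuhn segments, b = {b * 1e10:.1f} Angstrom")

With these assumed inputs, the Kuhn length comes out close to the 18 Å quoted above for poly(styrene).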
At this point we have found the mean-square end-to-end distance as a parameter of an
equilibrium macrostate. If we stretch the chain to a longer end-to-end distance, it is no longer at
equilibrium and must have larger free energy. Part of this increase in free energy must come
from a decrease in entropy that stretching induces by reducing the number of accessible chain
conformations. It turns out that this entropic contribution is the major part of the increase in
free energy, typically 90%. The tendency of polymer chains to contract after they have been
stretched is thus mainly an entropic effect. To quantify it, we need a probability distribution for
the end-to-end vectors and to that end, we introduce a concept that is widely used in natural
sciences.
Concept 8.2.1 Random walk. Many processes can be discretized into individual steps. What
happens in the next step may depend only on the current state, or also on what happened in
earlier steps. If it depends only on the current state, the process is memoryless and fits the
definition of a Markov chain. A Markov chain whose events are analogous steps in some
parameter space can be modeled as a random walk. A random walk is a mathematically
formalized succession of random steps. A random walk on a lattice, where each step can only
lead from a lattice point to a directly neighboring lattice point, is a particularly simple model.
We can use the concept of a random walk in combination with the concepts of statistical
thermodynamics in order to solve the problem of polymer chain stretching and contraction. The
problem is solved if we know the dependence of Helmholtz free energy on the length of the
end-to-end vector. This, in turn, requires that we know the entropy and thus the probability
distribution of the length of the end-to-end vector. This probability distribution is given by the
number of possible random walks (trajectories) that lead to a particular end-to-end distance
$\sqrt{\vec{R}^2}$.
We start with a simpler example in one dimension that we can later extend to
three dimensions. We consider the standard example in this field, a drunkard who has just left a
pub. We assume that, starting at the pub door, he makes random steps forth and back along the
road. What is the probability $P(N,x)$ that after $N$ steps he is at a distance of $x$ steps up the road
from the pub door? The problem is equivalent to finding the number $W(N,x)$ of trajectories of
length $N$ that end up $x$ steps from the pub door and dividing it by the total number of trajectories.
Any such trajectory consists of $N_+$ steps up the road and $N_-$ steps down the road, with the
final position being $x = N_+ - N_-$ [RC03]. The number of such trajectories is, again, given by a
binomial distribution (see Section 3.1.6)
$$W(N,x) = \frac{(N_+ + N_-)!}{N_+!\, N_-!} = \frac{N!}{\left[ (N+x)/2 \right]!\, \left[ (N-x)/2 \right]!}\,, \qquad (8.45)$$
whereas the total number of trajectories is $2^N$, as the drunkard has two possibilities at each step.
Hence,
$$P(N,x) = \frac{1}{2^N}\, \frac{N!}{\left[ (N+x)/2 \right]!\, \left[ (N-x)/2 \right]!}\,, \qquad (8.46)$$
leading to
$$\ln P(N,x) = -N \ln 2 + \ln (N!) - \ln \left[ \left( \frac{N+x}{2} \right)! \right] - \ln \left[ \left( \frac{N-x}{2} \right)! \right]\,. \qquad (8.47)$$
The last two terms on the right-hand side can be rewritten as
$$\ln \left[ \left( \frac{N+x}{2} \right)! \right] = \ln \left[ \left( \frac{N}{2} \right)! \right] + \sum_{s=1}^{x/2} \ln \left( \frac{N}{2} + s \right) \quad \text{and} \qquad (8.48)$$
$$\ln \left[ \left( \frac{N-x}{2} \right)! \right] = \ln \left[ \left( \frac{N}{2} \right)! \right] - \sum_{s=1}^{x/2} \ln \left( \frac{N}{2} + 1 - s \right)\,, \qquad (8.49)$$
which leads to
$$\ln P(N,x) = -N \ln 2 + \ln (N!) - 2 \ln \left[ \left( \frac{N}{2} \right)! \right] - \sum_{s=1}^{x/2} \ln \frac{N/2 + s}{N/2 + 1 - s}\,. \qquad (8.50)$$
This result no longer depends on step size, not even implicitly, because we have removed the
dependence on step number $N$. Therefore, it can be generalized to three dimensions. Since the
random walks along the three pairwise orthogonal directions in Cartesian space are independent
of each other, the three-dimensional probability density factorizes into a product of the densities
for the three Cartesian components.
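The quality of the Gaussian limit can be checked numerically against the exact Eq. (8.46). The closed Gaussian expression used below is the standard large-N limit of Eq. (8.50), written here as an assumption rather than quoted from the text, and the chosen values of N and x are arbitrary.

import numpy as np
from math import lgamma

def ln_P_exact(N, x):
    # Exact ln P(N, x) from Eq. (8.46), written with log-Gamma functions to avoid overflow
    return (-N * np.log(2.0) + lgamma(N + 1)
            - lgamma((N + x) / 2 + 1) - lgamma((N - x) / 2 + 1))

def ln_P_gauss(N, x):
    # Standard Gaussian limit for N >> 1 and x << N (assumed closed form)
    return -0.5 * np.log(np.pi * N / 2.0) - x**2 / (2.0 * N)

for N, x in [(100, 0), (100, 10), (100, 40)]:
    print(N, x, ln_P_exact(N, x), ln_P_gauss(N, x))

For $x \ll N$ the two results agree very closely, whereas for larger $x$ the Gaussian form starts to overestimate the probability, as discussed further below.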
At this point we relate the result to the conformational ensemble of an ideal polymer chain,
using the Kuhn model discussed in Section 8.2.1. We pose the question of the distribution of
mean-square end-to-end distances $\langle \vec{R}^2 \rangle$, with the Cartesian components of the end-to-end vector
$\vec{R}$ being $x = R_x$, $y = R_y$, and $z = R_z$. According to Eq. (8.41), we have
$$\langle \vec{R}^2 \rangle = \langle R_x^2 + R_y^2 + R_z^2 \rangle \qquad (8.64)$$
$$= N b^2\,. \qquad (8.65)$$
[Figure 8.1: (A) Probability density $\rho_\mathrm{3d}(N,R)$ of the end-to-end distance (in arbitrary units) and
(B) the corresponding cumulative probability, both plotted as functions of $R/(b\sqrt{N})$ from 0 to 3.]
Because of this scaling with the volume of an infinitesimally thin spherical shell, the probability
density distribution (Figure 8.1A) for the end-to-end distance does not peak at zero distance. As
seen in Figure 8.1B, it is very unlikely to encounter a chain with $R > 2 b \sqrt{N}$. Since the contour
length is $R_\mathrm{max} = N b$, we can conclude that at equilibrium almost all chains have end-to-end
distances shorter than $2 R_\mathrm{max}/\sqrt{N}$.
We need to discuss validity of the result, because in approximating the discrete binomial
distribution by a continuous Gaussian probability distribution we had made the assumption
$x \ll N$. Within the ideal chain model, this assumption corresponds to an end-to-end distance
that is much shorter than the contour length $N b$. If $R$ approaches $N b$, the Gaussian distribution
overestimates the true probability density. In fact, the Gaussian distribution predicts a small, but
finite probability for the chain to be longer than its contour length, which is unphysical. The
model can be refined to include cases of such strong stretching of the chain [RC03]. For our
qualitative discussion of entropic elasticity not too far from equilibrium, we can be content with
Eq. (8.68).
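The Gaussian distribution that underlies Eqs. (8.68) and (8.73) can also be tested directly by Monte Carlo sampling of freely jointed chains. The sample size, random seed, and bin number below are arbitrary choices.

import numpy as np

rng = np.random.default_rng(0)
N, b, n_chains = 100, 1.0, 20_000

# N unit bond vectors per chain with isotropically random directions
v = rng.normal(size=(n_chains, N, 3))
v *= b / np.linalg.norm(v, axis=2, keepdims=True)
R_vec = v.sum(axis=1)                 # end-to-end vectors
R = np.linalg.norm(R_vec, axis=1)     # end-to-end distances

print("<R^2> / (N b^2) =", np.mean(R**2) / (N * b**2))   # close to 1, cf. Eq. (8.41)

# Compare the sampled distance distribution with 4 pi R^2 rho_3d(N, R),
# using the Gaussian density whose logarithm appears in Eq. (8.73)
hist, edges = np.histogram(R, bins=40, density=True)
centers = 0.5 * (edges[1:] + edges[:-1])
rho = (3.0 / (2.0 * np.pi * N * b**2))**1.5 * np.exp(-3.0 * centers**2 / (2.0 * N * b**2))
print("max deviation:", np.max(np.abs(hist - 4.0 * np.pi * centers**2 * rho)))

For N = 100 the deviation is already small, illustrating that the Gaussian description is adequate for chains that are far from full extension.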
$$S(N, \vec{R}) = k_\mathrm{B} \ln \Omega(N, \vec{R})\,. \qquad (8.70)$$
The probability density distribution in Eq. (8.68) is related to the statistical weight by
$$\rho_\mathrm{3d}(N, \vec{R}) = \frac{\Omega(N, \vec{R})}{\int \Omega(N, \vec{R})\, \mathrm{d}\vec{R}}\,, \qquad (8.71)$$
because $\rho_\mathrm{3d}$ is the fraction of all conformations that have an end-to-end vector in the infinitesimally
small interval between $\vec{R}$ and $\vec{R} + \mathrm{d}\vec{R}$. Hence,$^1$
$$S(N, \vec{R}) = k_\mathrm{B} \ln \rho_\mathrm{3d}(N, \vec{R}) + k_\mathrm{B} \ln \int \Omega(N, \vec{R})\, \mathrm{d}\vec{R} \qquad (8.72)$$
$$= -\frac{3}{2} k_\mathrm{B} \frac{\vec{R}^2}{N b^2} + \frac{3}{2} k_\mathrm{B} \ln \frac{3}{2 \pi N b^2} + k_\mathrm{B} \ln \int \Omega(N, \vec{R})\, \mathrm{d}\vec{R}\,. \qquad (8.73)$$
The last two terms do not depend on $\vec{R}$ and thus constitute an entropy contribution $S(N, 0)$ that
is the same for all end-to-end distances, but depends on the number of monomers $N$,
$$S(N, \vec{R}) = -\frac{3}{2} k_\mathrm{B} \frac{\vec{R}^2}{N b^2} + S(N, 0)\,. \qquad (8.74)$$
Since by definition the Kuhn segments of an ideal chain do not interact with each other,
the internal energy is independent of $\vec{R}$. The Helmholtz free energy
$F(N, \vec{R}) = U(N, \vec{R}) - T S(N, \vec{R})$ can thus be written as
$$F(N, \vec{R}) = \frac{3}{2} k_\mathrm{B} T \frac{\vec{R}^2}{N b^2} + F(N, 0)\,. \qquad (8.75)$$
It follows that the free energy of an individual chain attains a minimum at zero end-to-end vector,
in agreement with our conclusion in Section 8.2.2 that the probability density is maximal for
a zero end-to-end vector. At longer end-to-end vectors, chain entropy decreases quadratically
with vector length. Hence, the chain can be considered as an entropic spring. Elongation of
the spring corresponds to separating the chain ends by a distance $R \ll N b$. The force required
for this elongation is the derivative of the Helmholtz free energy with respect to distance. For one
dimension, we obtain
$$f_x = \frac{\partial F(N, \vec{R})}{\partial R_x} = \frac{3 k_\mathrm{B} T}{N b^2} R_x\,. \qquad (8.76)$$
For the three-dimensional case, the force is a vector that is linear in $\vec{R}$,
$$\vec{f} = \frac{3 k_\mathrm{B} T}{N b^2} \vec{R}\,, \qquad (8.77)$$
i.e., the entropic spring satisfies Hooke's law. The entropic spring constant is $3 k_\mathrm{B} T/(N b^2)$.
Polymers are thus the easier to stretch the larger their degree of polymerization (proportional
to $N$), the longer the Kuhn segment $b$, and the lower the temperature $T$. In particular, the
temperature dependence is counterintuitive. A polymer chain under strain will contract if the
temperature is raised, since the entropic contribution to the Helmholtz free energy, which
counteracts the strain, then increases.
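As a numerical illustration of Eq. (8.77), the sketch below evaluates the entropic spring constant and the retraction force for an assumed chain of N = 100 Kuhn segments with b = 1 nm; the force indeed grows with temperature.

import numpy as np

k_B = 1.380649e-23   # Boltzmann constant in J/K

def spring_constant(N, b, T):
    # Entropic spring constant 3 k_B T / (N b^2) from Eq. (8.77)
    return 3.0 * k_B * T / (N * b**2)

N, b = 100, 1e-9      # assumed chain parameters
R = 10e-9             # assumed end-to-end separation of 10 nm, still well below N b = 100 nm
for T in (280.0, 300.0, 320.0):
    k = spring_constant(N, b, T)
    print(f"T = {T:.0f} K: k = {k:.3e} N/m, retraction force = {k * R * 1e12:.2f} pN")

The resulting forces are of the order of piconewtons, the scale probed in single-molecule stretching experiments on flexible polymers.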
$^1$This separation of the terms is mathematically somewhat awkward, since in the last two terms the argument of
the logarithm has a unit. However, if the two terms are combined the logarithm of the unit cancels.
Bibliography
Books
[AP13] P. W. Atkins and J. de Paula. Physikalische Chemie. 5th Ed. Weinheim: WILEY-VCH,
2013 (cited on page 8).
[ER06] T. Engel and P. Reid. Physikalische Chemie. 1st Ed. München: Pearson Studium,
2006 (cited on page 8).
[Her45] G. Herzberg. Molecular Spectra and Molecular Structure: II. Infrared and Raman
Spectra of Polyatomic Molecules. New York: van Nostrand, 1945 (cited on page 73).
[Mac98] A. Maczek. Statistical Thermodynamics. First. Oxford: Oxford University Press,
1998 (cited on pages 8, 66, 68, 74, 76).
[Pen70] O. Penrose. Foundations of Statistical Mechanics. Oxford: Pergamon Press, 1970
(cited on pages 8, 12, 15, 16).
[Pre+97] W. H. Press et al. Numerical Recipes in C. Second. Cambridge: Cambridge University
Press, 1997 (cited on page 25).
[RC03] M. Rubinstein and R. H. Colby. Polymer Physics. First. Oxford: Oxford University
Press, 2003 (cited on pages 81, 82, 85, 87, 90).
[SHM12] M. Schäfer, U. Hollenstein, and F. Merkt. Advanced Physical Chemistry: Statistical
Thermodynamics. Autumn Semester 2012. Zürich: ETH Zürich, 2012 (cited on
page 76).
[Sch06] F. Schwabl. Statistische Mechanik. 3rd Ed. Berlin: Springer, 2006 (cited on pages 8,
14, 49, 53).
[SJ01] A. Schweiger and G. Jeschke. Principles of Pulse Electron Paramagnetic Resonance.
Oxford: Oxford University Press, 2001 (cited on page 52).
[Swe12] R. H. Swendsen. An Introduction to Statistical Mechanics and Thermodynamics. 1st
Ed. Oxford: Oxford University Press, 2012 (cited on pages 8, 18, 20, 22, 24, 25, 41,
52, 53).
[WF12] G. Wedler and H.-J. Freund. Lehrbuch der Physikalischen Chemie. 6th Ed. Weinheim:
WILEY-VCH, 2012 (cited on pages 8, 33, 37, 43, 44, 54-57, 61, 62, 67, 77, 79).
Articles
[Coh11] D. Cohen. Lecture Notes in Statistical Mechanics and Mesoscopics. In: arXiv:1107.0568
(2011), pages 1121 (cited on page 8).
[FR03] S. R. D. French and D. P. Rickles. Understanding Permutation Symmetry. In:
arXiv:quant-ph/0301020 (2003), pages 1 (cited on page 54).
[Kub63] R. Kubo. In: J. Math. Phys. 4 (1963), pages 174-183 (cited on page 49).
[Tan06] Y. Tanimura. In: J. Phys. Soc. Jap. 75 (2006), page 082001 (cited on page 49).
[VF74] A. J. Vega and D. Fiat. In: J. Chem. Phys. 60 (1974), pages 579-583 (cited on
page 49).
Web Pages
[Iri98] K. K. Irikura. Essential Statistical Thermodynamics. http://cccbdb.nist.gov/
thermo.asp. [Online; accessed 4-August-2015]. 1998 (cited on pages 9, 73).
[UNM12] UNM. Laminar flow. https://www.youtube.com/watch?v=_dbnH-BBSNo.
[Online; accessed 28-July-2015]. 2012 (cited on page 47).