
LECTURE NOTES

OF
STATISTICAL MECHANICS

Collection of the lecture notes of Professor Orlandini.

Edited By

Alice Pagano
The University of Padua
Academic year 2019-2020

Source available:
https://github.com/AlicePagano/Lectures-Statistical-Mechanics

Compiled: Wednesday 5th February, 2020


Abstract

In this document I have tried to reorder the notes of the statistical mechanics course
held by Professor Enzo Orlandini at the Department of Physics of the University of
Padua during the first semester of the 2019-20 academic year of the master's degree.
The notes are fully integrated with the material provided by the professor on the
Moodle platform. In addition, I will integrate them, as far as possible, with the
books recommended by the professor.
There may be formatting errors, wrong signs, missing exponents, etc. If you find
errors, let me know (alice.pagano@studenti.unipd.it) and I will correct them, so that
this document can be a good study support.

Padova, Wednesday 5th February, 2020 Alice Pagano

Contents

Introduction ix

1 Recall of Thermodynamics 1
1.1 A short recap of thermodynamics definitions . . . . . . . . . . . . . . . 1
1.2 Equilibrium states . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.3 Equations of states . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.4 Legendre transform and thermodynamic potentials . . . . . . . . . . . 5
1.5 Maxwell relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.6 Response functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.6.1 Response functions and thermodynamic stability . . . . . . . . 12

2 Equilibrium phases and thermodynamics of phase transitions 15


2.1 Equilibrium phases as minima of Gibbs free energy . . . . . . . . . . . 15
2.2 First order phase transition and phase coexistence . . . . . . . . . . . 15
2.2.1 Critical points . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.2.2 Ferromagnetic system . . . . . . . . . . . . . . . . . . . . . . . 19
2.3 Second order phase transition . . . . . . . . . . . . . . . . . . . . . . . 21
2.3.1 Helmholtz free-energy . . . . . . . . . . . . . . . . . . . . . . . 22
2.4 Thermodynamic of phase coexistence . . . . . . . . . . . . . . . . . . . 22
2.4.1 Lever Rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.4.2 Phase coexistence (one component system) . . . . . . . . . . . 23
2.4.3 Clausius-Clapeyron equation . . . . . . . . . . . . . . . . . . . 24
2.4.4 Application of C-C equation to the liquid-gas coexistence line . 25
2.5 Order parameter of a phase transition . . . . . . . . . . . . . . . . . . 27
2.6 Classification of the phase transitions . . . . . . . . . . . . . . . . . . . 28
2.6.1 Thermodynamic classification . . . . . . . . . . . . . . . . . . . 28
2.6.2 Ehrenfest classification . . . . . . . . . . . . . . . . . . . . 28
2.6.3 Modern classification . . . . . . . . . . . . . . . . . . . . . . . . 29
2.7 Critical exponents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.7.1 Divergence of the response functions at the critical point . . . . 30
2.7.2 Critical exponents definition . . . . . . . . . . . . . . . . . . . . 30
2.7.3 Law of the corresponding states . . . . . . . . . . . . . . . . . . 32
2.7.4 Thermodynamic inequalities between critical exponents . . . . 32

3 Recall of statistical mechanics and theory of ensembles 35


3.1 Statistical ensembles . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.2 The canonical ensemble . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.2.1 Energy fluctuations in the canonical ensemble . . . . . . . . . . 39
3.3 Isothermal and isobaric ensemble . . . . . . . . . . . . . . . . . . . . . 40
3.3.1 Saddle point approximation . . . . . . . . . . . . . . . . . . . . 42
3.4 Grand canonical ensemble . . . . . . . . . . . . . . . . . . . . . . . . . 42

4 Statistical mechanics and phase transitions 43


4.1 Statistical mechanics of phase transitions . . . . . . . . . . . . . . . . . 43


4.1.1 Magnetic system (canonical) . . . . . . . . . . . . . . . . . . . . 44
4.1.2 Fluid system (grand canonical) . . . . . . . . . . . . . . . . . . . 44
4.1.3 Thermodynamic limit with additional constraints . . . . . . . . 45
4.1.4 Statistical mechanics and phase transitions . . . . . . . . . . . 45
4.2 Critical point and correlations of fluctuations . . . . . . . . . . . . . . 46
4.3 Finite size effects and phase transitions . . . . . . . . . . . . . . . . . . 48
4.4 Numerical simulations and phase transitions . . . . . . . . . . . . . . . 49

5 Role of the models in statistical mechanics 51


5.1 Role of the models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
5.2 The Ising model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
5.2.1 d-dimensional Ising model . . . . . . . . . . . . . . . . . . . 52
5.2.2 Mathematical properties of the Ising model with nearest neigh-
bours interactions . . . . . . . . . . . . . . . . . . . . . . . . . 53
5.2.3 Ising model and Z2 symmetry. . . . . . . . . . . . . . . . . . . 56
5.3 Lattice gas model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
5.4 Fluid system in a region Ω . . . . . . . . . . . . . . . . . . . . . . . . 58
5.4.1 From the continuous to the lattice gas model . . . . . . . . . . 59

6 Some exactly solvable models of phase transitions 61


6.1 1-dim Ising model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
6.1.1 Recursive method . . . . . . . . . . . . . . . . . . . . . . . . . . 61
6.1.2 Transfer Matrix method . . . . . . . . . . . . . . . . . . . . . . 64
6.2 General transfer matrix method . . . . . . . . . . . . . . . . . . . . . . 66
6.2.1 The free energy . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
6.2.2 The correlation function . . . . . . . . . . . . . . . . . . . . . . 68
6.2.3 Results for the 1-dim Ising model . . . . . . . . . . . . . . . . . 71
6.3 Classical Heisenberg model for d=1 . . . . . . . . . . . . . . . . . . . 74
6.4 Zipper model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
6.4.1 Transfer matrix method for the Zipper model . . . . . . . . . . 80
6.5 Transfer matrix for 2-dim Ising . . . . . . . . . . . . . . . . . . . . . 81

7 The role of dimension, symmetry and range of interactions in phase


transitions 85
7.1 Energy-entropy argument . . . . . . . . . . . . . . . . . . . . . . . . . 85
7.1.1 1-dim Ising . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
7.1.2 d-dim Ising . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
7.2 Role of the symmetry . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
7.3 Continuous symmetries and phase transitions . . . . . . . . . . . . . . 91
7.4 Role of the interaction range . . . . . . . . . . . . . . . . . . . . . . . . 94
7.4.1 Ising model with infinite range . . . . . . . . . . . . . . . . . . 94

8 Mean field theories of phase transitions and variational mean field 97


8.1 Mean field theories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
8.1.1 Mean field for the Ising model (Weiss mean field) . . . . . . . . 97
8.1.2 Free-energy expansion for m ≈ 0 . . . . . . . . . . . . . . . . . 100
8.1.3 Mean field critical exponents . . . . . . . . . . . . . . . . . . . 101
8.2 Mean field variational method . . . . . . . . . . . . . . . . . . . . . . . 104
8.2.1 Mean field approximation for the variational approach . . . . . 105
8.2.2 First approach: Bragg-Williams approximation . . . . . . . . . 106
8.2.3 Second approach: Blume-Emery-Griffith model . . . . . . . . . 111
8.2.4 Mean field again . . . . . . . . . . . . . . . . . . . . . . . . . . 116

9 Non ideal fluids: Mean field theory, Van der Waals, Virial expansion
and Cluster expansion 121
9.1 Mean field theory for fluids . . . . . . . . . . . . . . . . . . . . . . . . 121
9.2 Van der Waals equation . . . . . . . . . . . . . . . . . . . . . . . . . . 122
9.2.1 Critical point of Van der Waals equation of state . . . . . . . . 123
9.2.2 Law of corresponding states . . . . . . . . . . . . . . . . . . . . 125
9.2.3 Region of coexistence and Maxwell’s equal area rule . . . . . . 125
9.2.4 Critical exponents of Van der Waals equation . . . . . . . . . . 127
9.3 Theories of weakly interacting fluids . . . . . . . . . . . . . . . . . . . 129
9.3.1 Van der Waals and virial expansion . . . . . . . . . . . . . . . . 130
9.3.2 Cluster expansion technique for weakly interacting gases . . . . 131
9.3.3 Computation of virial coefficients for some interaction poten-
tials Φ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
9.3.4 Higher order terms in the cluster expansion . . . . . . . . . . . 137

10 Landau theory of phase transition for homogeneous systems 143


10.1 Introduction to Landau theory . . . . . . . . . . . . . . . . . . . . . . 143
10.2 Landau theory for the Ising model . . . . . . . . . . . . . . . . . . . . 144
10.2.1 Construction of L . . . . . . . . . . . . . . . . . . . . . . . . . 144
10.2.2 Equilibrium phases . . . . . . . . . . . . . . . . . . . . . . . . . 145
10.3 Critical exponents in Landau’s theory . . . . . . . . . . . . . . . . . . 146
10.4 First-order phase transitions in Landau theory . . . . . . . . . . . . . . 148
10.4.1 Phase stability and behaviour of χT . . . . . . . . . . . . . . . 152
10.4.2 Computation of T ∗∗ . . . . . . . . . . . . . . . . . . . . . . . . 153
10.4.3 Computation of T ∗ . . . . . . . . . . . . . . . . . . . . . . . . . 153
10.5 Multicritical points in Landau theory . . . . . . . . . . . . . . . . . . . 153
10.6 Liquid crystals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
10.6.1 What are liquid crystals? . . . . . . . . . . . . . . . . . . . . . 157
10.6.2 Definition of an order parameter for nematic liquid crystals . . 158
10.6.3 Landau-de Gennes theory for nematic liquid crystals . . . . . . 160

11 Role of fluctuations in critical phenomena: Ginzburg criterium,


Coarse-graining and Ginzburg-Landau theory of phase transitions 163
11.1 Importance of fluctuations: the Ginzburg criterium . . . . . . . . . . . 163
11.1.1 Fluctuation-dissipation relation . . . . . . . . . . . . . . . . . . 165
11.1.2 Computation of E_TOT . . . . . . . . . . . . . . . . . . . . . 166
11.1.3 Estimation of E_TOT as t → 0⁻ . . . . . . . . . . . . . . . . . 166
11.2 Functional partition function and coarse graining . . . . . . . . . . . . 167
11.3 Coarse graining procedure for the Ising model . . . . . . . . . . . . . . 168
11.3.1 Computation of H_eff[m(\vec{r})] . . . . . . . . . . . . . . . . . . . . 169
11.3.2 Magnetic non-homogeneous field . . . . . . . . . . . . . . . . . 172
11.3.3 Functional derivatives . . . . . . . . . . . . . . . . . . . . . . . 172
11.4 Saddle point approximation: Landau theory for non-homogeneous sys-
tems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
11.5 Correlation function in the saddle point approximation for non-homogeneous
systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
11.5.1 Solution of (11.42) by Fourier transform . . . . . . . . . . . . . 177
11.6 Including fluctuations at the Gaussian level (non interacting fields) . . 179
11.6.1 Gaussian approximation for the Ising model in Ginzburg-Landau
theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180

12 Widom’s scaling theory. Block-spin Kadanoff ’s transformation 187


12.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
12.2 Widom’s static scaling theory . . . . . . . . . . . . . . . . . . . . . . . 188
12.2.1 Homogeneous functions of one or more variables . . . . . . . . 189
12.2.2 Widom’s scaling hypothesis . . . . . . . . . . . . . . . . . . . . 191
12.3 Relations between critical exponents . . . . . . . . . . . . . . . . . . . 191
12.3.1 Exponent β (scaling of the magnetization) . . . . . . . . . . . . 191
12.3.2 Exponent δ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
12.3.3 Exponent γ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
12.3.4 Exponent α (scaling of the specific heat) . . . . . . . . . . . . . 193
12.3.5 Griffiths and Rushbrooke’s equalities . . . . . . . . . . . . . . . 194
12.3.6 An alternative expression for the scaling hypothesis . . . . . . . 194
12.3.7 Scaling of the equation of state . . . . . . . . . . . . . . . . . . 194
12.4 Kadanoff’s block spin and scaling of the correlation function . . . . . . 195
12.4.1 Kadanoff’s argument for the Ising model . . . . . . . . . . . . . 195
12.4.2 Kadanoff’s argument for two-point correlation functions . . . . 198

13 Renormalization group theory. Universality 201


13.1 Renormalization group theory (RG) . . . . . . . . . . . . . . . . . . . 201
13.1.1 Main goals of RG . . . . . . . . . . . . . . . . . . . . . . . . . . 202
13.1.2 Singular behaviour in RG . . . . . . . . . . . . . . . . . . . . . 204
13.1.3 Zoology of the fixed points . . . . . . . . . . . . . . . . . . . . . 204
13.1.4 Linearization of RG close to the fixed points and critical expo-
nents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
13.2 The origins of scaling and critical behaviour . . . . . . . . . . . . . . . 207
13.3 Real space renormalization group (RSRG) . . . . . . . . . . . . . . . . 209
13.3.1 Ising d = 1, RSRG with l = 2 . . . . . . . . . . . . . . . . . . . 210
13.3.2 Ising d = 1, RSRG with l = 3 . . . . . . . . . . . . . . . . . . . 213
13.3.3 Decimation procedure for d > 1: proliferation of the interactions 216
13.3.4 Decimation procedure and transfer matrix method . . . . . . . 220
13.3.5 Migdal-Kadanoff for anisotropic Ising model on a square lattice 221
13.4 RG transformation: general approach . . . . . . . . . . . . . . . . . . . 223
13.4.1 Variational RG . . . . . . . . . . . . . . . . . . . . . . . . . . . 224

14 Spontaneous symmetry breaking 227


14.1 Spontaneous symmetry breaking . . . . . . . . . . . . . . . . . . . . . 227
14.2 Spontaneous breaking of continuous symmetries and the onset of Gold-
stone particles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
14.2.1 Quantum relativistic case (field theory) . . . . . . . . . . . . . 229
14.3 Spontaneous symmetry breaking in gauge symmetries . . . . . . . . . . 233
14.3.1 Statistical mechanics . . . . . . . . . . . . . . . . . . . . . . . . 233
14.3.2 Field theory analog: the Higgs mechanism for an abelian group 233
14.3.3 Non-abelian gauge theories . . . . . . . . . . . . . . . . . . . . 236
14.3.4 Extension of Higgs mechanism to non-abelian theories . . . . . 237

Conclusions 239

Bibliography 241

Introduction

The goal of statistical mechanics [1] is to predict the macroscopic properties of


bodies, most especially their thermodynamic properties, on the basis of their micro-
scopic structure.
The macroscopic properties of greatest interest to statistical mechanics are those
relating to thermodynamic equilibrium. As a consequence, the concept of thermody-
namic equilibrium occupies a central position in the field.
The microscopic structure of systems examined by statistical mechanics can be
described by means of mechanical models: for example, gases can be represented
as systems of particles that interact by means of a phenomenologically determined
potential. Other examples of mechanical models are those that represent polymers
as a chain of interconnected particles, or the classical model of crystalline systems,
in which particles are arranged in space according to a regular pattern, and oscillate
around the minimum of the potential energy due to their mutual interaction. The
models to be examined can be, and recently increasingly are, more abstract, how-
ever, and exhibit only a faint resemblance to the basic mechanical description (more
specifically, to the quantum nature of matter). The explanation of the success of such
abstract models is itself the topic of one of the more interesting chapters of statisti-
cal mechanics: the theory of universality and its foundation in the renormalization
group.
The models of systems dealt with by statistical mechanics have some common
characteristics. We are in any case dealing with systems with a large number of
degrees of freedom: the reason lies in the corpuscular (atomic) nature of matter. The
degrees of freedom that one considers should have more or less comparable effects on
the global behavior of the system. This state of affairs excludes the application of
the methods of statistical mechanics to cases in which a restricted number of degrees
of freedom “dominates” the others—for example, in celestial mechanics, although the
number of degrees of freedom of the planetary system is immense, an approximation
in which each planet is considered as a particle is a good start. In this case, we
can state that the translational degrees of freedom (three per planet)—possibly with
the addition of the rotational degrees of freedom, also a finite number—dominate
all others. These considerations also make attempts to apply statistical concepts
to the human sciences problematic because, for instance, it is clear that, even if
the behavior of a nation’s political system includes a very high number of degrees of
freedom, it is possible to identify some degrees of freedom that are disproportionately
important compared to the rest. On the other hand, statistical methods can also be
applied to systems that are not strictly speaking mechanical—for example, neural
networks (understood as models of the brain’s components), urban thoroughfares
(traffic models), or problems of a geometric nature (percolation).
The simplest statistical mechanical model is that of a large number of identical
particles, free of mutual interaction, inside a container with impenetrable and per-
fectly elastic walls. This is the model of the ideal gas, which describes the behavior
of real gases quite well at low densities, and more specifically allows one to derive the
well-known equation of state.

The introduction of pair interactions between the particles of the ideal gas allows
us to obtain the standard model for simple fluids. Generally speaking, this model
cannot be resolved exactly and is studied by means of perturbation or numerical
techniques. It allows one to describe the behavior of real gases (especially noble
gases), and the liquid–vapor transition (boiling and condensation).
The preceding models are of a classical (nonquantum) nature and can be applied
only when the temperatures are not too low. The quantum effects that follow from
the inability to distinguish particles are very important for phenomenology, and they
can be dealt with at the introductory level if one omits interactions between particles.
In many of the statistical models we will describe, however, the system’s fun-
damental elements will not be “particles,” and the fundamental degrees of freedom
will not be mechanical (position and velocity or impulse). If we want to understand
the origin of ferromagnetism, for example, we should isolate only those degrees of
freedom that are relevant to the phenomenon being examined (the orientation of
the electrons’ magnetic moment) from all those that are otherwise pertinent to the
material in question.
The simplest case is that in which this degree of freedom can assume only two values—in this fashion, we
obtain a simple model of ferromagnetism, known as the Ising model, which is by far
the most studied model in statistical mechanics. The ferromagnetic solid is therefore
represented as a regular lattice in space, each point of which is associated with a
degree of freedom, called spin, which can assume the values +1 and -1. This model
allows one to describe the paramagnet-ferromagnet transition, as well as other similar
transitions.
In this course, the classical statistical mechanics of systems at equilibrium is treated.
The exam is divided into two parts: a first, written part common to everyone (the same
exercise and question for all), followed by an oral part.

Outline of the course
1. Brief recap of thermodynamics.

2. Equilibrium phases and thermodynamics of the phase transitions.

3. Statistical mechanics and theory of ensembles.

4. Thermodynamic limit and phase transitions in statistical mechanics.

5. Order parameter and critical point.

6. The role of modelling in the physics of phase transitions.

7. The Ising model.

8. Exact solutions of the Ising model.

9. Transfer matrix method.

10. Role of dimension and range of interactions in critical phenomena (lower critical
dimension).

11. Approximations: Weiss mean field theory and variational mean field.

12. Landau theory of phase transitions: the role of symmetries.

13. Relevance of fluctuations: the Ginzburg criterium and the notion of the upper
critical dimension.

14. The Ginzburg-Landau model.

15. Landau theory for non-homogeneous systems. The ν exponent.

16. Gaussian fluctuations in the G-L theory.

17. Widom’s scaling theory.

18. Kadanoff's theory of scaling.

19. The theory of the renormalisation group and the origin of universality in critical
phenomena.

20. Spontaneous symmetry breaking.

Chapter 1

Recall of Thermodynamics
1.1 A short recap of thermodynamics definitions
Lecture 1. Wednesday 9th October, 2019.

The systems we are considering are:

1. In equilibrium with an external bath at fixed temperature T.

2. Made of a (large) number N of degrees of freedom. For instance, recall that
   1 mol ≈ N_A ∼ 10^23 elementary units.

Thermodynamics is a macroscopic theory of matter at equilibrium. It starts either
from experimental observations or from axiomatic assumptions and establishes rig-
orous relations between macroscopic variables (observables) to describe systems at
equilibrium. One of the first important concepts is that of extensive variables.
For instance, the extensive variables that characterize the system at equilibrium are
the internal energy U, the volume V, the number of particles N and the magnetization \vec{M},
which "scale with the system". In general, extensive variables are additive.
In thermodynamics, the concepts of walls and thermodynamic constraints are
important: they are necessary for a complete definition of a thermodynamic sys-
tem. Through their presence or absence it is possible to control and redistribute the
thermodynamic variables when the system changes. The typical walls are:

• Adiabatic walls: no heat flux. If removed, we obtain a diathermic wall.

• Rigid walls: no mechanical work. If removed, we obtain a flexible or mobile wall.

• Impermeable walls: no flux of particles (the number of particles remains con-
  strained). If removed, we obtain a permeable wall.

1.2 Equilibrium states


Consider a system in an equilibrium state; if the system changes, our aim is to
study the next equilibrium state of the system. Therefore, we move from one equilibrium
state to another. The fundamental problem of thermodynamics is how to
characterize the new state.
Now, we define the concept of equilibrium states. Consider macroscopic states
that are fully described by extensive variables such as the internal energy U, the
volume V, the number of particles N, the magnetization \vec{M}, etc. If these variables
are time independent, the system is in a steady state. Moreover, if there are also no
macroscopic currents, the system is at equilibrium. Therefore, we describe a system
by characterizing all its extensive variables at equilibrium.


Suppose that the system changes slowly in time: it goes from an equilibrium state
to another one and the transformation is so slow that in each ∆t the system is at
equilibrium. Hence, considering a sequence of equilibrium states, quasi-static
transformations are described by the 1st Law of Thermodynamics:

dU = δQ − δW (1.1)

The variation of the internal energy of the system depends on two contributions: δW,
the work done by the system during a quasi-static (infinitely slow) process, and δQ,
the heat absorbed by the system during the process. Remember that we write
dU because it is an exact differential, while the quantities with the δ are only
small (inexact) increments. Therefore, U is a function of state, the others are not.
Remark. The convention is δQ > 0 if the heat is absorbed by the system, and δW > 0
if the work is done by the system.
For example, considering a simple fluid with a given pressure, if we change the
volume the work done by the system is δW = P dV. For a magnetized system, we
have δW = −\vec{H} · d\vec{M}.
In conclusion, starting from an equilibrium state and removing some constraints
(i.e. wall properties), we want to find the new equilibrium state compatible with the
new constraints.
Suppose a system with adiabatic, rigid, impermeable constraints, divided into two parts:
the subsystem on the left is characterized by V1, N1, U1, the one on the right by V2, N2, U2.
There are many ways of solving this problem. We use the most general one, namely the
maximum entropy principle. If there exists a function S of the extensive variables of
the system that is defined for all equilibrium states, we call it entropy and the 1st
fundamental relation is

S = S(U, V, N) (1.2)

The new values taken by the extensive parameters when a constraint has been re-
moved are the ones that maximize S. It means dS = 0 and d²S < 0, given the
remaining constraints.
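As a minimal worked illustration of this principle (a standard case, with the assumptions stated explicitly): suppose the internal wall becomes diathermic while remaining rigid and impermeable, so that the two subsystems can only exchange internal energy, with U = U1 + U2 fixed. Then

dS = (∂S1/∂U1) dU1 + (∂S2/∂U2) dU2 = (1/T1 − 1/T2) dU1 = 0  ⇒  T1 = T2

so the maximum of S selects the state in which the two temperatures are equal, and d²S < 0 is guaranteed as long as the heat capacities of the two subsystems are positive.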
The properties of S are:

1. S is an additive function with respect to the subsystems into which the system is
   partitioned:

   S = Σ_α S^(α) (1.3)

2. S is differentiable and monotonically increasing with respect to the internal
   energy U. It means that (∂S/∂U)_{V,N} > 0.

3. For each subsystem (α) we have:

   S^(α) = S^(α)(U^(α), V^(α), N^(α)) (1.4)

   This fundamental relation holds for each subsystem.

4. S is a homogeneous function of first order with respect to the extensive param-
   eters, namely:

   S(λU, λV, λN) = λS(U, V, N), ∀λ > 0 (1.5)

   It means that S is an extensive quantity.



Remark. Since S is monotonically increasing in U, the following inequality holds:

(∂S/∂U)_{V,N} > 0

Therefore, (∂S/∂U)_{V,N} ≠ 0 and the relation can be inverted locally.
Then, S = S(U, V, N) inverted in U gives the 2nd fundamental relation

U = U(S, V, N) (1.6)

It means that we can look either at S or at U and, once one of these quantities is known,
all the information about the system can be obtained.
By taking the differential of the fundamental relation

U = U(S, V, N1, . . . , Nr)

one gets

dU = (∂U/∂S)_{V,Nj} dS + (∂U/∂V)_{S,Nj} dV + Σ_{j=1}^{r} (∂U/∂Nj)_{S,V} dNj (1.7)

where (∂U/∂S)_{V,Nj} = T is the absolute temperature, (∂U/∂V)_{S,Nj} = −P is (minus) the pressure, and (∂U/∂Nj)_{S,V} = µj is the electrochemical potential.

1.3 Equations of states


Now, we define another set of variables that are called intensive variables. The
term intensive means that it is independent of the size of the system, namely that
the value of the variable relative to a subsystem is equal to that of the whole system.
The intensive variables are themselves functions of S,V,N, and examples of intensive
variables are the pressure, P, and the temperature of the system, T.
The state equations are defined as:

T = T (S, V, N1 , . . . , Nr ) (1.8a)
P = P (S, V, N1 , . . . , Nr ) (1.8b)
µj = µj (S, V, N1 , . . . , Nr ) (1.8c)

Remark. If all the state equations are known, the fundamental relation is determined
up to a constant. It means that the coefficients of the differential (1.7) are
known.
Example 1
Let us see some examples of equations of state:
• For an ideal gas:

  P V = N k_B T (1.9)

• Van der Waals equation of state:

  (P + αN²/V²)(V − Nb) = N k_B T (1.10)

• For magnetic systems, another equation of state is the Curie law:

  M = C H / T (1.11)
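To make the Van der Waals correction concrete, here is a small numerical sketch (the constants a, b below are CO2-like values chosen only for illustration, not data from the lectures) comparing the ideal-gas and Van der Waals pressures of one mole at the same T and V:

R = 8.314                # N k_B for one mole (gas constant), J/(mol K)
a, b = 0.364, 4.27e-5    # illustrative CO2-like constants, J m^3/mol^2 and m^3/mol

def p_ideal(T, V):
    return R * T / V

def p_vdw(T, V):
    return R * T / (V - b) - a / V**2

T, V = 300.0, 1.0e-3     # K, m^3/mol
print(p_ideal(T, V), p_vdw(T, V))   # here the attractive a-term wins: p_vdw < p_ideal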

Remark. For the magnetic case we also compute (∂U/∂\vec{M})_{S,N} = \vec{H}.

The equations of state are homogeneous functions of zero degree. For example,
considering the temperature T:

T(λS, λV, λN) = λ⁰ T(S, V, N) = T(S, V, N)

It means that at equilibrium the temperature of a subsystem is equal to that of the
whole system. Similarly,

P(λS, λV, λN) = P(S, V, N)

Now, we keep the variable S separate from the others, which are replaced
by generalized displacements, as (V, N1, . . . , Nr) → Xj. The fundamental relation
becomes

U = U(S, X1, . . . , X_{r+1}) (1.12)

and we define:

(∂U/∂S) ≡ T (1.13a)
(∂U/∂Xj) ≡ Pj (1.13b)

The differential is written as follows:

dU = T dS + Σ_{j=1}^{r+1} Pj dXj (1.14)

where X1 = V is the volume and P1 = −P is the pressure.


From the equilibrium condition,

dU = 0

one can get a relation between intensive variables in differential form as the Gibbs-
Duhem relation:
r+1
X
S dT + Xj dPj = 0 (1.15)
j=1

For a one-component simple fluid system, the equation (1.15) simplifies into

S dT − V dP + N dµ = 0

and dividing by the number of moles N

dµ = −s dT + v dP (1.16)

that is the Gibbs-Duhem relation in a molar form.


For a magnetic system, we have

dU = T dS + \vec{H} · d\vec{M} + µ dN (1.17)

Remark. Note that µ = µ(T, P) is a relation between intensive variables.

To summarize, the fundamental relations are S = S(U, V, N1, . . . , Nr), or
S = S(U, \vec{M}, N1, . . . , Nr) for magnetic systems. In the energy representation we have
U = U(S, V, N1, . . . , Nr) or U = U(S, \vec{M}, N1, . . . , Nr).

1.4 Legendre transform and thermodynamic potentials


In many situations, it is convenient to replace some extensive variables with their
conjugate intensive ones, which then become independent and free to vary. In this way
we obtain new thermodynamic potentials. It works as follows: suppose we have a function

Y = Y(X0, X1, . . . , Xk, . . . , X_{r+1}) (1.18)

such that Y is strictly convex in, say, Xk (∂²Y/∂Xk² > 0) and smooth¹. The idea is to find
a transformation such that

Y = Y(X0, X1, . . . , Pk, . . . , X_{r+1}) (1.19)

where

Xk → Pk ≡ ∂Y/∂Xk (1.20)

i.e. Pk substitutes Xk as a new independent variable. In mathematics this is called a
Legendre transform.
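As a one-variable toy illustration (a standard textbook example, not taken from the lectures): for Y(X) = X²/2 one has P ≡ ∂Y/∂X = X, and the Legendre transform Ỹ(P) ≡ Y − PX = −P²/2 is indeed a function of the new variable P alone, with ∂Ỹ/∂P = −X recovering the old variable.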
The thermodynamic potentials are extremely useful tools, whose name derives
from an analogy with mechanical potential energy: as we will see later, in certain
circumstances the work obtainable from a macroscopic system is related to the change
of an appropriately defined function, the thermodynamic potential. They are useful
because they allow one to define quantities which are easier to control experimentally
and to rewrite the fundamental thermodynamic relations in terms of them.
Mathematically, all the thermodynamic potentials are the result of a Legendre
transformation of the internal energy, namely they are a rewriting of the internal
energy in which one variable has been substituted with another.
Example 2: How to calculate thermodynamic potentials
Suppose we want to replace the entropy S with its conjugate derivative

T = ∂U/∂S

One starts from the fundamental relation

U = U(S, V, N1, . . . )

and transforms U such that S is replaced by T as a new independent variable.
Let us consider the transformation

A ≡ U − S (∂U/∂S) = U − TS

By differentiating A we get

dA = dU − T dS − S dT

On the other hand

dU = T dS + Σ_j Pj dXj

It implies that

dA = −S dT + Σ_j Pj dXj

For such a system we have A = A(T, V, N1, . . . , Nr). It is a function of T instead
of S, as wanted. Similarly, for a magnetic system A = A(T, \vec{M}, N1, . . . , Nr).

¹ A smooth function is a function that has continuous derivatives up to some desired order over
some domain.

Helmholtz free energy


The Helmholtz free energy is defined as:

A ≡ U − TS (1.21)

In terms of heat and mechanical work, since dU = δQ − δW:

dA = dU − d(TS) = δQ − δW − T dS − S dT

Hence,

δW = (δQ − T dS) − S dT − dA (1.22)

On the other hand, for a reversible transformation we have

δQ = T dS

which implies

δW = −S dT − dA (1.23)

If the reversible transformation is also isothermal, dT = 0 and we obtain δW = −dA:
the work done by the system equals the decrease of A, which is reminiscent of a potential energy.
Remark. For an isothermal but not reversible (spontaneous) process we know from the 2nd
Law of Thermodynamics that

δQ ≤ T dS

which implies

(δW)_irr = (δQ − T dS) − dA ≤ −dA . (1.24)

Hence, if δW = 0 and dT = 0, we have dA ≤ 0. Therefore, in a spontaneous
(irreversible) process at fixed T, V, N, etc., the thermodynamic system evolves
towards a minimum of the Helmholtz free energy A = A(T, V, N1, . . . , Nr).
In the case of a (P, V, T) system, we have:

dA = −S dT − P dV + Σ_j µj dNj (1.25)

where

−S = (∂A/∂T)_{V,Nj} (1.26a)
−P = (∂A/∂V)_{T,Nj} (1.26b)
µj = (∂A/∂Nj)_{T,V} (1.26c)

For a magnetic system (\vec{H}, \vec{M}, T):

dA = −S dT + \vec{H} · d\vec{M} + Σ_j µj dNj (1.27)

with

Hα = (∂A/∂Mα)_{T,{Nj}} (1.28)

Enthalpy
The enthalpy is the partial Legendre transform of U that replaces the volume
V with the pressure P as independent variable.
Consider U = U(S, V, N1, . . . , Nr) and −P = ∂U/∂V; we define the enthalpy as

H = U + PV (1.29)

Remark. Note that the plus sign in the definition of the enthalpy comes from the
minus sign in the conjugate variable −P = ∂U/∂V.
We have:

dH = dU + P dV + V dP
   = T dS − P dV + Σ_j µj dNj + P dV + V dP (1.30)
   = T dS + V dP + Σ_j µj dNj

Finally, we obtain the relation H = H(S, P, N1, . . . , Nr).

Gibbs potential
The Gibbs potential is obtained by performing the Legendre transform of U to
replace S and V with T and P.
Consider again U = U(S, V, N1, . . . , Nr) with T = ∂U/∂S and −P = ∂U/∂V; then we have:

G = U − TS + PV = A + PV (1.31)

For a simple fluid system

dG = dU − T dS − S dT + P dV + V dP
   = T dS − P dV + Σ_j µj dNj − T dS − S dT + P dV + V dP (1.32)
   = −S dT + V dP + Σ_j µj dNj

Hence, G = G(T, P, N1, . . . , Nr).
For a magnetic system, the Gibbs potential is defined as

G = A − \vec{M} · \vec{H} (1.33)

and

dG = dA − d(\vec{M} · \vec{H}) = d(U − TS) − d(\vec{M} · \vec{H})
   = dU − T dS − S dT − \vec{H} · d\vec{M} − \vec{M} · d\vec{H} (1.34)
   = T dS + \vec{H} · d\vec{M} − T dS − S dT − \vec{H} · d\vec{M} − \vec{M} · d\vec{H}
   = −S dT − \vec{M} · d\vec{H}

and finally G = G(T, \vec{H}), and also

S = −(∂G/∂T)_{\vec{H}} (1.35a)
\vec{M} = −(∂G/∂\vec{H})_T (1.35b)

Grand canonical potential
Lecture 2. Friday 11th October, 2019.

The grand canonical potential is obtained by performing the Legendre trans-
form of U to replace S and N with T and µ. The corresponding Legendre transform
is

Ω = U − TS − Σ_{i=1}^{r} µi Ni = A − Σ_{i=1}^{r} µi Ni (1.36)

Differentiating this relation we obtain:

dΩ = dU − T dS − S dT − Σ_{j=1}^{r} (µj dNj + Nj dµj)
   = −S dT − P dV − Σ_{j=1}^{r} Nj dµj (1.37)

where we used dU = T dS − P dV + Σ_j µj dNj. Hence, Ω = Ω(T, V, {µj}).
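A compact identity worth noting (a standard result, anticipating the Euler equation (1.38) derived in the next section): for a simple fluid

Ω = U − TS − Σ_j µj Nj = (TS − PV + Σ_j µj Nj) − TS − Σ_j µj Nj = −PV

so the grand potential is directly related to the pressure.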

1.5 Maxwell relations


Internal energy U and entropy S are homogeneous functions of the first order. A
consequence of this fact is the relation called Euler equation:

U = TS − PV + Σ_j µj Nj (1.38)

Example 3: How to derive the Euler equation

Using the additive property of the internal energy U, we can derive a useful
thermodynamic relation, the Euler equation.

U(λS, λV, λN1, . . . , λNm) = λ U(S, V, N1, . . . , Nm)

Let us differentiate this "extensivity condition" with respect to λ:

(∂U(λS, . . .)/∂(λS)) S + (∂U(λS, . . .)/∂(λV)) V + Σ_{i=1}^{m} (∂U(λS, . . .)/∂(λNi)) Ni = U(S, V, N1, . . . , Nm)

Setting λ = 1 in the above equation, we obtain:

(∂U/∂S) S + (∂U/∂V) V + (∂U/∂N1) N1 + · · · + (∂U/∂Nm) Nm = U

Using the definition of the intensive parameters, we arrive at the Euler equation:

U = TS − PV + Σ_{i=1}^{m} µi Ni

Instead, the Maxwell relations are relations between the mixed derivatives
of the thermodynamic potentials. They can be obtained from the expressions of
dU, dH, dA, dG and dΩ and from the Schwarz theorem on mixed partial deriva-
tives.
Due to the Schwarz theorem, if a thermodynamic potential depends on t + 1 variables
there will be t(t+1)/2 independent mixed derivatives.

Example 4: Internal energy U = U(S, V, N)

dU = T dS − P dV + µ dN (1.39)

where

T = (∂U/∂S)_{V,N}    −P = (∂U/∂V)_{S,N}

It implies that

(∂T/∂V)_{S,N} = ∂²U/∂V∂S = ∂²U/∂S∂V = −(∂P/∂S)_{V,N}

where the middle equality follows from the Schwarz theorem. Therefore, we have the
first Maxwell relation:

(∂T/∂V)_{S,N} = −(∂P/∂S)_{V,N}

All the 3 Maxwell relations obtained from the differential (1.39) with t = 2, for
which we have t + 1 = 3 and t(t+1)/2 = 3 (variables [S, V, N]), are

(S, V):  (∂T/∂V)_{S,N} = −(∂P/∂S)_{V,N} (1.40a)
(S, N):  (∂T/∂N)_{V,S} = (∂µ/∂S)_{V,N} (1.40b)
(V, N):  −(∂P/∂N)_{S,V} = (∂µ/∂V)_{S,N} (1.40c)

Example 5: Helmholtz A = A(T, V, N)

dA = −S dT − P dV + µ dN (1.41)

In this case the 3 Maxwell relations (variables [T, V, N]) are

(T, V):  (∂S/∂V)_{T,N} = (∂P/∂T)_{V,N} (1.42a)
(T, N):  −(∂S/∂N)_{T,V} = (∂µ/∂T)_{V,N} (1.42b)
(V, N):  −(∂P/∂N)_{V,T} = (∂µ/∂V)_{T,N} (1.42c)
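As a quick consistency check of (1.42a), here is a minimal sympy sketch; the explicit ideal-gas form of A used below is an illustrative assumption of mine (constants absorbed), not an expression from the lectures:

import sympy as sp

T, V, N, k = sp.symbols('T V N k', positive=True)
# Helmholtz free energy of an ideal gas, up to terms independent of T and V
A = -N*k*T*(sp.log(V/N) + sp.Rational(3, 2)*sp.log(T) + 1)

S = -sp.diff(A, T)   # S = -(dA/dT)_V
P = -sp.diff(A, V)   # P = -(dA/dV)_T, gives N k T / V

lhs = sp.diff(S, V)  # (dS/dV)_T
rhs = sp.diff(P, T)  # (dP/dT)_V
print(sp.simplify(lhs - rhs))   # prints 0: the Maxwell relation (1.42a) holds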

Example 6: Gibbs G = G(T, P, N)

dG = −S dT + V dP + µ dN (1.43)

In this case the 3 Maxwell relations (variables [T, P, N]) are

(T, P):  −(∂S/∂P)_{T,N} = (∂V/∂T)_{P,N} (1.44a)
(T, N):  −(∂S/∂N)_{T,P} = (∂µ/∂T)_{P,N} (1.44b)
(P, N):  (∂V/∂N)_{P,T} = (∂µ/∂P)_{T,N} (1.44c)

1.6 Response functions


Response functions are quantities that express how a system reacts when some
external parameters are changed.
In fact, the aim of most experiments is to measure the response of a thermodynamic
system with respect to controlled variations of the thermodynamic variables. Any obser-
vation is essentially a perturbation of the system followed by a measurement of its
response. A list of the commonly used response functions is the following:

• Thermal expansion coefficient at constant pressure:

  α_P ≡ (1/V)(∂V/∂T)_{P,N} (1.45)

• Adiabatic compressibility:

  k_S = −(1/V)(∂V/∂P)_{S,N} = −(1/V)(∂²H/∂P²)_{S,N} (1.46)

  where we used V = (∂H/∂P)_{S,N}.

• Isothermal compressibility:

  k_T = −(1/V)(∂V/∂P)_{T,N} = −(1/V)(∂²G/∂P²)_{T,N} (1.47)

  where we used V = (∂G/∂P)_{T,N}.

  Remark. Remember that k_T is (apart from the factor −1/V) the second derivative of the
  Gibbs potential with respect to pressure.

• Molar heat capacity at constant pressure:

  c_P = (δQ/dT)_{P,N} = T (∂S/∂T)_{P,N} = −T (∂²G/∂T²)_{P,N} (1.48)

  where we used −S = (∂G/∂T)_{P,N}.

• Specific heat at constant volume (consider a quasi-static transformation):

  c_V = (δQ/dT)_{V,N} = T (∂S/∂T)_{V,N} = T (∂(−∂A/∂T)_{V,N}/∂T)_{V,N} = −T (∂²A/∂T²)_{V,N} (1.49)

• Magnetic susceptibility (d = 1) for a magnetic system (M, H, T):

  χ_T = (∂M/∂H)_T = −(∂²G/∂H²)_T (1.50)

  where we used M = −(∂G/∂H)_T. More generally, for \vec{M}, \vec{H} we have

  χ_{αβ} = (∂Mα/∂Hβ)_T,   Mα = −(∂G/∂Hα)_T  ⇒  χ_{αβ} = −(∂²G/∂Hβ ∂Hα)_T (1.51)

Remark. Note that the response functions, when used with the Maxwell relations, allow
one to express observables that are usually inaccessible to experiments in terms of
measurable quantities.
Let us illustrate a lemma useful for calculations:
Lemma 1
Let x, y, z be quantities that satisfy the relation f(x, y, z) = 0. If w is a function
of any two variables chosen among x, y, z, then:

1. (∂x/∂y)_w (∂y/∂z)_w = (∂x/∂z)_w

2. (∂x/∂y)_z = 1/(∂y/∂x)_z

3. (∂x/∂y)_z (∂y/∂z)_x (∂z/∂x)_y = −1 (concatenation relation or triple product rule).
Example 7
The Maxwell relation

(∂S/∂P)_{T,N} = −(∂V/∂T)_{P,N}

obtained from

dG = −S dT + V dP

(the (T, P) equation), used together with the response function α_P, permits to write

(∂S/∂P)_{T,N} = −V α_P (1.52)

where the left-hand side is inaccessible to experiments while the right-hand side is measurable.

Example 8
Let us start with the Maxwell relation

(∂S/∂V)_{T,N} = (∂P/∂T)_{V,N}

obtained from the (T, V) equation

dA = −S dT − P dV

From multi-variable differential calculus one has the triple product rule:

(∂P/∂T)_{V,N} (∂V/∂P)_{T,N} (∂T/∂V)_{P,N} = −1

Hence

(∂P/∂T)_{V,N} = −1/[(∂V/∂P)_{T,N} (∂T/∂V)_{P,N}] = −(∂V/∂T)_{P,N}/(∂V/∂P)_{T,N} = (−V α_P)/(−V k_T) = α_P/k_T (1.53)

1.6.1 Response functions and thermodynamic stability


Now, we analyze the concept of thermal stability. If one injects heat into a system,
either at constant volume or at constant pressure, its temperature will inevitably
increase:

c_V ≡ (δQ/dT)_V ≥ 0
c_P ≡ (δQ/dT)_P ≥ 0 (1.54)

Remark. The thermal capacities are non-negative functions!

The concept of mechanical stability is also useful. If one compresses a system
keeping T constant, we expect that it shrinks:

k_T = −(1/V)(∂V/∂P)_T ≥ 0 (1.55)

Similar considerations for a magnetic system give

c_H ≥ 0,  c_M ≥ 0,  χ_T ≥ 0 (1.56)

Remark. In diamagnetic systems the susceptibility χ_T can also be negative.

Exercise 1
By using Maxwell relations show that

c_P − c_V = T V α_P² / k_T = (T/(V k_T)) (∂V/∂T)_P² (1.57a)
c_H − c_M = (T/χ_T) (∂M/∂T)_H² (1.57b)

Solution. Let us start by considering a system with a fixed number of particles
(namely dN = 0) and such that S is explicitly expressed in terms of T and V.
Then:

dS = (∂S/∂T)_V dT + (∂S/∂V)_T dV

Dividing both sides by dT keeping the pressure constant, and then multiplying
by T:

T (∂S/∂T)_P − T (∂S/∂T)_V = T (∂S/∂V)_T (∂V/∂T)_P

it implies

c_P − c_V = T (∂S/∂V)_T (∂V/∂T)_P

Now, using the Maxwell relation (∂S/∂V)_T = (∂P/∂T)_V and the triple product
rule

(∂P/∂T)_V = −(∂P/∂V)_T (∂V/∂T)_P

we get:

c_P − c_V = −T (∂P/∂V)_T (∂V/∂T)_P² = −T (∂P/∂V)_T α_P² V² = (T V / k_T) α_P²

It can be shown similarly for magnetic systems.
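A quick numerical check of (1.57a) for the ideal gas (an illustrative example: for PV = NkT one has α_P = 1/T and k_T = 1/P, so the combination T V α_P²/k_T should reduce to N k_B):

N_k = 1.0                  # N * k_B in arbitrary units
T, P = 300.0, 1.0e5        # temperature (K) and pressure (Pa)
V = N_k * T / P            # ideal-gas volume

alpha_P = 1.0 / T          # (1/V)(dV/dT)_P for the ideal gas
k_T = 1.0 / P              # -(1/V)(dV/dP)_T for the ideal gas

cp_minus_cv = T * V * alpha_P**2 / k_T
print(cp_minus_cv, N_k)    # both print 1.0: c_P - c_V = N k_B for the ideal gas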



A consequence is that, since the right-hand sides are non-negative, it follows
that

c_P ≥ c_V ≥ 0
c_H ≥ c_M ≥ 0 (1.58)

To summarize, we have seen the thermodynamics of a single phase, where the equilibrium
state can be described by the maximum of the entropy. For a given phase, we
can look at the Gibbs function. When more phases are present, we want to study how
the system changes between these phases.
Chapter 2

Equilibrium phases and thermodynamics of phase transitions

2.1 Equilibrium phases as minima of Gibbs free energy


Experimentally, any element or compound can be found, depending on the ther-
modynamic conditions, in different phases. When we say that a system
is in a particular phase we mean that its physical properties (like density or magne-
tization) are uniform.
Equilibrium states are given by maxima of the entropy and minima of the internal
energy, or by minima of thermodynamic potentials such as A and G. Let us consider
for example the Gibbs potential per particle of a fluid system

G/N ≡ g = g(T, P) (2.1)

which depends on the two intensive variables T and P and is no longer a function of N
because we have divided by N. Let us label by α the phase of a one-component system
(say α = gas or liquid). Therefore, the thermodynamic properties are described by
the surfaces gα(T, P): for each equilibrium phase we have a surface in the
(T, P, g) space. For given values of T and P, the thermodynamically stable phase is the
one for which gα(T, P) is minimum.

2.2 First order phase transition and phase coexistence


Let us suppose for example that the system can be found in two phases α and β
(for example liquid and solid). Consider the surfaces gα and gβ: we are looking for the
lower one.
For given values of T and P, the stable phase will be the one with the lowest value
of g: for example, if gα(T, P) < gβ(T, P) then the system will be in phase
α. Therefore there will be regions in (T, P) space where the most stable phase is
α and others in which it is β. If we now plot the values of g as a function of T
and P in (g, P, T) space for every phase of the system, we can determine the regions
where the two phases are the stable ones, namely we can determine the phase
diagram of the system, as illustrated in Figure 2.1.
The most interesting region of this space (and the one on which we will focus our
attention in this section) is the line where the surfaces of the two phases intersect:
along this line the two phases coexist, and when the system crosses it we say that it
undergoes a phase transition. The coexistence line is the projection onto the (T,P)

plane of the intersection between different surfaces, so the coexistence condition is:
gα (T, P ) = gβ (T, P ) (2.2)

Figure 2.1: Phase diagram: stability of phases.

To fix the ideas, let us choose a given value of pressure P = P ∗ and study the
behavior of g(T, P ∗ ) as a function of T when we go from solid to gas, as illustrated
in Figure 2.2.

Figure 2.2: (T, P) projection.

The existence of a critical point has a very intriguing consequence: since the
liquid-gas coexistence line ends in a point, a liquid can be transformed continuously
into a gas (and vice versa), in such a way that the coexistence of the liquid and
gaseous phases is never encountered.

Figure 2.3: (g, T) projection at a fixed pressure P = P∗. The red line is the coexistence
line of the two phases α and β.

At the coexistence line, g_solid(Ta, P∗) = g_liq(Ta, P∗) and g_liq(Tb, P∗) = g_gas(Tb, P∗), as
shown in Figure 2.3.
Note also that:

• At the coexistence points a and b, the two phases have gα(T) = gβ(T).

• g(T) is a continuous function of T.

• Since S = −(∂G/∂T)_P and c_P = −T(∂²G/∂T²)_P > 0, g(T) is
  concave in T at fixed P.

How about its derivatives? Since P is fixed we can vary T and look at s = −(∂g/∂T)_P.
As we cross the different phases we have discontinuities in s, and T∆s is called the latent
heat. This is illustrated in Figure 2.4.
If there is a finite discontinuity in one, or more, of the first derivatives of the
appropriate thermodynamic potential, the transition is called a first order transition. In
general, a phase transition is signaled by a singularity in a thermodynamic potential.
We can also fix the temperature T = T∗ and look at the variation with P, as shown
in Figure 2.5. Note that we have v = (∂g/∂P)_T > 0 and

(∂²g/∂P²)_T = (∂v/∂P)_T = −v k_T < 0 (2.3)

so, also in this case, we have a jump of the first order derivative of the thermodynamic
potential g. It is illustrated in Figure 2.6.

f ixed P
gas

liquid

∆sT = latent
heat

solid

Ta Tb T

Figure 2.4: (s, T ) projection.

Figure 2.5: Left: (T, P) projection. Right: (g, P) projection at a fixed temperature T = T∗.

Figure 2.6: (v, P) projection.



2.2.1 Critical points


At the critical point (Pc, Tc) the system can pass from the liquid to the gas phase
(and vice versa) in a continuous way:

∆s = ∆v = 0

Usually, critical points are end points of first-order transition lines. Why is there
no critical point between solid and liquid? The crossover between phases having the
same symmetry defines the Landau point. Between solid and liquid there is a breaking of
symmetry: for instance, one can think of the structure of the Bravais lattice. Instead,
from gas to liquid no symmetry is broken.

Figure 2.7: Phase diagram of a fluid. All the phase transitions are first-order except at
the critical point C. Beyond C it is possible to move continuously from a liquid to a gas. The
boundary between the solid and liquid phases is thought to be always first-order and not to
terminate in a critical point.

2.2.2 Ferromagnetic system


A similar behaviour is encountered in magnetic systems. We can have a
magnetization different from zero even when there is no magnetic field. With the
correspondences P ↔ H and V ↔ M, we have (P, T) ↔ (H, T).
The magnetization M has a jump at H = 0 for temperatures lower than the
critical one; in this case, since M = −∂F/∂H, we see that the first derivative of the free
energy F with respect to H has a jump discontinuity. For instance, consider Figure
2.9. At the critical point the magnetization passes through zero.

Figure 2.8: Phase diagram for a magnetic system in (T, H) space. A line of first-order
transitions at zero field ends in a critical point at a temperature Tc.

Figure 2.9: Plot of the magnetization for T = T∗ < Tc.



2.3 Second order phase transition


Transitions are classified into first-order transitions and continuous transitions.
If the first derivatives are continuous, but second derivatives are discontinuous or
infinite, the transition is described as higher order, continuous or critical. This
is different from the previous situation, in which we had a jump in the first-order
derivative of a thermodynamic potential. Some examples are illustrated in Figure
2.10.
Let us suppose that

(∂g/∂T)_P = −s (2.4a)
(∂g/∂P)_T = v (2.4b)

are continuous. We suppose also that

∂²g/∂T∂P = (∂v/∂T)_P = v α_P (2.5)

is discontinuous. An example is superconductivity.

Figure 2.10: Example of a second order phase transition: (a) thermodynamic potential g; (b) continuous s; (c) continuous v; (d) discontinuous c_P.

If we look for example at the specific heat c_P in Figure 2.10d, it represents, for instance, the normal-superconducting transition.

The critical point is special because there is no jump, so we can go continuously
from gas to liquid. The response functions show, however, that the specific heat
diverges at this point.
The superfluid transition is a transition where the second derivative of the thermo-
dynamic potential diverges. There are many phase transitions, and they can be classified
in different ways.
Remark. Note that along the coexistence line we can increase V while the pressure remains
constant. At coexistence we see bubbles: it is the density that changes
locally; the bubbles become bigger and bigger and, at V_G, the whole system has become gas.

2.3.1 Helmholtz free-energy

Figure 2.11: Helmholtz free-energy and phase transition. (a) (V, A) projection at fixed T. (b) (P, V) projection.

Consider A = A(T, V, N): here P is replaced by V, which has a discontinuous
derivative at the first-order transition. Moreover, P > 0 implies ∂A/∂V < 0 and

k_T = −(1/V)(∂V/∂P)_T = −(1/V) 1/(∂P/∂V)_T = 1/[V (∂²A/∂V²)_T] > 0 (2.6)

so A is an overall convex function of V. The behaviour of A when there is a first-
order phase transition is as in Figure 2.11a. The linear sector becomes a horizontal
one in the P = −(∂A/∂V)_T = P(V) curve (Figure 2.11b).

2.4 Thermodynamic of phase coexistence


2.4.1 Lever Rule
Lecture 3. Wednesday 16th October, 2019.

The lever rule [2] is a rule used to determine the mole fraction of each phase in
a binary equilibrium phase diagram. For instance, it can be used to determine the
fraction of liquid and solid phases for a given binary composition and temperature
that lies between the liquid and solid lines.
In an alloy or a mixture with two phases, α and β, which themselves contain two
elements, A and B, the lever rule states that the mass fraction of the α phase is

w^α = (w_B − w_B^β)/(w_B^α − w_B^β) (2.7)

where

• w_B^α is the mass fraction of element B in the α phase.

• w_B^β is the mass fraction of element B in the β phase.

• w_B is the mass fraction of element B in the entire alloy or mixture.

Example 9
Consider Figure 2.12; at all points between A and B the system is a mixture
of gas and liquid. Point D has a global density ρ_D, with specific volumes
v_D = 1/ρ_D, v_A = 1/ρ_A, v_B = 1/ρ_B, which implies:

v_D = (N_A/N) v_A + (N_B/N) v_B = x_A v_A + x_B v_B

Since x_A + x_B = 1 we have (x_A + x_B) v_D = x_A v_A + x_B v_B and finally, by rear-
ranging, one finds the Lever Rule. It shows that the relative concentration of
the liquid-gas mixture changes with V:

x_A/x_B = (v_B − v_D)/(v_D − v_A)

Figure 2.12: (V, P) projection. In the region between A and B the gas and the liquid
phase coexist while the pressure stays constant.
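A tiny numerical illustration of this rule (the specific volumes below are made-up values, chosen only for the example):

v_B, v_A = 1.0e-3, 1.7e-2   # v_B: liquid branch, v_A: gas branch (m^3/mol), made-up
v_D = 5.0e-3                # global specific volume of the mixture

x_A = (v_D - v_B) / (v_A - v_B)   # fraction of phase A
x_B = (v_A - v_D) / (v_A - v_B)   # fraction of phase B
print(x_A + x_B)                              # 1.0: the fractions sum to one
print(x_A / x_B, (v_B - v_D) / (v_D - v_A))   # both 1/3: the lever rule of Example 9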

2.4.2 Phase coexistence (one component system)


Consider a (P, V, T) system as a mixture of two phases (1, 2) at temperatures
T1, T2, pressures P1, P2 and chemical potentials µ1, µ2. The equilibrium condition is
given by the maximum of the total entropy S = S1 + S2 and gives the conditions

T1 = T2,   P1 = P2,   µ1 = µ2 (2.8)

This is the coexistence condition of the two phases.

In terms of the Gibbs potential G = U − TS + PV, where U is given by the Euler
equation U = TS − PV + µ1 N1 + µ2 N2, the Gibbs potential per mole is

g1(T, P) ≡ G1/N1 = µ1 (2.9a)
g2(T, P) ≡ G2/N2 = µ2 (2.9b)

Therefore, on the coexistence line the following relation should hold:

g1(T, P) = g2(T, P) (2.10)

2.4.3 Clausius-Clapeyron equation


The coexistence curves [3], as the one illustrated in Figure 2.13, are less arbi-
trary than is immediately evident; the slope dP / dT of a coexistence curve is fully
determined by the properties of the two coexisting phases.
The slope of a coexistence curve is of direct physical interest. Consider cubes of
ice at equilibrium in a glass of water. Given the ambient pressure, the temperature of
the mixed system is determined by the liquid-solid coexistence curve of water; if the
temperature were not on the coexistence curve some ice would melt, or some liquid
would freeze, until the temperature would again lie on the coexistence curve (or one
phases would become depleted). If the ambient pressure were to decrease perhaps,
by virtue of a change in altitude, then the temperature of the glass of water would
appropriately adjust to a new point on the coexistence curve. If ∆P were the change
in pressure, then the change in temperature would be ∆T = ∆P/(dP / dT )coex ,
where the derivative in the denominator is the slope of the coexistence curve.
Remark. Ice skating presents another interesting example. The pressure applied to
the ice directly beneath the blade of the skate shifts the ice across the solid-liquid
coexistence curve, providing a lubricating film of liquid on which the skate slides. The
possibility of ice skating depends on the negative slope of the liquid-solid coexistence
curve of water.
Now, suppose we know one point on the coexistence line (for example the melting
temperature Tm at atmospheric pressure P0). Is it possible to find other points
on the curve, for example Tm at lower or higher pressure?
The answer is yes for small deviations of T and P from a. The idea is to compute
the slope of the tangent to the coexistence curve, i.e. (dP/dT). This is given by the
Clausius-Clapeyron equation. Both at a and b the two phases 1 and 2 coexist. This
means that at the coexistence line

g1^(a) = g2^(a)
g1^(b) = g2^(b) (2.11)

Hence, if a and b are very close:

dg1 = g1^(b) − g1^(a)
dg2 = g2^(b) − g2^(a) (2.12)

Therefore, the starting point for Clausius-Clapeyron is

dg1 = dg2 (2.13)

From the molar version of the Gibbs-Duhem relation, we have

dg1 = −s1 dT + v1 dP = dµ1
dg2 = −s2 dT + v2 dP = dµ2 (2.14)

Figure 2.13: (T, P) projection. The coexistence line is represented in red, while the green
segment shows the slope between the two points a and b.

Taking the difference, one obtains

−(s2 − s1) dT + (v2 − v1) dP = 0

The resulting slope is called the Clausius-Clapeyron equation:

(dP/dT)_coex = (s2 − s1)/(v2 − v1) = ∆s/∆v (2.15)

Remark. Since (dP/dT)_coex is finite, the equation explains why a first order tran-
sition is characterised by discontinuous changes in entropy and volume (or density).
∆s gives the latent heat L12¹:

L12 = T ∆s (2.16)

whence the Clapeyron equation is

dP/dT = L12/(T ∆v) (2.17)

2.4.4 Application of C-C equation to the liquid-gas coexistence line


Now, if we go from gas (region 2) to liquid (region 1), we have:

(dP/dT)_coex = (s2 − s1)/(v2 − v1)

The Clapeyron equation embodies the Le Chatelier principle². Consider a liquid-
gas transition (the coexistence curves are shown in Figure 2.14):

(dP/dT)_coex > 0  ⇒  (s2 − s1)/(v2 − v1) > 0

¹ The latent heat of fusion is the quantity of heat required to melt one mole of solid.
² "When a settled system is disturbed, it will adjust to diminish the change that has been made
to it."

Figure 2.14: (T, P ) projection. Region 1: liquid. Region 2: gas. The lines represent the
combinations of pressures and temperatures at which two phases can exist in equilibrium.

and since v2 > v1, we have s2 > s1: the gas has more entropy, as it should.
If the slope of the phase curve is positive, then an increase in pressure at constant
temperature tends to drive the system to the more dense phase, and an increase
in temperature tends to drive the system to the more entropic phase.
When going from a low-temperature phase to a high-temperature phase the entropy
always increases, ∆S > 0, because c_P ≡ T(∂S/∂T)_P > 0.
The sign of ∆V is more uncertain, though. To see this point, let us consider the
C-C equation at the solid-liquid coexistence curve (now solid is region 1 and liquid region 2).
At the melting temperature:

(dP/dT)_coex = δQ_melt/(T_melt ∆v_melt),   δQ_melt = Q_liq − Q_solid > 0

In general, ∆v_m = v_liq − v_solid > 0, which implies (dP/dT)_coex > 0. There are
cases, however, where ∆v_m = v_liq − v_solid < 0 because ρ_liq > ρ_solid (for instance
H2O, or also silicon and germanium). The paradigmatic example is the freezing
of water, where v_ice > v_liq since ice is less dense than liquid water at coexistence
(0 < T < 4 °C). This implies that dP/dT < 0.
Example 10: Melting point on Everest
Consider T = 237K and P = P0 . If we suppose that

δQm = 6.01kJ/mol, ∆v = −1.7cm3 /mol

we have
dP δQm 6.01103 J/mol
= = = −1.29 · 104 J/m3 = −1.29bar/K
dT T ∆v 273 · (−1.7cm3 /mol)

∆P (P0 − PEverest ) (1 − 0.36)atm


⇒ ∆T = = = = −0.5◦C
(−1.29P a/K) (−1.29P a/K) (−1.29P a/K)
⇒ Tm (Everest) = Tm (P0 ) + 0.5◦C

Example 11: Boiling point on Everest


Let us consider
PEverest = 0.36atm, ρ(T = 100◦C) = 0.598kg/m3 , Lgl = 2.257 · 103 J/g
The density of the vapour (gas) is about 1000 less than water (liquid), it implies
2.5. Order parameter of a phase transition 27

that:
1
∆V = Vg − Vl ≈ Vg =
ρg
We have:
dP Lge Lge ρg 2.25 · 103 J/g · 0.593kg/m3 3.6 103 J kg Pa
= = = = 3
= 3.6·103
dT T ∆V T 373K K g m K

⇒ ∆T ≈ ∆P/(3.6103 P a/K) = 18◦C


⇒ T0 − TEverest = 18◦C ⇒ TEverest ≈ 80◦C

2.5 Order parameter of a phase transition


An order parameter is a measure of the degree of order across the boundaries in a
phase transition system. In particular, order parameters are macroscopic observable
that are equal to zero above the critical temperature, and different from zero below:
(
6= 0 T < Tc
Op = (2.18)
= 0 T → Tc−
When a phase transition implies a breaking of a phase symmetry, the order parameter
is related to this symmetry. Therefore, the order parameter reflects the symmetry of
the system. Recall that, at Tc the system has a symmetry broken.
For instance, consider the densities of liquid and gas and the related order param-
eter of the gas-liquid transition ∆ρ = ρl − ρg , that is 6= 0 for T 6= Tc but → 0 when
T → Tc (see Figure 2.15).

solid

liquid
ρl

ρg
gas

Tc T

Figure 2.15: (T, ρ) projection of the (P, V, T ) system, where ρ = N/V .

Remark. Note that ρ = N V = v hence either N or V varies.


1

In Figure 2.16 is shown the behaviour for a ferromagnetic system. We have


(
M 6= 0 T < Tc
H=0⇒
M → 0 T → Tc−
Clearly M 6= 0 if H 6= 0. Recall that M is the order parameter of the paramagnetic-
ferromagnetic phase transition.
28 Chapter 2. Equilibrium phases and thermodynamics of phase transitions

H>0

Tc

H=0 T

H<0

Figure 2.16: Magnetization of a ferromagnet. In red: zero-field magnetization. Below the


critical temperature there is a spontaneous magnetization.

Variable conjugate to OP
~ →H
• Ferromagnetic system: M ~ (magnetic field).

~ →E
• Ferroelectric: P ~ (electric field).

~ H.
• Liquid crystals: Qαβ → E, ~

• Fluid : V → P (pressure), or ρ → µ.

2.6 Classification of the phase transitions


2.6.1 Thermodynamic classification
Thermodynamically, one can distinguish two kinds of phase transitions:

1. Ones who develop latent heat.

2. Ones who do not develop latent heat. The entropy changes continuously at the
transition.

2.6.2 Eherenfest classification


The Eherenfest classification is based on the behaviour of the derivatives of the
thermodynamic potentials.
A phase transition is of order n if all the (n − 1) derivatives are continuous and
the nth derivative displays a finite discontinuity.
Example 12
For instance, a first order transition in which S = −(∂G/∂T )P has finite dis-
continuity.

Remark. There are first order transitions where S is continuous (no latent heat), but
ρ is discontinuous (v = (∂G/∂P )T ).
2.7. Critical exponents 29

Example 13
Second order transition. The specific heat displays a finite jump, see Figure
2.17c in the conductor-superconductor transition.
Another example is a second order transition but with divergence. Consider the
fluid-superfluid transition (or λ transition) of the He4 (Figure 2.17d).

χm cp
(cH )

f lex point
diverges

Tc T Tc T
(a) (b) Liquid gas kT = − V1 ∂V
∂P

cp C

jump diverges

Ts T Tλ T
(c) 2° order phase transition (d) Superfluid in λ-transition

Figure 2.17: Plots of response functions.

Remark. λ transition: a second-order or higher-order transition, in which the heat


capacity shows either a discontinuity (second-order) or a vertex (higher-order) at the
transition temperature. It is so named because the shape of the specific heat versus
temperature curve resembles the Greek letter λ.

2.6.3 Modern classification


A phase transition is of the first order if exists a finite discontinuity in either
one or more partial derivatives of the thermodynamic potentials. Instead, if the first
derivatives are all continuous, but the second are either discontinuous, or infinite, one
talks of continuous transitions. A critical point is a continuous transition.

2.7 Critical exponents


At the critical point response functions may diverge. How are these divergence?
In general, when you are close to Tc , there are singolarities. Now, we can ask, how
the curve diverges? What is the behaviour close to the critical point? Power law, so
which are the values of these critical exponents?
30 Chapter 2. Equilibrium phases and thermodynamics of phase transitions

2.7.1 Divergence of the response functions at the critical point


While at the critical point the order parameter goes to zero continuously as T →
Tc− , the response function may develop divergences.
Example 14
In a fluid system since at T = Tc the curve P = P (V ) develops an horizontal
flex (Figure 2.18), we have kT = − V1 ∂V ∞. Similarly, in a magnetic since

∂P T →
the curve is like Figure 2.16, we have χT = ∂H T → ∞.
∂M

T →Tc

f lex

Figure 2.18: (V, T ) projection.

2.7.2 Critical exponents definition


The notion of critical exponent describes the behaviour of the order parameter
and the response functions in proximity of the critical point. In order to answer to
these questions, let us define:
Definition 1: Critical Exponent, or Scale Exponent
Let us define the adimensional parameter measuring the distance from the crit-
ical point t ≡ T −T
Tc , the critical exponent λ associated to the function F (t) is
c

defined as:
ln |F (t)|
λ± = lim (2.19)
t→0± ln |t|

We note that it behaves like a power law. One can also write the power law :

t→0±
F (t) ∼ |t|λ± (2.20)

More generally, for t  1:

F (t) = A|t|λ± (1 + btλ1 + . . . ), λ1 > 0 (2.21)

where all other terms are less important.


Definition 2: Thermodynamic Critical Exponents
• Exponent β : tells how the order parameter goes to zero. Consider Figure
t→0−
2.19a, we have M ∼ (−t)β . No sense in going from above (t → 0+ )
where it stays 0.
2.7. Critical exponents 31

• Exponent γ± (susceptibility): related to the response function. Consider


t→0±
Figure 2.19b, we have χT ∼ |t|−γ± . In principle, the value of γ can
depend on the sign of t i.e. γ + =
6 γ − , but they are the same in reality and
we have γ + = γ − = γ.

• Exponent α± : how specific heat diverges (second order derivative in


respect of T ). For instance see Figure 2.19c, we have cH ∼ |t|−α± .

• Exponent δ : in this case one consider the isotherm T = Tc and look for
the behaviour of M at the critical point at small H (or viceversa). The
result is M ∼ H 1/δ . In Figure 2.19d, H ∼ |M |δ sign(M ).

M χ

H=0

H=0

Tc T Tc T
(a) Exponent β. (b) Exponent γ± .

cH M
T = Tc

H=0

Tc T
(c) Exponent α± . (d) Exponent δ.

Figure 2.19

Zero-field specific heat CH ∼ |t|−α


Zero-field magnetization M ∼ (−t)β
Zero-field isothermal susceptibility χT ∼ |t|−γ
Critical isotherm (t = 0) H ∼ |M |δ sign(M )
Correlation length ξ ∼ |t|−ν
Pair correlation function at Tc 1
G(~r) ∼ rd−2+η

Table 2.1: Definitions of the most commonly used critical exponents for a magnetic system
[4].

Remark. In compiling Table 2.1 and 2.2 we have made the as yet totally unjustified
32 Chapter 2. Equilibrium phases and thermodynamics of phase transitions

Specific heat at constant volume Vc CV ∼ |t|−α


Liquid-gas density difference (ρl − ρg ) ∼ (−t)β
Isothermal compressibility kT ∼ |t|−γ
Critical isotherm (t = 0) P − Pc ∼ |ρl − ρg |δ sign(ρl − ρg )
Correlation length ξ ∼ |t|−ν
Pair correlation function at Tc 1
G(~r) ∼ rd−2+η

Table 2.2: Definitions of the most commonly used critical exponents for a fluid system [4].

assumption that the critical exponent associated with a given thermodynamic variable
is the same as T → Tc from above or below.

2.7.3 Law of the corresponding states


The system displays correlation at very long distance, these goes to the size of the
system when T → Tc . We are talking about long range correlation. The correlation
function is ξ ∼ t−ν . For instance, consider a polymer as in Figure 2.20a.
Having defined the critical exponents, we need to justify why they are interesting
and why they are more interesting than the critical temperature Tc itself. It turns out
that, whereas Tc depends sensitively on the details of the interatomic interactions, the
critical exponents are to a large degree universal depending only on a few fundamental
parameters.
To summurize, the critical exponents are more interesting than Tc since their
values do not depend on microscopic details, but only on few parameters such as the
space dimension d and the symmetry of the system.
One of the first experimental evidence of this universality was given by the work
of Guggenheim on the coexistence curves of g different fluids: A, Kn, χe , Ne, N2 ,
CO2 and O2 . By plotting T /Tc versus ρ/ρc (Figure 2.20b) he found that all the data
collapse on the same curve, i.e. different sets of data fit the same function. Moreover
for t → 0:
(ρl − ρc ) ∼ (−t)β
and β ∼ 1/3 ≈ 0.335. Therefore, close to the critical point all the data lie on the
same curve and hence can be described by the same exponent β. A further test
of universality is to compare this value to that obtained for a phase transition in a
completely different system with a scalar order parameter. For instance, if we do the
same for a string ferromagnetic the result is β = 1/3 too.
Remark. The law of corresponding states gives a universal liquid-gas coexistence
curve.

2.7.4 Thermodynamic inequalities between critical exponents


It is possible to obtain several rigorous inequalities between the critical exponents.
The easiest to prove is due to Rushbrooke.

Rushbrocke inequality
It follows from the well known thermodynamic relation between the specific heats
at constant field and constant magnetization. Remember the relation between re-
sponse functions:
2
1 ∂v 2
  
2 1 ∂v
kT (cp − cv ) = T vα = T v 2 =T
v ∂T P v ∂T P
2.7. Critical exponents 33

ρ
ρc

(a) N -Polymer.
1 = Tc /Tc T /Tc

(b) Coexistence curve of different fluids plotted in


reduced variables.

Figure 2.20

For magnetic systems one has


 2
∂M
χT (cH − cM ) = T
∂T H
| {z }
≥0

From thermodynamic stability we have cM ≥ 0, cH ≥ 0, χT ≥ 0. Hence, from the


previous relation we have
T ∂M 2
 
cH = + cM
χT ∂T H |{z}
≥0

which implies
 2
T ∂M
cH ≥ (2.22)
χT ∂T H
On the other hand, for T → Tc− (t → 0− ) and H = 0 (zero field) we have

−α
cH ∼ (−t)

χT ∼ (−t)−γ

M ∼ (−t)β

that implies  
∂M
∼ (−t)β−1
∂T H=0
Since the inequality (2.22) is valid for all temperature T, it follows that can only be
obeyed if
[(Tc − T )β−1 ]2
B(Tc − T )−α ≥ B 0 T
(Tc − T )−γ
with B, B 0 > 0. Take the limit T → Tc− , we have:
B0T
lim (Tc − T )2−α−2β−γ ≥ >0
T →Tc− B
Since the left hand side must be strictly greater than zero, we have the RushBrook
inequality:

α + 2β + γ ≥ 2 (2.23)
34 Chapter 2. Equilibrium phases and thermodynamics of phase transitions

Griffith inequality
The Griffith inequality is obtained from the convexity property (in T and V ) of
the Helmolds free energy and from A ∼ t2−α :

⇒ α + β(1 + δ) ≥ 2 (2.24)

We have introduced two very new ideas, universality and inequalities between the
critical exponents, which appear to hold as equalities (see Sec.12.3.5).
In the intervening chapters, we look at models of systems which undergo phase
transitions and how to calculate their critical exponents and other properties.
Chapter 3

Recall of statistical mechanics and


theory of ensembles
Lecture 4.
3.1 Statistical ensembles Friday 18th
October, 2019.
Statistical mechanics roughly speaking was born as a sort of theory from mi- Compiled:
croscopic and try to compute the macroscopic length using thermodynamics. The Wednesday 5th
problem is going from the countinuous problems to the macroscopic problems. In February, 2020.
origin was statistical mechanics of equilibrium system. Each microstate with a given
energy fixed, will have the same probability, this is the equal probability statement.
In general, if we consider a system with N,V (number of particles and volume)
fixed and also the total energy E fixed, we call Ω(E, V, N ) the number of microstate
with total energy E , volume V and number of particles N.
If the system is isolated and in equilibrium the rule of equal probability of the
microstates holds:
If the system is isolated and in equilibrium with energy E it visits each microstate
consistent with energy E with equal probability.

Another way to say is: the system spends the same amount of time in each of the
Ω(E, V, N ) microstates.
Therefore, we call a single configuration of a given microstate C. A configuration
is just when you have the spatial part, because momentum can be obtained by inte-
grating. Let us suppose to compute the probability of a given configuration C, PC ;
because of equal probability we have:
1
PC = (3.1)
Ω(E, V, N )
Now, let us now consider two subsystem 1 and 2 that can exchange energy, volume
and/or particles. The number of microstates, of the combined system, of total energy
ET = E1 + E2 , total volume VT = V1 + V2 and NT = N1 + N2 is given by:
X
Ω(ET , VT , NT ) = Ω1 (E1 , V1 , N1 )Ω2 (ET − E1 , VT − V1 , NT − N1 ) (3.2)
E1 ,V1 ,N1

One can show that, in the thermodynamic limit, Ω(ET , VT , NT ) is strongly peaked
around a given point (E1∗ , V1∗ , N1∗ ) and the fluctuations around this value are rare and
small. Writing Ω(ET , VT , NT ) as
 
S(ET ,VT ,NT ) X 1
Ω(ET , VT , NT ) ∝ e kB
= exp (S1 (E1 , V1 , N1 ) + S2 (E2 , V2 , N2 ))
kB
E1 ,V1 ,N1
(3.3)

35
36 Chapter 3. Recall of statistical mechanics and theory of ensembles

(the proportionality becomes from the Boltzmann definition of entropy).


The values (E1∗ , V1∗ , N1∗ ) are obtained by the max entropy condition that can be
written as
d ln Ω1 d ln Ω2
= ⇒ T1 = T2 (3.4a)
dE1 dE2
d ln Ω1 d ln Ω2
= ⇒ P1 = P2 (3.4b)
dV1 dV2
d ln Ω1 d ln Ω2
= ⇒ µ1 = µ2 (3.4c)
dN1 dN2

We next consider these properties to the case in which 1 is the system we want to
study and 2 is a much larger system than 1 (a bath). This setup will bring us to the
canonical ensemble.

3.2 The canonical ensemble

Figure 3.1: Isolated system. There are two subsystems, S constituted by red points and
B constituted by the black one.

Let us consider an isolated system made by two subsystems, one S and one much
larger, B, that we call thermal bath (Figure 3.1). The total number of particles is
given by NT = NB + NS with NB  NS  1 (they are both large but B is much
larger than S ), where NB are the particles in the thermal bath and NS the particle
of the system.
Let ET be the energy of the composite system. The two subsystems can exchange
energy but the whole system has constant energy ET . Therefore, let the energy to be
free to fluctuate in time at fixed temperature TB (isothermal ensembles). Note that
VS , NS , VB , NB are fixed (no exchange of volume and particles).
For resuming, other quantities fixed are the temperature of the bath TB , the
number of the total particles of the system NT , and also the total volume VT . We
have also VT = VB + VS , with VB  VS .
The key to the canonical formalism is the determination of the probability dis-
tribution of the system among its microstates. And this problem is solved by the
realization that the system plus the bath constitute a closed system, with fixed tem-
perature, to which the principle of equal probability of microstates applies.
If one assumes that the system and the bath are weakly coupled (neglet interaction
energy):
ET = ES + EB = const EB  ES

Let C by the microstate of the system S, and G the microstate of the heat bath B.
A given microstate of the isolated composite system B-S is given from a pair (C, G)
3.2. The canonical ensemble 37

of microstate C ∈ S and G ∈ B. The number of microstates of the isolated system


with total energy ET and system energy ES is given by:

ΩT (ET , ES ) = Ω(ES )ΩB (ET − ES )

Remark. In this analysis V and N are fixed. Since ET is fixed


X
ΩT (ET ) = Ω(ES )ΩB (ET − ES ) (3.5)
ES

From the principle of equal probability for microstates at equilibrium, the proba-
bility of a composed microstate (C ◦ G) is given by:
(
1
EC + EG = ET
PC◦G = ΩT (ET )
(3.6)
0 otherwise

Since we are not interested to the microstates of the heat bath


X X 1 1 X
PC = PC◦G = = 1 (3.7)
ΩT (ET ) ΩT
all G all G G
such that such that
g(ET −EC −EG ) g(ET −EC −EG )

The number of microstates G with energy EG = ET − EC is given by:

ΩB (EG ) = ΩB (ET − EC )

This implies that the probability of a given configuration is related to the number of
microstate of the bath:
ΩB (ET − EC )
⇒ PC = ∝ ΩB (ET − EC ) (3.8)
ΩT (ET )

It is more convenient to deal with the logarithmic of PC that is smoother

⇒ kB ln ΩB (ET − EC ) = SB (3.9)

This is the entropy of B and is a function of NB . Since EC  EB ' ET we can


expand SB (ET − EC ) around x0 = ET by the small amount

∆ ≡ x − x0 = (EB ) − (ET ) = −EC



df
f (EB ) = f (ET ) + (EB − ET ) + . . .
dEB EB =ET
Therefore:

EC2 ∂ 2 SB
   
∂SB
kB ln ΩB (EB ) = SB (EB ) = SB (ET ) − EC + 2 + ...
∂EB EB =ET 2 ∂EB EB =ET
(3.10)
To make explicit the NB dependence, let us consider the molar version

SB → NB sB EB → NB eB

E2 ∂ 2 sB
   
∂sB
SB = NB sB = NB sB (ET ) − EC + C
∂eB eB =eT 2NB ∂e2B
38 Chapter 3. Recall of statistical mechanics and theory of ensembles

Let us consider the limit in which the system size is fixed, while the one of the heat
bath is going to ∞:
E
z }|B {
ET ES + NB eB
lim = → eB (3.11a)
NB →∞ NB NB
dsB
lim kB ln ΩB (ET − EC ) → NB sB − EC (3.11b)
NB →∞ deB
On the other hand,
dsB 1 1
≡ =
deB TB T
which implies
   
SB NB sB EC
PC ∝ ΩB (ET − EC ) = exp = exp −
kB kB kB T
Since the first therm does not depend on C, it can be absorbed in the constant and
what we get by expanding considering the huge number of particles
PC ∝ exp(−EC /kB T ) (3.12)
Remark. Since the energy of the system fluctuates, its microstates are not anywhere
equiprobable, but are visited with probability given by (3.12).
Remark. Since the bath is very large, T is the only property of the bath that affects
the system. The Boltzmann factor is defined as:
1
β≡ (3.13)
kB T
The normalization consists in dividing by the normalization factor, that is the
sum of all microstates
e−βEC
PC = P −βE (3.14)
Ce
C

Finally, the canconical partition function is defined as


X
Q(T, V, N ) ≡ exp(−βEC ) (3.15)
all C
with V,N
fixed

Given Q(T, V, N ), one gets the Helhmoltz free energy:

A(T, V, N ) = −kB T ln Q(T, V, N ) (3.16)

that is the free energy describing the isothermal (or canonical) ensemble at fixed T,
volume V and number of particles N.
Remark. X X
Q(T, V, N ) = e−βE(C) = e−βE Ω(E, V, N )
C E
V,N fixed

What we have done is a foliation in energy of the space, that is a sum over the energy
(keeping {V, N } fixed):
X X X
Q(T, V, N ) = e−βE Ω(E, V, N ) = e−βE eS/kB = e−β(E−T S)
E E E

Hence,
Q(T, V, N ) = e−βA ⇒ A = −kB T ln Q(T, V, N )
3.2. The canonical ensemble 39

We have formulated a complete algorithm for the calculation of a fundamental


relation in the canonical formalism. Given a list of states of the system, and their
energies EC , we calculate the partition function (3.15). The partition function is
thus obtained as a function of temperature and of the parameters that determine
the energy levels. The fundamental relation is (3.16) that determines the Helmholtz
potential.
The probability of a configuration can be written as (3.14), that is a very useful
form. Indeed, the average energy is expected to be
−βEC
P
C EC e
X
U= EC PC = P −βEC
(3.17)
C Ce

or

U =− ln Q (3.18)
∂β

3.2.1 Energy fluctuations in the canonical ensemble


Despite energy in the canonical ensemble fluctuates, while in the microcanonical
one is constant, this does not contradict the equivalence principle of the ensemble
(in the thermodynamic limit). The reason is that the relative size of the energy
fluctuation decreases in the large system limit. To see it, let us compute the average
square fluctuations of E.

(δE)2 = (E − hEi)2 = E 2 − hEi2 (3.19)





Remark. Remember that thermodynamic assume that the number of number of free-
dom is related to the number of Avogadro.
On the other hand,
 
∂Q(T,V,N )
e−βEC
 
∂β  = − ∂ ln Q
X X
hEi = PC EC = EC P −βEC
= −
C C Ce Q ∂β N,V

 
∂2Q

2 X ∂β 2
E = PC EC2 =
Q
C
Therefore,

1 ∂2Q 1 ∂Q 2
   
(δE)2 = (E − hEi)2 =




Q ∂β 2 N,V Q2 ∂β N,V
 2
∂ hEi
  
∂ ln Q
= =−
∂β 2 N,V ∂β N,V

Since  
∂E
cV = (3.20)
∂T N,V
we have
(δE)2 = kB T 2 cV (3.21)

Both cV and hEi are extensive


p p
h(δE)2 i kB T 2 cV
 
1
= ∼O √ ⇒0
hEi hEi N
because N ∼ 1023 .
40 Chapter 3. Recall of statistical mechanics and theory of ensembles

3.3 Isothermal and isobaric ensemble


Now, the system is coupled both to a thermal and a volumic bath at temperature
TB and pressure PB . The idea is: consider the same system with the bath; the
difference is that in this case the system can exchange energy but also volume (we
continue to keep the temperature of the bath fixed). At this point the ensemble is
isothermal and isobaric. All the assumptions done before are valid, in particular,
assuming as before weak coupling between the degrees of freedom of the bath and
those of the system

ET = E + EB
VT = V + VB

We look for the partition function that describes this isothermal and isobaric ensem-
ble. Similarly to the previous case, one can write

PC ∝ ΩB (EB , VB ) ∝ ΩB (ET −EC , VT −VC ) ∝ exp[SB (ET − EC , VT − VC )/kB ] (3.23)

Remark. Now, C is specified both by its volume V and energy E. As before, one
can expand log ΩB both in EB and in VB (around ET and VT ) and take the limit
NB → ∞.

"  #
SB (ET , VT ) EC ∂SB VC ∂SB 1
PC ∝ exp − − + term '
kB kB ∂EB ET ,VT kB ∂VB VT ,ET NB
(3.24)
Recalling that (

dS P PB → P
= with (3.25)
dV E T TB → T
 
EC P VC
⇒ PC ∝ exp − − (3.26)
kB T kB T
If we normalize:
e−β(EC +P VC )
PC = (3.27)
∆(T, P, N )
where
X
∆(T, P, N ) = e−β(E(C)+P V (C)) (3.28)
C

is called the Gibbs partition function.


Remark. Note that
X X X
∆(T, P, N ) = e−βP V ( e−βEC ) = e−βP V Q(T, V, N )
V C V
V,N fixed
XX
= e−β(E+P V ) Ω(E, V, N )
V E
| {z }
fluctuating
variables

By summing over all the microstates compatible with E and V :

Ω(E, V, N ) −β(E+P V )
P (E, V ) = e (3.29)
∆(T, P, N )
3.3. Isothermal and isobaric ensemble 41

Remark.
XX X
∆(T, P, N ) = e−βE−βP V Ω(E, V, N ) = e−βE−βP V +S(E,V,N )/kB (3.30)
E V E,V
| {z }
Laplace transform

Classical systems (fluids)

∞  
1
Z Z
−βP V −βH(pN ,rN )
∆(T, P, N ) = dV e d p~1 . . . d p~N e (3.31)
0 h3N N !

which implies
Z ∞
∆(T, P, N ) = dV e−βP V Q(T, V, N ) (3.32)
0

that is the Laplace transform of the canonical partition function Q.


We also define
P ∂S
βP ≡ = (3.33)
T ∂V
Remark. Let us remind that
Definition 3: Laplace transform
The Laplace transform of a function f (t), defined for all real numbers t ≥ 0, is
the function F (s), which is a unilateral transform defined by
Z ∞
F (s) = f (t)e−st dt
0

where s is a complex number frequency parameter with real numbers σ and ω:

s = σ + iω

Magnetic system

Ensemble in which both E and M can fluctuate. In particular, we have Ω(E, M )


(with TB and HB fixed).

SB (ET ,MT ) E dS M dSB


− k C dEB −kC
PC ∝ e kB B B B dMB (3.34)

dSB
Since dMB = −H
TB and
B dSB
dEB = TB :
1

⇒ PC ∝ exp[−β(EC − HMC )], TB → T, HB → H (3.35)

The normalization function is:


X X
∆(T, H, N ) = e−β(EC −HMC ) = e−βE+βM H Ω(E, M ) (3.36)
C E,M

that is the Gibbs partition function for magnetic systems.


42 Chapter 3. Recall of statistical mechanics and theory of ensembles

3.3.1 Saddle point approximation


The sum (3.30) can be approximated by the maximum of the integrand (this is
fair for highly peaked functions):

∗ −βP V ∗ +S(E ∗ ,V ∗ ,N )/k


X
exp[−βE − βP V + S(E, V, N )/kB ] ≈ e−βE B

E,V

where
dS(E ∗ , V ∗ , N ) dS(E ∗ , V ∗ , N )
   
1 P
= , =
dE V,N T dV E,N T
this implies
−kB T ln ∆(T, P, N ) ' E ∗ + P V ∗ − T S
Hence, we define the Gibbs free energy as:

G(T, P, N ) = −kB T ln ∆(T, P, N ) (3.37)

3.4 Gran canonical ensemble


In this case N varies instead than V. Thus we have
SB (ET − EC , NT − NC )
 
PC = exp
kB
  
SB (ET , NT ) EC dSB NC dSB 1
∼ exp − − + terms of order ≤ (3.38)
kB kB dEB kB dNB VB
exp[−βEC + βµNC ]
=
Θ(T, V, µ)

where
X X
Θ(T, V, µ) = e−β(EC −µN ) (3.39)
N C
V,N fixed

is the grancanonical partition function.


Remark. Remember that
dS 1 dS µ
= , = (3.40)
dE T dN T
The fugacity is defined as:
z ≡ eβµ (3.41)
and we rewrite

X X
Θ(T, V, µ) = zN ( e−βEC ) (3.42)
N =0 C
V,N fixed

In principle, if one is able to compute the partition function is able to compute


the thermodynamic quantitites.
Chapter 4

Statistical mechanics and phase


transitions
Lecture 5.
Wednesday 23rd
4.1 Statistical mechanics of phase transitions October, 2019.
Compiled:
From the microscopic degrees of freedom, one compute the partition function
Wednesday 5th
in the appropriate ensemble, then the corresponding thermodynamic potential and February, 2020.
from it all the thermodynamic properties of the system as equilibrium phases and, if
present, phase transitions. Actually, until the ’30 there were strong concerns about
the possibility that statistical mechanics could describe phase transitions.

∂Ω L
(a) Region Ω with boundary ∂Ω. (b) Magnetic system with
characteristic length L.

Figure 4.1

Let us consider a system withing a region Ω of volume V (Ω) and boundary ∂Ω of


area S(Ω) (Figure 4.1a). Denoting by L a characteristic lenght of the system

V (Ω) ∝ Ld , S(Ω) ∝ Ld−1


where d is the spatial dimension.
Remark. Space Ω can be either discrete or continuous.
Suppose that the system is finite. Formally, we can write
X
HΩ = − kn Θn (4.1)
n

where
• kn : are the coupling constants. In general, but not always, they are intensive
thermodynamic variables.

• Θn : is a linear, or higher order, combination of the dynamical microscopic


degrees of freedom (local operators in quantum statistical mechanics).

43
44 Chapter 4. Statistical mechanics and phase transitions

• kn Θn : must obey the symmetry of the system. It is important that in principle


the term satisfies the symmetry of the system. This is a master rule!

To fix the idea, let us consider two classical examples: the magnetic system and the
fluid system.

4.1.1 Magnetic system (canonical)


The degrees of freedom are the spins lying on a Bravais lattice S~i , with 1 ≤ i ≤
N (Ω), where the N (Ω) are the number of lattice sites (Figure 4.1b). A configuration
is the orientation of the spin in each site C = {S~1 , . . . , S~N }. We have:

S~i
X
Θ1 = (4.2a)
i

S~i · S~j
X
Θ2 = (4.2b)
ij

We consider the trace operation, that is the sum over all possible values that each
degree of freedom can assume:
X XX X
Tr ≡ ≡ ··· (4.3)
{C} S~1 S~2 S~N

where can also indicate an integration if values are continuous.


P
The canonic partition function is
 
QΩ (T, {kn }) = Tr e−βHΩ (4.4)

with β ≡ kB T .
1

4.1.2 Fluid system (gran canonical)


Consider N particles in a volume V, with number density ρ = N/V . The 2dN
degrees of freedom are
{C} = {(x~i , p~i )i=1,...,N }
and
" #
Xp~i 2
Θ1 = + U1 (x~i ) (4.5a)
2mi
i
1X
Θ2 = U (|x~i − x~j |) (4.5b)
2
i>j

The trace operation is


∞ N
1 (d p~i )(d x~i )
X X Z Y
Tr ≡ = (4.6)
N! hdN
{C} N =0 i=1

The gran canonical partition function is:


 
−β(HΩ −µN )
FΩ = Tr e (4.7)
4.1. Statistical mechanics of phase transitions 45

For a generic partition function QΩ (T, {kn }), we can define the finite system free
energy as
FΩ [T, {kn }] = −kB T ln QΩ (T, {kn }) (4.8)
The relation with thermodynamic is trough the theromdynamic limit.
Since the free energy is an extensive function,

FΩ ∝ V (Ω) ∼ Ld

In general, one can write

FΩ [T, {kn }] = V (Ω)fb [T, {kn }] + S(Ω)fs [T, {kn }] + O(Ld−2 ) (4.9)

where fb [T, {kn }] is the bulk free energy density.


Definition 4: Bulk free energy density
We define the bulk free energy density as

FΩ [T, {kn }]
fb [T, {kn }] ≡ lim (4.10)
V (Ω)→∞ V (Ω)

if the limit exists (to prove for each system) and does not depend on Ω.

For a system defined on a lattice we have

L(Ω) ∝ N (Ω)1/d , V (Ω) ∝ N (Ω)


1
fb [T, {kn }] = lim FN [T, {kn }]
N (Ω)→∞ N (Ω)

To get information on surface property of the system, let us calculate

FΩ [T, {kn }] − V (Ω)fb [T, {kn }]


fs [T, {kn }] ≡ lim (4.11)
S(Ω)→∞ S(Ω)

4.1.3 Thermodynamic limit with additional constraints


For a fluid we cannot simply take the limit V (Ω) → ∞ by keeping N fixed,
otherwise we will always get a infiinite system with zero density. One has to take
also the limit N (Ω) → ∞ such that:

N (Ω)
≡ ρ = const
V (Ω)

In general, is not so easy to prove the existence of the limit and it depends on the
range of the particle-particle interactions.

4.1.4 Statistical mechanics and phase transitions


Since all the thermodynamic information of a system can be obtained by the
partition function, in principle, also the ones concerning the existence and nature of
the phase transition must be contained in Z (or Q). On the other hand, we know
from thermodynamic that phase transitions are characterized by singularities in the
derivation of F. Also Z must display these singularities.
On the other hand, Z is a sum of exponentials
 
ZΩ = Tr e−βHΩ (4.12)
46 Chapter 4. Statistical mechanics and phase transitions

the exponentials are analytic functions everywhere (they converge), hence ZΩ is an-
alytic for Ω finite!
The question is: where do singularities come from? It is only in the thermodynamic
limit that singularities in F and hence points describing phase transitions can arise!
For summarizing, there is no way out of this for producing singularities. The
singularities will develop in the thermodynamic limits. For reaching singularities, we
have to reach so precision in thermodynamic that we are not able to go exactly into
the critical point. How can we relate singularities, geometrically, in the behaviour of
the system?

4.2 Critical point and correlations of fluctuations


From thermodynamics, we know that, at the critical point, some response func-
tions may diverge (see Section 2.7.1). Now, we show that this is a consequence of the
onset of microscopic fluctuactions that are spatially correlated over long distances.
To see this, let us compute the response of a ferromagnetic in presence of an external
magnetic field H. The Gibbs partition function of a generic magnetic system is as
equation (3.36):
  X
ZGibbs [T, {kn }] = Tr e−β(H(C)−HM (C)) = e−βE+βHM Ω(E, M )
M,E

Remark. The term (−HM ) is the work done by the system against the external field
H to mantain a given magnetization M .

∂ ln ZG 1 h
−β(H(C)−HM (C))
i
hM i = = Tr M (C)e (4.13)
∂(βH) T ZG
∂ hM i
 ii2 
β h
2 −βH+βHM
i β h h −βH+βHM
χT = = Tr M (C)e − 2 Tr M (C)e
∂H ZG ZG
(4.14)
Hence,
1 
2 
χT = M − hM i2 (4.15)
kB T
The thermodynamic response function χT in statistical mechanics is related to the
variance of the magnetization.
We can relate the above expression with the correlation of the microscopic by
performing a coarse-graining of the system, where the magnetization M (C) can be
computed as an integral Z
M (C) = d3~r m(~r) (4.16)

Hence, Z hD E D Ei
kB T χT = d~r d r~0 m(~r)m(r~0 ) − hm(~r)i m(r~0 ) (4.17)

Let us assume the translational symmetry:

homogeneous
(
hm(~r)i = mE
D (4.18)
m(~r)m(r~0 ) ≡ G(~r − r~0 ) two-point correlation function

Instead, let us consider the connected correlation function, i.e. the correlation function
of the fluctuations δm = m − hmi:
D E D  D EE
m(~r)m(r~0 ) ≡ (m(~r) − hm(~r)i) m(r~0 ) − m(r~0 ) = G(~r − r~0 ) − m2 (4.19)
c
4.2. Critical point and correlations of fluctuations 47

Given the translational invariance, one can centre the system such that its centre of
mass coincides with the origin
~rCM ⇒ ~r0 ≡ ~0
Z Z
⇒ d~r d r~0 [G(~r − ~r0 ) − m2 ]

The integration over r~0 gives the volume V (Ω) of the system:
Z
kB T χT = V (Ω) d~r hm(~r)m(~r0 )ic (4.20)
| {z } | {z }
response correlation function
function of the fluctuations
of the local magnetization

The equation (4.20) is called the fluctuation-dissipation relation.


How Gc (~r) behaves? In general, one has

Gc (~r) ∼ e−|~r|/ξ (4.21)

meaning that for |~r| > ξ the fluctuations are uncorrelated, where ξ is the correlation
length. The correlation length is related to the correlation function. In general, it is
finite but, if you approach Tc , it diverges. In fact, at the critical point this correlation
will expand in the whole space and reaches the size of all the system, in other words,
it goes to infinity (ξ → ∞). When ξ will diverge, there will not be anymore the
exponential and the integral cannot be keeped finite.
Let g be the value of Gc for |~r| < ξ:

kB T χT ≤ V gξ 3

where there is an inequality because we are understimating the integral (Figure 4.2).

ξ |~r|

Figure 4.2: Plot of the two-point correlation function, G.

Rearranging the terms, we obtain


kB T χT
≤ gξ 3 (4.22)
V
Hence, if χT diverges at the critical point it implies ξ → ∞.
In particular, one can see that for H = 0 and T → Tc± :

ξ± (T, H = 0) ∼ |t|−ν± (4.23)

where ν+ = ν− = ν is the correlation length critical exponent.


48 Chapter 4. Statistical mechanics and phase transitions

Remark. It does not derive from thermodynamic considerations.


Scaling (4.23) is often used as the most general definition of a critical point.
One can also show that at T = Tc (i.e. t = 0)
1
Gc (r) ∼ (4.24)
rd−2+η
where η is the correlation critical exponent.
Remark. The formula is a power law decay instead than exponential.

4.3 Finite size effects and phase transitions


Actually, the thermodynamic limit is a mathematical trick and in real systems it
is never reached. Is it then physically relevant?
If we had instruments with infinite precision each change of the physical properties
of a system would occur within a finite range, therefore we would observe a smooth
crossover instead than a singularity. In this respect, the notion of correlation length
ξ is extremely important.
To illustrate this point, let us consider the gas-liquid system in proximity of its
critical point (T ∼ Tc ). If we approach Tc from the gas phase, there will be fluctu-
ations of ρ with respect to ρG , ∆ρ = ρ − ρG , due to the presence of denser droplets
(liquid) in the continuum gas phase. These droplets will have different diameters, but
the average size would be ξ, where it is the typical size of the liquid droplets. Clearly
t→0
ξ = ξ[T ] and, in proximity of the critical point ξ ∼ |t|−ν .
On the other hand, in a finite system, ξ cannot diverge since is bounded above,
ξ ≤ L, where L is the linear system size.
As T → Tc , where ξ should be larger than the system size, the behaviour of the
system should deviate from the one expected by the theory that is obtained in the
limit L → ∞. How far the real system would be from the critical point t = 0 where
singularities develop? Let us try to give an estimate of this deviation.
Let us consider a system of size L = 1 cm and

t ≡ (T − Tc )/Tc , ξ ∼ ξ0 t−ν

Let us assume that the lattice distance is ξ0 = 10 Å. Hence,


 −1/ν  −1/ν
ξ L
t∼ ∼ ∼ (1010 )−1/ν (4.25)
ξ0 10 Å

In the next chapters, we will see that ν < 1 and close to 1/2, hence:

t ∼ (1010 )−2 = 10−20

Therefore we have t ≈ 10−20 as a distance from Tc .


This estimate suggests that the experimental instrument that measures tempera-
ture must have a precision of 10−20 to see deviations from the results obtained in the
thermodynamic limit.
4.4. Numerical simulations and phase transitions 49

4.4 Numerical simulations and phase transitions


In this case, the size L of the simulated system is few multiples of ξ0 and the finite-
size effects of the simulated data can strongly affect the location and the scaling laws
of the phase transition under numerical investigation. Finite size scaling analysis of
the numerical data is needed.

cV

N↑

Tc

Figure 4.3: (Tc , CV ) plot at different N .

We can find the critical point by doing Montecarlo simulation. Supposing a Mon-
tecarlo simulation of a Ising model, for which there is no an analytic solution and
compute the energy. Try to extrapolate for example the position of the peak as N
increases. If we start to see the behaviour as in Figure 4.3, something is happening.
There are two approaches we can use.
The first approach is studying the system by looking for all the details. An example
could be a protein, that interact with other proteins; in this case we can look at all
the electrons (or atoms). Nevertheless, even if we thought at the simple protein that
exists, there would be a lot of degrees of freedom.
For doing a simulation, if we are interested in long time behaviour and in large scale
behaviour, details are not important. What it is important are symmetries, ranges
of interaction. Therefore, we can forget about all the details. We can introduce
the effective potentials as Van der Waals or Lenard Jones potential and studying
collective effects. This is the second approach.
50 Chapter 4. Statistical mechanics and phase transitions
Chapter 5

Role of the models in statistical


mechanics
Lecture 6.
Friday 25th
5.1 Role of the models October, 2019.
Compiled:
Which is the role of models in statistical mechanics? There are two possible Wednesday 5th
approaches: February, 2020.

1. The model must describe the real system in a very detailed way. The maximum
number of details and parameters to be tuned are included. The pro is the closer
to the real specific system (faithfull description). The drawback is that the
model is so complicated that no analytical solution is possible. Moreover, even
numerically, these models can be studied for very short times and small sizes.
An example is the simulation of the folding dynamics that can be performed
for few nanoseconds. On the other hand, the introduction of many details are
often not crucial if one is interested in large scale properties.

2. Try to introduce (coarse-graining approach) the most simple model that satis-
fies few essential properties of the real system such as its symmetries, dimen-
sionality, range of interactions etc. Since most of the microscopic details are
integrated, these models cannot describe the full physics of a specific system but
they can reproduce its main features. Moreover, these models can be studied
numerically and, to some extent, also analytically (exact solution).

It is the latter approach that we shall take here. Let us start by introducing
what is, perhaps, the most paradigmatic model in the statistical mechanics of phase
transition, the Ising model.

5.2 The Ising model


Suggested by Lenz to Ising for his PHD thesis (1925), it is supposed to describe
a magnetic system that undergoes a transition between a paramagnetic and a ferro-
magnetic phase. In d = 1 the model was solved exactly by Ising. Unfortunately, he
found that for T > 0 the model does not display a phase transition.
The wrong conclusion was that this model was not able to describe a phase tran-
sition. In fact, it turns out that, for d > 1, the model does display a paramagnetic-
ferromagnetic phase transition.
Let us first discuss some general feature of the model for any dimension d.

51
52 Chapter 5. Role of the models in statistical mechanics

5.2.1 d -dimensional Ising model


For hypercubic lattice with given N (Ω) sites {i}i=1,...,N (Ω) and linear size L(Ω),
we have
N (Ω) = Ld
The microscopic degrees of freedom are the spins Si , defined at each i-esim lattice
site. Each spin can assume the values Si = ±1, that means that at each site the
possible values are the spin up or down. For a lattice with N (Ω) spins, there are
2N (Ω) possible configurations.
Remark. Since we do not consider the spin as a vector, this is a model for a strongly
anysotropic ferromagnet (along a given direction).
The minimal model that can try to capture the interaction between the spin is the
following. Suppose to have also an external magnetic field Hi (it values depends on
the site i). One can consider interactions between spins whose strength are described
by functions Jij , kijk , . . .. For instance, there is a coupling that derives from electrons
coupling
ri − r~j |)
Jij = f (|~
The physical origin is the overlap between the electronic orbitals
P of the neighbouring
atoms forming the Bravais lattice. Remember that a term as i Si is not correlated,
while we need an interaction for describing the model.
A general Hamiltonian of the model can be written as:
X X X
HΩ ({Si }) = Jij Si Sj − Hi Si − Si Sj Sk + . . .
ij i ijk

Standard Ising model one keeps only the two-body interactions:

N N
1X X
HΩ ({Si }) = − Jij Si Sj − Hi Si (5.1)
2
i,j i=1

where the first term represents a two body interaction that is a quadratic term, while
the second term is a one body interaction. We have put the minus because we want
to minimize the energy, but it depends on the sign of J.
For this model, the sum over all configurations on trace is given by
X X X X
Tr ≡ ··· ≡
S1 =±1 S2 =±1 SN =±1 {S}

Our problem is to find the partition function with N sites, which depends on T and
in principle depends on the configuration given (it is fixed both for H and J !). Hence,
the canonical partition function is given by

ZΩ (T, {Hi }, {Jij }) = Tr e−βHΩ ({S}) (5.2)

and the corresponding free-energy,

FΩ (T, {Hi }, {Jij }) = −kB T ln ZΩ (5.3)

The bulk limiting free energy is:

1
fb (T, {Hi }, {Jij }) = lim FΩ (5.4)
N →∞ N
5.2. The Ising model 53

How do we know that the above limit does exist? It must be proven. The surface
is not important in the bulk limit. Note that we are assuming that the interaction
between the spin is a short range force, it is not as the size of the system.
For this model it is possible to show that the limit exists if
X
|Jij | < ∞ (5.5)
j6=i

Remark. In general what determines the existence of the limit of these spin models
are the dimension d and the range of the spins interactions.
For example it is possible to show that, if

ri − r~j |−σ
Jij = A|~ (5.6)

so it is a long range interaction, the limit exists when

σ>d

Remark. If the interaction is dipolar, since it decades as 1/r3 , for the case d = 3 the
limit does not exists. However, it is still possible to prove the existence of the limit
for this case if one assumes that not all dipoles are fully aligned.
Assuming that the thermodynamic limit exists, we now look at some additional
rigorous results on the limiting free energy and its derivatives.

5.2.2 Mathematical properties of the Ising model with nearest neigh-


bours interactions
For simplicity, let us consider the case in which the external magnetic field is
homogeneous, i.e. Hi ≡ H, and the spin-spin interaction is only between spins that
are nearest-neighbours (n.n.) on the lattice:
(
J if i and j are n.n.
Jij = (5.7)
0 otherwise

Now, the model is very simple:

N (Ω) N (Ω)
X X
− HΩ ({S}) = J Si Sj + H Si (5.8)
hiji i

where the notation hiji means a double sum over i and j, with the constraint that i
and j are nearest-neighbours.
Since H is uniform, the average magnetization per spin is

N (Ω)
1 X
hmi = hSi i (5.9)
N (Ω)
i=1

where h. . .i means average over the chosen ensemble.


Remark. For J = 0, (5.8) is the Hamiltonian of a paramagnet. The only influence
ordering the spins is the field H. They do not interact, there are no cooperative
effects and hence no phase transition.
54 Chapter 5. Role of the models in statistical mechanics

Since
  
N N (Ω) N (Ω)
" #
X 1 X 1 X X X
hSi i = Tr Si e−βHΩ ({Si }) = Tr  Si expβJ Si Sj + βH Si 
Z Z
i=1 i i hiji i

it is easy to show that:


1 ∂FΩ
hmi = − (5.10)
N ∂H
where
FΩ (T, J, H) = −kB T ln ZN (T, J, H) (5.11)
Now, let us consider the properties of the limiting free-energy

1 1
fb = lim FΩ = lim (−kB T ln ZN ) (5.12)
N →∞ N N →∞ N

It is possible to prove the following properties:


1. fb < 0.

2. fb (T, J, H) is a continuous function of T,J and H.

3. The right and left derivatives of fb (T, J, H) exist and are equal almost every-
where.

4. The molar entropy s = − ∂f


∂T ≥ 0 almost everywhere.
b

∂fb 2
5. ∂T is a monotonic non increasing function of T. That is ∂∂Tf2b ≤ 0. This implies
that:    2 
∂S ∂ fb
cH = T = −T ≥0
∂T H ∂T 2 H
∂fb
6. ∂H is a monotonic non increasing function of H. That is

∂ 2 fb
≤0
∂H 2
This implies that    2 
∂M ∂ fb
χT = =− ≥0
∂H T ∂H 2 T
Remark. The above properties have been postulated in thermodynamics, but here
they have been rigorously proved for the Ising model using statistical mechanics.

Proof of property (4). Almost everywhere, we have to prove that


∂fb
s≡− ≥0
∂T
Let us consider a finite system

1 Tr HΩ e−βHΩ

∂FΩ −βHΩ
− = kB ln (Tr e ) + kB T
∂T kB T 2 Tr(e−βHΩ )
" #
Tr βHΩ e−βHΩ
= kB ln ZΩ + = −kB T Tr(ρΩ ln ρΩ )
ZΩ to do

where
e−βHΩ
ρΩ =
ZΩ
5.2. The Ising model 55

is the probability distribution.


Since ρΩ ≤ 1 it implies ln ρΩ ≤ 0 and so − Tr(ρΩ ) ln ρΩ is positive. Then, let us
divide by N (Ω) and take the thermodynamic limit:
1 ∂FΩ 1
lim − = −kB T lim Tr(ρΩ ln ρΩ ) = T s ≥ 0 ⇒ s≥0
N →∞ N ∂T N →∞ N | {z }
SΩ


All the other properties listed before (except (1)) are consequences of the convexity
property of fb .
Theorem 1
fb (T, J, H) is an upper convex (i.e. concave) function of H.

Proof. The proof is based on the Hölder inequality for two sequences {gk }, {hk }:
Definition 5: Hölder inequality
Given {gk }, {hk } with gk , hk ≥ 0, ∀k and two non negative real numbers α1 , α2
such that α1 + α2 = 1, the following inequality holds
!α1 !α2
X X X
α1 α2
(gk ) (hk ) ≤ gk hk (5.13)
k k k

Now, consider the partition function:


 
 !   " ! #
 X X  X
ZΩ (H) = Trexp βH Si expβJ Si Sj  = Tr exp βH Si G(S)
 
 
 i hiji  i
| {z }
G(S)

It implies that
( ) !
X X
ZΩ (H1 α1 + H2 α2 ) = Tr exp βα1 H1 Si + βα2 H2 Si G(S)
i i
On the other hand, since α1 + α2 = 1:
G(S) = G(S)α1 G(S)α2
h P P i
ZΩ (H1 α1 + H2 α2 ) = Tr (eβH1 i Si G(S))α1 (eβH2 i Si G(S))α2
If we now apply the Hölder inequality we get
  P α1   P α 2 
ZΩ (H1 α1 + H2 α2 ) ≤ Tr eβH1 i Si G(S) Tr eβH2 i Si G(S)
= ZΩ (H1 )α1 ZΩ (H2 )α2
If we now take the logs and multiply by −kB T both sides, we have
1 α1 α2
lim − kB T ln ZΩ (H1 α1 + H2 α2 ) ≥ − lim kB T ln ZΩ (H1 )− lim kB T ln ZΩ (H2 )
N →∞ N N →∞ N N →∞ N
It implies
fb (H1 α1 + H2 α2 ) ≥ α1 fb (H1 ) + α2 fb (H2 )
That is a concave function of H 1 . 
1
A real-valued function f on an interval is said to be concave if, for any x and y in the interval
and for any α ∈ [0, 1], f ((1 − α)x + αy) ≥ (1 − α)f (x) + αf (y).
56 Chapter 5. Role of the models in statistical mechanics

5.2.3 Ising model and Z2 symmetry.


The symmetry of the system in sense of the Hamiltonian is: we can invert the value
of the S and the Hamiltonian does not change. It is valid when H = 0, otherwise it
is not true. Let us see the Z2 symmetry and the following interesting relation:
Lemma 2
∀ function Φ of the configuration {Si }, the following relation holds:
X X
Φ({Si }) = Φ({−Si }) (5.14)
{Si =±1} {Si =±1}

this is true for all function of the spin.

Now, we consider the Hamiltonian of the Ising model:


N (Ω) N (Ω)
X X
−HΩ = J Si Sj + H Si
hiji i

Clearly,
H(H, J, {Si }) = HΩ (−H, J, {−Si }) (5.15)
This is a spontaneous broken symmetry.
Hence,
X X
ZΩ (−H, J, T ) = exp[−βHΩ (−H, J, {Si })] = exp[−βHΩ (−H, J, {−Si })]
(5.14)
{Si =±1} {Si =±1}
X
= exp[−βHΩ (H, J, {Si })] = ZΩ (H, J, T )
(5.15)
{Si =±1}
(5.16)
Taking −kB T ln, we got:
FΩ (T, J, H) = FΩ (T, J, −H) (5.17)
If we take the thermodynamic limit limN →∞ N,
1
we have
⇒ fb (T, J, H) = fb (T, J, −H) (5.18)
and it means that the free energy density is an even function of H !
Remark. From the finite-size relation (5.17), one can show that a finite-size Ising
model does not display a transition to a ferromagnetic phase (for all dimension d ).
Indeed,
∂F (H) ∂F (−H) ∂F (−H)
N (Ω)M (H) = − = − = = −N (Ω)M (−H) (5.19)
∂H (5.17) ∂(H) ∂(−H)
Therefore:
M (H) = −M (−H), ∀H (5.20)
If H = 0, we have M (0) = −M (0), that is valid if and only if M (0) = 0!
The magnetization of a finite system is, at H = 0, always zero. This is simply
consequence of the symmetry argument shown above. We have not a phase transition.
Hence, it is only in the thermodynamic limit, where the symmetry is spontaneously
broken, that the model displays a transition.
For resuming, although the Hamiltonian is invariant with respect to the transfor-
mation H → −H, {Si } → {−Si }, the thermodynamic state is not. This situation is
called spontaneous symmetry breaking.
5.3. Lattice gas model 57

5.3 Lattice gas model


Even if we had not seen any transition, the Ising model is interesting because we
can use this model to solve other problems that seems different but are not. In fact,
the importance of the Ising model relies also on the fact that it can be mapped into
other discrete systems. Despite its simplicity, the Ising model is widely applicable
because it describes any interacting two-state system. One of these applications is
the lattice gas model, where a gas is put in a lattice.
What is a lattice gas model in more details? The archetypal lattice gas is a model
where each lattice site can either be occupied by an atom or vacant. Let us consider a
d -dimensional lattice with coordination number z and lattice spacing a, divided into
cells, as in Figure 5.1 . Let us suppose that each cell is either empty or occupied by
a single particle (this is more true if a ∼ Å).

Figure 5.1: d-dimensional lattice with lattice spacing a.

The ni is the occupation of the i-esim cell and it is:


(
0 if empty
ni =
1 if occupied

We have:
Nc
X
NΩ = ni (5.21)
i=1
where Nc is the number of the lattice cells. In particular, Nc > NΩ .
The Hamiltonian of the model is
Nc N
c
X 1X
HΩ = U1 (i)ni + U2 (i, j)ni nj + O(ni nj nk ) (5.22)
2
i=1 ij

where U1 is for instance an external field, while U2 is a many body interaction.


Since we want to work in the gran-canonical ensemble,
Nc
 − µ)n + 1
X X
HΩ − µNΩ = (
U1
(i) i U2 (i, j)ni nj + . . .
2
i=1 ij

and we will put U1 = 0 for convenience.


A formal relation with the Ising model can be obtained by choosing
1
ni = (1 + Si ), with Si = ±1 (5.23)
2
58 Chapter 5. Role of the models in statistical mechanics

The one body term becomes:


X 1 1X 1X
(U1 (i) − µ) (1 + Si ) = (U1 (i) − µ) + Si (U1 (i) − µ) (5.24)
2 2 2
i i i

while the two bodies term is equal to:


  Nc Nc Nc
1X 1 1 X 1X 1X
U2 (i, j) (1 + Si )(1 + Sj ) = 2 U2 (i, j)Si + U2 (i, j)Si Sj + U2 (i, j)
2 4 8 8 8
ij ij ij ij

Let us consider only short-range interactions, i.e.


(
U2 i, j are n.n.
U2 (i, j) =
0 otherwise

It implies
  Nc
1X 1 1 X U2 X 1
U2 (i, j) (1 + Si )(1 + Sj ) = zU2 Si + Si Sj + U2 zNc (5.25)
2 4 4 4 8
ij i hiji

Remark. Note that the becomes z i, where z is the coordination number of


P P
ij
neighbours.
Remember that, for simplicity, we put U1 = 0. We can rewrite:
Nc
X X
HΩ − µNΩ = E0 − H Si − J Si Sj (5.26)
i=1 hiji

where
1 z
E0 = − µNc + U2 Nc (5.27a)
2 8
1 z
−H = − µ + U2 (5.27b)
2 4
U2
−J = (5.27c)
4
and remember that z is the coordination number of neighbours. J is a nearest
neighbour interaction which favours neighbouring sites being occupied.
The last equation implies that, in the gran canonical ensemble, we have:

ZLG = Tr{n} (e−β(HΩ −µNΩ ) ) = e−βE0 ZIsing (H, J, Nc ) (5.28)

We have seen that the Ising model is something more general than the magnetization
transition. In the next section, we show how to pass from the partition Z of a fluid,
in the continuum, to the ZLG of the lattice gas model.

5.4 Fluid system in a region Ω


We can consider the system with periodic boundary condition, or within a box,
or confined by an external one-body potential.
The Hamiltonian for N particles in d -dimension is
N  2 
X pi 1X 1 X
HΩ = + U1 (~
ri ) + ri , r~j ) +
U2 (~ ri , r~j , r~k )
U3 (~ (5.29)
2m 2 3!
i=1 i6=j i6=j6=k
5.4. Fluid system in a region Ω 59

In the gran-canonical ensemble, we have:


∞ N
1 dd p~i dd r~i  −β(HΩ −µN ) 
  X Z Y
−β(HΩ −µN )
ZΩ = Tr e = e (5.30)
N! hdN
N =0 i=1

and the gran-canonical potential is

ωΩ (T, µ, U1 , U2 , . . . ) = −kB T ln ZΩ (5.31)

Remark. Even if ωΩ (. . . ) contains an infinite sum, it is not singular if Ω is finite!


Indeed, if U2 is an hard-core repulsion, each particle has a finite volume and,
within a finite Ω, only Nmax particles can fit in

X NX
max

⇒ ∼
N =0 N =0

In the thermodynamic limit, it corresponds to


ωΩ
ωb (T, µ, U1 , U2 , . . . ) = lim
V (Ω)→∞ V (Ω)

with the constraint


hN i
ρ= lim = const
V,N →∞ V (Ω)

Remember also that


dωb (T, µ) = −σ dT − ρ dµ = −P (5.32)
Now:

" N Z #
+∞ 
X 1 Y 1 2
ZN = dd ~p dN e−β p~i /2m QN (T )eβµN (5.33)
N! −∞ h
N =0 i=1
where
N
Z Y
QN (T ) = d r~i e−β(U (~r)) (5.34)
i=1
2
On the other hand, since dx e−αx
R p
= 2π/α,
+∞
1 −β p~i 2 /2m 1
Z
dd ~p d
e =
−∞ h Λ(T )d
where
h
Λ(T ) = √ (5.35)
2 πmkB T
Hence,
∞  βµ N
X 1 e
ZΩ = QN (5.36)
N ! Λd (T )
N =0

5.4.1 From the continuous to the lattice gas model


Let us divide Ω in discrete cells of size a. If a is approximate a repulsive range
between particles, we have that the probability that there is more than one particle
sits in a cell is  1. The potentials of the continuous model depend on {~ ri }.
Consider the occupation numbers nα = nα (~ ri ). We have:

X Z N
X Z
nα = N = dd~r ri − ~r) =
δ(~ dd~r ρ(~r) (5.37)
α Ω i=1 Ω
60 Chapter 5. Role of the models in statistical mechanics

where
N
X
ρ(~r) = ri − ~r)
δ(~ (5.38)
i=1
Moreover,
X XZ Z
d
U1 (~
ri ) = ri − ~r) =
d ~r U1 (~r)δ(~ dd~r U1 (~r)ρ(~r) (5.39)
i i Ω Ω

We have U ({~
ri }) → U ({nα }) :
N
Z Y X
QN ∝ dd r~i →
i=1 {nα }

Indeed, for each configuration specified by the set {nα } there are N ! possible configu-
rations of {~
ri }. This is because the particles can exchange position between occupied
cells. Hence, more precisely,
N
Z Y 0
X
d d N
QN ∝ d r~i ' N !(a ) ...
i=1 {nα =0,1}
P0
Remark. The symbol means that
P the sum has the constraint that the total number
of particles is fixed to N, that is α nα = N .
Therefore,
0
X
d N
QN ∝ N !(a ) e−βU ({nα }) (5.40)
{nα }

and we can rewrite the equation (5.36) as


∞  βµ N ∞
"  d #N X
0
X 1 e X
βµ a
ZΩ = QN = e e−βU ({nα }) (5.41)
N ! Λd (T ) Λ(T )
N =0 N =0 {nα }
P0
where with the constraint = N.
P P
= {nα } α nα
Remark. In general it is difficult to perform sum with constraints. Fortunately, we
are considering the gran-canonical ensemble. Indeed, we can write
∞ X 0 0 0 0
X X X X X
f (nα ) = f (nα ) + f (nα ) · · · + f (nα ) = f (nα )
N =0 {nα } {nα } {nα } {nα } {nα }
P P P
αnα =0 αnα =1 α nα =∞

with no restriction.
Remark. In the final sum all the 2N possible microscopic states are inclued (consid-
ering U1 = 0)
Eventually, we have2
 
 Nc
X
X d a X
ZΩGC ∝ exp−β −µ − log nα + βU2 nα nβ + . . .  (5.42)
β Λ
{nα } α=0 hαβi

⇒ ZΩ = Tr e−β(HΩ −eµN ) = ZLG (e


µ) (5.43)
where
a
µ
e = µLG = µphys + dkB T log (5.44)
Λ
2 a
b = ea ln b ⇒ eβµ ( Λ
a d
) = eβµ exp d ln a

Λ
Chapter 6

Some exactly solvable models of


phase transitions

6.1 1-dim Ising model


In this section, we arrive at the exact solution of the one dimensional Ising model.
There are two techniques for solving the model:
1. the recursive method ;
2. the transfer matrix method.

6.1.1 Recursive method


Case with H = 0 and free boundary conditions

1 2 i i+1 N

Figure 6.1: One dimensional Bravais Lattice.

Let us consider a Bravais lattice in the one dimensional case, that is just a one
dimensional lattice, as in Figure 6.1.
The canonical partition function of such a system is:
 K 
X X X z}|{ NX
−1
ZN (T ) = ··· exp βJ Si Si+1  (6.1)
S1 =±1 S2 =±1 SN =±1 i=1

The two body interaction is the sum in all the neighbours that in that case are (i − 1)
and (i + 1), but we have only to consider the one after, because the one behind is yet
taken by the behind site.
We want to solve this partition function. If we consider free boundary condition,
the N does not have a N+1, almost for the moment. Let us define
K ≡ βJ, h ≡ βH (6.2)
Making explicit the sum in the exponential:
X X X
ZN (K) = ··· eK(S1 S2 +S2 S3 +···+SN −1 SN )
S1 =±1 S2 =±1 SN =±1

61
62 Chapter 6. Some exactly solvable models of phase transitions

What does happen if we just add another spin at the end SN +1 ? Which is the
partition function with that new spin? We obtain:
X X X X
ZN +1 (K) = ··· eK(S1 S2 +S2 S3 +···+SN −1 SN ) eKSN SN +1
SN +1 =±1 S1 =±1 S2 =±1 SN =±1

On the other hand, this sum is just involving this term:


X
eKSN SN +1 = eKSN + e−KSN = 2 cosh(KSN ) = 2 cosh(K)
SN +1 =±1

where the last equivalence derive from the fact that cosh is an even function and it
does not depend on ±1. Therefore,

ZN +1 (K) = (2 cosh(K))ZN (K) and ZN (K) = (2 cosh(K))ZN −1 (K)

By performing a backward iteration,

ZN (K) = Z1 (2 cosh(K))N −1

Since Z1 = = 2, we have
P
S1 =±1 1

ZN (T ) = 2(2 cosh(K))N −1 (6.3)

The free energy is

FN (K) = −kB T ln ZN (K) = −kB T ln 2 − kB T (N − 1) ln (2 cosh(K)) (6.4)

and taking the thermodynamic limit it becomes


  
1 J
fb (T ) ≡ lim FN (K) = −kB T ln 2 cosh (6.5)
N →∞ N kB T
As one can see (Figure 6.2) fb (T ) is an analytic function of T, so we have no phase
transition at T 6= 0.

-2 × 1024

-4 × 1024
fb

-6 × 1024

-8 × 1024

-1 × 1025
0 20 40 60 80 100
T
Figure 6.2: Free energy function in thermodynamic limit for the one dimensional Ising
model, for kB = 1.38 × 1023 , J = 1.

Now, let us compute the magnetization (the average over the spin hSj i) for a
generic site j (assume again that Si = ±1). This can be done in many ways. Here,
we choose one that consider another way to compute Z for the 1 − dim Ising model.
This method can be useful for other calculations. It is based on the following identity:

exp[KSi Si+1 ] = cosh(K)+Si Si+1 sinh(K) = cosh(K)[1+Si Si+1 tanh(K)] (6.6)


(proof )
6.1. 1-dim Ising model 63

Proof of identity (6.6). Remind that

ex + e−x ex − e−x
cosh x = , sinh x =
2 2

Hence,
ex = cosh x + sinh x

In our case,

ekSi Si+1 = cosh(KSi Si+1 ) + sinh(KSi Si+1 ) = cosh K + Si Si+1 sinh K

where the last step was obtained considering that cosh is an even function, while sinh
is an odd one. 

Using identity (6.6), we obtain

N −1 X NY
−1
" #
X X
ZN (K) = exp K Si Si+1 = [cosh(K)(1 + Si Si+1 tanh(K))]
{S} i=1 {S} i=1

by rearranging,

X NY
−1
N −1
ZN (K) = (cosh K) (1 + Si Si+1 tanh K) (6.7)
{S} i=1

If we now expand the products, we get terms of the following form:

X
(tanh K)M Si1 Si1+1 Si2 Si2+1 . . . SiM SiM +1 = 0 (6.8)
Sie =±1
e=1,...,M

where i1 . . . im is a set of M sites of the lattice.


Remark. The terms above, when summed over {S} are zero, except the term with
M = 0 that is equal to 1 and, when summed over {S}, gives 2N .
Therefore:
ZN (K) = 2N (cosh K)N −1

that coincides with the result obtained before.


If we now compute the average hSj i, the procedure is similar but now there will
be terms as (6.8) with the addiction of an Sj :

(tanh K)M Si1 Si1+1 Si2 Si2+1 . . . SiM SiM +1 Sj (6.9)

that, when one sums


P over {S} are all zero, included the term with M = 0 that now
is equal to Sj and Sj =±1 = 0. Hence, we have the result

hSj i = 0 ∀j (6.10)

The magnetization is always zero ∀j 6= ∞!


64 Chapter 6. Some exactly solvable models of phase transitions

N 1
2

Figure 6.3: One dimensional lattice ring: Ising model with periodic boundary conditions.

Case with H 6= 0 and periodic boundary conditions


Consider the spins sitting on a 1D lattice ring as in Figure 6.3. The periodic
boundary conditions are:
SN +1 = S1
We have:
N
X N
X
− βHΩ ({S}) = K Si Si+1 + h Si (6.11)
i=1 i=1
where
K ≡ βJ, h ≡ βH
The 1 − dim Ising model with this setup can be solved in several ways. Here we will
use the method of the transfer matrix. This is a quite general technique that we will
Lecture 7. discuss within the Ising model.
Wednesday 30th
October, 2019. 6.1.2 Transfer Matrix method
Compiled:
Wednesday 5th Given the Hamiltonian (6.11)1 we can write the corresponding partition function
February, 2020. in the following symmetric form:
X X X h h
ih h
i h h
i
ZN (k, h) = ··· eKS1 S2 + 2 (S1 +S2 ) eKS2 S3 + 2 (S2 +S3 ) . . . eKSN S1 + 2 (SN +S1 )
S1 =±1 S2 =±1 SN =±1

We want to write the partition function in a form similarly to j Mij Pjk . Note that,
P
in the previous form ZN can be written as a product of matrices
N  
X X Y h
ZN (h, k) = ··· exp KSi Si+1 + (Si + Si+1 )
2
S1 =±1 SN =±1 i=1
(6.12)
X X
= ··· hS1 | T |S2 i hS2 | T |S3 i . . . hSN | T |S1 i
S1 =±1 SN =±1

where T is a 2 × 2 matrix defined as


 
0 0 h 0
hS| T S = exp KSS + (S + S )
(6.13)
2
Remark. Note that the labels of the matrix corresponds to the values of Si . Hence,
its dimension depends on the number of possible values a spin Si can assume. It
can also depends on how many spins
P are involved in the interacting terms that are
present in the Hamiltonian (kLL Si Si+1 Si+2 Si+3 ).
1
The choice of boundary conditions becomes irrelevant in the thermodynamic limit, N → ∞.
6.1. 1-dim Ising model 65

For the Ising model, we have Si = ±1 and nearest neighbour interaction implies
that we have two values and that T is a 2 × 2 matrix whose components are
h+1| T |+1i = exp[K + h] (6.14a)
h+1| T |−1i = h−1| T |+1i = exp[−K] (6.14b)
h−1| T |−1i = exp[K − h] (6.14c)
The explicit representation is
eK+h e−K
 
T= (6.15)
e−K eK−h
Let us introduce some useful notations and relations using the bra-ket formalism:

E 1 E 0
(+) (−)
Si = Si = (6.16a)
0 i 1 i
D D
(+) (−)
Si = (1∗ , 0)i Si = (0, 1∗ )i (6.16b)

The identity relation is:


 
1 0
ED ED
X (+) (+) (−) (−)
|Si i hSi | = Si Si + Si Si = 1 = (6.17)
0 1
Si =±1

By using the identity property, we can rewrite the partition function as


X X
ZN (K, h) = ··· hS1 | T |S2 i hS2 | T |S3 i . . . |Si i hSi | T |Si+1 i . . .
S1 =±1 SN =±1
X (6.18)
N N
 
= hS1 | T |S1 i = Tr T
S1 =±1

this is exactly the trace of the matrix, which is most usefully expressed in terms of the
eigenvalues. Being T symmetric, we can diagonalize it by an unitary transformation
as
TD = P−1 TP (6.19)
with PP−1 = 1. Hence,
" #
 N  −1 −1 −1 −1

Tr T = Tr TTT| {z. . . T} = Tr PP TPP TP . . . P TPP
N
 N −1 
Tr TN −1
 
= Tr PTD P = DP P
ciclyc property
of the trace
= Tr TN
 
D

where
λN
   
λ+ 0 0
TD = ⇒ TN
D = +
(6.20)
0 λ− 0 λN

with λ± are the eigenvalues with λ+ > λ− .
Remark. P is the similitude matrix whose columns are given by the eigenvectors of
λ± .
We finally have:
N N
ZN (K, h) = Tr TN (6.21)
 
D = λ+ + λ−

Remark. As mentioned previously the dimension of the transfer matrix T and hence
the number of eigenvalues {λ} depend both on the possible values of Si and on the
number of sites involved in terms of the Hamiltonian (range of interaction).
66 Chapter 6. Some exactly solvable models of phase transitions

Example 15
For example, consider the Ising (Si = ±1) with n. n. and next n. n. interactions.
The Hamiltonian is:
X X
H = k1 Si Si+1 + k2 Si Si+1 Si+2 Si+3
i i

Because of the second term, now there are 24 = 16 possible configurations that
can be described by using a 4 × 4 transfer matrix that we can write formally as

hSi Si+1 | T |Si+2 Si+3 i

Example 16
For example, suppose Si = +1, 0, −1, therefore the spin can assume three dif-
ferent values. This is a deluted Ising model.

Now, let us consider the transfer matrix formalism in a more general setting.

6.2 General transfer matrix method


The aim of this section is to describe how transfer matrices can be used to solve
classical spin models. The idea is to write down the partition function in terms of
a matrix, the transfer matrix. The thermodynamic properties of the model are then
wholly described by the eigenspectrum of the matrix. In particular, the free energy
per spin in the thermodynamic limit depends only on the largest eigenvalue and the
correlation length only on the two largest eigenvalues through simple formulae.
Let T be a square matrix (n + 2) × (n + 2) that, for example, it is built if the spin
variables may assume (n + 2) possible values. The k -esim value can be defined by the
bra-ket notation where the two vectors are given by a sequence of "0" and a single
"1" at the k -esim position.
Example 17
If k = 3 and there are (n + 2) possible values:
 
0
0
D E  
(3) (3)
Si = (0, 0, 1∗ , 0, . . . , 0) Si = 1
 
 .. 
.
0

these are the bra-ket at the k -esim position.

Similarly to the 2 × 2 Ising case, it is easy to show the identity property


X
|Si i hSi | = 1, 1 ∈ (n + 2) × (n + 2) (6.22)
Si

where now the sum is over (n + 2) values.


Let us consider the diagonal matrix Si , where the elements along the diagonal are
all the (n + 2) possible values of the i-esim spin (or of some of their combination if
longer interaction terms are considered)
X
Si ≡ |Si i Si hSi | (6.23)
Si
6.2. General transfer matrix method 67

Example 18
Ising model n + 2 = 2
     (1)     (1) 
1 (1) ∗ 0 (2) ∗ S 0 0 0 S 0
S (1 , 0) + S (0, 1 ) = + =
0 1 0 0 0 S (2) 0 S (2)

Ising: S (1) = +1, S (2) = −1.


Remark. Note that in this case the matrix Si is equal to the Pauli matrix σz .

Remark. By construction hSi | and |Si i are the eigenvectors related to the eigenvalues
Si = S (1) , S (2) , . . . , S (n+2) .
Similarly. let hti | and |ti i be the eigenvectors related to the (n + 2) eigenvalues of
the transfer matrix T: {λ+ , λ− , λ1 , . . . , λn } , with λ+ > λ− ≥ λ1 ≥ · · · ≥ λn .
Clearly,
n+2
X
T = PTD P−1 = |ti i λi hti | (6.24)
i=1
Indeed
n+2
X n+2
X
T |tj i = |ti i λi hti |tj i = |ti i λi δij = λj |tj i (6.25)
i=1 i=1
Given the set of λ described above, the N particle partition function is given by
n
X
ZN = λN
+ + λN
− + λN
i (6.26)
i=1

6.2.1 The free energy


Now, let us consider the free energy

FN = −kB T log ZN

In particular, we are interested in the limit of the bulk free energy. Looking at the
thermodynamic limit N → ∞ we have
n
" #
1 1 X
fb = lim FN = lim (−kB T ) log λN N
+ + λ− + λN
i
N →∞ N N →∞ N
i=1

by factorizing λ+ , we obtain
n 
" N !#
λ N
−kB T −
X λ i
fb = lim log λN
+ 1+ N +
N →∞ N λ+ λ+
i=1

Since λ+ > λ− > λ1 > . . . λn ,


 N  N
λ− N →∞ λi N →∞
−→ 0, −→ 0 ∀i
λ+ λ+
The result is
fb = −kB T log λ+ (6.27)
The limiting bulk free-energy depends only on the largest eigenvalue of the transfer
matrix T! This is important since sometimes it is much simpler to compute only the
largest eigenvalue than the whole spectrum of T. Also an important theorem about
λ+ exists.
68 Chapter 6. Some exactly solvable models of phase transitions

Theorem 2: Perron-Frobenius
Let A be a n × n matrix. If A is finite (n < ∞) and Aij > 0, ∀i, j, (Aij = Aij (~x)),
therefore its largest eigenvalue λ+ has the following properties:

1. λ+ ∈ R+

2. λ+ 6= from {λi }i=1,...,n−1 . It means there is no degeneracy.

3. λ+ is a analytic function of the parameters of A.

Remark. Since in our case A ↔ T, λ+ is related to fb from the theorem. This means
that fb is an analytic function!
If the conditions of the Perron-Frobenius theorem are satisfied by T, the model
described by T cannot display a phase transition!
Remark. This is true for T > 0 since for T = 0 some Tij can be either 0 or ∞ violating
the hypothesis of the theorem.
Remark. If T has infinite dimension (see d > 1) the hypothesis of the theorem are
not valid anymore and fb can be non-analytic.

6.2.2 The correlation function


A second important quantity which is simply related to the eigenvalues of the
transfer matrix is the correlation length. To calculate this, we need the spin-spin
correlation function which serves as an example of how to obtain averages of products
of spins using transfer matrices.
Let us consider the two point correlation between two spins at distance R to
another. The fluctuation with respect to the average is:

ΓR ≡ hS1 SR i − hS1 i hSR i (6.28)

Since
ΓR ∼ exp[−R/ξ]
R→∞

we can define the correlation length ξ as


 
−1 1
ξ ≡ lim − log |hS1 SR i − hS1 i hSR i| (6.29)
R→∞ R

Now, let us compute the terms hS1 SR iN and hS1 iN hSR iN .

Term hS1 SR iN
From the definition of average we obtain
1 X
hS1 SR iN = S1 SR exp[−βHN ] (6.30)
ZN
{S}

Remark. The subscript N denotes that we are again considering a ring of N spins.
ZN is known from equation (6.26).
Writing this expression by using the transfer matrix formalism, one obtains
1 X
hS1 SR iN = S1 hS1 | T |S2 i . . . hSR−1 | T |SR i SR hSR | T |SR+1 i . . . hSN | T |S1 i
ZN
{S}
6.2. General transfer matrix method 69

Summing over the free spins,


1 X
hS1 SR iN = S1 hS1 | TR−1 |SR i SR hSR | TN −R+1 |S1 i (6.31)
ZN
S1 ,SR

On the other hand, since


n+2
X
T= |ti i λi hti |
i=1

we have
n+2
X
T R−1
= |ti i λR−1
i hti | (6.32a)
i=1
n+2
X
TN −R+1 = |ti i λN
i
−R+1
hti | (6.32b)
i=1

Hence,
n+2
X
hS1 | TR−1 |SR i = hS1 |ti i λR−1
i hti |SR i (6.33a)
i=1
n+2
X
hSR | TN −R+1 |S1 i = hSR |tj i λN
j
−R+1
htj |S1 i (6.33b)
j=1

and plugging these expressions in (6.31) one gets

X X n+2
X n+2
X
S1 SR e−βHN = S1 hS1 |ti i λiR−1 hti |SR i SR hSR |tj i λN
j
−R+1
htj |S1 i
{S} S1 SR i=1 j=1

Since the term htj |S1 i is a scalar, it can be moved at the beginning of the product.
Remembering the notations
X
S1 = |S1 i S1 hS1 | (6.34a)
S1
X
SR = |SR i SR hSR | (6.34b)
SR

one gets
X X
S1 SR e−βHN = htj | S1 |ti i λR−1
i hti | SR |tj i λN
j
−R+1
(6.35)
{S} ij
Lecture 8.
Since k = ZN for k = +, −, 1, . . . , n, we have
λN
P
k Wednesday 6th
November, 2019.
htj | S1 |ti i λR−1 hti | SR |tj i λN −R+1
P
ij i j Compiled:
hS1 SR iN = Pn N Wednesday 5th
k=1 λk
February, 2020.
If we now multiply and divide by λN
+ , we get

htj | S1 |ti i (λi /λ+ )R−1 hti | SR |tj i (λj /λ+ )N −R+1
P
ij
hS1 SR iN = Pn N
k=1 (λk /λ+ )

Remark. In the thermodynamic limit N → ∞, only the terms with j = + and k = +


will survive in the sum. Remind that R is fixed.
70 Chapter 6. Some exactly solvable models of phase transitions

 R−1
X λi
hS1 SR i = lim hS1 SR iN = ht+ | S1 |ti i hti | SR |t+ i
N →∞ λ+
i=±,1...n
Rembember that λ+ > λ− ≥ λ1 ≥ · · · ≥ λn :
n 
λi R−1
X 
hS1 SR i = ht+ | S1 |t+ i ht+ | SR |t+ i + ht+ | S1 |ti i hti | SR |t+ i
λ+
i6=+

Since one can prove


lim hSR i1 = ht+ | S1 |t+ i , lim hSR iN = ht+ | SR |t+ i (6.36)
N →∞ N →∞

we obtain
X  λi R−1
hS1 SR i = hS1 i hSR i + ht+ | S1 |ti i hti | SR |t+ i (6.37)
λ+
i6=+

Example 19: Show relation (6.36)


Let us prove (6.36) by a method analogous to that followed above.
1 X 1 X 1 X X
hS1 iN = S1 e−βHN = S1 hS1 | TN |S1 i = S1 hS1 |ti i λN
i hti |S1 i
Z Z Z
{S} S1 S1 i
P N
1 X N i (λi /λ+ ) hti | S1 |ti i
= λi hti | S1 |ti i = P n N
Z k=1 (λk /λ+ )
i

Taking the limit N → ∞:

hS1 i = lim hS1 iN = ht+ | S1 |t+ i


N →∞

The correlation function follows immediately from (6.37),


n 
λi R−1
X 
ΓR = hS1 SR i − hS1 i hSR i = ht+ | S1 |ti i hti | SR |t+ i (6.38)
λ+
i6=+

Remark. ΓR depends only on the eigenvalues and eigenvectors of the transfer matrix
T and by the values of the spins S1 and SR .
A much simpler formula is obtained for the correlation length (6.29). Taking the
limit R → ∞ the ratio (λ− /λ+ ) dominates the sum and hence
 
−1 1
ξ = lim − log |hS1 SR i − hS1 i hSR i|
R→∞ R−1
( "  #)
1 λ− R−1
= lim − log ht+ | S1 |t− iht− | SR |t+ i
R→∞ R−1 λ+
 
λ− 1
= − log − lim log ht+ | S1 |t− iht− | SR |t+ i
λ+ R→∞ R − 1
 
λ−
= − log
λ+
The important result is
 
−1 λ−
ξ = − log (6.39)
λ+
It means that the correlation length does depend only on the ratio between the two
largest eigenvalues of the transfer matrix T.
6.2. General transfer matrix method 71

6.2.3 Results for the 1-dim Ising model


Let us now return to the example of the nearest neighbour Ising model in a mag-
netic field, to obtain explicit results for the bulk free energy fb , the correlation function
Γ and the correlation length ξ.
Recall that the transfer matrix of such a system is given by
 
exp(K + h) exp(−K)
T=
exp(−K) exp(K − h)

Now, let us calculate the eigenvalues:

|T − λ1| = (eK+h − λ)(eK−h − λ) − e−2K = 0

The two solutions are


q
λ± = eK cosh(h) ± e2K sinh2 (h) + e−2K (6.40)

The free energy


The free energy is

−kB T
fb ≡ lim log ZN (K, h)
N →∞ N "  N !#
1 λ−
= −kB T lim log λN
+ 1+
N →∞ N λ+
= −kB T log λ+

and inserting the explicit expression of λ+ for the Ising model, we get
 q 
K 2K 2 −2K
fb = −kB T log e cosh h + e sinh (h) + e
 q  (6.41)
= −KkB T − kB T log cosh(h) + sinh2 (h) + e−4K

Remark. Remember that K ≡ βJ, h ≡ βH.

Exercise 2
Check that if h = 0 we get back the expression found previously with the
iterative method. What is the importance of boundary conditions?
Solution. If h = 0, we obtain
 
1
fb = −KkB T − kB T log 1 + 2K = −kB T log eK + log 1 + e−2K

e
 K
e + e−K

K −K

= −kB T log e + e = −kB T log 2
2
  
J
= −kB T log(2 cosh K) = −kB T log 2 cosh
kB T

The choice of boundary conditions becomes irrelevant in the thermodynamic


limit, N → ∞.

Let us now consider the limits T → 0 and T → ∞ by keeping H fixed and J fixed.
72 Chapter 6. Some exactly solvable models of phase transitions

• Case: T → 0 ⇒ K → ∞, h → ∞.
K→∞
e−4K −→ 0
p
h→∞
sinh2 h ∼ sinh(h)

We have
2eh
cosh h + sinh h ∼ ' eh
2
and
h→∞
K→∞
f ∼ −KkB T − kB T log eh ∼ −J − H const (6.43)
Therefore, as T → 0+ , f goes to a constant that depends on J and H.

• Case: T → ∞ ⇒ K → 0, h → 0. In this case we suppose also that H and J


(fixed) are also finite.

e−4K ' 1
p √
sinh2 h + e−4K ∼ 1

h→0
Since cosh h ∼ 1:

fB ∼ −KkB T − kB T log (1 + 1) ∼ −J − kB T ln 2 (6.45)

Therefore, as T → ∞, the free energy goes linearly to zero, as in Figure 6.4.

fb

Figure 6.4: Plot of the free energy fb in function of the temperature T . For T → 0, the
free energy becomes constant, while for T → ∞ it goes linearly to zero.

The magnetization
This can be obtained by differentiating the negative of the free energy with respect
to the magnetic field H (or by using equation (6.36)):
  
∂fb 1 ∂fb ∂
q
2 −4K
m=− =− = log cosh(h) + sinh (h) + e
∂H kB T ∂h ∂h

The result is

sinh h + √sinh 2h cosh−4K


h
sinh h+e sinh h
m= p =p (6.46)
2
cosh h + sinh h + e−4K sinh2 h + e−4K
6.2. General transfer matrix method 73

• Case: T > 0 fixed, H → 0 ⇒ h → 0.

sinh h ∼ h ∼ 0,
cosh h ∼ 1

In zero field h → 0, we have m → 0 for all T > 0. It means that there is no


spontaneous magnetization!

The magnetic susceptibility

∂m 1 ∂m
χT ≡ = (6.48)
∂H kB T ∂h
If we consider the case h  1, it is convenient first expand the (6.46) for h → 0 and
take the derivative to get χT .
Since sinh(h) ∼ h + h3 and cosh(h) ∼ 1 + h2 , we have

h1 h(1 + e2K )


m ∼
1 + e−2K
If we now derive with respect to h

1 ∂m h1 1 (1 + e2K )
χT = ≈
kB T ∂h kB T (1 + e−2K )

• Case: T → ∞ ⇒ K → 0.
e2K ' e−2K ' 1
The Curie’s Law for paramagnetic systems is:

1
χT ∼ (6.49)
kB T

• Case: T → 0 ⇒ K → ∞.
e−2K ' 0
The Curie’s Law for paramagnetic systems is:

1 2K 1 2J/kB T
χT ∼ e ∼ e (6.50)
kB T kB T

The correlation length

" p #
cosh h − sinh2 h + e−4K
 
λ−
ξ −1 = − log = − log p (6.51)
λ+ cosh h + sinh2 h + e−4K
For h = 0, we have cosh h → 1, sinh h → 0:

1 − e−2K
   
−1 1
ξ = − log = − log
1 + e−2K coth K

Therefore:
1
ξ= , for h = 0 (6.52)
log (coth K)
74 Chapter 6. Some exactly solvable models of phase transitions

• Case: T → 0 ⇒ K → ∞.
eK + e−K K→∞ K→∞
coth K = ' 1 + 2e−2K + . . . −→ 1
eK − e−K
It implies
K1 1 e2K
ξ ∼ ∼
ln (1 + 2e−2K ) 2
Hence,
1 J/kB T
T →0
ξ ∼
e (6.53)
2
It diverges exponentially ξ → ∞ as T → 0.

• Case: T → ∞ ⇒ K → 0.
K2 K2 2
eK + e−K K→0 1+K + 2 +1−K + 2 2 + 2 K2 1 + K2
coth K = ' K2 K2
∼ ∼
eK − e−K 1+K + −1+K − 2K K
2 2

K→0 1
ξ −1 = log (coth K) ∼ ln + ln (1 + K 2 ) ∼ +∞
K
Therefore
K→0
ξ −→ 0
More precisely,

K→0 1 K→0 1
ξ ∼ 2
∼ − (6.54)
ln (1/K) + ln (1 + K ) ln K

6.3 Classical Heisenberg model for d=1


Now, let us suppose to study something different from the Ising model. Indeed,
from a physicist’s point of view the Ising model is highly simplified, the obvious
objection being that the magnetic moment of a molecule is a vector pointing in any
direction, not just up or down. One can build this property, obtaining the classical
Heisenber model. We do not anymore assume spin that can assume values as -1 or
+1, but spin that can assume a continuous value. Unfortunately, this model has not
been solved in even two dimensions [8].
Let us take a d = 1 dimensional lattice. In the classical Heisenberg model, the
2
spins are unit length vectors ~Si , i.e. ~Si ∈ R3 , ~Si = 1 (continuous values on the unit
sphere). We have
~Si = (S x , S y , S z )
i i i

with periodic boundary condition


~SN +1 = ~S1

Assuming H = 0, the model is defined through the following Hamiltonian:


N
− βH({~S}) = K ~Si · ~Si+1 ~h · ~Si )
X X
(+ (6.55)
i=1 i

This model satisfies O(3) symmetry. In the transfer matrix formalism:


X X PN ~
Si ·~
e−βH = eK Si+1
= Tr TN (6.56)

ZN (K) = i=1

{~
S} {~
S}
6.3. Classical Heisenberg model for d=1 75

where D E
~Si T ~Si+1 = eK ~Si ·~Si+1

Similarly to the Ising case,


X
T= |ti i λi hti |
i

and
TD = P−1 TP
The problem is computing the eigenvalues λi of T.
Formally, we should find
h i D E D ED E X
exp K ~S1 · ~S2 = ~S1 T ~S2 = λi ~S1 ti ti ~S2 = λi fi (~S1 )f ∗ (~S2 )
X
i∈eigenvalues i

~ ~
Remark. We start by noticing that the term eK S1 ·S2 is similar to the plane wave ei~q·~r ,
that in scattering problems is usually expanded in spherical coordinates. Plane wave
can be expanded as a sum of spherical harmonics as
∞ X
X l
ei~q·~r = 4π (i)l jl (qr)Ylm

(q̂)Ylm (r̂)
l=0 m=−l

where
π
(i)l
Z
jl (qr) = − sin(θ)eiqr cos(θ) Pl (cos(θ)) dθ
2 0

are the spherical Bessel functions, while the Pl (cos(θ)) are the Legendre polynomial
of order l.
From a formal comparison we have
(
i~q · ~r = iqr
~S1 ↔ Ŝ1 , (6.57)
K ~S1 · ~S2 = K ~S1 ~S2 = K

multiplying by (−i) we can write



qr = −iK ~S1 ~S2 = −iK (6.58)

In our case, we have q̂ = ~S1 , r̂ = ~S2 . Hence,


∞ X
l
K~
S1 ·~ ∗ ~
(S1 )Ylm (~S2 ) = λi fi (~S1 )f ∗ (~S2 )
X X
e S2
= 4π (i)l jl (−iK)Ylm (6.59)
l=0 m=−l i

where
λi = λlm (K) = 4π(i)l jl (−iK) (6.60)
Remark. Note that λi does not depend on m!
If l = 0, the largest eigenvalue is:

sin K
λ+ = λ0 (K) = 4πj0 (−iK) = 4π
K
and  
cosh K sinh K
λ− = λ1 (K) = 4πij1 (−iK) = 4π −
K K2
76 Chapter 6. Some exactly solvable models of phase transitions

Exercise 3
Given the largest eigenvalue λ+ ,

sin K
λ+ = 4π
K
find the bulk free energy density of the model and discuss its behaviour in the
limits of low (T → 0) and high (T → ∞) temperatures.
Solution. The bulk free energy is
 
sin K
fb = −kB T log λ+ = −kB T log 4π
K

Remind that K ≡ βJ and consider the limits

• Case: T → 0 ⇒ K → ∞.
   
J sin K K→∞ J 1 J
fb = − log 4π ∼ − log = log (K)
K K K K K

Hence,
K→∞
fb ∼ 0

• Case: T → ∞ ⇒ K → 0.
K→0
sin K ∼ K
 
K

⇒ fb = −kB T log 4π = −kB T log(4π)

K


In this case the free energy fb goes linearly with respect to the temperature.

How can we violate the hypothesis of the Perron-Frobenius theorem hoping to


find a phase transition also in a d = 1 model? One of the hypothesis of the Perron-
Frobenius theorem is the one in which Aij > 0 for all i, j. Hence, one possibility is to
build a model in which its transfer matrix has same Aij that are equal to zero also
for T 6= 0.

6.4 Zipper model


Lecture 9.
Friday 8th The Zipper model is an unusually simple and interesting member of the class of
November, 2019. one dimensional systems which exhibit a phase transition. It is a model introduced by
Compiled: Kittel [9] to describe oligomers undergoing denaturation transition. Simplest model
Wednesday 5th of DNA thermal denaturation transition (no bubbles). Better model for the denatu-
February, 2020. ration of short oligomers.
The hypothesis are: the binding energy between two bases located at the end of
the molecule is smaller than the one for pairs away from the ends. The unbinding
starts and develops from the ends as a zipper.
In this denaturation transition we do not allow bubbles. Let us consider first the
single-ended zipper, i.e. a molecular zipper of N parallel links that can be opened only
from one end as in Figure 6.6. The single-ended zipper is simpler than any related
problem which has been treated, and it offers a good way to introduce a biophysics
example into a course of statistical mechanics.
If the first k bonds (or links) are open (unbounded pairs) the energy to open the
k+1 is ε0 . Note that if at least one of the previous k bond is closed the energy needed
to open the k +1 band is infinite! We specify further that the last link, k = N , cannot
be opened; this minor features serves only to distinguish one end from the other, and
6.4. Zipper model 77

Figure 6.5: Sequential unzipping from the ends.

1
2
k

Figure 6.6: Open and closed links in a single-ended zipper.

we shall say that the zipper is open when N − 1 links are open.
We suppose that there are G orientations which each open link can assume: that
is, the open state of a link is G-fold degenerate, corresponding to the rotational
freedom of a link. Hence, once a bond is open it can orient itself in G different ways.
In other words, there is an entropy

S0 = kB log G (6.61)

associated to each open band. In the problem of DNA the empirical value of G may
be of the order of 104 .

Partition function
Let us suppose that the energy required to open the first k links is ε0 . If k links
are open, the degeneracy is Gk , and the contribution of this configuration to the
partition function is
Gk e−kε0 /kB T
By summing over the possible values of k, the partition function is

N
X −1 N
X −1
k −kε0 /kB T
ZN (T, G, ε0 ) = G e = ek(S0 T −ε0 )/kB T (6.62)
k=0 k=0

Let us call
χ ≡ Ge−ε0 /kB T (6.63)
and simplify the previous expression

N −1
X 1 − χN
ZN = χk = (6.64)
1−χ
k=0

We see immediately there is a single pole singularity.


78 Chapter 6. Some exactly solvable models of phase transitions

The free energy is

1 − χN
 
FN = −kB T ln ZN = −kB T ln (6.65)
1−χ

We can now compute some observables of interest. The correct procedure is to eval-
uate thermodynamic quantities for finite N and then to examine the limit N → ∞.

Calculate average number of open links


The thermodynamic average number of open links is
PN −1
kχk d N χN χ
hkiN ≡ Pk=0
N −1 k
= χ ln Z N = N
− (6.66)
k=0 χ
dχ χ −1 χ−1

The function is plotted in Figure 6.7. We examine the behaviour of hkiN in the vicinity
of the point χc = 1 for which the denominators are equal to zero (pole).
Remark. In this model, we consider the average number of open links instead of the
magnetization.

< k >N

0 1 χ

Figure 6.7: Thermodynamic average number of open links in a single-ended zipper of N


links.

In order to analyze what happens near 1, we expand χ ≡ 1 + ε:

1 − (1 + ε)N
 
log ZN (χ) = log
1 − (1 + ε)
N (N −1) 2 N (N −1)(N −2) 3
" #
1 − (1 + εN + 2! ε + 3! ε + O(ε4 ))
= log
ε
N (N − 1) N (N − 1)(N − 2) 2
 
= log N + ε+ ε + ...
2 6
N −1 (N − 1)(N − 2) 2
 
= log N + log 1 + ε+ ε (6.67)
2 6
N ε N 2 ε2
 
= log N + log 1 + + + ...
2 6
2
N ε N 2 ε2 1 N ε N 2 ε2
  
= log N + + + ... + + + ... + ...
2 6 2 2 6
N ε N 2 ε2
= log N + + + ...
2 24
6.4. Zipper model 79

N χN χ
By doing the same for hkiN = χN −1
− χ−1 , one gets

N ε N 3 ε3
 
N
hkiN = 1+ − + ... (6.68)
2 6 360
this is true for N  1, ε  1.
At the transition point χc = 1, where ε = 0:
N
hkiN '
2
We can define the variation (slope per site) as a response function (the derivative
with respect to the parameter):
1 d hki N N 3 ε3
' − + ... (6.69)
N dε 12 240
is a maximum at ε = 0, and the slope at the transition point becomes infinite as
N → ∞ (linearly). The response function diverges linearly to N, this is a good signal
that we have a transition.

Transition temperature
The temperature Tc corresponding to the pole χc = 1 is given by
Ge−ε0 /kB Tc = 1
Hence,
ε0
Tc = (6.70)
kB log G
Note that as G → 1, Tc → 0. For G = 1 there is no solution at a finite temperature
and hence the model does not display a phase transition for any finite T ! This is
telling you that if G = 1 what is important it is the energy, you have no entropy as
disorder. At that point everything can happen.
There is a finite transition temperature if G > 1. One might perhaps argue that
the model is now not strictly one-dimensional, for the degeneracy G arises from the
rotational freedom of an open link.
Remark. Despite the model is 1-dim, for G > 1 there is a phase transition. This is
due to two contributions:
1. Existence of forbidden configuration (infinite energy). It is a necessary con-
dition, but not sufficient, for a phase transition in d = 1 with finite range
interactions.
2. A further requirement may be that the degeneracy of the excited state (G) of
a structural unit must be higher than the degeneracy of the ground state2 .

Unwinding from both ends


When the zipper is allowed to unwind from both ends, there are k+1 ways in which
a total of k links may be opened, so that the partition function for a double-ended
zipper of N links is
N
X −1
ZN (T, G, ε0 ) = (k + 1)Gk e−kε0 /kB T (6.71)
k=0

and to this should be added a term for the state of N open links. This terminal term
for a simple zipper is GN exp(−N ε0 /kB T ).
2
In the mean-field approximation no transition can occur if the degeneracy of the ground state
is higher than that of the excited state.
80 Chapter 6. Some exactly solvable models of phase transitions

6.4.1 Transfer matrix method for the Zipper model


The idea is: we want to map the Zipper model to an Ising model. The spin like
model consists on associating to each bond a spin such that Si = 0 if the i-esim bond
is closed , while Si = 1, . . . , G if the i-esim bond is open with G possible orientations.
Therefore,
• Case: Si 6= 0 open. We have two subcases:

– Si−1 open: Si−1 6= 0 ⇒ E(Si 6= 0|Si−1 6= 0) = ε0 .


– Si−1 closed: Si−1 = 0 ⇒ E(Si 6= 0|Si−1 = 0) = ε0 + V0

• Case: Si = 0 closed. We have E(Si = 0) = 0 irrespective of Si−1 .


Hence, considering all these cases, the energy results

E(Si , Si−1 ) = (ε0 + V0 δSi−1 ,0 )(1 − δSi ,0 ) (6.72)

The boundary condition is SN = 0 (always closed). The full Hamiltonian of the


model can be written as (it could be also a function of delta, but it is not a problem):
N
X −1
HN = ε0 (1 − δS1 ,0 ) + (ε0 + V0 δSi−1 ,0 )(1 − δSi ,0 ) (6.73)
i=2

The Kittel’s version is obtained by assuming V0 = ∞.


The partition function is
X
ZN = exp(−βHN )
{S}

In order to implement the transfer matrix formalism we rewrite ZN as follows


N −2 h i
e−βε0 (1−δSi+1 ,0 ) 1 + (e−βV0 − 1)δSi ,0 (1 − δSi+1 ,0 ) (6.74)
X Y
ZN = e−βε0 (1−δS1 ,0 )
{S} i=1

Let us consider the Kittel model, the condition V0 = ∞ implies exp(−βV0 ) = 0.


Hence, we can define the transfer matrix as

T = {hS| T S 0 ≡ tS,S 0 } (6.75)


where
tS,S 0 = e−βε0 (1−δS0 ,0 ) [1 − δS,0 (1 − δS 0 ,0 )] (6.76)
or in matrix form
 
1 0 ......0
1 a . . . . . . a
 .. .. .. 
 
T = . .
 .  , with a ≡ e−βε0
 .. .. .. 
. . . 
1 a ......a

The first think to notice is that the constraint that the bond Si+1 cannot be open if
bond Si is closed (Si = 0) yields the null entries in the first row of T. This violates
the hypothesis of the Perron-Frobenius theorem!
The matrix T has three different eigenvalues

λ1 = Ga, λ2 = 1, λ3 = 0 (6.77)
6.5. Transfer matrix for 2 − dim Ising 81

The partition function can be written as


 
1
1
ZN = (1, a, . . . , a)TN −2  .  (6.78)
 
 .. 
1

Moreover, we have
   
0 1 − Ga
1  1 
λ1 → ~v1 =  .  , λ2 → ~v2 =  . 
   
 ..   .. 
1 1

and we can then write


 
1
a a(1 − Ga) − 1 1
 ..  = ~v1 + ~v2
 
. 1 − Ga 1 − Ga
a
 
1
1 −Ga 1
 ..  = ~v1 + ~v2
 
 .  1 − Ga 1 − Ga
1

Therefore,

1 − (Ga)N 1 − (Ge−βε0 )N 1
ZN = = −βε
= (−λN N
1 + λ2 ) (6.80)
1 − Ga 1 − Ge 0 1 − Ge−βε0
Since in the thermodynamic limit only the contribution of the largest eigenvalue
matters for fb we have
fb = −kB T ln max(λ1 , λ2 )
Remark. Given that the λ1 and λ2 are positive, analytic function of T (λ1 = Ga, λ2 =
1). In order to have a phase transition (i.e. non analiticity of fb ) the two eigenvalues
must cross for a given value of T. It is true if and only if:
ε0
Gac = 1 ⇔ Ge−βc ε0 = 1 ⇔ Tc = (6.81)
kB ln G
that agree with previous calculation (see Eq.(6.70)).

6.5 Transfer matrix for 2 − dim Ising


The two-dimensional Ising model for a system of interacting spins on a square
lattice is one of the very few nontrivial many-body problems that is exactly soluble
and shows a phase transition [6]. The exact solution in the absence of an external
magnetic field (H = 0) was first given almost eighty years ago in a famous paper by
Onsager [5], using the theory of Lie algebras. In particular, from Onsager’s solution we
can see that already in two dimensions an Ising model can exhibit phase transitions,
showing a non null spontaneous magnetization for temperatures low enough.
Let us therefore consider a two-dimensional Ising model, defined on a lattice made
of N rows and M columns, as in Figure 6.8. We apply periodic boundary conditions
to the system in both directions (geometrically, this can be thought of as defining the
82 Chapter 6. Some exactly solvable models of phase transitions

n=N

n+1

m+1

(n, m)
3

n=1

m=1 2 3 m=M

Figure 6.8: 2-dimensional Ising lattice of N rows and M columns.

model on a torus), and we consider only nearest neighbour interactions. The spin in
a site is identified by Ssite = Sm,n .
We consider a set of spin arranged on a square lattice, interacting only with nearest
neighbors and with a magnetic field H 6= 0. The reduced Hamiltonian of the system
will be:
X X
−βHΩ ({S}) = K Si Sj + h Si
hiji i
N X
X M N X
X M
=K (Sm,n Sm+1,n + Sm,n Sm,n+1 ) + h Sm,n
n=1 m=1 n=1 m=1

This can be rewritten as follows:


M
X
− βHΩ ({S}) = [E[µm , µm+1 ] + E[µm ]] (6.82)
m=1

where
N
X N
X
E[µm , h] = K Sm,n Sm,n+1 + h Sm,n (6.83a)
n=1 n=1
XN
E[µm , µm+1 , h] = K Sm,n Sm+1,n (6.83b)
n=1

the first equation is the one body interaction, while the second equation represents
the interaction between nearest neighbours columns (two body interaction).
Moreover, µ is a m dimensional vector; in particular, each µm represents the set
of N spins along column m:

µm = {Sm,1 , Sm,2 , . . . , Sm,N } (6.84)

We can write a transfer matrix between columns, permitting to transfer along the
m. To make it simpler, suppose h = 0 (so the energy does not depend on h):

hµm | T |µm+1 i = exp[k(E[µm , µm+1 ] + E[µm ])] (6.85)

Now, we have to diagonalize.


Remark. In the 2x2 transfer matrix in the 2-dim we have two possible values. Now,
we have to do the same in principle, but we have to do for all of the (6.84).
6.5. Transfer matrix for 2 − dim Ising 83

Remark. T is a matrix of dimension 2N × 2N , hence, in the thermodynamic limit is


an infinite matrix (violation of Perron-Frobenius).
According to the formalism

ZN (K, h) = Tr TN


To find the eigenvalues of T given by (6.85) is highly non trivial. The big problem it
is that in the thermodynamic limit is that the dimension of the transfer matrix goes
to infinity, then it is difficult to be diagonalized. This was first achieved by Onsanger
in 1944, as said, for the case H = 0 and in the N → ∞ limit. Onsanger has shown
that the free energy of the system is given by

kB T 2π
  
1
Z q
2 2
fb (T ) = −kB T log (2 cosh(2βJ)) − log 1 + 1 − g sin (Φ) dΦ
2π 0 2
(6.86)
where
2
g=
cosh(2βJ) coth(2βJ)
and also that the magnetization is:
( 1/8
1 − sinh−4 (2βJ) T < Tc
m= (6.87)
0 T > Tc

where Tc is the temperature given by the condition


 
2 2J
2 tanh =1
kB Tc

which yields the numeric result:

⇒ Tc ' 2, 264J/kB 6= 0

hence, we have a phase transition at a critical temperature Tc different from 0!


Onsager also showed that the critical exponents of this model are:
1 7
α = 0, β= , γ=
8 4
where α = 0 because the specific heat diverges logarithmically for T ∼ Tc :
   
T
c ∝ A − ln 1 − +B
Tc

It means that the specific heat displays at the transition a logarithmic divergence (no
power law!).
84 Chapter 6. Some exactly solvable models of phase transitions
Chapter 7

The role of dimension, symmetry


and range of interactions in phase
transitions
Lecture 10.
Which is the role of the dimension in phase transition? Consider d, the dimension Wednesday 13th
of the system. For the Ising model, we have seen that in d = 1 there is no phase November, 2019.
transition, while the Onsanger solution tell us that for d = 2 there is a paramagnetic- Compiled:
Wednesday 5th
ferromagnetic transition for Tc > 0. Therefore, the dimension seems a crucial param-
February, 2020.
eter! Since in general analytic solutions are not available, is there a simple argument
to establish the existence of a phase transition? In the case of a para-ferro transition,
may we establish whether a phase with long range order exists and is stable within a
range of T > 0?

7.1 Energy-entropy argument


We need an argument that can tell us which kind of system has a phase transition.
The idea is to use the entropy energy argument. Indeed, our systems are ruled by a
free energy and the previous states are found by making derivative. We have energy
and entropy: low energy state can be stable with respect to thermal fluctuations, but
the fluctuations will destroy the long range order. This idea can be generalized.
Let us consider:
dU −T |{z}
dF = |{z} dS (7.1)
energy entropy

We expect that:
• T  1: entropy should dominates.

• T  1: energy should dominates.


Question: there is a temperature different to zero in which this is compatible?

7.1.1 1-dim Ising


Let us study the stability of the states with minimum energy to fluctuations for
T 6= 0, for a system of size N.
We already know that, for T = 0, two ground states exist, either all spins up or
all spins down. For instance, suppose that we have the ground state with all the spin
up; the energy of the state is
EG = −JN (7.2)
and it is the same for the other configuration.

85
Chapter 7. The role of dimension, symmetry and range of interactions in phase
86 transitions

Now, let us consider T 6= 0, there could be a given number of elementary ex-


citations of the kind spin up/down. What happens if we swap one or more spins?
These are defects with respect to the ground state and they are also called domain
walls. This is in the one dimensional case, but is valid also in many dimensional.
Therefore, which is the variation in energy ∆E with respect to the ground state? For
each excitation there is an energy penalty ∆E = 2J, indeed, if we suppose that we
have only one swap, we have

EG = −JN, E ∗ = −J(N − 1) + J ⇒ ∆E = 2J

For a finite concentration x of domain walls, we can write M = N x, giving

∆EM = 2M J (7.3)

Now, let us compute the change in entropy. The entropy of the ground state can
be computed immediately: this is zero because it is the logarithm of the number of
configurations, but in this case we have only one configuration, namely SG = ln 1 = 0.
Hence, the difference between the entropy of the ground state and the entropy of the
new state is just the entropy of the new state. Therefore, we want to estimate the
entropy of the states with M domain walls. The number of possible ways to insert
M domains in N positions, namely the number of configurations, is
   
N N
#= = (7.4)
M xN

We have:  
N
SM = kB log (7.5)
M
the difference is  
N
∆S = SM − SG = SM = kB ln
xN
Let us calulate
∆F = FM − FG = ∆E − T ∆S
 
N
= 2M J − kB T ln
M
 
N
= 2xN J − kB T ln
xN
= N {2xJ + kB T [x ln x + (1 − x) ln (1 − x)]}

where we have used the Stirling approximation

ln N ! = N ln N − N

Since the equilibrium states are obtained by the minimum of F, we can minimize
with respect to x. We are interested in the free energy in the bulk, hence, firstly we
normalize and then we derive for finding the minimum
∆FN ∂∆fb
∆fbN = , =0 (7.6)
N ∂x
this gives

{2xJ + kB T [x ln x + (1 − x) ln (1 − x)]} = 2J + kB T [ln x + 1 − ln (1 − x) − 1]
∂x
= 2J + kB T [ln x − ln (1 − x)] = 0
7.1. Energy-entropy argument 87

hence,
x 2J x
ln =− ⇒ = e−2J/kB T
1−x kB T 1−x
and finally the results is
1
x= (7.7)
1+ e2J/kB T
It means that ∀T 6= 0 exist a finite concentration x of domain walls. The ground
state is unstable ∀T > 0. Indeed, if you have a finite density of x, no long range order
exist for T > 0. From (7.7), we can see that as T → 0, we have x → 0 as expected.
Now, let us try to do the same for d dimensions.

7.1.2 d-dim Ising


What is a domain wall in d dimensions? The domain walls is an hypersurface of
size Ld−1
∆E ∝ 2JLd−1 (7.8)
Computing the entropy it is a very difficult problem. Indeed, the entropy of a fluc-
tuating hypersurface is difficult to estimate. For a single domain wall, we can say

S ≥ kB ln L (7.9)

where L is the number of ways to place a straight wall within a system of linear size
L. The ∆S is just S because the entropy of the ground state is again zero.
Remark. If we underestimate S, we obtain

∆F = 2JLd−1 − kB T ln L (7.10)

it means that now energy can win if the temperature is different from zero. Therefore,
for d = 2, or greater (d > 1), that long range order can survive thermal fluctuations
and the system could present an ordered phase!

Peierls argument
The Peierls argument [10] is a mathematically rigorous and intuitive method to
show the presence of a non-vanishing spontaneous magnetization in some lattice mod-
els. This argument is typically explained for the d = 2 Ising model in a way which
cannot be easily generalized to higher dimension. The idea is trying to perturb the
system using an external magnetic field as perturbation (it is very small h). In that
way, we are breaking explicitly the symmetry, but then, taking the limit h → 0 and
switching off the magnetic field, we see the stability.
We know that for finite systems, from the Z2 symmetry, it follows

hmiN = 0

This is true for finite systems, however, in the thermodynamical limit N → ∞, if


d ≥ 2 the magnetization hmi∞ vanishes only in the high temperature paramagnetic
phase. In the low temperature ferromagnetic phase, the value of hmi∞ is not well
defined and depends on how the thermodynamical limit is performed. In this case
the Z2 symmetry is said to be spontaneously broken.
The breaking of a symmetry can be thought as a form of thermodynamical instabil-
ity: the particular value acquired by hmi∞ in the ferromagnetic phase is determined
by small perturbations.
Chapter 7. The role of dimension, symmetry and range of interactions in phase
88 transitions

A conventional way to uniquely define hmi∞ in the broken phase (where it is called
spontaneous magnetization) is to use an infinitesimal magnetic field:
(h)
hmi∞ = lim lim hmiN (7.11)
h→0+ N →∞

where it is crucial to perform the thermodynamical limit before switching off the
magnetic field h → 0+ . The instability manifests itself in that using h → 0− would
change the sign of hmi∞ .
A different approach to expose the instability is the use of appropriate boundary
conditions: we can for example, if we want hmi∞ > 0, impose in all the sites i on the
lattice boundary (i ∈ ∂Ω) the condition Si = +1, as in Figure 7.1.

+ + + + + + +

+ +

+ +

+ + + + + + +

Figure 7.1: System with boundary condition with all the spins in the surface up.

In the paramagnetic phase the effect of boundary conditions does not survive the
thermodynamical limit, while in the ferromagnetic phase their effect is analogous to
that of the infinitesimal magnetic field.
This is the boundary condition chosen by Pierls to establish the existence of a
Tc 6= 0 for the d = 2 Ising model. Let us gives just a qualitative presentation of the
(rigorous) result.
Let N+ , N− be the number of spin up and down respectively. Clearly,

N = N+ + N−

On a finite lattice the mean value of the magnetization can be written in the form

hN+ i − hN− i hN− i


hmiN = =1−2
N N
In order to show that hmi∞ > 0 (remember that we are considering boundary condi-
tions with spin up at ∂Ω), it is sufficient to show that for every N we have

hN− i 1
< −ε (7.12)
N 2
with ε > 0 and N -independent. Indeed, if (7.12) holds

hmiN ≥ 2ε ∀N (7.13)

The Peierls argument is a simple geometrical construction that can be used to prove
this bound. The outcome of the Peierls argument for the model in d dimensions is
an estimate of the form
hN− i
≤ fD (x) (7.14)
N
where x is defined by
x = 9e−4Jβ (7.15)
7.1. Energy-entropy argument 89

and fD is a continuous function of x (independent on N ) and such that

lim fD (x) = 0
x→0

In particular, for small enough T we have the bound

hN− i 1
< −ε
N 2
which ensures that hmi∞ ≥ 2ε and the Z2 symmetry is spontaneously broken. More
precisely, for d = 2, one has

hN− i x2 2 − x
≤ (7.16)
N 36 (1 − x)2

where x = 9e−4Jβ < 1.


Remark. Note that above bound gives also a lower bound on the critical temperature

hN− i x2 2 − x 1
≤ 2
< −ε
N 36 (1 − x) 2

As long as hNN− i < 12 − ε, the system is in the ferromagnetic phase. The critical value
xc ≡ x(βc ) must be outside the interval [0, x1/2 ] where x1/2 is the smallest positive
solution of the equation
x2 2 − x 1
=
36 (1 − x)2 2
From the solution x1/2 and the condition xc > x1/2 , one has

Jβc ≤ Jβ1/2

where Jβ1/2 = 1
4 log 9/x1/2 . Hence, Tc > T1/2 .

Exercise 4
The following equation gives x1/2 :

x3 + 16x2 − 36x + 18 = 0

Find T1/2 .
Solution. This equation has three real solutions:

x1 = −18.05, x2 = 0.79, x3 = 1.26

The smallest positive solutions is x1/2 ≡ x2 , hence

J 1 4J
= log 9/x1/2 ⇒ T1/2 =
kB T1/2 4 kB log 9/x1/2
Chapter 7. The role of dimension, symmetry and range of interactions in phase
90 transitions

7.2 Role of the symmetry


Interacting systems can be classified with respect to their global symmetry group.
Let us illustrate some examples.
Example 20: Ising model
X
HIsing = − Jij σi σj (7.17)
i<j

where σi ∈ {−1, 1}. The symmetry group of this Hamiltonian is Z2 , which has
two elements {1, η}. We have

1 : identity, ησi = −σi , η2 = 1

Example 21: Potts model


The Potts model, a generalization of the Ising model, is a model of interacting
spins on a crystalline lattice. The Hamiltonian is
X
Hq−Potts = − Jij δσi ,σj (7.18)
i<j

where σi ∈ [1, 2, 3, . . . , q]. Hq−Potts is invariant under the permutation group of


the sequence {1, 2, 3, . . . , q}. There are q! elements, for example {2, 1, 3, . . . , q}.
The symmetry group is denoted by Sq .

Remark. The difference between a Zq and Sq symmetry is that an Hamiltonian has


symmetry Zq if it is invariant with respect to cyclic permutations 1
 
1 2 ... q − 1 q
η= (7.19)
2 3 ... q 1

and its powers η l with l = 0, . . . , q−1. Both models satisfy a discrete global symmetry.
Now, we jump into the case in which we consider continuous symmetries.

Figure 7.2: Spin can assume all values around the circles.

Example 22: XY model


This is a spin model that is invariant with respect to the continuous global
symmetry θi → θi + α. Indeed, the Hamiltonian of this model is

Jij ~Si · ~Sj


X
HXY = − (7.20)
i<j

1
In mathematics, and in particular in group theory, a cyclic permutation (or cycle) is a permu-
tation of the elements of some set X which maps the elements of some subset S of X to each other
in a cyclic fashion, while fixing (that is, mapping to themselves) all other elements ofX. If S has k
elements, the cycle is called a k-cycle. Cycles are often denoted by the list of their elements enclosed
with parentheses, in the order to which they are permuted.
7.3. Continuous symmetries and phase transitions 91

where ~Si is a 2D spin vector


~Si = (Sx , Sy )
i i


that can assume values on the unit circle ( ~Si = 1). Suppose that spins are

sitting in hyper dimensional and can rotate along circles. They can assume all
the value as in Figure 7.2.
The simplest way to parametrize the Hamiltonian is by the angle. Denoting by
θi the direction angle of spins ~Si , the Hamiltonian can be rewritten as
X
HXY = − Jij cos(θi − θj ) (7.21)
i<j

with θi ∈ [0, 2π].


Remark. The interaction term cos(θi − θj ) can be written also as

1 ∗
Zi Zj + Zi Zj∗

2
where Zj = exp(iθj ).
The model is invariant under the global transformation

Zi → eiα Zi (7.22)

The phase exp(iα) form a group under multiplication known as U (1) that is
equivalent to O(2). Indeed, the interaction term can be written also as

Ω̂i · Ω̂j

where Ω̂i = (cos θi , sin θi ).


Remark. In n-dimensions Ω̂ has n components Ω̂ = {Ω1 , Ω2 , . . . , Ωn } and the
corresponding Hamiltonian is
X
H=− Jij Ω̂i · Ω̂j (7.23)
i>j

It is symmetric with respect to the global symmetry group O(n).

Which are the domain walls for continuous symmetries? Which are the implica-
tions for the stability of the ordered phase?

7.3 Continuous symmetries and phase transitions


When the symmetry is continuous the domain walls interpolate smoothly between
two ordered regions (see Figure 7.3). The energy term that in Ising is proportional
to 2JLd−1 , how does it change here?
Let us consider the XY model and suppose that the variation of the direction
between two nearest neighbours sites is very small, i.e. (θi − θj )  1 for i, j nearest
neighbours. Now, we can dilute the energy, in other words we weak the energy term.
Let us do a Taylor expansion of the interaction term
1 X 1

2 2
cos(θi − θj ) ' 1 − (θi − θj ) ⇒ 1 − (θi − θj ) (7.24)
2 2
hiji
Chapter 7. The role of dimension, symmetry and range of interactions in phase
92 transitions

Figure 7.3: For continuous symmetry the domain walls interpolate smoothly between two
ordered regions.

The Hamiltonian can be written as


X 1

2
H ' −J 1 − (θi − θj ) (7.25)
2
hiji

The (7.24) corresponds to the discrete differential operator where θi −θj = ∂x θ, hence

J
Z
H = E0 + d~r (∇θ)2 (7.26)
2
| {z }
E≡Stifness energy

where E0 = 2JN is the energy corresponding to the case in which all the spins are
oriented along a given direction.
Definition 6: Stifness energy
The Stifness energy is defined as
J
Z
E= d~r (∇θ)2 (7.27)
2

where θ(~r) is the angle of a local rotation around an axis and J is the spin
rigidity. For an ordered phase θ(~r) = θ0 .

Let us now imagine a domain wall where θ(~r) rotates by 2π (or 2πm) by using
the entire length of the system (see again Figure 7.3):

2πnx
θ(~r) =
L

where n is the total number of 2π turn of θ in L.


Remark. Note that there is no variation along the other d − 1 dimensions, therefore
we just doing over one dimension.
We consider only the term E (Stifness energy) of the Hamiltonian
2
L
J d−1 L 2πn 2
   
J d 2πnx
Z Z
E = Ld−1 dx = L dx ≈ 2π 2 n2 JLd−2
2 0 dx L 2 0 L
(7.28)
Remark. Unlike the Ising model where E ∼ Ld−1 , here E ∼ Ld−2 ! Hence, if S ≥
kB ln L for a single domain wall, S should dominate if d ≤ 2, the ordered phase is
always unstable and no phase transition is expected for T 6= 0!
7.3. Continuous symmetries and phase transitions 93

Definition 7: Lower critical dimension


The Lower Critical dimension dc is the dimension at which (and below which)
the system does not display a ordered phase (there is no long range order). In
other words if d ≤ dc , we have Tc = 0.

From what we have found before we can say that

• For discrete global symmetries: dc = 1.

• For continuous global symmetries: dc = 2 (Merming-Wagner theorem)2 .

Example 23: XY model transition


The XY model in d = 2 is rather special. While the Mermin–Wagner theo-
rem prevents any spontaneous symmetry breaking on a global scale, ordering
transitions of Kosterlitz-Thouless-type may be allowed. This is the case for the
XY model where the continuous (internal) O(2) symmetry on a spatial lattice
of dimension d ≤ 2, remains zero for any finite temperature T 6= 0 (it do not
display an ordered phase).
Remark. This transition does not imply the spontaneous breaking of the O(2)
symmetry!
However, the theorem does not prevent the existence of a phase transition in
the sense of a diverging correlation length ξ. To this end, the model has two
phases:

• a conventional disordered phase at high temperature with dominating ex-


ponential decay of the correlation function G(r) ∼ exp(−r/ξ) for r/ξ  1;

• a low-temperature phase with quasi-long-range order where G(r) decays


according to some power law, which depends on the temperature, for "suffi-
ciently large", but finite distance r (a  r  ξ with a the lattice spacing).

The transition from the high-temperature disordered phase with the exponential
correlation to this low-temperature quasi-ordered phase is a Kosterlitz–Thouless
transition. It is a phase transition of infinite order.
In the d = 2 XY model, vortices are topologically stable configurations. It is
found that the high-temperature disordered phase with exponential correlation
decay is a result of the formation of vortices. Vortex generation becomes thermo-
dynamically favorable at the critical temperature TKT of the KT transition. At
temperatures below this, vortex generation has a power law correlation (hence,
there is no long range order for T < TKT ).
Many systems with KT transitions involve the dissociation of bound anti-parallel
vortex pairs, called vortex–antivortex pairs, into unbound vortices rather than
vortex generation. In these systems, thermal generation of vortices produces an
even number of vortices of opposite sign. Bound vortex–antivortex pairs have
lower energies than free vortices, but have lower entropy as well.
In order to minimize free energy, F = E − T S, the system undergoes a tran-
sition at a critical temperature, TKT . Below TKT there are only bound vor-
tex–antivortex pairs. Above TKT , there are free vortices.

2
In statistical mechanics, the Mermin–Wagner theorem states that continuous symmetries cannot
be spontaneously broken at finite temperature in systems with sufficiently short-range interactions
in dimensions d ≤ 2. Intuitively, this means that long-range fluctuations can be created with little
energy cost and since they increase the entropy they are favored.
Chapter 7. The role of dimension, symmetry and range of interactions in phase
94 transitions

7.4 Role of the interaction range


So far we have considered models where the interactions were short range. How
things change if long range are considered instead? How does the symmetry broken
depends on the range of interactions?
One can show, for example, that if
J
Jij = , 1≤α≤2 (7.29)
|~ri − ~rj |α
phase with long range order is stable for 0 < T < Tc also for d = 1!
Remark. If α > 2 + ε we get back the physics found for short range interactions. If
α < 1 the thermodynamic limit does not exist.
A limiting case of long range interaction is the infinite range case where all the
spins interact one to another with the same intensity independently on their distance.
No metric is involved (instead of previously where the definition of J of before is a
metric.). It can be solve exactly and later we will see why.

7.4.1 Ising model with infinite range


Let us consider the Hamiltonian
N
J0 X X
− HN ({S}) = Si Sj + H Si (7.30)
2
i,j i

with Si ∈ [−1, +1].


Remark. The sum over i, j is an unrestricted double sum.
The problem with the double sum is that
X
Si Sj ∝ O(N 2 )
i,j

and the thermodynamic limit is ill-defined. To circumvent this problem Mark Kac
suggested to consider a strength
J
J0 = (7.31)
N
this is called the kac approximation. Hence,
N
J X X
− HN ({S}) = Si Sj + H Si (7.32)
2N
i,j i

Lecture 11. with this choice we recover E ∼ O(N ).


Wednesday 20th The partition function is
November, 2019.  
Compiled:
Wednesday 5th
X βJ X X
ZN (T, J, H) = exp Si Sj + βH Si  (7.33)
February, 2020. 2N
{S} ij i

Since there are no restrictions on the double sum, we can write


!  !2
X X X X
Si Sj = Si  Sj  = Si
ij i j i

Rewriting the partition function, we have:


 !2 
X K X X
ZN (T, J, H) = exp Si +h Si  (7.34)
2N
{S} i i
7.4. Role of the interaction range 95

Remark. Recall that we have defined K = βJ and h = βH.


In order to transform the quadratic term into a linear one we make use of the
integral identity known as the Hubbard–Stratonovich transformation (we can do it in
any dimension). Let
X
x≡ Si
i
The key identity in the Hubbard-Stratonovich method is simply an observation of the
result of a Gaussian integral. In the present case it takes the form
r
+∞
NK
Z
Kx2 NK 2
e 2N = e− 2
y +Kxy
dy , Re K > 0 (7.35)
2π −∞

where y is a random field that follows a random distribution.

Proof of Hubbard-Stratonovich identity. To show the identity (7.35) it is suf-


ficient to complete the square

NK 2 NK  x 2 Kx2
y + Kxy = −
− y− +
2 2 N 2N
and then shifting the integral to one over z ≡ y − Nx .


Hence,
Z +∞ r
Kx2
− NK
(y− x 2
) (a) Kx2 2π
e 2N e 2 N dy = e 2N
−∞ N K
where in (a) we have considered z ≡ y − Nx , dz = dy and the integral


Z +∞ r
−αz 2 π
e dz =
−∞ α

with α ≡ 2 .
NK

By using (7.35) in the partition function, we have


r  
Z +∞
NK NK 2 X P
ZN (K, h) = dy e− 2 y  e(h+Ky) i Si  (7.36)
2π −∞
{S}
| {z }
Qy

where
 
PN N
exp[(h + Ky)Si ] = (2 cosh(h + Ky))N
X Y X
Qy = e(h+Ky) i=1 Si
= 
{S} i=1 Si =±1
(7.37)
Remark. y is called auxiliary field and is a fluctuating external field with Gaussian
distribution.
The partition function becomes
r r
N K +∞ N K +∞
Z Z
NK 2
ZN (K, h) = dy e− 2 y N
(2 cosh(h + Ky)) = dy eN L(K,h,y)
2π −∞ 2π −∞
(7.38)
where
K
L(K, h, y) = ln [2 cosh(h + Ky)] − y 2 (7.39)
2
Chapter 7. The role of dimension, symmetry and range of interactions in phase
96 transitions

Remark. In the limit N → ∞ the integral can be computed exactly by the saddle
point method. We can replace the medium of the integral with the maximum of the
integrand, we say that all the information is coming only from a bit of information.
Replacing the all integral with the integrand computed where it is maximum is an
approximation and we are loosing information. It also depends on the form of the
function. For example, for a delta function it works better. In general:
Z +∞
f (x) dy → f (x̄)
−∞

where x̄ = maxx f (x).


Indeed as N → ∞, since the integrand is exp(N L(K, h, y)), the integral is domi-
nated by the global maximum in y of the function L(K, h, y):
r
N 1 NK h i
ZN (K, h) ≈ max eN L(K,h,y)
2π y
Let ys be the value of y at which
L(K, h, ys ) = max L(K, h, y)
y

hence, r
N K N L(K,h,ys )
N 1
ZN (K, h) ≈ e (7.40)

When we are able to compute the ys we can do this approximation and we can
compute the bound free energy as
1
fb (K, h) = lim (−kB T log ZN ) = −kB T L(K, h, ys ) (7.41)
N →∞ N

Example 24: How to compute ys


Looking for ys , we consider the condition of maximum ∂L
∂y = 0:

∂L sinh(h + Ky)K
= − Ky = 0 ⇒ ys = tanh(h + Kys ) (7.42)
∂y cosh(h + Ky)

The last one is an implicit equation that can be solved graphically as a function
of K and h.

The magnetization in the N → ∞ limit is given by


 
∂f 1 ∂ ln ZN (K, h)
m=− = lim
∂H T N →∞ βN ∂H
∂L(K, h, ys ) O(log N ) 2 sinh(Kys + h)
= + =
∂h N 2 cosh(Kys + h)
= tanh(Kys + h)
Hence, showing that ys is determined by Eq.(7.42) plays the role of an effective field
acting on each spin. Comparing Eq.(7.42) with the last result, gives us the self
consistency condition for m
m ≡ ys ⇒ m = tanh(h + Km) (7.43)
Remark. We have solved analitically this problem. This is the usual “mean field”
result.
Remark. The a Hubbard-Stratonovich transformation is generally useful for trans-
forming an interacting problem to a sum or integration over non-interacting prob-
lems.
Chapter 8

Mean field theories of phase


transitions and variational mean
field

8.1 Mean field theories


Increasing the dimension of the systems, the effort to solve analitically the prob-
lems increase; indeed, we have seen that
• In d = 1: many (simple) models can be solved exactly using techniques such as
the transfer matrix method.
• In d = 2: few models can still be solved exactly (often with a lot of effort).
• In d = 3: almost no model can be exactly solved.
Hence, approximations are needed. The most important and most used one is the
mean field approximation. It has different names depending on the system considered:
• Magnetic systems: Weiss theory.
• Fluids systems: Van der Walls.
• Polymers: Flory’s theory.
The idea is trying to simplify the problem by neglecting the correlation between
the fluctuations of the order parameter. It is equivalent to a statistical independence
of the microscopic degrees of freedom.

8.1.1 Mean field for the Ising model (Weiss mean field)
Let us start from the generic Ising model
1X X
H[{S}] = − Jij Si Sj − H Si (8.1)
2
ij i

where the double sum over i and j have no restrictions, while H is homogeneous.
The partition function is
X
ZN (T, H, {Jij }) = e−βH[{S}] = exp(−βFN (T, H, {Jij })) (8.2)
{S}

Since H is uniform, the magnetization per spin is


hSi i = hSi ≡ m

97
98 Chapter 8. Mean field theories of phase transitions and variational mean field

Let us now consider the identity


Si Sj = (Si − m + m)(Sj − m + m)
= (Si − m)(Sj − m) + m2 + m(Sj − m) + m(Si − m)
Remark. The mean field approximation consists in neglecting the term
(Si − m)(Sj − m) = (Si − hSi i)(Sj − hSj i)
that measures correlation between fluctuations.
Hence, using the mean field approximation, the above identity becomes
Si Sj ≈ m2 + m(Si − m) + m(Sj − m)
and
1X MF 1 X
Jij −m2 + m(Si + Sj )
 
Jij Si Sj ≈
2 2
i,j i,j
Let us focus on the term
1X 1 X
Jij m(Si + Sj ) = 2 m Jij Si (8.3)
2 2
i,j i,j

If we do not make any assumption on Jij , the mean field Hamiltonian is


1 X X X
HM F [{S}] = m2 Jij − m Jij Si − H Si (8.4)
2
ij ij i

and by calling
X
J̄i ≡ Jij
j
we get
1 X m X X
HM F [{S}] = m2 J̄i − J̄i Si − H Si
2 2
i i i
Remark. Note the coefficient emphasized in green (1/2) is needed to avoid the double
counting of bonds.
Moreover, if we suppose that
J̄i → J¯
we have
1 m X
HM F [{S}] = m2 N J¯ − J¯ + H Si (8.5)
2 2
i

Remark. In the standard Ising model, where


1X X
Jij Si Sj → Jij Si Sj
2
ij hiji

the term 2m of Eq.(8.3) can be written as follows. Let


P
hiji Jij Si

Jij = z Jˆi
X

j∈n.n. of i

where z is the coordination number of the underlying lattice (for the hypercubic
lattice z = 2d). By assuming Jˆi = Jˆ and inserting the 1/2 to avoid double counting,
we have that equation (8.3) becomes
N
1 X
Jij Si = 2m z Jˆ
X
2m Si (8.6)
2
hiji i=1
8.1. Mean field theories 99

Hence, in this case the Hamiltonian is


N
1
HM F [{S}] = m2 N z Jˆ − (mz Jˆ + H)
X
Si (8.7)
2
i=1

The partition function becomes


ˆ = e−N β Jˆ z2 m2 ˆ
X PN
ZN (T, H, J) eβ(Jzm+H) i=1 Si

{S}
N    
ˆz 2
XY
ˆ
= e−N β J 2 m exp β Jzm + H Si
{S} i=1 (8.8)
!N
   
ˆz 2
X
ˆ
= e−N β J 2 m exp β Jzm +H S
S=±1
 h  iN
−N β Jˆ z2 m2 ˆ
=e 2 cosh β Jzm +H
Remark. We are replacing the interaction of the J with a field close to the Si . We
ˆ
called Jzm = Hef f , the mean field!
The free energy per spin is
ˆ
FN (T, H, J) 1 
= ˆ
−kB T ln ZN (T, H, J)
N N (8.9)
1ˆ 2 h 
ˆ
i
= Jzm − kB T ln cosh β(Jzm + H) − kB T ln 2
2
Sometimes it is useful to use the dimensionless variables defined as
FN kB T H
f¯ ≡ , θ≡ , H̄ ≡ (8.10)
N z Jˆ z Jˆ z Jˆ
Hence,
1
f¯(m, H̄, θ) = m2 − θ ln 2 cosh θ−1 (m + H̄) (8.11)

2
In order to be a self-consistent, the last equation has to satisfy the thermodynamic
relation:  
∂f 
ˆ

m=− ⇒ m = tanh β(Jzm + H)
∂H T
Remark. The results of m is similar to the Ising with infinite range (Jz ˆ ↔ J).
Now, let us consider the H = 0 case, we have
 
ˆ
m = tanh β(Jzm) (8.12)

and the graphical solution is shown in Figure 8.1 (hyperbolic function). We can
distinguish three cases:
ˆ > 1: there are three solutions, one at m = 0 and two symmetric at
• Case β Jz
m = ±m0 . Magnetization is 6= 0 (= |m0 |) for H = 0 (ordered phase). The two
solution are symmetric because they are related by the Z2 symmetry.
ˆ < 1: single solution at m = 0 (disordered or paramagnetic phase).
• Case β Jz
• Case β Jzˆ = 1: the three solutions coincide at m = 0 (critical point). The
critical temperature Tc is given by
ˆ ˆ
ˆ = 1 ⇒ z J = 1 ⇒ Tc = z J 6= 0!
βc Jz
kB Tc kB
Remark. Tc depends on z and hence on d !
100 Chapter 8. Mean field theories of phase transitions and variational mean field

ˆ >1
Jβz

ˆ =1
Jβz

ˆ <1
Jβz
−m0

+m0

 
ˆ
Figure 8.1: Graphical solution of equation m = tanh β(Jzm) (case H = 0).

8.1.2 Free-energy expansion for m ' 0


The critical point is characterized by the order parameter that is zero. Now, we
want to expand the free energy around the critical point. Let us put H = 0:

ˆ = 1 Jzm
h  i
f (m, 0, T, J) ˆ 2 − kB T ln cosh β Jzm
ˆ (8.13)
2
ˆ
Define x ≡ β Jzm ' 0 and by expanding in Taylor series

x2 x4
cosh(x) ' 1 + + +...
|2 {z 4!}
t'0

1
log (1 + t) ' t − t2
2
Hence,
x2 x4 1 x4 x2 x4
log (cosh x) ' + − + O(x6 ) = − + O(x6 )
2 4! 2 4 2 12
This gives the result

ˆ ' const + A m2 + B m4 + O(m6 )


f (m, 0, T, J) (8.14)
2 4
with
 
ˆ 1 − β Jz
A ≡ Jz ˆ (8.15a)
ˆ 4
(Jz)
B ≡ β2 >0 (8.15b)
3
We have three cases:

ˆ > 1 ⇒ A < 0: two stable symmetric minima at m = ±m0 (Figure


• Case β Jz
8.2). Coexistence between the two ordered phases.
ˆ < 1 ⇒ A > 0: one minimum at m = 0 (Figure 8.3).
• Case β Jz
8.1. Mean field theories 101

ˆ = 1 ⇒ A = 0: 3 minima coincide at m = 0 (Figure 8.4).


• Case β Jz

Remark. Note that in the computations we have just made we have never imposed a
particular value for the dimensionality of the system. This means that the results of
this approximation should be valid also for d = 1, but we know that in one dimension
the Ising model does not exhibit a phase transition. This is an expression of the
fact that in the one-dimensional case mean field theory is not a good approximation
(again, the dimensionality of the system is still too low).

−m0 m0

ˆ > 1 ⇒ A < 0.
Figure 8.2: Plot of the free energy: case β Jz

ˆ < 1 ⇒ A > 0.
Figure 8.3: Plot of the free energy: case β Jz

ˆ = 1 ⇒ A = 0.
Figure 8.4: Plot of the free energy: case β Jz

8.1.3 Mean field critical exponents


Lecture 12.
Let us consider the equation Friday 22nd
A 2 B 4 November, 2019.
f (m, T, 0) ≈ const + m + m + O(m6 ) Compiled:
2 4 Wednesday 5th
February, 2020.
102 Chapter 8. Mean field theories of phase transitions and variational mean field

with B > 0, so we do not need more term to find the minima of the solution. This
ˆ
is called stabilization. What is most important is the coefficient A = Jz(1 ˆ
− β Jz),
that means that A can change sign.

β exponent

The β exponential observe the order parameter. Consider H = 0, t ≡ T −Tc


Tc and
t→0−
m ∼ −tβ . The condition of equilibrium is

∂f
=0
∂m
which implies

∂f 3
h
ˆ ˆ 2
i
= Am0 + Bm 0 = Jz(1 − β Jz) + Bm 0 m0 = 0
∂m m=m0

ˆ
Since at the critical point we have Tc = Jz
kB :

kB Tc
0= (T − Tc )m0 + Bm30
T
The solution are m0 = 0 and
m0 ' (Tc − T )1/2 (8.16)

Hence, the mean field value is β = 1/2.

δ exponent
Now, let us concentrate in the δ exponent. We are in the only case in which we
are in T = Tc and we want to see how the magnetization decrease: H ∼ mδ .
Starting from the self-consistent equation, we have
 
m = tanh β(Jzmˆ + H) (8.17)

Inverting it
ˆ
β(Jzm + H) = tanh−1 m

On the other hand, for m ∼ 0

m3 m5
tanh−1 m ' m + + + ...
3 5
Therefore, by substituting

m3 3
 
ˆ m + kB T m + . . .
 
H = kB T m + + ... ˆ
− Jzm = kB T − Jz
3 3
kB T 3
' kB (T − Tc )m + m
3
ˆ
At T = Tc = kB ,
Jz
we have
m3
H ∼ kB Tc (8.18)
3
The mean field value is δ = 3.
8.1. Mean field theories 103

α exponent
Consider the α exponent, for H = 0, cH ∼ t−α and t = (T − Tc )/Tc . Compute
the specific heat at H = 0. Consider first T > Tc , where m0 = 0,
ˆ 2
Jzm 1   
f (m, H) = ˆ
− ln 2 cosh β(Jzm + H) − kB T ln 2
2 β
If m = 0, cosh 0 = 1 and
f = −kB T ln 2
it is called paramagnetic phase. Indeed,

∂2f
 
cH = −T =0 (8.19)
∂T 2

The mean field value is α = 0.


Remark. For T < Tc , m = m0 6= 0. This implies that cH 6= 0, but still f = −kB T ln A
with A = const. We obtain α = 0 also in this case.
s
ˆ
Jz
m0 = ± − (T − Tc )
2Tc

γ exponent
Now we consider the γ exponent, for H = 0, χ ∼ t−γ . Starting again from
equation (8.17):  
ˆ
m = tanh β(Jzm + H)

and developing it around m ' 0, as shown before we get


kB T 3
H = mkB (T − Tc ) + m
3
∂m 1
⇒ χT = = ∂H
∂H ∂m

Since ∂H
∂m ' kB (T − Tc ) + KB T m2 , as m → 0

χ ∼ (T − Tc )−1 (8.20)

The mean field value is γ = 1.

Summary
The mean field critical exponents are
1
β= , γ = 1, δ = 3, α=0 (8.21)
2
We can immediately note that these exponents are different from those found by
Onsager for the Ising model in two dimensions, so the mean field theory is giving us
wrong predictions. This is because mean field theories are good approximations only
if the system has a high enough dimensionality (and d = 2 is still too low for the
Ising model, see Coarse graining procedure for the Ising model).
Remark. In the mean field critical exponents the dimension d does not appear. Tc
instead depends on the number of z of neirest neighbours and hence on the embedding
lattice (on the dimension)!
104 Chapter 8. Mean field theories of phase transitions and variational mean field

Remark. (lesson) The ν exponent define the divergence of the correlation lengths. In
order to do that, in principle we should compute the correlation function, but which
are the correlation we are talking about? The correlation or the fluctuation with to
respect the average? In the ferromagnetic we have infinite correlation lengths, but it
is not true, because instead of that we consider the variation correlated! Which is the
problem here? In mean field we were neglecting correlation between fluctuation. We
thought: let us compute neglecting correlation. How we can compute the correlation
function within the mean field theory with thermal fluctuations? We look at the
response of the system. Experimentally what can we do? It is a magnetic field, but
we cannot use homogeneous magnetic field. Another way to compute the correlation
function without looking at thermal fluctuation it is by considering a non homoge-
neous magnetic field. If we make a variation in Hi in the system, what happened in
the Hj ? This is an important point.

8.2 Mean field variational method


The mean field variational method is a general approach to derive a mean field
theory. The method is valid for all T and is sufficiently flexible to deal with complex
systems. The method is similar to the one used in quantum mechanics, namely it is
based on the following inequality

Eα = hψα | Ĥ |ψα i ≥ E0 (8.22)


valid for all trial function ψα .
Remark. E0 is the ground state energy.
Example 25
In many body problem we have Hartree and Hartree-Fock variational methods.

The closest bound to E0 is the one that is obtained by minimizing Eα , i.e.


hψα | Ĥ |ψα i over |ψα i, where the |ψα i are functions to be parametrized in some con-
venient way.
The method is based on the following inequalities
1. Let Φ be a random variable (either discrete or continuous) and let f (Φ) be a
function of it.
For all function f of Φ, the mean value with respect to a distribution function
p(Φ) is given by
hf (Φ)ip ≡ Tr(p(Φ)f (Φ)) (8.23)
If we consider the function
f (Φ) = exp[−λΦ] (8.24)
it is possible to show the inequality
D E
e−λΦ ≥ e−λhΦip , ∀p (8.25)
p

Proof of inequality (8.25). ∀Φ ∈ R, eΦ ≥ 1 + Φ. Hence,


e−λΦ = e−λhΦi e−λ[Φ−hΦi] ≥ e−λhΦi (1 − λ(Φ − hΦi))
Taking the average of both sides, we get
D E D E
→ e−λΦ ≥ (1 − λ(Φ − hΦi))e−λhΦi = e−λhΦip
p p


8.2. Mean field variational method 105

2. The second inequality refers to the free energy. Let ρ(Φ) be a probability
distribution, i.e. such that

Tr(ρ(Φ)) = 1, ρ(Φ) ≥ 0 ∀Φ (8.26)

Hence,
D E
e−βFN = ZN = Tr{Φ} e−βH[{Φ}] = Tr{Φ} ρe−βH−ln ρ = e−βH−ln ρ
ρ

From the inequality (8.25),


D E
e−βFN = e−βH−ln ρ ≥ e−βhHiρ −hln ρiρ
ρ

Taking the logs one has

F ≤ hHiρ + kB T hln ρiρ = Tr(ρH) + kB T Tr(ρ ln ρ) ≡ Fρ (8.27)

Whenever we are able to write the last equation by using a ρ, then we will
minimize it. This is the variational approach of statistical mechanics. The
question is: which is the ρ that minimizes?
The functional Fρ will reach its minimum value with respect to the variation of
ρ with the constraint Tr(ρ) = 1, when

1 −βH
ρ̄ = ρeq = e (8.28)
Z
So far so good but not very useful, since we are back to the known result that the
distribution that best approximately the free energy of the canonical ensemble
is given by the Gibbs-Boltzmann distribution. To compute ρeq , we need some
approximation!

8.2.1 Mean field approximation for the variational approach


Let us now try to compute the Z by starting from the inequality (8.27). Up to now
everything is exact. The idea is to choose a functional form of ρ and then minimize
Fρ with respect to ρ. Note that ρ is the N −point probability density function (it is
a function of all the degrees of freedom):

ρ = ρ(Φ1 , . . . , ΦN )

it is a N −body problem, where Φα is the random variables associated to the α−esim


degree of freedom. This is in general a very difficult distribution to deal with. This
is equivalent exactly at
~ 1 , . . . ,~rN , P
ψα (~r1 , P ~ N)

The mean-field approximation consists in factorising ρ into a product of 1−point


distribution function:
N N
MF Y Y
ρ(Φ1 , . . . , ΦN ) ' ρ(1) (Φα ) ≡ ρα (8.29)
α=1 α=1

where we have used the short-hand notation ρ(1) (Φα ) → ρα .


Remark. Approximation (8.29) is equivalent to assume statistical independence be-
tween particles (or more generally between different degrees of freedom). The inde-
pendence of the degree of freedom is a very strong assumption!
106 Chapter 8. Mean field theories of phase transitions and variational mean field

Example 26
Let us consider the spin model on a lattice; what is the Φα ? We have:

Φα → Si

Hence, ρ = ρ(S1 , S2 , . . . , SN ) and (8.29) becomes


N N
MF Y Y
ρ ' ρ(1) (Si ) ≡ ρi
i=1 i=1

With Eq.(8.29) and the condition Tr(ρα ) = 1, we compute the two averages in the
Eq.(8.27) given the field. We have:
!!
to do
Y X X
Tr{Φ} (ρ ln ρ) = Tr ρα ln ρα = Tr(α) (ρα ln ρα ) (8.30)
α α α

where Tr(α) means sum over all possible values of the random variable Φα (with α
fixed and Tr(α) ρα = 1).
We end up that
X
FρM F = hHiρM F + kB T Tr(α) (ρα ln ρα ) (8.31)
α

Remark. FρM F = F ({ρα }) and we have to minimize it with respect to ρα .


How can we parametrize ρα ? There are two approaches that are mostly used:
1. Parametrize ρα ≡ ρ(1) (Φα ) by the average of Φα with respect to ρα , hΦα iρα (in
general is the local order parameter):

ρα = ρ(1) (Φα ) → hΦα iρα

This means that there are two constraints in the minimization procedure:

Tr(α) ρα = 1, Tr(α) (ρα Φα ) = hΦα i

where the second is the self-consistent equation.


Remark. In this case the variational parameter coincides with the order param-
eter.
2. In the second approach is ρα itself the variational parameter. FρM F is mini-
mized by varying ρα . It is a more general approach, that involves functional
minimization.

8.2.2 First approach: Bragg-Williams approximation


We apply this approach to the Ising model with non uniform magnetic field. The
Hamiltonian of such a system is
X X
H[{S}] = −J Si Sj − Hi Si (8.32)
hiji i

It means that
Φα → Si = ±1
and that the variational parameter becomes the order parameter

hΦα i → hSi i ≡ mi
8.2. Mean field variational method 107

Remark. Note that this time H → Hi (non-uniform), hence mi depends on the site
i.
We have to define a 1−particle probability density distribution ρi ≡ ρ(1) (Si ) such
that (
(1) Tr ρi = 1
ρi ≡ ρ (Si ) → (8.33)
Tr ρi Si = mi
Since we have to satisfy these two constraints, we need two free parameters. A linear
functional form is sufficient. Denoting by:

• a: statistical weight associated to the value Si = −1.

• b: statistical weight associated to all the remaining possible values of Si (for an


Ising only one value remains, i.e. Si = +1).

The simplest function form with two parameters is the linear function, namely

ρi ≡ ρ(1) (Si ) = a(1 − δSi ,1 ) + bδSi ,1 (8.34)

Using the constraints Lecture 13.


( Wednesday 27th
Tr(i) (ρi ) = 1 →a+b=1 November, 2019.
Compiled:
Tr(i) (ρi Si ) = mi → a − b = mi Wednesday 5th
February, 2020.
where a, b are the functions of the order parameter. In that case we have not to write
the functions for all the i. For Si = 1 we have one value, for all the other values
another one. The results of the previous equation are:
(
a = 1−m
2
i

b = 1+m
2
i

Hence,
1 − mi 1 + mi
ρi = (1 − δSi ,1 ) + δSi ,1 (8.35)
2 2
that in matrix form can be expressed as
!
(mi +1)
0
ρi = 2
(1−mi ) (8.36)
0 2

Mean field energy term


Let us consider the average of the Hamiltonian
* +
X X X X
hHiρM F = −J Si Sj − Hi Si = −J hSi Sj iρM F − Hi hSi iρM F
hiji i hiji i
ρM F
(8.37)
Since we have
N
Y
ρM F = ρi
i=1

the term hSi Sj iρM F will transform into

hSi Sj iρM F = hSi iρM F hSj iρM F


108 Chapter 8. Mean field theories of phase transitions and variational mean field

Moreover, for all function g of Si we can write


X
hg(Si )iρM F = Tr(i) (g(Si )ρi ) = g(Si )ρi
Si =±1
1 − mi
 
X 1 + mi
= g(Si ) δSi ,1 + (1 − δSi ,1 )
2 2
Si =±1
1 + mi 1 − mi
= g(1) + g(−1)
2 2
Note that, if g(Si ) = Si , we have g(1) = +1 and g(−1) = −1, hence

hSi iρM F = mi

as expected. Taken this into account, the Hamiltonian can be rewritten as


X X
hHiρM F = −J mi mj − Hi mi (8.38)
hiji i

Remark. This has the form of the original Hamiltonian where Si had been replaced
by their statistical averages.
The entropy term is:
MF
X
hln ρiρM F = Tr(ρ ln ρ) = Tr(i) (ρi ln ρi )
i
(8.39)
X  1 + mi 1 + mi 1 − mi 1 − mi

= ln + ln
2 2 2 2
i

The total free energy in Eq.(8.27) becomes:

FρM F = hHiρM F + kB T hln ρiρM F


X X X  1 + mi 1 + mi 1 − mi 1 − mi 
= −J mi mj − Hi mi + kB T ln + ln
2 2 2 2
hiji i i
(8.40)

We now look for the values mi = m̄i , that minimizes FρM F (equilibrium phases):

∂FρM F
=0
∂mi mi =m̄i

This gives:  
X kB T 1 + m̄i
0 = −J m̄j − Hi + ln
2 1 − m̄i
j∈ n.n. of i

To solve it, remember that


1 1+x
tanh−1 (x) = ln |x| < 1
2 1−x
Hence,
X
kB T tanh−1 (m̄i ) = J m̄j + Hi
j∈ n.n. of i

which implies   
X
m̄i = tanh (kB T )−1 J m̄j + Hi 
j∈ n.n. of i
8.2. Mean field variational method 109

We have again found the self-consistency equation for the magnetization that we have
already encountered in the Weiss mean field theory for the Ising model! This is again
a confirmation that all mean field theories are equivalent. Defining
X
z m̄i ≡ m̄j
j∈ n.n. of i

we get
m̄i = tanh [β(Jz m̄i + Hi )] (8.41)
this is the Bragg-William approximation.

A
A

B
B

(b) Triangular lattice is not bipartite.


(a) Square lattice is bipartite.

Figure 8.5: Ising anti-ferromagnet in an external field.

Example 27: Ising anti-ferromagnet in an external field


Let us consider the model
X X
H= +J Si Sj − H Si , (8.42)
hiji i

Note the + sign before J, this means that the interactions are anti-ferromagnetic.
Let us consider two cases:

• If H = 0 ferromagnetic and anti-ferromagnetic behave similarly when the


interactions are between nearest neighbours on a bipartite lattice, i.e. a
lattice that can be divided into two sublattices, say A and B, such that a
A site has only B neighbours and a B site only A ones.
Remark. FCC is not bipartite, while BCC it is. See Figure 8.5.

If the lattice is bipartite and Jij is non zero only when i and j belong to
different sublattices (they do not have to be only n.n.!), one can redefine
the spins such that (
+Sj j ∈ A
Sj0 =
−Sj j ∈ B
Clearly, Si0 Sj0 = −Si Sj . It is like if the Jij have changed sign and we are
formally back to ferromagnetic model for the two sublattices:
X
H∗ = − J Si0 Sj0 (8.43)
hiji

i.e. a ferromagnetic Ising.

• In presence of a magnetic field H, we need to reverse its sign when applied


to sites B.
110 Chapter 8. Mean field theories of phase transitions and variational mean field

The thermodynamic of a ferromagnetic Ising model on a bipartite lattice


in a uniform magnetic field H is identical to the one of the Ising antifer-
romagnetic model in presence of the so called staggered field, i.e. HA = H
and HB = −H. The Hamiltonian is
X X X
H∗ [S] = −J S(rA )S(rB )−H S(rA )+H S(rB ), J > 0, H > 0
hrA rB i rA rB
(8.44)
The average magnetization per spin is
1
m ≡ (mA + mB )
2
while
1
mS = (mA − mB )
2
is the staggered magnetization.
In order to use the variational density matrix method for this problem we
consider two independent variational parameters mA and mB for sublattice
A and B respectively. On each sublattice, the model is like the standard
Ising ( (1)
ρA (S) = 1+m2 δS,1 +
A 1−mA
2 δS,−1
(1) 1+mB 1−mB
ρB (S) = 2 δS,1 + 2 δS,−1

Remark. Note that, being H uniform, hSi i = m, i.e. does not depend on
(1) (1)
i. Same for the 1−particle distribution functions ρA (S) and ρB (S).

By performing the calculation for the terms


X X
hHiρM F = −J hSi Sj iρM F − H hSi iρM F
hiji i

X
hln ρiρM F = Tr(i) (ρi ln ρi )
i

as before, but remembering to partition the procedure into the two sub-
lattices A and B, one can show that the variational free energy is given
by

F (mA , mB ) z Jˆ 1 1 1
= mA mB − H(mA + mB ) − kB T s(mA ) − kB T s(mB )
N 2 2 2 2
(8.45)
where the entropy term is

1−m 1−m
    
1+m 1+m
s(m) = ln + ln
2 2 2 2

By differentiating with respect to mA and mB , one gets


F
N
 
∂(F/N ) H kB T 1 + mA
=0 ⇒ mB = − ln
∂mA z Jˆ z Jˆ 1 − mA
 
∂(F/N ) H kB T 1 + mB
=0 ⇒ mA = − ln
∂mB z Jˆ z Jˆ 1 − mB
8.2. Mean field variational method 111

As before, since
1 1+x
tanh−1 (x) = ln
2 1−x
these self-consistent equations can be written as
   
mA = tanh β H − z Jm ˆ B
   (8.47)
mB = tanh β H − z Jm ˆ A

ˆ B from the B
The sites ∈ A experience an internal field HA,M F = −z Jm
neighbours and vice versa for the sites ∈ B.

8.2.3 Second approach: Blume-Emery-Griffith model


We apply this approach to the so called Blume-Emery-Griffith model. This is a
spin model with vacancies that describes the phase diagram and the critical properties
of an interacting system displaying a tricritical point. Perhaps the most famous of
these systems is the He3 − He4 mixture undergoing a fluid-superfluid transition.
Remark. He4 is a non radiative isotope with two protons and two neutrons. Roughly
1/4 of the universe matter is He4 ! From a quantum statistical point of view He4 is a
boson.
A gas of He4 undergoes a fluid-superfluid transition at Tλ = 2.17K and P = P0 .
It is known as λ−transition since at T ∼ Tλ the specific heat c(T ) behaves as in
Figure 8.6a: the plot of the specific heat as a function of the temperature has a shape
that resembles a λ. The λ−transition is a genuine critical point (second order). For
T < Tλ , He4 is in the superfluid phase and it can be described by a two-fluids model
in which one component has zero viscosity and zero entropy.

C P
λ − shape melting
curve

λ − line

He4I
He4II

1 2.17 T
T
(a) Plot of the specific heat c(T ). It has the
shape of a λ. (b) (P, T ) phase diagram.

Figure 8.6

The BEG model is used to describe what happens when we add some He3 to
the system constituted by He4 ; it does not consider quantum effects, but only the
"messing up" due to the He3 impurities.
Remark. He3 is a non-radioactive isotope with 2 protons and 1 neutron. From a
quantum statistical point of view is a fermion.
Experimentally when He3 is added to He4 the temperature of the fluid-superfluid
transition decreases. More specifically, if inserted in a system of He4 it will "dilute"
112 Chapter 8. Mean field theories of phase transitions and variational mean field

its bosonic property. Then, one expects that Tλ decreases, as observed. Denoting by
x the concentration of He3 , one observes

Tλ = Tλ (x)

with Tλ (x) that decreases as x increases.


For small concentration of He3 the mixture remains homogeneous, and the only
effect is the change of Tλ . However, when the concentration x of He3 reaches the
critical value xt
n3
x > xt = ∼ 0.67
n3 + n4
He3 and He4 separate into two phases (just like oil separates from water, the mixture
undergoes a separation between a phase rich and a phase poor of He3 ) and the λ
transition becomes first-order (namely, discontinuous). The transition point (xt , Tt )
where the system shifts from a continuous λ-transition to a discontinuous one is that
where the phase separation starts and is called tricritical point (i.e. it is a critical point
that separates a line of second order transition from a line of first order transition).
The BEG model was introduced to describe such a situation.

BEG Model
As we have anticipated, the BEG Model is a lattice gas model and so it is based
on an Ising-like Hamiltonian. In particular, it is the model of a diluted ferromagnetic
system. On the sites of this lattice we define a variable Si which can assume the
values −1,0 and +1: we decide that when an He4 atom is present in a lattice site
then Si = ±1, while when Si = 0 it means that the site is occupied by an He3 atom.
We then define our order parameter to be

hSi i = mi

In the Ising model Si can


only
2
be equal to 1, while in this case it can be either 0

or 1: we can thus interpret Si2 as the concentration of He4 atoms, and

x ≡ 1 − Si2

as the fraction of He3 . We also define

∆ ∝ µHe3 − µHe4

to be the difference of the chemical potentials of He3 and He4 ; since this parameter
is related to the number of He3 and He4 atoms, we expect that when
• x → 0 (namely, there is only He4 ), we have ∆ → −∞.
• x → 1 (namely, there is only He3 ), we have ∆ → +∞.
and the order parameter for the λ−transition becomes
(
0 T > Tλ
hSi i =
m T < Tλ

We consider the following Hamiltonian for the system:


N
X N
X
H = −J Si Sj + ∆ Si2 − ∆N (8.48)
hiji i=1

Remark. N is the total number of lattice sites The ∆N term is a typical term for a
gas in gran canonical ensemble.
8.2. Mean field variational method 113

Variational mean field approach to BEG


Since we want to apply the second variational method that we have seen, we write
the mean field probability density as:
Y Y
ρM F = ρi = ρ(Si )
i i

and the free energy:


X
G(T, J, ∆) = hHiρM F + kB T Tr(ρi ln ρi ) (8.49)
i

The mean value of the Hamiltonian is:


X X

hHiρM F = −J hSi Sj i + ∆ Si2 − N ∆
hiji i

and since hSi Sj i = hSi i hSj i (it’s the fundamental hypothesis of mean field theories)
we get
MF X X

hHiρM F ' −J hSi i hSj i + ∆ Si2 − N ∆
hiji i

We have also
hSi i = hSj i ≡ m
Therefore, the free energy of the system is:
1
G(T, J, ∆)M F = − N Jz(TrSi (ρi Si ))2 + N ∆ TrSi (ρi Si2 ) − N ∆ + N kB T TrSi (ρi ln ρi )
2
(8.50)
where z is the coordination number of the lattice.
We now must minimize this expression with respect to ρi , with the constraint
TrSi (ρi ) = 1:
dG
=0
dρi
Let us consider each term
d
(Tr(ρi Si ))2 = 2(Tr(ρi Si ))Si = 2 hSi i Si = 2mSi
dρi
d
Tr ρi Si2 = Si2

dρi
d
(Tr(ρi ln ρi )) = ln ρi + 1
dρi
then,
dG
= −JN zmSi + N ∆Si2 + N kB T ln ρi + N kB T = 0
dρi
Dividing by N kB T ,
ln ρi ≡ ln ρ(1) (Si ) = βJzmSi − β∆Si2 − 1
which leads to
1 β(zJmSi −∆S 2 )
ρ(1) (Si ) = e i (8.52)
A
where we have reabsorbed e−1 into the normalization constant A. The constant A
can be found by imposing the constraint TrSi ρ(1) (Si ) = 1, we find

A = 1 + 2e−β∆ cosh(βzJm) (8.53)


114 Chapter 8. Mean field theories of phase transitions and variational mean field

Example 28: How to compute A


By imposing the constraint TrSi ρ(1) (Si ) = 1 (recall that Si = ±1, 0), we get

1  β(zJm(+1)−∆(+1)2 ) 2 2

1= e + eβ(zJm(−1)−∆(−1) ) + eβ(zJm(0)−∆(0) )
A
Hence, by rearranging
1  −β∆ 
1= 2e cosh (βzJm) + 1 ⇒ A = 1 + 2e−β∆ cosh(βzJm)
A

Given ρ(1) (Si ) it is possible to show



2 1
Si = TrSi (ρi Si2 ) = 2e−β∆ cosh(βzJm)
A
and

A − 2e−β∆ cosh(βzJm) 1
x = 1 − Si2 = ⇒x=
A A
Hence, substituting this expression of ρi into G, after some mathematical rearrange-
ment we get:
G(T, ∆, m, J) z
= Jm2 − ∆ − kB T ln A (8.54)
N 2
In order to find the equilibrium state for any T and ∆, we must minimize this
expression of G(T, ∆, m, J) with respect to m. If we expand G for small values of m,
keeping in mind the Taylor expansions

t2 t4 t2
cosh(t) = 1 + + , ln (1 + t) = t −
2 24 2
we get

c(T, ∆) 6
G(T, ∆, J, m) = a0 (T, ∆) + a(T, ∆)m2 + b(T, ∆)m4 + m (8.55)
6
where 
−β∆ − ∆


a0 (T, ∆) = −k B T ln 1 + 2e

a(T, ∆) = zJ 1 − zJ

2 δkB T
(8.56)
b(T, ∆) = zJ2 (βzJ)3 1 − δ



 8δ 3
c(T, ∆) > 0

and the parameter δ is


eβ∆
δ ≡1+ = δ(T, ∆) (8.57)
2
Note that unlike the Ising model in the Weiss approximation in this case both the
quadratic and the quartic terms, a and b, can change sign when the parameters
assume particular values. Let us also note that the order parameter of the system,
namely the concentration of He3 , is:
1 1
x(T, ∆, J) = 1 − Si2 =


= −β∆
A 1 + 2e cosh(βzJm)

Therefore, in the disordered phase (both He3 and He4 are present) we have m = 0
and the concentration of He3 becomes:
1 1
x(T, ∆, J) = −β∆
=1− (8.58)
1 + 2e δ
8.2. Mean field variational method 115

This way we can determine how the temperature of the λ-transition depends on x;
in fact, the critical temperature will be the one that makes a change sign, so we can
determine it from the condition a = 0:
 
zJ zJ zJ
a(Tc (∆)) = 1− = 0 ⇒ Tc =
2 δkB Tc kB δ
Since as we have just seen 1/δ = 1 − x, we have

Tc (x) = Tc (0)(1 − x) (8.59)

where Tc (0) = zJ/kB . The other transition (from the continuous λ to the discontinu-
ous one) will occur when the quartic term b changes sign, and so we can determine the
critical value of xc at which it occurs from the condition b = 0. Hence, the tricritical
point is the one that satisfies the conditions
( (
a(Tt , ∆t ) = 0 δt = kBzJTt

b(Tt , ∆t ) = 0 δt = 3

and the value of the concentration of He3 results


1 2
x(Tt , ∆t ) = 1 − = (8.60)
δt 3
which is in astonishingly good agreement with the experimental result of xt ∼ 0.67.
Exercise 5: Expansion of G for small values of m
Expand the free-energy per site
G z
= Jm2 − ∆ − kB T ln A
N 2
where A = 1 + 2e−β∆ cosh(βzJm) for small values of m.
Solution. Let us define

x ≡ βzJm, B ≡ 2e−β∆
x2 x4
Since cosh x ' 1 + 2 + 24 , we can expans A as

x2 x4
 
A = 1 + B cosh x ' 1 + B 1 + +
2 24

Hence,

Bx2 Bx4
 
ln A = ln 1 + B + +
2 24
  
B 2 B 4
' ln (1 + B) 1 + x + x
2(1 + B) 24(1 + B)
= ln (1 + B) + ln (1 + t)

where
B B
t≡ x2 + x4
2(1 + B) 24(1 + B)
Let us first consider the term
B 2e−β∆ 2 1
= −β∆
= β∆
=
1+B 1 + 2e 2+e δ
116 Chapter 8. Mean field theories of phase transitions and variational mean field

t2
Since ln (1 + t) = t − 2, we have

x2
 
1 1 1 6
⇒ ln A = ln (1 + B) + + − 2 x4 − x
2δ 24δ 4δ 24δ 2

If we remember that x ≡ βzJm, we obtain

βz 2 J 2
 
ln A z 2 z
− + Jm − ∆ 'a0 (T, ∆) + J− m2
β 2 2 2δ
 
1 1 1 5 6 6 6
+ − β 3 z 4 J 4 m4 + β z J m
8δ 24δ 24δ 2

Hence, the free energy G for small values of m is

G(T, ∆, J, m) = a0 (T, ∆) + a(T, ∆)m2 + b(T, ∆)m4 + c(T, ∆)m6

where
 
zJ βzJ
a(T, ∆) = 1−
2 δ
3 4 4 β3z4J 4
   
β z J 1 1 δ
b(T, ∆) = − = 1−
8δ δ 3 8δ 2 3
5
β z J6 6
c(T, ∆) = >0
24δ 2

8.2.4 Mean field again


Another way to introduce the variational approach and the mean field approxima-
tion often discussed starts from the general expression of the variational free energy

Fvar = hHiρT R + kB T hln ρT R iρT R (8.62)

We have to choose a family of distribution. If one assumes that the family of trial
distribution is of the Gibbs-Boltzmann form
e−βHT R
ρT R = (8.63)
ZT R
with
X
ZT R = e−βFT R = e−βHT R ({Φi }) (8.64)
{Φi }

then, since
ln ρT R = −βHT R − ln ZT R
we have
−HT R
 
kB T hln ρT R iρT R = kB T + kB T h− ln ZT R i
kB T | {z }
βFT R

By rearranging,
kB T hln ρT R iρT R = h−HT R i + FT R
Hence, the variational free energy becomes

Fvar = hHiρT R − hHT R iρT R + FT R = hH − HT R iρT R + FT R (8.65)


8.2. Mean field variational method 117

Clearly, F ≤ Fvar and one has to look for the minima of Fvar by varying ρT R . Within
this approach, the mean field approximation is still given by
N
(1)
Y
ρM F
T R (Φ1 , . . . , ΦN ) = ρT R (Φi )
i=1

that in this case becomes


N
Y (1) 1 P
ρT R (Φi ) = e−β i bi Φi (8.66)
i=1
ZTMRF

and X P
ZT R = e−β i bi Φi (8.67)
{Φ}

where bi are the variational parameters. The Hamiltonian is


X
HT R = − bi Φi (8.68)
i

If we consider again the Ising model (remind that it means Φi → Si = ±1), the
Hamiltonian is X X
H = −J Si Sj − H Si
hiji i

Hence, Eq.(8.65) becomes

Fvar = hH − HT R iρT R + FT R
*  !+
X X X
= FT R + −J Si Sj − H Si  − − bi Si
hiji i i
ρT R
* +
X X
= FT R + −J Si Sj + (bi − H)Si
hiji i
ρT R
X X
= FT R − J hSi Sj iρT R + (bi − H) hSi iρT R
hiji i

QN
Since ρT R = i=1 ρi , we have

hSi Sj iρT R = hSi iρT R hSj iρT R

Therefore,
X X
Fvar = FT R − J hSi iρT R hSj iρT R + (bi − H) hSi iρT R
hiji i

Let us minimize the last equation, we consider the condition:


∂Fvar
= 0, ∀i
∂bi
which gives  
∂Fvar X ∂ hSi i
0= = −J hSi iρT R + bi − H 
∂bi ∂bi
j∈ n.n. i
118 Chapter 8. Mean field theories of phase transitions and variational mean field

The variational parameters are equal to


X
bi = J hSj iρT R + H
j∈ n.n. i

Let us calculate the average of the spin hSi iρT R :

Si eβSk bk
Q P
1 X β
P
k Sk bk
k S
hSi iρT R = Si e = Q Pk
βSk bk
ZT R k Sk e
{S}
βSi bi
P
S =±1 Si e sinh(βbi )
= Pi βS b
= = tanh(βbi )
Si =±1 e cosh(βbi )
i i

Finally, the variational parameters are


X
bi = J tanh(βbj ) + H (8.69)
j∈ n.n. i

Remark. The main step to understand is how to derive Fvar from a ρT R . This is nice
to see a variation with respect to the real hamiltonian. Consider a bunch of data,
for instance a million of configuration, which is the distribution of the configuration?
Usually, we build up a model with a distribution that depends on parameters and
what we want to do is statistical inference. Starting from the model and the data we
have to obtain the real distribution.
Exercise 6
Consider again the antiferromagnetic Ising model
X X X
H[{S}] = −J S(~rA )S(~rB ) − H S(~rA ) + H S(~rB )
h~rA~rB i ~rA ~rB

whith J > 0 and H > 0. Remember that


• ~rA denotes the site on the A sublattice.

• ~rB denotes the site on the B sublattice.


Let us find again the mean-field solution, but now using the variational ansatz

F ≤ Fvar = hHiρT R − hHT R iρT R + FT R = hH − HT R iρT R + FT R

Remark. Since the problem can be splitted in two sublattices, it is convenient


to use X X
HT R = −HA S(rA ) − HB S(rB )
rA rB

In particular:
• show that Fvar has the following expression:

Fvar =FT R (βHA , βHB ) − 4N J hSA iρT R hSB iρT R


1   1  
− N H hSA iρT R − hSB iρT R + N HA hSA iρT R + HB hSB iρT R
2 2
where

hSA iρT R ≡ mA + n
hSB iρT R ≡ mB − n
8.2. Mean field variational method 119

with m = mA + mB , and

mA = tanh(βH − 4βJmB )
mB = tanh(βH − 4βJmA )

• Expand the free energy Fvar in powers of m of the form

Fvar = A + Bm2 + cm4 + O(m6 )

and find the explicit expression of A, B and C as a function of T, H and


n.
120 Chapter 8. Mean field theories of phase transitions and variational mean field
Chapter 9

Non ideal fluids: Mean field


theory, Van der Walls, Virial
expansion and Cluster expansion
Lecture 14.
9.1 Mean field theory for fluids Friday 29th
November, 2019.
Ideal gases are exceedingly idealised systems and are not suited to describe the Compiled:
behaviour of real systems: they always obey the same state equation and never Wednesday 5th
undergo phase transitions (for example they never condense). We must therefore February, 2020.
step a little further: using the "philosophy" of mean field theories we can make the
description of fluids a little bit more realistic. As we will see this will also lead to
the derivation of the Van der Waals equation, which better describes the behaviour
of real fluids (even if, as we will shortly see, it still has some problems).
In general, in a real gas all the atoms or molecules interact through a certain po-
tential Φ({~ri }) that will depend on the positions of all the particles. For a fluid system
of N particles with position vectors {~ri }i=1,...,N , the configurational contribution to
the (grancanonical) partition function will therefore be:
N
Z Y PN
QN (T ) = d~ri e−β (Φ({~ri })+ i=1 ψext (~ri ))
(9.1)
V i=1

where ψext is a one body external potential, but we do not consider it because is not
the aim of our problem. In general,
X X
Φ({~ri }) = U2 (~ri ,~rj ) + U3 (~ri ,~rj ,~rµ ) + . . .
i6=j i6=j6=µ

(where Un can be a generic n-body interaction potential). For simplicity, we do not


consider U3 , that is the three body interaction. Let us suppose

U2 (~ri ,~rj ) → U2 (|~ri − ~rj |)

Therefore,
N
Z Y P
QN (T ) = d~ri e−β i6=j U2 (|~ri −~rj |)
V i=1

Now, we replace all this story with just a field, it is a sort of average of the interactions.
Doing the mean field assumption for U2 , we obtain
X X
U2 (|~ri − ~rj |) → ΦM F (~ri )
i,j>1 i

121
Chapter 9. Non ideal fluids: Mean field theory, Van der Walls, Virial expansion and
122 Cluster expansion

Generally ψext does not pose great problems while it is Φ that makes QN impos-
sible to compute exactly, forcing us to resort to approximations. In the framework of
mean field theories, we substitute the interaction potential Φ with an effective single-
particle potential that acts on every particle in the same way. Hence, the mean field
approximation consists in substituting the multi-body interaction potential Φ({~ri })
with an effective one body potential Φ(~r) withing which all the particles move:

X
Φ({~ri }) = ΦM F (~ri ) (9.2)
i

As said, for simplicity consider ψext = 0, hence mean field theories allow us to compute
QN as
Z N
MF D −βΦM F (~r)
QN (T ) ' d ~r e (9.3)
V

Remark. The integral depends on the form of ΦM F (~r). Of course, every particular
mean field theory will provide a different form of ΦM F (~r) which will lead to different
results.
If one assumes spatial isotropy, what it is important is not anymore the vector but
only the distance; hence, it is important just the integral over the modulus:

ΦM F (~r) = ΦM F (|~r|) = ΦM F (r)

9.2 Van der Waals equation


The Van der Waals equation can be obtained considering the atoms of a gas as
hard spheres. In this case, in fact, the mean field has the form:
(
∞ r < r0 repulsion
ΦM F (r) = (9.4)
u < 0 r > r0 attraction

as plotted in Figure 9.1.

ΦM F

0 r0

r
u

Figure 9.1: Plot of the potential ΦM F (r).

The partition function becomes


h iN
QM
N
F
(T ) = Vex e−∞
+ (V − V ex )e−βu
9.2. Van der Waals equation 123

where Vex ' r03 is the volume not accessible by the particle. Finally, the result is
h iN
QM
N
F
(T ) = (V − V ex )e −βu
(9.5)

The free energy is FN = −kB T ln QN , hence

FNM F (T ) = −N kB T [ln (V − Vex ) − βu] (9.6)

Let us calculate the pressure

∂FNM F
 
N kB T ∂u
PNM F =− = −N (9.7)
∂V T V − Vex ∂V T

Remark. In general, the deep u can go up and down depending on the V : u = u(V ).
This is because u is the attractive well of the mean field potential and, for r ≥ r0
must be proportional to the fluid density

u ∼ −N/V

where the minus sign means attraction. On the other hand, also Vex , the volume not
accessible, must be proportional to N .
Hence, we have
N
u = −a , Vex = bN
V
where b is the volume of a single particle. Inserting the last term in (9.7), we obtain
the Van der Walls equation of state:
 2
N kB T N
PNM F (V, T ) = −a (9.8)
V − bN V

9.2.1 Critical point of Van der Waals equation of state


Let us define the specific volume as
1 V
v≡ =
ρ N
Hence, the equation of state becomes
KB T a
P = − 2 (9.9)
v−b v
The behaviour of the Van der Waals isotherms is shown in Figure 9.2a. As we can
see this changes with the temperature and resembles that of real isotherms; however,
Van der Waals isotherms are always analytic and have a non physical behaviour in
certain regions of (v, P ) plane, called spinodal curves, if T < Tc : for some values of
v we have ∂P/∂v > 0 which is physically impossible. This is a consequence of the
roughness of the approximation we have made, since it can be shown that it doesn’t
ensure that the equilibrium state of the system globally minimizes the Gibbs free
energy. As we will shortly see, however, this problem can be solved "by hand" with
Maxwell’s equal area rule, or Maxwell’s Construction. Overall, we have this effect
because it is a mean field, so the curve in Figure 9.2a it is replaced by the curve in
Figure 9.2b. Moreover, for T < Tc the equation P (v) = const has 3 distinct solutions.
For T > Tc only one solution ∈ R.
Let us now see how to determine the critical point of a system obeying Van der
Waals equation.
Chapter 9. Non ideal fluids: Mean field theory, Van der Walls, Virial expansion and
124 Cluster expansion

P f lex P
isotherms

Pc T > Tc

PX
T = Tc

T < Tc

v
vL vG v
(a) Van der Waals isotherms are represented
in red in (v, P ) diagram for different values of (b) Real isotherm in (v, P ) diagram for T <
T. Tc .

Figure 9.2

• First of all, from the representation of the isotherms we can see that the critical
point is a flex for the critical isotherm (i.e. the one with T = Tc ); in other
words, we can determine the critical point from the equations:
∂P ∂2P
= 0, =0
∂v ∂v 2
The second in particular means that there is a flex point. Let us pay attention
to it, indeed it is a standard way to find critical points. We obtain
a 8a
vc = 3b, Pc = , kB Tc =
27b2 27b

• Another way to find the critical point consists in noticing that at T = Tc , the
3 solutions coincide. In fact, we can note that the equation P (v) = P = const
is cubic in v. Let us rewrite the Van der Waals equation
v 2 kB T − a(v − b)
P =
v 2 (v − b)
as  
3 kB T a ab
v − b+ v2 + v − =0 (9.10)
P P P
For T > Tc this equation has one real solution and two imaginary ones, and
for T < Tc three distinct real solutions; when T = Tc the three solutions of
the equation coincide. This means that at the critical point T = Tc this last
equation Eq.(9.10) must be written in the form:

(v − vc )3 = 0 ⇒ v 3 − 3v 2 vc + 3vvc2 − vc3 = 0

Equating the coefficients with Eq.(9.10) we get:


ab a kB Tc
vc3 = , 3vc2 = , 3vc = b +
Pc Pc Pc
from which we have again:
a 8a
vc = 3b, Pc = , kB Tc = (9.11)
27b2 27b
We have found a very interesting result: in fact, if we can measure a and b
at high temperatures then we are able to determine the critical point of the
system.
9.2. Van der Waals equation 125

This model has also an interesting property, since it predicts that:

Pc vc 3
= ≈ 0.375
kB Tc 8

which is a universal number, independent of a and b and so of the particular


fluid considered. Experimentally this ratio is approximately 0.29 for Argon,
0.23 for water and 0.31 for He4 . Therefore, even if it is very rough, this model
leads to reasonable conclusions.

9.2.2 Law of corresponding states


The universal value of the ratio kPBc vTcc suggests a deeper correspondence between
different fluid systems. We can also rewrite Van der Waals equation (9.8) in a dimen-
sionless form, rescaling the thermodynamic quantities of the system. In particular,
defining:
P 27b2 v v T 27b
π≡ =P , ν≡ = , τ≡ = kB T (9.12)
Pc a vc 3b Tc 8a
Van der Waals equation becomes:
 
3
π + 2 (3ν − 1) = 8τ (9.13)
ν

We have found another very interesting result: when rescaled by their critical ther-
modynamic properties (by Pc , vc and Tc ), all fluids obey the same state equation.
This is the law of corresponding states: this is a form of universality. The law of
corresponding states applies everywhere on the phase diagram. It can even be shown
that this law is a consequence of dimensional analysis, and is more general than what
might seem: experimentally the law of corresponding states is well satisfied also by
fluids which do not obey Van der Waals equation.

9.2.3 Region of coexistence and Maxwell’s equal area rule


In real fluids, for T < Tc (τ < 1), there is a first order liquid-gas transition with
coexistence between vapor and liquid phase and non analiticity of the thermodynamic
potential. In particular, a real isotherm for T < Tc is the one in Figure 9.2b. How
this is described by the mean-field (i.e. Van der Walls) theory? The Van der Walls
isotherm for T < Tc is given by the graphic in Figure 9.3. The liquid phase goes
into a phase region that is not thermodinamycally stable. How can we remove the
non physical regions of the Van der Walls equation of state and describe coexistence?
The solution is the Maxwell (or equal area) construction!

Equal area or Maxwell construction


As we have previously anticipated, Maxwell’s equal area rule is a method to "man-
ually" remove the unphysical regions of Van der Waals isotherms.
From phase coexistence and general properties of phase transitions we know that
at the coexistence of two phases the chemical potentials and the pressures of the two
phases must be equal; furthermore, from thermodynamic potentials we also know
that the chemical potential is the Gibbs free energy per particle, namely G = µN ,
and in general we have also:

dG = −S dT + V dP + µ dN
Chapter 9. Non ideal fluids: Mean field theory, Van der Walls, Virial expansion and
126 Cluster expansion

smooth curve → no singularity

∂π non physical
∂v >0
region

vL vG v

Figure 9.3: Van der Waals isotherm for T < Tc .

Now, differentiating G = µN and subtracting this last equation we get:


S V
dµ = − dT + dP
N N
Therefore, since along an isotherm dT = 0, we will have:
V
dµ = dP
N
At the coexistence we have also dP coex = 0, hence

dµ = 0

is the physical condition. Recall that for Van der Wall dP 6= 0! Hence, the physical
coexistence condition implies
2 PL
1
Z Z
V an der W alls
0= dµ = µ(2) − µ(1) = dP V
1 N PG

Looking also at the Figure 9.4, we see that this means that the horizontal segment of
the isotherm must be drawn so that the two regions have the same area (from which
the name of the method). The integral can be partitioned in two parts
Z PL Z Px Z PL
0= V dP ⇒ V dP = − dP V
PG PG Px

Hence, the equal area condition gives the value of Px of the coexistence line!
9.2. Van der Waals equation 127

PG

Px

PL

vL vx vG v

Figure 9.4: Maxwell’s equal area rule: Van der Walls isotherm for T < Tc .

9.2.4 Critical exponents of Van der Walls equation


Let us now study the behaviour of systems obeying Van der Waals equations near
the critical point, computing one of the critical exponents.

β exponent
Let us recall that the equation of state is
 
3
π + 2 (3ν − 1) = 8τ
ν
where
P v T
π= , ν= , τ=
Pc vc Tc
Let us consider (
t ≡ τ − 1 = T −T c
Tc (9.14)
Φ = ν − 1 = v−vvc
c

indeed we want to analyze the deviation from the critical point. Close to the critical
point we have τ ∼ ν ∼ 1 and t ∼ Φ ∼ 0.
We now expand the equation of state with respect to t and Φ in the neighbourhood
of the critical point:
 
3
π+ (3(Φ + 1) − 1) = 8(t + 1)
(1 + Φ)2
By rearranging,
8(t + 1) 3
⇒π= −
2(Φ + 1) − 1 (1 + Φ)2
Expanding for Φ ∼ 0, since we have
α(α − 1) 2 α(α − 1)(α − 2) 3
(1 + Φ)α ' 1 + αΦ + Φ + Φ
2! 3!
we obtain
 
27 3
π ' (1 + t) 4 − 6Φ + 9Φ − Φ + · · · − (3 − 6Φ + 9Φ2 − 12Φ3 + · · · )
2
2
3 27
∼ 1 + 4t − 6Φt + 9Φ2 t − Φ3 − Φ3 t + · · ·
2 2
Chapter 9. Non ideal fluids: Mean field theory, Van der Walls, Virial expansion and
128 Cluster expansion

Finally, the result is

3
π ' 1 + 4t − 6tΦ − Φ3 + O(tΦ2 , Φ4 ) (9.15)
2
where the terms we have neglected are justified a posteriori (i.e. we will see that
Φ ∼ t1/2 ; we could have not neglected them, but the result of the computation
doesn’t change).
The strategy we want to apply is the following: since we want to determine how
Φ changes with t, we can determine the relation between the densities Φg and Φl
in the gaseous and liquid phase from Maxwell’s equal area rule. This way, from the
expression of π we can determine the pressures in the two phases and express them
in terms of Φg or Φl , and since πl = πg at the coexistence we can obtain from this
equation the behaviour of Φ in terms of t.
Hence, as said, in order to get the values of vG (P ) and vL (P ) at coexistence, we
use the Maxwell construction Z PL
v dP = 0
PG

and since v = (Φ + 1)vc and dP = Pc dπ we have:


Z gas
(Φ + 1)vc Pc dπ = 0 (9.16)
liq

Let us consider T < Tc fixed (it is true if and only if t < 0, but small), hence

π = π(v) = π(Φ)

From Eq.(9.15) we have


9
dπ ' −6t dΦ − Φ2 dΦ
2
Thus the result of the differential dP = Pc dπ is
 
9 2
dP = Pc −6t dΦ − Φ dΦ
2

Then, from equation (9.16) we have the integral


Φg  
9 2
Z
Φ −6t − Φ dΦ = 0 (9.17)
Φl 2

hence, " #
Φ2g Φ2l
 
−3Φ2g t+ 2
+ 3Φl t + =0
g g

Since t is small we can neglect it, and so:

Φ2g = Φ2l ⇒ Φg = ±Φl

Remembering that:
vg − vc vl − vc
Φg = , Φl =
vc vc
we see that the only acceptable solution is

Φg = −Φl (9.18)
9.3. Theories of weakly interacting fluids 129

(since the volume of a gas is larger than that of a liquid). Therefore, substituting Φg
and ρl = −ρg into the expression of π (Eq. (9.15)), we get

3
Φg → πg = 1 + 4t − 6tΦg − Φ3g
2
3 3
Φl → πl = 1 + 4t + 6tΦg + Φg
2
The two expression of π must be equal πg = πl since we are at the coexistence.
Solving with respect to Φg we get

3ρg (4t + ρ2g ) = 0

and excluding of course the case ρg = 0, in the end:


1/2
√ Tc − T

Φg = 2 −t ∼ (9.20)
Tc

which implies
1
β= (9.21)
2

9.3 Theories of weakly interacting fluids


If the gas is not ideal but made by weakly interacting particles, it is possible to
follow a perturbative approach to compute the partition function of such systems. Let
us consider N particles in region Ω of volume V . Particles interact through a generic
two-body potential that depends only on the relative distance between the particles:

U2 (~ri ,~rj ) = Φ(|~ri − ~rj |)

Hence,
1X
⇒ U ({~r}) = Φ(|~ri − ~rj |) (9.22)
2
i,j

Its Hamiltonian will be:


N
X ~p2i X
HΩ ({~r}) = + Φ(|~ri − ~rj |) (9.23)
2m
i=1 i,j>i

and its partition function in the canonical ensamble:

1
ZΩ (N, V, T ) = QN (V, T ) (9.24)
N !Λ3N
where Z Z Z
QN (V, T ) = d~r1 d~r2 · · · d~rN exp[−βU ({~r})] (9.25)
V V V

Remark. Of course for ideal gases U = 0, and so

VN
QN (V, T ) = V N ideal
→ ZN =
N !Λ3N

and the dependence on T is exclusively due to Λ = Λ(T ) (i.e. kinetic energy).


Chapter 9. Non ideal fluids: Mean field theory, Van der Walls, Virial expansion and
130 Cluster expansion

Now, suppose U 6= 0, but small! If we consider also the interaction terms we must
insert a correction χ in the configurational contribution to the partition function.
We can say that our QN (V, T ) it would be the on of the ideal version times a new
function
QN (V, T ) ' V N χ(N, V, T ) (9.26)
which (depending on the possible presence of attractive terms in the interaction po-
tential Φ can in general be also a function of the temperature T ; furthermore the
correction depends strongly on the gas density: if it is low the particles will not "per-
ceive" the presence of the other ones and the ideal gas approximation is a good one,
while for high densities the particles will be closer to each other and corrections to
QN are necessary.
Remark. If Φ is only repulsive, χ does not depend on T .
Let us note that inserting the correction χ, the free energy of the system will be:

FN = FNideal − kB T ln χ (9.27)

As previously said, the correction χ due to particle-particle interaction depends on


the particle density ρ of the fluid:
(
ρsmall ⇒ U = 0
(9.28)
ρhigh ⇒ U 6= 0 and not negligible

This suggests that the equation of state of a weakly interacting gas can be expanded
formally in powers of ρ. This is known as virial expansion.
In particular, for the ideal gas:
P

kB T
For a non ideal gas, let us add the other terms of the expansion
P
→ = ρ + B2 (T )ρ2 + B3 (T )ρ3 + · · · + O(ρn ) (9.29)
kB T
this is a virial expansion and it is one of the most used. The coefficient B are
called the virial coefficients. The Eq.(9.29) was first introduced as a formula to fit
experimental data. Indeed, making a fit, you will obtain the virial coefficients. This is
what physicist have done for years. Then, mapping the coefficient with the real world
experiments, we can find some macroscopical parameters. The formula (9.29) can be
also obtained rigorously from a perturbation approach to the partition function (as
we will see later). Now, the question is: which is the virial expansion of a Van der
Walls (i.e. mean field) gas?

9.3.1 Van der Walls and virial expansion


Let us see for example the virial expansion of the Van der Waals equation. From
Van der Waals equation we have:

P N aN 2
= −
kB T V − bN kB T V 2
Let us factorize the term (N/V ),
  −1  2
P N N a N
= 1−b −
kB T V V kB T V
9.3. Theories of weakly interacting fluids 131

Then, by expanding in power of (N/V ) and defining ρ = N/V , we have

N 2
 
     3  4
P N a N 2 N
⇒ = + b− + b + b3 + . . .
kB T V V kB T V V
 
a
=ρ+ b− ρ2 + b2 ρ3 + b3 ρ4 + . . .
kB T

We can thus immediately identify the first virial coefficient:


a
B2 (T )V dW = b − , B3V dW = b2
kB T

where in B2 (T )V dW the first term is repulsive on excluded volume and the second
one is the attraction term. We note also that B3V dW is always positive.

Boyle’s temperature TB
The Boyle’s temperature is the T at which the second coefficient is zero:

B2V dW (TB ) = 0

so we have removed the most important coefficient. The competiting effects of repul-
sion and attraction are cancelled out. In this case, the Van der Walls temperature
TBV dW is
a
TBV dW =
bkB
to be compared with the critical temperature TcV dW that is
8a
TcV dW =
27b3
We notice that TcV dW  TBV dW . It is clear that the Boyle’s temperature must be
much greater than the critical one.
Remark. Consider a polymer, the transition point called the θ point is when the
second coefficient is zero, as the case described above, but it is interesting in polymer
kind of system (lesson).

9.3.2 Cluster expansion technique for weakly interacting gases


We now obtain the formal virial expansion by starting from the microscopic system
and performing a perturbation expansion of the Boltzmann weights for small values
of U . Let us start from the partition function
Z Z Z Z P
QN = d~r1 · · · d~rN e−βU ({~r}) = d~r1 · · · d~rN e−β i,j>i Φij (9.30)
V V V V

where we have used the short notation

Φij ≡ Φ(|~ri − ~rj |)

The idea is to find a "small quantity" in terms of which we can expand QN ; this
quantity is the so called Mayer function.
Definition 8: Mayer function
The Mayer f-function is an auxiliary function that often appears in the series ex-
pansion of thermodynamic quantities related to classical many-particle systems.
Chapter 9. Non ideal fluids: Mean field theory, Van der Walls, Virial expansion and
132 Cluster expansion

It is defined as
f (|~r|) ≡ e−βΦ(|~r|) − 1 (9.31)
Remark. Note: if βΦ(r)  1, we have f (r)  1.

In fact, when the gas is ideal f (~r) = 0, and if the particles interact weakly Φ is
small, and so is f (~r). In particular, this expansion will work well for low densities
(namely |~ri − ~rj | is large and so Φ(|~ri − ~rj |) → 0) or high temperatures (namely
β → 0): in both cases, in fact, e−βΦ(|~ri −~rj |) → 1 and f (|~ri −~rj |) → 0. Using the short
notations Φij ≡ Φ(|~ri − ~rj |) and fij ≡ f (|~ri − ~rj |) we have
 
P P
⇒ e−β i j>i Φij =
Y Y
 (1 + fij )
i j>i

= (1 + f12 )(1 + f13 ) . . . (1 + f1N ) . . . (1 + f23 )(1 + f24 ) . . . (1 + f2N ) . . .


| {z } | {z }
i=1 i=2
= (1 + f12 + f13 + f12 f13 )(1 + f14 ) . . . (1 + f23 )
XX N X X X X 
X

fik fkl + O(f 3 )

=1+ fij + 
 
i j>i i=1
 l>k j>i (ij)6=(lk)
k≥i

where
fij ≡ e−βΦij − 1
Higher order terms contain products of 3, 4, . . . fij terms. For simplicity, let us con-
sider first only linear terms. Hence, the solution is given by considering only the
Lecture 15. linear term. This is the cluster expansion.
Wednesday 4th As said, this first approximation is reasonable if either
December, 2019.
Compiled: 1. ρ is small enough. It implies that |~ri − ~rj |  1 and hence Φij  1.
Wednesday 5th
February, 2020. 2. Sufficiently high T such that Φ(|~ri − ~rj |)/kB T  1. What is important it is
the ration between β and Φij .

In either cases we have exp(−βΦij ) → 1 and fij → 0. By keeping only linear terms,
the configurational contribution to the partition function will be
 
Z X XZ Z
N
QN (V, T ) = d~r1 . . . d~rN 1 + fij + . . .  = V + d~r1 · · · d~rN fij
V i,j>i i,j>i V V
XZ
= V N + V N −2 d~ri d~rj fij + . . .
i,j>i V

We are summing up over all configurations ij. Let us try to compute the double
integral, with the definition of a new variable ~r = ~ri − ~rj :
Z Z Z Z Z
d~ri d~rj fij (|~ri − ~rj |) = d~ri d~r f (~r) = V d~r f (|~r|) ≡ −2B2 V
V V translational V V V
symmetry

Hence,
1
Z
B2 ≡ − d~r f (|~r|) (9.32)
2 V
From this we see precisely how the virial coefficient, which as we have already stated
can be experimentally measured, is related to the microscopic properties of the in-
teraction between the particles, represented by the Mayer function f . It can also be
9.3. Theories of weakly interacting fluids 133

shown that all the virial coefficients can be expressed in terms of integrals of products
of Mayer functions: higher order coefficients involve the computation of increasingly
difficult integrals, which can however be visualized in terms of graphs.
What we have seen now is how the cluster expansion works in general. Let us
now apply it in order to find the virial expansion for real gases. From what we have
found, the configurational partition function of the system becomes:
X
QN (V, T ) = V N − V N −1 B2 (T ) 1 + ...
i,j>i

The remaining sum is equal toN (N − 1): in fact, for any of the N values that i can
assume, j can have N − 1 values. These are all the possible connections (bonds)
between pairs of particles (i, j) with j > i. Hence,

QN (V, T ) = V N − V N −1 B2 (T )N (N − 1) + . . . (9.33)

and, considering that N − 1 ≈ N for large N , the complete partition function of the
system will be:

VN N2
  
ZN (V, T ) = 1− B2 (T ) + . . . (9.34)
N !Λ3N V

We recognise in this expression that (1 − B2 N 2 /V + · · · ) is the correction χ to the


ideal gas partition function that we have mentioned earlier; therefore, the free energy
of the system will be:

N2
 
FN = FNideal − kB T ln 1 − B2 (T ) + . . . (9.35)
V

and its pressure:

N2
! !
N N
V B2 1− V B2 + V B2
 
∂FN N kB T N kB T
PN =− = 1+ 2 = 2
∂V T,N V 1 − NV B2 V 1 − NV B2

Expanding the denominator for N


V B2  1 ρ  1, one gets
 
N kB T N
PN ' 1 + B2 + . . . (9.36)
V V

here we see the correction to the ideal gas.


Remark. The equation (9.36) gives an important relation between experimentally
accessible observables as PN and microscopic quantities such as f (~r) (and hence
Φ(~r)) trough the estimate of B2 . Therefore, it is important computing B2 , because
one time we have this we have the expansion. Or if we wish, by doing the fit of data
at different temperature we obtain B2 from the experiment and we can see fij .
The expansion in Eq.(9.36) contains only low-order terms in the density N/V , so
strictly speaking it is valid only for low densities. To consider higher order terms in
the virial expansion we need to consider higher order products of the fij . However,
we can use a "trick" in order to extend its range; in fact, remembering that the
McLaurin expansion (1 − x)−1 = 1 + x + . . ., from the Eq.(9.36) we can write:

PV 1
≈ 1 + ρB2 + · · · '
N kB T 1 − B2 ρ
Chapter 9. Non ideal fluids: Mean field theory, Van der Walls, Virial expansion and
134 Cluster expansion

and now re-expand (1 − B2 ρ)−1 , so that we can express all the virial coefficients in
terms of the first one:
1
' 1 + B2 ρ + (B2 )2 ρ2 + (B2 )3 ρ3 + . . .
1 − B2 ρ

Hence,
P
= ρ + B2 ρ2 + (B2 )2 ρ3 + (B2 )3 ρ4 + . . .
kB T
Identifying the coefficients for each power we get, in the end:

B3 ≈ (B2 )2 , B4 ≈ (B2 )3 , ..., Bn ≈ (B2 )n−1

This is the approximation of higher order virial coefficients with powers of B2 .


Remark. One question at the exam can be: let us compute virial expansion of a gas
in a potential.

9.3.3 Computation of virial coefficients for some interaction poten-


tials Φ
Let us now see this method in action by explicitly computing some coefficients B2
for particular interaction potentials.

Hard sphere potential


The particles are interacting (it is not ideal!) and there is a size that is the range
of the potential. As a first trial, we use a hard sphere potential similar (see Figure
9.5) to the one we have seen for the derivation of the Van der Waals equation:
(
∞ r<σ
Φ(r) = (9.37)
0 r≥σ

(the difference with what we have seen in Van der Waals equation is that now the
potential is purely repulsive, and has no attractive component).

σ r

Figure 9.5: Plot of the hard sphere potential Φ(r).

In this case, (
−βΦ(r) −1 r < σ
f (~r) = e −1= (9.38)
0 r≥σ
9.3. Theories of weakly interacting fluids 135

Therefore, from the definition of B2 and shifting to spherical coordinates:


Z +∞ Z σ
1 1 2
Z h i
2 −βΦ(r)
B2 (T ) = − d~r f (|~r|) = − 4π dr r e − 1 = 2π dr r2 = πσ 3
2 V 2 0 0 3
Hence,
2
⇒ B2HS (T ) = πσ 3 (9.39)
3
this is the second virial coefficient for a hard sphere gas. As expected B2HS does not
depend on temperature (purely repulsive interaction). Finally, for hard spheres we
have:  
2 N
P V = N kB T 1 + πσ 3 (9.40)
3 V
Note that the excluded volume interaction (hard sphere term) increases the product
P V with respect to the ideal gas.

Square wall potential


We now use a slight refinement of the previous potential:

+∞
 |~r| < r0
Φ(~r) = −ε r0 < |~r| < r0 + δ (9.41)

0 |~r| > r0 + δ

This can be seen as a hard sphere potential where the spheres have an attractive shell
of thickness δ. We thus have:

−1
 |~r| < r0
βε
f (~r) = e − 1 r0 < |~r| < r0 + δ (9.42)

0 |~r| > r0 + δ

so that:
1 1
Z Z
B2 = − f (|~r|)d~r = − 4πr2 f (r)dr =
2 2
Z r0 Z r0 +δ   
2 βε 2
= −2π (−r )dr + e − 1 r dr =
0 r0
 3
eβε − 1 

r0 3 3 2  
= B2h.s. − π eβε − 1 (r0 + δ)3 − r03
 
= −2π − + (r0 + δ) − r0
3 3 3
where B2HS is the first virial coefficient of the hard sphere potential we have previously
seen. Now, if the temperature is sufficiently high, namely βε  1, we can approximate
eβε − 1 ≈ βε, so that:
" #
δ 3

HS 2 3
B2 = B2 − πβεr0 1+ −1 (9.43)
3 r0

For the sake of simplicity, defining:


δ 3
 
λ≡ 1+ −1
r0
we will have, in the end:
 
PV HS 2 πε 3
= 1 + B2 ρ = 1 + B2 − r λ ρ (9.44)
N kB T 3 kB T 0
so in this case B2 actually depends on the temperature.
Chapter 9. Non ideal fluids: Mean field theory, Van der Walls, Virial expansion and
136 Cluster expansion

Lennard-Jones potential
This potential is a quite realistic representation of the interatomic interactions. It
is defined as:   
σ 12  σ 6
Φ = 4ε − (9.45)
r r

which contains a long-range attractive term (the one proportional to 1/r6 , which
can be justified in terms of electric dipole fluctuations) and a short-range repulsive
one (proportional to 1/r12 , which comes from the overlap of the electron orbitals,
i.e.Pauli excluded principle). This potential is plotted in Figure 9.6. The minimum
is in rmin = 21/σ . We can play with the range of attraction by changing σ or by
changing the ε.

σ rmin

−ε

Figure 9.6: Plot of the Lennard-Jones potential Φ.

With this interaction potential, the first virial coefficient is:


Z ∞  h 12 6
i 
2 − k 4εT ( σr ) −( σr )
B2 (T ) = −2π r e B − 1 dr
0

which is not analytically computable. However, it can be simplified defining the


variables
r kB T
x= , τ=
σ ε
so that, integrating by parts f g = f g − g f where f 0 = x2 g = exp[−()], we obtain
R 0 R 0

2 3 4 ∞ 2 12
 
6
Z  
∗ − τ4 112 − 16
B2 (T ) = πσ x − e x x dx
3 τ 0 x12 x6
Z ∞  
12 6
 
− τ4 112 − 16
=A − e x x dx
0 x16 x4

Now, we can expand the exponential and integrate term by term; this gives an ex-
pression of B2 as a power series of 1/τ :

∞   2n+1
2n − 1

X 10 1 4
B2 (τ ) = −2A Γ (9.46)
4n! 4 τ
n=0

where Γ is the Euler function and A0 is a constant. Note that the attractive part of
the Lennard-Jones potential has introduced in B2 a dependence on the temperature.
9.3. Theories of weakly interacting fluids 137

9.3.4 Higher order terms in the cluster expansion


Let us consider again the formal expansion
 
Y Y X X
 (1 + fij ) = 1 + fij + fij fkl + . . .
i j>i i,j>i i
j>i
l>k
k≥i
(ij)6=(kl)

The problem with this expansion is that it groups terms quite different from one
another. Fro example the terms f12 f23 and f12 f34 . Indeed the first term correspond
to a diagram as in Figure 9.7a, while the second to two disconnected diagrams as in
Figure 9.7b.

2 2 4

1 3 1 3

(a) Diagram of f12 f23 . (b) Diagram of f12 f34 .

Figure 9.7

Another problem of the above expansion is that it does not recognize identical
clusters formed by different particles. For example the terms f12 f23 and f12 f14 con-
tribute in the same way to the partition function. It is then convenient to follow a
diagrammatic approach similar to the Feymann approach in the reciprocal space.
For the linear term fij the only diagram is given by Figure
2
9.8. As we have seen this has multeplicity

N (N − 1)
2
and the integral is of the form
Z Z 1
f12 d~r1 d~r2 = V f (~r) d~r = −2V B2
Figure 9.8

For the term fij fkl we can have the case as in Figure 9.9,
that has molteplicity

N (N − 1) (N − 1)(N − 3) 1
2 2 2 j l
and the integral is of the form
Z
f12 f34 d~r1 d~r2 d~r3 d~r4

i k
i.e. involving 4-particles
Figure 9.9
Z
f (|~r1 − ~r2 |)f (|~r3 − ~r4 |) d~r1 d~r2 d~r3 d~r4 =
Z 2
2
=V f (~r) d~r = 4V 2 B22
Chapter 9. Non ideal fluids: Mean field theory, Van der Walls, Virial expansion and
138 Cluster expansion

The next case if for instance as in Figure 9.10. This involves


3 particles. The multiplicity of this diagram is

N (N − 1)(N − 2)
×3 j=k
3!
The integral is of the form
Z Z 2
f12 f23 d~r1 d~r2 d~r3 ' V dr f (r) = i l
Z
= f (|~r1 − ~r2 |)f (|~r2 − ~r3 |) d~r1 d~r2 d~r3 = (9.47) Figure 9.10
Z 2
=V f (~r) d~r = 4V B22

Another interesting diagram is the one in Figure 9.11. Its


molteplicity is
N (N − 1)(N − 2)
3! j=k

The associated integral involves 3 particles and it is of the


form
Z
f12 f23 f31 d~r1 d~r2 d~r3 = i l
Z
= f (|~r1 − ~r2 |)f (|~r2 − ~r3 |)f (|~r3 − ~r1 |) d~r1 d~r2 d~r3 Figure 9.11
Z
= f (|~r1 − ~r2 |)f (|~r2 − ~r3 |)f (|~r3 − ~r1 |) d~r2 d~r21 d~r23
On the other hand ~r13 = ~r23 − ~r21 , which implies

f (|~r3 − ~r1 |) = f (|~r23 − ~r21 |)

Hence,
Z Z
f (|~r12 |)f (|~r23 |)f (|~r31 |) d~r21 d~r23 d~r2 = f (|~r12 |)f (|~r23 |)f (|~r23 − ~r21 |) d~r21 d~r23 d~r2

Let us call this integral


Z
f12 f23 f31 d~r1 d~r2 d~r3 ≡ 3!V B3 − 2B22 (9.48)


The configurational partition function with the terms in Eq.9.47 and Eq.9.48 becomes
N (N − 1) N (N − 1)(N − 2)(N − 3) N (N − 1)(N − 2) 2
QN (V, T ) =V N − V N B2 + V N 2
(4B22 ) + V N 4B2
V 8V 2V 2 
N (N − 1) N (N − 1)(N − 2)(N − 3) 2 N (N − 1)(N − 3)

=VN 1+ B2 + B2 + B3
V 2V 2 V2
(9.49)

Let us now face the problem in a slightly different ways. Let us remind that
X Z Y
QN (V, T ) = fkl d3N r (9.50)
diagrams kl

where the sum is over all possible diagrams, i.e. all possible ways in which ones can
draw edges between pairs of points (k, l). For each such diagrams I have to product
between all edge and then integrate over the configurational space (N points).
9.3. Theories of weakly interacting fluids 139

1 2

3 or or ...

1 diagram 2 diagram ...

Figure 9.12: Example of connected diagrams for i = 4 sites.

Let us now consider only connected diagrams for i sites. In other words given i
points (i particles) from a system of N points and I consider all the possible ways I
can connect these i points (an example isQ shown in Figure 9.12).
For each diagram we take the product kl fkl and then integrate over the position
of the i points (i particles). For a fixed diagram:

Z Y
fkl d~r1 . . . d~ri
kl∈ diagram

Example 29: Diagram for i = 4 sites


For example the diagram 1 in Figure 9.12 gives the contribution
Z
f12 f13 f34 d~r1 d~r2 d~r3 d~r4

The diagram 2 gives


Z
f12 f13 f23 f34 d~r1 d~r2 d~r3 d~r4

and so on.

Finally, we sum over all these connected diagrams of i points:

X Z Y
fkl d~r1 . . . d~ri
connected lk∈ diagram
diagrams

the results is what we call (i!V Bi ) and defines Bi . Let us analyze what happens for
different values of i points:

• case i = 1: clearly B1 = 1;

• case i = 2: just one edge, hence we have just one connected diagram. The
integral becomes:
Z
f12 d~r1 d~r2 = −2V B2

• case i = 3: the connected diagrams are shown in Figure 9.13.


Chapter 9. Non ideal fluids: Mean field theory, Van der Walls, Virial expansion and
140 Cluster expansion

2 2 2 2

+ + +

1 3 1 3 1 3 1 3

Figure 9.13: Connected diagrams for i = 3 points.

X Z Y
fkl d~r1 d~r2 d~r3 =
connected diagrams kl∈ diagram
of i = 3 points
Z Z Z
= f12 f23 d~r1 d~r2 d~r3 + f12 f13 d~r1 d~r2 d~r3 + f13 f23 d~r1 d~r2 d~r3
| {z }
2
3V ( f (~r)d~r)
R
Z
+ f12 f23 f13 d~r1 d~r2 d~r3
| {z }
3!V (B3 −2B22 )

Hence,
X Z Y
fkl d~r1 d~r2 d~r3 = 3V (−2B2 )2 + 6V (B3 − 2B2 )
connected diagrams kl∈ diagram
of i = 3 points

= 6V B3 = 3!V B3

Eventually, for the partition function we have to sum over all possible clusters.
One possible procedure is:

1. given the N points we can partition them into connected clusters. For all i
points we can make mi clusters of that size i.
X
imi = N
i

For each cluster of size i we have a term (i!V Bi ). If there are mi of them we
have a weight (i!V Bi )mi .

2. Now, we have to count in how many ways we can make the partition of N in a
set of {mi } clusters. Clearly if we permute the label of the N vertices we have
possible different clusters. In principle, this degenerancy is proportional to N !
On the other hand, if one changes the order of the labels within a cluster (in i!
ways) this does not change the cluster and since there are mi clusters of size i
we have to divide by (i!)mi .
Moreover, since there are mi clusters one can swap them (in mi ! ways). The
degenerancy is mi !(i!)
N!
mi . Therefore,

N!
(i!V Bi )mi
XY
QN (V, T ) = (9.51)
mi !(i!)mi
{mi } i
9.3. Theories of weakly interacting fluids 141

1 1

9 9
(a) Description (b) Description

Figure 9.14

Exercise 7: N = 9 points
Consider the N = 9 points in Figure 9.14a.

1. Partition these points into clusters, as in Figure 9.14b.


For this partition {mi } we have m4 = 1, m2 = 2, m1 = 1. Now, the cluster
of size 4 can be connected in a given different ways (4!V B4 )1 .

2. Compute the degenerancy of this case (More on Huang chapter 10).


Chapter 9. Non ideal fluids: Mean field theory, Van der Walls, Virial expansion and
142 Cluster expansion
Chapter 10

Landau theory of phase transition


for homogeneous systems

10.1 Introduction to Landau theory


Landau theory is a phenomenological mean field theory of phase transitions that
aims at describing the occurence of phase transitions in a unitary framework (no spa-
tial variation of the order parameter). Landau theory is based on some assumptions,
which we now introduce:
1. Existence of an unfirom order parameter η. Remember the definition of the
order parameter:
(
0 T ≥ T̄ (disordered phase)
η=
6= 0 T < T̄ (ordered phase)

Well known examples are (


η→m
η → ρL − ρG

2. There exists a function L called Landau free energy1 , which is an analytic


function of the coupling constants {Ki } of the system and of the order parameter
η:
L = L(η)

3. The form of L must satisfy the underlying symmetry of the system.

4. The equilibrium states of the system are the global minima of L with respect
to η.
We also assume that the thermodynamic properties of the system can be obtained
by differentiating L, just like we can do with thermodynamic potentials2 .
Note also that the general formulation of the Landau theory does not depend on
the dimensionality of the system (although we will see that once a system has been
chosen some details can depend on it).
Remark. Since L is analytic it can be formally expanded in power of η, for η ∼ 0.

L(η) ≈ a0 + a1 η + a2 η 2 + a3 η 3 + . . . (10.1)
1
To be more precise, L is the Landau free energy density; the "real" Landau free energy should
be L = V L.
2
Strictly speaking, Landau free energy is not really a thermodynamic potential: the correct
interpretation of L is that it is a coarse grained free energy (not the exact one).

143
144 Chapter 10. Landau theory of phase transition for homogeneous systems

10.2 Landau theory for the Ising model


To make things more clear, let us now consider the Ising model without any exter-
nal field and see how we can determine the form of L from the general assumptions
we have introduced. In this case η is a scalar (magnetization).

10.2.1 Costruction of L
First of all, since the equilibrium configurations of the system must be minima of
L:
∂L
= a1 + 2a2 η + 3a3 η 2 + · · · = 0
∂η
where we have chosen to stop the expansion at the three order. Now, since this
equation must hold for all T and for T > T̄ 3 we have η = 0, we see that a1 = 0.
Considering now the constraint on the symmetries of the system, in absence of
phase transitions for finite systems we have seen that the Ising model is invariant
under parity (Z2 symmetry), i.e. its Hamiltonian is simultaneously even in H and
{Si }:
H(H, {Si }) = H(−H, {−Si })
Thus, in absence of external fields (H = 0) the Hamiltonian of the Ising model is
even; this means that also L must be invariant under parity, namely an even function
of η:
L(−η) = L(η)
Therefore all the odd terms of the expansion are null:

a2k+1 = 0 ∀k ∈ N

Finally, since we have assumed that L is an analytic function of η then its expan-
sion cannot contain terms proportional to |η|.
In conclusion, the minimal expression for L(η) that describes the equilibrium phase
diagram of an Ising-like system is:

L(η) ' a0 (J, T ) + a2 (J, T )η 2 + a4 (J, T )η 4 + O(η 6 ) (10.2)

where the coefficients of the expansion a0 , a2 , a4 , . . . are functions of the physical


parameters, J and T . However, L can be further simplified and we can also explicitly
show its dependence on the temperature. In fact, first of all we can note that a0 is
the value of L in the paramagnetic state (when T > T̄ , η = 0):

L(η = 0) = a0

and so for simplicity we can set a0 = 0 (it’s just a constant shift in the energy, what
matters is the free-energy difference).
Moreover, in order to have η = η̄ 6= 0 < ∞ for T < T̄ (thermodynamic stability)
we should impose that the coefficient of the highest power of η is always positive. In
this case:
a4 (J, T ) > 0
Indeed if this condition is violated L reaches it s absolute minimum for η → ±∞,
which makes no sense physically! The Landau free energy results

L(η) ' a2 η 2 + a4 η 4 , with a4 > 0 (10.3)


3
For T > T̄ (critical point) we expect a paramagnetic phase.
10.2. Landau theory for the Ising model 145

Finally, fixing J and expanding the coefficients a2 and a4 as a function of the


reduced temperature t ≡ T −


(in T near T̄ ), we obtain

T − T̄ a b
a2 ∼ a02 + + ..., a4 ∼ + ...
T̄ 2 4
in the expansion of a4 we have neglected any explicit dependence on T − T̄ because
as we will see it will not dominate the behaviour of the thermodynamics near T̄ .
Moreover, by choosing a02 = 0 the sign of a2 is determined by the one of t. In
particular, at T = T̄ , one has a2 = 0.
We finally have that the form of the Landau free energy for the Ising model is
given by:
a 2 b 4
L= tη + η + O(η 6 ) (10.4)
2 4
Remark. Does not matter the coefficient in green in front, so in the next part of the
course we will change it. If it is written in this way we have always a > 0. We have
also b > 0.
Note that, in presence of an external magnetic field h, one should consider the
Legendre transform of L obtaining its Gibbs version:

a 2 b 4
LG = tη + η − hη (10.5)
2 4
we have inserted a field coupled with the order parameter.

10.2.2 Equilibrium phases


Let us now see what does the Landau theory for the Ising model predict. First of
all, in the absence of external fields we have that the equilibrium states are determined
by:
∂L
= 0 ⇒ atη + bη 3 = η(at + bη 2 ) = 0 (10.6)
∂η
Hence, the minima are
(
0 t > 0 (i.e. T > T̄ )
η̄ = q (10.7)
± −at
b t < 0 (i.e. T < T̄ )

and at T = T̄ the 3 solutions coincide!


Let us consider the two different cases:

• Case t > 0 (T > T̄ ): the only global minimum of L is the solution η̄ = 0. The
second derivative of L with respect to η is

∂2L
= at + 3bη 2
∂η 2

which results ≥ 0 for η̄ = 0 and in the case t > 0. It implies that η = η̄ is a


global minima, as in Figure 10.1a.
q
• Case t < 0 (T < T̄ ): there are 3 solutions, η̄ = 0 and η̄ = ± − at
b . Let us see
wheter they are minima or local maxima.

∂ 2 L

= at < 0 ⇒ η̄ = 0 local maxima (no equilibrium)
∂η 2 η̄=0
146 Chapter 10. Landau theory of phase transition for homogeneous systems

∂ 2 L
 
at
= at + 3b − = −2at
∂η 2 η̄=±√− at b
b
q
since t < 0, we have −2at > 0 and hence η̄ = ± − at b are two minima!

r !
at a2 t2 a2 t2 a2 t2
L η̄ = ± − =− + =− <0
b 2b 4b 4b

Hence, the two minima have the same valueare related by the group symmetry
Z2 (η̄ → −η̄).

T >T

(a) Landau free energy L for t > 0 with (b) Landau free energy L for t < 0 with
h = 0. h = 0.

Figure 10.1

10.3 Critical exponents in Landau’s theory


Let us therefore see what critical exponents does the Landau theory for the Ising
model predict. Let us define t ≡ T −


.

Exponent β

This is immediately determined from what we have just seen: in fact, η ∼ tβ for
h = 0, t → 0− . Since t < 0, the minima of L are
r
at 1
η̄ = ± − ⇒β=
b 2
as expected.

Exponent α
∂ L 2
The specific heat at zero field of the system is CH = −T ∂T 2 . In particular, we
have CH ∼ t for h = 0, |t| → 0. As we have seen:
−α

• if t > 0: L(η̄ = 0) = 0.
 q  2 2
• if t < 0: Lmin = L η̄ = ± − at
b = − a4bt .

Hence, (
0 t>0
Lmin = 2 2
− a4bt t<0
10.3. Critical exponents in Landau’s theory 147

Therefore:
∂2L ∂2 a2
 
2
cH = −T = −T − (T − T̄ )
∂T 2 ∂T 2 4bT̄ 2
We have
a2 a2
 
∂ 2
− (T − T̄ ) = − (T − T̄ )
∂T 4bT̄ 2 2bT̄ 2

∂2 a2 a2
 

= − (T − T̄ ) = −
∂T 2 ∂T 2bT̄ 2 2bT̄ 2
Hence, the specific heat at zero field results
(
0 T > T̄
cH = a2
2bT̄ 2
T T < T̄

a2
We have t → 0− if and only if T → T̄ − , which implies cH → 2bT̄
that is constant.
Hence, in both cases:
α=0

Exponent δ

Let us remind that h ∼ η δ at T = T̄ . Considering now also an external field, the


state equation of the system will be given by the differentiation of L:

∂L
= atη + bη 3 − h = 0
∂η

Hence, the condition of equilibrium is

h = atη + bη 3 (10.8)

This tells us that, for fixed h, the extreme points of L are given by the values of η
that satisfies Eq.(10.8) (see Figure 10.2).

Figure 10.2: Plot of the Landau free energy for t < 0 with an external field h > 0.

At the critical point T = T̄ (t = 0) we have h ∼ η 3 . Therefore:

δ=3
148 Chapter 10. Landau theory of phase transition for homogeneous systems

Exponent γ
Let us remind that χT ∼ t−γ for h = 0, |t| → 0. If we now differentiate the state
equation (10.8) with respect to h we get:

∂η ∂η
at + 3bη 2 =1
∂h ∂h
∂η
Since χ = ∂h , we have
1
χ=
at + 3bη 2
If we now set h = 0, then for:

• t > 0: we will have η̄ = 0 and thus χT = at .


1

1/2
• t < 0: we will have η̄ = ± − at
b and thus χT = − 2at
1
.

In both cases χT ∼ 1/t and thus:

γ = γ0 = 1

Summary
In summary, the Landau theory for the Ising model gives the following (mean
field) values of the critical exponents

1
β= , α = 0, δ = 3, γ=1 (10.9)
2
which, as we expected, are identical to those we have found within Weiss mean field
theory. Moreover, Landau theory does not depend on the system dimension d (as
expected since is a mean field theory) but only on its symmetries.
Remark. For a O(n) (vector) model the order parameter η becomes a vector field ~η
with n compnents and
a b  
LG (~η ) = t~η · ~η + (~η · ~η )2 − ~h · ~η + O (~η · ~η )3 (10.10)
2 4

10.4 First-order phase transitions in Landau theory


Lecture 16.
Friday 6th As we have seen, Landau theory is based on the assumption that the order pa-
December, 2019. rameter is small near the critical point, and we have seen in the example of the Ising
Compiled: model how it can describe a continuous phase transition (in fact, for t → 0 we have
Wednesday 5th η → 0). However, because of the symmetry properties of the Ising model we have
February, 2020. excluded any possible cubic term; what we now want to do is to consider a more
general form of L which includes also a cubic term in η (in the case in which the
symmetry is not violated), and see that this leads to the occurrence of a first-order
phase transition. In fact, we want to generalize to include multicritical points, or
phase transitions. Let us remember that in the Ising model we have phase transition
derived by symmetry breaking, while now we have another type of phase transitions.
We have seen that since the order parameter is null for T > T̄ the Landau free
energy cannot contain any linear term in η. Let us therefore consider the simplest
Landau free energy that depends on a particular field:

b
L(η, t, h) = atη 2 − wη 3 + η 4 − hη (10.11)
4
10.4. First-order phase transitions in Landau theory 149


where t ≡ T −T
2 and w is an additional parameter that we fix to be positive, w > 0; as
in the previous case, we must have b > 0 so that η has finite values in the equilibrium
configurations. In addiction,
(
a > 0 if T > T ∗
at = (T − T ∗ )
2 < 0 if T < T ∗

Remark. For w < 0 the results are the same, but in the η < 0 diagram.
The temperature T ∗ is the one at which we have the continuous transition if
w = 0, but as we will see it doesn’t have great significance now. The equilibrium
configurations of the system, will be given by:

∂LG
=0 ⇒ h = 2atη − 3wη 2 + bη 3
∂η

In absence of external fields (h = 0), the equilibrium states becomes

h=0 ⇒ η(2at − 3wη + bη 2 ) = 0

The solutions of this equation are

 disordered phase
(
η̄ = 0 
√ (10.12)
η¯± = 1
2b 3w ± 9w2 − 8abt ordered phases

Let us rewrite the ordered solutions as


r
1 p  2at
η̄± = 3w ± 9w2 − 8abt = c ± c2 − (10.13)
2b b
with
3w
c=
2b
However, these two last solutions are possible only if:

2at T − T∗ c2 b T ∗∗ − T ∗
η̄± ∈ R ⇐⇒ c2 − > 0 ⇐⇒ t = < ≡ t∗∗ ≡
b 2 2a 2
Hence, we have
c2 b
T ∗∗ = T ∗ + = T ∗ + 2t∗∗
a
so, since t∗∗ is positive, this will occur at temperatures higher than T ∗ . Let us consider
different cases:

• If t > t∗∗ (T > T ∗∗ ), then the system will be in the disordered phase and we
have η̄± ∈/ R. The only real solution is η̄ = 0 that is also the absolute minimum
of L. The plot is shown in Figure 10.3.
q
• If t ≤ t∗∗ (T ≤ T ∗∗ ), we have η̄± = c± c2 − at b ∈ R are both possible solutions.
One will be a local maximum and the other a local minimum.

– At T = T ∗∗ , we have η̄− = η̄+ (flex point), as shown in Figure 10.4.


– For Tt < T < T ∗∗ , a new minimum appears at η = η̄+ , but we will have
L(η̄+ ) > 0, so this is only a local minimum (since L(0) = 0): in this range
of temperatures the ordered phase is metastable. The plot is shown in
Figure 10.5.
150 Chapter 10. Landau theory of phase transition for homogeneous systems

T > T ∗∗

Figure 10.3: Landau free energy for t > t∗∗ (T > T ∗∗ ). The point η̄ = 0 is the absolute
minimum.

T = T ∗∗

η− = η+ η

Figure 10.4: Landau free energy for t ≤ t∗∗ (T = T ∗∗ ). The point η̄− = η̄+ is a flex one.

Tt < T < T ∗∗

metastable
phase

η− η+ η

Figure 10.5: Landau free energy for t ≤ t∗∗ (Tt < T ≤ T ∗∗ ). The point η̄+ is a local
minimum.

– If we further decrease the temperature T , we will reach a temperature


T = Tt for which L(η̄+ ) = 0 = L(0): at this point the ordered and
disordered phase coexist, so this is the temperature of a new transition!
The plot is shown in Figure 10.6. Tt is given by the coexistence condition

L(η̄+ ) = L(0)

that is the coexistence between the disordered and ordered phases. In fact,
in the plot of Figure 10.6 we see that there are two minima in the same
line, this is a first order transition.
– Finally for T ∗ < T < Tt , η̄+ becomes negative and so now η = η̄+ is the
global minimum of L: the ordered phase becomes stable and the disordered
10.4. First-order phase transitions in Landau theory 151

T = Tt

0 η− η+ η

Figure 10.6: Landau free energy for t ≤ t∗∗ (T = Tt ). The point η̄+ is a minimum. The
ordered and disordered phase coexist.

phase metastable, indeed now η = 0 is only a local minimum (see Figure


10.7).

Figure 10.7: Landau free energy for t ≤ t∗∗ (T ∗ < T < Tt ). The point η̄+ is the global
minimum.

– If now T < T ∗ , L develops a new minimum for η < 0, but it is only a local
minimum (the asymmetry introduced by −wη 3 ensures that η̄+ is always
the global minimum). This means that also for T < T ∗ the disordered
phase with η̄+ continues to be the stable one, and so no phase transition
occurs at T ∗ any more; this is what we meant when we said that T ∗ is not
a relevant temperature any more.

Therefore, we have seen that lowering the temperature of the system, the value
of η for which L has a global minimum changes discontinuously from η = 0 to
η̄+ : this is a first-order transition. All the results obtained are shown in Figure
10.8.

As we said, at T = Tt the system undergoes a first order transition. It is defined


by two conditions: it must be a minimum of L and such that the value of L in that
minimum is zero. Thus we can determine Tt as follows:
(
∂L
= 0 = η 2at − 3wη + bη 2 extreme condition

∂η
L(0) = L(η+ ) = 0 = η 2 (at − wη + b 2
4η ) coexistence condition
152 Chapter 10. Landau theory of phase transition for homogeneous systems

(a) First-order transition. (b) Same transition for lower values of the
temperature.

Figure 10.8: The notation in this plot is different from the one used previously. Here
T̄ ≡ T ∗ , T ∗ ≡ T ∗∗ and T ∗∗ ≡ Tt .

Therefore, for η 6= 0: (
2at − 3wη + bη 2 = 0

at − wη + 4b η 2 = 0
Solving with respect to η and t, we get
(
η̄t = + 2w
b >0
2w2
tt = ab
Since by the definition t = (T − T ∗ )/2, we have:
4w2
Tt = T ∗ + (10.14)
ab
Remark. Let us note that Tt > T ∗ .
Since at T = Tt there is a first order transition does the system display latent
heat?
a 2w 2
 
∂L 1 2
s= − = − aη̄t = −
∂T ηt 2 2 b
Hence, there is an entropy jump. The latent heat absorbed to go from the ordered
to the disordered phase is
 2
a 2w
q = −Tt s = Tt (10.15)
2 b

10.4.1 Phase stability and behaviour of χT


Finally, we can also determine the susceptibility of the system:
∂η
χT ≡
∂h
In the presence of an external field, let us derive the equation of state with respect
to h:  
∂ ∂LG ∂
2atη − 3wη 2 + bη 3 = h

=0 =
∂h ∂η ∂h
∂η
Hence, since χT ≡ ∂h ,
χ 2at − 6wη + 3bη 2 = 1


The result is
1
χT = (10.16)
2at − 6wη + 3bη 2
We now make use of equation (10.16) to compute the limit of stability of the phases
we have found.
10.5. Multicritical points in Landau theory 153

10.4.2 Computation of T ∗∗
As said, T = T ∗∗ is the value below which the ordered phase becomes a metastable
state (local minima). In particular, since for T = T ∗∗ the point η̄− = η̄+ is a flex
point, we have the condition
∂2L
=0
∂η 2
thus:
∂ ∂h
2atη − 6wη 2 + bη 3 = h = 0 ⇒ 2at − 6wη + 3bη 2 = = χ−1 = 0

∂η ∂η

Remember that at T = T ∗∗ the two solutions η̄± coincide, hence from Eq.(10.13) we
have
2at 3w
c2 − = 0 ⇒ η̄± = η2 =
b 2b
Inserting in the expression with χ−1 = 0, we have

χ−1 = 0 = 2at∗∗ − 6wη̄2 + 3bη̄22

Hence,
9w2 1
⇐⇒ t∗∗ = = (T ∗∗ − T ∗ ) (10.17)
8ab 2
Remind that for Tt < T < T ∗∗ the ordered phase η̄+ is metastable.

10.4.3 Computation of T ∗
The instability of the disordered phase η = 0 is when L presents a flex point at
η = 0. Therefore, from the condition

∂ 2 L

=0
∂η 2 η̄=0

we have
∂h
χ−1 = 2at − 6wη + 3bη 2 = =0 ⇒ 2at = 0 ⇒ t = 0
∂η
Hence, we have
⇒ T = T∗ (10.18)
thus no phase transition occurs at T ∗ any more. The plot of the Landau free energy
in the case T = T ∗ is shown in Figure 10.9.

10.5 Multicritical points in Landau theory


It is possible for a system to have more "disarranging parameters" than the only
temperature T ; let us call one such field ∆. In this case the phase diagram of the
system becomes richer, with coexistence and critical lines that intersect in points
called multicritical points; one of the most common examples of a multicritical point
is the tricritical point, which divides a first-order transition line from a second-order
one. An example of a system of the type we are considering is the Blume-Emery-
Griffiths model, which we have studied in Mean field theory for the Blume-Emery-
Griffiths model. In that case the additional "disarranging field" was the concentration
x of He3 , and the tricritical point is the one we called (xt , Tt ).
Such a phenomenology can be obtained within Landau theory also with terms dif-
ferent from a simple cubic one; in particular, we can have first order phase transitions
154 Chapter 10. Landau theory of phase transition for homogeneous systems

T = T∗

Figure 10.9: Landau free energy for t ≤ t∗∗ (T = T ∗ ). The point η̄+ is the global minimum,
while the point η̄ = 0 is a flex point.

even when the system is invariant under parity, like in the case of the Ising model.
In fact in that situation we required the coefficient of η 4 to be always positive, but if
this is not true then L will be:
a(t, ∆) 2 b(t, ∆) 4 c 6
L(T, ∆, η) = η + η + η − hη (10.19)
2 4 6
where a, b and c are functions of two parameteres (T, ∆) and c > 0 positive for
the stability of the system (otherwise, like in the case previously considered, the
minimization of L leads η to infinity); ∆ is the disordered field (in the BEG model
∆ was the % He3 atoms).
Remark. To allow the coefficient of η 4 to change sign, we need the η 6 term.
Let us study the phenomenology of ∆ (∆c is a critical value):

• If ∆ < ∆c : as T decreases, a(T, ∆) decreases and, at T = Tc (∆), becomes zero.


In this region b(T, ∆) > 0 and the system displays the standard (η 4 ) critical
point. At T = Tc we have:
(
a(Tc , ∆) = 0
T = Tc (∆) ⇒
b(Tc , ∆) > 0

If a changes sign and b is kept positive (which can be done varying the values
of T and ∆ in a way such that a goes to zero faster than b, depending of course
on their explicit expressions) then a critical transition occurs since in this case
η = 0 becomes a local maximum for L, and it develops two new global minima.
Therefore, the solution of the equation a(T, ∆) = 0 will give a line of critical
points in (T, ∆) plane.

• If ∆ > ∆c : as T decreases, b(T, ∆) becomes zero before a(T, ∆). At T = Tc we


have (
a(Tc , ∆) > 0
T = Tc (∆) ⇒
b(Tc , ∆) = 0
Hence, if b becomes negative while a is still positive (which again can be done
varying T and ∆ so that b vanishes faster than a) then something rather different
happens: in this case, one can show that as b approaches zero L develops two
new symmetric local minima at η̄± (similarly to the case analysed before, with
10.5. Multicritical points in Landau theory 155

the difference that now the situation is perfectly symmetric since L is even) and
they will become the new global minima as L(η̄± ) = 0, which happens when
b changes sign: this way the equilibrium value of the order parameter change
discontinuously from zero to a non-zero quantity so a first-order transition has
indeed happened.
In this case, we have
L = aη 2 + cη 6 , a>0
The equilibrium states are

∂L
= 2aη + 6cη 5 = 0
∂η

so, the solutions are (


η=0
η1,2,3,4

coexistence between 3 phases

Figure 10.10: Landau free energy for ∆ > ∆c . There is coexistence between three phases,
in fact there are 3 global minima.

• Case ∆ = ∆t : the tricritical point is given by the values of ∆ = ∆t and T = Tc


such that
a(∆t , Tt ) = b(∆t , Tt ) = 0
This means that when both a and b are null the system goes from exhibit-
ing a continuous critical transition to a discontinuous first-order one; in other
words, the tricritical point (Tc , ∆c ) can be determined from the solution of the
equations a(T, ∆) = 0 and b(T, ∆) = 0.
At the tricritical point the system is described by the following Landau free-
energy:
Lt = cη 6 − hη
The equation of state is

∂Lt
=0 ⇒ h = 6cη 5
∂η

The first-order transition with an even L are shown in Figure 10.11.


To conclude let us consider again a system with an Ising-like Landau free energy,
where c > 0 and a, b are in general functions of the reduced temperature t (and also
156 Chapter 10. Landau theory of phase transition for homogeneous systems

Figure 10.11: First-order transition with an even L.

of the other "disarranging" parameter ∆, which we now neglect). The Landau free
energy is again
a b c
L = η 2 + η 4 + η 6 − hη
2 4 6
We now want to show that we can understand how the phase diagram of the
system is in (a, b) space, i.e. that we can draw where the phase transition lines are
and so we are able to visually represents where the various phases of the system are
in (a, b) plane.
First of all, we can note that when a, b > 0 the only minimum of L is η̄ = 0, so
the system is in the paramagnetic phase. Furthermore if a < 0 and b > 0 the system
is in the magnetic phase, and a second order transition has occurred; therefore we
can surely say that the half-line a = 0, b > 0 is a second order transition line.
We must thus determine where the first order transition line lies in (a, b) space.
In order to do so, we first note that the extrema of L are given by:

∂L 2 4 2 −b ± b2 − 4ac
0= = η(a + bη + cη ) ⇒ η̄± =
∂η 2c
(and of course they exist only when the temperature is such that b2 − 4ac > 0) and
since:
∂2L 2
p
= ±η ± · 2 b2 − 4ac
∂η 2 |η±
we have that ±η̄+ are maxima while ±η̄− are minima. The first order transition
happens when L(±η̄± ) = L(0) = 0, so:
a 2 b c a b 2 c
η̄ + η̄ 4 + η̄ 6 = 0 ⇒ + η̄ + η̄ 4 = 0
2 + 4 + 6 + 2 4 + 6 +
Now, from the condition ∂L/∂η = 0 we can express η̄+
4 as a function of η̄ 2 , and we
+
get η̄+
4 = −(a + bη̄ 2 )/c. Substituting we get:
+

2 a
η̄+ = −4
b

and substituting again in η̄+ = (−b + b − 4ac)/(2c) in the end we get:
2 2
r
ac
b = −4
3
so the first order transition line is a parabola in (a, b) plane (in particular it will lie
in the fourth quadrant). In the end the situation is as in Figure 10.12. As we can see
the tricritical point of the system, being the point that divides the first-order from
the second-order transition line, is the origin (0, 0) of the parameter space.
10.6. Liquid crystals 157

Figure 10.12: Phase diagram of the system in (a, b) space.

10.6 Liquid crystals


We now proceed to study a particular physical system, liquid crystals, to which
we will apply Landau theory of phase transitions. As we will see the symmetries of
the system will allow the Landau free energy to include a cubic term in the order
parameter (which we will properly define), and so we will be able to describe the
first-order transition from an isotropic to a nematic phase (which we are now going
to introduce).

10.6.1 What are liquid crystals?


Liquid crystals phase (LC) can be seen as an intermediate phase between a liquid
and a solid: they are liquid like any other conventional fluid, but also have internal
orientetional order like solid crystals. This orientational order provides them par-
ticular anisotropic properties from an optical, electric and magnetic point of view.
The most common structural characteristics of the molecules that constitute liquid
crystals are the following:

• They have an elongated, anisotrpic shape.

• Their axes can be considered rigid with good approximation.

• They have strong electric dipoles or easily polarizable groups.

Furthermore, it seems that the groups located at the extremities of a molecule are
not relevant for the formation of phases.
The vast majority of the interesting phenomenology of liquid crystals concerns

the geometry and dynamics of the preferred axis of orientation n(~r), called director.
This is a ’two arrow vector’ that gives the local average alignment of the elementary

constituents. In this description the amplitude of n(~r) is irrelevant and one takes
↔ ↔
n(~r) such that is unitary (i.e. n(~r) = 1). Since there is no head-tail symmetry
↔ ↔ ↔ ↔
(apolar order), n = − n (i.e. + n(~r) and − n(~r) are physically equivalent).
There is a plethora of possible liquid crystal phases; the most common are:

• Nematic: this phase is characterized by a very strong long-ranged orientational


order: the main axes of the molecules tend to orientate along a preferred di-
rection (see Figure 10.14), determined by the director. There is no long-ranged
translational order of the molecular centers of mass, even if a short-ranged one
can exist.
158 Chapter 10. Landau theory of phase transition for homogeneous systems

Figure 10.13: Graphical representation of the nematic, smectic and cholesteric phases of
a liquid crystal.

From optical point of view the Nematic phase is birifrangent, i.e. they exhibit

two different refractive indexes: one parallel to the director n (called ordinary
refractive index) and one orthogonal to it (special refractive index). These
optical properties of the nematic phase are used to build devices like LCDs.

• Smectic: also in this phase the molecules are aligned along a preferred direc-
tion, but contrarily to the nematic one this phase has also a spatial periodic
order: the molecules are organised in layers. Furthermore, differently from ne-
matic phases, smectic liquid crystals have non-uniform density and are generally
more viscous.

• Cholesteric: It is similar to the nematic phase since it has a long-ranged



orientational order, but the direction of n changes regularly in space; the typical

configuration of a cholesteric liquid crystal has a director n(~r) that rotates
when ~r varies along a particular direction: for example, in a three-dimensional
reference frame the molecules are orientated along the y direction in xy plane,
but this direction roteates if z changes.
The structure of a cholesteric liquid crystal is characterised by the spatial dis-
tance along the torsion axis, called pitch, after which the director has rotated
by an angle of 2π. The pitch of the most common cholesteric liquid crystals is
of the order of several hundred nanometers, so comparable with the wavelength
of visible light; furthermore, it can also be very sensitive to changes in tempera-
ture, chemical composition, or external electromagnetic fields. Note also that a
nematic liquid crystal can be seen as a cholesteric one with infinite pitch; these
two phases in fact are not independent from each other, and there is no real
phase transition between them.

10.6.2 Definition of an order parameter for nematic liquid crystals


What we now want to do is to apply Landau theory to liquid crystals in order to
study the transition from an isotropic to a nematic phase (Figure 10.14); therefore,
we must define an order parameter for such a system. This is absolutely not trivial,
and there are two ways to do it a microscopic and a macroscopic one. We will use a
macroscopic approach.
10.6. Liquid crystals 159


n

(b) Nematic phase.


(a) Isotropic phase.

Figure 10.14

Macroscopic approach
From a macroscopic point of view we have already stated that an important dif-
ference between the disordered and nematic phases consists in the response functions
when the liquid crystal is subjected to magnetic or electrical fields. Hence, a macro-
scopic definition of an order parameter for LC phase is based on the system response
when subject to fields. For instance, supposing that we have a liquid crystal sub-
~ the diamagnetic response of the system will be
ject to an external magnetic field H,
measurable in terms of its magnetization M,~ and in particular:

~ = χ̄
M ¯H ~ (10.20)

where χ̄
¯ is the response function matrix, namely the magnetic susceptibility of the
system. In components we have:

Mα = χαβ Hβ (10.21)

~ is static, then χ is symmetric, i.e.


where the inexes α, β stands for x, y, z. If H

χαβ = χβα

In the isotropic phase χ will also be diagonal, namely

χαβ = χδαβ

while in the nematic phase For a LC in the neumatic one has


 
χ⊥ 0 0
χαβ =  0 χ⊥ 0  (10.22)
0 0 χk

where, as before, we have supposed that the director n is parallel to the z direction.
Therefore we could build an order parameter in terms of the susceptibility χ, and
this parameter will necessarily have a tensorial nature (since χ itself is in general a
tensor), so it will not be a simple scalar like in the previous case. Since we want our
order parameter to vanish in the disordered phase, we can define it "removing" from
χ its isotropic component. In other words, in components we can define:
 
1
Qαβ = A χαβ − δαβ Tr χ̄ ¯ (10.23)
3
160 Chapter 10. Landau theory of phase transition for homogeneous systems

where A is a constant. In this way Q is a good tensorial order parameter. In par-


ticular, the order parameter is a second rank traceless tensor. Let us note that its
definition is completely general, and in fact it is useful also to describe other kinds of
phases, not only the uniaxial nematic one.
It is possible to show that Qαβ can be written in terms of the local average

orientiational order of the elementary constituents, n(~r) and the degree of local order
given by a scalar S(~r). Hence, we can define
 
1
Qαβ (~r) = S(~r) nα (~r)nβ (~r) − δαβ (10.24)
3

The advantage of this definition of the order parameter (which is the one we will use
in the following) is that it also takes into account the degree of orientation and the
mean direction. By definition Q is symmetric and traceless4 , so in general way we
can write it as:  
q1 q2 q3
Q =  q2 q4 q5  (10.25)
q3 q5 −q1 − q4

10.6.3 Landau-de Gennes theory for nematic liquid crystals


Since we now have a proper order parameter, we can formulate the Landau theory
for the phase transitions of nematic liquid crystals (also called Landau-de Gennes
theory). In particular we want to study the transition between the isotropic and
nematic phase, and we call Tn−i the temperature at which it occurs.
As we have already stated, the Landau free energy L must be consistent with the
symmetries of the system, so in this case it must be invariant under rotations. Now,
since Q transforms as a tensor under rotations and L must be a scalar, it will contain
¯ P ; to the fourth order we will have (the linear term is absent
terms of the form Tr Q̄
because by definition Q is traceless, i.e. Tr(Q) = 0):
 
1 ¯ 2 1 ¯ 3 1 ¯ 2
2
¯ 4
L = L0 + A(T ) Tr Q̄ + B(T ) Tr Q̄ + C(T ) Tr Q̄ + Tr Q̄
2 3 4

In reality this expression, and in particular the quartic term, can be simplified: in
fact it is a property (which we will not prove) of any n × n symmetric matrix that
¯ s with s > n can be expressed as a polynomial of Tr Q̄
Tr Q̄ ¯ p with p < n, so in our
case any Tr Q̄¯ s with s ≥ 4 can be expressed in terms of Tr Q̄ ¯ 2 and Tr Q̄
¯ 3 (we are
¯
automatically neglecting Tr Q̄ since in our case it vanishes, but in general it must be
considered). Therefore, we can write the Landau free energy as:

1 ¯ 2 + 1 B(T ) Tr Q̄
¯ 3 + 1 C(T ) Tr Q̄

¯2 2

L = L0 + A(T ) Tr Q̄ (10.26)
2 3 4
or, in components:
1 1 1
L = L0 + A(T )Qαβ Qβα + B(T )Qαβ Qβγ Qγα + C(T )(Qαβ Qβα )2 (10.27)
2 3 4
Remark. Since each 3 × 3 matrix satisfies the relation

¯ 4 = 1 Tr Q̄¯2 2
 
Tr Q̄
2
¯ 4.
the term proportional to C(T ) can be written as 21 C(T ) Tr Q̄
4
A matrix whose trace is zero is said to be traceless.
10.6. Liquid crystals 161

Let us note that since our order parameter is a tensor its invariance under rotations
does not exclude the possible existence of terms with odd powers of Q in L, in
particular the cubic one.
For the most general case of a biaxial nematic phase Q̄ ¯ can be diagonalized giving
2 
3S 0 0
Qαβ =  0 − 31 (S + η) 0  (10.28)
1
0 0 − 3 (S − η)

where we remind that S is the degree of local order. If η = 0 we have the standard
uniaxial nematic phase and the order parameter becomes
2 
3S 0 0
Qαβ =  0 − 13 S 0  (10.29)
0 0 − 13 S

Now, from the expression of Q in the case of a uniaxial nematic liquid crystal
(Eq.(10.29)) we have:

¯ 2 = 2 S2, ¯ 3 = 2 S3, 4
 2
Tr Q̄ Tr Q̄ ¯2
Tr Q̄ = S4
3 9 9
Hence, for uniaxial nematic liquid crystal the Landau free energy becomes

A(T ) 2 2 C(T ) 4
L = L0 + S + B(T )S 3 + S (10.30)
3 27 9
so that, supposing that B and C do not depend on the temperature (i.e. B(T ) = B
and C(T ) = C), while A(T ) ' A(T − T ∗ ), we have:

A 2 C
L = L0 + (T − T ∗ )S 2 + BS 3 + S 4 (10.31)
3 27 9
This Landau free energy has exactly the same form of the one we studied in first-order
phase transitions in Landau theory (Eq.(10.11)), with the substitutions:
2 2 4
a = A, w=− B, b= C
3 27 9
Applying the results we have already found (Eq.(10.14)), we will have that the first-
order transitions between the isotropic and nematic phases occurs at the temperature:

4w2 2B 2
Tn−i = T ∗ + = T∗ + (10.32)
ba 27AC
162 Chapter 10. Landau theory of phase transition for homogeneous systems
Chapter 11

Role of fluctuations in critical


phenomena: Ginzburg criterium,
Coarse-graining and
Ginzburg-Landau theory of phase
transitions

11.1 Importance of fluctuations: the Ginzburg criterium


As we have seen, the main assumption (and the most important problem) of mean
field theories is that the fluctuations of the order parameter are completely neglected
in the computation of the partition function Z; this approximation breaks down in
the neighbourhoods of critical points, where as we have seen in long range correlations
the correlation length becomes comparable with the size of the system:
T →Tc
ξ ∼ |T − Tc |−ν

What we would now like to do is to include these fluctuations in a mean field theo-
retical framework; this will lead to the so called Ginzburg-Landau theory.
Overall, mean field is not a very good approximation in proximity of the critical
point, and the question is: how bad is the mean field approximation in proximity of
it? As a first approach we can try to estimate how big is the error we make in mean
field theories neglecting the fluctuations of the order parameter near a critical point,
so that we can understand under which conditions mean field theories are actually a
good approximations.
To make things explicit, let us use the Ising model as a base for our considerations.
We have seen in Weiss mean field theory for the Ising model that the Weiss mean
field theory for the Ising model is based on the assumption that
MF
hSi Sj i −→ hSi i hSj i

i.e. that the spins are statistically independent; therefore, a possible estimate of the
error dor each pair of spin (Si , Sj ) made with this assumption can be:
|hSi Sj i − hSi i hSj i|
Eij = (11.1)
hSi i hSj i
The numerator of Eij is, by definition, the two-point connected correlation function:

Gc (i, j) ≡ hSi Sj i − hSi i hSj i = h(Si − hSi i)(Sj − hSj i)i (11.2)

163
Chapter 11. Role of fluctuations in critical phenomena: Ginzburg criterium,
164 Coarse-graining and Ginzburg-Landau theory of phase transitions

Assuming translational invariance, we have


Gc (i, j) → Gc (|~ri − ~rj |) → Gc (r)
In order to compute Gc (r) we cannot assume omogeneity since hSi i = hSj i = m. It
implies that Gc = 0 identically, and if we want to compute the error in the mean
field, is always zero. Therefore, in order to have non-null correlation functions we need
that the system exhibits some kind of inhomogeneity, not necessarily due to thermal
fluctuations. In fact, the connected correlation function Gc describes not only the
spatial extension of the fluctuations of the order parameter, but also, through the
linear response theory, the way in which m varies in space in response to an external
inhomogeneous magnetic field. Within a mean field theory this is the only way to
compute Gc ! Let us see this explicitly. We know that from the partition function of
the Ising model in an inhomogeneous external field H ~ i , i.e.:

Z[Hi ] = Tr{S} e−β (−J hiji Si Sj − i Hi Si )


 P P 
(11.3)
we have the definition of the thermal average
Tr{S} Si e−β (−J hiji Si Sj − i Hi Si )
 P P 
1 ∂Z ∂ ln Z ∂F
hSi i = = = β −1 =−
Z[Hi ] βZ ∂Hi ∂Hi ∂Hi
(11.4)
Similarly, one can show that
β −2 ∂ 2 Z
hSi Sj i = (11.5)
Z ∂Hi ∂Hj
and thus:
β −2 ∂ 2 Z
 −1  −1 
β ∂Z β ∂Z
Gc (i, j) = −
Z ∂Hi ∂Hj Z ∂Hi Z ∂Hj
2 2
(11.6)
∂ ln Z ∂ F ({Hi })
= β −2 =−
∂Hi ∂Hj ∂Hi ∂Hj
Therefore, the response in i due to a variation of H in j is
   
∂ ∂ −1 ∂ ln Z ∂ ∂F
hSi i = β = − = βGc (i, j)
∂Hj ∂Hj ∂Hi ∂Hj ∂Hi
so Gc (i, j) can indeed be seen as a response function. The generating functions are:
• Z[Hi ]: generating function of G(i, j).
• ln Z[Hi ] = −βF : generating function of Gc (i, j).
If we now call: X
M= hSi i
i
we will have
∂M X ∂ hSi i X
= =β Gc (i, j)
∂Hj ∂Hj
i i
If our system is invariant under translations and subject to a uniform field, then:
∂M X ∂M ∂Hj X
= =β Gc (i, j)
∂H ∂Hj ∂H
j ij

and since χT = ∂M/∂H we get:


X X
χT = β Gc (i, j) = β (hSi Sj i − hSi i hSj i) (11.7)
i,j i,j

which is a version of the fluctuation-dissipation theorem.


11.1. Importance of fluctuations: the Ginzburg criterium 165

11.1.1 Fluctuation-dissipation relation


The fluctuation–dissipation theorem (FDT), or fluctuation–dissipation relation
(FDR), is a powerful tool in statistical physics for predicting the behavior of sys-
tems that obey detailed balance. It is a general result of statistical thermodynamics
that quantifies the relation between the fluctuations in a system that obeys detailed
balance and the response of the system to applied perturbations.
More specifically, the fluctuation–dissipation theorem says that when there is a
process that dissipates energy, turning it into heat (e.g., friction), there is a reverse
process related to thermal fluctuations.
For instance, let us consider the Brownian motion: if an object is moving through
a fluid, it experiences drag (air resistance or fluid resistance). Drag dissipates kinetic
energy, turning it into heat. The corresponding fluctuation is Brownian motion. An
object in a fluid does not sit still, but rather moves around with a small and rapidly-
changing velocity, as molecules in the fluid bump into it. Brownian motion converts
heat energy into kinetic energy—the reverse of drag.
Now, let us consider the partition function with an homogeneous magnetic field
Hi = H, ∀i: P P
ZN = Tr{S} eβJ hiji Si Sj +H i Si

The magnetization is thus:

1 P P 1 ∂ZN
Si eβJ hiji Si Sj +H i Si =
X X
M= hSi i = Tr{S}
Z βZN ∂H
i i

Similarly, we have
X 1 ∂ 2 ZN
hSi Sj i =
β 2 ZN ∂H 2
ij

Recall that
1 1
F = − kB T ln Z
N N
Hence, the magnetic supsceptibility is:
   
∂m ∂ 1 ∂F ∂ 1 ∂ ln Z
χT = = − = kB T
∂H ∂H N ∂H ∂H N ∂H
"  #
1 ∂Z 2
 2
1 ∂2Z
 
1 ∂ ln Z 1
= kB T = kB T −
N ∂H 2 N Z ∂H 2 Z 2 ∂H
 !2 
1  2 X X
= β hSi Sj i − β 2 hSi i 

ij i
β X β X
= Gc (i, j) = Gc (~ri − ~rj )
N N
ij ij
X
=β Gc (x~i )
i

where we have defined ~xi ≡ ~ri − ~rj . Therefore:


Z
d −1
⇒ χT = (a kB T ) dd~r Gc (~r) (11.8)

this is the fluctuation dissipation relation.


Chapter 11. Role of fluctuations in critical phenomena: Ginzburg criterium,
166 Coarse-graining and Ginzburg-Landau theory of phase transitions

11.1.2 Computation of ET OT
Let us now try to understand when the error Eij done in mean field theories is
negligible. Now, in general terms if we formulate a mean field theory for a system we
will make the error Eij in the region where correlations are relevant, namely if |~r| is
the distance between two points of the system the error is made for |~r| ≤ ξ, with ξ
the correlation length. The total relative error is the ER (r) integrated over the region
of radius |~r| ≤ ξ, i.e. where correlations are not negligible:
dr
R
|~r|≤ξ Gc (r) d ~
ET OT = R dr
(11.9)
|~r|≤ξ hSi i hSj i d ~

where we have called d the dimensionality of our system. Supposing T < Tc , so that
the order parameter η is non null, i.e. η(r) = η 6= 0, we have:

hSi i hSj i ≈ η 2

is uniform in the region |~r| < ξ. Hence, our mean field theory will be a good approx-
imation if ET OT  1:
r) dD~r
R
|~r|≤ξ Gc (~
ET OT ∼ R
2 Dr
1 (11.10)
|~r|≤ξ η d ~

known as Ginzburg criterion. If it is satisfied, then the mean field theory is a valid
approximation.

11.1.3 Estimation of ET OT as t → 0−
In order to express Eq.(11.10) in a useful fashion, let us write it in terms of critical
exponents; using also the version we have just found of the fluctuation-dissipation
theorem (Eq.(11.8)) we get (supposing our system is continuous) that the numerator
of Eq.(11.10) can be approximated as
Z fluctuation
d dissipation
Gc (r) d r ∼ kB Tc χT ∼ t−γ
|~r|≤ξ

On the other hand, the denominator can be approximated as


Z
η 2 dd r ∼ ξ d |t|2β ∼ t2β−νd
|~r|≤ξ

Therefore, the Ginzburg criterion can be reformulated as:


t→0− −γ+νd−2β
ET OT ∼ t 1

and in the limit t → 0− this is possible only if −γ + νd − 2β > 0, i.e.:

γ + 2β
d> ≡ dc (11.11)
ν
This means that Ginzburg criterion allows us to determine the upper critical dimen-
sion dc of a system, namely the dimension above which mean field theories are good
approximations. Let us consider three different cases:
• Case d < dc : fluctuations are relevant and mean field is no a good approxima-
tion.

• Case d > dc : fluctuations are less important and mean field describes properly
the critical point.
11.2. Functional partition function and coarse graining 167

• Case d = dc : mean field critical exponents ok but strong correction to the


scaling expected. For a Ising-like systems (in the mean field) we have

1 2
β= , γ=1 ⇒ dc =
2 ν
In order to compute dc we need to compute ν withing the mean field approxi-
mation. Let us note that since it depends on the critical exponents, the upper
critical dimension dc ultimately depends on the universality class of the system
considered; furthermore, in order to actually be able to compute dc we must
generalize Landau theory to systems with spatial inhomogeneities so that we
are able to compute the critical exponent ν.
Remark. Remind that within the mean field theory the ν exponent it is not
defined. In fact, the ν exponent define the divergence of the correlation length,
but in the mean field we neglet correlation between fluctuations.
We have νM F = 1/2, hence the upper critical dimension for the mean field is
d > 4.

11.2 Functional partition function and coarse graining


Lecture 17.
Since in proximity of the critical point the correlation length ξ diverges, there is Wednesday 11th
no point in which we can see small scales. It is convenient to rewrite the microscopic December, 2019.
partition function as an effective partition function obtained by integrating out the Compiled:
degrees of freedom over regions of linear size l  a but still l  ξ. Indeed, a possible Wednesday 5th
way to overcome the limitations of mean field theories can be the following: we February, 2020.
could regard the profile of the order parameter m(~r) to be the "degree of freedom"
of our system and compute the partition function as a functional integral; in other
words from the microscopic configuration of our system we can obtain m(~r) with a
coarse graining procedure (we will immediately see what we mean by this) and then
determine Z as a trace over all the possible configurations of our system, i.e. over all
the possible forms of m(~r):
 

Z  
 X 
Z = Tr{S} e−βH[{S}] = D[m(~r)] e −βH[{S}]] 
(11.12)


 
 {S} 
compatible with the
profile m(~r)

where we traced over all the possible microscopic configurations {S} compatible with
the order parameter profile m(~r). Let us define the effective Hamiltonian Hef f :
X
e−βH[{S}] = e−βHef f (m(~r)) (11.13)
{S}
compatible with the
profile m(~r)

How, in pratice, can we perform the coarse graining procedure and obtain Hef f ?
We therefore must understand how to determine m(~r); the idea of coarse graining
procedures is the following: for a given microscopic configuration {S} we average the
order parameter m(~r) over sufficiently wide "blocks", i.e. portions of the system with
linear dimension l much greater than its microscopic scale, which we call a (in the
case of the Ising model, for example, a can be taken as the lattice constant), but still
microscopic and in particular much smaller than the correlation length ξ, so that the
Chapter 11. Role of fluctuations in critical phenomena: Ginzburg criterium,
168 Coarse-graining and Ginzburg-Landau theory of phase transitions

order parameter is uniform in every block. In other words, coarse graining a system
means dividing it into cells of linear dimension l, with l such that:
a  l  ξ(T ) < L (11.14)
(L being the linear dimension of our system) and averaging the order parameter m(~r).
This way we can obtain an expression for m(~r) (since l is anyway microscopic with
respect to the size of the system, so we can regard ~r as a continuous variable).
Remark. Hence, we partition the configurations according to the magnetization pro-
file. For example, if we have a configuration with half spin up and half down, we
obtain a profile with 1 and -1.

11.3 Coarse graining procedure for the Ising model


To make things more clear, let us see how the coarse graining procedure works for
the Ising model. For instance, see the two dimensional system represented in Figure
11.1, where we have many spins in each square in which the system is divided.

~r

Figure 11.1: Two dimensional system divided into cells with a huge number of spins.

If we call mi = hSi i the local magnetization at the i-th and d the dimensionality
of the system, once we have choosen the linear dimension l, every "block" will have
volume ld ; we replace what it is inside every block of the system centered in ~r, with
the coarse grained magnetization:
1 X
ml (~r) = Si (11.15)
Nl
i∈~r

where Nl = (l/a)d is the number of spins in each cell.


Remark. Since close to the critical point Tc , we have that the correlation length
diverges ξ  a, we can always choose l  ξ but still l  a such that the number Nl
is large enough. In this way, m(~r) can be made to be a regular function of ~r.
Moreover, since it has been built as an average, ml (~r) does not fluctuate much
on microscopic scales but varies smoothly in space. Of course, in general we need to
specify l in order to determine ml , but the coarse graining procedure we are applying
will be useful only if the final results are independent of l (at least in the spatial scales
considered).
Remark. In the reciprocal space (Fourier transform), the bound in Eq.(11.14) implies
the following cut off on the wave vector ~q:
|~q| > Λ = l−1
Hence, this theory cannot develop ultraviolet divergences!
11.3. Coarse graining procedure for the Ising model 169

We now must express the partition function in terms of ml (~r):


 
 
X  X  X
−βH({S}) 
Z= e = e−βHef f [m(~r)]


 
ml (~r)  {S}  ml (~r)
compatible with the
profile m(~r)

If m(~r) is regular, the sum converges to a functional integral:


Z
ZGL = D[ml (~r)]e−βHef f [m(~r)] (11.16)

so we must compute Hef f [m(~r)]. First let us notice that Eq.(11.13):


X
e−βH[{S}] = e−βHef f (m(~r))
{S}
compatible with the
profile m(~r)

is proportional to the probability that the system displays a configuration with a


profile ml (~r).

11.3.1 Computation of Hef f [m(~r)]


Since we now have a system made up of "blocks" this effective Hamiltonian will
be composed of two parts: a bulk component relative to the single blocks and an
interface component relative to the interaction between the blocks; let us consider
them individually.

• Bulk component: suppose that every block of volume ld is separate from the
rest of the system; inside every one of them the magnetization is uniform (since
the linear dimension of the blocks is much smaller than the correlation length
l  ξ), so we can use Landau theory for uniform systems. In the case of the
Ising model, it led to the free energy:


L = atm2 + m4
4
The total bulk energy is thus obtained summing over all the blocks:

bulk
X b̄
βHef f [m] = ātm2 (~r) + m4 (~r) (11.17)
2
~r

Hence, the probability that the sistem displays a configuration with a profile
ml (~r) is proportional to
bulk (m(~
ātm2 (~r)+ 2b̄ m4 (~r)
P cell (ml (~r)) ' e−βHef f r))
P
= e− ~
r (11.18)

• Interaction component: we now must take into account the fact that ad-
jacent blocks do interact. In particular, since as we have stated m does not
vary much on microscopic scales, the interaction between the blocks must be
such that strong variations of magnetization between neighbouring blocks is
energetically unfavourable. If we call µ
~ a vector of magnitude l (|~
µ| = l) that
Chapter 11. Role of fluctuations in critical phenomena: Ginzburg criterium,
170 Coarse-graining and Ginzburg-Landau theory of phase transitions

points from one block to a neighbouring one (see Figure 11.2), the most simple
analytic expression that we can guess for such a term can be a harmonic one:

X X k̄ 2  
int
− βHef f = ~ ) − m(~r)
m(~r + µ ~ ) − m(~r))4 (11.19)
+ O (m(~r + µ
2
~r µ
~

(the factor 1/2 multiplying k̄, just like the numeric factors multiplying ā and b̄,
have been inserted for future convenience). We can also think of this as a first
approximation of a general interaction between the blocks, namely as the first
terms of a Taylor expansion of the real interaction energy.

µ
~

Figure 11.2: Two dimensional system divided into block. The vector µ
~ points from one
block to a neighbouring one.

The total energy is thus obtained by summing the two terms. Now, since the
linear dimension of the blocks l is much smaller than the characteristic length L of
the system we can treat ~r as a continuous variable and thus substitute the sum over
~r with an integral:
X Ll 1 1 Z
−→ d dd~r
l
~r

(while the sum over µ


~ remains a sum, since for every ~r there is only a finite number
of nearest neighbours). Therefore:

• Bulk component:
Z  
bulk 1 b̄ 4
βHef f [m] = d ātm (~r) + m (~r) dd~r
2
(11.20)
l 2

Thus, if we now define for the sake of simplicity:

ā b̄
a≡ , b≡
ld ld

we will have: Z  
bulk b
βHef f [m] = atm2 (~r) + m4 (~r) dd~r
2

• Interaction component:

1 k̄
Z X
int
2
− βHef f = d ~ ) − m(~r) dd~r
m(~r + µ (11.21)
l 2
µ
~
11.3. Coarse graining procedure for the Ising model 171

Keeping in mind that |~ µ| = l, the interaction term can be rewritten in terms of


~
∇m:
~ ) − m(~r) 2 d
 
1 k̄ 1 k̄ X m(~r + µ
Z X Z
2 d
~ ) − m(~r) d ~r = d−2
m(~r + µ d ~r
lD 2 l 2 l
µ
~ µ
~

∂m 2 d
Z X 

= d−2 d ~r
2l ∂χµ
µ
~
a l
k ~ 2 d
Z 
L
 L 1
−→ ∇m d ~r
2
where we have called χµ the components of µ ~ and we have rescaled the elastic
constant by ld−2 :

k ≡ d−2
l
In this way the result is indipendent on l.
• Total energy:
Z  2 
2 b 4 k~
βHef f [m] = atm (~r) + m (~r) + ∇m(~r) dd~r (11.22)
2 2
Therefore, the (functional) partition function of the system will be as in Eq.(11.16):
R h 2
Z Z i
−βHef f [m(~r)] − ~
atm2 (~r)+ 2b m4 (~r)+ k2 (∇m(~
r)) dd~r
ZGL = D[m(~r)]e = D[m(~r)]e
(11.23)
Let us now make a couple of considerations:
• If m(~r) = m (uniform system) the energy of the system has the same structure
of the one used in Landau theory.
• The term proportional to (∇m(~~ r))2 is completely new but we could have in-
troduced it intuitively to a Landau-like mean field functional, since the intro-
duction of spatial variations in the order parameter has an energetic cost which
must depend on how it varies in space, i.e. it depends on the gradient of m.
This term can be also added directly to the Landau theory by simply assuming
that, whe m → m(~r) (one has to consider an additioned energy cost due to
small variation of m).
Why we take (∇m) ~ 2 and not something else? The choise is first of all a con-

sequence of the isotropy of the system (all directions are equivalent). Since
the system is isotropic and Z2 -invariant, we must use combinations of deriva-
tives that are invariant under rotations and parity, and, among all the possible
~
combinations, (∇m(~ r))2 is the simplest one.
Remark. Let us consider the cases in which m → m
~ (O(n) models); we have:
n X
X d
2
~ =
(∇m) ∂α mi ∂α mi
i=1 α=1
Higher order terms are:
n X
d X
d
2 X
∇2 m
~ = (∂α ∂α mi )(∂β ∂β mi )
i=1 α=1 β=1

and
n X
n X
d
2
X
2
~ (∇m)
m ~ = mi mi ∂α mj ∂α mj
i=1 j=1 α=1
In most cases it is sufficient to consider only the lowest order term.
Chapter 11. Role of fluctuations in critical phenomena: Ginzburg criterium,
172 Coarse-graining and Ginzburg-Landau theory of phase transitions

11.3.2 Magnetic non-homogeneous field


If there is also an external magnetic field
~h(~r) = β H(~
~ r)

we must add to the Hamiltonian the term (Legendre transform):


Z
− ~h(~r) · m(~
~ r) dd~r

so that the partition function becomes:


R h 2
Z i
− ~ r)) −~
atm2 (~r)+ 2b m4 (~r)+ k2 (∇m(~ ~ r) dd~r
h(~r)·m(~
ZGL = D[m(~r)]e (11.24)

which is a functional of m(~r) and ~h(~r). As usual, all the thermodynamics of the
system can be obtained from ZGL , provided that now we take functional derivatives
instead of usual derivatives. Moreover, the free energy functional is defined as
 
b k ~
Z
F [m, h] = dd~r atm2 (~r) + m4 (~r) + (∇m(~ r))2 − h(~r)m(~r) (11.25)
2 2

11.3.3 Functional derivatives


In the calculus of variations, a field of mathematical analysis, the functional deriva-
tive relates a change in a functional to a change in a function on which the functional
depends. Functionals are usually expressed in terms of an integral of functions, their
arguments, and their derivatives.
Definition 9: Functional derivative
Given a manifold M representing (continuous/smooth) functions h (with certain
boundary conditions etc.), and a functional G defined as G : M → R. The
functional derivative of G[h], denoted δG/δh, is defined by
G(h + εΦ) − G(h)
 
δG d
Z
(x)Φ(x) dx = lim = G[h + εΦ]
δh ε→0 ε dε ε=0

where Φ is an arbitrary function. In physics, it is common to use the Dirac


delta function δ(x − y) in place of a generic test function Φ(x), for yielding the
functional derivative at the point y:
δG[h(x)] G[h(x) + εδ(x − y)] − G[h(x)]
= lim
δh(y) ε→0 ε
or, in many dimensions:
δG[h(~r)] G[h(~r) + εδ(~r − ~r0 )] − G[h(~r)]
= lim (11.26)
δh(r~0 ) ε→0 ε

Properties
Like the derivative of a function, the functional derivative satisfies the following
properties, where F [h] and G[h] are functionals:
• Linearity:
δ(λF + µG)[h] δF [h] δG[h]
=λ +µ
δh(x) δh(x) δh(x)
where λ, µ are constants.
11.4. Saddle point approximation: Landau theory for non-homogeneous systems173

• Product rule:
δ(F G)[h] δF [h] δG[h]
= G[ρ] + F [h]
δh(x) δh(x) δh(x)

• Chain rules: if F is a functional and G another functional, then



δF [G[h]] δF [G] δG[h](x)
Z
= dx ·
δh(y) δG(x) G=G[h]
δh(y)

If G is an ordinary differentiable function (local functional) g, then this


reduces to:
δF [g(h)] δF [g(h)] dg(h)
=
δh(y) δg[h(y)] dh(y)

Let us consider some examples.


Example 30: Functional derivative of a function
A function can be written in the form of an integral like a functional. For
example, Z
f (~r) ≡ F [f ] = f (~r0 )δ d (~r − ~r0 ) dd~r0

Since the integrand does not depend on derivatives of f , the functional derivative
of f (~r) is,

δf (~r) δF ∂ h 0 d 0
i
≡ = f (~
r )δ (~
r − ~
r ) = δ d (~r − ~r0 ) (11.27)
δf (~r0 ) δf (~r0 ) ∂f (~r0 )

Example 31: Functional derivative of interaction component of Hef f


The functional derivative of the interaction component of Hef f [m(~r)] is
Z  
δ k ~ 0
2
d 0

~ 2m

∇m(~r ) d ~r = −k ∇ (11.28)
δm(~r) 2

Taking into account the result Eq.(11.28), we have:

δF δ ln Z[h]
hm(~r)i = − =− (11.29)
δh(~r) δh(~r)

and one can show that the magnetic suscpetibility is

δ2F 2
−1 δ ln Z[h]
χ(~r,~r0 ) = = β
δh(~r)δh(~r0 ) δh(~r)δh(~r0 ) (11.30)
= β −1 m(~r)m(~r0 ) − hm(~r)i m(~r0 ) = β −1 Gc (~r,~r0 )




The problem is again try to approximate this term as much as we can. Let us do it.

11.4 Saddle point approximation: Landau theory for non-


homogeneous systems
We can now compute Z, as a first approach, using the saddle point approximation;
as we will see this will reproduce a Landau-like mean field theory which will also take
into account the presence of inhomogeneities. In particular thanks to the new term
~
involving ∇m(~ r) we will be able to compute the fluctuation correlation function and
Chapter 11. Role of fluctuations in critical phenomena: Ginzburg criterium,
174 Coarse-graining and Ginzburg-Landau theory of phase transitions

so also to determine the critical exponents η and ν. Let us recall the results previously
obtained: Z
~ ~ r)dd~r
R
ZGL = D[m(~r)]e−βHef f [m(~r)]+ h(~r)·m(~

where h(~r) = βH(~r) and


Z  2 
2 b 4 k~
βHef f [m] = atm (~r) + m (~r) + ∇m(~r) dd~r
2 2

Therefore we approximate Z with the leading term of the integral, i.e. we must
determine the function m0 that maximizes the exponent, namely minimizes:
Z  
b k 2
Z 
~
L(m, ∇m, h) = βHef f − ~h · m d
~ d ~r = 2
atm + m + 4 ~
∇m − ~h · m
~ dd~r
2 2
(11.31)
Let m0 (~r) be the profile for which L(m0 (~r), h(~r)) is minimum, then compute ZGL as
saddle
0point
ZGL [h] ' ZGL [h] = e−L[m0 (~r)] (11.32)

In order to find the minimum m0 (~r), one has to impose the stationarity condition of
the functional L:
δL = 0 (11.33)
Now, let us define h the integrand of βHef f :

b k ~
h = atm2 + m4 + (∇m)2
2 2
By considering δL with respect to the variations δm and δ(∇m), one gets the equation
of state " ! #
~ ∂h ∂h
h(~r) = − ∇ − (11.34)
~
∂(∇m) ∂m

Hence, by using the definition of h, we obtain the state equation:


~ 2 m0 (~r) + 2atm0 (~r) + 2bm3 (~r)
h(~r) = −k ∇ (11.35)
0

this is the mean field solution of the Gibbs-Landau. It is more general than the one
~ Let us note that:
found before, indeed it has the additional term ∇.

• If h = 0: Eq.(11.34) reduces to the Euler-Lagrange equation


!
∂h ~ ∂h
=∇ (11.36)
∂m ~
∂(∇m)

• If h(~r) = h (homogeneous field) and m0 (~r) = m0 : Eq.(11.35) reduces to the


equation of state of the Landau theory of uniform systems

h = 2atm0 + 2bm30

Remark. Moreover, note that a mean field theory of systems with spatial disomogene-
ity can start directly by considering the free energy functional defined in Eq.(11.25):
Z  
b 4 k
F [m, h] = atm (~r) + m (~r) + (∇m) − h(~r)m(~r) dd~r
2 2
2 2
11.5. Correlation function in the saddle point approximation for non-homogeneous
systems 175

Example 32: Show relation (11.36)


Let us consider the case h = 0, we want to obtain the Euler-Lagrange equation:
!
∂h ~ ∂h
=∇
∂m ~
∂(∇m)

In order to do that, we define


Z
~
L[m, ∇m, h] = ~
L(m, ∇m, h) dd~r

Supposing h = 0 and looking for the variation of L with respect to the variations
~
of m, δm, and the variation of ∇m, ~
δ(∇m), we have:

Z Z
~
δL[m, ∇m, 0] = ~ + δ(∇m))
L(m + δm, ∇m ~ dd~r − ~
L(m, ∇m) dd~r
Z h i
= ~ + δ(∇m))
L(m + δm, ∇m ~ ~
− L(m, ∇m) dd~r
 
 ∂L ∂L
Z
~
 d
=  d ~r
 ∂m δm + ∂(∇m) δ(∇m)

~ g0

f

where in the last step Rwe did a Taylor ~


R 0expansion around (m, ∇m). We now
integrate by parts (i.e. f g = f g − f g) the red term, obtaining
0

Z " # Z  
∂L ~ ∂L d ∂L
δL = δm − ∇ δm d ~r + δm dd~r
∂m ~
∂(∇m) ~
∂ ∇m
Z " #  
∂L ∂L ∂L
Z
= ~
δm − ∇ d
~
δm d r + ~
∇ δm dd~r
∂m ~
∂(∇m) V ~
∂ ∇m

where for the blue term we have used the divergence theorem ∂Ω F = V ∇F ~ .
R R

Moreover, the blue term vanishes at the boundary of the integration. Hence, we
have " #
∂L ∂L
Z
δL = δm −∇ ~ dd~r
∂m ~
∂(∇m)
The stationarity condition δL = 0 (Eq.(11.33)) is true ∀δm 6= 0 if and only if
the integrand is zero, thus we obtain the Euler-Lagrange equation:
!
∂L ~ ∂L
−∇ =0
∂m ~
∂(∇m)

Hence, when h = 0, we have L → h.

11.5 Correlation function in the saddle point approxima-


tion for non-homogeneous systems
We can now proceed to compute the correlation function within our approxima-
tions. In order to do that, we take the (functional!) derivative of the state equation
Chapter 11. Role of fluctuations in critical phenomena: Ginzburg criterium,
176 Coarse-graining and Ginzburg-Landau theory of phase transitions

(11.35) with respect to h(~r0 ). Remembering that


δm(~r)
χT (~r,~r0 ) =
δh(~r0 )
and that from Eq.(11.27):
δh(~r)
= δ(~r − ~r0 )
δh(~r0 )
The functional derivative becomes
δh(~r) δm(~r) h ~ 2 i
δ(~r − ~r0 ) = = −k ∇ + 2at + 6bm 2
0 (~
r ) χT (~r − ~r0 ) (11.37)
δm(~r) δh(~r0 )
where we have assumed translational invariance (i.e. uniform systems). Now, from
fluctuation-dissipation theorem we know that:
Gc (~r − ~r0 ) = kB T χT (~r − ~r0 )
so that h i
~ 2 + 2at + 6bm2 Gc (~r − ~r0 ) = δ(~r − ~r0 )
β −k ∇ (11.38)
0

Note that this means that the correlation function Gc (~r − ~r0 ) can be interpreted as
the Green’s function of the operator D written between the square brackets:
~ 2 + 2at + 6bm2
D ≡ −k ∇
the equation (11.38) is also known as Fundamental Equation and Gc (~r − ~r0 ) as Fun-
damental Solution, or Green’s function of the differential operator D.
As said, in case of translationally invariant (i.e. uniform) systems, m is constant
and equal to the equilibrium values given by the Landau theory for the Ising model;
in particular, depending on the sign of t there are two possible situations:
• Case t > 0 (T > Tc ): in this case the mean field solution is m(~r) = m0 = 0, so
the last equation becomes:
−k∇2 + 2at Gc (~r − ~r0 ) = kB T δ(~r − ~r0 ) (11.39)


Defining:
 1/2
k
ξ> (t) ≡
2at
this can be rewritten as:
kB T
−∇2 + ξ>
−2
(t) Gc (~r − ~r0 ) = δ(~r − ~r0 )

k
• Case t < 0 (T < Tc ): in this case the magnetization is:

at 1/2
 
m0 = ± −
b
so the differential equation for Gc becomes:
−k∇2 − 4at Gc (~r − ~r0 ) = kB T δ(~r − ~r0 ) (11.40)


Defining:
 1/2
k
ξ< (t) ≡ −
4at
this can be written as
kB T
−∇2 + ξ<
−2
(t) Gc (~r − ~r0 ) = δ(~r − ~r0 )

k
11.5. Correlation function in the saddle point approximation for non-homogeneous
systems 177

We will shortly see that ξ> and ξ< are the expressions of the correlation length
for T > Tc and T < Tc , respectively. We can therefore see that in both cases we get:
1
ξ ∼ t−1/2 ⇒ν= (11.41)
2
that is the mean field value of ν!
Remark. Since ν = 1/2, the upper critical dimension of a critical point belonging to
the Ising universality class is
2
dc = = 4
ν
as previously anticipated.
We have seen that for both the cases, t > 0 and t < 0, the correlation function
can be obtained by solving the differential equation:

kB T
−∇2 + ξ −2 Gc (~r − ~r0 ) = δ(~r − ~r0 ) (11.42)

k
which can be can be solved with the Fourier transform and by using spherical coor-
dinates.

11.5.1 Solution of (11.42) by Fourier transform


Lecture 18.
Let us do the Fourier transform of Eq.(11.42): Friday 13th
December, 2019.
kB T
−∇2 + ξ −2 (t) Gc (~r − ~r0 ) = δ(~r − ~r0 ) Compiled:

k Wednesday 5th
February, 2020.
If we define ~x ≡ ~r −~r0 and we use the following convention for the Fourier transform
G(q)
e of G:
Z +∞
G(q) =
e Gc (|~x|)e−i~q·|~x| dd |~x|
−∞

then transforming both sides of the equation we get:

kB T kB T 1
q 2 + ξ −2 G(q) (11.43)

e = ⇒ G(q)
e =
k k q + ξ −2
2

where q = |~q|. From this last equation we can also see that when T = Tc , since
ξ → ∞ we have G(q)
e ' q12 and so performing the inverse Fourier transform one gets

Gc (|~x|) = |~x|2−d

from which we have that the critical exponent η is null (we will see that explicitly
once we have computed G). In fact, at T = Tc we have previously defined

G(r) ∼ |~x|2−d−η

hence, in this case we have η = 0. Therefore, reminding that ~x ≡ ~r − ~r0 we can now
determine G(~x) with the Fourier antitransform:

dd~q ei~q·~x
Z
G(~x) = (11.44)
(2π)d q 2 + ξ −2

This integral is a bit tedious to compute, and in general its result depends strongly
on the dimensionality d of the system; the general approach used to solve it is to
shift to spherical coordinates in Rd and then complex integration for the remaining
Chapter 11. Role of fluctuations in critical phenomena: Ginzburg criterium,
178 Coarse-graining and Ginzburg-Landau theory of phase transitions

part, which involves |~q|. In order to do some explicit computations, let us consider
the case d = 3; we will then have:

spherical Z ∞ Z +1 Z 2π
d3~q ei~q·~x coordinates 1 q2
Z
iq|~
x| cos θ
G(~x) = = = dq e d(cos θ) dϕ
(2π)3 q 2 + ξ −2 (2π)3 0 q 2 + ξ −2 −1 0
Z ∞ " #1 Z ∞
z≡cos(θ) 2π q2 eiq|~x|z 1 q sin(q|~x|)
= dq = dq
(2π)3 0 q 2 + ξ −2 iq|~x| (2π)2 |~x| 0 q 2 + ξ −2
−1

This last integral can be computed, using the residue theorem, extending it to the
complex plane:
Z ∞
q sin(q|~x|) 1 +∞ q sin(q|~x|) 1 zeiz|~x|
Z I
I= dq = dq = Im dz
0 q 2 + ξ −2 2 −∞ q 2 + ξ −2 2 (z 2 + ξ −2 )
There are two poles at zP = ±iξ −1 ; we choose as the contour of integration γ which
contains only the pole at +iξ −1 (see Figure 11.3) and so using the residue theorem
we will have:
residue
1 zeiz|~x| theorem 1
I
Im 2πiRes(iξ −1 )
 
I = Im −1 −1
dz =
2 γ (z + iξ )(z − iξ ) 2
Since,
−1 |~
iξ −1 e−ξ x| e−|~x|/ξ
Res(iξ −1 ) = =
2iξ −1 2
we obtain
1  π
Im 2πiRes(iξ −1 ) = e−|~x|/ξ (11.45)

I=
2 2
Therefore, in the end we have:

1 e−|~x|/ξ
G(|~x|) = (11.46)
8π |~x|
We see now clearly that the correlation function has indeed an exponential behaviour
(as we have stated also in long range correlations) and that ξ is really the correlation
length; furthermore, G(~x) ∼ 1/|~x| and from the definition of the exponent η we have
G(~x) ∼ 1/|~x|d−2+η , so since d = 3 we indeed have η = 0.
One can also solve the equation for G(~r − ~r0 ) by using the spherical coordinates
and use the Bessel functions.

Im z

zP+

Re z

Figure 11.3: Positive integration contour γ in the complex plane for the integral I. It
contains only the pole at +iξ −1 .

Therefore, we have seen that for the Ising model ν = 1/2. If we also consider the
values of the other critical exponents we see that the upper critical dimension for this
model is d = 4. In other words, mean field theories are actually good approximations
for the Ising model if d ≥ 4. We will later see some other confirmations of this fact.
11.6. Including fluctuations at the Gaussian level (non interacting fields) 179

11.6 Including fluctuations at the Gaussian level (non in-


teracting fields)
Until now even if we have introduced Ginzburg-Landau theory we are still ne-
glecting the effects of the fluctuations since we are regarding the mean field theory
approximation for non-homogeneous systems as a saddle point approximation of a
more general theory; in other words, since we are approximating
saddle
point
0
ZGL [h] ' ZGL [h] = e−L[m0 (~r)]
we are still regarding the magnetization m as non fluctuating over the system. In
order to include the fluctuations we must do more and go further the simple saddle
point approximation. The simplest way we can include fluctuations in our description
is expanding Z expressed as a functional integral around the stationary solution and
keeping only quadratic terms; this means that we are considering fluctuations that
follow a normal distribution around the stationary value. The important thing to
note, however, is that in this approximation these fluctuations are independent, i.e.
they do not interact with each other. As we will see, with this assumption the values
of some critical exponents will differ from the "usual" ones predicted by mean field
theories.
Hence, let us introduce fluctuations at the Gaussian level. Consider consider h = 0
and m0 (~r) = m0 be the solution of the saddle point approximation. Let us expand
the general expression
Z  
2 b 4 k  ~ 2
βHef f [m(~r)] = atm + m + ∇m dd~r
2 2
by using
m(~r) = m0 + δm(~r)
If we assumed that the fluctuations δm(~r) are small, we would obtain:
(∇m)2 = (∇(m0 + δm))2 = (∇(δm))2
m2 = m20 + 2m0 δm + (δm)2
m4 = m40 + 4m30 δm + 6m20 (δm)2 + 4m0 δm3 + (δm)4
Hence, we have
  Z   
2 b 4 k ~ 2 b
∇(δm) + at + 3bm0 δm + 2bm0 δm + δm dd~r
2
 2 3 4
βHef f = V atm0 + m0 +
2 2 2
| {z }
A0
(11.48)
where V is the volume of the system and the term proportional to δm, 2atm0 + 2bm30 ,
is zero since m0 is the solution of the extremal condition (m0 is the stationary solu-
tion)
δHef f
=0
δm m=m0
For simplicity let us first consider T > Tc ; in this case, we know that m0 = 0 and
hence,
m(~r) = m0 + δm(~r) = δm(~r)
We have also A0 = 0, 3bm20 δm2 = 0 and 2bm0 δm3 = 0. Taking all of this into
account, we obtain:
  
k ~ 2 b
Z
T >Tc d 2 4
βHef f (δm) = d ~r ∇δm + at(δm) + (δm)
2 2
Chapter 11. Role of fluctuations in critical phenomena: Ginzburg criterium,
180 Coarse-graining and Ginzburg-Landau theory of phase transitions

The Gaussian approximation consists in neglecting the quartic term (δm)4 , hence we
finally obtain:
  
k ~ 2
Z
G,T >Tc d 2
βHef f (δm) ' d ~r ∇δm + at(δm) (11.49)
2

Remark. It is important to understand that these are fluctuations with respect to the
solution m0 .
In order to compute this integral it is more convenient to shift to Fourier space.

11.6.1 Gaussian approximation for the Ising model in Ginzburg-


Landau theory
For simplicity, consider the case T > Tc ; now, let us compute the partition function
Z R d k 2
ZG (δm) = D[δm]e− d r( 2 (∇δm) +at(δm) )
2
(11.50)

in the Fourier space. Let us make some remarks on what happens when we apply
Fourier transformations in this case. If our system is enclosed in a cubic box of volume
V = Ld (with periodic boundary conditions), we can define the Fourier components
of the magnetization as: Z
~
δm~k = δm(~r)e−ik·~r dd~r (11.51)
V

where ~k = k1 , . . . , kd = 2π~
L with kα = L nα and nα = 0, ±1, . . .. We can therefore
n 2π

expand the magnetization in a Fourier series:

1 X i~k·~r
δm(~r) = e (δm~k ) (11.52)
V
~
k

Substituting this expression of m in δm~k we obtain an integral representation for the


Kronecker delta; in fact:
 
1
Z
i(~
k−~
k0 )·~r
X
d
δm~k = δm~k0 e d ~r
V V
~
k0

and this is true only if:

1
Z Z
~ ~0 ~ ~0
ei(k−k )·~r dd~r = δ~k,~k0 ⇒ ei(k−k )·~r dd~r = V δ~k,~k0
V V V

Let us now make an observations; since δm(~r) ∈ R (is real) we have that

δm~∗k = δm−~k

Useful relations

• Sometimes it is useful to convert the sum over ~k by an integral by using the


density of states in the ~k space that is V /(2π)d , hence one useful relation is

V
Z
dd~k
X
→ (11.53)
(2π)d Rd
~
k
11.6. Including fluctuations at the Gaussian level (non interacting fields) 181

• From the relation Eq.(11.53), we have:

1 X i~k(~r−~r0 ) 1 V
Z
~ 0
e → d
dd~k eik(~r−~r ) = δ(~r − ~r0 )
V V (2π) RD
~
k

Hence, another useful relation is:

1 X i~k(~r−~r0 )
e → δ(~r − ~r0 ) (11.54)
V
~
k

• As previously shown, by inserting m(~r) into the expression for m~k

1 X i~k0~r
Z
~
m(~r) = e m~k , m~k = m(~r)e−ik·~r dd~r
V V
~
k

one gets
Z
~ ~0
ei(k−k )·~r dd~r = V δ~k~k0 (11.55)
V

• Finally, since
Z
~ ~0 V →∞
ei(k−k )·~r dd~r = V δ~k~k0 −→ (2π)d δ(~k − ~k0 )
V

We get the last useful relation:

V →∞
V δ~k~k0 −→ (2π)d δ(~k − ~k0 ) (11.56)

Remark. Our coarse graining procedure is based on the construction of blocks which
have a linear dimension that cannot be smaller than a, the characteristic microscopic
length of the system; this means that not all the ~k are allowed, and in particular we
must have π
~
k ≤ = Λ
a
It is the ultraviolet cut-off!

Gaussian Hamiltonian in Fourier space


We want to compute Eq.(11.49) in the Fourier space. For simplicity, let us change
notation as follows
δm(~r) ↔ ϕ(~r), k↔c

Hence, Eq.(11.49) becomes

c
Z h i
G,T >Tc
βHef f [ϕ] = (∇ϕ)2 + atϕ2 dd~r (11.57)
2

with
1 X i~k·~r
ϕ(~r) = e ϕ~k , ϕ~k ∈ C
V
~
k

Let us consider the terms of expression (11.57) separately:


Chapter 11. Role of fluctuations in critical phenomena: Ginzburg criterium,
182 Coarse-graining and Ginzburg-Landau theory of phase transitions

• Term atϕ2 : the integral we are considering is


at X (11.55),(11.56) at X
Z Z
~ ~0
2 d
atϕ (~r) d ~r = 2 ei(k+k )·~r ϕ~k ϕ~k0 dd~r = ϕ~k ϕ~k0 (2π)d δ(~k+~k0 )
V Rd V2
~
k,~
k0 ~
k~k0

On the other hand,


V 1
(2π)d δ(~k + ~k0 ) −→ V δ~k,−~k0
Hence, the term becomes
V 1 1
Z X
atϕ2 (~r) dd~r −→ 2atϕ~k ϕ−~k0 (11.58)
2V
~
k

• Term 2c (∇ϕ)2 : consider the integral


  
c c 1
Z Z
~ ~0
(∇ϕ)2 dd~r =
X X
∇ eik·~r ϕ~k ∇ eik ·~r ϕ~k0  dd~r
2 2V2
~
k ~0
Z k
c X  
~k · ~k0 ϕ~ ϕ~ 0 i(~
k+~k0 )·~r d c X ~ 2
= − e d ~
r = k ϕ~k ϕ−~k0
2V 2 k k 2V
~
k~k0 | {z } ~
k
(2π)d δ(~
k+~
k0 )→V δ~k,−~k0

Hence, the term becomes


c V 1 c
X 2
Z
(∇ϕ)2 dd~r −→ ~k ϕ~k ϕ−~k0 (11.59)
2 2V
~
k

In conclusion, the Gaussian Hamiltonian in Eq.(11.49) in the Fourier space is the


sum of the two terms in Eq.(11.58) and Eq.(11.59):
 2 
G,T >Tc V 1 1 X
βHef [ϕ] −→ 2at + c ~k ϕ~k ϕ−~k0 (11.60)

f 2V
~
k

Now, thinking Rabout the functional integral form of the partition function, what
does the measure D[ϕ] become in Fourier space?
Since ϕ(~r) is expressed in terms of the Fourier modes ϕ~k , which are in general
complex,
1 X ~
ϕ(~r) = ϕ~k eik·~r , ϕ~k ∈ C
V
~
k
the measure of the integral becomes:
Z Z +∞ Y
D[ϕ(~r)] → (11.61)
 
d(Re ϕ~k ) d(Im ϕ~k )
−∞
|~k|<Λ

However, since ϕ(~r) is real (i.e. ϕ~∗ = ϕ−~k ) the real and imaginary parts of the Fourier
k
modes are not independent, because we have:
  n o
Re ϕ~ = Re ϕ ~

k n−k o
Im ϕ~ = − Im ϕ ~
k −k

This means that if we use the measure we have written above (Eq.(11.61)) we would
integrate twice on the complex plane; we must therefore change the measure so as
11.6. Including fluctuations at the Gaussian level (non interacting fields) 183

to avoid this double counting. We can for example simply divide everything by 2, or
restrict the integration on the region where for example the last coordinate of ~k, let
us call it kz , is positive. Therefore:
Z Z +∞ Y
Tr ≡ D[ϕ(~r)] =
 
d Re ϕ~k d Im ϕ~k
−∞
|~k|<Λ
kz >0 (11.62)
+∞ Y
1
Z
 
= d Re ϕ~k d Im ϕ~k
2 −∞
|~k|<Λ
For sake of brevity, we define:
Z +∞ Y 0 0
Y Y
(11.63)
 
Tr ≡ dϕ~k , dϕ~k ≡ d Re ϕ~k d Im ϕ~k
−∞ ~ ~
k k |~k|<Λ
kz >0

In the end, in the Fourier space we have:


 
Z Z +∞ 0
e T >Tc [ϕ~ ]
−β H e G,T >Tc [ϕ~ ]
 dϕ~ e−β H
Y
T >Tc
ZeG = D[ϕ(~r)]e ef f k = k
ef f k (11.64)
−∞ ~
k

where  2 
G,T >Tc 1 X 2
− β Hef f [ϕ~k ] = − 2at + c ~k ϕ~k (11.65)
e
2V
~
k

Free energy in Gaussian approximation


Let us consider again the case T > Tc (for which we have (m0 = 0)) and h = 0.
In this case, the partition function of the system in the Fourier space is the one in
Eq.(11.64):
Y Z +∞  − 1 P~ 2at+c|~k|2 |ϕ~ |2
 
T >Tc

ZG
e = d Re ϕ~k d Im ϕ~k e 2V k k

−∞
|~k|<Λ
kz >0
2
Since ϕ~k = Re2 ϕ~k + Im2 ϕ~k , changing variables to:
x ≡ Re ϕ~k , y ≡ Im ϕ~k
Thus, we have
2
2at + c ~k

+∞
π
Z
2 +y 2 )
dx dy e−A(x = , A≡
−∞ A 2V
Hence,
  
T >Tc T >Tc Y 2πV 1 X 2πV
ZeG = e−β FG = 2 = exp log
e  
2 
~ 2 ~
|~k|<Λ 2at + c k |~k|<Λ 2at + c k
kz >0

We therefore have that the free energy of the system is:


 
kB T X 2πV
FeGT >Tc = − log  (11.66)
 
2 
2
2at + c ~k

|~k|<Λ
Chapter 11. Role of fluctuations in critical phenomena: Ginzburg criterium,
184 Coarse-graining and Ginzburg-Landau theory of phase transitions

Remark. For T < Tc we have m0 = ±(−at/b)1/2 6= 0. In addition, we have to redefine


the quadratic term (at + 3bm20 ) (in Eq.(11.48)), that for m20 = −at/b, becames −2at.
Moreover, we have also the term V A0 = V (atm20 + 2b m40 ). Therefore, in the case
T < Tc the free energy of the system is
 
kB T X 2πV
FeGT <Tc = V A0 − log  (11.67)
 
2 
2 ~
~
|k|<Λ 2at + c k

Specific heat in the Gaussian approximation


We can now compute the specific heat of the system, and so determine its critical
exponent α. We therefore want to compute:

∂ 2 FGL
cG
V = −T
∂T 2 V
The derivatives are straightforward, and in the end we get:
A X 1 B X 1
cG
V = 2 − 2
V V
 2 ~
|k|<Λ 2at + c ~k
~ ~
|k|<Λ 2at + c k
| {z }
| {z } 2st
1st

One can show that (


st ξ 4−d ∼ t−ν(4−d) d<4
1 ∝
<∞ d>4
and (
ξ 2−d ∼ t−ν(2−d) d<2
2nd ∝
<∞ d>2
Therefore for d < 2 the 2nd contribution to cGL
V diverges, but in the same range of
d the divergence of the first contribution is more relevant; on the other hand, for
2 ≤ d < 4 only the first contribution diverges. It is therefore the 1st term that
determines the divergence of the specific heat, and in particular for d < 4 we have
cG −ν(4−d) ; in summary:
V ∼t
(
G t−ν(4−d) d < 4
cV ∼ (11.68)
<∞ d>4

and so we see that in the Gaussian approximation the inclusion of the fluctuations
has changed the behaviour of cV at the transition point; in particular, has changed
the value of the critical exponent α (cV ∼ t−α ) to:

αG = ν(4 − d) for d < 4 (11.69)

In order to compute it, however, we still must determine ν so we now proceed to


compute the two-point correlation function in order to determine both η and ν.

Two-point correlation function in the Gaussian approximation


We have to compute the 2-point correlation function for Hef
G (ϕ). We know that
f
the (simple) correlation function is defined as:

G(~r,~r0 ) = ϕ(~r)ϕ(~r0 )


11.6. Including fluctuations at the Gaussian level (non interacting fields) 185

cV cV

d>4 d<4

Gaussian
f luctuations
mean f ield mean f ield
Gaussian

0 t 0 t

(a) Case d > 4. (b) Case d < 4.

Figure 11.4: Behaviour of the specific heat cV as a function of the rescaled temperature
t. In blue it is represented its behaviour in the mean field theory, while in red the one with
Gaussian approximations. We see that in the case d < 4 with Gaussian approximations the
specific heat diverges near t ∼ 0.

so we first have to determine:


1 X i(~k·~r+~k0 ·~r0 )
ϕ(~r)ϕ(~r0 ) = e ϕ~k ϕ~k0
V2
~
k,~
k0

Shifting to Fourier space, we have (the subscript G stands for Gaussian):


R +∞

−∞ dϕ~ k1
. . . dϕ~k dϕ~k0 ϕ~k ϕ~k0 e−βHef f
ϕ~k ϕ~k0 G = R +∞
−∞ dϕ~ k1
. . . dϕ~k dϕ~k0 e−βHef f

where, as we said,
 2 
1 X 2
G
βHef = V A0 + 2at + c ~k ϕ~k

f
2V
~
k

It is clear that in ϕ~k ϕ~k0 G all the integrals factorize since the Fourier modes are all

independent (they are decoupled); therefore, all the integrals in the numerator that
don’t involve ~k or ~k0 simplify with the same integrals in the denominator. Taking
this into account, it is possible to show

V
ϕ~k ϕ~k0 G
= 2 δ~k,−~k0
2at + c ~k

that, in the limit V → ∞ (Eq.(11.56)) becomes

V →∞ (2π)d
2 δ(~k + ~k0 ) (11.70)


ϕ~k ϕ~k0 G
−→
2at + c ~k

Going back to real space, by antitransforming, we have:


1 X i(~k·~r+~k0 ·~r0 )
1 X i(~k·~r+~k0 ·~r0 ) V
ϕ(~r)ϕ(~r0 ) G = 2


e ϕ~k ϕ~k0 G = 2 e 2 δ~k,−~k0
V V
2at + c ~k

~
k,~
k0 ~
k,~
k0
~ 0
1 X eik·(~r−~r )
= 2
V ~
~
k 2at + c k

We see that defining:


 c 1/2
ξ(t) =
2at
Chapter 11. Role of fluctuations in critical phenomena: Ginzburg criterium,
186 Coarse-graining and Ginzburg-Landau theory of phase transitions

we get
~ 0
1 X 1 eik·(~r−~r ) 1 X i~k·~x ~
ϕ(~r)ϕ(~r0 ) G = (11.71)


2 = e Ĝ(k)
V c ~ −2 V
~
k k + ξ ~
k

where we have defined the correlation function


1 1
Ĝ(~k) = 2 (11.72)
c ~
k + ξ −2

this correlation function acquires the same form of the one computed in mean field
theory. This means that the critical exponents ν and η now have the same values
predicted by mean field theory (see Sec.(11.5) and Sec.(11.5.1)), namely:
(
νG = 21
⇒ (11.73)
ηG = 0

hence, there are no changes with Gaussian fluctuations! Interactions between ϕ~k are
needed!
Chapter 12

Widom’s scaling theory.


Block-spin Kadanoff’s
transformation
Lecture 19.
Wednesday 18th
12.1 Introduction December, 2019.
Compiled:
We have seen that as a given system approaches a critical point T → Tc± , the Wednesday 5th
distance ξ over which the fluctuations of the order parameter are correlated becomes February, 2020.
comparable to the size of the whole system L and the microscopic aspects of the
system become irrelevant. This means that near a critical point the system has no
longer characteristic lengths (a, L), besides ξ of course that becomes the only relevant
length scale of the problem. We can therefore expect that if we "move" a little bit from
a critical point (t ∼ 0), for example changing the temperature by a small amount, the
free energy of the system as a function will not change its shape, hence it is invariant
in form by a change of scale.
This hypothesis is also suggested by experimental data such as the ones shown
by Guggenheim for the gas phase diagrams and the ones shown for ferromagnetic
materials at different temperatures. Let us consider the experiment in Figure 12.2
( [4] pag. 119); we can see that data from different temperatures, if scaled properly,
collapse into two (one for t < 0 and one for t > 0) unique curves. It is clearly
illustrated in Figure 12.1. At the origin, the Widom’s static scaling theory was
introduced also to explain this collapse.

m
|t| β

t < 0

t> 0

| h |
|t| ∆

Figure 12.1: Scaled magnetization m is plotted against scaled magnetic field h.

187
188 Chapter 12. Widom’s scaling theory. Block-spin Kadanoff’s transformation

Figure 12.2: Scaled magnetic field h is plotted against scaled magnetization m for the
insulating ferromagnet CrBr3 , using data from seven supercritical (T > Tc ) and from eleven
subcritical T < Tc isotherms. Here σ ≡ M/M0 . (1969) [4]

12.2 Widom’s static scaling theory


We have seen that when a phase transition occurs the free energy of the system is
such that the response functions exhibit singularities, often in the form of divergences.
To make a concrete example (but of course all our statements are completely general)
if we consider a magnetic system we can suppose to write its free energy density as:
f (T, H) = fr (T, H) + fs (T, H)
where t = (T − Tc )/Tc and h = (H − Hc )/kB T , fr is the "regular" part of the free
energy (which does not significantly change near a critical point, it is an analytic
function), while fs is the "singular" one, which contains the non-analytic behaviour
of the system near a critical point (i.e. t ∼ 0 and h ∼ 0).
Widom’s static scaling hypothesis consists in assuming that the singular part fs
of the free energy is a generalized homogeneous function, i.e.:
fs (λp1 t, λp2 h) = λfs (t, h), ∀λ ∈ R
Note that assuming that one thermodynamic potential is a generalized homogeneous
function implies that all the other thermodynamic potentials are so.
Therefore, in order to properly define the scaling hypothesis, we should rely on
the mathematical concept of homogeneous functions and now we discuss the main
properties of such functions.
12.2. Widom’s static scaling theory 189

12.2.1 Homogeneous functions of one or more variables


Single variable
Let us begin with the definition of homogeneous function for a single variable r.
Definition 10: Homogeneous function
A function f (r) is said to be homogeneous in r if

f (λr) = g(λ)f (r), ∀λ ∈ R (12.1)

where g is, for the moment, an unspecified function (we will shortly see that it
has a precise form).

Example 33: Parabola f (r) = Br2


An example of homogeneous function is

f (r) = Br2

in fact
f (λr) = B(λr)2 = λ2 f (r)
and so in this case g(λ) = λ2 .

A very interesting property of an homogeneous functions is that, once its value


in a point r0 (i.e. f (r0 )) and the function g(λ) are known, the entire f (r) can be
reconstructed for all r ∈ R; indeed, any r can be written in the form r = λr0 (of
course with λ = r/r0 ), so that

f (r) = f (λr0 ) = g(λ)f (r0 ) (12.2)

We now want to show that g(λ) has a precise form.


Theorem 3
The function g(λ) is not arbitrary, but it must be of the form

g(λ) = λp (12.3)

where p is the degree of the homogeneity of the function.

Proof. From the definition of homogeneous function, for λ, µ ∈ R we have on one


hand that:
f (λµr) = f (λ(µr)) = g(λ)f (µr) = g(λ)g(µ)f (r)
on the other hand,
f ((λµ)r) = g(λµ)f (r)
and so:
g(λµ) = g(λ)g(µ)
If we now suppose g to be differentiable1 , then differentiating with respect to µ this
last equation we get:

∂ ∂
[g(λµ)] = [g(λ)g(µ)] ⇒ λg 0 (λµ) = g(λ)g 0 (µ)
∂µ ∂µ
1
Actually g(λ) continuous is sufficient, but proof becomes more complicated.
190 Chapter 12. Widom’s scaling theory. Block-spin Kadanoff’s transformation

Setting µ = 1 and defining p ≡ g 0 (µ = 1), we have:

g 0 (λ) p
λg 0 (λ) = g(λ)p ⇒ =
g(λ) λ

which yields:
d p
(ln g(λ)) = ⇒ ln g(λ) = p ln λ + c ⇒ g(λ) = ec λp
dλ λ
Now, g 0 (λ) = pec λp−1 , so since g 0 (1) = p by definition we have p = pec and thus
c = 0. Therefore:
g(λ) = λp
A homogeneous function such that g(λ) = λp is said to be homogeneous of degree
p. 

Generalized homogeneous functions (more variables)


Let us now define homogeneous functions for more than only one variable:

f (λx, λy) = λp f (x, y), ∀λ ∈ R

The function f (x, y) is a generalized homogeneous function if has as more general


form
f (λa x, λb y) = λf (x, y), ∀λ ∈ R (12.4)
Remark. If we consider instead

f (λa x, λb y) = λp f (x, y)

we can always choose λp ≡ s such that

f (sa/p x, sb/p y) = sf (x, y)

and choosing a0 = a/p and b0 = b/p we are back to (12.4). Hence, it is the most
general form an homogeneous function can have.
Remark. Since λ is arbitrary, we can choose λ = y −1/b , thus we get
 
1/b x
f (x, y) = y f a/b , 1
y

in that way f depends on x and y only through the ratio x


y a/b
! Similarly, for x, one
can choose λ = x−1/a , obtaining
 y 
f (x, y) = x1/a f 1, b/a
x

Example 34
The function f (x, y) = x3 + y 7 is an homogeneous one. Indeed, we have:

f (λ1/3 x, λ1/7 y) = λx3 + λy 7 = λf (x, y)

Instead, examples of non-homogeneous functions are:

f (x) = e−x , f (x) = log x


12.3. Relations between critical exponents 191

12.2.2 Widom’s scaling hypothesis


As said, the Widom’s static scaling hypothesis consists in assuming that the sin-
gular part of the free energy, fs , is a generalized homogeneous function, i.e.:

fs (λp1 t, λp2 h) = λfs (t, h), ∀λ ∈ R (12.5)

where p1 and p2 are the degrees of the homogeneity.


The exponents p1 and p2 are not specified by the scaling hypothesis; however, we
are shortly going to show that all the critical exponents of a system can be expressed
in terms of p1 and p2 ; this also implies that if two critical exponents are known, we
can write p1 and p2 in terms of them (since in general we will have a set of two
independent equations in the two variables p1 and p2 and therefore determine all the
critical exponents of the system. In other words, we just need to know two critical
exponents to obtain all the others.
Remark. Since fs is a generalized homogeneous function, it is always possible to
choose λ to remove the dependence on one of their arguments; for example, one can
choose λ = h−1/p2 to obtain

fs (t, h) = h1/p2 fs (h−p1 /p2 t, 1)

where
p2
∆≡
p1
is called the gap exponent.

12.3 Relations between critical exponents


Let us now explore the consequences of Widom’s assumption on the critical expo-
nents of a system, again on a magnetic one for concreteness. Indeed, let us see how
this simple hypothesis allow us, by simple differential calculus, to obtain relations
between the thermodynamic critical exponents.

12.3.1 Exponent β (scaling of the magnetization)


Let us start from the scaling hypothesis

fs (λp1 t, λp2 h) = λfs (t, h)

Since
∂f
M=
∂H
deriving both sides of Widom’s assumption with respect to h2 we get:
∂fs p1 p2 ∂fs
λp2 (λ t, λ h) = λ
∂h ∂h
and thus:
λp2 Ms (λp1 t, λp2 h) = λMs (t, h)
On the other hand, we know that, for h = 0 and t → 0− , Ms (t) ∼ (−t)β . Hence, in
order to determine β, we set h = 0 so that this becomes

Ms (t, 0) = λp2 −1 Ms (λp1 t, 0)


2
We should in principle derive with respect to H, but since h ∝ βH, the β factors simplify on
both sides.
192 Chapter 12. Widom’s scaling theory. Block-spin Kadanoff’s transformation

Since λ is arbitrary, using the properties of generalized homogeneous functions, we


set
λp1 t = −1 ⇒ λ = (−t)−1/p1

to eliminate the dependence on t. Hence, we get

Ms (t, 0) = (−t)(1−p2 )/p1 Ms (−1, 0)

By definition of the β critical exponent, we have:

1 − p2
β= (12.6)
p1

12.3.2 Exponent δ
Let us consider again the relation

λp2 Ms (λp1 t, λp2 h) = λMs (t, h)

We can determine the exponent δ by setting t = 0 (T = Tc ), obtaining:

M (0, h) = λp2 −1 M (0, λp2 h)

Now, using again the same property of generalized homogeneous functions we set

λp2 h = 1 ⇒ λ = h−1/p2

and we get:
Ms (0, h) = h(1−p2 )/p2 Ms (0, 1)
h→0+
Since Ms ∼ h1/δ , we have:
p2
δ= (12.7)
1 − p2
Now we can also express p1 and p2 in terms of β and δ from the two relations
Eq.(12.6) and Eq.(12.7). The result is:

1 δ
p1 = , p2 = (12.8)
β(δ + 1) δ+1

from which we see that the gap exponent is:


p2
∆≡ = βδ (12.9)
p1

12.3.3 Exponent γ
In order to obtain the magnetic susceptibility, we derive twice the expression of
Widom’s assumption with respect to h, to get:

λ2p2 χT (λp1 t, λp2 h) = λχT (t, h)

The exponent γ describes the behaviour of χT for t → 0 when no external field


is present (h = 0). What we can now see is that the scaling hypothesis leads to the
equality of the exponents for t → 0+ and t → 0− .
12.3. Relations between critical exponents 193

• Case t → 0− : setting h = 0 and λ = (−t)−1/p1 we get


2p2 −1

χT (t, 0) = (−t) p1 χT (−1, 0)
and if we call γ − the critical exponent for t → 0− , we see that, since
t→0−
χT (t, 0) ∼ (−t)−γ−
we get
2p2 − 1
γ− = = β(δ − 1)
p1
• Case t → 0+ : setting h = 0 and λ = (t)−1/p1 we get
2p2 −1

χT (t, 0) = t p1 χT (1, 0)
and if we call γ + the critical exponent for t → 0+ , we see that, since
t→0+
χT (t, 0) ∼ t−γ+
we get
2p2 − 1
γ+ = = β(δ − 1)
p1
We therefore see explicitly that:
2p2 − 1
γ− = γ+ ≡ γ = = β(δ − 1) (12.10)
p1

12.3.4 Exponent α (scaling of the specific heat)


In order to determine the behaviour of the specific heat (at constant external field)
near the critical point, we derive the expression of Widom’s assumption twice with
respect to the temperature t, so that:
λ2p1 cH (λp1 t, λp2 h) = λcH (t, h)
We want to see again that the scaling hypothesis leads to the equality of the exponents
for t → 0+ and t → 0− .
• Case t → 0− : setting h = 0 and λ = (−t)−1/p1 we get
 
− 2− p1
cH (t, 0) = (−t) 1 cH (−1, 0)
and if we call α− the critical exponent for t → 0− , we see that, since
t→0−
cH (t, 0) ∼ (−t)−α−
we get
1
α− = 2 −
p1
• Case t → 0+ : setting h = 0 and λ = (t)−1/p1 we get
 
− 2− p1
cH (t, 0) = t 1 cH (1, 0)
and if we call α+ the critical exponent for t → 0+ , we see that, since
t→0−
cH (t, 0) ∼ t−α+
we get
1
α+ = 2 −
p1
We have again:
1
α− = α+ ≡ α = 2 − (12.11)
p1
194 Chapter 12. Widom’s scaling theory. Block-spin Kadanoff’s transformation

12.3.5 Griffiths and Rushbrooke’s equalities


If we now substitute p1 = 1
β(δ+1) into α = 2 − p1 ,
1
we get:

α + β(δ + 1) = 2 (12.12)

This is the Griffiths equality, which we have already encountered in inequalities be-
tween critical exponents as an inequality (see Sec.2.7.4).
On the other hand, Rushbrooke’s equality is obtained by combining Griffith equal-
ity with the relation γ = β(δ − 1):

α + 2β + γ = 2 (12.13)

We therefore see, as anticipated in Sec.2.7.4, that the static scaling hypothesis allows
to show that they are indeed exact equalities.

12.3.6 An alternative expression for the scaling hypothesis


We can re-express Widom’s assumption in another fashion often used in literature.
Let us consider the Widom’s assumption

fs (λp1 t, λp2 h) = λfs (t, h)

If we set λ = t−1/p1 , then:

fs (1, t−p2 /p1 h) = t−1/p1 fs (t, h)


p2
From ∆ = p1 and α = 2 − p1 ,
1
we can rewrite this as:
 
2−α h
fs (t, h) = t fs 1, ∆ (12.14)
t
which is the most used form of the scaling hypothesis in statistical mechanics.
As we can notice, we have not considered the critical exponents η and ν; this will
be done shortly in Kadanoff’s scaling and correlation lengths.

12.3.7 Scaling of the equation of state


Besides the relations between critical exponents, Widom’s static scaling theory
allows us to make predictions on the shape of the state equation of a given system.
By predicting the scaling form of the equation of state, we can explain the collapse of
the experimental data. Let us now see how, again for a magnetic system. We start
from the relation
Ms (t, h) = λp2 −1 Ms (λp1 t, λp2 h)
Using the property of generalized homogeneous functions we set λ = |t|−1/p1 . Hence,
1−p2
t h
Ms (t, h) = |t| p1 Ms ( , p /p )
|t| |t| 2 1

Since β = (1 − p2 )/p1 and ∆ = p2 /p1 , we have


Ms (t, h) t h
= Ms ( , ) (12.15)
|t|β |t| |t|∆

Hence, we can define the scaled magnetization and scaled magnetic field as

m̄ ≡ |t|−β M (t, h), h̄ ≡ |t|−∆ h(t, M ) (12.16)


12.4. Kadanoff’s block spin and scaling of the correlation function 195

and
F± (h̄) ≡ Ms (±1, h̄) (12.17)
where +1 corresponds to t > 0 (namely T > Tc ) and −1 to t < 0 (i.e. T < Tc ). Using
these definitions, Eq.(12.15) becomes

m̄ = F± (h̄) (12.18)

The meaning of this equation is that if we measure M and h and rescale them as we
have just seen, all the experimental data should fall on the same curve independently
of the temperature T ; there are of course two possible curves (not necessarily equal),
one for T > Tc and one for T < Tc (which correspond to M (1, h) and M (−1, h)).
These predictions are in perfect agreement with experimental results shown in Figure
12.2, and are one of the greatest successes of Widom’s static scaling theory.

12.4 Kadanoff’s block spin and scaling of the correlation


function
As we have seen, Widom’s static scaling theory allows us to determine exact
relations between critical exponents, and to interpret the scaling properties of systems
near a critical point. However, this theory is based upon the following equation:

f (T, H) = fr (T, H) + fs (T, H)

but gives no physical interpretation of it; in other words, it does not tell anything
about the physical origin of scaling laws. Furthermore, as we have noticed Widom’s
theory does not involve correlation lengths, so it tells nothing about the critical
exponents ν and η.
We know that one of the characteristic traits of critical phenomena is the diver-
gence of the correlation length ξ, which becomes the only physically relevant length
near a critical point. However, by now we are unable to tell if and how this is related
to Widom’s scaling hypothesis; everything will become more clear within the frame-
work of the Renormalization Group, in which we will see that Widom’s assumption
is a consequence of the divergence of correlation length.
Nonetheless, before the introduction of the Renormalization Group, Kadanoff
(1966) proposed a plausibility argument for his assumption applied to the Ising model,
which we are now going to analyse. We will see that Kadanoff’s argument, which
is based upon the intuition that the divergence of ξ implies a relation between the
coupling constants of an effective Hamiltonian Hef f and the length on which the or-
der parameter m is defined, is correct in principle but not in detail because these
relations are in reality more complex than what predicted by Kadanoff; furthermore,
Kadanoff’s argument does not allow an explicit computation of critical exponents.
We will have to wait for the Renormalization Group in order to solve these problems.

12.4.1 Kadanoff ’s argument for the Ising model


Let us consider a d-dimensional Ising model with hypercubic lattice with lattice
constant a; assuming nearest-neighbour interactions the Hamiltonian of the system
will be:
XN N
X
− βHΩ = K σi σj + h σi (12.19)
hiji i=1

where σi = ±1, K = βJ and h = βH, as usual.


The Kadanoff’s argument is based on a coarse-grained operation on the system
and on two basic assumptions.
196 Chapter 12. Widom’s scaling theory. Block-spin Kadanoff’s transformation

Coarse graining operation


Since the values of the spin variables are correlated on lengths of order r < ξ(t),
we partition the system into blocks of size la (l is an adimensional scale) such that

a  la  ξ(t)

The spins contained in this regions of linear dimension la will behave, statistically, as
a single unit. We can therefore imagine to carry out, similarly to what we have seen
for the Ginzburg-Landau theory, a coarse graining procedure were we substitute the
spins σi inside a "block" of linear dimension la (which will therefore contain ld spins)
with a single block spin (or superspin) SI ; the total number of blocks will of course
be:
N
Nl = d
l
Considering the I-th block, we can define the block spin SI as:
1 1 X
SI ≡ σi (12.20)
|ml | ld
i∈I

where the mean magnetization of the I-th block ml is:


1 X
ml ≡ hσi i (12.21)
ld
i∈I

Remark. The division by |ml | in equation (12.20) is crucial because it rescales the
new variables SI to assume only the values ±1, just like the original ones (rescaling
of the fields).
In the end we are left with a system of block spins on a hypercubic lattice with
lattice constant la. We can therefore rescale the spatial distances between the degrees
of freedom of our system:
~r
~rl =
l
In other words, since la is now the characteristic length of the system we are measuring
the distances in units of la (just like in the original one we measured distances in units
of a). The coarse graining procedure we have just seen is described by Figure 12.3
for a two-dimensional Ising model.

(b) Block spins.


(a) Original system.
(c) Final rescaled system.

Figure 12.3: Coarse graining procedure for a two-dimensional Ising model.

Kadanoff’s argument now proceeds with two assumptions.

1st crucial assumption


The first assumption states that, in analogy to what happens in the original sys-
tem, we assume that the block spins interact with the nearest neighbours and an
external effective field (just like the original ones do). Hence, the Hamiltonian of the
12.4. Kadanoff’s block spin and scaling of the correlation function 197

new system Hl is equal in form to HΩ , the original one, of course provided that the
spins, coupling constants and external fields are redefined. If we call Kl and hl these
new constants, the new effective Hamiltonian is:
Nb
X Nb
X
− βHl = Kl SI SJ + hl SI (12.22)
hIJi I=1

Remark. This assumption is in general wrong!


Since in the new system the lengths have been rescaled by a factor l, this means
that in the new system all the lengths will be measured in units of la. Hence, also the
correlation length has to be measured in units of la, and in particular we will have:
ξ
ξl =
l
This means that the new system has a lower correlation length ξl < ξ, and so the
system described by Hl is more distant from the critical point than the original one
HΩ . Hence, we will have a new effective temperature:

tl > t

Similarly, in the coarse grained system the magnetic field hl will be rescaled to an
effective one:
\[ h \sum_i \sigma_i = h \sum_I \sum_{i \in I} \sigma_i = h \sum_I |m_l| l^d S_I = \underbrace{h |m_l| l^d}_{h_l} \sum_I S_I = h_l \sum_I S_I \]

which implies that there is a relation between the new magnetic field and the mean
magnetization:

\[ h_l = h |m_l| l^d \]
Since the Hamiltonian of the block spin system Hl has the same form of the
original one HΩ , the same will be true also for the partition function Zl and the free
energy, provided that h, K and N are substituted with hl , Kl and N/ld ; in particular,
considering the singular part fs of the free energy density we will have:
\[ N_l f_s(t_l, h_l) = \frac{N}{l^d} f_s(t_l, h_l) = N f_s(t, h) \]

and so

\[ \Rightarrow \quad f_s(t_l, h_l) = l^d f_s(t, h) \]
Remark. Note that the homogeneity condition is recovered with λ ≡ ld .

2nd crucial assumption


In order to proceed, we should ask how t and h change under the block spin
transformation; hence, we now need the second assumption. We assume that:

\[ t_l = t\, l^{y_t}, \qquad h_l = h\, l^{y_h}, \qquad y_t, y_h > 0 \tag{12.23} \]

where the yt , yh are called scaling exponents and are for now unspecified, apart from
the fact that they must be positive (so that the coarse grained system is indeed farther
from the critical point with respect to the original one).
The justification of this assumption lies in the fact that we are trying to under-
stand the scaling properties of our system near a critical point, and these are the
simplest possible relations between (t, h) and (tl , hl ) that satisfy the following sym-
metry requirements:

• when h → −h, then hl → −hl ;

• when h → −h, then tl → tl ;

• when t = h = 0, then tl = hl = 0.
If we use this assumption in the free energy equation

fs (tl , hl ) = ld fs (t, h)

we get:
\[ f_s(t, h) = l^{-d} f_s(t\, l^{y_t}, h\, l^{y_h}) \tag{12.24} \]
This is very similar to Widom’s scaling hypothesis (Eq.(12.5)), but with the parameter
λ that is the inverse of the block volume ld . Since l has no specified value, we can
choose the one we want and again we use the properties of generalized homogeneous
functions to eliminate one of the arguments of fs . In particular, setting

\[ l = |t|^{-1/y_t} \]

we get:

\[ f_s(t, h) = |t|^{d/y_t} f_s(1, h|t|^{-y_h/y_t}) \tag{12.25} \]

where the gap exponent is now

\[ \Delta = \frac{y_h}{y_t} \tag{12.26} \]

Comparing this equation with the alternative expression of the scaling hypothesis
(Eq.(12.14)) we have:

\[ 2 - \alpha = \frac{d}{y_t} \tag{12.27} \]
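As a quick sanity check of these relations (not part of Kadanoff's original argument), one can plug in the exactly known scaling exponents of the two-dimensional Ising model, y_t = 1 and y_h = 15/8; a minimal Python sketch:

\begin{verbatim}
# Sanity check of Eqs.(12.26)-(12.27) with the exactly known 2d Ising
# scaling exponents y_t = 1 and y_h = 15/8 (taken as external input,
# they are not derived in this section).
d = 2
y_t, y_h = 1.0, 15.0 / 8.0

alpha = 2 - d / y_t      # Eq.(12.27): 2 - alpha = d / y_t
Delta = y_h / y_t        # Eq.(12.26): gap exponent

print(alpha, Delta)      # 0.0 (logarithmic specific heat), 1.875
\end{verbatim}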

12.4.2 Kadanoff's argument for two-point correlation functions


Let us now compute the two-point correlation function of the block spin system:

\[ G_{IJ}(\vec{r}_l, t_l) \equiv \langle S_I S_J \rangle - \langle S_I \rangle \langle S_J \rangle \tag{12.28} \]

where ~rl is the vector of the relative distance between the centers of the I-th and
J-th block (measured in units of la, as stated before). We now want to see how this
correlation function is related to the one of the original system, G(\vec{r}, t). From the
first assumption we have

\[ h_l = h |m_l| l^d \quad \Rightarrow \quad |m_l| = \frac{h_l\, l^{-d}}{h} \]

Using the second assumption, for which h_l = h l^{y_h}, we obtain:

\[ |m_l| = l^{y_h - d} \]

Since

\[ S_I = \frac{1}{|m_l|} \frac{1}{l^d} \sum_{i \in I} \sigma_i \]

the two-point correlation function becomes

\[ G_{IJ}(\vec{r}_l, t_l) = \langle S_I S_J \rangle - \langle S_I \rangle \langle S_J \rangle = \frac{1}{l^{2(y_h - d)}\, l^{2d}} \sum_{i \in I} \sum_{j \in J} \underbrace{\left[ \langle \sigma_i \sigma_j \rangle - \langle \sigma_i \rangle \langle \sigma_j \rangle \right]}_{G_{ij}} = \frac{l^d\, l^d}{l^{2(y_h - d)}\, l^{2d}} \left[ \langle \sigma_i \sigma_j \rangle - \langle \sigma_i \rangle \langle \sigma_j \rangle \right] \]

where in the last step we have made the assumption that, since la ≪ ξ, G_{ij} is fairly
constant inside a block and can be brought outside the sum; the sum over i, j then
simply gives a factor l^{2d}. Hence, we have:

\[ G_{IJ}(\vec{r}_l, t_l) = l^{2(d - y_h)}\, G_{ij}(\vec{r}, t) \tag{12.29} \]

Introducing also the dependence on h, we have:


 
\[ G_{IJ}\!\left( \frac{\vec{r}}{l},\, t\, l^{y_t},\, h\, l^{y_h} \right) = l^{2(d - y_h)}\, G_{ij}(\vec{r}, t, h) \tag{12.30} \]

Again, we can remove the dependence on t by setting l = t^{-1/y_t}, so that:

\[ G(\vec{r}, t, h) = t^{\frac{2(d - y_h)}{y_t}}\, G(\vec{r}\, t^{1/y_t}, 1, h\, t^{-y_h/y_t}) \tag{12.31} \]

Now, ~r scales with l as all the lengths of our system, and since we have set l = t−1/yt
we have
\[ |\vec{r}|\, t^{1/y_t} = 1 \quad \Rightarrow \quad t = |\vec{r}|^{-y_t} \]

Therefore, inserting in (12.31):

\[ G(\vec{r}, t, h) = |\vec{r}|^{-2(d - y_h)}\, F_G\!\left( h\, t^{-y_h/y_t} \right) \tag{12.32} \]

where we have defined


 
\[ F_G\!\left( h\, t^{-y_h/y_t} \right) \equiv G(1, 1, h\, t^{-y_h/y_t}) \tag{12.33} \]

Remembering that the power law behaviour of G in proximity of the critical point is

\[ G \sim |\vec{r}|^{2 - d - \eta} \]

and comparing with Eq.(12.32), we get

\[ 2(d - y_h) = d - 2 + \eta \tag{12.34} \]
With the choice l = t−1/yt we further have that the correlation length scales as:

\[ \xi = l\, \xi_l = \xi_l\, t^{-1/y_t} \]

and remembering that the correlation length diverges as

\[ \xi \sim t^{-\nu} \]

we also have:

\[ \nu = \frac{1}{y_t} \tag{12.35} \]
The Eq.(12.35) together with Eq.(12.27) leads to the hyperscaling relation:

\[ \Rightarrow \quad 2 - \alpha = \nu d \tag{12.36} \]

Hyperscaling relations are known to be less robust than the normal scaling relations
between critical exponents (for example, for Hamiltonians with long-ranged power
law interactions hyperscaling relations don’t hold).
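As a small numerical illustration (using standard, exactly known exponents as input, not derived here), one can check where the hyperscaling relation is satisfied:

\begin{verbatim}
# Check of the hyperscaling relation 2 - alpha = nu * d for two standard
# cases (the exponent values are well-known inputs, not derived here).
cases = {
    "Ising d=2 (Onsager)": dict(d=2, alpha=0.0, nu=1.0),
    "mean field at d=4 (upper critical dimension)": dict(d=4, alpha=0.0, nu=0.5),
}
for name, c in cases.items():
    print(name, 2 - c["alpha"], c["nu"] * c["d"])
# Both satisfy 2 - alpha = nu*d; mean field above d = 4 violates it,
# consistently with hyperscaling being less robust than ordinary scaling.
\end{verbatim}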
Chapter 13

Renormalization group theory. Universality

13.1 Renormalization group theory (RG)

Kadanoff's argument for the Ising model allows us to explain the scaling form of the
free energy density and of the correlation length near the critical point; in particular,
Kadanoff's block spin transformation justifies the Widom scaling hypothesis and
identifies λ with l^d. We have obtained the results:

\[ f_s(t, h) = l^{-d} f_s(t\, l^{y_t}, h\, l^{y_h}) \]

\[ G(\vec{r}, t, h) = l^{-2(d - y_h)}\, G\!\left( \frac{\vec{r}}{l},\, t\, l^{y_t},\, h\, l^{y_h} \right) \]

\[ t_l = t\, l^{y_t}, \qquad h_l = h\, l^{y_h} \]

where we made two crucial assumptions:

1. 1st assumption: the block-spin Hamiltonian has the same form as the original one,
   \[ H_l = H_\Omega \]

2. 2nd assumption: the reduced temperature and field rescale as
   \[ t_l = t\, l^{y_t}, \qquad h_l = h\, l^{y_h} \]

However, as we have seen, the Kadanoff’s theory is unable to predict the values of
the scaling exponents yt and yh (and thus ultimately of the critical exponents), nor
can it explain why universality occurs.
Remark. Open problems are: how can an iterative coarse-graining procedure produce
the 2nd assumption? How can this give rise to the singular behaviour of fs? How can
we explain the universality of critical points?
We will see that these problems are solved with the introduction of the Renormal-
ization Group (done by K. G. Wilson at the beginning of the ’70s), which we will call
simply "RG" from now on for the sake of simplicity. The RG is based upon the cor-
rect "intuition" of Kadanoff’s argument that the coupling constants of a Hamiltonian
change if we coarse-grain the system (or in other words we "look" at it on different
spatial scales); however, this "intuition" strictly speaking is not correct since we have
seen that in Kadanoff’s procedure we assume that after the coarse-graining procedure
the Hamiltonian of the system has exactly the same form: as we will see, this is not
true in general because new terms can appear after we coarse-grain the system.


13.1.1 Main goals of RG


The main goals of the Renormalization Group theory are:

1. To furnish an algorithmic way to perform the coarse graining procedure systematically.
More specifically, the realization of a coarse graining procedure, also called
decimation, is like the one introduced by Kadanoff for the Ising model; in
general, this procedure must integrate the degrees of freedom of the system on
scales of linear dimension la which must be much larger than the characteristic
microscopic scale a of the system, but also much smaller than the correlation
length ξ:
a  la  ξ
After the decimation, we are left with a new effective Hamiltonian that describes
the system at larger length scales. We will see that this is equivalent to find a
transformation between the coupling constants K → K 0 .

2. Identify the origin of the critical behaviour and explain universality.


The coarse graining procedure will give rise to a system with ξl = ξ/l, this
means that the new correlation length is smaller than the original one, so our
system is farther from criticality after the decimation.

To make an example, suppose we are given a generic Hamiltonian H = H([K])
which depends on an arbitrary number of coupling constants $[K] = \vec{K} = (K_1, K_2, \ldots, K_n)$
(in the case of an Ising model with nearest-neighbour interaction and an external field
there are only two coupling constants, K = K1 and h = K2 ).
Let us suppose we apply a coarse-graining procedure, in which we integrate the
degree of freedom within distance l with a ≤ la ≤ L. For what we have just stated,
the action of the RG can be expressed as a transformation of the coupling constants:

[K 0 ] = Rl [K], l>1 (13.1)

where Rl is called the RG transformation, while this last equation is referred to as the
recursive relation.

Properties of Rl
We suppose that the function Rl satisfy the following properties:

1. Rl is analytic (it involves the sum over a finite number of degrees of freedom, no
   matter how complicated it may be).

2. The set of transformations Rl forms a semigroup, because if we subsequently
   apply two transformations Rl1 and Rl2 on two different length scales l1 and l2
   we have:
   \[ [K'] = R_{l_1}[K], \qquad [K''] = R_{l_2}[K'] = R_{l_2} \circ R_{l_1}[K] \]
   Hence, we have the relation
   \[ R_{l_2 l_1}[K] = R_{l_2} \circ R_{l_1}[K] \tag{13.2} \]
   Remark. Note that it is a semigroup and not a group, since in general the inverse
   transformation does not exist; in fact, we must always have l > 1. (A small numerical
   illustration of the semigroup property is sketched right after this list of properties.)

There is no general way to construct the function Rl: depending on the system
and on the case considered we can choose different ways to carry out the
decimation, and in general (as we will see) for a given system many different
RG transformations can be built. In general such procedures can be done ei-
ther in coordinate space (real space Renormalization Group) or in Fourier space
(momentum shell Renormalization Group).
In terms of the coupling constants [K] the partition function of the original
system is:
\[ Z_N[K] = \mathrm{Tr}\, e^{-\beta H[K]} \]

while the free energy density is

\[ f_N[K] = -\frac{k_B T}{N} \log Z_N[K] \]

Now, if the RG transformation integrates the degrees of freedom on the spatial
scale la, then the number of degrees of freedom will decrease by a factor l^d, if d
is the dimensionality of the system; in other words, after the RG transformation
Rl the number N of degrees of freedom is reduced to

\[ N_l = \frac{N}{l^d} \]

3. The new hamiltonian Hl [K 0 ] can be (and in general it is) different from the
previous one H[K] (this is the main difference with the Kadanoff’s theory), but
the effective Hamiltonian Hl [K 0 ] must have the same symmetry properties of
the original one!
This means (and this is the great improvement with respect to Kadanoff’s
argument) that the decimation can make some new terms appear in the coarse-
grained Hamiltonian, as long as they respect the same symmetries of the original
system. In other words, if K_m = 0 in H_N but its corresponding term is allowed by
the symmetry group of H_N itself, then we can have K'_m ≠ 0 in H'_{N'}.
For example, if we start from

\[ H_N = N K_0 + K_2 \sum_{ij} S_i S_j \]

and terms such as K_1 or K_3 (odd in the spins) are not allowed by the symmetry group
of H_N, the decimation cannot produce

\[ H'_{N'} = N' K'_0 + K'_1 \sum_I S_I + K'_2 \sum_{IJ} S_I S_J + K'_3 \sum_{IJK} S_I S_J S_K \]

4. The invariance condition is not on H (as in Kadanoff's theory) but on Z; hence,
   the coarse-graining procedure (decimation) leaves invariant the partition function
   instead of the Hamiltonian:

   \[ Z_{N'}[K'] = Z_N[K] \tag{13.3} \]

   This invariance condition has a consequence on the free energy:

   \[ f_N[K] \simeq \frac{1}{N} \log Z_N[K] \simeq \frac{l^d}{l^d N} \log Z_{N'}[K'] \simeq l^{-d} \frac{1}{N'} \log Z_{N'}[K'] \]

   where N' = N/l^d. Hence,

   \[ f[K] \propto l^{-d} f[K'] \tag{13.4} \]
which is the scaling form of the free energy density as obtained by Kadanoff.
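As a small numerical illustration of the semigroup property (13.2) mentioned in the list above, here is a sketch that uses the d = 1 Ising decimation relation tanh K' = (tanh K)^l, which will be derived in Sec. 13.3.1:

\begin{verbatim}
import numpy as np

# Semigroup property R_{l1 l2} = R_{l2} o R_{l1}, illustrated with the
# d=1 Ising decimation (tanh K' = (tanh K)^l, derived in Sec. 13.3.1).
def R(K, l):
    return np.arctanh(np.tanh(K) ** l)

K, l1, l2 = 0.7, 2, 3
print(R(R(K, l1), l2))   # compose R_{l1} and then R_{l2}
print(R(K, l1 * l2))     # single transformation of scale l1*l2: same result
\end{verbatim}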

13.1.2 Singular behaviour in RG


We have stated that the RG transformations Rl are analytic, hence a single Rl
involves the integration of a finite number of degrees of freedom and the free energy
cannot develop the singular behaviour we are looking for; so, we might ask: where
does the singular behaviour of a system near a critical point come from? This occurs
in the thermodynamic limit, which in this case is obtained when we apply the RG
transformation an infinite number of times. Indeed, in order to integrate a thermo-
dynamic number of degrees of freedoms, one has to apply an infinite number of RG
transformations.
In general, after the n-th iteration of the RG, the coarse-graining length of the
system will be l → ln and the coupling constants [K] → [K (n) ]. As n increases
the "vector" of coupling constants describes a "trajectory" in the space of all the
possible coupling constants [K], often called Hamiltonian space, or theory space;
by varying the initial conditions, namely different initial Hamiltonians, one obtains a
flux of trajectories (i.e. the set of all the trajectories that start from different initial
conditions). In general, these trajectories can form strange attractors or complex limit
cycles; however, it is almost always found that they are simply attracted towards or
ejected from fixed points (cases where this doesn’t occur are really exotic), so in
the following we will assume that the flux of trajectories only exhibits fixed points.
The study of the properties of the flux of trajectories near these fixed points is
crucial, since as we will see it is this that will allow us to actually explain universality
and predict the values of the critical exponents (the scaling behaviour introduced by
Widom is related to the behaviour of the trajectories close to some fixed points). We
therefore proceed to study such points.

13.1.3 Zoology of the fixed points


Suppose we know Rl . If [K ∗ ] is a fixed point of the flux of trajectories, by definition
we have:
Rl [K ∗ ] = [K ∗ ] (13.5)
Then, in general from the Hamiltonian of a system we can determine the correlation
length ξ, and if [K 0 ] = Rl [K] we know that:

\[ \xi[K'] = \frac{\xi[K]}{l} \equiv \xi_l \]

Therefore for a fixed point we have:

\[ \xi[K^*] = \frac{\xi[K^*]}{l} \tag{13.6} \]

which implies two cases:

\[ \xi[K^*] = \begin{cases} 0 & \text{trivial} \\ \infty & \text{critical} \end{cases} \tag{13.7} \]
A fixed point with ξ = ∞ is called critical, while if ξ = 0 trivial. Clearly, every fixed
point [K ∗ ] can have its own basin of attraction, i.e. a set of points {[K]} that
under the action of the flux of trajectories tend to [K ∗ ]:

\[ R_l^{(n)}[K] \xrightarrow{\; n \to \infty \;} [K^*] \tag{13.8} \]

An important result concerning the basin of attraction of critical fixed points is the
following:

Theorem 4
All the points [K] belonging to a basin of attraction of a critical fixed point have
the correlation length ξ = ∞.

Proof. Call [K] the initial set of coupling constants, after n iterations of the RG the
correlation length of the system will be such that:

\[ \xi[K] = l\, \xi[K^{(1)}] = \cdots = l^n\, \xi[K^{(n)}] \]

If we now take the limit n → ∞ we have K^{(n)} → K^*, i.e. [K] belongs to the basin of
attraction of [K^*]. Hence, since ξ[K^*] = ∞,

\[ \xi[K] = \lim_{n \to \infty} l^n\, \xi[K^{(n)}] = \infty \]

Therefore, we have ξ[K] = ∞.




The set of [K] that forms the basin of attraction of a critical fixed point is also
called critical manifold.

Universality
All the critical models that belong to the critical manifold have the same critical
behaviour as the corresponding critical fixed point. Hence, we want to study the
behaviour of Rl close to the fixed points.
We can argue that the fact that all the points of a critical manifold flow towards
the same fixed point (i.e. the same Hamiltonian) is the basic mechanism on which
universality rests, but this is by no means a complete explanation, since
universality involves the behaviour of systems near a critical point and we still have
said nothing about that. We can however note the following fact: starting from any
point in Hamiltonian space, iterating the RG transformation and identifying the fixed
point towards which the system flows, the phase of the original point in Hamiltonian
space (i.e. in the phase diagram) will be described by this fixed point. Therefore,
every phase of the system is "represented" by a fixed point of the flux of trajectories.
As we will later see, critical fixed points describe the singular critical behaviour
while trivial fixed points are related to the bulk phases of the system: therefore, the
knowledge of the location and nature of the fixed points of the flux of trajectories can
give us hints on the structure of the phase diagram of the system, and the behaviour
of the flow near critical fixed points allows us to calculate the values of the critical
exponents.

13.1.4 Linearization of RG close to the fixed points and critical exponents
In order to study the behaviour of the flux of trajectories near a fixed point [K ∗ ]
of Rl , let us take a slight perturbation from it, namely we set:

\[ \vec{K} = \vec{K}^* + \delta\vec{K} \tag{13.9} \]

where $\delta\vec{K}$ is a small displacement. Applying the RG transformation, in components
we will have:

\[ K'_j = (R_l)_j(\vec{K}^* + \delta\vec{K}) = K^*_j + \sum_i \left( \frac{\partial K'_j}{\partial K_i} \right)_{\vec{K}^*} \delta K_i + O\!\left( \delta K_i^2 \right) \]

Neglecting all the terms beyond the linear ones, we can write the action of the
linearised RG transformation in terms of the displacements $\delta\vec{K}$ and $\delta\vec{K}'$ as:

\[ \delta\vec{K}' = \bar{\pi}\, \delta\vec{K} \tag{13.10} \]

where

\[ (\bar{\pi})_{ij} = \left( \frac{\partial K'_j}{\partial K_i} \right)_{\vec{K}^*} \tag{13.11} \]
Of course π̄ is a square matrix, but:
1. π̄ is in general not symmetric, so one has to distinguish between left and right
eigenvectors;

2. π̄ is not always diagonalizable, and sometimes its eigenvalues can be complex.
For most physical systems, however, π̄ can be diagonalized and the eigenvalues
are real.

However, we suppose π̄ to be symmetric (which, as before, is almost always the case)
so that it can be diagonalized. If we call $\lambda_l^{(\sigma)}$ and $\vec{e}^{\,(\sigma)}$ the σ-th eigenvalue and the relative
eigenvector of $\bar{\pi}^{(l)}$ (where we are explicitly writing the length scale of the decimation),
in components the action of $\bar{\pi}^{(l)}$ will be:

\[ \bar{\pi}^{(l)}_{ij}\, \vec{e}^{\,(\sigma)}_j = \lambda_l^{(\sigma)}\, \vec{e}^{\,(\sigma)}_i \tag{13.12} \]

This is the eigenvalue equation. From the semigroup property of the RG transformation Rl, we have:

\[ \bar{\pi}^{(l)} \bar{\pi}^{(l')} = \bar{\pi}^{(l l')} \]

Hence, we have:

\[ \lambda_l^{(\sigma)}\, \lambda_{l'}^{(\sigma)} = \lambda_{l l'}^{(\sigma)} \tag{13.13} \]
This is a functional equation which can be solved in the following way: if we write
the eigenvalues explicitly as functions of l, namely $\lambda_l^{(\sigma)} = \lambda^{(\sigma)}(l)$, then differentiating
with respect to l':

\[ \lambda^{(\sigma)}(l)\, \lambda'^{(\sigma)}(l') = l\, \lambda'^{(\sigma)}(l l') \]

where with λ' we mean that λ has been differentiated with respect to its argument.
Setting now l' = 1 and defining $\lambda'^{(\sigma)}(1) = y_\sigma$ we get:

\[ \frac{\lambda'^{(\sigma)}(l)}{\lambda^{(\sigma)}(l)} = \frac{y_\sigma}{l} \]

which is easily solved to give:

\[ \lambda_l^{(\sigma)} = l^{y_\sigma} \tag{13.14} \]
where, as we have defined it, y_σ is a number (to be determined) independent of l
(it is the scaling exponent). To see how $\delta\vec{K}$ changes under the action of the linearized
transformation π̄, let us find out how its components along the directions determined
by the eigenvectors $\vec{e}^{\,(\sigma)}$ change. In other words, let us expand $\delta\vec{K}$ in terms of the $\vec{e}^{\,(\sigma)}$
(which form a complete orthonormal basis):

\[ \delta\vec{K} = \sum_\sigma a^{(\sigma)} \vec{e}^{\,(\sigma)}, \qquad a^{(\sigma)} = \vec{e}^{\,(\sigma)} \cdot \delta\vec{K} \tag{13.15} \]

and applying $\bar{\pi}^{(l)}$:

\[ \delta\vec{K}' = \bar{\pi}\, \delta\vec{K} = \sum_\sigma a^{(\sigma)}\, \bar{\pi}\, \vec{e}^{\,(\sigma)} = \sum_\sigma a^{(\sigma)} \lambda_l^{(\sigma)} \vec{e}^{\,(\sigma)} \equiv \sum_\sigma a'^{(\sigma)} \vec{e}^{\,(\sigma)} \tag{13.16} \]

where in the last step we have defined the components a'^{(σ)}, which are the projections
of $\delta\vec{K}'$ along the $\vec{e}^{\,(\sigma)}$.

Remark. Orthonormality is not always guaranteed, since in general π̄ is not symmetric!


We therefore see that the behaviour of $\delta\vec{K}$ along the eigenvectors $\vec{e}^{\,(\sigma)}$ depends
on the magnitudes of the eigenvalues $\lambda_l^{(\sigma)}$: some components of $\delta\vec{K}$ grow under the
action of π̄ while some others shrink. In particular, if we order the eigenvalues in
descending order

\[ \lambda_1 \ge \lambda_2 \ge \lambda_3 \ge \cdots \ge \lambda_\sigma \]
we can distinguish three cases:

(σ)
1. case λl > 1 (i.e. y σ > 0): implies that a(σ) grows under π̄. These are called
relevant eigenvalues/eigenvectors;

(σ)
2. case λl < 1 (i.e. y σ < 0): implies that a(σ) decreases under π̄. These are
called irrelevant eigenvalues/eigenvectors;

(σ)
3. case λl = 1 (i.e. y σ = 0): implies that a(σ) remains constant under π̄
(its behaviour can depend on the higher orders in the expansion that we have
neglected). These are called marginal eigenvalues/eigenvectors.
The above analysis implies that: starting from a point close to a critical fixed
point [K ∗ ] (but not on the critical manifold), the trajectory will abandon [K ∗ ] along
the relevant directions whereas it will approach [K ∗ ] along the irrelevant directions.
Hence, we have that:
• The irrelevant eigenvectors form the local basis of the basin of attraction of
[K ∗ ]. Hence, the number of irrelevant directions of a fixed point is equal to the
dimension of its critical manifold.

• The relevant eigenvectors span a sub-space complementary to the basin of
attraction, of codimension C. Hence, the number of relevant directions is equal to
this codimension.
Remark. Let us note that the eigenvalues, and their possible relevance, depend on
the matrix π̄, which in turn depends on the fixed point considered [K ∗ ]: this means
that the terms "relevant", "irrelevant" or "marginal" must always be specified with
respect to the particular fixed point considered.

13.2 The origins of scaling and critical behaviour


Let us consider a fixed point of the flux of trajectories of a generic system, and
assume that it has two relevant directions corresponding to the coupling constants
that for simplicity we defined as

K1 → K1 (T ) = T, K2 → K2 (H) = H

where T is the temperature and H is the external field. We suppose that T and H
are transformed under the RG such that:

\[ T' = R_l^T(T, H), \qquad H' = R_l^H(T, H) \]

where $R_l^T$ and $R_l^H$ are analytic functions given by the coarse graining procedure. The
fixed points $\vec{K}^* = (T^*, H^*)$ of the flux of trajectories will be given by the solutions of:

\[ T^* = R_l^T(T^*, H^*), \qquad H^* = R_l^H(T^*, H^*) \]


with the correlation length that diverges, i.e. ξ(T ∗ , H ∗ ) = ∞. Linearising the trans-
formation around the fixed point (T ∗ , H ∗ ), in terms of the reduced variables (for
standard magnetic systems H ∗ = 0 )

\[ t = \frac{T - T^*}{T^*}, \qquad h = \frac{H - H^*}{H^*} \]

we have:

\[ \begin{pmatrix} t' \\ h' \end{pmatrix} = \bar{\pi} \begin{pmatrix} t \\ h \end{pmatrix} \tag{13.17} \]

where:

\[ \bar{\pi} = \begin{pmatrix} \partial R_l^T / \partial T & \partial R_l^T / \partial H \\ \partial R_l^H / \partial T & \partial R_l^H / \partial H \end{pmatrix}_{T^*, H^*} \tag{13.18} \]

Remark. In this case $\delta\vec{K} = (t, h)$.

In general the eigenvectors $\vec{e}^{\,(\sigma)}$ are linear combinations of t and h. When π̄ can be
diagonalized, t and h are not "mixed up" by the transformation. Hence, as previously
stated, we suppose π̄ to be diagonalizable. We therefore write its eigenvalues as:

\[ \lambda_l^{(t)} = l^{y_t}, \qquad \lambda_l^{(h)} = l^{y_h} \tag{13.19} \]

This way we can write the linear transformation as:


\[ \begin{pmatrix} t' \\ h' \end{pmatrix} = \begin{pmatrix} \lambda_l^{(t)} & 0 \\ 0 & \lambda_l^{(h)} \end{pmatrix} \begin{pmatrix} t \\ h \end{pmatrix} \quad \Rightarrow \quad \begin{pmatrix} t' \\ h' \end{pmatrix} = \begin{pmatrix} l^{y_t}\, t \\ l^{y_h}\, h \end{pmatrix} \tag{13.20} \]

After n iterations we will have:

\[ t^{(n)} = (l^{y_t})^n\, t, \qquad h^{(n)} = (l^{y_h})^n\, h \tag{13.21} \]

On the other hand, since in general we know that

\[ \xi' \equiv \xi(t', h') = \frac{\xi(t, h)}{l} \]

after n iterations we have

\[ \xi(t, h) = l^n\, \xi(l^{n y_t} t, l^{n y_h} h) \tag{13.22} \]

This is the scaling law of the correlation length. From this we can determine the
critical exponent ν. In fact, setting h = 0 (H = H ∗ ) for simplicity, we have

\[ \xi(t, 0) = l^n\, \xi(l^{n y_t} t, 0) \]

Since l is arbitrary, choosing it so that $l^{n y_t} t = b$, with b a fixed positive real number
(b ∈ R⁺; l is not an integer any more), we have:

\[ l^n = \left( \frac{b}{t} \right)^{1/y_t} \quad \Rightarrow \quad \xi(t) = \left( \frac{t}{b} \right)^{-1/y_t} \xi(b, 0) \]

Since for t → 0 in general ξ ∼ t^{-ν}, we get:

\[ \nu = \frac{1}{y_t} \tag{13.23} \]

This is an extremely important result! In fact, we see that once the RG transformation
R is known, yt is straightforward to compute as:
\[ y_t = \frac{\log \lambda_l^{(t)}}{\log l} \]
and so we are actually able to calculate ν and predict its value! We can do even
something more (including giving yh a meaning) from the scaling law of the free
energy density. After n iterations of the RG we have:
\[ f_s(t, h) = l^{-d} f_s(t', h') = l^{-nd} f_s(t^{(n)}, h^{(n)}) = l^{-nd} f_s(l^{n y_t} t, l^{n y_h} h) \tag{13.24} \]

and choosing again l so that $l^{n y_t} t = b^{y_t}$, then:

\[ f_s(t, h) = t^{d/y_t}\, b^{-d}\, f_s\!\left( b^{y_t}, \frac{b^{y_h}\, h}{t^{y_h/y_t}} \right) \]

Comparing this to what we have seen in Eq.(12.14) we get:

\[ 2 - \alpha = d\nu = \frac{d}{y_t}, \qquad \Delta = \frac{y_h}{y_t} \tag{13.25} \]

where yh and yt can be computed as

\[ y_t = \frac{\log \lambda_l^{(t)}}{\log l}, \qquad y_h = \frac{\log \lambda_l^{(h)}}{\log l} \tag{13.26} \]
Hence, we have all we need for the computation!
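To make the recipe concrete, here is a minimal sketch of the last step (the eigenvalues plugged in below are illustrative assumptions, chosen to correspond to the exactly known 2d Ising values y_t = 1 and y_h = 15/8 with l = 2):

\begin{verbatim}
import numpy as np

# From the eigenvalues of the linearized RG to the critical exponents.
# lambda_t, lambda_h and l are assumed known; the numbers below are
# illustrative, matching the exact 2d Ising exponents y_t = 1, y_h = 15/8.
l = 2.0
lam_t, lam_h = l**1.0, l**(15.0 / 8.0)

y_t = np.log(lam_t) / np.log(l)   # Eq.(13.26)
y_h = np.log(lam_h) / np.log(l)

d = 2
nu    = 1.0 / y_t                 # Eq.(13.23)
alpha = 2.0 - d / y_t             # Eq.(13.25)
Delta = y_h / y_t                 # Eq.(13.25)
print(nu, alpha, Delta)           # 1.0, 0.0, 1.875
\end{verbatim}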

13.3 Real space renormalization group (RSRG)


We now want to see how we can build the RG transformation Rl from the coarse
graining procedure. Let us start from the most general form of the Ising model where
σi = ±1 and the two body interaction is

\[ w(\sigma_i, \sigma_j) = -\hat{g} - \frac{\hat{h}}{z}(\sigma_i + \sigma_j) - \hat{K}\,\sigma_i \sigma_j \tag{13.27} \]

where z is the coordination number. The partition function of such a system is:

\[ Z = \sum_{\{\sigma\}} e^{\sum_{\langle ij \rangle} w(\sigma_i, \sigma_j)} \tag{13.28} \]

Let us consider the one-dimensional case d = 1 with nearest-neighbour interaction


and periodic boundary conditions without any external field (H = 0), as in Figure
13.1.

Figure 13.1: One-dimensional Ising model with spins ..., σ_{i−1}, σ_i, σ_{i+1}, ...

The idea of the renormalization group is to perform an integration over some


degrees of freedom and obtain a new partition function with N/ld spins but of the
same form of the one describing the original system with N spins.
There are many ways to perform this partial integration. Here we consider the
decimation procedure with l = 2 and with l = 3. We will see that in the case d = 1
the procedure is clearly exact.

13.3.1 Ising d = 1, RSRG with l = 2


A coarse graining of l = 2 can be obtained for instance by summing over the spins
positioned at the even sites. Before doing this sum it is convenient to rename the
even spins σ_{2i} as S_i and the odd spins (the ones that will survive) σ_{2i+1} as σ'_i.
In this way the sum $\sum_{\{\sigma\}}$ can be partitioned as $\sum_{\{\sigma'_i\}} \sum_{\{S_i\}}$, where now we have
i = 1, ..., N/2. Hence, the partition function becomes

\[ Z_N(g, K, h) = \sum_{\{\sigma_i\}_1^N} e^{\sum_i w(\sigma_i, \sigma_{i+1})} = \sum_{\{\sigma'_i\}_1^{N/2}} \sum_{\{S_i\}_1^{N/2}} e^{\sum_{i=1}^{N/2} \left[ w(\sigma'_i, S_i) + w(S_i, \sigma'_{i+1}) \right]} \tag{13.29} \]

Carrying out the sum over each S_i separately, we obtain:

\[ Z_N = \sum_{\{\sigma'_i\}_1^{N/2}} \prod_{i=1}^{N/2} \left[ \sum_{S_i = \pm 1} e^{w(\sigma'_i, S_i) + w(S_i, \sigma'_{i+1})} \right] \]

Let us note that:

\[ \sum_{S_i = \pm 1} e^{w(\sigma'_i, S_i) + w(S_i, \sigma'_{i+1})} = f(\sigma'_i, \sigma'_{i+1}) \]

If we can write $f(\sigma'_i, \sigma'_{i+1})$ as an exponential $\exp\!\left[ w'(\sigma'_i, \sigma'_{i+1}) \right]$, we obtain:

\[ Z_N(g, h, K) = \sum_{\{\sigma'_i\}_1^{N/2}} e^{\sum_{i=1}^{N/2} w'(\sigma'_i, \sigma'_{i+1})} \tag{13.30} \]

It has a similar form to the original one (Eq.(13.29)) but with N/2 spins. Supposing
that also $w'(\sigma'_i, \sigma'_{i+1})$ can be written as

\[ g' + \frac{h'}{z}(\sigma'_i + \sigma'_{i+1}) + K' \sigma'_i \sigma'_{i+1} \]

we have

\[ Z_N(g, h, K) = Z_{N/2}(g', h', K') \tag{13.31} \]
as requested by the renormalization group. In order to find g 0 , h0 , K 0 as a function of
g, h, K, we should satisfy the relation
\[ e^{\, g' + \frac{h'}{2}(\sigma'_i + \sigma'_{i+1}) + K' \sigma'_i \sigma'_{i+1}} = \sum_{S_i = \pm 1} e^{\, \overbrace{g + \frac{h}{2}(\sigma'_i + S_i) + K \sigma'_i S_i}^{w(\sigma'_i, S_i)} \; + \; \overbrace{g + \frac{h}{2}(S_i + \sigma'_{i+1}) + K S_i \sigma'_{i+1}}^{w(S_i, \sigma'_{i+1})}} \tag{13.32} \]

for all values of the pair $(\sigma'_i, \sigma'_{i+1})$.

Remark. Recall that the coordination number is z = 2d for a hypercubic lattice;
in this case we have d = 1, hence z = 2.
For simplicity let us define

\[ x \equiv e^K, \quad y \equiv e^h, \quad z \equiv e^g, \qquad x' \equiv e^{K'}, \quad y' \equiv e^{h'}, \quad z' \equiv e^{g'} \tag{13.33} \]

Hence, the previous condition becomes

\[ z'\, y'^{\frac{\sigma'_i + \sigma'_{i+1}}{2}}\, x'^{\sigma'_i \sigma'_{i+1}} = z^2 \sum_{S_i = \pm 1} y^{\frac{\sigma'_i + S_i}{2}}\, x^{\sigma'_i S_i}\, y^{\frac{S_i + \sigma'_{i+1}}{2}}\, x^{S_i \sigma'_{i+1}} \]

Let us consider all the cases for the different values of σ'_i and σ'_{i+1}; we obtain three
independent equations in three variables:
independent equations and three variables:



• Case σ'_i = +1 and σ'_{i+1} = +1:

\[ z' y' x' = z^2\, y \left( x^2 y + x^{-2} y^{-1} \right) \tag{13.34} \]

• Case σ'_i = −1 and σ'_{i+1} = −1:

\[ z' y'^{-1} x' = z^2\, y^{-1} \left( x^{-2} y + x^2 y^{-1} \right) \tag{13.35} \]

• Case σ'_i = +1 and σ'_{i+1} = −1, or σ'_i = −1 and σ'_{i+1} = +1:

\[ z' x'^{-1} = z^2 \left( y + y^{-1} \right) \tag{13.36} \]

in the two cases the relation obtained is the same.

Now, we rearrange these independent equations in the following way:

• By multiplying Eq.(13.34)×(13.35)×(13.36)² we obtain:

\[ z'^4 = z^8 \left( y + y^{-1} \right)^2 \left( x^2 y + x^{-2} y^{-1} \right) \left( x^{-2} y + x^2 y^{-1} \right) \]

• By dividing Eq.(13.34)/(13.35) we obtain:

\[ y'^2 = y^2\, \frac{x^2 y + x^{-2} y^{-1}}{x^{-2} y + x^2 y^{-1}} \]

• By combining the three equations as ((13.34)×(13.35))/(13.36)², we obtain:

\[ x'^4 = \frac{\left( x^2 y + x^{-2} y^{-1} \right) \left( x^{-2} y + x^2 y^{-1} \right)}{\left( y + y^{-1} \right)^2} \]

Let us note that g and g 0 are constant factors of the partition function Z and can
be absorbed by a shift in the free energies. Moreover g (i.e. z ≡ eg ) does not enter in
the expression of h0 and K 0 ; the two relevant renormalization equations are the ones
for x0 and y 0 :
\[ y'^2 = y^2\, \frac{x^2 y + x^{-2} y^{-1}}{x^{-2} y + x^2 y^{-1}}, \qquad x'^4 = \frac{\left( x^2 y + x^{-2} y^{-1} \right) \left( x^{-2} y + x^2 y^{-1} \right)}{\left( y + y^{-1} \right)^2} \]
Taking the logarithm of these equations we obtain:


\[ \begin{cases} 2h' = 2h + \log\!\left( e^{2K+h} + e^{-2K-h} \right) - \log\!\left( e^{-2K+h} + e^{2K-h} \right) \\[4pt] 4K' = \log\!\left( e^{2K+h} + e^{-2K-h} \right) + \log\!\left( e^{-2K+h} + e^{2K-h} \right) - 2 \log\!\left( e^{h} + e^{-h} \right) \end{cases} \tag{13.37} \]
Hence, we have obtained the renormalization group transformation

\[ (h', K') = R_l(h, K) \]

Now, let us look for the fixed points (h*, K*); then we will linearize around (h*, K*)
and find the relevant eigenvalues.
Let us consider for simplicity h = 0, so that y ≡ e^h = 1. The equations simplify to

\[ y'^2 = \frac{x^2 + x^{-2}}{x^{-2} + x^2}, \qquad x'^4 = \frac{\left( x^2 + x^{-2} \right) \left( x^{-2} + x^2 \right)}{4} \]

Hence, we have:

\[ y' = 1, \qquad x'^4 = \frac{\left( x^2 + x^{-2} \right)^2}{4} \tag{13.38} \]

Therefore, K is the only variable left, because we have obtained h' = 0.
By substituting x ≡ e^K, we have:

\[ x'^4 = \frac{\left( x^2 + x^{-2} \right)^2}{4} \quad \Rightarrow \quad e^{4K'} = \left( \frac{e^{2K} + e^{-2K}}{2} \right)^2 = \cosh^2(2K) \]
In conclusion we obtain the RG equation:

\[ K' = \frac{1}{2} \log\!\left( \cosh(2K) \right) \tag{13.39} \]

By rearranging we have:

\[ e^{2K'} = \cosh(2K) = 2\cosh^2(K) - 1 \quad \Rightarrow \quad e^{2K'} - 1 = 2\cosh^2(K) - 2 = 2\sinh^2(K) \]

Similarly,

\[ e^{2K'} + 1 = 2\cosh^2(K) \]

Hence,

\[ \frac{e^{2K'} - 1}{e^{2K'} + 1} = \tanh K' = \tanh^2(K) \]
Thus we can rewrite the RG equation (13.39) as:

\[ K' = \tanh^{-1}\!\left[ (\tanh K)^2 \right] \tag{13.40} \]

Rewritten in the form

\[ y' = y^2 \tag{13.41} \]

where y ≡ tanh K, its fixed points are given by

\[ y^* = y^{*\,2} \]
whose solutions are y ∗ = 1 and y ∗ = 0. Let us consider the two cases separately:

• Case y* = 1⁻ (K → ∞, T → 0⁺): since tanh K < 1 ∀K ∈ R, starting from
any initial point y_0 < 1 the recursion relation y' = y² makes y smaller every
time, moving it towards the fixed point y* = 0 ($R_l^{(n)}(y_0) \xrightarrow{n\to\infty} 0^+$). We can
thus conclude that y* = 1 is an unstable fixed point.

• Case y* = 0⁺ (K → 0⁺, T → ∞): for all y_0 < 1 we have $R_l^{(n)}(y_0) \xrightarrow{n\to\infty} 0^+$.
Hence, y* = 0 is a stable fixed point.

As expected for the one-dimensional Ising model, no critical fixed point for T 6= 0
is found (see the flux of trajectories in Figure 13.2). Moreover, note that the fact that
the flow converges towards T = ∞ means that on large spatial scales the system is
well described by a Hamiltonian with a high effective temperature, and so the system
will always be in the paramagnetic phase (a part when T = 0).
Remark. For a generic scaling parameter l one can see that

\[ \tanh K' = (\tanh K)^l \]

Hence, we have

\[ y' = y^l, \qquad l \ge 2 \]
Therefore, the analysis for a scaling parameter l ≥ 2 is similar to the l = 2 case.
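A short numerical sketch of this flow (iterating the l = 2 recursion K' = ½ ln cosh(2K); the starting value below is an arbitrary choice) makes the absence of a finite-temperature critical point explicit:

\begin{verbatim}
import numpy as np

# RG flow of the d=1 Ising coupling under the l=2 decimation,
# K' = (1/2) ln cosh(2K), i.e. tanh K' = tanh^2 K.
def step(K):
    return 0.5 * np.log(np.cosh(2.0 * K))

K = 2.0                     # arbitrary large initial coupling (low T)
flow = [K]
for _ in range(10):
    K = step(K)
    flow.append(K)
print(flow)                 # monotonically decreasing towards K* = 0:
                            # the flow always ends in the paramagnetic phase
\end{verbatim}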

Figure 13.2: One-dimensional flux of trajectories for the Ising model under the recursion relation
tanh(K') = tanh²(K): the flow runs from the unstable fixed point at T = 0 (y = 1, K = ∞) towards
the stable paramagnetic fixed point at T → ∞ (y = 0, K = 0⁺).

13.3.2 Ising d = 1, RSRG with l = 3


Let us consider again a one-dimensional Ising model with nearest-neighbour in-
teraction and periodic boundary conditions, without any external field (H = 0). We
choose to apply the coarse-graining procedure to our system by grouping spins in
blocks of three (l = 3); this way the (i + 1)-th block (with i = 0, 1, 2 . . .) will be con-
stituted by the spins S1+3i , S2+3i and S3+3i (for example, the first block is [S1 , S2 , S3 ],
the second one [S4 , S5 , S6 ] and so on).

a=1

S1 S2 S3 S4 S5 S6 S7 S8 S9

Figure 13.3: One-dimensional lattice for the d = 1 Ising model. We group spins in block
of three.

In order to define the new block spin we could use the majority rule, but we further
simplify the problem requiring that the new block spin SI0 coincides with the central
spin S2+3i of the block. In other words, for every block we set:

\[ P(S'_I; S_{1+3i}, S_{2+3i}, S_{3+3i}) = \delta_{S'_I, S_{2+3i}} \]

(for example for the first block we have P (S10 ; S1 , S2 , S3 ) = δS10 ,S2 ). We will see later
in Sec.13.4 the meaning of the projector P .
Therefore, the coarse-graining procedure consists in summing over the spins at the
boundaries of the blocks and leaving untouched the central ones. In Figure 13.3 we
represent the situation, where the spins over which we sum are indicated by an
empty crossed circle and the ones left untouched by a full red circle. We obtain a new
lattice as represented in Figure 13.4.

Figure 13.4: The lattice (lattice constant l = 3) after the sum over the spins represented by an empty circle.

Now, using the notation that will be introduced later in Sec.13.4 for the general

theory, we have:

0
Y P
e−βH = Tr{Si } P (SI0 , Si )e−βH = T r{Si } δSI0 ,S2+3i eK j Sj Sj+1

I
X
KS1 S2 KS2 S3 KS3 S4 KS4 S5
= δS10 ,S2 δS20 ,S5 · · · e e e e ··· (13.42)
{Si =±1}
0 0 0
X
= eKS1 S1 eKS1 S3 eKS3 S4 eKS4 S2 · · ·
{Si =±1}

Let us see how it works for the two blocks [S1 , S2 , S3 ] and [S4 , S5 , S6 ]. First of all, we
call
\[ S_2 \equiv S'_1, \qquad S_5 \equiv S'_2 \qquad \text{(fixed)} \]

Hence, we have:

\[ \sum_{S_3 = \pm 1} \sum_{S_4 = \pm 1} e^{K S'_1 S_3 + K S_3 S_4 + K S_4 S'_2} \]

From the definitions of cosh and sinh we can write:

\[ e^{K S_3 S_4} = \cosh K\, (1 + x S_3 S_4) \tag{13.43} \]

where

\[ x \equiv \tanh K \]

so that the sum over S_3 and S_4 becomes:

\[ (\cosh K)^3 \sum_{S_3, S_4} \left( 1 + x S'_1 S_3 \right) \left( 1 + x S_3 S_4 \right) \left( 1 + x S_4 S'_2 \right) \]

Expanding the product and keeping in mind that Si2 = +1, we get:

\[ (1 + x S'_1 S_3)(1 + x S_3 S_4)(1 + x S_4 S'_2) = 1 + x S'_1 S_3 + x S_3 S_4 + x S_4 S'_2 + x^2 S'_1 S_4 + x^2 S'_1 S_3 S_4 S'_2 + x^2 S_3 S'_2 + x^3 S'_1 S'_2 \]

and clearly all the terms containing S_3 or S_4 (or both) vanish when we perform the
sum $\sum_{S_3, S_4 = \pm 1}$. Therefore, the result of the partial sum for the first two blocks is:
P

Y
22 (cosh K)3 (1 + x3 S10 S20 )
I

where the term 22 comes from the fact that the constant terms 1 and x3 S10 S20 must
be summed 22 times, two for the possible values of S3 and two for S4 . Therefore, the
partition function of the block spin system will be:
h i
0 0 2N 0 3N 0 3 0
ZN 0 (K ) = Tr 0
{SI } 2 (cosh K) 1 + x S S
I I+1 (13.44)

where N' = N/3 is the new number of spin variables. This must have the same form
as Z_N(K). We know that in general

\[ Z_{N'} = \mathrm{Tr}_{\{S'_I\}}\, e^{-\beta H'}, \qquad -\beta H'(\{S'_I\}) = N' g(K, K') + K' \sum_I S'_I S'_{I+1} \]

so let us try to write $Z_{N'}[K']$ in this form. Let us note that
0

\[ \begin{aligned} 2^2 (\cosh K)^3 \left( 1 + x^3 S'_I S'_{I+1} \right) &= 2^2 \frac{(\cosh K)^3}{\cosh K'}\, \cosh K' \left( 1 + x^3 S'_I S'_{I+1} \right) \\ &\stackrel{(a)}{=} 2^2 \frac{(\cosh K)^3}{\cosh K'}\, \cosh K' \left( 1 + x' S'_I S'_{I+1} \right) \\ &\stackrel{(b)}{=} 2^2 \frac{(\cosh K)^3}{\cosh K'}\, \exp\!\left( K' S'_I S'_{I+1} \right) \\ &= e^{\, 2\ln 2 + \ln \frac{(\cosh K)^3}{\cosh K'} + K' S'_I S'_{I+1}} \end{aligned} \]

where in (a) we have defined x3 ≡ x0 , so that (tanh K)3 = tanh K 0 , and where in (b)
we have used the relation Eq.(13.43). Hence, we have:

(cosh K)3
 
0
g(K, K ) = 2 ln 2 + ln (13.45)
cosh K 0

The new effective Hamiltonian has therefore the same form of the original one with the
redefined coupling constant K 0 , and exhibits also a new term (g(K, K 0 )) independent
of the block spins.
Let us note that (tanh K)3 = tanh K 0 is the recursion relation we are looking for:

\[ K' = \tanh^{-1}\!\left( \tanh^3 K \right) \tag{13.46} \]
Rewritten in the form x0 = x3 , its fixed points are given by:

\[ x^* = x^{*\,3} \]

whose solutions are:

\[ \begin{cases} x^* = 0 \iff K \to 0 \iff T \to \infty \\ x^* = 1 \iff K \to \infty \iff T \to 0 \end{cases} \tag{13.47} \]

Since tanh K < 1∀K ∈ R, starting from any initial point x0 < 1 the recursion relation
x0 = x3 makes x smaller every time, moving it towards the fixed point x∗ = 0. We
can thus conclude that x∗ = 1 is an unstable fixed point while x∗ = 0 is stable, as
graphically represented in Figure 13.5.

Figure 13.5: RG flux of trajectories for the recursion relation (tanh K)³ = tanh K': the flow runs from the unstable fixed point x = 1 (K = ∞, T = 0) towards the stable fixed point x = 0⁺ (K = 0⁺, T = ∞).

Note that the fact that the flow converges towards T = ∞ means that on large
spatial scales the system is well described by a Hamiltonian with a high effective
temperature, and so the system will always be in the paramagnetic phase (a part
when T = 0).
Let us now see how the correlation length transforms. We know that in general,
if the decimation reduces the number of spins by a factor l (in the case we were
considering above, l = 3) we have to rescale distances accordingly, and in particular:

\[ \xi(x') = \frac{\xi(x)}{l} \]

where in general x' = x^l. Since l is in general arbitrary, we can choose

\[ l = c / \ln x \]

and thus:

\[ \xi(x') = \xi(x^l) = \xi\!\left( e^{l \ln x} \right) = \xi(e^c) = \frac{\xi(x)}{l} = \left( \frac{c}{\ln x} \right)^{-1} \xi(x) \]

Therefore:

\[ \xi(e^c) = \frac{\ln x}{c}\, \xi(x) \quad \Rightarrow \quad \xi(x) = \frac{c\, \xi(e^c)}{\ln x} = \frac{\mathrm{const}}{\ln x} \]

Finally, substituting x = tanh K, we obtain:

\[ \xi(K) = \frac{\mathrm{const}}{\ln(\tanh K)} \tag{13.48} \]
which is the exact result we have found at the end of the transfer matrix method for
d = 1. Let us note that for K → ∞ (i.e. x → 1), we have
\[ \xi \sim e^{\mathrm{const}/T} \]

which is finite ∀ K ≠ ∞.
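A quick numerical sketch comparing the standard transfer-matrix result ξ(K) = −1/ln(tanh K) (in units of the lattice spacing; this fixes the constant in Eq.(13.48)) with its low-temperature asymptotics:

\begin{verbatim}
import numpy as np

# d=1 correlation length xi(K) = -1/ln(tanh K) (transfer-matrix result)
# versus its low-temperature asymptotics xi ~ e^{2K}/2.
for K in (1.0, 2.0, 3.0, 4.0):
    xi_exact  = -1.0 / np.log(np.tanh(K))
    xi_asympt = np.exp(2.0 * K) / 2.0
    print(K, xi_exact, xi_asympt)   # agreement improves as K grows (T -> 0)
\end{verbatim}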

13.3.3 Decimation procedure for d > 1: proliferation of the interactions
As we have stated, in d = 1 the recursion relations can be determined without
great problems and they don’t introduce new interactions. However, this is not the
case if d > 1, and the value of the new coupling constants can’t be determined
exactly, forcing us to use approximations. Let us see with a generic example how the
RG transformation can introduce new interactions in a two-dimensional Ising model
with nearest-neighbour interactions.
Suppose we divide our system in blocks containing an odd number of spins and,
similarly to what we have seen for d = 1 in the previous sections, we sum over the
spins on the boundary of the block and leave unchanged the one at the center.
We have problems with the spins at the boundaries. One can see that, in order to
satisfy the condition

\[ Z_N = Z_{N'} \]

in addition to the parameters g' and K' at least one more interaction coupling must
be considered.
Looking at Figure 13.6, we see that the spin on the corner of block 2 is coupled to one
spin in block 1 and one in block 3. When we sum over the spin in block 2, an effective
coupling between blocks 1 and 3 will be established: we therefore see that the
coarse-graining procedure introduces next-nearest-neighbour interactions between the
blocks, so new terms appear in the Hamiltonian (which of course, as already stated,
respect the symmetries of the original one):

\[ H \to H' = H'(K', K'_2) \]

Figure 13.6: Introduction of a new effective interaction.
This unfortunately occurs at each iteration.
Therefore, we understand that the iteration of the RG will introduce increasingly com-
plicated couplings: this is the so called proliferation of interactions. In order to solve
it, approximations are necessary.

Figure 13.7: Decimation for the two-dimensional Ising model with l = 2. (a) Square lattice: we sum over the red cross spins. (b) New interactions introduced: K', L', Q'.

Example 35: Decimation of Ising on square lattice


Let us now see in detail how to face the problem of the proliferation for a two-
dimensional Ising model with nearest-neighbour interaction and H = 0. We
choose to coarse-grain the system summing over a "chessboard" of spins, as
shown in Figure 13.7. We call x the spin to sum over (red cross), while with o
the other one (black circle).
The partition function in this case is:
X X P w(S ,S ) X X K P S ,S
i j hiji i j = Z 0 0 0 0
ZN (K, g) = e hiji = e N/2 (K , L , Q , g )
{o} {x} {o} {x}

By performing the sum and imposing the constraint ZN (K, g) =


ZN/2 (K 0 , L0 , Q0 , g 0 ), it is possible to show:

0 1
k = 4 ln cosh 4K

L0 = 18 ln cosh 4K
 0 1
Q = 8 ln cosh 4K − 12 ln cosh 2K

This way, besides nearest-neighbour interactions (K), we are also introducing
next-nearest-neighbour ones (K' and L') and four-spin cluster interactions (Q').
Note also that the final set of spins resides on a square lattice whose side is √2
times the original one, so the scaling factor is l = √2.
If we now reiterate the procedure, more complicated interactions will appear
and the problem becomes rapidly intractable. We therefore must do some ap-
proximations:

1. We choose to neglect Q' (also because it is the only coupling that can
become negative and thus prevent the spins from aligning).

2. We omit the explicit dependence on L' by defining a new constant K':

\[ K' + L' \to K' \]

This way the recursion relation involves only K.

We end up with only one recursion:


\[ K' = \frac{3}{8} \ln \cosh 4K \]

Let us therefore see to which conclusions this leads. The fixed points are given by:

\[ K^* = \frac{3}{8} \ln \cosh 4K^* \]

and the non-trivial (K ≠ 0) numerical solution of this equation is Kc = 0.50698...;
the exact value found by Onsager is K_exact = 0.44069..., so our approximation is
reasonably good. If we now set the initial value K_0, then we have:

\[ \begin{cases} K^{(n)} \xrightarrow{\; n \to \infty \;} \infty & K_0 > K_c \\[2pt] K^{(n)} \xrightarrow{\; n \to \infty \;} 0 & K_0 < K_c \end{cases} \]

Thus, the fixed points K* = 0 and K* = ∞ are stable, while K* = Kc is
unstable; this can be visually represented as in Figure 13.8.
Let us now linearise the recursion relation near Kc and compute a couple of
critical exponents. On the basis of what we have previously seen, if we call
δK = K − Kc and δK' = K' − Kc, we have:

\[ \delta K' = \lambda_t\, \delta K = l^{y_t}\, \delta K \]

where

\[ \lambda_t = \left( \frac{dK'}{dK} \right)_{K = K_c} \]

Therefore, since l = √2, we get:

\[ y_t = \frac{\ln \lambda_t}{\ln l} = \frac{1}{\ln \sqrt{2}} \ln\!\left[ \left( \frac{dK'}{dK} \right)_{K = K_c} \right] = \frac{1}{\ln \sqrt{2}} \ln\!\left[ \frac{d}{dK}\!\left( \frac{3}{8} \ln \cosh 4K \right)\Big|_{K = K_c} \right] = 1.070\ldots \]

It is positive, hence λ_t is a relevant eigenvalue. We also have the critical exponents:

\[ \nu = \frac{1}{y_t} = 0.9345\ldots, \qquad \alpha = 2 - \frac{d}{y_t} = 0.1313\ldots \]

Onsager's exact result, as we know, gives α = 0 (since the specific heat diverges
logarithmically, c_V ∼ ln t) and thus ν = 1.
Onsager’s exact result, as we know, gives α = 0 (since the specific heat di-
verges logarithmically, cV = ln t) and thus ν = 1. We therefore see that our
approximation is sufficiently accurate (even if still improvable).

Figure 13.8: RG flux of trajectories of the recursion relation K' = (3/8) ln cosh 4K: the trajectories flow away from the unstable fixed point K* = Kc towards the stable fixed points K* = 0 and K* = ∞.
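A minimal numerical sketch of this calculation (solving K* = (3/8) ln cosh 4K* by bisection and linearizing around it; the bracketing interval is an arbitrary choice):

\begin{verbatim}
import numpy as np

# Non-trivial fixed point and thermal exponent of the approximate
# recursion K' = (3/8) ln cosh(4K), with rescaling factor l = sqrt(2).
def R(K):
    return 3.0 / 8.0 * np.log(np.cosh(4.0 * K))

lo, hi = 0.1, 2.0                 # bracket of the unstable fixed point
for _ in range(100):              # bisection on R(K) - K
    mid = 0.5 * (lo + hi)
    if R(mid) > mid:              # above Kc the flow runs away to K = infinity
        hi = mid
    else:
        lo = mid
Kc = 0.5 * (lo + hi)

lam_t = 1.5 * np.tanh(4.0 * Kc)   # dK'/dK evaluated at Kc
y_t = np.log(lam_t) / np.log(np.sqrt(2.0))
print(Kc, y_t, 1 / y_t, 2 - 2 / y_t)   # ~0.50698, ~1.07, ~0.93, ~0.13
\end{verbatim}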

Migdal-Kadanoff bond moving approximation


The real-space renormalisation group can be performed exactly on the d = 1 Ising
model by decimating a regular sequence of spins. This yields the following recursion
relation for the coupling constant:
\[ K' = \frac{1}{2} \ln \cosh(2K) \]
Unfortunately, this decimation can not be carried out exactly in higher dimensions
and some truncation approximations are necessary. One such scheme, which proved

Figure 13.9: Square lattice for the 2-dimensional Ising model with l = 2. The red cross
are the spins to sum.

to be quite versatile, is the so-called Migdal-Kadanoff renormalisation (it is also called
potential moving), which we illustrate here on the Ising model on the 2-dimensional
square lattice with l = 2 defined in Figure 13.9, where the x are again the spins to sum
over.
The idea is to remove some interactions, i.e. some bonds; however a simple
deletion of some bonds is too strong an approximation, therefore what we do is move
some bonds onto neighbouring ones. In particular, we decimate every other line or
column of spins along each lattice direction, removing the bonds not connected to
the retained spins. Obviously, this simple approximation weakens the whole system,
making it more "one-dimensional". This is remedied by using the unwanted bonds to
reinforce those that are left behind (see Figure 13.10): the spins that are retained are
now connected by a pair of double bonds of strength K → 2K.

Figure 13.10: Migdal-Kadanoff for the Ising model on the square lattice with l = 2. (a) The original lattice: some bonds are moved to neighbouring ones. (b) The lattice after the bond moving step: the double bonds have coupling constant K → 2K.

After we have done this, the sum over x spins is similar to the Ising d = 1 with
l = 2 and gives:
\[ K' = \frac{1}{2} \ln \cosh(2 \cdot 2K) \]

where the crucial difference is the factor 2 inside the cosh(2 · 2K). The fixed points are
given by:

\[ K^* = \frac{1}{2} \ln \cosh(4K^*) \]
Again we find K = 0 and K = ∞ fixed points:
• For K ≫ 1 (T → 0): unlike the d = 1 case, we have K' ≈ ½ ln e^{4K} ≈ 2K, and hence
the low temperature fixed point is also stable.

• For K ≪ 1 (T → ∞): we have K' ≈ ½ ln(1 + 8K²) ≈ 4K², so the high
temperature fixed point is stable as it is in d = 1.



Hence, in between we must have another unstable fixed point, which is given by:
\[ e^{2K^*} = \frac{e^{4K^*} + e^{-4K^*}}{2} \quad \Rightarrow \quad K^* = 0.305 \]

Moreover, we have:

\[ l^{y_t} = \left( \frac{dK'}{dK} \right)_{K = K^*} \quad \Rightarrow \quad y_t = 0.75 \]
(recall that l = 2) while the exact solution is yt = 1.
Comparing to the previous approximation, the Migdal-Kadanoff approximation
gives a result further away from the true answer Kc = 0.441. On the other hand, it
is in a sense more straightforward, as the procedure did not generate any additional
interactions.
This procedure is very easy to generalize in d-dimensions. For instance, if we
consider an Ising cubic lattice (d = 3), we have three bonds to move, hence the
coupling constant transforms as K → 4K. Generally, in d-dimensions, the coupling
constant becomes:
\[ K \to 2^{d-1} K \]

Hence,

\[ K' = \frac{1}{2} \ln \cosh\!\left( 2^{d-1} \cdot 2K \right) \tag{13.49} \]

In conclusion, for a generic scaling factor l, we have:

\[ K'_l = \frac{1}{2} \ln \cosh\!\left( l^{\,d-1} \cdot 2K \right) \tag{13.50} \]
In higher dimensions the approximation worsens.
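A minimal sketch of the Migdal-Kadanoff recursion Eq.(13.49) with l = 2, locating the unstable fixed point and the exponent y_t numerically in d = 2 and d = 3 (the bisection bracket is an arbitrary choice):

\begin{verbatim}
import numpy as np

# Migdal-Kadanoff recursion K' = (1/2) ln cosh(2^(d-1) * 2K) for l = 2.
def R(K, d):
    return 0.5 * np.log(np.cosh(2.0 ** (d - 1) * 2.0 * K))

def fixed_point(d, lo=1e-3, hi=5.0):
    """Unstable fixed point K* = R(K*), found by bisection of R(K) - K."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if R(mid, d) > mid:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

for d in (2, 3):
    Ks = fixed_point(d)
    eps = 1e-6                                  # numerical derivative dK'/dK
    lam = (R(Ks + eps, d) - R(Ks - eps, d)) / (2 * eps)
    print(d, Ks, np.log(lam) / np.log(2.0))     # d=2: K* ~ 0.305, y_t ~ 0.75
\end{verbatim}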

13.3.4 Decimation procedure and transfer matrix method


Let us consider the partition function
\[ Z_N = \sum_{\{S\}} \prod_{i=1}^{N} \underbrace{e^{w(S_i, S_{i+1})}}_{\langle S_i | T | S_{i+1} \rangle} = \mathrm{Tr}\, T^N = \mathrm{Tr} \left( T^2 \right)^{N/2} = \mathrm{Tr} \left( T^l \right)^{N/l} \]

Hence, after the renormalization procedure the transfer matrix transforms as:

\[ T'_l = T^l \quad \Rightarrow \quad T'_l(\{K'\}) = T(\{K\})^l \tag{13.51} \]




In the case in which the Migdal-Kadanoff approximation is considered, Eq.(13.51) becomes

\[ T'_l(\{K'\}) = T\!\left( \{ l^{\,d-1} K \} \right)^l \tag{13.52} \]


13.3.5 Migdal-Kadanoff for the anisotropic Ising model on a square lattice
Now, let us consider an anisotropic Ising model on a square lattice, as in Figure
13.11. First of all, let us suppose that the vertical bonds have Ky ≡ Jy/kB T, while
the horizontal ones have Kx ≡ Jx/kB T. Hence, we have:

\[ \frac{K_x}{K_y} = \frac{J_x}{J_y} \]

Figure 13.11: Anisotropic Ising model on a square lattice, with coupling constants Kx and Ky.

We proceed in the following way:

1. Firstly, we move the vertical bonds and double the ones that will survive, as in
Figure 13.12a. The vertical bond weight becomes:

\[ K'_y = 2 K_y \]

Figure 13.12: (a) The lattice after the vertical bond moving step. (b) The lattice after the vertical bond moving step and after the sum over the spins along x.

We can now perform the sum of the spins, represented as a red cross in Fig.13.12a,
along the x direction as for the case d = 1. This gives

\[ K'_x = \frac{1}{2} \ln \cosh(2 K_x) \]
The lattice after this step is shown in Figure 13.12b.

2. We now perform the moving of the renormalized bonds Kx0 . The bonds along
x that survive will have weight

\[ K''_x = 2 K'_x = \ln \cosh(2 K_x) \]

This step is illustrated in Figure 13.13

Figure 13.13: The lattice after the moving step of the renormalized horizontal bonds K'_x.

3. We finally perform the sum of the spins along the y direction as for the case
d = 1. We obtain:
\[ K''_y = \frac{1}{2} \ln \cosh\!\left( 2 K'_y \right) = \frac{1}{2} \ln \cosh(4 K_y) \]
The lattice after this step is shown in Figure 13.14.

Figure 13.14: The lattice after the sum over the spins along y.

The final recursion equations are:


\[ \begin{cases} K''_x = 2 K'_x = \ln \cosh(2 K_x) \\[2pt] K''_y = \frac{1}{2} \ln \cosh(4 K_y) \end{cases} \tag{13.53} \]

Hence, the fixed points are given by:

\[ \begin{cases} K^*_x = \ln \cosh(2 K^*_x) \\[2pt] K^*_y = \frac{1}{2} \ln \cosh(4 K^*_y) \end{cases} \tag{13.54} \]


Let us consider the case Kx = 2Ky. The couplings Kx and Ky at the fixed point are
related by

\[ K^*_x = 2 K^*_y \tag{13.55} \]

Hence, the value of K* is (0.62, 0.31). Moreover, we have:

\[ \left( \frac{\partial K''_x}{\partial K_x} \right)_{K_x = K^*_x} = 2 \tanh 2K^*_x = 2^{y_t}, \qquad \left( \frac{\partial K''_y}{\partial K_y} \right)_{K_y = K^*_y} = 2 \tanh 4K^*_y = 2 \tanh 2K^*_x = 2^{y_t} \]

This gives yt = 0.75. Let us note that yt is the same as the one for the symmetric
case.

Figure 13.15: (Kx, Ky) plane. The critical surface is represented in red, while the line
Kx = 2Ky in black. The intersection point is the critical point K* = (0.62, 0.31).

Let us consider the plot in Figure 13.15. We note that all points on the critical
surface are critical with a given choice of Kx /Ky . All these Tc (Kx /Ky ) flow to the
Ising model fixed point (same universality class).

13.4 RG transformation: general approach


A RG transformation reduces the number of degrees of freedom by l^d, i.e. going
from N to N' = N/l^d. This is achieved by performing a "partial trace" over the
degrees of freedom, say $\{S_i\}_1^N$, and keeping the coarse grained degrees of freedom
$\{S'_I\}_1^{N'}$. Formally, we can write (we call −βH → H[K]):

\[ e^{H_{N'}\{[K'], S'_I\}} = \mathrm{Tr}'_{\{S_i\}}\, e^{H_N\{[K], S_i\}} = \mathrm{Tr}_{\{S_i\}}\, P(S_i, S'_I)\, e^{H_N\{[K], S_i\}} \tag{13.57} \]

where Tr0 is the constrained trace, while P (Si , SI0 ) is the projection operator, which
"incorporates" the constraints and allows us to write an unconstrained trace. This
operator must satisfy the following properties:

1. It must be built such that SI0 have the same range of values of Si (i.e. if Si = ±1,
we should have SI0 = ±1).

2. P(S_i, S'_I) ≥ 0. This guarantees that $e^{H_{N'}\{[K'], S'_I\}} \ge 0$.

3. P (Si , SI0 ) should preserve the symmetries of the original Hamiltonian.



4. $\mathrm{Tr}_{\{S'_I\}} P(S_i, S'_I) = \sum_{S'_I} P(S_i, S'_I) = 1$. This condition is necessary to satisfy

\[ Z_{N'}[K'] = Z_N[K] \]

Indeed,

\[ Z_{N'}[K'] = \mathrm{Tr}_{\{S'_I\}}\, e^{H_{N'}\{[K'], S'_I\}} = \mathrm{Tr}_{\{S'_I\}} \mathrm{Tr}_{\{S_i\}}\, P(S_i, S'_I)\, e^{H_N\{[K], S_i\}} = \mathrm{Tr}_{\{S_i\}} \left[ \mathrm{Tr}_{\{S'_I\}} P(S_i, S'_I) \right] e^{H_N\{[K], S_i\}} = \mathrm{Tr}_{\{S_i\}}\, e^{H_N\{[K], S_i\}} = Z_N[K] \]

In general this operator must be built "by hand". Let us see some examples.
Example 36: Block-Kadanoff transformation for l = 2
For example, in the case of Kadanoff’s block transformation we can assign the
block spins SI0 their values with the "majority rule", i.e. we build (hyper)cubic
blocks of side (2l + 1)a (so that each one contains an odd number of spins) and
set:

\[ S'_I = \mathrm{sign}\!\left( \sum_{i \in I} S_i \right) = \pm 1 \]

and the projection operator can be written as:

\[ P(S_i, S'_I) = \prod_I \delta\!\left( S'_I - \mathrm{sign}\!\left( \sum_{i \in I} S_i \right) \right) \]

As we can see, doing an unconstrained trace with this operator is equivalent to
performing the constrained trace.
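A small sketch of how the majority rule acts in practice (a toy one-dimensional implementation with blocks of odd size; the configuration below is an arbitrary example):

\begin{verbatim}
import numpy as np

# Majority-rule block-spin assignment for a chain of Ising spins,
# with blocks of odd size b so that the sign is never ambiguous.
def block_spins(spins, b):
    assert len(spins) % b == 0 and b % 2 == 1
    blocks = spins.reshape(-1, b)
    return np.sign(blocks.sum(axis=1)).astype(int)   # S'_I = sign(sum_i S_i)

spins = np.array([1, 1, -1,  -1, -1, 1,  1, -1, 1])  # N = 9, three blocks
print(block_spins(spins, 3))                          # -> [ 1 -1  1]
\end{verbatim}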

Example 37: Decimation in d = 1 with l = 2


Let us do the decimation in d = 1 with l = 2. In this case, the S'_I are the spins at the
even sites, I = 2i. Hence, we have:

\[ P(S_i, S'_I) = \prod_{I=1}^{N/2} \delta\!\left( S'_I - S_{2i} \right) \]

The partition function is:

\[ \begin{aligned} Z_{N/2} &= \mathrm{Tr}_{\{S_i\}}\, P(S_i, S'_I)\, e^{\sum_i w(S_{2i}, S_{2i+1})} \\ &= \mathrm{Tr}_{\{S_i\}} \left[ \prod_{I=1}^{N/2} \delta\!\left( S'_I - S_{2i} \right) \right] e^{\sum_i w(S_{2i}, S_{2i+1})} \\ &= \mathrm{Tr}_{\{S'_I\}} \left[ \prod_{I=1}^{N/2} \sum_{S_{2i+1} = \pm 1} e^{w(S'_I, S_{2i+1}) + w(S_{2i+1}, S'_{I+1})} \right] \end{aligned} \]

13.4.1 Variational RG
In this scheme the projection operator P (Si , SI0 ) depends on a set of parameters
{λ} and the RG transformation is formally

\[ e^{H_{N'}\{[K'], \{\lambda\}, S'_I\}} = \mathrm{Tr}_{\{S_i\}}\, P(S_i, S'_I, \{\lambda\})\, e^{H_N\{[K], S_i\}} \]

After choosing a certain form of P (Si , SI0 , {λ}), one can minimize, over the space of
{λ}, the expression:
   
\[ \underbrace{\log \mathrm{Tr}_{\{S_i\}}\, e^{H_N\{[K], S_i\}}}_{\propto F_N} \; - \; \underbrace{\log \mathrm{Tr}_{\{S'_I\}}\, e^{H_{N'}\{[K'], \{\lambda\}, S'_I\}}}_{\propto F_{N'}} \]

Notice that, when

\[ \mathrm{Tr}_{\{S'_I\}}\, P(S_i, S'_I, \{\lambda\}) = 1 \]

the RG transformation is exact. In general, it is not possible to choose the parameters
{λ} to satisfy this condition, and various schemes have been introduced to minimize

\[ \Delta F = F_N - F_{N'} \]
Chapter 14

Spontaneous symmetry breaking

14.1 Spontaneous symmetry breaking

When we talk about a broken symmetry, we often refer to a situation such as

\[ H = H_0 + H_1 \]
where H0 is invariant under the group G and H1 is invariant under a subgroup G0 ⊂ G.
Example 38: Ising with magnetic field
Let us consider the Hamiltonian for the Ising model with a magnetic field H 6= 0:
\[ H = J \sum_{\langle ij \rangle} S_i S_j + \sum_i H_i S_i \]

The second term, $\sum_i H_i S_i$, breaks the Z₂ symmetry satisfied by the first alone.

Example 39: Hydrogen atom with an external field


An example, in quantum mechanics, is the hydrogen atom in the presence of an
electric field $\vec{E}$ (Stark effect) or a magnetic one, $\vec{B}$ (Zeeman effect). If H1 is
small, the original symmetry is weakly violated and perturbative approaches are
often used.

In all the above examples, one says that the symmetry is broken explicitly.
Definition 11: Spontaneous symmetry breaking
The Hamiltonian maintains the original symmetry but the variables used to
describe the system become asymmetric.

At this point it is convenient to distinguish between


• Discrete symmetries: for instance Z2 , Zq .
• Continuous symmetries: for instance XY , O(n).
Let us consider first the discrete ones by focusing on the Z2 symmetry (Ising). As
previously said, if H = 0, the Hamiltonian of the Ising model, HIsing , is invariant
with respect to the change Si → −Si , hence the discrete group is
\[ G = Z_2 \]

A Ginzburg-Landau theory of the Ising model is given by

\[ \beta H(\Phi) = \int d^d\vec{x} \left[ \frac{1}{2} \left( \vec{\nabla}\Phi \right)^2 + \frac{r_0}{2} \Phi^2 + \frac{u_0}{4} \Phi^4 - h\Phi \right] \tag{14.1} \]

Hence, the partition function can be computed as:


\[ Z(r_0, u_0, h) = \int \mathcal{D}[\Phi]\, e^{-\beta H(\Phi)} \tag{14.2} \]

If we have h = 0, the symmetry is Φ → −Φ. The equation of state obtained with
the saddle point approximation is

\[ h = -\vec{\nabla}^2 \Phi + r_0 \Phi + u_0 \Phi^3 \]

If h does not depend on $\vec{x}$, i.e. h(\vec{x}) = h, the last equation reduces to the equation
of state of the Landau theory of a uniform system:

\[ h = r_0 \Phi + u_0 \Phi^3 \]

Let us recall that the saddle point approximation consists in approximating the
functional integral of Eq.(14.2) with its dominant term, i.e. with the one for which the
exponent (Eq.(14.1)) is minimum. Therefore, in the uniform case (namely $\vec{\nabla}\Phi = 0$),
it is equivalent to finding the uniform value Φ0 that is an extremum of the potential:

\[ V(\Phi) = \frac{r_0}{2} \Phi^2 + \frac{u_0}{4} \Phi^4 - h\Phi \]

Hence, if h = 0, the extrema of the potential can be computed from

\[ V' = (r_0 + u_0 \Phi^2)\, \Phi = 0 \]

Figure 14.1: (a) Plot of the potential V(Φ) in the case r0 > 0, i.e. T > Tc. (b) Plot of the potential V(Φ) in the case r0 < 0, i.e. T < Tc.

Let us remember that r0 ∝ (T − Tc). In order to find the extrema of the potential
V(Φ), we should distinguish two cases (a numerical sketch follows the list):

1. Case r0 > 0 (T > Tc): there is only one solution, Φ0 = 0, as we can see in Figure
14.1a.

2. Case r0 < 0 (T < Tc): there are two solutions, $\Phi_0 = \pm\sqrt{-r_0/u_0}$, as illustrated in
Figure 14.1b.
We note that the two solutions ±Φ0 are related by the Z2 transformation, namely
Φ → −Φ. Moreover, in this case with T < Tc, the two states (phases) ±Φ0
have a lower symmetry than the state Φ0 = 0.
If the thermal fluctuations δΦ are sufficiently strong to allow passages between
the two states ±Φ0 at T < Tc, we have ⟨Φ⟩ = 0 (the symmetry is preserved).
However, for T < Tc and N → +∞, transitions between the two states become
less and less probable and the system will be trapped in one of the two states
(±Φ0). In other words, the system spontaneously chooses one of the two less
symmetric states. Therefore, its physics is no longer described by Φ but by
the fluctuations δΦ around the chosen minimum Φ0. There is a spontaneous
symmetry breaking. It means that the variable Φ is no longer symmetric
and one has to look at Φ → Φ0 + δΦ, where δΦ is a new variable!
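Here is the numerical sketch announced above (the values of r0 and u0 are arbitrary illustrative choices; only the sign of r0 matters):

\begin{verbatim}
import numpy as np

# Minima of V(Phi) = (r0/2) Phi^2 + (u0/4) Phi^4 at h = 0.
def minima(r0, u0):
    if r0 >= 0:
        return np.array([0.0])            # single minimum at Phi = 0
    phi0 = np.sqrt(-r0 / u0)              # two degenerate minima +-Phi_0
    return np.array([-phi0, phi0])

print(minima(+1.0, 1.0))   # T > Tc: [0.]
print(minima(-1.0, 1.0))   # T < Tc: [-1.  1.], related by Phi -> -Phi
\end{verbatim}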

14.2 Spontaneous breaking of continuous symmetries and the onset of Goldstone particles
Let us start with a simple model in which the order parameter is a scalar complex
variable
\[ \Phi = \frac{\Phi_1 + i\Phi_2}{\sqrt{2}} \]
and with an Hamiltonian H that is invariant with respect to a global continuous
transformation. For instance, the simplest model in statistical mechanics that is in-
variant with respect to a continuous symmetry is the XY model with O(2) symmetry,
or a Ginzburg-Landau model for a superfluid or a superconductor (with no magnetic
field). Hence, we suppose that the Hamiltonian has the following form:

\[ \beta H_{\mathrm{eff}} = \int d^d\vec{x} \left[ \frac{1}{2} \vec{\nabla}\Phi \cdot \vec{\nabla}\Phi^* + \frac{r_0}{2} \Phi\Phi^* + \frac{u_0}{4} (\Phi\Phi^*)^2 \right] \]

where

\[ \Phi(\vec{x}) = \frac{1}{\sqrt{2}} \left[ \Phi_1(\vec{x}) + i\Phi_2(\vec{x}) \right], \qquad \text{or} \qquad \Phi(\vec{x}) = \psi(\vec{x})\, e^{i\alpha(\vec{x})} \]
The physical meaning of Φ depends on the case considered. If we have:

• Superfluid: Φ is the macroscopic wave function of the Bose condensate (the
superfluid density is n = |Φ|²).

• Superconductor: Φ is the single-particle wave function describing the position
of the centre of mass of the Cooper pair.

14.2.1 Quantum relativistic case (field theory)


In quantum mechanics the analog of the Hamiltonian H is the action
\[ S(\Phi) = \int d^4\vec{x}\; \mathcal{L}(\Phi) \tag{14.3} \]

where

\[ \mathcal{L}(\Phi) = -\frac{1}{2} \partial_\mu \Phi\, \partial^\mu \Phi^* - \frac{r_0}{2} \Phi\Phi^* - \frac{u_0}{4} (\Phi\Phi^*)^2 \tag{14.4} \]
The Lagrangian L(Φ) describes a complex (i.e. charged) scalar field with
mass $m \equiv \sqrt{r_0}$; we note that, for L(Φ) to describe such a massive field, we should have
r0 > 0 for the mass m to be well defined. Moreover, the term (ΦΦ*)² represents a
self-interaction with strength λ ≡ u0.
In all cases (r0 > 0 or r0 < 0), the original symmetry is U (1); it means that
both the Hamiltonian H and the Lagrangian L are invariant with respect to the
transformation
Φ → e^{iθ} Φ,   Φ* → e^{−iθ} Φ*    (14.5)
where the phase θ does not depend on ~x (global symmetry). In components the
transformation becomes
Φ1 → Φ1 cos θ − Φ2 sin θ
Φ2 → Φ2 cos θ + Φ1 sin θ

i.e. (Φ1′, Φ2′)ᵀ = R(θ) (Φ1, Φ2)ᵀ, where R(θ) is the 2×2 rotation matrix with rows (cos θ, −sin θ) and (sin θ, cos θ).

Now, let us focus first on the statistical mechanics model and on the most inter-
esting case, r0 < 0. In components, H can be expressed as

βH = ∫ d^d~x [ (∇Φ1)² + (∇Φ2)² ] + ∫ d^d~x V(Φ1, Φ2)    (14.6)

where the potential is


V(Φ1, Φ2) = (r0/2) (Φ1² + Φ2²) + (u0/4) (Φ1² + Φ2²)²    (14.7)
In the case r0 < 0, it is called mexican hat potential and it is shown in Figure 14.2.

Figure 14.2: Case r0 < 0. The potential V(Φ1, Φ2) has the Mexican hat shape.

In the uniform case (∇Φ1 = ∇Φ2 = 0), let us define S = √(Φ1² + Φ2²); the potential in Eq.(14.7) can then be rewritten as:

V(S) = (r0/2) S² + (u0/4) S⁴

In the uniform case, the solution is given by the minima of the potential V(S); hence,
in order to find the extremal points, we differentiate the potential with respect to S
and impose the condition V′(S) = 0:

dV(S)/dS = r0 S + u0 S³ = 0

We have a maximum at S = 0 and a minimum at S² ≡ v² = −r0/u0. Hence, for
r0 < 0, the Hamiltonian H displays a minimum when

Φ1² + Φ2² ≡ v² = −r0/u0

It can be represented in the 2d plane (Φ1, Φ2), where the minima lie on a circle of radius

v = √(−r0/u0),

as shown in Figure 14.3. The spontaneous symmetry breaking occurs when the system
"chooses" one of the infinitely many available minima. In our example, let us suppose that
the chosen minimum is

Φ1 = v = √(−r0/u0),   Φ2 = 0    (14.8)

Figure 14.3: Plane (Φ1, Φ2). The minima lie on a circle of radius v = √(−r0/u0); the chosen minimum is the point (v, 0).
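A small numerical sketch (added here, not part of the original notes; r0 = −1 and u0 = 1 are arbitrary illustrative values): minimizing V(Φ1, Φ2) from random starting points always lands on the circle of radius v = √(−r0/u0), but at an essentially random angle each time, mimicking the arbitrary "choice" of vacuum made by the system.

import numpy as np
from scipy.optimize import minimize

r0, u0 = -1.0, 1.0                        # illustrative values (r0 < 0, i.e. T < Tc)

def V(phi):
    # Mexican hat potential: V = (r0/2)(Phi1^2 + Phi2^2) + (u0/4)(Phi1^2 + Phi2^2)^2
    s2 = phi[0]**2 + phi[1]**2
    return 0.5 * r0 * s2 + 0.25 * u0 * s2**2

rng = np.random.default_rng(1)
for _ in range(5):
    res = minimize(V, rng.normal(size=2))                 # start from a random point in the (Phi1, Phi2) plane
    radius = np.hypot(*res.x)
    angle = np.degrees(np.arctan2(res.x[1], res.x[0]))
    print(f"radius {radius:.4f} (expected {np.sqrt(-r0/u0):.4f}), angle {angle:7.1f} deg")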

Interpretation in relativistic quantum mechanics

Now, let us give a physical interpretation of the results obtained above and of the considerations we have made in order to obtain them. In particular:

1. Choosing r0 < 0 corresponds to an imaginary mass. This is because, moving away from Φ = 0, the system experiences a negative resistance in both directions, Φ = 0 being a local maximum.

2. The minimum has the lowest energy and therefore it must correspond to the vacuum state. In this case, however, there is an infinite number of vacuum states!

In summary, the starting Hamiltonian H (or Lagrangian L) is invariant with
respect to U(1), but the one that describes the dynamics of the fluctuations around
the chosen minimum state is not invariant with respect to U(1). Let us see in more
detail why the Hamiltonian, or Lagrangian, is no longer invariant in the case r0 < 0.
First of all, let us write the Lagrangian in terms of the fluctuations of Φ1 and Φ2
around the chosen state (v, 0) (Eq.(14.8)):

Φ1 = v + δΦ1,   Φ2 = 0 + δΦ2   ⇒   Φ = Φ1 + iΦ2 = v + (δΦ1 + iδΦ2)

where we omit the factor 1/√2 for simplicity. Let us note that

δΦ1 = Φ1 − v,   δΦ2 = Φ2   ⇒   ⟨δΦ1⟩_{Φ0} = ⟨δΦ2⟩_{Φ0} = 0

indeed, as expected, the expectation values of the new fluctuation fields vanish in the chosen vacuum.
Now, for the quantum relativistic Lagrangian L, let us define

r0 → m²,   u0 → λ,   v² = −m²/λ

(recall that we are still in the case r0 < 0!). Hence, the Lagrangian Eq.(14.4) becomes

L = −(1/2) ∂µ(v + δΦ1 + iδΦ2) ∂^µ(v + δΦ1 − iδΦ2)
    − (m²/2) (v + δΦ1 + iδΦ2)(v + δΦ1 − iδΦ2)
    − (λ/4) [(v + δΦ1 + iδΦ2)(v + δΦ1 − iδΦ2)]²

  = −(1/2) (∂µδΦ1 ∂^µδΦ1) − (1/2) (∂µδΦ2 ∂^µδΦ2)
    − (m²/2) (v² + 2vδΦ1 + δΦ1² + δΦ2²)
    − (λ/4) (v² + 2vδΦ1 + δΦ1² + δΦ2²)²

Since we have defined m² = −v²λ, we can rewrite it as

L = −(1/2) (∂µδΦ1)² − (1/2) (∂µδΦ2)² + (λv²/2) (v² + 2vδΦ1 + δΦ1² + δΦ2²)
    − (λ/4) [v⁴ + 4v³δΦ1 + 4v²δΦ1² + 2v²(δΦ1² + δΦ2²) + 4vδΦ1(δΦ1² + δΦ2²) + (δΦ1² + δΦ2²)²]

The terms λv³δΦ1 and (λv²/2)(δΦ1² + δΦ2²) coming from the first bracket cancel exactly against the terms −λv³δΦ1 and −(λv²/2)(δΦ1² + δΦ2²) coming from the expansion of the quartic bracket. Neglecting also the constant terms in v, we finally obtain

L(δΦ1, δΦ2) = −(1/2) (∂µδΦ1)² − (1/2) (∂µδΦ2)²
    − λv² δΦ1² − λv δΦ1 [(δΦ1)² + (δΦ2)²] − (λ/4) [(δΦ1)² + (δΦ2)²]²    (14.9)
Comparing it with Eq.(14.4), we note that the term −λv²δΦ1² indicates that the
field δΦ1 (the fluctuation of the modulus, i.e. along the radial direction) has a vanishing
vacuum expectation value (⟨δΦ1⟩ = 0) and a mass M such that (recall that m² = −λv² = r0):

M² = 2λv² = −2r0 = −2m²

Therefore, it represents a real, massive, mesonic scalar field that is physically ac-
ceptable (indeed we have r0 < 0, so M² > 0!). However, L is no longer invariant under the
transformation δΦ1 → −δΦ1, as we wanted to show! The symmetry is broken.
Remark. Note that the field δΦ2 has no mass (there is no term ∝ δΦ2²)! Indeed,
it describes the fluctuations along the circle where the potential V has its minimum,
which implies no restoring force and hence no mass!
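As a consistency check (a small sympy sketch added here, not part of the original notes), one can expand the potential V around the chosen minimum (v, 0) and read off the quadratic terms: the δΦ1² term appears with coefficient λv² (i.e. a mass M² = 2λv² = −2m²), while the δΦ2² term is absent, which is precisely the Goldstone mode.

import sympy as sp

d1, d2 = sp.symbols('deltaPhi1 deltaPhi2', real=True)    # fluctuations around the minimum (v, 0)
lam, v = sp.symbols('lambda v', positive=True)
m2 = -lam * v**2                                          # m^2 = -lambda v^2 (the r0 < 0 case)

Phi1, Phi2 = v + d1, d2
S2 = Phi1**2 + Phi2**2
V = sp.Rational(1, 2) * m2 * S2 + sp.Rational(1, 4) * lam * S2**2

poly = sp.Poly(sp.expand(V), d1, d2)                      # organize V by powers of the fluctuations
print("coeff of deltaPhi1^2:", poly.coeff_monomial(d1**2))   # lambda*v**2 -> massive (Higgs-like) mode
print("coeff of deltaPhi2^2:", poly.coeff_monomial(d2**2))   # 0           -> massless Goldstone mode
print("linear terms        :", poly.coeff_monomial(d1), poly.coeff_monomial(d2))  # both 0: (v, 0) is an extremum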
In summary, starting with one complex scalar field Φ(~x) having mass m, when
m² < 0 one gets a real scalar field δΦ1 with mass M = √(−2m²) and a second scalar
field δΦ2 that is massless. The latter is called the Goldstone boson. Hence, we have the
following theorem:
Theorem 5: Goldstone’s theorem
If a continuous symmetry is spontaneously broken and there are no long-range
interactions, then there exists an elementary excitation whose energy vanishes
at zero momentum, i.e. a particle of zero mass, called the Goldstone boson.
More generally, let P be a subgroup of G (P ⊂ G). If G has N independent
generators, P has M independent generators, and P is the residual (lower)
symmetry after the breaking, then N − M Goldstone bosons exist.

In the previous case the symmetry group was G = U(1), hence G has N = 1
independent generator, whereas M = 0 (we have chosen a specific minimum).
Therefore, we have only one Goldstone boson.
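A trivial bookkeeping sketch (added here, not part of the original notes) of the N − M counting for a few standard breaking patterns, using dim U(1) = 1, dim SU(n) = n² − 1 and dim O(n) = n(n − 1)/2 for the number of independent generators:

# number of independent generators (dimension) of the relevant groups
def dim_u1():  return 1                  # U(1)
def dim_su(n): return n * n - 1          # SU(n)
def dim_o(n):  return n * (n - 1) // 2   # O(n) (same as SO(n))

patterns = [
    # (breaking pattern,                          N = dim G,             M = dim P)
    ("U(1)  -> nothing   (this section)",         dim_u1(),              0),
    ("O(3)  -> O(2)      (Heisenberg magnet)",    dim_o(3),              dim_o(2)),
    ("SU(2)xU(1) -> U(1) (electroweak, 14.3.4)",  dim_su(2) + dim_u1(),  dim_u1()),
]
for name, N, M in patterns:
    print(f"{name:42s}: N - M = {N - M} Goldstone boson(s)")

The last line anticipates the electroweak case of Section 14.3.4, where the three would-be Goldstone bosons are "eaten" by the W± and Z⁰ gauge bosons.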
Example 40: XY model
Let us consider the XY model in statistical mechanics. We have that

• δΦ1 represents the fluctuation of the modulus of m.

• δΦ2 represents fluctuations of the spin directions, or spin waves.

Remark. In particle physics the presence of Goldstone bosons poses a serious problem
in field theory, since the corresponding massless particles are not observed! The resolution
of this problem was given by Brout, Englert and Higgs (1964). In particular, through the Higgs
mechanism the would-be Goldstone modes disappear and the gauge bosons acquire a mass,
because the Goldstone theorem, which works well for a continuous global symmetry, can fail
for local gauge theories!

14.3 Spontaneous symmetry breaking in gauge symmetries
14.3.1 Statistical mechanics
In statistical mechanics, let us consider a Ginzburg-Landau model for supercon-
ductors in the presence of a magnetic field (Meissner effect, i.e. the magnetic induction
~B = 0 inside the superconductor). The Hamiltonian of such a system is:

βH(Φ) = ∫ d^d~x [ (1/2) B² + |(∇ − 2i~A)Φ|² + (r0/2) Φ*Φ + (u0/4) (Φ*Φ)² − ~B·~H ]    (14.10)
where B²/2 is the energy of the magnetic field ~B and ∇ → ∇ + iq~A is the minimal
coupling. If we have an external magnetic field ~H, the induction field is:

~B = ~H + ~M

For normal conductors we have Φ0 = 0, which implies ~B = ~H, while for superconductors Φ ≠ 0 and we have a spontaneous symmetry breaking.

14.3.2 Field theory analog: the Higgs mechanism for an abelian group
Let us consider the analog of the previous system in field theory: a self-interacting
charged scalar mesonic field in the presence of an electromagnetic field with four-potential
Aµ(~x). We begin by applying the Higgs mechanism to an abelian¹ U(1)
gauge theory, to demonstrate how the mass of the corresponding gauge boson (the
photon) comes about.
The U (1) gauge invariant kinetic term of the photon is given by

L_photon = −(1/4) Fµν(~x) F^µν(~x),   Fµν = ∂µAν − ∂νAµ
¹ In abstract algebra, an abelian group, also called a commutative group, is a group in which the result of applying the group operation to two group elements does not depend on the order in which they are written.

That is, L_photon is invariant under the transformation Aµ(~x) → Aµ(~x) − ∂µα(~x) for
any α(~x). If we naively add a mass term for the photon to the Lagrangian, we
have

L_photon = −(1/4) Fµν(~x) F^µν(~x) + (1/2) m_A² Aµ A^µ

The mass term violates the local gauge symmetry; hence, the U(1) gauge symmetry
requires the photon to be massless.
Now let us extend the model by introducing a complex scalar field with charge q that
couples both to itself and to the photon. In this case, because of the presence of Aµ(~x),
we should consider a theory that satisfies the U(1) symmetry locally! The Lagrangian of
this model is:

L = −(1/4) Fµν(~x) F^µν(~x) + (DµΦ(~x))* (D^µΦ(~x)) − V(Φ, Φ*)    (14.11)
where
DµΦ = (∂µ + iqAµ)Φ   (gauge-covariant derivative)
Fµν = ∂µAν − ∂νAµ   (field strength tensor)
V(Φ, Φ*) = (m²/2) ΦΦ* + (λ/4) (ΦΦ*)²   (potential)
The complex scalar field we are considering is defined again as:
Φ = (1/√2) (Φ1 + iΦ2),   Φ* = (1/√2) (Φ1 − iΦ2)
Hence, it is easily seen that this Lagrangian is invariant under the gauge transformations

Φ(~x) → e^{iα(~x)} Φ(~x),   Φ*(~x) → e^{−iα(~x)} Φ*(~x)
Aµ(~x) → Aµ(~x) − (1/q) ∂µα(~x)
Let us consider again two different cases:
• If m2 > 0: the state of minimum energy will be that with Φ = 0 and the
potential will preserve the symmetries of the Lagrangian. Then the theory is
simply QED with a massless photon and a charged scalar field Φ with mass m.
• If m² < 0: the field Φ will acquire a vacuum expectation value. The state of
minimum energy will be that with |Φ| = √(−m²/λ) ≡ v (circle of radius |Φ| = v)
and the global U(1) gauge symmetry will be spontaneously broken.
Now, let us focus on the case m2 < 0. In particular, it is convenient to parametrize
Φ by choosing
Φ̄1 = v, Φ¯2 = 0
Hence, we have:
Φ(x) = (v + δΦ1 ) + iδΦ2
where δΦ1 is referred to as the Higgs boson and δΦ2 as the Goldstone boson. As said,
they are real scalar fields which have no vacuum expectation value (⟨δΦ1⟩ = ⟨δΦ2⟩ = 0).
By inserting this parametrization in the Lagrangian and keeping in mind that m² = −v²λ, the
Lagrangian transforms as
L = −(1/4) Fµν F^µν + (1/2) (∂µδΦ1)² + (1/2) (∂µδΦ2)²
    − λv² δΦ1² + q²v² Aµ A^µ − qv A^µ ∂µδΦ2 + higher order terms    (14.14)
Let us give a physical interpretation of the three terms written out in the second line:


• λv²δΦ1² means that the field δΦ1 is massive, with mass M = v√(2λ) (Higgs boson).

• q²v²AµA^µ means that the gauge boson Aµ, the photon, has acquired a mass

M_A² = 2q²v²

Note that, since Aµ is now massive, it has three independent polarization states.

• qvA^µ∂µδΦ2 means that the field δΦ2 is not massive (indeed there is no term
∝ δΦ2²) and that it mixes with Aµ (Goldstone boson). Dynamically, this
means that a propagating photon can transform itself into a δΦ2 excitation (the
photon becomes a Goldstone boson).

Summarizing, this Lagrangian now describes a theory with a Higgs boson δΦ1
of mass M = v√(2λ), a photon of mass M_A = qv√2 and a massless Goldstone boson δΦ2.
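A small symbolic sketch (added here, not part of the original notes): treating Aµ and the gradients of δΦ1, δΦ2 as one-component placeholders is enough to track which quadratic terms appear in (DµΦ)*(D^µΦ) − V after the shift Φ → v + δΦ1 + iδΦ2. The 1/√2 normalization and the Lorentz-index structure are suppressed, so only the pattern of terms (not the exact prefactors) should be compared with Eq.(14.14):

import sympy as sp

v, q, lam = sp.symbols('v q lambda', positive=True)
d1, d2, dd1, dd2, A = sp.symbols('d1 d2 dd1 dd2 A', real=True)   # d_i = deltaPhi_i, dd_i = its gradient

Phi  = v + d1 + sp.I * d2                     # shifted field (v is a constant)
dPhi = dd1 + sp.I * dd2                       # "derivative" of Phi
DPhi = dPhi + sp.I * q * A * Phi              # covariant derivative (d + iqA)Phi

kinetic   = sp.expand(DPhi * sp.conjugate(DPhi))
potential = sp.expand(-lam*v**2/2 * Phi*sp.conjugate(Phi)
                      + lam/4 * (Phi*sp.conjugate(Phi))**2)       # m^2 = -lambda v^2

L = sp.Poly(kinetic - potential, d1, d2, dd1, dd2, A)
print("A^2 (photon mass term)  :", L.coeff_monomial(A**2))    # q**2*v**2
print("A*dd2 (mixing term)     :", L.coeff_monomial(A*dd2))   # proportional to q*v (photon-Goldstone mixing)
print("d1^2 (Higgs mass term)  :", L.coeff_monomial(d1**2))   # -lambda*v**2
print("d2^2 (Goldstone)        :", L.coeff_monomial(d2**2))   # 0: the Goldstone boson is massless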
Moreover, since δΦ2 does not seem to be a physical field, it should be eliminated
by a gauge transformation. Hence, the strange δΦ2–Aµ mixing can be removed by
making the following gauge transformation:

Aµ(~x) → Aµ(~x) − (1/q) ∂µα(~x)

We can choose α(~x) = −(1/v) δΦ2(~x), so that

Aµ(~x) → Aµ(~x) + (1/(qv)) ∂µδΦ2(~x),

and this gauge choice is called the unitary gauge. By inserting it in the Lagrangian
Eq.(14.14), we eliminate both the mixed term qvA^µ∂µδΦ2 and the term (1/2)(∂µδΦ2)².
Thus we obtain:

L = −(1/4) Fµν F^µν + (1/2) (∂µδΦ1)² − λv² δΦ1² + q²v² Aµ A^µ + higher order terms    (14.15)
The Goldstone boson δΦ2 has then completely disappeared from the theory, and one says
that the Goldstone has been "eaten" to give the photon its mass. Therefore, the new
Lagrangian in Eq.(14.15) contains two fields: one is a massive photon with spin 1, and
the second field δΦ1 is massive too, but has spin 0 (scalar). The mechanism through
which the gauge boson becomes massive is the so-called Higgs mechanism.
It is instructive to count the degrees of freedom (dof) before and after spontaneous
symmetry breaking has occurred. In particular, we have:

• For a global U (1) symmetry, we have 2 massive scalar fields, hence there are
1 + 1 degrees of freedom.
After symmetry breaking, we have 1 massive scalar field and 1 massless scalar
field. Hence, there are again 1 + 1 degrees of freedom.

• For a local gauge U (1) symmetry, we have 2 massive scalar fields and one
massless photon. Hence, there are 2+2 degrees of freedom (the massless photon
has 2 polarizations).
After symmetry breaking, we have 1 massive scalar field and 1 massive photon.
Hence, there are 1 + 3 degrees of freedom (the massive photon has 3 polariza-
tions).

Remark. Let us note that, among the higher order terms we have neglected in Eq.(14.15), there are

Figure 14.4: (a) Feynman diagram of the term ∝ δΦ1 Aµ A^µ. (b) Feynman diagram of the term ∝ δΦ1² Aµ A^µ.

• A term ∝ δΦ1 Aµ A^µ, shown in Figure 14.4a.

• A term ∝ δΦ1² Aµ A^µ, shown in Figure 14.4b.

Remark. Note that the presence of the massive photon, M_A² = 2q²v² with q = 2e in
a superconductor (the Cooper pairs carry charge 2e), gives rise to the exponential drop

B(x) = B(0) e^{−x/l}

of the magnetic field inside the system, where l ∼ 1/M_A is the penetration depth.


Now, let us make a few further considerations:

• As said, we cannot introduce a massive photon by hand, i.e. a term like (1/2) M_A² Aµ A^µ in the Lagrangian, because we would explicitly violate the gauge symmetry!

• The Lagrangian is gauge invariant.

• Symmetry breaking occurs at the level of the vacuum state.

• A gauge theory whose symmetry is explicitly broken is not renormalizable.

14.3.3 Non-abelian gauge theories


Let us illustrate some examples of non-abelian gauge theories.
Example 41
An example of a non-abelian gauge theory is the one that contains the
combined electromagnetic and weak interactions, which is generally referred
to as the electroweak unification, or electroweak interaction theory (Glashow-
Weinberg-Salam, or GWS). It is non-abelian because the Lagrangian is invariant
under the group SU(2) × U(1), where the SU(2) factor is associated with the
weak interactions and the U(1) factor with electromagnetism; this group is not
abelian because of SU(2).

Example 42: Quantum chromodynamics (quarks + gluons)

In the case of quantum chromodynamics, one has a term that is SU(3) invariant,
together with the GWS Lagrangian that has symmetry SU(2) × U(1). It implies that the
Lagrangian of the full theory is invariant under

SU(3) × SU(2) × U(1)    (14.16)

Because of the groups SU(2) and SU(3), the symmetries above are not abelian
(for example, in SU(2) two matrices U(α) and U(β) do not commute in general).

14.3.4 Extension of Higgs mechanism to non-abelian theories


The abelian example of the Higgs mechanism can now be generalized in a straight-
forward way to a non-abelian gauge theory. In particular, we discuss the electroweak
gauge theory. To spontaneously break the symmetry, consider a complex scalar doublet of SU(2):

Φ = (1/√2) (Φ1 + iΦ2, Φ3 + iΦ4)ᵀ ≡ (Φa(~x), Φb(~x))ᵀ    (14.17)
where Φa, Φb are complex fields. The Lagrangian of the system is invariant under the
SU(2) × U(1) gauge transformation:

(Φa(~x), Φb(~x))ᵀ → e^{(i/2) α0(~x)} e^{(i/2) ~τ·~α(~x)} (Φa(~x), Φb(~x))ᵀ

where ~τ are the Pauli matrices and α0, α1, α2, α3 are four real functions, to which
correspond four vector gauge bosons. In particular,

α0(~x) → Bµ(~x),   ~α(~x) → Wµ^a(~x) = (Wµ^(1)(~x), Wµ^(2)(~x), Wµ^(3)(~x)),

where Bµ is the gauge field associated with the U(1) factor; as shown below, the photon
field Aµ will turn out to be a linear combination of Bµ and Wµ^(3).
The Lagrangian is given by
L = (DµΦ)†(D^µΦ) − µ²Φ†Φ − λ(Φ†Φ)² − (1/4) bµν b^µν − (1/4) f^a_µν f^{a µν}    (14.19)
where

Dµ = ∂µ + (i/2) g τ^a Wµ^a + (i/2) g′ Bµ
f^a_µν = ∂µ Wν^a − ∂ν Wµ^a − g ε^{abc} Wµ^b Wν^c
bµν = ∂µ Bν − ∂ν Bµ

and the gauge fields transform as

Wµ^a → Wµ^a − ε^{abc} α^b(~x) Wµ^c(~x) + (1/g) ∂µ α^a(~x),
Bµ → Bµ + (1/g′) ∂µ α0(~x)
Just as in the abelian example, the scalar field develops a nonzero vacuum expectation
value for µ² < 0, which spontaneously breaks the symmetry. There is an infinite
number of degenerate states with minimum energy, satisfying Φ1² + Φ2² + Φ3² + Φ4² = v².
After we have chosen a direction on this sphere in R⁴, we note that 3 of the symmetry
generators are broken; hence, we have 3 Goldstone bosons.

Higgs mechanism
The W and Z gauge boson masses can now be generated in the same manner as the
Higgs mechanism generated the photon mass in the abelian example. Let us consider
a Higgs scalar doublet

δΦ = (Φ⁺, Φ⁰)ᵀ

such that

⟨0| Φ |0⟩ = (0, v)ᵀ
Thus, the Lagrangian becomes:

L_Higgs = (1/2) (gv)² Wµ⁺ W^{−µ} + (1/2) v² (g Wµ^(3) − g′ Bµ)²    (14.21)

where
Wµ^(1) = (1/√2) (Wµ⁺ + Wµ⁻),   Wµ^(2) = (1/√2) (Wµ⁺ − Wµ⁻)

Hence, the mass of the W⁺ particle and its antiparticle is:

M_W² = (1/2) (gv)²
Moreover, let us note that the second term in Eq.(14.21) involves the linear combination of
Wµ^(3) and Bµ which corresponds to Zµ⁰, the field of a third weak gauge boson. To
make Zµ⁰ and Aµ orthogonal we should consider

Aµ = (cos θW )Bµ + (sin θW )Wµ3


Zµ0 = (− sin θW )Bµ + (cos θW )Wµ3

where θW is the Weinberg angle, defined as:

tan θW = g′/g

Hence, the mass of the Z⁰ particle is:

M_{Z⁰}² = (1/2) (vg / cos θW)² = M_W² / cos²θW
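As a rough numerical sanity check (a sketch added here, not part of the original notes), plugging indicative values of the couplings into the formulas above reproduces the observed W and Z masses. The inputs g ≈ 0.65, g′ ≈ 0.35 and v ≈ 174 GeV (the value appropriate to the normalization M_W² = (gv)²/2 used in these notes) are approximate, externally quoted numbers, not results derived here.

import math

g, g_prime, v = 0.65, 0.35, 174.0       # approximate electroweak inputs, v in GeV

theta_W = math.atan(g_prime / g)        # Weinberg angle: tan(theta_W) = g'/g
M_W = math.sqrt(0.5) * g * v            # from M_W^2 = (g v)^2 / 2
M_Z = M_W / math.cos(theta_W)           # from M_Z^2 = M_W^2 / cos^2(theta_W)

print(f"sin^2(theta_W) ≈ {math.sin(theta_W)**2:.3f}")     # about 0.22
print(f"M_W ≈ {M_W:.1f} GeV,   M_Z ≈ {M_Z:.1f} GeV")      # about 80 GeV and 91 GeV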

It is again instructive to count the degrees of freedom before and after the Higgs
mechanism. At the outset we had a complex doublet Φ with four degrees of freedom,
one massless B with two degrees of freedom and three massless W i gauge fields
with six, for a total number of 12 degrees of freedom. At the end of the day, after
spontaneous symmetry breaking, we have a real scalar Higgs field h with one degree
of freedom, three massive weak bosons, W ± and Z 0 , with nine, and one massless
photon with two degrees of freedom, yielding again a total of 12. One says that
the scalar degrees of freedom have been eaten to give the W ± and Z 0 bosons their
longitudinal components.
