Lecture Notes
USTH 2024
Lecturer:
Trinh Xuan Hoang
Institute of Physics, Vietnam Academy of Science and Technology
Email: txhoang@iop.vast.vn
Books:
These lecture notes are based on the following books:
K. Huang, Statistical Mechanics, 2nd edition, John Wiley & Sons, New York, 1987.
Plan of the course
Chapter 1 Elements of probability theory (2 lectures)
2-3 homeworks
1 midterm test
1 final exam (oral)
A brief introduction
Statistical Mechanics (SM) is considered the third pillar of modern physics, alongside quantum theory and relativity.
SM is a physical theory that aims to explain the macroscopic behavior of physical systems from the equations of motion of their microscopic constituents and from probabilistic assumptions made about them.
SM is based on the laws of mechanics (classical and quantum) and on statistical methods.
SM allows the calculation of macroscopic (bulk) properties of substances from the microscopic properties of the molecules and their interactions.
SM is applied in many areas of physics, chemistry and biology.
Chapter 1. Elements of probability theory
1.2 Random variables
A random variable X is defined by
P (x) ≥ 0
Z
P (x)dx = 1
Exercise 1.3 An opinion poll is conducted in a country with many political parties. How large a sample is needed to be reasonably sure that a party with 5% support will show up in the poll with a percentage between 4.5% and 5.5%?
Exercise 1.4 Two volumes V1 and V2 are connected by a hole, which allows a gas of N molecules to move freely from one volume to the other. Prove that the probability that the volume V1 contains exactly n molecules is

    p_n = \binom{N}{n} (1 + γ)^{−N} γ^n ,

where γ = V1/V2.
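A quick numerical check can make this result plausible. The sketch below is an added illustration, not part of the original notes: it assumes γ = V1/V2, so that each molecule sits in V1 independently with probability V1/(V1 + V2) = γ/(1 + γ), and compares the empirical frequencies of n with the formula above (all parameter values are arbitrary choices for the test).

    import numpy as np
    from math import comb

    N = 20                         # number of molecules (illustrative value)
    gamma = 0.5                    # assumed gamma = V1/V2
    p1 = gamma / (1.0 + gamma)     # probability that one molecule is in V1
    trials = 200_000

    rng = np.random.default_rng(0)
    n_in_V1 = rng.binomial(N, p1, size=trials)   # molecules found in V1 per trial

    for n in range(6):
        empirical = np.mean(n_in_V1 == n)
        exact = comb(N, n) * (1 + gamma) ** (-N) * gamma ** n
        print(f"n={n}:  sampled {empirical:.4f}   formula {exact:.4f}")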
1.3 Averages
The average or expectation value of any function f(X) is defined as

    ⟨f(X)⟩ = ∫ f(x) P(x) dx .    (1.4)
The m-th moment of the distribution is µ_m = ⟨X^m⟩. Show that for the Gaussian distribution with zero mean and unit variance

    µ_{2n+1} = 0 ,        µ_{2n} = (2n − 1)!! = (2n − 1)(2n − 3) · · · 1 .
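These moment formulas are easy to test numerically. The following sketch is an added illustration, not from the notes; it assumes, as above, the Gaussian with zero mean and unit variance and compares Monte Carlo estimates of ⟨X^m⟩ with (2n − 1)!!.

    import numpy as np

    rng = np.random.default_rng(1)
    x = rng.standard_normal(2_000_000)   # samples of a zero-mean, unit-variance Gaussian

    def double_factorial(m):
        # (2n-1)!! = (2n-1)(2n-3)...1 for odd m = 2n-1; returns 1 for m <= 0
        result = 1
        while m > 1:
            result *= m
            m -= 2
        return result

    for m in range(1, 9):
        estimate = np.mean(x ** m)
        exact = 0 if m % 2 == 1 else double_factorial(m - 1)
        print(f"mu_{m}: sampled {estimate:8.3f}   exact {exact}")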
1.4 Characteristic functions
The characteristic function of a random variable X whose range is some set I of real numbers is defined by

    G(k) = ⟨e^{ikX}⟩ = ∫_I e^{ikx} P(x) dx .    (1.7)
One can always redefine P(x) so that the integral extends over the whole real axis:

    G(k) = ⟨e^{ikX}⟩ = ∫_{−∞}^{+∞} e^{ikx} P(x) dx ,    (1.8)

thus G(k) is the Fourier transform of P(x). G(k) exists for all real k, with |G(k)| ≤ G(0) = 1, and it generates the moments:

    G(k) = Σ_{m=0}^{∞} [(ik)^m / m!] µ_m ,    (1.9)
where µ_m = ⟨X^m⟩ are the m-th moments of the distribution. This can be shown by making a Taylor expansion of G(k),

    G(k) = Σ_{m=0}^{∞} (k^m / m!) (∂^m G / ∂k^m)|_{k=0} ,    (1.10)

and noting that (∂^m G / ∂k^m)|_{k=0} = i^m ⟨X^m⟩.
Similarly, the expansion of the logarithm of the characteristic function,

    ln G(k) = Σ_{m=1}^{∞} [(ik)^m / m!] κ_m ,    (1.11)

defines the quantities κ_m, which are called the cumulants. The cumulants are an alternative to the moments; both are defined by the characteristic function, and knowing the cumulants one can calculate the moments and vice versa. The first few cumulants are

    κ_1 = µ_1    (1.12)
    κ_2 = µ_2 − µ_1^2 = σ^2    (1.13)
    κ_3 = µ_3 − 3µ_2µ_1 + 2µ_1^3    (1.14)
    κ_4 = µ_4 − 4µ_3µ_1 − 3µ_2^2 + 12µ_2µ_1^2 − 6µ_1^4    (1.15)
    . . .
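Relations (1.12)–(1.15) can be checked numerically. The sketch below is an added illustration, not part of the original notes: it estimates the raw moments of an exponential distribution by sampling, converts them to cumulants with the formulas above, and compares with the exact cumulants of the exponential distribution with rate λ, namely κ_m = (m − 1)!/λ^m (the rate value is an arbitrary choice).

    import math
    import numpy as np

    lam = 1.5
    rng = np.random.default_rng(2)
    x = rng.exponential(scale=1.0 / lam, size=5_000_000)

    # Raw moments mu_m = <X^m> estimated from the samples
    mu1, mu2, mu3, mu4 = (np.mean(x ** m) for m in range(1, 5))

    # Cumulants from the moment formulas (1.12)-(1.15)
    k1 = mu1
    k2 = mu2 - mu1**2
    k3 = mu3 - 3*mu2*mu1 + 2*mu1**3
    k4 = mu4 - 4*mu3*mu1 - 3*mu2**2 + 12*mu2*mu1**2 - 6*mu1**4

    # Exact cumulants of the exponential distribution: kappa_m = (m-1)!/lam^m
    exact = [math.factorial(m - 1) / lam**m for m in range(1, 5)]
    for m, (est, ex) in enumerate(zip([k1, k2, k3, k4], exact), start=1):
        print(f"kappa_{m}: sampled {est:.4f}   exact {ex:.4f}")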
Lattice distributions: if X takes only the values x = na (n integer), the characteristic function is G(k) = Σ_n p_n e^{ikna}, which is a periodic function with period 2π/a. For lattice distributions, G(k) = 1 not only at k = 0 but also at the multiple values k = 2πn/a. For non-lattice distributions, G(k) = 1 only at k = 0, and |G(k)| < 1 for k ≠ 0.
Exercise 1.7 Compute the characteristic function of the square distribution in Exercise
1.5 and find its moments in this way.
Exercise 1.8 Show that for the Gaussian distribution all cumulants beyond the second
one are zero.
Exercise 1.9 Given the Poisson distribution

    p_n = (a^n / n!) e^{−a}        (a > 0) ,

with n = 0, 1, 2, . . ., find its cumulants.
Conditional probability: the probability of A given B is defined as

    P(A|B) = P(A ∩ B) / P(B) .    (1.18)
Example: the probability that a particle is found in the volume d³r about the position r2 at time t2, given that it was found in the volume d³r about the position r1 at time t1.
Independence: If two sets A and B represent two independent sets of events (events in
B have no influence on the probabilities of events in A and vice versa), then
P (A ∩ B) = P (A)P (B).
1.6 Joint and conditional probability distribution
The joint probability distribution of two random variables X and Y is denoted by P(x, y); it is also called a bivariate distribution.
From the joint probability distribution one can obtain the marginal distribution of each variable,

    P_1(x) = ∫ P(x, y) dy ,        P_2(y) = ∫ P(x, y) dx ,

as well as the conditional distributions, e.g. P(x|y) = P(x, y)/P_2(y).
Exercise 1.10: Compute the marginal and conditional probabilities for the following
ring-shaped bivariate distribution
Exercise 1.11: Two dice are thrown and the outcome is 9. What is the probability
distribution of the points on the first die conditional on this given total?
More generally, r random variables X1, X2, . . . , Xr have a joint distribution P(x1, x2, . . . , xr), with the characteristic function

    G(k1, k2, . . . , kr) = ⟨exp[i(k1X1 + k2X2 + · · · + krXr)]⟩ .
The Taylor expansion of G generates the moments,

    G(k1, . . . , kr) = 1 + Σ′ [(ik1)^{m1} (ik2)^{m2} · · · (ikr)^{mr} / (m1! m2! · · · mr!)] ⟨X1^{m1} X2^{m2} · · · Xr^{mr}⟩ ,    (1.24)

while the expansion of its logarithm generates the cumulants, which are denoted by double brackets ⟨⟨· · ·⟩⟩:

    ln G(k1, . . . , kr) = Σ′ [(ik1)^{m1} (ik2)^{m2} · · · (ikr)^{mr} / (m1! m2! · · · mr!)] ⟨⟨X1^{m1} X2^{m2} · · · Xr^{mr}⟩⟩ .    (1.25)

The primed summations run over all m1, . . . , mr ≥ 0 and exclude the case in which all mi = 0.
The second cumulants form an r × r matrix, called the covariance matrix, with elements

    ⟨⟨Xi Xj⟩⟩ = ⟨(Xi − ⟨Xi⟩)(Xj − ⟨Xj⟩)⟩ = ⟨Xi Xj⟩ − ⟨Xi⟩⟨Xj⟩ .    (1.26)
The off-diagonal elements of the covariance matrix are called covariances. After normalization they become the correlation coefficients,

    ρ_ij = ⟨⟨Xi Xj⟩⟩ / √(⟨⟨Xi^2⟩⟩ ⟨⟨Xj^2⟩⟩) = (⟨Xi Xj⟩ − ⟨Xi⟩⟨Xj⟩) / √[(⟨Xi^2⟩ − ⟨Xi⟩^2)(⟨Xj^2⟩ − ⟨Xj⟩^2)] ,    (1.27)

with −1 ≤ ρ_ij ≤ 1.
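The sketch below is an added illustration of Eqs. (1.26)–(1.27), not part of the original notes: it estimates the second cumulants and the correlation coefficient from samples of a correlated bivariate Gaussian with an arbitrarily chosen covariance matrix.

    import numpy as np

    rng = np.random.default_rng(3)
    cov = np.array([[1.0, 0.6],
                    [0.6, 2.0]])           # prescribed covariance matrix (illustrative)
    samples = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=1_000_000)
    x1, x2 = samples[:, 0], samples[:, 1]

    # Second cumulants <<Xi Xj>> = <Xi Xj> - <Xi><Xj>
    cov12 = np.mean(x1 * x2) - np.mean(x1) * np.mean(x2)
    var1 = np.mean(x1**2) - np.mean(x1)**2
    var2 = np.mean(x2**2) - np.mean(x2)**2
    rho = cov12 / np.sqrt(var1 * var2)

    print("estimated covariance:", cov12)   # should be close to 0.6
    print("estimated rho       :", rho)     # close to 0.6 / sqrt(1.0 * 2.0) ~ 0.424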
For r = 2: if X1 and X2 are statistically independent, then
(i) ⟨X1^{m1} X2^{m2}⟩ = ⟨X1^{m1}⟩⟨X2^{m2}⟩, i.e. all moments factorize;
(ii) G(k1, k2) = G1(k1) G2(k2), i.e. the characteristic function factorizes;
(iii) ⟨⟨X1^{m1} X2^{m2}⟩⟩ = 0 for m1 ≠ 0 and m2 ≠ 0.
Uncorrelated vs. independent: X1 and X2 are called uncorrelated when the covariance ⟨⟨X1 X2⟩⟩ = 0, which is a weaker condition than statistical independence.
Exercise 1.12 Find the moments and cumulants of the bivariate Gaussian distribution

    P(x, y) = const. × exp[ −(1/2)(a x^2 + 2b xy + c y^2) ] ,

with ac − b^2 > 0 and a > 0. Show that for this distribution “uncorrelated” and “independent” are equivalent.
Sum of two random variables: let Y = X1 + X2, where X1 and X2 have the joint probability distribution PX(x1, x2). The distribution of Y is then

    P_Y(y) = ∫∫ δ(x1 + x2 − y) P_X(x1, x2) dx1 dx2 = ∫ P_X(x1, y − x1) dx1 .    (1.29)
If X1 and X2 are independent, then

    P_Y(y) = ∫ P_{X1}(x1) P_{X2}(y − x1) dx1 .    (1.30)
In this case the variances are additive,

    ⟨⟨(X1 + X2)^2⟩⟩ = ⟨⟨X1^2⟩⟩ + ⟨⟨X2^2⟩⟩ .
The characteristic function of Y is

    G_Y(k) = G_{X1,X2}(k, k) .

If X1 and X2 are independent, then G_Y(k) = G_{X1}(k) G_{X2}(k).
Exercise 1.13 X1 and X2 are independent variables with the Gaussian distributions

    P_{X1}(x1) = (1 / (σ1 √(2π))) exp( −x1^2 / (2σ1^2) ) ,        P_{X2}(x2) = (1 / (σ2 √(2π))) exp( −x2^2 / (2σ2^2) ) .

Find P_Y(y) for Y = X1 + X2.
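Before doing the convolution (1.30) analytically, the answer can be anticipated numerically. The sketch below is an added illustration (with arbitrary values of σ1 and σ2): it samples Y = X1 + X2 and compares its variance with σ1^2 + σ2^2, as expected from the additivity of the second cumulants.

    import numpy as np

    sigma1, sigma2 = 1.0, 2.0          # illustrative widths
    rng = np.random.default_rng(4)
    x1 = rng.normal(0.0, sigma1, size=2_000_000)
    x2 = rng.normal(0.0, sigma2, size=2_000_000)
    y = x1 + x2

    print("sampled variance of Y:", np.var(y))
    print("sigma1^2 + sigma2^2  :", sigma1**2 + sigma2**2)
    # A histogram of y is itself Gaussian with this variance, which is the
    # analytic result of the convolution in Eq. (1.30).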
Discrete-time random walk A drunkard is stepping to the left and to the right with equal probabilities:

    Xi = ±1 ,        p(+1) = p(−1) = 1/2 .
Thus ⟨Xi⟩ = 0 and ⟨Xi^2⟩ = 1. Let Y be his position (net displacement from the origin) after N steps:

    Y = X1 + X2 + · · · + XN .

Find the probability distribution of Y.
We have:

    ⟨Y⟩ = ⟨X1⟩ + ⟨X2⟩ + · · · + ⟨XN⟩ = 0 ,    (1.32)
    ⟨Y^2⟩ = N ⟨Xi^2⟩ = N .    (1.33)

Thus

    ⟨(Y/N)^2⟩ = 1/N → 0 as N → ∞ ,    (1.34)

i.e. the variance of the mean goes to zero as N goes to infinity.
To find the probability distribution of Y we use the characteristic function,

    G_Y(k) = [G_{Xi}(k)]^N = ( (1/2) e^{ik} + (1/2) e^{−ik} )^N .    (1.35)
The probability that Y has the value n is the coefficient of e^{ink} in the expansion of (1.35):

    p_n(N) = (1/2^N) \binom{N}{(N − n)/2} .    (1.36)
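A direct simulation of the walk (a short added sketch, not part of the original notes; the values of N and the number of trials are arbitrary) confirms both ⟨Y^2⟩ = N and the binomial form (1.36).

    import numpy as np
    from math import comb

    N = 10
    trials = 500_000
    rng = np.random.default_rng(5)

    # Each step is +1 or -1 with equal probability; Y is the position after N steps
    steps = rng.choice([-1, 1], size=(trials, N))
    Y = steps.sum(axis=1)

    print("<Y^2> sampled:", np.mean(Y**2), "  expected:", N)

    for n in (-2, 0, 2, 4):                    # Y has the same parity as N
        empirical = np.mean(Y == n)
        exact = comb(N, (N - n) // 2) / 2**N
        print(f"p_{n}(N): sampled {empirical:.4f}   formula {exact:.4f}")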
Transformation of variables: if Y = f(X) and the inverse function x = g(y) is single-valued, then

    P_Y(y) = P_X(x) |dx/dy| = P_X(g(y)) |g′(y)| .

More generally, X and Y can have different numbers of components. In that case one should use

    P_Y(y) = ∫ δ(f(x) − y) P_X(x) dx .
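As a simple added illustration of the one-variable rule (not from the original notes), take X uniform on (0, 1) and Y = X^2; the inverse is x = g(y) = √y, so P_Y(y) = |g′(y)| = 1/(2√y) on (0, 1). The sketch compares a histogram of Y with this prediction.

    import numpy as np

    rng = np.random.default_rng(6)
    x = rng.uniform(0.0, 1.0, size=2_000_000)
    y = x**2

    # Compare a histogram of Y with the predicted density P_Y(y) = 1/(2*sqrt(y))
    hist, edges = np.histogram(y, bins=50, range=(0.0, 1.0), density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    predicted = 1.0 / (2.0 * np.sqrt(centers))

    for c, h, p in list(zip(centers, hist, predicted))[5::10]:
        print(f"y={c:.2f}:  histogram {h:.3f}   predicted {p:.3f}")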
Example: the distribution of the kinetic energy E = (1/2)mv^2 of a molecule whose velocity obeys the Maxwell distribution is

    P(E) = ∫ δ( (1/2)mv^2 − E ) (m/(2πkT))^{3/2} e^{−mv^2/(2kT)} dv_x dv_y dv_z .

By writing dv_x dv_y dv_z = d^3v = v^2 sin θ dθ dϕ dv and performing the integration one gets

    P(E) = (2/√π) (kT)^{−3/2} √E e^{−E/kT} .
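This result can be checked numerically. The sketch below is an added illustration (units are chosen so that m = kT = 1, an arbitrary convention): it samples Maxwellian velocity components and compares a histogram of E with the energy distribution above.

    import numpy as np

    rng = np.random.default_rng(7)
    m = kT = 1.0                       # units chosen so that m = kT = 1
    n = 2_000_000

    # Maxwell velocity distribution: each component is Gaussian with variance kT/m
    v = rng.normal(0.0, np.sqrt(kT / m), size=(n, 3))
    E = 0.5 * m * np.sum(v**2, axis=1)

    hist, edges = np.histogram(E, bins=40, range=(0.0, 5.0), density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    analytic = 2.0 / np.sqrt(np.pi) * kT**-1.5 * np.sqrt(centers) * np.exp(-centers / kT)

    for c, h, a in list(zip(centers, hist, analytic))[4::8]:
        print(f"E={c:.2f}:  histogram {h:.3f}   P(E) {a:.3f}")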
Exercise 1.14 Let X have the range (0, 2π) and constant probability density in that range. Find the distribution of Y = sin X. Also find it when P(x) = A + B sin x, with |B| < A = 1/(2π).
Exercise 1.15 A point lies on a circle at a random position with uniform distribution
along the circle. What is the probability distribution of its azimuth?
Exercise 1.16 A scattering center is bombarded by a homogeneous beam of particles.
A particle that comes in with impact parameter b is deflected by an angle θ(b). Find the
differential cross-section dσ/dΩ.
The total cross-section is

    σ = ∮_{4π} (dσ/dΩ) dΩ = ∫_0^{2π} dϕ ∫_0^{π} (dσ/dΩ) sin θ dθ .
Exercise 1.17 Show that a Lorentz distribution on the frequency scale is also a Lorentz distribution on the wavelength scale.
The Gaussian distribution: consider the general Gaussian form

    P(x) = C exp( −(1/2)A x^2 − B x )        (A > 0) .

Normalization gives

    C = (A/(2π))^{1/2} e^{−B^2/(2A)} .    (1.38)

By putting µ1 = −B/A and σ^2 = 1/A one gets the Gaussian distribution in the standard form

    P(x) = (2πσ^2)^{−1/2} exp( −(x − µ1)^2/(2σ^2) ) .    (1.39)

The cumulants:

    κ1 = µ1 ,    κ2 = σ^2 ,    κ3 = κ4 = · · · = 0 .
1.11 The central limit theorem
Let X1, X2, . . . , XN be a set of N independent random variables, each having the same probability density PX(x) with zero mean and finite variance σ^2. Let

    Y = (X1 + X2 + · · · + XN) / √N .    (1.41)
The central limit theorem states that even when PX(x) is not Gaussian, in the limit N → ∞ the distribution of Y becomes Gaussian:

    P_Y(y) = (2πσ^2)^{−1/2} exp( −y^2/(2σ^2) ) .    (1.42)
This theorem explains the dominant role of the Gaussian distribution in all fields of
statistics.
Proof: the characteristic function of each Xi is

    G_X(k) = ∫ e^{ikx} P_X(x) dx = 1 − (1/2)σ^2 k^2 + O(k^3) ,    (1.43)

so that

    G_Y(k) = [G_X(k/√N)]^N = [1 − σ^2 k^2/(2N) + O(N^{−3/2})]^N → e^{−σ^2 k^2/2}    as N → ∞ ,    (1.44)

which is the characteristic function of the Gaussian (1.42).
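The convergence is easy to see in a simulation. The sketch below is an added illustration (it uses uniform variables on (−1/2, 1/2), an arbitrary non-Gaussian choice with σ^2 = 1/12): the variance of Y stays at σ^2, while the excess kurtosis, which vanishes for a Gaussian, decays with N.

    import numpy as np

    rng = np.random.default_rng(8)
    trials = 500_000
    sigma2 = 1.0 / 12.0              # variance of a uniform variable on (-1/2, 1/2)

    for N in (1, 4, 16, 64):
        x = rng.uniform(-0.5, 0.5, size=(trials, N))
        y = x.sum(axis=1) / np.sqrt(N)
        var = np.var(y)
        excess_kurt = np.mean(y**4) / var**2 - 3.0   # zero for a Gaussian
        print(f"N={N:3d}:  variance {var:.4f} (exact {sigma2:.4f})   excess kurtosis {excess_kurt:+.4f}")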
Exercise 1.18 Verify by explicit calculation that for Lorentzian variables the central limit theorem does not hold (the proof breaks down).
In fact, if the Xi are independent and Lorentzian, then Y is again Lorentzian (i.e. not Gaussian)!
A minimal smoothness condition for G(k) is needed for the central limit theorem, namely
that its second derivative at the origin exists.
All cumulants of the Poisson distribution are equal to the mean value ⟨N⟩: the characteristic function is G(k) = exp[a(e^{ik} − 1)], so ln G(k) = a Σ_{m=1}^{∞} (ik)^m/m! and hence κ_m = a = ⟨N⟩ for every m.
Note: the Poisson distribution also applies to random dots in two- and three-dimensional space.
Exercise 1.19 N1 and N2 are two statistically independent variables, each with a Poisson
distribution. Show that their sum N1 + N2 is again Poissonian.