
Statistical Mechanics

Lecture Notes

USTH 2024
Lecturer:
Trinh Xuan Hoang
Institute of Physics, Vietnam Academy of Science and Technology
Email: txhoang@iop.vast.vn

Books:
The lecture notes are prepared based on the following books:

• N. G. Van Kampen, Stochastic Processes in Physics and Chemistry, 3rd edition, Elsevier, Amsterdam, 2007.

• K. Huang, Statistical Mechanics, 2nd edition, John Wiley & Sons, New York, 1987.

• K. Huang, Introduction to Statistical Mechanics, Taylor & Francis, London, 2001.

Students are advised to read these books for further details.

Plan of the course
Chapter 1 Elements of probability theory (2 lectures)

Chapter 2 The laws of thermodynamics (2 lectures)

Chapter 3 Some applications of thermodynamics (1 lecture)

Chapter 4 Kinetic theory of gases (2 lectures)

Chapter 5 Classical statistical mechanics (2 lectures)

Chapter 6 Quantum statistical mechanics (2 lectures)

Chapter 7 Some applications of quantum statistical mechanics (2 lectures)

2-3 homework assignments
1 midterm test
1 final exam (oral)

A brief introduction
Statistical Mechanics (SM) is considered the third pillar of modern physics, alongside quantum theory and relativity theory.
SM is a physical theory that aims to explain the macroscopic behavior of physical systems from the equations of motion of their microscopic constituents, together with probabilistic assumptions made about them.
SM is based on the laws of mechanics (classical and quantum) and statistical methods.
SM allows the calculation of macroscopic (bulk) properties of substances from the micro-
scopic properties of the molecules and their interactions.
SM is applied in many areas of physics, chemistry and biology.

Chapter 1

Elements of probability theory

1.1 Probability concept


An event (w), a set of events (A), the set of all events (Ω).
w ∈ A means that the event w belongs to the set A. Of course, w ∈ Ω.
A is a subset of Ω, or A ⊂ Ω.
The null set ∅ is the set containing no events.
The intersection A ∩ B is the set of all the events that belong to both A and B.
If A ∩ B = ∅ then A and B are said to be disjoint.
The union A ∪ B is the set of all the events that belong to A or B or both of them.
Examples: tossing a coin, casting a die etc.
Probability axioms: The probability P (A) that an arbitrary event w belongs to the
set A (w ∈ A) must satisfy the following three axioms
(i) P(A) ≥ 0 ∀A ⊂ Ω,
(ii) P(Ω) = 1,
(iii) If Aᵢ ∩ Aⱼ = ∅ ∀i ≠ j then P(∪ᵢ Aᵢ) = Σᵢ P(Aᵢ).
Consequences of the axioms:
If Ā = Ω − A (Ā is the complement of A) then P(Ā) = 1 − P(A).
P(∅) = 0.
P(A ∪ B) = Prob[(w ∈ A) or (w ∈ B)]. If A ∩ B = ∅ then P(A ∪ B) = P(A) + P(B).
How to observe the probability?
- Toss a coin N times one after another independently, or
- Toss N coins at the same time independently.
As N → ∞, the fraction of tosses that give heads tends to 1/2.
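
This frequency interpretation can be illustrated numerically; the following is a minimal sketch (not part of the original notes), with arbitrary variable names:

    import numpy as np

    rng = np.random.default_rng(0)
    for N in (100, 10_000, 1_000_000):
        tosses = rng.integers(0, 2, size=N)   # 1 = heads, 0 = tails, equally likely
        print(N, tosses.mean())               # fraction of heads approaches 1/2 as N grows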
The probabilities P (A) on the set Ω as defined by the axioms are called a priori proba-
bilities.
In a gas, the probability of finding a molecule at an exact position r is zero, but the
probability of finding a molecule in a small volume ∆V is non-zero.

1.2 Random variables
A random variable X is defined by

a) a set of possible values

b) a probability distribution over this set.

Ad a. Let x be a value of X. Example: tossing a coin gives x = 1 (heads) and x = 0 (tails); the set of possible values contains 0 and 1. Throwing a die gives a set of 6 possible values.
Note: A set can be discrete, continuous, partially continuous or multidimensional.
Ad b. P (x) is the probability distribution of X, also called the probability density.

P(x) ≥ 0

∫ P(x) dx = 1

P(x)dx is the probability that X has a value between x and x + dx.


It is possible to include delta functions in P(x):

P(x) = Σₙ pₙ δ(x − xₙ) + P̃(x)     (1.1)

with pₙ > 0 and

∫ P(x) dx = Σₙ pₙ + ∫ P̃(x) dx = 1

If P̃ (x) = 0 then P (x) is discrete.


Cumulative distribution function:

ℙ(x) = ∫₋∞^(x+0) P(x′) dx′     (1.2)

ℙ(x) is a monotonically non-decreasing function, with ℙ(−∞) = 0 and ℙ(∞) = 1.


Exercise 1.1 Let X be the number of points obtained by casting a die. Give its range
and probability distribution. Same question for casting two dice.
Exercise 1.2 Flip a coin N times. Prove that the probability that heads turns up exactly n times is

pₙ = 2⁻ᴺ (N choose n) ,   n = 0, 1, 2, . . . , N     (1.3)

This is called the binomial distribution. Here

(N choose n) = N! / [n!(N − n)!]

reads as “N choose n” and is called the binomial coefficient.
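
A quick numerical check of Eq. (1.3) (a sketch, not part of the notes; it uses SciPy's comb for the binomial coefficient):

    import numpy as np
    from scipy.special import comb

    rng = np.random.default_rng(1)
    N, trials = 10, 200_000
    heads = rng.integers(0, 2, size=(trials, N)).sum(axis=1)   # number of heads in each experiment
    for n in range(N + 1):
        empirical = np.mean(heads == n)
        exact = 2.0**(-N) * comb(N, n)                         # Eq. (1.3)
        print(n, round(empirical, 4), round(exact, 4))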

Exercise 1.3 An opinion poll is conducted in a country with many political parties. How large a sample is needed to be reasonably sure that a party polling at 5% will show up in it with a percentage between 4.5% and 5.5%?
Exercise 1.4 Two volumes V₁ and V₂ are connected by a hole which allows a gas of N molecules to move freely from one volume to the other and vice versa. Prove that the probability that the volume V₁ contains exactly n molecules is

pₙ = (1 + γ)⁻ᴺ γⁿ (N choose n)

where γ = V₁/V₂. This is called the “Bernoulli distribution”.

1.3 Averages
The average or expectation value of any function f(X) is defined as

⟨f(X)⟩ = ∫ f(x) P(x) dx     (1.4)

Note that ⟨f + g⟩ = ⟨f⟩ + ⟨g⟩. The averages

⟨Xᵐ⟩ = µₘ

are called the m-th moments of X.

µ₁ = ⟨X⟩ is called the mean value,

σ² = ⟨(X − ⟨X⟩)²⟩ = µ₂ − µ₁² is called the variance or dispersion,

σ is called the standard deviation.
Note that not all distributions have a finite mean and finite variance. An example is the Lorentz or Cauchy distribution:

P(x) = (1/π) γ / [(x − a)² + γ²]     (−∞ < x < ∞)     (1.5)

which has neither a finite mean nor a finite variance.
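
To illustrate this (a sketch, not from the notes), the running sample mean of Cauchy-distributed numbers never settles down, in contrast to distributions with a finite mean:

    import numpy as np

    rng = np.random.default_rng(2)
    x = rng.standard_cauchy(1_000_000)                     # Lorentz/Cauchy samples (a = 0, γ = 1)
    running_mean = np.cumsum(x) / np.arange(1, x.size + 1)
    print(running_mean[999], running_mean[99_999], running_mean[-1])   # keeps fluctuating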


Exercise 1.5 Find the moments of a square distribution: P(x) = 0 for |x| > a and P(x) = (2a)⁻¹ for |x| ≤ a.
Exercise 1.6 The Gaussian distribution is defined by

P(x) = (2π)^(−1/2) e^(−x²/2)     (1.6)

Show that

µ₂ₙ₊₁ = 0
µ₂ₙ = (2n − 1)!! = (2n − 1)(2n − 3) · · · 1

1.4 Characteristic functions
The characteristic function of a random variable X whose range I is a subset of the real numbers is defined by

G(k) = ⟨e^(ikX)⟩ = ∫_I e^(ikx) P(x) dx     (1.7)

One can always redefine P(x) to include the whole real axis:

G(k) = ⟨e^(ikX)⟩ = ∫₋∞^(+∞) e^(ikx) P(x) dx ,     (1.8)

thus G(k) is a Fourier transform of P(x). G(k) exists for all real k, with

G(0) = 1 and |G(k)| ≤ 1

G(k) is the moment generating function:

G(k) = Σ_{m=0}^∞ [(ik)ᵐ/m!] µₘ ,     (1.9)

where µₘ are the m-th moments of the distribution. This can be shown by making a Taylor expansion of G(k):

G(k) = Σ_{m=0}^∞ (kᵐ/m!) [∂ᵐG/∂kᵐ]_{k=0} .     (1.10)

Also using a Taylor expansion one gets

log G(k) = Σ_{m=1}^∞ [(ik)ᵐ/m!] κₘ ,     (1.11)

where the κₘ are called the cumulants. The cumulants are an alternative to the moments; both are defined by the characteristic function. Knowing the cumulants one can calculate the moments and vice versa.

κ₁ = µ₁     (1.12)
κ₂ = µ₂ − µ₁² = σ²     (1.13)
κ₃ = µ₃ − 3µ₂µ₁ + 2µ₁³     (1.14)
κ₄ = µ₄ − 4µ₃µ₁ − 3µ₂² + 12µ₂µ₁² − 6µ₁⁴     (1.15)
...
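
A small numerical sanity check of Eqs. (1.12)-(1.15) (a sketch, not part of the notes): estimate the first four moments from a sample and combine them into cumulants; κ₂ should agree with the sample variance.

    import numpy as np

    rng = np.random.default_rng(3)
    x = rng.exponential(scale=1.0, size=1_000_000)   # any non-Gaussian sample will do
    mu = [np.mean(x**m) for m in range(1, 5)]        # mu_1 ... mu_4
    k1 = mu[0]
    k2 = mu[1] - mu[0]**2
    k3 = mu[2] - 3*mu[1]*mu[0] + 2*mu[0]**3
    k4 = mu[3] - 4*mu[2]*mu[0] - 3*mu[1]**2 + 12*mu[1]*mu[0]**2 - 6*mu[0]**4
    print(k1, k2, k3, k4)
    print(np.var(x))                                 # should be close to k2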

Lattice distribution For the lattice distribution

P(x) = Σₙ pₙ δ(x − na) ,   n = . . . , −2, −1, 0, 1, 2, . . .     (1.16)

one gets the characteristic function

G(k) = Σₙ pₙ e^(ikna) ,     (1.17)

which is a periodic function with period 2π/a. For lattice distributions, G(k) = 1 not only at k = 0 but also at the multiple values k = 2πn/a. For non-lattice distributions, G(k) = 1 only for k = 0 and |G(k)| < 1 for k ≠ 0.

Exercise 1.7 Compute the characteristic function of the square distribution in Exercise
1.5 and find its moments in this way.
Exercise 1.8 Show that for the Gaussian distribution all cumulants beyond the second
one are zero.
Exercise 1.9 Given the Poisson distribution

pₙ = (aⁿ/n!) e⁻ᵃ     (a > 0)

with n = 0, 1, 2, . . . Find its cumulants.

1.5 Joint and conditional probabilities


Joint probability: Consider P(A ∩ B) where A ∩ B ≠ ∅ (not empty).

P (A ∩ B) = Prob{(w ∈ A) and (w ∈ B)}

is called the joint probability.


Examples:
1) The probability of finding m rabbits and n tigers in a given area of the forest is the
joint probability of finding m rabbits in this area and n tigers in this area.
2) The probability that at time t₁ there are n₁ particles in the volume ∆V and at time t₂ there are n₂ particles in this volume.
Conditional probability: The probability that an arbitrary event w belongs to A, given that w ∈ B, is called the conditional probability and is given by

P(A|B) = P(A ∩ B) / P(B)     (1.18)

From this definition we get

P (A ∩ B) = P (A|B)P (B) = P (B|A)P (A) (1.19)

Example: The probability that a particle is found in the volume d³r about the position r₂ at time t₂, given that it was found in the volume d³r about the position r₁ at time t₁.
Independence: If two sets A and B represent two independent sets of events (events in
B have no influence on the probabilities of events in A and vice versa), then

P (A|B) = P (A) and P (B|A) = P (B).

It follows that for independent sets of events A and B:

P (A ∩ B) = P (A)P (B).
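
For instance (a sketch, not in the original notes), with two fair dice the events A = {first die shows 6} and B = {second die is even} are independent, and P(A ∩ B) = P(A)P(B) can be checked by enumeration:

    from itertools import product

    outcomes = list(product(range(1, 7), repeat=2))   # all 36 equally likely outcomes
    A = {w for w in outcomes if w[0] == 6}            # first die shows 6
    B = {w for w in outcomes if w[1] % 2 == 0}        # second die is even
    P = lambda S: len(S) / len(outcomes)
    print(P(A & B), P(A) * P(B))                      # both equal 1/12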

1.6 Joint and conditional probability distribution
The joint probability distribution of two random variables X and Y is denoted by P(x, y), which is also called a bivariate distribution.
From the joint probability distribution, one can obtain the marginal distributions of the individual variables:

P₁(x) = ∫ P(x, y) dy

P₂(y) = ∫ P(x, y) dx .

If we fix the value of Y, then

P(x|y) = P(x, y) / P₂(y)

is the conditional probability distribution. The normalization condition for P(x|y) can be checked as follows:

∫ P(x|y) dx = ∫ [P(x, y)/P₂(y)] dx = [1/P₂(y)] ∫ P(x, y) dx = P₂(y)/P₂(y) = 1 .

If X and Y are statistically independent then

P (x, y) = P1 (x)P2 (y) (1.20)


P (x|y) = P1 (x) (1.21)
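
The marginal and conditional densities can also be computed numerically; the sketch below (not from the notes) uses an arbitrary smooth bivariate density, normalizes it on a grid, and checks that P(x|y) integrates to 1:

    import numpy as np

    x = np.linspace(-6, 6, 601)
    y = np.linspace(-6, 6, 601)
    dx = x[1] - x[0]
    X, Y = np.meshgrid(x, y, indexing="ij")
    P = np.exp(-(X**2 + X*Y + Y**2))
    P /= P.sum() * dx * dx                 # normalize the joint density P(x, y)

    P2 = P.sum(axis=0) * dx                # marginal P2(y) = ∫ P(x, y) dx
    j = 300                                # grid index of y = 0
    P_cond = P[:, j] / P2[j]               # conditional P(x | y = 0)
    print(P_cond.sum() * dx)               # ≈ 1, as required by normalization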

Exercise 1.10: Compute the marginal and conditional probabilities for the following ring-shaped bivariate distribution

P(x₁, x₂) = π⁻¹ δ(x₁² + x₂² − a²)

Exercise 1.11: Two dice are thrown and the outcome is 9. What is the probability
distribution of the points on the first die conditional on this given total?

1.7 Multivariate distribution


The joint probability distribution of r random variables X₁, X₂, . . . , Xᵣ,

P(x₁, x₂, . . . , xᵣ) ,

is a multivariate distribution. The moments are defined by

⟨X₁^m₁ X₂^m₂ · · · Xᵣ^mᵣ⟩ = ∫ x₁^m₁ x₂^m₂ · · · xᵣ^mᵣ P(x₁, x₂, . . . , xᵣ) dx₁ dx₂ · · · dxᵣ     (1.22)

The characteristic function is

G(k₁, k₂, . . . , kᵣ) = ⟨e^(i(k₁X₁ + k₂X₂ + . . . + kᵣXᵣ))⟩ .     (1.23)

The Taylor expansion of G generates the moments:

G(k₁, k₂, . . . , kᵣ) = Σ_{m₁,...,mᵣ} [(ik₁)^m₁ (ik₂)^m₂ · · · (ikᵣ)^mᵣ / (m₁! m₂! · · · mᵣ!)] ⟨X₁^m₁ X₂^m₂ · · · Xᵣ^mᵣ⟩     (1.24)

Generation of the cumulants:

log G(k₁, k₂, . . . , kᵣ) = Σ_{m₁,...,mᵣ} [(ik₁)^m₁ (ik₂)^m₂ · · · (ikᵣ)^mᵣ / (m₁! m₂! · · · mᵣ!)] ⟨⟨X₁^m₁ X₂^m₂ · · · Xᵣ^mᵣ⟩⟩     (1.25)

The above summations exclude the term in which all mᵢ = 0. The cumulants are denoted by the double brackets ⟨⟨ ⟩⟩.
The second cumulants form an r × r matrix, called the covariance matrix, with elements given by

⟨⟨Xᵢ Xⱼ⟩⟩ = ⟨(Xᵢ − ⟨Xᵢ⟩)(Xⱼ − ⟨Xⱼ⟩)⟩ = ⟨Xᵢ Xⱼ⟩ − ⟨Xᵢ⟩⟨Xⱼ⟩     (1.26)
The off-diagonal elements of the covariance matrix are called covariances. After being normalized, they are called the correlation coefficients:

ρᵢⱼ = ⟨⟨Xᵢ Xⱼ⟩⟩ / √(⟨⟨Xᵢ²⟩⟩ ⟨⟨Xⱼ²⟩⟩) = (⟨Xᵢ Xⱼ⟩ − ⟨Xᵢ⟩⟨Xⱼ⟩) / √[(⟨Xᵢ²⟩ − ⟨Xᵢ⟩²)(⟨Xⱼ²⟩ − ⟨Xⱼ⟩²)]     (1.27)

with −1 ≤ ρᵢⱼ ≤ 1.
For r = 2: if X₁ and X₂ are statistically independent then
(i) ⟨X₁^m₁ X₂^m₂⟩ = ⟨X₁^m₁⟩⟨X₂^m₂⟩, which means that all moments factorize;
(ii) G(k₁, k₂) = G₁(k₁) G₂(k₂), which means that the characteristic function factorizes;
(iii) ⟨⟨X₁^m₁ X₂^m₂⟩⟩ = 0 for m₁ ≠ 0 and m₂ ≠ 0.
Uncorrelated vs. independent: X₁ and X₂ are called uncorrelated when the covariance ⟨⟨X₁X₂⟩⟩ = 0, which is a weaker condition than statistical independence.
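
For example (a sketch, not in the original notes): if X₁ is uniform on (−1, 1) and X₂ = X₁², the two variables are dependent, yet their covariance vanishes:

    import numpy as np

    rng = np.random.default_rng(4)
    x1 = rng.uniform(-1.0, 1.0, size=1_000_000)
    x2 = x1**2                              # completely determined by x1, hence dependent
    cov = np.mean(x1 * x2) - np.mean(x1) * np.mean(x2)
    print(cov)                              # ≈ 0: uncorrelated but not independent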
Exercise 1.12 Find the moments and cumulants of the bivariate Gaussian distribution

P(x, y) = const · e^(−(ax² + 2bxy + cy²)/2)

with ac − b² > 0 and a > 0. Show that for this distribution “uncorrelated” and “independent” are equivalent.

1.8 Addition of random variables


Let X₁ and X₂ be random variables with joint probability density P_X(x₁, x₂). The probability that Y = X₁ + X₂ has a value between y and y + ∆y is

P_Y(y) ∆y = ∫∫_{y < x₁+x₂ < y+∆y} P_X(x₁, x₂) dx₁ dx₂ .     (1.28)

Hence,

P_Y(y) = ∫∫ δ(x₁ + x₂ − y) P_X(x₁, x₂) dx₁ dx₂ = ∫ P_X(x₁, y − x₁) dx₁ .     (1.29)
If X₁ and X₂ are independent then

P_Y(y) = ∫ P_{X₁}(x₁) P_{X₂}(y − x₁) dx₁ .     (1.30)
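
Equation (1.30) is a convolution, which can also be evaluated numerically; a small sketch (not part of the notes) for two independent square distributions with a = 1:

    import numpy as np

    a = 1.0
    x = np.linspace(-4, 4, 2001)
    dx = x[1] - x[0]
    p1 = np.where(np.abs(x) <= a, 1.0 / (2 * a), 0.0)   # square density of X1
    p2 = p1.copy()                                      # X2 has the same density
    py = np.convolve(p1, p2, mode="same") * dx          # Eq. (1.30): convolution of the two densities
    print(py.sum() * dx)                                # ≈ 1: the resulting density is normalized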

For the moments,

⟨Y⟩ = ⟨X₁⟩ + ⟨X₂⟩     (1.31)

regardless of whether X₁ and X₂ are independent or not.
If X₁ and X₂ are uncorrelated (⟨⟨X₁X₂⟩⟩ = 0) then

σ_Y² = σ_X₁² + σ_X₂²

or

⟨⟨(X₁ + X₂)²⟩⟩ = ⟨⟨X₁²⟩⟩ + ⟨⟨X₂²⟩⟩ .

The characteristic function satisfies

G_Y(k) = G_{X₁,X₂}(k, k) .

If X₁ and X₂ are independent then G_Y(k) = G_{X₁}(k) G_{X₂}(k).
Exercise 1.13 X₁ and X₂ are independent variables with the Gaussian distributions

P_{X₁}(x₁) = [1/(σ₁√(2π))] exp(−x₁²/(2σ₁²)) ,   P_{X₂}(x₂) = [1/(σ₂√(2π))] exp(−x₂²/(2σ₂²))

Find P_Y(y) for Y = X₁ + X₂.

Discrete-time random walk A drunkard is walking to the left and to the right with equal probabilities:

Xᵢ = ±1 ,   p(+1) = p(−1) = 1/2 .

Thus ⟨Xᵢ⟩ = 0 and ⟨Xᵢ²⟩ = 1. Let Y be his displacement from the origin after N steps:

Y = X₁ + X₂ + . . . + X_N .

Find the probability distribution of Y.
We have:

⟨Y⟩ = ⟨X₁⟩ + ⟨X₂⟩ + . . . + ⟨X_N⟩ = 0     (1.32)

⟨Y²⟩ = N⟨Xᵢ²⟩ = N     (1.33)

Thus

⟨(Y/N)²⟩ = 1/N → 0  as N → ∞ ,     (1.34)

i.e. the variance of the mean goes to zero as N goes to infinity.
To find the probability distribution of Y we use the characteristic function

G_Y(k) = [G_{Xᵢ}(k)]^N = [(1/2) e^(ik) + (1/2) e^(−ik)]^N     (1.35)

The probability that Y has the value n is the coefficient of e^(ink):

⇒ pₙ(N) = 2⁻ᴺ (N choose (N − n)/2)     (1.36)

for (N − n)/2 an integer between 0 and N. Note that pₙ is maximum when n = 0, with

p₀(N) = 2⁻ᴺ (N choose N/2) .
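
A quick simulation of the walk (a sketch, not part of the notes) reproduces Eq. (1.36):

    import numpy as np
    from scipy.special import comb

    rng = np.random.default_rng(5)
    N, walks = 20, 200_000
    steps = rng.choice([-1, 1], size=(walks, N))
    Y = steps.sum(axis=1)                               # displacement of each walk after N steps
    for n in (0, 2, 4):                                 # only n with (N - n)/2 integer can occur
        exact = 2.0**(-N) * comb(N, (N - n) // 2)       # Eq. (1.36)
        print(n, round(np.mean(Y == n), 4), round(exact, 4))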

1.9 Transformation of variables


Suppose that

Y = f(X)

where X is continuous. Examples: Y = log X, Y = 1/X (frequency to wavelength).
The probability that Y has a value between y and y + ∆y is

P_Y(y) ∆y = ∫_{y < f(x) < y+∆y} P_X(x) dx

⇔ P_Y(y) = ∫ δ(f(x) − y) P_X(x) dx

In the simple case where f is invertible, X = f⁻¹(Y) = g(Y), one gets

P_Y(y) = P_X(x) |dx/dy| = P_X(g(y)) |g′(y)|

If X and Y are vectors with the same number of components:

P_Y(y) = P_X(x) |det J| ,

where J is the Jacobian matrix

Jᵢⱼ = ∂xᵢ/∂yⱼ .

More generally, X and Y can have different numbers of components. In that case, one should use the equation

P_Y(y) = ∫ δ(f(x) − y) P_X(x) dx .

Example: The Maxwell distribution


f(v_x, v_y, v_z) = (m/2πkT)^(3/2) e^(−m(v_x² + v_y² + v_z²)/2kT)

is the joint distribution of the three components of the velocity vector, v = (v_x, v_y, v_z). Find the distribution of the kinetic energy E = (1/2)mv².

P(E) = ∫ δ((1/2)mv² − E) (m/2πkT)^(3/2) e^(−mv²/2kT) dv_x dv_y dv_z

By writing dv_x dv_y dv_z = d³v = v² sin θ dθ dϕ dv and doing the integration one gets

P(E) = 2π^(−1/2) (kT)^(−3/2) E^(1/2) e^(−E/kT)
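
This result can be checked by sampling (a sketch, not from the notes; units are chosen so that m = kT = 1): draw Maxwell-distributed velocity components and histogram the kinetic energy.

    import numpy as np

    rng = np.random.default_rng(6)
    m = kT = 1.0
    v = rng.normal(0.0, np.sqrt(kT / m), size=(1_000_000, 3))   # Maxwell velocity components
    E = 0.5 * m * np.sum(v**2, axis=1)                          # kinetic energy samples

    hist, edges = np.histogram(E, bins=50, range=(0, 6), density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    exact = 2 * np.pi**-0.5 * kT**-1.5 * np.sqrt(centers) * np.exp(-centers / kT)
    print(np.max(np.abs(hist - exact)))                         # small: histogram matches P(E)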

Exercise 1.14 Let X have the range (0, 2π) and constant probability density in that range. Find the distribution of Y = sin X. Do the same when P(x) = A + B sin(x) (with |B| < A = 1/2π).
Exercise 1.15 A point lies on a circle at a random position with uniform distribution
along the circle. What is the probability distribution of its azimuth?
Exercise 1.16 A scattering center is bombarded by a homogeneous beam of particles.
A particle that comes in with impact parameter b is deflected by an angle θ(b). Find the
differential cross-section dσ/dΩ.
The total cross-section is

σ = ∮_{4π} (dσ/dΩ) dΩ = ∫₀^{2π} ∫₀^π (dσ/dΩ) sin θ dθ dϕ

Exercise 1.17 Show that a Lorentz distribution in the frequency scale is also a Lorentz distribution in the wavelength scale.

1.10 The Gaussian distribution


The general form of a Gaussian distribution is

P(x) = C e^(−Ax²/2 − Bx)     (−∞ < x < ∞)     (1.37)

Normalization gives

C = (A/2π)^(1/2) e^(−B²/2A)     (1.38)
By putting µ₁ = −B/A and σ² = 1/A one gets the Gaussian distribution of the form

P(x) = (2πσ²)^(−1/2) exp[−(x − µ₁)²/(2σ²)] ,     (1.39)

which is also called the normal distribution after Gauss.


The characteristic function is

G(k) = e^(iµ₁k − σ²k²/2)     (1.40)

The cumulants:

κ₁ = µ₁ ,   κ₂ = σ² ,   κ₃ = κ₄ = . . . = 0

1.11 The central limit theorem
Let X₁, X₂, . . . , X_N be a set of N independent random variables, having the same probability density P_X(x) with zero mean and finite variance σ². Let

Y = (X₁ + X₂ + . . . + X_N)/√N .     (1.41)

The central limit theorem states that even when P_X(x) is not Gaussian, in the limit N → ∞ the distribution of Y is Gaussian:

P_Y(y) = (2πσ²)^(−1/2) exp[−y²/(2σ²)] .     (1.42)

This theorem explains the dominant role of the Gaussian distribution in all fields of
statistics.
Proof:

G_X(k) = ∫ e^(ikx) P_X(x) dx = 1 − σ²k²/2 + O(k³)     (1.43)

G_Y(k) = [G_X(k/√N)]^N = [1 − σ²k²/(2N) + O(N^(−3/2))]^N → e^(−σ²k²/2)  as N → ∞     (1.44)
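
A quick demonstration (a sketch, not in the notes): sums of uniform variables, scaled as in (1.41), rapidly approach the Gaussian (1.42).

    import numpy as np

    rng = np.random.default_rng(7)
    N, samples = 50, 200_000
    x = rng.uniform(-1.0, 1.0, size=(samples, N))   # zero mean, variance sigma^2 = 1/3
    y = x.sum(axis=1) / np.sqrt(N)                  # Eq. (1.41)
    print(y.mean(), y.var())                        # ≈ 0 and ≈ 1/3
    hist, edges = np.histogram(y, bins=40, range=(-2, 2), density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    gauss = (2 * np.pi / 3)**-0.5 * np.exp(-centers**2 / (2 / 3))   # Eq. (1.42) with sigma^2 = 1/3
    print(np.max(np.abs(hist - gauss)))             # small deviation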
Exercise 1.18 Verify by explicit calculation that for Lorentzian variables the central limit theorem fails (the proof breaks down). In fact, if the Xᵢ are independent and Lorentzian, then Y is again Lorentzian (i.e. not Gaussian)!
A minimal smoothness condition for G(k) is needed for the central limit theorem, namely
that its second derivative at the origin exists.

1.12 The Poisson distribution


Many processes in physics can be described as point processes: random dots, or random sets of points.
The probability that exactly N independent random dots fall in a given interval is

p_N = (⟨N⟩^N / N!) e^(−⟨N⟩)     (1.45)
This is the Poisson distribution.
The second moment is

⟨N²⟩ = ⟨N⟩² + ⟨N⟩

All cumulants of the Poisson distribution are equal to the mean value ⟨N ⟩.
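
A quick sample-based check of these statements (a sketch, not from the notes):

    import numpy as np

    rng = np.random.default_rng(8)
    mean = 3.7
    n = rng.poisson(lam=mean, size=1_000_000)
    print(np.mean(n**2), mean**2 + mean)   # second moment ≈ <N>^2 + <N>
    print(np.var(n), mean)                 # second cumulant (variance) ≈ <N>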
Note: The Poisson distribution also applies to random dots in two- and three-dimensional space.
Exercise 1.19 N1 and N2 are two statistically independent variables, each with a Poisson
distribution. Show that their sum N1 + N2 is again Poissonian.
