Notes

Uploaded by

Tatev Chalyan
Copyright
© © All Rights Reserved
Notes on

INTRODUCTION
TO QUANTUM FIELD THEORY

Roberto Bonciani1
Dipartimento di Fisica, Università di Roma “La Sapienza”
e INFN Sezione di Roma,
Piazzale Aldo Moro 2,
00185 Roma

Academic Years 2021-2022, 2022/2023

¹ Email: roberto.bonciani@roma1.infn.it
Contents

1 Necessity of a Theory of Fields
1.1 Introduction
1.2 Summary of the quantization procedure
1.3 One-dimensional chain
1.3.1 Limit to the continuum
1.3.2 Quantization of the vibrating string
1.3.3 Fock space and phonons
1.3.4 Commutation relations in the continuum
1.3.5 Normal ordering

2 Special Relativity
2.1 Notes on Special Relativity
2.1.1 Simultaneous events
2.1.2 Causal structure of the Space-Time
2.1.3 Lorentz transformations: Boosts
2.1.4 Boost in a general direction
2.1.5 Transformation of the three-velocity
2.2 Kinematics of the classical particle
2.2.1 Four-velocity and four-acceleration
2.2.2 Four-momentum
2.3 Vectors and Tensors
2.3.1 Vectors and Contravariant Components
2.3.2 Dual vectors and covariant components
2.3.3 Vectors and Tensors in Differential Form
2.4 Minkowski Space
2.5 Lorentz group
2.6 Poincaré group
2.7 Infinitesimal Transformations
2.8 Some notes on Group Theory
2.8.1 Representations
2.8.2 Lie groups
2.8.3 A simple example: the (abelian) groups SO(2) and U(1)
2.9 The generators of the Poincaré group and the algebra
2.10 Finite-dimensional irreducible representations of the Poincaré group
2.10.1 Tensor fields. Integer-spin representations
2.10.2 Spinor fields. Dirac spinors
2.11 Infinite-dimensional representations of the Poincaré group: particle states

3 Conservation Laws
3.1 Lagrangian formalism
3.1.1 Relativistic free particle
3.1.2 Euler-Lagrange Equations
3.1.3 Conservation Laws
3.2 Lagrangian formalism for the vibrating string
3.3 Lagrangian formalism: relativistic fields
3.4 Hamilton’s principle and the equations of motion
3.5 Global symmetries and Noether’s theorem
3.5.1 Geometric symmetries. Lorentz transformations
3.5.2 Scalar field and conservation of four-momentum and orbital angular momentum
3.5.3 Global internal symmetries

4 Free Fields
4.1 The Klein-Gordon Field (classical field)
4.1.1 The Klein-Gordon equation
4.1.2 Plane wave solutions of the Klein-Gordon equation
4.1.3 Lagrangian density of the Klein-Gordon real field
4.1.4 Hamiltonian
4.1.5 Complex scalar field and the charge
4.1.6 Non relativistic limit
4.1.7 The two-component form
4.2 Quantization of the Klein-Gordon field
4.2.1 Real field
4.2.2 Complex field
4.2.3 Locality and causality in QFT
4.3 The Dirac Field (classical field)
4.3.1 The Dirac equation
4.3.2 α_i and β matrices
4.3.3 Covariance of the Dirac equation
4.3.4 Unitarity and Dirac adjoint
4.3.5 Probability density
4.3.6 Lagrangian and Hamiltonian densities
4.3.7 Conserved quantities
4.3.8 The matrix γ_5
4.3.9 Bilinear covariants
4.3.10 Algebra of the γ^μ matrices and γ_5
4.3.11 Plane wave solutions
4.3.12 Energy projectors and polarization sum
4.3.13 Spin projectors
4.3.14 Non relativistic limit of Dirac’s equation
4.3.15 Parity
4.3.16 Time Reversal
4.3.17 Charge Conjugation
4.3.18 CPT transformation
4.3.19 Massless fermionic field: the neutrino
4.4 Quantization of the Dirac Field
4.5 The Electromagnetic Field (classical field)
4.5.1 Covariant form of Maxwell’s equations
4.5.2 Electromagnetic tensor
4.5.3 Lagrangian density of the electromagnetic field
4.5.4 Energy-Momentum tensor
4.5.5 Number of degrees of freedom
4.6 Quantization of the Electromagnetic Field
4.7 Propagator of the Klein-Gordon field
4.7.1 Closed paths and residues
4.7.2 Open paths
4.8 Propagator of the Dirac field
4.9 Propagator of the Electromagnetic field

5 Cross Section and Decay Rate
5.1 From transition amplitude to probability
5.2 Cross Section
5.3 Decay Rate
5.3.1 Two-body phase space
5.4 The process e⁺ + e⁻ → μ⁺ + μ⁻
5.4.1 Modulus Squared of the Transition Amplitude
5.4.2 Kinematics
5.4.3 Flux Factor
5.4.4 Cross Section

Chapter 1

Necessity of a Theory of Fields

1.1 Introduction
Non-relativistic Quantum Mechanics (NRQM), developed from the beginning of the last century until ∼ 1926, is a theory devoted to the study of a single particle. A wave function ψ(x, t) is associated with the particle, and its time evolution is determined by the wave equation
\[
i\hbar\,\frac{\partial}{\partial t}\,\psi(x,t) = H\,\psi(x,t) , \tag{1.1}
\]
which, for $H = \frac{p^2}{2m} + V$, is the well-known non-relativistic Schrödinger equation for a particle moving in a potential V. The modulus squared of the wave function, $|\psi(x,t)|^2$, is interpreted as the probability density of finding the particle at x at time t. Such a theory abandons classical determinism in favor of an intrinsically statistical treatment of microscopic physical phenomena. However, this theory is not yet completely satisfactory, since it is not general enough to cover the case in which the particle’s speed is close to the speed of light. In other words, it does not include Special Relativity and is therefore valid only for velocities much smaller than c. Another crucial point is that in NRQM one can study only transition amplitudes in which the number of particles is the same in the initial and final states. It cannot describe, for instance, a scattering process in which two colliding particles have enough energy to produce final-state particles different from the initial ones. The possibility of such processes is a feature that a relativistic theory must have, because of the equivalence of mass and energy. We therefore have to find a theory that is much more flexible and general than NRQM, that includes more of the known processes, and that has Schrödinger’s theory as its non-relativistic limit.
The first attempt to include Special Relativity in quantum mechanics was the search for a relativistically “correct” evolution equation, and it led to the so-called Klein-Gordon equation (Schrödinger himself worked on this equation in the same years, or even before writing his famous article on wave mechanics). Eq. (1.1) can be derived from the non-relativistic energy-momentum relation
\[
E = \frac{p^2}{2m} , \tag{1.2}
\]
with the correspondences
\[
E \to i\hbar\,\frac{\partial}{\partial t} , \qquad p \to -i\hbar\nabla . \tag{1.3}
\]
We can therefore try to include Special Relativity in quantum mechanics by using the correct relativistic energy-momentum relation
\[
\frac{E^2}{c^2} = p^2 + m^2 c^2 , \tag{1.4}
\]
finding the following differential equation:
\[
\left( \frac{\hbar^2}{c^2}\frac{\partial^2}{\partial t^2} - \hbar^2\nabla^2 + m^2 c^2 \right) \phi(x,t) = 0 . \tag{1.5}
\]

If one tries to interpret Eq. (1.5) as a wave equation “à la Schrödinger”, one faces several problems. As we will see in the next chapters, first of all the probability density associated with the field φ(x, t) is not positive definite, which immediately undermines the probabilistic interpretation. Moreover, Eq. (1.5) admits plane-wave solutions with positive energy, $E = \sqrt{p^2 + m^2}$, but also with negative energy, $E = -\sqrt{p^2 + m^2}$ (here in units with c = 1). While classically this would not cause particular issues¹, quantum mechanically it would mean that a particle can jump from a positive-energy state to a negative-energy one by emitting a photon (for instance). Since the spectrum is unbounded from below, the particle would then keep emitting and dropping to more and more negative energies.
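As a quick check of the statement about the two signs of the energy, one can substitute a plane wave into Eq. (1.5). The block below is only a sketch: the plane-wave ansatz and the normalization symbol $\mathcal{N}$ are assumptions introduced here for illustration, with ħ and c kept explicit.

```latex
\begin{align*}
\phi(\mathbf{x},t) &= \mathcal{N}\, e^{-\frac{i}{\hbar}\,(E t - \mathbf{p}\cdot\mathbf{x})} , \\
\left( \frac{\hbar^2}{c^2}\frac{\partial^2}{\partial t^2} - \hbar^2\nabla^2 + m^2c^2 \right)\phi
  &= \left( -\frac{E^2}{c^2} + \mathbf{p}^2 + m^2c^2 \right)\phi = 0 \\
\Longrightarrow\quad E &= \pm\, c\,\sqrt{\mathbf{p}^2 + m^2c^2} .
\end{align*}
```

For c = 1 this reduces to the two branches $E = \pm\sqrt{p^2 + m^2}$ quoted in the text.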
Because of these two issues, the Klein-Gordon theory was temporarily abandoned.
A successful step forward was instead made by Dirac in 1927. Dirac realized that the non-positivity of the probability density in the KG equation was due to the fact that the time derivative is of second order, and he postulated the following differential equation, of first order in time and, as required by special relativity, also in space, as the wave equation of an electron²:
\[
i\,\frac{\partial}{\partial t}\,\psi(x,t) = \left( -i\,\boldsymbol{\alpha}\cdot\nabla + \beta m \right) \psi(x,t) . \tag{1.6}
\]
Eq. (1.6) is such that the probability density, $\rho = |\psi|^2$, is positive definite. Dirac’s equation still admits negative-energy as well as positive-energy plane-wave solutions. However, if we can for some reason neglect the contribution of the negative-energy solutions, we can solve the equation for the hydrogen atom, finding a spectrum in very good agreement with the experimental measurements. Finally, Dirac’s equation includes the description of spin, which emerges naturally from the theory without any ad hoc construction, and its non-relativistic limit is the Pauli equation, as we would expect.
In order to interpret the negative-energy solutions physically, Dirac introduced the so-called “hole theory”, which predicts a particle with the same mass as the electron but with opposite (positive) charge: the “antiparticle” of the electron, or “positron”. This explanation was again a success, since the positron was actually detected in cosmic rays by Anderson in 1932.
The problem of the quantization of the electromagnetic field also belongs to this context. In non-relativistic Schrödinger theory the electromagnetic field is still treated as a classical field. Quantizing it means finding a method to describe the corpuscular nature of the field, i.e. the photon. In this sense, there is an asymmetry between the treatment of the Dirac electron or the Klein-Gordon scalar particle and that of the electromagnetic field.
For the former we look for an equation that starts from the corpuscular nature of the field, which is evident in the classical limit. For the latter the problem is approached differently: we have the field equations (Maxwell’s equations), which involve a classical field and display the wave nature of light, and we want to quantize them to describe the microscopic quantum nature of the field.
Quantum field theory makes the procedure uniform. We consider the relativistic equations of Dirac and Klein-Gordon as “classical equations” for the respective fields. Then we quantize them to describe the particle nature of those fields. With this philosophy, the name “second quantization” is no longer appropriate: the so-called “first quantization” is nothing but the identification of the correct relativistic equation that the field has to satisfy.
¹ The two energy solutions are separated by a finite gap that classically can never be crossed. Moreover, classically one can always discard the negative-energy solutions on the grounds that they are non-physical.
² We use natural units ħ = c = 1.

1.2 Summary of the quantization procedure
On the basis of what we have discussed so far, we can summarize point by point the quantization procedure that we will follow in constructing the theory:

• First, we find the field equations. We will study the Klein-Gordon, Dirac and Maxwell equations. All of them will be considered as classical equations that the different fields have to satisfy.

• The field equations are the Euler-Lagrange equations derived from a Lagrangian density. We introduce the Lagrangian formalism and look for conserved quantities, via Noether’s theorem, that will play the role of observables in the quantum theory.

• In order to formulate the quantization of a field, i.e. of a system with infinitely many degrees of freedom, we look at a specific system: the vibrating string. This system can be thought of as the continuum limit of a one-dimensional distribution of harmonic oscillators, which we know how to treat and quantize in the discrete case.

• We find the momenta conjugate to the fields and then move to the Hamiltonian description of the system. The fields are promoted to time-dependent operators (in the Heisenberg picture) that act on a Hilbert space³. We then impose the commutation relations among fields and conjugate momenta, performing the so-called canonical quantization.

• We apply canonical quantization first of all to the free (non-interacting) fields. In order to construct a coherent picture, we will include spin-statistics in the quantization: particles that obey Bose-Einstein statistics are quantized with commutation relations, while particles that obey Fermi-Dirac statistics have to be quantized with anticommutation relations.

• Probably the most interesting part, since it is the one directly related to experiments, is the treatment of interacting fields. To include the interactions, we will have to find a suitable Lagrangian (for example by minimal substitution, in the case of electromagnetic interactions) and Hamiltonian, and extend the canonical quantization rules to this case. The conserved current does not contain time derivatives of the fields and, therefore, for ...

• The formalism so developed will make it possible to study transition processes from n-particle initial states to m-particle final states ...

Before introducing, in the next Chapter, the various fields treated in the theory, we lay the foundations for their definition.
First of all, we study the vibrating string in order to understand what is meant by a field, so as to have a concrete idea linked to a very well understood mechanical system. Then we will use this system to explain quantization.
The main feature of the theory we are trying to build is invariance under Poincaré transformations, in the sense that the Physics studied in two different inertial frames must be the same. Consequently the action will have to be invariant under Poincaré transformations, and the various fields will have a well-defined behavior under these transformations.
We will, therefore, briefly recap the main features of the Lorentz and Poincaré groups.
³ Actually we will see that, since the number of particles is no longer a conserved quantity, we need a peculiar space, the tensor product of a variable number of Hilbert spaces, called the Fock space.

1.3 One-dimensional chain
One of the key points of Quantum Field Theory is that we have to construct a formalism with infinitely many degrees of freedom, in order to accommodate the treatment of a system with a variable number of particles. Since our intuition is tied to the one-particle case (analytical mechanics, non-relativistic quantum mechanics), we will start with a discrete system and define a continuum limit, moving from the one-particle description to the “many-particle” description that will be connected to the field.
In this treatment the field describes the fluctuations with respect to a certain state (for instance the equilibrium state or the vacuum state), and the particles will be connected to the quanta of the Fourier modes of these fluctuations.
In order to understand the transition to the continuum and the quantization of a continuous system, we study the linear chain of harmonic oscillators. Let us consider, then, a system of (N + 1) material points, all with mass m, interacting through a harmonic potential (springs with the same constant k), as in the figure:

[Figure: a chain of point masses m connected by springs of constant k, with equilibrium spacing a.]

At equilibrium, the particles are separated by a distance a and therefore the length of the chain is L = aN. Let us take N even for simplicity (this is not a strong constraint, since we will eventually take the limit N → ∞).
Let us also set m = 1.
In the excited configuration the n-th particle oscillates around its equilibrium position by a displacement that we will call $q_n(t)$.

[Figure: displacements $q_2, q_3, \dots, q_n$ of the particles from their equilibrium positions $x_1, x_2, x_3, \dots, x_n$, spaced by a along the x axis.]
Let us make another assumption: the n-th particle interacts only with its nearest neighbours, so that the potential energy of the system can be written as
\[
V = \frac{1}{2}\,\omega^2 \sum_{n=1}^{N} (q_n - q_{n+1})^2 \tag{1.7}
\]
(with m = 1, $\omega^2 = k$), and therefore the equations of motion are
\[
\ddot{q}_n = -\frac{\partial V}{\partial q_n} = -\omega^2 \left[ (q_n - q_{n+1}) - (q_{n-1} - q_n) \right] = \omega^2 \left( q_{n+1} + q_{n-1} - 2 q_n \right) . \tag{1.8}
\]
This means that the n-th particle feels a force made of two parts: the force of the spring on its left and the force of the spring on its right.
Since we are dealing with a chain, we have to specify what happens at its ends, i.e. the boundary conditions. We can impose two kinds of boundary conditions: i) a chain with fixed end-points, $q_1 = q_{N+1} = 0$, or ii) a chain with periodic boundary conditions, $q_n = q_{n+N}$. Since in the end we also want to take the limit L → ∞, both choices lead to the same conclusions.

We will choose the case of periodic boundary conditions⁴
\[
q_n = q_{n+N} , \tag{1.9}
\]
which amounts to studying a ring of particles connected by springs. In this situation we have N particles and N springs, so the sums over n go from 1 to N.
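The equations of motion (1.8) with the periodic condition (1.9) can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names, the sample displacements and the value of ω are arbitrary choices made here for illustration, and list indices run from 0 to N − 1 instead of 1 to N. It compares the right-hand side of Eq. (1.8) with a finite-difference derivative of the potential (1.7).

```python
import math


def potential(q, omega):
    """V = (1/2) omega^2 sum_n (q_n - q_{n+1})^2 with periodic wrap-around, Eq. (1.7)."""
    n_sites = len(q)
    return 0.5 * omega**2 * sum(
        (q[n] - q[(n + 1) % n_sites]) ** 2 for n in range(n_sites)
    )


def force_analytic(q, omega):
    """Right-hand side of Eq. (1.8): omega^2 (q_{n+1} + q_{n-1} - 2 q_n)."""
    n_sites = len(q)
    return [
        omega**2 * (q[(n + 1) % n_sites] + q[(n - 1) % n_sites] - 2.0 * q[n])
        for n in range(n_sites)
    ]


def force_numeric(q, omega, step=1e-6):
    """-dV/dq_n estimated with central finite differences."""
    forces = []
    for n in range(len(q)):
        q_plus = list(q)
        q_minus = list(q)
        q_plus[n] += step
        q_minus[n] -= step
        forces.append(
            -(potential(q_plus, omega) - potential(q_minus, omega)) / (2.0 * step)
        )
    return forces


# Arbitrary sample configuration (illustrative values only).
q_example = [0.3, -0.1, 0.25, 0.0, -0.4, 0.15]
omega_example = 1.7

analytic = force_analytic(q_example, omega_example)
numeric = force_numeric(q_example, omega_example)
max_diff = max(abs(a - b) for a, b in zip(analytic, numeric))
print("largest difference between the two force evaluations:", max_diff)
```

On a ring the forces also add up to zero, which reflects the rigid translation of the whole chain being a free motion.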
The kinetic energy of the chain is
\[
T = \frac{1}{2} \sum_{n=1}^{N} \dot{q}_n^2 \tag{1.10}
\]
and therefore we can write the lagrangian and the hamiltonian of the system as
\[
L = T - V = \frac{1}{2} \sum_{n=1}^{N} \dot{q}_n^2 - \frac{1}{2}\,\omega^2 \sum_{n=1}^{N} (q_n - q_{n+1})^2 , \tag{1.11}
\]
\[
H = T + V = \frac{1}{2} \sum_{n=1}^{N} \dot{q}_n^2 + \frac{1}{2}\,\omega^2 \sum_{n=1}^{N} (q_n - q_{n+1})^2 . \tag{1.12}
\]
We notice that the equations of motion (1.8) and the hamiltonian (1.12) are not diagonal. We can diagonalize them by moving to the normal modes, i.e. by looking for a solution as a Fourier series. To this end, we look for solutions of the form
\[
q_n^{(j)}(t) = c_j(t)\, e^{i k_j x_n} = c_j(t)\, e^{i k_j n a} \qquad (\text{since } x_n = na) , \tag{1.13}
\]
anticipating what will follow from imposing the boundary conditions, namely that $k_j$ takes a countable set of values.
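If we substitute (1.13) into the equations of motion (1.8) we find
\begin{align}
\ddot{c}_j(t)\, e^{i k_j x_n} &= \omega^2 \left[ e^{i k_j (n+1) a} + e^{i k_j (n-1) a} - 2 e^{i k_j n a} \right] c_j(t) , \tag{1.14} \\
&= \omega^2 \left[ e^{i k_j a} + e^{-i k_j a} - 2 \right] c_j(t)\, e^{i k_j x_n} , \tag{1.15} \\
&= -\omega^2 \left[ 2 - 2\cos(k_j a) \right] c_j(t)\, e^{i k_j x_n} , \tag{1.16} \\
&= -4\omega^2 \sin^2\!\left( \frac{k_j a}{2} \right) c_j(t)\, e^{i k_j x_n} . \tag{1.17}
\end{align}
Defining
\[
\omega_j^2 = 4\omega^2 \sin^2\!\left( \frac{k_j a}{2} \right) , \tag{1.18}
\]
we find that $c_j(t)$ has to be a solution of the equation of a harmonic oscillator of frequency $\omega_j$:
\[
\ddot{c}_j(t) + \omega_j^2\, c_j(t) = 0 . \tag{1.19}
\]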

Eq. (1.18) is the dispersion relation that links the frequency $\omega_j$ to the wave number $k_j$. The $c_j(t)$ are the normal modes, which decouple the system. The relation (1.18) is periodic: $k_j$ and $k_j + \frac{2\pi}{a} m$, with $m \in \mathbb{Z}$, give the same value of $\omega_j$. Therefore, we can restrict our analysis to the so-called “first Brillouin zone”, i.e.
\[
|k_j| \le \frac{\pi}{a} . \tag{1.20}
\]
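The properties of the dispersion relation (1.18) stated above are easy to verify numerically. The sketch below is illustrative only: the function name and the numerical values of ω, a and k are assumptions made here. It checks the periodicity under $k \to k + 2\pi/a$, the maximum frequency 2ω reached at the edge of the first Brillouin zone, and the fact that for small k the frequency is close to ωa|k|.

```python
import math


def mode_frequency(k, omega, a):
    """Dispersion relation (1.18): omega_j = 2 omega |sin(k a / 2)|."""
    return 2.0 * omega * abs(math.sin(0.5 * k * a))


# Illustrative values (not taken from the notes).
omega, a = 1.0, 0.5
k = 1.3

# Periodicity: k and k + 2*pi/a give the same frequency.
periodic_gap = abs(
    mode_frequency(k, omega, a) - mode_frequency(k + 2.0 * math.pi / a, omega, a)
)

# Edge of the first Brillouin zone, |k| = pi/a: the frequency reaches 2*omega.
zone_edge_frequency = mode_frequency(math.pi / a, omega, a)

# Long-wavelength regime: for k*a << 1 the frequency approaches omega*a*|k|.
k_small = 1e-3
linear_gap = abs(mode_frequency(k_small, omega, a) - omega * a * k_small)

print(periodic_gap, zone_edge_frequency, linear_gap)
```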
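Imposing the boundary conditions in Eq. (1.9), we find
\[
e^{i k_j (n+N) a} = e^{i k_j n a} , \tag{1.21}
\]
or
\[
k_j = \frac{2\pi}{aN}\, j = \frac{2\pi}{L}\, j . \tag{1.22}
\]
In the first Brillouin zone we have to impose Eq. (1.20), therefore
\[
\frac{2\pi}{aN}\, |j| \le \frac{\pi}{a} \quad\Rightarrow\quad |j| \le \frac{N}{2} . \tag{1.23}
\]
We then find N + 1 values of j. (Strictly speaking, j = N/2 and j = −N/2 label the same mode, since the corresponding wave numbers differ by 2π/a.) However, the solution j = 0 gives $k_j = 0$, then $\omega_j = 0$ and finally $q_n$ linear in time. This corresponds to a rigid translation of the chain, which we are not going to consider (we want to study the vibrations only). Therefore
\[
k_j = \frac{2\pi}{L}\, j , \qquad j = \pm 1, \pm 2, \dots, \pm\frac{N}{2} . \tag{1.24}
\]
⁴ For fixed end-point conditions see for instance ....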
The general solution can be cast in the following form
\[
q_n(t) = \sum_{j=-N/2}^{N/2} q_n^{(j)} = \sum_{j=-N/2}^{N/2} e^{i k_j a n}\, \frac{Q_j(t)}{\sqrt{N}} = \sum_{j=-N/2}^{N/2} e^{i \frac{2\pi}{N} n j}\, \frac{Q_j(t)}{\sqrt{N}} , \tag{1.25}
\]
where we put
\[
Q_j(t) = \sqrt{N}\, c_j(t) , \tag{1.26}
\]
which, again, are solutions of the differential equation
\[
\ddot{Q}_j + \omega_j^2\, Q_j = 0 . \tag{1.27}
\]
Since $q_n(t)$ has to be a real quantity (it is the physical displacement of the n-th particle), we have to impose $q_n^* = q_n$ and then
\[
\sum_{j=-N/2}^{N/2} e^{-i \frac{2\pi}{N} n j}\, \frac{Q_j^*(t)}{\sqrt{N}} = \sum_{j=-N/2}^{N/2} e^{i \frac{2\pi}{N} n j}\, \frac{Q_j(t)}{\sqrt{N}} . \tag{1.28}
\]
Putting j → −j in the r.h.s. of Eq. (1.28), we find that the following relation must hold:
\[
Q_{-j}(t) = Q_j^*(t) . \tag{1.29}
\]

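Using the following representation of the Kronecker delta⁵
\[
\sum_{n=1}^{N} e^{i \frac{2\pi}{N} (j-j') n} = N \delta_{jj'} , \tag{1.31}
\]
we can write the lagrangian in terms of the normal modes (for the moment we leave the potential energy V unexpressed):
\begin{align}
L &= \frac{1}{2} \sum_{n=1}^{N} \dot{q}_n^2 - V = \frac{1}{2} \sum_{n=1}^{N} \sum_{j,j'=-N/2}^{N/2} e^{i \frac{2\pi}{N} (j+j') n}\, \frac{\dot{Q}_j \dot{Q}_{j'}}{N} - V , \nonumber \\
&= \frac{1}{2} \sum_{j,j'=-N/2}^{N/2} N \delta_{jj'}\, \frac{\dot{Q}_j \dot{Q}_{-j'}}{N} - V \qquad (\text{with } j' \to -j') , \nonumber \\
&= \frac{1}{2} \sum_{j=-N/2}^{N/2} \dot{Q}_j \dot{Q}_{-j} - V = \frac{1}{2} \sum_{j=-N/2}^{N/2} \dot{Q}_j^* \dot{Q}_j - V . \tag{1.32}
\end{align}
⁵ This formula can be justified easily as follows. If j = j′, the sum is trivially N. If, instead, j ≠ j′ we have
\begin{align}
\sum_{n=1}^{N} e^{i \frac{2\pi}{N} (j-j') n} &= \sum_{n=0}^{N} \left[ e^{i \frac{2\pi}{N} (j-j')} \right]^n - 1 = \frac{1 - e^{i \frac{2\pi}{N} (j-j')(N+1)}}{1 - e^{i \frac{2\pi}{N} (j-j')}} - 1 , \nonumber \\
&= \frac{e^{i \frac{2\pi}{N} (j-j')} - e^{i 2\pi (j-j')}\, e^{i \frac{2\pi}{N} (j-j')}}{1 - e^{i \frac{2\pi}{N} (j-j')}} = e^{i \frac{2\pi}{N} (j-j')}\, \frac{1 - e^{i 2\pi (j-j')}}{1 - e^{i \frac{2\pi}{N} (j-j')}} = 0 . \tag{1.30}
\end{align}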

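Moreover, since $\dot{Q}_j^* \dot{Q}_j = \dot{Q}_{-j} \dot{Q}_{-j}^*$, we have
\[
\frac{1}{2} \sum_{j=-N/2}^{N/2} \dot{Q}_j^* \dot{Q}_j = \sum_{j=1}^{N/2} \dot{Q}_j^* \dot{Q}_j = \sum_{j=1}^{N/2} \left| \dot{Q}_j \right|^2 . \tag{1.33}
\]
Therefore, we can define the momenta conjugate to $Q_j$ as
\[
P_j = \frac{\partial L}{\partial \dot{Q}_j} = \dot{Q}_j^* . \tag{1.34}
\]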

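On the other hand, the momenta conjugate to $q_n$ are defined as follows:
\begin{align}
p_n(t) &= \frac{\partial L}{\partial \dot{q}_n(t)} = \dot{q}_n(t) = \sum_{j=-N/2}^{N/2} e^{i \frac{2\pi}{N} n j}\, \frac{\dot{Q}_j(t)}{\sqrt{N}} \qquad (j \to -j) , \nonumber \\
&= \sum_{j=-N/2}^{N/2} e^{-i \frac{2\pi}{N} n j}\, \frac{\dot{Q}_{-j}(t)}{\sqrt{N}} = \sum_{j=-N/2}^{N/2} e^{-i \frac{2\pi}{N} n j}\, \frac{\dot{Q}_j^*(t)}{\sqrt{N}} = \sum_{j=-N/2}^{N/2} e^{-i \frac{2\pi}{N} n j}\, \frac{P_j(t)}{\sqrt{N}} , \tag{1.35}
\end{align}
where we used the fact that $p_n(t)$ must also be real, $p_n^* = p_n$, which implies
\[
\dot{Q}_{-j}(t) = \dot{Q}_j^*(t) , \tag{1.36}
\]
or, in terms of $P_j(t)$,
\[
P_{-j}(t) = P_j^*(t) . \tag{1.37}
\]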
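For later use, we can write $Q_j$ and $P_j$ in terms of $q_n$ and $p_n$, using the representation of the delta in Eq. (1.31). In fact
\[
\sum_{n=1}^{N} q_n(t)\, e^{-i \frac{2\pi}{N} j n} = \sum_{n=1}^{N} \sum_{j'=-N/2}^{N/2} e^{i \frac{2\pi}{N} (j'-j) n}\, \frac{Q_{j'}(t)}{\sqrt{N}} = \sum_{j'=-N/2}^{N/2} N \delta_{jj'}\, \frac{Q_{j'}(t)}{\sqrt{N}} = \sqrt{N}\, Q_j(t) . \tag{1.38}
\]
Therefore
\[
Q_j(t) = \sum_{n=1}^{N} e^{-i \frac{2\pi}{N} j n}\, \frac{q_n(t)}{\sqrt{N}} . \tag{1.39}
\]
In the same way, we find
\[
P_j(t) = \sum_{n=1}^{N} e^{i \frac{2\pi}{N} j n}\, \frac{p_n(t)}{\sqrt{N}} . \tag{1.40}
\]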
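The change of variables between the displacements $q_n$ and the normal modes $Q_j$ is a discrete Fourier transform, and it can be tested on sample data. The sketch below makes one assumption that differs from the labelling in the text: it uses the N distinct labels j = −N/2 + 1, ..., N/2 (including j = 0), so that the transform is exactly invertible; all function and variable names and the sample values are illustrative. It checks that Eq. (1.39) followed by Eq. (1.25) returns the original displacements, and that real $q_n$ give $Q_{-j} = Q_j^*$ as in Eq. (1.29).

```python
import cmath
import math


def mode_labels(n_sites):
    """N distinct labels -N/2+1, ..., N/2 for even N (convention assumed here)."""
    return list(range(-n_sites // 2 + 1, n_sites // 2 + 1))


def to_modes(q):
    """Eq. (1.39): Q_j = sum_{n=1}^{N} exp(-2 pi i j n / N) q_n / sqrt(N)."""
    n_sites = len(q)
    modes = {}
    for label in mode_labels(n_sites):
        total = sum(
            cmath.exp(-2j * math.pi * label * n / n_sites) * q[n - 1]
            for n in range(1, n_sites + 1)
        )
        modes[label] = total / math.sqrt(n_sites)
    return modes


def from_modes(modes):
    """Eq. (1.25): q_n = sum_j exp(+2 pi i j n / N) Q_j / sqrt(N)."""
    n_sites = len(modes)
    return [
        sum(
            cmath.exp(2j * math.pi * label * n / n_sites) * amplitude
            for label, amplitude in modes.items()
        )
        / math.sqrt(n_sites)
        for n in range(1, n_sites + 1)
    ]


# Arbitrary real displacements for N = 8 sites (illustrative values only).
q_sites = [0.3, -0.1, 0.25, 0.0, -0.4, 0.15, 0.05, -0.2]

big_q = to_modes(q_sites)
q_back = from_modes(big_q)

round_trip_error = max(abs(back - orig) for back, orig in zip(q_back, q_sites))
reality_error = max(
    abs(big_q[-label] - big_q[label].conjugate())
    for label in range(1, len(q_sites) // 2)
)
print(round_trip_error, reality_error)
```

Restricting to N labels avoids counting the zone-boundary mode twice; this matters for a finite chain but not in the limit N → ∞.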
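Now, let us write the hamiltonian of the system in terms of normal modes. All sums over j and j′ run from −N/2 to N/2. We have
\begin{align}
H &= \frac{1}{2} \sum_{n=1}^{N} \dot{q}_n^2 + \frac{1}{2}\,\omega^2 \sum_{n=1}^{N} (q_n - q_{n+1})^2 = \frac{1}{2} \sum_{n=1}^{N} p_n^2 + \frac{1}{2}\,\omega^2 \sum_{n=1}^{N} (q_n - q_{n+1})^2 , \nonumber \\
&= \frac{1}{2} \sum_{n=1}^{N} \left\{ \sum_{j,j'} e^{-i \frac{2\pi}{N} (j+j') n}\, \frac{P_j P_{j'}}{N} + \omega^2 \left[ \sum_{j} e^{i \frac{2\pi}{N} j n}\, \frac{Q_j(t)}{\sqrt{N}} - \sum_{j} e^{i \frac{2\pi}{N} j (n+1)}\, \frac{Q_j(t)}{\sqrt{N}} \right]^2 \right\} , \nonumber \\
&= \frac{1}{2} \sum_{n=1}^{N} \left\{ \sum_{j,j'} e^{-i \frac{2\pi}{N} (j+j') n}\, \frac{P_j P_{j'}}{N} + \omega^2 \left[ \sum_{j} \left( 1 - e^{i \frac{2\pi}{N} j} \right) e^{i \frac{2\pi}{N} j n}\, \frac{Q_j(t)}{\sqrt{N}} \right]^2 \right\} , \nonumber \\
&= \frac{1}{2} \sum_{n=1}^{N} \left\{ \sum_{j,j'} e^{-i \frac{2\pi}{N} (j+j') n}\, \frac{P_j P_{j'}}{N} + \omega^2 \sum_{j,j'} \left( 1 - e^{i \frac{2\pi}{N} j} \right) \left( 1 - e^{i \frac{2\pi}{N} j'} \right) e^{i \frac{2\pi}{N} (j+j') n}\, \frac{Q_j Q_{j'}}{N} \right\} , \nonumber \\
&\qquad (\text{we put } j \to -j \text{ in the kinetic energy and } j' \to -j' \text{ in the potential energy}) \nonumber \\
&= \frac{1}{2} \sum_{n=1}^{N} \left\{ \sum_{j,j'} e^{i \frac{2\pi}{N} (j-j') n}\, \frac{P_{-j} P_{j'}}{N} + \omega^2 \sum_{j,j'} \left( 1 - e^{i \frac{2\pi}{N} j} \right) \left( 1 - e^{-i \frac{2\pi}{N} j'} \right) e^{i \frac{2\pi}{N} (j-j') n}\, \frac{Q_j Q_{-j'}}{N} \right\} , \nonumber \\
&\qquad (\text{using Eq. (1.31)}) \nonumber \\
&= \frac{1}{2} \sum_{j,j'} \left\{ N \delta_{jj'}\, \frac{P_j^* P_{j'}}{N} + \omega^2 \left( 1 - e^{i \frac{2\pi}{N} j} \right) \left( 1 - e^{-i \frac{2\pi}{N} j'} \right) N \delta_{jj'}\, \frac{Q_j Q_{j'}^*}{N} \right\} , \nonumber \\
&= \frac{1}{2} \sum_{j=-N/2}^{N/2} \left\{ |P_j|^2 + \omega^2 \left( 2 - e^{i \frac{2\pi}{N} j} - e^{-i \frac{2\pi}{N} j} \right) |Q_j|^2 \right\} , \nonumber \\
&\qquad \left( \text{since } \omega_j^2 = 4\omega^2 \sin^2\!\left( \frac{\pi}{N}\, j \right) \right) \nonumber \\
&= \frac{1}{2} \sum_{j=-N/2}^{N/2} \left\{ |P_j|^2 + \omega_j^2\, |Q_j|^2 \right\} . \tag{1.41}
\end{align}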

Since
\begin{align}
|P_j|^2 &= P_j^* P_j = P_{-j} P_{-j}^* = |P_{-j}|^2 , \tag{1.42} \\
|Q_j|^2 &= Q_j^* Q_j = Q_{-j} Q_{-j}^* = |Q_{-j}|^2 , \tag{1.43} \\
\omega_j^2 &= \omega_{-j}^2 , \tag{1.44}
\end{align}
we can write the hamiltonian as follows:
\[
H = \sum_{j=1}^{N/2} \left\{ |P_j|^2 + \omega_j^2\, |Q_j|^2 \right\} . \tag{1.45}
\]
Looking at Eq. (1.45), it is clear that the system is equivalent to a system of N decoupled harmonic oscillators ($Q_j$ and $P_j$ are complex quantities).
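The diagonalization can be checked numerically by evaluating the hamiltonian both in the site variables and in the normal modes. The sketch below is only illustrative and rests on stated assumptions: it sums over the N distinct labels j = −N/2 + 1, ..., N/2, including the zero mode j = 0 that the text discards, so that the two expressions agree exactly for arbitrary data; the names and sample values are chosen here and do not come from the notes.

```python
import cmath
import math


def transform(values, sign):
    """sum_{n=1}^{N} exp(sign * 2 pi i j n / N) values_n / sqrt(N) for each label j."""
    n_sites = len(values)
    labels = range(-n_sites // 2 + 1, n_sites // 2 + 1)  # N distinct labels (assumed)
    return {
        label: sum(
            cmath.exp(sign * 2j * math.pi * label * n / n_sites) * values[n - 1]
            for n in range(1, n_sites + 1)
        )
        / math.sqrt(n_sites)
        for label in labels
    }


def hamiltonian_sites(q, p, omega):
    """H = (1/2) sum p_n^2 + (1/2) omega^2 sum (q_n - q_{n+1})^2 on a ring."""
    n_sites = len(q)
    kinetic = 0.5 * sum(p_n**2 for p_n in p)
    elastic = 0.5 * omega**2 * sum(
        (q[n] - q[(n + 1) % n_sites]) ** 2 for n in range(n_sites)
    )
    return kinetic + elastic


def hamiltonian_modes(q, p, omega):
    """H = (1/2) sum_j (|P_j|^2 + omega_j^2 |Q_j|^2), the form of Eq. (1.41)."""
    n_sites = len(q)
    big_q = transform(q, -1)  # Eq. (1.39)
    big_p = transform(p, +1)  # Eq. (1.40)
    total = 0.0
    for label in big_q:
        omega_j_sq = 4.0 * omega**2 * math.sin(math.pi * label / n_sites) ** 2
        total += 0.5 * (abs(big_p[label]) ** 2 + omega_j_sq * abs(big_q[label]) ** 2)
    return total


# Arbitrary sample data for N = 6 sites (illustrative values only).
q_sites = [0.3, -0.1, 0.25, 0.0, -0.4, 0.15]
p_sites = [0.05, 0.2, -0.3, 0.1, 0.0, -0.15]
omega_value = 1.7

h_sites = hamiltonian_sites(q_sites, p_sites, omega_value)
h_modes = hamiltonian_modes(q_sites, p_sites, omega_value)
print(h_sites, h_modes)
```

The j = 0 term carries no potential energy ($\omega_0 = 0$) and accounts for the kinetic energy of the rigid translation.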

1.3.1 Limit to the continuum


In order to move to the continuum, let us write the displacement $q_n(t)$ as a function of $x_n$ as follows:
\[
u(x_n, t) = q_n(t) . \tag{1.46}
\]
In this way, $u(x_n, t)$ represents the displacement from equilibrium of the n-th massive point-like particle (i.e. a function that for each $x_n$ gives the displacement from $x_n$ of that massive point). We can then rewrite the equations of motion (1.8) in terms of $u(x_n, t)$, getting
\[
\ddot{u}(x_n, t) = \omega^2 \left[ \left( u(x_{n+1}, t) - u(x_n, t) \right) - \left( u(x_n, t) - u(x_{n-1}, t) \right) \right] . \tag{1.47}
\]

Let us consider the limit in which the massive points become denser and denser, N → ∞ and a → 0, keeping the length of the string finite, Na = L = const. We also have to take into account that by putting more and more points in the chain we add mass. However, we want to keep the mass of the chain finite, therefore we also keep the ratio m/a = µ constant (constant mass density along the string). Note that for simplicity we kept m = 1 from the beginning, so we will have to rescale the field by a factor $1/\sqrt{a}$ to keep the energy of the string finite.
We can interpret this limit by considering the propagation of acoustic waves with a wavelength λ ≫ a.
In this limit we have
\[
\lim_{a \to 0} \frac{u(x_{n+1}, t) - u(x_n, t)}{a} = u'(x, t) , \tag{1.48}
\]
where $x \in [x_n, x_{n+1}]$ and where $x_{n+1} \to x$, $x_n \to x$. In the same way we have
\[
\lim_{a \to 0} \frac{u'(x_1, t) - u'(x_2, t)}{a} = u''(x, t) , \tag{1.49}
\]
where $x_1$ and $x_2$ are the points at which the two first differences in Eq. (1.47) are evaluated, and $x \in [x_2, x_1]$. In the end, Eq. (1.47) becomes
\[
\ddot{u}(x, t) = \omega^2 a^2\, u''(x, t) , \tag{1.50}
\]
which is a wave equation, where $v = \omega a$ (i.e. $v^2 = \omega^2 a^2$) is the velocity of propagation of the waves along the string. It is clear that $\omega^2 a^2$ must stay constant, since
\[
\omega_j^2 = 4\omega^2 \sin^2\!\left( \frac{k_j a}{2} \right) \sim 4\omega^2 \left( \frac{k_j a}{2} \right)^2 = \omega^2 a^2 k_j^2 \tag{1.51}
\]
and therefore, in the limit a → 0, ω must grow like 1/a to give a finite frequency and wave number for the j-th mode. Then we find
\[
\ddot{u}(x, t) = v^2\, u''(x, t) , \tag{1.52}
\]
with the usual d’Alembert solution u(x, t) = f(x + vt) + g(x − vt) ...
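The approach to the continuum can be made concrete by watching the discrete frequencies converge to the linear law $\omega_j = v\,|k_j|$ when a → 0 with v = ωa held fixed. The following is a minimal sketch under stated assumptions: the values of L, v and the mode label j are arbitrary, the function names are chosen here, and the wave number is taken as $k_j = 2\pi j / L$ as for the periodic chain.

```python
import math


def discrete_frequency(j, n_sites, length, v):
    """omega_j = 2 omega |sin(k_j a / 2)| with a = L/N and omega = v/a, Eq. (1.18)."""
    a = length / n_sites
    omega = v / a  # omega grows like 1/a so that v = omega * a stays fixed
    k_j = 2.0 * math.pi * j / length
    return 2.0 * omega * abs(math.sin(0.5 * k_j * a))


def continuum_frequency(j, length, v):
    """Linear dispersion omega_j = v |k_j| of the wave equation (1.52)."""
    return v * abs(2.0 * math.pi * j / length)


# Illustrative values (not taken from the notes).
length, v, j = 1.0, 2.0, 3

relative_errors = [
    abs(discrete_frequency(j, n_sites, length, v) / continuum_frequency(j, length, v) - 1.0)
    for n_sites in (16, 64, 256, 1024)
]
for n_sites, error in zip((16, 64, 256, 1024), relative_errors):
    print(n_sites, error)
```

The relative deviation shrinks roughly like $(k_j a)^2$, which is the first correction dropped in Eq. (1.51).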
Let us see what happens to the hamiltonian. In terms of $u(x_n, t)$ we can write
\[
H = \frac{1}{2} \sum_{n=1}^{N} \dot{u}^2(x_n, t) + \frac{1}{2}\,\omega^2 \sum_{n=1}^{N} \left[ u(x_n, t) - u(x_{n-1}, t) \right]^2 . \tag{1.53}
\]
In the limit to the continuum, we have⁶
\[
\sum_{n=1}^{N} \sim \frac{1}{a} \int_0^L dx \tag{1.55}
\]
and therefore
\[
H = \frac{1}{2a} \int_0^L \left[ \dot{u}^2(x, t) + v^2 u'^2(x, t) \right] dx . \tag{1.56}
\]
The factor 1/a in H comes from having set m = 1 at the beginning. So, in order to have a finite energy on the string, we have to require that the function u(x, t) goes to 0 as $\sqrt{a}$. We then redefine the field⁷
\[
\phi(x, t) = \frac{u(x, t)}{\sqrt{a}} , \tag{1.61}
\]
⁶ This can be understood by recalling the definition of the integral,
\[
\int_0^L f(x)\, dx = \lim_{\substack{N \to \infty \\ a \to 0}} a \sum_{n=1}^{N} f_n , \tag{1.54}
\]
where $f_n$ is the value of the function f(x) at $x = x_n = na$.
in terms of which we have
Lh
1
Z i
H= φ̇2 (x, t) + v 2 φ′2 (x, t) dx (1.62)
2 0
and the equations of motion
φ̈(x, t) = v 2 φ′′ (x, t) . (1.63)
In the limiting procedure, we do not change the boundary conditions. In fact the relation

u(xn+N , t) = u(xn , t) , (1.64)

corresponds to
u(x + L, t) = u(x, t) , (1.65)
and this in any case gives rise to a discrete (countable) set of wave numbers:
\[ e^{i k_j (x+L)} = e^{i k_j x} , \tag{1.66} \]
which implies
\[ k_j = \frac{2\pi}{L}\, j , \qquad j \in \mathbb{Z} . \tag{1.67} \]
Now $j$ can go from $-\infty$ to $+\infty$ (always except 0).
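Since $aN = L$ we can expand the field $\phi$ in normal modes
\[ \phi(x,t) = \sum_{j=-\infty}^{\infty} e^{i\frac{2\pi}{L} j x}\, \frac{Q_j(t)}{\sqrt{L}} \tag{1.68} \]
and the dispersion relation becomes
\[ \omega_j^2 \to 4\omega^2 \left(\frac{k_j a}{2}\right)^2 = v^2 k_j^2 , \tag{1.69} \]
typical of a wave equation.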
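As a quick numerical illustration of (1.69), the following sketch compares the discrete frequency $\omega_j = 2\omega|\sin(k_j a/2)|$ with the continuum value $v|k_j|$, taking $\omega = v/a$ and $a = L/N$ as in the text. The function names and the default values $L = v = 1$ are choices made here for illustration only.

```python
import math


def discrete_omega(j, N, L=1.0, v=1.0):
    """Frequency of mode j on a periodic chain of N sites (spacing a = L/N).

    Assumes omega = v / a, so that omega^2 a^2 = v^2 stays fixed as a -> 0.
    """
    a = L / N
    omega = v / a
    k_j = 2.0 * math.pi * j / L
    return 2.0 * omega * abs(math.sin(k_j * a / 2.0))


def continuum_omega(j, L=1.0, v=1.0):
    """Continuum dispersion relation omega_j = v |k_j|."""
    return v * abs(2.0 * math.pi * j / L)


def relative_gap(j, N, L=1.0, v=1.0):
    """Relative difference between the discrete and continuum frequencies."""
    exact = continuum_omega(j, L, v)
    return abs(discrete_omega(j, N, L, v) - exact) / exact


if __name__ == "__main__":
    # The gap for a fixed mode shrinks as the lattice spacing goes to zero.
    for N in (10, 100, 1000, 10000):
        print(N, relative_gap(3, N))
```

For fixed $j$ the gap shrinks roughly like $(k_j a)^2/24$, which is another way of saying that long-wavelength modes do not resolve the lattice.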


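⁷ We can follow the same procedure keeping $m$ and setting $m/a = \mu = \mathrm{const}$ in the limit to the continuum. In these
terms, we have
\[ T = \frac{1}{2}\sum_n m\,\dot{q}_n^2 \to \frac{1}{2a}\int_0^L dx\; m\,\dot{u}^2(x,t) \to \frac{\mu}{2}\int_0^L dx\; \dot{u}^2(x,t) , \tag{1.57} \]
\[ V = \frac{1}{2}\sum_n k\,(q_{n+1}-q_n)^2 \to \frac{1}{2a}\int_0^L dx\; k a^2\, u'^2(x,t) \to \frac{\tau}{2}\int_0^L dx\; u'^2(x,t) , \tag{1.58} \]
\[ Na = L , \qquad \frac{m}{a} = \mu , \qquad ka = \tau , \tag{1.59} \]
where $\tau$ is the tension of the string and $\tau/\mu = v^2$. Then
\[ L = \frac{\mu}{2}\int_0^L \left[\dot{u}^2(x,t) - v^2 u'^2(x,t)\right] dx = \frac{1}{2}\int_0^L \left[\dot\phi^2(x,t) - v^2\phi'^2(x,t)\right] dx , \tag{1.60} \]
where we defined the field $\phi(x,t) = \sqrt{\mu}\, u(x,t)$.

The normal modes $Q_j$ still satisfy a single harmonic oscillator differential equation
\[ \ddot{Q}_j(t) + v^2 k_j^2\, Q_j(t) = 0 \tag{1.70} \]
and in terms of $Q_j(t)$ and $P_j(t)$ we can express the hamiltonian as follows:
\[
\begin{aligned}
H &= \frac{1}{2}\int_0^L \left[\dot\phi^2(x,t) + v^2\phi'^2(x,t)\right] dx \\
&= \frac{1}{2}\int_0^L dx \left[ \sum_{jj'} e^{i\frac{2\pi}{L}(j+j')x}\, \frac{\dot{Q}_j \dot{Q}_{j'}}{L} - v^2 \sum_{jj'} k_j k_{j'}\, e^{i\frac{2\pi}{L}(j+j')x}\, \frac{Q_j Q_{j'}}{L} \right] \\
&\quad (\text{sending } j' \to -j') \\
&= \frac{1}{2}\int_0^L dx \left[ \sum_{jj'} e^{i\frac{2\pi}{L}(j-j')x}\, \frac{\dot{Q}_j \dot{Q}_{-j'}}{L} - v^2 \sum_{jj'} k_j k_{-j'}\, e^{i\frac{2\pi}{L}(j-j')x}\, \frac{Q_j Q_{-j'}}{L} \right] \\
&\quad \left(\text{since } \int_0^L e^{i\frac{2\pi}{L}(j-j')x}\, dx = L\,\delta_{jj'}\right) \\
&= \frac{1}{2}\sum_{jj'} \left[ \frac{\dot{Q}_j \dot{Q}_{-j'}}{L}\, L\,\delta_{jj'} - v^2 k_j k_{-j'}\, \frac{Q_j Q_{-j'}}{L}\, L\,\delta_{jj'} \right] \\
&= \frac{1}{2}\sum_{j=-\infty}^{\infty} \left[ \dot{Q}_j \dot{Q}_j^* + v^2 k_j^2\, Q_j Q_j^* \right] \\
&= \sum_{j=1}^{\infty} \left[ |\dot{Q}_j|^2 + v^2 k_j^2\, |Q_j|^2 \right] ,
\end{aligned}
\tag{1.71}
\]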

since, as before, $\dot{Q}_j \dot{Q}_j^* = \dot{Q}_{-j}^* \dot{Q}_{-j}$, $Q_j Q_j^* = Q_{-j}^* Q_{-j}$ and $k_j^2 = k_{-j}^2$. Again, we find that the system
is equivalent to an infinite sum of harmonic oscillators. The fact that the various frequencies are
discrete is due to the finite length of the ring and the boundary conditions imposed. If
the length $L$ also goes to infinity, we have to deal with a Fourier transform instead of a
series.
The quantization of the system is done by the quantization of these harmonic oscillators.

1.3.2 Quantization of the vibrating string


We are in a situation in which our continuous system, the vibrating string, is solved in terms of normal
modes, diagonalizing the hamiltonian that can be written as an infinite sum of decoupled harmonic
oscillators. This pattern gives rise to a “simple” procedure for the quantization of the system. This
can be done through the canonical quantization of the harmonic oscillators.
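In the discrete system we have
\[ H = \sum_{j=1}^{N/2} \left[ |P_j|^2 + \omega_j^2 |Q_j|^2 \right] , \tag{1.72} \]
\[ \omega_j^2 = 4\omega^2 \sin^2\left(\frac{k_j a}{2}\right) , \tag{1.73} \]
\[ k_j = \frac{2\pi}{L}\, j , \qquad |j| = 1, 2, ..., \frac{N}{2} . \tag{1.74} \]
In the continuum case
\[ H = \sum_{j=1}^{\infty} \left[ |P_j|^2 + \omega_j^2 |Q_j|^2 \right] , \tag{1.75} \]
\[ \omega_j^2 = v^2 k_j^2 , \tag{1.76} \]
\[ k_j = \frac{2\pi}{L}\, j , \qquad j \in \mathbb{Z} . \tag{1.77} \]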
In both cases, in order to quantize the single harmonic oscillators, we will have to promote Qj and
Pj to operators. Consequently, the relations (1.29,1.37) will become

Q̂†j = Q̂−j , P̂j† = P̂−j . (1.78)

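Let us start with the discrete case. We can introduce annihilation and creation operators⁸, $\hat{a}_j$ and $\hat{a}_j^\dagger$,
such that
\[ \hat{a}_j = \sqrt{\frac{\omega_j}{2}}\, \hat{Q}_j + \frac{i}{\sqrt{2\omega_j}}\, \hat{P}_j^\dagger , \tag{1.81} \]
\[ \hat{a}_j^\dagger = \sqrt{\frac{\omega_j}{2}}\, \hat{Q}_j^\dagger - \frac{i}{\sqrt{2\omega_j}}\, \hat{P}_j , \tag{1.82} \]
where $\omega_j = 2\omega|\sin(k_j a/2)|$. Starting from the quantization relations⁹ on the hermitian operators $\hat{q}_n$
and the conjugate momenta $\hat{p}_n$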

[q̂n , p̂m ] = iδnm , [q̂n , q̂m ] = [p̂n , p̂m ] = 0 , (1.83)

we find that also the operators Q̂j and P̂j obey similar commutation relations10 :

[Q̂j , P̂j ′ ] = iδjj ′ , [Q̂j , Q̂j ′ ] = [P̂j , P̂j ′ ] = 0 , (1.87)

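Finally, we find (from now on we omit the hat for simplicity of notation, but we are still speaking
about operators)
\begin{align}
[a_j, a_k^\dagger] &= \frac{1}{\sqrt{4\omega_j\omega_k}}\, [\omega_j Q_j + iP_j^\dagger ,\ \omega_k Q_k^\dagger - iP_k] , \tag{1.88} \\
&= \frac{1}{\sqrt{4\omega_j\omega_k}} \left\{ -i\omega_j [Q_j, P_k] + i\omega_k [P_j^\dagger, Q_k^\dagger] \right\} , \tag{1.89} \\
&= \frac{1}{\sqrt{4\omega_j\omega_k}} \left\{ -i\omega_j [Q_j, P_k] - i\omega_k [Q_{-k}, P_{-j}] \right\} , \tag{1.90} \\
&= \delta_{jk} , \tag{1.91}
\end{align}
\[ [a_j, a_k] = [a_j^\dagger, a_k^\dagger] = 0 . \tag{1.92} \]

⁸ We can write $\hat{Q}_j$ and $\hat{P}_j$ in terms of $\hat{a}_j$ and $\hat{a}_j^\dagger$ as follows:
\[ \hat{Q}_j = \frac{1}{\sqrt{2\omega_j}} \left( \hat{a}_j + \hat{a}_{-j}^\dagger \right) , \qquad \hat{P}_j = -i\sqrt{\frac{\omega_j}{2}} \left( \hat{a}_{-j} - \hat{a}_j^\dagger \right) , \tag{1.79} \]
where
\[ \hat{a}_{-j} = \sqrt{\frac{\omega_j}{2}}\, \hat{Q}_j^\dagger + \frac{i}{\sqrt{2\omega_j}}\, \hat{P}_j , \qquad \hat{a}_{-j}^\dagger = \sqrt{\frac{\omega_j}{2}}\, \hat{Q}_j - \frac{i}{\sqrt{2\omega_j}}\, \hat{P}_j^\dagger . \tag{1.80} \]
We have, then, $[a_{-j}, a_{-k}^\dagger] = \delta_{jk}$ and $[a_{-j}, a_{-k}] = [a_{-j}^\dagger, a_{-k}^\dagger] = 0$.

⁹ Remember that we are using the natural units $\hbar = c = 1$.

¹⁰ We have
\[ \hat{Q}_j = \sum_{n=1}^{N} e^{-ik_j a n}\, \frac{\hat{q}_n}{\sqrt{N}} , \qquad \hat{P}_j = \sum_{n=1}^{N} e^{ik_j a n}\, \frac{\hat{p}_n}{\sqrt{N}} \tag{1.84} \]
and therefore
\[ [\hat{Q}_j, \hat{P}_{j'}] = \hat{Q}_j \hat{P}_{j'} - \hat{P}_{j'} \hat{Q}_j = \frac{1}{N}\sum_{n,m} e^{-i\frac{2\pi}{N}jn}\, e^{i\frac{2\pi}{N}j'm}\, \hat{q}_n \hat{p}_m - \frac{1}{N}\sum_{n,m} e^{-i\frac{2\pi}{N}jn}\, e^{i\frac{2\pi}{N}j'm}\, \hat{p}_m \hat{q}_n , \tag{1.85} \]
\[ = \left(\text{using (1.83), } \hat{p}_m \hat{q}_n = \hat{q}_n \hat{p}_m - i\delta_{nm}\right) = i\delta_{jj'} . \tag{1.86} \]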
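In terms of annihilation and creation operators, the hamiltonian becomes:
\begin{align}
H &= \sum_{j=1}^{N/2} \left[ |P_j|^2 + \omega_j^2 |Q_j|^2 \right] , \tag{1.93} \\
&= \sum_{j=1}^{N/2} \frac{\omega_j}{2} \left[ a_j^\dagger a_j + a_j a_j^\dagger + a_{-j}^\dagger a_{-j} + a_{-j} a_{-j}^\dagger \right] . \tag{1.94}
\end{align}
If we use the commutation relations (1.91,1.92), we find
\[ H = \sum_{j=1}^{N/2} \omega_j \left[ a_j^\dagger a_j + a_{-j}^\dagger a_{-j} + 1 \right] = \sum_{j=-N/2}^{N/2} \omega_j \left( a_j^\dagger a_j + \frac{1}{2} \right) , \tag{1.95} \]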
where $\omega_j = 2\omega|\sin(k_j a/2)|$, which has the known form of the sum of $N$ independent harmonic oscillators.
Note that $H$ is independent of time (energy conservation), although the operators in the relations (1.81,1.82)
do depend on time. The time dependence of the operators $a_j$ and $a_j^\dagger$ is given by the Heisenberg
equations of motion
\[ \dot{a}_j(t) = i[H, a_j] = -i\omega_j\, a_j(t) , \tag{1.96} \]
\[ \dot{a}_j^\dagger(t) = i[H, a_j^\dagger] = i\omega_j\, a_j^\dagger(t) , \tag{1.97} \]
since, by direct inspection, we have $[H, a_j^\dagger] = \omega_j a_j^\dagger$ and $[H, a_j] = -\omega_j a_j$. Then¹¹
\[ a_j(t) = e^{-i\omega_j t}\, a_j(0) , \tag{1.100} \]
\[ a_j^\dagger(t) = e^{i\omega_j t}\, a_j^\dagger(0) . \tag{1.101} \]
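We can express the displacements $q_n(t)$ in terms of annihilation and creation operators:
\[
\begin{aligned}
q_n(t) &= \sum_{j=-N/2}^{N/2} e^{ik_j a n}\, \frac{Q_j(t)}{\sqrt{N}} \\
&= \sum_{j=-N/2}^{N/2} \frac{1}{\sqrt{2N\omega_j}}\, e^{ik_j a n} \left( a_j(t) + a_{-j}^\dagger(t) \right) \\
&= \sum_{j=-N/2}^{N/2} \frac{1}{\sqrt{2N\omega_j}}\, e^{ik_j a n}\, e^{-i\omega_j t}\, a_j(0) + \sum_{j=-N/2}^{N/2} \frac{1}{\sqrt{2N\omega_j}}\, e^{ik_j a n}\, e^{i\omega_j t}\, a_{-j}^\dagger(0) \\
&\quad (\text{sending } j \to -j \text{ in the second piece}) \\
&= \sum_{j=-N/2}^{N/2} \frac{1}{\sqrt{2N\omega_j}} \left( e^{-i\omega_j t + ik_j a n}\, a_j(0) + e^{i\omega_j t - ik_j a n}\, a_j^\dagger(0) \right) .
\end{aligned}
\tag{1.102}
\]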
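¹¹ For $a_{-j}$ and $a_{-j}^\dagger$ the same relations hold, since $\omega_{-j} = \omega_j$:
\[ a_{-j}(t) = e^{-i\omega_j t}\, a_{-j}(0) , \tag{1.98} \]
\[ a_{-j}^\dagger(t) = e^{i\omega_j t}\, a_{-j}^\dagger(0) , \tag{1.99} \]
as follows from their commutation relations with the hamiltonian; this is the form used in the second piece of (1.102).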
1.3.3 Fock space and phonons
We can now study the spectrum of the hamiltonian and give an interpretation of what we find.
The state with lowest energy is determined by the condition
\[ a_j |0\rangle = 0 , \qquad \forall j . \tag{1.103} \]
The corresponding eigenvalue is
\[ E_0 = \sum_j \frac{1}{2}\, \omega_j . \tag{1.104} \]
For the moment we are considering the discrete case, therefore (1.104) constitutes a finite energy.
When we move to the continuum, this term will become infinite and we will have to redefine the
energy of the vacuum state in order to “reabsorb” this infinity.
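To see how the zero-point energy (1.104) grows as the lattice is refined, here is a minimal sketch. It assumes $\omega = v/a$ with $a = L/N$, and it labels the non-zero modes of the periodic chain by $j = 1, \dots, N-1$; this labelling, the function names and the defaults $L = v = 1$ are choices made here for illustration, not taken from the notes.

```python
import math


def zero_point_energy(N, L=1.0, v=1.0):
    """Sum of omega_j / 2 over the N - 1 non-zero modes of a periodic chain.

    Uses omega_j = 2 * omega * |sin(pi * j / N)| with omega = v * N / L.
    """
    omega = v * N / L
    return sum(0.5 * 2.0 * omega * math.sin(math.pi * j / N) for j in range(1, N))


def closed_form(N, L=1.0, v=1.0):
    """Same sum via the identity sum_{j=1}^{N-1} sin(pi j / N) = cot(pi / (2N))."""
    omega = v * N / L
    return omega / math.tan(math.pi / (2.0 * N))


if __name__ == "__main__":
    # E0 is finite for every N but grows roughly like N^2 at fixed L and v.
    for N in (10, 100, 1000):
        print(N, zero_point_energy(N), closed_form(N))
```

At fixed $L$ and $v$ the sum behaves like $2vN^2/(\pi L)$ for large $N$, so it has no finite limit when $a \to 0$.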
The creation operators act on the vacuum state as follows
\[ a_j^\dagger |0\rangle = |j\rangle , \tag{1.105} \]
where $|j\rangle$ is an eigenstate of the hamiltonian with a definite energy $\omega_j$. This state is also an eigenstate
of the momentum (as we will see) and therefore corresponds to a state with a definite energy and
momentum. This quantum of excitation can be interpreted as a particle, that has a definite energy
and carries a definite momentum. Note: it has nothing to do with the particles connected with
springs that we started with. Now we are speaking about the quantization of the vibrations of the
chain (a pure quantum description).
We can act again, $n$ times, with $a_j^\dagger$ on the vacuum state, finding the state
\[ \frac{(a_j^\dagger)^n}{\sqrt{n!}}\, |0\rangle = |n_j\rangle , \tag{1.106} \]
which is a state with energy given by the sum of the single energies (so, in this case $n$ times $\omega_j$) and
momentum given by the sum of the momenta. This state can be interpreted as a state in which we
have $n$ particles with energy $\omega_j$ and definite momentum. Since $a_j^\dagger$ commutes with the other $a_k^\dagger$, the
general eigenstate of the hamiltonian can be written as follows:
\[ |n_1, n_2, ..., n_N\rangle = \frac{1}{\sqrt{n_1!\, n_2! \cdots n_N!}}\, (a_1^\dagger)^{n_1} (a_2^\dagger)^{n_2} \cdots (a_N^\dagger)^{n_N}\, |0\rangle . \tag{1.107} \]
This state represents a state in which we have $n_1$ particles with energy $\omega_1$, $n_2$ particles with energy $\omega_2$,
... $n_N$ particles with energy $\omega_N$.
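The ladder construction (1.105)-(1.106) can be tried out on a single mode. The sketch below represents a state by its list of amplitudes in the number basis and uses the standard actions $a^\dagger|n\rangle = \sqrt{n+1}\,|n+1\rangle$ and $a|n\rangle = \sqrt{n}\,|n-1\rangle$; the list representation and all function names are assumptions made for this illustration.

```python
import math


def apply_creation(state):
    """Apply a^dagger to a state given as amplitudes [c_0, c_1, ...]."""
    out = [0.0] * (len(state) + 1)
    for n, amp in enumerate(state):
        out[n + 1] += math.sqrt(n + 1) * amp
    return out


def apply_annihilation(state):
    """Apply a to a state given as amplitudes [c_0, c_1, ...]."""
    out = [0.0] * max(len(state) - 1, 1)
    for n, amp in enumerate(state):
        if n >= 1:
            out[n - 1] += math.sqrt(n) * amp
    return out


def number_state(n):
    """Build (a^dagger)^n / sqrt(n!) acting on the vacuum [1.0]."""
    state = [1.0]
    for _ in range(n):
        state = apply_creation(state)
    norm = math.sqrt(math.factorial(n))
    return [amp / norm for amp in state]


def commutator_on(state):
    """Return (a a^dagger - a^dagger a) applied to the state."""
    first = apply_annihilation(apply_creation(state))
    second = apply_creation(apply_annihilation(state))
    size = max(len(first), len(second))
    first = first + [0.0] * (size - len(first))
    second = second + [0.0] * (size - len(second))
    return [x - y for x, y in zip(first, second)]


if __name__ == "__main__":
    print(number_state(3))                 # amplitude concentrated on n = 3
    print(commutator_on([0.5, -1.0, 2.0]))  # reproduces the input state
```

Because the lists are allowed to grow, no truncation of the Fock space is involved and the commutator acts as the identity on any input state.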
This interpretation is corroborated by the analogous case of the electromagnetic radiation and the
explanation of the photoelectric effect, for which the quantum of the electromagnetic
radiation (the photon) with a given discrete energy $\hbar\omega$ was introduced.
The space we have introduced with this construction is a direct sum of Hilbert
spaces with different numbers of particles and it is called the Fock space.
Note the flexibility of this point of view! We can deal with a variable number of particles in our
state. We can excite a particle state with $a_j^\dagger$ from the vacuum in such a way as to move, for instance, from
an $n$-particle state to an $(n+1)$-particle state. We can destroy a particle in our state, acting with $a_j$ ...
and so on.

1.3.4 Commutation relations in the continuum


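Using the commutation relations for $Q_j$ and $P_j$, Eq. (1.87), we can find the commutation relations for
the fields, since
\[ \phi(x,t) = \frac{1}{\sqrt{L}} \sum_j e^{i\frac{2\pi}{L}jx}\, Q_j(t) , \tag{1.108} \]
\[ \pi(x,t) = \frac{1}{\sqrt{L}} \sum_j e^{-i\frac{2\pi}{L}jx}\, P_j(t) . \tag{1.109} \]
We have¹²
\begin{align}
[\phi(x,t), \pi(y,t)] &= \frac{1}{L} \sum_{j,k} e^{i\frac{2\pi}{L}(jx-ky)}\, Q_j(t) P_k(t) - \frac{1}{L} \sum_{j,k} e^{i\frac{2\pi}{L}(jx-ky)}\, P_k(t) Q_j(t) , \tag{1.111} \\
&\quad (\text{since } [Q_j(t), P_k(t)] = i\delta_{jk}) \nonumber \\
&= \frac{1}{L} \sum_{j,k} e^{i\frac{2\pi}{L}(jx-ky)}\, i\delta_{jk} , \tag{1.112} \\
&= \frac{i}{L} \sum_j e^{i\frac{2\pi}{L}j(x-y)} , \tag{1.113} \\
&= i\,\delta(x-y) . \tag{1.114}
\end{align}
Similarly, we find
\[ [\phi(x,t), \phi(y,t)] = [\pi(x,t), \pi(y,t)] = 0 . \tag{1.115} \]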
It is important to note the relationship between quantization conditions (in this case given by the
commutation relations) and statistics obeyed by the particles. Since

[aj , aj ′ ] = [a†j , a†j ′ ] = 0 , (1.116)

the two-particle state is such that
\[ |i,j\rangle = a_i^\dagger a_j^\dagger |0\rangle = a_j^\dagger a_i^\dagger |0\rangle = |j,i\rangle , \tag{1.117} \]
and is therefore totally symmetric under the exchange of the particles. This is the case also for multi-particle
states. This means that we are describing bosons.

1.3.5 Normal ordering


As we already noticed, in the continuum case the energy of the vacuum state becomes infinite:
\[ E_0 = \sum_{j=1}^{\infty} \omega_j . \tag{1.118} \]
However, we note that in general what matters is the energy of a state with respect to the energy of the
vacuum (i.e. a difference in energies). The “absolute” energy of the vacuum state is not an observable.
We can then “redefine” the energy of the vacuum in such a way that
\[ H|0\rangle = 0 , \tag{1.119} \]
removing (so to say) the infinite constant value and imposing that the energy of the vacuum is simply
zero (this is also needed for Lorentz invariance).
Formally, this operation is achieved by defining the “normal ordering” of the operator $H$. This is
indicated with the symbol $:H:$ and defined as follows:
\[ :H: \; = H - \langle 0|H|0\rangle , \tag{1.120} \]
or, as a rule, by putting all the creation operators in the expression to the left of the annihilation operators
(respecting the statistics).

¹² We use the following representation of the Dirac delta:
\[ \frac{1}{L} \sum_j e^{i\frac{2\pi}{L}j(x-y)} = \delta(x-y) , \tag{1.110} \]
that can be “proved” by looking at its action on a generic function $f(x) = \sum_k c_k\, e^{i\frac{2\pi}{L}kx}$:
\[
\begin{aligned}
\int_0^L \frac{1}{L} \sum_j e^{i\frac{2\pi}{L}j(x-y)} f(x)\, dx
&= \int_0^L \frac{1}{L} \sum_{j,k} e^{i\frac{2\pi}{L}j(x-y)}\, c_k\, e^{i\frac{2\pi}{L}kx}\, dx
= \int_0^L \frac{1}{L} \sum_{j,k} e^{i\frac{2\pi}{L}x(j+k)}\, e^{-i\frac{2\pi}{L}yj}\, c_k\, dx \\
&\quad (\text{sending } j \to -j) \\
&= \int_0^L \frac{1}{L} \sum_{j,k} e^{i\frac{2\pi}{L}x(k-j)}\, e^{i\frac{2\pi}{L}yj}\, c_k\, dx \\
&\quad \left(\text{since } \frac{1}{L}\int_0^L e^{i\frac{2\pi}{L}x(k-j)}\, dx = \delta_{jk}\right) \\
&= \sum_k c_k\, e^{i\frac{2\pi}{L}ky} = f(y) .
\end{aligned}
\]

Chapter 2

Special Relativity

2.1 Notes on Special Relativity


The concept of finding a class of physical frames in which one can write physics laws in a unique formal
way goes back to Newtonian mechanics and it was introduced by Galileo with the Principle of Inertia.
It can be rephrased as: In every inertial frame (IF), Physics is described by the same (in form) equation
$\mathbf{F} = m\mathbf{a}$.
This principle, with the additional constraint of the “universal time”, leads to a class of transformations,
the Galilean transformations (GT)
\[ \mathbf{x}'(t') = \mathbf{x}(t) - \mathbf{v}_0\, t , \tag{2.1} \]
\[ t' = t , \tag{2.2} \]
that leave unchanged the equations of motion, $\mathbf{F} = m\mathbf{a}$.


NB The Galilean Relativity Principle (GRP) is adapted to Newtonian mechanics. It does not take into
account Classical Electrodynamics. Maxwell's equations
\[ \left( \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2 \right) \phi = \rho , \tag{2.3} \]
\[ \left( \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2 \right) \mathbf{A} = \frac{1}{c}\, \mathbf{j} , \tag{2.4} \]
are not invariant under GT (but they are under Lorentz transformations). The point is that we see
experimentally that the speed of light, $c = 1/\sqrt{\epsilon_0 \mu_0} \simeq 3 \cdot 10^8\ \mathrm{m\,s^{-1}}$, is a universal constant, with the same
value in every inertial frame. In Maxwell's equations $c$ appears explicitly! Therefore, they cannot be
invariant under Galilean transformations. The composition of velocities is totally different in the two
cases.
At the beginning of the XXth century physicists had to understand which one, among the following
three options, was the correct one:
1. There exists a “Relativity Principle” for Mechanics, but not for Electrodynamics, and Electrodynamics
changes in every inertial frame (System of the Ether ...).

2. There exists a unique “Relativity Principle” both for Mechanics and Electrodynamics, the GRP, and
therefore Maxwell's equations are wrong.

3. There exists a unique “Relativity Principle” both for Mechanics and Electrodynamics, and GT are
only a low-speed limit of more complex invariance transformations, and $\mathbf{F} = m\mathbf{a}$ is a low-speed
limit of a formulation of Mechanics which is covariant under a new class of transformations,
Lorentz transformations.
The third hypothesis turned out to be the correct one. Based on electromagnetism, mechanics was
reformulated through a redefinition of the concept of time.

The theory of Special Relativity, formalized by Einstein in 1905, is based on the following two
postulates:

• Physical laws are the same in every inertial frame.

• The speed of light is the same in every inertial frame, and the relation $c = 1/\sqrt{\epsilon_0 \mu_0}$ applies.

The second postulate leads to a critique of the concept of simultaneous events, which now has
to depend on the reference system. The absolute time, à la Newton, loses meaning and the
necessity emerges to consider time on the same ground as space coordinates.

2.1.1 Simultaneous events


In a given inertial frame, physical phenomena are analyzed in terms of events: the physical phenomenon
“happens” in a certain point x at a certain time t. The event is indicated with the following vector:
(ct, x) in Minkowski space (see later).

Definition 2.1.1 In a given IF we say that two events are simultaneous if they happen in two different
space points and two light rays moving from each point in the direction of the other meet at half of the
distance.

It is clear that, if the speed of light is the same in every IF, a pair of events that are simultaneous
in one IF need not be simultaneous in another IF.
Suppose that in a given IF, S, a light signal is emitted from $P_1 = (x_1, y_1, z_1)$ at $t_1$ and reaches
$P_2 = (x_2, y_2, z_2)$ at $t_2$. Since the speed of light is $c$, we will have
\[ (x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2 = c^2 (t_1 - t_2)^2 . \tag{2.5} \]
If S′ is another IF in which at $t_1$ $P_1$ coincides with $P_1'$ and $P_2$ with $P_2'$, since $c$ is the same in both IF
we will have
\[ (x_1' - x_2')^2 + (y_1' - y_2')^2 + (z_1' - z_2')^2 = c^2 (t_1' - t_2')^2 . \tag{2.6} \]

Definition 2.1.2 The expression
\[ \Delta s_{12}^2 = c^2 (t_1 - t_2)^2 - (x_1 - x_2)^2 - (y_1 - y_2)^2 - (z_1 - z_2)^2 , \tag{2.7} \]
can be taken as the interval between the two events in S and it is a relativistic invariant.

If $\Delta s_{12}^2 = 0$ in S, we have $\Delta s_{12}'^2 = 0$ in S′.

Definition 2.1.3 The infinitesimal interval will be

ds2 = c2 dt2 − dx2 − dy 2 − dz 2 . (2.8)

Property 2.1.4 ds2 is a relativistic invariant.

Since $ds^2 = 0$ implies $ds'^2 = 0$, it means that they are infinitesimals of the same order. We can put
\[ ds^2 = a\, ds'^2 . \tag{2.9} \]
Since we want the space-time to be homogeneous and isotropic, $a$ cannot depend on $X^\mu$, nor
on the vector $\mathbf{v}$, the relative velocity of the frame S′ with respect to S. It could depend on $v = |\mathbf{v}|$,
the modulus of the relative velocity. However, let us consider three reference systems, $S_1$, $S_2$ and $S_3$. $S_2$
moves with respect to $S_1$ with velocity $\mathbf{v}_2$ and $S_3$ moves with respect to $S_1$ with velocity $\mathbf{v}_3$. $S_3$ will
move with respect to $S_2$ with velocity $\mathbf{v}_{23}$. Therefore, we have
\[ ds_1^2 = a(v_2)\, ds_2^2 , \qquad ds_1^2 = a(v_3)\, ds_3^2 , \qquad ds_2^2 = a(v_{23})\, ds_3^2 . \tag{2.10} \]
Taking the ratio of the first two equations we have that
\[ ds_2^2 = \frac{a(v_3)}{a(v_2)}\, ds_3^2 , \tag{2.11} \]
but also (considering the third one)
\[ ds_2^2 = a(v_{23})\, ds_3^2 . \tag{2.12} \]
Therefore
\[ \frac{a(v_3)}{a(v_2)} = a(v_{23}) . \tag{2.13} \]
The r.h.s. depends on $v_2$, $v_3$ but also on the directions of $\mathbf{v}_2$ and $\mathbf{v}_3$ ($v_{23}$ is the modulus of the relative
velocity), while the l.h.s. does not depend on the directions. This means that $a$ must be a constant,
and therefore $a = 1$. In the end
\[ ds^2 = ds'^2 . \tag{2.14} \]
NOTE: the interval $\Delta s^2$ is not positive definite (as it is instead in the Euclidean case) but it can be
> 0, < 0 or = 0.

1. If there exists an inertial frame in which the two events happen at the same spatial point, but
at subsequent times, in that frame we must have

∆s′2 = c2 ∆t′2 > 0 . (2.15)

Since ∆s2 is a relativistic invariant, in another frame we will have, in any case,

∆s2 = c2 ∆t2 − ∆l2 = ∆s′2 > 0 . (2.16)

We call this interval a time-like interval. In this case the two events can be connected by a
cause-effect relationship.

2. If there exists an inertial frame in which the two events happen at two different spatial points,
but at the same time, in that frame we must have

∆s′2 = −∆l′2 < 0 . (2.17)

Since ∆s2 is a relativistic invariant, in another frame we will have, in any case,

∆s2 = c2 ∆t2 − ∆l2 = ∆s′2 < 0 . (2.18)

We call this interval a space-like interval. In this case the two events cannot be causally
connected.

3. Finally, an interval for which ∆s2 = 0 is called light-like.

The property to be time-like, space-like or light-like is a characteristic of the vector and does not
depend on the inertial frame.
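The three cases above can be summarized in a small helper that computes $\Delta s^2$ for two events and names the type of interval. This is only a sketch: events are given as tuples $(t, x, y, z)$, the default $c = 1$ and the tolerance used to decide $\Delta s^2 = 0$ are assumptions of the illustration.

```python
def interval_squared(event_a, event_b, c=1.0):
    """Delta s^2 = c^2 dt^2 - dx^2 - dy^2 - dz^2 for events (t, x, y, z)."""
    dt = event_a[0] - event_b[0]
    dx = event_a[1] - event_b[1]
    dy = event_a[2] - event_b[2]
    dz = event_a[3] - event_b[3]
    return (c * dt) ** 2 - dx ** 2 - dy ** 2 - dz ** 2


def classify_interval(event_a, event_b, c=1.0, tol=1e-12):
    """Return 'time-like', 'space-like' or 'light-like'."""
    s2 = interval_squared(event_a, event_b, c)
    if s2 > tol:
        return "time-like"
    if s2 < -tol:
        return "space-like"
    return "light-like"


if __name__ == "__main__":
    origin = (0.0, 0.0, 0.0, 0.0)
    print(classify_interval(origin, (2.0, 1.0, 0.0, 0.0)))
    print(classify_interval(origin, (1.0, 2.0, 0.0, 0.0)))
    print(classify_interval(origin, (1.0, 1.0, 0.0, 0.0)))
```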

Figure 2.1: Minkowski space and the light cone. (Axes $X^0 = ct$ and $X^1$; the lines $X^2 = 0$ delimit the light cone; $P_1$, $P_2$ and $P_3$ are sample events.)

2.1.2 Causal structure of the Space-Time


Let us consider an event O in the space-time as the origin of our frame (for simplicity of representation
we consider a 1+1 dimensional Minkowski space) as in Fig. 2.1. The events for which $c^2 \Delta t^2 = \Delta l^2$,
reported as $X^2 = 0$, are represented in the diagram as straight lines at 45° and they are characteristic
of the propagation of light. This means that an event on one of these straight lines is connected to
the origin O by a signal that travels at the speed of light. The region between the two lines at 45°
is called the light cone (“cone” because in more dimensions it is a cone). Events within the light cone,
like $P_1$ or $P_3$, have time-like distance from O and, therefore, they can be in causal relationship with
O. $P_3$ “happens” before O, while $P_1$ after. We can make a Lorentz transformation to a frame in which
$P_3$ and O happen in the same spatial place but at two subsequent instants. In the same way we can
find a frame in which O and $P_1$ happen in the same spatial place but at two subsequent instants. Events
in the region outside the light cone, like $P_2$, cannot be causally connected with O. In fact, a
signal from $P_2$, for instance, in order to reach O would have to travel at a speed greater than the speed
of light and this is not possible. We can find a frame in which O and $P_2$ happen simultaneously in two
separate space points (Note: in the case of space-like separations, we can also find a frame in which
the temporal succession of the two events is inverted).

2.1.3 Lorentz transformations: Boosts


Let us consider two inertial frames, S and S ′ . S ′ , for instance, will move with respect to S with
constant velocity v. Knowing the coordinates of one event in S, say (ct, x, y, z), we would like to find
the transformation laws that allow us to represent the same event in S ′ , (ct′ , x′ , y ′ , z ′ ).
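Let us suppose for simplicity that S′ moves with respect to S with a translation in the $x$ direction
and that the axes $x$ and $x'$ coincide.

[Figure: frame S with axes $x$, $y$, $z$ and origin O, and frame S′ with axes $x'$, $y'$, $z'$ and origin O′, moving with velocity $v$ along $x$.]

If an event has coordinates $X^\mu$ in S, its coordinates $X'^\mu$ in S′
will be:
\[
\begin{cases}
X'^0 = \gamma X^0 - \beta\gamma X^1 \\
X'^1 = -\beta\gamma X^0 + \gamma X^1 \\
X'^2 = X^2 \\
X'^3 = X^3
\end{cases}
\tag{2.19}
\]
where $\beta = \frac{v}{c}$ and $\gamma = \frac{1}{\sqrt{1-\beta^2}}$. Eq. (2.19) can be written in the following way:
\[ X'^\mu = \Lambda^\mu{}_\nu X^\nu , \tag{2.20} \]
where we used the Lorentz transformation $\Lambda^\mu{}_\nu$, that can be written in matrix form as follows:
\[
\Lambda = \begin{pmatrix}
\gamma & -\beta\gamma & 0 & 0 \\
-\beta\gamma & \gamma & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix} .
\tag{2.21}
\]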

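Another way to parametrize $\Lambda$ is through hyperbolic functions. In fact, we know that $\gamma^2 (1 - \beta^2) = 1$.
Therefore, we can define an imaginary angle $\phi$ (the rapidity), such that
\[ \gamma = \cosh\phi , \qquad \text{and} \qquad \gamma\beta = \sinh\phi . \tag{2.22} \]
Then, we can write:
\[
\Lambda = \begin{pmatrix}
\cosh\phi & -\sinh\phi & 0 & 0 \\
-\sinh\phi & \cosh\phi & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix} .
\tag{2.23}
\]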
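The inverse Lorentz transformation is the one that gives the coordinates $X^\mu$ in S, knowing $X'^\mu$ in
S′, and can be found immediately by reversing the sign of the velocity in (2.21):
\[
\begin{cases}
X^0 = \gamma X'^0 + \beta\gamma X'^1 \\
X^1 = \beta\gamma X'^0 + \gamma X'^1 \\
X^2 = X'^2 \\
X^3 = X'^3
\end{cases}
\tag{2.24}
\]
and then
\[ X^\nu = \Lambda_\mu{}^\nu X'^\mu , \tag{2.25} \]
where now $(\Lambda^{-1})^\nu{}_\mu = \Lambda_\mu{}^\nu$ is such that
\[
\Lambda^{-1} = \begin{pmatrix}
\gamma & \beta\gamma & 0 & 0 \\
\beta\gamma & \gamma & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\tag{2.26}
\]
and
\[ \Lambda_\rho{}^\nu \Lambda^\mu{}_\nu = \eta_\rho{}^\mu = \delta_\rho^\mu , \tag{2.27} \]
that can be checked by multiplying Eq. (2.21) times Eq. (2.26) and using $\gamma^2 - \beta^2\gamma^2 = 1$:
\[
\Lambda \Lambda^{-1} =
\begin{pmatrix}
\gamma & -\beta\gamma & 0 & 0 \\
-\beta\gamma & \gamma & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
\gamma & \beta\gamma & 0 & 0 \\
\beta\gamma & \gamma & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
=
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix} .
\tag{2.28}
\]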
Note that the boost defined in this way is particularly simple, in the sense that the translation
with velocity $v$ is done in the $x$ direction only, $\mathbf{v} = v\,\hat{\imath}$. In general the velocity can point
in an arbitrary direction (we will write the general boost below). Moreover, it can happen that the frame
S′ does not have the axes $x'$, $y'$ and $z'$ parallel to $x$, $y$ and $z$. In this case we can rotate S′ in such a way
that its axes become parallel to the axes of S. This corresponds to an isometry in the
three-dimensional space. Since the time is not affected, only $d\mathbf{x}$ will change. However, since we are
speaking about an isometry, we will have $|d\mathbf{x}'| = |d\mathbf{x}|$ and therefore, in the end, $ds'^2 = c^2 dt^2 - d\mathbf{x}'^2 =
c^2 dt^2 - d\mathbf{x}^2 = ds^2$. This means that the spatial rotations are part of the Lorentz transformations (they
leave $ds^2$ unchanged). A general Lorentz transformation is a composition of a rigid rotation and a boost in
the $\mathbf{v}$ direction.
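The properties of the boost (2.21) discussed above can be verified numerically. The sketch below builds the matrix for a boost along $x$, applies it to an event $(X^0, X^1, X^2, X^3)$ with $X^0 = ct$, and checks that the interval is preserved and that the boost with $-\beta$ undoes it. Plain nested lists stand in for matrices; the function names and sample numbers are assumptions of this illustration.

```python
import math


def boost_x(beta):
    """Matrix of Eq. (2.21) for a boost along x with beta = v/c, |beta| < 1."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [
        [gamma, -beta * gamma, 0.0, 0.0],
        [-beta * gamma, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def apply(matrix, X):
    """Compute X'^mu = Lambda^mu_nu X^nu."""
    return [sum(matrix[mu][nu] * X[nu] for nu in range(4)) for mu in range(4)]


def interval(X):
    """Invariant (X^0)^2 - (X^1)^2 - (X^2)^2 - (X^3)^2."""
    return X[0] ** 2 - X[1] ** 2 - X[2] ** 2 - X[3] ** 2


if __name__ == "__main__":
    X = [2.0, 1.0, 0.5, -0.3]
    X_prime = apply(boost_x(0.6), X)
    print(X_prime, interval(X), interval(X_prime))
    print(apply(boost_x(-0.6), X_prime))  # back to the original event
```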

Non relativistic limit


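If we consider the limit in which
\[ \frac{v}{c} \ll 1 , \tag{2.29} \]
we have
\[ \frac{1}{\sqrt{1-\beta^2}} \simeq 1 + \frac{1}{2}\beta^2 + ... \tag{2.30} \]
and therefore at zeroth order in $\beta$ we find
\[
\begin{cases}
t' \simeq t \\
x' \simeq x - vt \\
y' = y \\
z' = z
\end{cases}
\tag{2.31}
\]
i.e. the Galilean transformations.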

2.1.4 Boost in a general direction


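The boost in the $x$ direction shows the general feature that only the components in the direction of
the velocity, and the time, are affected by the transformation. The components perpendicular to the
direction of the velocity are not. We can write a boost in a general direction $\hat{\mathbf{v}}$, decomposing the vector
$\mathbf{X}$ as the sum of two vectors: one parallel and the other perpendicular to $\hat{\mathbf{v}}$.
\[ \mathbf{X} = \mathbf{X}_\parallel + \mathbf{X}_\perp , \qquad \mathbf{X}' = \mathbf{X}'_\parallel + \mathbf{X}'_\perp . \tag{2.32} \]
Then, the boost can be written as follows:
\[
\begin{cases}
X'^0 = \dfrac{X^0 - \beta X_\parallel}{\sqrt{1-\beta^2}} = \dfrac{X^0 - \beta\, \mathbf{X}\cdot\hat{\mathbf{v}}}{\sqrt{1-\beta^2}} \\[2ex]
\mathbf{X}'_\parallel = \dfrac{\mathbf{X}_\parallel - \beta X^0\, \hat{\mathbf{v}}}{\sqrt{1-\beta^2}} \\[2ex]
\mathbf{X}'_\perp = \mathbf{X}_\perp .
\end{cases}
\tag{2.33}
\]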
2.1.5 Transformation of the three-velocity
It is interesting to look at the composition of velocities in special relativity. We will demonstrate the following:
if $\mathbf{u}$ is the three-velocity of a material point that moves with respect to the observer in S, the
observer in S′ moves with respect to S with a velocity $\mathbf{v}$, and $\mathbf{u}'$ is the velocity of the point seen by S′,
then $|\mathbf{u}'| \le c$ and $|\mathbf{v}| \le c$ imply $|\mathbf{u}| \le c$. For simplicity we consider a boost in the $x$ direction. We have
\[
\begin{cases}
t' = \gamma \left( t - \dfrac{v}{c^2}\, x \right) \\
x' = \gamma\, (x - vt) \\
y' = y \\
z' = z
\end{cases}
\tag{2.34}
\]
and the inverse transformation given by
\[
\begin{cases}
t = \gamma \left( t' + \dfrac{v}{c^2}\, x' \right) \\
x = \gamma\, (x' + vt') \\
y = y' \\
z = z'
\end{cases}
\tag{2.35}
\]

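Let us consider the three-velocity $\mathbf{u}$ of the material point in components. We have
\[ u_x = \frac{dx}{dt} = \frac{dx}{dt'}\frac{dt'}{dt} = \frac{d}{dt'}\left( \frac{x' + vt'}{\sqrt{1-\beta^2}} \right) \frac{d}{dt}\left( \frac{t - \frac{v}{c^2}x}{\sqrt{1-\beta^2}} \right) = \frac{u_x' + v}{\sqrt{1-\beta^2}}\; \frac{1 - \frac{v}{c^2}u_x}{\sqrt{1-\beta^2}} . \tag{2.36} \]
From Eq. (2.36) we find
\[ u_x (1 - \beta^2) = u_x' - \frac{v}{c^2}\, u_x u_x' + v - \beta^2 u_x , \tag{2.37} \]
and therefore
\[ u_x = \frac{u_x' + v}{1 + \frac{v}{c^2}u_x'} . \tag{2.38} \]
For the component in $y$ we have
\[ u_y = \frac{dy}{dt'}\frac{dt'}{dt} = ... = \frac{u_y' \sqrt{1-\beta^2}}{1 + \frac{v}{c^2}u_x'} \tag{2.39} \]
and for $u_z$
\[ u_z = \frac{dz}{dt'}\frac{dt'}{dt} = ... = \frac{u_z' \sqrt{1-\beta^2}}{1 + \frac{v}{c^2}u_x'} . \tag{2.40} \]
In summary
\[
\begin{cases}
u_x = \dfrac{u_x' + v}{1 + \frac{v}{c^2}u_x'} , \\[2ex]
u_y = \dfrac{u_y' \sqrt{1-\beta^2}}{1 + \frac{v}{c^2}u_x'} , \\[2ex]
u_z = \dfrac{u_z' \sqrt{1-\beta^2}}{1 + \frac{v}{c^2}u_x'} .
\end{cases}
\tag{2.41}
\]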

Note that if $c \to \infty$ (or better if we consider the limit $v/c \ll 1$) we find the Galilean composition of
the velocities
\[
\begin{cases}
u_x \simeq u_x' + v , \\
u_y \simeq u_y' , \\
u_z \simeq u_z' .
\end{cases}
\tag{2.42}
\]
Let us assume that the point moves in the $x$ direction only (for simplicity) and that $u_x' \le c$, $v \le c$.
Then we have
\[ (c - u_x') \ge 0 , \qquad \text{and} \qquad (c - v) \ge 0 . \tag{2.43} \]
It follows that
\[ (c - u_x')(c - v) = c^2 - cv - u_x' c + v u_x' \ge 0 \tag{2.44} \]
and then (dividing by $c^2$, which is positive)
\[ 1 + \frac{u_x' v}{c^2} \ge \frac{u_x' + v}{c} . \tag{2.45} \]
Looking at the $x$ component in Eq. (2.41), we get (for $u_x \ge 0$)
\[ u_x' + v = u_x \left( 1 + \frac{v}{c^2}u_x' \right) \ge u_x\, \frac{u_x' + v}{c} \tag{2.46} \]
and therefore, for $u_x' + v > 0$,
\[ u_x \le c . \tag{2.47} \]
If $u_x' = c$, we find immediately that
\[ u_x = \frac{u_x' + v}{1 + \frac{v}{c^2}u_x'} = \frac{c + v}{1 + \frac{v}{c}} = c . \tag{2.48} \]
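The composition law (2.38) and the bound (2.47) are easy to test numerically. The sketch below works in units with $c = 1$ by default and also checks the equivalent statement that rapidities, defined through $\tanh\phi = \beta$ as in (2.22), simply add for collinear boosts. Function names and sample values are assumptions of this illustration.

```python
import math


def compose(u_prime, v, c=1.0):
    """Collinear composition of velocities, Eq. (2.38)."""
    return (u_prime + v) / (1.0 + u_prime * v / (c * c))


def compose_via_rapidity(u_prime, v, c=1.0):
    """Same composition using rapidities: tanh(phi) = velocity / c (|velocity| < c)."""
    phi = math.atanh(u_prime / c) + math.atanh(v / c)
    return c * math.tanh(phi)


if __name__ == "__main__":
    print(compose(0.9, 0.9))                  # stays below c = 1
    print(compose(1.0, 0.5))                  # light keeps speed c
    print(compose(0.3, 0.4), compose_via_rapidity(0.3, 0.4))
```

In the limit $u'v/c^2 \to 0$ the denominator goes to 1 and the Galilean sum $u' + v$ is recovered.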

2.2 Kinematics of the classical particle


We now want to describe the kinematics and the dynamics of a point-like massive particle in a covariant
way. The goal is to be able to re-write the second principle of dynamics in a manifestly covariant way,
using tensor relations, in such a way that for $v \ll c$ we can recover Newtonian mechanics.

2.2.1 Four-velocity and four-acceleration


In newtonian mechanics we introduce the velocity of the particle as $\mathbf{v} = \frac{d\mathbf{x}}{dt}$. An obvious relativistic
generalization of $d\mathbf{x}$ is $dX^\mu$. However, $dt$ is not a relativistic invariant, and therefore $\frac{dX^\mu}{dt}$ does not
transform as a four-vector. We should find an invariant that can replace $dt$.
We know that
\[ ds^2 = c^2 dt^2 - dx^2 - dy^2 - dz^2 \tag{2.49} \]
is an invariant. Let us then perform a LT to an inertial frame in which $dx' = dy' = dz' = 0$. If we
call $\tau$ the time in that frame, i.e. the time in the frame in which the particle is at rest, we have
\[ d\tau = dt\, \sqrt{1-\beta^2} \tag{2.50} \]
and
\[ ds^2 = c^2 d\tau^2 , \tag{2.51} \]
so that $d\tau$ is also an invariant. $\tau$ is called the proper time of the particle.
Let us consider now the following vector (with the dimensions of a velocity)
\[ U^\mu = (U^0, \mathbf{U}) = \frac{dX^\mu}{d\tau} = \gamma\, \frac{dX^\mu}{dt} . \tag{2.52} \]
$U^\mu$ is indeed a four-vector, since $dX^\mu$ is a four-vector and $d\tau$ is an invariant. We have
\[ U^0 = \frac{1}{\sqrt{1-\beta^2}}\, \frac{dX^0}{dt} = \frac{c}{\sqrt{1-\beta^2}} , \tag{2.53} \]
\[ \mathbf{U} = \frac{1}{\sqrt{1-\beta^2}}\, \frac{d\mathbf{X}}{dt} = \frac{\mathbf{v}}{\sqrt{1-\beta^2}} . \tag{2.54} \]
$U^\mu$ is a time-like vector, since
\[ U^\mu U_\mu = \left( \frac{c}{\sqrt{1-\beta^2}} \right)^2 - \left( \frac{v}{\sqrt{1-\beta^2}} \right)^2 = c^2 > 0 . \tag{2.55} \]
Following the same line, we can define the four-acceleration
\[ A^\mu = \frac{dU^\mu}{d\tau} = \gamma\, \frac{dU^\mu}{dt} . \tag{2.56} \]
The components of $A^\mu$ are
\[ A^0 = \gamma\, \frac{dU^0}{dt} = ... = \frac{\mathbf{v}\cdot\mathbf{a}}{c\,(1-\beta^2)^2} , \tag{2.57} \]
\[ \mathbf{A} = \gamma\, \frac{d\mathbf{U}}{dt} = ... = \frac{\mathbf{a}}{(1-\beta^2)} + \frac{\mathbf{v}\cdot\mathbf{a}}{c^2 (1-\beta^2)^2}\, \mathbf{v} , \tag{2.58} \]
where $\mathbf{v} = \frac{d\mathbf{x}}{dt}$ and $\mathbf{a} = \frac{d\mathbf{v}}{dt}$. Note that for $c \to \infty$ the temporal component of $A^\mu$ goes to
zero, while the spatial component becomes $\mathbf{a}$, the usual non relativistic acceleration.

2.2.2 Four-momentum
In newtonian mechanics an important quantity is the momentum of the particle, which is defined as
$\mathbf{p} = m\mathbf{v}$. A covariant form of $\mathbf{p}$ can be constructed in the following way
\[ P^\mu = m U^\mu , \tag{2.59} \]
where $m$ coincides with the inertial mass of the particle when $v \ll c$. We have
\[ P^\mu = \left( \frac{mc}{\sqrt{1-\beta^2}} , \frac{m\mathbf{v}}{\sqrt{1-\beta^2}} \right) , \tag{2.60} \]
which is called the energy-momentum four-vector. $P^\mu$ is such that
\[ P^2 = P^\mu P_\mu = \frac{m^2 c^2}{1-\beta^2} - \frac{m^2 v^2}{1-\beta^2} = m^2 c^2 > 0 . \tag{2.61} \]
It is a time-like vector. The relation $P^\mu P_\mu = m^2 c^2$ is called the mass-shell relation. Since the lagrangian
of the free particle is
\[ L = -mc^2 \sqrt{1-\beta^2} , \tag{2.62} \]
such that the three-momentum $\mathbf{p}$ is actually
\[ \mathbf{p} = \frac{\partial L}{\partial \mathbf{v}} = \frac{m\mathbf{v}}{\sqrt{1-\beta^2}} , \tag{2.63} \]
we can look at the energy, performing a Legendre transformation to
\[ E = \mathbf{p}\cdot\mathbf{v} - L = \frac{mc^2}{\sqrt{1-\beta^2}} . \tag{2.64} \]
Note that even if $v = 0$ the energy of the free particle is not zero, but
\[ E \xrightarrow{\; v \to 0 \;} mc^2 . \tag{2.65} \]
Using Eq. (2.64), we have
\[ P^\mu = \left( \frac{mc}{\sqrt{1-\beta^2}} , \frac{m\mathbf{v}}{\sqrt{1-\beta^2}} \right) = \left( \frac{E}{c} , \mathbf{p} \right) \tag{2.66} \]
and from the mass-shell relation we have
\[ \frac{E^2}{c^2} = |\mathbf{p}|^2 + m^2 c^2 . \tag{2.67} \]
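The mass-shell relation (2.61) and the energy formula (2.64) can be checked for a sample particle. The sketch below builds $P^\mu = (E/c, \mathbf{p})$ from a mass and a three-velocity and evaluates $P^\mu P_\mu$; the default $c = 1$, the function names and the sample numbers are assumptions of this illustration.

```python
import math


def four_momentum(m, velocity, c=1.0):
    """Return [E/c, p_x, p_y, p_z] for mass m and three-velocity (vx, vy, vz)."""
    v2 = sum(component ** 2 for component in velocity)
    gamma = 1.0 / math.sqrt(1.0 - v2 / (c * c))
    return [gamma * m * c] + [gamma * m * component for component in velocity]


def minkowski_square(P):
    """P^mu P_mu with signature (+, -, -, -)."""
    return P[0] ** 2 - P[1] ** 2 - P[2] ** 2 - P[3] ** 2


if __name__ == "__main__":
    m = 2.0
    P = four_momentum(m, (0.3, 0.4, 0.0))
    print(P, minkowski_square(P), m ** 2)       # P^2 equals m^2 c^2
    print(four_momentum(m, (0.0, 0.0, 0.0))[0])  # rest energy over c, i.e. m c
```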

2.3 Vectors and Tensors


After the introduction of Lorentz transformations, we now want to study how mathematical objects,
that will be used to describe our Physics, transform under Lorentz transformations (LT). This is the
subject of Tensor Analysis.
Let us start introducing a more general definition of vectors in a Euclidean space.

2.3.1 Vectors and Contravariant Components


In Special Relativity (SR) we have to deal with different kinds of vectors. The fact that in Newtonian
mechanics, for instance, we need just the usual Euclidean definition is simply due to the fact that
usually we use an orthonormal system of basis vectors for the vector space. In this situation the metric
tensor reduces to a Kronecker delta and it becomes impossible to appreciate the difference
between different definitions of vectors.

Let us consider a vector space $V$ on $\mathbb{R}$. Let $\{\mathbf{e}_i\}$ be a set of independent vectors which constitutes
a basis for $V$.
If $\mathbf{v} \in V$, it can be expressed as a linear combination of the basis vectors
\[ \mathbf{v} = v^i\, \mathbf{e}_i , \qquad \text{with } i = 1, \cdots, \dim(V) . \tag{2.68} \]

The real numbers $v^i$ are called the contravariant components of $\mathbf{v}$. The place of the index $i$, as a
superscript, is relevant. As we will see in a moment, components with an index as subscript describe a
different kind of vector.
Let us consider now a different basis of $V$, $\{\mathbf{e}'_i\}$, and let $\Lambda$ be the transformation from the old to
the new basis. We have
\[ \mathbf{e}'_i = \Lambda^j{}_i\, \mathbf{e}_j . \tag{2.69} \]
Note that the index $j$ of $\mathbf{e}_j$ is contracted with the upper index of $\Lambda$. Under a basis transformation, the
components of $\mathbf{v}$ transform accordingly. The transformation law is the following. Remember that the
vector $\mathbf{v}$ is an absolute quantity, that can be represented using different bases. But $\mathbf{v}$ is always the
same vector. Therefore, in the new basis we will have
\[ \mathbf{v} = v'^i\, \mathbf{e}'_i = v'^i \Lambda^j{}_i\, \mathbf{e}_j , \tag{2.70} \]
but we can also write
\[ \mathbf{v} = v^j\, \mathbf{e}_j , \tag{2.71} \]
and matching Eq. (2.70) and Eq. (2.71) we find
\[ v^j = \Lambda^j{}_i\, v'^i . \tag{2.72} \]
Note that the index of $v'^i$ is contracted with the lower index of $\Lambda$ (it goes with the transposed).
Multiplying Eq. (2.72) by $(\Lambda^{-1})^l{}_j$ on the l.h.s. and r.h.s., we have
\[ (\Lambda^{-1})^l{}_j\, v^j = (\Lambda^{-1})^l{}_j \Lambda^j{}_i\, v'^i = (\Lambda^{-1}\Lambda)^l{}_i\, v'^i = \delta^l_i\, v'^i = v'^l . \tag{2.73} \]
Finally
\[ v'^l = (\Lambda^{-1})^l{}_j\, v^j . \tag{2.74} \]
Therefore, if the basis transforms with $\Lambda$, the contravariant components of $\mathbf{v}$ transform with the inverse
transposed of $\Lambda$, $(\Lambda^T)^{-1} = (\Lambda^{-1})^T$.
In matrix notation
\[ \mathbf{v} = \Lambda^T \mathbf{v}' , \tag{2.75} \]
\[ \mathbf{v}' = (\Lambda^T)^{-1} \mathbf{v} = (\Lambda^{-1})^T \mathbf{v} . \tag{2.76} \]

2.3.2 Dual vectors and covariant components


Once the vector space $V$ is defined, the “dual” space $V^*$ is automatically defined as well. It is
the vector space of linear functionals on $V$:
\[ \sigma : V \to \mathbb{R} , \tag{2.77} \]
\[ \mathbf{v} \to \sigma(\mathbf{v}) . \tag{2.78} \]
Since $V^*$ is a vector space, we can find a basis $\{k^i\}$ in which the functional $\sigma$ can be represented
in a unique way as
\[ \sigma = \sigma_i\, k^i . \tag{2.79} \]
The set $\sigma_i$ are real numbers that represent the components of $\sigma$ in this basis.
Although $V$ and $V^*$ are different spaces, they are connected. They have the same dimensionality
and they are isomorphic, but they are different! If the basis changes in $V$, this will imply a change of
basis of $V^*$. Therefore, we can ask how the components of $\sigma$ behave under the basis transformation in
Eq. (2.69). We labeled the components of $\sigma$ in the $\{k^i\}$ basis with a lower index because the properties
of these components under a basis transformation in $V$ are different from those of the contravariant
components of a vector in $V$.
Using (2.79), we can write
\[ \sigma(\mathbf{v}) = \sigma_i\, k^i(\mathbf{v}) = \sigma_i\, k^i(v^j \mathbf{e}_j) = \sigma_i v^j\, k^i(\mathbf{e}_j) . \tag{2.80} \]
The number $k^i(\mathbf{e}_j)$ tells how the elements of the basis in the functional space $V^*$ act on the
elements of the basis in $V$. We say that the two chosen bases are “dual” when we have
\[ k^i(\mathbf{e}_j) = \delta^i_j , \tag{2.81} \]
with $\delta$ the Kronecker delta, $\delta^i_i = 1$, $\delta^i_j = 0$ if $i \ne j$. In this case the situation is much simpler and we
have
\[ \sigma(\mathbf{v}) = \sigma_i v^i . \tag{2.82} \]
Note that (2.82) is not a scalar product! It is the sum of the products of the corresponding components
of $\sigma$ and $\mathbf{v}$, vectors that belong to two different vector spaces.
Let us consider dual bases. If we apply the basis functionals $\{k^i\}$ to the vector $\mathbf{v} \in V$ we have
\[ k^i(\mathbf{v}) = k^i(v^j \mathbf{e}_j) = v^j\, k^i(\mathbf{e}_j) = v^i , \tag{2.83} \]
because of (2.81). Therefore, the action of $k^i$ on $\mathbf{v}$ is to extract its contravariant component. On the
other hand, we have
\[ \sigma(\mathbf{e}_j) = \sigma_i\, k^i(\mathbf{e}_j) = \sigma_j , \tag{2.84} \]
because of (2.79) and (2.81).

If we consider the change of basis (2.69), it will imply a change of basis in $V^*$, say from $k^i$ to ${k'}^i$.
In the basis ${k'}^i$ the expression of σ will be given by

$\sigma = \sigma'_i\, {k'}^i$ . (2.85)

We have, because of (2.84),

$\sigma'_i = \sigma(e'_i) = \sigma_j k^j(e'_i) = \sigma_j k^j(\Lambda^l{}_i e_l) = \sigma_j \Lambda^l{}_i k^j(e_l) = \sigma_j \Lambda^l{}_i \delta^j_l = \sigma_j \Lambda^j{}_i$ . (2.86)

In summary
$\sigma'_i = \Lambda^j{}_i\, \sigma_j$ (2.87)
and the components $\sigma_i$ transform in the same way as the basis (as in (2.69)). That is why they are called “covariant” components.

Scalar Product and Metric Tensor


Just to have a practical example in mind, let us introduce the scalar product and rephrase what we just said in this case.
The scalar product between two vectors of V is a map V × V → R which is bilinear, symmetric and non-degenerate:
$v, w \in V \to (v, w) \in \mathbb{R}$ . (2.88)
The scalar product induces a norm on V, which in turn induces a metric. Therefore, with a scalar product our vector space becomes a metric space.
Let us fix the first vector v and consider the scalar product with every other vector w ∈ V. In this way we define a functional $f_v = (v, \cdot)$ such that

$w \in V \to f_v(w) = (v, w) \in \mathbb{R}$ . (2.89)

$f_v$ is formally a vector of the dual space of V, $V^*$. We can choose a basis in $V^*$, dual to $\{e_i\}$. Let us call it $\{k^i\}$, with the index i as superscript. Therefore, $f_v$ will be expressed in a unique way in this basis:

$f_v = f_i k^i$ . (2.90)

Because of the definition (2.89), in this case we have

$f_i = f_v(e_i) = (v, e_i) = v_i$ . (2.91)

We call them the covariant components of v.


We have, in particular

$f_v(w) = (v, w) = v^i (e_i, w) = v^i w^j (e_i, e_j) = v^i w^j g_{ij}$ , (2.92)

where we introduced the “metric tensor”

$g_{ij} = (e_i, e_j)$ . (2.93)

The metric tensor is symmetric (by construction). If it is also positive definite, there is a theorem that proves that with a change of basis we can find

$g_{ij} = (e_i, e_j) = \delta_{ij}$ . (2.94)

In this case we see that covariant and contravariant components are exactly the same. If the metric tensor is not positive definite, then they are different and related by the metric tensor

$v_i = g_{ij} v^j$ . (2.95)

This is the case of Special Relativity.
Since the covariant components of v are defined as

$v_i = (v, e_i) = (v^j e_j, e_i) = v^j (e_j, e_i) = g_{ij} v^j$ , (2.96)

under a basis transformation, as in Eq. (2.69), they change according to the following relation

$v'_i = (v, e'_i) = (v^j e_j, \Lambda^\rho{}_i e_\rho) = v^j \Lambda^\rho{}_i (e_j, e_\rho) = v^j \Lambda^\rho{}_i g_{j\rho} = \Lambda^\rho{}_i v_\rho$ . (2.97)

In summary
$v'_i = \Lambda^j{}_i\, v_j$ (2.98)

Now we can ask how the metric tensor transforms under (2.69). We have

$g'_{ij} = (e'_i, e'_j) = (\Lambda^\rho{}_i e_\rho, \Lambda^\sigma{}_j e_\sigma) = \Lambda^\rho{}_i \Lambda^\sigma{}_j (e_\rho, e_\sigma) = \Lambda^\rho{}_i \Lambda^\sigma{}_j g_{\rho\sigma}$ . (2.99)

Therefore

$g'_{ij} = \Lambda^\rho{}_i \Lambda^\sigma{}_j\, g_{\rho\sigma}$ . (2.100)

A two-index object, $g_{ij}$, that transforms as in Eq. (2.100) is called a covariant tensor of rank 2.
Using Eq. (2.100) we can show that the scalar product is an absolute quantity, which does not depend on the chosen basis. In fact

$(u', v') = g'_{\mu\nu}\, {u'}^\mu {v'}^\nu = \Lambda^\rho{}_\mu \Lambda^\sigma{}_\nu\, g_{\rho\sigma}\, {u'}^\mu {v'}^\nu = \Lambda^\rho{}_\mu \Lambda^\sigma{}_\nu\, g_{\rho\sigma}\, (\Lambda^{-1})^\mu{}_\gamma (\Lambda^{-1})^\nu{}_\delta\, u^\gamma v^\delta$ (2.101)
$= (\Lambda\Lambda^{-1})^\rho{}_\gamma (\Lambda\Lambda^{-1})^\sigma{}_\delta\, g_{\rho\sigma}\, u^\gamma v^\delta = \delta^\rho_\gamma \delta^\sigma_\delta\, g_{\rho\sigma}\, u^\gamma v^\delta = g_{\gamma\delta}\, u^\gamma v^\delta$ (2.102)
$= (u, v)$ . (2.103)
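The invariance stated in Eqs. (2.101)-(2.103) can be checked numerically. The sketch below is only an illustration under stated assumptions: a two-dimensional space, an indefinite metric `g`, and an arbitrary invertible change of basis `lam`, with `lam[i][rho]` standing for Λ^ρ_i. None of these numbers come from the notes.

```python
from fractions import Fraction as F

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def inverse2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def scalar_product(u, g, v):
    """(u, v) = g_ij u^i v^j"""
    return sum(u[i] * g[i][j] * v[j]
               for i in range(len(u)) for j in range(len(v)))

g = [[F(1), F(0)], [F(0), F(-1)]]    # assumed metric in the old basis
lam = [[F(2), F(1)], [F(1), F(3)]]   # assumed change of basis, det = 5

# Eq. (2.100): g'_ij = Lambda^rho_i Lambda^sigma_j g_rho,sigma
g_new = matmul(matmul(lam, g), transpose(lam))

# Eq. (2.76): contravariant components use the inverse transposed matrix
to_new = inverse2(transpose(lam))
u, v = [F(1), F(2)], [F(3), F(-1)]
u_new, v_new = matvec(to_new, u), matvec(to_new, v)

old_value = scalar_product(u, g, v)
new_value = scalar_product(u_new, g_new, v_new)
print(old_value, new_value)
```

Exact fractions are used so that the two values can be compared with `==` rather than with a floating-point tolerance.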

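We can also define the inverse of the metric tensor (the contravariant version of the metric tensor) $g^{\mu\nu}$ such that
$g^{\mu\nu} g_{\nu\rho} = \delta^\mu_\rho = g_{\rho\nu}\, g^{\nu\mu}$ (2.104)
and
$(u, v) = g_{\mu\nu} u^\mu v^\nu = u_\nu v^\nu = u_\nu v_\mu\, g^{\mu\nu}$ . (2.105)
Under basis transformation, $g^{\mu\nu}$ behaves as follows:

$(u, v) = g^{\gamma\delta} u_\gamma v_\delta = (u', v') = {g'}^{\mu\nu} u'_\mu v'_\nu$ (2.106)
$= {g'}^{\mu\nu} \Lambda^\gamma{}_\mu \Lambda^\delta{}_\nu\, u_\gamma v_\delta$ (2.107)

and therefore
$g^{\gamma\delta} = {g'}^{\mu\nu} \Lambda^\gamma{}_\mu \Lambda^\delta{}_\nu$ . (2.108)
Multiplying both sides by $(\Lambda^{-1})^l{}_\gamma (\Lambda^{-1})^m{}_\delta$, we have

$(\Lambda^{-1})^l{}_\gamma (\Lambda^{-1})^m{}_\delta\, g^{\gamma\delta} = (\Lambda^{-1})^l{}_\gamma (\Lambda^{-1})^m{}_\delta\, \Lambda^\gamma{}_\mu \Lambda^\delta{}_\nu\, {g'}^{\mu\nu} = (\Lambda^{-1}\Lambda)^l{}_\mu (\Lambda^{-1}\Lambda)^m{}_\nu\, {g'}^{\mu\nu}$ (2.109)

and finally

${g'}^{lm} = (\Lambda^{-1})^l{}_\gamma (\Lambda^{-1})^m{}_\delta\, g^{\gamma\delta}$ . (2.110)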

2.3.3 Vectors and Tensors in Differential Form


A more convenient (and general) way to define vectors and tensors is to use the apparatus of differential
geometry. In this way, we use local definitions that are valid also for non linear spaces, like in General
Relativity.

Contravariant Vectors
Let us suppose we work in a Euclidean space and let $(x^1, \cdots, x^n)$ be a system of Euclidean coordinates.
A curve in this space is given in parametric form as

$\begin{cases} x^1 = x^1(t) \\ \quad \vdots \\ x^n = x^n(t) \end{cases}$ (2.111)

with $t \in [a, b] \subset \mathbb{R}$. The velocity vector of the curve at the point $x_0 = x(t_0)$ is

$v_x = \left( \frac{dx^1}{dt}, \cdots, \frac{dx^n}{dt} \right)_{t=t_0} , \quad v_x^i = \frac{dx^i}{dt}$ . (2.112)

Let us suppose that in a neighbourhood of $x_0$ the new coordinates $(z^1, \cdots, z^n)$ are introduced, in such a way that
$x^i = x^i(z^1, \cdots, z^n)$ , $i = 1, \cdots, n$ (2.113)
and such that in this neighbourhood we have

$\det J = \det \left\{ \frac{\partial x^i}{\partial z^j} \right\} \neq 0$ . (2.114)

In the new coordinates, the parametric equations of the curve are

$\begin{cases} z^1 = z^1(t) \\ \quad \vdots \\ z^n = z^n(t) \end{cases}$ (2.115)

and we can write

$x^i(t) = x^i(z(t))$ . (2.116)
The velocity vector in the new coordinates is

$v_z = \left( \frac{dz^1}{dt}, \cdots, \frac{dz^n}{dt} \right)_{t=t_0} , \quad v_z^i = \frac{dz^i}{dt}$ . (2.117)

In the transformation from x to z coordinates, the velocity vector transforms as follows

$v_x^i = \frac{dx^i}{dt} = \frac{\partial x^i}{\partial z^j} \frac{dz^j}{dt} = \frac{\partial x^i}{\partial z^j}\, v_z^j$ . (2.118)

Therefore
$v_x^i = \frac{\partial x^i}{\partial z^j}\, v_z^j$ (2.119)
or, in matrix form
$v_x = J\, v_z$ . (2.120)
A vector whose components transform as in Eq. (2.119) is called a contravariant vector (or contravariant tensor of rank 1).

Covariant Vectors
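Let us consider the gradient of a scalar function $f(x^1, \cdots, x^n)$:

$\xi = \nabla f = \left( \frac{\partial f}{\partial x^1}, \cdots, \frac{\partial f}{\partial x^n} \right) , \quad \xi_i = \frac{\partial f}{\partial x^i}$ . (2.121)

If we introduce a new system of coordinates $(z^1, \cdots, z^n)$ such that $x^i = x^i(z^1, \cdots, z^n)$ and $\det J = \det \left\{ \frac{\partial x^i}{\partial z^j} \right\} \neq 0$, we define

$\eta = \left( \frac{\partial f}{\partial z^1}, \cdots, \frac{\partial f}{\partial z^n} \right) , \quad \eta_i = \frac{\partial f}{\partial z^i}$ . (2.122)

Changing system of coordinates, the gradient transforms in the following way

$\eta_i = \frac{\partial f}{\partial z^i} = \frac{\partial x^j}{\partial z^i} \frac{\partial f}{\partial x^j} = \frac{\partial x^j}{\partial z^i}\, \xi_j$ . (2.123)

Therefore
$\eta_i = \frac{\partial x^j}{\partial z^i}\, \xi_j$ (2.124)
A vector whose components transform as in Eq. (2.124) is called a covariant vector (or covariant tensor of rank 1).
Summarizing, if the Jacobian of the transformation is

$J = \begin{pmatrix} \frac{\partial x^1}{\partial z^1} & \cdots & \frac{\partial x^1}{\partial z^n} \\ \vdots & & \vdots \\ \frac{\partial x^n}{\partial z^1} & \cdots & \frac{\partial x^n}{\partial z^n} \end{pmatrix}$ , (2.125)

we have in matrix form:

contravariant: $\xi = J\, \eta$ , (2.126)
covariant: $\eta = J^t\, \xi \implies \xi = (J^t)^{-1}\, \eta$ . (2.127)

Note: The transformations of the contravariant and covariant vectors coincide in the case in which

$J = (J^t)^{-1} \implies J J^t = 1$ , (2.128)

that is, if at every point the transformation is linear (J = const) and orthogonal.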
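To make the two transformation rules concrete, here is a small sketch for the change from Cartesian to polar coordinates in the plane. The choice of coordinates, the test function f(x, y) = x² + 3y and all numerical values are assumptions made only for illustration. It checks that a velocity transforms with J, a gradient with J^t, and that their contraction is the same in both systems.

```python
import math

def jacobian(r, theta):
    """J[i][j] = d x^i / d z^j for x = (r cos(theta), r sin(theta)), z = (r, theta)."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta),  r * math.cos(theta)]]

def transpose(m):
    return [list(row) for row in zip(*m)]

def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

r, theta = 2.0, 0.7
J = jacobian(r, theta)

# Contravariant vector: velocity components (dr/dt, dtheta/dt) in polar coordinates.
v_z = [0.5, -1.2]
v_x = matvec(J, v_z)                 # Eq. (2.120): v_x = J v_z

# Covariant vector: gradient of the assumed function f(x, y) = x**2 + 3*y.
x = r * math.cos(theta)
xi = [2.0 * x, 3.0]                  # (df/dx, df/dy)
eta = matvec(transpose(J), xi)       # Eq. (2.127): eta = J^t xi

# df/dt computed in both coordinate systems.
rate_cartesian = sum(a * b for a, b in zip(xi, v_x))
rate_polar = sum(a * b for a, b in zip(eta, v_z))
print(rate_cartesian, rate_polar)
```

The first component of `eta` can be compared with the derivative of f = r² cos²θ + 3 r sin θ with respect to r, computed by hand.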

Metric Tensor
Let us now introduce the scalar product of two vectors.
Let us suppose that the coordinate system $(x^1, \cdots, x^n)$ is Euclidean, that $\xi_1$ and $\xi_2$ are two vectors with origin in $P_0 = (x_0^1, \cdots, x_0^n)$ and let us introduce in a neighbourhood of $(x_0^1, \cdots, x_0^n)$ another system of coordinates $(z^1, \cdots, z^n)$ such that $x^i = x^i(z^1, \cdots, z^n)$ and $\det J = \det \left\{ \frac{\partial x^i}{\partial z^j} \right\} \neq 0$, with $x_0^i = x^i(z_0^1, \cdots, z_0^n)$.
Knowing that

$\xi_1^i = \left. \frac{\partial x^i}{\partial z^j} \right|_{P_0} \eta_1^j , \quad \xi_2^i = \left. \frac{\partial x^i}{\partial z^j} \right|_{P_0} \eta_2^j$ , (2.129)

we define the scalar product as

$(\xi_1, \xi_2) = \xi_1^i \xi_2^i = \left. \frac{\partial x^i}{\partial z^j} \frac{\partial x^i}{\partial z^k} \right|_{P_0} \eta_1^j \eta_2^k = g_{jk}\, \eta_1^j \eta_2^k$ , (2.130)

where we introduced the metric tensor

$g_{jk} = \left. \frac{\partial x^i}{\partial z^j} \frac{\partial x^i}{\partial z^k} \right|_{P_0} = J^i{}_j J^i{}_k = \delta_{rs}\, J^r{}_j J^s{}_k$ . (2.131)

Let us see how the metric tensor transforms under a change of coordinates. If we introduce in a neighbourhood of $P_0$ a new system of coordinates $(y^1, \cdots, y^n)$ such that $z^i = z^i(y^1, \cdots, y^n)$ and $\det J \neq 0$, we will have

$\eta_1^i = \left. \frac{\partial z^i}{\partial y^j} \right|_{P_0} \zeta_1^j , \quad \eta_2^i = \left. \frac{\partial z^i}{\partial y^j} \right|_{P_0} \zeta_2^j$ . (2.132)

Therefore

$(\xi_1, \xi_2) = g_{ij}\, \eta_1^i \eta_2^j = g_{ij} \left. \frac{\partial z^i}{\partial y^k} \frac{\partial z^j}{\partial y^l} \right|_{P_0} \zeta_1^k \zeta_2^l = g'_{kl}\, \zeta_1^k \zeta_2^l$ . (2.133)

The metric tensor transforms according to the following rule:

$g'_{kl} = g_{ij} \left. \frac{\partial z^i}{\partial y^k} \frac{\partial z^j}{\partial y^l} \right|_{P_0} = J^i{}_k J^j{}_l\, g_{ij}$ . (2.134)

A tensor that transforms as in Eq. (2.134) is called a covariant tensor of rank 2.
Note that the metric tensor is a symmetric tensor

$g_{ij} = g_{ji}$ , (2.135)

because the scalar product is symmetric. Moreover, in general

$g_{ij} = g_{ij}(P_0) = g_{ij}(z_0^1, \cdots, z_0^n)$ , (2.136)

i.e. it is a function of the point in which $\xi_1$ and $\xi_2$ are defined.
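As an illustration of Eq. (2.131), the sketch below builds the metric tensor induced on polar coordinates by the Euclidean metric of the plane. The coordinate choice is an assumption made for this example; the well-known result is g = diag(1, r²), which depends on the point, as Eq. (2.136) anticipates.

```python
import math

def polar_metric(r, theta):
    """g_jk = sum_i (dx^i/dz^j)(dx^i/dz^k) with z = (r, theta), Eq. (2.131)."""
    J = [[math.cos(theta), -r * math.sin(theta)],
         [math.sin(theta),  r * math.cos(theta)]]
    return [[sum(J[i][j] * J[i][k] for i in range(2)) for k in range(2)]
            for j in range(2)]

def squared_length(g, eta):
    """g_jk eta^j eta^k"""
    return sum(g[j][k] * eta[j] * eta[k] for j in range(2) for k in range(2))

r, theta = 2.0, 0.7
g = polar_metric(r, theta)
print(g)  # close to [[1, 0], [0, r**2]]

# Squared speed of a curve with (dr/dt, dtheta/dt) = (0.5, -1.2) at this point.
eta = [0.5, -1.2]
speed_squared = squared_length(g, eta)
print(speed_squared)
```

The squared speed reproduces the familiar expression ṙ² + r² θ̇² for motion in polar coordinates.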


Definition The metric $g_{ij}(z)$ is called Euclidean if there exists a system of coordinates $(x^1, \cdots, x^n)$, with $x^i = x^i(z^1, \cdots, z^n)$ and $\det(J) \neq 0$, $g_{ij} = \frac{\partial x^k}{\partial z^i} \frac{\partial x^k}{\partial z^j}$, such that in these coordinates we have

$g'_{ij} = \delta_{ij} = \begin{cases} 1 & i = j \\ 0 & i \neq j \end{cases}$ (2.137)

Definition The metric $g_{ij}(z)$ is called pseudo-Euclidean if there exists a system of coordinates $(x^1, \cdots, x^n)$, with $x^i = x^i(z^1, \cdots, z^n)$ and $\det(J) \neq 0$, $g_{ij} = g'_{kl}\, \frac{\partial x^k}{\partial z^i} \frac{\partial x^l}{\partial z^j}$, such that in these coordinates we have

$g'_{ij} = \begin{cases} 1 & i = j ,\ i \leq p \\ -1 & i = j ,\ p + 1 \leq i \leq p + q = n \\ 0 & i \neq j \end{cases}$ (2.138)

The space where such a metric is defined is called pseudo-Euclidean and it is labeled with $\mathbb{R}^n_{p,q}$. We will call Minkowski space a pseudo-Euclidean space $\mathbb{R}^4_{1,3}$, with metric

$g_{ij} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}$ . (2.139)

In pseudo-Euclidean coordinates we have

$|\xi|^2 = g_{ij}\, \xi^i \xi^j = (\xi^1)^2 + \cdots + (\xi^p)^2 - (\xi^{p+1})^2 - \cdots - (\xi^n)^2$ . (2.140)

We can extend the notion of metric tensor also to covariant vectors. We have

$\xi_1 = (J^t)^{-1}\, \eta_1$ , (2.141)
$\xi_2 = (J^t)^{-1}\, \eta_2$ , (2.142)

or, in components

$\xi_{1,i} = \frac{\partial z^j}{\partial x^i}\, \eta_{1,j}$ , (2.143)
$\xi_{2,i} = \frac{\partial z^j}{\partial x^i}\, \eta_{2,j}$ , (2.144)

where $\xi_1$ and $\xi_2$ are two vectors in the Euclidean coordinate system $(x^1, \cdots, x^n)$, while $\eta_1$ and $\eta_2$ are the same vectors in the system $(z^1, \cdots, z^n)$, such that $x^i = x^i(z^1, \cdots, z^n)$ and $\det(J) \neq 0$. Therefore,

$(\xi_1, \xi_2) = \xi_{1,i}\, \xi_{2,i} = \frac{\partial z^j}{\partial x^i} \frac{\partial z^k}{\partial x^i}\, \eta_{1,j}\, \eta_{2,k} = g^{jk}\, \eta_{1,j}\, \eta_{2,k}$ . (2.145)

In order to understand how $g^{ij}$ transforms under a change of coordinate system, let us introduce another coordinate system $(y^1, \cdots, y^n)$ such that $z^i = z^i(y^1, \cdots, y^n)$ and $\det(J) \neq 0$. We have

$\eta_{1,i} = \frac{\partial y^j}{\partial z^i}\, \zeta_{1,j}$ , (2.146)
$\eta_{2,i} = \frac{\partial y^j}{\partial z^i}\, \zeta_{2,j}$ , (2.147)

and therefore

$g^{jk}\, \eta_{1,j}\, \eta_{2,k} = \frac{\partial y^l}{\partial z^j}\, g^{jk}\, \frac{\partial y^r}{\partial z^k}\, \zeta_{1,l}\, \zeta_{2,r}$ , (2.148)

and, finally

${g'}^{lr} = \frac{\partial y^l}{\partial z^j}\, g^{jk}\, \frac{\partial y^r}{\partial z^k}$ . (2.149)

A quantity that transforms as in Eq. (2.149) is called a contravariant tensor of rank 2.
Theorem We have $g^{ij} = \{g_{ij}\}^{-1}$.

In fact, let us look at the transformation rules in matrix form. If we denote the covariant metric tensor by $g_c$ and the contravariant metric tensor by $g^c$, we have

$g'_c = J^t\, g_c\, J$ , (2.150)
${g'}^c = \left[ (J^t)^{-1} \right]^t g^c\, (J^t)^{-1}$ . (2.151)

However $\left[ (J^t)^{-1} \right]^t = J^{-1}$ and therefore

${g'}^c = J^{-1}\, g^c\, (J^t)^{-1}$ . (2.152)

From Eq. (2.150) we have

$(g'_c)^{-1} = J^{-1}\, (g_c)^{-1}\, (J^t)^{-1}$ (2.153)

so that $g^c$ and $(g_c)^{-1}$ transform in the same way. Since they coincide in Euclidean coordinates, where both are the identity, we find
$g^c = (g_c)^{-1}$ . (2.154)
In components we have
$g^{ij} g_{jk} = g^i{}_k = \delta^i_k = g_{kj}\, g^{ji} = g_k{}^i$ , (2.155)
where $\delta^i_k$ is the Kronecker delta (and therefore we do not have to distinguish between upper or lower indices).

Mixed Tensors
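Let us suppose now that at every point of our space, with coordinates $(x^1, \cdots, x^n)$, a linear operator A(x) is defined. If ξ is a vector in x, we have

$\eta^i = a^i{}_j(x)\, \xi^j$ (2.156)

and for the covariant vectors

$\eta_j = a^i{}_j(x)\, \xi_i$ . (2.157)
If we now introduce, in the neighbourhood of x, a new system of coordinates $(z^1, \cdots, z^n)$ such that $x^i = x^i(z)$, we will have

$\eta^i = \frac{\partial x^i}{\partial z^j}\, {\eta'}^j , \quad \xi^i = \frac{\partial x^i}{\partial z^j}\, {\xi'}^j$ (2.158)

and

$\eta_j = \frac{\partial z^i}{\partial x^j}\, \eta'_i , \quad \xi_j = \frac{\partial z^i}{\partial x^j}\, \xi'_i$ . (2.159)

Because of Eq. (2.156) we have

$\frac{\partial x^i}{\partial z^k}\, {\eta'}^k = a^i{}_j\, \frac{\partial x^j}{\partial z^l}\, {\xi'}^l$ , (2.160)

from which

${\eta'}^k = \frac{\partial z^k}{\partial x^i} \frac{\partial x^j}{\partial z^l}\, a^i{}_j\, {\xi'}^l = {a'}^k{}_l\, {\xi'}^l$ . (2.161)

Therefore

${a'}^k{}_l = \frac{\partial z^k}{\partial x^i} \frac{\partial x^j}{\partial z^l}\, a^i{}_j$ . (2.162)

A quantity that transforms according to Eq. (2.162) is called a mixed tensor of rank 2.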

General definition
In general, we define a tensor of type (p, q) and rank p + q on an n-dimensional vector space as a collection of $n^{p+q}$ numbers, given in a certain system of coordinates $(x^1, \cdots, x^n)$, whose numerical expression depends on the system of coordinates as follows: if $(z^1, \cdots, z^n)$ is another system of coordinates and $x^i = x^i(z^1, \cdots, z^n)$, we have

$T^{i_1 \cdots i_p}_{j_1 \cdots j_q} = \frac{\partial x^{i_1}}{\partial z^{k_1}} \cdots \frac{\partial x^{i_p}}{\partial z^{k_p}}\, \frac{\partial z^{l_1}}{\partial x^{j_1}} \cdots \frac{\partial z^{l_q}}{\partial x^{j_q}}\, {T'}^{k_1 \cdots k_p}_{l_1 \cdots l_q}$ . (2.163)

Since $\det(J) \neq 0$, the relation (2.163) can be inverted:

${T'}^{k_1 \cdots k_p}_{l_1 \cdots l_q} = \frac{\partial z^{k_1}}{\partial x^{i_1}} \cdots \frac{\partial z^{k_p}}{\partial x^{i_p}}\, \frac{\partial x^{j_1}}{\partial z^{l_1}} \cdots \frac{\partial x^{j_q}}{\partial z^{l_q}}\, T^{i_1 \cdots i_p}_{j_1 \cdots j_q}$ . (2.164)

At every point of the space, (p, q) tensors form a linear space.

2.4 Minkowski Space


The ideal space for the study of Special Relativity is a 4-dimensional vector space ($X^0 = ct$, $X^1 = x$, $X^2 = y$, $X^3 = z$), called Minkowski space, $M^4$, with pseudo-Euclidean metric $\eta_{\mu\nu}$, that in matrix form is given by the following expression:

$\eta = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}$ . (2.165)

We have
$\eta^{\mu\nu} = \eta_{\mu\nu}$ , (2.166)
such that
$\eta^{\mu\nu} \eta_{\nu\rho} = \eta^\mu{}_\rho = \delta^\mu_\rho$ . (2.167)
The scalar product in this space is defined as follows:

$X \cdot X = X_\mu X^\mu = \eta_{\mu\nu} X^\mu X^\nu = \eta^{\mu\nu} X_\mu X_\nu = (X^0)^2 - (X^1)^2 - (X^2)^2 - (X^3)^2$ (2.168)

and it is not positive definite.

The vectors in $M^4$ are called four-vectors. We have contravariant vectors (with contravariant indices)
$V^\mu = (V^0, \mathbf{V})$ . (2.169)
The covariant vector $V_\mu$ can be obtained from $V^\mu$ using the metric

$V_\mu = \eta_{\mu\nu} V^\nu = (V^0, -\mathbf{V})$ . (2.170)

2.5 Lorentz group


So far we have considered boosts. However, the Lorentz transformations do not include only boosts.
The specific requirement is that a Lorentz transformation leaves unchanged the quadratic form

$(X^0)^2 - (X^1)^2 - (X^2)^2 - (X^3)^2$ . (2.171)

A boost does exactly this. However, there is another transformation that can leave (2.171) unchanged.
In fact, if we consider a rigid rotation in the Euclidean 3-dim space, this will leave unchanged the quadratic form $(X^1)^2 + (X^2)^2 + (X^3)^2$, and therefore also (2.171), since the time is not included in the transformation. Therefore, also transformations of the following matrix form

$\Lambda = \begin{pmatrix} 1 & 0 \\ 0 & R \end{pmatrix}$ , (2.172)

where R is orthogonal ($R R^t = 1$), are Lorentz transformations.

Let us consider a generic Lorentz transformation, $\Lambda^\mu{}_\nu$, that can be a composition of a boost and a rigid rotation. In order for $\Lambda^\mu{}_\nu$ to be a Lorentz transformation, it must fulfill the following relation:

$\eta_{\mu\nu}\, \Lambda^\mu{}_\sigma \Lambda^\nu{}_\rho = \eta_{\sigma\rho}$ . (2.173)

The relation in Eq. (2.173) comes from the fact that Lorentz transformations preserve the metric and leave unchanged the length of the four-vector $X^\mu$:

${X'}^2 = \eta_{\mu\nu}\, {X'}^\mu {X'}^\nu = \eta_{\mu\nu}\, \Lambda^\mu{}_\sigma X^\sigma \Lambda^\nu{}_\rho X^\rho = \eta_{\mu\nu}\, \Lambda^\mu{}_\sigma \Lambda^\nu{}_\rho\, X^\sigma X^\rho$ , (2.174)
$X^2 = \eta_{\sigma\rho}\, X^\sigma X^\rho$ (2.175)

and from ${X'}^2 = X^2$ we find (2.173). Using

${X'}^2 = \eta^{\mu\nu} X'_\mu X'_\nu = \eta^{\mu\nu}\, \Lambda_\mu{}^\sigma X_\sigma \Lambda_\nu{}^\rho X_\rho = \eta^{\mu\nu}\, \Lambda_\mu{}^\sigma \Lambda_\nu{}^\rho\, X_\sigma X_\rho$ , (2.176)
$X^2 = \eta^{\sigma\rho} X_\sigma X_\rho$ , (2.177)

we find also the following form

$\eta^{\mu\nu}\, \Lambda_\mu{}^\sigma \Lambda_\nu{}^\rho = \eta^{\sigma\rho}$ . (2.178)
Eq. (2.173) can be written in matrix form as $\Lambda^t \eta \Lambda = \eta$.
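The matrix condition Λ^t η Λ = η is easy to test numerically. The sketch below assumes the standard textbook form of a boost along the x axis with velocity β = v/c and of a rotation around the z axis; the function names and the sample four-vector are illustrative choices, not part of the notes.

```python
import math

ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def boost_x(beta):
    """Boost along x (assumed standard form), beta = v/c with |beta| < 1."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, -gamma * beta, 0.0, 0.0],
            [-gamma * beta, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def rotation_z(angle):
    """Rigid rotation around z, of the block form of Eq. (2.172)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, c, -s, 0.0],
            [0.0, s, c, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def is_lorentz(lam, tol=1e-12):
    """Check Lambda^t eta Lambda = eta entry by entry."""
    m = matmul(matmul(transpose(lam), ETA), lam)
    return all(abs(m[i][j] - ETA[i][j]) < tol
               for i in range(4) for j in range(4))

def interval(x):
    """Quadratic form of Eq. (2.171)."""
    return x[0] ** 2 - x[1] ** 2 - x[2] ** 2 - x[3] ** 2

lam = matmul(boost_x(0.6), rotation_z(0.3))
x = [2.0, 1.0, -0.5, 0.25]
print(is_lorentz(lam))
print(interval(x), interval(matvec(lam, x)))
```

A matrix that merely rescales time, such as diag(2, 1, 1, 1), fails the same check.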

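Property 2.5.1 The Lorentz transformations, $\Lambda^\mu{}_\nu$, form a group, the Lorentz Group (LG).

In order to see that, we have to prove that: i) if $\Lambda_1 \in LG$ and $\Lambda_2 \in LG$, then $\Lambda_1 \Lambda_2 \in LG$; ii) the identity is a Lorentz transformation; iii) $\exists!\, \Lambda^{-1} \in LG$ such that $\Lambda \Lambda^{-1} = \Lambda^{-1} \Lambda = 1$.

• Let us consider
$X^\mu \xrightarrow{\Lambda_1} {X'}^\mu \xrightarrow{\Lambda_2} {X''}^\mu$ , (2.179)
then we have
${X''}^\mu = (\Lambda_2)^\mu{}_\nu\, {X'}^\nu = (\Lambda_2)^\mu{}_\nu (\Lambda_1)^\nu{}_\rho\, X^\rho$ . (2.180)
We have to prove that
$\Lambda^\mu{}_\rho = (\Lambda_2)^\mu{}_\nu (\Lambda_1)^\nu{}_\rho$ (2.181)
is indeed a Lorentz transformation (it satisfies Eq. (2.173)). In fact, we have

$\eta_{\mu\nu} \left[ (\Lambda_2)^\mu{}_\gamma (\Lambda_1)^\gamma{}_\sigma \right] \left[ (\Lambda_2)^\nu{}_\delta (\Lambda_1)^\delta{}_\rho \right] = \eta_{\mu\nu} (\Lambda_2)^\mu{}_\gamma (\Lambda_2)^\nu{}_\delta\, (\Lambda_1)^\gamma{}_\sigma (\Lambda_1)^\delta{}_\rho = \eta_{\gamma\delta}\, (\Lambda_1)^\gamma{}_\sigma (\Lambda_1)^\delta{}_\rho = \eta_{\sigma\rho}$ , (2.182)

where we used the fact that $\Lambda_2$ and $\Lambda_1$ are indeed Lorentz transformations.

• The identity transformation

$\Lambda^\mu{}_\nu = \delta^\mu_\nu$ (2.183)
trivially satisfies relation (2.173):
$\eta_{\mu\nu}\, \delta^\mu_\sigma \delta^\nu_\rho = \eta_{\sigma\rho}$ . (2.184)

• The inverse exists. In fact, since $\Lambda^t \eta \Lambda = \eta$, we have $(\det(\Lambda))^2 = 1$ or $\det(\Lambda) = \pm 1$ (and in particular $\det(\Lambda) \neq 0$). Let us see how the inverse can be defined. Multiplying both sides of Eq. (2.173) by $\eta^{\rho\sigma'}$ we have

$\eta^{\rho\sigma'} \eta_{\mu\nu}\, \Lambda^\mu{}_\sigma \Lambda^\nu{}_\rho = \eta^{\rho\sigma'} \eta_{\sigma\rho} = \delta^{\sigma'}_\sigma$ . (2.185)

This means that

$\eta^{\rho\sigma'} \eta_{\mu\nu}\, \Lambda^\nu{}_\rho = \Lambda_\mu{}^{\sigma'} = (\Lambda^{-1})^{\sigma'}{}_\mu$ . (2.186)

Let us check that, indeed, $(\Lambda^{-1})^\sigma{}_\mu$ is a Lorentz transformation, i.e. that

$\eta_{\mu\nu}\, (\Lambda^{-1})^\mu{}_\sigma (\Lambda^{-1})^\nu{}_\rho = \eta_{\rho\sigma}$ . (2.187)

Using relation (2.186), the l.h.s. of (2.187) becomes

$\eta_{\mu\nu} (\eta^{\mu\xi} \eta_{\sigma\omega} \Lambda^\omega{}_\xi)(\eta^{\nu\xi'} \eta_{\rho\omega'} \Lambda^{\omega'}{}_{\xi'}) = \eta_{\mu\nu}\, \eta^{\mu\xi} \eta^{\nu\xi'}\, \eta_{\sigma\omega} \eta_{\rho\omega'}\, \Lambda^\omega{}_\xi \Lambda^{\omega'}{}_{\xi'} = \eta^{\xi\xi'} (\eta_{\sigma\omega} \Lambda^\omega{}_\xi)(\eta_{\rho\omega'} \Lambda^{\omega'}{}_{\xi'})$ . (2.188)

If we multiply the r.h.s. and l.h.s. of Eq. (2.173) by $\Lambda_{\nu'}{}^\rho$ we find

$\eta_{\mu\nu}\, \Lambda^\mu{}_\sigma \Lambda^\nu{}_\rho \Lambda_{\nu'}{}^\rho = \eta_{\mu\nu'}\, \Lambda^\mu{}_\sigma = \eta_{\sigma\rho}\, \Lambda_{\nu'}{}^\rho$ . (2.189)

Using Eq. (2.189) in Eq. (2.188), we find

$\eta_{\mu\nu} (\eta^{\mu\xi} \eta_{\sigma\omega} \Lambda^\omega{}_\xi)(\eta^{\nu\xi'} \eta_{\rho\omega'} \Lambda^{\omega'}{}_{\xi'}) = (\eta^{\xi\xi'} \eta_{\xi\delta})\, \Lambda_\sigma{}^\delta\, (\eta_{\rho\omega'} \Lambda^{\omega'}{}_{\xi'}) = \delta^{\xi'}_\delta\, \Lambda_\sigma{}^\delta\, \eta_{\rho\omega'} \Lambda^{\omega'}{}_{\xi'}$ (2.190)
$= \Lambda_\sigma{}^{\xi'}\, \eta_{\rho\omega'}\, \Lambda^{\omega'}{}_{\xi'} = \eta_{\rho\omega'}\, \delta^{\omega'}_\sigma = \eta_{\rho\sigma}$ . (2.191)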

In summary, Lorentz transformations form a group that is called the Lorentz Group. Since, as we noticed, Λ ∈ LG is such that $(\det(\Lambda))^2 = 1$, we have elements of the group with det(Λ) = 1 and elements with det(Λ) = −1. The identity has det(Λ) = 1. This means that only the subset with det(Λ) = 1 can form a subgroup of the LG.
Moreover, from (2.173) we have

$1 = \eta_{00} = \eta_{\mu\nu}\, \Lambda^\mu{}_0 \Lambda^\nu{}_0 = (\Lambda^0{}_0)^2 - \sum_i (\Lambda^i{}_0)^2$ , (2.192)

from which we obtain

$(\Lambda^0{}_0)^2 \geq 1 \implies \Lambda^0{}_0 \geq 1$ or $\Lambda^0{}_0 \leq -1$ . (2.193)
In total the LG has four different subsets, listed in the following table:

Symbol              $\Lambda^0{}_0$   det Λ   Name
$L_+^\uparrow$      ≥ 1               +1      Proper orthochronous
$L_+^\downarrow$    ≤ −1              +1      Proper antichronous
$L_-^\uparrow$      ≥ 1               −1      Improper orthochronous
$L_-^\downarrow$    ≤ −1              −1      Improper antichronous

Only $L_+^\uparrow$ is a subgroup of the LG (the identity is such that $\delta^0_0 = 1$) and its elements can be obtained from the identity with a continuous change of the parameters of the group (for instance the velocity of the boosts and the angles of the rigid rotation). A group that depends in a continuous and differentiable way on a set of parameters is called a Lie group.
The four subsets $L_+^\uparrow$, $L_+^\downarrow$, $L_-^\uparrow$ and $L_-^\downarrow$ can be connected only via the discontinuous transformations called Parity and Time Reversal.
• A Parity transformation acts only on the spatial part of the four-vector, inverting it:
$(X^0, \mathbf{X}) \xrightarrow{\Lambda_P} (X^0, -\mathbf{X})$ . (2.194)
In matrix form we have
$\Lambda_P = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \ (= \eta)$ . (2.195)
Parity belongs to the set $L_-^\uparrow$.
• A Time Reversal transformation acts only on the temporal part of the four-vector, inverting it:
$(X^0, \mathbf{X}) \xrightarrow{\Lambda_T} (-X^0, \mathbf{X})$ . (2.196)
In matrix form we have
$\Lambda_T = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \ (= -\eta)$ . (2.197)
Time Reversal belongs to the set $L_-^\downarrow$.
We can connect the four subsets using Parity and Time Reversal, as in figure 2.2. For instance, take an element of $L_+^\uparrow$, say $\Lambda_+^\uparrow$. The element $\Lambda_1 = \Lambda_+^\uparrow \Lambda_T$ is such that $\det(\Lambda_1) = \det(\Lambda_+^\uparrow \Lambda_T) = \det(\Lambda_+^\uparrow) \det(\Lambda_T) = -1$ and $(\Lambda_1)^0{}_0 = -(\Lambda_+^\uparrow)^0{}_0 \leq -1$. Therefore, the action of $\Lambda_T$ was such that $\Lambda_1 \in L_-^\downarrow$.
If we consider, instead, $\Lambda_2 = \Lambda_+^\uparrow \Lambda_P$, we find again $\det(\Lambda_2) = -1$ but now the sign of $(\Lambda_2)^0{}_0$ is the same as the one of $(\Lambda_+^\uparrow)^0{}_0$. Therefore $\Lambda_2 \in L_-^\uparrow$.
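The classification into the four subsets only needs the sign of det Λ and the sign of Λ^0_0. A minimal sketch follows; the function names are illustrative, and `classify` assumes that its argument already is a Lorentz matrix, which it does not verify.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def det(m):
    """Determinant by Laplace expansion along the first row (fine for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def classify(lam):
    """Name the subset of the Lorentz group that contains lam."""
    proper = det(lam) > 0
    orthochronous = lam[0][0] > 0
    return ("proper" if proper else "improper") + " " + \
           ("orthochronous" if orthochronous else "antichronous")

IDENTITY = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
PARITY = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
TIME_REVERSAL = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
PT = matmul(PARITY, TIME_REVERSAL)

for name, lam in [("identity", IDENTITY), ("P", PARITY),
                  ("T", TIME_REVERSAL), ("PT", PT)]:
    print(name, "->", classify(lam))
```

Multiplying any proper orthochronous transformation by P, T or PT moves it to the corresponding subset, which is the content of the diagram in figure 2.2.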

Figure 2.2: Connection of the four subsets of the LG through P, T and PT.

2.6 Poincaré group


So far we have considered only homogeneous transformations. However, in addition to boosts and rigid rotations, we have the freedom to redefine the origin of our inertial frame, adding a constant vector (rigid translation), and Physics must not be affected by this operation. Such a transformation can be expressed as
$X^\mu \to {X'}^\mu = \Lambda^\mu{}_\nu X^\nu + a^\mu$ . (2.198)
Boosts and rotations combined with translations are called Poincaré transformations (or inhomogeneous Lorentz transformations). We indicate such a transformation as T(Λ, a).
Poincaré transformations form a group, the Poincaré group (PG). In fact:

• the identity T(1, 0) ∈ PG;

• the composition of two transformations is a Poincaré transformation, since

${X''}^\mu = (\Lambda_1)^\mu{}_\nu\, {X'}^\nu + {a'}^\mu = (\Lambda_1)^\mu{}_\nu \left[ (\Lambda_2)^\nu{}_\rho X^\rho + a^\nu \right] + {a'}^\mu = (\Lambda_1)^\mu{}_\nu (\Lambda_2)^\nu{}_\rho\, X^\rho + (\Lambda_1)^\mu{}_\nu\, a^\nu + {a'}^\mu$ , (2.199)

and $(\Lambda_1)^\mu{}_\nu (\Lambda_2)^\nu{}_\rho$ is a Lorentz transformation, while $(\Lambda_1)^\mu{}_\nu a^\nu + {a'}^\mu$ is a constant vector. Therefore, writing $\Lambda' = \Lambda_1$ and $\Lambda = \Lambda_2$,

$T(\Lambda', a')\, T(\Lambda, a) = T(\Lambda' \Lambda, \Lambda' a + a')$ . (2.200)

• The inverse exists, $T^{-1}(\Lambda, a) = T(\Lambda^{-1}, -\Lambda^{-1} a)$, such that

$T(\Lambda, a)\, T^{-1}(\Lambda, a) = T(\Lambda, a)\, T(\Lambda^{-1}, -\Lambda^{-1} a) = T(\Lambda \Lambda^{-1}, -\Lambda \Lambda^{-1} a + a) = T(1, 0)$ . (2.201)
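The composition law (2.200) and the inverse (2.201) can be sketched in 1+1 dimensions with exact arithmetic. The example assumes boosts with β = 3/5 and β = 4/5, for which γ is rational, and represents T(Λ, a) as a Python pair `(lam, a)`. These choices and the helper names are illustrative only.

```python
from fractions import Fraction as F

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def compose(t_second, t_first):
    """Eq. (2.200): T(L', a') T(L, a) = T(L' L, L' a + a')."""
    (lam2, a2), (lam1, a1) = t_second, t_first
    shift = [p + q for p, q in zip(matvec(lam2, a1), a2)]
    return (matmul(lam2, lam1), shift)

def inverse(t, lam_inverse):
    """Eq. (2.201): T^{-1}(L, a) = T(L^{-1}, -L^{-1} a); L^{-1} is supplied by the caller."""
    _, a = t
    return (lam_inverse, [-c for c in matvec(lam_inverse, a)])

def act(t, x):
    """Eq. (2.198): X -> L X + a."""
    lam, a = t
    return [p + q for p, q in zip(matvec(lam, x), a)]

# Boost with beta = 3/5 (gamma = 5/4) and its inverse (beta -> -beta).
boost = [[F(5, 4), F(-3, 4)], [F(-3, 4), F(5, 4)]]
boost_inv = [[F(5, 4), F(3, 4)], [F(3, 4), F(5, 4)]]
# Boost with beta = 4/5 (gamma = 5/3).
other = [[F(5, 3), F(-4, 3)], [F(-4, 3), F(5, 3)]]

t1 = (boost, [F(1), F(2)])
t2 = (other, [F(-1), F(0)])
x = [F(3), F(1)]

step_by_step = act(t2, act(t1, x))
composed = act(compose(t2, t1), x)
print(step_by_step == composed)
print(compose(t1, inverse(t1, boost_inv)))
```

Applying the two transformations one after the other, or applying their composition once, gives the same point, and composing with the inverse returns T(1, 0).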

2.7 Infinitesimal Transformations


Since $L_+^\uparrow$ is constituted by elements that can be connected smoothly to the identity, we can study the local properties of LG and PG using infinitesimal transformations. An infinitesimal transformation is a Lorentz (Poincaré) transformation in which the parameters go smoothly to zero. Therefore, for instance
$\Lambda^\mu{}_\nu \simeq \delta^\mu_\nu + \epsilon^\mu{}_\nu$ , (2.202)
at first order. Including the translations we will have

$T(\Lambda^\mu{}_\nu, a^\mu) \simeq T(\delta^\mu_\nu + \epsilon^\mu{}_\nu, \delta a^\mu)$ , (2.203)

where we considered
${X'}^\mu \simeq (\delta^\mu_\nu + \epsilon^\mu{}_\nu)\, X^\nu + \delta a^\mu = X^\mu + \epsilon^\mu{}_\nu X^\nu + \delta a^\mu$ . (2.204)
The infinitesimal Lorentz transformation $(\delta^\mu_\nu + \epsilon^\mu{}_\nu)$ has to satisfy the usual relation (2.173). Therefore:

$\eta_{\sigma\rho} = \eta_{\mu\nu} (\delta^\mu_\sigma + \epsilon^\mu{}_\sigma)(\delta^\nu_\rho + \epsilon^\nu{}_\rho)$ (2.205)
$= \eta_{\mu\nu}\, \delta^\mu_\sigma \delta^\nu_\rho + \eta_{\mu\nu}\, \delta^\mu_\sigma \epsilon^\nu{}_\rho + \eta_{\mu\nu}\, \epsilon^\mu{}_\sigma \delta^\nu_\rho + \dots$ (2.206)
$= \eta_{\sigma\rho} + \epsilon_{\sigma\rho} + \epsilon_{\rho\sigma}$ . (2.207)

This means that the ε tensor must be antisymmetric

$\epsilon_{\sigma\rho} = -\epsilon_{\rho\sigma}$ (2.208)

and therefore it has 6 independent elements. The LG then depends upon 6 independent parameters (three for the boosts and three for the rotations). Including the 4 parameters of the rigid translation, we have that the Poincaré group depends on 10 parameters.
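The statement that an antisymmetric ε satisfies (2.173) at first order can be probed numerically. The sketch below builds δ + ε from six arbitrary parameters (an assumed parametrization of the upper triangle of ε with lower indices) and measures how far Λ^t η Λ is from η. If the first-order terms cancel, the defect must scale like the square of the overall size of ε.

```python
from fractions import Fraction as F

ETA_DIAG = [1, -1, -1, -1]   # diagonal of the Minkowski metric

def infinitesimal_lorentz(params, scale):
    """Return delta^mu_nu + eps^mu_nu from 6 parameters filling eps_{sigma rho}."""
    pairs = [(s, r) for s in range(4) for r in range(s + 1, 4)]
    assert len(params) == len(pairs) == 6
    eps_low = [[F(0)] * 4 for _ in range(4)]
    for (s, r), p in zip(pairs, params):
        eps_low[s][r] = scale * p       # eps_{s r}
        eps_low[r][s] = -scale * p      # antisymmetry, Eq. (2.208)
    # Raise the first index with the diagonal metric: eps^m_n = eta^{mm} eps_{m n}
    return [[(1 if m == n else 0) + ETA_DIAG[m] * eps_low[m][n]
             for n in range(4)] for m in range(4)]

def lorentz_defect(lam):
    """Largest entry, in absolute value, of Lambda^t eta Lambda - eta."""
    worst = F(0)
    for s in range(4):
        for r in range(4):
            entry = sum(lam[m][s] * ETA_DIAG[m] * lam[m][r] for m in range(4))
            target = ETA_DIAG[s] if s == r else 0
            worst = max(worst, abs(entry - target))
    return worst

params = [1, 2, 3, 4, 5, 6]
d_coarse = lorentz_defect(infinitesimal_lorentz(params, F(1, 100)))
d_fine = lorentz_defect(infinitesimal_lorentz(params, F(1, 1000)))
print(d_coarse, d_fine, d_coarse / d_fine)
```

Reducing the parameters by a factor 10 reduces the defect by a factor 100, so only second-order terms survive, as in Eq. (2.207).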

2.8 Some notes on Group Theory


In this section we recall some concepts of group theory. This is important in order to fully characterize
the Poincaré group and to understand the transformation properties of the quantities we will study in
the rest of the course.

Definition 2.8.1 A Group (G) is a collection of elements that are combined through a closed operation
(product) such that
if a, b ∈ G =⇒ a · b ∈ G . (2.209)

The product must obey the following properties:

1. a · (b · c) = (a · b) · c (associative)

2. ∃e ∈ G such that a · e = e · a = a, ∀a ∈ G (identity: neutral element of the product)

3. ∀a ∈ G, ∃$a^{-1}$ ∈ G, such that $a \cdot a^{-1} = a^{-1} \cdot a = e$ (inverse)

It can be demonstrated that the identity element e and the inverse $a^{-1}$ are unique and that $(a^{-1})^{-1} = a$.

Definition 2.8.2 If a · b = b · a, ∀ a, b ∈ G, the group G is called Abelian.

Definition 2.8.3 We call subgroup of G a subset H which is itself a group under the operation defined on G (it is closed under the product and contains the identity and the inverse of each of its elements).

Definition 2.8.4 We call homomorphism between two groups a map

$\phi : G_1 \to G_2$ , (2.210)

such that $\forall g_1, g_2 \in G_1$ we have

$\phi(g_1 \cdot g_2) = \phi(g_1) \circ \phi(g_2)$ , (2.211)
where “◦” is the product defined in $G_2$. If φ is invertible, it is called an isomorphism.

2.8.1 Representations
The set of linear invertible transformations on a vector space V is a group which is called GL(V).

Definition 2.8.5 A Representation of a group G on a vector space V is a homomorphism

$D_R : G \to GL(V)$ . (2.212)

Therefore, if g ∈ G it follows that $D_R(g) \in GL(V)$ and

$D_R(g_1 \cdot g_2) = D_R(g_1)\, D_R(g_2)$ , (2.213)
$D_R(e) = 1$ . (2.214)

In other words, the group is the abstract entity, while the representation is the realization of the group structure via operators on a vector space. dim(V) is the dimension of the representation. If dim(V) = n is finite, we can immediately picture a representation as a set of n × n matrices acting on a finite-dimensional vector space. In this case the product is the usual row-by-column matrix product.

Definition 2.8.6 Two representations D1 and D2 of the same group G on two vector spaces V and
W are called “equivalent” if there exists an invertible application between the two vector spaces

T : V →W (2.215)

such that
T D1 (g) T −1 = D2 (g) , ∀g ∈ G . (2.216)

Irreducible representations
Let us now introduce the concept of reducible and irreducible representations.

Definition 2.8.7 A subspace S of V is called “invariant” with respect to the representation $D_R(g)$ if ∀v ∈ S and ∀g ∈ G, we have $D_R(g) v \in S$.

Therefore

Definition 2.8.8 A representation $D_R$ on a vector space V is “irreducible” if V does not contain non-trivial subspaces (i.e. other than {0} and V itself) invariant under $D_R$. On the contrary, it is “reducible” if it contains such invariant subspaces.

In this case the representation $D_R$ can be expressed as a direct sum of irreducible representations

$D_R(G) = \bigoplus_m D_R^{(m)}(G)$ (2.217)

and the operators that act on V (finite-dimensional, for instance) will appear, in a suitable basis, as block diagonal matrices.

2.8.2 Lie groups


A special role in Physics is played by the connected Lie groups.

Definition 2.8.9 A “Lie group” is a group whose elements g depend in a continuous and differentiable way on a set of real parameters $\theta^a$, a = 1, 2, ..., n, $\theta^a \in \mathbb{R}$:

$g = g(\theta^1, ..., \theta^n)$ . (2.218)

Without loss of generality, we can choose $\theta^a$ such that for $\theta^a = 0$ we have g(0) = e. In this way, every element of the group is connected to the identity by a continuous path in $\mathbb{R}^n$.

Lie Algebra
There is an important structure that is connected to the Lie group (and that, as the group itself, does not depend on the representation of the group) and is called the Lie Algebra. Although not necessary, in order to find out the algebra connected to the Lie group we consider a particular representation, $D_R(g(\theta))$. For θ → 0 we have to recover the identity (acting on the vector space) and since $D_R(g(\theta))$ is continuous and differentiable in θ, we can define the infinitesimal transformation (at first order in θ) as
$D_R(g(\theta)) \simeq 1 + i \theta_a T_R^a$ , (2.219)
where we considered the fact that our Lie group can depend upon a set of parameters, $\theta_a$, a = 1, ..., n, and where
$T_R^a = -i \left. \frac{\partial D_R}{\partial \theta_a} \right|_{\theta=0}$ . (2.220)
The operators $T_R^a$ are called the generators of the Lie group, in the representation R.
In terms of the generators, we can write any transformation $D_R(g(\theta))$ in exponential form. As a simple example, let us consider the case of a Lie group depending on a single parameter θ and let us indicate with $D_R(\theta)$ the operator corresponding to g(θ) in a certain representation. Using the group properties and (2.219), we will have

$D_R(\theta + d\theta) = D_R(\theta)\, D_R(d\theta) \simeq D_R(\theta)\, (1 + i\, d\theta\, T_R) = D_R(\theta) + i\, d\theta\, T_R\, D_R(\theta)$ . (2.221)

Since

$D_R(\theta + d\theta) - D_R(\theta) \simeq \frac{dD_R}{d\theta}\, d\theta = i\, T_R\, D_R(\theta)\, d\theta$ , (2.222)

we then have
$D_R(\theta) = e^{i T_R \theta}$ . (2.223)
This formula can be proven to hold in general. If the Lie group depends upon a certain number of parameters, we can indeed write the operator $D_R$ in exponential form (it is called the exponential map)
$D_R(g) = e^{i T_R^a \theta_a}$ . (2.224)
The generators $T_R^a$ obey an algebra, that can be found as follows. Since D(g) is a representation of our Lie group, it has to fulfill the following relation

$D_R(g_1)\, D_R(g_2) = D_R(g_1 g_2) = D_R(g_3)$ , (2.225)

where $g_3 = g_1 g_2$. Using the exponential map, this means

$e^{i \alpha_a T_R^a}\, e^{i \beta_a T_R^a} = e^{i \delta_a T_R^a}$ , (2.226)

since $D_R(g_1 g_2) = D_R(g_3)$ and therefore it must be of the same form as $D_R(g_1)$ and $D_R(g_2)$, and

$\delta_a = \delta_a(\alpha_a, \beta_a)$ . (2.227)

However, in general we have

$e^A e^B \neq e^{A+B}$ , (2.228)
and therefore (in general) $\delta_a \neq \alpha_a + \beta_a$.
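Let us consider infinitesimal transformations and let us take the logarithm of both sides of Eq. (2.226).
We have

$\log \left[ \left( 1 + i \alpha_a T_R^a + \frac{1}{2} (i \alpha_a T_R^a)^2 + \dots \right) \left( 1 + i \beta_a T_R^a + \frac{1}{2} (i \beta_a T_R^a)^2 + \dots \right) \right] = i \delta_a T_R^a$ (2.229)

or

$\log \left[ 1 + i \alpha_a T_R^a + i \beta_a T_R^a - \frac{1}{2} (\alpha_a T_R^a)^2 - \frac{1}{2} (\beta_a T_R^a)^2 - \alpha_a \beta_b T_R^a T_R^b + \dots \right] = i \delta_a T_R^a$ . (2.230)

Expanding the log up to second order ($\log(1 + x) \simeq x - x^2/2 + \dots$) we have

$i \delta_a T_R^a \simeq i \alpha_a T_R^a + i \beta_a T_R^a - \frac{1}{2} (\alpha_a T_R^a)^2 - \frac{1}{2} (\beta_a T_R^a)^2 - \alpha_a \beta_b T_R^a T_R^b + \frac{1}{2} \alpha_a \beta_b T_R^a T_R^b + \frac{1}{2} \alpha_a \beta_b T_R^b T_R^a + \frac{1}{2} (\alpha_a T_R^a)^2 + \frac{1}{2} (\beta_a T_R^a)^2$ (2.231)
$= i (\alpha_a + \beta_a) T_R^a - \frac{1}{2} \alpha_a \beta_b \left( T_R^a T_R^b - T_R^b T_R^a \right)$ , (2.232)

or
$\alpha_a \beta_b\, [T^a, T^b] = 2i (\alpha_c + \beta_c - \delta_c)\, T^c = i \gamma_c T^c$ . (2.233)
Since this relation must hold for every $\alpha_c$ and $\beta_c$, γ must be proportional to $\alpha_a \beta_b$:

$\gamma_c = \alpha_a \beta_b\, f_c{}^{ab}$ . (2.234)

The constants $f_c{}^{ab}$ are called structure constants. Finally we find

$[T^a, T^b] = i f_c{}^{ab}\, T^c$ , (2.235)

which is the Lie algebra that the generators have to fulfill.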


The explicit form of the generators $T^a$ depends on the specific representation. However, the algebra (2.235) is completely general and valid for every representation. We can prove that the structure constants are independent of the representation as well. They remain the same in every representation.
Finally, we found the algebra imposing the group structure at second order in α and β. However, it can be proven that at higher orders no further requirements occur. Knowing the structure constants and the generators is sufficient to know everything about the local structure of the group.
If $f_c{}^{ab} = 0$, we have $[T^a, T^b] = 0$ and the group is Abelian.

The idea behind the study of the algebra connected to the Lie group lies in the fact that

1. The representations of the algebra induce a corresponding representation on the Lie group;

2. It is easier to study the algebra than the group, since the generators form a vector space (and it is easier to deal with sums than with products).
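As a concrete instance of the algebra (2.235), the sketch below checks the commutation relations of the three generators of rotations in three dimensions. The explicit matrices $(J_a)_{lm} = -i\,\epsilon_{alm}$ and the structure constants $\epsilon_{abc}$ are the standard ones for rotations; they are brought in here as an assumed example and are not derived in this section.

```python
def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]

# Assumed generators: (J_a)_{lm} = -i eps_{alm}
GENERATORS = [[[-1j * levi_civita(a, l, m) for m in range(3)]
               for l in range(3)] for a in range(3)]

def algebra_holds(tol=1e-12):
    """Check [J_a, J_b] = i eps_{abc} J_c for all a, b."""
    for a in range(3):
        for b in range(3):
            lhs = commutator(GENERATORS[a], GENERATORS[b])
            for i in range(3):
                for j in range(3):
                    rhs = sum(1j * levi_civita(a, b, c) * GENERATORS[c][i][j]
                              for c in range(3))
                    if abs(lhs[i][j] - rhs) > tol:
                        return False
    return True

print(algebra_holds())
```

Here the structure constants are non-zero, so the group is not Abelian: the order in which two rotations around different axes are applied matters.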

Casimir operators
In the study of the representations an important role is played by the Casimir operators.

Definition 2.8.10 A Casimir operator is an operator that commutes with all the generators $T^a$ of the group.

For instance we can think about the angular momentum, remembering that $J^2$ actually commutes with the three components of the angular momentum $J^i$ (that are the generators of the rotations), $[J^2, J^i] = 0$. Casimir operators help in the study of the irreducible representations. They are linked to the first Schur’s lemma.

Lemma 2.8.11 (Schur’s lemma). If U(G) is an irreducible representation of the group G on a vector space V and $J^2$ is a Casimir operator for that representation ($[J^2, U(g)] = 0$ for ∀g ∈ G), then $J^2$ is proportional to the identity.

As an example we can consider again the angular momentum, for which we have $J^2 = j(j+1)\, 1$.
If we consider Abelian groups, since $f_c{}^{ab} = 0$, we have $[T^a, T^b] = 0$ and therefore every generator is a Casimir. It follows that every irreducible representation of an Abelian group will be constituted by operators that, since they all commute with the generators, are proportional to the identity. Therefore the irreducible representations will have dimension one.

Unitary representations
Of particular importance in Physics are the unitary representations of the Lie groups. To have a unitary representation means that the generators are hermitian and therefore they can be identified with observables. In order to have finite-dimensional unitary representations of a Lie group we need the group to be compact. This means that the parameters on which the group depends should range over a closed and bounded interval of the reals. This is the case, for instance, of the two- and three-dimensional rotations (rotations in the Euclidean space), for which the angles are defined in closed intervals. This is, instead, not the case for the Lorentz group. Although the part that regards rotations is compact, the boosts are not. The parameters that define the boosts are the components of the velocity v of the inertial frame S′ with respect to the inertial frame S. The modulus of v is such that $0 \leq v/c < 1$. So, v can never reach the speed of light c (v = c is a singular point for Lorentz transformations). As a consequence, the Lorentz group is not compact.
There is a theorem that states: “Non compact groups have no finite-dimensional unitary representations”.
Therefore, finite-dimensional representations of the Lorentz group cannot be unitary. However, we can find unitary infinite-dimensional representations and this is what matters for our physical descriptions, since quantum states live in infinite-dimensional spaces (Hilbert space ...).

2.8.3 A simple example: the (abelian) group SO(2) and U(1)


As a first simple example let us consider the group of rotations in two dimensions. This is a Lie group
depending on a single real parameter, the angle of rotation φ.
R(φ) is a rotation of angle φ acting on a certain vector space, v ∈ V . Under R(φ) the modulous of
v has to remain unchanged. If we have

v → v′ = R(φ)v , (2.236)

then

|v′ |2 = vt R(φ)t R(φ)v ≡ vt v = |v|2 =⇒ R(φ)t R(φ) = 1 = R(φ)R(φ)t . (2.237)

This defines the orthogonal transofrmations. Moreover, from R(φ)t R(φ) = 1 it follows that detR(φ) =
±1. The subset with detR(φ) = 1 forms a subgroup (it is the one containing the identity) which is
known as SO(2), special ortogonal group.
SO(2) is an Abelian group, since

R(φ1 )R(φ2 ) = R(φ1 + φ2 ) = R(φ2 )R(φ1 ) . (2.238)

It has the global property


R(φ) = R(φ ± 2π) (2.239)
and this property is not related to the infinitesimal transformations (local structure) but to the global
structure of the group.
Let us find the generators of SO(2). For the infinitesimal transformations we have

R(dφ) ≃ 1 + K dφ . (2.240)

Since R(φ)t R(φ) = 1 we have
(1 + K dφ)t (1 + K dφ) = 1 , (2.241)
therefore at first order in dφ we have

1 + (K + K t ) dφ = 1 , (2.242)

or
K = −K t . (2.243)
We can put K = −iJ with J hermitian (J † = J) and then

R(dφ) ≃ 1 − iJ dφ . (2.244)

Exponentiating we find
$$ R(\phi) = e^{-i\phi J} . \tag{2.245} $$
J is the generator of the group.

The representation on the Euclidean 2-dim vector space


We can consider a matrix representation of SO(2) as rotations on a 2-dim Euclidean vector space.

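[Figure: the basis (ê1, ê2) and the rotated basis (ê′1, ê′2), obtained by a rotation of angle φ with 0 ≤ φ ≤ 2π; a generic vector has components v = (v1, v2).]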
We can write the transformation (that rotates the basis) as

ê′i = D(φ)ji êj , (2.246)

where the matrix D(φ) is
$$ D(\phi) = \begin{pmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{pmatrix} . \tag{2.247} $$
If v ∈ V it can be written in components as

v = v i êi = v ′i ê′i . (2.248)

The components transform with D −1 = D t and in this case

v ′i = Dji v j , (2.249)

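that in matrix form means
$$ \begin{pmatrix} v'^1 \\ v'^2 \end{pmatrix} = \begin{pmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{pmatrix} \begin{pmatrix} v^1 \\ v^2 \end{pmatrix} . \tag{2.250} $$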
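In order to find the expression of the generator in this representation, let us find the infinitesimal
transformation by expanding Eq. (2.247) for a small parameter:
$$ D(d\phi) = \begin{pmatrix} 1 & d\phi \\ -d\phi & 1 \end{pmatrix} = 1 - i\, d\phi\, J . \tag{2.251} $$
Therefore
$$ J = \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix} . \tag{2.252} $$
In fact, since $J^2 = 1$, $J^3 = J$, $J^4 = 1$, ...,
$$ e^{-i\phi J} = 1 - i\phi J + \frac{1}{2}(-i\phi J)^2 + \dots \tag{2.253} $$
$$ = 1 - i\phi J - 1\,\frac{\phi^2}{2} - iJ\left(-\frac{\phi^3}{3!}\right) + \dots \tag{2.254} $$
$$ = 1 \cos\phi - iJ \sin\phi \tag{2.255} $$
$$ = \begin{pmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{pmatrix} = D(\phi) . \tag{2.256} $$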

Knowing J and φ we can have all the elements of the group.
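The exponentiation in Eqs. (2.253)-(2.256) can also be checked numerically. The following is a minimal sketch in plain Python (the helper names `matmul`, `exp_minus_i_phi_J`, `rotation` and `max_difference` are illustrative, not part of the notes): it sums the power series of $e^{-i\phi J}$ term by term and compares the result with D(φ) of Eq. (2.247).

```python
import math

# Generator of SO(2) in the 2-dim representation, Eq. (2.252).
J = [[0j, 1j], [-1j, 0j]]


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def exp_minus_i_phi_J(phi, terms=60):
    """Sum the first `terms` terms of the power series of exp(-i*phi*J)."""
    x = [[-1j * phi * J[i][j] for j in range(2)] for i in range(2)]
    total = [[0j, 0j], [0j, 0j]]
    term = [[1 + 0j, 0j], [0j, 1 + 0j]]  # zeroth-order term: the identity
    for n in range(terms):
        total = [[total[i][j] + term[i][j] for j in range(2)] for i in range(2)]
        product = matmul(term, x)
        # Next term of the series: x^(n+1) / (n+1)!
        term = [[product[i][j] / (n + 1) for j in range(2)] for i in range(2)]
    return total


def rotation(phi):
    """The matrix D(phi) of Eq. (2.247)."""
    return [[math.cos(phi), math.sin(phi)], [-math.sin(phi), math.cos(phi)]]


def max_difference(a, b):
    """Largest absolute entry-wise difference between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


if __name__ == "__main__":
    phi = 0.7
    print(max_difference(exp_minus_i_phi_J(phi), rotation(phi)))
```

Note that the series itself knows nothing about the identification of φ with φ + 2π; the periodicity only emerges after summation, in line with the remark that global properties are not visible at the level of the generators.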


The study of the algebra (then the generators) gives many pieces of information about the group.
In particular, it gives the “local” properties. However, from the exponential form we cannot extract
the global relation D(φ) = D(φ + 2π), which we have to impose separately. This global relation is
important in the study of the irreducible representations of the group.

Irreducible representations
Let us consider a representation (labeled by R) of the group on a finite-dimensional vector space. We
have
$$ D_R(\phi) = e^{-i\phi J} . \tag{2.257} $$
Moreover, we have
$$ D_R(\phi_1)\, D_R(\phi_2) = D_R(\phi_1 + \phi_2) , \tag{2.258} $$
$$ D_R(\phi) = D_R(\phi + 2\pi) , \tag{2.259} $$

and these relations have to be satisfied by every possible representation.


J is the generator “in the representation R” and it is a hermitian operator on V. Because of that,
it is diagonalizable, it has real eigenvalues and its eigenvectors form an orthonormal basis for V.
Since DR (φ) = f (J) (it is a function of J), we have

[DR (φ), J] = 0 . (2.260)

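Therefore, they have a common basis of eigenvectors, say $|\alpha\rangle$, on which
$$ J|\alpha\rangle = \alpha|\alpha\rangle , \tag{2.261} $$
$$ D_R(\phi)|\alpha\rangle = e^{-i\phi\alpha}|\alpha\rangle . \tag{2.262} $$
From these relations alone α can be any real number. However, we still have to impose the global relation
$$ D_R(\phi) = D_R(\phi + 2\pi) . \tag{2.263} $$
If we do that, we need $e^{-2\pi i\alpha} = 1$ and we find that α ≡ m ∈ Z. Therefore
$$ J|m\rangle = m|m\rangle , \tag{2.264} $$
$$ D_R(\phi)|m\rangle = e^{-i\phi m}|m\rangle . \tag{2.265} $$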

These are all one-dimensional representations, as it was expected from the fact that the group is
abelian.
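The role of the global condition can be illustrated with a short sketch (function names are illustrative): the phase $e^{-i\phi\alpha}$ is unchanged under φ → φ + 2π only when α is an integer.

```python
import cmath
import math


def D_alpha(alpha, phi):
    """One-dimensional candidate representation exp(-i*phi*alpha)."""
    return cmath.exp(-1j * phi * alpha)


def is_single_valued(alpha, phi=0.3, tol=1e-12):
    """True if D_alpha(phi) equals D_alpha(phi + 2*pi) within the tolerance.

    The ratio of the two phases is exp(-2*pi*i*alpha), independent of phi,
    so testing a single value of phi is enough.
    """
    return abs(D_alpha(alpha, phi) - D_alpha(alpha, phi + 2 * math.pi)) < tol


if __name__ == "__main__":
    for alpha in (-1, 0, 2, 0.5, 1 / 3):
        print(alpha, is_single_valued(alpha))
```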

The representation is diagonal in the basis of eigenvectors $|m\rangle$ and has $e^{-i\phi m}$ as eigenvalues.
Therefore, the representation is completely reducible into irreducible one-dimensional representations.
Every $|m\rangle$ is invariant under $D_R(\phi)$ and therefore we can express $D_R(\phi)$ as a direct sum of irreducible
representations $D_m(\phi) = e^{-i\phi m}$:
$$ D_R(\phi) = \bigoplus_m D_m(\phi) . \tag{2.266} $$

By diagonalizing J, the generator of the group in the representation R, we found the irreducible representations.
Let us have a closer look at these irreducible representations.

1. If m = 0 we have $D_0(\phi) = 1$, that is, the trivial representation (the identity).

2. If m = 1 we have $D_1(\phi) = e^{-i\phi}$. This is an isomorphism between elements of SO(2) and numbers
on the unit circle in the complex plane. When φ ranges over the closed interval [0, 2π], $D_1(\phi) = e^{-i\phi}$
covers the unit circle clockwise.

3. If m = −1 we have the same as above, but anti-clockwise.

4. m = ±2 covers the unit circle twice.

5. ... etc ...


Among these representations, only m = ±1 are faithful (one-to-one).
If we now go back to the representation on the Euclidean 2-dim vector space (2-dim representation)
we understand that it has to be reducible to two 1-dim irreducible representations. It is indeed
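equivalent to the direct sum of the m = ±1 representations. In fact
$$ J = \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix} \tag{2.267} $$
has two eigenvalues, ±1. The corresponding eigenvectors are
$$ \hat e_\pm = \frac{\hat e_1 \pm i\,\hat e_2}{\sqrt{2}} . \tag{2.268} $$
With respect to the new basis we have, consistently with $D_R(\phi) = e^{-i\phi J}$,
$$ J\, \hat e_\pm = \mp\, \hat e_\pm , \tag{2.269} $$
$$ D_R(\phi)\, \hat e_\pm = e^{\pm i\phi}\, \hat e_\pm . \tag{2.270} $$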
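A quick numerical check of the reduction into one-dimensional representations is sketched below. It rests on an assumption about conventions: ê± are represented as the column vectors (1, ±i)/√2 and the matrices J of Eq. (2.267) and D(φ) of Eq. (2.247) act on them by ordinary matrix multiplication. Which eigenvalue of J goes with which vector depends on this choice (acting on basis vectors rather than on components exchanges the two), while the set of eigenvalues, ±1, does not. The helper names are illustrative.

```python
import cmath
import math

J = [[0j, 1j], [-1j, 0j]]


def rotation(phi):
    """The matrix D(phi) of Eq. (2.247)."""
    return [[math.cos(phi), math.sin(phi)], [-math.sin(phi), math.cos(phi)]]


def apply(m, v):
    """Act with a 2x2 matrix on a two-component column vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]


def e_pm(sign):
    """Column vector (1, sign*i)/sqrt(2), standing for (e1 + sign*i*e2)/sqrt(2)."""
    return [1 / math.sqrt(2), sign * 1j / math.sqrt(2)]


def eigenvalue(m, v, tol=1e-12):
    """Return lam with m v = lam v; raise ValueError if v is not an eigenvector."""
    w = apply(m, v)
    lam = w[0] / v[0]
    if any(abs(w[i] - lam * v[i]) > tol for i in range(2)):
        raise ValueError("v is not an eigenvector of m")
    return lam


if __name__ == "__main__":
    phi = 0.4
    for sign in (+1, -1):
        print(sign, eigenvalue(J, e_pm(sign)), eigenvalue(rotation(phi), e_pm(sign)))
```

In this convention each vector picks up a pure phase under D(φ), so the two-dimensional representation indeed splits into the m = ±1 representations.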

2.9 The generators of the Poincaré group and the algebra


Since any Lorentz transformation can be obtained as the product of a proper orthochronous one
with parity and time reversal, the study of the whole Lorentz group, homogeneous or inhomogeneous,
reduces to the study of $L_+^\uparrow$ and, separately, of P and T.
Let us start with $L_+^\uparrow$; the parity and time-reversal transformations will be considered separately.
To study the generators of the Poincaré group we consider infinitesimal transformations.
We have to consider both proper Lorentz transformations and space-time translations. In total
we have:
$$ T(\Lambda, a) \simeq T(\delta^\mu{}_\nu + \delta\omega^\mu{}_\nu ,\, \delta a^\mu) , \tag{2.271} $$
where $\delta\omega^{\mu\nu}$ is an antisymmetric tensor of rank two,
$$ \delta\omega^{\mu\nu} = -\delta\omega^{\nu\mu} , \tag{2.272} $$
made of the infinitesimal parameters of the proper Lorentz transformation, and $\delta a^\mu$ is an infinitesimal
four-vector.
The infinitesimal transformation (2.271) is then composed of a translation,
$$ T(1, \delta a^\mu) = 1 - i\,\delta a_\mu P^\mu , \tag{2.273} $$
and of a proper Lorentz transformation,
$$ T(\delta^\mu{}_\nu + \delta\omega^\mu{}_\nu ,\, 0) = 1 - \frac{i}{2}\,\delta\omega_{\mu\nu} J^{\mu\nu} . \tag{2.274} $$
The four-vector $P^\mu$ is the generator of translations, while the antisymmetric rank-two tensor
$J^{\mu\nu} = -J^{\nu\mu}$ is the generator of the proper Lorentz transformations which, as we said,
consist of boosts and three-dimensional rotations¹. In total, $J^{\mu\nu}$ and $P^\mu$ form a set
of 10 generators: the 4 components of $P^\mu$, together with the 6 components of $J^{\mu\nu}$, since it is an
antisymmetric tensor and therefore has n(n−1)/2 independent components.
Exponentiating (2.273) and (2.274) leads to the following relations for finite transformations:
$$ T(1, a) = e^{-i P^\mu a_\mu} , \qquad T(\Lambda, 0) = e^{-\frac{i}{2} J^{\mu\nu} \omega_{\mu\nu}} . \tag{2.276} $$

Let us find the behaviour of the generators under Poincaré transformations.

Let T(Λ, b) be a generic transformation and T(1 + δω, δa) the infinitesimal transformation
on which we focus. The transformation of the operator T(1 + δω, δa) under T(Λ, b) is
given by:
$$ \begin{aligned} T(\Lambda, b)\, T(1+\delta\omega, \delta a)\, T^{-1}(\Lambda, b) &= T(\Lambda[1+\delta\omega],\, \Lambda\delta a + b)\; T^{-1}(\Lambda, b) \\ &= T(\Lambda[1+\delta\omega],\, \Lambda\delta a + b)\; T(\Lambda^{-1}, -\Lambda^{-1} b) \\ &= T(\Lambda[1+\delta\omega]\Lambda^{-1},\, -[\Lambda(1+\delta\omega)]\Lambda^{-1} b + \Lambda\delta a + b) \\ &= T(\Lambda[1+\delta\omega]\Lambda^{-1},\, \Lambda\delta a - \Lambda\delta\omega\Lambda^{-1} b) . \end{aligned} \tag{2.277} $$

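Expanding the infinitesimal transformations in (2.277) to first order, we finally obtain:
$$ T(\Lambda, b) \left[ -\frac{i}{2}\,\delta\omega_{\mu\nu} J^{\mu\nu} - i\,\delta a_\mu P^\mu \right] T^{-1}(\Lambda, b) = -\frac{i}{2} \left(\Lambda\delta\omega\Lambda^{-1}\right)_{\mu\nu} J^{\mu\nu} - i \left(\Lambda\delta a - \Lambda\delta\omega\Lambda^{-1} b\right)_\mu P^\mu , \tag{2.278} $$
where
$$ \left(\Lambda\delta\omega\Lambda^{-1}\right)^{\mu\nu} = \Lambda^\mu{}_\sigma\, \delta\omega^{\sigma\rho}\, (\Lambda^{-1})_\rho{}^\nu = \delta\omega^{\sigma\rho}\, \Lambda^{\mu\alpha} \eta_{\alpha\sigma} \eta_{\rho\beta} (\Lambda^{-1})^{\beta\nu} = \delta\omega^{\sigma\rho}\, \eta_{\alpha\sigma} \eta_{\rho\beta}\, \Lambda^{\mu\alpha} \Lambda^{\nu\beta} \tag{2.279} $$
$$ = \delta\omega_{\alpha\beta}\, \Lambda^{\mu\alpha} \Lambda^{\nu\beta} , \tag{2.280} $$
$$ \left(\Lambda\delta\omega\Lambda^{-1} b\right)_\mu P^\mu = \delta\omega_{\rho\sigma}\, \Lambda_\alpha{}^\rho \Lambda_\mu{}^\sigma\, b^\alpha P^\mu \tag{2.281} $$
$$ = \frac{1}{2}\, \delta\omega_{\rho\sigma} \left( \Lambda_\alpha{}^\rho \Lambda_\mu{}^\sigma\, b^\alpha P^\mu - \Lambda_\alpha{}^\sigma \Lambda_\mu{}^\rho\, b^\alpha P^\mu \right) \tag{2.282} $$
$$ = \frac{1}{2}\, \delta\omega_{\rho\sigma}\, \Lambda_\alpha{}^\rho \Lambda_\mu{}^\sigma \left( b^\alpha P^\mu - b^\mu P^\alpha \right) . \tag{2.283} $$

¹ In quantum mechanics both $P^\mu$ and $J^{\mu\nu}$ are operators. If we look for unitary representations of the group,
$T(\delta^\mu{}_\nu + \omega^\mu{}_\nu, \epsilon^\mu)$ must be unitary and therefore the generators must be self-adjoint:
$$ J^{\mu\nu\dagger} = J^{\mu\nu} , \qquad P^{\mu\dagger} = P^\mu . \tag{2.275} $$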
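Eq. (2.278) therefore leads to the following transformation laws:
$$ T(\Lambda, b)\, J^{\mu\nu}\, T^{-1}(\Lambda, b) = \Lambda_\rho{}^\mu \Lambda_\sigma{}^\nu \left( J^{\rho\sigma} - (b^\rho P^\sigma - b^\sigma P^\rho) \right) , \tag{2.284} $$
$$ T(\Lambda, b)\, P^\mu\, T^{-1}(\Lambda, b) = \Lambda_\nu{}^\mu P^\nu , \tag{2.285} $$
that is, $P^\mu$ and $J^{\mu\nu}$ transform as a four-vector and as a rank-two tensor, respectively.

If we also take T(Λ, b) to be an infinitesimal transformation, (2.278) leads to the following commutation
rules among the generators of the Poincaré group:
$$ [P^\mu, P^\nu] = 0 , \tag{2.286} $$
$$ [P^\mu, J^{\lambda\sigma}] = i \left( P^\lambda \eta^{\mu\sigma} - P^\sigma \eta^{\mu\lambda} \right) , \tag{2.287} $$
$$ [J^{\mu\nu}, J^{\rho\sigma}] = i \left( J^{\nu\sigma} \eta^{\mu\rho} + J^{\rho\nu} \eta^{\sigma\mu} - J^{\mu\sigma} \eta^{\nu\rho} - J^{\rho\mu} \eta^{\sigma\nu} \right) . \tag{2.288} $$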

If we restrict ourselves to some particular transformations, the commutation rules coincide with
those of the operators that in quantum mechanics are associated with the conservation of certain
quantities. Indeed, the four-momentum operator coincides with the generator $P^\mu$ of translations: the time
component $P^0$ is nothing but the energy of the system, while the vector $\mathbf{P} = (P^1, P^2, P^3)$ is its
momentum. As for the tensor $J^{\mu\nu}$, it is related to the angular momentum: in particular, the
angular momentum is given by the three components
$$ \mathbf{J} = (J^{23}, J^{31}, J^{12}) , \tag{2.289} $$
while the other components are the generators of Lorentz boosts:
$$ \mathbf{K} = (J^{10}, J^{20}, J^{30}) . \tag{2.290} $$
For these quantities one recovers from (2.286)-(2.288) the following commutation rules:
$$ [J_i, J_j] = i\epsilon_{ijk} J_k , \tag{2.291} $$
$$ [J_i, P^0] = [P_i, P^0] = 0 , \tag{2.292} $$
$$ [J_i, P_j] = i\epsilon_{ijk} P_k , \tag{2.293} $$
$$ [K_i, K_j] = -i\epsilon_{ijk} J_k , \tag{2.294} $$
$$ [J_i, K_j] = i\epsilon_{ijk} K_k . \tag{2.295} $$
Note that, while the three components of the angular momentum form a closed algebra
(see (2.291)), the three boost generators do not (see (2.294)).

2.10 Finite-dimensional irreducible representations of the Poincaré group

Let us now consider the most interesting part of the discussion of the Poincaré group: the fact that
the fields corresponding to the known elementary particles can be classified according to their
behaviour under the transformations of the group.
We call field a function from the space of events M (Minkowski space) to $C^n$:
$$ \phi(X) : M \longrightarrow C^n , \tag{2.296} $$
$$ \phi(X) = \left( \phi_1(X), \phi_2(X), \dots, \phi_n(X) \right) , \tag{2.297} $$
that is, in general, a function of $X^\mu$ with several complex components. In practice, we will use functions with
a single real value (scalar field) or complex value (charged scalar field) and functions with four complex
components (Dirac field).
If we perform a Lorentz transformation $X^\mu \to X'^\mu = \Lambda^\mu{}_\nu X^\nu$, i.e. we change reference
frame, the field φ(X) undergoes a homogeneous linear transformation

φ(X) → φ′(X′) = S(Λ) φ(X) , (2.298)

where the operator S(Λ) is entirely determined by the transformation Λ. S(Λ) acts on the components
of φ(X) and is a representation of the Lorentz group on the linear space spanned by the vectors
φ(X).
The representations we are interested in are the finite-dimensional representations of the Lorentz group,
i.e. those on linear spaces of vectors φ(X) with a finite number of components.
We will deal with representations of integer spin (spin 0, 1) and half-integer spin (spin 1/2).

2.10.1 Tensor fields. Integer-spin representations

A tensor field is an object with m contravariant and n covariant indices, $T^{\mu_1 \dots \mu_m}_{\alpha_1 \dots \alpha_n}(X)$, such that
for $X^\mu \to X'^\mu = \Lambda^\mu{}_\nu X^\nu$ it transforms as follows:
$$ T^{\mu_1 \dots \mu_m}_{\alpha_1 \dots \alpha_n}(X) \;\xrightarrow{X \to X'}\; T'^{\mu_1 \dots \mu_m}_{\alpha_1 \dots \alpha_n}(X') = \Lambda^{\mu_1}{}_{\nu_1} \cdots \Lambda^{\mu_m}{}_{\nu_m}\, \Lambda_{\alpha_1}{}^{\beta_1} \cdots \Lambda_{\alpha_n}{}^{\beta_n}\, T^{\nu_1 \dots \nu_m}_{\beta_1 \dots \beta_n}(X) , \tag{2.299} $$
where $\Lambda_\alpha{}^\beta = (\Lambda^{-1})^\beta{}_\alpha$.
We denote the generic tensor $T^{\mu_1 \dots \mu_m}_{\alpha_1 \dots \alpha_n}$ of rank m, n by the symbol (m, n). The space of
(m, n) tensors is a linear space, i.e. a linear combination of (m, n) tensors is again an (m, n) tensor:
(m, n) ⊕ (m, n) −→ (m, n) . (2.300)
Moreover, a number of operations can be defined, such as:
• the tensor product, ⊗, such that

(m, n) ⊗ (m′, n′) −→ (m + m′, n + n′) , (2.301)

i.e. the product of two tensors of rank (m, n) and (m′, n′) gives a tensor of rank
(m + m′, n + n′);

• the contraction of two indices. If we set a contravariant index equal to a covariant one and sum over it, the
rank of the tensor is lowered by 1 in m and by 1 in n. For example
$$ T^{\alpha\beta\gamma}{}_\delta \longrightarrow T^{\alpha\beta\gamma}{}_\beta = K^{\alpha\gamma} , \qquad (3, 1) \longrightarrow (2, 0) ; \tag{2.302} $$

• the raising or lowering of indices. By means of the metric tensor $\eta_{\mu\nu}$ we can turn
a contravariant index into a covariant one and vice versa:
$$ T^{\alpha\beta\gamma}{}_\delta = \eta_{\delta\rho}\, T^{\alpha\beta\gamma\rho} . \tag{2.303} $$

Scalar field. Trivial representation of the Lorentz group

The first particular case of a tensor field is the scalar field. This is a tensor of rank (0, 0)
and therefore, when X → X′, it transforms according to the trivial representation of the Lorentz group,
S(Λ) = 1:
φ(X) −→ φ′(X′) = φ(X) . (2.304)
Vector field. Four-dimensional representation of the Lorentz group

Let us now consider a tensor field of rank (1, 0):
$$ V^\mu(X) = \left( V^0(X), \mathbf{V}(X) \right) , \tag{2.305} $$
which we call a contravariant four-vector. Under the coordinate transformation X → X′, the vector
field $V^\mu$ transforms exactly as $X^\mu$ (or, for example, as the four-momentum $P^\mu = (E, \mathbf{p})$),
that is:
$$ V^\mu(X) \longrightarrow V'^\mu(X') = \Lambda^\mu{}_\nu V^\nu(X) . \tag{2.306} $$
An example of a contravariant four-vector field is the electromagnetic potential $A^\mu(X)$.
If, instead, we consider a tensor field of rank (0, 1), $U_\mu(X)$, we have the following
transformation law when X → X′:
$$ U_\mu(X) \longrightarrow U'_\mu(X') = \Lambda_\mu{}^\nu U_\nu(X) , \tag{2.307} $$
i.e. $U_\mu(X)$ transforms with the inverse of Λ. We say that $U_\mu$ is a covariant four-vector. An example
of this kind of vector is the gradient of a scalar field, $\partial_\mu \phi(X) = \partial\phi(X)/\partial X^\mu$.

The tensor product of a contravariant and a covariant vector, with contracted indices, $V^\mu U_\mu$, is
a Lorentz scalar. Indeed:
$$ V^\mu(X)\, U_\mu(X) \longrightarrow V'^\mu(X')\, U'_\mu(X') = \Lambda^\mu{}_\nu \Lambda_\mu{}^\rho\, V^\nu(X)\, U_\rho(X) \tag{2.308} $$
$$ = \delta^\rho_\nu\, V^\nu(X)\, U_\rho(X) \tag{2.309} $$
$$ = V^\nu(X)\, U_\nu(X) . \tag{2.310} $$
In particular the norm of a four-vector, $V^\mu V_\mu = \|V\|^2$, is a Lorentz invariant.

So, in the case of a vector field, $S(\Lambda) = \Lambda^\mu{}_\nu$, acting on the four components of the
four-vector $V^\mu$. We have found a four-dimensional representation of the Lorentz group.
Since the latter consists of boosts and of rotations in three dimensions, we can consider 6 matrices:
$(\tilde\Lambda^\mu{}_\nu)_x$, $(\tilde\Lambda^\mu{}_\nu)_y$, $(\tilde\Lambda^\mu{}_\nu)_z$ for the boosts along the Cartesian axes and $R_x$, $R_y$ and $R_z$ for the rotations about
them. For example:
$$ (\tilde\Lambda^\mu{}_\nu)_x = \begin{pmatrix} \cosh\phi_x & \sinh\phi_x & 0 & 0 \\ \sinh\phi_x & \cosh\phi_x & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \quad \text{and} \quad R_z = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta_z & \sin\theta_z & 0 \\ 0 & -\sin\theta_z & \cos\theta_z & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} , \tag{2.311} $$
where $\phi_x$ is related to the velocity of the boost along the x axis and $\theta_z$ is the angle of rotation about the
z axis.
To find the generators of the group in the same matrix representation, it suffices to recall that
for the exponential map one has:
$$ K_x = -i \left. \frac{d}{d\phi_x} (\tilde\Lambda^\mu{}_\nu)_x \right|_{\phi_x = 0} = -i \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} , \tag{2.312} $$
$$ K_y = -i \left. \frac{d}{d\phi_y} (\tilde\Lambda^\mu{}_\nu)_y \right|_{\phi_y = 0} = -i \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} , \tag{2.313} $$
$$ K_z = -i \left. \frac{d}{d\phi_z} (\tilde\Lambda^\mu{}_\nu)_z \right|_{\phi_z = 0} = -i \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix} , \tag{2.314} $$
and the generators of the rotations are simply those found in three Euclidean dimensions, with
the addition of a row and a column of zeros for the time part:
$$ J_x = -i \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \end{pmatrix} , \tag{2.315} $$
$$ J_y = -i \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix} , \tag{2.316} $$
$$ J_z = -i \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} . \tag{2.317} $$

Note that, while the matrices $J_i$ are hermitian and therefore their exponentiation gives a
unitary matrix, the same is not true for the $K_i$, which are not hermitian. This is due to the fact
that the group SO(3) is compact and therefore admits finite-dimensional unitary representations, while the full
Lorentz group (with the boosts) is not compact and therefore does not admit finite-dimensional unitary representations.
The generators of the group form the associated Lie algebra and indeed one can verify that, in the
four-dimensional matrix representation just given, (2.291), (2.294) and (2.295) hold.
While the matrix expressions are peculiar to the chosen representation, the rules that define
the associated algebra are universal. For every representation of the Lorentz group, the
generators of the group must satisfy (2.291), (2.294) and (2.295).
The representation (2.312)-(2.317) of the Lorentz group is denoted by SO(3, 1).
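The statement that the matrices (2.312)-(2.317) satisfy (2.291), (2.294) and (2.295) can be verified by brute force. The sketch below, in plain Python, builds the six generators from their non-zero entries and checks all the commutators; the helper names (`generator`, `satisfies`, `is_hermitian`, ...) are illustrative choices, not notation from these notes.

```python
def generator(entries):
    """Return the 4x4 matrix -i*M, where M has the given {(row, col): value} entries."""
    m = [[0j] * 4 for _ in range(4)]
    for (row, col), value in entries.items():
        m[row][col] = -1j * value
    return m


# Boost generators, Eqs. (2.312)-(2.314), and rotation generators, Eqs. (2.315)-(2.317).
K = [generator({(0, 1): 1, (1, 0): 1}),
     generator({(0, 2): 1, (2, 0): 1}),
     generator({(0, 3): 1, (3, 0): 1})]
J = [generator({(2, 3): 1, (3, 2): -1}),
     generator({(1, 3): -1, (3, 1): 1}),
     generator({(1, 2): 1, (2, 1): -1})]


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[r][c] - ba[r][c] for c in range(4)] for r in range(4)]


def epsilon(i, j, k):
    """Levi-Civita symbol for indices taking the values 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2


def satisfies(left, right, factor, result, tol=1e-12):
    """Check [left_i, right_j] = factor * sum_k eps_ijk * result_k for all i, j."""
    for i in range(3):
        for j in range(3):
            comm = commutator(left[i], right[j])
            for r in range(4):
                for c in range(4):
                    expected = factor * sum(epsilon(i, j, k) * result[k][r][c]
                                            for k in range(3))
                    if abs(comm[r][c] - expected) > tol:
                        return False
    return True


def is_hermitian(m, tol=1e-12):
    return all(abs(m[r][c] - m[c][r].conjugate()) <= tol
               for r in range(4) for c in range(4))


if __name__ == "__main__":
    print("[J,J] = i eps J :", satisfies(J, J, 1j, J))
    print("[K,K] = -i eps J:", satisfies(K, K, -1j, J))
    print("[J,K] = i eps K :", satisfies(J, K, 1j, K))
    print("J hermitian:", all(is_hermitian(m) for m in J))
    print("K hermitian:", any(is_hermitian(m) for m in K))
```

The hermiticity test makes the compact/non-compact distinction explicit: the rotation generators pass it, the boost generators do not.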

N.B. The importance of studying the transformation properties of the quantities we deal with
in building the theory, and of developing the tensor notation, comes from the following.
Suppose that in the reference frame S a physical law is expressed by a tensor
equality:
$$ T^\alpha{}_\beta = U^\alpha{}_\beta . \tag{2.318} $$
We know exactly how the two tensors that express the law (2.318) transform
under the Lorentz group. Then, in another reference frame S′, we have:
$$ T'^\alpha{}_\beta = \Lambda^\alpha{}_\gamma \Lambda_\beta{}^\delta\, T^\gamma{}_\delta \tag{2.319} $$
$$ = \Lambda^\alpha{}_\gamma \Lambda_\beta{}^\delta\, U^\gamma{}_\delta \tag{2.320} $$
$$ = U'^\alpha{}_\beta , \tag{2.321} $$
where in going from (2.319) to (2.320) we used (2.318). Eq. (2.321) tells us that a
relation between tensors keeps its form under Lorentz transformations: in S′ the same
relation holds for the transformed tensors.
The law (2.318) is said to be manifestly covariant.
2.10.2 Spinor fields. Dirac spinors
Let us now look for a two-dimensional representation of the Lorentz group.
To do this, we take inspiration from the homomorphism between the special group of rotations in three
dimensions, SO(3), and its universal cover, SU(2), the group of special unitary
transformations on a two-dimensional space.
SO(3) depends on three parameters (for example the three Euler angles) and we can give a
representation of it in terms of real 3 × 3 matrices for the three rotations about the Cartesian axes:
$R_x(\theta_x)$, $R_y(\theta_y)$ and $R_z(\theta_z)$. More generally, if n identifies a direction in Euclidean space, a
rotation by an angle θ about the axis n, according to the right-hand rule, can be written
as follows:
$$ R_{\mathbf{n}}(\theta) = e^{i \mathbf{J} \cdot \boldsymbol{\theta}} , \tag{2.322} $$
where we set $\boldsymbol{\theta} = \theta\, \mathbf{n}$ and where the matrices $J_i$ are the generators of the group SO(3) and satisfy the
rules of the associated Lie algebra:
$$ [J_i, J_j] = i\epsilon_{ijk} J_k . \tag{2.323} $$
First of all, let us show that SU(2) also depends on three real parameters. We can give a
representation of the group in terms of complex 2 × 2 matrices:
$$ U = \begin{pmatrix} a & b \\ c & d \end{pmatrix} , \tag{2.324} $$
with a, b, c, d ∈ C. Since U must be unitary ($U^\dagger U = 1$) and have determinant det U = +1, the 8
independent real parameters that seem to define U in (2.324) reduce to three. Indeed, the first
requirement gives 4 real relations and the second one more. Under these conditions U becomes:
$$ U = \begin{pmatrix} a & b \\ -b^* & a^* \end{pmatrix} . \tag{2.325} $$

Such a matrix acts on a two-dimensional complex vector space, called the space of
two-component spinors:
$$ \xi = \begin{pmatrix} \xi_1 \\ \xi_2 \end{pmatrix} , \qquad \xi \longrightarrow \xi' = U \xi . \tag{2.326} $$
The generators of the group SU(2) are the Pauli matrices $\sigma_i$:
$$ \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} ; \quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} ; \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} , \tag{2.327} $$
which obey the following commutation rules (of the associated algebra):
$$ \left[ \tfrac{1}{2}\sigma_i, \tfrac{1}{2}\sigma_j \right] = i\epsilon_{ijk}\, \tfrac{1}{2}\sigma_k . \tag{2.328} $$
A transformation of SU(2) preserves the quadratic form $(x^2 + y^2 + z^2)$. Indeed, let us consider the
position vector r = (x, y, z) and the matrix
$$ h = \boldsymbol{\sigma} \cdot \mathbf{r} = \begin{pmatrix} z & x - iy \\ x + iy & -z \end{pmatrix} , \tag{2.329} $$
which is hermitian and traceless.

Let us transform h by means of the similarity transformation:

h −→ h′ = U h U† . (2.330)
Transformation (2.330) preserves hermiticity and tracelessness. Indeed, we have:
$$ h'^\dagger = (U h U^\dagger)^\dagger = U h^\dagger U^\dagger = U h U^\dagger = h' , \tag{2.331} $$
where we used $h^\dagger = h$; and:
$$ \mathrm{tr}\, h' = \mathrm{tr}\, (U h U^\dagger) = \mathrm{tr}\, (U^\dagger U h) = \mathrm{tr}\, h , \tag{2.332} $$
by the cyclic property of the trace, together with $U^\dagger U = 1$.

Moreover, (2.330) also preserves the determinant, which is nothing but the quadratic form $\|\mathbf{r}\|^2$ with
the opposite sign:
$$ x'^2 + y'^2 + z'^2 = -\det h' = -\det h = x^2 + y^2 + z^2 . \tag{2.333} $$
It is therefore natural that SU(2) is related in some way to the rotation group SO(3). In
particular, there is a 2-to-1 correspondence between the elements of SU(2) and the elements of SO(3), and the
irreducible representations of SO(3) are contained in those of SU(2) (SU(2) is said to be the universal
cover of SO(3)).
The correspondence between the two groups can be summarized as:
$$ U = e^{\frac{i}{2} \boldsymbol{\sigma} \cdot \boldsymbol{\theta}} \iff R = e^{i \mathbf{J} \cdot \boldsymbol{\theta}} , \tag{2.334} $$
$$ \tfrac{1}{2} \sigma_i \iff J_i . \tag{2.335} $$
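The invariance of x² + y² + z² under h → U h U† can be checked numerically. The sketch below is a minimal illustration in plain Python; the parametrization of U by three angles (`alpha`, `beta`, `gamma`) is an assumption made here for convenience, chosen so that U has the form (2.325) with |a|² + |b|² = 1, and the helper names are illustrative.

```python
import cmath
import math


def su2(alpha, beta, gamma):
    """SU(2) matrix [[a, b], [-b*, a*]] with |a|^2 + |b|^2 = 1 (form of Eq. (2.325))."""
    a = cmath.exp(1j * alpha) * math.cos(gamma)
    b = cmath.exp(1j * beta) * math.sin(gamma)
    return [[a, b], [-b.conjugate(), a.conjugate()]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(m):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[m[j][i].conjugate() for j in range(2)] for i in range(2)]


def h_matrix(x, y, z):
    """The matrix h = sigma . r of Eq. (2.329)."""
    return [[complex(z), x - 1j * y], [x + 1j * y, complex(-z)]]


def transform(u, h):
    """The similarity transformation h -> U h U^dagger of Eq. (2.330)."""
    return matmul(matmul(u, h), dagger(u))


def vector_from(h):
    """Read (x, y, z) back from a hermitian traceless matrix of the form (2.329)."""
    return (h[1][0].real, h[1][0].imag, h[0][0].real)


if __name__ == "__main__":
    u = su2(0.3, 1.1, 0.7)
    x, y, z = vector_from(transform(u, h_matrix(1.0, 2.0, 3.0)))
    print(x * x + y * y + z * z)  # squared length of the transformed vector
```

Replacing U by −U gives the same transformed h, which is one way to see the 2-to-1 correspondence mentioned above.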
Let us then look for a two-dimensional representation of the Lorentz group.
For the rotations we can use (2.335).
To find the two-dimensional generators of the boosts, $K^{(2)}$, we exploit the commutation rules
of the algebra, which hold in any representation:
$$ \left[ K^{(2)}_i, K^{(2)}_j \right] = -i\epsilon_{ijk}\, \tfrac{1}{2}\sigma_k , \tag{2.336} $$
$$ \left[ \tfrac{1}{2}\sigma_i, K^{(2)}_j \right] = i\epsilon_{ijk}\, K^{(2)}_k , \tag{2.337} $$
$$ \left[ \tfrac{1}{2}\sigma_i, \tfrac{1}{2}\sigma_j \right] = i\epsilon_{ijk}\, \tfrac{1}{2}\sigma_k . \tag{2.338} $$
One finds:
$$ K^{(2)}_i = \pm \frac{i}{2}\, \sigma_i , \tag{2.339} $$
i.e. we have two possible representations, one with $K^{(2)}_i = \frac{i}{2}\sigma_i$ and the other with $K^{(2)}_i = -\frac{i}{2}\sigma_i$.
Correspondingly, we have two kinds of spinors, which transform differently under boosts:
$$ \phi_R \to \phi'_R = \exp\left\{ \frac{i}{2}\, \boldsymbol{\sigma} \cdot (\boldsymbol{\theta} - i\boldsymbol{\phi}) \right\} \phi_R , \tag{2.340} $$
$$ \phi_L \to \phi'_L = \exp\left\{ \frac{i}{2}\, \boldsymbol{\sigma} \cdot (\boldsymbol{\theta} + i\boldsymbol{\phi}) \right\} \phi_L , \tag{2.341} $$
where θ are the parameters of the rotation and φ those of the boost.
Note that $\phi_R$ and $\phi_L$ (right-handed and left-handed spinors) transform in the same way under rotations.
Indeed, if we consider only irreducible representations of the rotation group, as in the non-relativistic
theory, we have a single kind of spinor: the Pauli spinors. The introduction of Lorentz
transformations, instead, distinguishes between two kinds of components.
This two-dimensional representation of the Lorentz group is denoted by SL(2, C).
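As a short check of the solution (2.339), one can verify that both sign choices satisfy (2.336) and (2.337). The sketch uses only the Pauli-matrix commutator $[\sigma_i, \sigma_j] = 2i\epsilon_{ijk}\sigma_k$, which is equivalent to (2.338):

```latex
\begin{align*}
\left[K^{(2)}_i, K^{(2)}_j\right]
  &= \left(\pm\frac{i}{2}\right)^{2} [\sigma_i, \sigma_j]
   = -\frac{1}{4}\, 2i\,\epsilon_{ijk}\,\sigma_k
   = -i\,\epsilon_{ijk}\,\frac{1}{2}\sigma_k , \\
\left[\tfrac{1}{2}\sigma_i, K^{(2)}_j\right]
  &= \pm\frac{i}{4}\, [\sigma_i, \sigma_j]
   = \pm\frac{i}{4}\, 2i\,\epsilon_{ijk}\,\sigma_k
   = i\,\epsilon_{ijk} \left(\pm\frac{i}{2}\sigma_k\right)
   = i\,\epsilon_{ijk}\, K^{(2)}_k .
\end{align*}
```

The first relation is quadratic in $K^{(2)}$, so the overall sign drops out; the second is linear, so it holds for either sign. This is why the algebra alone cannot select one of the two representations.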

The correspondence between the two representations SO(3, 1) and SL(2, C) can be seen exactly as
we established the correspondence between the transformations of SO(3) and of SU(2). Let us consider the
four-vector $\sigma^\mu = (\sigma^0 = 1, \boldsymbol{\sigma})$ and contract it with the event $X^\mu$:
$$ X = \sigma_\mu X^\mu = \begin{pmatrix} X^0 - X^3 & -X^1 + iX^2 \\ -X^1 - iX^2 & X^0 + X^3 \end{pmatrix} , \tag{2.342} $$
a hermitian matrix.
If A ∈ SL(2, C), let us perform the transformation on X:

X −→ X′ = A X A† , (2.343)

under which X′ is still a hermitian matrix. Transformation (2.343) preserves the determinant of X, which
is nothing but the quadratic form
$$ X^\mu X_\mu = (X^0)^2 - (X^1)^2 - (X^2)^2 - (X^3)^2 , \tag{2.344} $$
i.e. the norm of the four-vector $X^\mu$. In other words, it is a Lorentz transformation.
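A concrete instance of this correspondence can be sketched numerically. The example below assumes one specific element of SL(2, C), the diagonal matrix A = diag(e^{η/2}, e^{−η/2}) with a real parameter η (a choice made here for illustration; the symbol η and the helper names are not from the notes). Acting as in (2.343) on the matrix of Eq. (2.342), it mixes X⁰ and X³ like a boost along the third axis and leaves the quadratic form (2.344) unchanged.

```python
import math


def x_matrix(x0, x1, x2, x3):
    """Hermitian matrix of Eq. (2.342) built from the components of X^mu."""
    return [[x0 - x3, -x1 + 1j * x2], [-x1 - 1j * x2, x0 + x3]]


def components(m):
    """Invert Eq. (2.342): read (X^0, X^1, X^2, X^3) back from the matrix."""
    x0 = (m[0][0] + m[1][1]).real / 2
    x3 = (m[1][1] - m[0][0]).real / 2
    x1 = -m[0][1].real
    x2 = m[0][1].imag
    return (x0, x1, x2, x3)


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(m):
    return [[complex(m[j][i]).conjugate() for j in range(2)] for i in range(2)]


def transform(a, m):
    """The transformation X -> A X A^dagger of Eq. (2.343)."""
    return matmul(matmul(a, m), dagger(a))


def minkowski_norm(x):
    """The quadratic form of Eq. (2.344)."""
    return x[0] ** 2 - x[1] ** 2 - x[2] ** 2 - x[3] ** 2


def diagonal_sl2c(eta):
    """diag(exp(eta/2), exp(-eta/2)): determinant 1, hence an element of SL(2, C)."""
    return [[math.exp(eta / 2), 0.0], [0.0, math.exp(-eta / 2)]]


if __name__ == "__main__":
    x = (2.0, 0.3, -0.4, 1.0)
    xp = components(transform(diagonal_sl2c(0.5), x_matrix(*x)))
    print(xp, minkowski_norm(x), minkowski_norm(xp))
```

With this choice one finds X′⁰ = X⁰ cosh η − X³ sinh η and X′³ = X³ cosh η − X⁰ sinh η, while X¹ and X² are untouched.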

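Since under parity the two spinors, right-handed and left-handed, transform into each other, if we want to
consider the complete Lorentz group it no longer makes sense to distinguish between $\phi_R$ and $\phi_L$.
We then introduce the Dirac spinor, built as follows:
$$ \psi = \begin{pmatrix} \phi_R \\ \phi_L \end{pmatrix} , \tag{2.345} $$
such that:
$$ \psi \longrightarrow \psi' = \begin{pmatrix} \exp\left\{ \frac{i}{2}\, \boldsymbol{\sigma} \cdot (\boldsymbol{\theta} - i\boldsymbol{\phi}) \right\} & 0 \\ 0 & \exp\left\{ \frac{i}{2}\, \boldsymbol{\sigma} \cdot (\boldsymbol{\theta} + i\boldsymbol{\phi}) \right\} \end{pmatrix} \psi . \tag{2.346} $$

The Dirac spinors span the vector space of the four-dimensional representation of the Lorentz group,
which is irreducible once parity is included.

2.11 Infinite dimensional representations of the Poincaré group: particle states

Chapter 3

Conservation Laws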

3.1 Lagrangian formalism


Following the formalization of analytical mechanics, we will now study the system in the Lagrangian
formalism. This formalism is optimal for the study of the symmetries of the theory. We will use the
related Hamiltonian formalism when we quantize the system, making a correspondence between
the Poisson brackets and the commutators of the operators that describe the physical observables.
This will be the subject of the so-called “canonical quantization”.

3.1.1 Relativistic free particle


In order to recall the basic principles of Lagrangian mechanics, let us concentrate on a simple example:
the relativistic free particle. In classical Physics, we find the equations of motion from the Least Action
Principle, or Hamilton’s Principle.
A physical system can be described by a function called the “Lagrangian” L (given by the difference
between the kinetic and the potential energy), which depends on the coordinates (collectively labeled
with q), the velocities (q̇) and (at most) the time¹:

L = L(q, q̇, t) . (3.1)

The motion of the classical system is the function of time q(t) that minimizes the following functional
(the Action)
$$ S = \int_{t_1}^{t_2} L(q, \dot q, t)\, dt , \tag{3.2} $$
with respect to path variations with fixed end points:

δS = 0 . (3.3)

Eq. (3.3) gives a system of second-order differential equations, called the Euler-Lagrange equations,
which in Newtonian mechanics generalize the second principle F = ma.
Let us now consider a relativistic system. Our study should be independent of the inertial frame
where the observer lives. In other words, our description of the system should be invariant under
Lorentz transformations and therefore the Action should be invariant, in such a way that the Euler-Lagrange
differential equations are unchanged in form in every inertial frame. Considering Eq. (3.2),
this means that L dt must be a Lorentz scalar. We already pointed out that dt is not a Lorentz
scalar and, moreover, that the time derivative $dX^\mu/dt$ does not transform as a four-vector under Lorentz
transformations. Let us consider, then, instead of the time, the “proper time” of the particle, τ, for
which we know that dτ is actually invariant.
We can write the product L dt in a manifestly invariant way as follows:
$$ S = \int_{\tau_1}^{\tau_2} L(X^\mu, \dot X^\mu, \tau)\, d\tau , \tag{3.4} $$
where $\dot X^\mu = dX^\mu/d\tau$. Since dτ is invariant, so should be the function $L(X^\mu, \dot X^\mu, \tau)$².

¹ If the system is closed, the explicit dependence on time in the Lagrangian is not present.


Let us now consider a generic variation of the “path” $X^\mu(\tau)$. Let us consider a reparametrization
τ → τ′ and a variation of $X^\mu(\tau)$ and $\dot X^\mu(\tau)$ as follows:
$$ \tau \to \tau' = \tau + \Delta\tau(\tau) , \tag{3.6} $$
$$ X^\mu(\tau) \to X'^\mu(\tau') = X^\mu(\tau) + \Delta X^\mu(\tau) , \tag{3.7} $$
$$ \dot X^\mu(\tau) \to \dot X'^\mu(\tau') = \dot X^\mu(\tau) + \Delta \dot X^\mu(\tau) . \tag{3.8} $$
Note that Δτ is a function of τ. Moreover, the variation of the path we are considering is a global
variation, such that
$$ \Delta X^\mu(\tau) = X'^\mu(\tau') - X^\mu(\tau) , \tag{3.9} $$
which includes both a variation due to the change of parametrization and a variation in form of the function
$X^\mu(\tau)$ at a given τ. Finally, since we are speaking about global variations, in the last equation
we have to consider that $\Delta \dot X^\mu(\tau) \neq \frac{d}{d\tau} \Delta X^\mu(\tau)$.
Let us, moreover, consider a Lagrangian that does not change in form under this reparametrization.
In principle, we could have
$$ L'(X'^\mu, \dot X'^\mu, \tau') - L(X'^\mu, \dot X'^\mu, \tau') = \delta L . \tag{3.10} $$
In order not to affect the equations of motion, δL should be the total derivative of a function that
vanishes at the end points. However, let us consider δL = 0.
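To first order in the variation we have
$$ L(X'^\mu, \dot X'^\mu, \tau') \simeq L(X^\mu, \dot X^\mu, \tau) + \frac{\partial L}{\partial \tau} \Delta\tau + \frac{\partial L}{\partial X^\mu} \Delta X^\mu + \frac{\partial L}{\partial \dot X^\mu} \Delta \dot X^\mu . \tag{3.11} $$
Since the total derivative of L with respect to τ is
$$ \frac{dL}{d\tau} = \frac{\partial L}{\partial \tau} + \frac{\partial L}{\partial X^\mu} \dot X^\mu + \frac{\partial L}{\partial \dot X^\mu} \ddot X^\mu , \tag{3.12} $$
we can extract $\frac{\partial L}{\partial \tau}$ from the previous equation and substitute it in Eq. (3.11), getting
$$ L(X'^\mu, \dot X'^\mu, \tau') \simeq L(X^\mu, \dot X^\mu, \tau) + \left[ \frac{dL}{d\tau} \Delta\tau + \frac{\partial L}{\partial X^\mu} (\Delta X^\mu - \dot X^\mu \Delta\tau) + \frac{\partial L}{\partial \dot X^\mu} (\Delta \dot X^\mu - \ddot X^\mu \Delta\tau) \right] , \tag{3.13} $$
which is written in terms of a total proper-time derivative (we will use this in a while, in order to do
the integration).
Consider also that
$$ d\tau' = d\tau + \frac{d\Delta\tau}{d\tau}\, d\tau = \left( 1 + \frac{d\Delta\tau}{d\tau} \right) d\tau . \tag{3.14} $$

² In particular, if under a Lorentz transformation the Lagrangian becomes $L'(X'^\mu, \dot X'^\mu, \tau)$ (τ does not change under
Lorentz transformations) we must have
$$ L'(X'^\mu, \dot X'^\mu, \tau) = L(X'^\mu, \dot X'^\mu, \tau) . \tag{3.5} $$
This means that the form of the function should be the same (when $X^\mu$ becomes $X'^\mu$, etc.). Moreover, in practice we
will never consider Lagrangians that depend explicitly on time.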
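At first order we can therefore write:
$$ \delta S = \int_{\tau_1'}^{\tau_2'} L(X'^\mu, \dot X'^\mu, \tau')\, d\tau' - \int_{\tau_1}^{\tau_2} L(X^\mu, \dot X^\mu, \tau)\, d\tau \tag{3.15} $$
$$ \simeq \int_{\tau_1}^{\tau_2} d\tau \left( 1 + \frac{d\Delta\tau}{d\tau} \right) \left\{ L + \left[ \frac{dL}{d\tau} \Delta\tau + \frac{\partial L}{\partial X^\mu} (\Delta X^\mu - \dot X^\mu \Delta\tau) + \frac{\partial L}{\partial \dot X^\mu} (\Delta \dot X^\mu - \ddot X^\mu \Delta\tau) \right] \right\} - \int_{\tau_1}^{\tau_2} d\tau\, L \tag{3.16} $$
$$ \simeq \int_{\tau_1}^{\tau_2} d\tau \left[ \frac{d}{d\tau} (L \Delta\tau) + \frac{\partial L}{\partial X^\mu} (\Delta X^\mu - \dot X^\mu \Delta\tau) + \frac{\partial L}{\partial \dot X^\mu} (\Delta \dot X^\mu - \ddot X^\mu \Delta\tau) \right] , \tag{3.17} $$
where we discarded terms of second and higher order in the variations.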


Let us now specify better the variation $\Delta X^\mu$, etc.
We can write
$$ \Delta X^\mu = X'^\mu(\tau') - X^\mu(\tau) \tag{3.18} $$
$$ = X'^\mu(\tau') - X'^\mu(\tau) + X'^\mu(\tau) - X^\mu(\tau) \tag{3.19} $$
$$ \simeq \dot X'^\mu(\tau)\, \Delta\tau + X'^\mu(\tau) - X^\mu(\tau) \tag{3.20} $$
$$ \simeq \dot X^\mu(\tau)\, \Delta\tau + \delta X^\mu , \tag{3.21} $$
where we neglected terms of higher order in the variations and we introduced a variation in form of the
function $X^\mu(\tau)$, which consists of a variation at a fixed parameter τ. Therefore the total variation of
$X^\mu(\tau)$ is represented as the sum of a variation that depends on the fact that τ varies (and therefore
the derivative is involved) and a variation in form of the function, at fixed parameter τ. Note
that, then,
$$ \delta X^\mu = (\Delta X^\mu - \dot X^\mu \Delta\tau) . \tag{3.22} $$
Similarly, we have
$$ \Delta \dot X^\mu = \dot X'^\mu(\tau') - \dot X^\mu(\tau) \tag{3.23} $$
$$ = \dot X'^\mu(\tau') - \dot X'^\mu(\tau) + \dot X'^\mu(\tau) - \dot X^\mu(\tau) \tag{3.24} $$
$$ \simeq \ddot X'^\mu(\tau)\, \Delta\tau + \dot X'^\mu(\tau) - \dot X^\mu(\tau) \tag{3.25} $$
$$ \simeq \ddot X^\mu(\tau)\, \Delta\tau + \delta \dot X^\mu \tag{3.26} $$
$$ = \ddot X^\mu(\tau)\, \Delta\tau + \frac{d}{d\tau} \delta X^\mu , \tag{3.27} $$
where we used the fact that
$$ \delta \dot X^\mu = \frac{d}{d\tau} \delta X^\mu , \tag{3.28} $$
since the variation δ is taken at equal τ.
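Substituting in Eq. (3.17) and integrating by parts we find
$$ \delta S \simeq \int_{\tau_1}^{\tau_2} d\tau \left[ \frac{d}{d\tau} (L \Delta\tau) + \frac{\partial L}{\partial X^\mu} \delta X^\mu + \frac{\partial L}{\partial \dot X^\mu} \frac{d}{d\tau} \delta X^\mu \right] \tag{3.29} $$
$$ = \int_{\tau_1}^{\tau_2} d\tau \left[ \frac{d}{d\tau} (L \Delta\tau) + \frac{d}{d\tau} \left( \frac{\partial L}{\partial \dot X^\mu} \delta X^\mu \right) + \left( \frac{\partial L}{\partial X^\mu} - \frac{d}{d\tau} \frac{\partial L}{\partial \dot X^\mu} \right) \delta X^\mu \right] \tag{3.30} $$
$$ = \int_{\tau_1}^{\tau_2} d\tau \left[ \frac{d}{d\tau} \left( L \Delta\tau + \frac{\partial L}{\partial \dot X^\mu} \delta X^\mu \right) + \left( \frac{\partial L}{\partial X^\mu} - \frac{d}{d\tau} \frac{\partial L}{\partial \dot X^\mu} \right) \delta X^\mu \right] \tag{3.31} $$
and, using again $\delta X^\mu = \Delta X^\mu - \dot X^\mu \Delta\tau$ in the boundary term,
$$ = \int_{\tau_1}^{\tau_2} d\tau \left( \frac{\partial L}{\partial X^\mu} - \frac{d}{d\tau} \frac{\partial L}{\partial \dot X^\mu} \right) \delta X^\mu + \left[ \left( L - \frac{\partial L}{\partial \dot X^\mu} \dot X^\mu \right) \Delta\tau \right]_{\tau_1}^{\tau_2} + \left[ \frac{\partial L}{\partial \dot X^\mu} \Delta X^\mu \right]_{\tau_1}^{\tau_2} . \tag{3.32} $$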
Let us now impose

δS = 0 . (3.33)
If we consider just functional variations (i.e. Δτ = 0 and $\Delta X^\mu = \delta X^\mu$)
that vanish at $\tau_1$ and $\tau_2$, $\delta X^\mu(\tau_1) = \delta X^\mu(\tau_2) = 0$, we find Hamilton’s principle and therefore the
Euler-Lagrange equations of motion:
$$ \frac{\partial L}{\partial X^\mu} - \frac{d}{d\tau} \frac{\partial L}{\partial \dot X^\mu} = 0 . \tag{3.34} $$
Let us suppose now that the path is such that the Euler-Lagrange equations are satisfied. If we require
that the theory is invariant under a slightly more general variation, a reparametrization of the
curve with
$$ \Delta X^\mu(\tau) = 0 , \tag{3.35} $$
the third term in Eq. (3.32) vanishes and the second gives rise to the following equation
$$ L = \frac{\partial L}{\partial \dot X^\mu} \dot X^\mu , \tag{3.36} $$
which means³ that the Lagrangian must be a homogeneous function of degree one in $\dot X^\mu$.
In the case of the free particle we can choose⁴
$$ L = \alpha \sqrt{\dot X^\mu \dot X_\mu} , \tag{3.39} $$
where α is a constant that can be fixed by imposing the correct behaviour for velocities small with respect
to the speed of light.
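As a quick check, the free-particle Lagrangian (3.39) indeed satisfies the homogeneity condition (3.36). The following short computation uses only (3.39) and the symmetry of the metric:

```latex
\begin{align*}
\frac{\partial L}{\partial \dot X^\mu}
  &= \alpha\, \frac{\partial}{\partial \dot X^\mu}
     \sqrt{\eta_{\rho\sigma} \dot X^\rho \dot X^\sigma}
   = \alpha\, \frac{\dot X_\mu}{\sqrt{\dot X^\nu \dot X_\nu}} , \\
\frac{\partial L}{\partial \dot X^\mu}\, \dot X^\mu
  &= \alpha\, \frac{\dot X_\mu \dot X^\mu}{\sqrt{\dot X^\nu \dot X_\nu}}
   = \alpha \sqrt{\dot X^\nu \dot X_\nu}
   = L .
\end{align*}
```

Equivalently, rescaling $\dot X^\mu \to \lambda \dot X^\mu$ with λ > 0 rescales L by λ, which is the statement that L is homogeneous of degree one.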
The action then reads
$$ S = \alpha \int_{\tau_1}^{\tau_2} \sqrt{\dot X^\mu \dot X_\mu}\; d\tau = \alpha \int_{\tau_1}^{\tau_2} \sqrt{dX^\mu dX_\mu} = \alpha \int_{\tau_1}^{\tau_2} ds = \alpha c \int_{\tau_1}^{\tau_2} d\tau = \alpha c \int_{t_1}^{t_2} \sqrt{1 - \beta^2}\; dt , \tag{3.40} $$
where we moved back to a non-manifestly-invariant form. In the case β ≪ 1 we have
$$ L \simeq \alpha c \left( 1 - \frac{v^2}{2c^2} + \dots \right) \tag{3.41} $$
and we must impose α = −mc, in such a way that
$$ L \simeq -mc^2 + \frac{mv^2}{2} + \dots \tag{3.42} $$
which reproduces the correct Newtonian kinetic energy (up to a constant).
Finally
$$ L = -mc^2 \sqrt{1 - \frac{v^2}{c^2}} . \tag{3.43} $$

³ Eq. (3.36) has also another meaning: if we write the four-momentum of the particle as
$$ P_\mu = \frac{\partial L}{\partial \dot X^\mu} , \tag{3.37} $$
and we suppose that $P_\mu$ is the canonical momentum conjugate to $X^\mu$, the canonical Hamiltonian is given by
$$ H = \frac{\partial L}{\partial \dot X^\mu} \dot X^\mu - L = 0 . \tag{3.38} $$
Therefore we find that the Hamiltonian is identically zero. This is due to the fact that the components of $P_\mu$ are not all
independent: they are constrained by the mass-shell relation, $P^\mu P_\mu = m^2$. Because of this relation,
the transformation from $(X^\mu, \dot X^\mu)$ to $(X^\mu, P^\mu)$ is indeed non-canonical.

⁴ This amounts to taking as action the integral of ds.
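The low-velocity expansion leading to (3.42) can be checked numerically. The sketch below assumes units in which c = 1 by default, to keep the floating-point arithmetic well conditioned; the function names are illustrative. It compares the Lagrangian L = −mc²√(1 − v²/c²) with the Newtonian form −mc² + mv²/2 and shows that the mismatch is small compared with the kinetic term when v ≪ c.

```python
import math


def relativistic_lagrangian(m, v, c=1.0):
    """L = -m c^2 sqrt(1 - v^2/c^2) for a free particle of mass m and speed v."""
    return -m * c ** 2 * math.sqrt(1.0 - (v / c) ** 2)


def newtonian_limit(m, v, c=1.0):
    """First two terms of the small-velocity expansion: -m c^2 + m v^2 / 2."""
    return -m * c ** 2 + 0.5 * m * v ** 2


def relative_mismatch(m, v, c=1.0):
    """Difference between the two expressions, in units of the kinetic term m v^2 / 2."""
    difference = relativistic_lagrangian(m, v, c) - newtonian_limit(m, v, c)
    return abs(difference) / (0.5 * m * v ** 2)


if __name__ == "__main__":
    for v in (0.5, 0.1, 0.01):
        print(f"v/c = {v}: relative mismatch = {relative_mismatch(1.0, v):.3e}")
```

The mismatch scales like (v/c)²/4, as expected from the next term of the expansion of the square root.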
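3.1.2 Euler-Lagrange Equations
Since for
$$ L = -mc \sqrt{\dot X^\mu \dot X_\mu} \tag{3.44} $$
the variable $X^\mu$ is cyclic, $\partial L/\partial X^\mu = 0$, the equations of motion are
$$ \frac{d}{d\tau} \frac{\partial L}{\partial \dot X^\mu} = 0 , \tag{3.45} $$
or
$$ m \dot X_\mu = P_\mu = \text{const} . \tag{3.46} $$
Remembering the components of the four-momentum, we have
$$ P_\mu = \left( \frac{mc}{\sqrt{1 - \beta^2}} ,\; -\frac{m\mathbf{v}}{\sqrt{1 - \beta^2}} \right) = \left( \frac{E}{c} ,\; -\mathbf{p} \right) = \text{const} . \tag{3.47} $$
Finally:
$$ \begin{cases} \dfrac{E}{c} = \text{const} \\[4pt] \mathbf{p} = \text{const} \end{cases} \tag{3.48} $$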
3.1.3 Conservation Laws


Let us now consider the invariance of the Lagrangian under Poincaré transformations
$$ X^\mu \to X'^\mu = \Lambda^\mu{}_\nu X^\nu + a^\mu . \tag{3.49} $$
Note that the transformation leaves the proper time $\tau$ unchanged. As a consequence of this invariance, we will obtain some conservation laws.
If we consider an infinitesimal transformation, we have
$$ X^\mu \to X'^\mu = X^\mu + \delta X^\mu \tag{3.50} $$
and since
$$ \Lambda^\mu{}_\nu \simeq \delta^\mu_\nu + \epsilon^\mu{}_\nu , \tag{3.51} $$
we will have
$$ \delta X^\mu = X'^\mu - X^\mu \simeq \epsilon^\mu{}_\nu X^\nu + \delta a^\mu . \tag{3.52} $$
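The Lagrangian will change accordingly:
$$ \begin{aligned}
L(X'^\mu, \dot{X}'^\mu, \tau) &\simeq L(X^\mu, \dot{X}^\mu, \tau) + \frac{\partial L}{\partial X^\mu}\,\delta X^\mu + \frac{\partial L}{\partial \dot{X}^\mu}\,\delta \dot{X}^\mu , && (3.53) \\
&= L(X^\mu, \dot{X}^\mu, \tau) + \frac{\partial L}{\partial X^\mu}\,\delta X^\mu + \frac{\partial L}{\partial \dot{X}^\mu}\,\frac{d}{d\tau}\delta X^\mu , && (3.54) \\
&= L(X^\mu, \dot{X}^\mu, \tau) + \frac{\partial L}{\partial X^\mu}\,\delta X^\mu + \frac{d}{d\tau}\left( \frac{\partial L}{\partial \dot{X}^\mu}\,\delta X^\mu \right) - \delta X^\mu \frac{d}{d\tau}\frac{\partial L}{\partial \dot{X}^\mu} , && (3.55) \\
&= L(X^\mu, \dot{X}^\mu, \tau) + \left[ \frac{\partial L}{\partial X^\mu} - \frac{d}{d\tau}\frac{\partial L}{\partial \dot{X}^\mu} \right] \delta X^\mu + \frac{d}{d\tau}\left( \frac{\partial L}{\partial \dot{X}^\mu}\,\delta X^\mu \right) , && (3.56)
\end{aligned} $$
where we used the fact that the variation $\delta X^\mu$ is a local variation ($\delta\tau = 0$) and therefore
$$ \delta \dot{X}^\mu = \frac{d}{d\tau}\,\delta X^\mu . \tag{3.57} $$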
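If we now impose $L(X'^\mu, \dot{X}'^\mu, \tau) = L(X^\mu, \dot{X}^\mu, \tau)$, we have
$$ \left[ \frac{\partial L}{\partial X^\mu} - \frac{d}{d\tau}\frac{\partial L}{\partial \dot{X}^\mu} \right] \delta X^\mu + \frac{d}{d\tau}\left( \frac{\partial L}{\partial \dot{X}^\mu}\,\delta X^\mu \right) = 0 \tag{3.58} $$
and, on the solutions of the equations of motion, finally
$$ \frac{d}{d\tau}\left( \frac{\partial L}{\partial \dot{X}^\mu}\,\delta X^\mu \right) = 0 , \tag{3.59} $$
or
$$ \frac{\partial L}{\partial \dot{X}^\mu}\,\delta X^\mu = \text{const} . \tag{3.60} $$

Substituting Eq. (3.52) and remembering that
$$ \frac{\partial L}{\partial \dot{X}^\mu} = P_\mu , \tag{3.61} $$
we get the following conservation laws
$$ P_\mu\, \epsilon^\mu{}_\nu X^\nu = \epsilon^{\mu\nu} P_\mu X_\nu = \text{const} , \tag{3.62} $$
$$ P_\mu\, \delta a^\mu = \text{const} . \tag{3.63} $$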

Since $\epsilon^{\mu\nu}$ is antisymmetric under the exchange of its two indices, only the antisymmetric part of the tensor $P_\mu X_\nu$ survives:
$$ \epsilon^{\mu\nu} P_\mu X_\nu = \epsilon^{\mu\nu} \left[ \frac{1}{2}\left( P_\mu X_\nu + P_\nu X_\mu \right) + \frac{1}{2}\left( P_\mu X_\nu - P_\nu X_\mu \right) \right] = \frac{1}{2}\,\epsilon^{\mu\nu} \left( P_\mu X_\nu - P_\nu X_\mu \right) . \tag{3.64} $$
Finally, since $\epsilon^{\mu\nu}$ and $\delta a^\mu$ are arbitrary constants, we have
$$ M_{\mu\nu} = P_\mu X_\nu - P_\nu X_\mu = \text{const} , \tag{3.65} $$
$$ P_\mu = \text{const} , \tag{3.66} $$
i.e. the conservation of the generalized angular momentum and of the momentum. Note that $M_{\mu\nu}$ is an antisymmetric tensor and therefore it has 6 independent components, while $P_\mu$ has 4. In total, the invariance under Poincaré transformations (which depend on 10 parameters) gives 10 conserved quantities.
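The conservation of $M_{\mu\nu}$ can be checked numerically on a free-particle worldline $X^\mu(\tau) = X_0^\mu + U^\mu \tau$ with $P^\mu = m U^\mu$. The sketch below is only an illustration: the metric signature $(+,-,-,-)$, units with $c = 1$, and the numerical values of the mass, four-velocity and initial position are assumptions.

```python
METRIC = (1.0, -1.0, -1.0, -1.0)  # assumed signature (+,-,-,-), c = 1

def lower(vec):
    """Lower the index of a four-vector with the diagonal metric."""
    return [g * x for g, x in zip(METRIC, vec)]

def generalized_angular_momentum(p_low, x_low):
    """M_{mu nu} = P_mu X_nu - P_nu X_mu, Eq. (3.65)."""
    return [[p_low[mu] * x_low[nu] - p_low[nu] * x_low[mu] for nu in range(4)]
            for mu in range(4)]

MASS = 1.5
U = [1.25, 0.75, 0.0, 0.0]   # four-velocity with U.U = 1.5625 - 0.5625 = 1
X0 = [0.0, 1.0, 2.0, -1.0]   # arbitrary initial event

def position(tau):
    """Free-particle worldline X(tau) = X0 + U tau."""
    return [a + b * tau for a, b in zip(X0, U)]

P_LOW = lower([MASS * comp for comp in U])
M_START = generalized_angular_momentum(P_LOW, lower(position(0.0)))
M_LATER = generalized_angular_momentum(P_LOW, lower(position(3.7)))
```

The two matrices agree up to rounding because the $\tau$-dependent part of $X_\nu$ is proportional to $P_\nu$ and drops out of the antisymmetric combination.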

3.2 Lagrangian formalism for the vibrating string


Let us now consider the case of the vibrating string and study the lagrangian approach in the continuum case.
We have
$$ L = T - V , \tag{3.67} $$
where
$$ T = \frac{1}{2} \sum_{n=1}^{N} p_n^2 \;\to\; \frac{1}{2} \int_0^L \dot{\phi}^2(x,t)\, dx , \tag{3.68} $$
$$ V = \frac{1}{2}\,\omega^2 \sum_{n=1}^{N} (q_n - q_{n+1})^2 \;\to\; \frac{v^2}{2} \int_0^L \phi'^2(x,t)\, dx . \tag{3.69} $$
In total:
$$ L = \int_0^L dx\, \frac{1}{2} \left[ \dot{\phi}^2(x,t) - v^2 \phi'^2(x,t) \right] . \tag{3.70} $$
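We see that we can write the lagrangian as a space integral of a “lagrangian density”
$$ \mathcal{L} = \frac{1}{2} \left[ \dot{\phi}^2(x,t) - v^2 \phi'^2(x,t) \right] , \tag{3.71} $$
$$ L = \int_0^L dx\, \mathcal{L} . \tag{3.72} $$

The same can be found for the hamiltonian:
$$ \mathcal{H} = \frac{1}{2} \left[ \dot{\phi}^2(x,t) + v^2 \phi'^2(x,t) \right] , \tag{3.73} $$
$$ H = \int_0^L dx\, \mathcal{H} . \tag{3.74} $$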

We now want to derive the equations of motion of the vibrating string from Hamilton’s principle, as in the case of the point-like particle. We will consider the following case:

1. Our system is described by a Lagrangian density, $\mathcal{L}$, a local function of the fields;

2. $\mathcal{L}$ depends upon the fields and their first derivatives (and at most on the space point and time):
$$ \mathcal{L} = \mathcal{L}\left( \phi, \dot{\phi}, \phi', x, t \right) . \tag{3.75} $$

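We can define the action as
$$ S = \int_{t_1}^{t_2} L\, dt = \int_{t_1}^{t_2} \int_0^L \mathcal{L}\, dx\, dt \tag{3.76} $$
and we require that the equations of motion follow from imposing
$$ \delta S = 0 . \tag{3.77} $$

The variation of the fields is required to vanish on the boundary of the integration domain:
$$ \delta\phi(x, t_1) = \delta\phi(x, t_2) = 0 , \quad \forall x \in [0, L] , \tag{3.78} $$
$$ \delta\phi(0, t) = \delta\phi(L, t) = 0 , \quad \forall t \in [t_1, t_2] . \tag{3.79} $$

Moreover, note that $\delta\phi$ is a variation in form of the field, at given $x$ and $t$. This means that
$$ \delta\dot{\phi} = \frac{\partial}{\partial t}\,\delta\phi , \quad \text{and} \quad \delta\phi' = \frac{\partial}{\partial x}\,\delta\phi . \tag{3.80} $$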
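We have
$$ \begin{aligned}
0 = \delta S &= \int dt\, dx\, \delta\mathcal{L} = \int dt\, dx \left[ \frac{\partial \mathcal{L}}{\partial \phi}\,\delta\phi + \frac{\partial \mathcal{L}}{\partial \dot{\phi}}\,\delta\dot{\phi} + \frac{\partial \mathcal{L}}{\partial \phi'}\,\delta\phi' \right] , && (3.81) \\
&= \int dt\, dx \left[ \frac{\partial \mathcal{L}}{\partial \phi}\,\delta\phi + \frac{\partial}{\partial t}\left( \frac{\partial \mathcal{L}}{\partial \dot{\phi}}\,\delta\phi \right) - \frac{\partial}{\partial t}\left( \frac{\partial \mathcal{L}}{\partial \dot{\phi}} \right) \delta\phi + \frac{\partial}{\partial x}\left( \frac{\partial \mathcal{L}}{\partial \phi'}\,\delta\phi \right) - \frac{\partial}{\partial x}\left( \frac{\partial \mathcal{L}}{\partial \phi'} \right) \delta\phi \right] , && (3.82) \\
&= \int dx \left[ \frac{\partial \mathcal{L}}{\partial \dot{\phi}}\,\delta\phi \right]_{t_1}^{t_2} + \int dt \left[ \frac{\partial \mathcal{L}}{\partial \phi'}\,\delta\phi \right]_{0}^{L} + \int dt\, dx \left[ \frac{\partial \mathcal{L}}{\partial \phi} - \frac{\partial}{\partial t}\frac{\partial \mathcal{L}}{\partial \dot{\phi}} - \frac{\partial}{\partial x}\frac{\partial \mathcal{L}}{\partial \phi'} \right] \delta\phi . && (3.83)
\end{aligned} $$

Using (3.78) and (3.79), and since $\delta\phi$ is arbitrary, we obtain the Euler-Lagrange equation of motion:
$$ \frac{\partial \mathcal{L}}{\partial \phi} - \frac{\partial}{\partial t}\frac{\partial \mathcal{L}}{\partial \dot{\phi}} - \frac{\partial}{\partial x}\frac{\partial \mathcal{L}}{\partial \phi'} = 0 . \tag{3.84} $$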
If we consider the lagrangian density of the vibrating string, Eq. (3.71), we find the wave equation
$$ \frac{1}{v^2} \frac{\partial^2 \phi}{\partial t^2} - \frac{\partial^2 \phi}{\partial x^2} = 0 . \tag{3.85} $$
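A travelling profile $\phi(x,t) = \sin\left(k(x - vt)\right)$ should satisfy the wave equation. The sketch below checks this with centred finite differences; the values of the speed `V`, the wave number `K`, the step `h` and the sample point are illustrative assumptions.

```python
import math

V = 2.0   # wave speed (illustrative value)
K = 1.3   # wave number (illustrative value)

def phi(x, t):
    """Travelling wave phi(x, t) = sin(K (x - V t))."""
    return math.sin(K * (x - V * t))

def second_derivative(f, s, h=1e-3):
    """Centred finite-difference estimate of f''(s)."""
    return (f(s + h) - 2.0 * f(s) + f(s - h)) / h**2

def wave_residual(x, t):
    """Left-hand side of the wave equation: (1/V^2) d^2phi/dt^2 - d^2phi/dx^2."""
    d2t = second_derivative(lambda s: phi(x, s), t)
    d2x = second_derivative(lambda s: phi(s, t), x)
    return d2t / V**2 - d2x

print(wave_residual(0.4, 0.7))  # small: only the discretization error remains
```

The residual is not exactly zero because the derivatives are approximated, but it shrinks as the step `h` is reduced (until rounding errors take over).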
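Knowing the lagrangian density, we can perform a Legendre transformation to get the hamiltonian density. We define the momentum conjugate to the field $\phi$,
$$ \pi(x,t) = \frac{\partial \mathcal{L}}{\partial \dot{\phi}} = \dot{\phi} , \tag{3.86} $$
and then
$$ \mathcal{H} = \pi \dot{\phi} - \mathcal{L} = \frac{1}{2} \left[ \pi^2 + v^2 \phi'^2 \right] . \tag{3.87} $$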

3.3 Lagrangian formalism: relativistic fields


We will now consider the field as a function of the space-time point $X^\mu$, $\phi(X)$, and we will let our system be described by a lagrangian that is a local function of the field (or fields, if we have more than one), of its derivatives and, at most, of the space-time point (as we will see, we cannot have an explicit dependence on $X^\mu$ for theories that have to be Poincaré invariant). The requirement of locality for the lagrangian is connected to the necessity that physical quantities be observable (causality principle).
Our goal is to include special relativity in our quantum description of microscopic phenomena. Therefore, we will require that the action, $S = \int dt\, L$, is invariant under Poincaré transformations (i.e. the action is a scalar). In fact, Physics must be independent of the inertial frame in which we describe it.
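If the lagrangian is $L = L(\phi_i(X), \partial_\mu \phi_i(X), \dots, \partial_\mu^{(n)} \phi_i(X), X^\mu)$, we define the lagrangian density $\mathcal{L}$ such that:
$$ L = \int_V \mathcal{L}(\phi_i(X), \partial_\mu \phi_i(X), \dots, \partial_\mu^{(n)} \phi_i(X), X^\mu)\, d^3X . \tag{3.88} $$
The action, $S(V)$, will be given by the following expression:
$$ S(V) = \int_{t_1}^{t_2} L\, dt = \int_{t_1}^{t_2} \int_V \mathcal{L}(\phi_i(X), \partial_\mu \phi_i(X), \dots, \partial_\mu^{(n)} \phi_i(X), X^\mu)\, d^4X . \tag{3.89} $$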

We impose that $S$ is invariant under proper Poincaré transformations (discontinuous transformations, such as parity, have to be studied separately). Since the volume element $d^4X$ is itself invariant,
$$ d^4X' = |\det\Lambda|\, d^4X = d^4X , \tag{3.90} $$
which follows from the fact that a proper Lorentz transformation has determinant $+1$, we have to impose that the lagrangian density is invariant under proper Poincaré transformations. This means, for instance, that $\mathcal{L}$ cannot depend explicitly on the space-time point.
Apart from locality and Poincaré invariance, we can constrain the lagrangian density with additional requirements: i) the action (and hence the lagrangian) should be a real functional, to avoid problems in the probabilistic interpretation of the theory; ii) in order to have equations of motion that are at most second-order differential equations, the lagrangian density can depend on derivatives of the fields only up to first order; iii) we can require that the lagrangian is invariant under other transformations, for instance internal symmetries, gauge transformations and so on.
In general, therefore, we will have to deal with lagrangian densities of the following kind:
$$ \mathcal{L} = \mathcal{L}(\phi_i(X), \partial_\mu \phi_i(X)) , \tag{3.91} $$
where the label $i$ of the fields $\phi_i$ can be a collective index or a Lorentz index, depending on what we are considering.
Once the lagrangian density is defined, we can define the momenta conjugate to the fields
$$ \pi_i(X) = \frac{\partial \mathcal{L}}{\partial \dot{\phi}_i} \tag{3.92} $$
and then the hamiltonian density, via a Legendre transformation,
$$ \mathcal{H} = \sum_i \pi_i \dot{\phi}_i - \mathcal{L} , \tag{3.93} $$
which coincides with the energy density of the system.

3.4 Hamilton’s principle and the equations of motion


Once the lagrangian density (and the action) is specified, we can find the equations of motion for the fields $\phi_i$ from Hamilton’s principle. We then require that
$$ \delta S = 0 , \tag{3.94} $$
i.e. that the action is stationary under the variations $\delta\phi_i(X)$, which have to be the analogue of the fixed-endpoint variations of the analytical mechanics of the massive point particle.
In our case we have to deal with an integration over the space volume $V$ and one over time, between $t_1$ and $t_2$ (which can also be $\pm\infty$). Then, if $\Sigma$ is the surface that bounds the integration volume, $\delta\phi_i(X)$ should be such that:
$$ \delta\phi_i(\mathbf{x}, t) = 0 \quad \text{if } \mathbf{x} \in \Sigma , \tag{3.95} $$
$$ \delta\phi_i(\mathbf{x}, t_1) = \delta\phi_i(\mathbf{x}, t_2) = 0 \quad \forall\, \mathbf{x} \in V . \tag{3.96} $$

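We will have, then,
$$ \begin{aligned}
0 = \delta S &= \int d^4X \left[ \frac{\partial \mathcal{L}}{\partial \phi_i}\,\delta\phi_i + \frac{\partial \mathcal{L}}{\partial \phi_{i,\mu}}\,\delta\phi_{i,\mu} \right] && (3.97) \\
&= \int d^4X \left[ \frac{\partial \mathcal{L}}{\partial \phi_i} - \partial_\mu \left( \frac{\partial \mathcal{L}}{\partial \phi_{i,\mu}} \right) \right] \delta\phi_i + \int d^4X\, \partial_\mu \left( \frac{\partial \mathcal{L}}{\partial \phi_{i,\mu}}\,\delta\phi_i \right) && (3.98) \\
&= \int d^4X \left[ \frac{\partial \mathcal{L}}{\partial \phi_i} - \partial_\mu \left( \frac{\partial \mathcal{L}}{\partial \phi_{i,\mu}} \right) \right] \delta\phi_i , && (3.99)
\end{aligned} $$
where, in order to move from (3.97) to (3.98), we integrated by parts and, from (3.98) to (3.99), we used the vanishing of the field variations on the boundary of the integration domain. Since $\delta\phi_i$ is arbitrary, Eq. (3.99) gives rise to the Euler-Lagrange equations for the fields:
$$ \frac{\partial \mathcal{L}}{\partial \phi_i} - \partial_\mu \left( \frac{\partial \mathcal{L}}{\partial \phi_{i,\mu}} \right) = 0 . \tag{3.100} $$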

Note that $\mathcal{L}$ is determined only up to a total derivative. In fact, if $\mathcal{L}$ gives rise to the equations of motion (3.100), then $\mathcal{L}' = \mathcal{L} + \partial_\mu \Lambda^\mu(X)$ gives the same equations, provided that $\Lambda^\mu(X)$ vanishes on the boundary of the integration domain.
3.5 Global symmetries and Noether’s theorem
We have seen how, in the lagrangian formalism, the equations of motion are derived from Hamilton’s variational principle. We will therefore assume that our physical system is described by a lagrangian density that is a local function of the fields and, at most, of their first derivatives. We add the hypothesis that $\mathcal{L}$ also depends explicitly on the space-time point $X^\mu$, even though in practice we will deal with lagrangians that are independent of $X^\mu$. This is needed in order to formulate Noether’s theorem in full generality.
Suppose we perform a generic transformation on the system. Mathematically, this translates into a transformation of the action, $S(V)$, that involves $X^\mu$, $\phi_i(X)$ and $\mathcal{L}$.
Of particular interest are the transformations that leave the “physics” of the problem unchanged, i.e. that yield the same transition amplitudes and therefore, ultimately, the same equations of motion. Transformations of this kind are called symmetries of the system and generally have a group structure.
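If we write a generic transformation as follows:
$$ \begin{aligned}
X^\mu &\longrightarrow X'^\mu = X^\mu + \delta X^\mu \\
\phi_i(X) &\longrightarrow \tilde{\phi}_i(X') = \phi_i(X) + \Delta\phi_i(X) \\
\mathcal{L} &\longrightarrow \tilde{\mathcal{L}}\left( \tilde{\phi}_i(X'), \tilde{\phi}_{i,\mu}(X'), X' \right) = \mathcal{L}\left( \phi_i(X), \phi_{i,\mu}(X), X \right) + \Delta\mathcal{L}\left( \phi_i(X), \phi_{i,\mu}(X), X \right)
\end{aligned} \tag{3.101} $$
we will correspondingly have:
$$ S(V) \longrightarrow S'(V') = \int_{V'} d^4X'\, \tilde{\mathcal{L}}\left( \tilde{\phi}_i(X'), \tilde{\phi}_{i,\mu}(X'), X' \right) , \tag{3.102} $$
where $V$ is the four-dimensional integration volume.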

Eqs. (3.101) constitute a symmetry of the system if:
$$ S'(V') = S(V) . \tag{3.103} $$

The importance of Noether’s theorem lies in the fact that it associates to every continuous symmetry of the system a local conservation law, i.e. a conserved quantity, which in the quantum theory we can identify with an observable. The number of conserved quantities equals the number of independent parameters on which the transformation (3.101) depends. The study of the symmetries of the system therefore allows us to take a shortcut in the treatment of the problem and to identify a certain number of observables right away.
Note that the requirement (3.103) represents the most general possible symmetry: more restricted symmetries may well exist. For example, a certain transformation may leave the lagrangian or the lagrangian density invariant, and these in turn imply (3.103). We therefore consider the general case first and then restrict ourselves to some more specific cases.
Let us begin by clarifying a few points about (3.101).
The transformations we consider in this section are all infinitesimal. We restrict ourselves to them because we are considering continuous transformations, whose behaviour can be deduced from that in a neighbourhood of the identity.
These transformations can act on space-time, $X^\mu \to X'^\mu$, and thereby induce a corresponding variation of the field $\phi_i$, $\phi_i(X) \to \tilde{\phi}_i(X')$ (geometrical symmetries), but they can also act only on the functional form of the field $\phi_i$, independently of the point at which it is evaluated (internal symmetries). The variation of the field $\phi_i(X)$ therefore includes, in general, both possibilities. For example, a Lorentz transformation of space-time, i.e. the passage from one inertial frame to another in the study of a physical problem, induces a transformation of the fields that depends on their nature: for a scalar field we have $\tilde{\phi}(X') = \phi(X)$, while for a tensor or spinor field the transformation $X'^\mu = \Lambda^\mu{}_\nu X^\nu$ determines the transformation $\phi'(X') = S(\Lambda)\phi(X)$ in the respective representation of the group. Alternatively, without any space-time transformation, we may consider a symmetry under a redefinition of the fields $\phi_i$.
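We define the total variation of $\phi_i(X)$ and of $\mathcal{L}$ as follows:
$$ \begin{aligned}
\Delta\phi_i(X) &= \tilde{\phi}_i(X') - \phi_i(X) = \tilde{\phi}_i(X') - \tilde{\phi}_i(X) + \tilde{\phi}_i(X) - \phi_i(X) , && (3.104) \\
&\simeq \partial_\mu \tilde{\phi}_i(X)\, \delta X^\mu + \tilde{\phi}_i(X) - \phi_i(X) , && (3.105) \\
&\simeq \partial_\mu \phi_i(X)\, \delta X^\mu + \delta\phi_i(X) , && (3.106)
\end{aligned} $$
where we set $\delta\phi_i(X) = \tilde{\phi}_i(X) - \phi_i(X)$, the variation in form of $\phi_i$, and where we replaced $\tilde{\phi}_i$ with $\phi_i$ inside the derivative between (3.105) and (3.106), up to terms of order higher than the first. Moreover:
$$ \begin{aligned}
\Delta\mathcal{L} &= \tilde{\mathcal{L}}(\tilde{\phi}_i(X'), \dots) - \mathcal{L}(\phi_i(X), \dots) , && (3.107) \\
&= \delta\mathcal{L}(\phi_i(X), \dots) + \frac{\partial \mathcal{L}}{\partial \phi_i}\,\delta\phi_i + \frac{\partial \mathcal{L}}{\partial \phi_{i,\mu}}\,\delta\phi_{i,\mu} + \partial_\mu \mathcal{L}\, \delta X^\mu , && (3.108)
\end{aligned} $$
where $\delta\mathcal{L}$ is the variation in form of the lagrangian density, the remaining terms come from treating $\phi_i(X)$, $\phi_{i,\mu}(X)$ and $X^\mu$ as independent variables in $\mathcal{L}$, and $\partial_\mu \mathcal{L}$ is the total derivative^5 of $\mathcal{L}$ with respect to $X^\mu$ ($\partial_\mu \mathcal{L} = \frac{\partial \mathcal{L}}{\partial \phi_i} \frac{\partial \phi_i}{\partial X^\mu} + \dots$).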

For infinitesimal transformations, imposing (3.103) is equivalent to imposing:
$$ 0 = \delta S = \int_{V'} d^4X'\, \tilde{\mathcal{L}} - \int_V d^4X\, \mathcal{L} . \tag{3.112} $$

In order to proceed with the calculation, we must therefore bring the two integrals to the same integration domain. In transforming $\int_{V'} d^4X'$ into $\int_V d^4X$ we have to take into account the jacobian of the transformation
$$ X^\mu \to X'^\mu = X^\mu + \delta X^\mu , \tag{3.113} $$
namely:
$$ \det(J) = \det\left( \frac{\partial X'^\nu}{\partial X^\mu} \right) = \det\left( \delta^\nu_\mu + \partial_\mu \delta X^\nu \right) \simeq 1 + \partial_\mu \delta X^\mu , \tag{3.114} $$
where we used the relation $\det(1 + \epsilon) \simeq 1 + \mathrm{tr}(\epsilon)$.
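The first-order relation $\det(1 + \epsilon) \simeq 1 + \mathrm{tr}(\epsilon)$ can be illustrated numerically. In the sketch below the $4 \times 4$ matrix `EPS` and the scale factors are arbitrary illustrative choices; the determinant is computed by a plain Laplace expansion, which is adequate for such small matrices.

```python
def determinant(matrix):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    n = len(matrix)
    if n == 1:
        return matrix[0][0]
    total = 0.0
    for col in range(n):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * matrix[0][col] * determinant(minor)
    return total

def jacobian_check(eps, scale):
    """Return (det(1 + scale*eps), 1 + scale*tr(eps)) for a square matrix eps."""
    n = len(eps)
    jac = [[(1.0 if i == j else 0.0) + scale * eps[i][j] for j in range(n)]
           for i in range(n)]
    trace = scale * sum(eps[i][i] for i in range(n))
    return determinant(jac), 1.0 + trace

# Arbitrary illustrative matrix playing the role of d_mu (delta X^nu).
EPS = [[0.3, -1.0, 0.5, 0.2],
       [0.7, 0.1, -0.4, 0.9],
       [-0.6, 0.8, -0.2, 0.3],
       [0.4, -0.5, 0.6, 0.5]]

for scale in (1e-1, 1e-2, 1e-3):
    exact, first_order = jacobian_check(EPS, scale)
    print(scale, exact - first_order)  # difference is of second order in scale
```

The difference between the exact determinant and the first-order estimate decreases quadratically with the scale, which is why it can be neglected for infinitesimal transformations.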
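Substituting into Eq. (3.112) and expanding to first order, we obtain:
$$ \begin{aligned}
0 &= \int_V d^4X \left\{ \left( 1 + \partial_\mu \delta X^\mu \right) \tilde{\mathcal{L}} - \mathcal{L} \right\} , && (3.115) \\
&\simeq \int_V d^4X \left\{ \delta\mathcal{L} + \frac{\partial \mathcal{L}}{\partial \phi_i}\,\delta\phi_i + \frac{\partial \mathcal{L}}{\partial \phi_{i,\mu}}\,\delta\phi_{i,\mu} + \partial_\mu \mathcal{L}\, \delta X^\mu + \partial_\mu \delta X^\mu\, \tilde{\mathcal{L}} \right\} , && (3.116) \\
&\simeq \int_V d^4X \left\{ \delta\mathcal{L} + \frac{\partial \mathcal{L}}{\partial \phi_i}\,\delta\phi_i + \partial_\mu \left[ \frac{\partial \mathcal{L}}{\partial \phi_{i,\mu}}\,\delta\phi_i \right] - \partial_\mu \left[ \frac{\partial \mathcal{L}}{\partial \phi_{i,\mu}} \right] \delta\phi_i + \partial_\mu \mathcal{L}\, \delta X^\mu + \partial_\mu \delta X^\mu\, \mathcal{L} \right\} , && (3.117) \\
&= \int_V d^4X \left\{ \delta\mathcal{L} + \partial_\mu \left[ \frac{\partial \mathcal{L}}{\partial \phi_{i,\mu}}\,\delta\phi_i + \mathcal{L}\, \delta X^\mu \right] \right\} , && (3.118)
\end{aligned} $$
where, to go from (3.116) to (3.117), we integrated by parts and replaced $\tilde{\mathcal{L}}$ with $\mathcal{L}$, up to infinitesimals of order higher than the first, and, to go from (3.117) to (3.118), we used the equations of motion.
Since the integration volume is arbitrary, (3.118) gives the following equation:
$$ \partial_\mu \left[ \frac{\partial \mathcal{L}}{\partial \phi_{i,\mu}}\,\delta\phi_i + \mathcal{L}\, \delta X^\mu \right] = -\delta\mathcal{L} . \tag{3.119} $$

^5 We can rewrite the total difference as follows:
$$ \begin{aligned}
\Delta\mathcal{L} &= \tilde{\mathcal{L}}(\tilde{\phi}_i(X') \dots) - \mathcal{L}(\phi_i(X), \dots) , && (3.109) \\
&= \tilde{\mathcal{L}}(\tilde{\phi}_i(X') \dots) - \mathcal{L}(\tilde{\phi}_i(X') \dots) + \mathcal{L}(\tilde{\phi}_i(X') \dots) - \mathcal{L}(\phi_i(X') \dots) + \mathcal{L}(\phi_i(X') \dots) - \mathcal{L}(\phi_i(X), \dots) , && (3.110) \\
&\simeq \delta\mathcal{L} + \frac{\partial \mathcal{L}}{\partial \phi_i}\,\delta\phi_i + \frac{\partial \mathcal{L}}{\partial \phi_{i,\mu}}\,\delta\phi_{i,\mu} + \partial_\mu \mathcal{L}\, \delta X^\mu , && (3.111)
\end{aligned} $$
where we identified the functional variation of the lagrangian density, $\delta\mathcal{L} \simeq \tilde{\mathcal{L}}(\tilde{\phi}_i(X') \dots) - \mathcal{L}(\tilde{\phi}_i(X') \dots)$, its variation due to the variation in form of the fields, $\mathcal{L}(\tilde{\phi}_i(X') \dots) - \mathcal{L}(\phi_i(X') \dots) \simeq \frac{\partial \mathcal{L}}{\partial \phi_i}\,\delta\phi_i + \frac{\partial \mathcal{L}}{\partial \phi_{i,\mu}}\,\delta\phi_{i,\mu}$, and the total derivative with respect to $X^\mu$, $\mathcal{L}(\phi_i(X') \dots) - \mathcal{L}(\phi_i(X), \dots) \simeq \partial_\mu \mathcal{L}\, \delta X^\mu$.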

Let us consider the term $\delta\mathcal{L}$.

If the transformation is a symmetry, as we have imposed, the variation in form of the lagrangian density cannot be arbitrary. Indeed, since the equations of motion must remain unchanged, $\delta\mathcal{L}$ can at most be the four-divergence of some function $\delta\Omega^\mu$:
$$ \delta\mathcal{L} = \partial_\mu \delta\Omega^\mu , \tag{3.120} $$
with $\delta\Omega^\mu$ vanishing on the boundary of the integration domain.

Eq. (3.119) then simply becomes a continuity equation:
$$ \partial_\mu J^\mu = 0 , \tag{3.121} $$
where we defined the following four-current:
$$ J^\mu = \frac{\partial \mathcal{L}}{\partial \phi_{i,\mu}}\,\delta\phi_i + \mathcal{L}\, \delta X^\mu + \delta\Omega^\mu . \tag{3.122} $$

If the fields $\phi_i$ and the arbitrary function $\delta\Omega^\mu$ vanish at infinity, the conservation of the current $J^\mu$, expressed by Eq. (3.121), leads to the conservation of the charge:
$$ Q = \int_V d^3X\, J^0 . \tag{3.123} $$

Indeed, we have:
$$ \frac{dQ}{dt} = \partial_0 \int_V d^3X\, J^0 = -\int_{\partial V} d\Sigma\; \mathbf{J} \cdot \mathbf{n} = 0 , \tag{3.124} $$
which implies:
$$ Q = \text{const} . \tag{3.125} $$
Clearly, depending on the transformation (3.101), or rather on how many independent parameters it contains, we will have several conserved currents and therefore several conserved charges. Their number is exactly the number of independent parameters of the transformation.
Note, moreover, that if the “Noether” symmetries form a group, the algebra of this group induces the same algebra on the conserved charges. In other words, the charges are the generators of the group of transformations under consideration.
Let us now look at some examples.

3.5.1 Geometrical symmetries. Lorentz transformations

Let us consider the case in which $\delta\Omega^\mu = 0$, i.e. in which the lagrangian density is left invariant by the transformation, and perform an infinitesimal Lorentz transformation:
$$ X'^\mu = X^\mu + \epsilon^{\mu\nu} X_\nu , \tag{3.126} $$
where the second-rank tensor $\epsilon^{\mu\nu}$ is antisymmetric. Indeed, since $X^2$ is a Lorentz invariant, we have:
$$ X^2 = X'^2 \tag{3.127} $$
and since for the infinitesimal transformation $X' = X + \delta X$, squaring we find
$$ X'^2 = (X + \delta X)^2 \simeq X^2 + 2\, X \cdot \delta X , \tag{3.128} $$
which, by (3.127), gives:
$$ X \cdot \delta X = 0 . \tag{3.129} $$
But since $\delta X^\mu = \epsilon^{\mu\nu} X_\nu$, we finally have:
$$ X_\mu X_\nu\, \epsilon^{\mu\nu} = 0 , \tag{3.130} $$
which holds for every $X$ only if $\epsilon^{\mu\nu}$ is antisymmetric, $X_\mu X_\nu$ being symmetric.
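The role of antisymmetry can be illustrated numerically: an antisymmetric $\epsilon^{\mu\nu}$ changes $X^2$ only at second order, while a generic symmetric one changes it already at first order. In the sketch below the metric signature $(+,-,-,-)$, the matrices `ANTI` and `SYM`, the vector `X` and the value of `SCALE` are illustrative assumptions.

```python
METRIC = (1.0, -1.0, -1.0, -1.0)  # assumed signature (+,-,-,-)

def minkowski_square(x_up):
    """X^2 = X^mu X_mu for a contravariant four-vector."""
    return sum(g * x * x for g, x in zip(METRIC, x_up))

def transform(x_up, eps_up, scale):
    """X'^mu = X^mu + scale * eps^{mu nu} X_nu, as in Eq. (3.126)."""
    x_low = [g * x for g, x in zip(METRIC, x_up)]
    return [x_up[mu] + scale * sum(eps_up[mu][nu] * x_low[nu] for nu in range(4))
            for mu in range(4)]

def change_in_square(x_up, eps_up, scale):
    """X'^2 - X^2 for the transformed vector."""
    return minkowski_square(transform(x_up, eps_up, scale)) - minkowski_square(x_up)

ANTI = [[0.0, 0.5, -0.2, 0.1],
        [-0.5, 0.0, 0.3, -0.4],
        [0.2, -0.3, 0.0, 0.6],
        [-0.1, 0.4, -0.6, 0.0]]
SYM = [[0.2, 0.5, -0.2, 0.1],
       [0.5, 0.1, 0.3, -0.4],
       [-0.2, 0.3, 0.3, 0.6],
       [0.1, -0.4, 0.6, 0.4]]
X = [2.0, 1.0, -0.5, 0.3]
SCALE = 1e-4

change_anti = change_in_square(X, ANTI, SCALE)  # second order in SCALE
change_sym = change_in_square(X, SYM, SCALE)    # first order in SCALE
```

With these values the antisymmetric case gives a change of order `SCALE**2`, whereas the symmetric case gives a change of order `SCALE`, in line with Eq. (3.130).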


This means that $\epsilon^{\mu\nu}$ has $6 = \frac{n(n-1)}{2}$ independent parameters: 3 for the rotations and 3 for the boosts along the three coordinate axes.
Let us regard the index “$i$” of the field $\phi_i$ as a Lorentz index, i.e. let us consider the case of a single field that transforms under (3.126) according to a certain representation of the Lorentz group. Then we have:
$$ \phi^i(X) \to S(\Lambda)^i{}_j\, \phi^j(X) \simeq \left[ 1 - \frac{1}{2}\, \Sigma_{\nu\rho}\, \epsilon^{\nu\rho} \right]^i{}_j \phi^j(X) = \phi^i(X) - \frac{1}{2} \left( \Sigma_{\nu\rho}\, \epsilon^{\nu\rho} \right)^i{}_j \phi^j(X) . \tag{3.131} $$
The $\Sigma_{\nu\rho}$ are the generators of the Lorentz group, or rather a representation of them on the basis of the fields (tensor or spinor representation), while $\epsilon^{\mu\nu}$ represents the rotation “angles”.
Altogether, therefore:
$$ \begin{cases} \delta X^\mu = \epsilon^{\mu\nu} X_\nu \\ \Delta\phi^i(X) = -\frac{1}{2} \left( \Sigma_{\nu\rho}\, \epsilon^{\nu\rho} \right)^i{}_j \phi^j(X) \end{cases} . \tag{3.132} $$
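Since
$$ J^\mu = \frac{\partial \mathcal{L}}{\partial \phi_{i,\mu}}\,\delta\phi^i + \mathcal{L}\, \delta X^\mu = \frac{\partial \mathcal{L}}{\partial \phi_{i,\mu}} \left( \Delta\phi^i - \partial_\rho \phi^i\, \delta X^\rho \right) + \mathcal{L}\, \delta X^\mu , \tag{3.133} $$
the conserved four-current is given by the following relation:
$$ \begin{aligned}
J^\mu &= \frac{\partial \mathcal{L}}{\partial \phi_{i,\mu}} \left[ -\frac{1}{2} \left( \Sigma_{\nu\rho}\, \epsilon^{\nu\rho} \right)^i{}_j \phi^j - \partial_\rho \phi^i\, \delta X^\rho \right] + \mathcal{L}\, \delta X^\mu && (3.134) \\
&= -\frac{1}{2} \frac{\partial \mathcal{L}}{\partial \phi_{i,\mu}} \left( \Sigma_{\nu\rho}\, \epsilon^{\nu\rho} \right)^i{}_j \phi^j - \frac{\partial \mathcal{L}}{\partial \phi_{i,\mu}}\, \partial_\nu \phi^i\, \epsilon^{\nu\rho} X_\rho + g^\mu_\nu\, \epsilon^{\nu\rho} X_\rho\, \mathcal{L} && (3.135) \\
&= -\frac{1}{2} \frac{\partial \mathcal{L}}{\partial \phi_{i,\mu}} \left( \Sigma_{\nu\rho}\, \epsilon^{\nu\rho} \right)^i{}_j \phi^j - \epsilon^{\nu\rho} X_\rho \left( \frac{\partial \mathcal{L}}{\partial \phi_{i,\mu}}\, \phi^i_{,\nu} - g^\mu_\nu\, \mathcal{L} \right) && (3.136) \\
&= -\frac{1}{2} \frac{\partial \mathcal{L}}{\partial \phi_{i,\mu}} \left( \Sigma_{\nu\rho}\, \epsilon^{\nu\rho} \right)^i{}_j \phi^j - \epsilon^{\nu\rho} X_\rho\, T^\mu_\nu , && (3.137)
\end{aligned} $$
where we set:
$$ T^\mu_\nu = \frac{\partial \mathcal{L}}{\partial \phi_{i,\mu}}\, \phi^i_{,\nu} - g^\mu_\nu\, \mathcal{L} . \tag{3.138} $$
Since, moreover, $\epsilon^{\nu\rho}$ is antisymmetric under the exchange of its two indices, the only non-vanishing contribution to $\epsilon^{\nu\rho} X_\rho T^\mu_\nu$ comes from the antisymmetric part of $X_\rho T^\mu_\nu$ (in $\nu$ and $\rho$):
$$ \frac{1}{2} \left( X_\rho T^\mu_\nu - X_\nu T^\mu_\rho \right) . \tag{3.139} $$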
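Therefore, finally, we have:
$$ \begin{aligned}
J^\mu &= \frac{1}{2}\, \epsilon^{\nu\rho} \left[ -\frac{\partial \mathcal{L}}{\partial \phi_{i,\mu}} \left( \Sigma_{\nu\rho} \right)^i{}_j \phi^j - \left( X_\rho T^\mu_\nu - X_\nu T^\mu_\rho \right) \right] && (3.140) \\
&= \frac{1}{2}\, \epsilon^{\nu\rho}\, \mathcal{M}^\mu_{\nu\rho} , && (3.141)
\end{aligned} $$
where we defined the tensor:
$$ \mathcal{M}^\mu_{\nu\rho} = X_\nu T^\mu_\rho - X_\rho T^\mu_\nu - \frac{\partial \mathcal{L}}{\partial \phi_{i,\mu}} \left( \Sigma_{\nu\rho} \right)^i{}_j \phi^j , \tag{3.142} $$
which has 24 independent components (4 in $\mu$ and $6 = \frac{n(n-1)}{2}$ in $\rho\nu$).
The tensor $\mathcal{M}^\mu_{\rho\nu}$ is a generalization of angular momentum.
It consists of an “orbital” momentum, $X_\rho T^\mu_\nu - X_\nu T^\mu_\rho$, which indeed has the structure of a vector product, and of an “intrinsic” momentum (spin momentum), $-\frac{\partial \mathcal{L}}{\partial \phi_{i,\mu}} \left( \Sigma_{\rho\nu} \right)^i{}_j \phi^j$. The former arises from the action of the Lorentz group on the space-time coordinates; the spin arises from the action of the same group on the spinor components of the field.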
Since $\epsilon^{\rho\nu}$ is constant (these are the rotation angles and do not depend on $X$), the conservation of the four-current, $\partial_\mu J^\mu = 0$, leads to the following equation for the tensor $\mathcal{M}$:
$$ \partial_\mu \mathcal{M}^\mu_{\rho\nu} = 0 . \tag{3.143} $$

Eq. (3.143) actually describes 6 conserved currents, namely the 6 independent components in $\rho$ and $\nu$ of $\mathcal{M}^\mu_{\rho\nu}$.
Setting
$$ M_{\rho\nu} = \int d^3X\, \mathcal{M}^0_{\rho\nu} , \tag{3.144} $$
if the fields go to zero at infinity, we obtain the conservation of the 6 charges:
$$ \dot{M}_{\rho\nu} = 0 . \tag{3.145} $$

3.5.2 Scalar field and conservation of the four-momentum and of the orbital angular momentum

If we restrict ourselves to the particular case of a scalar field, we have
$$ \Delta\phi(X) = \phi'(X') - \phi(X) = 0 . \tag{3.146} $$

Let us first consider a space-time translation by a constant four-vector $a^\mu$:
$$ \begin{cases} \delta X^\mu = a^\mu \\ \Delta\phi = 0 \end{cases} \tag{3.147} $$
so that:
$$ \delta\phi(X) = -\partial_\mu \phi(X)\, \delta X^\mu = -\partial_\mu \phi(X)\, a^\mu . \tag{3.148} $$
The conservation of the four-momentum then follows easily. Indeed, we have:
$$ \begin{aligned}
J^\mu &= \left( \mathcal{L}\, g^\mu_\nu - \frac{\partial \mathcal{L}}{\partial \phi_{,\mu}}\, \partial_\nu \phi \right) a^\nu && (3.149) \\
&= -T^\mu_\nu\, a^\nu , && (3.150)
\end{aligned} $$
where $T^\mu_\nu$ is the energy-momentum tensor of the system.
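Since the translation $a^\mu$ is constant, the conservation law of the current $J^\mu$ implies:
$$ \partial_\mu T^\mu_\nu = 0 , \tag{3.151} $$
which are four local conservation laws.
We define the four-momentum of the system as follows:
$$ P_\nu = \int d^3X\, T^0_\nu . \tag{3.152} $$
Then (3.151) leads to
$$ \dot{P}_\nu = 0 . \tag{3.153} $$
Indeed, the equations
$$ \begin{cases} \partial_\mu T^\mu_0 = 0 \\ \partial_\mu T^\mu_1 = 0 \\ \partial_\mu T^\mu_2 = 0 \\ \partial_\mu T^\mu_3 = 0 \end{cases} \tag{3.154} $$
imply
$$ \begin{cases} \partial_0 T^0_0 = -\partial_i T^i_0 \\ \partial_0 T^0_1 = -\partial_i T^i_1 \\ \partial_0 T^0_2 = -\partial_i T^i_2 \\ \partial_0 T^0_3 = -\partial_i T^i_3 \end{cases} \tag{3.155} $$
and, integrating over $d^3X$ and assuming that the fields go to zero at infinity, we obtain (3.153) component by component:
$$ \begin{cases} \partial_0 P_0 = -\int d^3X\, \partial_i T^i_0 \to 0 \\ \quad\vdots \\ \partial_0 P_3 = -\int d^3X\, \partial_i T^i_3 \to 0 \end{cases} \tag{3.156} $$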
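If instead of translations we consider proper Lorentz transformations, we have:
$$ J^\mu = \frac{1}{2}\, \epsilon^{\rho\nu} \left( X_\rho T^\mu_\nu - X_\nu T^\mu_\rho \right) = \frac{1}{2}\, \epsilon^{\rho\nu}\, \mathcal{M}^\mu_{\rho\nu} . \tag{3.157} $$
The conservation of the current $J^\mu$ implies:
$$ \partial_\mu \mathcal{M}^\mu_{\rho\nu} = 0 , \tag{3.158} $$
i.e.:
$$ \partial_0 \mathcal{M}^0_{\rho\nu} = -\partial_i \mathcal{M}^i_{\rho\nu} . \tag{3.159} $$
Let us consider the components $\mathcal{M}^0_{ij}$. We have:
$$ \mathcal{M}^0_{ij} = X_i T^0_j - X_j T^0_i = \left[ X_i \mathcal{P}_j - X_j \mathcal{P}_i \right] , \tag{3.160} $$
where we introduced $\mathcal{P}_i$, the spatial momentum density. Then:
$$ \mathcal{M}^0_{ij} = \epsilon_{ijk}\, \mathcal{L}_k = \begin{pmatrix} 0 & \mathcal{L}_3 & -\mathcal{L}_2 \\ -\mathcal{L}_3 & 0 & \mathcal{L}_1 \\ \mathcal{L}_2 & -\mathcal{L}_1 & 0 \end{pmatrix} \tag{3.161} $$
where $\boldsymbol{\mathcal{L}} = \mathbf{r} \wedge \boldsymbol{\mathcal{P}}$ is the spatial density of angular momentum (not to be confused with the lagrangian density). Integrating (3.159) over $d^3X$ we obtain the conservation of the orbital angular momentum:
$$ \dot{\mathbf{L}} = 0 , \tag{3.162} $$
where
$$ L_i = \int d^3X\, \mathcal{L}_i . \tag{3.163} $$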
3.5.3 Global internal symmetries
As already mentioned, the other example of a transformation (3.101) to consider is a variation that involves only a redefinition in form of the fields, with no change of reference frame.
In general we have:
$$ \begin{cases} \delta X^\mu = 0 \\ \Delta\phi_i = \delta\phi_i \neq 0 \end{cases} \tag{3.164} $$
from which the local conservation law $\partial_\mu J^\mu = 0$ follows, with:
$$ J^\mu = \frac{\partial \mathcal{L}}{\partial \phi_{i,\mu}}\,\delta\phi_i . \tag{3.165} $$
If the fields go to zero at infinity, the following charge is conserved:
$$ Q = \int d^3X\, J^0 = \int d^3X\, \frac{\partial \mathcal{L}}{\partial \dot{\phi}_i}\,\delta\phi_i . \tag{3.166} $$

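Charged scalar field

The typical example of an internal symmetry is the invariance of the lagrangian of the charged scalar field,
$$ \mathcal{L} = \partial_\mu \phi^\dagger\, \partial^\mu \phi - m^2 \phi^\dagger \phi , \tag{3.167} $$
under global phase transformations:
$$ \begin{cases} \phi \to \phi' = e^{i\alpha} \phi \\ \phi^\dagger \to \phi'^\dagger = \phi^\dagger e^{-i\alpha} \end{cases} . \tag{3.168} $$
This invariance determines the conservation of the current:
$$ j_\mu = \frac{J_\mu}{\alpha} = i \left[ \left( \partial_\mu \phi^\dagger \right) \phi - \left( \partial_\mu \phi \right) \phi^\dagger \right] \tag{3.169} $$
and of the charge:
$$ Q = i \int d^3X \left( \dot{\phi}^\dagger \phi - \dot{\phi}\, \phi^\dagger \right) , \tag{3.170} $$
which in the interacting model can be interpreted as the electric charge of the scalar particles and antiparticles $\phi$.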
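The conservation of the current (3.169) can be illustrated numerically for a classical complex field, for which $\phi^\dagger$ reduces to the complex conjugate $\phi^*$. The sketch below works in 1+1 dimensions with signature $(+,-)$, builds a superposition of two on-shell plane waves and estimates $\partial^\mu j_\mu = \partial_t j_t - \partial_x j_x$ by centred finite differences. The mass, amplitudes, wave numbers and step size are illustrative assumptions.

```python
import cmath
import math

MASS = 1.0
# (amplitude, wave number) pairs of an illustrative two-mode solution.
MODES = [(0.7 + 0.2j, 0.5), (0.3 - 0.4j, -1.2)]

def omega(k):
    """On-shell energy for wave number k."""
    return math.sqrt(k * k + MASS * MASS)

def field_and_gradients(t, x):
    """Return phi, d_t phi and d_x phi for the superposition of plane waves."""
    phi = dphi_dt = dphi_dx = 0j
    for amp, k in MODES:
        wave = amp * cmath.exp(-1j * (omega(k) * t - k * x))
        phi += wave
        dphi_dt += -1j * omega(k) * wave
        dphi_dx += 1j * k * wave
    return phi, dphi_dt, dphi_dx

def current(t, x):
    """Covariant components j_t, j_x of j_mu = i[(d_mu phi*) phi - (d_mu phi) phi*]."""
    phi, dt_phi, dx_phi = field_and_gradients(t, x)
    j_t = (1j * (dt_phi.conjugate() * phi - dt_phi * phi.conjugate())).real
    j_x = (1j * (dx_phi.conjugate() * phi - dx_phi * phi.conjugate())).real
    return j_t, j_x

def divergence(t, x, h=1e-4):
    """Finite-difference estimate of d^mu j_mu = d_t j_t - d_x j_x."""
    dj_t = (current(t + h, x)[0] - current(t - h, x)[0]) / (2 * h)
    dj_x = (current(t, x + h)[1] - current(t, x - h)[1]) / (2 * h)
    return dj_t - dj_x

print(divergence(0.3, -0.8))  # close to zero for an on-shell field
```

The divergence vanishes (up to discretization error) only because each mode satisfies the Klein-Gordon dispersion relation; an off-shell choice of `omega` would spoil the cancellation.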

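Dirac field
Global phase transformations also leave invariant another lagrangian, that of the free Dirac field:
$$ \mathcal{L} = \bar{\psi} \left( i \not{\partial} - m \right) \psi . \tag{3.171} $$
Let us rewrite (3.168) for the field $\psi$:
$$ \begin{cases} \psi \to \psi' = e^{-i\alpha} \psi \\ \bar{\psi} \to \bar{\psi}' = \bar{\psi}\, e^{i\alpha} \end{cases} . \tag{3.172} $$
Then we have a conserved current:
$$ j_\mu = \frac{J_\mu}{\alpha} = \bar{\psi}\, \gamma_\mu\, \psi \tag{3.173} $$
and a conserved charge:
$$ Q = \int d^3X\, \bar{\psi}\, \gamma^0 \psi = \int d^3X\, \psi^\dagger \psi . \tag{3.174} $$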
Chapter 4

Free Fields

In this chapter we will study the non-interacting fields, from the classical viewpoint to their canonical
quantization.

4.1 The Klein-Gordon Field (classical field)


We introduced different finite-dimensional representations of the Lorentz group. According to them, we can classify our fields. We start with the simplest representation, the trivial one^1, and we consider a scalar field (real or complex) $\phi(X)$, which under a Poincaré transformation $X^\mu \to X'^\mu = \Lambda^\mu{}_\nu X^\nu + a^\mu$ transforms as
$$ \phi(X) \to \phi'(X') = \phi(X) . \tag{4.1} $$

4.1.1 The Klein-Gordon equation


The Klein-Gordon field satisfies a differential equation that can be found from the relativistic dispersion relation (energy-momentum relation or mass-shell condition)
$$ E^2 = \mathbf{p}^2 + m^2 , \tag{4.2} $$
replacing the energy and the momentum according to the correspondence principle
$$ \begin{cases} E \to i\, \partial_t , \\ \mathbf{p} \to -i \nabla . \end{cases} \tag{4.3} $$
We find
$$ -\frac{\partial^2}{\partial t^2}\, \phi(X) = \left( -\nabla^2 + m^2 \right) \phi(X) , \tag{4.4} $$
which, remembering the covariant form $\partial_\mu \partial^\mu = \partial_0^2 - \partial_i^2$, can be written in a manifestly covariant way as follows
$$ \left( \partial_\mu \partial^\mu + m^2 \right) \phi(X) = 0 . \tag{4.5} $$
Eq. (4.5) is invariant under Poincaré transformations. In fact we can easily check that
$$ \left( \partial'_\mu \partial'^\mu + m^2 \right) \phi'(X') = \left( \Lambda_\mu{}^\nu \Lambda^\mu{}_\rho\, \partial_\nu \partial^\rho + m^2 \right) \phi'(X') = \left( \delta^\nu_\rho\, \partial_\nu \partial^\rho + m^2 \right) \phi'(X') = \left( \partial_\nu \partial^\nu + m^2 \right) \phi(X) = 0 . \tag{4.6} $$

Eq. (4.5) has to be considered as a classical equation for the classical field $\phi(X)$. Then we will quantize our system. In this sense there is no “second quantization”, but only the quantization of the classical field (we will quantize once!).
^1 In this representation the generators of the group are zero, $J^{\mu\nu} = 0$.
If, as was the case when the equation was proposed around 1926, we wanted to interpret Eq. (4.5) as a wave equation (so to say “à la Schrödinger”), we would face many issues. The main ones can be summarized as follows:

• First of all, the fact that we have a differential equation which is second order in time seems to be in contrast with the basic laws of quantum mechanics, according to which we can determine the time evolution of the wave function knowing just the function at a certain time $t_0$. The requirement that, in order to solve the differential equation, we have to provide the initial values of both the field and its time derivative violates the Heisenberg principle.

• The fact that we have a differential equation which is second order in time puts the probabilistic interpretation of the theory at risk. What we would like to interpret as a “probability density” is in fact not positive definite. We can see this by considering the differential equation for the complex-conjugate field (which is the same as Eq. (4.5), since the differential operator is real):
$$ \left( \partial_\mu \partial^\mu + m^2 \right) \phi^*(X) = 0 . \tag{4.7} $$
If we multiply Eq. (4.5) by $\phi^*$ and subtract Eq. (4.7) multiplied by $\phi$, we have
$$ \begin{aligned}
0 &= \phi^* \left( \partial_\mu \partial^\mu + m^2 \right) \phi - \phi \left( \partial_\mu \partial^\mu + m^2 \right) \phi^* \\
&= \phi^* \frac{\partial^2}{\partial t^2} \phi - \phi \frac{\partial^2}{\partial t^2} \phi^* + \phi \nabla^2 \phi^* - \phi^* \nabla^2 \phi \\
&= \frac{\partial}{\partial t} \left( \phi^* \frac{\partial}{\partial t} \phi - \phi \frac{\partial}{\partial t} \phi^* \right) + \nabla \cdot \left( \phi \nabla \phi^* - \phi^* \nabla \phi \right) ,
\end{aligned} \tag{4.8} $$
which is a continuity equation in which the probability density should be given by the following expression^2:
$$ \rho = i \left( \phi^* \frac{\partial}{\partial t} \phi - \phi \frac{\partial}{\partial t} \phi^* \right) = i \phi^* \overleftrightarrow{\partial_0}\, \phi , \tag{4.9} $$
such that
$$ \frac{d}{dt} \int d^3X\; i \phi^* \overleftrightarrow{\partial_0}\, \phi = 0 . \tag{4.10} $$
Eq. (4.10) defines the correct scalar product (in the Hilbert space of $\phi(X)$), which is conserved (time independent):
$$ (\phi_1, \phi_2) = \int d^3X\; i \phi_1^* \overleftrightarrow{\partial_0}\, \phi_2 . \tag{4.11} $$
However, $\rho$ is not a positive definite expression and, therefore, the connection with the probabilistic interpretation of the theory fails.

• Finally, an even more serious problem arises from the plane wave solutions of the Klein-Gordon equation, as we will see in the next section.
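To make the second point above concrete, the sketch below evaluates the density $\rho$ of Eq. (4.9) on two plane waves in 1+1 dimensions, one with positive and one with negative energy. The function names, the unit amplitude and the values of the mass and momentum are illustrative assumptions.

```python
import cmath
import math

def density(phi, dphi_dt):
    """rho = i (phi* d_t phi - phi d_t phi*), Eq. (4.9); real by construction."""
    return (1j * (phi.conjugate() * dphi_dt - phi * dphi_dt.conjugate())).real

def plane_wave(energy, p, t, x, amplitude=1.0):
    """Return phi = A exp(-i E t + i p x) together with its time derivative."""
    phi = amplitude * cmath.exp(-1j * energy * t + 1j * p * x)
    return phi, -1j * energy * phi

M, P = 1.0, 0.75
W = math.sqrt(P * P + M * M)  # on-shell energy, equal to 1.25 here

rho_plus = density(*plane_wave(+W, P, 0.3, -1.1))   # equals +2 W |A|^2
rho_minus = density(*plane_wave(-W, P, 0.3, -1.1))  # equals -2 W |A|^2
print(rho_plus, rho_minus)
```

For a plane wave of energy $E$ the density is $2E|A|^2$, so the negative-energy solution yields a negative $\rho$, which cannot be read as a probability density.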

4.1.2 Plane wave solutions of the Klein-Gordon equation


We look for a solution of Eq. (4.5) in the form of a plane wave
$$ \phi(X) = A\, e^{-i P_\mu X^\mu} . \tag{4.12} $$
Substituting Eq. (4.12) in Eq. (4.5) we find that the plane wave is a solution provided that
$$ \left( \partial_\mu \partial^\mu + m^2 \right) A\, e^{-i P_\mu X^\mu} = \left( -P_\mu P^\mu + m^2 \right) A\, e^{-i P_\mu X^\mu} = 0 , \tag{4.13} $$
i.e.
$$ P_\mu P^\mu = E^2 - \mathbf{p}^2 = m^2 . \tag{4.14} $$
^2 We put an “$i$” in order to have a real $\rho$.
$$ E_+ = \sqrt{\mathbf{p}^2 + m^2} = \omega_p , \tag{4.15} $$
$$ E_- = -\sqrt{\mathbf{p}^2 + m^2} = -\omega_p . \tag{4.16} $$

We therefore have two different solutions:
$$ f_p^+(X) = A\, e^{-i E_+ t + i \mathbf{p} \cdot \mathbf{x}} = A\, e^{-i \omega_p t + i \mathbf{p} \cdot \mathbf{x}} , \tag{4.17} $$
$$ f_p^-(X) = A\, e^{-i E_- t + i \mathbf{p} \cdot \mathbf{x}} = A\, e^{i \omega_p t + i \mathbf{p} \cdot \mathbf{x}} . \tag{4.18} $$

Solution (4.18) is difficult to interpret within a theory such as wave mechanics “à la Schrödinger”. The issue can be solved by moving to a field theory. Eq. (4.5) should be interpreted not as a wave equation, but as the differential equation that the classical field $\phi$ has to fulfill.
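Both signs of the energy give solutions of the classical field equation, which can be checked numerically. The sketch below evaluates the Klein-Gordon operator on plane waves in 1+1 dimensions with centred finite differences; the mass, the momentum, the off-shell energy used for comparison, the step size and the sample point are illustrative assumptions.

```python
import cmath
import math

MASS = 1.0

def plane_wave(energy, p):
    """Return the function (t, x) -> exp(-i E t + i p x)."""
    return lambda t, x: cmath.exp(-1j * energy * t + 1j * p * x)

def kg_residual(phi, t, x, h=1e-3):
    """|(d_t^2 - d_x^2 + m^2) phi| estimated with centred finite differences."""
    d2t = (phi(t + h, x) - 2 * phi(t, x) + phi(t - h, x)) / h**2
    d2x = (phi(t, x + h) - 2 * phi(t, x) + phi(t, x - h)) / h**2
    return abs(d2t - d2x + MASS**2 * phi(t, x))

P = 0.75
W = math.sqrt(P**2 + MASS**2)

on_shell_plus = kg_residual(plane_wave(+W, P), 0.2, 0.5)   # E = +omega_p
on_shell_minus = kg_residual(plane_wave(-W, P), 0.2, 0.5)  # E = -omega_p
off_shell = kg_residual(plane_wave(2.0, P), 0.2, 0.5)      # E^2 != p^2 + m^2
print(on_shell_plus, on_shell_minus, off_shell)
```

The residual is negligible for $E = \pm\omega_p$ and of order one for an energy that violates the mass-shell condition (4.14).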
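The general solution will be a superposition of $f^+$ and $f^-$:
$$ \phi(X) = \int d^3p \left[ \alpha(\mathbf{p})\, A\, e^{-i \omega_p t + i \mathbf{p} \cdot \mathbf{x}} + \beta(\mathbf{p})\, A\, e^{i \omega_p t + i \mathbf{p} \cdot \mathbf{x}} \right] . \tag{4.19} $$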

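Let us normalize our functions with respect to the scalar product (4.11). We have
$$ \begin{aligned}
(f_p^+(X), f_{p'}^+(X)) &= i |A|^2 \int d^3X \left[ e^{i P_\mu X^\mu}\, \partial_0\, e^{-i P'_\mu X^\mu} - e^{-i P'_\mu X^\mu}\, \partial_0\, e^{i P_\mu X^\mu} \right] , && (4.20) \\
&= i |A|^2 \int d^3X \left[ -i \omega_{p'}\, e^{i (P - P')_\mu X^\mu} - i \omega_p\, e^{-i (P' - P)_\mu X^\mu} \right] , && (4.21) \\
&= |A|^2 \int d^3X\, (\omega_{p'} + \omega_p)\, e^{i (P - P')_\mu X^\mu} , && (4.22) \\
&= |A|^2 (\omega_{p'} + \omega_p)\, e^{i (\omega_p - \omega_{p'}) t} \int d^3X\, e^{-i (\mathbf{p} - \mathbf{p}') \cdot \mathbf{x}} , && (4.23) \\
&= (2\pi)^3 |A|^2 (\omega_{p'} + \omega_p)\, e^{i (\omega_p - \omega_{p'}) t}\, \delta(\mathbf{p} - \mathbf{p}') , && (4.24) \\
&= (2\pi)^3 |A|^2\, 2 \omega_p\, \delta(\mathbf{p} - \mathbf{p}') . && (4.25)
\end{aligned} $$
Imposing
$$ (f_p^+(X), f_{p'}^+(X)) = \delta(\mathbf{p} - \mathbf{p}') , \tag{4.26} $$
we find^3
$$ A = \frac{1}{(2\pi)^{3/2} \sqrt{2 \omega_p}} . \tag{4.27} $$
Finally
$$ f_p^+(X) = \frac{e^{-i P_\mu X^\mu}}{(2\pi)^{3/2} \sqrt{2 \omega_p}} . \tag{4.28} $$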
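In the same way we can normalize $f_p^-(X)$, which has negative norm (a reminder of the fact that negative-energy solutions cannot be linked to the usual wave-mechanics solutions). We have
$$ \begin{aligned}
(f_p^-(X), f_{p'}^-(X)) &= i |A|^2 \int d^3X \left[ e^{-i \omega_p t - i \mathbf{p} \cdot \mathbf{x}}\, \partial_0\, e^{i \omega_{p'} t + i \mathbf{p}' \cdot \mathbf{x}} - e^{i \omega_{p'} t + i \mathbf{p}' \cdot \mathbf{x}}\, \partial_0\, e^{-i \omega_p t - i \mathbf{p} \cdot \mathbf{x}} \right] , && (4.29) \\
&= i |A|^2 \int d^3X \left[ i \omega_{p'} + i \omega_p \right] e^{-i (\omega_p - \omega_{p'}) t - i (\mathbf{p} - \mathbf{p}') \cdot \mathbf{x}} , && (4.30) \\
&= -|A|^2 (\omega_{p'} + \omega_p)\, e^{-i (\omega_p - \omega_{p'}) t} \int d^3X\, e^{-i (\mathbf{p} - \mathbf{p}') \cdot \mathbf{x}} , && (4.31) \\
&= -(2\pi)^3 |A|^2\, 2 \omega_p\, \delta(\mathbf{p} - \mathbf{p}') . && (4.32)
\end{aligned} $$
Imposing
$$ (f_p^-(X), f_{p'}^-(X)) = -\delta(\mathbf{p} - \mathbf{p}') , \tag{4.33} $$
we find the same expression for the normalization factor:
$$ A = \frac{1}{(2\pi)^{3/2} \sqrt{2 \omega_p}} . \tag{4.34} $$
Therefore
$$ f_p^-(X) = \frac{e^{i \omega_p t + i \mathbf{p} \cdot \mathbf{x}}}{(2\pi)^{3/2} \sqrt{2 \omega_p}} . \tag{4.35} $$
^3 We choose $A$ real.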
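We can prove that $f^+$ and $f^-$ are orthogonal, as follows:
$$ \begin{aligned}
(f_p^-(X), f_{p'}^+(X)) &= i |A|^2 \int d^3X \left[ e^{-i \omega_p t - i \mathbf{p} \cdot \mathbf{x}}\, \partial_0\, e^{-i \omega_{p'} t + i \mathbf{p}' \cdot \mathbf{x}} - e^{-i \omega_{p'} t + i \mathbf{p}' \cdot \mathbf{x}}\, \partial_0\, e^{-i \omega_p t - i \mathbf{p} \cdot \mathbf{x}} \right] , && (4.36) \\
&= i |A|^2 \int d^3X \left[ -i \omega_{p'} + i \omega_p \right] e^{-i (\omega_p + \omega_{p'}) t - i (\mathbf{p} - \mathbf{p}') \cdot \mathbf{x}} , && (4.37) \\
&= |A|^2 (\omega_{p'} - \omega_p)\, e^{-i (\omega_p + \omega_{p'}) t} \int d^3X\, e^{-i (\mathbf{p} - \mathbf{p}') \cdot \mathbf{x}} , && (4.38) \\
&= 0 , && (4.39)
\end{aligned} $$
since the space integral gives $(2\pi)^3 \delta(\mathbf{p} - \mathbf{p}')$, which forces $\omega_{p'} = \omega_p$.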

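We express the classical solution of the KG equation in terms of plane waves as the following combination:
$$ \begin{aligned}
\phi(X) &= \int d^3p \left[ \alpha(\mathbf{p})\, f_p^+(X) + \beta(\mathbf{p})\, f_p^-(X) \right] , && (4.40) \\
&= \int \frac{d^3p}{(2\pi)^{3/2} \sqrt{2 \omega_p}} \left[ \alpha(\mathbf{p})\, e^{-i \omega_p t + i \mathbf{p} \cdot \mathbf{x}} + \beta(\mathbf{p})\, e^{i \omega_p t + i \mathbf{p} \cdot \mathbf{x}} \right] . && (4.41)
\end{aligned} $$
Since we are integrating over the whole domain of $\mathbf{p}$, we can change $\mathbf{p}$ into $-\mathbf{p}$ in the second term, finding
$$ \begin{aligned}
\phi(X) &= \int \frac{d^3p}{(2\pi)^{3/2} \sqrt{2 \omega_p}} \left[ \alpha(\mathbf{p})\, e^{-i P_\mu X^\mu} + \tilde{\beta}(\mathbf{p})\, e^{i P_\mu X^\mu} \right] , && (4.42) \\
&= \int d^3p \left[ \alpha(\mathbf{p})\, f_p^+(X) + \tilde{\beta}(\mathbf{p})\, f_p^-(X) \right] , && (4.43)
\end{aligned} $$
where $\tilde{\beta}(\mathbf{p}) = \beta(-\mathbf{p})$ and where now
$$ f_p^-(X) = \left( f_p^+(X) \right)^* = \frac{e^{i P_\mu X^\mu}}{(2\pi)^{3/2} \sqrt{2 \omega_p}} . \tag{4.44} $$
If we consider a real field, then we have to impose $\phi^*(X) = \phi(X)$:
$$ \phi^*(X) = \int d^3p \left[ \alpha^*(\mathbf{p}) \left( f_p^+(X) \right)^* + \tilde{\beta}^*(\mathbf{p})\, f_p^+(X) \right] = \phi(X) , \tag{4.45} $$
which means $\tilde{\beta}(\mathbf{p}) = \alpha^*(\mathbf{p})$.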

The final expression for the classical real Klein-Gordon field (in terms of normal modes) is the following
$$ \phi(X) = \int \frac{d^3p}{(2\pi)^{3/2} \sqrt{2 \omega_p}} \left[ \alpha(\mathbf{p})\, e^{-i P_\mu X^\mu} + \alpha^*(\mathbf{p})\, e^{i P_\mu X^\mu} \right] . \tag{4.46} $$
NOTE: The expression (4.44) for $f^-$ has all the characteristics of the previous one, i.e. negative norm and orthogonality with $f^+$. The fact that negative-energy solutions are related to the positive-energy ones by the transformation $(E, \mathbf{p}) \to (-E, -\mathbf{p})$ has a nice meaning in terms of the Feynman-Stückelberg interpretation.
We can use the scalar product to extract the coefficients $\alpha(\mathbf{p})$ and $\alpha^*(\mathbf{p})$:
$$ \alpha(\mathbf{p}) = (f_p^+, \phi) = i \int d^3X\, (f_p^+)^*\, \overleftrightarrow{\partial_0}\, \phi , \tag{4.47} $$
$$ \alpha^*(\mathbf{p}) = -(f_p^-, \phi) = -i \int d^3X\, (f_p^-)^*\, \overleftrightarrow{\partial_0}\, \phi . \tag{4.48} $$

4.1.3 Lagrangian density of the Klein-Gordon real field


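We now want to find the Lagrangian density of the real Klein-Gordon field, i.e. the function $\mathcal{L}$ such that the Euler-Lagrange equations reproduce Eq. (4.5). In order to do that, we use Hamilton’s principle, following the reverse procedure. If $\delta\phi$ is the variation of the field $\phi(X)$, which vanishes on the boundary of the integration domain, we multiply Eq. (4.5) by that variation and we integrate by parts. We have
$$ \begin{aligned}
0 &= \int d^4X \left( \partial_\mu \partial^\mu \phi + m^2 \phi \right) \delta\phi && (4.49) \\
&= \int d^4X \left[ \partial_\mu \left( \partial^\mu \phi\, \delta\phi \right) - \partial^\mu \phi\, \partial_\mu \delta\phi + \frac{m^2}{2}\, \delta(\phi^2) \right] && (4.50) \\
&= -\int d^4X \left[ \partial^\mu \phi\, \delta\left( \partial_\mu \phi \right) - \frac{m^2}{2}\, \delta(\phi^2) \right] && (4.51) \\
&= -\delta \left[ \int d^4X\, \frac{1}{2} \left( \partial_\mu \phi\, \partial^\mu \phi - m^2 \phi^2 \right) \right] , && (4.52)
\end{aligned} $$
where moving from (4.49) to (4.50) we integrated by parts, moving from (4.51) to (4.52) we used $\partial^\mu \phi\, \delta(\partial_\mu \phi) = \frac{1}{2}\, \delta(\partial_\mu \phi\, \partial^\mu \phi)$, and where the first term of Eq. (4.50) gives a vanishing integral, since it is a total derivative and the variation of the field vanishes on the boundary of the integration domain.
The lagrangian density is:
$$ \mathcal{L} = \frac{1}{2} \left( \partial_\mu \phi\, \partial^\mu \phi - m^2 \phi^2 \right) . \tag{4.53} $$
We note that an overall sign in the definition of $\mathcal{L}$ does not affect the equations of motion. However, the sign of the hamiltonian density $\mathcal{H}$ depends on the sign of $\mathcal{L}$. We chose the sign of (4.53) in such a way as to have a positive definite hamiltonian density.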

Conserved quantities
From the invariance of the lagrangian (4.53) under Poincaré transformations, we obtain the corresponding Noether charges. Translation invariance gives the conservation of the energy-momentum tensor
\[
T^{\mu\nu} = \frac{\partial\mathcal{L}}{\partial\phi_{,\mu}}\, \phi^{,\nu} - \eta^{\mu\nu} \mathcal{L} = \partial^\mu\phi\, \partial^\nu\phi - \eta^{\mu\nu} \mathcal{L} , \tag{4.54}
\]

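such that $\partial_\mu T^{\mu\nu} = 0$. The densities of the four conserved charges are the energy density
\[
\mathcal{H} = T^{00} = (\partial_0\phi)^2 - \mathcal{L} = \frac{1}{2} \left[ \dot\phi^2 + (\nabla\phi)^2 + m^2\phi^2 \right] \tag{4.55}
\]
and the momentum density
\[
\mathcal{P}_i = T^0{}_i = \dot\phi\, \partial_i\phi . \tag{4.56}
\]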
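From Lorentz invariance, instead, we find the conservation of the 6 charges
\[
M_{\mu\nu} = \int d^3X \left( X_\mu T^0{}_\nu - X_\nu T^0{}_\mu \right) , \tag{4.57}
\]
among which, for instance, the orbital angular momentum
\begin{align}
M_{ij} &= \int d^3X \left( X_i T^0{}_j - X_j T^0{}_i \right) = \int d^3X \left( X_i\, \partial_0\phi\, \partial_j\phi - X_j\, \partial_0\phi\, \partial_i\phi \right) \tag{4.58} \\
&= \int d^3X\, \partial_0\phi \left( X_i \partial_j - X_j \partial_i \right) \phi = -i \int d^3X\, \partial_0\phi\, L_{ij}\, \phi , \tag{4.59}
\end{align}
where
\[
L_{ij} = i \left( X_i \partial_j - X_j \partial_i \right) \tag{4.60}
\]
is the angular momentum operator.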

4.1.4 Hamiltonian
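The hamiltonian density is given in Eq. (4.55) (see footnote 4). It can also be obtained by a Legendre transformation of the lagrangian density. We define the momentum conjugate to the field
\[
\pi(X) = \frac{\partial\mathcal{L}}{\partial\dot\phi} = \dot\phi . \tag{4.61}
\]
Then we find
\[
\mathcal{H} = \frac{1}{2} \left[ \pi^2 + (\nabla\phi)^2 + m^2\phi^2 \right] . \tag{4.62}
\]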
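Recalling that
\begin{align}
\phi(X) &= \int \frac{d^3p}{\sqrt{(2\pi)^3\, 2\omega_p}} \left[ \alpha(p)\, e^{-iP_\mu X^\mu} + \alpha^*(p)\, e^{iP_\mu X^\mu} \right] , \tag{4.63} \\
\pi(X) &= -i \int \frac{d^3p}{\sqrt{(2\pi)^3\, 2\omega_p}}\, \omega_p \left[ \alpha(p)\, e^{-iP_\mu X^\mu} - \alpha^*(p)\, e^{iP_\mu X^\mu} \right] , \tag{4.64} \\
\nabla\phi(X) &= i \int \frac{d^3p}{\sqrt{(2\pi)^3\, 2\omega_p}}\, \mathbf{p} \left[ \alpha(p)\, e^{-iP_\mu X^\mu} - \alpha^*(p)\, e^{iP_\mu X^\mu} \right] , \tag{4.65}
\end{align}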

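we can find the hamiltonian as follows:
\begin{align}
H &= \int d^3X\, \mathcal{H} = \int d^3X\, \frac{1}{2} \left[ \pi^2 + (\nabla\phi)^2 + m^2\phi^2 \right] \tag{4.66} \\
&= \frac{1}{2} \int d^3X \int \frac{d^3p\, d^3p'}{(2\pi)^3 \sqrt{4\omega_p\omega_{p'}}} \Big\{ \nonumber \\
&\qquad -\omega_p\omega_{p'} \left[ \alpha(p) e^{-iP_\mu X^\mu} - \alpha^*(p) e^{iP_\mu X^\mu} \right] \left[ \alpha(p') e^{-iP'_\mu X^\mu} - \alpha^*(p') e^{iP'_\mu X^\mu} \right] \nonumber \\
&\qquad -\mathbf{p}\cdot\mathbf{p}' \left[ \alpha(p) e^{-iP_\mu X^\mu} - \alpha^*(p) e^{iP_\mu X^\mu} \right] \left[ \alpha(p') e^{-iP'_\mu X^\mu} - \alpha^*(p') e^{iP'_\mu X^\mu} \right] \nonumber \\
&\qquad +m^2 \left[ \alpha(p) e^{-iP_\mu X^\mu} + \alpha^*(p) e^{iP_\mu X^\mu} \right] \left[ \alpha(p') e^{-iP'_\mu X^\mu} + \alpha^*(p') e^{iP'_\mu X^\mu} \right] \Big\} \tag{4.67} \\
&= \frac{1}{2} \int d^3X \int \frac{d^3p\, d^3p'}{(2\pi)^3 \sqrt{4\omega_p\omega_{p'}}} \Big\{ \nonumber \\
&\qquad \alpha(p)\alpha(p')\, e^{-i(P+P')_\mu X^\mu} \left( -\omega_p\omega_{p'} - \mathbf{p}\cdot\mathbf{p}' + m^2 \right) \nonumber \\
&\qquad +\alpha^*(p)\alpha^*(p')\, e^{i(P+P')_\mu X^\mu} \left( -\omega_p\omega_{p'} - \mathbf{p}\cdot\mathbf{p}' + m^2 \right) \nonumber \\
&\qquad +\alpha(p)\alpha^*(p')\, e^{-i(P-P')_\mu X^\mu} \left( \omega_p\omega_{p'} + \mathbf{p}\cdot\mathbf{p}' + m^2 \right) \nonumber \\
&\qquad +\alpha^*(p)\alpha(p')\, e^{i(P-P')_\mu X^\mu} \left( \omega_p\omega_{p'} + \mathbf{p}\cdot\mathbf{p}' + m^2 \right) \Big\} . \tag{4.68}
\end{align}
Integrating in $d^3X$:
\begin{align}
H &= \frac{1}{2} \int \frac{d^3p\, d^3p'}{\sqrt{4\omega_p\omega_{p'}}} \Big\{ \left[ \alpha(p)\alpha(p')\, e^{-i(\omega_p+\omega_{p'})t} + \alpha^*(p)\alpha^*(p')\, e^{i(\omega_p+\omega_{p'})t} \right] \left( -\omega_p\omega_{p'} - \mathbf{p}\cdot\mathbf{p}' + m^2 \right) \delta(\mathbf{p}+\mathbf{p}') \nonumber \\
&\qquad + \left[ \alpha(p)\alpha^*(p')\, e^{-i(\omega_p-\omega_{p'})t} + \alpha^*(p)\alpha(p')\, e^{i(\omega_p-\omega_{p'})t} \right] \left( \omega_p\omega_{p'} + \mathbf{p}\cdot\mathbf{p}' + m^2 \right) \delta(\mathbf{p}-\mathbf{p}') \Big\} . \tag{4.69}
\end{align}
Integrating in $d^3p'$, and since $-\omega_p^2 + p^2 + m^2 = 0$:
\[
H = \frac{1}{2} \int d^3p\, \omega_p \left[ \alpha(p)\alpha^*(p) + \alpha^*(p)\alpha(p) \right] , \tag{4.70}
\]
4 Actually, since Eq. (4.55) gives the energy density, it is written in terms of $\phi$ and $\dot\phi$. To obtain the hamiltonian, Eq. (4.61) has to be read as an equation from which to extract $\dot\phi$ in terms of $\pi$.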
2

where we kept $\alpha^*(p)\alpha(p)$ and $\alpha(p)\alpha^*(p)$ as distinct products, $\alpha^*(p)\alpha(p) \neq \alpha(p)\alpha^*(p)$, although classically this distinction does not have any meaning.
As in the case of the vibrating string, we find that the hamiltonian is the “sum” of an infinite number of harmonic-oscillator hamiltonians of frequency $\omega_p$.
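The step from Eq. (4.69) to Eq. (4.70) relies on two on-shell identities: the coefficient of the $\alpha\alpha$ and $\alpha^*\alpha^*$ terms vanishes at $\mathbf{p}' = -\mathbf{p}$, while the coefficient of the mixed terms equals $2\omega_p^2$ at $\mathbf{p}' = \mathbf{p}$. A minimal numerical sketch of this check follows; the function names and the sample values of $m$ and $\mathbf{p}$ are illustrative assumptions, not part of the notes.

```python
import math

def omega(p, m):
    """On-shell energy omega_p = sqrt(|p|^2 + m^2) for a 3-momentum p."""
    return math.sqrt(sum(c * c for c in p) + m * m)

def dot(p, q):
    """Euclidean scalar product of two 3-vectors."""
    return sum(a * b for a, b in zip(p, q))

def coeff_same_type(p, q, m):
    """Coefficient of the alpha*alpha and alpha^* alpha^* terms in Eq. (4.68)."""
    return -omega(p, m) * omega(q, m) - dot(p, q) + m * m

def coeff_mixed(p, q, m):
    """Coefficient of the alpha alpha^* and alpha^* alpha terms in Eq. (4.68)."""
    return omega(p, m) * omega(q, m) + dot(p, q) + m * m

# Arbitrary sample values (assumptions made only for this illustration).
m = 0.7
p = (0.3, -1.2, 2.0)
minus_p = tuple(-c for c in p)

vanishing = coeff_same_type(p, minus_p, m)  # delta(p + p') sets p' = -p
surviving = coeff_mixed(p, p, m)            # delta(p - p') sets p' = p
```

Dividing `surviving` by the factor $\sqrt{4\omega_p\omega_{p'}} = 2\omega_p$ gives $\omega_p$, which is the weight appearing in Eq. (4.70).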

4.1.5 Complex scalar field and the charge


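So far we treated the real field case. Let us now consider a complex scalar field. We can treat this case starting with two real fields, $\phi_1$ and $\phi_2$, degenerate in mass, for which the lagrangian density is
\[
\mathcal{L} = \frac{1}{2} \left[ \partial_\mu\phi_1 \partial^\mu\phi_1 + \partial_\mu\phi_2 \partial^\mu\phi_2 - m^2 (\phi_1^2 + \phi_2^2) \right] , \tag{4.71}
\]
and rotating in the complex plane to $\phi$ and $\phi^*$ defined such that
\begin{align}
\phi &= \frac{\phi_1 + i\phi_2}{\sqrt{2}} , \tag{4.72} \\
\phi^* &= \frac{\phi_1 - i\phi_2}{\sqrt{2}} , \tag{4.73}
\end{align}
or
\begin{align}
\phi_1 &= \frac{\phi + \phi^*}{\sqrt{2}} , \tag{4.74} \\
\phi_2 &= \frac{\phi - \phi^*}{i\sqrt{2}} . \tag{4.75}
\end{align}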

Substituting (4.74,4.75) in Eq. (4.71) we find

L = ∂µ φ∗ ∂ µ φ − m2 φ∗ φ . (4.76)

The lagrangian (4.76) has a global internal symmetry, under U (1) phase transformations:

\begin{align}
\phi(X) &\to \phi'(X) = e^{-i\theta}\, \phi(X) , \tag{4.77} \\
\phi^*(X) &\to \phi'^*(X) = e^{i\theta}\, \phi^*(X) , \tag{4.78}
\end{align}

with θ ∈ R. It is easy to check that indeed

L′ = ∂µ φ′∗ ∂ µ φ′ − m2 φ′∗ φ′ = ∂µ φ∗ ∂ µ φ − m2 φ∗ φ = L . (4.79)

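The Noether current associated with this symmetry is
\[
J^\mu = \frac{\partial\mathcal{L}}{\partial\phi_{i,\mu}}\, \delta\phi_i = \frac{\partial\mathcal{L}}{\partial\phi_{,\mu}}\, \delta\phi + \frac{\partial\mathcal{L}}{\partial\phi^*_{,\mu}}\, \delta\phi^* . \tag{4.80}
\]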

Note that the piece proportional to the lagrangian density is not present, since for this internal
symmetry we do not have any change in the space-time point, δX µ = 0.
The infinitesimal transformation can be found expanding (4.77,4.78):

\begin{align}
\delta\phi &= -i\theta\phi , \tag{4.81} \\
\delta\phi^* &= i\theta\phi^* , \tag{4.82}
\end{align}

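from which
\[
\partial_\mu \left[ \frac{\partial\mathcal{L}}{\partial\phi_{,\mu}}\, \delta\phi + \frac{\partial\mathcal{L}}{\partial\phi^*_{,\mu}}\, \delta\phi^* \right] = \partial_\mu \left[ -i\theta\, \phi\, \partial^\mu\phi^* + i\theta\, \phi^* \partial^\mu\phi \right] = \theta\, \partial_\mu \left[ i\phi^* \overleftrightarrow{\partial^\mu} \phi \right] = 0 . \tag{4.83}
\]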

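Since relation (4.83) holds for any $\theta$, we define the current as
\[
J^\mu = i\phi^* \overleftrightarrow{\partial^\mu} \phi \tag{4.84}
\]
and the conserved charge is
\[
Q = \int d^3X\, i\phi^* \overleftrightarrow{\partial^0} \phi . \tag{4.85}
\]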

4.1.6 Non relativistic limit


In the $\beta \ll 1$ limit, the energy can be expanded as
\[
E \simeq m + \frac{p^2}{2m} + \dots \tag{4.86}
\]
We have a big constant, the mass m, which comes from relativity and is not present in non relativistic
newtonian mechanics, and a small term which is indeed the non relativistic kinetic energy. Therefore,
we consider the limit in which the momenta and energies are small with respect to the big term m.
We define
E′ = E − m , (4.87)
and therefore we have E ′ ≪ m.
The positive-energy solutions will oscillate with a factor whose frequency is as big as $m$, $\sim e^{-imt}$, and a slowly varying factor $\sim e^{-iE't}$. In order to study the latter, we have to factor out the former. We put
\[
\phi = \varphi\, e^{-imt} , \tag{4.88}
\]

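in such a way that
\[
i\, \frac{\partial\varphi}{\partial t} \sim E' \varphi \ll m\varphi . \tag{4.89}
\]
We have, at first order in $E'$:
\begin{align}
\frac{\partial\phi}{\partial t} &= \left( \frac{\partial\varphi}{\partial t} - im\varphi \right) e^{-imt} , \tag{4.90} \\
\frac{\partial^2\phi}{\partial t^2} &= \left( \frac{\partial^2\varphi}{\partial t^2} - im\, \frac{\partial\varphi}{\partial t} \right) e^{-imt} - im \left( \frac{\partial\varphi}{\partial t} - im\varphi \right) e^{-imt} \tag{4.91} \\
&\simeq \left( -2im\, \frac{\partial\varphi}{\partial t} - m^2\varphi \right) e^{-imt} . \tag{4.92}
\end{align}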

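Substituting in the Klein-Gordon equation,
\[
\frac{\partial^2\phi}{\partial t^2} = \nabla^2\phi - m^2\phi , \tag{4.93}
\]
we have
\[
\left( -2im\, \frac{\partial\varphi}{\partial t} - m^2\varphi \right) e^{-imt} = \nabla^2\varphi\, e^{-imt} - m^2\varphi\, e^{-imt} . \tag{4.94}
\]
Finally
\[
i\, \frac{\partial\varphi}{\partial t} = -\frac{1}{2m}\, \nabla^2\varphi , \tag{4.95}
\]
which is the Schrödinger equation for a free spinless particle.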
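The quality of the expansion (4.86) can be illustrated numerically by comparing the exact kinetic energy $E - m$ with the Newtonian term $p^2/2m$ for decreasing values of $p/m$. The sketch below is only an illustration: the function names and the chosen ratios are assumptions, and the small parameter used is $p/m$ rather than the velocity $\beta$.

```python
import math

def exact_kinetic(p, m):
    """E' = E - m with E = sqrt(p^2 + m^2), as in Eq. (4.87)."""
    return math.sqrt(p * p + m * m) - m

def newtonian_kinetic(p, m):
    """Leading non-relativistic term p^2 / (2m) of Eq. (4.86)."""
    return p * p / (2.0 * m)

def relative_error(p, m):
    """Relative difference between the exact and the Newtonian kinetic energy."""
    newton = newtonian_kinetic(p, m)
    return abs(exact_kinetic(p, m) - newton) / newton

m = 1.0
ratios = (0.3, 0.1, 0.01)  # sample values of p/m (illustrative choice)
errors = [relative_error(r * m, m) for r in ratios]
```

The first neglected term in (4.86) is $-p^4/8m^3$, so the relative error is expected to scale as $p^2/4m^2$, which is what the list `errors` reflects.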

4.1.7 The two-component form


The Klein-Gordon equation is second order in time. Every second-order differential equation is equivalent to a system of first-order differential equations. We can then cast the Klein-Gordon equation into a form which is “similar” to the Schrödinger equation, losing the manifest covariance and getting an equation that involves a two-component field and a 2×2 matrix that plays the role of the hamiltonian in the Schrödinger equation (however, in this two-component description the Schrödinger-like hamiltonian is not even hermitian).
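We then consider $\phi$ and $\partial_0\phi$ as independent fields, defining two fields, $\phi_+$ and $\phi_-$, as follows
\begin{align}
\phi_+ &= \frac{1}{\sqrt{2m}} \left( i\partial_0\phi + m\phi \right) , \tag{4.96} \\
\phi_- &= \frac{1}{\sqrt{2m}} \left( -i\partial_0\phi + m\phi \right) , \tag{4.97}
\end{align}
such that
\begin{align}
\phi &= \frac{1}{\sqrt{2m}} \left( \phi_+ + \phi_- \right) , \tag{4.98} \\
i\partial_0\phi &= \sqrt{\frac{m}{2}} \left( \phi_+ - \phi_- \right) . \tag{4.99}
\end{align}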

(For the moment we do not include interactions).


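Let us take the derivatives with respect to time of $\phi_\pm$ (here $p^2$ stands for the operator $-\nabla^2$):
\begin{align}
i\partial_0\phi_\pm &= \frac{1}{\sqrt{2m}} \left( \mp \frac{\partial^2}{\partial t^2}\phi + m\, i\partial_0\phi \right) \tag{4.100} \\
&= \frac{1}{\sqrt{2m}} \left[ \mp(-p^2 - m^2)\phi + m\, i\partial_0\phi \right] \qquad \text{(using the Klein-Gordon equation)} \tag{4.101} \\
&= \frac{1}{\sqrt{2m}} \left[ \mp(-p^2 - m^2)\, \frac{1}{\sqrt{2m}} (\phi_+ + \phi_-) + m \sqrt{\frac{m}{2}}\, (\phi_+ - \phi_-) \right] \tag{4.102} \\
&= \pm \left( \frac{p^2}{2m} + \frac{m}{2} \right) (\phi_+ + \phi_-) + \frac{m}{2}\, (\phi_+ - \phi_-) . \tag{4.103}
\end{align}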

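Finally
\[
\begin{cases}
i\partial_0\phi_+ = \left( \dfrac{p^2}{2m} + m \right) \phi_+ + \dfrac{p^2}{2m}\, \phi_- , \\[2mm]
i\partial_0\phi_- = -\dfrac{p^2}{2m}\, \phi_+ - \left( \dfrac{p^2}{2m} + m \right) \phi_- .
\end{cases} \tag{4.104}
\]
If we arrange $\phi_\pm$ in a two-component vector
\[
\Psi = \begin{pmatrix} \phi_+ \\ \phi_- \end{pmatrix} \tag{4.105}
\]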

we find the following form for the KG equation

(i∂0 − H)Ψ = 0 . (4.106)

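In Eq. (4.106), we defined the following matrix
\[
H = \begin{pmatrix} \dfrac{p^2}{2m} + m & \dfrac{p^2}{2m} \\[2mm] -\dfrac{p^2}{2m} & -\dfrac{p^2}{2m} - m \end{pmatrix} = \left( \frac{p^2}{2m} + m \right) \tau^3 + i\, \frac{p^2}{2m}\, \tau^2 , \tag{4.107}
\]
where
\[
\tau^1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , \qquad \tau^2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} , \qquad \tau^3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} , \tag{4.108}
\]
are the Pauli matrices. Note that although we found a natural representation in terms of the Pauli matrices, we are not speaking about spin. The particles we are describing with the KG equation are spinless particles. Here we are considering an $SU(2)$ rotation, but in another space (the one identified by the two-component vectors $\Psi$).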
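Since
\[
(i\partial_0 + H)(i\partial_0 - H) = -\frac{\partial^2}{\partial t^2} - H^2 \tag{4.109}
\]
and
\[
H^2 = \left( \frac{p^2}{2m} + m \right)^2 (\tau^3)^2 - \frac{p^4}{4m^2}\, (\tau^2)^2 + i\, \frac{p^2}{2m} \left( \frac{p^2}{2m} + m \right) [\tau^3, \tau^2]_+ = (p^2 + m^2)\, \mathbb{1} , \tag{4.110}
\]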

we find that since Eq. (4.106) holds, we have

(i∂0 + H)(i∂0 − H)Ψ = 0 , (4.111)

and therefore, because of Eq. (4.109):
\[
\left( \frac{\partial^2}{\partial t^2} - \nabla^2 + m^2 \right) \Psi = 0 . \tag{4.112}
\]

Both φ+ and φ− satisfy the KG equation.

The operator $H$ is not hermitian, $H^\dagger \neq H$. However, it is hermitian “in the $\tau^3$ metric”, i.e.
\[
\tau^3 H^\dagger \tau^3 = H , \tag{4.113}
\]

as can be checked by direct inspection.
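The direct inspection mentioned above can be sketched numerically for a plane wave, where $p^2$ is just a number. The following minimal example builds the matrix of Eq. (4.107) and checks $H^2 = (p^2 + m^2)\mathbb{1}$, the failure of ordinary hermiticity, and Eq. (4.113). The helper names and the sample values of $p^2$ and $m$ are assumptions made for the illustration.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    """Entry-wise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

TAU3 = [[1, 0], [0, -1]]

def two_component_hamiltonian(p2, m):
    """Matrix H of Eq. (4.107) for a fixed numerical value of p^2."""
    k = p2 / (2.0 * m)
    return [[complex(k + m), complex(k)], [complex(-k), complex(-k - m)]]

p2, m = 2.5, 0.8  # sample values (illustrative choice)
H = two_component_hamiltonian(p2, m)

H_squared = matmul(H, H)                        # expected: (p^2 + m^2) * identity
is_hermitian = close(dagger(H), H)              # expected: False
is_tau3_hermitian = close(matmul(TAU3, matmul(dagger(H), TAU3)), H)  # Eq. (4.113)
```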


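We can also express the conserved charge in the new form:
\begin{align}
Q &= i \int d^3X\, \phi^* \overleftrightarrow{\partial^0} \phi = \frac{i}{2m} \int d^3X \left( \phi_+^* + \phi_-^* \right) \overleftrightarrow{\partial^0} \left( \phi_+ + \phi_- \right) \tag{4.114} \\
&= \dots \nonumber \\
&= \int d^3X \left( |\phi_+|^2 - |\phi_-|^2 \right) \tag{4.115} \\
&= \int d^3X\, \Psi^\dagger \tau^3 \Psi . \tag{4.116}
\end{align}
We can introduce the “Klein-Gordon” adjoint
\[
\bar\Psi = \Psi^\dagger \tau^3 \tag{4.117}
\]
and then write
\[
Q = \int d^3X\, \bar\Psi \Psi . \tag{4.118}
\]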

Plane wave solutions


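We can study the solutions of the KG equation in the two-component form. We look for a solution of the following kind
\[
\Psi \sim A \begin{pmatrix} \phi_+ \\ \phi_- \end{pmatrix} e^{-iEt + i\mathbf{p}\cdot\mathbf{x}} . \tag{4.119}
\]
Substituting in Eq. (4.106) we find the system
\[
\begin{pmatrix} E - m - \dfrac{p^2}{2m} & -\dfrac{p^2}{2m} \\[2mm] \dfrac{p^2}{2m} & E + m + \dfrac{p^2}{2m} \end{pmatrix} \begin{pmatrix} \phi_+ \\ \phi_- \end{pmatrix} = 0 , \tag{4.120}
\]
that has a non-trivial solution if
\[
\det \begin{pmatrix} E - m - \dfrac{p^2}{2m} & -\dfrac{p^2}{2m} \\[2mm] \dfrac{p^2}{2m} & E + m + \dfrac{p^2}{2m} \end{pmatrix} = E^2 - p^2 - m^2 = 0 . \tag{4.121}
\]
Eq. (4.121) requires positive- and negative-energy solutions
\[
E = \pm\sqrt{p^2 + m^2} = \pm\omega_p \tag{4.122}
\]
and the system becomes
\[
\begin{cases}
\left( \pm\omega_p - m - \dfrac{p^2}{2m} \right) \phi_+ = \dfrac{p^2}{2m}\, \phi_- , \\[2mm]
\dfrac{p^2}{2m}\, \phi_+ = -\left( \pm\omega_p + m + \dfrac{p^2}{2m} \right) \phi_- .
\end{cases} \tag{4.123}
\]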
Positive-energy solutions
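The system is Eq. (4.123) in which we consider $+\omega_p$:
\[
\begin{cases}
\left( \omega_p - m - \dfrac{p^2}{2m} \right) \phi_+^{(+)} = \dfrac{p^2}{2m}\, \phi_-^{(+)} , \\[2mm]
\dfrac{p^2}{2m}\, \phi_+^{(+)} = -\left( \omega_p + m + \dfrac{p^2}{2m} \right) \phi_-^{(+)} ,
\end{cases} \tag{4.124}
\]
that gives
\[
\frac{\phi_+^{(+)}}{\phi_-^{(+)}} = \frac{\dfrac{p^2}{2m}}{\omega_p - m - \dfrac{p^2}{2m}} \tag{4.125}
\]
and since
\begin{align}
\omega_p - m - \frac{p^2}{2m} &= -\frac{(\omega_p - m)^2}{2m} , \tag{4.126} \\
\omega_p + m + \frac{p^2}{2m} &= \frac{(\omega_p + m)^2}{2m} , \tag{4.127} \\
p^2 &= (\omega_p + m)(\omega_p - m) , \tag{4.128}
\end{align}
we find
\[
\frac{\phi_+^{(+)}}{\phi_-^{(+)}} = \frac{\dfrac{p^2}{2m}}{\omega_p - m - \dfrac{p^2}{2m}} = -\frac{\omega_p + m}{\omega_p - m} . \tag{4.129}
\]
We choose the positive-energy solution as
\[
\Psi^{(+)} = A^{(+)} \begin{pmatrix} \omega_p + m \\ m - \omega_p \end{pmatrix} e^{-i\omega_p t + i\mathbf{p}\cdot\mathbf{x}} . \tag{4.130}
\]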

The normalization factor A(+) will be discussed below.

Negative-energy solutions
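The system is Eq. (4.123) in which we consider $-\omega_p$:
\[
\begin{cases}
\left( -\omega_p - m - \dfrac{p^2}{2m} \right) \phi_+^{(-)} = \dfrac{p^2}{2m}\, \phi_-^{(-)} , \\[2mm]
\dfrac{p^2}{2m}\, \phi_+^{(-)} = -\left( -\omega_p + m + \dfrac{p^2}{2m} \right) \phi_-^{(-)} ,
\end{cases} \tag{4.131}
\]
that gives
\[
\frac{\phi_+^{(-)}}{\phi_-^{(-)}} = -\frac{\dfrac{p^2}{2m}}{\omega_p + m + \dfrac{p^2}{2m}} = -\frac{\omega_p - m}{\omega_p + m} . \tag{4.132}
\]
Then
\[
\Psi^{(-)} = A^{(-)} \begin{pmatrix} m - \omega_p \\ m + \omega_p \end{pmatrix} e^{i\omega_p t + i\mathbf{p}\cdot\mathbf{x}} . \tag{4.133}
\]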
The normalization factor A(−) will be discussed below.
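The component ratios of Eqs. (4.129) and (4.132) can be cross-checked by applying the matrix of Eq. (4.107) to candidate amplitude vectors. The sketch below uses the vectors $(\omega_p + m,\ m - \omega_p)$ and $(m - \omega_p,\ m + \omega_p)$, which reproduce those ratios; the overall sign of each vector, the helper names and the sample values are assumptions of this illustration.

```python
import math

def two_component_hamiltonian(p2, m):
    """Matrix H of Eq. (4.107) for a fixed numerical value of p^2."""
    k = p2 / (2.0 * m)
    return [[k + m, k], [-k, -k - m]]

def apply(h, v):
    """Matrix-vector product for a 2x2 matrix and a 2-component vector."""
    return [h[0][0] * v[0] + h[0][1] * v[1], h[1][0] * v[0] + h[1][1] * v[1]]

def residual(h, v, energy):
    """Largest component of H v - E v; zero when v is an eigenvector."""
    return max(abs(x - energy * y) for x, y in zip(apply(h, v), v))

p2, m = 2.5, 0.8  # sample values (illustrative choice)
omega_p = math.sqrt(p2 + m * m)
H = two_component_hamiltonian(p2, m)

psi_plus = [omega_p + m, m - omega_p]   # ratio -(omega_p + m)/(omega_p - m), Eq. (4.129)
psi_minus = [m - omega_p, m + omega_p]  # ratio -(omega_p - m)/(omega_p + m), Eq. (4.132)

res_plus = residual(H, psi_plus, +omega_p)
res_minus = residual(H, psi_minus, -omega_p)
```

Both residuals vanish up to rounding, so the two vectors solve $(E - H)\Psi = 0$ with $E = +\omega_p$ and $E = -\omega_p$ respectively.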

Normalization
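For the normalization of the two solutions we impose
\begin{align}
\int d^3X\, \bar\Psi_p^{(+)} \Psi_{p'}^{(+)} &= \delta(\mathbf{p} - \mathbf{p}') , \tag{4.134} \\
\int d^3X\, \bar\Psi_p^{(-)} \Psi_{p'}^{(-)} &= -\delta(\mathbf{p} - \mathbf{p}') . \tag{4.135}
\end{align}
We have
\begin{align}
\int d^3X\, \bar\Psi_p^{(+)} \Psi_{p'}^{(+)} &= \int d^3X\, \Psi_p^{(+)\dagger} \tau^3 \Psi_{p'}^{(+)} \tag{4.136} \\
&= \left( A^{(+)} \right)^2 e^{-i(\omega_{p'} - \omega_p)t} \left[ (m + \omega_p)(m + \omega_{p'}) - (\omega_p - m)(\omega_{p'} - m) \right] \int d^3X\, e^{i(\mathbf{p}' - \mathbf{p})\cdot\mathbf{x}} \nonumber \\
&= \left( A^{(+)} \right)^2 (2\pi)^3\, 4m\omega_p\, \delta(\mathbf{p} - \mathbf{p}') \tag{4.137}
\end{align}
and therefore (see footnote 5)
\[
A^{(+)} = \frac{1}{\sqrt{(2\pi)^3\, 4m\omega_p}} . \tag{4.138}
\]
For $A^{(-)}$ we find the same expression
\[
A^{(-)} = \frac{1}{\sqrt{(2\pi)^3\, 4m\omega_p}} . \tag{4.139}
\]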
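The factor $4m\omega_p$ in Eq. (4.137) is the value of $\Psi^\dagger \tau^3 \Psi$ on the amplitude vector at $\mathbf{p} = \mathbf{p}'$. A short numerical sketch follows; since only the square of the second component enters, the result does not depend on its sign convention. The helper names and sample values are assumptions of this illustration.

```python
import math

def kg_norm(psi):
    """Psi^dagger tau^3 Psi for a two-component amplitude vector."""
    return abs(psi[0]) ** 2 - abs(psi[1]) ** 2

def amplitude(omega_p, m):
    """Normalization constant of Eqs. (4.138) and (4.139)."""
    return 1.0 / math.sqrt((2.0 * math.pi) ** 3 * 4.0 * m * omega_p)

p2, m = 2.5, 0.8  # sample values (illustrative choice)
omega_p = math.sqrt(p2 + m * m)

norm_plus = kg_norm((omega_p + m, m - omega_p))   # expected: +4 m omega_p
norm_minus = kg_norm((m - omega_p, m + omega_p))  # expected: -4 m omega_p

# Coefficient multiplying delta(p - p') in Eq. (4.137) once A is inserted.
delta_coefficient = amplitude(omega_p, m) ** 2 * (2.0 * math.pi) ** 3 * norm_plus
```

The opposite sign of `norm_minus` is what produces the $-\delta(\mathbf{p} - \mathbf{p}')$ of Eq. (4.135).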
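Finally:
\begin{align}
\Psi^{(+)} &= \frac{1}{\sqrt{(2\pi)^3\, 2\omega_p}}\, \frac{m + \omega_p}{\sqrt{2m}} \begin{pmatrix} 1 \\ -\dfrac{\omega_p - m}{\omega_p + m} \end{pmatrix} e^{-i\omega_p t + i\mathbf{p}\cdot\mathbf{x}} \tag{4.140} \\
&= \frac{1}{\sqrt{(2\pi)^3\, 2\omega_p}}\, \frac{m + \omega_p}{\sqrt{2m}} \begin{pmatrix} 1 \\ -\dfrac{p^2}{(\omega_p + m)^2} \end{pmatrix} e^{-i\omega_p t + i\mathbf{p}\cdot\mathbf{x}} , \tag{4.141} \\
\Psi^{(-)} &= \frac{1}{\sqrt{(2\pi)^3\, 2\omega_p}}\, \frac{m + \omega_p}{\sqrt{2m}} \begin{pmatrix} -\dfrac{\omega_p - m}{\omega_p + m} \\ 1 \end{pmatrix} e^{i\omega_p t + i\mathbf{p}\cdot\mathbf{x}} \tag{4.142} \\
&= \frac{1}{\sqrt{(2\pi)^3\, 2\omega_p}}\, \frac{m + \omega_p}{\sqrt{2m}} \begin{pmatrix} -\dfrac{p^2}{(\omega_p + m)^2} \\ 1 \end{pmatrix} e^{i\omega_p t + i\mathbf{p}\cdot\mathbf{x}} . \tag{4.143}
\end{align}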

Charge conjugation
The Klein-Gordon equation, as the other covariant equations, has a symmetry related to the existence
of both positive and negative energy solutions. These solutions can be transformed into each other by
charge conjugation:
\[
\phi \to \phi^C = \tau^1 \phi^* , \tag{4.144}
\]
such that $(\phi^C)^C = \tau^1 (\tau^1 \phi^*)^* = \phi$.
 

$\phi^C$ satisfies the same free equation as $\phi$ (see footnote 6). In fact, taking the complex conjugate of Eq. (4.106), with Eq. (4.107), we find
\[
-i\, \frac{\partial}{\partial t} \phi^* = \left[ \left( \frac{p^2}{2m} + m \right) \tau^3 + (-i)\, \frac{p^2}{2m}\, (-\tau^2) \right] \phi^* . \tag{4.145}
\]
If we now multiply from the left by $\tau^1$, recalling that $\tau^1$ anti-commutes with $\tau^2$ and $\tau^3$, we have
\[
i\, \frac{\partial}{\partial t} \phi^C = \left[ \left( \frac{p^2}{2m} + m \right) \tau^3 + i\, \frac{p^2}{2m}\, \tau^2 \right] \phi^C . \tag{4.146}
\]
5 We choose $A^{(+)}$ real.
6 When we introduce the electromagnetic interaction, we will see that $\phi^C$ is a solution of the equation in which the electric charge changes sign.

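The charge-conjugated field $\phi^C$ has a charge which is opposite to the charge of $\phi$. In fact
\[
\phi^C = \tau^1 \phi^* , \qquad (\phi^C)^\dagger = (\phi^*)^\dagger (\tau^1)^\dagger = \phi^t\, \tau^1 . \tag{4.147}
\]
Therefore
\begin{align}
Q' &= \int d^3X\, (\phi^C)^\dagger \tau^3 \phi^C = \int d^3X\, \phi^t\, \tau^1 \tau^3 \tau^1 \phi^* \tag{4.148} \\
&= -\int d^3X \left( |\phi_+|^2 - |\phi_-|^2 \right) = -Q . \tag{4.149}
\end{align}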

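Concerning the plane-wave solutions, we find that charge conjugation connects the negative-energy solution to the positive-energy one in the following way:
\begin{align}
\left( \Psi_{-p}^{(-)} \right)^C &= \tau^1\, \frac{1}{\sqrt{(2\pi)^3\, 2\omega_p}}\, \frac{m + \omega_p}{\sqrt{2m}} \begin{pmatrix} -\dfrac{p^2}{(\omega_p + m)^2} \\ 1 \end{pmatrix} e^{-i\omega_p t + i\mathbf{p}\cdot\mathbf{x}} \tag{4.150} \\
&= \frac{1}{\sqrt{(2\pi)^3\, 2\omega_p}}\, \frac{m + \omega_p}{\sqrt{2m}} \begin{pmatrix} 1 \\ -\dfrac{p^2}{(\omega_p + m)^2} \end{pmatrix} e^{-i\omega_p t + i\mathbf{p}\cdot\mathbf{x}} = \Psi_p^{(+)} . \tag{4.151}
\end{align}
Therefore, charge conjugation turns a negative-energy state of momentum $-\mathbf{p}$ into a positive-energy one of opposite charge and momentum $+\mathbf{p}$. If we “connect” $\phi$ with a “particle” state, $\phi^C$ will be connected with an anti-particle state.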
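The statement that $\phi^C$ obeys the same free equation amounts to the matrix identity $\tau^1 H^* \tau^1 = -H$ for the matrix of Eq. (4.107). A minimal numerical sketch of this identity, and of the fact that $\tau^1$ exchanges the two components of an amplitude vector, is given below; the helper names and the sample values are assumptions of this illustration.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def conjugate(a):
    """Entry-wise complex conjugate of a 2x2 matrix."""
    return [[a[i][j].conjugate() for j in range(2)] for i in range(2)]

def two_component_hamiltonian(p2, m):
    """Matrix H of Eq. (4.107) for a fixed numerical value of p^2."""
    k = p2 / (2.0 * m)
    return [[k + m, k], [-k, -k - m]]

TAU1 = [[0, 1], [1, 0]]

p2, m = 2.5, 0.8  # sample values (illustrative choice)
H = two_component_hamiltonian(p2, m)

conjugated = matmul(TAU1, matmul(conjugate(H), TAU1))  # tau^1 H^* tau^1
minus_H = [[-H[i][j] for j in range(2)] for i in range(2)]

def charge_conjugate_amplitude(v):
    """tau^1 acting on the complex conjugate of a two-component amplitude."""
    return [v[1].conjugate(), v[0].conjugate()]

swapped = charge_conjugate_amplitude([1.0, 3.0])
```

Since complex conjugation flips the sign of $i\partial_t$ and $\tau^1 H^* \tau^1$ flips the sign of $H$, the two sign changes compensate and Eq. (4.146) has the same form as Eq. (4.106).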

4.2 Quantization of the Klein-Gordon field


We worked out the theory of the classical φ(X) field and we analysed the system using the lagrangian
and the hamiltonian formalism. The first gives us the possibility to find the conserved quantities of
the physical system, as the charges of the symmetries of the corresponding action. The second is the
correct framework for the canonical quantization.

4.2.1 Real field


Just to recap, the real field, φ(X), has to satisfy the Klein-Gordon equation

\[
(\partial^2 + m^2)\, \phi(X) = 0 , \tag{4.152}
\]
which is the Euler-Lagrange equation of the following lagrangian density
\[
\mathcal{L} = \frac{1}{2} \left( \partial_\mu\phi\, \partial^\mu\phi - m^2\phi^2 \right) . \tag{4.153}
\]
The lagrangian density is invariant under Poincaré transformations and, therefore, it follows that the
four-momentum and the generalized angular momentum (in particular the orbital angular momentum,
since the real scalar field does not have spin) are conserved.
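The energy density coming from Noether's theorem is
\[
\mathcal{H} = T^{00} = \frac{1}{2} \left[ \dot\phi^2 + (\nabla\phi)^2 + m^2\phi^2 \right] = \frac{1}{2} \left[ \pi^2 + (\nabla\phi)^2 + m^2\phi^2 \right] , \tag{4.154}
\]
since
\[
\pi(X) = \frac{\partial\mathcal{L}}{\partial\dot\phi} = \dot\phi \tag{4.155}
\]
and therefore it coincides with the hamiltonian density that can be found via a Legendre transformation
\[
\mathcal{H} = \pi\dot\phi - \mathcal{L} . \tag{4.156}
\]
Note that $\mathcal{H}$ is positive definite. The hamiltonian can be diagonalized in terms of normal modes if we expand the solution of Eq. (4.152) in plane waves
\[
\phi(X) = \int \frac{d^3p}{\sqrt{(2\pi)^3\, 2\omega_p}} \left[ a(p)\, e^{-ip_\mu X^\mu} + a^*(p)\, e^{ip_\mu X^\mu} \right] . \tag{4.157}
\]
We have
\[
H = \int d^3p\, \frac{\omega_p}{2} \left[ a(p)a^*(p) + a^*(p)a(p) \right] , \tag{4.158}
\]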
that has the form of an infinite sum of harmonic oscillators. This suggests the right way for the
quantization of this system. We will have to promote the field (4.157) from a classical function to an
operator. In order to do that, we can only interpret the coefficients a(p) and a∗ (p) as operators

a(p) → â(p) , a∗ (p) → ↠(p) (4.159)

and we will have to check that these operators are indeed creation-annihilation operators, as we can
understand from the form of the hamiltonian.
Considering the correspondence with the non-relativistic point-like particle, the degree of freedom
that was describing the position at a certain time in that framework, corresponds now to the field in
a certain point at the time t:
q(t) → φ(x, t) . (4.160)
The conjugated momentum, p(t), corresponds to the conjugated momentum π(x, t).
Note that the description in terms of the fields is given naturally in a time-dependent framework.
When we promote the field to an operator, this operator will have to be treated in Heisenberg picture.
The canonical quantization relations
\[
[q_i(t), p_j(t)] = i\, \delta_{ij} , \qquad [q_i(t), q_j(t)] = [p_i(t), p_j(t)] = 0 , \tag{4.161}
\]
will now have to involve the fields. We have to impose the following equal-time quantization relations

[φ(x, t), π(y, t)] = i δ(x − y) , (4.162)


[φ(x, t), φ(y, t)] = [π(x, t), π(y, t)] = 0 , (4.163)

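where now the field is
\[
\phi(X) = \int \frac{d^3p}{\sqrt{(2\pi)^3\, 2\omega_p}} \left[ \hat a(p)\, e^{-iP_\mu X^\mu} + \hat a^\dagger(p)\, e^{iP_\mu X^\mu} \right] \tag{4.164}
\]
and $\pi(X) = \dot\phi(X)$.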
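We have to check that the quantization relations (4.162, 4.163) imply that $\hat a^\dagger$ and $\hat a$ are indeed creation-annihilation operators, and that the hamiltonian can be written in terms of the number operator.
In order to do that, we recall that (see footnote 7)
\begin{align}
\hat a(p) &= (f_p^+, \phi) = i \int d^3X\, (f_p^+)^* \overleftrightarrow{\partial_0}\, \phi , \tag{4.168} \\
\hat a^\dagger(p) &= -(f_p^-, \phi) = -i \int d^3X\, (f_p^-)^* \overleftrightarrow{\partial_0}\, \phi . \tag{4.169}
\end{align}
7 This can be checked by direct inspection, using the form of $f^+$ and the field in normal modes. In fact
\begin{align}
(f_p^+, \phi) &= i \int d^3X\, (f_p^+)^* \overleftrightarrow{\partial_0}\, \phi \tag{4.165} \\
&= i \int d^3X \int \frac{d^3p'}{(2\pi)^3 \sqrt{4\omega_p\omega_{p'}}}\, e^{iP_\mu X^\mu} \overleftrightarrow{\partial_0} \left[ \hat a(p')\, e^{-iP'_\mu X^\mu} + \hat a^\dagger(p')\, e^{iP'_\mu X^\mu} \right] \tag{4.166} \\
&= \dots = \hat a(p) . \tag{4.167}
\end{align}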

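Therefore
\begin{align}
\hat a(p) &= i \int d^3X\, \frac{e^{iP_\mu X^\mu}}{\sqrt{(2\pi)^3\, 2\omega_p}}\, \overleftrightarrow{\partial_0}\, \phi \tag{4.170} \\
&= i \int \frac{d^3X}{\sqrt{(2\pi)^3\, 2\omega_p}} \left( e^{iP_\mu X^\mu} \dot\phi - i\omega_p\, e^{iP_\mu X^\mu} \phi \right) \tag{4.171} \\
&= \int \frac{d^3X}{\sqrt{(2\pi)^3\, 2\omega_p}} \left( \omega_p \phi + i\dot\phi \right) e^{iP_\mu X^\mu} , \tag{4.172} \\
\hat a^\dagger(p) &= \int \frac{d^3X}{\sqrt{(2\pi)^3\, 2\omega_p}} \left( \omega_p \phi - i\dot\phi \right) e^{-iP_\mu X^\mu} . \tag{4.173}
\end{align}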

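With these expressions we can construct the commutator $[\hat a(p), \hat a^\dagger(p')]$ and, using the quantization relations for the fields, prove that $[\hat a(p), \hat a^\dagger(p')] = \delta(\mathbf{p} - \mathbf{p}')$. We have (recall that $X^0 = Y^0 = t$)
\begin{align}
[\hat a(p), \hat a^\dagger(p')] &= \int \frac{d^3X\, d^3Y}{(2\pi)^3 \sqrt{4\omega_p\omega_{p'}}} \Big[ \left( \omega_p \phi(X) + i\dot\phi(X) \right) e^{iP_\mu X^\mu} \left( \omega_{p'} \phi(Y) - i\dot\phi(Y) \right) e^{-iP'_\mu Y^\mu} \nonumber \\
&\qquad - \left( \omega_{p'} \phi(Y) - i\dot\phi(Y) \right) e^{-iP'_\mu Y^\mu} \left( \omega_p \phi(X) + i\dot\phi(X) \right) e^{iP_\mu X^\mu} \Big] \tag{4.174} \\
&= \int \frac{d^3X\, d^3Y}{(2\pi)^3 \sqrt{4\omega_p\omega_{p'}}}\, e^{-iP'_\mu Y^\mu + iP_\mu X^\mu} \Big[ \left( \omega_p \phi(X) + i\dot\phi(X) \right) \left( \omega_{p'} \phi(Y) - i\dot\phi(Y) \right) \nonumber \\
&\qquad - \left( \omega_{p'} \phi(Y) - i\dot\phi(Y) \right) \left( \omega_p \phi(X) + i\dot\phi(X) \right) \Big] \tag{4.175} \\
&= \int \frac{d^3X\, d^3Y}{(2\pi)^3 \sqrt{4\omega_p\omega_{p'}}}\, e^{-i(\omega_{p'} - \omega_p)t}\, e^{i\mathbf{p}'\cdot\mathbf{y} - i\mathbf{p}\cdot\mathbf{x}} \Big\{ \omega_p\omega_{p'} [\phi(\mathbf{x},t), \phi(\mathbf{y},t)] - i\omega_p [\phi(\mathbf{x},t), \dot\phi(\mathbf{y},t)] \nonumber \\
&\qquad + i\omega_{p'} [\dot\phi(\mathbf{x},t), \phi(\mathbf{y},t)] + [\dot\phi(\mathbf{x},t), \dot\phi(\mathbf{y},t)] \Big\} \tag{4.176} \\
&= \int \frac{d^3X\, d^3Y}{(2\pi)^3 \sqrt{4\omega_p\omega_{p'}}}\, e^{-i(\omega_{p'} - \omega_p)t}\, e^{i\mathbf{p}'\cdot\mathbf{y} - i\mathbf{p}\cdot\mathbf{x}}\, (\omega_p + \omega_{p'})\, \delta(\mathbf{x} - \mathbf{y}) \tag{4.177} \\
&= \int \frac{d^3X}{(2\pi)^3 \sqrt{4\omega_p\omega_{p'}}}\, e^{-i(\omega_{p'} - \omega_p)t}\, e^{-i(\mathbf{p} - \mathbf{p}')\cdot\mathbf{x}}\, (\omega_p + \omega_{p'}) \tag{4.178} \\
&= \frac{1}{\sqrt{4\omega_p\omega_{p'}}}\, e^{-i(\omega_{p'} - \omega_p)t}\, (\omega_p + \omega_{p'})\, \delta(\mathbf{p} - \mathbf{p}') \tag{4.179} \\
&= \delta(\mathbf{p} - \mathbf{p}') . \tag{4.180}
\end{align}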

In the same way we find


[â(p), â(p′ )] = [↠(p), ↠(p′ )] = 0 . (4.181)
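Moreover, using the expression of the fields we find
\begin{align}
H &= \int d^3X\, \frac{1}{2} \left[ \dot\phi^2 + (\nabla\phi)^2 + m^2\phi^2 \right] \tag{4.182} \\
&= \int d^3p\, \frac{\omega_p}{2} \left[ \hat a(p)\hat a^\dagger(p) + \hat a^\dagger(p)\hat a(p) \right] , \tag{4.183}
\end{align}
that in normal ordering gives
\[
:\!H\!: \; = \int d^3p\, \omega_p\, \hat a^\dagger(p)\hat a(p) . \tag{4.184}
\]
The operators $\hat a(p)$ and $\hat a^\dagger(p)$ are therefore, indeed, annihilation and creation operators. They act on the Fock space, in such a way that
\[
\hat a(p)|0\rangle = 0 . \tag{4.185}
\]
In this way, the energy of the vacuum is 0:
\[
:\!H\!: |0\rangle = \int d^3p\, \omega_p\, \hat a^\dagger(p)\hat a(p)\, |0\rangle = 0 . \tag{4.186}
\]
If we act once with $\hat a^\dagger(p)$ on the vacuum we find a one-particle state with definite energy (and momentum):
\[
\hat a^\dagger(p)|0\rangle = |p\rangle , \tag{4.187}
\]
such that
\begin{align}
:\!H\!: \hat a^\dagger(p)|0\rangle &= \int d^3p'\, \omega_{p'}\, \hat a^\dagger(p')\hat a(p')\, \hat a^\dagger(p)|0\rangle \tag{4.188} \\
&= \int d^3p'\, \omega_{p'}\, \hat a^\dagger(p')\, \delta(\mathbf{p} - \mathbf{p}')|0\rangle + \int d^3p'\, \omega_{p'}\, \hat a^\dagger(p')\hat a^\dagger(p)\hat a(p')|0\rangle \tag{4.189} \\
&= \omega_p\, \hat a^\dagger(p)|0\rangle . \tag{4.190}
\end{align}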

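Therefore, $\hat a^\dagger(p)|0\rangle$ is an eigenstate of the hamiltonian with energy $\omega_p$. If we consider, for example, the state
\[
|p_1, p_2\rangle = \hat a^\dagger(p_1)\hat a^\dagger(p_2)|0\rangle , \tag{4.191}
\]
we find:
\begin{align}
:\!H\!: \hat a^\dagger(p_1)\hat a^\dagger(p_2)|0\rangle &= \int d^3p\, \omega_p\, \hat a^\dagger(p)\hat a(p)\, \hat a^\dagger(p_1)\hat a^\dagger(p_2)|0\rangle \tag{4.192} \\
&= \int d^3p\, \omega_p\, \hat a^\dagger(p) \left[ \hat a^\dagger(p_1)\hat a(p) + \delta(\mathbf{p} - \mathbf{p}_1) \right] \hat a^\dagger(p_2)|0\rangle \tag{4.193} \\
&= \int d^3p\, \omega_p\, \hat a^\dagger(p)\hat a^\dagger(p_1)\hat a(p)\hat a^\dagger(p_2)|0\rangle + \omega_{p_1}\, \hat a^\dagger(p_1)\hat a^\dagger(p_2)|0\rangle \tag{4.194} \\
&= \int d^3p\, \omega_p\, \hat a^\dagger(p)\hat a^\dagger(p_1) \left[ \hat a^\dagger(p_2)\hat a(p) + \delta(\mathbf{p} - \mathbf{p}_2) \right] |0\rangle + \omega_{p_1}\, \hat a^\dagger(p_1)\hat a^\dagger(p_2)|0\rangle \tag{4.195} \\
&= (\omega_{p_1} + \omega_{p_2})\, \hat a^\dagger(p_1)\hat a^\dagger(p_2)|0\rangle . \tag{4.196}
\end{align}
Therefore, $\hat a^\dagger(p_1)\hat a^\dagger(p_2)|0\rangle$ is again an eigenstate of $:\!H\!:$ with energy $(\omega_{p_1} + \omega_{p_2})$, and so on.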
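The algebra above can be mimicked in a toy Fock space with a finite number of discrete modes, where the Dirac delta of the commutator is replaced by a Kronecker delta, $[a_k, a_l^\dagger] = \delta_{kl}$. This discretization, the representation of states as dictionaries of occupation numbers, and all names below are assumptions of the sketch, not the construction used in the notes.

```python
import math

def create(k, state):
    """Apply a_k^dagger to a state {occupation tuple: amplitude}."""
    out = {}
    for occ, amp in state.items():
        new = list(occ)
        new[k] += 1
        key = tuple(new)
        out[key] = out.get(key, 0.0) + amp * math.sqrt(occ[k] + 1)
    return out

def annihilate(k, state):
    """Apply a_k to a state; basis states with n_k = 0 are annihilated."""
    out = {}
    for occ, amp in state.items():
        if occ[k] == 0:
            continue
        new = list(occ)
        new[k] -= 1
        key = tuple(new)
        out[key] = out.get(key, 0.0) + amp * math.sqrt(occ[k])
    return out

def normal_ordered_hamiltonian(omegas, state):
    """Apply :H: = sum_k omega_k a_k^dagger a_k to a state."""
    out = {}
    for k, w in enumerate(omegas):
        for occ, amp in create(k, annihilate(k, state)).items():
            out[occ] = out.get(occ, 0.0) + w * amp
    return out

omegas = [1.0, 1.5, 2.25]     # energies of three sample modes (illustrative)
vacuum = {(0, 0, 0): 1.0}

two_particle = create(0, create(2, vacuum))           # analogue of Eq. (4.191)
h_on_vacuum = normal_ordered_hamiltonian(omegas, vacuum)
h_on_two_particle = normal_ordered_hamiltonian(omegas, two_particle)
```

In this toy setting `h_on_vacuum` is the empty state (zero vector) and `h_on_two_particle` is the two-particle state multiplied by the sum of the two mode energies, mirroring Eqs. (4.186) and (4.196).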
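Let us look at the momentum of these states. From Noether's theorem we have (see footnote 8)
\begin{align}
P^i &= \int d^3X\, T^{0i} = \int d^3X\, \dot\phi\, \partial^i\phi \tag{4.197} \\
&= \int d^3X \int \frac{d^3p\, d^3p'}{(2\pi)^3 \sqrt{4\omega_p\omega_{p'}}}\, (-i\omega_p) \left[ a(p)\, e^{-iP_\mu X^\mu} - a^\dagger(p)\, e^{iP_\mu X^\mu} \right] (-ip'^i) \left[ a(p')\, e^{-iP'_\mu X^\mu} - a^\dagger(p')\, e^{iP'_\mu X^\mu} \right] \tag{4.198} \\
&= -\int \frac{d^3p\, d^3p'}{(2\pi)^3 \sqrt{4\omega_p\omega_{p'}}}\, \omega_p\, p'^i \int d^3X \Big[ \left( a(p)a(p')\, e^{-i(P_\mu + P'_\mu)X^\mu} + a^\dagger(p)a^\dagger(p')\, e^{i(P_\mu + P'_\mu)X^\mu} \right) \nonumber \\
&\qquad - \left( a(p)a^\dagger(p')\, e^{-i(P_\mu - P'_\mu)X^\mu} + a^\dagger(p)a(p')\, e^{i(P_\mu - P'_\mu)X^\mu} \right) \Big] . \tag{4.199}
\end{align}
Integrating in $d^3X$:
\[
P^i = \int \frac{d^3p\, d^3p'}{\sqrt{4\omega_p\omega_{p'}}}\, \omega_p\, p'^i \left[ a(p)a^\dagger(p') + a^\dagger(p)a(p') \right] \delta(\mathbf{p} - \mathbf{p}') , \tag{4.200}
\]
and integrating in $d^3p'$:
\[
P^i = \frac{1}{2} \int d^3p\, p^i \left[ a(p)a^\dagger(p) + a^\dagger(p)a(p) \right] , \tag{4.201}
\]
8 From now on we will omit the “hat” on the creation/annihilation operators.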
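where, from (4.199) to (4.200), we considered that
\[
\int \frac{d^3p\, d^3p'}{\sqrt{4\omega_p\omega_{p'}}}\, \omega_p\, p'^i\, a(p)a(p')\, e^{-i(\omega_p + \omega_{p'})t}\, \delta(\mathbf{p} + \mathbf{p}') = -\frac{1}{2} \int d^3p\, p^i\, a(p)a(-p)\, e^{-2i\omega_p t} , \tag{4.202}
\]
but
\begin{align}
\frac{1}{2} \int d^3p\, p^i\, a(p)a(-p)\, e^{-2i\omega_p t} &= \frac{1}{2} \int d^3p\, p^i\, a(-p)a(p)\, e^{-2i\omega_p t} \qquad \text{(since } [a(p), a(-p)] = 0 \text{)} \tag{4.203} \\
&= -\frac{1}{2} \int d^3p\, p^i\, a(p)a(-p)\, e^{-2i\omega_p t} , \tag{4.204}
\end{align}
where in the last step we changed $\mathbf{p} \to -\mathbf{p}$ (4.205). Therefore
\[
\frac{1}{2} \int d^3p\, p^i\, a(p)a(-p)\, e^{-2i\omega_p t} = 0 \tag{4.206}
\]
and, finally
\[
\int \frac{d^3p\, d^3p'}{\sqrt{4\omega_p\omega_{p'}}}\, \omega_p\, p'^i\, a(p)a(p')\, e^{-i(\omega_p + \omega_{p'})t}\, \delta(\mathbf{p} + \mathbf{p}') = 0 . \tag{4.207}
\]
The same is true for
\[
\int \frac{d^3p\, d^3p'}{\sqrt{4\omega_p\omega_{p'}}}\, \omega_p\, p'^i\, a^\dagger(p)a^\dagger(p')\, e^{i(\omega_p + \omega_{p'})t}\, \delta(\mathbf{p} + \mathbf{p}') = 0 . \tag{4.208}
\]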

In normal ordering, then, we have
\[
:\!P^i\!: \; = \int d^3p\, p^i\, a^\dagger(p)a(p) , \tag{4.209}
\]
that commutes with the hamiltonian (as it should) and such that
\begin{align}
:\!P^i\!: a^\dagger(p)|0\rangle &= p^i\, a^\dagger(p)|0\rangle , \tag{4.210} \\
:\!P^i\!: a^\dagger(p_1)a^\dagger(p_2)|0\rangle &= (p_1^i + p_2^i)\, a^\dagger(p_1)a^\dagger(p_2)|0\rangle , \tag{4.211}
\end{align}
and so on.
These results corroborate the interpretation of the state $|p_1, p_2\rangle = a^\dagger(p_1)a^\dagger(p_2)|0\rangle$ as a two-particle state, with definite energy, which is the sum of the energies of the one-particle state $|p_1\rangle$ and the one-particle state $|p_2\rangle$, and with definite momentum, which is the vector sum of the momenta $\mathbf{p}_1$ and $\mathbf{p}_2$.

One- and two-particle states. Bosons


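The state $|p\rangle = a^\dagger(p)|0\rangle$ is a plane wave, a state with definite energy and momentum and therefore totally delocalized. A one-particle state will be described as a superposition of plane waves
\[
|\psi\rangle = \int d^3p\, \psi(p)\, a^\dagger(p)|0\rangle . \tag{4.212}
\]
The function $\psi(p)$ is the actual wave function in the $p$-representation. In fact
\begin{align}
\langle p'|\psi\rangle &= \langle 0|a(p') \int d^3p\, \psi(p)\, a^\dagger(p)|0\rangle \tag{4.213} \\
&= \int d^3p\, \psi(p)\, \langle 0|a(p')a^\dagger(p)|0\rangle \tag{4.214} \\
&= \int d^3p\, \psi(p)\, \delta(\mathbf{p} - \mathbf{p}') = \psi(p') \tag{4.215}
\end{align}
gives the probability amplitude to have a particle with momentum $\mathbf{p}'$. We have
\begin{align}
\langle\psi|\psi\rangle &= \int d^3p\, \psi^*(p)\, \langle 0|a(p) \int d^3q\, \psi(q)\, a^\dagger(q)|0\rangle \tag{4.216} \\
&= \int d^3p\, d^3q\, \psi^*(p)\psi(q)\, \delta(\mathbf{p} - \mathbf{q}) \tag{4.217} \\
&= \int d^3p\, |\psi(p)|^2 . \tag{4.218}
\end{align}
Therefore, $\psi(p)$ should be normalizable and we can choose $\langle\psi|\psi\rangle = 1$.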


Let us now consider a two-particle state. As in the previous case, we can write
\[
|\psi\rangle = \int d^3p_1\, d^3p_2\, \psi(p_1, p_2)\, a^\dagger(p_1)a^\dagger(p_2)|0\rangle . \tag{4.219}
\]
Due to the commutation relations we have
\[
|p_1, p_2\rangle = a^\dagger(p_1)a^\dagger(p_2)|0\rangle = a^\dagger(p_2)a^\dagger(p_1)|0\rangle = |p_2, p_1\rangle . \tag{4.220}
\]
This means that in the integral (4.219) the function $\psi(p_1, p_2)$ should be symmetric under the exchange $1 \leftrightarrow 2$ (or, in other words, only the symmetric part of $\psi(p_1, p_2)$ gives a non-vanishing integral).
It follows that the commutation relations that we used for the quantization of the KG field give rise to bosonic particles.

4.2.2 Complex field


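Let us now discuss the quantization of the complex field $\phi(X)$ with lagrangian density
\[
\mathcal{L} = \partial_\mu\phi^* \partial^\mu\phi - m^2\phi^*\phi , \tag{4.221}
\]
that can be found from the lagrangian of two real fields, $\phi_1$ and $\phi_2$, degenerate in mass, rotated in the complex plane as
\begin{align}
\phi &= \frac{\phi_1 + i\phi_2}{\sqrt{2}} , \tag{4.222} \\
\phi^* &= \frac{\phi_1 - i\phi_2}{\sqrt{2}} . \tag{4.223}
\end{align}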

We promote the fields to operators, and then we will have $\phi$ and $\phi^\dagger$. We can find the expression of the fields in terms of creation-annihilation operators using Eqs. (4.222, 4.223), in which we substitute the expressions for the two real fields $\phi_1$ and $\phi_2$. We find
\begin{align}
\phi(X) &= \int d^3p \left[ f_p^+\, \frac{a_1(p) + i a_2(p)}{\sqrt{2}} + f_p^-\, \frac{a_1^\dagger(p) + i a_2^\dagger(p)}{\sqrt{2}} \right] \nonumber \\
&= \int d^3p \left[ f_p^+\, a(p) + f_p^-\, b^\dagger(p) \right] , \tag{4.224} \\
\phi^\dagger(X) &= \int d^3p \left[ f_p^+\, \frac{a_1(p) - i a_2(p)}{\sqrt{2}} + f_p^-\, \frac{a_1^\dagger(p) - i a_2^\dagger(p)}{\sqrt{2}} \right] \nonumber \\
&= \int d^3p \left[ f_p^+\, b(p) + f_p^-\, a^\dagger(p) \right] , \tag{4.225}
\end{align}
where we defined
\begin{align}
a(p) &= \frac{a_1(p) + i a_2(p)}{\sqrt{2}} , \tag{4.226} \\
b(p) &= \frac{a_1(p) - i a_2(p)}{\sqrt{2}} . \tag{4.227}
\end{align}
Note that, trivially, $b^\dagger(p) \neq a^\dagger(p)$, as it should be, since the field is no longer hermitian.
Interpreting $a(p)$, $a^\dagger(p)$ and $b(p)$, $b^\dagger(p)$ as creation-annihilation operators, we find a spectrum constituted by two kinds of particles: “type a” and “type b” particles. Let us study their quantum numbers.
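For the quantization of the system we have to find the momenta conjugate to $\phi$ and $\phi^\dagger$:
\begin{align}
\pi_\phi &= \frac{\partial\mathcal{L}}{\partial\dot\phi} = \dot\phi^\dagger , \tag{4.228} \\
\pi_{\phi^\dagger} &= \frac{\partial\mathcal{L}}{\partial\dot\phi^\dagger} = \dot\phi , \tag{4.229}
\end{align}
and impose the commutation relations at equal time:
\begin{align}
[\phi(\mathbf{x},t), \dot\phi^\dagger(\mathbf{y},t)] &= [\phi^\dagger(\mathbf{x},t), \dot\phi(\mathbf{y},t)] = i\delta(\mathbf{x} - \mathbf{y}) , \tag{4.230} \\
[\phi(\mathbf{x},t), \phi(\mathbf{y},t)] &= [\phi^\dagger(\mathbf{x},t), \phi^\dagger(\mathbf{y},t)] = [\dot\phi(\mathbf{x},t), \dot\phi^\dagger(\mathbf{y},t)] = \dots = 0 , \tag{4.231}
\end{align}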
where the dots mean “all other combinations”. These quantization rules induce analogous commutation
relations among the operators a(p), a† (p) and b(p), b† (p). In fact we find:

[a(p), a† (p′ )] = [b(p), b† (p′ )] = δ(p − p′ ) , (4.232)

and all the other combinations give zero commutator. It turns out that $a(p)$, $a^\dagger(p)$ and $b(p)$, $b^\dagger(p)$ are indeed creation-annihilation operators, and conserved quantities such as the hamiltonian and the momentum can be written in terms of them.
We have (see footnote 9)
\[
H = \int d^3X \left( \dot\phi^\dagger \dot\phi + \nabla\phi^\dagger \cdot \nabla\phi + m^2 \phi^\dagger\phi \right) . \tag{4.233}
\]
Now, using Eqs. (4.224, 4.225), we find
\[
H = \frac{1}{2} \int d^3p\, \omega_p \left[ a^\dagger(p)a(p) + a(p)a^\dagger(p) + b^\dagger(p)b(p) + b(p)b^\dagger(p) \right] , \tag{4.234}
\]
or
\[
:\!H\!: \; = \int d^3p\, \omega_p \left[ a^\dagger(p)a(p) + b^\dagger(p)b(p) \right] . \tag{4.235}
\]
For the momentum we find an analogous expression:
\[
:\!P^i\!: \; = \int d^3p\, p^i \left[ a^\dagger(p)a(p) + b^\dagger(p)b(p) \right] . \tag{4.236}
\]
9 We can find this relation taking the hamiltonian as the sum of the two hamiltonians of the real fields $\phi_1$ and $\phi_2$ and then rotating with (4.222, 4.223).

Fock space
Since we have two kinds of creation-annihilation operators we have
\[
a(p)|0\rangle = b(p)|0\rangle = 0 , \tag{4.237}
\]
and then we have, for instance, one-particle states of kind a, $a^\dagger(p)|0\rangle$, and of kind b, $b^\dagger(p)|0\rangle$, with definite energy and momentum. Let us see:
\[
:\!H\!: a^\dagger(p)|0\rangle = \int d^3p'\, \omega_{p'} \left[ a^\dagger(p')a(p') + b^\dagger(p')b(p') \right] a^\dagger(p)|0\rangle = \omega_p\, a^\dagger(p)|0\rangle , \tag{4.238}
\]
since the $a$ and $b$ operators commute. Then, we conclude that $a^\dagger(p)|0\rangle$ is an eigenstate of the hamiltonian with energy $\omega_p$. However, we also have
\[
:\!H\!: b^\dagger(p)|0\rangle = \int d^3p'\, \omega_{p'} \left[ a^\dagger(p')a(p') + b^\dagger(p')b(p') \right] b^\dagger(p)|0\rangle = \omega_p\, b^\dagger(p)|0\rangle . \tag{4.239}
\]
Therefore, $b^\dagger(p)|0\rangle$ is also an eigenstate of the hamiltonian with the same energy $\omega_p$.
The same is true for the momentum. We have:
\begin{align}
:\!P^i\!: a^\dagger(p)|0\rangle &= p^i\, a^\dagger(p)|0\rangle , \tag{4.240} \\
:\!P^i\!: b^\dagger(p)|0\rangle &= p^i\, b^\dagger(p)|0\rangle . \tag{4.241}
\end{align}
Then the states $a^\dagger(p)|0\rangle$ and $b^\dagger(p)|0\rangle$ have the same energy and momentum. They are degenerate with respect to these quantum numbers.
However, note that in the complex-field case there is another conserved quantity, which is the
charge (that will be interpreted as the actual electric charge once we will introduce electromagnetic
interactions).
From Noether's theorem we have
\[
Q = i \int d^3X\, \phi^\dagger \overleftrightarrow{\partial_0}\, \phi , \tag{4.242}
\]
and substituting the fields in terms of creation-annihilation operators and performing the integrations, we find
\[
:\!Q\!: \; = \int d^3p \left[ a^\dagger(p)a(p) - b^\dagger(p)b(p) \right] . \tag{4.243}
\]
Therefore, this operator (that commutes with the hamiltonian) resolves the degeneracy, distinguishing
between particles of type a and particles of type b.
In fact, now, we have
\begin{align}
:\!Q\!: a^\dagger(p)|0\rangle &= a^\dagger(p)|0\rangle , \tag{4.244} \\
:\!Q\!: b^\dagger(p)|0\rangle &= -b^\dagger(p)|0\rangle . \tag{4.245}
\end{align}
States of type a are eigenstates of the charge with eigenvalue $+1$, while states of type b have opposite charge ($-1$).
Then, the spectrum is constructed using the following operators: $a^\dagger(p)$ creates a particle state of type a with energy $\omega_p$, momentum $\mathbf{p}$ and charge $+1$, while $a(p)$ annihilates such a state; $b^\dagger(p)$ creates a particle state of type b with energy $\omega_p$, momentum $\mathbf{p}$ and charge $-1$, while $b(p)$ annihilates such a state.
We say that particles of type b are the “anti-particles” of the particles of type a. In the real case, we have $a(p) = b(p)$ and therefore the particle is its own anti-particle.
States a and b appear in the theory in a totally symmetric way. Therefore, the names “particle” and “anti-particle” are totally interchangeable.
Note that the field operator $\phi(X)$ is a linear combination of annihilation operators $a(p)$ and creation operators $b^\dagger(p)$ (and vice versa for $\phi^\dagger(X)$). This suggests a sort of “equivalence” between the creation of a charge $+1$ and the annihilation of a charge $-1$.
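On number eigenstates of a single momentum mode, the normal-ordered operators (4.235), (4.236) and (4.243) simply count quanta. The sketch below tabulates their eigenvalues for a state with $n_a$ quanta of type a and $n_b$ quanta of type b; the class `ModeState` and the function names are hypothetical helpers introduced only for this illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModeState:
    """Basis state |n_a, n_b> of one momentum mode (hypothetical helper)."""
    n_a: int
    n_b: int

def energy(state, omega_p):
    """Eigenvalue of :H: restricted to this mode, cf. Eq. (4.235)."""
    return omega_p * (state.n_a + state.n_b)

def momentum(state, p):
    """Eigenvalue of :P^i: restricted to this mode, cf. Eq. (4.236)."""
    return tuple(c * (state.n_a + state.n_b) for c in p)

def charge(state):
    """Eigenvalue of :Q: restricted to this mode, cf. Eq. (4.243)."""
    return state.n_a - state.n_b

omega_p = 1.5                # sample energy (illustrative choice)
p = (0.5, 0.0, -1.0)         # sample momentum (illustrative choice)

particle = ModeState(n_a=1, n_b=0)      # a^dagger(p)|0>
antiparticle = ModeState(n_a=0, n_b=1)  # b^dagger(p)|0>
pair = ModeState(n_a=1, n_b=1)          # a^dagger(p) b^dagger(p)|0>
```

The one-particle states of type a and b share energy and momentum and differ only in the sign of the charge, while the particle/anti-particle pair is neutral.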

4.2.3 Locality and causality in QFT

4.3 The Dirac Field (classical field)


Let us proceed with the study of the finite-dimensional representations of the Lorentz group. We now consider a field that transforms, under Poincaré transformations, as a spinor in the $(\frac{1}{2}, 0) \oplus (0, \frac{1}{2})$ representation:
\[
\psi(X) \to \psi'(X') = S(\Lambda)\, \psi(X) , \tag{4.246}
\]
where, as we will see below,
\[
S(\Lambda) = e^{\frac{1}{8} [\gamma_\mu, \gamma_\nu]\, \epsilon^{\mu\nu}} . \tag{4.247}
\]

4.3.1 The Dirac equation


Historically, we can look at the Dirac equation as an attempt to overcome the difficulties raised by the Klein-Gordon equation. In particular, we refer to the failure of the probabilistic interpretation of the theory, due to the fact that what should be interpreted as a probability density is not positive definite. This is connected to the fact that the time derivative in the KG equation is of second order.
Therefore, we look for a covariant equation of the kind
\[
i\, \frac{\partial}{\partial t} \psi(X) = H \psi(X) , \tag{4.248}
\]
in which, for covariance, the space derivatives in the hamiltonian have to be of first order as well:
\[
H = \boldsymbol{\alpha} \cdot \mathbf{p} + \beta m . \tag{4.249}
\]
$\alpha^i$ and $\beta$ are four matrices that we will define according to some obvious constraints. Firstly, since the hamiltonian has to be hermitian, we must have
\[
(\alpha^i)^\dagger = \alpha^i , \qquad \beta^\dagger = \beta . \tag{4.250}
\]

We require that:

1. If ψ(X) is a solution of (4.248), it has to be also a solution of the KG equation, since it has to
fulfil the correct relativistic energy-momentum relation (E 2 = p2 + m2 ).

2. The equation should be relativistically covariant.

3. The equation has to give rise to a conserved current, j µ , that has to transform as a four-vector
and such that j 0 is positive definite.

Using (4.249) and the correspondence principle we find the following form for the equation:
\[
i\, \frac{\partial}{\partial t} \psi(X) = \left( -i \boldsymbol{\alpha} \cdot \nabla + \beta m \right) \psi(X) . \tag{4.251}
\]

4.3.2 αi and β matrices



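Applying the operator $i\, \partial/\partial t$ twice, we should recover the KG equation. We have
\[
-\frac{\partial^2}{\partial t^2} \psi(X) = \left( -i \boldsymbol{\alpha} \cdot \nabla + \beta m \right)^2 \psi(X) , \tag{4.252}
\]
where, in components
\begin{align}
\left( -i \boldsymbol{\alpha} \cdot \nabla + \beta m \right)^2 &= -\alpha^i \alpha^j \partial_i \partial_j - im (\alpha^i \beta + \beta \alpha^i) \partial_i + \beta^2 m^2 \tag{4.253} \\
&= -\frac{1}{2} (\alpha^i \alpha^j + \alpha^j \alpha^i) \partial_i \partial_j - im (\alpha^i \beta + \beta \alpha^i) \partial_i + \beta^2 m^2 , \tag{4.254}
\end{align}
since $\partial_i \partial_j$ is symmetric under the exchange $i \leftrightarrow j$ and therefore only the symmetric part of $\alpha^i \alpha^j$ survives in the sum.
The Klein-Gordon equation is given by
\[
\frac{\partial^2}{\partial t^2} \psi = (\nabla^2 - m^2) \psi , \tag{4.255}
\]
and therefore we should have
\begin{align}
\beta^2 &= 1 , \tag{4.256} \\
\alpha^i \beta + \beta \alpha^i &= 0 , \tag{4.257} \\
\frac{1}{2} (\alpha^i \alpha^j + \alpha^j \alpha^i) &= \delta_{ij} . \tag{4.258}
\end{align}
Relations (4.256), (4.257), (4.258) can be written in a more compact way as follows:
\begin{align}
[\alpha^i, \alpha^j]_+ &= 2\delta_{ij} , \tag{4.259} \\
[\alpha^i, \beta]_+ &= 0 , \tag{4.260} \\
\beta^2 &= 1 . \tag{4.261}
\end{align}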
These relations imply the following properties for $\alpha^i$ and $\beta$.
First of all, from Eq. (4.259) it follows that, also for $\alpha^i$, we have
\[
(\alpha^i)^2 = 1 . \tag{4.262}
\]
Therefore, $\alpha^i$ and $\beta$ have real eigenvalues (because they are hermitian) and these have to be $\pm 1$.
Another property is that $\mathrm{tr}\, \alpha^i = \mathrm{tr}\, \beta = 0$. In fact, using Eqs. (4.261, 4.260) we have:
\[
\alpha^i = \beta^2 \alpha^i = -\beta \alpha^i \beta \tag{4.263}
\]
and, by the cyclicity of the trace,
\[
\mathrm{tr}\, \alpha^i = \mathrm{tr}\, (\beta^2 \alpha^i) = -\mathrm{tr}\, (\beta \alpha^i \beta) = -\mathrm{tr}\, (\beta^2 \alpha^i) = -\mathrm{tr}\, \alpha^i , \tag{4.264}
\]
from which
\[
\mathrm{tr}\, \alpha^i = 0 . \tag{4.265}
\]
The same happens for $\beta$.
Since the trace is zero and the eigenvalues are $\pm 1$, $\alpha^i$ and $\beta$ must have even dimensionality. They cannot be 2 × 2 matrices, since we cannot accommodate, in that space, four anticommuting matrices. This can be done, instead, using 4 × 4 matrices.
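A possible representation for $\alpha^i$ and $\beta$ is the so-called Dirac representation:
\[
\beta = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} , \qquad \alpha^i = \begin{pmatrix} 0 & \sigma^i \\ \sigma^i & 0 \end{pmatrix} , \tag{4.266}
\]
where $\sigma^i$ are the Pauli matrices (2 × 2), generators of the $SU(2)$ group:
\[
\sigma^1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , \qquad \sigma^2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} , \qquad \sigma^3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} , \tag{4.267}
\]
such that
\begin{align}
[\sigma^i, \sigma^j] &= 2i\, \epsilon^{ijk} \sigma^k , \tag{4.268} \\
[\sigma^i, \sigma^j]_+ &= 2\delta^{ij} . \tag{4.269}
\end{align}
It is simple to check, by direct inspection, that the matrices in Eq. (4.266) satisfy Eqs. (4.259, 4.260, 4.261, 4.262).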
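The direct inspection can also be carried out mechanically. The sketch below assembles the matrices of Eq. (4.266) from 2 × 2 blocks and evaluates the anticommutators of Eqs. (4.259) to (4.261) together with the traces; the helper names are assumptions of this illustration, and only the Python standard library is used.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def anticommutator(a, b):
    """[a, b]_+ = ab + ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] + ba[i][j] for j in range(n)] for i in range(n)]

def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[0] + tr[0], tl[1] + tr[1], bl[0] + br[0], bl[1] + br[1]]

def scaled_identity(n, c):
    """c times the n x n identity matrix."""
    return [[c if i == j else 0 for j in range(n)] for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

ZERO2 = [[0, 0], [0, 0]]
ONE2 = [[1, 0], [0, 1]]
MINUS_ONE2 = [[-1, 0], [0, -1]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

BETA = block(ONE2, ZERO2, ZERO2, MINUS_ONE2)        # Eq. (4.266)
ALPHA = [block(ZERO2, s, s, ZERO2) for s in SIGMA]  # Eq. (4.266)

alpha_alpha = [[anticommutator(ALPHA[i], ALPHA[j]) for j in range(3)]
               for i in range(3)]
alpha_beta = [anticommutator(a, BETA) for a in ALPHA]
beta_squared = matmul(BETA, BETA)
traces = [trace(a) for a in ALPHA] + [trace(BETA)]
```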

4.3.3 Covariance of the Dirac equation
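The equation in (4.251) is not written in a manifestly covariant form. We introduce the following matrices (the Dirac matrices)
\[
\gamma^0 = \beta , \qquad \gamma^i = \beta \alpha^i , \tag{4.270}
\]
such that we can define $\gamma^\mu = (\gamma^0, \gamma^1, \gamma^2, \gamma^3)$ and write Eq. (4.251) in the following form (see footnote 10)
\[
(i\gamma^\mu \partial_\mu - m)\, \psi(X) = 0 . \tag{4.271}
\]
Using the “slash” notation for a four-vector, $\slashed{a} = \gamma_\mu a^\mu$, we can also write
\[
(i\slashed{\partial} - m)\, \psi(X) = 0 . \tag{4.272}
\]
Consistently with the representation (4.266), we have
\[
\gamma^0 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} , \qquad \gamma^i = \begin{pmatrix} 0 & \sigma^i \\ -\sigma^i & 0 \end{pmatrix} . \tag{4.273}
\]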

Moreover, relations (4.259,4.260,4.261,4.262) can be summarized by the (Clifford) algebra

[γµ , γν ]+ = 2ηµν , (4.274)

with
(γ 0 )2 = 1 , (γ i )2 = −1 , (γ 0 )† = γ 0 , (γ i )† = −γ i . (4.275)
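A small numerical sketch of the Clifford algebra (4.274) and of the hermiticity properties (4.275) in the Dirac representation is given below. The matrices are built as $\gamma^0 = \beta$ and $\gamma^i = \beta\alpha^i$, following Eq. (4.270); the metric signature $\eta = \mathrm{diag}(1, -1, -1, -1)$ and the helper names are assumptions of this illustration.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def anticommutator(a, b):
    """[a, b]_+ = ab + ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] + ba[i][j] for j in range(n)] for i in range(n)]

def dagger(a):
    """Hermitian conjugate of a square matrix."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def scaled(a, c):
    """Matrix a multiplied by the number c."""
    return [[c * x for x in row] for row in a]

def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[0] + tr[0], tl[1] + tr[1], bl[0] + br[0], bl[1] + br[1]]

ZERO2 = [[0, 0], [0, 0]]
ONE2 = [[1, 0], [0, 1]]
MINUS_ONE2 = [[-1, 0], [0, -1]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
IDENTITY4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

BETA = block(ONE2, ZERO2, ZERO2, MINUS_ONE2)
ALPHA = [block(ZERO2, s, s, ZERO2) for s in SIGMA]

# gamma^0 = beta, gamma^i = beta alpha^i, Eq. (4.270)
GAMMA = [BETA] + [matmul(BETA, a) for a in ALPHA]
ETA = [1, -1, -1, -1]  # diagonal of the metric (assumed signature)

clifford_ok = all(
    anticommutator(GAMMA[mu], GAMMA[nu])
    == scaled(IDENTITY4, 2 * ETA[mu] if mu == nu else 0)
    for mu in range(4) for nu in range(4)
)
gamma0_hermitian = dagger(GAMMA[0]) == GAMMA[0]
gammai_antihermitian = all(dagger(GAMMA[i]) == scaled(GAMMA[i], -1)
                           for i in range(1, 4))
```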
We now want to find the representation $S(\Lambda)$ of the Lorentz group such that, if in the inertial frame $S$ our system is described by Eq. (4.272), in the inertial frame $S'$ it will be described by
\[
(i\slashed{\partial}' - m)\, \psi'(X') = 0 , \tag{4.276}
\]
where
\[
\slashed{\partial}' = \gamma^\mu \partial'_\mu \tag{4.277}
\]
and
\[
\psi'(X') = S(\Lambda)\, \psi(X) . \tag{4.278}
\]
We note that in Eq. (4.277) the γ µ is the same as in Eq. (4.272). In fact, the representation of the
gamma matrices can change by a unitary transformation, that, however, does not affect the physical
description of our system. We can then decide to use the same representation in the two inertial frames
and use the same gammas.
S(Λ) has to be a representation of the Lorentz group and therefore it must fulfil the following
relations:
S(Λ1 ) S(Λ2 ) = S(Λ1 Λ2 ) , (4.279)
for any Lorentz transformations Λ1 and Λ2, and

S −1 (Λ) = S(Λ−1 ) , (4.280)

since S(Λ) S(Λ−1 ) = S(ΛΛ−1 ) = 1 = S(Λ) S −1 (Λ).


We have
0 = (iγ µ ∂µ − m)ψ(X) = (iγ µ ∂µ − m)S −1 (Λ)ψ ′ (X ′ ) . (4.281)
Moreover
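\[ \partial_\mu = \frac{\partial}{\partial X^\mu} = \frac{\partial X'^\nu}{\partial X^\mu}\,\frac{\partial}{\partial X'^\nu} = \Lambda^\nu{}_\mu\,\frac{\partial}{\partial X'^\nu} . \qquad (4.282) \]

[Footnote 10: We multiply Eq. (4.251) on the left by β and we use the definition of the gamma matrices.]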
Multiplying Eq. (4.281) by S(Λ) on the left, and using (4.282), we find
0 = S(Λ)(iγ µ ∂µ − m)S −1 (Λ)ψ ′ (X ′ ) , (4.283)
= S(Λ)(iγ µ ∂ν′ Λνµ − m)S −1 (Λ)ψ ′ (X ′ ) , (4.284)
= (iS(Λ)Λνµ γ µ S −1 (Λ)∂ν′ − m)ψ ′ (X ′ ) . (4.285)
In order to reproduce Eq. (4.276) we have to impose
S(Λ)Λνµ γ µ S −1 (Λ) = γ ν , (4.286)
and therefore, multiplying on the l.h.s. by S −1 (Λ) and on the r.h.s. by S(Λ)
S −1 (Λ)γ ν S(Λ) = Λνµ γ µ . (4.287)
Let us find an explicit form for S(Λ). If we consider an infinitesimal transformation
Λµν ≃ δνµ + ǫµν , (4.288)
where ǫµν = −ǫνµ , we have
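\[ S(\Lambda) \simeq 1 - \frac{i}{4}\sigma_{\mu\nu}\epsilon^{\mu\nu} , \qquad (4.289) \]
\[ S^{-1}(\Lambda) \simeq 1 + \frac{i}{4}\sigma_{\mu\nu}\epsilon^{\mu\nu} . \qquad (4.290) \]
Substituting in Eq. (4.287), we find (at first order)
\[ \left(1 + \frac{i}{4}\sigma_{\mu\nu}\epsilon^{\mu\nu}\right)\gamma_\rho\left(1 - \frac{i}{4}\sigma_{\alpha\beta}\epsilon^{\alpha\beta}\right) = \gamma_\rho + \epsilon_{\rho\alpha}\gamma^\alpha \qquad (4.291) \]
and neglecting higher order terms, we find
\[ \frac{i}{4}[\sigma_{\mu\nu} , \gamma_\rho]\,\epsilon^{\mu\nu} = \gamma_\nu\eta_{\rho\mu}\epsilon^{\mu\nu} = \frac{1}{2}\left(\gamma_\nu\eta_{\rho\mu} - \gamma_\mu\eta_{\rho\nu}\right)\epsilon^{\mu\nu} , \qquad (4.292) \]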
where, due to the fact that ǫµν is anti-symmetric, we anti-symmetrized the tensor γν ηρµ . In the end,
we have
[σµν , γρ ] = −2i(γν ηρµ − γµ ηρν ) . (4.293)
This relation is satisfied by
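\[ \sigma_{\mu\nu} = \frac{i}{2}[\gamma_\mu , \gamma_\nu] , \qquad (4.294) \]
as can be checked by direct inspection. Therefore
\[ S(\Lambda) \simeq 1 + \frac{1}{8}[\gamma_\mu , \gamma_\nu]\,\epsilon^{\mu\nu} . \qquad (4.295) \]
Exponentiating we get
\[ S(\Lambda) = \exp\left(\frac{1}{8}[\gamma_\mu , \gamma_\nu]\,\epsilon^{\mu\nu}\right) . \qquad (4.296) \]
The $\sigma_{\mu\nu}$ are the generators of the Lorentz group in this representation. Using the Dirac form of the gamma
matrices, we find explicitly
\[ \sigma_{00} = \sigma_{ii} = 0 , \qquad (4.297) \]
\[ \sigma_{0i} = -\sigma_{i0} = \frac{i}{2}[\gamma_0 , \gamma_i] = -i\begin{pmatrix} 0 & \sigma^i \\ \sigma^i & 0 \end{pmatrix} , \qquad (4.298) \]
\[ \sigma_{ij} = \frac{i}{2}[\gamma_i , \gamma_j] = \epsilon_{ijk}\begin{pmatrix} \sigma^k & 0 \\ 0 & \sigma^k \end{pmatrix} \qquad (4.299) \]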
and we see, once more, that the σij are the generators of the rotations and are hermitian, while σ0i
are the generators of the boosts and are anti-hermitian.
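The relations of this subsection lend themselves to a numerical cross-check. The sketch below, in pure Python, verifies the Clifford algebra (4.274), the commutator (4.293) for all index combinations, and the hermiticity properties of σij and σ0i. It assumes the Dirac representation (4.273) and the metric η = diag(1, −1, −1, −1); all function names are illustrative.

```python
# Sketch: numerical check of Eqs. (4.274), (4.293) and of the hermiticity of sigma_{mu nu}.
# Helper names are hypothetical; matrices are nested lists.

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def block(a, b, c, d):
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

def equal(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def comm(a, b):
    return add(matmul(a, b), scale(-1, matmul(b, a)))

Z2, I2 = [[0, 0], [0, 0]], [[1, 0], [0, 1]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
I4 = block(I2, Z2, Z2, I2)
ETA = [1, -1, -1, -1]                       # diagonal of the metric (assumed signature)
GAMMA_UP = [block(I2, Z2, Z2, scale(-1, I2))] + \
           [block(Z2, s, scale(-1, s), Z2) for s in SIGMA]
GAMMA_DN = [scale(ETA[m], GAMMA_UP[m]) for m in range(4)]   # gamma_mu = eta_{mu nu} gamma^nu

def eta(m, n):
    return ETA[m] if m == n else 0

def sigma_mn(m, n):
    """sigma_{mu nu} = (i/2) [gamma_mu, gamma_nu], Eq. (4.294)."""
    return scale(0.5j, comm(GAMMA_DN[m], GAMMA_DN[n]))

def check_clifford():
    return all(equal(add(matmul(GAMMA_DN[m], GAMMA_DN[n]), matmul(GAMMA_DN[n], GAMMA_DN[m])),
                     scale(2 * eta(m, n), I4))
               for m in range(4) for n in range(4))

def check_commutator():
    """[sigma_{mu nu}, gamma_rho] = -2i (gamma_nu eta_{rho mu} - gamma_mu eta_{rho nu})."""
    for m in range(4):
        for n in range(4):
            for r in range(4):
                lhs = comm(sigma_mn(m, n), GAMMA_DN[r])
                rhs = scale(-2j, add(scale(eta(r, m), GAMMA_DN[n]),
                                     scale(-eta(r, n), GAMMA_DN[m])))
                if not equal(lhs, rhs):
                    return False
    return True

def check_hermiticity():
    rotations = all(equal(dagger(sigma_mn(i, j)), sigma_mn(i, j))
                    for i in (1, 2, 3) for j in (1, 2, 3))
    boosts = all(equal(dagger(sigma_mn(0, i)), scale(-1, sigma_mn(0, i)))
                 for i in (1, 2, 3))
    return rotations, boosts

print(check_clifford(), check_commutator(), check_hermiticity())
```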

4.3.4 Unitarity and Dirac adjoint
The operator S(Λ) is not unitary and this is due to the fact that the Lorentz group is not compact
and therefore we cannot find finite-dimensional unitary representations. In fact we have
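\[ \sigma^\dagger_{\mu\nu} = -\frac{i}{2}(\gamma_\mu\gamma_\nu - \gamma_\nu\gamma_\mu)^\dagger = \dots = \frac{i}{2}(\gamma^\dagger_\mu\gamma^\dagger_\nu - \gamma^\dagger_\nu\gamma^\dagger_\mu) = \frac{i}{2}[\gamma^\dagger_\mu , \gamma^\dagger_\nu] \neq \sigma_{\mu\nu} , \qquad (4.300) \]
since $\gamma_0^\dagger = \gamma_0$ but $\gamma_i^\dagger = -\gamma_i$. This implies that
\[ S^\dagger(\Lambda) = e^{\frac{i}{4}\sigma^\dagger_{\mu\nu}\epsilon^{\mu\nu}} \neq e^{\frac{i}{4}\sigma_{\mu\nu}\epsilon^{\mu\nu}} = S^{-1}(\Lambda) . \qquad (4.301) \]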
However, we can prove (by direct inspection) that

\[ \gamma_0\,\sigma^\dagger_{\mu\nu}\,\gamma_0 = \sigma_{\mu\nu} \qquad (4.302) \]
and therefore
γ0 S † (Λ)γ0 = S −1 (Λ) . (4.303)
S(Λ) is not unitary but is “unitary with respect to the metric γ0”. A consequence of this behaviour is
that a bilinear in the fields like ψ†ψ is not a scalar under Lorentz transformations, but transforms
as the temporal component of a four-vector. If we want to construct a scalar (and this is important
because the lagrangian density for the Dirac field must be a scalar)
we have to consider the so-called Dirac adjoint:
\[ \bar{\psi} = \psi^\dagger\gamma_0 . \qquad (4.304) \]
With the Dirac adjoint of ψ, we can construct a scalar: $\bar{\psi}\psi$. In fact, under a Lorentz transformation
we have:

\[ \bar{\psi}'\psi' = \psi'^\dagger\gamma_0\psi' = (S(\Lambda)\psi)^\dagger\gamma_0 S(\Lambda)\psi = \psi^\dagger S^\dagger(\Lambda)\gamma_0 S(\Lambda)\psi = \bar{\psi}S^{-1}(\Lambda)S(\Lambda)\psi = \bar{\psi}\psi . \qquad (4.305) \]
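We still have to verify that S(Λ) satisfies Eq. (4.279); Eq. (4.280) then follows directly. Let us consider
the first Lorentz transformation Λ1. We have
\[ S^{-1}(\Lambda_1)\gamma^\mu S(\Lambda_1) = (\Lambda_1)^\mu{}_\nu\gamma^\nu . \qquad (4.306) \]
Multiplying on the left by $(\Lambda_1^{-1})^\rho{}_\mu$, we find

\[ (\Lambda_1^{-1})^\rho{}_\mu S^{-1}(\Lambda_1)\gamma^\mu S(\Lambda_1) = (\Lambda_1^{-1})^\rho{}_\mu(\Lambda_1)^\mu{}_\nu\gamma^\nu = \delta^\rho_\nu\gamma^\nu = \gamma^\rho . \qquad (4.307) \]
Let us consider now the second transformation, Λ2. We have
\[ S^{-1}(\Lambda_2)\gamma^\rho S(\Lambda_2) = (\Lambda_2)^\rho{}_\alpha\gamma^\alpha . \qquad (4.308) \]
We now substitute in the expression above the $\gamma^\rho$ with the analogous expression in (4.307):
\[ \begin{aligned} (\Lambda_2)^\rho{}_\alpha\gamma^\alpha &= S^{-1}(\Lambda_2)(\Lambda_1^{-1})^\rho{}_\mu S^{-1}(\Lambda_1)\gamma^\mu S(\Lambda_1)S(\Lambda_2) , && (4.309) \\ &= (\Lambda_1^{-1})^\rho{}_\mu S^{-1}(\Lambda_2)S^{-1}(\Lambda_1)\gamma^\mu S(\Lambda_1)S(\Lambda_2) . && (4.310) \end{aligned} \]
Multiplying on the left by $(\Lambda_1)^\sigma{}_\rho$ we find

\[ S^{-1}(\Lambda_2)S^{-1}(\Lambda_1)\gamma^\sigma S(\Lambda_1)S(\Lambda_2) = (\Lambda_1)^\sigma{}_\rho(\Lambda_2)^\rho{}_\alpha\gamma^\alpha = (\Lambda_1\Lambda_2)^\sigma{}_\alpha\gamma^\alpha . \qquad (4.311) \]
Since (Λ1Λ2) is indeed a Lorentz transformation, we can also write
\[ S^{-1}(\Lambda_1\Lambda_2)\gamma^\sigma S(\Lambda_1\Lambda_2) = (\Lambda_1\Lambda_2)^\sigma{}_\alpha\gamma^\alpha . \qquad (4.312) \]
Comparing Eqs. (4.311) and (4.312), the statement follows: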
S(Λ1 )S(Λ2 ) = S(Λ1 Λ2 ) . (4.313)
S(Λ) is indeed a representation of the Lorentz group.

Example
As an example, we write the explicit form of S(Λ), acting on the spinorial field ψ(X), when Λ is a
boost. Let us choose for simplicity a boost in the x direction. We will have

X µ → X ′µ = Λµν X ν , (4.314)

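where, in matrix form,
\[ \Lambda^\mu{}_\nu = \begin{pmatrix} \gamma & -\beta\gamma & 0 & 0 \\ -\beta\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \qquad (4.315) \]
or, using the hyperbolic parametrization
\[ \beta = \frac{v}{c} = \tanh(\theta) , \qquad \gamma = \frac{1}{\sqrt{1-\beta^2}} = \cosh(\theta) , \qquad (4.316) \]
\[ \Lambda^\mu{}_\nu = \begin{pmatrix} \cosh(\theta) & -\sinh(\theta) & 0 & 0 \\ -\sinh(\theta) & \cosh(\theta) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \qquad (4.317) \]
For the infinitesimal transformation (θ ≪ 1)
\[ \Lambda^\mu{}_\nu \simeq \delta^\mu_\nu + \epsilon^\mu{}_\nu = \begin{pmatrix} 1 & -\theta & 0 & 0 \\ -\theta & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} + \begin{pmatrix} 0 & -\theta & 0 & 0 \\ -\theta & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} . \qquad (4.318) \]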

We have
ǫµν = ηµρ ǫρν (4.319)
and therefore we find
ǫ10 = −ǫ01 = θ (4.320)
and the other components are zero. Then
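\[ S(\Lambda) \simeq 1 - \frac{i}{4}\epsilon_{\mu\nu}\sigma^{\mu\nu} = 1 - \frac{i}{4}\left(\epsilon_{10}\sigma^{10} + \epsilon_{01}\sigma^{01}\right) = 1 - \frac{i}{4}\,2\epsilon_{10}\sigma^{10} = 1 - \frac{i}{2}\theta\sigma^{10} , \qquad (4.321) \]
where
\[ \sigma^{10} = \frac{i}{2}[\gamma^1 , \gamma^0] = \frac{i}{2}\left(\gamma^1\gamma^0 - \gamma^0\gamma^1\right) = -i\gamma^0\gamma^1 = -i\alpha^1 = -i\begin{pmatrix} 0 & \sigma^1 \\ \sigma^1 & 0 \end{pmatrix} = -i\begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix} . \qquad (4.322) \]

Exponentiating, and recalling that $(\alpha^1)^2 = 1$, we find
\[ \begin{aligned} S(\Lambda) = e^{-\frac{\theta}{2}\alpha^1} &= 1 - \frac{\theta}{2}\alpha^1 + \frac{1}{2}\left(-\frac{\theta}{2}\alpha^1\right)^2 + \frac{1}{6}\left(-\frac{\theta}{2}\alpha^1\right)^3 + \dots , && (4.323) \\ &= \sum_{k=0}^{\infty}\frac{1}{(2k)!}\left(-\frac{\theta}{2}\right)^{2k} + \alpha^1\sum_{k=0}^{\infty}\frac{1}{(2k+1)!}\left(-\frac{\theta}{2}\right)^{2k+1} , && (4.324) \\ &= \cosh\left(\frac{\theta}{2}\right) - \alpha^1\sinh\left(\frac{\theta}{2}\right) , && (4.325) \\ &= \begin{pmatrix} \cosh\frac{\theta}{2} & -\sigma^1\sinh\frac{\theta}{2} \\ -\sigma^1\sinh\frac{\theta}{2} & \cosh\frac{\theta}{2} \end{pmatrix} . && (4.326) \end{aligned} \]
Since
\[ \cosh^2\left(\frac{\theta}{2}\right) = \frac{\cosh(\theta)+1}{2} , \qquad \sinh^2\left(\frac{\theta}{2}\right) = \frac{\cosh(\theta)-1}{2} , \qquad (4.327) \]
recalling the hyperbolic parametrization in terms of β and γ we find
\[ S(\Lambda) = \sqrt{\frac{\gamma+1}{2}}\begin{pmatrix} 1 & -\frac{\beta\gamma}{\gamma+1}\sigma^1 \\ -\frac{\beta\gamma}{\gamma+1}\sigma^1 & 1 \end{pmatrix} . \qquad (4.328) \]
This form can be generalized to a boost in the n direction as
\[ S(\Lambda) = \sqrt{\frac{\gamma+1}{2}}\begin{pmatrix} 1 & -\frac{\beta\gamma}{\gamma+1}\,\sigma\cdot n \\ -\frac{\beta\gamma}{\gamma+1}\,\sigma\cdot n & 1 \end{pmatrix} . \qquad (4.329) \]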
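As a cross-check of this example, the sketch below builds S(Λ) for a boost along x from Eq. (4.325) and tests it numerically against the defining relation (4.287), the γ0-unitarity (4.303), and the β, γ form (4.328). It is written in pure Python under the assumption of the Dirac representation (4.273); the function names and the sample rapidity are illustrative choices.

```python
# Sketch: numerical check of the spinor boost along x.
# Helper names are hypothetical; matrices are nested lists.
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def block(a, b, c, d):
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

def equal(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

Z2, I2 = [[0, 0], [0, 0]], [[1, 0], [0, 1]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
I4 = block(I2, Z2, Z2, I2)
GAMMA = [block(I2, Z2, Z2, scale(-1, I2))] + \
        [block(Z2, s, scale(-1, s), Z2) for s in SIGMA]      # gamma^mu, Eq. (4.273)
ALPHA1 = matmul(GAMMA[0], GAMMA[1])                          # alpha^1 = gamma^0 gamma^1

def spinor_boost(theta):
    """S(Lambda) = cosh(theta/2) - alpha^1 sinh(theta/2), Eq. (4.325)."""
    return add(scale(math.cosh(theta / 2), I4), scale(-math.sinh(theta / 2), ALPHA1))

def spinor_boost_from_beta(beta):
    """The same matrix written in terms of beta and gamma, Eq. (4.328)."""
    g = 1 / math.sqrt(1 - beta ** 2)
    return scale(math.sqrt((g + 1) / 2), add(I4, scale(-beta * g / (g + 1), ALPHA1)))

def vector_boost(theta):
    """Lambda^mu_nu of Eq. (4.317)."""
    c, s = math.cosh(theta), math.sinh(theta)
    return [[c, -s, 0, 0], [-s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def check_boost(theta):
    S, S_inv, L = spinor_boost(theta), spinor_boost(-theta), vector_boost(theta)
    covariant = True
    for nu in range(4):
        lhs = matmul(S_inv, matmul(GAMMA[nu], S))            # S^-1 gamma^nu S
        rhs = scale(0, I4)
        for mu in range(4):
            rhs = add(rhs, scale(L[nu][mu], GAMMA[mu]))      # Lambda^nu_mu gamma^mu
        covariant = covariant and equal(lhs, rhs)
    g0 = GAMMA[0]
    return {
        "inverse": equal(matmul(S, S_inv), I4),
        "covariance_4_287": covariant,
        "gamma0_unitary_4_303": equal(matmul(g0, matmul(dagger(S), g0)), S_inv),
        "unitary": equal(matmul(dagger(S), S), I4),
        "matches_4_328": equal(S, spinor_boost_from_beta(math.tanh(theta))),
    }

print(check_boost(0.7))
```

For a non-zero rapidity the `unitary` entry is expected to be `False`, in line with the statement that S(Λ) for a boost is not unitary, while the other entries should be `True`.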

4.3.5 Probability density


One of the main problems in the “first-quantization” interpretation of the Klein-Gordon equation was
the failure of the probabilistic interpretation due to the non-positivity of the probability density. This
can be linked to the fact that the KG equation is second order in the time derivative. Let us see what
happens in the case of the Dirac equation.
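Firstly, let us write the equation for the daggered field, ψ†. We have
\[ -i\frac{\partial}{\partial t}\psi^\dagger = i(\nabla\psi^\dagger)\cdot\alpha + m\psi^\dagger\beta . \qquad (4.330) \]
Multiplying the Dirac equation for ψ by ψ† on the left and subtracting Eq. (4.330) multiplied by ψ on
the right, we find
\[ i\psi^\dagger\left(\frac{\partial}{\partial t}\psi\right) + i\left(\frac{\partial}{\partial t}\psi^\dagger\right)\psi = \psi^\dagger(-i\alpha\cdot\nabla\psi + \beta m\psi) - (i\nabla\psi^\dagger\cdot\alpha + m\psi^\dagger\beta)\psi , \qquad (4.331) \]
or
\[ i\frac{\partial}{\partial t}(\psi^\dagger\psi) = -i\psi^\dagger\alpha\cdot\nabla\psi - i(\nabla\psi^\dagger)\cdot\alpha\,\psi = -i\nabla\cdot(\psi^\dagger\alpha\psi) . \qquad (4.332) \]
If we define the vector
\[ j^\mu = (\psi^\dagger\psi , \psi^\dagger\alpha\psi) = \bar{\psi}\gamma^\mu\psi , \qquad (4.333) \]
Eq. (4.332) becomes
\[ \partial_\mu j^\mu = 0 . \qquad (4.334) \]
Eq. (4.334) implies the conservation of the “charge”
\[ Q = \int d^3X\,\psi^\dagger\psi , \qquad (4.335) \]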

and since ψ † ψ is a positive definite quantity, it can be interpreted as a probability density (and then
Q is the total probability to find the particle in all the space, therefore Q = 1).
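The vector $j^\mu$ defined in Eq. (4.333) transforms indeed as a four-vector under Lorentz transformations. In fact
\[ \begin{aligned} j'^\mu(X') &= \bar{\psi}'(X')\gamma^\mu\psi'(X') = (\psi'^\dagger(X')\gamma^0)\gamma^\mu\psi'(X') , && (4.336) \\ &= (S(\Lambda)\psi(X))^\dagger\gamma^0\gamma^\mu S(\Lambda)\psi(X) = \psi^\dagger(X)S^\dagger(\Lambda)\gamma^0\gamma^\mu S(\Lambda)\psi(X) , && (4.337) \\ &= \bar{\psi}(X)\gamma^0S^\dagger(\Lambda)\gamma^0\gamma^\mu S(\Lambda)\psi(X) = \bar{\psi}(X)S^{-1}(\Lambda)\gamma^\mu S(\Lambda)\psi(X) , && (4.338) \\ &= \Lambda^\mu{}_\nu\,\bar{\psi}(X)\gamma^\nu\psi(X) , && (4.339) \\ &= \Lambda^\mu{}_\nu\,j^\nu(X) . && (4.340) \end{aligned} \]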

4.3.6 Lagrangian and Hamiltonian densities
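The Dirac field is a complex spinor field. We will therefore consider ψ and $\bar{\psi}$ as independent fields.
While ψ obeys the equation
\[ (i\slashed{\partial} - m)\psi(X) = 0 , \qquad (4.341) \]
the equation for the adjoint field can be found taking the dagger of (4.341)

\[ -i\partial_\mu\psi^\dagger(X)(\gamma^\mu)^\dagger - m\psi^\dagger(X) = 0 \qquad (4.342) \]

and multiplying by $\gamma^0$ on the right

\[ -i\partial_\mu\psi^\dagger(X)\gamma^0\gamma^0(\gamma^\mu)^\dagger\gamma^0 - m\bar{\psi}(X) = -i\partial_\mu\bar{\psi}(X)\gamma^\mu - m\bar{\psi}(X) = 0 \qquad (4.343) \]

or, better
\[ \bar{\psi}(X)\,(i\overleftarrow{\slashed{\partial}} + m) = 0 . \qquad (4.344) \]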
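Using the same approach as in the Klein-Gordon case, we can recover the lagrangian density starting
from the Euler-Lagrange equations (4.341,4.344), multiplied by the variations of the fields, and arranging
the result so as to recover Hamilton's principle (the second term is integrated by parts):
\[ \begin{aligned} 0 = \delta S &= \int d^4X\left\{\delta\bar{\psi}\,(i\slashed{\partial} - m)\psi - \bar{\psi}\,(i\overleftarrow{\slashed{\partial}} + m)\,\delta\psi\right\} , && (4.345) \\ &= \delta\int d^4X\,\bar{\psi}(i\slashed{\partial} - m)\psi . && (4.346) \end{aligned} \]

The lagrangian density is therefore

\[ \mathcal{L} = \bar{\psi}(i\slashed{\partial} - m)\psi . \qquad (4.347) \]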
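It is easy to check that the Euler-Lagrange equations of the lagrangian density (4.347) are indeed
Eqs. (4.341,4.344). In fact, varying with respect to $\bar{\psi}$ we get the equation for ψ:
\[ 0 = \frac{\partial\mathcal{L}}{\partial\bar{\psi}} - \partial_\mu\frac{\partial\mathcal{L}}{\partial\bar{\psi}_{,\mu}} = (i\slashed{\partial} - m)\psi(X) , \qquad (4.348) \]

since $\mathcal{L}$ does not involve derivatives of the field $\bar{\psi}$ and therefore

\[ \frac{\partial\mathcal{L}}{\partial\bar{\psi}_{,\mu}} = 0 . \qquad (4.349) \]

Varying with respect to ψ we have
\[ 0 = \frac{\partial\mathcal{L}}{\partial\psi} - \partial_\mu\frac{\partial\mathcal{L}}{\partial\psi_{,\mu}} = -m\bar{\psi} - \partial_\mu(i\bar{\psi}\gamma^\mu) \qquad (4.350) \]
and therefore Eq. (4.344).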
The lagrangian density (4.347) has a problem. It is a singular lagrangian, in the sense that the
momentum conjugate to ψ† is zero:
\[ \pi_\psi = \frac{\partial\mathcal{L}}{\partial\dot{\psi}} = i\psi^\dagger , \qquad (4.351) \]
\[ \pi_{\psi^\dagger} = \frac{\partial\mathcal{L}}{\partial\dot{\psi}^\dagger} = 0 . \qquad (4.352) \]

This is due to the fact that $\mathcal{L}$ does not involve derivatives of ψ† or, which is the same, the canonical
momenta do not depend on the velocities. The canonical formalism relies on momenta that are related to the time
derivatives of the conjugate degrees of freedom. In this case, then, in principle we cannot proceed with
a Legendre transformation to get the hamiltonian (the energy) of the system. The problem was solved
by Dirac himself, who proposed a procedure to arrive at the hamiltonian. This procedure coincides,
in this case, with the naive formula
\[ \mathcal{H} = \pi_\psi\dot{\psi} - \mathcal{L} \qquad (4.353) \]
and, considering the configurations of the field that satisfy Dirac’s equation, we get
\[ \mathcal{H} = i\psi^\dagger\partial_0\psi \quad \left(= \psi^\dagger(-i\alpha\cdot\nabla + \beta m)\psi\right) . \qquad (4.354) \]
Contrary to the KG case, this expression is not positive definite. However, we will see that when we
move to the quantized version of H, as an operator acting on the Fock space, it will be positive
definite.
The expression (4.354) can also be recovered using Noether’s theorem.

4.3.7 Conserved quantities


The lagrangian density (4.347) is Poincaré invariant. This implies that, according to Noether’s theorem,
there are conserved quantities.
If we consider the non-homogeneous part of the Poincaré group (rigid translations), we get the
relation
\[ \partial_\mu T^\mu{}_\nu = 0 , \qquad (4.355) \]
where the tensor $T^\mu{}_\nu$ is the so-called “energy-momentum” tensor
\[ T^\mu{}_\nu = \frac{\partial\mathcal{L}}{\partial\psi_{,\mu}}\psi_{,\nu} + \frac{\partial\mathcal{L}}{\partial\psi^\dagger_{,\mu}}\psi^\dagger_{,\nu} - \eta^\mu{}_\nu\,\mathcal{L} \qquad (4.356) \]
and, considering the configurations of the field that satisfy Dirac’s equation, we get
\[ T^\mu{}_\nu = i\bar{\psi}\gamma^\mu\psi_{,\nu} . \qquad (4.357) \]
It is easy to check that the form given in Eq. (4.357) satisfies Eq. (4.355).
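The check can be spelled out in two lines. The following worked step is a sketch that uses only the equations of motion (4.341) and (4.344); the overleft-arrow in (4.344) is read as $i\partial_\mu\bar{\psi}\gamma^\mu = -m\bar{\psi}$.

```latex
\begin{aligned}
\partial_\mu T^\mu{}_\nu
  &= \partial_\mu\left(i\bar{\psi}\gamma^\mu\partial_\nu\psi\right)
   = \left(i\partial_\mu\bar{\psi}\gamma^\mu\right)\partial_\nu\psi
   + \bar{\psi}\,\partial_\nu\left(i\gamma^\mu\partial_\mu\psi\right) \\
  &= -m\,\bar{\psi}\,\partial_\nu\psi + m\,\bar{\psi}\,\partial_\nu\psi = 0 .
\end{aligned}
```

The first term uses the adjoint equation (4.344), the second uses the Dirac equation (4.341) after exchanging the order of the derivatives.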
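The conserved four-vector is
\[ P_\nu = \int d^3X\,T^0{}_\nu = \int d^3X\,i\psi^\dagger\partial_\nu\psi . \qquad (4.358) \]

Therefore
\[ H = \int d^3X\,T^0{}_0 = \int d^3X\,i\psi^\dagger\partial_0\psi , \qquad (4.359) \]
\[ \mathbf{P} = \int d^3X\,T^{0i} = -\int d^3X\,i\psi^\dagger\nabla\psi . \qquad (4.360) \]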

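If we consider instead the Lorentz group, we get

\[ \partial_\mu\mathcal{M}^{\mu\rho\sigma} = 0 , \qquad (4.361) \]
where
\[ \mathcal{M}^{\mu\rho\sigma} = i\bar{\psi}\gamma^\mu\left(X^\rho\partial^\sigma - X^\sigma\partial^\rho + \frac{1}{4}[\gamma^\rho , \gamma^\sigma]\right)\psi . \qquad (4.362) \]
The conserved charges are the following 6 charges:
\[ M^{\rho\sigma} = \int d^3X\,\mathcal{M}^{0\rho\sigma} . \qquad (4.363) \]

The angular momentum is

\[ \mathbf{J} = (M^{23} , M^{31} , M^{12}) = \int d^3X\,\psi^\dagger\left(-i\,\mathbf{x}\wedge\nabla + \frac{1}{2}\begin{pmatrix} \sigma & 0 \\ 0 & \sigma \end{pmatrix}\right)\psi , \qquad (4.364) \]
where we can recognize an orbital part and a spin part.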
Global phase invariance
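The lagrangian density (4.347) is invariant under the following transformation

\[ \psi(X) \to \psi'(X) = e^{-i\theta}\psi(X) , \qquad (4.365) \]
\[ \bar{\psi}(X) \to \bar{\psi}'(X) = e^{i\theta}\bar{\psi}(X) , \qquad (4.366) \]

where θ ∈ R. This is a continuous transformation. The infinitesimal transformation is

\[ \delta\psi = -i\theta\psi , \qquad (4.367) \]
\[ \delta\bar{\psi} = i\theta\bar{\psi} . \qquad (4.368) \]

This symmetry gives rise to a four-vector

\[ J^\mu = \frac{\partial\mathcal{L}}{\partial\psi_{,\mu}}\,\delta\psi = \theta\,\bar{\psi}\gamma^\mu\psi \qquad (4.369) \]
such that
\[ \partial_\mu J^\mu = 0 . \qquad (4.370) \]
Then, we can define the current as in Eq. (4.333) such that the conserved charge is the one in Eq. (4.335).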
Once we introduce the interaction of the Dirac field with the electromagnetic field the charge will be
correctly interpreted in QFT as the electric charge (and not connected with the probability density of
a tentative “first quantization” interpretation of the theory).

4.3.8 The matrix γ5


The matrix
γ5 = iγ 0 γ 1 γ 2 γ 3 (4.371)
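plays an important role in the Clifford algebra of the γ matrices. It has the following properties:
1. γ5 is hermitian:

\[ \begin{aligned} (\gamma_5)^\dagger = (i\gamma^0\gamma^1\gamma^2\gamma^3)^\dagger &= -i(\gamma^3)^\dagger(\gamma^2)^\dagger(\gamma^1)^\dagger(\gamma^0)^\dagger , && (4.372) \\ &= i\gamma^3\gamma^2\gamma^1\gamma^0 = i\gamma^0\gamma^1\gamma^2\gamma^3 , && (4.373) \\ &= \gamma_5 . && (4.374) \end{aligned} \]

2. γ5 anticommutes with all the $\gamma^\mu$:

\[ [\gamma_5 , \gamma^0]_+ = i\gamma^0\gamma^1\gamma^2\gamma^3\gamma^0 + i\gamma^0\gamma^0\gamma^1\gamma^2\gamma^3 = -i\gamma^1\gamma^2\gamma^3 + i\gamma^1\gamma^2\gamma^3 = 0 , \qquad (4.375) \]
\[ [\gamma_5 , \gamma^i]_+ = i\gamma^0\gamma^1\gamma^2\gamma^3\gamma^i + i\gamma^i\gamma^0\gamma^1\gamma^2\gamma^3 = \dots = 0 . \qquad (4.376) \]

The representation for γ5 follows the representation of the $\gamma^\mu$. In the Dirac representation (4.266) we have
\[ \gamma_5 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} . \qquad (4.377) \]
We can find a more “covariant” form of γ5 noting that the expression

\[ i\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma \quad \text{with } \mu, \nu, \rho, \sigma \text{ all different} \qquad (4.378) \]

gives ±γ5. Actually, if µνρσ is an even permutation of 0123, we have +γ5; if µνρσ is an odd permutation
of 0123, we have −γ5. Using the totally antisymmetric tensor $\epsilon_{\mu\nu\rho\sigma}$ we have
\[ \epsilon_{\mu\nu\rho\sigma}\,(i\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma) = \sum_{\text{even } P(0123)}(+1)(+\gamma_5) + \sum_{\text{odd } P(0123)}(-1)(-\gamma_5) = 24\,\gamma_5 . \qquad (4.379) \]
Therefore
\[ \gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3 = \frac{i}{24}\,\epsilon_{\mu\nu\rho\sigma}\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma . \qquad (4.380) \]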
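In order to find how γ5 transforms under Lorentz transformations S(Λ), consider that

\[ \det\Lambda = \epsilon_{\mu\nu\rho\sigma}\,\Lambda^\mu{}_0\Lambda^\nu{}_1\Lambda^\rho{}_2\Lambda^\sigma{}_3 , \qquad (4.381) \]

from which we can write

\[ \epsilon_{\mu\nu\rho\sigma}\,\Lambda^\mu{}_\alpha\Lambda^\nu{}_\beta\Lambda^\rho{}_\delta\Lambda^\sigma{}_\gamma = \det\Lambda\;\epsilon_{\alpha\beta\delta\gamma} . \qquad (4.382) \]
Therefore
\[ \begin{aligned} S^{-1}(\Lambda)\gamma_5S(\Lambda) &= \frac{i}{24}\,\epsilon_{\mu\nu\rho\sigma}\,S^{-1}(\Lambda)\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma S(\Lambda) , && (4.383) \\ &= \frac{i}{24}\,\epsilon_{\mu\nu\rho\sigma}\,S^{-1}\gamma^\mu S\,S^{-1}\gamma^\nu S\,S^{-1}\gamma^\rho S\,S^{-1}\gamma^\sigma S , && (4.384) \\ &= \frac{i}{24}\,\epsilon_{\mu\nu\rho\sigma}\,\Lambda^\mu{}_\alpha\gamma^\alpha\,\Lambda^\nu{}_\beta\gamma^\beta\,\Lambda^\rho{}_\delta\gamma^\delta\,\Lambda^\sigma{}_\gamma\gamma^\gamma , && (4.385) \\ &= \frac{i}{24}\,\epsilon_{\mu\nu\rho\sigma}\,\Lambda^\mu{}_\alpha\Lambda^\nu{}_\beta\Lambda^\rho{}_\delta\Lambda^\sigma{}_\gamma\,\gamma^\alpha\gamma^\beta\gamma^\delta\gamma^\gamma , && (4.386) \\ &= \frac{i}{24}\,\det\Lambda\;\epsilon_{\alpha\beta\delta\gamma}\,\gamma^\alpha\gamma^\beta\gamma^\delta\gamma^\gamma , && (4.387) \\ &= \det\Lambda\;\gamma_5 . && (4.388) \end{aligned} \]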
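The properties of γ5 listed in this subsection can be verified numerically. The sketch below assumes the Dirac representation (4.273) and the convention ε0123 = +1 implied by Eq. (4.379); it checks hermiticity, the anticommutation with every γµ, the explicit form (4.377), and the covariant expression (4.380). All function names are illustrative.

```python
# Sketch: numerical check of the properties of gamma_5.
# Helper names are hypothetical; matrices are nested lists.
from itertools import permutations

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def block(a, b, c, d):
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

def equal(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

Z2, I2 = [[0, 0], [0, 0]], [[1, 0], [0, 1]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
I4 = block(I2, Z2, Z2, I2)
GAMMA = [block(I2, Z2, Z2, scale(-1, I2))] + \
        [block(Z2, s, scale(-1, s), Z2) for s in SIGMA]

def product(mats):
    out = I4
    for m in mats:
        out = matmul(out, m)
    return out

def perm_sign(p):
    """Sign of a permutation of (0, 1, 2, 3), with the identity counted as +1."""
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def gamma5():
    """gamma_5 = i gamma^0 gamma^1 gamma^2 gamma^3, Eq. (4.371)."""
    return scale(1j, product(GAMMA))

def gamma5_covariant():
    """(i/24) eps_{mu nu rho sigma} gamma^mu gamma^nu gamma^rho gamma^sigma, Eq. (4.380)."""
    total = scale(0, I4)
    for p in permutations(range(4)):
        total = add(total, scale(perm_sign(p), product([GAMMA[k] for k in p])))
    return scale(1 / 24, scale(1j, total))

def check_gamma5():
    g5 = gamma5()
    return {
        "hermitian": equal(dagger(g5), g5),
        "squares_to_one": equal(matmul(g5, g5), I4),
        "anticommutes": all(equal(add(matmul(g5, g), matmul(g, g5)), scale(0, I4))
                            for g in GAMMA),
        "explicit_4_377": equal(g5, block(Z2, I2, I2, Z2)),
        "covariant_4_380": equal(g5, gamma5_covariant()),
    }

print(check_gamma5())
```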

4.3.9 Bilinear covariants


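The space of Dirac matrices is a 16-dim space. We can prove that a basis for such space is constituted
by the following 16 matrices:
\[ \Gamma = \{1 , \gamma^\mu , \gamma_5 , \sigma^{\mu\nu} , \gamma^\mu\gamma_5\} . \qquad (4.389) \]
With this choice, it is very easy to understand the transformation behaviour of bilinears in the fields,
like $\bar{\psi}\Gamma\psi$, under Lorentz transformations.
In fact, we already proved that
\[ \bar{\psi}\,1\,\psi = \bar{\psi}\psi \qquad (4.390) \]
transforms as a scalar under Lorentz transformations. Moreover,

\[ \bar{\psi}\gamma^\mu\psi \qquad (4.391) \]

transforms as a four-vector.
For the other possible bilinears we have:

\[ \begin{aligned} \bar{\psi}'(X')\gamma_5\psi'(X') &= \psi^\dagger(X)S^\dagger(\Lambda)\gamma^0\gamma_5S(\Lambda)\psi(X) , && (4.392) \\ &= \bar{\psi}(X)\,S^{-1}(\Lambda)\gamma_5S(\Lambda)\psi(X) , && (4.393) \\ &= \det\Lambda\;\bar{\psi}(X)\gamma_5\psi(X) . && (4.394) \end{aligned} \]

We say that $\bar{\psi}(X)\gamma_5\psi(X)$ is a pseudo-scalar.

\[ \begin{aligned} \bar{\psi}'(X')\gamma^\mu\gamma_5\psi'(X') &= \psi^\dagger(X)S^\dagger(\Lambda)\gamma^0\gamma^\mu\gamma_5S(\Lambda)\psi(X) , && (4.395) \\ &= \bar{\psi}(X)\,S^{-1}(\Lambda)\gamma^\mu S(\Lambda)S^{-1}(\Lambda)\gamma_5S(\Lambda)\psi(X) , && (4.396) \\ &= \det\Lambda\;\Lambda^\mu{}_\nu\,\bar{\psi}(X)\gamma^\nu\gamma_5\psi(X) . && (4.397) \end{aligned} \]

We say that $\bar{\psi}(X)\gamma^\mu\gamma_5\psi(X)$ is a pseudo-vector.

Since $\sigma^{\mu\nu} = \frac{i}{2}(\gamma^\mu\gamma^\nu - \gamma^\nu\gamma^\mu)$, we concentrate on

\[ \begin{aligned} \bar{\psi}'(X')\gamma^\mu\gamma^\nu\psi'(X') &= \psi^\dagger(X)S^\dagger(\Lambda)\gamma^0\gamma^\mu\gamma^\nu S(\Lambda)\psi(X) , && (4.398) \\ &= \bar{\psi}(X)\,S^{-1}(\Lambda)\gamma^\mu S(\Lambda)S^{-1}(\Lambda)\gamma^\nu S(\Lambda)\psi(X) , && (4.399) \\ &= \Lambda^\mu{}_\alpha\Lambda^\nu{}_\beta\,\bar{\psi}(X)\gamma^\alpha\gamma^\beta\psi(X) . && (4.400) \end{aligned} \]

Therefore, $\bar{\psi}(X)\gamma^\mu\gamma^\nu\psi(X)$ transforms as a rank-2 tensor.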

4.3.10 Algebra of the γ µ matrices and γ5


It is important, for future applications, to introduce some rules for the calculation of traces with γ
matrices. We consider the Minkowski space with 4 = 3 + 1 dimensions. We recall the algebra of the
γ’s
[γµ , γν ]+ = 2ηµν . (4.401)
We have
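• $\gamma_\mu\gamma^\mu = 4\cdot 1$. In fact
\[ \gamma_\mu\gamma^\mu = (\gamma^0)^2 - (\gamma^1)^2 - (\gamma^2)^2 - (\gamma^3)^2 = 4\cdot 1 . \qquad (4.402) \]

• $\gamma_\mu\gamma^\nu\gamma^\mu = -2\gamma^\nu$. In fact

\[ \gamma_\mu\gamma^\nu\gamma^\mu = \gamma_\mu(-\gamma^\mu\gamma^\nu + 2\eta^{\mu\nu}) = -\gamma_\mu\gamma^\mu\gamma^\nu + 2\gamma^\nu = -4\gamma^\nu + 2\gamma^\nu = -2\gamma^\nu . \qquad (4.403) \]

• $\gamma_\mu\gamma^\lambda\gamma^\nu\gamma^\mu = 4\eta^{\lambda\nu}$. In fact

\[ \begin{aligned} \gamma_\mu\gamma^\lambda\gamma^\nu\gamma^\mu &= \gamma_\mu\gamma^\lambda(-\gamma^\mu\gamma^\nu + 2\eta^{\mu\nu}) = -\gamma_\mu\gamma^\lambda\gamma^\mu\gamma^\nu + 2\gamma^\nu\gamma^\lambda = 2\gamma^\lambda\gamma^\nu + 2\gamma^\nu\gamma^\lambda , \\ &= 2[\gamma^\lambda , \gamma^\nu]_+ = 4\eta^{\lambda\nu} . && (4.404) \end{aligned} \]

And, saturating with vectors, recalling the “slash” notation $\slashed{a} = a_\mu\gamma^\mu$, we have

• $\slashed{a}\slashed{a} = a^2$. In fact

\[ \slashed{a}\slashed{a} = a^\mu a^\nu\gamma_\mu\gamma_\nu = a^\mu a^\nu(-\gamma_\nu\gamma_\mu + 2\eta_{\mu\nu}) = -\slashed{a}\slashed{a} + 2a^2 , \qquad (4.405) \]

and therefore $\slashed{a}\slashed{a} = a^2$.
• $\slashed{a}\slashed{b} + \slashed{b}\slashed{a} = 2\,a\cdot b$

• $\gamma_\mu\slashed{a}\gamma^\mu = -2\slashed{a}$

• $\gamma_\mu\slashed{a}\slashed{b}\gamma^\mu = 4\,a\cdot b$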

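Concerning the traces of the γ’s, we have:

• $\mathrm{tr}\,\gamma^\mu = 0$

• $\mathrm{tr}(\slashed{a}\slashed{b}) = 4\,a\cdot b$. In fact

\[ \mathrm{tr}(\slashed{a}\slashed{b}) = a_\mu b_\nu\,\mathrm{tr}(\gamma^\mu\gamma^\nu) = \frac{1}{2}a_\mu b_\nu\,\mathrm{tr}(\gamma^\mu\gamma^\nu + \gamma^\nu\gamma^\mu) = \frac{1}{2}a_\mu b_\nu\,\mathrm{tr}(2\eta^{\mu\nu}1) = 4\,a\cdot b , \qquad (4.406) \]
where we used the cyclicity of the trace, $\mathrm{tr}(\gamma^\mu\gamma^\nu) = \mathrm{tr}(\gamma^\nu\gamma^\mu)$, and therefore
\[ \mathrm{tr}(\gamma^\mu\gamma^\nu) = \frac{1}{2}\left(\mathrm{tr}(\gamma^\mu\gamma^\nu) + \mathrm{tr}(\gamma^\nu\gamma^\mu)\right) . \qquad (4.407) \]

• $\mathrm{tr}(\slashed{a}\slashed{b}\slashed{c}) = 0$. In fact, using first the cyclicity of the trace and then anticommuting the γ5,

\[ \mathrm{tr}(\slashed{a}\slashed{b}\slashed{c}) = \mathrm{tr}(\slashed{a}\slashed{b}\slashed{c}\gamma_5\gamma_5) = \mathrm{tr}(\gamma_5\slashed{a}\slashed{b}\slashed{c}\gamma_5) = -\mathrm{tr}(\gamma_5\gamma_5\slashed{a}\slashed{b}\slashed{c}) = -\mathrm{tr}(\slashed{a}\slashed{b}\slashed{c}) . \qquad (4.408) \]

• $\mathrm{tr}(\slashed{a}\slashed{b}\slashed{c}\slashed{d}) = 4(a\cdot b)(c\cdot d) + 4(a\cdot d)(b\cdot c) - 4(a\cdot c)(b\cdot d)$. In fact

\[ \begin{aligned} \mathrm{tr}(\slashed{a}\slashed{b}\slashed{c}\slashed{d}) &= \mathrm{tr}[(-\slashed{b}\slashed{a} + 2(a\cdot b))\slashed{c}\slashed{d}] = 2(a\cdot b)\,\mathrm{tr}(\slashed{c}\slashed{d}) - \mathrm{tr}(\slashed{b}\slashed{a}\slashed{c}\slashed{d}) , \\ &= 8(a\cdot b)(c\cdot d) - \mathrm{tr}[\slashed{b}(-\slashed{c}\slashed{a} + 2\,a\cdot c)\slashed{d}] = 8(a\cdot b)(c\cdot d) - 8(a\cdot c)(b\cdot d) + \mathrm{tr}(\slashed{b}\slashed{c}\slashed{a}\slashed{d}) , \\ &= 8(a\cdot b)(c\cdot d) - 8(a\cdot c)(b\cdot d) + \mathrm{tr}[\slashed{b}\slashed{c}(-\slashed{d}\slashed{a} + 2(a\cdot d))] , && (4.409) \\ &= 8(a\cdot b)(c\cdot d) - 8(a\cdot c)(b\cdot d) + 8(a\cdot d)(b\cdot c) - \mathrm{tr}(\slashed{b}\slashed{c}\slashed{d}\slashed{a}) , && (4.410) \\ &= 8(a\cdot b)(c\cdot d) - 8(a\cdot c)(b\cdot d) + 8(a\cdot d)(b\cdot c) - \mathrm{tr}(\slashed{a}\slashed{b}\slashed{c}\slashed{d}) , && (4.411) \end{aligned} \]

from which $\mathrm{tr}(\slashed{a}\slashed{b}\slashed{c}\slashed{d}) = 4(a\cdot b)(c\cdot d) + 4(a\cdot d)(b\cdot c) - 4(a\cdot c)(b\cdot d)$.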

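• In general

\[ \mathrm{tr}(\slashed{a}_1\slashed{a}_2\dots\slashed{a}_n) = 0 , \quad \text{if } n \text{ is odd} , \qquad (4.412) \]
\[ \begin{aligned} \mathrm{tr}(\slashed{a}_1\slashed{a}_2\dots\slashed{a}_n) = {}& (a_1\cdot a_2)\,\mathrm{tr}(\slashed{a}_3\slashed{a}_4\dots\slashed{a}_n) - (a_1\cdot a_3)\,\mathrm{tr}(\slashed{a}_2\slashed{a}_4\dots\slashed{a}_n) + \dots \\ & + (a_1\cdot a_n)\,\mathrm{tr}(\slashed{a}_2\slashed{a}_3\dots\slashed{a}_{n-1}) , \quad \text{if } n \text{ is even} . && (4.413) \end{aligned} \]

And, including γ5:
• $\mathrm{tr}\,\gamma_5 = 0$

• $\mathrm{tr}(\gamma_5\slashed{a}) = 0$

• $\mathrm{tr}(\gamma_5\slashed{a}\slashed{b}) = 0$

• $\mathrm{tr}(\gamma_5\slashed{a}\slashed{b}\slashed{c}) = 0$

• $\mathrm{tr}(\gamma_5\slashed{a}\slashed{b}\slashed{c}\slashed{d}) = 4i\,\epsilon_{\mu\nu\rho\sigma}a^\mu b^\nu c^\rho d^\sigma$

• $\mathrm{tr}(\gamma_5\slashed{a}_1\slashed{a}_2\dots\slashed{a}_n) = 0$ if n is odd

• $\mathrm{tr}(\gamma_5\slashed{a}_1\slashed{a}_2\dots\slashed{a}_n) \neq 0$ if n is even, n > 4.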
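A few of the identities above can be spot-checked with explicit matrices. The sketch below assumes the Dirac representation (4.273), the metric η = diag(1, −1, −1, −1), vectors given by their contravariant components, and the convention ε0123 = +1. The sample vectors and all function names are arbitrary illustrative choices.

```python
# Sketch: spot-check of contraction and trace identities with sample four-vectors.
# Helper names are hypothetical; matrices are nested lists.
from itertools import permutations

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def block(a, b, c, d):
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

def equal(a, b, tol=1e-9):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

Z2, I2 = [[0, 0], [0, 0]], [[1, 0], [0, 1]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
I4 = block(I2, Z2, Z2, I2)
ETA = [1, -1, -1, -1]
GAMMA = [block(I2, Z2, Z2, scale(-1, I2))] + \
        [block(Z2, s, scale(-1, s), Z2) for s in SIGMA]
G5 = scale(1j, matmul(matmul(GAMMA[0], GAMMA[1]), matmul(GAMMA[2], GAMMA[3])))

def slash(a):
    """a-slash = a_mu gamma^mu = a^0 gamma^0 - a^i gamma^i for upper components a^mu."""
    out = scale(0, I4)
    for mu in range(4):
        out = add(out, scale(ETA[mu] * a[mu], GAMMA[mu]))
    return out

def dot(a, b):
    return sum(ETA[mu] * a[mu] * b[mu] for mu in range(4))

def trace_of(*mats):
    out = I4
    for m in mats:
        out = matmul(out, m)
    return sum(out[i][i] for i in range(4))

def perm_sign(p):
    inversions = sum(1 for i in range(4) for j in range(i + 1, 4) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def epsilon(a, b, c, d):
    """eps_{mu nu rho sigma} a^mu b^nu c^rho d^sigma with eps_{0123} = +1."""
    return sum(perm_sign(p) * a[p[0]] * b[p[1]] * c[p[2]] * d[p[3]]
               for p in permutations(range(4)))

def check_traces(a, b, c, d):
    A, B, C, D = slash(a), slash(b), slash(c), slash(d)
    contraction = scale(0, I4)                       # gamma_mu a-slash gamma^mu
    for mu in range(4):
        contraction = add(contraction, scale(ETA[mu], matmul(GAMMA[mu], matmul(A, GAMMA[mu]))))
    four = 4 * (dot(a, b) * dot(c, d) + dot(a, d) * dot(b, c) - dot(a, c) * dot(b, d))
    return {
        "contraction": equal(contraction, scale(-2, A)),
        "tr2": abs(trace_of(A, B) - 4 * dot(a, b)) < 1e-9,
        "tr3": abs(trace_of(A, B, C)) < 1e-9,
        "tr4": abs(trace_of(A, B, C, D) - four) < 1e-9,
        "tr5_2": abs(trace_of(G5, A, B)) < 1e-9,
        "tr5_4": abs(trace_of(G5, A, B, C, D) - 4j * epsilon(a, b, c, d)) < 1e-9,
    }

print(check_traces((2, 1, 0, 3), (1, -1, 2, 0), (0, 3, 1, 1), (1, 0, -2, 2)))
```

A numerical spot-check of this kind does not replace the proofs above, but it is a quick way to catch sign errors when the identities are used in longer computations.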

4.3.11 Plane wave solutions


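In this section we consider the plane wave solutions of the Dirac equation. We assume
\[ \psi(X) = u(P)\,e^{-iP_\mu X^\mu} , \qquad (4.414) \]

where u(P) is a spinor. Substituting into the Dirac equation, we get

\[ (i\gamma^\mu\partial_\mu - m)\,u(P)e^{-iP_\mu X^\mu} = (i\gamma^\mu(-iP_\mu) - m)\,u(P)e^{-iP_\mu X^\mu} = (\slashed{P} - m)\,u(P)e^{-iP_\mu X^\mu} = 0 . \qquad (4.415) \]

This leads to the following equation for the spinor u(P):

\[ (\slashed{P} - m)\,u(P) = 0 , \qquad (4.416) \]

or, in matrix form, using two two-component spinors φ and χ,
\[ \begin{pmatrix} P^0 - m & -\sigma\cdot p \\ \sigma\cdot p & -P^0 - m \end{pmatrix}\begin{pmatrix} \phi \\ \chi \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} . \qquad (4.417) \]
The system has non-trivial solutions only if
\[ \det\begin{pmatrix} P^0 - m & -\sigma\cdot p \\ \sigma\cdot p & -P^0 - m \end{pmatrix} = m^2 - (P^0)^2 + (\sigma\cdot p)^2 = m^2 - (P^0)^2 + p^2 = 0 , \qquad (4.418) \]
where we used the fact that
\[ (\sigma\cdot p)^2 = \sigma_i\sigma_j p_i p_j = \frac{1}{2}\left([\sigma_i , \sigma_j] + [\sigma_i , \sigma_j]_+\right)p_i p_j = (\delta_{ij} + i\epsilon_{ijk}\sigma_k)\,p_i p_j = p^2 . \qquad (4.419) \]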
Therefore, as in the Klein-Gordon case, we find again two kinds of solutions
\[ P^0 = \pm\sqrt{p^2 + m^2} = \pm\omega_p . \qquad (4.420) \]
We have two different plane waves, with positive and with negative frequency, that we will name
\[ \psi^{(+)} = u(P)\,e^{-iP_\mu X^\mu} , \qquad (4.421) \]
\[ \psi^{(-)} = v(P)\,e^{iP_\mu X^\mu} . \qquad (4.422) \]
Substituting into the Dirac equation, therefore, we find that the spinors u(P) and v(P) are solutions of
the following equations
\[ (\slashed{P} - m)\,u(P) = 0 , \qquad (4.423) \]
\[ (\slashed{P} + m)\,v(P) = 0 . \qquad (4.424) \]
In order to solve the system (4.423,4.424) it is convenient to move to the frame in which the particle
is at rest, i.e. the frame in which $P^\mu = (m, \mathbf{0})$. In this frame, $\slashed{P} = \gamma^0 m$ and therefore we get
\[ (\gamma^0 m - m)\,u(m, 0) = 0 , \qquad (4.425) \]
\[ (\gamma^0 m + m)\,v(m, 0) = 0 , \qquad (4.426) \]
or, since m ≠ 0
\[ (\gamma^0 - 1)\,u(m, 0) = 0 , \qquad (4.427) \]
\[ (\gamma^0 + 1)\,v(m, 0) = 0 . \qquad (4.428) \]
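If we define the general spinors
\[ u(m, 0) = \begin{pmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{pmatrix} , \qquad v(m, 0) = \begin{pmatrix} v_1 \\ v_2 \\ v_3 \\ v_4 \end{pmatrix} , \qquad (4.429) \]
and we consider for instance the Dirac representation (4.273) for the gamma matrices, Eqs. (4.427,4.428) have
the following solution
\[ u_3 = u_4 = v_1 = v_2 = 0 . \qquad (4.430) \]
We do not find any constraint on the other components, and therefore the general solution is the
following linear combination
\[ u(m, 0) = \alpha\begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} + \beta\begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} = \alpha\,u^{(1)}(m, 0) + \beta\,u^{(2)}(m, 0) , \qquad (4.431) \]
\[ v(m, 0) = \alpha'\begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix} + \beta'\begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix} = \alpha'\,v^{(1)}(m, 0) + \beta'\,v^{(2)}(m, 0) , \qquad (4.432) \]

where we defined
\[ u^{(1)}(m, 0) = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} , \quad u^{(2)}(m, 0) = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} , \quad v^{(1)}(m, 0) = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix} , \quad v^{(2)}(m, 0) = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix} , \qquad (4.433) \]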
and where the degeneracy is due to the spin. In the rest frame we have: positive-energy solutions, with
spin up or spin down, and negative-energy solutions with spin up or spin down.
The spinors u(1) (m, 0), u(2) (m, 0), v (1) (m, 0), v (2) (m, 0), are eigenvectors of the third component
of the spin, with eigenvalues ±1/2.
In order to find the general solution, u(P ), v(P ) in a frame in which P µ = (E, p) we can boost our
solutions, found in the rest frame, using Eq. (4.329).
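If
\[ \gamma = \frac{E}{m} , \qquad \beta\gamma = \frac{p}{m} , \qquad \mathbf{p} = p\,\hat{n} , \qquad (4.434) \]
we have
\[ S(\Lambda) = \sqrt{\frac{\gamma+1}{2}}\begin{pmatrix} 1 & -\frac{\beta\gamma}{\gamma+1}\,\sigma\cdot n \\ -\frac{\beta\gamma}{\gamma+1}\,\sigma\cdot n & 1 \end{pmatrix} = \sqrt{\frac{E+m}{2m}}\begin{pmatrix} 1 & -\frac{\sigma\cdot p}{E+m} \\ -\frac{\sigma\cdot p}{E+m} & 1 \end{pmatrix} . \qquad (4.435) \]

Therefore, using the generic two-component spinor in the rest frame

\[ u^{(\alpha)}(m, 0) = \begin{pmatrix} \phi^{(\alpha)} \\ 0 \end{pmatrix} , \qquad v^{(\alpha)}(m, 0) = \begin{pmatrix} 0 \\ \chi^{(\alpha)} \end{pmatrix} , \qquad (4.436) \]

where α = 1, 2 and $\phi^{(\alpha)}$, $\chi^{(\alpha)}$ are the non-vanishing two-component parts of $u^{(\alpha)}(m, 0)$ and $v^{(\alpha)}(m, 0)$, we have
\[ u^{(\alpha)}(P) = S^{-1}(\Lambda)\,u^{(\alpha)}(m, 0) = \sqrt{\frac{E+m}{2m}}\begin{pmatrix} 1 & \frac{\sigma\cdot p}{E+m} \\ \frac{\sigma\cdot p}{E+m} & 1 \end{pmatrix}\begin{pmatrix} \phi^{(\alpha)} \\ 0 \end{pmatrix} = \begin{pmatrix} \sqrt{\frac{E+m}{2m}}\,\phi^{(\alpha)} \\ \frac{\sigma\cdot p}{\sqrt{2m(E+m)}}\,\phi^{(\alpha)} \end{pmatrix} . \qquad (4.437) \]

The same for the spinor v(P):

\[ v^{(\alpha)}(P) = S^{-1}(\Lambda)\,v^{(\alpha)}(m, 0) = \sqrt{\frac{E+m}{2m}}\begin{pmatrix} 1 & \frac{\sigma\cdot p}{E+m} \\ \frac{\sigma\cdot p}{E+m} & 1 \end{pmatrix}\begin{pmatrix} 0 \\ \chi^{(\alpha)} \end{pmatrix} = \begin{pmatrix} \frac{\sigma\cdot p}{\sqrt{2m(E+m)}}\,\chi^{(\alpha)} \\ \sqrt{\frac{E+m}{2m}}\,\chi^{(\alpha)} \end{pmatrix} . \qquad (4.438) \]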

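The same result can be found noting that

\[ (\slashed{P} - m)(\slashed{P} + m) = P^2 - m^2 = 0 . \qquad (4.439) \]

Therefore, if we define

\[ u^{(\alpha)}(P) = C_\alpha\,(\slashed{P} + m)\,u^{(\alpha)}(m, 0) , \qquad (4.440) \]
\[ v^{(\alpha)}(P) = D_\alpha\,(-\slashed{P} + m)\,v^{(\alpha)}(m, 0) , \qquad (4.441) \]

where $C_\alpha$ and $D_\alpha$ are normalization factors, we find immediately

\[ (\slashed{P} - m)\,u^{(\alpha)}(P) = 0 , \qquad (4.442) \]
\[ (\slashed{P} + m)\,v^{(\alpha)}(P) = 0 . \qquad (4.443) \]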

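In order to find the normalization factors $C_\alpha$ and $D_\alpha$, we note that (by direct inspection) the
following relations hold in the rest frame:

\[ \bar{u}^{(\alpha)}(m, 0)\,u^{(\beta)}(m, 0) = \delta_{\alpha\beta} , \qquad (4.444) \]
\[ \bar{v}^{(\alpha)}(m, 0)\,v^{(\beta)}(m, 0) = -\delta_{\alpha\beta} , \qquad (4.445) \]
\[ \bar{u}^{(\alpha)}(m, 0)\,v^{(\beta)}(m, 0) = 0 . \qquad (4.446) \]

These relations are already cast in scalar form, in the sense that in a generic frame it must hold

\[ \bar{u}^{(\alpha)}(P)\,u^{(\beta)}(P) = \delta_{\alpha\beta} , \qquad (4.448) \]
\[ \bar{v}^{(\alpha)}(P)\,v^{(\beta)}(P) = -\delta_{\alpha\beta} , \qquad (4.449) \]
\[ \bar{u}^{(\alpha)}(P)\,v^{(\beta)}(P) = 0 , \qquad (4.450) \]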

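that can be used to impose the normalization of the spinors:

\[ \begin{aligned} \delta_{\alpha\beta} = \bar{u}^{(\alpha)}(P)\,u^{(\beta)}(P) &= C^*_\alpha\,(u^{(\alpha)}(m, 0))^\dagger(\slashed{P} + m)^\dagger\gamma^0\,C_\beta\,(\slashed{P} + m)\,u^{(\beta)}(m, 0) , && (4.452) \\ &= C^*_\alpha C_\beta\,\bar{u}^{(\alpha)}(m, 0)\,(\slashed{P} + m)^2\,u^{(\beta)}(m, 0) , && (4.453) \\ &= C^*_\alpha C_\beta\,\bar{u}^{(\alpha)}(m, 0)\,(2m\slashed{P} + 2m^2)\,u^{(\beta)}(m, 0) , && (4.454) \\ &= C^*_\alpha C_\beta\,2m(E + m)\,\bar{u}^{(\alpha)}(m, 0)\,u^{(\beta)}(m, 0) , && (4.455) \\ &= |C_\alpha|^2\,2m(E + m)\,\delta_{\alpha\beta} , && (4.456) \end{aligned} \]
where in passing from (4.454) to (4.455) we used $\bar{u}^{(\alpha)}(m, 0)\,\slashed{P}\,u^{(\beta)}(m, 0) = P_0\,\bar{u}^{(\alpha)}(m, 0)\gamma^0u^{(\beta)}(m, 0) = E\,\bar{u}^{(\alpha)}(m, 0)\,u^{(\beta)}(m, 0)$.

This gives (apart from a phase that we choose to be equal to zero)

\[ C_\alpha = \frac{1}{\sqrt{2m(E + m)}} . \qquad (4.457) \]
We find the same expression for $D_\alpha$, using Eq. (4.449). In the end
\[ u^{(\alpha)}(P) = \frac{\slashed{P} + m}{\sqrt{2m(E + m)}}\,u^{(\alpha)}(m, 0) , \qquad (4.458) \]
\[ v^{(\alpha)}(P) = \frac{-\slashed{P} + m}{\sqrt{2m(E + m)}}\,v^{(\alpha)}(m, 0) . \qquad (4.459) \]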
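It is easy to check that these spinors are indeed orthogonal (they satisfy Eq. (4.450)):

\[ \bar{u}^{(\alpha)}(P)\,v^{(\beta)}(P) = \frac{\bar{u}^{(\alpha)}(m, 0)\,(\slashed{P} + m)(-\slashed{P} + m)\,v^{(\beta)}(m, 0)}{2m(E + m)} = 0 . \qquad (4.460) \]
Using the two-component expression for u(m, 0) and v(m, 0)
\[ u^{(\alpha)}(m, 0) = \begin{pmatrix} \phi^{(\alpha)} \\ 0 \end{pmatrix} , \qquad v^{(\alpha)}(m, 0) = \begin{pmatrix} 0 \\ \chi^{(\alpha)} \end{pmatrix} \qquad (4.461) \]
and explicitly expressing $(\slashed{P} + m)$ and $(-\slashed{P} + m)$ in matrix notation, we get the expressions of
Eqs. (4.437,4.438):
\[ u^{(\alpha)}(P) = \begin{pmatrix} \sqrt{\frac{E+m}{2m}}\,\phi^{(\alpha)} \\ \frac{\sigma\cdot p}{\sqrt{2m(E+m)}}\,\phi^{(\alpha)} \end{pmatrix} , \qquad (4.462) \]
\[ v^{(\alpha)}(P) = \begin{pmatrix} \frac{\sigma\cdot p}{\sqrt{2m(E+m)}}\,\chi^{(\alpha)} \\ \sqrt{\frac{E+m}{2m}}\,\chi^{(\alpha)} \end{pmatrix} . \qquad (4.463) \]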

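In this way, we found the spinors normalized in the sense of Eqs. (4.448,4.449). However, the
scalar product for our fields is defined in terms of ψ† and not $\bar{\psi}$. What we would like to impose is the
normalization of the charge, which is given by
\[ Q = \int d^3X\,\psi^\dagger\psi . \qquad (4.464) \]

Let us note that

\[ \begin{aligned} (u^{(\alpha)}(P))^\dagger u^{(\alpha)}(P) &= \begin{pmatrix} \sqrt{\frac{E+m}{2m}}\,(\phi^{(\alpha)})^\dagger & (\phi^{(\alpha)})^\dagger\frac{\sigma\cdot p}{\sqrt{2m(E+m)}} \end{pmatrix}\begin{pmatrix} \sqrt{\frac{E+m}{2m}}\,\phi^{(\alpha)} \\ \frac{\sigma\cdot p}{\sqrt{2m(E+m)}}\,\phi^{(\alpha)} \end{pmatrix} , && (4.465) \\ &= \frac{E+m}{2m}\,(\phi^{(\alpha)})^\dagger\phi^{(\alpha)} + \frac{(\sigma\cdot p)^2}{2m(E+m)}\,(\phi^{(\alpha)})^\dagger\phi^{(\alpha)} , && (4.466) \\ &= \frac{E}{m} , && (4.467) \end{aligned} \]
where we used $(\phi^{(\alpha)})^\dagger\phi^{(\alpha)} = 1$ and $(\sigma\cdot p)^2 = p^2 = E^2 - m^2$,
while $(u^{(\alpha)}(P))^\dagger u^{(\beta)}(P) = 0$ when α ≠ β. For the spinor v(P) we get

\[ \begin{aligned} (v^{(\alpha)}(P))^\dagger v^{(\alpha)}(P) &= \begin{pmatrix} (\chi^{(\alpha)})^\dagger\frac{\sigma\cdot p}{\sqrt{2m(E+m)}} & \sqrt{\frac{E+m}{2m}}\,(\chi^{(\alpha)})^\dagger \end{pmatrix}\begin{pmatrix} \frac{\sigma\cdot p}{\sqrt{2m(E+m)}}\,\chi^{(\alpha)} \\ \sqrt{\frac{E+m}{2m}}\,\chi^{(\alpha)} \end{pmatrix} , && (4.468) \\ &= \frac{p^2}{2m(E+m)}\,(\chi^{(\alpha)})^\dagger\chi^{(\alpha)} + \frac{E+m}{2m}\,(\chi^{(\alpha)})^\dagger\chi^{(\alpha)} , && (4.469) \\ &= \frac{E}{m} . && (4.470) \end{aligned} \]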
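In order to normalize the positive and negative energy solutions using the correct scalar product,
we then have to consider
\[ \psi^{(+)}_{(\alpha)}(X) = N\sqrt{\frac{m}{E}}\;u^{(\alpha)}(P)\,e^{-iP_\mu X^\mu} , \qquad (4.471) \]
\[ \psi^{(-)}_{(\alpha)}(X) = N\sqrt{\frac{m}{E}}\;v^{(\alpha)}(P)\,e^{iP_\mu X^\mu} , \qquad (4.472) \]
such that
\[ (\psi^{(+)}_{(\alpha)}(X))^\dagger\psi^{(+)}_{(\beta)}(X) = \delta_{\alpha\beta} , \qquad (\psi^{(-)}_{(\alpha)}(X))^\dagger\psi^{(-)}_{(\beta)}(X) = \delta_{\alpha\beta} , \qquad (\psi^{(+)}_{(\alpha)}(X))^\dagger\psi^{(-)}_{(\beta)}(X) = 0 . \qquad (4.473) \]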
In Eqs. (4.471,4.472), N is a normalization factor.
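The construction of the spinors can be checked with explicit numbers. The sketch below builds u(P) and v(P) from Eqs. (4.458,4.459) in the Dirac representation and verifies Eqs. (4.442,4.443), the normalizations (4.448)-(4.450), and the result u†u = v†v = E/m. The kinematic values (m = 4, p = (1, 2, 2), E = 5) and all function names are illustrative choices.

```python
# Sketch: numerical check of the plane-wave spinors u(P), v(P).
# Helper names are hypothetical; spinors are 4x1 nested lists.
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def block(a, b, c, d):
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

Z2, I2 = [[0, 0], [0, 0]], [[1, 0], [0, 1]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
I4 = block(I2, Z2, Z2, I2)
ETA = [1, -1, -1, -1]
GAMMA = [block(I2, Z2, Z2, scale(-1, I2))] + \
        [block(Z2, s, scale(-1, s), Z2) for s in SIGMA]

U_REST = [[[1], [0], [0], [0]], [[0], [1], [0], [0]]]   # u^(1), u^(2) of Eq. (4.433)
V_REST = [[[0], [0], [1], [0]], [[0], [0], [0], [1]]]   # v^(1), v^(2) of Eq. (4.433)

def slash(p):
    out = scale(0, I4)
    for mu in range(4):
        out = add(out, scale(ETA[mu] * p[mu], GAMMA[mu]))
    return out

def dirac_spinors(E, p, m):
    """u and v from Eqs. (4.458) and (4.459)."""
    P = [E] + list(p)
    norm = 1 / math.sqrt(2 * m * (E + m))
    plus = add(slash(P), scale(m, I4))
    minus = add(scale(-1, slash(P)), scale(m, I4))
    u = [scale(norm, matmul(plus, s)) for s in U_REST]
    v = [scale(norm, matmul(minus, s)) for s in V_REST]
    return u, v

def bar(s):
    """Dirac adjoint (a 1x4 row), Eq. (4.304)."""
    return matmul(dagger(s), GAMMA[0])

def inner(row, col):
    return matmul(row, col)[0][0]

def check_spinors(E, p, m, tol=1e-12):
    u, v = dirac_spinors(E, p, m)
    P = [E] + list(p)
    vanishes = lambda col: all(abs(x[0]) < tol for x in col)
    delta = lambda a, b: 1 if a == b else 0
    pairs = [(a, b) for a in range(2) for b in range(2)]
    return {
        "dirac_u": all(vanishes(matmul(add(slash(P), scale(-m, I4)), s)) for s in u),
        "dirac_v": all(vanishes(matmul(add(slash(P), scale(m, I4)), s)) for s in v),
        "ubar_u": all(abs(inner(bar(u[a]), u[b]) - delta(a, b)) < tol for a, b in pairs),
        "vbar_v": all(abs(inner(bar(v[a]), v[b]) + delta(a, b)) < tol for a, b in pairs),
        "ubar_v": all(abs(inner(bar(u[a]), v[b])) < tol for a, b in pairs),
        "udag_u": all(abs(inner(dagger(u[a]), u[b]) - delta(a, b) * E / m) < tol
                      for a, b in pairs),
        "vdag_v": all(abs(inner(dagger(v[a]), v[b]) - delta(a, b) * E / m) < tol
                      for a, b in pairs),
    }

print(check_spinors(5.0, (1.0, 2.0, 2.0), 4.0))
```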


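Recalling the scalar product
\[ (\psi_1 , \psi_2) = \int d^3X\,\psi_1^\dagger\psi_2 , \qquad (4.474) \]

we now want to normalize the fields to the delta:

\[ \begin{aligned} (\psi^{(+)}_{(\alpha)}(X) , \psi^{(+)}_{(\beta)}(X)) &= |N|^2\,\frac{m}{E}\int d^3X\,(u^{(\alpha)}(P))^\dagger u^{(\beta)}(Q)\,e^{i(P-Q)_\mu X^\mu} , \\ &= |N|^2\,\frac{m}{E}\,\frac{E}{m}\,\delta_{\alpha\beta}\int d^3X\,e^{i(P-Q)_\mu X^\mu} = |N|^2\,\delta_{\alpha\beta}\,(2\pi)^3\delta(\mathbf{p} - \mathbf{q}) . && (4.475) \end{aligned} \]

This implies (we choose N real):
\[ N = \frac{1}{\sqrt{(2\pi)^3}} . \qquad (4.476) \]
Finally, the full expression of the Dirac field in normal modes is
\[ \psi(X) = \sum_{\alpha=1}^{2}\int\frac{d^3p}{(2\pi)^{3/2}}\sqrt{\frac{m}{E}}\left[b_{(\alpha)}(p)\,u^{(\alpha)}(P)\,e^{-iP_\mu X^\mu} + d^*_{(\alpha)}(p)\,v^{(\alpha)}(P)\,e^{iP_\mu X^\mu}\right] , \qquad (4.477) \]
with $E = \sqrt{p^2 + m^2} > 0$ and where, for the classical field, $b_{(\alpha)}(p)$ and $d^*_{(\alpha)}(p)$ are the coefficients of the
linear combination.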

4.3.12 Energy projectors and polarization sum


It is convenient to introduce the projectors for positive and negative-energy, spin up and spin down
solutions, in such a way that from a generic solution we could project out four independent solutions
(positive-energy spin-up, positive-energy spin-down, negative-energy spin-up, negative-energy spin-
down solutions).
Considering that

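\[ (\slashed{P} + m)(\slashed{P} + m) = 2m(\slashed{P} + m) , \qquad (\slashed{P} + m)(-\slashed{P} + m) = 0 , \qquad (4.478) \]

let us write the following operators

\[ \Lambda_\pm = \frac{\pm\slashed{P} + m}{2m} . \qquad (4.479) \]
These are indeed the projectors we were looking for. In fact, if

\[ \psi(X) \sim \alpha\,u(P) + \beta\,v(P) , \qquad (4.480) \]

we have
\[ \Lambda_+\psi(X) = \alpha\,u(P) , \quad \text{and} \quad \Lambda_-\psi(X) = \beta\,v(P) . \qquad (4.481) \]
The operators $\Lambda_\pm$ are projectors. In fact
\[ \Lambda_\pm^2 = \frac{1}{4m^2}(\pm\slashed{P} + m)(\pm\slashed{P} + m) = \frac{\pm\slashed{P} + m}{2m} = \Lambda_\pm , \qquad (4.482) \]
\[ \Lambda_+\Lambda_- = \frac{1}{4m^2}(\slashed{P} + m)(-\slashed{P} + m) = 0 , \qquad (4.483) \]
\[ \Lambda_+ + \Lambda_- = \frac{1}{2m}\left[\slashed{P} + m + (-\slashed{P} + m)\right] = 1 . \qquad (4.484) \]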
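The projectors $\Lambda_\pm$ can be written in terms of the polarization sum of the spinors as follows. We
have
\[ \begin{aligned} \sum_{\alpha=1}^{2}u^{(\alpha)}(P)\,\bar{u}^{(\alpha)}(P) &= \sum_{\alpha=1}^{2}u^{(\alpha)}(P)\,(u^{(\alpha)}(P))^\dagger\gamma^0 , && (4.485) \\ &= \frac{1}{2m(E+m)}\sum_{\alpha=1}^{2}(\slashed{P} + m)\,u^{(\alpha)}(m, 0)\,(u^{(\alpha)}(m, 0))^\dagger(\slashed{P} + m)^\dagger\gamma^0 , && (4.486) \\ &= \frac{1}{2m(E+m)}\,(\slashed{P} + m)\sum_{\alpha=1}^{2}u^{(\alpha)}(m, 0)\,\bar{u}^{(\alpha)}(m, 0)\,(\slashed{P} + m) , && (4.487) \\ &= \frac{1}{2m(E+m)}\,(\slashed{P} + m)\,\frac{1 + \gamma^0}{2}\,(\slashed{P} + m) , && (4.488) \\ &= \frac{1}{4m(E+m)}\left[2m(\slashed{P} + m) + (\slashed{P} + m)(\gamma^0\gamma^\nu P_\nu + m\gamma^0)\right] , && (4.489) \\ &= \frac{1}{4m(E+m)}\left[2m(\slashed{P} + m) + (\slashed{P} + m)(2E + (-\slashed{P} + m)\gamma^0)\right] , && (4.490) \\ &= \frac{2(E+m)}{4m(E+m)}\,(\slashed{P} + m) , && (4.491) \\ &= \frac{\slashed{P} + m}{2m} = \Lambda_+ , && (4.492) \end{aligned} \]
where in (4.488) we used $\sum_{\alpha=1}^{2}u^{(\alpha)}(m, 0)\,\bar{u}^{(\alpha)}(m, 0) = \frac{1 + \gamma^0}{2}$ and in (4.490) we used $\gamma^0\gamma^\nu = -\gamma^\nu\gamma^0 + 2\eta^{0\nu}$.
Analogously we find
\[ \sum_{\alpha=1}^{2}v^{(\alpha)}(P)\,\bar{v}^{(\alpha)}(P) = \frac{\slashed{P} - m}{2m} = -\Lambda_- . \qquad (4.493) \]
Therefore
\[ \sum_{\alpha=1}^{2}u^{(\alpha)}(P)\,\bar{u}^{(\alpha)}(P) - \sum_{\alpha=1}^{2}v^{(\alpha)}(P)\,\bar{v}^{(\alpha)}(P) = 1 . \qquad (4.494) \]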
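The polarization sums and the projector properties can be verified with explicit matrices. The sketch below assumes the Dirac representation, the spinors of Eqs. (4.458,4.459) and the illustrative kinematics m = 4, p = (1, 2, 2), E = 5; it checks Eqs. (4.482)-(4.484) and (4.492)-(4.494). All function names are hypothetical.

```python
# Sketch: numerical check of the energy projectors and of the polarization sums.
# Helper names are hypothetical; matrices are nested lists.
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def block(a, b, c, d):
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

def equal(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

Z2, I2 = [[0, 0], [0, 0]], [[1, 0], [0, 1]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
I4 = block(I2, Z2, Z2, I2)
ZERO4 = scale(0, I4)
ETA = [1, -1, -1, -1]
GAMMA = [block(I2, Z2, Z2, scale(-1, I2))] + \
        [block(Z2, s, scale(-1, s), Z2) for s in SIGMA]
U_REST = [[[1], [0], [0], [0]], [[0], [1], [0], [0]]]
V_REST = [[[0], [0], [1], [0]], [[0], [0], [0], [1]]]

def slash(p):
    out = ZERO4
    for mu in range(4):
        out = add(out, scale(ETA[mu] * p[mu], GAMMA[mu]))
    return out

def energy_projector(sign, P, m):
    """Lambda_pm = (pm P-slash + m) / 2m, Eq. (4.479)."""
    return scale(1 / (2 * m), add(scale(sign, slash(P)), scale(m, I4)))

def polarization_sum(rest_spinors, sign, P, m):
    """sum_alpha w wbar, with w = (sign * P-slash + m) w_rest / sqrt(2m(E+m))."""
    norm = 1 / math.sqrt(2 * m * (P[0] + m))
    op = add(scale(sign, slash(P)), scale(m, I4))
    total = ZERO4
    for s in rest_spinors:
        w = scale(norm, matmul(op, s))
        total = add(total, matmul(w, matmul(dagger(w), GAMMA[0])))
    return total

def check_projectors(P, m):
    lp, lm = energy_projector(+1, P, m), energy_projector(-1, P, m)
    sum_u = polarization_sum(U_REST, +1, P, m)
    sum_v = polarization_sum(V_REST, -1, P, m)
    return {
        "idempotent": equal(matmul(lp, lp), lp) and equal(matmul(lm, lm), lm),
        "orthogonal": equal(matmul(lp, lm), ZERO4),
        "complete": equal(add(lp, lm), I4),
        "sum_u_4_492": equal(sum_u, lp),
        "sum_v_4_493": equal(sum_v, scale(-1, lm)),
        "completeness_4_494": equal(add(sum_u, scale(-1, sum_v)), I4),
    }

print(check_projectors([5.0, 1.0, 2.0, 2.0], 4.0))
```

The idempotency and orthogonality checks rely on the momentum being on shell, $P^2 = m^2$, which holds for the sample values used here.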

4.3.13 Spin projectors


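The positive and negative energy solutions are still doubly degenerate. It is possible to remove this
degeneracy by selecting a spin state through spin projectors.
Let us consider the solution of the Dirac equation in the rest frame. The spinors $u^{(1)}(m, 0)$ and
$u^{(2)}(m, 0)$ are eigenstates of
\[ \sigma_{12} = \frac{i}{2}[\gamma_1 , \gamma_2] = \begin{pmatrix} \sigma^3 & 0 \\ 0 & \sigma^3 \end{pmatrix} \qquad (4.495) \]
with eigenvalues +1 and −1, respectively. The same is true for $v^{(1)}(m, 0)$ and $v^{(2)}(m, 0)$. Therefore, a
projector for eigenstates of spin up (in the ẑ direction) can be looked for in the following expression
\[ \tilde{\Sigma}(\hat{z}) = \frac{1 + \sigma_{12}}{2} , \qquad (4.496) \]
such that
\[ \tilde{\Sigma}(\hat{z})\,u^{(1)}(m, 0) = \frac{1 + \sigma_{12}}{2}\begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} = u^{(1)}(m, 0) , \qquad (4.497) \]
\[ \tilde{\Sigma}(\hat{z})\,u^{(2)}(m, 0) = \frac{1 + \sigma_{12}}{2}\begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} = 0 , \qquad (4.498) \]
\[ \tilde{\Sigma}(\hat{z})\,v^{(1)}(m, 0) = \frac{1 + \sigma_{12}}{2}\begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix} = v^{(1)}(m, 0) , \qquad (4.499) \]
\[ \tilde{\Sigma}(\hat{z})\,v^{(2)}(m, 0) = \frac{1 + \sigma_{12}}{2}\begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix} = 0 . \qquad (4.500) \]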
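We also notice that
\[ \sigma_{12} = \frac{i}{2}[\gamma_1 , \gamma_2] = i\gamma^1\gamma^2 = -\gamma^0\gamma_5\gamma^3 = \gamma_5\gamma_3\gamma^0 , \qquad (4.501) \]
and that
\[ \tilde{\Sigma}(\hat{z}) = \frac{1 + \sigma_{12}}{2} = \frac{1 + \sigma_{12}\,\hat{n}^3_R}{2} , \qquad (4.502) \]
where $\hat{n}^3_R$ is the third spatial component of the space-like vector
\[ \hat{n}^\mu_R = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix} , \qquad \hat{n}^2_R = -1 , \qquad \hat{n}^\mu_R P_\mu = 0 , \qquad (4.503) \]

in the rest frame. We can therefore write

\[ \tilde{\Sigma}(\hat{z}) = \frac{1 + \gamma_5\gamma_3\gamma^0}{2} = \frac{1 + \gamma_5\gamma_3\hat{n}^3_R\gamma^0}{2} = \frac{1 + \gamma_5\slashed{\hat{n}}_R\gamma^0}{2} = \tilde{\Sigma}(\hat{n}_R) , \qquad (4.504) \]
where, in the rest frame
\[ \slashed{\hat{n}}_R = \gamma_0\hat{n}^0_R + \gamma_i\hat{n}^i_R = \gamma_3\hat{n}^3_R . \qquad (4.505) \]
The expression of $\tilde{\Sigma}(\hat{n}_R)$ in Eq. (4.504) is “almost” generalizable to a generic inertial frame. The
problem is the presence of $\gamma^0$, which does not allow us to use the same expression in another frame. If we
could drop the $\gamma^0$ from Eq. (4.504), we would have reached our goal.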
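The projectors $\tilde{\Sigma}(\pm\hat{n}_R) = \frac{1 \pm \gamma_5\slashed{\hat{n}}_R\gamma^0}{2}$ behave as follows
\[ \tilde{\Sigma}(+\hat{n}_R)\begin{pmatrix} \alpha \\ \beta \\ \gamma \\ \delta \end{pmatrix} = \frac{1}{2}\begin{pmatrix} 1 + \sigma^3 & 0 \\ 0 & 1 + \sigma^3 \end{pmatrix}\begin{pmatrix} \alpha \\ \beta \\ \gamma \\ \delta \end{pmatrix} = \begin{pmatrix} \alpha \\ 0 \\ \gamma \\ 0 \end{pmatrix} , \qquad (4.506) \]
\[ \tilde{\Sigma}(-\hat{n}_R)\begin{pmatrix} \alpha \\ \beta \\ \gamma \\ \delta \end{pmatrix} = \frac{1}{2}\begin{pmatrix} 1 - \sigma^3 & 0 \\ 0 & 1 - \sigma^3 \end{pmatrix}\begin{pmatrix} \alpha \\ \beta \\ \gamma \\ \delta \end{pmatrix} = \begin{pmatrix} 0 \\ \beta \\ 0 \\ \delta \end{pmatrix} . \qquad (4.507) \]

Let us see to which projectors correspond instead the

\[ \Sigma(\pm\hat{n}_R) = \frac{1 \pm \gamma_5\slashed{\hat{n}}_R}{2} , \qquad (4.508) \]
without the $\gamma^0$ in their expression. We have
\[ \Sigma(+\hat{n}_R)\begin{pmatrix} \alpha \\ \beta \\ \gamma \\ \delta \end{pmatrix} = \frac{1}{2}\begin{pmatrix} 1 + \sigma^3 & 0 \\ 0 & 1 - \sigma^3 \end{pmatrix}\begin{pmatrix} \alpha \\ \beta \\ \gamma \\ \delta \end{pmatrix} = \begin{pmatrix} \alpha \\ 0 \\ 0 \\ \delta \end{pmatrix} , \qquad (4.509) \]
\[ \Sigma(-\hat{n}_R)\begin{pmatrix} \alpha \\ \beta \\ \gamma \\ \delta \end{pmatrix} = \frac{1}{2}\begin{pmatrix} 1 - \sigma^3 & 0 \\ 0 & 1 + \sigma^3 \end{pmatrix}\begin{pmatrix} \alpha \\ \beta \\ \gamma \\ \delta \end{pmatrix} = \begin{pmatrix} 0 \\ \beta \\ \gamma \\ 0 \end{pmatrix} . \qquad (4.510) \]

Therefore, in the rest frame $\Sigma(+\hat{n}_R)$ projects positive-energy spin-up and negative-energy spin-down
solutions, while $\Sigma(-\hat{n}_R)$ projects positive-energy spin-down and negative-energy spin-up solutions.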
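The expression of the spin projectors in a general frame is, then
\[ \Sigma(\pm\hat{n}) = \frac{1 \pm \gamma_5\slashed{\hat{n}}}{2} , \qquad (4.511) \]
in which now $\hat{n}^\mu$ is the boosted unit space-like vector. In fact, if Λ is the boost, according to which

\[ P^\mu = \Lambda^\mu{}_\nu P^\nu_R , \qquad (4.512) \]

where $P^\mu_R = (m, \mathbf{0})$, we have

\[ \begin{aligned} \Sigma(\pm\hat{n}) = S^{-1}(\Lambda)\Sigma(\pm\hat{n}_R)S(\Lambda) &= \frac{1 \pm S^{-1}(\Lambda)\gamma_5S(\Lambda)\,S^{-1}(\Lambda)\gamma_\mu S(\Lambda)\,\hat{n}^\mu_R}{2} , && (4.513) \\ &= \frac{1 \pm \gamma_5\gamma_\nu\Lambda^\nu{}_\mu\hat{n}^\mu_R}{2} = \frac{1 \pm \gamma_5\slashed{\hat{n}}}{2} . && (4.514) \end{aligned} \]
The vector $\hat{n}^\mu$ is still space-like, $\hat{n}^2 = -1$, and since in the rest frame we have $\hat{n}^\mu_R P_{R\,\mu} = 0$, in the
boosted frame we still have $\hat{n}^\mu P_\mu = 0$. The operator $\frac{1}{2}\gamma_5\slashed{\hat{n}}$ is called the Pauli-Lubanski operator and
it is the relativistic generalization of what in the rest frame is the projection of the spin σ/2 in the
direction of $\hat{n}_R$. If in the rest frame we have

\[ \Sigma(\pm\hat{n}_R)\,u(m, 0) = \pm u(m, 0) , \qquad (4.515) \]
\[ \Sigma(\pm\hat{n}_R)\,v(m, 0) = \mp v(m, 0) , \qquad (4.516) \]

in the boosted frame we still have

\[ \begin{aligned} \Sigma(\pm\hat{n})\,u(P) &= \Sigma(\pm\hat{n})\,S^{-1}(\Lambda)u(m, 0) = S^{-1}(\Lambda)\Sigma(\pm\hat{n}_R)u(m, 0) = \pm S^{-1}(\Lambda)u(m, 0) , \\ &= \pm u(P) , && (4.517) \\ \Sigma(\pm\hat{n})\,v(P) &= \Sigma(\pm\hat{n})\,S^{-1}(\Lambda)v(m, 0) = S^{-1}(\Lambda)\Sigma(\pm\hat{n}_R)v(m, 0) = \mp S^{-1}(\Lambda)v(m, 0) , \\ &= \mp v(P) . && (4.518) \end{aligned} \]

$\Sigma(\pm\hat{n})$ project out positive energy solutions with spin projection in the $\hat{n}$ direction of ±1/2 and
negative energy solutions with spin projection ∓1/2.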
Σ(±n̂) are actually projectors, then they satisfy the following properties:
1 1 1
Σ2 (±n̂) = (1 ± γ5 6 n̂)2 = (1 ± 2γ5 6 n̂ + γ5 6 n̂γ5 6 n̂) = (2 ± 2γ5 6 n̂) ,
4 4 4
= Σ(±n̂) , (4.519)
1
Σ(+n̂) + Σ(−n̂) = (1 + γ5 6 n̂ + 1 − γ5 6 n̂) = 1 , (4.520)
2
  
1 + γ5 6 n̂ 1 − γ5 6 n̂
Σ(+n̂)Σ(−n̂) = = 0. (4.521)
2 2

We have
[Λ± , Σ(±n̂)] = 0 , for every n̂ such that n̂µ Pµ = 0 . (4.522)
In fact

6 P γ5 6 n̂ = P µ n̂ν γµ γ5 γn u = −P µ n̂ν γ5 γµ γn u = −P µ n̂ν γ5 (−γν γµ + 2ηµν ) , (4.523)


ν
= γ5 6 n̂ 6 P + 2γ5 Pν n̂ , (4.524)
µ
= | since n̂ Pµ = 0 |
= γ5 6 n̂ 6 P . (4.525)

115
and therefore, if Pν n̂ν = 0, we have
     
±6P +m 1 ± γ5 6 n̂ 1 ± γ5 6 n̂ ±6P +m
= . (4.526)
2m 2 2 2m

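Using $\Lambda_\pm$ and $\Sigma(\pm\hat n)$ we can compose projectors onto states of definite energy and spin
\begin{align}
P_1 &= \Lambda_+\Sigma(+\hat n) , \tag{4.527}\\
P_2 &= \Lambda_+\Sigma(-\hat n) , \tag{4.528}\\
P_3 &= \Lambda_-\Sigma(+\hat n) , \tag{4.529}\\
P_4 &= \Lambda_-\Sigma(-\hat n) , \tag{4.530}
\end{align}
such that
$$ \sum_{i=1}^{4} P_i = 1 , \qquad P_i P_j = \delta_{ij}P_i , \qquad \mathrm{tr}\,P_i = 1 . \tag{4.531} $$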
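The properties (4.531) can be checked numerically. The following is a minimal sketch, assuming the Dirac representation of the gamma matrices ($\gamma^0$ diagonal) and an illustrative choice of momentum along $z$ with $m = 1$, $|\mathbf p| = 0.75$ and $\hat n^\mu = (|\mathbf p|/m, 0, 0, E/m)$; the helper names `dirac_gammas` and `slash` are hypothetical and not part of the notes.

```python
import numpy as np

def dirac_gammas():
    """Gamma matrices gamma^mu in the Dirac representation (assumed convention)."""
    s1 = np.array([[0, 1], [1, 0]], dtype=complex)
    s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
    s3 = np.array([[1, 0], [0, -1]], dtype=complex)
    zero, one = np.zeros((2, 2), dtype=complex), np.eye(2, dtype=complex)
    g0 = np.block([[one, zero], [zero, -one]])
    return [g0] + [np.block([[zero, s], [-s, zero]]) for s in (s1, s2, s3)]

g = dirac_gammas()
gamma5 = 1j * g[0] @ g[1] @ g[2] @ g[3]
eta = np.diag([1.0, -1.0, -1.0, -1.0])
I4 = np.eye(4, dtype=complex)

def slash(v):
    """Return gamma^mu v_mu for a contravariant four-vector v^mu."""
    v_low = eta @ np.asarray(v, dtype=float)
    return sum(c * gm for c, gm in zip(v_low, g))

# Illustrative kinematics: momentum along z, n space-like with n.P = 0.
m, p = 1.0, 0.75
E = np.sqrt(m**2 + p**2)
P = [E, 0.0, 0.0, p]
n = [p / m, 0.0, 0.0, E / m]

Lam = {s: (s * slash(P) + m * I4) / (2 * m) for s in (+1, -1)}
Sig = {s: (I4 + s * gamma5 @ slash(n)) / 2 for s in (+1, -1)}

# P_1 ... P_4 as in Eqs. (4.527)-(4.530).
projectors = [Lam[+1] @ Sig[+1], Lam[+1] @ Sig[-1],
              Lam[-1] @ Sig[+1], Lam[-1] @ Sig[-1]]

completeness = np.allclose(sum(projectors), I4)
orthogonality = all(
    np.allclose(Pi @ Pj, Pi if i == j else 0 * I4)
    for i, Pi in enumerate(projectors)
    for j, Pj in enumerate(projectors)
)
unit_traces = all(np.isclose(np.trace(Pi), 1.0) for Pi in projectors)
```

Each of the three flags evaluates the corresponding statement in (4.531); other momenta can be tried as long as $\hat n^2 = -1$ and $\hat n^\mu P_\mu = 0$ are respected.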

4.3.14 Non relativistic limit of the Dirac’s equation


We consider in this section the case in which the particle that we would like to describe using the Dirac equation moves with a speed much smaller than the speed of light, $v \ll c$, and interacts with an electromagnetic field.
In order to describe the interaction, we perform the so-called “minimal substitution” in the Dirac equation. This amounts to
$$ \partial^\mu \to \partial^\mu + ieA^\mu , \tag{4.532} $$
where $e$ is the electric charge of the electron (negative, so $e = -|e|$) and $A^\mu = (\phi, \mathbf A)$ is the electromagnetic four-potential. Under the substitution (4.532) the free Dirac equation becomes
$$ (i\slashed\partial - e\slashed A - m)\psi(X) = 0 . \tag{4.533} $$
In components we have
$$ (i\gamma_0\partial^0 - e\gamma_0 A^0 - m)\psi(X) + \gamma_i(i\partial^i - eA^i)\psi(X) = 0 . \tag{4.534} $$
We would like Eq. (4.534) to provide an accurate description of the behaviour of an electron (positive-energy state) in an electromagnetic field for small velocities. Its energy will be
$$ E = \sqrt{\mathbf p^2 + m^2} \simeq m + \frac{\mathbf p^2}{2m} + \dots \tag{4.535} $$
where $\frac{\mathbf p^2}{2m} \ll m$. In this situation the term $e^{-iP_\mu X^\mu}$ is dominated by $e^{-imt}$, which oscillates much faster than any other term. It is then convenient to isolate this fast-varying factor by redefining our positive-energy solution as
$$ \psi(X) = \tilde\psi(X)\,e^{-imt} , \tag{4.536} $$
where now $\tilde\psi(X)$ oscillates much more slowly, $\sim e^{-iE't}$ with $E' = E - m \ll m$. Substituting in Eq. (4.534), we find an equation for $\tilde\psi(X)$:
$$ \gamma_0(i\partial^0 - eA^0 + m)\tilde\psi(X) - m\tilde\psi(X) + \gamma_i(i\partial^i - eA^i)\tilde\psi(X) = 0 . \tag{4.537} $$
If we express $\tilde\psi(X)$ in terms of two two-component spinors
$$ \tilde\psi(X) = \begin{pmatrix} \tilde\phi \\ \tilde\chi \end{pmatrix} , \tag{4.538} $$
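we find
$$ \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}(i\partial^0 - eA^0 + m)\begin{pmatrix}\tilde\phi\\ \tilde\chi\end{pmatrix} - m\begin{pmatrix}\tilde\phi\\ \tilde\chi\end{pmatrix} + \begin{pmatrix} 0 & -\sigma^i \\ \sigma^i & 0 \end{pmatrix}(i\partial^i - eA^i)\begin{pmatrix}\tilde\phi\\ \tilde\chi\end{pmatrix} = 0 , \tag{4.539} $$
or, the following system:
$$ \begin{cases} (i\partial^0 - eA^0)\,\tilde\phi = \boldsymbol\sigma\cdot(\mathbf p - e\mathbf A)\,\tilde\chi \\ (i\partial^0 - eA^0 + 2m)\,\tilde\chi = \boldsymbol\sigma\cdot(\mathbf p - e\mathbf A)\,\tilde\phi \end{cases} \tag{4.540} $$
In the second equation we can neglect the terms $i\partial^0\tilde\chi$ and $-eA^0\tilde\chi$ with respect to $2m\tilde\chi$ and we can therefore solve for $\tilde\chi$ as follows:
$$ \tilde\chi = \frac{\boldsymbol\sigma\cdot(\mathbf p - e\mathbf A)}{2m}\,\tilde\phi . \tag{4.541} $$
Eq. (4.541) tells us that $\tilde\chi$ is “small” with respect to $\tilde\phi$ (of order $p/m$). So, in this limit, the spinor is basically described by the two-component spinor $\tilde\phi$. Substituting (4.541) in the first equation of (4.540), we find an equation for $\tilde\phi$:
$$ (i\partial^0 - eA^0)\,\tilde\phi = \frac{[\boldsymbol\sigma\cdot(\mathbf p - e\mathbf A)]^2}{2m}\,\tilde\phi . \tag{4.542} $$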
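We have
\begin{align}
[\boldsymbol\sigma\cdot(\mathbf p - e\mathbf A)]^2 &= \sigma^i\sigma^j(p^i - eA^i)(p^j - eA^j) , \tag{4.543}\\
&\quad \Big|\ \text{since } \sigma^i\sigma^j = \tfrac{1}{2}[\sigma^i,\sigma^j] + \tfrac{1}{2}[\sigma^i,\sigma^j]_+\ \Big| \notag\\
&= (\delta_{ij} + i\epsilon_{ijk}\sigma^k)(p^i - eA^i)(p^j - eA^j) , \tag{4.544}\\
&= (\mathbf p - e\mathbf A)^2 + i\epsilon^{ijk}\sigma^k(p^ip^j - e\,p^iA^j - e\,A^ip^j + e^2A^iA^j) . \tag{4.545}
\end{align}
The two terms $p^ip^j$ and $A^iA^j$ are totally symmetric in $ij$; therefore they vanish when contracted with the epsilon tensor. We then have to remember that $p^i$ and $A^j$ do not commute, since $p^i = i\partial^i$ and $A^j = A^j(X)$. Therefore we have
$$ p^iA^j = (i\partial^iA^j) + A^jp^i , \tag{4.546} $$
where in the first term the derivative acts only on $A^j$, and
$$ i\epsilon^{ijk}\sigma^k(-e\,p^iA^j - e\,A^ip^j) = i\epsilon^{ijk}\sigma^k\left[-ie(\partial^iA^j) - e(A^ip^j + A^jp^i)\right] = -e\,\sigma^k(\nabla\wedge\mathbf A)^k = -e\,\boldsymbol\sigma\cdot\mathbf B , \tag{4.547} $$
since the combination $A^ip^j + A^jp^i$ is symmetric in $ij$, $\partial^i = -\nabla^i$, and $\nabla\wedge\mathbf A = \mathbf B$ is the magnetic field.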
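The algebraic identity behind Eq. (4.544), $(\boldsymbol\sigma\cdot\mathbf a)(\boldsymbol\sigma\cdot\mathbf b) = \mathbf a\cdot\mathbf b + i\boldsymbol\sigma\cdot(\mathbf a\wedge\mathbf b)$, can be checked numerically for ordinary commuting vectors. This is a minimal sketch: the vectors `a` and `b` are random c-number vectors chosen for illustration, so the check covers only the Pauli-matrix algebra and not the operator ordering that produces the $-e\boldsymbol\sigma\cdot\mathbf B$ term.

```python
import numpy as np

# Pauli matrices stacked along the first axis: sigma[k] is sigma^{k+1}.
sigma = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)

def sigma_dot(v):
    """Return the 2x2 matrix sigma . v for a real three-vector v."""
    return np.einsum("i,ijk->jk", np.asarray(v, dtype=complex), sigma)

rng = np.random.default_rng(0)
a = rng.normal(size=3)
b = rng.normal(size=3)

lhs = sigma_dot(a) @ sigma_dot(b)
rhs = np.dot(a, b) * np.eye(2) + 1j * sigma_dot(np.cross(a, b))
identity_holds = np.allclose(lhs, rhs)
```

For $\mathbf a = \mathbf b$ the cross product vanishes and the identity reduces to $(\boldsymbol\sigma\cdot\mathbf a)^2 = \mathbf a^2$; the spin term in (4.547) arises only because $\mathbf p$ and $\mathbf A$ do not commute.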


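Finally
$$ [\boldsymbol\sigma\cdot(\mathbf p - e\mathbf A)]^2 = (\mathbf p - e\mathbf A)^2 - e\,\boldsymbol\sigma\cdot\mathbf B \tag{4.548} $$
and therefore
$$ i\frac{\partial}{\partial t}\tilde\phi = \left[eA^0 + \frac{(\mathbf p - e\mathbf A)^2}{2m} - \frac{e}{m}\,\frac{\boldsymbol\sigma}{2}\cdot\mathbf B\right]\tilde\phi = H\tilde\phi . \tag{4.549} $$
Eq. (4.549) is the Schrödinger equation of a spin-1/2 particle in an electromagnetic field (the Pauli equation). In particular, writing the magnetic interaction as $-\boldsymbol\mu\cdot\mathbf B$ with $\mathbf s = \boldsymbol\sigma/2$, the Dirac equation yields the correct magnetic dipole moment of the electron
$$ \boldsymbol\mu = \frac{e}{m}\,\mathbf s = g\,\frac{e}{2m}\,\mathbf s = -g\,\frac{|e|}{2m}\,\mathbf s , \tag{4.550} $$
with $g = 2$. Historically, the factor $g = 2$ was introduced phenomenologically, ad hoc, to describe the anomalous Zeeman effect. Now it is a prediction of the Dirac equation.
Let us consider the system in a weak static magnetic field in the $z$ direction, $\mathbf B = B\hat{\mathbf k}$. We have
$$ \mathbf A = \frac{1}{2}\,\mathbf B\wedge\mathbf r = \frac{B}{2}\begin{pmatrix} -y \\ x \\ 0 \end{pmatrix} . \tag{4.551} $$
Since $\mathbf B$ is a weak field, we neglect the term $\mathbf A^2$ in Eq. (4.549) and finally find
$$ H = eA^0 + \frac{1}{2m}\left[\mathbf p^2 - e(\mathbf p\cdot\mathbf A + \mathbf A\cdot\mathbf p)\right] - \frac{e}{m}\,\frac{\boldsymbol\sigma}{2}\cdot\mathbf B . \tag{4.552} $$
In the case at hand we have $[p^i, A^i] = 0$ and therefore
$$ (\mathbf p\cdot\mathbf A + \mathbf A\cdot\mathbf p) = 2\,\mathbf A\cdot\mathbf p = B(xp_y - yp_x) = \mathbf L\cdot\mathbf B , \tag{4.553} $$
where $\mathbf L$ is the orbital angular momentum. Finally
$$ H = \frac{\mathbf p^2}{2m} + eA^0 - \frac{e}{2m}\,[\mathbf L + 2\mathbf s]\cdot\mathbf B , \tag{4.554} $$
which gives a good description of the Zeeman effect.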

The fine structure of the hydrogen atom


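Let us now consider the case of a central potential (hydrogen atom) such that
$$ \mathbf A = 0 , \qquad eA^0 = V(r) = -\frac{\alpha}{r} , \tag{4.555} $$
where $\alpha \simeq \frac{1}{137}$ is the fine structure constant. For a stationary state of energy $E$ (measured from the rest mass), Eqs. (4.540) become
$$ \begin{cases} (E - V(r))\,\tilde\phi = \boldsymbol\sigma\cdot\mathbf p\,\tilde\chi \\ (E - V(r) + 2m)\,\tilde\chi = \boldsymbol\sigma\cdot\mathbf p\,\tilde\phi \end{cases} \tag{4.556} $$
Moreover, let us expand in the non-relativistic regime, consistently keeping terms of relative order $p^2/m^2$ that correct the energy $p^2/(2m)$ and $V$. We therefore keep terms up to $p^4/m^3$ and $p^2V/m^2$. This will give rise to the “relativistic corrections” to the non-relativistic treatment of the hydrogen atom. The equation for $\tilde\chi$ now becomes
$$ \tilde\chi = \frac{\boldsymbol\sigma\cdot\mathbf p}{E - V(r) + 2m}\,\tilde\phi \simeq \frac{1}{2m}\left(1 - \frac{E - V(r)}{2m}\right)\boldsymbol\sigma\cdot\mathbf p\,\tilde\phi . \tag{4.557} $$
There is another correction to take into account (see Maggiore), according to which the wave function is corrected by a factor
$$ \tilde\phi = \left(1 - \frac{\mathbf p^2}{8m^2}\right)\psi . \tag{4.558} $$
Finally we have
\begin{align}
\tilde\chi = \frac{\boldsymbol\sigma\cdot\mathbf p}{E - V(r) + 2m}\,\tilde\phi &\simeq \frac{1}{2m}\left(1 - \frac{E - V(r)}{2m}\right)\boldsymbol\sigma\cdot\mathbf p\left(1 - \frac{\mathbf p^2}{8m^2}\right)\psi , \tag{4.559}\\
&\simeq \frac{1}{2m}\left[\boldsymbol\sigma\cdot\mathbf p\left(1 - \frac{\mathbf p^2}{8m^2}\right) - \frac{E - V(r)}{2m}\,\boldsymbol\sigma\cdot\mathbf p\right]\psi . \tag{4.560}
\end{align}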

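Substituting Eq. (4.558) and Eq. (4.560) in the first equation of (4.556), we find
\begin{align}
(E - V(r))\left(1 - \frac{\mathbf p^2}{8m^2}\right)\psi &= \boldsymbol\sigma\cdot\mathbf p\,\frac{1}{2m}\left[\boldsymbol\sigma\cdot\mathbf p\left(1 - \frac{\mathbf p^2}{8m^2}\right) - \frac{E - V(r)}{2m}\,\boldsymbol\sigma\cdot\mathbf p\right]\psi , \tag{4.561}\\
&= \left[\frac{\mathbf p^2}{2m}\left(1 - \frac{\mathbf p^2}{8m^2}\right) + \frac{\boldsymbol\sigma\cdot\mathbf p\,V(r)\,\boldsymbol\sigma\cdot\mathbf p}{4m^2} - \frac{E\,\mathbf p^2}{4m^2}\right]\psi . \tag{4.562}
\end{align}
On the left-hand side we have
$$ (E - V(r))\left(1 - \frac{\mathbf p^2}{8m^2}\right)\psi = \left[(E - V(r)) - \frac{E\,\mathbf p^2}{8m^2} + \frac{V\mathbf p^2}{8m^2}\right]\psi \tag{4.563} $$
and, neglecting terms of order $E\,p^4/m^4$, we can write
$$ \frac{E\,\mathbf p^2}{8m^2}\,\psi \simeq \frac{\mathbf p^2}{8m^2}\left(\frac{\mathbf p^2}{2m} + V(r)\right)\psi . \tag{4.564} $$
Finally, we get
$$ i\frac{\partial}{\partial t}\psi = H\psi , \tag{4.565} $$
where
$$ H = \left(\frac{\mathbf p^2}{2m} + V(r)\right) - \frac{\mathbf p^4}{8m^3} + \frac{1}{4m^2}\left[\boldsymbol\sigma\cdot\mathbf p\,V(r)\,\boldsymbol\sigma\cdot\mathbf p - \frac{1}{2}\left(\mathbf p^2V(r) + V(r)\mathbf p^2\right)\right] . \tag{4.566} $$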

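Let us analyse the two terms in the square brackets. We have
\begin{align}
\boldsymbol\sigma\cdot\mathbf p\,V(r)\,\boldsymbol\sigma\cdot\mathbf p &= \sigma^i\sigma^j\,p^iV(r)p^j = \sigma^i\sigma^j\left[(i\partial^iV(r))\,p^j + V(r)\,p^ip^j\right] , \tag{4.567}\\
&= \sigma^i\sigma^j\left[ieE^ip^j + V(r)\,p^ip^j\right] , \tag{4.568}\\
&\quad \Big|\ \text{since } \sigma^i\sigma^j = \delta^{ij} + i\epsilon^{ijk}\sigma^k\ \Big| \notag\\
&= ie\,\mathbf E\cdot\mathbf p + V(r)\,\mathbf p^2 - e\,\boldsymbol\sigma\cdot(\mathbf E\wedge\mathbf p) , \tag{4.569}
\end{align}
where we introduced the electric field through $\partial^iV(r) = eE^i$ and we used the fact that $i\epsilon^{ijk}\sigma^kV(r)\,p^ip^j = 0$ because of the antisymmetry of the epsilon tensor. Moreover, we have
\begin{align}
\mathbf p^2V(r) + V(r)\mathbf p^2 &= p^ip^iV(r) + V(r)\mathbf p^2 = p^i\left[ieE^i + V(r)\,p^i\right] + V(r)\mathbf p^2 , \tag{4.570}\\
&= ie\,\mathbf p\cdot\mathbf E + ie\,\mathbf E\cdot\mathbf p + 2V(r)\mathbf p^2 , \tag{4.571}\\
&= e\,\nabla\cdot\mathbf E + 2ie\,\mathbf E\cdot\mathbf p + 2V(r)\mathbf p^2 . \tag{4.572}
\end{align}
Finally
$$ \frac{1}{4m^2}\left[\boldsymbol\sigma\cdot\mathbf p\,V(r)\,\boldsymbol\sigma\cdot\mathbf p - \frac{1}{2}\left(\mathbf p^2V(r) + V(r)\mathbf p^2\right)\right] = \frac{1}{4m^2}\left[-\frac{e}{2}\,\nabla\cdot\mathbf E - e\,\boldsymbol\sigma\cdot(\mathbf E\wedge\mathbf p)\right] . \tag{4.573} $$
Since
$$ e\mathbf E = -\nabla V(r) = -\mathbf r\left(\frac{1}{r}\frac{dV(r)}{dr}\right) , \tag{4.574} $$
we have
$$ -\frac{e}{4m^2}\,\boldsymbol\sigma\cdot(\mathbf E\wedge\mathbf p) = -\frac{1}{2m^2}\left(\frac{1}{r}\frac{dV(r)}{dr}\right)\frac{\boldsymbol\sigma}{2}\cdot(-\mathbf r\wedge\mathbf p) = \frac{1}{2m^2}\left(\frac{1}{r}\frac{dV(r)}{dr}\right)\mathbf s\cdot\mathbf L , \tag{4.575} $$
where $\mathbf L = \mathbf r\wedge\mathbf p$ is the orbital angular momentum.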


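The resulting hamiltonian is
\begin{align}
H &= \left(\frac{\mathbf p^2}{2m} + V(r)\right) - \frac{\mathbf p^4}{8m^3} + \frac{1}{2m^2}\left(\frac{1}{r}\frac{dV(r)}{dr}\right)\mathbf s\cdot\mathbf L - \frac{e}{8m^2}\,(\nabla\cdot\mathbf E) , \tag{4.576}\\
&= H_0 + H_{\rm pert} , \tag{4.577}
\end{align}
where
$$ H_0 = \left(\frac{\mathbf p^2}{2m} + V(r)\right) \tag{4.578} $$
is the central-potential hamiltonian of a spinless particle, with energy levels
$$ E_n = -\frac{m\alpha^2}{2n^2} , \tag{4.579} $$
and eigenfunctions
$$ \psi_{nlm} = R_{nl}(r)\,Y_{lm}(\theta,\phi) . \tag{4.580} $$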
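As a quick numerical illustration of Eq. (4.579), the unperturbed levels can be evaluated in electron-volts. This is a minimal sketch: the numerical values of the electron mass and of $\alpha$ are assumptions inserted here (rounded reference values, in natural units), and the function name `bohr_energy` is hypothetical.

```python
# Assumed rounded inputs (natural units, energies in eV).
ELECTRON_MASS_EV = 510998.95      # electron rest energy m
ALPHA = 1.0 / 137.035999          # fine structure constant

def bohr_energy(n: int) -> float:
    """Unperturbed hydrogen level E_n = -m alpha^2 / (2 n^2), in eV."""
    if n < 1:
        raise ValueError("the principal quantum number n must be >= 1")
    return -ELECTRON_MASS_EV * ALPHA**2 / (2 * n**2)

ground_state = bohr_energy(1)     # close to -13.6 eV
first_excited = bohr_energy(2)    # one quarter of the ground-state value
```

The levels depend only on $n$; the perturbation discussed next is what partially lifts this degeneracy.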
The term
$$ H_{\rm pert} = -\frac{\mathbf p^4}{8m^3} + \frac{1}{2m^2}\left(\frac{1}{r}\frac{dV(r)}{dr}\right)\mathbf s\cdot\mathbf L - \frac{e}{8m^2}\,(\nabla\cdot\mathbf E) \tag{4.581} $$
can be treated in perturbation theory. It consists of the so-called “relativistic correction”
$$ H_r = -\frac{\mathbf p^4}{8m^3} , \tag{4.582} $$
the spin-orbit interaction
$$ H_{SO} = \frac{1}{2m^2}\left(\frac{1}{r}\frac{dV(r)}{dr}\right)\mathbf s\cdot\mathbf L , \tag{4.583} $$
and the Darwin term
$$ H_D = -\frac{e}{8m^2}\,(\nabla\cdot\mathbf E) . \tag{4.584} $$
The hamiltonian (4.581) does not completely remove the degeneracy of the energy levels of the hydrogen atom$^{11}$. In particular the two levels $2S_{1/2}$ and $2P_{1/2}$ are still degenerate, while in Nature we observe a small difference, of about 1000 MHz (Lamb shift). This difference can be accounted for by treating the system correctly in quantum field theory, calculating higher-order QED quantum corrections.

4.3.15 Parity
So far we considered proper Lorentz transformations. In this section we will see how discrete transformations, such as Parity or Time Reversal, act on the field.
Parity is a Lorentz transformation. Moreover, it can be represented by a unitary operator. On the space-time point, Parity acts as follows:
$$ \begin{cases} \mathbf x \to -\mathbf x \\ t \to t \end{cases} . \tag{4.585} $$
Therefore, in matrix notation we have
$$ \Lambda_P = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} , \tag{4.586} $$
which coincides numerically with the metric tensor.


$^{11}$ This is also the case for the complete Dirac equation: the residual degeneracy is not an artifact of having carried out the calculation perturbatively.

If we want the Dirac equation to be invariant under Parity transformations, we have to require that Eq. (4.287) holds for $S(\Lambda_P)$ as well:
$$ S^{-1}(\Lambda_P)\,\gamma^\mu\,S(\Lambda_P) = (\Lambda_P)^\mu{}_\nu\,\gamma^\nu , \tag{4.587} $$
or, in components, multiplying on the left by $S(\Lambda_P)$ and bringing both terms of the equation to the l.h.s.
$$ [\gamma^0, S(\Lambda_P)] = 0 \quad \text{and} \quad [\gamma^i, S(\Lambda_P)]_+ = 0 . \tag{4.588} $$
Eqs. (4.588) are satisfied by the following choice:
$$ S(\Lambda_P) = \eta_P\gamma^0 , \tag{4.589} $$
where $\eta_P$ is a constant to be determined. Note that we must have
$$ [S(\Lambda_P)]^2 = 1 , \tag{4.590} $$
since applying Parity twice we would like to find the identity operator. Therefore Eq. (4.590) implies
$$ |\eta_P| = 1 , \tag{4.591} $$
i.e. $\eta_P$ is a phase, which for the moment we set equal to 1:
$$ S(\Lambda_P) = \gamma^0 . \tag{4.592} $$
We note that this choice respects the fact that $S(\Lambda_P)$ must be unitary. In fact, $\gamma^0$ is hermitian and squares to one:
$$ [S(\Lambda_P)]^\dagger = \gamma^{0\,\dagger} = \gamma^0 = S(\Lambda_P) . \tag{4.593} $$
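The conditions (4.587), (4.590) and (4.593) can be verified numerically for the choice $S(\Lambda_P) = \gamma^0$. This is a minimal sketch, assuming the Dirac representation of the gamma matrices; the helper `dirac_gammas` and the variable names are hypothetical.

```python
import numpy as np

def dirac_gammas():
    """Gamma matrices gamma^mu in the Dirac representation (assumed convention)."""
    s1 = np.array([[0, 1], [1, 0]], dtype=complex)
    s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
    s3 = np.array([[1, 0], [0, -1]], dtype=complex)
    zero, one = np.zeros((2, 2), dtype=complex), np.eye(2, dtype=complex)
    g0 = np.block([[one, zero], [zero, -one]])
    return [g0] + [np.block([[zero, s], [-s, zero]]) for s in (s1, s2, s3)]

g = dirac_gammas()
lambda_P = np.diag([1.0, -1.0, -1.0, -1.0])   # Eq. (4.586)
S = g[0]                                      # Eq. (4.592), eta_P = 1
S_inv = np.linalg.inv(S)

def rotated_gamma(mu):
    """Left-hand side of Eq. (4.587): S^{-1} gamma^mu S."""
    return S_inv @ g[mu] @ S

def expected_gamma(mu):
    """Right-hand side of Eq. (4.587): (Lambda_P)^mu_nu gamma^nu."""
    return sum(lambda_P[mu, nu] * g[nu] for nu in range(4))

covariance_ok = all(np.allclose(rotated_gamma(mu), expected_gamma(mu)) for mu in range(4))
unitary_ok = np.allclose(S.conj().T @ S, np.eye(4))
squares_to_one = np.allclose(S @ S, np.eye(4))
```

Replacing `S` with any other single gamma matrix makes `covariance_ok` fail, which illustrates why the commutation pattern (4.588) singles out $\gamma^0$ up to a phase.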

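The interacting Dirac equation is indeed covariant under the Parity transformation. In fact we have
$$ \psi'(X') = S(\Lambda_P)\,\psi(X) , \tag{4.594} $$
where $X'^\mu = (t, -\mathbf x)$, and
$$ 0 = (i\slashed\partial - e\slashed A - m)\,\psi(X) = (i\slashed\partial - e\slashed A - m)\,S^{-1}(\Lambda_P)\,\psi'(X') . \tag{4.595} $$
Multiplying by $S(\Lambda_P)$ on the left we have
\begin{align}
S(\Lambda_P)\,i\slashed\partial\,S^{-1}(\Lambda_P) &= i\slashed\partial' , \tag{4.596}\\
S(\Lambda_P)\,\slashed A\,S^{-1}(\Lambda_P) &= \slashed A' , \tag{4.597}
\end{align}
since $A^0$ does not change under parity but $\mathbf A$ changes sign, and $\partial'^0 = \partial^0$, $\partial'^i = -\partial^i$.
Finally we have
$$ (i\slashed\partial' - e\slashed A' - m)\,\psi'(X') = 0 . \tag{4.598} $$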

4.3.16 Time Reversal


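Time Reversal invariance means that if we have a sequence of observations made on a state described by a certain wave function and we invert the temporal order of the sequence, we still find a physically realizable sequence of observations.
The action of Time Reversal on the space-time point is
$$ \begin{cases} \mathbf x \to \mathbf x \\ t \to -t \end{cases} , \tag{4.599} $$
such that in matrix notation we have
$$ \Lambda_T = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} , \tag{4.600} $$
which coincides numerically with minus the metric, $-\eta^{\mu\nu}$.
We know that Time Reversal has to be represented by an anti-unitary, anti-linear operator, i.e. by an operator $\tilde U$ such that
$$ \begin{cases} \tilde U^\dagger\tilde U = 1 \\ \tilde U(\alpha|\phi\rangle + \beta|\psi\rangle) = \alpha^*\tilde U|\phi\rangle + \beta^*\tilde U|\psi\rangle \end{cases} . \tag{4.601} $$
Therefore the adjoint of $\tilde U$ is defined through
$$ \langle\phi|\tilde U^\dagger\psi\rangle = \langle\tilde U\phi|\psi\rangle^* = \langle\psi|\tilde U\phi\rangle . \tag{4.602} $$
Such an operator can be constructed as the product of a unitary operator $U$ and the operation of complex conjugation $K$:
$$ \tilde U = UK . \tag{4.603} $$
In fact, we have
$$ \tilde U(\alpha|\phi\rangle + \beta|\psi\rangle) = UK(\alpha|\phi\rangle + \beta|\psi\rangle) = U(\alpha^*K|\phi\rangle + \beta^*K|\psi\rangle) = \alpha^*UK|\phi\rangle + \beta^*UK|\psi\rangle = \alpha^*\tilde U|\phi\rangle + \beta^*\tilde U|\psi\rangle . \tag{4.604} $$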
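If
$$ |\phi'\rangle = \tilde U|\phi\rangle , \quad \text{and} \quad |\psi'\rangle = \tilde U|\psi\rangle , \tag{4.605} $$
then, inserting a complete set of basis states $|\beta\rangle$ on which $K$ acts trivially, we have
\begin{align}
\langle\phi'|\psi'\rangle &= \langle\phi'|\,UK\sum_\beta|\beta\rangle\langle\beta|\psi\rangle , \tag{4.606}\\
&= \langle\phi'|\sum_\beta\langle\psi|\beta\rangle\,UK|\beta\rangle , \tag{4.607}\\
&= \sum_{\beta\beta'}\langle\beta'|\phi\rangle\,\langle\beta'|U^\dagger U|\beta\rangle\,\langle\psi|\beta\rangle , \tag{4.608}\\
&= \sum_{\beta\beta'}\langle\psi|\beta\rangle\,\langle\beta'|\beta\rangle\,\langle\beta'|\phi\rangle , \tag{4.609}\\
&= \langle\psi|\phi\rangle = \langle\phi|\psi\rangle^* , \tag{4.610}
\end{align}
as it should be.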
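The two defining properties of an anti-unitary operator, anti-linearity (4.604) and $\langle\phi'|\psi'\rangle = \langle\phi|\psi\rangle^*$ (4.610), can be illustrated in a finite-dimensional space. This is a minimal sketch under stated assumptions: the states are vectors in $\mathbb C^4$, the unitary matrix `U` is a random one obtained from a QR decomposition, and `antiunitary` is a hypothetical helper implementing $\tilde U = UK$ in the basis in which $K$ is plain complex conjugation of the components.

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 4

# Random unitary matrix: the Q factor of a complex Gaussian matrix.
M = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
U, _ = np.linalg.qr(M)

def antiunitary(state):
    """Apply U K: complex-conjugate the components, then act with U."""
    return U @ np.conj(state)

phi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
psi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
alpha, beta = 0.3 + 1.2j, -0.7 + 0.4j

# np.vdot(a, b) conjugates its first argument, i.e. it computes <a|b>.
overlap_before = np.vdot(phi, psi)
overlap_after = np.vdot(antiunitary(phi), antiunitary(psi))

antilinear_lhs = antiunitary(alpha * phi + beta * psi)
antilinear_rhs = np.conj(alpha) * antiunitary(phi) + np.conj(beta) * antiunitary(psi)
```

Transition probabilities $|\langle\phi|\psi\rangle|^2$ are unchanged by the complex conjugation of the overlap, which is why anti-unitary operators are admissible symmetry operations.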
The representation of Time Reversal on the Dirac field can be found by imposing the invariance of the Dirac equation. Let us consider the Dirac equation in its original form (replacing, however, the $\alpha^i$ and $\beta$ matrices with the gamma matrices)
$$ i\frac{\partial}{\partial t}\psi = H\psi , \tag{4.611} $$
where, including electromagnetic interactions, we have
$$ H = eA^0 + \gamma^0\gamma^i(-i\partial_i - eA_i) + \gamma^0m . \tag{4.612} $$
We define the Time Reversal operator $\mathcal K = TK$, where $T^\dagger T = 1$ and $K$ is complex conjugation, such that
$$ \psi'(X') = \mathcal K\psi(X) , \qquad \psi(X) = \mathcal K^{-1}\psi'(X') , \tag{4.613} $$
where $X' = (-t, \mathbf x)$. Then
$$ i\frac{\partial}{\partial t}\,\mathcal K^{-1}\psi'(X') = -i\frac{\partial}{\partial t'}\,\mathcal K^{-1}\psi'(X') = H\,\mathcal K^{-1}\psi'(X') . \tag{4.614} $$
Multiplying on the left by $\mathcal K$ and remembering the anti-linear nature of $\mathcal K$ we get
$$ \mathcal K(-i)\mathcal K^{-1}\frac{\partial}{\partial t'}\psi'(X') = i\frac{\partial}{\partial t'}\psi'(X') = \mathcal K H\mathcal K^{-1}\psi'(X') . \tag{4.615} $$
We should now have
$$ \mathcal K H\mathcal K^{-1} = T\,H^*(t)\,T^{-1} = H(t') . \tag{4.616} $$
We have
$$ H^*(t) = eA^0(t) + (\gamma^0\gamma^i)^*(i\partial_i - eA_i(t)) + \gamma^0m . \tag{4.617} $$
Therefore, since
$$ A^0(t) = A^0(t') , \qquad \mathbf A(t) = -\mathbf A(t') , \tag{4.618} $$
because $A^0$ is generated by a static charge distribution, while $\mathbf A$ is generated by a current (so that, when we invert the sign of time, the current flows in the opposite direction and the vector potential changes sign), we must have
$$ T\,H^*(t)\,T^{-1} = eA^0(t') + T(\gamma^0\gamma^i)^*T^{-1}\,(i\partial_i + eA_i(t')) + T\gamma^0T^{-1}\,m . \tag{4.619} $$
This gives the two conditions
\begin{align}
T\gamma^0T^{-1} &= \gamma^0 , \tag{4.620}\\
T(\gamma^0\gamma^i)^*T^{-1} &= -\gamma^0\gamma^i . \tag{4.621}
\end{align}
Using the first equation in the second (recall that $\gamma^0$ is real),
$$ T(\gamma^0\gamma^{i\,*})T^{-1} = T\gamma^0T^{-1}\,(T\gamma^{i\,*}T^{-1}) = \gamma^0\,(T\gamma^{i\,*}T^{-1}) = -\gamma^0\gamma^i , \tag{4.622} $$
we find that we have to impose
$$ T\gamma^{i\,*}T^{-1} = -\gamma^i . \tag{4.623} $$
Eq. (4.623) is satisfied by
$$ T = i\gamma^1\gamma^3 , \tag{4.624} $$
which is an operator such that
$$ T^\dagger T = 1 , \qquad T^2 = 1 , \tag{4.625} $$
as it should.
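The conditions (4.620), (4.623) and (4.625) can be checked numerically for $T = i\gamma^1\gamma^3$. This is a minimal sketch, assuming the Dirac representation (in which $\gamma^2$ is the only imaginary gamma matrix); the helper `dirac_gammas` and the dictionary `checks` are hypothetical names.

```python
import numpy as np

def dirac_gammas():
    """Gamma matrices gamma^mu in the Dirac representation (assumed convention)."""
    s1 = np.array([[0, 1], [1, 0]], dtype=complex)
    s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
    s3 = np.array([[1, 0], [0, -1]], dtype=complex)
    zero, one = np.zeros((2, 2), dtype=complex), np.eye(2, dtype=complex)
    g0 = np.block([[one, zero], [zero, -one]])
    return [g0] + [np.block([[zero, s], [-s, zero]]) for s in (s1, s2, s3)]

g = dirac_gammas()
T = 1j * g[1] @ g[3]          # Eq. (4.624)
T_inv = np.linalg.inv(T)
identity = np.eye(4)

checks = {
    "T is unitary": np.allclose(T.conj().T @ T, identity),
    "T squares to one": np.allclose(T @ T, identity),
    "T gamma^0 T^-1 = gamma^0": np.allclose(T @ g[0] @ T_inv, g[0]),
    "T (gamma^i)* T^-1 = -gamma^i": all(
        np.allclose(T @ g[i].conj() @ T_inv, -g[i]) for i in (1, 2, 3)
    ),
}
```

The result depends on the representation: in a basis where different gamma matrices are imaginary, the matrix $T$ satisfying (4.623) takes a different form.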
We could have used the relation (4.287) directly in order to find $\mathcal K = S(\Lambda_T)$. However, we have to remember that in deriving relation (4.287) we already assumed the operator $S(\Lambda)$ to be unitary and linear. In fact, we commuted the “$i$” that multiplies the gamma matrices through $S$ without any change of sign. For Time Reversal, we should use the proper relation
$$ S^{-1}(\Lambda_T)\,i\gamma^\nu\,S(\Lambda_T) = i\,(\Lambda_T)^\nu{}_\mu\,\gamma^\mu . \tag{4.626} $$
This means
$$ -i\,T^{-1}\gamma^{\nu\,*}T = i\,(\Lambda_T)^\nu{}_\mu\,\gamma^\mu \tag{4.627} $$
and therefore
\begin{align}
T^{-1}\gamma^0T &= \gamma^0 , \tag{4.628}\\
T^{-1}\gamma^{i\,*}T &= -\gamma^i . \tag{4.629}
\end{align}
The solution of Eqs. (4.628, 4.629) is again Eq. (4.624), since $\gamma^{0\,*} = \gamma^0$, $\gamma^{1\,*} = \gamma^1$, $\gamma^{2\,*} = -\gamma^2$, $\gamma^{3\,*} = \gamma^3$ and since $T^2 = 1$.
4.3.17 Charge Conjugation
To introduce charge conjugation it is convenient, first of all, to move from the Dirac equation for the free field to the one for the field interacting with an electromagnetic field $A^\mu$. Indeed, charge conjugation acts on the quantum number that couples these two fields, and the distinction between negatively and positively charged fermions (which will later be identified with particles and antiparticles) makes sense only in the presence of an interaction, i.e. only if the difference between the two can be detected.
The interacting equation is obtained simply through the minimal substitution:
$$ \partial^\mu \longrightarrow \partial^\mu + ie\,A^\mu . \tag{4.630} $$
Using Eq. (4.630) we can rewrite the free Dirac equation (??) as follows:
$$ (i\slashed\partial - e\slashed A - m)\,\psi(X) = 0 . \tag{4.631} $$
We then look for the discrete transformation $\mathcal C$ that takes the wave function $\psi(X)$ into its transform $\psi_C(X)$, which must represent a fermion with the same mass as $\psi(X)$ but with electric charge of opposite sign:
$$ \psi(X) \longrightarrow \psi_C(X) = \mathcal C\,\psi(X) . \tag{4.632} $$
The wave function $\psi_C(X)$ must satisfy a Dirac equation in which $-e$ has become $+e$:
$$ (i\slashed\partial + e\slashed A - m)\,\psi_C(X) = 0 . \tag{4.633} $$
What we require of $\mathcal C$ is that it be local and that, up to a phase, applying it to the transform $\psi_C(X)$ brings us back to the initial state.
To transform (4.631) into (4.633), the first step is to change the relative sign between $i\slashed\partial$ and $-e\slashed A$, and this can be done by taking the adjoint of (4.631), with the derivative now acting to the left:
$$ \psi^\dagger\left(i\slashed\partial^\dagger + e\slashed A^\dagger + m\right) = 0 . \tag{4.634} $$
Multiplying (4.634) on the right by $\gamma_0$ and recalling that $\gamma^{\mu\,\dagger}\gamma_0 = \gamma_0\gamma^\mu$, we obtain:
$$ \bar\psi\,(i\slashed\partial + e\slashed A + m) = 0 . \tag{4.635} $$
Transposing (4.635) we arrive at an equation that almost achieves our goal:
$$ \left(i\slashed\partial^{\,t} + e\slashed A^{\,t} + m\right)\bar\psi^{\,t} = 0 . \tag{4.636} $$
If we now found a matrix $C$ such that:
$$ C\,\gamma^{\mu\,t}\,C^{-1} = -\gamma^\mu , \tag{4.637} $$
then, multiplying (4.636) on the left by this $C$ and changing the overall sign, we would obtain:
$$ (i\slashed\partial + e\slashed A - m)\,C\bar\psi^{\,t} = (i\slashed\partial + e\slashed A - m)\,\psi_C(X) = 0 , \tag{4.638} $$
where:
$$ \psi_C(X) = \eta_C\,C\,\bar\psi^{\,t} , \tag{4.639} $$
with $\eta_C$ a phase factor.
Let us look for $C$ in our representation of the $\gamma^\mu$, in which $\gamma^0$ is diagonal. We have:
$$ \gamma^{0\,t} = \gamma^0 ; \quad \gamma^{1\,t} = -\gamma^1 ; \quad \gamma^{2\,t} = \gamma^2 ; \quad \gamma^{3\,t} = -\gamma^3 . \tag{4.640} $$
Since (4.637) must hold, $C$ must anticommute with $\gamma^0$ and $\gamma^2$ and commute with $\gamma^1$ and $\gamma^3$. This happens if:
$$ C = i\,\gamma^2\gamma^0 = \begin{pmatrix} 0 & -i\sigma^2 \\ -i\sigma^2 & 0 \end{pmatrix} . \tag{4.641} $$
Indeed, from (4.641) we have:
$$ C\,\gamma^\mu = \begin{cases} -\gamma^\mu C & \text{if } \mu = 0, 2 ; \\ \gamma^\mu C & \text{if } \mu = 1, 3 . \end{cases} \tag{4.642} $$
The matrix $C$ has the following properties, which are easily verified:
$$ C^\dagger = C^{\,t} = C^{-1} = -C . \tag{4.643} $$
We have thus found an operator $C$ such that, if $\psi(X)$ is a solution of the Dirac equation (4.631), its transform $\psi_C(X)$ is a solution of the “positron” equation (4.633), and vice versa.
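Relations (4.637) and (4.643) can be verified numerically for $C = i\gamma^2\gamma^0$. This is a minimal sketch, assuming the Dirac representation of the gamma matrices ($\gamma^0$ diagonal); the helper `dirac_gammas` and the dictionary `checks` are hypothetical names.

```python
import numpy as np

def dirac_gammas():
    """Gamma matrices gamma^mu in the Dirac representation (assumed convention)."""
    s1 = np.array([[0, 1], [1, 0]], dtype=complex)
    s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
    s3 = np.array([[1, 0], [0, -1]], dtype=complex)
    zero, one = np.zeros((2, 2), dtype=complex), np.eye(2, dtype=complex)
    g0 = np.block([[one, zero], [zero, -one]])
    return [g0] + [np.block([[zero, s], [-s, zero]]) for s in (s1, s2, s3)]

g = dirac_gammas()
C = 1j * g[2] @ g[0]          # Eq. (4.641)
C_inv = np.linalg.inv(C)

checks = {
    "C gamma^mu^t C^-1 = -gamma^mu": all(
        np.allclose(C @ g[mu].T @ C_inv, -g[mu]) for mu in range(4)
    ),
    "C^dagger = -C": np.allclose(C.conj().T, -C),
    "C^t = -C": np.allclose(C.T, -C),
    "C^-1 = -C": np.allclose(C_inv, -C),
}
```

Note that `.T` is the plain transpose (no complex conjugation), which is what enters Eq. (4.637).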
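Let us see how $\mathcal C$ actually acts on the wave function. If we have a generic solution of the Dirac equation, in a generic reference frame, with definite energy and spin:
$$ \psi'(X) = \left(\frac{\pm\slashed P + m}{2m}\right)\left(\frac{1 + \gamma_5\slashed n}{2}\right)\psi(X) , \tag{4.644} $$
we have:
$$ \psi'_C(X) = \eta_C\,C\,\bar\psi'^{\,t}(X) = \eta_C\,C\,\gamma_0\left(\frac{\pm\slashed P^* + m}{2m}\right)\left(\frac{1 + \gamma_5\slashed n^*}{2}\right)\psi^*(X) . \tag{4.645} $$
Since, moreover, the following relations hold:
$$ \gamma_0\gamma^{\mu\,*} = \gamma^{\mu\,t}\gamma_0 ; \qquad \gamma_0\gamma_5 = -\gamma_5\gamma_0 ; \qquad [C, \gamma_5] = 0 , \tag{4.646} $$
we obtain:
\begin{align}
\psi'_C(X) &= \eta_C\,C\left(\frac{\pm\slashed P^{\,t} + m}{2m}\right)\left(\frac{1 - \gamma_5\slashed n^{\,t}}{2}\right)\gamma_0\,\psi^*(X) \tag{4.647}\\
&= \left(\frac{\mp\slashed P + m}{2m}\right)\left(\frac{1 + \gamma_5\slashed n}{2}\right)\eta_C\,C\,\bar\psi^{\,t}(X) \tag{4.648}\\
&= \left(\frac{\mp\slashed P + m}{2m}\right)\left(\frac{1 + \gamma_5\slashed n}{2}\right)\psi_C(X) . \tag{4.649}
\end{align}
We see, therefore, that $\psi'_C(X)$ is described by the same $P^\mu$ and $n^\mu$ as $\psi'(X)$, but has the opposite sign of the energy. Given the way we constructed the spin projection operators, this means that the spin of the particle is reversed as well.
In particular, let us take a plane-wave solution in the rest frame, with negative energy and spin down:
$$ \psi(X) = u^{(4)}(m, \mathbf 0)\,e^{imt} = e^{imt}\begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix} . \tag{4.650} $$
Its charge conjugate will be:
$$ \psi_C(X) = i\eta_C\,\gamma^2\psi^* = \eta_C\,e^{-imt}\begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \\ 0 & -1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}\begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix} = \eta_C\,e^{-imt}\begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} , \tag{4.651} $$
i.e. a solution with positive energy and spin up.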
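The rest-frame example (4.650)-(4.651) can be reproduced numerically. This is a minimal sketch, assuming the Dirac representation, $\eta_C = 1$, and dropping the time-dependent phase (which simply turns from $e^{imt}$ into $e^{-imt}$ under complex conjugation); the helper `dirac_gammas` and the variable names are hypothetical.

```python
import numpy as np

def dirac_gammas():
    """Gamma matrices gamma^mu in the Dirac representation (assumed convention)."""
    s1 = np.array([[0, 1], [1, 0]], dtype=complex)
    s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
    s3 = np.array([[1, 0], [0, -1]], dtype=complex)
    zero, one = np.zeros((2, 2), dtype=complex), np.eye(2, dtype=complex)
    g0 = np.block([[one, zero], [zero, -one]])
    return [g0] + [np.block([[zero, s], [-s, zero]]) for s in (s1, s2, s3)]

g = dirac_gammas()
C = 1j * g[2] @ g[0]

# Spinor part of the negative-energy, spin-down rest-frame solution (4.650).
psi_rest = np.array([0, 0, 0, 1], dtype=complex)

# psi_bar^t = (psi^dagger gamma^0)^t = (gamma^0)^t psi^*
psi_bar_t = g[0].T @ psi_rest.conj()
psi_C = C @ psi_bar_t                      # eta_C = 1

# Equivalent form used in Eq. (4.651): psi_C = i gamma^2 psi^*
psi_C_alt = 1j * g[2] @ psi_rest.conj()
```

Both expressions give the spinor with only the first component non-zero, i.e. the positive-energy, spin-up rest-frame spinor.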

The matrix $C$, constructed as in Eq. (4.641), brings out a symmetry of the Dirac equation. Indeed, combining the two operations:
$$ \begin{cases} \psi(X) \overset{C}{\longrightarrow} \psi_C(X) \\ A^\mu(X) \longrightarrow -A^\mu(X) \end{cases} \tag{4.652} $$
we obtain a formal invariance of (4.631).
This symmetry tells us that to every physically realizable state of an electron in an electromagnetic field $A^\mu$ there corresponds a realizable state of a positron in a field $-A^\mu$.
Charge conjugation consists of the set of transformations (4.652).

4.3.18 CPT transformation
4.3.19 Massless fermionic field: the neutrino
Besides fermions of mass $m \neq 0$, the Standard Model of fundamental interactions also includes fermions that experimentally appear to have zero mass: the neutrinos. For them, the Dirac equation reduces to:
$$ i\slashed\partial\,\psi_\nu(X) = 0 , \tag{4.653} $$
where the subscript $\nu$ stands for neutrino.
The absence of the mass term in (??) makes it possible to decouple the two two-component spinors $\phi_R(X)$ and $\phi_L(X)$, out of which we had built the Dirac spinor$^{12}$ $\psi_\nu(X)$.
Moreover, as already mentioned in the first chapter, the vanishing mass of the fermionic field under consideration implies that its possible polarizations are given by the eigenvalues of the helicity: $\pm\frac{1}{2}$ along the direction of motion. It is therefore convenient not to use the representation (??) for the $\gamma^\mu$, but to introduce the Weyl representation, or chiral representation, in which $\gamma_5$ (related, as we shall see, to the helicity) is diagonal:
$$ \gamma^0 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , \qquad \gamma^i = \begin{pmatrix} 0 & \sigma^i \\ -\sigma^i & 0 \end{pmatrix} , \qquad \gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3 = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} . \tag{4.654} $$
If we then set:
$$ \psi_\nu(X) = \begin{pmatrix} \phi_L(X) \\ \phi_R(X) \end{pmatrix} , \tag{4.655} $$
Eq. (4.653) splits into two decoupled differential equations for the right-handed component, $\phi_R(X)$, and the left-handed component, $\phi_L(X)$, of $\psi_\nu(X)$:
\begin{align}
i\partial^0\phi_L(X) - i\,\boldsymbol\sigma\cdot\nabla\,\phi_L(X) &= 0 , \tag{4.656}\\
i\partial^0\phi_R(X) + i\,\boldsymbol\sigma\cdot\nabla\,\phi_R(X) &= 0 , \tag{4.657}
\end{align}
or, setting $\mathbf p = -i\nabla$:
\begin{align}
i\partial^0\phi_L(X) + \boldsymbol\sigma\cdot\mathbf p\,\phi_L(X) &= 0 , \tag{4.658}\\
i\partial^0\phi_R(X) - \boldsymbol\sigma\cdot\mathbf p\,\phi_R(X) &= 0 . \tag{4.659}
\end{align}
The operator $\hat h = \frac{\boldsymbol\sigma\cdot\mathbf p}{2\|\mathbf p\|}$ is called the helicity of the neutrino and, as one can see, it represents the component of the spin along the direction of motion ($\frac{\mathbf p}{\|\mathbf p\|}$).

$^{12}$ See Chapter 1.
Since the neutrino is massless, it has a light-like energy-momentum four-vector, $P^2 = 0$, from which it follows that:
$$ E = \pm\|\mathbf p\| . \tag{4.660} $$
Let us consider Eq. (4.658) with $\phi_L(X)$ a plane wave of positive energy $E = \|\mathbf p\|$ (negative energy $E = -\|\mathbf p\|$):
$$ \phi_L(X) = \phi_L^0\,e^{\mp iP_\mu X^\mu} , \tag{4.661} $$
i.e. what we shall identify with a “particle” (“antiparticle”). Substituting (4.661) in (4.658) we obtain:
$$ \hat h\,\phi_L(X) = \mp\frac{1}{2}\,\phi_L(X) . \tag{4.662} $$
This means that Eq. (4.658) describes neutrinos with helicity $-\frac{1}{2}$ (left-handed neutrinos) and negative-energy neutrinos with helicity $\frac{1}{2}$. Consistently with the quantization of the Dirac field, which we shall address in the next chapter, the second case is equivalent to stating that Eq. (4.658) also describes antineutrinos (i.e. positive-energy antiparticles) with helicity $\frac{1}{2}$ (right-handed antineutrinos).
If we repeat the same reasoning for Eq. (4.659), we find that it describes antineutrinos with helicity $-\frac{1}{2}$ (left-handed antineutrinos) and neutrinos with helicity $\frac{1}{2}$ (right-handed neutrinos).
At this point we are dealing with two wave functions that are perfectly analogous from a theoretical point of view. Experimentally, however, one finds that only left-handed neutrinos and right-handed antineutrinos are present in nature. Moreover, since the neutrino takes part only in weak interactions and the two states just mentioned cannot be connected by a parity transformation, weak interactions violate parity.
Let us introduce the following two projectors:
$$ P_L = \frac{(1 - \gamma_5)}{2} , \qquad P_R = \frac{(1 + \gamma_5)}{2} . \tag{4.663} $$
As can easily be verified, $P_L$ and $P_R$ enjoy all the characteristic properties of a projector:
$$ P_{L,R}^2 = P_{L,R} , \qquad P_L + P_R = 1 , \qquad P_LP_R = 0 , \tag{4.664} $$
where we used the property $\gamma_5^2 = 1$.

$P_L$ and $P_R$ project onto $\phi_L(X)$ and $\phi_R(X)$, respectively:
\begin{align}
P_L\,\psi(X) &= \frac{(1 - \gamma_5)}{2}\,\psi(X) = \frac{1}{2}\begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} \phi_L \\ \phi_R \end{pmatrix} = \begin{pmatrix} \phi_L \\ 0 \end{pmatrix} , \tag{4.665}\\
P_R\,\psi(X) &= \frac{(1 + \gamma_5)}{2}\,\psi(X) = \frac{1}{2}\begin{pmatrix} 0 & 0 \\ 0 & 2 \end{pmatrix}\begin{pmatrix} \phi_L \\ \phi_R \end{pmatrix} = \begin{pmatrix} 0 \\ \phi_R \end{pmatrix} ,  \tag{4.666}
\end{align}
so that only the following expressions will appear in weak interactions:
$$ \frac{(1 - \gamma_5)}{2}\,\psi_\nu(X) , \quad \text{and} \quad \bar\psi_\nu(X)\,\frac{(1 + \gamma_5)}{2} . \tag{4.667} $$
The first destroys left-handed neutrinos and creates right-handed antineutrinos, while the second creates left-handed neutrinos and destroys right-handed antineutrinos.
The theory developed in this way is called the two-component theory of the neutrino. It was proposed by Weyl in 1929 and taken up again only in 1957, when experimental evidence confirmed that weak interactions violate parity.
At this point a few remarks are in order.

• From a group-theoretical point of view, the possibility of describing the neutrino with a two-component spinor derives from the fact that the spinorial representation of the Poincaré Group for zero mass is reducible into the two irreducible representations $\phi_R$ and $\phi_L$ of the Lorentz Group that we met in the first chapter. If, instead, a mass term $m\bar\psi\psi$ is present in the lagrangian $\mathcal L$, it mixes the two components $\phi_R$ and $\phi_L$ in a mixed term and $\mathcal L$ is no longer invariant separately under the two types of Lorentz transformations.

• The operator $\gamma_5$ is called chirality. For massless fields chirality and helicity coincide, but this is not true for massive fields. The equality is recovered at high energies, i.e. when the mass of the particle is negligible compared to its energy. This observation is useful because in that case one can resort to the approximate chiral symmetry even when dealing with massive particles, such as the electron or the quarks, and derive important relations among the elements of the S matrix (“sum rules” in QCD).

• The two-component theory of the neutrino is invariant under chiral transformations:
\begin{align}
\psi_\nu(X) &\to e^{i\gamma_5\Lambda}\,\psi_\nu(X) , \tag{4.668}\\
\bar\psi_\nu(X) &\to \bar\psi_\nu(X)\,e^{i\gamma_5\Lambda} . \tag{4.669}
\end{align}
The symmetry is violated by mass terms.

4.4 Quantization of the Dirac Field


The quantization of the Dirac field should follow some basic principles, as in the case of the Klein-Gordon field. Firstly, we have to find a procedure that can accommodate the description of the particle and anti-particle states, both with positive energy. Moreover, we are dealing with fermionic states. Therefore, we would like to have a theory that directly incorporates the Pauli exclusion principle, or better the fact that fermions should obey Fermi-Dirac statistics.
The expression of the Dirac field in normal modes is the following:
$$ \psi(X) = \sum_{\pm n}\int\frac{d^3p}{(2\pi)^{3/2}}\sqrt{\frac{m}{E}}\left[b(\mathbf p, n)\,u(P, n)\,e^{-iP_\mu X^\mu} + d^*(\mathbf p, n)\,v(P, n)\,e^{iP_\mu X^\mu}\right] . \tag{4.670} $$
If we want to quantize the field, we should promote the coefficients $b(\mathbf p, n)$, $d(\mathbf p, n)$, $b^*(\mathbf p, n)$, $d^*(\mathbf p, n)$ to operators (in particular $b^*(\mathbf p, n)$, $d^*(\mathbf p, n)$ will become $b^\dagger(\mathbf p, n)$, $d^\dagger(\mathbf p, n)$). However, we have to understand how they can act on a possible Fock space. In order to do that, let us see what the expression of the hamiltonian
$$ H = \int d^3X\,\mathcal H = i\int d^3X\,\psi^\dagger\frac{\partial}{\partial t}\psi , \tag{4.671} $$
is in terms of $b(\mathbf p, n)$, $d(\mathbf p, n)$ and $b^*(\mathbf p, n)$, $d^*(\mathbf p, n)$.
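We need to remember the orthogonality and completeness relations for the spinors. We have
\begin{align}
\bar u(P, n)\,u(P, n') &= -\bar v(P, n)\,v(P, n') = \delta_{nn'} , \tag{4.672}\\
u^\dagger(P, n)\,u(P, n') &= v^\dagger(P, n)\,v(P, n') = \frac{E}{m}\,\delta_{nn'} , \tag{4.673}\\
\bar v(P, n)\,u(P, n') &= v^\dagger(P, n)\,u(\tilde P, n') = u^\dagger(P, n)\,v(\tilde P, n') = 0 , \tag{4.674}
\end{align}
where $P^\mu = (E, \mathbf p)$ and $\tilde P^\mu = (E, -\mathbf p)$. We have already demonstrated the first relations; we still have to demonstrate the last two. We have
\begin{align}
u^\dagger(P, n)\,v(\tilde P, n') &= \frac{u^{\dagger(\alpha)}(m, \mathbf 0)\,(\slashed P + m)^\dagger\,(-\slashed{\tilde P} + m)\,v^{(\beta)}(m, \mathbf 0)}{2m(E + m)} , \tag{4.675}\\
&= \frac{\bar u^{(\alpha)}(m, \mathbf 0)\,\gamma^0(\slashed P + m)^\dagger\gamma^0\,\gamma^0(-\slashed{\tilde P} + m)\,v^{(\beta)}(m, \mathbf 0)}{2m(E + m)} , \tag{4.676}\\
&= \frac{\bar u^{(\alpha)}(m, \mathbf 0)\,(\slashed P + m)(-\slashed P + m)\,\gamma^0\,v^{(\beta)}(m, \mathbf 0)}{2m(E + m)} , \tag{4.677}\\
&= 0 , \tag{4.678}
\end{align}
since $(\slashed P + m)(-\slashed P + m) = m^2 - P^2 = 0$. Moreover
\begin{align}
v^\dagger(P, n)\,u(\tilde P, n') &= \frac{v^{\dagger(\alpha)}(m, \mathbf 0)\,(-\slashed P + m)^\dagger\,(\slashed{\tilde P} + m)\,u^{(\beta)}(m, \mathbf 0)}{2m(E + m)} , \tag{4.679}\\
&= \frac{\bar v^{(\alpha)}(m, \mathbf 0)\,\gamma^0(-\slashed P + m)^\dagger\gamma^0\,\gamma^0(\slashed{\tilde P} + m)\,u^{(\beta)}(m, \mathbf 0)}{2m(E + m)} , \tag{4.680}\\
&= \frac{\bar v^{(\alpha)}(m, \mathbf 0)\,(-\slashed P + m)(\slashed P + m)\,\gamma^0\,u^{(\beta)}(m, \mathbf 0)}{2m(E + m)} , \tag{4.681}\\
&= 0 . \tag{4.682}
\end{align}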

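Now let us substitute Eq. (4.670) in Eq. (4.671). We have
\begin{align}
H &= i\int d^3X\sum_{\pm n, \pm n'}\int\frac{d^3p}{(2\pi)^{3/2}}\sqrt{\frac{m}{E}}\int\frac{d^3p'}{(2\pi)^{3/2}}\sqrt{\frac{m}{E'}}\,\Big[d(\mathbf p, n)\,v^\dagger(P, n)\,e^{-iP_\mu X^\mu} + b^\dagger(\mathbf p, n)\,u^\dagger(P, n)\,e^{iP_\mu X^\mu}\Big] \notag\\
&\qquad \times(-iE')\Big[b(\mathbf p', n')\,u(P', n')\,e^{-iP'_\mu X^\mu} - d^\dagger(\mathbf p', n')\,v(P', n')\,e^{iP'_\mu X^\mu}\Big] , \tag{4.683}\\
&= \sum_{\pm n, \pm n'}\int\frac{d^3p\,d^3p'}{(2\pi)^3}\,\frac{mE'}{\sqrt{EE'}}\int d^3X\,\Big[d(\mathbf p, n)\,v^\dagger(P, n)\,b(\mathbf p', n')\,u(P', n')\,e^{-i(P_\mu + P'_\mu)X^\mu} \notag\\
&\qquad - d(\mathbf p, n)\,v^\dagger(P, n)\,d^\dagger(\mathbf p', n')\,v(P', n')\,e^{-i(P_\mu - P'_\mu)X^\mu} \notag\\
&\qquad + b^\dagger(\mathbf p, n)\,u^\dagger(P, n)\,b(\mathbf p', n')\,u(P', n')\,e^{i(P_\mu - P'_\mu)X^\mu} \notag\\
&\qquad - b^\dagger(\mathbf p, n)\,u^\dagger(P, n)\,d^\dagger(\mathbf p', n')\,v(P', n')\,e^{i(P_\mu + P'_\mu)X^\mu}\Big] , \tag{4.684}\\
&\quad \Big|\ \text{the integral in } d^3X \text{ gives delta functions}\ \Big| \notag\\
&= \sum_{\pm n, \pm n'}\int d^3p\,d^3p'\,\frac{mE'}{\sqrt{EE'}}\,\Big[v^\dagger(P, n)\,u(\tilde P', n')\,d(\mathbf p, n)\,b(\mathbf p', n')\,e^{-i(E + E')t}\,\delta(\mathbf p + \mathbf p') \notag\\
&\qquad - v^\dagger(P, n)\,v(P', n')\,d(\mathbf p, n)\,d^\dagger(\mathbf p', n')\,e^{-i(E - E')t}\,\delta(\mathbf p - \mathbf p') \notag\\
&\qquad + u^\dagger(P, n)\,u(P', n')\,b^\dagger(\mathbf p, n)\,b(\mathbf p', n')\,e^{i(E - E')t}\,\delta(\mathbf p - \mathbf p') \notag\\
&\qquad - u^\dagger(P, n)\,v(\tilde P', n')\,b^\dagger(\mathbf p, n)\,d^\dagger(\mathbf p', n')\,e^{i(E + E')t}\,\delta(\mathbf p + \mathbf p')\Big] , \tag{4.685}\\
&\quad \Big|\ \text{using the orthogonality relations and the deltas for the integration in } d^3p'\ \Big| \notag\\
&= \sum_{\pm n}\int d^3p\,E\,\Big[b^\dagger(\mathbf p, n)\,b(\mathbf p, n) - d(\mathbf p, n)\,d^\dagger(\mathbf p, n)\Big] . \tag{4.686}
\end{align}
Then, as in the case of the charged KG field, we have two kinds of particle states. The peculiarity of the Dirac case, however, lies in the minus sign between the term with “particles” of kind “b” and the one with those of kind “d”. If we imposed commutation relations among the creation-annihilation operators, we would produce states with negative energy. Moreover, commutation relations give rise, as we already noticed in the KG case, to symmetric wave functions, whereas we would like to have anti-symmetric wave functions, in order to satisfy Fermi-Dirac statistics.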

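Therefore, we will impose the following quantization rules:
\begin{align}
[b(\mathbf p, n), b^\dagger(\mathbf p', n')]_+ &= [d(\mathbf p, n), d^\dagger(\mathbf p', n')]_+ = \delta_{nn'}\,\delta(\mathbf p - \mathbf p') , \tag{4.687}\\
[b(\mathbf p, n), b(\mathbf p', n')]_+ &= [d(\mathbf p, n), d(\mathbf p', n')]_+ = \dots = 0 . \tag{4.688}
\end{align}
Using anticommutators (instead of commutators) we can write the energy in normal ordering as follows
$$ :H: \;= \sum_{\pm n}\int d^3p\,E\,\Big[b^\dagger(\mathbf p, n)\,b(\mathbf p, n) + d^\dagger(\mathbf p, n)\,d(\mathbf p, n)\Big] . \tag{4.689} $$
The momentum operator has the same structure as the hamiltonian, and we find
$$ :P^i: \;= \sum_{\pm n}\int d^3p\,p^i\,\Big[b^\dagger(\mathbf p, n)\,b(\mathbf p, n) + d^\dagger(\mathbf p, n)\,d(\mathbf p, n)\Big] . \tag{4.690} $$
The spectrum is recovered by defining the action of $b(\mathbf p, n)$ and $d(\mathbf p, n)$ on the vacuum
$$ b(\mathbf p, n)|0\rangle = 0 , \qquad d(\mathbf p, n)|0\rangle = 0 , \tag{4.691} $$
while the creation operators $b^\dagger(\mathbf p, n)$ and $d^\dagger(\mathbf p, n)$ create one-particle states with definite energy and momentum
$$ b^\dagger(\mathbf p, n)|0\rangle = |\mathbf p\rangle , \qquad d^\dagger(\mathbf p, n)|0\rangle = |\mathbf p\rangle . \tag{4.692} $$
If we refer only to $H$ and $\mathbf P$, states of kind “b” and states of kind “d” are degenerate
\begin{align}
:H:\,b^\dagger(\mathbf p, n)|0\rangle &= E\,b^\dagger(\mathbf p, n)|0\rangle , \tag{4.693}\\
:H:\,d^\dagger(\mathbf p, n)|0\rangle &= E\,d^\dagger(\mathbf p, n)|0\rangle , \tag{4.694}\\
:\mathbf P:\,b^\dagger(\mathbf p, n)|0\rangle &= \mathbf p\,b^\dagger(\mathbf p, n)|0\rangle , \tag{4.695}\\
:\mathbf P:\,d^\dagger(\mathbf p, n)|0\rangle &= \mathbf p\,d^\dagger(\mathbf p, n)|0\rangle . \tag{4.696}
\end{align}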

However, the Dirac lagrangian is invariant under global phase transformations and the conserved quantity, by Noether's theorem, is the “charge”
$$ Q = \int d^3X\,\psi^\dagger\psi . \tag{4.697} $$
If we substitute the expression of the field in normal modes in Eq. (4.697) and we integrate as in the case of the hamiltonian, we find
$$ \int d^3X\,\psi^\dagger\psi = \sum_{\pm n}\int d^3p\,\Big[b^\dagger(\mathbf p, n)\,b(\mathbf p, n) + d(\mathbf p, n)\,d^\dagger(\mathbf p, n)\Big] \tag{4.698} $$
and therefore, in normal ordering
$$ :Q: \;= \sum_{\pm n}\int d^3p\,\Big[b^\dagger(\mathbf p, n)\,b(\mathbf p, n) - d^\dagger(\mathbf p, n)\,d(\mathbf p, n)\Big] . \tag{4.699} $$
The charge operator is able to distinguish between states of kind “b” and states of kind “d”.
Let us remember that the current that gives rise to the conserved charge of Noether's theorem is
$$ j^\mu = \bar\psi\gamma^\mu\psi \tag{4.700} $$
and that the interaction term of the Dirac field with the electromagnetic field in the lagrangian density is
$$ \mathcal L_{\rm int} = -\mathcal H_{\rm int} = -e\,\bar\psi\gamma^\mu\psi\,A_\mu . \tag{4.701} $$
Therefore
$$ J^\mu = e\,\bar\psi\gamma^\mu\psi \tag{4.702} $$
can be interpreted as the electric current, while
$$ :Q: \;= \sum_{\pm n}\int d^3p\,e\,\Big[b^\dagger(\mathbf p, n)\,b(\mathbf p, n) - d^\dagger(\mathbf p, n)\,d(\mathbf p, n)\Big] \tag{4.703} $$
will be interpreted as the electric charge of the Dirac state.
This means that $b^\dagger(\mathbf p, n)$ will create a particle with energy $E$, momentum $\mathbf p$ and charge $e = -|e|$ (the electron), while $d^\dagger(\mathbf p, n)$ will create the anti-particle, with energy $E$, momentum $\mathbf p$ and charge $-e$ (the positron).

Two-particle states. Fermions


If we now consider a two-particle state, for instance a state with two electrons, since we have
$$ b^\dagger(\mathbf p_1, n)\,b^\dagger(\mathbf p_2, n')|0\rangle = -b^\dagger(\mathbf p_2, n')\,b^\dagger(\mathbf p_1, n)|0\rangle \tag{4.704} $$
(and the same happens for antiparticle states), we will have states that are totally antisymmetric under the exchange of the two particles.
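The consequences of the anti-commutation rules (4.687)-(4.688) can be illustrated in a toy model with two discrete modes. This is a minimal sketch under stated assumptions: the continuous labels $(\mathbf p, n)$ are replaced by two mode labels, so the Dirac delta becomes a Kronecker delta, and the operators are represented by $4\times4$ matrices through a Jordan-Wigner construction (a technique not used in the notes, introduced here only for illustration). All names are hypothetical.

```python
import numpy as np

# Single-mode annihilation operator in the basis (|0>, |1>): a|1> = |0>.
a = np.array([[0, 1], [0, 0]], dtype=complex)
Z = np.diag([1, -1]).astype(complex)     # sign string of the Jordan-Wigner map
I2 = np.eye(2, dtype=complex)

# Two fermionic modes acting on a four-dimensional Fock space.
b1 = np.kron(a, I2)
b2 = np.kron(Z, a)

def dagger(op):
    """Hermitian conjugate of a matrix operator."""
    return op.conj().T

def anticomm(x, y):
    """Anticommutator [x, y]_+ = x y + y x."""
    return x @ y + y @ x

vacuum = np.zeros(4, dtype=complex)
vacuum[0] = 1.0                          # both modes empty

state_12 = dagger(b1) @ dagger(b2) @ vacuum
state_21 = dagger(b2) @ dagger(b1) @ vacuum
double_occupation = dagger(b1) @ dagger(b1) @ vacuum
```

The relation `state_12 = -state_21` is the discrete analogue of Eq. (4.704), and the vanishing of `double_occupation` is the Pauli exclusion principle for a single mode.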

Anti-commutation rules for the fields


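We quantized the Dirac field by imposing anti-commutation rules on the creation-annihilation operators, in order to have physical insight into what we were doing. These anti-commutation rules induce analogous anti-commutation rules on the fields. We have
\begin{align}
[\psi_\alpha(\mathbf x, t), \psi^\dagger_\beta(\mathbf y, t)]_+ &= \sum_{\pm n, \pm n'}\int\frac{d^3p\,d^3p'}{(2\pi)^3}\,\frac{m}{\sqrt{EE'}}\,\Big\{\Big[b(\mathbf p, n)\,u_\alpha(P, n)\,e^{-iP_\mu X^\mu} + d^\dagger(\mathbf p, n)\,v_\alpha(P, n)\,e^{iP_\mu X^\mu}\Big] \notag\\
&\qquad \times\Big[d(\mathbf p', n')\,v^\dagger_\beta(P', n')\,e^{-iP'_\mu Y^\mu} + b^\dagger(\mathbf p', n')\,u^\dagger_\beta(P', n')\,e^{iP'_\mu Y^\mu}\Big] \notag\\
&\qquad + \Big[d(\mathbf p', n')\,v^\dagger_\beta(P', n')\,e^{-iP'_\mu Y^\mu} + b^\dagger(\mathbf p', n')\,u^\dagger_\beta(P', n')\,e^{iP'_\mu Y^\mu}\Big] \notag\\
&\qquad \times\Big[b(\mathbf p, n)\,u_\alpha(P, n)\,e^{-iP_\mu X^\mu} + d^\dagger(\mathbf p, n)\,v_\alpha(P, n)\,e^{iP_\mu X^\mu}\Big]\Big\}_{(X^0 = Y^0)} , \tag{4.705}\\
&= \sum_{\pm n, \pm n'}\int\frac{d^3p\,d^3p'}{(2\pi)^3}\,\frac{m}{\sqrt{EE'}}\,\Big\{[b(\mathbf p, n), d(\mathbf p', n')]_+\,u_\alpha(P, n)\,v^\dagger_\beta(P', n')\,e^{-iP_\mu X^\mu}e^{-iP'_\mu Y^\mu} \notag\\
&\qquad + [b(\mathbf p, n), b^\dagger(\mathbf p', n')]_+\,u_\alpha(P, n)\,u^\dagger_\beta(P', n')\,e^{-iP_\mu X^\mu}e^{iP'_\mu Y^\mu} \notag\\
&\qquad + [d^\dagger(\mathbf p, n), d(\mathbf p', n')]_+\,v_\alpha(P, n)\,v^\dagger_\beta(P', n')\,e^{iP_\mu X^\mu}e^{-iP'_\mu Y^\mu} \notag\\
&\qquad + [d^\dagger(\mathbf p, n), b^\dagger(\mathbf p', n')]_+\,v_\alpha(P, n)\,u^\dagger_\beta(P', n')\,e^{iP_\mu X^\mu}e^{iP'_\mu Y^\mu}\Big\}_{(X^0 = Y^0)} , \tag{4.706}\\
&\quad \Big|\ \text{using the anti-commutation relations}\ \Big| \notag\\
&= \sum_{\pm n, \pm n'}\int\frac{d^3p\,d^3p'}{(2\pi)^3}\,\frac{m}{\sqrt{EE'}}\,\Big\{u_\alpha(P, n)\,u^\dagger_\beta(P', n')\,e^{i\mathbf p\cdot(\mathbf x - \mathbf y)}\,\delta_{nn'}\,\delta(\mathbf p - \mathbf p') \notag\\
&\qquad + v_\alpha(P, n)\,v^\dagger_\beta(P', n')\,e^{-i\mathbf p\cdot(\mathbf x - \mathbf y)}\,\delta_{nn'}\,\delta(\mathbf p - \mathbf p')\Big\}_{(X^0 = Y^0)} , \tag{4.707}\\
&= \sum_{\pm n}\int\frac{d^3p}{(2\pi)^3}\,\frac{m}{E}\,\Big\{u_\alpha(P, n)\,u^\dagger_\beta(P, n) + v_\alpha(\tilde P, n)\,v^\dagger_\beta(\tilde P, n)\Big\}\,e^{i\mathbf p\cdot(\mathbf x - \mathbf y)} , \tag{4.708}
\end{align}
where in the last step we changed $\mathbf p \to -\mathbf p$ in the term containing the $v$ spinors. Remembering the expression of the sum over polarizations, we have
$$ \sum_{\pm n}u_\alpha(P, n)\,u^\dagger_\beta(P, n) = \left\{\left(\frac{\slashed P + m}{2m}\right)\gamma^0\right\}_{\alpha\beta} , \qquad \sum_{\pm n}v_\alpha(\tilde P, n)\,v^\dagger_\beta(\tilde P, n) = \left\{\left(\frac{\slashed{\tilde P} - m}{2m}\right)\gamma^0\right\}_{\alpha\beta} . \tag{4.709} $$
Therefore
$$ \left\{\left(\frac{\slashed P + m}{2m}\right)\gamma^0\right\}_{\alpha\beta} + \left\{\left(\frac{\slashed{\tilde P} - m}{2m}\right)\gamma^0\right\}_{\alpha\beta} = \left\{\left(\frac{\slashed P + \slashed{\tilde P}}{2m}\right)\gamma^0\right\}_{\alpha\beta} = \frac{E}{m}\,\delta_{\alpha\beta} . \tag{4.710} $$
Substituting in the previous expression we have
$$ [\psi_\alpha(\mathbf x, t), \psi^\dagger_\beta(\mathbf y, t)]_+ = \delta_{\alpha\beta}\int\frac{d^3p}{(2\pi)^3}\,e^{i\mathbf p\cdot(\mathbf x - \mathbf y)} = \delta_{\alpha\beta}\,\delta(\mathbf x - \mathbf y) . \tag{4.711} $$
Analogously we find
$$ [\psi_\alpha(\mathbf x, t), \psi_\beta(\mathbf y, t)]_+ = [\psi^\dagger_\alpha(\mathbf x, t), \psi^\dagger_\beta(\mathbf y, t)]_+ = 0 . \tag{4.712} $$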

4.5 The Electromagnetic Field (classical field)


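In this section we will consider the case of a vector field, the electromagnetic field.
Maxwell's equations, in the Heaviside-Lorentz system, have the following form:
\begin{align}
\nabla\cdot\mathbf E &= \rho , \tag{4.713}\\
\nabla\cdot\mathbf H &= 0 , \tag{4.714}\\
\nabla\wedge\mathbf E + \frac{1}{c}\frac{\partial\mathbf H}{\partial t} &= 0 , \tag{4.715}\\
\nabla\wedge\mathbf H - \frac{1}{c}\frac{\partial\mathbf E}{\partial t} &= \mathbf j . \tag{4.716}
\end{align}
Taking the divergence of Eq. (4.716) and using Eq. (4.713), we find the continuity equation
$$ \frac{1}{c}\frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf j = 0 , \tag{4.717} $$
which expresses the conservation of the electric charge.
Since the divergence of the magnetic field $\mathbf H$ is identically zero, we can introduce a vector function $\mathbf A(\mathbf x, t)$ such that:
$$ \mathbf H = \nabla\wedge\mathbf A . \tag{4.718} $$
$\mathbf A$ is called the vector potential.
Substituting Eq. (4.718) into Eq. (4.715) we obtain:
$$ \nabla\wedge\mathbf E + \frac{1}{c}\frac{\partial}{\partial t}\nabla\wedge\mathbf A = \nabla\wedge\left(\mathbf E + \frac{1}{c}\frac{\partial\mathbf A}{\partial t}\right) = 0 . \tag{4.719} $$
Eq. (4.719) implies the existence of a scalar function, $\phi(\mathbf x, t)$, such that:
$$ \mathbf E + \frac{1}{c}\frac{\partial\mathbf A}{\partial t} = -\nabla\phi . \tag{4.720} $$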
132
The electric field E can then be expressed as follows:
1 ∂A
E = −∇φ − . (4.721)
c ∂t
The scalar function φ is called scalar potential.
Maxwell's equations can be written in terms of the potentials. In this way we find that only two equations survive, while the other two are identically satisfied.
In fact, Eq. (4.714) is identically satisfied. Eq. (4.713), with the electric field defined in (4.721), becomes
\[
\nabla^2\varphi + \frac{1}{c}\frac{\partial}{\partial t}\nabla\cdot\mathbf{A} = -\rho\,. \tag{4.722}
\]
Eq. (4.715) is identically satisfied. Finally, Eq. (4.716) becomes
\[
\nabla^2\mathbf{A} - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\mathbf{A} = -\mathbf{j} + \nabla\left( \nabla\cdot\mathbf{A} + \frac{1}{c}\frac{\partial}{\partial t}\varphi \right). \tag{4.723}
\]

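In total, therefore, the four Eqs. (4.713, 4.714, 4.715, 4.716) are reduced to the following two:
\begin{align}
\nabla^2\varphi + \frac{1}{c}\frac{\partial}{\partial t}\nabla\cdot\mathbf{A} &= -\rho\,, \tag{4.724}\\
\nabla^2\mathbf{A} - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\mathbf{A} &= -\mathbf{j} + \nabla\left( \nabla\cdot\mathbf{A} + \frac{1}{c}\frac{\partial}{\partial t}\varphi \right). \tag{4.725}
\end{align}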

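Eqs. (4.724, 4.725) exhibit an important invariance under the following redefinition of the potentials:
\[
\begin{cases}
\mathbf{A} \to \mathbf{A}' = \mathbf{A} + \nabla\psi \\[2pt]
\varphi \to \varphi' = \varphi - \dfrac{1}{c}\dfrac{\partial\psi}{\partial t}
\end{cases} \tag{4.726}
\]
where $\psi(\mathbf{x},t)$ is a generic $C^2$ function of its arguments.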


This invariance is called gauge invariance. We find that the fields E and H are gauge-invariant
quantities.
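We can use gauge invariance in order to simplify Eqs. (4.724, 4.725). In fact, if we perform a transformation (4.726) of the potentials $(\varphi, \mathbf{A})$ to the potentials $(\varphi', \mathbf{A}')$, with $\psi$ such that
\[
\nabla^2\psi - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\psi = -\nabla\cdot\mathbf{A} - \frac{1}{c}\frac{\partial\varphi}{\partial t}\,, \tag{4.727}
\]
in the new gauge we will have
\[
\nabla\cdot\mathbf{A}' + \frac{1}{c}\frac{\partial}{\partial t}\varphi' = 0 \tag{4.728}
\]
and Eqs. (4.724, 4.725) will be simplified as follows:
\begin{align}
\nabla^2\varphi' - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\varphi' &= -\rho\,, \tag{4.729}\\
\nabla^2\mathbf{A}' - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\mathbf{A}' &= -\mathbf{j}\,, \tag{4.730}
\end{align}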
(where we should remember that Eq. (4.728) holds). This choice of gauge is called Lorentz gauge.
We have to notice that this choice of the function $\psi$ does not determine the potentials $\varphi'$ and $\mathbf{A}'$ uniquely. It is possible to make another gauge transformation while staying in the Lorentz gauge. In fact, if
\[
\nabla^2\chi - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\chi = 0\,, \tag{4.731}
\]
the transformation (4.726) with $\chi$ in place of $\psi$ gives two new potentials $(\varphi'', \mathbf{A}'')$ for which a relation like the one in Eq. (4.728) holds:
\[
\nabla\cdot\mathbf{A}'' + \frac{1}{c}\frac{\partial}{\partial t}\varphi'' = \nabla\cdot\mathbf{A}' + \frac{1}{c}\frac{\partial}{\partial t}\varphi' + \nabla^2\chi - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\chi \equiv 0\,, \tag{4.732}
\]
where we used Eq. (4.728) and Eq. (4.731).


Gauge invariance tells us that not all four components of the potentials are independent. In fact, Eq. (4.728) and Eq. (4.732) constitute two constraints on the four components of $(\varphi, \mathbf{A})$. In total, therefore, only two components are independent, as we will see explicitly below.

4.5.1 Covariant form of Maxwell’s equations


The charge density $\rho$ and the current $\mathbf{j}$ transform, under Lorentz transformations, as the temporal and spatial parts of a four-vector
\[
J^\mu = (\rho, \mathbf{j})\,. \tag{4.733}
\]
The continuity equation then takes the simple form:
\[
\partial_\mu J^\mu = 0\,. \tag{4.734}
\]

The d'Alembert differential operator
\[
\frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2\,, \tag{4.735}
\]
can be expressed in covariant form as follows
\[
\frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2 = \partial_\mu\partial^\mu = \partial^2\,, \tag{4.736}
\]
which manifestly shows that it is a Lorentz scalar. Finally, the scalar, $\varphi$, and vector, $\mathbf{A}$, potentials transform, again, as the temporal and spatial parts of a four-vector
\[
A^\mu = (\varphi, \mathbf{A})\,. \tag{4.737}
\]

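We can then write Eqs. (4.724, 4.725) in a manifestly covariant form as:
\[
\partial^2 A^\mu - \partial^\mu(\partial_\nu A^\nu) = J^\mu\,, \tag{4.738}
\]
which are also invariant under a gauge transformation
\[
A^\mu \longrightarrow A'^\mu = A^\mu + \partial^\mu\psi\,. \tag{4.739}
\]
In the Lorentz gauge, we will have
\[
\begin{cases}
\partial^2 A^\mu = J^\mu \\
\partial_\mu A^\mu = 0
\end{cases} \tag{4.740}
\]
and in the free-field case
\[
\begin{cases}
\partial^2 A^\mu = 0 \\
\partial_\mu A^\mu = 0
\end{cases} \tag{4.741}
\]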
4.5.2 Electromagnetic tensor
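We can write a manifestly covariant form of Maxwell's equations by introducing the electromagnetic tensor, whose components are the components of the electric and magnetic fields, $\mathbf{E}$ and $\mathbf{H}$, which are directly gauge invariant.
Let us define the following anti-symmetric rank-2 tensor:
\[
F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu = -F^{\nu\mu}\,. \tag{4.742}
\]
Since we have
\[
\begin{cases}
\mathbf{E} = -\nabla\varphi - \dfrac{1}{c}\dfrac{\partial\mathbf{A}}{\partial t} \\[4pt]
\mathbf{H} = \nabla\wedge\mathbf{A}
\end{cases}
\quad\Longrightarrow\quad
\begin{cases}
E^i = \partial^i A^0 - \partial^0 A^i \\
H^i = \epsilon^{ijk}\,\partial_j A^k
\end{cases} \tag{4.743}
\]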

we get immediately that $F^{\mu\nu}$ can be represented as a $4\times 4$ matrix as follows:
\[
F^{\mu\nu} = \begin{pmatrix}
0 & -E^1 & -E^2 & -E^3 \\
E^1 & 0 & -H^3 & H^2 \\
E^2 & H^3 & 0 & -H^1 \\
E^3 & -H^2 & H^1 & 0
\end{pmatrix}. \tag{4.744}
\]
Using F µν we can express the two Maxwell’s equations with sources as

∂µ F µν = J ν . (4.745)

The second pair of equations can be obtained by the Bianchi identities for F µν :

∂ µ F νσ + ∂ σ F µν + ∂ ν F σµ = 0 , (4.746)

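or we can introduce the dual of $F^{\mu\nu}$, $\tilde F^{\mu\nu}$, via the following definition:
\[
\tilde F^{\alpha\beta} = \frac{1}{2}\,\epsilon^{\alpha\beta\mu\nu} F_{\mu\nu}\,, \tag{4.747}
\]
where $\epsilon^{\alpha\beta\mu\nu}$ is the Levi-Civita tensor. The tensor $\tilde F^{\mu\nu}$ has the following matrix representation:
\[
\tilde F^{\mu\nu} = \begin{pmatrix}
0 & -H^1 & -H^2 & -H^3 \\
H^1 & 0 & E^3 & -E^2 \\
H^2 & -E^3 & 0 & E^1 \\
H^3 & E^2 & -E^1 & 0
\end{pmatrix}, \tag{4.748}
\]
and therefore it allows us to write the second pair of Maxwell's equations in the following form
\[
\partial_\mu \tilde F^{\mu\nu} = 0\,. \tag{4.749}
\]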

4.5.3 Lagrangian density of the electromagnetic field


Let us look for the Lagrangian density for the field $A^\mu$ in the vacuum, i.e. with $J^\mu = 0$. $\mathcal{L}$ should be a Lorentz scalar (invariant under proper Lorentz transformations), gauge invariant and, since the equations of motion are linear in the fields, $\mathcal{L}$ should contain quadratic terms. We have, at our disposal, the four-vector $A^\mu$ and the tensors $F^{\mu\nu}$ and $\tilde F^{\mu\nu}$, with which we can construct scalars like
\[
F_{\mu\nu}F^{\mu\nu}\,; \qquad \tilde F_{\mu\nu}\tilde F^{\mu\nu}\,; \qquad F_{\mu\nu}\tilde F^{\mu\nu}\,. \tag{4.750}
\]
Other terms like $F_{\mu\nu}A^\mu A^\nu$, $F_{\mu\nu}\,dX^\mu dX^\nu$, etc. are identically zero because of the anti-symmetry of $F_{\mu\nu}$. Moreover, we have to discard also terms like $A_\mu A^\mu$. In fact, although it is a Lorentz scalar, it is not gauge invariant.
Of the three terms in Eq. (4.750) only one survives. In fact,
\[
F_{\mu\nu}F^{\mu\nu} = 2\,(H^2 - E^2)\,, \tag{4.751}
\]
while the second term gives
\[
\tilde F_{\mu\nu}\tilde F^{\mu\nu} = 2\,(E^2 - H^2) = -F_{\mu\nu}F^{\mu\nu}\,, \tag{4.752}
\]
i.e. it is equivalent to the first. The third is a pseudo-scalar and therefore it has to be discarded.
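The three contractions can be checked numerically from the matrix in Eq. (4.744). The following is a minimal sketch, assuming the metric signature $(+,-,-,-)$ and the convention $\epsilon^{0123} = +1$ (which reproduces the matrix in Eq. (4.748)); the field values are arbitrary illustrative numbers.

```python
ETA = [1.0, -1.0, -1.0, -1.0]  # diagonal of the metric, signature (+,-,-,-)

def field_tensor(E, H):
    """Contravariant F^{mu nu}, laid out as in Eq. (4.744)."""
    E1, E2, E3 = E
    H1, H2, H3 = H
    return [
        [0.0, -E1, -E2, -E3],
        [E1, 0.0, -H3, H2],
        [E2, H3, 0.0, -H1],
        [E3, -H2, H1, 0.0],
    ]

def lower(T):
    """Lower both indices with the diagonal metric."""
    return [[ETA[m] * ETA[n] * T[m][n] for n in range(4)] for m in range(4)]

def levi_civita(idx):
    """Totally antisymmetric symbol, assuming epsilon^{0123} = +1."""
    if len(set(idx)) < 4:
        return 0
    inversions = sum(1 for i in range(4) for j in range(i + 1, 4) if idx[i] > idx[j])
    return -1 if inversions % 2 else 1

def dual(F):
    """Dual tensor, (1/2) epsilon^{a b mu nu} F_{mu nu}, as in Eq. (4.747)."""
    F_low = lower(F)
    return [[0.5 * sum(levi_civita((a, b, m, n)) * F_low[m][n]
                       for m in range(4) for n in range(4))
             for b in range(4)] for a in range(4)]

def contract(A, B):
    """Full contraction A_{mu nu} B^{mu nu}."""
    A_low = lower(A)
    return sum(A_low[m][n] * B[m][n] for m in range(4) for n in range(4))

E = (0.3, -1.1, 0.7)   # illustrative electric field
H = (0.5, 0.2, -0.9)   # illustrative magnetic field
E2 = sum(c * c for c in E)
H2 = sum(c * c for c in H)
EdotH = sum(a * b for a, b in zip(E, H))

F = field_tensor(E, H)
F_dual = dual(F)

ff = contract(F, F)            # expected 2 (H^2 - E^2)
dd = contract(F_dual, F_dual)  # expected 2 (E^2 - H^2)
fd = contract(F, F_dual)       # proportional to E . H (a pseudo-scalar)
print(ff, dd, fd)
```

Under parity $\mathbf{E}$ changes sign while $\mathbf{H}$ does not, so the mixed contraction, being proportional to $\mathbf{E}\cdot\mathbf{H}$, changes sign, which is why it is discarded.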
In total we have
\[
\mathcal{L} = a\, F_{\mu\nu}F^{\mu\nu}\,, \tag{4.753}
\]
where $a$ is a proportionality constant that has to be found.
We find the correct equations of motion by imposing $a = -\frac{1}{4}$. Finally
\[
\mathcal{L} = -\frac{1}{4} F_{\mu\nu}F^{\mu\nu}\,. \tag{4.754}
\]
Note: this “constructive” way to the Lagrangian density would also have allowed the presence of other terms. One could add, for instance, a term which is a Lorentz scalar and also gauge invariant, like $F_{\mu\lambda}F^{\lambda}{}_{\sigma}F^{\sigma\mu}$. This term, however, is an operator of dimension 6 and it is not renormalizable. This criterion will become clear when we introduce radiative corrections.

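We can find the Lagrangian density for the electromagnetic field in a more standard way, using Hamilton's principle. Considering a variation of the field, $\delta A^\nu$, that vanishes on the boundary of the integration volume, we have
\begin{align}
0 = \delta S &= \int d^4X \left[ \partial^2 A_\nu - \partial_\nu(\partial^\mu A_\mu) \right] \delta A^\nu = \int d^4X \left[ \partial_\mu\partial^\mu A_\nu\, \delta A^\nu - \partial_\nu(\partial^\mu A_\mu)\, \delta A^\nu \right], \tag{4.755}\\
&= |\ \text{integrating by parts}\ | \notag\\
&= \int d^4X \left[ \partial^\mu(\partial_\mu A_\nu\, \delta A^\nu) - \partial_\mu A_\nu\, \delta(\partial^\mu A^\nu) - \partial^\mu(\partial_\nu A_\mu\, \delta A^\nu) + \partial_\nu A_\mu\, \delta(\partial^\mu A^\nu) \right], \tag{4.756}\\
&= |\ \text{the surface terms integrate to zero}\ | \notag\\
&= \int d^4X \left[ -\partial_\mu A_\nu\, \delta(\partial^\mu A^\nu) + \partial_\nu A_\mu\, \delta(\partial^\mu A^\nu) \right], \tag{4.757}\\
&= \int d^4X \left[ -\frac{1}{2}\partial_\mu A_\nu\, \delta(\partial^\mu A^\nu) - \frac{1}{2}\partial_\nu A_\mu\, \delta(\partial^\nu A^\mu) + \frac{1}{2}\partial_\nu A_\mu\, \delta(\partial^\mu A^\nu) + \frac{1}{2}\partial_\mu A_\nu\, \delta(\partial^\nu A^\mu) \right], \tag{4.758}\\
&= |\ \text{since } F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu\ | \notag\\
&= \int d^4X \left[ -\frac{1}{2} F_{\mu\nu}\, \delta F^{\mu\nu} \right], \tag{4.759}\\
&= \delta \int d^4X \left[ -\frac{1}{4} F_{\mu\nu}F^{\mu\nu} \right]. \tag{4.760}
\end{align}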

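Therefore, we have
\[
\mathcal{L} = -\frac{1}{4} F_{\mu\nu}F^{\mu\nu} = -\frac{1}{2}\partial_\mu A_\nu\, \partial^\mu A^\nu + \frac{1}{2}\partial_\mu A_\nu\, \partial^\nu A^\mu\,, \tag{4.761}
\]
which is Lorentz invariant, gauge invariant and local. The overall sign is important in order to have an energy density which is positive definite.
We can check that with this Lagrangian density we get the correct Maxwell's equations as Euler-Lagrange equations
\[
\frac{\partial\mathcal{L}}{\partial A_\mu} - \partial_\nu \frac{\partial\mathcal{L}}{\partial A_{\mu,\nu}} = 0\,. \tag{4.762}
\]
In fact, we have
\[
\begin{cases}
\dfrac{\partial\mathcal{L}}{\partial A_\mu} = 0\,, \\[6pt]
\dfrac{\partial\mathcal{L}}{\partial A_{\mu,\nu}} = \partial^\mu A^\nu - \partial^\nu A^\mu\,,
\end{cases} \tag{4.763}
\]
and therefore
\[
\partial^2 A^\mu - \partial^\mu(\partial_\nu A^\nu) = 0\,. \tag{4.764}
\]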
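We can express the Lagrangian density in terms of the electric and magnetic fields. We have
\begin{align}
F^{00} &= F^{ii} = 0\,, \tag{4.765}\\
F^{0i} &= \partial^0 A^i - \partial^i A^0 = \partial_0 A^i + \partial_i A^0 = -E^i = -F^{i0} = -F_{0i}\,, \tag{4.766}\\
F^{ij} &= \partial^i A^j - \partial^j A^i = -\partial_i A^j + \partial_j A^i = -\epsilon^{ijk}(\nabla\wedge\mathbf{A})^k = -\epsilon^{ijk}H^k\,, \tag{4.767}
\end{align}
and therefore (with $i < j$ in the last two terms)
\[
F_{\mu\nu}F^{\mu\nu} = F_{0i}F^{0i} + F_{i0}F^{i0} + F_{ij}F^{ij} + F_{ji}F^{ji} = -2|\mathbf{E}|^2 + 2|\mathbf{H}|^2\,. \tag{4.768}
\]
In total
\[
\mathcal{L} = \frac{|\mathbf{E}|^2 - |\mathbf{H}|^2}{2}\,. \tag{4.769}
\]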

4.5.4 Energy-Momentum tensor


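From Noether's theorem we have
\begin{align}
T^{\mu\nu} &= \frac{\partial\mathcal{L}}{\partial(\partial_\mu A_\alpha)}\, \partial^\nu A_\alpha - \eta^{\mu\nu}\mathcal{L}\,, \tag{4.770}\\
&= (-\partial^\mu A^\alpha + \partial^\alpha A^\mu)\, \partial^\nu A_\alpha + \frac{1}{4}\eta^{\mu\nu} F_{\alpha\beta}F^{\alpha\beta}\,, \tag{4.771}\\
&= -F^{\mu\alpha}\, \partial^\nu A_\alpha + \frac{1}{4}\eta^{\mu\nu} F_{\alpha\beta}F^{\alpha\beta}\,, \tag{4.772}
\end{align}
such that
\[
\partial_\mu T^{\mu\nu} = 0\,. \tag{4.773}
\]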
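This form, (4.772), is neither symmetric under the exchange $\mu \leftrightarrow \nu$ nor gauge invariant. In fact, under a gauge transformation
\[
A^\mu \to A^\mu + \partial^\mu\chi\,, \tag{4.774}
\]
we have
\[
T^{\mu\nu} \to T'^{\mu\nu} = -F^{\mu\alpha}\, \partial^\nu(A_\alpha + \partial_\alpha\chi) + \frac{1}{4}\eta^{\mu\nu} F_{\alpha\beta}F^{\alpha\beta} = T^{\mu\nu} - F^{\mu\alpha}\, \partial^\nu\partial_\alpha\chi\,. \tag{4.775}
\]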
However, the conserved charges are gauge invariant (this is the important thing!). In fact, we have

F µα ∂ ν ∂α χ = ∂α (F µα ∂ ν χ) − (∂α F µα ) ∂ ν χ = ∂α (F µα ∂ ν χ) , (4.776)

since, by the equations of motion, $\partial_\alpha F^{\mu\alpha} = 0$. Therefore, under the gauge transformation (4.774) we have
\begin{align}
P^\nu = \int d^3X\, T^{0\nu} \to P'^\nu &= \int d^3X\, T'^{0\nu} = \int d^3X\, T^{0\nu} - \int d^3X\, F^{0\alpha}\, \partial^\nu\partial_\alpha\chi\,, \tag{4.777}\\
&= P^\nu - \int d^3X\, \partial_\alpha(F^{0\alpha}\, \partial^\nu\chi)\,, \tag{4.778}\\
&= P^\nu - \int d^3X\, \partial_i(F^{0i}\, \partial^\nu\chi) = P^\nu\,, \tag{4.779}
\end{align}
since $F^{00} = 0$ and since the last integral gives a surface term that is zero in the limit of infinite volume (we always assume that the fields go to zero sufficiently rapidly at infinity).
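We can define a gauge invariant energy-momentum tensor by adding to the form in Eq. (4.772) the following term
\[
C^{\mu\nu} = \partial_\alpha(F^{\mu\alpha} A^\nu)\,, \tag{4.780}
\]
which satisfies
\[
\partial_\mu C^{\mu\nu} = 0\,, \tag{4.781}
\]
and is such that
\[
\int d^3X\, C^{0\nu} = \int d^3X\, \partial_i(F^{0i} A^\nu) = 0\,. \tag{4.782}
\]
The new energy-momentum tensor (symmetric in $\mu \leftrightarrow \nu$ and gauge-invariant) is
\[
\tilde T^{\mu\nu} = T^{\mu\nu} + C^{\mu\nu} = F^{\mu\alpha}F_\alpha{}^{\nu} + \frac{1}{4}\eta^{\mu\nu} F_{\alpha\beta}F^{\alpha\beta}\,. \tag{4.783}
\]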
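Using (4.783) we get the usual expressions for the energy density and the momentum density:
\begin{align}
\mathcal{H} &= \tilde T^{00} = \frac{|\mathbf{E}|^2 + |\mathbf{H}|^2}{2}\,, \tag{4.784}\\
\mathcal{P}^i &= \tilde T^{0i} = (\mathbf{E}\wedge\mathbf{H})^i\,, \tag{4.785}
\end{align}
which is the Poynting vector.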
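As a cross-check, the components of $\tilde T^{\mu\nu}$ can be evaluated directly from the matrix in Eq. (4.744). This is only an illustrative sketch, assuming the metric signature $(+,-,-,-)$; the field values are arbitrary numbers chosen for the example.

```python
ETA = [1.0, -1.0, -1.0, -1.0]  # diagonal of the metric, signature (+,-,-,-)

def field_tensor(E, H):
    """Contravariant F^{mu nu}, laid out as in Eq. (4.744)."""
    E1, E2, E3 = E
    H1, H2, H3 = H
    return [
        [0.0, -E1, -E2, -E3],
        [E1, 0.0, -H3, H2],
        [E2, H3, 0.0, -H1],
        [E3, -H2, H1, 0.0],
    ]

def improved_tensor(F):
    """F^{mu a} F_a^{nu} + (1/4) eta^{mu nu} F_{ab} F^{ab}, as in Eq. (4.783)."""
    ff = sum(ETA[a] * ETA[b] * F[a][b] * F[a][b] for a in range(4) for b in range(4))
    return [[sum(F[m][a] * ETA[a] * F[a][n] for a in range(4))
             + (0.25 * ETA[m] * ff if m == n else 0.0)
             for n in range(4)] for m in range(4)]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

E = (0.3, -1.1, 0.7)   # illustrative electric field
H = (0.5, 0.2, -0.9)   # illustrative magnetic field

T = improved_tensor(field_tensor(E, H))
energy_density = 0.5 * (sum(c * c for c in E) + sum(c * c for c in H))
poynting = cross(E, H)

print(T[0][0], energy_density)
print([T[0][i + 1] for i in range(3)], poynting)
```

The same matrix can be used to confirm that $\tilde T^{\mu\nu}$ is symmetric, which the canonical form (4.772) is not.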


The two expressions T µν and T̃ µν are physically equivalent. The additional term is a total derivative
in the Lagrangian density and, therefore, it does not affect the equations of motion. It is interesting
to notice that such a piece changes the currents, while the charges are always the same.

4.5.5 Number of degrees of freedom


We describe the electromagnetic field with the four-vector $A^\mu$. However, due to gauge invariance, the physical degrees of freedom are not 4 (as the use of an object with four components would suggest) but 2. We can count the actual degrees of freedom in a covariant gauge or in a physical gauge, like the Coulomb gauge.

Covariant gauge
We can show that the field Aµ has only two degrees of freedom using the equations of motion and
gauge invariance.
Consider the Fourier transform of the field
\[
A_\mu(X) = \int d^4K\, e^{iK_\nu X^\nu} A_\mu(K)\,. \tag{4.786}
\]

Substituting into the equations of motion we get

−K 2 Aµ (K) + Kµ (K ν Aν (K)) = 0 . (4.787)

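Now, let us write the field $A_\nu(K)$ as a combination of 4 vectors forming a basis of the Minkowski space. We can choose the following vectors:
\[
K^\mu = (E, \mathbf{k})\,, \qquad \tilde K^\mu = (E, -\mathbf{k})\,, \qquad \epsilon^{(\lambda)\,\mu}(K)\,, \quad \lambda = 1, 2\,, \tag{4.788}
\]
with
\[
K^\mu \epsilon^{(\lambda)}_\mu(K) = 0\,. \tag{4.789}
\]
We can write
\[
A_\mu(K) = a_{(\lambda)}(K)\, \epsilon^{(\lambda)}_\mu(K) + b(K) K_\mu + c(K) \tilde K_\mu\,. \tag{4.790}
\]
Substituting in Eq. (4.787) we get
\begin{align}
0 &= -K^2 \left[ a_{(\lambda)}(K)\, \epsilon^{(\lambda)}_\mu(K) + b(K) K_\mu + c(K) \tilde K_\mu \right] + K_\mu \left\{ K^\nu \left[ a_{(\lambda)}(K)\, \epsilon^{(\lambda)}_\nu(K) + b(K) K_\nu + c(K) \tilde K_\nu \right] \right\}, \notag\\
&= -K^2 a_{(\lambda)}(K)\, \epsilon^{(\lambda)}_\mu(K) - K^2 c(K) \tilde K_\mu + (K^\nu \tilde K_\nu)\, c(K) K_\mu\,. \tag{4.791}
\end{align}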

Since the 4 vectors form a basis, we must have
\[
K^2 a_{(\lambda)}(K) = K^2 c(K) = (K^\nu \tilde K_\nu)\, c(K) = 0 \tag{4.792}
\]
and therefore, since $(K^\nu \tilde K_\nu) \neq 0$, we have $c(K) = 0$, and since we want $a_{(\lambda)}(K) \neq 0$ we must have $K^2 = 0$. The coefficient $b(K)$ is undetermined and we can choose it to be 0. This fact is connected to gauge invariance. In fact, if

Aµ (X) → Aµ (X) + ∂µ χ(X) , (4.793)

the Fourier transform is such that

Aµ (K) → Aµ (K) + iKµ χ(K) , (4.794)

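where
\[
\chi(X) = \int d^4K\, e^{iK_\nu X^\nu} \chi(K)\,. \tag{4.795}
\]
Under (4.794) we have
\begin{align}
A_\mu(K) = a_{(\lambda)}(K)\, \epsilon^{(\lambda)}_\mu(K) + b(K) K_\mu + c(K) \tilde K_\mu \to
A'_\mu(K) &= a'_{(\lambda)}(K)\, \epsilon^{(\lambda)}_\mu(K) + b(K) K_\mu + i\chi(K) K_\mu + c'(K) \tilde K_\mu\,, \tag{4.796}\\
&= a'_{(\lambda)}(K)\, \epsilon^{(\lambda)}_\mu(K) + c'(K) \tilde K_\mu\,, \tag{4.797}
\end{align}
where
\[
a'_{(\lambda)}(K) = a_{(\lambda)}(K)\,, \qquad b'(K) = b(K) + i\chi(K) = 0\,, \qquad c'(K) = c(K)\,. \tag{4.798}
\]
Therefore, choosing a gauge transformation such that $\chi(K) = i\,b(K)$, we can always remove the term proportional to $K_\mu$. We are then left with two degrees of freedom (since $b(K) = c(K) = 0$):
\[
A_\mu(K) = a_{(\lambda)}(K)\, \epsilon^{(\lambda)}_\mu(K)\,, \tag{4.799}
\]
with
\[
K^\mu A_\mu(K) = a_{(\lambda)}(K)\, K^\mu \epsilon^{(\lambda)}_\mu(K) = 0\,, \tag{4.800}
\]
which in coordinate space can be written as
\[
\partial^\mu A_\mu(X) = 0\,, \tag{4.801}
\]
i.e. the Lorentz gauge.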
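The counting above can be illustrated with a small numerical sketch of Eq. (4.787) in the frame where $K^\mu = (k, 0, 0, k)$. The coefficient values `a1`, `a2`, `b`, `c` and the helper names are assumptions made only for this example; the metric signature is taken to be $(+,-,-,-)$.

```python
ETA = [1.0, -1.0, -1.0, -1.0]  # diagonal of the metric, signature (+,-,-,-)

def dot(a, b):
    """Minkowski product of two vectors given by contravariant components."""
    return sum(ETA[m] * a[m] * b[m] for m in range(4))

def residual(K, A):
    """Left-hand side of Eq. (4.787): -K^2 A^mu + K^mu (K . A)."""
    k2 = dot(K, K)
    ka = dot(K, A)
    return [-k2 * A[m] + K[m] * ka for m in range(4)]

def combine(terms):
    """Linear combination of four-vectors from (coefficient, vector) pairs."""
    return [sum(c * v[m] for c, v in terms) for m in range(4)]

k = 2.0
K = [k, 0.0, 0.0, k]          # light-like momentum, K^2 = 0
K_tilde = [k, 0.0, 0.0, -k]   # spatially reflected momentum
eps1 = [0.0, 1.0, 0.0, 0.0]   # transverse polarizations, K . eps = 0
eps2 = [0.0, 0.0, 1.0, 0.0]

a1, a2, b, c = 0.4, -1.3, 2.2, 0.7   # illustrative coefficients

# Transverse part plus a pure-gauge piece along K: solves the equation for any b.
A_allowed = combine([(a1, eps1), (a2, eps2), (b, K)])
res_allowed = residual(K, A_allowed)

# Adding a component along K-tilde violates the equation unless c = 0.
A_forbidden = combine([(1.0, A_allowed), (c, K_tilde)])
res_forbidden = residual(K, A_forbidden)

print(res_allowed)
print(res_forbidden)   # equals c (K . K-tilde) K^mu
```

The first residual vanishes for every value of `b`, which is the statement that the component along $K_\mu$ is pure gauge; the second is proportional to $c\,(K^\nu\tilde K_\nu)$ and forces $c = 0$.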

Coulomb gauge
In Coulomb gauge the number of degrees of freedom is even more evident. In fact, in the vacuum we can always choose a gauge such that
\[
A_0 = \nabla\cdot\mathbf{A} = 0\,, \tag{4.802}
\]
which are two constraints on the four components of $A_\mu$ (therefore two degrees of freedom are left).
Let us show that this is possible. Let us make a gauge transformation as in Eq. (4.793) with
\[
\chi(X) = -\int_0^t A_0(\mathbf{x}, t')\, dt'\,, \tag{4.803}
\]
in such a way that
\[
A_\mu \to A'_\mu = A_\mu - \partial_\mu \int_0^t A_0(\mathbf{x}, t')\, dt'\,. \tag{4.804}
\]
Clearly we have
\[
A'_0 = A_0 - \partial_0 \int_0^t A_0(\mathbf{x}, t')\, dt' = 0\,. \tag{4.805}
\]
Let us perform now an additional gauge transformation

A′µ → A′′µ = A′µ + ∂µ χ̃(X) , (4.806)

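such that $\nabla\cdot\mathbf{A}'' = 0$. To this end we choose $\tilde\chi(X)$ such that
\[
\nabla\cdot\mathbf{A}'' = \nabla\cdot\mathbf{A}' - \nabla^2\tilde\chi(X) = 0\,, \tag{4.807}
\]
or
\[
\nabla^2\tilde\chi(X) = \nabla\cdot\mathbf{A}'\,. \tag{4.808}
\]
A solution of this equation is (see footnote 13)
\[
\tilde\chi(X) = -\frac{1}{4\pi} \int d^3X'\, \frac{\nabla'\cdot\mathbf{A}'(\mathbf{x}', t)}{|\mathbf{x} - \mathbf{x}'|}\,, \tag{4.811}
\]
for which we have
\[
\partial_0\tilde\chi(X) = -\frac{1}{4\pi} \int d^3X'\, \frac{\nabla'\cdot\dot{\mathbf{A}}'(\mathbf{x}', t)}{|\mathbf{x} - \mathbf{x}'|} = 0\,, \tag{4.812}
\]
since, by the Gauss equation and $A'^0 = 0$,
\[
0 = \nabla\cdot\mathbf{E} = \nabla\cdot\left( -\nabla A'^0 - \partial^0\mathbf{A}' \right) = -\nabla\cdot\dot{\mathbf{A}}'(X)\,. \tag{4.813}
\]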

In the end, in the gauge in which we defined A′′µ we have

A′′0 = A′0 + ∂0 χ̃ = 0 + 0 = 0 , and ∇ · A′′ = 0 , (4.814)

as we wanted to show.
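13 In fact we have
\[
\nabla^2 \frac{1}{|\mathbf{x}|} = -4\pi\,\delta^3(\mathbf{x})\,, \tag{4.809}
\]
and therefore
\[
\nabla^2\tilde\chi(X) = -\frac{1}{4\pi} \int d^3X'\, \nabla'\cdot\mathbf{A}'(\mathbf{x}', t)\, \nabla^2 \frac{1}{|\mathbf{x} - \mathbf{x}'|} = \frac{1}{4\pi} \int d^3X'\, \nabla'\cdot\mathbf{A}'(\mathbf{x}', t)\, 4\pi\,\delta^3(\mathbf{x} - \mathbf{x}') = \nabla\cdot\mathbf{A}'(X)\,. \tag{4.810}
\]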

4.6 Quantization of the Electromagnetic Field
We are now ready to consider the quantization of the electromagnetic field. We would like to maintain the manifest covariance of the theory and therefore we look for non-trivial commutation relations among all the components of $A_\mu$ and the conjugate momentum
\[
\Pi^\mu = \frac{\partial\mathcal{L}}{\partial\dot A_\mu}\,. \tag{4.815}
\]

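We would like to impose the following equal-time commutation relations
\begin{align}
[A_\mu(\mathbf{x},t), \Pi_\nu(\mathbf{y},t)] &= i\eta_{\mu\nu}\, \delta^3(\mathbf{x} - \mathbf{y})\,, \notag\\
[A_\mu(\mathbf{x},t), A_\nu(\mathbf{y},t)] &= [\Pi_\mu(\mathbf{x},t), \Pi_\nu(\mathbf{y},t)] = 0\,. \tag{4.816}
\end{align}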

In order to evaluate the conjugate momentum, we refer to Eq. (4.761). We find
\[
\Pi^\mu = \frac{\partial\mathcal{L}}{\partial\dot A_\mu} = -\partial^0 A^\mu + \partial^\mu A^0 = F^{\mu 0}\,. \tag{4.817}
\]

Since $F^{\mu\nu}$ is antisymmetric, we have $\Pi^0 = 0$ at the operator level, and we are not able to impose the commutation relation
\[
[A_0(\mathbf{x},t), \Pi_0(\mathbf{y},t)] = i\,\delta^3(\mathbf{x} - \mathbf{y})\,. \tag{4.818}
\]
This is a problem that emerges from our requirement to maintain a manifestly covariant form of the quantization, while we already know that the time degree of freedom is not physical. A possible solution is to give up manifest covariance and to quantize only the two transverse degrees of freedom. This could be done, for instance, using a physical gauge, like the Coulomb gauge, in which we restrict from the beginning to the two transverse degrees of freedom. However, in such an approach we lose covariance, which is quite important in computations. We therefore choose to quantize the electromagnetic field preserving covariance and renouncing explicit gauge invariance (although we will recover gauge invariance by checking that computations in two different gauges give the same result). This approach was introduced by Gupta and Bleuler.
The idea is to give up gauge invariance in order to cure the relation $\Pi^0 = 0$, in such a way that it does not hold at the operator level, but only when we evaluate the operator on a physical state.
Let us choose a Lagrangian density that gives the correct equations of motion (Maxwell's equations) but only in Lorentz gauge:
\[
\partial^2 A^\mu = 0\,. \tag{4.819}
\]
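These equations come from the Lagrangian density
\[
\mathcal{L} = -\frac{1}{2}\partial_\mu A_\nu\, \partial^\mu A^\nu\,, \tag{4.820}
\]
as can be easily checked. The difference between the Lagrangian density in Eq. (4.820) and the one given in Eq. (4.761) is
\begin{align}
\mathcal{L}_{GF} &= \frac{1}{4} F_{\mu\nu}F^{\mu\nu} - \frac{1}{2}\partial_\mu A_\nu\, \partial^\mu A^\nu\,, \tag{4.821}\\
&= \frac{1}{2}\partial_\mu A_\nu\, \partial^\mu A^\nu - \frac{1}{2}\partial_\mu A_\nu\, \partial^\nu A^\mu - \frac{1}{2}\partial_\mu A_\nu\, \partial^\mu A^\nu\,, \tag{4.822}\\
&= -\frac{1}{2}\partial_\mu A_\nu\, \partial^\nu A^\mu\,, \tag{4.823}\\
&= -\partial^\nu\left( \frac{1}{2}(\partial_\mu A_\nu) A^\mu \right) + \frac{1}{2}(\partial^\nu\partial_\mu A_\nu) A^\mu\,, \tag{4.824}\\
&= -\partial^\nu\left( \frac{1}{2}(\partial_\mu A_\nu) A^\mu \right) + \partial_\mu\left( \frac{1}{2}(\partial^\nu A_\nu) A^\mu \right) - \frac{1}{2}(\partial^\nu A_\nu)^2\,, \tag{4.825}\\
&= |\ \text{up to total derivatives that do not affect the equations of motion}\ | \notag\\
&= -\frac{1}{2}(\partial^\nu A_\nu)^2\,. \tag{4.826}
\end{align}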
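Therefore, we quantize the Lagrangian density
\[
\mathcal{L} = \mathcal{L}_{EM} + \mathcal{L}_{GF} = -\frac{1}{4} F_{\mu\nu}F^{\mu\nu} - \frac{1}{2}(\partial^\nu A_\nu)^2\,. \tag{4.827}
\]
The Lagrangian $\mathcal{L}_{GF}$ is called the “gauge fixing” Lagrangian.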
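If we look for the Euler-Lagrange equations for the Lagrangian density in Eq. (4.827) we find
\[
\frac{\partial\mathcal{L}}{\partial A_\mu} = 0\,, \qquad \frac{\partial\mathcal{L}}{\partial A_{\mu,\nu}} = -A^{\mu,\nu} + A^{\nu,\mu} - \eta^{\mu\nu}(\partial^\lambda A_\lambda) \tag{4.828}
\]
and therefore
\[
0 = -\partial_\nu \frac{\partial\mathcal{L}}{\partial A_{\mu,\nu}} = \partial^2 A^\mu - \partial^\mu(\partial^\nu A_\nu) + \partial^\mu(\partial^\lambda A_\lambda) = \partial^2 A^\mu\,, \tag{4.829}
\]
which are Maxwell's equations in the Lorentz gauge (see footnote 14).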
Using the lagrangian density (4.827) we can recompute the momentum conjugated to Aµ finding

Πµ = F µ0 − η µ0 (∂ ν Aν ) . (4.832)

Now the temporal component is not anymore identically equal to zero. We have

Π0 = −(∂ ν Aν ) . (4.833)

It is clear that the Lorentz gauge gives $\partial^\nu A_\nu = 0$; however, we are now speaking about operators. We can require that in general
\[
\partial^\nu A_\nu \neq 0\,, \tag{4.834}
\]
but that it is zero when evaluated between two physical states
\[
\langle \mathrm{phys} |\, \partial^\nu A_\nu\, | \mathrm{phys} \rangle = 0\,. \tag{4.835}
\]
The condition (4.835) defines the physical states. Imposing (4.834) at the operator level, with (4.835) on the physical states, means that we have enlarged the Fock space. It now contains physical states and non-physical states on which, in general, $\langle \varphi |\, \partial^\nu A_\nu\, | \varphi \rangle \neq 0$. The enlargement of the Fock space is the price to pay for the covariant quantization. The states corresponding to temporal and longitudinal photons will be non-physical, while the transverse polarization states will be the physical ones.
We will comment more closely on Eq. (4.835) in a while.
Since now, at the operator level, we have $\Pi^0 \neq 0$, we can proceed to impose the quantization relations (4.816), which can be simplified as follows. We have
\begin{align}
\Pi^0 &= -\partial^0 A_0 - \partial_i A^i\,, \tag{4.836}\\
\Pi_i &= -\partial_0 A_i + \partial_i A_0\,. \tag{4.837}
\end{align}

14 We can in general use the Lagrangian
\[
\mathcal{L} = -\frac{1}{4} F_{\mu\nu}F^{\mu\nu} - \frac{\lambda}{2}(\partial^\nu A_\nu)^2\,, \tag{4.830}
\]
with $\lambda$ a constant (actually a Lagrange multiplier). The equations of motion would then be
\[
\partial^2 A^\mu - (1 - \lambda)\, \partial^\mu(\partial^\nu A_\nu) = 0\,, \tag{4.831}
\]
which give $\partial^2 A^\mu = 0$ when $\lambda = 1$. The case $\lambda = 1$ is called “Lorentz-Feynman gauge”.

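Therefore, for the temporal component
\[
i\,\delta^3(\mathbf{x} - \mathbf{y}) = [A_0(\mathbf{x},t), \Pi_0(\mathbf{y},t)] = [A_0(\mathbf{x},t), -\partial^0 A_0(\mathbf{y},t) - \partial_i A^i(\mathbf{y},t)] = [A_0(\mathbf{x},t), -\dot A_0(\mathbf{y},t)]\,, \tag{4.838}
\]
since
\begin{align}
[A_0(\mathbf{x},t), -\partial_i A^i(\mathbf{y},t)] &= -A_0(\mathbf{x},t) \left( \frac{\partial}{\partial y^i} A^i(\mathbf{y},t) \right) + \left( \frac{\partial}{\partial y^i} A^i(\mathbf{y},t) \right) A_0(\mathbf{x},t)\,, \tag{4.839}\\
&= -\frac{\partial}{\partial y^i} [A_0(\mathbf{x},t), A^i(\mathbf{y},t)] = 0\,. \tag{4.840}
\end{align}
For the spatial part we have (no sum over $i$)
\[
-i\,\delta^3(\mathbf{x} - \mathbf{y}) = [A_i(\mathbf{x},t), \Pi_i(\mathbf{y},t)] = [A_i(\mathbf{x},t), -\partial_0 A_i(\mathbf{y},t) + \partial_i A_0(\mathbf{y},t)] = [A_i(\mathbf{x},t), -\dot A_i(\mathbf{y},t)]\,, \tag{4.841}
\]
since, again, we have
\[
[A_i(\mathbf{x},t), \partial_i A_0(\mathbf{y},t)] = 0\,. \tag{4.842}
\]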
Finally we have

[Aµ (x, t), Ȧν (y, t)] = −iηµν δ3 (x − y) ,


[Aµ (x, t), Aν (y, t)] = [Ȧµ (x, t), Ȧν (y, t)] = 0 . (4.843)

Plane wave solutions


In order to get the quanta (photons) we need to express the field in normal modes (plane wave solutions). We have to express $A^\mu$ in a basis of the Minkowski space. We cannot move to the rest frame, since photons travel at the speed of light. However, in the frame in which the momentum is $P^\mu = (p, 0, 0, p)$, we choose the following 4 vectors:

1. The unit time-like vector (that defines the time axis)
\[
n^\mu(p) = (1, 0, 0, 0) = \epsilon^{(0)\,\mu}(p) \tag{4.844}
\]
such that
\[
\epsilon^{(0)\,\mu}(p)\, \epsilon^{(0)}_\mu(p) = 1\,. \tag{4.845}
\]

2. The two transverse space-like vectors
\[
\epsilon^{(\lambda)\,\mu}(p)\,, \quad \lambda = 1, 2\,, \tag{4.846}
\]
such that
\begin{align}
\epsilon^{(\lambda)}_\mu(p)\, \epsilon^{(0)\,\mu}(p) &= \epsilon^{(\lambda)}_\mu(p)\, P^\mu = 0\,, \tag{4.847}\\
\epsilon^{(\lambda)}_\mu(p)\, \epsilon^{(\lambda')\,\mu}(p) &= -\delta^{\lambda\lambda'}\,. \tag{4.848}
\end{align}
We have
\[
\epsilon^{(1)\,\mu}(p) = (0, 1, 0, 0)\,, \qquad \epsilon^{(2)\,\mu}(p) = (0, 0, 1, 0)\,. \tag{4.849}
\]

3. A fourth space-like vector
\[
\epsilon^{(3)\,\mu}(p)\,, \tag{4.850}
\]
such that
\begin{align}
\epsilon^{(3)}_\mu(p)\, \epsilon^{(0)\,\mu}(p) &= \epsilon^{(3)}_\mu(p)\, \epsilon^{(\lambda)\,\mu}(p) = 0\,, \tag{4.851}\\
\epsilon^{(3)}_\mu(p)\, \epsilon^{(3)\,\mu}(p) &= -1\,. \tag{4.852}
\end{align}
For instance we can choose
\[
\epsilon^{(3)\,\mu}(p) = \frac{P^\mu - (\epsilon^{(0)}_\nu(p) P^\nu)\, \epsilon^{(0)\,\mu}(p)}{(\epsilon^{(0)}_\nu(p) P^\nu)} = (0, 0, 0, 1)\,. \tag{4.853}
\]

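These 4 vectors are orthonormal in the Minkowski space and satisfy completeness relations:
\begin{align}
\epsilon^{(\lambda)}_\mu(p)\, \epsilon^{(\lambda')\,\mu}(p) &= \eta^{\lambda\lambda'}\,, \quad \lambda, \lambda' = 0, 1, 2, 3\,, \tag{4.854}\\
\epsilon^{(\lambda)}_\mu(p)\, \epsilon^{(\lambda')}_\nu(p)\, \eta_{\lambda\lambda'} &= \eta_{\mu\nu}\,. \tag{4.855}
\end{align}
We can prove the relation (4.855) in the frame in which $P^\mu = (p, 0, 0, p)$, noting that there $\epsilon^{(\lambda)\,\mu} = \delta^{\mu}_{\lambda}$; since it is a covariant equation, it holds unchanged in form in any other frame.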
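The relations (4.847)-(4.855) can be verified explicitly in the frame $P^\mu = (p, 0, 0, p)$. The following is a minimal sketch, assuming the metric signature $(+,-,-,-)$; the value of `p` is arbitrary and the helper names are introduced only for this example.

```python
ETA = [1.0, -1.0, -1.0, -1.0]  # diagonal of the metric, signature (+,-,-,-)

def dot(a, b):
    """Minkowski product of two vectors given by contravariant components."""
    return sum(ETA[m] * a[m] * b[m] for m in range(4))

def lowered(v):
    """Covariant components of a vector given by contravariant components."""
    return [ETA[m] * v[m] for m in range(4)]

p = 2.5
P = [p, 0.0, 0.0, p]

eps0 = [1.0, 0.0, 0.0, 0.0]
eps1 = [0.0, 1.0, 0.0, 0.0]
eps2 = [0.0, 0.0, 1.0, 0.0]
# Fourth vector built as in Eq. (4.853).
norm = dot(eps0, P)
eps3 = [(P[m] - norm * eps0[m]) / norm for m in range(4)]
EPS = [eps0, eps1, eps2, eps3]

# Orthonormality, Eq. (4.854): gram[l][l'] should equal eta^{l l'}.
gram = [[dot(EPS[a], EPS[b]) for b in range(4)] for a in range(4)]

# Completeness, Eq. (4.855): sum over l of eta_{ll} eps_mu eps_nu = eta_{mu nu}.
completeness = [[sum(ETA[l] * lowered(EPS[l])[m] * lowered(EPS[l])[n]
                     for l in range(4))
                 for n in range(4)] for m in range(4)]

# Contractions with the momentum, used later for the Gupta-Bleuler condition.
p_dot_eps = [dot(P, e) for e in EPS]
print(eps3, p_dot_eps)
```

The last list shows that the transverse vectors are orthogonal to $P^\mu$, while the temporal and longitudinal ones give opposite contractions.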
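The expansion of $A_\mu$ in plane waves is therefore (see footnote 15)
\[
A_\mu(X) = \int \frac{d^3p}{\sqrt{(2\pi)^3\, 2E}} \sum_{\lambda=0}^{3} \epsilon^{(\lambda)}_\mu(p) \left[ a_{(\lambda)}(p)\, e^{-iP_\mu X^\mu} + a^\dagger_{(\lambda)}(p)\, e^{iP_\mu X^\mu} \right], \tag{4.857}
\]
where we used the fact that the field $A_\mu$ is real and where the expression is already normalized, because for every $\mu = 0, 1, 2, 3$ we find a Klein-Gordon field
\begin{align}
A_0(X) &= \int \frac{d^3p}{\sqrt{(2\pi)^3\, 2E}} \left[ a_0(p)\, e^{-iP_\mu X^\mu} + a^\dagger_0(p)\, e^{iP_\mu X^\mu} \right], \tag{4.858}\\
A_1(X) &= \int \frac{d^3p}{\sqrt{(2\pi)^3\, 2E}} \left[ a_1(p)\, e^{-iP_\mu X^\mu} + a^\dagger_1(p)\, e^{iP_\mu X^\mu} \right], \tag{4.859}\\
&\ \ \vdots \notag
\end{align}
Remembering the form of $f^{(+)}_p(X)$ and $f^{(+)*}_p(X)$ we can write
\[
A_\mu(X) = \int d^3p \sum_{\lambda=0}^{3} \epsilon^{(\lambda)}_\mu(p) \left[ a_{(\lambda)}(p)\, f^{(+)}_p(X) + a^\dagger_{(\lambda)}(p)\, f^{(+)*}_p(X) \right]. \tag{4.860}
\]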

We would like to check that, imposing the quantization relations on the fields, the operators $a_{(\lambda)}(p)$ and $a^\dagger_{(\lambda)}(p)$ are actually annihilation/creation operators (they obey the correct commutation relations). We can project out the operators $a_{(\lambda)}(p)$ and $a^\dagger_{(\lambda)}(p)$ in terms of the fields, then use the quantization relations for the fields and check that they induce the correct commutation relations for the annihilation/creation operators. We have
\[
\sum_{\lambda=0}^{3} \epsilon^{(\lambda)}_\mu(p)\, a_{(\lambda)}(p) = i \int d^3X\, f^{(+)*}_p(X) \overleftrightarrow{\partial_0} A_\mu(X)\,. \tag{4.861}
\]

15 If we consider circular polarization we have to introduce two complex vectors for the transverse states. Therefore, we have
\[
A_\mu(X) = \int \frac{d^3p}{\sqrt{(2\pi)^3\, 2E}} \sum_{\lambda=0}^{3} \left[ \epsilon^{(\lambda)}_\mu(p)\, a_{(\lambda)}(p)\, e^{-iP_\mu X^\mu} + \epsilon^{(\lambda)*}_\mu(p)\, a^\dagger_{(\lambda)}(p)\, e^{iP_\mu X^\mu} \right]. \tag{4.856}
\]

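If now we multiply on the left by $\epsilon^{(\lambda')\,\mu}(p)$ we find
\[
\epsilon^{(\lambda')\,\mu}(p) \sum_{\lambda=0}^{3} \epsilon^{(\lambda)}_\mu(p)\, a_{(\lambda)}(p) = \eta^{\lambda'\lambda}\, a_{(\lambda)}(p) = i \int d^3X\, \epsilon^{(\lambda')\,\mu} f^{(+)*}_p(X) \overleftrightarrow{\partial_0} A_\mu(X)\,. \tag{4.862}
\]
Then, multiplying on the left by $\eta_{\sigma\lambda'}$ we find
\begin{align}
a_{(\sigma)}(p) &= i\eta_{\sigma\lambda'} \int d^3X\, \epsilon^{(\lambda')\,\mu} f^{(+)*}_p(X) \overleftrightarrow{\partial_0} A_\mu(X)\,, \tag{4.863}\\
&= i\eta_{\sigma\lambda'} \int d^3X\, \epsilon^{(\lambda')\,\mu} \left[ f^{(+)*}_p(X)\, \dot A_\mu(X) - (\partial_0 f^{(+)*}_p(X))\, A_\mu(X) \right]. \tag{4.864}
\end{align}
Analogously we find
\begin{align}
a^\dagger_{(\sigma)}(p) &= -i\eta_{\sigma\lambda'} \int d^3X\, \epsilon^{(\lambda')\,\mu} f^{(+)}_p(X) \overleftrightarrow{\partial_0} A_\mu(X) = i\eta_{\sigma\lambda'} \int d^3X\, \epsilon^{(\lambda')\,\mu} A_\mu(X) \overleftrightarrow{\partial_0} f^{(+)}_p(X)\,, \tag{4.865}\\
&= i\eta_{\sigma\lambda'} \int d^3X\, \epsilon^{(\lambda')\,\mu} \left[ (\partial_0 f^{(+)}_p(X))\, A_\mu(X) - f^{(+)}_p(X)\, \dot A_\mu(X) \right]. \tag{4.866}
\end{align}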

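With these expressions we find, for instance (writing $f_p \equiv f^{(+)}_p$ and taking all field products at $X^0 = Y^0$)
\begin{align}
[a_{(\lambda)}(p), a^\dagger_{(\lambda')}(p')] &= a_{(\lambda)}(p)\, a^\dagger_{(\lambda')}(p') - a^\dagger_{(\lambda')}(p')\, a_{(\lambda)}(p)\,, \tag{4.867}\\
&= -\eta_{\lambda\delta}\eta_{\lambda'\delta'} \int d^3X\, d^3Y\, \epsilon^{(\delta)\,\mu}(p)\, \epsilon^{(\delta')\,\nu}(p') \Big\{ \notag\\
&\qquad \left[ f^*_p(X)\, \dot A_\mu(X) - (\partial_0 f^*_p(X))\, A_\mu(X) \right] \times \left[ (\partial_0 f_{p'}(Y))\, A_\nu(Y) - f_{p'}(Y)\, \dot A_\nu(Y) \right] \notag\\
&\qquad - \left[ (\partial_0 f_{p'}(Y))\, A_\nu(Y) - f_{p'}(Y)\, \dot A_\nu(Y) \right] \times \left[ f^*_p(X)\, \dot A_\mu(X) - (\partial_0 f^*_p(X))\, A_\mu(X) \right] \Big\}_{X^0=Y^0}, \notag\\
&= -\eta_{\lambda\delta}\eta_{\lambda'\delta'} \int d^3X\, d^3Y\, \epsilon^{(\delta)\,\mu}(p)\, \epsilon^{(\delta')\,\nu}(p') \Big\{ \notag\\
&\qquad f^*_p(X)(\partial_0 f_{p'}(Y))\, \dot A_\mu(X) A_\nu(Y) - f^*_p(X) f_{p'}(Y)\, \dot A_\mu(X) \dot A_\nu(Y) \notag\\
&\qquad - (\partial_0 f^*_p(X))(\partial_0 f_{p'}(Y))\, A_\mu(X) A_\nu(Y) + (\partial_0 f^*_p(X)) f_{p'}(Y)\, A_\mu(X) \dot A_\nu(Y) \notag\\
&\qquad - (\partial_0 f_{p'}(Y)) f^*_p(X)\, A_\nu(Y) \dot A_\mu(X) + (\partial_0 f_{p'}(Y))(\partial_0 f^*_p(X))\, A_\nu(Y) A_\mu(X) \notag\\
&\qquad + f_{p'}(Y) f^*_p(X)\, \dot A_\nu(Y) \dot A_\mu(X) - f_{p'}(Y)(\partial_0 f^*_p(X))\, \dot A_\nu(Y) A_\mu(X) \Big\}_{X^0=Y^0}, \notag\\
&= -\eta_{\lambda\delta}\eta_{\lambda'\delta'} \int d^3X\, d^3Y\, \epsilon^{(\delta)\,\mu}(p)\, \epsilon^{(\delta')\,\nu}(p') \Big\{ \notag\\
&\qquad f^*_p(X)(\partial_0 f_{p'}(Y))\, [\dot A_\mu(X), A_\nu(Y)]_{X^0=Y^0} - f^*_p(X) f_{p'}(Y)\, [\dot A_\mu(X), \dot A_\nu(Y)]_{X^0=Y^0} \notag\\
&\qquad - (\partial_0 f^*_p(X))(\partial_0 f_{p'}(Y))\, [A_\mu(X), A_\nu(Y)]_{X^0=Y^0} + (\partial_0 f^*_p(X)) f_{p'}(Y)\, [A_\mu(X), \dot A_\nu(Y)]_{X^0=Y^0} \Big\}, \notag\\
&= -\eta_{\lambda\delta}\eta_{\lambda'\delta'} \int d^3X\, d^3Y\, \epsilon^{(\delta)\,\mu}(p)\, \epsilon^{(\delta')\,\nu}(p') \Big\{ f^*_p(X)(\partial_0 f_{p'}(Y))\, i\eta_{\mu\nu}\, \delta^3(\mathbf{x} - \mathbf{y}) \notag\\
&\qquad + (\partial_0 f^*_p(X)) f_{p'}(Y)\, (-i\eta_{\mu\nu})\, \delta^3(\mathbf{x} - \mathbf{y}) \Big\}, \notag\\
&= -\eta_{\lambda\delta}\eta_{\lambda'\delta'}\, \epsilon^{(\delta)\,\mu}(p)\, \epsilon^{(\delta')}_\mu(p')\; i \int d^3X\, f^*_p(X) \overleftrightarrow{\partial_0} f_{p'}(X)\,, \notag\\
&= -\eta_{\lambda\delta}\eta_{\lambda'\delta'}\, \eta^{\delta\delta'}\, \delta^3(\mathbf{p} - \mathbf{p}') \notag\\
&= -\eta_{\lambda\lambda'}\, \delta^3(\mathbf{p} - \mathbf{p}') \tag{4.868}
\end{align}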

and, in the same way we find

[a(λ) (p), a(λ′ ) (p′ )] = [a†(λ) (p), a†(λ′ ) (p′ )] = 0 . (4.869)

Finally, in summary:

[a(λ) (p), a†(λ′ ) (p′ )] = −ηλλ′ δ3 (p − p′ ) , (4.870)


[a(λ) (p), a(λ′ ) (p′ )] = [a†(λ) (p), a†(λ′ ) (p′ )] = 0 . (4.871)

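Note the “wrong” sign in Eq. (4.870) for the component $\lambda = \lambda' = 0$! This has an important consequence. In fact, if we define a one-particle state as
\[
|1, \lambda\rangle = \int d^3p\, f(p)\, a^\dagger_\lambda(p) |0\rangle\,, \tag{4.872}
\]
its norm comes out to be negative in the case $\lambda = 0$. In fact
\begin{align}
\langle 1, \lambda | 1, \lambda' \rangle &= \int d^3p\, d^3p'\, f^*(p) f(p')\, \langle 0 | a_\lambda(p)\, a^\dagger_{\lambda'}(p') | 0 \rangle\,, \tag{4.873}\\
&= \int d^3p\, d^3p'\, f^*(p) f(p')\, \langle 0 | [a_\lambda(p), a^\dagger_{\lambda'}(p')] | 0 \rangle\,, \tag{4.874}\\
&= -\eta_{\lambda\lambda'} \int d^3p\, |f(p)|^2\,. \tag{4.875}
\end{align}
For $\lambda = \lambda' = 0$ we find a state with negative norm and therefore it is not physical.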

Physical states
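Let us consider again the condition (4.835). We want to impose a linear condition on the operators acting on a physical state, such that (4.835) is fulfilled. The field $A_\mu(X)$ can be split into a positive-frequency and a negative-frequency part
\begin{align}
A_\mu(X) = A^{(+)}_\mu(X) + A^{(-)}_\mu(X) &= \int \frac{d^3p}{\sqrt{(2\pi)^3\, 2E}} \sum_{\lambda=0}^{3} \epsilon^{(\lambda)}_\mu(p)\, a_{(\lambda)}(p)\, e^{-iP_\mu X^\mu} \notag\\
&\quad + \int \frac{d^3p}{\sqrt{(2\pi)^3\, 2E}} \sum_{\lambda=0}^{3} \epsilon^{(\lambda)}_\mu(p)\, a^\dagger_{(\lambda)}(p)\, e^{iP_\mu X^\mu}\,. \tag{4.876}
\end{align}
If we impose that a physical state satisfies the following condition
\[
\partial^\mu A^{(+)}_\mu | \mathrm{phys} \rangle = 0\,, \tag{4.877}
\]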
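since $A^{(-)}_\mu = (A^{(+)}_\mu)^\dagger$ we have that (4.835) is automatically satisfied. In fact
\[
0 = \langle \mathrm{phys} | (\partial^\mu A^{(+)}_\mu)^\dagger = \langle \mathrm{phys} |\, \partial^\mu A^{(-)}_\mu \tag{4.878}
\]
and therefore
\[
0 = \langle \mathrm{phys} |\, \partial^\mu A^{(-)}_\mu + \partial^\mu A^{(+)}_\mu\, | \mathrm{phys} \rangle = \langle \mathrm{phys} | (\partial^\mu A_\mu) | \mathrm{phys} \rangle\,. \tag{4.879}
\]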
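Eq. (4.877) is the Gupta-Bleuler condition. Let us see what $\partial^\mu A^{(+)}_\mu$ is in terms of creation/annihilation operators. We have
\begin{align}
\partial^\mu A^{(+)}_\mu &= -i \int \frac{d^3p}{\sqrt{(2\pi)^3\, 2E}}\, e^{-iP_\mu X^\mu} \sum_{\lambda=0}^{3} P^\mu \epsilon^{(\lambda)}_\mu(p)\, a_{(\lambda)}(p)\,, \tag{4.880}\\
&= |\ \text{since } P^\mu \epsilon^{(1)}_\mu(p) = P^\mu \epsilon^{(2)}_\mu(p) = 0 \text{ and } P^\mu \epsilon^{(3)}_\mu(p) = -P^\mu \epsilon^{(0)}_\mu(p)\ | \notag\\
&= -i \int \frac{d^3p}{\sqrt{(2\pi)^3\, 2E}}\, e^{-iP_\mu X^\mu}\, (P^\mu \epsilon^{(0)}_\mu(p)) \left[ a_{(0)}(p) - a_{(3)}(p) \right]. \tag{4.881}
\end{align}
Therefore, $\partial^\mu A^{(+)}_\mu | \mathrm{phys} \rangle = 0$ implies the condition
\[
\left[ a_{(0)}(p) - a_{(3)}(p) \right] | \mathrm{phys} \rangle = 0\,. \tag{4.882}
\]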

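This means that the states $| \mathrm{phys} \rangle$ are constructed as follows:
\[
| \mathrm{phys} \rangle = | n^{(0)}, n^{(3)} \rangle = \frac{\left[ a^\dagger_{(0)}(p) - a^\dagger_{(3)}(p) \right]^n}{n!}\, | 0 \rangle\,. \tag{4.883}
\]
In fact we have
\[
[a_{(0)}(p') - a_{(3)}(p'),\ a^\dagger_{(0)}(p) - a^\dagger_{(3)}(p)] = -\delta^3(\mathbf{p} - \mathbf{p}') + \delta^3(\mathbf{p} - \mathbf{p}') = 0 \tag{4.884}
\]
and then
\begin{align}
\left[ a_{(0)}(p) - a_{(3)}(p) \right] | n^{(0)}, n^{(3)} \rangle &= \left[ a_{(0)}(p) - a_{(3)}(p) \right] \frac{\left[ a^\dagger_{(0)}(p) - a^\dagger_{(3)}(p) \right]^n}{n!}\, | 0 \rangle\,, \tag{4.885}\\
&= \frac{\left[ a^\dagger_{(0)}(p) - a^\dagger_{(3)}(p) \right]^n}{n!} \left[ a_{(0)}(p) - a_{(3)}(p) \right] | 0 \rangle = 0\,. \tag{4.886}
\end{align}
$| n^{(0)}, n^{(3)} \rangle$ is the state with $n$ temporal photons and $n$ longitudinal photons. Note that a state $| n^{(0)}, n^{(3)} \rangle$ is the vacuum state for transverse photons
\[
a_{(1)}(p) | n^{(0)}, n^{(3)} \rangle = a_{(2)}(p) | n^{(0)}, n^{(3)} \rangle = 0\,. \tag{4.887}
\]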

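The requirement for a physical state is that it contains the same number of temporal and longitudinal photons, but there is no constraint on the number of transverse photons. We then have the following combination
\[
| \mathrm{phys} \rangle = | \psi_T \rangle + \delta\, | \varphi \rangle\,, \tag{4.888}
\]
where
\[
| \psi_T \rangle = \alpha\, a^\dagger_{(1)}(p_1) | 0 \rangle + \beta\, a^\dagger_{(2)}(p_2) | 0 \rangle\,, \qquad | \varphi \rangle = | n^{(0)}, n^{(3)} \rangle\,. \tag{4.889}
\]
Other states with $a^\dagger_{(0)}(p) | 0 \rangle$ and $a^\dagger_{(3)}(p) | 0 \rangle$ not in the combination $| n^{(0)}, n^{(3)} \rangle$ are not physical.
The state vector $| \varphi \rangle$ is quite peculiar. It has zero norm. In fact
\begin{align}
\langle \varphi | \varphi \rangle &= \langle 0 | (a_{(0)}(p) - a_{(3)}(p))(a^\dagger_{(0)}(p) - a^\dagger_{(3)}(p)) | 0 \rangle\,, \tag{4.890}\\
&= \langle 0 | (a_{(0)}(p)\, a^\dagger_{(0)}(p) + a_{(3)}(p)\, a^\dagger_{(3)}(p)) | 0 \rangle\,, \tag{4.891}\\
&= |\ \text{since } [a_{(0)}(p), a^\dagger_{(3)}(p)] = [a_{(3)}(p), a^\dagger_{(0)}(p)] = 0\ | \notag\\
&= \langle 0 | ([a_{(0)}(p), a^\dagger_{(0)}(p)] + [a_{(3)}(p), a^\dagger_{(3)}(p)]) | 0 \rangle\,, \tag{4.892}\\
&= |\ \text{since } [a_{(0)}(p), a^\dagger_{(0)}(p)] = -[a_{(3)}(p), a^\dagger_{(3)}(p)]\ | \notag\\
&= 0\,. \tag{4.893}
\end{align}
Moreover, $| \varphi \rangle$ is orthogonal to $| \psi_T \rangle = \alpha\, a^\dagger_{(1)}(p_1) | 0 \rangle + \beta\, a^\dagger_{(2)}(p_2) | 0 \rangle$:
\[
\langle \psi_T | \varphi \rangle = \langle 0 | \left[ \alpha^* a_{(1)}(p_1) + \beta^* a_{(2)}(p_2) \right] \left[ a^\dagger_{(0)}(p) - a^\dagger_{(3)}(p) \right] | 0 \rangle = 0\,. \tag{4.894}
\]
This means that any scalar product between physical states is given only by the scalar product between the transverse states.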
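The structure of these norms can be made concrete with a toy model. The sketch below is an assumption-laden simplification: it keeps a single momentum mode, so that $\delta^3(\mathbf{p} - \mathbf{p}')$ is replaced by 1, and represents a one-photon state by its four coefficients in the basis $a^\dagger_{(\lambda)}|0\rangle$. The inner product is then fixed by $\langle 0 | a_{(\lambda)} a^\dagger_{(\lambda')} | 0 \rangle = -\eta_{\lambda\lambda'}$, as in Eq. (4.870).

```python
# Toy single-mode model (assumption): the delta function is replaced by 1, and a
# one-photon state is the list of its coefficients c_lambda, lambda = 0, 1, 2, 3.
MINUS_ETA = [-1.0, 1.0, 1.0, 1.0]  # diagonal of -eta_{lambda lambda'}

def inner(u, v):
    """Indefinite inner product: sum over lambda of conj(u) * (-eta) * v."""
    return sum(complex(u[l]).conjugate() * MINUS_ETA[l] * v[l] for l in range(4))

def add(u, v, scale=1.0):
    """Return u + scale * v."""
    return [u[l] + scale * v[l] for l in range(4)]

alpha, beta = 0.6 + 0.2j, -0.5j   # illustrative transverse amplitudes

temporal = [1.0, 0.0, 0.0, 0.0]   # a^dagger_(0) |0>
phi = [1.0, 0.0, 0.0, -1.0]       # (a^dagger_(0) - a^dagger_(3)) |0>
psi_T = [0.0, alpha, beta, 0.0]   # transverse state

norm_temporal = inner(temporal, temporal)   # negative
norm_phi = inner(phi, phi)                  # zero
overlap = inner(psi_T, phi)                 # zero
norm_T = inner(psi_T, psi_T)                # |alpha|^2 + |beta|^2

# Adding any multiple of phi to a transverse state does not change its norm.
shifted = add(psi_T, phi, scale=2.0 - 1.5j)
norm_shifted = inner(shifted, shifted)

print(norm_temporal, norm_phi, overlap, norm_T, norm_shifted)
```

In this toy model the negative-norm temporal state, the zero-norm combination and the invariance of the transverse norm under $|\psi_T\rangle \to |\psi_T\rangle + c\,|\varphi\rangle$ all follow from the single sign flip in the metric.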

Energy and momentum


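Let us look for the expression of the energy and momentum operators in terms of creation and annihilation operators. Let us note that we are considering the following Lagrangian density
\[
\mathcal{L} = -\frac{1}{2}\partial_\mu A_\nu\, \partial^\mu A^\nu = -\frac{1}{2}\partial_\mu A_0\, \partial^\mu A_0 + \frac{1}{2}\partial_\mu A_i\, \partial^\mu A_i\,, \tag{4.895}
\]
which is the sum of the three Lagrangian densities of the real fields $A_i$ minus the Lagrangian density of the real field $A_0$. We can therefore immediately understand that we have
\[
{:}H{:} = \int d^3X\, \mathcal{H} = \int d^3X \left[ \Pi^\mu \dot A_\mu - \mathcal{L} \right] = \int d^3p\, E \left[ -a^\dagger_{(0)}(p)\, a_{(0)}(p) + \sum_{\lambda=1}^{3} a^\dagger_{(\lambda)}(p)\, a_{(\lambda)}(p) \right]. \tag{4.896}
\]
The same expression holds for the momentum
\[
{:}P^i{:} = \int d^3p\, p^i \left[ -a^\dagger_{(0)}(p)\, a_{(0)}(p) + \sum_{\lambda=1}^{3} a^\dagger_{(\lambda)}(p)\, a_{(\lambda)}(p) \right]. \tag{4.897}
\]
If we now evaluate the energy or the momentum of a physical state, we see that they get contributions only from the transverse states. In fact, we have
\[
\left[ a_{(0)}(p) - a_{(3)}(p) \right] | \mathrm{phys} \rangle = 0 \tag{4.898}
\]
and
\[
\langle \mathrm{phys} | \left[ -a^\dagger_{(0)}(p)\, a_{(0)}(p) + a^\dagger_{(3)}(p)\, a_{(3)}(p) \right] | \mathrm{phys} \rangle = \langle \mathrm{phys} | \left[ -a^\dagger_{(0)}(p) + a^\dagger_{(3)}(p) \right] a_{(0)}(p) | \mathrm{phys} \rangle = 0\,. \tag{4.899}
\]
Therefore
\[
\langle \mathrm{phys} | {:}H{:} | \mathrm{phys} \rangle = \int d^3p\, E\, \langle \mathrm{phys} | \sum_{\lambda=1}^{2} a^\dagger_{(\lambda)}(p)\, a_{(\lambda)}(p) | \mathrm{phys} \rangle\,, \tag{4.900}
\]
and
\[
\langle \mathrm{phys} | {:}P^i{:} | \mathrm{phys} \rangle = \int d^3p\, p^i\, \langle \mathrm{phys} | \sum_{\lambda=1}^{2} a^\dagger_{(\lambda)}(p)\, a_{(\lambda)}(p) | \mathrm{phys} \rangle\,. \tag{4.901}
\]
We can conclude that the physical state is determined only by the transverse modes. $| \psi_T \rangle$ and $| \psi_T \rangle + c\, | \varphi \rangle$ are physically equivalent: they have the same energy, momentum, angular momentum, and so on, and are physically indistinguishable. They represent the photon.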
4.7 Propagator of the Klein-Gordon field
So far we have studied the equations of motion of the scalar field without sources. Let us now consider the case in which a source, $j(X)$, is present, which can be for instance a known function of the space-time point. The differential equation fulfilled by the field is
\[
(\partial^2 + m^2)\, \varphi(X) = j(X)\,, \tag{4.902}
\]
to be understood as a classical equation. The solution of Eq. (4.902) can be obtained by calculating the Green function, which is the solution of Eq. (4.902) in the presence of a point-like source (see footnote 16)
\[
(\partial^2 + m^2)\, G(X - X') = \delta^4(X - X')\,, \tag{4.903}
\]
such that
\[
\varphi(X) = \varphi^0(X) + \int d^4X'\, G(X - X')\, j(X')\,, \tag{4.904}
\]
where $\varphi^0(X)$ is a solution of the homogeneous equation, respecting the given boundary conditions. It is easy to verify that (4.904) satisfies (4.902):
\begin{align}
(\partial^2 + m^2) \left[ \varphi^0(X) + \int d^4X'\, G(X - X')\, j(X') \right]
&= (\partial^2 + m^2)\, \varphi^0(X) + \int d^4X'\, (\partial^2 + m^2)\, G(X - X')\, j(X')\,, \tag{4.905}\\
&= \int d^4X'\, \delta^4(X - X')\, j(X') = j(X)\,. \tag{4.906}
\end{align}
The problem now is to calculate the Green function. In order to do that, we Fourier transform:
\begin{align}
G(X - X') &= \int \frac{d^4P}{(2\pi)^4}\, e^{-iP_\mu (X - X')^\mu}\, \tilde G(P)\,, \tag{4.907}\\
\delta^4(X - X') &= \int \frac{d^4P}{(2\pi)^4}\, e^{-iP_\mu (X - X')^\mu}\,. \tag{4.908}
\end{align}
Substituting in Eq. (4.903) we find
\[
(-P^2 + m^2)\, \tilde G(P) = 1\,, \tag{4.909}
\]
and therefore
\[
\tilde G(P) = -\frac{1}{P^2 - m^2}\,. \tag{4.910}
\]
Finally
\[
G(X - X') = -\int \frac{d^4P}{(2\pi)^4}\, e^{-iP_\mu (X - X')^\mu}\, \frac{1}{P^2 - m^2}\,. \tag{4.911}
\]

The calculation of Eq. (4.911) has to be done in the complex plane, paying attention to the fact that the integrand has poles in the domain of integration. We can for instance integrate in $dP^0$. In this case we have two simple poles on the real axis at
\[
P^2 - m^2 = 0 \quad\Longrightarrow\quad P^0 = \pm\sqrt{\mathbf{p}^2 + m^2} = \pm\omega\,. \tag{4.912}
\]
Therefore, in order to perform the integration we have different choices, according to how we avoid the singularities when integrating in $dP^0$. The integration will be done in principal value, and the infinitesimal arc used to circumvent each pole can be chosen in the upper or in the lower complex half-plane. The difference will be a residue, i.e. a solution of the homogeneous equation. The choice of the path depends on the boundary conditions.

16 For translationally invariant systems the Green function is a function of $(X - X')$.
4.7.1 Closed paths and residues
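Let us consider the integration in $P^0$ on a closed path around one of the poles.
If $C^+$ is a closed positive (counterclockwise) path around $P^0 = \sqrt{\mathbf{p}^2 + m^2} = \omega$, we can apply the residue theorem, finding
\begin{align}
\Delta^+ &= -i \int \frac{d^3P}{(2\pi)^3} \oint_{C^+} \frac{dP^0}{(2\pi)}\, \frac{e^{-iP_\mu X^\mu}}{(P^0 - \omega)(P^0 + \omega)} \tag{4.913}\\
&= \int \frac{d^3P}{(2\pi)^3\, 2\omega}\, e^{-i(\omega t - \mathbf{p}\cdot\mathbf{x})}\,. \tag{4.914}
\end{align}
If $C^-$ is a closed positive path around $P^0 = -\sqrt{\mathbf{p}^2 + m^2} = -\omega$, instead, we get
\begin{align}
\Delta^- &= -i \int \frac{d^3P}{(2\pi)^3} \oint_{C^-} \frac{dP^0}{(2\pi)}\, \frac{e^{-iP_\mu X^\mu}}{(P^0 - \omega)(P^0 + \omega)}\,, \tag{4.915}\\
&= -\int \frac{d^3P}{(2\pi)^3\, 2\omega}\, e^{i(\omega t + \mathbf{p}\cdot\mathbf{x})}\,, \tag{4.916}\\
&= |\ \text{transforming } \mathbf{p} \to -\mathbf{p}\ | \notag\\
&= -\int \frac{d^3P}{(2\pi)^3\, 2\omega}\, e^{i(\omega t - \mathbf{p}\cdot\mathbf{x})}\,. \tag{4.917}
\end{align}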

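Both $\Delta^\pm$ are solutions of the homogeneous equation. In fact
\begin{align}
(\partial^2 + m^2)\Delta^\pm &= -i \int \frac{d^3P}{(2\pi)^3} \oint_{C^\pm} \frac{dP^0}{(2\pi)}\, (\partial^2 + m^2)\, \frac{e^{-iP_\mu X^\mu}}{(P^2 - m^2)} , \tag{4.918} \\
&= i \int \frac{d^3P}{(2\pi)^3} \oint_{C^\pm} \frac{dP^0}{(2\pi)}\, (P^2 - m^2)\, \frac{e^{-iP_\mu X^\mu}}{(P^2 - m^2)} , \tag{4.919} \\
&= i \int \frac{d^3P}{(2\pi)^3} \oint_{C^\pm} \frac{dP^0}{(2\pi)}\, e^{-iP_\mu X^\mu} = 0 , \tag{4.920}
\end{align}
by Cauchy's theorem, since the remaining integrand has no poles inside $C^\pm$.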

4.7.2 Open paths


The integration on an open path gives the solution of the non-homogeneous differential equation.
Of particular interest are the so-called “retarded” and “advanced” Green functions. These provide the
correct solutions for a classical field that preserves causality. They depend on how we regularize the two
singularities on the real $P^0$ axis, which occur at $P^0 = \pm\omega$, where $\omega = \sqrt{\mathbf{p}^2 + m^2}$. In principle, we have
four possibilities to perform the integral: we can go around both singularities with a vanishing arc
in the upper complex half-plane, or both with a vanishing arc in the lower complex half-plane, or
we can use an arc in the upper half-plane for one pole and one in the lower half-plane for the other (with two evident configurations).

[Figure: the complex $P^0$ plane, with the two poles at $P^0 = -\omega$ and $P^0 = \omega$ on the real axis.]
Using a different language (but same result), instead of considering a vanishing circle around the
pole, we can displace the pole (using a vanishing imaginary part) keeping the integration on the real
axis, as in the figure

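[Figure: the arc of radius $\epsilon$ around the pole at $\omega$, in the limit $\epsilon \to 0$, compared with the pole displaced to $\omega - i\eta$, in the limit $\eta \to 0$.]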

such that
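\[ \lim_{\epsilon\to 0} \int_{\gamma_\epsilon} \frac{dP^0}{(2\pi)}\, \frac{f(P^0)}{P^0 - \omega} = \lim_{\eta\to 0} \int \frac{dP^0}{(2\pi)}\, \frac{f(P^0)}{P^0 - \omega + i\eta} \tag{4.921} \]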
and usually the “limit” procedure is understood. Therefore, the situation becomes as follows:

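[Figure: the complex $P^0$ plane with the poles near $P^0 = \pm\omega$ and the three integration paths along the real axis, labelled “retarded”, “advanced” and “time-ordered”.]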

Retarded Green functions


The first case to be considered is the one in which the two poles are both displaced below the real axis.
In this way, the Green function vanishes for t < t′ . In fact, we define
\[ \tilde{G}_{ret}(P) = -\frac{1}{(P^0 + i\eta)^2 - \mathbf{p}^2 - m^2} = -\frac{1}{(P^0 - \omega + i\eta)(P^0 + \omega + i\eta)} . \tag{4.922} \]

If we close the integration contour in the upper half-plane, for the case t < t′ ,

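[Figure: the closed contour $\Gamma^+$ in the complex plane, made of the segment $\gamma_1$ from $-R$ to $R$ on the real axis and the semicircle $\gamma_R$ in the upper half-plane; the poles at $-\omega - i\eta$ and $\omega - i\eta$ lie below the real axis, outside the contour.]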

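and we let $R \to \infty$. By Cauchy's theorem we have
\begin{align}
0 &= -\lim_{R\to\infty} \int d^3P \oint_{\Gamma^+} dP^0\, \frac{e^{-iP_\mu (X-X')^\mu}}{(P^0 - \omega + i\eta)(P^0 + \omega + i\eta)} , \tag{4.923} \\
&= -\lim_{R\to\infty} \int d^3P \int_{-R}^{R} dP^0\, \frac{e^{-iP_\mu (X-X')^\mu}}{(P^0 - \omega + i\eta)(P^0 + \omega + i\eta)} \notag \\
&\quad - \lim_{R\to\infty} \int d^3P \int_{\gamma_R} dP^0\, \frac{e^{-iP_\mu (X-X')^\mu}}{(P^0 - \omega + i\eta)(P^0 + \omega + i\eta)} , \tag{4.924}
\end{align}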

and, by Jordan's lemma, we have that (for $t - t' < 0$)
\[ \lim_{R\to\infty} \int d^3P \int_{\gamma_R} dP^0\, \frac{e^{-iP_\mu (X-X')^\mu}}{(P^0 - \omega + i\eta)(P^0 + \omega + i\eta)} = 0 . \tag{4.925} \]

Therefore:
Gret (X − X ′ ) = 0 , for t − t′ < 0 . (4.926)
If t − t′ > 0, instead, we have to close the integration contour in the lower half P 0 plane, in order to
use Jordan’s lemma. This means that we are including in the contour the two poles.

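[Figure: the closed contour $\Gamma^-$, made of the segment $\gamma_1$ from $-R$ to $R$ on the real axis and the semicircle $\gamma_R$ in the lower half-plane; it encloses the poles at $-\omega - i\eta$ and $\omega - i\eta$.]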

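Now the residue theorem gives us (remember that we are closing the contour clockwise):
\begin{align}
-2\pi i \sum \mathrm{Res}(f, \pm\omega) &= -\lim_{R\to\infty} \oint_{\Gamma^-} dP^0\, \frac{e^{-iP_\mu (X-X')^\mu}}{(P^0 - \omega + i\eta)(P^0 + \omega + i\eta)} , \tag{4.927} \\
&= -\lim_{R\to\infty} \int_{-R}^{R} dP^0\, \frac{e^{-iP_\mu (X-X')^\mu}}{(P^0 - \omega + i\eta)(P^0 + \omega + i\eta)} . \tag{4.928}
\end{align}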

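Finally
\begin{align}
G_{ret}(X-X') &= -\frac{\theta(X^0 - X'^0)}{(2\pi)^4} \int d^4P\, \frac{e^{-iP_\mu (X-X')^\mu}}{(P^0 + i\eta)^2 - \mathbf{p}^2 - m^2} , \tag{4.929} \\
&= -\frac{\theta(X^0 - X'^0)}{(2\pi)^4} \int \frac{d^4P}{2\omega}\, e^{-iP_\mu (X-X')^\mu} \left[ \frac{1}{(P^0 - \omega + i\eta)} - \frac{1}{(P^0 + \omega + i\eta)} \right] , \tag{4.930} \\
&= -\frac{\theta(X^0 - X'^0)}{(2\pi)^4} \int \frac{d^3P}{2\omega}\, e^{i\mathbf{p}\cdot(\mathbf{x}-\mathbf{x}')} \int dP^0\, e^{-iP_0 (X^0 - X'^0)} \left[ \frac{1}{(P^0 - \omega + i\eta)} - \frac{1}{(P^0 + \omega + i\eta)} \right] , \tag{4.931} \\
&= \Big|\ \text{by the residue theorem (contour closed clockwise): } \int dP^0\, \frac{e^{-iP_0 (X^0 - X'^0)}}{(P^0 \pm \omega + i\eta)} = -2\pi i\, e^{\pm i\omega (X^0 - X'^0)}\ \Big| \notag \\
&= i\, \theta(X^0 - X'^0) \int \frac{d^3P}{(2\pi)^3\, 2\omega} \left[ e^{-i\omega (X^0 - X'^0) + i\mathbf{p}\cdot(\mathbf{x}-\mathbf{x}')} - e^{i\omega (X^0 - X'^0) + i\mathbf{p}\cdot(\mathbf{x}-\mathbf{x}')} \right] \tag{4.932} \\
&= |\ \mathbf{p} \to -\mathbf{p} \text{ in the second integral}\ | \notag \\
&= i\, \theta(X^0 - X'^0) \int \frac{d^3P}{(2\pi)^3\, 2\omega} \left[ e^{-i\omega (X^0 - X'^0) + i\mathbf{p}\cdot(\mathbf{x}-\mathbf{x}')} - e^{i\omega (X^0 - X'^0) - i\mathbf{p}\cdot(\mathbf{x}-\mathbf{x}')} \right] \tag{4.933} \\
&= \theta(X^0 - X'^0) \left[ i\Delta^+(X-X') + i\Delta^-(X-X') \right] . \tag{4.934}
\end{align}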


The Green function $G_{ret}(X - X')$ is real and propagates both solutions, with positive
and negative frequency, into the future. It has to be used in problems in which the boundary condition is given at a certain time $t'$
and we ask what happens, as a consequence, for $t > t'$. It is a causal Green function in the sense
that it is different from zero only in the future light-cone of $X'$. For space-like separations, $(X - X')^2 < 0$,
it vanishes, because it is invariant under proper Lorentz transformations: if $(X - X')^2 < 0$, we
can find a frame in which $t < t'$, where $G_{ret}(X - X') = 0$, and it then remains zero in every frame.
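
As a numerical cross-check of the contour argument, the following Python sketch (an illustration added to these notes, not part of the original derivation) evaluates only the $P^0$ integral of $\tilde{G}_{ret}$ at fixed $\mathbf{p}$, keeping $\eta$ finite so that the integrand is smooth on the real axis. Closing the contour as described above gives $\theta(t)\, e^{-\eta t} \sin(\omega t)/\omega$ for this one-dimensional integral, with $t$ standing for $X^0 - X'^0$. The function names, the values of $\omega$ and $\eta$, and the cutoff of the integration range are assumptions chosen for the example.

```python
import cmath
import math


def g_ret_numeric(t, omega=1.0, eta=0.5, cutoff=200.0, steps=200_000):
    """Midpoint-rule estimate of the P0 integral
        int dP0/(2 pi) exp(-i P0 t) * G_ret(P0),
    with G_ret(P0) = -1 / ((P0 - omega + i eta) (P0 + omega + i eta)).
    The infinite range is truncated to [-cutoff, cutoff] (an approximation).
    """
    h = 2.0 * cutoff / steps
    total = 0j
    for k in range(steps):
        p0 = -cutoff + (k + 0.5) * h
        denom = (p0 - omega + 1j * eta) * (p0 + omega + 1j * eta)
        total += cmath.exp(-1j * p0 * t) / denom
    return -total * h / (2.0 * math.pi)


def g_ret_closed_form(t, omega=1.0, eta=0.5):
    """Value obtained from the residues: zero for t < 0, damped sine for t > 0."""
    if t <= 0.0:
        return 0.0
    return math.exp(-eta * t) * math.sin(omega * t) / omega


if __name__ == "__main__":
    for t in (-1.5, 1.5):
        print(t, g_ret_numeric(t), g_ret_closed_form(t))
```

For negative `t` the numerical integral should be compatible with zero within the truncation error, which is the statement of Eq. (4.926) at fixed $\mathbf{p}$.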

Advanced Green functions


The second case is constituted by the advanced Green function, which is defined to vanish for t−t′ > 0.
We define
\[ \tilde{G}_{adv}(P) = -\frac{1}{(P^0 - i\eta)^2 - \mathbf{p}^2 - m^2} = -\frac{1}{(P^0 - \omega - i\eta)(P^0 + \omega - i\eta)} . \tag{4.935} \]
If we close the integration contour in the lower half-plane, for the case t > t′ ,
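[Figure: the closed contour $\Gamma^-$, made of the segment $\gamma_1$ from $-R$ to $R$ on the real axis and the semicircle $\gamma_R$ in the lower half-plane; the poles at $-\omega + i\eta$ and $\omega + i\eta$ lie above the real axis, outside the contour.]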

and we find
Gadv (X − X ′ ) = 0 , for t − t′ > 0 . (4.936)
If $t - t' < 0$, instead, we have to close the integration contour in the upper half $P^0$ plane, in order to
use Jordan's lemma. This means that we are including in the contour the two poles.
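[Figure: the closed contour $\Gamma^+$, made of the segment $\gamma_1$ from $-R$ to $R$ on the real axis and the semicircle $\gamma_R$ in the upper half-plane; it encloses the poles at $-\omega + i\eta$ and $\omega + i\eta$.]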

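By the residue theorem we have
\begin{align}
G_{adv}(X-X') &= -\frac{\theta(X'^0 - X^0)}{(2\pi)^4} \int d^4P\, \frac{e^{-iP_\mu (X-X')^\mu}}{(P^0 - i\eta)^2 - \mathbf{p}^2 - m^2} , \tag{4.937} \\
&= -\frac{\theta(X'^0 - X^0)}{(2\pi)^4} \int \frac{d^4P}{2\omega}\, e^{-iP_\mu (X-X')^\mu} \left[ \frac{1}{(P^0 - \omega - i\eta)} - \frac{1}{(P^0 + \omega - i\eta)} \right] , \tag{4.938} \\
&= \Big|\ \text{by the residue theorem (contour closed counterclockwise): } \int dP^0\, \frac{e^{-iP_0 (X^0 - X'^0)}}{(P^0 \pm \omega - i\eta)} = 2\pi i\, e^{\pm i\omega (X^0 - X'^0)}\ \Big| \notag \\
&= -i\, \theta(X'^0 - X^0) \int \frac{d^3P}{(2\pi)^3\, 2\omega} \left[ e^{-i\omega (X^0 - X'^0) + i\mathbf{p}\cdot(\mathbf{x}-\mathbf{x}')} - e^{i\omega (X^0 - X'^0) + i\mathbf{p}\cdot(\mathbf{x}-\mathbf{x}')} \right] \tag{4.939} \\
&= |\ \mathbf{p} \to -\mathbf{p} \text{ in the second integral}\ | \notag \\
&= -i\, \theta(X'^0 - X^0) \int \frac{d^3P}{(2\pi)^3\, 2\omega} \left[ e^{-i\omega (X^0 - X'^0) + i\mathbf{p}\cdot(\mathbf{x}-\mathbf{x}')} - e^{i\omega (X^0 - X'^0) - i\mathbf{p}\cdot(\mathbf{x}-\mathbf{x}')} \right] \tag{4.940} \\
&= -\theta(X'^0 - X^0) \left[ i\Delta^+(X-X') + i\Delta^-(X-X') \right] . \tag{4.941}
\end{align}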

$G_{adv}(X - X')$ is also real, and it propagates both solutions, with positive and negative
frequency, into the past. It has to be used in problems in which the boundary condition is given at a certain time $t'$ in the future
and we ask what has to happen in the present in order to produce this boundary condition in the future. It is not obvious,
but it is possible. It is a causal Green function in the sense that it is different from zero only in the past
light-cone of $X'$; for space-like separations, $(X - X')^2 < 0$, it vanishes because it is invariant under proper Lorentz
transformations (like $G_{ret}(X - X')$).

Feynman propagator
Quantum-mechanically the correct propagator, which propagates “particle” and “anti-particle” states into
the future, is the Feynman propagator. It is defined by giving a vanishing positive imaginary part to
the pole at $-\omega$ and a vanishing negative imaginary part to the pole at $\omega$:

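[Figure: the complex plane with the poles displaced to $-\omega + i\eta$ (above the real axis) and $\omega - i\eta$ (below the real axis), together with the closed contours $\Gamma^+$ (upper half-plane) and $\Gamma^-$ (lower half-plane) built on the segment from $-R$ to $R$.]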

This time, for $t - t' > 0$ we have to close the contour in the lower half $P^0$ plane ($\Gamma^-$), while for
$t - t' < 0$ we close it in the upper one ($\Gamma^+$). We have

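\begin{align}
D_F(X-X') &= -\frac{\theta(X^0 - X'^0)}{(2\pi)^4} \int \frac{d^3P}{2\omega}\, e^{i\mathbf{p}\cdot(\mathbf{x}-\mathbf{x}')} \oint_{\Gamma^-} dP^0\, e^{-iP_0 (X^0 - X'^0)} \left[ \frac{1}{(P^0 - \omega + i\eta)} - \frac{1}{(P^0 + \omega - i\eta)} \right] \tag{4.942} \\
&\quad - \frac{\theta(X'^0 - X^0)}{(2\pi)^4} \int \frac{d^3P}{2\omega}\, e^{i\mathbf{p}\cdot(\mathbf{x}-\mathbf{x}')} \oint_{\Gamma^+} dP^0\, e^{-iP_0 (X^0 - X'^0)} \left[ \frac{1}{(P^0 - \omega + i\eta)} - \frac{1}{(P^0 + \omega - i\eta)} \right] , \tag{4.943} \\
&= i\, \theta(X^0 - X'^0) \int \frac{d^3P}{(2\pi)^3\, 2\omega}\, e^{-i\omega (t - t') + i\mathbf{p}\cdot(\mathbf{x}-\mathbf{x}')}
 + i\, \theta(X'^0 - X^0) \int \frac{d^3P}{(2\pi)^3\, 2\omega}\, e^{i\omega (t - t') + i\mathbf{p}\cdot(\mathbf{x}-\mathbf{x}')} , \tag{4.944} \\
&= i\, \theta(X^0 - X'^0) \int \frac{d^3P}{(2\pi)^3\, 2\omega}\, e^{-i\omega (t - t') + i\mathbf{p}\cdot(\mathbf{x}-\mathbf{x}')}
 + i\, \theta(X'^0 - X^0) \int \frac{d^3P}{(2\pi)^3\, 2\omega}\, e^{i\omega (t - t') - i\mathbf{p}\cdot(\mathbf{x}-\mathbf{x}')} , \tag{4.945} \\
&= \theta(X^0 - X'^0)\, i\Delta^+(X - X') - \theta(X'^0 - X^0)\, i\Delta^-(X - X') . \tag{4.946}
\end{align}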

Now the Green function is complex: it propagates the positive frequency solutions into the future
and the negative frequency ones into the past. This is consistent with Dirac’s “hole theory” interpretation
of particle and anti-particle states, both propagating into the future.
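
The same kind of numerical check can be repeated for the Feynman prescription. The Python sketch below (an added illustration with hypothetical function names and parameter values) integrates over $P^0$ at fixed $\mathbf{p}$ with the poles at $\omega - i\eta$ and $-\omega + i\eta$, keeping $\eta$ finite. With $w = \omega - i\eta$, the residues give $i\, e^{-i w |t|}/(2w)$, which reduces to the integrand of Eq. (4.944) when $\eta \to 0$: positive frequencies for $t > 0$ and negative frequencies for $t < 0$.

```python
import cmath
import math


def d_f_numeric(t, omega=1.0, eta=0.5, cutoff=200.0, steps=200_000):
    """Midpoint-rule estimate of
        int dP0/(2 pi) exp(-i P0 t) * (-1) / ((P0 - omega + i eta) (P0 + omega - i eta)),
    truncated to [-cutoff, cutoff] (an approximation)."""
    h = 2.0 * cutoff / steps
    total = 0j
    for k in range(steps):
        p0 = -cutoff + (k + 0.5) * h
        denom = (p0 - omega + 1j * eta) * (p0 + omega - 1j * eta)
        total += cmath.exp(-1j * p0 * t) / denom
    return -total * h / (2.0 * math.pi)


def d_f_closed_form(t, omega=1.0, eta=0.5):
    """Residue result: i exp(-i w |t|) / (2 w), with w = omega - i eta."""
    w = omega - 1j * eta
    return 1j * cmath.exp(-1j * w * abs(t)) / (2.0 * w)


if __name__ == "__main__":
    for t in (-1.2, 1.2):
        print(t, d_f_numeric(t), d_f_closed_form(t))
```

Unlike the retarded case, the result is complex and different from zero for both signs of `t`.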
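We can give a physical interpretation of the propagator by considering the following
simple “quantum process” associated with the charged KG field $\phi(X)$. Let us consider the creation of a
particle (one-particle state) at the time $t$ in $\mathbf{y}$. We will have
\begin{align}
|\psi(\mathbf{y}, t)\rangle = \phi^\dagger(Y)|0\rangle &= \int d^3P \left[ f_p^{(+)}\, b(\mathbf{p})|0\rangle + f_p^{(+)*}\, a^\dagger(\mathbf{p})|0\rangle \right] , \tag{4.947} \\
&= \int \frac{d^3P}{\sqrt{(2\pi)^3\, 2\omega}}\, e^{iP_\mu Y^\mu}\, a^\dagger(\mathbf{p})|0\rangle . \tag{4.948}
\end{align}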

The probability amplitude of finding the particle in $\mathbf{x}$ at $t' > t$ is given by
\[ \theta(t' - t)\, \langle \psi(\mathbf{x}, t') | \psi(\mathbf{y}, t) \rangle = \theta(t' - t)\, \langle 0 | \phi(X) \phi^\dagger(Y) | 0 \rangle , \tag{4.949} \]
which can be interpreted as the creation of a particle of charge $q = +1$ in $(\mathbf{y}, t)$ by $\phi^\dagger(Y)$, its propagation
up to $(\mathbf{x}, t')$ and its annihilation at this point by $\phi(X)$.

[Figure: a line joining the creation point $(\mathbf{y}, t)$ to the annihilation point $(\mathbf{x}, t')$.]

Such an amplitude enters a scattering process in which two nucleons (a proton and a neutron) exchange
a charged pion (see Fig. 4.1 (a)). The same “effect” can be obtained by creating a negative charge in
$(\mathbf{x}, t')$, which then propagates up to $(\mathbf{y}, t)$ and is annihilated at this point, with $t > t'$. Therefore we
have to consider also the amplitude
\[ \theta(t - t')\, \langle 0 | \phi^\dagger(Y) \phi(X) | 0 \rangle , \tag{4.950} \]
which enters, for instance, the diagram in Fig. 4.1 (b). The complete amplitude is the sum of the
two amplitudes:
\[ \langle 0 | T(\phi(X) \phi^\dagger(Y)) | 0 \rangle = \langle 0 | T(\phi^\dagger(Y) \phi(X)) | 0 \rangle \tag{4.951} \]

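[Figure 4.1: Scattering of a proton and a neutron. (a) Exchange of a $\pi^+$ between $(\mathbf{y}, t)$ and $(\mathbf{x}, t')$. (b) Exchange of a $\pi^-$ between $(\mathbf{x}, t')$ and $(\mathbf{y}, t)$. Time runs upwards.]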

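\begin{align}
&= \theta(X^0 - Y^0)\, \langle 0 | \phi(X) \phi^\dagger(Y) | 0 \rangle + \theta(Y^0 - X^0)\, \langle 0 | \phi^\dagger(Y) \phi(X) | 0 \rangle \tag{4.952} \\
&= \int \frac{d^3P\, d^3P'}{(2\pi)^3 \sqrt{4\omega\omega'}} \Big[ \theta(X^0 - Y^0)\, e^{i(P_\mu Y^\mu - P'_\mu X^\mu)}\, \langle 0 | a(\mathbf{p}') a^\dagger(\mathbf{p}) | 0 \rangle \notag \\
&\qquad\qquad + \theta(Y^0 - X^0)\, e^{-i(P_\mu Y^\mu - P'_\mu X^\mu)}\, \langle 0 | b(\mathbf{p}) b^\dagger(\mathbf{p}') | 0 \rangle \Big] , \tag{4.953} \\
&= |\ \text{since } \langle 0 | a(\mathbf{p}') a^\dagger(\mathbf{p}) | 0 \rangle = \delta(\mathbf{p} - \mathbf{p}') \dots\ | \notag \\
&= \int \frac{d^3P}{(2\pi)^3\, 2\omega} \left[ \theta(X^0 - Y^0)\, e^{-iP_\mu (X-Y)^\mu} + \theta(Y^0 - X^0)\, e^{iP_\mu (X-Y)^\mu} \right] , \tag{4.954} \\
&= -i D_F(X - Y) . \tag{4.955}
\end{align}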

Here we defined the time-ordered product of the two bosonic fields $\phi(X)$ and $\phi^\dagger(Y)$ as follows:
\[ T(\phi(X) \phi^\dagger(Y)) = T(\phi^\dagger(Y) \phi(X)) = \theta(X^0 - Y^0)\, \phi(X) \phi^\dagger(Y) + \theta(Y^0 - X^0)\, \phi^\dagger(Y) \phi(X) . \tag{4.956} \]

A more convenient way to write the Feynman propagator uses the following integral representation
of the step function:
\[ \theta(t) = \lim_{\eta\to 0^+} \frac{i}{2\pi} \int_{-\infty}^{\infty} d\omega\, \frac{e^{-i\omega t}}{\omega + i\eta} . \tag{4.957} \]
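In fact,
[Figure: the complex $\omega$ plane with the pole at $\omega = -i\eta$ and the closed contours $\Gamma^+$ (upper half-plane) and $\Gamma^-$ (lower half-plane) built on the segment from $-R$ to $R$.]
for $t < 0$ the integration contour has to be chosen to be $\Gamma^+$, in such a way that we can apply Cauchy's
theorem and Jordan's lemma, getting
\[ 0 = \theta(t) + \lim_{R\to\infty} \int_{\gamma_R^+} \frac{i}{2\pi}\, d\omega\, \frac{e^{-i\omega t}}{\omega + i\eta} = \theta(t) . \tag{4.958} \]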

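For $t > 0$, instead, we close the integration contour in the lower complex half-plane ($\Gamma^-$), getting
\[ -2\pi i\, \mathrm{Res}(\theta, -i\eta) = \theta(t) + \lim_{R\to\infty} \int_{\gamma_R^-} \frac{i}{2\pi}\, d\omega\, \frac{e^{-i\omega t}}{\omega + i\eta} = \theta(t) , \tag{4.959} \]
where
\[ -2\pi i\, \mathrm{Res}(\theta, -i\eta) = -2\pi i \lim_{\eta\to 0^+} \frac{i}{2\pi}\, e^{-\eta t} = 1 . \tag{4.960} \]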
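Including the integral representation of the Heaviside $\theta$ in Eq. (4.955) we find
\begin{align}
-i D_F(X - Y) &= \int \frac{d^3P}{(2\pi)^3\, 2\omega_p} \left[ \theta(X^0 - Y^0)\, e^{-iP_\mu (X-Y)^\mu} + \theta(Y^0 - X^0)\, e^{iP_\mu (X-Y)^\mu} \right] , \tag{4.961} \\
&= i \int \frac{d^3P\, d\omega}{(2\pi)^4\, 2\omega_p} \left[ \frac{e^{-i\omega (X^0 - Y^0)}}{\omega + i\eta}\, e^{-iP_\mu (X-Y)^\mu} + \frac{e^{i\omega (X^0 - Y^0)}}{\omega + i\eta}\, e^{iP_\mu (X-Y)^\mu} \right] , \tag{4.962} \\
&= |\ \text{we substitute } P^0 = \omega + \omega_p , \text{ such that } \omega = P^0 - \omega_p \text{ and } d\omega = dP^0\ | \notag \\
&= i \int \frac{d^4P}{(2\pi)^4}\, \frac{1}{2\omega_p} \Bigg[ \frac{e^{-iP^0 (X^0 - Y^0) + i\omega_p (X^0 - Y^0)}\, e^{-i\omega_p (X^0 - Y^0) + i\mathbf{p}\cdot(\mathbf{x}-\mathbf{y})}}{(P^0 - \omega_p + i\eta)} \notag \\
&\qquad\qquad + \frac{e^{iP^0 (X^0 - Y^0) - i\omega_p (X^0 - Y^0)}\, e^{i\omega_p (X^0 - Y^0) - i\mathbf{p}\cdot(\mathbf{x}-\mathbf{y})}}{(P^0 - \omega_p + i\eta)} \Bigg] , \tag{4.963} \\
&= i \int \frac{d^4P}{(2\pi)^4}\, \frac{1}{2\omega_p} \left[ \frac{e^{-iP_\mu (X-Y)^\mu}}{(P^0 - \omega_p + i\eta)} + \frac{e^{iP_\mu (X-Y)^\mu}}{(P^0 - \omega_p + i\eta)} \right] , \tag{4.964} \\
&= |\ \text{substituting in the second integral } P^\mu \to -P^\mu\ | \notag \\
&= i \int \frac{d^4P}{(2\pi)^4}\, \frac{1}{2\omega_p}\, e^{-iP_\mu (X-Y)^\mu} \left[ \frac{1}{(P^0 - \omega_p + i\eta)} - \frac{1}{(P^0 + \omega_p - i\eta)} \right] , \tag{4.965} \\
&= i \int \frac{d^4P}{(2\pi)^4}\, \frac{e^{-iP_\mu (X-Y)^\mu}}{P^2 - m^2 + i\eta} . \tag{4.966}
\end{align}
Finally
\[ D_F(X - Y) = -\int \frac{d^4P}{(2\pi)^4}\, \frac{e^{-iP_\mu (X-Y)^\mu}}{P^2 - m^2 + i\eta} . \tag{4.967} \]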
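We can check again that $D_F$ is a Green function for the KG operator:
\begin{align}
(\partial^2 + m^2)_X\, D_F(X - Y) &= -\lim_{\eta\to 0} \int \frac{d^4P}{(2\pi)^4}\, (\partial^2 + m^2)_X\, \frac{e^{-iP_\mu (X-Y)^\mu}}{P^2 - m^2 + i\eta} , \tag{4.968} \\
&= -\lim_{\eta\to 0} \int \frac{d^4P}{(2\pi)^4}\, (-P^2 + m^2)\, \frac{e^{-iP_\mu (X-Y)^\mu}}{P^2 - m^2 + i\eta} , \tag{4.969} \\
&= \int \frac{d^4P}{(2\pi)^4}\, e^{-iP_\mu (X-Y)^\mu} = \delta^4(X - Y) . \tag{4.970}
\end{align}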

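The same result can be found by acting with $(\partial^2 + m^2)_X$ (derivatives with respect to $X$) directly on
$i \langle 0 | T(\phi(X) \phi^\dagger(Y)) | 0 \rangle$. In fact:
\begin{align}
(\partial^2 + m^2)_X\, i \langle 0 | T(\phi(X) \phi^\dagger(Y)) | 0 \rangle &= \partial_0^2\, i \langle 0 | T(\phi(X) \phi^\dagger(Y)) | 0 \rangle
 + i \langle 0 | T((-\nabla^2 + m^2)_X \phi(X) \phi^\dagger(Y)) | 0 \rangle , \tag{4.971} \\
&= \partial_0\, i \langle 0 | \delta(X^0 - Y^0) [\phi(X), \phi^\dagger(Y)] | 0 \rangle + \partial_0\, i \langle 0 | T(\dot\phi(X) \phi^\dagger(Y)) | 0 \rangle \notag \\
&\quad + i \langle 0 | T((-\nabla^2 + m^2)_X \phi(X) \phi^\dagger(Y)) | 0 \rangle , \tag{4.972} \\
&= \partial_0\, i \langle 0 | T(\dot\phi(X) \phi^\dagger(Y)) | 0 \rangle
 + i \langle 0 | T((-\nabla^2 + m^2)_X \phi(X) \phi^\dagger(Y)) | 0 \rangle , \tag{4.973} \\
&= i \langle 0 | \delta(X^0 - Y^0) [\dot\phi(X), \phi^\dagger(Y)] | 0 \rangle + i \langle 0 | T(\ddot\phi(X) \phi^\dagger(Y)) | 0 \rangle \notag \\
&\quad + i \langle 0 | T((-\nabla^2 + m^2)_X \phi(X) \phi^\dagger(Y)) | 0 \rangle , \tag{4.974} \\
&= \delta^4(X - Y) , \tag{4.975}
\end{align}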

where we used the fact that17

∂0 θ(X 0 − Y 0 ) = δ(X 0 − Y 0 ) , and ∂0 θ(Y 0 − X 0 ) = −δ(X 0 − Y 0 ) (4.978)

and we used the commutation relations of the fields.


The propagator for the real KG field is
\[ i \langle 0 | T(\phi(X) \phi(Y)) | 0 \rangle = D_F(X - Y) . \tag{4.979} \]

Propagators and commutators

4.8 Propagator of the Dirac field


In the case of the Dirac field we define the propagator as in the case of the KG field,
\[ S_F(X - Y)_{\alpha\beta} = -i \langle 0 | T(\psi_\alpha(X) \bar\psi_\beta(Y)) | 0 \rangle , \tag{4.980} \]
but now, since we are dealing with fermions, the $T$-ordered product is defined as follows:
\[ T(\psi_\alpha(X) \bar\psi_\beta(Y)) = -T(\bar\psi_\beta(Y) \psi_\alpha(X)) = \theta(X^0 - Y^0)\, \psi_\alpha(X) \bar\psi_\beta(Y) - \theta(Y^0 - X^0)\, \bar\psi_\beta(Y) \psi_\alpha(X) . \tag{4.981} \]

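$S_F(X - Y)$ is indeed a Green's function for the Dirac equation. In fact
\begin{align}
(i \slashed{\partial}_X - m)_{\alpha\beta}\, \langle 0 | T(\psi_\beta(X) \bar\psi_\gamma(Y)) | 0 \rangle &= \langle 0 | i \gamma^0_{\alpha\beta}\, \delta(X^0 - Y^0)\, [\psi_\beta(X), \psi^\dagger_\delta(Y)]_+\, \gamma^0_{\delta\gamma} | 0 \rangle \notag \\
&\quad + \langle 0 | T(i \gamma^0_{\alpha\beta} \partial_0 \psi_\beta(X) \bar\psi_\gamma(Y)) | 0 \rangle \notag \\
&\quad + \langle 0 | T([i \gamma^i_{\alpha\beta} \partial_i - m] \psi_\beta(X) \bar\psi_\gamma(Y)) | 0 \rangle , \tag{4.982} \\
&= |\ \text{since } [\psi_\beta(\mathbf{x}, t), \psi^\dagger_\delta(\mathbf{y}, t)]_+ = \delta_{\beta\delta}\, \delta^3(\mathbf{x} - \mathbf{y})\ | \notag \\
&= \langle 0 | i \gamma^0_{\alpha\beta}\, \delta(X^0 - Y^0)\, \delta_{\beta\delta}\, \delta^3(\mathbf{x} - \mathbf{y})\, \gamma^0_{\delta\gamma} | 0 \rangle \notag \\
&\quad + \langle 0 | T([i \gamma^\mu_{\alpha\beta} \partial_\mu - m] \psi_\beta(X) \bar\psi_\gamma(Y)) | 0 \rangle , \tag{4.983} \\
&= i \delta_{\alpha\gamma}\, \delta^4(X - Y) . \tag{4.984}
\end{align}
Therefore
\[ (i \slashed{\partial}_X - m)_{\alpha\beta}\, S_F(X - Y)_{\beta\gamma} = \delta_{\alpha\gamma}\, \delta^4(X - Y) . \tag{4.985} \]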
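17 These relations can be demonstrated using the integral representation of the Heaviside function. We have
\begin{align}
\frac{\partial}{\partial t} \theta(t) &= \lim_{\eta\to 0^+} \frac{\partial}{\partial t}\, \frac{i}{2\pi} \int_{-\infty}^{\infty} d\omega\, \frac{e^{-i\omega t}}{\omega + i\eta}
 = \lim_{\eta\to 0^+} \int_{-\infty}^{\infty} \frac{d\omega}{2\pi}\, \frac{\omega\, e^{-i\omega t}}{\omega + i\eta}
 = \int_{-\infty}^{\infty} \frac{d\omega}{2\pi}\, e^{-i\omega t} = \delta(t) , \tag{4.976} \\
\frac{\partial}{\partial t} \theta(-t) &= \lim_{\eta\to 0^+} \frac{\partial}{\partial t}\, \frac{i}{2\pi} \int_{-\infty}^{\infty} d\omega\, \frac{e^{i\omega t}}{\omega + i\eta}
 = -\lim_{\eta\to 0^+} \int_{-\infty}^{\infty} \frac{d\omega}{2\pi}\, \frac{\omega\, e^{i\omega t}}{\omega + i\eta} \notag \\
&= |\ \omega \to -\omega\ | = -\int_{-\infty}^{\infty} \frac{d\omega}{2\pi}\, e^{-i\omega t} = -\delta(t) . \tag{4.977}
\end{align}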

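The Green's function $S_F(X - Y)_{\beta\gamma}$ can be expressed in terms of $D_F(X - Y)$ as follows:
\[ S_F(X - Y)_{\beta\gamma} = -(i \slashed{\partial}_X + m)_{\beta\gamma}\, D_F(X - Y) . \tag{4.986} \]
In fact, we have
\begin{align}
(i \slashed{\partial}_X - m)_{\alpha\beta}\, S_F(X - Y)_{\beta\gamma} &= -(i \slashed{\partial}_X - m)_{\alpha\beta} (i \slashed{\partial}_X + m)_{\beta\gamma}\, D_F(X - Y) , \notag \\
&= \delta_{\alpha\gamma}\, (\partial^2 + m^2) D_F(X - Y) = \delta_{\alpha\gamma}\, \delta^4(X - Y) . \tag{4.987}
\end{align}
Therefore
\begin{align}
S_F(X - Y)_{\beta\gamma} &= -(i \slashed{\partial}_X + m)_{\beta\gamma} \left[ -\int \frac{d^4P}{(2\pi)^4}\, \frac{e^{-iP_\mu (X-Y)^\mu}}{P^2 - m^2 + i\eta} \right] , \tag{4.988} \\
&= \int \frac{d^4P}{(2\pi)^4}\, e^{-iP_\mu (X-Y)^\mu}\, \frac{(\slashed{P} + m)_{\beta\gamma}}{P^2 - m^2 + i\eta} . \tag{4.989}
\end{align}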

4.9 Propagator of the Electromagnetic field


Finally, the propagator for the electromagnetic field is given by the following expression:
\[ \langle 0 | T(A_\mu(X) A_\nu(Y)) | 0 \rangle = i \eta_{\mu\nu}\, D(X - Y) = \int \frac{d^4P}{(2\pi)^4}\, e^{-iP_\alpha (X-Y)^\alpha}\, \frac{-i \eta_{\mu\nu}}{P^2 + i\eta} . \tag{4.990} \]
The field $A_\mu(X)$ is a bosonic field and the $T$-ordered product has to be defined as in the case of the
KG field:
\[ T(A_\mu(X) A_\nu(Y)) = T(A_\nu(Y) A_\mu(X)) = \theta(X^0 - Y^0)\, A_\mu(X) A_\nu(Y) + \theta(Y^0 - X^0)\, A_\nu(Y) A_\mu(X) . \tag{4.991} \]

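We can check that the expression in Eq. (4.990) is indeed a Green function for the equation of
motion18
\[ \partial^2 A^\mu(X) = J^\mu(X) , \tag{4.995} \]
i.e. a function $G_{\mu\nu}(X - Y)$ such that
\[ \partial^2_X\, G_{\mu\nu}(X - Y) = \eta_{\mu\nu}\, \delta^4(X - Y) . \tag{4.996} \]
In fact we have
\begin{align}
\partial^2_X \left( -i \langle 0 | T(A_\mu(X) A_\nu(Y)) | 0 \rangle \right) &= -i\, \partial_0 \langle 0 | \delta(X^0 - Y^0) [A_\mu(X), A_\nu(Y)] | 0 \rangle \notag \\
&\quad - i\, \partial_0 \langle 0 | T(\dot{A}_\mu(X) A_\nu(Y)) | 0 \rangle \notag \\
&\quad + i \langle 0 | T(\nabla^2_X A_\mu(X) A_\nu(Y)) | 0 \rangle , \tag{4.997} \\
&= -i \langle 0 | \delta(X^0 - Y^0) [\dot{A}_\mu(X), A_\nu(Y)] | 0 \rangle , \tag{4.998} \\
&= \eta_{\mu\nu}\, \delta^4(X - Y) . \tag{4.999}
\end{align}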

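18 We can construct the propagator in the general case in which the Lagrangian density is
\[ \mathcal{L} = -\frac{1}{4} F_{\mu\nu} F^{\mu\nu} - \frac{\lambda}{2} (\partial_\alpha A^\alpha)^2 , \tag{4.992} \]
i.e. the equations of motion are
\[ \partial^2 A^\mu(X) - (1 - \lambda)\, \partial^\mu (\partial_\alpha A^\alpha) = J^\mu(X) . \tag{4.993} \]
We find
\[ \langle 0 | T(A_\mu(X) A_\nu(Y)) | 0 \rangle = \int \frac{d^4P}{(2\pi)^4}\, e^{-iP_\alpha (X-Y)^\alpha} \left[ \frac{-i \eta_{\mu\nu}}{P^2 + i\eta} - i\, \frac{1 - \lambda}{\lambda}\, \frac{P_\mu P_\nu}{(P^2 + i\eta)^2} \right] , \tag{4.994} \]
which for $\lambda = 1$ gives back the propagator in the so-called Feynman gauge. The physical quantities in the end should be
independent of $\lambda$.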

Chapter 5

Cross Section and Decay Rate

5.1 From transition amplitude to probability


The transition amplitude has the following form
\[ S_{fi} = \delta_{fi} + (2\pi)^4 \delta^4\Big( \sum_i P_i - \sum_f P_f \Big) \prod_{ferm} \sqrt{\frac{m}{VE}} \prod_{bos} \sqrt{\frac{1}{V 2E}}\; \mathcal{M} , \tag{5.1} \]

where $\delta_{fi}$ represents the absence of scattering (since we want $i \neq f$, this term is zero), the $\delta^4$ represents
the conservation of the total four-momentum, then we have the normalization factors for the fermions and
for the bosons involved in the scattering and, finally, the matrix element $\mathcal{M}$, which contains the external
fields, the interaction vertices and the propagators.
The transition amplitude is not an observable. In order to define a measurable quantity we first have
to move to a probability, taking the modulus squared of $S_{fi}$:
\[ |S_{fi}|^2 = \Big| (2\pi)^4 \delta^4\Big( \sum_i P_i - \sum_f P_f \Big) \Big|^2 \prod_{ferm} \frac{m}{VE} \prod_{bos} \frac{1}{V 2E}\; |\mathcal{M}|^2 . \tag{5.2} \]

Let us analyse first the modulus squared of the delta function. In order to do that, it is more
convenient to write the delta using its Fourier transform:
\[ (2\pi)^4 \delta^4(P_f - P_i) = \lim_{T\to\infty,\, V\to\infty} \int_V d^3X \int_{-T/2}^{T/2} dt\; e^{i (P_f - P_i)_\mu X^\mu} . \tag{5.3} \]

We focus on δ(Ef − Ei ) (for the spatial part we obtain the same result). We have
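\[ (2\pi)\, \delta(E_f - E_i) = \lim_{T\to\infty} \int_{-T/2}^{T/2} dt\; e^{i \Delta E\, t} = \lim_{T\to\infty} \frac{2 \sin\left( \Delta E\, \frac{T}{2} \right)}{\Delta E} . \tag{5.4} \]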

Therefore
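\[ \big| (2\pi)\, \delta(E_f - E_i) \big|^2 = \lim_{T\to\infty} \frac{4 \sin^2\left( \Delta E\, \frac{T}{2} \right)}{\Delta E^2} = \lim_{T\to\infty} 2\pi\, T\, \delta(E_f - E_i) . \tag{5.5} \]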
The same happens for the spatial part and in the end we obtain
\[ \Big| (2\pi)^4 \delta^4\Big( \sum_i P_i - \sum_f P_f \Big) \Big|^2 = \lim_{T\to\infty,\, V\to\infty} V\, T\, (2\pi)^4 \delta^4\Big( \sum_i P_i - \sum_f P_f \Big) . \tag{5.6} \]

We define the probability density per unit time, or probability density rate, as
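\begin{align}
w_{fi} = \frac{|S_{fi}|^2}{T} &= V (2\pi)^4 \delta^4\Big( \sum_i P_i - \sum_f P_f \Big) \prod_{ferm} \frac{m}{VE} \prod_{bos} \frac{1}{V 2E}\; |\mathcal{M}|^2 , \tag{5.7} \\
&= V (2\pi)^4 \delta^4\Big( \sum_i P_i - \sum_f P_f \Big) \prod_{ext} \frac{1}{V 2E} \prod_{ferm} (2m)\; |\mathcal{M}|^2 . \tag{5.8}
\end{align}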

$w_{fi}$ is the probability density per unit time to have the final state “f”, with momenta $\mathbf{p}_f$, starting
from the initial state “i”. However, from an experimental point of view, it is not possible to measure
an exact $\mathbf{p}_f$, and one considers instead an interval of momentum between $\mathbf{p}_f$ and $\mathbf{p}_f + d\mathbf{p}_f$. In this
interval we have a certain number of equally probable states, over which we have to sum our probability
density. If we quantize in a box of side $L$, the momentum is discrete,
\[ \mathbf{p} = \frac{2\pi}{L}\, \mathbf{n} , \tag{5.9} \]
where $\mathbf{n}$ is a vector of integers. Therefore, the number of states with momentum between $\mathbf{p}_f$ and $\mathbf{p}_f + d\mathbf{p}_f$
is
\[ d^3n = \frac{L^3}{(2\pi)^3}\, d^3p_f = \frac{V}{(2\pi)^3}\, d^3p_f . \tag{5.10} \]
We then have
\[ dw_{fi} = V (2\pi)^4 \delta^4\Big( \sum_i P_i - \sum_f P_f \Big) \prod_{ext} \frac{1}{V 2E} \prod_{ferm} (2m)\; |\mathcal{M}|^2 \prod_f \frac{V}{(2\pi)^3}\, d^3p_f . \tag{5.11} \]

5.2 Cross Section


We now define the observable for scattering processes, which is called the cross section. We have in mind
a process in which a monochromatic beam of particles, prepared at $t = -\infty$, collides with a target
containing a certain density of scattering centers (if we perform a boost along the direction of the incoming
momentum, we can move to the center-of-mass frame, in which the two particles that take part in
the scattering move one against the other). Let us suppose that the beam has a certain cross-sectional area
$S$. If $n$ is the number of incoming particles per unit time and unit surface ($N_i$ being the total number of
incoming particles in a time interval $\Delta t$, $N_i = n S \Delta t$), and $N$ is the number of scattered particles per
unit time and per scattering center ($N_d$ being the total number of particles scattered per scattering center in
$\Delta t$, $N_d = N \Delta t$), we define the cross section as
\[ \sigma = \frac{N}{n} . \tag{5.12} \]
The cross section has the dimensions of a surface. In fact
\[ \sigma = \frac{N}{n} = \frac{N_d}{\Delta t}\, \frac{S \Delta t}{N_i} = \frac{N_d}{N_i}\, S \tag{5.13} \]
and therefore $[\sigma] = l^2$.


$n$ is the incoming flux and can be expressed as the product of the density of incoming particles
times the relative velocity of those particles with respect to the scattering center. In fact we have
\[ n = \frac{N_i}{S \Delta t} = \frac{N_i}{S \Delta t}\, \frac{L}{L} = \frac{N_i}{V}\, \frac{L}{\Delta t} = \rho\, |v_{rel}| , \tag{5.14} \]
where $L$ is the distance traveled in $\Delta t$ by the incoming particles (they all have the same
velocity) and $V = S L$.
If we consider the rate of scattered particles in a certain small region of the phase space, we can
define the differential cross section as
\[ d\sigma = \frac{dN}{n} . \tag{5.15} \]

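The rate $dN$ is exactly $dw_{fi}$. Therefore
\[ d\sigma = \frac{dN}{n} = \frac{dw_{fi}}{n} = \frac{1}{\rho\, |v_{rel}|}\, V (2\pi)^4 \delta^4\Big( \sum_i P_i - \sum_f P_f \Big) \prod_{ext} \frac{1}{V 2E} \prod_{ferm} (2m)\; |\mathcal{M}|^2 \prod_f \frac{L^3}{(2\pi)^3}\, d^3p_f . \tag{5.16} \]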

We consider the situation in which one particle at a time scatters on a diffusing center. In the
volume V we will have one incoming particle and therefore
\[ \rho = \frac{1}{V} . \tag{5.17} \]
Moreover, we have a 2 → n scattering and therefore

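\begin{align}
d\sigma &= \frac{V^2}{|v_{rel}|}\, (2\pi)^4 \delta^4\Big( \sum_i P_i - \sum_f P_f \Big) \prod_{ext} \frac{1}{V 2E} \prod_{ferm} (2m)\; |\mathcal{M}|^2 \prod_f \frac{V}{(2\pi)^3}\, d^3p_f , \notag \\
&= \frac{1}{4 E_1 E_2\, |v_{rel}|}\, (2\pi)^4 \delta^4\Big( \sum_i P_i - \sum_f P_f \Big) \prod_{ferm} (2m)\; |\mathcal{M}|^2 \prod_f \frac{d^3p_f}{(2\pi)^3\, 2E_f} , \tag{5.18}
\end{align}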

where all the volumes cancel.


The cross section is a Lorentz scalar. In Eq. (5.18) everything is manifestly Lorentz invariant except
the flux term, that we have to specify. In fact

\[ E_1 E_2\, |v_{rel}| = E_1 E_2\, |\mathbf{v}_1 - \mathbf{v}_2| = E_1 E_2 \left| \frac{\mathbf{p}_1}{E_1} - \frac{\mathbf{p}_2}{E_2} \right| . \tag{5.19} \]

In the frame in which particle 2 is at rest we have p2 = 0, E2 = m2 and therefore

\[ E_1 E_2\, |v_{rel}| = E_1 m_2\, \frac{|\mathbf{p}_1|}{E_1} = m_2\, |\mathbf{p}_1| = m_2 \sqrt{E_1^2 - m_1^2} = \sqrt{m_2^2 E_1^2 - m_1^2 m_2^2} = \sqrt{(P_{1\mu} P_2^\mu)^2 - m_1^2 m_2^2} , \tag{5.20} \]
which is now written in a manifestly covariant way.
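
The equality between the frame-dependent flux factor of Eq. (5.19) and the invariant of Eq. (5.20) can also be checked numerically outside the rest frame of particle 2. The Python sketch below is an added illustration, assuming collinear momenta along a single axis (as in the laboratory and c.m. frames); the function names and the numerical values are hypothetical.

```python
import math


def flux_direct(p1, m1, p2, m2):
    """E1 E2 |v1 - v2| for collinear momenta p1, p2 (signed components along one axis)."""
    e1 = math.hypot(p1, m1)  # E = sqrt(p^2 + m^2)
    e2 = math.hypot(p2, m2)
    return e1 * e2 * abs(p1 / e1 - p2 / e2)


def flux_invariant(p1, m1, p2, m2):
    """sqrt((P1.P2)^2 - m1^2 m2^2), with the metric (+, -, -, -)."""
    e1 = math.hypot(p1, m1)
    e2 = math.hypot(p2, m2)
    dot = e1 * e2 - p1 * p2
    return math.sqrt(dot * dot - (m1 * m2) ** 2)


if __name__ == "__main__":
    # Head-on collision, then particle 2 at rest (arbitrary units).
    for p1, p2 in ((3.0, -2.0), (3.0, 0.0)):
        print(flux_direct(p1, 0.5, p2, 1.2), flux_invariant(p1, 0.5, p2, 1.2))
```

For collinear momenta the two functions agree identically, since $(E_1 E_2 - p_1 p_2)^2 - m_1^2 m_2^2 = (p_1 E_2 - p_2 E_1)^2$.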
Finally
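\[ d\sigma = (2\pi)^4 \delta^4\Big( \sum_i P_i - \sum_f P_f \Big)\, \frac{1}{4 \sqrt{(P_{1\mu} P_2^\mu)^2 - m_1^2 m_2^2}} \prod_{ferm} (2m)\; |\mathcal{M}|^2 \prod_f \frac{d^3p_f}{(2\pi)^3\, 2E_f} . \tag{5.21} \]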

5.3 Decay Rate


The interaction can cause the decay of a particle, that in the free theory would be stable. This can
happen if the kinematic constraints are fulfilled. The process to consider is now a process 1 → n and
the initial state is constituted by one particle.
We define the decay rate as the probability density per unit time to have a certain final state starting
with the initial state constituted by the particle that decays (the decaying particle has momentum
P µ = (E, p)):

\[ d\Gamma = dw_{fi} = \frac{1}{2E}\, (2\pi)^4 \delta^4\Big( P - \sum_f P_f \Big) \prod_{ferm} (2m)\; |\mathcal{M}|^2 \prod_f \frac{d^3p_f}{(2\pi)^3\, 2E_f} . \tag{5.22} \]

Some comments are in order:

• The decay rate is, again, independent of the volume.

• The decay rate in Eq. (5.22) is the “partial decay rate” of the decaying particle into a certain final
state. It is governed by the matrix element $|\mathcal{M}|^2$. To understand this better, consider the decay of a
Z boson in the Standard Model. The Z can decay into different final states. We can for instance
compute the decay rate of $Z \to e^+ e^-$. This would be
\[ d\Gamma_{Z\to e^+ e^-} = \frac{1}{2E}\, (2\pi)^4 \delta^4(P - P_{e^-} - P_{e^+}) \prod_{ferm} (2m)\; |\mathcal{M}_{Z\to e^+ e^-}|^2\, \frac{d^3p_{e^-}}{(2\pi)^3\, 2E_{e^-}}\, \frac{d^3p_{e^+}}{(2\pi)^3\, 2E_{e^+}} . \tag{5.23} \]
To obtain the decay rate in this channel, we have to integrate over the whole phase space,
\[ \Gamma_{Z\to e^+ e^-} = \int d\Gamma_{Z\to e^+ e^-} , \tag{5.24} \]
which is a “partial decay rate” because it involves a single channel. The Z boson can also decay
into other lepton pairs or quark pairs. Therefore, if we sum over all the possibilities that the
interaction we are considering allows, we obtain the total decay rate
\[ \Gamma = \sum_f \Gamma_f . \tag{5.25} \]
The ratio
\[ B_f = \frac{\Gamma_f}{\Gamma} \tag{5.26} \]
is called the “branching ratio” and it gives the probability of finding the state $f$ among the
possible decay products.
• All the pieces of formula (5.22) are Lorentz invariant except the term $\frac{1}{2E}$. In fact $\Gamma$ is not a
Lorentz scalar, but it transforms as the inverse of the temporal component of a four-vector. In
the frame in which the decaying particle is at rest, this factor becomes $\frac{1}{2M}$, where $M$ is the mass
of the particle. In a generic frame in which the decaying particle has velocity $\beta$ we have
\[ E = \gamma M = \frac{M}{\sqrt{1 - \beta^2}} \tag{5.27} \]
and therefore the rate in that frame is smaller than the one in the rest frame of the decaying
particle by a factor $1/\gamma$:
\[ \Gamma_E = \frac{1}{\gamma}\, \Gamma_M . \tag{5.28} \]
The lifetime of the particle, which is the inverse of the total rate, $\tau = \frac{1}{\Gamma}$, is therefore longer in
the frame in which the particle has velocity $\beta$ (time dilation).

• In the rest frame we have
\[ \delta^4\Big( P - \sum_f P_f \Big) \to \delta^4\Big( M - \sum_f P_f \Big) = \delta\Big( M - \sum_f E_f \Big)\, \delta^3\Big( \sum_f \mathbf{p}_f \Big) . \tag{5.29} \]
Therefore
\[ M = \sum_f E_f = \sum_f \sqrt{\mathbf{p}_f^2 + m_f^2} . \tag{5.30} \]
This means that, in the case in which the decay products are massive, we must have $M \geq \sum_f m_f$.
The energy at disposal for the decay products is at most $M$. In the limiting case in
which the decay products are also produced at rest, we have $\mathbf{p}_f = 0$ and $M = \sum_f m_f$; otherwise
the energy $M$ has to go partly into the masses of the particles produced and partly into their
momenta.
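
The relations (5.25)–(5.28) are simple enough to be illustrated with a few lines of Python. The sketch below is an added example: the function names are hypothetical, and the channel labels and partial rates are made-up numbers in arbitrary units, not physical values.

```python
import math


def lorentz_gamma(beta):
    """gamma = 1 / sqrt(1 - beta^2), with beta in units of c."""
    return 1.0 / math.sqrt(1.0 - beta * beta)


def rate_in_flight(rate_rest, beta):
    """Eq. (5.28): the rate seen in a frame where the particle has velocity beta."""
    return rate_rest / lorentz_gamma(beta)


def branching_ratios(partial_rates):
    """Eqs. (5.25)-(5.26): B_f = Gamma_f / sum_f Gamma_f for a dict of partial rates."""
    total = sum(partial_rates.values())
    return {channel: rate / total for channel, rate in partial_rates.items()}


if __name__ == "__main__":
    # Hypothetical partial rates in arbitrary units.
    rates = {"channel_a": 1.0, "channel_b": 3.0}
    print(branching_ratios(rates))
    total_rest = sum(rates.values())
    # The lifetime 1 / Gamma grows by the factor gamma in flight.
    print(1.0 / total_rest, 1.0 / rate_in_flight(total_rest, 0.6))
```

By construction the branching ratios sum to one, and the lifetime in flight is the rest-frame lifetime multiplied by $\gamma$.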

5.3.1 Two-body phase space
The part of both the cross section and the decay width formulas that has to do with the differentials
in the final state momenta is called n-body phase space:
\[ d\Phi^{(n)} = (2\pi)^4 \delta^4\Big( \sum_i P_i - \sum_f P_f \Big) \prod_f \frac{d^3p_f}{(2\pi)^3\, 2E_f} . \tag{5.31} \]

Of particular importance is the two-body phase space. If we consider $f = 3, 4$ and $\sum_i P_i = P$, then
\[ d\Phi^{(2)} = (2\pi)^4 \delta^4(P - P_3 - P_4)\, \frac{d^3p_3}{(2\pi)^3\, 2E_3}\, \frac{d^3p_4}{(2\pi)^3\, 2E_4} . \tag{5.32} \]

Let us consider, for instance, the case in which a particle of mass M decays into two particles of masses
m3 and m4 . We can calculate dΦ(2) in the rest frame of the decaying particle. We have

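\begin{align}
d\Phi^{(2)} &= (2\pi)^4 \delta^4(M - P_3 - P_4)\, \frac{d^3p_3}{(2\pi)^3\, 2E_3}\, \frac{d^3p_4}{(2\pi)^3\, 2E_4} , \tag{5.33} \\
&= \delta(M - E_3 - E_4)\, \delta^3(\mathbf{p}_3 + \mathbf{p}_4)\, \frac{d^3p_3\, d^3p_4}{(2\pi)^2\, 4 E_3 E_4} . \tag{5.34}
\end{align}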

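Now suppose we have to integrate over the whole phase space. We can integrate first in $\mathbf{p}_4$ using
the delta function ($\mathbf{p}_4 = -\mathbf{p}_3$):
\[ d\Phi^{(2)} = \frac{1}{(2\pi)^2}\, \delta(M - E_3 - E_4)\, \frac{d^3p_3}{4 E_3 E_4} , \tag{5.35} \]
where now
\[ E_3 = \sqrt{\mathbf{p}_3^2 + m_3^2} , \quad \text{and} \quad E_4 = \sqrt{\mathbf{p}_3^2 + m_4^2} , \tag{5.36} \]
since we have to replace everywhere $\mathbf{p}_4$ with $-\mathbf{p}_3$.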
Now we have to integrate in d3 p3 . We can write

d3 p3 = p23 dp3 dΩ = p23 dp3 dφd cos θ , (5.37)

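where $d\Omega = d\phi\, d\cos\theta$ is the solid angle and where $p_3 > 0$ is the modulus of $\mathbf{p}_3$. We can integrate in
$dp_3$ as follows:
\[ d\Phi^{(2)} = \frac{d\Omega}{(2\pi)^2} \int_0^\infty dp_3\, \frac{p_3^2}{4 \sqrt{p_3^2 + m_3^2}\, \sqrt{p_3^2 + m_4^2}}\; \delta\Big( M - \sqrt{p_3^2 + m_3^2} - \sqrt{p_3^2 + m_4^2} \Big) . \tag{5.38} \]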

By the properties of the delta function,
\[ \delta(f(x)) = \frac{1}{|f'(x_0)|}\, \delta(x - x_0) , \tag{5.39} \]
where $x_0$ is a zero of $f(x)$. In the phase space $p_3 \geq 0$ we have a single zero, which is
\[ p_3 = \frac{1}{2M} \sqrt{M^4 + m_3^4 + m_4^4 - 2 M^2 m_3^2 - 2 M^2 m_4^2 - 2 m_3^2 m_4^2} = \frac{1}{2M} \sqrt{\lambda(M^2, m_3^2, m_4^2)} . \tag{5.40} \]
Therefore, in the end we have (substituting the root in Eq. (5.40) into the square roots and
simplifying)
\[ d\Phi^{(2)} = \frac{d\Omega}{32 \pi^2 M^2} \sqrt{\lambda(M^2, m_3^2, m_4^2)} . \tag{5.41} \]
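
The root in Eq. (5.40) can be checked numerically: inserting it into the two energies must reproduce $E_3 + E_4 = M$. The following Python sketch is an added illustration; the function names and the sample masses are hypothetical, with all quantities in the same arbitrary units.

```python
import math


def kallen(a, b, c):
    """Kallen function: lambda(a, b, c) = a^2 + b^2 + c^2 - 2ab - 2ac - 2bc."""
    return a * a + b * b + c * c - 2.0 * a * b - 2.0 * a * c - 2.0 * b * c


def two_body_momentum(mass, m3, m4):
    """Modulus of p3 in the rest frame of the decaying particle, Eq. (5.40)."""
    if mass < m3 + m4:
        raise ValueError("decay kinematically forbidden: M < m3 + m4")
    lam = kallen(mass * mass, m3 * m3, m4 * m4)
    return math.sqrt(max(lam, 0.0)) / (2.0 * mass)


def phase_space_per_solid_angle(mass, m3, m4):
    """Coefficient of dOmega in Eq. (5.41): sqrt(lambda) / (32 pi^2 M^2)."""
    lam = kallen(mass * mass, m3 * m3, m4 * m4)
    return math.sqrt(max(lam, 0.0)) / (32.0 * math.pi**2 * mass * mass)


if __name__ == "__main__":
    M, m3, m4 = 10.0, 3.0, 4.0
    p = two_body_momentum(M, m3, m4)
    # Energy conservation check: E3 + E4 should reproduce M.
    print(p, math.hypot(p, m3) + math.hypot(p, m4))
```

The coefficient of $d\Omega$ equals $p_3 / (16 \pi^2 M)$, which is the form obtained directly from the delta-function integration.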

In the case in which m3 = m4 = m the formula simplifies considerably:
\[ d\Phi^{(2)} = \frac{d\Omega}{32 \pi^2} \sqrt{1 - \frac{4 m^2}{M^2}} . \tag{5.42} \]
The same formula holds, mutatis mutandis, for the $2 \to 2$ scattering, in which we calculate the
cross section in the c.m. frame. If $P_1 + P_2 \to P_3 + P_4$, in the c.m. frame we have
\[ P_1^\mu = (E_1, \mathbf{p}) , \quad P_2^\mu = (E_2, -\mathbf{p}) , \tag{5.43} \]
where $E_1 = \sqrt{\mathbf{p}^2 + m_1^2}$ and $E_2 = \sqrt{\mathbf{p}^2 + m_2^2}$. If we define
\[ S = (P_1 + P_2)^2 = (E_1 + E_2)^2 , \tag{5.44} \]
the energy at disposal for the reaction is $\sqrt{S}$ (which corresponds to $M$ in the case of the decay of one
particle). For the four-momenta 3 and 4 we have
\[ P_3^\mu = (E_3, \mathbf{p}_3) , \quad P_4^\mu = (E_4, -\mathbf{p}_3) , \tag{5.45} \]
where $E_3 = \sqrt{\mathbf{p}_3^2 + m_3^2}$ and $E_4 = \sqrt{\mathbf{p}_3^2 + m_4^2}$. Therefore, with respect to the case of a decay we just
have to substitute $M$ with $\sqrt{S}$:
\[ d\Phi^{(2)} = \frac{d\Omega}{32 \pi^2 S} \sqrt{\lambda(S, m_3^2, m_4^2)} , \tag{5.46} \]
in which we have to remember that $\mathbf{p}_4 = -\mathbf{p}_3$ and
\[ p_3 = \frac{1}{2\sqrt{S}} \sqrt{S^2 + m_3^4 + m_4^4 - 2 S m_3^2 - 2 S m_4^2 - 2 m_3^2 m_4^2} . \tag{5.47} \]

5.4 The process e+ + e− → µ+ + µ−


In this section we consider the process e+ + e− → µ+ + µ− . The cross section is given by

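\[ d\sigma_{e^+ e^- \to \mu^+ \mu^-} = (2\pi)^4 \delta^4(p_1 + p_2 - p_3 - p_4)\, \frac{(2 m_e)^2 (2 m_\mu)^2}{4 \sqrt{(p_1 \cdot p_2)^2 - m_1^2 m_2^2}}\; |\mathcal{M}|^2\, \frac{d^3p_3}{(2\pi)^3\, 2E_3}\, \frac{d^3p_4}{(2\pi)^3\, 2E_4} , \tag{5.48} \]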

where me and mµ are the masses of the electron and of the muon respectively. Let us now concentrate
on the different pieces of the calculation starting from the modulus squared of the transition amplitude.

5.4.1 Modulus Squared of the Transition Amplitude


In the Standard Model there are three families of leptons; in this section we consider only the first two:
electrons and muons. They differ in mass, $m_e \sim 0.5$ MeV and $m_\mu \sim 105$ MeV, but they have the same
electric charge.
The interaction Lagrangian density is
\[ \mathcal{L}_{int} = -e\, :\!\left( \bar\psi_e \slashed{A} \psi_e + \bar\psi_\mu \slashed{A} \psi_\mu \right)\!: \tag{5.49} \]

We consider the second order expansion of the S matrix. The T -product inside the integral reads:

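\begin{align}
&\frac{(-ie)^2}{2}\, T\Big\{ :\!\big( \bar\psi_e \slashed{A} \psi_e + \bar\psi_\mu \slashed{A} \psi_\mu \big)_{X_1}\!:\; :\!\big( \bar\psi_e \slashed{A} \psi_e + \bar\psi_\mu \slashed{A} \psi_\mu \big)_{X_2}\!: \Big\} = \notag \\
&\frac{(-ie)^2}{2} \Big\{ T\Big[ :\!\big( \bar\psi_e \slashed{A} \psi_e \big)_{X_1}\!:\; :\!\big( \bar\psi_e \slashed{A} \psi_e \big)_{X_2}\!: \Big] + T\Big[ :\!\big( \bar\psi_\mu \slashed{A} \psi_\mu \big)_{X_1}\!:\; :\!\big( \bar\psi_\mu \slashed{A} \psi_\mu \big)_{X_2}\!: \Big] \notag \\
&\qquad + T\Big[ :\!\big( \bar\psi_e \slashed{A} \psi_e \big)_{X_1}\!:\; :\!\big( \bar\psi_\mu \slashed{A} \psi_\mu \big)_{X_2}\!: \Big] + T\Big[ :\!\big( \bar\psi_\mu \slashed{A} \psi_\mu \big)_{X_1}\!:\; :\!\big( \bar\psi_e \slashed{A} \psi_e \big)_{X_2}\!: \Big] \Big\} . \tag{5.50}
\end{align}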

In order to evaluate the corresponding matrix elements, we apply Wick's theorem. Operators
belonging to different fields cannot be contracted. Moreover, we do not have to consider contractions
between two operators evaluated at the same point. Therefore, the only possibility consists in contracting
the photon field. The four terms above have non-vanishing matrix elements for different
initial and final states. The first and the second terms in Eq. (5.50) represent electron-positron to
electron-positron and muon-antimuon to muon-antimuon scattering processes, respectively. We are
interested, instead, in electron-positron to muon-antimuon scattering, which is represented by the third
and fourth terms of Eq. (5.50). Considering as initial state $|e^+ e^-\rangle$ and as final state $|\mu^+ \mu^-\rangle$, these two
terms give the following contributions:

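\begin{align}
&\frac{(-ie)^2}{2} \Big\{ T\Big[ :\!\big( \bar\psi_e \slashed{A} \psi_e \big)_{X_1}\!:\; :\!\big( \bar\psi_\mu \slashed{A} \psi_\mu \big)_{X_2}\!: \Big] + T\Big[ :\!\big( \bar\psi_\mu \slashed{A} \psi_\mu \big)_{X_1}\!:\; :\!\big( \bar\psi_e \slashed{A} \psi_e \big)_{X_2}\!: \Big] \Big\} = \notag \\
&\frac{(-ie)^2}{2} \Big[ :\!\big( \bar\psi_e \slashed{A} \psi_e \big)_{X_1} \big( \bar\psi_\mu \slashed{A} \psi_\mu \big)_{X_2}\!: + :\!\big( \bar\psi_\mu \slashed{A} \psi_\mu \big)_{X_1} \big( \bar\psi_e \slashed{A} \psi_e \big)_{X_2}\!: \Big] , \tag{5.51}
\end{align}
where in each term of the second line the two photon fields $A(X_1)$ and $A(X_2)$ are contracted.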

We have to select, in the first contribution, the annihilation of an electron and a positron in X1
and the creation of a muon and an anti-muon in X2 , while, in the second contribution, the annihilation
of an electron and a positron in X2 and the creation of a muon and an anti-muon in X1 . These two
contributions can be represented by the following Feynman diagrams (in X space):

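[Figure: the two Feynman diagrams in $X$ space. In both, the incoming $e^-$ and $e^+$ annihilate at one vertex and the outgoing $\mu^-$ and $\mu^+$ are created at the other, the two vertices being joined by the photon line; the diagrams differ by the exchange of the vertex labels $X_1$ and $X_2$.]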

When we integrate in X1 and X2 , if we exchange X1 with X2 in the second term, we find the same
contribution coming from the first term, that therefore has to be considered twice:
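\[ S^{(2)} = (-ie)^2 \int d^4X_1\, d^4X_2\; :\!\big( \bar\psi_e \slashed{A} \psi_e \big)_{X_1} \big( \bar\psi_\mu \slashed{A} \psi_\mu \big)_{X_2}\!: \; , \tag{5.52} \]
with the two photon fields contracted.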

Moving to momentum space we have then to consider the following Feynman diagram:

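[Figure: the Feynman diagram in momentum space. The incoming $e^-$ (momentum $p_1$) and $e^+$ (momentum $p_2$) annihilate into a photon of momentum $p_1 + p_2$, which produces the outgoing $\mu^-$ (momentum $p_3$) and $\mu^+$ (momentum $p_4$).]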

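The matrix element $\mathcal{M}$ and its complex conjugate are given by
\begin{align}
\mathcal{M} &= \bar{u}(p_3, n_3)_i\, \big( -ie\gamma_\nu \big)_{ij}\, v(p_4, n_4)_j\; \frac{-i \eta^{\mu\nu}}{(p_1 + p_2)^2}\; \bar{v}(p_2, n_2)_k\, \big( -ie\gamma_\mu \big)_{kl}\, u(p_1, n_1)_l , \tag{5.53} \\
&= i e^2\, \bar{u}_{3i}\, (\gamma_\nu)_{ij}\, v_{4j}\; \frac{1}{(p_1 + p_2)^2}\; \bar{v}_{2k}\, (\gamma^\nu)_{kl}\, u_{1l} , \tag{5.54} \\
\mathcal{M}^* &= -i e^2\, \bar{v}_{4j'}\, (\gamma_\rho)_{j'i'}\, u_{3i'}\; \frac{1}{(p_1 + p_2)^2}\; \bar{u}_{1l'}\, (\gamma^\rho)_{l'k'}\, v_{2k'} , \tag{5.55}
\end{align}
where we wrote the products like $\bar{u}_3 \gamma_\nu v_4$ making the components explicit. Finally,
\[ |\mathcal{M}|^2 = \frac{e^4}{(p_1 + p_2)^4} \Big[ u_{3i'} \bar{u}_{3i}\, (\gamma_\nu)_{ij}\, v_{4j} \bar{v}_{4j'}\, (\gamma_\rho)_{j'i'} \Big] \Big[ v_{2k'} \bar{v}_{2k}\, (\gamma^\nu)_{kl}\, u_{1l} \bar{u}_{1l'}\, (\gamma^\rho)_{l'k'} \Big] , \tag{5.56} \]
where we grouped together the spinors that refer to the same external momentum.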
The expression (5.56) has to be evaluated according to what we intend to measure experimentally.
Very often we are interested in unpolarized cross sections. Since the spin state of the final state is
not observed, quantum mechanically we have to sum over the final state spins. Moreover, the
same final state can be reached from any spin configuration of the initial particles 1 and 2. Therefore, we
can also sum over the initial state spins, provided that we divide by the number of spin states available. In the
case of two fermions in the initial state, we have to consider 2 states for each particle and therefore an
overall factor 1/4:
\[ |\mathcal{M}|^2 \Longrightarrow \frac{1}{4} \sum_{n, n'} |\mathcal{M}|^2 ; \tag{5.57} \]
this means: “sum over the final state spins” and “average over the initial state spins”.
Since we have a $\sum_{n, n'}$, in Eq. (5.56) we can recognize the polarization sums:
\[ \sum_n u(p, n) \bar{u}(p, n) = \frac{\slashed{p} + m}{2m} , \qquad \sum_n v(p, n) \bar{v}(p, n) = \frac{\slashed{p} - m}{2m} . \tag{5.58} \]

We have
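\begin{align}
&\sum_{n, n'} \Big[ u_{3i'} \bar{u}_{3i}\, \gamma_{\nu\, ij}\, v_{4j} \bar{v}_{4j'}\, \gamma_{\rho\, j'i'} \Big] \Big[ v_{2k'} \bar{v}_{2k}\, \gamma^\nu_{kl}\, u_{1l} \bar{u}_{1l'}\, \gamma^\rho_{l'k'} \Big] = \notag \\
&\quad \left( \frac{\slashed{p}_3 + m_\mu}{2 m_\mu} \right)_{i'i} (\gamma_\nu)_{ij} \left( \frac{\slashed{p}_4 - m_\mu}{2 m_\mu} \right)_{jj'} (\gamma_\rho)_{j'i'} \left( \frac{\slashed{p}_2 - m_e}{2 m_e} \right)_{k'k} (\gamma^\nu)_{kl} \left( \frac{\slashed{p}_1 + m_e}{2 m_e} \right)_{ll'} (\gamma^\rho)_{l'k'} , \tag{5.59} \\
&\quad = \mathrm{tr}\left[ \frac{\slashed{p}_3 + m_\mu}{2 m_\mu}\, \gamma_\nu\, \frac{\slashed{p}_4 - m_\mu}{2 m_\mu}\, \gamma_\rho \right] \mathrm{tr}\left[ \frac{\slashed{p}_2 - m_e}{2 m_e}\, \gamma^\nu\, \frac{\slashed{p}_1 + m_e}{2 m_e}\, \gamma^\rho \right] , \tag{5.60}
\end{align}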
which is the product of two traces on the Dirac indices. Therefore, we have:
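\begin{align}
\overline{|\mathcal{M}|^2} &= \frac{1}{4} \sum_{n, n'} |\mathcal{M}|^2 , \tag{5.61} \\
&= \frac{e^4}{4 (p_1 + p_2)^4}\, \mathrm{tr}\left[ \frac{\slashed{p}_3 + m_\mu}{2 m_\mu}\, \gamma_\nu\, \frac{\slashed{p}_4 - m_\mu}{2 m_\mu}\, \gamma_\rho \right] \mathrm{tr}\left[ \frac{\slashed{p}_2 - m_e}{2 m_e}\, \gamma^\nu\, \frac{\slashed{p}_1 + m_e}{2 m_e}\, \gamma^\rho \right] , \tag{5.62} \\
&= \frac{e^4}{64\, m_e^2 m_\mu^2\, (p_1 + p_2)^4}\, \mathrm{tr}\Big[ (\slashed{p}_3 + m_\mu) \gamma_\nu (\slashed{p}_4 - m_\mu) \gamma_\rho \Big]\, \mathrm{tr}\Big[ (\slashed{p}_1 + m_e) \gamma^\rho (\slashed{p}_2 - m_e) \gamma^\nu \Big] . \tag{5.63}
\end{align}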

Let us evaluate the two traces, remembering that the trace of an odd number of γ matrices is zero.
   
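\begin{align}
Tr_1 &= \mathrm{tr}\Big[ (\not{p}_3 + m_\mu)\gamma_\nu (\not{p}_4 - m_\mu)\gamma_\rho \Big] = \mathrm{tr}\Big[ \not{p}_3 \gamma_\nu \not{p}_4 \gamma_\rho - m_\mu^2\, \gamma_\nu \gamma_\rho \Big] , \tag{5.64} \\
&= 4 p_{3\nu} p_{4\rho} + 4 p_{3\rho} p_{4\nu} - 4 (p_3 \cdot p_4)\, g_{\nu\rho} - 4 m_\mu^2\, g_{\nu\rho} , \tag{5.65} \\
Tr_2 &= \mathrm{tr}\Big[ (\not{p}_1 + m_e)\gamma^\rho (\not{p}_2 - m_e)\gamma^\nu \Big] = \dots = 4 p_1^{\rho} p_2^{\nu} + 4 p_1^{\nu} p_2^{\rho} - 4 (p_1 \cdot p_2)\, g^{\nu\rho} - 4 m_e^2\, g^{\nu\rho} . \tag{5.66}
\end{align}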
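The trace formula (5.65) can be cross-checked numerically with explicit matrices. The sketch below is only an illustration: it assumes the Dirac representation of the γ matrices and the metric signature (+,−,−,−), uses arbitrary test momenta, and all helper names (`slash`, `trace_explicit`, `trace_formula`, `max_deviation`) are hypothetical, not taken from the notes.

```python
# Numerical check of Eq. (5.65) with explicit 4x4 gamma matrices.
# Assumptions: Dirac representation, metric (+,-,-,-), arbitrary test momenta.

METRIC = [1.0, -1.0, -1.0, -1.0]  # diagonal of g_{mu nu}

def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def neg(m):
    return [[-x for x in row] for row in m]

ZERO2 = [[0, 0], [0, 0]]
ONE2 = [[1, 0], [0, 1]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

# Contravariant gamma^mu, then covariant gamma_mu = g_{mu mu} gamma^mu.
GAMMA_UP = [block(ONE2, ZERO2, ZERO2, neg(ONE2))] + [
    block(ZERO2, s, neg(s), ZERO2) for s in SIGMA
]
GAMMA_DN = [[[METRIC[mu] * x for x in row] for row in GAMMA_UP[mu]] for mu in range(4)]
IDENTITY = block(ONE2, ZERO2, ZERO2, ONE2)

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def lincomb(a, ca, b, cb):
    """Return ca*a + cb*b for 4x4 matrices."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(4)] for i in range(4)]

def slash(p):
    """p-slash = p^mu gamma_mu for contravariant components p^mu."""
    out = [[0] * 4 for _ in range(4)]
    for mu in range(4):
        out = lincomb(out, 1, GAMMA_DN[mu], p[mu])
    return out

def dot(p, q):
    return sum(METRIC[mu] * p[mu] * q[mu] for mu in range(4))

def trace_explicit(p3, p4, m, nu, rho):
    """tr[(p3-slash + m) gamma_nu (p4-slash - m) gamma_rho] by matrix multiplication."""
    left = lincomb(slash(p3), 1, IDENTITY, m)
    right = lincomb(slash(p4), 1, IDENTITY, -m)
    prod = matmul(matmul(matmul(left, GAMMA_DN[nu]), right), GAMMA_DN[rho])
    return sum(prod[i][i] for i in range(4))

def trace_formula(p3, p4, m, nu, rho):
    """Right-hand side of Eq. (5.65), covariant indices."""
    lo3 = [METRIC[mu] * p3[mu] for mu in range(4)]
    lo4 = [METRIC[mu] * p4[mu] for mu in range(4)]
    g = METRIC[nu] if nu == rho else 0.0
    return 4 * lo3[nu] * lo4[rho] + 4 * lo3[rho] * lo4[nu] - 4 * (dot(p3, p4) + m * m) * g

def max_deviation(p3, p4, m):
    """Largest |explicit - formula| over all 16 index pairs."""
    return max(
        abs(trace_explicit(p3, p4, m, nu, rho) - trace_formula(p3, p4, m, nu, rho))
        for nu in range(4) for rho in range(4)
    )

print(max_deviation([3.0, 0.4, -1.1, 2.0], [2.5, -0.7, 0.3, 1.2], 0.8))
```

The printed deviation should be at the level of floating-point rounding. Since the identity is purely algebraic, the test momenta do not need to be on shell.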

Therefore, the product of the two traces is

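\begin{align}
Tr_1\, Tr_2 &= 16 \Big[ (p_1 \cdot p_4)(p_2 \cdot p_3) + (p_1 \cdot p_3)(p_2 \cdot p_4) - (p_1 \cdot p_2)(p_3 \cdot p_4) \Big] - 16 m_e^2 (p_3 \cdot p_4) \notag \\
&\quad + 16 \Big[ (p_1 \cdot p_3)(p_2 \cdot p_4) + (p_1 \cdot p_4)(p_2 \cdot p_3) - (p_1 \cdot p_2)(p_3 \cdot p_4) \Big] - 16 m_e^2 (p_3 \cdot p_4) \notag \\
&\quad - 16 \Big[ (p_1 \cdot p_2)(p_3 \cdot p_4) + (p_1 \cdot p_2)(p_3 \cdot p_4) - 4 (p_1 \cdot p_2)(p_3 \cdot p_4) \Big] + 64 m_e^2 (p_3 \cdot p_4) \notag \\
&\quad - 16 m_\mu^2 (p_1 \cdot p_2) - 16 m_\mu^2 (p_1 \cdot p_2) + 64 m_\mu^2 (p_1 \cdot p_2) + 64 m_e^2 m_\mu^2 , \tag{5.67} \\
&= 32 (p_1 \cdot p_3)(p_2 \cdot p_4) + 32 (p_1 \cdot p_4)(p_2 \cdot p_3) + 32 m_e^2 (p_3 \cdot p_4) + 32 m_\mu^2 (p_1 \cdot p_2) + 64 m_e^2 m_\mu^2 . \tag{5.68}
\end{align}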

In total, we have:

\[
\overline{|\mathcal{M}|^2} = \frac{e^4}{2 m_e^2 m_\mu^2\, (p_1+p_2)^4} \Big[ (p_1 \cdot p_3)(p_2 \cdot p_4) + (p_1 \cdot p_4)(p_2 \cdot p_3) + m_e^2 (p_3 \cdot p_4) + m_\mu^2 (p_1 \cdot p_2) + 2 m_e^2 m_\mu^2 \Big] . \tag{5.69}
\]
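The index contraction that leads from (5.65) and (5.66) to the compact form (5.68), and hence to the numerator of (5.69), can be checked numerically. This is a minimal sketch under stated assumptions: metric signature (+,−,−,−), arbitrary test vectors and masses, and hypothetical function names (`tr1`, `tr2`, `contracted`, `compact`).

```python
# Contract Tr1_{nu rho} with Tr2^{nu rho} and compare with Eq. (5.68).
# Assumptions: metric (+,-,-,-); momenta and masses are arbitrary test numbers.

METRIC = [1.0, -1.0, -1.0, -1.0]

def dot(p, q):
    return sum(METRIC[m] * p[m] * q[m] for m in range(4))

def lower(p):
    return [METRIC[m] * p[m] for m in range(4)]

def tr1(p3, p4, m_mu, nu, rho):
    """Eq. (5.65), covariant indices."""
    a, b = lower(p3), lower(p4)
    g = METRIC[nu] if nu == rho else 0.0
    return 4 * a[nu] * b[rho] + 4 * a[rho] * b[nu] - 4 * (dot(p3, p4) + m_mu**2) * g

def tr2(p1, p2, m_e, nu, rho):
    """Eq. (5.66), contravariant indices."""
    g = METRIC[nu] if nu == rho else 0.0
    return 4 * p1[rho] * p2[nu] + 4 * p1[nu] * p2[rho] - 4 * (dot(p1, p2) + m_e**2) * g

def contracted(p1, p2, p3, p4, m_e, m_mu):
    """Sum over nu, rho of Tr1_{nu rho} * Tr2^{nu rho}."""
    return sum(
        tr1(p3, p4, m_mu, nu, rho) * tr2(p1, p2, m_e, nu, rho)
        for nu in range(4) for rho in range(4)
    )

def compact(p1, p2, p3, p4, m_e, m_mu):
    """Right-hand side of Eq. (5.68)."""
    return (32 * dot(p1, p3) * dot(p2, p4) + 32 * dot(p1, p4) * dot(p2, p3)
            + 32 * m_e**2 * dot(p3, p4) + 32 * m_mu**2 * dot(p1, p2)
            + 64 * m_e**2 * m_mu**2)

P1, P2 = [2.0, 0.1, 0.2, 1.5], [2.0, -0.1, -0.2, -1.5]
P3, P4 = [2.0, 1.0, 0.3, -0.4], [2.0, -1.0, -0.3, 0.4]
print(contracted(P1, P2, P3, P4, 0.3, 0.7), compact(P1, P2, P3, P4, 0.3, 0.7))
```

The two printed numbers should agree up to rounding, which confirms that the terms proportional to (p1·p2)(p3·p4) cancel in the sum.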

5.4.2 Kinematics
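In order to express the scalar products, we choose a reference frame. Since $\overline{|\mathcal{M}|^2}$ is Lorentz invariant, it is convenient to calculate it in the center of mass (c.m.) frame. In this frame we have the following situation:

[Figure: scattering in the c.m. frame. The incoming momenta p1 and p2 are back-to-back, the outgoing momenta p3 and p4 are back-to-back, and θ is the angle between p1 and p3.]

where θ is the so-called scattering angle. Therefore, we have: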

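\[
p_1^\nu = (E_1, \mathbf{p}) , \qquad p_2^\nu = (E_2, -\mathbf{p}) . \tag{5.70}
\]

Since $p_1^2 = m_e^2 = p_2^2$, it follows that $E_1 = E_2 = E$, then

\[
p_1^\nu = (E, \mathbf{p}) , \qquad p_2^\nu = (E, -\mathbf{p}) \tag{5.71}
\]

and
\[
(p_1 + p_2)^2 = 4E^2 . \tag{5.72}
\]
Also $p_3$ and $p_4$ are back-to-back and therefore if we call

\[
p_3^\nu = (E_3, \mathbf{p}') , \qquad p_4^\nu = (E_4, -\mathbf{p}') , \tag{5.73}
\]

with $p_3^2 = m_\mu^2 = p_4^2$, we must have $E_4 = E_3$. Moreover, since $p_1 + p_2 = p_3 + p_4$, the time components give

\[
p_1^0 + p_2^0 = 2E = p_3^0 + p_4^0 = 2E_3 \;\Longrightarrow\; E_3 = E . \tag{5.74}
\]

Finally
\[
p_3^\nu = (E, \mathbf{p}') , \qquad p_4^\nu = (E, -\mathbf{p}') . \tag{5.75}
\]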
The various scalar products can be expressed in terms of E, p, p′ and the scattering angle θ:

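\begin{align}
p_1 \cdot p_2 &= E^2 + p^2 , \tag{5.76} \\
p_3 \cdot p_4 &= E^2 + p'^2 , \tag{5.77} \\
p_1 \cdot p_3 &= E^2 - \mathbf{p} \cdot \mathbf{p}' = E^2 - p p' \cos\theta = p_2 \cdot p_4 , \tag{5.78} \\
p_1 \cdot p_4 &= E^2 + \mathbf{p} \cdot \mathbf{p}' = E^2 + p p' \cos\theta = p_2 \cdot p_3 , \tag{5.79}
\end{align}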
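The scalar products (5.76)-(5.79) can be reproduced by building the four-momenta explicitly in the c.m. frame. The sketch below is illustrative only: it assumes the collision axis along z and the scattering plane x-z, and the numerical inputs (energy in GeV, approximate electron and muon masses) as well as the names `cm_momenta` and `dot` are choices made for this example.

```python
import math

def dot(a, b):
    """Minkowski product with signature (+,-,-,-)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def cm_momenta(E, m_e, m_mu, theta):
    """Four-momenta in the c.m. frame; beam along z, scattering in the x-z plane."""
    p = math.sqrt(E**2 - m_e**2)     # modulus of the incoming three-momentum
    pp = math.sqrt(E**2 - m_mu**2)   # modulus of the outgoing three-momentum
    s, c = math.sin(theta), math.cos(theta)
    p1 = (E, 0.0, 0.0, p)
    p2 = (E, 0.0, 0.0, -p)
    p3 = (E, pp * s, 0.0, pp * c)
    p4 = (E, -pp * s, 0.0, -pp * c)
    return p, pp, (p1, p2, p3, p4)

# Illustrative inputs in GeV: E = 1, m_e ~ 0.000511, m_mu ~ 0.10566, theta = 0.7 rad.
E, THETA = 1.0, 0.7
p, pp, (p1, p2, p3, p4) = cm_momenta(E, 0.000511, 0.10566, THETA)

print(dot(p1, p2), E**2 + p**2)                          # Eq. (5.76)
print(dot(p3, p4), E**2 + pp**2)                         # Eq. (5.77)
print(dot(p1, p3), E**2 - p * pp * math.cos(THETA))      # Eq. (5.78)
print(dot(p1, p4), E**2 + p * pp * math.cos(THETA))      # Eq. (5.79)
```

Each printed pair should coincide up to rounding; the same holds for p2·p4 and p2·p3.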

therefore we find:
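\begin{align}
\overline{|\mathcal{M}|^2} &= \frac{e^4}{2 m_e^2 m_\mu^2\, 16 E^4} \Big[ (E^2 - p p' \cos\theta)^2 + (E^2 + p p' \cos\theta)^2 + m_e^2 (E^2 + p'^2) \notag \\
&\qquad\qquad + m_\mu^2 (E^2 + p^2) + 2 m_e^2 m_\mu^2 \Big] , \tag{5.81} \\
&= \frac{e^4}{2 m_e^2 m_\mu^2\, 16 E^4} \Big[ 2E^4 + 2 p^2 p'^2 \cos^2\theta + (m_e^2 + m_\mu^2) E^2 + m_e^2 p'^2 + m_\mu^2 p^2 + 2 m_e^2 m_\mu^2 \Big] . \tag{5.82}
\end{align}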

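In the cross section, the term $\overline{|\mathcal{M}|^2}$ is multiplied by a factor $(2m_e)^2 (2m_\mu)^2$ and then we have

\[
(2m_e)^2 (2m_\mu)^2\, \overline{|\mathcal{M}|^2} = \frac{e^4}{2E^4} \Big[ 2E^4 + 2 p^2 p'^2 \cos^2\theta + (m_e^2 + m_\mu^2) E^2 + m_e^2 p'^2 + m_\mu^2 p^2 + 2 m_e^2 m_\mu^2 \Big] . \tag{5.83}
\]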

This means that in the cross section we do not have mass terms in the denominator. Since $m_e \ll m_\mu$ (and, because $E = E_3 \geq m_\mu$, also $m_e^2 \ll E^2$), in Eq. (5.83) we can neglect terms proportional to $m_e^2$, finding a simpler formula:

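\begin{align}
(2m_e)^2 (2m_\mu)^2\, \overline{|\mathcal{M}|^2} &\approx \frac{e^4}{2E^4} \Big[ 2E^4 + 2 p^2 p'^2 \cos^2\theta + m_\mu^2 E^2 + m_\mu^2 p^2 \Big] \notag \\
&= \frac{e^4}{2E^4} \Big[ 2 \big( E^4 + E^2 p'^2 \cos^2\theta + m_\mu^2 E^2 \big) \Big] , \notag \\
&= \frac{e^4}{E^2} \Big[ E^2 + p'^2 \cos^2\theta + m_\mu^2 \Big] , \tag{5.84}
\end{align}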
since for me ∼ 0 we have m2e = 0 = E 2 − p2 and therefore p2 = E 2 .
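As a quick numerical sanity check, one can compare the full expression (5.83) with the simplified form (5.84). The sketch below strips the common factor e⁴ and assumes on-shell kinematics p² = E² − m_e², p'² = E² − m_μ²; the function names and the numerical inputs (in GeV) are illustrative choices.

```python
def full_expression(E, m_e, m_mu, cos_theta):
    """Right-hand side of Eq. (5.83) divided by e^4."""
    p2 = E**2 - m_e**2      # p^2
    pp2 = E**2 - m_mu**2    # p'^2
    bracket = (2 * E**4 + 2 * p2 * pp2 * cos_theta**2
               + (m_e**2 + m_mu**2) * E**2
               + m_e**2 * pp2 + m_mu**2 * p2 + 2 * m_e**2 * m_mu**2)
    return bracket / (2 * E**4)

def massless_electron(E, m_mu, cos_theta):
    """Right-hand side of Eq. (5.84) divided by e^4."""
    pp2 = E**2 - m_mu**2
    return (E**2 + pp2 * cos_theta**2 + m_mu**2) / E**2

# Illustrative point: E = 1 GeV, m_mu ~ 0.10566 GeV, cos(theta) = 0.3.
exact = full_expression(1.0, 0.000511, 0.10566, 0.3)
approx = massless_electron(1.0, 0.10566, 0.3)
print(exact, approx, abs(exact - approx) / exact)
```

For m_e set exactly to zero the two functions coincide; with the physical electron mass the relative difference is of order m_e²/E².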

5.4.3 Flux Factor


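The calculation of the flux factor in our case gives the following result:
\[
4 \sqrt{(p_1 \cdot p_2)^2 - m_e^4} \approx 4 \sqrt{(p_1 \cdot p_2)^2} = 8E^2 , \tag{5.85}
\]
where both incoming particles have mass $m_e$ and, for $m_e \sim 0$, we used $p_1 \cdot p_2 = E^2 + p^2 = 2E^2$.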

5.4.4 Cross Section


Finally, merging the various pieces together we find:

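\begin{align}
d\sigma &= \frac{1}{8E^2}\, (2\pi)^4 \delta^4(p_1 + p_2 - p_3 - p_4)\, \frac{e^4}{E^2} \Big[ E^2 + p'^2 \cos^2\theta + m_\mu^2 \Big] \frac{d^3p_3}{(2\pi)^3\, 2E_3}\, \frac{d^3p_4}{(2\pi)^3\, 2E_4} , \tag{5.86} \\
&= \frac{e^4}{128 \pi^2 E^6}\, \delta^4(p_1 + p_2 - p_3 - p_4) \Big[ E^2 + p'^2 \cos^2\theta + m_\mu^2 \Big] d^3p_3\, d^3p_4 , \tag{5.87} \\
&= \frac{\alpha^2}{8E^6}\, \delta^4(p_1 + p_2 - p_3 - p_4) \Big[ E^2 + p'^2 \cos^2\theta + m_\mu^2 \Big] d^3p_3\, d^3p_4 , \tag{5.88}
\end{align}
where we introduced the fine structure constant $\alpha = e^2/(4\pi)$.
If we use the $\delta^4(p_1 + p_2 - p_3 - p_4)$ in the $d^3p_4$ integration, we find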

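\begin{align}
d\sigma &= \frac{\alpha^2}{8E^6}\, \delta(E_1 + E_2 - E_3 - E_4) \Big[ E^2 + p'^2 \cos^2\theta + m_\mu^2 \Big] d^3p' , \tag{5.89} \\
&= \frac{\alpha^2}{8E^6}\, \delta\big( 2(E - E') \big) \Big[ E^2 + p'^2 \cos^2\theta + m_\mu^2 \Big] d^3p' , \tag{5.90} \\
&= \frac{1}{2}\, \frac{\alpha^2}{8E^6}\, \delta(E - E') \Big[ E^2 + p'^2 \cos^2\theta + m_\mu^2 \Big] d^3p' . \tag{5.91}
\end{align}
Here $E' = \sqrt{p'^2 + m_\mu^2}$ denotes the common energy $E_3 = E_4$ of the two outgoing muons, which is still to be fixed by the remaining one-dimensional δ function.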
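The factor 1/2 in (5.91) comes from the scaling property of the Dirac delta. As a short reminder, written for a generic test function f and a real constant a ≠ 0 (symbols introduced here only for illustration):

```latex
\begin{align}
\int f(x)\, \delta(a x)\, dx
  &= \frac{1}{|a|} \int f\!\left(\frac{y}{a}\right) \delta(y)\, dy
   = \frac{1}{|a|}\, f(0)
  \quad\Longrightarrow\quad
  \delta(a x) = \frac{1}{|a|}\, \delta(x) , \\
\delta\big( 2(E - E') \big) &= \frac{1}{2}\, \delta(E - E') .
\end{align}
```

The second line is the first one with a = 2 and x = E − E'.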
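We can express $d^3p'$ in terms of the solid angle

\[
d^3p' = p'^2\, dp'\, d\Omega \;\; (= p'^2\, dp'\, d\cos\theta\, d\varphi) \tag{5.92}
\]

and calculate the differential cross section with respect to the solid angle $d\Omega$:

\[
\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{16E^6} \int \delta(E - E') \Big[ E^2 + p'^2 \cos^2\theta + m_\mu^2 \Big] p'^2\, dp' . \tag{5.93}
\]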

Since $p'^2 = E'^2 - m_\mu^2$, we have

\[
\frac{dp'}{dE'} = \frac{E'}{p'} = \frac{E'}{\sqrt{E'^2 - m_\mu^2}} . \tag{5.94}
\]

Finally,

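\begin{align}
\frac{d\sigma}{d\Omega} &= \frac{\alpha^2}{16E^6} \int \delta(E - E') \Big[ E^2 + p'^2 \cos^2\theta + m_\mu^2 \Big] p' E'\, dE' , \tag{5.95} \\
&= \frac{\alpha^2}{16E^6} \Big[ E^2 + p'^2 \cos^2\theta + m_\mu^2 \Big] p' E , \tag{5.96} \\
&= \frac{\alpha^2}{16E^4}\, \frac{p'}{E} \Big[ E^2 + p'^2 \cos^2\theta + m_\mu^2 \Big] , \tag{5.97}
\end{align}
where we used the fact that now $p' = \sqrt{E^2 - m_\mu^2}$.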
To find the total cross section we must integrate over $d\Omega$. For simplicity let us consider the ultra-relativistic limit, $E^2 \gg m_\mu^2$. Therefore, in Eq. (5.97) we can neglect the term with $m_\mu^2$ ($m_\mu^2 \sim 0 \Longrightarrow p' = E$), getting
\[
\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{16E^2} \left( 1 + \cos^2\theta \right) . \tag{5.98}
\]
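To get a feeling for how good the ultra-relativistic approximation is, the two expressions (5.97) and (5.98) can be compared numerically. The sketch below is an illustration under assumptions: it uses the approximate low-energy value α ≈ 1/137.036, energies in GeV, an approximate muon mass of 0.10566 GeV, and hypothetical function names.

```python
import math

ALPHA = 1 / 137.036  # fine structure constant, approximate low-energy value

def dsigma_domega(E, m_mu, cos_theta):
    """Eq. (5.97): massless electrons, massive muons (GeV^-2 if E is in GeV)."""
    pp = math.sqrt(E**2 - m_mu**2)  # outgoing momentum p'
    return ALPHA**2 / (16 * E**4) * (pp / E) * (E**2 + pp**2 * cos_theta**2 + m_mu**2)

def dsigma_domega_ultrarelativistic(E, cos_theta):
    """Eq. (5.98): limit E^2 >> m_mu^2."""
    return ALPHA**2 / (16 * E**2) * (1 + cos_theta**2)

M_MU = 0.10566  # GeV, approximate
for energy in (0.11, 0.2, 1.0, 100.0):
    ratio = dsigma_domega(energy, M_MU, 0.5) / dsigma_domega_ultrarelativistic(energy, 0.5)
    print(energy, ratio)
```

The ratio should tend to 1 as E grows, while close to threshold (E just above m_μ) the factor p'/E suppresses the cross section.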
Then, we have

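\begin{align}
\sigma &= \frac{\alpha^2}{16E^2} \int \left( 1 + \cos^2\theta \right) d\Omega , \tag{5.99} \\
&= \frac{\alpha^2}{16E^2}\, 2\pi \int_{-1}^{1} \left( 1 + \cos^2\theta \right) d\cos\theta , \tag{5.100} \\
&= \frac{\alpha^2 \pi}{3E^2} = 5.6 \cdot 10^{-5}\, \frac{1}{E^2} . \tag{5.101}
\end{align}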
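The angular integration in (5.99)-(5.101) is elementary, but it can also be confirmed numerically. The following sketch uses a simple midpoint rule in x = cos θ; the value α ≈ 1/137.036 and the names `angular_integral` and `COEFFICIENT` are assumptions made for this example.

```python
import math

ALPHA = 1 / 137.036  # fine structure constant, approximate low-energy value

def angular_integral(n=2000):
    """Midpoint rule for 2*pi * int_{-1}^{1} (1 + x^2) dx, with x = cos(theta)."""
    h = 2.0 / n
    total = sum(1 + (-1 + (i + 0.5) * h) ** 2 for i in range(n))
    return 2 * math.pi * total * h

# Coefficient of 1/E^2 in Eq. (5.101): alpha^2/16 times the angular integral.
COEFFICIENT = ALPHA**2 / 16 * angular_integral()
print(angular_integral(), 16 * math.pi / 3)
print(COEFFICIENT, math.pi * ALPHA**2 / 3)
```

The integral of (1 + x²) over [−1, 1] is 8/3, so the solid-angle integral equals 16π/3 and the coefficient reduces to πα²/3, which rounds to the 5.6·10⁻⁵ quoted in (5.101).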
The cross section is now (in natural units) expressed in Energy$^{-2}$. If $E \sim 100$ GeV, we would obtain

\[
\sigma = 5.6 \cdot 10^{-9}\ \mathrm{GeV}^{-2} . \tag{5.102}
\]

If we want to express the cross section in barn, we have to remember that

\[
1\ \mathrm{GeV}^{-2} = 0.389\ \mathrm{mbarn} . \tag{5.103}
\]

Therefore:
\[
\sigma = 5.6 \cdot 10^{-9}\ \mathrm{GeV}^{-2} = 2.18 \cdot 10^{-9}\ \mathrm{mbarn} = 2.18\ \mathrm{pbarn} . \tag{5.104}
\]
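The unit conversion can be wrapped in a small helper. This is a sketch under assumptions: α ≈ 1/137.036, the conversion factor 0.389 mbarn per GeV⁻² taken from (5.103), the ultra-relativistic formula σ = πα²/(3E²), and a hypothetical function name `sigma_pbarn`.

```python
import math

ALPHA = 1 / 137.036        # fine structure constant, approximate low-energy value
GEV2_TO_MBARN = 0.389      # 1 GeV^-2 expressed in mbarn, as quoted in Eq. (5.103)
MBARN_TO_PBARN = 1e9       # 1 mbarn = 10^9 pbarn

def sigma_pbarn(energy_gev):
    """Total cross section pi*alpha^2/(3E^2) converted to pbarn; E is the beam energy in GeV."""
    sigma_gev2 = math.pi * ALPHA**2 / (3 * energy_gev**2)
    return sigma_gev2 * GEV2_TO_MBARN * MBARN_TO_PBARN

print(sigma_pbarn(100.0))
```

With the unrounded coefficient the result is about 2.17 pbarn, in agreement with the 2.18 pbarn of (5.104) once the coefficient is rounded to 5.6·10⁻⁵.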
