The CPT theorem states that under mild technical assumptions any unitary, local, Lorentz-invariant point-particle quantum field theory in flat Minkowski space is CPT invariant. This theorem was first established in the early 1950s in the context of Lagrangian field theory. A few years later, a general version of the theorem was proved within the more abstract and mathematically rigorous framework of axiomatic field theory. In the following two subsections, the line of reasoning behind each of the two versions of the proof will be reviewed with particular focus on the key ingredients on which the proofs rely.
Adopting the usual mathematical practice, we initially set up Θ to transform a physical state $|\psi\rangle$ to its CPT-conjugate state $|\bar{\psi}\rangle$ in the same physical Hilbert space: $\Theta|\psi\rangle = |\bar{\psi}\rangle$. Note that we consider a single theory that contains both states $|\psi\rangle$ and $|\bar{\psi}\rangle$. Since Θ involves charge conjugation, which connects particles and antiparticles, this means we take the theory to describe both particle and antiparticle states at the same time. In particular, we do not consider situations such as a mapping of the Pauli equation for an electron to the Pauli equation for a positron: these are two distinct models, each with its own parameters that can in principle be specified separately. The property $\Theta^2 = 1$, i.e., $\Theta = \Theta^{-1} = \Theta^\dagger$ (3), seems reasonable on physical grounds: two charge conjugations, two parity reflections, and two time reversals should leave the physics unaffected. In the last step, we have used the fact that quantum symmetry operations are known to be (anti)unitary. We finally remark that for the general form of the transformation law for quantum operators $\mathcal{O}$, we may take $\mathcal{O} \to \Theta\,\mathcal{O}\,\Theta^\dagger = \Theta^\dagger\,\mathcal{O}\,\Theta$ (4). The second equality holds on account of Equation (3).
2.1. Proof Based on Lagrangian Field Theory
The first approaches to the CPT theorem by Bell [3], Lüders [4,5], and Pauli [6], and implicitly already by Schwinger [7], are to a large extent based on the Lagrangian formalism in quantum field theory. These approaches proceed essentially by construction: all physically acceptable terms that can enter a Lagrangian density are formed. Alternatively, some of these authors work at the level of the interaction Hamiltonian or the actual field equations. In any case, it is then shown that each of these terms must necessarily be CPT symmetric. Often, this procedure is illustrated explicitly by focusing on the physically most relevant cases of a scalar particle, a spin-1/2 fermion, and a spin-1 boson: all known elementary particles fall into one of these categories.
Proceeding in this manner, we need the properties of the field operators under the reflection Θ, i.e., under the CPT transformation. For a scalar field ϕ and a vector field $A^\mu$, they read $\Theta\,\phi(x)\,\Theta^\dagger = \phi^\dagger(-x)$ (5) and $\Theta\,A^\mu(x)\,\Theta^\dagger = -A^{\mu\dagger}(-x)$ (7); the corresponding law (6) for a spinor field ψ involves a gamma matrix acting on $\psi^{\dagger T}(-x)$, where we have adopted the gamma-matrix conventions of Ref. [8]. Here, the transpose operation denoted by the superscript T is not to be confused with the time-reversal transformation; it applies only to the spinor indices, and does not refer to the entire quantum-field Hilbert space. We mention that by virtue of Equation (3), the CPT operator Θ and its Hermitian conjugate may be interchanged in Equations (5)–(7) in much the same way as in Equation (4).
The demonstration that Θ can indeed be interpreted as the CPT-transformation operator is most convincingly achieved in quantum field theory. Here, we merely motivate with a few heuristic remarks that Equations (5)–(7) indeed implement CPT conjugation. The inversion of the spacetime point, $x \to -x$, is intuitively reasonable: it arises from a PT reflection, and C would be expected to leave x unchanged. The Hermitian or complex conjugation of the field operators arises from C. Consider, for example, an electrically charged scalar field ϕ with charge q. It must couple to the photon field $A^\mu$ via a gauge-covariant derivative $D^\mu$ if the usual gauge invariance and the resultant charge conservation are to be maintained. The field equation for $\phi^\dagger$ then contains $(D^\mu)^*$ because it can be obtained via Hermitian conjugation from the field equation for ϕ [9]. Inspection of $D^\mu$ and $(D^\mu)^*$ reveals that ϕ and $\phi^\dagger$ couple with opposite signs of the charge q to the electromagnetic field, compatible with being charge conjugates of one another. The minus sign in the transformation of $A^\mu$ is also intuitively reasonable by looking at the example of classical electromagnetism. Changing the sign of source charges ($q \to -q$), parity inversion ($\vec{x} \to -\vec{x}$), and time reversal ($t \to -t$) together change the overall sign of the source current $j^\mu$, so the sign of $A^\mu$ must reverse as well [10].
Another ingredient in the construction of Lagrangians is the 4-gradient $\partial_\mu$. This operator remains unchanged under Θ: $\Theta\,\partial_\mu\,\Theta^\dagger = \partial_\mu$ (8). This behavior stems from the question of whether equations of motion of the same form, and in particular with the same derivative structure, are satisfied by the CPT conjugate fields. For example, consider
for real-valued
and
. CPT symmetry holds if
also satisfies this equation, i.e.,
. The equation
holds trivially because it follows from the original equation by renaming $x \to -x$. This is consistent with the aforementioned fact that pure changes of coordinates should not have physical effects.
For general, dynamical rank-n tensor densities $T^{\mu_1\cdots\mu_n}(x)$, the following relation holds: $\Theta\,T^{\mu_1\cdots\mu_n}(x)\,\Theta^\dagger = (-1)^n\,T^{\mu_1\cdots\mu_n\,\dagger}(-x)$ (9). Roughly speaking, tensor densities with an even (odd) number of indices are CPT even (odd). Note that this property is compatible with the above definitions (5), (7), and (8). For example, the transformation of the electromagnetic field-strength tensor $F^{\mu\nu}$ can either be obtained with Equation (9) for $n = 2$ or with Equations (7) and (8) for $F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu$, and the results agree: $\Theta\,F^{\mu\nu}(x)\,\Theta^\dagger = F^{\mu\nu\,\dagger}(-x)$ (10). Using the spin–statistics theorem, we will argue below that spinors are also compatible with the transformation law (9). Nondynamical objects with Lorentz indices do not follow this rule. For example, a constant complex-valued four-vector transforms as four complex numbers and involves complex conjugation according to Equation (11) to be discussed next.
Antilinearity of T. The time-reversal transformation is an antilinear operation [11]; it complex conjugates complex numbers z: $T\,z = z^*\,T$. The other two transformations, charge conjugation C and parity inversion P, are both linear. This means that Θ, just like T, is also antilinear: $\Theta\,z = z^*\,\Theta$ (11). This implies in particular that Lagrangian parameters, such as couplings, masses, mixing angles, entries of Dirac matrices, etc. are complex conjugated under the CPT transformation. We remark that Θ also preserves the norm of states, so that $|(\Theta\chi, \Theta\psi)| = |(\chi, \psi)|$. Antilinear operators that obey this additional condition are called antiunitary.
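The two defining properties of an antiunitary operator can be checked numerically on a toy example. The sketch below is only an illustration under stated assumptions: the two-component state space, the function name `theta`, and the particular unitary (a component swap times a phase) are hypothetical choices and are not the CPT operator of any field theory. The construction "complex conjugation followed by a unitary map" is the standard way to build an antiunitary operator.

```python
import cmath

def inner(chi, psi):
    """Scalar product (chi, psi), antilinear in the first argument."""
    return sum(c.conjugate() * p for c, p in zip(chi, psi))

def theta(psi):
    """Toy antiunitary map: complex conjugation followed by a unitary.

    The unitary (swap of the two components times a phase) is an
    arbitrary, hypothetical choice made for illustration only.
    """
    phase = cmath.exp(0.3j)
    conj = [c.conjugate() for c in psi]
    return [phase * conj[1], phase * conj[0]]

psi = [1 + 2j, 0.5 - 1j]
chi = [-0.7j, 2 + 0.1j]
z = 0.4 - 1.3j

# Antilinearity: theta(z * psi) equals z* times theta(psi).
lhs = theta([z * c for c in psi])
rhs = [z.conjugate() * c for c in theta(psi)]
antilinear_ok = all(abs(a - b) < 1e-12 for a, b in zip(lhs, rhs))

# Antiunitarity: (theta chi, theta psi) equals the complex conjugate of (chi, psi).
antiunitary_ok = abs(inner(theta(chi), theta(psi))
                     - inner(chi, psi).conjugate()) < 1e-12

print(antilinear_ok, antiunitary_ok)
```

A linear unitary map would instead satisfy `theta(z * psi) == z * theta(psi)` and leave the scalar product unchanged without conjugation; the conjugation is what causes Lagrangian parameters to be complex conjugated under CPT.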
The action of Θ and other antiunitary operators on matrix elements needs to be defined carefully, so for the moment we switch from the bra–ket notation to one that better clarifies how operators act on states: $(\chi, \mathcal{O}\psi)$. Consider, for example, an ordinary unitary transformation U on both the states ψ and χ as well as on the operator $\mathcal{O}$: $(\chi, \mathcal{O}\psi) \to (U\chi, U\mathcal{O}U^\dagger\,U\psi) = (\chi, \mathcal{O}\psi)$, which leaves the matrix element trivially unchanged. For our present purpose of searching for physical invariances, such trivial types of transformation are uninteresting. Instead, a nontrivial unitary or antiunitary transformation of a matrix element involves the states only, $(\chi, \mathcal{O}\psi) \to (\Theta\chi, \mathcal{O}\,\Theta\psi)$, consistent with the mathematical definition of a transformation on a Hilbert space. For an antiunitary transformation, this yields $(\Theta\chi, \mathcal{O}\,\Theta\psi) = (\Theta\chi, \Theta\,\bar{\mathcal{O}}\,\psi) = (\chi, \bar{\mathcal{O}}\,\psi)^* = (\psi, \bar{\mathcal{O}}^\dagger\chi)$ (12). Here, $\bar{\mathcal{O}} = \Theta^\dagger\,\mathcal{O}\,\Theta$ is the CPT-conjugate operator determined by the general rules above. The complex conjugation of the matrix element in the second step differs from the usual unitary-operator case; it follows from the mathematical theory of antilinear operators. The last step uses the usual properties of Hilbert-space scalar products. Physical observables are usually Hermitian. Moreover, they may describe a process in time, such as scattering $S$, with an initial and final state. The last step in Equation (12) then reads $(\chi, \bar{S}\,\psi)^* = (\psi, \bar{S}^\dagger\chi)$, i.e., CPT affects not only the scattering operator $S$, but also interchanges initial and final states, as expected from the time-reversal factor in CPT.
Connection between Spin and Statistics. The proof of the CPT theorem proceeds by using the connection between spin and statistics in the following way [12]. Since the Lagrangian density is a Lorentz scalar, spinor indices must be contracted, so that spinors always come in pairs. These spinor bilinears are formed such that they transform like Lorentz tensors, i.e., that they possess zero, one, or two Lorentz indices: $\bar{\chi}\psi$, $\bar{\chi}\gamma^\mu\psi$, $\bar{\chi}\sigma^{\mu\nu}\psi$, etc. This allows a straightforward Lorentz-invariant coupling to other tensors, like
,
ϕ, and
. A CPT transformation yields
. Here, we used that
and
anticommute, and that
. The crucial step is the last one, in which we simplified the spinor-space transpose; it requires reversing the order of
χ and
ψ. It is this step that uses the connection between spin and statistics: fermion fields anticommute. The term in the curly brackets is a spinor-space scalar, so the spinor transposition may be left out and we have
. Extension of this reasoning including fermion anticommutation to the other Dirac bilinears shows that they also follow the general rule (9) established above for dynamical tensor densities.
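The gamma-matrix identities that enter manipulations of this kind can be verified directly. The sketch below assumes that the relevant matrices are $\gamma^5$ and $\gamma^\mu$ and uses the Dirac representation; both are assumptions made for illustration, and the conventions of Ref. [8] may differ by signs or factors of i. It checks that $\gamma^5$ anticommutes with every $\gamma^\mu$ and squares to the identity.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def max_abs(m):
    return max(abs(entry) for row in m for entry in row)

identity = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

# Dirac representation (assumed convention): gamma^0 = diag(1, 1, -1, -1),
# gamma^i = [[0, sigma_i], [-sigma_i, 0]] in 2x2 blocks.
gamma = [
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]],
    [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]],
    [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]],
    [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]],
]

# gamma^5 = i gamma^0 gamma^1 gamma^2 gamma^3
product = matmul(matmul(gamma[0], gamma[1]), matmul(gamma[2], gamma[3]))
gamma5 = [[1j * entry for entry in row] for row in product]

# Size of {gamma^5, gamma^mu} for each mu; all should vanish.
anticommutator_sizes = [
    max_abs(add(matmul(gamma5, g), matmul(g, gamma5))) for g in gamma
]

# Size of (gamma^5)^2 - 1; should vanish.
minus_identity = [[-x for x in row] for row in identity]
gamma5_squared_defect = max_abs(add(matmul(gamma5, gamma5), minus_identity))

print(anticommutator_sizes, gamma5_squared_defect)
```

These identities are purely algebraic. The sign from reordering the fermion fields, which the text attributes to the spin–statistics connection, is a separate input and is not captured by matrix algebra alone.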
Lorentz invariance. We have already used Lorentz symmetry implicitly in the above individual ingredients for the construction of field-theory Lagrangian densities: the Minkowski position x, the scalar, spinor, and vector fields in Equations (5)–(8), and the general tensor densities in Equation (9) are all realizations of the Lorentz group, i.e., rotations and boosts of these objects are implemented by Lorentz transformations. In Lagrangian field theory in Minkowski space, Lorentz invariance is guaranteed if the action, and thus the Lagrangian density, are both Lorentz scalars. This implies that all fields and derivatives in a Lagrangian density must be combined such that not only the spinor indices, but also all Lorentz indices are properly contracted. Through this pairwise contraction, the total number n of Lorentz indices in each Lagrangian term of field products must be even, so that $(-1)^n = +1$. According to Equation (9), this yields for the Lagrangian density $\Theta\,\mathcal{L}(x)\,\Theta^\dagger = \mathcal{L}^\dagger(-x)$ (13) under CPT. Here, the assumption has been made that all fields in $\mathcal{L}$ are combined at the same spacetime point x. We will have to say more about such pointlike interactions below [13].
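Unitarity. The next ingredient for the CPT theorem is a Hermitian Lagrangian density $\mathcal{L} = \mathcal{L}^\dagger$, so that with Equation (13), we have $\Theta\,\mathcal{L}(x)\,\Theta^\dagger = \mathcal{L}(-x)$ (14). A Hermitian $\mathcal{L}$ will lead to a Hermitian Hamiltonian H. The time evolution $U(t) = e^{-iHt}$ is then unitary, so that probability is conserved in time: $\langle\psi(t)|\psi(t)\rangle = \langle\psi(0)|\psi(0)\rangle$. Without this property, a conventional interpretation of the quantum mechanics of closed systems would appear to be difficult.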
Point interactions. To conclude the Lagrangian version of the proof of the CPT theorem, we show that the action $S = \int d^4x\,\mathcal{L}(x)$ remains invariant under Θ. With the above result (14), we write $\Theta\,S\,\Theta^\dagger = \int d^4x\,\mathcal{L}(-x) = \int d^4x\,\mathcal{L}(x) = S$ (15), where we have changed the dummy integration variable $x \to -x$ in the second step. As mentioned earlier, the validity of this step hinges crucially upon the absence of non-pointlike interactions. Consider, for example, Lagrangian contributions of the form
for two real scalar fields
ϕ and
φ;
is a constant nonzero four-vector. The equations of motion for
ϕ read
. This shows that the behavior of
ϕ at
x is determined by the value of
φ at the point
indicating an interaction at a distance. According to Equation (5), real scalars reverse the sign of their entire spacetime argument under CPT, i.e.,
and
. The corresponding piece of the action therefore obeys
. It thus becomes apparent that CPT symmetry is not automatically satisfied in theories with interactions at a distance. We remark that such interactions would typically be associated with violations of causality. The axiomatic proof of the CPT theorem to be discussed next further illuminates the role of microscopic causality.
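The role of pointlike interactions can be illustrated with a one-dimensional toy integral. The sketch below is a simplification under stated assumptions: the nonlocal term is taken to have the form of one real field at x multiplied by another real field at x + a, spacetime is replaced by a single coordinate, and the two Gaussian profiles and all function names are hypothetical. Reflecting the arguments of both fields, as Equation (5) prescribes for real scalars, leaves the local integral (a = 0) unchanged but maps the nonlocal one onto the integral with displacement −a.

```python
import math

def phi(x):
    """Hypothetical real field profile (not symmetric about the origin)."""
    return math.exp(-(x - 1.0) ** 2)

def chi(x):
    """Second hypothetical real field profile."""
    return math.exp(-(x + 0.5) ** 2)

def reflected(f):
    """Field with reversed argument: x -> f(-x)."""
    return lambda x: f(-x)

# Symmetric grid so that the substitution x -> -x maps grid points to grid points.
HALF_POINTS = 2000
DX = 0.01
GRID = [k * DX for k in range(-HALF_POINTS, HALF_POINTS + 1)]

def action(a, f, g):
    """Riemann-sum approximation of the integral of f(x) * g(x + a)."""
    return sum(f(x) * g(x + a) for x in GRID) * DX

a = 1.0
local_original = action(0.0, phi, chi)
local_reflected = action(0.0, reflected(phi), reflected(chi))
nonlocal_original = action(a, phi, chi)
nonlocal_reflected = action(a, reflected(phi), reflected(chi))
nonlocal_opposite_shift = action(-a, phi, chi)

print(local_original, local_reflected)
print(nonlocal_original, nonlocal_reflected, nonlocal_opposite_shift)
```

In this toy setting the reflected nonlocal integral coincides with the one for displacement −a rather than with the original, which mirrors the statement that the change of integration variable restores the action only for fields evaluated at a common point.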
2.2. Proof Based on Axiomatic Field Theory
In the 1950s, efforts to place quantum field theory on a more rigorous mathematical footing intensified. Within this mathematical-physics context, the Lagrangian formalism turned out to be too narrow, and sets of axioms, such as the Wightman axioms, were adopted [14]. A next natural step was then to ask what properties of quantum field theory follow rigorously from these axioms. For the case of CPT symmetry, this question was answered by Jost [15] in 1957. Although more technical, his proof of the CPT theorem thoroughly illuminates the close connection between Lorentz and CPT symmetry.
The core of the argument involves the complexified version of Lorentz transformations, which are essentially boosts and rotations by complex-valued velocities and angles acting on complex-valued Minkowski vectors, tensors, etc. When complex velocities and angles are inserted into the conventional real Lorentz-transformation laws, the physical interpretation is usually lost for most input values, but the equations may still be mathematically correct. However, it turns out that for judiciously chosen imaginary boosts and real rotation angles, a complete spacetime inversion, which goes a long way to a full CPT transformation, can be achieved. This is clearly not a proper orthochronous Lorentz transformation. Nevertheless, it does have a physical interpretation and is usually already contained in the mathematical structure of the transformation law. It is this complexification feature of the Lorentz transformations in quantum physics that is used to prove the CPT theorem and that exposes the intimate connection between CPT and Lorentz symmetry.
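Let us illustrate this idea with an example. It has already been argued above that the Minkowski position $x^\mu$ would change sign under CPT: $x^\mu \to -x^\mu$, i.e., CPT can be implemented by multiplying $x^\mu$ with $-\mathbb{1}$, where $\mathbb{1}$ is the $4 \times 4$ identity matrix. Consider a Lorentz transformation $\Lambda(\eta, \alpha)$ consisting of a boost along the x-axis with rapidity $\eta$ and a rotation about the x-axis by an angle α: $\Lambda(\eta, \alpha) = \begin{pmatrix} \cosh\eta & \sinh\eta & 0 & 0 \\ \sinh\eta & \cosh\eta & 0 & 0 \\ 0 & 0 & \cos\alpha & -\sin\alpha \\ 0 & 0 & \sin\alpha & \cos\alpha \end{pmatrix}$ (16). It is apparent that for any proper Lorentz transformation, which has velocities $|v| < c$ and thus rapidities $\eta \in \mathbb{R}$, and which has angles $\alpha \in \mathbb{R}$, we have $\Lambda(\eta, \alpha) \neq -\mathbb{1}$, because $\cosh\eta \geq 1$ for real $\eta$. However, if we choose a purely imaginary $\eta = i\pi$ and let $\alpha = \pi$, we do obtain $\Lambda(i\pi, \pi) = -\mathbb{1}$ (17). This illustrates that if the equations of physics remain invariant not only for real-valued boosts and rotations, but also for complex-valued Lorentz transformations, we can expect them to be left unchanged also under spacetime inversion. In particular, spacetime inversion amounts to a CPT transformation for, e.g., spacetime points $x^\mu$.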
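The example can be confirmed numerically. The sketch below assumes the standard matrix form of a boost along the x-axis combined with a rotation about the x-axis (hyperbolic functions in the time and x block, trigonometric functions in the y and z block); the sign conventions for the off-diagonal entries and all function names are choices made here for illustration. It evaluates the transformation at rapidity iπ and angle π, compares it with minus the identity, and checks that the complex matrix still preserves the Minkowski metric in the sense $\Lambda^T g \Lambda = g$.

```python
import cmath
import math

def boost_x(eta):
    """Boost along the x-axis with (possibly complex) rapidity eta."""
    c, s = cmath.cosh(eta), cmath.sinh(eta)
    return [[c, s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def rotation_x(alpha):
    """Rotation about the x-axis by the angle alpha."""
    c, s = cmath.cos(alpha), cmath.sin(alpha)
    return [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, c, -s], [0, 0, s, c]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(a):
    return [list(column) for column in zip(*a)]

def distance(a, b):
    """Largest absolute difference between corresponding entries."""
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))

def lorentz(eta, alpha):
    return matmul(boost_x(eta), rotation_x(alpha))

METRIC = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
MINUS_IDENTITY = [[-1 if i == j else 0 for j in range(4)] for i in range(4)]

real_example = lorentz(0.7, math.pi)          # real rapidity: not an inversion
inversion = lorentz(1j * math.pi, math.pi)    # imaginary rapidity i*pi

# Complex Lorentz transformations satisfy Lambda^T g Lambda = g (transpose,
# not Hermitian conjugate).
metric_defect = distance(
    matmul(transpose(inversion), matmul(METRIC, inversion)), METRIC
)

print(distance(inversion, MINUS_IDENTITY))
print(distance(real_example, MINUS_IDENTITY))
print(metric_defect)
```

For real rapidity the time-time entry is cosh η ≥ 1, so no real choice of parameters can reach −1 there; the imaginary rapidity turns cosh into a cosine, which is what makes the inversion reachable.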
The actual proof of the CPT theorem includes further physical input, such as energy positivity and microscopic causality. It also includes some technical mathematical aspects including an analysis of the circumstances under which the above analytic continuation into the complex plane is valid. The goal of the following paragraphs is to shed some more light on this proof.
Jost’s proof proceeds in the context of Wightman’s approach to rigorous quantum field theory in a flat-spacetime background [16]. Weak gravitational fields can be expanded about flat backgrounds and are in this sense included in the framework. Strong gravitational fields, however, seem to lie outside Wightman’s rigorous quantum field theory, so that Jost’s proof would need to be generalized.
Wightman’s approach is based on a set of axioms that define what is meant by a sensible quantum field theory. To appreciate the generality of the CPT theorem, it is useful to spell out these axioms and comment on their physical significance. In the literature, a few variations of Wightman’s definition of a quantum field theory can be found. For example, certain axioms may be combined into a single one or, vice versa, axioms may be separated into subaxioms. Nevertheless, in one form or another, the following physical assumptions are made:
- (1) Lorentz- and translation-covariant Hilbert space $\mathcal{H}$. This assumption essentially states that we consider a relativistic version of quantum theory in which the usual rules of quantum mechanics apply. In particular, there are unitary operators $U(\Lambda, a)$ that implement Lorentz transformations Λ and spacetime translations a. The unitarity of these transformations ensures that under Λ and a states in $\mathcal{H}$ transform to other states in $\mathcal{H}$ such that all transition amplitudes remain unchanged.
- (2) Vacuum state. The Hilbert space contains a unique state, called the vacuum $|0\rangle$, that remains invariant under both the Lorentz transformations and the translations up to a phase. In particular, the vacuum can neither have a nonzero four-momentum nor a nonzero angular momentum, as these quantities would change under $U(\Lambda, a)$. In conjunction with axiom (4) below, the vacuum must be the state with lowest energy. These requirements are intuitively reasonable: the flat-spacetime vacuum looks the same to all inertial observers. An additional, more technical assumption is that $|0\rangle$ be cyclic. This essentially means that all other physical states in $\mathcal{H}$ can be constructed by acting with the field operators of axiom (3) on the vacuum. This property is akin to that of the usual quantum harmonic oscillator, where excited states can be reached from the ground state with the creation operator.
- (3) Field operators. Physical quantities are represented by polynomials of field operators acting on this Hilbert space. These field operators transform under the Lorentz transformations as scalars, vectors, tensors, spinors, etc. Moreover, these fields are set up such that each corresponds to a definite finite spin and mass, allowing the usual particle interpretation. It turns out that field operators are mathematically not well-defined at a spacetime point, so there is the technical assumption of them being tempered distributions “smeared out” with test functions. This assumption can, for example, be used to establish continuity properties as the field operators vary with spacetime, but otherwise this level of rigor will be unnecessary for our present purposes.
- (4) Energy positivity. Translation invariance leads to a conserved four-momentum operator $P^\mu$. Its zeroth component $P^0$, the energy operator or Hamiltonian, is postulated to have non-negative eigenvalues $p^0 \geq 0$. Together with the condition of Lorentz symmetry, this implies that the four-momentum eigenvalues are lightlike or timelike four-vectors, $p^2 \geq 0$. In other words, $p^\mu$ must lie in the forward momentum-space lightcone. This property is closely tied to the requirement of stability: if there were no lowest-energy state, it would seem difficult to prevent the system from filling an infinite number of pairs of positive- and negative-energy states.
- (5) Microscopic causality. Many textbooks seem to suggest that the property of causality is automatically contained in a Lorentz-symmetric theory. However, consider a model with spacelike particle four-velocities $u^\mu$. Being a four-vector, $u^\mu$ transforms covariantly under the Lorentz transformations, compatible with Lorentz symmetry. However, a spacelike four-velocity is associated with superluminal particle speeds and thus acausalities. For this reason, causality is imposed separately as follows. Field operators ϕ commute or anticommute if they cannot be connected by light signals: $[\phi(x), \phi(y)]_\pm = 0$ for $(x - y)^2 < 0$. In the mathematical-physics literature, this requirement is sometimes also called locality. The microcausality condition may be understood intuitively by recalling the usual quantum-mechanical uncertainty relation $\Delta A\,\Delta B \geq \frac{1}{2}\,|\langle\psi|[A, B]|\psi\rangle|$ for two Hermitian operators A and B, with the usual definition of the uncertainty $\Delta\mathcal{O} = \sqrt{\langle\psi|\mathcal{O}^2|\psi\rangle - \langle\psi|\mathcal{O}|\psi\rangle^2}$ for any Hermitian operator $\mathcal{O}$ with respect to the state $|\psi\rangle$. Here, one measurement generally affects the other measurement because their uncertainties are not independent unless the commutator $[A, B]$ vanishes. A careful reasoning in the present context shows that with the above microcausality condition, the physics at x cannot affect the physics at y and vice versa if the separation between x and y is spacelike [17]. We remark in passing that in this axiomatic framework the spin–statistics theorem follows rigorously, so that the above choice between commutators and anticommutators is actually fixed: commutators for integer-spin fields and anticommutators for half-integer spin fields.
Note the generality of the above axioms. In particular, no Lagrangian is required, and the details of the field equations are not specified. This set of axioms is sufficient to prove CPT symmetry.
The basic field-theory objects in Wightman’s approach are the Wightman functions $W^{(n)}$, defined simply as vacuum expectation values of products of field operators: $W^{(n)}(x_1 - x_2, \ldots, x_{n-1} - x_n) = \langle 0|\phi(x_1)\cdots\phi(x_n)|0\rangle$ (18). Here, the spacetime points $x_1, \ldots, x_n$ are physical, i.e., each point is described by a set of four real numbers. The fact that the Wightman functions depend only on spacetime differences follows from translation invariance postulated in Axiom (1). The field operators can be of any type, i.e., scalars, vectors, tensors, spinors, etc. Now, the Wightman reconstruction theorem [18] roughly states that a quantum field theory is uniquely determined by its Wightman functions. The theorem follows from the axioms above. The significance of the reconstruction theorem in the present context is that the CPT theorem needs to be proved only for all the $W^{(n)}$: if they satisfy CPT symmetry, so will the corresponding quantum field theory. This simplifies matters because instead of working with abstract field operators and Hilbert-space states, one needs to consider ordinary functions only.
In what follows, our sole focus will be on scalar fields, which is sufficient to gain a flavor of the proof and to appreciate the significance of the physical ingredients needed. More general types of fields are considered in Ref. [16]. We first state the CPT theorem, and then proceed to prove it. Translated into the present context, the theorem asserts that the CPT-transformed Wightman functions should be identical to the original ones. As the Wightman functions are matrix elements, we employ Equation (12) to state the CPT theorem as $\langle 0|\phi(x_1)\cdots\phi(x_n)|0\rangle = \langle 0|\phi^\dagger(-x_1)\cdots\phi^\dagger(-x_n)|0\rangle^*$ (19), where we have used Equation (5). The usual Hilbert-space properties give $\langle 0|\phi^\dagger(-x_1)\cdots\phi^\dagger(-x_n)|0\rangle^* = \langle 0|\phi(-x_n)\cdots\phi(-x_1)|0\rangle$. In other words, one must show that $\langle 0|\phi(x_1)\cdots\phi(x_n)|0\rangle = \langle 0|\phi(-x_n)\cdots\phi(-x_1)|0\rangle$ (20) is satisfied by all Wightman functions in order to prove CPT symmetry.
Physical Lorentz symmetry. Axioms (1) and (3) imply invariance of the Wightman function under the usual physical Lorentz transformation Λ, so we can write:
We cannot immediately extend this equation to complex Lorentz transformations, such as the inversion in Equation (17), because the Wightman functions may have singularities, branch points, discontinuities, etc.
Energy positivity. The next idea is to resolve the Wightman functions into their Fourier components. These will contain plane-wave 4-momenta, for which we can use Axiom (4). As an example, let us sketch this idea for the particular Wightman function involving two field operators. Inserting a complete set of momentum eigenstates $|p\rangle$ and employing the translation operator on both fields yields
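This is the desired Fourier decomposition of the two-point Wightman function, where we have used that the vacuum $|0\rangle$ is translation invariant and that the $|p\rangle$ are momentum eigenstates, $P^\mu|p\rangle = p^\mu|p\rangle$.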
Since the integral remains well behaved for decaying exponentials, we may insert certain complex-valued spacetime differences
. Complexifying the exponent gives
The only real contribution to the exponent is the square bracket term, so it must give an overall negative contribution for exponential suppression. Axiom (4) guarantees that p is in the forward momentum-space lightcone, i.e., $p^0 \geq 0$ and $p^2 \geq 0$. Then, the exponential decays for
in the forward position-space lightcone, i.e.,
and
is timelike
. This reasoning can be made rigorous for all Wightman functions, so that energy positivity implies that Equation (21) remains valid for certain complex position differences:
Here, the Λ are still the usual real Lorentz transformations.
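The geometric fact behind this step is that the Minkowski product of two forward timelike four-vectors is strictly positive. The sketch below checks this by random sampling. It assumes the metric signature (+, −, −, −) and a plane-wave factor whose complexification produces a real factor exp(−p·y), where y denotes the forward timelike vector built from the imaginary part of the position difference; the sign convention, the symbol y, and the sampling ranges are assumptions made for illustration. Lightlike momenta, which Axiom (4) also allows, are not sampled here.

```python
import math
import random

def minkowski_dot(p, q):
    """Minkowski product with signature (+, -, -, -)."""
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]

def random_forward_timelike(rng):
    """Random four-vector strictly inside the forward lightcone."""
    spatial = [rng.uniform(-1.0, 1.0) for _ in range(3)]
    length = math.sqrt(sum(c * c for c in spatial))
    time = length + rng.uniform(0.01, 1.0)  # time component exceeds |spatial|
    return [time] + spatial

rng = random.Random(0)
products = [
    minkowski_dot(random_forward_timelike(rng), random_forward_timelike(rng))
    for _ in range(10000)
]
min_product = min(products)

# Under the assumed convention each product p.y gives a damping factor exp(-p.y).
damping_factors = [math.exp(-value) for value in products]

print(min_product, max(damping_factors))
```

The positivity follows from $p^0 y^0 > |\vec{p}|\,|\vec{y}| \geq \vec{p}\cdot\vec{y}$ for vectors inside the forward cone, so every Fourier component is damped rather than amplified.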
Complex Lorentz transformation. A theorem by Bargmann, Hall, and Wightman [19] now states that Equation (24) remains valid for an even larger set of complex position differences and also for complex Lorentz transformations. This larger set consists of all the original complex position differences that have their imaginary part in the forward cone and also all points that can be generated from them with complex Lorentz transformations. This set is sometimes called the “extended tube.” Equation (24) therefore takes the form
A special case of this relation is
where we plugged in the specially chosen complex Lorentz transformation (17), i.e., $\Lambda = -\mathbb{1}$.
Another important difference between Equations (24) and (25), which follows from the Bargmann–Hall–Wightman theorem, is that there are no real-valued, physical points in the domain of Equation (24) because the imaginary parts of its arguments are all timelike vectors and therefore strictly nonzero. However, Equation (25) holds for some real-valued spacetime points: the extended tube contains not only the above complex points, but also their transformed images, and a complex Lorentz transformation acting on a complex four-vector can give a real four-vector. Consider the example of
. Clearly,
is in the forward cone, so it is valid input for Equation (24). For Equation (25), we may transform
to
by any complex
, so, in particular, we may select an imaginary boost
These special, physical points in the extended tube are called Jost points. A special case of Equation (26) is therefore
By using matrix manipulations, Jost showed that all Jost points are not only real but also spacelike. The significance of this result is that it permits the application of Axiom (5).
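How a complex Lorentz transformation can turn a complex point into a real one is easy to see in a small numerical sketch. The sample point with purely imaginary time component, the rapidity iπ/2, and the sign convention of the boost matrix are illustrative assumptions and need not coincide with the example used in the text. The image is real and spacelike, and the complex boost leaves the Minkowski square unchanged.

```python
import cmath
import math

def boost_x(eta):
    """Boost along the x-axis with (possibly complex) rapidity eta."""
    c, s = cmath.cosh(eta), cmath.sinh(eta)
    return [[c, s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def apply(matrix, vector):
    return [sum(matrix[i][k] * vector[k] for k in range(4)) for i in range(4)]

def minkowski_square(v):
    """Bilinear (not Hermitian) square with signature (+, -, -, -)."""
    return v[0] * v[0] - v[1] * v[1] - v[2] * v[2] - v[3] * v[3]

# Hypothetical sample point: imaginary part (1, 0, 0, 0) lies in the forward cone.
zeta = [1j, 0, 0, 0]

# Imaginary boost with rapidity i*pi/2: cosh -> 0, sinh -> i.
rho = apply(boost_x(1j * math.pi / 2), zeta)

largest_imaginary_part = max(abs(component.imag) for component in rho)
square_rho = minkowski_square(rho)
square_zeta = minkowski_square(zeta)

print(rho)
print(largest_imaginary_part, square_rho, square_zeta)
```

Up to rounding, the image is the real point (0, −1, 0, 0), whose Minkowski square is negative. This is the kind of real, spacelike point at which the commutation property of Axiom (5) becomes available.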
Microscopic causality and spin–statistics. Translating Equation (28) back to vacuum expectation values gives
where
. The CPT theorem (19) contains the Hermitian-conjugate fields, so we can use the general Hilbert-space scalar-product property $\langle\chi|\mathcal{O}|\psi\rangle = \langle\psi|\mathcal{O}^\dagger|\chi\rangle^*$
on the right-hand side of Equation (29) to write
This leaves the fields on the right-hand side in the reverse order relative to Equation (19), which we want to prove. However, since the Jost points are real and mutually spacelike separated, we can use the commutativity property of Axiom (5) to change the order of the fields freely:
As before, we can express this in terms of Wightman functions using
:
We remark that this reordering implicitly uses the spin–statistics theorem, which can be proved independently. If we had also considered half-integer spin fields, anticommutators would generate a factor of $-1$ for each interchange of two fermionic fields.
Each of the two Wightman functions in Equation (32) has its own analytic continuation into the extended tube by the above line of reasoning. A priori, these analytic continuations need not be the same because the two functions only agree at Jost points, and not for all real arguments. However, it can be shown that the Jost points form an open set, and there is a mathematical theorem that guarantees the equality of analytic functions if they agree on a real-valued open set. Thus, the two analytic continuations are, in fact, identical:
One can now take the limit of physical spacetime positions
:
This is the result we needed to prove. Although more technical than the Lagrangian proof, the above line of reasoning in axiomatic field theory reduces the requirements for CPT symmetry down to the bare essentials, namely quantum physics, Lorentz symmetry, energy positivity, and microcausality.