Lecture II: Coordinate Bases, Tensor Algebra in Flat Spacetime, and Special Relativity
Christopher M. Hirata
Caltech M/C 350-17, Pasadena CA 91125, USA∗
(Dated: September 30, 2011)
I. OVERVIEW
In this lecture we will continue developing the tools of tensor algebra and calculus in flat spacetime. Some of
the discussion will be long-winded, but we are working through calculus in a way that will be applicable to curved
spacetime. The laws of special relativity will be used to illustrate their application.
The recommended reading for this lecture is:
• MTW §2.6–3.2. (Actually, 3.2 on tensors is covered in this lecture and 3.1 on electrodynamics is next lecture.
I think the subject flows a bit more coherently this way.)
A. Gradient operator
Suppose that we have a scalar field f, i.e. a function that associates to each point P a scalar f(P). Then you know
from multivariable calculus that it is possible to define a vector field called “grad f” or “∇f.” We will define the
gradient here as a 1-form rather than a vector. However, in the event that we have defined the metric tensor (or dot
product), this 1-form can be identified with a vector by raising an index.
Let us suppose that we take a trajectory P(λ) through spacetime. Then the scalar field f defines a function of λ,
f [P(λ)]. This function can be differentiated and evaluated at λ = 0,
\[
\left. \frac{d}{d\lambda} f[\mathcal{P}(\lambda)] \right|_{\lambda=0} . \tag{1}
\]
This derivative should depend only on the particular choice of P(0) [or the coordinates $x^\alpha(\lambda = 0)$] and on the first
derivatives of the coordinates with respect to λ. It should depend linearly on the latter. Think about the chain rule:
\[
\frac{df}{d\lambda} = \frac{\partial f}{\partial x^\alpha} \frac{dx^\alpha}{d\lambda} . \tag{2}
\]
Therefore we may write
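\[
\left. \frac{d}{d\lambda} f[\mathcal{P}(\lambda)] \right|_{\lambda=0} = \left\langle df[\mathcal{P}(0)], \frac{d\mathcal{P}}{d\lambda}(\lambda = 0) \right\rangle . \tag{3}
\]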
Here dP/dλ is a vector and df is a linear operation on this vector, i.e. df is a 1-form. We call it the gradient of f .
If we set v = dP/dλ, then we may define the directional derivative
\[
\partial_v f = \langle df, v \rangle = \frac{d}{d\lambda} f[\mathcal{P}(\lambda)] . \tag{4}
\]
This is the unifying relation between directional derivatives, gradients, and ordinary derivatives. Note that the dot
product or metric plays no role here.
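As a quick numerical illustration of Eq. (4), the sketch below compares the derivative of f along a curve with the contraction of the gradient components against the curve's tangent vector. The scalar field `f`, the trajectory `path`, and the finite-difference helpers are all hypothetical choices made for this example; nothing here uses a metric.

```python
import math

def f(x):
    # Hypothetical scalar field on a 2-dimensional space (illustration only).
    return x[0] ** 2 * math.sin(x[1])

def path(lam):
    # Hypothetical trajectory P(lambda), given by its coordinates x^alpha(lambda).
    return (1.0 + 2.0 * lam, 0.5 - 3.0 * lam)

def central_difference(g, t, h=1e-6):
    # Central finite-difference approximation to dg/dt.
    return (g(t + h) - g(t - h)) / (2.0 * h)

def gradient(func, x, h=1e-6):
    # Components (df)_alpha = partial f / partial x^alpha, by central differences.
    comps = []
    for a in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[a] += h
        xm[a] -= h
        comps.append((func(xp) - func(xm)) / (2.0 * h))
    return comps

# Left-hand side: d/dlambda of f[P(lambda)] at lambda = 0.
lhs = central_difference(lambda lam: f(path(lam)), 0.0)

# Right-hand side: <df, v> = (df)_alpha v^alpha with v = dP/dlambda.
x0 = path(0.0)
v = [central_difference(lambda lam, a=a: path(lam)[a], 0.0) for a in range(2)]
df = gradient(f, x0)
rhs = sum(df[a] * v[a] for a in range(2))

print(lhs, rhs)  # the two numbers agree up to finite-difference error
```

The contraction is a plain sum over one upper and one lower index, which is why no dot product is needed.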
B. Coordinate basis
In flat spacetime, we may choose a coordinate system in which the coordinates $x^\alpha$ are the components of a position
vector $x = x^\alpha e_\alpha$. In this case, examining Eq. (2), we see that if v = dP/dλ, then
\[
\langle d(x^\alpha), v \rangle = \frac{dx^\alpha}{d\lambda} = v^\alpha . \tag{5}
\]
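Since we recall that $\langle \tilde{k}, v \rangle = \tilde{k}_\beta v^\beta$, it follows that the components of $d(x^\alpha)$ are
\[
[d(x^\alpha)]_\beta = \delta^\alpha{}_\beta ,
\]
or equivalently
\[
d(x^\alpha) = \omega^\alpha . \tag{7}
\]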
Note what is in this equation: $x^\alpha$ is a coordinate, i.e. it’s a scalar function of position; $d(x^\alpha)$ is its gradient, i.e. a
1-form; and $\omega^\alpha$ is a basis 1-form (α denotes which 1-form, not which component).
So far this is all notation: nothing nontrivial has been said. But what is important is that when we go to curved
spacetime (or even if we used a curved coordinate system such as spherical coordinates), the coordinates will no longer
be components of a vector, although we will continue to denote them by $x^\alpha$. However, the $x^\alpha$ are still scalar functions
and their gradients $d(x^\alpha)$ still exist. Moreover, none of the discussion here has used the metric. So when we go to
curved spacetime, even though there will no longer be such a thing as a position vector x, it will still be permissible
to define a set of basis vectors by
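\[
e_\alpha \equiv \frac{d\mathcal{P}}{d\lambda}(\lambda = 0), \qquad x^\beta[\mathcal{P}(\lambda)] = x^\beta[\mathcal{P}(0)] + \lambda \delta^\beta{}_\alpha \tag{8}
\]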
(“the $e_\alpha$ vector is the velocity of a particle whose α coordinate is increasing at unit rate, and whose other coordinates
remain fixed”) and the basis 1-forms by
\[
\omega^\alpha \equiv d(x^\alpha) . \tag{9}
\]
This basis will be called the coordinate basis. For a general coordinate system it will not be orthonormal, and we
may introduce other bases that are orthonormal for the purpose of displaying results; but for most calculations,
computation in the coordinate basis is the most straightforward (though not necessarily efficient) approach.
[Example of a more general coordinate system in class: Polar coordinates in 2 dimensions. The basis $\{e_r, e_\theta\}$ is not
orthonormal, nor is the basis $\{\omega^r, \omega^\theta\}$.]
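The polar-coordinate example can be checked numerically with a minimal sketch along the lines of Eq. (8): each basis vector is the velocity of a point whose one coordinate increases at unit rate. The helper names (`cartesian`, `basis_vector`, `dot`) and the sample point are assumptions made for this illustration, and the Euclidean dot product in the plane is used only to display the result.

```python
import math

def cartesian(r, theta):
    # Cartesian components of the point with polar coordinates (r, theta).
    return (r * math.cos(theta), r * math.sin(theta))

def basis_vector(which, r, theta, h=1e-6):
    # e_alpha = dP/dlambda along a curve on which only the coordinate `which`
    # increases at unit rate, estimated with a central difference.
    dr, dtheta = (h, 0.0) if which == "r" else (0.0, h)
    plus = cartesian(r + dr, theta + dtheta)
    minus = cartesian(r - dr, theta - dtheta)
    return tuple((p - m) / (2.0 * h) for p, m in zip(plus, minus))

def dot(a, b):
    # Euclidean dot product of two vectors given by Cartesian components.
    return sum(x * y for x, y in zip(a, b))

r, theta = 2.0, 0.7  # hypothetical sample point
e_r = basis_vector("r", r, theta)
e_theta = basis_vector("theta", r, theta)

# e_r . e_r is 1 and e_r . e_theta is 0, but e_theta . e_theta equals r^2.
print(dot(e_r, e_r), dot(e_r, e_theta), dot(e_theta, e_theta))
```

The two basis vectors are orthogonal, but e_θ has length r rather than 1, so the coordinate basis fails to be orthonormal away from r = 1.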
In the coordinate basis, it is easily seen that the directional derivative is given by
\[
\partial_v = v^\alpha \frac{\partial}{\partial x^\alpha} . \tag{10}
\]
Before we do anything complicated, we’ll highlight a few simple applications of the theory. The following notions
are probably familiar from special relativity, but we can give their description using the powerful new machinery we’ve
learned. We will highlight for each concept what modifications will be required when we go to curved spacetime.
A. Proper time
Define the world line of a particle to be the trajectory it takes through spacetime. In general one might describe
this trajectory in parametric form: P(λ), or $x^\alpha(\lambda)$ if you like coordinates. But there are an infinity of ways to choose
the parameter λ. One special and often convenient choice, though, is the proper time τ . This is defined by the
requirement
\[
\frac{d\mathcal{P}}{d\tau} \cdot \frac{d\mathcal{P}}{d\tau} = -1 . \tag{11}
\]
One may convert any parameterization to proper time by using the conversion
\[
\frac{d\tau}{d\lambda} = \sqrt{ - \frac{d\mathcal{P}}{d\lambda} \cdot \frac{d\mathcal{P}}{d\lambda} } , \tag{12}
\]
so long as the curve is timelike (dP/dλ has negative square norm). So proper time is an appropriate parameter for
massive particles, which as we will see follow timelike trajectories. It is not useful for massless particles such as
photons, which follow trajectories where dP/dλ has zero square norm.
One notes that in special relativity, where $g_{\mu\nu} = \eta_{\mu\nu}$, we could have used the coordinate time $t = x^0$ as the initial
parameter λ. Then the proper time satisfies
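\[
\frac{d\tau}{dt} = \sqrt{ -\eta_{\mu\nu} \frac{dx^\mu}{dt} \frac{dx^\nu}{dt} } = \sqrt{ 1 - \left( \frac{dx^1}{dt} \right)^2 - \left( \frac{dx^2}{dt} \right)^2 - \left( \frac{dx^3}{dt} \right)^2 } \equiv \gamma^{-1} . \tag{13}
\]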
Note that (i) proper time passes at a slower rate than coordinate time; and (ii) the timelike requirement forces the
3-dimensional spatial vector
\[
{}^{(3)}v = \frac{dx^i}{dt} e_i \tag{14}
\]
to have norm less than unity (< c). Equation (13) also defines the Lorentz factor γ,
\[
\gamma = \frac{1}{\sqrt{1 - |{}^{(3)}v|^2}} . \tag{15}
\]
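To make Eqs. (13) and (15) concrete, here is a minimal sketch that evaluates the Lorentz factor and accumulates proper time along a worldline parameterized by coordinate time, in units with c = 1. The function names, the midpoint-rule integrator, and the constant test velocity of 0.6 are assumptions chosen for illustration.

```python
import math

def lorentz_factor(v3):
    # gamma = 1 / sqrt(1 - |v|^2) with c = 1; only defined for |v| < 1.
    speed_sq = sum(c * c for c in v3)
    if speed_sq >= 1.0:
        raise ValueError("worldline is not timelike: need |v| < 1")
    return 1.0 / math.sqrt(1.0 - speed_sq)

def proper_time(velocity_of_t, t0, t1, steps=10000):
    # Integrate dtau/dt = 1/gamma over [t0, t1] with the midpoint rule.
    dt = (t1 - t0) / steps
    total = 0.0
    for k in range(steps):
        t_mid = t0 + (k + 0.5) * dt
        total += dt / lorentz_factor(velocity_of_t(t_mid))
    return total

# Hypothetical case: constant 3-velocity 0.6 along x for 10 units of t.
gamma = lorentz_factor((0.6, 0.0, 0.0))
tau = proper_time(lambda t: (0.6, 0.0, 0.0), 0.0, 10.0)
print(gamma, tau)  # gamma is 1.25, so tau is close to 10 / 1.25 = 8
```

The `ValueError` branch reflects the timelike requirement: for a 3-velocity of norm 1 or more, the square root in Eq. (13) is no longer real and positive.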
B. 4-velocity
In ordinary mechanics, the 3-velocity of an object is its spatial displacement per unit time; this is ${}^{(3)}v$. This is
not a Lorentz-invariant object. A Lorentz-invariant velocity can be constructed; this is the 4-velocity u:
\[
u \equiv \frac{d\mathcal{P}}{d\tau} . \tag{16}
\]
By construction, u · u = −1. The components of u can be found from
\[
u^\alpha = \frac{dx^\alpha}{d\tau} = \frac{dx^\alpha}{dt} \frac{dt}{d\tau} = \frac{dx^\alpha}{dt} \gamma . \tag{17}
\]
Therefore we have
\[
u^0 = \gamma, \quad \text{and} \quad u^i = {}^{(3)}v^i \gamma . \tag{18}
\]
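The components in Eq. (18) can be checked against the normalization u · u = −1 with a short sketch. It assumes units with c = 1 and the Minkowski metric with signature (−,+,+,+), consistent with Eq. (11); the names `ETA_DIAG`, `four_velocity`, and `minkowski_dot` and the sample 3-velocity are hypothetical.

```python
import math

# Diagonal of the Minkowski metric eta_{mu nu}, signature (-,+,+,+).
ETA_DIAG = [-1.0, 1.0, 1.0, 1.0]

def four_velocity(v3):
    # u^0 = gamma and u^i = gamma * v^i, with c = 1.
    gamma = 1.0 / math.sqrt(1.0 - sum(c * c for c in v3))
    return [gamma] + [gamma * c for c in v3]

def minkowski_dot(a, b):
    # a . b = eta_{mu nu} a^mu b^nu for the diagonal metric above.
    return sum(g * x * y for g, x, y in zip(ETA_DIAG, a, b))

u = four_velocity((0.3, -0.4, 0.5))  # hypothetical 3-velocity with |v|^2 = 0.5
print(u, minkowski_dot(u, u))  # the dot product is -1 up to rounding
```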
C. 4-momentum
In either Newtonian mechanics or special relativity, a particle need not have constant velocity: forces may act on
it. So we will take the derivative of the 4-velocity of a massive particle (written v in this section) with respect to
proper time to obtain the 4-acceleration:
\[
a = \frac{dv}{d\tau}, \quad \text{or (in components)} \quad a^\mu = \frac{dv^\mu}{d\tau} . \tag{21}
\]
One can then define the 4-force on the particle to be dp/dτ . This equals ma if the mass is constant.
It is always the case that 4-acceleration and 4-velocity are orthogonal:
\[
a \cdot v = \eta_{\mu\nu} \frac{dv^\mu}{d\tau} v^\nu = \frac{1}{2} \frac{d}{d\tau} ( \eta_{\mu\nu} v^\mu v^\nu ) = \frac{1}{2} \frac{d}{d\tau} (-1) = 0 . \tag{22}
\]
So in the instantaneous rest frame of a particle, the 4-acceleration is purely spatial.
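The orthogonality in Eq. (22) can be illustrated numerically. The sketch below assumes a particular worldline that is not discussed in this lecture: uniformly accelerated motion along x, parameterized by proper time, with a hypothetical proper acceleration `G`. Derivatives are taken by finite differences, so the results hold only up to discretization error.

```python
import math

G = 0.5  # hypothetical proper acceleration (c = 1), chosen for illustration

def worldline(tau):
    # Assumed example: (t, x, y, z) of a uniformly accelerated particle,
    # already parameterized by its proper time tau.
    return (math.sinh(G * tau) / G, math.cosh(G * tau) / G, 0.0, 0.0)

def d_dtau(func, tau, h=1e-4):
    # Componentwise central difference of a 4-component function of tau.
    plus, minus = func(tau + h), func(tau - h)
    return tuple((p - m) / (2.0 * h) for p, m in zip(plus, minus))

def minkowski_dot(a, b):
    # eta_{mu nu} a^mu b^nu with signature (-,+,+,+).
    return -a[0] * b[0] + sum(x * y for x, y in zip(a[1:], b[1:]))

def velocity(tau):
    # 4-velocity v = dP/dtau.
    return d_dtau(worldline, tau)

tau0 = 1.2
v = velocity(tau0)
a = d_dtau(velocity, tau0)  # 4-acceleration a = dv/dtau

# v . v is close to -1, a . v is close to 0, and a . a is close to G**2.
print(minkowski_dot(v, v), minkowski_dot(a, v), minkowski_dot(a, a))
```

Because a · a comes out positive, the 4-acceleration is spacelike, in line with its being purely spatial in the instantaneous rest frame.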
The operation of defining a – seemingly trivial and innocuous – will not make sense in curved spacetime without
lots of extra work. To see why, let’s write the derivative according to freshman calculus:
\[
a = \lim_{\epsilon \to 0} \frac{v(\tau + \epsilon) - v(\tau)}{\epsilon} . \tag{23}
\]
The two vectors on the right-hand side, $v(\tau + \epsilon)$ and $v(\tau)$, are located at different points in spacetime. In special
relativity, this is no issue; we simply move the vectors to the same point and subtract. But in curved spacetime, a
vector (e.g. a velocity) is defined at a point P. The job of moving it to another point Q is nontrivial: imagine taking a
2-dimensional velocity vector tangent to the Earth’s surface defined in Pasadena, and somehow transporting it along
the globe to New York.
Now that we have built the framework of special relativity, it is time to return to our vectors and 1-forms. We have
found that many useful physical concepts are vectors (velocity, momentum, acceleration, force, electric current), some
are 1-forms (wave vector, electromagnetic vector potential), and there is a natural correspondence between the two.
Our next task is to construct general tensors – linear functions that accept an ordered list of vectors and 1-forms and
return a number. Examples of tensors will include the metric tensor, the electromagnetic field, the energy-momentum
density-flux, the polarization density matrix of light, and (later) the curvature of spacetime. So here we will define
tensors and then the basic operations that act on them. The algebraic operations will carry over trivially to curved
spacetime, whereas the calculus operations (e.g. derivatives) will take more work.
A. Linear machines
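So now we are ready to define a general tensor. Let us define a tensor S of rank $\binom{M}{N}$ as a linear operation on M
1-forms and N vectors that returns a scalar. That is, it is a function $S(\tilde{k}, \ldots, \tilde{l}, u, \ldots, v)$
that distributes over addition and scalar multiplication. We may define the components of a tensor by acting on the
basis 1-forms and basis vectors,
\[
S^{\alpha_1 \ldots \alpha_M}{}_{\beta_1 \ldots \beta_N} = S(\omega^{\alpha_1}, \ldots, \omega^{\alpha_M}, e_{\beta_1}, \ldots, e_{\beta_N}) ,
\]
so that by linearity
\[
S(\tilde{k}, \ldots, \tilde{l}, u, \ldots, v) = S^{\alpha_1 \ldots \alpha_M}{}_{\beta_1 \ldots \beta_N} \tilde{k}_{\alpha_1} \ldots \tilde{l}_{\alpha_M} u^{\beta_1} \ldots v^{\beta_N} . \tag{26}
\]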
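A small sketch may help make the "linear machine" picture of Eq. (26) concrete. It takes the simplest mixed case, a tensor with one 1-form slot and one vector slot, and uses made-up component values; the names `S_COMPONENTS`, `S`, and `basis` are assumptions for this illustration.

```python
DIM = 4

# Hypothetical components S^alpha_beta of a tensor with one upper and one
# lower index (values chosen arbitrarily for illustration).
S_COMPONENTS = [[float(a + 2 * b) for b in range(DIM)] for a in range(DIM)]

def S(k, v):
    # The tensor as a machine: takes a 1-form (components k_alpha) and a
    # vector (components v^beta), returns the number S^alpha_beta k_alpha v^beta.
    return sum(S_COMPONENTS[a][b] * k[a] * v[b]
               for a in range(DIM) for b in range(DIM))

def basis(index):
    # Components of the basis 1-form omega^index or the basis vector e_index.
    return [1.0 if i == index else 0.0 for i in range(DIM)]

# Feeding basis elements into the slots recovers the components.
recovered = [[S(basis(a), basis(b)) for b in range(DIM)] for a in range(DIM)]

# Linearity in the vector slot: S(k, 2u + 3w) = 2 S(k, u) + 3 S(k, w).
k = [1.0, -2.0, 0.5, 3.0]
u = [0.0, 1.0, 2.0, -1.0]
w = [4.0, 0.0, -1.0, 1.0]
combo = [2.0 * x + 3.0 * y for x, y in zip(u, w)]
print(recovered == S_COMPONENTS, S(k, combo), 2.0 * S(k, u) + 3.0 * S(k, w))
```

Storing the components and summing over them is one possible representation; the tensor itself is the function `S`, which does not depend on that choice.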
B. Examples
• A vector v is actually a tensor of rank $\binom{1}{0}$: it takes in any 1-form $\tilde{k}$ and returns the scalar $\langle \tilde{k}, v \rangle$. The components
of v are $\langle \omega^\alpha, v \rangle$, which are simply $v^\alpha$.
• A 1-form $\tilde{k}$ is actually a tensor of rank $\binom{0}{1}$: it takes in any vector v and returns the scalar $\langle \tilde{k}, v \rangle$. Its components
are $\tilde{k}_\alpha$.
A tensor of rank $\binom{M}{N}$ can exist without reference to the metric or dot product. But if we endow the Universe with
a metric, we know that there is a correspondence of 1-forms to vectors. So it follows that a tensor S should be able
to accept either vectors or 1-forms as arguments in each slot. So really if there is a metric we need only think about
the total rank M + N of a tensor. There are then $2^{M+N}$ different forms of the tensor depending on which slots take
1-forms and which take vectors. These forms are related to each other by the operation of raising or lowering an index.
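Consider for definiteness transforming from the $\binom{0}{3}$ to the $\binom{1}{2}$ form of S, i.e. given $S_{\delta\epsilon\zeta}$ we wish to find $S^\alpha{}_{\beta\gamma}$. This is:
\[
S^\alpha{}_{\beta\gamma} = S(\omega^\alpha, e_\beta, e_\gamma) . \tag{28}
\]
Now remember that the vector v associated with $\omega^\alpha$ can be obtained by using the inverse metric: its components are
$v^\delta = g^{\delta\alpha}$, and the vector itself is $v = g^{\delta\alpha} e_\delta$. [Recall that the definition of the vector associated with a 1-form was
that $v \cdot w = \langle \omega^\alpha, w \rangle$ for any vector w ∈ V.] So we will write:
\[
S^\alpha{}_{\beta\gamma} = S(g^{\delta\alpha} e_\delta, e_\beta, e_\gamma) = g^{\delta\alpha} S_{\delta\beta\gamma} .
\]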
So we have seen that just as there is a natural correspondence of vectors and 1-forms if we are given a metric,
so there is also a correspondence of second rank tensors of all types, $\binom{0}{2}$, $\binom{1}{1}$, and $\binom{2}{0}$. Expressions involving the
components with up and down indices allow us to indicate which version of the tensor we are talking about, but if
there is a metric all these forms are equivalent. The computational rules associated with them are simple: they are
generalizations of matrix multiplication by the metric tensor g in either direct ($g_{\mu\nu}$) or inverse ($g^{\mu\nu}$) form.
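As an illustration of these rules for a second rank tensor, the sketch below raises and then lowers the first index using the Minkowski metric in the usual inertial coordinates with signature (−,+,+,+). The component values and the function names `raise_first` and `lower_first` are hypothetical; in a general coordinate system the inverse metric would have to be computed separately rather than reused.

```python
N = 4

# Minkowski metric eta_{mu nu} in inertial coordinates, signature (-,+,+,+).
ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]
# In these coordinates the inverse metric has the same components.
ETA_INV = ETA

def raise_first(T_low):
    # T^alpha_beta = g^{alpha delta} T_{delta beta}
    return [[sum(ETA_INV[a][d] * T_low[d][b] for d in range(N))
             for b in range(N)] for a in range(N)]

def lower_first(T_mixed):
    # T_{alpha beta} = g_{alpha delta} T^delta_beta
    return [[sum(ETA[a][d] * T_mixed[d][b] for d in range(N))
             for b in range(N)] for a in range(N)]

# Hypothetical components with both indices down, for illustration only.
T_low = [[float(1 + a + 4 * b) for b in range(N)] for a in range(N)]

T_mixed = raise_first(T_low)        # first index raised
round_trip = lower_first(T_mixed)   # lowering it again restores T_low

print(T_low[0], T_mixed[0])  # the time row changes sign; spatial rows do not
print(round_trip == T_low)
```

With this metric, raising a time index flips the sign of the corresponding components and leaves spatial indices unchanged, which is the matrix-multiplication rule in its simplest form.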