Lecture 6
COMMENTS ON HOMEWORK 2
• Never have the same index repeated more than twice in a single term. If you find that an index appears more than twice, go back and rename the offending dummy indices.
• The primes on the indices fundamentally mean primes on the underlying object too. For instance, $g_{\mu'\nu'}$ means the components of the metric in the primed coordinate system. So $g_{\mu'\nu}$ is meaningless.
• Partial derivatives taken in different coordinate systems do not commute in general. Consider for instance, in the plane with Cartesian coordinates $(x, y)$ and polar coordinates $(r, \phi)$,
$$f = \frac{1}{2} x^2 = \frac{1}{2} r^2 \cos^2\phi.$$
We then have
$$\frac{\partial}{\partial x}\frac{\partial f}{\partial r} = \frac{\partial}{\partial x}\left(r \cos^2\phi\right) = \frac{\partial}{\partial x}\frac{x^2}{r} = \frac{x}{r}\left(1 + \frac{y^2}{r^2}\right),$$
$$\frac{\partial}{\partial r}\frac{\partial f}{\partial x} = \frac{\partial}{\partial r}(x) = \frac{\partial}{\partial r}(r\cos\phi) = \cos\phi = \frac{x}{r}.$$
The reason the two do not commute is that the coordinates held constant are different in each derivative ($y$ in $\partial/\partial x$, but $\phi$ in $\partial/\partial r$); a symbolic check of this example is sketched just after these comments.
• Instantaneous rest frame: the frame in which, at a given instant, a particle is at rest, hence in this frame, its
4-velocity has components (1, 0, 0, 0).
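For the curious, the non-commutativity in the homework example above can be verified symbolically. Below is a minimal sympy sketch (my own illustration, not part of the assignment); it reproduces the two mixed derivatives and shows their difference is nonzero.

```python
import sympy as sp

x, y = sp.symbols('x y', positive=True)
r = sp.sqrt(x**2 + y**2)            # r(x, y); recall cos(phi) = x/r

f = sp.Rational(1, 2) * x**2        # f = x^2/2 = (r^2 cos^2 phi)/2

# df/dr at constant phi is r cos^2(phi) = x^2/r; then d/dx at constant y:
df_dr = x**2 / r
first = sp.simplify(sp.diff(df_dr, x))      # (x/r)(1 + y^2/r^2)

# df/dx at constant y is x = r cos(phi); then d/dr at constant phi:
second = x / r                              # cos(phi) = x/r

print(sp.simplify(first - second))          # x*y**2/(x**2+y**2)**(3/2) != 0
```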
CHANGE OF BASES
Last lecture we defined the tangent space $V_p$, and showed that it is an $n$-dimensional vector space, by proving that the partial derivative operators $\{\partial_{(\mu)}\}$ form a basis. This is called a coordinate basis of $V_p$.
We will mostly use coordinate bases of the tangent space, but we don't have to do so. In general, we will denote bases of $V_p$ with subscripted and parenthesized indices, e.g. $\{e_{(\mu)}\} = \{e_{(1)}, \dots, e_{(n)}\}$, and the components of a vector $V$ on a basis by $V^\mu$, i.e. $V = V^\mu e_{(\mu)}$. The parentheses are here to remind you that the $e_{(\mu)}$ are not components, but rather vectors. We place their indices down to make sure that pairs of indices which are summed over always come with one up and one down.
Now suppose that we have another basis $\{e_{(\mu')}\}$ (as before, putting the primes on the indices), related to the basis $\{e_{(\mu)}\}$ by
$$e_{(\mu)} = M^{\mu'}{}_{\mu}\, e_{(\mu')}.$$
In other words, $M^{\mu'}{}_{\mu}$ is the $\mu'$-th component of the vector $e_{(\mu)}$ on the basis $\{e_{(\mu')}\}$. Then, given a vector $V$, we can write
$$V = V^\mu e_{(\mu)} = V^\mu M^{\mu'}{}_{\mu}\, e_{(\mu')} = V^{\mu'} e_{(\mu')}, \qquad V^{\mu'} = M^{\mu'}{}_{\mu} V^\mu.$$
We see that getting the primed components of $V$ from its unprimed components involves the primed-to-unprimed change-of-basis matrix, i.e. components transform in the opposite direction to the basis vectors themselves. This is the origin of the word “contravariant” when talking about vectors.
Now let us apply this to changes of coordinate bases. We first need to find the components of the vectors $\partial_{(\mu)}$ on the coordinate basis $\{\partial_{(\mu')}\}$. Last lecture, we saw that the components of a vector $V$ on a coordinate basis are obtained by applying $V$ to the coordinate functions $x^\mu$, i.e. $V^\mu = V(x^\mu)$. Thus, we have
$$M^{\mu'}{}_{\mu} = \partial_{(\mu)}(x^{\mu'}) = \frac{\partial x^{\mu'}}{\partial x^\mu}.$$
Thus we arrive at the following natural-looking relations:
$$\partial_{(\mu)} = \frac{\partial x^{\mu'}}{\partial x^\mu}\,\partial_{(\mu')}, \qquad V^{\mu'} = \frac{\partial x^{\mu'}}{\partial x^\mu}\, V^\mu, \qquad V = V^\mu \partial_{(\mu)}.$$
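To make these relations concrete, here is a small sympy sketch (my own example, not from the lecture) with Cartesian coordinates $(x, y)$ as the unprimed system and polar coordinates $(r, \phi)$ as the primed one; it builds the matrix $M^{\mu'}{}_\mu = \partial x^{\mu'}/\partial x^\mu$ and applies it to vector components.

```python
import sympy as sp

x, y = sp.symbols('x y', positive=True)

# Primed coordinates (r, phi) as functions of the unprimed ones (x, y)
r = sp.sqrt(x**2 + y**2)
phi = sp.atan2(y, x)

# M^{mu'}_mu = dx^{mu'}/dx^mu: rows labeled by primed, columns by unprimed indices
M = sp.Matrix([[sp.diff(r, x), sp.diff(r, y)],
               [sp.diff(phi, x), sp.diff(phi, y)]])

# Contravariant transformation of components: V^{mu'} = M^{mu'}_mu V^mu
Vx, Vy = sp.symbols('V_x V_y')
V_primed = sp.simplify(M * sp.Matrix([Vx, Vy]))
print(V_primed)   # (V^r, V^phi) expressed in terms of V_x, V_y
```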
Consider a 1-dimensional smooth curve on $\mathcal{M}$ (the smoothness can be defined through charts) parametrized by some parameter $\tau \in \mathbb{R}$, i.e. the curve is a set of points $\{p(\tau)\}$. Given a function $f \in \mathcal{F}$, define the function
$$\tilde{f} : \begin{cases} \mathbb{R} \to \mathbb{R} \\ \tau \mapsto f(p(\tau)). \end{cases}$$
Then let us define the tangent vector $V \equiv d/d\tau$ along the curve as follows: given a smooth function $f \in \mathcal{F}$, we define $\frac{d}{d\tau}(f) \equiv d\tilde{f}/d\tau$, where the right-hand side is the usual derivative. It should be clear that $d/d\tau$ is a tangent vector. It should also be clear that its components on a coordinate basis $\{\partial_{(\mu)}\}$ are just $dx^\mu/d\tau$. We thus have
$$\frac{d}{d\tau} = \frac{dx^\mu}{d\tau}\,\partial_{(\mu)}.$$
The name $\tau$ was chosen on purpose to remind you of the proper time (which we will come back to once we formally introduce tensors). And indeed, the formal definition of the 4-velocity $u$ is the tangent vector $d/d\tau$ along a curve parametrized by $\tau$. It is a vector whose components on a coordinate basis are $dx^\mu/d\tau$.
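Since $d/d\tau$ is just the chain rule in disguise, the relation above is easy to verify symbolically. A short sympy sketch, with a curve and a function that are arbitrary choices of mine:

```python
import sympy as sp

tau = sp.symbols('tau')
x, y = sp.symbols('x y')

# An example curve tau -> (x(tau), y(tau)) in a 2d chart, and a function f
x_of, y_of = sp.cos(tau), tau * sp.sin(tau)
f = x**2 * y + sp.sin(y)

# Left-hand side: d f~/d tau, with f~(tau) = f(p(tau))
lhs = sp.diff(f.subs({x: x_of, y: y_of}), tau)

# Right-hand side: (dx^mu/dtau) df/dx^mu, evaluated along the curve
rhs = (sp.diff(x_of, tau) * sp.diff(f, x).subs({x: x_of, y: y_of})
       + sp.diff(y_of, tau) * sp.diff(f, y).subs({x: x_of, y: y_of}))

print(sp.simplify(lhs - rhs))   # 0: d/dtau = (dx^mu/dtau) d_(mu)
```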
DUAL VECTORS
A dual vector $W$ at $p \in \mathcal{M}$ is defined as a linear map $W : V_p \to \mathbb{R}$. Note that this is a very general definition, that can be made for any vector space (i.e. it is not restricted to tangent spaces of manifolds). In words, a dual vector $W$ acting on a vector $V$ gives a real number, $W(V) \in \mathbb{R}$. Instead of parentheses, we will denote the action of dual vectors on vectors by a dot: $W(V) \equiv W \cdot V$. We denote by $V_p^*$ the set of dual vectors at some point $p \in \mathcal{M}$. I let you show for yourselves that this is a vector space.
Given a basis $\{e_{(\mu)}\}$ of $V_p$, we can define the dual basis $\{e^{*(\mu)}\}$ such that $e^{*(\mu)} \cdot e_{(\nu)} = \delta^\mu_\nu$. I leave it as a homework exercise for you to show that this is indeed a basis of $V_p^*$, which is thus also a vector space of dimension $n$. Given a vector $V = V^\mu e_{(\mu)}$, we can easily show that its components are given by
$$V^\mu = e^{*(\mu)} \cdot V.$$
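Concretely, if we store basis vectors of $\mathbb{R}^n$ as the columns of an invertible matrix, the dual basis vectors are the rows of its inverse. A minimal numpy sketch (the particular basis is an arbitrary choice of mine):

```python
import numpy as np

# Columns of E are the basis vectors e_(mu) of V_p ~ R^3
E = np.array([[1.0, 1.0, 0.0],
              [0.0, 2.0, 1.0],
              [0.0, 0.0, 3.0]])

# Rows of E^{-1} are the dual basis vectors e*(mu)
E_dual = np.linalg.inv(E)
print(np.allclose(E_dual @ E, np.eye(3)))   # True: e*(mu) . e_(nu) = delta^mu_nu

# Components of a vector V on this basis: V^mu = e*(mu) . V
V = np.array([2.0, -1.0, 4.0])
components = E_dual @ V
print(np.allclose(E @ components, V))       # True: V = V^mu e_(mu)
```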
Now consider again two bases of $V_p$, related by $e_{(\mu)} = M^{\mu'}{}_{\mu}\, e_{(\mu')}$. For any vector $V$, we have
$$e^{*(\mu')} \cdot V = V^{\mu'} = M^{\mu'}{}_{\mu} V^\mu = M^{\mu'}{}_{\mu}\, e^{*(\mu)} \cdot V,$$
where we used the results from above. For this to hold for any vector $V$, we must have
$$e^{*(\mu')} = M^{\mu'}{}_{\mu}\, e^{*(\mu)}.$$
Now, since $\{e^{*(\mu)}\}$ form a basis of $V_p^*$, we can write any dual vector $W$ as
$$W = W_\mu e^{*(\mu)}.$$
The components are obtained by dotting the dual vector into the basis vectors:
$$W_\mu = W \cdot e_{(\mu)}.$$
As you might suspect, and are asked to show explicitly, the components of a dual vector transform as
$$W_\mu = M^{\mu'}{}_{\mu} W_{\mu'},$$
that is, in the same way as the basis vectors. This is why dual vectors are also called covariant vectors.
Now, given a coordinate basis $\{\partial_{(\mu)}\}$, we can define its dual basis (just like for any basis!). We use the notation $\{dx^{(\mu)}\}$ for this dual basis, i.e. $dx^{(\mu)} \cdot \partial_{(\nu)} = \delta^\mu_\nu$. From the transformation rule of the dual basis, we see that, under a change of coordinates,
$$dx^{(\mu')} = \frac{\partial x^{\mu'}}{\partial x^\mu}\, dx^{(\mu)}.$$
Given a function $f \in \mathcal{F}$, we can define a dual vector $\nabla f$ through its expansion on this basis,
$$\nabla f = \nabla_\mu f\, dx^{(\mu)}.$$
Now let us figure out the components on the coordinate basis, by computing $\nabla f \cdot \partial_{(\nu)}$:
$$\nabla_\nu f = \nabla f \cdot \partial_{(\nu)} = \partial_{(\nu)}(f) = \frac{\partial f}{\partial x^\nu}.$$
We have thus found
$$df = \nabla f = \frac{\partial f}{\partial x^\mu}\, dx^{(\mu)}.$$
The dual vector $df = \nabla f$ is called the gradient of $f$. Now consider a vector $V = V^\mu \partial_{(\mu)}$:
$$\nabla f \cdot V = V^\mu \nabla_\mu f = V^\mu \frac{\partial f}{\partial x^\mu}.$$
In other words, this gives the directional derivative of f along V .
Note that the expression above only holds in a coordinate (dual) basis. The gradient $\nabla f$ is still well defined regardless of basis, but its components in a general basis are no longer $\partial f/\partial x^\mu$.
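As a concrete illustration of the gradient and the directional derivative, here is a short sympy sketch (the function and the vector components are arbitrary choices of mine):

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x**2 * sp.exp(y)

# Components of the gradient df on the coordinate dual basis: df/dx^mu
grad = [sp.diff(f, x), sp.diff(f, y)]

# An arbitrary vector V = V^mu d_(mu)
V = [3, -1]

# Directional derivative of f along V: V^mu df/dx^mu
print(sp.simplify(sum(Vmu * df for Vmu, df in zip(V, grad))))
# -> -x**2*exp(y) + 6*x*exp(y)
```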
Since $V_p^*$ is a vector space, we can also define its dual $V_p^{**}$, which is also a vector space of dimension $n$. Conveniently, there exists a basis-independent bijective mapping between $V_p^{**}$ and $V_p$: given a vector $V$, define the following dual-dual vector (which by definition is a linear map from $V_p^*$ to $\mathbb{R}$)
$$\tilde{V} : \begin{cases} V_p^* \to \mathbb{R} \\ W \mapsto \tilde{V} \cdot W \equiv W \cdot V. \end{cases}$$
I let you reflect on why the mapping $V \mapsto \tilde{V}$ is bijective. The consequence is that we can see vectors as dual-dual vectors, i.e. we can identify the dual-dual space with the original vector space, $V_p^{**} \leftrightarrow V_p$. Note that there are many ways to build bijective mappings between $V_p$ and $V_p^*$ (and, in general, between any two vector spaces of the same dimension), but there is no “geometric”, basis-independent mapping as there is between $V_p^{**}$ and $V_p$.
We can thus define $V \cdot W \equiv \tilde{V} \cdot W = W \cdot V$. This is called the contraction of the vector $V$ with the dual vector $W$.
Now suppose that we have a basis $\{e_{(\mu)}\}$ of $V_p$ and the associated dual basis $\{e^{*(\mu)}\}$ of $V_p^*$. Given a dual vector $W = W_\mu e^{*(\mu)}$ and a vector $V = V^\mu e_{(\mu)}$, we have
$$W \cdot V = V \cdot W = W_\mu V^\mu.$$
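The contraction $W_\mu V^\mu$ is basis independent: components of vectors transform with $M$, components of dual vectors transform the opposite way, and the two effects cancel in the sum. A quick numerical sketch with random components and a random change-of-basis matrix (my own illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3

v = rng.standard_normal(n)        # V^mu on some unprimed basis
w = rng.standard_normal(n)        # W_mu on the same basis

# A random (almost surely invertible) change-of-basis matrix M^{mu'}_mu
M = rng.standard_normal((n, n))

v_primed = M @ v                      # contravariant: V^{mu'} = M^{mu'}_mu V^mu
w_primed = np.linalg.solve(M.T, w)    # covariant: W_mu = M^{mu'}_mu W_{mu'}

print(np.isclose(w @ v, w_primed @ v_primed))   # True: W_mu V^mu is invariant
```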
TENSORS
A tensor $T$ of rank $(k, l)$ at $p \in \mathcal{M}$ is a multilinear map
$$T : (V_p^*)^k \times (V_p)^l \to \mathbb{R},$$
i.e. a linear operator that takes in $k$ dual vectors $W^{(1)}, \dots, W^{(k)}$ and $l$ vectors $V_{(1)}, \dots, V_{(l)}$, and returns a real number $T(W^{(1)}, \dots, W^{(k)}, V_{(1)}, \dots, V_{(l)})$.
Note that the parenthesized indices are labels, and are not components. Just like we did for basis vectors
and basis dual vectors, we use down labels for vectors and up labels for dual vectors, which works well with the
repeated-up-and-down-index summation convention.
Given two dual vectors $X, Y$, we can construct the following tensor of rank $(0, 2)$:
$$X \otimes Y : \begin{cases} V_p \times V_p \to \mathbb{R} \\ (U, V) \mapsto (X \cdot U)(Y \cdot V). \end{cases}$$
This is called the tensor product of the dual vectors $X, Y$. Similarly, we can construct tensor products of arbitrary numbers of vectors and dual vectors. For instance, given $k$ vectors $X_{(1)}, \dots, X_{(k)}$ and $l$ dual vectors $Y^{(1)}, \dots, Y^{(l)}$, we define
$$X_{(1)} \otimes \dots \otimes X_{(k)} \otimes Y^{(1)} \otimes \dots \otimes Y^{(l)} : \begin{cases} (V_p^*)^k \times (V_p)^l \to \mathbb{R} \\ (W^{(1)}, \dots, W^{(k)}, V_{(1)}, \dots, V_{(l)}) \mapsto (X_{(1)} \cdot W^{(1)}) \cdots (X_{(k)} \cdot W^{(k)})\,(Y^{(1)} \cdot V_{(1)}) \cdots (Y^{(l)} \cdot V_{(l)}). \end{cases}$$
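In components, the tensor product of two dual vectors is just the outer product of their component arrays. A minimal numpy check of the rank-$(0, 2)$ definition above, with arbitrary arrays standing in for the components:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3

# Components of two dual vectors X, Y and two vectors U, V on some basis
X, Y = rng.standard_normal(n), rng.standard_normal(n)
U, V = rng.standard_normal(n), rng.standard_normal(n)

XY = np.outer(X, Y)     # components (X (x) Y)_{mu nu} = X_mu Y_nu

# Acting on (U, V) reproduces (X . U)(Y . V)
print(np.isclose(U @ XY @ V, (X @ U) * (Y @ V)))   # True
```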
In particular, given a basis $\{e_{(\mu)}\}$ of $V_p$ and a dual basis $\{e^{*(\nu)}\}$ of $V_p^*$, we can construct a basis of the space of rank-$(k, l)$ tensors with tensor products:
$$\left\{ e_{(\mu_1)} \otimes \dots \otimes e_{(\mu_k)} \otimes e^{*(\nu_1)} \otimes \dots \otimes e^{*(\nu_l)} \right\},$$
where each index $\mu_1, \dots, \mu_k, \nu_1, \dots, \nu_l$ runs from $1$ to $n$ (or $0$ to $n-1$). I let you think about why this is indeed a basis. This means that the space of tensors of rank $(k, l)$ is of dimension $n^{k+l}$. We can then write a tensor as a linear combination of the basis vectors:
$$T = T^{\mu_1 \dots \mu_k}{}_{\nu_1 \dots \nu_l}\; e_{(\mu_1)} \otimes \dots \otimes e_{(\mu_k)} \otimes e^{*(\nu_1)} \otimes \dots \otimes e^{*(\nu_l)},$$
where each repeated index is summed over (so this expression contains $k + l$ nested sums). The $n^{k+l}$ real numbers $T^{\mu_1 \dots \mu_k}{}_{\nu_1 \dots \nu_l}$ are the components of the tensor.
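For instance, a rank-$(1, 1)$ tensor acting on one dual vector and one vector is the double sum $T^\mu{}_\nu W_\mu V^\nu$; numpy's einsum makes the nested sums explicit (all components below are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3

T = rng.standard_normal((n, n))   # components T^mu_nu of a rank-(1,1) tensor
W = rng.standard_normal(n)        # dual vector components W_mu
V = rng.standard_normal(n)        # vector components V^nu

# T(W, V) = T^mu_nu W_mu V^nu: k + l = 2 nested sums over n^2 components
result = np.einsum('mn,m,n->', T, W, V)

# Same number from explicit loops
check = sum(T[m, v] * W[m] * V[v] for m in range(n) for v in range(n))
print(np.isclose(result, check))   # True
```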
Note that we have already encountered tensors: vectors are tensors of rank (1, 0) and dual vectors are
tensors of rank (0, 1).
So far we have denoted vectors with a bar on top and dual vectors with a bar on the bottom. We could extend this notation to tensors, but it would quickly become cumbersome. We could also adopt a slot notation, for instance, write $T^{\bullet\bullet\bullet}{}_{\bullet\bullet}$ to mean a rank-$(3, 2)$ tensor, but again, this is not super convenient.
So instead, we adopt the abstract index notation. We will say things like “consider the tensor $T^{\alpha\beta\gamma}{}_{\delta\lambda}$”, meaning the geometric object $T$ which is a rank-$(3, 2)$ tensor, not its components in a specific basis. In this context, the Greek letters play the role of placeholders. I will try and keep Greek letters close to the beginning of the alphabet to mean the geometric object, and letters further along, $\mu, \nu, \dots$, to mean the components, but the context should make the meaning clear. All this will make more sense as we go and encounter specific examples.