Overview
outline notes
John Magorrian
john.magorrian@physics.ox.ac.uk
4 October 2021
Week 1
Introduction
1.1 Vectors
When you think of a “vector” you might think of one or both of the following ideas:
• a physical quantity that has both length and direction
  – e.g., displacement, velocity, acceleration, force, electric field,
  – angular momentum, torque, ...
• a column of numbers (coordinate vector)
The mathematical definition of a vector generalises these familiar notions by redefining vectors
in terms of their relationship to other vectors: vectors are objects that live (with other like-
minded vectors) in a so-called vector space; anything that inhabits a vector space is then a
vector. A vector space is a set of vectors, equipped with operations of addition of two vectors and
multiplication of vectors by scalars (i.e., scaling) that must obey certain rules. These distil the
fundamental essence of “vector” into a small set of rules that are easily generalisable to, say, spaces
of functions. From this small set of rules we can introduce concepts like basis vectors, coordinates
and the dimension of the vector space. We’ll meet vector spaces in week 3, after limbering up by
recalling some of the more familiar geometric properties of 2d and 3d vectors next week (week 2).
In these notes I use a calligraphic font to denote vector spaces (e.g., V, W), and a bold font for
vectors (e.g., a, b).
1.2 Matrices
You have probably already used matrices to represent
• geometrical transformations, such as rotations or reflections, or
• systems of simultaneous linear equations.
But, just as we’ll generalise the notion of vector, we’ll also generalise that of matrix: a matrix
represents a linear map from one vector space to another. A map f : V → W from one vector
space V to another space W is linear if it satisfies the pair of conditions
f (a + b) = f (a) + f (b),
f (αa) = αf (a),    (1.1)
for any vectors a, b ∈ V and for any scalar value α. An alternative way of writing these linearity
conditions is that
f (αa + βb) = αf (a) + βf (b), (1.2)
for any pair of vectors a, b ∈ V and any pair of scalars α, β. Examples of linear maps include
rotations, reflections, shears and projections. Linear maps are fundamental to undergraduate
physics: the first step in understanding a complex system is nearly always to “linearise” it: that is,
to approximate its equations of motion by some form of linear map, ẋ = f (x). We’ll spend most of
this course studying the basic properties of these maps and learning different ways of characterising
them.
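As a quick numerical illustration of the linearity condition (1.2), here is a short Python/numpy sketch (the rotation, vectors and scalars are arbitrary choices made up for the example):

    import numpy as np

    theta = 0.7
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])   # rotation of the plane: a linear map

    f = lambda x: R @ x
    a, b = np.array([1.0, 2.0]), np.array([-0.5, 3.0])
    alpha, beta = 2.0, -1.5

    # f(alpha a + beta b) should equal alpha f(a) + beta f(b)
    print(np.allclose(f(alpha * a + beta * b), alpha * f(a) + beta * f(b)))   # True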
Even though our intuition about vectors and linear maps comes directly from 2d and 3d real
geometry, in this course we’ll mostly focus on their algebraic properties: that is why this subject
is usually called “linear algebra”.
Week 2
We usually describe a 3d vector r by expanding it in terms of a chosen set of basis vectors, writing, for example,
r = xi + yj + zk,    (2.1)
or
r = r1 e1 + r2 e2 + r3 e3 , (2.2)
and then use these coordinates (x, y, z) or (r1 , r2 , r3 ) for doing calculations. The numerical values
of these coordinates depend on our choice of basis vectors.
Now let us identify the basis vectors with 3d column vectors, like this:
e1 ↔ (1, 0, 0)^T,  e2 ↔ (0, 1, 0)^T,  e3 ↔ (0, 0, 1)^T,    (2.3)
so that
r ↔ (r1, r2, r3)^T = r1 (1, 0, 0)^T + r2 (0, 1, 0)^T + r3 (0, 0, 1)^T.    (2.4)
This (r1 , r2 , r3 )T is known as the coordinate vector for r. Once we’ve chosen a set of basis
vectors, every physical vector r, v, E has a corresponding coordinate vector and vice versa, and
this correspondence respects the rules of vector arithmetic: if, say,
r(t) = r0 + vt,    (2.5)
then the corresponding coordinate vectors satisfy the same relation. For that reason we often don’t
distinguish between “physical” vectors and their corresponding coordinate vectors, but in this
course we’ll be careful to make that distinction.
Throughout the following I’ll use the convention that scalar indices a1 , a2 , a3 etc refer to coor-
dinates of the vector a.
2.2 Products
Apart from adding vectors and scaling them, there are a number of other ways of operating on
pairs or triplets of vectors, which are naturally defined by appealing to familiar geometrical
ideas:1
The scalar product is AL p.21
a · b ≡ |a| |b| cos θ, (2.6)
where |a| and |b| are the lengths or magnitudes of a and b and θ is the angle between them.
The result is a scalar, hence the name. Geometrically, it measures the projection of the
vector b along the direction of a.
The vector product of two vectors is another vector, AL p.24
a × b ≡ |a| |b| sin θ n̂,    (2.7)
where n̂ is the unit vector perpendicular to both a and b, chosen so that (a, b, n̂) form a right-handed set.
The scalar triple product of three vectors is the scalar
a · (b × c).    (2.8)
This is the (oriented) volume of the parallelepiped spanned by a, b and c.
The scalar product is clearly commutative: b · a = a · b.2 Notice that the (square of the) length
of a vector is given by the scalar product of the vector with itself:
|a|2 = a · a, (2.9)
because cos θ = 1 in this case. Two non-zero vectors a and b are perpendicular (θ = ±π/2) if and
only if a · b = 0. The vector product is anti -commutative: b × a = −a × b.
The scalar and vector products are both linear:3
c · (αa + βb) = α c · a + β c · b,
c × (αa + βb) = α c × a + β c × b.    (2.10)
Because the triple product is made up of a scalar and a vector product, each of which is linear in
each of its arguments, it follows that the triple product itself is also linear in each of a, b, c.
Later we’ll see that the scalar product can be defined for vectors of any dimension. In contrast,
the vector and triple products apply only to three-dimensional vectors.4
1 The corresponding algebraic definitions of these will follow later in the course.
2 They commute if a and b are real vectors; for complex vectors the relation is slightly more complex (week NN).
3 Again, true for real vectors; only half true for complex vectors.
4 After your first pass through the course, have a think about how they might be generalised to spaces of arbitrary
dimension.
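If you want to check these properties numerically, here is a small Python/numpy sketch (the vectors are arbitrary illustrative choices; the last line uses the fact that the triple product equals the determinant of the matrix whose columns are a, b, c):

    import numpy as np

    a = np.array([1.0, 2.0, 0.0])
    b = np.array([0.0, 1.0, 3.0])
    c = np.array([2.0, -1.0, 1.0])

    print(np.dot(a, b), np.dot(b, a))            # scalar product commutes
    print(np.cross(a, b) + np.cross(b, a))       # vector product anti-commutes: the sum is zero
    triple = np.dot(a, np.cross(b, c))           # oriented volume of the parallelepiped
    print(triple, np.linalg.det(np.column_stack([a, b, c])))   # the two agree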
2.3 Turning geometry into algebra
Lines Referred to some origin O, the displacement vectors of the points on the line that passes through the AL p.33
point with displacement vector p and runs parallel to q are
r(λ) = p + λq,    (2.11)
the position along the line being controlled by the parameter λ ∈ R. Expressing this in coordinates
and rearranging, we have
λ = (r1 − p1)/q1 = (r2 − p2)/q2 = (r3 − p3)/q3.    (2.12)
Planes Similarly, AL p.36
r(λ1 , λ2 ) = p + λ1 q1 + λ2 q2 (2.13)
is the parametric equation of the plane spanned by q1 and q2 that passes through p. A neater
alternative way of representing a plane is
r · n̂ = d. (2.14)
where n̂ is the unit vector normal to the plane and d the shortest distance between the plane and
the origin r = 0.
Spheres Points on the sphere of radius R centred on the point displaced a from O satisfy AL p.40
|r − a|2 = R2 . (2.15)
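Here is a short Python/numpy sketch (the plane is a made-up example) of converting the parametric form (2.13) into the normal form (2.14) and checking that points of the plane satisfy r · n̂ = d:

    import numpy as np

    p  = np.array([1.0, 0.0, 2.0])
    q1 = np.array([1.0, 1.0, 0.0])
    q2 = np.array([0.0, 1.0, 1.0])

    n_hat = np.cross(q1, q2)
    n_hat = n_hat / np.linalg.norm(n_hat)   # unit normal to the plane
    d = np.dot(p, n_hat)                    # (signed) distance of the plane from the origin

    for lam1, lam2 in [(0.0, 0.0), (1.0, -2.0), (0.3, 0.7)]:
        r = p + lam1 * q1 + lam2 * q2
        print(np.isclose(np.dot(r, n_hat), d))   # every point of the plane satisfies r . n_hat = d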
Exercises
1. Show that the shortest distance dmin between the point p0 and the line r(λ) = p + λq is given
by
dmin = |(p − p0) × q| / |q|.    (2.16)
What is the minimum distance of the point (1, 1, 1) from the line AL p.36
2. Show that the line r(λ) = P + ΛQ intersects the plane r(λ1 , λ2 ) = p + λ1 q1 + λ2 q2 at the
point r = P + Λ0 Q, where the parameter
Λ0 = (p − P) · (q1 × q2) / (Q · (q1 × q2)).    (2.18)
This is well defined unless the denominator Q · (q1 × q2 ) = 0. Explain geometrically what
happens in this case.
3. Show that the shortest distance between the pair of lines r1 (λ1 ) = p1 + λ1 q1 and r2 (λ2 ) =
p2 + λ2 q2 is given by
dmin = |(p2 − p1 ) · n̂|, (2.19)
where n̂ = (q1 × q2 )/|q1 × q2 |.
4. What is the minimum distance of the point (1, 1, 1) from the plane
r = (2, −1, 4)^T + λ1 (3, −5, 2)^T + λ2 (0, 1, 1)^T ?    (2.20)
2.4 Orthonormal bases
Any (e1, e2, e3) with e1 · (e2 × e3) ≠ 0 is a basis for 3d space, but it nearly always makes sense to
choose an orthonormal basis, which is one that satisfies
ei · ej = δij (2.21)
for any choice of i, j, where AL p.25
δij ≡ { 1 if i = j,  0 if i ≠ j },    (2.22)
is known as the Kronecker delta. Then, using the linearity of the scalar product, we have that
a · b = (a1 e1 + a2 e2 + a3 e3) · (b1 e1 + b2 e2 + b3 e3) = · · · = a1 b1 + a2 b2 + a3 b3.    (2.23)
Now we can go algebraic on the vector product. Let e1 , e2 , e3 be a right-handed, orthonormal AL p.25
basis. Then we have that
ei × ej = Σ_{k=1}^{3} εijk ek,    (2.28)
where εijk is the Levi-Civita (alternating) symbol. That is, ε123 = ε231 = ε312 = 1, ε213 = ε132 = ε321 = −1 and εijk = 0 for all other 21 possible
choices of (i, j, k).
Now we can easily express vector equations in terms of their coordinates:
r = p + λq      →   ri = pi + λqi,
d = a · b       →   d = Σ_{i=1}^{3} ai bi,
c = a × b       →   ci = Σ_{j=1}^{3} Σ_{k=1}^{3} εijk aj bk,
a · (b × c)     →   Σ_{i=1}^{3} Σ_{j=1}^{3} Σ_{k=1}^{3} εijk ai bj ck.
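Here is a Python/numpy sketch (with arbitrarily chosen vectors) that builds the εijk symbol explicitly and checks the coordinate expression for the vector product against numpy’s built-in cross product:

    import numpy as np

    eps = np.zeros((3, 3, 3))
    for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
        eps[i, j, k] = 1.0      # even permutations of (1, 2, 3)
        eps[i, k, j] = -1.0     # odd permutations

    a = np.array([1.0, -2.0, 0.5])
    b = np.array([3.0, 1.0, 2.0])

    c = np.einsum('ijk,j,k->i', eps, a, b)   # c_i = sum_{jk} eps_ijk a_j b_k
    print(c, np.cross(a, b))                 # the two agree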
Exercises
1. Show that a · (a × c) = 0.
2. Show that the scalar triple product a · (b × c) is unchanged under cyclic interchange of a, b, c.
3. Use the (important) identity AL p.25
Σ_{i=1}^{3} εijk εilm = δjl δkm − δjm δkl,    (2.30)
to show that
Σ_{i=1}^{3} Σ_{j=1}^{3} εijk εijm = 2δkm,
Σ_{i=1}^{3} Σ_{j=1}^{3} Σ_{k=1}^{3} εijk εijk = 6,    (2.31)
and to prove the following relations involving the vector product: AL p.26
a × (b × c) = (a · c)b − (a · b)c,
(a × b) · (c × d) = (a · c)(b · d) − (a · d)(b · c), (2.32)
a · (a × b) = 0.
2.6 Matrices
If you’re not familiar with the rules of matrix arithmetic see §3.2.1 of Lukas’ notes or §8.4 of Riley,
Hobson & Bence. Here is a recap: AL p.51
The important message of this section is that every n × m matrix is a linear map from m-dimensional
column vectors to n-dimensional ones. Conversely, any linear map from m-dimensional column
vectors to n-dimensional ones can be represented by some n × m matrix, as we’ll now show.
B(e_j^m) = Σ_{i=1}^{n} Bij e_i^n,    (2.36)
where e_j^m denotes the jth standard basis vector of F^m and e_i^n the ith standard basis vector of F^n.
2.6.2 Terminology
Square matrices have as many rows as columns.
The (main) diagonal of a matrix A consists of the elements A11 , A22 , ... (i.e., top left to
bottom right). A diagonal matrix is one for which Aij = 0 when i ≠ j; all elements off the
main diagonal are zero.
The n-dimensional identity matrix is the n × n diagonal matrix with elements Iij = δij .
Given an n × m matrix A, its transpose AT is the m × n matrix with elements [AT ]ij = Aji .
Its Hermitian conjugate A† has elements [A†]ij = A*ji.
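To illustrate how a matrix encodes a linear map (equation (2.36): the jth column is the image of the jth basis vector), here is a Python/numpy sketch; the map used, a rotation by 90° about the z-axis, is just a made-up example:

    import numpy as np

    def f(x):                                # a linear map R^3 -> R^3
        return np.array([-x[1], x[0], x[2]])

    basis = np.eye(3)                        # columns are e_1, e_2, e_3
    B = np.column_stack([f(basis[:, j]) for j in range(3)])   # column j is f(e_j)

    x = np.array([1.0, 2.0, 3.0])
    print(np.allclose(B @ x, f(x)))          # the matrix reproduces the map on any vector

    print(B.T)                               # transpose: [B^T]_ij = B_ji
    print(B.conj().T)                        # Hermitian conjugate (same as B.T here, since B is real)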
Exercises
1. A 2 × 2 matrix is a linear map from the plane (2d column vectors) onto itself. It is completely
defined by 4 numbers. Identify the different types of transformation that can be constructed
from these 4 numbers.
2. A 3 × 3 matrix maps 3d space onto itself. It has 9 free parameters. Identify the geometrical
transformations that it can effect.
3. Given an n × n matrix with complex coefficients, how would you characterise the map it represents?
4. Show that (AB)T = B T AT and (AB)† = B † A† . Hence show that AAT = I implies AT A = I.
5. Show that the map B(x) = n × x is linear. Find the matrix that represents it. AL p.57
Week 3
4. Every vector has an additive inverse: for all a ∈ V there is some a0 ∈ V for which
a + a0 = 0; (3.4)
6. The multiplication-by-scalar operation must be distributive with respect to vector and scalar
addition, consistent with the operation of multiplying two scalars and must satisfy the mul-
tiplicative identity:
α(a + b) = αa + αb;
(α + β)a = αa + βa;
(3.6)
α(βa) = (αβ)a;
1a = a.
For our purposes the scalars F will usually be either the set R of all real numbers (in which case
we have a real vector space) or the set C of all complex numbers (giving a complex vector
space). Note that the “type” of vector space refers to the type of scalars!
Examples abound: see lectures.
A subset W ⊆ V is a subspace of V if it satisfies the first 4 conditions above. That is: it must be AL p.12
closed under addition of vectors and multiplication by scalars; it must contain the zero vector; the
additive inverse of each element must be included. The other conditions are automatically satisfied
because they depend only on the definition of the addition and multiplication operations.
Notice that the conditions for defining a vector space involve only linear combinations of vectors: AL p.14
new vectors constructed simply by scaling and adding other vectors. There is not (yet) any notion of
length or angle: these require the introduction of a scalar product (§5.1). The following important
ideas are associated with linear combinations of vectors:
The span of a list of vectors (v1 , ..., vm ) is the set of all possible linear combinations of them:
span(v1, ..., vm) ≡ { Σ_{i=1}^{m} αi vi | αi ∈ F }.    (3.7)
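A simple numerical way of testing whether a given vector lies in the span of a list of vectors is to look for coefficients αi by least squares; here is a Python/numpy sketch with made-up vectors:

    import numpy as np

    v1 = np.array([1.0, 0.0, 1.0])
    v2 = np.array([0.0, 1.0, 1.0])
    V = np.column_stack([v1, v2])            # columns span the subspace

    for w in [v1 + 3 * v2, np.array([1.0, 1.0, 0.0])]:
        alpha, *_ = np.linalg.lstsq(V, w, rcond=None)
        print(w, np.allclose(V @ alpha, w))  # True only if w is a linear combination of v1, v2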
3.2 Linear maps
Recall that a function is a mapping f : X → Y from a set X (the domain of f ) to another set Y AL p.42
(the codomain of f ). The image of f is defined as
Im f ≡ {f (x)|x ∈ X} ⊆ Y. (3.9)
The map f is
one-to-one (injective) if each y ∈ Y is mapped to by at most one element of X;
onto (surjective) if each y ∈ Y is mapped to by at least one element of X;
bijective if each y ∈ Y is mapped to by precisely one element of X (i.e., f is both injective
and surjective).
For the rest of the course we’ll focus on maps f : V → W, whose domain V and codomain W are
vector spaces, possibly of different dimensions, but over the same field F. Such an f : V → W is AL p.45
linear if for all vectors v1 , v2 ∈ V and all scalars α ∈ F,
f (v1 + v2 ) = f (v1 ) + f (v2 ),
(3.10)
f (αv) = αf (v).
It’s easy to see from this that if f : V → W and g : W → U are both linear maps then their
composition g ◦ f : V → U is also linear. And if a linear f : V → V is bijective then its inverse f −1
is also a linear map.
Im f is a vector subspace of W.
f injective ⇔ Ker f = {0} ⇔ dim Ker f = 0.
But the main result in this section is that any linear map f : V → W satisfies the dimension
theorem AL p.47
dim Ker f + dim Im f = dim V. (3.12)
We’ll be invoking this theorem over the next few lectures. Here are some immediate consequences
of it:
if f has an inverse (bijective) then dim V = dim W; AL p.48
if dim V = dim W then the following statements are equivalent (they are all either true or
false): f is bijective ⇔ dim Ker f = 0 ⇔ rank f = dim W;
Matrices Any n × m matrix is a linear map from F m to F n : for any u, v ∈ F m we have AL p.49
that A(u + v) = A(u) + A(v) and A(αv) = α(Av). Composition of maps corresponds to matrix multiplication.
Coordinate maps Given a vector space V over a field F with basis e1 , ..., en , we have that
any vector in V can be expressed as
v = Σ_{i=1}^{n} vi ei,
where the vi are the coordinates of the vector wrt the ei basis. Introduce a mapping ϕ :
F n → V defined by
ϕ((v1, ..., vn)^T) = Σ_{i=1}^{n} vi ei.    (3.13)
This is called the coordinate map (for the basis e1, ..., en). It is clearly linear. Because
dim F^n = dim V and Im ϕ = V it follows as a consequence of the dimension theorem that ϕ
is bijective and therefore has an inverse mapping ϕ−1 : V → F^n. The coordinates (v1, ..., vn)
of the vector v are vi = (ϕ−1(v))i; a numerical sketch follows at the end of these examples.
Linear differential operators Consider the vector space C ∞ (R) of smooth functions
f : R → C. Then the differential operator
L : C∞(R) → C∞(R),
L• = Σ_{i=0}^{n} pi(x) (d^i/dx^i) •,    (3.14)
is a linear map.
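Here is a small Python/numpy sketch of the coordinate map of (3.13) for a made-up basis of R^3, with the basis vectors stored as the columns of a matrix E, so that ϕ is a matrix multiplication and ϕ−1 is a linear solve:

    import numpy as np

    E = np.array([[1.0, 1.0, 0.0],
                  [0.0, 1.0, 1.0],
                  [1.0, 0.0, 1.0]])             # columns are the basis vectors e_1, e_2, e_3

    phi     = lambda coords: E @ coords         # coordinates -> vector
    phi_inv = lambda v: np.linalg.solve(E, v)   # vector -> coordinates

    v = np.array([2.0, 3.0, -1.0])
    coords = phi_inv(v)
    print(coords, np.allclose(phi(coords), v))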
3.3 Change of basis of the matrix representing a linear map
Let f : V → W be a linear map. Suppose that v1 , ..., vm is a basis for V with coordinate map AL p.71
φ : F m → V, and w1 , ..., wn is a basis for W with coordinate map ψ : F n → W. The matrix A
that represents f : V → W in these bases is simply
A = ψ −1 ◦ f ◦ φ. (3.15)
Now consider another pair of bases for V and W with coordinate maps
ϕ′ : F^m → V, with basis v′1, ..., v′m ∈ V;
ψ′ : F^n → W, with basis w′1, ..., w′n ∈ W.
The matrix representing f in these new bases is
A′ = (ψ′)−1 ◦ f ◦ ϕ′
   = [(ψ′)−1 ◦ ψ] ◦ [ψ−1 ◦ f ◦ ϕ] ◦ [ϕ−1 ◦ ϕ′]    (3.16)
   = Q A P−1,
where Q ≡ (ψ′)−1 ◦ ψ and P−1 ≡ ϕ−1 ◦ ϕ′.
Here P−1 is an m × m matrix that transforms coordinate vectors wrt the v′i basis to the vi one.
Inverting, P transforms unprimed coordinates to primed ones. Similarly, Q transforms from the
wi basis to the w′i one.
The most important case is when V = W, vi = wi and v′i = w′i. Then A and A′ are both square
and are related through
A′ = P A P−1.    (3.17)
Here’s how this works: P −1 transforms coordinate vectors from the primed to unprimed basis.
Then A acts on that before we transform the resulting coordinates back to the primed basis.
Think P does the priming of the coordinates!
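Here is a Python/numpy sketch of (3.17) with arbitrarily chosen matrices: acting with A and then transforming the result to the primed basis gives the same coordinates as transforming first and then acting with A′ = P A P−1:

    import numpy as np

    A = np.array([[2.0, 1.0], [0.0, 3.0]])    # matrix of the map in the unprimed basis
    P = np.array([[1.0, 1.0], [1.0, -1.0]])   # takes unprimed coordinates to primed ones

    A_primed = P @ A @ np.linalg.inv(P)

    v = np.array([1.0, 2.0])                  # an unprimed coordinate vector
    print(np.allclose(A_primed @ (P @ v), P @ (A @ v)))   # True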
Exercises
1. We have just seen that coordinate vectors α = (α1, ..., αm)^T and α′ = (α′1, ..., α′m)^T with
respect to these two bases are related via α′ = Pα, or, in index form, α′j = Σ_k Pjk αk. Show
that the basis vectors transform in the opposite way: vj = Σ_k (P−1)jk v′k.
2. Consider the reflection matrix A = diag(1, −1). What is the corresponding matrix in a new
basis rotated counterclockwise by α: v′1 = (cos α, sin α)^T, v′2 = (−sin α, cos α)^T?
Week 4
Exercises
1. Show that the full space of solutions to Ax = b is the set {x0 + x1 | x0 ∈ ker A}, where x1 is
any vector for which Ax1 = b.
1. The solution space is simply ker A, i.e., everything that doesn’t map to Im A\{0}. The mapped
space is Im A = span(A1, ..., Am), the span of the columns of A. By the (original)
dimension theorem (3.12) we have
column rank + dim(space of solns) = m,    (4.4)
where the column rank is dim Im A and the space of solutions has dimension dim Ker A.
2. Consider instead the n rows of A. The vector equation Ax = 0 is equivalent to the n scalar
equations Ai · x = 0, where · is matrix multiplication as used in the definition (4.2) of W ⊥
above. That is, solutions live in the space “orthogonal” to that spanned by the rows. Let W
be the vector subspace of F m spanned by the row vectors A1 , ..., An . Its dimension is the
row rank of A. W ⊥ is then the space of solutions to Ai · v = 0 (i = 1, ..., n). From the
orthogonal complement dimension theorem (4.3),
row rank + dim(space of solns) = m,    (4.5)
where the row rank is dim W and the space of solutions is W⊥.
Comparing (4.4) and (4.5) shows that the row rank is equal to the (column) rank. [Lukas p.55]
then it’s clear that the (row) rank of A is 4. This procedure is called row reduction: the idea is
to reduce the matrix to so-called row echelon form in which the first nonzero element of row i
is rightwards of the first nonzero element of the preceding row i − 1.
To solve Ax = b we can premultiply both sides by a sequence of elementary matrices E1 , E2 , ...,
Ek , reducing it to
(Ek · · · E2 E1 A)x = (Ek · · · E2 E1 )b, (4.7)
in which the matrix (Ek · · · E2 E1 A) on the LHS is reduced to row echelon form. That is,
1. Use row reduction to bring A into echelon form; AL p.77
2. apply the same sequence of row operations to b;
3. solve for x by backsubstitution.
The advantage of this procedure over other methods (e.g., using the inverse) is that we can construct
ker A and deal with the case when the solution is not unique.
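Here is a sketch of this procedure in Python using sympy (the matrix and right-hand side are made-up examples; sympy’s rref() performs the row reduction for us):

    import sympy as sp

    A = sp.Matrix([[2, 1, -1],
                   [0, 3,  2],
                   [4, 2,  1]])
    b = sp.Matrix([1, 5, 2])

    aug = A.row_join(b)           # augmented matrix [A | b]: row operations act on A and b at once
    echelon, pivots = aug.rref()  # reduced row echelon form and the pivot columns
    print(echelon)

    x = A.solve(b)                # the back-substitution step
    print(x, A * x - b)           # the residual is the zero vector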
Exercises
1. Use row reduction operations to obtain the rank and kernel of the matrix
0  1  −1
2  3  −2    (4.10)
2  1   0
Another, less efficient, way of calculating A−1 is by using the Laplace expansion of the determinant,
equation (4.2.3) below.
4.2 Determinant
Suppose V1 , ..., Vk are vector spaces over a common field of scalars F. A map f : V1 × · · · × Vk → F
is multilinear, specifically k-linear, if it is linear in each argument separately: AL p.82
f (v1 , ..., αvi + α0 vi0 , ..., vk ) = αf (v1 , ..., vi , ..., vk ) + α0 f (v1 , ..., vi0 , ..., vk ). (4.14)
For the special case k = 2 the map is called bilinear. A multilinear map is alternating if it returns
zero whenever two of its arguments are equal:
f (v1 , ..., vi , ..., vi , ..., vk ) = 0. (4.15)
This means that swapping any pair of arguments to an alternating multilinear map flips the sign of
its output.
showing that the map is completely determined by the n^k possible results of applying it to the
basis vectors. Imposing the condition that δ be alternating means that that δ(ei1 , ..., ein ) vanishes
if two or more of the ik are equal. Therefore we need consider only those (i1 , ..., in ) that are
permutations P of the list (1, ..., n). The change of sign under pairwise exchanges implied by the
alternating condition means that
δ(eP (1) , ..., eP (n) ) = sgn(P )δ(e1 , ..., en ),
where sgn(P ) = ±1 is the sign of the permutation P (+1 for even, −1 for odd). Finally the
condition that det I = 1 sets δ(e1 , ..., en ) = 1, completely determining δ. The result is that AL p.84
det A = Σ_P sgn(P) A_{P(1),1} A_{P(2),2} · · · A_{P(n),n}
      = Σ_{i1=1}^{n} · · · Σ_{in=1}^{n} ε_{i1...in} A_{i1,1} · · · A_{in,n},    (4.16)
where ε_{i1...in} is the multidimensional generalisation of the Levi-Civita or alternating symbol (4.15).
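The Leibniz formula is easy to check numerically; here is a Python sketch (matrix chosen arbitrarily) comparing the sum over permutations with numpy’s determinant:

    import itertools
    import numpy as np

    def perm_sign(p):
        """Sign of a permutation (tuple of indices), from the number of inversions."""
        inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
        return -1 if inversions % 2 else 1

    def det_leibniz(A):
        n = A.shape[0]
        return sum(perm_sign(p) * np.prod([A[p[k], k] for k in range(n)])
                   for p in itertools.permutations(range(n)))

    A = np.array([[2.0, 1.0, 0.0], [1.0, 3.0, 4.0], [0.0, 5.0, 6.0]])
    print(det_leibniz(A), np.linalg.det(A))   # the two agree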
There are two important results that follow hot on the heels of the Leibniz expansion: AL p.86
det(AT ) = det A
– This implies that the determinant is also multilinear and alternating in the rows of A.
det(AB) = (det A)(det B)
– From this it follows that A is invertible iff det A ≠ 0. Moreover, det(A−1) = 1/ det A.
– The determinant of the matrix that represents a linear map is independent of basis used:
in another basis the matrix representing the map becomes A0 = P AP −1 (equ. 3.17) and
so det A0 = det A. As every linear map is represented by some matrix, this means
that we can sensibly extend our definition of determinant to include any linear map
f : V → V.
A−1 = (1/det A) adj A.
obtained by replacing column i in A with b. Note that this b = Σ_j xj Aj, where Aj denotes the jth column
of A. Using the multilinearity property of the determinant,
det B(i) = Σ_j xj det(A1, ..., A_{i−1}, Aj, A_{i+1}, ..., An) = xi det A,
the last line following because the determinants vanish unless j = i. This shows that we may solve
Ax = b in a cute but inefficient manner by calculating
xi = det(B(i))/ det A    (4.22)
for each i = 1, ..., n.
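Here is a short Python/numpy sketch of this “Cramer’s rule” recipe (matrix and right-hand side made up for the example), compared against a direct solve:

    import numpy as np

    A = np.array([[2.0, 1.0, -1.0], [0.0, 3.0, 2.0], [4.0, 2.0, 1.0]])
    b = np.array([1.0, 5.0, 2.0])

    detA = np.linalg.det(A)
    x = np.empty(3)
    for i in range(3):
        Bi = A.copy()
        Bi[:, i] = b                     # B^(i): column i replaced by b
        x[i] = np.linalg.det(Bi) / detA

    print(x, np.linalg.solve(A, b))      # the two agree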
4.3 Trace
The trace of an n × n matrix A is defined to be the sum of its diagonal elements: AL p.110
tr A ≡ Σ_{i=1}^{n} Aii.    (4.23)
Exercises
1. Show that
tr(AB) = tr(BA),
(4.24)
tr(ABC) = tr(CAB).
2. Let A and A0 be matrices that represent the same linear map in two different bases. Show that
tr A0 = tr A.
3. Show that the trace of any matrix that represents a rotation by an angle θ in 3d space is equal
to 1 + 2 cos θ.
Week 5
For Cn, the space of n-dimensional vectors with complex coefficients, the natural scalar product is
⟨a, b⟩ = a† · b = Σ_{i=1}^{n} a*i bi.    (5.8)
For the space L2(a, b) of square-integrable functions f : [a, b] → C the natural choice is
⟨f, g⟩ = ∫_a^b f*(x) g(x) dx.    (5.9)
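Here is a Python/numpy sketch of these two scalar products (vectors and functions chosen arbitrarily); note that np.vdot conjugates its first argument, matching the convention of (5.8), and the L2 product is approximated by a simple sum on a grid:

    import numpy as np

    a = np.array([1.0 + 1.0j, 2.0, -1.0j])
    b = np.array([0.5, 1.0j, 3.0])
    print(np.vdot(a, b), np.sum(np.conj(a) * b))   # <a, b> = sum_i a_i^* b_i

    x = np.linspace(0.0, 1.0, 2001)
    dx = x[1] - x[0]
    f = np.exp(2j * np.pi * x)
    g = np.exp(4j * np.pi * x)
    print(np.sum(np.conj(f) * g) * dx)             # <f, g> = int f^* g dx, approximately zero here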
Exercises
1. Consider a generalisation of (5.8) to
⟨a, b⟩ = a† M b = Σ_{i,j=1}^{n} a*i Mij bj,    (5.10)
where M is an n × n matrix. This gives us linearity in b (5.1) for free. What constraints do
the other two conditions, (5.2) and (5.3), impose on M ?
2. [For your second pass through the notes:] In the definition (5.5) of the angle θ between two
vectors how do we know that | cos θ| ≤ 1?
3. If the nonzero v1, ..., vn are pairwise orthogonal (that is, ⟨vi, vj⟩ = 0 for i ≠ j) show that they AL p.95
must be LI.
4. Show that if ⟨v, w⟩ = 0 for all v ∈ V then w = 0. AL p.99
5. Suppose that the maps f : V → V and g : V → V satisfy
⟨v, f(w)⟩ = ⟨v, g(w)⟩    (5.11)
for all v, w ∈ V. Show that f = g.
Identifying e1, e2, ... with the column vectors (1, 0, 0, ...)^T, (0, 1, 0, 0, ...)^T etc, the matrix M
that represents a linear map f : V → V has matrix elements
Mij = ⟨ei, f(ej)⟩.    (5.19)
Given a list v1 , ..., vm of vectors we can construct an orthonormal basis for their span using the
following, known as the Gram–Schmidt algorithm: AL p.97
1. Choose one of the vectors, v1, and let e′1 = v1. Our first basis vector is e1 = e′1/|e′1|.
2. Take the second vector v2 and let e′2 = v2 − ⟨e1, v2⟩e1: that is, we subtract off any component
that is parallel to e1. If the result is nonzero then our second normalized basis vector is
e2 = e′2/|e′2|.
3. ...
k. By the kth step we’ll have constructed e1, ..., ej for some j ≤ k. Subtract off any component
of vk that is parallel to any of these: e′k = vk − Σ_{i=1}^{j} ⟨ei, vk⟩ei. If e′k ≠ 0 then our next
basis vector is e_{j+1} = e′k/|e′k|.
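Here is a Python/numpy sketch of the algorithm (the input vectors are arbitrary; vectors already in the span are simply skipped, as in step k above):

    import numpy as np

    def gram_schmidt(vectors, tol=1e-12):
        basis = []
        for v in vectors:
            w = np.array(v, dtype=complex)
            for e in basis:
                w = w - np.vdot(e, w) * e      # subtract the component along e (note the conjugate)
            norm = np.linalg.norm(w)
            if norm > tol:                     # skip vectors already in the span of the basis
                basis.append(w / norm)
        return basis

    vs = [np.array([1.0, 0.0, 1.0]), np.array([1.0, 1.0, 0.0]), np.array([0.0, 2.0, 2.0])]
    for e in gram_schmidt(vs):
        print(np.round(e, 3))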
Exercises
1. Construct an orthonormal basis for the space spanned by the vectors v1 = (1, 1, 1)^T,
v2 = (2, 0, 0)^T, v3 = (1, −2, −2)^T.
5.2 Adjoint map
Let f : V → V be a linear map. Its adjoint f† : V → V is defined as the map that satisfies AL p.100
⟨f†(a), b⟩ = ⟨a, f(b)⟩ for all a, b ∈ V. It follows that
(f†)† = f,    (5.24)
(f + g)† = f† + g†,    (5.25)
(αf)† = α* f†,    (5.26)
(f ◦ g)† = g† ◦ f†,    (5.27)
(f−1)† = (f†)−1 (assuming f invertible).    (5.28)
Suppose that A is the matrix that represents f in some orthonormal basis e1, ..., en. Then by
(5.19), (5.20) and (5.2) the matrix A† that represents f† in the same basis has elements [A†]ij = A*ji:
that is, the matrix of the adjoint map is the Hermitian conjugate of A.
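As a numerical check (matrix and vectors made up for the example) that the adjoint is represented by the conjugate transpose, we can verify ⟨v, f(w)⟩ = ⟨f†(v), w⟩ directly:

    import numpy as np

    A = np.array([[1.0 + 2.0j, 3.0], [0.0, -1.0j]])
    A_dag = A.conj().T                       # conjugate transpose

    rng = np.random.default_rng(0)
    v = rng.normal(size=2) + 1j * rng.normal(size=2)
    w = rng.normal(size=2) + 1j * rng.normal(size=2)

    print(np.isclose(np.vdot(v, A @ w), np.vdot(A_dag @ v, w)))   # True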
5.3 Hermitian, unitary and normal maps
A Hermitian operator f : V → V is one that is self-adjoint: AL p.101
f † = f. (5.30)
A unitary map f : V → V is one for which f† ◦ f = f ◦ f† = id (5.33); taking the determinant of this condition gives
| det f |² = 1.    (5.34)
If f is an orthogonal map then this becomes det f = ±1. If det f = +1 then the map f is a pure
rotation. If det f = −1 then f is a rotation plus a reflection.
Let U be the matrix that represents a unitary map in some orthonormal basis. Then (5.33)
becomes
Σ_{j=1}^{n} Uij U*kj = Σ_{j=1}^{n} U*ji Ujk = δik.    (5.35)
The first sum shows that the rows of U are orthonormal, the second its columns.
Hermitian and unitary maps are examples of so-called normal maps, which are those for which AL p.112
[f, f † ] ≡ f ◦ f † − f † ◦ f = 0. (5.36)
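Here is a quick Python/numpy check (matrices chosen arbitrarily) that a Hermitian matrix and a unitary (here real orthogonal) matrix both commute with their Hermitian conjugates, i.e. are normal:

    import numpy as np

    H = np.array([[2.0, 1.0 - 1.0j], [1.0 + 1.0j, -1.0]])         # Hermitian: H = H^dagger
    theta = 0.4
    U = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])               # orthogonal, hence unitary

    for M in (H, U):
        commutator = M @ M.conj().T - M.conj().T @ M
        print(np.allclose(commutator, 0.0))                       # True for both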
Exercises
1. Let R be an orthogonal 3 × 3 matrix with det R = +1. How is tr R related to the rotation
angle?
2. Let U1 and U2 be unitary. Show that U2 ◦ U1 is also unitary. Does the commutator [U2 , U1 ] ≡
U2 ◦ U1 − U1 ◦ U2 vanish?
3. Suppose that H1 and H2 are Hermitian. Is H2 ◦ H1 also Hermitian? If not, are there any
conditions under which it becomes Hermitian?
5.4 Dual vector space
Let V be a vector space over the scalars F. The dual vector space V* is the set of all linear
maps ϕ : V → F: AL p.108
ϕ(α1 v1 + α2 v2 ) = α1 ϕ(v1 ) + α2 ϕ(v2 ). (5.37)
It is easy to confirm that the set of all such maps satisfies the vector space axioms (§3.1).
Some examples of dual vector spaces:
If V = R^n, the space of real, n-dimensional column vectors, then elements of V* are real,
n-dimensional row vectors.
Take any vector space V armed with a scalar product ⟨•, •⟩. Then for any v ∈ V the mapping
ϕv(•) = ⟨v, •⟩ is a linear map from vectors • ∈ V to scalars. The dual space V* consists of all
such maps (i.e., for all possible choices of v).
In these examples dim V* = dim V. This is true more generally: just as we used the magic of
coordinate maps to show that any n-dimensional vector space V over the scalars F is isomorphic1
to n-dimensional column vectors F^n, we can show that the dual space V* is isomorphic to the
space of n-dimensional row vectors. AL p.109
When playing with dual vectors it is conventional to write elements of V and V* like so,
v = Σ_{i=1}^{n} v^i e_i ∈ V,
ϕ = Σ_{i=1}^{n} ϕ_i e^{*i} ∈ V*,    (5.38)
with upstairs indices for the coordinates v^i of vectors in V, balanced by downstairs indices for the
corresponding basis vectors, and the other way round for V*.
1I don’t define what an isomorphism is in these notes, but do in the lectures. So look it up if you zoned out.
Week 6
Eigenthings
6.1 How to find eigenvalues and eigenvectors
AL p.114
Choose a basis for V and let A be the n × n matrix that represents f in this basis. Then the
characteristic equation (6.4) becomes
0 = det(A − λI) = Σ_Q sgn(Q) (A − λI)_{Q(1),1} · · · (A − λI)_{Q(n),n},    (6.6)
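Numerically, the eigenvalues are the roots of this characteristic polynomial; here is a Python/numpy sketch (matrix made up for the example) that also recovers an eigenvector as a null vector of A − λI:

    import numpy as np

    A = np.array([[2.0, 1.0], [1.0, 2.0]])

    coeffs = np.poly(A)                      # coefficients of the characteristic polynomial
    lam = np.roots(coeffs)                   # eigenvalues = roots of det(A - lambda I) = 0
    print(np.sort(lam), np.sort(np.linalg.eigvals(A)))   # same values

    for l in lam:
        v = np.linalg.svd(A - l * np.eye(2))[2][-1]      # a null vector of A - lambda I
        print(l, np.allclose(A @ v, l * v))              # v is an eigenvector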
Exercises
1. Find the eigenvalues and eigenvectors of the following matrices:
(1/√2) [1 1; 1 1],   [0 −i; i 0],   [cos θ −sin θ; sin θ cos θ],
[cos θ −sin θ 0; sin θ cos θ 0; 0 0 1],   [1 1; 0 1],    (6.7)
where semicolons separate matrix rows.
2. Show that det A = Π_{i=1}^{n} λi and tr A = Σ_{i=1}^{n} λi.
3. Our procedure for finding the eigenvalues of f : V → V relies on choosing a basis for V and
then representing f as a matrix A. Show that the eigenvalues this returns are independent of
basis used to construct the matrix.
6.2 Eigenproperties of Hermitian and unitary maps
The following statements are easy to prove. If f is a Hermitian map (f = f†) then its eigenvalues AL p.119
are real and eigenvectors belonging to distinct eigenvalues are orthogonal; if f is unitary then its
eigenvalues satisfy |λ| = 1 and, again, eigenvectors belonging to distinct eigenvalues are orthogonal.
So, if the eigenvalues λ1 , ..., λn of an n × n unitary or Hermitian matrix are distinct, then its n
(normalized) eigenvectors are automatically an orthonormal basis.
6.3 Diagonalisation
AL p.116
A map f : V → V is diagonalisable if there is a basis in which the matrix that represents f is
diagonal. An important lemma is that
f is diagonalisable ⇔ V has a basis of eigenvectors of f . (6.8)
Exercises
1. Now suppose that f is diagonalisable and let A be the n × n matrix that represents f in some
basis. According to the preceding lemma, this A has eigenvectors v1 , ..vn with Avi = λi vi that
form an eigenbasis for Cn . Let P = (v1 , ..., vn ) whose columns are these eigenvectors. Show
that
P −1 AP = diag(λ1 , ...λn ). (6.9)
Diagonal matrices have such appealing properties (e.g., they commute, involve only n numbers
instead of n²) that we’d like to understand under what conditions a matrix representing a map
can be diagonalised. The results of §6.2 imply that both Hermitian and unitary matrices are
diagonalisable provided the eigenvalues are distinct. But we can do better than this. Next we’ll
show that:
1. a map f is diagonalisable iff it is normal (5.36);
2. if we have two normal maps f1 and f2 then there is a basis in which the corresponding
matrices are simultaneously diagonal iff [f1 , f2 ] = 0.
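Here is a numerical illustration (matrix made up; this is not the proof asked for in the exercise above) of the diagonalisation of a Hermitian matrix using its orthonormal eigenbasis:

    import numpy as np

    A = np.array([[2.0, 1.0 - 1.0j], [1.0 + 1.0j, 3.0]])   # Hermitian
    lam, P = np.linalg.eigh(A)               # columns of P are orthonormal eigenvectors

    D = np.linalg.inv(P) @ A @ P
    print(np.allclose(D, np.diag(lam)))      # True: P^{-1} A P is diagonal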
6.4 Eigenproperties of normal maps
Let f : V → V be a normal map (f ◦ f † = f † ◦ f ). Then
f (v) = λv ⇒ f†(v) = λ* v;    AL p.122
Now suppose that A is the matrix that represents f in our “standard” e1 = (1, 0, 0, ...)^T, e2 =
(0, 1, 0, ...)^T etc basis. We can expand the eigenvectors vi = Σ_j Qji ej: that is, Q
is a matrix whose ith column is vi expressed in the standard e1, ..., en basis. Then
Q−1 AQ = diag(λ1 , ..., λn ). (6.12)
Exercises
1. Diagonalise the matrix
A = (1/2) [3 −1; −1 3].    (6.13)
2. Diagonalise AL p.121
(1/4) [2 3√2 3√2; 3√2 −1 3; 3√2 3 −1].    (6.14)
3. Show that if f has an orthonormal basis of eigenvectors then it is normal. [Hint: choose a basis
and show that (A† )ij = A?ji .]
6.5 Simultaneous diagonalisation
Let A and B be two diagonalisable matrices. Then there is a basis in which both A and B are
diagonal iff [A, B] = 0. AL p.125
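A small Python/numpy illustration (matrices constructed for the example, not taken from the notes): two matrices built with a shared eigenbasis commute, and the eigenvectors of one of them, its eigenvalues being distinct, also diagonalise the other:

    import numpy as np

    P = np.array([[1.0, 1.0], [1.0, -1.0]])            # shared eigenbasis (columns)
    A = P @ np.diag([2.0, 5.0]) @ np.linalg.inv(P)
    B = P @ np.diag([-1.0, 3.0]) @ np.linalg.inv(P)

    print(np.allclose(A @ B, B @ A))                    # [A, B] = 0

    lam, V = np.linalg.eig(A)                           # eigenvectors of A ...
    print(np.round(np.linalg.inv(V) @ B @ V, 10))       # ... also bring B to diagonal form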
6.6 Applications
Exponentiating a matrix: AL p.129
exp[αG] = lim_{m→∞} (I + αG/m)^m = I_n + αG + (1/2) α² G² + · · · = Σ_{j=0}^{∞} (1/j!) (αG)^j.    (6.15)
Diagonalising a quadratic form q(x) = x^T A x (A symmetric): in coordinates x′ aligned with the eigenvectors of A,
q(x) = q′(x′) = x′^T diag(λ1, ..., λn) x′ = λ1(x′1)² + λ2(x′2)² + · · · + λn(x′n)²,    (6.19)
so that the level surfaces of q(x) (i.e., the solutions to q(x) = c for some constant level c)
become easy to interpret geometrically.
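Here is a Python sketch (matrix made up for the example) comparing the power series for exp(A) with scipy’s expm and with exponentiation through the eigendecomposition:

    import math
    import numpy as np
    from scipy.linalg import expm

    A = np.array([[0.0, 2.0], [-2.0, 0.0]])

    # truncated power series sum_j A^j / j!
    series = sum(np.linalg.matrix_power(A, j) / math.factorial(j) for j in range(25))

    lam, Q = np.linalg.eig(A)                                # A = Q diag(lam) Q^{-1}
    via_eig = (Q @ np.diag(np.exp(lam)) @ np.linalg.inv(Q)).real

    print(np.allclose(series, expm(A)), np.allclose(via_eig, expm(A)))   # True True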
Exercises
1. Calculate exp(αG) for G = [0 −1; 1 0].
2. Show that
det exp A = exp(tr A). (6.20)