VECTORS PART I
DR B MENGE
1. Vectors
We might have first learned vectors as arrays of numbers, with addition and scalar multiplication defined in terms of the individual entries. This, however, is not what we are going to do here: the array of numbers is just a representation of the vector, not the vector itself.
Here, we will define vectors in terms of what they are, and then the various operations are
defined axiomatically according to their properties.
Often, vectors have a length and a direction. The length is denoted by |v|. In this case, we can think of a vector as an “arrow” in space. Note that λa is either parallel (λ ≥ 0) or anti-parallel (λ ≤ 0) to a.
Definition 1.2 (Unit vector). A unit vector is a vector with length 1. We write a unit vector
as v̂.
Example 1.3. Rn is a vector space with component-wise addition and scalar multiplication.
Note that the vector space R is a line, but not all lines are vector spaces. For example, x + y = 1
is not a vector space since it does not contain 0.
1.2. Scalar product. In a vector space, we can define the scalar product of two vectors, which
returns a scalar (i.e. a real or complex number). We will first look at the usual scalar product
defined for Rn , and then define the scalar product axiomatically.
[Figure: projection of one vector onto another; the projected length is the vector’s length times cos θ.]
Using the dot product, we can write the projection of b onto a as (|b| cos θ)â = (â · b)â.
The cosine rule can be derived as follows. Write AB for the vector from A to B, and let θ be the angle at A. Then

|BC|² = |AC − AB|²
= (AC − AB) · (AC − AB)
= |AB|² + |AC|² − 2 AB · AC
= |AB|² + |AC|² − 2|AB||AC| cos θ.
We will later come up with a convenient algebraic way to evaluate this scalar product.
Theorem 1.8 (Cauchy-Schwarz inequality). For x, y in a real vector space with a scalar product,
|x · y| ≤ |x||y|.
Proof. Consider the quadratic |x − λy|² = |x|² − 2λ(x · y) + λ²|y|² ≥ 0 for all λ ∈ R. Viewing this as a quadratic in λ, we see that the quadratic is non-negative and thus cannot have two distinct real roots. Thus the discriminant ∆ ≤ 0. So
4(x · y)² ≤ 4|x|²|y|²
(x · y)² ≤ |x|²|y|²
|x · y| ≤ |x||y|.
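As a quick numerical aside (not part of the original notes), here is a minimal Python/NumPy sketch that spot-checks the inequality on random vectors; the dimension, seed and trial count are arbitrary choices.

```python
import numpy as np

# Spot-check the Cauchy-Schwarz inequality |x . y| <= |x||y| on random
# real vectors; the small tolerance guards against floating-point error.
rng = np.random.default_rng(seed=1)
for _ in range(1000):
    x, y = rng.normal(size=(2, 3))
    assert abs(x @ y) <= np.linalg.norm(x) * np.linalg.norm(y) + 1e-12
```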
Note that we proved this using the axioms of the scalar product. So this result holds for all
possible scalar products on any (real) vector space.
Example 1.9. Let x = (α, β, γ) and y = (1, 1, 1). Then by the Cauchy-Schwarz inequality, we
have
|α + β + γ| ≤ √3 √(α² + β² + γ²),
and squaring both sides gives
α² + β² + γ² ≥ αβ + βγ + γα,
with equality iff α = β = γ.
Corollary 1.10 (Triangle inequality).
|x + y| ≤ |x| + |y|.
Proof.
|x + y|² = (x + y) · (x + y)
= |x|² + 2(x · y) + |y|²
≤ |x|² + 2|x||y| + |y|² (by Cauchy-Schwarz)
= (|x| + |y|)².
So
|x + y| ≤ |x| + |y|.
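Again as an aside, the corollary can be spot-checked numerically in the same way; this sketch assumes nothing beyond NumPy and an arbitrary seed.

```python
import numpy as np

# Spot-check the triangle inequality |x + y| <= |x| + |y|.
rng = np.random.default_rng(seed=2)
for _ in range(1000):
    x, y = rng.normal(size=(2, 3))
    assert np.linalg.norm(x + y) <= np.linalg.norm(x) + np.linalg.norm(y) + 1e-12
```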
1.4. Vector product. Apart from the scalar product, we can also define the vector product. However, this is defined only on R3, not on vector spaces in general.
Definition 1.11 (Vector/cross product). Consider a, b ∈ R3 . Define the vector product
a × b = |a||b| sin θn̂,
where n̂ is a unit vector perpendicular to both a and b. Since there are two (opposite) unit
vectors that are perpendicular to both of them, we pick n̂ to be the one that is perpendicular
to a, b in a right-handed sense.
[Figure: a × b is perpendicular to both a and b, in the right-handed sense.]

Definition (Scalar triple product). The scalar triple product of a, b, c ∈ R3 is [a, b, c] = a · (b × c).

Proposition. If a, b, c form a right-handed set, then [a, b, c] is the volume of the parallelepiped with sides a, b and c.
Proof. The area of the base of the parallelepiped is given by |b||c| sin θ = |b × c|. Thus the volume = |b × c||a| cos φ = |a · (b × c)|, where φ is the angle between a and the normal to the plane containing b and c. However, since a, b, c form a right-handed system, we have a · (b × c) ≥ 0. Therefore the volume is a · (b × c).
Since the order of a, b, c doesn’t affect the volume, we know that
[a, b, c] = [b, c, a] = [c, a, b] = −[b, a, c] = −[a, c, b] = −[c, b, a].
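As a numerical illustration (not in the notes), the following sketch checks these cyclic and antisymmetry identities on one random triple of vectors.

```python
import numpy as np

# Check [a,b,c] = [b,c,a] = [c,a,b] = -[b,a,c] = -[a,c,b] = -[c,b,a],
# where [u,v,w] = u . (v x w).
rng = np.random.default_rng(seed=3)
a, b, c = rng.normal(size=(3, 3))

def triple(u, v, w):
    return u @ np.cross(v, w)

t = triple(a, b, c)
assert np.allclose([triple(b, c, a), triple(c, a, b)], t)
assert np.allclose([triple(b, a, c), triple(a, c, b), triple(c, b, a)], -t)
```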
Theorem 1.15. a × (b + c) = a × b + a × c.
Proof. Let d = a × (b + c) − a × b − a × c. We have
d · d = d · [a × (b + c)] − d · (a × b) − d · (a × c)
= (b + c) · (d × a) − b · (d × a) − c · (d × a)
=0
Thus d = 0.
1.6. Spanning sets and bases.
1.6.1. 2D space.
Definition 1.16 (Spanning set). A set of vectors {a, b} spans R2 if for all vectors r ∈ R2 ,
there exist some λ, µ ∈ R such that r = λa + µb.
In R2, two vectors span the space if a × b ≠ 0.
Theorem 1.17. The coefficients λ, µ are unique.
Proof. Suppose that r = λa + µb = λ′a + µ′b. Take the vector product with a on both sides to get (µ − µ′)(a × b) = 0. Since a × b ≠ 0, we have µ = µ′. Similarly, λ = λ′.
Definition 1.18 (Linearly independent vectors in R2). Two vectors a and b are linearly independent if for α, β ∈ R, αa + βb = 0 iff α = β = 0. In R2, a and b are linearly independent if a × b ≠ 0.
Definition 1.19 (Basis of R2). A set of vectors is a basis of R2 if it spans R2 and is linearly independent.
Example 1.20. {î, ĵ} = {(1, 0), (0, 1)} is a basis of R2; it is called the standard basis of R2.
1.6.2. 3D space. We can extend the above definitions of spanning sets and linearly independent sets to R3. Here we have
Theorem 1.21. If a, b, c ∈ R3 are non-coplanar, i.e. a · (b × c) ≠ 0, then they form a basis of R3.
Proof. For any r, write r = λa + µb + νc. Performing the scalar product with b × c on both
sides, one obtains r · (b × c) = λa · (b × c) + µb · (b × c) + νc · (b × c) = λ[a, b, c]. Thus
λ = [r, b, c]/[a, b, c]. The values of µ and ν can be found similarly. Thus each r can be written
as a linear combination of a, b and c.
Applying these formulas to r = 0, it follows that if αa + βb + γc = 0, then α = β = γ = 0. Thus they are linearly independent.
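As an illustrative sketch (my own, under the assumption that random Gaussian vectors are non-coplanar with probability 1), the coefficient formulas can be checked numerically:

```python
import numpy as np

# Recover the coefficients of r = la*a + mu*b + nu*c from triple products:
# la = [r,b,c]/[a,b,c], mu = [r,c,a]/[a,b,c], nu = [r,a,b]/[a,b,c].
rng = np.random.default_rng(seed=4)
a, b, c, r = rng.normal(size=(4, 3))

def triple(u, v, w):
    return u @ np.cross(v, w)

vol = triple(a, b, c)                 # non-zero for a generic triple
la = triple(r, b, c) / vol
mu = triple(r, c, a) / vol            # note the cyclic order of the arguments
nu = triple(r, a, b) / vol
assert np.allclose(la * a + mu * b + nu * c, r)
```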
Note that while we came up with formulas for λ, µ and ν, we did not actually prove that these
coefficients indeed work. This is rather unsatisfactory. We could, of course, expand everything
out and show that this indeed works, but in IB Linear Algebra, we will prove a much more
general result, saying that if we have an n-dimensional space and a set of n linearly independent
vectors, then they form a basis.
In R3 , the standard basis is î, ĵ, k̂, or (1, 0, 0), (0, 1, 0) and (0, 0, 1).
The reader should check that this definition coincides with the |x||y| cos θ definition in the
case of R2 and R3 .
1.6.4. Cn space. Cn is very similar to Rn, except that we have complex numbers. As a result, we need a different definition of the scalar product. If we still defined u · v = Σ ui vi, then letting u = (0, i) would give u · u = −1 < 0. This would be bad if we want to use the scalar product to define a norm.
Definition 1.28 (Cn). Cn = {(z1, z2, · · · , zn) : zi ∈ C}. It has the same standard basis as Rn, but the scalar product is defined differently: for u, v ∈ Cn, u · v = Σ ui∗ vi. The scalar product has the following properties:
(1) u · v = (v · u)∗
(2) u · (λv + µw) = λ(u · v) + µ(u · w)
(3) u · u ≥ 0 and u · u = 0 iff u = 0
Instead of linearity in the first argument, here we have (λu + µv) · w = λ∗ u · w + µ∗ v · w.
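As an aside, NumPy’s np.vdot conjugates its first argument, so it matches this definition; the vectors below are arbitrary examples.

```python
import numpy as np

# np.vdot(u, v) = sum(conj(u_i) * v_i), the scalar product on C^n.
u = np.array([0, 1j])
v = np.array([1 + 2j, 3j])
assert np.isclose(np.vdot(u, v), np.conj(np.vdot(v, u)))   # u.v = (v.u)*
assert np.vdot(u, u).real >= 0                             # u.u >= 0
# The naive unconjugated sum fails to be non-negative: for u = (0, i),
assert np.isclose(u @ u, -1)
```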
Example 1.29. The scalar product can be recovered from the norm alone. Writing ⟨x | y⟩ for x · y, and with all sums running over k = 1, . . . , 4,

Σ (−i)^k |x + i^k y|²
= Σ (−i)^k ⟨x + i^k y | x + i^k y⟩
= Σ (−i)^k (⟨x + i^k y | x⟩ + i^k ⟨x + i^k y | y⟩)
= Σ (−i)^k (⟨x | x⟩ + (−i)^k ⟨y | x⟩ + i^k ⟨x | y⟩ + i^k (−i)^k ⟨y | y⟩)
= Σ [(−i)^k (|x|² + |y|²) + (−1)^k ⟨y | x⟩ + ⟨x | y⟩]
= (|x|² + |y|²) Σ (−i)^k + ⟨y | x⟩ Σ (−1)^k + ⟨x | y⟩ Σ 1
= 4⟨x | y⟩,

since Σ (−i)^k = 0 and Σ (−1)^k = 0.
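A quick numerical check of this identity (an aside, with arbitrary random complex vectors):

```python
import numpy as np

# Check sum_{k=1..4} (-i)^k |x + i^k y|^2 = 4 <x|y>, where <x|y> is
# conjugate-linear in the first slot (np.vdot).
rng = np.random.default_rng(seed=6)
x = rng.normal(size=3) + 1j * rng.normal(size=3)
y = rng.normal(size=3) + 1j * rng.normal(size=3)
total = sum((-1j) ** k * np.vdot(x + 1j ** k * y, x + 1j ** k * y)
            for k in range(1, 5))
assert np.allclose(total, 4 * np.vdot(x, y))
```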
We can prove the Cauchy-Schwarz inequality for complex vector spaces using the same proof as in the real case, except that we first multiply y by some e^{iθ} so that x · (e^{iθ}y) is a real number. The factor of e^{iθ} drops out at the end when we take the modulus signs.
1.7. Vector subspaces.
Definition 1.30 (Vector subspace). A vector subspace of a vector space V is a subset of V that
is also a vector space under the same operations. Both V and {0} are subspaces of V . All others
are proper subspaces.
A useful criterion is that a subset U ⊆ V is a subspace iff
(1) x, y ∈ U ⇒ (x + y) ∈ U ,
(2) x ∈ U ⇒ λx ∈ U for all scalars λ,
(3) 0 ∈ U .
This can be more concisely written as “U is non-empty and for all x, y ∈ U , (λx + µy) ∈ U ”.
Example 1.31.
(1) If {a, b, c} is a basis of R3, then {a + c, b + c} is a basis of a 2D subspace, namely span{a + c, b + c}.
Suppose x, y ∈ span{a + c, b + c}. Let
x = α1(a + c) + β1(b + c);
y = α2(a + c) + β2(b + c).
Then
λx + µy = (λα1 + µα2)(a + c) + (λβ1 + µβ2)(b + c) ∈ span{a + c, b + c}.
Thus this is a subspace of R3.
Now check that {a + c, b + c} is a basis. We only need to check linear independence: if α(a + c) + β(b + c) = 0, then αa + βb + (α + β)c = 0. Since {a, b, c} is a basis of R3, the vectors a, b, c are linearly independent, so α = β = 0. Therefore {a + c, b + c} is a basis and the subspace has dimension 2.
(2) Given a set of numbers αi, not all zero, let U = {x ∈ Rn : Σ αi xi = 0}. We show that this is a vector subspace of Rn: take x, y ∈ U and consider λx + µy. We have Σ αi(λxi + µyi) = λ Σ αi xi + µ Σ αi yi = 0. Thus λx + µy ∈ U.
The dimension of the subspace is n − 1: assuming αn ≠ 0, we can freely choose xi for i = 1, · · · , n − 1, and then xn is uniquely determined by the previous xi’s.
(3) Let W = {x ∈ Rn : Σ αi xi = 1}. Then for x, y ∈ W, Σ αi(λxi + µyi) = λ + µ, which in general is not 1. Therefore W is not a vector subspace.
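The closure computation in (2) and the failure in (3) can be illustrated numerically; the choice of α below is arbitrary, with αn ≠ 0 as assumed above.

```python
import numpy as np

# U = {x : alpha . x = 0} is closed under linear combinations; W (with
# right-hand side 1) is not: alpha . (2u + 3v) = 2 + 3 = 5 != 1.
alpha = np.array([1.0, 2.0, -1.0, 3.0])
rng = np.random.default_rng(seed=7)

def point_with(rhs):
    x = rng.normal(size=4)
    x[-1] = (rhs - alpha[:-1] @ x[:-1]) / alpha[-1]  # x_n is determined
    return x

x, y = point_with(0.0), point_with(0.0)
assert np.isclose(alpha @ (2 * x + 3 * y), 0)
u, v = point_with(1.0), point_with(1.0)
assert np.isclose(alpha @ (2 * u + 3 * v), 5)
```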
1.8. Suffix notation. Here we are going to introduce a powerful notation that can help us
simplify a lot of things.
First of all, let v ∈ R3 . We can write v = v1 e1 + v2 e2 + v3 e3 = (v1 , v2 , v3 ). So in general,
the ith component of v is written as vi . We can thus write vector equations in component form.
For example, a = b → ai = bi or c = αa + βb → ci = αai + βbi . A vector has one free suffix, i,
while a scalar has none.
Einstein’s summation convention. Consider a sum x · y = Σ xi yi. The summation convention says that we can drop the Σ symbol and simply write x · y = xi yi. Whenever a suffix is repeated exactly twice in a term, summation over it is understood.
Note that i is a dummy suffix: it doesn’t matter what it is called, i.e. xi yi = xj yj = xk yk etc.
The rules of this convention are:
(1) Suffix appears once in a term: free suffix
(2) Suffix appears twice in a term: dummy suffix and is summed over
(3) Suffix appears three times or more: WRONG!
Example 1.32. [(a · b)c − (a · c)b]i = aj bj ci − aj cj bi , where summation over j is understood.
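As an aside, suffix-notation expressions translate directly into NumPy’s einsum, whose subscript strings follow exactly the summation convention; here is a sketch checking Example 1.32 on arbitrary random vectors.

```python
import numpy as np

# einsum sums over repeated suffixes, just like the summation convention.
rng = np.random.default_rng(seed=5)
a, b, c = rng.normal(size=(3, 3))
lhs = (a @ b) * c - (a @ c) * b                     # (a.b)c - (a.c)b
rhs = np.einsum('j,j,i->i', a, b, c) - np.einsum('j,j,i->i', a, c, b)
assert np.allclose(lhs, rhs)
x, y = rng.normal(size=(2, 3))
assert np.isclose(np.einsum('i,i->', x, y), x @ y)  # x.y = x_i y_i
```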
It is possible for an item to have more than one index. These objects are known as tensors,
which will be studied in depth in the IA Vector Calculus course.
Here we will define two important tensors:
Definition 1.33 (Kronecker delta).
δij = 1 if i = j, and δij = 0 if i ≠ j.
We have

δ11 δ12 δ13     1 0 0
δ21 δ22 δ23  =  0 1 0  = I.
δ31 δ32 δ33     0 0 1

So the Kronecker delta represents the identity matrix.
Example 1.34.
(1) ai δi1 = a1 . In general, ai δij = aj (i is dummy, j is free).
(2) δij δjk = δik
(3) δii = n if we are in Rn .
(4) ap δpq bq = ap bp , with p, q both dummy suffixes and summed over.
Definition 1.35 (Alternating symbol εijk ). Consider rearrangements of 1, 2, 3. We can divide
them into even and odd permutations. Even permutations include (1, 2, 3), (2, 3, 1) and (3, 1, 2).
These are permutations obtained by performing two (or no) swaps of the elements of (1, 2, 3).
(Alternatively, it is any “rotation” of (1, 2, 3))
The odd permutations are (2, 1, 3), (1, 3, 2) and (3, 2, 1). They are the permutations obtained
by one swap only.
Define
εijk = +1 if (i, j, k) is an even permutation of (1, 2, 3),
εijk = −1 if (i, j, k) is an odd permutation,
εijk = 0 otherwise (i.e. some suffix is repeated).
εijk has 3 free suffixes.
We have ε123 = ε231 = ε312 = +1 and ε213 = ε132 = ε321 = −1. ε112 = ε111 = · · · = 0.
We have
(1) εijk δjk = εijj = 0
(2) If ajk = akj (i.e. aij is symmetric), then εijk ajk = εijk akj = −εikj akj . Since εijk ajk = εikj akj (we simply renamed the dummy suffixes), we have εijk ajk = 0.
Proposition 1.36. (a × b)i = εijk aj bk
Proof. By expansion of the formula for the cross product in components.
Theorem 1.37. εijk εipq = δjp δkq − δjq δkp
Proof. Proof by exhaustion:
RHS = +1 if j = p and k = q; −1 if j = q and k = p; 0 otherwise.
LHS: summing over i, the only non-zero terms are those with j, k ≠ i and p, q ≠ i. If j = p and k = q, the LHS is (−1)² or (+1)² = 1. If j = q and k = p, the LHS is (+1)(−1) or (−1)(+1) = −1. All other possibilities result in 0.
Equally, we have εijk εpqk = δip δjq − δjp δiq and εijk εpjq = δip δkq − δiq δkp .
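These identities can also be verified exhaustively by machine; the sketch below builds εijk as a 3×3×3 array (0-indexed) and checks Theorem 1.37, and in passing Proposition 1.36, with einsum.

```python
import numpy as np

# Build the Levi-Civita symbol: +1 on even permutations of (0,1,2),
# -1 on odd permutations, 0 elsewhere.
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k], eps[i, k, j] = 1, -1
delta = np.eye(3)

# eps_ijk eps_ipq = delta_jp delta_kq - delta_jq delta_kp
lhs = np.einsum('ijk,ipq->jkpq', eps, eps)
rhs = np.einsum('jp,kq->jkpq', delta, delta) - np.einsum('jq,kp->jkpq', delta, delta)
assert np.array_equal(lhs, rhs)

# (a x b)_i = eps_ijk a_j b_k
a, b = np.array([1.0, 2.0, 3.0]), np.array([4.0, 5.0, 6.0])
assert np.allclose(np.einsum('ijk,j,k->i', eps, a, b), np.cross(a, b))
```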
Proposition 1.38.
a · (b × c) = b · (c × a)
Proof. In suffix notation, we have
a · (b × c) = ai (b × c)i = εijk bj ck ai = εjki bj ck ai = b · (c × a).
Theorem 1.39 (Vector triple product).
a × (b × c) = (a · c)b − (a · b)c.
Proof.
[a × (b × c)]i = εijk aj (b × c)k
= εijk εkpq aj bp cq
= εijk εpqk aj bp cq
= (δip δjq − δiq δjp )aj bp cq
= aj bi cj − aj ci bj
= (a · c)bi − (a · b)ci
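A one-off numerical check of the theorem (an aside, on arbitrary random vectors):

```python
import numpy as np

# Check a x (b x c) = (a.c) b - (a.b) c.
rng = np.random.default_rng(seed=9)
a, b, c = rng.normal(size=(3, 3))
assert np.allclose(np.cross(a, np.cross(b, c)), (a @ c) * b - (a @ b) * c)
```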
Spherical trigonometry.
Proposition 1.40. (a × b) · (a × c) = (a · a)(b · c) − (a · b)(a · c).
Proof.
LHS = (a × b)i (a × c)i
= εijk aj bk εipq ap cq
= (δjp δkq − δjq δkp )aj bk ap cq
= aj bk aj ck − aj bk ak cj
= (a · a)(b · c) − (a · b)(a · c)
Consider the unit sphere, center O, with a, b, c on the surface.
[Figure: spherical triangle with vertices A, B, C on the unit sphere; δ(A, B) denotes the arc from A to B, and α the angle of the triangle at A.]
Suppose we are living on the surface of the sphere. So the distance from A to B is the arc
length on the sphere. We can imagine this to be along the circumference of the circle through
A and B with center O. So the distance is ∠AOB, which we shall denote by δ(A, B). So
a·b = cos ∠AOB = cos δ(A, B). We obtain similar expressions for other dot products. Similarly,
we get |a × b| = sin δ(A, B).
Using Proposition 1.40 with |a| = 1, the angle α at A satisfies

cos α = [(a × b) · (a × c)] / (|a × b||a × c|) = [b · c − (a · b)(a · c)] / (|a × b||a × c|).
Putting in our expressions for the dot and cross products, we obtain
cos α sin δ(A, B) sin δ(A, C) = cos δ(B, C) − cos δ(A, B) cos δ(A, C).
This is the spherical cosine rule that applies when we live on the surface of a sphere. What does
this spherical geometry look like?
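Before moving on, here is a sketch (not from the notes) that checks the spherical cosine rule on random unit vectors; the arccos inputs are clipped purely to guard against round-off.

```python
import numpy as np

# Check cos(alpha) sin d(A,B) sin d(A,C) = cos d(B,C) - cos d(A,B) cos d(A,C)
# for random points a, b, c on the unit sphere.
rng = np.random.default_rng(seed=10)
a, b, c = (v / np.linalg.norm(v) for v in rng.normal(size=(3, 3)))

def d(u, v):                                   # arc length = angle at centre
    return np.arccos(np.clip(u @ v, -1.0, 1.0))

cos_alpha = (np.cross(a, b) @ np.cross(a, c)) / (
    np.linalg.norm(np.cross(a, b)) * np.linalg.norm(np.cross(a, c)))
assert np.isclose(cos_alpha * np.sin(d(a, b)) * np.sin(d(a, c)),
                  np.cos(d(b, c)) - np.cos(d(a, b)) * np.cos(d(a, c)))
```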
1.9. Geometry.
2. Linear maps
A linear map is a special type of function between vector spaces. In fact, most of the time,
these are the only functions we actually care about. They are maps that satisfy the property
f (λa + µb) = λf (a) + µf (b).
We will first look at two important examples of linear maps — rotations and reflections, and
then study their properties formally.
2.1. Examples.
2.1.1. Rotation in R3. In R3, first consider the simple case where we rotate about the z axis by θ. We call this rotation R and write x′ = R(x).
Suppose that initially, x = (x, y, z) = (r cos φ, r sin φ, z). Then after a rotation by θ, we get
x′ = (r cos(φ + θ), r sin(φ + θ), z)
= (r cos φ cos θ − r sin φ sin θ, r sin φ cos θ + r cos φ sin θ, z)
= (x cos θ − y sin θ, x sin θ + y cos θ, z).
We can represent this by a matrix R such that x′i = Rij xj . Using our formula above, we obtain

     cos θ  −sin θ  0
R =  sin θ   cos θ  0
      0       0     1
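As a sketch (with an arbitrary angle and test vector of my choosing), the matrix can be checked against the component formula and against orthogonality:

```python
import numpy as np

# Rotation about the z-axis by theta; x' = R x.
theta = 0.3
c, s = np.cos(theta), np.sin(theta)
R = np.array([[c, -s, 0.0],
              [s,  c, 0.0],
              [0.0, 0.0, 1.0]])
x = np.array([1.0, 2.0, 3.0])
xp = R @ x
assert np.allclose(xp, [x[0] * c - x[1] * s, x[0] * s + x[1] * c, x[2]])
assert np.allclose(R.T @ R, np.eye(3))   # rotations preserve lengths
```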
Now consider the general case where we rotate by θ about n̂.
[Figure: rotation of x by angle θ about the axis n̂.]
2.1.2. Reflection in R3. Suppose we want to reflect through a plane through O with normal n̂. First of all, the projection of x onto n̂ is given by (x · n̂)n̂. So we get x′ = x − 2(x · n̂)n̂. In suffix notation, we have x′i = xi − 2xj nj ni . So our reflection matrix is Rij = δij − 2ni nj .
[Figure: reflection of x in the plane through O with normal n̂, giving x′.]
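Here is a small sketch (with an arbitrary normal of my choosing) of the reflection matrix Rij = δij − 2ni nj and its expected behaviour:

```python
import numpy as np

# Reflection in the plane through O with unit normal n: R = I - 2 n n^T.
n = np.array([1.0, 2.0, 2.0])
n /= np.linalg.norm(n)
R = np.eye(3) - 2 * np.outer(n, n)
assert np.allclose(R @ n, -n)          # the normal flips sign
t = np.cross(n, [1.0, 0.0, 0.0])       # some vector lying in the plane
assert np.allclose(R @ t, t)           # in-plane vectors are unchanged
assert np.allclose(R @ R, np.eye(3))   # reflecting twice gives the identity
```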
For a linear map f : U → V , the image is im(f ) = {f (u) : u ∈ U } and the kernel is ker(f ) = {u ∈ U : f (u) = 0}.
(1) Consider S : R3 → R2 with S(x, y, z) = (x + y, 2x − z). Simple yet tedious algebra shows
that this is linear. Now consider the effect of S on the standard basis. S(1, 0, 0) = (1, 2),
S(0, 1, 0) = (1, 0) and S(0, 0, 1) = (0, −1). Clearly these are linearly dependent, but they
do span the whole of R2 . We can say S(R3 ) = R2 . So the image is R2 .
Now solve S(x, y, z) = 0. We need x + y = 0 and 2x − z = 0. Thus x = (x, −x, 2x),
i.e. it is parallel to (1, −1, 2). So the set {λ(1, −1, 2) : λ ∈ R} is the kernel of S.
(2) Consider a rotation in R3. The kernel is {0} and the image is R3.
(3) Consider the projection of x onto a plane with normal n̂. The image is the plane itself, and the kernel is the set of vectors parallel to n̂.
Theorem 2.6. Consider a linear map f : U → V , where U, V are vector spaces. Then im(f ) is a subspace of V , and ker(f ) is a subspace of U .
Proof. Both are non-empty since f (0) = 0.
If x, y ∈ im(f ), then ∃a, b ∈ U such that x = f (a), y = f (b). Then λx + µy = λf (a) + µf (b) = f (λa + µb). Now λa + µb ∈ U since U is a vector space, so there is an element in U that maps to λx + µy. So λx + µy ∈ im(f ) and im(f ) is a subspace of V .
Suppose x, y ∈ ker(f ), i.e. f (x) = f (y) = 0. Then f (λx+µy) = λf (x)+µf (y) = λ0+µ0 = 0.
Therefore λx + µy ∈ ker(f ), and ker(f ) is a subspace of U .
Thus αm+1 em+1 + · · · + αn en ∈ ker(f ). Since {e1 , · · · , em } span ker(f ), there exist some
α1 , α2 , · · · αm such that
αm+1 em+1 + · · · + αn en = α1 e1 + · · · + αm em .
But {e1, · · · , en} is a basis of U, so the ei are linearly independent, and hence αi = 0 for all i. Then the only solution to αm+1 f(em+1) + · · · + αn f(en) = 0 is αi = 0, so the f(ei) are linearly independent by definition.
Example 2.11. Calculate the kernel and image of f : R3 → R3 , defined by f (x, y, z) =
(x + y + z, 2x − y + 5z, x + 2z).
First find the kernel: we’ve got the system of equations:
x+y+z =0
2x − y + 5z = 0
x + 2z = 0
Note that the first and second equations add to give 3x + 6z = 0, which is equivalent to the third.
Then using the first and third equation, we have y = −x − z = z. So the kernel is any vector in
the form (−2z, z, z) and is the span of (−2, 1, 1).
To find the image, extend the basis of ker(f ) to a basis of the whole of R3 : {(−2, 1, 1), (0, 1, 0), (0, 0, 1)}.
Apply f to this basis to obtain (0, 0, 0), (1, −1, 0) and (1, 5, 2). From the proof of the rank-nullity theorem, we know that f (0, 1, 0) and f (0, 0, 1) form a basis of the image.
To get the standard form of the image, we know that the normal to the plane is parallel to (1, −1, 0) × (1, 5, 2) ∥ (1, 1, −3). Since 0 ∈ im(f ), the equation of the plane is x + y − 3z = 0.
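Finally, a numerical cross-check of this example (an aside; the matrix A is just f written in the standard basis):

```python
import numpy as np

# f(x, y, z) = (x + y + z, 2x - y + 5z, x + 2z) as a matrix.
A = np.array([[1.0, 1.0, 1.0],
              [2.0, -1.0, 5.0],
              [1.0, 0.0, 2.0]])
k = np.array([-2.0, 1.0, 1.0])                 # claimed kernel direction
assert np.allclose(A @ k, 0)
normal = np.cross(A @ [0.0, 1.0, 0.0], A @ [0.0, 0.0, 1.0])
assert np.allclose(np.cross(normal, [1.0, 1.0, -3.0]), 0)  # normal || (1,1,-3)
for e in np.eye(3):                            # image lies in x + y - 3z = 0
    assert np.isclose((A @ e) @ np.array([1.0, 1.0, -3.0]), 0)
```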