Bilinear Forms
This handout discusses some important constructions from linear algebra that we will use
throughout the rest of the course. Most of the material discussed here is described in more detail
in Introduction to Linear Algebra by Johnson, Riess and Arnold.
Let us begin by recalling a few basic ideas from linear algebra. Throughout this handout, let V
be an n-dimensional vector space over the real numbers. (In all of our applications, V will be a
linear subspace of some Euclidean space Rk , that is, a subset that is closed under vector addition
and scalar multiplication.)
A linear map from V to itself is called an endomorphism of V . Given an endomorphism
A : V → V and a basis {x1 , . . . , xn } for V , we can express the image under A of each basis vector
as a linear combination of basis vectors:
\[
Ax_j = \sum_{i=1}^{n} A_{ij}\, x_i.
\]
This determines an n × n matrix Ax = (Aij ), called the matrix of A with respect to the given
basis. (Thanks to Eli Fender for suggesting this notation.) By linearity, the action of A on any
other vector v = Σ_j v_j x_j is then determined by
\[
A\Bigl( \sum_{j=1}^{n} v_j x_j \Bigr) = \sum_{i,j=1}^{n} A_{ij} v_j\, x_i.
\]
If we associate with each vector v = Σ_j v_j x_j its n-tuple of coefficients (v1 , . . . , vn ) arranged as a
column matrix, then the n-tuple associated with w = Av is determined by matrix multiplication:
\[
\begin{pmatrix} w_1 \\ \vdots \\ w_n \end{pmatrix}
=
\begin{pmatrix} A_{11} & \cdots & A_{1n} \\ \vdots & \ddots & \vdots \\ A_{n1} & \cdots & A_{nn} \end{pmatrix}
\begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix}.
\]
Just as in the case of linear maps on Rn , the jth column of this matrix is the n-tuple associated
with the image of the jth basis vector xj .
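As a concrete illustration, here is a small Python sketch of how the matrix of an endomorphism acts on coefficient columns; the 2 × 2 matrix and the vector are made up for this example:

```python
# A sketch of the coefficient-column computation above: the j-th column of A
# holds the coefficients of the image of the j-th basis vector, and w = Av is
# computed by matrix multiplication on the coefficient column of v.

def mat_vec(A, v):
    """Multiply an n x n matrix (list of rows) by a coefficient column v."""
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

# Hypothetical matrix of an endomorphism A with respect to a basis {x1, x2}.
A = [[2.0, 1.0],
     [0.0, 3.0]]

# v = 1*x1 + 4*x2 has coefficient column (1, 4); w = Av has column A times it.
w = mat_vec(A, [1.0, 4.0])
print(w)  # [6.0, 12.0]
```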
If we change to a different basis, the matrix of A will change. To see how, suppose {x_1 , . . . , x_n }
and {x'_1 , . . . , x'_n } are bases for V , and let C = (C_{ij}) be the matrix of coefficients of x'_j expressed
with respect to the basis {x_1 , . . . , x_n }: thus for each j = 1, . . . , n,
\[
(3.1) \qquad x'_j = \sum_{i=1}^{n} C_{ij}\, x_i.
\]
The matrix C is called the transition matrix from {x_i } to {x'_j }. Its columns are the n-tuples
representing x'_1 , . . . , x'_n in terms of the basis {x_i }; because the x'_j are linearly independent, so
are these columns, and therefore C is invertible.
Proposition 3.1. Let A_x and A_{x'} be the matrices of A with respect to the bases {x_i } and
{x'_i }, and let C be the transition matrix from {x_i } to {x'_j }. Then
\[
(3.2) \qquad A_{x'} = C^{-1} A_x C.
\]
Proof. Let (A_{ij}) denote the matrix entries of A_x and (A'_{ij}) those of A_{x'}. We prove the proposition
by calculating the vector Ax'_j in two ways. First, we substitute (3.1) and then expand Ax_i in terms
of (A_{ij}):
\[
Ax'_j = \sum_{i=1}^{n} C_{ij}\, Ax_i = \sum_{i,k=1}^{n} C_{ij} A_{ki}\, x_k.
\]
Second, we expand Ax'_j in terms of (A'_{ij}) and then substitute (3.1):
\[
Ax'_j = \sum_{i=1}^{n} A'_{ij}\, x'_i = \sum_{i,k=1}^{n} A'_{ij} C_{ki}\, x_k.
\]
Because the vectors {x_1 , . . . , x_n } are independent, the fact that these two expressions are equal
implies that the respective coefficients of x_k are equal:
\[
\sum_{i=1}^{n} A_{ki} C_{ij} = \sum_{i=1}^{n} C_{ki} A'_{ij}.
\]
This is equivalent to the matrix equation A_x C = C A_{x'}, which in turn is equivalent to (3.2).
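The change-of-basis relation can be checked numerically. In the sketch below the matrix of A and the transition matrix are made up for illustration; `Ax` plays the role of A_x and `Axp` the role of A_{x'}:

```python
# Numerical sketch of Proposition 3.1 on a made-up 2x2 example: with
# transition matrix C, the two representations satisfy A_x C = C A_x',
# i.e. A_x' = C^{-1} A_x C.

def mat_mul(M, N):
    n = len(M)
    return [[sum(M[i][k] * N[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inv2(M):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

Ax = [[2.0, 1.0], [0.0, 3.0]]   # matrix of A in the basis {x_i}
C = [[1.0, 1.0], [0.0, 1.0]]    # transition matrix: column j gives x'_j in terms of {x_i}

Axp = mat_mul(inv2(C), mat_mul(Ax, C))   # matrix of A in the basis {x'_i}

# The defining relation A_x C = C A_x' holds up to rounding:
lhs = mat_mul(Ax, C)
rhs = mat_mul(C, Axp)
assert all(abs(lhs[i][j] - rhs[i][j]) < 1e-12 for i in range(2) for j in range(2))
```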
Here is the most important application of the change of basis formula. You have already seen
the determinant of an n × n matrix (see Handout 1). The trace of an n × n matrix M is the
number tr M = i Mii (the sum of the entries on the main diagonal). The next theorem describes
some of the most important properties of the determinant and trace functions.
Theorem 3.2. For any n × n matrices M and N ,
(3.3) det(M N ) = (det M )(det N ) = det(N M );
(3.4) tr(M N ) = tr(N M ).
Proof. For a proof of (3.3), see any good linear algebra book. For (3.4), we compute as follows:
\[
\operatorname{tr}(MN) = \sum_{i=1}^{n} \sum_{j=1}^{n} M_{ij} N_{ji} = \sum_{i,j=1}^{n} M_{ij} N_{ji};
\]
and tr(N M ) yields the same expression with the roles of i and j reversed.
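The identity (3.4) is easy to test numerically; M and N below are arbitrary made-up matrices:

```python
# Quick numerical check of tr(MN) = tr(NM) for a made-up pair of 2x2 matrices.

def mat_mul(M, N):
    n = len(M)
    return [[sum(M[i][k] * N[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(M):
    """Sum of the entries on the main diagonal."""
    return sum(M[i][i] for i in range(len(M)))

M = [[1.0, 2.0], [3.0, 4.0]]
N = [[0.0, 5.0], [6.0, 7.0]]
print(trace(mat_mul(M, N)), trace(mat_mul(N, M)))  # both are 55.0
```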
Corollary 3.3. Suppose V is a finite-dimensional vector space and A : V → V is an endomorphism.
If {x_i } and {x'_j } are any two bases for V and A_x and A_{x'} are the matrix representations of A with
respect to the two bases, then det A_x = det A_{x'} and tr A_x = tr A_{x'}.
Proof. Let C be the transition matrix from {x_i } to {x'_j }, so that A_x = C A_{x'} C^{-1} by
Proposition 3.1. Using Theorem 3.2, we compute
\[
\begin{aligned}
\det A_x &= \det\bigl( C (A_{x'} C^{-1}) \bigr) \\
&= \det\bigl( (A_{x'} C^{-1}) C \bigr) \\
&= \det\bigl( A_{x'} (C^{-1} C) \bigr) \\
&= \det A_{x'}.
\end{aligned}
\]
The computation for the trace is identical.
Because of this corollary, we can make the following definition: if A : V → V is any endomor-
phism, we define the determinant of A to be the determinant of any matrix representation of A,
and the trace of A to be the trace of any matrix representation. The corollary shows that these
numbers are well defined, independently of the choice of basis.
Bilinear Forms
Given a bilinear form B on V and a basis {x_1 , . . . , x_n }, the matrix of B with respect to this
basis is B_x = (B_{ij}), where B_{ij} = B(x_i , x_j ); bilinearity then gives
\[
(3.5) \qquad B(v, w) = \sum_{i,j=1}^{n} B_{ij}\, v_i w_j.
\]
This can be summarized as the value obtained by multiplying the matrix B_x on the right by w and
on the left by the transpose of v:
\[
(3.6) \qquad B(v, w) =
\begin{pmatrix} v_1 & \cdots & v_n \end{pmatrix}
\begin{pmatrix} B_{11} & \cdots & B_{1n} \\ \vdots & \ddots & \vdots \\ B_{n1} & \cdots & B_{nn} \end{pmatrix}
\begin{pmatrix} w_1 \\ \vdots \\ w_n \end{pmatrix}.
\]
In matrix notation, we can write this as B(v, w) = v_x^T B_x w_x , where v_x and w_x are the column
matrices representing v and w in this basis, and the superscript T designates the transpose of a
matrix: if M = (M_{ij}) is any k × l matrix, its transpose M^T is the l × k matrix whose (i, j)-entry
is (M^T)_{ij} = M_{ji}. In particular, (3.5) implies that if two bilinear forms agree on all pairs of vectors
in some basis, then they are identical.
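The formula v_x^T B_x w_x amounts to the double sum Σ_{i,j} v_i B_{ij} w_j ; here is a small Python sketch with a made-up matrix:

```python
# Evaluating a bilinear form from its matrix, as in (3.6): B(v, w) is the
# double sum over i, j of v_i * B_ij * w_j.

def bilinear(B, v, w):
    """Compute sum_{i,j} v_i B_ij w_j."""
    n = len(B)
    return sum(v[i] * B[i][j] * w[j] for i in range(n) for j in range(n))

Bx = [[1.0, 2.0],
      [2.0, 5.0]]   # a made-up symmetric example
v = [1.0, 0.0]
w = [3.0, 4.0]
print(bilinear(Bx, v, w))  # 1*1*3 + 1*2*4 = 11.0
```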
A matrix is said to be symmetric if it is equal to its transpose. Note that if a bilinear form B
is symmetric, then its matrix with respect to any basis is a symmetric matrix, because
Bij = B(xi , xj ) = B(xj , xi ) = Bji .
Conversely, if B is represented by a symmetric matrix with respect to some basis, then it follows
easily from (3.5) that it is a symmetric bilinear form.
The most important type of bilinear form is an inner product, which is a bilinear form that
is symmetric and positive definite. Given an inner product on V , we usually denote the value of
the inner product at a pair of vectors v and w by the notation ⟨v, w⟩. A (finite-dimensional)
inner product space is a finite-dimensional vector space endowed with a specific choice of inner
product. The most familiar and important example is Rn with its Euclidean dot product,
\[
\langle v, w \rangle = v \cdot w = v_1 w_1 + \cdots + v_n w_n.
\]
The following exercise shows a common way to construct other examples.
Henceforth, we assume that V is an n-dimensional inner product space, endowed with a specific
inner product ⟨· , ·⟩. (In our applications, V will be a tangent plane to a surface, and ⟨· , ·⟩ will
be the restriction of the Euclidean dot product.)
In an inner product space, we can define many geometric quantities analogous to ones that we
are familiar with in Rn. For example, the norm of a vector v ∈ V is the nonnegative real number
‖v‖ = ⟨v, v⟩^{1/2}, and the angle between two nonzero vectors v, w is θ = arccos(⟨v, w⟩/(‖v‖ ‖w‖)).
A unit vector is a vector v with ‖v‖ = 1, and two vectors v, w are orthogonal if ⟨v, w⟩ = 0. A set
of vectors {ε_1 , . . . , ε_k } is said to be orthonormal if each ε_i is a unit vector and distinct vectors
are orthogonal; or, more succinctly, if
\[
\langle \varepsilon_i, \varepsilon_j \rangle = \delta_{ij} =
\begin{cases} 1, & i = j, \\ 0, & i \neq j. \end{cases}
\]
(The symbol δ_{ij} is called the Kronecker delta.) An orthonormal basis is a basis consisting of
orthonormal vectors.
Lemma 3.5. If {ε1 , . . . , εk } is a set of orthonormal vectors, then it is a linearly independent set.
The next proposition shows that every inner product space admits many orthonormal bases.
Proposition 3.6 (Gram–Schmidt Algorithm). Let V be an inner product space and let
{x1 , . . . , xn } be any basis for V . Then there exists an orthonormal basis {ε1 , . . . , εn } such that
span(ε1 , . . . , εk ) = span(x1 , . . . , xk ) for each k = 1, . . . , n.
Proof. We will prove by induction on k that for each k = 1, . . . , n there exist orthonormal vectors
{ε1 , . . . , εk } whose span is the same as that of {x1 , . . . , xk }. When k = n, this proves the proposi-
tion, because orthonormal vectors are independent, and a linearly independent set of n vectors in
an n-dimensional vector space is automatically a basis.
Begin by setting ε_1 = x_1/‖x_1‖, which is a unit vector whose span is the same as that of x_1. Now
let k ≥ 1 and assume by induction that we have produced orthonormal vectors ε_1 , . . . , ε_k satisfying
the span condition. Define
\[
y_{k+1} = x_{k+1} - \sum_{i=1}^{k} \langle x_{k+1}, \varepsilon_i \rangle\, \varepsilon_i,
\qquad
\varepsilon_{k+1} = \frac{y_{k+1}}{\| y_{k+1} \|}.
\]
Because x_{k+1} ∉ span(x_1 , . . . , x_k ), it follows that y_{k+1} ≠ 0, and thus ε_{k+1} is well defined. Clearly
ε_{k+1} is a unit vector. A straightforward computation shows that y_{k+1} is orthogonal to each
of the vectors ε_1 , . . . , ε_k , and therefore so is ε_{k+1}. Since ε_{k+1} is a linear combination of the
vectors {ε_1 , . . . , ε_k , x_{k+1}}, it lies in their span, which by the induction hypothesis is equal to
span{x_1 , . . . , x_k , x_{k+1}}. Since the vectors {ε_1 , . . . , ε_{k+1}} are orthonormal, they are independent,
and thus their span is a (k + 1)-dimensional subspace contained in the span of {x_1 , . . . , x_{k+1}}.
These two subspaces have the same dimension, so they are equal.
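The construction in the proof can be sketched in Python. This version uses the Euclidean dot product on Rn as the inner product, and the input basis is made up:

```python
# A minimal sketch of the Gram-Schmidt algorithm of Proposition 3.6: subtract
# from each x_{k+1} its components along the vectors produced so far, then
# normalize. The inner product here is the Euclidean dot product.
import math

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def gram_schmidt(basis):
    """Return an orthonormal list with the same partial spans as `basis`."""
    ortho = []
    for x in basis:
        y = list(x)
        for e in ortho:
            c = dot(x, e)                      # <x_{k+1}, eps_i>
            y = [yi - c * ei for yi, ei in zip(y, e)]
        norm = math.sqrt(dot(y, y))            # ||y_{k+1}||, nonzero for a basis
        ortho.append([yi / norm for yi in y])
    return ortho

eps = gram_schmidt([[1.0, 1.0], [0.0, 2.0]])   # made-up basis of R^2
# The result is orthonormal: <eps_i, eps_j> = delta_ij up to rounding.
assert abs(dot(eps[0], eps[1])) < 1e-12
assert abs(dot(eps[0], eps[0]) - 1.0) < 1e-12
```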
Like any symmetric bilinear form, an inner product can be represented in terms of a basis by a
symmetric matrix. It is traditional to write the matrix of an inner product as g_x = (g_{ij}), where
g_{ij} = ⟨x_i , x_j ⟩. In an orthonormal basis, the inner product is represented by the identity matrix,
but in terms of a non-orthonormal basis it will be represented by a different matrix.
Given an endomorphism A : V → V , we can define a bilinear form B_A on V by
\[
(3.7) \qquad B_A(v, w) = \langle v, Aw \rangle.
\]
In fact, this example is not special, because, as the following theorem shows, every bilinear form
can be constructed in this way.
Theorem 3.8. Let V be a finite-dimensional inner product space, and let B be a bilinear form
on V . Then there exists a unique endomorphism A : V → V such that B = B_A. In terms of any
orthonormal basis for V , A and B are represented by the same matrix.
Proof. Let {ε1 , . . . , εn } be any orthonormal basis for V , and write Bij = B(εi , εj ). Let A : V → V
be the endomorphism determined by the same matrix with respect to this basis, so that
\[
A\varepsilon_j = \sum_{k=1}^{n} B_{kj}\, \varepsilon_k.
\]
For each i, j, we compute
\[
B_A(\varepsilon_i, \varepsilon_j) = \langle \varepsilon_i, A\varepsilon_j \rangle
= \sum_{k=1}^{n} B_{kj} \langle \varepsilon_i, \varepsilon_k \rangle = B_{ij},
\]
where the last equation follows because the only term in the sum for which ⟨ε_i , ε_k ⟩ ≠ 0 is the one
with k = i. Thus B_A and B give the same results when applied to pairs of basis vectors, so they
are equal.
To prove uniqueness, suppose A_1 and A_2 are endomorphisms such that ⟨v, A_1 w⟩ = B(v, w) =
⟨v, A_2 w⟩ for all v, w ∈ V . Define D : V → V by Dw = A_1 w − A_2 w. The hypothesis implies that
⟨v, Dw⟩ = 0 for all v, w ∈ V . In particular, taking v = Dw, this implies 0 = ⟨Dw, Dw⟩ = ‖Dw‖²
for all w ∈ V . Thus D is the zero endomorphism, which implies A_1 = A_2.
Given a bilinear form B, the unique endomorphism A such that B = B_A is called the endo-
morphism associated with B. Similarly, given an endomorphism A, we say that the bilinear
form B_A defined by (3.7) is associated with A. Note that the endomorphism associated with a
bilinear form is canonically determined, independent of any choice of basis, even though we used a
basis to prove its existence and uniqueness.
It is important to be aware that a bilinear form and its associated endomorphism are represented
by the same matrix only when working with an orthonormal basis. The next proposition shows
how the matrices are related in an arbitrary basis.
Proposition 3.9. Suppose B is a bilinear form on V and A is its associated endomorphism. In
terms of any basis {x1 , . . . , xn }, the matrices of B and A are related by Bx = gx Ax , where gx is
the matrix representing the inner product.
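Proposition 3.9 also shows how to recover the associated endomorphism in a non-orthonormal basis, namely A_x = g_x^{-1} B_x. Here is a numerical sketch with made-up 2 × 2 matrices:

```python
# Numerical sketch of Proposition 3.9: if g_x is the matrix of the inner
# product and B_x that of a bilinear form, the associated endomorphism has
# matrix A_x = g_x^{-1} B_x, so that B_x = g_x A_x.

def mat_mul(M, N):
    n = len(M)
    return [[sum(M[i][k] * N[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inv2(M):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

gx = [[2.0, 1.0], [1.0, 2.0]]   # made-up Gram matrix g_ij = <x_i, x_j>
Bx = [[4.0, 1.0], [3.0, 2.0]]   # made-up matrix of a bilinear form

Ax = mat_mul(inv2(gx), Bx)      # matrix of the associated endomorphism
lhs = mat_mul(gx, Ax)
assert all(abs(lhs[i][j] - Bx[i][j]) < 1e-12 for i in range(2) for j in range(2))
```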
Theorem. If A : V → V is a symmetric endomorphism of a finite-dimensional inner product
space, meaning that ⟨Av, w⟩ = ⟨v, Aw⟩ for all v, w ∈ V , then V has an orthonormal basis consisting
of eigenvectors of A.

Proof. We will only need the theorem when V is 2-dimensional, so we give the proof for that case
only. Choose any orthonormal basis {x_1 , x_2 } for V . First we dispose of a simple special case: if
A is equal to a scalar multiple of the identity map, meaning that there is a real number λ such
that Ax = λx for all x ∈ V , then the chosen basis {x_1 , x_2 } is already an orthonormal basis of
eigenvectors and we are done. So assume henceforth that A is not a scalar multiple of the identity.
With respect to the chosen orthonormal basis, A is represented by a symmetric matrix:
\[
A_x = \begin{pmatrix} a & b \\ b & c \end{pmatrix}.
\]
A real number λ is an eigenvalue of A if and only if there is a nonzero vector v such that Av − λv = 0.
This is the case if and only if the matrix Ax − λI is singular (where I is the 2 × 2 identity matrix),
which is equivalent to det(Ax − λI) = 0. Thus the eigenvalues of A are the solutions (if any) to
the following quadratic equation:
(a − λ)(c − λ) − b2 = 0,
or equivalently
λ2 − (a + c)λ + (ac − b2 ) = 0.
This has solutions
\[
\lambda = \frac{(a + c) \pm \sqrt{(a + c)^2 - 4(ac - b^2)}}{2}
= \frac{(a + c) \pm \sqrt{(a - c)^2 + 4b^2}}{2}.
\]
Since we are assuming that A is not a multiple of the identity, it must be the case that either
a ≠ c or b ≠ 0; in either case the expression under the square root sign is strictly positive, so the
quadratic equation has two distinct real roots. Call them λ_1 and λ_2.
For each j = 1, 2, the fact that A − λj id is singular means it has nontrivial kernel, so there exist
nonzero vectors ε1 and ε2 such that
Aε1 − λ1 ε1 = 0,
Aε2 − λ2 ε2 = 0.
After dividing each vector εj by its norm (which does not affect the two equations above), we may
assume that ε1 and ε2 are unit vectors.
Finally, we will show that ε_1 and ε_2 are orthogonal. Using the fact that A is symmetric, we
compute
\[
\lambda_1 \langle \varepsilon_1, \varepsilon_2 \rangle
= \langle \lambda_1 \varepsilon_1, \varepsilon_2 \rangle
= \langle A\varepsilon_1, \varepsilon_2 \rangle
= \langle \varepsilon_1, A\varepsilon_2 \rangle
= \langle \varepsilon_1, \lambda_2 \varepsilon_2 \rangle
= \lambda_2 \langle \varepsilon_1, \varepsilon_2 \rangle.
\]
Thus (λ_1 − λ_2)⟨ε_1 , ε_2 ⟩ = 0, and since λ_1 ≠ λ_2, this implies ⟨ε_1 , ε_2 ⟩ = 0.
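The steps of the proof can be traced numerically for a made-up symmetric 2 × 2 matrix:

```python
# Following the 2-dimensional proof above: compute the two eigenvalues of a
# made-up symmetric matrix [[a, b], [b, c]] from the quadratic formula, then
# check that Av = lam*v and that the two eigenvectors are orthogonal.
import math

a, b, c = 2.0, 1.0, 5.0          # b != 0, so A is not a multiple of the identity
disc = math.sqrt((a - c) ** 2 + 4 * b ** 2)   # strictly positive here
lam1 = (a + c + disc) / 2
lam2 = (a + c - disc) / 2

# (A - lam*I) is singular; a kernel vector is (b, lam - a).
v1 = (b, lam1 - a)
v2 = (b, lam2 - a)

def apply(v):
    return (a * v[0] + b * v[1], b * v[0] + c * v[1])

assert all(abs(apply(v1)[i] - lam1 * v1[i]) < 1e-9 for i in range(2))
assert all(abs(apply(v2)[i] - lam2 * v2[i]) < 1e-9 for i in range(2))
assert abs(v1[0] * v2[0] + v1[1] * v2[1]) < 1e-9   # orthogonality
```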
Quadratic Forms
In many applications, the most important uses of bilinear forms involve their values when both
arguments are the same. For that reason, we make the following definition. If V is a finite-
dimensional vector space, a function Q : V → R is called a quadratic form on V if there is some
bilinear form B on V such that Q(v) = B(v, v) for all v ∈ V . Any such bilinear form is said to
be associated with Q. The next proof shows that exactly one symmetric bilinear form is
associated with each quadratic form.

Proof. Given a quadratic form Q on V , by definition there is some bilinear form B_0 such that
Q(v) = B_0(v, v) for all v. Define B : V × V → R by
\[
B(v, w) = \tfrac{1}{2}\bigl( B_0(v, w) + B_0(w, v) \bigr).
\]
Then an easy verification shows that B is bilinear, and it is obviously symmetric. For any v ∈ V ,
we have
\[
B(v, v) = \tfrac{1}{2}\bigl( B_0(v, v) + B_0(v, v) \bigr) = B_0(v, v) = Q(v),
\]
so this proves the existence of such a B.
To prove uniqueness, we will derive a formula (called a polarization identity) which shows
that B is completely determined by Q. If B is any symmetric bilinear form associated with Q, we
have
\[
(3.8) \qquad
\begin{aligned}
\tfrac{1}{4}\bigl( Q(v + w) - Q(v - w) \bigr)
&= \tfrac{1}{4}\bigl( B(v + w, v + w) - B(v - w, v - w) \bigr) \\
&= \tfrac{1}{4}\bigl( B(v, v) + B(w, v) + B(v, w) + B(w, w) \bigr) \\
&\quad - \tfrac{1}{4}\bigl( B(v, v) - B(w, v) - B(v, w) + B(w, w) \bigr) \\
&= B(v, w).
\end{aligned}
\]
Any other symmetric bilinear form associated with Q would have to satisfy an analogous equation,
and thus would be equal to B.
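The polarization identity is easy to verify numerically; the bilinear form below is a made-up symmetric example on R²:

```python
# Numerical sketch of the polarization identity (3.8): a symmetric bilinear
# form is recovered from its quadratic form by B(v, w) = (Q(v+w) - Q(v-w))/4.

def B(v, w):
    """A made-up symmetric bilinear form on R^2 (matrix [[2, 1], [1, 3]])."""
    return 2 * v[0] * w[0] + v[0] * w[1] + v[1] * w[0] + 3 * v[1] * w[1]

def Q(v):
    return B(v, v)

v, w = (1.0, 2.0), (3.0, -1.0)
vp = (v[0] + w[0], v[1] + w[1])   # v + w
vm = (v[0] - w[0], v[1] - w[1])   # v - w
assert abs((Q(vp) - Q(vm)) / 4 - B(v, w)) < 1e-12
```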