GEOMETRIC
LINEAR ALGEBRA
Volume 1
I-Hsiung Lin
National Taiwan Normal University, China
World Scientific
NEW JERSEY · LONDON · SINGAPORE · BEIJING · SHANGHAI · HONG KONG · TAIPEI · CHENNAI
Published by
World Scientific Publishing Co. Pte. Ltd.
5 Toh Tuck Link, Singapore 596224
USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE
QA184.2.L49 2005
512'.5--dc22
2005041722
For photocopying of material in this volume, please pay a copying fee through the Copyright
Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to
photocopy is not required from the publisher.
Printed in Singapore.
To the memory of my
grandparents, parents
and to my family
PREFACE
Algebra has operational priority over geometry, while the latter provides intuitive geometric motivation for, or interpretations of, the results of the former. The two play the roles of head and tail of the same coin in many situations.
Linear algebra transforms the afore-mentioned geometric ideas into two algebraic operations so that solving linear equations can be handled linearly and systematically. Its implications are far-reaching, and its applications are wide open, touching almost every field in modern science. More precisely,

a directed line segment $\overrightarrow{AB}$ → a vector x;

the ratio of signed lengths of (directed) line segments along the same line

$$\frac{\overline{PQ}}{\overline{AB}} = \alpha \;\to\; y = \alpha x, \text{ the scalar multiplication of } x \text{ by } \alpha.$$

See Fig. P.1.
Fig. P.1
Hence, the whole line can be described algebraically as αx while α runs through the real numbers. Meanwhile, the parallelogram in Fig. P.2 indicates that the directed segments $\overrightarrow{OA}$ and $\overrightarrow{BC}$ represent the same vector x, and $\overrightarrow{OB}$ and $\overrightarrow{AC}$ represent the same vector y, so that

the diagonal $\overrightarrow{OC}$ → x + y, the addition of the vectors x and y.
Fig. P.2
Out of αx, α ∈ R (the real field), and x + y grow the concepts of

linear combination,
linear dependence, and
linear independence

of vectors; with these, plus deductive and inductive methods, one can develop and establish the whole theory of Linear Algebra, even formally and in a very abstract manner.
The main theme of the theory is about linear transformation which
can be characterized as the mapping that preserves the ratio of the signed
lengths of directed line segments along the same or parallel lines. Linear
transformations between finite-dimensional vector spaces can be expressed
as matrix equations xA = y , after choosing suitable coordinate systems as
bases.
The matrix equation xA = y has two main features. Its static structure, when y is considered as a constant vector b, results from solving algebraically the system xA = b of linear equations by the powerful and useful Gaussian elimination method. The rank of a matrix and its factorization as a product of simpler matrices are the most important results among all. Rank provides insights into the geometric character of subspaces, based on the concepts of linear combination, dependence and independence, while factorization makes the introduction of determinants easier and provides preparatory tools for understanding the other feature of matrices. The dynamic structure, when y is considered as a varying vector, results from treating A as a linear transformation defined by x → xA = y. The kernel (for homogeneous linear equations) and range (for non-homogeneous linear equations) of a linear transformation, the dimension theorem, invariant subspaces, diagonalizability, various decompositions of spaces or linear transformations and their canonical forms are the main topics among others.
When Euclidean concepts such as lengths and angles come into play, it is the inner product that combines both, and the Pythagorean theorem or orthogonality dominates everywhere. Therefore, linear operators y = xA become much more specialized and the results concerned more fruitful, providing wide and concrete applications in many fields.
Roughly speaking, using algebraic methods, linear algebra investigates whether and how systems of linear equations can be solved or, geometrically equivalently, studies the inner structures of spaces such as lines or planes and the possible interactions between them. Nowadays, linear algebra turns out to be an indispensable shortcut from the global view to the local view of objects or phenomena in our universe.
published in Chinese from 1982 to 1984, with the last two still unpublished until now). I try to write the book in the following manner:
4. Usually, each set of Exercises contains two parts: <A> and <B>. The former is designed to familiarize the readers with, or let them practice, the established results in that section, while the latter contains challenging problems whose solutions, in many cases, need some knowledge to be exposed formally in the sections that follow. In addition to these, some sets of Exercises also contain parts <C> and <D>. <C> asks the readers to model after the content and to extend the process and results to vector spaces over arbitrary fields. <D> presents problems connecting linear algebra with other subjects, such as real or complex calculus, differential equations and differential geometry. Let such connections and applications of linear algebra speak for how important and useful it is.

The readers are asked to do all problems in <A> and are encouraged to try part of <B>, while <C> and <D> are optional and are left to more mature and serious students.
No applications outside pure mathematics are touched on; readers who need them should consult books such as Gilbert Strang’s Linear Algebra and Its Applications.
Finally, three points that deviate from most existing conventional books on linear algebra should be noted. One is that the chapters are divided according to the affine, linear, and Euclidean structures of R1, R2 and R3, and not according to topics such as vector spaces, determinants, etc. The second is that few definitions are formal, and most of them are allowed to come to the surface in the middle of discussions, while the main results obtained after a discussion are summarized and numbered along with important formulas. The third is that a point x = (x1, x2) is also treated as a position vector from the origin 0 = (0, 0) to that point, when R2 is considered as a two-dimensional vector space, rather than using the more common notation

$$\binom{x_1}{x_2} \quad\text{or}\quad \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}.$$
As a consequence of this convention, when a given 2 × 2 matrix A is considered to represent a linear transformation on R2 acting on the vector x, we adopt xA, treating x as a 1 × 2 matrix, rather than $A\binom{x_1}{x_2}$, to denote the image of x.
1. How does one know beforehand that the sum on the left side is equal to $\frac{1}{6}n(n+1)(2n+1)$?
2. To pursue this answer, try trivial yet simpler cases when n = 1, 2, 3 and
even n = 4, and then try to find out possible common rules owned by
all of them.
3. Conjecture that the common rules found are still valid for general n.
4. Try to prove this conjecture formally by mathematical induction or some
other methods.
Now, for n = 1, take a “shadow” unit square and a “white” unit square
and put them side by side as Fig. P.3:
For n = 2, use the same process and see Fig. P.4; for n = 3, see Fig. P.5.
$$\frac{\text{area of shadow region}}{\text{area of the rectangle}} = \frac{1^2 + 2^2 + 3^2}{4 \cdot 9} = \frac{14}{36} = \frac{7}{18} = \frac{2 \cdot 3 + 1}{6 \cdot 3}. \quad (\text{Fig. P.5})$$

This suggests the conjecture

$$\frac{1^2 + 2^2 + 3^2 + \cdots + n^2}{(n+1)n^2} = \frac{2n+1}{6n}$$
$$\Rightarrow\; 1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{1}{6}n(n+1)(2n+1).$$
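Before the formal induction proof, the conjecture is easy to test numerically. The following small check (my own sketch in Python; not part of the original text) compares both sides of the identity for several values of n:

```python
# Verify 1^2 + 2^2 + ... + n^2 = n(n+1)(2n+1)/6 for small n,
# mirroring the "try n = 1, 2, 3 and even n = 4" step above.
for n in range(1, 11):
    lhs = sum(k * k for k in range(1, n + 1))
    rhs = n * (n + 1) * (2 * n + 1) // 6
    assert lhs == rhs, (n, lhs, rhs)
print("conjecture verified for n = 1, ..., 10")
```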
It is approximately in this manner that I wrote the contents of the book, in particular Chaps. 1, 2 and 4. Of course, this procedure is roundabout, overlaps badly in some cases, and may even make one feel impatient and sick. So I tried to summarize key points and main results in time. But I do strongly believe that it is a worthy way of educating beginners in a course on linear algebra.
Well, I am not able to realize physically the existence of four- or higher-dimensional spaces. Could you? How? It is the algebraic method that properly convinces us of the existence of higher-dimensional spaces. Let me end this puzzle with my own experience in the following story.

Some day in 1986, in a Taoist temple in eastern Taiwan, I had a face-to-face dialogue with a person epiphanized (namely, making its presence or power felt) by the God Nuo Zha (also esteemed as the Third Prince in Chinese communities):

I asked: Does God exist?
Nuo Zha answered: Gods do exist, and they live in spaces from dimension seven to dimension thirteen. You common human beings live in dimension three, while dimensions four, five and six are buffer zones between human beings and Gods. Also, there are “human beings” under the earth, which are two-dimensional.
I asked: Do UFOs (unidentified flying objects) really exist?
Nuo Zha answered: Yes. They steer improperly and fall into the three-dimensional space so that you human beings can see them physically.

Believe it or not!
the points x0 and x in the plane. This is Part 2, the Euclidean structures of R2 and R3, which contains Chaps. 4 and 5.

In our vivid physical world, it is difficult to realize that the parallel planes a1x1 + a2x2 + a3x3 = b (b ≠ 0) and a1x1 + a2x2 + a3x3 = 0 will intersect along a “line” within our sight. By central projection, it would be reasonable to imagine that they do intersect along an infinite or imaginary line l∞. Adjoining l∞ to the plane a1x1 + a2x2 + a3x3 = b constitutes a projective plane. This is briefly touched on in Exs. <B> of Sec. 2.6 and Sec. 3.6, and Ex. <B> of Sec. 2.8.5 and Sec. 3.8.4.
Changes of coordinates from x = (x1, x2) to y = (y1, y2) in R2:

$$y_1 = a_1 + a_{11}x_1 + a_{21}x_2,$$
$$y_2 = a_2 + a_{12}x_1 + a_{22}x_2, \quad\text{or}\quad y = x_0 + xA,$$

where $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$ with $a_{11}a_{22} - a_{12}a_{21} \ne 0$, is called an affine transformation and, in particular, an invertible linear transformation if x0 = (a1, a2) = 0. This can be characterized as a one-to-one mapping from R2 onto R2 which preserves ratios of line segments along parallel lines (Secs. 1.3, 2.7, 2.8 and 3.8). If it preserves distances between any two points, then it is called a rigid or Euclidean motion (Secs. 4.8 and 5.8). While y = σ(xA) for any scalar σ ≠ 0 maps lines onto lines on the projective plane and is called a projective transformation (Sec. 3.8.4). The invariants under the group (Sec. A.4) of the respective transformations constitute what F. Klein called affine, Euclidean and projective geometries (Secs. 2.8.4, 3.8.4, 4.9 and 5.9).
As important applications of exterior products (Sec. 5.1) in R3 , elliptic
geometry (Sec. 5.11) and hyperbolic geometry (Sec. 5.12) are introduced in
the same manner as above. These two are independent of the others in
the book.
Almost every text on linear algebra treats R1 as trivial and obvious. Indeed it is, and hence some pieces of implicit information about R1 are usually ignored. Chapter 1 indicates that scalar multiplication of a vector alone is enough to describe a straight line, and shows how the concept of linear dependence comes out of geometric intuition. Also, through the vectorization and coordinatization of a straight line, one can realize why the abstract set R1 can be considered as the standard representation of all straight lines. Changes of coordinates enable us to interpret the linear equation y = ax + b, a ≠ 0, geometrically as an affine transformation preserving ratios of segment lengths. Above all, this chapter lays the foundation of the inductive approach taken in the later chapters.
The ways of thinking and the methods adopted to realize them in Chap. 2 constitute a cornerstone for the development of the theory and a model to follow in Chap. 3 and even further on. The fact that a point outside a given line is needed to construct a plane is algebraically equivalent to saying that, in addition to scalar multiplication, the addition of vectors is needed in order, via the concept of linear independence and the method of linear combination, to go from a lower-dimensional space (like a straight line) to a higher one (like a plane). Sections 2.2 up to 2.4 are counterparts of Secs. 1.1 up to 1.3, and they set up the abstract set R2 as the standard two-dimensional real vector space and treat changes of coordinates in R2. The existence of straight lines (Sec. 2.5) in R2 implicitly suggests that it is possible to discuss vector and affine subspaces in it. Section 2.6 formalizes affine coordinates and barycentric coordinates.
Notations
Sections denoted by an asterisk (*) are optional and may be omitted.
[1] means the first book listed in the Reference, etc.
A.1 means the first section in Appendix A, etc.
Section 1.1 means the first section in Chap. 1. So Sec. 4.3 means the
third section in Chap. 4, while Sec. 5.9.1 means the first subsection of
Sec. 5.9, etc.
Exercise <A> 1 of Sec. 1.1 means the first problem in Exercises <A>
of Sec. 1.1, etc.
(1.1.1) means the first numbered important or remarkable facts or sum-
marized theorem in Sec. 1.1, etc.
Figure 3.6 means that the sixth figure in Chap. 3, etc. Fig. II.1 means
the first figure in Part 2, etc. Figure A.1 means the first figure in Appendix
A; similarly for Fig. B.1, etc.
The end of a proof or an Example is sometimes, but not always, marked for attention.
For details, refer to Index of Notations.
(1) For honors high school students: Chapters 1, 2 and 4 plus Exer-
cises <A>.
(2) For freshman students: Chapters 1, 2 (up to Sec. 2.7), 3 (up to Sec. 3.7),
4 (up to Sec. 4.7 and Sec. 4.10) and/or 5 (up to Sec. 5.7 and Sec. 5.10)
plus, at least, Ex. <A>, in a one-academic-year three-hour-per-week
course. As far as teaching order is concerned, one can adopt the original arrangement of this book or, after finishing Chap. 1, try to combine Chaps. 2 and 3, and Chaps. 4 and 5, together according to the same titles of sections in each chapter.
(3) For sophomore students: Just like (2) but contains some selected prob-
lems from Ex. <B>.
(4) For a geometric course via linear algebra: Chapters 1, 2 (Sec. 2.8),
3 (Sec. 3.8), 4 (Sec. 4.8) and 5 (Secs. 5.8–5.12) in a one-academic-year
three-hour-per-week course.
(5) For junior and senior students who have had some prior exposure to
linear algebra: selective topics from the contents with emphasis on
problem-solving from Exercises <C>, <D> and Appendix B.
Acknowledgements
I had the honor of receiving so much help as I prepared the manuscripts of
this book.
Students listed below from my classes on linear algebra, advanced cal-
culus and differential geometry typed my manuscripts:
while
edited the initial typescript. They did this painstaking work voluntarily, patiently, dedicatedly, efficiently and unselfishly, without any payment. Without their kind help, it would have been impossible for this book to come into existence so soon. I’m especially grateful, with my best regards and wishes, to all of them.
And above all, special thanks should be given to Ms Shu-li Hsieh and Mr Chih-chiang Huang for their enthusiasm, carefulness, patience and constant assistance with unexpected trifles.
Teaching assistant Ching-yu Yang of the Mathematics Department occasionally provided technical assistance with computer work. Prof. Shao-shiung Lin of National Taiwan University, Taipei, reviewed the initial typescript and offered many valuable comments and suggestions for improving the text. Thank you both so much.
Also, thanks to Dr. K. K. Phua, Chairman and Editor-in-Chief of World Scientific, for his kind invitation to include this book among their publications, to Ms Zhang Ji for her patience and carefulness in editing the book, and to those who helped correct the English.
Of course, it is I who should take responsibility for any errors that remain. The author welcomes any positive and constructive comments and suggestions.
I-hsiung Lin
NTNU, Taipei, Taiwan
June 21, 2004
CONTENTS
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
Volume One
Part 1: The Affine and Linear Structures of R1 , R2 and R3
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 819
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 839
PART 1
Introduction
Starting from intuitively geometric objects, we treat
A suitable choice of basis for the kernel or/and the image of a linear trans-
formation will play a central role in handling these problems.
Based on results about linear transformations, affine transformations
are composed of linear transformations followed by translations. We discuss
topics such as:
CHAPTER 1
The One-Dimensional Real Vector Space R (or R1)

Introduction
Our theory starts from the following simple geometric
Postulate A single point determines a unique zero-dimensional (vector)
space.
Usually, a little black point or spot is used as an intuitively geometric
model of zero-dimensional space. Notice that “point” is an undefined term
without length, width and height.
In the physical world, it is reasonable to imagine that there exist two different points. Hence, one has the

Postulate Any two different points determine one and only one straight line.
A straightened loop, extended beyond any finite limit in both directions, is a vivid geometric model of a straight line. Mathematically, pick two different points O and A on a flat piece of paper, imagined to extend beyond any limit in every direction, and then connect O and A with a ruler. Now, we have a geometric model of an unlimited straight line L (see Fig. 1.1).
Fig. 1.1
As far as the basic concepts of straight lines are concerned, one should
know the following facts (1)–(6).
Fig. 1.2

Fig. 1.3

Fig. 1.4
$$\overrightarrow{PP} = \overrightarrow{QQ} = 0. \quad (1.1.2)$$

Hence, the zero vector is uniquely defined.
Now, fix any two different points O and X on L. For simplicity, denote the vector $\overrightarrow{OX}$ by x, i.e.

$$x = \overrightarrow{OX}.$$

Note that x ≠ 0.

For any fixed point P on L, the ratio of the signed length of $\overrightarrow{OP}$ with respect to $\overrightarrow{OX}$ is

$$\frac{\overline{OP}}{\overline{OX}} = \alpha,$$
where the real number α has the following properties:
α = 0 ⇔ P = O;
0 < α < 1 ⇔ P lies on the segment $\overline{OX}$ (P ≠ O, X);
α > 1 ⇔ P and X lie on the same side of O and $\overline{OP} > \overline{OX}$;
α = 1 ⇔ P = X; and
α < 0 ⇔ P and X lie on different sides of O.
In all cases, designate the vector

$$\overrightarrow{OP} = \alpha\overrightarrow{OX} = \alpha x.$$
On the other hand, for any given α ∈ R, there corresponds one and only one point P on the line L such that $\overrightarrow{OP} = \alpha x$ holds (see Fig. 1.5).
Fig. 1.5
Summarize as

The vectorization of a straight line
Fix any two different points O and X on a straight line L and denote the vector $\overrightarrow{OX}$ by x. Then the set of scalar products of x by arbitrary real numbers,

$$L(O; X) = \{\alpha x \mid \alpha \in \mathbb{R}\},$$

is called the vectorized space of the line L with O as the origin, $\overrightarrow{OO} = 0$ as the zero vector and x as the base vector. Elements in L(O; X) are called (line) vectors, which have the following algebraic operation properties: for α, β ∈ R,

1. (α + β)x = αx + βx = βx + αx;
2. (αβ)x = α(βx) = β(αx);
3. 1x = x;
4. Let $0 = \overrightarrow{OO}$; then αx + 0 = 0 + αx = αx;
5. (−α)x = −αx; x + (−x) = 0 = x − x;
6. 0x = α0 = 0.

In short, via the position vector $\overrightarrow{OP} = \alpha x$ for any α, points P on L have the above algebraic operation properties. (1.1.3)
Using the concept of (1.1.3), one can establish the algebraic characterization for three points lying on the same line.

Suppose that the points O, X and Y are collinear. Let $x = \overrightarrow{OX}$ and $y = \overrightarrow{OY}$.

In case O = X = Y: then x = y = 0, and hence x = αy or y = αx holds for any α ∈ R.

In case two of O, X and Y coincide, say X ≠ O = Y: then y = 0 and y = 0x holds.

If O, X and Y are different from each other, then, owing to the fact that Y lies on the line determined by O and X, y belongs to L(O; X). Hence, there exists α ∈ R such that y = αx.
We summarize these results as

Linear dependence of line vectors
Let $x = \overrightarrow{OX}$ and $y = \overrightarrow{OY}$. Then

(1) (geometric) The points O, X and Y are collinear.
⇔ (2) (algebraic) There exists α ∈ R such that y = αx or x = αy.
⇔ (3) (algebraic) There exist scalars α, β ∈ R, not all equal to zero, such that αx + βy = 0.

In any one of these three cases, the vectors x and y are said to be linearly dependent (on each other). (1.1.4)
Therefore, we have

Linear independence of a nonzero vector
Let $\overrightarrow{OX} = x$. Then

Exercises
<A>

$$B = \{x\}$$
$$[O]_B = 0, \quad [X]_B = 1.$$
For example:

$$[P]_B = -2 \;\Leftrightarrow\; \overrightarrow{OP} = -2x;$$
$$[Q]_B = \tfrac{3}{2} \;\Leftrightarrow\; \overrightarrow{OQ} = \tfrac{3}{2}x.$$

See Fig. 1.6.
Fig. 1.6
Now we summarize as

The coordinatization of a straight line
Let L(O; X) be an arbitrary vectorized space of the line L, with B = {x}, $x = \overrightarrow{OX}$, as a basis. The set

$$\mathbb{R}_{L(O;X)} = \{[P]_B \mid P \in L\}$$

is called the coordinatized space of L with respect to B. Explain further as follows.

(1) There is a one-to-one correspondence between the points P on the line L and the corresponding numbers $[P]_B$ of the real number system R.
(2) Define a mapping Φ: L(O; X) → R by

$$\Phi(\alpha x) = \alpha \quad (\text{or } \Phi(\overrightarrow{OP}) = [P]_B,\; P \in L).$$

Then Φ is one-to-one, onto and preserves algebraic operations, i.e. for any α, β ∈ R,
1. Φ(β(αx)) = βα,
2. Φ(αx + βx) = α + β. (1.2.1)
Fig. 1.7
(2) Only one nonzero vector is enough to generate the whole space L(O; X)
(refer to (1.1.4) and (1.1.5)).
Hence, we say that L(O; X) is a one-dimensional vector space with zero
vector 0 . Accurately speaking, L(O; X) is a one-dimensional affine space
(see Sec. 2.8 or Fig. B.2) of the line L.
Owing to the arbitrariness of O and X, the line L can be endowed with infinitely many vectorized spaces L(O; X). But according to (1.2.1), no matter how O and X are chosen, we always have

$$L(O;X) \xrightarrow{\;\Phi\;} \mathbb{R}_{L(O;X)} = \mathbb{R}. \quad (1.2.2)$$

So, we assign R another role, representing the standard one-dimensional real vector space, denoted by

$$\mathbb{R}^1. \quad (1.2.3)$$

A number α in R is identified with the position vector in R1, starting from 0 and pointing toward α, and is still denoted by α (see Fig. 1.8).
For the sake of reference and comparison, we summarize as
The real number system R and the standard one-dimensional
vector space R1
(1) R (simply called the real field, refer to Sec. A.3)
(a) Addition For any x, y ∈ R,
x+y ∈R
satisfies the following properties.
1. (commutative) x + y = y + x.
2. (associative) (x + y) + z = x + (y + z).
3. (zero element) 0: 0 + x = x.
4. (inverse element) For each x ∈ R, there exists a unique element
in R, denoted as −x, such that
x + (−x) = x − x = 0.
Fig. 1.8
Fig. 1.9
Our purpose here is to find the relation between $[P]_B$ and $[P]_{B'}$.
$$[P]_B = \mu \;(\Leftrightarrow \overrightarrow{OP} = \mu x), \qquad [O']_B = \alpha_0;$$
$$[P]_{B'} = \nu \;(\Leftrightarrow \overrightarrow{O'P} = \nu y), \qquad [O]_{B'} = \beta_0.$$
Since x and y are collinear, there exist constants α and β such that y = αx and x = βy. Hence x = αβx implies (αβ − 1)x = 0. The linear independence of x shows that αβ − 1 = 0 should hold, i.e.

$$\alpha\beta = 1.$$
Now, owing to the fact that $\overrightarrow{OP} = \overrightarrow{OO'} + \overrightarrow{O'P}$,

$$\mu x = \alpha_0 x + \nu y = \alpha_0 x + \nu\alpha x = (\alpha_0 + \nu\alpha)x$$
$$\Rightarrow \mu = \alpha_0 + \nu\alpha, \quad\text{or}\quad [P]_B = [O']_B + \alpha[P]_{B'}. \quad (1.3.1)$$
Similarly, by using $\overrightarrow{O'P} = \overrightarrow{O'O} + \overrightarrow{OP}$, one has

$$\nu = \beta_0 + \beta\mu, \quad\text{or}\quad [P]_{B'} = [O]_{B'} + \beta[P]_B. \quad (1.3.2)$$

Alternatively, (1.3.2) can be deduced from (1.3.1): solving µ = α0 + να for ν gives

$$\nu = -\frac{\alpha_0}{\alpha} + \frac{1}{\alpha}\mu.$$

But $\overrightarrow{OO'} = -\overrightarrow{O'O}$ ⇒ α0x = −β0y = −β0αx ⇒ α0 = −β0α, or β0 = −α0/α. Since αβ = 1, therefore

$$\nu = \beta_0 + \beta\mu.$$

This is (1.3.2). Similarly, (1.3.1) is deduced from (1.3.2) by the same process.
In specific terminology, equations such as (1.3.1) and (1.3.2) are called affine transformations or mappings between the affine spaces L(O; X) and L(O′; Y); in case O = O′, they are called (invertible) linear transformations (see Sec. B.7).

Finally, here is an example.
Fig. 1.10
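The example itself did not survive extraction here, so the following is only a substitute sketch of mine (with hypothetical numbers, matching Exercise <A> 1 below): it evaluates the change-of-coordinates formulas (1.3.1) and (1.3.2) for one concrete choice of α0 and α.

```python
# Change of coordinates on a line, cf. (1.3.1)-(1.3.2):
#   [P]_B  = [O']_B + alpha * [P]_B'      (1.3.1)
#   [P]_B' = [O]_B' + beta  * [P]_B       (1.3.2), with alpha * beta = 1
alpha0, alpha = -5.0, 1.0 / 3.0   # hypothetical: [O']_B = -5 and y = (1/3) x
beta = 1.0 / alpha                # alpha * beta = 1
beta0 = -alpha0 / alpha           # from OO' = -O'O

mu = 2.0                          # [P]_B of some test point P
nu = beta0 + beta * mu            # (1.3.2) gives [P]_B'
assert abs(mu - (alpha0 + alpha * nu)) < 1e-12   # (1.3.1) recovers [P]_B
print(f"[P]_B = {mu}, [P]_B' = {nu}")            # -> 2.0 and 21.0
```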
Exercises
<A> (Adopt the notations in (1.3.3).)
1. Suppose that L(O; X) and L(O′; Y) are two vectorized spaces on the same line L. Let

$$[O']_B = -5 \quad\text{and}\quad y = \tfrac{1}{3}x.$$

(a) Locate the points O, X, O′ and Y on the line L.
(b) If a point P ∈ L and $[P]_B = 0.2$, find $[P]_{B'}$. If $[P]_{B'} = 15$, what is $[P]_B$?
2. Construct two vectorized spaces L(O; X) and L(O′; Y) on the same line L, and explain graphically the following equations as changes of coordinates with

$$[P]_B = x \quad\text{and}\quad [P]_{B'} = y, \quad P \in L.$$

(a) y = −2x.
(b) $y = \sqrt{3}\,x - \tfrac{5}{3}$.
(c) x = 6y.
(d) x = −15y + 32.
let

$$[P]_B = x \text{ for } P \in L, \quad\text{and}\quad [P']_{B'} = y \text{ for } P' \in L'.$$

A mapping or transformation T from L onto L′ (see Sec. A.2) is called an affine mapping or transformation if there exist constants a and b ≠ 0 such that

$$T(x) = y = a + bx \quad (1.4.1)$$

holds for all P ∈ L and the corresponding P′ ∈ L′. Note that y = T(x) is one-to-one. In case a = 0, y = T(x) = bx is called a linear transformation (isomorphism) from the vector space L(O; X) onto the vector space L′(O′; X′). In this sense, a change of coordinates on the same line, as introduced in (1.3.3), is a special kind of affine mapping.
For any two fixed different points P1 and P2 with $[P_1]_B = x_1$ and $[P_2]_B = x_2$, the whole line L has coordinate representation

$$x = (1-t)x_1 + tx_2, \quad t \in \mathbb{R} \quad (1.4.2)$$

with respect to the basis B. The directed segment $\overrightarrow{P_1P_2}$ or $\overrightarrow{x_1x_2}$ with P1 as initial point and P2 as terminal point is the set of points

$$x = (1-t)x_1 + tx_2, \quad 0 \le t \le 1. \quad (1.4.3)$$

In case 0 < t < 1, the corresponding point x is called an interior point of $\overrightarrow{x_1x_2}$; otherwise (i.e. t < 0 or t > 1), an exterior point. See Fig. 1.11.
Fig. 1.11
Orient the line L by the basis vector $\overrightarrow{OX}$ in L(O; X) and let x2 − x1 be the signed length of the segment $\overline{x_1x_2}$ as we did in Sec. 1.1. For convenience, we also use $\overline{x_1x_2}$ to denote its signed length. Then, by (1.4.2), we see that

$$(1-t)(x - x_1) = t(x_2 - x)$$
$$\Rightarrow \frac{\overline{x_1x}}{\overline{xx_2}} = \frac{t}{1-t}, \quad t \ne 0, 1 \quad (1.4.5)$$

which is equal to $\overline{y_1y}/\overline{yy_2}$, by using (1.4.4). This means that an affine mapping preserves the ratio of two line segments along the line (see also (1.4.6) and Ex. <A> 2).
Finally, y2 − y1 = a + bx2 − (a + bx1) = b(x2 − x1) means that

$$\overline{y_1y_2} = b\,\overline{x_1x_2}. \quad (1.4.6)$$

Thus, an affine mapping does not preserve signed length except when b = 1, and does not preserve the orientation of directed segments except when b > 0.
We summarize as

Affine invariants
An affine transformation between straight lines preserves
1. (directed) line segments, along with end points, interior points and exterior points, and
2. the ratio of (signed) lengths of two line segments (along the same line),
which are called affine invariants. It does not necessarily preserve
3. (signed) length, and
4. orientation. (1.4.7)
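A quick numerical illustration of (1.4.7) (my own sketch, with arbitrary sample values): under T(x) = a + bx the ratio of signed lengths survives, while signed length and orientation do not in general.

```python
# Affine map T(x) = a + bx between coordinatized lines, cf. (1.4.1).
a, b = 4.0, -2.5                      # arbitrary constants, b != 0
T = lambda x: a + b * x

x1, x2, x = 1.0, 7.0, 2.5             # P1, P2 and an interior point of P1P2
y1, y2, y = T(x1), T(x2), T(x)

# 2. the ratio of signed lengths is preserved, cf. (1.4.5):
assert abs((x - x1) / (x2 - x) - (y - y1) / (y2 - y)) < 1e-12
# 3. signed length is merely scaled by b, cf. (1.4.6):
assert abs((y2 - y1) - b * (x2 - x1)) < 1e-12
# 4. orientation is reversed here, since b < 0:
assert (x2 - x1) > 0 and (y2 - y1) < 0
```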
Fig. 1.12
Exercises
<A>
1. For each pair P1, P2 of different points on L(O; X) and each pair P1′, P2′ of different points on L′(O′; X′), show that there exists a unique affine mapping T from L(O; X) onto L′(O′; X′) such that

$$T(P_1) = P_1' \quad\text{and}\quad T(P_2) = P_2'.$$

2. A one-to-one and onto mapping T: L(O; X) → L′(O′; X′) is affine if and only if T preserves the ratio of signed lengths of any two segments.
CHAPTER 2
The Two-Dimensional Real Vector Space R2

Introduction
In our physical world, one can realize that there does exist a point lying outside of a given straight line.

In Fig. 2.1, the point Q does not lie on the line L. This point Q and the moving point X on the line L generate infinitely many straight lines QX. We imagine that the collection of all such lines QX constitutes a plane.
Fig. 2.1

Fig. 2.2
Fig. 2.3
R2 = {(x1 , x2 ) | x1 , x2 ∈ R}
with scalar multiplication and addition of vectors on it, which can be consid-
ered as a typical model of planes and is called the standard two-dimensional
real vector space.
Changes of coordinates on the same plane (Sec. 2.4) result in the algebraic equations

$$x_1 = a_1 + a_{11}y_1 + a_{21}y_2,$$
$$x_2 = a_2 + a_{12}y_1 + a_{22}y_2$$

with $a_{11}a_{22} - a_{21}a_{12} \ne 0$. This is the affine transformation, or the linear transformation in case a1 = a2 = 0, to be formally discussed in Secs. 2.7 and 2.8.
Section 2.5 introduces straight lines in the affine plane R2 and their various equation representations in different coordinate systems.

Section 2.6 formally introduces the terminology of an affine basis B = {a0, a1, a2}, to replace the coordinate system Σ(a0; a1, a2), and this will be used constantly thereafter. Hence, we also discuss barycentric or affine coordinates of a point with respect to an affine basis. Ex. <B> of Sec. 2.6 introduces primary ideas about the projective plane.

The contents of Secs. 2.5 and 2.6 provide us with the necessary background in geometric interpretation, which will be discussed in Secs. 2.7 and 2.8.
Just like Sec. 1.4, a one-to-one mapping T from R2 onto R2 that preserves signed ratios of line segments along the same line is characterized algebraically as

$$T(x) = x_0 + xA, \quad x \in \mathbb{R}^2,$$

where $A = [a_{ij}]_{2\times 2}$ is invertible.

In the above diagram, one may go directly from Sec. 2.4 to Sec. 2.7 without seriously handicapping the understanding of the content in Sec. 2.7.

The manner of presentation, the techniques used and the results obtained in this chapter will be repeated and reconstructed for R3 in Chap. 3.
1. length $\overline{PQ}$, and
2. direction. (2.1.1)

In case Q = P, call $\overrightarrow{PQ}$ a zero vector.
Fig. 2.4
In physics, a vector $\overrightarrow{PQ}$ can be interpreted as a force acting on the point P and moving in the fixed direction toward the point Q (see Fig. 2.5). Under this circumstance, the vector $\overrightarrow{PQ}$ possesses both magnitude and direction.

Fig. 2.5

Fig. 2.6
Therefore, x and −x represent, respectively, two vectors equal in length but opposite in direction.
Addition
Suppose $x = \overrightarrow{PQ}$ and $y = \overrightarrow{QS}$. Starting from P, along x, to Q and then along y, to S is the same as starting from P directly to S. Then, it is reasonable to define

$$x + y = \overrightarrow{PS}. \quad (2.1.6)$$

Fig. 2.7
Scalar multiplication
Suppose α ∈ R and $x = \overrightarrow{PQ}$. Keep P fixed. Stretch the segment $\overline{PQ}$, α times, to a collinear segment $\overline{PR}$. According as α > 0 or α < 0, the point R lies on the same side of P as Q or on the opposite side of P, respectively (see Fig. 2.8). In all cases, we define the scalar product of the scalar α and the vector x by

$$\alpha x = \overrightarrow{PR}. \quad (2.1.7)$$

Fig. 2.8
Fig. 2.9
Comparing them with (1.2.4)(2), we can easily find that line vectors and plane vectors enjoy exactly the same operations and properties. The only difference between them is that one is one-dimensional by nature, while the other is two-dimensional.
Remark 2.1
The operation properties shown in (2.1.11) hold not only for line vectors and plane vectors, but also for (three-dimensional) space vectors. We now explain roughly as follows.

Line vectors
Suppose x ≠ 0. Then for any collinear vector y, there exists a unique scalar α ∈ R such that y = αx. Therefore, in essence, the addition of x

Fig. 2.10
Remark 2.2
It is appropriate, now, to say what a vector space over a field is.

For the general definition of a field F, please refer to Sec. A.3; for the definition of a vector space V over a field F, refer to Sec. B.1.

According to (1) in (1.2.4), R is a field and is called the real field. Therefore, (1.1.3) or (2) in (1.2.4) says that R1 or R is a vector space over the field R, simply called a real vector space.

Similarly, (2.1.11) indicates that R2 is also a real vector space, but a two-dimensional one (see Secs. 2.2 and 2.3).

The three-dimensional real vector space R3 will be defined in Secs. 3.1 and 3.2.

A vector space over the complex field C is specifically called a complex vector space.
The elements of the field F are called scalars and the elements of the
vector space V are called vectors. The word “vector”, without any practical
meanings such as displacement or acting force, is now being used to describe
any element of a vector space.
Exercises
<A>
1. Explain graphically the properties listed in (2.1.11).
2. Explain physically the properties listed in (2.1.11).
Fig. 2.11
Fig. 2.12

Fig. 2.13
Case 2 Suppose that three of the four points are collinear (see Fig. 2.14). For example, in case O, B1 and B2 are mutually distinct but collinear, then b2 = αb1 + 0·b3 for a suitable scalar α.

Fig. 2.14
Case 3 Suppose that no three of the four points are collinear. As we already knew from Fig. 2.12, b3 = αb1 + βb2 is true for some scalars α and β.

Conclusively, we have the

Linear dependence of plane vectors
The following statements hold and are equivalent.
(1) (geometric) The four points O, B1, B2 and B3 are coplanar.
⇔ (2) (algebraic) Fix one of the points O, B1, B2 and B3 as origin, say O, and hence produce three vectors $b_i = \overrightarrow{OB_i}$, i = 1, 2, 3. Then at least one of the vectors b1, b2 and b3 is a linear combination of the other two vectors.
⇔ (3) (algebraic) There exist real numbers y1, y2 and y3, not all equal to zero, such that y1b1 + y2b2 + y3b3 = 0.
Under these circumstances, the four points are called affinely dependent
and the resulting plane vectors linearly dependent. (2.2.3)
Remark
The geometric fact that the three points O, A1 and A2 are not collinear is algebraically equivalent to the linear independence of the plane vectors $a_1 = \overrightarrow{OA_1}$ and $a_2 = \overrightarrow{OA_2}$. Any vector in Σ(O; A1, A2) is produced, via the linear combination x1a1 + x2a2, from a unique vector x1a1 in L(O; A1) and a unique vector x2a2 in L(O; A2). We combine these two facts together and write

$$\Sigma(O; A_1, A_2) = L(O; A_1) \oplus L(O; A_2), \quad (2.2.5)$$

indicating that two intersecting (but not coincident) straight lines determine a unique plane.
Exercises
<A>
1. Prove (2.2.3) in detail.
2. Use notation in (2.2.2). For any three vectors b1 , b2 , b3 ∈ Σ(O; A1 , A2 ),
prove that they are linearly dependent.
3. Prove (2.2.4) in detail.
Fig. 2.15
$$[P]_B = (x_1, x_2) \;\Leftrightarrow\; \overrightarrow{OP} = x_1a_1 + x_2a_2.$$

The set of all these coordinate vectors,

$$\mathbb{R}^2_{\Sigma(O; A_1, A_2)} = \{[P]_B \mid P \in \Sigma\},$$

Fig. 2.16
$$\Phi(y_1b_1 + y_2b_2) = \Phi((y_1\alpha_{11} + y_2\alpha_{21})a_1 + (y_1\alpha_{12} + y_2\alpha_{22})a_2)$$
$$= (y_1\alpha_{11} + y_2\alpha_{21},\; y_1\alpha_{12} + y_2\alpha_{22})$$
$$= y_1(\alpha_{11}, \alpha_{12}) + y_2(\alpha_{21}, \alpha_{22})$$
$$= y_1\Phi(b_1) + y_2\Phi(b_2)$$
$$\Sigma \xrightarrow{\;\Psi^{-1}\circ\,\Phi\;} \Sigma', \qquad \Phi: \Sigma \to \mathbb{R}^2_{\Sigma}, \quad \Psi: \Sigma' \to \mathbb{R}^2_{\Sigma'}, \quad \mathbb{R}^2_{\Sigma} = \mathbb{R}^2_{\Sigma'}, \quad (2.3.3)$$

where Ψ⁻¹ is the inverse mapping of Ψ and Ψ⁻¹ ∘ Φ means the composite mapping of Φ followed by Ψ⁻¹ (see Sec. A.2). Ψ⁻¹ ∘ Φ is also a linear isomorphism. Observing this fact, we have the
2. (vector point of view) When R2 is considered as a vector space, x in R2 is called a vector, pointing from the zero vector 0 toward the point x (for the exact reason, see Definition 2.8.1 in Sec. 2.8).

If convenient, we will feel free to use both of these two concepts. Refer to Fig. 2.17. Both traditionally and commonly, in existing textbooks of linear algebra, when the point (x1, x2) in R2 is considered as a position vector (as in 2), it is usually denoted by a column vector

$$\binom{x_1}{x_2} \quad\text{or}\quad \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$$

and is treated as a 2 × 1 matrix. We would rather accept x = (x1, x2) as a vector too than adopt this traditional convention, for simplicity and for later usage in connection with matrix computational work (see Sec. 2.4).
Remark 2.3
In the Euclidean sense, a segment has its length, and two lines form the angle between them. When such concepts are imposed, then e1 and e2 are said to form a rectangular or Cartesian coordinate system (see Fig. 2.17(b)).
Fig. 2.17
From now on, if not specified, a plane endowed with a rectangular coordi-
nate system will always be considered as a concrete geometric model of R2 .
Therefore, we have
It is easily seen that vectors in $\mathbb{R}_{L(O; A_1)}$ are closed under the addition and scalar multiplication inherited from those in $\mathbb{R}^2_{\Sigma(O; A_1, A_2)}$. This simply means that $\mathbb{R}_{L(O; A_1)}$ exists by itself as a vector space, and is called a one-dimensional vector (or linear) subspace of $\mathbb{R}^2_{\Sigma(O; A_1, A_2)}$. For the same reason, $\mathbb{R}_{L(O; A_2)}$ is also a one-dimensional vector subspace of $\mathbb{R}^2_{\Sigma(O; A_1, A_2)}$.

Observe the following two facts.

1. The vector subspaces $\mathbb{R}_{L(O; A_1)}$ and $\mathbb{R}_{L(O; A_2)}$ have only the zero vector 0 = (0, 0) in common, i.e.

$$\mathbb{R}_{L(O; A_1)} \cap \mathbb{R}_{L(O; A_2)} = \{0\}.$$
Fig. 2.18
R2 = R ⊕ R,
Exercises
<A>
(1) x, y ∈ S ⇒ x + y ∈ S,
(2) α ∈ R and x ∈ S ⇒ αx ∈ S (in particular, 0 ∈ S).
{0} and R2 itself are trivial subspaces of R2. A subspace which is not identical with R2 is called a proper subspace. Prove that the following are equivalent.
(a) S is a proper subspace which is not {0}.
(b) There exists a vector x0 ∈ S, x0 ≠ 0, such that

$$S = \{\alpha x_0 \mid \alpha \in \mathbb{R}\}.$$

$$S_1 + S_2 = \{x_1 + x_2 \mid x_1 \in S_1 \text{ and } x_2 \in S_2\}$$

is still a subspace of R2. In case S1 ∩ S2 = {0}, write S1 + S2 as

S1 ⊕ S2
R2 = U ⊕ V.
Is V unique?
6. Let x1, ..., xn be position vectors in R2 such that the terminal point of $x_{j-1}$ is the initial point of $x_j$ for 2 ≤ j ≤ n, and the terminal point of xn is the initial point of x1. Then

$$\sum_{j=1}^{n} x_j = x_1 + x_2 + \cdots + x_n = 0.$$
<B>

Almost all the concepts we have introduced for R and R2 so far can be generalized verbatim to abstract vector spaces over an arbitrary field, such as:

Vector (or linear) space V over a field F: Fn.
Real or complex vector space: Rn, Cn.
6′. Let {x1, ..., xn} and {y1, ..., yn} be two bases for an n-dimensional vector space V. Then there exists a permutation j1, j2, ..., jn of 1, 2, ..., n so that all

$$\{x_1, \ldots, x_{i-1}, y_{j_i}, x_{i+1}, \ldots, x_n\}, \quad 1 \le i \le n,$$

are bases for V.
(1) Does the intuition experienced in the construction of the abstract spaces
R and R2 help in solving the problems? In what way?
(2) Is more intuitive or algebraic experience, such as in R3 (see Chap. 3),
needed?
(3) Does one have other sources of intuition concerning geometric concepts
than those from R, R1 and R2 ?
(4) Are the algebraic methods developed and used to solve problems in R and R2 still good? To what extent should they be generalized? Does the nature of the scalar field play an essential role in some problems?
(5) Need the algebraic methods be more unified and simplified? Need new
methods such as matrix operations be introduced as early as possible
and widely used?
(6) Are more mathematical backgrounds, sophistication or maturity
needed?
1. Model after (2) in (1.2.4) to explain that the set C of complex numbers
is a one-dimensional vector space over the complex field C.
(a) Show that the complex number 1 itself constitutes a basis for C. Try
to find all other bases for C.
(b) Is there any intuitive interpretation for C as we did for R in Secs. 1.1
and 1.2?
(c) Consider the set C of complex numbers as a vector space over the real field. Show that {1, i}, where $i = \sqrt{-1}$, is a basis for C and hence C is a two-dimensional real vector space. Find all other bases for this C. Is there any possible relation between R2 and C?
2. Consider the set R of real numbers as a vector space over the rational
field Q.
(a) Make sure what scalar multiplication means in this case!
(b) Two real numbers 1 and α are linearly independent if and only if α
is an irrational number.
(c) Is it possible for this R to be finite-dimensional?
3. Let the set

$$\mathbb{R}^+ = \{x \in \mathbb{R} \mid x > 0\}$$

be endowed with two operations:
(1) Addition ⊕: x ⊕ y = xy (the ordinary product of x and y), and
(2) Scalar multiplication ⊙ of x ∈ R⁺ by α ∈ R: $\alpha \odot x = x^{\alpha}$.
(a) Show that R⁺ is a real vector space with 1 as the zero vector and x⁻¹ the inverse of x.
(b) Show that each vector in R⁺ is linearly independent by itself and every two different vectors in R⁺ are linearly dependent.
(c) Show that R⁺ is linearly isomorphic to the real vector space R.
4. Show that B = {x1, x2, ..., xn}, where

$$x_j = (\underbrace{1, 1, \ldots, 1}_{j}, 0, \ldots, 0), \quad 1 \le j \le n,$$

is a basis for Rn.
B = {a1, a2},
B′ = {b1, b2}.

Our main purpose here is to establish the relationship between the coordinate vectors $[P]_B$ and $[P]_{B'}$ of the same point P ∈ Σ with respect to the bases B and B′, respectively (see Fig. 2.19).
Fig. 2.19
In view of the parallel invariance of vectors, one may consider b1 and b2 as vectors in Σ(O; A1, A2), and hence

$$[b_i]_B = (\alpha_{i1}, \alpha_{i2}), \quad\text{i.e.}\quad b_i = \alpha_{i1}a_1 + \alpha_{i2}a_2, \quad i = 1, 2.$$

Similarly, a1 and a2 may be considered as vectors in Σ(O′; B1, B2):

$$[a_i]_{B'} = (\beta_{i1}, \beta_{i2}), \quad\text{i.e.}\quad a_i = \beta_{i1}b_1 + \beta_{i2}b_2, \quad i = 1, 2.$$
As indicated in Fig. 2.19,

$$\overrightarrow{OP} = \overrightarrow{OO'} + \overrightarrow{O'P}$$
$$\Rightarrow x_1a_1 + x_2a_2 = (\alpha_1 a_1 + \alpha_2 a_2) + (y_1 b_1 + y_2 b_2)$$
$$= \alpha_1 a_1 + \alpha_2 a_2 + y_1(\alpha_{11}a_1 + \alpha_{12}a_2) + y_2(\alpha_{21}a_1 + \alpha_{22}a_2)$$

⇒ (since a1 and a2 are linearly independent)

$$x_1 = \alpha_1 + \alpha_{11}y_1 + \alpha_{21}y_2,$$
$$x_2 = \alpha_2 + \alpha_{12}y_1 + \alpha_{22}y_2. \quad (2.4.1)$$
Suppose the reader is familiar with basic facts about matrices (if not, please refer to Sec. B.4). The above-mentioned equations can be written as a single one:

$$(x_1, x_2) = (\alpha_1, \alpha_2) + (\alpha_{11}y_1 + \alpha_{21}y_2,\; \alpha_{12}y_1 + \alpha_{22}y_2) = (\alpha_1\;\;\alpha_2) + (y_1\;\;y_2)\begin{pmatrix} \alpha_{11} & \alpha_{12} \\ \alpha_{21} & \alpha_{22} \end{pmatrix}. \quad (2.4.1')$$
respectively. Then the coordinate vectors $[P]_B$ and $[P]_{B'}$ of the same point P ∈ R2 satisfy the following two formulas of coordinate changes:

$$[P]_B = [O']_B + [P]_{B'}A^{B'}_{B},$$
$$[P]_{B'} = [O]_{B'} + [P]_B A^{B}_{B'},$$
Suppose $y_1 b_1 + y_2 b_2 = 0$. Then

$$y_1(\alpha_{11}a_1 + \alpha_{12}a_2) + y_2(\alpha_{21}a_1 + \alpha_{22}a_2) = (\alpha_{11}y_1 + \alpha_{21}y_2)a_1 + (\alpha_{12}y_1 + \alpha_{22}y_2)a_2 = 0$$
$$\Rightarrow \alpha_{11}y_1 + \alpha_{21}y_2 = 0, \qquad \alpha_{12}y_1 + \alpha_{22}y_2 = 0$$

⇒ (owing to the linear independence of b1 and b2) y1 = y2 = 0 is the only solution. Next, substituting $a_1 = \beta_{11}b_1 + \beta_{12}b_2$ and $a_2 = \beta_{21}b_1 + \beta_{22}b_2$ into $b_1 = \alpha_{11}a_1 + \alpha_{12}a_2$ and, similarly, into $b_2 = \alpha_{21}a_1 + \alpha_{22}a_2$, and comparing coefficients, one obtains two sets of equations in the $\alpha_{ij}$ and $\beta_{ij}$.
Put the above two sets of equations in the form of matrix products, and they can be simplified, in notation, as

$$\begin{pmatrix} \alpha_{11} & \alpha_{12} \\ \alpha_{21} & \alpha_{22} \end{pmatrix}\begin{pmatrix} \beta_{11} & \beta_{12} \\ \beta_{21} & \beta_{22} \end{pmatrix} = \begin{pmatrix} \beta_{11} & \beta_{12} \\ \beta_{21} & \beta_{22} \end{pmatrix}\begin{pmatrix} \alpha_{11} & \alpha_{12} \\ \alpha_{21} & \alpha_{22} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$
$$\Rightarrow A^{B'}_{B}A^{B}_{B'} = A^{B}_{B'}A^{B'}_{B} = I_2.$$

This means that $A^{B'}_B$ and $A^B_{B'}$ are inverses of each other. Actual computation, solving the $\beta_{ij}$ in terms of the $\alpha_{ij}$, shows that

$$A^{B}_{B'} = \left(A^{B'}_{B}\right)^{-1} = \frac{1}{\alpha_{11}\alpha_{22} - \alpha_{12}\alpha_{21}}\begin{pmatrix} \alpha_{22} & -\alpha_{12} \\ -\alpha_{21} & \alpha_{11} \end{pmatrix}.$$
This is 2.

About 3 Multiply both sides of the first formula from the right by $\left(A^{B'}_{B}\right)^{-1}$; one has

$$[P]_B\left(A^{B'}_{B}\right)^{-1} = [O']_B\left(A^{B'}_{B}\right)^{-1} + [P]_{B'}$$
$$\Rightarrow [P]_{B'} = -[O']_B\left(A^{B'}_{B}\right)^{-1} + [P]_B\left(A^{B'}_{B}\right)^{-1}.$$

All that remains to prove is that $[O]_{B'} = -[O']_B\left(A^{B'}_{B}\right)^{-1}$. For this purpose, note first that $\overrightarrow{OO'} = -\overrightarrow{O'O}$. Therefore, remembering the notations adopted at the beginning of this subsection,

$$\overrightarrow{OO'} = \alpha_1 a_1 + \alpha_2 a_2 = \alpha_1(\beta_{11}b_1 + \beta_{12}b_2) + \alpha_2(\beta_{21}b_1 + \beta_{22}b_2)$$
$$= (\alpha_1\beta_{11} + \alpha_2\beta_{21})b_1 + (\alpha_1\beta_{12} + \alpha_2\beta_{22})b_2$$
$$= -\overrightarrow{O'O} = -(\beta_1 b_1 + \beta_2 b_2)$$
$$\Rightarrow \beta_1 = -(\alpha_1\beta_{11} + \alpha_2\beta_{21}), \qquad \beta_2 = -(\alpha_1\beta_{12} + \alpha_2\beta_{22})$$
$$\Rightarrow (\beta_1\;\;\beta_2) = -(\alpha_1\;\;\alpha_2)\begin{pmatrix} \beta_{11} & \beta_{12} \\ \beta_{21} & \beta_{22} \end{pmatrix}.$$

This finishes 3.
Solution Suppose

$$a_1 = \overrightarrow{OA_1} = (1, 2) - (1, 0) = (0, 2),$$
$$a_2 = \overrightarrow{OA_2} = (0, 1) - (1, 0) = (-1, 1);$$
$$b_1 = \overrightarrow{O'B_1} = (0, 0) - (-1, -1) = (1, 1),$$
$$b_2 = \overrightarrow{O'B_2} = (2, 3) - (-1, -1) = (3, 4),$$

and let

B = {a1, a2}, B′ = {b1, b2}.

Fig. 2.20
To compute $A^{B'}_B$:

$$b_1 = \alpha_{11}a_1 + \alpha_{12}a_2$$
$$\Rightarrow (1, 1) = \alpha_{11}(0, 2) + \alpha_{12}(-1, 1) = (-\alpha_{12},\; 2\alpha_{11} + \alpha_{12})$$
$$\Rightarrow \alpha_{12} = -1, \quad 2\alpha_{11} + \alpha_{12} = 1 \;\Rightarrow\; \alpha_{11} = 1$$
$$\Rightarrow [b_1]_B = (1, -1).$$

Similarly,

$$b_2 = \alpha_{21}a_1 + \alpha_{22}a_2$$
$$\Rightarrow (3, 4) = \alpha_{21}(0, 2) + \alpha_{22}(-1, 1) = (-\alpha_{22},\; 2\alpha_{21} + \alpha_{22})$$
$$\Rightarrow \alpha_{22} = -3, \quad 2\alpha_{21} + \alpha_{22} = 4 \;\Rightarrow\; \alpha_{21} = \tfrac{7}{2}$$
$$\Rightarrow [b_2]_B = \left(\tfrac{7}{2}, -3\right).$$

Putting these together,

$$A^{B'}_{B} = \begin{pmatrix} [b_1]_B \\ [b_2]_B \end{pmatrix} = \begin{pmatrix} 1 & -1 \\ \tfrac{7}{2} & -3 \end{pmatrix}.$$
To compute $A^B_{B'}$:

$$a_1 = \beta_{11}b_1 + \beta_{12}b_2$$
$$\Rightarrow (0, 2) = \beta_{11}(1, 1) + \beta_{12}(3, 4) = (\beta_{11} + 3\beta_{12},\; \beta_{11} + 4\beta_{12})$$
$$\Rightarrow \beta_{11} = -6, \; \beta_{12} = 2 \;\Rightarrow\; [a_1]_{B'} = (-6, 2).$$

Also,

$$a_2 = \beta_{21}b_1 + \beta_{22}b_2 \;\Rightarrow\; [a_2]_{B'} = (-7, 2).$$

Hence,

$$A^{B}_{B'} = \begin{pmatrix} -6 & 2 \\ -7 & 2 \end{pmatrix}.$$
which shows that $A^{B'}_B$ and $A^B_{B'}$ are, in fact, inverses of each other. By the way,

$$\det A^{B'}_{B} = \begin{vmatrix} 1 & -1 \\ \tfrac{7}{2} & -3 \end{vmatrix} = -3 + \tfrac{7}{2} = \tfrac{1}{2},$$
$$\det A^{B}_{B'} = \begin{vmatrix} -6 & 2 \\ -7 & 2 \end{vmatrix} = -12 + 14 = 2 = \frac{1}{\det A^{B'}_{B}},$$

and

$$\left(A^{B'}_{B}\right)^{-1} = \frac{1}{\det A^{B'}_{B}}\begin{pmatrix} -3 & 1 \\ -\tfrac{7}{2} & 1 \end{pmatrix} = 2\begin{pmatrix} -3 & 1 \\ -\tfrac{7}{2} & 1 \end{pmatrix} = \begin{pmatrix} -6 & 2 \\ -7 & 2 \end{pmatrix} = A^{B}_{B'}.$$
Finally,

$$\overrightarrow{OO'} = (-1, -1) - (1, 0) = (-2, -1) = \alpha_1 a_1 + \alpha_2 a_2 = \alpha_1(0, 2) + \alpha_2(-1, 1) = (-\alpha_2,\; 2\alpha_1 + \alpha_2)$$
$$\Rightarrow \alpha_2 = 2, \quad 2\alpha_1 + \alpha_2 = -1 \;\Rightarrow\; \alpha_1 = -\tfrac{3}{2}$$
$$\Rightarrow [O']_B = \left(-\tfrac{3}{2}, 2\right).$$

While

$$\overrightarrow{O'O} = -\overrightarrow{OO'} = (2, 1) = \beta_1 b_1 + \beta_2 b_2 = \beta_1(1, 1) + \beta_2(3, 4) = (\beta_1 + 3\beta_2,\; \beta_1 + 4\beta_2)$$
$$\Rightarrow \beta_1 + 3\beta_2 = 2, \quad \beta_1 + 4\beta_2 = 1 \;\Rightarrow\; \beta_1 = 5, \; \beta_2 = -1$$
$$\Rightarrow [O]_{B'} = (5, -1).$$

Now, by actual computation, we do have

$$-[O']_B A^{B}_{B'} = -\left(-\tfrac{3}{2}\;\;2\right)\begin{pmatrix} -6 & 2 \\ -7 & 2 \end{pmatrix} = -(9 - 14,\; -3 + 4) = -(-5, 1) = (5, -1) = [O]_{B'}.$$
The wanted formulas for the changes of coordinates are

$$[P]_B = \left(-\tfrac{3}{2}, 2\right) + [P]_{B'}\begin{pmatrix} 1 & -1 \\ \tfrac{7}{2} & -3 \end{pmatrix} \quad\text{and}\quad [P]_{B'} = (5, -1) + [P]_B\begin{pmatrix} -6 & 2 \\ -7 & 2 \end{pmatrix}.$$

For example, if $[P]_{B'} = (5, 2)$, i.e. $\overrightarrow{O'P} = 5b_1 + 2b_2$, then

$$[P]_B = \left(-\tfrac{3}{2}, 2\right) + (5\;\;2)\begin{pmatrix} 1 & -1 \\ \tfrac{7}{2} & -3 \end{pmatrix} = \left(-\tfrac{3}{2}, 2\right) + (12, -11) = \left(\tfrac{21}{2}, -9\right),$$

which means $\overrightarrow{OP} = \tfrac{21}{2}a_1 - 9a_2$.
Remark The computations of $A^{B'}_B$ and $A^B_{B'}$.
Here, we suppose that the readers are familiar with basic matrix computation techniques, which may be obtained from Sec. B.4. Suppose

$$a_i = (a_{i1}, a_{i2}), \quad i = 1, 2, \quad\text{and}\quad b_i = (b_{i1}, b_{i2}), \quad i = 1, 2.$$

Then

$$A^{B'}_{B} = BA^{-1} = \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix}\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}^{-1}. \quad (2.4.3)$$

Similarly,

$$A^{B}_{B'} = AB^{-1} = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}\begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix}^{-1}. \quad (2.4.3')$$
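As a mechanical check (not in the original text), formulas (2.4.3) and (2.4.3′) can be evaluated on the data of the preceding example; the rows of the matrices A and B below are the coordinate vectors of a1, a2 and b1, b2.

```python
import numpy as np

A = np.array([[0.0, 2.0],     # a1
              [-1.0, 1.0]])   # a2
B = np.array([[1.0, 1.0],     # b1
              [3.0, 4.0]])    # b2

A_Bp_to_B = B @ np.linalg.inv(A)   # A^{B'}_B = B A^{-1}, cf. (2.4.3)
A_B_to_Bp = A @ np.linalg.inv(B)   # A^{B}_{B'} = A B^{-1}, cf. (2.4.3')

assert np.allclose(A_Bp_to_B, [[1.0, -1.0], [3.5, -3.0]])
assert np.allclose(A_B_to_Bp, [[-6.0, 2.0], [-7.0, 2.0]])
assert np.allclose(A_Bp_to_B @ A_B_to_Bp, np.eye(2))   # inverses of each other
```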
Exercises
<A>
<B>
$$x_1 = \alpha_1 + \alpha_{11}y_1 + \alpha_{21}y_2,$$
$$x_2 = \alpha_2 + \alpha_{12}y_1 + \alpha_{22}y_2.$$

For a vector x = (x1, x2) ∈ R2, define

$$xA = (x_1\;\;x_2)\begin{pmatrix} a_1 \\ a_2 \end{pmatrix} = a_1x_1 + a_2x_2$$

and consider it as a vector in R (see the Note right after (2.4.1′)).
(a) Show that, for x, y ∈ R2 and α ∈ R,

$$(\alpha x + y)A = \alpha(xA) + yA.$$

$$\operatorname{Ker}(A) = \{x \in \mathbb{R}^2 \mid xA = 0\}, \text{ also denoted as } N(A);$$
$$\operatorname{Im}(A) = \{xA \mid x \in \mathbb{R}^2\}, \text{ also denoted as } R(A),$$

are respectively called the kernel and the range of A. Show that Ker(A) is a subspace of R2 while Im(A) is a subspace of R.
(b) Show that A, as a linear transformation, is onto R, i.e. Im(A) = R, if and only if Ker(A) ≠ R2.
(c) Show that A is one-to-one if and only if Ker(A) = {0}. Is it possible that A is one-to-one?
$$x \mapsto xA = (a_{11}x_1 + a_{21}x_2,\; a_{12}x_1 + a_{22}x_2).$$

(a) Show that A is a linear transformation. Its kernel Ker(A) and range Im(A) are subspaces of R2.
(b) Show that dim Ker(A) + dim Im(A) = dim R2 = 2 (see Sec. B.3).
(c) Show that the following are equivalent.
(1) A is one-to-one, i.e. Ker(A) = {0}.
(2) A is onto, i.e. Im(A) = R2.
(3) A maps every basis B = {x1, x2} for R2 onto a basis {x1A, x2A} for R2.
the mapping f: R2 → R2 defined by

$$f(\alpha_1x_1 + \alpha_2x_2) = \alpha_1y_1 + \alpha_2y_2$$

is the unique linear transformation from R2 into R2 satisfying

$$f(x_1) = y_1, \quad f(x_2) = y_2.$$

Suppose $[f(x_1)]_B = (a_{11}, a_{12})$ and $[f(x_2)]_B = (a_{21}, a_{22})$. Then

$$[f(x)]_B = [x]_B[f]_B, \quad\text{where}\quad [f]_B = \begin{pmatrix} [f(x_1)]_B \\ [f(x_2)]_B \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}.$$

This $[f]_B$ is called the matrix representation of f related to the basis B.
(e) Let f: R2 → R2 be any linear transformation. Then (show that)

$$f(x) = xA,$$

where $A = [f]_N$ and N = {e1, e2} is the natural basis for R2.
(f) Let S: a1x1 + a2x2 = 0 be a subspace of R2. Show that there are infinitely many linear transformations f: R2 → R2 such that

S = Ker(f), and S = Im(f)

hold respectively.
(g) Let S1: a11x1 + a12x2 = 0 and S2: a21x1 + a22x2 = 0 be two subspaces of R2. Construct a linear transformation f: R2 → R2 such that

f(S1) = S2.

How many such f are possible?
5. Let B = {x1, x2} and B′ = {y1, y2} be two bases for R2 and let f: R2 → R2 be a linear transformation.
(a) Show that

$$[f]_{B'} = A^{B'}_{B}[f]_B A^{B}_{B'}.$$
where $[f(x_i)]_{B'} = (a_{i1}, a_{i2})$, i.e. $f(x_i) = a_{i1}y_1 + a_{i2}y_2$ for i = 1, 2. Show that

$$[f(x)]_{B'} = [x]_B[f]^{B'}_{B};$$

in particular, $[f]^{B}_{B} = [f]_B$ and $[x]_{B'} = [x]_B A^{B}_{B'}$. Show also that

$$[f]^{B'}_{B} = [f]_B A^{B}_{B'} \quad\text{and}\quad [f]^{B'}_{B} = A^{B}_{B'}[f]_{B'}.$$

$$f(x) = [x]_B$$
Fig. 2.21
Fig. 2.22
Note that the two coordinate axes separate each other at O into four half-
lines, symbolically denoted by (+, 0), (0, +), (−, 0) and (0, −).
Fig. 2.23
Suppose

$$[A]_B = [a]_B = (a_1, a_2), \quad [b]_B = (b_1, b_2), \quad [X]_B = [x]_B = (x_1, x_2).$$

In terms of these coordinates, (2.5.4) can be rewritten as

$$[X]_B = [a]_B + t[b]_B. \quad (2.5.5)$$
Example (continued from the Example in Sec. 2.4) Find the equation of the line L determined by the points A = (−3, 0) and B = (3, 3) in R2, respectively in Σ(O; A1, A2) and Σ(O′; B1, B2).

Solution In the coordinate system Σ(O; A1, A2), let

$$a = \overrightarrow{OA} = (-3, 0) - (1, 0) = (-4, 0) \;\Rightarrow\; [a]_B = (-2, 4),$$
$$b = \overrightarrow{AB} = (3, 3) - (-3, 0) = (6, 3) \;\Rightarrow\; [b]_B = \left(\tfrac{9}{2}, -6\right).$$

For an arbitrary point X ∈ L, let $[X]_B = (x_1, x_2)$; then (2.5.5) shows that

$$(x_1, x_2) = (-2, 4) + t\left(\tfrac{9}{2}, -6\right), \quad t \in \mathbb{R}$$
$$\Rightarrow x_1 = -2 + \tfrac{9}{2}t, \quad x_2 = 4 - 6t, \quad\text{or}\quad 4x_1 + 3x_2 - 4 = 0
Adopt the notations and results in (2.4.2). From (2.5.5), the equation of the line L in Σ(O; A1, A2) is

$$[X]_B = [a]_B + t[b]_B, \quad X \in L.$$

Via the change-of-coordinates formula $[X]_{B'} = [O]_{B'} + [X]_B A^B_{B'}$, one obtains

$$[X]_{B'} = [O]_{B'} + \{[a]_B + t[b]_B\}A^{B}_{B'}, \quad (2.5.8)$$

which is the equation of the same line L in another coordinate system Σ(O′; B1, B2).
For example, using the above example, we compute as follows. As we already know that

$$[O]_{B'} = (5, -1) \quad\text{and}\quad A^{B}_{B'} = \begin{pmatrix} -6 & 2 \\ -7 & 2 \end{pmatrix},$$

using $[X]_B = (x_1, x_2)$ and $[X]_{B'} = (y_1, y_2)$, we have

$$(y_1, y_2) = (5, -1) + \left\{(-2, 4) + t\left(\tfrac{9}{2}, -6\right)\right\}\begin{pmatrix} -6 & 2 \\ -7 & 2 \end{pmatrix}$$
$$= (5, -1) + (-16 + 15t,\; 4 - 3t) = (-11 + 15t,\; 3 - 3t).$$

Hence y1 = −11 + 15t, y2 = 3 − 3t, as shown in the example.
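The same computation can be confirmed mechanically (a sketch of mine, not the book's): substitute the parametric representation of L into (2.5.8) and compare with the closed form just obtained.

```python
import numpy as np

O_Bp = np.array([5.0, -1.0])             # [O]_{B'}
A_B_to_Bp = np.array([[-6.0, 2.0],
                      [-7.0, 2.0]])      # A^B_{B'}
a_B = np.array([-2.0, 4.0])              # [a]_B
b_B = np.array([4.5, -6.0])              # [b]_B = (9/2, -6)

for t in (0.0, 1.0, -2.0, 0.3):
    X_B = a_B + t * b_B                  # [X]_B on the line L
    X_Bp = O_Bp + X_B @ A_B_to_Bp        # (2.5.8)
    assert np.allclose(X_Bp, [-11.0 + 15.0 * t, 3.0 - 3.0 * t])
```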
Finally, we discuss
The relative positions of two lines in a plane
Let
L1 :
x =a1 + t b1 (passing the point
a1 with direction b1 ),
L2 : x = a2 + t b2 (passing the point
a2 with direction b2 )
be two given lines in R2 . Then L1 and L2 may have the following three
relative positions. They are,
1. coincident (L1 = L2 ) ⇔ the vectors
a1 − a2 , b1 and b2 are linearly
dependent,
2. parallel (L1 // L2 ) ⇔ b1 and b2 are linearly dependent, but linearly
independent of a2 −
a1 , and
3. intersecting (in a unique single point) ⇔ b1 and b2 are linearly
independent. (2.5.9)
Fig. 2.24
Proof Since b1 and b2 are nonzero vectors, only the three cases on the right-hand sides of 1, 2 and 3 need to be considered. Only sufficiency is proved; the necessity is left to the readers.

Case 1 Let b2 = αb1 and a2 − a1 = βb1. For a point x ∈ L1, then

$$x = a_1 + tb_1 = a_2 - \beta b_1 + tb_1 = a_2 + \frac{t - \beta}{\alpha}b_2,$$

which means x ∈ L2. Conversely, if x ∈ L2, then

$$x = a_2 + tb_2 = a_1 + \beta b_1 + t\alpha b_1 = a_1 + (\beta + t\alpha)b_1$$

indicates that x ∈ L1. Therefore, L1 = L2 holds.

Case 2 Suppose b2 = αb1, but a2 − a1 ≠ βb1 for every scalar β ∈ R. If there were a point x common to both L1 and L2, then two scalars t1 and t2 could be found so that

$$a_1 + t_1b_1 = a_2 + t_2b_2 = a_2 + t_2\alpha b_1$$
$$\Rightarrow a_2 - a_1 = (t_1 - t_2\alpha)b_1,$$

a contradiction. Hence L1 and L2 have no point in common.
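The trichotomy (2.5.9) translates directly into 2 × 2 determinant tests for the linear (in)dependence of b1, b2 and a2 − a1. A small classifier sketch of mine (hypothetical sample data):

```python
def relative_position(a1, b1, a2, b2):
    """Classify the lines x = a1 + t b1 and x = a2 + t b2 in R^2, cf. (2.5.9)."""
    det = lambda u, v: u[0] * v[1] - u[1] * v[0]
    if det(b1, b2) != 0:
        return "intersecting"            # 3. b1, b2 linearly independent
    d = (a2[0] - a1[0], a2[1] - a1[1])   # a2 - a1
    if det(b1, d) != 0:
        return "parallel"                # 2. b1 // b2 but a2 - a1 off-line
    return "coincident"                  # 1. all three pairwise dependent

print(relative_position((0, 0), (1, 1), (1, 0), (2, 2)))   # parallel
print(relative_position((0, 0), (1, 1), (2, 2), (3, 3)))   # coincident
print(relative_position((0, 0), (1, 0), (0, 1), (0, 1)))   # intersecting
```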
$$x = a_1 + t(a_2 - a_1) = (1-t)a_1 + ta_2, \quad t \in \mathbb{R} \quad (2.5.10)$$

Just like (1.4.2) and (1.4.3), the directed segment $\overline{a_1a_2}$ with the initial point a1 and terminal point a2 is the set of points

$$x = (1-t)a_1 + ta_2, \quad 0 \le t \le 1. \quad (2.5.11)$$

If 0 < t < 1, the point x is called an interior point of the segment $\overline{a_1a_2}$ with end points a1 and a2; if t < 0 or t > 1, x is called an exterior point. See Fig. 2.25 (compare with Fig. 1.11). $\tfrac{1}{2}(a_1 + a_2)$ is called the middle point of $\overline{a_1a_2}$.
Fig. 2.25
By a triangle

$$\Delta a_1a_2a_3 \quad (2.5.12)$$

with three non-collinear points a1, a2 and a3 as vertices, we mean the plane figure formed by the three consecutive segments $\overline{a_1a_2}$, $\overline{a_1a_3}$ and $\overline{a_2a_3}$, which are called sides. See Fig. 2.26.

Fig. 2.26
Exercises
<A>
1. Use the notations and results from Ex. <A> 4 in Sec. 2.4. Denote by L
the line determined by the points A = (1, 3) and B = (−6, 1) in R2 .
(a) Find the parametric and coordinate equations of L in Σ(O; A1 , A2 )
and Σ(O ; B1 , B2 ), respectively.
(b) Check the answers in (a), by using (2.5.8).
2. It is known that the equation of the line L in the rectangular coordinate
system Σ(o; e1, e2) is
−3x1 + x2 + 4 = 0.
<B>

of $\Delta b_1b_2b_3$. Compare these quantities with (a). What can you find?
(c) Find the fourth point a4 in R2 so that a1, a2, a3, a4 (in this order) constitute the vertices of a parallelogram, denoted by ▱a1a2a3a4. Compute the area of ▱a1a2a3a4. Answer the same questions as in (b). In particular, do the image points b1, b2, b3, b4 form a parallelogram? What is its area?
8. In R2 (of course, with a rectangular coordinate system Σ(o; e1, e2)), the length of a vector x = (x1, x2) is denoted by

$$|x| = (x_1^2 + x_2^2)^{1/2}.$$

$$|x|^2 = 1 \quad\text{or}\quad x_1^2 + x_2^2 = 1.$$

Fig. 2.27
(1, 0)}, {(0, 0), (0, 1)} and {(0, 0), (1, 1)}. Moreover,

$$I_2^2 = \{(0, 0)\} \cup \{(0, 0), (1, 0)\} \cup \{(0, 0), (0, 1)\} \cup \{(0, 0), (1, 1)\}.$$

{(0, 1), (1, 1)} is an affine straight line not passing through (0, 0). The other one is {(1, 0), (1, 1)}.

Fig. 2.28
1. Let I3 = {0, 1, 2} and construct the vector space $I_3^3$ over I3. See Fig. 2.29 and try to spot the vectors (2, 1, 1), (1, 2, 1), (1, 1, 2).

Fig. 2.29
(c) How many affine straight lines not passing through (0, 0, 0) are
there?
(d) How many different ordered bases for $I_3^3$ are there?
(e) Is

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 1 \\ 2 & 0 & 1 \\ 1 & 1 & 1 \end{pmatrix}\begin{pmatrix} 1 & 1 & 1 \\ 1 & 0 & 2 \\ 1 & 2 & 1 \end{pmatrix}$$

true? Anything to do with changes of coordinates (or bases)?
(f) Let Si, 1 ≤ i ≤ 13, denote the 13 one-dimensional subspaces of $I_3^3$. Then

$$I_3^3 = \bigcup_{i=1}^{13} S_i.$$
Fig. 2.30
$$\Rightarrow x = a_0 + x_1(a_1 - a_0) + x_2(a_2 - a_0)$$
$$= (1 - x_1 - x_2)a_0 + x_1a_1 + x_2a_2$$
$$= \lambda_0 a_0 + \lambda_1 a_1 + \lambda_2 a_2, \quad \lambda_0 + \lambda_1 + \lambda_2 = 1, \quad (2.6.1)$$

where λ0 = 1 − x1 − x2, λ1 = x1, λ2 = x2. The ordered triple

$$(x)_{B} = (\lambda_0, \lambda_1, \lambda_2), \quad \lambda_0 + \lambda_1 + \lambda_2 = 1 \quad (2.6.2)$$
Fig. 2.31
Note that $\left(\tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3}\right)$ is the barycenter of the base triangle Δa0a1a2.

Let Σ(b0; b1, b2) be another vectorized space of R2 with B′ = {b0, b1, b2} the corresponding affine basis. For any x ∈ R2, denote

$$(x)_{B'} = (\mu_0, \mu_1, \mu_2), \quad \mu_0 + \mu_1 + \mu_2 = 1, \quad\text{or}\quad [x - b_0]_{B'} = (y_1, y_2) = (\mu_1, \mu_2).$$

Then, by (2.4.2), the change of coordinates from Σ(a0; a1, a2) to Σ(b0; b1, b2) is

$$[x - b_0]_{B'} = [a_0 - b_0]_{B'} + [x - a_0]_B A^{B}_{B'}, \quad (2.6.4)$$
where

$$A^{B}_{B'} = \begin{pmatrix} [a_1 - a_0]_{B'} \\ [a_2 - a_0]_{B'} \end{pmatrix} = \begin{pmatrix} \beta_{11} & \beta_{12} \\ \beta_{21} & \beta_{22} \end{pmatrix}$$

is the transition matrix, which is invertible. Suppose $[a_0 - b_0]_{B'} = (p_1, p_2)$, i.e. $a_0 - b_0 = p_1(b_1 - b_0) + p_2(b_2 - b_0)$. Then (2.6.4) can be rewritten as

$$(y_1\;\;y_2) = (p_1\;\;p_2) + (x_1\;\;x_2)\begin{pmatrix} \beta_{11} & \beta_{12} \\ \beta_{21} & \beta_{22} \end{pmatrix} \quad (2.6.5)$$

or

$$(y_1\;\;y_2\;\;1) = (x_1\;\;x_2\;\;1)\begin{pmatrix} \beta_{11} & \beta_{12} & 0 \\ \beta_{21} & \beta_{22} & 0 \\ p_1 & p_2 & 1 \end{pmatrix}. \quad (2.6.5')$$
In particular, if a0 = b0 holds, then (p1, p2) = (0, 0) and (2.6.5) reduces to

$$(y_1\;\;y_2) = (x_1\;\;x_2)\begin{pmatrix} \beta_{11} & \beta_{12} \\ \beta_{21} & \beta_{22} \end{pmatrix}, \quad (2.6.7)$$

or (2.6.4) reduces to $[x - a_0]_{B'} = [x - a_0]_B A^B_{B'}$, which is a linear transformation. In case $b_i - b_0 = a_i - a_0$, i = 1, 2, $A^B_{B'} = I_2$ is the 2 × 2 identity matrix and (2.6.5) reduces to

$$(y_1, y_2) = (p_1, p_2) + (x_1, x_2), \quad (2.6.8)$$

or (2.6.4) reduces to $[x - b_0]_{B'} = [a_0 - b_0]_{B'} + [x - a_0]_B$, which represents a translation. Therefore, a change of coordinates is a composite mapping of a linear transformation followed by a translation.
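The decomposition just described is immediate to state in coordinates; the following sketch (my illustration, with an arbitrary invertible matrix and translation vector) composes the two pieces of (2.6.5) explicitly.

```python
import numpy as np

P = np.array([[2.0, 1.0],
              [0.0, 3.0]])     # hypothetical invertible transition matrix
p = np.array([-1.0, 4.0])      # hypothetical translation part (p1, p2)

linear = lambda x: x @ P                      # cf. (2.6.7)
translate = lambda x: p + x                   # cf. (2.6.8)
change = lambda x: translate(linear(x))       # their composite, cf. (2.6.5)

x = np.array([3.0, -2.0])
assert np.allclose(change(x), p + x @ P)      # (2.6.5) in one stroke
```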
Exercises

<A>

1. Suppose $a_i = (a_{i1}, a_{i2}) \in \mathbb{R}^2$, i = 1, 2, 3.
(a) Prove that {a1, a2, a3} is an affine basis for R2 if and only if the matrix

$$\begin{pmatrix} a_{11} & a_{12} & 1 \\ a_{21} & a_{22} & 1 \\ a_{31} & a_{32} & 1 \end{pmatrix}$$

is invertible.
with

$$\lambda_1 = \frac{1}{\det\begin{pmatrix} a_1 - a_0 \\ a_2 - a_0 \end{pmatrix}}\begin{vmatrix} a_{01} & a_{02} & 1 \\ x_1 & x_2 & 1 \\ a_{21} & a_{22} & 1 \end{vmatrix}, \qquad \lambda_2 = \frac{1}{\det\begin{pmatrix} a_1 - a_0 \\ a_2 - a_0 \end{pmatrix}}\begin{vmatrix} a_{01} & a_{02} & 1 \\ a_{11} & a_{12} & 1 \\ x_1 & x_2 & 1 \end{vmatrix},$$

and, conversely,

$$x - a_0 = (\lambda_1\;\;\lambda_2)\begin{pmatrix} a_1 - a_0 \\ a_2 - a_0 \end{pmatrix}.$$
$$\frac{x_1 - b_{11}}{b_{21} - b_{11}} = \frac{x_2 - b_{12}}{b_{22} - b_{12}}, \quad\text{or}\quad \begin{vmatrix} x_1 & x_2 & 1 \\ b_{11} & b_{12} & 1 \\ b_{21} & b_{22} & 1 \end{vmatrix} = 0.$$
Fig. 2.32
(c) Let $(x)_B = (\lambda_0, \lambda_1, \lambda_2)$, $(b_1)_B = (\alpha_{10}, \alpha_{11}, \alpha_{12})$ and $(b_2)_B = (\alpha_{20}, \alpha_{21}, \alpha_{22})$ be as in (2.6.2). Show that L has equation

$$\frac{\lambda_0 - \alpha_{10}}{\alpha_{20} - \alpha_{10}} = \frac{\lambda_1 - \alpha_{11}}{\alpha_{21} - \alpha_{11}} = \frac{\lambda_2 - \alpha_{12}}{\alpha_{22} - \alpha_{12}}, \quad\text{or}\quad \begin{vmatrix} \lambda_0 & \lambda_1 & \lambda_2 \\ \alpha_{10} & \alpha_{11} & \alpha_{12} \\ \alpha_{20} & \alpha_{21} & \alpha_{22} \end{vmatrix} = 0.$$
<B>
1. Suppose ∆a0a1a2 is a base triangle. Let p and q be points inside and outside of ∆a0a1a2, respectively. Prove that the line segment connecting p and q intersects ∆a0a1a2 at one point.
2. Oriented triangles and signed areas as coordinates
Suppose a1, a2 and a3 are non-collinear points in R2. The ordered triple {a1, a2, a3} is said to determine an oriented triangle, denoted by ∆̄a1a2a3, which is also used to represent its signed area, considered positive if the ordered triple is in the anticlockwise sense and negative otherwise (see Fig. 2.33).
[Fig. 2.33: an anticlockwise triple a1, a2, a3 (positive signed area) and a clockwise one (negative); x is an interior point.]
Let S = ∆̄a1a2a3, S1 = ∆̄xa2a3, S2 = ∆̄xa3a1 and S3 = ∆̄xa1a2. Then S = S1 + S2 + S3, and the triple (S1, S2, S3) is called the area coordinate of the point x with respect to the coordinate or base triangle ∆a1a2a3, with a1, a2 and a3 as base points and S1, S2, S3 as its coordinate components.
(a) Let S1 : S2 : S3 = λ1 : λ2 : λ3 and call (λ1 : λ2 : λ3) a homogeneous area or barycentric coordinate. In case λ1 + λ2 + λ3 = 1, such λ1, λ2 and λ3 are uniquely determined; denote (λ1 : λ2 : λ3) simply by (λ1, λ2, λ3) as usual and call the coordinate normalized. Given an area coordinate (S1, S2, S3) with S = S1 + S2 + S3, then (λS1 : λS2 : λ(S − S1 − S2)) is a barycentric coordinate for any scalar λ ≠ 0. Conversely, given a barycentric coordinate (λ1 : λ2 : λ3) with λ1 + λ2 + λ3 ≠ 0,
(λ1 S/(λ1 + λ2 + λ3), λ2 S/(λ1 + λ2 + λ3), λ3 S/(λ1 + λ2 + λ3))
is the corresponding area coordinate. In case λ1 + λ2 + λ3 = 0, (λ1 : λ2 : λ3) is said to represent an ideal or infinite point in the realm of projective geometry (see Ex. <B> of Sec. 2.8.5 and Sec. 3.8.4).
(b) Owing to the fact that S = S1 + S2 + S3, the third quantity is uniquely determined once two of S1, S2 and S3 have been decided. Therefore, one may use (S1, S2) or (S1/S, S2/S) to represent the point x; this is called the affine coordinate of x with respect to the affine basis {a3; a1, a2} or {a1 − a3, a2 − a3} with a3 as a base point or vertex (see Fig. 2.34(a)). In case the line segments a3a1 and a3a2 have equal length, say one unit, and the angle ∠a1a3a2 = 90°, then {a3; a1, a2} may be treated as a rectangular coordinate system (see Fig. 2.34(b)).
[Fig. 2.34: (a) an affine coordinate system with vertex a3; (b) the rectangular case.]
x = (λ1x1 + λ2x2 + λ3x3)/(λ1 + λ2 + λ3),   y = (λ1y1 + λ2y2 + λ3y3)/(λ1 + λ2 + λ3).
∆̄xab = (1/(1 + k)) ∆̄x1ab + (k/(1 + k)) ∆̄x2ab.
See Fig. 2.35.
[Fig. 2.35: x divides the segment x1x2 in the ratio k, and the signed areas ∆̄xab interpolate accordingly.]
[Fig. 2.36: a point a and the triangle a1a2a3 with the quantities h1, h2, h3 attached to the three vertices.]
(b) Treat the line as the y-axis in a rectangular coordinate system (see Fig. 2.37). Then a point x on the line has its abscissa (via Ex. 2(c))
x = (λ1x1 + λ2x2 + λ3x3)/(λ1 + λ2 + λ3) = 0
with h1 = x1, h2 = x2 and h3 = x3.
[Fig. 2.37: the line taken as the y-axis; h1, h2, h3 are the abscissas of a1, a2, a3.]
Furthermore, deduce the fact that the equation of the line in an affine
coordinate system is
det[x1 x2 1; a1 a2 1; b1 b2 1] = 0,
(b) Show that the line determined by the points (h11 , h12 , h13 ) and
(h21 , h22 , h23 ) has the equation
det[λ1 λ2 λ3; h11 h12 h13; h21 h22 h23] = 0,   or
|h12 h13; h22 h23| λ1 + |h13 h11; h23 h21| λ2 + |h11 h12; h21 h22| λ3 = 0.
(c) Three points (hi1 , hi2 , hi3 ), i = 1, 2, 3 are collinear if and only if
det[h11 h12 h13; h21 h22 h23; h31 h32 h33] = 0.
(d) Two lines are parallel if and only if their equations in barycentric
coordinates are
h1 λ1 + h2 λ2 + h3 λ3 = 0,
(h0 + h1 )λ1 + (h0 + h2 )λ2 + (h0 + h3 )λ3 = 0,
where h0 is a constant.
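The signed-area description of barycentric coordinates in Ex. 2 is easy to test numerically. The sketch below (points chosen by us purely for illustration) computes S1, S2, S3, checks S = S1 + S2 + S3, and recovers x from its normalized coordinates.

```python
import numpy as np

def signed_area(p, q, r):
    # Signed area of the oriented triangle pqr (positive if anticlockwise).
    return 0.5 * ((q[0]-p[0]) * (r[1]-p[1]) - (q[1]-p[1]) * (r[0]-p[0]))

a1, a2, a3 = np.array([0., 0.]), np.array([4., 0.]), np.array([0., 3.])
x = np.array([1., 1.])                  # a point inside the triangle

S  = signed_area(a1, a2, a3)
S1 = signed_area(x, a2, a3)
S2 = signed_area(x, a3, a1)
S3 = signed_area(x, a1, a2)
print(np.isclose(S, S1 + S2 + S3))      # True: S = S1 + S2 + S3
lam = np.array([S1, S2, S3]) / S        # normalized barycentric coordinates
print(lam @ np.array([a1, a2, a3]))     # -> x
```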
6. Given a coordinate triangle ∆a1a2a3, use |a1a2| = |a1 − a2| to denote the length of the side a1a2, etc. See Fig. 2.38.
[Fig. 2.38: the coordinate triangle a1a2a3, with b1 a point on the side a2a3.]
(a) Show that the three sides have the respective equations:
side a1a2: λ3 = 0,  side a2a3: λ1 = 0,  side a3a1: λ2 = 0.
(b) Show that the three medians have the respective equations:
median on a1a2: λ1 − λ2 = 0,  median on a2a3: λ2 − λ3 = 0,  median on a3a1: λ3 − λ1 = 0.
(c) Show that the three altitudes have the respective equations:
altitude on a1a2: |a3a1| cos ∠a1 · λ1 − |a2a3| cos ∠a2 · λ2 = 0,
altitude on a2a3: |a1a2| cos ∠a2 · λ2 − |a3a1| cos ∠a3 · λ3 = 0,
altitude on a3a1: |a2a3| cos ∠a3 · λ3 − |a1a2| cos ∠a1 · λ1 = 0.
(d) From a given point a with barycentric coordinate (α1 : α2 : α3), draw three perpendicular lines ab1, ab2 and ab3 to the three sides a2a3, a3a1 and a1a2, respectively. Show that ab1 has the equation
|a1a2| cos ∠a2 · λ2 − |a3a1| cos ∠a3 · λ3
 − [(|a1a2| cos ∠a2 · α2 − |a3a1| cos ∠a3 · α3)/(α1 + α2 + α3)] (λ1 + λ2 + λ3) = 0.
What are the equations for ab2 and ab3?
and scalar α ∈ R,
x3 − x1 = α(x2 − x1)
⇒ T(x3) − T(x1) = α[T(x2) − T(x1)].   (2.7.1)
[Fig. 2.39: T carries x1, x2, x3 to T(x1), T(x2), T(x3), and x2 − x1, x3 − x1 to T(x2) − T(x1), T(x3) − T(x1).]
f(x − x0) = T(x) − T(x0),  x ∈ R2.
This f is one-to-one, onto, and f(0) = 0 holds. Also, for any scalar α ∈ R, we have by definition (2.7.1)
f(α(x − x0)) = α(T(x) − T(x0)) = αf(x − x0),  x ∈ R2.   (∗1)
x2 − x1 = 2(x3 − x1) ⇒ T(x2) − T(x1) = 2(T(x3) − T(x1));
(x1 + x2 − x0) − x0 = 2(x3 − x0) ⇒ T(x1 + x2 − x0) − T(x0) = 2(T(x3) − T(x0)).
[Fig. 2.40: x1 + x2 − x0 = x0 + (x1 − x0) + (x2 − x0), the fourth vertex of the parallelogram on x1 − x0 and x2 − x0, with x3 its center.]
[Fig. 2.41: correspondingly, f(x1 − x0) + f(x2 − x0) is the fourth vertex of the image parallelogram.]
Then
f((x1 − x0) + (x2 − x0)) = f((x1 + x2 − x0) − x0)
  = T(x1 + x2 − x0) − T(x0) = 2(T(x3) − T(x0))
  = 2[(T(x3) − T(x1)) + (T(x1) − T(x0))]
  = T(x2) − T(x1) + T(x1) − T(x0) + (T(x1) − T(x0))
  = (T(x2) − T(x0)) + (T(x1) − T(x0))
  = f(x1 − x0) + f(x2 − x0).   (∗2)
T(x) = T(x0) + f(x − x0),  x ∈ R2,   (2.7.2)
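A small numeric sanity check of (2.7.2): any affine transformation splits into a linear map plus a translation. The matrix and translation vector below are assumptions chosen only for illustration.

```python
import numpy as np

# An assumed affine map T(x) = x0 + xA (matrix and vector chosen freely).
A = np.array([[2., 1.], [0., 3.]])
x0 = np.array([5., -1.])
T = lambda x: x0 + x @ A
f = lambda x: T(x) - T(np.zeros(2))     # (2.7.2) with base point 0

u, v, c = np.array([1., 2.]), np.array([-3., 4.]), 2.5
print(np.allclose(f(u + v), f(u) + f(v)))        # f is additive
print(np.allclose(f(c * u), c * f(u)))           # f is homogeneous
print(np.allclose(T(u), T(np.zeros(2)) + f(u)))  # T = translation after f
```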
We call the readers' attention, once and for all, to the following concepts and notations:
A vector (or linear) subspace (see Sec. B.1), simply called a subspace.
Hom(V, W ) or L(V, W ): the vector space of linear transformations from
V to W (see Sec. B.7).
Hom(V, V ) or L(V, V ): the vector space of linear operators on V .
Ker(f) or N(f): the kernel {x ∈ V | f(x) = 0} of an f ∈ Hom(V, W) (see Sec. B.7).
Im(f) or R(f): the range {f(x) | x ∈ V} of an f ∈ Hom(V, W) (see Sec. B.7).
Note that Ker(f ) is a subspace of V and Im(f ) a subspace of W . If f is
a linear operator on V , a subspace U of V is called invariant under f or
simply f-invariant if
f (U ) ⊆ U
i.e. for any x ∈ U, f(x) ∈ U always holds. For example,
{ 0 }, V, Ker(f ) and Im(f )
are trivial invariant subspaces of V for any linear operator f .
This section is divided into eight subsections.
Section 2.7.1 formulates what a linear operator looks like in the Cartesian coordinate system N = {e1, e2}, and then Sec. 2.7.2 presents
some basic but important elementary operators with their eigenvalues and
eigenvectors.
Section 2.7.3 discusses various matrix representations of a linear oper-
ator related to different bases for R2 and the relations among them. The
rank of a linear operator or a matrix is an important topic here.
Some theoretical treatment, independent of particular choice of a basis
for R2 , about linear operators will be given in Sec. 2.7.4.
From Sec. 2.7.5 to Sec. 2.7.8, we will investigate various decompositions
of a linear operator or matrix.
Geometric mapping properties of elementary operators or matrices are
discussed in Sec. 2.7.5. Therefore, algebraically, a square matrix can be
expressed as a product of elementary matrices. And hence, geometrically,
its mapping behaviors can be tracked.
Section 2.7.6 deepens the important concepts of eigenvalues and eigenvectors introduced in Sec. 2.7.2. If a linear operator or matrix has two distinct eigenvalues, then it is diagonalizable, with a diagonal matrix as its canonical form.
x = x1 e1 + x2 e2 and
f(x) = x1 f(e1) + x2 f(e2),
f is completely determined by the vectors f(e1) and f(e2). Suppose
f(ei) = (ai1, ai2) = ai1 e1 + ai2 e2,  i = 1, 2.
Then
f(x) = x1(a11 e1 + a12 e2) + x2(a21 e1 + a22 e2) = (a11x1 + a21x2) e1 + (a12x1 + a22x2) e2,
i.e.
f(x) = xA,  x ∈ R2, where A = [a11 a12; a21 a22].   (2.7.6)
Convention
For a given real 2 × 2 matrix A = [aij], when considered as a linear operator A: R2 → R2, it is defined as
x → xA in N = {e1, e2}.   (2.7.7)
⇔ xA = 0
⇔ a11x1 + a21x2 = 0 and a12x1 + a22x2 = 0
⇔ (see Sec. 2.5) x lies on both of the lines x = t(−a21, a11) and x = t(−a22, a12).
In case (a11, a21) = (0, 0) = 0, the equation a11x1 + a21x2 = 0 is satisfied by all vectors x ∈ R2 and should be considered as the equation for the whole space R2. Suppose (−a21, a11) ≠ 0 and (−a22, a12) ≠ 0. According to (2.5.9),
The lines x = t(−a21, a11) and x = t(−a22, a12) coincide.
⇔ The direction vectors (−a21, a11) and (−a22, a12) are linearly dependent.
⇔ −a21/−a22 = a11/a12, or a11/a21 = a12/a22, or
det A = det[a11 a12; a21 a22] = a11a22 − a12a21 = 0.
In this case, the kernel space Ker(A) is the one-dimensional subspace, say a11x1 + a21x2 = 0. On the other hand,
The lines x = t(−a21, a11) and x = t(−a22, a12) intersect only at 0.
⇔ −a21/−a22 ≠ a11/a12, or a11/a21 ≠ a12/a22, or
det A = |a11 a12; a21 a22| = a11a22 − a12a21 ≠ 0.
In this case, Ker(A) = {0} holds.
What are the corresponding range spaces of A?
Suppose Ker(A) is the line a11x1 + a21x2 = 0. Then
y = (y1, y2) ∈ Im(A), the range space of A
⇔ y1 = a11x1 + a21x2 and y2 = a12x1 + a22x2 for some x = (x1, x2) ∈ R2
⇔ (remember that a12/a11 = a22/a21 = λ)
y = (a11x1 + a21x2)(1, λ) for some x = (x1, x2) ∈ R2.
This means that the range space Im(A) is a straight line passing through 0.
If Ker(A) = {0}, then
y = (y1, y2) ∈ Im(A)
⇔ y1 = a11x1 + a21x2 and y2 = a12x1 + a22x2 for some x = (x1, x2) ∈ R2
⇔ (solve the simultaneous equations with x1 and x2 as unknowns)
x1 = (a22y1 − a21y2)/det A = det[y1 y2; a21 a22]/det A,
x2 = (−a12y1 + a11y2)/det A = det[a11 a12; y1 y2]/det A.
This is the Cramer rule (formulas).
Therefore, the range space Im(A) = R2 .
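The Cramer formulas above are straightforward to verify numerically; in the sketch below, the matrix entries and right-hand side are arbitrary choices of ours (with det A ≠ 0), written in the row-vector convention xA = y used throughout.

```python
import numpy as np

A = np.array([[3., 1.], [2., 5.]])      # det A = 13, so Ker(A) = {0}
y = np.array([7., 9.])
det = np.linalg.det(A)

# xA = y reads: a11 x1 + a21 x2 = y1 and a12 x1 + a22 x2 = y2.
x1 = np.linalg.det(np.array([[y[0], y[1]], [A[1, 0], A[1, 1]]])) / det
x2 = np.linalg.det(np.array([[A[0, 0], A[0, 1]], [y[0], y[1]]])) / det
print(np.allclose(np.array([x1, x2]) @ A, y))   # True
```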
We summarize as
The kernel and the range of a linear operator
Let A = [aij ]2×2 be a real matrix, considered as a linear operator on R2
(see (2.7.7)).
(1) Ker(A) and Im(A) are subspaces of R2 and the dimension theorem is
dim Ker(A) + dim Im(A) = dim R2 = 2,
2.7 Linear Transformations (Operators) 89
where dim Ker(A) is called the nullity of A and dim Im(A) the rank
of A (see (2.7.44)), particularly denoted as r(A).
(2) If Ker(A) = R2 , then A = O2×2 is the zero matrix or zero linear
operator and Im(A) = { 0 }. Note the rank r(O) = 0.
(3) If r(A) = 1, then both Ker(A) and Im(A) are one-dimensional; for any x2 ∉ Ker(A), Im(A) = ⟨⟨x2A⟩⟩ and the restriction operator A|⟨⟨x2⟩⟩: x2 → x2A is one-to-one and onto (see Sec. A.2).
(4) The following are equivalent (see Exs. <A> 2 and <B> 4 of Sec. 2.4).
1. Ker(A) = {0}, i.e. xA = 0 if and only if x = 0.
2. A is one-to-one.
3. A is onto, i.e. Im(A) = R2.
4. The rank r(A) = 2.
5. The two row vectors of A are linearly independent. So are the column vectors.
6. det A ≠ 0.
7. A is invertible and hence is a linear isomorphism.
8. A maps every (or a) basis {x1, x2} for R2 onto a basis {x1A, x2A} for R2.
For a given y ∈ R2, the equation xA = y has the unique solution x = yA⁻¹, where
A⁻¹ = (1/det A) [a22 −a12; −a21 a11].   (2.7.8)
(a) length,
(b) angle,
(c) area, and
(d) directions (anticlockwise or clockwise; see Ex. <A> 4 of Sec. 2.8.1 for a formal definition) or orientations.
Readers can easily prove these results. So the proofs are left to the readers.
If any difficulty arises, you just accept these facts or turn to Sec. 2.8.3 for
detailed proofs. They are also true for affine transformations (see (2.7.1)
and (2.7.2)) and are called affine invariants.
[Fig. 2.42: the kernel line Ker(A) and a parallel line x0 + Ker(A), which is mapped to the single point x0A on the range line Im(A); also the lines ⟨⟨x2⟩⟩ and v + ⟨⟨x2⟩⟩.]
Exercises
<A>
Read Secs. B.4, B.6 and B.7 if necessary, and try to do the following
problems.
2.7.2 Examples
This subsection will be concentrated on constructing some basic linear
operators, and we will emphasize their geometric mapping properties.
92 The Two-Dimensional Real Vector Space R2
See Fig. 2.43(a) and (b). In Fig. 2.43(b), we put the two plane coincide
and the arrow signs indicate how A preserves ratios of signed lengths of
segments.
[Fig. 2.43: the kernel line Ker(A) and the lines x0 + ⟨⟨x⟩⟩ parallel to it, (a) on separate planes and (b) on coincident planes.]
2.7 Linear Transformations (Operators) 93
Operators
[0 λ; 0 0],  [0 0; λ 0]  and  [0 0; 0 λ],  where λ ≠ 0,
are all of this type.
1. Let v = (a, b). Then vA = a v and e2A = 0 = 0 · e2.
[Fig. 2.44: A maps v to a v, with kernel line Ker(A), invariant line ⟨⟨v⟩⟩, and the lines x0 + ⟨⟨x⟩⟩ parallel to the kernel.]
Operators
[a 0; b 0],  [0 b; 0 a]  and  [0 0; a b],  where ab ≠ 0,
are all of this type.
[Fig. 2.45: A maps v1 to 0 and v2 to (a + b)v2, with kernel line Ker(A) and invariant line ⟨⟨v2⟩⟩; the lines x0 + ⟨⟨x⟩⟩ are parallel to the kernel.]
or, written out, xA = (x1, x2)A = (λ1x1, λ2x2), which has the properties:
1. e1A = λ1e1 and e2A = λ2e2; λ1 and λ2 are eigenvalues of A with corresponding eigenvectors e1 and e2.
2. In case λ1 = λ2 = λ, then A = λI2 and is called an enlargement with scale λ. Consequently, A keeps every line passing through 0 invariant and, if λ ≠ 1, has only 0 as an invariant point.
3. In case λ1 ≠ λ2, A has two invariant lines (subspaces) ⟨⟨e1⟩⟩ and ⟨⟨e2⟩⟩, and is called a stretching.
4. A stretching keeps the following invariants:
a. length, if λ1 = λ2 = 1;
b. angle, if λ1 = λ2;
c. area, if λ1λ2 = 1;
d. sense (direction), if λ1λ2 > 0.
[Fig. 2.46]
This means that the point (x1 , x2 ) reflects along the line x1 = x2 into the
point (x2 , x1 ), and then reflects along x2 -axis into the point (−x2 , x1 ), the
image point of (x1 , x2 ) under f . See Fig. 2.47.
[Fig. 2.47: (x1, x2) reflects across the line x1 = x2 to (x2, x1), then across the x2-axis to (−x2, x1).]
[Fig. 2.48: x and xA resolved along the invariant lines ⟨⟨v1⟩⟩ and ⟨⟨v2⟩⟩, with components scaled by the two eigenvalues (here of opposite signs).]
[Fig. 2.49: x mapped to xA via the factors [0 a; −b 0] and [1 0; 0 −1].]
a
a 0 0
=b b c ,
b c 1 b
e1A = λe1
[Fig. 2.50: the image xA; v1 = e1 spans the only invariant line of A.]
and ⟨⟨e1⟩⟩ is the only invariant subspace of A. Also,
A = [λ 0; 1 λ] = λI2 + [0 0; 1 0] = λ [1 0; 1/λ 1]
shows that A has the mapping behavior
x = (x1, x2) → (λx1, λx2)   (enlargement λI2)
             → xA = (λx1 + x2, λx2)   (translation along (x2, 0)).
See Fig. 2.51.
[Fig. 2.51: x enlarged to λx and then translated along the x1-direction to xA.]
1. S keeps every point on the x1-axis fixed; the x1-axis is the only invariant subspace of S.
2. S moves every point (x1, x2) with x2 ≠ 0, along the line parallel to the x1-axis, through a distance in constant proportion a to its distance from the x1-axis, to the point (x1 + ax2, x2). Thus each line parallel to the x1-axis is an invariant line.
[Fig. 2.52: the shearing S fixes the x1-axis and slides each horizontal line along itself.]
f(x) = f(x1, x2) = (bx2, x1 + ax2).
Let x = (x1, x2) ≠ 0. Then,
[Fig. 2.53: the case a² + 4b > 0 with λ1λ2 < 0: x and xA resolved along the invariant lines ⟨⟨v1⟩⟩ and ⟨⟨v2⟩⟩.]
[Fig. 2.54: the case a² + 4b = 0: the single invariant line ⟨⟨v⟩⟩, carrying the eigenvalue of multiplicity 2.]
xA = (bx2 , x1 + ax2 )
[Fig. 2.55: xA = (bx2, x1 + ax2), obtained from (x1, x2) via (x2, x1) and (bx2, x1), then translated by (0, ax2).]
(1) a² + 4b > 0. Let λ1 = (a + √(a² + 4b))/2 and λ2 = (a − √(a² + 4b))/2, and vi = (b, λi) for i = 1, 2.
1. viA = λivi, i = 1, 2. Thus λ1 and λ2 are eigenvalues of A with corresponding eigenvectors v1 and v2.
2. ⟨⟨v1⟩⟩ and ⟨⟨v2⟩⟩ are invariant lines (subspaces) of R2 under A. (See Exs. <B> 4, 5 of Sec. 2.4 and Sec. 2.7.3.) See Fig. 2.53.
(2) a² + 4b = 0. Let λ = a/2, of multiplicity 2, and v = (−a, 2).
1. vA = λv. λ is an eigenvalue of multiplicity 2 of A with corresponding eigenvector v.
2. ⟨⟨v⟩⟩ is an invariant line (subspace) of R2 under A.
Is there any operator more basic than those mentioned in Examples 1 through 7? No! It will eventually turn out that these operators are sufficient to describe any operator on R2 in various ways, both algebraically and geometrically. Please refer to Secs. 2.7.5 to 2.7.8.
Before we are able to do so, we have to realize, as we learned from these examples, that the natural Cartesian coordinate system N = {e1, e2} is not always the best way to describe all possible linear operators, whose definitions are independent of any particular choice of basis for R2. A suitable choice of a basis B = {x1, x2} for R2, according to the features of a given operator, is often better. If there exist a scalar λ ∈ R and a nonzero vector x such that
xA = λx,
then λ is called an eigenvalue (or characteristic root) of A and x an associated or corresponding eigenvector (or characteristic vector). (2.7.11)
In case λ ≠ 0 and x ≠ 0 is an associated eigenvector, this geometrically means that the line ⟨⟨x⟩⟩ is an invariant line (subspace) of R2 under A; if λ = 1, A keeps each vector along ⟨⟨x⟩⟩ (or point on ⟨⟨x⟩⟩) fixed, and ⟨⟨x⟩⟩ is called a line of invariant points. If λ = −1, A reverses each such vector.
How do we determine whether A has eigenvalues, and how do we find them if they do exist? Suppose there does exist a nonzero vector x such that
xA = λx for some scalar λ ∈ R
⇔ x(A − λI2) = 0
⇔ (by (3) in (2.7.8))
det(A − λI2) = det[a11 − λ, a12; a21, a22 − λ] = λ² − (a11 + a22)λ + (a11a22 − a12a21) = 0.
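Numerically, the eigenvalues are the roots of t² − (tr A)t + det A, and a kernel vector of A − λI2 gives an eigenvector. The following sketch (matrix entries assumed) does this in the row-vector convention xA = λx.

```python
import numpy as np

A = np.array([[6., 2.], [2., 9.]])             # entries assumed
tr, det = np.trace(A), np.linalg.det(A)
eigvals = np.roots([1.0, -tr, det])            # roots of t^2 - (tr A)t + det A
for lam in eigvals:
    M = (A - lam * np.eye(2)).T                # x(A - lam I2) = 0 <=> M x^T = 0
    x = np.linalg.svd(M)[2][-1]                # a kernel vector of M
    print(lam, np.allclose(x @ A, lam * x))    # True: x is an eigenvector
```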
The simplest and most ideal case is that A has two distinct real eigenvalues λ1 and λ2 with respective eigenvectors x1 and x2. In this case, x1 and x2 are linearly independent. To see this, suppose there exist constants α1 and α2 such that
α1x1 + α2x2 = 0
⇒ (apply A to both sides) α1λ1x1 + α2λ2x2 = 0
⇒ (eliminating x2 from the above two equations) α1(λ1 − λ2)x1 = 0
⇒ (since λ1 ≠ λ2 and x1 ≠ 0) α1 = 0, and hence α2 = 0.   (2.7.14)
This matrix identity holds for any 2 × 2 real matrix A, even if A has coincident eigenvalues or does not have real eigenvalues. Since a direct computation shows that
A² = [a11² + a12a21, a12(a11 + a22); a21(a11 + a22), a12a21 + a22²]
   = (a11 + a22) [a11 a12; a21 a22] − (a11a22 − a12a21) [1 0; 0 1]
   = (tr A)A − (det A)I2,   (2.7.18)
the result (2.7.17) follows. For other proofs, see Exs. <A> 7 and <B> 5.
and hence xA² ∈ ⟨⟨x, xA⟩⟩ for any x ∈ R2.
(2) shows that
A is invertible ⇔ det A ≠ 0, with
A⁻¹ = (1/det A) ((tr A)I2 − A).   (2.7.19)
Notice that (2.7.19) still holds for any n × n matrix A over a field F
(see (2.7.13)). It is the geometric equivalence of the Cayley–Hamilton for-
mula that enables us to choose suitable basis B for R2 so that, in the eyes
of B, A becomes [A]B as in (2.7.10). For details, see Secs. 2.7.6–2.7.8.
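Both (2.7.18) and (2.7.19) are easy to confirm numerically; the following sketch does so for a randomly generated 2 × 2 matrix (the random choice is ours).

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 2))
tr, det = np.trace(A), np.linalg.det(A)
print(np.allclose(A @ A - tr * A + det * np.eye(2), 0))          # (2.7.18)
print(np.allclose(np.linalg.inv(A), (tr * np.eye(2) - A) / det)) # (2.7.19)
```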
Exercises
<A>
How?
3. A matrix A = [aij ]2×2 is called involutory if
A2 = I2
(refer to Ex. 9 of Sec. B.4 and Ex. 8 of Sec. B.7).
(a) By purely algebraic method, show that
(1) A2 = I2
⇔ (2) A = ±I2, or a11 + a22 = 0 and a11² = 1 − a12a21.
Give some numerical examples for such A.
(b) By geometric consideration, show that
(1) A2 = I2
⇔ (2) (A − I2 )(A + I2 ) = O
⇔ (3) Im(A − I2 ) ⊆ Ker(A + I2 ) and Im(A + I2 ) ⊆ Ker(A − I2 ).
Then,
(1) If r(A − I2) = 0, then A = I2.
(2) If r(A − I2) = 1, then there exists a basis {x1, x2} for R2 so that
A = [x1; x2]⁻¹ [1 0; 0 −1] [x1; x2].
(3) If r(A − I2) = 2, then A = −I2.
(c) Show that
R2 = Im(A − I2 ) ⊕ Im(A + I2 ),
Im(A + I2 ) = Ker(A − I2 ), Im(A − I2 ) = Ker(A + I2 ).
Then, try to find all such A.
(d) Try to use A2 − (tr A)A + (det A)I2 = O to determine A as far as
possible.
4. Let A = [aij ]2×2 be a real matrix such that
A2 = −I2
(refer to Ex. 9 of Sec. B.7).
(a) Show that there does not exist a nonzero vector x0 such that x0A = λx0 for any λ ∈ R. That is, A does not have any invariant line.
Show that yA = λ̄y, where y = x1 − ix2 is the conjugate vector of x.
x.
(b) Show that
x1A = λ1x1 − λ2x2,  x2A = λ2x1 + λ1x2, and
[A]B = PAP⁻¹ = [λ1 −λ2; λ2 λ1], where P = [x1; x2].
all such linear transformations form a vector space over F (see Sec. B.7). As a consequence, the set Hom(V, V) of operators on V forms an associative algebra with identity, and the set GL(V, V) of invertible linear operators becomes a group under the composition operation. If dim V = m < ∞ and dim W = n < ∞, Hom(V, W) can be realized (under isomorphism) as M(m, n; F), the set of all m × n matrices over F, and GL(V, V) as GL(n; F), the set of all invertible matrices in M(n; F) = M(n, n; F). See Secs. B.4 and B.7.
f(ai) = Σ_{j=1}^{2} aij bj,  i = 1, 2, and
x = Σ_{i=1}^{2} xi ai
⇒ f(x) = Σ_{i=1}^{2} xi f(ai) = Σ_{i=1}^{2} xi (Σ_{j=1}^{2} aij bj) = Σ_{j=1}^{2} (Σ_{i=1}^{2} xi aij) bj,
where
[f]B_C = [ [f(a1)]C; [f(a2)]C ] = [a11 a12; a21 a22],   (2.7.22)
and
([f]B_C)⁻¹ = [f⁻¹]C_B.
Conversely, given a real matrix A = [aij]2×2, there exists a unique linear transformation f such that [f]B_C = A. In fact, define f: R2 → R2 by
f(ai) = Σ_{j=1}^{2} aij bj,  i = 1, 2,
and then, for x = Σ_{i=1}^{2} xi ai, linearly by
f(x) = Σ_{i=1}^{2} xi f(ai).
(a) [f(x)]C = [x]B [f]B_C,  x ∈ R2.
(b) Linearity:
[f + g]B_C = [f]B_C + [g]B_C,
[αf]B_C = α[f]B_C,  α ∈ R.
(c) Composite transformation:
[g ◦ f ]B B C
D = [f ]C [g]D .
(e) Change of coordinates: Suppose B′ and C′ are bases for R2. Then (refer to (2.4.2))
[f]B′_C′ = A^B′_B [f]B_C A^C_C′,
where A^B′_B is the transition matrix from the basis B′ to the basis B, and similarly for A^C_C′. So the following diagram is commutative:
  R2 (B) --[f]B_C--> R2 (C)
  A^B′_B ↑                ↓ A^C_C′ = (A^C′_C)⁻¹      (2.7.23)
  R2 (B′) --[f]B′_C′--> R2 (C′)
In case C = B, we write
[f]B = [f]B_B   (2.7.24)
in short. f is called diagonalizable if [f]B is a diagonal matrix for some basis B for R2. Then it follows from (e) that, if C = B and C′ = B′,
[f]B′ = A^B′_B [f]B (A^B′_B)⁻¹   (2.7.25)
holds. Then [f]B′ is said to be similar to [f]B. Similarity among matrices
is an important concept in the theory of linear algebra. It enables us to
investigate the geometric behaviors of linear or affine transformations from
different choices of bases. For concrete examples, see Sec. 2.7.2.
Actually, (b) and (d) in (2.7.23) tell us implicitly more information, but
in an abstract setting.
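A short numeric illustration of (2.7.25): conjugating [f]B by the transition matrix represents the same operator in another basis, so in particular the eigenvalues are unchanged. The operator and basis below are assumptions chosen for illustration.

```python
import numpy as np

A = np.array([[5., 1.], [-7., -3.]])    # [f]_N, entries assumed
P = np.array([[1., 1.], [1., -1.]])     # rows: an assumed basis B' (in N-coordinates)
fB = P @ A @ np.linalg.inv(P)           # [f]_{B'} = P A P^{-1}, similar to A

x = np.array([2., 3.])
xB = x @ np.linalg.inv(P)               # [x]_{B'} = x P^{-1}
print(np.allclose((x @ A) @ np.linalg.inv(P), xB @ fB))  # [f(x)]_{B'} = [x]_{B'}[f]_{B'}
print(np.allclose(np.sort(np.linalg.eigvals(fB)),
                  np.sort(np.linalg.eigvals(A))))        # same eigenvalues
```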
Let the set of linear operators on R2 be denoted by
Hom(R2, R2) or L(R2, R2) = {f | f: R2 → R2 is a linear operator},   (2.7.26)
which is a vector space over R. As usual, let N = {e1, e2} be the natural basis for R2.
= Σ_{k=1}^{2} Σ_{j=1}^{2} akj xk fkj(ek) = Σ_{k=1}^{2} Σ_{j=1}^{2} akj fkj(xk ek) = Σ_{k=1}^{2} Σ_{j=1}^{2} akj fkj(x)
⇒ f = Σ_{k=1}^{2} Σ_{j=1}^{2} akj fkj.   (2.7.28)
Σ_{k=1}^{2} Σ_{j=1}^{2} akj fkj = 0 (the zero operator) ⇔ akj = 0, 1 ≤ k, j ≤ 2.
Thus {f11, f12, f21, f22} forms a basis for Hom(R2, R2), and therefore Hom(R2, R2) is a 4-dimensional real vector space (refer to Sec. B.3).
Corresponding to (2.7.26), let the set of real 2 × 2 matrices be denoted by M(2; R), which is also a real vector space with matrix addition and scalar multiplication: for B = [bij]2×2 and α ∈ R,
A + B = [aij + bij];  αA = [αaij].
Let
E11 = [1 0; 0 0],  E12 = [0 1; 0 0],  E21 = [0 0; 1 0],  E22 = [0 0; 0 1].   (2.7.30)
[fij ]N = Eij , 1 ≤ i, j ≤ 2.
A = Σ_{i=1}^{2} Σ_{j=1}^{2} aij Eij   (2.7.31)
Φ(f ) = [f ]N .
GL(R2 , R2 ) (2.7.33)
1. h ◦ (g ◦ f ) = (h ◦ g) ◦ f .
2. 1R2 ◦ f = f ◦ 1R2 , where 1R2 is the identity operator on R2 .
120 The Two-Dimensional Real Vector Space R2
Hence, both are called the real general linear group on R2 . (2.7.35)
det f = det[f ]B ,
where B is any fixed basis for R2 (for geometric meaning, see (2.8.44)).
(2.7.38)
Owing to det(PAP⁻¹ − tI2) = det(P(A − tI2)P⁻¹) = det(A − tI2), it comes immediately that similar matrices have the same characteristic polynomial, and hence PAP⁻¹ and A have the same set of eigenvalues. Hence the characteristic polynomial of a linear operator f on R2 is det([f]B − tI2), independent of the choice of the basis B.
Notice that
xA = λx for x ∈ R2
⇔ (xP⁻¹)(PAP⁻¹) = λ(xP⁻¹).   (2.7.40)
This means that if x is an eigenvector of A (or of an operator) associated with the eigenvalue λ in the natural coordinate system N, then xP⁻¹ is the corresponding eigenvector of PAP⁻¹ associated with the same eigenvalue λ in any fixed basis B for R2, where P is the transition matrix from B to N and xP⁻¹ = [x]B. See the diagram:
  R2 (N) --A--> R2 (N)
  P ↑                ↓ P⁻¹
  R2 (B) --PAP⁻¹--> R2 (B)
For more properties concerned with trace, refer to Exs. 25–30 of Sec. B.4.
The invariance of trace can be proved indirectly from the invariance of
characteristic polynomial as (2.7.12) shows.
Finally, how about the rank of an operator or square matrix? The rank we defined for a linear operator or matrix in Sec. 2.7.1 is, strictly speaking, not yet well-defined, since we do not know whether this nonnegative integer stays unchanged under changes of bases.
nonzero matrices.
In case r(A) = 1, i.e. dim Im(A) = 1, two separate cases are considered as follows.
Case 1 Let r(B) = 1. Then Im(AB) = {0} or Im(AB) = ⟨⟨v⟩⟩ with v ≠ 0, according as Im(A) = Ker(B) or not. Hence
r(AB) = 0 ⇔ Im(A) = Ker(B);
r(AB) = r(A) = r(B) = 1 ⇔ Im(A) ∩ Ker(B) = {0}.
Case 2 Let r(B) = 2, then
r(AB) = r(A) = 1 < 2 = r(B).
In case r(A) = 2, still two separate cases are considered as follows.
Case 1 If r(B) = 1, then r(AB) = 1 holds.
Case 2 If r(B) = 2, then r(AB) = 2 holds.
Summarize the above as
The lower and upper bounds for the rank of the product matrix
of two matrices
Let A2×2 and B2×2 be two real matrices.
(1) If either A or B is zero, then
r(AB) = 0 ≤ min{r(A), r(B)}.
(2) If both A and B are nonzero,
r(A) + r(B) − 2 ≤ r(AB) ≤ min{r(A), r(B)},
where
r(AB) = r(A) + r(B) − 2 ⇔ Ker(B) ⊆ Im(A),
r(AB) = r(A) ⇔ Im(A) ∩ Ker(B) = { 0 },
r(AB) = r(B) ⇔ R2 = Im(A) + Ker(B). (2.7.43)
These results are still suitable for matrices Ak×m and Bm×n over fields,
with m replacing 2 (see Ex. 3 of Sec. B.7). Try to reprove (2.7.43) by using
the dimension theorem stated in (2.7.8) and then try to prove these results
for general matrices (see Ex. <C>).
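The bounds in (2.7.43) can be probed empirically; the sketch below samples random 2 × 2 matrices (our choice of experiment, not the book's proof) and checks r(A) + r(B) − 2 ≤ r(AB) ≤ min{r(A), r(B)}.

```python
import numpy as np

rank = np.linalg.matrix_rank
rng = np.random.default_rng(1)
for _ in range(1000):
    A = rng.integers(-2, 3, (2, 2)).astype(float)
    B = rng.integers(-2, 3, (2, 2)).astype(float)
    if rank(A) and rank(B):             # both nonzero
        assert rank(A) + rank(B) - 2 <= rank(A @ B) <= min(rank(A), rank(B))
print("bounds of (2.7.43) hold on all samples")
```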
As an easy consequence of (2.7.43), we have
r(PAP⁻¹) = r(A),
r(f) = r([f]B),
These results are still usable for arbitrary matrices over fields. Readers are
urged to prove (2.7.44) directly without recourse to (2.7.43).
For the sake of reference and generalization, we introduce four closely
related subspaces associated with a given matrix.
Let A = [aij]2×2 be a real matrix. Interchanging the rows and the columns of A yields the transposed matrix, denoted as
A∗.   (2.7.45)
Im(A) or R(A) = {xA | x ∈ R2} = the subspace of R2 generated by the row vectors A1∗ = (a11, a12) and A2∗ = (a21, a22) of A, and
Ker(A) or N(A) = {x ∈ R2 | xA = 0}   (2.7.46)
are called respectively the row space and the left kernel space of A, while
Im(A∗) or R(A∗) = {xA∗ | x ∈ R2} = the subspace of R2 generated by the column vectors A∗1 = (a11, a21)∗ and A∗2 = (a12, a22)∗ of A, and
Ker(A∗) or N(A∗) = {x ∈ R2 | xA∗ = 0}   (2.7.47)
are called respectively the column space and the right kernel space of A.
What we have said here from (2.7.46) to (2.7.49) are still true for arbitrary
m × n matrices A over a field F (see Ex. 15 of Sec. B.4, Secs. B.5 and B.6,
Ex. 2 of Sec. B.7 and Sec. B.8): 0 ≤ r(A) ≤ min(m, n).
Exercises
<A>
1. Let the linear operator f on R2 be defined as
f(x) = x [−6 5; 1 3]
in N = {e1, e2}. Let B = {(−1, 1), (1, 1)} and C = {(2, −3), (4, −2)}.
(A^C_B)⁻¹ = A^B_C
2. Let
f(x) = x [2 3; −4 −6]
and B = {(1, −1), (−2, 1)} and C = {(−1, −2), (3, −4)}. Do the same questions as in Ex. 1.
3. Find a linear operator f on R2 and a basis B for R2 such that
[f(x)]B = [x]N [−1 1; 1 1].
Does there exist a linear operator f on R2 and two bases B and C for R2 such that [f]B = I2 and [f]C = A? Note that both I2 and A have the same characteristic polynomial (t − 1)², and hence the same set of eigenvalues.
6. Let
A = [1 1; −1 1]  and  B = [1 2; −1 0].
(a) Give as many different reasons as possible to justify that there does
not exist a linear operator f on R2 and two bases B and C for R2
such that [f ]B = A and [f ]C = B.
(b) Find some invertible matrices P2×2 and Q2×2 such that P AQ = B.
7. For any linear operator f on R2, there exist a basis B = {x1, x2} and another basis C = {y1, y2} for R2 such that
[x1; x2] [f]N [y1; y2]⁻¹ = [f]B_C = [0 0; 0 0]  or  [1 0; 0 0]  or  [1 0; 0 1].
8. Find nonzero matrices A2×2 and B2×2 such that AB has each possible
rank.
9. Suppose A2×2 and B2×2 are matrices (or n × n matrices). Then
(AB)−1 = B −1 A−1 .
S⊥ = {x ∈ R2 | x ⊥ y for each y ∈ S}.
<B>
2. Let
I2 = [1 0; 0 1],  J = [0 1; 1 0],  E = [1 1; 0 0],
F = [0 0; 1 1],  G = [1 0; 1 0],  H = [0 1; 0 1].
Adopt the Cartesian coordinate system N = {e1, e2} in R2.
(a) Explain geometrically and algebraically that it is impossible for I2
and J to be similar. Instead of this, is it possible that
I2 = PJP ∗
for invertible P ? How about I2 = PJQ for invertible P and Q?
(b) Show that E, F, G and H are never similar to I2 .
(c) Figure 2.56(a) and (b) explain graphically the mapping properties
of E and F , respectively. Both are projections (see Sec. 2.7.4).
[Fig. 2.56: the projections E (onto the invariant line x1 = x2, panel (a)) and F (panel (b)), with their kernel lines Ker(E) and Ker(F).]
Do you have a strong feeling that they should be similar? Yes, they
are! Could you see directly from Fig. 2.56 that
JFJ = JFJ −1 = E?
Try to find invertible matrices P and Q such that
PEP⁻¹ = QFQ⁻¹ = [1 0; 0 0].
Is it necessary that J = P −1 Q?
(d) Look at Fig. 2.57(a) and (b). Explain geometrically and algebraically
that G and H are similar. Also,
JHJ = JHJ −1 = G.
[Fig. 2.57: the mapping properties of G (a) and H (b): each projects onto an invariant line, with kernel line x1 + x2 = 0.]
(b) Let
A = [0 0; 1 1],  B = [0 −1; 0 1].
Show that AB and BA are not similar.
(c) If either A or B is invertible, then AB and BA are similar.
Then watch the steps (and better still, think geometrically at the same time): Can you follow them and give precise reasons for each step? Or do you get lost? Where? How? If the answer is positive, try to do the following problems and you will have a good chance of success.
Then Im(B) = Ker(A). Show that there exists Dq×p such that C = DB ,
and such a D is unique if and only if r(B) = p.
14. Try to do Ex. 10 of Sec. B.7.
15. Try to do Ex. 11 of Sec. B.7.
16. Try to do Ex. 12 of Sec. B.7.
of R2 onto V1 along V2 if
f(x) = x1, where x = x1 + x2 with x1 ∈ V1 and x2 ∈ V2.   (2.7.50)
Of course, the zero linear operator is the projection of R2 onto { 0 }
along the whole space R2 . The identity operator 1R2 is the projection of R2
onto itself along { 0 }. Now, we have
[Fig. 2.58: the projection f of R2 onto V1 along V2: x maps to f(x) ∈ V1.]
g(yi) = xi,  i = 1, 2.
Therefore
(g ◦ f)(x1) = g(f(x1)) = g(y1) = x1
⇒ (g ◦ f)²(x1) = (g ◦ f)(x1) = g(y1) = x1;
and
(g ◦ f)(x2) = g(f(x2)) = g(0) = 0
⇒ (g ◦ f)²(x2) = (g ◦ f)(x2) = 0.
Hence (g ◦ f)²(x) = (g ◦ f)(x) for all x ∈ R2.
We summarize as
The projectionalization of an operator
Let f be a linear operator on R2 . Then there exists an invertible linear
operator g on R2 such that
(g ◦ f )2 = g ◦ f,
The above result still holds for linear operators on finite-dimensional vector
space or any n × n matrix over a field.
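Here is a concrete instance (with an assumed rank-1 matrix) of the projectionalization result: choosing an invertible g that sends f(x1) back to x1 and maps a complementary vector into Ker(f) makes g ◦ f idempotent.

```python
import numpy as np

A = np.array([[1., 2.], [2., 4.]])      # rank 1: Im(A) = <<(1,2)>>, Ker(A) = <<(2,-1)>>
# Choose g with g((1,2)) = (1,0) and g((0,1)) = (2,-1); in matrix form,
# G solves [1 2; 0 1] G = [1 0; 2 -1].
G = np.linalg.solve(np.array([[1., 2.], [0., 1.]]),
                    np.array([[1., 0.], [2., -1.]]))
P = A @ G                               # (g o f)(x) = x(AG)
print(np.linalg.det(G) != 0)            # g is invertible
print(np.allclose(P @ P, P))            # (g o f)^2 = g o f
```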
[Fig. 2.59: f maps x1, x2 to y1 = f(x1) and 0; the isomorphism g sends y1 back to x1, so that g ◦ f is a projection.]
[Fig. 2.60: the commutative diagram [f]B_C = PAQ: the operator f in the bases B, C corresponds to [f]N = A in N via the isomorphisms g and h, with matrices [1R2]B_N = P and [1R2]N_C = Q.]
[f]N = [1R2]N_B [f]B_C [1R2]C_N
⇒ [f]B_C = [1R2]B_N [f]N [1R2]N_C = [x1; x2] [f]N [y1; y2]⁻¹ = [1 0; 0 0].   (2.7.54)
See Fig. 2.60.
We summarize as
The rank theorem of a linear operator on R2
Let f be a linear operator on R2 .
(1) There exist linear isomorphisms g and h on R2 such that, for any x = (x1, x2) ∈ R2,
h ◦ f ◦ g(x1, x2) = (0, 0) if r(f) = 0;  (x1, 0) if r(f) = 1;  (x1, x2) if r(f) = 2.
See the diagram beside Fig. 2.60.
(2) Let A2×2 be a real matrix. Then there exist invertible matrices P2×2 and Q2×2 such that
PAQ = O2×2 if r(A) = 0;  [1 0; 0 0] if r(A) = 1;  [1 0; 0 1] = I2 if r(A) = 2.
The matrices on the right side are called the normal form of A according to its respective rank (see Fig. 2.60).   (2.7.55)
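A worked instance of (2.7.55)(2) for an assumed rank-1 matrix: one row operation and one column operation already produce the normal form.

```python
import numpy as np

A = np.array([[1., 2.], [2., 4.]])      # r(A) = 1
P = np.array([[1., 0.], [-2., 1.]])     # row operation: (2) + (-2) x (1)
Q = np.array([[1., -2.], [0., 1.]])     # column operation: (2) + (-2) x (1)
print(P @ A @ Q)                        # -> [[1, 0], [0, 0]], the normal form
```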
turn, is equal to f n ( x1 ) for any integer n ≥ 3. This means r(f n ) = 1
for n ≥ 1 in this case.
As a conclusion, we summarize as
The ranks of iterative linear operators
Let f be a linear operator on R2 .
(1) Then
r(f 2 ) = r(f 3 ) = r(f 4 ) = · · · .
In fact,
1. If r(f ) = 0, then r(f n ) = 0 for n ≥ 1.
2. If r(f ) = 1 and Ker(f ) = Im(f ), then r(f n ) = 0 for n ≥ 2, while if
r(f ) = 1 and Ker(f ) ∩ Im(f ) = { 0 }, then r(f n ) = 1 for n ≥ 1.
3. If r(f) = 2, then r(fⁿ) = 2 for n ≥ 1.
(2) For any real 2 × 2 matrix A,
r(A2 ) = r(A3 ) = r(A4 ) = · · · . (2.7.57)
The aforementioned method can be modified a little bit to prove the general
result: for any n × n matrix A over a field,
r(An ) = r(An+1 ) = · · · . (2.7.58)
always holds.
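The stabilization (2.7.57) can also be observed on random samples; the experiment below is our illustration, not a proof.

```python
import numpy as np

rng = np.random.default_rng(2)
for _ in range(500):
    A = rng.integers(-1, 2, (2, 2)).astype(float)
    r = [np.linalg.matrix_rank(np.linalg.matrix_power(A, k)) for k in (1, 2, 3, 4)]
    assert r[1] == r[2] == r[3]         # r(A^2) = r(A^3) = r(A^4)
print("the rank stabilizes from the square on")
```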
Exercises
<A>
1. Notice that R2 = ⟨⟨e1 + e2⟩⟩ ⊕ ⟨⟨e1⟩⟩ = ⟨⟨e1 + e2⟩⟩ ⊕ ⟨⟨e2⟩⟩. Suppose f and g are, respectively, the projections of R2 onto ⟨⟨e1 + e2⟩⟩ along ⟨⟨e2⟩⟩ and along ⟨⟨e1⟩⟩. Show that f ◦ g = g and g ◦ f = f.
2. Suppose f and g are idempotent linear operators, i.e. f 2 = f and g 2 = g
hold.
(a) Show that Im(f ) = Im(g) ⇔ g ◦ f = f and f ◦ g = g.
(b) Show that Ker(f ) = Ker(g) ⇔ g ◦ f = g and f ◦ g = f .
(Note These results still hold for idempotent linear operators or pro-
jections on a finite-dimensional vector space.)
3. Let A2×2 be
[−3 4; 1 −2]  or  [−3 4; −9 12].
Try to find a respective invertible matrix P2×2 such that (AP)² = AP. How many such P are possible?
g = h ◦ f ◦ h−1
<B>
f (
x ) = λ
x and g(
x ) = µ
x
fi ◦ fj = δij fi , 1 ≤ i, j ≤ 2.
(a) Show that either each r(fi ) = 0 for 1 ≤ i ≤ 2 or each r(fi ) = 1 for
1 ≤ i ≤ 2.
(b) Suppose gi ∈ Hom(R2 , R2 ) for 1 ≤ i ≤ 2 satisfy gi ◦ gj = δij gi ,
1 ≤ i, j ≤ 2.
Show that, there exists a linear isomorphism ϕ on R2 such that
gi = ϕ ◦ fi ◦ ϕ−1 , 1 ≤ i ≤ 2.
[f, g] = f ◦ g − g ◦ f.
(Note (a) and (c) are still true for any vector space V with dim V = 2,
while (b) is true only for real vector space.)
<C>
4. Do (2.7.58).
5. Prove the statements inside the so many Notes in Exs. <A>
and <B>.
6. Try your best to do as many problems as possible from Ex. 13 through
Ex. 24 in Sec. B.7.
7. For any A, B ∈ M(n; C) such that B is invertible, show that there
exists a scalar λ ∈ C such that A + λB is not invertible. How many
such different λ could be chosen?
8. Show that T: M(n; F) → M(n; F) is a linear operator if and only if there exist matrices Q1, R1, . . . , Qk, Rk ∈ M(n; F) such that
T(X) = Σ_{j=1}^{k} Qj X Rj,  X ∈ M(n; F).
T (f ) = tr (f ◦ f0 ), f ∈ Hom(V, V ).
T (f ) = λ tr (f ), f ∈ Hom(V, V ).
tr T (f ) = tr f, f ∈ Hom(V, V ).
10. For P ∈ M(m; R) and Q ∈ M(n; R), define σ(P, Q): M(m, n; R) →
M(m, n; R) as
P ⊗Q
<D> Applications
(2.7.55) or (2.7.56) can be localized to obtain the rank theorem for continuously differentiable mappings from open sets in Rm to Rn. We will mention this, along with other local versions of results from linear algebra, in Chaps. 4 and 5.
[5 −3; 1 4] [x1; x2] = [6; 2]
side by side for comparison when solving the equations as follows. (1) ↔ (2) means type 1, (2) + (−5) × (1) means type 3, and −(1/23)(2) means type 2, etc.
5x1 − 3x2 = 6  (1)            [5 −3; 1 4] [x1; x2] = [6; 2]
x1 + 4x2 = 2   (2)
  ↓ (1) ↔ (2)                   ↓ E(1)(2)
x1 + 4x2 = 2   (1)            [1 4; 5 −3] [x1; x2] = [2; 6]
5x1 − 3x2 = 6  (2)
  ↓ (2) + (−5) × (1)            ↓ E(2)−5(1)
x1 + 4x2 = 2   (1)            [1 4; 0 −23] [x1; x2] = [2; −4]
−23x2 = −4     (2)
  ↓ −(1/23)(2)                  ↓ E−(1/23)(2)
x1 + 4x2 = 2   (1)            [1 4; 0 1] [x1; x2] = [2; 4/23]
x2 = 4/23      (2)
  ↓ (1) + (−4) × (2)            ↓ E(1)−4(2)
x1 = 30/23     (1)            [1 0; 0 1] [x1; x2] = [30/23; 4/23].
x2 = 4/23      (2)
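The same elimination can be written as left multiplication by the four elementary matrices; the sketch below mirrors the computation above step for step.

```python
import numpy as np

A = np.array([[5., -3.], [1., 4.]])
b = np.array([6., 2.])
E_swap  = np.array([[0., 1.], [1., 0.]])       # type 1: (1) <-> (2)
E_add   = np.array([[1., 0.], [-5., 1.]])      # type 3: (2) + (-5) x (1)
E_scale = np.array([[1., 0.], [0., -1/23.]])   # type 2: -(1/23) x (2)
E_back  = np.array([[1., -4.], [0., 1.]])      # type 3: (1) + (-4) x (2)

M = E_back @ E_scale @ E_add @ E_swap          # all four row operations
print(M @ A)                                   # -> I2
print(M @ b)                                   # -> (30/23, 4/23)
```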
Also, refer to Sec. 2.7.2 for geometric mapping properties of these ele-
mentary matrices. For more theoretical information about or obtained by
elementary matrices, please refer to Sec. B.5.
To see the advantages of the introduction of elementary matrices, let us
start from concrete examples.
Example 1 Let
A = [1 2; 4 10].
(1) Solve the equation xA = b, where x = (x1, x2) ∈ R2 and b = (b1, b2) is a constant vector.
(2) Investigate the geometric mapping properties of A.
Solution When written out, xA = b is equivalent to
(x1, x2) [1 2; 4 10] = (b1, b2)  or  {x1 + 4x2 = b1; 2x1 + 10x2 = b2}.
[Fig. 2.61: the unit square mapped step by step by the elementary factors of A: a shearing, a one-way stretch, and another shearing.]
and
A F(2)−2(1) F(1/2)(2) = [1 0; 4 1]
⇒ A = [1 0; 4 1] F(1/2)(2)⁻¹ F(2)−2(1)⁻¹ = [1 0; 4 1] [1 2; 0 2].
Does a factorization like (2.7.62) help in solving the equation xA = b? Yes, it does. Take the first factorization, for example:
x [1 2; 4 10] = b
⇔ x [1 0; 4 2] = y  and  y [1 2; 0 1] = b, where y = (y1, y2).
Solve firstly
y [1 2; 0 1] = b ⇒ y = b [1 2; 0 1]⁻¹ = b [1 −2; 0 1] = (b1, −2b1 + b2),
and secondly
x [1 0; 4 2] = (b1, −2b1 + b2)
⇒ x = (b1, −2b1 + b2) [1 0; 4 2]⁻¹ = (b1, −2b1 + b2) [1 0; −2 1/2] = (5b1 − 2b2, −b1 + b2/2).
Example 2 Let
A = [0 2; −1 1].
Do the same problems as in Example 1.
Solution For x = (x1, x2) and b = (b1, b2), xA = b is equivalent to
{−x2 = b1; 2x1 + x2 = b2},
which can easily be solved: x1 = (b1 + b2)/2 and x2 = −b1.
The shortcoming of A, for the purpose of the general theory to be established later in this subsection, is that its leading diagonal entry is zero. To avoid this, we exchange the first row and the second row of A; the resulting matrix amounts to
B = [−1 1; 0 2] = [0 1; 1 0] [0 2; −1 1] = E(1)(2) A.
Then perform column operations on B, carrying b along (the columns correspond to the unknowns x2, x1):
[B; b] = [−1 1; 0 2; b1 b2]
→ F−(1): [1 1; 0 2; −b1 b2]
→ F(2)−(1): [1 0; 0 2; −b1 b1 + b2]
→ F(1/2)(2): [1 0; 0 1; −b1 (b1 + b2)/2],
i.e. BF−(1)F(2)−(1)F(1/2)(2) = I2. Note that xA = (xE(1)(2))(E(1)(2)A) = (xE(1)(2))B = (x2, x1)B = b, so the first column corresponds to x2 while the second one corresponds to x1.
Equivalently, we can perform row operations to
[(E(1)(2)A)∗ | b∗] = [A∗F(1)(2) | b∗] = [B∗ | b∗] = [−1 0 | b1; 1 2 | b2]   (rows x2, x1)
→ E−(1): [1 0 | −b1; 1 2 | b2]
→ E(2)−(1): [1 0 | −b1; 0 2 | b1 + b2]
→ E(1/2)(2): [1 0 | −b1; 0 1 | (b1 + b2)/2].
In this case, the first row corresponds to x2 while the second one corresponds to x1.
(a) The solution of xA = b:
x1 = (b1 + b2)/2,  x2 = −b1,  or
x = (x1, x2) = (b1, b2) [1/2 −1; 1/2 0] = b A⁻¹.
(b) The invertibility of A and its inverse A⁻¹:
I2 = [1 0; 0 1] = B F−(1) F(2)−(1) F(1/2)(2)
⇒ B⁻¹ = (E(1)(2)A)⁻¹ = A⁻¹E(1)(2) = F−(1) F(2)−(1) F(1/2)(2)
⇒ A⁻¹ = F−(1) F(2)−(1) F(1/2)(2) F(1)(2)
      = [−1 0; 0 1] [1 −1; 0 1] [1 0; 0 1/2] [0 1; 1 0] = [1/2 −1; 1/2 0].
A = F(1)(2) F(1/2)(2)⁻¹ F(2)−(1)⁻¹ F−(1)⁻¹.
See Fig. 2.62. Note that A does not have real eigenvalues.
Example 3 Let
A = [1 2; 2 −7].
(1) Solve the equation Ax∗ = b∗, where x = (x1, x2) and b = (b1, b2).
(2) Investigate the geometric mapping properties of A.
Solution As against xA = b in Examples 1 and 2, here we use the column vector x∗ = [x1; x2] as the unknown vector and b∗ = [b1; b2] as the constant vector in Ax∗ = b∗, with A as coefficient matrix and [A | b∗]2×3 as augmented matrix.
[Fig. 2.62: the unit square under the elementary factors of A = [0 2; −1 1], step by step.]
[Fig. 2.63: the unit square under the elementary factors of A = [1 2; 2 −7], step by step.]
Note that A has two distinct real eigenvalues −3 ± √20.
Example 4 Let
A = [2 3; −4 −6].
(1) Solve the equation Ax∗ = b∗.
(2) Try to investigate the geometric mapping properties of A.
Solution Write Ax∗ = b∗ out as
{2x1 + 3x2 = b1; −4x1 − 6x2 = b2}.
The equations have solutions if and only if 2b1 + b2 = 0. In this case, the solutions are x1 = (b1 − 3x2)/2 with x2 an arbitrary scalar.
Apply row operations to
[A | b∗] = [2 3 | b1; −4 −6 | b2]
→ E(1/2)(1): [1 3/2 | b1/2; −4 −6 | b2] = E(1/2)(1) [A | b∗]
→ E(2)+4(1): [1 3/2 | b1/2; 0 0 | 2b1 + b2] = E(2)+4(1) E(1/2)(1) [A | b∗].
[Fig. 2.64: the unit square under the elementary factors of A = [2 3; −4 −6]; the image collapses onto a line.]
EA or AF,
P −1 = P ∗ .
Ek Ek−1 · · · E2 E1 A = In .
A−1 = Ek Ek−1 · · · E2 E1 .
Ek Ek−1 · · · E2 E1 A = U .
Then,
among them are called the pivots of A). Therefore, the following hold:
1. A is invertible if and only if d1, . . . , dn are nonzero. In this case, A can be factored as
A = L diag(d1, . . . , dn) U = LDU,
where L is lower triangular and U is upper triangular, both with 1's along the diagonal.
2. If, in addition, A is symmetric, then A∗ = U∗DL∗ together with the uniqueness of such factorizations gives U = L∗, i.e.
A = LDL∗.
PA = LU
as in (1).
2. Hold all row exchanges until all nonzero entries below the diagonal have been eliminated as far as possible. Then permute the order of the rows of the resulting matrix so as to produce an upper triangular form. Thus,
A = LPU,
For PA = LU:
A = [0 1 3; 2 4 0; −1 0 5] → (P = E(1)(3)) PA = [−1 0 5; 2 4 0; 0 1 3]
→ E(2)+2(1): [−1 0 5; 0 4 10; 0 1 3]
→ E(1/4)(2): [−1 0 5; 0 1 5/2; 0 1 3]
→ E(3)−(2): [−1 0 5; 0 1 5/2; 0 0 1/2]
⇒ PA = E(2)+2(1)⁻¹ E(1/4)(2)⁻¹ E(3)−(2)⁻¹ [−1 0 5; 0 1 5/2; 0 0 1/2]
     = [1 0 0; −2 4 0; 0 1 1] [−1 0 5; 0 1 5/2; 0 0 1/2]
     = [1 0 0; −2 1 0; 0 1/4 1] [−1 0 0; 0 4 0; 0 0 1/2] [1 0 −5; 0 1 5/2; 0 0 1].
For A = LPU:
A = [0 1 3; 2 4 0; −1 0 5]
→ E(3)+(1/2)(2): [0 1 3; 2 4 0; 0 2 5]
→ E(3)−2(1): [0 1 3; 2 4 0; 0 0 −1]
→ E(1)(2) = P: [2 4 0; 0 1 3; 0 0 −1]
⇒ A = E(3)+(1/2)(2)⁻¹ E(3)−2(1)⁻¹ E(1)(2)⁻¹ [2 4 0; 0 1 3; 0 0 −1]
    = [1 0 0; 0 1 0; 2 −1/2 1] [0 1 0; 1 0 0; 0 0 1] [2 4 0; 0 1 3; 0 0 −1].
A = LUm×n ,
PA
For more details and various applications of this kind of factorizations, see
Sec. B.5 for a general reference.
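For comparison, library routines compute such factorizations directly. SciPy's lu returns the factors in the convention A = PLU (permutation on the left of LU), so the PA = LU example above corresponds to checking P^T A = LU; the sketch below is our illustration.

```python
import numpy as np
from scipy.linalg import lu

A = np.array([[0., 1., 3.], [2., 4., 0.], [-1., 0., 5.]])
P, L, U = lu(A)                         # SciPy convention: A = P L U
print(np.allclose(P.T @ A, L @ U))      # so P^T A = LU, L unit lower triangular
```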
In fact, this process of elimination by using elementary row and column
operations provides us a far insight into the symmetric matrices.
Let us start from simple examples.
Let
A = [1 −3; −3 9].
Then
A → E(2)+3(1): [1 −3; 0 0] → F(2)+3(1): [1 0; 0 0]
⇒ A = E(2)+3(1)⁻¹ [1 0; 0 0] F(2)+3(1)⁻¹ = [1 0; −3 1] [1 0; 0 0] [1 −3; 0 1] = LDL∗.
Let
B = [0 4; 4 5].
Then,
B → E(1)(2), F(1)(2): E(1)(2) B E(1)(2)∗ = [5 4; 4 0]
→ E(2)−(4/5)(1), F(2)−(4/5)(1): E(2)−(4/5)(1) E(1)(2) B E(1)(2)∗ E(2)−(4/5)(1)∗ = [5 0; 0 −16/5]
⇒ B = E(1)(2)⁻¹ E(2)−(4/5)(1)⁻¹ [5 0; 0 −16/5] (E(1)(2)⁻¹ E(2)−(4/5)(1)⁻¹)∗
    = [4/5 1; 1 0] [5 0; 0 −16/5] [4/5 1; 1 0]∗
    = PDP∗,
where P = [4/5 1; 1 0] is not necessarily lower triangular and D = [5 0; 0 −16/5] is diagonal. To simplify D one step further, let
Q = [1/√5 0; 0 √5/4].
Then
QDQ∗ = [1 0; 0 −1].
Or, equivalently,
RBR∗ = [1 0; 0 −1],
where
R = (PQ⁻¹)⁻¹ = QP⁻¹ = [0 1/√5; √5/4 −√5/5].
These two examples tell us explicitly and inductively the following impor-
tant results in the study of quadratic forms.
166 The Two-Dimensional Real Vector Space R2
where
the index k of A = the number of +1's in the diagonal,
the signature s = k − l of A = the number k of +1's in the diagonal minus the number l of −1's in the diagonal,
the rank r of A = k + l.
(3) Sylvester’s law of inertia
The index, signature and rank of a real symmetric matrix are invariants
under congruence.
(4) Therefore, two real symmetric matrices of the same order are congruent
if and only if they have the same invariants. (2.7.71)
The result (1) still holds for symmetric matrices over a field of characteristic
other than two (see Sec. A.3).
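Sylvester's law can be probed numerically: congruence by any invertible R preserves the numbers of positive and negative eigenvalues of a symmetric matrix. The experiment below (random R of our choosing) illustrates this for the matrix B of the example above.

```python
import numpy as np

B = np.array([[0., 4.], [4., 5.]])      # the matrix of the example above

def inertia(S):
    w = np.linalg.eigvalsh(S)
    return int(np.sum(w > 1e-9)), int(np.sum(w < -1e-9))   # (k, l)

rng = np.random.default_rng(3)
for _ in range(100):
    R = rng.standard_normal((2, 2))
    if abs(np.linalg.det(R)) > 1e-6:    # R invertible
        assert inertia(R @ B @ R.T) == inertia(B)
print("index and signature preserved:", inertia(B))   # -> (1, 1)
```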
Exercises
<A>
1. Let
A = [4 3; 5 2].
5. Let
A = [1 4; −3 7].
Do the same problems as in Ex. 1. What happens to (a)(1)?
6. Let
A = [2 −4; −6 12].
(a) Find the following factorizations of A:
(1) A = P⁻¹ [λ1 0; 0 λ2] P.
(2) A = E1 E2 [1 −2; 0 0], where E1 and E2 are elementary matrices.
(3) A = LU.
(4) A = LS, where L is lower triangular and S is symmetric.
(b) Determine when the equation Ax∗ = b∗, where x = (x1, x2) and b = (b1, b2), has a solution, and then solve the equation by using the following methods:
(1) Apply elementary row operations to [A | b∗].
(2) Apply A = LU from (a).
(c) Find the image of the triangle ∆a1a2a3, where a1 = (3, 1), a2 = (−2, 3) and a3 = (2, −4), under the mapping x → xA, by direct computation and by the former three factorizations in (a). Illustrate graphically at each step.
7. Suppose A = LDU, where L, D, U are invertible.
(a) In case A has another factorization A = L1D1U1, show that L1, D1, U1 are invertible and
L = L1,  D = D1  and  U = U1
by considering L1⁻¹LD = D1U1U⁻¹.
(b) Rewrite A as L(U∗)⁻¹(U∗DU). Show that there exist a lower triangular matrix L and a symmetric matrix S such that
A = LS.
8. Let
A = [6 2; 2 9].
(Note This is Sylvester's law in (2.7.71) for 2 × 2 symmetric matrices. Ponder whether your proof is still good for 3 × 3 or n × n symmetric matrices. If yes, try it; if no, is there any other way to attack this problem? See Ex. <B> 3 of Sec. 3.7.5.)
12. Factorize the nonzero matrix
A = [a11 a12; a21 a22]
<B>
Caution The following problems are interlinked and more challenging; basic knowledge about inner products (see Chap. 4) is needed.
1. Let
A = [0 1 2; −1 0 1].
Also,
PA = [1 0 −1; 0 1 2], where P = [0 −1; 1 0].
[Fig. 2.65: the mapping x → xPAQ from R2 to R3, factored through P and Q; the images of e1, e2 and the frames u1, u2 and v1, v2, v3.]
[Fig. 2.66: the same mapping, composed as P followed by PAQ and then Q⁻¹.]
x R−1 = [
x ]B = (α1 , α2 ).
Meanwhile,
PA = [1 2 3; 0 0 0], where P = [1 0; 2 1].
For x = (x1, x2) ∈ R2 and y = (y1, y2, y3) ∈ R3, consider xA = y and yA∗:
Ker(A∗) = {y ∈ R3 | y1 + 2y2 + 3y3 = 0} = ⟨⟨(−2, 1, 0), (−3, 0, 1)⟩⟩,
Im(A∗) = {x ∈ R2 | 2x1 + x2 = 0} = ⟨⟨(1, −2)⟩⟩.
Justify that
Im(A)⊥ = Ker(A∗),  Ker(A)⊥ = Im(A∗),  and
Im(A∗)⊥ = Ker(A),  Ker(A∗)⊥ = Im(A);
also R3 = Ker(A∗) ⊕ Im(A) and R2 = Ker(A) ⊕ Im(A∗). See Fig. 2.67.
[Fig. 2.67: A and A∗ between R2 and R3: Im(A)⊥ = Ker(A∗) in R3 (with (1, 2, 3) spanning Im(A)), and Ker(A) = Im(A∗)⊥ in R2 (with (1, −2) spanning Im(A∗)).]
(c) From our experience in (2.7.8) and Fig. 2.67, it is easily seen that the line ⟨⟨(1, −2)⟩⟩ = Im(A∗) is an invariant line of AA∗, i.e. there exists a scalar λ such that
(1, −2)AA∗ = λ(1, −2).
Actual computation shows that λ = 70. Hence, even though A, as a whole, does not have an inverse, when restricted to the subspace Im(A∗) it does satisfy
(1/70) AA∗ = 1_{Im(A∗)}.
Therefore, specifically denoted,
A⁺ = (1/70) A∗
serves as a right inverse of the restriction A|_{Im(A∗)}.
[Fig. 2.68: AA⁺ fixes Im(A∗) = ⟨⟨(1, −2)⟩⟩ and kills Ker(A) = ⟨⟨(2, 1)⟩⟩ = Im(A∗)⊥.]
[Fig. 2.69: A⁺A fixes Im(A) and kills Ker(A∗) = Im(A)⊥ in R3.]
A = [1 2 3; −2 −4 −6] = [1; −2] [1 2 3]
  = √70 [1/√5; −2/√5] [1/√14 2/√14 3/√14]
⇒ A⁺ = (1/√70) [1/√14; 2/√14; 3/√14] [1/√5 −2/√5] = (1/70) A∗.
Here √70 is called the singular value of A (see (e)) and (√70)² = 70 is the nonzero eigenvalue of AA∗ as shown in (c).
(e) Suppose x1 = (1/√5)(1, −2), which is intentionally normalized from (1, −2) as a unit vector. Also, let x2 = (1/√5)(2, 1). Then
x1A = (5/√5)(1, 2, 3) = √70 (1/√14, 2/√14, 3/√14),
x2A = 0.
Let y1 = (1/√14, 2/√14, 3/√14). Take any two orthogonal unit vectors y2 and y3 in Ker(A∗), say y2 = (−2/√5, 1/√5, 0) and y3 = (−3/√70, −6/√70, 5/√70).
Then
x1A = √70 y1 + 0·y2 + 0·y3,
x2A = 0 = 0·y1 + 0·y2 + 0·y3
⇒ [x1; x2] A = [√70 0 0; 0 0 0] [y1; y2; y3]
⇒ A = [x1; x2]⁻¹ [√70 0 0; 0 0 0] [y1; y2; y3] = R [√70 0 0; 0 0 0] S,   (∗)
where
R = [1/√5 −2/√5; 2/√5 1/√5]⁻¹ = [1/√5 2/√5; −2/√5 1/√5]
and
S = [1/√14 2/√14 3/√14; −2/√5 1/√5 0; −3/√70 −6/√70 5/√70],
so that, correspondingly,
A⁺ = S∗ [1/√70 0; 0 0; 0 0] R∗.
[Fig. 2.70: the singular factorization A = R diag(√70, 0, 0) S: x1 and x2 map through R and S to √70 y1 and 0.]
Fig. 2.70
solution and b A+ is the shortest solution (in distance from 0 or its
length) among all, i.e.
| b A+ | = min |
x |.
x A= b
Equivalently, it is the vector b A+ that minimizes | b −
x A| for all
x ∈ R2 , i.e.
| b − ( b A+ )A| =
min | b −
x A|.
x ∈R2
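The computations in (c) and (e) agree with the Moore-Penrose inverse produced by NumPy; the following check (our illustration) also exhibits bA⁺ as a solution of xA = b for a b lying in Im(A).

```python
import numpy as np

A = np.array([[1., 2., 3.], [-2., -4., -6.]])
Aplus = np.linalg.pinv(A)               # the Moore-Penrose inverse, 3 x 2
print(np.allclose(Aplus, A.T / 70.))    # agrees with A^+ = A^*/70 from (d)

b = np.array([1., 2., 3.])              # b in Im(A), so xA = b is solvable
x = b @ Aplus                           # bA^+, a row vector in R^2
print(np.allclose(x @ A, b), x)         # a solution; it is the shortest one
```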
1. Prove (2.7.66).
2. Prove (2.7.67).
3. Prove (2.7.68).
4. Prove (2.7.69).
5. Prove (2.7.70).
6. Prove (2.7.71).
Σ_{i=1}^{p} mi = m,  Σ_{j=1}^{q} nj = n.
(1) Addition If A = [Aij ] and B = [Bij ] are of the same type, then
A + B = [Aij + Bij ].
λA = [λAij ].
(3) Product Suppose Am×n = [Aij ] and Bn×l = [Bjk ] where Aij is
mi ×nj submatrix of A and Bjk is nj ×lk submatrix of B for 1 ≤ i ≤ p,
1 ≤ j ≤ q, 1 ≤ k ≤ t, i.e. the column numbers of each Aij is equal to
the row numbers of the corresponding Bjk , then
AB = [Cik],  Cik = Σ_{j=1}^{q} Aij Bjk  for 1 ≤ i ≤ p, 1 ≤ k ≤ t.
A∗ = [A∗ij ]∗ .
Ā∗ = [Ā∗ij ]∗ .
Also, prove that if A and A11 are invertible, then A22 − A21A11⁻¹A12 is also invertible and
A⁻¹ = [Ir, −A11⁻¹A12; O, In−r] [A11⁻¹, O; O, (A22 − A21A11⁻¹A12)⁻¹] [Ir, O; −A21A11⁻¹, In−r].
In particular,
det [A O; O A] = (det A)²;  det [O −B; B O] = (det B)².
r(A) = r ≥ 1
⇔ There exist a matrix Bm×r of rank r and a matrix Cr×n of rank r
such that A = BC.
(b) r[A O; O B] = r(A) + r(B).
(c) r[A C; O B] ≥ r[A O; O B].
(d) Use (c) to prove that r(A) + r(B) − n ≤ r(AB), where Am×n and Bn×m.
22. Suppose Am×n, Bn×p and Cp×q.
(a) Show that
[Im A; O In] [ABC O; O B] [Iq O; −C Ip] [O −Iq; Ip O] = [AB O; B BC].
(b) Show that
r[AB O; O BC] ≤ r[AB O; B BC] = r[ABC O; O B],
and hence deduce the Frobenius inequality (see Ex. <C> 11 of Sec. 2.7.3)
r(AB) + r(BC) ≤ r(B) + r(ABC).
(c) Taking B = In and using Bn×p to replace C, show the Sylvester inequality (see (2.7.43) and Ex. <C> of Sec. 2.7.3)
r(A) + r(B) − n ≤ r(AB).
23. (a) Let Am×n and Bm×n be real matrices. Show that
(det AB∗)² ≤ (det AA∗)(det BB∗).
(b) Let Am×n be a complex matrix. Show that any principal subdeterminant of order r of AĀ∗ is nonnegative:
AĀ∗ (i1 . . . ir; i1 . . . ir) ≥ 0,  1 ≤ i1 < · · · < ir ≤ m
(see Sec. B.6).
From Secs. 2.7.2 to 2.7.5, we repeat again and again in examples and
exercises to familiarize readers with these two concepts and methods to
compute them, and to make readers realize implicitly and purposely how
many advantages they might have in the investigation of geometric mapping
properties of a linear operator.
Here in this subsection, we give two examples to end the study of the diagonalizability of a linear operator (see (2.7.24)) and to summarize their results formally in (2.7.72) and (2.7.73). The examples presented here might feel cumbersome or tedious. If so, please just skip this content and go directly to the Exercises or Sec. 2.7.7.
The resulting vector x = (4/5 y1 + 1/5 y2, 3/5 y1 + 2/5 y2) is then the (unique) solution to f(x) = y. Thus f is onto (see (2.7.8)). It is worth noticing that the above algebraic computation can be simplified to the computation of the inverse A⁻¹ of the matrix A: y = xA ⇔ x = yA⁻¹.
f maps straight lines into straight lines and preserves ratios of lengths
of line segments along the same line. The equation
a1 x1 + a2 x2 + b = 0 (∗1 )
(1) One is
This means that the lines x1 − x2 = 0 and 3x1 + x2 = 0 are kept invariant under the mapping f.
(2) The other is
(2) The other is
A⁻¹ [a1; a2] = µ [a1; a2]
⇔ (A⁻¹ − µI2) [a1; a2] = [0; 0].   (∗3)
⇔ det(A⁻¹ − µI2) = det( (1/5) [4 3; 1 2] − µ [1 0; 0 1] )
  = (1/25) |4 − 5µ, 3; 1, 2 − 5µ| = (1/5)(5µ² − 6µ + 1) = 0
⇒ µ = 1 and µ = 1/5.
If µ = 1, by (∗3), we have a1 − 3a2 = 0, and if µ = 1/5, then a1 + a2 = 0.
[Fig. 2.71: the invariant lines of f(x) = xA, with directions (1, 1) and (1, −3).]
[Fig. 2.72: the square with vertices (±1, ±1) and its image parallelogram with vertices (1, 1), (−3, 7), (−1, −1), (3, −7), drawn with the invariant lines ⟨⟨v1⟩⟩ and ⟨⟨v2⟩⟩.]
A2 = P⁻¹ [0 0; 0 1] P = [1/4 −3/4; −1/4 3/4],
the claimed advantage will come to the surface once we can handle the geometric mapping properties of both A1 and A2. Note that
the claimed advantage will come to surface once we can handle the
geometric mapping properties of both A1 and A2 . Note that
x = (x1, x2) ∈ R2, using N = {e1, e2}
↓
xP⁻¹ = [x]B = (α1, α2) ∈ R2, using B = {v1, v2}
↓
xP⁻¹ [1 0; 0 0] = (α1, 0), the projection of (α1, α2) onto (α1, 0) in B
↓
xP⁻¹ [1 0; 0 0] P = (α1, 0) [v1; v2] = α1v1 = xA1.
[Fig. 2.73: x resolved into xA1 and xA2 along the invariant lines ⟨⟨v1⟩⟩ and ⟨⟨v2⟩⟩; then xA = xA1 + 5xA2.]
f(1, 1) = (1, 1) [2 −3; −1 4] = (1, 1),
f(−1, 1) = (−1, 1) [2 −3; −1 4] = (−3, 7),
f(−1, −1) = (−1, −1) [2 −3; −1 4] = (−1, −1) = −f(1, 1),
f(1, −1) = (1, −1) [2 −3; −1 4] = (3, −7) = −f(−1, 1).
Connect the image points (1, 1), (−3, 7), (−1, −1) and (3, −7) by consecutive line segments; the resulting parallelogram is the required one. See Fig. 2.72. Try to determine this parallelogram by using the diagonal canonical decomposition of A. By the way, the original square has area equal to 4 units. Do you know what the area of the resulting parallelogram is? It is 20 units. Why? (Hint: |det A| = 5.)
How do we find the images of lines parallel to the kernel ⟨⟨v2⟩⟩? Let x = x0 + tv2 be such a line and suppose it intersects the line ⟨⟨v1⟩⟩ at the point t0v1. Then
f(x) = f(x0 + tv2) = f(x0) + tf(v2) = f(x0) = f(t0v1) = t0f(v1) = 8t0v1.
This means that f maps the whole line x = x0 + tv2 into a single point, 7t0v1 away from its point of intersection with the line ⟨⟨v1⟩⟩, along the line ⟨⟨v1⟩⟩ and in the direction of the point of intersection. See Fig. 2.74.
[Fig. 2.74: the line x = x0 + tv2, parallel to the kernel ⟨⟨v2⟩⟩, collapses to the single point 8t0v1 on ⟨⟨v1⟩⟩.]
What is the image of a line not parallel to the kernel ⟨⟨v2⟩⟩? Let x = x0 + tu be such a line, where u and v2 are linearly independent. Suppose u = u1 + u2 with u1 ∈ Im(f) and u2 ∈ Ker(f). Then
f(x) = f(x0) + tf(u)
     = f(x0) + tf(u1 + u2) = f(x0) + t(f(u1) + f(u2))
     = f(x0) + tf(u1) = f(x0) + 8tu1,
where f(x0) = 8t0v1 for some t0. Since u1 ≠ 0, the image is a line and coincides with the range line ⟨⟨v1⟩⟩. Also, f maps the line x = x0 + tu one-to-one and onto the line ⟨⟨v1⟩⟩ and preserves ratios of signed lengths.
[Fig. 2.75: the line x = x0 + tu with u = u1 + u2, where u1 ∈ Im(f) = ⟨⟨v1⟩⟩ and u2 ∈ Ker(f) = ⟨⟨v2⟩⟩.]
[Fig. 2.76: f factored as the projection P of R2 onto ⟨⟨v1⟩⟩ along ⟨⟨v2⟩⟩, followed by the enlargement with scale 8: f(x) = 8P(x).]
⇒ [f]B = PAP⁻¹ = [8 0; 0 0], where P = [v1; v2] = [1 −2; 3 2].
Let
A1 = P⁻¹ [1 0; 0 0] P = (1/8) [2 2; −3 1] [1 0; 0 0] [1 −2; 3 2]
   = (1/8) [2 −4; −3 6] = [1/4 −1/2; −3/8 3/4] = (1/8) A.
(1) Diagonalizability
1. f is diagonalizable, i.e. there exists a basis B for V such that [f ]B
is a diagonal matrix.
⇔ 2. There exists a basis B = {x1, . . . , xn} for V and some scalars λ1, . . . , λn such that f(xi) = λixi for 1 ≤ i ≤ n. Under these circumstances,
[f]B = diag(λ1, . . . , λn).
⇔ 3. In case the characteristic polynomial of f is
det(f − t1V ) = (−1)n (t − λ1 )r1 · · · (t − λk )rk ,
Eλi = {x ∈ V | f(x) = λix}
ϕ(A) = (A − λ1 In ) · · · (A − λk In ) = On×n
(see Ex. <C> 9(e)). This ϕ(t) is called the minimal polynomial
of A.
⇔ 5. V = Eλ1 ⊕ · · · ⊕ Eλk , where Eλ1 , . . . , Eλk are as in 3.
⇐ 6. f has n distinct eigenvalues λ1 , . . . , λn , where n = dim V .
See Ex. <C> 4(e) for another diagonalizability criterion.
(2) Diagonal canonical form or decomposition
For simplicity, via a linear isomorphism, we may suppose that V = Fn
and f (
x) = x A where A is an n × n matrix. Adopt notations in (1)3.
Let B = { x 1 , . . . ,
x n } be a basis for Fn consisting of eigenvectors of A
such that
Then,
1. Fn = Eλ1 ⊕ · · · ⊕ Eλk .
2. Each Ai : Fn → Fn is a projection of Fn onto Eλi along Eλ1 ⊕ · · · ⊕
Eλi−1 ⊕ Eλi+1 ⊕ · · · ⊕ Eλk , i.e.
A2i = Ai , 1 ≤ i ≤ k.
3. Ai Aj = On×n if i = j, 1 ≤ i, j ≤ k.
4. In = A1 + · · · + Ak .
5. A = λ1 A1 + · · · + λk Ak . (2.7.73)
For more details, refer to Secs. B.11 and B.12. Also, note that, if
dim(Ker(f )) ≥ 1, then each nonzero vector in Ker(f ) is an eigenvector
corresponding to the eigenvalue 0.
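The decomposition (2.7.73) is easy to check in coordinates; the sketch below does so for the matrix A = [2 −3; −1 4] of Example 1 (eigenvalues 1 and 5, eigenvectors (1, 1) and (1, −3) in the convention xA = λx).

```python
import numpy as np

A = np.array([[2., -3.], [-1., 4.]])
P = np.array([[1., 1.], [1., -3.]])            # rows: eigenvectors for 1 and 5
Pinv = np.linalg.inv(P)
A1 = Pinv @ np.diag([1., 0.]) @ P              # projection onto E_1 along E_5
A2 = Pinv @ np.diag([0., 1.]) @ P              # projection onto E_5 along E_1
print(np.allclose(A1 @ A1, A1), np.allclose(A2 @ A2, A2),  # idempotent
      np.allclose(A1 @ A2, 0),                 # mutually annihilating
      np.allclose(A1 + A2, np.eye(2)),         # resolution of the identity
      np.allclose(1. * A1 + 5. * A2, A))       # A = 1*A1 + 5*A2
```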
Exercises
<A>
1. In Example 1, does the linear isomorphism f have other invariant lines besides x1 − x2 = 0 and 3x1 + x2 = 0? Find the invariant lines, if any, of the affine transformation
f(x) = x0 + xA,
where x0 ≠ 0.
2. In Example 2, find all possible one-dimensional subspaces S of R2 such that
R2 = S ⊕ Ker(f).
For each x ∈ R2, let
x + Ker(f) = {x + v | v ∈ Ker(f)}
be the image of Ker(f) under the translation v → x + v, which is the line x + ⟨⟨v2⟩⟩ parallel to the line ⟨⟨v2⟩⟩. Show that
x1 + Ker(f) = x2 + Ker(f) ⇔ x1 − x2 ∈ Ker(f).
Denote the quotient set
R2 /Ker(f ) = {
x + Ker(f ) |
x ∈ R2 }
and introduce two operations on it as follows:
(1) α(x + Ker(f )) = α x + Ker(f ), α ∈ R,
(2) ( x1 + Ker(f )) + ( x2 + Ker(f )) = (
x1 +
x2 ) + Ker(f ),
which are well-defined (refer to Sec. A.1). Then, show that R2 /Ker(f )
is a vector space isomorphic to S mentioned above (see Sec. B.1).
R2 /Ker(f ) is called the quotient space of R2 modulus Ker(f ). What
is R2 /S?
3. In N = {e1, e2}, define f: R2 → R2 by
f(x) = xA, where A = [5 1; −7 −3] and x = (x1, x2).
(a) Model after the purely algebraic method in Example 1 to answer
the following questions about f :
(1) f is one-to-one and onto.
(2) f maps straight lines into straight lines and preserves their
relative positions.
(3) f preserves ratios of signed lengths of line segments along the
same or parallel lines.
(4) f maps triangles into triangles with vertices, sides and interior
to vertices, sides and interior, respectively.
(5) f maps parallelograms into parallelograms.
(6) Does f preserve orientations (counterclockwise or clockwise)?
Why?
(7) Does f preserve areas of triangles? Why?
(8) How many invariant lines does f have? If any, find them.
Now, for any fixed x0 ∈ R2 with x0 ≠ 0, answer the same questions (1)–(8) for the affine transformation
T(x) = x0 + f(x).
(b) Model after the linearly algebraic method in Example 1 to answer
the same questions (1)–(8) as in (a). Notice the following process:
(1) Compute the characteristic polynomial det(A − tI2 ).
(2) Solve det(A − tI2 ) = 0 to determine the eigenvalues λ1 and λ2 .
(3) Solve the equation x or
x A = λi x (A−λi I2 ) = 0 to determine
the corresponding eigenvectors xi = 0, i = 1, 2.
(4) Make sure if x1 and x2 are linearly independent.
(5) Let B = { x2 }, a basis for R2 . Then set up
x1 ,
λ 0 x
[f ]B = PAP −1 = 1 , where P = 1 .
0 λ2 x2
(6) Justify (2) in (2.7.72) for this f or A.
Then, try to use these information to do the problem.
202 The Two-Dimensional Real Vector Space R2
4. In N = { e2 }, let f : R2 → R2 be defined by
e1 ,
6 2
f ( x ) = x A, where A = .
2 9
6. In N = { e2 }, let f : R2 → R2 be defined by
e1 ,
3 −2
f (
x) =
x A, where
x = (x1 , x2 ) and A = .
−6 4
v2
e2 e1 + e2
( )
1 1
− ,
2 4
( )
3 1
− ,−
2 4 0 e1
( ) −1,, −
1
2 v1
Fig. 2.77
Conversely, give a parallelogram with vertices (0, 0), (1, 2), (−2, 1)
and (−3, −1). Under what coordinate system B = { v2 } does the
v1 ,
square look like the given parallelogram? This means that we have to
find out v1 and v2 so that
[ e2 ]B = (−3, −1) and
e1 ]B = (1, 2), [ [
e1 +
e2 ]B = (−2, 1).
Now
e1 =1·
v1 + 2 ·
v2
= (−3) ·
e2 v1 + (−1) ·
v2
e1 1 2 v1
⇒ =
e2 −3 −1 v2
−1
v 1 2 1 0 1 −1 −2
⇒ 1 = = .
v2 −3 −1 0 1 5 3 1
1 2 3 1
Thus v1 = − 5 , − 5 , v2 = 5 , 5 . See Fig. 2.78.
204 The Two-Dimensional Real Vector Space R2
(1, 2)
(−2, 1) e2 e1 + e2
v2
0 e1
v1
(−3, −1)
Fig. 2.78
e2
a1
e1
0
a2
b2
Fig. 2.79
2.7 Linear Transformations (Operators) 205
onto ∆ 0 a2 b2 . In case
a1 and b1 are corresponding to
a2 and b2 respec-
tively, let linear operator f be defined as
f ( a2 = −2
a1 ) = a1 ,
1
f ( b1 ) = b2 = − b1
2
a1 −2 0 a1
⇒ [f ]N =
b1 0 −2 1
b1
−1
2 1 −2 0 2 1 − 11 − 98
⇒ f (x ) = [
x ]N [f ]N = x
=x 4
.
4 6 0 − 12 4 6 3
2
1
4
This f is the required one. In case
a1 and b1 are corresponding to b2
and
a2 respectively, define linear operator g as
g(
a2 ) = b2 ,
g( b2 ) =
a2
a2 0 1 a2
⇒ [g]N =
b2 1 0 b2
−1
−4 −2 0 1 −4 −2 − 14 5
⇒ g(
x ) = [
x ]N [g]N =
x
=x 8
.
−2 −3 1 0 −2 −3 3
2
1
4
−e1 + e2 e2 e1 + e2
0 e1
Fig. 2.80
points of intersection with the square into its points of intersection with
the line x1 + x2 = 0. See Fig. 2.80. In terms of linear algebra, this is the
projection of R2 along the x1 -axis, and in turn, eventually, becomes an
easy problem concerning eigenvectors. Define f : R2 → R2 by
f (
e1 ) = 0 ,
e2 ) = −
f ( e1 +
e2 .
Then
0 0
f (
x) =
x
−1 1
is a required one. See Fig. 2.80. Note that f (− e2 ) = −
e1 + e1 + e2 and
the image segment is covered twice by the square under f . Actually,
take any vector v which is linearly independent from − e1 + e2 . Then
a linear operator of R onto the line x1 + x2 = 0, defined as
2
f (
v) = 0,
f (−
e1 +
e2 ) = λ(−
e1 +
e2 ) for scalar λ = 0
e2
−e1 + e2
e1
0
v
Fig. 2.81
208 The Two-Dimensional Real Vector Space R2
e2 A
C O e1
Fig. 2.82
<B>
1. Prove (2.7.72).
2. Let
1 2 3 −8
A= and B = .
0 2 0 −1
(d) Use (c) to show that there exists a basis B = { v2 } for R2 such
v1 ,
that both [A]B and [B]B are diagonal matrices. In this case, A and
B are called simultaneously diagonalizable.
3. There exist matrices A2×2 and B2×2 such that A is diagonalizable and
AB = BA holds, but B is not diagonalizable. For example,
1 0
A = I2 and B = .
1 1
Try to find some other examples, if any.
4. Suppose A and B are similar 2 × 2 real matrices. Show that there exist
bases B and C for R2 and a linear operator f on R2 such that
[f ]B = A and [f ]C = B
(try to refer to (2.7.25)).
(Note This result still holds for n × n matrices over a field.)
5. Let f be a nonzero linear operator on R2 (or any two-dimensional vector
space). For any nonzero vector x in R2 , the subspace
x , f ( x ), . . . , ,
x ), f 2 ( denoted as Cf (
x)
is called the f -cycle subspace of R2 generated by
x.
(a) Show that Cf ( y ) ∈ Cf (
x ) is f -invariant, i.e. f ( x ) for each
y ∈ Cf ( x ).
A11 O
[f ]B = ,
A21 A22 n×n
where the restriction f |S has matrix representation A11 with
respect to B1 .
(2) In particular, V = S1 ⊕ S2 where S1 and S2 are
f -invariant subspaces if and only if, there exists a basis B =
{x1 , . . . ,
xk , xn } for V such that
xk+1 , . . . ,
A11 O
[f ]B = ,
O A22 n×n
where { xk } = B1 is a basis for S1 with [f |S1 ]B1 = A11
x1 , . . . ,
and { xk+1 , . . . ,
xn } = B2 is a basis for S2 with [f |S2 ]B2 = A22 .
In Case (2), f = f |S1 ⊕ f |S2 is called the direct sum of f |S1 and f |S2 .
Another interpretation of A22 is as follows. Suppose S is a f -invariant
subspace of V . Consider the quotient space of V modulus S (see
Sec. B.1)
V /S = {
x + S |
x ∈V}
and the induced quotient operator f˜: V /S → V /S defined by
f˜(
x + S) = f (
x ) + S.
This f˜ is well-defined and is linear. Let π: V → V /S denote the natural
projection defined by π( x) = x + S. Then the following diagram is
commutative: π ◦ f = f˜ ◦ π.
f
V −→ V
π ↓ ↓π
f˜
V /S −→ V /S
(3) Suppose dim V = n. Adopt notations as in (1). Then B2 =
{
xk+1 + S, . . . ,
xn + S} is a basis for V /S and
A11 0
[f ]B = ,
A21 A22
where A11 = [f |S ]B1 = A11 and A22 = [f˜]B2 .
212 The Two-Dimensional Real Vector Space R2
Both (2) and (3) provide other ways to compute the characteristic
polynomial of f .
(e) Diagonalizability of f, f |S , and f˜
Suppose dim V = n and S is a f -invariant subspace of V .
(1) If f is diagonalizable, then so is f |S .
(2) If f is diagonalizable, then so is f˜.
(3) If both f |S and f˜ are diagonalizable, then so is f .
5. Suppose f is a linear operator on a finite-dimensional vector space V .
(a) If Cf ( x = 0 and
x ) is the f -cycle subspace generated by
dim Cf ( x ) = k, then
(1) B = { x , f (
x ), f 2 (
x ), . . . , f (k−1) (
x )} is a basis for Cf (
x ).
Hence there exist unique scalars a0 , a1 , . . . , ak−1 such that
x ) = −a0
f (k) ( x − a1 f (
x ) − · · · − ak−1 f (k−1) (
x ).
(2) Therefore,
0 1
0 1 0
0
f |Cf ( =
x) B ..
0 1 .
0 1
−a0 −a1 −a2 · · · −ak−2 −ak−1 k×k
2.7 Linear Transformations (Operators) 213
ϕ(t) = (t − λ1 ) (t − λ2 ) · · · (t − λk ),
0
1 0
0
1 0
.
..
.
0 1 0
1 0 n×n
12. Let A ∈ M(m, n; C). Show that there exist unitary matrices Pm×m and
Qn×n so that
λ1
..
. 0
λr
PAQ = , λ1 ≥ λ2 ≥ · · · ≥ λr > 0,
0
..
0 .
0
@
0 @
0
@
PAP −1
= @ .@ .
@ ..
@
@ 0
n×n
2.7 Linear Transformations (Operators) 217
0 1 0 ··· 0
b21 b22 b23 · · · b2n
−1 b31 b32 b33 · · · b3n O A12
RAR = =
. .. .. .. .. A21 A22
.. . . . .
bn1 bn2 bn3 · · · bnn
and tr A = tr A22 = 0. What is Rn×n ?
218 The Two-Dimensional Real Vector Space R2
aij
and XY − Y X = A holds if and only if bij = λj −λi for i = j.
<D> Applications
For possible applications of diagonalizable linear operators to differential
equations, we postpone the discussion to Sec. 3.7.6 in Chap. 3.
which is not the original one. This indicates that A is not diagonalizable.
It is reasonable to expect that the mapping properties of f would be more
complicated.
Since det A = 1, f is one-to-one and hence onto.
Just like Example 1 in Sec. 2.7.6. This f maps lines into lines and pre-
serves their relative positions and ratios of signed lengths of line segments
along the same line or parallel lines.
220 The Two-Dimensional Real Vector Space R2
Let
y = (y1 , y2 ) = f (
x) =
x A. Then
y1 = x1 ,
y2 = −2x1 + x2 .
y2 − x2 = −2x1
shows that the point x is moved parallel to the line x1 = 0 and downward
the distance 2x1 if x1 > 0 and upward the distance −2x1 if x1 < 0. This is
equivalent to say that
y2 − x2
= −2,
x1
which shows that
x is moved along a line parallel to the line x1 = 0 and the
distance moved is proportional to its distance from x1 = 0 by a constant
scalar −2. Therefore, the line
x1 = c (constant)
is mapped onto itself with its point (c, x2 ) into (c, −2c + x2 ). Such a line is
also called an invariant line of f but is not an invariant subspace except
c = 0. See Fig. 2.83. In Fig. 2.84, does the quadrilateral with vertices
(3, 1), (1, 2), (−2, 1) and (−4, −2) have the same area and the same orien-
tation as its image quadrilateral? Why?
( x1 , x2 )
e2
0 e1
x1 = 0
(x1, − 2x1 + x2)
x1 = c
Fig. 2.83
2.7 Linear Transformations (Operators) 221
(−4, −6)
(−2, 5)
(1, 2)
(−2, 1) e2 (3, 1)
e1
(−4, −2)
(3, −5)
Fig. 2.84
(A − I2 )2 = (A − I2 )(A − I2 ) = A2 − 2A + I22
1 −4 1 −2 1 0 0 0
= −2 + = = O2×2 ,
0 1 0 1 0 1 0 0
x ∈ R2 such that
holds. There does exist vector
x (A − I2 ) = 0
but x (A − I2 ))(A − I2 ) =
( x (A − I2 )2 = 0 .
Fig. 2.85
3 3
v1 A = v1 = v1 + 0 · v2 ,
2 2
3 1 3
v2 A = e1 − ( e1 + e2 ) = v1 + v2
2 2 2
3
−1
0 3 0 0
⇒ [f ]B = P AP = 2
= I2 + ,
1 32 2 1 0
where
1
v −2 − 12
P = 1 = .
v2 1 0
x ∈ R2 ,
For any
x = (x1 , x2 ) in N = { e2 }
e1 ,
↓
[ x ]B = (α1 , α2 ) in B = { v2 }
v1 ,
↓
3
[f (
x )]B = [
x ]B [f ]B = (α1 , α2 ) + (α2 , 0)
2
3 3
= α1 + α2 , α2 in B
2 2
↓
−1 1 1
f ( x ) = xA = x P [f ]B P = x1 + x2 , − x1 + 2x2 ∈ R2 in N .
2 2
224 The Two-Dimensional Real Vector Space R2
This means that v1 , i.e. x1 −x2 = 0, is the only invariant line (subspace)
of f , on which each point x is moved to f ( x ) = 32
x . See Fig. 2.86.
e2
invariant line
0 v2 = e1
x(1, 2)
v1
3
v1
( 3 3
)
1, 2
2 2
2
f ( x) (
3
2
3
1 + 2, 2
2 )
Fig. 2.86
where
v
P = 1 .
v2
2.7 Linear Transformations (Operators) 225
by scalar λ→
enlargement
[
x ]B = (α1 , α2 ) (λα1 , λα2 )
1 1translation along (α2 , 0)
[f ( x )]B = [ x ]B [f ]B = λ(α1 , α2 ) + (α2 , 0) = (λα1 + α2 , λα2 )
This means that the one-dimensional subspace v1 , i.e. α2 = 0, is
the only invariant subspace of f except λ = 1. In case λ = 1, each line
v1 is invariant under f .
parallel to (2.7.74)
For general results, please refer to Sec. B.11, in particular, Ex. 2 and
Sec. B.12. Remind readers to review Example 6 and the explanations after
it in Sec. 2.7.2, including Figs. 2.51 and 2.52.
Exercises
<A>
1. Model after Example 1 to investigate the geometric mapping proper-
ties of
−2 0
f (
x) =
xA, where
x = (x1 , x2 ) and A = .
3 −2
2. Do the same problem as in Ex. 1 and Ex. <A> 3(a) of Sec. 2.7.6 if
2 1
(a) f (
x) =
xA, where
x = (x1 , x2 ) and A = −1 4 ,
λ 0
(b) f (
x) = x = (x1 , x2 ) and A = b λ , bλ = 0.
xA, where
3. Constructive linear algebra
Fix the Cartesian coordinate system N = { e2 } on R2 . Let
e1 ,
v1 = (3, 1) and v2 = (1, −2). Try to map the triangle ∆ 0
v1v2 respec-
tively onto ∆ 0 v1 (− v2 ), ∆ 0 (− v2 )(− v1 ) and ∆ 0 (− v1 ) v2 . See Fig. 2.87.
For example, map ∆ 0 v2 onto ∆ 0
v1 v1 (− v2 ) but keep 0 fixed. One way
to do this is to find a linear isomorphism f : R2 → R2 such that
f ( v1 = 1 ·
v1 ) = v1 + 0 ·
v2
v 2 ) = −
f ( v2 = 0 ·
v1 + (−1) ·
v2
1 0
⇒ [f ]B = , where B = { v2 }.
v1 ,
0 −1
226 The Two-Dimensional Real Vector Space R2
− v2
e2
v1
e1
0
−v1
v2
Fig. 2.87
v 1 ) = −
g( v2 = 0 ·
v1 + (−1)
v2
g( v1 = 1 ·
v2 ) = v1 + 0 ·
v2
0 −1 0 1 1 0
⇒ [g]B = =
1 0 1 0 0 −1
1 5
−1 0 −1 7 7
⇒ [g]N = P P =
1 0 − 10 − 1
7 7
1
⇒ g( x [g]N = (x1 − 10x2 , 5x1 − x2 ).
x) =
7
0 1
Note the role of the matrix 1 0 in the derivation of g.
(a) Model after f and g respectively and try to map ∆ 0
v1
v2 onto
∆ 0 (− v2 )(− v1 ).
(b) Same as (a) but map ∆ 0
v1v2 onto ∆ 0 (−
v1 )
v2 .
2.7 Linear Transformations (Operators) 227
The next problem is to map ∆ 0 v1v2 onto ∆ 0
v2 (
v1 +
v2 ), half of
the parallelogram with vertices 0 , v2 , v1 + v2 and v1 . The mapping
h: R2 → R2 defined linearly by
h(
v1 ) =
v1 +
v2 ,
h(
v2 ) =
v2
is probably the simplest choice among others. See Fig. 2.88.
Geometrically, h represents the shearing with v2 as its invari-
ant line (for details, refer to Example 6 in Sec. 2.7.2) and it moves
a point [
x ]B = (α1 , α2 ) to the point [h(
x )]B = (α1 , α1 + α2 ) along
the line parallel to
v2 . Now
1 1
[h]B =
0 1
9
− 47 def
−1 1 1 7
⇒ [h]N = P P = =A
0 1 1 5
7 7
1
⇒ h( x ) = x [h]N
= (9x1 + x2 , −4x1 + 5x2 ).
7
v1 x
v1 + v2
v2
h( x)
〈〈v2 〉〉
Fig. 2.88
h(
y1 ) =
y1
h(y2 ) =
y1 + y2
1 0
⇒ [h]γ =
1 1
−1 1 0
y1 1 −2
⇒ [h]N = Q Q, where Q = = 7
1 1 y2 2 0
9
1 0 2 1 0 1 −2 7 − 47
= 7 = ,
7 − 72 1 1 1 2 0 1 5
7 7
which coincides with the original one mentioned above. The other
choice is to define k: R2 → R2 by
k(
v1 ) =
v2 ,
k(v2 ) = v1 + v2
0 1
⇒ [k]B =
1 1
6
−1 0 1 7 − 57 def
⇒ [k]N = P P = = B
1 1 − 11 1
7 7
1
⇒ k(x) = x [k]N = (6x1 − 11x2 , −5x1 + x2 ).
7
Note that
0 1 0 1 0 0
= +
1 1 1 0 0 1
0 1 −1 0 0 0
= + .
−1 0 0 1 0 1
√ √
1+ 5 1− 5
so that k has eigenvalues λ1 = 2 and λ2 = 2 . Solve
k( x ) = λi x , i.e.
6
− λi − 57
(x1 x2 ) 7 11 = 0, i = 1, 2
−7 1
7 − λi
√
and we obtain the corresponding
√ eigenvectors x1 = (22, 5 + 7 5)
x2 = (22, 5 − 7 5). Then C = {
and x1 , x2 } is a basis for R2 and
√
1+ 5
0
R[k]N R−1 = √ = [k]C ,
2
1− 5
0 2
where
√
x1 22 5 + 7 5
R= = √ .
x2 22 5 − 7 5
Therefore, [k]N = R−1 [k]C R, i.e.
k(
x) = x R−1 [k]C R = [
x [k]N = x ]C [k]C R.
x ∈ R2 , we can follow the following
This means that, for a given
steps to pinpoint k( x ):
x → [
x ]C → [
x ]C [k]C → [
x ]C [k]C R = k(
x ).
Equivalently, by using (2.7.72), compute
1 0
B1 = R−1 R
0 0
√ √ √
−1 5 − 7 5 −5 − 7 5 1 0 22 5 + 7 5
= √
308 −22 22 0 0 22 5 − 7 5
√
−1 5 − 7 5 −10
= √ √ ,
14 5 −22 −5 − 7 5
√
0 0 −1 −5 − 7 5 10
B2 = R−1 R= √ √
0 1 14 5 22 5−7 5
and we get the canonical decomposition of [k]N = B as
I2 = B1 + B2
√ √
1+ 5 1− 5
[k]N = B1 + B2 .
2 2
Refer to Fig. 2.73 and try to use the above decomposition to explain
geometrically how k maps ∆ 0 v1v2 onto ∆ 0 v2 (
v1 +
v2 ).
230 The Two-Dimensional Real Vector Space R2
(c) Model after h and k to map ∆ 0 v2 onto ∆ 0 (−
v1 v1 −
v1 )(− v2 ).
(d) In B = { v1 , v2 }, a linear transformation p: R → R2 has the
2
representation
−3 0
[p]B = .
1 −3
<B>
1. Prove (2.7.74) and interpret its geometric mapping properties graphically.
Read Secs. 3.7.7 and B.12 and try your best to do the exercises there.
e2
−e1 0 e1
J J2 J3 J4
−e2
Fig. 2.89
x1 = 0
x1 = x2
( x1 , x2 )
e2
( x2 , x1 )
(− x2 , x1 )
e1
Fig. 2.90
This basis B and the resulting rational canonical form P AP −1 are exactly
what we want (refer to Sec. B.12). Figure 2.91 illustrates mapping proper-
ties of f or A in the language of the basis B. Note that
v
vA
A A2 A3 A4 = I 2
Fig. 2.91
det(A − tI2 ) = t2 + a1 t + a0 ,
where a21 −4a0 < 0, so that A does not have real eigenvalues (see Remark on
the next page). Then A2 + a1 A + a0 I2 = O. Let the generalized eigenspace
G = {
x ∈ R2 |
x (A2 + a1 A + a0 I2 ) = 0 } = R2 .
v ∈ R2 .
Take any nonzero vector
where
v
P = .
vA
234 The Two-Dimensional Real Vector Space R2
x ]B = (α1 , α2 ) −
[ −−−→ (−α2 , α1 ) −
−−−→
(−a0 α2 , α1 )
0 1 a0 0
−1 0 0 1
translation
1 1along (0, −a α
1 2)
(2.7.75)
See Fig. 2.92 and refer to Secs. 3.7.8 and B.12 for generalized results. Read-
ers should review Example 1 and the explanations associated, including
Fig. 2.91.
(0, −a12) x
(1, 2)
(−a02 , 1 − a12 ) f ( x)
vA (2 , 1)
v
(−2 ,1)
0
(−a02 , 1)
Fig. 2.92
Remark
Even if a21 − 4a0 ≥ 0 so that A has real eigenvalues, (2.7.75) is still valid but
not for every nonzero vector v in R2 . All one needs to do is to choose a vec-
tor v in R2 , which is not an eigenvector of A. Then, {
v,
v A} is linear inde-
pendent and hence forms a basis B for R . In this case, (1) and (2) hold too.
2
Exercises
<A>
1. In N = { e2 }, let f : R2 → R2 be defined by
e1 ,
−2 5
f (
x) =
xA where
x = (x1 , x2 ) and A = .
4 −3
(a) Model after Example 1 to justify (2.7.75).
(b) Do the same problem as in Ex. <A> 3(a) of Sec. 2.7.6.
2.8 Affine Transformations 235
<B>
Then, call the set X an n-dimensional affine space with the vector space
V as its difference space. (2.8.1)
−
Since P P = 0 , hence any single point in X can be considered as the zero
−
vector or base point. Also, call P Q a position vector with initial point P
and terminal point Q or a free vector since law of parallel invariance holds
236 The Two-Dimensional Real Vector Space R2
T (
x ) = T ( x −
x0 ) + f ( x0 ), x ∈ R2 .
T (
y 0 ) = T ( y0 −
x0 ) + f ( x0 ) and
T(x) = T (x0 ) + f(x −
y0 + y0 − x0 )
=
T ( x0 ) + f ( y0 − x0 ) + f ( x −
y0 )
= T (y0 ) + f ( x − y0 ), x ∈ R .
2
In particular,
T (
x ) = T ( 0 ) + f (
x ), x ∈ R2 .
(2.8.5)
In case T ( 0 ) = 0 , the affine transformation reduces to a linear isomor-
phism. In order to emphasize the “one-to-one and onto” properties, an
affine transformation is usually called an affine motion in geometry.
The composite function of two affine transformations is again affine. To
see this, let T1 (
x ) = T1 ( 0 ) + f1 (
x ) and T2 (
x ) = T2 ( 0 ) + f2 (
x ) be two
affine transformations. Thus
(T2 ◦ T1 )(
x ) = T2 (T1 (
x ))
= T2 ( 0 ) + f2 (T1 ( 0 ) + f1 (
x ))
= T2 ( 0 ) + f2 (T1 ( 0 )) + f2 (f1 (
x ))
= (T2 ◦ T1 )( 0 ) + (f2 ◦ f1 )(
x ), x ∈ R2
(2.8.6)
with the prerequisite that the composite function of two linear isomorphisms
is isomorphic, which can be easily verified.
By the very definitions, the inverse transformation f −1 : R2 → R2 of
f is isomorphic and the inverse transformation T −1 : R2 → R2 is affine too.
238 The Two-Dimensional Real Vector Space R2
In fact, we have
T −1 (
x ) = −f −1 (T ( 0 )) + f −1 (
x ), x ∈ R2
(2.8.7)
−1 −1
and T ◦ T(x) = T ◦ T
( x ) = x for any x ∈ R .
2
Summarize as the
Affine group on the plane
The set of all affine transformations on the plane
Ga (2; R) = { x) |
x0 + f ( x0 ∈ R2 and f : R2 → R2 is a linear isomorphism}
forms a group under the composite operation (see Sec. A.2) with
I: R2 → R2 defined by I(
x) =
x
as identity element and
−f −1 (
x0 ) + f −1 (
x ), x ∈ R2
[f ]B = AB B −1
N [f ]N (AN ) = A0 [f ]N A−1 −1
0 = A0 AA0
and
−1
[T (
a0 )]B = [ a0 )]N AN
x0 + f (
B = ( x0 + a0 A)A0
= ( a0 A)A−1
x0 + −1
0 − a0 A0 ,
where a0 A−1 N
0 = [ a0 ]N AB = [ a0 ]B = 0 .
Notice that change of coordinates stated in (2.4.2) is a special kind of
affine motions.
Using the natural affine basis N = { 0 , e2 }, an affine transformation
e1 ,
T ( x ) = x0 + xA (see (2.8.5)) is decomposed as
x →
y →
xA (keeping 0 fixed) = x0 +
y, (2.8.12)
2.8 Affine Transformations 241
in R2 .
x0 be any fixed point in R2 and T : R2 → R2 be an affine
To see this, let
transformation. Then
T (
x ) = T ( x −
x0 ) + f ( x0 )
x0 ) −
= T ( x0 + ( x −
x0 + f ( x0 ))
x0 ) −
= T ( x0 + T
x0 ( x ), (2.8.14)
T = f2 ◦ f1 ,
where f2 ( x0 ) −
x ) = [T ( x0 ] + x0 ) −
x is the translation along T ( x0 , while
f1 (
x) = x ) − T (
x0 + (T ( x0 )): R2 → R2 is an affine transformation keeping
x0 fixed. (2.8.15)
Finally, we state
Example In R2 , let
a0 = (1, 2),
a1 = (1, −1),
a2 = (0, 1) and
b0 = (−2, −3), b1 = (3, −4), b2 = (−5, 1).
(a) Construct affine mappings T1 , T2 and T such that
T1 ( 0 ) =
a0 , T1 (
e1 ) =
a1 , T1 (
e2 ) =
a2 ;
T2 ( 0 ) = b0 , T2 (
e1 ) = b1 , T2 (
e2 ) = b2 , and
T (
ai ) = bi , i = 0, 1, 2.
(b) Show that T = T2 ◦ T1−1 .
x0 = (−2, −2). Express T as f2 ◦ f1 where f1 is a linear isomor-
(c) Let
phism keeping
x0 fixed while f2 is a suitable translation.
a0
b2
a2 = e2
e1
0
a1
x0
b0
b1
Fig. 2.93
(a) By computation,
a1 −
a0 = (1, −1) − (1, 2) = (0, −3) = 0
e1 − 3
e2 ,
a2 −
a0 = (0, 1) − (1, 2) = (−1, −1) = −
e1 −
e2 , and
a0 = T1 ( 0 ) = (1, 2) = e1 + 2
e2 .
2.8 Affine Transformations 243
or in coordinate form, if
x = (x1 , x2 ) and
y = T1 (
x ) = (y1 , y2 ),
y1 = 1 − x2 ,
y2 = 2 − 3x1 − x2 .
Similarly,
b1 − b0 = (3, −4) − (−2, −3) = (5, −1),
b2 − b0 = (−5, 1) − (−2, −3) = (−3, 4) and
b0 = T2 ( 0 ) = (−2, −3).
a1 −
f ( a0 ) = f (0, −3) = f (0
e1 − 3 e1 ) − 3f (
e2 ) = 0f ( e2 )
= −3f (
e2 ) = b1 − b0 = (5, −1)
5 1
⇒ f (
e2 ) = − , ;
3 3
f (a2 − a0 ) = f (−1, −1) = −f ( e1 ) − f (
e2 ) = b2 − b0 = (−3, 4)
5 1 14 13
⇒ f ( e1 ) = −f ( e2 ) − (−3, 4) =
,− − (−3, 4) = ,− .
3 3 3 3
Therefore,
14
− 13
T ( x ) = (−2 −3) + [ x − (1 2)]
3 3
, or
− 53 1
3
14 5 10 14 5
y1 = −2 + (x1 − 1) − (x2 − 2) = − + x1 − x2 ,
3 3 3 3 3
13 1 2 13 1
y2 = −3 − (x1 − 1) + (x2 − 2) = − x1 + x2 .
3 3 3 3 3
244 The Two-Dimensional Real Vector Space R2
Exercises
<A>
1. In R2 , let
a0 = (−1, 2), a1 = (5, −3),
a2 = (2, 1);
b0 = (−3, −2), b1 = (4, −1), b2 = (−2, 6).
2.8 Affine Transformations 245
(a) Show that B = { a0 , a2 } and C = { b0 , b1 , b2 } are affine bases
a1 ,
for R .
2
(b) Find the affine transformation T mapping ai onto bi , i = 0, 1, 2.
Express T in matrix forms with respect to B and C, and with respect
to the natural affine basis N , respectively.
(c) Suppose T is the affine transformation with matrix representation
1 5 2 −1
T (x1 , x2 ) = − + (x1 x2 )
2 4 5 6
S = {A ∈ GL(2; R) | |det A| = 1}
S1 = {A ∈ S | det A = 1}
S(B) = S(C).
246 The Two-Dimensional Real Vector Space R2
4. Two bases B = { a2 } and C = { b1 , b2 } for R2 are said to belong to
a1 ,
the same class if there exists an A ∈ S+ (defined in Ex. 2(a)) such that
bi = ai = bi A, i = 1, 2. Thus, all the bases for R2 are divided,
ai A or
with respect to the subgroup S+ , into two classes. Two bases are said
to have same orientation if they belong to the same class, opposite
orientations otherwise. The bases of one of the two classes are said to
be positively orientated or right-handed ; then, the bases of the other
class are said to be negatively oriented or left-handed. R2 together with
a definite class of bases is said to be oriented.
(Note This acts as formal definitions for the so-called anticlockwise
direction and clockwise direction mentioned occasionally in the text,
say in Fig. 2.3 and (2.7.9).)
5. Subgroup of affine transformations keeping a point fixed
Let x0 ∈ R2 be a point. An affine transformation of the form
T (
x) = x −
x0 + ( x0 )A,
T (
x0 ) =
x0 .
See Fig. 2.94. All such transformations form a subgroup of Ga (2; R),
and is group isomorphic to
A 0
A ∈ GL(2; R)
0 1
x0 + ( x − x0 ) A
x
x − x0
x0
Fig. 2.94
(Note This is the reason why the real general linear group GL(2; R)
can be treated as a subgroup of the affine group Ga (2; R).)
2.8 Affine Transformations 247
x x0 + (x − x0)
x0 x − x0
Fig. 2.95
7. Subgroup of translations
The set of translations
T (
x) =
x0 +
x,
x0 ∈ R2
forms a subgroup of Ga (2; R). See Fig. 2.96. This subgroup is group
isomorphic to the group
I2 0
x ∈ R 2
.
x0 1
0
x0 + x
x0
x
0
Fig. 2.96
x0 + ( x − x 0 )
y0 − x0 y0
x0
Fig. 2.97
x1 b2 − a1
y1 b2
a2 b2 0
b1
x → x − a1 b1 − a1 y1 0
b1 0
0 a1 y = f ( x − a1 ) w = y + b1
a3 x2 b3 − a1 y2 b3
b3
y2
Fig. 2.98
2.8.2 Examples
This subsection will concentrate on how to construct some elementary affine
transformations and their matrix representations in a suitable basis. It will
be beneficial to compare the content with those in Sec. 2.7.2.
R2 is a vector space, and is also an affine plane as well. Occasionally, we
need planar Euclidean concepts such as lengths, angles and areas as learned
in high school courses. Please refer to the Introduction and Natural Inner
Product in Part Two, if needed.
An affine transformation T that maps a point (x1 , x2 ) onto a point
(y1 , y2 ) is of the form,
y1 = a11 x1 + a21 x2 + b1
(2.8.18)
y2 = a12 x1 + a22 x2 + b2
with the coefficient determinant
a11 a12
∆= = a11 a22 − a12 a21 = 0;
a21 a22
while, in the present matrix notation,
a11 a12
y =
x0 +
x A,
x0 = (b1 , b2 ) and A = (2.8.19)
a21 a22
with det A = 0, where x = (x1 , x2 ) and
y = (y1 , y2 ) and
y = T (
x ). The
latter is the one we obtained in (2.8.9) or (2.8.10), where B = C could be
any affine basis for R2 , via the geometric method stated in (2.8.4).
2.8 Affine Transformations 251
Remark In case
∆ = det A = 0, (2.8.20)
the associated transformation is called a singular affine transformation,
otherwise nonsingular. The affine transformations throughout the text will
always mean nonsingular ones unless specified otherwise.
1 0
[T (x )]B = [
x ]B [T ]B , [T ]B = x ∈ R2 .
, (2.8.22)
0 −1
Notice that [T ( a0 )]B = [
a0 ]B = 0 . What is the equation of T in
N = { 0 , e1 , e2 }? There are two ways to obtain this equation.
252 The Two-Dimensional Real Vector Space R2
x
a2
X
A2 a0
A1
P a0
x − a0
O T(x)
X′ a 2 − a0 e1
a1 − a0
(a) e2
0
(b)
Fig. 2.99
= −[ 0 ]B AB B N B
N + [ 0 ]B [T ]B AN + [ x ]N AB [T ]B AN .
But [ a0 ]N AN
a0 ]B = [ 0 ]B + [ B = 0 implies that
−[ 0 ]B AB N B
N = [ a0 ]N AB AN = [ a0 ]N = a0 .
Therefore,
T ( a0 ]N AN
a0 − [
x) = B N B
B [T ]B AN + [ x ]N AB [T ]B AN
=
a0 + ( a0 )AN
x − B
B [T ]B AN (2.8.23)
is the required equation of T in terms of N .
a1 −
The other way is to displace the vectors a2 −
a0 , x −
a0 and a0 to
the origin 0 . See Fig. 2.99(b). Then, observe each of the following steps:
x −
a0 (a vector in N = { e2 })
e1 ,
x −
⇒ ( a0 )AN
B = [ x ]B x −
(the coordinate of a0 in
a1 −
B = { a2 −
a0 , a0 })
⇒ ( a0 )AN
x − B [T ]B (the reflection in B = {
a1 − a2 −
a0 , a0 })
a0 )AN
x −
⇒ ( B
B [T ]B AN (the coordinate in N = { e2 })
e1 ,
⇒ a0 )AN
x −
a0 + ( B
B [T ]B AN (the reflection in affine basis
B = {
a0 , a2 })
a1 , (2.8.24)
and this is the required T (
x ).
2.8 Affine Transformations 253
Example 1 Let a0 = (2, 2), a1 = (4, 1) and a2 = (1, 3). Determine the
reflection along the direction a2 −
a0 = (−1, 1) with a0 + a1 −
a0 =
a0 + (2, −1) as the line of invariant points.
X ′(k < 0)
A1
P
O
X ′(k > 0)
X
A2
Fig. 2.100
a2 −
In case a0 is perpendicular to a1 − a0 , T is called an orthogonal
one-way stretch with axis a0 + a1 − a0 .
(1) A one-way stretch preserves all the properties listed in (1) of (2.7.9),
(2) but it enlarges the area by the scalar factor
|k|
and preserves the orientation if k < 0 and reverses the orientation if
k < 0. (2.8.28)
(2.8.26) and (2.8.27) are still good for one-way stretch, of course, subject
to some minor charges.
To test the affine transformation (2.8.26) to see if it is a one-way stretch,
follow the following steps:
1. Compute the eigenvalues of A. If A has eigenvalues 1 and k = 1, then T
x (I2 − A) =
represents a one-way stretch if x0 has a solution.
2. Compute an eigenvector v1 corresponding to 1, then
x (I2 − A) =
x0 or
1
x0 +
v1
1−k
is the line of invariant points (i.e. the axis).
3. Compute an eigenvector v2 corresponding to k, and then
v2 is the
direction of the stretch. In particular,
1
x0 or x0
1−k
x0 = 0 .
is a direction (up to a nonzero scalar) if (2.8.29)
2.8 Affine Transformations 257
2. Solve
5 1
6 6
x (A − I2 ) = (x1
x2 ) = (0 0),
5 1
6 6
1
− (5, 1) + (1, −1) = (−2, 1) + (3, −3)
6
is the axis as expected.
3. Solve
− 16 1
6
x (A − 2I2 ) = (x1 x2 ) = (0 0)
5
6 − 56
Q1. What is the image of the line connecting (0, 3) and (1, 0) under T ?
Where do these two lines intersect?
Q2. What is the image of the triangle ∆ b1 b2 b3 , where b1 = (0, 3),
b2 = (−4, 0) and b3 = (1, −1), under T ? What are the areas of these
two triangles?
Thus, the image line of the original line 3x1 + x2 = 3 has the equation,
in N ,
5x1 − x2 − 13 = 0.
These two lines intersect at the point (2, −3) which lies on the axis
(−2, 1) + (3, −3). Is this fact accidental or universally true for any line
and its image line under T ?
2.8 Affine Transformations 259
b1′
b1
axis
a2
a0
0 (1,0)
b2
b2′
b3 b3′
a1
(2, −3)
Fig. 2.101
and get the eigenvectors v = t(1, −1) for t ∈ R and t = 0. The axis of
stretching is
1 2 1
· − (5, 1) + (1, −1) = − (5, 1) + (1, −1)
1 − (−3) 3 6
= (−2, 1) + (3, −3).
3. Solve
2
3 − 23
x (B + 3I2 ) = (x1 x2 ) = (0 0)
− 10
3
10
3
Q1. Compute
2 1 −7 −2 1
T (0, 3) = − (5, 1) + (0, 3) · = (−40, 1),
3 3 −10 1 3
2 1 −7 −2 1
T (0, 1) = − (5, 1) + (1, 0) · = (−17, −4).
3 3 −10 1 3
Q2. It is known that T ( b1 ) = b1 = 13 (−40, 1). Compute
2 1 −7 −2
T ( b2 ) = − (5, 1) + (−4, 0) · = (6, 2) = b2 ,
3 3 −10 1
2 1 −7 −2 1
T ( b3 ) = − (5, 1) + (1, −1) · = (−7, −5) = b3 .
3 3 −10 1 3
Then ∆ b1 b2 b3 has the signed area
1 b2 − b1 1 1 58 5 57
det = · =−
−6
2 b 3 − b1 2 9 33 2
the signed area of ∆ b1 b2 b3
⇒ = −3 = det B.
the area of ∆ b1 b2 b3
Notice that T reverses the orientation of ∆ b1 b2 b3 . Hope that the readers
will be able to give a graphical illustration just like Fig. 2.101.
Case 4 Two-way stretch
This is a combination of two one-way stretches whose lines of invariant
points intersect at one point, the only invariant point. There are no invariant
lines at all if the scale factors are distinct.
As an easy consequence of (2.8.28), we have
The two-way stretch
Let a0 , a2 be three distinct non-collinear points in R2 . The two-way
a1 and
stretch T , with a0 the only invariant point, which has scale factor k1 along
a1 −
a0 and scale k2 along a2 − a0 , has the following representations:
1. In the affine basis B = {
a0 , a2 },
a1 ,
k1 0 k 0 1 0
[T (
x )]B = [
x ]B [T ]B , where [T ]B = = 1 ,
0 k2 0 1 0 k2
where k1 = k2 and k1 k2 = 0.
2. In N = { 0 , e2 },
e1 ,
a1 −
a0
T(x) =
a0 + (x −
a0 )A−1 [T ]B A,
where A = .
a2 −
a0 2×2
a1 −
In case a2 −
a0 is perpendicular to a0 , T is called an orthogonal
two-way stretch.
(1) A two-way stretch preserves all the properties listed in (1) of (2.7.9),
262 The Two-Dimensional Real Vector Space R2
a2
T(x), k1> 0, k2 > 0
x
a1
Fig. 2.102
(2.8.29) has a counterpart for a two-way stretch. The details are left to
the readers. We go directly to an example.
and get the eigenvectors v2 = t(2, −1) for t ∈ R and t = 0. Then, any
v2 is another direction of stretch with scale factor −2.
such vector
4. To find the only invariant point, solve
x = (−2, −2) +
xA
x (A − I2 ) = (2, 2)
⇒
−1 1
1
− 53
⇒ x = (2 2)(A − I2 )
= (2 2) · − 3
= (1 1)
6 − 16
3 − 43
The square with vertices a0 = (1, 1), (−1, 1), (−1, −1), and (1, −1), in
counterclockwise ordering, is mapped onto the parallelogram with vertices
a0 = (1, 1), T (−1, 1) = 13 (5, −7), T (−1, −1) = (−5, −5) and T (1, −1) =
a0 a1
(−1, 1)
a2
0
T (1, −1) ( −1, −1)
(1, −1)
T (−1, 1)
T ( −1, −1)
Fig. 2.103
R′
X X′
R
Y Y′
S′ A
S
Q
O A
T
P
O (a) (b)
Fig. 2.104
2.8 Affine Transformations 265
∆T XX is similar to ∆T Y Y .
XX YY
⇒ =
TX TY
XX YY
⇒ = .
the distance of X to OA the distance of Y to OA
This means that any point Y in the rectangle PQRS moves, parallel to
the line OA, a distance with a fixed direct proportion to its distance to the
line OA.
Suppose every point X in the plane moves in this same manner, i.e.
T (X) = X
a0 a1 a0 a1
x
x
T(x) = (1 + k2)a1 + 2a2 T(x) = (1 + k2)a1 + 2a2
Fig. 2.105
x ∈ R2 |
E = { x}
xA =
has dimension one (see (1)3. in (2.7.73)). Take any eigenvector v of unit
length and take any a0 as a solution of x (I2 − A) = x0 . Then a0 +
v
x (I2 − A) =
or x0 itself is the axis of the shearing.
3. Take any u , of unit length and perpendicular to uA −
v , and then u=
k v holds and the scalar k is the coefficient. (2.8.33)
Example 4 Let a0 = (−2, −2) and a1 = (2, 1). Construct the shearings
with axis a0 + a1 −
a0 and coefficients k = 2 and k = −1, respectively.
268 The Two-Dimensional Real Vector Space R2
a1 −
Solution a0 = (4, 3) and its length |
a1 −
a0 | = 5. Let
1
v1 (4, 3),
=
5
1
v2 = (−3, 4).
5
Then B = { a0 ,
a0 + v1 ,
a0 + v2 } is an orthonormal affine basis for R2 .
In B, the required affine transformation is
1 0
[T (x )]B = [ x ]B [T ]B , where [T ]B = .
k 1
While, in N = { 0 , e2 },
e1 ,
−1
v 1 0 v1
T ( x ) = (−2, −2) + [ x − (−2, −2)] 1
v2 k 1 v2
=
x0 +
x A,
where
1 25 − 12k −9k
A=
25 16k 25 + 12k
1 1 −18
25 32 if k = 2;
49
=
1 37 9
if k = −1,
25 −16 13
x0 = (−2, −2) − (−2, −2)A = (−2, −2)(I2 − A)
4
(4, 3) if k = 2;
25
=
− 2 (4, 3) if k = −1.
25
Now, we treat the converse problems.
Suppose T (
x) = x0 +
x A if k = 2. Notice the following steps:
1. The characteristic polynomial is
1−25t
5 − 18
25
det[A − tI2 ] = det = t2 − 2t + 1 = (t − 1)2 .
32 49−25t
5 25
which reduces to 3x1 − 4x2 = 2. Then any point, say (−2, −2), on the
line 3x1 − 4x2 = 2, is an invariant point. Therefore, (−2, −2) + v1 ,
which is 3x1 − 4x2 = 2, is the axis.
3. Take a vector v2 such that |v2 | = 1 and
v2 ⊥ v1 , say
v2 = 15 (−3, 4).
Then
1 1 1 −18
v2 A (−3, 4) ·
= = (1, 2)
5 25 32 49
1 1
v2 A −
⇒ v2 = (1, 2) − (−3, 4) = (8, 6) = 2
v1 .
5 5
These two lines intersect at (2, 1) which lies on axis 3x1 − 4x2 = 2. This is
absolutely not accidental, but why?
270 The Two-Dimensional Real Vector Space R2
T ( −1,1)
T (1,1) axis
( −1,1) (1,1)
T ( −1, −1) a1
0
(− 1, −1) (1, −1)
v2
v
a0 1 T (1, −1)
Fig. 2.106
axis
( −1,1) (1,1) a1
0 T (1, −1)
T ( −1,1)
(1, −1)
v2 v1
a0
Fig. 2.107
For any point X in the plane, rotate the line segment OX, with center
at O, through an angle θ to reach the line segment OX . Define a mapping
T : R2 → R2 by
T (X) = X .
This T is an affine transformation and is called the rotation with center
at O of the plane through angle θ. It is counterclockwise if θ > 0, clockwise
if θ < 0. See Fig. 2.108.
X′
X X
B B A X′
A
O O
Fig. 2.108
To linearize T , let O =
a0 , A = a2 in N = { 0 ,
a1 and B = e2 } and
e1 ,
v1 = a1 − a0 = (α11 , α12 ) and v2 = a2 − a0 = (α21 , α22 ). Then
|
v1 | = |
v2 | = 1 and
v1 ⊥
v2 .
Now, B = {
a0 , a2 } is an affine orthonormal basis for R2 . For
a1 , x ∈ R2 , let
x −
a0 = α1
v1 + α2
v2 or [
x ]B = (α1 , α2 ),
x) −
T ( a0 = β1
v1 + β2
v2 or [T (
x )]B = (β1 , β2 ).
272 The Two-Dimensional Real Vector Space R2
β1 = α1 cos θ − α2 sin θ
β2 = α1 sin θ + α2 cos θ
cos θ sin θ
⇒ (β1 β2 ) = (α1 α2 ) .
− sin θ cos θ
We summarize as
The rotation
Let
a0 ,
a1 and
a2 be three distinct non-collinear points so that
|
a1 −
a0 | = |
a2 −
a0 | = 1 (in length), and
(
a1 −
a0 ) ⊥ (
a2 −
a0 ) (perpendicularity).
In N = { 0 , e2 }, to test if a given affine transformation
e1 ,
T (
x) =
x0 +
xA
Notice that A does not have real eigenvalues if θ = 0, π. The readers should
consult Ex. <B> 5 of Sec. 2.7.2 for a more general setting.
Do you know how to find the equation of the image of the unit circle
x21 + x22 = 1 under T ?
Case 7 Reflection again (Euclidean notations are needed)
√
For the given
x0 = (−2, 2 3), solve
√
1 3
2 − 2 √
x (I2 − A) = x √
= (−2, 2 3), where
x = (x1 , x2 )
− 23 3
2
√
and the axis is x1 − 3x2 = −4. Of course, any point on it, say (−4, 0),
can be used as a0 +
a0 and the axis is the same as v1 . See Fig. 2.109.
x
a0 + 〈〈v1〉〉, axis for T
a0 v2
v1
(− 4,0) x0
0 e1
xA
Fig. 2.109
x → T (
We can reinterpret the transformation x ) as follows.
orthogonal reflection 1 0
x = (x1 , x2 ) −−−−−−−−−− −−−→ x
with axis e 1 0 −1
√
1 3
rotation with 1 0 2 2
= (x1 , −x2 ) −−−−−−− −−−−−−−−−−−→ x √
center at 0 and through 60◦ 0 −1 − 3 1
2 2
√
1 3
x √
2 2
=
2
3
− 12
translation
x A −−−−−−−−−−−−−
= √−→
x0 +
x A = T (
x ).
along
x 0 = (−2, 2 3)
T ( x) a0 + 〈〈v1〉〉
〈〈v1〉〉
x0
x
a0 v2
v1 xA
60o
( x1 , − x2 )
Fig. 2.110
|
x −
x0 | = 1. Formally, we prove this as follows.
x =
x, xx ∗ = x21 + x22 = 1
⇒ (T ( x0 )A−1 [(T (
x) − x0 )A−1 ]∗ = 1
x) −
x0 )A−1 (A−1 )∗ (T (
x) −
⇒ (T ( x0 )∗ = 1
x) −
⇒ (Since A−1 (A−1 )∗ = A∗ A = A−1 A = I2 )(T (
x) − x0 )∗ = 1
x) −
x0 )(T (
x) −
⇒ |T ( x0 | = 1 if |
x | = 1.
x0 a0 −
= a0 (I2 − B).
a0 B =
This implies explicitly why 3 in (2.8.27) and (2.8.29) should hold. But this
is not the case in Example 6 where T ( x) =
x0 + xA is given, independent
of the form a0 + ( x − a0 )B. That is why we have to recapture
a0 from
x0
in (2.8.36).
Therefore, we summarize the following steps to test if a given transfor-
mation, in N = { 0 , e2 },
e1 ,
T (
x) =
x0 +
xA
θ
Then the eigenvectors corresponding to 1 are v = tei 2 = t cos θ2 , sin θ2
θ
u = tiei 2 = t −sin θ2 , cos θ2 for
for t ∈ R and t = 0, these to −1 are
t ∈ R and t = 0.
3. To find the axis, suppose the axis passes the point
a0 . Then, solve
a0 =
x0 +
a0 A, or
x0 =
a0 (I2 − A)
to get a0 +
a0 and v , the axis of the reflection. In fact, this axis has
x (I2 −A) in N . Note that
the equation x0 = x0 should be an eigenvector
corresponding to −1. (2.8.37)
Steps similar to these should replace those stated in (2.8.27) and (2.8.29)
once T (x) = x0 + x A is given beforehand, independent of being derived
x −
from T ( x ) = a0 + ( a0 )B.
Let A = [aij ]2×2 be an invertible matrix. We have the following
elementary matrix factorization for A (see (2.7.68)).
0, and
Case 2 a11 = 0, then a12 a21 =
a22
0 1 1 a12 a21 0
A= .
1 0 0 1 0 a12
Case 1 shows that an affine transformation T (
x) =
x0 +
x A in
N = {0, e2 } is the composite of
e1 ,
a21
e1 and coefficient
(a) a shearing: axis a11
,
a11 a12
(b) a shearing: axis
e2 and coefficient det A ,
(c) a two-way stretch: along e1 with scale factor a11 and along
e2
det A
with scale factor a11 , and
(d) a translation: along x0 ; (2.8.38)
while Case 2 shows that it is the composite of
(a) a reflection: axis e1 + e2 ,
a22
(b) a shearing: axis e2 and coefficient a ,
12
(c) a two-way stretch: along e1 with scale factor a21 and along
e2
with scale factor a12 , and
(d) a translation: along x0 . (2.8.39)
278 The Two-Dimensional Real Vector Space R2
Exercises
<A>
1. Prove (2.8.27).
2. Let
1 13 −8
T (
x) =
x0 +
x A, where A =
11 6 −13
be an affine transformation in N = { 0 , e2 }.
e1 ,
(a) Determine which x0 will guarantee that T is a reflection in a certain
affine basis.
(b) Find an affine basis in which T is a reflection and determine the
direction of the reflection.
(c) Find the image of the line 4x1 + x2 = c, where c is a constant.
(d) Find the image of the line x1 + x2 = 1. Where do they intersect?
(e) Find the image of the unit circle x21 + x22 = 1 and illustrate it
graphically.
3. Let
4 3
5 5
T(x) = x0 + x A, where A = .
3
5 − 45
Do the same problems as in Ex. 2.
4. Prove (2.8.29).
5. Let k be a nonzero constant and k = 1. Let
−3k + 4 6k − 6
T (
x) =
x0 +
x A, where A =
−2k + 2 4k − 3
be an affine transformation in N = { 0 , e2 }.
e1 ,
(a) Determine all these
x0 so that T is a one-way stretch in a certain
affine basis.
2.8 Affine Transformations 279
(b) For these x0 in (a), determine the axis and the direction of the
associated one-way stretch.
(c) Show that a line, not parallel to the direction, and its image under
T will always intersect at a point lying on the axis.
(d) Find the image of the ellipse x21 + 2x22 = 1 under T if k = 3.
(e) Find the image of the hyperbola x21 − 2x22 = 1 under T if k = −2.
(f) Find the image of the parabola x21 = x2 under T if k = 12 .
6. In N = { 0 , e2 }, let
e1 ,
T (
x) =
x0 +
x A,
T (
x ) = (−2, 3) +
xA.
(a) Find the image of the straight line x1 + x2 = 1 under T via the
following methods and graph it and its image.
(1) By direct computation.
(2) By steps indicated in (2.8.38).
(3) By diagonalization of A and try to use (2) of (2.7.72).
(b) Do the same problems as in (a) for the square |x1 | + |x2 | = 2.
(c) Do the same problems as in (a) for the circle x21 + x22 = 1.
(d) Justify (1) of (2.7.9).
2.8 Affine Transformations 281
14. Let
0 5
A= .
−3 4
Decompose A as A = A1 A2 A3 , where
0 1 1 45 −3 0
A1 = , A2 = , A3 =
1 0 0 1 0 5
and consider the affine transformation, in N = { 0 , e2 },
e1 ,
x ) = (3, −2) +
T ( x A.
e2
a2
a1
e1
0 0
(a) (b)
Fig. 2.111
T =g◦f =f ◦g
is called a hyperbolic rotation, See Fig. 2.112.
x
a2
T ( x)
a0
a1
Fig. 2.112
(c) Suppose | a1 − a0 | = | a2 −
a0 | = 1 and ( a1 − a0 ) ⊥ (a2 − a0 ),
the resulting T is called a hyperbolic (orthogonal ) rotation. Let
[
x ]B = (α1 , α2 ). Then a hyperbola with the lines a0 +a1 −
a0 and
a0 + a2 − a0 as asymptotes has equation α1 α2 = c, where c is
a constant. Let [T ( x )]B = (α1 , α2 ). Then α1 = k1 α1 and α2 = kα2
and hence
α1 α2 = α1 α2 = c,
which indicates that the image point T ( x ) still lies on the same
hyperbola α1 α2 = c. Therefore, such hyperbolas are invariant
under the group mentioned in (b). In particular, the asymptotes
a0 +
a1 −
a0 and
a0 +
a2 −
a0 are invariant lines with a0 as
the only point of invariant.
2. Elliptic rotation
Take two different points a0 and a1 in R2 . Let k be a positive constant.
Denote by f the orthogonal one-way stretch with axis a0 + a1 −
a0
and scale factor k, and by g the rotation with center at a0 and angle θ.
The composite
T = f −1 ◦ g ◦ f
is called the elliptic (orthogonal ) rotation with a0 + a1 − a0 as axis
and k and θ as parameters. See Fig. 2.113. Suppose | a1 −
a0 | = 1. Take
another point a2 so that |a2 −
a0 | = 1 and (a2 − a0 ) ⊥ (a1 − a0 ).
g ( f ( x))
f ( x)
T ( x) x
a0 a1
Fig. 2.113
Then B = {
a0 , a2 } is an affine orthonormal basis for R2 .
a1 ,
(a) Show that
[T (
x )]B = [
x ]B [T ]B ,
2.8 Affine Transformations 285
where
1 0 cos θ sin θ 1 0
[T ]B = [f ]B [g]B [f ]−1
B=
0 k − sin θ cos θ 0 1
k
1
cos θ k sin θ .
=
−k sin θ cos θ
Affine invariants
Affine transformation T (x) = x0 +
x A, where A2×2 is an invertible matrix,
preserves 1–7 but not necessarily preserves a–d in (1) of (2.7.9). (2.8.40)
xi = a2 −
a1 + ti ( a1 ), i = 1, 2, 3, 4.
The directed segments
x1 x2 x2 −
= x1 = (t2 − t1 )(
a2 −
a1 ), and
x3 x4 =
x4 −
x3 = (t4 − t3 )(
a2 −
a1 )
have their respective image under T the directed segment
T (
x1 )T ( x2 ) − T (
x2 ) = T ( x1 ) = (t2 − t1 )(
a2 −
a1 )A
T (
x3 )T ( x4 ) − T (
x4 ) = T ( x3 ) = (t4 − t3 )(
a2 −
a1 )A.
x3 =
Therefore (refer to (1.4.5)), in case x4 , it follows that
T (x1 )T (x2 ) t2 − t1
x1 x2
= = (ratios of signed lengths),
T ( x3 )T ( x4 ) t4 − t 3 x3 x4
which proves the claim. If x2 and
x1 x3
x4 lie on different but parallel lines,
the same proof will work by using 2 in (2.5.9).
For 4, refer to (2.5.12) and Fig. 2.26. By 2, it is easy to see that the
image of ∆ a1 a3 under T is the triangle ∆ b1 b2 b3 where bi = T (
a2 ai ) for
i = 1, 2, 3. Let a be any fixed interior point of ∆ a1 a2 a3 (one might refer
to Fig. 2.31 for a precise definition of interior points of a triangle). Extend
the line segment a to meet the opposite side
a3 a2 at the point
a1 a4 .
Then, a4 is an interior point of a1 a2 . By 2 or 3, T ( a4 ) is an interior point
of the side b1 b2 . Since the line segment a1
a4 is mapped onto the line
segment b3 T ( a4 ), the interior of a implies that T (
a ) is an interior point
of b3 T (
a4 ). See Fig. 2.114. Hence T (a ) is an interior point of the triangle
∆ b1 b2 b3 .
a3
a1
a4 a2
Fig. 2.114
a11 a12
A=
a21 a22
2.8 Affine Transformations 289
a1
e2
a2
f ( x) = xA or
a1
0 e1 0 a2
0 (det A> 0) (det A<0)
Fig. 2.115
Exercises
<A>
1. Prove (2.8.40) by using (2.8.38) and (2.8.39).
2. Prove (2.8.44) by using (2.8.38) and (2.8.39).
290 The Two-Dimensional Real Vector Space R2
3. On the sides a2 ,
a1 a3 , and
a2 a1 of a triangle ∆
a3 a1
a2
a3 , pick three
points b3 , b1 and b2 respectively, so that the directed segments a1 b3 =
3 b3 a2 , a2 b1 = b1 a3 and a3 b2 = 2 b2
a1 .
a1
b2
a1 '
b3
a3 '
a2 '
a2 a3
b1
Fig. 2.116
a1
b3
b2
b1
a3
a2
Fig. 2.117
(a) Compute
the area of ∆ b1 b2 b3
.
the area of ∆a1
a2
a3
(b) (Menelaus Theorem) The three points b1 , b2 and b3 are collinear
if and only if
αβγ
= −1.
(1 − α)(1 − β)(1 − γ)
(c) What happens to (b) in case b2 b3 is parallel to
a2
a3 ?
292 The Two-Dimensional Real Vector Space R2
point
y3 on the line x1 +
x1 − x2 so that
y1 ,
y2 and y3 do not coincide
with the vertices of ∆ x1 x2 x3 . Then
y1 , y2 and y3 are collinear.
⇔ (
x2 y1 : x3 )(
y1 y2 :
x3 x1 )(
y2 y3 :
x1 x2 ) = −1,
y3
2.8 Affine Transformations 293
and
y1 , y2 y3 are collinear.
⇔ (1 − α)−1 α(1 − β)−1 β(1 − γ)−1 γ = −1. (2.8.45)
See Fig. 2.118.
x1
y3
γ
y2
y3 x1
y2
1− γ
y1
x3 x2 y1
x3
x2
Fig. 2.118
Proof (For another proof, see Ex. <B> 3 of Sec. 2.8.3.) According to the
assumptions, we have
y1 = (1 − α)
x2 + α
x3 , α = 0, 1
y2 = (1 − β)x3 + βx1 , β = 0, 1
y3 = (1 −
γ) x1
+ γ x2 , γ = 0, 1. (∗1 )
x1 +
means geometrically. To see this, notice that the lines x2 −
x1 and
y1 + y2 − y1 are parallel to each other and they never meet at a point
in R2 within our sight. Suppose there were a point where these two lines
meet. By comparing the third equation in (∗1 ) to (∗2 ), there exists some
scalar t ∈ R such that
1 − γ = −tαβ
γ = t(1 − α)(1 − β)
x1
y3 = ∞
x2
x3
y1
y3 = ∞ y2
Fig. 2.119
Remark (∗4 ) and (∗5 ) are related to the well-known theorem: in a triangle
∆
x1
x2
x3 , the line segments (see Fig. 2.120)
y1 y2
x1
x2
x3 y2 x3 y1
⇔ = (ratio of signed lengths),
y2 x1 y1 x2
which is a special case of Menelaus Theorem.
By the way, in case the points y1 ,
y2 and
y3 all do not lie between the
vertices x1 , x2 and x3 or just two of them lie between the vertices, the
Menelaus Theorem is known also as Pasch axiom.
296 The Two-Dimensional Real Vector Space R2
x3
y2 y1
x1 x2
Fig. 2.120
y3 y2 y3
z x3 y1
y1 x3 x2 y2 z
x2
Fig. 2.121
Proof (For other proofs, see Ex. <B> 2 of Sec. 2.8.3 and Ex. <A> 1.)
Use Menelaus Theorem twice as follows. In ∆ x1
x2
y2 ,
x3 , z , y3 are collinear.
⇔ (
y2 x3 : x1 )(
x3 y3 :
x1 x2 )(
y3 x2
z :
zy2 ) = −1;
while in ∆
x2
y2
x3 ,
x1 , z , y1 are collinear.
⇔ (
x2 y1 : x3 )(
y1 x1 :
x3 y2 )(
x1 y2
z :
zx2 ) = −1.
By noting that z lies on the line segment
x2
y2 , combine together these
two results and finish the proof.
segments y3 and
y2 z3 meet at
z2 x1 , y1 and
y3 z1 meet at
z3 x2 ,
y1
y2 and
z1 z2 meet at x3 . Then
x1 , x2 and
x3 are collinear. (2.8.47)
x0
y1
y3
x1 y2
x3
z2
x2
z1 z3
Fig. 2.122
Similarly,
y2 − (1 − γ)
(1 − β) y3 = −β z3 = (γ − β)
z2 + γ x1 , γ − β = 0 (∗3 )
y3 − (1 − α)
(1 − γ) y1 = −γ z1 = (α − γ)
z 3 + α x2 , α − γ = 0. (∗4 )
(γ − β) + (α − γ) + (β − α) = 0.
Thus,
x1 ,
x2 and
x3 are collinear.
Exercises
<A>
(1 − t1 ) x1 = (1 − t1 )(1 − α)
y 1 + t1 x2 + (1 − t1 )α
x3 + t1
x1 .
y2 +
Find similar expressions for points on the lines x2 −
y2 and
+ x3 − y3 .
y3
(b) The three lines in (a) meet at a point z if and only if the three
expressions in (a) for z will be coincident.
2. (One of the main features of affine geometry) In Fig. 2.123, show that
XY BC ⇔ V is the midpoint of BC, by the following two methods.
(a) Use Ceva Theorem.
(b) Take {A, B, C} as an affine basis with A as base point (see Sec. 2.6).
Let A = (0, 0), B = (1, 0), C = (0, 1) and X = (α, 0), 0 < α < 1
and Y = (0, β), 0 < β < 1. Try to find out the affine coordinates of
− − −
U and V and show that α = β ⇔ 2AV = AB + AC.
2.8 Affine Transformations 299
Y
V
A U
X B
Fig. 2.123
x1 x1
x4
x2 x4
x2
y1
x3 x3
y1
z1
y2
y2 z2
Fig. 2.124
x0
y1 y2
y3
y4
z4
z3
z2
z1
Fig. 2.125
In case (y3 ,
y4 ; y2 ) = −1, these four points form a set of harmonic
y1 ,
points as in Ex. 3.
(Note In Fig. 2.125, if the line passing yi ’s is parallel to the line passing
zi ’s, to what extent can this problem be simplified both in statement
and in proof?)
<B>
Try to review problem sets in Ex. <B> of Sec. 2.6 and use ideas, such as
coordinate triangle and homogeneous area coordinate, introduced there to
reprove Menelaus, Ceva and Desargues Theorems and Exs. <A> 1–4.
<C> Abstraction and generalization
Try to extend all the results, including Ex. <A>, to n-dimensional affine
space over a field F.
x21 x22
1. Ellipse + = 1.
a21 a22
x2 x2
2. Imaginary ellipse 21 + 22 = −1.
a1 a2
x21 x2
3. Two intersecting imaginary lines or point ellipse 2 + 22 = 0.
a1 a2
x21 x22
4. Hyperbola − = 1.
a21 a22
x2 x2
5. Two intersecting lines 21 − 22 = 0.
a1 a2
6. Parabola x22 = 2ax1 .
7. Two parallel lines x21 = a2 .
8. Two imaginary parallel lines x21 = −a2 .
9. Two coincident lines x21 = 0. (2.8.49)
0 0 (a1, 0)
(a1, 0) 0
point ellipse hyperbola
ellipse
(a1, a2)
(a, 0) (−a, 0) (a, 0)
0 0 0 0
two intersecting lines parabola two parallel lines two coincident lines
Fig. 2.126
304 The Two-Dimensional Real Vector Space R2
and
2
b, x∗ =
x = b bi xi .
i=1
Also, let
∗ b11 b12 b1
B b
∆= with det ∆ = b12 b22 b2 ;
b b b b2 b
1
b b12
det B = 11 = b11 b22 − b212 ; and
b12 b22
tr B = b11 + b22 .
is, respectively,
8. Two imaginary parallel lines ⇔ In case b11 = 0 (or b22 = 0), det B = 0,
det ∆ = 0 and
b11 b1 b22 b2
= b11 b − b21 > 0 or > 0 .
b1 b b2 b
The quantity det B = b11 b22 − b212 is called the discriminant for quadratic
curves. Sometimes, ellipse, hyperbola and parabola (note that in these cases,
the rank of ∆ is equal to 3) are called non-degenerated conics or irreducible
quadratic curves, while the others are called degenerated or reducible (note
that, the rank r(∆) = 2 for two intersecting lines and r(∆) = 1 for type 7
and type 9). Therefore, in short, for irreducible curves, it is
1
c = ( x + y ). (2.8.53)
2
According to this definition, among the standard forms in (2.8.49), ellipse,
hyperbola and two intersecting lines all have center at 0 , two parallel lines
have every point on the line equidistant from both lines as its centers, two
coincident lines have every point on it as its centers, while parabola does
not have center. A non-degenerated conic with a (unique) center is called
a central-conic. There are only two such curves: ellipse and hyperbola and
the criterion for this is det B = 0.
Remark
Instead of the conventional methods shown from (∗1 ) to (∗11 ), one can
employ effectively the techniques developed in linear algebra (even up to
this point in this text) to give (2.8.49) and (2.8.52) a more concise, system-
atic proof that can be generalized easily to three-dimensional quadrics or
even higher-dimensional ones. This linearly algebraic method mainly con-
tains the following essentials:
Note This definition is not good for types 2 and 8. We have compensated
this deficiency by introducing the algebraic criteria in (2.8.52) instead of
the geometric (and intuitive) criteria in (2.8.50).
As a consequence of (2.8.40), from (2.8.50) it follows easily that
quadratic curves of different types in (2.8.49), except types 2 and 8, cannot
be affinely equivalent.
However, quadratic curves of the same type in (2.8.49) indeed are
affinely equivalent in the sense of (2.8.57).
For example, let γ1 and γ2 be two arbitrary ellipses on the plane R2 .
After suitable translation and rotation (see Sec. 2.8.2), one can transform
γ2 into a new location so that its center coincides with that of γ1 and its
major axis lies on that of γ1 . Use γ2∗ to denote this new-located ellipse.
Choose the center as the origin of a Cartesian coordinate system and the
common major axis as x1 -axis, then γ1 and γ2∗ can be expressed as
x21 x2
γ1 : 2 + 22 = 1,
a1 b1
2
x x2
γ2∗ : 21 + 22 = 1.
a2 b2
Then, the affine transformation
a2 b2
y1 = x1 , y2 = x2 (∗12 )
a1 b1
will transform γ1 onto γ2∗ . This means that γ1 and γ2∗ , and hence γ1 and γ2
are affinely equivalent.
For two imaginary ellipses
x21 x22 x21 x22
+ = −1, + = −1
a21 b21 a22 b22
or point ellipses or hyperbolas, (∗12 ) still works.
For two parabolas
x22 = 2a1 x1 , a1 = 0 and x22 = 2a2 x2 , a2 = 0,
the affine transformation
a1
y1 = x1 , y2 = x2 (∗13 )
a2
will do.
Readers definitely can handle the remaining types in (2.8.49), except
types 2 and 8.
2.8 Affine Transformations 309
We summarize as
We arrange Ex. <A> 3 for readers to prove this result by using (2.8.52)
and by observing (2.8.59) below.
Let us come back to (2.8.56) and compute the following quantities (refer
to (2.8.52)):
Since A is invertible and thus det A = 0, we note that det ABA∗ and
det B, det ∆ and det ∆ all have the same signs.
The implication of tr B upon tr (ABA∗ ) is less obvious. A partial result
is derived as follows. Suppose det B > 0 and tr B > 0 hold. Then
Summarize as
The affine invariants of quadratic curves
For a quadratic curve
x, x + b = 0,
x B + 2 b ,
are affine invariants. In case det B and tr B are positive, then the positive-
ness of
tr B
Later in Sec. 4.10, we are going to prove that these three quantities are
Euclidean invariants.
Exercises
<A>
<B>
(y1 , y2 ) →
v0 + (y1 , y2 )A, v 0 ∈ R2
where
Note that this will reduce to the one stated in (2.8.10) or (2.8.11)
by putting x3 = y3 = 1.
(c) Then x,
x ∆ = x ∗ = 0 under the affine transformation
x ∆
becomes
∗
∗ ∗
A 0 A 0 ∗
y ∆ y = 0.
v0 1 v0 1
312 The Two-Dimensional Real Vector Space R2
or, by putting y3 = 1,
(d) Let
cos θ sin θ
A=P = , and
− sin θ cos θ
∗ ∗ ∗ b11 b12 b1
A 0 A 0
∆ = b12 b22 b2 .
v0 1 v0 1
b1 b2 b
x21 + x22 = 1
y0 = 0 and A∗ = A−1 . Hence, A is
i.e. an orthogonal matrix where
cos θ sin θ cos θ sin θ
A= or
− sin θ cos θ sin θ − cos θ
x1 x2 = 1
x2 = ay2 + b, or
2
a 0
x = (b2 , b) +
y , a = 0, b ∈ R.
2ab a
All such transformations form a subgroup of two parameters a and
b of Ga (2; R).
(b) Take a = 1. The set
1 0
(b2 , b) +
x b ∈ R and
x ∈ R 2
2b 1
forms a transitive one-parameter subgroup of Ga (2; R), whose mem-
ber is called a parabolic translation.
(c) The linear part of a parabolic translation
1 0
2b 1
is a shearing (refer to Ex. 6 in Sec. 2.7.2 and (2.8.32)) and the x1 -axis
e1 is its line of invariant points.
2.8 Affine Transformations 315
is classified to be
1. an elliptic type ⇔ (algebraic) det B > 0.
⇔ (geometric) containing no infinite point.
These contain types 1, 2, 3.
2. a hyperbolic type ⇔ det B < 0.
⇔ containing two infinite points.
These contain types 4, 5.
3. a parabolic type ⇔ det B = 0.
⇔ containing one infinite point.
These contain types 6, 7, 8, 9.
Introduction
In our real world, there does exist a point lying outside a fixed given plane.
For example, a lamp (considered as a point) hanging over a desk (considered
as a plane) is such a case.
Figure 3.1 shows that one point R is not on the plane Σ. The family
of the straight lines connecting R to all arbitrary points in Σ are consid-
ered, in imagination, to form a so-called three-dimensional space, physically
inhabited by the human being.
Σ
Q
O
P
Fig. 3.1
Here and only in this chapter, the term “space” will always mean three-
dimensional as postulated above. Usually, a parallelepiped including its
interior is considered as a symbolic graph of a space Γ (see Fig. 3.2).
One should be familiar with the following basic facts about space Γ.
(1) Γ contains uncountably many points.
(2) Γ contains the line generated by any two different points in it, the
plane generated by any three non-collinear points in it, and coincides with
the space determined by any four different non-coplanar points in it.
319
320 The Three-Dimensional Real Vector Space R3
Fig. 3.2
Fig. 3.3
−
A directed segment P Q in space Γ is called a (space) vector, considered
identical when both have the same length and the same direction just as
stated in Sec. 2.1. And the most important of all, space vectors satisfy all the
properties as listed in (2.1.11) (refer to Remark 2.1 in Sec. 2.1). Hereafter, in
this section, we will feel free to use these operational properties, if necessary.
What we need is to note the following facts:
1. α ∈ R and x is a space vector ⇒ α
x and
x are collinear vectors in the
space Γ.
2.
x and y are space vectors ⇒ x +
y and
x,
y are coplanar vectors in
the space Γ.
Sketch of the Content 321
3.
x,
y and z are space vectors ⇒
x +
y +
z and
x,
y,
z may be either
coplanar or non-coplanar.
See Fig. 3.4.
x+y+z z
x
x+y
z
x y x+y+z y y
x x x
collinear coplanar coplanar non-coplanar
Fig. 3.4
Section 3.7.6: the limit process and the matrices; matrix exponentials;
Markov processes; homogeneous linear system of differential
equations; the nth order linear ordinary differential equation
with constant coefficients.
Section 3.7.7: differential equations.
x3 a3 P
A3
x2 a2
A2
O Q
x1 a1 A1
Σ(O; A1, A2)
L (O; A3 )
Fig. 3.5
x3
a3 + Σ(O; A1 , A2 ) (3.1.1)
(see Fig. 3.6). Then, let x3 run through all the reals, the family of parallel
planes (3.1.1) will fill the whole space Γ.
We summarize as (corresponding to (2.2.2))
be space vectors.
324 The Three-Dimensional Real Vector Space R3
a3
a2
Σ(O; A1 , A2 ) 0
a1
Fig. 3.6
⇔ (2) (algebraic) Take any one of the five points as a base point and con-
−
struct four vectors, for example, bi = OB i , 1 ≤ i ≤ 4. Then, at least
one of the four vectors b1 , b2 , b3 , b4 can be expressed as a linear
combination of the other three, i.e. b4 = y1 b1 + y2 b2 + y3 b3 , etc.
⇔ (3) (algebraic) There exist scalars y1 , y2 , y3 , y4 , not all zero, such that
y1 b1 + y2 b2 + y3 b3 + y4 b4 = 0 .
In any of these cases, b1 , b2 , b3 and b4 are said to be linearly dependent.
(3.1.3)
Also, corresponding to (2.2.4), we have
Linear independence of nonzero vectors in space
The following are equivalent.
It is observed that any one or any two vectors out of three linearly inde-
pendent vectors must be linearly independent, too.
Exercises
<A>
(c) If b1 , b2 and b3 are linearly independent, is it true that b1 and b2
are linearly independent? Why?
(d) Is it possible that any three vectors out of b1 , b2 , b3 and b4 are
linearly independent? Why?
The only main difference between Γ(O; A1 , A2 , A3 ) and R3Γ(O;A1 ,A2 ,A3 ) is in
notations used to represent them, even though, the former might be more
concrete than the latter (see Fig. 3.7).
P ( x1 , x2 , x3 )
A3 (0, 0,1)
OP
A2
a3 (0,1, 0)
Φ
a2
O (0, 0, 0)
a1
A1 (1, 0, 0)
Γ(O; A1, A2, A3) R Γ(O;
3
A1, A2, A3)
Fig. 3.7
A3
A3
e3
e3 A2
O A2 e2
0
0 e2
O
A1 A1
e1 e1
Fig. 3.8
Unless otherwise specified, from now on, R3 will always be endowed with
rectangular coordinate system with N = { e1 , e3 } as natural basis.
e2 ,
Elements in R are usually denoted by x = (x1 , x2 , x3 ),
3
y = (y1 , y2 , y3 ),
etc. They represent two kinds of concept as follows.
Case 1 (affine point of view) When R3 is considered as a space, element
x −
x is called a point and two points decide a unique vector y −
y or x
with x − x = 0.
Case 2 (vector point of view) When considered as a vector space, element
x in R3 is called a vector, pointing from the zero vector 0 toward the
point
x.
(3.2.4)
See Fig. 3.9. In short, in Case 1, an arbitrary fixed point can be used as a
base point in order to study the position vectors of the other points relative
3.2 Coordinatization of a Space: R3 329
to the base point. If the base point is considered as zero vector 0 , then
Case 1 turns into Case 2.
x x
y −x
0 y 0
Case 1 Case 2
Fig. 3.9
Exercises
<A>
1. Explain (3.2.1) graphically and prove 1, 2 in (3).
2. Any two vectorized spaces Γ(O; A1 , A2 , A3 ) and Γ(O ; B1 , B2 , B3 ) of a
space Γ is isomorphic to each other. Prove this and try to explain it
graphically. Indeed, there are infinitely many isomorphisms between
them, why?
3. Explain and prove the equivalence of linear dependence as stated in
(3.1.3) and linear independence as stated in (3.1.4), respectively, for R3 .
4. Explain (3.2.5) just like (2.3.5).
5. Explain (3.2.6) just like (2.3.6).
330 The Three-Dimensional Real Vector Space R3
where x i and
x j are the remaining two vectors of x1 ,
x2 ,
x3 and
x4 ? In how many ways?
(Note To do (b), you need algebraic computation to justify the claim
there. Once this procedure has been finished, can you figure out any
geometric intuition on which formal but algebraic proof for (c) will rely?
For generalization, see Steinitz’s Replacement Theorem in Sec. B.3.)
16. Find scalar k so that the vectors (k, 1, 0), (1, k, 1) and (0, 1, k) are
linearly dependent.
17. Find
the necessary
and sufficient conditions so that the vectors
2 2 2
1, a1 , a1 , 1, a2 , a2 and 1, a3 , a3 are linearly dependent.
<B>
Review the comments in Ex. <C> of Sec. 2.3 and then try to extend prob-
lems in Ex. <B> there to counterparts in R3 (or Fn or more abstract vector
space, if possible) and prove them true or false.
1. Suppose
x1 , xk for k ≥ 2 are linearly dependent vectors in R3 .
x2 , . . . ,
Let x be an arbitrary vector in R3 .
(a) If a1 x2 + · · · + ak
x1 + a2 xk = 0 for some scalars a1 , a2 , . . . , ak , show
that either a1 a2 · · · ak = 0 or a1 = a2 = · · · = ak = 0.
(b) In case a1 a2 · · · ak = 0 and b1 x2 + · · · + bk
x1 + b2 xk = 0 also holds,
then a1 : b1 = a2 : b2 = · · · = ak : bk .
5. Do Ex. <C>6 of Sec. 2.3 in case n = 3 and V = R3 .
M(n; C)
(2n2 )
@
@
SL(n; R) E11 , iE11 SL(n; iR)
(n2 − 1) (2) (n2 − 1)
@@ @
@
SL(n; R) ⊕ E11 SL(n; iR) ⊕ iE11
M(n; R) M(n; iR)
(n2 ) (n2 )
@
@
S(n; R) T (n; R)
n(n + 1) n(n − 1)
2 2
6. Find dimensions and bases for real vector spaces:
SU(n; C) = {A ∈ M(n; C) | Ā∗ = −A (Skew-Hermitian) and tr A = 0};
SH(n; C) = {A ∈ M(n; C) | Ā∗ = A (Hermitian) and tr A = 0}.
For details about matrix and determinant, please refer to Secs. B.4–B.6.
Perhaps, it might be easy for the readers to prove statements in Ex. <A>
2 of Sec. 2.4. But by exactly the same method you had experienced there,
is it easy for you to prove the extended results in (3.3.2)? Where are the
3.3 Changes of Coordinates: Affine Transformation (or Mapping) 337
difficulties one might encounter? Can one find any easier way to prove
them? Refer to Ex. <B> 3.
At least, for this moment, (3.3.2) is helpful in the computation of the
Coordinate changes of two coordinate systems in R3
Let
−
Γ(O; A1 , A2 , A3 ) with basis B = { a1 , a3 },
a2 , ai = OAi , 1 i 3, and
−−
Γ(O ; B1 , B2 , B3 ) with basis B = { b1 , b2 , b3 }, bi = O B i , 1 i 3
be two coordinate systems of R3 . Then the coordinates [P ]B and [P ]B of
the point P in R3 have the following formulas of changes of coordinates
(called affine transformations or mappings)
3
xi = αi + yj αji , i = 1, 2, 3, and
j=1
3
yj = βj + xi βij , j = 1, 2, 3,
i=1
simply denoted by
[P ]B = [O ]B + [P ]B AB
B , and
[P ]B = [O]B + [P ]B AB
B ,
where [O ]B = (α1 , α2 , α3 ), [O]B = (β1 , β2 , β3 ), [P ]B = (x1 , x2 , x3 ), [P ]B =
(y1 , y2 , y3 ) and
b1 B
α11 α12 α13
AB
=
b = α α22 α23 is called the transition matrix of
B 2 B 21
α α32 α33
b3 B 31
B with respect to B
satisfying:
1. The determinants
α11 α12 α13 β11 β12 β13
det AB = α21 α22 α23 = 0; = β21 β23 = 0.
B
det AB
B β22
α α32 α33 β β32 β33
31 31
338 The Three-Dimensional Real Vector Space R3
2. The matrices AB B
B and AB are invertible to each other, i.e.
1 0 0
B
AB B B
B AB = AB AB = I3 = 0 1 0
0 0 1
and therefore (see (3) in (3.3.2))
* +−1 B −1
AB B = AB B ; AB = AB
B .
B3
O′
O
A2
A1 B1
Fig. 3.10
Remark The computations of AB B
B and AB (extending (2.4.3)).
Adopt the notations in (3.3.3) and use the results in (3.3.2) to help
computation.
Let
ai = (ai1 , ai2 , ai3 ) and bi = (bi1 , bi2 , bi3 ), i = 1, 2, 3, also
a1 a11 a12 a13
A = a2 = a21 a22 a23 and
a3 a31 a32 a33
b1 b11 b12 b13
B = b2 = b21 b22 b23 .
b3 b31 b32 b33
Then A and B are invertible (see (3) in (3.3.2)).
3.3 Changes of Coordinates: Affine Transformation (or Mapping) 339
By assumption, [ bi ]B = (αi1 , αi2 , αi3 ), i = 1, 2, 3, and then
bi = αi1
a1 + αi2
a2 + αi3
a3
a1
= (αi1 αi2 αi3 ) a2
a3
= [ bi ]B A (remember that [ bi ]B is viewed as a 1 × 3 matrix)
⇒ [ bi ]B = b i A−1 , i = 1, 2, 3.
Similarly,
[ a i B −1 ,
ai ]B = i = 1, 2, 3.
Therefore,
b1 A−1 b1
−1 −1
AB
B = b2 A = b2 A = BA−1 , and
−1
b3 A b3
−1
a1 B a1
AB
B = a2 B −1 =
a2 B −1 = AB −1 (3.3.4)
−1
a3 B a3
are the required formulas.
−
a2 = OA2 = (0, 1, 1) − (1, 0, 0) = (−1, 1, 1),
−
a3 = OA3 = (1, 0, 1) − (1, 0, 0) = (0, 0, 1),
−−
b1 = O B 1 = (1, −1, 1) − (−1, −1, 1) = (2, 0, 2),
−−
b2 = O B 2 = (−1, 1, 1) − (−1, −1, −1) = (0, 2, 2),
−−
b3 = O B 3 = (1, 1, 1) − (−1, −1, −1) = (2, 2, 2), and
−− −−
OO = (−1, −1, −1) − (1, 0, 0) = (−2, −1, −1) = −O O.
340 The Three-Dimensional Real Vector Space R3
Then, let
a1 0 1 0 b1 2 0 2
A =
a2 = −1 1 1 and B = b2 = 0 2 2
a3 0 0 1 b3 2 2 2
1 −1 1
A−1 = 1 0 0,
0 0 1
0 −1 1
0 4 −4 2 2
1
B −1 =− 4 0 −4 = − 12 0 1
2 .
8
−4 −4 4 1 1
− 12
2 2
Then,
2 0 2 1 −1 1 2 −2 4
AB
B = BA
−1
= 0 2 21 0 0 = 2 0 2,
2 2 2 0 0 1 4 −2 4
0 −21 1 1
−2 0 1
0 1 0 2 2
1
AB
B = AB
−1
= −1 1 1 − 12 0 2
= 0 1 − 12 .
0 0 1 1 1
− 12 1 1
− 12
2 2 2 2
2 −2 4
[P ]B = (−3 2 −3) + [P ]B 2 0 2 and
4 −2 4
1 1
−2 0 2
1
[P ]B = 0 − 1 + [P ]B 0 1 − 12 .
2
1
2
1
2 − 12
3.3 Changes of Coordinates: Affine Transformation (or Mapping) 341
Exercises
<A>
to B , satisfying
x1 = −6 + y1 − y2 + 2y3 ,
x = 5 + y2 − y 3 ,
2
x3 = 2 + y1 + y2 + y3 .
4 8 1
(c) y1 = −15 + x1 + 2x2 + x3 , (d) y1 = 4 + x1 − x2 − x3 ,
3 3 3
1 1 1
y2 = −7 + x1 − x2 + 2x3 , y2 = −3 + x1 + x2 − x3 ,
3 3 3
1 2 1
y3 = 30 − x1 ; y3 = −6 − x1 + x2 + x3 .
3 3 3
<B>
(a) Show that the kernel Ker(A) and the range Im(A) are subspaces
of R3 .
(b) Show that dim Ker(A) + dim Im(A) = dim R3 = 3.
(c) Show that the following are equivalent.
(1) A is one-to-one, i.e. Ker(A) = { 0 }.
(2) A is onto, i.e. Im(A) = R3 .
(3) A maps every or a basis B = { x1 , x3 } for R3 onto a basis
x2 ,
{ x1 A, x2 A, x3 A} for R .
3
(4) A is invertible.
(d) Let B = { x1 , x3 } be a basis for R3 and
x2 , y1 ,
y2 ,
y3 be any vec-
tors in R . Show that there exists a unique linear transformation
3
f : R3 → R3 so that
f (
xi ) =
yi for 1 ≤ i ≤ 3.
(e) Let S be the subspace x1 + 2x2 + 3x3 = 0 in R3 . Show that there are
infinitely many linear transformations f : R3 → R3 with the following
respective property:
(1) Ker(f ) = S.
(2) Im(f ) = S.
Is there any linear transformation f : R3 → R3 such that Ker(f ) =
Im(f ) = S? Try to explain your claim geometrically and analytically.
(f) Let S be the subspace in R3 defined by
−x1 + 2x2 + 3x3 = 0, 5x1 + 2x2 − x3 = 0.
Do the same problem as in (e).
(g) Let S1 be the subspace as in (e) and S2 be the subspace as in
(f). Show that there are infinitely many linear transformations
f : R3 → R3 with the following respective property:
(1) Ker(f ) = S1 and Im(f ) = S2 .
(2) Ker(f ) = S2 and Im(f ) = S1 .
5. Let S1 ⊆ R2 be a subspace and S2 ⊆ R3 be a subspace.
(a) Find all possible linear transformations f : R2 → R3 with the follow-
ing property, respectively:
(1) f is one-to-one.
(2) Ker(f ) = S1 .
(3) Im(f ) = S2 . Be careful about the case that dim S2 = 3.
(4) Ker(f ) = S1 and Im(f ) = S2 .
3.4 Lines in Space 345
Therefore,
x2 = 0, x3 = 0 (3.4.1)
is called the equation of the first coordinate axis L1 with respect to B
(see Fig. 3.11). Similarly,
x1 = 0, x3 = 0,
(3.4.2)
x1 = 0, x2 = 0,
are, respectively, called the equation of the second and the third coordinate
axes L2 and L3 (see Fig. 3.11).
L3
(0, 0, x3 )
A3 L2
A2
(0, x2 , 0)
O
( x1 , 0, 0)
A1
P L
1
Fig. 3.11
These three coordinate axes, all together, separate the whole space R3
into 23 = 8 quadrants, according to the positive and negative signs of
components x1 , x2 , x3 of the coordinate [P ]B , P ∈ R3 (see Sec. 2.5).
By exactly the same way as we did in Sec. 2.5, one has
Equations of a line with respect to a fixed coordinate system in R3
The straight line L determined by two different points A and B in R3 has
the following ways of representation in Γ(O; A1 , A2 , A3 ) with basis B.
(1) Parametric equation in vector form
− −
L passes the point
a = OA with the direction b = AB, and hence has
the equation
x = t ∈ R,
a +tb,
−
−
where
x = OX is the position vector of a point X on L with respect
to O and is viewed as a point in R3 .
(2) Parametric equation with respect to basis B
[
x ]B = [
a ]B + t[ b ]B , t∈R
or, let [
a ]B = (a1 , a2 , a3 ), [ b ]B = (b1 , b2 , b3 ) and [
x ]B = (x1 , x2 , x3 ),
x1 = a1 + tb1 ,
x2 = a2 + tb2 ,
x3 = a3 + tb3 .
3.4 Lines in Space 347
a3 A
b
a B
x X
0
O L
a2
a1
Fig. 3.12
Example (continued from the example in Sec. 3.3) Find the equa-
tions of the straight line determined by the points A = (1, 2, 3)
and B = (−2, 1, −1) in the coordinate systems Γ(O; A1 , A2 , A3 ) and
Γ(O ; B1 , B2 , B3 ), respectively.
−
a = OA = (1, 2, 3) − (1, 0, 0) = (0, 2, 3), [
a ]B = (2, 0, 3), and
−
b = AB = (−2, 1, −1) − (1, 2, 3) = (−3, −1, −4), [ b ]B = (−4, 3, −7).
348 The Three-Dimensional Real Vector Space R3
Let X be a moving point on the line and [X]B = (x1 , x2 , x3 ). Then, the
equation of the line is
By using (3.4.4) and the results obtained in the example of Sec. 3.3,
we are able to deduce the equation of the line in the coordinate system
Γ(O ; B1 , B2 , B3 ) from that in Γ(O; A1 , A2 , A3 ) as follows.
1
−2 0 1
2
1
(y1 , y2 , y3 ) = 0 − 1 + {(2 0 3) + t(−4 3 − 7)} 0 1 − 12
2 1
2
1
2 − 12
1 1 3 3 1 1
= 0 − 1 + − t − t −
2 2 2 2 2 2
1 3 1 1
= − t 1− t ,
2 2 2 2
which is identical with the above result.
3.4 Lines in Space 349
L1 L1 L2 L1
L2
L2
L1 = L2
Fig. 3.13
Exercises
<A>
1. Prove (3.4.3) in detail.
2. Prove (3.4.5) in detail.
3.3) Let L√be the line in R deter-
3
3. (continued from Ex. <A> 3 of Sec.
mined by the points A = − 2 , 0, 3 and B = ( 2, 1, 6).
1
Σ ( A; B, C )
X x − aA
b2
C
b1
B a3
a
x a2
0
O
a1
Fig. 3.14
−
a = OA (viewed as a point in R3 ),
−
b1 = AB,
−
b2 = AC (both viewed as direction vectors in R3 ),
−−
x = OX (X, a moving point in R3 and
x viewed as a point).
Notice that b1 and b2 are linearly independent.
352 The Three-Dimensional Real Vector Space R3
− −
Σ passes through the point a = OA with directions b1 = AB and
− −−
x = OX is viewed as a point in Σ for X ∈ Σ.
b2 = AC, while
(2) Parametric equation in coordinates
[
x ]B = [a ]B + t1 [ b1 ]B + t2 [ b2 ]B
[ b1 ]B
= [ a ]B + (t1 t2 ) , t1 , t2 ∈ R.
[ b2 ]B
Or, let [ a ]B = (a1 , a2 , a3 ), [ bi ]B = (bi1 , bi2 , bi3 ), i = 1, 2, and [
x ]B =
(x1 , x2 , x3 ), then
2
xj = aj + bij ti , t1 , t2 ∈ R, j = 1, 2, 3.
i=1
or simplified as
αx1 + βx2 + γx3 + δ = 0. (3.5.3)
Example (continued from the example in Sec. 3.3) Find the equations
of the plane determined by the points A = (1, 2, 3), B = (−2, 1, −1) and
C = (0, −1, 4), respectively, in Γ(O; A1 , A2 , A3 ) and Γ(O ; B1 , B2 , B3 ).
Solution One might refer to the example in Sec. 3.4 and results there.
In the coordinate system Γ(O; A1 , A2 , A3 ),
−
a = OA = (0, 2, 3), [ a ]B = (2, 0, 3);
−
b1 = AB = (−3, −1, −4), [ b1 ]B = (−4, 3, −7);
−
b2 = AC = (0, −1, 4) − (1, 2, 3) = (−1, −3, 1), [ b2 ]B = (−4, 1, 0).
−−
For point X in the plane, let [ x ]B = [OX]B = (x1 , x2 , x3 ) and then, the
equation is
(x1 , x2 , x3 ) = (2, 0, 3) + t1 (−4, 3, −7) + t2 (−4, 1, 0), t1 , t2 ∈ R
x1 = 2 − 4t1 − 4t2 ,
⇒ x2 = 3t1 + t2 , t1 , t2 ∈ R
x3 = 3 − 7t1 ,
⇒ 7x1 + 28x2 − 8x3 + 10 = 0.
In the coordinate system Γ(O ; B1 , B2 , B3 ),
−− 1 1
a = O A = (2, 3, 4), [ a ]B = , 1, ;
2 2
− 3 1
b1 = AB = (−3, −1, −4), [ b1 ]B = − , − , 0 ;
2 2
− 5
b2 = AC = (−1, −3, 1), [ b2 ]B = 2, 1, − .
2
354 The Three-Dimensional Real Vector Space R3
L
Σ Σ Σ
Fig. 3.15
Also is the
(3.5.6)
See Fig. 3.16.
Σ1
Σ1
Σ2
Σ2
Σ1 = Σ2
Fig. 3.16
x = a2 −
a1 + t( a1 ) = (1 − t)
a1 + t
a2 , t ∈ R. (3.5.7)
356 The Three-Dimensional Real Vector Space R3
a1
a4
a2
a3
Fig. 3.17
Exercises
<A>
1. Prove (3.5.3) in detail.
2. Prove (3.5.5) in detail.
3. Prove (3.5.6) in detail.
3.5 Planes in Space 357
4. (continued
√ from Ex. <A> 3 of Sec. 3.3) Let A = (− 12 , 0, 3),
B = ( 2, 1, 6) and C = (0, 1, 9) be three given points in R3 .
− − −
(a) Show that OA, OB and OC are linearly independent, where
O = (0, 0, 0). Hence A, B and C determine a unique plane Σ.
(b) Find equations of Σ in Γ(O; A1 , A2 , A3 ) and Γ(O ; B1 , B2 , B3 )
respectively.
(c) Use (3.5.4) to check results obtained in (b).
5. In the natural rectangular coordinate system Γ( 0 ;
e1 ,
e2 ,
e3 ), the
following equations
x1 − x2 + 2x3 − 3 = 0,
−2x1 + 3x2 + 4x3 + 12 = 0,
6x1 − 8x2 + 5x3 − 7 = 0
represent, respectively, three different planes in R3 .
(a) Show that the planes intersect at a point. Find out the coordinate
of the point.
(b) Try to choose a new coordinate system Γ(O; A1 , A2 , A3 ), in which
the above three planes are the coordinate planes.
6. According to natural coordinate system Γ( 0 ; e1 ,
e2 ,
e3 ), a line and a
plane are given, respectively, as
x1 − 1 x2 + 1 x3
L: = = , and
2 −3 5
Σ: 2x1 + x2 + x3 − 4 = 0.
(a) Show that the line L and the plane Σ intersect at a point. Find out
the coordinate of this point.
(b) Construct another coordinate system Γ(O; A1 , A2 , A3 ) in R3 so
that, according to this new system, the line L is a coordinate axis
and the plane is a coordinate plane. How many choices of such a
new system are possible?
7. Still in Γ( 0 ;
e1 ,
e2 ,
e3 ), a line
x1 − 0 x2 − 2 x3 − 3
L: = =
1 1 2
and two planes
Σ1 : x1 − x2 + x3 = 1,
Σ2 : x1 + 2x2 + x3 = 7
are given.
358 The Three-Dimensional Real Vector Space R3
(a) Show that Σ1 and Σ2 intersect along a line L . Find the equation
of L .
(b) Then, show that L and L intersect at a point. Find the coordinate
of this point.
(c) Construct a new coordinate system Γ(O; A1 , A2 , A3 ) in R3 such
that Σ1 and Σ2 become coordinate planes of the new system which
contains the given line L in its third coordinate plane.
8. Let
x1 − 18x2 + 6x3 − 5 = 0
be the equation of a plane in Γ( 0 ; e1 ,
e2 ,
e3 ). Find its equation in
Γ(O; A1 , A2 , A3 ) as shown in Ex. <A> 4 of Sec. 3.4.
9. (continued from 8) Suppose
<B>
1. Suppose
are scalars such that the vectors (a12 , a22 , a32 ) and (a13 , a23 , a33 ),
(b12 , b22 , b32 ) and (b13 , b23 , b33 ) are linearly independent, respectively.
3.5 Planes in Space 359
Let
x1 = a11 + a12 t1 + a13 t2 , y1 = b11 + b12 t1 + b13 t2 ,
x2 = a21 + a22 t1 + a23 t2 , y2 = b21 + b22 t1 + b23 t2 , t1 , t2 ∈ R.
x3 = a31 + a32 t1 + a33 t2 , y3 = b31 + b32 t1 + b33 t2 ,
Σi : αi x1 + βi x2 + γi x3 + δi = 0, i = 1, 2, 3.
x0 x0 + V
V
0
Fig. 3.18
9. In R3 with natural coordinate system Γ( 0 ; e1 ,
e2 ,
e3 ), one uses
1/2
x | = x21 + x22 + x23
|
to denote the length of a vector
x = (x1 , x2 , x3 ). Then
x |2 = x21 + x22 + x23 = 1
|
is the equation of unit sphere in R3 (see Fig. 3.19).
e3
x
0 e2
e1
Fig. 3.19
(a) Try to find out all the coordinate systems Γ( 0 ; a1 , a3 ) in R3 ,
a2 ,
with basis B = { a1 , a2 , a3 }, so that the unit sphere still has the
equation
y12 + y22 + y32 = 1,
where [ x ]B = (y1 , y2 , y3 ).
(b) Suppose the coordinate system Γ(O; A1 , A2 , A3 ) is the same as in
Ex. <A> 3 of Sec. 3.3. Determine the equation of the unit sphere in
Γ(O; A1 , A2 , A3 ). Then, try to construct a coordinate system
Γ(O; B1 , B2 , B3 ) in which the equation is unchanged. How many
such systems are there?
(c) One views the equation obtained in (b) in the eyes of
Γ( 0 ;
e1 ,
e2 ,
e3 ). Then, what kind of surface does it represent? Try
3.6 Affine and Barycentric Coordinates 361
a3 X
a2
0
a1
Fig. 3.20
In this case,
a0 ,
a1 ,
a2 and
a3 are said to be affinely independent and the
ordered set
B = {
a0 ,
a1 , a3 } or
a2 , {
a1 − a2 −
a0 , a3 −
a0 , a0 } (3.6.1)
a3 x
a2
a0
a3 − a0
a1
a2 − a0
0
a1 − a0
Fig. 3.21
Γ(a0 ;
a1 , a3 ), the set B = {
a2 , a0 ,
a1 , a3 } is an affine basis for R3 .
a2 ,
Note that the rectangular affine basis
N = {0;
e1 , e3 }
e2 , (3.6.2)
= (1 − x1 − x2 − x3 )
a0 + x1
a1 + x2
a2 + x3
a3
= λ0
a0 + λ1
a1 + λ2
a2 + λ3
a3 , (3.6.3)
(
x )B = (λ0 , λ1 , λ2 , λ3 ) where λ0 + λ1 + λ2 + λ3 = 1 (3.6.4)
3.6 Affine and Barycentric Coordinates 363
2 = 3 = 0
a0
a3
a1
(+ , + , + ,+) a2
Fig. 3.22
Notice that ∆ a0
a1
a2
a3 can be easily described as the set of the points
(λ0 , λ1 , λ2 , λ3 ) where λ0 ≥ 0, λ1 ≥ 0, λ2 ≥ 0, λ3 ≥ 0 and λ0 + λ1 +
λ2 + λ3 = 1. In this expression, what does λ3 = 0 mean? How about
λ1 = λ3 = 0?
Let Γ( b0 ; b1 , b2 , b3 ) be another vectorized space in R3 with affine basis
B = { b0 , b1 , b2 , b3 }.
Then, the change of coordinates from Γ( a0 ;
a1 ,
a2 ,
a3 ) to
Γ( b0 ; b1 , b2 , b3 ) is
x − b0 ]B = [
[ a0 − b0 ]B + [ a0 ]B AB
x − B , x ∈ R3
364 The Three-Dimensional Real Vector Space R3
or
or
0
AB 0
(y1 y2 y3 1) = (x1 x2 x3 1)
B , (3.6.6)
0
p1 p2 p3 1 4×4
where
x −
[ a0 ]B = (x1 , x2 , x3 ), y − b0 ]B = (y1 , y2 , y3 ),
[
a0 − b0 ]B = (p1 , p2 , p3 ) and
[
[ a1 − a0 ]B β11 β12 β13
AB B = [
a2 −
a ]
0 B = β 21 β22 β23 is the transition matrix.
[ a3 − a0 ]B β31 β32 β33
Exercises
<A>
<B>
Try to extend problems in Ex. <B> of Sec. 2.6 to R3 . Be careful that, in
some cases, lines in R2 should be replaced by planes in R3 here.
<C> Abstraction and generalization
Try to extend the contents of Ex. <B> to abstract finite-dimensional affine
space over a field F. For partial extension to Rn , see Sec. 5.9.5.
3.7 Linear Transformations (Operators) 365
where, for 1 ≤ i ≤ 3, f (
ei ) = (ai1 , ai2 , ai3 ) is the ith row vector
of A, and
f ( e1 ) a11 a12 a13
A = f ( e2 ) = a21 a22 a23 , which is [f ]N . (3.7.1)
f (e3 ) 3×3 a31 a32 a33
Fig. 3.23
and hence det A = 0 (refer to Ex. <B> 4 of Sec. 3.5). For example,
the point (a21 , −a11 , 0) lying on that plane will result in
a11 a12
a12 a21 − a22 a11 = − = 0, etc.
a21 a22
⇔ (4) (linearly algebraic) A has at least one nonzero row vector, say
a1 = (a11 , a12 , a13 ) and there are scalars α and β such that the
other two row vectors a i = (ai1 , ai2 , ai3 ), i = 2, 3, satisfy
a2 = α
a1 and
a3 = β
a1 .
a1j
⇔ (6) (geometric) In case A∗2 = αA∗1 and A∗3 = βA∗1 where A∗j = a2j
a3j
for j = 1, 2, 3, are column vectors, then the image Im (A) is the
straight line
or x2 = αx1 , x3 = βx1 .
⇔ (7) The rank r(A) = 1. (3.7.5)
For a general setting, let the kernel
Ker(A) = x2
x1 ,
Im(A) =
x3 A
v ,
is a fixed line v = 0, through 0 . It is easy to see that (refer to (3.5.6))
x3 v
A x0 A
x2
Ker(A)
0 x1 0
x3 A
u + 〈〈u1, u2〉〉
Fig. 3.24
3.7 Linear Transformations (Operators) 369
x1 A = 0 = 0
v 1 + 0
v 2 + 0
x3 A,
x2 A = 0 = 0
v 1 + 0
v 2 + 0
x3 A,
x3 A= 0v 1 + 0
v 2 + 1x3 A,
x1 0 0 0 v1
⇒ x2 A = 0 0 0 v2
x3 0 0 1 x3 A
0 0 0
⇒ [A]B B N
B = PN AQB = 0 0 0 ,
where
0 0 1
x1
B
PN = x2 is the transition matrix from B to N , and
x3
−1
v1
QN
B =
v2 is the transition matrix from N to B . (3.7.7)
x3 A
This [A]B
B is the matrix representation of A with respect to bases B and B .
There are infinitely many such B and B that put A into this canonical form.
The quotient space of all affine subspaces (refer to Ex. <B> 8 of Sec. 3.5)
modulus Ker(A)
R3 /Ker(A) = {
x + Ker(A) |
x ∈ R3 } (3.7.8)
is a one-dimensional real vector space which is linearly isomorphic to Im(A).
See Fig. 3.25. Therefore,
dim Ker(A) + dim R3 /Ker(A) = dim R3 = 3.
x1 x1 + Ker(A) Im(A)
x2 (x1 + x2)A
A x2 A
x2 + Ker(A) x1 A
0 Ker(A)
0
( x1 A )
Fig. 3.25
0
0
(a) (b)
Fig. 3.26
3
⇔ (3) (algebraic) In Fig. 3.26(a), suppose the planes i=1 aij xi = 0 for
3
j = 1, 2 coincide while the third plane i=1 ai3 xi = 0 intersects the
former two along a line. Then,
ai1 ai2
det = 0 for 1 ≤ i < j ≤ 3 and at least one of
aj1 aj2
ai1 ai3
det , 1 ≤ i < j ≤ 3, is not equal to zero,
aj1 aj3
a ai1 j2
det i1 j1 , 1 ≤ i1 < i2 ≤ 3, 1 ≤ j1 < j2 ≤ 3
ai2 j1 ai2 j2
3.7 Linear Transformations (Operators) 371
are not equal to zero. Since the intersecting line of the former two
planes lies on the third one, then
a11 a12 a13
det A = a21 a22 a23
a a32 a33
31
a21 a22 a11 a12 a11 a12
= a13
− a23
+ a33 = 0.
a31 a32 a31 a32 a21 a22
In conclusion, det A = 0 and at least one of the subdeterminents of
order 2 is not equal to zero.
⇔ (4) (linearly algebraic) Two row (or column) vectors of A are linearly
independent and the three row (or column) vectors are linearly
dependent, say
α1 A∗1 + α2 A∗2 + α3 A∗3 = 0∗
for scalars α1 , α2 , α3 , not all zero, where A∗j for j = 1, 2, 3 are
column vectors.
⇔ (5) (linearly algebraic)
The row rank of A = the column rank of A = 2.
⇔ (6) (geometric) Adopt notations in (4). The image Im(A) is the plane
α1 x1 + α2 x2 + α3 x3 = 0.
⇔ (7) The rank r(A) = 2. (3.7.9)
Let
Ker(A) =
x1
and take two arbitrarily linearly independent vectors x3 in R3 so
x 2 and
that B = { x1 , x2 , x3 } forms a basis for R . Then, the range
3
Im(A) =
x2 A,
x3 A
is a fixed plane through 0 . It is easy to see that (refer to (3.5.5))
1. A maps each line x0 + Ker(A), parallel to Ker(A), into a single point
x0 A on Im(A).
2. A maps each line v0 + u , where
u is linearly independent of
x1 , not
parallel to Ker(A), onto a line on Im (A).
3. A maps each plane, parallel to Ker(A), into a line on Im(A). When will
the image line pass 0 , i.e. a subspace?
372 The Three-Dimensional Real Vector Space R3
v0
Im(A)
x0 Ker(A) v0 A
x1 A
x2 A 0
0 x0 A
x3 x3 A
x2 (a)
Σ Σ2
l2 Ker(A) Im(A) = Σ′
Σ1
A
l1 0 0 l2′
Σ′2
Σ1′ l1′
(b)
Fig. 3.27
0 0 0
[A]B B N
B = PN AQB = 0 1 0 ,
0 0 1
−1
x1 v
B
where PN x2 and QN
= B =
x2A . (3.7.11)
x3 x3 A
The quotient space of R3 modulus Ker(A),
R3 /Ker(A) = {
x + Ker(A) |
x ∈ R3 } (3.7.12)
is a two-dimensional vector space which is isomorphic to R2 . Also
dim Ker(A) + dim R3 /Ker(A) = dim R3 = 3.
3.7 Linear Transformations (Operators) 373
x1 + x2 ( x1 + x2 ) + Ker(A)
x1 x1 + Ker(A) Im(A)
x2
A x2 A ( x1 + x2 ) A
Ker(A)
x2 + Ker(A)
( x1 A) x1 A
0 A 0
x1 + Ker(A)
x1
Fig. 3.28
Fig. 3.29
of the intersection line of the first two planes does not lie on the
third plane. This is equivalent to
det A = 0.
⇔ (5) (linearly algebraic) Three row (or column) vectors of A are linearly
independent.
⇔ (6) (linearly algebraic)
The row rank of A = the column rank of A = 3.
⇔ (7) The rank r(A) = 3.
⇔ (8) (linearly algebraic) A is invertible and hence A: R3 → R3 , as a linear
operator, is a linear isomorphism.
⇔ (9) (linearly algebraic) A maps every or a basis { x1 , x3 } for R3 onto
x2 ,
a basis { x1 A, x2 A, x3 A} for R .
3
(3.7.13)
In (9), let B = { x2 ,
x1 , x3 } and B = {x1 A,
x2 A,
x3 A} which are bases
for R . Then the matrix representation of A with respect to B and B is
3
1 0 0
[A]B B N
B = PN AQB = 0 1 0 = I3 ,
0 0 1
−1
x1 x1 A
B
where PN x2 and QN
= B =
x2 A . (3.7.14)
x3 x3 A
As a counterpart of (2.7.9), we have
The general geometric mapping properties of an
invertible linear operator
Let A = [aij ]3×3 be an invertible real matrix. Then, the linear isomorphism
A: R3 → R3 defined by x → x A and the affine transformation T : R3 → R3
defined by T ( x ) = x0 + x A (see Sec. 3.8.3) both preserve:
1. Line segment (interior points to interior points) and line.
2. Triangle and parallelogram (interior points to interior points) and plane.
3. Tetrahedron and parallelepiped (interior points to interior points).
4. The relative positions (see (3.4.5), (3.5.5) and (3.5.6)) of straight lines
and planes.
5. Bounded set.
6. The ratio of signed lengths of line segments along the same or parallel
lines.
3.7 Linear Transformations (Operators) 375
i=1
Exercises
<A>
1. Prove (3.7.5) and (3.7.6) in detail.
2. Prove (3.7.8) and (3.7.12).
3. Prove (3.7.9) and (3.7.10) in detail.
4. Prove (3.7.13) in detail.
5. Prove (3.7.15) in detail.
6. Let A = [aij ] ∈ M(3; R) be a nonzero matrix, considered as the
linear operator x → x A. Denote by A1∗ , A2∗ , A3∗ the three row
vectors of A and A∗1 , A∗2 , A∗3 the three column vectors of A. Let
x = (x1 , x2 , x3 ),
y = (y1 , y2 , y3 ). Notice that
x A = x1 A1∗ + x2 A2∗ + x3 A3∗
= (
x A∗1 ,
x A∗2 ,
x A∗3 ) =
y.
The following provides a method of how to determine the kernel Ker(A)
and the range Im(A) by intuition or inspection.
376 The Three-Dimensional Real Vector Space R3
(a) Suppose r(A) = 1. We may suppose that A1∗ = 0 and A2∗ = α2 A1∗
and A3∗ = α3 A1∗ for some scalars α2 and α3 . Then
x1 A1∗ + x2 A2∗ + x3 A3∗ = (x1 + α2 x2 + α3 x3 )A1∗ = 0
⇔ x1 + α2 x2 + α3 x3 = 0,
which is the kernel Ker(A). We may suppose that A∗1 = 0∗ and
A∗2 = β2 A∗1 and A∗3 = β3 A∗1 for some scalars β2 and β3 . Then
which is Ker(A). May suppose that A∗1 and A∗2 are linearly inde-
pendent and A∗3 = β1 A∗1 + β2 A∗2 . Then,
which is Im(A).
Try to determine Ker(A) and Im(A) for each of the following matrices:
6 −10 2 12 0 −2
A = −3 5 −1 ; A = 0 4 6 .
−9 15 −3 3 −1 −2
7. Formulate the features for Ker(A) and Im(A) and the geometric mapping
properties for each of the following linear transformations, in the natural
bases for spaces concerned.
(a) f: R → R3 defined by f ( x A, A ∈ M(1; 3; R).
x) =
(b) f: R → R defined by f (
2 3
x A, A ∈ M(2; 3; R).
x) =
(c) f: R3 → R defined by f ( x A, A ∈ M(3; 1; R).
x) =
(d) f: R3 → R2 defined by f ( x A, A ∈ M(3; 2; R).
x) =
3.7 Linear Transformations (Operators) 377
<B>
3
aij xi = bj , j = 1, 2, 3
i=1
has a solution if and only if the constant vector b = (b1 , b2 , b3 ) lies on the
range Im(A) of the linear operator
f (
x) =
x A, A = [aij ]3×3 .
This is equivalent to saying that the coefficient matrix A and its augmented
∗
matrix [A | b ] have the same rank, i.e.
* ∗
+
r(A) = r [A | b ] .
1. Try to use these concepts and methods to redo Ex. <B> 4 of Sec. 3.5
but emphasize the following aspects.
(a) Find all relative positions of the three planes and characterize each
case by using ranks and determinants.
(b) What kinds of relative positions will guarantee the existence of a
solution? And how many solutions are there?
2. Try to use Ex. 1 to solve the following sets of equations and graph these
planes concerned, if possible.
0, 1 ≤ i ≤ m − 1
f (
x i) =
1, i = m
Ker(f ) = S;
Im(f ) = F.
Then
m
x = (x1 , . . . , xm ) = xi
ei
i=1
m
m
m
m
⇒ f (
x) = xi f (
ei ) = xi x j ) =
aij f ( aim xi .
i=1 i=1 j=1 i=1
3.7 Linear Transformations (Operators) 379
Note that at least one of a1m , a2m , . . . , amm is not equal to zero. As a
consequence,
m
the hypersubspace S: aim xi = 0, and
i=1
m
the hyperplane
x0 + S: aim xi = b, (3.7.16)
i=1
m
where b = i=1 aim xi0 if
x0 = (x10 , x20 , . . . , xm0 ).
m
where fj (
x ) = i=1 aij xi are the linear functionals determined by the jth
column vector of A, for each j, 1 ≤ j ≤ n.
380 The Three-Dimensional Real Vector Space R3
il th coordinate
ai1 1 ai1 2 ··· ai1 ,l−1 ai1 ,l+1 ··· ai1 ,k+1
. .. .. .. ..
..
. . . .
l+1 ail−1 ,1 ail−1 ,2 · · · ail−1 ,l−1 ail−1 ,l+1 · · · ail−1 ,k+1
= (−1) det
ail+1 ,1 ail+1 ,2 · · · ail+1 ,l−1 ail+1 ,l+1 · · · ail+1 ,k+1
.. .. .. .. ..
. . . . .
aik+1 ,1 aik+1 ,2 · · · aik+1 ,l−1 aik+1 ,l+1 · · · aik+1 ,k+1
for 1 ≤ l ≤ k + 1, and
pth coordinate = 0 for p = il , 1 ≤ p ≤ m and 1 ≤ l ≤ k + 1.
3.7 Linear Transformations (Operators) 381
Try to Figure out when v i1 i2 ···ik ik+1 = 0 for some 1 ≤ i1 <
i2 < · · · < ik < ik+1 ≤ m.
(Note The inductive processes in (a), (b) and (c) all together sug-
gest implicitly an inductive definition for determinant of order m over a
field F.)
5n
2. What are the possible dimensions for j=1 Ker(fj ) and how to charac-
terize them in each case? Here we suppose that, for each 1 ≤ j ≤ n,
dim Ker(fj ) = m − 1 holds.
(a) In case n = 2, the following are equivalent:
(1) The hypersubspaces Ker(f1 ) and Ker(f2 ) coincide, i.e.
Ker(f1 ) = Ker(f2 ) holds.
(2) There exists a nonzero scalar α so that f2 = αf1 .
(3) The ratios of the corresponding coefficients in
m
m
Ker(f1 ): i=1 ai1 xi = 0 and Ker(f2 ): i=1 ai2 xi = 0 are equal,
namely,
a11 a21 am1
= = ··· = .
a12 a22 am2
(5) The two column vectors of A are linearly dependent, and the
maximal number of linearly independent row vectors of A is 1.
Then, what happens if Ker(f1 ) = Ker(f2 ), i.e. dim (Ker(f1 ) ∩
Ker(f2 )) = m − 2?
53
(b) In case n = 3, the dimensions for j=1 Ker(fj ) could be n − 1, n − 2
or n − 3 which is, respectively, the case (3.7.5), (3.7.9) or (3.7.13)
if n = 3. Try to figure out possible characteristic properties, both
algebraic and geometric.
382 The Three-Dimensional Real Vector Space R3
(c) Suppose the matrix A = [aij ]m×n has a submatrix of order k whose
determinant is not equal to zero. Say, for simplicity,
a11 a12 · · · a1k
a21 a22 . . . a2k
det . .. = 0.
.. .
ak1 ak1 . . . akk
Then it follows easily that the first k column vectors A∗1 , . . . , A∗k
of A should be linearly independent. So are the first k row vectors
A1∗ , . . . , Ak∗ . For
∗
α1 A∗1 + · · · + αk A∗k = 0 , where 0 ∈ Fm .
a11 . . . ak1
.. =
⇒ (α1 · · · αk ) ...
. 0 , where 0 ∈ F .
k
a1k . . . akk
⇒ (α1 , . . . , αk ) = 0 in Fk .
Furthermore, in this case,
4
k
dim Ker(fj ) = m − k.
j=1
⇔ (x1 · · · xk )
−1
ak+1,1 . . . ak+1,k a11 ... a1k
.. .. .. .. ,
= −(xk+1 · · · xm ) . . . .
am1 ... amk ak1 . . . akk
5k
which indicates immediately that dim j=1 Ker(fj ) = m − k. Also,
Ex. 1 might be helpful in handling this problem.
3.7 Linear Transformations (Operators) 383
5k
(d) (continuation of (c)) Conversely, suppose dim j=1 Ker(fj ) = m−k,
then there exists at least one 1 ≤ i1 < i2 < · · · < ik ≤ m so that
ai1 1 . . . ai1 k
.. = 0.
det ... .
aik 1 . . . aik k
the former is called the row rank of A and the latter the column
rank of A.
⇔ 5. (algebraic) row rank of A = column rank of A = r.
⇔ 6. (linearly algebraic) the rank of A = r = dim Im(A), denoted as r(A);
in case r(A) = m ≤ n,
384 The Three-Dimensional Real Vector Space R3
⇔ 7. A is one-to-one, i.e. Ker(A) = { 0 }; in case r(A) = n ≤ m,
⇔ 8. A is onto, i.e. Im(A) = Fn ; in case m = n and r(A) = n,
⇔ 9. A is a linear isomorphism on Fn .
⇔ 10. A maps every or a basis for Fn onto a basis for Fn .
(3.7.19)
For other approaches to these results, refer to Exs. 2 and 24 of Sec. B.7,
see also Secs. B.8 and 5.9.3.
Let V and W be finite-dimensional vector spaces over the same field.
Suppose dim V = m, and B is a basis for V , while dim W = n and C is
a basis for W . The mapping Φ: V → Fm defined by Φ( x ) = [x ]B , the
coordinate vector of x ∈ V with respect to B, is a linear isomorphism
f
V → W
Φ↓ ↓Ψ.
Fm → Fn
g
Since [Φ]B N B C
N [g]N = [f ]C [Ψ]N (see Secs. 2.7.3 and 3.7.3), we can recapture
various properties in (3.7.19) for f via [g]N B
N , and hence, via [f ]C .
3.7.2 Examples
It is supposed or suggested that readers are familiar with those basic exam-
ples presented in Sec. 2.7.2 and general terminologies, such as
For operator A with rank r(A) = 1 or r(A) = 2, its algebraic and geomet-
ric mapping properties are essentially the same as operators on R2 , after
choosing a fixed basis B for R3 where at least one of the basis vectors is
from its kernel. Therefore, we will focus on examples here on operators of
rank 3.
has, in N = {
e1 , e3 }, the properties:
e2 ,
1. ei for i = 1, 2, 3. So
ei A = λi ei is an eigenvector of A corresponding to
the eigenvalue λi and ei is an invariant line (subspace) of A.
2. In case λ1 = λ2 = λ3 = λ, then
A = λI3
This is the simplest linear operator among all. It just maps a vector x =
(x1 , x2 , x3 ) into the vector
x A = (λ1 x1 , λ2 x2 , λ3 x3 ). See Fig. 3.30.
386 The Three-Dimensional Real Vector Space R3
xA
3 x3 e3
x3 e3
1 x1 e1
e3
x
0 e2 x2 e2 2 x2 e2
e1
x1 e1
Fig. 3.30
33x3
3x3
x3 [ x]B A [ x]B
22x2 2x2
0 x2
1x1
11x1
Fig. 3.31
eigenvalues eigenvectors
√
λ1 = ab
v1 = ( |b|, |a|, 0)
√
λ2 = − ab
v2 = (− |b|, |a|, 0)
λ3 = c v3 = e3 = (0, 0, 1)
B = { v1 , v3 } is a basis for R3 and the axes
v2 , v1 ,
v2 ,
v3 are
invariant lines.
1. Since
0 1 0 b 0 0
A = 1 0 0 0 a 0
0 0 1 0 0 c
in N = {
e1 , e3 }, A can be decomposed as
e2 ,
0 1 0
x = (x1 , x2 , x3 ) →
x 1 0 0 = (x2 , x1 , x3 )
0 0 1
b 0 0
→ (x2 , x1 , x3 ) 0 a 0 = (bx2 , ax1 , cx3 ) =
xA
0 0 c
Refer
√ to Fig.√3.31 (and compare with Fig. 2.48). What happens if
c = ab or − ab.
where the first mapping is of type as in (1) while the second one is a
reflection with respect to the coordinate plane e3 . See Fig. 3.33
e2 ,
(and Fig. 2.49).
e2
0
e1
(x1, x2, 0) x1 = x2
Fig. 3.32
Note A satisfies its characteristic polynomial, namely
(A − cI3 )(A2 − abI3 ) = A3 − cA2 − abA + abcI3 = O3×3 .
e2 0 e3
〈〈e2 , e3〉〉
xA
Fig. 3.33
x (A2 + abI3 ) = 0 .
Such a x is not an eigenvector of A corresponding to c. Hence,
x ∈
e2 should hold. Actual computation shows that any vector
e1 , x in
e1 , e2 satisfies x (A + abI3 ) = 0 and conversely. Choose an arbitrarily
2
This is the rational canonical form of A (for details, see Sec. 3.7.8). The
action of [A]B on [ x ]B is illustrated similarly as in Fig. 3.32 if a and −b are
replaced by 1 and −ab respectively, and e1 and
e2 are replaced by v and
v A which are not necessarily perpendicular to each other.
For another matrix representation, we adopt the method indicated in
Ex. <B> 5 of Sec. 2.7.2. In what fallows, the readers are supposed to
have
√ basic knowledge about complex numbers. A has complex eigenvalues
± ab i. Let x = (x1 , x2 , x3 ) ∈ C3 be a corresponding eigenvector. Then
√
x A = ab i x
√
⇔ (−bx2 , ax1 , cx3 ) = ab i (x1 , x2 , x3 )
√ √
⇔ bx2 = − ab i x1 , ax1 = ab i x2 ,
√
cx3 = ab i x3 (recall that c2 + ab > 0)
√ √
⇔ x = t( bi, a, 0) for t ∈ C.
√ √ √ √
Note that ( bi, a, 0) = (0, a, 0) + i( b, 0, 0). By equating the real and
the imaginary parts of both sides of
√ √ √ √ √
(0, a, 0) + i( b, 0, 0) A = ab i (0, a, 0) + i( b, 0, 0)
√ √ √
⇒ 0, a, 0 A = − ab b, 0, 0 ,
√ √ √
b, 0, 0 A = ab 0, a, 0 .
Combined with √ e3 A = c e3 , we obtain the√matrix representation of A in
√ √
the basis C = {( b, 0, 0), (0, a, 0), e3 } = { b
e1 , a e3 }
e2 ,
√
0 a 0 √0 ab 0
[A]C = Q −b 0 0 Q−1 = − ab 0 0
0 0 c 0 0 c
√
√ 0 1 0 b 0 0
√
= ab −1 0 0
, where Q = 0 a 0 . (3.7.22)
c
0 0 √ab 0 0 1
[A]C can be decomposed as, in C,
0 1 0 0 1 0 1 0 0
0
x ]C → [
[ x ]C −1 0 0 → [ x ]C −1 0 0 0 1
0 0 1 0 0 1 0 0 √c
ab
√ 0 1 0
→ [ x ]C ab
−1 0 0
= [ x ]C [A]C = [ x A]C ,
√c
0 0 ab
3.7 Linear Transformations (Operators) 391
where
√ the first mapping is a rotation through 90◦ in the coordinate plane
√
b e1 , a e2 , the second one is a one-way
√ stretch along
e3 and the
third one is an enlargement with scale ab. The readers are urged to illus-
trate these mappings graphically.
Case 1 a = c = 1
eigenvalues eigenvectors
λ1 = a v1 =e1
λ2 = c
v2 = (b, c − a, 0)
λ3 = 1 v3 =e3
In the basis B = {
v1 , v3 }, A has the representation
v2 ,
a 0 0 v1
[A]B = P AP −1 = 0 c 0 , v2 .
where P =
0 0 1 v3
Case 2 a = c = 1
eigenvalues eigenvectors
λ1 = a v1 =e1
λ2 = 1, 1
v2 = (b, 1 − a, 0) and
v3 =
e3
In the basis B = {
v1 , v3 },
v2 ,
a 0 0 a 0 0 v1
[A]B = P b 1 0 P −1 = 0 1 0 , v2 .
where P =
0 0 1 0 0 1 v3
Case 3 a = c = 1
1 is the only eigenvalue of A with associated eigenvectors
e1 and
e3 .
1. Since b = 0,
1 0 0
A = b 1 0
0 0 1
is not diagonalizable.
2. The coordinate plane e3 is the plane (subspace) of invariant
e1 ,
points of A.
3. Since x = (x1 , x2 , x3 ) →
x A = (x1 + bx2 , x2 , x3 ), A moves a point x
along a line parallel to the axis e1 through a distance with a constant
proposition b to its distance x2 from the e1 axis, i.e.
(x1 + bx2 ) − x1
=b
x2
to the point x A. This A is called a shearing along the plane e3
e1 ,
with scale factor b. See Fig. 3.34.
4. Hence, every plane v + e3 parallel to
e1 , e3 is an invariant
e1 ,
plane.
e2
e2
0 e1 0 e1
x e3
xA
e3
Fig. 3.34
Case 4 a = c = 1
a and 1 are eigenvalues of A with respective eigenvectors
e1 and
e3 .
1. A satisfies its characteristic polynomial −(t − a)2 (t − 1), i.e.
(A − aI3 )2 (A − I3 ) = O3×3 .
2. Since (A − aI3 )(A − I3 ) = O3×3 , A is not the diagonalizable (refer to
Ex. <C> 9 of Sec. 2.7.6 or Secs. 3.7.6 and 3.7.7).
3.7 Linear Transformations (Operators) 393
3. Since
a 0 0 a 0 0 1 0 0 a 0 0 0 0 0
A = b a 0 = 0 a 0 ab 1 0 = 0 a 0 + b 0 0 ,
0 0 1 0 0 1 0 0 1 0 0 1 0 0 0
0 0 0 λ1 − λ 2 0 0
(B − λ1 I3 )(B − λ2 I3 ) = b 0 0 b λ 1 − λ2 0
a c λ2 − λ 1 a c 0
0 0 0
= b(λ1 − λ2 ) 0 0 = O3×3 ,
bc 0 0
0 0 0
(B − λ1 I3 )2 = 0 0 0
bc + a(λ2 − λ1 ) c(λ2 − λ1 ) (λ2 − λ1 )2
and take a vector v2 (B − λ1 I3 )2 = 0 but
v2 satisfying v2 (B −λ1 I3 ) =
v1 =
0 . Say v2 = e2 , then
v1 v2 (B − λ1 I3 ) = (b, 0, 0) = b
= e1 .
394 The Three-Dimensional Real Vector Space R3
v1 B = b
e1 B = bλ1
e1 = λ1 v1 + 0 ·
v1 = λ1 v2 + 0 ·
v3 ,
v2 B =
e2 B = (b, λ1 , 0) = b v 1 + λ1 ·
e2 =
e1 + λ1 v2 + 0 ·
v3 ,
v3 = 0 ·
= λ2
v3 B v1 + 0 ·
v 2 + λ2 ·
v3
λ1 0 0 v1
−1
⇒[B]C = QBQ = 1 λ1 0 , where Q = v2 .
(3.7.23)
0 0 λ2 v3
This is called the Jordan canonical form of B (for details, see Sec. 3.7.7).
[B]C belongs to Case 4 in Example 3 by noticing that
λ1
λ2 0 0
[B]C = λ2 λ12 λλ12 0 . (3.7.24)
0 0 1
(0,1, 2)
e3 (0,1,1) e3 (1, 3, 2)
(2,1, 2)
(1, 0,1) e2
(1,1,1) A (3, 3, 2)
e2 0
0
e1 (1, 2, 0)
(1,1, 0) (2, 0, 0)
e1
invariant line
(3, 2, 0)
( = 2, b = c= 1)
Fig. 3.35
v1 A = bc
e1 A = bc(λ, 0, 0) = λ v1 + 0 ·
v1 = λ v2 + 0 ·
v3 ,
v2 A = c
e2 A = c(b, λ, 0) = bc e2 = 1 ·
e1 + λc v1 + λ ·
v2 + 0 ·
v3 ,
v3 A =
e3 A = (0, c, λ) = c e3 = 0 ·
e2 + λ v1 + 1 ·
v2 + λ ·
v3
λ 0 0 bce1
⇒ [A]B = P AP −1 = 1 λ 0 , e2 .
where P = c (3.7.25)
0 1 λ e3
[A]B is called the Jordan canonical form of A (for details, see Sec. 3.7.7).
See Fig. 3.35 for λ = 2.
Case 1 a2 + 4b > 0
eigenvalues eigenvectors
√
a + a2 + 4b
λ1 = v1 = (b, λ1 , 0)
√2
a − a2 + 4b
λ2 = v2 = (b, λ2 , 0)
2
λ3 = c v3 =
e3
In the basis B = {
v1 , v3 }, A is diagonalizable as
v2 ,
λ1 0 0 v1
[A]B = P AP −1 =0 λ2 0 , where P =
v2 .
0 0 λ3 v3
Case 2 a2 + 4b = 0 and a = 2c
eigenvalues eigenvectors
a a
λ= , v = (−a, 2, 0)
2 2
λ3 = c v3 =
e3
1. Since
a
−2 1 0
a
A − I3 = b a
2 0 ,
2
0 0 c − a2
* + 0 0 0
a 2
A − I3 = 0 0 0 ,
2 2
0 0 c − a2
3.7 Linear Transformations (Operators) 397
−c 1 0
A − cI3 = b a − c 0 , and
0 0 0
ac
2 −c
a
* + 2 +b 0
a
A − I3 (A − cI3 ) = b( a2 − c) −( ac
2 + b) 0
= O3×3 ,
2
0 0 0
thus A is not diagonalizable.
2. Now, B = { v, e3 } is a basis for R3 . Since
e1 ,
v A = λ
v,
1
e1 A = (0, 1, 0) = v + λ
e1 ,
2
e3 A = c
e3
0 1 0 λ 0 0
⇒ [A]B = P b a 0 P −1 = 12 λ 0
0 0 c 0 0 c
λ
c 0 0 v
1 λ
= c 2c c 0 , where P = e1 .
0 0 1 e3
( 1
1 + 2 , 2 , x 3
2 ) ( 1
2 )
x A = 1 + 2 , 2 ,cx 3
( 1, 2 , x3 )
v 1 v
x
e2
(0, 0, x3 ) 2 e1
e1
0 (0,0, x3)+ 〈〈v,e1〉〉
e3
Fig. 3.36
398 The Three-Dimensional Real Vector Space R3
Case 3 a2 + 4b < 0
A has only one real eigenvalue c with associated eigenvector
e3 . Hence
e3 is an invariant line (subspace) of A. Since
0 1 0 b 0 0 0 0 0
A = 1 0 0 0 1 0 + 0 a 0 ,
0 0 1 0 0 1 0 0 c−1
then, A is the sum of the following two linear operators.
1. The first one is the composite of the reflection with respect to the plane
x1 = x2 (see Example 2) followed by a one-way stretch along e1 with
scale b (see Example 1).
2. The second one is a mapping of R3 onto the plane e3 if a = 0, c =
e2 ,
1; onto the axis e2 if a = 0 and c = 1; and onto the axis
e3 if
a = 0 and c = 1 (see (3.7.34)).
See Fig. 3.37 (compare with Fig. 2.55).
x1 = x2 ( x2 , x1 , x3 )
e1 (bx2 , x1 , x3 )
( x1 , x2 , 0)
x
xA
e2 0 e3
(0, ax2 , (c − 1) x3 )
Fig. 3.37
Exercises
<A>
(a) A is one of
0 0 0 0 0 a 0 0 0
0 a 0 , 0 0 0 or 0 0 0 , where a = 0.
0 0 0 0 0 0 0 a 0
(b) A is one of
0 a 0 0 0 a 0 0 a
0 0 0 , 0 0 0 or 0 b 0 , where ab = 0.
b 0 0 0 0 b 0 0 0
(c) A is one of
0 a 0 0 0 0 a 0 0 0 0 0
b 0 c , a 0 0 , 0 0 0 or 0 a b ,
0 0 0 b c 0 0 b c c 0 0
where abc = 0.
(d) A is one of
0 0 a 0 0 a 0 a 0
b 0 0 , 0 b 0 or b 0 0 , where abc = 0.
0 c 0 c 0 0 0 0 c
Note that one can assign a, b and c numerical numbers so that
the characteristic polynomial can be handled easier.
(e) A is one of
0 −1 −1 −3 3 −2 0 1 −1
−3 −1 −2 , −7 6 −3 or −4 4 −2 .
7 5 6 1 −1 2 −2 1 1
0 1 0
(f) A = 0 0 1 into Jordan canonical form.
8 −12 6
−2 1 0
(g) A = 0 −2 1 into Jordan and rational canonical forms.
0 0 −2
2. Recall that e1 = (1, 0, 0),
e2 = (0, 1, 0), and
e3 = (0, 0, 1). Let
x1 =
(0, 1, 1), x2 = (1, 0, 1), x3 = (1, 1, 0), and x4 = (1, 1, 1). These seven
points form vertices of a unit cube. See Fig. 3.38. This cube can be
divided into six tetrahedra of equal volumes. Three of them are shown
in the figure. Any such a tetrahedron is called part of the cube. Similar
situation happens to a parallelepiped.
402 The Three-Dimensional Real Vector Space R3
e3 x1
x2
x4
e2
0
e1 x3
Fig. 3.38
(c) Try to simplify for A3×3 what Exs. <B> 5 and <C> 5 of
Sec. 2.7.6 say.
(d) Try to model after (2.7.18) for A3×3 . Is this a beneficial way to prove
this theorem?
(e) Let ϕ(λ) = det(A − λI3 ), the characteristic polynomial of A. Try the
following steps.
(1) Take λ so that ϕ(λ) = 0. Then (see (3.3.2) or Sec. B.6)
(A − λI3 )−1 = ϕ(λ)
1
adj(A − λI3 ).
(2) Each entry of the adjoint matrix adj(A − λI3 ) is a polynomial
in λ of degree less than 3. Hence, there exist constant matrices
B2 , B1 , B0 ∈ M(3; R) such that
adj(A − λI3 ) = B2 λ2 + B1 λ + B0 .
(3) Multiply both sides of adj(A − λI3 ) · (A − λI3 ) = ϕ(λ)I3 out and
equate the corresponding coefficient matrices of λ3 , λ2 , λ and λ0
to obtain
−B2 = −I3
B2 A − B1 = −b2 I3
B1 A − B0 = −b1 I3
B0 A = −b0 I3 .
Then try to eliminate B2 , B1 , B0 from the right sides.
(Note Methods in (c) and (e) are still valid for n × n matrices over a
field F.)
4. Let A = [aij ]3×3 .
(a) Show that the characteristic polynomial
det(A − tI3 ) = −t3 + tr(A)t2 + b1 t + det A.
Try to use aij , 1 ≤ i, j ≤ 3, to express the coefficient b1 .
(b) Prove (3.7.28).
(c) Let
−9 4 4
A = −8 3 4 .
−16 8 7
(1) Try to use (b) to compute A−1 .
(2) Show that A is diagonalizable and find an invertible matrix P
such that P AP −1 is a diagonal matrix.
404 The Three-Dimensional Real Vector Space R3
x ∈ R3 so that
Hence, show that there exists a vector x A2 = 0
x A = 0 ,
and x A = 0 , and B = { x , x A, x A } forms a basis for R . What is
3 2 3
[A]B ? See Ex. <C> 3 of Sec. 3.7.7 and Ex. 5 of Sec. B.12 for general
setting.
4. Does there exist a matrix A ∈ M(3; R) such that Ak = O for k = 1, 2, 3
but A4 = O? If yes, give explicitly such an A; if not, give a precise reason
(or proof).
5. (Refer to Ex. <B> 2 of Sec. 2.7.2, Ex. 6 of Sec. B.4 and Ex. 7 of Sec. B.7.)
Let A ∈ M(3; R) be idempotent, i.e. A2 = A.
(a) Show that each eigenvalue of A is either 0 or 1 .
(b) Guess what are possible characteristic polynomials and canonical
forms for such A. Could you justify true or false of your statements?
(c) Show that
A2 = A.
⇔ Each nonzero vector in Im (A) is an eigenvector of A associated
to the eigenvalue 1.
Try to use this to determine all such A up to similarity.
6. (Refer to Ex. <B> 3 of Sec. 2.7.2, Ex. 9 of Sec. B.4 and Ex. 8 of Sec.
B.7.) Let A ∈ M(3; R) be involutory, i.e. A2 = I3 .
(a) Show that each eigenvalue of A is either 1 or −1.
(b) Consider the following cases:
(1) A − I3 = O3×3 .
(2) A + I3 = O3×3 .
(3) A − I3 = O3×3 , A + I3 = O3×3 but (A − I3 )(A + I3 ) = O.
Write out the canonical form of A for each case. Prove them!
(c) Show that
Ker(A − I3 ) = Im(A + I3 ),
Ker(A + I3 ) = Im(A − I3 ), and
R = Ker(A − I3 ) ⊕ Ker(A + I3 ).
3
B = {
x1 , x3 } such that
x2 ,
λ 0 0 x1
[A]B = P AP −1 = 0 λ1 λ2 , where P =
x2 and λ2 = 0.
0 −λ2 λ1
x3
3
A= aij Eij A= aij Eij
i=1 j=1 i=1 j=1
We will feel no hesitation to use these converted results for R3 . Recall that
f ∈ Hom(R3 , R3 ) is called diagonalizable if there exists a basis B for R3
so that [f ]B is a diagonal matrix. In this case, [f ]C is similar to a diagonal
matrix for any basis C for R3 , and vice versa.
We list following results for reference.
1. det f = det[f ]B .
2. det(f − t1R3 ) = det([f ]B − tI3 ).
3. tr f = tr[f ]B .
4. r(f ) = r([f ]B ). (3.7.30)
is needed for A3×3 and B3×3 . For general case, see Ex. <C> in Sec. 2.7.3.
For A ∈ M(m, n; R) or A ∈ M(m, n; F) where m, n = 1, 2, 3, . . . , as
before in (2.7.46) and (2.7.47), let the
Then, we have (refer to Sec. 3.7.1 and, in particular, to Ex. <C> there)
408 The Three-Dimensional Real Vector Space R3
Solution Denote by Ai∗ the ith row vector of A for i = 1, 2, 3 and A∗j the
jth column vector for j = 1, 2, 3.
Since A3∗ = 2A1∗ + A2∗ or A∗3 = −A∗1 − 23 A∗2 , then the rank r(A) = 2.
By direct computation,
a1 1 1 0 b1 −1 −1 1
a2 = 1 0 1 = −2; det b2 = −1
det 1 −1 = 4
a3 0 1 1
b3 1 −1 −1
therefore B and C are bases for R3 .
(a) By using (3.3.5),
−1
b1
a1 −1 −1 1 −1 −1 1
1
ACB = b2
a2 = −1 1 −1 · − −1 1 −1
2
b3
a3 1 −1 −1 1 −1 −1
3.7 Linear Transformations (Operators) 409
3 −1 −1
1
=− −1 3 −1 ,
2
−1 −1 3
−1
a1 b1 1 1 0 1 1 0 2 1 1
1 1
a2 b2 = 1 0 1 · − 1 0 1 .
AB
C = 1 =− 1 2
2 2
a3 b3 0 1 1 0 1 1 1 1 2
It happens to be that
−1 −1
a1 b1 b1 a1
1 1
a2 = − b2 and hence, b2 = − a2 .
2 2
a3 b3 b3 a3
By above matrix relations or actual computation, we have
ACB AB
C = I3 .
Similarly,
−1
b1
e1 b1 −1 −1 1
ACN = b2 e2 = b2 I3 = −1 1 −1 ,
b3
e3 b3 1 −1 −1
−1 −1
e1 a1 a1 −1 −1 1
1
AN
B = e2
a2 = I3 a2 = − −1 1 −1
2
e3
a3
a3 1 −1 −1
and hence, ACB = ACN AN B holds.
(b) Note that [f ]N = A as given above. Now, by (3.3.4) and (3.3.5),
−1
f (1 B
a ) a A a1 A a1
1 B
[f ]B = [f ]B = f ( a ) = a A =
a A a2
B 2 B 2 B
2
f ( a3 ) B a3 A B a3 A a3
−1
a1 a1
= a2 A a2
= AB N N
N [f ]N AB
a3 a3
1 1 0 −1 0 1 −1 −1 1
1
= 1 0 1 0 3 −2 · − −1 1 −1
2
0 1 1 −2 3 0 1 −1 −1
−3 5 −3
1
= − 1 5 −7 .
2
−6 10 −6
410 The Three-Dimensional Real Vector Space R3
Similarly,
[f ]C = [f ]CC = ACN [f ]N
N AC
N
−1 −1 1 −1 0 1 1 1 0
1
= −1 1 −1 0 3 −2 · − 1 0 1
2
1 −1 −1 −2 3 0 0 1 1
−1 0 1
1
= − 3 0 −3 .
2
−5 4 −3
[f ]N = [f ]N N B N C
N = AB [f ]B AN = AC [f ]C AN
⇒ [f ]C = ACN AN B N
B [f ]B AN AC = P [f ]B P
−1
,
where
P = ACN AN C
B = AB
as shown above.
(c) By definition, (3.3.4) and (3.3.5),
−1 −1
f (1 B
e ) A∗ A1∗ a1 a1
1 B
[f ]N = f ( e2 ) B = A2∗ B = A2∗ a = A
a2
B
2
f ( e3 ) B A3∗ B A3 ∗ a3 a3
−1 0 1 −1 −1 1 2 0 −2
1 1
= 0 3 −2 · − −1 1 −1 = − −5 5 −1 ;
2 2
−2 3 0 1 −1 −1 −1 5 −5
−1
b1 −1 0 1 1 1 0
1
[f ]N
C = A b2 = 0 3 −2 · − 1 0 1
2
b3 −2 3 0 0 1 1
−1 0 1
1
= − 3 −2 1 .
2
1 −2 3
[f ]N N C
B = [f ]C AB .
3.7 Linear Transformations (Operators) 411
[f ]N N B C N C B
N = AB [f ]C AN = AC [f ]B AN
⇒ [f ]CB = ACN AN B C N C B C
B [f ]C AN AB = AB [f ]C AB .
Exercises
<A>
1. In N = {
e1 , e3 }, let
e2 ,
1 1 −1
f (
x) =
x A, where A = 1 −1 0 .
2 0 1
412 The Three-Dimensional Real Vector Space R3
Let B = { a1 , a3 } where
a2 , a1 = (−1, 2, 1), a3 = (1, 2, −3)
a2 = (2, 4, 5),
and C = { b1 , b2 , b3 } where b1 = (1, 0, 1), b2 = (1, 2, 4), b3 = (2, 2, −1).
Model after the example and do the same questions as in Ex. <A> 1 of
Sec. 2.7.3.
2. Find a linear operator f on R3 and a basis B for R3 so that
−1 1 1
[f ( x ]N 0 2
x )]B = [ 5 , x ∈ R3 .
−2 4 −3
Describe all possible such f and B.
3. Let
1 0 0 1 0 0
A = 1 1 0 and B = 1 1 0 .
0 0 1 0 1 1
(a) Do there exist a linear operator f on R3 and two bases B and C for
R3 so that [f ]B = A and [f ]C = B? Give precise reason.
(b) Show that there exist invertible matrices P and Q so that
PAQ = B.
4. Find nonzero matrices A3×3 and B3×3 so that AB has each possible
rank. Show that
r(AB) = 3 ⇔ r(A) = r(B) = 3.
5. Generalize Exs. <A> 10 through 21 of Sec. 2.7.3 to R3 or real 3×3 matri-
ces and prove them. For example, in Ex. 20, the orthogonal complement
of a subspace S in R3 is now defined as
S ⊥ = {
x ∈ R3 |
xy ∗ = 0 for each
y ∈ S}.
For a nonzero 3 × 3 matrix A, then
(1) Im(A)⊥ = Ker(A∗ ).
(2) Ker(A∗ )⊥ = Im(A), and R3 = Ker(A∗ ) ⊕ Im(A).
(3) Im(A∗ )⊥ = Ker(A).
(4) Ker(A)⊥ = Im(A∗ ), and R3 = Ker(A) ⊕ Im(A∗ ).
For each of the following matrices A:
2 5 3 −1 −2 3 1 0 −2
−6 1 −2 , 3 6 −9 and 4 6 −3 ,
2 21 10 2 4 −6 −5 −1 2
find a basis for each of the subspaces Im(A), Ker(A), Im(A∗ ) and
Ker(A∗ ) and justify the above relations (1)–(4).
3.7 Linear Transformations (Operators) 413
<B>
Read back Sec. 2.7.3 carefully and extend all possible definitions and results
there to linear transformations from Rm to Rn where m, n = 1, 2, 3; in par-
ticular, the cases m = n. Please refer to Secs. B.4, B.6 and B.7 if necessary.
We will be free to use them in what follows if needed.
f (x) = x
a for x ∈ R,
linearly independent of
a ? Notice that
[f ]B B N
N = [1R ]N [f ]N ,
i.e.
f( e )
[f ( x)]N = [ x]N A,
where A = [f ]N
N = 1 N .
f ( e2 ) N
(b) Give a fixed real matrix B2×3 . Do there exist a basis B for R2 and
a basis B for R3 so that [f ]B B
B = B holds? Notice that [f ]B =
B N N
[1R2 ]N [f ]N [1R3 ]B .
(c) Show that f can be one-to-one but never onto R3 .
(d) Show that there are infinitely many affine transformations
T (
x) =
y0 +
x A,
y0 ∈ R3 fixed and
x ∈ R2 ,
where A2×3 is of rank 2, mapping R2 one-to-one and onto any
preassigned two-dimensional plane in R3 .
(e) Show that any affine transformation from R2 into R3 preserves
relative positions of two straight lines in R2 (see (2.5.9)).
(f) Let T (x) = xA be an affine transformation from R2 into R3
y0 +
(see (d)).
(1) Show that the image of the unit square with vertices 0 , e1 ,
e1 +
e2 and e2 under T is a parallelogram (see Fig. 3.39).
Compute the planar area of this parallelogram.
(2) The image of a triangle in R2 under T is a triangle
∆T (
a1 )T (
a2 )T (
a3 ) (see Fig. 3.39). Compute
the area of ∆T (
a1 )T (
a2 )T (
a3 )
.
the area of ∆ a1 a2 a3
T ( a1 )
a1
T ( a2 )
e2 e1 + e2 e3 ′
T T ( a3 )
a3 T (0)
a2
0 e1 0 e2 ′
e1 ′
Fig. 3.39
3.7 Linear Transformations (Operators) 415
(3) What is the image of the unit circle x21 + x22 = 1 in R2 under
T ? How about its area? Refer to Fig. 2.66.
3. The system of three non-homogeneous linear equations in two
unknowns x1 and x2
2
aij xi = bj , j = 1, 2, 3
i=1
can be rewritten as
a a12 a13
xA = b , where A = 11 ,
a21 a22 a23
x = (x1 , x2 ) ∈ R2 .
b = (b1 , b2 , b3 ) and
l1 : a11 x1 + a21 x2 = b1 ,
l2 : a12 x1 + a22 x2 = b2 ,
l3 : a13 x1 + a23 x2 = b3
a11 a21 b1
Fig. 3.39(a) ⇔ = = for i = 2, 3
a1i a2i bi
(infinitely many solutions).
a11 a21 b1 a11 a21
Fig. 3.39(b) ⇔ = = but =
a12 a22 b2 a13 a23
(a unique solution).
416 The Three-Dimensional Real Vector Space R3
l2
l3 l1 l1
l1 = l2 = l3 l1 = l2 l2
l3 e2
e2 e2 e2
l3
0 e1 0 e1 0 e1 0 e1
(a) (b) (c) (d)
l1 l1
l2 e2 l3 e2 l2
l3
0 e1 0 e1
(e) (f)
Fig. 3.40
What happens if l1 = l2 l3 ?
A
(5) (linearly algebraic) The rank of the augmented matrix
b 3×3
is equal to that of the coefficient matrix A, i.e.
A
r = r(A).
b
3.7 Linear Transformations (Operators) 417
x A = b has a solution where A = O2×3 . Then
(b) In case
(1)
x A = b has a unique solution.
⇔ (2) x A = 0 has only one solution
x = 0 , i.e. Ker(A) = { 0 }.
⇔ (3) The linear transformation A: R2 → R3 is one-to-one, i.e.
r(A) = 2.
⇔ (4) AA∗ , as a square matrix of order 2, is invertible. Thus, the
unique solution (refer to Ex. <A> 5(1)) is
b A∗ (AA∗ )−1 .
|
v | = min |
x |.
x A= b
The remaining question is that how to find v , more explicitly, how
to determine v via A. If r(A) = 2, then by the former part of (b),
v = b A∗ (AA∗ )−1 .
it is known that
x0 + Ker(A)
e2 x0
Ker(A)
+
bA
e1
0
Fig. 3.41
418 The Three-Dimensional Real Vector Space R3
(c) Suppose r(A) = 1. Then r(AA∗ ) = 1 and AA∗ is then not invertible.
We may suppose a11 = 0. Rewrite A as
a11 A12
A= = BC,
a21 A22
where
a11
B= , C = 1 a−1
11 A12 and
a21
A12 = a12 a13 , A22 = a22 a23 .
1 1 1
(d) Let A = . Show that r(A) = 1 and
−1 −1 −1
1 −1
1
A+ = C ∗ (CC ∗ )−1 (B ∗ B)−1 B ∗ = 1 −1 .
6
1 −1
4. (continued from Ex. 3) Given A = [aij ]2×3 and b = (b1 , b2 , b3 ) ∈ R3 ,
suppose x A = b doesn’t have any solution x ∈ R2 . The problem is to
find x0 ∈ R2 so that | b −
x0 A| is minimal, i.e.
|b − min | b −
x0 A| = x A|.
x ∈R2
Geometrically, this means that x0 A is the orthogonal projection of b
onto the range space Im(A) and | b − x0 A| is the distance from b to
Im(A). For details, see Chaps. 4 and 5. See also Fig. 3.42.
3.7 Linear Transformations (Operators) 419
b Im(A)
e2 e3 ′ b e3 ′
Im(A)
A
x0 A x0 A
0 e1 e2 ′
0 0 e2 ′
e1 ′ r(A) = 1 e1 ′
r(A) = 2
Fig. 3.42
Now,
(b −
x0 A) ⊥ x ∈ R2
x A for all
⇔(b − xA)∗ = ( b −
x0 A)( x∗ = 0
x0 A)A∗ x ∈ R2
for all
x0 A)A∗ = b A∗ −
⇔(b − x0 AA∗ = 0 , i.e.
x0 AA∗
= b A∗ .
(a) In case r(A) = 2, then r(AA∗ ) = 2 and AA∗ is invertible. Therefore,
x0 = b A∗ (AA∗ )−1
⇒ The projection vector of b on Im(A) is b A∗ (AA∗ )−1 A.
(b) In case r(A) = 1, then r(AA∗ ) = 1 and AA∗ is not invertible.
Decompose A as in Ex. 3(c) and show that
x0 = b C ∗ (CC ∗ )−1 (B ∗ B)−1 B ∗
⇒ The projection vector of b on Im(A) is
b C ∗ (CC ∗ )−1 (B ∗ B)−1 B ∗ A = b C ∗ (CC ∗ )−1 C.
(Note Combining Ex. 3 and Ex. 4 together, for A2×3 = O2×3 the 3 × 2
matrix
∗ ∗ −1
A (AA ) , if r(A) = 2
+ ∗ ∗ −1 ∗ −1 ∗
A = C (CC ) (B B) B , if r(A) = 1 and
A = BC as in Ex. 3(c)
is called the generalized or pseudoinverse of A. This A+ has the follow-
ing properties:
(1) | b A+ | = min | x | if
x A = b has a solution.
x A= b
420 The Three-Dimensional Real Vector Space R3
(2) b A+ A is the orthogonal projection of b ∈ R3 on Im(A) and
| b − b A+ A| =
min | b − x A|.
x ∈R2
These results are also still valid for real or complex m × n matrix
A with m ≤ n. For general setting in this direction, see Sec. B.8 and
Fig. B.9, also refer to Ex. <B> of Sec. 2.7.5 and Example 4 in Sec. 3.7.5,
Secs. 4.5, 5.5.)
5. Let N = { e1 , e3 } be the natural basis for R3 and N = {1} the one
e2 ,
for R.
(a) A mapping f : R3 → R is a linear transformation, specifically called
a linear functional, if and only if there exist scalars a1 , a2 , a3 so that
a1
f ( x a2 ,
x ) = a1 x1 + a2 x2 + a3 x3 = x ∈ R3 .
a3
i.e.
a1
[f ( x ]N [f ]N
x )]N = [ N, where [f ]N
N = a2 .
a3
(b) Give a matrix B = [bi ]3×1 . Find conditions so that there exist a
basis B for R3 and a basis B for R so that [f ]B
B = B.
(c) Show that f can be onto but never one-to-one.
(d) Suppose f : R3 → R is a linear functional such that Im(f ) = R
holds. Try to define the quotient space
R3 /Ker(f ).
Show that it is linear isomorphic to R (refer to Fig. 3.25).
(e) Let fj : R3 → R be the linear functional satisfying
fj (
ei ) = δij , 1 ≤ i, j ≤ 3.
Then any linear functional f : R → R can be uniquely expressed as
3
f = f (
e1 )f1 + f (
e2 )f2 + f (
e3 )f3 .
(f) The set of all linear functionals from R3 to R, namely
(R3 )∗ = Hom(R3 , R)
(see Ex. 19 of Sec. B.7) is a three-dimensional real vector space,
called the (first) dual space of R3 , with {f1 , f2 , f3 } as a basis which
is called the dual basis of N in (R3 )∗ . How to define the dual basis
B∗ for (R3 )∗ of a basis B for R3 ?
3.7 Linear Transformations (Operators) 421
ϕ∗ (f )(
x ) = f (ϕ(
x )), f ∈ (R3 )∗ and
x ∈ R3 ; or
∗
x , ϕ (f ) = ϕ( x ), f .
f (
x) =
x A, x ∈ R3 ,
namely,
[f ( x ]N [f ]N
x )]N = [ N, where [f ]N
N = A3×2 .
(b) For a matrix B = [bij ]3×2 , find conditions so that there exist a
basis B for R3 and a basis B for R2 so that [f ]B
B = B.
(c) Show that f can be onto but never one-to-one.
(d) Show that R3 /Ker(f ) is isomorphic to Im(f ).
(e) Let f : R3 → R2 be a linear transformation. Then
(1) f is onto.
⇔ (2) There exists a linear transformation g: R2 → R3 so that
f ◦ g = 1R2 , the identity operator on R2 .
⇔ (3) There exist a basis B for R3 and a basis B for R2 so that
[f ]B
B is left invertible, i.e. there is a matrix B2×3 so that
B[f ]BB = I2 .
3.7 Linear Transformations (Operators) 423
Find the image of the unit sphere x21 + x22 + x23 = 1 under f . How
about the image of the unit closed ball x21 + x22 + x23 ≤ 1? Watch
the following facts: Let
y = (y1 , y2 ) = f (
x ).
(1) Ker(f ) = (1, 1, 1). Hence, f is one-to-one on the plane
x1 + x2 + x3 = 0.
(2) y1 = x1 −x3 , y2 = −x2 +x3 ⇒ x1 = y1 +x3 and x2 = −y2 +x3 .
Substitute these two equations into x1 +x2 +x3 = 0 and obtain
x3 = 13 (y2 −y1 ) and hence x1 = 13 (2y1 +y2 ), x2 = − 13 (y1 +2y2 ).
(3) Now, consider the image of the circle x21 + x22 + x23 = 1,
x1 + x2 + x3 = 0 under f .
424 The Three-Dimensional Real Vector Space R3
e2 f
0 0 e1 ′
e1
Fig. 3.43
(4) (algebraic)
a11 a21 a31 b1
Coincident Σ1 = Σ2 ⇔ = = = ;
a12 a22 a32 b2
a11 a21
intersecting along a line ⇔ at least two of the ratios ,
a12 a22
a31
and are not equal.
a32
(5) (linearly algebraic)
The coefficient matrix A and the augmented
A
matrix have the same rank, i.e.
b
A 1, if coincidence;
r = r(A) =
b 2, if intersection.
Therefore, it is worth mentioned that
x A = b has no solution.
⇔ The planes Σ1 and Σ2 are parallel.
a11 a21 a31 b1
⇔ = = = .
a12 a22 a b
32 2
A
⇔ r(A) = 1 < r = 2.
b
(b) In case
x A = b has a solution. Then
(1) The linear transformation A: R3 → R2 is onto, i.e. r(A) = 2.
⇔ (2) The solution space Ker(A) of x A = 0 is a one-dimensional
subspace of R3 .
⇔ (3) For each point b ∈ R2 , the solution set of
x A = b is a one-
dimensional affine subspace of R3 .
⇔ (4) A∗ A is an invertible 2 × 2 matrix, i.e. r(A∗ A) = 2. Thus, a
particular solution of
x A = b is
b (A∗ A)−1 A∗ .
In this case, the solution affine subspace of
x A = b is
b (A∗ A)−1 A∗ + Ker(A),
which is perpendicular to the vector b (A∗ A)−1 A∗ , namely, for any
x ∈ Ker(A),
∗
x ( b (A∗ A)−1 A∗ )∗ =
x A(A∗ A)−1 b
∗
= 0 (A∗ A)−1 b = 0 .
Hence, the point b (A∗ A)−1 A∗ has the shortest distance, among
all the points lying on the solution affine subspace, to the origin 0
426 The Three-Dimensional Real Vector Space R3
* –1 *
b(A A) A + Ker(A)
e3
Ker(A)
* –1 *
b(A A) A
0 e2
e1
Fig. 3.44
(c) In case
x A = b has a solution. Then
(1) The linear transformation A: R3 → R2 is not onto, i.e.
r(A) = 1.
⇔ (2) The solution space Ker(A) of x A = 0 is a two-dimensional
subspace of R .3
⇔ (3) For the point b for which x A = b has a solution, the
solution set is a two-dimensional affine subspace of R3 .
⇔ (4) For the point b for which x A = b has a solution, A∗ A is
not invertible and
r(A∗ A) = r(A∗ A + b∗ b ) = 1.
If
x0 is a particular solution of
x A = b , then the affine plane
x0 + Ker(A)
is the solution set of
x A = b (see Fig. 3.45). There exists a unique
point v on the plane whose distance to 0 is the smallest one, i.e.
v | = min |
| x |.
x A= b
This
v is going to be perpendicular to both Ker(A) and
x0 +
Ker(A).
3.7 Linear Transformations (Operators) 427
+
bA + Ker(A)
+
bA
e3
Ker(A)
0 e2
e1
Fig. 3.45
To find such a
v , we model after Ex. 3(c) and proceed as follows.
May suppose a11 = 0 and rewrite A as
a11
A = BC, where B = and C = 1 a−1
11 A12 and
A21 3×1 1×2
a
A21 = 21 , A12 = [a12 ]1×1
a31
x ( b A+ )∗ =
x B(B ∗ B)−1 (CC ∗ )−1 C = 0.
This means that the vector b A+ is indeed perpendicular to Ker(A)
and it is a point lying on the affine plane
x0 + Ker(A), since
( b A+ )A = b C ∗ (CC ∗ )−1 (B ∗ B)−1 B ∗ BC
= b C ∗ (CC ∗ )−1 C =
x BC =
xA = b .
428 The Three-Dimensional Real Vector Space R3
(d) Let
1 −1
A = 1 −1 .
1 −1
Show that A+ = 16 A∗ .
8. (continued from Ex. 7) Given A = [aij ]3×2 and b = (b1 , b2 ) ∈ R2 ,
suppose x ∈ R3 . As in Ex. 4, the
x A = b does not have any solution
problem is to find x0 ∈ R so that
3
|b − min | b −
x0 A| = xA|.
x ∈R3
This means that x0 A is the orthogonal projection of b onto the range
space Im(A) and | b −
x0 A| is the distance from b to Im(A). For details,
see Chaps. 4 and 5. See also Fig. 3.46.
e3 e2 ′ b
0 e2 e1 ′
0
Im(A)
e1
Fig. 3.46
According to Ex. 7, r(A) = 1 should hold in this case. Just like Ex. 4,
we have
(b −
x0 A) ⊥ x ∈ R3
x A for all
x0 AA∗ = b A∗ .
⇔
have the same rank 2, these lines are coincident along a line if and
only if r(A) = r([A | b∗ ]) = 1.
11. In R3 .
(a) Find necessary and sufficient conditions for n points (xi , yi , zi ),
1 ≤ i ≤ n, to be coplanar or collinear.
430 The Three-Dimensional Real Vector Space R3
(b) Find necessary and sufficient conditions for n planes ai1 x1 +ai2 x2 +
ai3 x3 + bi = 0, 1 ≤ i ≤ n, to be concurrent at a point, intersecting
along a line or coincident along a plane.
<C>
Read Ex. <C> of Sec. 2.7.3 and do all the problems there if you missed
them at that time.
Also, do the following problems. Refer to Exs. 19–24 of Sec. B.7, if
necessary.
1. A mapping f : C3 → C3 is defined by
f (x1 , x2 , x3 ) = (3ix1 − 2x3 − ix3 , ix2 + 2x3 , x1 + 4x3 ).
(a) In the natural basis N = { e1 , e3 }, f can be expressed as
e2 ,
3i 0 1
f ( x ) = [ x ]N [f ]N , where [f ]N = [f ]N
N = −2 i 0 .
−i 2 4
(b) Let x1 = (1, 0, i), x2 = (−1, 2i, 1), x3 = (2, 1, i). Show that both
B = { x1 , x2 , x3 } and f (B) = B = {f (
x1 ), f (
x2 ), f (
x3 )} are bases
for C3 . What is [f ]B B ?
(c) By direct computation, show that
23−3i
2
1−5i
2 −5 + i
−15+9i 5+5i
[f ]B
B = 2 2 5 − 5i .
19 − 5i −5i −10 + 3i
B
Compute the transition matrix and justify that [f ]B
PN B =
B N N
PN [f ]N PB .
B B B
(d) Compute [f ]B B B
B and PB and justify that [f ]B = PB [f ]B PB .
(e) Let g: C → C be defined by
3 3
where N = {
e1 , . . . ,
en+1 } is the natural basis for Rn+1 . What is
[Φ ◦ D ◦ Φ−1 ]N
N
where D is as in (b)?
3. In P2 (R), let N = {1, x, x2 } be the natural basis and let
B = {x2 − x + 1, x + 1, x2 + 1},
B = {x2 + x + 4, 4x2 − 3x + 2, 2x2 + 3},
C = {x2 − x, x2 + 1, x − 1},
C = {2x2 − x + 1, x2 + 3x − 2, −x2 + 2x + 1}.
(a) Use Ex. 2(g) to show that B, B , C and C are bases for P2 (R).
(b) Show that {5x2 − 2x − 3, −2x2 + 5x + 5} is linear independent in
P2 (R) and extend it to form a basis for P2 (R).
(c) Find a subset of {2x2 − x, x2 + 21x − 2, 3x2 + 5x + 2, 9x − 9} that is
a basis for P2 (R).
432 The Three-Dimensional Real Vector Space R3
(d) Compute the transition matrices PCB and PCB . Let Φ be as in Ex. 2(g)
and notice that
(1) Φ transforms a basis B for P2 (R) onto a basis Φ(B) for R3 .
(2) What is [Φ]B
Φ(B) ?
(3) PCB = [Φ]B
Φ(B) Φ(C)
Φ(B) PΦ(C) [Φ]C .
(e) Let T : P2 (R) → P2 (R) be defined by
T (p)(x) = p (x) · (3 + x) + 2p(x).
Show that T is a linear operator. Compute [T ]N and [T ]B and justify
that [T (p)]B = [p]B [T ]B by supposing p(x) = 3 − 2x + x2 .
B B B
(f) Compute [T ]B B C
C and [T ]C and justify that [T ]C = PB [T ]C PC .
(g) Let U : P2 (R) → R be defined by
3
U (a + bx + cx2 ) = (a + b, c, a − b).
Show that U is a linear isomorphism. Use N = { e1 , e3 } to
e2 ,
denote the natural basis for R . Compute [U ]N and [U ]N , [U −1 ]N
3 N B
B
and justify that
[U ]B
N [U
−1 N
]B = [U −1 ]N B
B [U ]N = I3 ,
i.e. ([U ]B
N)
−1
= [U −1 ]N
B .
(h) Compute [U ]N and justify that [U ◦ T ]N
N N
N = [T ]N [U ]N .
(i) Define V : P2 (R) → M(2: R) by
p (0) 0
V (p) = .
2p(1) p (3)
Show that V is a linear transformation and compute [V ]N N where
N = {E11 , E12 , E21 , E22 } is the natural basis for M(2: R) (see
Sec. B.4). Verify that [V (p)]N = [p]N [V ]N
N if p(x) = 4 − 6x + 3x .
2
Show that {g1 , g2 , g3 } is a basis for P2 (R)∗ . Try to find a basis for
P2 (R) so that its dual basis in P2 (R)∗ is {g1 , g2 , g3 }.
∗
(m) Let T be as in (e). Compute [T ∗ ]CB∗ and justify that it is equal to
([T ]B ∗ ∗ 2
C ) . Try to find T (f ) if f (ax + bx + c) = a + b + c by the
following methods:
∗
(1) [T ∗ (f )]B∗ = [f ]C ∗ [T ∗ ]CB∗ .
(2) By definition of T ∗ , for p ∈ P2 (R), then
T ∗ (f )(p) = f (T (p)).
[Φ]N
Φ(N ) = I4 .
B C
(e) Compute the transition matrices PN , PN and PCB by using the
following methods:
(1) Direct computation via definitions.
(2) PCB = [Φ]B
Φ(B) Φ(C)
Φ(B) PΦ(C) [Φ]C , etc.
Verify that
PCB = PN
B N
PC .
that
∗
[T ∗ ]CB∗ = ([T ]B
C)
∗
Projection on R3
Let f : R3 → R3 (or Rn → Rn for n ≥ 2) be a nonzero linear operator with
rank equal to r, 1 ≤ r ≤ 2.
f (
x) =
x1 .
x x
V1 V2
V2 V1
x2 e3 e3
x2
x1 = f ( x ) x1 = f ( x )
0 e2 0 e2
e1 e1
r( f ) = 2 r( f ) = 1
Fig. 3.47
g(
ei ) =
xi , 1 ≤ i ≤ m,
h(f (
xi )) = ei for 1 ≤ i ≤ r and h( ej for r + 1 ≤ j ≤ n.
yj ) =
Then h ◦ f ◦ g has the required property. See the following diagram and
Fig. 3.48 for m = 3, n = 2 and r(f ) = 1.
f
Rm −→ Rn
(B) (C)
g↑ ↓h
(N ) (N )
h◦f ◦g
Rm −−−−→ Rn
e3
Ker( f ) Im( f )
e2 ′
x1
x3 f ( x1 )
y2
f
0 e2 e1 ′
0
x2
e1
g h
e3
〈〈 e2, e3 〉〉
e2 ′
( x1 , x2 , x3 )
h° f °g
0 e2 ( x1 , 0)
e1
( x1 , 0, 0) 0 e1 ′
Fig. 3.48
and Rn . Suppose
f (
x) =
x A,
f (
x1 )
x1 .
.. ..
.
f ( xr )
P = [1Rm ]B
N =
xr and Q−1 = [1Rn ]CN = .
. yr+1
..
..
.
xm m×m
yn n×n
Then
I 0 N
P AQ = r = [f ]B B N
C = [g]N [f ]N [h]C . (3.7.37)
0 0
P1* Qn−1
*
[ f ] = PAQ
P2* 0 0
Pm* Q−1
2*
Q1*−1
em en '
[f] = A
0 e2 0 e2 '
e1 e1 '
Fig. 3.49
(1) Then
dim Ker(f 3 ) = dim Ker(f 4 ) = · · · = dim Ker(f n ) = · · · , for n ≥ 3;
r(f ) = r(f ) = · · · = r(f ) = · · · ,
3 4 n
for n ≥ 3.
(2) For any real 3 × 3 matrix A,
r(A3 ) = r(A4 ) = · · · = r(An ) = · · · , for n ≥ 3.
In general, for a linear operator f : Rn → Rn , there exists a least positive
integer k so that
1. Ker(f k ) = Ker(f k+1 ) = · · · and Im(f k ) = Im(f k+1 ) = · · · , and
2. Rn = Ker(f k ) ⊕ Im(f k ).
(3.7.38)
Exercises
<A>
1. Prove (3.7.34), (3.7.35), (3.7.36) and (3.7.38).
2. For each of the following matrices A, do the following problems.
(1) Find invertible matrix P such that AP is a projection on R3 .
(2) Find invertible matrices P and Q so that PAQ is in its normal form.
440 The Three-Dimensional Real Vector Space R3
(3) Use A to justify (3.7.38) and determine the smallest positive integer
k so that r(Ak ) = r(An ) for n ≥ k.
6 1 −5 −3 −6 15
(a) A = 2 −3 4 . (b) A = −1 −2 5 .
3 7 −1 2 4 −10
−2 6 3
(c) A = 0 12 10 .
4 0 4
3. Do Exs. <A> 2 and 5–15 of Sec. 2.7.4 in case Rn or n × n matrix A for
n ≥ 3.
<B>
Do Exs. <B> 1–3 of Sec. 2.7.4 in case Rn or n × n matrix A for n ≥ 3.
<C> Abstraction and generalization
Read Ex. <C> of Sec. 2.7.4 and do all the problems there.
<D> Applications
Do the following problems
1. Remind that the (n + 1)-dimensional vector space Pn (R) has the natural
basis N = {1, x, . . . , xn−1 , xn }. Let D: Pn (R) → Pn−1 (R) ⊆ Pn (R) be
the differential operator
D(p(x)) = p (x)
and I: Pn−1 (R) → Pn (R) be the integral operator
6 x
I(q(x)) = q(t) dt.
0
(a) Show that D ◦ I = 1Pn−1 (R) , the identity operator on Pn−1 (R) and
N N
[D ◦ I]N = IN DN = In ,
where N = {1, x, . . . , xn−1 } is the natural basis for Pn−1 (R). Is this
anything to do with the Newton–Leibniz theorem?
(b) Is I ◦ D = 1Pn (R) true? Why? Any readjustment needed?
(c) Show that, for 1 ≤ k < n,
Pn (R) = Pk (R) ⊕ xk+1 , . . . , xn ,
where Pk (R) is an invariant subspace of D, and xk+1 , . . . , xn is
not. Therefore
Pn (R)/Pk (R) is isomorphic to xk+1 , . . . , xn .
3.7 Linear Transformations (Operators) 441
i(b−a)
Now, divide [a, b] into n equal parts with ai = a + n for
0 ≤ i ≤ n.
(1) Take n = 1. Show that
6 b
(b − a)
p(t) dt = [p(a) + p(b)].
a 2
This is the trapezoidal rule for polynomials.
(2) Take n = 2. Calculate
6 b 2 6 b
p(t) dt = p(ai ) pi (t) dt
a i=0 a
x1 x2 x3 |
1 1 0 |
b1 x1 + x2 = b1
0 −1 −5 |
|
b2 − 4b1 ⇒ −x2 − 5x3 = b2 − 4b1
|
0 0 5 | b 3 − b1 5x3 = b3 − b1
x1 = −4b1 + b2 + b3
⇒ x2 = 5b1 − b2 − b3
x = 1 (b − b ).
3 3 1
5
∗
This is the solution of the equations A x ∗ = b . On the other hand,
1 1 0
E(3)−(1) E(2)−4(1) A = 0 −1 −5
0 0 5
444 The Three-Dimensional Real Vector Space R3
1 1 0 1 1 0
−1
⇒ A = E(2)−4(1) −1
E(3)−(1) 0 −1 −5 = E(2)+4(1) E(3)+(1) 0 −1 −5
0 0 5 0 0 5
1 0 0 1 0 0 1 1 0
= 4 1 0 0 1 0 0 −1 −5
0 0 1 1 0 1 0 0 5
1 0 0 1 1 0
= 4 1 0 0 −1 −5 (LU-decomposition)
1 0 1 0 0 5
1 0 0 1 0 0 1 1 0
= 4 1 0 0 −1 0 0 1 5 (LDU-decomposition).
1 0 1 0 0 5 0 0 1
From here, it is easily seen that
which can also be obtained from (*2) (see Application 8 in Sec. B.5).
Stop at (*2):
A is invertible, since
Therefore,
−4 1 1
A−1 = 5 −1 −1
− 15 0 1
5
= E(1)+5(3) E(2)−5(3) E(1)−(2) E 15 (3) E−(2) E(3)−(1) E(2)−4(1)
1 1
⇒ det A−1 = · (−1) = − ;
5 5
3.7 Linear Transformations (Operators) 445
and
−1 −1 −1
A = E(2)−4(1) E(3)−(1) E−(2) E −1
1 E −1 E −1 E −1
(3) (1)−(2) (2)−5(3) (1)+5(3)
5
Readers are urged to carry out actual computations to solve out the
solution.
The elementary matrices, LU and LDU decompositions can be used to
help investigating geometric mapping properties of A, better using GSP. For
example, the image of the unit cube under A is the parallelepiped as shown
in Fig. 3.50. This parallelepiped can be obtained by performing successive
mappings E(2)+4(1) to the cube followed by E(3)+(1) · · · then by E(1)−5(3) .
Also (see Sec. 5.3),
(2, 2,5)
(1,1,5)
e3
(6,5, 0)
(5, 4, 0)
A e3
e2 (1,1, 0)
e1
e2 0
0
e1
( )
1
3
scale (4, 3, − 5)
(5, 4, − 5)
Fig. 3.50
1 1 0
the signed volume of the parallelepiped = det 4 3 −5 = −5
1 1 5
the signed volume of the parallelepiped
⇒ = det A.
the volume of the unit cube
Since det A = −5 < 0, so A reverses the orientations in R3 . 2
Example 2 Let
0 1 −1
A = 2 4 6 .
2 6 4
Do problems similar to Example 1.
Solution Perform elementary row operations to
0 1 −1 |
|
b1 |
|
1 0 0 (x1 )
∗
[A| b |I3 ] = 2 4 6 |
| b2 |
| 0 1 0 (x2 )
| |
2 6 4 | b3 | 0 0 1 (x3 )
3.7 Linear Transformations (Operators) 447
| |
2 4 6 |
b2 |
0 1 0 (x1 )
−−−−−→ 0 1 −1 |
| b1 |
| 1 0 0 (x2 )
E(1)(2) | |
2 6 4 | b3 | 0 0 1 (x3 )
| b2 | 1
1 2 3 | 2 |
0 2 0
−−−−−→ 0 1 −1 |
| b1 |
| 1 0 0
E(3)−(1)
| |
E1
(1)
0 2 −2 | b3 − b 2 | 0 −1 1
2
| b2 | 1
1 2 3 | 2 |
0 2 0
−−−−−−→ 0 1 −1 |
| b1 |
| 1 0 0 (*3)
E(3)−2(2) | |
0 0 0 | b3 − b2 − 2b1 | −2 −1 1
1 0 5 |
|
− 2b1
b2
2
|
|
−2 1
2 0
−−−−−−→ 0 1 −1 |
| b1 |
| 1 0 0 . (*4)
E(1)−2(2) | |
0 0 0 | b3 − b2 − 2b1 | −2 −1 1
Notice that, since the leading entry of the first row of A is zero, exchange
of row 1 and row 2 is needed as the first row operation.
From (*3),
∗
x ∗ = b has a solution
A x.
x2 1 2 3 x1
⇔ E(3)−2(2) E 12 (1) E(3)−(1) E(1)(2) A x1 = 0 1 −1 x2
x3 0 0 0 x3
b2
2
= b1
b3 − b2 − 2b1
has a solution
x.
⇔ b3 − b2 − 2b1 = 0.
In this case,
b2
x1 + 2x2 + 3x3 =
2
x − x = b
2 3 1
x2 = b1 + x3
⇒ x1 = −2b1 + b2 − 5x3
2
x3 ∈ R is arbitrary.
448 The Three-Dimensional Real Vector Space R3
b2
⇒
x= −2b1 + − 5x3 , b1 + x3 , x3
2
b2
= −2b1 + , b1 , 0 + x3 (−5, 1, 1), x3 ∈ R.
2
Hence, the
solution set is the affine line −2b1 + b22 , b1 , 0 + (−5, 1, 1) in
R3 with −2b1 + b22 , b1 , 0 as a particular solution. It is worth mentioning
∗
x ∗ = b is the system of equations
that, A
x2 − x3 = b1
2x + 4x2 + 6x3 = b2
1
2x1 + 6x2 + 4x3 = b3 .
For this system of equations to have a solution, it is necessary and sufficient
that, after eliminating x1 , x2 , x3 from the equations,
b2 − 4b1 − 10x3 + 6b1 + 6x3 + 4x3 = b3
which is b3 − b2 − 2b1 = 0, as claimed above.
(*3) tells us that A is not invertible.
But (*3) does indicate that
1 2 3
E(3)−2(2) E 12 (1) E(3)−(1) (E(1)(2) A) = 0 1 −1 (upper triangle)
0 0 0
2 4 6 1 2 3
⇒ E(1)(2) A = 0 1 −1 = E −1 −1 −1
(3)−(1) E 12 (1) E(3)−2(2) 0 1 −1
2 6 4 0 0 0
1 2 3
= E(3)+(1) E2(1) E(3)+2(2) 0 1 −1
0 0 0
2 0 0 1 2 3
= 0 1 0 0 1 −1
2 2 1 0 0 0
1 0 0 2 4 6
= 0 1 0 0 1 −1 (LU-decomposition)
2 2 1 0 0 0
1 0 0 2 0 0 1 2 3
= 0 1 0 0 1 0 0 1 −1 (LDU-decomposition).
2 2 1 0 0 0 0 0 0
3.7 Linear Transformations (Operators) 449
Refer to (2) in (2.7.69) and the example after this. A can be decomposed
as follows too.
0 1 −1 0 1 −1 2 4 6
A −−−−−→ 2 4 6 −−−−−−→ 2 4 6 −−−−→ 0 1 −1
E(3)−(2) E(3)−2(1) E(1)(2)
0 2 −2 0 0 0 0 0 0
2 4 6
−1
⇒ A = E(3)−(2) −1
E(3)−2(1) −1
E(1)(2) 0 1 −1
0 0 0
1 0 0 0 1 0 2 4 6
= 0 1 0 1 0 0 0 1 −1 (LPU-decomposition).
2 1 1 0 0 1 0 0 0
From (*4), firstly, (*4) says that A is not invertible. Secondly, (*4) also says
∗
that Ax ∗ = b has a solution if and only if b3 − b2 − 2b1 = 0, and the solu-
tions are x1 = b22 −2b1 −5x3 , x2 = b1 +x3 , x3 ∈ R. Third, (*4) indicates that
1 0 5
E(1)−2(2) E(3)−2(2) E 1 (1) E(3)−(1) E(1)(2) A = 0 1 −1
2
0 0 0
1 0 5
⇒ P A = 0 1 −1 ,
0 0 0
−2 1
2 0
where P = 1 0 0 = E(1)−2(2) E(3)−2(2) E 12 (1) E(3)−(1) E(1)(2) .
−2 −1 1 3×3
(row-reduced echelon matrix of A)
Now perform elementary column operations
to
1 0 5 1 0 0
0 1 −1 0 1 0
I 0
2
PA 0 0 0 0 0 0
---- = --------- −−−−−−→ --------- = 0 0
F(3)−5(1) -----
I3 1 0 0 F(3)+(2) 1 0 −5
Q3×3
0 1 0 0 1 1
0 0 1 0 0 1
1 0 −5
I2 0
⇒ P AQ = , where Q = 0 1 1 = F(3)−5(1) F(3)+(2) .
0 0 3×3 0 0 1
(the normal form of A)
450 The Three-Dimensional Real Vector Space R3
Im(A) = (1, 0, 5), (0, 1, −1), where the basis vectors are the first and
the second row vectors of P A;
Ker(A) = (−2, −1, 1), where (−2, −1, 1) is the third row vector
of P ;
Im(A∗ ) = (0, 2, 2), (1, 4, 6), generated by the first and the second
column vectors of A;
Ker(A∗ ) = (−5, 1, 1), generated by the fundamental
solution
x∗ = 0 .
(−5, 1, 1) of A
Therefore, the parallelepiped with side vectors e 2 and (−2, −1, 1) has
e 1,
the image under A the parallelogram with side vector e 1 A and
e 2 A. See
Fig. 3.51.
e2 A = (2,4,6)
(−2, −1, 1)
(−1, −1, 1) e3
0
e2
A
( )
1
2
scale
(1,1,0) e3
e1 e2
e1
0
e1A = (0, 1, −1)
Fig. 3.51
3.7 Linear Transformations (Operators) 451
Example 3 Let
1 −3 0
A = −3 2 −1 .
0 −1 4
Do problems similar to Example 1.
1 −3 0 |
|
b1 |
|
1 0 0
−−−−−−→ 0 1 17 |
| − b2 +3b
7
1 |
| − 37 − 17 0
E− 1 (2) | |
7 0 −1 4 | b3 | 0 0 1
1 −3 0 |
|
b1 |
|
1 0 0
| |
−−−−−−→ 0 1 17 |
|
− b2 +3b
7
1 |
|
− 37 − 17 0 (*5)
E(3)+(2)
| |
0 0 29
7 | − b2 +3b
7
1
+ b3 | − 37 − 17 1
1 0 3
7
|
|
− 3b2 +2b
7
1 |
|
− 27 − 37 0
| |
−−−−−−→ 0 1 1
7
|
|
− b2 +3b
7
1 |
|
− 37 − 17 0
E 7
(3)
− b2 +3b291 −7b3
29 | |
E(1)+3(2) 0 0 1 | | − 29
3
− 29
1 7
29
1 0 0 |
|
− 7b1 +12b
29
2 +3b3 |
|
− 29
7
− 12
29 − 29
3
| |
1
−−−−−−→ 0 1 0 |
|
− 12b1 +4b
29
2 +b3 |
|
− 12
29 − 29
4
− 29 . (*6)
E
(1)− 3 (3)
− 3b1 +b292 −7b3
7 | |
E
(2)− 1 (3)
0 0 1 | | − 29
3
− 29
1 7
29
7
From (*5):
∗
x∗
A = b has a solution
x.
∗
x ∗ = P b has a solution
⇔ P A x, where
1 0 0
3
P = E(3)+(2) E− 17 (2) E(2)+3(1) =
− 7 − 7 0 .
1
− 37 − 17 1
In this case,
− 1
x 1 3x2 = b1
x1 = − (7b1 + 12b2 + 3b3 )
29
1 1
x2 + x3 = − (b2 + 3b1 ) ⇒ x2 = − 1 (12b1 + 4b2 + b3 ) .
7 7
29
29 x3 = − 1 (b2 + 3b1 − 7b3 )
x3 = − 1 (3b1 + b2 − 7b3 )
7 7 29
On the other hand,
1 −3 0
PA =
0 1 1
7
29
0 0 7
3.7 Linear Transformations (Operators) 453
Since
1 0 0
P −1 = E(2)−3(1) E−7(2) E(3)−(2) = −3 −7 0
0 −1 1
1 −3 0
1 0 0
⇒ A = −3 −7 0 0 1 17
0 −1 1 29
0 0 7
1 0 0 1 −3 0
= −3 1 0 0 −7 −1 (LU-decomposition)
1
0 7 1 0 0 29
7
1 0 0 1 0 0 1 −3 0
= −3 1 0 0 −7 0 0 1 17 (LDL∗ -decomposition).
1 29
0 7 1 0 0 7 0 0 1
∗
x∗ = b .
These decompositions can help in solving the equation A
Moreover,
1 −3 0 1 − 37 − 37
1 0 0
PAP ∗ =
0 1 1
7
0 − 1
7 − 17
= 0 − 71 0 . (*7)
29
29 0 0 7
0 0 7 0 0 1
Notice that
P ∗ = E(2)+3(1)
∗ ∗
E− 1
∗
(2) E(3)+(2) = F(2)+3(1) F− 7 (2) F(3)+(2) .
1
7
row operations and column operations of the same types, we will get a
diagonal matrix. In fact
1 0 0
E(2)+3(1)
A −−−−−−→ E(2)+3(1) AF(2)+3(1) = 0 −7 −1
F(2)+3(1)
0 −1 4
E− 1 (2) 1 0 0 1 0 0
E
−→ 0 − 17 71 −−−−−→ 0 − 17 0
7 (3)+(2)
−−−− (*8)
F− 1 (2) 1 F(3)+(2) 29
7 0 7 4 0 0 7
1 0 0 1 20 0
⇒ QP1 AP1∗ Q∗ = 0 1 0 , where Q = 0 7
0 .
29 √
0 0 −1 0 0 7
Now, let
1 0 0
3√7 √ √
S = QP1 = − √
7 √29 − √7
7 √
29
√7
29
−377 − 77 0
1 0 0
⇒ SAS ∗ = 0 1 0 (A is congruent to a diagonal matrix). (*9)
0 0 −1
Hence, the index of A is equal to 2, the signature is equal to 2 − 1 = 1, and
the rank of A is 2 + 1 = 3 (see (2.7.71)).
The invertible matrix S in (*9) is not necessarily unique. A is diag-
onalizable (see Secs. 3.7.6 and 5.7 ). A has characteristic polynomial
det(A − tI3 ) = −t3 + 7t2 − 4t − 29 and has two positive eigenvalues λ1 , λ2
3.7 Linear Transformations (Operators) 455
√1 0 0
1 0 0 λ1 v1
0
⇒ S1 AS1∗ = 0 1 0 , where S1 = 0 √1
λ2 v2.
0 0 −1 0 0 √1
v3
−λ3
(*10)
in R3 . Let
x S −1
y = (y1 , y2 , y3 ) =
x in the coordinate system B formed by
= the coordinate vector of
three row vectors of S.
Then
x,
x A = x ∗ = (
x A x S −1 )(SAS ∗ )(
x S −1 )∗
y∗
y (SAS ∗ )
=
= y12 + y22 − y32 = 1.
This means that, in B, the quadric looks like a hyperboloid of one sheet
(refer to Fig. 3.90) and can be used as a model for hyperbolic geometry (see
Sec. 5.12).
The following examples are concerned with matrices of order m × n
where m = n.
Example 4 Let
2 1 0
A= .
−1 0 1
Caution that, for the understanding of (3), one needs basic knowledge about
Euclidean structures of R2 and R3 , and one can refer to Part 2 if necessary.
Solution Perform elementary column operations to
2 1 0 1 1 0 1 0 0
0 0 1 0 1 0
−1 0 1
A
--------- -------- ----------
-- = −−−−−−→ 1 −−−−−→ 1 .
I3 1 0 0 F 12 (1) 2 0 0 F(2)−(1) 2 0 − 12
F F(2)(3)
0 1 0 (1)+ 21 (3) 0 1 0 0 0 1
0 0 1 1
2 0 1 1
2 1 − 12
3.7 Linear Transformations (Operators) 457
Hence, A is right invertible and B is one of its right inverses. In general, the
right inverses B can be found in the following way. Let B = [ v ∗1
v ∗2 ]3×2 .
Then
∗ ∗
AB = A
v1 v2 v ∗1
= A v ∗2 = I2
A
v ∗1 =
⇔ A e ∗1
v ∗2 =
A e ∗2 .
Suppose
v1 = (x1 , x2 , x3 ), then
v ∗1 =
A e ∗1 ⇔ 2x1 + x2 = 1
− x1 + x3 = 0
v1 = (0, 1, 0) + t1 (1, −2, 1)
⇔ for t1 ∈ R.
Similarly, if
v2 = (x1 , x2 , x3 ), then
v ∗2 =
A e ∗2 ⇔ 2x1 + x2 = 0
− x1 + x3 = 1
v 2 = (0, 0, 1) + t2 (1, −2, 1)
⇔ for t2 ∈ R.
1 0 −1 |
−b2 |
0 −1
−−−−−−−−−−−→ |
|
|
| . (*13)
E2(2) , E(1)− 1 (2) 0 1 2 | 2b2 + b1 | 1 2
2
From (*12),
1
1 2 0
E(2)+(1) E 12 (2) A = 1
(an echelon matrix)
0 2 1
1
1 2 0
⇒ A = E2(1) E(2)−(1)
0 12 1
1
2 0 1 2 0
= (LU-decomposition).
−1 1 0 12 1
Refer to (2.7.70).
From (*13),
∗
Ax ∗ = b has a solution
x = (x1 , x2 , x3 ).
x
1 0 −1 1 −b2
⇔ x2 = has a solution
x.
0 1 2 2b2 + b1
x3
⇒ x1 = −b2 + x3
x2 = b1 + 2b2 − 2x3 , x3 ∈ R
⇒ x = (−b2 , b1 + 2b2 , 0) + t(1, −2, 1),
t ∈ R.
Thus the solution set is a one-dimensional affine subspace in R3 .
Also, (*13) indicates that
E(1)− 12 (2) E2(2) E(2)+(1) E 12 (1) A
1 0 −1
= (row-reduced echelon matrix)
0 1 2
1 0 −1 0 −1
⇒PA = , where P = E (1)− 12 (2) E2(2) E(2)+(1) E 1
2 (1)
= .
0 1 2 1 2
3.7 Linear Transformations (Operators) 459
and
5 −2
AA∗ = ,
−2 2
1 2 2
(AA∗ )−1 = .
6 2 5
For any fixed b ∈ R3 , x A = b may have a solution or not if and only if
the distance from b to the range space Im(A) is zero or not (see Ex. <B> 4
x0 ∈ R2 so that
of Sec. 3.7.3 and Fig. 3.42). Suppose
|b − min | b −
x0 A| = x A|
x ∈R2
⇔(b −
x0 A) ⊥ x ∈ R2
x A for all
⇔(b − x A)∗ = 0 for all
x0 A)( x ∈ R2
x0 AA∗ = b A∗
⇔
x0 = b A∗ (AA∗ )−1 .
⇔ (*14)
460 The Three-Dimensional Real Vector Space R3
x0 A = b A∗ (AA∗ )−1 A is the orthogonal projection of b on Im(A).The
and
operator
2 −1
1 2 2 2 1 0
A∗ (AA∗ )−1 A = 1 0 · ·
6 2 5 −1 1 0
0 1
5 2 −1
1
= 2 2 2 (*15)
6
−1 2 5
Im(A)
b (0,1,2)
* * –1
bA (AA ) A
e3
b(2A (AA ) A−I3)
* * –1
e2
0
e1 Ker(A )
*
(2,1,0)
Fig. 3.52
3.7 Linear Transformations (Operators) 461
A subsequent problem is to find the reflection or symmetric point of b
with respect to the plane Im(A) (see Fig. 3.52). It is the point
b A∗ (AA∗ )−1 A + ( b A∗ (AA∗ )−1 A − b ) = b (2A∗ (AA∗ )−1 A − I3 ). (*16)
Hence, denote the operator
PA = 2A∗ (AA∗ )−1 A − I3 = 2A+ A − I3
2
5 2 −1 3
2
3 − 13
1 2
=2· 2 2 2 − I3 = 32 − 13 3
. (*17)
6
−1 2 5 − 13 2 2
3 3
and (5, 2, −1). Readers are urged to model after Fig. 3.31 to explain graph-
ically the mapping properties of A∗ A and A+ A.
What we obtained for this particular A is universally true for any real
2 × 3 matrix of rank 2. We summarize them in
A real matrix A2×3 of rank 2 and its transpose A∗ and its
generalized inverse A+
(1) 1. AA∗ : R2 → R2 is an invertible liner operator.
2. A∗ A: R3 → Im(A) ⊆ R3 is an onto linear transformation with Im(A)
as an invariant subspace.
(2) AA∗ = 1R2 : R2 → R2 is the identity operator on R2 , considered as the
orthogonal projection of R2 onto itself along { 0 }.
⇔ A∗ A: R3 → R3 is the orthogonal projection of R3 onto Im(A) along
Ker(A∗ ), i.e. A∗ A is symmetric and (A∗ A)2 = A∗ A.
In general, A∗ does not have these properties except A∗ is a right inverse
of A.
(3) The generalized inverse
A+ = A∗ (AA∗ )−1
of A orthogonalizes A both on the right and on the left in the following
sense:
AA+ = 1R2 is the orthogonal projection of R2 onto itself along
Ker(A) = { 0 }.
⇔ A A is the orthogonal projection of R3 onto Im(A) along Ker(A∗ ).
+
Example 5 Let
1 0
A = 1 1 .
1 −1
Do the same problems as Example 4.
3.7 Linear Transformations (Operators) 463
1 0 0
BA = I2 , where B =
−1 1 0
v1
i.e. B is a left inverse of A. In general, let B =
v
. Then
2 2×3
v1 v A
BA = A = 1 = I2
v2 v 2A
⇔
v 1A =
e1
v 2A =
e 2.
Suppose
v 1 = (x1 , x2 , x3 ). Then
v 1A =
e1
x1 + x2 + x3 = 1
⇔
x2 − x3 = 0
⇔
v 1 = (1, 0, 0) + t1 (−2, 1, 1), t1 ∈ R.
464 The Three-Dimensional Real Vector Space R3
Similarly, let
v 2 = (x1 , x2 , x3 ). Then
v 2A =
e2
x1 + x2 + x3 = 0
⇔
x2 − x3 = 1
⇔
v 2 = (−1, 1, 0) + t2 (−2, 1, 1), t2 ∈ R.
1 0 0
P = E(3)+(2) E(3)−(1) E(2)−(1) = −1 1 0
−2 1 1
1 0 1 0
−1
⇒ A = E(2)−(1) −1
E(3)−(1) −1
E(3)+(2) 0 1 = P −1 0 1
0 0 0 0
1 0 0 1 0
= 1 1 0 0 1 (LU-decomposition).
1 −1 1 0 0
and
3 0
A∗ A = ,
0 2
∗ −1 1 2 0
(A A) = .
6 0 3
For any fixed b ∈ R2 ,
x A = b always has a particular solution
∗ −1 ∗
b (A A) A and the solution set is
b (A∗ A)−1 A∗ + Ker(A), (*20)
A+ = (A∗ A)−1 A∗
1 2 0 1 1 1 1 2 2 2
= = (*21)
6 0 3 0 −1 −1 6 0 3 −3 2×3
A+ A = I2 ,
A+ is a left inverse of A.
466 The Three-Dimensional Real Vector Space R3
B in (∗ 19) is A+ .
⇔ The range space of B = Im(A∗ )
⇔ (1 − 2t1 , t1 , t1 ) and (−1 − 2t2 , 1 + t2 , t2 ) are in Im(A∗ ).
2(1 − 2t1 ) − t1 − t1 = 0
⇔
2(−1 − 2t2 ) − (1 + t2 ) − t2 = 0
1 1
⇒ t1 = and t2 = − .
3 2
In this case, B in (*19) is indeed equal to A+ . This A+ is called the gener-
alized inverse of A.
How about AA∗ ?
1 0 1 1 1
1 1 1
AA∗ = 1 1 = 1 2 0 (*22)
0 1 −1
1 −1 1 0 2
I2 0
⇒ R(AA∗ )R∗ = ,
0 0 3×3
where
√1 0 0 0 √1
2
− √12 0 1
2 − 12
2
√1
√1 √1 √1
1 1 1
R= 0 3
0 3 3 3 = 3 3 3 .
0 0 1 − √2 √1 √1 − √26 √1 √1
6 6 6 6 6
Thus, the index of AA∗ is 2 and the signature is equal to 2. By the way,
we pose the question: What is the preimage of the unit circle (or disk)
y12 + y22 = 1 (or ≤1) under A? Let
y = (y1 , y2 ) =
x A. Then
y12 + y22 =
yy∗ = 1
⇔ ( x A)∗ =
x A)( x∗
x AA∗
= x21 + 2x22 + 2x23 + 2x1 x2 + 2x1 x3 , in the natural basis
for R3
= 2x2 2
1 + 3x2 , in the basis { v 1 , v 2 , v 3 } = B
= x2 2
1 + x2 , in the basis {R1∗ , R2∗ , R3∗ } = C
= 1, (*23)
R3 Ker(A)
R2
e3 v3
v2 A
e2
e1 0
A*
v1
Im( A* )
Fig. 3.53
468 The Three-Dimensional Real Vector Space R3
x in R3 with respect
What is the reflection or symmetric point of a point
∗
to the plane Im(A )? Notice that (see (*16))
x ∈ R3
→ x on Im(A∗ )
x AA+ , the orthogonal projection of
→ x AA+ −
x AA+ + ( x (2AA+ − I3 ), the reflection point.
x) = (*24)
PA is symmetric and is orthogonal, i.e. PA∗ = PA−1 and is called the reflection
of R3 with respect to Im(A∗ ).
A simple calculation shows that
D = {
u 1, u 3 } is an orthonormal basis for R3 . In D,
u 2,
1 0 0 u1
+ + −1
[AA ]D = S(AA )S = 0 1 0 , where S = u 2 is orthogonal;
0 0 0 u3
1 0 0
[PA ]D = SPA S −1 = 0 1 0 .
0 0 −1
Try to explain [AA+ ]D and [PA ]D graphically.
As a counterpart of (3.7.40), we summarize in
A real matrix A3×2 of rank 2 and its transpose A∗ and its
generalized inverse A+
(1) 1. AA∗ : R3 → Im(A∗ ) ⊆ R3 is an onto linear transformation with
Im(A∗ ) as an invariant subspace.
2. A∗ A: R2 → R2 is an invertible linear operator.
(2) AA∗ : R3 → R3 is the orthogonal projection of R3 onto Im(A∗ ) along
Ker(A), i.e. AA∗ is symmetric and (AA∗ )2 = AA∗ .
⇔ A∗ A = 1R2 : R2 → R2 is the identity operator on R2 , considered as
the orthogonal projection of R2 onto itself along { 0 }.
In general, A∗ does not have these properties except A∗ is a left inverse
of A.
(3) The generalized inverse
A+ = (A∗ A)−1 A∗
of A orthogonalizes A both on the right and on the left in the following
sense.
AA+ : R3 → R3 is the orthogonal projection of R3 onto Im(A∗ )
along Ker(A).
⇔ A+ A = 1R2 is the orthogonal projection of R2 onto itself along
Ker(A∗ ) = { 0 }.
Therefore, A+ can be defined as the inverse of the linear isomorphism
A|Im(A∗ ) : Im(A∗ ) ⊆ R3 → R2 , i.e.
A+ = (A|Im(A∗ ) )−1 : Im(A) = R2 → Im(A∗ ). (3.7.41)
These results are still valid for real Am×n with rank equal to n. (see Ex. 12
of Sec. B.8 and Sec. 5.5.)
470 The Three-Dimensional Real Vector Space R3
Exercises
<A>
1. Prove (2.7.68) of Sec. 2.7.5 for A3×3 .
2. Prove (2.7.69) of Sec. 2.7.5 for A3×3 .
3. Prove (2.7.70) for A2×3 and A3×2 .
4. Prove (2.7.71) for real symmetric matrix A3×3 . For the invariance of
the index and the signature of A, try the following methods.
(1) A case-by-case examination. For example, try to prove that it is
impossible for any invertible real matrix P3×3 so that
1 0 0 −1 0 0
P 0 0 0 P ∗ = 0 0 0 .
0 0 0 0 0 0
(2) See Ex. <B> 3.
5. Prove Ex. <A> 7 of Sec. 2.7.5 for A3×3 .
6. For each of the following matrices A:
(1) Do problems as in Example 1.
(2) Find the generalized inverse A+ of A and explain it both alge-
braically and geometrically (one may refer to Exs. <B> 4 and 8 of
Sec. 3.7.3, (3.7.40), (3.7.41) and Sec. B.8 if necessary).
3 1 1 −1 0 −3 1 2 −1
(a) 2 4 2 . (b) 0 1 2 . (c) 2 4 −2 .
−1 −1 1 −1 −1 −5 3 6 −3
7. For each of the following matrices, refer to Example 2 and do the same
problems as in Ex. 6.
0 −2 3 0 3 5 0 −1 −4
(a) 0 1 −4 . (b) −1 2 4 . (c) 0 2 8 .
2 0 5 −1 −1 −1 0 1 4
8. For each of the following matrices, refer to Example 3 and do the same
problems as there.
0 1 1 2 1 1 2 3 0
(a) 1 0 1 . (b) 1 2 1 . (c) 3 5 −1 .
1 1 0 1 1 2 0 −1 2
−2 −2 0
(d) −2 −1 1 .
0 1 1
3.7 Linear Transformations (Operators) 471
9. Use (2.7.71) and Ex. 4 to determine the congruence among the following
matrices:
1 0 1 0 1 2 1 2 3
0 1 2 , 1 −1 3 and 2 4 5 .
1 2 1 2 3 4 3 5 6
If A and B are congruent, find an invertible matrix P3×3 so that
B = PAP ∗ .
10. Let
1 1 1
A= .
2 1 0
Do the same problems as in Example 4. One may also refer to Ex. <B>
of Sec. 2.7.5.
11. Model after (*19) and (*21) and try to derive A+ from (*11) in
Example 4.
12. Let
2 1
A = 1 0 .
1 1
Do the same problems as in Example 5.
<B>
1. Prove (3.7.40).
2. Prove (3.7.41).
3. Let A be a nonzero real symmetric matrix of order n. Suppose there
exist invertible matrices Pn×n and Qn×n so that
PAP ∗ and QAQ∗
are diagonal matrices. Let p and q be the number of positive diagonal
entries of PAP ∗ and QAQ∗ , respectively. Suppose that p < q. For
x i = Pi∗ , the ith row vector of P for 1 ≤ i ≤ n and
simplicity, let
y j = Qj∗ for 1 ≤ j ≤ n. Let the rank r(A) = r. Note that r ≥ q > p.
a contradiction.
Hence, p = q should hold. In fact, the above process can be further
simplified as follows. Rewrite x = (x1 , . . . , xn ) =
yP =
z Q where
y = (y1 , . . . , yn ) and z = (z1 , . . . , zn ) so that
x∗ =
x A y ∗ = y12 + · · · + yp2 − yp+1
y (PAP ∗ ) 2
− · · · − yr2 , and
∗ ∗ ∗
x A x = z (QAQ ) z = z12 + ··· + zq2 − 2
zq+1 − ··· − zr2 .
Then
yP =
zQ
⇔y = z QP −1
n
⇔ yj = bij zi for 1 ≤ j ≤ n.
i=1
zq+1 = 0
..
.
zn = 0
which has a nonzero solution z1∗ , . . . , zq∗ , zq+1
∗
= 0, . . . , zn∗ = 0 since
q > p. Corresponding to this set of solutions, the resulted y will induce
that
x Ax ∗ ≤ 0 while the resulted z will induce that x Ax ∗ > 0, a
contradiction. Hence, p = q should hold.
3.7 Linear Transformations (Operators) 473
4. Let
0 0 0 0 1 1 1
0 2 6 2 0 0 4
A=
0
.
1 3 1 1 0 1
0 1 3 1 2 1 2 4×7
I 0
P AQ = 3 .
0 0 4×7
0 c −b a
−c 0 a b
A=
b
−a 0 c
−a −b −c 0
7. Let
1 0 −1 2 1
−1 1 3 −1 0
A=
−2
.
1 4 −1 3
3 −1 −5 1 −6
(a) Prove that the rank r(A) = 3 by showing that A∗1 , A∗2 and A∗4 are
linearly independent and A∗3 = −A∗1 + 2A∗2 , A∗5 = −3A∗1 − A∗2 +
2A∗4 . Find α1 , α2 , α3 so that A4∗ = α1 A1∗ + α2 A2∗ + α3 A3∗ .
(b) Show that
(c) A matrix B4×4 has the property that BA = O4×5 if and only if
Im(B) ⊆ Ker(A). Find all such matrices B. Is it possible to have a
matrix B4×4 of rank greater than one so that BA = O? Why?
3.7 Linear Transformations (Operators) 475
5 4 −1
and find an invertible matrix P4×4 such that P A = R.
(Note Refer to Sec. B.5.)
9. Let A be any one of the following matrices
1 −1 0 0 1 0 0 0
1 −1 1 1
, 0 0 1 −1 and −1 1 0 0 .
1 0 2 1
1 −1 1 −1 0 2 1 1
Do problems as in Example 4.
10. Let A be any one of the following matrices
1 0 1 1 0 1 0 1
−1 1 1 1 0 0 −1 0
.
0 2 , 0 1 −1 and 0 0 −2
3 0 1 0 1 1 1 0
Do problems as in Example 5.
1. Let
S = {(x1 , x2 , x3 , x4 , x5 ) ∈ R5 | x1 − x2 + x3 − x4 + x5 = 0}
be a subspace of R5 .
(a) Determine dim(S).
(b) Show that (1, 1, 1, 1, 0) ∈ S and extend it to form a basis for S.
3. Let S be the set of solutions of the system of linear equations
3x1 − x2 + x3 − x4 + 2x5 = 0
x1 − x2 − x3 − 2x4 − x5 = 0.
⇔
xi A = λi
xi for 1 ≤ i ≤ 3
⇔
xi (A − λ i I3 ) = 0 for 1 ≤ i ≤ 3. (3.7.42)
Remind that λ1 , λ2 and λ3 are eigenvalues of A and x1 ,
x2 and
x3 are
associated eigenvectors of A, respectively. Note that
B = {
x1 , x3 }
x2 ,
is a basis for R3 , consisting entirely of eigenvectors.
For any vector x ∈ R3 ,
x = α1 x1 + α2
x2 + α3
x3 for some unique scalars
α1 , α2 and α3 . Hence
x (A − λ1 I3 )(A − λ2 I3 )(A − λ3 I3 )
x1 (A − λ1 I3 )](A − λ2 I3 )(A − λ3 I3 )
= α1 [
x2 (A − λ2 I3 )](A − λ1 I3 )(A − λ3 I3 )
+ α2 [
x3 (A − λ3 I3 )](A − λ1 I3 )(A − λ2 I3 )
+ α3 [
= α1 0 + α2 0 + α3 0 = 0 x ∈ R3
for all
⇒ (A − λ1 I3 )(A − λ2 I3 )(A − λ3 I3 ) = O3×3 . (3.7.43)
A direct matrix computation as
(A − λ1 I3 )(A − λ2 I3 )(A − λ3 I3 )
0 0
= P −1 λ2 − λ1
0 λ 3 − λ1
λ1 − λ2 0 λ1 − λ 3 0
P · P −1 0 P · P −1 λ2 − λ3 P
0 λ3 − λ2 0 0
−1
=P OP = O
will work too. This result is a special case of Cayley–Hamilton theorem
which states that A satisfies its characteristic polynomial
det(A − tI3 ) = −(t − λ1 )(t − λ2 )(t − λ3 ).
Eigenvalues λ1 , λ2 and λ3 may be not distinct.
478 The Three-Dimensional Real Vector Space R3
Case 1 λ1 = λ2 = λ3
If A3×3 has three distinct eigenvalues λ1 , λ2 and λ3 , then their respective
eigenvectors x1 ,
x2 and
x3 should be linearly independent. For
α1
x1 + α2
x2 + α3
x3 = 0
⇒ (apply both sides byA) α1 λ1
x1 + α2 λ2
x2 + α3 λ3
x3 = 0
⇒ (eliminating, say
x1 , from the above two relations)
α2 (λ1 − λ2 )
x2 + α3 (λ1 − λ3 )
x3 = 0
⇒ (by inductive assumption) α2 (λ1 − λ2 ) = α3 (λ1 − λ3 ) = 0
⇒ (because λ1 = λ2 = λ3 ) α2 = α3 = 0 and hence α1 = 0.
Eλi = {
x ∈ R3 | x } = Ker(A − λi I3 )
x A = λi
is of dimension one.
Case 2 λ1 = λ2 = λ3
(3.7.43) can be simplified as
(A − λ1 I3 )(A − λ3 I3 )
0 0 λ1 − λ 3 0
= P −1 0 P · P −1 λ1 − λ 3 P
0 λ3 − λ1 0 0
= P −1 OP = O. (3.7.44)
In this case,
x1 and x2 are eigenvectors associated to λ1 = λ2 and hence
dim(Eλ1 ) = 2 while dim(Eλ3 ) = 1.
λ3
Conversely, if A has three eigenvalues λ1 , λ2 and λ3 with λ1 = λ2 =
such that (A − λ1 I3 )(A − λ3 I3 ) = O holds. We claim that
Im(A − λ1 I3 ) ⊆ Ker(A − λ3 I3 )
⇒ (since A − λ1 I3 = O3×3 and dim(Eλ3 ) = 1)
dim Im(A − λ1 I3 ) = r(A − λ1 I3 ) = 1
⇒ dim Ker(A − λ1 I3 ) = dim Eλ1 = 3 − 1 = 2.
Hence dim(Eλ1 ) = 2.
Since R3 = Eλ1 ⊕ Eλ3 , A is definitely diagonalizable.
Case 3 λ1 = λ2 = λ3 , say λ
Then (3.7.43) is simplified as
0 0
(A − λI3 ) = P −1 0 P = P −1 OP = O (3.7.45)
0 0
⇒ A = λI3
Eλi = Ker(A − λi I3 ) = {
x ∈ R3 | x },
x A = λi i = 1, 2, 3
(1) λ1 = λ2 = λ3 .
a. The minimal polynomial is (t − λ1 )(t − λ2 )(t − λ3 ).
480 The Three-Dimensional Real Vector Space R3
A2i = Ai .
3. Ai Aj = O3×3 if i = j, 1 ≤ i, j ≤ 3.
4. I3 = A1 + A2 + A3 .
5. A = λ1 A1 + λ2 A2 + λ3 A3 .
See Fig. 3.31.
(2) λ1 = λ2 = λ3 and their algebraic multiplicities are equal to their respec-
tive geometric dimensions, i.e. dim(Eλ1 ) = 2 and dim(Eλ3 ) = 1.
a. The minimal polynomial is (t − λ1 )(t − λ3 ).
b. Let Eλ1 = x2 and Eλ3 =
x1 , x3 . Then B = {
x1 , x3 } is a
x2 ,
basis for R and
3
λ1 0 x1
[A]B = PAP = −1 λ1
, where P = x2 .
0 λ3 x3
Define
10 0 0
A1 = P −1 1 P and A3 = P −1 0 P.
0 0 0 1
3.7 Linear Transformations (Operators) 481
Then,
1. R3 = Eλ1 ⊕ Eλ3 .
2. Each Ai : R3 → R3 is a projection of R3 onto Eλi along Eλj for j = i,
i.e. A2i = Ai for i = 1, 3.
3. A1 A3 = A3 A1 = O3×3 .
4. I3 = A1 + A3 .
5. A = λ1 A1 + λ3 A3 .
See Fig. 3.31 for λ1 = λ2 .
(3) λ1 = λ2 = λ3 , say equal to λ and dim(Eλ ) = 3.
a. The minimal polynomial is t − λ.
b. For any basis B = {
x1 , x3 } for R3 ,
x2 ,
x1
A = [A]B = PAP −1 = λI3 , x2 .
where P =
x3
both have the same characteristic polynomial (t−1)4 and the same minimal
polynomial (t − 1)2 , but they are not similar to each other (Why? One may
prove this by contradiction or refer to Sec. 3.7.7).
482 The Three-Dimensional Real Vector Space R3
Example 1 Test if
0 −1 0 −1 4 2
A = 3 3 1 and B = −1 3 1
1 1 1 −1 2 2
are similar.
(2) Try to find a linear operator mapping the tetrahedron ∆ 0
a1
a2
a3 onto
the parallelogram a1 a2 . See Fig. 3.54(b).
Solution (1) There are six such possible linear operators. The simplest
one, say f1 , among them is the one that satisfies
a i ) = −
f1 ( ai for 1 ≤ i ≤ 3.
− a3 2e3
a1 a1
e3 e3
a2 e2 a2
e1 0 − a2 0 e2
e1
− a1 a3
a3
(a) (b)
Fig. 3.54
0 −1 0 a1 −1 1 1
⇒ [f2 ]N = P −1 −1 0 0 P, where P = a2 = 1 −1 1
0 0 −1
a3 1 1 −1
0 1 1 0 −1 0 −1 1 1 0 −1 0
1
= 1 0 1−1 0 0 1 −1 1 = −1 0 0
2
1 1 0 0 0 −1 1 1 −1 0 0 −1
0 −1 0 0 1 0
⇒ f2 ( x ) = x [f2 ]N = x −1
0 0 = − x 1 0 0 .
(*2)
0 0 −1 0 0 1
− f2 (
e 1 ) + f2 ( e 3 ) = −
e 2 ) + f2 ( e 1 ) − f2 (
a2 , f2 ( e 3 ) = −
e 2 ) + f2 ( a1 ,
f2 ( e 2 ) − f2 (
e 1 ) + f2 ( e 3 ) = −
a3
⇒ f2 (
e 1 ) + f2 ( e 3 ) = −(
e 2 ) + f2 ( a1 + a3 ) = −(1, 1, 1)
a2 +
e 1 ) = −
⇒ f2 ( e 2, e 2 ) = −
f2 ( e 1, e 3 ) = −
f2 ( e 3. (*3)
g(
a1 ) =
a1 ,
g(
a2 ) =
a2 ,
g(
a3 ) =
a1 +
a2 = 2
e 3.
3.7 Linear Transformations (Operators) 485
An
⇒ 1. det(A) = λ1 λ2 λ3 .
2. A is invertible ⇔ λ1 λ2 λ3 = 0. In this case,
−1
λ1 0
A−1 = P −1 λ−1
2
P.
0 λ−1
3
3. Hence
λn1 0
An = P −1 λn2 P.
0 λn3
4. tr(A) = λ1 + λ2 + λ3 .
486 The Three-Dimensional Real Vector Space R3
Example 3 Use
1 −6 4
A = −2 −4 5
−2 −6 7
to justify (3.7.49).
In Markov process (see Applications <D2 >), one of the main themes is
to compute
lim
x0 An ,
n→∞
where A is a regular stochastic matrix and
x0 is any initial probability
vector. We give such an example.
Example 4 Let
2 1 1
5 10 2
1 7 1
A= 5 10 10
.
1 1 3
5 5 5
By computation,
−3 36 −15
1
Q−1 =− −3 −44 5 .
60
−3 −16 5
Then
1 0 1 0 0
lim An = lim Q−1 ( 12 )n Q = Q−1 0 0 0 Q
n→∞ n→∞
0 ( 15 )n 0 0 0
0.25 0.35 0.40
= 0.25 0.35 0.40 .
0.25 0.35 0.40
1 1
p = (0.25, 0.35, 0.40) = v1 = (5, 7, 8),
5+7+8 20
which is the unique probability vector as an eigenvector associated to the
eigenvalue 1 of A.
For any probability vector
x0 = (α1 , α2 , α3 ),
p
lim x0 An =
x0 lim An = x0 p
n→∞ n→∞
p
= (0.25(α1 + α2 + α3 ), 0.35(α1 + α2 + α3 ),
0.40 · (α1 + α2 + α3 ))
= (0.25, 0.35, 0.40) =
p.
Exercises
<A>
<B>
and hence
% &
n
|λxk − akk xk | = |xk ||λ − akk | ≤ |xk | |aik | − |akk | .
i=1
(a) Suppose A is a positive matrix, i.e. each entry aij > 0 for
1 ≤ i, j ≤ n and λ is an eigenvalue of A such that |λ| = Ar
or Ac , then
λ = Ar or Ac
respectively. Also, show that Eλ = { x ∈ Cn | x A = λ x} =
(1, 1, . . . , 1).
(b) Furthermore, suppose A is a positive stochastic matrix (for defi-
nition, see Ex. <D3 >). Then any eigenvalue λ of A other than 1
satisfies |λ| < 1 and dim E1 = 1 where E1 = {
x ∈ Cn |
xA = x }.
4. Let A = [aij ] ∈ M(n; F).
(a) Show that the characteristic polynomial of A
det(A − tIn )
= (−1)n tn + an−1 tn−1 + · · · + ak tk + · · · + a1 t + a0
ai1 i1 · · · ai1 ik
n .. tn−k .
= (−1)tn + (−1)n−k ..
. .
k=1 1≤i1 <···<ik ≤n ai i · · · ai i
k 1 k k
(∆1 , ∆2 , . . . , ∆n )
* +
2 ··· n
if ∆1 = (A − λ0 In ) 2 · · · n = 0 and ∆i , 2 ≤ i ≤ n,
is obtained from ∆1 by replacing the (i − 1)st column by
−a12 , −a13 , . . . , −a1n .
x∗0
det( x0 − λIn ) = (−1)n [λn − ( x∗0 )λn−1 ],
x0
where x0 =
x0 , x∗0 . Try to find an invertible matrix Pn×n so that
x0
∗ −1
P ( x0 x0 − λIn )P = diag[ x0 , 0, . . . , 0].
x0 ,
7. Let
1 1 1 1
1 i −1 −i
A=
1 −1
.
1 −1
1 −i −1 i
8. Let
a0 a1 a2 a3
−a1 a0 −a3 a2
A=
−a2
.
a3 a0 −a1
−a3 −a2 a1 a0
and if k > n,
<D1 > Application (I): The limit processes and the matrices
Suppose aij (t) is a real or complex valued function defined on a set S in
the plane R2 or C for each 1 ≤ i ≤ m, 1 ≤ j ≤ n. Then
If t0 is a limit point of S and for each i, j, limt→t0 aij (t) = aij exists, then
6 6
b b
A(t) dt = aij (t) dt .
a a
m×n
(k)
Let A(k) = [aij ]m×n ∈ M(m, n; C) be a sequence of matrices of the
(k)
same order. Suppose for each i, j, limk→∞ aij = aij exists, then
det A
1 0
⇒ eA = P −1 e P.
2
0 e
Do the following problems.
3.7 Linear Transformations (Operators) 497
a b
1. (a) Let A = −b a , a, b ∈ R. Show that
at
e cos bt eat sin bt
etA = , t ∈ R.
−eat sin bt eat cos bt
(b) Let
0 a −b
A = −a 0 c , a, b, c ∈ R.
b −c 0
Show that there exists an invertible matrix P so that
ui
e 0 0
eA = P −1 0 e−ui 0 P,
0 0 1
√
where u = a2 + b2 + c2 .
2. Suppose An×n is diagonalizable and
λ1 0
PAP −1 = ..
. .
0 λn
Show that
eλ1 0
..
eA = P −1 . P.
0 eλn
For eA where A has Jordan canonical form, see Ex. <C> 13(a), (d) of
Sec. 3.7.7.
3. Give examples to show that, in general,
(1) eA · eB = eB · eA .
(2) eA+B = eA · eB .
Suppose A, B ∈ M(n; C). Show that
et(A+B) = etA etB , t∈C
if and only if AB = BA. Hence, in this case,
eA+B = eA eB = eB eA .
4. Prove the following.
(1) det eA = etr(A) .
498 The Three-Dimensional Real Vector Space R3
5. Let A ∈ M(n; C) and its spectral radius ρ(A) < 1 (see Ex. <D1 > 6).
Show that
(a) In − A is invertible, and
∞
(b) (In − A)−1 = m=0 Am (Note that A0 = In ).
In case A is a nilpotent matrix of index k, then (In − A)−1 = In +
A + · · · + Ak−1 .
n
6. (a) Let A1 = i,j=1 |aij | where A = [aij ] ∈ M(n; C). Show that (see
also Ex. <C> 14 of Sec. 3.7.7)
(1) A1 ≥ 0, and = 0 ⇔ A = O.
(2) αA1 = |α|A1 , α ∈ C.
(3) A + B1 ≤ A1 + B1 .
(4) AB1 ≤ A1 B1 .
(5) x A| ≤ |
| x |1 A1 , where
x = (x1 , . . . , xn ) ∈ Cn and |
x |1 =
n 1
k=1 |xk |.
∞
7. (Weyr, 1887) Let A ∈ M(n; C). Suppose the power series m=0 am z m
−1
has positive radius r of convergence, where r = limm→∞ m |am | .
Show that
∞
(1) if the spectral radius ρ(A) < r,then 0 am Am converges absolutely
∞
namely, 0 |am |Am 1 < ∞ ; and
∞
(2) if ρ(A) > r, 0 am Am diverges (i.e. does not converge).
In particular, in case r = +∞, then for any A ∈ M(n; C), the power
series
∞
am Am
m=0
∞
always converges absolutely. Suppose ϕ(A) = m=0 am Am , ρ(A) < r.
(a) Let λ1 , . . . , λn be eigenvalues of A. Then ϕ(λ1 ), . . . , ϕ(λn ) are eigen-
values of ϕ(A).
(b) Suppose A is diagonalizable. Then A and ϕ(A) are simultaneously
diagonalizable.
(c) Let
1
1
A= 2 1 .
0 2
Show that
−k 2k k · 2k+1
(I2 − A) = , k ≥ 1.
0 2k
(d) Let
0 a −b
A = −a 0 c , a, b, c ∈ R.
b −c 0
Show that
cosh u − 1 2 sinh u
cos A = I3 − A , sin A = A
u2 u
where u2 = a2 + b2 + c2 .
8. Let A ∈ M(n; C) be invertible. If there exists a matrix X ∈ M(n; C) so
that
eX = A,
500 The Three-Dimensional Real Vector Space R3
X = log A.
then
log λ1 0
..
log A = P −1 . P.
0 log λn
(b) Suppose
λ 0
λ
A= @@ .. , λ = 0,
@ .
@ λ
(c) Suppose
A1 0
A2
A = P −1 .. P,
.
0 Ak
3.7 Linear Transformations (Operators) 501
(f) Let
λ 0
A = 1 λ , λ = 0.
0 1 λ
Show that
1 0 log λ 0
A λ 1
e =e 1 1 and log A = λ log λ .
1
2 1 1 − 2λ1 2 1
log λ
λ
n
aij = 1 for 1 ≤ i ≤ n, i.e. each row is a probability vector,
j=1
x = (x1 , . . . , xn ) ∈ Cn , let
For
n
x |1 =
| |xi |,
i=1
Then, just like Ex. <D2 > 6(a), M(n; C) is a Banach algebra with the
norm 1 . A vector x = (x1 , . . . , xn ) in Cn is called positive (or non-
negative) if xi > 0 (or xi ≥ 0) for 1 ≤ i ≤ n and is denoted as
x> 0 x ≥ 0 ).
(or
Hence, a probability vector is a nonzero non-negative vector. Define
x > x ≥
y (or y) ⇔
x −
y > 0 (or ≥ 0 ).
504 The Three-Dimensional Real Vector Space R3
limit vector
k
xA
lim = p ∗0
x D−1 ( x D−1
e0 )D = ( p ∗0 )
x0
k→∞ (λ(A))k
(Note For more information, refer to Chung [38, 39] and Doob [40] or
simpler Kemeny and Snell [41].)
dx
= ax
dt
dx1
= λ1 x1 (t),
dt
dx2
= λ2 x2 (t),
dt
with initial conditions x1 (0) = α10 , x2 (0) = α20
x (t) =
x0 etA .
506 The Three-Dimensional Real Vector Space R3
where {
v1 , v n}
v2 , . . . ,
is the fundamental system of the solution space.
In case A is not diagonalizable, the Jordan canonical form of A is needed
(see Ex. <D> of Sec. 3.7.7).
2. For each of the following systems of differential equations, do the fol-
lowing problems.
(1) Find the general solutions, the fundamental system of solutions and
the dimension of the solution space.
3.7 Linear Transformations (Operators) 507
dn x dn−1 x dx
n
+ an−1 n−1 + · · · + a1 + a0 x = 0
dt dt dt
is equivalent to the homogeneous linear differential system
0 ··· ··· 0 −a0
1 0 ··· 0 −a1
d
x .. . . .. .. ..
=
x A, where A =
. . . . .
dt
0 . ..
.. . 0 −an−2
0 0 ··· 1 −an−1
dn−1 x
and
x = (x1 , . . . , xn ) with x1 = x, x2 = dx
dt , . . . , xn = dtn−1 .
x (t) =
αetA
x (t) =
x0 etA .
In fact, let v i (t) = (bi1 (t), bi2 (t), . . . , bin (t)) be the ith row vec-
tor of etA , then { v1 (t), . . . ,
v n (t)} forms a fundamental system of
solutions and
n
x (t) = αi
v i (t), where
α = (α1 , . . . , αn ).
i=1
510 The Three-Dimensional Real Vector Space R3
d2
show that the 2nd-order system of differential equations dt2
x
=
x A in
Ex. 8 can be expressed as a 1st-order system
0 0 1 5
dy 0 0 2 4
= y (t)
1 0 0 0 .
dt
0 1 0 0
Use method in Ex. 1 to the above equation to solve the original equa-
tion. Then, try to use this method to solve equations in Ex. 9.
(Note For further readings, see Boyce and Diprema [33] and Farlow [34]
concerning dynamical systems.)
Eλi = {
x ∈ R3 | x } = Ker(A − λi I3 ),
x A = λi i = 1, 2, 3
(A − λ1 I3 )2 (A − λ3 I3 ) = O. (*1 )
Then Eλ1 = Eλ2 and dim Eλ3 = 1 (Why? One might refer to Case 2 in
Sec. 3.7.6).
Just like the proof shown in Case 1 of Sec. 3.7.6, it follows easily that
Eλ1 ∩ Eλ3 = { 0 }.
In case dim Eλ1 = 2, then dim(Eλ1 + Eλ3 ) = 3 shows that R3 = Eλ1 ⊕
Eλ3 . Via Case 2 in Sec 3.7.6, it follows that A is diagonalizable and thus,
(A − λ1 I3 )(A − λ3 I3 ) = O, contradicting to our assumption.
Hence dim Eλ1 = 1. As a byproduct,
R3 = Eλ1 ⊕ Eλ3
so A is not diagonalizable.
Hence, we introduce the generalized eigenspace
Gλ1 = {
x ∈ R3 |
x (A − λ1 I3 )2 = 0 } = Ker((A − λ1 I3 )2 )
We claim that dim Gλ1 = 2. To see this, notice that (*1) holds if and
only if
Im(A − λ1 I3 )2 ⊆ Ker(A − λ3 I3 )
⇒ (since (A − λ1 I3 )2 = O3×3 and dim Eλ3 = 1) r((A − λ1 I3 )2 ) = 1
⇒ dim Eλ1 = 3 − r((A − λ1 I3 )2 ) = 3 − 1 = 2.
α1
x1 + α3
x3 = 0
R3 = Gλ1 ⊕ Eλ3 .
Take a vector v2 ∈ Gλ1 so that
v2 (A − λ1 I3 ) = 0 . Then v2 (A − λ1 I3 )
v1 =
is an eigenvector of A associated to λ1 , i.e. v2 ∈ Eλ1 . Then {
v2 } is
v1 ,
linearly independent and hence forms a basis for Gλ1 . Take any basis { v3 }
514 The Three-Dimensional Real Vector Space R3
v1 A = λ1
v1 ,
v2 A v2 (A − λ1 I3 ) + λ1
= v2 =
v 1 + λ1
v2 ,
v3 A = λ3
v3
v1 λ1 0 0 v1
⇒ v 2 A = 1 λ1 0
v2
v3 0 0 λ3 v3
λ1 0 0 v1
⇒ PAP −1 = 1 λ1 0 , where P =
v2 .
0 0 λ3 v3
v1 λ 0 0 v1
v2 A = 1 λ 0
⇒ v2
v3 0 0 λ v3
λ 0 0 v1
v2 .
⇒ PAP −1 = 1 λ 0 , where P =
0 0 λ v3
v1 λ 0 0 v1
⇒ v2 A = 1 λ 0
v2
v3 0 1 λ v3
λ 0 0 v1
⇒ A = P −1 1 λ 0 P, where P =
v2 .
0 1 λ v3
We summarize as (refer to (3.7.47))
The Jordan canonical form of a nonzero real matrix A3×3
Suppose the characteristic polynomial of A is
det(A − tI3 ) = −(t − λ1 )(t − λ2 )(t − λ3 )
where λ1 , λ2 and λ3 are real numbers and at least two of them are coinci-
dent. Denote by Eλi = Ker(A − λi I3 ) the eigenspace for i = 1, 2, 3.
(1) λ1 = λ2 = λ3 . Let
Gλ1 = Ker(A − λ1 I3 )2
be the generalized eigenspace associated to λ1 . Then
1. (A − λ1 I3 )(A − λ3 I3 ) = O but (A − λ1 I3 )2 (A − λ3 I3 ) = O.
⇔ 2. dim Gλ1 = 2 = the algebraic multiplicity of λ1 ,
dim Eλ1 = dim Eλ3 = 1 and Gλ1 ∩ Eλ3 = { 0 }.
⇔ 3. R3 = Gλ1 ⊕ Eλ3
v2 ∈ Gλ1 be such that
In this case, let v1 = v2 (A − λ1 I3 ) = 0 and
v3 ∈ Eλ3 a nonzero vector. Then B = { v1 , v2 , v3 } is a basis for R3 and
λ1 0 0 v1
[A]B = PAP −1 = 1 λ1 0 , where P = v2 .
0 0 λ3 v3
(2) λ1 = λ2 = λ3 = λ. Then
1. A − λI3 = O but (A − λI3 )2 = O.
⇔ 2. dim Gλ = 3, where Gλ = Ker(A − λI3 )2 , and dim Eλ = 2.
In this case, take v2 ∈ Gλ so that v2 (A − λI3 ) ∈ Eλ is a
v1 =
nonzero vector and v3 ∈ Eλ which is linearly independent of
v1 . Then
B = {v1 , v3 } is a basis for R3 and
v2 ,
λ 0 0 v1
[A]B = PAP −1 = 1 λ 0 , where P = v2 .
0 0 λ v3
3.7 Linear Transformations (Operators) 517
(3) λ1 = λ2 = λ3 = λ. Then
v3 ∈ Gλ so that
In this case, take v3 (A − λI3 ) =
v2 = 0 and v1 =
v3 (A − λI) = 0 which is in Eλ . Then B = { v1 , v2 , v3 } is a basis for R3
2
and
λ 0 0 v1
[A]B = PAP −1 = 1 λ 0 , v2 .
where P = (3.7.50)
0 1 λ v3
(3.7.48) indicates that (2) ⇒ (1) is no more true for square matrices of
order n ≥ 4, but (1) ⇔ (3) ⇒ (2) still holds.
As an example, we continue Example 1 in Sec. 3.7.6 as
0 −1 0 −1 4 2 2 0 0
A = 3 3 1 , B = −1 3 1 and C = 0 1 1
1 1 1 −1 2 2 0 0 1
To find an invertible P3×3 such that PAP −1 = C, let us find R3×3 and
Q3×3 so that RAR −1 = JA and QCQ−1 = JC and hence
RAR −1 = QCQ−1
⇒ PAP −1 = C, where P = Q−1 R.
Now,
−1 −1 0
A − I3 = 3 2 1
1 1 0
−2 −1 −1
⇒ (A − I3 )2 = 4 2 2 with rank r((A − I3 )2 ) = 1.
2 1 1
x (A−I3 )2 = 0 and we get G1 = Ker((A−I3 )2 ) = (1, 0, 1), (2, 1, 0).
Solve
Take v2 = (2, 1, 0), then
−1 −1 0
v2 (A − I3 ) = (2 1 0) 3
v1 = 2 1 = (1 0 1) ∈ E1 .
1 1 0
v3 = (2, 1, 1) ∈ E2 = Ker(A − 2I3 ). Then
Take
1 0 1
RAR −1 = JA , where R = 2 1 0 .
2 1 1
3.7 Linear Transformations (Operators) 519
Notice that the above way to compute A−1 is not necessarily the best one.
As a summary, one can try the following methods:
1. Direct computation by using the adjoint matrix adjA (see (3.3.2) and
Sec. B.6);
2. Elementary row operations (see Secs. 3.7.5 and B.5);
3. Caylay–Hamilton theorem (see, for example, (3.7.28)),
to compute A−1 . (3.7.52)
520 The Three-Dimensional Real Vector Space R3
Hence,
1 0 0
n
JA = n 1 0 for n = ±1, ±2, . . ..
0 0 2n
⇒ An = (R−1 JA R)n = R−1 JA
n
R
1 1 −1 1 0 0 1 0 1
= −2 −1 2 n 1 0 2 1 0
0 −1 1 0 0 2n 2 1 1
3+n−2 n+1
1−2 n
1 + n − 2n
= −4 − n + 2n+2 −1 + 2n+1 −2 − n + 2n+1 for n = ±1, ±2, . . ..
−2 − n + 2n+1 −1 + 2n −n + 2n
3.7 Linear Transformations (Operators) 521
d
x
=xA
dt
d
x −1
⇔ R = x AR−1 = ( x R−1 )(RAR −1 )
dt
d
y
⇔ =y JA , where
y =x R−1 .
dt
Hence, according to Ex. <D4 > 1 of Sec. 3.7.6, the solution is
y (t) =
αetJA , or
tA tJA
x (t) = c e = αe R, where c =αR.
t
e 0 0 1 0 1
= (α1 α2 α3 ) tet et 0 2 1 0
0 0 e2t 2 1 1
t
e 0 et
= (α1 α2 α3 ) (2 + t)et et tet
2e2t e2t e2t
⇒
x1 (t) = [α1 + (2 + t)α2 ]et + 2α3 e2t ,
x2 (t) = α2 et + α3 e2t ,
x3 (t) = (α1 + α2 t)et + α3 e2t ,
1
I2 + N
2
which does satisfy
2
1 1
I2 + N = I2 + N + N 2 = I2 + N = D.
2 4
Therefore, we define the matrix
1 0 0
I + 1N 0
√ = 1
B̃ = 2 2 2 1 √0 .
0 2
0 0 2
Then B̃ 2 = JA .
3.7 Linear Transformations (Operators) 523
Consequently, define
1 1 −1 1 0 0 1 0 1
B = R−1 B̃R = −2 −1 2 12 1 √0 2 1 0
0 −1 1 0 0 2 2 1 1
7 √ √ √
2 −2 2 1− 2 2 −
3
2
9 √ √ √
= − 2 + 4 2 −1 + 2 2 − 2 + 2 2 .
5
√ √ √
− 52 + 2 2 −1 + 2 − 12 + 2
Then B 2 = A holds.
It is obvious that N12 = N2 .
Suppose N1 has a real square root S, i.e. S 2 = N1 . Then S, as a complex
matrix, has three complex eigenvalues which are equal to zero. Therefore S,
as a real matrix, is similar to
0 0 0 0 0 0
N3 = 1 0 0 or N1 = 1 0 0 .
0 0 0 0 1 0
Let P be an invertible matrix so that
P SP −1 = N3
⇒ P S 2 P −1 = P N1 P −1 = N32 = O.
which leads to N1 = O, a contradiction. Similarly,
P SP −1 = N1
⇒ P S 2 P −1 = P N1 P −1 = N12 = N2
⇒ P N12 P −1 = P N2 P −1 = N22 = O
which leads to N2 = O, a contradiction. Hence, N1 does not have any real
square root.
Now
1 1 0
∗
JA = 0 1 0
0 0 2
0 1 0
∗ −1
⇒ SJA S = JA , where S = 1 0 0 = E(1)(2) = S −1 .
0 0 1
Therefore,
A∗ = R∗ S −1 (SJA
∗ −1
S )S(R∗ )−1
= R∗ S −1 JA S(R∗ )−1
= R∗ SRAR −1 S(R∗ )−1 = PAP −1 ,
where
1 2 2 0 1 0 1 0 1 8 3 4
∗
P = R SR = 0 1 1 1 0 0
2 1 0 = 3 1 2
1 0 1 0 0 1 2 1 1 4 2 1
is an invertible symmetric matrix. 2
For J1 and J2 :
e1 A e1 →
= A1∗ = 2 e1 (A − 2I8 ) =
e3 (A − 2I8 )3 = 0 ,
e2 A = A2∗ = e2 →
e1 + 2 e2 (A − 2I8 ) = e3 (A − 2I8 )2 ,
e1 =
e3 A = A3∗ = e3 →
e2 + 2 e3 (A − 2I8 ) =
e2 ,
e4 A e4 →
= A4∗ = 2 e4 (A − 2I8 ) = 0 .
{
v1 , v3 }
v2 ,
5. Combing 3 and 4,
B2 = {
v1 ,
v2 , v4 }
v3 ,
Since
v1 A v1 (A − 2I8 ) + 2
= v1 = 2
v1 ,
v2 A v2 (A − 2I8 ) + 2
= v2 =
v1 + 2
v2 ,
v3 A v3 (A − 2I8 ) + 2
= v3 =
v2 + 2
v3 ,
v4 A = 2
v4
..
J1 .
..
⇒ [A | G2 ]B2 = · · · . · · · .
..
. J2
For J3 :
e5 A = A5∗ = −
e5 →
e5 (A + I8 ) =
e6 (A + I8 )2 = 0 ,
e6 A e5 −
= A6∗ = e6 →
e6 (A + I8 ) =
e5
and the ranks
r(A + I8 ) = 7,
r((A + I8 )k ) = 6, for k ≥ 2.
These imply the following:
1. G−1 = Ker((A + I8 )2 ) = e6 is an invariant subspace of dimen-
e5 ,
sion 2, the algebraic multiplicity of −1 as an eigenvalue of A.
2. E−1 = Ker(A + I8 ) has the dimension
dim E−1 = dim R8 − r(A + I8 ) = 8 − 7 = 1.
Thus, since E−1 ⊆ G−1 , there exists a basis for G−1 containing exactly
one eigenvector associated to −1.
3. Select a vector v6 satisfying
v6 (A + I8 )2 = 0 but v6 (A + I8 ) = 0 .
v5 =
Then v5 is an eigenvector in E−1 .
528 The Three-Dimensional Real Vector Space R3
4. Now,
B−1 = { v6 }
v5 ,
Since
v5 A v5 (A + I8 ) −
= v5 = −
v6 ,
v6 A v6 (A + I8 ) −
= v5 −
v6 = v6
⇒ [A | G−1 ]B−1 = [J3 ]
For J4 : similarly,
[A | G0 ]B0 = [J4 ]
Putting together,
B = B2 ∪ B−1 ∪ B0 = {
v1 ,
v2 ,
v3 ,
v4 ,
v5 ,
v6 , v8 }
v7 ,
where
v1
..
P = . .
v8 8×8
2
3.7 Linear Transformations (Operators) 529
Example 7 Find a Jordan canonical basis and the Jordan canonical form
for the matrix
7 1 2 2
1 4 −1 −1
A= −2 1
.
5 −1
1 1 2 8
0 3 3 3
r((A − 6I4 )k ) = 0 for k ≥ 3 (refer to (3.7.38)).
Therefore, there exists a basis B = {
v1 ,
v2 , v4 } for R4 so that
v3 ,
v1 ; v4 ← · · · total number = dim E6 = 4 − 2 = 2,
v2 ← · · · total number = r(A − 6I4 ) − r((A − 6I4 )2 ) = 1,
v3 ← · · · total number = r((A − 6I4 )2 ) − r((A − 6I4 )3 ) = 1.
Thus,
v1 ,
v2 , v4 = Ker((A − 6I4 )3 ) and
v3 , v4 = Ker(A − 6I4 ).
v1 ,
We want to choose such a basis B as a Jordan canonical basis of A.
To find a basis for E6 = Ker(A − 6I8 ):
x (A − 6I4 ) = 0
⇒ x1 + x2 − 2x3 + x4 = 0
x1 − 2x2 + 2x3 + x4 = 0
⇒ x2 = x3 , x1 = x3 − x4
⇒ x = (x3 − x4 , x3 , x3 , x4 ) = x3 (1, 1, 1, 0) + x4 (−1, 0, 0, 1).
Let
v3 =
e1 . Define
v2 v3 (A − 6I4 ) =
= e1 (A − 6I4 ) = (1, 1, 2, 2),
v1 v2 (A − 6I4 ) = (0, 3, 3, 3).
=
..
6 0 0 .
..
1 6 0 . 0 3 3 3
1
1 2 2
[A]B = PAP −1 = 0 ..
, where P =
1
.
1 6 . 0 0 0
..
· · · ··· ··· . · · · 1 1 1 0
..
. 6
x (A − 6I4 ) =
v1
⇒ x1 + x2 − 2x3 + x4 = 0
x1 − 2x2 + x3 + x4 = 1
1
⇒
v2 = (1, 1, 2, 2).
3
Again, solve
x (A − 6I4 ) =
v2
1
⇒
v3 = (1, 0, 0, 0).
3
Then {v1 ,
v2 , v4 }, where
v3 , v4 = (1, 1, 1, 0) or (−1, 0, 0, 1), is also a Jordan
canonical basis of A. 2
3.7 Linear Transformations (Operators) 531
Example 8 Find a Jordan canonical basis and the Jordan canonical form
for the matrix
2 0 0 0 0 0
1 0
2 0 0 0
−1 1 2 0 0 0
A= .
0 0 0 2 0 0
0 0 0 1 2 0
0 0 0 0 1 4 6×6
d
x
=
x A.
dt
These provide information about the number and the sizes of Jordan blocks
in the Jordan canonical form:
v1 ; v4 ← · · · 2 = dim R6 − r(A − 2I6 ),
v2 ; v5 ← · · · 2 = r(A − 4I6 ) − r((A − 2I6 )2 ),
v3 ← · · · 1 = r((A − 4I6 )2 ) − r((A − 2I6 )3 ).
⇒ x2 − x3 = 0, x3 = 0, x5 = x6 = 0
⇒ x = (x1 , 0, 0, x4 , 0, 0) = x1
e1 + x4
e4 .
Therefore, E2 = e4 . Next, solve
e1 ,
x (A − 2I6 )3 = 0
but x (A − 2I4 )2 = 0
Therefore, {
v1 , v3 } and {
v2 , v5 } are the required bases so that
v4 ,
Finally, solve
x (A − 4I6 ) = 0
⇒
v6 = (0, 0, 0, 1, 2, 4).
Combing together,
B = { v2 ,
v1 , v3 ,
v4 , v6 }
v5 ,
A = P −1 JA P
⇒ (see Ex.<D2 > 4 of Sec. 3.7.6)
etA = P −1 etJA P.
1 1
2! 1! 1
Similarly,
tJ2 2t 1 0 2 0
e =e 1 , where J2 = , and
1! 1 1 2
et[4] = e4t [1]1×1 .
Therefore,
2t
..
e 0 0 .
..
e2t e2t
0 .
..
1 e2t 2t
e2t
e .
2
e tJ1
..
= ··· ··· ··· . ··· ··· ···
etJA
= etJ2 .
.. .
et[4] . e2t 0 ..
.. .
. e2t e2t ..
.
··· ··· · · · .. · · ·
.. 4t
. e 6×6
3.7 Linear Transformations (Operators) 535
d
The solution of x
dt =
x A is
x (t) =
αetA , α ∈ R6 is any constanst vector.
where
Remark
The method in Example 6 to determine the Jordan canonical form of a real
matrix An×n for n ≥ 2, once the characteristic polynomial splitting into
linear factors as
Exercises
<A>
0 4 2 2 0 0 −1 1 0
(d) −3 8 3 . (e) 2 2 0 . (f) 0 −1 1 .
4 −8 −2 −2 1 2 0 0 −1
−3 3 −2 1 0 1 2 −1 −1
(g) −7 6 −3 . (h) 1 0 2 . (i) 2 −1 −2 .
1 −1 2 −1 0 3 −1 1 2
5 −3 2 1 −3 4 4 6 −15
(j) 6 −4 4 . (k) 4 −7 8 . (l) 3 4 −12 .
4 −4 5 6 −7 7 2 3 −8
<B>
3 0 0 0 1 0 0 0
1 3 0 0 1 1 0 0
(a)
0 1 3 0 . (b)
0 1 2 0 .
−1 1 0 3 −1 0 1 2
0 −2 −2 −2 7 1 −2 1
−3 1 1 −3 1 4 1 1
(c)
1 −1 −1
. (d) .
1 2 −1 5 2
2 2 2 4 2 −1 −1 8
2 −2 −2 −2 2
2 0 0 0 −4
−1 0 −2 −6 1
3 1 −1
(e)
0 −1 1
.
(f) 2 1 3 3 0
.
0 2 3 3 7 2
1 0 0 3
0 0 0 0 5 5×5
1 0 0 0 0 0 5 1 1 0 0 0
0 0 0 5 1 0 0 0
1 0 0 0
0 1 1 0 1 0 0 0 5 0 0 0
(g) . (h) .
0 1 0 1 0 0 0 0 0 5 1 −1
1 0 0 0 1 0 0 0 0 0 5 1
0 −1 0 0 −1 1 6×6 0 0 0 0 0 5 6×6
3.7 Linear Transformations (Operators) 537
−1 0 2 −2 0 0 −1
0 0 0 0 −1
1 1
−1 2 −1 0 0 0
0
(i) 1 0 −1 2 0 0 1 .
1 0 −1 1 1 0 2
3 0 −6 3 0 1 4
0 0 0 0 0 0 1 7×7
1. Find the Jordan canonical form for each of the following matrices.
1 −1 O 0 1 O
1 −1 0 1
.. .. .. ..
(a) . . . (b) . . .
1 −1 0 1
O 1 n×n
O 0 n×n
0 1 0 ··· 0 0
0 0 1 · · · 0 0 0 0 ··· 0 a1
. . . ··· 0
. . . .. .. 0 0 a2
. . . . . ..
(e) . . . .. . (f) ... ..
.
..
. . .
. . . ..
. . . . . 0 ··· 0
an−1 0
0 0 0 · · · 0 1 an 0 ··· 0 0 n×n
1 0 0 ··· 0 0 n×n
α0 α1 α2 ··· αn−2 αn−1
αn−1 α α1 ··· αn−3 αn−2
0
αn−2 αn−1 α0 ··· αn−4 αn−3
(g) , α0 , α1 , . . . , αn−1 ∈R (or C).
. .. .. .. ..
.. . . . .
α1 α2 α3 ··· αn−1 α0
a0 a1 a2 ··· an−1
µan−1 a0 a1 ··· an−2
µan−2 µan−1 a0 ··· an−3
(h) , a0 , a1 , . . . , an−1 , µ ∈ R (or C).
. .. .. ..
.. . . .
µa1 µa2 µa3 ··· a0
c0 c1 c2 c3 c4
c1 c2 + c1 a c3 + c2 a c4 + c3 a c0 + c4 a
(i)
c2 c3 + c2 a c4 + c3 a c0 + c4 a c1 ,
c3 c4 + c3 a c0 + c4 a c1 c2
c4 c0 + c4 a c1 c2 c3
where c0 , c1 , c2 , c3 , c4 , a ∈ R (or C). Try to extend to a matrix
of order n.
(j) Pn×n is a permutation matrix of order n (see (2.7.67)), namely,
for a permutation σ: {1, 2, . . . , n} → {1, 2, . . . , n},
eσ(1)
..
P = . .
eσ(n)
2. For each of the following linear operators f , do the following problems.
(1) Compute the matrix representation [f ]B , where B is the given
basis.
(2) Find a Jordan canonical basis and the Jordan canonical form
of [f ]B .
3.7 Linear Transformations (Operators) 539
(3) Find the corresponding Jordan canonical basis for the original
operator f .
1 2
(d) f : M(2; R) → M(2; R) defined by f (A) = A, where B =
2 1
{E11 , E12 , E21 , E22 } is the basis for M(2; R). What happens
natural
1 2 1 2 1 2
if 2 1 is replaced by 0 1 or −2 1 ?
(e) Let V = ex , xex , x2 ex , e−x , e−2x be the vector subspace of
C[a, b], generated by B = {x, xex , x2 ex , e−x , e−2x }. f : V → V
defined by f (p) = p .
(f) Let V be the subspace, generated by B = {1, x, y, x2 , y 2 , xy}, of
the vector space P (x, y) = {polynomail functions over R in two
variables x and y}. f : V → V defined by f (p) = ∂y
∂p
.
tr(f
) = 0.
satisfies tr[f
] = 0 for ≥ 1 but it is not nilpotent. See also
Ex. <C> 11 of Sec. 2.7.6.)
Ker(Ak−1 ) =
x1 A, . . . ,
xrk A, xrk−1 ⊕ Ker(Ak−2 ).
xrk +1 , . . . ,
(3) Denote
(i)
Bj = {
xj Ai−1 ,
xj Ai−2 , . . . , xj },
xj A,
1 ≤ i ≤ k, ri+1 + 1 ≤ j ≤ ri with rk+1 = 0;
(i) (i)
Wj = the invariant subspace generated by Bj .
(k) (k) (k−1) (1) (1)
Then B = B1 ∪ · · · ∪ Brk ∪ Brk +1 ∪ · · · ∪ Br2 +1 ∪ · · · ∪ Br1
and
[A]B = PAP −1
(k)
A1
..
. 0
(k)
Ark
(k−1)
Ark +1
..
= .
(2)
Ar 2
(1)
Ar2 +1
..
0 .
(1)
Ar 1 n×n
where
0 0
1 0
1 0
= ,
(i) (i)
Aj = A | Wj (i) ..
Bj .
0
0 1 0 i×i
2 ≤ i ≤ k, ri+1 + 1 ≤ j ≤ ri ;
(1)
Aj = [0]1×1 , r2 + 1 ≤ j ≤ r1 .
Call [A]B a (Jordan) canonical form of the nilpotent operator
of f or A.
542 The Three-Dimensional Real Vector Space R3
g2 = f
ri = t2i−1 + t2i , 1 ≤ i ≤ k;
t1 ≥ t2 ≥ · · · ≥ t2k ≥ 0 and t1 + t2 + · · · + t2k = n.
ri − 1 ri+1 + 1
≥ t2i ≥ t2i+1 ≥ ⇒ ri ≥ ri+1 + 2.
2 2
Conversely, suppose these inequalities hold. If ri is even, choose
t2i = t2i+1 = r2i ; if ri is odd, choose t2i = ri2+1 and t2i+1 =
ri −1
2 .)
(e) Let A ∈ M(n; C) be nilpotent of index n. Show that A does not
have square root (see Example 3). What happens to a nilpotent
matrix An×n of index n − 1?
(f) Suppose An×n and Bn×n are nilpotent of the same index k, where
k = n or n − 1 and n ≥ 3. Show that A and B are similar. What
happens if 1 ≤ k ≤ n − 2?
(g) Show that a nonzero nilpotent matrix is not diagonalizable.
(h) Find all canonical forms of a nilpotent matrix An×n , where
n = 3, 4, 5 and 6.
4. Let A = [aij ] ∈ M(n; C) be an invertible complex matrix.
(a) Show that A has a square root, i.e. there exists a matrix B ∈
M(n; C) so that B 2 = A (see Example 3).
(b) Let p ≥ 2 be a positive integer, then there exists an invertible
matrix B ∈ M(n; C) so that B p = A.
3.7 Linear Transformations (Operators) 543
for 1 ≤ i ≤ k.
2 ri
(2) For any ε > 0, let Ei = diag[ε, ε , . . . , ε ] be a diagonal matrix of
order ri . Then
λi 0
ε λi
Ei Ji Ei−1 = .. for 1 ≤ i ≤ k.
.
0 ε λi r ×r
i i
Then
Ii O
0 1
P Ri P −1 = Q−1 Qi for 1 ≤ i ≤ r.
i
1 0
O In−i−2
10. A matrix A6×6 has the characteristic and the minimal polynomial
p(t) = (t − 1)4 (t + 2)2 ,
m(t) = (t − 1)2 (t + 2)
respectively. Find the Jordan canonical form of A.
11. Suppose a matrix A8×8 has the characteristic polynomial
(t + 1)2 (t − 1)4 (t − 2)2 .
Find all possible Jordan canonical forms for such A and compute the
minimal polynomial for each case.
12. Suppose a complex matrix An×n has all its eigenvalues equal to 1, then
any power matrix Ak (k ≥ 1) is similar to A itself. What happens if
k ≤ −1.
13. Let
λ
1 λ
1 λ
J =
. .
.. ..
1 λ n×n
and p(x) be any polynomial in P (R). Show that
p(λ) 0 ··· ··· 0 0
p (λ)
p(λ) ··· ··· 0 0
1!
p (λ)
p (λ)
··· ··· 0 0
2! 1!
p(J) = .
.. .. .. ..
. . . .
p(n−2) (λ) p(n−3) (λ)
· · · · · · p(λ) 0
(n−2)! (n−3)!
p(n−1) (λ) p(n−2) (λ)
(λ)
(n−1)! (n−2)! · · · · · · p 1! p(λ)
n×n
(a)
1 0 ··· ··· 0 0
1 1 ··· ··· 0 0
1!
1 1
··· ··· 0 0
2! 1!
J λ
e =e . .. . What is etJ ?
.. .
1 1
··· ··· 1 0
(n−2)! (n−3)!
1
(n−1)!
1
(n−2)! ··· ··· 1
1! 1
(b) limk→∞ J k exists (see Ex. <D1 > of Sec. 3.7.6) if and only if one of
the following holds:
(1) |λ| < 1.
(2) λ = 1 and n = 1.
(c) limk→∞ J k = On×n if |λ| < 1 holds and limk→∞ J k = [1]1×1 if λ = 1
and n = 1.
(d) Let
J1 0
J2
PAP −1 = .
..
0 Jk
be the Jordan canonical form of a matrix An×n with k Jordan blocks
Ji , 1 ≤ i ≤ k. Show that
tJ1
e 0
etJ2
etA = P −1 .. P.
.
0 etJk
(e) Prove Ex. <D1 > 5 of Sec. 3.7.6.
k
1
eA = lim A
k→∞ !
=0
exists for any A ∈ M(n; C). See Ex. <D2 > of Sec. 3.7.6.
15. Let M = [aij ] ∈ M(n; R) be a stochastic matrix (see Ex. <D3 > of
Sec. 3.7.6) and P M P −1 = J be the Jordan canonical form of M . Let
∞ be as in Ex. 14. Show that:
Use these results to prove Ex. <D3 > 3(d) of Sec. 3.7.6.
16. Let A = [aij ] ∈ M(n; F). Show that the following are equivalent.
For (b) ⇒ (a), see Ex. <C> 10(a) of Sec. 2.7.6. As a consequence, any
complex square matrix or real square matrix considered as a complex
one is always triangularizable. Yet, we still can give a more detailed
account for (b) ⇒ (a) as follows (refer to Ex. 3, and Exs. 2 and 3 of
548 The Three-Dimensional Real Vector Space R3
ψIA (t) =k
fi (t) = = (t − λ
)d , 1 ≤ i ≤ k.
(t − λi )di =1
=i
Let, for 1 ≤ i ≤ k,
= {
x ∈ Fn | there exists some positive integer
x (A − λi In )
= 0 }.
so that
Then
(1) Each Wi , 1 ≤ i ≤ k, is an invariant subspace of A so that
F n = W1 ⊕ · · · ⊕ W k .
Ei2 = Ei ,
Ei Ej = O, i = j, 1 ≤ i, j ≤ n.
Also, Ei has
2. A = D + N,
DN = N D.
3. Both D and N can be expressed as polynomials in A. A and D have
the same characteristic polynomial and hence, the set of eigenvalues.
det(A + B) = det A.
m m
i=0 Cm−i D
m−i i
N , m<k
m m
A = (D + N ) =
k−1 Cm Dm−i N i , m ≥ k
i=0 m−i
k−1
f (i) (D) i
= N , if f (z) = z m and m ≥ k.
i=0
i!
k−1
p(i) (D) i
p(A) = p(D) + N ,
i=1
i!
k−1
f (i) (D) i
f (A) = f (D) + N ,
i=1
i!
k−1
1 i
= eD + eD N .
i=1
i!
Then
(log λ1 )Ir1 0
..
D1 = P −1 . P.
0 (log λk )Irk
1 (−1)k−2
N1 = log(ND −1 + In ) = ND −1 − (ND −1 )2 + · · · + (N D−1 )k−1 .
2 k
When D1 and N1 are given as above, the matrix logarithm of A is
log A = D1 + N1 ,
1
In particular, if a = m where m ≥ 1 is a positive integer, then the mth
1
root A m is defined as
1 1 1 1 1
A m = e m log A = e m (D1 +N1 ) = e m D1 e m N1
1
λ1m Ir1 0
.. 1
= P −1 . P (In + N D−1 ) m .
1
0 λkm Irk
21. Try to use Exs. 16, 17 and 3 to prove that every triangularizable matrix
A ∈ M(n; F) has a Jordan canonical basis B so that [A]B is the Jordan
canonical form of A (refer to Sec. B.12).
dx1 (t)
= 2x1 (t) − x2 (t) − x3 (t),
dt
dx2 (t)
= 2x1 (t) − x2 (t) − 2x3 (t),
dt
dx3 (t)
= −x1 (t) + x2 (t) + 2x3 (t).
dt
Written in matrix form, this system is equivalent to
2 2 −1
d
x
x (t)A, where A = −1 −1
= 1 and
dt
−1 −2 2
x (t) = (x1 (t), x2 (t), x3 (t)). (*4)
Therefore,
1 0 0 1 2 −1
A = P −1 JP, where J = 1 1 0 and P = 1 0 0 .
0 0 1 1 1 0
1 −2 −1
d
x
(b) x A + f (t), where A = −3 −6 −4 ,
=
dt
3 13 8
x (0) = (1, 0, 0), f (t) = (0, −1, 1).
3 4 0 0
d
x −4 −5 0 0
x A + f (t), where A = ,
(c) =
dt 0 −2 3 2
2 4 −2 −1
x (0) = (0, 0, 1, 1), f (t) = (1, 1, 1, 1).
2. Consider the equation
d2 x dx
−3 + 2x = e−3t , x(1) = 1, x (1) = 0.
dt2 dt
Rewrite it as
dx 0 −2
= x A + f (t), A = , f (t) = (0, e−3t ),
dt 1 3
dx
x (t) = x(t), and x (1) = (1, 0).
dt
Then (see Exs. <D4 > 5, 6 of Sec. 3.7.6 and (*6)),
2t
tA −e + 2et −2e2t + 2et
e =
e2t − et 2e2t − et
⇒ f (t)e−tA = e−5t − e−4t , 2e−5t − e−4t
6 t 6 t 6 t
−tA −5t −4t −5t −4t
⇒ f (t)e dt = (e − e ) dt, (2e − e ) dt
1 1 1
= ···
6 t
−tA
⇒ f (t)e dt etA
1
1 −3t 1 (2t−5) 1 t−4 2 −3t 2 (2t−5) 1 t−4
=e + e − e ,− e + e − e
20 5 4 5 5 4
1 1 1
x (t) = −e2(t−1) + 2et−1 + e−3t + e(2t−5) − et−4 , . . .
⇒
20 5 4
1 −3t 1 (2t−5) 1 t−4
⇒ x(t) = −e2(t−1) + 2et−1 + e + e − e .
20 5 4
Try to work out the details.
556 The Three-Dimensional Real Vector Space R3
d
x
=
x A(t),
a ∈ Rn (or Cn )
x (0) =
dt
x =
a X(t), t ≥ 0,
dX
= XA(t), X(0) = In .
dt
(see Ex.<D1 > of Sec. 3.7.6). This suggests the following proof, called
the method of successive approximation. Define
X0 = In
6 t
Xm+1 = In + Xm A(s) ds, m ≥ 1.
0
3.7 Linear Transformations (Operators) 557
Step 1 Fix t1 > 0. Let α = max0≤t≤t1 A(t)1 (for 1 , see Ex. 14).
Then
6 t
Xm+1 − Xm 1 ≤ · · · ≤ Xm − Xm−1 1 A(s)1 ds
0
6 t
≤α Xm − Xm−1 1 ds
0
⇒ X1 − X0 1 ≤ αt,
α2 2
X2 − X1 1 ≤ t ,
2!
..
.
(αt)m+1
Xm+1 − Xm 1 ≤ , m≥1
(m + 1)!
∞ ∞
(αt)m+1
⇒ Xm+1 − Xm 1 ≤ = eαt − 1 < ∞, 0 ≤ t ≤ t1
m=0 m=0
(m + 1)!
∞
⇒ (Xm+1 − Xm ) = lim Xm = X exists on [0, t1 ] and hence,
m→∞
m=0 on t ≥ 0.
Step 2 Suppose Yn×n is another solution. Then
6 t
X −Y = (X − Y )A(s) ds.
0
Since both X and Y are differentiable, they are continuous on [0, t1 ].
Therefore α1 = max0≤t≤t1 X(t) − Y (t)1 < ∞. Now, for 0 ≤ t ≤ t1 ,
6 t
X(t) − Y (t)1 ≤ X(t) − Y (t)1 A(s)1 ds
0
6 t
≤ α1 A(s)1 ds ≤ α1 αt.
0
6 t 6 t
α2 α1 2
⇒ X(t) − Y (t)1 ≤ αα1 sA(s)1 ds ≤ α2 α1 s ds = t
0 0 2!
..
.
(αt)m
⇒ X(t) − Y (t)1 ≤ α1 , 0 ≤ t ≤ t1 , m ≥ 1
m!
⇒ X(t) − Y (t)1 = 0, 0 ≤ t ≤ t1
⇒ X(t) = Y (t) on 0 ≤ t ≤ t1 and hence, on t ≥ 0.
Thus, such a solution X is unique.
558 The Three-Dimensional Real Vector Space R3
Step 3 Let
x =
a X(t). Then
dx d d
= ( a X(t)) = a (X(t)) =
a X(t)A(t) =
x A(t), t ≥ 0 and
dt dt dt
x (0) =
a X(0) = a In = a.
Hence, this x is a solution. Just like Step 2,
x can be shown to be
unique.
5. (continued from Ex. 4) In case A(t) = A is a constant matrix, then
x0 = In ,
6 t
x0 = In + A ds = In + tA,
0
..
.
1 2 2 1 m m
xm = In + tA + t A + ··· + t A , t ≥ 0.
2! m!
These leads to the following important fact:
dX
= XA, X(0) = In
dt
has the unique solution
∞
1 m m
t A = etA .
m=0
m! (def.)
This definition coincides with that defined in Ex. <D2 > of Sec. 3.7.6.
Hence,
d
x
=
x A,
x (0) =
a
dt
has the unique solution
x =
a etA .
These results are useful in linear functional equation, Lie group and Lie
algebra, quantum mechanics and probability, etc.
A definitely has at least one real eigenvalue λ (refer to Sec. A.5). Hence,
its characteristic polynomial is of the form
where a1 and a0 are real constants. What we really care is the case that the
quadratic polynomial t2 +a1 t+a0 does not have real zeros, i.e. a21 −4a0 < 0.
But for completeness, we also discuss the cases a21 − 4a0 ≥ 0.
As we have learned in Secs. 2.7.6–2.7.8, 3.7.6 and 3.7.7, a canonical form
is a description of a matrix representation of a linear operator or square
matrix, obtained by describing a certain kind of ordered basis for the space
according to the features (e.g. eigenvalues, eigenvectors, etc.) of the given
operator or matrix. In this section, we will establish the canonical forms for
real 3×3 matrices based on the irreducible monic factors of its characteristic
polynomial instead of eigenvalues (see Sec. A.5).
Four cases are considered as follows.
Gλ1 = {
x ∈ R3 |
x (A − λ1 I3 )2 = 0 } (the generalized eigenspace),
Eλ2 = {
x ∈ R3 |
x (A − λ2 I3 ) = 0 } (the eigenspace).
and a particular basis had been chosen for Gλ1 and hence a basis B for R3
so that [A]B is the Jordan canonical form for A.
Here, we try to choose another basis for Gλ1 so that the induced basis
B for R3 represents A in a canonical form. The central ideas behind this
method are universally true for Cases 3 and 4 in the following.
560 The Three-Dimensional Real Vector Space R3
Notice that
(A − λ1 I3 )2 = A2 − 2λ1 A + λ21 I3
⇒
x (A2 − 2λ1 A + λ21 I3 ) = 0 x ∈ Gλ1
for all
⇒
x A2 = −λ21 x ∈ Gλ1 .
x A for all
x + 2λ1
Gλ1 =
v1 ,
v1 A, and
Eλ2 =
v2
⇒ B = {
v1 , v2 } is a basis for R3 .
v1 A,
Since
v1 A =0·
v1 +
v1 A,
( v1 A2 = −λ21
v1 A)A = v1 + 2λ1
v1 A,
v2 A= λ2
v2
v1 0 1 0 v1
2
⇒ v1 A A = −λ1 2λ1 0 v 1 A
v2 0 0 λ2 v2
0 1 0 v1
⇒ [A]B = PAP −1 = −λ21 2λ1 0 , where P = v 1 A .
0 0 λ2 v2
v ∈ Gλ = R3 so that
v (A − λI3 )2 = 0 v (A − λI3 ) = 0 )
(which implicitly implies that
v A2 = −λ2
⇒ v + 2λ
v A and
v (A − λI3 )2 is an eigenvector of A associated to λ.
It follows that
B = {
v, v A2 }
v A,
is a basis for R3 (see Ex. <A> 1). In B,
0 1 0 v
[A]B = PAP −1 = 0 0 1 , vA .
where P =
λ3 −3λ2 3λ 2
vA
Case 4 det(A − tI3 ) = −(t − λ)(t2 + a1 t + a0 ) where a21 − 4a0 < 0. Still
562 The Three-Dimensional Real Vector Space R3
Since a21 − 4a0 < 0, so Eλ ∩ Kλ = { 0 }.
We claim that dim Kλ = 2. Take any nonzero vector v ∈ Kλ , then
{ v , v A} is linearly independent and hence, is a basis for Kλ . To see this,
(1) In case a21 − 4a0 > 0, then det(A − tI3 ) = −(t − λ1 )(t − λ2 )(t − λ3 )
where λ = λ1 = λ2 = λ3 are real numbers. A is diagonalizable. See (1)
in (3.7.46).
(2) In case a21 − 4a0 = 0, then det(A − tI3 ) = −(t − λ1 )2 (t − λ2 ) where λ1
is a real number and λ = λ2 .
3.7 Linear Transformations (Operators) 563
Then
1. dim Gλ1 = 2, dim Eλ2 = 1,
2. Gλ1 ∩ Eλ2 = { 0 },
3. R3 = Gλ1 ⊕ Eλ2 .
Take any vector v1 ∈ Gλ1 \Eλ1 where Eλ1 = Ker(A − λ1 I3 ) and
v2 ∈ Eλ2 , then B = {
any nonzero vector v1 , v2 } is a basis for
v1 A,
R3 and
0 1 0 v1
[A]B = PAP −1 = −λ21 2λ1 0 , where P = v 1 A .
0 0 λ2 v2
0 1 0 v
[A]B = PAP −1 = 0 0 1 , where P = vA .
λ3 −3λ2 3λ 2
vA
564 The Three-Dimensional Real Vector Space R3
Thus
1. dim Kλ = 2, dim Eλ = 1,
2. Kλ ∩ Eλ = { 0 },
3. R3 = Kλ ⊕ Eλ .
Take any nonzero vector v1 ∈ Kλ and any nonzero vector
v2 ∈ Eλ . In
the basis B = { v1 , v1 A, v2 },
0 1 0 v1
[A]B = PAP −1 = −a0 −a1 0 , where P = v1 A . (3.7.53)
0 0 λ v2
Example 1 Let
0 0 1
A = 1 0 −1 .
0 1 1
Find the rational canonical form of A. If A is considered as a complex
matrix, what happens?
then solve
x (A2 + I3 ) = 0
⇒ x1 + x3 = 0
x ∈ R3 | x1 + x3 = 0} = (0, 1, 0), (1, 0, −1).
⇒ K = {
Choose
v1 = (0, 1, 0). Then v1 A = (1, 0, −1). Also, take
v2 = (1, 1, 1).
Then B = { v1 , v1 A, v2 } is a basis for R . Therefore,
3
v1 A = 0 ·
v1 + 1 ·
v1 A + 0 ·
v2 ,
( v1 A)A = v1 A = (0, −1, 0) = −
2
v1 + 0·
v1 A + 0 ·
v2 ,
v2 A = v2 = 0 · v1 + 0 · v1 A + 1 ·
v2
0 1 0 0 1 0
⇒ [A]B = PAP −1
= −1 0 0 ,
where P = 1 0 −1 .
0 0 1 1 1 1
This is the required canonical form.
On the other hand, if A is considered as a complex matrix,
det(A − tI3 ) = −(t − 1)(t − i)(t + i)
and then A has three distinct complex eigenvalues i, −i and 1.
For λ1 = i: Solve
x (A − iI3 ) = 0
x = (x1 , x2 , x3 ) ∈ C3 .
where
⇒ −ix1 + x2 = −ix2 + x3 = 0
x = (−x3 , −ix3 , x3 ) = x3 (−1, −i, 1)
⇒ for x3 ∈ C
⇒ Ei = Ker(A − iI3 ) = (−1, −i, 1).
Ei is a one-dimensional subspace of C3 . For λ1 = −i: Solve
x (A + iI3 ) = 0
⇒ ix1 + x2 = ix2 + x3 = 0
x = (−x3 , ix3 , x3 ) = x3 (−1, i, 1) for x3 ∈ C
⇒
⇒ E−i = Ker(A + iI3 ) = (−1, i, 1)
and dim E−i = 1. As before, we knew already that E1 = Ker(A − I3 ) =
(1, 1, 1). In the basis C = {(−1, −i, 1), (−1, i, 1), (1, 1, 1)} for C3 ,
i 0 0 −1 −i 1
[A]C = QAQ−1 = 0 −i 0 , where Q = −1 i 1 .
0 0 1 1 1 1
Thus, as a complex matrix, A is diagonalizable (refer to (3.7.46)). 2
566 The Three-Dimensional Real Vector Space R3
Also
3 2 3
A − I3 = 2 0 2 ⇒ r(A − I3 ) = 2
−1 −2 −1
⇒ dim Ker(A − I3 ) = 1 < 2,
10 0 10
(A − I3 )2 = 4 0 4 ⇒ r(A − I3 )2 = 1
−6 0 −6
⇒ dim Ker(A − I3 )2 = 2.
For A:
0 0 0
A − 2I3 = 0 0 1 ⇒ r(A − 2I3 ) = 1 ⇒ dim Ker(A − 2I3 ) = 2,
0 0 0
(A − 2I3 )2 = O ⇒ dim Ker(A − 2I3 )2 = 3.
Solve
x (A − 2I3 ) = 0
For B:
0 1 0
B − 2I3 = 0 0 1 ⇒ r(B − 2I3 ) = 2 ⇒ dim Ker(B − 2I3 ) = 1,
0 0 0
0 0 1
(B − 2I3 )2 = 0 0 0 ⇒ r(B − 2I3 )2 = 1,
0 0 0
(B − 2I3 )3 = O3×3 ⇒ dim Ker(B − 2I3 )3 = 3.
568 The Three-Dimensional Real Vector Space R3
Take
v = v (B − 2I3 )2 = (0, 0, 1) = 0 and so consider
e1 = (1, 0, 0), then
v B = (2, 1, 0) and v B = (4, 4, 1). In the basis C = {
2
v, v B 2 } for R3 ,
v B,
v B 3 = (8, 12, 6) = 6 · (4, 4, 1) − 12 · (2, 1, 0) + 8 · (1, 0, 0)
v − 12
= 8 v B + 6
v B2
0 1 0 1 0 0
⇒ [B]C = QBQ−1 = 0 0 1 , where Q = 2 1 0 .
8 −12 6 4 4 1
2
To the end, we use two concrete examples to illustrate the central ideas
behind the determination of the rational canonical forms of matrices of
order higher than 3. For details, read Sec. B.12.
e1 A =
e2 ,
e2 A =
e3 =
e1 A2 ,
e3 A =
e4 =
e1 A3 ,
e4 A = −
e1 − 2
e2 − 3
e3 − 2
e4
= −
e1 − 2
e1 A − 3
e1 A2 − 2
e1 A3 =
e1 A4
⇒ ei (A2 + A + I8 )2 = 0 for 1 ≤ i ≤ 4;
ei (A4 + 2A3 + 3A2 + 2A2 + I8 ) =
e5 A =
e6 ,
e6 A = −
e5 A −
e6 = −
e5 − e5 (−I8 − A) =
e5 A = e5 A2
⇒
ei (A2 + A + I8 ) = 0 for i = 5, 6.
R12 0
A2 = R22
0 R32
..
0 0 1 0 .
.
0 0 0 1 ..
.
−1 −2 −3 −2 ..
.
2 1 ..
3 4
.
· · · ··· ··· · · · .. · · · ······
=
.. .
. −1 −1 ..
.. .
. 1 0 ..
.. .
. ··· · · · .. ··· · · ·
..
. −1 1
..
. −1 0
570 The Three-Dimensional Real Vector Space R3
..
1 1 1 0 .
.
0 1 1 1 ..
.
−1 −2 −2 −1 ..
.
1 0 ..
1 1
.
· · · ··· ··· · · · .. · · · ······
⇒ A2 + A + I8 =
.. .
. 0 0 ..
.. .
. 0 0 ..
.. .
. ··· · · · .. ··· · · ·
..
. 0 2
..
. −2 2
⇒ r(A + A + I8 ) = 4;
2
O4×4
O2×2
..
. ··· · · ·
⇒ (A + A + I8 ) =
2 2
..
. −4 4
..
. −4 0 8×8
⇒ r(A + A + I8 ) = 2 for k ≥ 2.
2 k
v A2 = (1, 1, 1, −1, 0, . . . , 0),
v A3 = (1, 3, 4, 3, 0, . . . , 0),
v A4 = (−3, −5, −6, −2, 0, . . . , 0) = −
v − 2
v A − 3
v A2 − 2
v A3 .
Also, B
v = { v , v A, v A , v A } is a basis for Ker p1 (A) . In B
2 3 2
v,
[A|Ker p1 (A)2 ]B = R1 .
v
The matrix R1 is called the companion matrix of p1 (t)2 = t4 + 2t3 +
u ∈ Ker p1 (A), say
3t2 + 2t + 1. Take any vector u =
v1 . Then
u A = (0, 1, 1, 1, 0, 0, 0, 0),
u A2 = (−1, −2, −2, −1, 0, 0, 0, 0) = −
u −
uA
572 The Three-Dimensional Real Vector Space R3
H cycles
HH C(
v) C(
u)
dots HH
• • ← total number 2
1
= [dim R8 − r(A2 + A + 1)]
2
1
= ·4=2
2
• ← total number 1
1
= [r(A2 + A + 1)
2
−r(A2 + A + 1)2 ] = 1
⇓ ⇓
p1 (t)2 p1 (t)
produces the produces the
annihilator p1 (A)2 ; annihilator p1 (A).
1
Where the 2 in 2 is the degree of p1 (t). Also,
For R3 :
e7 A =
e8 ,
e8 A = (0, . . . , 0, −1, 1) = − e8 = −
e7 + e7 +
e7 A =
e7 A2
⇒
ei (A2 − A + I8 ) = 0 for i = 7, 8.
3.7 Linear Transformations (Operators) 573
Also,
..
0 −1 1 0 .
.
0 0 −1 0 ..
−1 −2 −3 −3 ...
3 5 7 3 ...
.
· · · · · · · · · · · · .. · · · · · · ···
A − A + I8 =
2
.. ..
. −1 2 .
.. ..
. 2 1 .
..
··· ··· ··· .· · · · · ·
..
. 0 0
..
. 0 0
⇒ r(A − A + I8 ) = 6
2
⇒ r(A2 − A + I8 )k = 2 for k ≥ 2.
Ker p2 (A) = {
x ∈ R8 |
x p2 (A) = 0 }
p2 (A)|C(
e7 ) = O2×2 .
3. Solve
x p2 (A) = 0 x ∈ R8
for
⇒ Ker p2 (A) = e8
e7 , as it should be by 2.
574 The Three-Dimensional Real Vector Space R3
v ∪ B
B = B u ∪ B
w = { v , v A, v A , v A , u , u A, w, wA}.
2 3
Example 5 Find a rational canonical basis and the rational canonical form
for the matrix
0 2 0 −6 2
1 −2 0 0 2
A = 1 0 1 −3 2 .
1 −2 1 −1 2
1 −4 3 −3 4
What happens if A is considered as a complex matrix?
3.7 Linear Transformations (Operators) 575
0 0 6 −12 6
0 0 12 −24 12
⇒ r(A2 + 2I5 )2 = 1.
Since 12 (dim R5 − r(A2 + 2I5 )) = 12 (5 − 1) = 2, so there exists a rational
canonical basis B = {v1 , v1 A, v2 ,
v2 A, v 3 } of A so that
..
0 1 .
.
−2 0 .. v1
.
· · · · · · .. · · · · · · · · · v 1 A
.. ..
−1
[A]B = QAQ =
. 0 1 . , where Q = v2 . (*3)
.. . v2 A
. −2 0 ..
.. .. v3
. ··· ··· . ···
..
. 2
To find such a rational canonical basis B: Solve
x (A2 + 2I5 ) = 0
⇒ x2 + x3 + x4 + 2x5 = 0
x = (x1 , −x3 − x4 − 2x5 , x3 , x4 , x5 )
⇒
= x1 (1, 0, 0, 0, 0) + x3 (0, −1, 1, 0, 0) + x4 (0, −1, 0, 1, 0)
+ x5 (0, −2, 0, 0, 1) where x1 , x3 , x4 , x5 ∈ R.
576 The Three-Dimensional Real Vector Space R3
Take
v1 =
e1 , then
v1 A = (0, 2, 0, −6, 2)
⇒ v1 A = (−2, 0, 0, 0, 0)
2
= −2
e1 = −2
v1 .
v2 = (0, −1, 1, 0, 0) which is linearly independent of
Take v1 and
v1 A. Then
v2 A = (0, 2, 1, −3, 0)
⇒ v2 A = (0, 2, −2, 0, 0)
2
= −2
v2 .
Solve
x (A − 2I5 ) = 0
Readers are urged to find an invertible complex matrix S5×5 so that SAS −1
is the above-mentioned rational canonical form.
Exercises
<A>
<B>
a11 0 7a11 + 3a21 + 3a22 0
f = .
a21 a22 a21 −3a11 − 3a21 + a22
Show that f is linear and find its rational canonical form and a rational
canonical basis.
Ga (3: R) = { x) |
x 0 + f ( x 0 ∈ R3 and
f : R3 → R3 is an invertible linear operator} (3.8.1)
I(
x ) = 1R3 (
x) = x ∈ R3 ,
x for all
3.8 Affine Transformations 579
For examples,
R2 R3
affine basis affine basis B = { a0 ,
a1 , a3 } with base point
a2 ,
B = {a0 , a2 } with
a1 ,
a0 (see (3.6.1)).
base point a0 .
T : R2 → R2 is an T (
x) = x ) : R3 → R3 is an affine
x0 + f (
affine transformation. transformation.
(2.8.9) [T (
x )]C = [T ( x ]B [f ]B
a0 )]C + [ C is the matrix
representation of T with aspect to B and C. See
Explanation below. (3.8.2)
B
f 0
(2.8.10) ([T (
x )]C 1) = ([
x ]B 1) C . (3.8.3)
T ( a0 ) C
1
7
A 0
Ga (2; R) in (2.8.11) Ga (3; R) = x0 1 A ∈ GL(3; R) and
8
x0 ∈ R3 with
I 0
identity: 03 1 , and
−1
A 0 A 0
inverse: x0 1 = − x A−1 1
. (3.8.4)
0
T (
x) =
x0 + f (
x)
an affine transformation on R3 .
Rewrite T as
T (
x ) = T ( x −
a0 ) + f ( a0 ), where T (
a0 ) =
x0 + f (
a0 ).
⇒ T (
x) − b0 a0 ) −
= T ( b0 x −
+ f ( a0 ).
3.8 Affine Transformations 581
ai −
where each f ( a0 ) is a vector in R3 and
ai −
[f ( a0 )]C = (αi1 , αi2 , αi3 )
3
⇔ f (
ai −
a0 ) = αij ( bj − b0 ) for 1 ≤ i ≤ 3.
j=1
so {
a0 ,
a1 , a3 } is affinely independent
a2 , (see Sec. 3.6) and hence B is an
affine basis for R3 . Similarly,
b1 − b0 2 0 0
det b2 − b0 = 1 2 1 = 6
b3 − b0 0 1 2
[T (
x )]C = [T ( x ] B AB
a0 )]C + [ C
a0 )]C and AB
we need to compute [T ( C . Now
T (
a0 ) =
x0 +
a0 A
= (1, −1, 0) + (1, 0, 1)A = (1, −1, 0) + (6, −5, 4) = (7, −6, 4)
⇒ T (
a0 ) − b0 = (7, −6, 4) − (0, −1, −1) = (7, −5, 5)
⇒ [T ( a0 ) − b0 )[1R3 ]N
a0 )]C = (T ( C (refer to (2.7.23) and (3.3.3)),
where
b1 − b0 2 0 0
= b2 − b0 = 1 2 1 and
[1R3 ]CN
b3 − b0 0 1 2
3 0 0 −1
1
[1R3 ]N
C = −2 4 −2 = [1R3 ]CN .
6
1 −2 4
3.8 Affine Transformations 583
Hence
3 0 0
1 1
a0 )]C = (7 −5
[T ( 5)· −2 4 −2 = (36, −30, 30) = (6, −5, 5).
6 6
1 −2 4
(
a1 −
a0 )A C
a0 )A[1R3 ]N
( a1 − C a1 −
a0
AC = ( a2 − a0 )A C = ( a2 − a0 )A[1R3 ]N
B = a2 − a0 A[1R3 ]N
C C
( a3 −
a0 )A C ( a0 )A[1R3 ]N
a3 − C a3 −
a0
1 2 1 4 −3 3 3 0 0
1
= 0 1 0 0 1 4 · −2 4 −2
6
1 3 2 2 −2 1 1 −2 4
36 −36 54
1
= 2 −4 14 .
6
49 −50 76
Let v1 = (4, −3, −6), v2 = (1, −1, −1) and v3 = (−2, 2, 1) be eigen-
vectors associated to 1, 2 and 3, respectively. Then D = {
x0 ,
x0 +
v1 ,
x0 +
v2 ,
x0 + v3 } is an affine basis for R . With x0 as a base point,
3
T (
x) =
x0 +
xA = T ( x −
x0 ) + ( x0 )A, where T (
x0 ) =
x0 +
x0 A
x) −
⇒ T ( x0 ) −
x0 = T ( x −
x0 + ( x0 )A
⇒ [T (
x )]D = [T ( x −
x0 )]D + [( x0 )A]D , i.e.
(T ( x ) −
x0 )P −1
= x0 A)P −1
( + (x −
x0 )P −1 (PAP −1 ).
Let
[
x ]D = ( x0 )P −1 = (α1 , α2 , α3 ),
x − and
[T ( x )]D = (T ( x ) −
x0 )P −1
= (β1 , β2 , β3 ).
584 The Three-Dimensional Real Vector Space R3
By computing
1 −9 −3
P −1 = 1 −8 −2
0 −2 −1
4 −3 3 1 −9 −3
⇒ [T (
x0 )]D = x0 AP −1
= (1 −1 0) 0 1 4 1 −8 −2
2 −2 1 0 −2 −1
= (0 5 −3).
Then, in terms of D, T has the simplest form as
1 0 0
x ]D 0 2
x )]D = (0, 5, −3) + [
[T ( 0 or
0 0 3
β1 = α1 ,
β2 = 5 + 2α2 ,
β3 = −3 + 3α3 .
See Fig. 3.55 and try to explain geometric mapping behavior of T in D.
For example, what is the image of the parallelepiped (
x0 +
v1 )(
x0 +
v2 )
( x0 + v3 ) with vertex at x0 under T ? 2
e3
x0 + v3
x0 e2
0
e1
x0 + v2
x0+ v1
Fig. 3.55
find out some of them and rewrite them in the simplest forms, if possible.
See Fig. 3.56.
(−1,−1,1)
(−1,1,1)
e3
(1,−1,1) (1,1,1)
e2
0
e1
(−1,−1,−1) (−1,1,−1)
(1,−1,−1) (1,1,−1)
Fig. 3.56
The simplest one among all is, even by geometric intuition, the one T1
that satisfies
ai ) = bi = −
T1 ( ai for i = 0, 1, 2, 3.
⇔ The unique invertible linear operator f1 : R3 → R3 , where T1 (
x) =
x −
b0 + f1 ( x0 ), satisfies
aj −
f1 ( a0 ) = bj − b0 = −( aj − a0 ) for j = 1, 2, 3.
−1 0 0
⇒ T1 ( x ) = b0 + ( x − a0 )A1 , where A1 = 0 −1
0.
0 0 −1
⇒ (since a0 A1 = b0 )
−1 0 0
T1 (x ) = f1 (
x) = xA1 = x 0 −1 0 or
0 0 −1
y1 = −x1
y = −x2 , where x = (x1 , x2 , x3 ) and y = T (
x ) = (y1 , y2 , y3 ).
2
y3 = −x3
586 The Three-Dimensional Real Vector Space R3
a0 ) = −
T2 ( a0 , a1 ) = −
T2 ( a1 a2 ) = −
and T2 ( a3 , a3 ) = −
T2 ( a2 .
a1 −
f2 ( a0 ) = −(
a1 −
a0 ), and
f2 (
a2 −
a0 ) = −(
a3 −
a0 ), a3 −
f2 ( a0 ) = −(
a2 −
a0 ).
⇒ (since
a0 A2 = b0 )
−1 0 0
T2 (
x ) = f2 (
x) = x 0
x A2 = 0 −1 or
0 −1 0
y1 = −x1 , y2 = −x3 , y3 = −x2 .
If the vertex a0 is preassigned to b0 = −
a0 , then the total number of such
mappings is 3! = 6. Each one of them is represented by one of the following
matrices A:
−1 0 0 −1 0 0 0 0 −1
0 −1 0 , 0 0 −1 , 0 −1 0 ,
0 0 −1 0 −1 0 −1 0 0
0 −1 0 0 0 −1 0 −1 0
−1 0 0 , −1 0 0 , 0 0 −1 .
0 0 −1 0 −1 0 −1 0 0
a0 ) = −
T3 ( a1 , a1 ) = −
T3 ( a0 , a2 ) = −
T3 ( a2 a3 ) = −
and T3 ( a3 .
⇒ (since
a0 A3 = (3, −1, −1) and b1 = −
a1 = (1, −1, −1))
1 0 0
T3 ( x 1 −1
x ) = (−2, 0, 0) + 0 .
1 0 −1
a0 ) = −
T4 ( a1 , a1 ) = −
T4 ( a2 , a2 ) = −
T4 ( a3 a3 ) = −
and T4 ( a0 .
a1 −
f4 ( a0 ) = − a1 = −(
a2 + a2 −
a1 ) = −(
a2 − a1 −
a0 ) + ( a0 ),
a2 −
f4 ( a0 ) = − a1 = −(
a3 + a3 − a1 −
a0 ) + ( a0 ),
a3 −
f4 ( a0 ) = −
a0 + a1 −
a1 = a0
1 −1 0
⇒ T4 (
x ) = − x −
a1 + ( a0 )A4 , where A4 = 1 0 −1.
1 0 0
⇒ (since −a1 = (1, −1, −1), a0 A4 = (3, −1, −1))
1 −1 0
T4 ( x 1
x ) = (−2, 0, 0) + 0 −1 .
1 0 0
There are four others of this type whose respective linear parts are
1 0 −1 1 −1 0 1 0 0 1 0 −1
1 −1 0 , 1 0 0 , 1 0 −1 , 1 0 0 .
1 0 0 1 0 −1 1 −1 0 1 −1 0
Yet, there are another 12 such affine transformations which are left as
Ex. <A> 2. 2
588 The Three-Dimensional Real Vector Space R3
Exercises
<A>
1. Prove (3.8.4), (3.8.5) and (3.8.6) in details.
2. Find another 12 affine transformations as mentioned in Example 2.
3. Let B = { a0 ,
a1 , a3 }, where
a2 , a0 = (2, 1, −1), a1 = (3, 1, −2),
a2 = (2, 3, 2), a3 = (3, 2, 0) and C = { b0 , b1 , b2 , b3 }, where b0 =
(1, 3, 1), b1 = (0, 3, 2), b2 = (2, 4, 2), b3 = (2, 1, 2). Show that B and
C are affine bases for R3 . For each of the following affine transforma-
tions T in the natural affine basis N = { 0 , e1 , e3 }, do the following
e2 ,
problems:
(1) The matrix representation of T with respect to B and B, denoted
by [T ]B , i.e.
[T ]B (
x ) = [T (
x )]B = [T ( x ] B AB
a0 )]B + [ B.
4 2 3
(f) T (
x) =
x0 + x0 = (4, −1, −1) and A = 2
xA, where 1 2.
−1 −2 0
2 1 0
(g) T (
x) =
x0 + x0 = (2, 1, −1) and A = 0 2 1.
xA, where
1 0 2
4. Let T and B and D be as in Ex. 3(a).
(a) Compute T −1 and [T −1 ]B . Is [T −1 ]B = [T ]−1
B (refer to (3.3.3))?
(b) Compute T and [T ]B , where T = T · T .
2 2 2
6. In Fig. 3.56, find all possible affine transformations mapping the tetra-
hedron ∆(− a1 ) a2 (−
a3 a0 ) onto the tetrahedron ∆ a0
a1
a2
a3 , where
a1 , a2 , . . ., etc. are as in Example 2.
7. In Fig. 3.56, find all possible affine transformations mapping the tetrahe-
dron ∆ a0 a1
a2
a3 onto itself. Do they constitute a group under composite
operation? Give precise reasons.
8. Try to extend Exs. <A> 2 through 8 of Sec. 2.8.1 to R3 and prove your
statements.
<B>
1. We define the affine group Ga (3; R) with respect to the natural affine
basis N (see (3.8.4)). Show that the affine group with respect to an affine
basis B = {a0 ,
a1 , a3 } is the conjugate group
a2 ,
−1
A0 0 A0 0
G a (3; R) ,
a0 1 4×4 a0 1 4×4
2. Let B = { a0 ,
a1 ,
a2 , a4 }, where
a3 , a0 = (−1, −1, 1, −2), a1 =
(0, 0, 2, −1), a2 = (−1, 0, 2, −1), a3 = (−1, −1, 2, −1), a4 =
(−1, −1, 1, −1) and C = { b0 , b1 , b2 , b3 }, where b0 = (2, −3, 4, −5),
b1 = (3, −2, 5, −5), b2 = (3, −3, 5, −4), b3 = (3, −2, 4, −4), b4 =
590 The Three-Dimensional Real Vector Space R3
(2, −2, 5, −4). Show that both B and C are affine bases for R4 . For each
of the following affine transformations T (x) =
x0 +
x A, do the same
problems as in Ex. <A> 3.
1 0 0 0
1 1 0 0
(a) x0 = (2, 3, 5, 7), A =
2 0
.
2 1
−1 1 −1 1
2 −1 0 1
0 3 −1 0
x0 = (0, −2, −2, −2), A =
(b) 0
.
1 1 0
0 −1 0 3
2 −4 2 2
−2 0 1 3
x0 = (−3, 1, 1, −3), A =
(c) −2 −2 3 3 .
−2 −6 3 7
1 −2 0 0
2 1 0 0
x0 = (2, 0, −2, −2), A =
(d) 1
.
0 1 −2
0 1 2 1
3. Try to describe all possible affine transformations on R4 mapping the
four-dimensional tetrahedron (or simplex).
4 /
4
∆ a0 a1 a2 a3 a4 = λi ai | λi ≥ 0 for 0 ≤ i ≤ 4 and
λi = 1
i=0 i=0
onto another tetrahedron ∆ b0 b1 b2 b3 b4 . Try to use matrices to write
some of them explicitly.
<C> Abstraction and generalization
Read Ex. <C> of Sec. 2.8.1.
3.8.2 Examples
We are going to extend the contents of Sec. 2.8.2 to the affine space R3 .
The readers should review Sec. 2.8.2 for detailed introductions.
Remember that R3 also plays as a vector space.
We also need spatial Euclidean concepts such as angles, lengths, areas
and volumes in some cases. One can refer to Introduction and Natural Inner
Product in Part 2 and the beginning of Chap. 5 for preliminary knowledge.
3.8 Affine Transformations 591
Case 1 Translation
x0 be a fixed vector in R3 . The mapping
Let
T (
x) =
x0 +
x, x ∈ R3
(3.8.10)
is called a translation of R3 along
x0 . Refer to Fig. 2.96. The set of all such
translations form a subgroup of Ga (3; R).
Translations preserve all geometric mapping properties listed in (3.7.15).
A translation does not have fixed point unless x0 = 0 , which in this case
every point is a fixed point. Any line or plane parallel to x0 is invariant
under translation.
Case 2 Reflection
Suppose a line OA and a plane Σ in space R3 intersect at the point O. For
any point X in R3 , draw a line XP , parallel to OA, intersecting Σ at the
point P and extend it to a point X so that XP = PX . See Fig. 3.57. The
mapping T : R3 → R3 defined by
T (X) = X
−
is called the (skew) reflection of space R3 along the direction OA with
respect to the plane Σ. In case the line OA is perpendicular to the plane
592 The Three-Dimensional Real Vector Space R3
A X
Σ
O P
X′
Fig. 3.57
The reflection
Let
a0 ,
a1 ,
a2 and a3 be non-coplanar points in the space R3 . Denote by T
the reflection of R along the direction
3
a3 − a0 with respect to the plane
Σ = a0 + a1 − a0 , a2 − a0 . Then,
Also,
T ( a0 (I3 − P −1 [T ]B P ) +
x) = x P −1 [T ]B P
=
x0 +
x A, where a0 (I3 − A) and A = P −1 [T ]B P. (3.8.12)
x0 =
Notice that det A = det[T ]B = −1. (3.8.12) suggests how to test whether
a given affine transformation T (x) = x0 +
x A is a reflection. Watch the
following steps (compare with (2.8.27)):
1 1
( 0 + T ( 0 )) + v2 =
v1 , x0 + v2
v1 ,
2 2
is the plane of invariant points of T .
4. Compute an eigenvector v3 corresponding to −1. Then
v3 or −
v3 is a
direction of reflection T . In fact,
x0 (A x (I3 − A)(I3 + A) = 0
+ I3 ) = (see (2)in (3.7.46)),
so x0 = 0 . Note that
x0 is a direction if v3 and
x0 are linearly depen-
dent.
(3.8.13)
The details are left to the readers.
Example 1
(a) Find the reflection of R3 along the direction v3 = (−1, 1, −1) with
respect to the plane (2, −2, 3) + (0, 1, 0), (0, −1, 1).
(b) Show that
1 0 0
5 4
T (
x) =
x0 +
x A, x0 = (0, −2, −4) and A = 0
where 3 3
0 − 43 − 53
Solution (a) In the affine basis B = {(2, −2, 3), (2, −1, 3), (2, −3, 4),
(1, −1, 2)}, the required T has the representation
1 0 0
[T ( x ]B [T ]B , where [T ]B = 0 1
x )]B = [ 0 .
0 0 −1
While, in the natural affine basis N ,
x − (2, −2, 3))P −1 [T ]B P
x ) = (2, −2, 3) + (
T (
where
0 1 0 0 −1 −1
B
P = PN = 0 −1 1 ⇒ P −1 = 1 0 0 .
−1 1 −1 1 1 0
Therefore,
0 −1 −1 1 0 0 0 1 0
P −1 [T ]B P = 1 0 0 0 1 0 0 −1 1
1 1 0 0 0 −1 −1 1 −1
−1 2 −2
= 0 1 0 , and
0 0 1
(2, −2, 3) − (2, −2, 3)P −1 [T ]B P = (2, −2, 3) − (−2, 2, −1) = (4, −4, 4)
−1 2 −2
x 0 1
x ) = (4, −4, 4) +
⇒ T ( 0 for x ∈ R3 , or
0 0 1
y1 = 4 − x1 , y2 = −4 + 2x1 + x2 , y3 = 4 − 2x1 + x3 .
(b) Since det A = −1, it is possible that T is a reflection. To make cer-
tainty, compute the characteristic polynomial det(A−tI3 ) = −(t−1)2 (t+1).
So A has eigenvalues 1, 1 and −1. Moreover,
0 0 0 2 0 0
2 4 8 4
(A − I3 )(A + I3 ) = 0 3 3 0 3 3 = O3×3
0 − 43 − 83 0 − 43 − 23
indicates that A is diagonalizable and thus, the corresponding T is a reflec-
x (I3 − A) =
tion if x0 has a solution. Now
x (I3 − A) =
x0
⇒ x2 − 2x3 − 3 = 0 (the plane of invariant points).
So T is really a reflection.
3.8 Affine Transformations 595
1
x0 + v2 = (0, −1, −2) + {(2α1 + α2 , 2α2 , α2 )|α1 , α2 ∈ R}
v1 ,
2
x1 = 2α1 + α2
⇔ x2 = −1 + 2α2
x3 = −2 + α2 , for α1 , α2 ∈ R
⇔ x2 − 2x3 − 3 = 0
2v3
x v3 + v2
v3
v3 + 〈〈 v1, v2 〉〉 e3
v2
v3 + v2
e2
0
e1
T ( x)
v1
Fig. 3.58
is a standard orthogonal reflection of R3 in the direction
v3 = √1 , − √1 , 0
2 2
with respect to the plane
v2 = {(x1 , x2 , x3 ) ∈ R3 | x1 = x2 }, where
v1 ,
1 1
v1 = √ , √ , 0 and v2 = e3 = (0, 0, 1).
2 2
See Fig. 3.59 and compare with Fig. 3.32 with a = b = c = 1. Notice that
v2 = e3
T ( x)
0 e2 x
v1
v3 e1
x1 = x2
Fig. 3.59
√1 √1 0
1 0 0 v1 2 2
A = P −1 0 1 0 P, where P =
v2 = 0 0 1 .
0 0 −1
v3 √1 − √12 0
2
X
A
X′(k > 0)
Σ
P
O
X′(k < 0)
Fig. 3.60
3.8 Affine Transformations 597
T (X) = X
Σ′ L
T (Σ′)
T (L)
Fig. 3.61
Also,
(1) A one-way stretch preserves all the properties 1–7 listed in (3.7.15),
(2) but enlarges the volume by the scale factor |k| and preserves the orien-
tation if k > 0 while reverses the orientation if k < 0.
(3.8.15)
Example 2
(a) (See Example 1(a).) Find the one-way stretch with scale factor k = 1
of R3 along the direction v3 = (−1, 1, −1) with respect to the plane
(−1, 1, 1) + v2 , where
v1 , v2 = (0, −1, 1).
v1 = (0, 1, 0) and
(b) Show that
−1 4 2
T (
x) = x0 + x A, where x0 = (−2, 4, 2) and A = −1 3 1
−1 2 2
Solution (a) In the affine basis B = {(−1, 1, 1), (−1, 2, 1), (−1, 0, 2),
(−2, 2, 0)}, the one-way stretch T is
1 0 0
[T ( x ]B [T ]B , where[T ]B = 0 1 0 .
x )]B = [
0 0 k
T ( x − (−1, 1, 1))P −1 [T ]B P,
x ) = (−1, 1, 1) + (
where
0 1 0 0 −1 −1
B
P = PN = 0 −1 1 ⇒ P −1 = 1 0 0 .
−1 1 −1 1 1 0
Hence,
0 −1 −1 1 0 0 0 1 0
P −1 [T ]B P = 1 0 0 0 1 0 0 −1 1
1 1 0 0 0 k −1 1 −1
k 1 − k −1 + k
= 0 1 0 ,
0 0 1
⇒ T (
x ) = (−1 + k, 1 − k, −1 + k)
k 1 − k −1 + k
+x 0 1 0 x ∈ R3 ,
for or
0 0 1
y1 = −1 + k + kx1 ,
y = 1 − k + (1 − k)x1 + x2 ,
2
y3 = −1 + k + (−1 + k)x1 + x3 ,
x (I3 − A) =
x0 , i.e.
2 −4 −2
(x1 , x2 , x3 ) 1 −2 −1 = (−2, 4, 2)
1 −2 −1
⇒ 2x1 + x2 + x3 = −2 does have a (in fact, infinite) solution.
x0 − T ( 0 ) = α
α x0 − x0 − 0 ) with
x0 = k(α k=2
⇒ (since = 0 )α − 1 = kα with k = 2
x0
1
⇒α= = −1
1−k
⇒ An invariant point αx0 = −
x0 = 2
v3
⇒ The plane of invariant point is 2
v3 + v2 .
v1 ,
3.8 Affine Transformations 601
Notice that
x = (x1 , x2 , x3 ) ∈ 2
v3 + v2
v1 ,
⇒ x1 = 2 + α1 + α2 , x2 = −4 − 2α1 , x3 = −2 − 2α2 for α1 , α2 ∈ R
⇒ (by eliminating the parameters α1 and α2 ) 2x1 + x2 + x3 = −2
x (I3 − A) =
⇒ x0
as we claimed before. In the affine basis C = {2v3 , 2
v3 + v1 , 2
v3 +
v2 ,
2 v3 + v3 },
1 0 0 1 −2 0
[T ( x ]C 0 1 0 ,
x ]C P AP −1 = [
x )]C = [ where P = 1 0 −2 .
0 0 2 1 −2 −1
See Fig. 3.62.
e3
( −1, 0, 0)
0 (2,1,1)
(0, − 2, 0)
e2
v1
e1
2x1 + x2 + x3 = 0
v3
v2
2v3 + v1 2v3
Fig. 3.62
For Q1 A plane, parallel to 2x1 +x2 +x3= −2,is of the form 2x1 +x2 +x3 =
c which intercepts x1 -axis at the point 2c , 0, 0 . Since 2x1 + x2 + x3 = −2
is the plane of invariant points of T , with scale factor 2, so the image plane
(refer to Fig. 3.62) is
as claimed above. Therefore, such a plane has its image a plane parallel to
itself and, of course, to 2x1 + x2 + x3 = −2.
(2, 1, 1) holds. Since (A∗ − I3 )(A∗ − 2I3 ) = O3×3 , so
y 0 (A∗ − 2I3 ) = 0
where y 0 = (2, 1, 1). Therefore
y 0 A∗ = 2
y0
∗ ∗
⇒Ay0 = 2y0
1 ∗
y ∗0 =
⇒ A−1 y
2
which can also be easily shown by direct computation.
For Q2 Take, for simplicity, the plane x1 − x2 − x3 = 5. To find the image
plane of this plane under T , observe that
1
x1 − x2 − x3 = (x1 , x2 , x3 ) −1 = 5
−1
1
⇒ (let y = T (x ) temporarily) [(y1 , y2 , y3 ) − (−2, 4, 2)]A−1 −1 = 5
−1
5
⇒ (y1 + 2, y2 − 4, y3 − 2) 1 = 5(y1 + 2) + y2 − 4 + y3 − 2 = 5
1
⇒ (replace y1 , y2 , y3 by x1 , x2 , x) 5x1 + x2 + x3 = 1
which is the equation of the image plane.
The two planes x1 − x2 − x3 = 5 and 5x1 + x2 + x3 = 1 do intersect
along the line x1 = 1, x2 + x3 = −4 which obviously lies on the plane
2x1 + x2 + x3 = −2. Refer to Fig. 3.61.
For Q3 By computation,
T (
a0 ) = T ( 0 ) =
x0 = (−2, 4, 2),
a1 ) = (−2, 4, 2) + (1, 0, −1)A = (−2, 4, 2) + (0, 2, 0) = (−2, 6, 2),
T (
a2 ) = (−2, 4, 2) + (2, −4, −2)A = (−2, 4, 2) + (4, −8, −4) = (2, −4, −2),
T (
a3 ) = (−2, 4, 2) + (−2, 0, 0)A = (−2, 4, 2) + (2, −8, −4) = (0, −4, −2).
T (
These four points form a tetrahedron ∆T ( a0 )T (
a1 )T (
a2 )T (
a3 ) whose vol-
ume is equal to
T ( a1 ) − T (a0 ) 0 2 0
det T (a2 ) − T (a0 ) = 4 −8 −4 = 16.
T ( a3 ) − T ( a0 )
2 −8 −4
604 The Three-Dimensional Real Vector Space R3
While
a0
a1
a2
a3 has volume equal to
a1 1 0 −1
a2 = 2 −4 −2 = 8.
det
a3 −2 0 0
Therefore,
Volume of T (
a0 ) · · · T (
a3 ) 16
= = 2 = det A.
Volume of a0 · · · a3 8
See Fig. 3.63.
T ( a0 )
T ( a1 )
a3
a0
T ( a3 )
a1
T ( a2 ) = a2
Fig. 3.63
points. Then
[T (
x )]B = [
x ]B [T ]B ,
k1 0 0 k1 0 0 1 0 0
where [T ]B = 0 k2 0 = 0 1 0 0 k2 0 .
0 0 1 0 0 1 0 0 1
3.8 Affine Transformations 605
2. In the natural affine basis N = { 0 ,
e1 , e3 },
e2 ,
a1 −
a0
T (
x) =
a0 + ( a0 )P −1 [T ]B P,
x − where P = a0 .
a2 −
a3 − a0
Also,
(1) A two-way stretch preserves all the properties 1–7 listed in ( 3.7.15),
(2) but enlarge the volume by the scale factor |k1 k2 | and preserves the
orientation if k1 k2 > 0 while reverses the orientation if k1 k2 < 0.
(3.8.17)
See Fig. 3.64 (refer to Fig. 2.102).
e2
T(x), k1 < 0, k2 < 0
0 e1
k1 0 0
(a) T = 0 k2 0 in
0 0 1
T ( x)
x
a3 a2
a0 a1
(b) in
Fig. 3.64
x1 +
v3 = {
x ∈ R3 |
x (I3 − A) =
x0 }
is the line of invariant points.
4. Compute an eigenvector v1 for k1 and an eigenvector
v2 for k2 so that
v1 and v2 are linearly independent in case k1 = k2 . Then
x1 + v2
v1 ,
is an invariant plane.
(3.8.18)
The details are left to the readers.
Solving x (A − I3 ) = 0 , get the corresponding eigenvector v3 =
(1, 4, −3). Solving x (A + 2I3 ) = 0 , get v1 = (0, 1, −1) and solving
v2 = (1, 0, −1).
x (A + 3I3 ) = 0 , get
For any x1 such that
x1 (I3 − A) = x0 holds, T is a two-way stretch with
x1 +
the line of invariant points: v3 , and
the invariant plane: x1 + v1 , v2 .
x1 + v3
x1 + 〈〈v3 〉〉 v3
e3
x1 e2
x1 + v1
(−1, 0, 0) e1
0 v1
x1 + v2 (0, −1, 0)
(0, 0, −1)
v2
x1 + x2 + 3 = −1 x1 + x2 + 3 = 0
Fig. 3.65
x1 +
Q3 Where is the intersecting line of a plane, nonparallel to v2 ,
v1 ,
with its image?
Q4 Let
a0 = x1 = (0, −3, 2),
a1 = (−1, −1, −1),
a2 = (1, −2, 1) and a3 =
x1 + v3 = (1, 1, −1). Find the image of the tetrahedron
a0
a1
a2
a3
and compute its volume.
For these, note that the inverse transformation of T is
2 18 −14
1
y −(2, −3, 1))A−1 ,
x = ( where x ) and A−1
y = T ( = 4 15 −13 .
6
4 18 −16
For Q1 Geometric intuition (see Fig. 3.65) suggests that the image of any
such plane would be parallel to the plane. This can be proved analytically
as follows.
A plane Σ parallel to the line L of invariant points has equation like
(−4a2 + 3a3 )x1 + a2 x2 + a3 x3 = b, where b = −3a2 + 2a3 , a2 , a3 ∈ R.
The condition b = −3a2 + 2a3 means that L is not coincident on Σ. Then
ΣL
−4a2 + 3a3
x
⇔ a2 =b
a3
−4a2 + 3a3 10a2 − 8a3
1
y − (2, −3, 1)]A−1
⇔ [ a2 = [
y − (2, −3, 1)] −a2 − a3 = b
6
a3 2a2 − 4a3
3.8 Affine Transformations 609
⇔ (10a2 − 8a3 )y1 − (a2 + a3 )y2 + (2a2 − 4a3 )y3 = 6b + 25a2 − 17a3
⇒ (let α2 = −a2 − a3 , α3 = 2a2 − 4a3 )
(−4α2 + 3α3 )x1 + α2 x2 + α3 x3 = 6b + 25a2 − 17a3 ,
where 6b + 25a2 − 17a3 = −18a2 + 12a3 + 25a2 − 17a3 = −3α2 + 2α3 . Hence,
the image plane T (Σ)ΣL. See Note below.
Σ v1 ⇔ a2 = a3 .
v3 ,
v3 ∈ Ker(A − I3 )
= Im(A∗ − I3 )⊥ and
∗ ⊥
v1 ∈ Ker(A + 2I3 )
= Im(A + 2I3 )
Σ v2 ⇔ 2a2 − a3 = 0.
v3 ,
⇒ Σ has equation 2x1 + x2 + 2x3 = b, b = 1.
1 ∗
⇒ A−1u∗ = − u , where u = (2, 1, 2), the normal vector.
2
For Q2 The answer is negative in general except the planes x1 +
v1 and
v3 , x1 + v2 which are invariant planes. But each such
v3 ,
plane and its image plane intersect along the line L of invariant points. Q1
answers all these claims.
under T ,
0
x 0 = 0
x3 =
1
0 −14
1
y − (2, −3, 1)]A−1 0 = [
⇔ [ y − (2, −3, 1)] −13 = 0
6
1 −16
⇒ (replace
y by
x) 14x1 + 13x2 + 16x3 = 5.
This image plane does intercept the line L at the invariant point 23 , − 13 , 0
and intersects the original plane x3 = 0 along the line x3 = 0, 14x1 + 13x2 +
16x3 = 5.
For Q4 By computation,
Then,
a0 ) · · · T (
The signed volume of T ( a3 ) −36
⇒ = = 6 = detA. 2
The signed volume of a0 · · · a3
−6
The stretch
Let a0 ,
a1 , a3 be non-coplanar points in R3 . Denote by T the stretch
a2 and
with scale factor ki along the line a0 +
ai − a0 for i = 1, 2, 3, where
k1 , k2 and k3 are all nonzero and different from 1. Then
[T (
x )]B = [
x ]B [T ]B , where
k1 0 0 k1 0 0 1 0 0 1 0 0
[T ]B = 0 k2 0 = 0 1 0 0 k2 0 0 1 0 .
0 0 k3 0 0 1 0 0 1 0 0 k3
2. In the natural affine basis N = { 0 ,
e1 , e3 },
e2 ,
a1 −
a0
T (
x) =
a0 + ( a0 )P −1 [T ]B P,
x −
where P = a2 −
a0 .
a3 −
a0
T ( x)
x
a3
a2
a1
a0
Fig. 3.66
X′
Σ
X
Fig. 3.67
a. In B,
1 0 0
[T (
x )]B = [
x ]B [T ]B , where [T ]B = k 1 0 .
0 0 1
b. In N = { 0 ,
e1 , e3 },
e2 ,
a1 −
a0
T (
x) =
a0 + ( a0 )P −1 [T ]B P,
x − where P = a0 .
a2 −
a3 − a0
For skew shearing, refer to Ex. <A> 17 of Sec. 2.8.2 and Ex. <B> 1.
To see if an affine transformation T ( x) = x0 + x A is a shearing, try
the following steps (refer to (2.8.33)):
1. Compute the eigenvalues of A. In case A has only one eigenvalue 1 with
multiplicity 3 and (A − I3 )2 = O, A is not diagonalizable (see (2) in
x (I3 − A) =
(3.7.50)) and then, the associated T is a shearing if x0 has
a solution.
2. The eigenspace
E = Ker(A − I3 )
has dimension equal to 2. Determine eigenvectors v1 and v3 so that
| v1 | = | v3 | = 1 and v1 ⊥ v3 in order to form an orthonormal basis for E.
614 The Three-Dimensional Real Vector Space R3
(3.8.22)
The details are left as Ex. <A> 12.
√
Solution Since a1 −
a0 = (1, −1, −1) has length 3, take the unit vector
v1 = √13 (1, −1, −1). Since
a2 −
a0 = (−1, −2, 1) happens to be perpendic-
ular to a1 − a0 , i.e.
1
(a2 − a1 −
a0 )( a0 )∗ = (−1 − 2 1) −1 = −1 + 2 − 1 = 0,
−1
√
so we can choose v2 to be equal to a2 −
a0 dividing by its length 6, i.e.
v2 = √16 (−1, −2, 1). Then, choose a vector
v3 = (α1 , α2 , α3 ) of unit length
so that
v3 ⊥
v1 and v3 ⊥
v2
⇒ α1 − α2 − α3 = 0 and − α1 − 2α2 + α3 = 0
⇒ α1 = α3 and α2 = 0
1
⇒v3 = √ (1, 0, 1).
2
In the orthonormal affine basis B = { a0 ,
a0 +
v1 ,
a0 +
v2 , v3 }, the
a0 +
required shearing T has the representation
1 0 0
[T ( x ]B [T ]B , where [T ]B = k 1 0 , simply denoted as A.
x )]B = [
0 0 1
While in N = { 0 ,
e1 , e3 },
e2 ,
T (
x) =
a0 + ( a0 )P −1 AP,
x −
3.8 Affine Transformations 615
where
1
√ − √13 − √13
v1
3
P = v2 = − 6
√1
− √26 √1 with P −1 = P ∗ .
6
v3 √1 0 √1
2 2
By computation,
1 − 3√
k
2
k
√
3 2
√k
3 2
P −1 AP = −2k
√
3 2 1 + 32k
√
2
2k
√
3 2
, and
−k
k
√
3 2
√
3 2
1 − 3√k
2
k
x0 = a0 P −1 AP =
a0 − a0 P −1 (I3 − A)P = √ (1, −1, −1)
2
−1
⇒ T ( x ) = x0 + x P AP.
e3
a0 + v2
e2
0
x
e1
T ( x) a0 + v3 a0
plane of
a0 + v1
invariant points
Fig. 3.68
√
For simplicity, take k = 2 and consider the converse problem. Let
T (
x) =
x0 +
x B, x0 = (1, −1, −1) and
where
2 1 1
3 3 3
2 2
B = P −1 AP = − 3 5
3 3
.
1
3 − 13 2
3
⇒ x1 + 2x2 − x3 = 0.
Take v2 = √16 (1, 2, −1). Notice that this v2 is not an eigenvector asso-
ciated to 1 since dim Ker(B − I3 ) = 2. By computation,
2 1 1
3 3 3
1 2 1
v2 B = √ (1, 2, −1) − 23 5
3 3
= √ (−1, 4, 1)
6 6
1
3 − 13 2
3
2 √
⇒
v2 B −
v2 = √ (−1, 1, 1) = − 2
v1 .
6
√
Therefore, − 2 is the coefficient.
3.8 Affine Transformations 617
Hence, in C = {
a0 ,
a0 +
v1 ,
a0 +
v2 , v3 },
a0 +
1
√ 0 0
[T (
x )]C = [
x ]C [T ]C , where [T ]C = − 2 1 0 .
0 0 1
x = ( x0 )B −1 ,
y − x0 = (1, −1, −1) and
where y = T (
x ) and
4 −1 −1
1
B −1 = P −1 A−1 P = 2 1 −2 .
3
−1 1 4
For Q1
1
x1 + x2 = x 1 = 0 if x = (x1 , x2 , x3 )
0
4 −1 −1 1
1
⇒ [(y1 , y2 , y3 ) − (1, −1, −1)] · 2 1 −2 1 = 0
3
−1 1 4 0
⇒ (replace y1 , y2 , y3 by x1 , x2 , x3 ) x1 + x2 = 0.
Hence, x1 + x2 = 0 and its image plane are coincident. This is within our
expectation because the normal vector (1, 1, 0) of x1 + x2 = 0 is perpen-
v1 = √13 (1, −1, −1), the direction of the shearing T (see 1 in
dicular to
(3.8.20)).
618 The Three-Dimensional Real Vector Space R3
1
x1 + 2x2 − x3 = x 2 = c
−1
4 −1 −1 1
1
⇒ [(y1 , y2 , y3 ) − (1, −1, −1)] · 2 1 −2 2 = c
3
−1 1 4 −1
⇒ y1 + 2y2 − y3 = c,
For Q2 By computation,
1 5 5 4 2 2
T ( b0 ) = (1, −1, −1) + (1, 1, 1)B = (1, −1, −1) + , , = , , ,
3 3 3 3 3 3
T (b1 ) = (1, −1, −1) + (−1, 1, 1)B = (1, −1, −1) + (−1, 1, 1) = (0, 0, 0),
T (b2 ) = (1, −1, −1) + (1, −1, 1)B
5 5 1 8 8 2
= (1, −1, −1) + ,− , = ,− ,− ,
3 3 3 3 3 3
T (b3 ) = (1, −1, −1) + (1, 1, −1)B
1 7 1 2 4 2
= (1, −1, −1) + − , , = , ,− .
3 3 3 3 3 3
These four image points form a tetrahedron T ( b0 ) · · · T ( b3 ). Thus
b − b0
1 1
the signed volume of b0 b1 b2 b3 = b2 − b0
6
b3 − b0
−2 0 0
1 4
= 0 −2 0 = − ,
6 3
0 0 −2
3.8 Affine Transformations 619
T ( b ) − T ( b0 )
1 1
the signed volume of T ( b0 ) · · · T ( b3 ) = T ( b2 ) − T ( b0 )
6
T ( b3 ) − T ( b0 )
4
− − 23 − 23
1 3
= 34 − 10 − 43
6 3
− 2 2
− 43
3 3
1 8 4
= · − · 27 = − .
6 27 3
So both have the same volumes. 2
X′
A
Σ
O
X
axis
Fig. 3.69
The rotation
Let
a0 ,
a1 , a3 be non-coplanar points in R3 so that
a2 and
a1 −
1. | a0 | = |
a2 −
a0 | = |
a3 −
a0 | = 1, and
2. ( a1 − a0 )⊥( a2 − a0 )⊥( a3 −
a0 ).
a0 +
Let T denote the rotation of the space with a1 −
a0 as axis and
through the angle θ. Then
Also,
(a) a rotation not only preserves all the properties 1–7 listed in (3.7.15),
(b) but also preserves length, angle, area, volume and orientation.
(3.8.24)
v1 to an orthonormal basis B = {
3. Extend v1 , v3 } for R3 . Then
v2 ,
1 0
[A]B = ,
0 B
Solution The three row (or column) vectors of A are of unit length and
are perpendicular to each other. So A∗ = A−1 and A is orthogonal. Direct
computation shows that
1 1
det A = (4 + 4 + 4 + 8 + 8 − 1) = · 27 = 1.
27 27
By computation,
√
1 2 5
v2 A = √ (4, −2, 5) = v2 +
v3 ,
3 5 3 3
√
1 5 2
v3 A = (2, −1, −2) = −
v2 + v3
3 3 3
1 0 0 √1 √2 0
√
5 5 5
⇒ [A]B = P AP −1 = − √2 0
2
0 3 , where P = √1
√
3 5 5 .
0 − 35 2
3 0 0 1
e1
0
Fig. 3.70
Q1 What is the image of the unit sphere x21 + x22 + x23 = 1 under
y =
x A?
2 2
Q2 What is the image of the cylinder x1 + x2 = 1 under y = x A?
Q3 What is the equation of the image cylinder (*) in the orthonormal affine
basis C = { 0 , A1∗ , A2∗ , A3∗ } where Ai∗ denotes the ith row vector of A for
i = 1, 2, 3?
624 The Three-Dimensional Real Vector Space R3
x →
x −
a0 + ( a0 )Aproj = Pproj (
x ).
where
∗ ∗ −1 a1 −
a0
Aproj = A (AA ) A and A =
a2 −
a0 2×3
a2
e3 PProj ( x ) a0 + 〈〈 a1 − a0,a2 − a0 〉〉
a0 a1
T ( x)
e1
0
e2
Fig. 3.71
(3) Let B = { a0 ,
a1 , a3 } be an orthonormal affine basis for R3 so that
a2 ,
S is a0 + a1 − a0 , a2 −
a0 . Let T be the orthogonal reflection defined
as in (2). Then,
1. In B,
1 0 0
[T (
x )]B = [
x ]B [T ]B , where [T ]B = 0 1 0 .
0 0 −1
2. In N = { 0 ,
e1 , e3 },
e2 ,
a1 −
a0
T (
x) =
a0 + ( a0 )P −1 [T ]B P,
x − where P = a0 .
a2 −
a3 − a0
1 0 0
[A]B = 0 cos θ sin θ for some θ ∈ R.
0 sin θ − cos θ
(b) Choose u 2 ∈ Ker(A − I3 ) which has dimension equal to 2 and
u 1,
u 3 ∈ Ker(A + I3 ) so that C = {
u 1, u 3 } is an orthonormal basis
u 2,
for R . Then
3
1 0 0
[A]C = 0 1 0 .
0 0 −1
3. Take any solution x (I3 − A) =
a0 of the equation x0 . Then,
(a) The direction of orthogonal reflection: a0 + v3 .
(b) The plane of invariant points: a0 + v2 , v3 =
a0 + Im(A − I3 ).
(3.8.27)
The details are left as Ex. <A> 21.
Determine these x0 so that each such T is an orthogonal reflection, and the
direction and the plane of invariant points.
3.8 Affine Transformations 627
1
√ √1 0
1 0 0 1
2 2
[A]C = QAQ−1 = 0 1 0 , where Q = 6 − 6
√ √1 √2 .
6
0 0 −1 √1 − √13 − √13
3
v1 A =
v1 ,
1 1 1 4 1 4
v2 A =√ − , , =− v2 + √ v3 ,
2 3 3 3 3 3 2
√
2 2 1 2 2 1
v3 A = ,− , = v2 + v3
3 3 3 3 3
628 The Three-Dimensional Real Vector Space R3
0 0 √1 √1 0
√ 2 2
− 13 2 2 1
⇒ [A]B = P AP −1 = 0 3 , where P = √2 − √12 0 .
√
2 2 1 0 0 1
0 3 3
x P −1 (P AP −1 )P
y =
⇒ [ x ]B (P AP −1 ),
y ]B = [ i.e. letting [
x ]B = (α1 , α2 , α3 ) and
[
y ]B = (β1 , β2 , β3 ),
β1 = α1 ,
−1 0 cos θ sin θ
(β2 , β3 ) = (α2 , α3 ) .
0 1 − sin θ cos θ
This means that, when the height α1 of the point x to the plane v3
v2 ,
being fixed, the orthogonal projection (α2 , α3 ) of x on v2 , v3 is subject
to a reflection with respect to the axis v3 to the point (−α2 , α3 ) and
then is rotated through the angle θ in the plane v3 with 0 as center
v2 ,
to the point (β2 , β3 ). Therefore, the point (β1 , β2 , β3 ), where β1 = α1 , is
the coordinate vector of x A = T (x ) in B. See Fig. 3.72(b). Compare with
Figs. 2.109 and 2.110. 2
x x
T ( x)
v1
e2 u1 e1
T ( x) e2
e1
(0, 2, 3) (0, 2, 3)
e3 v3 = e3
u3 0 v2
0
〈〈v2, v3 〉〉⊥ v1
u2
〈〈 u1 ,u2 〉〉⊥ u3
(b)
(a)
Fig. 3.72
3.8 Affine Transformations 629
Exercises
<A>
13. Let
a0 = (4, 2, 1)and the vectors
x1 = 0, 35 , 45 , 25 , − 25 and
x2 = 35 , 16 12
x3 = 45 , − 12
9
25 , 25 .
for i = 1, 2, 3.
(c) Do problems (b), (c) and (d) as in Ex. 5.
14. Let
a0 = (2, 2, 1) and the vectors
y1 = (2, 1, 0),
y2 = (0, 1, 1) and
y3 = (2, 0, 2).
15. For each of the following, do the same problems as in Exs. 13 and 14.
(a)
a0 = (2, 5, 5) and the vectors
y1 = (1, 1, 0), y2 = (0, 1, 1),
y3 = (1, 0, 1).
(b)
a0 = (−1, −2, 1) and the vectors y1 = (1, 2, 1), y2 = (1, 0, 1),
y3 = (1, 0, 2).
(c)
a0 = (2, 5, 5) and the vectors
y1 = (1, 1, 0), y2 = (2, 0, 1),
y3 = (2, 2, 1).
16. Use
0 1 1
T (
x) =
x0 +
xA, where A = −2 3 2
1 −1 0
(1) Find the image plane of any plane parallel to the direction of the
shearing.
(2) Find the image plane of any plane parallel to the plane of invariant
points.
(3) Find the image of the tetrahedron ∆a0
a1 a2
a3 under the shearing
and compute its volume.
18. Use
√1 √1 0
2 2
− √1 √1 √2
T (
x) =
x0 +
x A, where A = 6 6 6
√1
3
− √13 √1
3
(1) Find the image of a line, parallel or skew to the direction of the
orthogonal reflection, under T .
(2) Find the image of a plane, containing or parallel to the direction
of reflection.
(3) Find the image of a plane, perpendicular to the direction of
reflection.
(4) Find the image of a plane, intersecting with the plane of invariant
points along a line.
(5) Find the image of a tetrahedron ∆ a0
a1
a2
a3 and compute its
volume.
(6) Find the images of |x1 | + |x2 | + |x3 | = 1 and x21 + x22 + x23 = 1.
23. Let
2 −2 −1
1
T (
x) =
x0 +
xA, where A = −2 −1 −2 .
3
−1 −2 2
Do the same problems as Ex. 22.
24. Let
1
0 3 0
1
A = 0 0 3
.
9 −9 3
(a) Express A as a product of a finite number of elementary matri-
ces, each representing either a reflection, a stretch or a shearing in
N = {0, e1 , e3 }.
e2 ,
(b) Show that A has only one eigenvalue 1 of multiplicity 3 and
(A − I3 )2 = O but (A − I3 )3 = O.
(1) Find a basis B = {
v1 , v3 } so that
v2 ,
1 0 0
[A]B = PAP −1 = 1 1 0
0 1 1
is the Jordan canonical form (see (3) in (3.7.50)).
(2) In B, A can be expressed as a product of shearing matrices, i.e.
1 0 0 1 0 0
[A]B = 1 1 0 0 1 0 .
0 0 1 0 1 1
Note that shearing here means skew shearing (refer to
Ex. <A> 17 of Sec. 2.8.2 and Ex. <B> 1).
3.8 Affine Transformations 635
(c) Try to find the image of the cube with vertices at (±1, ±1, ±1)
x →
under xA by the following methods.
(1) Direct computation.
(2) Factorizations in (b) and (a), respectively.
25. Let
−1 1 0
A = −4 3 0 .
1 0 2
(a) Do as (a) of Ex. 24.
(b) Show that A has eigenvalues 2 and 1, 1 but A is not diagonalizable.
(1) Find a basis B = {
v1 , v3 } so that
v2 ,
2 0 0
[A]B = P −1 AP = 0 1 0
0 1 1
is the Jordan canonical form.
(2) In B, show that A can be expressed as a product of a one-way
stretch and a shearing, i.e.
2 0 0 1 0 0
[A]B = 0 1 0 0 1 0 .
0 0 1 0 1 1
(c) Do as (c) of Ex. 24.
<B>
b
=
x0 + a A + (t1 , t2 ) 1 A
b2
=
y 0 + t1
c 1 + t2
c2 , (*)
where c1 =b1 A,
c2 =
b2 A and y0 = T (
a) =
x0 +
a A and
c1 and
c2 are
linearly independent vectors in R . The latter is easily seen by showing
3
c1 b
(t1 , t2 ) = (t1 , t2 ) 1 A = 0 for same scalar t1 , t2
c2 b2
b
⇔ (since A is invartible) (t1 , t2 ) 1 = 0
b2
⇔ (since b1 and b2 are linearly independent) t1 = t2 = 0.
4. T preserves relative positions of lines and planes.
Give two lines
L1 :
x =
a1 + t b1 ,
L2 :
x =
a2 + t b2 , t ∈ R.
They might be coincident, parallel, intersecting or skew to each other (see
(3.4.5)). We only prove the skew case as follows.
L1 and L2 are skew to each other, i.e. b1 , b2 and a2 −
a1 are linearly
independent.
⇔ (since A is invertible) b1 A, b2 A and (a2 − a1 )A = ( x0 + a2 A) −
( x0 + a1 A) are linear independent.
⇔ (see (∗ )) The image lines
T (L1 ):
x = (
x0 +
a1 A) + t b1 A,
T (L2 ):
x = (
x0 +
a2 A) + t b2 A, t∈R
are skew to each other.
This finishes the proof.
Give a line L and a plane Σ. They might be coincident, parallel or
intersecting at a point (see (3.5.5)). Obviously T will preserve these relative
positions. Details are left to the readers. The same is true for the relative
positions of two planes (see (3.5.6)).
3. T preserves tetrahedron and parallelepiped.
This is an easy consequence of 4 and 2.
7. T preserves the ratio of solid volumes.
638 The Three-Dimensional Real Vector Space R3
Let
a0 ,
a1 ,
a2 and a3 be four different points in space R3 . The set
3 /
a0 a1 a2 a3 = λi ai | 0 ≤ λi ≤ 1 for i = 0, 1, 2, 3
(3.8.28)
i=0
dependent. See Fig. 3.73 (see also Fig. 3.2). While the tetrahedron (see
Secs. 3.5 and 3.6)
3
∆a0
a1
a2
a3 = ai | 0 ≤ λi ≤ 1 for 0 ≤ i ≤ 3 and
λi
i=0
/
λ0 + λ1 + λ2 + λ3 = 1 (3.8.29)
is contained in the parallelepiped a0 a1 a2 a3 and has its volume one-sixth
of the latter. See Fig. 3.73 (also, see Fig. 3.17).
T ( a3 )
T ( a3 )
a3
T
or
T (a1)
T ( a2 )
a2 T ( a0 )
T ( a0 ) T ( a2 )
a0 a1 T ( a1 )
det A > 0 det A < 0
Fig. 3.73
= (the signed volume of a0 a1 a2 a3 )detA. (3.8.30)
From this it follows immediately (refer to (2.8.44))
The geometric interpretation of the determinant
of a linear operator or a square matrix
3.8 Affine Transformations 639
Let f (
x) = xA: R3 → R3 be a linear operator and T ( x) = x0 +
xA the
associated affine transformation (A is allowed to be non-invertible here, see
(2.8.20)) in N = { e1 , e3 }, where A = [aij ]3×3 is a real matrix. Then
e2 ,
the determinant
a11 a12 a13
det f = det A = a21 a22 a23
a a32 a33
31
= the signed volume of the parallelepiped a0 a1 a2 a3 where
ai
= f (
ei )
= (ai1 , ai2 , ai3 ), the ith row of A, i = 1, 2, 3.
the signed volume of a0 a1 a2 a3
= .
the volume of the unit cube o e1 e2 e3
Therefore, for any such affine transformation T (
x) =
x0 +
x A,
the signed volume of the image domain T (Ω)
1. the signed volume of measurable space domain Ω = det A, in particular, if Ω is
a tetrahedron or a parallelepiped.
2. T preserves orientation ⇔ det A > 0; and reverses the orientation ⇔
det A < 0.
3. The image tetrahedron or parallelepiped is degenerated ⇔ det A = 0.
(3.8.31)
Exercises
<A>
1. Prove (3.7.15) in detail by using the following two methods.
(1) Complete the proof shown in the content.
(2) Use matrix factorizations of A, see Secs. 3.7.5 and 3.8.2.
Then, give concrete numerical examples to show that a–d are not nec-
essarily true in general.
<B>
Try to formulate a result for R4 similar to (3.7.15) and prove it.
<C> Abstraction and generalization
Read Ex. <C> of Sec. 2.8.3.
640 The Three-Dimensional Real Vector Space R3
But it does not deal with such objects as lengths, angles, areas and volumes
which are in the realm of spatial Euclidean geometry (refer to Part 2).
In what follows we will introduce a sketch of affine geometry for R3 to
the content and by the method that are universally true for more general
affine space over a field or ordered field.
are linearly independent in the vector space R3 (or V ) and affinely depen-
dent otherwise. See Ex. <A> 1.
For R3 , an affinely independent set B = { a0 ,
a1 , a3 } is called an
a2 ,
affine basis or affine frame with the point a0 as the base point or origin and
ai −
a0 the ith unit vector or coordinate vector.
x0 + S = {
Sk = v |
x0 + v ∈ S} (3.8.33)
a point: S 0 = {
x0 },
a line: S 1 ,
a plane or hyperplane: S 2 ,
the space: S 3 = R3 .
For any basis { vk } for S, the set {
v1 , . . . , x0 ,
x0 +
v1 , . . . , vk } is
x0 +
an affine basis for S k and can be extended to an affine basis for R3 .
p1 − p0 B
1 ..
V (
p0 ,
p1 , . . . ,
pk ) = det .
k!
pk − p0 B k×k
α01 α02 · · · α0k 1
1 α11 α12 · · · α1k 1
= . .. .. .. (3.8.35)
k! .. . . .
α α ··· α 1
k1 k2 kk
p0 in R3 , the vector
⇔ for any fixed point x −
p0
k
x −
can be expressed as p0 = ai −
λi ( p0 ). (3.8.36)
i=0
k
Then the ordered scalars λ0 , λ1 , . . . , λk with i=1 λi = 1 is called a barycen-
tric coordinate of
x , in the affine subspace a0 + a1 − ak −
a0 , . . . , a0 ,
with respect to { a0 , a1 , . . . , ak }. In particular, if B = { a0 , a1 , . . . , ak } is
a0
a3
a1
a2
Fig. 3.74
Let B = { a0 , ak } be affinely independent in
a1 , . . . , R3 and x, b0 ,
b1 , . . . , br be points in the subspace a0 +
a1 −
a0 , . . . ,
ak −a0 so that
C = { b0 , b1 , . . . , br } is affinely independent. Denote
[
x ]B = (x1 , . . . , xk ), and (
x )C = (λ0 , λ1 , . . . , λr );
[ bi ]B = (αi1 , . . . , αik ) for 0 ≤ i ≤ r.
Then,
x ∈ b0 + b1 − b0 , . . . , br − b0
α01 α02 ··· α0k
α11 α12 ··· α1k
⇔ (x1 · · · xk ) = (λ0 λ1 · · · λr ) . .. .. or, in short,
.. . .
αr1 αr2 ··· αrk (r+1)×k
b0
B
b1 B
[ x )C
x ]B = ( ..
. (3.8.39)
.
bk B (r+1)×k
Sr =
y 0 + S1 and S k =
x0 + S2
Sr Sk. (3.8.46)
S1p S2p .
In case S1 ∩ S2 = { 0 }: then the condition y0 ∈ S1 ⊕ S2 implies
x0 −
that S1 intersects S2 at a point, while x0 − y0 ∈
1 1
/ S1 ⊕ S2 implies that S11 is
skew to S21 . 2
Solution
(a) The answer is like Example 1. Why? Prove it.
3.8 Affine Transformations 647
Since x0 − y0 ∈
/ R4 does not happen in this case, so S12 and S22 are
never skew to each other.
(c) Let S 2 = x 0 + S1 and S 3 = y 0 + S2 where dim S1 = 2, dim S2 = 3.
Since
dim S1 ∩ S2 = dim S1 + dim S2 − dim(S1 + S2 )
= 2 + 3 − dim(S1 + S2 ) = 5 − dim(S1 + S2 )
and 3 ≤ dim(S1 + S2 ) ≤ 4, therefore dim S1 ∩ S2 could be 1 or 2 only.
In case dim S1 ∩ S2 = 1: then dim(S1 + S2 ) = 4. For any two points
x 0 , y 0 in R4 ,
x0 − y 0 ∈ S1 + S2 = R4 always hold. Then S 2 and S 3
will intersect along the line x 0 + S1 ∩ S2 if
y0 =
x 0.
In case dim S1 ∩ S2 = 2: then S1 ⊆ S2 and S1 + S2 = S2 holds. For
any two points y 0 ∈ R4 , either
x 0,
x0 −
y 0 ∈ S1 + S2 ⇒ S 2 and S 3 are coincident because S 2 ⊆ S 3 ,
or
x0 −
y0 ∈
/ S1 + S2 = S2 ⇒ S 2 || S 3 , parallel to each other.
Remark In R4 .
S11 and S22 can be only coincident, parallel or intersecting at a point.
648 The Three-Dimensional Real Vector Space R3
How about S 1 =
x 0 + S1 and S 3 =
y 0 + S2 ? Since
or
x0 −
y0 ∈
/ S1 + S2 ⇒ S 1 || S 3 .
Then,
b 0, b 1, b 2, . . . , b k are affinely dependent.
⇔ α0 α1 α2 · · · αk = (−1)k+1 . (3.8.47)
a0 a0 a0
b0
b0 b3 b0
a3
a3 a3
a1 a1 a1
b3
b1 b2 b1 b2 b1 b2
a2 a2 a2
(a) (b) (c) b3 = ∞
Fig. 3.75
In case Fig. 3.75(c) happens, then there exist scalars t0 , t1 , t2 , not all
zeros, so that
a3 =
a 0 + t0 b 0 + t1 b 1 + t 2 b 2 , where t0 + t1 + t2 = 0
i.e. the vector a 3 −
a0 is parallel to the subspace b 0 + b 1 − b 0 , b 2 − b 0 .
Substitute (∗ 1) into the right of the above relation, and get
α0 t0 t0 t1 α1
a3 = 1 + a0 + + a1
1 + α0 1 + α0 1 + α1
t1 t2 α2 t2
+ + a2 + a3
1 + α1 1 + α2 1 + α2
α 0 t0 t0 t1 α1
⇒ 1+ = 0, + = 0,
1 + α0 1 + α0 1 + α1
t1 t2 α2 t2
+ = 0 and =1
1 + α1 1 + α2 1 + α2
⇒ (since t0 + t1 + t2 = 0) α0 α1 α2 = −1.
By imagination, the extended line a 0a 3 will intersect the subspace
b0 + b 1 − b 0 , b 2 − b 0 at a point b 3 = ∞ at infinity (see Sec. 3.8.4.10
below) and in this situation, α3 should be considered as −1. Therefore,
α0 α1 α2 α3 = 1
still holds.
The sufficiency We are going to prove that

  b₀, b₁, b₂ and b₃ are affinely dependent
  ⇔ there exist scalars t₀, t₁, t₂, t₃, not all zero, so that
    t₀b₀ + t₁b₁ + t₂b₂ + t₃b₃ = 0, where t₀ + t₁ + t₂ + t₃ = 0.   (∗2)

Now, (∗1) and (∗2) imply that

  (α₀t₀/(1 + α₀) + t₃/(1 + α₃)) a₀ + (t₀/(1 + α₀) + α₁t₁/(1 + α₁)) a₁
  + (t₁/(1 + α₁) + α₂t₂/(1 + α₂)) a₂ + (t₂/(1 + α₂) + α₃t₃/(1 + α₃)) a₃ = 0.

Since α₀α₁α₂α₃ = 1, b₀, b₁, b₂ and b₃ are all distinct, and t₀, t₁, t₂ and t₃ can be so chosen, none equal to zero, that

  t₃/t₀ = −(1 + α₃)α₀/(1 + α₀),  t₀/t₁ = −(1 + α₀)α₁/(1 + α₁),
  t₁/t₂ = −(1 + α₁)α₂/(1 + α₂)  and  t₂/t₃ = −(1 + α₂)α₃/(1 + α₃).

In this case,

  t₀b₀ + t₁b₁ + t₂b₂ + t₃b₃ = 0
where a₋₁ = a_k, a_{k+1} = a₁ and a_{k+2} = a₂. Note that the case k = 2 is the Ceva theorem in (2.8.46). See Fig. 3.76.
Fig. 3.76 ((a) and (b); in (b), b₃ = ∞)
⇔ (1 − t₀)α₀/(1 + α₀) = t₁/(1 + α₃),  (1 − t₀)/(1 + α₀) = (1 − t₁)α₁/(1 + α₁),
  t₀α₂/(1 + α₂) = (1 − t₁)/(1 + α₁)  and  t₀/(1 + α₂) = t₁α₃/(1 + α₃).
⇔ (1 − t₁)/t₁ = α₂α₃(1 + α₁)/(1 + α₃)  and  t₁/(1 − t₁) = α₀α₁(1 + α₃)/(1 + α₁).
⇔ α₀α₁α₂α₃ = 1. □
and are called the open half-spaces of S^k divided by S^{k−1}. Both S₊^k ∪ S^{k−1} and S₋^k ∪ S^{k−1} are called the closed half-spaces. For a point x of S^k that does not lie on S^{k−1}, the half-space containing x is called the side of x with respect to S^{k−1}. Let p and q be two points on a line S¹. The closed side of q with respect to p is called the (closed) half line or ray from p to q and is denoted by p⃗q. The set

  pq = p⃗q ∩ q⃗p = {x = (1 − t)p + tq | 0 ≤ t ≤ 1}   (3.8.50)

is called the segment joining p and q. Note that pq = qp. See Fig. 3.77.
Fig. 3.77
In particular, in case D = {a₀, a₁, ..., a_k} is a finite subset of R³ (or V), then

  Con(a₀, a₁, ..., a_k)   (3.8.52)

is called a convex cell with its dimension the dimension of the affine subspace a₀ + ⟨a₁ − a₀, ..., a_k − a₀⟩.
In case {a₀, a₁, ..., a_k} is affinely independent,

  Con(a₀, a₁, ..., a_k) = Δa₀a₁ ⋯ a_k.   (3.8.53)
Let S_{i+}^k and S_{i−}^k be the open half-spaces of S^k divided by the face S_i^{k−1}, the affine subspace generated by the vertices other than aᵢ, with S_{i+}^k the side of aᵢ. Then the corresponding closed half-spaces are

  S̄_{i+}^k = S_{i+}^k ∪ S_i^{k−1}  and  S̄_{i−}^k = S_{i−}^k ∪ S_i^{k−1}.

Hence

  Δa₀a₁ ⋯ a_k = ∩_{i=0}^{k} S̄_{i+}^k,  and
  Int Δa₀a₁ ⋯ a_k = ∩_{i=0}^{k} S_{i+}^k.   (3.8.54)
Fig. 3.78 (the polyhedron bounded by the lines 2x₁ + x₂ = 50 and x₁ + 2x₂ = 70 and the coordinate axes; intercepts (25, 0) and (0, 35))
In R², the inequalities

  2x₁ + x₂ ≤ 50,  x₁ + 2x₂ ≤ 70,  x₁ ≥ 0,  x₂ ≥ 0

all together represent the shaded point set (a polyhedron) shown in Fig. 3.78. A general form of linear programming problems is to find values of x₀, x₁, ..., x_k that will maximize or minimize the function

  f(x₀, x₁, ..., x_k) = ∑_{i=0}^{k} αᵢxᵢ
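For instance, the polyhedron of Fig. 3.78 can be fed to a standard solver. The following sketch uses scipy; the objective f = 3x₁ + 2x₂ is our own illustrative choice, since the text fixes only the constraints:

    from scipy.optimize import linprog

    # maximize f = 3*x1 + 2*x2 over the polyhedron of Fig. 3.78:
    #   2*x1 + x2 <= 50,  x1 + 2*x2 <= 70,  x1 >= 0,  x2 >= 0.
    res = linprog(c=[-3, -2],                # negate c to maximize
                  A_ub=[[2, 1], [1, 2]],
                  b_ub=[50, 70],
                  bounds=[(0, None), (0, None)])
    print(res.x, -res.fun)                   # optimum at the vertex (10, 30)

As the theory predicts, the optimum is attained at a vertex of the polyhedron, here the intersection point of the two constraint lines.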
T(x) = x₀ + f(x) (regular), x ∈ Rⁿ
⇔ (in B) [T(x)]_B = [x₀]_B + [x]_B [f]_B, i.e. let

  [x]_B = (x₁, ..., xₙ), which means x − a₀ = ∑_{i=1}^{n} xᵢ(aᵢ − a₀),
  [x₀]_B = (b₁, ..., bₙ),
  [f]_B = [aᵢⱼ]ₙₓₙ = [[f(a₁ − a₀)]_B; ⋯; [f(aₙ − a₀)]_B],  and
  [T(x)]_B = (y₁, ..., yₙ),

then

  yⱼ = bⱼ + ∑_{i=1}^{n} aᵢⱼxᵢ  for 1 ≤ j ≤ n.   (3.8.56)
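In coordinates, (3.8.56) is just a matrix multiplication plus a translation. A minimal numerical sketch (the matrices are our own illustrative data, written in the text's row-vector convention):

    import numpy as np

    # y = b + x A realizes (3.8.56): translation b plus invertible linear part A.
    A = np.array([[2.0, 1.0], [0.0, 1.0]])   # the linear part [f]_B, invertible
    b = np.array([5.0, -1.0])                # the translation [x0]_B

    def T(x):
        return b + x @ A                      # row vectors, as in the text

    x = np.array([1.0, 2.0])
    print(T(x))                               # [7. 2.]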
1. Ta(n; R), as a vector group (i.e. the additive group of a vector space), acts transitively on Rⁿ.
2. Ta(n; R), as an additive group, is isomorphic to Rⁿ.
Fig. 3.79 (S∞⁰ and S¹)

Fig. 3.80 (S∞¹)
occasions. Define

  an ideal point or point at infinity: (x₁, x₂, x₃, 0),
  an ordinary or affine point: (x₁, x₂, x₃, x₄) with x₄ ≠ 0, and
  the affine or inhomogeneous coordinate of an affine point:
    (x₁/x₄, x₂/x₄, x₃/x₄, 1)  or  (x₁/x₄, x₂/x₄, x₃/x₄).

Therefore, in (3.8.59),

  the affine subspace R³ = {(x₁, x₂, x₃, x₄) | x₄ ≠ 0},
  the hyperplane at infinity S∞² or π∞: x₄ = 0, and
  [x]_Ñ = [x]_N = (x₁, x₂, x₃, x₄) if x = ∑_{i=1}^{4} xᵢeᵢ   (3.8.61)

  yⱼ = σ ∑_{i=1}^{4} aᵢⱼxᵢ,  σ ≠ 0, 1 ≤ j ≤ 4,

  x₄ = 0 if and only if y₄ = 0
  ⇔ a₁₄x₁ + a₂₄x₂ + a₃₄x₃ = 0 for all x₁, x₂, x₃ ∈ R
  ⇔ a₁₄ = a₂₄ = a₃₄ = 0.

Taking σ = 1/a₄₄ (a₄₄ ≠ 0, why?), the transformation reduces to

  yⱼ = (1/a₄₄) ∑_{i=1}^{4} aᵢⱼxᵢ  for 1 ≤ j ≤ 3,
  y₄ = x₄,

which, in terms of inhomogeneous coordinates (i.e. replacing xᵢ by xᵢ/x₄, etc.), can be rewritten as

  yⱼ = a₄ⱼ/a₄₄ + ∑_{i=1}^{3} (aᵢⱼ/a₄₄) xᵢ  for 1 ≤ j ≤ 3.
(1) The set of all projective transformations on P 3 (R) that leave π∞ invari-
ant forms a subgroup of Gp (3; R) and is isomorphic to Ga (3; R), the
group of affine transformations on R3 .
(2) The set of all projective transformations on P 3 (R) that leave each point
at infinity invariant forms a subgroup of Gp (3; R) and is isomorphic to
Ta(3; R), the group of translations on R³ (see (3.8.57)).   (3.8.63)
For a detailed account of the projective line, plane and space, refer to [6, pp. 1–218].
Exercises
<A>
5. Prove (3.8.44).
6. Prove (3.8.45).
7. In R³, determine the relative positions of
(1) three lines S₁¹, S₂¹ and S₃¹,
(2) three planes S₁², S₂² and S₃²,
(3) two lines S₁¹, S₂¹ and one plane S², and
(4) one line S¹ and two planes S₁², S₂².
8. In R5 , determine the relative positions of
(1) two lines S11 , S21 ,
(2) two planes S12 , S22 ,
(3) one line S 1 and one plane S 2 ,
(4) S 1 and S 3 , S 1 and S 4 ,
(5) S 2 and S 3 , S 2 and S 4 ,
(6) S13 and S23 , S 3 and S 4 , and
(7) S14 and S24 .
9. Prove (3.8.47) in R5 for k = 4.
10. Prove (3.8.48) in R5 for k = 4.
11. Fix two scalars ai , bi ∈ R (or an ordered field) with ai < bi for 1 ≤ i ≤ 3.
The set
{(x1 , x2 , x3 ) ∈ R3 | ai ≤ xi ≤ bi for 1 ≤ i ≤ 3}
P¹(R) or P²(R)). Note that they are collinear. The cross ratio of A₁, A₂, A₃ and A₄, in this ordering, is defined and denoted by

  (A₁, A₂; A₃, A₄) = (α₁/ε₁) : (α₂/ε₂) = ε₂α₁/(ε₁α₂).
and

  exterior Ext Δ = {∑_{j=0}^{k} λⱼaⱼ | at least one of λ₀, λ₁, ..., λ_k is less than zero and ∑_{j=0}^{k} λⱼ = 1}.
(a) In case k = n, show that ∂∆, Int ∆ and Ext ∆ are pairwise disjoint
and
Rn = Int ∆ ∪ ∂∆ ∪ Ext ∆.
(b) In case k = n, show that
(1) Int ∆ is a convex set,
(2) any two points in Ext ∆ can be connected by a polygonal
curve, composed of line segments, which is contained entirely
in Ext ∆, and
Fig. 3.81 (in R²)

Fig. 3.82 (in R³)
  Con(x₀; K) = {(1 − t)x₀ + tx | 0 ≤ t ≤ 1 and x ∈ K}

is called a cone with K as base and x₀ as the vertex. See Fig. 3.83. Show that Con(x₀; K) is a convex set.
Fig. 3.83 (the cases x₀ ∉ S^k and x₀ ∈ S^k)
  A = Con(a₀, a₁, ..., a_{n−1}, aₙ),
  B = Con(a₀, a₁, ..., a_{n−1}, bₙ).

(a) Suppose aₙ and bₙ lie on the same side of S^{n−1} (see (3.8.49)). Show that

  Int A ∩ Int B ≠ φ.

(b) Suppose aₙ and bₙ lie on the opposite sides of S^{n−1}. Show that

  A ∩ B = Con(a₀, a₁, ..., a_{n−1}),

and A and B have the opposite orientations, i.e.

  det[a₁ − a₀; ⋯; a_{n−1} − a₀; aₙ − a₀] · det[a₁ − a₀; ⋯; a_{n−1} − a₀; bₙ − a₀] < 0.

See Fig. 3.84(b).
Fig. 3.84 ((a) and (b))
4. Let Δa₀a₁ ⋯ a_k be a k-simplex in Rⁿ where k ≥ 2 and let x be its barycenter. Suppose xᵢ is the barycenter of its ith face Δa₀a₁ ⋯ a_{i−1}a_{i+1} ⋯ a_k for 0 ≤ i ≤ k.
(a) Show that the line segments aᵢxᵢ, 0 ≤ i ≤ k, meet at x.
(b) Show that xᵢx = (1/(k+1)) xᵢaᵢ (in signed length).
See Fig. 3.85.
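A quick numerical check of (b) for k = 3 (a sketch with numpy, not part of the exercise): the barycenter divides each segment xᵢaᵢ in the ratio 1 : k.

    import numpy as np

    # Ex. 4 for a 3-simplex: x lies on each segment a_i x_i and
    # |x_i x| = |x_i a_i| / (k+1).
    a = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.], [0., 0., 1.]])
    k = 3
    x = a.mean(axis=0)                              # barycenter of the simplex
    for i in range(k + 1):
        xi = np.delete(a, i, axis=0).mean(axis=0)   # barycenter of the ith face
        t = np.linalg.norm(x - xi) / np.linalg.norm(a[i] - xi)
        print(i, t)                                 # 1/(k+1) = 0.25 each time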
5. Let x be the barycenter of the k-simplex Δa₀a₁ ⋯ a_k in Rⁿ where k ≥ 2. Construct k-simplexes as follows:

  A₀ = Δx a₁ ⋯ a_k,
  Aᵢ = Δa₀a₁ ⋯ a_{i−1} x a_{i+1} ⋯ a_k  for 1 ≤ i ≤ k.
(a) If i ≠ j, show that Aᵢ and Aⱼ have at most one face in common but do not have common interior points.
Fig. 3.85 ((a) k = 2; (b) k = 3)
(b) Show that Δa₀a₁ ⋯ a_k = A₀ ∪ A₁ ∪ ⋯ ∪ A_k.
Fig. 3.86 ((a) k = 2; (b) k = 3)
  T(x) = a₀,
  T(x_{i₁⋯i_k}) = a_k  for 1 ≤ k ≤ n.
Let det f denote the determinant of the linear part f of T⁻¹. Show that

  det f = (−1)^{σ(i₀,i₁,...,iₙ)} / (n + 1)!,

where (i₀, i₁, ..., iₙ) is a permutation of 0, 1, 2, ..., n, defined as k → i_k for 1 ≤ k ≤ n and 0 → i₀ (i₀ is different from i₁, ..., iₙ); (−1)^{σ(i₀,i₁,...,iₙ)} = 1 or −1 according as (i₀, i₁, ..., iₙ) is an even or odd permutation.
7. Let a₁, ..., a_k be linearly independent in Rⁿ. Fix a point a₀ in Rⁿ. Define, for given positive scalars c₁, ..., c_k,

  P = {a₀ + ∑_{i=1}^{k} λᵢaᵢ | |λᵢ| ≤ cᵢ for 1 ≤ i ≤ k}.

There are 2^k vertices and 2k faces. The set of all its faces constitutes its boundary ∂P. Furthermore, the set

  Int P = {x ∈ P | |λⱼ| < cⱼ for 1 ≤ j ≤ k}

is called the interior of P and the set Ext P = Rⁿ − P is called the exterior of P. See Fig. 3.87.
Fig. 3.87 (the cases k = 1, 2, 3)
Use P to replace Δa₀a₁ ⋯ aₙ in Ex. 1 to prove (a)–(c) there.
8. Generalized Euler formula: V − E + F = 2.
(a) Let Δa₀a₁ ⋯ a_k be a k-simplex in Rⁿ. Any s + 1 distinct points out of a₀, a₁, ..., a_k can be used to construct an s-simplex. The total number of them is α_s = C^{k+1}_{s+1} for 0 ≤ s ≤ k. Show that

  α₀ − α₁ + α₂ − ⋯ + (−1)^{k−1}α_{k−1} + (−1)^k α_k = 1.
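The identity can be checked at once by machine (a sketch, not part of the exercise), since α_s = C(k+1, s+1) and the binomial theorem forces the alternating sum to collapse to 1:

    from math import comb

    # alpha_s = C(k+1, s+1) counts the s-simplexes spanned by s+1 of the
    # k+1 vertices; the alternating sum equals 1 for every k (Ex. 8(a)).
    for k in range(1, 10):
        alt = sum((-1) ** s * comb(k + 1, s + 1) for s in range(k + 1))
        assert alt == 1
    print("alternating sum is 1 for k = 1,...,9")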
<D> Application
1. Let Δa₀a₁ ⋯ aₙ be an n-simplex in Rⁿ and α₀, α₁, ..., αₙ be fixed real numbers. Define a function f: Δa₀a₁ ⋯ aₙ → R by

  f(∑_{i=0}^{n} λᵢaᵢ) = ∑_{i=0}^{n} λᵢαᵢ,
3.8.5 Quadrics
Here in this subsection, R³ is considered as an affine space in general, and as a vector space in particular. In some cases, R³ as a Euclidean space (see Part 2 and Chap. 5) is implicitly understood. N = {0, e₁, e₂, e₃} always represents the natural affine basis for R³, and x = [x]_N = (x₁, x₂, x₃) is used.
8. Hyperbolic paraboloid (Fig. 3.93): x₁²/a₁² − x₂²/a₂² + 2ax₃ = 0.
9. Elliptic cylinder (Fig. 3.94): x₁²/a₁² + x₂²/a₂² − 1 = 0.
10. Imaginary elliptic cylinder: x₁²/a₁² + x₂²/a₂² + 1 = 0.
11. Imaginary intersecting planes: x₁²/a₁² + x₂²/a₂² = 0.
12. Hyperbolic cylinder (Fig. 3.95): x₁²/a₁² − x₂²/a₂² = 1.
13. Intersecting planes (Fig. 3.96): x₁²/a₁² − x₂²/a₂² = 0.
14. Parabolic cylinder (Fig. 3.97): x₁² + 2ax₂ = 0.
15. Parallel planes (Fig. 3.98): x₁² − a² = 0.
16. Imaginary parallel planes: x₁² + a² = 0.
17. Coincident planes (Fig. 3.99): x₁² = 0.
Figs. 3.88–3.97 (coordinate sketches, in the natural basis {e₁, e₂, e₃}, of the quadrics listed above)
a translation will reduce it to the form

  λ₁z₁² + c₂z₂ = 0
  ⇒ (replace z₁ and z₂ by x₁ and x₂, respectively)
  x₁² + 2ax₂ = 0,  where 2a = c₂/λ₁.
Fig. 3.98

Fig. 3.99
This is the standard form 14. Readers should practice all the other cases. Instead of affine transformations, in the Euclidean space R³ we use rigid motions to reduce the quadrics (3.8.64) to their standard forms listed in (3.8.66). For details, refer to Sec. 5.10.
  Type (II) (k, r − k): ∑_{i=1}^{k} xᵢ² − ∑_{i=k+1}^{r} xᵢ² + 1 = 0,   (3.8.67)

  Type (III) (k, r − k): ∑_{i=1}^{k} xᵢ² − ∑_{i=k+1}^{r} xᵢ² + 2x_{r+1} = 0,
Fig. 3.100 (the map γ, with γ(O) = x₀ and the partial derivatives ∂γ/∂u(u₀, v₀), ∂γ/∂v(u₀, v₀))
or (3.8.65) as

  ⟨x, xB̃⟩ = 0, with x = (x₁, x₂, x₃, x₄) and

  B̃ = [b₁₁ b₁₂ b₁₃ b₁; b₂₁ b₂₂ b₂₃ b₂; b₃₁ b₃₂ b₃₃ b₃; b₁ b₂ b₃ b]₄ₓ₄ = [B b*; b b].   (3.8.70)

Note that both (3.8.64) and (3.8.65) may be obtained by setting x₄ = 1 in (3.8.69) and (3.8.70) respectively. In this case, the affine coordinate (x₁, x₂, x₃, 1) is treated as (x₁, x₂, x₃) and is considered as a point in R³ and is still denoted by x. Meanwhile, it is convenient to write (3.8.64) as

  (x₁, x₂, x₃, 1) B̃ (x₁, x₂, x₃, 1)* = 0.   (3.8.71)
According to (1) in (3.8.63), an affine transformation T(y) = y₀ + yA on R³ can be written, in homogeneous coordinates, as

  x = yÃ, with y = (y₁, y₂, y₃, y₄), x = T(y) = (x₁, x₂, x₃, x₄),

  Ã = [A 0*; y₀ 1] = [a₁₁ a₁₂ a₁₃ 0; a₂₁ a₂₂ a₂₃ 0; a₃₁ a₃₂ a₃₃ 0; α₁ α₂ α₃ 1]₄ₓ₄,   (3.8.72)

where y₀ = (α₁, α₂, α₃) and A = [aᵢⱼ]₃ₓ₃ is invertible. Or, in the corresponding affine coordinates,

  (x₁, x₂, x₃, 1) = (y₁, y₂, y₃, 1)Ã,
  where y = (y₁, y₂, y₃) and x = T(y) = (x₁, x₂, x₃).   (3.8.73)

Note that à is invertible.
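A short numerical sketch of (3.8.72)–(3.8.73) (the matrices are our own sample data): the 4 × 4 block matrix à applied to the homogeneous row vector (y, 1) reproduces T(y) = y₀ + yA.

    import numpy as np

    A  = np.array([[1.0, 2.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [3.0, 0.0, 1.0]])          # invertible linear part
    y0 = np.array([4.0, 5.0, 6.0])            # translation

    At = np.block([[A, np.zeros((3, 1))],
                   [y0.reshape(1, 3), np.ones((1, 1))]])   # A~ of (3.8.72)

    y   = np.array([1.0, 1.0, 1.0])
    hom = np.append(y, 1.0) @ At              # homogeneous (x1, x2, x3, 1)
    print(hom[:3], y0 + y @ A)                # both rows print T(y)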
By the same process leading to (∗14) and beyond in Sec. 2.8.5, we have the following counterpart of (2.8.59):

  1. det B, and
  2. det [B b*; b b] = det B̃

are affine invariants. In case these two quantities are positive, the positiveness of tr B is also an affine invariant.   (3.8.74)

These three quantities are Euclidean invariants under the rigid motions on R³ (see Sec. 5.7).
We postpone the characterizations of quadrics by means of Euclidean
concepts to Sec. 5.10 (refer to (2.8.52) for quadratic curves), including there
many computational examples.
  ∑_{i=1}^{k} xᵢ² − ∑_{i=k+1}^{r} xᵢ² = 0,   (3.8.75)
Exercises
<A>
1. Prove (3.8.66) in detail.
2. Prove (3.8.67) in detail.
3. Prove (3.8.74) in detail.
4. Prove (3.8.75) and the statement in the last paragraph in detail.
APPENDIX A
Some Prerequisites
A.1 Sets
A set is a collection of objects, called members or elements of the set. When
a set is to be referred to more than once, it is convenient to label it, usually,
by a capital letter such as A, B, . . ..
There are two distinct ways to describe a set: by listing all its elements, or by stating a property that characterizes them. For example, the set of all even positive integers less than 10 is written as

  {2, 4, 6, 8} = {6, 2, 8, 4}  or  {x | x is an even positive integer less than 10}.

Note that each element of a set is not repeated within the set itself and the order in which the elements of a set are listed is immaterial.
Some definitions and notations are listed as follows:
A rule for determining whether, for each ordered pair (x, y) of elements of a set A, x stands in a given relationship to y is said to define a relation R on A. A relation R on a set A is called an equivalence relation if the following three conditions hold:

1. (Reflexivity) xRx (x is in relation R to itself).
2. (Symmetry) xRy ⇒ yRx.
3. (Transitivity) xRy and yRz ⇒ xRz.

Let R be an equivalence relation; x ∼ y is usually written in place of xRy. For example, "x − y is divisible by a fixed integer" is an equivalence relation on the set of integers.
A.2 Functions
A function f from a set A into a set B, denoted by
f : A → B,
is a rule that associates each element x ∈ A to a unique element, denoted
by f (x), in B. Equivalently, a function is a set of ordered pairs (as a subset
of A × B) with the property that no two ordered pairs have the same first
element.
Some terminologies are at hand.
1. f (x): the image of x under f .
2. x: a preimage of f (x) under f .
3. A: the domain of f .
4. f (A) = {f (x) | x ∈ A}: the range of f , a subset of B.
5. f −1 (S) = {x ∈ A | f (x) ∈ S}: the preimage of S ⊆ B.
6. f = g (f is equal to g): f (x) = g(x) for all x ∈ A if f : A → B and
g: A → B.
7. f is one-to-one: if f(x) = f(y) implies x = y, or, equivalently, if x ≠ y implies f(x) ≠ f(y).
8. f is onto: f (A) = B if f : A → B, and said that f is onto B.
Fig. A.1 (the composite g ∘ f, with (g ∘ f)(A) = g(f(A)))
Usually g ∘ f ≠ f ∘ g, even if both are defined. But the associative law h ∘ (g ∘ f) = (h ∘ g) ∘ f is true.
The identity function 1A : A → A is the function
1A (x) = x, x ∈ A.
It keeps every element of A fixed.
Suppose that f : A → B is a function such that there exists another
function g: B → A satisfying
g ◦ f = 1A , i.e. g(f (x)) = x, x ∈ A.
Then f is one-to-one, and g is onto and is called a left inverse function of f .
In case
f ◦ g = 1B , i.e. f (g(y)) = y, y∈B
then f is onto, and g is one-to-one and is called a right inverse function of f .
The following are equivalent:
1. f : A → B has a function g: B → A as both left and right inverse
functions, i.e.
g(f (x)) = x, x ∈ A and f (g(y)) = y, y ∈ B.
2. f : A → B is one-to-one and onto.
A.3 Fields
The following definition, and the properties henceforth derived, are modeled after the set of real numbers.
Definition A field F is a set together with two operations “+” (called
addition) and “·” (called multiplication) defined on it so that, for each pair
of elements a, b ∈ F, there are unique elements a + b and a · b in F for which
the following conditions hold for all a, b, c ∈ F:
(1) Addition
(a) (commutative) a + b = b + a.
(b) (associative) (a + b) + c = a + (b + c).
(c) (identity element) There exists an element 0 ∈ F, called zero,
such that
a + 0 = a.
(d) (inverse element) For each a ∈ F, there exists an element, denoted
by −a, in F such that
a + (−a) = a − a = 0.
(def.)
(2) Multiplication
(a) (commutative) a · b = b · a.
(b) (associative) (a · b) · c = a · (b · c).
(c) (identity element) There exists an element 1 ∈ F, called unity, such
that 1 · a = a.
(d) (inverse element) For each nonzero element a ∈ F, there exists an
element, denoted by a−1 , in F such that
a−1 · a = 1.
(3) Addition and multiplication
(distributive) a · (b + c) = a · b + a · c.
The elements a + b and a · b (also denoted as ab) are called the sum and product, respectively, of a and b. If b ≠ 0, then a · b⁻¹, also denoted as a/b, is called the division of a by b.
Then,
(a) |z| ≥ 0, and |z| = 0 ⇔ z = 0.
(b) |z| = |z̄|; |z₁z₂| = |z₂z₁| = |z₁||z₂| and |z₁/z₂| = |z₁|/|z₂|.
(c) |z₁ + z₂| ≤ |z₁| + |z₂|, with equality if and only if there exists α ≥ 0 such that z₁ = αz₂ or z₂ = αz₁.
A.4 Groups
This is modeled partially after the addition properties of the set of integers, or the multiplication properties of the set of nonzero real numbers.

Definition A group G is a set on which an operation ◦ is defined that associates to each pair of elements a, b in G a unique element a ◦ b in G, so that the following properties hold for all a, b, c in G:
1. (associative) (a ◦ b) ◦ c = a ◦ (b ◦ c).
2. (identity element) There exists an element e in G so that
a ◦ e = e ◦ a = a.
3. (inverse element) For each a in G, there exists an element a−1 in G
so that
a ◦ a−1 = a−1 ◦ a = e.
A.5 Polynomials
A polynomial in an indeterminate t with coefficients from a field F is an expression of the form

  p(t) = aₙtⁿ + aₙ₋₁tⁿ⁻¹ + ⋯ + a₁t + a₀,  aᵢ ∈ F.

Two polynomials are equal, p(t) = g(t), if both have the same degree and the coefficients of like powers of t are equal.
4. If F is an infinite field (i.e. a field containing infinitely many elements), a polynomial p(t) with coefficients in F is often regarded as a function p: F → F, and p or p(t), t ∈ F, is called a polynomial function.
as the addition
(p + q)(t) = p(t) + q(t)
= an tn + · · · + am+1 tm+1 + (am + bm )tm
+ · · · + (a1 + b1 )t + a0 + b0 .
The scalar product or multiplication αp of a scalar α ∈ F and p is defined as
(αp)(t) = αp(t)
= αan tn + · · · + αa1 t + αa0 .
Therefore, the set of all polynomials with coefficients from a field F
P(F)
is a vector space over F (see Sec. B.1), while the set of such polynomials of
degrees no more than n, a nonnegative integer,
Pn (F)
forms a vector subspace of dimension n + 1.
Let f(t) be a polynomial and g(t) a polynomial of non-negative degree. Then there exist unique polynomials q(t) and r(t) such that

1. the degree of r(t) is less than that of g(t), and
2. f(t) = q(t)g(t) + r(t).

This is the so-called Division Algorithm for Polynomials. It follows that t − a divides f(t) if and only if f(a) = 0, and such an a is called a zero of f(t). Any polynomial of degree n ≥ 1 has at most n distinct zeros.
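The division algorithm is readily carried out by machine (a sketch, with our own sample polynomials; coefficients are listed from the highest power down):

    import numpy as np

    # Division algorithm: f(t) = q(t) g(t) + r(t) with deg r < deg g.
    f = [1.0, -2.0, 0.0, 1.0]      # f(t) = t^3 - 2t^2 + 1
    g = [1.0, -1.0]                # g(t) = t - 1
    q, r = np.polydiv(f, g)
    print(q, r)                    # q(t) = t^2 - t - 1, r = 0
    # r = 0 confirms that t - 1 divides f(t), i.e. f(1) = 0.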
A polynomial p(t) of positive degree is called irreducible if it cannot be factored as a product of polynomials with coefficients from the same field F, each having positive degree. If f(t) is irreducible and f(t) does not divide another polynomial g(t) with coefficients from the same field, then f(t) and g(t) are relatively prime. This means that no polynomial of positive degree can divide both of them. In this case, there exist polynomials f₁(t) and g₁(t) such that

  f(t)f₁(t) + g(t)g₁(t) = 1 (the constant polynomial 1).

If an irreducible polynomial f(t) divides the product g(t)h(t) of two polynomials g(t) and h(t) over the same field F, in symbols

  f(t) | g(t)h(t),

then f(t) | g(t) or f(t) | h(t).
This appendix is divided into twelve sections. Among them, Secs. B.1–B.6
are devoted to static structures of vector spaces themselves, while Secs. B.7–
B.12 are mainly concerned with dynamic relations between vector spaces,
namely the study of linear transformations. Most topics are stated in the
realm of finite dimensional vector spaces. From Sec. B.4 on, some exercise
problems are attached as parts of the contents. Few geometric interpreta-
tions of abstract results are touched and the methods adopted are almost
purely algebraic. Essentially no proofs are given. In short, the manner of presentation is the one mostly adopted in present-day linear algebra books.
A vector space V over a field F is defined axiomatically in Sec. B.1, with Fⁿ, I_pⁿ and M(m, n; F) as concrete examples, along with subspace operations.
Via the techniques of linear combination, dependence and independence
introduced in Sec. B.2, Sec. B.3 introduces the basis and dimension for
a vector space. The concept of matrices over a field and their algebraic
operations are in Sec. B.4, and the elementary row or column operations
on a matrix are in Sec. B.5. The determinant function on square matrices
is sketched in Sec. B.6.
Section B.7 is devoted to linear transformation (functional, opera-
tor, or isomorphism) and its matrix representation with respect to bases.
Section B.8 investigates a matrix and its transpose, mostly from the view-
point of linear transformations. Inner product spaces with specified linear
operators on them such as orthogonal, normal, etc. are in Sec. B.9. Eigen-
values and eigenvectors are in Sec. B.10, while Sec. B.11 investigates the
diagonalizability of a matrix. For nondiagonalizable matrices, their Jordan
and rational canonical forms are sketched in Sec. B.12. That is all!
scalar multiplication, respectively) are defined so that for each pair of elements x, y in V there is a unique element x + y (called the sum of x and y) in V, and for each α ∈ F and x ∈ V there is a unique element αx (called the scalar product of x by α) in V, for which the following hold:

(1) Addition
  (a) (commutative) x + y = y + x.
  (b) (associative) (x + y) + z = x + (y + z).
  (c) (zero vector) There is an element, denoted by 0, in V such that x + 0 = x.
  (d) (negative or inverse vector of a vector) For each x ∈ V, there exists another element, denoted by −x, in V such that x + (−x) = 0.

(2) Scalar multiplication
  (a) 1x = x.
  (b) α(βx) = (αβ)x.

(3) The addition and scalar multiplication satisfy the distributive laws:

  (α + β)x = αx + βx,
  α(x + y) = αx + αy.
The elements of the field F are called scalars and the elements of the
vector space V are called vectors. The word “vector”, without any practical
meaning such as displacement or acting force, is now being used to describe
any element of a vector space.
If the underlying field F is the real field R or the complex field C, then
the corresponding vector space is called specifically a real or a complex
vector space, respectively.
A vector space will frequently be discussed without explicitly mentioning
its field of scalars.
1. (cancellation law) If x, y, z ∈ V and x + z = y + z, then x = y.
2. The zero vector 0 is unique.
3. The negative −x of a vector x is unique.
4. αx = 0 ⇔ α = 0 or x = 0.
Examples
For a given field F and a positive integer n, define the set of all n-tuples with entries from F as the set

  Fⁿ = {(x₁, ..., xₙ) | xᵢ ∈ F for 1 ≤ i ≤ n}.

The following specified vectors and notations for them will be used throughout the whole book:

  eᵢ = (0, ..., 0, 1, 0, ..., 0), with 1 in the ith component, 1 ≤ i ≤ n.
By Sec. A.5, the set P (F) of all polynomials with coefficients from a
field F is a vector space over F.
Let X be a nonempty set and F a field. The set of all functions from X
into F
F(X, F) = {function f : X → F}
1. Intersection subspace If S₁ and S₂ are subspaces of V, then S₁ ∩ S₂ is always a subspace of V.
2. Sum subspace If S₁ and S₂ are subspaces of V, then

  S₁ + S₂ = {x₁ + x₂ | x₁ ∈ S₁ and x₂ ∈ S₂}

is a subspace which has S₁ and S₂ as its subspaces. In case S₁ ∩ S₂ = {0}, denote S₁ + S₂ by

  S₁ ⊕ S₂,

called the direct sum of S₁ and S₂.
3. Quotient space of V modulo a subspace S For any x ∈ V, the coset of S containing x is the set

  x + S = {x + v | v ∈ S}.
Then x + S = y + S if and only if x − y ∈ S. The quotient set

  V/S = {x + S | x ∈ V}

forms a vector space over F under the well-defined operations:

  (x₁ + S) + (x₂ + S) = (x₁ + x₂) + S, and
  α(x + S) = αx + S,

with 0 + S = S acting as the zero vector in it.
⟨x₁⟩ ⊆ V but ⟨x₁⟩ ≠ V), then there exists a vector x₂ ∈ V − ⟨x₁⟩, and x₁ and x₂ are linearly independent. Construct the subspace ⟨x₁, x₂⟩. Continue this procedure; it will stop after a finite number of steps, say n. Therefore,

  V = ⟨x₁, ..., xₙ⟩,  dim V = n.

For a subspace W of such a V,

  dim W ≤ dim V

and equality holds if and only if W = V. Furthermore, any basis for W can be extended to a basis B for V. In other words, there exists another vector subspace U of V such that

  V = W ⊕ U

holds.
Example
Let Pₙ(F), n ≥ 0, be as in Sec. A.5, where F is an infinite field. The set {1, t, ..., tⁿ} forms a basis for Pₙ(F).
p(ai ) = αi , 0 ≤ i ≤ n.
B.4 Matrices
Let m and n be positive integers.
An m × n matrix with entries from a field F is an ordered rectangular
array of the form

  [a₁₁ a₁₂ ⋯ a₁ₙ; a₂₁ a₂₂ ⋯ a₂ₙ; ⋯; aₘ₁ aₘ₂ ⋯ aₘₙ]

(written with square brackets or parentheses; rows separated here by semicolons). The entries aᵢ₁, aᵢ₂, ..., aᵢₙ compose the ith row of A, denoted by

  Aᵢ* = (aᵢ₁ aᵢ₂ ⋯ aᵢₙ), 1 ≤ i ≤ m,

and called a row matrix. Similarly, the entries a₁ⱼ, a₂ⱼ, ..., aₘⱼ of A compose the jth column of A, denoted by the m × 1 column

  A*ⱼ = (a₁ⱼ, a₂ⱼ, ..., aₘⱼ)*, 1 ≤ j ≤ n,

and called a column matrix. The entry aᵢⱼ which lies in the ith row and jth column is called the (i, j) entry of A. Then the matrix A is often written in shorthand as

  A_{m×n} or A = [aᵢⱼ]_{m×n} or (aᵢⱼ)_{m×n} or A = [aᵢⱼ].
Matrices are used to describe route maps in topology and networks, and to store large quantities of numerical data on many occasions. They appear quite naturally in the treatment of some geometrical problems. Actually, once we endow this notation with suitable operations, the static and dynamic properties of matrices form the core of the study of finite-dimensional vector spaces in linear algebra.

Matrices with entries in R or C are called real or complex matrices, respectively.

Two m × n matrices A = [aᵢⱼ] and B = [bᵢⱼ] are defined to be equal if and only if aᵢⱼ = bᵢⱼ for 1 ≤ i ≤ m, 1 ≤ j ≤ n, and this is denoted as A = B.
The m × n matrix having each entry aᵢⱼ equal to zero is called the zero matrix and is denoted by O.

The n × m matrix obtained by interchanging the m rows and the n columns of an m × n matrix A = [aᵢⱼ] is called the transpose of A and is denoted by

  A* = [bⱼᵢ], where bⱼᵢ = aᵢⱼ, 1 ≤ i ≤ m, 1 ≤ j ≤ n.
An n × n matrix A = [aij ]n×n is called a square matrix of order n
with aii , 1 ≤ i ≤ n, as its (main) diagonal entries.
  diag[a₁₁, a₂₂, ..., aₙₙ] (all off-diagonal entries zero),

  Iₙ = diag[1, 1, ..., 1],

  αIₙ = diag[α, α, ..., α].
Let A = [aᵢⱼ]_{m×n} be a complex matrix. Then Ā = [āᵢⱼ]_{m×n} is called the conjugate matrix of A. A complex square matrix A with

  Ā* = A

is called Hermitian.
9. Skew-Hermitian matrix This means a complex square matrix A having the property

  Ā* = −A.

In this case, the main diagonal entries of A are all purely imaginary. Hermitian or skew-Hermitian matrices with real entries are just real symmetric or real skew-symmetric matrices, respectively.
Let A = [aᵢⱼ], B = [bᵢⱼ] ∈ M(m, n; F).
1. Addition The sum A + B of A and B is the m × n matrix A + B = [aᵢⱼ + bᵢⱼ]_{m×n}.
2. Scalar multiplication For α ∈ F,

  αA = [αaᵢⱼ]_{m×n}.
A + B = B + A;
(A + B) + C = A + (B + C);
A+O =A (O is zero m × n matrix);
A + (−A) = O (−A means (−1)A);
1A = A;
α(βA) = (αβ)A;
α(A + B) = αA + αB;
(α + β)A = αA + βA.
  Eᵢⱼ = the m × n matrix whose (i, j) entry is 1 and all of whose other entries are 0, 1 ≤ i ≤ m, 1 ≤ j ≤ n,

as basis vectors.

Owing to the same operational properties, a 1 × n row matrix is also regarded as a row vector in Fⁿ and vice versa, while an m × 1 column matrix is regarded as a column vector in Fᵐ and vice versa.
In order to define matrix multiplication in a reasonable way, we start
from a simple example.
Suppose a resort island has two towns A and B. The island bus company
runs a single bus which operates on two routes:
See the network in Fig. B.1. By a single-stage journey we mean either A → B, or B → A, or B → B; this can be expressed in matrix form, rows indexed by the starting town and columns by the destination:

  (from\to  A B)
  A:        0 1
  B:        1 1

Fig. B.1
How many routes can a tourist choose for a two-stage journey? If the bus starts at A, the possible routes are A → B → A and A → B → B. If the bus starts at B, the possible routes are B → A → B, B → B → A and B → B → B. The situation can be described simply by using matrix notation:

  [0 1; 1 1][0 1; 1 1] = [1 1; 1 2]

(rows and columns indexed by A, B). Do you see how the entries 1, 1, 1, 2 of the right-hand matrix came out? Readers are urged to try more complicated examples from their daily lives.
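The same bookkeeping is a one-line computation (a sketch with numpy, not from the text): squaring the single-stage matrix counts the two-stage journeys.

    import numpy as np

    # Single-stage journeys of the island bus (rows = from, columns = to).
    M = np.array([[0, 1],     # A -> B
                  [1, 1]])    # B -> A and B -> B
    print(M @ M)              # [[1 1], [1 2]], as computed in the text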
Exercises
1. Unusual properties of matrix multiplication In M(2; R), or even in M(m, n; F) with m, n ≥ 2, the following happen (refer to (1) in (1.2.4) and Sec. A.3).

(a) AB ≠ BA. For example,

  A = [1 1; 0 0] and B = [1 0; 1 0], or
  A = [1 1; 0 0; 1 0] (3 × 2) and B = [1 0 0; 1 0 1] (2 × 3).

(b) There exist A ≠ O and B ≠ O for which AB = O. For example,

  A = [1 0; 0 0] and B = [0 0; 0 1].

In this case, BA = O holds too.
is a subspace.

(a) Let A = [1 0; −1 0]. Show that

  V = {[a₂₂ − a₂₁ 0; a₂₁ a₂₂] | a₂₁, a₂₂ ∈ F}.

Find a basis for V. What is dim V?
(b) Let

  A = [0 1 0; 0 0 1; 0 0 0].

Determine the corresponding V and find a basis for V.
5. Let A ∈ M(n; F) and p ≥ 1 be an integer. The pth power of A is

  Aᵖ = AA ⋯ A (p times)

and A⁰ is defined to be the identity matrix Iₙ. Obviously,

  AᵖA^q = A^qAᵖ = A^{p+q}, and (Aᵖ)^q = A^{pq}

hold for any nonnegative integers p, q. In the following, A = [aᵢⱼ] ∈ M(2; F).

(a) Find all such A so that A² = O.
(b) Find all A so that A³ = O.
(c) Find all A so that Aᵖ = O, where p ≥ 3 is an integer.
6. A matrix A ∈ M(n; F) is said to be idempotent if

  A² = A.

(a) Show that the matrices

  [1 2; 0 0] and [0 −1; 0 1]

are idempotent.
(b) Show that

  A = [2 −2 −4; −1 3 4; 1 −2 −3],  B = [−1 2 4; 1 −2 −4; −1 2 4]

are idempotent and AB = O. What is BA?
  (A*)* = A,
  (AB)* = B*A*.

Every A ∈ M(n; R) decomposes as

  A = ½(A + A*) + ½(A − A*).

Let

  V₁ = {A ∈ M(n; R) | A = A*}, and
  V₂ = {A ∈ M(n; R) | A = −A*}.

Then

  M(n; R) = V₁ ⊕ V₂.

(g) Let

  V₃ = {A = [aᵢⱼ] ∈ M(n; R) | aᵢⱼ = 0 for 1 ≤ i ≤ j ≤ n}

and V₁ be as in (f). Show that V₃ is a ½n(n − 1)-dimensional subspace of M(n; R) and

  M(n; R) = V₁ ⊕ V₃.
15. Rank of a matrix (see Sec. B.5 and Ex. 2 of Sec. B.7)
Let A = [aᵢⱼ] ∈ M(m, n; F) and A ≠ O. The maximal number of linearly independent row (or column) vectors of A is defined as the row (or column) rank of A. It can be shown that, for any matrix A ≠ O,

  row rank of A = column rank of A = r(A).

This common number r(A) is called the rank of A. Define the rank of O to be zero. Therefore 0 ≤ r(A) ≤ min(m, n). Try to prove this result in case A is a 2 × 3 or 3 × 3 real matrix.
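Numerically, the common value of row rank and column rank is what a rank routine computes (a sketch with numpy; the example matrix is our own):

    import numpy as np

    A = np.array([[1., 2., 3.],
                  [2., 4., 6.],      # twice the first row
                  [0., 1., 1.]])
    print(np.linalg.matrix_rank(A))      # 2
    print(np.linalg.matrix_rank(A.T))    # 2, the same value for A*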
16. Invertible matrix and its inverse matrix
A matrix A = [aij ] ∈ M(n; F) is said to be invertible (refer to (2.4.2))
if there exists another n × n matrix B such that
AB = BA = In .
In this case, B is called the inverse matrix of A and is denoted by
A−1 .
(a) A−1 is unique if A is invertible, and (A−1 )−1 = A.
(b) A is invertible if and only if A∗ is invertible, and (A∗ )−1 = (A−1 )∗ .
(c) If A and B are invertible, then AB is invertible and
(AB)−1 = B −1 A−1 .
(d) The following are equivalent: A = [aij ]n×n
(1) A is invertible.
(2) The row rank of A is n.
(3) The column rank of A is n.
(4) The linear transformation f : Fn → Fn defined by
f (
x) =
xA
is a linear isomorphism (see Sec. B.7).
(5) The linear transformation g: Fn → Fn defined by
g(
y ) = A
y (
y is considered as a column vector)
is a linear isomorphism (see Sec. B.7).
(6) The homogeneous equation xA = 0 (x ∈ Fⁿ) has only the zero solution 0; so does Ay* = 0.
(7) There exists an n × n matrix B such that AB = Iₙ.
(8) There exists an n × n matrix B such that BA = Iₙ.
(9) The matrix equation XA = O has only the zero solution X = O; so does AY = O.
(10) The matrix equation XA = B always has a unique solution; so does AY = B.
(11) The determinant of A satisfies

  det A ≠ 0

and thus det A⁻¹ = (det A)⁻¹.
These results extend those stated in Exs. <A> 2 and <B> 4 of
Sec. 2.4. Try to prove these in case A is a 2 × 2 or 3 × 3 real matrix.
(e) State equivalent conditions for an n × n matrix A to be not invert-
ible which is called singular.
(f) Suppose A is invertible. For positive integer p, Ap is invertible and
(Ap )−1 = (A−1 )p .
Therefore, extend the power of an invertible matrix A to negative
exponent:
A−p = (A−1 )p , p > 0.
(g) Suppose A, B ∈ M(n; F) such that AB is invertible. Then A and B
are invertible.
17. Suppose A and B are invertible. Find such A and B so that A + B is not invertible. If, in addition, A + B is invertible, then show that A⁻¹ + B⁻¹ is invertible and

  (A⁻¹ + B⁻¹)⁻¹ = A(A + B)⁻¹B = B(A + B)⁻¹A.
18. Let A ∈ M(n; R) be skew-symmetric.
(a) Then Iₙ − A is invertible. Try to prove this directly if n = 2. Is aIₙ + A invertible for all a ≠ 0 in R?
(b) The matrix
B = (In + A)(In − A)−1
satisfies BB ∗ = B ∗ B = In , i.e. B ∗ = B −1 . A real matrix such as
B is called an orthogonal matrix.
  A⁻¹ = diag[a₁⁻¹, a₂⁻¹, ..., aₙ⁻¹].
23. An upper (or lower) triangular matrix is invertible if and only if its
main diagonal entries are nonzero.
24. Let A = [aᵢⱼ], B = [bᵢⱼ] ∈ M(2; R) or M(2; C).
(a) Compute AB and BA.
(b) Show that AB − BA = [a b; c −a] for some a, b, c ∈ R.
(c) Prove that there does not exist any α ≠ 0 in R such that AB − BA = αI₂.
i.e. the sum of entries on A’s main diagonal, is called the trace of A.
V = {A ∈ M(n; F) | tr A = 0}.
C = AB − BA.
Note For any field F and n ≥ 2,

  {A ∈ M(n; F) | tr A = 0} = {AB − BA | A, B ∈ M(n; F)}

still holds as subspaces of M(n; F). In case F is a field of characteristic 0 (i.e. 1 + 1 + ⋯ + 1 ≠ 0 for any finite number of 1's), such as R and C, it is not possible to find matrices A, B ∈ M(n; F) such that

  AB − BA = Iₙ

(refer to Ex. 25). For a field F of characteristic p (i.e. 1 + 1 + ⋯ + 1 = 0 for p 1's), such as I_p = {0, 1, ..., p − 1} where p is a prime, this does happen. For example, in I₃ = {0, 1, 2}, let

  A = [0 1 0; 0 0 1; 0 0 0] and B = [0 0 0; 1 0 0; 0 2 0].

Then

  AB − BA = diag[1, 2, 0] − diag[0, 1, 2] = diag[1, 1, −2] = I₃, since −2 = 1 in I₃.
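The characteristic-3 computation is easy to verify by machine (a sketch with numpy; the mod-3 reduction stands in for arithmetic in I₃):

    import numpy as np

    # Over I3 the matrices of the Note satisfy AB - BA = I3; over R the
    # trace obstruction tr(AB - BA) = 0 makes this impossible.
    A = np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]])
    B = np.array([[0, 0, 0], [1, 0, 0], [0, 2, 0]])
    print((A @ B - B @ A) % 3)    # the identity matrix mod 3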
31. Similarity
Two square matrices A, B ∈ M(n; F) are said to be similar if there
exists an invertible matrix P ∈ M(n; F) such that
B = P AP −1 (or A = P −1 BP )
(refer to (2.7.25)). Use A ∼ B to denote that A is similar to B.
(a) Similarity is an equivalence relation (see Sec. A.1) among matrices
of the same order. That is,
(1) A ∼ A.
(2) A ∼ B ⇒ B ∼ A.
(3) A ∼ B and B ∼ C ⇒ A ∼ C.
(b) Similarity provides a useful tool to study the geometric behavior
of linear or affine transformations by suitable choices of bases (see
Secs. 2.7.6, 3.7.6, etc.). Algebraically, similarity has many advan-
tages in computation. For example,
(d) Not every square matrix can be similar to a diagonal matrix. For example, there does not exist an invertible matrix P such that

  P [1 0; 1 1] P⁻¹

is a diagonal matrix. Why? Is

  [1 0; 1 1]ⁿ = [1 0; n 1],  n ≥ 1

correct?
32. Necessary condition for a matrix to be similar to a diagonal matrix
Let A = [aᵢⱼ] ∈ M(n; F) and P be an invertible matrix so that

  PAP⁻¹ = diag[λ₁, ..., λₙ].

Fix i, 1 ≤ i ≤ n; then

  PAP⁻¹ − λᵢIₙ = PAP⁻¹ − P(λᵢIₙ)P⁻¹ = P(A − λᵢIₙ)P⁻¹
  = diag[λ₁ − λᵢ, ..., λ_{i−1} − λᵢ, 0, λ_{i+1} − λᵢ, ..., λₙ − λᵢ]
  ⇒ det(A − λᵢIₙ) = 0, 1 ≤ i ≤ n.

Moreover, the ith row xᵢ of P satisfies

  xᵢA = λᵢxᵢ, 1 ≤ i ≤ n.
(a) Justify the above by the example

  [3 2; 2 −3][5 12; 12 −5][3 2; 2 −3]⁻¹ = [13 0; 0 −13].

(b) Suppose that there exists an invertible matrix P = [pᵢⱼ] such that

  P [1 2; 3 4] P⁻¹ = diag[(5 + √33)/2, (5 − √33)/2].
respectively.
  [2 0 1 5; 0 −1 3 1; 0 0 2 1; 0 0 0 1],  [0 −1 3 4 0; 0 0 0 −2 1; 0 0 0 0 4; 0 0 0 0 0].
For example,

  [0 1 −3 0 2; 0 0 0 1 2; 0 0 0 0 0],  [1 0 0 −11; 0 1 0 17; 0 0 1 −5].
In general, a row-reduced echelon matrix R of rank r has the following form: there are columns k₁ < k₂ < ⋯ < k_r such that, for 1 ≤ i ≤ r, the ith row has its first nonzero entry, equal to 1, in column kᵢ; this 1 is the only nonzero entry in column kᵢ; rows r + 1, ..., m are zero; and the remaining entries, marked ∗, can be any scalars.   (∗)

Such an R is unique once A is given, and is called the row-reduced echelon matrix of A.
The following are some basic results about row-reduced echelon matrices: there exists an invertible P with

  PA = R,  and  r(A) = r(R) = r.
Fig. B.2 (the affine subspace v₀ + V, a translate of the vector space V through v₀)
B.6 Determinants
Let A = [aᵢⱼ] be an n × n matrix with entries from a field F. The determinant of A is an element of F, denoted by

  det A,

which can be defined inductively on n as follows:

and is called the expansion of the determinant det A along the ith row.

  det B = − det A.  det Iₙ = 1.

From the very definition of determinants, the following basic properties can be deduced.

  det A = 0

  det A = ∑_{j=1}^{n} (−1)^{i+j} aᵢⱼ det Aᵢⱼ,  1 ≤ i ≤ n.
  det(αA) = αⁿ det A;

  det(A + B) = ∑_{k₁,...,kₙ=1}^{2} det[a₁ⱼ^{(k₁)}; a₂ⱼ^{(k₂)}; ⋯; aₙⱼ^{(kₙ)}]

(a sum of 2ⁿ determinants, the ith row being taken from A if kᵢ = 1 and from B if kᵢ = 2).

If m < n:

  det AB = ∑_{1 ≤ j₁ < j₂ < ⋯ < jₘ ≤ n} det[a₁ⱼ₁ ⋯ a₁ⱼₘ; ⋯; aₘⱼ₁ ⋯ aₘⱼₘ] · det[bⱼ₁₁ ⋯ bⱼ₁ₘ; ⋯; bⱼₘ₁ ⋯ bⱼₘₘ]

(a sum of Cₙᵐ terms)
hold. To put these identities in a compact form, we define the adjoint matrix of A as

  adj A = [bᵢⱼ]ₙₓₙ,  bᵢⱼ = (−1)^{i+j} det Aⱼᵢ, 1 ≤ i, j ≤ n.

Thus, (∗) can be written as a single identity

  A · adj A = adj A · A = (det A)Iₙ.

We conclude that a square matrix A is invertible if and only if det A ≠ 0. In this case, the inverse matrix is

  A⁻¹ = (1/det A) adj A.
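The adjoint identity can be verified directly (a sketch; the helper adj and the sample matrix are our own, built from the cofactor definition above):

    import numpy as np
    from itertools import product

    def adj(A):
        """Adjoint matrix: adj(A)[i, j] = (-1)^(i+j) det A_ji."""
        n = A.shape[0]
        B = np.empty_like(A, dtype=float)
        for i, j in product(range(n), repeat=2):
            minor = np.delete(np.delete(A, j, axis=0), i, axis=1)
            B[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
        return B

    A = np.array([[2., 1., 0.], [0., 1., 1.], [1., 0., 3.]])
    print(A @ adj(A))                 # (det A) I_3, with det A = 7
    print(np.allclose(np.linalg.inv(A), adj(A) / np.linalg.det(A)))  # True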
Let A = [aᵢⱼ]ₙₓₙ. The system of linear equations

  Ax = b

in n unknowns x₁, ..., xₙ, where x = (x₁, ..., xₙ)* ∈ Fⁿ is the n × 1 column vector and b = (b₁, ..., bₙ)* ∈ Fⁿ, has a unique solution if and only if det A ≠ 0. Let X_k
Definition Suppose V and W are vector spaces over the same field F. A function (see Sec. A.2)

  f: V → W

is called a linear transformation or mapping from V into W if it preserves the linear structures of vector spaces, i.e. for any x, y ∈ V and α ∈ F, the following properties hold:

1. f(αx) = αf(x).
2. f(x + y) = f(x) + f(y).

If, in addition, f is both one-to-one and onto, then f is called a linear isomorphism from V onto W, and V and W are called isomorphic. In case W = V, a linear transformation f: V → V is specially called a linear operator; while if W = F, f: V → F is called a linear functional.

Remark
In general situations, conditions 1 and 2 in the definition are independent of each other and hence are both needed in the definition of a linear transformation.

For example, define a mapping f: R² (vector space) → R² by

  f(x) = x if x₁x₂ ≥ 0, and f(x) = −x if x₁x₂ < 0,
such that

  f(xᵢ) = yᵢ, 1 ≤ i ≤ m.

All we need to do is to define a function f: V → W by assigning f(xᵢ) = yᵢ, 1 ≤ i ≤ m, and then extending it linearly to all vectors ∑_{i=1}^{m} αᵢxᵢ in V by

  f(∑_{i=1}^{m} αᵢxᵢ) = ∑_{i=1}^{m} αᵢyᵢ.
holds with dim Ker(f ) called the nullity of f and dim Im(f ) the rank of f .
As a consequence, f is one-to-one if and only if Ker(f ) = { 0 }.
Suppose dim V = m < ∞ and dim W = n < ∞. Then,
  x = ∑_{i=1}^{m} xᵢaᵢ.

  [x]_B = (x₁, ..., xₘ) ∈ Fᵐ

  [y]_C = (y₁, ..., yₙ) ∈ Fⁿ

if and only if y = ∑_{j=1}^{n} yⱼbⱼ.
⇒ [f(aᵢ)]_C = (aᵢ₁, aᵢ₂, ..., aᵢₙ), 1 ≤ i ≤ m.

For any x = ∑_{i=1}^{m} xᵢaᵢ,

  f(x) = ∑_{i=1}^{m} xᵢf(aᵢ) = ∑_{i=1}^{m} xᵢ ∑_{j=1}^{n} aᵢⱼbⱼ = ∑_{j=1}^{n} (∑_{i=1}^{m} xᵢaᵢⱼ) bⱼ

⇒ [f(x)]_C = (∑_{i=1}^{m} xᵢaᵢ₁, ∑_{i=1}^{m} xᵢaᵢ₂, ..., ∑_{i=1}^{m} xᵢaᵢₙ)
  = (x₁ ⋯ xₘ)[a₁₁ a₁₂ ⋯ a₁ₙ; a₂₁ a₂₂ ⋯ a₂ₙ; ⋯; aₘ₁ aₘ₂ ⋯ aₘₙ]
  = [x]_B [f]_C^B,
where

  [1_V]_B^{B′} = [[a₁]_{B′}; ⋯; [aₘ]_{B′}]

is the matrix representation of 1_V relative to B and B′ and is called the change of coordinate matrix or transition matrix changing B into B′. Similarly, for another basis C′ for W and y ∈ W, we have

  [y]_{C′} = [y]_C [1_W]_C^{C′}.

Both [1_V]_B^{B′} and [1_W]_C^{C′} are invertible.
What are the possible relations among [f]_C^B, [f]_{C′}^{B′}, [1_V]_B^{B′} and [1_W]_C^{C′}? Since [f(x)]_C = [x]_B[f]_C^B, the four maps fit into a commutative square: f carries (V, B) to (W, C) via [f]_C^B and (V, B′) to (W, C′) via [f]_{C′}^{B′}, while 1_V and 1_W change the bases via [1_V]_B^{B′} and [1_W]_C^{C′}.
is a linear isomorphism.
2. For each basis B for V and basis C for W, each f ∈ L(V, W) has a unique matrix representation [f]_C^B relative to B and C:

  [f(x)]_C = [x]_B [f]_C^B.

3. For other bases B′ for V and C′ for W, [f]_C^B and [f]_{C′}^{B′} are related to each other, subject to the changes of coordinate matrices [1_V]_B^{B′} and [1_W]_C^{C′}, as

  [f]_{C′}^{B′} = ([1_V]_B^{B′})⁻¹ [f]_C^B [1_W]_C^{C′}.
Exercises
Let V and W be vector spaces over the same field F throughout the following
problems.
transformation.
(a) Then (refer to Ex. 15 of Sec. B.4)
with
(1) r(PA) = r(A) ⇔ Fm = Im(P ) + Ker(A),
(2) r(PA) = r(P ) ⇔ Im(P ) ∩ Ker(A) = { 0 },
(3) r(PA) = r(P ) + r(A) − m ⇔ Ker(A) ⊆ Im(P ).
(b) Suppose P ∈ M(m; F) and Q ∈ M(n; F) are invertible, and
A ∈ M(m, n; F). Then
  fᵢⱼ(x_k) = δ_{ki} yⱼ, 1 ≤ i, k ≤ m, 1 ≤ j ≤ n.

Then N = {fᵢⱼ | 1 ≤ i ≤ m, 1 ≤ j ≤ n} forms a basis for L(V, W). In particular, if f ∈ L(V, W) and f = ∑_{i,j=1}^{m,n} aᵢⱼfᵢⱼ, then

  [f]_C^B = [aᵢⱼ]_{m×n}.
for V. Then

  [f]_B = [I_r 0; 0 −I_{n−r}]ₙₓₙ,

where r is the rank of f + 1_V, which is equal to dim Ker(f − 1_V).
from a basis for the range space R(f − i1_V). At the same time, for r + 1 ≤ j ≤ n, (f − i1_V)(xⱼ) ∈ Ker(f + i1_V), which has dimension n − r. Therefore,

  R(f − i1_V) = Ker(f + i1_V).

Similarly,

  R(f + i1_V) = Ker(f − i1_V).

The vectors (f + i1_V)(xⱼ), r + 1 ≤ j ≤ n, are linearly independent in R(f + i1_V). Then

  n − r ≤ r

holds. Similarly, r ≤ n − r. Hence

  n = 2r.

In particular, n is even and r = n/2.
and

  r([h]_N) = n·r(A),  det[h]_N = (det A)ⁿ,  [g]_B = [h]_N*.

  T(Iₙ) = Iₙ.
  T(X⁻¹) = T(X)⁻¹ if X is invertible,
  T(QXQ⁻¹) = T(Q)T(X)T(Q)⁻¹.
(b) Now Iₙ = ∑_{i=1}^{n} T(Eᵢᵢ). So at least one of T(Eᵢᵢ), 1 ≤ i ≤ n, is not a zero matrix, say T(E₁₁) ≠ O. For some suitable elementary matrices Q₁, Q₂ of type 1, Eᵢⱼ = Q₁E₁₁Q₂ holds for 1 ≤ i, j ≤ n. Hence T(Eᵢⱼ) ≠ O, 1 ≤ i, j ≤ n.
(c) The rank r(T(Eᵢᵢ)) = 1, 1 ≤ i ≤ n. Let x₁ ∈ Fⁿ be such that x₁ ≠ 0 and e₁T(E₁₁) = x₁. Let xᵢ = x₁T(E₁ᵢ), 2 ≤ i ≤ n.
(2) p maps each vector in V₂ into the zero vector 0, i.e. p(x) = 0 for x ∈ V₂.

Fig. B.3 (the projection p of x = x₁ + x₂ onto V₁ along V₂: p(x) = x₁ and x − p(x) = x₂)
14. Prove that the following are equivalent. In cases (d), (e) and (f), V is supposed to be finite-dimensional. Let V₁, ..., V_k be subspaces of V.

(a) V = V₁ ⊕ ⋯ ⊕ V_k and is called the direct sum of V₁, ..., V_k if
  (1) V = ∑_{i=1}^{k} Vᵢ,
  (2) Vᵢ ∩ ∑_{j≠i} Vⱼ = {0} for each i, 1 ≤ i ≤ k.
(b) V = ∑_{i=1}^{k} Vᵢ and, for any vectors xᵢ ∈ Vᵢ, 1 ≤ i ≤ k, if x₁ + ⋯ + x_k = 0, then x₁ = ⋯ = x_k = 0.
(c) Each vector x ∈ V can be uniquely expressed as x = x₁ + ⋯ + x_k, where xᵢ ∈ Vᵢ, 1 ≤ i ≤ k.
(f) V = ∑_{i=1}^{k} Vᵢ and dim V = dim V₁ + ⋯ + dim V_k.
(g) There exist linear operators p₁, ..., p_k ∈ L(V, V) with R(pᵢ) = Vᵢ for 1 ≤ i ≤ k which satisfy:
  (1) pᵢ is a projection, i.e. pᵢ² = pᵢ, 1 ≤ i ≤ k.
  (2) pᵢ ∘ pⱼ = pⱼ ∘ pᵢ = 0 for i ≠ j, 1 ≤ i, j ≤ k, i.e. Ker(pᵢ) = ∑_{j≠i} Vⱼ.
  (3) 1_V = p₁ + ⋯ + p_k, i.e. x = p₁(x) + ⋯ + p_k(x), x ∈ V.
  π(x) = x + S  for x ∈ V.
where

  π: V → V/Ker(f) is the natural projection;
  f̃: V/Ker(f) → R(f) is the linear isomorphism as in Ex. 17(c);
  f̄: V/Ker(f) → U is the linear transformation such that g = f̄ ∘ π; this is possible because Ker(f) ⊆ Ker(g), and f̄ is defined by f̄(x + Ker(f)) = g(x) for x ∈ V;
  h̄ = f̄ ∘ f̃⁻¹: R(f) → U is linear; and
  h: W → U is a linear extension of h̄ from R(f) to the whole space W.
g = h ◦ f.
  codim ∩_{i=1}^{n} Ker(fᵢ) ≤ n < +∞.

  ∩_{i=1}^{n} Ker(fᵢ) ⊆ Ker(f)

  f = a₁f₁ + ⋯ + aₙfₙ.
(Note This result lays the linearly algebraic foundation for the
Lagrange multiplier method in solving constrained extremum prob-
lems (refer to Ex. <D> 8 of Sec. 5.6).)
19. The vector space L(V, F) is usually denoted by

  V*

and is called the (first) dual space of V. For x ∈ V and f ∈ V*, the scalar f(x) is also denoted as ⟨x, f⟩.

(a) For any x ∈ V with x ≠ 0, there exists an f ∈ V* such that ⟨x, f⟩ ≠ 0.
(b) Suppose dim V = n and B = {x₁, ..., xₙ} is a basis for V. Then there exists a unique basis B* = {f₁, ..., fₙ} for V* such that

  ⟨xᵢ, fⱼ⟩ = δᵢⱼ, 1 ≤ i, j ≤ n.
  x = ∑_{j=1}^{n} ⟨x, fⱼ⟩ xⱼ  and  f = ∑_{i=1}^{n} ⟨xᵢ, f⟩ fᵢ,  f ∈ V*.

  V = Ker(f) ⊕ ⟨x₀⟩

  S = Ker(f).

  g = αf.
  (V*)* = V**

(a) For each vector x ∈ V, define x**: V* → F by

  x**(f) = f(x) = ⟨f, x⟩.

Then x** ∈ V**.
(b) Define Φ: V → V** by

  Φ(x) = x**.

Then Φ is a one-to-one linear transformation from V into V**.
(c) Suppose dim V = n < ∞. Then Φ: V → V** is a natural isomorphism. Therefore, every basis for V* is a dual basis of some basis for V, in the sense that

  ⟨x, f⟩ = ⟨f, x⟩, x ∈ V and f ∈ V*.
  S ⊆ (S⁰)⁰ = S⁰⁰.

  Ker(f)⁰ = ⟨f⟩

Fig. B.4 (the dual map φ*: W* → V*)
Find S⁰.

  ⟨x, f⟩ = f(x) = ∑_{i,j=1}^{n} aᵢbⱼ⟨xᵢ, fⱼ⟩,
Just a stone’s throw from here is the natural inner product on pairs of
vectors in the real vector space Rn (see Sec. B.9). Identify (Rn )∗ with Rn
in the sense that the dual basis N ∗ = {f1 , . . . , fn } of the natural basis N =
n
{ ei for 1 ≤ i ≤ n and f = i=1 yi fi ∈ (Rn )∗ is
e } is so that fi =
e1 , . . . ,
nn
equal to i=1 yi y ∈ Rn . Then,
ei =
n
y =
x, xi yi
i=1
  x = yA* + (x − yA*),  yA* ∈ R(A*) and x − yA* ∈ N(A),

and this decomposition is unique. Note that there are many such y. Therefore

  xA = (yA*)A = yA*A

holds. This implies that

  A | R(A*): R(A*) → R(A)
Fig. B.5 (the mapping behaviors of A and A*: A carries R(A*) isomorphically onto R(A) and annihilates N(A); A* carries R(A) isomorphically onto R(A*) and annihilates N(A*))
The materials here and the problems in the forthcoming exercises are briefly taken from [4]. Readers are highly recommended to catch a glimpse of Gilbert Strang's article [17].

Exercises
A = [aᵢⱼ] ∈ M(m, n; F) throughout the following problems unless otherwise noted.
1. Let

  A = [1 −1 2 3 0; −1 1 2 5 2; 0 0 −1 −2 3; 2 −2 3 4 −1].
Fig. B.6 (the affine solution set x₀A + N(A*), with b = yA* = x₀AA*)
For each i, 1 ≤ i ≤ m, the solution set of Ay* = eᵢ* ∈ Fᵐ is an (n − m)-dimensional affine subspace of Fⁿ. Therefore, A has right inverses; one of them is

  A*(AA*)⁻¹.
(b) Suppose r(A) = n < m. Then the set of all left invertible matrices of A is

  L = {[x₁; x₂; ⋯; xₙ] ∈ M(n, m; F) | each xᵢ ∈ Fᵐ is a solution of xA = eᵢ ∈ Fⁿ for 1 ≤ i ≤ n}.

One of them is

  (A*A)⁻¹A*.
∗
y ∗ = b where
4. Solution set of A y ∈ Fn and b ∈ Fm , b = 0 Suppose
r(A) = r and
(A11 A12 ), r(A) = r(A11 ) = m,
A = A11 A12
, r(A11 ) = r < m,
A21 A 22
∗
∗
b , r(A) = m,
∗
b = b1
∗ , r(A) = r < m, b1 ∈ Fr , b2 ∈ Fm−r .
b2
∗
y ∗ = b has a solution if and only if
(a) A
(1) in case r(A) = m, or
∗
(2) in case r(A) = r < m, then b∗2 = A21 A−1
11 b1 holds.
= ( b1 (A∗11 )−1 , 0 )1×n
y1 A∗12 (A∗11 )−1 ,
+ {(− y1 )1×n ∈ Fn |
y1 ∈ Fn−r }
  B = [1/3 + z₁  −1/3 + z₃; −(4/3)z₂  −(4/3)z₄; z₁  z₃; z₂  z₄]₄ₓ₂,  z₁, z₂, z₃, z₄ ∈ F.

For b = (b₁, b₂) ∈ F², Ay* = b* has solutions

  y = ((2/3)b₁ + (1/3)b₂, (1/3)b₁ − (1/3)b₂, 0, 0) + α(1, 1, 1, 0) + β(−5/3, −4/3, 0, 1),

where α, β ∈ F.
(b) Let

  A = [0 1 1; 1 0 1; 1 1 0; 4 1 −1]₄ₓ₃.

Then r(A) = 3 < 4. A has left invertible matrices of the form

  B = [−1/2 + 2z₁  1/2 − z₁  1/2 − 3z₁  z₁;
       1/2 + 2z₂  −1/2 − z₂  1/2 − 3z₂  z₂;
       1/2 + 2z₃  1/2 − z₃  −1/2 − 3z₃  z₃],  z₁, z₂, z₃ ∈ F.
For b = (b₁, b₂, b₃, b₄) ∈ F⁴, the possible solution of Ay* = b* is

  y* = Bb* = [−½b₁ + ½b₂ + ½b₃ + z₁(2b₁ − b₂ − 3b₃ + b₄);
              ½b₁ − ½b₂ + ½b₃ + z₂(2b₁ − b₂ − 3b₃ + b₄);
              ½b₁ + ½b₂ − ½b₃ + z₃(2b₁ − b₂ − 3b₃ + b₄)].

Therefore, Ay* = b* has a solution if and only if r(A) = r[A | b*], which amounts to 2b₁ − b₂ − 3b₃ + b₄ = 0. In this case, the unique solution is

  y = bA(A*A)⁻¹ = (b₁ b₂ b₃ b₄)[0 1 1; 1 0 1; 1 1 0; 4 1 −1] · (1/60)[9 −15 9; −15 45 −15; 9 −15 29]
    = (−½b₁ + ½b₂ + ½b₃, ½b₁ − ½b₂ + ½b₃, ½b₁ + ½b₂ − ½b₃),

which coincides with y* = Bb*, or y = bB*, above.
Exercises 7–14 are concerned with a real or complex m × n matrix A = [aᵢⱼ]_{m×n}. By the way, readers are required to possess basic knowledge about the inner product ⟨ , ⟩ in Rⁿ or Cⁿ (see Sec. B.9, if necessary).
7. The geometric interpretation of the solution b(AA*)⁻¹A of Ay* = b* with b ≠ 0 Suppose r(A) = m < n. The solution b(AA*)⁻¹A is the one that makes |y| minimum among the many solutions of Ay* = b*. That is, the distance from 0 to the solution set of Ay* = b*,

  min_{Ay* = b*} |y|,

is attained at b(AA*)⁻¹A, with

  perpendicular vector: b(AA*)⁻¹A,
  distance: |b(AA*)⁻¹A|.

See Fig. B.7.
8. The geometric interpretation of bA(A*A)⁻¹ in case r(A) = n < m and b ≠ 0 For a given b ∈ Rᵐ (or Cᵐ), as an approximate solution y ∈ Rⁿ to Ay* = b*, the error function |b* − Ay*| = |b − yA*| attains its minimum at bA(A*A)⁻¹. In particular, if Ay* = b* or
Fig. B.7
yA* = b has a solution, then |b − yA*| attains its minimum 0 at the unique solution bA(A*A)⁻¹. That is,

  min_{y ∈ Rⁿ} |b − yA*|

is attained at bA(A*A)⁻¹. The linear operator A(A*A)⁻¹A*: Rᵐ → R(A*) ⊆ Rᵐ is the orthogonal projection of Rᵐ onto the column space R(A*) of A, with

  projection vector: bA(A*A)⁻¹A* ∈ R(A*),
  distance vector: b − bA(A*A)⁻¹A* ∈ N(A), the left kernel of A, and
  distance from b to R(A*): |b − bA(A*A)⁻¹A*|.
Fig. B.8 (b, its orthogonal projection bA(A*A)⁻¹A* on R(A*), and the distance vector b − bA(A*A)⁻¹A* in N(A))
In short, yA* = b is not solvable for general b ∈ Rᵐ, but bA(A*A)⁻¹ is the optimal solution of yA* = b in the sense that it minimizes the quantity |b − yA*| as y varies in Rⁿ, obtained by solving yA*A = bA. This is the so-called least-squares problem. Take Ex. 6(b) as our example here. |b − yA*| has its minimum at

  bA(A*A)⁻¹ = (b₁ b₂ b₃ b₄) · (1/60)[−6 30 14; 18 −30 38; −6 30 −6; 12 0 −8]
  = (1/30)(−3b₁ + 9b₂ − 3b₃ + 6b₄, 15b₁ − 15b₂ + 15b₃, 7b₁ + 19b₂ − 3b₃ − 4b₄).
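The least-squares formula can be cross-checked against a library solver (a sketch; the right-hand side b is our own sample data):

    import numpy as np

    # Least squares for Ex. 6(b): y A* = b has no solution for general b,
    # so minimize |b - y A*|; the minimizer is y = b A (A*A)^{-1}.
    A = np.array([[0., 1., 1.], [1., 0., 1.], [1., 1., 0.], [4., 1., -1.]])
    b = np.array([1., 2., 3., 4.])

    y_formula = b @ A @ np.linalg.inv(A.T @ A)
    # In column convention, y A* = b reads A y^T = b^T, so lstsq(A, b)
    # minimizes the same error |b - y A*|.
    y_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
    print(np.allclose(y_formula, y_lstsq))           # True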
As a summary, it is worth reviewing some of the main results in order to picture what will be going on in the sequel. For A_{m×n},

(1) If r(A) = m = n, then A has the inverse matrix A⁻¹.
(2) If r(A) = m < n, then A has the right inverse matrix A*(AA*)⁻¹.
(3) If r(A) = n < m, then A has the left inverse matrix (A*A)⁻¹A*.

What happens if r(A) = r < min(m, n)? To answer this question, Exercises 9–14 investigate the generalized inverse matrix or pseudoinverse matrix introduced by E. H. Moore (1935) and R. Penrose (1953).
9. Geometric mapping behavior of AA*: Rᵐ → R(A*) ⊆ Rᵐ The following are equivalent (refer to Fig. B.5).

(a) xAA* = yA*, x ∈ Rᵐ and y ∈ Rⁿ.
(b) y − xA ∈ N(A*) = R(A)^⊥. Therefore,

  y = (y − xA) + xA,  xA ∈ R(A).

(c) xA is the orthogonal projection (see Sec. B.9) of y onto the row space R(A) of A.
(d) |y − xA| = min_{z ∈ Rᵐ} |y − zA|, i.e. |y − xA| is the distance from y to the row space R(A).

Try to figure out the corresponding results for the mapping A*A: Rⁿ → R(A) ⊆ Rⁿ.
10. Characterization of AA∗ : Rm → R(A∗ ) as an orthogonal projection (see
Sec. B.9) The following are equivalent (refer to Fig. B.5).
(a) (geometric) AA∗ : Rm → R(A∗ ) ⊆ Rm is an orthogonal projection
onto the column space R(A∗ ) of A (i.e. AA∗ is symmetric and
(AA∗ )2 = AA∗ ).
(e) (geometric) For x ∈ Rᵐ and y ∈ Rⁿ such that xAA* = yA*, xA is the orthogonal projection of y on the row space R(A) of A and

  xA = yA*A.

Therefore,

  |y − xA| = min_{z ∈ Rᵐ} |y − zA|.

(f) (algebraic) AA*A = A, or A*A | R(A) is an identity mapping.
(g) (geometric) A*A: Rⁿ → R(A) ⊆ Rⁿ is an orthogonal projection from Rⁿ onto the row space R(A) of A (i.e. A*A is symmetric and (A*A)² = A*A).
In case AA*: Rᵐ → Rᵐ is not an orthogonal projection, the equivalent statements in Ex. 10 no longer hold and we fall back on what is stated in Ex. 9. For a given A_{m×n}, it is the deviation of the geometric mapping properties of A* that keeps AA* from being an orthogonal projection. The way to compensate for this shortcoming of A* is to replace A* by an n × m matrix A⁺ so that AA⁺ recovers many of the equivalent properties stated in Ex. 10. This A⁺ is the Moore–Penrose generalized inverse matrix of A.
11. Equivalent definitions of the generalized inverse matrix Suppose Am×n
is a real or complex matrix. For any real or complex n × m matrix A+ ,
the following possible properties of A and A+ are equivalent.
(a) (algebraic)
(1) AA+ A = A.
(2) A+ AA+ = A+ .
(3) AA+ and A+ A are symmetric matrices (or Hermitian in case
complex matrices).
(b) (geometric) AA⁺ and A⁺A are symmetric, and for x ∈ Rᵐ and y ∈ Rⁿ,

  xA − y ∈ N(A⁺), the left kernel of A⁺  ⇔  x − yA⁺ ∈ N(A).
(c) (geometric and algebraic) For given b ∈ Rⁿ, among all solutions or approximate solutions x of xA = b, it is the vector

  x₀ = bA⁺

that satisfies

  (1) |b − x₀A| = min_{z ∈ Rᵐ} |b − zA|, and
  (2) |x₀| ≤ |x| for any other x attaining the minimum in (1);

x₀ is usually called the optimal solution of the equations xA = b under the constrained conditions (1) and (2).

Such an A⁺ exists uniquely once A is given (see Ex. 12), and is called the generalized inverse (matrix) or pseudoinverse of A.
12. Existence and uniqueness of A⁺ For any permutation σ: {1, 2, ..., m} → {1, 2, ..., m}, the m × m matrix

  [e_{σ(1)}; ⋮; e_{σ(m)}]_{m×m}

  A = BC, where

  B = P⁻¹[I_r; A₂₁A₁₁⁻¹] with C = (A₁₁ A₁₂)Q⁻¹,  or
  B = P⁻¹[A₁₁; A₂₁] with C = (I_r A₁₁⁻¹A₁₂)Q⁻¹.
Note that B_{m×r} and C_{r×n} are such that r(B) = r(C) = r. Therefore, the generalized inverse is

  A⁺ = C*(CC*)⁻¹(B*B)⁻¹B*.

In particular,

  A⁺ = A*(AA*)⁻¹ if r(A) = m < n;  A⁺ = (A*A)⁻¹A* if r(A) = n < m;  A⁺ = A⁻¹ if r(A) = m = n.
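The full-rank factorization formula can be compared directly with a library pseudoinverse (a sketch; the factors B and C are our own sample data of rank r = 2):

    import numpy as np

    # A = BC with r(B) = r(C) = r gives A+ = C*(CC*)^{-1}(B*B)^{-1}B*.
    B = np.array([[1., 0.], [0., 1.], [1., 1.]])         # 3x2, rank 2
    C = np.array([[1., 2., 0., 1.], [0., 1., 1., 0.]])   # 2x4, rank 2
    A = B @ C                                            # rank 2 < min(3, 4)

    A_plus = C.T @ np.linalg.inv(C @ C.T) @ np.linalg.inv(B.T @ B) @ B.T
    print(np.allclose(A_plus, np.linalg.pinv(A)))        # True
    print(np.allclose(A @ A_plus @ A, A))                # property (1) of Ex. 11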
Consider A⁺: Rⁿ → Rᵐ as the linear transformation defined by y ↦ yA⁺. Then

  R(A⁺) = R(A*),  R((A⁺)*) = R(A),

and, of course, r(A⁺) = r(A). Hence,

(1) AA⁺: Rᵐ → Rᵐ is the orthogonal projection of Rᵐ onto the column space R(A*) of A, i.e. AA⁺ is symmetric and (AA⁺)² = AA⁺.
(2) A⁺A: Rⁿ → Rⁿ is the orthogonal projection of Rⁿ onto the row space R(A) of A, i.e. A⁺A is symmetric and (A⁺A)² = A⁺A.

Remember that Rⁿ = N(A*) ⊕ R(A) and R(A)^⊥ = N(A*). (1) suggests that A⁺: Rⁿ → Rᵐ takes the row space R(A) back to the column space R(A*) and that its restriction to the right kernel N(A*) is zero. Because r(A) = r(A*), the restriction A | R(A*): R(A*) → R(A) is invertible. Therefore,

  A⁺ = (A | R(A*))⁻¹: R(A) → R(A*)

inverts A | R(A*). I agree with Gilbert Strang's comment that this is the one natural best definition of an inverse (see [17, p. 853]). See Fig. B.9 (compare with Fig. B.5).
For y ∈ Rⁿ such that y = (y − xA) + xA, the relation yA* = xAA* is equivalent to yA⁺ = xAA⁺ (see Exs. 10 and 11).
Fig. B.9 (AA⁺ is the orthogonal projection of Rᵐ onto R(A*); A | R(A*): R(A*) → R(A) is an isomorphism, inverted by A⁺)
  (AB)⁺ = B⁺A⁺.

  (A⁺)⁺ = A.

Hence O⁺ₙₓₙ = O.

(6) O⁺_{m×n} = O_{n×m}.
(7) If P_{m×m} and Q_{n×n} are invertible, then (PAQ)⁺ = Q⁻¹A⁺P⁻¹.
(8) Suppose

  A = [diag[a₁₁, ..., a_rr] O; O O]_{m×n}, where aᵢᵢ ≠ 0 for 1 ≤ i ≤ r.

Then

  A⁺ = [diag[1/a₁₁, ..., 1/a_rr] O; O O]_{n×m}.
Show that

  A⁺ = (1/27)[−9 −9 0; 9 0 −9; −3 −6 −3; 6 3 −3]

and, for b ∈ R³, find

  min_{xAA* = bA*} |x|.
(b) Let

  A = [1 1 1; −1 −1 −1] and B = [−1 1; 1 −1; 1 −1].
  P = [x₁; ⋮; x_r; x_{r+1}; ⋮; xₘ]_{m×m}

is an orthogonal matrix. Therefore, A can be written in the following singular value decomposition:

  A = P⁻¹DQ,  where D = [diag[λ₁, ..., λ_r] O; O O]_{m×n}.
Fig. B.10 (the singular value decomposition: P and Q carry the pairs N(A), R(A*) and N(A*), R(A) to coordinate subspaces, where A acts as the diagonal matrix D)
⇒ |⟨x, y⟩| ≤ |x||y|,

where equality holds if and only if y = βx or x = βy for some β ∈ F. This is called the Cauchy–Schwarz inequality. From it follows the triangle inequality

    |x + y| ≤ |x| + |y|.

In case V is a real inner product space, then for x ≠ 0 and y ≠ 0,

    −1 ≤ ⟨x, y⟩ / (|x||y|) ≤ +1.

Therefore, it is reasonable to define the cosine of the angle θ between x and y by

    cos θ = ⟨x, y⟩ / (|x||y|).

We can now reinterpret ⟨x, y⟩ as the signed orthogonal projection |y| cos θ of y along x multiplied by the length |x| of x itself. See Fig. B.11. Try to use these concepts to catch the idea used in the proof of the Cauchy–Schwarz inequality above.
Fig. B.11: y decomposes as the orthogonal projection (⟨y, x⟩/|x|²) x along x plus the orthogonal vector y − (⟨y, x⟩/|x|²) x.
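A small numerical sketch of the inequality and the induced angle (the vectors are our own):

    import numpy as np

    x = np.array([1., 2., 2.])
    y = np.array([2., 0., 1.])
    ip = x @ y                                    # <x, y>
    nx, ny = np.linalg.norm(x), np.linalg.norm(y)
    print(abs(ip) <= nx * ny)                     # Cauchy-Schwarz: True
    cos_theta = ip / (nx * ny)
    # <x, y> equals |x| times the signed projection |y| cos(theta):
    print(np.isclose(ip, nx * (ny * cos_theta)))  # True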
⟨eᵢ, eⱼ⟩ = δᵢⱼ,  1 ≤ i, j ≤ n.
Exercises
In what follows, V will always denote an inner product space with inner product ⟨ , ⟩, which is usually not mentioned explicitly.
1. For x = Σᵢ₌₁ⁿ αᵢxᵢ and y = Σⱼ₌₁ⁿ βⱼxⱼ with respect to a basis B = {x₁, . . . , xₙ} for V,

    ⟨x, y⟩ = Σ_{i,j=1}^{n} αᵢ β̄ⱼ ⟨xᵢ, xⱼ⟩ = [x]_B A_B [ȳ]*_B,

where

    A_B = [⟨xᵢ, xⱼ⟩]_{n×n} = [ ⟨x₁, x₁⟩ ⟨x₁, x₂⟩ · · · ⟨x₁, xₙ⟩
                               ⟨x₂, x₁⟩ ⟨x₂, x₂⟩ · · · ⟨x₂, xₙ⟩
                                  ⋮         ⋮              ⋮
                               ⟨xₙ, x₁⟩ ⟨xₙ, x₂⟩ · · · ⟨xₙ, xₙ⟩ ].

Moreover, A_B is Hermitian,

    Ā*_B = A_B  (or A*_B = A_B in the real case),

and positive definite:

    v A_B v̄* ≥ 0 for any v ∈ Fⁿ,

with equality only if v = 0. Conversely, for any positive-definite Hermitian matrix A ∈ M(n; C) and a basis B for V,

    ⟨x, y⟩ = [x]_B A [ȳ]*_B

defines an inner product on V. For another basis B′ of V,

    A_{B′} = P A_B P̄*,

where P = [1_V]^{B′}_B is the transition matrix from B′ to B (see Sec. B.7).
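A sketch of this computation in coordinates (the basis and vectors are our own; the standard dot product of R² plays the role of ⟨ , ⟩):

    import numpy as np

    b1, b2 = np.array([1., 1.]), np.array([1., 2.])    # a basis B of R^2
    AB = np.array([[b1 @ b1, b1 @ b2],
                   [b2 @ b1, b2 @ b2]])                # A_B = [<x_i, x_j>]
    xB, yB = np.array([3., -1.]), np.array([0., 2.])   # coordinates w.r.t. B
    x = xB[0] * b1 + xB[1] * b2
    y = yB[0] * b1 + yB[1] * b2
    print(np.isclose(x @ y, xB @ AB @ yB))             # <x,y> via A_B: True
    print(np.all(np.linalg.eigvalsh(AB) > 0))          # A_B positive definite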
2. The Gram–Schmidt orthogonalization process Suppose {y₁, y₂, . . . , yₖ, . . .} is linearly independent in V. Define

    x₁ = y₁,
    x_{k+1} = y_{k+1} − Σ_{j=1}^{k} (⟨y_{k+1}, xⱼ⟩ / |xⱼ|²) xⱼ,  k ≥ 1.

Then,

1. ⟨⟨x₁, . . . , xₖ⟩⟩ = ⟨⟨y₁, . . . , yₖ⟩⟩, k ≥ 1.
2. x_{k+1} is orthogonal to every vector in ⟨⟨y₁, . . . , yₖ⟩⟩; in symbols, x_{k+1} ⊥ ⟨⟨y₁, . . . , yₖ⟩⟩, k ≥ 1; moreover, x_{k+1} is the orthogonal vector from y_{k+1} to the subspace ⟨⟨y₁, . . . , yₖ⟩⟩.

Hence {x₁, x₂, . . . , xₖ, . . .} is an orthogonal set and {x₁/|x₁|, x₂/|x₂|, . . . , xₖ/|xₖ|, . . .} is an orthonormal set. (A numerical sketch of the process follows this exercise.)
(b) The (n + 1)-dimensional real vector space Pₙ(R) (see Sec. B.3) has a basis B = {1, x, x², . . . , xⁿ}. Use this to show that the (n + 1) × (n + 1) matrix

    [ 1        1/2      1/3      · · ·  1/n       1/(n+1)
      1/2      1/3      1/4      · · ·  1/(n+1)   1/(n+2)
       ⋮         ⋮        ⋮                ⋮          ⋮
      1/n      1/(n+1)  1/(n+2)  · · ·  1/(2n−1)  1/(2n)
      1/(n+1)  1/(n+2)  1/(n+3)  · · ·  1/(2n)    1/(2n+1) ]

is positive-definite. Also, {y₀, y₁, . . . , yₙ}, where

    y₀(t) = 1,
    y₁(t) = t,
    y₂(t) = t² − 1/3,
      ⋮
    yₖ(t) = dᵏ/dtᵏ (t² − 1)ᵏ,  1 ≤ k ≤ n,

is an orthogonal basis. How does one get an orthonormal basis?
(c) V has orthogonal (or orthonormal) sets of vectors.
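A sketch of the process of Ex. 2 in classical form (the function and data are ours):

    import numpy as np

    def gram_schmidt(ys):
        # Orthogonalize a linearly independent list, as in Ex. 2.
        xs = []
        for y in ys:
            x = y.astype(float).copy()
            for xk in xs:
                x -= (y @ xk) / (xk @ xk) * xk   # remove component along xk
            xs.append(x)
        return xs

    ys = [np.array([1., 1., 0.]),
          np.array([1., 0., 1.]),
          np.array([0., 1., 1.])]
    xs = gram_schmidt(ys)
    print(all(np.isclose(xs[i] @ xs[j], 0)
              for i in range(3) for j in range(i)))   # orthogonal: True
    ons = [x / np.linalg.norm(x) for x in xs]         # orthonormal set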
For an orthonormal basis B = {x₁, . . . , xₙ} of V and x ∈ V,

    x = Σᵢ₌₁ⁿ ⟨x, xᵢ⟩ xᵢ,

and, for a linear operator f on V,

    [f]_B = [⟨f(xᵢ), xⱼ⟩]_{n×n}.
S₁ ⊥ S₂. In particular, if S₁ = {x}, this is written briefly as x ⊥ S₂. Let S be a nonempty subset of V. The subspace

    S⊥ = {x ∈ V | x ⊥ S}
|x₀ − y₀| = min over y ∈ S of |x₀ − y|

if and only if (x₀ − y₀) ⊥ S. In this case, y₀ is unique and is called the orthogonal projection of x₀ on S, and x₀ − y₀ the orthogonal vector from x₀ to S. Moreover, the Pythagorean theorem

    |x₀|² = |y₀|² + |x₀ − y₀|²

holds.
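A sketch computing the projection y₀ of a point onto a subspace S (here the column space of a matrix M of our choosing) and checking both characterizations:

    import numpy as np

    M = np.array([[1., 0.],
                  [1., 1.],
                  [0., 2.]])                     # S = column space of M
    x0 = np.array([3., 1., 2.])
    coef, *_ = np.linalg.lstsq(M, x0, rcond=None)
    y0 = M @ coef                                # orthogonal projection on S
    print(np.allclose(M.T @ (x0 - y0), 0))       # (x0 - y0) orthogonal to S
    print(np.isclose(x0 @ x0,
                     y0 @ y0 + (x0 - y0) @ (x0 - y0)))   # Pythagoras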
for α₁, . . . , αₙ ∈ F and n ≥ 1, with equality if and only if αᵢ = ⟨x, xᵢ⟩ for 1 ≤ i ≤ n. That is, the minimum is attained at the orthogonal projection vector Σᵢ₌₁ⁿ ⟨x, xᵢ⟩ xᵢ of x. By the way (Bessel's inequality),

    Σᵢ₌₁ⁿ |⟨x, xᵢ⟩|² ≤ |x|²,  n ≥ 1.
fⱼ = f_{xⱼ} for 1 ≤ j ≤ n.

Thus, we identify x with φ(x) = f_x, and V with V*, and call V a self-dual inner product space. We will adopt this convention when dealing with inner product spaces.
(c) For each linear operator f: V → V, there exists a unique linear operator f*: V → V, called the adjoint of f, such that

    ⟨f(x), y⟩ = ⟨x, f*(y)⟩,  x, y ∈ V.

(Via the identification φ: V → V*, the adjoint corresponds to the dual map g: V* → V* of f; notice that f* = φ⁻¹ ∘ g ∘ φ.)
(d) Suppose f, g: V → V are linear operators. Then
(1) (f + g)* = f* + g*.
(2) (αf)* = ᾱf* for α ∈ F.
(3) (g ∘ f)* = f* ∘ g*.
(4) (f*)* = f** = f.
(5) 1*_V = 1_V.
(e) Some special linear operators. Suppose B is an orthonormal basis
for V and f : V → V is a linear operator.
(1) ⟨f(x), f(y)⟩ = ⟨f*(x), f*(y)⟩ for x, y ∈ V.
    ⇔ f ∘ f* = f* ∘ f.
    ⇔ [f]_B [f̄]*_B = [f̄]*_B [f]_B.
Such an f is called a normal operator and [f]_B a normal matrix.

(2) ⟨f(x), f(y)⟩ = ⟨x, y⟩ for x, y ∈ V.
    ⇔ f ∘ f* = f* ∘ f = 1_V.
    ⇔ [f]_B [f̄]*_B = Iₙ, or [f̄]*_B = [f]⁻¹_B.
Such an f is called a unitary operator and [f]_B a unitary matrix.

(3) ⟨f(x), y⟩ = ⟨x, f(y)⟩ for x, y ∈ V.
    ⇔ f* = f.
    ⇔ [f̄]*_B = [f]_B.
f is called a Hermitian operator or self-adjoint operator and [f]_B a Hermitian matrix.

(4) ⟨f(x), y⟩ = −⟨x, f(y)⟩ for x, y ∈ V.
    ⇔ f* = −f.
    ⇔ [f̄]*_B = −[f]_B.
f is called a skew-Hermitian operator and [f]_B a skew-Hermitian matrix.
In case V is a real vector space, a unitary operator (matrix) is usually called an orthogonal operator (matrix), a Hermitian operator (matrix) is called a symmetric operator (matrix), and a skew-Hermitian operator (matrix) is called a skew-symmetric operator (matrix).
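A sketch testing the four classes, with the adjoint realized as the conjugate transpose (the sample matrices are ours):

    import numpy as np

    def adj(M):
        return M.conj().T

    N = np.array([[1., -1.], [1., 1.]])        # normal
    U = np.array([[0., 1.], [-1., 0.]])        # orthogonal (real unitary)
    H = np.array([[2., 1j], [-1j, 3.]])        # Hermitian
    K = np.array([[0., 2.], [-2., 0.]])        # skew-symmetric

    print(np.allclose(N @ adj(N), adj(N) @ N))      # normal
    print(np.allclose(U @ adj(U), np.eye(2)))       # unitary
    print(np.allclose(adj(H), H))                   # Hermitian
    print(np.allclose(adj(K), -K))                  # skew-Hermitian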
Via the coordinatization Φ: V → Cⁿ, Φ(x) = [x]_B, the operator f is represented by the matrix [f]_B:

    [f(x)]_B = [x]_B [f]_B,  x ∈ V.

We shall refer to Ex. 32 of Sec. B.4 and to Sec. B.10 for the concepts of eigenvalues and eigenvectors.
6. Unitary matrices and orthogonal matrices
(a) For a matrix U ∈ M(n; C), the following are equivalent:
(1) U is unitary, i.e. U Ū ∗ = Ū ∗ U = In .
(2) The n row vectors of U form an orthonormal basis for Cn .
(3) The n column vectors of U form an orthonormal basis for Cn .
(4) U (as a linear operator) transforms any orthonormal basis for Cn
into an orthonormal basis for Cn .
(5) U transforms an orthonormal basis for Cn into another one.
(6) U preserves inner products (and hence, orthogonality), i.e.

    ⟨xU, yU⟩ = ⟨x, y⟩,  x, y ∈ Cⁿ.
|f(x) − f(y)| = |x − y|,  x, y ∈ Cⁿ.

    [f(x)]_B = [x₀]_B + [x]_B U,  x ∈ Cⁿ.

    f(x) = x₀ + xP,  x ∈ Rⁿ,

    x → x₀ + x(rP),  r > 0
(f) Suppose A ∈ M(n; C) has rank r = r(A) ≥ 1 and its first r rows are linearly independent. Then there exists a unitary matrix U such that

    A = BU = U⁻¹C,

where B is a lower triangular matrix with its first r diagonal entries positive and the remaining entries all zeros, whereas C is an upper triangular matrix having the same diagonal entries as B. Such B and C are unique. In case A is invertible, U is unique too.
(g) (Schur, 1909) Every complex square matrix is unitarily similar (or equivalent) to a triangular matrix whose main diagonal entries are its (complex) eigenvalues. That is to say, for A ∈ M(n; C), there exists a unitary matrix U such that

    UAU⁻¹ = [ λ₁
              b₂₁ λ₂
              b₃₁ b₃₂ λ₃            0 above the diagonal
               ⋮   ⋮   ⋮   ⋱
              bₙ₁ bₙ₂ bₙ₃ · · · λₙ ]_{n×n},   with U = [ x₁ ; x₂ ; x₃ ; . . . ; xₙ ].

Note that the first row vector x₁ of U is an eigenvector of A corresponding to the eigenvalue λ₁. Refer to Ex. <C> 10(a) of Sec. 2.7.6.
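A sketch using SciPy's Schur routine (assuming SciPy is available); it returns A = Z T Z^H with Z unitary and T upper triangular in column conventions, the transposed picture of the lower triangular UAU⁻¹ above:

    import numpy as np
    from scipy.linalg import schur

    A = np.array([[0., 2., 1.],
                  [1., 0., 1.],
                  [0., 0., 3.]])
    T, Z = schur(A, output='complex')
    print(np.allclose(Z @ T @ Z.conj().T, A))   # A = Z T Z^H
    print(np.allclose(T, np.triu(T)))           # T is upper triangular
    print(np.diag(T))                           # eigenvalues on the diagonal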
7. Normal matrix and its spectral decomposition Let N ∈ M(n; C).
(a) Suppose that N is normal. Then
(1) N has n complex eigenvalues.
(2) Eigenvectors of N corresponding to distinct eigenvalues are orthogonal.
(3) If xN = λx for nonzero x, then xN̄* = λ̄x.
(b) (Schur and Toeplitz, 1910) N is normal if and only if N is unitarily similar to a diagonal matrix. In fact, let λ₁, . . . , λₙ be the eigenvalues of N with corresponding eigenvectors x₁, . . . , xₙ, i.e.

    xⱼN = λⱼxⱼ for 1 ≤ j ≤ n,

such that B = {x₁, . . . , xₙ} is an orthonormal basis for Cⁿ. Then

    U N U⁻¹ = diag(λ₁, . . . , λₙ),   U = [ x₁ ; x₂ ; . . . ; xₙ ],

where U is unitary.
Wⱼ = {x ∈ Cⁿ | xN = λⱼx},  1 ≤ j ≤ r,

    Cⁿ = W₁ ⊕ W₂ ⊕ · · · ⊕ Wᵣ = Wⱼ ⊕ Wⱼ⊥,  1 ≤ j ≤ r,

where

    Wⱼ⊥ = ⊕_{l=1, l≠j}^{r} W_l.
Iₙ = N₁ + · · · + Nᵣ,
    N = λ₁N₁ + · · · + λᵣNᵣ.
Fig. B.12: for x ∈ Cⁿ, the components xN₁, xN₂, xN₃ in the eigenspaces W₁, W₂, W₃, and the decomposition xN = λ₁xN₁ + λ₂xN₂ + λ₃xN₃.
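A sketch of the resolutions Iₙ = Σ Nⱼ and N = Σ λⱼNⱼ for a real symmetric (hence normal) matrix of our own choosing:

    import numpy as np

    N = np.array([[2., 1.],
                  [1., 2.]])                    # eigenvalues 1 and 3
    lam, V = np.linalg.eigh(N)                  # orthonormal eigenvectors
    projs = [np.outer(V[:, j], V[:, j]) for j in range(2)]
    print(np.allclose(sum(projs), np.eye(2)))   # I = N1 + N2
    print(np.allclose(sum(l * P for l, P in zip(lam, projs)), N))
    print(all(np.allclose(P @ P, P) for P in projs))   # each Nj is a projection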
Let

    U = [ x₁₁ ; . . . ; x_{1k₁} ; x₂₁ ; . . . ; x_{2k₂} ; . . . ; x_{r1} ; . . . ; x_{rkᵣ} ].

Then U is unitary and

    U N U⁻¹ = diag(λ₁I_{k₁}, . . . , λᵣI_{kᵣ}).
x₁Ax₁* > 0 and x₂Ax₂* < 0 for some x₁, x₂ ∈ Rⁿ. There are corresponding definitions for Hermitian matrices, with x ranging over Cⁿ.
(a) Suppose A_{n×n} = [aᵢⱼ]_{n×n} is symmetric. The following are equivalent.
(1) A is positive definite.
(2) There exists an invertible real matrix P_{n×n} such that

    A = PP*.

Moreover, P may be taken to be a lower triangular matrix.
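Property (2) with a lower triangular P is exactly the Cholesky factorization; a sketch on our own positive definite example:

    import numpy as np

    A = np.array([[4., 2.],
                  [2., 3.]])
    P = np.linalg.cholesky(A)                   # lower triangular
    print(np.allclose(P @ P.T, A))              # A = P P*
    print(np.all(np.linalg.eigvalsh(A) > 0))    # A is positive definite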
is defined to be that of the matrix [f]_B with respect to any fixed basis B for Fⁿ (see Sec. B.7), while the corresponding eigenvector x is the one that satisfies [x]_B[f]_B = λ[x]_B, which is equivalent to f(x) = λx. A similar definition is still valid for a linear transformation f: V → V where dim V < ∞.
Exercises
1. The following are equivalent:
(1) λ is an eigenvalue of A.
(2) λ is a zero of the characteristic polynomial det(A − tIn ) of A (see
Ex. 32 of Sec. B.4), i.e. det(A−λIn ) = 0 or A−λIn is not invertible.
(3) The kernel Ker(A − λIn ) of the linear transformation A − λIn :
Fn → Fn defined by x → x (A − λIn ) has dimension ≥ 1.
2. Some basic facts Let λ be an eigenvalue of A with corresponding eigenvector x.
(1) λᵏ is an eigenvalue of Aᵏ for any positive integer k, with the same eigenvector x.
(2) λ⁻¹ is an eigenvalue of A⁻¹ if A is invertible, with the same eigenvector x.
(3) The eigenspace corresponding to λ,

    E_λ = {x ∈ Fⁿ | xA = λx} (including x = 0) = Ker(A − λIₙ),

is a subspace of Fⁿ of positive dimension.
Also:
(4) Similar matrices have the same characteristic polynomial and hence the same eigenvalues.
(5) A matrix A and its transpose A* have the same characteristic polynomial and eigenvalues.
(6) Let det(A − tIₙ) = (−1)ⁿtⁿ + α_{n−1}t^{n−1} + · · · + α₁t + α₀ be the characteristic polynomial of A = [aᵢⱼ]_{n×n}. Then (refer to Ex. <C> 4 of Sec. 3.7.6)

    α_{n−1} = (−1)^{n−1} tr A,
    α₀ = det A.

Hence (added to Ex. 16(d) of Sec. B.4): A is invertible if and only if 0 is not an eigenvalue of A.
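A quick numerical check of (6) (numpy.poly returns the coefficients of det(tIₙ − A), which for n = 2 coincides with det(A − tI₂); the matrix is ours):

    import numpy as np

    A = np.array([[1., 2.],
                  [3., 4.]])
    c = np.poly(A)                              # [1, -tr A, det A] for n = 2
    print(np.isclose(c[1], -np.trace(A)))       # True
    print(np.isclose(c[2], np.linalg.det(A)))   # True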
(c) Prove this theorem in case A is diagonalizable (see Sec. B.11), and then for a general matrix A.
Exercises
1. Prove the following.
(a) A matrix is similar to a scalar matrix αIₙ if and only if it is αIₙ itself.
(b) A diagonalizable matrix having only one eigenvalue is a scalar matrix. Therefore,

    [ 1 0 ; 1 1 ],   [ 1 0 0 ; 1 1 0 ; 0 1 1 ],  etc.

are not diagonalizable.
Let A = [ a b ; c d ] be a real matrix. Then there exist at most two distinct real numbers λ such that A − λI₂ is not invertible. If such a number exists, it is an eigenvalue of A.
(a) Suppose A ≠ αI₂ for every α ∈ R. Then A has two distinct real eigenvalues λ₁ and λ₂ if and only if there exists an invertible matrix P such that

    PAP⁻¹ = [ λ₁ 0 ; 0 λ₂ ].
A = [ 0   1   0   · · ·  0
      0   0   1   · · ·  0
      ⋮   ⋮   ⋮          ⋮
      0   0   0   · · ·  1
     −a₀ −a₁ −a₂ · · · −a_{n−1} ]

has the characteristic polynomial (−1)ⁿp(t) and the minimal polynomial p(t) itself.
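A sketch with p(t) = t³ − 2t² − 5t + 6 = (t − 1)(t + 2)(t − 3) (our own example):

    import numpy as np

    a0, a1, a2 = 6., -5., -2.
    A = np.array([[0.,  1.,  0.],
                  [0.,  0.,  1.],
                  [-a0, -a1, -a2]])             # companion matrix of p
    print(np.sort(np.linalg.eigvals(A).real))   # [-2., 1., 3.]: roots of p
    print(np.poly(A))                           # [1., -2., -5., 6.]: p itself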
(a) A is diagonalizable.
(b) There exists a basis B = {x₁, . . . , xₙ} for Fⁿ consisting of eigenvectors of A such that

    PAP⁻¹ = diag(λ₁I_{r₁}, . . . , λₖI_{rₖ}),

where

    P = [ x₁ ; . . . ; x_{r₁} ; . . . ; x_{r₁+···+r_{k−1}+1} ; . . . ; x_{r₁+···+r_{k−1}+rₖ} ]_{n×n}.

Equivalently,

    Fⁿ = E_{λ₁} ⊕ · · · ⊕ E_{λₖ},

and E_{λᵢ} = G_{λᵢ} for each i.
where each Jᵢ is a square matrix of the form [λⱼ]₁ₓ₁ or of the form

    [ λⱼ 0  0  · · · 0  0
      1  λⱼ 0  · · · 0  0
      0  1  λⱼ · · · 0  0
      ⋮  ⋮  ⋮        ⋮  ⋮
      0  0  0  · · · λⱼ 0
      0  0  0  · · · 1  λⱼ ]   (called a Jordan block corresponding to λⱼ)

for some eigenvalue λⱼ of A, and

    P = [ x₁ ; . . . ; xₙ ]_{n×n}.
Exercises
1. Suppose R⁸ has a basis B = {x₁, . . . , x₈} so that, for a matrix A₈ₓ₈,

    PAP⁻¹ = diag( [ 2 0 0 ; 1 2 0 ; 0 1 2 ],  [ 2 ],  [ 3 0 ; 1 3 ],  [ 0 0 ; 1 0 ] ),   P = [ x₁ ; . . . ; x₈ ]_{8×8}.
(a) Show that det(A − tI₈) = (t − 2)⁴(t − 3)²t². Note that, among the basis vectors x₁, . . . , x₈, only x₁, x₄, x₅ and x₇ are eigenvectors of A, corresponding to the eigenvalues λ₁ = 2, λ₂ = 3 and λ₃ = 0 with respective multiplicities 4, 2 and 2, which are the numbers of times the eigenvalues appear on the diagonal of PAP⁻¹.
(b) Determine the eigenspace E_{λᵢ} and the generalized eigenspace G_{λᵢ} (see Sec. B.11) for 1 ≤ i ≤ 3 and see whether E_{λᵢ} = G_{λᵢ} or not.
(c) For each λᵢ, find the smallest positive integer pᵢ for which

    G_{λᵢ} = Ker((A − λᵢI₈)^{pᵢ}),  1 ≤ i ≤ 3.

(d) Show that x₃(A − λ₁I₈) = x₂, x₃(A − λ₁I₈)² = x₁, and that x₁, x₂, x₃, x₄ are linearly independent; hence

    G_{λ₁} = ⟨⟨x₁, x₂, x₃, x₄⟩⟩.

Similarly, G_{λ₂} = ⟨⟨x₅, x₆⟩⟩ with x₅ = x₆(A − λ₂I₈), and G_{λ₃} = ⟨⟨x₇, x₈⟩⟩ with x₇ = x₈(A − λ₃I₈).
dim C(x) = k

and B_x = {x, xA, . . . , xA^{k−1}} is a basis for C(x) for which

    [A | C(x)]_{B_x} = [ 0   1   0   · · ·  0
                         0   0   1   · · ·  0
                         ⋮   ⋮   ⋮          ⋮
                         0   0   0   · · ·  1
                        −a₀ −a₁ −a₂ · · · −a_{k−1} ]_{k×k},

where

    d_x(t) = tᵏ + a_{k−1}t^{k−1} + a_{k−2}t^{k−2} + · · · + a₁t + a₀.

(b) (−1)ᵏ d_x(t) is the characteristic polynomial and d_x(t) itself the minimal polynomial of the restriction A | C(x).
where

    Nᵢ = [ 0 0 · · · 0 0
           1 0 · · · 0 0
           ⋮ ⋮       ⋮ ⋮
           0 0 · · · 1 0 ]_{mᵢ×mᵢ},  1 ≤ i ≤ k.
Here the bases B₁, B₂, . . . , Bₖ correspond to the respective blocks. Then,
7. Jordan canonical form for a matrix Combining Ex. 3 and Ex. 6, here comes the main result. Suppose that λ₁, . . . , λₖ are the distinct eigenvalues of A_{n×n} such that

    G_{λᵢ} = C(x_{i1}) ⊕ · · · ⊕ C(x_{ikᵢ}),  1 ≤ i ≤ k,

and [A | G_{λᵢ}]_{Bᵢ} is as in Ex. 6.
(2) Moreover,

    Fⁿ = G_{λ₁} ⊕ · · · ⊕ G_{λₖ},

where P is the invertible matrix whose row vectors are the basis vectors of B, arranged in a definite ordering.
By Ex. 6,

    G_{λ₁} = ⟨⟨x₁(A − λ₁I₄), x₁⟩⟩ ⊕ ⟨⟨x₂⟩⟩,

and B₁ = {x₁(A − λ₁I₄), x₁, x₂} is a basis for G_{λ₁}. Therefore

    [A | G_{λ₁}]_{B₁} = [ 2 0 0 ; 1 2 0 ; 0 0 2 ].
where

    P = [ x₁(A − λ₁I₄) ; x₁ ; x₂ ; x₃ ] = [ 1 1 1 1
                                            0 1 0 2
                                            0 0 1 0
                                            1 0 0 1 ].
9. Similarity of two matrices Two matrices A_{n×n} and B_{n×n}, each having its Jordan canonical form, are similar if and only if they have the same Jordan canonical form, up to the ordering of their eigenvalues and Jordan blocks. Use this criterion to decide whether two given matrices are similar.
10. Write out all Jordan canonical matrices (up to the orderings of Jordan blocks and eigenvalues) whose characteristic polynomials are the same given polynomial.
The rational canonical form: PAP⁻¹ = diag(R₁, . . . , Rₛ), where each Rᵢ is the companion matrix (refer to Ex. 3(d) of Sec. B.11) of some polynomial p(t)^m, with p(t) a monic divisor of the characteristic polynomial det(A − tIₙ) of A and m a positive integer, or Rᵢ is a 1 × 1 matrix [λ], where λ is an eigenvalue of A, and

    P = [ x₁ ; x₂ ; . . . ; xₙ ].
Exercises (continued)
11. For a matrix A₉ₓ₉, suppose R⁹ has a basis B = {x₁, . . . , x₉} so that

    PAP⁻¹ = diag(R₁, R₂, R₃, R₄),   P = [ x₁ ; x₂ ; . . . ; x₉ ],

where

    R₁ = [ 0 1 0 0 ; 0 0 1 0 ; 0 0 0 1 ; −1 −2 −3 −2 ],  R₂ = [ 0 1 ; −1 −1 ],  R₃ = [ 0 1 ; −1 0 ],  R₄ = [ 3 ].

PAP⁻¹ is the rational canonical form for A, with B the corresponding rational canonical basis.
(a) Show that the characteristic polynomial det(A − tI₉) = −p₁(t)³p₂(t)p₃(t), where p₁(t) = t² + t + 1, p₂(t) = t² + 1 and p₃(t) = t − 3, with the consecutive submatrices R₁, R₂, R₃ and R₄ the respective companion matrices of p₁(t)², p₁(t), p₂(t) and p₃(t). Among the diagonal entries of PAP⁻¹, only 3 is an eigenvalue of A, with x₉ the corresponding eigenvector.
(b) Determine the A-invariant subspaces

    E_{pᵢ} = {x ∈ R⁹ | x pᵢ(A)^m = 0 for some positive integer m}:

    E_{p₁} = ⟨⟨x₁, x₁A, x₁A², x₁A³⟩⟩ ⊕ ⟨⟨x₅, x₅A⟩⟩,
    E_{p₂} = ⟨⟨x₇, x₇A⟩⟩,
    E_{p₃} = ⟨⟨x₉⟩⟩.
E_p = Ker p(A)^m, the annihilator being (p(t))^m, and

    E_p = C(x₁) ⊕ · · · ⊕ C(xₖ),
with respective bases B_{x₁}, B_{x₂}, . . . , B_{xₖ}, where

    x₁p(A)^{m₁} = 0,  x₂p(A)^{m₂} = 0,  . . . ,  xₖp(A)^{mₖ} = 0.
l₁ = (1/d)[dim Fⁿ − r(p(A))] = (1/d) dim Ker p(A),

    lⱼ = (1/d)[r(p(A)^{j−1}) − r(p(A)^{j})]  for j > 1,

where d = deg p(t).
E_{pᵢ} = C(x_{i1}) ⊕ · · · ⊕ C(x_{ikᵢ}),  with [A | E_{pᵢ}]_{Bᵢ} as above, and

    Fⁿ = E_{p₁} ⊕ · · · ⊕ E_{pₖ}.
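A sketch of the counting formula, applied to a matrix assembled (by us) from two companion blocks of p(t) = t² + 1, so that d = 2:

    import numpy as np

    R = np.array([[0., 1.], [-1., 0.]])          # companion matrix of t^2 + 1
    A = np.block([[R, np.zeros((2, 2))],
                  [np.zeros((2, 2)), R]])        # n = 4, two blocks
    d = 2
    pA = A @ A + np.eye(4)                       # p(A) = A^2 + I = 0 here
    l1 = int(4 - np.linalg.matrix_rank(pA)) // d
    print(l1)                                    # 2 blocks in total
    # r(p(A)) = 0 already, so l_j = 0 for j > 1: both blocks have m = 1.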
15. (compare with Ex. 8) Ex. 13(d) indicates how to compute the rational canonical form of a matrix. For example, let

    A = [  0  1  0  0
          −4 −1 −1 −1
          12  3  6  8
          −7 −3 −4 −5 ] ∈ M(4; R).
Step 1 Compute the characteristic polynomial.
By actual computation,

    det(A − tI₄) = t⁴ + 5t² + 6 = (t² + 2)(t² + 3).

Let p₁(t) = t² + 2 and p₂(t) = t² + 3.
Step 2 Determine [A | E_{pᵢ}]_{Bᵢ}.
Now

    A² = [ −4 −1 −1 −1
           −1 −3 −1 −2
            4  3  1  5
           −1 −1 −1 −4 ]

    ⇒ p₁(A) = A² + 2I₄ = [ −2 −1 −1 −1
                           −1 −1 −1 −2
                            4  3  3  5
                           −1 −1 −1 −2 ]   with r(A² + 2I₄) = 2;

      p₂(A) = A² + 3I₄ = [ −1 −1 −1 −1
                           −1  0 −1 −2
                            4  3  4  5
                           −1 −1 −1 −1 ]   with r(A² + 3I₄) = 2.
Hence, there exist x₁ ∈ Ker(A² + 2I₄) and x₂ ∈ Ker(A² + 3I₄) so that B₁ = {x₁, x₁A} is a basis for E_{p₁} and B₂ = {x₂, x₂A} is a basis for E_{p₂}. Therefore,

    [A | E_{p₁}]_{B₁} = [ 0 1 ; −2 0 ]   and   [A | E_{p₂}]_{B₂} = [ 0 1 ; −3 0 ],

and the rational canonical form of A is

    [A]_B = PAP⁻¹ = diag( [ 0 1 ; −2 0 ], [ 0 1 ; −3 0 ] ),   P = [ x₁ ; x₁A ; x₂ ; x₂A ],

where B = B₁ ∪ B₂ is a rational canonical basis.
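Steps 1 and 2 can be reproduced numerically; a sketch (numpy.poly gives the coefficients of det(tI₄ − A), which equals det(A − tI₄) here since n = 4 is even):

    import numpy as np

    A = np.array([[ 0.,  1.,  0.,  0.],
                  [-4., -1., -1., -1.],
                  [12.,  3.,  6.,  8.],
                  [-7., -3., -4., -5.]])
    print(np.poly(A))                                  # approx [1, 0, 5, 0, 6]
    A2 = A @ A
    print(np.linalg.matrix_rank(A2 + 2 * np.eye(4)))   # 2, so dim E_p1 = 2
    print(np.linalg.matrix_rank(A2 + 3 * np.eye(4)))   # 2, so dim E_p2 = 2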
For instance, one may take the rational canonical basis B = {(0, −1, 0, 1), (−3, −2, −3, −4), (−1, 0, 0, 1), (−7, −4, −4, −5)} and

    P = [  0 −1  0  1
          −3 −2 −3 −4
          −1  0  0  1
          −7 −4 −4 −5 ].
B = [  0  1  1  1  1
       2 −2  0 −2 −4
       0  0  1  1  3
      −6  0 −3 −1 −3
       2  2  2  2  4 ] ∈ M(5; R).
m₁ = 2. Now

    B² = [ −2   0   0   0   0
            0  −2   0   0   0
            0   6   4   6  12
            0 −12 −12 −14 −24
            0   6   6   6  10 ]

    ⇒ B² + 2I₅ = [ 0   0   0   0   0
                   0   0   0   0   0
                   0   6   6   6  12
                   0 −12 −12 −12 −24
                   0   6   6   6  12 ]

with r(p₁(B)) = r(B² + 2I₅) = 1. Thus, the first row in the Table for E_p has (1/2)(5 − r(p₁(B))) = 2 dots.
This implies that the only possibility is m₁ = m₂ = 1, and each dot contributes the companion matrix

    [ 0 1 ; −2 0 ].

Hence

    PBP⁻¹ = diag( [ 0 1 ; −2 0 ], [ 0 1 ; −2 0 ], [ 2 ] ),   P = [ x₁ ; x₁B ; x₂ ; x₂B ; x₃ ],

where x₃ is an eigenvector of B corresponding to the eigenvalue 2.
References
On Linear Algebra
[1] (Elementary Geometry), 1998 (in Chinese).
[2] (Linear Algebra), 1989 (in Chinese).
[3] (Determinants), Vol. 1, 1982, 1987 (in Chinese).
[4] (Matrices), Vol. 2, 1982, 1987 (in Chinese).
[5] (Vector and Affine Spaces), Vol. 3, 1983 (in Chinese).
[6] (Projective Spaces), Vol. 4, 1984 (in Chinese).
[7] (Inner Product Spaces), Vol. 5, 1984 (in Chinese).
[8] (Linear Algebra and Theory of Matrices), 1992 (in Chinese).
[9] A. R. Amir-Moez and A. L. Fass: Elements of Linear Algebra, Pergamon Press, The Macmillan Co., New York, 1962.
[10] N. V. Efimov and E. R. Rozendorn: Linear Algebra and Multidimensional
Geometry (English translation), Mir Publishers, Moscow, 1975.
[11] A. E. Fekete: Real Linear Algebra, Marcel Dekker, Inc., New York, Basel, 1985.
[12] H. Gupta: Matrices in n-dimensional Geometry, South Asian Publishers,
New Delhi, 1985.
[13] M. Koecher: Lineare Algebra und Analytischer Geometrie, zweite Auflage,
Grundwissen Mathematik 2, Springer-Verlag, Berlin, Heidelberg, New York,
Tokyo, 1985.
[14] P. S. Modenov and A. Parkhomenko: Geometric Transformations, Academic
Press, New York, London, 1965.
[15] G. Strang: Introduction to Linear Algebra, 2nd ed., Wellesley-Cambridge
Press, 1998.
[16] G. Strang: Linear Algebra and its Applications, 3rd ed., Saunders,
Philadelphia, 1988.
[17] G. Strang: The Fundamental Theorem of Linear Algebra, The American Mathematical Monthly, Vol. 100, No. 9 (1993), 848–855.
[18] J. H. Wilkinson: The Algebraic Eigenvalue Problem, Oxford University Press, New York, 1965.
Other Related Sources
History
[19] M. Kline: Mathematical Thought: From Ancient to Modern Times, Oxford
University Press, New York, 1972.
Group
[20] J. J. Rotman: An Introduction to the Theory of Groups, 4th ed., Springer-
Verlag, New York, 1995.
Geometry
[21] H. Eves: A Survey of Geometry, Vol 1, 1963, Vol 2, 1965, Allyn and Bacon,
Inc., Boston.
Fractal Geometry
[25] M. Barnsley: Fractals Everywhere, Academic Press, San Diego, 1988.
[26] B. B. Mandelbrot: The Fractal Geometry of Nature, W. H. Freeman,
New York, 1983.
Matrix Analysis
[27] R. Bellman: Introduction to Matrix Analysis, SIAM, Philadelphia, 1995.
Complex Analysis
[29] L. V. Ahlfors: Complex Analysis, 3rd ed., McGraw-Hill Book Co., New York,
1979.
[30] L. K. Hua: Harmonic Analysis of Functions of Several Complex Variables in the Classical Domains, Transl. Math. Monographs, Vol. 6, AMS, Providence, R.I., 1963.
Differential Equations
[33] W. E. Boyce and R. C. DiPrima: Elementary Differential Equations, 7th ed., John Wiley & Sons, New York, 2000.
[34] S. J. Farlow: An Introduction to Differential Equations and their Applica-
tions, McGraw-Hill, Inc., New York, 1994.
Fourier Analysis
[35] R. E. Edwards: Fourier Series, A Modern Approach, Vols 1 and 2, Pergamon
Press, Inc., New York, 1964.
[36] E. C. Titchmarsh: Introduction to the Theory of Fourier Integrals, 2nd ed.,
Oxford at the Clarendon Press, 1959.
[37] A. Zygmund: Trigonometric Series, Vols I and II, Cambridge University
Press, New York, 1959.
Markov Chains
[38] K. L. Chung: Elementary Probability Theory with Stochastic Processes, UTM, Springer-Verlag, New York, 1974.
[39] K. L. Chung: Markov Chains, 2nd ed., Springer-Verlag, 1967.
[40] J. L. Doob: Stochastic Processes, Wiley & Sons, New York, 1953.
[41] J. G. Kemeny and J. L. Snell: Finite Markov Chains, Springer-Verlag,
New York, 1976.
Index of Notations
xA* = b or Ax* = b*, A = [aᵢⱼ]_{m×n} : the system of linear equations Σ_{j=1}^{n} aᵢⱼxⱼ = bᵢ, 1 ≤ i ≤ m, where x = (x₁, . . . , xₙ) ∈ Fⁿ and b = (b₁, . . . , bₘ) ∈ Fᵐ (152, 415, 424, 442, 711, 723, 759, etc.)

[A : b*] : augmented matrix of the coefficient matrix A of Ax* = b* or xA* = b, respectively (150, 152, 156, 377, 416, 425, 724, etc.)

[ A B ; C D ] : block or partitioned matrix (180)

A⁺ : the generalized or pseudo inverse of a real or complex matrix A_{m×n}, which is A*(AA*)⁻¹ if r(A) = m; (A*A)⁻¹A* if r(A) = n; C*(CC*)⁻¹(B*B)⁻¹B* if A = B_{m×r}C_{r×n} with r(B) = r(C) = r, where r = r(A) (175, 177, 419, 429, 462, 469, 766, etc.)

det(A − tIₙ) : the characteristic polynomial (−1)ⁿtⁿ + α_{n−1}t^{n−1} + · · · + α₁t + α₀, with α_{n−1} = (−1)^{n−1} tr A and α₀ = det A, of a matrix A_{n×n} (106, 399, 407, 491, 719, 791, etc., 795)

p(A) = Σ_{k=0}^{m} αₖAᵏ : polynomial matrix of A_{n×n} induced by the polynomial p(t) = Σ_{k=0}^{m} αₖtᵏ; note A⁰ = Iₙ (108, 128, 403, 486, 496, 499)

e^A = Σ_{k=0}^{∞} (1/k!)Aᵏ : matrix exponential of A (496, 499, 547, 558)

ρ(A) : spectral radius of A (496)

lim_{k→∞} A^{(k)} : the limit matrix (if it exists) of A^{(k)}_{n×n} as k → ∞ (494)

a, b, x, y (etc.) : vector (8, 26, 692)

α₁a₁ + · · · + αₖaₖ = Σ_{i=1}^{k} αᵢaᵢ : linear combination of the vectors a₁, . . . , aₖ with coefficients (scalars) α₁, . . . , αₖ (31, 324, 695)

a₁a₂ (etc.) : directed line segment from point a₁ to point a₂; line segment with endpoints a₁ and a₂ (18, 65)

Δa₁a₂a₃ : (affine and Euclidean) triangle with vertices at points a₁, a₂ and a₃; base triangle in a barycentric coordinate system for the plane (65, 76)

Δ̄a₁a₂a₃ : oriented triangle (75)

Δa₁a₂a₃a₄ : tetrahedron with vertices at points a₁, a₂, a₃ and a₄; 4-tetrahedron; 4-simplex; base tetrahedron (356, 363, 638, 643)

Δa₀a₁ · · · aₖ : k-tetrahedron or k-simplex, where a₀, a₁, . . . , aₖ are affinely independent points in an affine space (642, 654)

▱a₀a₁ · · · aₖ : k-parallelogram with a₀ as vertex and side vectors a₁ − a₀, . . . , aₖ − a₀, where a₀, a₁, . . . , aₖ are affinely independent points in an affine space; k-hyperparallelepiped (61, 446, 638, 661, 667)

B, C, D, N (etc.) : basis for a finite-dimensional vector space V (10, 34, 115, 326, 406, 697, etc.)

[P]_B or [OP]_B or [x]_B = α : coordinate of a point P in R, or of the vector OP = x, with respect to a basis B (10)

[P]_B or [OP]_B or [x]_B = (x₁, x₂) : coordinate vector of a point P in R², or of the vector OP = x, w.r.t. a basis B = {a₁, a₂}, namely x = Σ_{i=1}^{2} xᵢaᵢ (34)

[P]_B or [OP]_B or [x]_B = (x₁, x₂, x₃) : coordinate vector of a point P in R³, or of the vector OP = x, w.r.t. a basis B = {a₁, a₂, a₃}, namely x = Σ_{i=1}^{3} xᵢaᵢ (326)

F : field (684)

F or F¹ : standard one-dimensional vector space over the field F (693)

Fⁿ : standard n-dimensional vector space over the field F, n ≥ 2 (42, 332, 333, 693)

F₍ᵢ₎₍ⱼ₎ : elementary matrix of type I: interchange of the ith and the jth columns of Iₙ, i ≠ j (150, 160, 442, 720)

F_{α(i)} : elementary matrix of type II: multiplication of the ith column of Iₙ by a scalar α ≠ 0 (150, 160, 442, 720)

F_{(j)+α(i)} : elementary matrix of type III: addition of α times the ith column to the jth column of Iₙ (150, 160, 442, 720)

f: A → B : function from a set A to a set B (682)

f⁻¹: B → A : inverse function of f: A → B, when f is both one-to-one and onto (684)

f|_S: S → B : the restriction of f: A → B to a subset S ⊆ A (683)

f ∘ g : the composite of g followed by f (683)

f: V → W : linear transformation from a vector space V to a vector space W over the same field F; in case W = V, called a linear operator; in case W = F, called a linear functional (57, 58, 84, 366, 732)

F(X, F) : the vector space of functions from a set X to a field F (333, 693)

Ker(f), N(f) : kernel or null space of f: {x ∈ V | f(x) = 0} (56, 58, 85, 124, 366, 407, 412, 734)

Im(f), R(f) : image or range space of f: {f(x) | x ∈ V} (56, 58, 85, 124, 366, 407, 412, 734)

Ga(n; R) : the set { [ A O ; x₀ 1 ] | A ∈ GL(n; R) and x₀ ∈ Rⁿ } (238, 239, 578, 580, 656)

R³_Γ(O; A₁, A₂, A₃) : coordinatized space {[P]_B | P ∈ Γ} of the space Γ w.r.t. the basis B = {a₁, a₂, a₃}, where aᵢ = OAᵢ, i = 1, 2, 3, and [P]_B = (x₁, x₂, x₃) (326)

R³ : standard three-dimensional real vector space (327)

Rⁿ (n ≥ 2) : standard n-dimensional affine, or Euclidean (inner product), or vector space over the real field R (42, 327, 693, 773)

(R³)* : (first) dual space of R³ (420)

(R³)** : second dual space of R³ (420)

T(x) = x₀ + f(x) or x₀ + xA : affine transformation (67, 83, 236)

x = OX (etc.) : vector (5, 26, 29, 37, 692)

−x = (−1)x : negative of x, or inverse of x under addition (6, 26, 27, 29, 692)

x + y : sum or addition of the vectors x and y (26, 29, 692)

αx : scalar multiplication of x by the scalar α (6, 27, 29, 692)

x − y : x + (−y) (27, 692)

⟨⟨x⟩⟩ : subspace generated or spanned by x (41)

⟨⟨x₁, x₂⟩⟩ : subspace generated or spanned by x₁ and x₂ (330)

⟨⟨x₁, . . . , xₖ⟩⟩ : subspace generated or spanned by {x₁, . . . , xₖ} (41, 330, 695)

x → x₀ + x : translation along the vector x₀ (67, 73, 236, 247, 251)

x₀ + S : image of a subspace S under x → x₀ + x, an affine subspace (67, 359)

⟨x, y⟩ : inner product of x and y (773)

|x| = ⟨x, x⟩^{1/2} : length of x (360, 774)

x ⊥ y : x and y are perpendicular or orthogonal to each other: ⟨x, y⟩ = 0 (129, 412, 774)

V, W (etc.) : vector or linear space over a field (691)

V* : (first) dual space of V (749)

V** = (V*)* : second dual space of V (751)

V/S : quotient space of V modulo the subspace S (200, 211, 369, etc., 695)
INDEX
adjoint linear operator, 422, 754, 781
  (also called) dual operator in dual space, 422, 754
  self-adjoint or symmetric operator, 782, 786
adjoint matrix, 336, 731
affine
  basis (or frame), 9, 10, 19, 31, 34, 71, 239, 324, 326, 362, 580, 640
    base point (origin), 9, 10, 19, 31, 34, 71, 239, 324, 326, 362, 580, 640
    coordinate vector, 10, 34, 326, 640
    orthonormal affine basis, 613, 625
    natural affine basis for R², 239
    natural affine basis for R³, 581
  coordinate system, 19, 34, 38, 324, 326, 328, 362, 640
    affine coordinate, 19, 72, 239, 363, 641
    barycentric coordinate, 19, 72, 363, 642
    barycenter, 19, 72, 363
  dependence of points, 32, 71, 239, 325, 362, 580, 640
  geometry, 292, 640
    invariant, 19, 90, 286, 374, 636, 638, 679
  group, 238, 240, 580, 581, 656
  independence of points, 33, 71, 239, 325, 362, 580, 640
  invariant, 19, 90, 286, 310, 374, 636
  space, 9, 12, 31, 34, 235, 324, 326, 640
    one-dimensional, 8, 12
    two-dimensional, 31, 36
    three-dimensional, 324, 326
    n-dimensional, 235
  difference space, 235
    zero vector, 235, 236
    base point, 235
    position vector (with initial point and terminal point), 235
    free vector, 235
    difference vector, 236
  affine subspace, 236
  affine transformation or mapping, 237
    affine motion, 237, 240
  subspace, 67, 236, 359, 425, 640, 724
    dimension, 235
    k-dimensional plane, 640
    hyperplane, 641
    dimension (or intersection) theorem, 644
    operation of affine subspaces, 643
      intersection, 644
      sum, 644
    relative positions of affine subspaces, 645
      coincident, 645
      parallel, 645
      skew (etc.), 645
range (subspace, space), 56, 85, 734
rank, 89, 124, 366, 368(etc.), 383, 406, 711, 722, 734, 739
rank theorem for matrices or linear operators (see also normal form of a matrix), 140, 164, 436, 438
rank decomposition theorem of a matrix, 184
rational canonical form (see canonical form), 233, 399, 562, 569, 574, 809(etc.)
  A-cycle (of a matrix A), 570, 802
  annihilator, 214, 570, 802
  companion matrix, 571, 574, 796, 802
  rational canonical basis, 233, 562, 574, 813
real field, 12, 685
real general linear group on Rⁿ (see group)
real inner product space (see inner product), 773
real vector space, 29, 692
rectangular coordinate system (see coordinate system), 38, 328
reflection (see affine, transformation), 253, 274, 387, 461, 468, 544, 592
Riesz representation theorem (see inner product space), 780
right inverse, 142, 175, 423, 457, 760
right invertible, 142, 423, 457, 760
right kernel, 124, 407, 757
rotation (see orthogonal, transformation), 272
row
  rank, 125, 367(etc.), 383, 407, 711, 739
  space, 125, 407, 757
  vector, 125, 367(etc.), 375, 703
row-reduced echelon matrix, 164, 443(etc.), 464, 721
  implications and applications, 722–727
Schur's formula (see block matrix), 182
set, 681
  member; element, 681
  subset; proper subset, 681
  empty set, 681
  union, 681
  intersection, 681
  difference, 681
  Cartesian product, 681
shearing (see affine, transformation), 100, 265, 392, 612
signed length (of a segment), 6
similar matrix (see matrix, similar)
simplex, 642, 654, 662
  0-simplex, i.e. a point, 5
  1-simplex, i.e. a line segment, 5, 18
  2-simplex, usually called a triangle in R² (see triangle), 65
  3-simplex, usually called a tetrahedron in R³ (see tetrahedron)
  k-simplex or k-tetrahedron (in Rⁿ), 590, 642, 654, 662
    vertex, 642
    edge, 642
    face, 642
  open simplex, 654
  boundary, interior, exterior, 662
  separation, 663
  barycentric subdivision, 666
  Euler formula, 667
  k-dimensional volume, 641
simultaneously
  diagonalizable, 209, 213
  lower or upper triangularizable, 215
singular matrix (see matrix)
singular value (of a matrix), 177, 771
  decomposition theorem (SVD), 178, 771
  polar decomposition (see the index), 772
generalized inverse (see the index), 176, 766(etc.)
standard
  zero-dimensional vector space {0}, 694, 698
  one-dimensional vector space R, 12
  two-dimensional vector space R², 37
  three-dimensional vector space R³, 327
  n-dimensional vector space Rⁿ or Cⁿ or Fⁿ, 693, 698
  basis or natural basis (see basis)
  inner product or natural inner product for Rⁿ or Cⁿ (see inner product), 773
stretch or stretching (see affine, transformation)
Sylvester's law of inertia (of a symmetric matrix), 166, 454, 467, 470, 471
  index, 166, 454, 467, 470, 471
  signature, 166, 454, 467, 470, 471
  rank, 166, 454, 467, 470, 471
tensor product (of two matrices), 148
tetrahedron in R³ (see simplex), 356, 363, 401, 638, 642
  vertex, 356
  edge, 356
  face, 356
  median plane, 356
triangle in R² (see simplex), 65
  vertex, 65
  side, 65
  median, 65
  centroid, 65
transition matrix
  between bases (see change of coordinates)
  in Markov chain (see Markov chain)
translation (see affine, transformation)
transpose (see matrix)
unitary matrix (see inner product space, adjoint operator), 782, 783, 786
  unitarily similar, 786
upper triangular matrix (see matrix), 161, 701
Vandermonde determinant, 750
vector
  line vector, 7
    zero vector, 7
    identical vectors, 7
    parallel invariance of vectors, 7
    scalar multiplication (product), 8, 9
    linear dependence, 9
    linear independence, 1
  plane vector, 24
    zero vector, 24, 27
    identical vectors, 25
    parallelogram law or parallel invariance, 25
    negative of a vector, 26
    addition (sum) of vectors, 26
    scalar multiplication, 27
    subtraction vector, 27
    scalar, 28
    linear combination: coefficient, 31
    linear dependence, 32, 40
    linear independence, 33, 41
  spatial vector, 29, 320
  in abstract sense, 692
    addition, 692
    sum, 692
    scalar multiplication, 692
    scalar, 684, 692
    product, 692
    zero vector, 692
    negative or inverse of a vector, 692
    linear combination: coefficient, 695
    linear dependence, 696
    linear independence, 696