3210 Course
Syllabus:
1. Definition and fundamental properties of a metric space. Open sets, closed sets,
closure and interior. Convergence of sequences. Continuity of mappings. (6)
2. Real inner-product spaces, orthonormal sequences, perpendicular distance to a
subspace, applications in approximation theory. (7)
3. Cauchy sequences, completeness of R with the standard metric; uniform convergence
and completeness of C[a, b] with the uniform metric. (3)
4. The contraction mapping theorem, with applications in the solution of equations
and differential equations. (5)
5. Connectedness and path-connectedness. Introduction to compactness and sequential
compactness, including subsets of Rn . (6)
LECTURE 1
Books:
Victor Bryant, Metric spaces: iteration and application, Cambridge, 1985.
M. Ó. Searcóid, Metric Spaces, Springer Undergraduate Mathematics Series, 2006.
D. Kreider, An introduction to linear analysis, Addison-Wesley, 1966.
The aim of the course is to take ideas such as distance, convergence and continuity, familiar from real analysis on the real line, and extend them to other settings.
[DIAGRAM]
This will include the ideas of distances between functions, for example.
1.1 Definition
Let X be a non-empty set. A metric on X, or distance function, associates to each
pair of elements x, y ∈ X a real number d(x, y) such that
(i) d(x, y) ≥ 0; and d(x, y) = 0 ⇐⇒ x = y (positive definite);
(ii) d(x, y) = d(y, x) (symmetric);
(iii) d(x, z) ≤ d(x, y) + d(y, z) (triangle inequality).
Examples:
(i) X = R. The standard metric is given by d(x, y) = |x − y|. There are many other
metrics on R, for example
d(x, y) = |eˣ − eʸ|;
d(x, y) = |x − y| if |x − y| ≤ 1, and d(x, y) = 1 if |x − y| ≥ 1.
Let X be any set whatsoever; then we can define
d(x, y) = 1 if x ≠ y, and d(x, y) = 0 if x = y (the discrete metric).
(ii) X = R². Among the metrics here are d1(x, y) = |x1 − y1| + |x2 − y2|, the Euclidean metric d2(x, y) = ((x1 − y1)² + (x2 − y2)²)^{1/2}, and d∞(x, y) = max(|x1 − y1|, |x2 − y2|).
Let’s check the axioms for d1. In fact (i) and (ii) are easy (i.e., the distance is positive definite,
symmetric); for (iii) let’s write |x1 − y1| = p, |x2 − y2| = q, |y1 − z1| = r and |y2 − z2| = s.
Then |x1 − z1| ≤ p + r and |x2 − z2| ≤ q + s; so
d1(x, z) ≤ (p + r) + (q + s) = d1(x, y) + d1(y, z),
by inspection.
(iii) Take X = C[a, b], the continuous real-valued functions on [a, b]. Here are three metrics. First,
d2(f, g) = ( ∫_a^b (f(x) − g(x))² dx )^{1/2}.
Again, this is linked to the idea of an inner product, so we will delay proving that it is
a metric. The other two are
d1(f, g) = ∫_a^b |f(x) − g(x)| dx,
d∞(f, g) = max{ |f(x) − g(x)| : a ≤ x ≤ b }.
For example, take f(x) = x and g(x) = x² in C[0, 1]. Then
d2(f, g) = ( ∫_0^1 (x − x²)² dx )^{1/2} = √(1/30),
d1(f, g) = ∫_0^1 |x − x²| dx = 1/6, and
d∞(f, g) = max_{x∈[0,1]} |x − x²| = 1/4.
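The three distances in this example are easy to check numerically. A minimal Python sketch (it approximates the integrals by averaging over a fine grid rather than evaluating them exactly):

```python
import numpy as np

# f(x) = x and g(x) = x^2 on [0, 1]; approximate d1, d2, d_infinity.
x = np.linspace(0.0, 1.0, 200_001)
diff = np.abs(x - x**2)

d1 = diff.mean()                # approximates the integral of |f - g| over [0, 1]
d2 = np.sqrt((diff**2).mean())  # approximates sqrt of the integral of (f - g)^2
dinf = diff.max()               # maximum of |f - g|

print(d1, d2, dinf)             # roughly 1/6, sqrt(1/30) = 0.1826..., 1/4
```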
1.2 Definition
A set X together with a metric d is called a metric space, sometimes written (X, d). If
A ⊆ X then we can use d to measure distances between points of A, and (A, d) is also
a metric space, called a subspace of (X, d).
LECTURE 2
Examples:
1. The interval [a, b] with d(x, y) = |x − y| is a subspace of R.
2. The unit circle {(x1, x2) ∈ R² : x1² + x2² = 1} with d(x, y) = √((x1 − y1)² + (x2 − y2)²)
is a subspace of R².
3. The space of polynomials P is a metric space with any of the metrics inherited from
C[a, b] above.
1.3 Definition
Let (X, d) be a metric space, let x ∈ X and let r > 0. The open ball centred at x, with
radius r, is the set
B(x, r) = {y ∈ X : d(x, y) < r},
and the closed ball is the set
B[x, r] = {y ∈ X : d(x, y) ≤ r}.
Note that in R with the usual metric the open ball is B(x, r) = (x − r, x + r), an open
interval, and the closed ball is B[x, r] = [x − r, x + r], a closed interval.
For the d2 metric on R2 , the unit ball, B(0, 1), is the disc centred at the origin, excluding
the boundary. You may like to think about what you get for other metrics on R2 .
1.4 Definition
A subset U of a metric space (X, d) is said to be open, if for each point x ∈ U there is
an r > 0 such that the open ball B(x, r) is contained in U (“room to swing a cat”).
Clearly X itself is an open set, and by convention the empty set ∅ is also considered
to be open.
1.5 Proposition
Every “open ball” B(x, r) is an open set.
Proof: For if y ∈ B(x, r), choose δ = r − d(x, y). We claim that B(y, δ) ⊂ B(x, r).
If z ∈ B(y, δ), i.e., d(z, y) < δ, then by the triangle inequality
d(x, z) ≤ d(x, y) + d(y, z) < d(x, y) + δ = r.
So z ∈ B(x, r).
1.6 Definition
A subset F of (X, d) is said to be closed, if its complement X \ F is open.
Note that closed does not mean “not open”. In a metric space the sets ∅ and X are
both open and closed. In R we have:
(a, b) is open.
[a, b] is closed, since its complement (−∞, a) ∪ (b, ∞) is open.
[a, b) is not open, since there is no open ball B(a, r) contained in the set. Nor is it
closed, since its complement (−∞, a) ∪ [b, ∞) isn’t open (no ball centred at b can be
contained in the set).
1.7 Example
If we take the discrete metric, d(x, y) = 1 if x ≠ y and d(x, y) = 0 if x = y,
then each one-point set {x} = B(x, 1/2), so it is an open set. Hence every set U is open, since
for x ∈ U we have B(x, 1/2) ⊆ U .
1.8 Proposition
In a metric space, every one-point set {x0 } is closed.
1.9 Theorem
Let (Uα)α∈A be any collection of open subsets of a metric space (X, d) (not necessarily
finite!). Then ∪_{α∈A} Uα is open. Let U and V be open subsets of a metric space (X, d).
Then U ∩ V is open. Hence (by induction) any finite intersection of open subsets is open.
Proof: If x ∈ ∪_{α∈A} Uα then there is an α with x ∈ Uα. Now Uα is open, so
B(x, r) ⊂ Uα for some r > 0. Then B(x, r) ⊂ ∪_{α∈A} Uα, so the union is open.
If now U and V are open and x ∈ U ∩ V, then ∃ r > 0 and s > 0 such that B(x, r) ⊂ U
and B(x, s) ⊂ V, since U and V are open. Then B(x, t) ⊂ U ∩ V if t ≤ min(r, s).
[DIAGRAM.]
So the collection of open sets is preserved by arbitrary unions and finite intersections.
However, an arbitrary intersection of open sets is not always open; for example (−1/n, 1/n)
is open for each n = 1, 2, 3, . . ., but ∩_{n=1}^∞ (−1/n, 1/n) = {0}, which is not an open set.
LECTURE 3
For closed sets we swap union and intersection.
1.10 Theorem
Let (Fα)α∈A be any collection of closed subsets of a metric space (X, d) (not necessarily
finite!). Then ∩_{α∈A} Fα is closed. Let F and G be closed subsets of a metric space
(X, d). Then F ∪ G is closed. Hence (by induction) any finite union of closed
subsets is closed.
To prove this we recall de Morgan’s laws. We use the notation S c for the complement
X \ S of a set S ⊂ X.
x ∉ ∪_α Aα ⟺ x ∉ Aα for all α, so (∪_α Aα)^c = ∩_α Aα^c.
x ∉ ∩_α Aα ⟺ x ∉ Aα for some α, so (∩_α Aα)^c = ∪_α Aα^c.
Proof: Write Uα = Fα^c = X \ Fα, which is open. So ∪_{α∈A} Uα is open by Theorem
1.9. Now, by de Morgan’s laws, (∩_{α∈A} Fα)^c = ∪_{α∈A} Fα^c. This is just ∪_{α∈A} Uα. Since
its complement is open, ∩_{α∈A} Fα is closed.
Similarly, the complement of F ∪ G is F c ∩ Gc , which is the intersection of two open
sets and hence open by Theorem 1.9. Hence F ∪ G is closed.
Infinite unions of closed sets do not need to be closed. An example is
∪_{n=1}^∞ [1/n, ∞) = (0, ∞), which is open but not closed.
1.11 Definition
The closure of S, written S̄, is the smallest closed set containing S: it contains S and is contained
in all other closed sets containing S. Also S is dense if S̄ = X.
A smallest closed set containing S does exist, because we can define
S̄ = ∩ {F : F ⊃ S, F closed},
the intersection of all closed sets containing S. There is at least one, namely X itself.
1.12 Example in R
The closure of S = [0, 1) is [0, 1]. This is closed, and there is nothing smaller that is
closed and contains S.
1.13 Theorem
The set Q of rationals is dense in R, with the usual metric.
1.14 Proposition
Let S ⊂ X. Then:
(i) S ⊂ S̄.
(ii) S̄ = S ⇐⇒ S is closed (so the closure of S̄ is S̄ itself).
(iii) S ⊂ T ⇒ S̄ ⊂ T̄.
(iv) the closure of ∅ is ∅, and the closure of X is X.
(v) the closure of S ∪ T is S̄ ∪ T̄.
(vi) the closure of S ∩ T is contained in S̄ ∩ T̄.
Proof: All these are quite easy except (v) and (vi) (CHECK).
But we don’t need to have equality in (vi); for example X = R, S = (0, 1), T = (1, 2). Then
the closure of S ∩ T is the closure of ∅, which is ∅, whereas S̄ ∩ T̄ = [0, 1] ∩ [1, 2] = {1}.
1.15 Definition
We say that V is a neighbourhood (nhd) of x if there is an open set U such that
x ∈ U ⊆ V ; this means that ∃δ > 0 s.t. B(x, δ) ⊆ V . Thus a set is open precisely
when it is a neighbourhood of each of its points.
1.16 Example
The half-open interval [0, 1) is a neighbourhood of every point in it except for 0.
1.17 Theorem
For a subset S of a metric space X, we have x ∈ S̄ iff V ∩ S ≠ ∅ for all nhds V of x
(i.e., all neighbourhoods of x meet S).
LECTURE 4
1.18 Definition
The interior of S, int S, is the largest open set contained in S, and can be written as
int S = ∪ {U : U ⊂ S, U open},
the union of all open sets contained in S. There is at least one, namely ∅.
1.19 Examples in R
int[0, 1) = (0, 1); clearly this is open and there is no larger open set contained in [0, 1).
int Q = ∅. For any non-empty open set contains an interval B(x, r), and then it
contains an irrational number, so it isn’t contained in Q.
1.20 Proposition
int S = X \ (the closure of X \ S).
1.21 Corollary
(i) int S ⊂ S.
(ii) int S = S ⇐⇒ S is open.
(iii) S ⊂ T ⇒ int S ⊂ int T .
(iv) int (int S) = int S.
(v) int(S ∪ T ) ⊃ int S ∪ int T .
(vi) int(S ∩ T ) = int S ∩ int T .
Proof: Easy, or take complements and use Prop’s 1.14 and 1.20.
1.22 Definition
The boundary or frontier of S is ∂S = S̄ \ int S = S̄ ∩ (the closure of X \ S).
This writes ∂S as the intersection of two closed sets, so it is also closed.
1.23 Examples in R
For S = [0, 1) we have int S = (0, 1) and S̄ = [0, 1], so ∂S = {0, 1}.
1.24 Examples in R2
For S = {(x, y) : x² + y² < 1}, we have int S = S and S̄ = {(x, y) : x² + y² ≤ 1}, so
∂S is the circle {(x, y) : x² + y² = 1}.
2.1 Definition
We say xn → x (i.e., xn tends to x or converges to x) if d(xn , x) → 0 as n → ∞. That
is, for all ε > 0 there is an N such that d(xn , x) < ε for n ≥ N (“for n sufficiently
large”).
This is the usual notion of convergence if we think of points in Rm with the Euclidean
metric.
2.2 Theorem
(i) The sequence (xn ) tends to x if and only if for every open U with x ∈ U , ∃n0 s.t.
xn ∈ U for all n ≥ n0 .
(ii) Let S be a subset of the metric space X. Then x ∈ S̄ if and only if there is a
sequence (xn) of points of S with xn → x.
Proof of (i): if xn → x and U is an open set containing x, then B(x, ε) ⊂ U for some ε > 0, and there is an n0 with d(xn, x) < ε for all n ≥ n0; so xn ∈ U for n ≥ n0.
Conversely, if the “open set” condition works, and ε > 0, choose U = B(x, ε). Then
xn ∈ U for n sufficiently large, and so d(xn, x) < ε for n large.
2.3 Examples
1. Take (R², d1), where d1(x, y) = |x1 − y1| + |x2 − y2|, where x = (x1, x2) and
y = (y1, y2), and consider the sequence (1/n, (2n + 1)/(n + 1)). We guess its limit is (0, 2). To see if
this is right, look at
d1( (1/n, (2n + 1)/(n + 1)), (0, 2) ) = 1/n + |(2n + 1)/(n + 1) − 2| = 1/n + 1/(n + 1) → 0.
LECTURE 5
2. In C[0, 1] let fn (t) = tn and f (t) = 0 for 0 ≤ t ≤ 1. Does fn → f , (a) in d1 , and (b)
in d∞ ?
(a) d1(fn, f) = ∫_0^1 tⁿ dt = 1/(n + 1) → 0
as n → ∞. So fn → f in d1.
(b) d∞(fn, f) = max{tⁿ : 0 ≤ t ≤ 1} = 1, which does not tend to 0
as n → ∞. So fn does not converge to f in d∞.
Note: say gn → g pointwise on [a, b] as n → ∞ if gn(x) → g(x) for all x ∈ [a, b]. If we
define g(x) = 0 for 0 ≤ x < 1 and g(1) = 1, then fn → g pointwise on [0, 1]. But g ∉ C[0, 1], as
it is not continuous at 1.
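The different behaviour of d1 and d∞ for fn(t) = tⁿ can also be seen numerically; a minimal sketch:

```python
import numpy as np

# fn(t) = t^n and f = 0 on [0, 1]: d1(fn, f) = 1/(n+1) -> 0, d_inf(fn, f) = 1.
t = np.linspace(0.0, 1.0, 200_001)
for n in (1, 5, 20, 100):
    fn = t**n
    print(n, fn.mean(), fn.max())   # d1 is roughly 1/(n+1); d_infinity is exactly 1
```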
A result on convergence in R2 .
2.4 Proposition
Take R2 with any of the metrics d1 , d2 and d∞ . Then a sequence xn = (an , bn ) con-
verges to x = (a, b) if and only if an → a and bn → b.
Proof: We have d1 (xn , x) = |an − a| + |bn − b|. This tends to zero as n → ∞ if and
only if each of the terms |an − a| and |bn − b| does. And that’s the same as saying that
an → a and bn → b.
Also d2 (xn , x) = (|an − a|2 + |bn − b|2 )1/2 , which tends to zero if and only if |an − a|2 +
|bn − b|2 does; this happens if and only if |an − a|2 and |bn − b|2 tend to zero, which is
the same as an → a and bn → b.
Finally, d∞ (xn , x) = max{|an − a|, |bn − b|}. If this tends to zero then so do |an − a|
and |bn − b| as they are smaller and still positive; and if they both tend to zero then
so does their maximum, which is less than their sum. Again this is the same as saying
an → a and bn → b.
2.5 Theorem
If fn → f in (C[a, b], d∞ ), then fn → f in (C[a, b], d1 ).
Proof: d1(fn, f) = ∫_a^b |fn(x) − f(x)| dx ≤ (b − a) d∞(fn, f),
so d1(fn, f) → 0 as n → ∞.
Note: It is also true that if d∞ (fn , f ) → 0 then fn → f pointwise on [a, b]. The
converse is FALSE.
2.6 Definition
Let f : (X, dX ) → (Y, dY ) be a map between metric spaces. We say that f is continuous
at x0 ∈ X if for each ε > 0 there is a δ > 0 such that dY (f (x), f (x0 )) < ε whenever
dX (x, x0 ) < δ.
So f is continuous if it is continuous at all points of X.
2.7 Proposition
For f as above, f is continuous at x0 if, whenever a sequence xn → x0 , then f (xn ) →
f (x0 ) (“sequential continuity”).
Notation: for U ⊆ Y we write f⁻¹(U) = {x ∈ X : f(x) ∈ U}.
2.8 Theorem
A function f : X → Y is continuous if and only if f −1 (U ) is open in X for every open
subset U ⊂ Y .
(“The inverse image of an open set is open.” Note that for f continuous we do not
expect f (U ) to be open for all open subsets of X, for example f : R → R, f ≡ 0, then
f (R) = {0}, not open.)
LECTURE 6
Proof: Suppose that f is continuous, that U is open, and that x0 ∈ f −1 (U ), so
f (x0 ) ∈ U . Now there is a ball B(f (x0 ), ε) ⊂ U , since U is open, and then by
continuity there is a δ > 0 such that dY (f (x), f (x0 )) < ε whenever dX (x, x0 ) < δ. This
means that for d(x, x0 ) < δ, f (x) ∈ U and so x ∈ f −1 (U ). That is, f −1 (U ) is open.
[DIAGRAM]
Conversely, if the inverse image of an open set is open, and x0 ∈ X, let ε > 0 be given.
We know that B(f (x0 ), ε) is open, so f −1 (B(f (x0 ), ε)) is open, and contains x0 . So it
contains some B(x0 , δ) with δ > 0.
But now if d(x, x0 ) < δ, we have x ∈ B(x0 , δ) ⊂ f −1 (B(f (x0 ), ε)) so f (x) ∈ B(f (x0 ), ε)
and we have d(f (x), f (x0 )) < ε.
2.9 Example
Let X = R with the discrete metric, and Y any metric space. Then all functions
f : X → Y are continuous!
(i) Because the inverse image of an open set is an open set, since all sets are open.
(ii) Because whenever xn → x0 we have xn = x0 for n large, so obviously f (xn ) → f (x0 ).
2.10 Proposition
(i) A function f : X → Y is continuous if and only if f −1 (F ) is closed whenever F is
a closed subset of Y .
[DIAGRAM]
(ii) If f : X → Y and g : Y → Z are continuous, then so is g ◦ f : X → Z.
Proof of (ii): Take U ⊂ Z open; then (g ◦ f)⁻¹(U) = f⁻¹(g⁻¹(U)); for these are the points which
map under f into g⁻¹(U), so that they map under g ◦ f into U.
Now g −1 (U ) is open in Y , as g is continuous, and then f −1 (g −1 (U )) is open in X since
f is continuous.
2.11 Definition
A function f : X → Y is a homeomorphism between metric spaces if it is a bijection
s.t. f and f −1 are continuous. Then we say X and Y are homeomorphic, or X ∼ Y .
2.12 Example
The real line R is homeomorphic to the open interval (0, 1). For if we take y = tan⁻¹ x,
this maps R homeomorphically onto (−π/2, π/2), and this can be mapped
homeomorphically onto (0, 1), e.g. by z = (1/π)(y + π/2).
3 Real inner-product spaces
Notation: vectors written u, v, w, etc. (Sometimes just u, v, w).
Scalars written a, b, c, etc.
Functions written f , g, h.
Coordinates of a vector u normally written u1, u2, u3, etc.
The standard inner product of u, v ∈ Rⁿ is
⟨u, v⟩ = u1v1 + u2v2 + . . . + unvn.
For example,
⟨(1, 2, 3, 4), (0, −1, 5, 2)⟩ = 1·0 − 2·1 + 3·5 + 4·2 = 21.
N.B. In quantum mechanics and elsewhere people use complex inner products. Not in
this course.
LECTURE 7
3.4 Examples
1. The usual inner product on Rn .
2. We can define a new inner product on R² by
⟨u, v⟩ = 2u1v1 + 3u2v2.
Easily checked to be linear (do it!) and symmetric. For positive definiteness, note that
⟨u, u⟩ = 2u1² + 3u2² ≥ 0,
and is > 0 unless u1 = u2 = 0.
The following alternative is not an inner product: e.g. define
⟨u, v⟩ = 2u1v1 − 3u2v2,
so ⟨u, u⟩ = 2u1² − 3u2², which would be negative if u = (0, 1), say.
3. For a < b define C[a, b] to be the vector space of all continuous real functions on
[a, b].
For f, g ∈ C[a, b] define
⟨f, g⟩ = ∫_a^b f(x)g(x) dx.
Example: in C[0, 1], let f(x) = x + 1 and g(x) = 2x. Then
⟨f, g⟩ = ∫_0^1 (x + 1)(2x) dx = ∫_0^1 (2x² + 2x) dx = [2x³/3 + x²]₀¹ = 5/3.
3.5 Other properties of inner products
(a) ⟨u, av + bw⟩ = ⟨av + bw, u⟩ (rule II) = a⟨v, u⟩ + b⟨w, u⟩ (rule I) = a⟨u, v⟩ + b⟨u, w⟩
(rule II again).
So it is linear in the second argument as well as the first.
(b) ⟨0, u⟩ = ⟨0u + 0u, u⟩ = 0⟨u, u⟩ + 0⟨u, u⟩ = 0 for all u, using rule I.
Also ⟨u, 0⟩ = ⟨0, u⟩ = 0, using rule II. This is for any u ∈ V.
4 Lengths, angles, orthogonality
4.1 Definition
In an inner product space we define the length of a vector v (sometimes called its size
or norm) by
‖v‖ = √⟨v, v⟩.
Note that ⟨v, v⟩ is always ≥ 0; also by property III, ‖v‖ = 0 if and only if v = 0.
This agrees with what we usually do in Rⁿ, e.g. v = (3, 4, −12); then ‖v‖² = 3² + 4² +
(−12)² = 9 + 16 + 144 = 169, so ‖v‖ = √169 = 13.
Example: in C[−1, 1] let f(x) = x. Then
‖f‖² = ∫_{−1}^1 x² dx = [x³/3]_{−1}^1 = 2/3,
so ‖f‖ = √(2/3).
Note that if v ∈ V and a ∈ R, then ‖av‖ = |a| ‖v‖, since ⟨av, av⟩ = a²⟨v, v⟩ = a²‖v‖².
4.2 Definition
The angle between two non-zero vectors u and v is the unique solution θ to
cos θ = ⟨u, v⟩ / (‖u‖ ‖v‖)
in the range 0 ≤ θ ≤ π (radians!). It is easy to check that the angle between u and u
is 0, and the angle between u and −u is π.
We say u and v are orthogonal if ⟨u, v⟩ = 0. This is because the angle between them
satisfies cos θ = 0, so θ = π/2. This is sometimes written u ⊥ v.
To make sense of our definition we will need to know that
−1 ≤ ⟨u, v⟩ / (‖u‖ ‖v‖) ≤ 1,
which will follow from the Cauchy–Schwarz inequality below.
Example: in C[0, 1] take f(x) = 1 and g(x) = 1 + ax, say; then ⟨f, g⟩ = ∫_0^1 (1 + ax) dx = 1 + a/2,
so ⟨f, g⟩ = 0 ⟺ 1 + a/2 = 0, or a = −2.
More generally, a set of vectors {u1, . . . , uN} is an orthogonal set if ⟨ui, uj⟩ = 0 whenever i ≠ j.
(Pythagoras.) If u and v are orthogonal, then ‖u + v‖² = ‖u‖² + ‖v‖².
Proof:
‖u + v‖² = ⟨u + v, u + v⟩
= ⟨u, u⟩ + ⟨v, u⟩ + ⟨u, v⟩ + ⟨v, v⟩
= ‖u‖² + 0 + 0 + ‖v‖²,
using orthogonality.
(Parallelogram law.) ‖u + v‖² + ‖u − v‖² = 2‖u‖² + 2‖v‖².
[DIAGRAM – draw a parallelogram.] The sum of the squares of the two diagonals
equals the sum of the squares of the four sides.
Proof: expand the inner products; see the example sheets.
(Cauchy–Schwarz inequality.) In an inner-product space, |⟨u, v⟩| ≤ ‖u‖ ‖v‖. In particular, in Rⁿ,
|u1v1 + . . . + unvn| ≤ (u1² + . . . + un²)^{1/2} (v1² + . . . + vn²)^{1/2}.
Note that the LHS is |⟨u, v⟩| and the RHS is ‖u‖ ‖v‖, where u = (u1, . . . , un),
v = (v1, . . . , vn) and we use the standard inner product in Rⁿ.
LECTURE 8
We give two proofs, and in each we assume that u ≠ 0 and v ≠ 0 (otherwise the
inequality is obvious).
Proof 1:
Take
‖au − bv‖² = a²‖u‖² − 2ab⟨u, v⟩ + b²‖v‖² ≥ 0,
with a = ⟨u, v⟩ and b = ‖u‖². We get
⟨u, v⟩²‖u‖² − 2⟨u, v⟩²‖u‖² + ‖u‖⁴‖v‖² ≥ 0,
i.e., ⟨u, v⟩²‖u‖² ≤ ‖u‖⁴‖v‖². Dividing by ‖u‖² (which is non-zero) gives ⟨u, v⟩² ≤ ‖u‖²‖v‖², i.e., |⟨u, v⟩| ≤ ‖u‖ ‖v‖.
Proof 2: For t ∈ R consider
q(t) = ‖tu + v‖² = t²⟨u, u⟩ + 2t⟨u, v⟩ + ⟨v, v⟩ ≥ 0.
This quadratic in t is minimized when q′(t) = 2t⟨u, u⟩ + 2⟨u, v⟩ = 0, i.e., at t = −⟨u, v⟩/‖u‖². Substituting, the minimum value is ‖v‖² − ⟨u, v⟩²/‖u‖² ≥ 0, which again gives |⟨u, v⟩| ≤ ‖u‖ ‖v‖.
NOW we know that ⟨u, v⟩ / (‖u‖ ‖v‖) lies between −1 and 1, and so the definition of angle
makes sense.
5.2 Triangle inequality
In an inner product space we have
‖u + v‖ ≤ ‖u‖ + ‖v‖.
In Rⁿ (with the standard inner product) this says
( Σ_{i=1}^n (ui + vi)² )^{1/2} ≤ ( Σ_{i=1}^n ui² )^{1/2} + ( Σ_{i=1}^n vi² )^{1/2}.
Proof:
‖u + v‖² = ⟨u + v, u + v⟩
= ‖u‖² + 2⟨u, v⟩ + ‖v‖²
≤ ‖u‖² + 2‖u‖ ‖v‖ + ‖v‖² = (‖u‖ + ‖v‖)²,
using the Cauchy–Schwarz inequality.
5.3 Theorem
In an inner-product space the norm (length) of a vector satisfies
(i) ‖u‖ ≥ 0, and ‖u‖ = 0 if and only if u = 0;
(ii) ‖au‖ = |a| ‖u‖;
(iii) ‖u + v‖ ≤ ‖u‖ + ‖v‖.
5.4 Corollary
Let V be an inner-product space, and define d(x, y) = kx − yk. Then d is a metric.
Proof: From Theorem 5.3, we see easily that d(x, y) ≥ 0 and d(x, y) = 0 if and
only if x − y = 0, i.e., x = y.
Also d(x, y) = ‖x − y‖ = ‖y − x‖ = d(y, x).
Finally, d(x, z) = ‖x − z‖ = ‖(x − y) + (y − z)‖ ≤ ‖x − y‖ + ‖y − z‖ = d(x, y) + d(y, z).
An important example is the sequence space ℓ², consisting of all real sequences u = (uk) with Σ_{k=1}^∞ uk² < ∞.
We shall get a vector space by adding sequences term-wise; if u = (uk) and v = (vk),
then u + v = (uk + vk) and au = (auk), just like vectors with an infinite sequence of
components.
How do we know that (uk + vk) is still in ℓ²? By the triangle inequality in R^N (5.2),
( Σ_{k=1}^N (uk + vk)² )^{1/2} ≤ ( Σ_{k=1}^N uk² )^{1/2} + ( Σ_{k=1}^N vk² )^{1/2}
≤ ( Σ_{k=1}^∞ uk² )^{1/2} + ( Σ_{k=1}^∞ vk² )^{1/2} = A,
say, for every N; so Σ_{k=1}^∞ (uk + vk)² ≤ A², and u + v ∈ ℓ².
We want to define an inner product by ⟨u, v⟩ = Σ_{k=1}^∞ uk vk. By the Cauchy–Schwarz inequality in R^N,
Σ_{k=1}^N |uk vk| ≤ ( Σ_{k=1}^N uk² )^{1/2} ( Σ_{k=1}^N vk² )^{1/2}
≤ ( Σ_{k=1}^∞ uk² )^{1/2} ( Σ_{k=1}^∞ vk² )^{1/2} = B,
say. Hence Σ_{k=1}^∞ |uk vk| converges to a limit which is at most B. So Σ_{k=1}^∞ uk vk is
absolutely convergent.
It is easy now to check that this defines an inner product. Also ‖u‖² = ⟨u, u⟩ = Σ_{k=1}^∞ uk², so it is like Rⁿ with n = ∞. It is an infinite-dimensional vector space, but a
very useful one.
LECTURE 9
6 Orthonormal sets
6.1 Definition
A set of vectors {e1, . . . , en} in an inner product space is orthonormal if it is orthogonal
and each vector has norm 1. So
⟨ei, ej⟩ = 0 if i ≠ j, and ⟨ei, ej⟩ = 1 if i = j.
If it’s also a basis for the inner product space, then we call it an orthonormal basis.
Examples:
(i) (1, 0, 0), (0, 1, 0), (0, 0, 1) is an orthonormal basis of R3 (the standard basis);
(ii) An unusual orthonormal basis of R² is e1 = (3/5, 4/5) and e2 = (−4/5, 3/5).
[DIAGRAM – draw the vectors]
6.2 Proposition
If {e1, . . . , en} is orthonormal, then
‖ Σ_{i=1}^n ai ei ‖ = ( Σ_{i=1}^n ai² )^{1/2},
for any scalars a1, . . . , an, and so the vectors {e1, . . . , en} are linearly independent.
Proof:
⟨ Σ_{i=1}^n ai ei, Σ_{j=1}^n aj ej ⟩ = Σ_{i=1}^n Σ_{j=1}^n ai aj ⟨ei, ej⟩,
by (3.5). All terms except for those with i = j are zero, and we get Σ_{i=1}^n ai², as required.
(If Σ ai ei = 0 then Σ ai² = 0, so every ai = 0; this gives the linear independence.)
In general (the Gram–Schmidt process), given linearly independent vectors v1, . . . , vn, we set e1 = v1/‖v1‖, and then
wk+1 = vk+1 − Σ_{i=1}^k ⟨vk+1, ei⟩ ei,  and  ek+1 = wk+1/‖wk+1‖.
Then {e1, . . . , en} are orthonormal and for each k the vectors e1, . . . , ek span the same
space as v1, . . . , vk.
So each new vector wk+1, and hence also ek+1, is orthogonal to the earlier ej. It isn’t
zero, since vk+1 is independent of v1, . . . , vk.
Finally,
e3 = w3/‖w3‖ = (−2, 4, −4, 2)/√40 = (1/√10)(−1, 2, −2, 1).
Having done this, CHECK that ⟨ei, ej⟩ = 1 if i = j, and 0 if i ≠ j.
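The general step above is easy to implement. A minimal sketch of the procedure (the input vectors here are illustrative, not the example from the lectures):

```python
import numpy as np

def gram_schmidt(vectors):
    # Orthonormalise independent vectors: w = v - sum <v, e_i> e_i, then e = w/||w||.
    es = []
    for v in vectors:
        w = np.array(v, dtype=float)
        for e in es:
            w -= np.dot(v, e) * e          # remove the component along e_i
        es.append(w / np.linalg.norm(w))   # w is nonzero since the inputs are independent
    return es

es = gram_schmidt([[1, 1, 1, 1], [1, 2, 3, 4], [1, 4, 9, 16]])
print(np.round([[np.dot(a, b) for b in es] for a in es], 10))   # identity matrix
```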
Example (Legendre polynomials):
Take the functions 1, t, t2 , t3 , . . . in C[−1, 1] with inner product
Z 1
hf, gi = f (t)g(t) dt.
−1
Now ‖1‖² = ∫_{−1}^1 1 dt = 2, so e1(t) = 1/√2.
Next take
w2(t) = t − ⟨t, e1⟩ e1 = t − ( ∫_{−1}^1 t · (1/√2) dt ) (1/√2) = t − 0 = t.
Also
‖w2‖² = ∫_{−1}^1 t² dt = 2/3, so e2(t) = w2(t)/‖w2‖ = √(3/2) t.
Then w3(t) = t² − ⟨t², e1⟩ e1 − ⟨t², e2⟩ e2 = t² − 1/3, and so on.
LECTURE 10
Examples:
1. Take R3 with the usual inner product and W a plane through the origin.
The closest point of W is obtained by “dropping a perpendicular onto W ”.
[DIAGRAM]
2. Find the best approximation to the function f (t) = |t| on [−1, 1] by a quadratic
g(t) = a + bt + ct2 , in the sense of minimizing
‖f − g‖² = ∫_{−1}^1 (f(t) − g(t))² dt.
7.1 Theorem
Let W be a (finite-dimensional) subspace of an inner-product space V , let v ∈ V , and
let w ∈ W satisfy
hv − w, zi = 0 for all z ∈ W.
Then kv − yk ≥ kv − wk for all y ∈ W . That is, w is the closest point in W to v, and
it is unique.
7.2 Definition
If W is a subspace of an inner product space V , then its orthogonal complement, W ⊥ ,
is the set of all vectors u that are orthogonal to every vector of W .
Clearly 0 ∈ W ⊥ , and indeed W ⊥ is a subspace, since if u1 and u2 are orthogonal to
everything in W , then ha1 u1 + a2 u2 , wi = a1 hu1 , wi + a2 hu2 , wi = 0 for all w ∈ W .
Example: if W is the 1-dimensional subspace of R3 spanned by the vector w = (3, 5, 7)
then x = (x1 , x2 , x3 ) is in W ⊥ if and only if hx, wi = 0, i.e., 3x1 + 5x2 + 7x3 = 0. This
is the plane perpendicular to W .
If W has a basis {w1, . . . , wn}, then w = c1w1 + . . . + cnwn satisfies ⟨v − w, z⟩ = 0 for all z ∈ W if and only if it satisfies the normal equations
c1⟨w1, wi⟩ + . . . + cn⟨wn, wi⟩ = ⟨v, wi⟩
for each i = 1, . . . , n.
Example.
In C[−1, 1] we take f (t) = |t|; to approximate it by a quadratic take w1 (t) = 1,
w2 (t) = t and w3 (t) = t2 .
The best approximant c0 + c1 t + c2 t2 to |t| satisfies:
2c0 + 0 + (2/3)c2 = ∫_{−1}^1 |t| dt = 1,
0 + (2/3)c1 + 0 = ∫_{−1}^1 |t| t dt = 0,
(2/3)c0 + 0 + (2/5)c2 = ∫_{−1}^1 t² |t| dt = 1/2.
Note that ∫_{−1}^1 |t| dt = ∫_{−1}^0 (−t) dt + ∫_0^1 t dt, etc.
The solution to these equations is c0 = 3/16, c1 = 0 and c2 = 15/16, giving the approximation
|t| ≈ 3/16 + (15/16) t².
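The same coefficients come from solving the normal equations numerically; a small check (the matrix entries are the inner products ⟨wi, wj⟩ and the right-hand sides the integrals computed above):

```python
import numpy as np

G = np.array([[2.0, 0.0, 2/3],     # <1,1>, <1,t>, <1,t^2> on [-1, 1]
              [0.0, 2/3, 0.0],     # <t,1>, <t,t>, <t,t^2>
              [2/3, 0.0, 2/5]])    # <t^2,1>, <t^2,t>, <t^2,t^2>
b = np.array([1.0, 0.0, 0.5])      # integrals of |t|, t|t|, t^2|t|
print(np.linalg.solve(G, b))       # [0.1875, 0, 0.9375] = [3/16, 0, 15/16]
```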
7.4 Corollary
Suppose that e1 , . . . , en is an orthonormal basis for W . Then the best approximant of
v ∈ V by an element of W is
w = Σ_{k=1}^n ⟨v, ek⟩ ek.
Proof: Let w = Σ_{k=1}^n ck ek. Then the normal equations become
Σ_{k=1}^n ck ⟨ek, ei⟩ = ⟨v, ei⟩,
i.e., ci = ⟨v, ei⟩ for each i, since ⟨ek, ei⟩ = 0 for k ≠ i and ⟨ei, ei⟩ = 1.
Thus we could have solved the example of approximating f (t) = |t| by using an or-
thonormal basis for the quadratic polynomials, e.g. the Legendre functions.
7.5 Definition
The orthogonal projection of v onto W , written PW v, is the closest vector w ∈ W to
v. In particular,
PW v = Σ_{k=1}^n ⟨v, ek⟩ ek,
whenever {e1, . . . , en} is an orthonormal basis of W.
LECTURE 11
Example: the plane W = {(x1 , x2 , x3 ) ∈ R3 : x1 + x2 + x3 = 0} is a 2-dimensional
subspace with orthonormal basis e1 = (1/√2)(1, −1, 0) and e2 = (1/√6)(1, 1, −2). CHECK
that these are orthonormal and lie in W (so, since dim W = 2, they are also a basis for
it).
[DIAGRAM]
7.6 Least squares approximation
Suppose we have data (x1, y1), . . . , (xn, yn) and we expect a relation of the form y ≈ cx.
We decide to minimize Σ_{i=1}^n (yi − cxi)², least squares approximation, useful in statistical applications.
This is the same as taking x = (x1 , . . . , xn ) and y = (y1 , . . . , yn ) in Rn and minimizing
ky − cxk.
Take V to be Rn , usual inner product, and W to be the one-dimensional subspace
{ax : a ∈ R}. This is the same as finding the closest point to y in W .
Solution: take
c = ⟨y, x⟩ / ⟨x, x⟩,
since this is the orthogonal projection onto W. In detail, w = cx, and the normal
equation is ⟨w, x⟩ = ⟨y, x⟩, or c⟨x, x⟩ = ⟨y, x⟩.
So
c = (x1y1 + . . . + xnyn) / (x1² + . . . + xn²).
Example: find the best fit to the data
x y
2 3
1 2
3 3
4 5
Solution:
c = (2·3 + 1·2 + 3·3 + 4·5) / (2² + 1² + 3² + 4²) = 37/30.
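A one-line check of this value:

```python
xs, ys = [2, 1, 3, 4], [3, 2, 3, 5]
c = sum(x*y for x, y in zip(xs, ys)) / sum(x*x for x in xs)
print(c)   # 37/30 = 1.2333...
```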
7.7 Generalization
Suppose that y is known/guessed to be a linear combination of m variables x1 , . . . , xm ,
y = c1 x1 + . . . + cm xm , so we have experimental data
x1    x2    . . .   xm    y
x11   x21   . . .   xm1   y1
⋮      ⋮            ⋮      ⋮
x1n   x2n   . . .   xmn   yn
i.e., we regard xj = (xj1, . . . , xjn) and y = (y1, . . . , yn) as vectors in Rⁿ and minimize
‖y − (c1x1 + . . . + cmxm)‖, leading to the normal equations ⟨c1x1 + . . . + cmxm, xi⟩ = ⟨y, xi⟩ for i = 1, . . . , m.
To get a unique solution we need the vectors x1 , . . . , xm to be independent, which re-
quires n ≥ m.
Example: Use the method of least squares approximation to find the best relation of
the form y = c1 x1 + c2 x2 fitting the following experimental data:
x1 x2 y
i) 1 0 2
ii) 0 1 3
iii) 1 1 2
iv) 1 −1 0
Solution: take x1 = (1, 0, 1, 1), x2 = (0, 1, 1, −1) and y = (2, 3, 2, 0). The normal equations are
3c1 + 0c2 = 4
0c1 + 3c2 = 5,
giving c1 = 4/3 and c2 = 5/3. Comparing the data with the fitted values:
x1 x2 yexperimental ytheoretical
1 0 2 4/3
0 1 3 5/3
1 1 2 3
1 −1 0 −1/3
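The same fit can be obtained from a general least-squares routine, since minimising ‖y − (c1x1 + c2x2)‖ is exactly the problem numpy.linalg.lstsq solves; a minimal sketch:

```python
import numpy as np

A = np.array([[1, 0], [0, 1], [1, 1], [1, -1]], dtype=float)  # columns are x1 and x2
y = np.array([2, 3, 2, 0], dtype=float)
c, *_ = np.linalg.lstsq(A, y, rcond=None)
print(c)       # [4/3, 5/3]
print(A @ c)   # the "theoretical" values 4/3, 5/3, 3, -1/3
```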
7.8 Curve fitting
Given (x1 , y1 ), . . . , (xn , yn ) find a (polynomial) curve which fits these points well in the
sense of least squares approximation.
Example: Find the parabola y = c0 + c1 x + c2 x2 which best fits the points (0, 0),
(1, 4), (−1, 1), (−2, 5).
Solution: Apply the method of least squares approximation to y = c0 x0 + c1 x1 + c2 x2
with x0 = 1, x1 = x and x2 = x2 .
Put x0 = (1, 1, 1, 1), x1 = (0, 1, −1, −2), x2 = (0, 1, 1, 4), y = (0, 4, 1, 5). Note that x0
is the vector with all components 1, x1 the vector of x values, and and x2 the vector
of x2 values.
Normal equations are
4c0 − 2c1 + 6c2 = 10
−2c0 + 6c1 − 8c2 = −7
6c0 − 8c1 + 18c2 = 25,
giving c0 = 3/10, c1 = 8/5 and c2 = 2, i.e., y = 3/10 + (8/5)x + 2x².
Example: find the best straight line y = c0 + c1x fitting the data of the example in 7.6.
Solution: let x0 = (1, 1, 1, 1), x1 = (2, 1, 3, 4) and y = (3, 2, 3, 5). So we want
y ≈ c0 x 0 + c1 x 1 .
Normal equations are
4c0 + 10c1 = 13
10c0 + 30c1 = 37,
giving c0 = 1 and c1 = 9/10, or y = 1 + (9/10)x.
LECTURE 12
Often we think of convergent sequences as ones where xn and xm are close together
when n and m are large. This is almost, but not quite, the same thing.
8.1 Definition
A sequence (xn ) in a metric space (X, d) is a Cauchy sequence if for any ε > 0 there is
an N such that d(xn , xm ) < ε for all n, m ≥ N .
Example: take xn = 1/n in R with the usual metric. Now d(xn , xm ) = n1 − 1
m
.
Suppose that n and m are both at least as big as N ; then d(xn , xm ) ≤ 1/N .
Hence if ε > 0 and we take N > 1/ε, we have d(xn , xm ) ≤ 1/N < ε whenever n and m
are both ≥ N .
In fact all convergent sequences are Cauchy sequences, by the following result.
8.2 Theorem
Suppose that (xn ) is a convergent sequence in a metric space (X, d), i.e., there is a
limit point x such that d(xn , x) → 0. Then (xn ) is a Cauchy sequence.
Proof: take ε > 0. Then there is an N such that d(xn , x) < ε/2 whenever n ≥ N .
Now suppose both n ≥ N and m ≥ N . Then
d(xn , xm ) ≤ d(xn , x) + d(x, xm ) = d(xn , x) + d(xm , x) < ε/2 + ε/2 = ε,
and we are done.
8.3 Proposition
Every subsequence of a Cauchy sequence is a Cauchy sequence.
Proof: if (xn ) is Cauchy and (xnk ) is a subsequence, then given ε > 0 there is an N
such that d(xn , xm ) < ε whenever n, m ≥ N . Now there is a K such that nk ≥ N
whenever k ≥ K. So d(xnk , xnl ) < ε whenever k, l ≥ K.
8.4 Definition
A metric space (X, d) is complete if every Cauchy sequence in X converges to a limit
in X.
Cauchy sequences can be awkward, e.g. xn = 1/2 + (−1)ⁿ/10ⁿ, i.e., 0.4, 0.51, 0.499, 0.5001,
0.49999, . . . , will converge to 0.5, even though the individual digits do not converge.
8.5 Theorem
R is complete.
We prove this via the following chain of statements A–E.
A: Every bounded increasing or decreasing sequence in R converges. Increasing means
x1 ≤ x2 ≤ . . . and you can guess what decreasing means. Monotone means either
increasing or decreasing.
B: Every Cauchy sequence in R is bounded.
C: Every sequence in R has a monotone subsequence.
D: If a Cauchy sequence has a convergent subsequence, then the original sequence
converges.
E: R is complete.
Proof of A: we can take this as an axiom of R, or observe that if the numbers are
increasing and bounded, then eventually the integer parts are constant, then the first
digit after the decimal point, then the second, . . . , so it is clear what number we want
as our limit. But if xn agrees with x to k decimal places then |xn − x| < 10⁻ᵏ; this
shows that xn → x.
Example: 1.0, 1.2, 1.4, 1.41, 1.412, 1.414, 1.4141, 1.4142, . . . is homing in on √2.
LECTURE 13
Proof of B: if (xn ) is Cauchy, then with ε = 1 we know that |xm − xn | < 1 when-
ever m, n ≥ N . Now |xn | ≤ |xn − xN | + |xN | < 1 + |xN | for all n ≥ N . Let
K = max{|x1 |, |x2 |, . . . , |xN −1 |, 1 + |xN |}. Then |xn | ≤ K for all n.
Proof of D: suppose that (xn ) is Cauchy in (X, d) and limk→∞ xnk = y. Take ε > 0.
Then there exists N such that d(xm , xn ) < ε/2 whenever m, n ≥ N ; and K such that
d(xnk, y) < ε/2 whenever k ≥ K. Choose k ≥ K such that nk ≥ N. Then for n ≥ N,
d(xn, y) ≤ d(xn, xnk) + d(xnk, y) < ε/2 + ε/2 = ε,
and so d(xn, y) → 0 as n → ∞.
Proof of C: let (xn ) be a sequence in R. We say that xm is a peak point of the sequence
if xm ≥ xn for all n > m.
[DIAGRAM]
Case 1: only finitely many peak points. Choose n1 large so that xn is not a peak point
for any n ≥ n1 .
Since xn1 is not a peak point we can find n2 > n1 with xn2 > xn1 ;
since xn2 is not a peak point we can find n3 > n2 with xn3 > xn2 ; and so on.
Now (xnk ) is strictly increasing.
Case 2: (xn ) has infinitely many peak points, say, xn1 , xn2 , . . . , with n1 < n2 < . . ..
Now xn1 ≥ xn2 ≥ . . ., so (xnk) is a decreasing subsequence.
Proof of E: let (xn) be a Cauchy sequence in R. By B it is bounded; by C it has a monotone subsequence, which converges by A; and then by D the whole sequence converges. So R is complete.
8.6 Corollary
A subset X ⊂ R is complete if and only if it is closed.
Example: (C[0, 2], d1) is not complete. Define fn in C[0, 2] by
fn(x) = xⁿ for 0 ≤ x ≤ 1, and fn(x) = 1 for 1 ≤ x ≤ 2.
[DIAGRAM]
Then
d1(fn, fm) = ∫_0^2 |fn(x) − fm(x)| dx
= ∫_0^1 |xⁿ − xᵐ| dx
= ∫_0^1 (xᵐ − xⁿ) dx   if n ≥ m
= 1/(m + 1) − 1/(n + 1) ≤ 1/(m + 1) → 0,
and hence (fn ) is Cauchy in (C[0, 2], d1 ). Does the sequence converge?
If there is an f ∈ C[0, 2] with fn → f as n → ∞, then ∫_0^2 |fn(x) − f(x)| dx → 0, so
∫_0^1 |fn(x) − f(x)| dx and ∫_1^2 |fn(x) − f(x)| dx both tend to zero. So fn → f in (C[0, 1], d1), which means that f(x) = 0
on [0, 1] (from an example we did earlier). Likewise, f = 1 on [1, 2], which doesn’t give
a continuous limit.
What about d∞ ?
8.7 Definition
A sequence (fn ) of (not necessarily continuous) functions defined on [a, b] is said to
converge uniformly to f if sup{|fn (x) − f (x)| : x ∈ [a, b]} → 0 as n → ∞. (If these
are continuous functions, then this is just convergence in the d∞ metric.)
8.8 Theorem
If (fn ) are continuous functions and fn → f uniformly, then f is also continuous.
Proof: Take ε > 0 and a point x ∈ [a, b]. Then there is an N such that |fn (t) − f (t)| <
ε/3 for all t ∈ [a, b] whenever n ≥ N . Now fN is continuous, so we can choose δ > 0
such that |fN (t) − fN (x)| < ε/3 for all t ∈ [a, b] with |t − x| < δ. Then
|f(t) − f(x)| ≤ |f(t) − fN(t)| + |fN(t) − fN(x)| + |fN(x) − f(x)|
≤ ε/3 + ε/3 + ε/3 = ε
whenever |t − x| < δ. So f is continuous at x.
Thus, for example, the functions fn(t) = tⁿ converge pointwise on [0, 1] to g, where
g(t) = 0 for 0 ≤ t < 1 and g(1) = 1; but g is not continuous, so the convergence isn’t uniform.
LECTURE 14
8.9 Theorem
(C[a, b], d∞ ) is a complete metric space.
Proof: take a Cauchy sequence (fn ) in (C[a, b], d∞ ). The proof goes in two steps.
I: For each x ∈ [a, b], (fn (x)) is a Cauchy sequence in R, and so has a limit, which we
call f (x).
II: fn → f uniformly; hence f ∈ C[a, b] and d∞ (fn , f ) → 0.
Step I: given ε > 0 there is an N with d∞ (fn , fm ) < ε for n, m ≥ N , since (fn ) is
Cauchy. But |fn (x) − fm (x)| ≤ d∞ (fn , fm ) and so this is also < ε for n, m ≥ N . So
(fn (x)) is a Cauchy sequence in R. Since R is complete by (8.5), we see that there is
a limiting value f (x).
Step II: take ε > 0 and N as in Step I. Then |fn (x) − fm (x)| < ε for each x, provided
that n, m ≥ N . Fix n ≥ N and let m → ∞. We conclude that |fn (x) − f (x)| ≤ ε for
each x, provided that n ≥ N . This is just the uniform convergence of fn to f . So f is
continuous, i.e., f ∈ C[a, b] by (8.8), and d∞ (fn , f ) → 0.
8.10 Remark
Note that R2 is also complete with any of the metrics d1 , d2 and d∞ ; since a Cauchy/
convergent sequence (vn ) = (xn , yn ) in R2 is just one in which both (xn ) and (yn ) are
Cauchy/ convergent sequences in R (cf. Prop. 2.4).
Similar arguments show that Rk is also complete for k = 1, 2, 3, . . ., and (with the same
proof as for Corollary 8.6) all closed subsets of Rk are complete.
9 Contraction mappings
Our aim is to use metric spaces to solve equations by using an iterative method to get
approximate solutions.
9.1 Examples
1. x³ + 2x² − 8x + 4 = 0. Rewrite this as x = (1/8)(x³ + 2x² + 4).
2. The differential equation dy/dx = x(x + y), with y(0) = 0, for 0 ≤ x ≤ 1. Rewrite as
y(x) = ∫_0^x t(t + y(t)) dt.
Define
φ(f)(x) = ∫_0^x t(t + f(t)) dt.
So y = f (x) solves the original equation if and only if φ(f ) = f .
Again, try to find the solution as the limit of a sequence. Take f0 (x) = 0 for 0 ≤ x ≤ 1.
Then
f1 = φ(f0), i.e., f1(x) = ∫_0^x t(t + f0(t)) dt = ∫_0^x t² dt = x³/3.
f2 = φ(f1), i.e., f2(x) = ∫_0^x t(t + f1(t)) dt = ∫_0^x t(t + t³/3) dt = x³/3 + x⁵/15.
f3 = φ(f2), i.e., f3(x) = ∫_0^x t(t + t³/3 + t⁵/15) dt = x³/3 + x⁵/15 + x⁷/105.
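These iterates can be generated symbolically; a minimal sketch using sympy (assuming sympy is available):

```python
import sympy as sp

x, t = sp.symbols('x t')
f = sp.Integer(0)                                        # f0 = 0
for _ in range(3):
    f = sp.integrate(t*(t + f.subs(x, t)), (t, 0, x))    # f_{k+1} = phi(f_k)
    print(sp.expand(f))    # x^3/3, then x^3/3 + x^5/15, then ... + x^7/105
```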
Suppose we have a metric space (X, d) and a function φ : X → X. Choose x0 ∈ X
and define xn = φ(xn−1 ) for n ≥ 1. This gives a sequence (xn ); if it is Cauchy and
(X, d) is complete, then x = limn→∞ xn exists and x should solve x = φ(x). How can
we guarantee that (xn ) will be Cauchy?
Note that d(xn , xn+1 ) = d(φ(xn−1 ), φ(xn )), so to get (xn ) Cauchy we want φ to shrink
distances. Let’s call φ : X → X a shrinking map if d(φ(y), φ(z)) < d(y, z) for all
y, z ∈ X with y ≠ z.
Example:
LECTURE 15
9.2 Definition
Let (X, d) be a metric space. A map φ : X → X is a contraction mapping, if there
exists a constant k < 1 such that
d(φ(x), φ(y)) ≤ kd(x, y) for all x, y ∈ X.
Examples:
1. Take X = [0, 1], usual metric, and φ(x) = x²/3. Then
d(φ(x), φ(y)) = |x²/3 − y²/3| = (1/3)|x + y| |x − y| ≤ (2/3)|x − y|.
So φ is a contraction mapping, with k = 2/3.
2. Take X = R and φ(x) = (1/4) sin 3x. So |φ(x) − φ(y)| = (1/4)|sin 3x − sin 3y|.
Use MVT! φ′(x) = (3/4) cos 3x, so |φ′(x)| ≤ 3/4, and
|φ(x) − φ(y)| = |φ′(c)| |x − y| ≤ (3/4)|x − y|, etc.
3. Take X = [1, ∞) and φ(x) = x + 1/x. Suppose x = n and y = n + 1; then
φ(y) − φ(x) = (n + 1) + 1/(n + 1) − n − 1/n = 1 − 1/(n(n + 1)),
so |φ(y) − φ(x)|/|y − x| can be made as close to 1 as we like by taking x = n and
y = n + 1 for n large. Thus φ (which is a shrinking mapping) is not a contraction
mapping.
9.3 Theorem
Let (X, d) be a metric space and let φ : X → X be a contraction mapping. Then
for each x0 ∈ X the sequence defined by xn = φ(xn−1 ) (for each n ≥ 1) is a Cauchy
sequence.
Proof: First, d(xn, xn+1) = d(φ(xn−1), φ(xn)) ≤ k d(xn−1, xn), so by induction d(xn, xn+1) ≤ kⁿ d(x0, x1). Hence, for n > m,
d(xm, xn) ≤ d(xm, xm+1) + . . . + d(xn−1, xn) ≤ (kᵐ + . . . + kⁿ⁻¹) d(x0, x1) ≤ (kᵐ/(1 − k)) d(x0, x1),
which tends to 0 as m → ∞. Thus (xn) is Cauchy.
Note: in the above theorem, if (X, d) is complete, then (xn ) will converge to a limit
x ∈ X. Note that x is a fixed point of φ, i.e., φ(x) = x, since
φ(x) = φ(lim xn) = lim φ(xn) = lim xn+1 = x
(a contraction mapping is continuous, so φ(lim xn) = lim φ(xn)).
9.4 Theorem (the Contraction Mapping Theorem, CMT)
Let (X, d) be a complete metric space and let φ : X → X be a contraction mapping. Then φ has a unique fixed point x ∈ X, and for any x0 ∈ X the sequence (xn) given by xn = φ(xn−1) converges to x.
Proof: by Theorem 9.3 and the note following it, we have proved everything except
the fact that there is only one fixed point for φ. But if x and y are fixed points, then
d(x, y) = d(φ(x), φ(y)) ≤ k d(x, y),
and since k < 1 this gives d(x, y) = 0, i.e., x = y.
How to apply the CMT: suppose we want to solve the equation φ(x) = x, where φ
is a contraction mapping. Take x0 ∈ X and construct (xn ) as above. Then (xn ) tends
to a solution x.
Note that in an incomplete metric space, there may be problems. For example take
X = (0, 1) ⊂ R and φ(x) = x/2. The iterates form a Cauchy sequence but the limit,
0, isn’t in the space, and there is no fixed point in the space.
Examples:
1. Show that x³ + 2x² − 8x + 4 = 0 has a unique solution in [−1, 1], and find it correct
to within ±10⁻⁶.
Solution: write the equation as x = (1/8)(x³ + 2x² + 4), and let φ(x) = (1/8)(x³ + 2x² + 4) for
−1 ≤ x ≤ 1. Note that if |x| ≤ 1 then
|φ(x)| ≤ (1/8)(|x|³ + 2|x| + 4) ≤ 7/8,
so φ does map [−1, 1] to itself. Then
|φ′(x)| = (1/8)|3x² + 4x| ≤ 7/8,
for x ∈ [−1, 1], so φ is a contraction mapping with k = 7/8. It has a unique fixed
point, as required.
LECTURE 16
Take x0 = 0. Defining xn = φ(xn−1), we get x1 = 0.5, x2 = 0.578, etc., as in Examples
9.1. The sequence converges to 0.6308976 . . ., although convergence is slow, since k = 7/8,
so the error after n steps is only bounded by
|xn − x| ≤ (kⁿ/(1 − k)) |x0 − x1| = 4 (7/8)ⁿ.
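A minimal sketch of this iteration (the stopping tolerance is an illustrative choice):

```python
phi = lambda x: (x**3 + 2*x**2 + 4) / 8   # the map from Example 1

x = 0.0
for n in range(1, 500):
    x_prev, x = x, phi(x)
    if abs(x - x_prev) < 1e-12:
        break
print(n, x)   # converges to 0.6308976..., after roughly 200 steps (k = 7/8 is close to 1)
```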
2. Define φ : (C[0, 1], d∞) → (C[0, 1], d∞) by φ(f)(x) = ∫_0^x t(t + f(t)) dt, as in Examples 9.1.
Show φ is a contraction mapping for the metric d∞, and use φ to find an approximate
solution y to the differential equation
dy/dx = x(x + y), y(0) = 0, (0 ≤ x ≤ 1).
Solution:
d∞ (φ(f ), φ(g)) = max{|φ(f )(x) − φ(g)(x)| : 0 ≤ x ≤ 1}.
|φ(f)(x) − φ(g)(x)| = | ∫_0^x t(t + f(t)) dt − ∫_0^x t(t + g(t)) dt |
= | ∫_0^x t(f(t) − g(t)) dt |
≤ ∫_0^x t |f(t) − g(t)| dt
≤ ∫_0^x t d∞(f, g) dt = (x²/2) d∞(f, g).
Thus d∞(φ(f), φ(g)) ≤ (1/2) d∞(f, g), and φ is a contraction map with k = 1/2.
If y = f(x) is a solution of the diff. eq. then f′(t) = t(t + f(t)) and f(0) = 0. Integrate
from 0 to x:
∫_0^x f′(t) dt = ∫_0^x t(t + f(t)) dt,
i.e., f(x) = ∫_0^x t(t + f(t)) dt = φ(f)(x), so f is exactly a fixed point of φ.
So CMT says that the d.e. has a unique solution, which we can obtain by iteration.
We did this in Examples 9.1 as well. Note that f0 = 0, f1(x) = x³/3, so d∞(f0, f1) = 1/3,
and in general d∞(fn, f) ≤ 1/(3 · 2^{n−1}), by 9.5.
Another example: f′(t) = t(1 + f(t)), for t ∈ [0, 1], with f(0) = 0; here f(t) = e^{t²/2} − 1
is the actual (unique) solution.
Take f0(x) = 0, f1(x) = ∫_0^x t(1 + f0(t)) dt = x²/2,
f2(x) = ∫_0^x t(1 + f1(t)) dt = x²/2 + x⁴/8, etc.
More generally, consider the differential equation f′(x) = F(x, f(x)) for a ≤ x ≤ b, with f(a) = c,
or, equivalently,
f(x) = c + ∫_a^x F(t, f(t)) dt. (2)
Define φ(f)(x) = c + ∫_a^x F(t, f(t)) dt for f ∈ C[a, b]; then the solutions of (2) are exactly the fixed points of φ.
So when is φ a contraction map?
Note that some differential equations don’t have solutions everywhere we might want
them; e.g. f′(t) = −f(t)² for t ∈ [0, 2], with f(0) = −1. The only solution is f(t) =
1/(t − 1), which is discontinuous at t = 1.
9.7 Theorem
With F and φ as above, suppose there is a constant k < 1 such that
|F(x, y1) − F(x, y2)| ≤ (k/(b − a)) |y1 − y2| for all x ∈ [a, b], y1, y2 ∈ R.
Then φ is a contraction mapping on (C[a, b], d∞).
9.8 Definition
A function f : [a, b] → R satisfies the Lipschitz condition with constant m if
|f (x1 ) − f (x2 )| ≤ m|x1 − x2 | for all x1 , x2 ∈ [a, b].
If f is differentiable on [a, b] and m = max{|f′(t)| : t ∈ [a, b]}, then f satisfies the
Lipschitz condition with constant m, since the Mean Value Theorem gives, for some c
between x1 and x2,
f(x1) − f(x2) = f′(c)(x1 − x2), so |f(x1) − f(x2)| ≤ m|x1 − x2|.
LECTURE 17
Similarly, if we have a function F (x, y), we say that it satisfies the Lipschitz condition
in y with constant m if
|F(x, y1) − F(x, y2)| ≤ m |y1 − y2|
for all x and for all y1 and y2 for which the above is defined.
If we have partial derivatives, we can take
m = max{ |∂F/∂y| : a ≤ x ≤ b, y ∈ R }.
9.9 Theorem
If F satisfies the Lipschitz condition in y with a constant m < 1/(b − a), then the differential
equation y′ = F(x, y), y(a) = c has a unique solution for a ≤ x ≤ b.
Proof: use Theorem 9.7, writing m = k/(b − a) with k < 1.
In fact if F satisfies the Lipschitz condition with any constant m at all, we can still
solve the equation. What we do is to solve it in C[a, a + δ], where m < 1/δ, and obtain
a value y(a + δ). We then solve it in C[a + δ, a + 2δ], and keep going until we get to b.
Examples:
1. dy/dx = cos(x²y), y(0) = 2, for 0 ≤ x ≤ 1.
Here F(x, y) = cos(x²y), and ∂F/∂y = −x² sin(x²y), so
|∂F/∂y| ≤ x² ≤ 1. (3)
Thus F satisfies the Lipschitz condition in y with constant m = 1. Not good enough
for the theorem to apply on [0, 1] (although we could use [0, 1/2] and [1/2, 1], as above).
But if
φ(f)(x) = 2 + ∫_0^x cos(t² f(t)) dt,
then
|φ(f)(x) − φ(g)(x)| = | ∫_0^x (cos(t² f(t)) − cos(t² g(t))) dt |
≤ ∫_0^x |F(t, f(t)) − F(t, g(t))| dt
≤ ∫_0^x t² |f(t) − g(t)| dt, by (3),
≤ ∫_0^x t² d∞(f, g) dt = (x³/3) d∞(f, g).
So d∞(φ(f), φ(g)) ≤ (1/3) d∞(f, g), and so φ is a contraction map.
2.
dy/dx = √y, y(0) = 0, 0 ≤ x ≤ 1. (4)
Here F(x, y) = y^{1/2} and ∂F/∂y = (1/2) y^{−1/2}, which is unbounded.
So F does not satisfy a Lipschitz condition in y at all. For any c ∈ (0, 1] we can define
fc(x) = 0 if x ≤ c, and fc(x) = (1/4)(x − c)² if c ≤ x ≤ 1.
[DIAGRAM: constant on [0, c], parabola rising on [c, 1].]
Each fc satisfies (4) (CHECK), so the equation does not have a unique solution.
Example: take φ(x) = cos x on X = R. Here |φ′(x)| = |sin x| can be arbitrarily close to 1, so φ is not a contraction mapping; but the iterate φ²(x) = cos(cos x) has |(φ²)′(x)| = |sin(cos x)| |sin x| ≤ sin 1 < 1, so φ² is a contraction.
The following theorem shows that φ has a unique fixed point, given by iteration:
x0 = 0, x1 = φ(x0) = 1, x2 = φ(x1) = 0.54, etc. (keep hitting the cos button on your calculator,
working in radians), and this converges to 0.7390851 . . ..
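A minimal sketch of the calculator iteration described above:

```python
import math

x = 0.0
for n in range(100):
    x = math.cos(x)
print(x, math.cos(x) - x)   # 0.7390851...; the second number is essentially 0
```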
9.10 Theorem
If (X, d) is a complete metric space and φ : X → X is a map such that some iterate
φm of φ is a contraction map, then φ has a unique fixed point. For any x0 ∈ X the
sequence (xn ) = (φn (x0 )) converges to the fixed point.
Proof: by the CMT applied to φm , we get a unique fixed point x for φm . So x = φm (x).
Apply φ, then
φ(x) = φ(φm (x)) = φm+1 (x) = φm (φ(x)),
so that φ(x) is also a fixed point of φm . By the uniqueness, φ(x) = x, so x is a fixed
point of φ as well.
If x and y are fixed points of φ, then x and y are fixed points of φm , which is a con-
traction mapping, and so x = y. Hence φ has a unique fixed point.
Sketch of last assertion: let’s do m = 3 for illustration (the general case is similar, with
more complicated notation). We have, by the CMT for φm :
x0 , x3 , x6 , . . . , x3k , . . . → x,
x1 , x4 , x7 , . . . , x3k+1 , . . . → x,
x2 , x5 , x8 . . . , x3k+2 , . . . → x.
This implies that the single sequence x0 , x1 , x2 , . . . tends to x, since given ε > 0, we
have d(x3k , x) < ε for k > k0 , say, d(x3k+1 , x) < ε for k > k1 , say, and d(x3k+2 , x) < ε
for k > k2 , say. So d(φN (x0 ), x) < ε for N > max{3k0 , 3k1 + 1, 3k2 + 2}.
LECTURE 18
As before we calculate
|φ(f)(x) − φ(g)(x)| ≤ ∫_a^x |F(t, f(t)) − F(t, g(t))| dt
≤ m ∫_a^x |f(t) − g(t)| dt, by the Lipschitz condition (6)
≤ m ∫_a^x d∞(f, g) dt
= m d∞(f, g) ∫_a^x dt = m d∞(f, g) (x − a). (7)
Repeat: using (7) in the same estimate,
|φ²(f)(x) − φ²(g)(x)| ≤ m ∫_a^x |φ(f)(t) − φ(g)(t)| dt ≤ m ∫_a^x m (t − a) d∞(f, g) dt = (m²(x − a)²/2) d∞(f, g),
and in general
d∞(φⁿ(f), φⁿ(g)) ≤ (mⁿ(b − a)ⁿ/n!) d∞(f, g).
Since mⁿ(b − a)ⁿ/n! → 0 as n → ∞, some iterate φⁿ is a contraction mapping, and Theorem 9.10 applies.
9.12 Theorem
If F (x, y) satisfies the Lipschitz condition in y for some constant m, where a ≤ x ≤ b
and y ∈ R, then the differential equation (5) has a unique solution which can be ap-
proximated by iteration.
10 Connectedness
10.1 Definition
A metric space X is disconnected if ∃U , V , open, disjoint, nonempty, such that
X = U ∪ V . Note that U and V will also be closed, as their complements are
open. Otherwise X is connected. A subset is connected/disconnected if it is con-
nected/disconnected when we restrict the metric to the subset to get a (smaller) metric
space.
[DIAGRAM in R2 ]
10.2 Definition
An interval in R is a set S such that if s, t ∈ S then [s, t] ⊂ S.
Examples are (a, b), [a, b], (a, b], [a, b), (−∞, b), (−∞, b], (a, ∞), [a, ∞), with a, b finite;
also ∅ and R. These are all the examples possible.
We want to show that the connected subsets of R (usual metric) are precisely the in-
tervals.
10.3 Lemma
Let x, y ∈ R with x < y, let U, V be disjoint open sets in R with x ∈ U and y ∈ V .
Then there is a z ∈ (x, y) with z ∉ U ∪ V.
10.4 Theorem
A subset S of R is connected if and only if it is an interval.
Proof: (i) If S is not an interval, then there are x, y ∈ S with [x, y] ⊄ S, so there is
a z ∈ (x, y), z ∉ S.
Now take U = S ∩ (−∞, z) and V = S ∩ (z, ∞); we see that this disconnects S.
The intersection of two connected sets needn’t be connected [picture in R2 ] and nor is
the union of two connected sets (e.g. (0, 1) ∪ (2, 3)). (In R, however, the intersection
of two connected sets is connected, since these are just intervals.)
10.5 Remark
Let Y ⊂ X, where X is a metric space. Then a subset S ⊂ Y is open (regarded as a
subset of Y ) if and only if S = U ∩ Y , where U is an open subset of X. This follows
easily on noting that every open subset in Y is a union of balls BY (s, ε) for s ∈ S, and
BY (s, ε) = BX (s, ε) ∩ Y .
So, for example, we can say that [0, 1] and [2, 3] are open subsets of the metric space
Y = [0, 1] ∪ [2, 3], although not open when regarded as subsets of X = R, since, for
example, [0, 1] = (−∞, 3/2) ∩ Y.
10.6 Theorem
Let (Sλ) be a collection of connected subsets of a metric space X, all containing a common point x. Then ∪λ Sλ is connected.
Proof: Suppose ∪λ Sλ = U ∪ V, where U and V are disjoint open subsets of ∪λ Sλ, with x ∈ U, say. For each λ,
Sλ = (U ∩ Sλ) ∪ (V ∩ Sλ),
and U ∩ Sλ is nonempty (it contains x); so since Sλ is connected, V ∩ Sλ = ∅ for each λ, and so V ∩ ∪λ Sλ = ∅, i.e., V = ∅.
Hence ∪λ Sλ is connected.
So every point x ∈ X is contained in a maximal connected subset, the connected
component containing x, namely
Cx = ∪ {S ⊂ X : x ∈ S, S connected}.
Of course {x} itself is one such connected set S, so this is not an empty union.
10.7 Theorem
Let f : X → Y be a continuous mapping between metric spaces, and suppose that X
is connected. Then the image f (X) := {f (x) : x ∈ X} is also connected.
Proof: Suppose f(X) = U ∪ V, where U and V are disjoint open subsets of f(X). Then X = f⁻¹(U) ∪ f⁻¹(V), a disjoint union of open sets (open since f is continuous). Since X
is connected this can only happen if one of f⁻¹(U) and f⁻¹(V) is empty, which means
that one of U and V is empty. So f(X) is connected.
10.8 Corollary (Intermediate Value Theorem)
Let f : [a, b] → R be continuous, and suppose y lies between f(a) and f(b). Then y = f(c) for some c ∈ [a, b].
Proof: By Theorem 10.7 we have that f([a, b]) is connected, and hence, by Theorem 10.4, it is an interval. The result is now clear.
10.9 Definition
Let (X, d) be a metric space. Then X is path-connected if, for all x, y ∈ X, there is a
continuous f : [0, 1] → X with f (0) = x, f (1) = y (i.e., a path joining them).
(Of course we can also talk about path-connected subsets of a metric space, as they
are metric spaces too.)
10.10 Proposition
Let X be a path-connected metric space. Then X is connected.
Proof: Suppose that X = U ∪ V , with U and V open, disjoint and nonempty. Take
x ∈ U and y ∈ V . Then there is a path f : [0, 1] → X joining x to y.
Hence [0, 1] = f −1 (U )∪f −1 (V ), as open disjoint sets in the metric space [0, 1]. But [0, 1]
is connected so one of f −1 (U ) and f −1 (V ) is empty. But 0 ∈ f −1 (U ) and 1 ∈ f −1 (V ),
so we have a contradiction. So X is connected.
The converse is false. Take X = G ∪ I, where G = {(x, sin 1/x) : x > 0} and
I = {(0, y) : −1 ≤ y ≤ 1}. Then G ∪ I is connected, but not path-connected. See the
Exercises.
LECTURE 20
10.11 Remark
For open subsets of Rn it is true that connected and path-connected are the same.
Suppose that S is open and connected, and take x ∈ S. Then
U = {y ∈ S : y can be joined to x by a path in S}
is open [DIAGRAM]. So is
V = {y ∈ S : y cannot be joined to x by a path in S}.
Since S = U ∪ V is connected and x ∈ U, we must have V = ∅; that is, S is path-connected.
10.12 Theorem
(i) Let n ≥ 2. Then Rn (usual metric) isn’t homeomorphic to R.
(ii) Moreover, no two out of (0, 1), [0, 1) and [0, 1] are homeomorphic.
Proof: (i) If we delete a point from Rⁿ (n ≥ 2), what remains is still path-connected, hence connected; but if we delete a point from R the remainder is disconnected. Since a homeomorphism takes connected sets to connected sets (Theorem 10.7), Rⁿ and R cannot be homeomorphic.
(ii) Similarly, if we delete any point from (0, 1) it becomes disconnected; not true for
the others if we deleted 0. So (0, 1) is not homeomorphic to the others. If we delete
any 2 points from [0, 1) it becomes disconnected, not true for [0, 1] if we deleted the
end-points. So the other two aren’t homeomorphic, either.
Similarly we can see that [0, 1] is not homeomorphic to the square [0, 1]×[0, 1], since re-
moving any three points will disconnect [0, 1]. This is in spite of the fact that there exist
“space-filling curves”, i.e., continuous (non-bijective) maps from [0, 1] onto [0, 1]×[0, 1].
There also exist discontinuous bijections between the two sets.
11 Compactness
Recall that any real continuous function on a closed bounded interval [a, b] is bounded
and attains its bounds. We look at this in a more general context.
11.1 Definition
Let K ⊆ X, where (X, d) is a metric space. An open cover of K is a family of open
sets (Uλ)λ∈Λ such that K ⊂ ∪_{λ∈Λ} Uλ. We say that K is compact if whenever (Uλ)λ∈Λ is
an open cover of K, there is a finite subcover Uλ1, . . . , UλN such that K ⊂ Uλ1 ∪ . . . ∪ UλN.
It doesn’t matter whether we cover K with open sets in K or open sets in X, since
open sets in K are just the intersection with K of open sets in X.
11.2 Examples
Clearly R is not compact, as Un = (−n, n) for n = 1, 2, . . . form an open cover, but we
cannot cover the whole of R by taking only finitely many of these sets.
Similarly, nor is (0, 1): we can write (0, 1) = ∪_{n=2}^∞ (1/n, 1), but (0, 1) is not covered by finitely many of these sets.
It will be shown later that [0, 1] is compact. More generally, it turns out that the com-
pact subsets of Rn with the Euclidean metric are just the closed bounded ones. Thus,
compact subsets of R include finite sets, and finite unions of closed intervals such as
[0, 1] ∪ [2, 3]. But NOT (0, 1), R itself, or Q.
11.3 Theorem
Let f : (K, d) → R be continuous with K ⊂ X compact. Then f is bounded on K
and it attains its bounds (so that ∃x ∈ K with f (x) = sup{f (k) : k ∈ K} < ∞ and
similarly for inf).
Proof: The sets f⁻¹((−n, n)), n = 1, 2, . . ., are open in K (by Theorem 2.8) and cover K; by compactness finitely many of them cover K, so f is bounded on K.
Also, if s = sup_{x∈K} f(x), we have either that f(x) = s for some x, or else that
1/(s − f (x)) is a continuous function on K and hence bounded by M > 0, say. This
means that s − f (x) ≥ 1/M for all x ∈ K; i.e., f (x) ≤ s − 1/M , contradicting the
definition of s as the sup.
11.4 Theorem
Let (X, d) be a metric space; then every compact subset K ⊂ X is closed and bounded.
Proof: Let x be a point of X \K. For each k ∈ K consider the balls Bk = B(k, rk /2)
and Ck = B(x, rk /2) where rk = d(x, k) > 0. These are disjoint and the Bk form an
open cover of K. By compactness we can find k1 , . . . , kN such that K ⊂ Bk1 ∪. . .∪BkN .
But now Ck1 ∩ . . . ∩ CkN is an open ball containing x which is disjoint from Bk1 ∪ . . . ∪ BkN
and hence from K. So K is closed. [DIAGRAM]
LECTURE 21
An alternative way to see why a compact set K is necessarily closed and bounded is to
take any x ∈ X \K and consider the continuous function on K given by f (k) = d(k, x).
By Theorem 11.3, f attains its lower bound δ ≥ 0. But δ cannot be 0 as then we would
have k ∈ K with k = x. Thus δ > 0 and B(x, δ) ∩ K = ∅ so K is closed.
Also, since f is bounded we see that K is bounded.
11.5 Example
The infinite set S = {1, 1/2, 1/3, . . .} ∪ {0} is compact. For any open cover (Uλ) of S,
there will be a set, say Uλ0, containing 0. Since Uλ0 is open, there is an N such that
Uλ0 will also contain 1/n for all n ≥ N. But then we only need finitely-many more Uλ
to cover the whole set.
11.6 Theorem
(i) Let X be a compact metric space and F a closed subset of X. Then F is compact.
(ii) Let X be a compact metric space and Y an arbitrary metric space. Suppose that
f : X → Y is continuous. Then f (X) is compact.
Proof: (i) If we have an open cover of F, say F ⊂ ∪_{λ∈Λ} Uλ, then by adding the set
X \ F, which is open, we have an open cover of X. Since X is compact we only need
finitely many sets, say X ⊂ (X \ F) ∪ Uλ1 ∪ . . . ∪ UλN, and now F ⊂ Uλ1 ∪ . . . ∪ UλN,
so F is compact.
(ii) Given an open cover f(X) ⊂ ∪_{λ∈Λ} Uλ, we see that X ⊂ ∪_{λ∈Λ} f⁻¹(Uλ), since for
each point x ∈ X there is a λ with f(x) ∈ Uλ, meaning that x ∈ f⁻¹(Uλ). Since f is
continuous, this is an open cover of X.
But now we have a finite subcover of X: X ⊂ f⁻¹(Uλ1) ∪ . . . ∪ f⁻¹(UλN), which means
that f(X) ⊂ Uλ1 ∪ . . . ∪ UλN. Hence f(X) is compact.
This gives us another way to prove Theorem 11.3. For if K is compact and f (K) ⊂ R
is compact, it is a bounded set, and being closed implies that the least upper bound is
in the set.
11.8 Definition
A subset K of a metric space is sequentially compact if every sequence in K has a
convergent subsequence with limit in K.
The classical Bolzano–Weierstrass theorem in R says that every bounded sequence has
a convergent subsequence, and this implies that all closed bounded subsets F ⊂ R are
sequentially compact (“closed” guarantees that the limit is in F ).
11.9 Example
The closed unit ball B in ℓ² is not sequentially compact (although closed and bounded).
Recall that
B = { (xn) : Σ_{n=1}^∞ xn² ≤ 1 }.
For let e1 = (1, 0, 0, . . .), e2 = (0, 1, 0, 0, . . .), e3 = (0, 0, 1, 0, . . .), etc. Then (en) is a
sequence in B with no convergent subsequence, since d(en, em) = ‖en − em‖ = √2 for
all n ≠ m.
We need one more definition before we state the final big theorem of the course.
11.10 Definition
A subset K of a metric space is precompact or totally bounded if for each ε > 0 it can
be covered with finitely many balls B(xk , ε).
[Think of employing finitely-many short-sighted guards to watch over your set.]
Easily, every compact set is precompact, since we can cover K with open balls B(x, ε),
where x varies over the whole of K. By compactness we only need finitely many. [DI-
AGRAM]
LECTURE 22
But the closed ball of ℓ² isn’t precompact, since if it were covered by balls of radius
1/2 then each en would have to be in a different one – for if d(x, en) < 1/2 and
d(x, em) < 1/2 we get d(en, em) < 1, which is a contradiction for n ≠ m, as d(en, em) = √2. So it isn’t
compact either.
[DIAGRAM].
Now think of vectors in R^N as being padded with zeroes, so that they lie in ℓ².
That is, D ⊂ ∪_{k=1}^K B(zk, ε/2). But now we have that C ⊂ ∪_{k=1}^K B(zk, ε), since for every
point c ∈ C its truncation c′ to N coordinates lies in D; thus d(c′, zk) < ε/2 for some k, and
also d(c, c′) < ε/2, so d(c, zk) < ε.
In fact the Hilbert cube is also compact, and this is a consequence of the big result
that follows.
11.12 Theorem
The following are equivalent in a metric space (X, d):
(1) X is compact.
(2) If (En) are nonempty closed sets in X with E1 ⊇ E2 ⊇ . . ., then ∩_{n=1}^∞ En ≠ ∅.
(3) X is sequentially compact.
(4) X is complete and precompact.
Proof: (1) ⇒ (2). Suppose that ∩ En = ∅; then ∪ (X \ En) = X (de Morgan’s
law); this is an open cover of X so there is a finite subcover, by compactness. So
X = (X \ E1) ∪ . . . ∪ (X \ EN) = X \ EN, as the En are decreasing so their complements
are increasing. Which means that EN = ∅, a contradiction.
(2) ⇒ (3). Let (xn) be any sequence in X and let En be the closure of {xn, xn+1, . . .}; these are
decreasing nonempty closed sets. Thus there is a point y in ∩_{n=1}^∞ En.
Take B(y, 1): this meets {x1 , x2 , . . .} since y is in its closure. Pick xn1 ∈ B(y, 1).
Now B(y, 1/2) meets {xn1 +1 , xn1 +2 , . . .} since y is in its closure. Pick xn2 ∈ B(y, 1/2),
and note that n2 > n1 .
Continuing this way we find xnk in B(y, 1/k), so the subsequence (xnk ) converges to y.
(3) ⇒ (4). X is complete, since a Cauchy sequence in X has a convergent subsequence by (3), and therefore converges (as in step D of the proof of Theorem 8.5).
Also X will be precompact, since if not then we can find ε > 0 with no finite covering
by balls of radius ε; choose x1 ∈ X, and then inductively we obtain (xn) such that
xn ∉ ∪_{k=1}^{n−1} B(xk, ε). Now it’s clear that d(xk, xn) ≥ ε > 0 for all k < n, which means
that the (xn) have no convergent subsequence.
11.13 Corollary
A subset K ⊂ RN is compact if and only if it is closed and bounded.
Proof: We saw in Theorem 11.4 that all compact sets (in any metric space) are
closed and bounded.
For the converse, we can show that all closed bounded sets K in RN are sequentially
compact. If (xn ) = (xn1 , . . . , xnN ) is a sequence in K, then by passing to a subsequence
we can ensure that the sequence (xn1 ) of first coordinates converges (since every bounded
sequence in R has a convergent subsequence); then to a further subsequence to ensure
that the sequence (xn2) of 2nd coordinates converges, and so on. After N steps we have a subsequence converging in R^N; its limit lies in K, since K is closed. So K is sequentially compact, and hence compact by Theorem 11.12.
LECTURE 23
(4) ⇒ (1) in Theorem 11.12. This is the hardest bit and the proof is definitely not
examinable. We suppose that X is complete and precompact, and show it is compact.
So take an open cover ∪_{λ∈Λ} Uλ of X.
First we reduce it to a countable cover. For each n we can cover X by finitely many
balls B(an,1 , 1/n) ∪ . . . ∪ B(an,rn , 1/n), using precompactness. Let A denote the set of
centres (this is countable and dense because for each n the set A comes within 1/n of
every point of X) and consider all the balls B(a, 1/k) for a ∈ A and k = 1, 2, . . .. We
claim that every open set U is a union of some of the B(a, 1/k). For if U is open and
x ∈ U , there is a ball B(x, 1/j) ⊂ U and a point a ∈ A contained in B(x, 1/2j). But
now x ∈ B(a, 1/2j) ⊂ B(x, 1/j) ⊂ U [DIAGRAM].
Thus we can cover X with a countable subcollection of the Uλ , since for each x ∈ X
there is a ball B(a, 1/k) with x ∈ B(a, 1/k) ⊂ some Uλ . There are only countably
many balls to choose from so select one Uλ for each ball we used.
Now let (xn) be any sequence in X; we extract a Cauchy, hence convergent, subsequence.
Cover X by finitely-many balls of radius 1. Then for at least one of these balls there
is an infinite subsequence, say x11 , x12 , . . ., all in the same ball B(y1 , 1).
Now cover X by finitely-many balls of radius 1/2. Then for at least one of these balls
there is an infinite subsequence of (x1k ), say x21 , x22 , . . . all in the same ball B(y2 , 1/2).
Repeat. We obtain nested subsequences (xnk )k all in the same ball B(yn , 1/n). But now
the diagonal subsequence (xnn) is Cauchy, since for m ≤ n, d(xmm, ym) < 1/m
and also d(xnn , ym ) < 1/m as the (xnk ) are a subsequence of the (xmk ). Hence
d(xmm , xnn ) < 2/m for n > m, i.e., a Cauchy sequence.
Let C0 = [0, 1], C1 = [0, 1/3] ∪ [2/3, 1], C2 = [0, 1/9] ∪ [2/9, 1/3] ∪ [2/3, 7/9] ∪ [8/9, 1],
and so on, at each stage deleting the middle open third of every interval that remains.
[DIAGRAM]
Then the Cantor set is C = ∩_{n=0}^∞ Cn. This is the intersection of closed subsets of R, and is hence a
closed (even compact) set. Remarkably, it is uncountable: indeed it consists of all
numbers of the form x = Σ_{j=1}^∞ aj/3ʲ, where aj = 0 or 2 for each j (and not 1). Note that
we regard 1/3 = 0.0222 . . . in base 3.
One can use a Cantor diagonal argument (as one does to prove that R is uncountable)
to show that C is uncountable.
Paradoxically, the complement of the Cantor set is an open set and so just a countable
union of intervals. If one calculates the total length of the intervals removed from [0, 1],
it is Σ_{j=1}^∞ 2^{j−1}/3ʲ, since at stage j we removed 2^{j−1} intervals of length 3⁻ʲ. This sums to
1, but there are still many points left!
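A quick check of this arithmetic, summing the lengths removed at the first few stages:

```python
from fractions import Fraction

total = Fraction(0)
for j in range(1, 41):
    total += Fraction(2)**(j - 1) / Fraction(3)**j   # 2^(j-1) intervals of length 3^(-j)
print(float(total))   # 0.9999999...; the full series sums to exactly 1
```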
The set C is “totally disconnected” – it clearly doesn’t contain any intervals, so all its
subsets consisting of more than one point are disconnected. That is, every component
of C consists of a single point.
In a technical sense (outside the scope of this course) C is a fractal set – its “dimension”
is log 2/ log 3 or about 0.63.
THE END
The exercises on the sheet Extra Examples will be done in lectures, but the solutions
will not be put online (if necessary, you can watch the videos).