


Lecture 2

Introduction to Some Convergence theorems


Friday 14, 2005
Lecturer: Nati Linial
Notes: Mukund Narasimhan and Chris Ré
2.1 Recap
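Recall that for $f : \mathbb{T} \to \mathbb{C}$, we had defined
\[ \hat{f}(r) = \frac{1}{2\pi} \int_{\mathbb{T}} f(t) e^{-irt} \, dt \]
and we were trying to reconstruct $f$ from $\hat{f}$. The classical theory tries to determine if/when the
following is true (for an appropriate definition of equality).
\[ f(t) \overset{??}{=} \sum_{r \in \mathbb{Z}} \hat{f}(r) e^{irt} \]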
In the last lecture, we proved Fejér's theorem $f * k_n \to f$, where $*$ denotes convolution and the $k_n$
(Fejér kernels) are trigonometric polynomials that satisfy
1. $k_n \ge 0$
2. $\int_{\mathbb{T}} k_n = 1$
3. $k_n(s) \to 0$ uniformly as $n \to \infty$ outside $[-\delta, \delta]$ for any $\delta > 0$.
If $X$ is a finite abelian group, then the space of all functions $f : X \to \mathbb{C}$ forms an algebra with
the operations $(+, *)$ where $+$ is the usual pointwise sum and $*$ is convolution. If instead of a finite
abelian group, we take $X$ to be $\mathbb{T}$ then there is no unit in this algebra (i.e., no element $h$ with
the property that $h * f = f$ for all $f$). However the $k_n$ behave as approximate units and play an
important role in this theory. If we let
\[ S_n(f, t) = \sum_{r=-n}^{n} \hat{f}(r) e^{irt} \]
then $S_n(f, t) = (f * D_n)(t)$, where $D_n$ is the Dirichlet kernel that is given by
\[ D_n(s) = \frac{\sin\left( \left( n + \frac{1}{2} \right) s \right)}{\sin \frac{s}{2}} \]
The Dirichlet kernel does not have all the nice properties of the Fejér kernel. In particular,
1. $D_n$ changes sign.
2. $D_n$ does not converge uniformly to 0 outside arbitrarily small intervals $[-\delta, \delta]$.
Remark. The choice of an appropriate kernel can simplify applications and proofs tremendously.
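The contrast between the two kernels can be checked numerically. The sketch below is an illustration only: it assumes the closed form $k_n(s) = \frac{1}{n+1} \left( \sin\frac{(n+1)s}{2} / \sin\frac{s}{2} \right)^2$ for the Fejér kernel (the average of $D_0, \ldots, D_n$) and the normalization $\frac{1}{2\pi} \int_{\mathbb{T}}$; the helper names `grid`, `mean` and `tail_sup` are hypothetical.

```python
import math

def dirichlet(n, s):
    """Dirichlet kernel D_n(s) = sin((n + 1/2) s) / sin(s / 2)."""
    d = math.sin(s / 2)
    if abs(d) < 1e-12:              # removable singularity at s = 0
        return 2 * n + 1
    return math.sin((n + 0.5) * s) / d

def fejer(n, s):
    """Fejer kernel, assumed here to be the average of D_0, ..., D_n."""
    d = math.sin(s / 2)
    if abs(d) < 1e-12:              # limiting value at s = 0
        return n + 1
    return (math.sin((n + 1) * s / 2) / d) ** 2 / (n + 1)

def grid(m=4000):
    """m equally spaced points of [-pi, pi)."""
    return [-math.pi + 2 * math.pi * j / m for j in range(m)]

def mean(kernel, n):
    """Grid average, approximating (1 / 2 pi) times the integral over T."""
    pts = grid()
    return sum(kernel(n, s) for s in pts) / len(pts)

def tail_sup(kernel, n, delta):
    """Largest |kernel(n, s)| over grid points with |s| >= delta."""
    return max(abs(kernel(n, s)) for s in grid() if abs(s) >= delta)

for n in (20, 100, 500):
    print(n,
          min(fejer(n, s) for s in grid()),       # stays nonnegative
          min(dirichlet(n, s) for s in grid()),   # becomes negative
          tail_sup(fejer, n, 0.5),                # shrinks as n grows
          tail_sup(dirichlet, n, 0.5))            # does not shrink
```

Both kernels average to 1 on the grid, but only the Fejér kernel is nonnegative and small away from the origin.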
2.2 The Classical Theory
Let G be a locally compact abelian group.
Definition 2.1. A character on $G$ is a homomorphism $\chi : G \to \mathbb{T}$. Namely, a mapping satisfying
$\chi(g_1 + g_2) = \chi(g_1) \chi(g_2)$ for all $g_1, g_2 \in G$.
If $\chi_1, \chi_2$ are any two characters of $G$, then it is easily verified that $\chi_1 \chi_2$ is also a
character of $G$, and so the set of characters of $G$ forms a commutative group under multiplication. An
important role is played by $\hat{G}$, the group of all continuous characters. For example,
$\hat{\mathbb{T}} = \mathbb{Z}$ and $\hat{\mathbb{R}} = \mathbb{R}$.
For any function $f : G \to \mathbb{C}$, associate with it a function $\hat{f} : \hat{G} \to \mathbb{C}$ where
$\hat{f}(\chi) = \langle f, \chi \rangle$. For example, if $G = \mathbb{T}$ then $\chi_r(t) = e^{irt}$ for
$r \in \mathbb{Z}$. Then we have $\hat{f}(\chi_r) = \hat{f}(r)$. We call $\hat{f} : \hat{G} \to \mathbb{C}$ the
Fourier transform of $f$. Now $\hat{G}$ is also a locally compact abelian group and we can play the same game
backwards to construct $\hat{\hat{f}}$. Pontryagin's theorem asserts that $\hat{\hat{G}} = G$ and so we can ask
the question: Does $\hat{\hat{f}} = f$? While in theory Fejér answered the question of when $\hat{f}$ uniquely
determines $f$, this question is still left unanswered.
For the general theory, we will also require a normalized nonnegative measure $\mu$ on $G$ that is translation
invariant: $\mu(S) = \mu(a + S) = \mu(\{ a + s \mid s \in S \})$ for every measurable $S \subseteq G$ and every
$a \in G$. There exists a unique such measure, which is called the Haar measure.
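For a concrete feel for these definitions, the following sketch works on the finite cyclic group $\mathbb{Z}_N$ instead of a general $G$. This is an assumption made purely for illustration: there the characters are $\chi_r(x) = e^{2\pi i r x / N}$ and the Haar measure is normalized counting measure. The function names are hypothetical.

```python
import cmath

N = 12  # order of the cyclic group Z_N (arbitrary choice for this sketch)

def chi(r, x):
    """Character chi_r(x) = exp(2 pi i r x / N) of Z_N."""
    return cmath.exp(2j * cmath.pi * r * x / N)

def inner(f, g):
    """<f, g> = integral of f * conj(g) against normalized counting measure."""
    return sum(f(x) * g(x).conjugate() for x in range(N)) / N

def fourier(f):
    """Fourier transform: the list of <f, chi_r> for r = 0, ..., N - 1."""
    return [inner(f, lambda x, r=r: chi(r, x)) for r in range(N)]

def invert(fhat, x):
    """Reconstruction f(x) = sum over r of fhat(r) * chi_r(x)."""
    return sum(fhat[r] * chi(r, x) for r in range(N))

def f(x):
    """An arbitrary test function Z_N -> C."""
    return (x * x + 3) % 7

fhat = fourier(f)
print([round(abs(invert(fhat, x) - f(x)), 12) for x in range(N)])
```

On a finite group the reconstruction is exact, which is the easy case of the question "does $\hat{\hat{f}} = f$?" raised above.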
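2.3 L^p spaces
Definition 2.2. If $(X, \Omega, \mu)$ is a measure space, then $L^p(X, \Omega, \mu)$ is the space of all
measurable functions $f : X \to \mathbb{R}$ such that
\[ \|f\|_p = \left( \int_X |f|^p \, d\mu \right)^{\frac{1}{p}} < \infty \]
For example, if $X = \mathbb{N}$, $\Omega$ is the set of all finite subsets of $X$, and $\mu$ is the counting
measure, then $\|(x_1, x_2, \ldots, x_n, \ldots)\|_p = \left( \sum |x_i|^p \right)^{\frac{1}{p}}$. For
$p = \infty$, we define
\[ \|x\|_\infty = \sup_{i \in \mathbb{N}} |x_i| \]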
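As a small illustration of these norms under counting measure, the sketch below evaluates $\|x\|_p$ for a finite sequence and several values of $p$. The helper name `p_norm` and the sample vector are assumptions for the example.

```python
import math

def p_norm(x, p):
    """(sum |x_i|^p)^(1/p) for finite p, and sup |x_i| for p = math.inf."""
    if p == math.inf:
        return max(abs(v) for v in x)
    return sum(abs(v) ** p for v in x) ** (1.0 / p)

x = [3.0, -4.0, 1.0]
for p in (1, 2, 10, 100, math.inf):
    print(p, p_norm(x, p))
```

For a fixed finite sequence the values decrease as $p$ grows and approach the sup norm, which is why $\|x\|_\infty$ is the natural definition at $p = \infty$.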
Symmetrization is a technique that we will find useful. Loosely, the idea is that we are averaging over
all the group elements.
Given a function $f : G \to \mathbb{C}$, we symmetrize it by defining $g : G \to \mathbb{C}$ as follows.
\[ g(x) = \int_G f(x + a) \, d\mu(a) \]
We will use this concept in the proof of the following result.
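Proposition 2.1. If $G$ is a locally compact abelian group, with a normalized Haar measure $\mu$, and if
$\chi_1, \chi_2 \in \hat{G}$ are two distinct characters then $\langle \chi_1, \chi_2 \rangle = 0$. i.e.,
\[ I = \int_G \chi_1(x) \overline{\chi_2(x)} \, d\mu(x) = \delta_{\chi_1, \chi_2} =
\begin{cases} 0 & \chi_1 \ne \chi_2 \\ 1 & \chi_1 = \chi_2 \end{cases} \]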
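Proof. For any fixed $a \in G$, translation invariance of $\mu$ gives
$I = \int_G \chi_1(x) \overline{\chi_2(x)} \, d\mu(x) = \int_G \chi_1(x + a) \overline{\chi_2(x + a)} \, d\mu(x)$.
Therefore,
\begin{align*}
I &= \int_G \chi_1(x + a) \overline{\chi_2(x + a)} \, d\mu(x) \\
  &= \int_G \chi_1(x) \chi_1(a) \overline{\chi_2(x)} \, \overline{\chi_2(a)} \, d\mu(x) \\
  &= \chi_1(a) \overline{\chi_2(a)} \int_G \chi_1(x) \overline{\chi_2(x)} \, d\mu(x) \\
  &= \chi_1(a) \overline{\chi_2(a)} \, I
\end{align*}
This can only be true if either $I = 0$ or $\chi_1(a) \overline{\chi_2(a)} = 1$; since $|\chi_2(a)| = 1$, the
latter means $\chi_1(a) = \chi_2(a)$. If $\chi_1 \ne \chi_2$, then there is at least one $a$ such that
$\chi_1(a) \ne \chi_2(a)$. It follows that either $\chi_1 = \chi_2$ or $I = 0$.
By letting $\chi_2$ be the character that is identically 1, we conclude that for any $\chi \in \hat{G}$ with
$\chi \ne 1$,
\[ \int_G \chi(x) \, d\mu(x) = 0. \]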
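The orthogonality relation and the translation step of the proof can be checked on a small example. As an assumption for illustration, the sketch uses the cyclic group $\mathbb{Z}_N$ with characters $\chi_r(x) = e^{2\pi i r x / N}$ and normalized counting measure; the function names are hypothetical.

```python
import cmath

N = 8  # order of the cyclic group Z_N (arbitrary choice for this sketch)

def chi(r, x):
    """Character chi_r(x) = exp(2 pi i r x / N) of Z_N."""
    return cmath.exp(2j * cmath.pi * r * x / N)

def inner_chars(r1, r2):
    """I = integral of chi_r1 * conj(chi_r2) w.r.t. normalized counting measure."""
    return sum(chi(r1, x) * chi(r2, x).conjugate() for x in range(N)) / N

def shifted(r1, r2, a):
    """The same integral after the substitution x -> x + a."""
    return sum(chi(r1, (x + a) % N) * chi(r2, (x + a) % N).conjugate()
               for x in range(N)) / N

# Gram matrix of the characters: close to the identity matrix.
for r1 in range(N):
    print([round(abs(inner_chars(r1, r2)), 10) for r2 in range(N)])
```

Here `shifted(r1, r2, a)` agrees with `inner_chars(r1, r2)` for every `a`, which is the translation invariance used in the first line of the proof.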
2.4 Approximation Theory
Weierstrass's theorem states that the polynomials are dense in $L^\infty[a, b] \cap C[a, b]$.¹ Fejér's theorem
is about approximating functions using trigonometric polynomials.
Proposition 2.2. $\cos nx$ can be expressed as a degree $n$ polynomial in $\cos x$.
Proof. Use the identity $\cos(u + v) + \cos(u - v) = 2 \cos u \cos v$ and induction on $n$.
The polynomial $T_n(x)$ where $T_n(\cos x) = \cos(nx)$ is called the $n$th Chebyshev polynomial. It can be
seen that $T_0(s) = 1$, $T_1(s) = s$, $T_2(s) = 2s^2 - 1$ and in general $T_n(s) = 2^{n-1} s^n$ plus some lower
order terms.
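Taking $u = nx$ and $v = x$ in the identity gives $\cos((n+1)x) = 2 \cos x \cos(nx) - \cos((n-1)x)$, that is, the recurrence $T_{n+1}(s) = 2 s T_n(s) - T_{n-1}(s)$. The sketch below uses that recurrence to build coefficient lists; the function names are hypothetical and the code is only a check of the statements above.

```python
import math

def chebyshev(n):
    """Coefficients of T_n, lowest degree first, via T_{k+1} = 2 s T_k - T_{k-1}."""
    prev, cur = [1], [0, 1]                    # T_0 = 1, T_1 = s
    if n == 0:
        return prev
    for _ in range(n - 1):
        nxt = [0] + [2 * c for c in cur]       # coefficients of 2 s T_k
        for i, c in enumerate(prev):           # subtract T_{k-1}
            nxt[i] -= c
        prev, cur = cur, nxt
    return cur

def evaluate(coeffs, s):
    """Horner evaluation of a polynomial given lowest-degree-first coefficients."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * s + c
    return result

for n in range(6):
    print(n, chebyshev(n))

# Compare T_7(cos x) with cos(7 x) at a sample point.
x = 0.3
print(evaluate(chebyshev(7), math.cos(x)), math.cos(7 * x))
```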
Theorem 2.3 (Chebyshev). The normalized degree $n$ polynomial $p(x) = x^n + \ldots$ that approximates the
function $f(x) = 0$ (on $[-1, 1]$) as well as possible in the $L^\infty[-1, 1]$ norm sense is given by
$\frac{1}{2^{n-1}} T_n(x)$. i.e.,
\[ \min_{p \text{ a normalized polynomial}} \; \max_{-1 \le x \le 1} |p(x)| = \frac{1}{2^{n-1}} \]
This theorem can be proved using linear programming.
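A numerical sanity check of the theorem for $n = 5$: the sketch compares the sup norm of $2^{1-n} T_n$ with that of two other normalized (monic) polynomials of the same degree. The grid-based `sup_norm` is only an approximation of the true maximum, and the comparison polynomials are arbitrary choices for illustration.

```python
import math

def cheb_monic(n, x):
    """2^(1 - n) * T_n(x) on [-1, 1], using T_n(cos t) = cos(n t)."""
    return math.cos(n * math.acos(x)) / 2 ** (n - 1)

def sup_norm(p, m=20001):
    """Approximate max of |p| over [-1, 1] on a grid of m points."""
    return max(abs(p(-1 + 2 * j / (m - 1))) for j in range(m))

n = 5
candidates = {
    "2^(1-n) T_n": lambda x: cheb_monic(n, x),
    "x^5": lambda x: x ** 5,
    "x^5 - x^3": lambda x: x ** 5 - x ** 3,
}
for name, p in candidates.items():
    print(name, sup_norm(p))
```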
¹ This notation is intended to imply that the norm on this space is the sup-norm (clearly
$C[a, b] \subseteq L^\infty[a, b]$).
2.4.1 Moment Problems
Suppose that $X$ is a random variable. The simplest pieces of information about $X$ are its moments. These are
expressions of the form $\mu_r = \int f(x) x^r \, dx$, where $f$ is the probability density function of $X$. A
moment problem asks: Suppose I know all (or some of) the moments $\{ \mu_r \}_{r \in \mathbb{N}}$. Do I know
the distribution of $X$?
Theorem 2.4 (Hausdorff Moment Theorem). If $f, g : [a, b] \to \mathbb{C}$ are two continuous functions and if
for all $r = 0, 1, 2, \ldots$, we have
\[ \int_a^b f(x) x^r \, dx = \int_a^b g(x) x^r \, dx \]
then $f = g$. Equivalently, if $h : [a, b] \to \mathbb{C}$ is a continuous function with
$\int_a^b h(x) x^r \, dx = 0$ for all $r \in \mathbb{N}$, then $h \equiv 0$.
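Proof. By Weierstrass's theorem, we know that for all $\epsilon > 0$, there is a polynomial $P$ such that
$\| \bar{h} - P \|_\infty < \epsilon$. If $\int_a^b h(x) x^r \, dx = 0$ for all $r \in \mathbb{N}$, then it
follows that $\int_a^b h(x) Q(x) \, dx = 0$ for every polynomial $Q(x)$, and so in particular,
$\int_a^b h(x) P(x) \, dx = 0$. Therefore,
\[ 0 = \int_a^b h(x) P(x) \, dx = \int_a^b h(x) \overline{h(x)} \, dx
     + \int_a^b h(x) \left( P(x) - \overline{h(x)} \right) dx \]
Therefore,
\[ \langle h, h \rangle = - \int_a^b h(x) \left( P(x) - \overline{h(x)} \right) dx \]
Since $h$ is continuous, it is bounded on $[a, b]$ by some constant $c$, and so we have
\[ \left| \int_a^b h(x) \left( P(x) - \overline{h(x)} \right) dx \right| \le c \, \epsilon \, |b - a|. \]
Therefore, for any $\delta > 0$ we can pick $\epsilon > 0$ so that $\|h\|_2^2 \le \delta$. Hence
$h \equiv 0$.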
2.4.2 A little Ergodic Theory
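Theorem 2.5. Let $f : \mathbb{T} \to \mathbb{C}$ be continuous and $\alpha$ be irrational. Then
\[ \lim_{n \to \infty} \frac{1}{n} \sum_{r=1}^{n} f\left( e^{2\pi i r \alpha} \right) = \int_{\mathbb{T}} f(t) \, dt \]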
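Proof. We show that this result holds when $f(t) = e^{ist}$ with $s \ne 0$ an integer (the case $s = 0$ is
trivial). Using Fejér's theorem, it will follow that the result holds for any continuous function. Now,
clearly $\frac{1}{2\pi} \int_{\mathbb{T}} e^{ist} \, dt = 0$. Therefore,
\begin{align*}
\left| \frac{1}{n} \sum_{r=1}^{n} e^{2\pi i r s \alpha} - \frac{1}{2\pi} \int_{\mathbb{T}} e^{ist} \, dt \right|
  &= \left| \frac{1}{n} \sum_{r=1}^{n} e^{2\pi i r s \alpha} \right| \\
  &= \frac{1}{n} \left| e^{2\pi i s \alpha} \cdot \frac{1 - e^{2\pi i n s \alpha}}{1 - e^{2\pi i s \alpha}} \right| \\
  &\le \frac{2}{n \left| 1 - e^{2\pi i s \alpha} \right|}
\end{align*}
Since $\alpha$ is irrational, $1 - e^{2\pi i s \alpha}$ is nonzero, and it does not depend on $n$. Therefore,
this quantity goes to zero as $n \to \infty$, and hence the result follows.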
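The estimate in the proof is easy to test numerically. The sketch below takes $\alpha = \sqrt{2}$ as a sample irrational (an assumption for illustration; any irrational would do) and compares the orbit average of $e^{ist}$ with the bound $2 / (n |1 - e^{2\pi i s \alpha}|)$. The function names are hypothetical.

```python
import cmath
import math

alpha = math.sqrt(2)   # sample irrational rotation number (illustrative choice)

def orbit_average(s, n):
    """(1 / n) * sum over r = 1..n of exp(2 pi i r s alpha)."""
    total = sum(cmath.exp(2j * math.pi * r * s * alpha) for r in range(1, n + 1))
    return total / n

def bound(s, n):
    """The estimate 2 / (n * |1 - exp(2 pi i s alpha)|) from the proof."""
    return 2 / (n * abs(1 - cmath.exp(2j * math.pi * s * alpha)))

for s in (1, 2, 3):
    for n in (100, 10000):
        print(s, n, abs(orbit_average(s, n)), bound(s, n))
```

For each fixed $s$ the average decays like $1/n$, but the constant depends on how close $s\alpha$ is to an integer.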
Figure 2.1: Probability of Property v. p
Theorem 2.5 has applications in the evaluation of integrals and of the volume of convex bodies. It is also
used in the proof of the following result.
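Theorem 2.6 (Weyl). Let $\alpha$ be an irrational number. For $x \in \mathbb{R}$, we denote by
$\langle x \rangle = x - [x]$ the fractional part of $x$. For any $0 < a < b < 1$, we have
\[ \lim_{n \to \infty} \frac{\left| \{ 1 \le r \le n : a \le \langle r\alpha \rangle < b \} \right|}{n} = b - a \]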
Proof. We would like to use Theorem 2.5 with the function $f = 1_{[a,b]}$. However, this function is not
continuous. To get around this, we define continuous functions $f^+ \ge 1_{[a,b]} \ge f^-$ that approximate
$f$, for instance piecewise linear functions that differ from $1_{[a,b]}$ only on short intervals around $a$
and $b$. We apply Theorem 2.5 to $f^+$ and $f^-$, let them approach $f$, and pass to the limit.
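Weyl's theorem can be observed directly by counting. The following sketch (with a hypothetical helper `visit_fraction`, and $\sqrt{2}$ chosen as a sample irrational) reports the fraction of $r \le n$ with $a \le \langle r\alpha \rangle < b$; floating-point fractional parts are used, so this is an experiment rather than a proof.

```python
import math

def visit_fraction(alpha, a, b, n):
    """Fraction of r in {1, ..., n} whose fractional part <r alpha> lies in [a, b)."""
    hits = sum(1 for r in range(1, n + 1) if a <= (r * alpha) % 1.0 < b)
    return hits / n

a, b = 0.2, 0.5
for n in (100, 10000, 100000):
    print(n, visit_fraction(math.sqrt(2), a, b, n), b - a)

# A rational rotation number visits only finitely many points,
# so the limit need not equal b - a.
print(visit_fraction(0.5, 0.2, 0.4, 1000))
```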
Theorem 2.6 is related to a more general ergodic theorem by Birkhoff.
Theorem 2.7 (Birkhoff, 1931). Let $(\Omega, \mathcal{F}, p)$ be a probability space and
$T : \Omega \to \Omega$ be a measure preserving transformation. Let $X \in L^1(\Omega, \mathcal{F}, p)$ be a
random variable. Then
\[ \frac{1}{n} \sum_{k=1}^{n} X \circ T^k \to E[X; \mathcal{I}] \]
where $\mathcal{I}$ is the $\sigma$-field of $T$-invariant sets.
2.5 Some Convergence Theorems
We seek conditions under which $S_n(f, t) \to f(t)$ (preferably uniformly). Some history:
du Bois-Reymond gave an example of a continuous function $f$ such that $\limsup_n S_n(f, 0) = \infty$.
Kolmogorov [1] found a Lebesgue measurable function $f : \mathbb{T} \to \mathbb{R}$ such that for all $t$,
$\limsup_n S_n(f, t) = \infty$.
Carleson [2] showed that if $f : \mathbb{T} \to \mathbb{C}$ is a continuous function (even Riemann integrable),
then $S_n(f, t) \to f(t)$ almost everywhere.
Kahane and Katznelson [3] showed that for every $E \subseteq \mathbb{T}$ with $\mu(E) = 0$, there exists a
continuous function $f : \mathbb{T} \to \mathbb{C}$ such that $S_n(f, t) \not\to f(t)$ if and only if
$t \in E$.
Definition 2.3. $\ell_p = L^p(\mathbb{N}, \text{Finite sets}, \text{counting measure})
= \{ x \mid \|(x_0, x_1, \ldots)\|_p < \infty \}$.
Theorem 2.8. Let $f : \mathbb{T} \to \mathbb{C}$ be continuous and suppose that
$\sum_{r \in \mathbb{Z}} |\hat{f}(r)| < \infty$ (so $\hat{f} \in \ell_1$). Then $S_n(f, t) \to f$ uniformly on
$\mathbb{T}$.
Proof. See lecture 3, theorem 3.1.
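As an illustration of Theorem 2.8 (not a proof), consider the example $f(t) = |t|$ on $[-\pi, \pi)$, which is not taken from the notes. Integrating by parts gives $\hat{f}(0) = \pi/2$ and $\hat{f}(r) = ((-1)^r - 1) / (\pi r^2)$ for $r \ne 0$, so the coefficients are absolutely summable. The sketch below measures the sup-norm error of $S_n$ on a grid; the function names are hypothetical.

```python
import math

def fhat(r):
    """Fourier coefficient of f(t) = |t| on [-pi, pi), with the 1/(2 pi) normalization."""
    if r == 0:
        return math.pi / 2
    return ((-1) ** r - 1) / (math.pi * r * r)

def partial_sum(n, t):
    """S_n(f, t); real-valued here because fhat(r) = fhat(-r)."""
    return fhat(0) + sum(2 * fhat(r) * math.cos(r * t) for r in range(1, n + 1))

def sup_error(n, m=400):
    """Largest |S_n(f, t) - f(t)| over m + 1 grid points of [-pi, pi]."""
    ts = [-math.pi + 2 * math.pi * j / m for j in range(m + 1)]
    return max(abs(partial_sum(n, t) - abs(t)) for t in ts)

for n in (10, 100):
    # The tail sum of |fhat(r)| over |r| > n is below 4 / (pi n).
    print(n, sup_error(n), 4 / (math.pi * n))
```

The error at every point is bounded by the tail $\sum_{|r| > n} |\hat{f}(r)|$, which is independent of $t$; this is the mechanism behind uniform convergence when $\hat{f} \in \ell_1$.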
2.6 The L^2 theory
The fact that $e_s(t) = e^{ist}$ is an orthonormal family of functions allows us to develop a very satisfactory
theory. Given a function $f$, the best coefficients $\lambda_1, \lambda_2, \ldots, \lambda_n$ so that
$\| f - \sum_{j=1}^{n} \lambda_j e_j \|_2$ is minimized are given by $\lambda_j = \langle f, e_j \rangle$. This
answer applies just as well in any inner product space (Hilbert space) whenever $\{ e_j \}$ forms an
orthonormal system.
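Theorem 2.9 (Bessel's Inequality). For every $\lambda_1, \lambda_2, \ldots, \lambda_n$,
\[ \left\| f - \sum_{i=1}^{n} \lambda_i e_i \right\|^2 \ge \|f\|^2 - \sum_{i=1}^{n} \langle f, e_i \rangle^2 \]
with equality when $\lambda_i = \langle f, e_i \rangle$.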
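Proof. We offer a proof here for the real case; in the next lecture the complex case will be done as well.
\begin{align*}
\left\| f - \sum_{i=1}^{n} \lambda_i e_i \right\|^2
  &= \left\| \left( f - \sum_{i=1}^{n} \langle f, e_i \rangle e_i \right)
     + \left( \sum_{i=1}^{n} \langle f, e_i \rangle e_i - \sum_{i=1}^{n} \lambda_i e_i \right) \right\|^2 \\
  &= \left\| f - \sum_{i=1}^{n} \langle f, e_i \rangle e_i \right\|^2
     + \left\| \sum_{i=1}^{n} \langle f, e_i \rangle e_i - \sum_{i=1}^{n} \lambda_i e_i \right\|^2
     + \text{cross terms}
\end{align*}
\[ \text{cross terms} = 2 \left\langle f - \sum_{i=1}^{n} \langle f, e_i \rangle e_i, \;
   \sum_{i=1}^{n} \langle f, e_i \rangle e_i - \sum_{i=1}^{n} \lambda_i e_i \right\rangle \]
Observe that the two factors in the cross terms are orthogonal to one another, since
$\langle f - \langle f, e_i \rangle e_i, e_i \rangle = 0$. We write the cross terms as
\[ 2 \sum_{i=1}^{n} \langle f, e_i \rangle \left\langle f - \sum_{j=1}^{n} \langle f, e_j \rangle e_j, \; e_i \right\rangle
   - 2 \sum_{i=1}^{n} \lambda_i \left\langle f - \sum_{j=1}^{n} \langle f, e_j \rangle e_j, \; e_i \right\rangle \]
Observe that each inner product term is 0: the term with $j = i$ contributes
$\langle f - \langle f, e_i \rangle e_i, e_i \rangle = 0$, and for $j \ne i$ the vectors $e_j$ and $e_i$ are
orthogonal. Hence the cross terms vanish.
Of the two remaining squared norms, the first does not depend on the $\lambda_i$'s. We want to make the sum as
small as possible and have control only over the $\lambda_i$'s. Since the second term is a squared norm and
therefore non-negative, the sum is minimized when we set $\lambda_i = \langle f, e_i \rangle$ for all $i$.
With this choice,
\begin{align*}
\left\| f - \sum_{i=1}^{n} \lambda_i e_i \right\|^2
  &= \left\langle f - \sum_{i=1}^{n} \lambda_i e_i, \; f - \sum_{i=1}^{n} \lambda_i e_i \right\rangle \\
  &= \langle f, f \rangle - 2 \sum_{i=1}^{n} \lambda_i \langle f, e_i \rangle + \sum_{i=1}^{n} \lambda_i^2 \\
  &= \|f\|^2 - \sum_{i=1}^{n} \langle f, e_i \rangle^2
\end{align*}
where the last equality is obtained by setting $\lambda_i = \langle f, e_i \rangle$.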
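The statement can be checked in a small real inner product space. The sketch below uses $\mathbb{R}^4$ with three orthonormal vectors and a sample $f$, all chosen arbitrarily for illustration; `dot` and `residual_sq` are hypothetical helpers.

```python
def dot(u, v):
    """Standard inner product on R^d."""
    return sum(a * b for a, b in zip(u, v))

def residual_sq(f, basis, coeffs):
    """|| f - sum_i coeffs[i] * basis[i] ||^2."""
    approx = [sum(c * e[k] for c, e in zip(coeffs, basis)) for k in range(len(f))]
    diff = [a - b for a, b in zip(f, approx)]
    return dot(diff, diff)

# Three orthonormal vectors in R^4 (an incomplete orthonormal system).
basis = [[0.5, 0.5, 0.5, 0.5],
         [0.5, -0.5, 0.5, -0.5],
         [0.5, 0.5, -0.5, -0.5]]
f = [3.0, 1.0, 4.0, 1.0]

best = [dot(f, e) for e in basis]                     # lambda_i = <f, e_i>
bessel_value = dot(f, f) - sum(c * c for c in best)   # ||f||^2 - sum <f, e_i>^2

print(residual_sq(f, basis, best), bessel_value)
# Any other choice of coefficients gives a larger residual.
print(residual_sq(f, basis, [best[0] + 0.1, best[1], best[2] - 0.2]))
```

Perturbing the coefficients by $\delta_i$ increases the squared residual by exactly $\sum_i \delta_i^2$, which is the second squared norm in the proof.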
References
[1] A. N. Kolmogorov, Une série de Fourier-Lebesgue divergente partout, CRAS Paris, 183, pp. 1327-1328, 1926.
[2] L. Carleson, Convergence and growth of partial sums of Fourier series, Acta Math. 116, pp. 135-157, 1964.
[3] J-P Kahane and Y. Katznelson, Sur les ensembles de divergence des séries trigonométriques, Studia
Mathematica, 26, pp. 305-306, 1966.
