Integral Equations
051012 F. Porter
Revision 140116 F. Porter
1 Introduction
The integral equation problem is to find the solution to:

    h(x)f(x) = g(x) + λ ∫_a^b k(x, y)f(y) dy.                            (1)

We are given functions h(x), g(x), k(x, y), and wish to determine f(x). The
quantity λ is a parameter, which may be complex in general. The bivariate
function k(x, y) is called the kernel of the integral equation.
We shall assume that h(x) and g(x) are defined and continuous on the
interval a ≤ x ≤ b, and that the kernel is defined and continuous on a ≤ x ≤ b
and a ≤ y ≤ b. Here we will concentrate on the problem for real variables
x and y. The functions may be complex-valued, although we will sometimes
simplify the discussion by considering real functions. However, many of the
results can be generalized in fairly obvious ways, such as relaxation to
piecewise continuous functions, and generalization to multiple dimensions.
There are many resources for further reading on this subject. Some of
the popular ones among physicists include the “classic” texts by Mathews
and Walker, Courant and Hilbert, Whittaker and Watson, and Margenau
and Murphy, as well as the newer texts by Arfken, and Riley, Hobson, and
Bence.
2 Integral Transforms
If h(x) = 0, we can take λ = −1 without loss of generality and obtain the
integral equation:

    g(x) = ∫_a^b k(x, y)f(y) dy.                                         (2)
This is called a Fredholm equation of the first kind or an integral
transform. Particularly important examples of integral transforms include
the Fourier transform and the Laplace transform, which we now discuss.
2.1 Fourier Transforms
A special case of a Fredholm equation of the first kind is

    a = −∞,                                                              (3)
    b = +∞,                                                              (4)
    k(x, y) = (1/√(2π)) e^{−ixy}.                                        (5)

This is known as the Fourier transform:

    g(x) = (1/√(2π)) ∫_{−∞}^{∞} e^{−ixy} f(y) dy.                        (6)
Note that the kernel is complex in this case.
The solution to this equation is given by:

    f(y) = (1/√(2π)) ∫_{−∞}^{∞} e^{ixy} g(x) dx.                         (7)
We’ll forego rigor here and give the “physicist’s” demonstration of this:

    g(x) = (1/2π) ∫_{−∞}^{∞} e^{−ixy} dy ∫_{−∞}^{∞} e^{ix′y} g(x′) dx′   (8)
         = (1/2π) ∫_{−∞}^{∞} g(x′) dx′ ∫_{−∞}^{∞} e^{i(x′−x)y} dy        (9)
         = ∫_{−∞}^{∞} g(x′) δ(x − x′) dx′                                (10)
         = g(x).                                                         (11)

Here, we have used the fact that the Dirac “delta-function” may be written

    δ(x) = (1/2π) ∫_{−∞}^{∞} e^{ixy} dy.                                 (12)
The reader is encouraged to demonstrate this, if s/he has not done so before.
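As a quick concrete check of the transform pair (6)–(7), one can apply both
integrals numerically to a rapidly decreasing test function. A minimal sketch
(in Python with numpy; the grid, truncation, and Gaussian test function are
illustrative choices, not from the text):

```python
import numpy as np

# Grid standing in for the infinite interval; the Gaussian test function
# decays fast enough that truncation at |x| = 15 is harmless.
x = np.linspace(-15.0, 15.0, 1001)
dx = x[1] - x[0]
f = np.exp(-x**2 / 2)

# Eq. (6): g(x) = (1/sqrt(2 pi)) * integral of exp(-i x y) f(y) dy
g = np.exp(-1j * np.outer(x, x)) @ f * dx / np.sqrt(2 * np.pi)
# Eq. (7): f(y) = (1/sqrt(2 pi)) * integral of exp(+i x y) g(x) dx
f_back = np.exp(1j * np.outer(x, x)) @ g * dx / np.sqrt(2 * np.pi)

print(np.max(np.abs(f_back.real - f)))  # small; the round trip recovers f
```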
It is instructive to notice that the Fourier transform may be regarded as
a limit of the Fourier series. Let f(x) be expanded in a Fourier series in a
box of size [−L/2, L/2]:

    f(x) = Σ_{n=−∞}^{∞} a_n e^{2πinx/L}.                                 (13)
Hence,

    a_n = (1/L) ∫_{−L/2}^{L/2} f(x) e^{−2πinx/L} dx.                     (15)
Now consider taking the limit as L → ∞. In this limit, the summation
goes over to a continuous integral. Let y = 2πn/L and g(y) = L a_n/√(2π).
Then, using dn = (L/2π) dy,

    f(x) = lim_{L→∞} Σ_{n=−∞}^{∞} a_n e^{2πinx/L}                        (16)
         = lim_{L→∞} Σ_{n=−∞}^{∞} (√(2π)/L) g(y) e^{ixy}                 (17)
         = (1/√(2π)) ∫_{−∞}^{∞} e^{ixy} g(y) dy.                         (18)

Furthermore:

    g(y) = L a_n/√(2π) = (1/√(2π)) ∫_{−∞}^{∞} f(x) e^{−ixy} dx.          (19)
We thus verify our earlier statements, including the δ-function equivalence,
assuming our limit procedure is acceptable.
Suppose now that f(y) is an even function, f(−y) = f(y). Then,

    g(x) = (1/√(2π)) [∫_{−∞}^{0} e^{−ixy} f(y) dy + ∫_{0}^{∞} e^{−ixy} f(y) dy]  (20)
         = (1/√(2π)) ∫_{0}^{∞} [e^{ixy} + e^{−ixy}] f(y) dy                      (21)
         = √(2/π) ∫_{0}^{∞} f(y) cos xy dy.                                      (22)
This is known as the Fourier cosine transform. It may be observed that
the transform g(x) will also be an even function, and the solution for f(y) is:

    f(y) = √(2/π) ∫_{0}^{∞} g(x) cos xy dx.                              (23)
Similarly, if f(y) is an odd function, we have the Fourier sine transform:

    g(x) = √(2/π) ∫_{0}^{∞} f(y) sin xy dy,                              (24)

where a factor of −i has been absorbed. The solution for f(y) is

    f(y) = √(2/π) ∫_{0}^{∞} g(x) sin xy dx.                              (25)
Let us briefly make some observations concerning an approach to a more
rigorous discussion. Later we shall see that if the kernel k(x, y) satisfies
conditions such as square-integrability on [a, b] then convenient behavior is
achieved for the solutions of the integral equation. However, in the present
case, we not only have |a|, |b| → ∞, but the kernel e^{ixy} nowhere approaches
zero. Thus, great care is required to ensure valid results.
We may deal with this difficult situation by starting with a set of functions
which are themselves sufficiently well-behaved (e.g., approach zero rapidly
as |x| → ∞) that the behavior of the kernel is mitigated. For example, in
quantum mechanics we may construct our Hilbert space of acceptable wave
functions on R³ by starting with a set S of functions f(x) which are infinitely
differentiable and which, with all of their derivatives, vanish faster than any
power of 1/|x| as |x| → ∞.
We could approach the proof of the Fourier inverse theorem with more
rigor than our limit of a series as follows: First, consider that subset of S
consisting of Gaussian functions. Argue that any function in S may be
approximated arbitrarily closely by a series of Gaussians. Then note that
the S functions form a pre-Hilbert space (also known as a Euclidean space).
Add the completion to get a Hilbert space, and show that the theorem
remains valid.
The Fourier transform appears in many physical situations via its con-
nection with waves, for example in the analysis of linear circuits such as the
following.

[Figure 1: Circuit with input Vi(t), series resistor R1, and capacitor C in
parallel with resistor R2; the output Vo(t) is taken across the parallel
combination.]

For a sinusoidal input, each element may be replaced by its complex
impedance:
    R1 → Z1 = R1,                                                        (28)
    R2 → Z2 = R2,                                                        (29)
    C → ZC = 1/(iωC).                                                    (30)

Then it is a simple matter to solve for Vo(t):

    Vo(t) = Vi(t) · 1 / [1 + (R1/R2)(1 + iωR2C)],                        (31)

if Vi(t) = sin(ωt + φ), and where it is understood that the real part is to be
taken.
Students usually learn how to obtain the result in Eqn. 31 long before
they know about the Fourier transform. However, it is really the result in
the frequency domain according to the Fourier transform. That is:

    V̂o(ω) = (1/√(2π)) ∫_{−∞}^{∞} Vo(t) e^{−iωt} dt                       (32)
          = V̂i(ω) · 1 / [1 + (R1/R2)(1 + iωR2C)].                        (33)

We are here using the “hat” (ˆ) notation to indicate the integral transform of
the unhatted function. The answer to the problem for general (not necessarily
sinusoidal) input Vi(t) is then:

    Vo(t) = (1/√(2π)) ∫_{−∞}^{∞} V̂o(ω) e^{iωt} dω                        (34)
          = (1/√(2π)) ∫_{−∞}^{∞} V̂i(ω) e^{iωt} / [1 + (R1/R2)(1 + iωR2C)] dω.  (35)
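The frequency-domain prescription of Eqs. 32–35 translates directly into a
numerical recipe: transform the input, multiply by the transfer function, and
transform back. A minimal sketch, using the discrete FFT as a stand-in for
the Fourier transform; the component values and input pulse are made up
for the illustration:

```python
import numpy as np

R1, R2, C = 1.0e3, 2.0e3, 1.0e-6              # ohms, ohms, farads (assumed)
t = np.linspace(0.0, 0.05, 1 << 14, endpoint=False)
dt = t[1] - t[0]
Vi = ((t > 0.01) & (t < 0.02)).astype(float)  # a 10 ms square pulse

omega = 2 * np.pi * np.fft.fftfreq(t.size, d=dt)
H = 1.0 / (1.0 + (R1 / R2) * (1.0 + 1j * omega * R2 * C))
Vo = np.fft.ifft(np.fft.fft(Vi) * H).real     # Eq. (35), discretized

# Long after transients, the flat top approaches Vi * R2/(R1+R2) = 2/3 here.
print(Vo.max())
```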
2.2 Laplace Transforms

The Laplace transform is an integral transform of the form:

    F(s) = ∫_{0}^{∞} f(x) e^{−sx} dx.                                    (36)

Writing s = c + iy, with the constant c chosen to the right of any singu-
larities of F(s), the inverse is

    f(x)θ(x) = (1/2π) ∫_{−∞}^{∞} F(s) e^{x(c+iy)} dy                     (43)
             = (1/2πi) ∫_{c−i∞}^{c+i∞} F(s) e^{xs} ds,                   (44)
which is the above-asserted result.
We group together here some useful theorems for Fourier and Laplace
transforms. First define some notation. Let

    (Ff)(y) = g(y) = (1/√(2π)) ∫_{−∞}^{∞} f(x) e^{−ixy} dx               (45)

be the Fourier transform of f, and

    (Lf)(s) = F(s) = ∫_{0}^{∞} f(x) e^{−sx} dx                           (46)

be the Laplace transform of f.
5. Multiplication by an exponential:

       {F[e^{ax} f(x)]}(y) = (Ff)(y + ia),
       {L[e^{ax} f(x)]}(s) = (Lf)(s − a).

6. Multiplication by x:

       {F[x f(x)]}(y) = i (d/dy)(Ff)(y),                                 (56)
       {L[x f(x)]}(s) = −(d/ds)(Lf)(s).                                  (57)
Theorem: (Convolution theorem) Let g(x) = ∫ f1(y) f2(x − y) dy. Then

    (Fg)(y) = √(2π) (Ff1)(y) (Ff2)(y),                                   (59)
    (Lg)(y) = √(2π) (Lf1)(y) (Lf2)(y).                                   (60)
Corresponding to the first of the three equations we obtain (where the
hat now indicates the Laplace transform):

    V̂o(s) = ∫_{0}^{∞} Vo(t) e^{−st} dt = V̂i(s) − î(s)R1,                 (64)

and, for the voltage across the capacitor,

    V̂o(s) = (1/sC) îC(s).                                                (66)

Now we have three simultaneous algebraic equations, which may be readily
solved for V̂o(s):

    V̂o(s) = V̂i(s) · 1 / [1 + (R1/R2)(1 + sR2C)].                         (67)
We note the similarity with Eqn. 33. Going back to the time domain, we
find:

    Vo(t) = (1/2πi) ∫_{a−i∞}^{a+i∞} V̂i(s) e^{st} / [1 + (R1/R2)(1 + sR2C)] ds.  (68)
For example, let’s suppose that Vi(t) is a brief pulse, with V·Δt ∼ A, at t = 0.
Let’s model this as:

    Vi(t) = A δ(t − ε),                                                  (69)

where ε is a small positive number, inserted to make sure we don’t get into
trouble with the t = 0 boundary in the Laplace transform. Then:

    V̂i(s) = ∫_{0}^{∞} A δ(t − ε) e^{−st} dt = A e^{−sε}.                 (70)

Substituting into Eqn. 68, the integrand may be expressed in terms of the
time constant

    τ ≡ R1R2C/(R1 + R2).                                                 (72)
The integrand has a pole at s = −1/τ. We thus choose the contour of
integration as in Fig. 2. A contour of this form is known as a “Bromwich
contour”.

[Figure 2: The Bromwich contour in the complex s plane: a vertical line at
Re(s) = a, closed to the left, enclosing the pole (×) at s = −1/τ.]
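The inversion integral (68) can also be carried out numerically. The sketch
below uses mpmath’s invertlaplace (a numerical stand-in for the Bromwich
integral) on the pulse example, and compares against the single-pole residue
result (A/R1C) e^{−(t−ε)/τ}; the closed-form line is our own evaluation of
the residue at s = −1/τ, since the corresponding equation does not survive
in this copy. Values are illustrative:

```python
from mpmath import mp, invertlaplace, exp

mp.dps = 25
A, R1, R2, C, eps = 1.0, 1.0e3, 2.0e3, 1.0e-6, 1e-9
tau = R1 * R2 * C / (R1 + R2)

def Vo_hat(s):
    # \hat{V}_o(s) = A e^{-s eps} / [1 + (R1/R2)(1 + s R2 C)], Eqs. (67), (70)
    return A * exp(-s * eps) / (1 + (R1 / R2) * (1 + s * R2 * C))

for t in (0.5 * tau, tau, 2 * tau):
    num = invertlaplace(Vo_hat, t, method='talbot')
    ana = (A / (R1 * C)) * exp(-(t - eps) / tau)   # residue at s = -1/tau
    print(float(num), float(ana))                  # the two should agree
```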
We next consider a method that suggests the breadth of application of these
ideas: the “Laplace’s Method” for the solution of ordinary differential equa-
tions. This method represents a sort of generalization of the Laplace trans-
form, using the feature of turning derivatives into powers.
Suppose we wish to solve the differential equation:

    Σ_{k=0}^{n} (a_k + b_k x) f^{(k)}(x) = 0,                            (74)

where the coefficients are polynomials of at most first order. We assume a
solution of the form

    f(x) = ∫_C F(s) e^{sx} ds,

with the contour C to be chosen. Substituting, and noting that each deriva-
tive brings down a power of s, we find

    0 = ∫_C [U(s) + xV(s)] F(s) e^{sx} ds,                               (78)

where

    U(s) = Σ_{k=0}^{n} a_k s^k,                                          (79)
    V(s) = Σ_{k=0}^{n} b_k s^k.                                          (80)

Writing x e^{sx} = (d/ds) e^{sx} and integrating the second term by parts
gives a boundary (“integrated”) term plus −∫_C (d/ds)[V(s)F(s)] e^{sx} ds.
We assume that we can choose C such that the integrated part vanishes.
Then we will have a solution to the differential equation if

    U(s)F(s) − (d/ds)[V(s)F(s)] = 0.                                     (83)
Note that we have transformed a problem with high-order derivatives (but
only first order polynomial coefficients) to a problem with first-order deriva-
tives only, but with high-order polynomial coefficients.
Formally, we find a solution as:

    (d/ds)[V(s)F(s)] = U(s)F(s)                                          (84)
    V(s) dF/ds = U(s)F(s) − F(s) dV/ds                                   (85)
    d ln F/ds = U/V − d ln V/ds                                          (86)
    ln F = ∫ (U/V − d ln V/ds) ds                                        (87)
         = ∫ (U/V) ds − ln V + ln A,                                     (88)

where A is an arbitrary constant. Thus, the solution for F(s) is:

    F(s) = [A/V(s)] exp[∫^s U(s′)/V(s′) ds′].                            (89)

For example, for the Hermite equation, f″(x) − 2x f′(x) + 2ν f(x) = 0, we
have

    U(s) = a0 + a2 s² = 2ν + s²,                                         (91)
    V(s) = b1 s = −2s.                                                   (92)
12
Im(s)
Re(s)
13
Recall that the residue is the coefficient of the 1/s term in the Laurent series
expansion. Hence,

    Hn(x) = n! × coefficient of s^n in e^{−s²+2sx}.                      (101)

That is,

    e^{−s²+2sx} = Σ_{n=0}^{∞} [Hn(x)/n!] s^n.                            (102)
This is the “generating function” for the Hermite polynomials.
The term “generating function” is appropriate, since we have:

    Hn(x) = lim_{s→0} (d^n/ds^n) e^{−s²+2sx}                             (103)
    H0(x) = lim_{s→0} e^{−s²+2sx} = 1                                    (104)
    H1(x) = lim_{s→0} (−2s + 2x) e^{−s²+2sx} = 2x,                       (105)

and so forth.
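The generating-function relation (102) is easy to verify with a computer
algebra system. A minimal sketch with sympy, expanding e^{−s²+2sx} in s
and comparing n! times the s^n coefficient against the standard Hermite
polynomials:

```python
import sympy as sp

s, x = sp.symbols('s x')
# Series expansion of the generating function, Eq. (102)
series = sp.series(sp.exp(-s**2 + 2*s*x), s, 0, 6).removeO()
for n in range(5):
    Hn = sp.factorial(n) * series.coeff(s, n)   # Eq. (101)
    print(n, sp.expand(Hn), sp.expand(Hn - sp.hermite(n, x)) == 0)
```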
If Lf1 = g1 and Lf2 = g2 imply, for arbitrary constants c1 and c2,

    L(c1 f1 + c2 f2) = c1 g1 + c2 g2,                                    (110)

then the operator L is linear. Our integral equation may be written in the
form

    |g⟩ = |f⟩ − λK|f⟩,                                                   (111)

where K|f⟩ indicates here the integral ∫_a^b k(x, y)f(y) dy. Our linear op-
erator is then written:

    L = I − λK,                                                          (112)

where I is the identity operator.
We are interested in the problem of inverting this linear transformation
– given g, what is f ? As it is a linear transformation, it should not be
surprising that the techniques are analogous with those familiar in matrix
equations. The difference is that we are now dealing with vector spaces that
are infinite-dimensional function spaces.
or even

    Σ_f |f⟩⟨f| = I_f,                                                    (117)

where I_f is the identity in the subspace spanned by {f}.
A value of λ for which the homogeneous equation has non-trivial solutions
is called an eigenvalue of the equation (or, of the kernel). Note that the use
of the term eigenvalue here is analogous with, but different in detail from,
the usage in matrices – our present eigenvalue is more similar with the inverse
of a matrix eigenvalue. The corresponding solutions are called eigenfunctions
of the kernel for eigenvalue λ. We have the following:

Theorem: There are a finite number of eigenfunctions fi corresponding to
a given eigenvalue λ.

Proof: We’ll prove this for real functions, leaving the complex case as an
exercise. Given an eigenfunction fj corresponding to eigenvalue λ, let:

    pj(x) ≡ ∫_a^b k(x, y)fj(y) dy = (1/λ) fj(x).                         (118)
Now consider, for some set of n orthonormal eigenfunctions corresponding
to eigenvalue λ:

    D(x) ≡ λ² ∫_a^b [k(x, y) − Σ_{j=1}^{n} pj(x)fj(y)]² dy.              (119)

It must be that D(x) ≥ 0 because the integrand is nowhere negative for any
x. Note that the sum term may be regarded as an approximation to the
kernel, hence D(x) is a measure of the closeness of the approximation. With
some manipulation:

    D(x) = λ² ∫_a^b [k(x, y)]² dy − 2λ² ∫_a^b Σ_{j=1}^{n} k(x, y)pj(x)fj(y) dy
             + λ² ∫_a^b [Σ_{j=1}^{n} pj(x)fj(y)]² dy
         = λ² ∫_a^b [k(x, y)]² dy − 2λ² Σ_{j=1}^{n} [pj(x)]²
             + λ² Σ_{j=1}^{n} Σ_{k=1}^{n} pj(x)pk(x) ∫_a^b fj(y)fk(y) dy
         = λ² ∫_a^b [k(x, y)]² dy − λ² Σ_{j=1}^{n} [pj(x)]²,             (120)
using the orthonormality of the fj. Since D(x) ≥ 0, we have
Σ_{j=1}^{n} [pj(x)]² ≤ ∫_a^b [k(x, y)]² dy; integrating over x, and using
∫_a^b [pj(x)]² dx = 1/λ², gives n ≤ λ² ∫∫ [k(x, y)]² dx dy. As long as
∫∫ k² dx dy is bounded, we see that n must be finite. For finite a and b, this
is certainly satisfied, by our continuity assumption for k. Otherwise, we may
impose this as a requirement on the kernel.
More generally, we regard “nice” kernels as those for which

    ∫_a^b ∫_a^b [k(x, y)]² dy dx < ∞,                                    (123)
    ∫_a^b [k(x, y)]² dx < U1, ∀y ∈ [a, b],                               (124)
    ∫_a^b [k(x, y)]² dy < U2, ∀x ∈ [a, b],                               (125)

where U1 and U2 are some fixed upper bounds. We will assume that these
conditions are satisfied in our following discussion. Note that the kernel may
actually be discontinuous and even become infinite in [a, b], as long as these
conditions are satisfied.
If the kernel can be written as a finite sum of factorized terms,

    k(x, y) = Σ_{i=1}^{n} φi(x)ψi*(y)

(or K = Σ_{i=1}^{n} |φi⟩⟨ψi|), then the kernel is called degenerate. We may
assume that the φi(x) are linearly independent. Otherwise we could reduce
the number of terms in the sum to use only independent functions. Likewise
we may assume that the ψi(x) are linearly independent.
The notion of a degenerate kernel is important due to two facts:
1. Any continuous function k(x, y) can be uniformly approximated by
polynomials in a closed interval. That is, the polynomials are “complete”
on a closed bounded interval.
2. The solution of the integral equation for degenerate kernels is easy (at
least formally).
The first fact is known under the label Weierstrass Approximation
Theorem. A proof by construction may be found in Courant and Hilbert.
We remind the reader of the notion of uniform convergence in the sense used
here:
Definition (Uniform Convergence): If S(z) = Σ_{n=0}^{∞} un(z) and
S_N(z) = Σ_{n=0}^{N} un(z), then S(z) is said to be uniformly convergent
over the set of points A = {z|z ∈ A} if, given any ε > 0, there exists an
integer N0 such that |S(z) − S_N(z)| < ε for all N > N0 and all z ∈ A.
Note that this is a rather strong form of convergence – a series may converge
for all z ∈ A, but may not be uniformly convergent.
Let us now pursue the second fact asserted above. We wish to solve for f:

    g(x) = f(x) − λ ∫_a^b k(x, y)f(y) dy.                                (128)

If the kernel is degenerate, we have:

    g(x) = f(x) − λ Σ_{i=1}^{n} φi(x) ∫_a^b ψi*(y)f(y) dy.               (129)
Multiply Eq. 129 through by ψj*(x) and integrate over x to obtain:

    gj = fj − λ Σ_{i=1}^{n} cji fi,                                      (133)

where gj ≡ ∫_a^b ψj*(x)g(x) dx, fj ≡ ∫_a^b ψj*(x)f(x) dx, and
cji ≡ ∫_a^b ψj*(x)φi(x) dx. This is a system of n linear equations for the fj;
once it is solved, the solution of the integral equation is

    f(x) = g(x) + λ Σ_{i=1}^{n} fi φi(x).

Substituting in:

    g(x) = g(x) + λ Σ_{i=1}^{n} fi φi(x)
             − λ Σ_{i=1}^{n} φi(x) ∫_a^b ψi*(y) [g(y) + λ Σ_{j=1}^{n} fj φj(y)] dy
         = g(x) + λ Σ_{i=1}^{n} φi(x) [fi − gi − λ Σ_{j=1}^{n} cij fj]
         = g(x).                                                         (135)
Let us try an explicit example to illustrate how things work. We wish to
solve the equation:

    x² = f(x) − λ ∫_0^1 x(1 + y)f(y) dy.                                 (136)
In this case, n = 1, and it is clear that the solution is simply a quadratic
polynomial which can be determined directly. However, let us apply our new
method instead. We have g(x) = x² and k(x, y) = x(1 + y). The kernel is
degenerate, with φ1(x) = x and ψ1(y) = 1 + y. Our constants evaluate to:

    g1 = ∫_0^1 (1 + x)x² dx = 7/12,                                      (137)
    c11 = ∫_0^1 x(1 + x) dx = 5/6.                                       (138)

The linear equation we need to solve is then:

    7/12 = f1 − λ (5/6) f1,                                              (139)

giving

    f1 = (7/2) · 1/(6 − 5λ),                                             (140)

and

    f(x) = x² + (7/2) · λ/(6 − 5λ) · x.                                  (141)

The reader is encouraged to check that this is a solution to the original
equation, and that no solution exists if λ = 6/5.
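The whole procedure reduces to two one-dimensional integrals and one scalar
linear equation, so it is easy to check numerically. A sketch in Python (the
value of λ is an arbitrary non-eigenvalue choice):

```python
import numpy as np
from scipy.integrate import quad

lam = 1.0                                        # any lam != 6/5
g1  = quad(lambda x: (1 + x) * x**2, 0, 1)[0]    # Eq. (137): 7/12
c11 = quad(lambda x: x * (1 + x), 0, 1)[0]       # Eq. (138): 5/6
f1  = g1 / (1 - lam * c11)                       # solve Eq. (139)

f = lambda x: x**2 + lam * f1 * x                # f = g + lam * f1 * phi_1
# Verify the original equation (136) at a few points:
for xv in (0.2, 0.5, 0.9):
    rhs = f(xv) - lam * quad(lambda y: xv * (1 + y) * f(y), 0, 1)[0]
    print(xv**2, rhs)                            # the two columns agree
```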
To investigate this special value λ = 6/5, consider the homogeneous equation:

    f(x) = λ ∫_0^1 x(1 + y)f(y) dy.                                      (142)

We may use the same procedure in this case, except now g1 = 0 and we find
that

    f1 (1 − 5λ/6) = 0.                                                   (143)

Either f1 = 0 or λ = 6/5. If f1 = 0, then f(x) = g(x) + λf1 φ1(x) = 0. If
λ ≠ 6/5 the only solution to the homogeneous equation is the trivial one. But
if λ = 6/5 the solution to the homogeneous equation is f(x) = ax, where a
is arbitrary. The value λ = 6/5 is an (in this case the only) eigenvalue of the
integral equation, with corresponding normalized eigenfunction f(x) = √3 x.
This example suggests the plausibility of the important theorem in the
next section.
Theorem: (The Fredholm alternative) Either the integral equation

    f(x) = g(x) + λ ∫_a^b k(x, y)f(y) dy,                                (144)

with given λ, possesses a unique continuous solution f(x) for each con-
tinuous function g(x) (and in particular f(x) = 0 if g(x) = 0), or the
associated homogeneous equation

    f(x) = λ ∫_a^b k(x, y)f(y) dy                                        (145)

possesses a finite number of linearly independent solutions.
Proof: Consider first a degenerate kernel, k(x, y) = Σ_{i=1}^{n} φi(x)ψi*(y),
and let

    gi ≡ ⟨ψi|g⟩,                                                         (148)
    fi ≡ ⟨ψi|f⟩,                                                         (149)
    cij ≡ ⟨ψj|φi⟩.                                                       (150)

Then,

    fj = gj + λ Σ_{i=1}^{n} cji fi,                                      (151)

or

    g = (g1, g2, . . . , gn)ᵀ = (I − λC)f,                                (152)

where C is the matrix formed of the cij constants.
Thus, we have a system of n linear equations for the n unknowns {fi }.
Either the matrix I − λC is non-singular, in which case a unique solution f
exists for any given g (in particular f = 0 if g = 0), or I − λC is singular, in
which case the homogeneous equation f = λCf possesses a finite number of
linearly independent solutions. Up to some further considerations concerning
continuity, this proves the theorem for the case of a degenerate kernel.
We may extend the proof to arbitrary kernels by appealing to the fact
that any continuous function k(x, y) may be uniformly approximated by
degenerate kernels in a closed interval (for example, see Courant and Hilbert).
There is an additional useful theorem under Fredholm’s name:
Theorem: If the integral equation

    f(x) = g(x) + λ ∫_a^b k(x, y)f(y) dy                                 (153)
5 Practical Approaches
We turn now to a discussion of some practical “tools of the trade” for solving
integral equations.
5.2 Volterra’s Equations

Integral equations of the form:

    g(x) = λ ∫_a^x k(x, y)f(y) dy,                                       (158)
    f(x) = g(x) + λ ∫_a^x k(x, y)f(y) dy,                                (159)

are called Volterra’s equations of the first and second kind, respectively.
One situation where such equations arise is when k(x, y) = 0 for y > x:
k(x, y) = θ(x − y)ℓ(x, y). Thus,

    ∫_a^b k(x, y)f(y) dy = ∫_a^x ℓ(x, y)f(y) dy.                         (160)
We may use this to transform the equation of the first kind to:

    dg/dx (x) = λ k(x, x)f(x) + λ ∫_a^x (∂k/∂x)(x, y)f(y) dy.            (162)

This is now a Volterra’s equation of the second kind, and the approach to
solution may thus be similar.
Notice that if the kernel is independent of x, k(x, y) = k(y), then the
solution to the equation of the first kind is simply:

    f(x) = [1/λk(x)] dg/dx (x).                                          (163)
Let us try a simple example. Suppose we wish to find f(x) in:

    x² = 1 + λ ∫_1^x xy f(y) dy.                                         (164)

This may be solved with various approaches. Let φ(x) ≡ ∫_1^x y f(y) dy. Then

    φ(x) = (x² − 1)/λx.                                                  (165)

Now take the derivative of both sides of the original equation:

    2x = λx² f(x) + λ ∫_1^x y f(y) dy = λx² f(x) + λφ(x).                (166)
As always, especially when you have taken derivatives, it should be checked
that the result actually solves the original equation!
This was pretty easy, but it is even easier if we notice that this problem
is actually equivalent to one with an x-independent kernel. That is, we may
rewrite the equation as:

    (x² − 1)/x = λ ∫_1^x y f(y) dy.                                      (168)
A numerical approach is to discretize the equation of the second kind on a
uniform grid x_n = a + nΔ, n = 0, 1, . . . , N, defining:

    gn = g(xn),                                                          (171)
    fn = f(xn),                                                          (172)
    knm = k(xn, xm).                                                     (173)

Note that f0 = g0.
We may pick various approaches to the numerical integration; for exam-
ple, the trapezoidal rule gives:

    ∫_a^{xn} k(xn, y)f(y) dy ≈ Δ [½ kn0 f0 + Σ_{m=1}^{n−1} knm fm + ½ knn fn].  (174)
Solving for fn then gives:

    fn = [gn + Δ(½ kn0 f0 + Σ_{m=1}^{n−1} knm fm)] / [1 − (Δ/2) knn],  n = 1, 2, . . . , N.  (176)

For example,

    f0 = g0,                                                             (177)
    f1 = [g1 + (Δ/2) k10 f0] / [1 − (Δ/2) k11],                          (178)
    f2 = [g2 + (Δ/2) k20 f0 + Δ k21 f1] / [1 − (Δ/2) k22],               (179)

and so forth.
We note that we don’t even have to explicitly solve a system of linear
equations, as we did for Fredholm’s equation with a degenerate kernel. There
are of order

    Σ_{n=1}^{N} O(n) = O(N²)                                             (180)

operations required to compute all of the fn.
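A direct implementation of the recursion (176) is only a few lines. A sketch
in Python, with λ absorbed into the kernel as in Eqs. 174–179, and tested on
a case with a known closed form (k = 1, g = 1, for which the equation
f(x) = 1 + ∫_0^x f(y) dy has solution f(x) = e^x):

```python
import numpy as np

def volterra2_trapezoid(g, k, a, b, N):
    """Trapezoidal recursion (176) for f(x) = g(x) + int_a^x k(x,y) f(y) dy."""
    x = np.linspace(a, b, N + 1)
    d = (b - a) / N
    f = np.empty(N + 1)
    f[0] = g(x[0])                               # Eq. (177): f_0 = g_0
    for n in range(1, N + 1):
        s = 0.5 * k(x[n], x[0]) * f[0] \
            + sum(k(x[n], x[m]) * f[m] for m in range(1, n))
        f[n] = (g(x[n]) + d * s) / (1 - 0.5 * d * k(x[n], x[n]))
    return x, f

x, f = volterra2_trapezoid(lambda x: 1.0, lambda x, y: 1.0, 0.0, 1.0, 100)
print(f[-1], np.e)                               # close to e = 2.71828...
```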
A related practical approach is iteration: take f1(x) = g(x) as a first ap-
proximation, and substitute it into the right side of the equation to obtain
f2. This may be continued indefinitely, with the nth iterative solution given
in terms of the (n − 1)th:

    fn(x) = g(x) + λ ∫_a^b k(x, y)fn−1(y) dy                             (185)
          = g(x) + λ ∫_a^b k(x, y)g(y) dy                                (186)
            + λ² ∫_a^b ∫_a^b k(x, y)k(y, y′)g(y′) dy dy′
            + . . .
            + λ^{n−1} ∫_a^b · · · ∫_a^b k(x, y) · · · k(y^{(n−2)′}, y^{(n−1)′}) g(y^{(n−1)′}) dy . . . dy^{(n−1)′}.
This method is only useful if the series converges, and the faster the bet-
ter. It will converge if the kernel is bounded and λ is “small enough”.
We won’t pursue this further here, except to note what happens if

    ∫_a^b k(x, y)g(y) dy = 0.                                            (188)

In this case, the series clearly converges, to the solution f(x) = g(x). How-
ever, this solution is not necessarily unique, as we may add any linear com-
bination of solutions to the homogeneous equation.
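On a quadrature grid the iteration (185) is just repeated application of a
matrix. A sketch (kernel, λ, and grid are illustrative; the kernel k = xy
reappears in the Fredholm-series example below, where the exact answer
f(x) = 3x/(3 − λ) is obtained):

```python
import numpy as np

def neumann(g, k, a, b, lam, n_iter=50, N=400):
    x = np.linspace(a, b, N)
    w = np.full(N, (b - a) / (N - 1))
    w[0] *= 0.5; w[-1] *= 0.5                    # trapezoidal weights
    K = k(x[:, None], x[None, :]) * w            # quadrature matrix
    f = g(x).copy()
    for _ in range(n_iter):
        f = g(x) + lam * K @ f                   # f_n = g + lam K f_{n-1}
    return x, f

x, f = neumann(lambda x: x, lambda x, y: x * y, 0.0, 1.0, lam=1.0)
print(np.max(np.abs(f - 3 * x / (3 - 1.0))))     # small: matches 3x/(3-lam)
```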
Let

    D(λ) = 1 − λ ∫_a^b k(x, x) dx
           + (λ²/2!) ∫∫ | k(x, x)   k(x, x′)  |
                        | k(x′, x)  k(x′, x′) | dx dx′
           − (λ³/3!) ∫∫∫ | k(x, x)   k(x, x′)   k(x, x″)  |
                         | k(x′, x)  k(x′, x′)  k(x′, x″) |
                         | k(x″, x)  k(x″, x′)  k(x″, x″) | dx dx′ dx″
           + . . . ,                                                     (190)
and let

    D(x, y; λ) = λ k(x, y)
                 − λ² ∫ | k(x, y)  k(x, z) |
                        | k(z, y)  k(z, z) | dz
                 + (λ³/2!) ∫∫ | k(x, y)   k(x, z)   k(x, z′)  |
                              | k(z, y)   k(z, z)   k(z, z′)  |
                              | k(z′, y)  k(z′, z)  k(z′, z′) | dz dz′
                 − . . . .                                               (191)
Note that not everyone uses the same convention for this notation. For
example, Mathews and Walker define D(x, y; λ) to be 1/λ times the quantity
defined here.
We have the following:

Theorem: If D(λ) ≠ 0 and if the Fredholm’s equation has a solution, then
the solution is, uniquely:

    f(x) = g(x) + ∫_a^b [D(x, y; λ)/D(λ)] g(y) dy.                       (192)
The diagrammatic notation works as follows: a line segment with endpoints
x and y represents k(x, y); a segment with both endpoints at x represents
k(x, x); and two segments meeting at a point represent a product sharing
the repeated variables. A heavy dot on a segment breaks the segment into
two meeting segments, according to the above rule, and furthermore means
integration over the repeated variable with a factor of λ. For illustration of
these rules: a dot joining segments x–y and y–z denotes λ ∫ k(x, y)k(y, z) dy,
and a closed loop with one dot denotes λ ∫ k(x, x) dx.
In this notation, D(x, y; λ)/λ (Eq. 194) is an alternating series of open
chains from x to y carrying increasing numbers of dots, together with prod-
ucts of chains and dotted closed loops, and D(λ) (Eq. 195) is a series of
products of dotted closed loops with coefficients 1/n!.

[Diagrammatic expansions (194) and (195) not reproduced.]
Let us try a very simple example to see how things work. Suppose we
wish to solve:

    f(x) = x + λ ∫_0^1 xy f(y) dy.                                       (196)

Of course, this may be readily solved by elementary means, but let us apply
our new techniques. We have:

    k(x, y) = xy,                                                        (197)
    λ ∫_0^1 k(x, x) dx = λ/3,                                            (198)
    λ ∫_0^1 k(x, y)k(y, z) dy = (λ/3) xz,                                (199)
    λ² ∫_0^1 ∫_0^1 k(x, y)k(y, x) dx dy = (λ/3)².                        (200)

More generally, a closed loop carrying n dots gives

    λ^n ∫_0^1 · · · ∫_0^1 dx′ . . . dx^{(n)} [x′ · · · x^{(n)}]² = (λ/3)^n,  (201)

and an open chain from x to z carrying n dots gives (λ/3)^n xz.          (202)

For this factorized kernel every determinant of second and higher order
vanishes (e.g., k(x, x)k(x′, x′) − k(x, x′)k(x′, x) = xx·x′x′ − xx′·x′x = 0),
so the series for D(λ) and D(x, y; λ) terminate:

    D(λ) = 1 − λ/3,                                                      (203)
    (1/λ) D(x, y; λ) = xy.                                               (204)

The solution in terms of the Fredholm series is then:

    f(x) = g(x) + ∫_0^1 [D(x, y; λ)/D(λ)] g(y) dy
         = x + [3λ/(3 − λ)] ∫_0^1 x y² dy
         = [3/(3 − λ)] x.                                                (205)
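The truncation of the Fredholm series for this rank-one kernel is easy to
confirm numerically: build D(λ) = 1 − λ/3 and D(x, y; λ) = λxy by quadra-
ture and check Eq. 205, even for |λ| > 3 where (as noted below) the Neumann
series diverges. A sketch:

```python
import numpy as np
from scipy.integrate import quad

lam = 10.0                                       # far outside |lambda| < 3
D = 1 - lam * quad(lambda x: x * x, 0, 1)[0]     # higher terms vanish, Eq. (203)
f = lambda x: x + quad(lambda y: (lam * x * y / D) * y, 0, 1)[0]  # Eq. (192)

for xv in (0.3, 0.7):
    lhs = f(xv)
    rhs = xv + lam * quad(lambda y: xv * y * f(y), 0, 1)[0]       # Eq. (196)
    print(lhs, rhs, 3 * xv / (3 - lam))          # all three columns agree
```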
Generalizing from this example, we remark that if the kernel is degenerate,

    k(x, y) = Σ_{i=1}^{n} φi(x)ψi(y),                                    (206)
then D(λ) and D(x, y; λ) are polynomials of degree n in λ. The reader
is invited to attempt a graphical “proof” of this. This provides another
algorithm for solving the degenerate kernel problem.
Now suppose that we attempt to solve our example with a Neumann
series. We have

    f(x) = g(x) + λ ∫ k(x, y)g(y) dy + λ² ∫∫ k(x, y)k(y, y′)g(y′) dy dy′ + . . .
         = x + λx ∫_0^1 y² dy + λ²x ∫_0^1 ∫_0^1 y²(y′)² dy dy′ + . . .
         = x Σ_{n=0}^{∞} (λ/3)^n.                                        (207)

This series converges for |λ| < 3 to

    f(x) = [3/(3 − λ)] x.                                                (208)

This is the same result as the Fredholm solution above. However, the Neu-
mann solution is only valid for |λ| < 3, while the Fredholm solution is valid
for all λ ≠ 3. At eigenvalue λ = 3, D(λ = 3) = 0.
At λ = 3, we expect a non-trivial solution to the homogeneous equation

    f(x) = 3 ∫_0^1 xy f(y) dy.                                           (209)

Indeed, f(x) = Ax solves this equation. The roots of D(λ) are the eigenvalues
of the kernel. If the kernel is degenerate we only have a finite number of
eigenvalues.
6 Symmetric Kernels
Definition: If k(x, y) = k(y, x) then the kernel is called symmetric. If
k(x, y) = k*(y, x) then the kernel is called Hermitian.

Note that a real, Hermitian kernel is symmetric. For simplicity, we’ll restrict
ourselves to real symmetric kernels here,¹ but the generalization to Hermitian
kernels is readily accomplished (indeed is already done when we use Dirac’s
notation). The study of such kernels via eigenfunctions is referred to as
“Schmidt-Hilbert theory”. We will assume that our kernels are bounded in
the sense:

    ∫_a^b [k(x, y)]² dy ≤ M,                                             (210)
    ∫_a^b [(∂k/∂x)(x, y)]² dy ≤ M′,                                      (211)

¹ Note that, since we are assuming real functions in this section, we do not put a complex
conjugate in our scalar products. But don’t forget to put in the complex conjugate if you
have a problem with complex functions!
where M and M′ are finite.
Our approach to studying the symmetric kernel problem will be to analyze
it in terms of the solutions to the homogeneous equation. We have the
following:

Theorem: Every continuous symmetric kernel (not identically zero) pos-
sesses eigenvalues. Their number is countably infinite if and only if the
kernel is not degenerate. All eigenvalues of a real symmetric kernel are
real.

Proof: First, recall the Schwarz inequality, in Dirac notation:

    |⟨φ|ψ⟩|² ≤ ⟨φ|φ⟩⟨ψ|ψ⟩.

Consider also the “quadratic integral form”

    J(φ, φ) ≡ ∫∫ k(x, y)φ(x)φ(y) dx dy,

where φ is any (piecewise) continuous function in [a, b]. We’ll assume |a|, |b| <
∞ for simplicity here; the reader may consider what additional criteria must
be satisfied if the interval is infinite.
Our quadratic integral form is analogous with the quadratic form for
systems of linear equations:

    A(x, x) = Σ_{i,j=1}^{n} aij xi xj = (x, Ax),                         (214)

where, in an expression such as (u, Av), we have defined a scalar product
between the vectors u and v. We are thus led to consider its square,

    [J(φ, φ)]² = ∫dx ∫dy ∫dx′ ∫dy′ k(x, y)φ(x)φ(y)k(x′, y′)φ(x′)φ(y′),   (218)
Thus, if we require φ to be a normalized function,

    ∫ [φ(x)]² dx = 1,                                                    (220)

we see that |J(φ, φ)| is bounded, since the integral of the squared kernel is
bounded.
Furthermore, we can have J(φ, φ) = 0 for all φ if and only if k(x, y) = 0.
The “if” part is obviously true; let us deal with the “only if” part. This
statement depends on the symmetry of the kernel. Consider the “bilinear
integral form”:

    J(φ, ψ) = J(ψ, φ) ≡ ∫∫ k(x, y)φ(x)ψ(y) dx dy.                        (221)

We have

    J(φ + ψ, φ + ψ) = J(φ, φ) + J(ψ, ψ) + 2J(φ, ψ),                      (222)

for all φ, ψ piecewise continuous on [a, b]. We see that J(φ, φ) = 0 for all φ
only if it is also true that J(φ, ψ) = 0, ∀φ, ψ.
In particular, let us take

    ψ(y) = ∫ k(x, y)φ(x) dx.                                             (223)

Then

    0 = J(φ, ψ) = ∫dx ∫dy k(x, y)φ(x) ∫dx′ k(x′, y)φ(x′)
                = ∫ [∫ k(x, y)φ(x) dx]² dy.                              (224)

Thus, ∫ k(x, y)φ(x) dx = 0 ∀φ. In particular, take, for any given value of y,
We approximate the kernel by a sequence of symmetric degenerate kernels,

    An(x, y) = Σ_{i,j=1}^{an} c⁽ⁿ⁾ij ωi(x)ωj(y),

where c⁽ⁿ⁾ij = c⁽ⁿ⁾ji and ⟨ωi|ωj⟩ = δij, and such that the approximating
kernels are uniformly bounded in the senses:

    ∫_a^b [An(x, y)]² dy ≤ MA,                                           (228)
    ∫_a^b [(∂An/∂x)(x, y)]² dy ≤ MA′.                                    (229)

Now,

    ∫ [φ(x) − Σ_{i=1}^{an} ui ωi(x)]² dx ≥ 0,                            (231)

where ui = ⟨ωi|φ⟩. The problem of maximizing Jn(φ, φ) = Σ_{i,j} c⁽ⁿ⁾ij ui uj
is thus one of finding the maximum of the quadratic form subject to the
constraint Σ ui² = 1. We know that such a maximum exists, because a
continuous function attains a maximum value in a closed, bounded domain.
Suppose that {u} is the appropriate vector. Then

    Σ_{i,j=1}^{an} c⁽ⁿ⁾ij ui uj = Λ1n,                                   (235)
where {u} is now our (normalized) vector for which the quadratic form is
maximal. The normalization ⟨φn|φn⟩ = 1 still holds. Apply the approximate
kernel operator to this function:

    ∫ An(x, y)φn(y) dy = Σ_{i,j=1}^{an} c⁽ⁿ⁾ij ωi(x) ∫ ωj(y)φn(y) dy
                       = Σ_{i=1}^{an} ωi(x) Σ_{j=1}^{an} c⁽ⁿ⁾ij uj
                       = Λ1n Σ_{i=1}^{an} ui ωi(x)
                       = Λ1n φn(x).                                      (238)
Thus, choosing n large enough that |k(x, y) − An(x, y)| ≤ ε uniformly,

    [J(φ, φ) − Jn(φ, φ)]² = [∫∫ [k(x, y) − An(x, y)] φ(x)φ(y) dx dy]²
                          ≤ |⟨φ|φ⟩|² ∫∫ [k(x, y) − An(x, y)]² dx dy   (Schwarz)
                          ≤ ε² ∫_a^b ∫_a^b dx dy
                          ≤ ε² (b − a)².                                 (240)
Thus, the range of Jn may be made arbitrarily close to the range of J by tak-
ing n large enough, and hence, the maximum of Jn may be made arbitrarily
close to that of J:

    lim_{n→∞} Λ1n = Λ1.                                                  (241)

Now, by the Schwarz inequality, the functions φn(x) are uniformly bounded
for all n (writing λ1n ≡ 1/Λ1n):

    [φn(x)]² = [λ1n ∫ An(x, y)φn(y) dy]²
             ≤ λ1n² ⟨φn|φn⟩ ∫ [An(x, y)]² dy ≤ λ1n² MA.                  (242)
The φn are also equicontinuous: given ε > 0, there is a δ such that
|φn(x + η) − φn(x)|² < ε whenever |η| < δ. This may be seen as follows:
First, we show that φ′n(x) is uniformly bounded:

    [φ′n(x)]² = [λ1n ∫ (∂An/∂x)(x, y) φn(y) dy]²
              ≤ λ1n² ∫ [(∂An/∂x)(x, y)]² dy   (Schwarz)
              ≤ λ1n² MA′.                                                (244)

Or, [φ′n(x)]² ≤ MA″, where MA″ = MA′ max λ1n². With this, we find:

    |φn(x + η) − φn(x)|² = [∫_x^{x+η} φ′n(y) dy]²
                         = [∫_a^b [θ(y − x) − θ(y − x − η)] φ′n(y) dy]²
                         ≤ ∫_a^b [θ(y − x) − θ(y − x − η)]² dy ∫_a^b [φ′n(y)]² dy
                         ≤ |η| (b − a) MA″
                         < ε,                                            (245)
Theorem: (Arzela) If f1(x), f2(x), . . . is a uniformly bounded, equicontinu-
ous set of functions on a domain D, then it is possible to select a
subsequence that converges uniformly to a continuous limit function in
the domain D.

The proof of this is similar to the proof of the Bolzano-Weierstrass theorem,
which it relies on. We start by selecting a set of points x1, x2, . . . that is
everywhere dense in [a, b]. For example, we could pick successive midpoints
of intervals. By the Bolzano-Weierstrass theorem, the sequence of values of
our functions at x1 contains a convergent subsequence; thus we may select
an infinite sequence of functions (out of {f}) a1(x), a2(x), . . . whose values
at x1 form a convergent sequence. Similarly, select a sequence of functions
(out of {a}) b1(x), b2(x), . . . whose values at x2 form a convergent sequence,
and so on.
Now consider the “diagonal sequence”:
q1 (x) = a1 (x)
q2 (x) = b2 (x)
q3 (x) = c3 (x)
... (246)
We wish to show that the sequence {q} converges on the entire interval [a, b].
Given ε > 0, take M large enough so that there exist values xk with
k ≤ M such that |x − xk| ≤ δ(ε) for every point x of the interval, where δ(ε)
is the δ in our definition of equicontinuity. Now choose N = N(ε) so that for
m, n > N

    |qm(xk) − qn(xk)| < ε,  k = 1, 2, . . . , M.                         (247)

By equicontinuity, we have, for some k ≤ M:

    |qm(x) − qm(xk)| < ε,                                                (248)
    |qn(x) − qn(xk)| < ε.                                                (249)

Thus, for m, n > N:

    |qm(x) − qn(x)| = |qm(x) − qm(xk) + qm(xk) − qn(xk) + qn(xk) − qn(x)|
                    < 3ε.                                                (250)

Thus, {q} is uniformly convergent for all x ∈ [a, b].
With this theorem, we can find a subsequence φn1, φn2, . . . that converges
uniformly to a continuous limit function ψ1(x) for a ≤ x ≤ b. There may be
more than one limit function, but there cannot be an infinite number, as we
know that the number of eigenfunctions for given λ is finite. Passing to the
limit,

    ⟨φn|φn⟩ = 1 → ⟨ψ1|ψ1⟩ = 1,                                           (251)
    Jn(φn, φn) = Λ1n → J(ψ1, ψ1) = Λ1,                                   (252)
    φn(x) = λ1n ∫ An(x, y)φn(y) dy → ψ1(x) = λ1 ∫ k(x, y)ψ1(y) dy.       (253)
Thus we have proven the existence of an eigenvalue (λ1).
Note that λ1 ≠ ∞ since we assumed that J(φ, φ) could be positive:

    max [J(φ, φ)] = Λ1 = 1/λ1 > 0.                                       (254)

Note also that, just as in the principal axis problem, additional eigenvalues (if
any exist) can be found by repeating the procedure, restricting to functions
orthogonal to the first one. If k(x, y) is degenerate, there can only be a finite
number of them, as the reader may demonstrate. This completes the proof
of the theorem stated at the beginning of the section.
We’ll conclude this section with some further properties of symmetric ker-
nels. Suppose that we have found all of the positive and negative eigenvalues
λ1, λ2, . . . of a degenerate kernel, ordered by absolute value, with correspond-
ing orthonormal eigenfunctions

    β1, β2, . . . ,                                                      (256)

and form the kernel

    k′(x, y) = k(x, y) − Σ_{i=1}^{n} βi(x)βi(y)/λi.

The maximum (and minimum) of the quadratic form built from k′(x, y) is
zero, since the eigenvalues of k(x, y) equal the eigenvalues of
Σ_{i=1}^{n} βi(x)βi(y)/λi. Hence k′(x, y) = 0.
We also have the following “expansion theorem” for integral transforms
with a symmetric kernel.

Theorem: Every continuous function g(x) that is an integral transform with
symmetric kernel k(x, y) of a piecewise continuous function f(y),

    g(x) = ∫ k(x, y)f(y) dy,                                             (260)

where k(y, x) = k(x, y), can be expanded in a uniformly and absolutely
convergent series in the eigenfunctions of k(x, y):

    g(x) = Σ_{i=1}^{∞} gi βi(x),                                         (261)

where gi = ⟨βi|f⟩/λi = ⟨βi|g⟩, and we should properly justify the interchange
of the summation and the integral. We’ll forego a proper proof of the theorem
and consider its application.
We wish to solve the inhomogeneous integral equation:

    g(x) = f(x) − λ ∫_a^b k(x, y)f(y) dy.                                (264)

The difference f − g = λ ∫ k(x, y)f(y) dy is an integral transform with kernel
k, so by the expansion theorem we may write f(x) − g(x) = Σ_i ai βi(x),
where

    ai = ⟨βi|f − g⟩
       = λ ∫∫ k(x, y)f(y)βi(x) dy dx
       = λ ∫ f(y) dy ∫ k(y, x)βi(x) dx
       = (λ/λi) ⟨βi|f⟩.                                                  (267)
Using the first and final lines, we may eliminate ⟨βi|f⟩:

    ⟨βi|f⟩ = [λi/(λi − λ)] ⟨βi|g⟩,                                        (268)

and arrive at the result for the expansion coefficients:

    ai = [λ/(λi − λ)] ⟨βi|g⟩.                                             (269)

Thus, we have the solution to the integral equation:

    f(x) = g(x) + λ Σ_{i=1}^{∞} βi(x) ⟨βi|g⟩/(λi − λ).                    (270)
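Equation (270) suggests a practical numerical scheme: discretize, diagonalize
the (symmetrized) kernel matrix, and sum the series. A sketch, with an
illustrative symmetric kernel min(x, y) and source g; note that the matrix
eigenvalues μi here correspond to 1/λi in the notation of this section:

```python
import numpy as np

N = 300
x = np.linspace(0.0, 1.0, N)
w = np.full(N, 1.0 / (N - 1)); w[0] *= 0.5; w[-1] *= 0.5   # trapezoid weights
K = np.minimum(x[:, None], x[None, :])                     # symmetric kernel
S = np.sqrt(w)[:, None] * K * np.sqrt(w)[None, :]          # symmetrized matrix
mu, V = np.linalg.eigh(S)                                  # mu_i = 1/lambda_i
beta = V / np.sqrt(w)[:, None]                             # eigenfunctions beta_i

lam, g = 0.5, np.sin(np.pi * x)
coef = beta.T @ (w * g)                                    # <beta_i | g>
f = g + lam * beta @ (coef * mu / (1 - lam * mu))          # Eq. (270)

# Residual check: f - lam * K f should reproduce g.
print(np.max(np.abs(f - lam * (K * w) @ f - g)))           # ~ machine precision
```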
    f = g + λKg + λ²K²g + . . . ,                                        (276)
which is just the Neumann series.
What do these formal operator equations mean? Well, they only have
meaning in the context of operating on the appropriate operands. For exam-
ple, consider the meaning of |λK| < 1. This might mean that for all possible
normalized functions φ we must have ‖λK‖ < 1, where ‖ · ‖ indicates an
“operator norm”, given by:

    ‖λK‖² ≡ max_φ [λ ∫∫ k(x, y)φ(x)φ(y) dx dy]² < 1.                     (277)
The reader is invited to compare this notion with the condition for con-
vergence of the Neumann series in Whittaker and Watson:

    |λ(b − a)| max_{x,y} |k(x, y)| < 1.                                  (279)
6.2 Example

Consider the problem:

    f(x) = sin²x + λ ∫_0^{2π} k(x, y)f(y) dy,                            (280)
We turn this into a contour integral on the unit circle, letting z = e^{ix}.
Then dx = dz/iz and 2 cos x = z + 1/z. This leads to:

    I0 = i ∮ dz / [αz² − (1 + α²)z + α].                                 (285)

The roots of the quadratic in the denominator are at z = {α, 1/α}. Thus,

    I0 = (i/α) ∮ dz / [(z − α)(z − 1/α)].                                (286)

Only the root at α is inside the contour; we evaluate the residue at this pole,
and hence determine that

    I0 = 2π/(1 − α²).                                                    (287)

We conclude that eigenfunction 1/√(2π) corresponds to eigenvalue 1.
We wish to find the rest of the eigenfunctions. Note that if we had not
taken y = 0 in evaluating I0, we would have written:

    I0 = (i e^{iy}/α) ∮ dz / [(z − e^{iy}α)(z − e^{iy}/α)],              (288)

and the relevant pole is at e^{iy}α. We thence notice that we know a whole
class of integrals:

    [(1 − α²)/2π] (i e^{iy}/α) ∮ z^n dz / [(z − e^{iy}α)(z − e^{iy}/α)] = α^n e^{iny},  n ≥ 0.  (289)

The residue at pole z = αe^{iy} is [1/(1 − α²)] α^n e^{iny}. We need also the
residue at z = 0.
The j = n − 1 term will give us the residue at z = 0:

    A−1 = [α/(1 − α²)] e^{−iny} (α^{−n} − α^n).                          (293)

Thus,

    I−n = [2π/(1 − α²)] α^n e^{−iny}.                                    (294)

We summarize the result: The normalized eigenfunctions are
βn(x) = e^{inx}/√(2π), with eigenvalues λn = α^{−|n|}, for n = 0, ±1, ±2, . . . .
Finally, it remains to calculate:

    f(x) = sin²x + λ Σ_n [⟨βn|sin²x⟩/(λn − λ)] βn(x)
         = sin²x + λ (√(2π)/4) [2β0(x)/(1 − λ) − (β−2(x) + β2(x))/(α^{−2} − λ)]
         = sin²x + (λ/2) [1/(1 − λ) − cos 2x/(α^{−2} − λ)].              (295)

Note that if λ = 1 or λ = α^{−2} then there is no solution. On the other hand,
if λ = α^{−|n|} is one of the other eigenvalues (n ≠ 0, ±2), then the above is
still a solution, but it is not unique, since we can add any linear combination
of βn(x) and β−n(x) and still have a solution.
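The quoted spectrum can be checked numerically. The kernel definition does
not survive in this copy; the Poisson kernel used below is an assumption,
chosen to be consistent with the contour integrals above (it reproduces
I0 = 2π/(1 − α²) and the eigenvalues λn = α^{−|n|}):

```python
import numpy as np

a, N = 0.4, 2000                                 # illustrative alpha and grid
y = np.linspace(0.0, 2 * np.pi, N, endpoint=False)
dy = 2 * np.pi / N

for n in (0, 1, 2, 3):
    xv = 1.0                                     # any test point
    # Assumed kernel: k(x,y) = (1 - a^2) / (2 pi (1 - 2 a cos(x-y) + a^2))
    k = (1 - a**2) / (2 * np.pi * (1 - 2 * a * np.cos(xv - y) + a**2))
    val = np.sum(k * np.exp(1j * n * y)) * dy    # = (1/lambda_n) e^{i n x}
    print(n, (val / np.exp(1j * n * xv)).real, a**abs(n))  # columns agree
```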
7 Exercises
1. Given an abstract complex vector space (linear space), upon which we
   have defined a scalar product (inner product)

       ⟨a|b⟩,                                                            (296)
2. Considering our RC circuit example, derive the results in Eqn. 31
through Eqn. 35 using the Fourier transform.
3. Prove the convolution theorem.
4. We showed that the Fourier transform of a Gaussian is also of Gaussian
   shape. That is, let us denote a Gaussian of mean μ and standard
   deviation σ by:

       N(x; μ, σ) = [1/(√(2π)σ)] exp[−(x − μ)²/(2σ²)].                   (300)

   (a) In class we found (in an equivalent form) that the Fourier trans-
       form of a Gaussian of mean zero was:

           N̂(y; 0, σ) = (1/√(2π)) exp(−y²σ²/2).                          (301)

       Generalize this result to find the Fourier transform of N(x; μ, σ).
   (b) The experimental resolution function of many measurements is
       approximately Gaussian in shape (in probability & statistics we’ll
       prove the “Central Limit Theorem”). Often, there is more than
       one source of uncertainty contributing to the final result. For ex-
       ample, we might measure a distance in two independent pieces,
       with means μ1, μ2 and standard deviations σ1, σ2. The resolu-
       tion function (sampling distribution) of the final result is then the
       convolution of the two pieces:

           P(x; μ1, σ1, μ2, σ2) = ∫_{−∞}^{∞} N(y; μ1, σ1)N(x − y; μ2, σ2) dy.  (302)
7. The lowest P-wave hydrogen wave function in position space may be
   written:

       ψ(x) = [1/√(32πa0⁵)] r cos θ exp(−r/2a0),                         (304)

   where r = √(x² + y² + z²), θ is the polar angle with respect to the z
   axis, and a0 is a constant. Find the momentum-space wave function
   for this state (i.e., find the Fourier transform of this function).
   In this and all problems in this course, I urge you to avoid look-up
   tables (e.g., of integrals). If you do feel the need to resort to tables,
   however, be sure to state your source.
[Figure for an exercise (statement not preserved): a circuit with resistors R1
and R2, capacitor C, and voltage Vc(t).]
10. Give a graphical proof that the series D(λ) and D(x, y; λ) in the Fred-
    holm solution are polynomials of degree n if the kernel is of the degen-
    erate form:

        k(x, y) = Σ_{i=1}^{n} φi(x)ψi(y).                                (306)
11. Solve the following equation for u(t):

        (d²u/dt²)(t) + ∫_0^1 sin[k(s − t)] u(s) ds = a(t),               (307)
    In particular, estimate f(1), using one, two, and three intervals (i.e.,
    N = 1, N = 2, and N = 3). [We’re only doing some low values so you
    don’t have to develop a lot of technology to do the computation, but
    going to high enough N to get a glimpse at the convergence.]

15. Another method we discussed in section 3 is the extension to the
    Laplace transform in Laplace’s method for solving differential equa-
    tions. I’ll summarize here: We are given a differential equation of the
    form:

        Σ_{k=0}^{n} (ak + bk x) f^{(k)}(x) = 0.                          (310)

    We assume a solution of the form:

        f(x) = ∫_C F(s) e^{sx} ds,                                       (311)
    where A is an arbitrary constant.
    A differential equation that arises in the study of the hydrogen atom
    is the Laguerre equation,

        x f″(x) + (1 − x) f′(x) + ν f(x) = 0.

16. Write the diagram, with coefficients, for the fifth-order numerator and
    denominator of the Fredholm expansion.
21. We wish to solve the following integral equation for f(x):

        f(x) = g(x) − λ ∫_0^x f(y) dy,                                   (321)

[Figure for an exercise (statement not preserved): a circuit with resistor R,
capacitor C, and voltage V0(t).]