Boundary Value Problems in One Dimension
© 2012 Peter J. Olver
computer graphics and numerical analysis, and to more general second order boundary
value problems of Sturm-Liouville type, which arise in a host of physical applications that
involve partial differential equations.
The simplest boundary value problems can be solved by direct integration. However,
more complicated systems do not admit explicit formulae for their solutions, and one
must rely on numerical approximations. In the final section, we introduce the powerful
finite element method. The key idea is to restrict the infinite-dimensional minimization
principle characterizing the exact solution to a suitably chosen finite-dimensional subspace
of the function space. When properly formulated, the solution to the resulting finite-dimensional minimization problem approximates the true minimizer. As in Chapter 4, the
finite-dimensional minimizer is found by solving the induced linear algebraic system, using
either direct or iterative methods.
An alternative formulation of the finite element solution, which can be applied even in
situations where there is no minimum principle available, is based on the idea of a weak
solution to the boundary value problem, where one relaxes the classical differentiability
requirements.
Figure 11.1. Displacement u(x) of a Bar.
With our convention, u(x) > 0 means that the atom has moved down, while if u(x) < 0,
the atom has moved up. In particular,
    u(0) = 0    (11.1)
because we are assuming that the top end is fixed and cannot move.
The strain in the bar measures the relative amount of stretching. Two nearby atoms,
at respective positions x and x + Δx, are moved to positions x + u(x) and x + Δx + u(x + Δx).
The original, unstressed length of this small section of bar was Δx, while in the new configuration the same section has length

    [x + Δx + u(x + Δx)] − [x + u(x)] = Δx + [u(x + Δx) − u(x)].

Therefore, this segment has been elongated by an amount u(x + Δx) − u(x). The dimensionless strain measures the relative elongation, and so is obtained by dividing by the reference length: [u(x + Δx) − u(x)]/Δx. We now take the continuum limit by letting the interatomic spacing Δx → 0. The result is the strain function

    v(x) = lim_{Δx → 0} [u(x + Δx) − u(x)]/Δx = du/dx.    (11.2)
The bar is to be viewed as the continuum limit of this chain, in which the number of masses n → ∞ and the spring lengths Δx = ℓ/n → 0. The kth mass starts out at position

    x_k = k Δx = k ℓ / n .

The relation between the discrete displacements and strains then takes the matrix form v = A u, (11.3) where

    A = (1/Δx) [  1
                 −1   1
                     −1   1
                        ⋱    ⋱
                          −1   1 ]    →    d/dx    (11.4)
is the scaled incidence matrix of the mass-spring chain. As indicated by the arrow, the derivative operator d/dx that relates displacement to strain in the bar equation (11.2) can be viewed as its continuum limit, as the number of masses n → ∞ and the spring lengths Δx → 0. Vice versa, the incidence matrix can be viewed as a discrete, numerical
approximation to the derivative operator. Indeed, if we regard the discrete displacements
and strains as approximations to the sample values of their continuous counterparts, so
    u_k ≈ u(x_k),    v_k ≈ v(x_k),

then

    v_k = (u_{k+1} − u_k)/Δx ≈ [u(x_k + Δx) − u(x_k)]/Δx ≈ du/dx (x_k),
justifying the identification (11.4). The passage back and forth between the discrete and
the continuous is the foundation of continuum mechanics: solids, fluids, gases, and plasmas. Discrete models both motivate and provide numerical approximations to continuum
systems, which, in turn, simplify and give further insight into the discrete domain.
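The identification of the incidence matrix with the derivative operator is easy to check numerically. The following sketch (our own illustration, not from the text) builds the scaled incidence matrix of (11.4) for a chain with fixed top end and verifies that it acts on sampled displacements like d/dx:

```python
import numpy as np

# A sketch (our own illustration): the scaled incidence matrix (11.4)
# for a chain with fixed top end, applied to samples of u(x) = sin(x),
# reproduces the derivative du/dx = cos(x) up to O(dx^2) at midpoints.
n = 100
dx = 1.0 / n
x = np.arange(1, n + 1) * dx                # mass positions x_k = k/n

A = (np.eye(n) - np.eye(n, k=-1)) / dx      # 1 on the diagonal, -1 below

u = np.sin(x)                               # sampled displacement (u_0 = sin 0 = 0)
v = A @ u                                   # discrete strains v = A u
err = np.abs(v - np.cos(x - dx / 2)).max()  # compare at the spring midpoints
print(err)                                  # roughly dx**2/24, about 4e-6
```

Refining the chain (larger n) shrinks the error quadratically, which is the numerical face of the continuum limit described above.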
The next part of the mathematical framework is to use the constitutive relations to
relate the strain to the stress, or internal force experienced by the bar. To keep matters
simple, we shall only consider bars that have a linear relation between stress and strain.
For a physical bar, this is a pretty good assumption as long as it is not stretched beyond
its elastic limits. Let w(x) denote the stress on the point of the bar that was at reference
position x. Hooke's Law implies that

    w(x) = c(x) v(x),    (11.5)
We will find it helpful to label the springs from k = 0 to k = n − 1. This will facilitate
comparisons with the bar, which, by convention, starts at position x0 = 0.
where c(x) measures the stiffness of the bar at position x. For a homogeneous bar, made out of a uniform material, c(x) ≡ c is a constant function. The constitutive function c(x) can be viewed as the continuum limit of the diagonal matrix

    C = diag( c_0, c_1, … , c_{n−1} ),    or    w = C v,    (11.6)
that relates stress to strain (internal force to elongation) in the individual springs. Indeed,
(11.6) can be identified as the sampled version, w(xk ) = c(xk ) v(xk ), of the continuum
relation (11.5).
Finally, we need to impose a force balance at each point of the bar. Suppose f (x)
is an external force at position x on the bar, where f (x) > 0 means the force is acting
downwards. Physical examples include mechanical, gravitational, or magnetic forces acting
solely in the vertical direction. In equilibrium, the bar will deform so as to balance the
external force with its own internal force resulting from stretching. Now, the internal force
per unit length on the section of the bar lying between nearby positions x and x + Δx is the relative difference in stress at the two ends, namely [w(x + Δx) − w(x)]/Δx. The
force balance law requires that, in the limit,

    0 = f(x) + lim_{Δx → 0} [w(x + Δx) − w(x)]/Δx = f(x) + dw/dx,

or

    f = − dw/dx.    (11.7)

This can be viewed as the continuum limit of the mass-spring chain force balance equations

    f_k = (w_{k−1} − w_k)/Δx,    w_n = 0,    (11.8)

where the final condition ensures the correct formula for the force on the free-hanging bottom mass. (Remember that the springs are numbered from 0 to n − 1.) This indicates that we should also impose an analogous boundary condition

    w(ℓ) = 0    (11.9)
at the bottom end of the bar, which is hanging freely and so is unable to support any
internal stress. The matrix form of the discrete system (11.8) is
    f = Aᵀ w,    where    Aᵀ = (1/Δx) [ 1  −1
                                            1  −1
                                                1  −1
                                                  ⋱    ⋱
                                                       1 ] .    (11.10)

As indicated by the continuum limit interpretation, the transposed incidence matrix should approximate the differential operator − d/dx that appears in the continuum force balance law (11.7). Thus, we should somehow interpret − d/dx as the "transpose" or adjoint of the differential operator d/dx. This important point will be developed properly
in Section 11.3. But before trying to push the theory any further, we will pause to analyze
the mathematical equations governing some simple configurations.
But first, let us summarize our progress so far. The three basic equilibrium equations
(11.2, 5, 7) are
    v(x) = du/dx,    w(x) = c(x) v(x),    f(x) = − dw/dx.    (11.11)

Substituting the first into the second, and then the resulting formula into the last equation, leads to the equilibrium equation

    K[u] = − d/dx ( c(x) du/dx ) = f(x),    0 < x < ℓ.    (11.12)
Thus, the displacement u(x) of the bar is obtained as the solution to a second order
ordinary differential equation. As such, it will depend on two arbitrary constants, which
will be uniquely determined by the boundary conditions (11.1, 9) at the two ends:

    u(0) = 0,    w(ℓ) = c(ℓ) u′(ℓ) = 0.    (11.13)

Usually c(ℓ) > 0, in which case it can be omitted from the second boundary condition, which simply becomes u′(ℓ) = 0. The resulting boundary value problem is to be viewed as the continuum limit of the linear system

    K u = Aᵀ C A u = f    (11.14)

modeling a mass-spring chain with one free end, cf. (6.11), in which

    A → d/dx,    C → c(x),    Aᵀ → − d/dx,    u → u(x),    f → f(x).
And, as we will see, most features of the finite-dimensional linear algebraic system have,
when suitably interpreted, direct counterparts in the continuous boundary value problem.
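The discrete side of this correspondence can be assembled in a few lines. The sketch below, in our own notation, forms K = AᵀCA for a chain with fixed top and free bottom end and unit stiffnesses:

```python
import numpy as np

# A sketch, in our own notation, of the discrete system (11.14):
# K = A^T C A for a chain with fixed top end and free bottom end.
n = 5
dx = 1.0 / n
A = (np.eye(n) - np.eye(n, k=-1)) / dx   # scaled incidence matrix (11.4)
C = np.eye(n)                            # unit spring stiffnesses, C -> c(x) = 1
K = A.T @ C @ A

# K is the standard second-difference matrix, scaled by 1/dx^2, except
# that the free bottom end changes the last diagonal entry from 2 to 1,
# mirroring the free boundary condition:
print(np.round(K * dx**2).astype(int))
```

The tridiagonal pattern (−1, 2, −1), with the modified corner entry, is the discrete counterpart of the second order differential operator in (11.12) together with its boundary conditions.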
Figure 11.2. Displacement u(x) and Stress w(x) of a Bar with One Fixed End.
Example 11.1. Consider the simplest case of a uniform bar of unit length ℓ = 1 subjected to a uniform force, e.g., gravity. The equilibrium equation (11.12) is

    − c d²u/dx² = f,    (11.15)

where we are assuming that the force f is constant. This elementary second order ordinary differential equation can be immediately integrated:

    u(x) = − ½ λ x² + a x + b,    where    λ = f/c    (11.16)

is the ratio of the force to the stiffness of the bar. The values of the integration constants a and b are fixed by the boundary conditions (11.13), so

    u(0) = b = 0,    u′(1) = − λ + a = 0.
Therefore, there is a unique solution to the boundary value problem, yielding the displacement

    u(x) = λ ( x − ½ x² ),    (11.17)

which is graphed in Figure 11.2 for λ = 1. Note the parabolic shape, with zero derivative, indicating no strain, at the free end. The displacement reaches its maximum, u(1) = ½ λ, at the free end of the bar, which is the point that moves downwards the farthest. The stronger the force, or the weaker the bar, the larger the overall displacement.
Remark : This example illustrates the simplest way to solve boundary value problems,
which is adapted from the usual solution technique for initial value problems. First, solve
the differential equation by standard methods (if possible). For a second order equation,
the general solution will involve two arbitrary constants. The values of the constants are
found by substituting the solution formula into the two boundary conditions. Unlike initial
value problems, the existence and/or uniqueness of the solution to a general boundary
value problem is not guaranteed, and you may encounter situations where you are unable
to complete the solution; see, for instance, Example 7.42. A more sophisticated method,
based on the Green's function, will be presented in the following section.
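The recipe in this Remark can be carried out symbolically. As a sketch (our own check, using sympy and assuming the uniform bar of Example 11.1 with ℓ = 1), one integrates the equation and then fixes the two constants from the boundary conditions:

```python
import sympy as sp

# Solve -c u'' = f with u(0) = 0, u'(1) = 0 for constant f and c,
# as in Example 11.1; dsolve imposes the boundary conditions via ics.
x = sp.symbols('x')
f, c = sp.symbols('f c', positive=True)
u = sp.Function('u')

sol = sp.dsolve(sp.Eq(-c * u(x).diff(x, 2), f), u(x),
                ics={u(0): 0, u(x).diff(x).subs(x, 1): 0})
print(sp.factor(sol.rhs))   # reproduces lambda*(x - x**2/2) with lambda = f/c
```

Because this is a boundary value rather than an initial value problem, the existence of a solution here hinged on the two conditions being compatible; a symbolic solver will likewise fail or return a constraint when they are not.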
As in the discrete situation, this particular mechanical configuration is statically determinate, meaning that we can solve directly for the stress w(x) in terms of the external force
f (x) without having to compute the displacement u(x) first. In this particular example,
we need to solve the first order boundary value problem

    − dw/dx = f,    w(1) = 0,

arising from the force balance law (11.7). Since f is constant,

    w(x) = f (1 − x),    and    v(x) = w(x)/c = λ (1 − x).

Note that the boundary condition uniquely determines the integration constant. We can then find the displacement u(x) by solving another boundary value problem

    du/dx = v(x) = λ (1 − x),    u(0) = 0,
resulting from (11.2), which again leads to (11.17). As before, the appearance of one
boundary condition implies that we can find a unique solution to the differential equation.
Remark: We motivated the boundary value problem for the bar by taking the continuum limit of the mass-spring chain. Let us see to what extent this limiting procedure can be justified. To compare the solutions, we keep the reference length of the chain fixed at ℓ = 1. So, if we have n identical masses, each spring has length Δx = 1/n. The kth mass will start out at reference position x_k = k/n. Using static determinacy, we can solve the system (11.8), which reads

    w_{k−1} = w_k + f/n,    w_n = 0,

and so

    w_k = f (1 − k/n) = f (1 − x_k),    k = 0, … , n − 1.

Thus, in this particular case, the continuous bar and the discrete chain have equal stresses at the sample points: w(x_k) = w_k. The strains are also in agreement:

    v_k = w_k / c = λ (1 − k/n) = λ (1 − x_k) = v(x_k),

where λ = f/c, as before. We then obtain the displacements by solving

    u_{k+1} = u_k + v_k / n = u_k + (λ/n)(1 − k/n).
Since u_0 = 0, the solution is

    u_k = (λ/n) Σ_{i=0}^{k−1} (1 − i/n)
        = λ ( k/n − k(k−1)/(2n²) )
        = λ ( x_k − ½ x_k² ) + λ x_k/(2n)
        = u(x_k) + λ x_k/(2n).    (11.18)
Figure 11.3. Displacement of a Bar with Two Fixed Ends.
The sampled displacement u(x_k) is not exactly equal to u_k, but their difference tends to zero as the number of masses n → ∞. In this way, we have completely justified our limiting interpretation.
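Formula (11.18) makes this convergence explicit, and a quick computation (our own check, taking λ = f/c = 1) confirms it:

```python
import numpy as np

# Numerical check of (11.18), in our own notation with lam = f/c = 1:
# the chain displacement u_k differs from the bar displacement u(x_k)
# by lam*x_k/(2n), which tends to zero as n grows.
lam = 1.0
for n in (10, 100, 1000):
    k = np.arange(n + 1)
    xk = k / n
    uk = lam * (k / n - k * (k - 1) / (2 * n**2))   # chain solution (11.18)
    u_bar = lam * (xk - xk**2 / 2)                  # bar solution (11.17)
    print(n, np.abs(uk - u_bar).max())              # equals lam/(2n), at x = 1
```

Each tenfold refinement of the chain reduces the maximal discrepancy by a factor of ten, exactly as the term λ x_k/(2n) in (11.18) predicts.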
Example 11.2. Consider the same uniform, unit length bar as in the previous
example, again subject to a uniform constant force, but now with two fixed ends. We
impose inhomogeneous boundary conditions
    u(0) = 0,    u(1) = d,

so the top end is fixed, while the bottom end is displaced an amount d. (Note that d > 0 means the bar is stretched, while d < 0 means it is compressed.) The general solution to the equilibrium equation (11.15) is, as before, given by (11.16). The values of the arbitrary constants a, b are determined by plugging into the boundary conditions, so

    u(0) = b = 0,    u(1) = − ½ λ + a = d.

Thus

    u(x) = ½ λ (x − x²) + d x    (11.19)
is the unique solution to the boundary value problem. The displacement is a linear superposition of two functions; the first is induced by the external force f , while the second
represents a uniform stretch induced by the boundary condition. In Figure 11.3, the dotted
curves represent the two constituents, and the solid graph is their sum, which is the actual
displacement.
Unlike a bar with a free end, this configuration is statically indeterminate. There is no boundary condition on the force balance equation

    − dw/dx = f,

and so the integration constant a in the stress w(x) = a − f x cannot be determined without first figuring out the displacement (11.19):

    w(x) = c du/dx = f ( ½ − x ) + c d.
Example 11.3. Finally, consider the case when both ends of the bar are left free.
The boundary value problem

    − c u″ = f(x),    u′(0) = 0,    u′(ℓ) = 0,    (11.20)

represents the continuum limit of a mass-spring chain with two free ends and corresponds to a bar floating in outer space, subject to a nonconstant external force. Based on our finite-dimensional experience, we expect the solution to manifest an underlying instability of the physical problem. Solving the differential equation, we find that

    u(x) = a x + b − (1/c) ∫_0^x ∫_0^y f(z) dz dy,    so that    u′(x) = a − (1/c) ∫_0^x f(z) dz;

the first boundary condition u′(0) = 0 requires a = 0. The second boundary condition requires

    u′(ℓ) = − (1/c) ∫_0^ℓ f(x) dx = 0,    (11.21)

which is not automatically valid! The integral represents the total force per unit length
exerted on the bar. As in the case of a mass-spring chain with two free ends, if there is a
non-zero net force, the bar cannot remain in equilibrium, but will move off in space and
the equilibrium boundary value problem has no solution. On the other hand, if the forcing
satisfies the constraint (11.21), then the resulting solution of the boundary value problem
has the form

    u(x) = b − (1/c) ∫_0^x ∫_0^y f(z) dz dy,    (11.22)
where the constant b is arbitrary. Thus, when it exists, the solution to the boundary value
problem is not unique. The constant b solves the corresponding homogeneous problem,
and represents a rigid translation of the entire bar by a distance b.
Physically, the free boundary value problem corresponds to an unstable structure:
there is a translational instability in which the bar moves off rigidly in the longitudinal
direction. Only balanced forces of mean zero can maintain equilibrium. Furthermore,
when it does exist, the equilibrium solution is not unique since there is nothing to tie the
bar down to any particular spatial position.
This dichotomy should remind you of our earlier study of linear algebraic systems.
An inhomogeneous system K u = f consisting of n equations in n unknowns either admits
a unique solution for all possible right hand sides f , or, when K is singular, either no
solution exists or the solution is not unique. In the latter case, the constraints on the
right hand side are prescribed by the Fredholm alternative (5.80), which requires that f
be orthogonal, with respect to the Euclidean inner product, to all elements of coker K.
In physical equilibrium systems K is symmetric, and so coker K = ker K. Thus, the
Fredholm alternative requires that the forcing be orthogonal to all the unstable modes. In
the function space for the bar, the finite-dimensional dot product is replaced by the L² inner product

    ⟨ f ; g ⟩ = ∫_0^ℓ f(x) g(x) dx.

Since the kernel or solution space to the homogeneous boundary value problem is spanned by the constant function 1, the Fredholm alternative requires that the forcing function be orthogonal to it: ⟨ f ; 1 ⟩ = ∫_0^ℓ f(x) dx = 0. This is precisely the condition (11.21) required
for existence of a (non-unique) equilibrium solution, and so the analogy between the finite
and infinite dimensional categories is complete.
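The finite-dimensional side of this analogy can be tested directly. The sketch below (our own construction) builds K = AᵀA for a free-free chain, confirms that the constant vector spans its kernel, and observes that K u = f is solvable precisely when the entries of f sum to zero, the discrete counterpart of (11.21):

```python
import numpy as np

# Our own discrete analogue of the solvability condition (11.21):
# for a free-free chain, K = A^T A is singular, ker K is spanned by the
# constant vector, and K u = f is solvable iff the net force sum(f) is 0.
n = 6
dx = 1.0 / n
A = (np.eye(n - 1, n, k=1) - np.eye(n - 1, n)) / dx   # n-1 springs, n masses
K = A.T @ A

assert np.allclose(K @ np.ones(n), 0)   # constants lie in the kernel of K

f_bad = np.ones(n)                                   # nonzero net force
f_good = np.cos(np.pi * (np.arange(n) + 0.5) / n)    # mean-zero forcing

for f in (f_bad, f_good):
    u, *_ = np.linalg.lstsq(K, f, rcond=None)        # least squares solve
    print(np.linalg.norm(K @ u - f))   # positive for f_bad, ~0 for f_good
```

The least squares residual vanishes exactly for the mean-zero forcing, which is the Fredholm alternative in action: f must be orthogonal to the kernel of the symmetric matrix K.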
Remark : The boundary value problems that govern the mechanical equilibria of a
simple bar arise in many other physical systems. For example, the equation for the thermal
equilibrium of a bar under an external heat source is modeled by the same boundary value
problem (11.12); in this case, u(x) represents the temperature of the bar, c(x) represents
the diffusivity or thermal conductivity of the material at position x, while f (x) represents
an external heat source. A fixed boundary condition u(ℓ) = a corresponds to an end that is held at a fixed temperature a, while a free boundary condition u′(ℓ) = 0 represents an insulated end that does not allow heat energy to enter or leave the bar. Details of the
physical derivation can be found in Section 14.1.
The right hand side can be decomposed as a linear combination

    f = f_1 e_1 + f_2 e_2 + ⋯ + f_n e_n    (11.24)

of the standard basis vectors of Rⁿ. Suppose we know how to solve each of the individual systems

    K u_i = e_i,    i = 1, … , n.    (11.25)
In fact, historically, Fredholm first made his discovery while studying such infinite-dimensional systems. Only later was it realized that the same underlying ideas are equally valid in
finite-dimensional linear algebraic systems.
The vector ei represents a unit force or, more precisely, a unit impulse, which is applied
solely to the ith mass in the chain; the solution ui represents the response of the chain.
Since we can decompose any other force vector as a superposition of impulse forces, as in
(11.24), the superposition principle tells us that the solution to the inhomogeneous system
(11.23) is the self-same linear combination of the individual responses, so
    u = f_1 u_1 + f_2 u_2 + ⋯ + f_n u_n.    (11.26)
Remark: The alert reader will recognize that u_1, … , u_n are the columns of the inverse matrix K⁻¹, and so we are, in fact, reconstructing the solution to the linear system (11.23)
by inverting the coefficient matrix K. Thus, this observation, while noteworthy, does not
lead to an efficient solution technique for discrete systems. In contrast, in the case of
continuous boundary value problems, this approach leads to one of the most valuable
solution paradigms in both practice and theory.
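In the discrete setting, the remark above is a one-line computation. The following sketch, with an arbitrarily chosen positive definite matrix standing in for a stiffness matrix, confirms that the impulse responses u_i are the columns of K⁻¹ and that superposition reconstructs the solution of (11.23):

```python
import numpy as np

# Sketch of the superposition principle (11.26), in our own notation:
# solve K u_i = e_i for each unit impulse, then reassemble the response
# to a general force as u = f_1 u_1 + ... + f_n u_n.
rng = np.random.default_rng(0)
n = 4
M = rng.standard_normal((n, n))
K = M @ M.T + n * np.eye(n)                # a positive definite "stiffness"

U = np.column_stack([np.linalg.solve(K, e) for e in np.eye(n)])
assert np.allclose(U, np.linalg.inv(K))    # the u_i are the columns of K^{-1}

f = rng.standard_normal(n)
u_super = U @ f                            # superposition of impulse responses
assert np.allclose(u_super, np.linalg.solve(K, f))
```

As the remark notes, this is just matrix inversion in disguise; its real payoff comes in the continuous setting, where the impulse responses become the Green's function.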
The Delta Function
Our aim is to extend this algebraic technique to boundary value problems. The key
question is how to characterize an impulse force that is concentrated on a single atom of
the bar. In general, a unit impulse at position x = y will be described by something called
the delta function, and denoted by δ_y(x). Since the impulse is supposed to be concentrated solely at x = y, we should have

    δ_y(x) = 0    for    x ≠ y.    (11.27)

Moreover, since it is a unit impulse, we want the total amount of force exerted on the bar to be equal to one. Since we are dealing with a continuum, the total force is represented by an integral over the length of the bar, and so we also require that the delta function satisfy

    ∫_0^ℓ δ_y(x) dx = 1,    provided that    0 < y < ℓ.    (11.28)
Alas, there is no bona fide function that enjoys both of the required properties! Indeed,
according to the basic facts of Riemann (or even Lebesgue) integration, two functions
which are the same everywhere except at one single point have exactly the same integral,
[53, 158]. Thus, since δ_y is zero except at one point, its integral should be 0, not 1. The mathematical conclusion is that the two requirements, (11.27-28), are inconsistent!
This unfortunate fact stopped mathematicians dead in their tracks. It took the imagination of a British engineer, with the unlikely name Oliver Heaviside, who was not deterred
by the lack of rigorous justification, to start utilizing delta functions in practical applications with remarkable effect. Despite his success, Heaviside was ridiculed by the pure
mathematicians of his day, and eventually succumbed to mental illness. But, some thirty
years later, the great theoretical physicist Paul Dirac resurrected the delta function for
quantum mechanical applications, and this finally made theoreticians sit up and take notice. (Indeed, the term "Dirac delta function" is quite common.) In 1944, the French mathematician Laurent Schwartz finally established a rigorous theory of distributions that incorporated such useful, but non-standard objects, [116, 156]. (Thus, to be more accurate, we should really refer to the delta "distribution"; however, we will retain the more common, intuitive designation "delta function" throughout.) It is beyond the scope of this
introductory text to develop a fully rigorous theory of distributions. Rather, in the spirit
of Heaviside, we shall concentrate on learning, through practice with computations and
applications, how to tame these wild mathematical beasts.
There are two distinct ways to introduce the delta function. Both are important and
both worth knowing.
Method #1. Limits: The first approach is to regard the delta function δ_y(x) as a limit of a sequence of ordinary smooth functions g_n(x). These functions will represent more and more concentrated unit forces, which, in the limit, converge to the desired unit impulse concentrated at a single point, x = y. Thus, we require

    lim_{n → ∞} g_n(x) = 0,    x ≠ y,    (11.29)

while the total amount of force remains fixed at

    ∫_{−∞}^{∞} g_n(x) dx = 1.    (11.30)

An explicit example of such a sequence, concentrated at y = 0, is provided by the functions

    g_n(x) = n / ( π (1 + n² x²) ).    (11.31)

These satisfy

    lim_{n → ∞} g_n(x) = { 0,  x ≠ 0,
                           ∞,  x = 0, }    (11.32)

while

    ∫_{−∞}^{∞} g_n(x) dx = (1/π) tan⁻¹(n x) |_{x = −∞}^{∞} = 1.    (11.33)
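These limiting properties are easy to test numerically. The sketch below (our own check; the test function is an arbitrary choice) integrates the functions (11.31), confirming (11.33) and previewing the sampling property at the point 0:

```python
import numpy as np
from scipy.integrate import quad

g = lambda x, n: n / (np.pi * (1 + n**2 * x**2))   # the functions (11.31)
u = lambda x: np.exp(-x**2) * np.cos(x)            # a continuous test function

for n in (10, 100, 1000):
    # 'points' warns the quadrature about the sharpening spike at x = 0
    total, _ = quad(g, -50, 50, args=(n,), points=[0])
    weighted, _ = quad(lambda x: g(x, n) * u(x), -50, 50, points=[0])
    print(n, total, weighted)   # total -> 1 and weighted -> u(0) = 1
```

As n grows, the spike concentrates, the total stays (essentially) one, and the weighted integral homes in on the value of the test function at the impulse point.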
To keep the notation compact, we suppress the dependence of the functions gn on the point
y where the limiting delta function is concentrated.
For the moment, it will be slightly simpler here to consider the entire real line, corresponding to a bar of infinite length. Exercise  discusses how to modify the construction for a finite interval.
Figure 11.4. Delta Function as Limit of Concentrated Impulses.

In this manner, we identify the limit

    lim_{n → ∞} g_n(x) = δ(x)    (11.34)

with the unit impulse delta function concentrated at x = 0. As sketched in Figure 11.4, as n
gets larger and larger, each successive function gn (x) forms a more and more concentrated
spike, while maintaining a unit total area under its graph. The limiting delta function can
be thought of as an infinitely tall spike of zero width, entirely concentrated at the origin.
Remark : There are many other possible choices for the limiting functions gn (x). See
Exercise for another useful example.
Remark : This construction of the delta function highlights the perils of interchanging
limits and integrals without proper justification. In any standard theory of integration,
the limit of the functions gn would be indistinguishable from the zero function, so the limit
of their integrals (11.33) would not equal the integral of their limit:

    1 = lim_{n → ∞} ∫_{−∞}^{∞} g_n(x) dx ≠ ∫_{−∞}^{∞} lim_{n → ∞} g_n(x) dx = 0.
The delta function is, in a sense, a means of sidestepping this analytic inconvenience. The
full ramifications and theoretical constructions underlying such limits must, however, be
deferred to a rigorous course in real analysis, [53, 158].
Once we have found the basic delta function (x) = 0 (x), which is concentrated at
the origin, we can obtain a delta function concentrated at any other position y by a simple
translation:
    δ_y(x) = δ(x − y).    (11.35)
For example, the delta function δ_y(x) concentrated at the point y is obtained as the limit of the translated functions

    g_n(x − y) = n / ( π ( 1 + n² (x − y)² ) ).    (11.36)
Method #2. Duality: The second approach is a bit more abstract, but much closer
to the proper rigorous formulation of the theory of distributions like the delta function.
The critical observation is that if u(x) is any continuous function, then
    ∫_0^ℓ δ_y(x) u(x) dx = u(y)    for    0 < y < ℓ.    (11.37)

Indeed, since δ_y(x) = 0 for x ≠ y, the integrand only depends on the value of u at the point x = y, and so

    ∫_0^ℓ δ_y(x) u(x) dx = ∫_0^ℓ δ_y(x) u(y) dx = u(y) ∫_0^ℓ δ_y(x) dx = u(y).
Equation (11.37) serves to define a linear functional L_y : C⁰[0, ℓ] → R that maps a continuous function u ∈ C⁰[0, ℓ] to its value at the point x = y:

    L_y[u] = u(y) ∈ R.
In the dual approach to generalized functions, the delta function is, in fact, defined as this
particular linear functional. The function u(x) is sometimes referred to as a test function,
since it serves to test the form of the linear functional L.
Remark : If the impulse point y lies outside the integration domain, then
    ∫_0^ℓ δ_y(x) u(x) dx = 0    when    y < 0    or    y > ℓ,    (11.38)
because the integrand is identically zero on the entire interval. For technical reasons, we
will not attempt to define the integral (11.38) if the impulse point y = 0 or y = lies on
the boundary of the interval of integration.
The interpretation of the linear functional L_y as representing a kind of function δ_y(x)
is based on the following line of thought. According to Theorem 7.10, every scalar-valued
linear function L : V → R on a finite-dimensional inner product space is given by an inner product with a fixed element a ∈ V, so

    L[u] = ⟨ a ; u ⟩.    (11.39)
In this sense, linear functions on R n are the same as vectors. (But bear in mind that the
identification does depend upon the choice of inner product.) Similarly, on the infinite-dimensional function space C⁰[0, ℓ], the L² inner product with a fixed continuous function g(x) defines a linear functional

    L_g[u] = ⟨ g ; u ⟩ = ∫_0^ℓ g(x) u(x) dx    (11.40)

on every continuous function u(x). Every (continuous) function thus defines a linear
functional, but not conversely. Or, stated another way, while the dual space to a finite-dimensional vector space like Rⁿ can be identified, via an inner product, with the space itself, this is not the case in infinite-dimensional function space; the dual is an entirely different creature. This disconcerting fact highlights yet another of the profound differences between finite- and infinite-dimensional vector spaces!
But the dual interpretation of generalized functions acts as if this were true. Generalized functions are real-valued linear functionals on function space, but viewed as a kind
of function via the inner product. Although the identification is not to be taken literally,
one can, with a little care, manipulate generalized functions as if they were actual functions, but always keeping in mind that a rigorous justification of such computations must
ultimately rely on their true characterization as linear functionals.
The two approaches, limits and duality, are completely compatible. Indeed, with a little extra work, one can justify the dual formula (11.37) as the limit

    u(y) = lim_{n → ∞} ∫ g_n(x) u(x) dx = ∫ δ_y(x) u(x) dx    (11.41)

of the inner products of the function u with the approximating concentrated impulse functions g_n(x) satisfying (11.29-30). In this manner, the linear functional L_y[u] = u(y) represented by the delta function is the limit, L_y = lim_{n → ∞} L_n, of the approximating linear functionals

    L_n[u] = ⟨ g_n ; u ⟩ = ∫ g_n(x) u(x) dx.
Thus, the choice of interpretation of the generalized delta function is, on an operational
level, a matter of taste. For the novice, the limit interpretation of the delta function is
perhaps the easier to digest at first. However, the dual, linear functional interpretation
has stronger connections with the rigorous theory and, even in applications, offers some
significant advantages.
Although on the surface, the delta function might look a little bizarre, its utility in
modern applied mathematics and mathematical physics more than justifies including it in
Figure 11.5. The Step Function.
your analytical toolbox. Even though you are probably not yet comfortable with either
definition, you are advised to press on and familiarize yourself with its basic properties, to
be discussed next. With a little care, you usually won't go far wrong by treating it as if
it were a genuine function. After you gain more practical experience, you can, if desired,
return to contemplate just exactly what kind of object the delta function really is.
Remark : If you are familiar with basic measure theory, [158], there is yet a third
interpretation of the delta function as a point mass or atomic measure. However, the
measure-theoretic approach has definite limitations, and does not cover the full gamut of
generalized functions.
Calculus of Generalized Functions
In order to develop a working relationship with the delta function, we need to understand how it behaves under the basic operations of linear algebra and calculus. First, we
can take linear combinations of delta functions. For example,
    f(x) = 2 δ(x) + 3 δ(x − 1)

represents a combination of an impulse of magnitude 2 concentrated at x = 0 and one of magnitude 3 concentrated at x = 1. Since δ_y(x) = 0 for any x ≠ y, multiplying the delta function by an ordinary function is the same as multiplying by a constant:

    g(x) δ_y(x) = g(y) δ_y(x),    (11.42)

provided that g(x) is continuous at x = y. For example, x δ(x) ≡ 0 is the same as the constant zero function.
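The rule (11.42) is exactly what symbolic algebra systems implement. As a sketch (our own check, with an arbitrarily chosen function g), integrating against sympy's DiracDelta picks out the value of g at the impulse point:

```python
import sympy as sp

# The multiplication rule (11.42), checked with sympy's DiracDelta:
# integrating g(x)*delta(x - 1/2) over an interval containing 1/2
# picks out g(1/2), exactly as if g(x) had been frozen at the impulse.
x = sp.symbols('x', real=True)
g = sp.cos(x) * sp.exp(x)   # any function continuous at the impulse point

val = sp.integrate(g * sp.DiracDelta(x - sp.Rational(1, 2)), (x, 0, 1))
print(val)   # equals g evaluated at x = 1/2
```

Replacing g(x) δ_y(x) by g(y) δ_y(x) before integrating gives the identical result, which is why (11.42) is a legitimate manipulation.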
Warning: It is not permissible to multiply delta functions together, or to use more complicated algebraic operations. Expressions like δ(x)², 1/δ(x), e^{δ(x)}, etc., are not well defined in the theory of generalized functions. This makes their application to nonlinear systems much more problematic.
Figure 11.6.
The integral of the delta function is known as a step function. More specifically, the
basic formulae (11.37, 38) imply that
    σ_y(x) = σ(x − y) = ∫_a^x δ_y(t) dt = { 0,  a < x < y,
                                            1,  x > y > a. }    (11.43)

Figure 11.5 shows the graph of σ(x) = σ_0(x). Unlike the delta function, the step function σ_y(x) is an ordinary function. It is continuous, indeed constant, except at x = y. The value of the step function at the discontinuity x = y is left unspecified, although a popular choice, motivated by Fourier theory, cf. Chapter 12, is to set σ_y(y) = ½, the average of its left and right hand limits.
We note that the integration formula (11.43) is compatible with our characterization of the delta function as the limit of highly concentrated forces. If we integrate the approximating functions (11.31), we obtain

    f_n(x) = ∫_{−∞}^{x} g_n(t) dt = (1/π) tan⁻¹(n x) + ½.

Since

    lim_{y → ∞} tan⁻¹ y = ½ π,    while    lim_{y → −∞} tan⁻¹ y = − ½ π,

we conclude that

    lim_{n → ∞} f_n(x) = σ(x) = { 1,  x > 0,
                                  ½,  x = 0,
                                  0,  x < 0. }    (11.44)
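The convergence in (11.44) is plain to see numerically. A short check (our own illustration):

```python
import numpy as np

# Direct check of (11.44): the antiderivatives
# f_n(x) = arctan(n*x)/pi + 1/2 approach the step function sigma(x),
# taking the averaged value 1/2 at the jump for every n.
fn = lambda x, n: np.arctan(n * x) / np.pi + 0.5

for n in (10, 1000, 100000):
    print(n, fn(-0.1, n), fn(0.0, n), fn(0.1, n))
# the three columns tend to sigma(-0.1) = 0, sigma(0) = 1/2, sigma(0.1) = 1
```

Note that the midpoint value ½ is attained exactly for every n, which is one reason the Fourier-motivated convention σ_y(y) = ½ is natural.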
Figure 11.7.
The second derivative of the ramp function, d²ρ/dx² = δ, is no longer an ordinary function. Going the other way, the nth integral of the delta function is the nth order ramp function

    ρ_n(x − y) = { (x − y)ⁿ / n!,  x > y,
                   0,             x < y. }    (11.46)
In what follows, we employ the notation

    f(y⁻) = lim_{x → y⁻} f(x),    f(y⁺) = lim_{x → y⁺} f(x),    (11.48)
to denote, respectively, the left and right sided limits of a function at a point y. The
function f (x) is continuous at the point y if and only if its one-sided limits exist and are
equal to its value: f (y) = f (y ) = f (y + ). If the one-sided limits are the same, but not
equal to f (y), then the function is said to have a removable discontinuity, since redefining
f (y) = f (y ) = f (y +) serves to make f continuous at the point in question. An example
is the function f (x) that is equal to 0 for all x 6= 0, but has f (0) = 1. Removing the
discontinuity by setting f (0) = 0 makes f (x) 0 equal to a continuous constant function.
Since removable discontinuities play no role in our theory or applications, they will always
be removed without penalty.
Warning: Although δ(0⁺) = 0 = δ(0⁻), we will emphatically not call 0 a removable discontinuity of the delta function. Only standard functions have removable discontinuities.

This function is not a version of the delta function. It is an ordinary function, and its integral is 0, not 1.
Figure 11.8. The Function f(x) in (11.52) and Its Derivative f′(x).
Finally, if both the left and right limits exist, but are not equal, then f is said to have
a jump discontinuity at the point y. The magnitude of the jump is the difference

    β = f(y⁺) − f(y⁻) = lim_{x → y⁺} f(x) − lim_{x → y⁻} f(x)    (11.49)

between the right and left limits. The magnitude of the jump is positive if the function jumps up, when moving from left to right, and negative if it jumps down. For example, the step function σ(x) has a unit, i.e., magnitude 1, jump discontinuity at the origin:

    σ(0⁺) − σ(0⁻) = 1 − 0 = 1,

and is continuous everywhere else. Note that the value of the function at the point, namely f(y), which may not even be defined, plays no role in the specification of the jump.
In general, the derivative of a function with jump discontinuities is a generalized function that includes delta functions concentrated at each discontinuity. More explicitly, suppose that f(x) is differentiable, in the usual calculus sense, everywhere except at the point y, where it has a jump discontinuity of magnitude β. We can re-express the function in the convenient form
$$f(x) = g(x) + \beta\,\sigma(x-y), \tag{11.50}$$
where g(x) is continuous everywhere, with a removable discontinuity at x = y, and differentiable except possibly at the jump. Differentiating (11.50), we find that
$$f'(x) = g'(x) + \beta\,\delta(x-y) \tag{11.51}$$
has a delta spike of magnitude β at the discontinuity. Thus, the derivatives of f and g coincide everywhere except at the discontinuity.
Example 11.4. Consider the function
$$f(x) = \begin{cases} -\,x, & x < 1, \\ \tfrac{1}{5}\,x^2, & x > 1, \end{cases} \tag{11.52}$$
which we graph in Figure 11.8. We note that f has a single jump discontinuity of magnitude 6/5 at x = 1. This means that
$$f(x) = g(x) + \tfrac{6}{5}\,\sigma(x-1), \qquad \text{where} \qquad g(x) = \begin{cases} -\,x, & x < 1, \\ \tfrac{1}{5}\,x^2 - \tfrac{6}{5}, & x > 1, \end{cases}$$
Figure 11.9. The function f(x) of Example 11.5 and its derivative f′(x).
is continuous everywhere, since its right and left hand limits at the original discontinuity are equal: g(1⁺) = g(1⁻) = −1. Therefore,
$$f'(x) = g'(x) + \tfrac{6}{5}\,\delta(x-1), \qquad \text{where} \qquad g'(x) = \begin{cases} -\,1, & x < 1, \\ \tfrac{2}{5}\,x, & x > 1, \end{cases}$$
while g′(1), and hence f′(1), is not defined. In Figure 11.8, the delta spike in the derivative of f is symbolized by a vertical line, although this pictorial device fails to indicate its magnitude of 6/5.
Since g′(x) can be found by directly differentiating the formula for f(x), once we determine the magnitude and location of the jump discontinuities of f(x), we can compute its derivative directly, without introducing the auxiliary function g(x).
Example 11.5. As a second, more streamlined example, consider the function
$$f(x) = \begin{cases} -\,x, & x < 0, \\ x^2 - 1, & 0 < x < 1, \\ 2\,e^{-x}, & x > 1, \end{cases}$$
which is plotted in Figure 11.9. This function has jump discontinuities of magnitude −1 at x = 0, and of magnitude 2/e at x = 1. Therefore, in light of the preceding remark,
$$f'(x) = -\,\delta(x) + \frac{2}{e}\,\delta(x-1) + \begin{cases} -\,1, & x < 0, \\ 2\,x, & 0 < x < 1, \\ -\,2\,e^{-x}, & x > 1, \end{cases}$$
where the final terms are obtained by directly differentiating f(x).
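These jump computations are easy to sanity-check numerically by sampling the one-sided limits. The following sketch (our own illustration, not from the text; the offset `eps` is an arbitrary small choice) recovers the two jump magnitudes of Example 11.5:

```python
import math

def f(x):
    # the piecewise function of Example 11.5
    if x < 0:
        return -x
    elif x < 1:
        return x**2 - 1
    else:
        return 2 * math.exp(-x)

def jump(y, eps=1e-9):
    # estimate beta = f(y+) - f(y-) from samples just beside y
    return f(y + eps) - f(y - eps)

print(jump(0.0))   # approximately -1
print(jump(1.0))   # approximately 2/e
```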
Figure 11.10.
$$s(x) = a'(x) = \begin{cases} +\,1, & x > 0, \\ -\,1, & x < 0. \end{cases} \tag{11.53}$$
Note that there is no delta function in a′(x), because a(x) is continuous everywhere. Since s(x) has a jump of magnitude 2 at the origin and is otherwise constant, its derivative s′(x) = a″(x) = 2 δ(x) is twice the delta function.
We are even allowed to differentiate the delta function. Its first derivative δ′_y(x) = δ′(x − y) can be interpreted in two ways. First, we may view δ′(x) as the limit of the derivatives of the approximating functions (11.31):
$$\delta'(x) = \lim_{n \to \infty} \frac{d g_n}{dx} = \lim_{n \to \infty} \frac{-\,2\,n^3\,x}{\pi\,(1 + n^2 x^2)^2}. \tag{11.54}$$
The graphs of these rational functions take the form of more and more concentrated spiked doublets, as illustrated in Figure 11.10. To determine the effect of the derivative on a test function u(x), we compute the limiting integral
$$\langle \delta' \,; u \rangle = \int_{-\infty}^{\infty} \delta'(x)\,u(x)\,dx = \lim_{n \to \infty} \int_{-\infty}^{\infty} g_n'(x)\,u(x)\,dx = -\,\lim_{n \to \infty} \int_{-\infty}^{\infty} g_n(x)\,u'(x)\,dx = -\int_{-\infty}^{\infty} \delta(x)\,u'(x)\,dx = -\,u'(0). \tag{11.55}$$
In the middle step, we used an integration by parts, noting that the boundary terms at ±∞ vanish, provided that u(x) is continuously differentiable and bounded as |x| → ∞. Pay attention to the minus sign in the final answer.
In the dual interpretation, the generalized function δ′_y(x) corresponds to the linear functional
$$L_y[\,u\,] = -\,u'(y) = \langle\, \delta_y' \,; u \,\rangle = \int_0^{\ell} \delta_y'(x)\,u(x)\,dx, \qquad 0 < y < \ell, \tag{11.56}$$
that maps a continuously differentiable function u(x) to minus its derivative at the point y. We note that (11.56) is compatible with a formal integration by parts:
$$\int_0^{\ell} \delta'(x-y)\,u(x)\,dx = \delta(x-y)\,u(x)\,\bigg|_{x=0}^{\ell} - \int_0^{\ell} \delta(x-y)\,u'(x)\,dx = -\,u'(y).$$
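The limiting formula (11.55) can be probed numerically: pairing the derivative of the approximating kernel g_n against a smooth bounded test function should approach −u′(0) as n grows. A rough sketch of our own (assuming NumPy; the test function, the value of n, and the truncation of the integral to a finite interval are all arbitrary choices):

```python
import numpy as np

def pair_delta_prime(u, n, half_width=50.0, num=1_000_001):
    # integrate g_n'(x) u(x) over a large truncated interval, where
    # g_n(x) = n / (pi (1 + n^2 x^2)), so that
    # g_n'(x) = -2 n^3 x / (pi (1 + n^2 x^2)^2)
    x = np.linspace(-half_width, half_width, num)
    gnp = -2.0 * n**3 * x / (np.pi * (1.0 + (n * x)**2)**2)
    y = gnp * u(x)
    dx = x[1] - x[0]
    return dx * (y.sum() - 0.5 * (y[0] + y[-1]))   # trapezoid rule

u = lambda x: np.sin(x + 0.5)     # bounded, continuously differentiable
approx = pair_delta_prime(u, n=200)
exact = -np.cos(0.5)              # -u'(0)
print(approx, exact)              # agree to a few parts in a thousand
```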
original conditions (11.29–30) are not in fact sufficient to characterize whether a sequence
of functions has the delta function as a limit. To be absolutely sure, one must, in fact,
verify the more comprehensive limiting formula (11.41).
The Green's Function
To further cement our new-found friendship, we now put the delta function to work to solve inhomogeneous boundary value problems. Consider a bar of length ℓ subject to a unit impulse force δ_y(x) = δ(x − y) concentrated at position 0 < y < ℓ. The underlying differential equation (11.12) takes the special form
$$-\,\frac{d}{dx}\left( c(x)\,\frac{du}{dx} \right) = \delta(x-y), \qquad 0 < x < \ell, \tag{11.57}$$
which we supplement with homogeneous boundary conditions that lead to a unique solution. The solution is known as the Green's function for the boundary value problem, and will be denoted by G_y(x) = G(x, y).
Example 11.7. Let us look at the simple case of a homogeneous bar, of unit length ℓ = 1, with constant stiffness c, and fixed at both ends. The boundary value problem for the Green's function G(x, y) takes the form
$$-\,c\,u'' = \delta(x-y), \qquad u(0) = 0 = u(1), \tag{11.58}$$
where 0 < y < 1 indicates the point at which we apply the impulse force. The solution to the differential equation is obtained by direct integration. First, by (11.43),
$$u'(x) = -\,\frac{\sigma(x-y)}{c} + a,$$
where a is a constant of integration. A second integration leads to
$$u(x) = -\,\frac{\rho(x-y)}{c} + a\,x + b, \tag{11.59}$$
Figure 11.11. Green's Function for a Bar with Two Fixed Ends.
where ρ is the ramp function (11.45). The integration constants a, b are fixed by the boundary conditions: since 0 < y < 1, we have
$$u(0) = b = 0, \qquad u(1) = -\,\frac{1-y}{c} + a + b = 0, \qquad \text{and so} \qquad a = \frac{1-y}{c}.$$
Therefore, the Green's function for the problem is
$$G(x,y) = \frac{-\,\rho(x-y) + (1-y)\,x}{c} = \begin{cases} x\,(1-y)/c, & x \le y, \\ y\,(1-x)/c, & x \ge y. \end{cases} \tag{11.60}$$
Figure 11.11 sketches a graph of G(x, y) when c = 1. Note that, for each fixed y, it is a continuous and piecewise affine function of x, meaning that its graph consists of connected straight line segments, with a corner where the unit impulse force is being applied.
Once we have determined the Green's function, we are able to solve the general inhomogeneous boundary value problem
$$-\,c\,u'' = f(x), \qquad u(0) = 0 = u(1). \tag{11.61}$$
The solution formula is a consequence of linear superposition. We first express the forcing function f(x) as a linear combination of impulses concentrated at points along the bar. Since there is a continuum of possible positions 0 ≤ y ≤ 1 at which impulse forces may be applied, we will use an integral to sum them up, thereby writing the external force as
$$f(x) = \int_0^1 f(y)\,\delta(x-y)\,dy. \tag{11.62}$$
In the continuous context, sums are replaced by integrals, and we will interpret (11.62) as the (continuous) superposition of an infinite collection of impulses f(y) δ(x − y), of magnitude f(y) and concentrated at position y.
The superposition principle states that, for linear systems, linear combinations of inhomogeneities produce linear combinations of solutions. Again, we adapt this principle to the continuum by replacing the sums by integrals. Thus, we claim that the solution to the boundary value problem is the self-same linear superposition
$$u(x) = \int_0^1 f(y)\,G(x,y)\,dy \tag{11.63}$$
$$-\,c\,\frac{d^2u}{dx^2} = f(x),$$
as claimed.
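To see the superposition formula (11.63) in action numerically, one can take c = 1 and constant forcing f ≡ 1, for which the exact solution of −u″ = f, u(0) = u(1) = 0, is u(x) = x(1 − x)/2, and compare against quadrature of (11.63) with the Green's function (11.60). This is a sketch of our own (assuming NumPy), not part of the text:

```python
import numpy as np

def G(x, y, c=1.0):
    # Green's function (11.60) for the bar with both ends fixed
    return np.where(x <= y, x * (1.0 - y), y * (1.0 - x)) / c

def trapezoid(vals, pts):
    h = pts[1] - pts[0]
    return h * (vals.sum() - 0.5 * (vals[0] + vals[-1]))

y = np.linspace(0.0, 1.0, 20001)
xs = np.linspace(0.0, 1.0, 11)
u_quad = np.array([trapezoid(G(x, y) * 1.0, y) for x in xs])  # f(y) = 1
u_exact = xs * (1.0 - xs) / 2.0
print(np.max(np.abs(u_quad - u_exact)))   # tiny quadrature error
```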
Remark: In computing the derivatives of u, we made use of the calculus formula
$$\frac{d}{dx}\int_{\alpha(x)}^{\beta(x)} F(x,y)\,dy = F\big(x, \beta(x)\big)\,\frac{d\beta}{dx} - F\big(x, \alpha(x)\big)\,\frac{d\alpha}{dx} + \int_{\alpha(x)}^{\beta(x)} \frac{\partial F}{\partial x}(x,y)\,dy \tag{11.65}$$
for the derivative of an integral with variable limits, which is a straightforward consequence
of the Fundamental Theorem of Calculus and the chain rule, [9, 168]. As with all limiting
processes, one must always be careful when interchanging the order of differentiation and
integration.
We note the following fundamental properties, which serve to uniquely characterize the Green's function. First, since the delta forcing vanishes except at the point x = y, the Green's function satisfies the homogeneous differential equation
$$\frac{\partial^2 G}{\partial x^2}(x,y) = 0 \qquad \text{for all} \qquad x \ne y. \tag{11.66}$$
Since G(x, y) is a function of two variables, we switch to partial derivative notation to indicate its derivatives.
$$-\,\frac{\partial}{\partial x}\left( c(x)\,\frac{\partial G}{\partial x} \right) = 0 \qquad \text{for all} \qquad x \ne y. \tag{11.67}$$
(v) Is symmetric under interchange of its arguments:
$$G(x,y) = G(y,x). \tag{11.68}$$
(vi) Generates a superposition principle for the solution under general forcing functions:
$$u(x) = \int_0^{\ell} G(x,y)\,f(y)\,dy. \tag{11.69}$$
Figure 11.12. Green's Function for a Bar with One Fixed and One Free End.
Example 11.8. Consider a uniform bar of length ℓ = 1 with one fixed and one free end, subject to an external force. The displacement u(x) satisfies the boundary value problem
$$-\,c\,u'' = f(x), \qquad u(0) = 0, \quad u'(1) = 0, \tag{11.70}$$
where c is the elastic constant of the bar. To determine the Green's function, we appeal to its characterizing properties, although one could equally well use direct integration as in the preceding Example 11.7.
First, since, as a function of x, it must satisfy the homogeneous differential equation −c u″ = 0 for all x ≠ y, the Green's function must be of the form
$$G(x,y) = \begin{cases} p\,x + q, & x \le y, \\ r\,x + s, & x \ge y, \end{cases}$$
for certain constants p, q, r, s. Second, the boundary conditions require
$$q = G(0,y) = 0, \qquad r = \frac{\partial G}{\partial x}(1,y) = 0.$$
Continuity of the Green's function at x = y imposes the further constraint
$$p\,y = G(y^-, y) = G(y^+, y) = s.$$
Finally, the derivative ∂G/∂x must have a jump discontinuity of magnitude −1/c at x = y:
$$-\,\frac{1}{c} = \frac{\partial G}{\partial x}(y^+, y) - \frac{\partial G}{\partial x}(y^-, y) = 0 - p, \qquad \text{so that} \qquad p = \frac{1}{c}, \quad s = p\,y = \frac{y}{c}.$$
We conclude that the Green's function for this problem is
$$G(x,y) = \begin{cases} x/c, & x \le y, \\ y/c, & x \ge y, \end{cases} \tag{11.71}$$
which, for c = 1, is graphed in Figure 11.12. Note that G(x, y) = G(y, x) is indeed symmetric, which helps check the correctness of our computation. Finally, the superposition principle (11.69) implies that the solution to the boundary value problem (11.70) can be written as a single integral, namely
$$u(x) = \int_0^1 G(x,y)\,f(y)\,dy = \frac{1}{c}\int_0^x y\,f(y)\,dy + \frac{1}{c}\int_x^1 x\,f(y)\,dy. \tag{11.72}$$
The reader may wish to verify this directly, as we did in the previous example.
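Such a verification can also be automated. With c = 1 and f ≡ 1, formula (11.72) evaluates to x²/2 + x(1 − x) = x − x²/2, which satisfies −u″ = 1, u(0) = 0, u′(1) = 0. A sketch of our own (assuming NumPy):

```python
import numpy as np

def u(x, num=20001):
    # evaluate (11.72) with c = 1 and f(y) = 1 by trapezoid quadrature
    def trap(vals, pts):
        return (pts[1] - pts[0]) * (vals.sum() - 0.5 * (vals[0] + vals[-1]))
    y1 = np.linspace(0.0, x, num)       # contributes (1/c) * integral of y dy on [0, x]
    y2 = np.linspace(x, 1.0, num)       # contributes (x/c) * integral of dy on [x, 1]
    return trap(y1, y1) + x * trap(np.ones(num), y2)

xs = np.linspace(0.0, 1.0, 21)
vals = np.array([u(x) for x in xs])
exact = xs - xs**2 / 2.0
print(np.max(np.abs(vals - exact)))     # quadrature is exact here
```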
Sometimes, the order is reversed, and, at least historically, basic linear algebra concepts make their first appearance in function space. Examples include the Cauchy–Schwarz inequality, the Fredholm alternative, and the Fourier transform.
on both vector spaces. These are the continuum analogs of the Euclidean dot product, and, as we shall see, will be appropriate when dealing with homogeneous bars. According to the defining equation (7.74), the adjoint D* of the derivative operator must satisfy the inner product identity
$$\langle\langle\, D[u] \,; v \,\rangle\rangle = \langle\, u \,; D^*[v] \,\rangle \qquad \text{for all} \qquad u \in U, \quad v \in V. \tag{11.74}$$
First, we compute
$$\langle\langle\, D[u] \,; v \,\rangle\rangle = \left\langle\!\!\left\langle \frac{du}{dx} \,; v \right\rangle\!\!\right\rangle = \int_0^{\ell} \frac{du}{dx}\,v\,dx. \tag{11.75}$$
On the other hand,
$$\langle\, u \,; D^*[v] \,\rangle = \int_0^{\ell} u\,D^*[v]\,dx. \tag{11.76}$$
Now, in the latter integral, we see u multiplying the result of applying the linear operator D* to v. To identify this integrand with that in the previous integral (11.75), we need to somehow remove the derivative from u. The secret is integration by parts, which allows us to rewrite the first integral in the form
$$\int_0^{\ell} \frac{du}{dx}\,v\,dx = u(\ell)\,v(\ell) - u(0)\,v(0) - \int_0^{\ell} u\,\frac{dv}{dx}\,dx. \tag{11.77}$$
Ignoring the two boundary terms for a moment, the remaining integral has the form of an inner product:
$$-\int_0^{\ell} u\,\frac{dv}{dx}\,dx = \int_0^{\ell} u\left( -\,\frac{dv}{dx} \right) dx = \left\langle u \,; -\,\frac{dv}{dx} \right\rangle = -\,\langle\, u \,; D[v] \,\rangle. \tag{11.78}$$
Equating (11.75) and (11.78), we deduce that
$$\langle\langle\, D[u] \,; v \,\rangle\rangle = \left\langle\!\!\left\langle \frac{du}{dx} \,; v \right\rangle\!\!\right\rangle = \left\langle u \,; -\,\frac{dv}{dx} \right\rangle = -\,\langle\, u \,; D[v] \,\rangle.$$
Thus, to satisfy the adjoint equation (11.74), we must have
$$\langle\, u \,; D^*[v] \,\rangle = -\,\langle\, u \,; D[v] \,\rangle \quad \text{for all} \quad u \in U, \ v \in V, \qquad \text{and so} \qquad D^* = -\,D = -\,\frac{d}{dx}. \tag{11.79}$$
The final equation confirms our earlier identification of the derivative operator D as the continuum limit of the incidence matrix A, and its negative D* = −D as the limit of the transposed (or adjoint) incidence matrix Aᵀ = A*.
However, the preceding argument is valid only if the boundary terms in the integration by parts formula (11.77) vanish:
$$u(\ell)\,v(\ell) - u(0)\,v(0) = 0, \tag{11.80}$$
which necessitates imposing suitable boundary conditions on the functions u and v. For example, in the case of a bar with both ends fixed, the boundary conditions
$$u(0) = 0, \qquad u(\ell) = 0, \tag{11.81}$$
will ensure that (11.80) holds, and therefore validate (11.79). The homogeneous boundary conditions serve to define the vector space
$$U = \big\{\, u(x) \in C^2[\,0,\ell\,] \ \big|\ u(0) = u(\ell) = 0 \,\big\}.$$
Thus, this case corresponds to the free boundary conditions of Example 11.3. Again, restricting D: U → V to these particular vector spaces ensures that the boundary terms (11.80) vanish, and so (11.79) holds in this situation too.
Let us list the most important combinations of boundary conditions that imply the vanishing of the boundary terms (11.80), and so ensure the validity of the adjoint equation (11.79).

Self-Adjoint Boundary Conditions for a Bar
(a) Both ends fixed: u(0) = u(ℓ) = 0.
(b) One fixed and one free end: u(0) = u′(ℓ) = 0 or u′(0) = u(ℓ) = 0.
(c) Both ends free: u′(0) = u′(ℓ) = 0.
(d) Periodic boundary conditions: u(0) = u(ℓ), u′(0) = u′(ℓ).
In all cases, the boundary conditions impose restrictions on the displacement space U and, in cases (b–d), when identifying v(x) = u′(x), on the strain space V also.
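The role of the boundary conditions is easy to test numerically: with, say, the fixed conditions (a), the boundary terms in (11.77) vanish, and the two inner products in (11.74) with D* = −D must agree. A sketch of our own (assuming NumPy; the particular functions and grid are arbitrary):

```python
import numpy as np

x = np.linspace(0.0, 1.0, 100001)
u = np.sin(np.pi * x)               # satisfies u(0) = u(1) = 0
du = np.pi * np.cos(np.pi * x)
v = np.cos(3.0 * x) + x**2          # arbitrary smooth strain function
dv = -3.0 * np.sin(3.0 * x) + 2.0 * x

def trapezoid(y):
    h = x[1] - x[0]
    return h * (y.sum() - 0.5 * (y[0] + y[-1]))

lhs = trapezoid(du * v)             # << D[u] ; v >>
rhs = trapezoid(u * (-dv))          # < u ; D*[v] > with D* = -D
print(lhs, rhs)                     # equal up to quadrature error
```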
In mathematics, a fixed boundary condition, u(a) = 0, is commonly referred to as a Dirichlet boundary condition, to honor the nineteenth century German analyst Lejeune Dirichlet. A free boundary condition, u′(a) = 0, is known as a Neumann boundary condition, after his German contemporary Carl Gottfried Neumann. The Dirichlet boundary value problem (a) has both ends fixed, while the Neumann boundary value problem (c) has both ends free. The intermediate case (b) is known as a mixed boundary value problem. The periodic boundary conditions (d) represent a bar that has its ends joined together to
form a circular elastic ring, and represents the continuum limit of the periodic mass–spring chain discussed in Exercise 6.3.11.
Summarizing, for a homogeneous bar with unit stiffness c(x) ≡ 1, the displacement, strain, and external force are related by the adjoint formulae
$$v = D[u] = u', \qquad f = D^*[v] = -\,v',$$
provided that we impose a suitable pair of homogeneous boundary conditions. The equilibrium equation has the self-adjoint form
$$K[u] = f, \qquad \text{where} \qquad K = D^*D = -\,D^2. \tag{11.82}$$
We note that
$$K^* = (D^*D)^* = D^*(D^*)^* = D^*D = K, \tag{11.83}$$
which is the operator form of the identity
$$\langle\, K[u] \,; \tilde u \,\rangle = -\int_0^{\ell} u''(x)\,\tilde u(x)\,dx = -\int_0^{\ell} u(x)\,\tilde u''(x)\,dx = \langle\, u \,; K[\tilde u\,] \,\rangle \tag{11.84}$$
To handle a nonuniform bar, we retain the L² inner product (11.85) on the vector space of possible displacements, but adopt a weighted inner product
$$\langle\langle\, v \,; \tilde v \,\rangle\rangle = \int_0^{\ell} v(x)\,\tilde v(x)\,c(x)\,dx, \qquad v, \tilde v \in V, \tag{11.86}$$
on the space of strain functions. The weight function c(x) > 0 coincides with the stiffness of the bar; its positivity, which is required for (11.86) to define a bona fide inner product, is in accordance with the underlying physical assumptions.
Let us recompute the adjoint of the derivative operator D: U → V, this time with respect to the inner products (11.85–86). Now we need to compare
$$\langle\langle\, D[u] \,; v \,\rangle\rangle = \int_0^{\ell} \frac{du}{dx}\,v(x)\,c(x)\,dx \qquad \text{with} \qquad \langle\, u \,; D^*[v] \,\rangle = \int_0^{\ell} u(x)\,D^*[v]\,dx.$$
The circle is sufficiently large so that we can safely ignore any curvature effects.
An integration by parts again produces boundary terms, which must vanish:
$$c(\ell)\,u(\ell)\,v(\ell) - c(0)\,u(0)\,v(0) = 0. \tag{11.88}$$
As you can check, this follows from any of the listed boundary conditions: Dirichlet, Neumann, or mixed, as well as the periodic case, assuming c(0) = c(ℓ). Therefore, in such situations, the weighted adjoint of the derivative operator is
$$D^*[v] = -\,\frac{d(c\,v)}{dx} = -\,c\,\frac{dv}{dx} - c'\,v. \tag{11.89}$$
The self-adjoint combination K = D*D is
$$K[u] = -\,\frac{d}{dx}\left( c(x)\,\frac{du}{dx} \right), \tag{11.90}$$
and hence we have formulated the original equation (11.12) for a nonuniform bar in the same abstract self-adjoint form.
As an application, let us show how the self-adjoint formulation leads directly to the symmetry of the Green's function G(x, y). As a function of x, the Green's function satisfies K[G(x, y)] = δ(x − y), that is, K[G_z] = δ_z for each fixed z. Thus, by the definition of the delta function and the self-adjointness identity (11.84),
$$G(z,y) = \langle\, \delta_z \,; G_y \,\rangle = \langle\, K[G_z] \,; G_y \,\rangle = \langle\, G_z \,; K[G_y] \,\rangle = \langle\, G_z \,; \delta_y \,\rangle = G(y,z) \tag{11.91}$$
for any 0 < y, z < ℓ, which validates the symmetry equation (11.68).
Positivity and Minimum Principles
We are now able to characterize the solution to a stable self-adjoint boundary value problem by a quadratic minimization principle. Again, the development shadows the finite-dimensional case presented in Chapter 6. So the first step is to understand how a differential operator defining a boundary value problem can be positive definite.
According to the abstract Definition 7.59, a linear operator K: U → U on an inner product space U is positive definite, provided that it is
(a) self-adjoint, so K* = K, and
(b) satisfies the positivity criterion ⟨K[u]; u⟩ > 0 for all 0 ≠ u ∈ U.
Self-adjointness of the product operator K = D*D was established in (11.83). Furthermore, Theorem 7.62 tells us that K is positive definite if and only if ker D = {0}. Indeed, by the definition of the adjoint,
$$\langle\, K[u] \,; u \,\rangle = \langle\, D^*[\,D[u]\,] \,; u \,\rangle = \langle\langle\, D[u] \,; D[u] \,\rangle\rangle = \|\, D[u] \,\|^2 \ \ge\ 0, \tag{11.92}$$
Consequently, when positive definite, the equilibrium solution is the unique minimizer of the quadratic functional
$$P[u] = \tfrac{1}{2}\,\|\, D[u] \,\|^2 - \langle\, u \,; f \,\rangle = \int_0^{\ell} \left[\, \tfrac{1}{2}\,c(x)\,u'(x)^2 - f(x)\,u(x) \,\right] dx. \tag{11.93}$$
A proof of this general fact appears following Theorem 7.61. Pay attention: the norm in
(11.93) refers to the strain space V , and so is associated with the weighted inner product
(11.86), whereas the inner product term refers to the displacement space U , which has
been given the L2 inner product. Physically, the first term measures the internal energy
due to the stress in the bar, while the second term is the potential energy induced by the
external forcing. Thus, as always, the equilibrium solution seeks to minimize the total
energy in the system.
Example 11.9. Consider the homogeneous Dirichlet boundary value problem
$$-\,u'' = f(x), \qquad u(0) = 0, \quad u(\ell) = 0, \tag{11.94}$$
for a uniform bar with two fixed ends. This is a stable case, and so the underlying differential operator K = D*D = −D², when acting on the space of displacements satisfying the boundary conditions, is positive definite. Explicitly, positive definiteness
requires
$$\langle\, K[u] \,; u \,\rangle = \int_0^{\ell} \big[ -\,u''(x)\,u(x) \big]\,dx = \int_0^{\ell} \big[\, u'(x) \,\big]^2\,dx \ > \ 0 \tag{11.95}$$
for all nonzero u(x) ≢ 0 with u(0) = u(ℓ) = 0. Notice how we used an integration by parts, invoking the boundary conditions to eliminate the boundary contributions, to expose the positivity of the integral. The corresponding energy functional is
$$P[u] = \tfrac{1}{2}\,\| u' \|^2 - \langle\, u \,; f \,\rangle = \int_0^{\ell} \left[\, \tfrac{1}{2}\,u'(x)^2 - f(x)\,u(x) \,\right] dx.$$
Its minimum value, taken over all possible displacement functions that satisfy the boundary conditions, occurs precisely when u = u⋆ is the solution to the boundary value problem.
A direct verification of the latter fact may be instructive. As in our derivation of the adjoint operator, it relies on an integration by parts. Since −u⋆″ = f,
$$P[u] = \int_0^{\ell} \left[\, \tfrac{1}{2}\,(u')^2 + u_\star''\,u \,\right] dx = u_\star'(\ell)\,u(\ell) - u_\star'(0)\,u(0) + \int_0^{\ell} \left[\, \tfrac{1}{2}\,(u')^2 - u_\star'\,u' \,\right] dx = \int_0^{\ell} \left[\, \tfrac{1}{2}\,(u' - u_\star')^2 - \tfrac{1}{2}\,(u_\star')^2 \,\right] dx, \tag{11.96}$$
where the boundary terms vanish owing to the boundary conditions on u and u⋆. In the final expression for P[u], the first integral is always ≥ 0, and is actually equal to 0 if and only if u′(x) = u⋆′(x) for all 0 ≤ x ≤ ℓ. On the other hand, the second integral does not depend upon u at all. Thus, for P[u] to achieve a minimum, u(x) = u⋆(x) + c for some constant c. But the boundary conditions force c = 0, and hence the energy functional assumes its minimum value if and only if u = u⋆.
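The minimizing property of the solution can be observed directly. For f ≡ 1 on [0, 1], the solution is u⋆(x) = x(1 − x)/2, with energy P[u⋆] = −1/24, and any admissible perturbation raises the energy. A sketch of our own (assuming NumPy; the perturbations are arbitrary choices):

```python
import numpy as np

x = np.linspace(0.0, 1.0, 100001)
h = x[1] - x[0]

def energy(u):
    # P[u] = integral of ( u'^2 / 2 - u ) dx for the forcing f = 1
    du = np.gradient(u, h)
    g = 0.5 * du**2 - u
    return h * (g.sum() - 0.5 * (g[0] + g[-1]))

u_star = x * (1.0 - x) / 2.0            # exact solution of -u'' = 1
P_star = energy(u_star)
print(P_star)                           # close to -1/24
for k in (1, 2, 5):
    w = np.sin(k * np.pi * x)           # admissible: w(0) = w(1) = 0
    print(energy(u_star + 0.1 * w) > P_star)   # True each time
```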
Inhomogeneous Boundary Conditions
So far, we have restricted our attention to homogeneous boundary value problems.
Inhomogeneous boundary conditions are a little trickier, since the spaces of allowable
displacements and allowable strains are no longer vector spaces, and so the abstract theory,
as developed in Chapter 7, is not directly applicable.
One way to circumvent this difficulty is to slightly modify the displacement function so as to satisfy homogeneous boundary conditions. Consider, for example, the inhomogeneous Dirichlet boundary value problem
$$K[u] = -\,\frac{d}{dx}\left( c(x)\,\frac{du}{dx} \right) = f(x), \qquad u(0) = \alpha, \quad u(\ell) = \beta. \tag{11.97}$$
We shall choose a function h(x) that satisfies the boundary conditions:
$$h(0) = \alpha, \qquad h(\ell) = \beta.$$
Note that we are not requiring h to satisfy the differential equation, and so one, but by no means the only, possible choice is the linear interpolating polynomial
$$h(x) = \alpha + \frac{\beta - \alpha}{\ell}\,x. \tag{11.98}$$
We then set
$$\tilde u(x) = u(x) - h(x), \tag{11.99}$$
which, by construction, satisfies the homogeneous boundary conditions
$$\tilde u(0) = \tilde u(\ell) = 0. \tag{11.100}$$
Moreover, by linearity, ũ satisfies the modified equation
$$K[\tilde u\,] = K[u - h] = K[u] - K[h] = f - K[h] \equiv \tilde f,$$
or, explicitly,
$$-\,\frac{d}{dx}\left( c(x)\,\frac{d\tilde u}{dx} \right) = \tilde f(x), \qquad \text{where} \qquad \tilde f(x) = f(x) + \frac{d}{dx}\left( c(x)\,\frac{dh}{dx} \right). \tag{11.101}$$
For the linear interpolant (11.98), the modified forcing function is
$$\tilde f(x) = f(x) + \frac{\beta - \alpha}{\ell}\,c'(x).$$
Thus, we have managed to convert the original inhomogeneous problem for u into a homogeneous boundary value problem for ũ. Once we have solved the latter, the solution to the original is simply reconstructed from the formula
$$u(x) = \tilde u(x) + h(x). \tag{11.102}$$
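The homogenization device combines naturally with a numerical solver for the homogeneous problem. The sketch below (our own illustration, assuming NumPy; the choices c ≡ 1, f ≡ 1 and boundary values α = 1, β = 2 are arbitrary) solves −ũ″ = f̃ with zero boundary values by second order finite differences and then reconstructs u = ũ + h; the exact solution here is u(x) = x(1 − x)/2 + 1 + x:

```python
import numpy as np

alpha, beta, N = 1.0, 2.0, 200
x = np.linspace(0.0, 1.0, N + 1)
dx = x[1] - x[0]
h = alpha + (beta - alpha) * x           # linear interpolant (11.98)

# With c = 1 and h linear, h'' = 0, so the modified forcing is f~ = f = 1.
f_tilde = np.ones(N - 1)

# Standard tridiagonal discretization of -u~'' at the interior nodes.
A = (np.diag(2.0 * np.ones(N - 1))
     - np.diag(np.ones(N - 2), 1)
     - np.diag(np.ones(N - 2), -1)) / dx**2
u_tilde = np.zeros(N + 1)
u_tilde[1:-1] = np.linalg.solve(A, f_tilde)

u = u_tilde + h                          # reconstruction (11.102)
exact = x * (1.0 - x) / 2.0 + 1.0 + x
print(np.max(np.abs(u - exact)))         # near machine precision here
```

The error is essentially round-off in this instance, because the exact solution is quadratic, which the second order difference formula reproduces exactly.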
Let us rewrite the minimization principle in terms of the original displacement function u(x). Replacing ũ and f̃ by their formulae (11.99, 101) yields
$$\begin{aligned} P[\tilde u\,] &= \tfrac{1}{2}\,\|\, D[u-h] \,\|^2 - \langle\, u - h \,; f - K[h] \,\rangle \\ &= \tfrac{1}{2}\,\|\, D[u] \,\|^2 - \langle\, u \,; f \,\rangle - \langle\langle\, D[u] \,; D[h] \,\rangle\rangle + \langle\, u \,; K[h] \,\rangle + \tfrac{1}{2}\,\|\, D[h] \,\|^2 + \langle\, h \,; f - K[h] \,\rangle \\ &= P[u] - \langle\langle\, D[u] \,; D[h] \,\rangle\rangle + \langle\, u \,; K[h] \,\rangle + C_0 . \end{aligned} \tag{11.104}$$
In the middle expression, the last pair of terms depends only on the initial choice of h(x), and not on u(x); thus, once h has been selected, they can be regarded as a fixed constant, here denoted by C₀. The first pair of terms reproduces the quadratic energy functional (11.93) for the actual displacement u(x). The middle terms can be explicitly evaluated:
$$-\,\langle\langle\, D[u] \,; D[h] \,\rangle\rangle + \langle\, u \,; K[h] \,\rangle = -\int_0^{\ell} \Big[\, c(x)\,h'(x)\,u'(x) + \big( c(x)\,h'(x) \big)'\,u(x) \,\Big]\,dx = -\int_0^{\ell} \frac{d}{dx}\Big[\, c(x)\,h'(x)\,u(x) \,\Big]\,dx, \tag{11.105}$$
which, in view of the boundary conditions u(0) = α, u(ℓ) = β, equals the constant α c(0) h′(0) − β c(ℓ) h′(ℓ).
Figure 11.13.
Bending of a Beam.
The net result is a minimization principle for the inhomogeneous boundary value problem: the solution to
$$-\,\frac{d}{dx}\left( c(x)\,\frac{du}{dx} \right) = f(x), \qquad u(0) = \alpha, \quad u(\ell) = \beta,$$
is the unique C² function that satisfies the indicated boundary conditions and minimizes the energy functional
$$P[u] = \int_0^{\ell} \left[\, \tfrac{1}{2}\,c(x)\,u'(x)^2 - f(x)\,u(x) \,\right] dx.$$
Warning: The inhomogeneous mixed boundary value problem is trickier, since the extra terms (11.105) will depend upon the value of u(x). The details are worked out in Exercise .
Thus, for beams, the strain is a nonlinear function of displacement. Since we are still only willing to deal with linear systems, we shall suppress the nonlinearity by assuming that the beam is not bent too far; more specifically, we assume that the derivative u′(x) ≪ 1 is small, and so the tangent line is nearly horizontal. Under this assumption, the curvature function (11.106) is replaced by its linear approximation
$$\kappa \approx u''. \tag{11.107}$$
From now on, we will identify v = D²[u] = u″ as the strain in a bending beam. The second derivative operator L = D² that maps displacement u to strain v = L[u] thereby describes the beam's intrinsic (linearized) geometry.
The next step is to formulate a constitutive relation between stress and strain. Physically, the stress w(x) represents the bending moment of the beam, defined as the product of internal force and angular deflection. Our small bending assumption implies an elastic Hooke's law relation
$$w(x) = c(x)\,v(x) = c(x)\,\frac{d^2u}{dx^2}, \tag{11.108}$$
where the proportionality factor c(x) > 0 measures the stiffness of the beam at the point x. In particular, a uniform beam has constant stiffness, c(x) ≡ c.
Finally, the differential equation governing the equilibrium configuration of the beam will follow from a balance of the internal and external forces. To compute the internal force, we appeal to our general equilibrium framework, which tells us to apply the adjoint of the incidence operator L = D² to the strain, leading to the force balance law
$$L^*[v] = L^*L[u] = f. \tag{11.109}$$
Let us compute the adjoint. We use the ordinary L² inner product on the space of displacements u(x), and adopt a weighted inner product, based on the stiffness function c(x), between strain functions:
$$\langle\, u \,; \tilde u \,\rangle = \int_0^{\ell} u(x)\,\tilde u(x)\,dx, \qquad \langle\langle\, v \,; \tilde v \,\rangle\rangle = \int_0^{\ell} v(x)\,\tilde v(x)\,c(x)\,dx. \tag{11.110}$$
To compute L*, we must equate
$$\int_0^{\ell} L[u]\,v\,c\,dx = \langle\langle\, L[u] \,; v \,\rangle\rangle = \langle\, u \,; L^*[v] \,\rangle = \int_0^{\ell} u\,L^*[v]\,dx. \tag{11.111}$$
By definition, [9, 168], the curvature of a curve at a point is equal to the reciprocal, κ = 1/r, of the radius of the osculating circle; see Exercise A.5.11 for details.
Figure 11.14. Clamped End, Free End, and Sliding End of a Beam.
As before, the computation relies on (in this case two) integrations by parts:
$$\langle\langle\, L[u] \,; v \,\rangle\rangle = \int_0^{\ell} \frac{d^2u}{dx^2}\,c\,v\,dx = \left[\, c\,v\,\frac{du}{dx} \,\right]_{x=0}^{\ell} - \int_0^{\ell} \frac{du}{dx}\,\frac{d(c\,v)}{dx}\,dx = \left[\, c\,v\,\frac{du}{dx} - u\,\frac{d(c\,v)}{dx} \,\right]_{x=0}^{\ell} + \int_0^{\ell} u\,\frac{d^2(c\,v)}{dx^2}\,dx.$$
Comparing with (11.111), we conclude that L*[v] = D²(c v), provided the boundary terms vanish:
$$\left[\, c\,v\,\frac{du}{dx} - u\,\frac{d(c\,v)}{dx} \,\right]_{x=0}^{\ell} = \left[\, w\,\frac{du}{dx} - u\,\frac{dw}{dx} \,\right]_{x=0}^{\ell} = u'(\ell)\,w(\ell) - u(\ell)\,w'(\ell) - u'(0)\,w(0) + u(0)\,w'(0) = 0. \tag{11.112}$$
Thus, under suitable boundary conditions, the force balance equations are
$$L^*[v] = \frac{d^2(c\,v)}{dx^2} = f(x). \tag{11.113}$$
A justification of (11.113) based on physical principles can be found in [181]. Combining (11.108, 113), we conclude that the equilibrium configuration of the beam is characterized as a solution to the fourth order ordinary differential equation
$$\frac{d^2}{dx^2}\left( c(x)\,\frac{d^2u}{dx^2} \right) = f(x). \tag{11.114}$$
As such, the general solution will depend upon 4 arbitrary constants, and so we need to impose a total of four boundary conditions, two at each end, in order to uniquely specify the equilibrium displacement. The (homogeneous) boundary conditions should be chosen so as to make the boundary terms in our integration by parts computation vanish, cf. (11.112). There are a variety of ways in which this can be arranged, and the most important possibilities are the following:
(a) Simply supported end: u(0) = w(0) = 0.
(b) Fixed (clamped) end: u(0) = u′(0) = 0.
(c) Free end: w(0) = w′(0) = 0.
(d) Sliding end: u′(0) = w′(0) = 0.
In these conditions, w(x) = c(x) v(x) = c(x) u″(x) is the stress resulting from the displacement u(x).
A second pair of boundary conditions must be imposed at the other end x = ℓ. You can mix or match these conditions in any combination; for example, a pair of simply supported ends, or one free end and one fixed end, and so on. Inhomogeneous boundary conditions are also allowed, and are used to model applied displacements or applied forces at the ends. Yet another option is to consider a bendable circular ring, which is subject to the periodic boundary conditions
$$u(0) = u(\ell), \qquad u'(0) = u'(\ell), \qquad w(0) = w(\ell), \qquad w'(0) = w'(\ell),$$
indicating that the ends of the beam have been welded together.
Let us concentrate our efforts on the uniform beam, of unit length ℓ = 1, choosing units so that its stiffness c(x) ≡ 1. In the absence of external forcing, the differential equation (11.114) reduces to the elementary fourth order ordinary differential equation
$$\frac{d^4u}{dx^4} = 0, \tag{11.115}$$
whose general solution is an arbitrary cubic polynomial:
$$u(x) = a\,x^3 + b\,x^2 + c\,x + d. \tag{11.116}$$
Let us use this formula to solve a couple of representative boundary value problems. First, suppose we clamp both ends of the beam, imposing the boundary conditions
$$u(0) = 0, \qquad u'(0) = \beta, \qquad u(1) = 0, \qquad u'(1) = 0, \tag{11.117}$$
so that the left end is tilted by a (small) angle tan⁻¹ β. We substitute the solution formula (11.116) into the boundary conditions (11.117) and solve for
$$a = \beta, \qquad b = -\,2\,\beta, \qquad c = \beta, \qquad d = 0. \tag{11.118}$$
We first met Charles Hermite in Section 3.6, and the term spline will be explained shortly.
Figure 11.15. Hermite Cubic Spline.
As a second example, suppose that we raise the left hand end of the beam without tilting it, which corresponds to the boundary conditions
$$u(0) = \alpha, \qquad u'(0) = 0, \qquad u(1) = 0, \qquad u'(1) = 0. \tag{11.119}$$
Solving for the coefficients now produces the displacement
$$u(x) = \alpha\,(2\,x^3 - 3\,x^2 + 1) = \alpha\,(1-x)^2\,(2\,x + 1). \tag{11.120}$$
Observe that if we simultaneously raise and tilt the left end, so that u(0) = α, u′(0) = β, then we can simply use superposition to write the solution as the sum of (11.118) and (11.120):
$$u(x) = \alpha\,(1-x)^2\,(2\,x + 1) + \beta\,x\,(1-x)^2.$$
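The combined solution is straightforward to confirm symbolically; the following sketch (our own check, assuming SymPy is available) verifies the homogeneous beam equation and all four boundary conditions:

```python
import sympy as sp

x, alpha, beta = sp.symbols('x alpha beta')
u = alpha * (1 - x)**2 * (2 * x + 1) + beta * x * (1 - x)**2

checks = [
    sp.expand(sp.diff(u, x, 4)),                    # u'''' = 0 (no forcing)
    sp.simplify(u.subs(x, 0) - alpha),              # u(0) = alpha
    sp.simplify(sp.diff(u, x).subs(x, 0) - beta),   # u'(0) = beta
    sp.simplify(u.subs(x, 1)),                      # u(1) = 0
    sp.simplify(sp.diff(u, x).subs(x, 1)),          # u'(1) = 0
]
print(checks)   # all entries are 0
```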
To analyze a forced beam, we can adapt the Green's function approach. As we know, the Green's function will depend on the choice of (homogeneous) boundary conditions. Let us treat the case when the beam has two fixed ends, and so
$$u(0) = 0, \qquad u'(0) = 0, \qquad u(1) = 0, \qquad u'(1) = 0. \tag{11.121}$$
To construct the Green's function, we must solve the forced differential equation
$$\frac{d^4u}{dx^4} = \delta(x-y), \tag{11.122}$$
corresponding to a concentrated unit impulse applied at position y along the beam. Integrating (11.122) four times, using the ramp function (11.46) of order n = 3, we produce the general solution
$$u(x) = a\,x^3 + b\,x^2 + c\,x + d + \begin{cases} \tfrac{1}{6}\,(x-y)^3, & x > y, \\ 0, & x < y, \end{cases}$$
to the differential equation (11.122). The boundary conditions (11.121) require
$$u(0) = d = 0, \qquad u(1) = a + b + \tfrac{1}{6}\,(1-y)^3 = 0, \qquad u'(0) = c = 0, \qquad u'(1) = 3\,a + 2\,b + \tfrac{1}{2}\,(1-y)^2 = 0,$$
Figure 11.16. Green's Function for a Beam with Two Fixed Ends.
and hence
$$a = \tfrac{1}{3}\,(1-y)^3 - \tfrac{1}{2}\,(1-y)^2, \qquad b = -\,\tfrac{1}{2}\,(1-y)^3 + \tfrac{1}{2}\,(1-y)^2.$$
Substituting these values into the formula for u(x) produces the Green's function
$$G(x,y) = \begin{cases} \tfrac{1}{6}\,x^2\,(1-y)^2\,(3\,y - x - 2\,x\,y), & x \le y, \\[4pt] \tfrac{1}{6}\,y^2\,(1-x)^2\,(3\,x - y - 2\,x\,y), & x \ge y. \end{cases} \tag{11.123}$$
Observe that, as with the second order bar system, the Green's function is symmetric, G(x, y) = G(y, x), which is a manifestation of the self-adjointness of the underlying boundary value problem, cf. (11.91). Symmetry implies that the deflection of the beam at position x due to a concentrated impulse force applied at position y is the same as the deflection at y due to an impulse force of the same magnitude applied at x.
As a function of x, the Green's function G(x, y) satisfies the homogeneous differential equation (11.115) for all x ≠ y. Its first and second derivatives ∂G/∂x, ∂²G/∂x² are continuous, while ∂³G/∂x³ has a unit jump discontinuity at x = y, which then produces the required delta function impulse in ∂⁴G/∂x⁴. The Green's function (11.123) is graphed in Figure 11.16, and appears to be quite smooth. Evidently, the human eye cannot easily discern discontinuities in third order derivatives!
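These smoothness claims can be verified symbolically from the two cubic pieces of (11.123): they agree at x = y up to second order, while the third derivative jumps by exactly 1. A sketch of our own (assuming SymPy):

```python
import sympy as sp

x, y = sp.symbols('x y')
G_left = x**2 * (1 - y)**2 * (3*y - x - 2*x*y) / 6     # piece for x <= y
G_right = y**2 * (1 - x)**2 * (3*x - y - 2*x*y) / 6    # piece for x >= y

gaps = [sp.simplify((sp.diff(G_right, x, k)
                     - sp.diff(G_left, x, k)).subs(x, y)) for k in (0, 1, 2)]
jump = sp.simplify((sp.diff(G_right, x, 3)
                    - sp.diff(G_left, x, 3)).subs(x, y))
print(gaps, jump)   # gaps are all 0; the jump equals 1
```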
The solution to the forced boundary value problem
$$\frac{d^4u}{dx^4} = f(x), \tag{11.124}$$
for a beam with fixed ends is then obtained by invoking the superposition principle. We view the forcing function as a linear superposition
$$f(x) = \int_0^1 f(y)\,\delta(x-y)\,dy$$
Figure 11.17. Deflection of a Uniform Beam under Gravity.
of impulse delta forces. The solution is the self-same linear superposition of Green's function responses:
$$u(x) = \int_0^1 G(x,y)\,f(y)\,dy = \frac{1}{6}\int_0^x y^2\,(1-x)^2\,(3\,x - y - 2\,x\,y)\,f(y)\,dy + \frac{1}{6}\int_x^1 x^2\,(1-y)^2\,(3\,y - x - 2\,x\,y)\,f(y)\,dy. \tag{11.125}$$
For example, under a constant unit downwards force f(x) ≡ 1, e.g., gravity, the deflection of the beam is given by
$$u(x) = \tfrac{1}{24}\,x^4 - \tfrac{1}{12}\,x^3 + \tfrac{1}{24}\,x^2 = \tfrac{1}{24}\,x^2\,(1-x)^2,$$
which is graphed in Figure 11.17. Although we could, of course, obtain u(x) by integrating the original differential equation (11.124) directly, writing the solution formula (11.125) as a single integral has evident advantages.
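This deflection formula is likewise easy to double-check symbolically. A sketch of our own (assuming SymPy):

```python
import sympy as sp

x = sp.symbols('x')
u = x**2 * (1 - x)**2 / 24        # claimed deflection under f = 1

ode = sp.expand(sp.diff(u, x, 4))              # should equal f = 1
bcs = [u.subs(x, 0), sp.diff(u, x).subs(x, 0),
       u.subs(x, 1), sp.diff(u, x).subs(x, 1)]
print(ode, bcs)    # 1 and four zeros: clamped ends, unit forcing
```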
Since the beam operator K = L*L assumes the standard self-adjoint, positive semidefinite form, the boundary value problem will be positive definite, and hence stable, if and only if ker L = ker D² = {0} when restricted to the space of allowable displacement functions. Since the second derivative D² annihilates all linear polynomials
$$u(x) = \alpha + \beta\,x,$$
positive definiteness requires that no non-zero linear polynomial satisfy all four homogeneous boundary conditions. For example, any beam with one fixed end is stable, since u(x) ≡ 0 is the only linear polynomial that satisfies u(0) = u′(0) = 0. On the other hand, a beam with two free ends is unstable, since every linear polynomial displacement has zero stress, w(x) = u″(x) ≡ 0, and so satisfies the boundary conditions w(0) = w′(0) = w(ℓ) = w′(ℓ) = 0. Similarly, a beam with a simply supported plus a free end is not positive definite, since u(x) = x satisfies the four boundary conditions u(0) = w(0) = 0, w(ℓ) = w′(ℓ) = 0. In the stable cases, the equilibrium solution can be characterized as the unique minimizer
of the quadratic functional
$$P[u] = \tfrac{1}{2}\,\|\, L[u] \,\|^2 - \langle\, u \,; f \,\rangle = \int_a^b \left[\, \tfrac{1}{2}\,c(x)\,u''(x)^2 - f(x)\,u(x) \,\right] dx \tag{11.126}$$
among all C² displacement functions satisfying the boundary conditions.

The interpolation conditions take the form
$$u(x_j) = y_j, \qquad j = 0, \ldots, n. \tag{11.127}$$
The mesh points x₀ < x₁ < x₂ < ⋯ < xₙ are distinct and labeled in increasing order. The spline is modeled as an elastic beam, and so satisfies the homogeneous beam equation (11.115). Therefore,
$$u(x) = a_j + b_j\,(x - x_j) + c_j\,(x - x_j)^2 + d_j\,(x - x_j)^3, \qquad x_j \le x \le x_{j+1}, \quad j = 0, \ldots, n-1, \tag{11.128}$$
is a piecewise cubic function, meaning that, between successive mesh points, it is a cubic polynomial, but not necessarily the same cubic on each subinterval. The fact that we write the formula (11.128) in terms of x − x_j is merely for computational convenience.
Our problem is to determine the coefficients a_j, b_j, c_j, d_j for j = 0, …, n − 1. Since there are n subintervals, there are a total of 4n coefficients, and so we require 4n equations to uniquely prescribe them. First, we need the spline to satisfy the interpolation conditions (11.127). Since it is defined by a different formula on each side of the mesh point, this results in a total of 2n conditions:
$$u(x_j^+) = a_j = y_j, \qquad u(x_{j+1}^-) = a_j + b_j\,h_j + c_j\,h_j^2 + d_j\,h_j^3 = y_{j+1}, \qquad j = 0, \ldots, n-1, \tag{11.129}$$
where we abbreviate h_j = x_{j+1} − x_j for the length of the jth subinterval.
Keep in mind that the norm on the strain functions v = L[u] = u″ is based on the weighted inner product ⟨⟨v ; ṽ⟩⟩ in (11.110).
Next, we require continuity of the first and second derivatives at the interior mesh points:

u'(x_{j+1}⁻) = u'(x_{j+1}⁺):  b_j + 2 c_j h_j + 3 d_j h_j² = b_{j+1},  j = 0, …, n − 2,  (11.130)
u''(x_{j+1}⁻) = u''(x_{j+1}⁺):  c_j + 3 d_j h_j = c_{j+1},  j = 0, …, n − 2,  (11.131)

to ensure that u is a C² function at the mesh points. We have now imposed a total of 4n − 2 conditions, namely (11.129–131), on the 4n coefficients. The two missing constraints will
come from boundary conditions at the two endpoints, namely x0 and xn . There are three
common types:
(i) Natural boundary conditions: u''(x_0) = u''(x_n) = 0, whereby

c_0 = 0,  c_{n−1} + 3 d_{n−1} h_{n−1} = 0.  (11.132)
Physically, this models a simply supported spline that rests freely on the first and last
pegs.
(ii) Clamped boundary conditions: u'(x_0) = α, u'(x_n) = β, where α, β, which could be 0, are fixed by the user. This requires

b_0 = α,  b_{n−1} + 2 c_{n−1} h_{n−1} + 3 d_{n−1} h_{n−1}² = β.  (11.133)

(iii) Periodic boundary conditions: u'(x_0) = u'(x_n), u''(x_0) = u''(x_n), so that

b_0 = b_{n−1} + 2 c_{n−1} h_{n−1} + 3 d_{n−1} h_{n−1}²,  c_0 = c_{n−1} + 3 d_{n−1} h_{n−1}.  (11.134)

If we also require that the data be periodic,

y_0 = y_n,  (11.135)

then the resulting spline will be a periodic C² function, so u(x + p) = u(x) with p = x_n − x_0 for all x. The periodic case is used to draw smooth closed curves; see below.
Theorem 11.11. Suppose we are given mesh points a = x_0 < x_1 < ⋯ < x_n = b, and corresponding data values y_0, y_1, …, y_n, along with one of the three kinds of boundary conditions (11.132), (11.133), or (11.134). Then there exists a unique piecewise cubic spline function u(x) ∈ C²[a, b] that interpolates the data, u(x_0) = y_0, …, u(x_n) = y_n, and satisfies the boundary conditions.
Proof : We first discuss the natural case. The clamped case is left as an exercise for
the reader, while the slightly harder periodic case will be treated at the end of the section.
The first set of equations in (11.129) says that

a_j = y_j,  j = 0, …, n − 1.  (11.136)

Next, the second derivative continuity conditions (11.131) can be solved for

d_j = (c_{j+1} − c_j) / (3 h_j).  (11.137)

This equation also holds for j = n − 1, provided that we make the convention that c_n = 0.
We now substitute (11.136–137) into the second set of equations in (11.129), and then solve the resulting equation for

b_j = (y_{j+1} − y_j) / h_j − (2 c_j + c_{j+1}) h_j / 3.  (11.138)
Substituting this result and (11.137) back into (11.130), and simplifying, we find

h_j c_j + 2 (h_j + h_{j+1}) c_{j+1} + h_{j+1} c_{j+2} = 3 [ (y_{j+2} − y_{j+1}) / h_{j+1} − (y_{j+1} − y_j) / h_j ] = z_{j+1},  j = 0, …, n − 2,  (11.139)

where we introduce z_{j+1} as a shorthand for the quantity on the right hand side.
In the case of natural boundary conditions, we have c_0 = 0, c_n = 0, and so (11.139) constitutes a tridiagonal linear system

A c = z  (11.140)

for the unknown coefficients c = ( c_1 , c_2 , …, c_{n−1} )^T, with coefficient matrix

A = [ 2(h_0+h_1)      h_1                                                     ]
    [     h_1      2(h_1+h_2)      h_2                                        ]
    [                  h_2      2(h_2+h_3)     h_3                            ]
    [                      ⋱          ⋱           ⋱                           ]
    [                    h_{n−3}   2(h_{n−3}+h_{n−2})     h_{n−2}             ]
    [                                  h_{n−2}     2(h_{n−2}+h_{n−1})         ]   (11.141)

and right hand side z = ( z_1 , z_2 , …, z_{n−1} )^T. Once the tridiagonal system (11.140–141) has been solved, we will then use (11.136–138) to reconstruct the other spline coefficients a_j, b_j, d_j.
This is merely for convenience; there is no c_n used in the formula for the spline.
Figure 11.18.  A Cubic Spline.
The key observation is that the coefficient matrix A is strictly diagonally dominant, cf. Definition 10.36, because all the h_j > 0, and so

2 (h_{j−1} + h_j) > h_{j−1} + h_j.

Theorem 10.37 implies that A is nonsingular, and hence the tridiagonal linear system has a unique solution c. This suffices to prove the theorem in the case of natural boundary conditions.  Q.E.D.
To actually solve the linear system (11.140), we can apply our tridiagonal solution algorithm (1.66). Let us specialize to the most important case, when the mesh points are equally spaced in the interval [a, b], so that

x_j = a + j h,  where  h = h_j = (b − a)/n,  j = 0, …, n − 1.

In this case, the coefficient matrix A = h B is a multiple of the tridiagonal matrix

B = [ 4  1             ]
    [ 1  4  1          ]
    [    1  4  1       ]
    [       ⋱  ⋱  ⋱    ]
    [          1  4  1 ]
    [             1  4 ]

that first appeared in Example 1.37. Its LU factorization takes on an especially simple form, since most of the entries of L and U are essentially the same decimal numbers. This makes the implementation of the Forward and Back Substitution procedures almost trivial.
Figure 11.18 shows a particular example: a natural spline passing through the data points (0, 0), (1, 2), (2, 1), (3, 1), (4, 0). As with the Green's function for the beam, the human eye is unable to discern the discontinuities in its third derivatives, and so the graph appears completely smooth, even though it is, in fact, only C².
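The natural spline construction is short to implement. The following Python sketch (the function names are ours, not from the text) assembles the tridiagonal system (11.140–141) for the data of Figure 11.18, and then recovers the remaining coefficients via (11.136–138):

```python
import numpy as np

def natural_cubic_spline(x, y):
    """Coefficients (a_j, b_j, c_j, d_j) of the natural cubic spline
    u(x) = a_j + b_j (x - x_j) + c_j (x - x_j)**2 + d_j (x - x_j)**3
    on each subinterval [x_j, x_{j+1}], following (11.136)-(11.141)."""
    n = len(x) - 1
    h = np.diff(x)                      # subinterval lengths h_j
    # tridiagonal system (11.140-141) for the interior coefficients c_1..c_{n-1}
    A = np.zeros((n - 1, n - 1))
    z = np.zeros(n - 1)
    for j in range(1, n):               # row corresponding to c_j
        A[j - 1, j - 1] = 2 * (h[j - 1] + h[j])
        if j > 1:
            A[j - 1, j - 2] = h[j - 1]
        if j < n - 1:
            A[j - 1, j] = h[j]
        z[j - 1] = 3 * ((y[j + 1] - y[j]) / h[j] - (y[j] - y[j - 1]) / h[j - 1])
    c = np.zeros(n + 1)                 # natural boundary conditions: c_0 = c_n = 0
    c[1:n] = np.linalg.solve(A, z)
    a = y[:-1].copy()                                           # (11.136)
    b = (y[1:] - y[:-1]) / h - (2 * c[:-1] + c[1:]) * h / 3     # (11.138)
    d = (c[1:] - c[:-1]) / (3 * h)                              # (11.137)
    return a, b, c[:-1], d

def spline_eval(x, coeffs, t):
    """Evaluate the spline at a point t in [x_0, x_n]."""
    a, b, c, d = coeffs
    j = int(np.clip(np.searchsorted(x, t, side='right') - 1, 0, len(a) - 1))
    s = t - x[j]
    return a[j] + b[j] * s + c[j] * s**2 + d[j] * s**3

# the natural spline of Figure 11.18
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.0, 2.0, 1.0, 1.0, 0.0])
coeffs = natural_cubic_spline(x, y)
```

By construction, the resulting piecewise cubic interpolates the data and has continuous first and second derivatives across the interior mesh points.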
b_{n+k} = b_k,  c_{n+k} = c_k,  d_{n+k} = d_k,  z_{n+k} = z_k.

With this convention, the basic equations (11.136–139) are the same. In this case, the coefficient matrix for the linear system

A c = z,  with  c = ( c_0 , c_1 , …, c_{n−1} )^T,  z = ( z_0 , z_1 , …, z_{n−1} )^T,

is the circulant tridiagonal matrix

A = [ 2(h_{n−1}+h_0)     h_0                                          h_{n−1} ]
    [      h_0       2(h_0+h_1)      h_1                                      ]
    [                    h_1      2(h_1+h_2)     h_2                          ]
    [                         ⋱         ⋱            ⋱                        ]
    [                      h_{n−3}   2(h_{n−3}+h_{n−2})     h_{n−2}           ]
    [   h_{n−1}                          h_{n−2}     2(h_{n−2}+h_{n−1})      ]   (11.142)
Again A is strictly diagonally dominant, and so there is a unique solution c, from which one reconstructs the spline, proving Theorem 11.11 in the periodic case. The LU factorization of tridiagonal circulant matrices was discussed in Exercise 1.7.14.
One immediate application of splines is curve fitting in computer aided design and graphics. The basic problem is to draw a smooth parametrized curve u(t) = ( u(t), v(t) )^T that passes through a set of prescribed data points x_k = ( x_k , y_k )^T in the plane. We have the freedom to choose the parameter value t = t_k when the curve passes through the k th point; the simplest and most common choice is to set t_k = k. We then construct the functions x = u(t) and y = v(t) as cubic splines interpolating the x and y coordinates of the data points, so u(t_k) = x_k , v(t_k) = y_k. For smooth closed curves, we require that both splines be periodic; for curves with ends, either natural or clamped boundary conditions are used.
Most computer graphics packages include one or more implementations of parametrized spline curves. The same idea also underlies modern font design for laser printing and typography (including the fonts used in this book). The great advantage of spline fonts over their bitmapped counterparts is that they can be readily scaled. Some sample letter shapes parametrized by periodic splines passing through the indicated data points are plotted in Figure 11.19. Better fits can be easily obtained by increasing the number of data points. Various extensions of the basic spline algorithms to space curves and surfaces are an essential component of modern computer graphics, design, and animation, [64, 163].
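In practice, one rarely assembles the periodic system (11.142) by hand, since library routines are available. The following sketch uses SciPy's CubicSpline with periodic boundary conditions to draw a smooth closed curve through four data points of our own choosing (not from the text), with parameter values t_k = k:

```python
import numpy as np
from scipy.interpolate import CubicSpline

# data points on a closed curve; the first point is repeated at the end,
# as the periodic boundary conditions require matching endpoint values
pts = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0], [1.0, 0.0]])
t = np.arange(len(pts), dtype=float)       # parameter values t_k = k

# periodic C^2 splines x = u(t), y = v(t) through the coordinates
u = CubicSpline(t, pts[:, 0], bc_type='periodic')
v = CubicSpline(t, pts[:, 1], bc_type='periodic')

# sample the resulting smooth closed curve
tt = np.linspace(t[0], t[-1], 200)
curve = np.column_stack([u(tt), v(tt)])
```

Because both coordinate splines are periodic, the first and second derivatives match where the curve closes up, so the traced curve is C² smooth all the way around.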
Figure 11.19.  Sample Letter Shapes Parametrized by Periodic Splines.
A Sturm–Liouville boundary value problem is based on a second order differential equation of the form

K[u] = − d/dx ( p(x) du/dx ) + q(x) u = − p(x) u''(x) − p'(x) u'(x) + q(x) u = f(x),  (11.143)
which is supplemented by Dirichlet, Neumann, mixed, or periodic boundary conditions. To
be specific, let us concentrate on the case of homogeneous Dirichlet boundary conditions
u(a) = 0,  u(b) = 0.  (11.144)
To avoid singular points of the differential equation (although we will later discover that most cases of interest in physics have one or more singular points), we assume that p(x) > 0 for all a ≤ x ≤ b. To ensure positive definiteness of the Sturm–Liouville differential operator, we also assume q(x) ≥ 0. These assumptions suffice to guarantee existence and uniqueness of the solution to the boundary value problem. A proof of the following theorem can be found in [117].
Theorem 11.12. Let p(x) > 0 and q(x) ≥ 0 for a ≤ x ≤ b. Then the Sturm–Liouville boundary value problem (11.143–144) admits a unique solution.
Most Sturm–Liouville problems cannot be solved in terms of elementary functions. Indeed, most of the important special functions appearing in mathematical physics, including Bessel functions, Legendre functions, hypergeometric functions, and so on, first arise as solutions to particular Sturm–Liouville equations, [144].
Figure 11.20.  The Green's Function for the Boundary Value Problem (11.145).
Continuity of the Green's function at x = y, together with the unit jump in its derivative there, produces the pair of equations

a sinh ω y − b sinh ω (1 − y) = 0,  (11.149)
a ω cosh ω y + b ω cosh ω (1 − y) = 1.  (11.150)
If we multiply (11.149) by ω cosh ω (1 − y) and (11.150) by sinh ω (1 − y) and then add the results together, we find

sinh ω (1 − y) = a ω [ sinh ω y cosh ω (1 − y) + cosh ω y sinh ω (1 − y) ] = a ω sinh ω,

where we used the addition formula for the hyperbolic sine:

sinh (α + β) = sinh α cosh β + cosh α sinh β.  (11.151)

Therefore,

a = sinh ω (1 − y) / (ω sinh ω),  b = sinh ω y / (ω sinh ω),

and hence

G(x, y) = { sinh ω x sinh ω (1 − y) / (ω sinh ω),  x < y,
            sinh ω (1 − x) sinh ω y / (ω sinh ω),  x > y.    (11.152)
Note that G(x, y) = G(y, x) is symmetric, in accordance with the self-adjoint nature of the
boundary value problem. A graph appears in Figure 11.20; note that the corner, indicating
a discontinuity in the first derivative, appears at the point x = y where the impulse force
is applied.
The general solution to the inhomogeneous boundary value problem (11.145) is given by the basic superposition formula (11.63), which becomes

u(x) = ∫_0^1 G(x, y) f(y) dy
     = ∫_0^x [ sinh ω (1 − x) sinh ω y / (ω sinh ω) ] f(y) dy + ∫_x^1 [ sinh ω x sinh ω (1 − y) / (ω sinh ω) ] f(y) dy.  (11.153)
For comparative purposes, the reader may wish to rederive this particular solution by a direct calculation, without appealing to the Green's function.
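The superposition formula is also easy to check numerically. The sketch below assumes, as the surrounding formulas indicate, that (11.145) is the constant coefficient problem −u'' + ω²u = f on [0, 1] with u(0) = u(1) = 0; the helper names and the choice f(y) = y are our own illustration:

```python
import numpy as np
from scipy.integrate import quad

def green(x, y, w):
    """Green's function (11.152) of  -u'' + w^2 u = f,  u(0) = u(1) = 0."""
    if x > y:
        x, y = y, x                  # use the symmetry G(x, y) = G(y, x)
    return np.sinh(w * x) * np.sinh(w * (1 - y)) / (w * np.sinh(w))

def superpose(f, x, w):
    """u(x) = integral of G(x, y) f(y) over [0, 1], split at the corner y = x,
    as in (11.153)."""
    left, _ = quad(lambda y: green(x, y, w) * f(y), 0.0, x)
    right, _ = quad(lambda y: green(x, y, w) * f(y), x, 1.0)
    return left + right

# direct solution of -u'' + u = x with u(0) = u(1) = 0, for comparison:
# the particular solution is u_p = x, and matching the boundary conditions
# gives u(x) = x - sinh(x)/sinh(1)
exact = lambda x: x - np.sinh(x) / np.sinh(1.0)
```

Splitting the integral at y = x is important: the integrand has a corner there, which general-purpose quadrature handles poorly if it straddles the kink.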
To place a Sturm–Liouville boundary value problem in our self-adjoint framework, we proceed as follows. (Exercise serves to motivate the construction.) Consider the linear operator

L[u] = ( u' , u )^T
that maps u(x) to the vector-valued function whose components are the function and its
first derivative. For the homogeneous Dirichlet boundary conditions (11.144), the domain
of L will be the vector space
U = { u(x) ∈ C²[a, b] | u(a) = u(b) = 0 },
consisting of all twice continuously differentiable functions that vanish at the endpoints.
The target space of L: U → V consists of continuously differentiable vector-valued functions v(x) = ( v_1(x), v_2(x) )^T; we denote this vector space as V = C¹([a, b], R²).
To proceed, we must compute the adjoint of L: U → V. To recover the Sturm–Liouville problem, we use the standard L² inner product (11.85) on U, but adopt a weighted inner product

⟨⟨ v ; w ⟩⟩ = ∫_a^b [ p(x) v_1(x) w_1(x) + q(x) v_2(x) w_2(x) ] dx,  v = ( v_1 , v_2 )^T,  w = ( w_1 , w_2 )^T,  (11.154)
on V. The positivity assumptions on the weight functions p, q ensure that this is a bona fide inner product. According to the defining equation (7.74), the adjoint L*: V → U is required to satisfy

⟨⟨ L[u] ; v ⟩⟩ = ⟨ u ; L*[v] ⟩.
As usual, the adjoint computation relies on integration by parts. Here, we only need to manipulate the first summand:

⟨⟨ L[u] ; v ⟩⟩ = ∫_a^b ( p u' v_1 + q u v_2 ) dx = p(b) u(b) v_1(b) − p(a) u(a) v_1(a) + ∫_a^b u [ − (p v_1)' + q v_2 ] dx.

The Dirichlet conditions (11.144) ensure that the boundary terms vanish, and therefore

⟨⟨ L[u] ; v ⟩⟩ = ∫_a^b u [ − (p v_1)' + q v_2 ] dx = ⟨ u ; L*[v] ⟩.
We conclude that the adjoint operator is

L*[v] = − d(p v_1)/dx + q v_2.  (11.155)

The canonical self-adjoint combination

K[u] = L*[ L[u] ] = − d/dx ( p(x) du/dx ) + q(x) u

then reproduces the Sturm–Liouville differential operator. Moreover, since ker L = {0} is trivial
(why?), the boundary value problem is positive definite. Theorem 7.62 implies that the
solution can be characterized as the unique minimizer of the quadratic functional

P[u] = ½ ‖L[u]‖² − ⟨ u ; f ⟩ = ∫_a^b [ ½ p(x) u'(x)² + ½ q(x) u(x)² − f(x) u(x) ] dx  (11.156)
among all C² functions satisfying the prescribed boundary conditions. For example, the solution to the constant coefficient Sturm–Liouville problem (11.145) can be characterized as minimizing the quadratic functional

P[u] = ∫_0^1 [ ½ u'(x)² + ½ ω² u(x)² − f(x) u(x) ] dx.

The preceding minimization principles all fit a common pattern: the solution to the boundary value problem K[u] = f minimizes a quadratic functional of the form

P[u] = ½ ‖L[u]‖² − ⟨ f ; u ⟩,  (11.157)

where K = L* ∘ L is the self-adjoint operator representing our boundary value problem. The norm in (11.157) is typically based on some form of weighted inner product ⟨⟨ v ; ṽ ⟩⟩ on the space of strains v = L[u] ∈ V, while the inner product term ⟨ f ; u ⟩ is typically (although not necessarily) unweighted on the space of displacements u ∈ U. The linear operator takes the self-adjoint form K = L* ∘ L, and must be positive definite, which requires ker L = {0}. Without the positivity assumption, the boundary value problem has either no solutions, or infinitely many; in either event, the basic finite element method will not apply.
Rather than try to minimize P[u] on the entire function space U, we now seek to minimize it on a suitably chosen finite-dimensional subspace W ⊂ U. We begin by selecting
basis functions φ_1(x), …, φ_n(x) spanning the subspace W. Writing the unknown minimizer as the linear combination w(x) = c_1 φ_1(x) + ⋯ + c_n φ_n(x) and substituting into the functional produces the quadratic function

P(c) = ½ c^T M c − c^T b,  (11.159)

where

m_{ij} = ⟨⟨ L[φ_i] ; L[φ_j] ⟩⟩,  i, j = 1, …, n,  (11.160)
b_i = ⟨ f ; φ_i ⟩,  i = 1, …, n.  (11.161)
Observe that, once we specify the basis functions φ_i, the coefficients m_{ij} and b_i are all known quantities. Therefore, we have reduced our original problem to a finite-dimensional problem of minimizing the quadratic function (11.159) over all possible vectors c ∈ Rⁿ.
The coefficient matrix M is, in fact, positive definite, since, by the preceding computation,

c^T M c = Σ_{i,j=1}^n c_i c_j ⟨⟨ L[φ_i] ; L[φ_j] ⟩⟩ = ‖ L[w] ‖² > 0  for all  c ≠ 0,  (11.162)

because ker L = {0} ensures L[w] ≠ 0 whenever w = Σ c_i φ_i ≠ 0. The minimizer is therefore obtained by solving the linear system

M c = b.  (11.163)
Solving (11.163) relies on some form of Gaussian Elimination, or, alternatively, an iterative linear system solver, e.g., Gauss–Seidel or SOR.
This constitutes the basic abstract setting for the finite element method. The main issue, then, is how to effectively choose the finite-dimensional subspace W. Two candidates that might spring to mind are the space P⁽ⁿ⁾ of polynomials of degree ≤ n, or the space T⁽ⁿ⁾ of trigonometric polynomials of degree ≤ n, the focus of Chapter 12. However, for a variety of reasons, neither is well suited to the finite element method. One criterion is that the functions in W must satisfy the relevant boundary conditions, since otherwise W would
Figure 11.21.  A Continuous Piecewise Affine Function.
not be a subspace of U . More importantly, in order to obtain sufficient accuracy, the linear
algebraic system (11.163) will typically be rather large, and so the coefficient matrix M
should be as sparse as possible, i.e., have lots of zero entries. Otherwise, computing the
solution will be too time-consuming to be of much practical value. Such considerations
prove to be of absolutely crucial importance when applying the method to solve boundary
value problems for partial differential equations in higher dimensions.
The really innovative contribution of the finite element method is to first (paradoxically) enlarge the space U of allowable functions upon which to minimize the quadratic
functional P[ u ]. The governing differential equation requires its solutions to have a certain degree of smoothness, whereas the associated minimization principle typically requires
only half as many derivatives. Thus, for second order boundary value problems, including
bars, (11.93), and general Sturm–Liouville problems, (11.156), P[u] only involves first order derivatives. It can be rigorously shown that the functional has the same minimizing
solution, even if one allows (reasonable) functions that fail to have enough derivatives to
satisfy the differential equation. Thus, one can try minimizing over subspaces containing fairly rough functions. Again, the justification of this method requires some deeper
analysis, which lies beyond the scope of this introductory treatment.
For second order boundary value problems, a popular and effective choice of the finite-dimensional subspace is to use continuous, piecewise affine functions. Recall that a function
is affine, f (x) = a x + b, if and only if its graph is a straight line. The function is piecewise
affine if its graph consists of a finite number of straight line segments; a typical example is
plotted in Figure 11.21. Continuity requires that the individual line segments be connected
together end to end.
Given a boundary value problem on a bounded interval [a, b], let us fix a finite collection of mesh points

a = x_0 < x_1 < x_2 < ⋯ < x_{n−1} < x_n = b.

The formulas simplify if one uses equally spaced mesh points, but this is not necessary for the method to apply. Let W denote the vector space consisting of all continuous, piecewise affine functions, with corners at the nodes, that satisfy the homogeneous boundary
conditions. To be specific, let us treat the case of Dirichlet (fixed) boundary conditions

φ(a) = φ(b) = 0.  (11.164)
Figure 11.22.  A Hat Function.
On each subinterval, an element φ ∈ W has the affine form

φ(x) = c_j + b_j (x − x_j)  for  x_j ≤ x ≤ x_{j+1},  j = 0, …, n − 1.

Continuity at the interior mesh points requires

c_j = φ(x_j⁺) = φ(x_j⁻) = c_{j−1} + b_{j−1} h_{j−1},  j = 1, …, n − 1,  (11.165)

where h_{j−1} = x_j − x_{j−1} denotes the length of the j th subinterval. The boundary conditions (11.164) require

φ(a) = c_0 = 0,  φ(b) = c_{n−1} + b_{n−1} h_{n−1} = 0.  (11.166)
The most convenient basis for W consists of the hat functions, which are continuous, piecewise affine functions that interpolate the same basis data as the Lagrange polynomials (4.47), namely

φ_j(x_k) = { 1,  j = k,
             0,  j ≠ k,      for  j = 1, …, n − 1,  k = 0, …, n.
The graph of a typical hat function appears in Figure 11.22. The explicit formula is easily
established:

φ_j(x) = { (x − x_{j−1}) / (x_j − x_{j−1}),   x_{j−1} ≤ x ≤ x_j,
           (x_{j+1} − x) / (x_{j+1} − x_j),   x_j ≤ x ≤ x_{j+1},       j = 1, …, n − 1.  (11.167)
           0,                                 x ≤ x_{j−1} or x ≥ x_{j+1},
An advantage of using these basis elements is that the resulting coefficient matrix (11.160)
turns out to be tridiagonal. Therefore, the tridiagonal Gaussian Elimination algorithm in
(1.66) will rapidly produce the solution to the linear system (11.163). Since the accuracy of
the finite element solution increases with the number of mesh points, this solution scheme
allows us to easily compute very accurate numerical approximations.
Example 11.14. Consider the equilibrium equations

K[u] = − d/dx ( c(x) du/dx ) = f(x),  0 < x < ℓ,

for a non-uniform bar subject to homogeneous Dirichlet boundary conditions. In order to formulate a finite element approximation scheme, we begin with the minimization principle (11.93) based on the quadratic functional

P[u] = ½ ‖u'‖² − ⟨ f ; u ⟩ = ∫_0^ℓ [ ½ c(x) u'(x)² − f(x) u(x) ] dx.
We divide the interval [0, ℓ] into n equal subintervals, each of length h = ℓ/n. The resulting uniform mesh has

x_j = j h = j ℓ / n,  j = 0, …, n.
The corresponding finite element basis hat functions are explicitly given by

φ_j(x) = { (x − x_{j−1})/h,   x_{j−1} ≤ x ≤ x_j,
           (x_{j+1} − x)/h,   x_j ≤ x ≤ x_{j+1},     j = 1, …, n − 1.  (11.168)
           0,                 otherwise,
The associated linear system (11.163) has coefficient matrix entries

m_{ij} = ⟨⟨ φ_i' ; φ_j' ⟩⟩ = ∫_0^ℓ φ_i'(x) φ_j'(x) c(x) dx,  i, j = 1, …, n − 1.
Since the function φ_i(x) vanishes except on the interval x_{i−1} < x < x_{i+1}, while φ_j(x) vanishes outside x_{j−1} < x < x_{j+1}, the integral will vanish unless i = j or i = j ± 1.
Moreover,

φ_j'(x) = {  1/h,   x_{j−1} ≤ x ≤ x_j,
            −1/h,   x_j ≤ x ≤ x_{j+1},     j = 1, …, n − 1.
             0,     otherwise,
Therefore, the coefficient matrix takes the tridiagonal form

M = (1/h²) [ s_0+s_1     −s_1                                              ]
           [   −s_1     s_1+s_2     −s_2                                   ]
           [              −s_2     s_2+s_3     −s_3                        ]
           [                   ⋱        ⋱          ⋱                       ]
           [                 −s_{n−3}   s_{n−3}+s_{n−2}    −s_{n−2}        ]
           [                              −s_{n−2}    s_{n−2}+s_{n−1}      ]   (11.169)

where

s_j = ∫_{x_j}^{x_{j+1}} c(x) dx  (11.170)
is the total stiffness of the j th subinterval. For example, in the homogeneous case c(x) ≡ 1, the coefficient matrix (11.169) reduces to the very special form

M = (1/h) [  2  −1                ]
          [ −1   2  −1            ]
          [     −1   2  −1        ]
          [         ⋱   ⋱   ⋱     ]
          [            −1   2  −1 ]
          [                −1   2 ]   (11.171)
=
h
f (x) j (x) dx
"Z
xj
xj1
(x xj1 )f (x) dx +
xj+1
xj
(11.172)
In this manner, we have assembled the basic ingredients for determining the finite element
approximation to the solution.
In practice, we do not have to explicitly evaluate the integrals (11.170, 172), but may replace them by a suitably close numerical approximation. When h ≪ 1 is small, then the integrals are taken over small intervals, and we can use the trapezoid rule, [35, 168], to approximate them:

s_j ≈ (h/2) [ c(x_j) + c(x_{j+1}) ],  b_j ≈ h f(x_j).  (11.173)

One might be tempted to use more accurate numerical integration procedures, but the improvement in accuracy of the final answer is not very significant, particularly if the step size h is small.
Figure 11.23.  Finite Element Approximations to the Solution of (11.175).
Remark: The j th entry of the resulting finite element system M c = b is, upon dividing by h, given by

− ( c_{j+1} − 2 c_j + c_{j−1} ) / h² = − ( u(x_{j+1}) − 2 u(x_j) + u(x_{j−1}) ) / h² = f(x_j),  (11.174)

identifying the coefficient c_j with the nodal value u(x_j).
The left hand side coincides with the standard finite difference approximation to minus the second derivative, − u''(x_j), at the mesh point x_j. (Details concerning finite differences can be found in Section 14.6.) As a result, for this particular differential equation, the finite element and finite difference numerical solution methods happen to coincide.
Example 11.15. Consider the boundary value problem

− d/dx ( (x + 1) du/dx ) = 1,  u(0) = 0,  u(1) = 0,  (11.175)

whose exact solution is

u(x) = log(x + 1)/log 2 − x.  (11.176)

The solution minimizes the quadratic functional

P[u] = ∫_0^1 [ ½ (x + 1) u'(x)² − u(x) ] dx  (11.177)
over all possible functions u ∈ C¹ that satisfy the given boundary conditions. The finite element system (11.163) has coefficient matrix given by (11.169) and right hand side
Z xj+1
Z xj+1
1
2
1 2
j+
1 dx = h.
,
bj =
(1 + x) dx = h (1 + xj ) + 2 h = h + h
sj =
2
xj
xj
The resulting solution is plotted in Figure 11.23. The first three graphs contain, respectively, 5, 10, 20 points in the mesh, so that h = .2, .1, .05, while the last plots the exact solution (11.176). Even when computed on rather coarse meshes, the finite element approximation is quite respectable.
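The assembly described in Example 11.14, with the trapezoid approximations (11.173), can be sketched in a few lines of Python and tested on the boundary value problem (11.175); the function names are our own:

```python
import numpy as np

def fem_bar(c, f, n, ell=1.0):
    """Piecewise affine finite element solution of -(c(x) u')' = f(x) with
    u(0) = u(ell) = 0 on a uniform mesh of n subintervals, assembling the
    tridiagonal matrix (11.169) with the trapezoid approximations (11.173)."""
    h = ell / n
    x = np.linspace(0.0, ell, n + 1)
    s = 0.5 * h * (c(x[:-1]) + c(x[1:]))      # s_j ~ (h/2)(c(x_j) + c(x_{j+1}))
    M = (np.diag(s[:-1] + s[1:])
         - np.diag(s[1:-1], 1) - np.diag(s[1:-1], -1)) / h**2
    b = h * f(x[1:-1])                        # b_j ~ h f(x_j)
    u = np.zeros(n + 1)                       # Dirichlet boundary values are zero
    u[1:-1] = np.linalg.solve(M, b)
    return x, u

# the boundary value problem (11.175): c(x) = x + 1, f(x) = 1
x, u = fem_bar(lambda x: x + 1.0, lambda x: np.ones_like(x), 100)
exact = np.log(x + 1.0) / np.log(2.0) - x     # exact solution (11.176)
```

For a production code one would use a banded solver in place of the dense np.linalg.solve, in the spirit of the tridiagonal algorithm (1.66), but the dense version keeps the sketch short.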
Example 11.16. Consider the Sturm–Liouville boundary value problem

− u'' + (x + 1) u = x e^x,  u(0) = 0,  u(1) = 0.  (11.178)
The solution minimizes the quadratic functional (11.156), which in this particular case is

P[u] = ∫_0^1 [ ½ u'(x)² + ½ (x + 1) u(x)² − x e^x u(x) ] dx,  (11.179)

over all functions u(x) that satisfy the boundary conditions. We lay out a uniform mesh of step size h = 1/n. The corresponding finite element basis hat functions are as in (11.168).
The matrix entries are given by

m_{ij} = ∫_0^1 [ φ_i'(x) φ_j'(x) + (x + 1) φ_i(x) φ_j(x) ] dx
       ≈ {  2/h + (2h/3)(x_i + 1),     i = j,
           −1/h + (h/6)(x_i + 1),      | i − j | = 1,
            0,                         otherwise,
while

b_i = ⟨ x e^x ; φ_i ⟩ = ∫_0^1 x e^x φ_i(x) dx ≈ x_i e^{x_i} h.
The resulting solution is plotted in Figure 11.24. As in the previous figure, the first three
graphs contain, respectively, 5, 10, 20 points in the mesh, while the last plots the exact
solution, which can be expressed in terms of Airy functions, cf. [144].
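A sketch of Example 11.16 in Python, using the approximate matrix entries displayed above; since the homogeneous solutions of u'' = (x+1)u are Airy functions, a comparison solution can be built from scipy.special.airy (the function names and this check are our own):

```python
import numpy as np
from scipy.special import airy

def fem_sl(n):
    """Finite element solution of -u'' + (x+1) u = x e^x, u(0) = u(1) = 0,
    with the approximate matrix entries of Example 11.16."""
    h = 1.0 / n
    x = np.linspace(0.0, 1.0, n + 1)
    xi = x[1:-1]                                 # interior mesh points
    off = -1.0 / h + h / 6.0 * (xi[:-1] + 1.0)   # entries m_{i,i+1}
    M = (np.diag(2.0 / h + 2.0 * h / 3.0 * (xi + 1.0))
         + np.diag(off, 1) + np.diag(off, -1))
    b = h * xi * np.exp(xi)                      # b_i ~ h x_i e^{x_i}
    u = np.zeros(n + 1)
    u[1:-1] = np.linalg.solve(M, b)
    return x, u

def exact(x):
    """Exact solution e^x + alpha Ai(x+1) + beta Bi(x+1): the particular
    solution is e^x, since -(e^x)'' + (x+1) e^x = x e^x, and alpha, beta
    are fixed by the boundary conditions u(0) = u(1) = 0."""
    Ai = lambda z: airy(z)[0]
    Bi = lambda z: airy(z)[2]
    A = np.array([[Ai(1.0), Bi(1.0)], [Ai(2.0), Bi(2.0)]])
    alpha, beta = np.linalg.solve(A, [-1.0, -np.e])
    return np.exp(x) + alpha * Ai(x + 1.0) + beta * Bi(x + 1.0)
```

The lumped coefficients (x_i + 1) in place of the exact integrals only perturb the answer at the same O(h²) order as the discretization itself, which is why the cruder quadrature costs essentially no accuracy.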
So far, we have only treated homogeneous boundary conditions. An inhomogeneous boundary value problem does not immediately fit into our framework, since the set of functions satisfying the boundary conditions does not form a vector space. As discussed at the end of Section 11.3, one way to get around this problem is to replace u(x) by ũ(x) = u(x) − h(x), where h(x) is any convenient function that satisfies the boundary conditions. For example, for the inhomogeneous Dirichlet conditions

u(a) = α,  u(b) = β,

we can subtract off the affine function

h(x) = [ (β − α) x + α b − β a ] / (b − a).
The integration is made easier by noting that the integrand is zero except on a small subinterval. Since the function x + 1 (but not φ_i or φ_j) does not vary significantly on this subinterval, it can be approximated by its value 1 + x_i at a mesh point. A similar simplification is used in the ensuing integral for b_i.
Figure 11.24.  Finite Element Approximations to the Solution of (11.178).
singularities in liquid crystals, and so on; see [188] and Section 22.1 for details. The
weak solution approach has the advantage that it applies even to equations that do not
possess an associated minimization principle. However, the convergence of the induced
finite element scheme is harder to justify, and, indeed, not always valid.
The starting point is a trivial observation: the only element of an inner product space
which is orthogonal to every other element is zero. More precisely:
Lemma 11.17. If V is an inner product space, then ⟨ w ; v ⟩ = 0 for all v ∈ V if and only if w = 0.

Proof: Choose v = w. The orthogonality condition implies 0 = ⟨ w ; w ⟩ = ‖w‖², and so w = 0.  Q.E.D.
Note that the result is equally valid in both finite- and infinite-dimensional vector
spaces. Suppose we are trying to solve a linear system
K[ u ] = f ,
(11.180)
where K: U → V is a linear operator between inner product spaces. Using the lemma, this can be reformulated as requiring

⟨⟨ K[u] ; v ⟩⟩ = ⟨⟨ f ; v ⟩⟩  for all  v ∈ V.

According to the definition (7.74), one can replace K by its adjoint K*: V → U, and require

⟨ u ; K*[v] ⟩ = ⟨⟨ f ; v ⟩⟩  for all  v ∈ V.  (11.181)
The latter is called the weak formulation of our original equation. The general philosophy
is that one can check whether u is a weak solution to the system by evaluating it on various
test elements v using the weak form (11.181) of the system.
In the finite-dimensional situation, when K is merely multiplication by some matrix, the weak formulation is an unnecessary complication, and not of use. However, in the infinite-dimensional situation, when K is a differential operator, then the original boundary value problem K[u] = f requires that u be sufficiently differentiable, whereas the weak version

⟨ u ; K*[φ] ⟩ = ⟨⟨ f ; φ ⟩⟩  for all  φ

requires only that the test function φ(x) be smooth. As a result, weak solutions are not restricted to be smooth functions possessing the required number of derivatives.
Example 11.18. Consider the homogeneous Dirichlet boundary value problem

K[u] = − d/dx ( c(x) du/dx ) = f(x),  0 < x < ℓ,  u(0) = u(ℓ) = 0,
for a nonuniform bar. Its weak version is obtained by integration by parts. We initially restrict to test functions which vanish at the boundary: φ(0) = φ(ℓ) = 0. This requirement will eliminate any boundary terms in the integration by parts computation

⟨ K[u] ; φ ⟩ = ∫_0^ℓ [ − d/dx ( c(x) du/dx ) ] φ(x) dx = ∫_0^ℓ c(x) (du/dx) (dφ/dx) dx
             = ∫_0^ℓ f(x) φ(x) dx = ⟨ f ; φ ⟩.  (11.182)

This semi-weak formulation is known in mechanics as the principle of virtual work, [169].
For example, the Green's function of the boundary value problem does not qualify as a classical solution, since it is not twice continuously differentiable, but it can be formulated as a weak solution satisfying the virtual work equation with right hand side defined by the delta forcing function.
A second integration by parts produces the weak form (11.181) of the differential equation:

⟨ u ; K[φ] ⟩ = ∫_0^ℓ u(x) [ − d/dx ( c(x) dφ/dx ) ] dx = ∫_0^ℓ f(x) φ(x) dx = ⟨ f ; φ ⟩.  (11.183)
Now, even discontinuous functions u(x) are allowed as weak solutions. The goal is to find u(x) such that this condition holds for all smooth test functions φ(x). For example, any function u(x) which satisfies the differential equation except at points of discontinuity qualifies as a weak solution.
In a finite element or Galerkin approximation to the weak solution, one restricts attention to a finite-dimensional subspace W spanned by functions φ_1, …, φ_{n−1}, and requires that the approximate solution

w(x) = c_1 φ_1(x) + ⋯ + c_{n−1} φ_{n−1}(x)  (11.184)

satisfy the orthogonality condition (11.181) only for elements φ ∈ W of the subspace. As usual, this only needs to be checked on the basis elements. Substituting (11.184) into the semi-weak form of the system, (11.182), produces a linear system of equations of the form

⟨ w ; K[φ_i] ⟩ = Σ_{j=1}^{n−1} m_{ij} c_j = b_i = ⟨ f ; φ_i ⟩,  i = 1, …, n − 1.  (11.185)
The reader will recognize this as exactly the same finite element linear system (11.163)
derived through the minimization approach. Therefore, for a self-adjoint boundary value
problem, the weak formulation and the minimization principle, when restricted to the
finite-dimensional subspace W , lead to exactly the same equations for the finite element
approximation to the solution.
In non-self-adjoint scenarios, the weak formulation is still applicable even though there
is no underlying minimization principle. On the other hand, there is no guarantee that
either the original boundary value problem or its finite element approximation have a
solution. Indeed, it is entirely possible that the boundary value problem has a solution,
12/11/12
629
c 2012
Peter J. Olver
but the finite element matrix system does not. Even more worrying are cases in which
the finite element system has a solution, but there is, in fact, no actual solution to the
boundary value problem! In such cases, one is usually tipped off by the non-convergence
of the approximations as the mesh size goes to zero. Nevertheless, in many situations, the
weak solution approach leads to a perfectly acceptable numerical approximation to the
true solution to the system. Further analytical details and applications of weak solutions
can be found in [76, 188].