Chapter 5 - Function Approximation

This lecture discusses function approximation using polynomial functions. It begins by introducing the Weierstrass approximation theorem, which states that any continuous function can be approximated as closely as desired by a polynomial function over a given interval. However, the Weierstrass approximation converges very slowly. Taylor series provide an alternative but have errors that are not evenly distributed over the interval. The lecture then introduces the minimax approximation method, which finds polynomial coefficients that minimize the maximum approximation error over the interval, resulting in a more evenly distributed error. An example approximating the function f(x) = e^x on the interval [-1, 1] illustrates how minimax approximation provides a more accurate approximation, with errors closer to being evenly distributed, compared to the Taylor series expansion.


Lecture 21

Last lecture:
Least Square Method for Regression

This lecture:

Function approximation
CHAPTER V APPROXIMATION OF FUNCTIONS

◊ Objective: f(x) is given in a complicated form that is difficult to use; we want to approximate f(x) by a simple polynomial on an interval. Can we do it? How do we do it accurately?
5.1 Weierstrass Theorem—yes we can!
5.1.1 Weierstrass approximation
◊ Let f(x) be continuous for a ≤ x ≤ b and let ε > 0.
  There exists a polynomial P(x) such that
     | f(x) - P(x) | ≤ ε   for a ≤ x ≤ b.

 we can always find a P(x) that is as close to f(x) as we want.

◊ The polynomial can be constructed using the Bernstein polynomial,
     Pn(x) = Σ_{k=0}^{n} f(k/n) C(n,k) x^k (1-x)^(n-k)   for 0 ≤ x ≤ 1,
  where C(n,k) = n!/[k!(n-k)!] are the binomial coefficients.

◊ It can be shown that lim_{n→∞} Pn(x) = f(x) for 0 ≤ x ≤ 1.
Example of constructing Weierstrass’s approximation

Consider: f(x) = e^x for -1 ≤ x ≤ 1.

Find: the Weierstrass approximation of f(x) using n = 3, with
     Pn(s) = Σ_{k=0}^{n} f(k/n) C(n,k) s^k (1-s)^(n-k)   for 0 ≤ s ≤ 1.

Solution: map the interval by letting s = (x+1)/2, so that 0 ≤ s ≤ 1. For n = 3 the nodes are
s = k/n = 0, 1/3, 2/3 & 1, which correspond to x = -1, -1/3, 1/3 & 1:
     f(s=0)   = f(x=-1)   = e^(-1),     f(s=1/3) = f(x=-1/3) = e^(-1/3),
     f(s=2/3) = f(x=1/3)  = e^(1/3),    f(s=1)   = f(x=1)    = e^1.

 W3(s) = e^(-1) (1-s)^3 + e^(-1/3)·3s(1-s)^2 + e^(1/3)·3s^2(1-s) + e·s^3,  with s = (x+1)/2.

[Figure: exact f(x) = e^x and its 3rd-order Weierstrass approximation on -1 ≤ x ≤ 1.]
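A minimal Python sketch (not part of the original notes; the helper name is mine) that builds the Bernstein polynomial for f(x) = e^x via the mapping s = (x+1)/2 and illustrates the slow convergence mentioned below:

```python
import numpy as np
from math import comb

def bernstein_approx(g, n, s):
    """Bernstein polynomial P_n of g on [0, 1], evaluated at the points s."""
    s = np.asarray(s, dtype=float)
    return sum(g(k / n) * comb(n, k) * s**k * (1 - s)**(n - k) for k in range(n + 1))

# Approximate f(x) = e^x on [-1, 1] by mapping s = (x + 1)/2 into [0, 1].
g = lambda s: np.exp(2.0 * s - 1.0)        # g(s) = f(x(s))
x = np.linspace(-1.0, 1.0, 401)
s = (x + 1.0) / 2.0

for n in (3, 10, 100):
    Wn = bernstein_approx(g, n, s)
    print(n, np.max(np.abs(np.exp(x) - Wn)))   # the maximum error shrinks only like O(1/n)
```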
5.1.2 Comparison with Taylor series expansion

◊ Comment: the Bernstein polynomial Pn(x) = Σ_{k=0}^{n} f(k/n) C(n,k) x^k (1-x)^(n-k) converges very slowly!
  * Thus it is impractical to use it. (See the figure below.)
  * We must find something else that is practical and accurate.

◊ Question: Is a Taylor series expansion a good alternative?

◊ A Taylor series expansion is another way to approximate f(x).
  Consider: f(x) = e^x for -1 ≤ x ≤ 1. Then, for x near 0,
     TS3(x) = 1 + x + x^2/2 + x^3/6.
  Error:  E3(x) = e^x - (1 + x + x^2/2 + x^3/6) = x^4 e^ξ / 4!,   ξ ∈ [0, x] or [x, 0].

[Figure: third-order polynomial approximations of e^x (exact, Weierstrass, TS3) and the errors of the third-order approximations on -1 ≤ x ≤ 1.]

Problem: E3(x) ~ x^4 is not evenly distributed in the interval [-1, 1].

Want: a better way to obtain Pn(x) so that the error E(x) is more evenly distributed but max{|E(x)|} is smaller.
5.2 Minimax Approximation

5.2.1 Illustration of the method

◊ Consider f(x) = e^x, x ∈ [-1, 1].
  * Approximate f(x) by
       q1*(x) = a0 + a1 x   for x ∈ [-1, 1]
    so that  max_{-1≤x≤1} |E(x)| = max_{-1≤x≤1} |f(x) - q1*(x)|
    is the smallest among all possible values of (a0, a1).

[Figure: sketch of exp(x) and the linear approximation q1*(x); the error is largest at x = -1, x3, and 1.]

  * Apparently, there are three locations, -1, x3, and 1, where |f(x) - q1*(x)| can be large.
  * Intuitively, we want to spread the error uniformly over the interval of interest. Thus we should force
       E(x = -1) = E(x = 1) = r1   and   E(x = x3 ≡ z) = -r1:
       e^(-1) - (a0 - a1)   = r1      (i)
       e      - (a0 + a1)   = r1      (ii)
       e^z    - (a0 + a1 z) = -r1     (iii)
    Since the error has a local extremum at x3, we also have
       dE/dx |_(x=z=x3) = 0    e^z - a1 = 0.   (iv)

    (i) - (ii)     a1 = (e - e^(-1))/2 = sinh(1) = 1.175201
    (iv)           x3 = ln(a1) = 0.161439
    (i) & (iii)    a0 = [e^(-1) + e^(x3) + a1(1 - x3)]/2 = 1.264279,   r1 = 0.278802

 E1(x) = e^x - (1.264279 + 1.175201 x)
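A quick numerical check (my own sketch, not from the notes) of the linear minimax coefficients derived from equations (i)-(iv):

```python
import numpy as np

a1 = np.sinh(1.0)                                        # from (i) - (ii)
x3 = np.log(a1)                                          # from (iv): e^x3 = a1
a0 = (np.exp(-1.0) + np.exp(x3) + a1 * (1.0 - x3)) / 2   # from (i) and (iii)
r1 = np.exp(-1.0) - (a0 - a1)                            # equioscillation value, from (i)
print(a0, a1, x3, r1)      # ≈ 1.264279, 1.175201, 0.161439, 0.278802

# Verify that the maximum error over [-1, 1] indeed equals r1.
x = np.linspace(-1.0, 1.0, 2001)
E = np.exp(x) - (a0 + a1 * x)
print(np.max(np.abs(E)))   # ≈ 0.2788
```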

[Figure: error E1(x) of the linear minimax approximation on [-1, 1]; indeed the negative and positive errors are evened out.]

◊ If we try  q3*(x) = a0 + a1 x + a2 x^2 + a3 x^3  for x ∈ [-1, 1]
  and minimize  max_{-1≤x≤1} |E(x)| = max_{-1≤x≤1} |f(x) - q3*(x)|,  we obtain
     q3*(x) = 0.994579 + 0.995668x + 0.542973x^2 + 0.179533x^3,   E3(x) = e^x - q3*(x).

[Figure: errors of the third-order minimax and Taylor series approximations on [-1, 1].]

  • The maximum error |E3(x)| is 0.00553 « 0.0516 from the TS expansion.
  • Improvement over the TS3(x) series.
5.2.2 Comments on the minimax approximation method

◊ Clearly the Taylor series approach, while simple, does not lead to an evenly distributed error.
  Although the error near x = x0 is very small, its maximum error over the interval is much higher
  than the one obtained via the minimax approach.

◊ HOWEVER, in general it is very tedious to find (a0, a1, a2, ...)
  when minimizing max|E(x)| = max |f(x) - (a0 + a1 x + a2 x^2 + ...)|.
◊ Analytically, it depends on how complicated f(x) is.
◊ The absolute-value function is not well suited to analytical treatment.

◊ We must find an alternative, robust approach that performs similarly to the minimax approximation.

◊ Accuracy of the minimax approximation:
  Given f(x), x ∈ [a, b], the error bound in approximating f(x) by the degree-n minimax polynomial is
     ρn(f) ≤ { [(b-a)/2]^(n+1) / [(n+1)! 2^n] } · max_{a≤x≤b} |f^(n+1)(x)|.
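For example, for f(x) = e^x on [-1, 1] and n = 3 this bound gives ρ3(e^x) ≤ 1/(4!·2^3)·e = e/192 ≈ 0.0142, consistent with the actual maximum minimax error 0.00553 found above.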
5.3 General Idea of Least Square Approximation

5.3.1 Inner product and L2-norm of functions

◊ Let w(x) = weight function > 0 on x ∈ [a, b], with ∫_a^b |x|^n w(x) dx finite for all n ≥ 0.

  • Examples of common w(x):
       w(x) = 1;
       w(x) = 1/√(1 - x^2),  x ∈ [-1, 1];
       w(x) = e^(-x),  0 ≤ x < ∞;
       w(x) = e^(-x^2),  -∞ < x < ∞.

  • Inner product of two functions f & g:
       (f, g) = ∫_a^b w(x) f(x) g(x) dx.

  • Orthogonal sequence of functions {φi(x)}:  (φi, φj) = 0 for i ≠ j.

  • L2-norm of f(x):
       || f ||2 = [ ∫_a^b w(x) f^2(x) dx ]^(1/2) = √(f, f).

  • Cauchy-Schwarz inequality:  |(f, g)| ≤ || f ||2 || g ||2.

  • Triangle inequality:  || f + g ||2 ≤ || f ||2 + || g ||2.

5.3.2 Illustration of least square approximation for functions

◊ Consider the L2-norm of the error (with w(x) = 1):
  For f(x) = e^x, approximate by r1*(x) = a0 + a1 x. The overall (squared) error is
     E = ∫_(-1)^(1) [f(x) - r1*(x)]^2 dx.
  To minimize E, we require
     ∂E/∂a0 = 0  and  ∂E/∂a1 = 0:

     ∂/∂a0 ∫_(-1)^(1) [f(x) - a0 - a1 x]^2 dx = -2 ∫_(-1)^(1) [e^x - a0 - a1 x] dx   = 0
     ∂/∂a1 ∫_(-1)^(1) [f(x) - a0 - a1 x]^2 dx = -2 ∫_(-1)^(1) [e^x - a0 - a1 x] x dx = 0

  Since ∫_(-1)^(1) x dx = 0, the two equations decouple:
      a0 ∫_(-1)^(1) dx     = ∫_(-1)^(1) e^x dx   = e^1 - e^(-1) = 2.350402
      a1 ∫_(-1)^(1) x^2 dx = ∫_(-1)^(1) x e^x dx = 2 e^(-1)     = 0.735759

      a0 = 1.175201 and a1 = 1.103638, i.e.
         r1*(x) = 1.175201 + 1.103638 x.

  Although max_{-1≤x≤1} |E(x)| = 0.43944 > 0.2788 (from minimax), the procedure is much easier.
  In contrast, the TS expansion gives TS1(x) = 1 + x, which has a maximum error of 0.7183
  > max{LSQ error} = 0.43944.

[Figure: exact e^x and the linear LSQ and minimax approximations on [-1, 1].]
[Figure: comparison of the errors of the linear LSQ, minimax, and Taylor series approximations on [-1, 1].]
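A short Python sketch (mine, using SciPy quadrature; not part of the notes) reproducing this linear least-squares fit:

```python
import numpy as np
from scipy.integrate import quad

f = np.exp

# On [-1, 1] with w(x) = 1, the monomials 1 and x are already orthogonal,
# so the two normal equations decouple.
a0 = quad(f, -1, 1)[0] / 2.0                         # divide by the integral of 1^2, which is 2
a1 = quad(lambda x: x * f(x), -1, 1)[0] / (2.0 / 3)  # divide by the integral of x^2, which is 2/3
print(a0, a1)                                        # ≈ 1.175201, 1.103638

x = np.linspace(-1, 1, 2001)
print(np.max(np.abs(f(x) - (a0 + a1 * x))))          # ≈ 0.43944
```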
5.3.3 Generalization to least square approximation

◊ Given a continuous function f(x) on x ∈ [a, b],
  FIND the polynomial rn*(x) of degree ≤ n that minimizes
     ∫_a^b w(x) [f(x) - rn*(x)]^2 dx
  among all polynomials rn(x) of degree ≤ n, for the given w(x).

• Example: For w(x) = 1, approximate f(x) using
     f(x) ~ r(x) = Σ_{j=0}^{n} aj x^j
  by minimizing
     E = ∫_a^b [f(x) - Σ_{j=0}^{n} aj x^j]^2 dx.

Solution: minimization of the error E requires ∂E/∂aj = 0:
    -2 ∫_a^b [f(x) - Σ_{i=0}^{n} ai x^i] x^j dx = 0
    Σ_{i=0}^{n} ai ∫_a^b x^(i+j) dx = ∫_a^b f(x) x^j dx
    Σ_{i=0}^{n} ai [b^(i+j+1) - a^(i+j+1)]/(i+j+1) = ∫_a^b f(x) x^j dx ≡ cj,

  i.e. a linear system  Σ_{i=0}^{n} Aji ai = cj.

Discussion:
  For a = 0, b = 1 the coefficient matrix is
     Aij = 1/(i + j + 1) = Hilbert matrix  (ill-conditioned!)
   the solution for the ai can be in great error!
   we need to develop an alternative method to find rn*(x).

[Figure: condition number Cond(H(n)) of the Hilbert matrix versus n; it grows by orders of magnitude with each increase in n.]
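A minimal check of this ill-conditioning (my sketch, assuming SciPy is available):

```python
import numpy as np
from scipy.linalg import hilbert

# Normal-equation matrix for the monomial basis on [0, 1]: A[i, j] = 1/(i + j + 1).
for n in range(1, 7):
    H = hilbert(n + 1)              # (n+1) x (n+1) Hilbert matrix
    print(n, np.linalg.cond(H))     # the condition number grows explosively with n
```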
5.4 Orthogonal Polynomial

5.4.1 Why are monomials bad for approximations?

◊ Why are the monomials x^n (n ≥ 0) in the above example bad choices?

  !!! The x^n look too much alike for large n
   the equations are nearly dependent on each other
   instability of the system Σ_i Aji ai = cj (ill-conditioned).

[Figure: x^n on [0, 1] for n = 2, 3, 5; the curves become nearly indistinguishable as n grows.]

◊ We thus want to represent f(x) using  rn(x) = Σ_{i=0}^{n} ai φi(x),
  where φi(x) is independent of φj(x) for i ≠ j,
  that is, the φi(x) are "orthogonal" to each other.

◊ What does "orthogonal" mean?
◊ Can we find such a set of orthogonal φi(x)?
◊ If we can, how do we solve for the ai reliably?
Example of known “orthogonal” functions

❑ Fourier series expansion


   f(x) = Σ_i ai φi(x) = a0 + Σ_{n=1}^{∞} [ an cos(nπx/L) + bn sin(nπx/L) ],   x ∈ [-L, L],

   a0 = (1/2L) ∫_(-L)^(L) f(x) dx,
   ak = (1/L) ∫_(-L)^(L) f(x) cos(kπx/L) dx,
   bk = (1/L) ∫_(-L)^(L) f(x) sin(kπx/L) dx.

Why is the Fourier series expansion successful? Because φi(x) is orthogonal to φj(x):

  * Recall that for any n & m,
       ∫_(-π)^(π) sin(mx) cos(nx) dx = 0,
    since  sin(mx) cos(nx) = (1/2)[ sin((m-n)x) + sin((m+n)x) ].

  * For n ≠ m,
       ∫_(-π)^(π) cos(mx) cos(nx) dx = 0   and   ∫_(-π)^(π) sin(mx) sin(nx) dx = 0,
    since
       cos(mx) cos(nx) = (1/2)[ cos((m-n)x) + cos((m+n)x) ],
       sin(mx) sin(nx) = (1/2)[ cos((m-n)x) - cos((m+n)x) ].
5.4.2 Gram-Schmidt orthonormalization

 Gram-Schmidt orthogonalization, also called the Gram-Schmidt process, is a procedure
  which takes a nonorthogonal set of linearly independent functions and constructs an
  orthogonal basis over an arbitrary interval with respect to an arbitrary weighting function w(x).

 There exists a sequence of polynomials {φn(x) | n ≥ 0} with degree(φn) = n for all n and
  (φn, φm) = 0 for n ≠ m (n, m ≥ 0).

 In fact, we can construct such a sequence with
  i.  (φn, φn) = 1;
  ii. the coefficient of x^n (the highest power) in φn(x) positive.
  Then this {φn(x)} is unique for a given w(x).
Orthonormalization procedure: n = 0 & 1, for a given w(x) on [a, b]

* n = 0:  φ0(x) = c.
  Normalization requires:  (φ0, φ0) = c^2 ∫_a^b w(x) dx = 1
    c = [ ∫_a^b w(x) dx ]^(-1/2).

* n = 1:  Let y1(x) = x + a_{1,0} φ0(x);
  enforce orthogonality:  (y1, φ0) = 0    (x, φ0) + a_{1,0} (φ0, φ0) = 0
    a_{1,0} = -(x, φ0)/(φ0, φ0) = -(x, φ0) = -c ∫_a^b w(x) x dx
    a_{1,0} = - ∫_a^b w(x) x dx / [ ∫_a^b w(x) dx ]^(1/2).

  Now define  φ1(x) ≡ y1(x) / || y1(x) ||2
    || φ1(x) ||2 = 1  &  (φ0, φ1) = 0.
Orthonormalization procedure: general n

* Suppose φ0(x), φ1(x), …, φ_(n-1)(x) have been obtained, with (φi, φj) = 0 for i ≠ j.

* After φ_(n-1)(x) is found,
  a) let  yn(x) = x^n + a_{n,n-1} φ_(n-1)(x) + a_{n,n-2} φ_(n-2)(x) + ... + a_{n,0} φ0(x),
     then enforce  (yn, φj) = 0    (x^n, φj) + a_{n,j} (φj, φj) = 0
       a_{n,j} = -(x^n, φj)/(φj, φj) = -(x^n, φj)   for j = 0, 1, …, n-1.
  b) Now define  φn(x) = yn(x) / || yn(x) ||2,
     so that  || φn(x) ||2 = 1  &  (φn, φj) = 0 for j ≠ n.
Example: find the orthonormal polynomial sequence for w(x) = 1, [a, b] = [-1, 1]

Given w(x) = 1, [a, b] = [-1, 1].

n = 0:  φ0(x) = c, with
   c = [ ∫_(-1)^(1) dx ]^(-1/2) = 1/√2    φ0(x) = 1/√2.

n = 1:
   a_{1,0} = -(x, φ0) = -c ∫_(-1)^(1) x dx = 0
   y1(x) = x    || y1(x) ||2 = [ ∫_(-1)^(1) x^2 dx ]^(1/2) = (2/3)^(1/2)
    φ1(x) = √(3/2) x.

n = 2: similarly,
   y2(x) = x^2 + a_{2,1} φ1(x) + a_{2,0} φ0(x) = x^2 + a_{2,1} √(3/2) x + a_{2,0} (1/√2)
   a_{n,j} = -(x^n, φj)    a_{2,1} = -(x^2, φ1) = 0;   a_{2,0} = -(x^2, φ0) = -(1/√2) ∫_(-1)^(1) x^2 dx = -2/(3√2)
   y2(x) = x^2 - [2/(3√2)]·(1/√2) = x^2 - 1/3    || y2(x) ||2 = (2/3)√(2/5)
    φ2(x) = (1/2)√(5/2) (3x^2 - 1),  …

 apart from the normalization, these are the Legendre polynomials.

• Note: (φj, φj) = 1 is not necessary as long as we know the value of || φj(x) ||2, which is usually the case
  for the well-known orthogonal polynomials.
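A small Python sketch (my own illustration, assuming NumPy/SciPy; the helper names are mine) that carries out this Gram-Schmidt process numerically for w(x) = 1 on [-1, 1]:

```python
import numpy as np
from numpy.polynomial import polynomial as P
from scipy.integrate import quad

def inner(p, q, a=-1.0, b=1.0):
    """(p, q) = integral of w(x) p(x) q(x) dx with w(x) = 1, for coefficient arrays p, q."""
    return quad(lambda x: P.polyval(x, p) * P.polyval(x, q), a, b)[0]

def gram_schmidt(n_max):
    """Orthonormal polynomials phi_0 .. phi_n_max as coefficient arrays (lowest power first)."""
    phis = []
    for n in range(n_max + 1):
        y = np.zeros(n + 1)
        y[n] = 1.0                                  # start from the monomial x^n
        for phi in phis:                            # subtract the projections onto earlier phis
            y = P.polysub(y, inner(y, phi) * phi)
        phis.append(y / np.sqrt(inner(y, y)))       # normalize so that ||phi_n||_2 = 1
    return phis

for n, phi in enumerate(gram_schmidt(3)):
    print(n, np.round(phi, 6))   # 1/sqrt(2); sqrt(3/2)*x; (1/2)sqrt(5/2)(3x^2-1); ...
```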
Lecture 22
Last lecture:
Function approximation

This lecture:

Least square approximation


5.4.3 Various common orthogonal polynomials

i) Legendre polynomials
  * Weight function: w(x) = 1 on [-1, 1].
  * Legendre functions are solutions of Legendre's differential equation,
       d/dx [ (1 - x^2) dP(x)/dx ] + n(n+1) P(x) = 0.
    This ordinary differential equation is frequently encountered in physics and other technical fields.
    In particular, it occurs when solving Laplace's equation (and related PDEs) in spherical coordinates.
  * General expression (Rodrigues' formula):
       Pn(x) = [(-1)^n / (2^n n!)] d^n/dx^n [ (1 - x^2)^n ]

       P0(x)   1
       P1(x)   x
       P2(x)   (3x^2 - 1)/2
       P3(x)   (5x^3 - 3x)/2
       P4(x)   (35x^4 - 30x^2 + 3)/8
       P5(x)   (63x^5 - 70x^3 + 15x)/8
       P6(x)   (231x^6 - 315x^4 + 105x^2 - 5)/16

  * Orthogonality of Pn(x): we can show that (Pn(x), Pm(x)) = 0 for n ≠ m,
    i.e. Pn(x) is orthogonal on [-1, 1] w.r.t. w(x) = 1.
    When n = m,
       (Pn(x), Pn(x)) = 2/(2n + 1)   (≠ 1; OK!).
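A quick verification of these orthogonality relations (my sketch, using NumPy's Legendre utilities and SciPy quadrature):

```python
import numpy as np
from numpy.polynomial import legendre as L
from scipy.integrate import quad

def legendre_inner(n, m):
    """(P_n, P_m) on [-1, 1] with w(x) = 1."""
    cn = np.zeros(n + 1); cn[n] = 1.0     # P_n expressed in the Legendre basis
    cm = np.zeros(m + 1); cm[m] = 1.0
    return quad(lambda x: L.legval(x, cn) * L.legval(x, cm), -1, 1)[0]

print(legendre_inner(2, 3))   # ≈ 0
print(legendre_inner(3, 3))   # ≈ 2/(2*3 + 1) = 2/7
```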
Lecture 23
Last lecture:
Least square approximation
Gram-Schmidt process
Various common orthogonal polynomials
This lecture:

Least square approximation


Numerical differentiation
Chebyshev polynomials

ii) Chebyshev polynomials
  * Weight: w(x) = (1 - x^2)^(-1/2) on [-1, 1].
  * General expression: Tn(x) = cos[n cos^(-1)(x)], n ≥ 0.
    Let θ = cos^(-1)(x)    Tn(x) = cos(nθ).

  * The first few Chebyshev polynomials of the first kind:

       T0(x)   1
       T1(x)   x                        = cos(θ)
       T2(x)   2x^2 - 1                 = cos(2θ) = 2cos^2(θ) - 1
       T3(x)   4x^3 - 3x                = cos(3θ) = 4cos^3(θ) - 3cos(θ)
       T4(x)   8x^4 - 8x^2 + 1
       T5(x)   16x^5 - 20x^3 + 5x
       T6(x)   32x^6 - 48x^4 + 18x^2 - 1

  * Orthogonality:  (Tn, Tm) = 0 for n ≠ m;   (Tn, Tn) = π for n = m = 0;   (Tn, Tn) = π/2 for n = m > 0.

  * Recursion:  T_(n+1)(x) = 2x Tn(x) - T_(n-1)(x),  n ≥ 1.

  • Chebyshev polynomials are important in approximation theory because the roots of the
    Chebyshev polynomials of the first kind, which are also called Chebyshev nodes,
    are used as nodes in polynomial interpolation.
iii) Laguerre polynomials
  * Weight function: w(x) = e^(-x), [a, b] = [0, ∞).
  * General expression:
       Ln(x) = (e^x / n!) d^n/dx^n [ x^n e^(-x) ].
  * Orthogonality:  || Ln ||2 = 1,   (Ln, Lm) = δ_nm.
5.4.4 Representation of a polynomial using orthogonal polynomials

◊ Let y(x) = a polynomial of degree n:  y(x) = Σ_{i=0}^{n} bi x^i.
  We want to represent y(x) by the orthogonal polynomials φi(x),
  i.e.  y(x) = Σ_{i=0}^{n} ai φi(x);   ai = ?

  Note:  (y, φj) = Σ_{i=0}^{n} ai (φi, φj) = aj (φj, φj)   for j = 0, 1, ..., n.

  Hence  aj = (y, φj) / (φj, φj)
            = ∫_a^b w(x) y(x) φj(x) dx / || φj ||2^2.

  Then y(x) = Σ_{i=0}^{n} ai φi(x) is exact (n is finite),
  because y(x) IS a polynomial.
5.5 Least Square Approximation Using Orthogonal Polynomials

5.5.1 General formulation

• Given f(x) (not necessarily a polynomial), we want to approximate f(x) by a superposition of the {φi(x)}:
     rn(x) = Σ_{i=0}^{n} ai φi(x).

• The L2-norm error (squared) in approximating f(x) by rn(x) is
     E2^2 = ∫_a^b w(x) [ f(x) - Σ_{i=0}^{n} ai φi(x) ]^2 dx.

  ∂E2^2/∂aj = 0    -2 ∫_a^b w(x) [ f(x) - Σ_{i=0}^{n} ai φi(x) ] φj(x) dx = 0

   ∫_a^b w(x) f(x) φj(x) dx = Σ_{i=0}^{n} ai ∫_a^b w(x) φi(x) φj(x) dx = Σ_{i=0}^{n} ai (φi, φj) = aj || φj ||2^2

   aj = ∫_a^b w(x) f(x) φj(x) dx / || φj ||2^2,

  i.e.  aj = (f, φj) / (φj, φj)   the only work needed is the evaluation of (f, φj).
5.5.2 Applications to arbitrary [a, b] using Legendre polynomials

i) Using Legendre polynomials Pn(t) (defined only on [-1, 1]):
  • w(x) = 1, with arbitrary [a, b].
  • Transformation:  x = [b + a + (b - a) t] / 2,  so that t ∈ [-1, 1]  ↔  x ∈ [a, b];
    inversely,  t = [2/(b - a)] [x - (a + b)/2].
  • Substituting x = [b + a + (b - a) t]/2 into the given f(x), we get
       f(x) = f{ [b + a + (b - a) t]/2 } = F(t).

  • The L2-norm error in approximating f(x) by rn(x), a linear combination of the Pn, is
       E2^2 = || f(x) - rn(x) ||2^2 = ∫_a^b [f(x) - rn(x)]^2 dx = [(b - a)/2] ∫_(-1)^(1) [F(t) - Rn(t)]^2 dt,
    where Rn(t) = rn{ [b + a + (b - a) t]/2 } = rn(x).

  • Rn(t) = Σ_{i=0}^{n} ai Pi(t), t ∈ [-1, 1]    ai = (F, Pi)/(Pi, Pi) = [(2i + 1)/2] (F, Pi).

  • Numerical integration is generally needed to evaluate (F, Pi).
Example: expand f(x) = e^x, 0 ≤ x ≤ 2, using Legendre polynomials Pn(t)

• Transformation: x = [b + a + (b - a) t]/2 = 1 + t, i.e. t = x - 1, so that t ∈ [-1, 1]  ↔  x ∈ [0, 2].
• Substituting x = 1 + t into the given f(x):  f(x) = e^(1+t) = F(t).
• Use  ai = (F, Pi)/(Pi, Pi) = [(2i + 1)/2] (F, Pi):

   (F, P0) = ∫_(-1)^(1) e^(1+t) · 1 dt              = e^2 - 1      a0 = 3.1945280
   (F, P1) = ∫_(-1)^(1) e^(1+t) · t dt              = 2            a1 = (3/2)·2 = 3
   (F, P2) = ∫_(-1)^(1) e^(1+t) · (3t^2 - 1)/2 dt   = e^2 - 7      a2 = 0.9726403
   (F, P3) = ∫_(-1)^(1) e^(1+t) · (5t^3 - 3t)/2 dt  = 37 - 5e^2    a3 = 0.1915183

 r3(x) = 3.194528 + 3(x - 1) + 0.9726403·[3(x - 1)^2 - 1]/2 + 0.1915183·[5(x - 1)^3 - 3(x - 1)]/2

[Figure: comparison of e^x with the quadratic Legendre least-squares approximation on [0, 2].]
[Figure: errors E2(x) and E3(x) of the quadratic and cubic Legendre least-squares approximations of e^x on [0, 2].]
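The same coefficients can be reproduced numerically; the following Python sketch (mine, assuming NumPy/SciPy; the helper name is hypothetical) evaluates ai = [(2i+1)/2](F, Pi) by quadrature and checks r3(x):

```python
import numpy as np
from numpy.polynomial import legendre as L
from scipy.integrate import quad

a, b = 0.0, 2.0
F = lambda t: np.exp((b + a + (b - a) * t) / 2)        # F(t) = f(x(t)) = e^(1+t) here

def legendre_lsq_coeffs(F, n):
    """a_i = (2i+1)/2 * (F, P_i) on [-1, 1] with w(t) = 1."""
    out = []
    for i in range(n + 1):
        c = np.zeros(i + 1); c[i] = 1.0                 # P_i in the Legendre basis
        out.append((2 * i + 1) / 2 * quad(lambda t: F(t) * L.legval(t, c), -1, 1)[0])
    return np.array(out)

ai = legendre_lsq_coeffs(F, 3)
print(ai)                                               # ≈ [3.1945280, 3.0, 0.9726403, 0.1915183]

x = np.linspace(a, b, 1001)
t = 2 * (x - (a + b) / 2) / (b - a)
r3 = L.legval(t, ai)                                    # r3(x) = sum_i a_i P_i(t(x))
print(np.max(np.abs(np.exp(x) - r3)))                   # maximum error of the cubic LSQ fit
```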
5.5.2 (continued) Application using Chebyshev polynomials for f(x)

ii) Using Chebyshev polynomials Tn(x) (defined only on [-1, 1]):
  • w(x) = 1/√(1 - x^2).
  • Construct Cn(x) to approximate f(x):
       Cn(x) = Σ'_{j=0}^{n} cj Tj(x)     (the prime means taking c0/2, instead of c0, for j = 0).
    Using (Tn, Tm) = 0 for n ≠ m, (T0, T0) = π, and (Tn, Tn) = π/2 for n > 0,
       cj = (2/π) ∫_(-1)^(1) f(x) Tj(x) / √(1 - x^2) dx    for j = 0, 1, ..., n.

  • To evaluate cj, use the standard change of variables x = cos θ (0 ≤ θ ≤ π):
       cj = (2/π) ∫_π^0 [ f(cos θ) cos(jθ) / sin θ ] (-sin θ) dθ
          = (2/π) ∫_0^π f(cos θ) cos(jθ) dθ.

  • Finally,  Cn(x) = Cn(cos θ) = Σ'_{j=0}^{n} cj cos(jθ).
Example: expand f(x) = e^x, -1 ≤ x ≤ 1, using Chebyshev polynomials Tn(x)

Solution: for f(x) = e^x, x ∈ [-1, 1],
   cj = (2/π) ∫_0^π e^(cos θ) cos(jθ) dθ.

Numerically evaluating these integrals (using the trapezoidal rule) gives
   c0 = 2.5321376,    c1 = 1.13031812,
   c2 = 0.27149534,   c3 = 0.04433685,
   c4 = 0.00547424,   c5 = 0.000542926.

 C1(x) = c0/2 + c1 x = 1.26607 + 1.13032x
 C3(x) = c0/2 + c1 x + c2 (2x^2 - 1) + c3 (4x^3 - 3x)
        = 0.994571 + 0.997328x + 0.542991x^2 + 0.177347x^3

[Figure: f(x) = e^x together with C1(x), C2(x), and C3(x) on [-1, 1].]

~ This is very close to the q3*(x) obtained from the minimax process:
   q3*(x) = 0.994579 + 0.995668x + 0.542973x^2 + 0.179533x^3
[Figure: f(x) = e^x with C3(x) and C4(x), and the errors of q3*(x), C3(x), and C4(x) on [-1, 1].]

   C3(x)  = 0.994571 + 0.997328x + 0.542991x^2 + 0.177347x^3
   q3*(x) = 0.994579 + 0.995668x + 0.542973x^2 + 0.179533x^3

q3*(x) and C3(x) are very close. C4(x) has much less error than C3(x).
Convergence of the Chebyshev coefficients for f(x) = e^x, -1 ≤ x ≤ 1:

   f(x) ~ Cn(x) = Σ'_{j=0}^{n} cj Tj(x),   cj = (2/π) ∫_0^π e^(cos θ) cos(jθ) dθ.

[Figure: |c(k)| versus k on a logarithmic scale.]

We observe an exponential decrease of the coefficients |ck| as k increases for this smooth function f(x) = e^x.
This type of behavior is called exponential convergence. It is typical for smooth functions.
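The coefficients and the truncated series above can be reproduced with a few lines of Python (my sketch; the trapezoidal sum and helper name are mine):

```python
import numpy as np
from numpy.polynomial import chebyshev as Ch

def chebyshev_coeffs(f, n, m=2000):
    """c_j = (2/pi) * integral_0^pi f(cos t) cos(j t) dt, by the composite trapezoidal rule."""
    t = np.linspace(0.0, np.pi, m + 1)
    h = np.pi / m
    vals = f(np.cos(t))
    def trap(y):                          # trapezoidal rule on the uniform grid t
        return h * (np.sum(y) - 0.5 * (y[0] + y[-1]))
    return np.array([(2.0 / np.pi) * trap(vals * np.cos(j * t)) for j in range(n + 1)])

c = chebyshev_coeffs(np.exp, 5)
print(c)      # ≈ [2.5321376, 1.1303181, 0.2714953, 0.0443369, 0.0054742, 0.0005429]

# C3(x) = c0/2 + c1 T1(x) + c2 T2(x) + c3 T3(x); note the c0/2 from the primed sum.
c3 = c[:4].copy()
c3[0] /= 2.0
x = np.linspace(-1.0, 1.0, 1001)
print(np.max(np.abs(np.exp(x) - Ch.chebval(x, c3))))   # ≈ 0.006, close to the minimax 0.00553
```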
Supplemental Reading
Chapter 5 APPROXIMATION OF FUNCTIONS
Example 1 Padé Approximation
Background: Padé approximation is the "best" approximation of a function by a
rational function of a given order. A Padé approximant often gives a better
approximation of the function than a truncated Taylor series, and it may
still work where the Taylor series does not converge. For these reasons Padé
approximants are used extensively in computer calculations.
General idea:
Given a function f(x) and two integers m ≥ 0 and n ≥ 0, the Padé
approximant of order (m, n) is the rational function
   R(x) = (a0 + a1 x + a2 x^2 + ... + am x^m) / (1 + b1 x + b2 x^2 + ... + bn x^n)
which agrees with f(x) to the highest possible order, which amounts to
   f(0) = R(0) = a0,  f'(0) = R'(0),  f''(0) = R''(0), …, f^(m+n)(0) = R^(m+n)(0).
Illustration:
Approximate f(x) = exp(-x) in the form
   R_2,2(x) = (a0 + a1 x + a2 x^2) / (1 + b1 x + b2 x^2).
Find the coefficients a0, a1, a2, b1, & b2.
Compare the performance of R_2,2(x) and the 4-term Taylor series.
Details of the solution (for f(x) = exp(-x)):

* Taylor series:  f(x) ≈ TS4(x) = 1 - x + x^2/2! - x^3/3! + x^4/4! + ...

* Equate R_2,2(x) to TS4(x):
     1 - x + x^2/2! - x^3/3! + x^4/4! + ... = (a0 + a1 x + a2 x^2) / (1 + b1 x + b2 x^2)
  That is,
     (1 - x + x^2/2! - x^3/3! + x^4/4! + ...)(1 + b1 x + b2 x^2) = a0 + a1 x + a2 x^2
     [1 - x + x^2/2! - x^3/3! + x^4/4! + ...] + [b1 x - b1 x^2 + (b1/2!) x^3 - (b1/3!) x^4 + ...]
        + [b2 x^2 - b2 x^3 + (b2/2!) x^4 + ...] = a0 + a1 x + a2 x^2

 Collecting the coefficients of the various powers of x:
     x^4 :  1/4! - b1/3! + b2/2! = 0
     x^3 : -1/3! + b1/2! - b2    = 0
     x^2 :  1/2! - b1 + b2       = a2
     x^1 : -1 + b1               = a1
     x^0 :  1                    = a0

The solution of the first two equations gives
     b1 = 1/2,   b2 = 1/12.
The third and fourth equations give
     a1 = -1/2,  a2 = 1/12,
and finally a0 = 1.

Thus,
     R_2,2(x) = (1 - x/2 + x^2/12) / (1 + x/2 + x^2/12).

* If we continue the process, we can obtain
     R_3,3(x) = (1 - x/2 + x^2/10 - x^3/120) / (1 + x/2 + x^2/10 + x^3/120).
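As a cross-check, SciPy can build the same approximant directly from the Taylor coefficients; a minimal sketch (mine, not part of the notes):

```python
import numpy as np
from math import factorial
from scipy.interpolate import pade

# Taylor coefficients of exp(-x) about x = 0: 1, -1, 1/2!, -1/3!, 1/4!
an = [(-1) ** k / factorial(k) for k in range(5)]
p, q = pade(an, 2)            # [2/2] Padé approximant: numerator p, denominator q (poly1d)
print(p.coeffs)               # ≈ [ 1/12, -1/2, 1 ]  (highest power first)
print(q.coeffs)               # ≈ [ 1/12,  1/2, 1 ]

# Compare at x = 3: R_2,2 stays close to exp(-x) while TS4 is far off.
x = 3.0
ts4 = sum(a * x ** k for k, a in enumerate(an))
print(np.exp(-x), p(x) / q(x), ts4)   # ≈ 0.0498, 0.0769, 1.375
```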
* Comparison:

[Figure: exp(-x) compared with R_2,2, R_3,3, TS4 (and TS6) on -3 ≤ x ≤ 0 and on 0 ≤ x ≤ 3.]

Clearly, for x > 0 even R_2,2 is better than TS4 and TS6.
For x < 0, R_2,2 is comparable with TS4 in -2 < x < 0.
Example 2: Function approximation using Chebyshev polynomials

Consider f(x) = (1 - x)^(1/2), 0 ≤ x ≤ 1. Find a polynomial approximation for f(x).

Solution:
* Because f(x) has a weak singularity at x = 1 (i.e. f(x) is continuous but
  f'(x) is not near x = 1), Chebyshev polynomials are well suited to approximate f(x).
* Let ξ = 2x - 1 and cos θ = ξ
   x = (cos θ + 1)/2    f(x) = [(1 - cos θ)/2]^(1/2),
  so that 0 ≤ x ≤ 1  ↔  0 ≤ θ ≤ π.
* cj = (2/π) ∫_0^π cos(jθ) [(1 - cos θ)/2]^(1/2) dθ.

Numerically evaluating these (using the trapezoidal rule) gives
   c0 = 1.27323146449689,     c1 = -0.424421261939733,
   c2 = -0.0848907170462543,  c3 = -0.0363863540527404,
   c4 = -0.0202182337115025,  c5 = -0.0128690888183209,
   c6 = -0.00891185771081673, c7 = -0.00653751983500546,
   c8 = -0.00500118437469333, c9 = -0.00395000831184062,
   c10 = -0.00319916911335327, …

Let  C1(x)  = c0/2 + c1 cos(θ) = c0/2 + c1 (2x - 1),
     C3(x)  = c0/2 + c1 cos(θ) + c2 cos(2θ) + c3 cos(3θ)
            = c0/2 + c1 (2x - 1) + c2 [2(2x - 1)^2 - 1] + c3 [4(2x - 1)^3 - 3(2x - 1)],
     C10(x) = c0/2 + Σ_{j=1}^{10} cj cos(jθ) = c0/2 + Σ_{j=1}^{10} cj Tj(2x - 1).
The comparison of various approximations with f(x) is shown below.

1.2
f(x)
C1(x)
1
C3(x)

0.8

0.6

0.4

0.2

0
0 0.2 0.4 0.6 0.8 1 x

1.2
f(x)
C6(x)
1
C10(x)
C15(x)
0.8

0.6

0.4

0.2

0
0 0.2 0.4 0.6 0.8 1 x

The errors in C6(x), C10(x) and C15(x) are shown below.


Clearly the errors are larger near x=1 where f(x) is singular.
The decrease in the error as k increases is noticeable; but it is not as fast.
0.03 Error_6
0.02 Error_10
Error_15
0.01
0
-0.01 0 0.2 0.4 0.6 0.8 1 x
-0.02
-0.03
-0.04
-0.05
-0.06

* For k > 1, the magnitude |ck| as a function of k is shown below.

[Figure: |c(k)| versus k on a log-log scale; the coefficients follow a slope close to k^(-2).]

It is noted that the decrease of |ck| with increasing k is not fast; it behaves as
   |ck| ~ k^(-2)   (algebraic decay).
As discussed in the class notes, a smooth function f(x) would instead give
   |ck| ~ exp(-k)   (exponential decay)
with increasing k.
Thus, the presence of the singularity at x = 1 severely influences the
convergence, since we now need many more terms ck Tk(x) in order to reach
the level |ck Tk(x)| ~ machine error.
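The two decay rates can be seen side by side with a short script (my sketch, reusing the trapezoidal-rule coefficient formula from above; the helper name is mine):

```python
import numpy as np

def cheb_coeffs(g, n, m=4000):
    """c_j = (2/pi) * integral_0^pi g(t) cos(j t) dt, by the composite trapezoidal rule."""
    t = np.linspace(0.0, np.pi, m + 1)
    h = np.pi / m
    y = g(t)
    coeffs = []
    for j in range(n + 1):
        integrand = y * np.cos(j * t)
        integral = h * (np.sum(integrand) - 0.5 * (integrand[0] + integrand[-1]))
        coeffs.append(2.0 / np.pi * integral)
    return np.array(coeffs)

# f(x) = sqrt(1 - x) on [0, 1]: with x = (cos t + 1)/2 the integrand is sqrt((1 - cos t)/2).
c_sqrt = cheb_coeffs(lambda t: np.sqrt((1.0 - np.cos(t)) / 2.0), 32)
# f(x) = e^x on [-1, 1]: with x = cos t the integrand is exp(cos t).
c_exp = cheb_coeffs(lambda t: np.exp(np.cos(t)), 32)

for k in (4, 8, 16, 32):
    # sqrt(1 - x): |c_k| ~ k^-2 (algebraic); e^x: |c_k| drops rapidly toward round-off.
    print(k, abs(c_sqrt[k]), abs(c_exp[k]))
```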
