Projection Methods

Jesús Fernández-Villaverde
University of Pennsylvania

July 10, 2011




Introduction

We come back to our functional equation:

    H(d) = 0

Projection methods solve the problem by specifying:

    d^n(x, θ) = ∑_{i=0}^{n} θ_i Ψ_i(x)

We pick a basis {Ψ_i(x)}_{i=0}^{∞} and "project" H(·) against that basis to
find the θ_i's.

How?

Points to Emphasize

1. We may want to approximate different objects d: for instance a
   decision rule, a value function, or an expectation.

2. In general we will work with the same number of parameters as
   basis functions.

3. We will work with linear combinations of basis functions. Why? The
   theory of nonlinear approximations is not yet as developed as the
   linear case.

Basic Algorithm

1. Define n known linearly independent functions ψ_i : Ω → ℝ^m, where
   n < ∞. We call ψ_1(·), ψ_2(·), ..., ψ_n(·) the basis functions.

2. Define a vector of parameters θ = [θ_1, θ_2, ..., θ_n].

3. Define a linear combination of the basis functions and the θ's:

       d^n(·|θ) = ∑_{i=1}^{n} θ_i ψ_i(·)

4. Plug d^n(·|θ) into H(·) to find the residual equation:

       R(·|θ) = H(d^n(·|θ))

5. Find the value of θ̂ that makes the residual equation as close to 0 as
   possible given some objective function ρ : J^1 × J^1 → J^2:

       θ̂ = arg min_{θ ∈ ℝ^n} ρ(R(·|θ), 0)

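As an illustration, the five steps translate almost line by line into code.
Here is a minimal Python sketch, assuming a hypothetical toy operator
H(d)(x) = d(x) - cos(x) and a monomial basis; all names and values are
illustrative, not taken from the lecture's codes.

    import numpy as np
    from scipy.optimize import least_squares

    n = 5
    x_grid = np.linspace(-1.0, 1.0, 25)   # points in Omega where we check the residual

    def d_n(x, theta):
        # Step 3: linear combination of basis functions, here psi_i(x) = x^i
        return sum(th * x ** i for i, th in enumerate(theta))

    def residual(theta):
        # Step 4: R(.|theta) = H(d_n(.|theta)) for the toy operator H(d)(x) = d(x) - cos(x)
        return d_n(x_grid, theta) - np.cos(x_grid)

    # Step 5: make the residual as close to 0 as possible; here rho is a
    # least-squares objective (one choice among several discussed below)
    theta_hat = least_squares(residual, x0=np.zeros(n + 1)).x
    print(np.max(np.abs(residual(theta_hat))))   # small residual on the grid
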
Relation with Econometrics

Looks a lot like OLS. We will explore this similarity later in more detail.

Also related to semi-nonparametric methods such as sieves.

Compare with:

1. Policy iteration.

2. Parameterized expectations.

Two Issues

We need to decide:

1. Which basis do we use?

   1. Pick a global basis ⇒ spectral methods.

   2. Pick a local basis ⇒ finite elements methods.

2. How do we "project"?

Different choices in 1 and 2 will result in slightly different projection
methods.

Spectral Methods

Main reference: Judd (1992).

Spectral techniques use basis functions that are nonzero and smooth
almost everywhere in Ω.

Advantages: simplicity.

Disadvantages: difficult to capture local behavior. Gibbs phenomenon.

Spectral Basis I

Monomials: c, x, x², x³, ...

Simple and intuitive.

Even if this basis is not composed of orthogonal functions, if J^1 is the
space of bounded measurable functions on a compact set, the
Stone-Weierstrass theorem assures completeness in the L^1 norm.

Problems:

1. (Near) multicollinearity. Compare the graph of x^10 with x^11.
   The solution of a projection involves matrix inversion. When the
   basis functions are similar, the condition number of these matrices (the
   ratio of the largest and smallest absolute eigenvalues) is too high.
   Just the first six monomials can generate condition numbers of 10^10.
   The matrix of the LS problem of fitting a polynomial of degree 6 to a
   function (the Hilbert matrix) is a popular test of numerical accuracy,
   since it maximizes rounding errors!

2. Monomials vary considerably in size, leading to scaling problems and
   accumulation of numerical errors.

We want an orthogonal basis.

Spectral Basis II

Trigonometric series:

    1/(2π)^0.5, cos(x)/(2π)^0.5, sin(x)/(2π)^0.5, ...,
    cos(kx)/(2π)^0.5, sin(kx)/(2π)^0.5, ...

Periodic functions.

However, economic problems are generally not periodic.

Periodic approximations to nonperiodic functions suffer from the
Gibbs phenomenon, requiring many terms to achieve good numerical
performance (the rate of convergence to the true solution as n → ∞
is only O(n)).

Spectral Basis III

Flexible class: orthogonal polynomials of Jacobi (or hypergeometric)
type. Why orthogonal?

The Jacobi polynomial of degree n, P_n^{α,β}(x) for α, β > -1, is defined
by the orthogonality condition:

    ∫_{-1}^{1} (1 - x)^α (1 + x)^β P_n^{α,β}(x) P_m^{α,β}(x) dx = 0 for m ≠ n

The two most important cases of Jacobi polynomials:

1. Legendre: α = β = 0.

2. Chebyshev: α = β = -1/2.

Alternative Expressions

The orthogonality condition implies, with the customary
normalizations, that:

    P_n^{α,β}(1) = \binom{n+α}{n}

and that the general term of degree n is given by:

    P_n^{α,β}(x) = 2^{-n} ∑_{k=0}^{n} \binom{n+α}{k} \binom{n+β}{n-k} (x - 1)^{n-k} (x + 1)^k

Recursively:

    2(n+1)(n+α+β+1)(2n+α+β) P_{n+1} =
        (2n+α+β+1) [(2n+α+β)(2n+α+β+2) x + α² - β²] P_n
        - 2(n+α)(n+β)(2n+α+β+2) P_{n-1}

Chebyshev Polynomials

One of the most common tools of applied mathematics.

References:

    Chebyshev and Fourier Spectral Methods, John P. Boyd (2001).
    A Practical Guide to Pseudospectral Methods, Bengt Fornberg (1998).

Advantages of Chebyshev polynomials:

1. Numerous simple closed-form expressions are available.

2. The change between the coefficients of a Chebyshev expansion of a
   function and the values of the function at the Chebyshev nodes is
   quickly performed by the cosine transform.

3. They are more robust than their alternatives for interpolation.

4. They are bounded between [-1, 1], while Legendre polynomials are not,
   offering a better performance close to the boundaries of the problem.

5. They are smooth functions.

6. Several theorems bound the errors of Chebyshev polynomial
   interpolation.

Definition of Chebyshev Polynomials I

Recursive definition:

    T_0(x) = 1
    T_1(x) = x
    T_{n+1}(x) = 2x T_n(x) - T_{n-1}(x) for a general n

The first few polynomials are then 1, x, 2x² - 1, 4x³ - 3x,
8x⁴ - 8x² + 1, etc.

The n zeros of the polynomial, T_n(x_k) = 0, are given by:

    x_k = cos((2k - 1)π/(2n)), k = 1, ..., n

Note that the zeros are clustered quadratically towards ±1.

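A minimal sketch of the recursion and the zeros formula (illustrative
code; the function name is my own):

    import numpy as np

    def chebyshev_T(n, x):
        # Three-term recursion: T_0 = 1, T_1 = x, T_{n+1} = 2x T_n - T_{n-1}
        t_prev, t = np.ones_like(x), x
        if n == 0:
            return t_prev
        for _ in range(n - 1):
            t_prev, t = t, 2 * x * t - t_prev
        return t

    n = 7
    k = np.arange(1, n + 1)
    zeros = np.cos((2 * k - 1) * np.pi / (2 * n))    # the n roots of T_n
    print(np.allclose(chebyshev_T(n, zeros), 0.0))   # True: T_n vanishes there
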
Definition of Chebyshev Polynomials II

Explicit definition:

    T_n(x) = cos(n arccos x)
           = (1/2)(z^n + z^{-n}), where (1/2)(z + 1/z) = x
           = (1/2)[(x + (x² - 1)^{0.5})^n + (x - (x² - 1)^{0.5})^n]
           = (n/2) ∑_{k=0}^{⌊n/2⌋} (-1)^k ((n - k - 1)!/(k! (n - 2k)!)) (2x)^{n-2k}
           = ((-1)^n π^{0.5} / (2^n Γ(n + 1/2))) (1 - x²)^{0.5} (d^n/dx^n) (1 - x²)^{n - 1/2}

Remarks

The domain of the Chebyshev polynomials is [-1, 1]. Since our state
space is, in general, different, we use a linear translation from [a, b]
into [-1, 1]:

    2 (x - a)/(b - a) - 1

Chebyshev polynomials are orthogonal with respect to the weight
function:

    1/(1 - x²)^{0.5}

Chebyshev Interpolation Theorem: if an approximating function is exact
at the roots of the n_1-th order Chebyshev polynomial then, as n_1 → ∞,
the approximation error becomes arbitrarily small.

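A short sketch of the translation and of the interpolation theorem at
work: we interpolate an illustrative function f on an assumed interval
[a, b] at the mapped Chebyshev roots and watch the error fall with n.

    import numpy as np
    from numpy.polynomial import chebyshev as cheb

    a, b = 0.5, 3.0
    to_unit = lambda x: 2 * (x - a) / (b - a) - 1      # [a, b] -> [-1, 1]
    from_unit = lambda u: a + (u + 1) * (b - a) / 2    # [-1, 1] -> [a, b]

    f = lambda x: np.log(x)                            # illustrative function
    for n in (5, 10, 20):
        k = np.arange(1, n + 1)
        roots = np.cos((2 * k - 1) * np.pi / (2 * n))  # roots of T_n
        coef = cheb.chebfit(roots, f(from_unit(roots)), n - 1)  # interpolant
        x_test = np.linspace(a, b, 201)
        err = np.max(np.abs(cheb.chebval(to_unit(x_test), coef) - f(x_test)))
        print(n, err)                                  # error shrinks rapidly
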
Multidimensional Problems

Chebyshev polynomials are defined on [-1, 1].

However, most problems in economics are multidimensional.

How do we generalize the basis?

Curse of dimensionality.

Tensors

Assume we want to approximate F : [-1, 1]^d → ℝ.

Let T_j denote the Chebyshev polynomial of degree j = 0, 1, ..., κ.

We can approximate F with a tensor product of Chebyshev polynomials
of degree κ:

    F̂(x) = ∑_{n_1=0}^{κ} ... ∑_{n_d=0}^{κ} ξ_{n_1,...,n_d} T_{n_1}(x_1) ··· T_{n_d}(x_d)

Beyond simplicity, an advantage of the tensor basis is that if the
one-dimensional basis is orthogonal in a norm, the tensor basis is
orthogonal in the product norm.

Disadvantage: the number of elements increases exponentially. We end
up having terms x_1^κ x_2^κ ··· x_d^κ, for a total of (κ + 1)^d terms.

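A sketch of the tensor basis in Python; κ, d, the evaluation point, and
the random stand-in coefficients ξ are all illustrative.

    import numpy as np
    from itertools import product
    from numpy.polynomial import chebyshev as cheb

    kappa, d = 3, 2
    # All multi-indices (n_1, ..., n_d) with 0 <= n_j <= kappa: (kappa+1)^d of them
    indices = list(product(range(kappa + 1), repeat=d))

    def tensor_terms(x):
        # T[j][n] = T_n(x_j) for each coordinate x_j of a point x in [-1, 1]^d
        T = [cheb.chebvander(np.atleast_1d(xj), kappa)[0] for xj in x]
        return np.array([np.prod([T[j][nj] for j, nj in enumerate(idx)])
                         for idx in indices])

    xi = np.random.rand(len(indices))            # stand-in coefficients
    print(len(indices))                          # 16 terms for kappa = 3, d = 2
    print(xi @ tensor_terms([0.3, -0.7]))        # F_hat evaluated at one point
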
Complete Polynomials

Solution: eliminate some elements of the tensor in such a way that
there is not much numerical degradation.

Judd and Gaspar (1997): use complete polynomials instead:

    P_κ^d = { x_1^{i_1} ··· x_d^{i_d} with ∑_{l=1}^{d} i_l ≤ κ, 0 ≤ i_1, ..., i_d }

Advantage: a much smaller number of terms; no terms of order dκ to
evaluate.

Disadvantage: still too many elements.

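Generating the complete index set is a one-liner; the count comparison
below (with assumed values of κ and d) shows the savings over the
tensor basis.

    from itertools import product
    from math import comb

    def complete_indices(kappa, d):
        # Multi-indices (i_1, ..., i_d) with i_1 + ... + i_d <= kappa
        return [i for i in product(range(kappa + 1), repeat=d) if sum(i) <= kappa]

    kappa = 3
    for d in (2, 3, 5, 8):
        tensor = (kappa + 1) ** d                  # tensor-basis terms
        complete = len(complete_indices(kappa, d))
        print(d, tensor, complete, comb(kappa + d, d))   # last two columns agree
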
Smolyak's Algorithm I

Define m_1 = 1 and m_i = 2^{i-1} + 1, i = 2, 3, ....

Define G^i = {x_1^i, ..., x_{m_i}^i} ⊂ [-1, 1] as the set of the extrema of
the Chebyshev polynomials:

    x_j^i = -cos(π(j - 1)/(m_i - 1)), j = 1, ..., m_i

with G^1 = {0}. It is crucial that G^i ⊂ G^{i+1}, ∀i = 1, 2, ...

Example:

    i = 1: m_i = 1, G^i = {0}
    i = 2: m_i = 3, G^i = {-1, 0, 1}
    i = 3: m_i = 5, G^i = {-1, -cos(π/4), 0, -cos(3π/4), 1}

Smolyak's Algorithm II

For q > d, define a sparse grid

    H(q, d) = ⋃_{q-d+1 ≤ |i| ≤ q} (G^{i_1} × ... × G^{i_d}),

where |i| = i_1 + ... + i_d. The number q defines the size of the grid
and thus the precision of the approximation.

For example, let q = d + 2 = 5:

    H(5, 3) = ⋃_{3 ≤ |i| ≤ 5} (G^{i_1} × ... × G^{i_d})

with the index combinations:

    G^3 × G^1 × G^1,  G^1 × G^3 × G^1,  G^1 × G^1 × G^3
    G^2 × G^2 × G^1,  G^2 × G^1 × G^2,  G^1 × G^2 × G^2
    G^2 × G^1 × G^1,  G^1 × G^2 × G^1,  G^1 × G^1 × G^2
    G^1 × G^1 × G^1

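A sketch of the grid construction in Python (the helper names are my
own). The nested sets mean many tensor points coincide, so we collect
them in a set; the sizes reproduce the table two slides below.

    import numpy as np
    from itertools import product

    def G(i):
        # Nested extrema sets: G^1 = {0}; otherwise m_i = 2^(i-1) + 1 points
        if i == 1:
            return np.array([0.0])
        m = 2 ** (i - 1) + 1
        return -np.cos(np.pi * np.arange(m) / (m - 1))

    def smolyak_grid(q, d):
        # Union of G^{i_1} x ... x G^{i_d} over q-d+1 <= |i| <= q
        pts = set()
        for idx in product(range(1, q - d + 2), repeat=d):
            if q - d + 1 <= sum(idx) <= q:
                pts.update(tuple(round(c, 12) for c in p)   # rounding dedupes
                           for p in product(*(G(i) for i in idx)))
        return pts

    for d in (2, 3, 4, 5):
        print(d, len(smolyak_grid(d + 2, d)))   # 13, 25, 41, 61
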
Smolyak's Algorithm III

Number of points for q = d + 2:

    1 + 4d + 4 d(d - 1)/2

Largest number of points along one dimension:

    i = q - d + 1
    m_i = 2^{q-d} + 1

Rectangular grid:

    [2^{q-d} + 1]^d

Key: with a rectangular grid, the number of grid points increases
exponentially in the number of dimensions. With the Smolyak algorithm,
the number of points increases polynomially in the number of dimensions.

Smolyak's Algorithm IV

Size of the grid for q = d + 2:

    d     2^{q-d} + 1    #H(q, d)    (2^{q-d} + 1)^d
    2     5              13          25
    3     5              25          125
    4     5              41          625
    5     5              61          3,125
    12    5              313         244,140,625

Smolyak's Algorithm V

For one dimension, denote the interpolating Chebyshev polynomials by

    U^i(x^i) = ∑_{j=1}^{m_i} ξ_j^i T_j(x^i)

and the d-dimensional tensor product by (U^{i_1} ⊗ ... ⊗ U^{i_d})(x).

For q > d, the approximating function (Smolyak's algorithm) is given by:

    A(q, d)(x) = ∑_{q-d+1 ≤ |i| ≤ q} (-1)^{q-|i|} \binom{d-1}{q-|i|} (U^{i_1} ⊗ ... ⊗ U^{i_d})(x)

The method is (almost) optimal within the set of polynomial
approximations (Barthelmann, Novak, and Ritter, 1999).

The method is universal, that is, almost optimal for many different
function spaces.

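The combination coefficients attached to each layer |i| are easy to
tabulate (a small sketch; the function name is my own):

    from math import comb

    def smolyak_coefficients(q, d):
        # Weight (-1)^(q - s) * C(d - 1, q - s) for each layer s = |i|
        return {s: (-1) ** (q - s) * comb(d - 1, q - s)
                for s in range(q - d + 1, q + 1)}

    print(smolyak_coefficients(5, 3))   # {3: 1, 4: -2, 5: 1}
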
Boyd's Moral Principle

1. When in doubt, use Chebyshev polynomials unless the solution is
   spatially periodic, in which case an ordinary Fourier series is better.

2. Unless you are sure another set of basis functions is better, use
   Chebyshev polynomials.

3. Unless you are really, really sure another set of basis functions is
   better, use Chebyshev polynomials.

Finite Elements

Standard reference: McGrattan (1999).

Bound the domain Ω of the state variables.

Partition Ω into small, nonintersecting sections.

These small sections are called elements.

The boundaries of the elements are called nodes.

Partition into Elements

Elements may be of unequal size.

We can have small elements in the areas of Ω where the economy will
spend most of the time, while just a few big elements cover the wide
areas of the state space that are infrequently visited.

Also, through elements, we can easily handle issues like kinks or
constraints.

There is a whole area of research concentrated on the optimal
generation of an element grid. See Thompson, Warsi, and Mastin (1985).

Structure

Choose a basis for the policy functions in each element.

Since the elements are small, a linear basis is often good enough
(sketched in code below):

    ψ_i(x) = (x - x_{i-1})/(x_i - x_{i-1})    if x ∈ [x_{i-1}, x_i]
           = (x_{i+1} - x)/(x_{i+1} - x_i)    if x ∈ [x_i, x_{i+1}]
           = 0                                elsewhere

Plug the policy function into the equilibrium conditions and find the
unknown coefficients.

Paste the pieces together to ensure continuity.

Why is this a smart strategy?

Advantage: we will only need to invert a sparse matrix.

When should we choose this strategy? Speed of computation versus
accuracy.

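A compact way to code the linear ("tent") basis is to interpolate a unit
vector on the nodes; a sketch with assumed nodes and coefficients (valid
inside [x_0, x_n]):

    import numpy as np

    def tent(x, nodes, i):
        # psi_i is 1 at nodes[i], 0 at every other node, linear in between
        e_i = np.zeros(len(nodes))
        e_i[i] = 1.0
        return np.interp(x, nodes, e_i)

    nodes = np.array([0.0, 0.2, 0.5, 1.0])    # elements may be of unequal size
    theta = np.array([0.3, 0.4, 0.6, 0.9])    # one coefficient per node
    x = np.linspace(0.0, 1.0, 6)
    d_fe = sum(th * tent(x, nodes, i) for i, th in enumerate(theta))
    print(d_fe)   # continuous, piecewise linear, equal to theta_i at node i
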
Three Different Refinements

1. h-refinement: subdivide each element into smaller elements to
   improve resolution uniformly over the domain.

2. r-refinement: subdivide each element only in those regions where
   there are high nonlinearities.

3. p-refinement: increase the order of the approximation in each
   element. If the order of the expansion is high enough, we will
   generate in that way a hybrid of finite elements and spectral methods
   known as spectral elements.

Choosing the Objective Function

The most common answer to the second question is given by a weighted
residual.

That is why projection methods are often also called weighted residual
methods.

This set of techniques proposes to get the residual close to 0 in a
weighted integral sense.

Given some weight functions φ_i : Ω → ℝ^m:

    ρ(R(·|θ), 0) = 0 if ∫_Ω φ_i(x) R(·|θ) dx = 0, i = 1, ..., n
                 = 1 otherwise

Then the problem is to choose the θ that solves the system of equations:

    ∫_Ω φ_i(x) R(·|θ) dx = 0, i = 1, ..., n

Remarks

With the approximation of d by some functions ψ_i and the definition of
some weight functions φ_i(·), we have transformed a rather intractable
functional equation problem into a standard system of nonlinear
equations!

The solution of this system can be found using standard methods, such
as a Newton method for relatively small problems or a conjugate
gradient method for bigger ones.

Issue: we have different choices for the weight function.

Weight Function I: Least Squares

φ_i(x) = ∂R(x|θ)/∂θ_i.

This choice is motivated by the solution of the variational problem:

    min_θ ∫_Ω R²(·|θ) dx

with first order conditions:

    ∫_Ω (∂R(x|θ)/∂θ_i) R(·|θ) dx = 0, i = 1, ..., n

The variational problem is mathematically equivalent to a standard
regression problem in econometrics.

OLS or NLLS are regressions against a manifold spanned by the
observations.

Weight Function I: Least Squares

Least squares always generates symmetric matrices, even if the operator
H is not self-adjoint.

Symmetric matrices are convenient theoretically (they simplify the
proofs) and computationally (there are algorithms that exploit their
structure to increase speed and decrease memory requirements).

However, least squares may lead to ill-conditioning and systems of
equations that are complicated to solve numerically.

Weight Function II: Subdomain

We divide the domain Ω into n subdomains Ω_i and define the n step
functions:

    φ_i(x) = 1 if x ∈ Ω_i
           = 0 otherwise

This choice is then equivalent to solving the system:

    ∫_{Ω_i} R(·|θ) dx = 0, i = 1, ..., n

Weight Function III: Moments

Take 1, x, x², ..., x^{n-1} and compute the first n moments of the
residual function:

    ∫_Ω x^i R(·|θ) dx = 0, i = 0, ..., n - 1

This approach, widely used in engineering, works well for a low n (2 or
3).

However, for higher orders, its numerical performance is very poor:
high orders of x are highly collinear and give rise to serious rounding
error problems.

Hence, moments are to be avoided as weight functions.

Weight Function IV: Collocation, Pseudospectral, or Method of Selected
Points

φ_i(x) = δ(x - x_i), where δ is the Dirac delta function and the x_i are
the collocation points.

This method implies that the residual function is zero at the n
collocation points.

Simple to compute, since the integral only needs to be evaluated at one
point. Especially attractive when dealing with strong nonlinearities.

A systematic way to pick collocation points is to use a density function:

    μ_γ(x) = Γ(3/2 - γ) / ((1 - x²)^γ π^{0.5} Γ(1 - γ)),  γ < 1

and find the collocation points as the x_j, j = 0, ..., n - 1, that solve:

    ∫_{-1}^{x_j} μ_γ(x) dx = j/n

For γ = 0, the density function implies equispaced points.

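For γ = 1/2 the quantiles have a closed form, since the CDF of μ_{1/2}
is (arcsin(x) + π/2)/π; the sketch below (my derivation, not from the
slides) contrasts them with the equispaced γ = 0 case.

    import numpy as np

    n = 8
    j = np.arange(1, n)                           # interior quantile indices
    # gamma = 1/2: the j/n quantiles of the arcsine CDF
    x_half = np.sin(np.pi * j / n - np.pi / 2)    # = -cos(pi j / n)
    # gamma = 0: uniform density on [-1, 1], hence equispaced quantiles
    x_zero = -1.0 + 2.0 * j / n
    print(np.round(x_half, 3))                    # clustered toward the ends
    print(np.round(x_zero, 3))
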
Weight Function V: Orthogonal Collocation

A variation of the collocation method:

1. The basis functions are a set of orthogonal polynomials.

2. The collocation points are given by the roots of the n-th polynomial.

When we use Chebyshev polynomials, their roots are the collocation
points implied by μ_{1/2}(x), and their clustering can be shown to be
optimal as n → ∞.

Orthogonal collocation methods show surprisingly good performance.

Weight Function VI: Galerkin or Rayleigh-Ritz

φ_i(x) = ψ_i(x), with a linear approximating function ∑_{i=1}^{n} θ_i ψ_i(x).

Then:

    ∫_Ω ψ_i(x) H(∑_{i=1}^{n} θ_i ψ_i(x)) dx = 0, i = 1, ..., n

that is, the residual has to be orthogonal to each of the basis functions.

Galerkin is highly accurate and robust, but difficult to code.

If the basis functions are complete over J^1 (they are indeed a basis of
the space), then the Galerkin solution will converge pointwise to the
true solution as n goes to infinity:

    lim_{n→∞} ∑_{i=1}^{n} θ_i ψ_i(·) = d(·)

Experience suggests that a Galerkin approximation of order n is as
accurate as a pseudospectral expansion of order n + 1 or n + 2.

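A sketch of the Galerkin conditions for the toy operator used earlier,
H(d)(x) = d(x) - cos(x), with a monomial basis and Gauss-Legendre
quadrature for the integrals (all choices illustrative):

    import numpy as np
    from scipy.optimize import fsolve

    g_nodes, g_weights = np.polynomial.legendre.leggauss(20)  # rule on [-1, 1]
    n = 5

    def galerkin_system(theta):
        d_n = sum(th * g_nodes ** i for i, th in enumerate(theta))
        R = d_n - np.cos(g_nodes)              # residual at quadrature nodes
        # One equation per basis function: integral of psi_i * R must vanish
        return [np.sum(g_weights * g_nodes ** i * R) for i in range(n + 1)]

    theta_hat = fsolve(galerkin_system, np.zeros(n + 1))
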
A Simple Example

Imagine that the law of motion for the price x of a good is given by:

    d'(x) + d(x) = 0

Let us apply a simple projection to solve this differential equation.

Code: test.m, test2.m, test3.m

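The Matlab codes are not reproduced here; below is a Python sketch in
the same spirit, solving d'(x) + d(x) = 0 on [0, 1] by Chebyshev
collocation under the assumed normalization d(0) = 1 (exact solution
e^{-x}):

    import numpy as np
    from numpy.polynomial import chebyshev as cheb
    from scipy.optimize import fsolve

    n = 6                                            # polynomial degree

    def residual(theta):
        k = np.arange(1, n + 1)
        z = np.cos((2 * k - 1) * np.pi / (2 * n))    # roots of T_n in [-1, 1]
        # z corresponds to x = (z + 1)/2 in [0, 1], i.e. z = 2x - 1
        d = cheb.chebval(z, theta)
        d_prime = 2 * cheb.chebval(z, cheb.chebder(theta))  # chain rule: dz/dx = 2
        R = d_prime + d                              # residual of d' + d = 0
        return np.append(R, cheb.chebval(-1.0, theta) - 1.0)   # plus d(0) = 1

    theta_hat = fsolve(residual, np.append(1.0, np.zeros(n)))
    x_test = np.linspace(0.0, 1.0, 5)
    print(cheb.chebval(2 * x_test - 1, theta_hat) - np.exp(-x_test))  # tiny errors
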
Analysis of Error

As with perturbation, it is important to study the Euler equation errors.

We can improve the errors by:

1. Adding additional functions to the basis.

2. Refining the elements.

Multigrid schemes.

A More Serious Example

A representative agent with utility function:

    U = E_0 ∑_{t=0}^{∞} β^t (c_t^θ (1 - l_t)^{1-θ})^{1-τ} / (1 - τ)

One good produced according to y_t = e^{z_t} A k_t^α l_t^{1-α}, with α ∈ (0, 1).

Productivity evolves as z_t = ρ z_{t-1} + ε_t, |ρ| < 1 and ε_t ~ N(0, σ).

Law of motion for capital: k_{t+1} = i_t + (1 - δ) k_t.

Resource constraint: c_t + i_t = y_t.

Solve for c(·, ·) and l(·, ·) given initial conditions.

These are characterized by:

    U_c(t) = β E_t { U_c(t+1) [1 + α A e^{z_{t+1}} k_{t+1}^{α-1} l(k_{t+1}, z_{t+1})^{1-α} - δ] }

    ((1 - θ)/θ) c(k_t, z_t)/(1 - l(k_t, z_t)) = (1 - α) e^{z_t} A k_t^α l(k_t, z_t)^{-α}

A system of functional equations with no known analytical solution.

Fortran code using Chebyshev and finite elements.

Chebyshev I

We approximate the decision rule for labor as l_t = ∑_{i=1}^{n} θ_i ψ_i(k_t, z_t),
where {ψ_i(k, z)}_{i=1}^{n} are basis functions and θ = [{θ_i}_{i=1}^{n}] are
unknown coefficients.

We use that policy function to solve for consumption using the static
first order condition.

We build a residual function R(k, z, θ) using the Euler equation and the
static first order condition.

Then we choose θ by solving:

    ∫_{[k_min, k_max] × [z_min, z_max]} φ_i(k, z) R(k, z, θ) dk dz = 0 for i = 1, ..., n

where {φ_i(k, z)}_{i=1}^{n} are some weight functions.

Chebyshev II

We use a collocation method that sets φ_i(k, z) = δ(k - k_j, z - z_v),
where δ(·) is the Dirac delta function, j = 1, ..., n_1, v = 1, ..., n_2,
and n = n_1 n_2, with collocation points {k_j}_{j=1}^{n_1} and {z_v}_{v=1}^{n_2}.

For the technology shocks and transition probabilities we use Tauchen's
(1986) finite approximation to an AR(1) process and obtain n_2 points.

We solve the system of n equations R(k_i, z_i, θ) = 0 in the n unknowns
θ using a quasi-Newton method.

We use an iteration based on increasing the number of basis functions
and a nonlinear transform of the objective function (apply (u')^{-1}).

Finite Elements

Rewrite the Euler equation as:

    U_c(k_t, z_t) = (β/(2πσ²)^{0.5}) ∫_{-∞}^{∞} [U_c(k_{t+1}, z_{t+1}) r(k_{t+1}, z_{t+1})] exp(-ε²_{t+1}/(2σ²)) dε_{t+1}

where

    U_c(t) = U_c(k_t, z_t)
    k_{t+1} = e^{z_t} k_t^α l_t^{1-α} + (1 - δ) k_t - c(k_t, z_t)
    r(k_{t+1}, z_{t+1}) = 1 + α e^{z_{t+1}} k_{t+1}^{α-1} l(k_{t+1}, z_{t+1})^{1-α} - δ

and

    z_{t+1} = ρ z_t + ε_{t+1}

Goal

The problem is to find two policy functions c(k, z) : ℝ_+ × [0, ∞] → ℝ_+
and l(k, z) : ℝ_+ × [0, ∞] → [0, 1] that satisfy the model's equilibrium
conditions.

Since the static first order condition gives a relation between the two
policy functions, we only need to solve for one of them.

For the rest of the exposition we will assume that we actually solve for
l(k, z) and then find c(l(k, z)).

Bounding the State Space I

We bound the domain of the state variables to partition it into
nonintersecting elements.

To bound the productivity level of the economy, define λ_t = tanh(z_t).

Since λ_t ∈ [-1, 1], we can write the stochastic process as:

    λ_t = tanh(ρ tanh^{-1}(λ_{t-1}) + 2^{0.5} σ v_t)

where v_t = ε_t / (2^{0.5} σ).

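A short simulation confirms that the transformed process stays in
(-1, 1); ρ and σ are illustrative values.

    import numpy as np

    rho, sigma, T = 0.95, 0.007, 100_000
    rng = np.random.default_rng(0)
    lam = np.zeros(T)
    for t in range(1, T):
        # lambda_t = tanh(rho * arctanh(lambda_{t-1}) + eps_t), eps_t ~ N(0, sigma^2)
        lam[t] = np.tanh(rho * np.arctanh(lam[t - 1]) + sigma * rng.standard_normal())
    print(lam.min(), lam.max())   # strictly inside (-1, 1)
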
Bounding the State Space II

Now, since exp(tanh^{-1}(λ_{t+1})) = (1 + λ_{t+1})^{0.5}/(1 - λ_{t+1})^{0.5} = λ̂_{t+1}, we have:

    U_c(t) = (β/π^{0.5}) ∫_{-∞}^{∞} [U_c(k_{t+1}, z_{t+1}) r(k_{t+1}, z_{t+1})] exp(-v²_{t+1}) dv_{t+1}

where

    k_{t+1} = λ̂_t k_t^α l(k_t, z_t)^{1-α} + (1 - δ) k_t - c(l(k_t, z_t))
    r(k_{t+1}, z_{t+1}) = 1 + α λ̂_{t+1} k_{t+1}^{α-1} l(k_{t+1}, z_{t+1})^{1-α} - δ

and λ_{t+1} = tanh(ρ tanh^{-1}(λ_t) + 2^{0.5} σ v_{t+1}).

To bound the capital, we fix an ex-ante upper bound k_max, picked
sufficiently high that it will bind only with an extremely low
probability.

Partition into Elements

Define Ω = [0, k_max] × [-1, 1] as the domain of l_fe(k, z; θ).

Divide Ω into nonoverlapping rectangles [k_i, k_{i+1}] × [z_j, z_{j+1}],
where k_i is the i-th grid point for capital and z_j is the j-th grid
point for the technology shock.

Clearly Ω = ⋃_{i,j} [k_i, k_{i+1}] × [z_j, z_{j+1}].

Our Functional Basis

Set l_fe(k, z; θ) = ∑_{i,j} θ_{ij} Ψ_{ij}(k, z) = ∑_{i,j} θ_{ij} Ψ̂_i(k) Ψ̃_j(z), where

    Ψ̂_i(k) = (k - k_{i-1})/(k_i - k_{i-1})    if k ∈ [k_{i-1}, k_i]
            = (k_{i+1} - k)/(k_{i+1} - k_i)    if k ∈ [k_i, k_{i+1}]
            = 0                                elsewhere

    Ψ̃_j(z) = (z - z_{j-1})/(z_j - z_{j-1})    if z ∈ [z_{j-1}, z_j]
            = (z_{j+1} - z)/(z_{j+1} - z_j)    if z ∈ [z_j, z_{j+1}]
            = 0                                elsewhere

Note that:

1. Ψ_{ij}(k, z) = 0 if (k, z) ∉ [k_{i-1}, k_i] × [z_{j-1}, z_j] ∪ [k_i, k_{i+1}] × [z_j, z_{j+1}]
   ∀i, j, i.e. the function is 0 everywhere except inside two elements.

2. l_fe(k_i, z_j; θ) = θ_{ij} ∀i, j, i.e. the values of θ specify the values of
   l_fe at the corners of each subinterval [k_i, k_{i+1}] × [z_j, z_{j+1}].

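A sketch of the product basis, again via np.interp, checking that θ_{ij}
is the value of l_fe at node (k_i, z_j); the grids and coefficients are
illustrative.

    import numpy as np

    k_grid = np.array([0.0, 0.5, 1.0, 2.0, 4.0])
    z_grid = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])

    def tent(x, nodes, i):
        # 1-D linear basis: 1 at nodes[i], 0 at the other nodes
        e = np.zeros(len(nodes))
        e[i] = 1.0
        return np.interp(x, nodes, e)

    def l_fe(k, z, theta):
        return sum(theta[i, j] * tent(k, k_grid, i) * tent(z, z_grid, j)
                   for i in range(len(k_grid)) for j in range(len(z_grid)))

    theta = np.random.rand(len(k_grid), len(z_grid))
    print(np.isclose(l_fe(k_grid[2], z_grid[3], theta), theta[2, 3]))   # True
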
Residual Function I

Define U_c(k_{t+1}, z_{t+1})_fe as the marginal utility of consumption
evaluated at the finite element approximation values of consumption
and leisure.

From the Euler equation we have a residual equation:

    R(k_t, z_t; θ) = (β/π^{0.5}) ∫_{-∞}^{∞} [U_c(k_{t+1}, z_{t+1})_fe / U_c(k_t, z_t)_fe] r(k_{t+1}, z_{t+1}) exp(-v²_{t+1}) dv_{t+1} - 1

A Galerkin scheme implies that we weight the residual function by the
basis functions and solve the system of equations

    ∫_{[0, k_max] × [-1, 1]} Ψ_{ij}(k, z) R(k, z; θ) dz dk = 0  ∀i, j

in the θ unknowns.

Residual Function II

Since Ψ_{ij}(k, z) = 0 if (k, z) ∉ [k_{i-1}, k_i] × [z_{j-1}, z_j] ∪ [k_i, k_{i+1}] × [z_j, z_{j+1}]
∀i, j, we have:

    ∫_{[k_{i-1}, k_i] × [z_{j-1}, z_j] ∪ [k_i, k_{i+1}] × [z_j, z_{j+1}]} Ψ_{ij}(k, z) R(k, z; θ) dz dk = 0  ∀i, j

We use Gauss-Hermite quadrature for the integral in the residual
equation and Gauss-Legendre quadrature for the integrals in the Euler
equation.

We use 71 unequal elements in the capital dimension and 31 on the λ
axis. To solve the associated system of 2,201 nonlinear equations we use
a quasi-Newton algorithm.

More Related Content

Chapter 3 projection

  • 1. Projection Methods Jesús Fernández-Villaverde University of Pennsylvania July 10, 2011 Jesús Fernández-Villaverde (PENN) Projection Methods July 10, 2011 1 / 52
  • 2. Introduction Introduction We come back to our functional equation: H (d ) = 0 Projection methods solve the problem by specifying: n d n (x, θ ) = ∑ θ i Ψi (x ) i =0 We pick a basis fΨi (x )gi∞ 0 and “project” H ( ) against that basis to = …nd the θ i ’s. How? Jesús Fernández-Villaverde (PENN) Projection Methods July 10, 2011 2 / 52
  • 3. Introduction Points to Emphasize 1 We may want to approximate di¤erent objects d: for instance a decision rule, a value function, or an expectation. 2 In general we will have with the same number of parameters than basis functions. 3 We will work with linear combinations of basis functions. Why? The theory of nonlinear approximations is not yet as developed as the linear case. Jesús Fernández-Villaverde (PENN) Projection Methods July 10, 2011 3 / 52
  • 4. Introduction Basic Algorithm 1 De…ne n known linearly independent functions ψi : Ω ! <m where n < ∞. We call the ψ1 ( ) , ψ2 ( ) , ..., ψn ( ) the basis functions. 2 De…ne a vector of parameters θ = [θ 1 , θ 2 , ..., θ n ]. 3 De…ne a combination of the basis functions and the θ’ s: n d n ( j θ) = ∑ θ i ψn ( ) i =1 4 Plug dn ( j θ ) into H ( ) to …nd the residual equation: R ( j θ ) = H (d n ( j θ )) 5 Find the value of b that make the residual equation as close to 0 as θ possible given some objective function ρ : J 1 J 1 ! J 2 : b = arg min ρ (R ( j θ ) , 0) θ nθ 2< Jesús Fernández-Villaverde (PENN) Projection Methods July 10, 2011 4 / 52
  • 5. Introduction Relation with Econometrics Looks a lot like OLS. Explore this similarity later in more detail. Also with semi-nonparametric methods as Sieves. Compare with: 1 Policy iteration. 2 Parameterized Expectations. Jesús Fernández-Villaverde (PENN) Projection Methods July 10, 2011 5 / 52
  • 6. Introduction Two Issues We need to decide: 1 Which basis we use? 1 Pick a global basis)spectral methods. 2 Pick a local basis)…nite elements methods. 2 How do we “project”? Di¤erent choices in 1 and 2 will result in slightly di¤erent projection methods. Jesús Fernández-Villaverde (PENN) Projection Methods July 10, 2011 6 / 52
  • 7. Introduction Spectral Methods Main reference: Judd (1992). Spectral techniques use basis functions that are nonzero and smooth almost everywhere in Ω. Advantages: simplicity. Disadvantages: di¢ cult to capture local behavior. Gibbs phenomenon. Jesús Fernández-Villaverde (PENN) Projection Methods July 10, 2011 7 / 52
  • 8. Introduction Spectral Basis I Monomials: c, x, x 2 , x 3 , ... Simple and intuitive. Even if this basis is not composed by orthogonal functions, if J1 is the space of bounded measurable functions on a compact set, the Stone-Weierstrass theorem assures completeness in the L1 norm. Problems: 1 (Nearly) multicollinearity. Compare the graph of x 10 with x 11 . The solution of a projection involves matrices inversion. When the basis functions are similar, the condition number of these matrices (the ratio of the largest and smallest absolute eigenvalues) are too high. Just the six …rst monomials can generate conditions numbers of 1010 . The matrix of the LS problem of …tting a polynomial of degree 6 to a function (the Hilbert Matrix), is a popular test of numerical accuracy since it maximizes rounding errors! 2 Monomials vary considerably in size, leading to scaling problems and accumulation of numerical errors. We want an orthogonal basis. Jesús Fernández-Villaverde (PENN) Projection Methods July 10, 2011 8 / 52
  • 9. Introduction Spectral Basis II Trigonometric series 1/ (2π )0.5 , cos x / (2π )0.5 , sin x / (2π )0.5 , ..., cos kx / (2π )0.5 , sin kx / (2π )0.5 , ... Periodic functions. However economic problems are generally not periodic. Periodic approximations to nonperiodic functions su¤er from the Gibbs phenomenon, requiring many terms to achieve good numerical performance (the rate of convergence to the true solution as n ! ∞ is only O (n)). Jesús Fernández-Villaverde (PENN) Projection Methods July 10, 2011 9 / 52
  • 10. Introduction Spectral Basis III Flexible class: orthogonal polynomials of Jacobi (or hypergeometric) type. Why orthogonal? α,β The Jacobi polynomial of degree n, Pn (x ) for α, β > 1, is de…ned by the orthogonality condition: Z 1 α,β α,β (1 x )α (1 + x ) β Pn (x ) Pm (x ) dx = 0 for m 6= n 1 The two most important cases of Jacobi polynomials: Legendre: α = β = 1 2. 1 2 Chebyshev: α = β = 0. Jesús Fernández-Villaverde (PENN) Projection Methods July 10, 2011 10 / 52
  • 11. Introduction Alternative Expressions The orthogonality condition implies, with the customary normalizations: α,β n+α Pn (1) = n that the general n term is given by: n n+α n+β 2 n ∑ k n k (x 1)n k (x + 1)k k =0 Recursively: 2 (n + 1) (n + α + β + 1) (2n + α + β) Pn +1 = (2n + α + β + 1) α2 β2 Pn + (2n + α + β) (2n + α + β + 1) (2n + α + β + 2) x 2 (n + α) (n + β) (2n + α + β + 2) Pn 1 Jesús Fernández-Villaverde (PENN) Projection Methods July 10, 2011 11 / 52
  • 12. Introduction Chebyshev Polynomials One of the most common tools of Applied Mathematics. References: Chebyshev and Fourier Spectral Methods, John P. Boyd (2001). A Practical Guide to Pseudospectral Methods, Bengt Fornberg (1998). Advantages of Chebyshev Polynomials: 1 Numerous simple close-form expressions are available. 2 The change between the coe¢ cients of a Chebyshev expansion of a function and the values of the function at the Chebyshev nodes are quickly performed by the cosine transform. 3 They are more robust than their alternatives for interpolation. 4 They are bounded between [ 1, 1] while Legendre polynomials are not, o¤ering a better performance close to the boundaries of the problems. 5 They are smooth functions. 6 Several theorems bound the errors for Chebyshev polynomials interpolations. Jesús Fernández-Villaverde (PENN) Projection Methods July 10, 2011 12 / 52
  • 13. Introduction De…nition of Chebyshev Polynomials I Recursive de…nition: T0 ( x ) = 1 T1 ( x ) = x Tn +1 (x ) = 2xTn (x ) Tn 1 (x ) for a general n The …rst few polynomials are then 1, x, 2x 2 1, 4x 3 3x, 8x 4 8x 2 + 1, etc... The n zeros of the polynomial Tn (xk ) = 0 are given by: 2k 1 xk = cos π, k = 1, ..., n 2n Note that zeros are clustered quadratically towards 1. Jesús Fernández-Villaverde (PENN) Projection Methods July 10, 2011 13 / 52
  • 14. Introduction Jesús Fernández-Villaverde (PENN) Projection Methods July 10, 2011 14 / 52
  • 15. Introduction De…nition of Chebyshev Polynomials II Explicit de…nition: Tn (x ) = cos (n arccos x ) 1 1 1 1 = z n + n where z+ =x 2 z 2 z 1 0.5 n 0.5 n = x + x2 1 + x x2 1 2 [n/2 ] 1 (n k 1) ! = 2 ∑ ( 1)k k! (n 2k )! (2x )n 2k k =0 ( 1)n π 0.5 0.5 dn n 1 = n 1 x2 1 x2 2 2 Γ n+ 1 2 dx n Jesús Fernández-Villaverde (PENN) Projection Methods July 10, 2011 15 / 52
  • 16. Introduction Remarks The domain of the Chebyshev polynomials is [ 1, 1]. Since our state space is, in general, di¤erent, we use a linear translation from [a, b ] into [ 1, 1] : x a 2 1 b a Chebyshev polynomials are orthogonal with respect to the weight function: 1 (1 x 2 )0.5 Chebyshev Interpolation Theorem th if an approximating function is exact at the roots of the n1 order Chebyshev polynomial then, as n1 ! ∞, the approximation error becomes arbitrarily small. Jesús Fernández-Villaverde (PENN) Projection Methods July 10, 2011 16 / 52
  • 17. Introduction Multidimensional Problems Chebyshev polynomials are de…ned on [ 1, 1]. However, most problems in economics are multidimensional. How do we generalize the basis? Curse of dimensionality. Jesús Fernández-Villaverde (PENN) Projection Methods July 10, 2011 17 / 52
  • 18. Introduction Tensors Assume we want to approximate F : [ 1, 1]d ! R. Let Tj denote the Chebyshev polynomial of degree j = 0, 1, .., κ. We can approximate F with tensor product of Chebyshev polynomials of degree κ: κ κ ˆ F (x ) = ∑ ... ∑ ξ n1 ,...,nd Tn1 (x1 ) Tnd (xd ) n 1 =0 n d =0 Beyond simplicity, an advantage of the tensor basis is that if the one-dimensional basis is orthogonal in a norm, the tensor basis is orthogonal in the product norm. Disadvantage: number of elements increases exponentially. We end up having terms x1 x2 κ κ xd , total number of (κ + 1)d . κ Jesús Fernández-Villaverde (PENN) Projection Methods July 10, 2011 18 / 52
  • 19. Introduction Complete Polynomials Solution: eliminate some elements of the tensor in such a way that there is not much numerical degradation. Judd and Gaspar (1997): Use complete polynomials instead ( ) d d Pκ i x11 i xdd with ∑ il κ, 0 i1 , ..., id l =1 Advantage: much smaller number of terms, no terms of order dκ to evaluate. Disadvantage: still too many elements. Jesús Fernández-Villaverde (PENN) Projection Methods July 10, 2011 19 / 52
  • 20. Introduction Smolyak’ Algorithm I s De…ne m1 = 1 and mi = 2i 1 + 1, i = 2, .... De…ne G i = fx1 , ..., xm i g i i [ 1, 1] as the set of the extrema of the Chebyshev polynomials π (j 1) xji = cos j = 1, ..., mi mi 1 with G 1 = f0g. It is crucial that G i G i +1 , 8i = 1, 2, . . . Example: i = 1, mi = 1, G i = f0g i = 2, mi = 3, G i = f 1, 0, 1g π 3π i = 3, mi = 5, G i = f 1, cos , 0, cos , 1g 4 4 Jesús Fernández-Villaverde (PENN) Projection Methods July 10, 2011 20 / 52
  • 21. Introduction Smolyak’ Algorithm II s For q > d, de…ne a sparse grid [ H(q, d ) = (G i1 ... G id ), q d +1 ji j q where ji j = i1 + . . . + id . The number q de…nes the size of the grid and thus the precision of the approximation. For example, let q = d + 2 = 5: [ H(5, 3) = (G i1 ... G id ). 3 ji j 5 G3 G1 G1, G1 G3 G1, G1 G1 G3 G2 G2 G1, G2 G1 G2, G1 G2 G2 G2 G1 G1, G1 G2 G1, G1 G1 G2 G1 G1 G1 Jesús Fernández-Villaverde (PENN) Projection Methods July 10, 2011 21 / 52
  • 22. Introduction Smolyak’ Algorithm III s Number of points for q = d + 2 d (d 1) 1 + 4d + 4 2 Largest number of points along one dimension i = q d +1 mi = 2q d + 1 Rectangular grid h id 2q d +1 Key: with rectangular grid, the number of grid points increases exponentially in the number of dimensions. With the Smolyak algorithm number of points increases polynomially in the number of dimensions. Jesús Fernández-Villaverde (PENN) Projection Methods July 10, 2011 22 / 52
  • 23. Introduction Smolyak’ Algorithm IV s Size of the Grid for q = d + 2 d d 2q d +1 #H(q, d ) 2q d + 1 2 5 13 25 3 5 25 125 4 5 41 625 5 5 61 3, 125 12 5 313 244, 140, 625 Jesús Fernández-Villaverde (PENN) Projection Methods July 10, 2011 23 / 52
  • 24. Introduction Smolyak’ Algorithm V s For one dimension denote the interpolating Chebyshev polynomials as mi U i (x i ) = ∑ ξ ij Tj (x i ) j =1 and the d-dimensional tensor product by U i1 ... U id (x ). For q > d, approximating function (Smolyak’ algorithm) given by s d 1 A(q, d )(x ) = ∑ ( 1)q ji j q ji j (U i1 ... U id )(x ) q d +1 ji j q Method is (almost) optimal within the set of polynomial approximations (Barthelmann, Novak, and Ritter, 1999). Method is universal, that is, almost optimal for many di¤erent function spaces. Jesús Fernández-Villaverde (PENN) Projection Methods July 10, 2011 24 / 52
  • 25. Introduction Boyd’ Moral Principal s 1 When in doubt, use Chebyshev polynomials unless the solution is spatially periodic, in which case an ordinary Fourier series is better. 2 Unless you are sure another set of basis functions is better, use Chebyshev polynomials. 3 Unless you are really, really sure another set of basis functions is better, use Chebyshev polynomials. Jesús Fernández-Villaverde (PENN) Projection Methods July 10, 2011 25 / 52
  • 26. Introduction Finite Elements Standard Reference: McGrattan (1999). Bound the domain Ω in small of the state variables. Partition Ω in small in nonintersecting elements. These small sections are called elements. The boundaries of the elements are called nodes. Jesús Fernández-Villaverde (PENN) Projection Methods July 10, 2011 26 / 52
  • 27. Introduction Partition into Elements Elements may be of unequal size. We can have small elements in the areas of Ω where the economy will spend most of the time while just a few, big size elements will cover wide areas of the state space infrequently visited. Also, through elements, we can easily handle issues like kinks or constraints. There is a whole area of research concentrated on the optimal generation of an element grid. See Thomson, Warsi, and Mastin (1985). Jesús Fernández-Villaverde (PENN) Projection Methods July 10, 2011 27 / 52
  • 28. Introduction Structure Choose a basis for the policy functions in each element. Since the elements are small, a linear basis is often good enough: 8 x x > x xi 1 if x 2 [xi 1 , xi ] < i i 1 x i +1 x ψi (k ) = if k 2 [xi , xi +1 ] > x i +1 x i : 0 elsewhere Plug the policy function in the Equilibrium Conditions and …nd the unknown coe¢ cients. Paste it together to ensure continuity. Why is this an smart strategy? Advantages: we will need to invert an sparse matrix. When should be choose this strategy? speed of computation versus accuracy. Jesús Fernández-Villaverde (PENN) Projection Methods July 10, 2011 28 / 52
  • 29. Introduction Three Di¤erent Re…nements 1 h-re…nement: subdivide each element into smaller elements to improve resolution uniformly over the domain. 2 r-re…nement: subdivide each element only in those regions where there are high nonlinearities. 3 p-re…nement: increase the order of the approximation in each element. If the order of the expansion is high enough, we will generate in that way an hybrid of …nite and spectral methods knows as spectral elements. Jesús Fernández-Villaverde (PENN) Projection Methods July 10, 2011 29 / 52
  • 30. Introduction Choosing the Objective Function The most common answer to the second question is given by a weighted residual. That is why often projection methods are also called weighted residual methods This set of techniques propose to get the residual close to 0 in the weighted integral sense. Given some weight functions φi : Ω ! <m : R 0 if Ω φi (x ) R ( j θ ) dx = 0, i = 1, .., n ρ (R ( j θ ) , 0) = 1 otherwise Then the problem is to choose the θ that solve the system of equations: Z φi (x ) R ( j θ ) dx = 0, i = 1, .., n Ω Jesús Fernández-Villaverde (PENN) Projection Methods July 10, 2011 30 / 52
  • 31. Introduction Remarks With the approximation of d by some functions ψi and the de…nition of some weight functions φi ( ), we have transform a rather intractable functional equation problem into the standard nonlinear equations system! The solution of this system can be found using standard methods, as a Newton for relatively small problems or a conjugate gradient for bigger ones. Issue: we have di¤erent choices for an weight function: Jesús Fernández-Villaverde (PENN) Projection Methods July 10, 2011 31 / 52
  • 32. Introduction Weight Function I: Least Squares ∂R ( x jθ ) φi (x ) = ∂θ i . This choice is motivated by the solution of the variational problem: Z min R 2 ( j θ ) dx θ Ω with …rst order condition: Z ∂R ( x j θ ) R ( j θ ) dx = 0, i = 1, .., n Ω ∂θ i Variational problem is mathematically equivalent to a standard regression problem in econometrics. OLS or NLLS are regression against a manifold spanned by the observations. Jesús Fernández-Villaverde (PENN) Projection Methods July 10, 2011 32 / 52
Weight Function I: Least Squares

Least squares always generates symmetric matrices, even if the operator H is not self-adjoint. Symmetric matrices are convenient theoretically (they simplify proofs) and computationally (there are algorithms that exploit their structure to increase speed and decrease memory requirements).

However, least squares may lead to ill-conditioning and to systems of equations that are complicated to solve numerically.
Weight Function II: Subdomain

We divide the domain Ω into n subdomains Ω_i and define the n step functions:

φ_i(x) = 1  if x ∈ Ω_i
φ_i(x) = 0  otherwise

This choice is then equivalent to solving the system:

∫_{Ω_i} R(x|θ) dx = 0, i = 1, ..., n
Weight Function III: Moments

Take 1, x, x², ..., x^{n−1} and compute the first n moments of the residual function:

∫_Ω x^i R(x|θ) dx = 0, i = 0, ..., n − 1

This approach, widely used in engineering, works well for a low n (2 or 3). For higher orders, however, its numerical performance is very poor: high powers of x are highly collinear and give rise to serious rounding-error problems. Hence, moments are to be avoided as weight functions.
Weight Function IV: Collocation (also called Pseudospectral or Method of Selected Points)

φ_i(x) = δ(x − x_i), where δ is the Dirac delta function and the x_i are the collocation points.

This method implies that the residual function is zero at the n collocation points.

It is simple to compute, since the integral only needs to be evaluated at single points. It is especially attractive when dealing with strong nonlinearities.

A systematic way to pick collocation points on [−1, 1] is to use a density function:

µ_γ(x) = [Γ(3/2 − γ) / (π^{1/2} Γ(1 − γ))] (1 − x²)^{−γ},  γ < 1

and find the collocation points as the x_j, j = 0, ..., n − 1, that solve:

∫_{−1}^{x_j} µ_γ(x) dx = j/(n − 1)

For γ = 0, the density function implies equispaced points.
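A minimal MATLAB sketch of this construction, inverting the implied CDF numerically (all names here are mine, not the lecture's):

    % Minimal sketch: collocation points implied by the density mu_gamma on
    % [-1, 1]. gammaPar < 1 is the clustering parameter; gamma() is MATLAB's
    % gamma function.
    gammaPar = 0.5;                  % gamma = 1/2 clusters points like Chebyshev
    n   = 11;
    C   = gamma(1.5 - gammaPar) / (sqrt(pi) * gamma(1 - gammaPar));
    mu  = @(x) C * (1 - x.^2).^(-gammaPar);
    cdf = @(x) integral(mu, -1, x);  % CDF of mu_gamma
    xj  = zeros(n, 1); xj(1) = -1; xj(n) = 1;
    for j = 2:n-1                    % interior points: invert the CDF
        xj(j) = fzero(@(x) cdf(x) - (j-1)/(n-1), [-1 + 1e-8, 1 - 1e-8]);
    end

For gammaPar = 0 this returns equispaced points; for gammaPar = 0.5 it reproduces the Chebyshev-style clustering of points near the endpoints.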
Weight Function V: Orthogonal Collocation

A variation of the collocation method:

1. The basis functions are a set of orthogonal polynomials.
2. The collocation points are given by the roots of the n-th polynomial.

When we use Chebyshev polynomials, their roots are the collocation points implied by µ_{1/2}(x), and their clustering can be shown to be optimal as n → ∞.

Orthogonal collocation methods perform surprisingly well.
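The Chebyshev collocation points have a simple closed form; a short MATLAB sketch:

    % Minimal sketch: the n roots of the n-th Chebyshev polynomial of the
    % first kind, the standard orthogonal-collocation points on (-1, 1).
    n = 10;
    k = (1:n)';
    x = -cos((2*k - 1) * pi / (2*n));   % sorted in ascending order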
Weight Function VI: Galerkin (or Rayleigh-Ritz)

φ_i(x) = ψ_i(x), with a linear approximating function ∑_{i=1}^n θ_i ψ_i(x). Then:

∫_Ω ψ_i(x) H(∑_{j=1}^n θ_j ψ_j(x)) dx = 0, i = 1, ..., n

that is, the residual has to be orthogonal to each of the basis functions.

Galerkin is highly accurate and robust, but difficult to code.

If the basis functions are complete over J_1 (they are indeed a basis of the space), then the Galerkin solution converges pointwise to the true solution as n goes to infinity:

lim_{n→∞} ∑_{i=1}^n θ_i ψ_i(·) = d(·)

Experience suggests that a Galerkin approximation of order n is as accurate as a pseudospectral expansion of order n + 1 or n + 2.
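A minimal MATLAB sketch of one Galerkin equation, with the weighted integral computed numerically; basis, residfun, and the interval [a, b] are hypothetical:

    % Minimal sketch: the i-th Galerkin equation. basis(i, x) and
    % residfun(x, theta) are hypothetical vectorized function handles.
    Gi = @(theta, i) integral(@(x) basis(i, x) .* residfun(x, theta), a, b);
    % Stacking Gi over i = 1, ..., n gives the system to be solved for theta.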
A Simple Example

Imagine that the law of motion for the price of a good, d(x), is given by:

d′(x) + d(x) = 0

Let us apply a simple projection to solve this differential equation.

Code: test.m, test2.m, test3.m
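A minimal sketch in the spirit of test.m (the actual file is not reproduced here); the monomial basis, the collocation nodes, and the boundary condition d(0) = 1 (so that the exact solution is e^{−x}) are my assumptions:

    % Minimal sketch: solve d'(x) + d(x) = 0 on [0, 1] by collocation,
    % imposing d(0) = 1 so that the exact solution is exp(-x).
    n  = 5;                            % number of unknown coefficients
    xc = linspace(0.1, 1, n)';         % collocation nodes (hypothetical choice)
    % Approximation d_n(x) = 1 + sum_i theta_i x^i, so d_n(0) = 1 holds by
    % construction. The residual R(x|theta) = d_n'(x) + d_n(x) is linear in
    % theta: R(x) = 1 + sum_i theta_i * (i*x^(i-1) + x^i).
    A = zeros(n, n);
    for i = 1:n
        A(:, i) = i * xc.^(i-1) + xc.^i;   % coefficient of theta_i in R(xc)
    end
    theta = A \ (-ones(n, 1));             % zero the residual at the nodes
    % Accuracy check against the exact solution:
    xg = linspace(0, 1, 100)';
    dn = 1 + (xg.^(1:n)) * theta;          % implicit expansion builds [x, x^2, ...]
    max(abs(dn - exp(-xg)))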
Analysis of Error

As with any projection method, it is important to study the Euler equation errors.

We can reduce the errors by:

1. Adding additional functions to the basis.
2. Refining the elements.

Multigrid schemes are useful here.
A More Serious Example

Representative agent with utility function:

U = E_0 ∑_{t=0}^∞ β^t [c_t^θ (1 − l_t)^{1−θ}]^{1−τ} / (1 − τ)

One good is produced according to y_t = e^{z_t} A k_t^α l_t^{1−α}, with α ∈ (0, 1).

Productivity evolves as z_t = ρ z_{t−1} + ε_t, with |ρ| < 1 and ε_t ∼ N(0, σ²).

Law of motion for capital: k_{t+1} = i_t + (1 − δ) k_t.

Resource constraint: c_t + i_t = y_t.
We solve for c(·, ·) and l(·, ·) given initial conditions. These policy functions are characterized by:

U_c(t) = β E_t { U_c(t + 1) [1 + α A e^{z_{t+1}} k_{t+1}^{α−1} l(k_{t+1}, z_{t+1})^{1−α} − δ] }

[(1 − θ)/θ] c(k_t, z_t)/(1 − l(k_t, z_t)) = (1 − α) e^{z_t} A k_t^α l(k_t, z_t)^{−α}

This is a system of functional equations with no known analytical solution.

Fortran code using Chebyshev polynomials and finite elements.
Chebyshev I

We approximate the decision rule for labor as l_t = ∑_{i=1}^n θ_i ψ_i(k_t, z_t), where {ψ_i(k, z)}_{i=1}^n are basis functions and θ = [{θ_i}_{i=1}^n] are unknown coefficients.

We use that policy function to solve for consumption using the static first-order condition.

We build a residual function R(k, z, θ) using the Euler equation and the static first-order condition. Then we choose θ by solving:

∫_{[k_min, k_max]} ∫_{[z_min, z_max]} φ_i(k, z) R(k, z, θ) dz dk = 0 for i = 1, ..., n

where {φ_i(k, z)}_{i=1}^n are some weight functions.
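For concreteness, a minimal MATLAB sketch of how such a two-dimensional approximation might be evaluated with a tensor product of Chebyshev polynomials; the linear rescaling of the states and all names are my assumptions, not the lecture's Fortran code:

    % Minimal sketch: evaluate l(k, z) = sum_ij theta(i,j) T_{i-1}(xk) T_{j-1}(xz),
    % where xk, xz map (k, z) linearly into [-1, 1].
    function l = labordr(k, z, theta, kmin, kmax, zmin, zmax)
        xk = 2*(k - kmin)/(kmax - kmin) - 1;   % capital rescaled to [-1, 1]
        xz = 2*(z - zmin)/(zmax - zmin) - 1;   % productivity rescaled to [-1, 1]
        [n1, n2] = size(theta);
        Tk = cos((0:n1-1) * acos(xk));         % T_0(xk), ..., T_{n1-1}(xk)
        Tz = cos((0:n2-1) * acos(xz));
        l  = Tk * theta * Tz';                 % sum_ij theta_ij T_i(xk) T_j(xz)
    end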
Chebyshev II

We use a collocation method that sets φ_i(k, z) = δ(k − k_j, z − z_v), where δ(·) is the Dirac delta function, j = 1, ..., n_1, v = 1, ..., n_2, and n = n_1 n_2, with collocation points {k_j}_{j=1}^{n_1} and {z_v}_{v=1}^{n_2}.

For the technology shocks and transition probabilities we use Tauchen's (1986) finite approximation to an AR(1) process and obtain n_2 points.

We solve the system of n equations R(k_i, z_i, θ) = 0 in the n unknowns θ using a quasi-Newton method. We use an iteration based on increasing the number of basis functions, together with a nonlinear transformation of the objective function (we apply (u′)^{−1}).
Finite Elements

Rewrite the Euler equation as:

U_c(k_t, z_t) = [β/(2πσ²)^{0.5}] ∫_{−∞}^{∞} [U_c(k_{t+1}, z_{t+1}) r(k_{t+1}, z_{t+1})] exp(−ε_{t+1}²/(2σ²)) dε_{t+1}

where:

U_c(t) = U_c(k_t, z_t)
k_{t+1} = e^{z_t} k_t^α l_t^{1−α} + (1 − δ) k_t − c(k_t, z_t)
r(k_{t+1}, z_{t+1}) = 1 + α e^{z_{t+1}} k_{t+1}^{α−1} l(k_{t+1}, z_{t+1})^{1−α} − δ
z_{t+1} = ρ z_t + ε_{t+1}
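This conditional expectation can be approximated with Gauss-Hermite quadrature; a minimal MATLAB sketch in which gausshermite (a routine returning nodes and weights for the weight exp(−x²)) and eulerInside (returning U_c(t+1) r(t+1) given the state next period) are hypothetical helpers:

    % Minimal sketch: E_t[Uc(t+1) r(t+1)] by m-point Gauss-Hermite quadrature.
    m = 10;
    [ghx, ghw] = gausshermite(m);            % hypothetical nodes and weights
    Eterm = 0;
    for q = 1:m
        eps1  = sqrt(2) * sigma * ghx(q);    % innovation drawn from N(0, sigma^2)
        Eterm = Eterm + ghw(q) * eulerInside(kt1, rho*zt + eps1);
    end
    Eterm = Eterm / sqrt(pi);                % normalizing constant of the density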
Goal

The problem is to find two policy functions c(k, z) : R_+ × [0, ∞] → R_+ and l(k, z) : R_+ × [0, ∞] → [0, 1] that satisfy the model's equilibrium conditions.

Since the static first-order condition gives a relation between the two policy functions, we only need to solve for one of them.

For the rest of the exposition we will assume that we actually solve for l(k, z) and then find c(l(k, z)).
Bounding the State Space I

We bound the domain of the state variables in order to partition it into nonintersecting elements.

To bound the productivity level of the economy, define λ_t = tanh(z_t). Since λ_t ∈ [−1, 1], we can write the stochastic process as:

λ_t = tanh(ρ tanh^{−1}(λ_{t−1}) + 2^{0.5} σ v_t)

where v_t = ε_t / (2^{0.5} σ).
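A one-line MATLAB sketch of this change of variables (the names are mine):

    % Minimal sketch: the tanh transformation keeps bounded productivity
    % lambda in [-1, 1]; v is the rescaled innovation eps/(sqrt(2)*sigma).
    lambdaNext = @(lambda, v, rho, sigma) tanh(rho*atanh(lambda) + sqrt(2)*sigma*v);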
Bounding the State Space II

Now, since exp(tanh^{−1}(λ_{t+1})) = (1 + λ_{t+1})^{0.5} / (1 − λ_{t+1})^{0.5} ≡ λ̂_{t+1}, we have:

U_c(t) = (β/π^{0.5}) ∫_{−1}^{1} [U_c(k_{t+1}, z_{t+1}) r(k_{t+1}, z_{t+1})] exp(−v_{t+1}²) dv_{t+1}

where:

k_{t+1} = λ̂_t k_t^α l(k_t, z_t)^{1−α} + (1 − δ) k_t − c(l(k_t, z_t))
r(k_{t+1}, z_{t+1}) = 1 + α λ̂_{t+1} k_{t+1}^{α−1} l(k_{t+1}, z_{t+1})^{1−α} − δ
z_{t+1} = tanh(ρ tanh^{−1}(z_t) + 2^{0.5} σ v_{t+1})

To bound capital, we fix an ex-ante upper bound k_max, picked sufficiently high that it binds only with an extremely low probability.
Partition into Elements

Define Ω = [0, k_max] × [−1, 1] as the domain of l_fe(k, z; θ).

Divide Ω into nonoverlapping rectangles [k_i, k_{i+1}] × [z_j, z_{j+1}], where k_i is the i-th grid point for capital and z_j is the j-th grid point for the technology shock.

Clearly, Ω = ∪_{i,j} [k_i, k_{i+1}] × [z_j, z_{j+1}].
Our Functional Basis

Set l_fe(k, z; θ) = ∑_{i,j} θ_ij Ψ_ij(k, z) = ∑_{i,j} θ_ij Ψ̂_i(k) Ψ̃_j(z), where:

Ψ̂_i(k) = (k − k_{i−1})/(k_i − k_{i−1})  if k ∈ [k_{i−1}, k_i]
Ψ̂_i(k) = (k_{i+1} − k)/(k_{i+1} − k_i)  if k ∈ [k_i, k_{i+1}]
Ψ̂_i(k) = 0                              elsewhere

Ψ̃_j(z) = (z − z_{j−1})/(z_j − z_{j−1})  if z ∈ [z_{j−1}, z_j]
Ψ̃_j(z) = (z_{j+1} − z)/(z_{j+1} − z_j)  if z ∈ [z_j, z_{j+1}]
Ψ̃_j(z) = 0                              elsewhere

Note that:

1. Ψ_ij(k, z) = 0 if (k, z) ∉ [k_{i−1}, k_{i+1}] × [z_{j−1}, z_{j+1}] for all i, j, i.e., each basis function is 0 everywhere except inside the few elements adjacent to its node.
2. l_fe(k_i, z_j; θ) = θ_ij for all i, j, i.e., the values of θ specify the values of l_fe at the corners of each element [k_i, k_{i+1}] × [z_j, z_{j+1}].
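Because each basis function is local, evaluating l_fe at a point reduces to bilinear interpolation on the rectangle containing that point; a minimal MATLAB sketch (kgrid, zgrid, and theta are hypothetical inputs):

    % Minimal sketch: evaluate l_fe(k, z; theta). Only the four coefficients
    % at the corners of the rectangle containing (k, z) matter, so the double
    % sum collapses to bilinear interpolation.
    function l = lfe(k, z, theta, kgrid, zgrid)
        i = find(kgrid <= k, 1, 'last'); i = min(i, numel(kgrid) - 1);
        j = find(zgrid <= z, 1, 'last'); j = min(j, numel(zgrid) - 1);
        wk = (k - kgrid(i)) / (kgrid(i+1) - kgrid(i));   % local coordinate in [0, 1]
        wz = (z - zgrid(j)) / (zgrid(j+1) - zgrid(j));
        l  = (1-wk)*(1-wz)*theta(i, j)   + wk*(1-wz)*theta(i+1, j) ...
           + (1-wk)*wz*theta(i, j+1)     + wk*wz*theta(i+1, j+1);
    end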
Residual Function I

Define U_c(k_{t+1}, z_{t+1})_fe as the marginal utility of consumption evaluated at the finite-element approximations of consumption and leisure.

From the Euler equation we have a residual equation:

R(k_t, z_t; θ) = (β/π^{0.5}) ∫_{−1}^{1} [U_c(k_{t+1}, z_{t+1})_fe r(k_{t+1}, z_{t+1})] exp(−v_{t+1}²) dv_{t+1} − U_c(k_t, z_t)_fe

A Galerkin scheme implies that we weight the residual function by the basis functions and solve the system:

∫_{[0, k_max] × [−1, 1]} Ψ_ij(k, z) R(k, z; θ) dz dk = 0 for all i, j

in the unknown coefficients θ (as many equations as unknowns).
Residual Function II

Since each Ψ_ij(k, z) is 0 outside the elements adjacent to the node (k_i, z_j), each integral in the Galerkin system only needs to be computed over those elements:

∫_{[k_{i−1}, k_{i+1}] × [z_{j−1}, z_{j+1}]} Ψ_ij(k, z) R(k, z; θ) dz dk = 0 for all i, j

We use Gauss-Hermite quadrature for the expectation integral in the residual equation and Gauss-Legendre quadrature for the weighting integrals over the elements.

We use 71 unequal elements in the capital dimension and 31 in the λ dimension. To solve the associated system of 2,201 nonlinear equations we use a quasi-Newton algorithm.