Abstract
In this paper, we distinguish three constrained variants of the gravity model of spatial interaction: doubly constrained, production constrained and attraction constrained exponential gravity models. These model variants include origin and/or destination specific balancing factors that act as constraints to ensure that the estimated rows and columns of the flow data matrix sum to the observed row and column totals. Because flows are typically counts, the Poisson rather than the normal probability model specification furnishes the appropriate statistical distribution, and parameter estimation can be achieved via Poisson regression. This probability model specification motivates the use of origin and/or destination fixed effects or—under certain conditions—the use of origin and/or destination specific random effects for model estimation. The paper establishes theoretical connections between balancing factors, fixed effects represented by binary indicator variables, and random effects. The results pertaining to both the doubly and singly constrained cases of spatial interaction are illustrated with an empirical example, while accounting for spatial dependence between flows from locations neighbouring both the origins and destinations during estimation.
This paper has been previously published in the Journal of Geographical Systems. Special Issue on “Advances in the Statistical Modelling of Spatial Interaction Data”, Vol. 15, Number 3/July 2013, ©Springer-Verlag Berlin Heidelberg, pp. 291–317.
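The role of the balancing factors described in the abstract can be illustrated with a short numerical sketch. It assumes an exponential separation function \( \exp(-\theta d_{ij}) \) and uses the classical iterative proportional fitting scheme; the function name, argument names and toy data are hypothetical and are not taken from the chapter, whose own estimation proceeds via Poisson regression.

```python
import numpy as np


def balance_doubly_constrained(origin_totals, dest_totals, dist, theta,
                               max_iter=1000, tol=1e-12):
    """Return balancing factors (A, B) and fitted flows mu, where
    mu[i, j] = A[i] * O[i] * B[j] * D[j] * exp(-theta * dist[i, j])."""
    O = np.asarray(origin_totals, dtype=float)
    D = np.asarray(dest_totals, dtype=float)
    if not np.isclose(O.sum(), D.sum()):
        raise ValueError("origin and destination totals must share one overall total")
    F = np.exp(-theta * np.asarray(dist, dtype=float))  # assumed separation function
    A = np.ones_like(O)
    B = np.ones_like(D)
    for _ in range(max_iter):
        A_new = 1.0 / (F @ (B * D))        # makes the row sums equal O
        B_new = 1.0 / (F.T @ (A_new * O))  # makes the column sums equal D
        change = max(np.max(np.abs(A_new / A - 1.0)),
                     np.max(np.abs(B_new / B - 1.0)))
        A, B = A_new, B_new
        if change < tol:
            break
    mu = (A * O)[:, None] * (B * D)[None, :] * F
    return A, B, mu


# Toy data, made up for illustration only
dist = np.array([[0.0, 2.0, 3.0],
                 [2.0, 0.0, 1.5],
                 [3.0, 1.5, 0.0]])
O = np.array([100.0, 200.0, 300.0])
D = np.array([150.0, 250.0, 200.0])
A, B, mu = balance_doubly_constrained(O, D, dist, theta=0.5)
print(mu.sum(axis=1))  # row sums, approximately equal to O
print(mu.sum(axis=0))  # column sums, approximately equal to D
```

Dropping the update of `B` (or of `A`) and holding it at one gives the corresponding singly constrained variant under the same assumptions.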
Notes
- 1.
The terms gravity model and spatial interaction model are often used interchangeably, but they are not the same. Spatial interaction models include not only gravity models, but also similar models that have been derived using powerful methods of entropy maximisation from statistical mechanics (Wilson 1967) or utility maximisation from economic theory (Niedercorn and Bechdolt 1969), as well as models based on intervening opportunities, which can be derived heuristically.
- 2.
For a discussion of problems that plague empirical implementation of regression-based gravity models, and econometric extensions that have recently appeared in the literature, see LeSage and Fischer (2010). These new models replace the conventional assumption of independence among origin-destination flows with formal approaches that allow for spatial dependence in flow magnitudes. The econometric extensions are based on the assumption of a linear relation between the dependent and the independent variables, and this assumes the dependent variable to be normally distributed.
- 3.
Trip making is viewed as consisting of four components (see, for example, Fischer 2000): trip generation and attraction (the decision to make a trip and how often); trip distribution in a system of traffic zones; modal split (choice of mode of transport); and, trip assignment (choice of route through network). The gravity model is used for trip distribution, but is preceded by trip generation and attraction models that provide independent estimates of locational (zonal) trip origins and attractions that subsequently become the “mass” terms of the gravity model. Thus, the definition of the row and column sums of the predicted trip matrix coincides exactly with the definitions of the respective mass terms.
- 4.
- 5.
An alternative formulation to that given in Eq. (3.1) is \( Y_{ij} = K_{ij}\, U_i\, V_j\, f\left(d_{ij}\right) \eta_{ij} + \varepsilon_{ij} \), where \( \varepsilon_{ij} \) reflects the sample error and \( \eta_{ij} \) the specification error. In this case, the stochastic nature of \( Y_{ij} \) derives from assumptions made about the stochastic nature of \( \varepsilon_{ij} \) and \( \eta_{ij} \).
- 6.
The multiplicative form of the balancing factors \( A_i \) and \( B_j \) (Wilson 1967) ensures mathematical tractability in searching for an adequate estimation procedure. Alternatively, Tobler (1983) suggests an additive adjustment scheme, \( K_{ij} = A_i + B_j \), to enforce the conservation rule satisfactorily. Ledent (1985) introduces a general functional form that subsumes both the multiplicative (Wilson) and the additive (Tobler) variants.
- 7.
In the origin constrained and the destination constrained models presented here, the constraints to which these models are subject refer to the full set of n origin or n destination locations. But it is possible to develop models that are only constrained over certain subsets of locations. Such models, which are not considered in this paper, may be found in Wilson (1970).
- 8.
The notion that separation functions in conventional gravity models work to effectively capture spatial dependence in origin-destination flows has long been challenged. Griffith (2007) provides an historical review of the regional science literature about this topic in which he credits Curry (1972) as the first to conceptualise the problem of spatial dependence in flows.
- 9.
The constrained gravity model variants are intrinsically non-linear in their parameters, and thus the application of linear methods leads to biased estimates of these parameters.
- 10.
In the economics literature it is often called the RAS procedure.
- 11.
Independence means that the individual flows from origin i to destination j are independent from each other, and that origin-destination flows between any pair of locations are independent from flows between any other pair of locations.
- 12.
Closely related to this assumption are the assumptions that the set of observations for each origin location has a multinomial distribution, say \( \mathcal{MN}\left(Y_{i1}, Y_{i2}, \ldots, Y_{in};\, Y_{i\bullet}\right) \), or that the set of all observations has a multinomial distribution \( \mathcal{MN}\left(Y_{11}, Y_{12}, \ldots, Y_{nn};\, Y_{\bullet\bullet}\right) \), where \( Y_{i\bullet} \) is the total flow from origin location i, \( Y_{\bullet\bullet} \) is the overall flow, and n is the number of origin and destination locations. These multinomial distributions can be generated by assuming that the \( Y_{ij} \) (i, j = 1, …, n) are independent Poisson random variables sampled subject to the origin totals \( Y_{i\bullet} \), or the overall total \( Y_{\bullet\bullet} \), being fixed (Bishop et al. 1975).
- 13.
One advantage of using origin/destination indicator variables in a Poisson regression specification is that they yield an individual standard error and null hypothesis probability estimate for each value in the two sets of balancing factors, rather than a single aggregate one. One disadvantage is the amount of time necessary to estimate a GLM containing \( 2n-2 \) indicator variables.
- 14.
The logarithmic link function is best thought of as being an exponential conditional mean function.
- 15.
McCullagh and Nelder (1983) prove that the procedure converges to the maximum likelihood solution. Note that zero observed flows do not require any special treatment.
- 16.
The equivalence of maximum likelihood estimation with the Poisson assumption and the entropy maximisation solution for a doubly constrained gravity model with origin and destination specific balancing factors is well known (see Wilson and Kirkby 1980, p. 310). In the latter case, parameter estimation of a model such as Eq. (3.1) is obtained by maximising an objective function subject to sets of constraints on the origin and destination totals in combination with some constraint on a general measure of spatial separation in the system of locations (Baxter 1982).
- 17.
Whether the random effects model variants are appropriate model specifications in spatial research remains controversial. When the random effects gravity models are implemented, the spatial units of observation should be representative of a larger population, and n should potentially be able to get to infinity (see Elhorst 2010 for more details on this issue).
- 18.
Origin/destination specific spatial dependence in the random effects estimates motivated the gravity model set forth in LeSage et al. (2007) that formally incorporates spatially structured random effects in place of the zero mean, normally distributed independent random effects.
- 19.
This correlation differs from that latent in the geographic distributions of the origin and destination variables that are reflected in the balancing factors. Pace et al. (2011) show that spatial dependence in the explanatory variables decreases the ability of filtering to produce unbiased regression parameter estimates.
- 20.
In the fixed effects case of the doubly constrained gravity model, for example, this takes the form
$$ E\left(Y_{ij}\right) = \mu_{ij} = U_i\, V_j \exp\left[\alpha + \sum_{h=1}^{n-1} I_{iho}\, \beta_{ho} + \sum_{k=1}^{n-1} I_{jkd}\, \beta_{kd} - \theta\, d_{ij}\right] \prod_{j\ne i}^{n} E\left(Y_{ij}\right)^{\rho W_{ij}} $$
where \( W_{ij} \) is the (i,j)th element of an N-by-N spatial weight matrix W and ρ is a scalar parameter that governs the degree of spatial dependence in origin-destination flows. Lambert et al. (2010) set forth a two-step maximum likelihood estimation approach for a spatial autoregressive Poisson model for count data, which would need to be extended to the case of flows involving N observations.
- 21.
This is an especially valuable approach in situations where the flows are count data, because conventional spatial regression models and software tools are less developed for this data type.
- 22.
We assume that W is similar to a symmetric matrix so that it has real eigenvalues. If W is not symmetric, then \( \tfrac{1}{2}\left(W + W^{\prime}\right) \), which is symmetric by construction, may be used.
- 23.
If intralocational flows are excluded from an analysis, the N-by-N spatial weight matrix reduces to an n(n–1)-by-n(n–1) one, only marginally impacting upon these eigenvectors when n > 100.
- 24.
Neighbours may be defined using contiguity or measures of spatial proximity such as cardinal distance (for example, in terms of transportation costs) or ordinal distance (for example, the six nearest neighbours). In the illustrative example in Sect. 3.6, we use a binary contiguity matrix \( W_n \) to define W.
- 25.
The criterion \( I/I_{\max} = 0.5 \) suggests a restriction of the search over eigenvectors with moderate to high spatial autocorrelation.
- 26.
Pace et al. (2011) demonstrate how using iterative eigenvalue routines on sparse matrices such as W can make filtering feasible for data sets involving a million or more observations, and empirically estimate an operation count on the order of \( N^{1.1} \).
- 27.
For details about the data construction, see Fischer et al. (2006).
- 28.
\( A_{257} \) and \( B_{257} \) are the arbitrarily selected balancing factors set to one in each case, to avoid perfect multicollinearity with the intercept term, resulting in an expected intercept of zero and an expected slope of one.
- 29.
The regression equations describe each set of log-balancing factors as a function of the corresponding fixed effects indicator variables. Error terms are not included here.
- 30.
A deviance statistic exceeding one indicates that overdispersion is present; that is, the Poisson variance is greater than its mean. Although the existence of overdispersion does not affect the unbiased character of the parameter estimates, their standard errors are underestimated, and hence their significance is unrealistically increased.
- 31.
Because balancing factors are autoregressive specifications [see Eqs. (3.13)–(3.14)], they contain marked spatial dependence by construction. The spatial filter descriptions of these balancing factors rely on eigenvectors of the transformed spatial weight matrix \( M_n W_n M_n \), where \( W_n \) is the n-by-n binary contiguity matrix and \( M_n \) is the n-by-n projection matrix defined by \( M_n = I_n - \iota_n\, \iota_n^{\prime}\, n^{-1} \). Forty-two candidate eigenvectors (for which \( I/I_{\max} > 0.25 \)) are available for constructing spatial filters portraying positive spatial autocorrelation across the European regions. Of these, subsets have been selected with a stepwise regression procedure for constructing spatial filters describing the two sets of balancing factors. The criteria used for selection were statistically significant coefficients at the 10 % level associated with minimisation of the log-likelihood function, which is standard practice.
- 32.
Of note is that for n larger than about 100, current computer resources do not allow direct calculation of the eigenvectors of W. In order to reduce computational intensity, we followed Griffith (2009) and constructed the spatial filter as a linear combination of Kronecker products of pairs of origin and destination eigenvectors. The result of this adjustment is \( 24^2 = 576 \) candidate eigenvectors identified as Kronecker products of the 24 eigenvectors with an I > 0.5 extracted from matrix \( \left(I - \iota\, \iota^{\prime}\, n^{-1}\right) W_n \left(I - \iota\, \iota^{\prime}\, n^{-1}\right) \). With 66,049 observations, five covariates and an intercept term, and 576 candidate eigenvectors, the numerical intensity of the problem solution becomes feasible but is still high.
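The construction described in notes 22, 31 and 32 can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: the helper names, the toy 3-by-3 grid and the origin-major stacking of flows are assumptions. It relies on the standard result that, for a symmetric weight matrix, the Moran coefficient of an eigenvector of the projected matrix is proportional to its eigenvalue, so that \( I/I_{\max} \) can be read off as an eigenvalue ratio.

```python
import numpy as np


def rook_contiguity(n_rows, n_cols):
    """Binary contiguity matrix for a regular grid (cells sharing an edge)."""
    n = n_rows * n_cols
    W = np.zeros((n, n))
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            if c + 1 < n_cols:                    # neighbour to the right
                W[i, i + 1] = W[i + 1, i] = 1.0
            if r + 1 < n_rows:                    # neighbour below
                W[i, i + n_cols] = W[i + n_cols, i] = 1.0
    return W


def candidate_eigenvectors(W_n, ratio=0.25):
    """Eigenvectors of M_n W_n M_n whose eigenvalue ratio (taken here as
    I / I_max) exceeds `ratio`, sorted by decreasing eigenvalue."""
    n = W_n.shape[0]
    W_sym = 0.5 * (W_n + W_n.T)                   # symmetrise, as in note 22
    M_n = np.eye(n) - np.ones((n, n)) / n         # projection matrix M_n
    eigvals, eigvecs = np.linalg.eigh(M_n @ W_sym @ M_n)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals / eigvals[0] > ratio
    return eigvals[keep], eigvecs[:, keep]


def flow_candidates(E):
    """All Kronecker products of origin and destination eigenvectors.
    Column a * K + b equals kron(E[:, a], E[:, b]); flows are assumed to be
    stacked origin-major, so entry i * n + j belongs to the flow i -> j."""
    return np.kron(E, E)


# Toy example: a 3-by-3 grid of regions (illustrative only)
W_n = rook_contiguity(3, 3)
eigvals, E = candidate_eigenvectors(W_n, ratio=0.25)
X = flow_candidates(E)
print(E.shape, X.shape)
```

With K retained eigenvectors the candidate set has \( K^2 \) columns, which matches the count of \( 24^2 = 576 \) reported in note 32; a stepwise selection over these columns would follow in an actual application.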
References
Bailey TC, Gatrell AC (1995) Interactive spatial data analysis. Longman, Harlow
Baxter M (1982) Similarities in methods of estimating spatial interaction models. Geogr Anal 14(3):267–272
Baxter M (1983) Estimation and inference in spatial interaction models. Prog Hum Geogr 7:40–59
Bishop YMM, Fienberg SE, Holland PW (1975) Discrete multivariate analysis. MIT Press, Cambridge
Black WR (1992) Network autocorrelation in transport network and flow systems. Geogr Anal 24(3):207–222
Bolduc D, Laferrière R, Santarossa G (1995) Spatial autoregressive error components in travel flow models: an application to aggregate mode choice. In: Anselin L, Florax R (eds) New directions in spatial econometrics. Springer, Berlin/Heidelberg/New York, pp 96–108
Cesario FJ (1973) A generalized trip distribution model. J Reg Sci 13(2):233–248
Cesario FJ (1977) A new interpretation of the “normalizing” or “balancing factors” of gravity type spatial models. J Socio Econ Plan Sci 11(3):131–136
Chun Y (2008) Modeling network autocorrelation within migration flows by eigenvector spatial filtering. J Geogr Syst 10(4):317–344
Chun Y, Griffith DA (2011) Modeling network autocorrelation in space-time migration flow data: an eigenvector spatial filtering approach. Ann Assoc Am Geogr 101(3):523–536
Curry L (1972) A spatial analysis of gravity flows. Reg Stud 6(2):131–137
Davies RB, Guy CM (1987) The statistical modelling of flow data when the Poisson assumption is violated. Geogr Anal 19(4):300–314
Elhorst JP (2010) Spatial panel data models. In: Fischer MM, Getis A (eds) Handbook of applied spatial analysis. Springer, Berlin/Heidelberg/New York, pp 377–407
Evans AW (1970) Some properties of trip distribution models. Transp Res 4(1):19–36
Fischer MM (2000) Travel demand – theory. In: Polak JB, Heertje A (eds) Analytical transport economics. Edward Elgar, Cheltenham, pp 51–78
Fischer MM, Griffith DA (2008) Modeling spatial autocorrelation in spatial interaction data: a comparison of spatial econometric and spatial filtering specifications. J Reg Sci 48(5):969–989
Fischer MM, Wang J (2011) Spatial data analysis: models, methods and techniques [Springer briefs in regional science]. Springer, Berlin/Heidelberg/New York
Fischer MM, Scherngell T, Jansenberger E (2006) The geography of knowledge spillovers between high-technology firms in Europe: evidence from a spatial interaction modeling perspective. Geogr Anal 38(3):288–309
Fotheringham AS, O’Kelly ME (1989) Spatial interaction models: formulations and applications. Kluwer Academic Publishers, Dordrecht
Griffith DA (2003) Spatial autocorrelation and spatial filtering. Springer, Berlin/Heidelberg/New York
Griffith DA (2007) Spatial structure and spatial interaction: 25 years later. Rev Reg Stud 37(1):28–38
Griffith DA (2009) Modeling spatial autocorrelation in spatial interaction data: empirical evidence from 2002 Germany journey-to-work flows. J Geogr Syst 11(2):117–140
Haynes KE, Fotheringham AS (1984) Gravity and spatial interaction models. Sage, Beverly Hills
Isard W (1960) Methods of regional analysis. MIT Press, Cambridge
Kirby HR (1974) Theoretical requirements for calibrating gravity models. Transp Res 8(1):97–104
Lambert DM, Brown JP, Florax RJGM (2010) A two-step estimator for a spatial lag model of counts: theory, small sample performance and an application. Reg Sci Urban Econ 40(4):241–252
Ledent J (1985) The doubly constrained model of spatial interaction: a more general formulation. Environ Plan A 17(2):253–262
LeSage JP, Fischer MM (2010) Spatial econometric methods for modeling origin-destination flows. In: Fischer MM, Getis A (eds) Handbook of applied spatial analysis. Springer, Berlin/Heidelberg/New York, pp 409–433
LeSage JP, Pace RK (2008) Spatial econometric modeling of origin-destination flows. J Reg Sci 48(5):941–968
LeSage JP, Pace RK (2009) Introduction to spatial econometrics. CRC Press, Boca Raton
LeSage JP, Fischer MM, Scherngell T (2007) Knowledge spillovers across Europe. Evidence from a Poisson spatial interaction model with spatial effects. Pap Reg Sci 86(3):393–421
McCullagh P, Nelder JA (1983) Generalized linear models. Chapman and Hall, London
Niedercorn J, Bechdolt B (1969) An economic derivation of the “gravity law” of spatial interaction. J Reg Sci 9(2):273–282
Pace RK, LeSage JP, Zhu S (2011) Interpretation and computation of estimates from regression models using spatial filtering. Paper presented at the Fifth World Conference of the Spatial Econometrics Association, Toulouse, July 6–8, 2011
Sen A, Smith T (1995) Gravity models of spatial interaction behavior. Springer, Berlin/Heidelberg/New York
Tiefelsdorf M (2003) Misspecification in interaction model distance decay relations: a spatial structure effect. J Geogr Syst 5(1):25–50
Tiefelsdorf M, Boots B (1995) The exact distribution of Moran’s I. Environ Plan A 27(6):985–999
Tobler W (1983) An alternative formulation for spatial interaction modeling. Environ Plan A 15(5):693–703
Wilson AG (1967) A statistical theory of spatial distribution models. Transp Res 1(3):253–269
Wilson AG (1970) Entropy in urban and regional modelling. Pion, London
Wilson AG (1971) A family of spatial interaction models and associated developments. Environ Plan 3(1):1–32
Wilson AG, Kirkby MJ (1980) Mathematics for geographers and planners, 2nd edn. Clarendon, Oxford
Appendix: Results for the Estimation of Singly Constrained Random Effects Specifications
Because of the large dimensionality of the calculus problem, multivariate integration struggles to properly estimate the random effects terms. The largest values appear to introduce the greatest difficulties. Figure 3.6 A reveals that integration is completely successful between the minimum and roughly 0.5 in our case study, and only partially successful beyond 0.5. Incorrectly calculated random effects constitute about 10 % of the total number of random effects in this case study.
Copyright information
© 2016 Springer International Publishing Switzerland
Cite this chapter
Griffith, D.A., Fischer, M.M. (2016). Constrained Variants of the Gravity Model and Spatial Dependence: Model Specification and Estimation Issues. In: Patuelli, R., Arbia, G. (eds) Spatial Econometric Interaction Modelling. Advances in Spatial Science. Springer, Cham. https://doi.org/10.1007/978-3-319-30196-9_3
DOI: https://doi.org/10.1007/978-3-319-30196-9_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-30194-5
Online ISBN: 978-3-319-30196-9
eBook Packages: Economics and Finance (R0)