Rest 88 4 641
Abstract: Although economists have long been aware of Jensen's inequality, many econometric applications have neglected an important implication of it: under heteroskedasticity, the parameters of log-linearized models estimated by OLS lead to biased estimates of the true elasticities. We explain why this problem arises and propose an appropriate estimator. Our criticism of conventional practices and the proposed solution extend to a broad range of applications where log-linearized equations are estimated. We develop the argument using one particular illustration, the gravity equation for trade. We find significant differences between estimates obtained with the proposed estimator and those obtained with the traditional method.

I. Introduction

ECONOMISTS have long been aware that Jensen's inequality implies that $E(\ln y) \neq \ln E(y)$, that is, the

[...]

models are severely biased, distorting the interpretation of the model. These biases might be critical for the comparative assessment of competing economic theories, as well as for the evaluation of the effects of different policies. In contrast, our method is robust to the different patterns of heteroskedasticity considered in the simulations.

We next use the proposed method to provide new estimates of the gravity equation in cross-sectional data. Using standard tests, we show that heteroskedasticity is indeed a severe problem, both in the traditional gravity equation introduced by Tinbergen (1962), and in a gravity equation that takes into account multilateral resistance terms or fixed effects, as suggested by Anderson and van Wincoop (2003).
The remainder of the paper is organized as follows. Section II studies the econometric problems raised by the estimation of gravity equations. Section III considers constant-elasticity models in general; it introduces the PML estimator and specification tests to check the adequacy of the proposed estimator. Section IV presents the Monte Carlo simulations. Section V provides new estimates of both the traditional and the Anderson–van Wincoop gravity equation. The results are compared with those generated by OLS, nonlinear least squares, and tobit estimations. Section VI contains concluding remarks.

II. The Econometrics of the Gravity Equation

[...]

where $\eta_{ij}$ is an error factor with $E(\eta_{ij} \mid Y_i, Y_j, D_{ij}) = 1$, assumed to be statistically independent of the regressors, leading to

$$E(T_{ij} \mid Y_i, Y_j, D_{ij}) = \alpha_0 Y_i^{\alpha_1} Y_j^{\alpha_2} D_{ij}^{\alpha_3}.$$

There is a long tradition in the trade literature of log-linearizing equation (2) and estimating the parameters of interest by least squares, using the equation

$$\ln T_{ij} = \ln \alpha_0 + \alpha_1 \ln Y_i + \alpha_2 \ln Y_j + \alpha_3 \ln D_{ij} + \ln \eta_{ij}. \tag{3}$$

The validity of this procedure depends critically on the assumption that $\eta_{ij}$, and therefore $\ln \eta_{ij}$, are statistically independent of the regressors. To see why this is so, notice
[...] Bergstrand (1985), Davis (1995), Deardoff (1998), and Anderson and van Wincoop (2003). A feature common to these models is that they all assume complete specialization: each good is produced in only one country. However, Haveman and Hummels (2001), Feenstra, Markusen, and Rose (2000), and Eaton and Kortum (2001) derive the gravity equation without relying on complete specialization. Examples of empirical studies framed on the gravity equation include the evaluation of trade protection (for example, Harrigan, 1993), regional trade agreements (for example, Frankel, Stein, & Wei, 1998; Frankel, 1997), exchange rate variability (for example, Frankel & Wei, 1993; Eichengreen & Irwin, 1995), and currency unions (for example, Rose, 2000; Frankel & Rose, 2002; and Tenreyro & Barro, 2002). See also the various studies on border effects influencing the patterns of intranational and international trade, including McCallum (1995), and Anderson and van Wincoop (2003), among others.

[...] explained, among other factors, by large variable costs (for example, bricks are too costly to transport) or large fixed costs (for example, information on foreign markets). At the aggregate level, these costs can be best proxied by the various measures of distance and size entering the gravity equation. The existence of zero trade between many pairs of countries is directly addressed by Hallak (2006) and Helpman, Melitz, and Rubinstein (2004). These authors propose a promising avenue of research using a two-part estimation procedure, with a fixed-cost equation determining the cutoff point above which a country exports, and a standard gravity equation. Their results, however, rely heavily on both normality and homoskedasticity assumptions, the latter being the particular concern of this paper. A natural topic for further research is to develop and implement an estimator of the two-part model that, like the PML estimator proposed here, is robust to distributional assumptions.
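A worked special case may help show why the log-linearized equation (3) needs $\ln \eta_{ij}$ to be independent of the regressors. The lognormal distribution below is an assumption made only for this illustration; the general argument does not rely on it. Suppose that, conditional on the regressors $x = (Y_i, Y_j, D_{ij})$, the error factor $\eta_{ij}$ is lognormal with mean 1 and variance $\sigma_{ij}^2(x)$. Then:

```latex
\begin{align*}
\ln \eta_{ij} \mid x &\sim N\!\left(-\tfrac{1}{2} s^2,\; s^2\right),
  \qquad s^2 = \ln\!\left(1 + \sigma_{ij}^2(x)\right), \\
E(\ln \eta_{ij} \mid x) &= -\tfrac{1}{2} \ln\!\left(1 + \sigma_{ij}^2(x)\right), \\
E(\ln T_{ij} \mid x) &= \ln \alpha_0 + \alpha_1 \ln Y_i + \alpha_2 \ln Y_j
  + \alpha_3 \ln D_{ij} - \tfrac{1}{2} \ln\!\left(1 + \sigma_{ij}^2(x)\right).
\end{align*}
```

If $\sigma_{ij}^2(x)$ is constant, the last term is absorbed by the intercept. If it varies with the regressors, the term acts like an omitted variable correlated with them, so the OLS slopes in equation (3) no longer recover $\alpha_1$, $\alpha_2$, and $\alpha_3$.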
THE LOG OF GRAVITY 643
[...] the use of the log linear form of the gravity equation. Several methods have been developed to deal with this problem [see Frankel (1997) for a description of the various procedures]. The approach followed by the large majority of empirical studies is simply to drop the pairs with zero trade from the data set and estimate the log linear form by OLS. Rather than throwing away the observations with $T_{ij} = 0$, some authors estimate the model using $T_{ij} + 1$ as the dependent variable or use a tobit estimator. However, these procedures will generally lead to inconsistent estimators of the parameters of interest. The severity of these inconsistencies will depend on the particular characteristics of the sample and model used, but there is no reason to believe that they will be negligible.

Zeros may also be the result of rounding errors.4 If trade

[...]

As before, log-linearization of equation (5) raises the problem of how to treat zero-value observations. Moreover, given that equation (5) is a multiplicative model, it is also subject to the biases caused by log-linearization in the presence of heteroskedasticity. Naturally, the presence of the individual effects may reduce the severity of this problem, but whether or not that happens is an empirical issue.

In our empirical analysis we provide estimates for both the traditional and the Anderson–van Wincoop gravity equations, using alternative estimation methods. We show that, in practice, heteroskedasticity is quantitatively and qualitatively important in the gravity equation, even when controlling for fixed effects. Hence, we recommend estimating the augmented gravity equation in levels, using the proposed PML estimator, which also adequately deals with the zero-
[...] from the characteristics of the data. Because $y_i$ is nonnegative, when $E[y_i \mid x]$ approaches 0, the probability of $y_i$ being positive must also approach 0. This implies that $V[y_i \mid x]$, the

[...]

practice, all we know about $V[y_i \mid x]$ is that, in general, it goes to 0 as $E[y_i \mid x]$ passes to 0. Therefore, an optimal weighted NLS estimator cannot be used without further information on the distribution of the errors. In principle, this problem can be tackled by estimating the multiplicative model using a consistent estimator, and then obtaining the appropriate weights estimating the skedastic function nonparametrically, as suggested by Delgado (1992) and Delgado and Kniesner (1997). However, this nonparametric generalized least squares estimator is rather cumbersome to implement, especially if the model has a large number of regressors. Moreover, the choice of the first-round estimator is an open question, as the NLS estimator may be a poor starting point due to its considerable inefficiency. Therefore, the nonparametric generalized least squares estimator is not appropriate

[...]

which implies the following set of first-order conditions:

$$\sum^{n} \left[ y_i - \exp(x_i \beta) \right] \ldots$$

[...]

natural to give the same weight to all observations.13 Even if $E[y_i \mid x]$ is not proportional to $V[y_i \mid x]$, the PML estimator based on equation (9) is likely to be more efficient than the NLS estimator when the heteroskedasticity increases with the conditional mean.

The estimator defined by equation (9) is numerically equal to the Poisson pseudo-maximum-likelihood (PPML) estimator, which is often used for count data.14 The form of equation (9) makes clear that all that is needed for this estimator to be consistent is the correct specification of the conditional mean, that is, $E[y_i \mid x] = \exp(x_i \beta)$. Therefore, the data do not have to be Poisson at all (and, what is more important, $y_i$ does not even have to be an integer) for the estimator based on the Poisson likelihood function to be consistent. This is the well-known PML result first noted by
[...] of the inefficient OLS, with an appropriate covariance matrix.

12 See also Manning and Mullahy (2001). A related estimator is proposed by Papke and Wooldridge (1996) for the estimation of models for fractional data.

[...] in their pseudo-maximum-likelihood estimator for fractional data models.

14 See Cameron and Trivedi (1998) and Winkelmann (2003) for more details on the Poisson regression and on more general models for count data.
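The point that the Poisson-based estimator only needs the conditional mean to be correctly specified can be illustrated with a short sketch. It solves first-order conditions of the form $\sum_i [y_i - \exp(x_i b)]\, x_i = 0$ by Newton-Raphson on continuous, non-integer data. The function name `ppml`, the choice of Newton-Raphson, the seed, and the sample size are all assumptions made for this illustration rather than anything prescribed in the text; the simulated data use $E[y \mid x] = \exp(x_1 + x_2)$ with a lognormal error whose variance makes $V[y \mid x]$ proportional to the mean.

```python
import numpy as np


def ppml(X, y, tol=1e-10, max_iter=100):
    """Solve sum_i [y_i - exp(x_i b)] x_i = 0 by Newton-Raphson.

    X must contain a constant in its first column; y only has to be
    nonnegative, not integer-valued. (Illustrative helper, not a library API.)
    """
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())            # start at the constant-only solution
    for _ in range(max_iter):
        mu = np.exp(X @ beta)
        score = X.T @ (y - mu)            # the first-order conditions
        hessian = X.T @ (X * mu[:, None])
        step = np.linalg.solve(hessian, score)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta


# Hypothetical data: continuous y with E[y|x] = exp(x1 + x2).
rng = np.random.default_rng(0)
n = 5000
x1 = rng.standard_normal(n)
x2 = (rng.random(n) < 0.4).astype(float)
X = np.column_stack([np.ones(n), x1, x2])
mu_true = np.exp(x1 + x2)
s2 = np.log1p(1.0 / mu_true)              # log-variance giving V[eta|x] = 1/mu
eta = rng.lognormal(mean=-0.5 * s2, sigma=np.sqrt(s2))
y = mu_true * eta                         # none of these values is an integer

beta_hat = ppml(X, y)
print(beta_hat)                           # estimates of (0, 1, 1)
```

In applied work a packaged Poisson regression routine with robust standard errors would normally replace the hand-written solver; the sketch only makes the estimating equations explicit.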
646 THE REVIEW OF ECONOMICS AND STATISTICS
$$\ln (y_i - \breve{y}_i)^2 = \ln \lambda_0 + \lambda_1 \ln \breve{y}_i + v_i, \tag{11}$$

where $\breve{y}_i$ denotes the estimated value of $E[y_i \mid x]$. Unfortunately, as the discussion in the previous sections should have made clear, this approach based on the log-linearization of equation (10) is valid only under very restrictive conditions on the conditional distribution of $y_i$. However, it is easy to see that this procedure is valid when the constant-elasticity model can be consistently estimated in the log linear form. Therefore, using equation (11) a test for $H_0: \lambda_1 = 2$ based on a nonrobust covariance estimator provides a check on the adequacy of the estimator based on the log linear model.

A more robust alternative, which is mentioned by Manning and Mullahy (2001) in a footnote, is to estimate $\lambda_1$ from

[...]

by OLS and testing the statistical significance of $\lambda_0(\lambda_1 - 1)$ using an Eicker-White robust covariance matrix estimator.17

In the next section, a small simulation is used to study the Gauss-Newton regression test for the hypothesis that $V[y_i \mid x] \propto E[y_i \mid x]$, as well as the Park-type test for the hypothesis that the constant-elasticity model can be consistently estimated in the log linear form.

IV. A Simulation Study

This section reports the results of a small simulation study designed to assess the performance of different methods to estimate constant-elasticity models in the presence of heteroskedasticity and rounding errors. As a by-product, we also obtain some evidence on the finite-sample performance of the specification tests presented above. These experiments are centered around the following multiplicative model:

15 Frankel and Wei (1993) and Frankel (1997) suggest that larger countries should be given more weight in the estimation of gravity equations. This would be appropriate if the errors in the model were just the result of measurement errors in the dependent variable. However, if it is accepted that the gravity equation does not hold exactly, measurement errors account for only part of the dispersion of trade data around the gravity equation.

16 It is worth noting that the PPML estimator can be easily adapted to deal with endogenous regressors (Windmeijer & Santos Silva, 1997) and panel data (Wooldridge, 1999). These extensions, however, are not pursued here.

17 Notice that to test $V[y_i \mid x] \propto E[y_i \mid x]$ against alternatives of the form $V[y_i \mid x] = \lambda_0 \exp[x_i(\beta + \delta)]$, the appropriate auxiliary regression would be

$$(y_i - \breve{y}_i)^2 / \sqrt{\breve{y}_i} = \lambda_0 \sqrt{\breve{y}_i} + \lambda_0 x_i \delta \sqrt{\breve{y}_i} + \xi_i^{*},$$

and the test could be performed by checking the joint significance of the elements of $\lambda_0 \delta$. If the model includes a constant, one of the regressors in the auxiliary regression is redundant and should be dropped.
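The Park-type check based on equation (11) can be sketched in a few lines: regress the log of the squared residuals on a constant and the log of the fitted values, then test whether the slope equals 2 with the usual nonrobust covariance matrix. The helper names `park_regression` and `simulate` are hypothetical, and the demo takes a shortcut that should be kept in mind: it plugs in the true conditional mean where fitted values from a first-step estimate would normally go. The two simulated designs (variance proportional to the squared mean, and variance proportional to the mean) are assumptions chosen for illustration.

```python
import numpy as np


def park_regression(y, y_fit):
    """OLS of ln (y - y_fit)^2 on a constant and ln y_fit.

    Returns the slope estimate and the t-statistic for H0: slope = 2,
    using the usual (nonrobust) OLS covariance matrix.
    """
    z = np.log((y - y_fit) ** 2)
    W = np.column_stack([np.ones_like(y_fit), np.log(y_fit)])
    coef, *_ = np.linalg.lstsq(W, z, rcond=None)
    resid = z - W @ coef
    sigma2 = resid @ resid / (len(z) - W.shape[1])
    cov = sigma2 * np.linalg.inv(W.T @ W)
    t_stat = (coef[1] - 2.0) / np.sqrt(cov[1, 1])
    return coef[1], t_stat


def simulate(rng, n, variance_of_eta):
    """Draw y = mu * eta with lognormal eta and E[eta|x] = 1."""
    x1 = rng.standard_normal(n)
    x2 = (rng.random(n) < 0.4).astype(float)
    mu = np.exp(x1 + x2)
    s2 = np.log1p(variance_of_eta(mu))
    eta = rng.lognormal(mean=-0.5 * s2, sigma=np.sqrt(s2))
    return mu * eta, mu


rng = np.random.default_rng(1)
n = 20000
# Demo shortcut (assumption): the true mean stands in for fitted values.
y_a, mu_a = simulate(rng, n, lambda mu: np.ones_like(mu))  # V[y|x] prop. to mu^2
y_b, mu_b = simulate(rng, n, lambda mu: 1.0 / mu)          # V[y|x] prop. to mu
slope_a, t_a = park_regression(y_a, mu_a)
slope_b, t_b = park_regression(y_b, mu_b)
print("variance prop. to mu^2:", slope_a, t_a)
print("variance prop. to mu:  ", slope_b, t_b)
```

In the first design the log-linear model is valid and the slope should be close to 2; in the second the slope falls well below 2, which is the pattern the test is meant to detect.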
$$E[y_i \mid x] = \mu(x_i \beta) = \exp(\beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i}), \qquad i = 1, \ldots, 1000. \tag{14}$$

Because, in practice, regression models often include a mixture of continuous and dummy variables, we replicate this feature in our experiments: $x_{1i}$ is drawn from a standard normal, and $x_2$ is a binary dummy variable that equals 1 with a probability of 0.4.18 The two covariates are independent, and a new set of observations of all variables is generated in each replication using $\beta_0 = 0$, $\beta_1 = \beta_2 = 1$. Data on $y$ are generated as

$$y_i = \mu(x_i \beta)\, \eta_i, \tag{15}$$

where $\eta_i$ is a log normal random variable with mean 1 and

[...]

not only corrects the heteroskedasticity in the data, but, because $\eta_i$ is log normal, it is also the maximum likelihood estimator. The GPML is the optimal PML estimator in this case, but it should be outperformed by the true maximum likelihood estimator. Finally, case 4 is the only one in which the conditional variance does not depend exclusively on the mean. The variance is a quadratic function of the mean, as in case 3, but it is not proportional to the square of the mean.

We carried out two sets of experiments. The first set was aimed at studying the performance of the estimators of the multiplicative and the log linear models under different patterns of heteroskedasticity. In order to study the effect of the truncation on the performance of the OLS, and given that this data-generating mechanism does not produce observations with $y_i = 0$, the log linear model was also
As expected, OLS only performs well in case 3. In all other cases this estimator is clearly inadequate because, despite its low dispersion, it is often badly biased. Moreover, the sign and magnitude of the bias vary considerably. Therefore, even when the dependent variable is strictly positive, estimation of constant-elasticity models using the log-linearized model cannot generally be recommended. As for the modifications of the log-linearized model designed to deal with the zeros of the dependent variable (ET-tobit, OLS($y + 1$), and OLS($y > 0.5$)), their performance is also very disappointing. These results clearly emphasize the need to use adequate methods to deal with the zeros in the data and raise serious doubts about the validity of the results obtained using the traditional estimators based on the log linear model. Overall, except under very special circumstances, estimation based on the log-linear model cannot be recommended.

One remarkable result of this set of experiments is the extremely poor performance of the NLS estimator. Indeed, when the heteroskedasticity is more severe (cases 3 and 4), this estimator, despite being consistent, leads to very poor results because of its erratic behavior.20 Therefore, it is clear that the loss of efficiency caused by some of the forms of heteroskedasticity considered in these experiments is strong enough to render this estimator useless in practice.

In the first set of experiments, the results of the gamma PML estimator are very good. Indeed, when no measurement error is present, the biases and standard errors of the GPML estimator are always among the lowest. However, this estimator is very sensitive to the form of measurement error considered in the second set of experiments, consistently leading to sizable biases. These results, like those of the NLS, clearly illustrate the danger of using a PML estimator that gives extra weight to the noisier observations.

As for the performance of the Poisson PML estimator, the results are very encouraging. In fact, when no rounding error is present, its performance is reasonably good in all

20 Manning and Mullahy (2001) report similar results.
cases. Moreover, although some loss of efficiency is noticeable as one moves away from case 2, in which it is an optimal estimator, the biases of the PPML are always small.21 Moreover, the results obtained with rounded data suggest that the Poisson-based PML estimator is relatively robust to this form of measurement error of the dependent variable. Indeed, the bias introduced by the rounding-off errors in the dependent variable is relatively small, and in some cases it even compensates the bias found in the first set of experiments. Therefore, because it is simple to implement and reliable in a wide variety of situations, the Poisson PML estimator has the essential characteristics needed to make it the new workhorse for the estimation of constant-elasticity models.

TABLE 2. REJECTION FREQUENCIES AT THE 5% LEVEL FOR THE TWO SPECIFICATION TESTS

                      Frequency
Case        GNR Test        Park Test

Without Measurement Error
1           0.91980         1.00000
2           0.05430         1.00000
3           0.58110         0.06680
4           0.49100         0.40810

With Measurement Error
1           0.91740         1.00000
2           0.14980         1.00000
3           0.57170         1.00000
4           0.47580         1.00000
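The contrast between the log-linear OLS and the Poisson PML estimator can be reproduced in miniature. The sketch below follows the design described in this section (a standard normal $x_1$, a dummy $x_2$ equal to 1 with probability 0.4, $\beta_0 = 0$, $\beta_1 = \beta_2 = 1$, a lognormal error with mean 1, and 1,000 observations per replication) for the single pattern in which the conditional variance is proportional to the conditional mean. The number of replications, the seed, the function names, and the Newton-Raphson solver are assumptions of this illustration and are not taken from the experiments reported here.

```python
import numpy as np


def ppml(X, y, tol=1e-10, max_iter=100):
    """Newton-Raphson solution of sum_i [y_i - exp(x_i b)] x_i = 0."""
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())
    for _ in range(max_iter):
        mu = np.exp(X @ beta)
        step = np.linalg.solve(X.T @ (X * mu[:, None]), X.T @ (y - mu))
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def one_replication(rng, n=1000):
    """Generate one sample and return (log-linear OLS, PPML) estimates."""
    x1 = rng.standard_normal(n)
    x2 = (rng.random(n) < 0.4).astype(float)
    X = np.column_stack([np.ones(n), x1, x2])
    mu = np.exp(X @ np.array([0.0, 1.0, 1.0]))
    s2 = np.log1p(1.0 / mu)               # V[eta|x] = 1/mu, so V[y|x] = mu
    y = mu * rng.lognormal(mean=-0.5 * s2, sigma=np.sqrt(s2))
    ols, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)
    return ols, ppml(X, y)


rng = np.random.default_rng(2)
draws = [one_replication(rng) for _ in range(200)]   # 200 is an arbitrary choice
ols_mean = np.mean([d[0] for d in draws], axis=0)
ppml_mean = np.mean([d[1] for d in draws], axis=0)
print("log-linear OLS:", ols_mean)
print("PPML:          ", ppml_mean)
```

Averaged over replications, the OLS slope on $x_1$ should sit visibly above its true value of 1, while the PPML averages stay close to the true coefficients.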
Obviously, the sign and magnitude of the bias of the
[...] cases are, respectively, 0, 1, and 2.

[...] of language similarity are available, at request, from the authors.
[...] a description of the variables and displays the summary statistics.

B. Results

The Traditional Gravity Equation: Table 3 presents the estimation outcomes resulting from the various techniques for the traditional gravity equation. The first column reports OLS estimates using the logarithm of exports as the dependent variable; as noted before, this regression leaves out pairs of countries with zero bilateral trade (only 9,613 country pairs, or 52% of the sample, exhibit positive export flows).

The second column reports the OLS estimates using $\ln(1 + T_{ij})$ as dependent variable, as a way of dealing with zeros. The third column presents tobit estimates based on Eaton and Tamura (1994). The fourth column shows the results of standard NLS. The fifth column reports Poisson estimates using only the subsample of positive-trade pairs. Finally, the sixth column shows the Poisson results for the whole sample (including zero-trade pairs).

The first point to notice is that PPML-estimated coefficients are remarkably similar using the whole sample and using the positive-trade subsample.24 However, most coefficients differ, oftentimes significantly, from those obtained using OLS. This suggests that in this case, heteroskedasticity (rather than truncation) is responsible for the differences between PPML results and those of OLS using only the observations with positive exports. Further evidence on the importance of the heteroskedasticity is provided by the two-degrees-of-freedom special case of White's test for heteroskedasticity (see Wooldridge, 2002, p. 127), which leads to a test statistic of 476.6 and to a p-value of 0. That is, the null hypothesis of homoskedastic errors is unequivocally rejected.

Poisson estimates reveal that the coefficients on importer's and exporter's GDPs in the traditional equation are not, as generally believed, close to 1. The estimated GDP elasticities are just above 0.7 (s.e. = 0.03). OLS generates significantly larger estimates, especially on exporter's GDP (0.94, s.e. = 0.01). Although all these results are conditional on the particular specification used,25 it is worth pointing out that unit income elasticities in the simple gravity framework are at odds with the observation that the trade-to-GDP ratio decreases with increasing total GDP, or, in other words, that smaller countries tend to be more open to international trade.26

24 The reason why truncation has little effect in this case is that observations with zero trade correspond to pairs for which the estimated value of trade is close to zero. Therefore, the corresponding residuals are also close to zero, and their elimination from the sample has little effect.

25 This result holds when one looks at the subsample of OECD countries. It is also robust to the exclusion of GDP per capita from the regressions.

26 Note also that PPML predicts almost equal coefficients for the GDPs of exporters and importers.
TABLE 4. RESULTS OF THE TESTS FOR TYPE OF HETEROSKEDASTICITY (p-VALUES)

Test (Null Hypothesis)                          Exports > 0     Full Sample
GNR ($V[y_i \mid x] \propto \mu(x_i \beta)$)    0.144           0.115
Park (OLS is valid)                             0.000           0.000

The role of geographical distance as trade deterrent is significantly larger under OLS; the estimated elasticity is −1.17 (s.e. = 0.03), whereas the Poisson estimate is −0.78 (s.e. = 0.06). This lower estimate suggests a smaller role for transport costs in the determination of trade patterns. Furthermore, Poisson estimates indicate that, after controlling for bilateral distance, sharing a border does not influence trade flows, whereas OLS, instead, generates a substantial effect: It predicts that trade between two contiguous coun-

[...]

agreement dummy will not reflect the net effect of trade agreements. To account for the possibility of diversion, we include an additional dummy, openness, similar to that used by Frankel (1997). This dummy takes the value 1 whenever one (or both) of the countries in the pair is part of a preferential trade agreement, and thus it captures the extent of trade between members and nonmembers of a preferential trade agreement. The sum of the coefficients on the trade agreement and the openness dummies gives the net creation effect of trade agreements. OLS suggests that trade destruction comes from trade agreements. Still, the net creation effect is around 40%. In contrast, Poisson regressions provide no significant evidence of trade diversion, although the point estimates are of the same order of magnitude under both methods.
TABLE 6. RESULTS OF THE TESTS FOR TYPE OF HETEROSKEDASTICITY (p-VALUES)

Test (Null Hypothesis)                          Exports > 0     Full Sample
GNR ($V[y_i \mid x] \propto \mu(x_i \beta)$)    0.100           0.070
Park (OLS is valid)                             0.000           0.000

[...] two techniques produce reasonably similar estimates for the coefficient on the trade-agreement dummy, implying a trade-enhancement effect of the order of 40%.

As before, the other estimation methods lead to some puzzling results. For example, OLS on $\ln(1 + T_{ij})$ now yields a significantly negative effect of contiguity, and under NLS, the coefficient on common colonial ties becomes significantly negative.

To complete the study, we performed the same set of

[...]

The basic problem is that log-linearization (or, indeed, any nonlinear transformation) of the empirical model in the presence of heteroskedasticity leads to inconsistent estimates. This is because the expected value of the logarithm of a random variable depends on higher-order moments of its distribution. Therefore, if the errors are heteroskedastic, the transformed errors will be generally correlated with the covariates. An additional problem of log-linearization is that it is incompatible with the existence of zeros in trade data, which led to several unsatisfactory solutions, including truncation of the sample (that is, elimination of zero-trade pairs) and further nonlinear transformations of the dependent variable.

We argue that the biases are present both in the traditional specification of the gravity equation and in the Anderson–
Boisso, D., and M. Ferrantino, "Economic and Cultural Distance in International Trade: Empirical Puzzles," Journal of Economic Integration 12 (1997), 456–484.
Cameron, A. C., and P. K. Trivedi, Regression Analysis of Count Data (Cambridge: Cambridge University Press, 1998).
Central Intelligence Agency, World Factbook, http://www.cia.gov/cia/publications/factbook/ (2002).
Davidson, R., and J. G. MacKinnon, Estimation and Inference in Econometrics (Oxford: Oxford University Press, 1993).
Davis, D., "Intra-industry Trade: A Hecksher-Ohlin-Ricardo Approach," Journal of International Economics 39 (1995), 201–226.
Deardoff, A., "Determinants of Bilateral Trade: Does Gravity Work in a Neoclassical World?" in Jeffrey Frankel (Ed.), The Regionalization of the World Economy (Chicago: University of Chicago Press, 1998).
Delgado, M., "Semiparametric Generalized Least Squares Estimation in the Multivariate Nonlinear Regression Model," Econometric Theory 8 (1992), 203–222.
Delgado, M., and T. J. Kniesner, "Count Data Models with Variance of Unknown Form: An Application to a Hedonic Model of Worker
Hallak, J. C., "Product Quality and the Direction of Trade," Journal of International Economics 68 (2006), 238–265.
Harrigan, J., "OECD Imports and Trade Barriers in 1983," Journal of International Economics 35 (1993), 95–111.
Haveman, J., and D. Hummels, "Alternative Hypotheses and the Volume of Trade: The Gravity Equation and the Extent of Specialization," Purdue University mimeograph (2001).
Helpman, E., and P. Krugman, Market Structure and Foreign Trade (Cambridge, MA: MIT Press, 1985).
Helpman, E., M. Melitz, and Y. Rubinstein, "Trading Patterns and Trading Volumes," Harvard University mimeograph (2004).
Koenker, R., and G. S. Bassett, Jr., "Regression Quantiles," Econometrica 46 (1978), 33–50.
Manning, W. G., and J. Mullahy, "Estimating Log Models: To Transform or Not to Transform?" Journal of Health Economics 20 (2001), 461–494.
McCallum, J., "National Borders Matter: Canada-US Regional Trade Patterns," American Economic Review 85 (1995), 615–623.
McCullagh, P., and J. A. Nelder, Generalized Linear Models, 2nd ed. (London: Chapman and Hall, 1989).
APPENDIX
CER PATCRA
Australia Australia
New Zealand Papua New Guinea