"Reflections on the probability space induced by 
moment conditions with implications for Bayesian 
inference": a discussion 
Christian P. Robert 
Université Paris-Dauphine, Paris & University of Warwick, Coventry 
what is the question? 
what could the question be? 
what is the answer? 
what could the answer be?
what is the question? 
"If one specifies a set of moment functions collected 
together into a vector $m(x, \theta)$ of dimension $M$, regards $\theta$ 
as random and asserts that some transformation $Z(x, \theta)$ 
has distribution $\psi$, then what is required to use this 
information and then possibly a prior to make valid 
inference?" R. Gallant, p.4
Priors without efforts 
- quest for model-induced priors dating back to the early 1900s 
  [Lhoste, 1923] 
- reference priors such as Jeffreys' prior induced by sampling 
  [Jeffreys, 1939] 
- fiducial distributions as Fisher's attempted answer 
  [Fisher, 1956]
When considering 
$$t = \frac{\bar{x} - \theta}{s/\sqrt{n}}$$
the ratio has a frequentist t distribution with $n - 1$ degrees of freedom. 
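However, there is no equivalent justification for asserting that 
$$t = \frac{\bar{x} - \theta}{s/\sqrt{n}}$$
has a t posterior distribution with $n - 1$ degrees of freedom on $\theta$, 
given $(\bar{x}, s)$, except when using a non-informative and improper 
prior $\pi(\theta, \sigma^2) \propto 1/\sigma^2$ since, then, 
$$\theta \mid \bar{x}, s \sim \mathcal{T}_{n-1}(\bar{x}, s/\sqrt{n})$$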
Furthermore, neither the Bayesian nor the frequentist interpretation implies that 
$$t = \frac{\bar{x} - \theta}{s/\sqrt{n}}$$
has a t posterior distribution with $n - 1$ degrees of freedom jointly
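The posterior statement under the improper prior can be checked by simulation. The sketch below is only an illustration under stated assumptions: a normal sample, $s$ taken as the sample standard deviation with divisor $n - 1$, and made-up values for $\bar{x}$, $s$ and $n$. It draws $(\theta, \sigma^2)$ from the posterior under $\pi(\theta, \sigma^2) \propto 1/\sigma^2$ and compares the sample variance of the standardized $\theta$ with the variance $(n-1)/(n-3)$ of a t distribution with $n - 1$ degrees of freedom. The function name `sample_posterior_t` is hypothetical.

```python
import math
import random

def sample_posterior_t(xbar, s, n, draws, rng):
    """Draw theta from the posterior under pi(theta, sigma^2) ∝ 1/sigma^2
    (normal sample assumed) and return (theta - xbar) / (s / sqrt(n))."""
    out = []
    for _ in range(draws):
        # sigma^2 | x ~ (n-1) s^2 / chi2_{n-1}; chi2_k is Gamma(shape=k/2, scale=2)
        chi2 = rng.gammavariate((n - 1) / 2.0, 2.0)
        sigma2 = (n - 1) * s * s / chi2
        # theta | sigma^2, x ~ N(xbar, sigma^2 / n)
        theta = rng.gauss(xbar, math.sqrt(sigma2 / n))
        out.append((theta - xbar) / (s / math.sqrt(n)))
    return out

rng = random.Random(0)
n = 10  # made-up sample size
t_draws = sample_posterior_t(xbar=1.3, s=2.0, n=n, draws=100_000, rng=rng)
mean_t = sum(t_draws) / len(t_draws)
var_t = sum((t - mean_t) ** 2 for t in t_draws) / (len(t_draws) - 1)
# The sample variance should be close to (n-1)/(n-3), the t_{n-1} variance
print(mean_t, var_t, (n - 1) / (n - 3))
```

Here $(\bar{x}, s)$ is held fixed and only $\theta$ varies, which is the conditional (posterior) reading of the t statement, not the joint one.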
what could the question be? 
Given a set of moment equations 
$$\mathbb{E}[m(X_1, \ldots, X_n, \theta)] = 0$$
(where both the $X_i$'s and $\theta$ are random), can one derive a 
likelihood function and a prior distribution compatible with those constraints? 
coherence across sample sizes n 
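Highly complex question since it implies the integral equation 
$$\int m(x_1, \ldots, x_n, \theta)\, \pi(\theta)\, f(x_1 \mid \theta) \cdots f(x_n \mid \theta)\, \mathrm{d}\theta\, \mathrm{d}x_1 \cdots \mathrm{d}x_n = 0$$
must or should have a solution in $(\pi, f)$ for all $n$'s. 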
possible outside of a likelihood $\times$ prior modelling?
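The easy direction of this question, going from a likelihood $\times$ prior model to a moment equation that holds for every $n$, can be illustrated numerically. The sketch below assumes a hypothetical joint model, $\theta \sim N(0, 1)$ and $X_i \mid \theta \sim N(\theta, 1)$, together with the hypothetical moment function $m = \bar{x}_n - \theta$. None of these choices comes from the discussed paper. It estimates the integral by Monte Carlo for several sample sizes. The reverse direction (recovering $(\pi, f)$ from $m$) is the hard problem and is not addressed by this code.

```python
import random

def moment_average(n, draws, rng):
    """Monte Carlo estimate of E[m(X_1..X_n, theta)] with m = xbar_n - theta,
    under the hypothetical joint theta ~ N(0,1), X_i | theta ~ N(theta, 1)."""
    total = 0.0
    for _ in range(draws):
        theta = rng.gauss(0.0, 1.0)                               # prior draw
        xbar = sum(rng.gauss(theta, 1.0) for _ in range(n)) / n   # sample mean
        total += xbar - theta                                     # moment function
    return total / draws

rng = random.Random(1)
# The estimate should be close to zero for each sample size n
estimates = {n: moment_average(n, 20_000, rng) for n in (1, 5, 25)}
print(estimates)
```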
Zellner's Bayesian method of moments 
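Given moment conditions on parameters $\theta$ and $\sigma^2$ 
$$\mathbb{E}[\theta \mid x_1, \ldots, x_n] = \bar{x}_n, \quad \mathbb{E}[\sigma^2 \mid x_1, \ldots] = s_n^2, \quad \mathrm{var}(\theta \mid \sigma^2, x_1, \ldots) = \sigma^2/n$$
derivation of a maximum entropy posterior 
$$\theta \mid \sigma^2, x_1, \ldots \sim \mathcal{N}(\bar{x}_n, \sigma^2/n), \quad \sigma^{-2} \mid x_1, \ldots \sim \mathcal{E}xp(s_n^2)$$
[Zellner, 1996] 
but incompatible with corresponding predictive distribution 
[Geisser & Seidenfeld, 1999]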
what is the answer? 
Under the condition that $Z(\cdot, \theta)$ is surjective, 
$$p^{\star}(x \mid \theta) = \psi(Z(x, \theta))$$
and arbitrary choice of prior $\pi(\theta)$ 
- lhs and rhs operate on different spaces 
- no reason why density $\psi$ should integrate against Lebesgue 
  measure in $n$-dimensional Euclidean space 
- no direct connection with a genuine likelihood function, i.e., 
  product of the densities of the $X_i$'s (conditional on $\theta$)
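To make the construction concrete, the sketch below evaluates $\psi(Z(x, \theta))$ when $Z$ is the t statistic and $\psi$ the Student t density with $n - 1$ degrees of freedom, then combines it with an arbitrary prior on a grid. The data, the grid and the $N(0, 10^2)$ prior kernel are made up for illustration, and the helper names are hypothetical. The resulting curve is normalized in $\theta$ only; as noted in the list above, nothing guarantees that it is a density in $x$.

```python
import math
import statistics

def t_density(z, df):
    """Student t density with df degrees of freedom, evaluated at z."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(z * z / df))

def pseudo_likelihood(x, theta):
    """psi(Z(x, theta)) with Z the t statistic and psi the t_{n-1} density."""
    n = len(x)
    z = (statistics.fmean(x) - theta) / (statistics.stdev(x) / math.sqrt(n))
    return t_density(z, n - 1)

def grid_posterior(x, grid, prior):
    """Normalize prior(theta) * psi(Z(x, theta)) over an equally spaced grid."""
    step = grid[1] - grid[0]
    weights = [prior(th) * pseudo_likelihood(x, th) for th in grid]
    total = sum(weights) * step
    return [w / total for w in weights]

x = [1.2, 0.4, 2.1, 1.7, 0.9, 1.5]                     # made-up data
grid = [i * 0.01 for i in range(-300, 601)]            # theta in [-3, 6]
prior = lambda th: math.exp(-0.5 * (th / 10.0) ** 2)   # arbitrary N(0, 10^2) kernel
post = grid_posterior(x, grid, prior)
post_mean = sum(th * p for th, p in zip(grid, post)) * 0.01
print(round(post_mean, 3), round(statistics.fmean(x), 3))
```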
what could the answer be? 
"A common situation that requires consideration of the 
notions that follow is that deriving the likelihood from a 
structural model is analytically intractable and one 
cannot verify that the numerical approximations one 
would have to make to circumvent the intractability are 
sufficiently accurate." R. Gallant, p.7
Approximative Bayesian answers 
Defining a joint distribution on $(\theta, x_1, \ldots, x_n)$ through moment 
equations prevents regular Bayesian inference as the likelihood is unavailable. 
There may be alternatives available: 
- Approximate Bayesian computation (ABC) and empirical 
  likelihood based Bayesian inference 
  [Tavaré et al., 1999; Owen, 201; Mengersen et al., 2013] 
- INLA (Laplace), EP (expectation propagation) 
  [Martino et al., 2008; Barthelmé & Chopin, 2014] 
- variational Bayes 
  [Jaakkola & Jordan, 2000]
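As an illustration of the first alternative, here is a minimal ABC rejection sampler. It is a sketch under stated assumptions, not the algorithm of any cited reference: the model ($X_i \mid \theta \sim N(\theta, 1)$), the uniform prior, the sample mean as summary statistic, the tolerance `eps` and the function name `abc_rejection` are all hypothetical choices. Only the ability to simulate data given $\theta$ is used; no likelihood is evaluated.

```python
import random
import statistics

def abc_rejection(observed, prior_sampler, simulator, summary, eps, proposals, rng):
    """Keep prior draws whose simulated summary lands within eps of the observed one."""
    target = summary(observed)
    accepted = []
    for _ in range(proposals):
        theta = prior_sampler(rng)                    # draw from the prior
        fake = simulator(theta, len(observed), rng)   # simulate pseudo-data
        if abs(summary(fake) - target) <= eps:        # compare summaries
            accepted.append(theta)
    return accepted

rng = random.Random(2)
observed = [rng.gauss(1.0, 1.0) for _ in range(10)]   # synthetic "data"
accepted = abc_rejection(
    observed,
    prior_sampler=lambda r: r.uniform(-5.0, 5.0),                       # hypothetical prior
    simulator=lambda th, n, r: [r.gauss(th, 1.0) for _ in range(n)],    # hypothetical model
    summary=statistics.fmean,                                           # moment-type summary
    eps=0.1,
    proposals=30_000,
    rng=rng,
)
abc_mean = statistics.fmean(accepted)
print(len(accepted), round(abc_mean, 3), round(statistics.fmean(observed), 3))
```

Shrinking `eps` tightens the approximation at the cost of fewer accepted draws; a moment function could serve as the summary statistic in the same way the sample mean does here.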

