Applications of Random Set Theory in Econometrics: Ilya Molchanov
Ilya Molchanov
Francesca Molinari
partial identification
Abstract The econometrics literature has in recent years shown a growing interest in the
study of partially identified models, where the object of economic and statistical interest is a set
rather than a point. Characterization of this set and development of its consistent estimators
and of inference procedures with desirable properties are the main goals of partial identification
analysis. This review introduces the fundamental tools of the theory of random sets, which
brings together elements of topology, convex geometry and probability theory to develop a
coherent mathematical framework to analyze random elements whose realizations are sets. It
then elucidates how these tools have been fruitfully applied in econometrics, to reach the goals
Annu. Rev. Econ. 2014 6 1941-1383/14/0904-????
CONTENTS
INTRODUCTION
Random Sets
APPLICATIONS TO INFERENCE
Duality Between the Level Set Approach and the Support Function Approach
CONCLUSIONS
1 INTRODUCTION
framework to study random objects whose realizations are sets. Such objects
appeared a long time ago in statistics and econometrics in the form of confidence
regions, which can be naturally described as random sets. The first idea of a
general random set in the form of a region that depends on chance appears in
the theory of random sets did not occur until some time later, stimulated by
and material science, of statistical techniques to develop models for random sets,
estimate their parameters, filter noisy images, and classify biological images.
These and other related applications of set valued random variables induced the
Aumann (1965), Debreu (1967) and to the first self contained treatment of the
theory of random sets given by Matheron (1975). Since then the theory expanded
limit theorems for random sets, set valued processes, etc. An account of the
analysis has provided a new and natural area of application for random set theory.
Partially identified econometric models appear when the available data and
tional of interest, whether finite or infinite dimensional, even as data accumulate;
see Tamer (2010) for a review and Manski (2003) for a systematic treatment.
For this class of models, partial identification proposes that econometric
analysis should study the set of values for the statistical functional that
are observationally equivalent, given the available data and credible maintained
sharp identification region. The goals of the analysis are to obtain a tractable
mating it, and to conduct tests of hypotheses and make confidence statements
about it.
ued to set valued objects, which renders it naturally suited for the use of ran-
and statistical inference, and to unify a number of special results and produce
novel general results. The random sets approach complements the more tradi-
tional one, based on mathematical tools for (single valued) random vectors, that
proved extremely productive since the beginning of the research program in par-
tial identification; see, for example, Manski (1995) for results on identification,
and Horowitz and Manski (2000), Imbens and Manski (2004), Chernozhukov,
Hong and Tamer (2007), and Andrews and Soares (2010) for results on statistical
inference.
random variables that are consistent with the available data and maintained
set, and random set theory can be applied to describe their distribution and to
derive statistical properties of estimators that rely upon them. Specific examples
that we discuss in detail in this article include interval data and finite static games
with multiple equilibria. In the first case, the random variables consistent with
the data are those that lie in the interval with probability one. In the second
case, the random variables consistent with the modeling assumptions are the ones
In order to fruitfully apply random set theory for identification and inference,
the econometrician needs to carry out three fundamental steps. First, she needs to
define the random closed set that is relevant for the problem under consideration
using all information given by the available data and maintained assumptions.
This is a delicate task, but one that is typically carried out in identification anal-
ysis regardless of whether random set theory is applied. Second, she needs to
determine how the observable random variables relate to this random closed set.
Often, one of two cases occurs: either the observable variables determine a ran-
dom set to which the (unobservable) variable of interest belongs with probability
one, e.g. the interval data example; or the (expectation of the) (un)observable
variable belongs to (the expectation of) a random set determined by the model,
e.g. the games with multiple equilibria example. Finally, the econometrician
needs to determine which tool from random set theory should be utilized. To
date, new applications of random set theory to econometrics have fruitfully ex-
functionals, and laws of large numbers and central limit theorems for random
sets.
The goal of this review is to provide a guide to the study of random sets theory
preparation, Molchanov and Molinari 2014). Our view is that the instruction of
random sets theory could be fruitfully incorporated into Ph.D.-level field courses
ory. Important prerequisites for the study of random sets theory include measure
theory and probability theory; good knowledge of convex analysis and topology
Throughout this article, we use capital Latin letters to denote sets and random
sets. We use lower case Latin letters for random vectors. We denote parameter
denote a nonatomic probability space on which all random variables and random
sets are defined, where Ω is the space of elementary events equipped with σ-
The theory of random closed sets generally applies to the space of closed sub-
$B^d = \{x \in \mathbb{R}^d : \|x\| \le 1\}$ denote respectively the unit sphere and the unit ball in
The conventional theory of random sets deals with random closed sets. An advan-
tage of this approach is that random points (i.e. random sets that are singletons)
are closed, and so the theory of random closed sets includes the classical case
a balance is sought between a need for weak conditions, so that there is a
large class of examples of random sets, and a need for strict conditions, so that
important functionals of random sets are random variables. This trade-off results
$$X^{-}(K) = \{\omega : X(\omega) \cap K \neq \emptyset\} \in \mathfrak{F}$$
In other words, a random closed set is a measurable map from the given prob-
ability space to the family of closed sets equipped with the σ-algebra generated
compact set is defined as a random closed set which is compact with probability
one, so that almost all values of X are compact sets. A convex random set is
defined similarly, so that X(ω) is a convex closed set for almost all ω.
nomics and the social sciences. Let Y = [yL , yU ] be a random interval on R where
yL and yU are two (dependent) random variables such that yL ≤ yU almost surely.
because yL and yU are random variables. Measurability for all compact sets
Example 2.3 (Entry game). Consider a two player entry game as in Tamer
(2003), where each player j can choose to enter (yj = 1) or to stay out of the
enters the game if and only if πj ≥ 0. Then, for given values of θ1 and θ2 , the
equilibrium of the game is unique, while for (ε1 , ε2 ) ∈ [0, −θ1 ) × [0, −θ2 ) the game
To see that Yθ is a random closed set, notice that in this example one can take
K = {(0, 0), (1, 0), (0, 1), (1, 1)}, and that all its subsets are compact. Then
$$\{Y_\theta \cap K \neq \emptyset\} = \{(\varepsilon_1, \varepsilon_2) \in G_K\} \in \mathfrak{F},$$
where GK is a Borel set determined by the chosen K. For example, if K = {(0, 0)}
random variables.
Definition 2.1 means that X is explored by its hitting events, that is the events
important role in the theory of random sets, hence we define them formally here,
$$T_X(K) = P\{X \cap K \neq \emptyset\}, \quad K \in \mathcal{K},$$
$$C_X(F) = P\{X \subset F\}, \quad F \in \mathcal{F},$$
The importance in random set theory of the capacity functional stems from the
set X, see Molchanov (2005, Ch. 1, Sec. 1.2). We note that the containment
functional defined on the family of all closed sets F yields the capacity functional
containment functional defined on the family of closed sets also determines the
point with positive probability, it might hit two disjoint sets simultaneously, so
Example 2.5 (Interval data). Consider again the random interval Y = [yL , yU ].
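Example 2.6 (Entry game). Consider the set-up in Example 2.3. Then for
$K = \{(0,1)\}$ we have $T(\{(0,1)\}) = P\{\varepsilon_1 < -\theta_1, \varepsilon_2 \ge 0\}$ and
$C(\{(0,1)\}) = P\{\varepsilon_1 < -\theta_1, \varepsilon_2 \ge 0\} - P\{0 \le \varepsilon_1 < -\theta_1, 0 \le \varepsilon_2 < -\theta_2\}$.
For $K = \{(1,0),(0,1)\}$ we have
$T(\{(1,0),(0,1)\}) = C(\{(1,0),(0,1)\}) = 1 - P\{\varepsilon_1 \ge -\theta_1, \varepsilon_2 \ge -\theta_2\} - P\{\varepsilon_1 < 0, \varepsilon_2 < 0\}$.
One can similarly obtain $T(K)$ and $C(K)$ for each $K \subset \mathcal{K}$.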
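The closed-form expressions in Example 2.6 can be checked by simulation. The following is a minimal sketch, not the authors' procedure. It assumes the payoff form π_j = y_j(θ_j y_{-j} + ε_j) with θ_j < 0 and independent standard normal ε_1, ε_2; neither assumption is stated in the surviving text, and all function names are hypothetical.

```python
import random

OUTCOMES = [(0, 0), (1, 0), (0, 1), (1, 1)]


def equilibria(eps1, eps2, theta1, theta2):
    """Pure strategy Nash equilibria of the two-player entry game.

    Assumed payoff (illustrative): pi_j = y_j * (theta_j * y_other + eps_j),
    and player j enters if and only if the payoff from entering is >= 0.
    """
    eq = set()
    for y1, y2 in OUTCOMES:
        enter1 = theta1 * y2 + eps1 >= 0  # best response of player 1 to y2
        enter2 = theta2 * y1 + eps2 >= 0  # best response of player 2 to y1
        if (y1 == 1) == enter1 and (y2 == 1) == enter2:
            eq.add((y1, y2))
    return frozenset(eq)


def simulate(theta1, theta2, n, seed=0):
    """Draw n realizations of the random set Y_theta (assumed N(0,1) shocks)."""
    rng = random.Random(seed)
    return [equilibria(rng.gauss(0, 1), rng.gauss(0, 1), theta1, theta2)
            for _ in range(n)]


def capacity(draws, K):
    """Empirical capacity functional: share of draws that hit K."""
    K = set(K)
    return sum(1 for Y in draws if Y & K) / len(draws)


def containment(draws, K):
    """Empirical containment functional: share of draws contained in K."""
    K = set(K)
    return sum(1 for Y in draws if Y <= K) / len(draws)


if __name__ == "__main__":
    draws = simulate(-1.0, -1.0, 100_000)
    for K in ([(0, 1)], [(1, 0), (0, 1)]):
        print(K, capacity(draws, K), containment(draws, K))
```

For K = {(0,1)} the two estimates differ by roughly the probability of the multiplicity region, while for K = {(1,0),(0,1)} they coincide, in line with the example.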
Ever since the seminal work of Aumann (1965), it has been common to think of
random sets as bundles of random variables – the selections of the random sets.
vector x such that x(ω) ∈ X(ω) almost surely. We denote by Sel(X) the set of
measurable itself. Recall that a random closed set is defined on the probability
space (Ω, F, P) and, unless stated otherwise, almost surely means P-almost surely.
A possibly empty random set clearly does not have a selection, so unless stated
otherwise we assume that all random sets are almost surely non-empty, which in
1.2.13) One can view selections as curves taking values in the tube being the
Example 2.8 (Interval data). Consider again the random interval Y = [yL , yU ].
Then Sel(Y ) is the family of all F-measurable random variables y such that y(ω) ∈
$$y = r y_L + (1 - r) y_U.$$
Then y ∈ Sel(Y ). Tamer (2010) gives this representation of the random variables
Example 2.9 (Entry game). Consider the set Yθ plotted in Figure 2.1. Let
one selection, since the equilibrium is unique. For ω ∈ ΩM , Yθ contains a rich set
Artstein (1983) and Norberg (1992) provide a necessary and sufficient condition
which relates the distribution of the selections of the random set X to the capacity
Proof. Molchanov (2005, Cor. 1.4.44) and Molchanov and Molinari (2014).
some random vector x, then it is not guaranteed that x ∈ X a.s., e.g. x can
words one couples x and X on the same probability space. Hence, the nature
of the domination condition in (2.1) can be traced to the ordering, or first order
and y are stochastically ordered if $F_x(t) \ge F_y(t)$, i.e. $P\{x \le t\} \ge P\{y \le t\}$ for
all $t$. In this case it is possible to find two random variables $x'$ and $y'$ distributed
these random variables is to set $x' = F_x^{-1}(u)$ and $y' = F_y^{-1}(u)$ by applying
random variable $u$. One then speaks about the ordered coupling of x and y. Note
P{y ∈ A} for A = [t, ∞) and all t ∈ R. Such a set A is increasing (or upper),
particular, this leads to the condition for the ordered coupling for random closed
sets Z and X obtained by Norberg (1992). Two random closed sets Z and X can
be realized on the same probability space as random sets $Z'$ and $X'$ having the
if and only if the probabilities that Z has nonempty intersection with any finite
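The ordered coupling of two stochastically ordered random variables through a common uniform draw can be illustrated numerically. The sketch below is only an illustration under assumed distributions: x and y are taken to be exponential with rates 2 and 1, so that F_x(t) ≥ F_y(t) for all t. The distributions and function names are hypothetical choices, not taken from the text.

```python
import math
import random


def exp_quantile(u, rate):
    """Quantile function F^{-1}(u) of an exponential law with the given rate."""
    return -math.log(1.0 - u) / rate


def ordered_coupling(n, rate_x=2.0, rate_y=1.0, seed=0):
    """Return n pairs (x', y') built from the same uniform draw u.

    With rate_x >= rate_y we have F_x >= F_y pointwise, so the quantile
    transform yields x' <= y' for every draw.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        u = rng.random()  # uniform on [0, 1)
        pairs.append((exp_quantile(u, rate_x), exp_quantile(u, rate_y)))
    return pairs


if __name__ == "__main__":
    pairs = ordered_coupling(10_000)
    print(all(x <= y for x, y in pairs))
```

Each coordinate keeps its own marginal distribution, while the pair is ordered on every draw; this is the scalar analogue of realizing a selection inside a random set on a common probability space.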
the upper envelope and the lower envelope of all probability measures that are
we have,
$$T_X(K) = \sup\{\mu(K) : \mu \in P_X\}, \quad K \in \mathcal{K},$$
$$C_X(F) = \inf\{\mu(F) : \mu \in P_X\}, \quad F \in \mathcal{F},$$
see Molchanov (2005, Theorem 1.5.13). Because of this, the functionals TX and
CX are also called coherent upper and lower probabilities. In general, the upper
The space of closed sets is not linear, which causes substantial difficulties in
random set using the family of its selections, and considering the set formed by
their expectations.
this case only existence is important, e.g. X being a segment on the line with
one end-point equal to zero and the other one equal to a Cauchy distributed
almost surely, regardless of the fact that its other end-point is not integrable.
all its selections are integrable. In this case the family of expectations of these
see Molchanov (2005, Section 2.1.2). In particular, if the probability space is non-
that the expectation of the closed convex hull of X equals the closed convex hull
of EX, which in turn equals EX. It is then natural to describe the Aumann
expectation through its support function, because this function traces out a con-
vex set’s boundary and therefore knowing the support function is equivalent to
knowing the set itself, see Figure 2.2 and equation (2.3) below.
$$h_K(u) = \sup\{\langle k, u \rangle : k \in K\}, \quad u \in \mathbb{R}^d,$$
Note that the support function is finite for all u if K is bounded, and is sublinear
The great advantage of working with the support function of the Aumann
This implies that one does not need to calculate the Aumann expectation
directly by looking at all selections, but can simply work with the expectation of
that
$$P\{X_1 \cap K_1 \neq \emptyset, \ldots, X_n \cap K_n \neq \emptyset\} = \prod_{i=1}^{n} P\{X_i \cap K_i \neq \emptyset\} \quad \forall\, K_1, \ldots, K_n \in \mathcal{K},$$
Random set theory provides laws of large numbers and central limit theorems
for Minkowski sums of i.i.d. random sets, that mimic the familiar ones for random
vectors. Given two sets A and B in Rd , and scalars λ, γ in R, define the dilated set
find a set C such that A + C = B (this happens for example if A is a ball and B
is a rectangle). Hence, while with random variables one expresses limit theorems
by taking the difference between a sample average of the variables and their
sets one considers the (normalized) Hausdorff distance between the Minkowski
average of the sets and their Aumann expectation, where the Hausdorff distance
The limit theorems rely on three key steps. First, attention is restricted to
convex random sets, and the sets are represented as elements of a functional
space by means of their support function. This is useful because the sum of the
$$h_{\frac{1}{n}\sum_{i=1}^{n} X_i}(u) = \frac{1}{n}\sum_{i=1}^{n} h_{X_i}(u).$$
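The additivity of the support function under Minkowski averaging can be verified numerically on small polytopes. The sketch below is illustrative only: sets are represented by finite lists of points in the plane (whose support function coincides with that of their convex hull), and the helper names are hypothetical.

```python
import math
from itertools import product


def support(points, u):
    """Support function h_K(u) = max <k, u> over a finite point set K."""
    return max(p[0] * u[0] + p[1] * u[1] for p in points)


def minkowski_sum(A, B):
    """All pairwise sums a + b; its convex hull is conv(A) + conv(B)."""
    return [(a[0] + b[0], a[1] + b[1]) for a, b in product(A, B)]


def minkowski_average(sets):
    """(1/n)(X_1 + ... + X_n) for a list of finite point sets."""
    total = [(0.0, 0.0)]
    for S in sets:
        total = minkowski_sum(total, S)
    n = len(sets)
    return [(p[0] / n, p[1] / n) for p in total]


if __name__ == "__main__":
    A = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]   # triangle vertices
    B = [(0.0, 0.0), (2.0, 2.0)]               # segment end-points
    avg = minkowski_average([A, B])
    for k in range(8):
        u = (math.cos(k * math.pi / 4), math.sin(k * math.pi / 4))
        lhs = support(avg, u)
        rhs = (support(A, u) + support(B, u)) / 2
        print(round(lhs, 6), round(rhs, 6))
```

Enumerating all pairwise sums grows quickly with the number of sets; the point of the identity is precisely that one never needs to build the Minkowski sum and can average the support functions instead.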
space of compact and convex subsets of Rd endowed with the Hausdorff metric can
functions on the unit sphere endowed with the uniform metric, so that
subsets of Rd ,
$$\rho_H\big(K_1 + \cdots + K_n, \operatorname{conv}(K_1 + \cdots + K_n)\big) \le \sqrt{d} \max_{1 \le i \le n} \|K_i\|.$$
we have that $n^{-1} \max \|X_i\|$ converges to zero almost surely, taking a Minkowski
the Minkowski average of not necessarily convex but integrably bounded sets, and
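$$\rho_H\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i,\, EX\right) - \rho_H\!\left(\frac{1}{n}\sum_{i=1}^{n} \operatorname{conv}(X_i),\, EX\right) \le \rho_H\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i,\, \frac{1}{n}\sum_{i=1}^{n} \operatorname{conv}(X_i)\right) = O_p\!\left(\frac{1}{n}\right).$$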
Hence, a law of large numbers and a central limit theorem for continuous valued
random variables (the i.i.d. average of support functions minus their expectation),
for the Hausdorff distance between Minkowski averages of convex random sets
theorem, which allows one to lift the requirement of convexity of the sets, yield the
following results.
$$\sqrt{n}\,\rho_H\!\left(\frac{X_1 + \cdots + X_n}{n},\, EX\right) \xrightarrow{d} \sup\{|\zeta(u)| : u \in S^{d-1}\} \quad \text{as } n \to \infty,$$
tion on $S^{d-1}$ with covariance $E[\zeta(u)\zeta(v)] = E[h_X(u)h_X(v)] - E[h_X(u)]E[h_X(v)]$.
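For random intervals on the real line the law of large numbers and the central limit theorem take a particularly simple form, because the unit sphere consists of the two directions u = ±1. The sketch below is a hedged illustration under an assumed design (y_L uniform on [0,1] and y_U = y_L plus an independent uniform on [0,1], so that the Aumann expectation is [0.5, 1.0]); the design and function names are not taken from the text.

```python
import math
import random


def minkowski_average_interval(intervals):
    """Minkowski average of intervals [lo_i, hi_i]: average the end-points."""
    n = len(intervals)
    lo = sum(a for a, _ in intervals) / n
    hi = sum(b for _, b in intervals) / n
    return (lo, hi)


def hausdorff_interval(A, B):
    """Hausdorff distance between two compact intervals on the line."""
    return max(abs(A[0] - B[0]), abs(A[1] - B[1]))


def draw_intervals(n, rng):
    """Assumed design: y_L ~ U(0,1), y_U = y_L + U(0,1)."""
    out = []
    for _ in range(n):
        lo = rng.random()
        out.append((lo, lo + rng.random()))
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    expectation = (0.5, 1.0)  # Aumann expectation under the assumed design
    for n in (100, 10_000, 1_000_000):
        avg = minkowski_average_interval(draw_intervals(n, rng))
        dist = hausdorff_interval(avg, expectation)
        # dist shrinks with n, while sqrt(n) * dist stays of order one
        print(n, dist, math.sqrt(n) * dist)
```

The Hausdorff distance here equals the larger of the two end-point deviations, which is the supremum over u = ±1 of the absolute difference of support functions.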
Identification analysis entails the study of what can be learned about a param-
eter of interest, given the available data and maintained modeling assumptions.
Within the partial identification paradigm, the goal is to characterize the sharp
identification region, denoted ΘI in what follows. This region exhausts all the
available information, given the sampling process and the maintained modeling
It may be particularly difficult to prove sharpness, that is, to show that a con-
jectured region contains exactly the feasible parameter values and no others.
that some parameter values in it are actually inconsistent with the sampling pro-
cess and the maintained assumptions. Hence, they cannot have generated the
observed data. Failure to eliminate such values weakens the model's ability to
make useful predictions. It also weakens the researcher's ability to achieve point
is true both in the context of structural analysis and in the context of reduced
form analysis.
several contexts using standard tools of probability theory; see, among others,
Manski (1989, 2003), Manski and Tamer (2002), and Molinari (2008). Beresteanu,
Molchanov and Molinari (2011, BMM henceforth) show how to apply random set
tant settings where other approaches are less tractable. Their approach rests on
the fact that in many partially identified models, the information in the data
and assumptions can be expressed as requiring either that (i) a random vector
belongs to a random set with probability one, or that (ii) the conditional ex-
a random set almost surely with respect to the restriction of P to the condi-
Example 3.1 (Best linear prediction with interval outcomes and covariates).
x, but only observes random intervals Y = [yL , yU ] and X = [xL , xU ] such that
and Stoye (2003, HMPS henceforth) studied this problem and provided a char-
grows with the number of points in the support of the outcome and covariate variables,
and becomes essentially infeasible if these variables are continuous, unless
one discretizes their support quite coarsely. We show here that the random sets
Suppose X and Y are integrably bounded. Then one can obtain ΘI as the
collection of θ’s such that there are selections (x̃, ỹ) ∈ Sel(X × Y ) and associated
For given θ we can have a mean-zero prediction error uncorrelated with its
associated selection x̃ if and only if the zero vector belongs to EQθ . Convexity
$$\Theta_I = \big\{\theta : 0 \le E(h_{Q_\theta}(u)) \ \ \forall\, u \in B^d\big\} = \Big\{\theta : \max_{u \in B^d}\big(-E(h_{Q_\theta}(u))\big) = 0\Big\}, \tag{3.1}$$
where
hence easy to solve. See for example the CVX software by Grant and Boyd (2010).
It should be noted, however, that the set ΘI itself is not necessarily convex. One
then has to scan the parameter space to trace out ΘI . Ciliberto and Tamer (2009)
and Bar and Molinari (2013) propose methods to conduct this task. Projections
Example 3.2 (Entry game). Consider the set-up in Example 2.3. Assume we
eter vector that is part of θ. Earlier on, Tamer (2003), Berry and Tamer (2007)
and Ciliberto and Tamer (2009) studied this problem, and provided an abstract
lection mechanism, that picks the equilibrium played in the region of multiplicity.
The selection mechanism is a rather general random function, that BMM later
showed builds all possible selections of the random set of equilibria. Because the
dealing with it directly creates great difficulties for the computation of ΘI and for
one can characterize ΘI avoiding altogether the need to deal with the selection
directly linked to existing inference methods (e.g., Andrews and Shi (2013)),
and is in the spirit of the earlier literature in partial identification that provided
reference to the selection mechanism (see, e.g., Manski (2003) and Manski and
recall that Theorem 2.10 (Artstein’s inequality) and Theorem 2.13 (Aumann ex-
the expectation of each selection of a random set, without having to build such
selections directly.
In our simple example, if the model is correctly specified and the observed out-
comes result from pure strategy Nash play, then a candidate θ can have generated
which gives that one can verify whether θ ∈ ΘI by checking a finite number of
potentially be a large number, but in Section 3.2 below we show that econometric
applications of random set theory similar in spirit, but much more complex, than
substantially reduce the number of test sets K over which to check the dominance
condition.
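The finite check of the dominance condition in the entry game can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the shocks ε_1, ε_2 are taken to be independent standard normal, θ_1, θ_2 < 0, the distribution of the equilibrium set is derived from the regions in Example 2.6, and all function names are hypothetical.

```python
from itertools import combinations
from math import erf, sqrt

OUTCOMES = [(0, 0), (1, 0), (0, 1), (1, 1)]


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def equilibrium_set_distribution(theta1, theta2):
    """Distribution of Y_theta, assuming iid N(0,1) shocks and theta_j < 0."""
    a1, a2 = norm_cdf(0.0), norm_cdf(0.0)          # P{eps_j < 0}
    b1, b2 = norm_cdf(-theta1), norm_cdf(-theta2)  # P{eps_j < -theta_j}
    mult = (b1 - a1) * (b2 - a2)                   # multiplicity region
    return {
        frozenset({(0, 0)}): a1 * a2,
        frozenset({(1, 1)}): (1 - b1) * (1 - b2),
        frozenset({(0, 1)}): b1 * (1 - a2) - mult,
        frozenset({(1, 0)}): (1 - a1) * b2 - mult,
        frozenset({(1, 0), (0, 1)}): mult,
    }


def capacity(dist, K):
    """T(K): total probability of realizations of Y_theta that hit K."""
    K = set(K)
    return sum(p for Y, p in dist.items() if Y & K)


def satisfies_dominance(p_obs, dist, tol=1e-12):
    """Check P_y(K) <= T(K) for every non-empty subset K of the outcomes."""
    for r in range(1, len(OUTCOMES) + 1):
        for K in combinations(OUTCOMES, r):
            if sum(p_obs.get(y, 0.0) for y in K) > capacity(dist, K) + tol:
                return False
    return True


def outcome_probs(dist, share_10):
    """Outcome law when (1,0) is played with probability share_10
    whenever both (1,0) and (0,1) are equilibria."""
    mult = dist[frozenset({(1, 0), (0, 1)})]
    return {
        (0, 0): dist[frozenset({(0, 0)})],
        (1, 1): dist[frozenset({(1, 1)})],
        (1, 0): dist[frozenset({(1, 0)})] + share_10 * mult,
        (0, 1): dist[frozenset({(0, 1)})] + (1 - share_10) * mult,
    }


if __name__ == "__main__":
    dist = equilibrium_set_distribution(-1.0, -1.0)
    print(satisfies_dominance(outcome_probs(dist, 0.3), dist))
```

With four outcomes there are 15 non-empty test sets, so brute-force enumeration is cheap here; with more players or actions the number of subsets grows exponentially, which is why reducing the family of test sets matters.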
pectation and support function, observing that if the model is correctly specified,
the multinomial distribution Py observed in the data should belong to the collec-
the simple fact that the probability mass function of a discrete random variable
$$\Theta_I = \{\theta : P_y \in E(Q_\theta)\}$$
where the second line follows from equation (2.3), the third line follows from
Theorem 2.13, and the last line is an algebraic manipulation. The maximization
easy. For example, Boyd and Vandenberghe (2004, p. 8) write: “We can easily
The Aumann expectation based characterization applies easily also when out-
comes of the game result from mixed strategy Nash play or from other solution
concepts, by replacing the set Qθ with one collecting the multinomial distribu-
tions over outcomes of the game associated with each equilibrium mixed strategy.
general class of econometric models which they call “models with convex moment
this review; our two preceding examples, however, illustrate the key features
identification regions.
In important complementary work, Galichon and Henry (2011) use the charac-
information with multiple pure strategy Nash equilibria. They show that fur-
of ΘI . This motivated the study of a reduced family of test sets that still suffices
formally defined by Galichon and Henry (2006, 2011), who then implement it in
for a random closed set X if any probability measure µ satisfying the inequalities
for all F ∈ M, is the distribution of a selection of X and so (3.2) holds for all
closed sets F .
in random sets theory correspond to the similar concept for random variables in
core determining.
all compact sets that is dense in a certain sense. For instance, in the Euclidean
space, it suffices to consider compact sets obtained as finite unions of closed balls
For a further reduction one should impose additional restrictions on the family
expect that probabilities of the type CX (F ) = P{X ⊂ F } for all convex closed
sets F determine uniquely the distribution of X. This is however the case only
if X is almost surely compact, see Molchanov (2005, Thm. 1.7.8). Even in this
case, however, the family of all convex compact sets is not a core determining
In some cases, most importantly for random sets being intervals on the line, it
in the real line. In this case, it is useful to characterize selections by the inequal-
into several subsets such that the values of X on ω’s from disjoint subsets are
disjoint.
to check (2.1) only for all K such that there is i ∈ {1, . . . , N } for which K ⊂ Ki .
Example 3.6 (Entry game). Consider the set-up in Example 2.3. We have
Galichon and Henry (2011) propose to use a matching algorithm to check that
to match values x(ω) for ω ∈ Ω to the values of X(ω) so that x(ω) ∈ X(ω).
This yields an alternative algorithm to check the selectionability and also makes
it possible to quantify how far a random vector is from the family of selections.
In Example 3.1, for the interval data case, we have encountered a random closed
set defined in the space of unobservables, the prediction errors. Random closed
sets defined in such space can be extremely useful for incorporating restrictions on
of papers, e.g. Chesher, Rosen and Smolinski (2012) and Chesher and Rosen
(2012). Here we illustrate their approach through the entry game example.
Example 3.7 (Entry game). Consider again the two-player entry game in Ex-
ample 2.3. So far we have addressed the identification problem in this model by
defining the random closed set Yθ (ε) of pure strategy Nash equilibria associated
for all closed sets F in the plane, which is the realization space for ε. However,
Chesher and Rosen (2012) show that this family of test sets can be considerably
reduced, to being equivalent to the case when one works with Yθ , by observing
that the realizations of Ȳθ (y) associated with the four realizations of y ∈ K are
four rectangles, see Figure 2.1. Hence, one can construct the core determining
class in steps. (1) Let F be a proper subset of one of the four rectangles; then
P{Ȳθ (y) ⊂ F } = 0 and the inequality is automatically satisfied. (2) Take the
collection of sets F that contain one of the four rectangles but not more. Then it
suffices to check the inequality for the four sets F that equal (the closure of) each
of the rectangles; this is because a larger set $F'$ in this family yields the same value
for the containment functional as F . (3) Take the collection of sets F that contain
two of the four rectangles but not more. A similar reasoning allows one to check
the inequality only on the five sets F that equal (the closure of) unions of two of
the rectangles. Observing that the realizations Ȳθ (0, 0) and Ȳθ (1, 1) are disjoint,
that Ȳθ (0, 0) ⊂ F1 and Ȳθ (1, 1) ⊂ F2 , and therefore inequalities involving this
set are redundant. (4) Finally, the collection of sets F that contain three of the
four rectangles can similarly be reduced to (the closure of) unions of three of the
As this example makes plain, one can often work with random sets defined
often most useful to work with random sets defined in the space of observables. If
the modeling assumptions are either stochastic or shape restrictions in the space
of unobservables, it is often most useful to work with random sets defined in the
space of unobservables. Suppose for example, within the two player entry game
previously discussed, that one observes variable v along with y. Then if the model
is correctly specified, (y, v) ∈ Sel(Yθ , v). Impose the exclusion restriction that y
see Molchanov and Molinari (2014). On the other hand, suppose the exclusion
yields
where M is the core determining class obtained above. For other examples see
Beresteanu, Molchanov and Molinari (2012) and Chesher and Rosen (2012).
4 APPLICATIONS TO INFERENCE
Identification arguments are always at the population level. That is, they pre-
sume that identified features of the model can be learned with certainty from
feature of the model is a set rather than a point. The shape and size of a prop-
erly defined set estimator changes with sample size, and even consistency of the
and Tamer (2002), Imbens and Manski (2004), Chernozhukov, Hong and Tamer
(2007), and Andrews and Soares (2010), among others, have addressed the ques-
elements of random set theory. The method offers a unified approach to inference
for level sets and convex identified sets based on Wald-type test statistics for the
because in this case the support function is a natural tool to obtain a functional
The nature of partial identification problems calls for estimation of sets that
to the closedness of such level sets. If now f is replaced by its empirical estimator
a probability density function, then the set S(t) appears in cluster analysis, see
Hartigan (1975). More sophisticated estimators of S(t) using the so-called excess
plug-in estimators have been studied in Mason and Polonik (2009) and optimal
if S(t) equals its closure. This condition is also necessary under some rather mild
at level t. Most importantly, this is the case if t is the global minimum of the
function f , and S(t) is then the set arginff of the global minimizers of the function
f . This case has been thoroughly analyzed in Chernozhukov et al. (2007), who
A limit theorem obtained in Molchanov (1998) for the plug-in estimator pro-
vides a limit distribution for the normalised Hausdorff distance between S(t) and
Sn (t), both intersected with any given compact set K. The limit theorem holds
under the assumptions that the normalised difference fn (s) − f (s) satisfies a limit
of its downside continuity modulus, i.e. the infimum of $f(s') - f(s)$ for $s'$ from a
neighborhood of s.
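A plug-in level set estimator and the Hausdorff distance used to assess it can be sketched on a grid. The example is illustrative only: it assumes the lower level set S(t) = {s : f(s) ≤ t}, a one-dimensional grid, and the hypothetical criterion f_n(s) = (s − sample mean)²; none of these specifics comes from the text.

```python
def level_set(f, grid, t):
    """Grid approximation of the level set {s : f(s) <= t}."""
    return [s for s in grid if f(s) <= t]


def hausdorff(A, B):
    """Hausdorff distance between two non-empty finite subsets of the line."""
    def directed(P, Q):
        return max(min(abs(p - q) for q in Q) for p in P)
    return max(directed(A, B), directed(B, A))


def empirical_criterion(data):
    """Hypothetical empirical criterion f_n(s) = (s - sample mean)^2."""
    mean = sum(data) / len(data)
    return lambda s: (s - mean) ** 2


if __name__ == "__main__":
    grid = [i / 10 for i in range(21)]              # 0.0, 0.1, ..., 2.0
    f = lambda s: (s - 1.0) ** 2                    # population criterion
    f_n = empirical_criterion([0.7, 1.2, 1.4])      # sample mean 1.1
    S = level_set(f, grid, 0.3)
    S_n = level_set(f_n, grid, 0.3)
    print(S[0], S[-1], S_n[0], S_n[-1], hausdorff(S, S_n))
```

The grid resolution bounds the accuracy of the computed distance, so in practice the mesh should be fine relative to the estimation error one wants to measure.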
Beresteanu and Molinari (2008) propose to use statistics based on the Hausdorff
in the space of sets, so as to replicate the common Wald approach to these tasks
for point identified models in the space of vectors. In particular, they employ two
Wald statistics, which measure the Hausdorff distance and the directed Hausdorff
distance between the identified set and a set valued estimator, and develop large
equal to the Aumann expectation of a random set which can be constructed us-
ing random variables characterizing the model. Applying the analogy principle,
defined using the sample observations. The support function of the convex hull
of these random sets is used to represent the set estimator as a sample average of
and central limit theorem) are used to establish consistency of the estimator and
derive its limiting distribution with respect to the Hausdorff distance. Beresteanu
and Molinari also show that the critical values of the limiting distribution can be
sis about subsets of the population identification region are tested using the Wald
statistic based on the directed Hausdorff distance, and these tests are inverted
population identification region, rather than only its subsets, are tested using the
We illustrate Beresteanu and Molinari’s approach for the case of best linear
prediction with interval outcome data. We remark that in the case of entry
games, ΘI is not convex and therefore any statistic based on the support function
Example 4.1 (Inference for best linear prediction with interval outcomes). Sup-
identification region of the BLP parameter vector θ can be obtained defining the
random segment
$$G = \left\{ \begin{pmatrix} y \\ xy \end{pmatrix} : y_L \le y \le y_U \right\} \subset \mathbb{R}^2$$
and collecting the least squares associated with each (ỹ, xỹ) ∈ Sel(G):
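$$\Theta_I = \left\{ \theta \in \mathbb{R}^d : \theta = E\begin{pmatrix} 1 & x \\ x & x^2 \end{pmatrix}^{-1} E\begin{pmatrix} \tilde{y} \\ x\tilde{y} \end{pmatrix},\ (\tilde{y}, x\tilde{y}) \in \operatorname{Sel}(G) \right\}, \tag{4.1}$$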
where we have assumed that G is integrably bounded (this is the case, for exam-
ple, if yL , yU , xyL , xyU are each absolutely integrable). Given a random sample
$$\hat{\Theta}_n = \hat{\Sigma}_n^{-1} \frac{1}{n}(G_1 + \cdots + G_n),$$
where Σ̂n is a consistent estimator of the matrix inside equation (4.1). Using
Theorem 2.14, Beresteanu and Molinari establish a Slutsky-type result and under
process, Beresteanu and Molinari need to assume that all x variables have a con-
tinuous distribution. This assures that the set ΘI does not have flat faces, which
that
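$$\sqrt{n}\,\rho_H\big(\hat{\Theta}_n, \Theta_I\big) \xrightarrow{d} \sup_{u \in S^{d-1}} \|z(u)\|,$$
$$\sqrt{n}\,d_H\big(\Theta_I, \hat{\Theta}_n\big) \xrightarrow{d} \sup_{u \in S^{d-1}} (-z(u))_+,$$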
(2012) show that the support function process converges to the sum of a Gaussian
process and a countable point process which takes non zero values at directions
distribution of $(x_i, y_{Li}, y_{Ui})_{i=1}^{n}$, consistently estimates the quantiles of the limiting
distributions of these Wald-statistics. Hence, one can test hypotheses of the form
the statistic based on the directed Hausdorff distance. Inverting these tests yields
confidence collections, which are unions of sets that cannot be rejected as either
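The set estimator of Example 4.1 never needs to be built explicitly, because its support function is a sample average. The sketch below is a hedged illustration: it assumes a scalar covariate, takes the sample analogue of the matrix in (4.1) as the estimator of that matrix, and uses the identity h_{AK}(u) = h_K(Aᵀu) for a linear map A; the function names are hypothetical.

```python
def support_segment(y_lo, y_hi, x, w):
    """Support function of G = {(y, x*y) : y_lo <= y <= y_hi} at direction w."""
    slope = w[0] + w[1] * x
    return max(y_lo * slope, y_hi * slope)


def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


def support_theta_hat(data, u):
    """Support function of the set estimator at direction u.

    data is a list of (x, y_lo, y_hi). Sigma_hat is the sample analogue of
    E[[1, x], [x, x^2]] (an assumed choice of consistent estimator). Since
    Sigma_hat is symmetric, h(u) = (1/n) sum_i h_{G_i}(Sigma_hat^{-1} u).
    """
    n = len(data)
    mx = sum(x for x, _, _ in data) / n
    mxx = sum(x * x for x, _, _ in data) / n
    inv = inverse_2x2(((1.0, mx), (mx, mxx)))
    w = (inv[0][0] * u[0] + inv[0][1] * u[1],
         inv[1][0] * u[0] + inv[1][1] * u[1])
    return sum(support_segment(lo, hi, x, w) for x, lo, hi in data) / n


if __name__ == "__main__":
    data = [(0.0, 0.5, 1.5), (1.0, 2.5, 3.5), (2.0, 4.5, 5.5)]
    # Bounds on the slope coefficient: [-h(0,-1), h(0,1)]
    print(-support_theta_hat(data, (0.0, -1.0)),
          support_theta_hat(data, (0.0, 1.0)))
```

Evaluating the support function at ±e_k gives the projection of the estimated set on the k-th coordinate; when y_lo = y_hi for every observation the set collapses to the ordinary least squares estimate.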
Bontemps, Magnac and Maurin (2012) extend these results in important direc-
tions, by allowing for incomplete linear moment restrictions where the number of
about each vector θ ∈ ΘI , and invert this statistic to obtain confidence sets that
tend the applicability of Beresteanu and Molinari’s approach, to cover best linear
approximation of any function f (x) that is known to lie within two identified
bounding functions. The lower and upper functions defining the band are allowed
to be any functions, including ones carrying an index, and can be esti-
1 See: http://economics.cornell.edu/fmolinari/#Stata_SetBLP.
outcome variable (i.e., the extreme points of the band on f (x)) can be estimated
theory for the support function process, and prove that it approximately con-
verges to a Gaussian process and that the Bayesian bootstrap can be applied for
inference. They also propose a simple data jittering procedure, whereby to each
small but positive variance, eliminating flat faces in ΘI . Hence they obtain valid,
In the study of inference for sets defined by one smooth non-linear inequality,
Chernozhukov, Kocatulum and Menzel (2012) show that the (directed) Haus-
dorff statistic can be weighted, to enforce either exact or first order equivariance
to transformations of parameters.
4.3 Duality Between the Level Set Approach and the Support
Function Approach
Kaido (2012) further enlarges the domain of applicability of the support function
criterion functions, and the support function of the level set estimators. This
allows one to use Hausdorff-based statistics and the support function approach
not only when ΘI is the Aumann expectation of a properly defined random closed
Kaido considers an identification region and its corresponding level set estima-
$\Theta_I = \{\theta \in \Theta : f(\theta) = 0\}$,
function with values in $\mathbb{R}_+$ and infimum at zero, $f$ and $f_n$ satisfy the additional
Using this result, he shows how to relate the normalized support function process
$Z_n(u, t) = a_n \left( h_{\hat{\Theta}_n(t)}(u) - h_{\Theta_I}(u) \right)$ to a localized version of the criterion
function $f_n$, to obtain its asymptotic distribution using the notion of weak epi-
bedding theorem then yields the asymptotic distribution of test statistics based
Kaido, Molinari and Stoye (2013) show that the approach of Kaido (2012) can
convex and the identified set is estimated using a level set estimator under the
on the simple observation that the projections of ΘI are equal to the projections of
conv(ΘI ), and as such when projections are the object of interest, no information
Kaido and Santos (2013) develop a theory of efficiency for estimation of partially
the form $E(m_j(x; \theta)) \le 0$, $j = 1, \ldots, J$, which are smooth as functionals of the
distribution of the data. The functions $\theta \mapsto E(m_j(x; \theta))$ are assumed to be
represented through its support function. Using the classic results in Bickel et al.
(1993), Kaido and Santos show that under suitable regularity conditions, the
support function admits $\sqrt{n}$-consistent regular estimation. The assumptions
at any boundary point of ΘI , and (iii) sets ΘI with empty interior. Using the
convolution theorem, they establish that any regular estimator of the support
function must converge in distribution to the sum of a mean zero Gaussian process
Using the same reasoning as in the classical case, they call a support function
for regular estimators of the support function, by deriving the covariance kernel
of $G_0$. Then they show that a simple plug-in estimator based on the support
function of the set of parameters satisfying the sample analog of the moment
construct estimates of the corresponding identified set that minimize a wide class
critical values of the limiting distribution of test statistics based on the Hausdorff
distance, Kaido and Santos propose a score-multiplier bootstrap which does not
require that the support function is re-computed for each resample of the data,
imply that the estimator for best linear prediction with interval outcome data in
5 CONCLUSIONS
While the initial development of random set theory was in part motivated by
questions of general equilibrium analysis and decision theory, random set theory
was not introduced in econometrics until recently. The new surge of
by partially identified models, where the identified object is a set rather than a
methodologies to estimate sets, test hypotheses about (subsets of) the identification
regions, and build confidence sets that cover them with a pre-specified
asymptotic probability.
Each of these tasks may be simplified by the use of random set theory, and
many results can be developed under a unified framework. This is because ran-
dom set theory distills elements of topology, convex geometry and probability
dom elements whose realizations are sets. The resulting tools have proven
especially useful for inference when the econometric model yields a convex sharp
identification region, and especially useful for identification analysis when the in-
This survey has attempted to introduce the basic elements of random set theory
that have proven useful to date in econometrics, and to summarize the main
applications of random set theory within this literature. The hope is that this
traditional approach based on laws of large numbers and central limit theorems
for random vectors, which continues to be very productively applied in the field.
We did not review results based on these methods, but refer the reader to Tamer
(2010) and references therein for a survey of the partial identification literature.
We have also not summarized the important literature in decision theory that
employs elements of random set theory, most notably nonadditive measures and
cial support from the USA NSF grant SES-0922330. Molchanov was supported
References
78: 119–157.
Artstein, Z. and Vitale, R. A. (1975). A strong law of large numbers for random
using random set theory, Journal of Econometrics 166: 17–32. With errata at
http://economics.cornell.edu/fmolinari/NOTE_BMM2012_v3.pdf.
Bickel, P. J., Klaassen, C. A., Ritov, Y. and Wellner, J. A. (1993). Efficient and
Bontemps, C., Magnac, T. and Maurin, E. (2012). Set identified linear models,
CWP21/12, CeMMAP.
295.
Grant, M. and Boyd, S. (2010). CVX: Matlab software for disciplined convex
experiments with missing covariate and outcome data, Journal of the American
Kaido, H., Molinari, F. and Stoye, J. (2013). Inference for projections of identified
New York.
Matheron, G. (1975). Random Sets and Integral Geometry, Wiley, New York.
manuscript in preparation.
nomics 2: 167–195.
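[Figure 2.1 graphic omitted: the (ε1, ε2) plane, with −θ1 marked on the ε1 axis and −θ2 on the ε2 axis, divided into regions labeled {(0, 0)}, {(1, 0)}, {(0, 1)}, {(1, 1)}, and {(0, 1), (1, 0)}.]
Figure 2.1: The set of pure strategy Nash equilibria of a two player entry game
as a function of ε1 and ε2 .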
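The partition shown in Figure 2.1 can be reproduced by brute-force enumeration. The sketch below is illustrative only. It assumes the payoff specification commonly used for this entry game, in which player j earns y_j (θ_j y_{−j} + ε_j) with θ_j < 0, which this part of the text does not state. The function name and the numerical values are hypothetical.

```python
from itertools import product


def pure_nash_equilibria(eps1, eps2, theta1, theta2):
    """Return the set of pure strategy Nash equilibria of a 2x2 entry game.

    Assumed payoff (an illustrative assumption): player j receives
    y_j * (theta_j * y_other + eps_j), where y_j = 1 means "enter".
    """
    eps = (eps1, eps2)
    theta = (theta1, theta2)

    def payoff(j, y):
        return y[j] * (theta[j] * y[1 - j] + eps[j])

    equilibria = set()
    for y in product((0, 1), repeat=2):
        stable = True
        for j in (0, 1):
            # Unilateral deviation: player j flips their action.
            deviation = list(y)
            deviation[j] = 1 - y[j]
            if payoff(j, tuple(deviation)) > payoff(j, y):
                stable = False
        if stable:
            equilibria.add(y)
    return equilibria


# Hypothetical parameter values: theta1 = theta2 = -1.
both_enter = pure_nash_equilibria(2.0, 2.0, -1.0, -1.0)
nobody_enters = pure_nash_equilibria(-1.0, -1.0, -1.0, -1.0)
multiple = pure_nash_equilibria(0.5, 0.5, -1.0, -1.0)
only_first = pure_nash_equilibria(2.0, -0.5, -1.0, -1.0)
```

Under these assumptions, values of (ε1, ε2) in the central rectangle between 0 and −θ_j on each axis yield the two equilibria {(0, 1), (1, 0)}, which is the region of multiplicity that makes the equilibrium outcome a random set rather than a random vector.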
[Figure 2.2 graphic omitted: a set K, a direction u, and the support function value hK(u).]
Figure 2.2: The support function of K in direction u is the signed distance of the
support plane to K with exterior normal vector u from the origin; the distance
is negative if and only if u points into the open half space containing the origin,