Abstract
It is well known that the optimal transportation plan between two probability measures \(\mu \) and \(\nu \) is induced by a transportation map whenever \(\mu \) is an absolutely continuous measure supported over a compact set in the Euclidean space and the cost function is a strictly convex function of the Euclidean distance. However, when \(\mu \) and \(\nu \) are both discrete, this result is generally false. In this paper, we prove that, given any pair of discrete probability measures and a cost function, there exists an optimal transportation plan that can be expressed as the sum of two deterministic plans, i.e., plans induced by transportation maps. As an application, we estimate the infinity-Wasserstein distance between two discrete probability measures \(\mu \) and \(\nu \) with the p-Wasserstein distance, times a constant depending on \(\mu \), on \(\nu \), and on the fixed cost function.
1 Introduction
The Optimal Transport (OT) problem is a classical minimization problem dating back to the work of Monge [24] and Kantorovich [20, 21]. In this problem, we are given two probability measures, namely \(\mu \) and \(\nu \), and we search for the cheapest way to reshape \(\mu \) into \(\nu \). The effort needed in order to perform this transformation depends on a cost function, which describes the underlying geometry of the product space of the support of the two measures. In the right setting, this effort induces a distance between probability measures.
During the last century, the OT problem has been fruitfully used in many applied fields, such as the study of systems of particles by Dobrushin [13], of the Boltzmann equation by Tanaka [17,18,19], and of fluid dynamics by Brenier [9]. All these results showed that a qualitative description of optimal transport can yield insight into many open problems. For this reason, the Optimal Transport problem has become a topic of major interest for analysts, probabilists and statisticians [3, 29, 31]. In particular, a plethora of results concerning the uniqueness [10, 14, 16], the structure [1, 2, 28], and the regularity [8, 23] of the optimal transportation plan in the continuous framework has been proved.
In recent years, the OT problem has also become a crucial sub-problem in several applications in Computer Vision [7, 25,26,27], Computational Statistics [22], Probability [5, 6], and Machine Learning [4, 11, 15, 30]. However, in these fields, the measures \(\mu \) and \(\nu \) are discrete, and therefore the optimal transportation plans lack most of the good properties their continuous counterparts enjoy.
In this paper, we study the structure of optimal transportation plans between discrete probability measures. After introducing the notion of trim plan between the measures \(\mu \) and \(\nu \), we prove that such plans are the sum of two deterministic plans, i.e., plans that are induced by the action of two suitable push-forward maps. The first map acts on a portion \(\mu ^{(d)}\) of \(\mu \), while the other one acts on a portion \(\nu ^{(d)}\) of \(\nu \) (Theorem 3). Thanks to this formula, we recover an extension of the estimate given in [8]. Namely, we estimate the infinity-Wasserstein distance between a pair of discrete measures \((\mu ,\nu )\) (see Definition 4 below) by the c-Wasserstein distance between \(\mu \) and \(\nu \), times a quantity that only depends on \(\mu \) and \(\nu \) (Theorem 7).
2 Basic Notions on Optimal Transport
In this section, following [31], we recall the main definitions regarding optimal transportation and we examine the continuous counterpart [8] to our \(W^\infty \) estimate.
Given a Polish space (X, d), we denote with \(\mathcal {B}(X)\) the set of Borel sets over X and with \(\mathcal {P}(X)\) the set of Borel probability measures over X. Given a Borel measurable function \(T:X\rightarrow Y\), we denote with \(T_\#:\mathcal {P}(X)\rightarrow \mathcal {P}(Y)\) the push-forward operator induced by T, defined by \((T_\#\mu )[A]=\mu [T^{-1}(A)]\) for every \(A\in \mathcal {B}(Y)\). The projection maps are \(\mathfrak {p}_X:X\times Y \rightarrow X\), \(\mathfrak {p}_X(x,y)=x\) and \(\mathfrak {p}_Y:X\times Y \rightarrow Y\), \(\mathfrak {p}_Y(x,y)=y\).
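For a discrete measure, the push-forward simply moves the mass of each point x to T(x) and adds up the masses that land on the same point. The following is a minimal sketch, assuming a measure is stored as a dictionary from points to masses; the names `push_forward`, `mu`, and `nu` are ours and purely illustrative.

```python
from collections import defaultdict
from fractions import Fraction


def push_forward(mu, T):
    """Return T_# mu for a discrete measure mu given as {point: mass}."""
    image = defaultdict(Fraction)
    for x, mass in mu.items():
        # (T_# mu)[{y}] = mu[T^{-1}({y})]: mass at x is moved to T(x).
        image[T(x)] += mass
    return dict(image)


# Illustrative data: three points, mapped to their parity.
mu = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4)}
nu = push_forward(mu, lambda x: x % 2)
```

Exact fractions are used only to keep the arithmetic free of rounding; floating-point masses would work in the same way.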
Definition 1
Let \(\mu \) and \(\nu \) be two probability measures over two Polish spaces X and Y. The probability measure \(\pi \in \mathcal {P}(X\times Y)\) is a transportation plan between \(\mu \) and \(\nu \) if
\[(\mathfrak {p}_X)_\#\pi =\mu \quad \text{and}\quad (\mathfrak {p}_Y)_\#\pi =\nu .\]
We denote with \(\Pi (\mu ,\nu )\) the set of all the transportation plans between \(\mu \) and \(\nu .\)
Given \(A\in \mathcal {B}(X)\) and \(B\in \mathcal {B}(Y)\), the quantity \(\pi (A\times B)\) is the amount of mass that travels from the set A to the set B. By assigning a cost function c on \(X\times Y\) we specify a way to measure the cost of every transportation plan.
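In the discrete case, a transportation plan is a nonnegative array \(\pi _{x,y}\) whose row sums give \(\mu \) and whose column sums give \(\nu \). The sketch below checks this condition, assuming a plan is stored as a dictionary from pairs (x, y) to masses; the helper names and the sample data are illustrative and not part of the paper.

```python
from collections import defaultdict
from fractions import Fraction


def marginals(plan):
    """Return the two marginals of a plan given as {(x, y): mass}."""
    first, second = defaultdict(Fraction), defaultdict(Fraction)
    for (x, y), mass in plan.items():
        first[x] += mass   # push-forward through (x, y) -> x
        second[y] += mass  # push-forward through (x, y) -> y
    return dict(first), dict(second)


def is_transport_plan(plan, mu, nu):
    """Check that plan is nonnegative and has marginals mu and nu."""
    if any(mass < 0 for mass in plan.values()):
        return False
    return marginals(plan) == (mu, nu)


# Illustrative data on the two-point set {"a", "b"}.
mu = {"a": Fraction(1, 2), "b": Fraction(1, 2)}
nu = {"a": Fraction(1, 4), "b": Fraction(3, 4)}
plan = {("a", "a"): Fraction(1, 4),
        ("a", "b"): Fraction(1, 4),
        ("b", "b"): Fraction(1, 2)}
```

Here `plan[("a", "b")]` is the amount of mass travelling from a to b, in agreement with the interpretation of \(\pi (A\times B)\) given above.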
Definition 2
Let \(\mu \in \mathcal {P}(X)\), \(\nu \in \mathcal {P}(Y)\), and let \(c:X\times Y\rightarrow [0,+\infty )\) be a lower semicontinuous (l.s.c.) symmetric cost function. The transportation functional \(\mathbb {T}_c:\Pi (\mu ,\nu )\rightarrow [0,+\infty )\) is defined as
\[\mathbb {T}_c(\pi )=\int _{X\times Y} c(x,y)\,d\pi (x,y). \tag{1}\]
Given two measures \(\mu \in \mathcal {P}(X)\), \(\nu \in \mathcal {P}(Y)\), and a cost function c, the optimal transportation problem consists in finding the infimum of \(\mathbb {T}_c\) over \(\Pi (\mu ,\nu )\), i.e.
\[\inf _{\pi \in \Pi (\mu ,\nu )}\mathbb {T}_c(\pi ). \tag{2}\]
Under suitable assumptions on c, the infimum in (2) is attained. In particular, when the cost function is l.s.c. and nonnegative, as in Definition 2, a solution exists. For a complete discussion of the existence of solutions, we refer to [31, Chapter 4].
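For discrete measures, the transportation functional reduces to the finite sum \(\sum _{x,y} c_{x,y}\pi _{x,y}\). As a small illustration, assuming plans are stored as dictionaries from pairs to masses, the sketch below evaluates this sum for two different plans between the same marginals; the function name and the data are hypothetical.

```python
from fractions import Fraction


def transport_cost(plan, cost):
    """Evaluate the transportation functional: sum of c(x, y) * pi_{x,y}."""
    return sum(cost(x, y) * mass for (x, y), mass in plan.items())


def cost(x, y):
    # Distance on the real line, used here as the cost function.
    return abs(x - y)


# Two plans between mu = nu = (delta_0 + delta_1) / 2.
identity_plan = {(0, 0): Fraction(1, 2), (1, 1): Fraction(1, 2)}
product_plan = {(x, y): Fraction(1, 4) for x in (0, 1) for y in (0, 1)}

identity_cost = transport_cost(identity_plan, cost)
product_cost = transport_cost(product_plan, cost)
```

Both plans are admissible, but only the first one leaves every point in place, so it is the cheaper of the two.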
We can use the optimal transportation problem to define a distance over the space \(\mathcal {P}(X)\). In particular, since X is a Polish space, we can lift the distance d from X to \(\mathcal {P}(X)\) by choosing a power of d as the cost function in (1).
Definition 3
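Let (X, d) be a Polish space and \(p\in [1,\infty )\). The Wasserstein distance of order p between the probability measures \(\mu \) and \(\nu \) on X is defined as
\[W_p(\mu ,\nu )=\Big (\inf _{\pi \in \Pi (\mu ,\nu )}\int _{X\times X} d^p(x,y)\,d\pi (x,y)\Big )^{1/p}. \tag{3}\]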
When \(p=1\), the 1-Wasserstein distance is also known as Kantorovich-Rubinstein distance.
When the cost function is not the space distance d, we denote the infimum in (2) with \(W_c(\mu ,\nu )\).
Remark 1
The infimum in (3) could actually be \(+\infty \); it is thus customary to restrict \(W_p\) to the space of probability measures with finite p-th moment.
Definition 4
Given a cost function c, the \(W^{(\infty )}_c\) distance between two measures \(\mu \) and \(\nu \) is defined as
\[W^{(\infty )}_c(\mu ,\nu )=\inf _{\pi \in \Pi (\mu ,\nu )}||c||_{L^{\infty }_\pi },\]
where \(||\,\cdot \,||_{L^{\infty }_\pi }\) is the \(L^{\infty }\) norm with respect to the measure \(\pi \). When c is the Euclidean distance, we use the notation: \(W^{(\infty )}\).
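For a discrete plan \(\pi \), the \(L^{\infty }_\pi \) norm of c is the largest value of c on the support of \(\pi \), regardless of how little mass is moved there. The sketch below evaluates this quantity for a single plan; assuming, as is standard, that \(W^{(\infty )}_c\) is the infimum of this quantity over \(\Pi (\mu ,\nu )\), the value computed for one plan is only an upper bound. The names and the data are illustrative.

```python
from fractions import Fraction


def sup_cost(plan, cost):
    """L^infinity norm of the cost w.r.t. a discrete plan {(x, y): mass}."""
    return max(cost(x, y) for (x, y), mass in plan.items() if mass > 0)


def average_cost(plan, cost):
    """Integral of the cost against the plan, for comparison."""
    return sum(cost(x, y) * mass for (x, y), mass in plan.items())


def cost(x, y):
    return abs(x - y)


# Most of the mass stays at 0, a small portion travels a distance of 3.
plan = {(0, 0): Fraction(9, 10), (0, 3): Fraction(1, 10)}
worst = sup_cost(plan, cost)
mean = average_cost(plan, cost)
```

The gap between `worst` and `mean` grows as the travelling portion of mass shrinks, which is the mechanism behind the failure of uniform estimates in the discrete setting.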
Let \(\mu \) and \(\nu \) be two probability measures on a Lipschitz regular and bounded subset \(\Omega \subset \mathbb {R}^n\). We define the cost function
When \(\mu \) is absolutely continuous with respect to the Lebesgue measure, it is well known (Theorem 6.3 and Theorem 6.4, [16]) that the optimal transportation plan \(\pi \) between \(\mu \) and \(\nu \) is unique and it is induced by a transportation map \(T_p\), i.e.
In [8], Bouchitté et al. established an \(L^{\infty }_\mu \)-bound on the displacement map \(Id-T_p\), which only depends on the shape of \(\Omega \), on p, and on the density of \(\mu \). This estimate allowed the authors to give the following upper bound on the \(W^{(\infty )}\) distance between \(\mu \) and \(\nu \).
Theorem 1
(Theorem 1.2, [8]) Let \(\Omega \) be a bounded connected open subset of \(\mathbb {R}^n\) with Lipschitz boundary and denote by \(\mathcal {P}(\overline{\Omega })\) (resp. \(\mathcal {P}_{ac}(\Omega )\)) the set of Borel (resp. absolutely continuous) probability measures on \(\overline{\Omega }\). Then, for every \(p>1\) and every pair \((\mu ,\nu )\in \mathcal {P}_{ac}(\Omega )\times \mathcal {P}(\overline{\Omega })\) there holds
where f is the density of \(\mu \) with respect to the Lebesgue measure and \(C_{p,n}(\Omega )\) is a positive constant depending only on p, n, and \(\Omega \).
The proof of this result heavily relies on the regularity of \(\mu \); hence, when \(\mu \) and \(\nu \) are both discrete, the result does not apply. In particular, we are no longer able to find a constant depending only on \(\mu \) and the geometry of the support of \(\mu \), as the following example shows.
Example 1
Let \(\mu ,\nu _\epsilon \in \mathcal {P}(\mathbb {R})\) be defined as
for \(\epsilon \in (0,1)\), and let \(c_2(x,y)=|x-y|^2\). By a simple computation we have that
Hence, estimate (4) does not hold true, as for every constant \(C(p,n,\Omega ,\mu )>0\) (possibly depending on \(p,n,\Omega ,\mu \)), there exists \(\epsilon >0\) such that
3 Structure of Discrete Optimal Transportation Plans
In what follows, we prove the existence of an optimal transportation plan between two discrete measures that is induced by the action of two push-forward functions, one going from X to Y and one going from Y to X. This allows us to establish a bound on \(W^{(\infty )}(\mu ,\nu )\), similar to the one proved in [8]. We always assume \(\# X=\# Y =n \in \mathbb {N}\). In this case, we can identify the sets X and Y with \(\{1,\dots , n\}\). Without loss of generality, we therefore assume \(X=Y\). In this setting, a measure \(\mu \in \mathcal {P}(X)\) has the form \(\sum _{x\in X} \mu _x\delta _x\). We thus use the notation \(\mu _x\) to denote the coefficient of \(\mu \) at x and, likewise, \(c_{x,y}\) (resp. \(\pi _{x,y}\)) stands for the value of \(c:X\times Y \rightarrow \mathbb {R}\) (resp. the coefficient of \(\pi \in \mathcal {P}(X\times Y)\)) at the point \((x,y)\in X\times Y\).
Definition 5
Let \(\mu ,\nu \in \mathcal {P}(X)\) be two measures on a discrete set X and let \(c:X\times X\rightarrow \mathbb {R}\) be a cost function. An optimal solution \(\pi ^*\) of the transportation problem is said to be trim if
\[\#\mathrm{spt}(\pi ^*)\le \#\mathrm{spt}(\pi )\]
for each optimal solution \(\pi \).
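To illustrate the notion, consider a constant cost function, for which every transportation plan is optimal. A trim plan is then simply a plan whose support has the fewest points. The sketch below compares two optimal plans between the same marginals; the data and the helper names are hypothetical.

```python
from fractions import Fraction


def support(plan):
    """Points (x, y) on which a plan {(x, y): mass} puts positive mass."""
    return {xy for xy, mass in plan.items() if mass > 0}


def transport_cost(plan, cost):
    return sum(cost[xy] * mass for xy, mass in plan.items())


# mu = nu = (delta_0 + delta_1) / 2 with a constant cost:
# every plan between mu and nu has the same cost.
cost = {(x, y): 1 for x in (0, 1) for y in (0, 1)}
diagonal = {(0, 0): Fraction(1, 2), (1, 1): Fraction(1, 2)}
product = {xy: Fraction(1, 4) for xy in cost}

sizes = (len(support(diagonal)), len(support(product)))
```

Since each of the two source points must send its mass somewhere, no plan can have fewer than two support points. The diagonal plan is therefore trim, while the product plan is optimal but not trim.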
Lemma 2
Let \(\pi \in \Pi (\mu ,\nu )\) be a trim solution. Then each restriction of \(\pi \) is a trim solution for its marginals. In particular, if \(\pi ^{(1)}\) and \(\pi ^{(2)}\) are such that
and \(\mathrm{spt}(\pi ^{(1)})\cap \mathrm{spt}(\pi ^{(2)})=\emptyset \), then \(\pi ^{(1)}\) and \(\pi ^{(2)}\) are trim solutions for their marginals.
Proof
Let \(\pi ^{*}\) be a restriction of \(\pi \). By Theorem 4.6 (Chapter 4, [31]), we know that \(\pi ^*\) is optimal between its marginals, hence we only need to prove that its support has minimal cardinality.
Arguing by contradiction, let us assume that \(\pi ^*\) is not trim, hence there exists another optimal plan \(\eta \) between the marginals of \(\pi ^*\) such that
We can define the measure \(\hat{\pi }\) as
since \(\pi \ge \pi ^*\) and \(\eta \ge 0\), we have \(\hat{\pi }\ge 0\). Moreover, since \(\pi ^*\) and \(\eta \) have the same marginals, \(\hat{\pi }\) has the same marginals as \(\pi \), and therefore \(\hat{\pi }\in \Pi (\mu ,\nu )\). Furthermore, since \(\pi ^*\) and \(\eta \) are optimal between their marginals, we have
thus
In particular, \(\pi \) and \(\hat{\pi }\) have the same cost, therefore \(\hat{\pi }\) is an optimal transportation plan between \(\mu \) and \(\nu \).
To conclude, we notice that, since \(\pi ^*\) is a restriction of \(\pi \), we have
which is a contradiction, since \(\pi \) is trim by hypothesis. \(\square \)
Theorem 6.3 in [16] states that, whenever \(\mu \) is an absolutely continuous measure supported over a compact set \(\Omega \subset \mathbb {R}^n\) and the cost function c is a strictly convex function of the Euclidean distance, the optimal transportation plan is induced by a transportation map, regardless of the regularity of \(\nu \). When \(\mu \) and \(\nu \) are both discrete, this result is generally false. However, in the next Theorem 3, we show that there exists at least one optimal transportation plan between two measures that can be recreated as the action of two functions, one acting from a subset \(\tilde{X}\subset \mathrm{spt}(\mu )\) to \( \mathrm{spt}(\nu )\) and one acting from a subset \(\tilde{Y}\subset \mathrm{spt}(\nu )\) to \(\mathrm{spt}(\mu )\).
Theorem 3
Let X be a discrete Polish space and let \(\mu \) and \(\nu \) be two positive measures over the set X such that
and
Given a cost function \(c:X\times X \rightarrow \mathbb {R}\), let \(\pi \) be a trim solution of the transportation problem. We can then find two couples of measures \((\mu ^{(d)},\mu ^{(c)})\) and \((\nu ^{(d)},\nu ^{(c)})\) and a couple of functions \(h^{(1)}\) and \(h^{(2)}\) such that
We say that the decomposition ensured by Theorem 3 is a diffusive model associated with the given (trim) solution \(\pi \). We call \(\mu ^{(d)}\) and \(\nu ^{(d)}\) the diffusive part of \(\mu \) and \(\nu \), respectively. Similarly, we denote with \(\mu ^{(c)}\) and \(\nu ^{(c)}\) the concentrating part of \(\mu \) and \(\nu \), respectively. Finally, we call \(h^{(1)}\) the diffusive scheme of \(\mu \) and \(h^{(2)}\) the diffusive scheme of \(\nu \).
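A decomposition of this kind can be computed by peeling the support of the plan one point at a time. The sketch below is a simplified variant of the inductive argument, not a transcription of it. It assumes that the decomposition reads \(\pi =(Id,h^{(1)})_\#\mu ^{(d)}+(h^{(2)},Id)_\#\nu ^{(d)}\), and that the support of the plan contains no cycle, which holds for trim plans because mass could otherwise be shifted along the cycle to remove a support point without increasing the cost. All names are ours.

```python
from collections import defaultdict
from fractions import Fraction


def peel_plan(plan):
    """Split a plan {(x, y): mass} with cycle-free support into two
    deterministic parts, returned as (mu_d, h1, nu_d, h2)."""
    remaining = {xy: m for xy, m in plan.items() if m > 0}
    mu_d, h1, nu_d, h2 = {}, {}, {}, {}
    while remaining:
        row_deg, col_deg = defaultdict(int), defaultdict(int)
        for x, y in remaining:
            row_deg[x] += 1
            col_deg[y] += 1
        # A source point that sends mass to a single target goes into h1 ...
        edge = next((e for e in remaining if row_deg[e[0]] == 1), None)
        if edge is not None:
            x, y = edge
            h1[x], mu_d[x] = y, remaining.pop(edge)
            continue
        # ... otherwise a target fed by a single source goes into h2.
        edge = next((e for e in remaining if col_deg[e[1]] == 1), None)
        if edge is None:
            raise ValueError("the support of the plan contains a cycle")
        x, y = edge
        h2[y], nu_d[y] = x, remaining.pop(edge)
    return mu_d, h1, nu_d, h2


def rebuild(mu_d, h1, nu_d, h2):
    """Sum of the two deterministic plans induced by h1 and h2."""
    plan = {(x, h1[x]): m for x, m in mu_d.items()}
    plan.update({(h2[y], y): m for y, m in nu_d.items()})
    return plan


# Illustrative plan on two source points and two target points.
plan = {(0, 0): Fraction(1, 5), (0, 1): Fraction(3, 10), (1, 1): Fraction(1, 2)}
mu_d, h1, nu_d, h2 = peel_plan(plan)
```

Each support point is removed exactly once and assigned either to `h1` or to `h2`, so `rebuild(mu_d, h1, nu_d, h2)` returns the original plan. When a support point qualifies for both maps, the sketch assigns it to `h1`; other choices lead to different, equally admissible decompositions.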
Proof
We proceed by induction on the cardinality of X. If \(\#X=1\), the claim follows trivially.
Let us now assume that the statement holds for each couple of measures whose support has cardinality \((n-1)\) and let \(\mu \) and \(\nu \) be two measures supported on a set with cardinality n, namely \(X_n\). Given a trim solution \(\pi \), it is well known (Chapter 7, [12]) that
Since \(\mu \) and \(\nu \) have n points in their support, we can find \(\bar{a}\in X\) such that there exists a unique \(\bar{b}\in \mathrm{spt}(\nu )\) for which
hence \(\mu _{\bar{a}}=\pi _{\bar{a},\bar{b}}\le \nu _{\bar{b}}\). Similarly, we can find \(\underline{b}\in X\) such that there exists a unique \(\underline{a}\in \mathrm{spt}(\mu )\) for which
so that \(\nu _{\underline{b}}=\pi _{\underline{a},\underline{b}}\le \mu _{\underline{a}}\).
If \(\mu _{\bar{a}}=\pi _{\bar{a},\bar{b}} = \nu _{\bar{b}}\), we can restrict the plan \(\pi \) to the set \(\mathrm{spt}(\pi )\backslash \{(\bar{a},\bar{b})\}\). We denote this restriction with \(\pi _*\). By definition, the marginals of \(\pi _*\) are
and
In particular, the supports of \(\mu _*\) and \(\nu _*\) contain \((n-1)\) points each. By induction we can find \((\mu ^{(d)}_*,\mu ^{(c)}_*)\), \((\nu ^{(d)}_*,\nu ^{(c)}_*)\), and \((h^{(1)}_*,h^{(2)}_*)\) such that
and
We can then define
and
It is easy to see that
and, since \(h^{(1)}_\#\delta _{\bar{a}}=\delta _{\bar{b}}\), we have
which concludes the proof in the case \(\mu _{\bar{a}}=\pi _{\bar{a},\bar{b}}=\nu _{\bar{b}}\). We proceed similarly if \(\nu _{\underline{b}}=\pi _{\underline{a},\underline{b}}= \mu _{\underline{a}}\). See Fig. 1 for a visual representation of this process in the case \(n=3\).
To conclude, consider the case in which \(\mu _{\bar{a}}=\pi _{\bar{a},\bar{b}} < \nu _{\bar{b}}\) and \(\nu _{\underline{b}}=\pi _{\underline{a},\underline{b}}< \mu _{\underline{a}}\). In this case, we restrict \(\pi \) to the set \(\mathrm{spt}(\pi )\backslash \{(\bar{a},\bar{b}),(\underline{a},\underline{b})\}\). Let us denote again with \(\pi _*\) the restriction and with \(\mu _*\) and \(\nu _{*}\) its marginals. Since both \(\mu _*\) and \(\nu _*\) have \((n-1)\) points in their supports, we can again decompose them as
and find a couple of functions \(h^{(1)}_*,h^{(2)}_*\) for which
We can then define
and
which concludes the proof. \(\square \)
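The proof relies on the fact that a trim solution on n points has at most \(2n-1\) points in its support, which is the usual bound for basic feasible solutions of the transportation linear program. The sketch below does not compute an optimal plan. It only illustrates the cardinality bound on a feasible plan built with the classical north-west corner rule; the function name and the data are illustrative.

```python
from fractions import Fraction


def north_west_corner(mu, nu):
    """Feasible plan between two lists of weights with equal total mass."""
    mu, nu = list(mu), list(nu)
    plan, i, j = {}, 0, 0
    while i < len(mu) and j < len(nu):
        moved = min(mu[i], nu[j])
        if moved > 0:
            plan[(i, j)] = moved
        mu[i] -= moved
        nu[j] -= moved
        # Advance past whichever point has been exhausted; each step moves
        # exactly one index, so at most len(mu) + len(nu) - 1 steps occur.
        if mu[i] == 0:
            i += 1
        else:
            j += 1
    return plan


mu = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
nu = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)]
plan = north_west_corner(mu, nu)
```

With n = 3 points on each side, the resulting plan has four support points, within the bound of 2n - 1 = 5.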
Remark 2
Given two measures as in the hypothesis of Theorem 3, let \(\mu ^{(d)}\) and \(\nu ^{(d)}\) be their diffusive parts. Since \(\mathrm{spt}(\mu ^{(d)})\subset \mathrm{spt}(\mu )\) and \(\mathrm{spt}(\nu ^{(d)})\subset \mathrm{spt}(\nu )\), the support of the transportation plan defined by formula (7) has, at most, 2n points. Thus the trim condition on the optimal transportation plan is necessary, as we are going to show in the next example.
Example 2
Let us take
and
(see Fig. 2) and, as a cost function, we choose the Euclidean distance in \(\mathbb {R}^3\), i.e.
It is easy to see that the plan
is optimal. However, according to Remark 2, it cannot be decomposed as in formula (7), since
Remark 3
Given a trim solution, there might be more than one diffusive model associated with it. For example, let
be two discrete measures over \(\mathbb {R}^2\). As a cost function, we choose the Euclidean distance
Then, the probability measure
is a trim plan between \(\mu \) and \(\nu \). It is easy to check that
and
is a decomposition of the trim plan. However, we can also decompose \(\nu \) as
define the functions as
and still obtain an admissible decomposition of \(\pi \).
4 An Upper Bound for the Infinity Wasserstein Distance in the Discrete Setting
As an immediate consequence of the diffusive model decomposition (5)–(6) given in Theorem 3, we can decompose the Wasserstein distance associated to a cost function c and use it to estimate the infinity-Wasserstein distance.
Corollary 4
Let \(\mu ,\nu \in \mathcal {P}(X)\) be two discrete measures, \(c:X\times X \rightarrow \mathbb {R}\) be a cost function, and \(\pi \) be a trim solution of the transportation problem. Given a diffusive model for \(\pi \), we have
and
In particular, we have
where
The value \(\alpha \) defined in (9) depends on the particular diffusive model we choose. However, since \(W_c(\mu ,\nu )\) and \(W^{(\infty )}_c\) do not depend on the choice of the diffusive model, if we can give a lower bound on \(\alpha \) for a particular diffusive model, we can generalize the estimate (8).
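The mechanism behind an estimate of this type is elementary: for a nonnegative cost, the largest cost on the support of a plan is at most the total cost divided by the smallest mass the plan puts on a support point. The sketch below checks this inequality numerically, taking \(\alpha \) to be that smallest mass, which is how we read (9) here. The inequality holds for any plan, optimal or not, and all names and data are illustrative.

```python
from fractions import Fraction


def transport_cost(plan, cost):
    return sum(cost(x, y) * mass for (x, y), mass in plan.items())


def sup_cost(plan, cost):
    return max(cost(x, y) for (x, y), mass in plan.items() if mass > 0)


def smallest_mass(plan):
    """Smallest positive mass of the plan (our stand-in for alpha)."""
    return min(mass for mass in plan.values() if mass > 0)


def cost(x, y):
    return abs(x - y)


plan = {(0, 1): Fraction(1, 2), (1, 3): Fraction(1, 3), (2, 2): Fraction(1, 6)}
lhs = sup_cost(plan, cost)
rhs = transport_cost(plan, cost) / smallest_mass(plan)

# A single Dirac mass moved from 0 to 5: both sides coincide.
dirac = {(0, 5): Fraction(1)}
dirac_lhs = sup_cost(dirac, cost)
dirac_rhs = transport_cost(dirac, cost) / smallest_mass(dirac)
```

The second example shows that the inequality cannot be improved in general, since equality holds when all the mass sits on one support point.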
Corollary 5
Let \(\mu ,\nu \in \mathcal {P}(X)\) be two discrete measures and \(c:X \times X\rightarrow \mathbb {R}_+\) be a cost function. For any trim plan \(\pi \), there exists a diffusive model for which
where \(\alpha \) is defined in relation (9) and
Proof
Let n be the cardinality of X. Since \(\pi \) is trim between \(\mu \) and \(\nu \), we have \(\#\mathrm{spt}(\pi )\le 2n-1\), hence we can find \(\bar{x}_1\) such that
and \(\underline{y}_1\) such that
If \(\underline{x}_1=\bar{x}_1\) (and hence \(\underline{y}_1=\bar{y}_1\)), we have \(\mu _{\bar{x}_1}=\nu _{\bar{y}_1}\) and we define
and
Otherwise, if \(\underline{x}_1\ne \bar{x}_1\) (and hence \(\underline{y}_1\ne \bar{y}_1\)), we set
and
In both cases we find two measures, \(\mu ^{(1)}\) and \(\nu ^{(1)}\), whose support has at most \(n-1\) points. Since \(\pi ^{(1)}\) is a restriction of a trim plan, by Lemma 2, also \(\pi ^{(1)}\) is trim between its marginals \(\mu ^{(1)}\) and \(\nu ^{(1)}\). Therefore, we can repeat the process, finding two points \(\bar{x}_2\) and \(\underline{y}_2\) for which
and
We can then extend the definition of the measures \(\mu ^{(d)},\mu ^{(c)},\nu ^{(d)}\), and \(\nu ^{(c)}\), define the measures \(\mu ^{(2)}\), \(\nu ^{(2)}\), and \(\pi ^{(2)}\) and start all over again.
At each step, we define two measures \(\mu ^{(i)}\) and \(\nu ^{(i)}\) and increase the cardinality of the supports of \(\mu ^{(d)},\mu ^{(c)},\nu ^{(d)}\), and \(\nu ^{(c)}\). Given any \(x \in \mathrm{spt}(\mu ^{(d)})\), we can then find \(i\in \{0,1,\dots ,n-1\}\) such that
and, similarly, for any \(y\in \mathrm{spt}(\nu ^{(d)})\), we can find a \(j\in \{0,1,\dots ,n-1\}\) such that
with the convention \(\mu ^{(0)}=\mu \) and \(\nu ^{(0)}=\nu \). The relation between \(\mu ^{(i)}\) and \(\mu ^{(i+1)}\) is either
or
Similarly, we have
or
Similarly, we can write \(\mu ^{(i)}\) and \(\nu ^{(i)}\) as a function of \(\mu ^{(i-1)}\) and \(\nu ^{(i-1)}\), and then express \(\mu ^{(i+1)}\) through \(\mu ^{(i-1)}\) and \(\nu ^{(i-1)}\) as
where \(\tilde{A}_2\) and \(\tilde{B}_2\) are two subsets of X whose cardinality is at most two. By iterating this process, we are able to find
where \(\tilde{A}_{n-(i+1)}\) and \(\tilde{B}_{n-(i+1)}\) are subsets of X, whose cardinality is \(n-(i+1)\). Since the left side of (12) is positive, we can rewrite (13) as
By taking the minimum over \(K(\mu ,\nu )\) of the right side in (14), we find
for any \(i \in \{0,1,\dots ,n-1\}\) and each \(x\in \mathrm{spt}(\mu ^{(i)})\), therefore, from relation (11), we get
Similarly, one can prove
for each \(y \in \mathrm{spt}(\nu ^{(d)})\), hence relation (10) is proven. \(\square \)
In Corollary 4, we bound \(W_c^{(\infty )}\) from above with \(W_c\). However, due to the properties of \(W^{(\infty )}_c\), it is possible to relate this distance to the Wasserstein cost induced by any p-th power of the same cost function.
Lemma 6
Let \(\mu ,\nu \in \mathcal {P}(X)\) and let \(c:X\times X \rightarrow \mathbb {R}_+\) be a cost function. Given any \(p>0\), it holds true
Proof
Let \(\pi \in \Pi (\mu ,\nu )\) be a plan such that
then
Similarly, one can prove \(\big (W_c^{(\infty )}(\mu ,\nu )\big )^p\le W^{(\infty )}_{c^p}(\mu ,\nu )\), which concludes the proof. \(\square \)
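In the discrete setting, the key step of the argument can be spelled out for a fixed plan. The following worked computation is a sketch that assumes \(W^{(\infty )}_c\) is the infimum of \(||c||_{L^{\infty }_\pi }\) over \(\Pi (\mu ,\nu )\), and it uses only that \(t\mapsto t^p\) is increasing on \([0,+\infty )\) for \(p>0\).

```latex
% For a fixed discrete plan \pi and a nonnegative cost c:
\[
  ||c^p||_{L^{\infty}_\pi}
  = \max_{(x,y)\in \mathrm{spt}(\pi)} c_{x,y}^{\,p}
  = \Big( \max_{(x,y)\in \mathrm{spt}(\pi)} c_{x,y} \Big)^{p}
  = ||c||_{L^{\infty}_\pi}^{\,p} .
\]
% Taking the infimum over \pi \in \Pi(\mu,\nu) on both sides, and using
% again that t -> t^p is increasing, gives
\[
  W^{(\infty)}_{c^p}(\mu,\nu) = \big( W^{(\infty)}_{c}(\mu,\nu) \big)^{p} .
\]
```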
Thanks to Lemma 6, we are able to prove the following result.
Theorem 7
Given a cost function \(c:X\times X\rightarrow [0,\infty )\), let \(\mu ,\nu \in \mathcal {P}(X)\) be two discrete measures. For any \(p\ge 1\),
where \(\alpha _p\) is the constant defined in (9).
Proof
Given a \(p\ge 1\), let us denote with \(\pi ^{(p)}\) the trim optimal transportation plan between \(\mu \) and \(\nu \) according to the cost function \(c_p\). Given a diffusive model for \(\pi ^{(p)}\), we denote with \(\alpha _p\) the constant defined in (9). From Lemma 6 we have
hence, for any p, we have
i.e.,
\(\square \)
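The chain of inequalities behind this proof can be summarized as follows. This is a sketch under the assumption that estimate (8) has the form \(W^{(\infty )}_c\le W_c/\alpha \), where \(W_{c^p}\) denotes the infimum in (2) for the cost \(c^p\) and \(\alpha _p\) is the constant associated with a diffusive model for \(\pi ^{(p)}\).

```latex
% Lemma 6, followed by estimate (8) applied to the cost c^p:
\[
  \big( W^{(\infty)}_{c}(\mu,\nu) \big)^{p}
  = W^{(\infty)}_{c^p}(\mu,\nu)
  \le \frac{W_{c^p}(\mu,\nu)}{\alpha_p} .
\]
% Taking p-th roots on both sides:
\[
  W^{(\infty)}_{c}(\mu,\nu)
  \le \frac{\big( W_{c^p}(\mu,\nu) \big)^{1/p}}{\alpha_p^{1/p}} .
\]
```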
In particular, since the constant \(\alpha \) from Corollary 5 bounds from below every \(\alpha _p\) and does not depend on the cost function but only on the starting measures \(\mu \) and \(\nu \), we have
for any \(p\ge 1\). In particular, if we take
we recover the bound proposed in Theorem 1 for discrete measures.
Remark 4
The estimate in (15) is sharp. To prove it, let us take
where \(a,b\in \mathbb {R}^n\). By definition (9), we have \(\alpha =1\). Moreover, it is easy to see that
which proves the sharpness of inequality (8).
References
Abdellaoui, T., Heinich, H.: Caractérisation d’une solution optimale au problème de Monge-Kantorovitch. Bull. Soc. Math. France 127(3), 429–443 (1999)
Albertos, J. C., Matrán, C., Tuero-Díaz, A.: On the monotonicity of optimal transportation plans. J Math Anal Appl. 215(1), 86–94. ISSN 0022-247X (1997)
Ambrosio, L., Gigli, N., Savaré, G.: Gradient Flows: In Metric Spaces and in the Space of Probability Measures. Birkhäuser, Basel (2008)
Arjovsky, M., Chintala, S., Bottou, L.: Wasserstein generative adversarial networks. Proc. Mach. Learn. Res. 70(214–223), 06–11 (2017)
Bassetti, F., Regazzini, E.: Asymptotic properties and robustness of minimum dissimilarity estimators of location-scale parameters. Soc. Ind. Appl. Math. 50, 312–330, 01 (2005) 10.4213/tvp109
Bassetti, F., Bodini, A., Regazzini, E.: On minimum Kantorovich distance estimators. Stat. Probab. Lett. 76(12), 1298–1302 (2006). (https://EconPapers.repec.org/RePEc:eee:stapro:v:76:y:2006:i:12:p:1298-1302)
Bassetti, F., Gualandi, S., Veneroni, M.: On the computation of Kantorovich-Wasserstein distances between 2D-histograms by uncapacitated minimum cost flows. SIAM J. Optim. 30(3), 2441–2469 (2020)
Bouchitté, G., Jimenez, C., Mahadevan, R.: A new \({L}^{\infty }\) estimate in optimal mass transport. Proc Am Math Soc, 135:3525–3535, 11 (2007) https://doi.org/10.1090/S0002-9939-07-08877-6
Brenier, Y.: Polar factorization and monotone rearrangement of vector-valued functions. Commun. Pure Appl. Math. 44(4), 375–417 (1991)
Caffarelli, L.A., Feldman, M., McCann, R.J.: Constructing optimal maps for Monge’s transport problem as a limit of strictly convex costs. J. Am. Math. Soc. 15(1), 1–26 (2002). ISSN 08940347, 10886834. http://www.jstor.org/stable/827090
Cuturi, M., Doucet, A.: Fast computation of Wasserstein barycenters. Proc. Mach. Learn. Res. 32(2), 685–693, 22–24 (2014)
Dantzig, G.B., Thapa, M.N.: Linear Programming 1: Introduction. Springer, Berlin (1997). ISBN 0387948333
Dobrushin, R.: Vlasov equations. Funct. Anal. Appl. 13(2), 115–123 (1979)
Figalli, A.: Existence, uniqueness, and regularity of optimal transport maps. SIAM J. Math. Anal. 39(1), 126–137 (2007)
Frogner, C., Zhang, C., Mobahi, H., Araya, M., Poggio, T. A.: Learning with a wasserstein loss. In: Advances in Neural Information Processing Systems, pp. 2053–2061 (2015)
Gangbo, W., McCann, R.J.: The geometry of optimal transportation. Acta Math. 177(2), 113–161 (1996)
Murata, H., Tanaka, H.: An inequality for certain functional of multidimensional probability distributions. Hiroshima Math. J. 4(1), 75–81 (1974)
Tanaka, H.: An inequality for a functional of probability distributions and its application to Kac’s one-dimensional model of a Maxwellian gas. Z. Wahrscheinlichkeitstheorie Verwandte Gebiete 27, 47–52 (1973)
Tanaka, H.: Probabilistic treatment of the Boltzmann equation of Maxwellian molecules. Z. Wahrscheinlichkeitstheorie Verwandte Gebiete 46, 67–105 (1978)
Kantorovich, L.V.: Mathematical methods of organizing and planning production. Manag. Sci. 6(4), 366–422 (1960)
Kantorovich, L.V.: On the translocation of masses. J. Math. Sci. 133(4), 1381–1382 (2006)
Levina, E., Bickel, P.: The Earth Mover’s Distance is the Mallows distance: Some insights from statistics. In: Proceedings of the IEEE International Conference on Computer Vision, Vol. 2, pp. 251 – 256, 02 (2001). https://doi.org/10.1109/ICCV.2001.937632
Loeper, G., et al.: On the regularity of solutions of optimal transportation problems. Acta Math. 202(2), 241–283 (2009)
Monge, G.: Mémoire sur la théorie des déblais et des remblais. In: Histoire de l’Académie Royale des Sciences de Paris (1781)
Pele, O., Werman, M.: Fast and robust earth mover’s distances. In: 2009 IEEE 12th International Conference on Computer Vision, pp. 460–467 IEEE (2009)
Rubner, Y., Tomasi, C., Guibas, L.: Metric for distributions with applications to image databases. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 59–66, 02 (1998). https://doi.org/10.1109/ICCV.1998.710701
Rubner, Y., Tomasi, C., Guibas, L.J.: The Earth Mover’s Distance as a metric for image retrieval. Int. J. Comput. Vis. 40(2), 99–121 (2000)
Rüschendorf, L., Rachev, S.T.: A characterization of random variables with minimum L2-distance. J. Multivar. Anal. 32(1), 48–54 (1990)
Santambrogio, F.: Optimal Transport for Applied Mathematicians. Springer, New York (2015)
Solomon, J., Rustamov, R., Guibas, L., Butscher, A.: Wasserstein propagation for semi-supervised learning. Proc. Mach. Learn. Res. 32(1), 306–314, 22–24 (2014)
Villani, C.: Optimal Transport: Old and New, vol. 338. Springer, Berlin (2009)
Acknowledgements
We are deeply indebted to Filippo Santambrogio for introducing us to the work of Bouchitté, Jimenez, and Mahadevan and for several stimulating discussions and valuable suggestions. We thank Stefano Gualandi for his feedback and Gabriele Loli for enhancing the images of this paper.
Funding
Open access funding provided by Università degli Studi di Pavia within the CRUI-CARE Agreement. The authors have not disclosed any funding.
Ethics declarations
Competing interest
The authors have not disclosed any competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Auricchio, G., Veneroni, M. On the Structure of Optimal Transportation Plans between Discrete Measures. Appl Math Optim 85, 42 (2022). https://doi.org/10.1007/s00245-022-09861-4
DOI: https://doi.org/10.1007/s00245-022-09861-4
Keywords
- Wasserstein distance
- Discrete optimal transport
- Uniform estimates
- Structure of solutions
- Monge–Kantorovich problem