On the Structure of Optimal Transportation Plans between Discrete Measures

Auricchio, Gennaro; Veneroni, Marco

doi:10.1007/s00245-022-09861-4

Open access
Published: 10 May 2022

Volume 85, article number 42, (2022)

Applied Mathematics & Optimization

Abstract

It is well known that the optimal transportation plan between two probability measures $\mu $ and $\nu $ is induced by a transportation map whenever $\mu $ is an absolutely continuous measure supported over a compact set in the Euclidean space and the cost function is a strictly convex function of the Euclidean distance. However, when $\mu $ and $\nu $ are both discrete, this result is generally false. In this paper, we prove that, given any pair of discrete probability measures and a cost function, there exists an optimal transportation plan that can be expressed as the sum of two deterministic plans, i.e., plans induced by transportation maps. As an application, we estimate the infinity-Wasserstein distance between two discrete probability measures $\mu $ and $\nu $ with the p-Wasserstein distance, times a constant depending on $\mu $, on $\nu $, and on the fixed cost function.

1 Introduction

The Optimal Transport (OT) problem is a classical minimization problem dating back to the work of Monge [24] and Kantorovich [20, 21]. In this problem, we are given two probability measures, namely $\mu $ and $\nu $, and we search for the cheapest way to reshape $\mu $ into $\nu $. The effort needed in order to perform this transformation depends on a cost function, which describes the underlying geometry of the product space of the support of the two measures. In the right setting, this effort induces a distance between probability measures.

During the last century, the OT problem has been fruitfully used in many applied fields, such as the study of systems of particles by Dobrushin [13], the Boltzmann equation by Tanaka [17,18,19], and fluid dynamics by Brenier [9]. All these results showed that a qualitative description of optimal transport yields insightful information on many open problems. For this reason, the Optimal Transport problem has become a topic of major interest for analysts, probabilists and statisticians [3, 29, 31]. In particular, a plethora of results concerning the uniqueness [10, 14, 16], the structure [1, 2, 28], and the regularity [8, 23] of the optimal transportation plan in the continuous framework have been proved.

In recent years, it has also become a crucial sub-problem in several applications in Computer Vision [7, 25,26,27], Computational Statistics [22], Probability [5, 6], and Machine Learning [4, 11, 15, 30]. However, in these fields, the measures $\mu $ and $\nu $ are discrete, and therefore the optimal transportation plans lack most of the good properties their continuous counterparts enjoy.

In this paper, we study the structure of optimal transportation plans between discrete probability measures. After introducing the notion of trim plan between the measures $\mu $ and $\nu $, we prove that such plans are the sum of two deterministic plans, i.e., plans that are induced by the action of two suitable push-forward maps. The first map acts on a portion $\mu ^{(d)}$ of $\mu $, while the other one acts on a portion $\nu ^{(d)}$ of $\nu $ (Theorem 3). Thanks to this formula, we recover an extension of the estimate given in [8]. Namely, we estimate the infinity-Wasserstein distance between a pair of discrete measures $(\mu ,\nu )$ (see Definition 4 below) by the c-Wasserstein distance between $\mu $ and $\nu $, times a quantity that only depends on $\mu $ and $\nu $ (Theorem 7).

2 Basic Notions on Optimal Transport

In this section, following [31], we recall the main definitions regarding optimal transportation and we examine the continuous counterpart [8] to our $W^\infty $ estimate.

Given a Polish space (X, d), we denote with $\mathcal {B}(X)$ the set of Borel sets over X, and with $\mathcal {P}(X)$ the set of Borel probability measures over X. Given a Borel measurable function $T:X\rightarrow Y$, we denote with $T_\#:\mathcal {P}(X)\rightarrow \mathcal {P}(Y)$ the push-forward operator induced by T, defined by: $(T_\#\mu )[A]=\mu [T^{-1}(A)]$. The projection maps are $\mathfrak {p}_X:X\times Y \rightarrow X$, $\mathfrak {p}_X(x,y)=x$ and $\mathfrak {p}_Y:X\times Y \rightarrow Y$, $\mathfrak {p}_Y(x,y)=y$.

Definition 1

Let $\mu $ and $\nu $ be two probability measures over two Polish spaces X and Y. The probability measure $\pi \in \mathcal {P}(X\times Y)$ is a transportation plan between $\mu $ and $\nu $ if

$$\begin{aligned} (\mathfrak {p}_{X})_\#\pi =\mu \quad \quad \text {and} \quad \quad (\mathfrak {p}_{Y})_\#\pi =\nu . \end{aligned}$$

We denote with $\Pi (\mu ,\nu )$ the set of all the transportation plans between $\mu $ and $\nu .$

Given $A\in \mathcal {B}(X)$ and $B\in \mathcal {B}(Y)$, the quantity $\pi (A\times B)$ is the amount of mass that travels from the set A to the set B. By assigning a cost function c on $X\times Y$ we specify a way to measure the cost of every transportation plan.

Definition 2

Let $\mu \in \mathcal {P}(X)$, $\nu \in \mathcal {P}(Y)$, and let $c:X\times Y\rightarrow [0,+\infty )$ be a lower semicontinuous (l.s.c.) symmetric cost function. The transportation functional $\mathbb {T}_c:\Pi (\mu ,\nu )\rightarrow [0,+\infty )$ is defined as

$$\begin{aligned} \mathbb {T}_c(\pi ):=\int _{X\times Y}c\ \mathrm{d}\pi . \end{aligned}$$

(1)

Given two measures $\mu \in \mathcal {P}(X)$, $\nu \in \mathcal {P}(Y)$, and a cost function c, the optimal transportation problem consists in finding the infimum of $\mathbb {T}_c$ over $\Pi (\mu ,\nu )$, i.e.

$$\begin{aligned} \inf _{\pi \in \Pi (\mu ,\nu )} \mathbb {T}_c(\pi ). \end{aligned}$$

(2)

By making further assumptions on c, it is possible to prove that the infimum in (2) is actually a minimum. In particular, when the cost function is l.s.c. and nonnegative, as in Definition 2, a minimizer exists. For a complete discussion on the existence of the solution, we refer to [31, Chapter 4].
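For finitely supported measures, Definitions 1 and 2 reduce to finite sums. The following is a minimal sketch of that discrete reading, assuming a plan is stored as a dictionary mapping pairs (x, y) to masses; the helper names `marginals` and `transport_cost` and the two-point data are illustrative assumptions, not objects from the paper.

```python
from fractions import Fraction as F

def marginals(plan):
    """Return the two marginals (p_X)_# pi and (p_Y)_# pi of a discrete plan."""
    mu, nu = {}, {}
    for (x, y), mass in plan.items():
        mu[x] = mu.get(x, 0) + mass
        nu[y] = nu.get(y, 0) + mass
    return mu, nu

def transport_cost(plan, cost):
    """Discrete version of T_c(pi): the sum of c(x, y) * pi_{x,y}."""
    return sum(cost(x, y) * mass for (x, y), mass in plan.items())

# Hypothetical data: X = Y = {0, 1}, mu = nu uniform, c(x, y) = |x - y|.
cost = lambda x, y: abs(x - y)
half = {0: F(1, 2), 1: F(1, 2)}
identity = {(0, 0): F(1, 2), (1, 1): F(1, 2)}                  # leaves mass in place
product = {(x, y): F(1, 4) for x in (0, 1) for y in (0, 1)}    # mu tensor nu

for plan in (identity, product):
    print(marginals(plan) == (half, half), transport_cost(plan, cost))
```

Both plans belong to $\Pi (\mu ,\nu )$ for the same marginals, yet they have different costs, which is why problem (2) is a genuine minimization.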

We can use the optimal transportation problem to define a distance over the space $\mathcal {P}(X)$. In particular, since X is a Polish space, we can lift the distance d from X to $\mathcal {P}(X)$, by choosing d as a cost function in (1).

Definition 3

Let (X, d) be a Polish space and $p\in [1,\infty )$. The Wasserstein distance of order p between the probability measures $\mu $ and $\nu $ on X is defined as

$$\begin{aligned} W_p(\mu ,\nu ):=\Big (\inf _{\pi \in \Pi (\mu ,\nu )} \mathbb {T}_{d^p}(\pi )\Big )^{\frac{1}{p}} =\Big (\inf _{\pi \in \Pi (\mu ,\nu )} \int _{X\times X}d^p(x,y) \mathrm{d}\pi (x,y)\Big )^{\frac{1}{p}}. \end{aligned}$$

(3)

When $p=1$, the 1-Wasserstein distance is also known as Kantorovich-Rubinstein distance.

When the cost function is not the space distance d, we denote the infimum in (2) with $W_c(\mu ,\nu )$.

Remark 1

The infimum in (3) could actually be $+\infty $; it is thus customary to restrict $W_p$ to the space of probability measures with finite p-th moments.

Definition 4

Given a cost function c, the $W^{(\infty )}_c$ distance between two measures $\mu $ and $\nu $ is defined as

$$\begin{aligned} W^{(\infty )}_c(\mu ,\nu )=\inf _{\pi \in \Pi (\mu ,\nu )}||c ||_{L^{\infty }_\pi } \end{aligned}$$

where $||\,\cdot \,||_{L^{\infty }_\pi }$ is the $L^{\infty }$ norm with respect to the measure $\pi $. When c is the Euclidean distance, we use the notation: $W^{(\infty )}$.

Let $\mu $ and $\nu $ be two probability measures on a Lipschitz regular and bounded subset $\Omega \subset \mathbb {R}^n$. We define the cost function

$$\begin{aligned} c_p({\textbf {x}},{\textbf {y}}):=\left( \sqrt{\sum _{i=1}^n|x_i-y_i|^2}\right) ^{p}, \quad \quad p>1. \end{aligned}$$

When $\mu $ is absolutely continuous with respect to the Lebesgue measure, it is well known (Theorem 6.3 and Theorem 6.4, [16]) that the optimal transportation plan $\pi $ between $\mu $ and $\nu $ is unique and it is induced by a transportation map $T_p$, i.e.

$$\begin{aligned} \pi =(Id,T_p)_\#\mu . \end{aligned}$$

In [8], Bouchitté et al. established an $L^{\infty }_\mu $-bound on the displacement map $Id-T_p$, which only depends on the shape of $\Omega $, on p, and on the density of $\mu $. This estimate allowed the authors to give the following upper bound on the $W^{(\infty )}$ distance between $\mu $ and $\nu $.

Theorem 1

(Theorem 1.2, [8]) Let $\Omega $ be a bounded connected open subset of $\mathbb {R}^n$ with Lipschitz boundary and denote by $\mathcal {P}(\overline{\Omega })$ (resp. $\mathcal {P}_{ac}(\Omega )$) the set of Borel (resp. absolutely continuous) probability measures on $\overline{\Omega }$. Then, for every $p>1$ and every pair $(\mu ,\nu )\in \mathcal {P}_{ac}(\Omega )\times \mathcal {P}(\overline{\Omega })$ there holds

$$\begin{aligned} (W^{(\infty )}(\mu ,\nu ))^{p+n}\le C_{p,n}(\Omega )||f^{-1}||_{L^{\infty }(\Omega )}W^p_p(\mu ,\nu ), \end{aligned}$$

(4)

where f is the density of $\mu $ with respect to the Lebesgue measure and $C_{p,n}(\Omega )$ is a positive constant depending only on p, n, and $\Omega $.

The proof of this result heavily relies on the regularity of $\mu $, hence, when $\mu $ and $\nu $ are both discrete, this result does not apply. In particular, we are no longer able to find a constant depending only on $\mu $ and the geometry of the support of $\mu $, as the following example shows.

Example 1

Let $\mu ,\nu _\epsilon \in \mathcal {P}(\mathbb {R})$ be defined as

$$\begin{aligned} \mu =\dfrac{1}{2}\delta _{0}+\dfrac{1}{2}\delta _{1},\quad \quad \quad \nu _\epsilon =\dfrac{1-\epsilon }{2}\delta _{0}+\dfrac{1+\epsilon }{2}\delta _{1}, \end{aligned}$$

for $\epsilon \in (0,1)$, and let $c_2(x,y)=|x-y|^2$. By a simple computation we have that

$$\begin{aligned} W^{(\infty )}_2(\mu ,\nu _\epsilon )=1,\quad \quad \quad W^2_2(\mu ,\nu _\epsilon )=\dfrac{\epsilon }{2}. \end{aligned}$$

Hence, estimate (4) does not hold true, as for every constant $C(p,n,\Omega ,\mu )>0$ (possibly depending on $p,n,\Omega ,\mu $), there exists $\epsilon >0$ such that

$$\begin{aligned} (W^{(\infty )}(\mu ,\nu _\epsilon ))^{2+1}=1 > C(p,n,\Omega ,\mu ) W^2_2(\mu ,\nu _\epsilon )=\dfrac{\epsilon }{2} C(p,n,\Omega ,\mu ). \end{aligned}$$
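The simple computation behind Example 1 can be spelled out as follows. This is a sketch in which the parameter t and the notation $\pi _{x,y}$ for the mass sent from x to y are introduced only for illustration. Every plan $\pi \in \Pi (\mu ,\nu _\epsilon )$ is determined by $t=\pi _{0,0}\in [0,\frac{1-\epsilon }{2}]$, since the marginal constraints force

$$\begin{aligned} \pi _{0,0}=t,\quad \pi _{0,1}=\dfrac{1}{2}-t,\quad \pi _{1,0}=\dfrac{1-\epsilon }{2}-t,\quad \pi _{1,1}=\dfrac{\epsilon }{2}+t. \end{aligned}$$

Since $c_2$ equals 1 on the pairs (0, 1) and (1, 0) and vanishes on the diagonal, we get

$$\begin{aligned} \mathbb {T}_{c_2}(\pi )=\pi _{0,1}+\pi _{1,0}=1-\dfrac{\epsilon }{2}-2t, \end{aligned}$$

which is minimal for $t=\frac{1-\epsilon }{2}$, where it equals $\frac{\epsilon }{2}$. Moreover, $\pi _{0,1}=\frac{1}{2}-t\ge \frac{\epsilon }{2}>0$ for every admissible t, so every plan charges the pair (0, 1), whose cost is 1. Hence $||c_2 ||_{L^{\infty }_\pi }=1$ for every plan, regardless of $\epsilon $.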

3 Structure of Discrete Optimal Transportation Plans

In what follows, we prove the existence of an optimal transportation plan between two discrete measures that is induced by the action of two push-forward functions, one going from X to Y and one going from Y to X. This allows us to establish a bound on $W^{(\infty )}(\mu ,\nu )$, similar to the one proved in [8]. We always assume $\# X=\# Y =n \in \mathbb {N}$. In this case, we can identify the sets X and Y with $\{1,\dots , n\}$. Without loss of generality, we therefore assume $X=Y$. In this setting, a measure $\mu \in \mathcal {P}(X)$ has the form $\sum _{x\in X} \mu _x\delta _x$; we thus use the notation $\mu _x$ to denote the coefficient of $\mu $ at x and, likewise, $c_{x,y}$ (resp. $\pi _{x,y}$) stands for the value of $c:X\times Y \rightarrow \mathbb {R}$ (resp. the coefficient of $\pi \in \mathcal {P}(X\times Y)$) at the point $(x,y)\in X\times Y$.

Definition 5

Let $\mu ,\nu \in \mathcal {P}(X)$ be two measures on a discrete set X and let $c:X\times X\rightarrow \mathbb {R}$ be a cost function. A minimal solution $\pi ^*$ of the transportation problem is said to be trim if

$$\begin{aligned} \# \mathrm{spt}(\pi ^*)\le \# \mathrm{spt} (\pi ) \end{aligned}$$

for each optimal solution $\pi $.

Lemma 2

Let $\pi \in \Pi (\mu ,\nu )$ be a trim solution. Then each restriction of $\pi $ is a trim solution for its marginals. In particular, if $\pi ^{(1)}$ and $\pi ^{(2)}$ are such that

$$\begin{aligned} \pi =\pi ^{(1)}+\pi ^{(2)} \end{aligned}$$

and $\mathrm{spt}(\pi ^{(1)})\cap \mathrm{spt}(\pi ^{(2)})=\emptyset $, then $\pi ^{(1)}$ and $\pi ^{(2)}$ are trim solutions for their marginals.

Proof

Let $\pi ^{*}$ be a restriction of $\pi $. By Theorem 4.6 (Chapter 4, [31]), we know that $\pi ^*$ is optimal between its marginals, hence we only need to prove that its support has minimal cardinality.

Arguing by contradiction, let us assume that $\pi ^*$ is not trim, hence there exists another optimal plan $\eta $ between the marginals of $\pi ^*$ such that

$$\begin{aligned} \#\mathrm{spt}(\eta )<\#\mathrm{spt}(\pi ^*). \end{aligned}$$

We can define the measure $\hat{\pi }$ as

$$\begin{aligned} \hat{\pi }=\pi -\pi ^* +\eta , \end{aligned}$$

since $\pi \ge \pi ^*$ and $\eta \ge 0$, we have $\hat{\pi }\ge 0$. Moreover, since $\pi ^*$ and $\eta $ have the same marginals, $\hat{\pi }$ has the same marginals as $\pi $, therefore $\hat{\pi }\in \Pi (\mu ,\nu )$. Furthermore, since $\pi ^*$ and $\eta $ are both optimal between their common marginals, we have

$$\begin{aligned} \sum _{(x,y)\in X\times X}c_{x,y}\pi ^*_{x,y}=\sum _{(x,y)\in X\times X}c_{x,y}\eta _{x,y}, \end{aligned}$$

thus

$$\begin{aligned} \sum _{(x,y)\in X\times X}c_{x,y}\hat{\pi }_{x,y}= & {} \sum _{(x,y)\in X\times X}c_{x,y}\pi _{x,y}-\sum _{(x,y)\in X\times X}c_{x,y}\pi ^*_{x,y}\\&\quad&+\sum _{(x,y)\in X\times X}c_{x,y}\eta _{x,y}\\= & {} \sum _{(x,y)\in X\times X}c_{x,y}\pi _{x,y}. \end{aligned}$$

In particular, $\pi $ and $\hat{\pi }$ have the same cost, therefore $\hat{\pi }$ is an optimal transportation plan between $\mu $ and $\nu $.

To conclude, we notice that, since $\pi ^*$ is a restriction of $\pi $, we have

$$\begin{aligned} \#\mathrm{spt}(\pi )=\#\mathrm{spt}(\pi -\pi ^*)+\#\mathrm{spt}(\pi ^*)>\#\mathrm{spt}(\pi -\pi ^*)+\#\mathrm{spt}(\eta )\ge \#\mathrm{spt}(\hat{\pi }), \end{aligned}$$

which is a contradiction, since $\pi $ is trim by hypothesis. $\square $

Theorem 6.3 in [16] states that, whenever $\mu $ is an absolutely continuous measure supported over a compact set $\Omega \subset \mathbb {R}^n$ and the cost function c is a strictly convex function of the Euclidean distance, the optimal transportation plan is induced by a transportation map, regardless of the regularity of $\nu $. When $\mu $ and $\nu $ are both discrete, this result is generally false. However, in Theorem 3 below, we show that there exists at least one optimal transportation plan between two measures that can be written as the action of two functions, one acting from a subset $\tilde{X}\subset \mathrm{spt}(\mu )$ to $ \mathrm{spt}(\nu )$ and one acting from a subset $\tilde{Y}\subset \mathrm{spt}(\nu )$ to $\mathrm{spt}(\mu )$.

Theorem 3

Let X be a discrete Polish space and let $\mu $ and $\nu $ be two positive measures over the set X such that

$$\begin{aligned} \mu _{a}>0 \quad \quad \quad \forall a \in X, \\ \nu _b>0 \quad \quad \quad \forall b \in X, \end{aligned}$$

and

$$\begin{aligned} \sum _{a\in X }\mu _a=\sum _{b \in X}\nu _b. \end{aligned}$$

Given a cost function $c:X\times X \rightarrow \mathbb {R}$, let $\pi $ be a trim solution of the transportation problem. We can then find two couples of measures $(\mu ^{(d)},\mu ^{(c)})$ and $(\nu ^{(d)},\nu ^{(c)})$ and a couple of functions $h^{(1)}$ and $h^{(2)}$ such that

$$\begin{aligned} \mu&=\mu ^{(d)}+\mu ^{(c)}\quad \text {and}\quad \nu =\nu ^{(d)}+\nu ^{(c)}, \end{aligned}$$

(5)

$$\begin{aligned} \pi&=(Id,h^{(1)})_\#\mu ^{(d)}+(h^{(2)},Id)_\#\nu ^{(d)}. \end{aligned}$$

(6)

We say that the decomposition ensured by Theorem 3 is a diffusive model associated with the given (trim) solution $\pi $. We call $\mu ^{(d)}$ and $\nu ^{(d)}$ the diffusive part of $\mu $ and $\nu $, respectively. Similarly, we denote with $\mu ^{(c)}$ and $\nu ^{(c)}$ the concentrating part of $\mu $ and $\nu $, respectively. Finally, we call $h^{(1)}$ the diffusive scheme of $\mu $ and $h^{(2)}$ the diffusive scheme of $\nu $.

Proof

We proceed by induction on the cardinality of X. If $\#X=1$, the thesis follows trivially.

Let us now assume that the statement holds for each couple of measures whose support has cardinality $(n-1)$ and let $\mu $ and $\nu $ be two measures supported on a set with cardinality n, namely $X_n$. Given a trim solution $\pi $, it is well known (Chapter 7, [12]) that

$$\begin{aligned} \# \mathrm{spt} (\pi )\le 2n-1. \end{aligned}$$

Since $\mu $ and $\nu $ have n points in their support while $\pi $ charges at most $2n-1$ pairs, the pigeonhole principle yields $\bar{a}\in X$ such that there exists a unique $\bar{b}\in \mathrm{spt}(\nu )$ for which

$$\begin{aligned} \pi _{\bar{a},\bar{b}}>0, \end{aligned}$$

hence $\mu _{\bar{a}}=\pi _{\bar{a},\bar{b}}\le \nu _{\bar{b}}$. Similarly, we can find $\underline{b}\in X$ such that there exists a unique $\underline{a}\in \mathrm{spt}(\mu )$ for which

$$\begin{aligned} \pi _{\underline{a},\underline{b}}>0, \end{aligned}$$

so that $\nu _{\underline{b}}=\pi _{\underline{a},\underline{b}}\le \mu _{\underline{a}}$.

If $\mu _{\bar{a}}=\pi _{\bar{a},\bar{b}} = \nu _{\bar{b}}$, we can restrict the plan $\pi $ to the set $\mathrm{spt}(\pi )\backslash \{(\bar{a},\bar{b})\}$. We denote this restriction with $\pi _*$. By definition, the marginals of $\pi _*$ are

$$\begin{aligned} \mu _*=\mu -\mu _{\bar{a}}\delta _{\bar{a}} \end{aligned}$$

and

$$\begin{aligned} \nu _*=\nu -\nu _{\bar{b}}\delta _{\bar{b}}. \end{aligned}$$

In particular, the supports of $\mu _*$ and $\nu _*$ contain $(n-1)$ points each. By induction we can find $(\mu ^{(d)}_*,\mu ^{(c)}_*)$, $(\nu ^{(d)}_*,\nu ^{(c)}_*)$, and $(h^{(1)}_*,h^{(2)}_*)$ such that

$$\begin{aligned} \mu _*=\mu ^{(d)}_*+\mu ^{(c)}_*, \\ \nu _*=\nu ^{(d)}_*+\nu ^{(c)}_*, \end{aligned}$$

and

$$\begin{aligned} \pi _*=(Id,h^{(1)}_*)_\#\mu ^{(d)}_* +(h^{(2)}_*,Id)_\#\nu ^{(d)}_*. \end{aligned}$$

We can then define

$$\begin{aligned} \mu ^{(d)}=\mu ^{(d)}_*+\mu _{\bar{a}}\delta _{\bar{a}}, \quad \quad \quad \mu ^{(c)}=\mu _*^{(c)}, \\ \nu ^{(d)}=\nu ^{(d)}_*, \quad \quad \quad \nu ^{(c)}=\nu _*^{(c)}+\nu _{\bar{b}}\delta _{\bar{b}}, \end{aligned}$$

and

$$\begin{aligned} h^{(1)}(a)= {\left\{ \begin{array}{ll} h^{(1)}_*(a)\quad \quad \quad \text {if }a\ne \bar{a},\\ \bar{b} \quad \quad \quad \quad \quad \; \text {otherwise,} \end{array}\right. },\quad \quad \quad \quad \quad h^{(2)}(b)=h^{(2)}_*(b). \end{aligned}$$

It is easy to see that

$$\begin{aligned} \mu =\mu ^{(d)}+\mu ^{(c)}, \quad \quad \quad \nu =\nu ^{(d)}+\nu ^{(c)} \end{aligned}$$

and, since $h^{(1)}_\#\delta _{\bar{a}}=\delta _{\bar{b}}$, we have

$$\begin{aligned} \pi =(Id,h^{(1)})_\#\mu ^{(d)}+(h^{(2)},Id)_\#\nu ^{(d)}, \end{aligned}$$

(7)

which concludes the proof in the case $\mu _{\bar{a}}=\pi _{\bar{a},\bar{b}}=\nu _{\bar{b}}$. We proceed similarly if $\nu _{\underline{b}}=\pi _{\underline{a},\underline{b}}= \mu _{\underline{a}}$. See Fig. 1 for a visual representation of this process in the case $n=3$.

To conclude, consider the case in which $\mu _{\bar{a}}=\pi _{\bar{a},\bar{b}} < \nu _{\bar{b}}$ and $\nu _{\underline{b}}=\pi _{\underline{a},\underline{b}}< \mu _{\underline{a}}$. In this case, we restrict $\pi $ to the set $\mathrm{spt}(\pi )\backslash \{(\bar{a},\bar{b}),(\underline{a},\underline{b})\}$. Let us denote again with $\pi _*$ the restriction and with $\mu _*$ and $\nu _{*}$ its marginals. Since both $\mu _*$ and $\nu _*$ have $(n-1)$ points in their supports, we can again decompose them as

$$\begin{aligned} \mu _*=\mu ^{(d)}_*+\mu ^{(c)}_*, \quad \quad \quad \nu _*=\nu ^{(d)}_*+\nu ^{(c)}_* \end{aligned}$$

and find a couple of functions $h^{(1)}_*,h^{(2)}_*$ for which

$$\begin{aligned} \pi _*=(Id,h^{(1)}_*)_\#\mu ^{(d)}_*+(h^{(2)}_*,Id)_\#\nu ^{(d)}_*. \end{aligned}$$

We can then define

$$\begin{aligned} \mu ^{(d)}=\mu ^{(d)}_*+\mu _{\bar{a}}\delta _{\bar{a}}, \quad \quad \quad \mu ^{(c)}=\mu _*^{(c)}+\nu _{\underline{b}}\delta _{\underline{a}}, \\ \nu ^{(d)}=\nu ^{(d)}_*+\nu _{\underline{b}}\delta _{\underline{b}}, \quad \quad \quad \nu ^{(c)}=\nu _*^{(c)}+\mu _{\bar{a}}\delta _{\bar{b}}, \end{aligned}$$

and

$$\begin{aligned} h^{(1)}(a)= {\left\{ \begin{array}{ll} h^{(1)}_*(a)\quad \quad \text {if }a\ne \bar{a},\\ \bar{b} \quad \quad \quad \quad \; \text {otherwise.} \end{array}\right. }\quad \quad h^{(2)}(b)={\left\{ \begin{array}{ll} h^{(2)}_*(b)\quad \quad \text {if }b\ne \underline{b},\\ \underline{a} \quad \quad \quad \quad \; \text {otherwise,} \end{array}\right. } \end{aligned}$$

which concludes the thesis. $\square $
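The induction in the proof can be read as a peeling procedure. The following is a minimal sketch of a simplified variant that removes one pair at a time (the proof may remove two at once); it assumes that at every step some source or target point carries exactly one remaining pair, as the proof guarantees for trim plans, and it raises an error otherwise. The function names `diffusive_model` and `rebuild`, as well as the three-point plan, are illustrative assumptions; no optimality is claimed for the sample plan.

```python
from fractions import Fraction as F

def diffusive_model(plan):
    """Peel a discrete plan into (mu_d, h1, nu_d, h2), one support pair per step."""
    rest = {k: v for k, v in plan.items() if v > 0}  # remaining part of spt(pi)
    mu_d, nu_d, h1, h2 = {}, {}, {}, {}
    while rest:
        rows, cols = {}, {}
        for (a, b) in rest:
            rows.setdefault(a, []).append(b)
            cols.setdefault(b, []).append(a)
        # a source point with a unique remaining target is handled by h1
        a = next((a for a, bs in rows.items() if len(bs) == 1), None)
        if a is not None:
            b = rows[a][0]
            h1[a], mu_d[a] = b, rest.pop((a, b))
            continue
        # otherwise a target point with a unique remaining source is handled by h2
        b = next((b for b, as_ in cols.items() if len(as_) == 1), None)
        if b is None:
            raise ValueError("no source or target point with a single pair left")
        a = cols[b][0]
        h2[b], nu_d[b] = a, rest.pop((a, b))
    return mu_d, h1, nu_d, h2

def rebuild(mu_d, h1, nu_d, h2):
    """Evaluate (Id, h1)_# mu_d + (h2, Id)_# nu_d as a dictionary."""
    out = {}
    for a, m in mu_d.items():
        out[(a, h1[a])] = out.get((a, h1[a]), 0) + m
    for b, m in nu_d.items():
        out[(h2[b], b)] = out.get((h2[b], b), 0) + m
    return out

# Hypothetical plan on X = {0, 1, 2} with 2n - 1 = 5 pairs in its support.
plan = {(0, 0): F(1, 4), (0, 1): F(1, 4), (1, 1): F(1, 8),
        (1, 2): F(1, 8), (2, 2): F(1, 4)}
print(rebuild(*diffusive_model(plan)) == plan)
```

The concentrating parts are not returned explicitly, since they can be recovered as $\mu ^{(c)}=\mu -\mu ^{(d)}$ and $\nu ^{(c)}=\nu -\nu ^{(d)}$.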

Remark 2

Given two measures as in the hypothesis of Theorem 3, let $\mu ^{(d)}$ and $\nu ^{(d)}$ be their diffusive parts. Since $\mathrm{spt}(\mu ^{(d)})\subset \mathrm{spt}(\mu )$ and $\mathrm{spt}(\nu ^{(d)})\subset \mathrm{spt}(\nu )$, the support of the transportation plan defined by formula (7) has, at most, 2n points. Thus the trim condition on the optimal transportation plan is necessary, as we are going to show in the next example.

Example 2

Let us take

$$\begin{aligned} \mu =\dfrac{1}{4}\bigg (\delta _{(0,0,0)}+\delta _{(1,1,0)}+\delta _{(1,0,1)}+\delta _{(0,1,1)}\bigg ) \end{aligned}$$

and

$$\begin{aligned} \nu =\dfrac{1}{4}\bigg (\delta _{(1,1,1)}+\delta _{(0,0,1)}+\delta _{(0,1,0)}+\delta _{(1,0,0)}\bigg ), \end{aligned}$$

(see Fig. 2) and, as a cost function, we choose the Euclidean distance in $\mathbb {R}^3$, i.e.

$$\begin{aligned} |{\textbf {x}}-{\textbf {y}}|:=\sqrt{\sum _{i=1}^3(x_i-y_i)^2}. \end{aligned}$$

It is easy to see that the plan

$$\begin{aligned} \pi:= & {} \dfrac{1}{12}\delta _{(0,0,0)}\otimes \bigg (\delta _{(1,0,0)}+\delta _{(0,1,0)}+\delta _{(0,0,1)}\bigg )\\&\quad&+\dfrac{1}{12}\delta _{(1,1,0)}\otimes \bigg (\delta _{(0,1,0)}+\delta _{(1,0,0)}+\delta _{(1,1,1)}\bigg )\\&\quad&+\dfrac{1}{12}\delta _{(1,0,1)}\otimes \bigg (\delta _{(1,0,0)}+\delta _{(0,0,1)}+\delta _{(1,1,1)}\bigg )\\&\quad&+\dfrac{1}{12}\delta _{(0,1,1)}\otimes \bigg (\delta _{(0,1,0)}+\delta _{(0,0,1)}+\delta _{(1,1,1)}\bigg )\\ \end{aligned}$$

is optimal. However, according to Remark 2, it cannot be decomposed as in formula (7), since

$$\begin{aligned} \# \mathrm{spt}(\pi )=12>2\# \mathrm{spt}(\mu )=8. \end{aligned}$$
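The claims of Example 2 can be checked numerically. The sketch below rebuilds the plan from the observation that each source point sends mass 1/12 to the three target points at Euclidean distance 1, then verifies the marginals, the cost, and the size of the support. The variable names are illustrative; the optimality argument used here (no plan can cost less than the smallest pairwise distance times the total mass) is one possible justification, not necessarily the authors' own.

```python
from fractions import Fraction as F
from math import sqrt

mu_pts = [(0, 0, 0), (1, 1, 0), (1, 0, 1), (0, 1, 1)]
nu_pts = [(1, 1, 1), (0, 0, 1), (0, 1, 0), (1, 0, 0)]

def sq_dist(a, b):
    """Squared Euclidean distance, kept as an exact integer."""
    return sum((s - t) ** 2 for s, t in zip(a, b))

# the plan of Example 2: 1/12 from each source to each target at distance 1
pi = {(a, b): F(1, 12) for a in mu_pts for b in nu_pts if sq_dist(a, b) == 1}

row = {a: sum(m for (s, _), m in pi.items() if s == a) for a in mu_pts}
col = {b: sum(m for (_, t), m in pi.items() if t == b) for b in nu_pts}
cost = sum(sqrt(sq_dist(a, b)) * m for (a, b), m in pi.items())
cheapest = min(sqrt(sq_dist(a, b)) for a in mu_pts for b in nu_pts)

print(row, col)               # both marginals put mass 1/4 on each point
print(cost, cheapest)         # the cost matches the lower bound cheapest * 1
print(len(pi), 2 * len(mu_pts))
```

Since the cost of $\pi $ coincides with the lower bound given by the smallest pairwise distance, $\pi $ is optimal, while its support is too large for a decomposition of the form (7).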

Remark 3

Given a trim solution, there might be more than one diffusive model associated with it. For example, let

$$\begin{aligned} \mu =\dfrac{1}{2}\delta _{(0,0)}+\dfrac{1}{2}\delta _{(1,1)}\quad \text {and}\quad \nu =\dfrac{1}{4}\delta _{(-1,1)}+\dfrac{3}{4}\delta _{(1,0)} \end{aligned}$$

be two discrete measures over $\mathbb {R}^2$. As a cost function, we choose the Euclidean distance

$$\begin{aligned} c({\textbf {x}},{\textbf {y}}):=\sqrt{(x_1-y_1)^2+(x_2-y_2)^2}. \end{aligned}$$

Then, the probability measure

$$\begin{aligned} \pi =\dfrac{1}{4}\delta _{(0,0)}\otimes \delta _{(-1,1)}+\dfrac{1}{4}\delta _{(0,0)}\otimes \delta _{(1,0)}+\dfrac{1}{2}\delta _{(1,1)}\otimes \delta _{(1,0)} \end{aligned}$$

is a trim plan between $\mu $ and $\nu $. It is easy to check that

$$\begin{aligned} \mu ^{(d)}=\dfrac{1}{4}\delta _{(0,0)}+\dfrac{1}{2}\delta _{(1,1)},\quad&\quad&\quad \mu ^{(c)}=\dfrac{1}{4}\delta _{(0,0)},\\ \nu ^{(c)}=\dfrac{1}{4}\delta _{(-1,1)}+\dfrac{1}{2}\delta _{(1,0)},\quad&\quad&\quad \nu ^{(d)}=\dfrac{1}{4}\delta _{(1,0)}, \end{aligned}$$

and

$$\begin{aligned} h^{(1)}(x):={\left\{ \begin{array}{ll} (-1,1) \quad \text {if } x=(0,0),\\ (1,0) \quad \;\; \text {if } x=(1,1),\\ (0,0) \quad \;\; \text {otherwise,} \end{array}\right. }\quad \quad h^{(2)}(y)=(0,0)\quad \forall y\in \mathbb {R}^2, \end{aligned}$$

is a decomposition of the trim plan. However, we can also decompose $\nu $ as

$$\begin{aligned} \tilde{\nu }^{(d)}=\dfrac{1}{4}\delta _{(-1,1)},\quad \quad \quad \tilde{\nu }^{(c)}=\dfrac{3}{4}\delta _{(1,0)}, \end{aligned}$$

define the functions as

$$\begin{aligned} h^{(1)}({\textbf {x}})=(1,0)\quad \forall {\textbf {x}}\in \mathbb {R}^2, \quad \quad \quad h^{(2)}({\textbf {y}})=(0,0) \quad \forall {\textbf {y}}\in \mathbb {R}^2, \end{aligned}$$

and still obtain an admissible decomposition of $\pi $.
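Both decompositions of Remark 3 can be verified by evaluating the two push-forwards explicitly. The sketch below does so with exact fractions; the helper names `push_id_h`, `push_h_id` and `add` are illustrative assumptions, and measures are stored as dictionaries from points to masses.

```python
from fractions import Fraction as F

def push_id_h(measure, h):
    """(Id, h)_# measure, stored as {(x, h(x)): mass}."""
    return {(x, h(x)): m for x, m in measure.items()}

def push_h_id(measure, h):
    """(h, Id)_# measure, stored as {(h(y), y): mass}."""
    return {(h(y), y): m for y, m in measure.items()}

def add(p, q):
    """Sum of two discrete measures on the product space."""
    out = dict(p)
    for k, m in q.items():
        out[k] = out.get(k, 0) + m
    return out

pi = {((0, 0), (-1, 1)): F(1, 4),
      ((0, 0), (1, 0)): F(1, 4),
      ((1, 1), (1, 0)): F(1, 2)}
mu_d = {(0, 0): F(1, 4), (1, 1): F(1, 2)}   # diffusive part of mu in both models

# first model: piecewise h1, nu_d = 1/4 delta_(1,0), h2 constant (0,0)
h1 = lambda x: {(0, 0): (-1, 1), (1, 1): (1, 0)}.get(x, (0, 0))
first = add(push_id_h(mu_d, h1),
            push_h_id({(1, 0): F(1, 4)}, lambda y: (0, 0)))

# second model: h1 constant (1,0), nu_d = 1/4 delta_(-1,1), h2 constant (0,0)
second = add(push_id_h(mu_d, lambda x: (1, 0)),
             push_h_id({(-1, 1): F(1, 4)}, lambda y: (0, 0)))

print(first == pi, second == pi)
```

The two models share $\mu ^{(d)}$ and $h^{(2)}$ but differ in $\nu ^{(d)}$ and $h^{(1)}$, which is exactly the non-uniqueness the remark describes.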

4 An Upper Bound for the Infinity Wasserstein Distance in the Discrete Setting

As an immediate consequence of the diffusive model decomposition (5)–(6) given in Theorem 3, we can decompose the Wasserstein distance associated to a cost function c and use it to estimate the infinity-Wasserstein distance.

Corollary 4

Let $\mu ,\nu \in \mathcal {P}(X)$ be two discrete measures, $c:X\times X \rightarrow \mathbb {R}$ be a cost function, and $\pi $ be a trim solution of the transportation problem. Given a diffusive model for $\pi $, we have

$$\begin{aligned} W_c(\mu ,\nu )=\sum _{x\in X}c(x,h^{(1)}(x))\mu ^{(d)}_x+\sum _{y\in X}c(h^{(2)}(y),y)\nu ^{(d)}_y \end{aligned}$$

and

$$\begin{aligned} \mathbb {T}_c^{(\infty )}(\pi )=\max \bigg \{||c(x,h^{(1)}(x))||_{L_{\mu ^{(d)}}^{\infty }},||c(h^{(2)}(y),y)||_{L^{\infty }_{\nu ^{(d)}}}\bigg \}. \end{aligned}$$

In particular, we have

$$\begin{aligned} W_c(\mu ,\nu )\ge \alpha W^{(\infty )}_c(\mu ,\nu ), \end{aligned}$$

(8)

where

$$\begin{aligned} \alpha =\min _{a\in \mathrm{spt}(\mu ^{(d)}),b\in \mathrm{spt}(\nu ^{(d)})}\{\nu ^{(d)}_b,\mu ^{(d)}_a\}. \end{aligned}$$

(9)

The value $\alpha $ defined in (9) depends on the particular diffusive model we choose. However, since $W_c(\mu ,\nu )$ and $W^{(\infty )}_c$ do not depend on the choice of the diffusive model, if we can give a lower bound on $\alpha $ for a particular diffusive model, we can generalize the estimate (8).
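Estimate (8) can be read off from the two identities of Corollary 4. The following is a sketch under the additional assumption $c\ge 0$, where $x^*$ is a label introduced here for a point of $\mathrm{spt}(\mu ^{(d)})$ at which the maximum defining $\mathbb {T}_c^{(\infty )}(\pi )$ is attained; the case in which the maximum is attained through $h^{(2)}$ is symmetric. Dropping all but one of the nonnegative terms in the expression of $W_c(\mu ,\nu )$ gives

$$\begin{aligned} W_c(\mu ,\nu )\ge c(x^*,h^{(1)}(x^*))\,\mu ^{(d)}_{x^*}=\mathbb {T}_c^{(\infty )}(\pi )\,\mu ^{(d)}_{x^*}\ge \alpha \,\mathbb {T}_c^{(\infty )}(\pi )\ge \alpha \,W^{(\infty )}_c(\mu ,\nu ), \end{aligned}$$

where the second inequality uses the definition (9) of $\alpha $ and the last one holds because $W^{(\infty )}_c(\mu ,\nu )$ is an infimum over all plans in $\Pi (\mu ,\nu )$.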

Corollary 5

Let $\mu ,\nu \in \mathcal {P}(X)$ be two discrete measures and $c:X \times X\rightarrow \mathbb {R}_+$ be a cost function. For any trim plan $\pi $, there exists a diffusive model for which

$$\begin{aligned} \alpha \ge \min _{(A,B)\in K(\mu ,\nu )}^{} \bigg \{ \bigg |\sum _{x\in A} \mu _x-\sum _{y\in B}\nu _y \bigg |\bigg \}, \end{aligned}$$

(10)

where $\alpha $ is defined in relation (9) and

$$\begin{aligned} K(\mu ,\nu ):=\bigg \{(A,B)\,:\,A,B\subset X\quad \text {s.t.}\quad \bigg |\sum _{x\in A}\mu _x-\sum _{y\in B}\nu _y\bigg |>0\bigg \}. \end{aligned}$$

Proof

Let n be the cardinality of X. Since $\pi $ is trim between $\mu $ and $\nu $, we have $\#\mathrm{spt}(\pi )\le 2n-1$, hence we can find $\bar{x}_1$ such that

$$\begin{aligned} \exists ! \; \;\bar{y}_1 \quad s.t. \quad \pi _{\bar{x}_1,\bar{y}_1}\ne 0 \end{aligned}$$

and $\underline{y}_1$ such that

$$\begin{aligned} \exists !\; \; \underline{x}_1 \quad s.t. \quad \pi _{\underline{x}_1,\underline{y}_1}\ne 0. \end{aligned}$$

If $\underline{x}_1=\bar{x}_1$ (and hence $\underline{y}_1=\bar{y}_1$), we have $\mu _{\bar{x}_1}=\nu _{\bar{y}_1}$ and we define

$$\begin{aligned} \mu ^{(d)}_{\bar{x}_1}=\mu _{\bar{x}_1}, \quad \quad \quad \nu ^{(c)}_{\bar{y}_1}=\mu _{\bar{x}_1}, \end{aligned}$$

and

$$\begin{aligned} \mu ^{(1)}:=\mu -\mu _{\bar{x}_1}\delta _{\bar{x}_1},\quad \nu ^{(1)}:=\nu -\nu _{\bar{y}_1}\delta _{\bar{y}_1}, \quad \pi ^{(1)}=\pi -\pi _{\bar{x}_1,\bar{y}_1}\delta _{\bar{x}_1,\bar{y}_1}. \end{aligned}$$

Otherwise, if $\underline{x}_1\ne \bar{x}_1$ (and hence $\underline{y}_1\ne \bar{y}_1$), we set

$$\begin{aligned} \mu ^{(d)}_{\bar{x}_1}=\mu _{\bar{x}_1},\quad&\quad&\quad \mu ^{(c)}_{\underline{x}_1}=\nu _{\underline{y}_1},\\ \nu ^{(d)}_{\underline{y}_1}=\nu _{\underline{y}_1},\quad&\quad&\quad \nu ^{(c)}_{\bar{y}_1}=\mu _{\bar{x}_1}, \end{aligned}$$

and

$$\begin{aligned} \mu ^{(1)}= & {} \mu -\mu _{\bar{x}_1}\delta _{\bar{x}_1}-\nu _{\underline{y}_1}\delta _{\underline{x}_1},\\ \nu ^{(1)}= & {} \nu -\nu _{\underline{y}_1}\delta _{\underline{y}_1}-\mu _{\bar{x}_1}\delta _{\bar{y}_1}\\ \pi ^{(1)}= & {} \pi -\pi _{\bar{x}_1,\bar{y}_1}\delta _{\bar{x}_1,\bar{y}_1}-\pi _{\underline{x}_1,\underline{y}_1}\delta _{\underline{x}_1,\underline{y}_1}. \end{aligned}$$

In both cases we find two measures, $\mu ^{(1)}$ and $\nu ^{(1)}$, whose support has at most $n-1$ points. Since $\pi ^{(1)}$ is a restriction of a trim plan, by Lemma 2, also $\pi ^{(1)}$ is trim between its marginals $\mu ^{(1)}$ and $\nu ^{(1)}$. Therefore, we can repeat the process, finding two points $\bar{x}_2$ and $\underline{y}_2$ for which

$$\begin{aligned} \exists ! \;\; \bar{y}_2 \quad s.t. \quad \pi _{\bar{x}_2,\bar{y}_2}\ne 0 \end{aligned}$$

and

$$\begin{aligned} \exists ! \;\; \underline{x}_2\quad s.t. \quad \pi _{\underline{x}_2,\underline{y}_2}\ne 0. \end{aligned}$$

We can then extend the definition of the measures $\mu ^{(d)},\mu ^{(c)},\nu ^{(d)}$, and $\nu ^{(c)}$, define the measures $\mu ^{(2)}$, $\nu ^{(2)}$, and $\pi ^{(2)}$ and start all over again.

At each step, we define two measures $\mu ^{(i)}$ and $\nu ^{(i)}$ and increase the cardinality of the supports of $\mu ^{(d)},\mu ^{(c)},\nu ^{(d)}$, and $\nu ^{(c)}$. Given any $x \in \mathrm{spt}(\mu ^{(d)})$, we can then find $i\in \{0,1,\dots ,n-1\}$ such that

$$\begin{aligned} \mu ^{(d)}_x=\mu ^{(i)}_x, \end{aligned}$$

(11)

and, similarly, for any $y\in \mathrm{spt}(\nu ^{(d)})$, we can find a $j\in \{0,1,\dots ,n-1\}$ such that

$$\begin{aligned} \nu ^{(d)}_y=\nu ^{(j)}_y, \end{aligned}$$

with the convention $\mu ^{(0)}=\mu $ and $\nu ^{(0)}=\nu $. The relation between $\mu ^{(i)}$ and $\mu ^{(i+1)}$ is either

$$\begin{aligned} \mu ^{(i+1)}=\mu ^{(i)}-\mu ^{(i)}_{\bar{x}_{i+1}}\delta _{\bar{x}_{i+1}} \end{aligned}$$

or

$$\begin{aligned} \mu ^{(i+1)}=\mu ^{(i)}-\mu ^{(i)}_{\bar{x}_{i+1}}\delta _{\bar{x}_{i+1}}-\nu ^{(i)}_{\underline{y}_{i+1}}\delta _{\underline{x}_{i+1}}. \end{aligned}$$

Similarly, we have

$$\begin{aligned} \nu ^{(i+1)}=\nu ^{(i)}-\nu ^{(i)}_{\underline{y}_{i+1}}\delta _{\underline{y}_{i+1}} \end{aligned}$$

or

$$\begin{aligned} \nu ^{(i+1)}=\nu ^{(i)}-\nu ^{(i)}_{\underline{y}_{i+1}}\delta _{\underline{y}_{i+1}}-\mu ^{(i)}_{\bar{x}_{i+1}}\delta _{\bar{y}_{i+1}}. \end{aligned}$$

Similarly, we can write $\mu ^{(i)}$ and $\nu ^{(i)}$ as a function of $\mu ^{(i-1)}$ and $\nu ^{(i-1)}$, and then express $\mu ^{(i+1)}$ through $\mu ^{(i-1)}$ and $\nu ^{(i-1)}$ as

$$\begin{aligned} \mu ^{(i+1)}_x=\sum _{a\in \tilde{A}_2}\mu ^{(i-1)}_a-\sum _{b\in \tilde{B}_2}\nu ^{(i-1)}_b, \end{aligned}$$

(12)

where $\tilde{A}_2$ and $\tilde{B}_2$ are two subsets of X whose cardinality is at most two. By iterating this process, we are able to find

$$\begin{aligned} \mu ^{(i+1)}_x=\sum _{a\in \tilde{A}_{n-(i+1)}}\mu _a-\sum _{b\in \tilde{B}_{n-(i+1)}}\nu _b, \end{aligned}$$

(13)

where $\tilde{A}_{n-(i+1)}$ and $\tilde{B}_{n-(i+1)}$ are subsets of X, whose cardinality is $n-(i+1)$. Since the left side of (12) is positive, we can rewrite (13) as

$$\begin{aligned} \mu ^{(i+1)}_x=\bigg |\sum _{a\in \tilde{A}_{n-(i+1)}}\mu _a-\sum _{b\in \tilde{B}_{n-(i+1)}}\nu _b\bigg |. \end{aligned}$$

(14)

By taking the minimum over $K(\mu ,\nu )$ of the right side in (14), we find

$$\begin{aligned} \mu ^{(i)}_x\ge \min _{(A,B)\in K(\mu ,\nu )}\bigg \{\bigg |\sum _{x\in A}\mu _x-\sum _{y\in B}\nu _y\bigg |\bigg \}, \end{aligned}$$

for any $i \in \{0,1,\dots ,n-1\}$ and each $x\in \mathrm{spt}(\mu ^{(i)})$, therefore, from relation (11), we get

$$\begin{aligned} \mu ^{(d)}_x\ge \min _{(A,B)\in K(\mu ,\nu )}\bigg \{\bigg |\sum _{x\in A}\mu _x-\sum _{y\in B}\nu _y\bigg |\bigg \}. \end{aligned}$$

Similarly, one can prove

$$\begin{aligned} \nu _y^{(d)}\ge \min _{(A,B)\in K(\mu ,\nu )}\bigg \{\bigg |\sum _{x\in A}\mu _x-\sum _{y\in B}\nu _y\bigg |\bigg \}, \end{aligned}$$

for each $y \in \mathrm{spt}(\nu ^{(d)})$, hence relation (10) is proven. $\square $
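The right-hand side of (10) depends only on the weights of $\mu $ and $\nu $, so for small supports it can be evaluated by brute force. The sketch below enumerates all subset sums; it assumes that empty subsets are allowed in $K(\mu ,\nu )$ and runs in time exponential in n, so it is meant as an illustration only. The function names are illustrative, and the data are the measures of Remark 3.

```python
from fractions import Fraction as F
from itertools import chain, combinations

def subset_sums(weights):
    """All values sum_{x in A} weights[x] for A ranging over subsets of the support."""
    items = list(weights.values())
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))
    return {sum(s, F(0)) for s in subsets}

def gap_bound(mu, nu):
    """Smallest nonzero |sum_A mu - sum_B nu|, i.e. the right-hand side of (10)."""
    diffs = {abs(s - t) for s in subset_sums(mu) for t in subset_sums(nu)}
    return min(d for d in diffs if d > 0)

mu = {(0, 0): F(1, 2), (1, 1): F(1, 2)}
nu = {(-1, 1): F(1, 4), (1, 0): F(3, 4)}

# alpha for the first diffusive model of Remark 3, following (9)
alpha = min([F(1, 4), F(1, 2)] + [F(1, 4)])
print(gap_bound(mu, nu), alpha)
```

For these measures the bound and the value of $\alpha $ coincide, so (10) holds with equality in this instance.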

In Corollary 4, we bound $W_c^{(\infty )}$ from above with $W_c$. However, due to the properties of $W^{(\infty )}_c$, it is possible to relate this distance to the Wasserstein cost induced by any $p$-th power of the same cost function.

Lemma 6

Let $\mu ,\nu \in \mathcal {P}(X)$ and let $c:X\times X \rightarrow \mathbb {R}_+$ be a cost function. Given any $p>0$, it holds that

$$\begin{aligned} W^{(\infty )}_{c^p}(\mu ,\nu )=\big (W^{(\infty )}_c(\mu ,\nu )\big )^p. \end{aligned}$$

Proof

Let $\pi \in \Pi (\mu ,\nu )$ be a plan such that

$$\begin{aligned} T_{c}(\pi )=W_c^{(\infty )}(\mu ,\nu ), \end{aligned}$$

then

$$\begin{aligned} W^{(\infty )}_{c^p}(\mu ,\nu )\le T_{c^p}(\pi )=T_c(\pi )^p=\big (W_c^{(\infty )}(\mu ,\nu )\big )^p. \end{aligned}$$

Similarly, one can prove $\big (W_c^{(\infty )}(\mu ,\nu )\big )^p\le W^{(\infty )}_{c^p}(\mu ,\nu )$, which concludes the proof. $\square $
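
For completeness, the reverse inequality can be sketched as follows. We assume, as the chain of relations in the proof suggests, that $W^{(\infty )}_{c}(\mu ,\nu )$ is the minimum of $T_c$ over $\Pi (\mu ,\nu )$ and that $T_{c^p}(\pi )=T_c(\pi )^p$ for every plan $\pi $; the symbol $\pi '$ is introduced here only for illustration. Let $\pi '\in \Pi (\mu ,\nu )$ be a plan such that $T_{c^p}(\pi ')=W^{(\infty )}_{c^p}(\mu ,\nu )$. Since $t\mapsto t^p$ is increasing on $[0,\infty )$ and $W_c^{(\infty )}(\mu ,\nu )\le T_c(\pi ')$, we get

$$\begin{aligned} \big (W_c^{(\infty )}(\mu ,\nu )\big )^p\le T_c(\pi ')^p=T_{c^p}(\pi ')=W^{(\infty )}_{c^p}(\mu ,\nu ). \end{aligned}$$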

Thanks to Lemma 6, we are able to prove the following result.

Theorem 7

Given a cost function $c:X\times X\rightarrow [0,\infty )$, let $\mu ,\nu \in \mathcal {P}(X)$ be two discrete measures. For any $p\ge 1$,

$$\begin{aligned} W^{(\infty )}_{c}(\mu ,\nu )\le \frac{W_{c_p}(\mu ,\nu )}{(\alpha _p)^{\frac{1}{p}}}, \end{aligned}$$

(15)

where $\alpha _p$ is the constant defined in (9).

Proof

Given $p\ge 1$, let us denote by $\pi ^{(p)}$ the trim optimal transportation plan between $\mu $ and $\nu $ according to the cost function $c_p$. Given a diffusive model for $\pi ^{(p)}$, we denote by $\alpha _p$ the constant defined in (9). From Lemma 6 we have

$$\begin{aligned} W^{(\infty )}_{c_p}(\mu ,\nu )=(W^{(\infty )}_c(\mu ,\nu ))^p, \end{aligned}$$

hence, for any p, we have

$$\begin{aligned} (W^{(\infty )}_c(\mu ,\nu ))^p=W^{(\infty )}_{c_p}(\mu ,\nu )\le \frac{W^p_{c_p}(\mu ,\nu )}{\alpha _p}, \end{aligned}$$

i.e.,

$$\begin{aligned} W^{(\infty )}_c(\mu ,\nu )\le \frac{W_{c_p}(\mu ,\nu )}{(\alpha _p)^{\frac{1}{p}}}. \end{aligned}$$

$\square $

In particular, since the constant $\alpha $ from Corollary 5 bounds from below every $\alpha _p$ and does not depend on the cost function but only on the starting measures $\mu $ and $\nu $, we have

$$\begin{aligned} W^{(\infty )}_c(\mu ,\nu )\le \frac{W_{c_p}(\mu ,\nu )}{(\alpha )^{\frac{1}{p}}} \end{aligned}$$

for any $p\ge 1$. For instance, if we take

$$\begin{aligned} c({\textbf {x}},{\textbf {y}}):=\sqrt{\sum _{i=1}^n|x_i-y_i |^2}, \end{aligned}$$

we recover the bound proposed in Theorem 1 for discrete measures.
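
The bound above can be checked numerically on a small example. The following Python sketch is only an illustration and rests on several assumptions that are not part of this section: the measures live on the real line with $c(x,y)=|x-y|$, where the monotone coupling is known to be optimal both for $W_{c_p}$ with $p\ge 1$ and for $W^{(\infty )}_c$; the set $K(\mu ,\nu )$ is taken to be all pairs of subsets $(A,B)$ whose mass difference is nonzero; and all function names (`monotone_plan`, `wasserstein_p`, `wasserstein_inf`, `alpha`) are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def monotone_plan(mu, nu):
    """Monotone (north-west corner) coupling of two measures on the real line.

    mu and nu are lists of (point, mass) pairs with positive masses summing to 1.
    Returns a dict mapping (x, y) to the mass moved from x to y.
    """
    mu, nu = sorted(mu), sorted(nu)
    plan = {}
    i = j = 0
    left_mu, left_nu = mu[0][1], nu[0][1]  # mass still to ship / to receive
    while i < len(mu) and j < len(nu):
        moved = min(left_mu, left_nu)
        key = (mu[i][0], nu[j][0])
        plan[key] = plan.get(key, 0) + moved
        left_mu -= moved
        left_nu -= moved
        if left_mu == 0:
            i += 1
            left_mu = mu[i][1] if i < len(mu) else 0
        if left_nu == 0:
            j += 1
            left_nu = nu[j][1] if j < len(nu) else 0
    return plan


def wasserstein_p(plan, p):
    """p-th root of the total cost of the plan for the cost |x - y|^p."""
    total = sum(m * abs(x - y) ** p for (x, y), m in plan.items())
    return float(total) ** (1 / p)


def wasserstein_inf(plan):
    """Largest distance travelled by a positive amount of mass."""
    return max(abs(x - y) for (x, y), m in plan.items() if m > 0)


def alpha(mu, nu):
    """Smallest nonzero |sum_A mu - sum_B nu| over subsets A, B.

    Assumption: this is how K(mu, nu) is interpreted in this sketch.
    """
    def subset_sums(masses):
        return {sum(c, Fraction(0))
                for r in range(len(masses) + 1)
                for c in combinations(masses, r)}

    sums_mu = subset_sums([m for _, m in mu])
    sums_nu = subset_sums([m for _, m in nu])
    return min(abs(s - t) for s in sums_mu for t in sums_nu if s != t)


# Exact rational masses avoid spurious tiny entries in the plan.
mu = [(0, Fraction(1, 2)), (1, Fraction(1, 2))]
nu = [(0, Fraction(1, 4)), (3, Fraction(3, 4))]
plan = monotone_plan(mu, nu)
a = alpha(mu, nu)
for p in (1, 2, 3):
    bound = wasserstein_p(plan, p) / float(a) ** (1 / p)
    print(p, wasserstein_inf(plan), bound)
```

In this example the infinity cost stays below the right-hand side for each tested $p$. The brute-force enumeration in `alpha` is exponential in the number of support points, so the sketch is meant for very small measures only.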

Remark 4

The estimate in (15) is sharp. To prove it, let us take

$$\begin{aligned} \mu =\delta _a \quad \quad \text {and}\quad \quad \nu =\delta _b \end{aligned}$$

where $a,b\in \mathbb {R}^n$. By definition (9), we have $\alpha =1$. Moreover, it is easy to see that

$$\begin{aligned} W^{(\infty )}(\mu ,\nu )=|a-b|\quad \quad \text {and}\quad \quad W_p(\mu ,\nu )=|a-b|, \end{aligned}$$

which proves the sharpness of inequality (15).

References

Abdellaoui, T., Heinich, H.: Caractérisation d’une solution optimale au problème de Monge-Kantorovitch. Bull. Soc. Math. France 127(3), 429–443 (1999)
Albertos, J. C., Matrán, C., Tuero-Díaz, A.: On the monotonicity of optimal transportation plans. J Math Anal Appl. 215(1), 86–94. ISSN 0022-247X (1997)
Ambrosio, L., Gigli, N., Savaré, G.: Gradient Flows: In Metric Spaces and in the Space of Probability Measures. Birkhäuser, Basel (2008)
Arjovsky, M., Chintala, S., Bottou, L.: Wasserstein generative adversarial networks. Proc. Mach. Learn. Res. 70, 214–223 (2017)
Bassetti, F., Regazzini, E.: Asymptotic properties and robustness of minimum dissimilarity estimators of location-scale parameters. Soc. Ind. Appl. Math. 50, 312–330, 01 (2005) 10.4213/tvp109
Bassetti, F., Bodini, A., Regazzini, E.: On minimum Kantorovich distance estimators. Stat. Probab. Lett. 76(12), 1298–1302 (2006). (https://EconPapers.repec.org/RePEc:eee:stapro:v:76:y:2006:i:12:p:1298-1302)
Bassetti, F., Gualandi, S., Veneroni, M.: On the computation of Kantorovich-Wasserstein distances between 2D-histograms by uncapacitated minimum cost flows. SIAM J. Optim. 30(3), 2441–2469 (2020)
Bouchitté, G., Jimenez, C., Mahadevan, R.: A new ${L}^{\infty }$ estimate in optimal mass transport. Proc Am Math Soc, 135:3525–3535, 11 (2007) https://doi.org/10.1090/S0002-9939-07-08877-6
Brenier, Y.: Polar factorization and monotone rearrangement of vector-valued functions. Commun. Pure Appl. Math. 44(4), 375–417 (1991)
Caffarelli, L.A., Feldman, M., McCann, R.J.: Constructing optimal maps for Monge’s transport problem as a limit of strictly convex costs. J. Am. Math. Soc. 15(1), 1–26 (2002). ISSN 08940347, 10886834. http://www.jstor.org/stable/827090
Cuturi, M., Doucet, A.: Fast computation of Wasserstein barycenters. Proc. Mach. Learn. Res. 32(2), 685–693 (2014)
Dantzig, G.B., Thapa, M.N.: Linear Programming 1: Introduction. Springer, Berlin (1997) 0387948333
Dobrushin, R.: Vlasov equations. Funct. Anal. Appl. 13(2), 115–123 (1979)
Figalli, A.: Existence, uniqueness, and regularity of optimal transport maps. SIAM J. Math. Anal. 39(1), 126–137 (2007)
Frogner, C., Zhang, C., Mobahi, H., Araya, M., Poggio, T. A.: Learning with a wasserstein loss. In: Advances in Neural Information Processing Systems, pp. 2053–2061 (2015)
Gangbo, W., McCann, R.J.: The geometry of optimal transportation. Acta Math. 177(2), 113–161 (1996)
Hiroshi, M., Hiroshi, T.: An inequality for certain functional of multidimensional probability distributions. Hiroshima Math. 4(1), 75–81 (1974)
Hiroshi, T.: An inequality for a functional of probability distributions and its application to Kac’s one-dimensional model of a Maxwellian gas. Z. Wahrscheinlichkeitstheorie Verwandte Gebiete 27, 47–52 (1973)
Hiroshi, T.: Probabilistic treatment of the Boltzmann equation of Maxwellian molecules. Z Wahrscheinlichkeitstheorie Verwandte Gebiete 46, 67–105 (1978)
Kantorovich, L.V.: Mathematical methods of organizing and planning production. Manag. Sci. 6(4), 366–422 (1960)
Kantorovich, L.V.: On the translocation of masses. J. Math. Sci. 133(4), 1381–1382 (2006)
Levina, E., Bickel, P.: The Earth Mover’s Distance is the Mallows distance: Some insights from statistics. In: Proceedings of the IEEE International Conference on Computer Vision, Vol. 2, pp. 251 – 256, 02 (2001). https://doi.org/10.1109/ICCV.2001.937632
Loeper, G., et al.: On the regularity of solutions of optimal transportation problems. Acta Math. 202(2), 241–283 (2009)
Monge, G.: Mémoire sur la théorie des déblais et des remblais. In: Histoire de l’Académie Royale des Sciences de Paris (1781)
Pele, O., Werman, M.: Fast and robust earth mover’s distances. In: 2009 IEEE 12th International Conference on Computer Vision, pp. 460–467 IEEE (2009)
Rubner, Y., Tomasi, C., Guibas, L.: Metric for distributions with applications to image databases. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 59–66, 02 (1998). https://doi.org/10.1109/ICCV.1998.710701
Rubner, Y., Tomasi, C., Guibas, L.J.: The Earth Mover’s Distance as a metric for image retrieval. Int. J. Comput. Vis. 40(2), 99–121 (2000)
Rüschendorf, L., Rachev, S.T.: A characterization of random variables with minimum L2-distance. J. Multivar. Anal. 32(1), 48–54 (1990)
Santambrogio, F.: Optimal Transport for Applied Mathematicians. Springer, New York (2015)
Solomon, J., Rustamov, R., Guibas, L., Butscher, A.: Wasserstein propagation for semi-supervised learning. Proc. Mach. Learn. Res. 32(1), 306–314 (2014)
Villani, C.: Optimal Transport: Old and New, vol. 338. Springer, Berlin (2009)

Acknowledgements

We are deeply indebted to Filippo Santambrogio for introducing us to the work of Bouchitté, Jimenez, and Mahadevan and for several stimulating discussions and valuable suggestions. We thank Stefano Gualandi for his feedback and Gabriele Loli for enhancing the images of this paper.

Funding

Open access funding provided by Università degli Studi di Pavia within the CRUI-CARE Agreement. The authors have not disclosed any funding.

Author information

Authors and Affiliations

Department of Mathematics F. Casorati, University of Pavia, Via Ferrata 5, 27100, Pavia, Italy
Gennaro Auricchio & Marco Veneroni

Corresponding author

Correspondence to Marco Veneroni.

Ethics declarations

Competing interest

The authors have not disclosed any competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Auricchio, G., Veneroni, M. On the Structure of Optimal Transportation Plans between Discrete Measures. Appl Math Optim 85, 42 (2022). https://doi.org/10.1007/s00245-022-09861-4

Accepted: 01 February 2022
Published: 10 May 2022
DOI: https://doi.org/10.1007/s00245-022-09861-4
