Multi-marginal entropy-transport with repulsive cost

Gerolin, Augusto; Kausamo, Anna; Rajala, Tapio

doi:10.1007/s00526-020-01735-3


Open access
Published: 23 April 2020

Volume 59, article number 90, (2020)

Calculus of Variations and Partial Differential Equations


Augusto Gerolin¹,
Anna Kausamo² &
Tapio Rajala²


Abstract

In this paper we study theoretical properties of the entropy-transport functional with repulsive cost functions. We provide sufficient conditions for the existence of a minimizer in a class of metric spaces and prove the $\Gamma $-convergence of the entropy-transport functional to a multi-marginal optimal transport problem with a repulsive cost. We point out that our construction can deal with the case when the space X is a domain in ${\mathbb {R}}^d$, answering a question raised in Benamou et al. (Numer Math 142:33–54, 2019). Finally, we also prove the entropy-regularized version of the Kantorovich duality.


1 Introduction

We consider the following multi-marginal entropy-transport problem

$$\begin{aligned} I_{\varepsilon }[\rho ] = \inf _{\gamma \in \Pi ^{\mathrm {sym}}_N(\rho )}( C_0[\gamma ] + \varepsilon E[\gamma ]), \end{aligned}$$

(1.1)

where $C_0[\gamma ] = \int _{X^N} c\,{\mathrm d}\gamma $ is the transportation cost related to a cost function c, $E[\gamma ]$ is the entropy, and $\varepsilon \ge 0$ is a parameter, see Sect. 2 for details. We consider the setting where $(X,d,{\mathfrak {m}})$ is a Polish measure space, and $\rho {\mathfrak {m}}\in {\mathcal {P}}^{ac}(X)$ is an absolutely continuous probability measure with respect to the reference measure ${\mathfrak {m}}$. An element $\gamma \in \Pi ^{\mathrm {sym}}_N(\rho )$ is called a symmetric coupling (or transport plan), that is, a symmetric probability measure in $X^N$ having all marginals equal to $\rho {\mathfrak {m}}$.

We are interested in a class of repulsive cost functions $c:X^N\rightarrow {\mathbb {R}}\cup \lbrace +\infty \rbrace $ of the form

$$\begin{aligned} c(x_1,\ldots , x_N)=\sum _{1\le i<j\le N}f(d(x_i,x_j)),\quad \text { for all } \, (x_1,\ldots ,x_N)\in X^N. \end{aligned}$$

We assume $f:]0,\infty [\rightarrow {\mathbb {R}}$ to be a continuous and decreasing function that approaches $+\infty $ if $d(x_i,x_j)\rightarrow 0$. Among the examples of such cost functions we have the Coulomb cost $f(z) = 1/\vert z\vert $, the Riesz cost $f(z) = 1/\vert z\vert ^s, n \ge s\ge \max \lbrace n-2, 0\rbrace $ (in ${\mathbb {R}}^n$) and the logarithmic cost $f(z) = -\log (\vert z\vert )$. We observe that when $\varepsilon =0$, this entropy-transport problem reduces to the classical multi-marginal optimal transport problem with repulsive costs [4, 6, 7, 12].
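As an illustration of the cost class above, the following minimal sketch evaluates $c(x_1,\ldots ,x_N)$ for a finite configuration of points in ${\mathbb {R}}^n$ with the Euclidean distance. The function names (`pairwise_cost`, `coulomb`, `riesz`, `log_cost`) are hypothetical and chosen only for this example; they do not come from the paper.

```python
import itertools
import math

def pairwise_cost(points, f):
    """Sum f(|x_i - x_j|) over all pairs i < j (Euclidean distance).

    Returns +inf when two points coincide, matching f(t) -> +inf as t -> 0+.
    """
    total = 0.0
    for x, y in itertools.combinations(points, 2):
        dist = math.dist(x, y)
        if dist == 0.0:
            return math.inf
        total += f(dist)
    return total

def coulomb(t):
    return 1.0 / t

def riesz(s):
    """Riesz profile t -> 1 / t**s for a given exponent s."""
    return lambda t: 1.0 / t ** s

def log_cost(t):
    return -math.log(t)

# Three points in the plane: two pairs at distance 1, one pair at distance sqrt(2).
triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
print(pairwise_cost(triangle, coulomb))
```

With the Coulomb profile the configuration above costs $1 + 1 + 1/\sqrt{2}$, and any configuration with a repeated point has infinite cost.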

The motivation of this paper comes from both theory and numerics. For repulsive cost functions, the entropy term in (1.1) plays the role of a regularizer for computing numerically a solution $\gamma $ of the multi-marginal optimal transport problem $I_{0}[\rho ]$, see [2]. Numerical experiments suggest that when the regularization parameter $\varepsilon $ goes to 0, the minimizer $\gamma _{\varepsilon }$ converges to a minimizer of $I_{0}[\rho ]$ having minimal entropy among the minimizers of $I_{0}[\rho ]$.

From a theoretical viewpoint, this type of functional has direct relevance in Density Functional Theory. By choosing the parameter $\varepsilon $ carefully, the functional (1.1) provides a lower bound for the Hohenberg–Kohn functional in Density Functional Theory [15, 24, 27]. This is an immediate consequence of the Log-Sobolev Inequality.

The entropy-transport problem has appeared previously in the literature in the attractive case, in particular when $c(x_1,x_2) = d(x_1,x_2)^2$. We mention briefly below some of the connections of the entropy-transport with other fields and point out the relevance in the Coulomb case.

Brief comments on some applications of the entropy-transport

Optimal transport and Sinkhorn algorithm: The entropy-transport (1.1) was introduced by Cuturi [9] in order to compute numerically the optimal transport plan for the distance squared cost in the 2-marginal case via the Sinkhorn algorithm. Due to its reasonable computational cost, it has been applied to a wide range of problems in various research areas, including Information Theory, Computer Graphics, Statistical Inference, Machine Learning, and Mean-Field Games. The entropic regularization method was also considered in the (attractive) multi-marginal case in the so-called barycenter problem introduced by Agueh and Carlier [1] (see also [5, 11]) and in numerical methods in the time discretization of Brenier’s relaxed formulation of the incompressible Euler equation [3]. For a thorough presentation of the computational aspects we refer to Cuturi and Peyré’s book [25].

Second-order calculus on RCD spaces: Gigli and Tamanini [8, 17] studied the entropic-transport problem on a class of metric spaces with (Riemannian) Ricci curvature bounded from below (2-marginals case, $c(x_1,x_2) = d(x_1,x_2)^2$). The entropic regularization procedure was crucial for establishing a second-order differential structure in that setting.

Schrödinger problem: In 1926, E. Schrödinger introduced the (linear) Schrödinger equations describing the non-relativistic evolution of a single particle in an electric field with potential energy and also established an equivalence between such equations and a system of diffusion equations [26]. Roughly speaking, the variational problem (see (1.1) with $X = C([0,1],{\mathbb {R}}^d)$ and $N=2$) arises in Schrödinger’s manuscript while studying the limit $k\rightarrow \infty $ (with $N=2$) of the empirical measures associated to the evolution of k i.i.d. Brownian motions. We refer the reader to Léonard’s survey [21] for technical details and historical notes.

Lower bound on the Hohenberg–Kohn functional in density functional theory: This is the particular case where the entropy-transport problem with Coulomb cost comes into play. It has been shown in [24, 27] that the functional (1.1) provides a lower bound for computing the ground state energy of the Hohenberg–Kohn functional [4, 6, 7, 12, 22]. Below we give a brief description of the result. Notice that in this context $X = {\mathbb {R}}^d$ and ${\mathfrak {m}}$ is the Lebesgue measure on ${\mathbb {R}}^d$.

Assume that $\gamma \in \Pi _N(\rho )$ is such that $\sqrt{\gamma } \in H^1({\mathbb {R}}^{dN})$. This is the case, for example, when $\gamma (x_1,\ldots ,x_N) = \vert \psi (x_1,\ldots ,x_N)\vert ^2$, where $\psi \in H^1({\mathbb {R}}^{dN})$ is a ground-state wave function solving the N-electron Schrödinger Equation (see [6, 7, 12, 15, 27] for details). Then, we can define the Hohenberg–Kohn functional by

$$\begin{aligned}&{\tilde{F}}_{\hbar }^{HK}[\rho ] \\&\quad = \inf _{\gamma \in \Pi _N(\rho ), \sqrt{\gamma } \in H^1({\mathbb {R}}^{dN})}\bigg \lbrace \dfrac{\hbar ^2}{2}\int _{{\mathbb {R}}^{dN}}\vert \nabla \sqrt{\gamma }\vert ^2 dx_1\ldots dx_N + \int _{{\mathbb {R}}^{dN}}\sum _{1\le i<j\le N}\dfrac{1}{\vert x_i-x_j\vert }d\gamma \bigg \rbrace . \end{aligned}$$

Now, as a consequence of the logarithmic Sobolev inequality for the Lebesgue measure [18], the following result holds: if $\rho {\mathcal {L}}^d\in {\mathcal {P}}({\mathbb {R}}^d)$ and $\sqrt{\gamma }\in H^1({\mathbb {R}}^{dN})$ then

$$\begin{aligned} C_{\varepsilon }[\rho ] \le {\tilde{F}}_{\hbar }^{HK}[\rho ], \quad \text { with } \varepsilon = \pi \hbar ^2/2. \end{aligned}$$

1.1 Examples of optimal entropy couplings

Let us present some computational examples of minimizers of $I_\varepsilon [\rho ]$ illustrating the role of the parameter $\varepsilon $. Before this, we recall a result on the characterization of minimizers in the one-dimensional case [10]. In particular, it implies that the minimizer of $I_0[\rho ]$ is concentrated on finitely many graphs and is thus singular with respect to the product reference measure.

Theorem 1.1

[10] Let $\mu \in {\mathcal {P}}({\mathbb {R}})$ be an absolutely continuous probability measure and $f:{\mathbb {R}}\rightarrow {\mathbb {R}}$ a strictly convex, non-increasing function that is bounded from below. Then there exists a unique optimal symmetric plan $\gamma \in {\Pi _N^{\mathrm {sym}} (\mu )}$ that solves

$$\begin{aligned} \min _{ \gamma \in \Pi _N^{\mathrm {sym}}(\mu ) } \int _{{\mathbb {R}}^{N} } \sum _{1 \le i < j \le N} f(|x_j-x_i|) \, {\mathrm d}\gamma . \end{aligned}$$

Moreover, this plan is induced by an optimal cyclical map T, that is, $\gamma _{\mathrm {sym}}=\left( \gamma _T\right) ^S$, where $\gamma _T=(\mathrm {id},T,T^{(2)} , \ldots , T^{(N-1)})_{\sharp } \mu $. An explicit optimal cyclical map is

$$\begin{aligned} T(x) ={\left\{ \begin{array}{ll} F_{\mu }^{-1} (F_{\mu }(x) + 1/N) \qquad &{} \text { if }F_{\mu }(x) \le (N-1)/N \\ F_{\mu }^{-1} ( F_{\mu }(x) + 1/N - 1 ) &{} \text { otherwise.} \end{array}\right. } \end{aligned}$$

Here $F_{\mu }(x)=\mu (( -\infty , x])$ is the distribution function of $\mu $, and $F_{\mu }^{-1}$ is its lower semicontinuous left inverse.
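The cyclical map of Theorem 1.1 can be sketched numerically for a Gaussian marginal. The snippet below reads the map as a shift of the quantile level $F_{\mu }(x)$ by $1/N$ modulo 1, which is an assumption about the intended formula, and uses Python's `statistics.NormalDist` for $F_{\mu }$ and its inverse. The function name `cyclical_map` is hypothetical.

```python
from statistics import NormalDist

def cyclical_map(x, dist, n_marginals):
    """Shift the quantile level of x by 1/N, wrapping around modulo 1.

    Levels that land exactly on 0 or 1 are not handled: inv_cdf is only
    defined on the open interval (0, 1), and such points have measure zero.
    """
    level = dist.cdf(x)
    if level <= (n_marginals - 1) / n_marginals:
        shifted = level + 1.0 / n_marginals
    else:
        shifted = level + 1.0 / n_marginals - 1.0
    return dist.inv_cdf(shifted)

# Gaussian marginal with zero mean and standard deviation 5, three marginals.
mu = NormalDist(mu=0.0, sigma=5.0)
x0 = 1.3
orbit = [x0]
for _ in range(3):
    orbit.append(cyclical_map(orbit[-1], mu, 3))
print(orbit)
```

Applying the map $N$ times returns to the starting point up to rounding, which is the cyclicity behind $\gamma _T=(\mathrm {id},T,T^{(2)},\ldots ,T^{(N-1)})_{\sharp }\mu $.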

1.1.1 One-dimensional entropic-transport with Coulomb cost and a Gaussian measure

Let $\rho $ be the normal distribution on the real line with zero mean and standard deviation $\sigma = 5$. We compute numerically the solution of the entropic-transport problem with Coulomb cost in the real line using the Sinkhorn algorithm [9]. Notice that by Theorem 1.1, we know that the minimizer of $I_0[\rho ]$ is concentrated on a graph. See Fig. 1 for an illustration of the computational results. Our code is based on the Python implementation available in the POT library [14].
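To make the experiment concrete, here is a minimal two-marginal Sinkhorn sketch in plain Python. It is not the authors' code and does not use POT; the grid, the value of `eps`, the iteration count and the function names are assumptions chosen for illustration. The reference measure is the counting measure on the grid, and the kernel $e^{-c/\varepsilon }$ is set to zero on the diagonal, where the Coulomb cost is $+\infty $.

```python
import math

def gaussian_weights(grid, sigma):
    """Discretize a centred Gaussian on the grid and normalize to mass 1."""
    raw = [math.exp(-0.5 * (x / sigma) ** 2) for x in grid]
    total = sum(raw)
    return [w / total for w in raw]

def sinkhorn_coulomb(grid, weights, eps, n_iter=500):
    """Entropic transport plan between `weights` and itself for c = 1/|x - y|."""
    n = len(grid)
    # Gibbs kernel exp(-c / eps); zero on the diagonal because c = +inf there.
    kernel = [[0.0 if i == j
               else math.exp(-1.0 / (eps * abs(grid[i] - grid[j])))
               for j in range(n)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * n
    for _ in range(n_iter):
        # Alternately rescale rows and columns to match the marginals.
        u = [weights[i] / sum(kernel[i][j] * v[j] for j in range(n))
             for i in range(n)]
        v = [weights[j] / sum(kernel[i][j] * u[i] for i in range(n))
             for j in range(n)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(n)] for i in range(n)]

grid = [-15.0 + 0.75 * k for k in range(41)]
rho = gaussian_weights(grid, sigma=5.0)
plan = sinkhorn_coulomb(grid, rho, eps=1.0)
row_marginal = [sum(row) for row in plan]
print(max(abs(a - b) for a, b in zip(row_marginal, rho)))
```

Lowering `eps` in this sketch concentrates the plan, in line with the role of $\varepsilon $ described above, at the price of slower convergence and possible underflow in the kernel.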

1.2 Organization of the paper

In Sect. 2, we introduce the setting and study sufficient conditions for the existence of minimizers for the entropy-transport problem (1.1). Section 3 is devoted to the proof of the $\Gamma $-convergence of the entropic-transport functional $C_{\varepsilon }[\gamma ]$ to the multi-marginal optimal transport with repulsive costs $C_0[\gamma ]$. In Sect. 4, we study the Kantorovich duality for the entropic-transport problem.

1.3 The strategy of the main proof and some technical remarks

The main result of this paper is Theorem 3.1, in which we prove the $\Gamma $-convergence of the entropic-regularized functional $C_{\varepsilon }[\gamma ]$ to $C_0[\gamma ]$. The technical difficulty in dealing with the $\Gamma $-convergence comes from the fact that while for the entropic part $E[\gamma ]$ the minimizer $\gamma $ tends to be as spread as possible with respect to ${\mathfrak {m}}$, for the cost $C_{0}[\gamma ]$ a minimizer can be very singular and have infinite entropy.

We divide the proof into two parts. Part (I), the $\liminf $-inequality, follows essentially from the lower semicontinuity of the costs $C_0[\gamma ]$ and $C_{\varepsilon }[\gamma ]$, which is obtained from the assumption $\rho \log \rho \in L_{\mathfrak {m}}^1(X)$ on the marginal measure $\rho {\mathfrak {m}}$, giving a lower bound on the entropy. Part (II), the $\limsup $-inequality, is more involved. In Sect. 3.2, we construct a block approximation $\gamma '_n$ for a coupling $\gamma $ with $C_{0}[\gamma ] <+\infty $. The construction is done in several steps, since we need a competitor $\gamma '_n$ such that $E[\gamma '_n] <\infty $ and $\gamma '_n \in \Pi ^{\mathrm {sym}}_N(\rho )$. The main idea and the rigorous construction are given in Sect. 3.2.

Furthermore, we point out that our construction can deal with the case when the space X is a domain in ${\mathbb {R}}^d$, answering a question raised in [3]. There the $\Gamma $-convergence was proven using convolutions, an approach that does not seem to be easy to implement for domains or for general metric spaces.

Related works: A proof of the $\Gamma $-convergence of (1.1) to the Monge-Kantorovich problem for $c(x,y) = d(x,y)^p$ first appeared in [20, 23] via probabilistic methods. In [5], G. Carlier, V. Duval, G. Peyré and B. Schmitzer provided an alternative and more analytical proof carrying out a similar block approximation procedure for the two-marginal squared distance cost in the Euclidean space and the Wasserstein Barycenter.

2 The entropy-regularized repulsive costs

Let (X, d) be a Polish space and ${\mathfrak {m}}$ be a reference measure on X. We denote by ${\mathcal {P}}(X)$ the set of Borel probability measures on X, and ${\mathcal {P}}^{ac}(X)$ the set of Borel probability measures on X that are absolutely continuous with respect to ${\mathfrak {m}}$. We denote by ${\mathfrak {m}}_{N}$ the product measure ${\mathfrak {m}}\otimes {\mathfrak {m}}\otimes \cdots \otimes {\mathfrak {m}}$. This is the reference measure we use on the product space $X^N$. On $X^N$ we use the sup-metric, which we denote by $d_N$.

The class of cost functions $c:X^N\rightarrow {\mathbb {R}}\cup \{+\infty \}$ of our interest is given by functions of the form

$$\begin{aligned} c(x_1,\ldots , x_N)=\sum _{1\le i<j\le N}f(d(x_i,x_j)),\quad \text { for all }(x_1,\ldots ,x_N)\in X^N, \end{aligned}$$

where $f:[0,\infty [\rightarrow {\mathbb {R}}\cup \{+\infty \}$ satisfies the following conditions

$$\begin{aligned}&f|_{]0,\infty [}\text { is continuous, decreasing and } \end{aligned}$$

(F1)

$$\begin{aligned}&\lim _{t\rightarrow 0+}f(t)=+\infty . \end{aligned}$$

(F2)

Above and from now on, we denote by $(x_1,\ldots , x_N)$ points in $X^N$, so $x_i\in X$ for each i.

We denote by

$$\begin{aligned} \Pi _N(\rho )=\left\{ \gamma \in {\mathcal {P}}(X^N)~|~\mathtt {pr}^i_\sharp \gamma =\rho \text { for all }i\in \{1,\ldots ,N\}\right\} \end{aligned}$$

the set of couplings or transport plans, where $\mathtt {pr}^i$ is the projection

$$\begin{aligned} \mathtt {pr}^i(x_1,\ldots ,x_i,\ldots ,x_N)=x_i~~~\text {for all }(x_1,\ldots , x_i,\ldots ,x_N)\in X^N. \end{aligned}$$

A measure $\gamma \in {\mathcal {P}}(X^N)$ is symmetric if

$$\begin{aligned} \int _{X^N}\phi (x_1,\ldots ,x_N)\,{\mathrm d}\gamma = \int _{X^N} \phi (\overline{\sigma }(x_1,\ldots ,x_N))\,{\mathrm d}\gamma , \text { for all } \phi \in {\mathcal {C}}(X^N) \end{aligned}$$

for all permutations $\overline{\sigma }$ of the N symbols $(x_1,\ldots , x_N)$. We denote by ${\mathcal {P}}^{\mathrm {sym}}(X^N)$ the set of symmetric probability measures in $X^N$, and by

$$\begin{aligned} \Pi ^{\mathrm {sym}}_N(\rho ) := \Pi _N(\rho )\cap {\mathcal {P}}^{\mathrm {sym}}(X^N) \end{aligned}$$

the set of symmetric couplings of $\rho $.

Let us also introduce the notation for symmetrising measures. If $\gamma $ is a Borel measure on $X^N$, we denote by $\gamma ^S$ the symmetrized measure

$$\begin{aligned} \gamma ^S:=\frac{1}{N!}\sum _{\sigma \in {\mathcal {S}}_N}\sigma _\sharp \gamma , \end{aligned}$$

where ${\mathcal {S}}_N$ is the set of permutations of the N coordinates $(x_1,\ldots , x_N)$.
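For a finitely supported measure the symmetrization $\gamma ^S$ can be computed directly from the definition. In the sketch below a measure on $X^N$ is represented as a dictionary from N-tuples to masses; this representation and the names `symmetrize` and `marginal` are assumptions made for the example.

```python
from itertools import permutations
from math import factorial

def symmetrize(gamma):
    """Average the push-forwards of gamma under all coordinate permutations."""
    n = len(next(iter(gamma)))
    weight = 1.0 / factorial(n)
    out = {}
    for point, mass in gamma.items():
        for perm in permutations(range(n)):
            image = tuple(point[k] for k in perm)
            out[image] = out.get(image, 0.0) + weight * mass
    return out

def marginal(gamma, i):
    """Push gamma forward under the projection onto the i-th coordinate."""
    out = {}
    for point, mass in gamma.items():
        out[point[i]] = out.get(point[i], 0.0) + mass
    return out

# A Dirac mass at (0, 1, 2) in X^3 becomes uniform on the 6 permuted points.
sym = symmetrize({(0, 1, 2): 1.0})
print(sym)
print(marginal(sym, 0))
```

After symmetrization all $N$ marginals coincide, which is why a single marginal $\rho $ suffices to describe a symmetric coupling.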

We define the functional $C_0[\gamma ]$ to be the cost related to the coupling $\gamma $

$$\begin{aligned} C_0[\gamma ]=\int _{X^N}c(x_1,\ldots ,x_N)\,{\mathrm d}\gamma (x_1,\ldots , x_N). \end{aligned}$$

Because of the symmetry of the cost c, we immediately have

Proposition 2.1

For every $\rho \in {\mathcal {P}}(X)$, we have that

$$\begin{aligned} \inf _{ \gamma \in \Pi _N(\rho ) }C_0[\gamma ] = \inf _{ \gamma \in \Pi _N^{\mathrm {sym}}(\rho ) }C_0[\gamma ]. \end{aligned}$$

(2.1)

Moreover, if the infimum is attained on one side of the above equality, then it is attained on both sides.

Given $\varepsilon \ge 0$, we denote by $C_\varepsilon [\gamma ]$ the entropy-regularized cost

$$\begin{aligned} C_\varepsilon [\gamma ]=C_0[\gamma ]+\varepsilon E[\gamma ],\quad \text { for all }\gamma \in \Pi ^{\mathrm {sym}}_N(\rho ), \end{aligned}$$

(2.2)

where the entropy $E[\gamma ]:{\mathcal {P}}(X^N)\rightarrow {\mathbb {R}}\cup \lbrace -\infty ,+\infty \rbrace $ is defined as

$$\begin{aligned} E[\gamma ]={\left\{ \begin{array}{ll}\int _{X^N}\rho _\gamma \log \rho _\gamma \,{\mathrm d}{\mathfrak {m}}_{N}&{}\text { if }\gamma \ll {\mathfrak {m}}_{N}\\ +\infty &{}\text { otherwise}\end{array}\right. }. \end{aligned}$$

(2.3)

The notation $\rho _\gamma $ stands for the Radon-Nikodym derivative of $\gamma $ with respect to the reference measure ${\mathfrak {m}}_{N}$ and $\gamma \ll {\mathfrak {m}}_{N}$ means that $\gamma $ is absolutely continuous with respect to the reference measure ${\mathfrak {m}}_{N}$. Let $\rho {\mathfrak {m}}\in {\mathcal {P}}^{ac}(X)$. In this paper we are interested in the following infimum

$$\begin{aligned} I_\varepsilon [\rho {\mathfrak {m}}]:=\inf _{\gamma \in \Pi ^{\mathrm {sym}}_N(\rho )} C_\varepsilon [\gamma ]. \end{aligned}$$

(2.4)

In order to guarantee the lower semicontinuity for $C_\varepsilon [\cdot ]$, we will assume $\rho \log \rho \in L_{\mathfrak {m}}^1(X)$. This will take care of the entropy part $E[\cdot ]$ of the cost. In order to establish the lower semicontinuity for the functional $C_0[\cdot ]$, we assume that the measure $\rho $ satisfies the following two conditions:

$$\begin{aligned}&\lim _{r\rightarrow 0}\sup _{x\in X}\rho (B(x,r))<\frac{1}{N(N-1)^2}~~~\text {and} \end{aligned}$$

(A)

$$\begin{aligned}&\int _{X\setminus B(o,r_0)}f\left( 2d(x,o)\right) \,{\mathrm d}\rho (x)> - \infty ~~~\text {for some }o\in X. \end{aligned}$$

(B)

Above we have, by an abuse of notation, denoted the measure $\rho {\mathfrak {m}}$ by only the density $\rho $; we will use the same abbreviation in the rest of the paper if there is no risk of confusion. The Condition (B) is a similar assumption to requiring, in the case of the quadratic cost, that the marginal measures have finite second moments. The Condition (A) guarantees that the cost is finite.

If we endow the spaces ${\mathcal {P}}(X^N)$ and ${\mathcal {P}}(X)$ with $w^*$-topology then, by Prokhorov’s theorem, any subset of ${\mathcal {P}}(X)$ (or ${\mathcal {P}}(X^N)$) is tight if and only if it is relatively compact.

Remark 2.2

(Entropy-transport seen as a Kullback–Leibler divergence) If $\mu $ and $\nu $ are measures on a set X, the Kullback–Leibler divergence of $\mu $ with respect to $\nu $ is defined as

$$\begin{aligned} \hbox {KL}[\mu \,|\,\nu ]={\left\{ \begin{array}{ll} \int _X\log \left( \frac{{\mathrm d}\mu }{{\mathrm d}\nu }\right) {\mathrm d}\mu &{}\text { if }\mu \ll \nu \\ +\infty &{}\text { otherwise}\end{array}\right. }. \end{aligned}$$

Now, if both measures $\mu $ and $\nu $ are absolutely continuous with respect to some reference measure R of the space X with densities $\rho _\mu $ and $\rho _\nu $, respectively, we can write:

$$\begin{aligned} \hbox {KL}[\mu \,|\,\nu ]={\left\{ \begin{array}{ll} \int _X\rho _\mu \log \left( \frac{\rho _\mu }{\rho _\nu }\right) {\mathrm d}R&{}\text { if }\mu \ll \nu \\ +\infty &{}\text { otherwise}\end{array}\right. }. \end{aligned}$$

Considering the entropy-regularized MOT problem, we see that the cost functional $C_{\varepsilon }[\gamma ]$ can be alternatively written as the Kullback–Leibler divergence between $\gamma $ and a kernel $\kappa $ defined below

$$\begin{aligned} C_{\varepsilon }[\gamma ]= \varepsilon \hbox {KL}[\gamma \,|\,\kappa ] = \varepsilon \int _{X^N}\rho _{\gamma } \ln \bigg (\dfrac{\rho _{\gamma }}{\rho _{\kappa }}\bigg )\,{\mathrm d}{\mathfrak {m}}_{N}, \end{aligned}$$

where $\kappa = e^{-c/\varepsilon }{\mathfrak {m}}_N$.
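The identity in Remark 2.2 can be checked in one line. As a formal sketch, assume $\varepsilon >0$, $\gamma \ll {\mathfrak {m}}_{N}$ with $C_0[\gamma ]<\infty $, and that the integral may be split into its two parts. Since $\rho _{\kappa } = e^{-c/\varepsilon }$, we get

$$\begin{aligned} \varepsilon \hbox {KL}[\gamma \,|\,\kappa ] = \varepsilon \int _{X^N}\rho _{\gamma }\left( \log \rho _{\gamma } + \frac{c}{\varepsilon }\right) {\mathrm d}{\mathfrak {m}}_{N} = \varepsilon E[\gamma ] + \int _{X^N}c\,{\mathrm d}\gamma = C_{\varepsilon }[\gamma ]. \end{aligned}$$

Note that $\kappa $ is in general not a probability measure, so the divergence here need not be non-negative.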

For the most part, in this paper we have chosen to consider as a reference measure the measure ${\mathfrak {m}}_N$. However, as the following lemma shows, we could also assume the reference measure to be $(\rho {\mathfrak {m}})^{\otimes N}$ since the minimizers of the entropy-regularized MOT problem (2.4) do not depend on the choice of the reference measure, at least if there exists a minimizer with finite cost. To state the lemma, let us introduce the notation of relative entropy: for each reference measure R of a Polish space Y, and for each $\gamma \in {\mathcal {P}}(Y)$, we denote by $E[\gamma \,|\,R]$ the relative entropy of $\gamma $ with respect to R, defined as

$$\begin{aligned} E[\gamma \,|\,R]= {\left\{ \begin{array}{ll} \int _Y\log \left( \frac{{\mathrm d}\gamma }{{\mathrm d}R}\right) \,{\mathrm d}\gamma &{}\text { if }\gamma \ll R\\ +\infty &{}\text { otherwise}\end{array}\right. }. \end{aligned}$$

Now we may consider two, a priori different, entropy-regularized MOT problems: the one introduced in (2.4)

$$\begin{aligned} I_\epsilon [\rho {\mathfrak {m}}]=\inf _{\gamma \in \Pi ^{\mathrm {sym}}_N(\rho )} C_\varepsilon [\gamma ]=:I_\epsilon [\rho {\mathfrak {m}}\,|\,{\mathfrak {m}}], \end{aligned}$$

(2.5)

and the problem with the reference measure chosen to be $(\rho {\mathfrak {m}})^{\otimes N}$

$$\begin{aligned} I_\epsilon [\rho {\mathfrak {m}}\,|\,\rho {\mathfrak {m}}]:=\inf _{\gamma \in \Pi ^{\mathrm {sym}}_N(\rho )}\left( C_0[\gamma ]+\epsilon E[\gamma \,|\,(\rho {\mathfrak {m}})^{\otimes N}]\right) . \end{aligned}$$

(2.6)

The following Lemma 2.3 is used only to go from the compact to the general case in the duality Theorem 4.2. The proof in [11] can be directly applied here to prove Lemma 2.3.

Lemma 2.3

Let $(X,d,{\mathfrak {m}})$ be a Polish measure space, $\rho {\mathfrak {m}}\in {\mathcal {P}}(X)$ a measure satisfying (A) and (B), and c a cost function satisfying (F1) and (F2). Now for all $\epsilon > 0$ we have

$$\begin{aligned} I_\epsilon [\rho {\mathfrak {m}}\,|\,{\mathfrak {m}}]=I_\epsilon [\rho {\mathfrak {m}}\,|\,\rho {\mathfrak {m}}]+N\epsilon \hbox {KL}[\rho {\mathfrak {m}}\,|\,{\mathfrak {m}}]=I_\epsilon [\rho {\mathfrak {m}}\,|\,\rho {\mathfrak {m}}]+N\epsilon \int _X\rho \log \rho {\mathrm d}{\mathfrak {m}}. \end{aligned}$$

(2.7)

Moreover, whenever at least one side of the equality above is finite, the problems (2.5) and (2.6) have the same minimizers.
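The mechanism behind Lemma 2.3 can be sketched as follows; this is a heuristic outline under the assumption that all integrals below are finite, not a substitute for the proof in [11]. For $\gamma \in \Pi _N(\rho )$ with $\gamma \ll {\mathfrak {m}}_N$, the coupling $\gamma $ is concentrated on the set where $\rho (x_i)>0$ for all $i$, and there

$$\begin{aligned} \log \rho _{\gamma }(x_1,\ldots ,x_N) = \log \left( \frac{\rho _{\gamma }(x_1,\ldots ,x_N)}{\rho (x_1)\cdots \rho (x_N)}\right) + \sum _{i=1}^N\log \rho (x_i). \end{aligned}$$

Integrating against $\gamma $ and using that every marginal of $\gamma $ equals $\rho {\mathfrak {m}}$ gives

$$\begin{aligned} E[\gamma ] = E[\gamma \,|\,(\rho {\mathfrak {m}})^{\otimes N}] + N\int _X\rho \log \rho \,{\mathrm d}{\mathfrak {m}}. \end{aligned}$$

Multiplying by $\epsilon $ and adding $C_0[\gamma ]$, the two regularized costs differ by the constant $N\epsilon \int _X\rho \log \rho \,{\mathrm d}{\mathfrak {m}}$, which does not depend on $\gamma $. This is consistent with the statement that both problems share the same minimizers.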

2.1 Some properties of the entropy functional

Let us start by noting that the minimum of the entropy is attained by the product measure and that its value is not $-\infty $.

Proposition 2.4

Let $(X,d, {\mathfrak {m}})$ be a Polish metric measure space, and let $\rho {\mathfrak {m}}\in {\mathcal {P}}^{ac}(X)$ with $\rho \log \rho \in L_{\mathfrak {m}}^1(X)$ . Then

$$\begin{aligned} \min _{\gamma \in \Pi ^{\mathrm {sym}}_N(\rho )}E[\gamma ] = \int _{X^N}\bigg (\otimes ^N_{i=1}\rho \bigg )\log \bigg (\otimes ^N_{i=1}\rho \bigg ) \,{\mathrm d}{\mathfrak {m}}_{N} = N \int _X \rho \log \rho \,{\mathrm d}{\mathfrak {m}}> -\infty . \end{aligned}$$

Proof

As we will see, the minimality is an immediate consequence of Jensen’s inequality. Let $\gamma \in \Pi _N(\rho )$. Then

$$\begin{aligned} E[\gamma ]&= \int _{X^N}\rho _{\gamma }\log (\rho _\gamma )\,{\mathrm d}{\mathfrak {m}}_{N} = \int _{X^N}\frac{\rho _{\gamma }}{\otimes ^N_{i=1}\rho }\left( \log \left( \frac{\rho _{\gamma }}{\otimes ^N_{i=1}\rho }\right) + \log \left( \otimes ^N_{i=1}\rho \right) \right) \otimes ^N_{i=1}\rho \,{\mathrm d}{\mathfrak {m}}_{N}\\&\ge \bigg (\int _{X^N}\rho _{\gamma }\,{\mathrm d}{\mathfrak {m}}_{N}\bigg )\log \bigg (\int _{X^N}\rho _{\gamma }\,{\mathrm d}{\mathfrak {m}}_{N}\bigg ) + \int _{X^N}\rho _{\gamma }\log \left( \otimes ^N_{i=1}\rho \right) \,{\mathrm d}{\mathfrak {m}}_{N} \\&= 0 + E[\otimes ^N_{i=1}\rho ]. \end{aligned}$$

$\square $
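Proposition 2.4 can be checked numerically on a small discrete example. The sketch below takes $X$ to be a three-point set with the counting measure as reference measure and $N=2$; the specific marginal and coupling are made up for the illustration.

```python
import math

def entropy(densities):
    """Sum of p * log(p) over the given densities, with 0 * log(0) = 0."""
    return sum(p * math.log(p) for p in densities if p > 0.0)

rho = [0.5, 0.3, 0.2]

# Product coupling rho (x) rho, flattened to a list of 9 densities.
product = [a * b for a in rho for b in rho]

# A symmetric coupling with the same marginals that is not a product.
coupling = [[0.30, 0.15, 0.05],
            [0.15, 0.10, 0.05],
            [0.05, 0.05, 0.10]]
flat_coupling = [p for row in coupling for p in row]

print(entropy(product), 2 * entropy(rho), entropy(flat_coupling))
```

The first two printed values agree, which is the identity $E[\otimes ^N_{i=1}\rho ] = N\int _X\rho \log \rho \,{\mathrm d}{\mathfrak {m}}$, and the third is larger, as the proposition predicts.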

Using Proposition 2.4 we immediately get the lower semicontinuity of the entropy functional by representing the entropy as relative entropy against the probability measure $\otimes _{i=1}^N(\rho {\mathfrak {m}})$. See for instance [28, Lemma 4.1] for the lower semicontinuity of the entropy when the reference measure is finite.

Corollary 2.5

Let $(X,d, {\mathfrak {m}})$ be a Polish metric measure space, and let $\rho {\mathfrak {m}}\in {\mathcal {P}}^{ac}(X)$ with $\rho \log \rho \in L_{\mathfrak {m}}^1(X)$. Then $E[\cdot ]$ is lower semicontinuous in the set $\Pi ^{\mathrm {sym}}_N(\rho )$.

Now we are ready to prove the existence of the minimizers for entropy-regularized MOT:

Proposition 2.6

Let $(X,d,{\mathfrak {m}})$ be a Polish metric measure space. Assume that the measure $\rho {\mathfrak {m}}\in {\mathcal {P}}^{ac}(X)$ satisfies $\rho \log \rho \in L_{\mathfrak {m}}^1(X)$ along with Conditions (A) and (B). Assume that $c:X^N\rightarrow {\mathbb {R}}\cup \lbrace +\infty \rbrace $ satisfies the conditions (F1) and (F2). Then, for each $\varepsilon \ge 0$, there exists a minimizer $\gamma \in \Pi ^{\mathrm {sym}}_N(\rho )$ for the entropic-regularized cost $C_{\varepsilon }[\gamma ]$.

Proof

We notice that the set $\Pi ^{\mathrm {sym}}_N(\rho )$ is compact in the $w^*$-topology [19]. The functional E is lower semicontinuous by Corollary 2.5, and in our setting the lower semicontinuity of $C_0$ is proven as a part of the proof of [16, Proposition 3.1]. Since for each $\varepsilon \ge 0$ the functional $C_\varepsilon $ is lower semicontinuous, we conclude that it has a minimizer in the set $\Pi ^{\mathrm {sym}}_N(\rho )$. $\square $

3 The $\Gamma $-convergence of entropic-regularized cost

Now let us turn to the $\Gamma $-convergence. From now on, $(\tau _n)_{n\in {\mathbb {N}}}$ is any sequence of positive real numbers decreasing to zero. Let us introduce the following functionals: for each $n\in {\mathbb {N}}$

$$\begin{aligned} {\mathcal {C}}_n:{\mathcal {P}}^{\mathrm {sym}}(X^N)\rightarrow {\mathbb {R}}\cup \{+\infty \},~~ {\mathcal {C}}_n[\gamma ]={\left\{ \begin{array}{ll}C_{\tau _n}(\gamma )&{}\text { if }\gamma \in \Pi _N(\rho )\\ +\infty &{}\text { otherwise}\end{array}\right. } \end{aligned}$$

and

$$\begin{aligned} {\mathcal {C}}:{\mathcal {P}}^{\mathrm {sym}}(X^N)\rightarrow {\mathbb {R}}\cup \{+\infty \},~~ {\mathcal {C}}[\gamma ]={\left\{ \begin{array}{ll}C_0[\gamma ]&{}\text { if }\gamma \in \Pi _N(\rho )\\ +\infty &{}\text { otherwise}\end{array}\right. }. \end{aligned}$$

The goal of this section is to prove that the sequence $({\mathcal {C}}_n)_{n\in {\mathbb {N}}}$ $\Gamma $-converges to ${\mathcal {C}}$ in the space ${\mathcal {P}}^{\mathrm {sym}}(X^N)$.

Theorem 3.1

Let $(X,d,{\mathfrak {m}})$ be a Polish metric measure space. Let $\rho \in {\mathcal {P}}^{ac}(X)$ with $\rho \log \rho \in L_{\mathfrak {m}}^1(X)$ satisfying (A) and (B). Then the sequence $({\mathcal {C}}_n)$ $\Gamma $-converges to ${\mathcal {C}}$ in the space ${\mathcal {P}}^{\mathrm {sym}}(X^N)$.

Let us fix $\gamma \in {\mathcal {P}}^{\mathrm {sym}}(X^N)$. We need to show that

$$\begin{aligned}&\text {For each sequence }(\gamma _n)_{n\in {\mathbb {N}}}\text { that converges to }\gamma \\&\text {we have }\liminf _{n\rightarrow \infty } {\mathcal {C}}_n[\gamma _n]\ge {\mathcal {C}}[\gamma ]\text {, and } \end{aligned}$$

(I)

$$\begin{aligned}&\text {There exists a sequence }(\gamma _n)_{n\in {\mathbb {N}}}\text { that converges to }\gamma \text { and }\\&\limsup _{n\rightarrow \infty }{\mathcal {C}}_n[\gamma _n]\le {\mathcal {C}}[\gamma ]. \end{aligned}$$

(II)

The proof of Theorem 3.1 is divided into two parts. The proof of the first part, the liminf-inequality (I), is short and is established in the next subsection. The remainder of this section is then divided into subsections in which the second part, the limsup-inequality (II) is proven.

3.1 Proof of condition (I)

We fix a sequence $(\gamma _n)_{n\in {\mathbb {N}}}$ that converges to $\gamma $. If $\gamma \notin \Pi _N(\rho )$, then since the set $\Pi _N(\rho )$ is compact, for large indices we also have $\gamma _n\notin \Pi _N(\rho )$, so both sides of inequality (I) are $+\infty $, and we are done. Hence we may assume that $\gamma $ and $\gamma _n$’s are elements of the set $\Pi _N(\rho )$. Since now $\gamma _n\in \Pi _N(\rho )$, the claim (I) follows from the lower-semicontinuity of $\gamma \mapsto \int c\,{\mathrm d}\gamma $ and from the entropy lower bound shown in Proposition 2.4. $\square $

3.2 Constructing an approximation of the coupling $\gamma $

First of all, we need to construct an approximation of $\gamma $ only in the case where $C_0[\gamma ] < \infty $: if this is not the case then any sequence $(\gamma _n)$ converging to $\gamma $ can be used to prove Condition (II). The idea of the construction is to redefine a large part of $\gamma $ to be a product measure on finitely many Borel sets with small diameter. In order not to increase the cost by too much, the Borel sets we are using have to be far away from the diagonal compared to the diameter of the sets. We call the part of the measure defined in this way the core part of the approximation. For the rest of the measure, we take another finite combination of product measures. However, this time the sets do not need to have small (or even bounded) diameter, but just small measure. This part will be called the remainder part of the approximation.

We start the construction by taking out a small part of $\gamma $ that will later be used to deal with the remainder part of the approximation. For this we take a sequence of radii defined as $r_n = 1/n$. Since $C_0[\gamma ] < \infty $, there exists a point $x =(x_1,\ldots , x_N)\in \mathop {\mathrm{spt}}\nolimits (\gamma )$ with

$$\begin{aligned} x_i \ne x_j \quad \text { if }1 \le i < j \le N. \end{aligned}$$

Moreover, since $\gamma \in \Pi _N(\rho )$ and $\rho $ satisfies (A), we have

$$\begin{aligned}&\gamma (\{(y_1,\ldots , y_N) \in X^N \,| \, y_i \ne x_j \text { for all }i,j\}) \\&\quad \ge 1 - \sum _{i\ne j} \gamma (\{(y_1,\ldots , y_N) \in X^N \,| \, y_i = x_j \}) \\&\quad = 1 - \sum _{i\ne j} \rho (\{x_j\}) \ge 1 - N(N-1) \frac{1}{N(N-1)^2} > 0. \end{aligned}$$

Thus, using again $C_0[\gamma ] < \infty $, there exists another point $x'=(x_{N+1},\ldots , x_{2N}) \in \mathop {\mathrm{spt}}\nolimits (\gamma )$, so that

$$\begin{aligned} x_i \ne x_j \quad \text { if }1 \le i < j \le 2N. \end{aligned}$$

From now on, we consider $x,x'$ fixed. Therefore, for $n \in {\mathbb {N}}$ sufficiently large we have

$$\begin{aligned} d(x_i,x_j)>r_n \quad \text { if }1 \le i < j \le 2N. \end{aligned}$$

(3.1)

Let us denote by

$$\begin{aligned} B_n:=B(x,\tfrac{r_n}{10})\quad \text {and}\quad B_n':=B(x',\tfrac{r_n}{10}) \end{aligned}$$

the balls around x and $x'$ with radii $r_n/10$ in the sup-metric of the product space. So,

$$\begin{aligned} y=(y_1,\ldots ,y_N)\in B_n \end{aligned}$$

if and only if

$$\begin{aligned} d(x_i,y_i)<\tfrac{r_n}{10}\quad \text { for all }i\in \{1,\ldots , N\}, \end{aligned}$$

and analogously for $B_n'$ with the relevant index modifications.

Let us now define

$$\begin{aligned} \gamma _{B_n}=\left( \frac{\gamma \big |_{B_n} }{\gamma (B_n)} \right) ^S\quad \text {and}\quad \gamma _{B_n'}=\left( \frac{\gamma \big |_{B_n'} }{\gamma (B_n')} \right) ^S. \end{aligned}$$

Observe that $\gamma _{B_n}$ and $\gamma _{B_n'}$ are symmetric probability measures. Since the marginals of a symmetric measure are the same, we may denote by $\rho _{B_n}$ the marginal of $\gamma _{B_n}$ and similarly by $\rho _{B_n'}$ the marginal of $\gamma _{B_n'}$. Let us further denote ${{\tilde{B}}}_n:=\mathop {\mathrm{spt}}\nolimits \gamma _{B_n}$, ${{\tilde{B}}}_n':=\mathop {\mathrm{spt}}\nolimits \gamma _{B_n'}$ and

$$\begin{aligned} \varepsilon _n := \frac{1}{N}\min \left\{ \gamma ({{\tilde{B}}}_n),\gamma ({{\tilde{B}}}_n'),r_n,\frac{r_n}{f(2r_n/5)}\right\} . \end{aligned}$$

(3.2)

We then define a measure

$$\begin{aligned} \gamma _{0,n}:=&\gamma \big |_{X^N\setminus ({{\tilde{B}}}_n\cup {{\tilde{B}}}_n')} +\frac{\gamma ({{\tilde{B}}}_n)-\varepsilon _n}{\gamma ({{\tilde{B}}}_n)}\gamma \big |_{{{\tilde{B}}}_n}+\frac{\gamma ({{\tilde{B}}}_n')-\varepsilon _n}{\gamma ({{\tilde{B}}}_n')}\gamma \big |_{{{\tilde{B}}}_n'}. \end{aligned}$$

The idea behind the measure $\gamma _{0,n}$ is that we have chopped off a small part of the measure around the points x and $x'$ (symmetrically) for later use. Since we are working with a singular cost, we still need to take out a small neighbourhood of the diagonals before approximating by product measures. We do this now.

We fix a compact $K_n\subset X$ such that

$$\begin{aligned} \gamma _{0,n}(X^N\setminus K_n^N)<\frac{\varepsilon _n}{2} \end{aligned}$$

(3.3)

and take a small enough $\delta _n \in (0,r_n)$ so that

$$\begin{aligned} \gamma _{0,n}(D_{\delta _n})< \frac{\varepsilon _n}{2}, \end{aligned}$$

(3.4)

where

$$\begin{aligned} D_{\alpha }:=\{(x_1,\ldots ,x_N)\in X^N~|~d(x_i,x_j)<\alpha \text { for some }i\ne j\} \end{aligned}$$

denotes the $\alpha $-neighbourhood of the pairwise diagonals. Using $K_n$ and $\delta _n$ we then define

$$\begin{aligned} \gamma _{1,n}:=\gamma _{0,n}|_{K_n^N\setminus D_{\delta _n}}. \end{aligned}$$

(3.5)

The measure $\gamma _{1,n}$ is now the core part of the measure that we approximate. We denote by $\rho _{1,n}$ the marginals of the symmetric measure $\gamma _{1,n}$.
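As a purely illustrative aid, membership in $D_\alpha $ and the restriction (3.5) can be sketched for a finitely supported plan. The choice $X={\mathbb {R}}$ with $d(x,y)=|x-y|$, the compact set $K=[\text {lo},\text {hi}]$, the dictionary representation of a measure and the helper names below are assumptions made for this sketch only; they do not appear in the construction above.

```python
from itertools import combinations

def in_diagonal_neighbourhood(point, alpha):
    """True if some pair of coordinates is closer than alpha, i.e. point lies in D_alpha."""
    return any(abs(a - b) < alpha for a, b in combinations(point, 2))

def restrict_core(plan, compact, delta):
    """Restrict a discrete plan {point: mass} to K^N minus D_delta, where K = [lo, hi]."""
    lo, hi = compact
    return {
        p: m for p, m in plan.items()
        if all(lo <= x <= hi for x in p) and not in_diagonal_neighbourhood(p, delta)
    }

# Hypothetical 3-marginal plan given by three atoms and their masses.
plan = {(0.0, 1.0, 2.0): 0.5, (0.0, 0.05, 2.0): 0.3, (0.0, 1.0, 9.0): 0.2}
core = restrict_core(plan, compact=(-5.0, 5.0), delta=0.1)
```

In this toy example the second atom is discarded because two of its coordinates are closer than $\delta $, and the third because it leaves $K^N$; only the first atom survives in the core part.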

Let us then approximate the measure $\gamma _{1,n}$. We take $\lambda _n \in (0,\delta _n/n)$ so that

$$\begin{aligned} |f(r)-f(s)| < \varepsilon _n \qquad \text {for all }r,s \in [\delta _n/2,2\mathop {\mathrm{diam}}\nolimits (K_n)]\text { with }|r-s| \le 2\lambda _n. \end{aligned}$$

(3.6)

Such $\lambda _n$ exists by the uniform continuity of f on the compact set $[\delta _n/2,2\mathop {\mathrm{diam}}\nolimits (K_n)]$. Since the set $K_n$ is compact, we may fix a finite Borel partition $\{B_n^i\}_{i=1}^{M_n}$ of the set $\mathop {\mathrm{spt}}\nolimits (\rho _{1,n})$ such that

$$\begin{aligned}&\mathop {\mathrm{diam}}\nolimits (B_n^i)<\lambda _n\quad \text {and}\quad \rho _{1,n}(B_n^i) > 0 \quad \text { for all }i\in \{1,\ldots , M_n\}. \end{aligned}$$

We are now ready to define the core part approximants $\gamma _{1,n}^a$ as

$$\begin{aligned} \gamma _{1,n}^a=\sum _{(k_1,\ldots ,k_N)\in M_n^N}\frac{\gamma _{1,n}(B_n^{k_1}\times \cdots \times B_n^{k_N})}{\rho _{1,n}(B_n^{k_1})\cdots \rho _{1,n}(B_n^{k_N})}\rho _{1,n}|_{B_n^{k_1}}\otimes \cdots \otimes \rho _{1,n}|_{B_n^{k_N}}. \end{aligned}$$

(3.7)
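To make the definition (3.7) more concrete, the following minimal sketch computes the analogous block approximation for $N=2$ on a finite set, where a plan is a square matrix and the partition is a list of index blocks. The finite setting, the matrix representation and the function names are assumptions of this sketch. It illustrates two features used later: the approximant has the same marginals as the original plan, and the two measures give the same mass to every product of partition cells.

```python
def marginal(gamma):
    """First marginal of a plan given as a square matrix (list of rows)."""
    return [sum(row) for row in gamma]

def block_approximation(gamma, blocks):
    """Two-marginal analogue of (3.7): on each product of cells, replace gamma by the
    product of the restricted marginals, rescaled to carry the same mass as gamma.
    Every cell is assumed to have positive marginal mass, as in the text."""
    rho = marginal(gamma)
    n = len(gamma)
    approx = [[0.0] * n for _ in range(n)]
    for bk in blocks:
        for bl in blocks:
            block_mass = sum(gamma[x][y] for x in bk for y in bl)
            rho_k = sum(rho[x] for x in bk)
            rho_l = sum(rho[y] for y in bl)
            for x in bk:
                for y in bl:
                    approx[x][y] = block_mass * rho[x] * rho[y] / (rho_k * rho_l)
    return approx

# Hypothetical symmetric plan on four points that charges no pair inside a common cell,
# mimicking the fact that gamma_{1,n} gives no mass to D_{delta_n}.
gamma = [
    [0.00, 0.00, 0.20, 0.05],
    [0.00, 0.00, 0.05, 0.20],
    [0.20, 0.05, 0.00, 0.00],
    [0.05, 0.20, 0.00, 0.00],
]
blocks = [[0, 1], [2, 3]]
gamma_a = block_approximation(gamma, blocks)
```

Since the cells are small compared to $\delta _n$ in the actual construction, products of cells that meet the diagonal carry no mass, which is why the approximant stays away from the singularity of the cost.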

Now let us handle the main part of the remainder of the measure, namely the measure

$$\begin{aligned} \gamma _{2,n}:=\gamma _{0,n}|_{D_{\delta _n}\cup (X^N\setminus K_n^N)}. \end{aligned}$$

Because $\gamma _{0,n}$ and the set where we restrict it are symmetric, $\gamma _{2,n}$ is as well. We may thus denote its marginals by $\rho _{2,n}$.

In order to determine which part of the remaining marginal measure should be coupled where, we define a partition $\{A_{i,n}\}_{i=1}^N$ of the space X by setting, for all $i\in \{1,\ldots , N-1\}$

$$\begin{aligned} A_{i,n}:=\{y\in X~|~d(x_i,y)\le \tfrac{r_n}{2}\}, \end{aligned}$$

and

$$\begin{aligned} A_{N,n}:=X\setminus \bigcup _{i=1}^{N-1}A_{i,n}. \end{aligned}$$

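Condition (3.1) guarantees that the sets $A_{i,n}$ are pairwise disjoint: for $i<j\le N-1$, a common point y of $A_{i,n}$ and $A_{j,n}$ would give $d(x_i,x_j)\le d(x_i,y)+d(y,x_j)\le r_n$, contradicting (3.1), while $A_{N,n}$ is disjoint from the other sets by definition.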

Now we approximate $\gamma _{2,n}$ by the measure

$$\begin{aligned} \gamma _{2,n}^a:=N\left( \sum _{i=1}^N\eta _{n,i}\right) ^S, \end{aligned}$$

where for all i the measure $\eta _{n,i}$ is the product

$$\begin{aligned} \eta _{n,i} := \left( \bigotimes _{k=1}^{i-1}\frac{\rho _{B_n}\big |_{B(x_k,r_n/10)}}{\rho _{B_n}(B(x_k,r_n/10))}\right) \otimes \rho _{2,n}\big |_{A_{i,n}} \otimes \left( \bigotimes _{k=i+1}^N\frac{\rho _{B_n}\big |_{B(x_k,r_n/10)}}{\rho _{B_n}(B(x_k,r_n/10))}\right) . \end{aligned}$$

By the definition of the sets $A_{i,n}$, for every $(y_1,\ldots ,y_N) \in \mathop {\mathrm{spt}}\nolimits (\gamma _{2,n}^a)$ we have for each $i \ne j$

$$\begin{aligned} d(y_i,y_j) \ge |d(y_i,x_j) - d(x_j,y_j)| > \frac{r_n}{2}-\frac{r_n}{10} = \frac{2r_n}{5}, \end{aligned}$$

(3.8)

where we have assumed (which we can do without loss of generality) that $y_j \in {\overline{B}}(x_j,r_n/10)$.

What we have done using the measure $\gamma _{2,n}^a$ is that we have coupled the marginals of the measure $\gamma _{2,n}$ with some suitable parts of the marginals of the reserved measure that was taken out around the point x. In this way we have used unevenly the marginals of this reserved part. To handle the rest of the reserved part of the measure around the point x, we now use the reserved measure around the point $x'$. So, we need to redefine the coupling for the part of the marginal given by

$$\begin{aligned} \rho _{3,n} := (\texttt {pr}^1)_\sharp \frac{\varepsilon _n}{\gamma ({{\tilde{B}}}_n)}\gamma \big |_{{{\tilde{B}}}_n} +\rho _{2,n} - (\texttt {pr}^1)_\sharp \gamma _{2,n}^a. \end{aligned}$$

We define it as

$$\begin{aligned} \gamma _{3,n}^a := \left( \sum _{i=1}^N\phi _{n,i} \right) ^S, \end{aligned}$$

where each $\phi _{n,i}$ is defined as

$$\begin{aligned} \phi _{n,i} := \left( \bigotimes _{k=N+1}^{N+i-1}\frac{\rho _{B_n'}\big |_{B(x_k,r_n/10)}}{\rho _{B_n'}(B(x_k,r_n/10))}\right) \otimes \rho _{3,n} \otimes \left( \bigotimes _{k=N+i+1}^{2N}\frac{\rho _{B_n'}\big |_{B(x_k,r_n/10)}}{\rho _{B_n'}(B(x_k,r_n/10))}\right) . \end{aligned}$$

Since $\mathop {\mathrm{spt}}\nolimits (\rho _{3,n}) \subset \mathop {\mathrm{spt}}\nolimits (\rho _{B_n})$, we have that for every $(y_1,\ldots ,y_N) \in \mathop {\mathrm{spt}}\nolimits (\gamma _{3,n}^a)$ and each $i \ne j$

$$\begin{aligned} d(y_i,y_j) \ge |d(x_{k(i)},x_{k(j)})- d(y_i,x_{k(i)}) - d(x_{k(j)},y_j)| > {r_n}-\frac{r_n}{10}-\frac{r_n}{10} = \frac{4r_n}{5}, \end{aligned}$$

(3.9)

where $k(i)\ne k(j)$ are the indices for which $y_j \in {\overline{B}}(x_{k(j)},r_n/10)$ and $y_i \in {\overline{B}}(x_{k(i)},r_n/10)$.

What remains is the part of the measure around $x'$ that was not used for $\gamma _{3,n}^a$. Since $\gamma _{3,n}^a$ used the marginals from this part of the reserved measure evenly, we may simply couple the rest by a measure

$$\begin{aligned} \gamma _{4,n}^a := b\left( \bigotimes _{k=N+1}^{2N}\frac{\rho _{B_n'}\big |_{B(x_k,r_n/10)}}{\rho _{B_n'}(B(x_k,r_n/10))}\right) ^S, \end{aligned}$$

with b being the correct scaling constant. Similarly as for the previous remainder part, we have that for every $(y_1,\ldots ,y_N) \in \mathop {\mathrm{spt}}\nolimits (\gamma _{4,n}^a)$ and each $i \ne j$ the inequality (3.9) holds.

Now we are ready to define the full approximation as

$$\begin{aligned} \gamma _n' = \gamma _{1,n}^a + \gamma _{2,n}^a + \gamma _{3,n}^a + \gamma _{4,n}^a. \end{aligned}$$

By construction $\gamma _n' \in \Pi _N^\mathrm {sym}(\rho )$.

3.3 Narrow convergence of the approximations

Let us now prove that the sequence $(\gamma _n')_n$ narrowly converges to $\gamma $. We could argue this by using the Wasserstein distance. However, let us do it here directly using the definition of narrow convergence.

Lemma 3.2

The sequence $(\gamma _n')_n$ narrowly converges to $\gamma $.

Proof

Let $\varphi \in C_b(X^N)$ and $\varepsilon > 0$. We need an index $N_0\in {\mathbb {N}}$ such that

$$\begin{aligned} \left|\int _{X^N}\varphi \,{\mathrm d}\gamma -\int _{X^N}\varphi \,{\mathrm d}\gamma _n'\right|<\varepsilon \text { for all }n\ge N_0. \end{aligned}$$

(3.10)

Let us denote $M:=\sup _{x\in X^N}|\varphi (x)|$; we may assume that $M>0$. Since $\rho $ is inner regular, we can fix a compact set $K\subset X$ such that

$$\begin{aligned} \rho (X\setminus K)<\frac{\varepsilon }{12NM}. \end{aligned}$$

Since $\gamma \in \Pi _N(\rho )$, we now have

$$\begin{aligned} \gamma (X^N\setminus K^N)<\frac{\varepsilon }{12M}. \end{aligned}$$

The function $\varphi $, when restricted to $K^N$, is uniformly continuous. Hence there exists $\delta > 0$ so that

$$\begin{aligned} |\varphi (x)-\varphi (y)|< \frac{\varepsilon }{12}\text { for all }x,y\in K^N\text { for which }d_N(x,y)<\delta . \end{aligned}$$

(3.11)

Now, let $N_0 \in {\mathbb {N}}$ be so large that

$$\begin{aligned} \sqrt{N}\lambda _n< \delta \qquad \text {and} \qquad 6M\varepsilon _n < \frac{\varepsilon }{6}~~~\text {for all }n\ge N_0. \end{aligned}$$

Let us show that this choice of $N_0$ satisfies (3.10). First we note that for all $n\ge N_0$ we have

$$\begin{aligned} \left|\int _{X^N}\varphi \,{\mathrm d}\gamma -\int _{X^N}\varphi \,{\mathrm d}\gamma _n'\right|&\le \left|\int _{X^N}\varphi \,{\mathrm d}\gamma -\int _{X^N}\varphi \,{\mathrm d}\gamma _{1,n}\right|+\left|\int _{X^N}\varphi \,{\mathrm d}\gamma _{1,n}-\int _{X^N}\varphi \,{\mathrm d}\gamma _n'\right|\nonumber \\&\le \left|\int _{X^N}\varphi \,{\mathrm d}\gamma -\int _{X^N}\varphi \,{\mathrm d}\gamma _{1,n}\right|+\left|\int _{X^N}\varphi \,{\mathrm d}\gamma _{1,n}-\int _{X^N}\varphi \,{\mathrm d}\gamma _{1,n}^a\right|\nonumber \\&\quad +\left|\int _{X^N}\varphi \,{\mathrm d}(\gamma _{2,n}^a+\gamma _{3,n}^a+\gamma _{4,n}^a)\right|\nonumber \\&<\left|\int _{X^N}\varphi \,{\mathrm d}\gamma _{1,n}-\int _{X^N}\varphi \,{\mathrm d}\gamma _{1,n}^a\right|+\frac{\varepsilon }{6}, \end{aligned}$$

(3.12)

where in the last inequality we have used the following facts: $\gamma (X^N)-\gamma _{1,n}(X^N)<3\varepsilon _n$ for all n, and for the remainder part of the measure $\gamma _n'$ we have

$$\begin{aligned} (\gamma _{2,n}^a + \gamma _{3,n}^a + \gamma _{4,n}^a)(X^N) < 3\varepsilon _n~~~\text { for all }n\in {\mathbb {N}}. \end{aligned}$$

It remains to show that for all $n\ge N_0$ we have

$$\begin{aligned} \left| \int _{X^N} \varphi \,{\mathrm d}\gamma _{1,n} - \int _{X^N} \varphi \,{\mathrm d}\gamma _{1,n}^a \right| < \frac{5\varepsilon }{6}. \end{aligned}$$

(3.13)

We first estimate the integrals in the set $K^N$. Let us fix, for each $(k_1,\ldots , k_N)\in M_n^N$ for which the set

$$\begin{aligned} (B_n^{k_1}\times \cdots \times B_n^{k_N})\cap K^N \end{aligned}$$

is nonempty, an element

$$\begin{aligned} z_{k_1,\ldots ,k_N}\in (B_n^{k_1}\times \cdots \times B_n^{k_N})\cap K^N. \end{aligned}$$

Now we have, for a fixed $(k_1,\ldots , k_N)$, denoting for simplicity

$$\begin{aligned} z_0:=z_{k_1,\ldots , k_N},\quad Q:=B_n^{k_1}\times \cdots \times B_n^{k_N},\quad \gamma :=\gamma _{1,n},\quad \gamma _a:=\gamma _{1,n}^a, \end{aligned}$$

the estimate

$$\begin{aligned}&\bigg |\int _{Q\cap K^N}\varphi (z)\,{\mathrm d}\gamma -\int _{Q\cap K^N}\varphi (z)\,{\mathrm d}\gamma _a\bigg |\\&\quad \le \left|\int _{Q\cap K^N}\varphi (z)\,{\mathrm d}\gamma -\varphi (z_0)\gamma (Q\cap K^N)\right|+\left|\varphi (z_0)\gamma _a(Q\cap K^N)-\int _{Q\cap K^N}\varphi (z)\,{\mathrm d}\gamma _a\right|\\&\qquad +\left|\varphi (z_0)\gamma (Q\cap K^N)-\varphi (z_0)\gamma _a(Q\cap K^N)\right|\\&\quad \le \int _{Q\cap K^N}|\varphi (z)-\varphi (z_0)|\,{\mathrm d}\gamma +\int _{Q\cap K^N}|\varphi (z_0)-\varphi (z)|\,{\mathrm d}\gamma _a\\&\qquad +M|\gamma (Q\cap K^N)-\gamma _a(Q\cap K^N)|\\&\quad \overset{a)}{<}\gamma (Q\cap K^N)\cdot \frac{\varepsilon }{12}+\gamma _a(Q\cap K^N)\cdot \frac{\varepsilon }{12} +M|\gamma (Q\cap K^N)-\gamma _a(Q\cap K^N)|\\&\quad \overset{b)}{=}\gamma (Q\cap K^N)\cdot \frac{\varepsilon }{12}+\gamma _a(Q\cap K^N)\cdot \frac{\varepsilon }{12}\\&\qquad +M|\gamma (Q\cap K^N)-\gamma (Q)+\gamma _a(Q)-\gamma _a(Q\cap K^N)|\\&\quad \le \gamma (Q\cap K^N)\cdot \frac{\varepsilon }{12}+\gamma _a(Q\cap K^N)\cdot \frac{\varepsilon }{12}+M\gamma (Q\setminus K^N)+M\gamma _a(Q\setminus K^N), \end{aligned}$$

where in a) we have used (3.11), and in b) the fact that the total measures of $\gamma $ and $\gamma _a$ coincide on the 'cubes' Q, that is, $\gamma (Q)=\gamma _a(Q)$. Summing the estimate above over all cubes $Q=B_n^{k_1}\times \cdots \times B_n^{k_N}$, $(k_1,\ldots , k_N)\in M_n^N$, gives

$$\begin{aligned}&\bigg |\int _{K^N} \varphi \,{\mathrm d}\gamma _{1,n} - \int _{K^N} \varphi \,{\mathrm d}\gamma _{1,n}^a \bigg |\nonumber \\&\qquad< \gamma _{1,n}(K^N)\cdot \frac{\varepsilon }{12}+\gamma _{1,n}^a(K^N)\cdot \frac{\varepsilon }{12}+M\gamma _{1,n}(X^N\setminus K^N)+M\gamma _{1,n}^a(X^N\setminus K^N)\nonumber \\&\qquad \overset{a)}{<}\frac{\varepsilon }{12}+\frac{\varepsilon }{12}+\frac{\varepsilon }{12}+\frac{\varepsilon }{12}=\frac{\varepsilon }{3}, \end{aligned}$$

(3.14)

where in inequality a) we have used the fact that $\rho (X\setminus K)<\frac{\varepsilon }{12NM}$ and, since the marginals of $\gamma _{1,n}$ and $\gamma _{1,n}^a$ are restrictions of $\rho $, we can bound both $\gamma _{1,n}(X^N\setminus K^N)$ and $\gamma _{1,n}^a(X^N\setminus K^N)$ by $\frac{\varepsilon }{12M}$. For the same reason, we have

$$\begin{aligned} \left|\int _{X^N\setminus K^N}\varphi \,{\mathrm d}\gamma _{1,n}-\int _{X^N\setminus K^N}\varphi \,{\mathrm d}\gamma _{1,n}^a\right|<2M\cdot \frac{\varepsilon }{12M}=\frac{\varepsilon }{6}. \end{aligned}$$

(3.15)

Combining estimates (3.14) and (3.15) gives

$$\begin{aligned} \left|\int _{X^N}\varphi \,{\mathrm d}\gamma _{1,n}-\int _{X^N}\varphi \,{\mathrm d}\gamma _{1,n}^a\right|<\frac{\varepsilon }{3}+\frac{\varepsilon }{6}=\frac{3\varepsilon }{6}<\frac{5\varepsilon }{6}, \end{aligned}$$

proving (3.13). $\square $

3.4 Convergence of the cost functional

In order to prove the $\Gamma $-limsup inequality (II), we need the cost $C_0[\cdot ]$ to converge along the approximating sequence $(\gamma _n')_n$. We prove this in the following lemma.

Lemma 3.3

We have $C_0[\gamma _n'] \rightarrow C_0[\gamma ]$ as $n \rightarrow \infty $.

Proof

Let us first consider the remainder part. Recall that for all $n \in {\mathbb {N}}$ we have

$$\begin{aligned} (\gamma _n'-\gamma _{1,n}^a)(X^N) = (\gamma _{2,n}^a + \gamma _{3,n}^a + \gamma _{4,n}^a)(X^N) < 3\varepsilon _n. \end{aligned}$$

Thus, using the lower bounds (3.8) and (3.9) for distances in the support of the remainder part, and the definition (3.2) of $\varepsilon _n$, we get

$$\begin{aligned} \int _{X^N} c\,{\mathrm d}(\gamma _n'-\gamma _{1,n}^a) \le \frac{N(N-1)}{2}f\left( \frac{2r_n}{5}\right) 3\varepsilon _n \le \frac{3(N-1)}{2}r_n \rightarrow 0 \end{aligned}$$

(3.16)

as $n \rightarrow \infty $. By continuity of the integral, we get

$$\begin{aligned} \int _{X^N} c\,{\mathrm d}(\gamma -\gamma _{1,n}) \rightarrow 0 \end{aligned}$$

(3.17)

as $n \rightarrow \infty $.
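The second inequality in (3.16) uses only the bound $\varepsilon _n \le \frac{1}{N}\frac{r_n}{f(2r_n/5)}$ coming from the definition (3.2). As a quick sanity check of this arithmetic, the sketch below evaluates both sides for a few radii. The Coulomb-type profile $f(t)=1/t$, the sample masses and the function names are assumptions chosen for illustration; the argument above applies to any f satisfying (F1) and (F2).

```python
def epsilon_n(mass_b, mass_b_prime, r, f, n_marginals):
    """epsilon_n from (3.2): (1/N) * min{gamma(B~_n), gamma(B~_n'), r_n, r_n / f(2 r_n / 5)}."""
    return min(mass_b, mass_b_prime, r, r / f(2 * r / 5)) / n_marginals

def remainder_cost_bound(r, eps, f, n_marginals):
    """Middle expression of (3.16): N(N-1)/2 * f(2 r_n / 5) * 3 * epsilon_n."""
    pairs = n_marginals * (n_marginals - 1) / 2
    return pairs * f(2 * r / 5) * 3 * eps

def coulomb(t):
    """Assumed repulsive profile f(t) = 1/t (illustration only)."""
    return 1.0 / t

N = 3
checks = []
for r in (1.0, 0.1, 0.01):
    eps = epsilon_n(0.2, 0.3, r, coulomb, N)          # hypothetical ball masses 0.2 and 0.3
    middle = remainder_cost_bound(r, eps, coulomb, N)
    right = 3 * (N - 1) / 2 * r                        # right-hand side of (3.16)
    checks.append((r, middle, right))
```

For small radii the minimum in (3.2) is attained by $r_n/f(2r_n/5)$, and the two sides of the second inequality then agree up to rounding.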

Let us now estimate the core part of the approximation. By the construction (3.7) of $\gamma _{1,n}^a$ and the choice (3.6) of $\lambda _n$, we have

$$\begin{aligned} \left| \int _{X^N} c\,{\mathrm d}\gamma _{1,n}^a - \int _{X^N} c\,{\mathrm d}\gamma _{1,n}\right| \le \int _{X^N} \frac{N(N-1)}{2} \varepsilon _n\,{\mathrm d}\gamma _{1,n} < \frac{N(N-1)}{2} \varepsilon _n. \end{aligned}$$

(3.18)

Combining (3.16), (3.17) and (3.18) we get

$$\begin{aligned} \left| C_0[\gamma _n'] - C_0[\gamma ] \right| \le&\left| \int _{X^N} c\,{\mathrm d}(\gamma _n'-\gamma _{1,n}^a) \right| + \left| \int _{X^N} c\,{\mathrm d}\gamma _{1,n}^a - \int _{X^N} c\,{\mathrm d}\gamma _{1,n}\right| \\&+ \left| \int _{X^N} c\,{\mathrm d}(\gamma -\gamma _{1,n}) \right| \rightarrow 0 \end{aligned}$$

as $n \rightarrow \infty $. $\square $

3.5 Finiteness of the entropy for the approximations

Next we show that the entropy is finite for the approximating sequence. Notice that, in order to prove (II), we do not need a better estimate on the entropy.

Lemma 3.4

For each $n\in {\mathbb {N}}$ we have $E[\gamma _n'] < \infty $.

Proof

In order to see the finiteness of the entropy, it suffices to notice that each $\gamma _n'$ is a sum of finitely many measures $({\tilde{\gamma }}_{n,k})_{k=1}^{N_n}$ each of which is of the form ${\tilde{\gamma }}_{n,k} = {\tilde{\rho }}_1^k{\mathfrak {m}}\otimes \cdots \otimes {\tilde{\rho }}_N^k{\mathfrak {m}}$ with ${\tilde{\rho }}_i^k \ll \rho $ and $\frac{{\mathrm d}{\tilde{\rho }}_i^k}{{\mathrm d}\rho } \le 1$. Indeed, by Proposition 2.4, the entropy is always bounded from below, and so we can make a crude estimate:

$$\begin{aligned} E[\gamma _n']&= \int _{X^N} \log \left( \sum _{k=1}^{N_n}\frac{{\mathrm d}{\tilde{\gamma }}_{n,k}}{{\mathrm d}{\mathfrak {m}}}\right) \,{\mathrm d}\left( \sum _{k=1}^{N_n} {\tilde{\gamma }}_{n,k}\right) \\&\le \log (N_n) + \sum _{k=1}^{N_n} \int _{X^N}\log \left( \frac{{\mathrm d}{\tilde{\gamma }}_{n,k}}{{\mathrm d}{\mathfrak {m}}}\right) \,{\mathrm d}{\tilde{\gamma }}_{n,k} < \infty . \end{aligned}$$

$\square $
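One way to see the crude estimate above is through the convexity of $t\mapsto t\log t$: for nonnegative densities $g_1,\ldots ,g_M$ Jensen's inequality gives $(\sum _k g_k)\log (\sum _k g_k)\le \log (M)\sum _k g_k+\sum _k g_k\log g_k$ pointwise, and integrating yields the bound since the total mass is one. The following sketch checks this inequality on a small discrete example with the counting measure as reference; the densities and names are hypothetical.

```python
import math

def entropy(density):
    """Discrete entropy sum of g log g w.r.t. the counting measure, with 0 log 0 = 0."""
    return sum(g * math.log(g) for g in density if g > 0)

# Three hypothetical nonnegative densities on five points whose sum has total mass 1.
pieces = [
    [0.10, 0.05, 0.00, 0.05, 0.10],
    [0.00, 0.20, 0.10, 0.00, 0.05],
    [0.15, 0.00, 0.10, 0.10, 0.00],
]
total = [sum(column) for column in zip(*pieces)]

lhs = entropy(total)                                          # entropy of the sum
rhs = math.log(len(pieces)) * sum(total) + sum(entropy(p) for p in pieces)
```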

3.6 Proof of condition (II)

We are now ready to prove the $\Gamma $-$\limsup $ inequality (II). By Lemma 3.2 we already know that $(\gamma _n')_n$ converges narrowly to $\gamma $. However, ${\mathcal {C}}_n[\gamma _n']$ need not converge to ${\mathcal {C}}[\gamma ]$. This can be remedied by slowing down the convergence of $(\gamma _n')_n$: we repeat each measure sufficiently (but finitely) many times before moving on to the next one. We define $k(n)$ for every $n \in {\mathbb {N}}$ as

$$\begin{aligned} k(n) = \min \left( n,\max \left( 1, \sup \left\{ k\in {\mathbb {N}}\,|\,\sqrt{\tau _n} E[\gamma _j'] < 1 \text { for all }j \le k\right\} \right) \right) . \end{aligned}$$

By definition, $1 \le k(n) \le n$. Moreover, since for every $j\in {\mathbb {N}}$ we have $E[\gamma _j'] < \infty $ by Lemma 3.4 and $\tau _n \rightarrow 0$ by definition, we have that $k(n) \rightarrow \infty $ as $n\rightarrow \infty $. Thus, defining $\gamma _n = \gamma _{k(n)}'$, for large enough $n\in {\mathbb {N}}$ we have

$$\begin{aligned} {\mathcal {C}}_n[\gamma _n] = C_0[\gamma _{k(n)}'] + \tau _nE[\gamma _{k(n)}'] < C_0[\gamma _{k(n)}'] + \sqrt{\tau _n}. \end{aligned}$$

Recalling that by Lemma 3.3 we have $C_0[\gamma _{k(n)}'] \rightarrow C_0[\gamma ]$, we conclude the proof. $\square $
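The reindexing $k(n)$ used in the proof above can be sketched as follows. The list `entropies` stands for the values $E[\gamma _j']$ and `tau_n` for $\tau _n$; the concrete choices $E[\gamma _j']=j^2$ and $\tau _n=1/n$ are hypothetical and serve only to show how slowly $k(n)$ may grow while still tending to infinity.

```python
import math

def slowed_index(n, tau_n, entropies):
    """k(n) = min(n, max(1, sup{k : sqrt(tau_n) * E[gamma_j'] < 1 for all j <= k})).

    entropies[j - 1] plays the role of E[gamma_j']; the list must have length >= n.
    """
    k = 0
    # Advance k while the condition holds for the next index, never exceeding n.
    while k < n and math.sqrt(tau_n) * entropies[k] < 1:
        k += 1
    return max(1, k)

# Hypothetical data: entropies growing like j^2 and tau_n = 1/n.
entropies = [float(j * j) for j in range(1, 101)]
indices = [slowed_index(n, 1.0 / n, entropies) for n in range(1, 101)]
```

With these choices, whenever the supremum is at least one, the selected measure satisfies $\tau _nE[\gamma _{k(n)}']<\sqrt{\tau _n}$, which is the inequality used in the last display of the proof.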

In Proposition 2.6 the existence of a minimizer for the entropy-regularized cost was established. Now that we know that measures $\gamma $ for which $C_0[\gamma ]<\infty $ can be approximated by measures with not only finite costs but also finite entropy, we can say more:

Corollary 3.5

Let $(X,d,{\mathfrak {m}})$ be a Polish metric measure space. Assume that $\rho {\mathfrak {m}}\in {\mathcal {P}}^{ac}(X)$ satisfies $\rho \log \rho \in L_{\mathfrak {m}}^1(X)$ and Conditions (A) and (B). Assume that $c:X^N\rightarrow {\mathbb {R}}\cup \lbrace +\infty \rbrace $ satisfies Conditions (F1) and (F2). Then, for each $\varepsilon > 0$, there exists a unique minimizer $\gamma \in \Pi ^{\mathrm {sym}}_N(\rho )$ for the entropic-regularized cost $C_{\varepsilon }[\gamma ]$.

Proof

Our marginal measure satisfies Conditions (A) and (B), so there exists a measure $\gamma \in \Pi ^{\mathrm {sym}}_N(\rho )$ that minimizes $C_0$ with $C_0[\gamma ]<\infty $. It must be noted that this measure can have infinite entropy. However, because of the approximation result presented in the proof of Condition (II) above, we get the existence of a measure $\gamma '\in \Pi ^{\mathrm {sym}}_N(\rho )$ such that $C_\varepsilon [\gamma ']<\infty $. The uniqueness claim now follows, since the functional $\gamma \mapsto C_\varepsilon [\gamma ]$ is strictly convex for $\varepsilon >0$. $\square $

4 Entropic-Kantorovich duality for Coulomb-type costs

We start by recalling the classical Fenchel–Rockafellar Theorem. We refer to the book of I. Ekeland and R. Témam [13, Theorem 4.2] for a more complete presentation and references.

Theorem 4.1

(Fenchel–Rockafellar) Let ${\mathcal {X}}$ and ${\mathcal {Y}}$ be Banach spaces and $A:{\mathcal {X}}\rightarrow {\mathcal {Y}}$ be linear and continuous. Let $F:{\mathcal {X}}\rightarrow {\mathbb {R}}\cup \lbrace +\infty \rbrace $ and $G:{\mathcal {Y}}\rightarrow {\mathbb {R}}\cup \lbrace +\infty \rbrace $ be proper and convex functions. Then

$$\begin{aligned} \inf \big \lbrace F[x] + G[Ax] ~\big |~ x \in {\mathcal {X}}\big \rbrace = \sup \big \lbrace -F^{*}[-A^* \gamma ] - G^{*}[\gamma ] ~\big |~ \gamma \in {\mathcal {Y}}^* \big \rbrace \end{aligned}$$

where $A^*:{\mathcal {Y}}^*\rightarrow {\mathcal {X}}^{*}$ denotes the adjoint operator of A.
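As a toy illustration of the identity in Theorem 4.1, consider the one-dimensional case ${\mathcal {X}}={\mathcal {Y}}={\mathbb {R}}$ with $Ax=ax$, $F(x)=\tfrac{1}{2}(x-1)^2$ and $G(y)=\tfrac{1}{2}y^2$. These choices are assumptions made for this sketch and are unrelated to the functionals used later. Here $F^*(p)=p+\tfrac{1}{2}p^2$ and $G^*(q)=\tfrac{1}{2}q^2$, and both sides of the identity equal $\frac{a^2}{2(1+a^2)}$, which the code below confirms on a grid.

```python
def primal(x, a):
    """F[x] + G[Ax] with F(x) = (x - 1)^2 / 2, G(y) = y^2 / 2 and A x = a * x."""
    return 0.5 * (x - 1.0) ** 2 + 0.5 * (a * x) ** 2

def dual(q, a):
    """-F*[-A* q] - G*[q], using F*(p) = p + p^2 / 2 and G*(q) = q^2 / 2."""
    p = -a * q
    return -(p + 0.5 * p ** 2) - 0.5 * q ** 2

a = 2.0
grid = [i / 1000.0 for i in range(-3000, 3001)]
primal_value = min(primal(x, a) for x in grid)   # attained at x = 1 / (1 + a^2) = 0.2
dual_value = max(dual(q, a) for q in grid)       # attained at q = a / (1 + a^2) = 0.4
```

Every dual value lies below every primal value (weak duality), and the two optimal values coincide, as the theorem predicts in this smooth example.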

Next we prove the Entropic-Kantorovich duality for the problem (2.4).

Theorem 4.2

(Entropic Duality for repulsive costs) Let $(X,d,{\mathfrak {m}})$ be a Polish measure space. Suppose that $\rho {\mathfrak {m}}\in {\mathcal {P}}(X)$ is such that (A) and (B) hold and $\rho \log \rho \in L_{\mathfrak {m}}^1(X)$, and that $c:X^N\rightarrow [0,\infty ]$ is a cost function of the form

$$\begin{aligned} c(x_1,\ldots , x_N)=\sum _{1\le i<j\le N}f(d(x_i,x_j)),\quad \text { for all } \, (x_1,\ldots ,x_N)\in X^N, \end{aligned}$$

where $f:[0,+\infty [\rightarrow [0,\infty ]$ is a function satisfying (F1) and (F2). Then, for $\varepsilon > 0$, the following duality holds:

$$\begin{aligned} \min _{\gamma \in \Pi _N(\rho )}C_\varepsilon [\gamma ]&= \sup _{u_i\in C_b(X)}\left\{ \sum _{i=1}^N\int _Xu_i{\mathrm d}\rho {\mathfrak {m}}-\varepsilon \int _{X^N}\exp \left( \frac{u_1\oplus \cdots \oplus u_N-c}{\varepsilon }\right) \,{\mathrm d}{\mathfrak {m}}_{N}\right\} +\varepsilon \\&=\sup _{u\in C_b(X)}\left\{ N\int _Xu{\mathrm d}\rho {\mathfrak {m}}-\varepsilon \int _{X^N}\exp \left( \frac{u\oplus \cdots \oplus u-c}{\varepsilon }\right) \,{\mathrm d}{\mathfrak {m}}_{N}\right\} +\varepsilon , \end{aligned}$$

where $v_1\oplus \cdots \oplus v_N$ denotes the operator $(v_1\oplus \cdots \oplus v_N)(x_1,\ldots , x_N) = v_1(x_1)+ \cdots + v_N(x_N)$.

Proof

First let us assume that X is a compact space. We denote ${\mathcal {X}}= (C_b(X))^N$ and ${\mathcal {Y}}=C_b(X^N)$, where $C_b(X)$ is the space of continuous and bounded functions on X, and similarly for $X^N$. By the Riesz representation theorem, the dual space ${\mathcal {Y}}^*$ is the space ${\mathcal {M}}(X^N)$ of signed regular Borel measures on $X^N$. Thus, we may define the Legendre–Fenchel transform $G^{*}$ of a functional $G:{\mathcal {Y}}\rightarrow {\mathbb {R}}\cup \lbrace +\infty \rbrace $ by

$$\begin{aligned} G^{*}:{\mathcal {M}}(X^N)\rightarrow {\mathbb {R}}\cup \lbrace +\infty \rbrace , \quad G^{*}[\pi ] = \sup _{\psi \in C_b(X^N)} \bigg \lbrace \int _{X^N}\psi \,{\mathrm d}\pi - G[\psi ] \bigg \rbrace . \end{aligned}$$

We define the functionals

$$\begin{aligned} F:(C_b(X))^N\rightarrow {\mathbb {R}}\cup \{+\infty \},~~(u_1,\ldots , u_N)\mapsto -\sum _{i=1}^N\int _Xu_i\,{\mathrm d}\rho {\mathfrak {m}}\end{aligned}$$

and

$$\begin{aligned} G:C_b(X^N)\rightarrow {\mathbb {R}}\cup \{+\infty \},~~\psi \mapsto \varepsilon \int _{X^N}e^{\tfrac{1}{\varepsilon }(\psi -c)}\,{\mathrm d}{\mathfrak {m}}_{N}, \end{aligned}$$

and the operator

$$\begin{aligned} A:(C_b(X))^N\rightarrow C_b(X^N),~~(u_1,\ldots , u_N)\mapsto u_1\oplus \cdots \oplus u_N. \end{aligned}$$

Now, F and G are proper and convex functionals and A is a linear and continuous operator. So, we may apply Fenchel–Rockafellar duality Theorem 4.1 to get

$$\begin{aligned} \inf \{F[x]+G[Ax]~|~x\in {\mathcal {X}}\}=\sup \{-G^*[\gamma ]-F^*[-A^*\gamma ]~|~\gamma \in {\mathcal {Y}}^*\}. \end{aligned}$$

This gives (since for every set S we have $\inf (S)=-\sup (-S)$ and $\sup S=-\inf (-S)$)

$$\begin{aligned} \inf \{G^*[\gamma ]+F^*[-A^*\gamma ]~|~\gamma \in {\mathcal {Y}}^*\}=\sup \{-F[x]-G[Ax]~|~x\in {\mathcal {X}}\}. \end{aligned}$$

It remains to show that the above expression has exactly the form of our duality claim. The claim that the right-hand sides correspond to each other follows immediately from our choices of ${\mathcal {X}}$, F, and G. So, it remains to show that

$$\begin{aligned} \inf \{G^*[\gamma ]+F^*[-A^*\gamma ]~|~\gamma \in {\mathcal {Y}}^*\}=\min _{\gamma \in \Pi _N(\rho )}C_\varepsilon [\gamma ]-\varepsilon . \end{aligned}$$

(4.1)

To prove it, let $\gamma \in {\mathcal {M}}(X^N)$. Now we have

$$\begin{aligned} F^*[-A^*\gamma ]&= \sup \bigg \lbrace \int _{X^N}\sum ^N_{i=1}u_i(x_i)\,{\mathrm d}\gamma - \sum ^N_{i=1}\int _{X}u_i(x_i)\,{\mathrm d}\rho {\mathfrak {m}}(x_i) ~\bigg |~ (u_1,\ldots ,u_N)\in C_b(X)^N \bigg \rbrace \\&={\left\{ \begin{array}{ll}0 &{}\text { if }\gamma \in \Pi _N(\rho )\\ +\infty &{}\text { otherwise}\end{array}\right. }. \end{aligned}$$

Let us then compute $G^*[\gamma ]$:

$$\begin{aligned} G^*[\gamma ]=\sup _{\psi \in C_b(X^N)} \left\{ \int _{X^N}\psi \,{\mathrm d}\gamma - \varepsilon \int _{X^N}e^{\tfrac{1}{\varepsilon }(\psi -c)}\,{\mathrm d}{\mathfrak {m}}_{N} \right\} . \end{aligned}$$

If $\gamma $ is not absolutely continuous with respect to ${\mathfrak {m}}_{N}$, we have $G^*[\gamma ]=+\infty $. If $\gamma \ll {\mathfrak {m}}_{N}$, then the supremum (that appears in the definition of $G^{*}[\gamma ]$) is realized at $\psi =\varepsilon \log \rho _\gamma +c$; this holds also if the function $\rho _\gamma $ is not continuous since it can be approximated by a sequence of continuous functions. Thus, we get for $\gamma \ll {\mathfrak {m}}_{N}$

$$\begin{aligned} G^*[\gamma ]&=\int _{X^N}\left( \rho _\gamma \cdot \psi -\varepsilon e^{\frac{1}{\varepsilon }(\psi -c)}\right) \,{\mathrm d}{\mathfrak {m}}_{N}\\&=\int _{X^N}(\varepsilon \rho _\gamma \log \rho _\gamma +c\rho _\gamma -\varepsilon \rho _\gamma )\,{\mathrm d}{\mathfrak {m}}_{N}. \end{aligned}$$

Hence, if $\gamma \in \Pi _N(\rho )$, we have

$$\begin{aligned} G^*[\gamma ]=C_0[\gamma ]+\varepsilon E[\gamma ]-\varepsilon . \end{aligned}$$

This concludes the duality proof when X is a compact space.
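The pointwise computation behind the formula for $G^*[\gamma ]$ can be checked numerically: for fixed values of $\rho _\gamma $, c and $\varepsilon $, the concave function $\psi \mapsto \rho _\gamma \psi -\varepsilon e^{(\psi -c)/\varepsilon }$ has derivative $\rho _\gamma -e^{(\psi -c)/\varepsilon }$, which vanishes exactly at $\psi =\varepsilon \log \rho _\gamma +c$. The sample values and names in the sketch below are hypothetical.

```python
import math

def integrand(psi, rho, c, eps):
    """Pointwise integrand rho * psi - eps * exp((psi - c) / eps) appearing in G*."""
    return rho * psi - eps * math.exp((psi - c) / eps)

rho, c, eps = 0.7, 1.3, 0.25            # hypothetical values at a single point of X^N
psi_star = eps * math.log(rho) + c      # maximiser claimed in the text
best = integrand(psi_star, rho, c, eps)

# Closed form from the second line of the display for G*.
closed_form = eps * rho * math.log(rho) + c * rho - eps * rho

# Compare against a grid of competitors around psi_star.
grid_max = max(integrand(psi_star + k / 100.0, rho, c, eps) for k in range(-300, 301))
```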

The noncompact case. Due to Lemma 2.3, it suffices to prove the claim in the case where the reference measure is $\rho {\mathfrak {m}}$ instead of ${\mathfrak {m}}$; the finiteness of the measure $\rho {\mathfrak {m}}$ now gives access to inner regularity and to the approximability by compact sets. We will for simplicity denote $\rho :=\rho {\mathfrak {m}}$.

The claim is

$$\begin{aligned} \min _{\gamma \in \Pi _N(\rho )}C_\varepsilon [\gamma ]=\sup _{u\in C_b(X)}\left\{ N\int _Xu{\mathrm d}\rho -\varepsilon \int _{X^N}\exp \left( \frac{u\oplus \cdots \oplus u-c}{\varepsilon }\right) \,{\mathrm d}\rho ^{\otimes N}\right\} +\varepsilon . \end{aligned}$$

For simplicity, let us denote

$$\begin{aligned}D_\rho (u):=\left\{ N\int _Xu{\mathrm d}\rho -\varepsilon \int _{X^N}\exp \left( \frac{u\oplus \cdots \oplus u-c}{\varepsilon }\right) \,{\mathrm d}\rho ^{\otimes N}\right\} +\varepsilon ~~~\text {for all }u\in C_b(X).\end{aligned}$$

We may assume that $\sup _{u\in C_b(X)}D_\rho (u)>-\infty $; indeed, since we can test with the function $u\equiv 0$, this always holds for cost functions that are bounded from below.

Let us make, in the notation of the primal functional, the dependence on the reference measure explicit by the notation $\gamma \mapsto C_\varepsilon [\gamma \,|\,\mu ]$ when the reference measure on the space X is $\mu $. Thus the original notation $\gamma \mapsto C_\varepsilon [\gamma ]$ corresponds to $\gamma \mapsto C_\varepsilon [\gamma \,|\,{\mathfrak {m}}]$.

Since the measures $\rho $ and $\gamma $ are inner regular, there exists a sequence $(K_n)_{n\in {\mathbb {N}}}$ of compact subsets of X such that

$$\begin{aligned} \rho (K_n)\rightarrow \rho (X)\qquad \text {and}\qquad \gamma (K_n^N)\rightarrow \gamma (X^N). \end{aligned}$$

Let us denote $\gamma _n:=\frac{1}{\gamma (K_n^N)}\gamma \big |_{K_n^N}$ and $\rho _n:=\frac{1}{\rho (K_n)}\rho \big |_{K_n}$. Let us also denote by $\gamma _n^{\text {min}}$ the minimizer of the problem $I_\varepsilon [\rho _n]$. Since $\gamma $ is the minimizer of the problem $I_\varepsilon [\rho ]$ and since (due to the absolute continuity of the integral and continuity of the function $t\mapsto t\log t$)

$$\begin{aligned} \lim _{n\rightarrow \infty }|C_\varepsilon [\gamma _n\,|\,\rho _n]- C_\varepsilon [\gamma \,|\,\rho ]|=0, \end{aligned}$$

(4.2)

we have

$$\begin{aligned} \lim _{n\rightarrow \infty }|C_\varepsilon [\gamma _n\,|\,\rho _n]-C_\varepsilon [\gamma _n^{\text {min}}\,|\,\rho _n]|=0. \end{aligned}$$

(4.3)

By the duality result proven above for compact spaces, we have for all $n \in {\mathbb {N}}$

$$\begin{aligned} \sup _{u\in C_b(K_n)}D_{\rho _n}(u)=C_\varepsilon [\gamma _n^{\text {min}}\,|\,\rho _n]. \end{aligned}$$

Again, due to the absolute continuity of the integral, we have

$$\begin{aligned} \lim _{n\rightarrow \infty }\left|\sup _{u\in C_b(K_n)}D_{\rho _n}(u)-\sup _{u\in C_b(X)}D_\rho (u)\right|=0. \end{aligned}$$

Putting these conditions together, we get for all $n\in {\mathbb {N}}$

$$\begin{aligned}&\left| C_\varepsilon [\gamma \,|\,\rho ]-\sup _{u\in C_b(X)}D_\rho (u)\right| \\&\quad \le \left| C_\varepsilon [\gamma \,|\,\rho ]-C_\varepsilon [\gamma _n\,|\,\rho _n]\right| +\left| C_\varepsilon [\gamma _n\,|\,\rho _n]-C_\varepsilon [\gamma _n^{\text {min}}\,|\,\rho _n]\right| \\&\qquad +\left| C_\varepsilon [\gamma _n^{\text {min}}\,|\,\rho _n]-\sup _{u\in C_b(K_n)}D_{\rho _n}(u)\right| +\left| \sup _{u\in C_b(K_n)}D_{\rho _n}(u)-\sup _{u\in C_b(X)}D_{\rho }(u)\right| . \end{aligned}$$

The claim follows by letting $n\rightarrow \infty $. $\square $
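The duality of Theorem 4.2 can be explored numerically in a very simplified setting. The sketch below assumes $N=2$, a finite set $X\subset {\mathbb {R}}$ with the counting measure as reference measure, and the Coulomb-type cost $c(x,y)=1/|x-y|$ with $c(x,x)=+\infty $; all of these choices, as well as the function names, are assumptions made for illustration. The potentials are computed by alternating maximisation of the dual functional (a Sinkhorn-type iteration), which is a standard numerical technique and not the method of proof used above. At convergence the primal value $C_0+\varepsilon E$ of the induced plan and the dual value agree.

```python
import math

def sinkhorn_potentials(rho, cost, eps, iterations=2000):
    """Alternately maximise the discrete dual in u and in v (N = 2, counting measure)."""
    n = len(rho)
    u, v = [0.0] * n, [0.0] * n
    for _ in range(iterations):
        u = [eps * math.log(rho[i] / sum(math.exp((v[j] - cost[i][j]) / eps) for j in range(n)))
             for i in range(n)]
        v = [eps * math.log(rho[j] / sum(math.exp((u[i] - cost[i][j]) / eps) for i in range(n)))
             for j in range(n)]
    return u, v

def plan_from_potentials(u, v, cost, eps):
    """Density exp((u + v - c) / eps); it vanishes where the cost is infinite."""
    n = len(u)
    return [[math.exp((u[i] + v[j] - cost[i][j]) / eps) for j in range(n)] for i in range(n)]

def primal_value(plan, cost, eps):
    """C_0 + eps * E with the conventions 0 * inf = 0 and 0 log 0 = 0."""
    total = 0.0
    for row, cost_row in zip(plan, cost):
        for g, c in zip(row, cost_row):
            if g > 0:
                total += c * g + eps * g * math.log(g)
    return total

def dual_value(u, v, rho, cost, eps):
    """Sum of the integrals of u and v against rho, minus eps * (mass of the plan), plus eps."""
    mass = sum(map(sum, plan_from_potentials(u, v, cost, eps)))
    return sum((a + b) * r for a, b, r in zip(u, v, rho)) - eps * mass + eps

# Hypothetical data: four points on the line and a marginal density rho.
points = [0.0, 1.0, 2.0, 3.0]
rho = [0.2, 0.3, 0.3, 0.2]
cost = [[math.inf if i == j else 1.0 / abs(x - y) for j, y in enumerate(points)]
        for i, x in enumerate(points)]
eps = 0.5

u, v = sinkhorn_potentials(rho, cost, eps)
plan = plan_from_potentials(u, v, cost, eps)
primal = primal_value(plan, cost, eps)
dual = dual_value(u, v, rho, cost, eps)
```

Because the cost is infinite on the diagonal, the computed plan places no mass on pairs $(x,x)$, in line with the repulsive nature of the problem.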

References

Agueh, M., Carlier, G.: Barycenters in the Wasserstein space. SIAM J. Math. Anal. 43, 904–924 (2011)
Benamou, J.D., Carlier, G., Nenna, L.: A Numerical method to solve multi-marginal optimal transport problems with Coulomb cost. In: Glowinski, R., Osher, S., Yin, W. (eds.) Splitting Methods in Communication, Imaging, Science, and Engineering. Scientific Computation. Springer, Cham (2016)
Benamou, J.-D., Carlier, G., Nenna, L.: Generalized incompressible flows, multi-marginal transport and Sinkhorn algorithm. Numer. Math. 142, 33–54 (2019)
Buttazzo, G., De Pascale, L., Gori-Giorgi, P.: Optimal-transport formulation of electronic density functional theory. Phys. Rev. A 85, 062502 (2012)
Carlier, G., Duval, V., Peyré, G., Schmitzer, B.: Convergence of entropic schemes for optimal transport and gradient flows. SIAM J. Math. Anal. 49, 1385–1418 (2017)
Cotar, C., Friesecke, G., Klüppelberg, C.: Density functional theory and optimal transportation with coulomb cost. Commun. Pure Appl. Math. 66, 548–599 (2013)
Cotar, C., Friesecke, G., Klüppelberg, C.: Smoothing of transport plans with fixed marginals and rigorous semiclassical limit of the Hohenberg–Kohn functional. Arch. Ration. Mech. Anal. 228, 891–922 (2018)
Cotar, C., Friesecke, G., Klüppelberg, C.: Second order differentiation formula on ${RCD}^*({K},{N})$ spaces. Accepted at JEMS. arXiv:1802.02463 (2018)
Cuturi, M.: Sinkhorn distances: lightspeed computation of optimal transport. In: Advances in Neural Information Processing Systems, pp. 2292–2300 (2013)
Di Marino, S., De Pascale, L., Colombo, M.: Multimarginal optimal transport maps for $1 $-dimensional repulsive costs. Can. J. Math. 67, 350–368 (2015)
Di Marino, S., Gerolin, A.: An optimal transport approach for the Schrödinger bridge problem and convergence of Sinkhorn algorithm. arXiv preprint arXiv:1911.06850 (2019)
Di Marino, S., Gerolin, A., Nenna, L.: Optimal transport theory for repulsive costs. In: Topological Optimization and Optimal Transport: In the Applied Sciences, vol. 17 (2017)
Ekeland, I., Temam, R.: Analyse convexe et problèmes variationelles. Dunod, Gauthier-Villars, Paris, ix+340 p (1974)
Flamary, R., Courty, N.: POT Python Optimal Transport library (2017). https://github.com/rflamary/POT
Gerolin, A., Grossi, J., Gori-Giorgi, P.: Kinetic correlation functionals from the entropic regularisation of the strictly-correlated electrons problem. J. Chem. Theory Comput. (2019)
Gerolin, A., Kausamo, A., Rajala, T.: Duality theory for multi-marginal optimal transport with repulsive costs in metric spaces. ESAIM Control Optim. Calc. Var. 25, 62 (2019)
Gigli, N., Tamanini, L.: Second order differentiation formula on compact ${R}{C}{D}^\ast ({K}, {N})$ spaces. arXiv:1701.03932 (2017)
Gozlan, N., Léonard, C.: Transport inequalities: a survey. Markov Process. Related Fields 16, 635–736 (2010)
Kellerer, H.G.: Duality theorems for marginal problems. In: Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, vol. 67 (1984)
Léonard, C.: From the Schrödinger problem to the Monge–Kantorovich problem. J. Funct. Anal. 262, 1879–1920 (2012)
Léonard, C.: A survey of the Schrödinger problem and some of its connections with optimal transport. Discrete Continuous Dyn. Syst. 34, 1533–1574 (2014)
Lieb, E.H.: Density functionals for Coulomb systems. In: Loss, M., Ruskai, M.B. (eds.) Inequalities. Springer, Berlin, Heidelberg (2002)
Mikami, T.: Monge’s problem with a quadratic cost by the zero-noise limit of h-path processes. Probab. Theory Related Fields 129, 245–260 (2004)
Nenna, L.: Numerical Methods for Multi-Marginal Optimal Transportation, PhD thesis, Université Paris-Dauphine (2016)
Peyré, G., Cuturi, M.: Computational Optimal Transport, vol. 11. Now Publishers Inc, Hanover (2019)
Schrödinger, E.: Über die Umkehrung der Naturgesetze. Verlag der Akademie der Wissenschaften in Kommission bei Walter de Gruyter u. Company (1931)
Seidl, M., Di Marino, S., Gerolin, A., Giesbertz, K., Nenna, L., Gori-Giorgi, P.: The strictly-correlated electron functional for spherically symmetric systems revisited. Accepted in Phys. Rev. A. arXiv:1702.05022 (2016)
Sturm, K.-T.: On the geometry of metric measure spaces. Acta Math. 196, 65–131 (2006)

Author information

Authors and Affiliations

Department of Theoretical Chemistry, Vrije Universiteit Amsterdam, FEW, De Boelelaan 1083, 1081HV, Amsterdam, The Netherlands
Augusto Gerolin
Department of Mathematics and Statistics, University of Jyvaskyla, P.O. Box 35 (MaD), 40014, Jyvaskyla, Finland
Anna Kausamo & Tapio Rajala

Corresponding author

Correspondence to Augusto Gerolin.

Additional information

Communicated by A. Malchiodi.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The authors acknowledge the support of the Academy of Finland, Projects Nos. 274372, 284511, 312488, and 314789. A.G. also acknowledges funding by the European Research Council under H2020/MSCA-IF “OTmeetsDFT” (Grant ID: 795942). A.K. also wants to thank the Vilho, Yrjö and Kalle Väisälä Foundation for funding.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Gerolin, A., Kausamo, A. & Rajala, T. Multi-marginal entropy-transport with repulsive cost. Calc. Var. 59, 90 (2020). https://doi.org/10.1007/s00526-020-01735-3

Received: 18 September 2019
Accepted: 28 February 2020
Published: 23 April 2020
DOI: https://doi.org/10.1007/s00526-020-01735-3
