Calculus of Variations: Generalized Solutions of A Kinetic Granular Media Equation by A Gradient Flow Approach

Calc. Var. (2016) 55:37
DOI 10.1007/s00526-016-0978-7
Calculus of Variations
Generalized solutions of a kinetic granular media equation by a gradient flow approach
Martial Agueh1 · Guillaume Carlier2
Received: 23 June 2015 / Accepted: 4 March 2016 / Published online: 26 March 2016
© Springer-Verlag Berlin Heidelberg 2016
Abstract We consider a one-dimensional kinetic model of granular media in the case where
the interaction potential is quadratic. Taking advantage of a simple first integral, we can
use a reformulation (equivalent to the initial kinetic model for classical solutions) which
allows measure solutions. This reformulation has a Wasserstein gradient flow structure (on
a possibly infinite product of spaces of measures) for a convex energy which enables us to
prove global in time well-posedness.
Keywords Kinetic models of granular media · Product Wasserstein space · Gradient flows
Mathematics Subject Classification 35Q70 · 35D30 · 35F25
1 Introduction
Kinetic models for granular media were initiated in the work of Benedetto et al. [4,5] who
considered the following PDE
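\[ \partial_t f + v\cdot\nabla_x f = \operatorname{div}_v\big(f\,(\nabla W *_v f)\big), \quad (t,x,v)\in\mathbb{R}_+\times\mathbb{R}^d\times\mathbb{R}^d, \quad f|_{t=0}=f_0, \tag{1.1} \]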
where f 0 is an integrable nonnegative function on the phase space and W is a certain convex
and radially symmetric potential capturing the (inelastic) collision rule between particles,
Communicated by L. Ambrosio.
Guillaume Carlier
carlier@ceremade.dauphine.fr
Martial Agueh
agueh@math.uvic.ca
1 Department of Mathematics and Statistics, University of Victoria, PO Box 3060 STN CSC,
Victoria, BC V8W 3R4, Canada
2 CEREMADE, UMR CNRS 7534, Université Paris IX Dauphine, Pl. de Lattre de Tassigny,
75775 Paris Cedex 16, France

and the convolution is in velocity only, $(\nabla W *_v f_t)(x,v) = \int_{\mathbb{R}^d}\nabla W(v-u)\,f_t(x,u)\,du$ (so
that there is no regularizing effect in the spatial variable). At least formally, (1.1) captures
the limit as the number N of particles tends to +∞ of the second-order ODE system:
\[ \dot X_i(t) = V_i(t), \quad \dot V_i(t) = -\frac{1}{N}\sum_{j\neq i}\nabla W\big(V_i(t)-V_j(t)\big)\,\delta_{X_i(t)-X_j(t)}, \quad i=1,\dots,N, \tag{1.2} \]
which describes the motion of N particles of mass 1/N moving freely until collisions occur, and
at collision times, there is some velocity exchange with a loss of kinetic energy depending
on the form of the potential W.
Surprisingly there are very few results on well-posedness for such equations. This is in
contrast with the spatially homogeneous case (i.e. f depending on t and v only) associated
with (1.1) that has been very much studied (see [4,6,11–13,17] and the references therein)
and for which existence, uniqueness and long-time behavior are well understood. In fact,
the spatially homogeneous version of (1.1) can be seen as the Wasserstein gradient flow of
the interaction energy associated to W , and then well-posedness results can be viewed as
a consequence of the powerful theory of Wasserstein gradient flows (see [3]). For the full
kinetic equation (1.1), local existence and uniqueness of a classical solution were proved in
one dimension in [4] for the potential $W(v)=|v|^3/3$ (as observed in [2], the arguments of
[4] extend to dimension d and $W(v)=|v|^p/p$ provided $p>3-d$) when the initial datum
$f_0$ is a non-negative integrable function in $C^1\cap W^{1,\infty}(\mathbb{R}\times\mathbb{R})$ with compact support. Under an
additional smallness assumption, the authors of [4] also proved a global existence result. In
[1], the first author has extended the local existence result of [4] to more general interaction
potentials W and to any dimension, d ≥ 1. The proof of [1] is based on a splitting of the
kinetic equation (1.1) into a free transport equation in x, and a collision equation in v that is
interpreted as the gradient flow of a convex interaction energy with respect to the quadratic
Wasserstein distance. In [2], various a priori estimates are obtained, in particular a global
entropy bound (which thus rules out concentration in finite time) in dimension 1 when W
is subquadratic near zero.
Understanding under which conditions one can hope for global existence or on the contrary
expect explosion in finite time is mainly an open question. Let us remark that the weak
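formulation of (1.1) means that for any T > 0 and any $\phi\in C_c^\infty([0,T]\times\mathbb{R}^d\times\mathbb{R}^d)$ one has
\[
\begin{aligned}
&\int_0^T\!\!\int_{\mathbb{R}^d\times\mathbb{R}^d}\big(\partial_t\phi(t,x,v)\,f_t(x,v)+\nabla_x\phi(t,x,v)\cdot v\,f_t(x,v)\big)\,dx\,dv\,dt \\
&\quad=\int_{\mathbb{R}^d\times\mathbb{R}^d}\phi(T,x,v)\,f_T(x,v)\,dx\,dv-\int_{\mathbb{R}^d\times\mathbb{R}^d}\phi(0,x,v)\,f_0(x,v)\,dx\,dv \\
&\qquad+\int_0^T\!\!\int_{\mathbb{R}^d\times\mathbb{R}^d\times\mathbb{R}^d}\nabla_v\phi(t,x,v)\cdot\nabla W(v-u)\,f_t(x,v)\,f_t(x,u)\,dx\,du\,dv\,dt
\end{aligned}
\]
and for the right hand side to make sense, it is necessary to have a control on nonlinear
quantities like
\[ \int_0^T\!\!\int_{\mathbb{R}^d\times\mathbb{R}^d\times\mathbb{R}^d} f_t(x,v)\,f_t(x,u)\,dx\,du\,dv\,dt \]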
which actually makes it difficult to define measure solutions (this also explains why in [4]
or [1], the authors look for $L^1\cap L^\infty$ solutions). Observing that (1.1) can be written in
conservative form as
\[ \partial_t f+\operatorname{div}_{x,v}\big(f\,F(f)\big)=0, \quad\text{with}\quad F(f)(x,v)=\big(v,\,-(\nabla W *_v f)(x,v)\big), \]
we see that, at least for smooth solutions, (1.1) can be integrated using the method of characteristics:
\[ f_t = {S_t}_\# f_0, \]
where $S_t$ is the flow of the vector field $F(f)$, i.e.
\[ S_0(x,v)=(x,v), \quad \frac{d}{dt}S_t(x,v)=F(f_t)\big(S_t(x,v)\big), \]
and $f_t={S_t}_\# f_0$ means that
\[ \int_{\mathbb{R}^d\times\mathbb{R}^d}\varphi(x,v)\,f_t(x,v)\,dx\,dv=\int_{\mathbb{R}^d\times\mathbb{R}^d}\varphi\big(S_t(x,v)\big)\,f_0(x,v)\,dx\,dv, \quad \forall\varphi\in C_b(\mathbb{R}^d\times\mathbb{R}^d). \]
In the present work, we investigate the one-dimensional case with the quadratic kernel
$W(v)=\frac12|v|^2$, which is covered neither by the analysis of [4] nor by the entropy estimate of
[2] (actually the entropy cannot be globally bounded in this case, see [2]). In this case the
convolution takes the form
\[ \int_{\mathbb{R}}(v-u)\,f_t(x,u)\,du=\rho_t(x)\,v-m_t(x), \]
where
\[ \rho_t(x):=\int_{\mathbb{R}}f_t(x,v)\,dv, \quad m_t(x):=\int_{\mathbb{R}}v\,f_t(x,v)\,dv, \tag{1.3} \]
so that the kinetic equation (1.1) can be rewritten as
\[ \partial_t f_t(x,v)+v\,\partial_x f_t(x,v)=\partial_v\big(f_t(x,v)\,(\rho_t(x)v-m_t(x))\big), \tag{1.4} \]
and we supplement (1.4) with the initial condition
\[ f|_{t=0}=f_0, \tag{1.5} \]
where $f_0$ is a compactly supported probability density:
\[ f_0\in L^1(\mathbb{R}^d\times\mathbb{R}^d), \quad \int_{\mathbb{R}^d\times\mathbb{R}^d}f_0\,dx\,dv=1 \tag{1.6} \]
and
\[ \mathrm{Supp}(f_0)\subset B_{R_x}\times B_{R_v} \tag{1.7} \]
for some positive constants $R_x$ and $R_v$. We shall see later on how to treat more general
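As a quick sanity check on the algebra behind (1.3) and (1.4), the sketch below replaces the velocity slice of f at a fixed x by finitely many weighted velocities and compares the discrete analogue of the convolution with ρv − m. The discretization and the helper names `moments` and `quadratic_convolution` are illustrative assumptions, not part of the model.

```python
def moments(weights, velocities):
    """Zeroth and first velocity moments (rho, m) of a discrete slice f(x, .)."""
    rho = sum(weights)
    m = sum(w * u for w, u in zip(weights, velocities))
    return rho, m

def quadratic_convolution(v, weights, velocities):
    """Discrete analogue of the integral of (v - u) f(x, u) du, for W(v) = |v|^2 / 2."""
    return sum(w * (v - u) for w, u in zip(weights, velocities))

# Hypothetical slice: three velocities carrying total mass 1 at the chosen x.
weights = [0.2, 0.5, 0.3]
velocities = [-1.0, 0.25, 2.0]
rho, m = moments(weights, velocities)

v = 0.7
lhs = quadratic_convolution(v, weights, velocities)
rhs = rho * v - m   # right-hand side of the identity preceding (1.3)
```

Since the identity is linear in f, `lhs` and `rhs` should agree up to rounding for any choice of weights and velocities.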
measures as initial conditions. Our first contribution is the observation that, thanks to a
special first integral of motion for the characteristics system associated with (1.4), one may
define weak solutions not at the level of measures on the phase space but on a (possibly
infinite) product of measures on the physical space. Our second contribution is to show that
this reformulation has a gradient flow structure for an energy functional with good properties
which will enable us to prove global well-posedness. To the best of our knowledge, even if
the situation we are dealing with is very particular, this is the first global result of this type
for kinetic models of granular media. As pointed out to us by Yann Brenier, our analysis has
some similarities with (but is different from) some models of sticky particles for pressureless
flows (see [8,9]) and Brenier’s formulation of the Darcy–Boussinesq system [7].
The article is organized as follows. In Sect. 2, we show how a certain first integral of motion
can be used to give a reformulation of (1.4) which allows for measure solutions. Section 3
investigates the gradient flow structure of this reformulation. Section 4 proves global existence
thanks to the celebrated Jordan–Kinderlehrer–Otto (henceforth JKO) implicit Euler scheme
of [16] for a certain energy functional. In Sect. 5, we prove uniqueness and stability and give
some concluding remarks.
2 A first integral and measure solutions
2.1 A first integral for classical solutions
Let us consider a $C^1$ compactly supported initial condition $f_0$ and a classical solution f, that
is a $C^1$ function which solves (1.4) in a pointwise sense on $\mathbb{R}_+\times\mathbb{R}^d\times\mathbb{R}^d$. It is then easy to
show (see [2]) that f remains compactly supported locally in time; more precisely (1.7) and
(1.4) imply that
\[ \mathrm{Supp}(f_t)\subset B_{R_x+tR_v}\times B_{R_v}, \quad \forall t\ge 0. \tag{2.1} \]
The characteristics of (1.4) are given by the flow map of the second-order ODE
\[ \ddot X=-\rho_t(X)\,\dot X+m_t(X) \tag{2.2} \]
in the sense that
\[ f_t=(X_t,V_t)_\# f_0, \]
where $(X_0(x,v),V_0(x,v))=(x,v)$ and
\[ \frac{d}{dt}X_t(x,v)=V_t(x,v), \quad \frac{d}{dt}V_t(x,v)=-\rho_t\big(X_t(x,v)\big)V_t(x,v)+m_t\big(X_t(x,v)\big), \tag{2.3} \]
with ρ and m being respectively the spatial marginal and momentum associated to f defined
by (1.3). Integrating (1.4) with respect to v first gives:
\[ \partial_t\rho_t(x)+\partial_x m_t(x)=0, \quad t\ge 0,\ x\in\mathbb{R}, \tag{2.4} \]
so that there is a stream potential G such that
\[ \rho=\partial_x G, \quad m=-\partial_t G, \tag{2.5} \]
and since ρ is a probability measure, it is natural to choose the integration constant in such
a way that G is the cumulative distribution function of ρ:
\[ G_t(x)=\int_{-\infty}^{x}\rho_t(y)\,dy=\rho_t\big((-\infty,x]\big). \tag{2.6} \]
Inserting (2.5), with G given by (2.6), into (2.2) then gives
\[ \ddot X=-\partial_x G_t(X)\,\dot X-\partial_t G_t(X)=-\frac{d}{dt}\big[G_t(X)\big] \]
so that $\dot X+G_t(X)$ is constant along the characteristics. Since $G_0$ can be deduced from the
initial condition $f_0$ by
\[ G_0(x)=\int_{-\infty}^{x}\Big(\int_{\mathbb{R}}f_0(y,v)\,dv\Big)dy, \]
we have the following explicit first integral of motion for (2.3):
\[ V_t(x,v)+G_t\big(X_t(x,v)\big)=v+G_0(x). \tag{2.7} \]
2.2 Reformulation and equivalence for classical solutions
In view of the first integral (2.7), it is natural to perform a change of variables on the initial
conditions:
\[ a(x,v):=v+G_0(x), \quad \nu_0^a(x):=f_0\big(x,a-G_0(x)\big), \]
so that for every $\phi\in C(\mathbb{R}\times\mathbb{R})$ one has
\[ \int_{\mathbb{R}\times\mathbb{R}}\phi\big(x,a(x,v)\big)\,f_0(x,v)\,dx\,dv=\int_{\mathbb{R}\times\mathbb{R}}\phi(x,a)\,\nu_0^a(x)\,dx\,da, \]
and then to rewrite the characteristics as a family of first-order ODEs parametrized by the
label a:
\[ \frac{d}{dt}X_t^a(x)=a-G_t\big(X_t^a(x)\big), \quad X_0^a(x)=x. \tag{2.8} \]
The flow (2.3) may then be rewritten as:
\[ X_t(x,v)=X_t^a(x), \quad V_t(x,v)=a-G_t\big(X_t^a(x)\big) \quad\text{for } a=a(x,v)=v+G_0(x). \]
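Hence setting
\[ \nu_t^a:={X_t^a}_\#\,\nu_0^a, \tag{2.9} \]
the relation $f_t=(X_t,V_t)_\# f_0$ can be re-expressed as:
\[ \int_{\mathbb{R}^2}\phi(x,v)\,f_t(x,v)\,dx\,dv=\int_{\mathbb{R}^2}\phi\big(x,a-G_t(x)\big)\,\nu_t^a(x)\,dx\,da \tag{2.10} \]
for every t ≥ 0 and every test-function $\phi\in C(\mathbb{R}^2)$. This implies in particular that
\[ \rho_t(x)=\int_{\mathbb{R}}\nu_t^a(x)\,da \]
and then also
\[ G_t(x)=\int_{\mathbb{R}}G_t^a(x)\,da \quad\text{with}\quad G_t^a(x):=\nu_t^a\big((-\infty,x]\big). \tag{2.11} \]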
On the other hand, using (2.8), we deduce that for each $a\in\mathbb{R}$, $\nu^a$ satisfies the continuity
equation:
\[ \partial_t\nu_t^a+\partial_x\big(\nu_t^a\,(a-G_t(x))\big)=0, \quad \nu^a|_{t=0}(x)=\nu_0^a(x)=f_0\big(x,a-G_0(x)\big). \tag{2.12} \]
Note that $\nu_t^a$ is a nonnegative measure but not necessarily a probability measure, its total
mass being that of $\nu_0^a$, i.e. $h(a):=\int_{\mathbb{R}}f_0(x,a-G_0(x))\,dx$.
The previous considerations show that any classical solution of (1.4) is related to a solution
of the system of continuity equations (2.11) and (2.12) with initial condition f 0 via the relation
(2.10). The converse is also true: if $\nu^a$ is a family of classical solutions of (2.12) with $G^a$ and
G given by (2.11), then the time-dependent family of probability measures $f_t$ on $\mathbb{R}^2$ defined
by (2.10) actually solves (1.4). Indeed, by construction the spatial marginal ρ of f is $\partial_x G$;
as for the momentum, we have
\[ m_t(x):=\int_{\mathbb{R}}v\,f_t(x,v)\,dv=\int_{\mathbb{R}}\big(a-G_t(x)\big)\,\nu_t^a(x)\,da. \]
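Then, thanks to (2.12) and Fubini's theorem, we have
\[
\begin{aligned}
\partial_t G_t(x)&=\int_{-\infty}^{x}\!\int_{\mathbb{R}}\partial_t\nu_t^a(y)\,dy\,da=-\int_{-\infty}^{x}\!\int_{\mathbb{R}}\partial_x\big(\nu_t^a(y)\,(a-G_t(y))\big)\,dy\,da \\
&=-\int_{\mathbb{R}}\big(a-G_t(x)\big)\,\nu_t^a(x)\,da=-m_t(x).
\end{aligned}
\]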
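Then let us take a test-function $\phi\in C_c^1(\mathbb{R}^2)$; differentiating (2.10) with respect to time, using
$\partial_x G=\rho$, $\partial_t G=-m$, (2.10), an integration by parts and (2.12), we have
\[
\begin{aligned}
\frac{d}{dt}\int_{\mathbb{R}^2}\phi\,f_t
&=\int_{\mathbb{R}^2}\Big(-\phi\big(x,a-G_t(x)\big)\,\partial_x\big(\nu_t^a\,(a-G_t)\big)+\partial_v\phi(x,a-G_t)\,m_t\,\nu_t^a\Big)\,dx\,da \\
&=\int_{\mathbb{R}^2}\Big(\partial_x\phi\big(x,a-G_t(x)\big)-\partial_v\phi\big(x,a-G_t(x)\big)\rho_t(x)\Big)\big(a-G_t(x)\big)\,\nu_t^a(x)\,dx\,da \\
&\quad+\int_{\mathbb{R}^2}\partial_v\phi(x,v)\,m_t(x)\,f_t(x,v)\,dx\,dv \\
&=\int_{\mathbb{R}^2}\Big(\partial_x\phi(x,v)\,v+\partial_v\phi(x,v)\big(m_t(x)-\rho_t(x)v\big)\Big)\,f_t(x,v)\,dx\,dv.
\end{aligned}
\]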
This proves that, for classical solutions, the kinetic equation (1.4) is actually equivalent
to the system of PDEs (2.12)–(2.11) indexed by the label a.
2.3 Measure solutions
We now take the system (2.11) and (2.12) as a starting point to define measure solutions. We
have to suitably relax the system so as to take into account:
• The fact that shocks may occur i.e. atoms of ρ may appear in finite time, then the
cumulative distribution G may become discontinuous (in which case it will be convenient
to view G, which is monotone, as a set-valued map),
• The fact that when shocks occur, the velocity may depend on the label a,
• More general initial conditions.
Let us treat first the case of more general initial conditions. What really matters is to be
able to perform the change of variables $(x,v)\mapsto(x,a):=(x,v+G_0(x))$, which can be done
as soon as $\rho_0$ is atomless, i.e. does not charge points. We shall therefore assume that $f_0$ is a
probability measure on $\mathbb{R}^2$ with compact support and having an atomless spatial marginal:
\[ \mathrm{Supp}(f_0)\subset B_{R_x}\times B_{R_v}, \quad \rho_0\ \text{is atomless, i.e. } f_0(\{x\}\times\mathbb{R})=0,\ \forall x\in\mathbb{R}. \tag{2.13} \]
Defining the spatial marginal $\rho_0$ of $f_0$ by
\[ \int_{\mathbb{R}}\phi(x)\,d\rho_0(x)=\int_{\mathbb{R}^2}\phi(x)\,df_0(x,v), \quad \forall\phi\in C(\mathbb{R}) \]
as well as its cumulative distribution function
\[ G_0(x):=\rho_0\big((-\infty,x]\big)=f_0\big((-\infty,x]\times\mathbb{R}\big), \quad \forall x\in\mathbb{R}, \]
$G_0$ is continuous and $\rho_0$ is supported on $[-R_x,R_x]$. Since $G_0$ takes values in [0, 1],
$a(x,v):=v+G_0(x)\in[-R_v,R_v+1]$ for $(x,v)\in\mathrm{Supp}(f_0)$. We then define the probability
measure $\eta_0$ as the push-forward of $f_0$ through $(x,v)\mapsto(x,a(x,v))$, i.e.
\[ \eta_0(C):=f_0\big(\{(x,v):(x,v+G_0(x))\in C\}\big), \quad\text{for every Borel subset } C \text{ of } \mathbb{R}^2. \tag{2.14} \]
We then fix a σ-finite measure μ such that the second marginal of $\eta_0$ is absolutely continuous
with respect to μ; for instance it could be the second marginal of $\eta_0$, but we allow μ to be a
more general measure (not necessarily a probability measure; for instance it was the Lebesgue
measure in the previous Sect. 2.2, and in the discrete example of Sect. 2.4 below, μ will be a
discrete measure). Then we can disintegrate $\eta_0$ as $\eta_0=\nu_0^a\otimes\mu$, which means that for every
$\phi\in C(\mathbb{R}^2)$ we have
\[ \int_{\mathbb{R}^2}\phi\big(x,v+G_0(x)\big)\,df_0(x,v)=\int_{\mathbb{R}}\Big(\int_{\mathbb{R}}\phi(x,a)\,d\nu_0^a(x)\Big)d\mu(a). \]
Note that $\nu_0^a$ is supported on $[-R_x,R_x]$ and it is not necessarily a probability measure. We
denote by h(a) its total mass, i.e. the Radon–Nikodym density of the second marginal of $\eta_0$
with respect to μ:
\[ \int_{\mathbb{R}^2}\phi\big(v+G_0(x)\big)\,df_0(x,v)=\int_{\mathbb{R}}\phi(a)\,h(a)\,d\mu(a), \quad \forall\phi\in C(\mathbb{R}), \tag{2.15} \]
so that $h\in L^1(\mu)$, $\int_{\mathbb{R}}h(a)\,d\mu(a)=1$ and h = 0 outside of the interval $[-R_v,R_v+1]$.
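The change of variables can be illustrated on a finite sample. The sketch below assumes equally weighted particles (x, v) and uses the empirical cumulative distribution as a stand-in for $G_0$; an empirical measure has atoms, so it does not satisfy (2.13), and the example only illustrates why the labels fall in $[-R_v,R_v+1]$. The names `empirical_cdf` and `labels` are hypothetical.

```python
def empirical_cdf(xs):
    """Return G0 with G0(x) = fraction of sample points y such that y <= x."""
    n = len(xs)
    return lambda x: sum(1 for y in xs if y <= x) / n

def labels(particles):
    """Map equally weighted particles (x, v) to labels a = v + G0(x)."""
    G0 = empirical_cdf([x for x, _ in particles])
    return [v + G0(x) for x, v in particles]

# Hypothetical sample with velocities in [-R_v, R_v] for R_v = 1.
R_v = 1.0
particles = [(-0.5, 0.3), (0.1, -0.8), (0.4, 1.0), (0.9, -1.0)]
a_values = labels(particles)

# Since 0 <= G0 <= 1, every label lies in [-R_v, R_v + 1].
in_range = all(-R_v <= a <= R_v + 1.0 for a in a_values)
```

Grouping the particles by their label value would then give a discrete counterpart of the disintegration, with h(a) the mass carried by label a.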
The rest of the paper will be devoted to studying the structure and well-posedness of the
following system, which relaxes the system (2.11) and (2.12) to a measure-valued setting:
\[ \partial_t\nu_t^a+\partial_x(\nu_t^a\,v_t^a)=0, \quad \nu^a|_{t=0}=\nu_0^a, \tag{2.16} \]
subject to the constraint that
\[ v_t^a(x)\in\big[a-G_t(x),\ a-G_t^-(x)\big], \tag{2.17} \]
where
\[ \rho_t:=\int_{\mathbb{R}}\nu_t^a\,d\mu(a), \quad G_t(x)=\rho_t\big((-\infty,x]\big), \quad G_t^-(x)=\rho_t\big((-\infty,x)\big). \tag{2.18} \]
Note that when μ is the Lebesgue measure and there are no shocks, i.e. when $G_t$ is continuous,
we recover the system (2.11) and (2.12) of Sect. 2.2. Denoting by $\mathcal{P}_2(\mathbb{R})$ the set of Borel
probability measures on $\mathbb{R}$ with finite second moment, solutions of (2.16)–(2.18) are then
formally defined by:
Definition 2.1 Fix a time T > 0; a measure solution of the system (2.16)–(2.18) on $[0,T]\times\mathbb{R}$
is a family of measures $(t,a)\in[0,T]\times[-R_v,R_v+1]\mapsto\nu_t^a\in h(a)\mathcal{P}_2(\mathbb{R})$ which
1. Is measurable in the sense that for every Borel bounded function φ on $[0,T]\times\mathbb{R}\times\mathbb{R}$,
the map $(t,a)\mapsto\int_{\mathbb{R}}\phi(t,a,x)\,d\nu_t^a(x)$ is $dt\otimes\mu$ measurable,
2. Satisfies the continuity equation (2.16) in the sense of distributions for hμ-a.e. a, with a
$\nu_t^a\otimes\mu\otimes dt$-measurable velocity field $v_t^a$ which satisfies (2.17) $\nu_t^a\otimes\mu\otimes dt$-a.e., and
with $G_t$ and $G_t^-$ defined by (2.18).
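To make the set-valued constraint (2.17) concrete, here is a minimal sketch for a purely atomic spatial marginal. The representation of ρ as a list of (position, mass) pairs and the function names are assumptions chosen for illustration.

```python
def cdf_pair(atoms, x):
    """Return (G(x), G^-(x)) = (rho((-inf, x]), rho((-inf, x))) for a list of (position, mass)."""
    G = sum(m for p, m in atoms if p <= x)
    G_minus = sum(m for p, m in atoms if p < x)
    return G, G_minus

def velocity_interval(a, atoms, x):
    """Admissible velocities [a - G(x), a - G^-(x)] for label a at position x, as in (2.17)."""
    G, G_minus = cdf_pair(atoms, x)
    return a - G, a - G_minus

# Hypothetical marginal with an atom of mass 1/2 sitting at x = 1 (a "shock").
atoms = [(0.0, 0.25), (1.0, 0.5), (2.0, 0.25)]
at_shock = velocity_interval(0.5, atoms, 1.0)   # genuine interval: G jumps at x = 1
away = velocity_interval(0.5, atoms, 0.5)       # degenerate interval: G is continuous at x = 0.5
```

Away from atoms the interval collapses to the single value a − G(x), which is the velocity of the classical system (2.12).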
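Note that since $v_t^a$, constrained by (2.17), is bounded, $t\mapsto\nu_t^a$ is actually continuous for
the weak convergence of measures for hμ-a.e. a. Note also that the fact that $t\mapsto\nu_t^a$ satisfies
the continuity equation (2.16) in the sense of distributions is equivalent to the condition that
for every $\psi\in C([-R_v,R_v+1])$ and $\phi\in C_c^1([0,T]\times\mathbb{R})$ one has:
\[
\begin{aligned}
&\int_{\mathbb{R}}\psi(a)\Big(\int_0^T\!\!\int_{\mathbb{R}}\big(\partial_t\phi(t,x)+\partial_x\phi(t,x)\,v_t^a(x)\big)\,d\nu_t^a(x)\,dt\Big)d\mu(a) \\
&\quad=\int_{\mathbb{R}}\psi(a)\Big(\int_{\mathbb{R}}\phi(T,x)\,d\nu_T^a(x)-\int_{\mathbb{R}}\phi(0,x)\,d\nu_0^a(x)\Big)d\mu(a).
\end{aligned}
\]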
2.4 A discrete example and a system of Burgers equations
The aim of this paragraph, somehow independent from the rest of the paper, is to show, on
a discrete example, that one cannot take for granted that the stream G t remains continuous,
which justifies the necessity to relax the condition $v_t^a(x)=a-G_t(x)$ by (2.17). Consider
indeed the special case
\[ f_0=\rho_0\otimes\frac{1}{N}\sum_{i=1}^{N}\delta_{a_i-G_0(x)}, \]
where $\rho_0$ is a smooth compactly supported probability density and $a_1<\dots<a_N$ are the
finitely many values that the label a may take. In this case, we take μ as the counting measure
and then
\[ \mu=\sum_{i=1}^{N}\delta_{a_i}, \quad h(a_i)=\frac{1}{N}, \quad \nu_0^{a_i}=\frac{1}{N}\rho_0. \]
Even though $G_0$ is smooth, we have to expect that shocks may appear in finite time. Let us
relabel the measures $\nu^i:=\nu^{a_i}$ and the corresponding cumulative distributions $G^i:=G^{a_i}$,
$G:=\sum_{j=1}^{N}G^j$. If G was continuous then all the nondecreasing functions $G^i$ would also be
continuous (no shocks), and then the system (2.16)–(2.18) would become
\[ \partial_t\nu^i+\partial_x\Big(\nu^i\Big(a_i-\sum_{j=1}^{N}G^j\Big)\Big)=0, \quad \nu^i|_{t=0}=\frac{1}{N}\rho_0, \quad i=1,\dots,N. \tag{2.19} \]
Integrating with respect to the spatial variable between −∞ and x would then give a system
of Burgers-like equations:
\[ \partial_t G^i+\partial_x G^i\Big(a_i-\sum_{j=1}^{N}G^j\Big)=0, \quad G^i|_{t=0}=\frac{1}{N}G_0, \quad i=1,\dots,N. \tag{2.20} \]
We can at least formally rewrite each of these equations in the more familiar form
\[ \partial_t G^i+\partial_x G^i\,\psi_t^i(G^i)=0, \]
where each function $\psi^i$ is implicitly defined in terms of the pseudo-inverse $H_t^i$ of $G_t^i$:
\[ \psi_t^i(\alpha)=a_i-\alpha-\sum_{j\neq i}G_t^j\big(H_t^i(\alpha)\big). \]
Note that $\psi_t^i$ is decreasing for every t and actually $(\psi_t^i)'\le -1$. In the absence of shocks,
$H_t^i$ simply solves $\partial_t H^i=\psi_t^i$. Let us then take $x_1<x_2$ belonging to a certain interval
on which $\rho_0\ge\nu$ with ν > 0 and define $y_1:=\frac{1}{N}G_0(x_1)$, $y_2:=\frac{1}{N}G_0(x_2)$; we then have
$y_2-y_1=\frac{1}{N}\int_{x_1}^{x_2}\rho_0\ge\frac{\nu}{N}(x_2-x_1)$. Integrating $\partial_t H^i=\psi_t^i$ and using the fact that $(\psi^i)'\le -1$,
we get
\[ H_t^i(y_2)-H_t^i(y_1)=x_2-x_1+\int_0^t\big(\psi_s^i(y_2)-\psi_s^i(y_1)\big)\,ds\le x_2-x_1-t\,(y_2-y_1). \]
This means that $H_t^i$ becomes noninjective before a time
\[ \frac{x_2-x_1}{y_2-y_1}\le\frac{N}{\nu}. \]
In other words, discontinuities of $G^i$, i.e. shocks, appear in finite time of order O(N) for any
finite N.
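The bound on the shock time can be spelled out in one line. This is only a sketch of the step left implicit above; it uses the integrated inequality for the pseudo-inverse together with the lower bound on the difference of $y_2$ and $y_1$, and the fact that a nondecreasing injective map takes strictly larger values at $y_2$ than at $y_1$:

```latex
0 < H^i_t(y_2) - H^i_t(y_1) \le (x_2 - x_1) - t\,(y_2 - y_1)
\;\Longrightarrow\;
t < \frac{x_2 - x_1}{y_2 - y_1}
\le \frac{x_2 - x_1}{\frac{\nu}{N}\,(x_2 - x_1)} = \frac{N}{\nu}.
```

So injectivity of the pseudo-inverse on the interval between $y_1$ and $y_2$ cannot persist beyond time N/ν.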
3 A gradient flow structure
In this section, assuming (2.13) we will see how to obtain solutions to the system (2.16)–
(2.18) by a gradient flow approach. Existence of such gradient flows using the JKO implicit
scheme for Wasserstein gradient flows will be detailed in Sect. 4. We denote by $\mathcal{M}(\mathbb{R}^d)$ the
set of Borel measures on $\mathbb{R}^d$ and $\mathcal{P}(\mathbb{R}^d)$ the set of Borel probability measures on $\mathbb{R}^d$. Given
two nonnegative Borel measures ν and θ on $\mathbb{R}^d$ with common finite total mass h (not necessarily
1) and finite p-moments, recall that for $p\in[1,+\infty)$, the p-Wasserstein distance
between ν and θ is by definition:
\[ W_p(\nu,\theta):=\inf_{\gamma\in\Pi(\nu,\theta)}\Big\{\int_{\mathbb{R}^d\times\mathbb{R}^d}|x-y|^p\,d\gamma(x,y)\Big\}^{1/p}, \]
where $\Pi(\nu,\theta)$ is the set of transport plans between ν and θ, i.e. the set of Borel measures on
$\mathbb{R}^d\times\mathbb{R}^d$ having ν and θ as marginals (we refer to the textbooks of Villani [18,19] for a detailed
exposition of optimal transport theory). Wasserstein distances are usually defined between
probability measures such as $h^{-1}\nu$ and $h^{-1}\theta$, but of course they extend to measures with the
same total mass and $W_p^p(\nu,\theta)=h\,W_p^p(h^{-1}\nu,h^{-1}\theta)$. We shall mainly use the 2-Wasserstein
distance but the 1-Wasserstein distance will be useful as well in the sequel. We also recall
that the 1-Wasserstein distance can also be defined through the Kantorovich duality formula
(see for instance [18,19]):
\[ W_1(\nu,\theta):=\sup\Big\{\int_{\mathbb{R}^d}f\,d(\nu-\theta)\ :\ f\ \text{1-Lipschitz}\Big\}. \tag{3.1} \]
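In dimension one these distances are easy to compute for empirical measures, because the monotone (sorted) coupling is optimal for the costs $|x-y|^p$ with p ≥ 1. The sketch below assumes two measures with the same number of equally weighted atoms and common total mass `mass`; the function name and the sample data are hypothetical.

```python
def wasserstein_1d(xs, ys, p, mass=1.0):
    """W_p between two empirical measures on R with equally weighted atoms and total mass `mass`."""
    if len(xs) != len(ys):
        raise ValueError("both measures need the same number of atoms")
    n = len(xs)
    # Sorting both samples realises the monotone coupling, optimal in 1D for p >= 1.
    cost = (mass / n) * sum(abs(x - y) ** p for x, y in zip(sorted(xs), sorted(ys)))
    return cost ** (1.0 / p)

xs = [0.0, 1.0, 3.0]
ys = [0.5, 2.0, 2.5]
w1 = wasserstein_1d(xs, ys, 1)
w2 = wasserstein_1d(xs, ys, 2)

# Mass scaling W_p^p(nu, theta) = h * W_p^p(nu / h, theta / h), here with h = 0.4 and p = 2.
h = 0.4
scaled = wasserstein_1d(xs, ys, 2, mass=h) ** 2
```

For probability measures one expects `w1 <= w2`, which is the inequality used later as (3.7).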
We will see in Sect. 4 that one may obtain solutions to the system (2.16)–(2.18) by a minimiz-
ing scheme for an energy defined on an infinite product of spaces of measures parametrized
by the label a. Wasserstein gradient flows on finite products have recently been investigated
in [10,15]. To our knowledge the case of an infinite product is new in the literature.
3.1 Functional setting
As in Sect. 2.3, starting from $f_0$ satisfying (2.13), let us define $A:=[-R_v,R_v+1]$, fix
a σ-finite measure μ on $\mathbb{R}$ and a measurable family of finite Borel measures $a\in A\mapsto\nu_0^a$
such that, for every $\phi\in C(\mathbb{R}^2)$:
\[ \int_{\mathbb{R}^2}\phi\big(x,v+G_0(x)\big)\,df_0(x,v)=\int_{\mathbb{R}}\Big(\int_{\mathbb{R}}\phi(x,a)\,d\nu_0^a(x)\Big)d\mu(a). \]
As already pointed out, neither μ nor $\nu_0^a$ need to be probability measures; we thus define
\[ h(a):=\nu_0^a(\mathbb{R}) \]
so that $h\in L^1(\mu)$, $\int_{\mathbb{R}}h(a)\,d\mu(a)=\int_A h(a)\,d\mu(a)=1$. Let us then denote by X the set
consisting of all $\nu:=(\nu^a)_{a\in A}$, μ-measurable families of measures such that
\[ \nu^a(\mathbb{R})=h(a)\ \text{for } \mu\text{-a.e. } a, \quad\text{and}\quad \int_A\int_{\mathbb{R}}x^2\,d\nu^a(x)\,d\mu(a)<+\infty. \]
Given R > 0 [the precise choice of R will be made later on, see (4.2) below], let us denote
by $X_R$ the subset of X defined by
\[ X_R:=\big\{\nu\in X\ :\ \mathrm{Supp}(\nu^a)\subset[-R,R]\ \text{for } \mu\text{-a.e. } a\in A\big\}. \tag{3.2} \]
For $\nu\in X_R$, let us define the probability measure [because $\int_{\mathbb{R}}h(a)\,d\mu(a)=1$]
\[ \bar\nu:=\int_{\mathbb{R}}\nu^a\,d\mu(a) \]
and the energy
\[ J(\nu)=\frac14\int_{\mathbb{R}\times\mathbb{R}}|x-y|\,d\bar\nu(x)\,d\bar\nu(y)+\int_A\int_{\mathbb{R}}\Big(\frac12-a\Big)x\,d\nu^a(x)\,d\mu(a). \tag{3.3} \]
Note that J is unbounded from below on the whole of X but it is bounded on each $X_R$. Note
also that the interaction term can be rewritten as:
\[ \int_{\mathbb{R}\times\mathbb{R}}|x-y|\,d\bar\nu(x)\,d\bar\nu(y)=\int_{\mathbb{R}^4}|x-y|\,d\nu^a(x)\,d\nu^b(y)\,d\mu(a)\,d\mu(b). \tag{3.4} \]
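For a family made of finitely many atoms, the energy (3.3) reduces to a finite double sum. The sketch below assumes each atom is stored as a triple (a, x, m), meaning mass m at position x carrying label a, with the masses summing to 1; this data layout and the name `energy_J` are illustrative assumptions.

```python
def energy_J(particles):
    """Energy (3.3) for a finite family of atoms given as triples (label a, position x, mass m)."""
    # Interaction term: double integral of |x - y| against the averaged measure, cf. (3.4).
    interaction = sum(
        mi * mj * abs(xi - xj)
        for _, xi, mi in particles
        for _, xj, mj in particles
    )
    # Linear term: integral of (1/2 - a) x over labels and positions.
    drift = sum(m * (0.5 - a) * x for a, x, m in particles)
    return 0.25 * interaction + drift

# Hypothetical two-atom configuration with labels 0 and 1.
particles = [(0.0, -1.0, 0.5), (1.0, 1.0, 0.5)]
value = energy_J(particles)
```

In this example the interaction part contributes 1/4 and the linear part contributes −1/2.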
We equip $X_R$ with the distance d given by:
\[ d^2(\nu,\theta):=\int_A W_2^2(\nu^a,\theta^a)\,d\mu(a), \quad (\nu,\theta)=\big((\nu^a)_{a\in A},(\theta^a)_{a\in A}\big)\in X_R\times X_R. \tag{3.5} \]
It will also be convenient to work with the weak topology on $X_R$, that is the one defined by
the family of semi-norms
\[ p_\phi(\nu):=\Big|\int_{A\times[-R,R]}\phi\,d(\nu\otimes\mu)\Big|, \quad \phi\in C(A\times[-R,R]), \]
where $\nu\otimes\mu$ is the probability measure defined by
\[ \int_{A\times[-R,R]}\phi\,d(\nu\otimes\mu):=\int_A\Big(\int_{[-R,R]}\phi(a,x)\,d\nu^a(x)\Big)d\mu(a) \]
and
\[ K:=A\times[-R,R] \]
so that convergence for the weak topology is nothing but weak-∗ convergence of $\nu\otimes\mu$.
Since for all $\nu\in X_R$, $\nu\otimes\mu$ is a probability measure on the compact set $A\times[-R,R]$, $X_R$
is compact for the weak topology. Note also that since the weak-∗ topology is metrizable by
the Wasserstein distance (see [18,19]) on the set of probability measures on a compact set of
$\mathbb{R}^2$, the weak topology is metrizable by the distance $d_w$:
\[ d_w^2(\nu,\theta):=W_2^2(\nu\otimes\mu,\theta\otimes\mu), \quad (\nu,\theta)\in X_R\times X_R, \tag{3.6} \]

so that (X R , dw ) is a compact metric space. We summarize the basic properties of J , d and
dw in the following.
Lemma 3.1 Let $X_R$, J, d and $d_w$ be defined as above; then we have:
1. J is Lipschitz continuous for $d_w$,
2. $d_w\le d$,
3. d is lower semicontinuous for $d_w$: if $(\nu^n)_n$ is a sequence in $X_R$, $(\nu,\theta)\in X_R\times X_R$ and
$\lim_n d_w(\nu^n,\nu)=0$ then $\liminf_n d^2(\nu^n,\theta)\ge d^2(\nu,\theta)$.
Proof Let us recall that if θ and ν are (compactly supported say) probability measures on
$\mathbb{R}^d$ then by the Cauchy–Schwarz inequality,
\[ W_1(\nu,\theta)\le W_2(\nu,\theta) \tag{3.7} \]
and it follows from (3.1) that, if f is M-Lipschitz, then
\[ \int_{\mathbb{R}^d}f\,d(\nu-\theta)\le M\,W_1(\nu,\theta). \tag{3.8} \]
Moreover,
\[ W_1(\nu\otimes\nu,\theta\otimes\theta)\le 2\,W_1(\nu,\theta). \tag{3.9} \]
1. Let us rewrite J as
\[ J(\nu)=\frac14 J_0(\nu)+J_1(\nu), \]
with
\[ J_0(\nu):=\int_{K^2}|x-y|\,d(\nu\otimes\mu)(a,x)\,d(\nu\otimes\mu)(b,y), \tag{3.10} \]
and
\[ J_1(\nu):=\int_K\Big(\frac12-a\Big)x\,d(\nu\otimes\mu)(a,x). \tag{3.11} \]
The fact that $J_1$ is Lipschitz for $d_w$ directly follows from (3.7), (3.8) and the fact that the
integrand in $J_1$ is uniformly Lipschitz in x. As for $J_0$, using also (3.9) and the fact that
the distance is 1-Lipschitz, we have
\[ J_0(\nu)-J_0(\theta)\le W_1\big((\nu\otimes\mu)\otimes(\nu\otimes\mu),(\theta\otimes\mu)\otimes(\theta\otimes\mu)\big)\le 2\,W_2(\nu\otimes\mu,\theta\otimes\mu)=2\,d_w(\nu,\theta). \]
2. Let $\nu=(\nu^a)_{a\in A}$ and $\theta=(\theta^a)_{a\in A}$ be two elements of $X_R$ and let $\gamma^a$ be an optimal
plan between $\nu^a$ and $\theta^a$ (which can be chosen in a μ-measurable way, thanks to standard
measurable selection arguments, see [14]). Let us then define the probability measure α
on $K^2$ by
\[ \int_{K\times K}\phi\big((a,x),(b,y)\big)\,d\alpha(a,x,b,y):=\int_A\Big(\int_{[-R,R]^2}\phi\big((a,x),(a,y)\big)\,d\gamma^a(x,y)\Big)d\mu(a) \]
for all $\phi\in C(K\times K)$. Observing that $\alpha\in\Pi(\nu\otimes\mu,\theta\otimes\mu)$, we get
\[
\begin{aligned}
d_w^2(\nu,\theta)&\le\int_{K\times K}|x-y|^2\,d\alpha(a,x,b,y)=\int_A\Big(\int_{[-R,R]^2}|x-y|^2\,d\gamma^a(x,y)\Big)d\mu(a) \\
&=\int_A W_2^2(\nu^a,\theta^a)\,d\mu(a)=d^2(\nu,\theta).
\end{aligned}
\]
3. Let $\gamma_n^a$ be an optimal plan (μ-measurable with respect to a) between $\nu_n^a$ and $\theta^a$. Again
passing to a subsequence if necessary we may assume that $\gamma_n^a\otimes\mu$ weakly-∗ converges to
some measure of the form $\gamma^a\otimes\mu$. Using test-functions of the form $\psi(a)(\alpha(x)+\beta(y))$
we deduce easily that for μ-almost every a, $\gamma^a\in\Pi(\nu^a,\theta^a)$ and then
\[
\begin{aligned}
\liminf_n d^2(\nu^n,\theta)&=\liminf_n\int_A\Big(\int_{[-R,R]^2}|x-y|^2\,d\gamma_n^a(x,y)\Big)d\mu(a) \\
&=\int_A\Big(\int_{[-R,R]^2}|x-y|^2\,d\gamma^a(x,y)\Big)d\mu(a)\ge d^2(\nu,\theta).
\end{aligned}
\]
3.2 Subdifferential of the energy and gradient flows as measure solutions
Let us start with some convexity properties of J. Let $\nu=(\nu^a)_{a\in A}$ and θ belong to $X_R$ and
let $\gamma:=(\gamma^a)_{a\in A}$ be a measurable family of transport plans between $\nu^a$ and $\theta^a$ [which we
shall simply denote by $\gamma\in\Pi(\nu,\theta)$]. For $\varepsilon\in[0,1]$, then define
\[ \nu^\varepsilon:=\big(((1-\varepsilon)\pi_1+\varepsilon\pi_2)_\#\gamma^a\big)_{a\in A}, \tag{3.12} \]
where $\pi_1$ and $\pi_2$ are the canonical projections $\pi_1(x,y)=x$, $\pi_2(x,y)=y$. Then
$\varepsilon\in[0,1]\mapsto\nu^\varepsilon$ is a curve which interpolates between ν and θ. Similarly if we take transport
plans $\gamma^a$ induced by maps of the form $\mathrm{id}+\xi^a$ with $\xi=(\xi^a)_{a\in A}\in L^\infty(\nu\otimes\mu)$, i.e. $\theta^a=(\mathrm{id}+\xi^a)_\#\nu^a$,
then $\nu_\varepsilon^a=(\mathrm{id}+\varepsilon\xi^a)_\#\nu^a$ and in this case, we shall simply denote $\xi:=(\xi^a)_{a\in A}$
and $\nu^\varepsilon$ as
\[ \nu^\varepsilon=(\mathrm{id}+\varepsilon\xi)_\#\nu, \quad \theta=(\mathrm{id}+\xi)_\#\nu. \]
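Lemma 3.2 Let ν and θ be in $X_R$, $\gamma\in\Pi(\nu,\theta)$ and $\nu^\varepsilon$ be given by (3.12). Then
\[ J(\nu^\varepsilon)\le(1-\varepsilon)J(\nu)+\varepsilon J(\theta), \quad \forall\varepsilon\in[0,1]. \]
In particular, the same inequality holds if $\nu^\varepsilon=(\mathrm{id}+\varepsilon\xi)_\#\nu$ with $\xi\in L^\infty(\nu\otimes\mu)$.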
Proof This immediately follows from the construction of ν ε , the convexity of the absolute
value in J0 defined by (3.10) and the linearity in x of the integrand in J1 defined
by (3.11).
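The convexity inequality of Lemma 3.2 can be probed numerically in the simplest setting, where the family consists of finitely many atoms and the interpolation is induced by a displacement ξ, so that each atom moves along a straight line. The particle representation, the random data and the helper names below are assumptions made for this sketch.

```python
import random

def energy_J(particles):
    """Energy (3.3) for atoms given as triples (label a, position x, mass m)."""
    interaction = sum(mi * mj * abs(xi - xj)
                      for _, xi, mi in particles for _, xj, mj in particles)
    drift = sum(m * (0.5 - a) * x for a, x, m in particles)
    return 0.25 * interaction + drift

random.seed(1)
n = 6
labels_a = [random.uniform(-1.0, 2.0) for _ in range(n)]
masses = [1.0 / n] * n
start = [random.uniform(-1.0, 1.0) for _ in range(n)]
shift = [random.uniform(-0.5, 0.5) for _ in range(n)]   # the displacement xi at each atom

def interpolate(eps):
    """Atoms of (id + eps * xi)_# nu."""
    return [(a, x + eps * s, m) for a, x, s, m in zip(labels_a, start, shift, masses)]

J_start, J_end = energy_J(interpolate(0.0)), energy_J(interpolate(1.0))
convex_ok = all(
    energy_J(interpolate(k / 10)) <= (1 - k / 10) * J_start + (k / 10) * J_end + 1e-12
    for k in range(11)
)
```

The check succeeds because, in these coordinates, the energy is a sum of absolute values of affine functions of ε plus a linear term.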

Remark 3.3 The convexity in Lemma 3.2 holds along the interpolation $\nu^\varepsilon$ given by any transportation
plan $\gamma^a$ between $\nu^a$ and $\theta^a$; it is in particular true when in addition $\gamma^a$ is required
to be an optimal plan. In such a case, it is easy to see that the interpolation $\varepsilon\in[0,1]\mapsto\nu^\varepsilon$
is a geodesic between ν and θ; in other words, J is convex along geodesics (but does not
satisfy any strong convexity property along geodesics).
Definition 3.4 Let $\nu\in X_R$; the subdifferential of J at ν, denoted $\partial J(\nu)$, consists of all
$w:=(w^a)_{a\in A}\in L^1(\nu\otimes\mu)$ such that for every $R'>0$, every $\theta\in X_{R'}$ and every
$\gamma=(\gamma^a)_{a\in A}\in\Pi(\nu,\theta)$, one has
\[ J(\theta)-J(\nu)\ge\int_{[-R,R]\times[-R',R']\times A}w^a(y)\,(z-y)\,d\gamma^a(y,z)\,d\mu(a). \]
Remark 3.5 An equivalent way to define $\partial J(\nu)$ (which will turn out to be more convenient
in the sequel to prove stability properties, see Lemma 4.4) is in terms of transition kernels
rather than of transport plans. More precisely, given $\nu\in X_R$, we define the set $T(\nu)$ of
$\nu\otimes\mu$ measurable maps $\eta:(a,y)\in K\mapsto\eta^{a,y}\in\mathcal{P}(\mathbb{R})$ such that there exists an $R'>0$
such that $\eta^{a,y}$ is supported by $[-R',R']$ for $\nu\otimes\mu$ almost every $(a,y)\in K$. We then define
$\nu_\eta=(\nu_\eta^a)_{a\in A}$ by
\[ \int_{\mathbb{R}}\varphi(z)\,d\nu_\eta^a(z):=\int_{\mathbb{R}^2}\varphi(z)\,d\eta^{a,y}(z)\,d\nu^a(y), \quad \forall\varphi\in C(\mathbb{R}). \]
By construction, $\gamma=(\gamma^a)_{a\in A}$ with $\gamma^a=\nu^a\otimes\eta^{a,y}$ defined by
\[ \int_{\mathbb{R}^2}\varphi(y,z)\,d\gamma^a(y,z):=\int_{\mathbb{R}^2}\varphi(y,z)\,d\eta^{a,y}(z)\,d\nu^a(y), \quad \forall\varphi\in C(\mathbb{R}^2) \]
belongs to $\Pi(\nu,\nu_\eta)$ and thanks to the disintegration theorem, it is then easy to check that
$w\in\partial J(\nu)$ if and only if, for every $\eta\in T(\nu)$, one has
\[ J(\nu_\eta)-J(\nu)\ge\int_{\mathbb{R}^3}w^a(y)\,(z-y)\,d\eta^{a,y}(z)\,d\nu^a(y)\,d\mu(a). \tag{3.13} \]
Remark 3.6 If we restrict ourselves to transport maps [i.e. take $\eta^{a,y}=\delta_{\xi^a(y)}$ in (3.13)],
we obtain a condition which is weaker than Definition 3.4 but somehow easier to handle. If
$w:=(w^a)_{a\in A}\in L^1(\nu\otimes\mu)$ belongs to $\partial J(\nu)$, then for every $\xi=(\xi^a)_{a\in A}\in L^\infty(\nu\otimes\mu)$, one has
\[ J\big((\mathrm{id}+\xi)_\#\nu\big)-J(\nu)\ge\int_K w\,\xi\,d(\nu\otimes\mu)=\int_K w^a(x)\,\xi^a(x)\,d\nu^a(x)\,d\mu(a). \tag{3.14} \]
Remark 3.7 The subdifferential $\partial J$ obviously has the following monotonicity property
(which will be crucial for uniqueness, see Sect. 5): if $\nu_1$ and $\nu_2$ belong to $X_R$ and $w_1\in\partial J(\nu_1)$
and $w_2\in\partial J(\nu_2)$, then for every $\gamma\in\Pi(\nu_1,\nu_2)$, one has
\[ \int_{\mathbb{R}^3}\big(w_1^a(y)-w_2^a(z)\big)(y-z)\,d\gamma^a(y,z)\,d\mu(a)\ge 0. \tag{3.15} \]
The connection between the subdifferential [in fact the weak condition (3.14)] of the
energy J given by (3.3) and the condition (2.17) is clarified by the following:
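Proposition 3.8 Let $\nu\in X_R$; if $w\in\partial J(\nu)$ then, defining the x-marginal of $\nu\otimes\mu$ by
\[ \rho:=\int_A\nu^a\,d\mu(a) \]
and its cumulative distribution function by
\[ G(x):=\rho\big((-\infty,x]\big), \quad G^-(x)=\rho\big((-\infty,x)\big), \quad \forall x\in\mathbb{R}, \]
we have
\[ w^a(x)\in\big[G^-(x)-a,\ G(x)-a\big] \quad\text{for } \nu\otimes\mu\text{-a.e. } (a,x). \tag{3.16} \]
In particular $w\in L^\infty(\nu\otimes\mu)$ with
\[ \|w\|_{L^\infty(\nu\otimes\mu)}\le R_v+2. \tag{3.17} \]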
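Proof Let $\xi\in L^\infty(\nu\otimes\mu)$ and define $\nu^\varepsilon:=(\mathrm{id}+\varepsilon\xi)_\#\nu$ for $\varepsilon\in[0,1]$. Since $w\in\partial J(\nu)$
we have in particular
\[ \lim_{\varepsilon\to 0^+}\frac{1}{\varepsilon}\big[J(\nu^\varepsilon)-J(\nu)\big]\ge\int_K w\,\xi\,d(\nu\otimes\mu)=\int_K w^a(x)\,\xi^a(x)\,d\nu^a(x)\,d\mu(a). \tag{3.18} \]
Defining $J_0$ and $J_1$ as in (3.10) and (3.11) and $K:=A\times[-R,R]$, first we have
\[ \frac{1}{\varepsilon}\big[J_1(\nu^\varepsilon)-J_1(\nu)\big]=I_0:=\int_K\Big(\frac12-a\Big)\xi^a(x)\,d\nu^a(x)\,d\mu(a). \tag{3.19} \]
We then write
\[ \frac{1}{\varepsilon}\big[J_0(\nu^\varepsilon)-J_0(\nu)\big]=\int_{K\times K}\eta_\varepsilon(a,b,x,y)\,d(\nu\otimes\mu)(a,x)\,d(\nu\otimes\mu)(b,y) \tag{3.20} \]
with
\[ \eta_\varepsilon(a,b,x,y)=\frac{1}{\varepsilon}\Big[\big|x+\varepsilon\xi^a(x)-\big(y+\varepsilon\xi^b(y)\big)\big|-|x-y|\Big]. \tag{3.21} \]
Observing that $\eta_\varepsilon$ is bounded by $2\|\xi\|_{L^\infty(\nu\otimes\mu)}$ and that
\[ \lim_{\varepsilon\to 0^+}\eta_\varepsilon(a,b,x,y)=\begin{cases}\operatorname{sign}(x-y)\big(\xi^a(x)-\xi^b(y)\big), & \text{if } x\neq y,\\ |\xi^a(x)-\xi^b(y)|, & \text{if } x=y,\end{cases} \tag{3.22} \]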
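by Lebesgue's dominated convergence theorem, we get
\[ \lim_{\varepsilon\to 0^+}\frac{1}{\varepsilon}\big[J(\nu^\varepsilon)-J(\nu)\big]=I_0+I_1+I_2 \tag{3.23} \]
with $I_0$ given by (3.19), and
\[ I_1=\frac14\int_{K\times K}\mathbf{1}_{x\neq y}\operatorname{sign}(x-y)\big(\xi^a(x)-\xi^b(y)\big)\,d(\nu\otimes\mu)(a,x)\,d(\nu\otimes\mu)(b,y) \tag{3.24} \]
and
\[ I_2=\frac14\int_{K\times K}\mathbf{1}_{x=y}\big|\xi^a(x)-\xi^b(x)\big|\,d(\nu\otimes\mu)(a,x)\,d(\nu\otimes\mu)(b,y). \tag{3.25} \]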
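To compute $I_1$ we observe that thanks to Fubini's theorem
\[
\begin{aligned}
&\frac14\int_{K\times K}\mathbf{1}_{x>y}\big(\xi^a(x)-\xi^b(y)\big)\,d(\nu\otimes\mu)(a,x)\,d(\nu\otimes\mu)(b,y) \\
&\quad=\frac14\int_K\xi^a(x)\,G^-(x)\,d(\nu\otimes\mu)(a,x)-\frac14\int_K\xi^b(y)\,(1-G(y))\,d(\nu\otimes\mu)(b,y) \\
&\quad=\frac14\int_K\xi^a(x)\big(G^-(x)+G(x)-1\big)\,d(\nu\otimes\mu)(a,x).
\end{aligned}
\]
Treating similarly the integral on $\{x<y\}$ we thus get
\[ I_1=\int_K\Big(\frac{G^-(x)+G(x)}{2}-\frac12\Big)\xi^a(x)\,d(\nu\otimes\mu)(a,x). \tag{3.26} \]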
As for $I_2$, we have
\[ I_2\le\frac14\int_{A\times A}\Big(\int_{[-R,R]}\big(|\xi^a(x)|+|\xi^b(x)|\big)\,\nu^b(\{x\})\,d\nu^a(x)\Big)d\mu(a)\,d\mu(b), \tag{3.27} \]
then we use Fubini's theorem to get
\[ \int_{A\times A}\Big(\int_{[-R,R]}|\xi^a(x)|\,\nu^b(\{x\})\,d\nu^a(x)\Big)d\mu(a)\,d\mu(b)=\int_K|\xi^a(x)|\big(G(x)-G^-(x)\big)\,d(\nu\otimes\mu)(a,x). \]
Note that in the previous integral, the integration with respect to x is actually a discrete sum,
because the set of atoms where $G>G^-$ is at most countable since G is nondecreasing; let
us denote this set by
\[ S:=\{x\in[-R,R]\ :\ G(x)-G^-(x)>0\}=\{x_i\}_{i\in I}, \]
where I is at most countable.

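Similarly for the second term in the right hand side of (3.27),
observing that $\int_A|\xi^b(x)|\,\nu^b(\{x\})\,d\mu(b)\le\|\xi\|_{L^\infty(\nu\otimes\mu)}\big(G(x)-G^-(x)\big)$, we only have to
integrate in x over S, which gives
\[
\begin{aligned}
&\int_{A\times A}\Big(\int_{[-R,R]}|\xi^b(x)|\,\nu^b(\{x\})\,d\nu^a(x)\Big)d\mu(a)\,d\mu(b) \\
&\quad=\int_{A\times A}\sum_{i\in I}|\xi^b(x_i)|\,\nu^b(\{x_i\})\,\nu^a(\{x_i\})\,d\mu(a)\,d\mu(b) \\
&\quad=\int_A\sum_{i\in I}|\xi^b(x_i)|\,\nu^b(\{x_i\})\big(G(x_i)-G^-(x_i)\big)\,d\mu(b) \\
&\quad=\int_K|\xi^b(x)|\big(G(x)-G^-(x)\big)\,d(\nu\otimes\mu)(b,x),
\end{aligned}
\]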
so that
\[ I_2\le\frac12\int_K|\xi^a(x)|\big(G(x)-G^-(x)\big)\,d(\nu\otimes\mu)(a,x). \tag{3.28} \]
Putting together (3.18), (3.19), (3.23), (3.26) and (3.28) we arrive at the inequality
\[ \int_K\Big(w^a(x)+a-\frac12\big(G(x)+G^-(x)\big)\Big)\xi^a(x)\,d(\nu\otimes\mu)(a,x)\le\frac12\int_K|\xi^a(x)|\big(G(x)-G^-(x)\big)\,d(\nu\otimes\mu)(a,x) \]
which holds for any $\xi\in L^\infty(\nu\otimes\mu)$, and (3.16) obviously follows.
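The passage from (3.16) to the bound (3.17) is not written out; a possible way to see it, sketched here under the standing facts that the cumulative distributions take values in [0, 1] and that the labels a range over $A=[-R_v,R_v+1]$, is:

```latex
-(R_v + 1) \;\le\; G^-(x) - a \;\le\; w^a(x) \;\le\; G(x) - a \;\le\; 1 + R_v
\quad\Longrightarrow\quad
|w^a(x)| \le R_v + 1 \le R_v + 2 .
```

The constant $R_v+2$ in (3.17) is therefore not claimed to be sharp by this argument; it is simply a convenient upper bound.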

Definition 3.9 A gradient flow of J on the time interval [0, T] starting from $\nu^0$ is a Lipschitz
continuous (for d) curve $t\in[0,T]\mapsto\nu(t)=(\nu(t)^a)_{a\in A}\in X_R$ together with a measurable
map $t\in[0,T]\mapsto v(t)\in L^1(\nu\otimes\mu)$ such that $v(t)\in-\partial J(\nu(t))$ for almost every $t\in[0,T]$,
and for μ-almost every $a\in A$, $t\mapsto\nu(t)^a$ is a solution in the sense of distributions of the
continuity equation (2.16).
It then follows from Proposition 3.8 that gradient flows starting from $\nu^0$ are measure
solutions of the system (2.16)–(2.18). Note also that thanks to the bound (3.17), gradient
flows are not only absolutely continuous but automatically Lipschitz for d and even more is
true: for μ-almost every a, the curve $t\mapsto\nu_t^a$ is Lipschitz for $W_2$, more precisely
\[ W_2(\nu_t^a,\nu_s^a)\le|t-s|\,(R_v+2)\,h(a)^{1/2}, \quad\text{hence}\quad d(\nu(t),\nu(s))\le|t-s|\,(R_v+2). \tag{3.29} \]
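The second inequality in (3.29) follows from the first by integrating over the labels; a short sketch, using only the definition (3.5) of d and the normalization of h:

```latex
d^2\big(\nu(t),\nu(s)\big) = \int_A W_2^2(\nu_t^a,\nu_s^a)\,d\mu(a)
\le |t-s|^2 (R_v+2)^2 \int_A h(a)\,d\mu(a)
= |t-s|^2 (R_v+2)^2 .
```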
4 Existence by the JKO scheme
We will prove existence of a gradient flow curve on the time interval [0, T ] starting from
ν 0 = (ν0a )a∈A by considering the JKO scheme. Given a time step τ > 0, starting from ν 0 ,
we construct inductively a sequence ν k by
\[
\nu^{k+1} \in \operatorname{argmin}_{\nu\in X}\Big\{\frac{1}{2\tau}\,d^2(\nu,\nu^k) + J(\nu)\Big\} \tag{4.1}
\]
for $k = 0,\dots,N$ with $N := [T/\tau]$.
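To fix ideas about what one step of (4.1) does, here is a minimal Python sketch in a deliberately degenerate setting. It is an illustration under stated assumptions, not the scheme of this paper: the measure is a single Dirac mass on the line, so that the Wasserstein distance between two states is just the distance between their positions, and the energy is the toy quadratic `energy(x) = x**2 / 2` (a hypothetical stand-in for J). In that case the minimization reduces to one implicit Euler (proximal) step. The names `energy`, `jko_step` and `jko_trajectory` are ours.

```python
# Toy JKO (minimizing-movement) scheme for a single Dirac mass delta_x on the line.
# Assumption: the energy is the toy function J(x) = x**2 / 2, NOT the paper's J.
# For Dirac masses, W2(delta_x, delta_y) = |x - y|, so one JKO step is
#   x_{k+1} = argmin_x  (x - x_k)**2 / (2 * tau) + J(x).

def energy(x: float) -> float:
    return 0.5 * x * x

def jko_step(x_prev: float, tau: float) -> float:
    # First-order optimality condition: (x - x_prev) / tau + x = 0.
    return x_prev / (1.0 + tau)

def jko_trajectory(x0: float, tau: float, horizon: float) -> list:
    n_steps = int(horizon / tau) + 1  # steps k = 0, ..., N with N = floor(T / tau)
    xs = [x0]
    for _ in range(n_steps):
        xs.append(jko_step(xs[-1], tau))
    return xs

xs = jko_trajectory(x0=1.0, tau=0.1, horizon=1.0)
```

On this toy example each step decreases the energy by at least the squared displacement divided by 2τ, which is the one-particle analogue of the basic estimate (4.5).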
4.1 Estimates
The first step in proving that this scheme is well-defined consists in showing that one can a
priori bound the support. This is based on the following basic result (which we state in any
dimension d even though, in the sequel, we will only apply it when d = 1):
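Lemma 4.1 Let $R_0$, $R$ and $\tau$ be positive constants, $\nu_0$ be a probability measure on $\mathbb{R}^d$ with support in $B_{R_0}$ and $\nu \in \mathcal{P}_2(\mathbb{R}^d)$. Let $P$ be the projection onto $B_{R_0+\tau R}$ and define $\hat\nu := P_\#\nu$. Then, for every $a \in B_R$, one has
\[
\frac12 W_2^2(\hat\nu,\nu_0) - \tau\int_{\mathbb{R}^d} a\cdot x\,d\hat\nu(x) \le \frac12 W_2^2(\nu,\nu_0) - \tau\int_{\mathbb{R}^d} a\cdot x\,d\nu(x).
\]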
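Proof Fix an optimal transport plan between $\nu_0$ and $\nu$, i.e. a $\gamma \in \Pi(\nu_0,\nu)$ such that $W_2^2(\nu,\nu_0) = \int_{\mathbb{R}^d\times\mathbb{R}^d} |x-y|^2\,d\gamma(x,y)$. Since the map $(x,y) \mapsto (x,P(y))$ pushes forward $\gamma$ to a plan having $\nu_0$ and $\hat\nu$ as marginals, we have
\begin{align*}
\frac12 W_2^2(\hat\nu,\nu_0) &\le \frac12\int_{\mathbb{R}^d\times\mathbb{R}^d} |x-P(y)|^2\,d\gamma(x,y)\\
&= \frac12 W_2^2(\nu,\nu_0) - \frac12\int_{\mathbb{R}^d\times\mathbb{R}^d} |y-P(y)|^2\,d\gamma(x,y)
+ \int_{\mathbb{R}^d\times\mathbb{R}^d} (y-P(y))\cdot(x-P(y))\,d\gamma(x,y)
\end{align*}
and then
\[
\frac12 W_2^2(\hat\nu,\nu_0) - \tau\int_{\mathbb{R}^d} a\cdot x\,d\hat\nu(x) - \frac12 W_2^2(\nu,\nu_0) + \tau\int_{\mathbb{R}^d} a\cdot x\,d\nu(x)
\le \int_{\mathbb{R}^d\times\mathbb{R}^d} (y-P(y))\cdot(x+\tau a-P(y))\,d\gamma(x,y).
\]
But since $x + \tau a \in B_{R_0+\tau R}$ for $\gamma$-a.e. $(x,y)$, we get that the integrand in the right-hand side is nonpositive by the well-known characterization of the projection onto $B_{R_0+\tau R}$.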
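The inequality of Lemma 4.1 can be checked numerically in dimension one. The Python sketch below is only a sanity check under assumptions that are ours: both measures are empirical measures with the same number of equal-weight atoms (so that the sorted coupling is optimal and the squared Wasserstein distance is an average of squared differences), and the sample sizes, radii and random seeds are arbitrary. The function names are hypothetical.

```python
import random

def w2_squared_1d(xs, ys):
    # For two empirical measures on the line with the same number of
    # equal-weight atoms, the monotone (sorted) coupling is optimal.
    assert len(xs) == len(ys)
    return sum((x - y) ** 2 for x, y in zip(sorted(xs), sorted(ys))) / len(xs)

def project(y, radius):
    # Projection onto the interval [-radius, radius].
    return max(-radius, min(radius, y))

def lemma_functional(atoms, ref_atoms, a, tau):
    # (1/2) W2^2(atoms, ref_atoms) - tau * a * (mean of atoms)
    mean = sum(atoms) / len(atoms)
    return 0.5 * w2_squared_1d(atoms, ref_atoms) - tau * a * mean

rng = random.Random(0)
R0, R, tau, n = 1.0, 2.0, 0.1, 200
nu0 = [rng.uniform(-R0, R0) for _ in range(n)]        # support in B_{R0}
nu = [rng.gauss(0.0, 3.0) for _ in range(n)]          # arbitrary competitor
nu_hat = [project(y, R0 + tau * R) for y in nu]       # P_# nu

gaps = []
for _ in range(50):
    a = rng.uniform(-R, R)                            # a in B_R
    gaps.append(lemma_functional(nu, nu0, a, tau)
                - lemma_functional(nu_hat, nu0, a, tau))
```

Every entry of `gaps` should be nonnegative (up to rounding), in agreement with the lemma: projecting the competitor never increases the functional.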

Now consider the first step of the JKO scheme. Since $\nu^a_0$ is supported by $[-R_x,R_x]$ for every $a \in A$, and $a \in A \Rightarrow |a| \le R_v+1$, the previous lemma implies that if one replaces $\nu = (\nu^a)_{a\in A} \in X$ by $\hat\nu = (\hat\nu^a)_{a\in A}$ defined for every $a$ by $\hat\nu^a = P_\#\nu^a$, where $P$ is the projection on $[-R_x-\tau(R_v+3/2),\,R_x+\tau(R_v+3/2)]$, one has
\[
\frac12 W_2^2(\hat\nu^a,\nu^a_0) + \tau\int_{\mathbb{R}^d}\Big(\frac12 - a\Big)\cdot x\,d\hat\nu^a(x) \le \frac12 W_2^2(\nu^a,\nu^a_0) + \tau\int_{\mathbb{R}^d}\Big(\frac12 - a\Big)\cdot x\,d\nu^a(x).
\]
As for the interaction term, it is also improved by replacing $\nu$ by $\hat\nu$; this is obvious from the expression (3.4) and the fact that $P$ is 1-Lipschitz. In the first step of the JKO scheme, we may therefore impose the constraint that $\nu \in X_{R_x+\tau(R_v+3/2)}$. After $k$ steps, we may similarly impose that the minimization is performed on $X_{R_x+k\tau(R_v+3/2)}$, so simply setting
\[
R = R_x + (T+\tau)(R_v+3/2), \tag{4.2}
\]
we may replace (4.1) with a bound on the support:
\[
\nu^{k+1} \in \operatorname{argmin}_{\nu\in X_R}\Big\{\frac{1}{2\tau}\,d^2(\nu,\nu^k) + J(\nu)\Big\}. \tag{4.3}
\]
By a direct application of Lemma 3.1 and the compactness of (X R , dw ), we then see
that the minimizing scheme (4.3) is well-defined and actually defines a sequence ν k , k =
0, . . . , N + 1. We also extend this sequence by piecewise constant in time interpolation:
ν τ (t) := ν k , for t ∈ ((k − 1)τ, kτ ], k = 1, · · · , N + 1. (4.4)
In the following basic estimates, C will denote a constant (possibly depending on T )

which may vary from one line to the other. By construction, for all k = 0, . . . , N , we have
\[
\frac{1}{2\tau}\,d^2(\nu^{k+1},\nu^k) \le J(\nu^k) - J(\nu^{k+1}). \tag{4.5}
\]
Summing and using the fact that every ν k belongs to X R and that J is bounded from below
on X R we get:
\[
\frac{1}{2\tau}\sum_{k=0}^{N} d^2(\nu^{k+1},\nu^k) \le J(\nu^0) - J(\nu^{N+1}) \le C. \tag{4.6}
\]
From (4.6), the Cauchy–Schwarz inequality and Lemma 3.1 we classically get a uniform Hölder estimate:
\[
d_w(\nu^\tau(t),\nu^\tau(s)) \le d(\nu^\tau(t),\nu^\tau(s)) \le C\sqrt{|t-s|+\tau}, \quad \forall (s,t) \in [0,T]^2. \tag{4.7}
\]
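For the reader's convenience, one way to spell out the computation behind (4.7) is sketched below. The indices $i \le j$ and the explicit constant $\sqrt{2C}$ are our notation (with $C$ the constant of (4.6)); the source only states the result.

```latex
% Sketch, assuming s \le t with s \in ((i-1)\tau, i\tau] and t \in ((j-1)\tau, j\tau], i \le j,
% so that \nu^\tau(s) = \nu^i and \nu^\tau(t) = \nu^j by (4.4).
\begin{align*}
d(\nu^\tau(t),\nu^\tau(s)) = d(\nu^j,\nu^i)
  &\le \sum_{k=i}^{j-1} d(\nu^{k+1},\nu^k)
     && \text{(triangle inequality)}\\
  &\le \sqrt{j-i}\,\Big(\sum_{k=i}^{j-1} d^2(\nu^{k+1},\nu^k)\Big)^{1/2}
     && \text{(Cauchy--Schwarz)}\\
  &\le \sqrt{j-i}\,\sqrt{2\tau C}
     && \text{(by (4.6))}\\
  &\le \sqrt{2C}\,\sqrt{|t-s|+\tau},
\end{align*}
% where the last step uses (j-i-1)\tau < t - s, i.e. (j-i)\tau < |t-s| + \tau.
```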
Since $(X_R,d_w)$ is a compact metric space, it follows from some refined variant of the Ascoli-Arzelà theorem (see [3]) that there exists a limit curve
\[
t \mapsto \nu(t) \text{ belonging to } C^{0,\frac12}([0,T],(X_R,d_w))
\]
and a vanishing sequence of time-steps $\tau_n \to 0$ as $n \to +\infty$ such that
\[
\sup_{t\in[0,T]} d_w(\nu^{\tau_n}(t),\nu(t)) \to 0 \text{ as } n \to +\infty. \tag{4.8}
\]
4.2 Discrete Euler–Lagrange equation
Let $\gamma^{k+1} = (\gamma^a_{k+1})_{a\in A} \in \Pi(\nu^k,\nu^{k+1})$ be such that $\gamma^a_{k+1}$ is an optimal plan for $\mu$-almost every $a$ and let $v^a_{k+1}$ be defined by
\[
\int_{[-R,R]} \xi(y)\,v^a_{k+1}(y)\,d\nu^a_{k+1}(y) = \int_{[-R,R]^2} \xi(y)\,\frac{y-x}{\tau}\,d\gamma^a_{k+1}(x,y)
\]
for all $\xi \in C([-R,R])$, or equivalently, disintegrating $\gamma^a_{k+1}$ with respect to its second marginal $\nu^a_{k+1}$ as $d\gamma^a_{k+1}(x,y) = d\gamma^{a,y}_{k+1}(x)\otimes d\nu^a_{k+1}(y)$:
\[
v^a_{k+1}(y) = \frac{1}{\tau}\Big(y - \int_{[-R,R]} x\,d\gamma^{a,y}_{k+1}(x)\Big). \tag{4.9}
\]
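Formula (4.9) says that the discrete velocity at y is the displacement from the conditional mean of the starting points to y, divided by the time step. For a finitely supported plan this can be computed directly. The Python sketch below illustrates this for a single label a; the dictionary representation of the plan and the name `discrete_velocity` are assumptions made for the example.

```python
from collections import defaultdict

def discrete_velocity(plan, tau):
    """plan: dict {(x, y): mass} describing a finitely supported transport plan.

    Returns {y: v(y)} with v(y) = (y - E[x | y]) / tau, which is formula (4.9)
    when the disintegration with respect to the second marginal is a
    finite weighted sum.
    """
    mass_y = defaultdict(float)    # second marginal: nu_{k+1}({y})
    moment_y = defaultdict(float)  # sum of x * mass over pairs ending at y
    for (x, y), m in plan.items():
        mass_y[y] += m
        moment_y[y] += m * x
    return {y: (y - moment_y[y] / mass_y[y]) / tau for y in mass_y}

# Two source points merge at y = 1.0; one source point moves from 3.0 to 4.0.
plan = {(0.0, 1.0): 0.25, (2.0, 1.0): 0.25, (3.0, 4.0): 0.5}
tau = 0.5
v = discrete_velocity(plan, tau)

# Weak formulation with the test function xi(y) = y, both sides of the
# identity defining v^a_{k+1}:
lhs = sum(y * v[y] * sum(m for (_, yy), m in plan.items() if yy == y) for y in v)
rhs = sum(y * (y - x) / tau * m for (x, y), m in plan.items())
```

Here the two sources merging at 1.0 have conditional mean 1.0, so the velocity there vanishes, while the mass arriving at 4.0 has velocity (4 - 3) / 0.5 = 2.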
The Euler–Lagrange equation for (4.1) can then be written as
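Lemma 4.2 Let $\nu^{k+1}$ be a solution of (4.1), $\gamma^{k+1} \in \Pi(\nu^k,\nu^{k+1})$ and $v^{k+1}$ be constructed as above, then:
\[
v^{k+1} \in -\partial J(\nu^{k+1}). \tag{4.10}
\]
Proof Let $R' > 0$, $\theta \in X_{R'}$ and $\gamma \in \Pi(\nu^{k+1},\theta)$, and define for $\varepsilon \in [0,1]$
\[
\nu_\varepsilon = (\nu^a_\varepsilon)_{a\in A} \quad\text{with}\quad \nu^a_\varepsilon := ((1-\varepsilon)\pi_1 + \varepsilon\pi_2)_\#\gamma^a.
\]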
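Then by optimality of $\nu^{k+1}$ and using Lemma 3.2, we have
\begin{align*}
0 &\le \liminf_{\varepsilon\to 0^+} \frac{1}{\varepsilon}\Big[\frac{1}{2\tau}\big(d^2(\nu_\varepsilon,\nu^k) - d^2(\nu^{k+1},\nu^k)\big) + J(\nu_\varepsilon) - J(\nu^{k+1})\Big]\\
&\le \liminf_{\varepsilon\to 0^+} \frac{1}{\varepsilon}\,\frac{1}{2\tau}\big(d^2(\nu_\varepsilon,\nu^k) - d^2(\nu^{k+1},\nu^k)\big) + J(\theta) - J(\nu^{k+1}).
\end{align*}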
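We have already disintegrated the optimal plan $\gamma^a_{k+1}$ between $\nu^a_k$ and $\nu^a_{k+1}$ as
\[
\gamma^a_{k+1}(dx,dy) = \gamma^{a,y}_{k+1}(dx)\otimes\nu^a_{k+1}(dy).
\]
Let us also disintegrate the (arbitrary) plan $\gamma^a$ between $\nu^a_{k+1}$ and $\theta^a$ as:
\[
\gamma^a(dy,dz) = \nu^a_{k+1}(dy)\otimes\gamma^{a,y}(dz).
\]
Define then the 3-plan $\beta^a$ by $\beta^a = (\gamma^{a,y}_{k+1}\otimes\gamma^{a,y})\otimes\nu^a_{k+1}$, i.e.
\[
\int_{\mathbb{R}^3} \phi(x,y,z)\,d\beta^a(x,y,z) := \int_{\mathbb{R}}\Big(\int_{\mathbb{R}^2} \phi(x,y,z)\,d\gamma^{a,y}_{k+1}(x)\,d\gamma^{a,y}(z)\Big)\,d\nu^a_{k+1}(y)
\]
for every $\phi \in C(\mathbb{R}^3)$. Setting
\[
(\pi_1(x,y,z),\pi_2(x,y,z),\pi_3(x,y,z)) = (x,y,z),
\]
\[
(\pi_{12}(x,y,z),\pi_{23}(x,y,z),\pi_{13}(x,y,z)) = ((x,y),(y,z),(x,z)),
\]
we have by construction $\pi_{12\#}\beta^a = \gamma^a_{k+1}$, $\pi_{23\#}\beta^a = \gamma^a$. By the very definition of $\nu^a_\varepsilon$, we also have $(\pi_1,(1-\varepsilon)\pi_2+\varepsilon\pi_3)_\#\beta^a \in \Pi(\nu^a_k,\nu^a_\varepsilon)$ so that
\[
W_2^2(\nu^a_k,\nu^a_{k+1}) = \int_{\mathbb{R}^3} |y-x|^2\,d\beta^a(x,y,z)
\]
and
\[
W_2^2(\nu^a_k,\nu^a_\varepsilon) \le \int_{\mathbb{R}^3} |(1-\varepsilon)y+\varepsilon z-x|^2\,d\beta^a(x,y,z).
\]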
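Using Lebesgue's dominated convergence theorem and recalling the definition of $\beta^a$ and $v^a_{k+1}$ we then get
\begin{align*}
\liminf_{\varepsilon\to 0^+} \frac{1}{\varepsilon}\,\frac{1}{2\tau}\big(d^2(\nu_\varepsilon,\nu^k) - d^2(\nu^{k+1},\nu^k)\big)
&\le \int_A\Big(\int_{\mathbb{R}^3} (z-y)\cdot\frac{y-x}{\tau}\,d\beta^a(x,y,z)\Big)\,d\mu(a)\\
&= \int_A\int_{\mathbb{R}^2}\Big(\int_{[-R,R]} \frac{y-x}{\tau}\,d\gamma^{a,y}_{k+1}(x)\Big)(z-y)\,d\gamma^{a,y}(z)\,d\nu^a_{k+1}(y)\,d\mu(a)\\
&= \int_{[-R,R]\times[-R',R']\times A} v^a_{k+1}(y)\cdot(z-y)\,d\gamma^a(y,z)\,d\mu(a).
\end{align*}
This yields
\[
J(\theta) - J(\nu^{k+1}) \ge -\int_{[-R,R]\times[-R',R']\times A} v^a_{k+1}(y)\cdot(z-y)\,d\gamma^a(y,z)\,d\mu(a),
\]
i.e. $v^{k+1} \in -\partial J(\nu^{k+1})$.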

Let us also extend $v^{k+1}$ by piecewise constant interpolation
\[
v^\tau(t) = v^{k+1}, \quad t \in (k\tau,(k+1)\tau], \quad t \in [0,T], \quad v^{k+1} = (v^a_{k+1})_{a\in A}, \tag{4.11}
\]
so that, thanks to the previous Lemma, we have
v τ (t) ∈ −∂ J (ν τ (t)), t ∈ [0, T ]. (4.12)
Thanks to Proposition 3.8, note that $\sup_{t\in[0,T]} \|v^\tau(t)\|_{L^\infty(\nu^\tau(t)\otimes\mu)} \le C$; we can then define the time-dependent family of signed measures
\[
dq^\tau(t) = v^\tau(t)\,d\nu^\tau(t), \quad\text{i.e.}\quad dq^\tau(t)^a = v^\tau(t)^a\,d\nu^\tau(t)^a.
\]
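Denoting by $\lambda$ the one-dimensional Lebesgue measure on $[0,T]$, we may assume, taking a subsequence if necessary, that the bounded family of measures $q^{\tau_n}\otimes\mu\otimes\lambda$ converges weakly-$*$ to some bounded signed measure on $[-R,R]\times A\times[0,T]$ which is necessarily of the form $q\otimes\mu\otimes\lambda$ because marginals (with respect to the $a$ and $t$ variables) are stable under weak limits. Since $|q^{\tau_n}|\otimes\mu\otimes\lambda \le C\,\nu^{\tau_n}\otimes\mu\otimes\lambda$ and $\nu^{\tau_n}\otimes\mu$ converges weakly-$*$ to $\nu\otimes\mu$, we have $|q|\otimes\mu\otimes\lambda \le C\,\nu\otimes\mu\otimes\lambda$. Hence, for $\mu\otimes\lambda$-a.e. $(a,t)$, the limit satisfies $|q(t)^a| \le C\,\nu(t)^a$ and therefore can be written in the form $dq(t)^a = v(t)^a\,d\nu(t)^a$ ($q = v\nu$ for short) with $\|v(t)\|_{L^\infty(\nu(t)\otimes\mu)} \le C$ for $\lambda$-a.e. $t \in [0,T]$. We thus have
\[
q^{\tau_n}\otimes\mu\otimes\lambda = (v^{\tau_n}\nu^{\tau_n})\otimes\mu\otimes\lambda \;\overset{*}{\rightharpoonup}\; q\otimes\mu\otimes\lambda = (v\nu)\otimes\mu\otimes\lambda \quad\text{as } n \to +\infty. \tag{4.13}
\]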
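In other words, for every $\phi \in C([0,T]\times A\times[-R,R])$ we have
\begin{align*}
&\lim_n \int_0^T\int_A\int_{[-R,R]} \phi(t,a,x)\,v^{\tau_n}(t)^a(x)\,d\nu^{\tau_n}(t)^a(x)\,d\mu(a)\,dt\\
&\qquad= \int_0^T\int_A\int_{[-R,R]} \phi(t,a,x)\,v(t)^a(x)\,d\nu(t)^a(x)\,d\mu(a)\,dt.
\end{align*}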
4.3 Existence by passing to the limit
Our task now consists in showing that the limit curve t → ν(t) is a gradient flow solution
associated to the velocity t → v(t) constructed above. Let us first check that it satisfies
the system of continuity equations (2.16). To do so, take test functions ψ ∈ C(A) and
φ ∈ C 2 ([0, T ] × [−R, R]) and let us consider
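\[
\int_0^{N\tau}\int_K \psi(a)\,\partial_t\phi(t,x)\,d\nu^\tau(t)^a(x)\,d\mu(a)\,dt
= \int_A \psi(a)\sum_{k=0}^{N-1}\int_{-R}^{R} \big(\phi((k+1)\tau,x) - \phi(k\tau,x)\big)\,d\nu^a_{k+1}(x)\,d\mu(a).
\]
Then, summing by parts, we rewrite
\begin{align*}
\sum_{k=0}^{N-1}\int_{-R}^{R} \big(\phi((k+1)\tau,x) - \phi(k\tau,x)\big)\,d\nu^a_{k+1}(x)
&= \sum_{k=1}^{N-1}\int_{-R}^{R} \phi(k\tau,x)\,d(\nu^a_k - \nu^a_{k+1})(x)\\
&\quad+ \int_{-R}^{R} \phi(N\tau,x)\,d\nu^a_N(x) - \int_{-R}^{R} \phi(0,x)\,d\nu^a_1(x).
\end{align*}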
Using the optimal plans $\gamma^a_{k+1}$ as in Lemma 4.2, we then rewrite
\[
\int_{-R}^{R} \phi(k\tau,x)\,d(\nu^a_k - \nu^a_{k+1})(x) = \int_{-R}^{R}\int_{-R}^{R} \big(\phi(k\tau,x) - \phi(k\tau,y)\big)\,d\gamma^a_{k+1}(x,y).
\]
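A Taylor expansion gives
\[
\phi(k\tau,x) - \phi(k\tau,y) = \partial_x\phi(k\tau,y)(x-y) + l_k(\tau,a,x,y), \qquad |l_k(\tau,a,x,y)| \le \|\partial_{xx}\phi\|_\infty\,|x-y|^2.
\]
Integrating and using the optimality of $\gamma^a_{k+1}$ gives
\[
l_k(\tau,a) := \int_{-R}^{R}\int_{-R}^{R} |l_k(\tau,a,x,y)|\,d\gamma^a_{k+1}(x,y) \le \|\partial_{xx}\phi\|_\infty\,W_2^2(\nu^a_k,\nu^a_{k+1})
\]
and then, recalling (4.6), we have
\[
\Big|\int_A \psi(a)\sum_{k=1}^{N-1} l_k(\tau,a)\,d\mu(a)\Big| \le C\tau\,\|\partial_{xx}\phi\|_\infty\,\|\psi\|_\infty. \tag{4.14}
\]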
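Recalling the definition of the discrete velocity $v^a_{k+1}$ from Lemma 4.2, we can rewrite
\[
\int_{-R}^{R}\int_{-R}^{R} \partial_x\phi(k\tau,y)(x-y)\,d\gamma^a_{k+1}(x,y) = -\tau\int_{-R}^{R} \partial_x\phi(k\tau,x)\,v^a_{k+1}(x)\,d\nu^a_{k+1}(x),
\]
hence by definition of $\nu^\tau$ and $v^\tau$
\begin{align*}
&\int_A \psi(a)\sum_{k=1}^{N-1}\int_{-R}^{R}\int_{-R}^{R} \partial_x\phi(k\tau,y)(x-y)\,d\gamma^a_{k+1}(x,y)\,d\mu(a)\\
&\qquad= -\int_0^T\int_K \psi(a)\,\partial_x\phi(t,x)\,v^\tau(t)^a(x)\,d\nu^\tau(t)^a(x)\,d\mu(a)\,dt + O(\tau).
\end{align*}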
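Now thanks to (4.8), we have
\[
\lim_n \int_A \psi(a)\Big(\int_{-R}^{R} \phi(N\tau_n,x)\,d\nu^a_N(x)\Big)\,d\mu(a) = \int_A \psi(a)\Big(\int_{-R}^{R} \phi(T,x)\,d\nu(T)^a(x)\Big)\,d\mu(a) \tag{4.15}
\]
and
\[
\lim_n \int_A \psi(a)\Big(\int_{-R}^{R} \phi(0,x)\,d\nu^a_1(x)\Big)\,d\mu(a) = \int_A \psi(a)\Big(\int_{-R}^{R} \phi(0,x)\,d\nu^a_0(x)\Big)\,d\mu(a), \tag{4.16}
\]
where we use in the above limits that $\nu^a_N = \nu^{\tau_n}(N\tau_n)^a$ and $\nu^a_1 = \nu^{\tau_n}(\tau_n)^a$. Putting the previous computations together, summing and using (4.15), (4.14), (4.16), we thus obtain
\begin{align*}
\int_0^{N\tau}\int_K \psi(a)\,\partial_t\phi(t,x)\,d\nu^\tau(t)^a(x)\,d\mu(a)\,dt
&= -\int_0^T\int_K \psi(a)\,\partial_x\phi(t,x)\,v^\tau(t)^a(x)\,d\nu^\tau(t)^a(x)\,d\mu(a)\,dt\\
&\quad+ \int_A \psi(a)\Big(\int_{-R}^{R} \phi(T,x)\,d\nu(T)^a(x)\Big)\,d\mu(a)\\
&\quad- \int_A \psi(a)\Big(\int_{-R}^{R} \phi(0,x)\,d\nu^a_0(x)\Big)\,d\mu(a) + \varepsilon_\tau,
\end{align*}
where $\varepsilon_{\tau_n}$ goes to 0 as $n \to +\infty$. Taking $\tau = \tau_n$, using (4.8) and (4.13) and letting $n \to +\infty$ in the previous identity we get
\begin{align*}
&\int_A \psi(a)\Big(\int_0^T\int_{-R}^{R} \big(\partial_t\phi(t,x) + \partial_x\phi(t,x)\,v(t)^a(x)\big)\,d\nu(t)^a(x)\,dt\Big)\,d\mu(a)\\
&\qquad= \int_A \psi(a)\Big(\int_{-R}^{R} \phi(T,x)\,d\nu(T)^a(x) - \int_{-R}^{R} \phi(0,x)\,d\nu^a_0(x)\Big)\,d\mu(a).
\end{align*}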
In other words, we have proved the following:
Lemma 4.3 For μ-almost every a, the limit curve t → ν(t)a solves the continuity equation
(2.16) associated to the limit velocity t → v(t)a .
It remains to check that
Lemma 4.4 For a.e. t ∈ [0, T ], we have v(t) ∈ −∂ J (ν(t)).
Proof By construction of the curves v τ and ν τ and thanks to Lemma 4.2, we have seen in
(4.12) that
v τ (t) ∈ −∂ J (ν τ (t)), ∀t ∈ [0, T ]
which means that for every τ > 0, every t ∈ [0, T ] and every η ∈ T (ν τ (t)) (as defined in
Remark 3.5), we have

\[
J(\nu^\tau(t)_\eta) - J(\nu^\tau(t)) \ge -\int_{A\times\mathbb{R}^2} v^a_\tau(t)(y)\,(z-y)\,d\eta^{a,y}(z)\,d\nu^\tau(t)^a(y)\,d\mu(a). \tag{4.17}
\]
We wish to prove that there exists S ⊂ [0, T ], λ-negligible, such that for every t ∈ [0, T ] \ S
and every η ∈ T (ν(t)), one has

\[
J(\nu(t)_\eta) - J(\nu(t)) \ge -\int_{A\times\mathbb{R}^2} v^a(t)(y)\,(z-y)\,d\eta^{a,y}(z)\,d\nu(t)^a(y)\,d\mu(a). \tag{4.18}
\]
To pass to the limit τ = τn , n → ∞ in (4.17) to obtain (4.18), we shall proceed in several

steps. Let us remark that it is enough to prove (4.17) when $\eta^{a,y}$ is supported by a fixed compact
interval $[-R',R']$ (and then to take an exhaustive sequence of such compact intervals). Let
us also recall that, thanks to Lemma 3.1 and (4.8), J (ν τn (t)) converges to J (ν(t)) as n → ∞
uniformly on [0, T ].
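Step 1: Let us first consider the case where $\eta$ is continuous in the sense that $(a,y) \in K \mapsto \int \varphi(z)\,d\eta^{a,y}(z)$ is continuous for every $\varphi \in C(\mathbb{R})$. Let $\phi \in C(A\times\mathbb{R})$. Since $\varphi_\eta$ defined by $\varphi_\eta(a,y) := \int_{[-R',R']} \phi(a,z)\,d\eta^{a,y}(z)$ belongs to $C(K)$, using the fact that
\[
\langle\phi,\nu^{\tau_n}(t)_\eta\otimes\mu\rangle = \langle\varphi_\eta,\nu^{\tau_n}(t)\otimes\mu\rangle, \qquad \langle\phi,\nu(t)_\eta\otimes\mu\rangle = \langle\varphi_\eta,\nu(t)\otimes\mu\rangle
\]
and (4.8), we deduce that $\lim_n d_w(\nu^{\tau_n}(t)_\eta,\nu(t)_\eta) = 0$ for every $t \in [0,T]$. Hence, thanks to Lemma 3.1, we have
\[
\lim_n\,[J(\nu^{\tau_n}(t)_\eta) - J(\nu^{\tau_n}(t))] = J(\nu(t)_\eta) - J(\nu(t)), \quad \forall t \in [0,T]. \tag{4.19}
\]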
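Let $\varphi \in C([0,T])$, $\varphi \ge 0$. Using (4.17) gives
\begin{align*}
\int_0^T \varphi(t)\,[J(\nu^{\tau_n}(t)_\eta) - J(\nu^{\tau_n}(t))]\,dt
&\ge -\int_{[0,T]\times A\times\mathbb{R}^2} \varphi(t)\,v^a_{\tau_n}(t)(y)\,(z-y)\,d\eta^{a,y}(z)\,d\nu^{\tau_n}(t)^a(y)\,d\mu(a)\,dt\\
&= -\int_{[0,T]\times K} \varphi(t)\,\psi(a,y)\,dq^{\tau_n}(t)^a(y)\,d\mu(a)\,dt,
\end{align*}
where
\[
\psi(a,y) := \int (z-y)\,d\eta^{a,y}(z)
\]
belongs to $C(K)$. We then deduce from (4.13), (4.19) and Lebesgue's dominated convergence that
\begin{align*}
\int_0^T \varphi(t)\,[J(\nu(t)_\eta) - J(\nu(t))]\,dt
&\ge -\int_{[0,T]\times K} \varphi(t)\,\psi(a,y)\,dq(t)^a(y)\,d\mu(a)\,dt\\
&= -\int_{[0,T]\times A\times\mathbb{R}^2} \varphi(t)\,v^a(t)(y)\,(z-y)\,d\eta^{a,y}(z)\,d\nu(t)^a(y)\,d\mu(a)\,dt.
\end{align*}
This implies that there exists a negligible subset $S_\eta$ of $[0,T]$ outside which (4.18) holds.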
Step 2: For every $N \in \mathbb{N}^*$, let $\Delta_N := \{(\alpha_0,\dots,\alpha_{2N-1}) \in \mathbb{R}_+^{2N} : \sum_{k=0}^{2N-1} \alpha_k = 1\}$, $F_N$ be a countable and dense family in $C(K,\Delta_N)$, and consider
\[
D_N := \Big\{(a,y) \in K \mapsto \sum_{k=0}^{2N-1} \alpha_k(a,y)\,\delta_{z_k^N},\ (\alpha_0,\dots,\alpha_{2N-1}) \in F_N\Big\}, \qquad D := \bigcup_{N\in\mathbb{N}^*} D_N,
\]
where for $k = 0,\dots,2N-1$, $z_k^N$ denotes the midpoint of the interval $[-R'+kR'/N,\,-R'+(k+1)R'/N]$. Since $D$ is countable and its elements belong to $C(K,(\mathcal{P}([-R',R']),W_2))$, it follows from Step 1 that (4.18) holds for every $\eta \in D$ and every $t \in [0,T]\setminus S$ where $S$ is the $\lambda$-negligible set
\[
S := \bigcup_{\eta\in D} S_\eta. \tag{4.20}
\]
Step 3: Let $t \in [0,T]\setminus S$, and $\eta \in T(\nu)$ having its support in $[-R',R']$. Note that now we are working with a fixed $t$ so that we just have to suitably approximate $\eta$ by a sequence in $D$. For $N \in \mathbb{N}^*$, first define for every $(a,y) \in K$ the discrete measure
\[
\sum_{k=0}^{2N-1} f_k^N(a,y)\,\delta_{z_k^N}, \qquad f_k^N(a,y) := \eta^{a,y}(I_k^N), \tag{4.21}
\]
where $I_k^N$ is the interval $[-R'+kR'/N,\,-R'+(k+1)R'/N)$ if $k = 0,\dots,2N-2$ and $I_{2N-1}^N := [R'(1-1/N),\,R']$. We then have
\[
\sup_{(a,y)\in K} W_1\Big(\eta^{a,y},\,\sum_{k=0}^{2N-1} f_k^N(a,y)\,\delta_{z_k^N}\Big) \le \frac{R'}{N}. \tag{4.22}
\]
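The bound (4.22) reflects the fact that moving each point of the support to the midpoint of its cell displaces it by at most half a cell length, that is by at most R'/(2N). The following Python sketch checks this on an empirical measure. It is an illustration only: the sample, the values of `r_prime` and `n_cells`, and the helper names are assumptions of ours, and the Wasserstein-1 distance is computed with the sorted coupling, which is optimal for equal-weight atoms on the line.

```python
import random

def bin_to_midpoints(samples, r_prime, n):
    # Send each sample in [-R', R'] to the midpoint z_k^N of its cell I_k^N,
    # where the 2N cells have length R'/N and the last one is closed on the right.
    h = r_prime / n
    out = []
    for z in samples:
        k = min(int((z + r_prime) / h), 2 * n - 1)
        out.append(-r_prime + (k + 0.5) * h)
    return out

def w1_empirical(xs, ys):
    # W1 between two empirical measures with equally many equal-weight atoms.
    assert len(xs) == len(ys)
    return sum(abs(x - y) for x, y in zip(sorted(xs), sorted(ys))) / len(xs)

rng = random.Random(2)
r_prime, n_cells = 2.0, 5
eta = [rng.uniform(-r_prime, r_prime) for _ in range(500)] + [r_prime]
eta_discrete = bin_to_midpoints(eta, r_prime, n_cells)
distance = w1_empirical(eta, eta_discrete)
```

The computed `distance` stays below `r_prime / n_cells`, and in fact below half of it, which leaves room for the two further approximation steps (4.23) and (4.24).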
The function $(f_k^N)_{k=0,\dots,2N-1}$ is not continuous but belongs to $L^1(\nu(t)\otimes\mu,\Delta_N)$. Since $C(K,\Delta_N)$ is dense in $L^1(\nu(t)\otimes\mu,\Delta_N)$, there exist $(g_0^N,\dots,g_{2N-1}^N) \in C(K,\Delta_N)$ such that
\[
\int_K \sum_{k=0}^{2N-1} |f_k^N(a,y) - g_k^N(a,y)|\,d\nu(t)^a(y)\,d\mu(a) \le \frac{1}{N}. \tag{4.23}
\]
Since we have chosen $F_N$ dense in $C(K,\Delta_N)$, there exists $\alpha = (\alpha_0^N,\dots,\alpha_{2N-1}^N) \in F_N$ such that
\[
\sum_{k=0}^{2N-1} \sup_{(a,y)\in K} |g_k^N(a,y) - \alpha_k^N(a,y)| \le \frac{1}{N}. \tag{4.24}
\]
We then define $\eta_N \in D$ by
\[
\eta_N^{a,y} := \sum_{k=0}^{2N-1} \alpha_k^N(a,y)\,\delta_{z_k^N}.
\]
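Thanks to the Kantorovich duality formula (3.1), it is easy to see that for every $\alpha$ and $\beta$ in $\Delta_N$, $W_1\big(\sum_k \alpha_k\delta_{z_k^N},\sum_k \beta_k\delta_{z_k^N}\big) \le R'\sum_k |\alpha_k-\beta_k|$. In particular, thanks to (4.23), we have
\[
\int_K W_1\Big(\sum_k f_k^N(a,y)\,\delta_{z_k^N},\,\sum_k g_k^N(a,y)\,\delta_{z_k^N}\Big)\,d(\nu(t)\otimes\mu)(a,y) \le \frac{R'}{N}. \tag{4.25}
\]
Similarly, (4.24) implies that
\[
\sup_{(a,y)\in K} W_1\Big(\eta_N^{a,y},\,\sum_{k=0}^{2N-1} g_k^N(a,y)\,\delta_{z_k^N}\Big) \le \frac{R'}{N}. \tag{4.26}
\]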
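The elementary bound used for (4.25) and (4.26), namely that the Wasserstein-1 distance between two measures carried by the same grid is at most R' times the l1 distance of their weights, can be tested numerically. The Python sketch below does so with random weights. It relies on the standard one-dimensional formula expressing the distance as the integral of the absolute difference of the cumulative distribution functions; the grid size, seed and function names are assumptions made for the example.

```python
import random

def w1_on_grid(alpha, beta, points):
    # W1 between sum_k alpha_k * delta_{points[k]} and sum_k beta_k * delta_{points[k]}
    # on the line (points sorted increasingly): integral of |F_alpha - F_beta|.
    total, cdf_gap = 0.0, 0.0
    for k in range(len(points) - 1):
        cdf_gap += alpha[k] - beta[k]
        total += abs(cdf_gap) * (points[k + 1] - points[k])
    return total

def random_simplex(n, rng):
    # Random point of the simplex: nonnegative weights summing to one.
    w = [rng.random() for _ in range(n)]
    s = sum(w)
    return [x / s for x in w]

def midpoints(r_prime, n):
    # Midpoints z_k^N of the 2N cells of length R'/N covering [-R', R'].
    h = r_prime / n
    return [-r_prime + (k + 0.5) * h for k in range(2 * n)]

rng = random.Random(1)
r_prime, n = 2.0, 8
z = midpoints(r_prime, n)
slack = []
for _ in range(100):
    alpha, beta = random_simplex(2 * n, rng), random_simplex(2 * n, rng)
    l1 = sum(abs(p - q) for p, q in zip(alpha, beta))
    slack.append(r_prime * l1 - w1_on_grid(alpha, beta, z))
```

All entries of `slack` are nonnegative: the distance is bounded by the diameter of the grid times the total variation, which is half the l1 distance of the weights.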
We know from Step 2 that for every $N \in \mathbb{N}^*$:
\[
J(\nu(t)_{\eta_N}) - J(\nu(t)) \ge -\int_{A\times\mathbb{R}^2} v^a(t)(y)\,(z-y)\,d\eta_N^{a,y}(z)\,d\nu(t)^a(y)\,d\mu(a). \tag{4.27}
\]
Thanks to (4.22), (4.25) and (4.26) and the triangle inequality, we have
\[
\lim_{N\to\infty} \int_K W_1(\eta^{a,y},\eta_N^{a,y})\,d(\nu(t)\otimes\mu)(a,y) = 0. \tag{4.28}
\]
Recalling that $v(t) \in L^\infty(\nu(t)\otimes\mu)$ and using (3.8), we have
\begin{align*}
&\Big|\int_K v^a(t)(y)\Big(\int_{[-R',R']} (z-y)\,d(\eta_N^{a,y}-\eta^{a,y})(z)\Big)\,d\nu(t)^a(y)\,d\mu(a)\Big|\\
&\qquad\le \|v\|_{L^\infty(\nu(t)\otimes\mu)} \int_K W_1(\eta^{a,y},\eta_N^{a,y})\,d(\nu(t)\otimes\mu)(a,y)
\end{align*}
so that the right-hand side of (4.27) converges to
\[
-\int_{A\times\mathbb{R}^2} v^a(t)(y)\,(z-y)\,d\eta^{a,y}(z)\,d\nu(t)^a(y)\,d\mu(a)
\]
as $N \to \infty$. As for the convergence of the left-hand side of (4.27), we have to show that $\lim_N W_1(\nu_{\eta_N}\otimes\mu,\nu_\eta\otimes\mu) = 0$. For this, we shall use the Kantorovich duality formula (3.1) and observe that if $\phi \in C(K)$ is 1-Lipschitz then
\[
\int_K \phi(a,y)\,d\big((\nu_{\eta_N}-\nu_\eta)\otimes\mu\big)(a,y) \le \int_K W_1(\eta^{a,y},\eta_N^{a,y})\,d(\nu(t)\otimes\mu)(a,y)
\]
which tends to 0 as $N \to \infty$ thanks to (4.28). Using Lemma 3.1 we then have $\lim_{N\to\infty} J(\nu(t)_{\eta_N}) = J(\nu(t)_\eta)$. Passing to the limit $N \to \infty$ in (4.27) gives the desired inequality (4.18). This shows that $v(t) \in -\partial J(\nu(t))$ for every $t \in [0,T]\setminus S$.

We deduce from Lemmas 4.3 and 4.4 the following existence result:
Theorem 4.5 If (2.13) holds, then for any T > 0, there exists a gradient flow of J starting
from ν 0 on the time interval [0, T ]. In particular, there exist measure solutions to the system
(2.16)–(2.18).
5 Uniqueness and concluding remarks
5.1 Uniqueness and stability
Thanks to (3.15), we easily deduce uniqueness and stability:
Theorem 5.1 Let ν 0 and θ 0 be in X R . If t → ν(t) and t → θ (t) are gradient flows of J
starting respectively from ν 0 and θ 0 , then
d(ν(t), θ (t)) ≤ d(ν 0 , θ 0 ), ∀t ∈ R+ .
In particular there is a unique gradient flow of J starting from ν 0 .
Proof By definition there exist velocity fields v and w such that for a.e. t, v(t) =
(v(t)a )a∈A ∈ −∂ J (ν(t)) and w(t) = (w(t)a )a∈A ∈ −∂ J (θ (t)) and for μ-almost every
a, one has
∂t ν a + ∂x (ν a v a ) = ∂t θ a + ∂x (θ a wa ) = 0, ν a |t=0 = ν0a , θ a |t=0 = θ0a . (5.1)
Since $v^a$ and $w^a$ are bounded in $L^\infty(\nu^a)$ and $L^\infty(\theta^a)$ respectively, it follows from well-known arguments (see [3], in particular Theorem 8.4.7 and Lemma 4.3.4) that $t \mapsto W_2^2(\nu^a_t,\theta^a_t)$ is a Lipschitz function and that for any family of optimal plans $\gamma^a_s$ between $\nu^a_s$ and $\theta^a_s$, for $t_1 \le t_2$ one has:
\[
W_2^2(\nu^a_{t_2},\theta^a_{t_2}) \le W_2^2(\nu^a_{t_1},\theta^a_{t_1}) + \int_{t_1}^{t_2}\Big(\int_{\mathbb{R}^2} (v^a(s)(y) - w^a(s)(z))(y-z)\,d\gamma^a_s(y,z)\Big)\,ds.
\]
Integrating the previous inequality with respect to $\mu$ gives
\[
d^2(\nu_{t_2},\theta_{t_2}) \le d^2(\nu_{t_1},\theta_{t_1}) + \int_{t_1}^{t_2}\Big(\int_{A\times\mathbb{R}^2} (v^a(s)(y) - w^a(s)(z))(y-z)\,d\gamma^a_s(y,z)\,d\mu(a)\Big)\,ds.
\]
But since $v(s) \in -\partial J(\nu(s))$ and $w(s) \in -\partial J(\theta(s))$ for a.e. $s$, the monotonicity relation (3.15) gives
\[
\int_{A\times\mathbb{R}^2} (v^a(s)(y) - w^a(s)(z))(y-z)\,d\gamma^a_s(y,z)\,d\mu(a) \le 0.
\]
We then obtain the desired contraction estimate.

5.2 Concluding remarks
Back to classical solutions, more general initial conditions
Starting from a one-dimensional kinetic model of granular media, we have defined gener-
alized (measure) solutions thanks to a special first-integral and have proven that measure
solutions exist globally in time thanks to a gradient flow approach. For classical solutions, as
explained in section 2.2 there is an equivalence between the initial kinetic formulation and
the system of PDEs (2.12) and (2.11) which enabled us to define weak solutions through
(2.16)–(2.18). We also gave an example in section 2.4 which shows that one cannot expect
that the spatial cumulative function G t remains continuous globally in time even if G 0 is
very smooth, but in this example the initial condition is very singular in the velocity variable.
If one starts with a more regular initial condition f 0 in the phase space, it is not clear to us
whether measure solutions of (2.16)–(2.18) are such that G t remains absolutely continuous
globally in time [a necessary condition to give a meaning to (1.4)]. In other words, we have
defined a notion of generalized solutions to (1.4) and proved a global existence result for the
latter but have a priori no guarantee that these generalized solutions have enough regularity
to be solutions of (1.4).
We would also like to mention here that in our main results of existence and uniqueness of
a gradient flow for J , the assumption that ρ0 is atomless plays no significant role. Actually,
our results hold for any compactly supported initial condition ν 0 (we did not investigate the
extension to the case where this assumption is relaxed to a second moment bound, but this is
probably doable). The assumption that ρ0 is atomless was used only to select unambiguously
the Cauchy datum ν0a . We suspect that in the case where ρ0 is a discrete measure, there
might be an interesting connection between gradient flow solutions (which typically select
elements of the subgradient with minimal norm) and some solutions of the initial ODE system
(1.2) but a more precise investigation is left for the future.
Higher dimensions, more general functionals
The motivation for the present work comes from kinetic models of granular media. Since
the first integral trick of Sect. 2 is very specific to the quadratic interaction kernel case
in dimension one, all our subsequent analysis has been performed in dimension one only.
However, it is obvious (but we are not aware of any practical examples in kinetic theory) that
our arguments can be used also to study systems of continuity equations in Rd for infinitely
many species (labeled by a parameter a) such as

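\[
\partial_t\nu^a + \operatorname{div}_x\Big(\nu^a\Big(\nabla_x V(a,x) + \int_{A\times\mathbb{R}^d} \nabla_x W(a,b,x,y)\,d\nu^b(y)\,d\mu(b)\Big)\Big) = 0,
\]
which (taking for instance $W$ symmetric, $W(a,b,x,y) = W(b,a,y,x)$) can be seen as the gradient flow of
\[
J(\nu) := \int_{A\times\mathbb{R}^d} V\,d(\nu\otimes\mu) + \frac12\int_{A\times\mathbb{R}^d}\int_{A\times\mathbb{R}^d} W\,d(\nu\otimes\mu)\otimes d(\nu\otimes\mu).
\]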
Acknowledgments The authors are grateful to Yann Brenier, Reinhard Illner and Maxime Laborde for
fruitful discussions about this work. M.A. acknowledges the support of NSERC through a Discovery Grant.
G.C. gratefully acknowledges the hospitality of the Mathematics and Statistics Department at UVIC (Victoria,
Canada), and the support from the CNRS, from the ANR, through the project ISOTACE (ANR-12-MONU-0013)
and from INRIA through the action exploratoire MOKAPLAN.
References
1. Agueh, M.: Local existence of weak solutions to kinetic models of granular media. Arch. Ration. Mech.
Anal. (2016) (in press)
2. Agueh, M., Carlier, G., Illner, R.: Remarks on kinetic models of granular media: asymptotics and entropy
bounds. Kinet. Relat. Models 8(2), 201–214 (2015)
3. Ambrosio, L., Gigli, N., Savaré, G.: Gradient Flows in Metric Spaces and in the Space of Probability
Measures. Lectures in Mathematics. Birkhäuser, Basel (2005)
4. Benedetto, D., Caglioti, E., Pulvirenti, M.: A kinetic equation for granular media. RAIRO Model. Math.
Anal. Numer. 31(5), 615–641 (1997)
5. Benedetto, D., Caglioti, E., Pulvirenti, M.: Erratum: A kinetic equation for granular media. M2AN Math.
Model. Numer. Anal. 33, 439–441 (1999)
6. Bertozzi, A.L., Laurent, T., Rosado, J.: L p theory for multidimensional aggregation model. Commun.
Pure Appl. Math. 64, 45–83 (2011)
7. Brenier, Y.: On the Darcy and hydrostatic limits of the convective Navier–Stokes equations. Chin. Ann.
Math. 30, 1–14 (2009)
8. Brenier, Y., Gangbo, W., Savaré, G., Westdickenberg, M.: Sticky particle dynamics with interactions. J.
Math. Pures Appl. 99(9), no. 5, 577–617 (2013)
9. Brenier, Y., Grenier, E.: Sticky particles and scalar conservation laws. SIAM J. Numer. Anal. 35(6),
2317–2328 (1998)
10. Carlier, G., Laborde, M.: On systems of continuity equations with nonlinear diffusion and nonlocal drifts
(2015) (preprint)
11. Carrillo, J.A., McCann, R.J., Villani, C.: Kinetic equilibration rates for granular media and related equa-
tions: entropy dissipation and mass transportation estimates. Rev. Mat. Iberoam. 19, 1–48 (2003)
12. Carrillo, J.A., McCann, R.J., Villani, C.: Contractions in the 2-Wasserstein length space and thermalization
of granular media. Arch. Ration. Mech. Anal. 179, 217–263 (2006)
13. Carrillo, J.A., DiFrancesco, M., Figalli, A., Laurent, L., Slepcev, D.: Global-in-time weak measure solu-
tions and finite-time aggregation for nonlocal interaction equations. Duke Math. J. 156(2), 229–271
(2011)
14. Castaing, C., Valadier, M.: Convex Analysis and Measurable Multifunctions. Lecture Notes in Mathe-
matics, vol. 580. Springer, Berlin (1977)
15. Di Francesco, M., Fagioli, S.: Measure solutions for nonlocal interaction PDEs with two species. Non-
linearity 26, 2777–2808 (2013)
16. Jordan, R., Kinderlehrer, D., Otto, F.: The variational formulation of the Fokker–Planck equation. SIAM
J. Math. Anal. 29, 1–17 (1998)
17. Laurent, T.: Local and global existence for an aggregation equation. Commun. Part. Diff. Eq. 32, 1941–
1964 (2007)
18. Villani, C.: Topics in Optimal Transportation, Graduate Studies in Mathematics, vol. 58. American Math-
ematical Society, Providence (2003)
19. Villani, C.: Optimal Transport: Old and New. Grundlehren der mathematischen Wissenschaften. Springer,
Heidelberg (2009)
