René Carmona
François Delarue
Probabilistic Theory of Mean Field Games with Applications I
Mean Field FBSDEs, Control, and Games
Probability Theory and Stochastic Modelling
Volume 83
Editors-in-chief
Peter W. Glynn, Stanford, CA, USA
Andreas E. Kyprianou, Bath, UK
Yves Le Jan, Orsay, France
Advisory Board
Søren Asmussen, Aarhus, Denmark
Martin Hairer, Coventry, UK
Peter Jagers, Gothenburg, Sweden
Ioannis Karatzas, New York, NY, USA
Frank P. Kelly, Cambridge, UK
Bernt Øksendal, Oslo, Norway
George Papanicolaou, Stanford, CA, USA
Etienne Pardoux, Marseille, France
Edwin Perkins, Vancouver, Canada
Halil Mete Soner, Zürich, Switzerland
The Probability Theory and Stochastic Modelling series is a merger and
continuation of Springer's two well-established series Stochastic Modelling and
Applied Probability and Probability and Its Applications. It publishes
research monographs that make a significant contribution to probability theory or an
applications domain in which advanced probability methods are fundamental.
Books in this series are expected to follow rigorous mathematical standards, while
also displaying the expository quality necessary to make them useful and accessible
to advanced students as well as researchers. The series covers all aspects of modern
probability theory including
• Gaussian processes
• Markov processes
• Random fields, point processes and random sets
• Random matrices
• Statistical mechanics and random media
• Stochastic analysis
Probabilistic Theory of Mean Field Games with Applications I
Mean Field FBSDEs, Control, and Games
René Carmona
ORFE Department
Program in Applied and Computational Mathematics
Princeton University
Princeton, NJ, USA

François Delarue
Institut Universitaire de France & Laboratoire J.A. Dieudonné
Université Nice Sophia Antipolis
Nice, France
Mathematics Subject Classification (2010): 91A13, 91A06, 91A55, 49N70, 49N90, 60H10, 60H15,
60H30
Since its inception about a decade ago, the theory of Mean Field Games has rapidly
developed into one of the most significant and exciting sources of progress in the
study of the dynamical and equilibrium behavior of large systems. The introduction
of ideas from statistical physics to identify approximate equilibria for sizeable
dynamic games created a new wave of interest in the study of large populations
of competitive individuals with “mean field” interactions. This two-volume book
grew out of a series of lectures and short courses given by the authors over the last
few years on the mathematical theory of Mean Field Games and their applications
in social sciences, economics, engineering and finance. While this is indeed the
object of the book, by taste, background, and expertise, we chose to focus on the
probabilistic approach to these game models.
In a trailblazing contribution, Lasry and Lions proposed in 2006 a methodology
to produce approximate Nash equilibria for stochastic differential games with
symmetric interactions and a large number of players. In their models, a given
player feels the presence and the behavior of the other players through the empirical
distribution of their private states. This type of interaction was extensively studied
in the statistical physics literature under the name of mean field interaction, hence
the terminology Mean Field Game coined by Lasry and Lions. The theory of these
new game models was developed in lectures given by Pierre-Louis Lions at the
Collège de France which were video-taped and made available on the internet.
Simultaneously, Caines, Huang, and Malhamé proposed a similar approach to large
games under the name of Nash Certainty Equivalence principle. This terminology
fell from grace and the standard reference to these game models is now Mean Field
Games.
While slow to pick up momentum, the subject has seen a renewed wave of
interest over the last seven years. The mean field game paradigm has evolved
from its seminal principles into a full-fledged field attracting theoretically inclined
investigators as well as applied mathematicians, engineers, and social scientists.
The number of lectures, workshops, conferences, and publications devoted to the
subject has grown exponentially, and we thought it was time to provide the applied
mathematics community interested in the subject with a textbook presenting the
state of the art, as we see it. Because of our personal taste, we chose to focus on what
we like to call the probabilistic approach to mean field games. While a significant
portion of the text is based on original research by the authors, great care was taken
to include models and results contributed by others, whether or not they were aware
of the fact they were working with mean field games. So the book should feel and
read like a textbook, not a research monograph.
Most of the material and examples found in the text appear for the first time in
book form. In fact, a good part of the presentation is original, and the lion’s share
of the arguments used in the text have been designed especially for the purpose of
the book. Our concern for pedagogy justifies (or at least explains) why we chose to
divide the material into two volumes and present mean field games without a common
noise first. We ease the introduction of the technicalities needed to treat models with
a common noise in a crescendo of sophistication in the complexity of the models.
Also, we included at the end of each volume four extensive indexes (author index,
notation index, subject index, and assumption index) to make navigation throughout
the book seamless.
Acknowledgments
First and foremost, we want to thank our wives Debbie and Mélanie for their
understanding and unwavering support. The intensity of the research collaboration
which led to this two-volume book increased dramatically over the years, invading
our academic lives as well as our social lives, pushing us to the brink of sanity at
times. We shall never be able to thank them enough for their patience and tolerance.
This book project would not have been possible without them: our gratitude is
limitless.
Next we would like to thank Pierre-Louis Lions, Jean-Michel Lasry, Peter
Caines, Minyi Huang, and Roland Malhamé for their incredible insight in intro-
ducing the concept of mean field games. Working independently on both sides of
the pond, their original contributions broke the ground for an entirely new and
fertile field of research. Next in line is Pierre Cardaliaguet, not only for numerous
private conversations on game theory but also for the invaluable service provided
by the notes he wrote from Pierre-Louis Lions’ lectures at the Collège de France.
Although they were never published in printed form, these notes had a tremendous
impact on the mathematical community trying to learn about the subject, especially
at a time when writings on mean field games were few and far between.
We also express our gratitude to the organizers of the 2013 and 2015 conferences
on mean field games in Padova and Paris: Yves Achdou, Pierre Cardaliaguet, Italo
Capuzzo-Dolcetta, Paolo Dai Pra, and Jean-Michel Lasry.
While we like to cast ourselves as proponents of the probabilistic approach to
mean field games, it is fair to say that we were far from being the only ones
following this path. In fact, some of our papers were posted essentially at the same
time as papers of Bensoussan, Frehse, and Yam, addressing similar questions, with
the same type of methods. We benefitted greatly from this stimulating and healthy
competition.
This first volume of the book is entirely devoted to the theory of mean field games in
the absence of a source of random shocks common to all the players. We call these
models games without a common noise. This volume is divided into two main parts.
Part I is a self-contained introduction to mean field games, starting from practical
applications and concrete illustrations, and ending with ready-for-use solvability
results for mean field games without a common noise. While Chapters 1 and 2
are mostly dedicated to games with a finite number of players, the asymptotic
formulation which constitutes the core of the book is introduced in Chapter 3.
For the exposition to be as pedagogical as possible, we chose to defer some of
the technical aspects of this asymptotic formulation to Chapter 4 which provides a
complete toolbox for solving forward-backward stochastic differential equations of
the McKean-Vlasov type. Part II has a somewhat different scope and focuses on the
main principles of analysis on the Wasserstein space of probability measures with
a finite second moment, which plays a key role in the study of mean field games
and which will be intensively used in the second volume of the book. We present
the mathematical theory in Chapter 5, and we implement its results in Chapter 6
with the analysis of stochastic mean field control problems, which are built upon a
notion of equilibrium different from the search for Nash equilibria at the root of the
definition of mean field games. Extensions, including infinite time horizon models
and games with finite state spaces, are discussed in the epilogue of this first volume.
The remainder of this preface expands, chapter by chapter, the short content
summary given above. A diagram summarizing the connections between the
different components of the book is provided on page xix.
The first chapter sets the stage for the introduction of mean field games with a
litany of examples of increasing complexity. Starting with one-period deterministic
games with a large number of players, we introduce the mean field game paradigm.
We use examples from domains as diverse as finance, macroeconomics, population
biology, and social science to motivate the introduction of mean field games
in different mathematical settings. Some of these examples were studied in the
literature before the introduction of, and without any reference to, mean field games.
We chose them because of their illustrative power and the motivation they
offer for the introduction of new mathematical models. The examples of bank runs
modeled as mean field games of timing are a case in point. For pedagogical reasons,
we highlight practical applications where the interaction between the players does
not necessarily enter the model through the empirical distributions of the states of
the players, but via the empirical distributions of the actions of the players, or even
the joint empirical distributions of the states and the controls of the players. Most of
these examples will be revisited and solved throughout the book.
Chapter 2 offers a crash course on the mathematical theory of stochastic
differential games with a finite number of players. The material of this chapter is
not often found in book form, and since we make extensive use of its notations and
results throughout the book, we thought it was important to present them early for
the sake of completeness and future reference. We concentrate on what we call the
probabilistic approach to the search for Nash equilibria, and we introduce games
with mean field interactions as they are the main object of the book. Explicitly
solvable models are few and far between. Among them, linear quadratic (LQ for
short) models play a very special role because their solutions, when they exist,
can be obtained by solving matrix Riccati equations. The last part of the chapter
is devoted to a detailed analysis of a couple of linear quadratic models already
introduced in Chapter 1, and for which explicit solutions can be derived. Admittedly,
these models do not require the theory of mean field games since their finite player
versions can be solved explicitly. However, they provide a testbed for the analysis
of the limit of finite player equilibria when the number of players grows without
bound, offering an invaluable opportunity to introduce the concept of mean field
game and discover some of its essential features.
The probabilistic approach to mean field games is the main thrust of the
book. The underpinnings of this approach are presented in Chapter 3. Stochastic
control problems and the search for equilibria for stochastic differential games
can be tackled by reformulating the optimization and equilibrium problems in
terms of backward stochastic differential equations (BSDEs throughout the book)
and forward-backward stochastic differential equations (FBSDEs for short). In this
chapter, we review the major forms of FBSDEs that may be used to represent
the optimal trajectories of a standard optimization problem: the first one is based
on a probabilistic representation of the value function, and the second one on
the stochastic Pontryagin maximum principle. Combined with the consistency
condition issued from the search for Nash equilibria as fixed points of the best
response function, this prompts us to introduce a new class of FBSDEs with a
distinctive McKean-Vlasov character. This chapter presents a basic existence result
for McKean-Vlasov FBSDEs. This result will be extended in Chapter 4. As a by-
product, we obtain early solvability results for mean field games by straightforward
implementations of the two forms of the probabilistic approach just mentioned.
However, since our primary aim in this chapter is to make the presentation as
pedagogical as possible, we postpone the most general versions of the existence
results for mean field games to Chapter 4, as some of the proofs are rather technical.
Instead, we highlight the role of monotonicity, as captured by the so-called Lasry-
Lions monotonicity conditions, in the analysis of uniqueness of equilibria. Finally,
we specialize the results of this chapter to the case of linear-quadratic mean field
games, which can be handled directly via the analysis of Riccati equations. Most
of the results of this chapter will be revisited and extended in the second volume
to accommodate a common noise which is found in many economic and physical
applications.
Chapter 4 starts with a stochastic analysis primer on the theory of FBSDEs.
As explained above, in the mean field limit of large games, the fixed point step of
the search for Nash equilibria turns the standard FBSDEs derived from optimization
problems into equations of the McKean-Vlasov type by introducing the distribution
of the solution into the coefficients. These FBSDEs characterize the equilibria. Since
this new class of FBSDEs was not studied before the advent of mean field games,
one of the main objectives of Chapter 4 is to provide a systematic approach to their
solution. We show how to use Schauder’s fixed point theorem to prove existence of
a solution. The chapter closes with the analysis of the so-called extended mean field
games, in which the players are interacting not only through the distribution of their
states but also through the distribution of their controls. Finally, we demonstrate
how the methodology developed in the chapter applies to some of the examples
presented in the opening chapter.
Although it contains very few results on mean field games, Chapter 5 plays a
pivotal role in the book. It contains all the results on spaces of probability measures
which we use throughout the book, including the definitions and properties of the
Wasserstein distances, the convergence of the empirical measures of a sequence of
independent and identically distributed random variables, and, most importantly, a
detailed presentation of the differential calculus on the Wasserstein space introduced
by Lions in his unpublished lectures at the Collège de France, and by Cardaliaguet
in the notes he wrote from Lions’ lectures. Even though the use of this differential
calculus in the first volume is limited to the ensuing Chapter 6, the differential
calculus on the Wasserstein space plays a fundamental role in the study of the master
equation for mean field games, whose presentation and analysis will be provided in
detail in the second volume. Still, a foretaste of the master equation is given at the
end of this chapter. Its derivation is based on an original form of Itô’s formula for
functionals of the marginal laws of an Itô process, the proof of which is given in full
detail. For the sake of completeness, we also provide a thorough and enlightening
discussion of the connections between Lions’ differential calculus, which we call
L-differential calculus throughout the book, and other forms of differential calculus
on the space of probability measures, among which the differential calculus used in
optimal transportation theory.
One of the remarkable features of the construction of solutions to mean field
game problems is the similarity with a natural problem which did not get much
attention from analysts and probabilists: the optimal control of (stochastic) differ-
ential equations of the McKean-Vlasov type, which could also be called mean field
optimal control. The latter is studied in Chapter 6. Both problems can be interpreted
as searches for equilibria of large populations, a claim which will be substantiated
in Chapter 6 of the second volume of the book. Interestingly, the optimal control
of McKean-Vlasov stochastic dynamics is intrinsically a stochastic optimization
problem while the search for Nash equilibria in mean field games is more of a
fixed point problem than an optimization problem. So despite the strong similarities
between the two problems, differences do exist, and we highlight them starting with
Chapter 6. There, we show that since the problem at hand is a stochastic control
problem, the optimal control of McKean-Vlasov stochastic dynamics can be tackled
by means of an appropriate version of the Pontryagin stochastic maximum principle.
Following this strategy leads to FBSDEs for which the backward part involves the
derivative of the Hamiltonian with respect to a measure argument. This novel feature
is handled with the tools provided in Chapter 5. We close the chapter with the
discussion of an alternative strategy for solving mean field optimal control problems,
based on the notion of relaxed controls. Also, we review several crucial examples,
among them potential games. These latter models are mean field games for which
the solutions can be reduced to the solutions of mean field optimal control problems,
and optimal transportation problems.
Chapter 7 is a capstone which we use to revisit some of the examples introduced
in Chapter 1, especially those which are not exactly covered by the probabilistic
theory of stochastic differential mean field games developed in the first volume.
Indeed, Chapter 1 included a considerable amount of applications hinting at
mathematical models with distinctive features which are not accommodated by
the models and results of the first part of this first volume. We devote this
chapter to presentations, even if only informal, of extensions of the Mean Field
Game paradigm to these models. They include extensions to several homogeneous
populations, infinite horizon optimization, and models with finite state spaces. These
mean field game models have a great potential for the quantitative analysis of very
important practical applications, and we show how the technology developed in the
first volume of the book can be brought to bear on their solutions.
[Diagram: Volume I — Part I: Chapters 1 & 2 (Finite Games), Chapters 3 & 4 (Mean Field Games); Part II: Chapter 5 (Analysis on Wasserstein Space), Chapter 6 (Mean Field Control); Epilogue I (Extensions). Volume II — Part I: Mean Field Games with a Common Noise; Part II: Master Equation and Convergence Problem; Epilogue II (More Extensions).]
Thick lines indicate the logical order of the chapters. The dotted line between
Chapters 3–4 and 6 emphasizes the fact that—in some cases like potential games—
mean field games and mean field control problems share the same solutions. Finally,
the dashed lines starting from Part II (second volume) point toward the games and
the optimization problems for which we can solve approximately the finite-player
versions or for which the finite-player equilibria are shown to converge.
References to the second volume appear in the text in the following forms: Chapter (Vol II)-X, Section (Vol II)-X.x, Theorem (Vol II)-X.x, Proposition (Vol II)-X.x, Equation (Vol II)-(X.x), ..., where X denotes the corresponding chapter in the second volume and x the corresponding label.
Contents

Volume I

Contents of Volume II
Epilogue to Volume I
7 Extensions for Volume I
7.1 First Extensions
7.1.1 Mean Field Games with Several Populations
7.1.2 Infinite Horizon MFGs
7.2 Mean Field Games with Finitely Many States
7.2.1 N-Player Symmetric Games in a Finite State Space
7.2.2 Mean Field Game Formulation
7.2.3 A First Form of the Cyber-Security Model
7.2.4 Special Cases of MFGs with Finite State Spaces
7.2.5 Counter-Example to the Convergence of Markovian Equilibria
7.3 Notes & Complements
References
Volume II
Part I MFGs with a Common Noise
1 Optimization in a Random Environment
1.1 FBSDEs in a Random Environment
1.1.1 Immersion and Compatibility
1.1.2 Compatible Probabilistic Set-Up
1.1.3 Kunita-Watanabe Decomposition and Definition of a Solution
1.2 Strong Versus Weak Solution of an FBSDE
1.2.1 Notions of Uniqueness
1.2.2 Canonical Spaces
1.2.3 Yamada-Watanabe Theorem for FBSDEs
1.3 Initial Information, Small Time Solvability, and Decoupling Field
1.3.1 Disintegration of FBSDEs
1.3.2 Small Time Solvability and Decoupling Field
1.3.3 Induction Procedure
1.4 Optimization with Random Coefficients
1.4.1 Optimization Problem
1.4.2 Value Function and Stochastic HJB Equation
1.4.3 Revisiting the Connection Between the HJB Equations and BSDEs
1.4.4 Revisiting the Pontryagin Stochastic Maximum Principle
1.5 Notes & Complements
2 MFGs with a Common Noise: Strong and Weak Solutions
2.1 Conditional Propagation of Chaos
2.1.1 N-Player Games with a Common Noise
2.1.2 Set-Up for a Conditional McKean-Vlasov Theory
2.1.3 Formulation of the Limit Problem
2.1.4 Conditional Propagation of Chaos
2.2 Strong Solutions to MFGs with Common Noise
2.2.1 Solution Strategy for Mean Field Games
2.2.2 Revisiting the Probabilistic Set-Up
2.2.3 FBSDE Formulation of the First Step in the Search for a Solution
2.2.4 Strong MFG Matching Problem: Solutions and Strong Solvability
2.3 Weak Solutions for MFGs with Common Noise
2.3.1 Weak MFG Matching Problem
2.3.2 Yamada-Watanabe Theorem for MFG Equilibria
2.3.3 Infinite Dimensional Stochastic FBSDEs
2.4 Notes & Complements
References
Abstract
The goal of this first chapter is to introduce a diverse collection of examples of
games whose descriptions help us introduce the notation and terminology as well
as some preliminary results at the core of the theory of mean field games. This
compendium of models illustrates the great variety of applications which can
potentially be studied rigorously by means of this concept. These introductory
examples are chosen for their simplicity, and for pedagogical reasons, we take
shortcuts and ignore some of their key features. As a consequence, they will
need to be revisited before we can subject them to the treatment provided by
the theoretical results developed later on in the book. Moreover, some of the
examples may not fit well with the typical models analyzed mathematically in
this text as the latter are most often in continuous time, and of the stochastic
differential variety. For this reason, we believe that the best way to think about
this litany of models is to remember the quote by Box and Draper: "All models are wrong, but some are useful."
Our first example is mostly intended to motivate the terminology and explain the notation used throughout. We consider a population of $N$ individuals which we denote by $i \in \{1,\dots,N\}$, individual $i$ having the choice of a point $\alpha^i$ in a space $A^i$ which is assumed to be a compact metric space for the sake of simplicity. This is a simple mathematical model for a game discussed frequently by P.L. Lions in his lectures [265] under the name of "Where do I put my towel on the beach?", all the $A^i$ being equal to the same set $A$ representing the beach, and $\alpha^i$ the location where player $i$ is setting his or her towel.
We are interested in large games, a statement which is translated mathematically by the assumption $N \to \infty$. The game nature of the problem comes from the fact that each individual is an optimizer (a minimizer for the sake of definiteness) in the sense that player $i$ tries to minimize a quantity $J^i(\alpha^1,\dots,\alpha^N)$. In order for simplifications to occur in the limit $N \to \infty$, regularity and symmetry assumptions will be needed. For example, we shall assume that all the spaces $A^i$ are in fact the same compact metric space $A$. Next, we shall also assume that each function $J^i$ is symmetric in the choices $\alpha^j$ of the players $j \ne i$. For instance, the following cost functions could be used in the description of the example "Where do I put my towel on the beach?":
$$J^i(\alpha^1,\dots,\alpha^N) = \alpha\, d(\alpha^i,\alpha_0) - \frac{\beta}{N-1} \sum_{1\le j\ne i\le N} d(\alpha^i,\alpha^j). \tag{1.1}$$
Here, $\alpha$ and $\beta$ are two real numbers, $d$ denotes the distance of the metric space $A$, and $\alpha_0 \in A$ is a special point of interest (say, a food stand on the beach). So if the constants $\alpha$ and $\beta$ are positive, the fact that the players try to minimize this cost function means that the players want to be closer to the point of interest $\alpha_0$, but at the same time, as far away from each other as possible. Clearly, the above cost functions are of the form:
functions are of the form:
X
i 1 Q 1
J .˛ ; ; ˛ / D J ˛ ;
N i
ı˛ j : (1.2)
N1
16j¤i6N
where here and throughout the book, we use the notation ıx for the unit mass at
the point x, forR the function JQ of ˛ 2 A and 2 P.A/ defined by J.˛; Q / D
0 0
˛d.˛; ˛0 / ˇ E d.˛; ˛ /.d˛ /. Also, we use the notation P.A/ for the space of
probability measures on A. The special form (1.2) is the epitome of the type of
interaction we shall encounter throughout the text. The fact that it involves a function of a measure, an empirical measure in this instance, will be justified by Lemma 1.2 below, which gives a rigorous foundation to the rationale for the main assumption of mean field game models. When the cost $J^i$ to player $i$ is a function of $\alpha^i$ and a symmetric function of the other $\alpha^j$ with $j \ne i$, and the dependence of $J^i$ upon each individual $\alpha^j$ is minor, then the function $J^i$ can be viewed as a function of $\alpha^i$ and the empirical distribution of the remaining $\alpha^j$. In other words, in the limit $N \to \infty$ of large games, the costs $J^i$ can be viewed as functions of $\alpha^i$ and a probability measure. See Lemma 1.2 below for details.
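To make the passage from (1.1) to (1.2) concrete, here is a minimal numerical sketch (ours, not the book's), assuming a one-dimensional beach $A = [0,1]$ and hypothetical values for the constants; the pairwise cost and its mean-field rewriting agree by construction.

```python
# Sketch: the pairwise cost (1.1) versus its mean-field form (1.2).
# All numerical values below are illustrative assumptions, not from the book.
import numpy as np

rng = np.random.default_rng(0)
N = 500
alpha0, a_const, b_const = 0.5, 1.0, 1.0   # food stand location, constants of (1.1)
alphas = rng.uniform(0.0, 1.0, N)          # towel positions of the N players

def J_pairwise(i):
    """Cost (1.1): attraction to alpha0 minus average distance to the others."""
    others = np.delete(alphas, i)
    return a_const * abs(alphas[i] - alpha0) - b_const * np.mean(np.abs(alphas[i] - others))

def J_tilde(alpha, support):
    """Mean-field form (1.2): the same cost, seen as a function of the empirical
    measure (1/(N-1)) * sum of Dirac masses at the points of `support`."""
    return a_const * abs(alpha - alpha0) - b_const * np.mean(np.abs(alpha - support))

i = 3
print(J_pairwise(i), J_tilde(alphas[i], np.delete(alphas, i)))  # identical values
```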
It is customary to use the notation $\alpha$ to denote the actions taken by, or controls used by, the players. In this way, the cost to each player appears as a function of the values of the various $\alpha^i$ chosen by the players. This is the notation system used above. On the other hand, it is also customary to use the notation $x$ to denote the state of a system influenced by the actions of the players. Most often in the applications we shall consider, the state of the system comprises individual states $x^i$ attached to each player $i$, and a few other factors, all of them entering the computation of the costs $J^i$. Notice that in the above example, $x^i = \alpha^i$ since the controls $\alpha^i$ describe the choices made by the players as well as the states they put themselves in.
The coordinatewise order, arguably the most natural partial order on $\mathbb R^N$, is not total and, as a result, the simultaneous minimization of the $N$ scalar functions $J^i$ can be a touchy business. The standard way to resolve the possible ambiguities associated with this simultaneous minimization is to appeal to the notion of Nash equilibrium: a profile $\hat\alpha = (\hat\alpha^1,\dots,\hat\alpha^N) \in A^N$ is a Nash equilibrium if, for every $i \in \{1,\dots,N\}$ and every $\alpha \in A$,
$$J^i(\hat\alpha^1,\dots,\hat\alpha^N) \le J^i(\hat\alpha^1,\dots,\hat\alpha^{i-1},\alpha,\hat\alpha^{i+1},\dots,\hat\alpha^N).$$
Note that the above inequality is easier to write and read if one uses the special notation introduced earlier. Indeed, with $\bar\mu^{N-1}_{\hat\alpha^{-i}} = \frac{1}{N-1}\sum_{j\ne i}\delta_{\hat\alpha^j}$, it reads:
$$\tilde J\big(\hat\alpha^i,\ \bar\mu^{N-1}_{\hat\alpha^{-i}}\big) \le \tilde J\big(\alpha,\ \bar\mu^{N-1}_{\hat\alpha^{-i}}\big), \qquad \alpha \in A.$$
The notion of Nash equilibrium is best understood in terms of the so-called best response function $B : A^N \to A^N$ defined by:
$$B(\alpha^1,\dots,\alpha^N) = \Big(\underset{\alpha\in A}{\arg\min}\ J^i(\alpha^1,\dots,\alpha^{i-1},\alpha,\alpha^{i+1},\dots,\alpha^N)\Big)_{i=1,\dots,N},$$
which is obviously well defined if we assume that, for each $i$, the function appearing in the right-hand side has a unique minimizer, but which can also be defined in more general situations. With the intuitive concept of best response formalized in this way, a Nash equilibrium appears as a fixed point of the best response function $B$.
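For a finite number of players, the fixed point of $B$ can be approximated by iterating best responses; the toy script below (our illustration, with a discretized beach and arbitrary constants) stops when one sweep of best responses leaves the profile unchanged. Simultaneous best-response dynamics may cycle in general, so a cap on the number of sweeps is needed.

```python
# Sketch: Nash equilibrium as a fixed point of the best response map B,
# on a discretized beach A = {0, 1/100, ..., 1}; constants are assumptions.
import numpy as np

grid = np.linspace(0.0, 1.0, 101)    # discretized action space A
N, alpha0, a_c, b_c = 8, 0.5, 1.0, 0.25

def costs(others):
    """Cost of every grid action against the fixed actions of the other players."""
    return a_c * np.abs(grid - alpha0) - b_c * np.mean(np.abs(grid[:, None] - others[None, :]), axis=1)

def best_response(profile):
    return np.array([grid[np.argmin(costs(np.delete(profile, i)))] for i in range(N)])

profile = np.full(N, 0.5)
for _ in range(100):                 # best-response dynamics (may cycle in general)
    nxt = best_response(profile)
    if np.array_equal(nxt, profile): # fixed point of B: a Nash equilibrium
        break
    profile = nxt
print(np.sort(profile))
```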
Solving $N$-player games for Nash equilibria is often difficult, even for one-period deterministic games, and the strategy behind the theory of mean field games is to search for simplifications in the limit $N \to \infty$ of large games. Since such simplifications cannot be expected in full generality, we shall need to restrict our attention to games with specific properties. In order to justify the set-up of mean field games, we shall assume:
• a strong symmetry property of the model: typically, we shall assume that each cost function $J^i$ is a symmetric function of the $N-1$ variables $\alpha^j$ for $j \ne i$;
• that the influence of each single player on the whole system diminishes as $N$ gets larger.
But first, we introduce a special notation for empirical distributions. If $n \ge 1$ is an integer, given a point $X = (x_1,\dots,x_n) \in E^n$, we shall denote by $\bar\mu^n_X$ the probability measure defined by:
$$\bar\mu^n_X = \frac{1}{n}\sum_{i=1}^n \delta_{x_i}, \tag{1.3}$$
where we use the notation $\delta_x$ to denote the unit mass at the point $x \in E$. Notice that $\bar\mu^n_X$ belongs to $\mathcal P(E)$, the space of probability measures on $E$. For the purpose of the present discussion, $E$ is assumed to be a compact metric space. Later on in the book, we shall consider more general topological spaces. Accordingly, we assume that the space $\mathcal P(E)$ is equipped with the topology of the weak convergence of measures, according to which a sequence $(\mu_k)_{k\ge1}$ of probability measures in $\mathcal P(E)$ converges to $\mu \in \mathcal P(E)$ if and only if $\int_E f\,d\mu_k \to \int_E f\,d\mu$ when $k \to \infty$ for every real-valued continuous function $f$ on $E$. The space $\mathcal P(E)$ is a compact metric space for this topology and we denote by $\rho$ a distance compatible with this topology. We shall see several examples of such metrics in Chapter 5, where we provide an in-depth analysis of several forms of calculus on spaces of measures.
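The convergence just described is easy to witness numerically: for i.i.d. samples, integrating a continuous test function against the empirical measure reproduces the integral against the common law, up to a statistical error of order $n^{-1/2}$. A two-line check (our illustration):

```python
# Sketch: weak convergence of empirical measures via the law of large numbers.
import numpy as np

rng = np.random.default_rng(1)
exact = np.exp(-0.5)                         # E[cos(Z)] = e^{-1/2} for Z ~ N(0,1)
for n in (10**2, 10**4, 10**6):
    X = rng.standard_normal(n)               # atoms of the empirical measure
    print(n, abs(np.cos(X).mean() - exact))  # |∫ f dμ̄ⁿ_X − ∫ f dμ| for f = cos
```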
The following simple result is the basis for our formulation of the limiting problems for games with a large number of players. Its assumptions formalize what we mean when we informally talk about a symmetric function weakly dependent upon each of its arguments.

Lemma 1.2 For each $n \ge 1$, let $u^n : E^n \to \mathbb R$ be a symmetric function, and assume that:
1. the functions $(u^n)_{n\ge1}$ are uniformly bounded;
2. there exists a constant $c > 0$ such that $|u^n(X) - u^n(Y)| \le c\,\rho\big(\bar\mu^n_X,\bar\mu^n_Y\big)$ for all $n \ge 1$ and $X, Y \in E^n$.
Then, there exists a subsequence $(u^{n_k})_{k\ge1}$ and a Lipschitz continuous map $U : \mathcal P(E) \to \mathbb R$ such that:
$$\lim_{k\to\infty}\ \sup_{X \in E^{n_k}} \big|u^{n_k}(X) - U\big(\bar\mu^{n_k}_X\big)\big| = 0.$$
Remark 1.3 The adjective "symmetric" in the statement of the lemma is redundant. It was included as a matter of emphasis. Indeed, assumption 2 implies that, for each $n \ge 1$, $u^n$ is necessarily symmetric and continuous. Choosing $Y$ as a permutation of the entries of $X$, we get $u^n(X) = u^n(Y)$ by observing that $\bar\mu^n_X = \bar\mu^n_Y$. Moreover, for any sequence $(X^n_p)_{p\ge1}$ with values in $E^n$ converging (for the product topology on $E^n$) toward some $X^n$, the sequence $(\bar\mu^n_{X^n_p})_{p\ge1}$ converges toward $\bar\mu^n_{X^n}$, implying the continuity of $u^n$.
Proof. For each integer $n \ge 1$, define the function $U^n$ on $\mathcal P(E)$ by:
$$U^n(\mu) = \inf_{X \in E^n}\big[u^n(X) + c\,\rho\big(\bar\mu^n_X,\mu\big)\big], \qquad \mu \in \mathcal P(E). \tag{1.4}$$
These functions are uniformly bounded since the functions $(u^n)_{n\ge1}$ are uniformly bounded, see assumption 1 in the statement, while the function $\mathcal P(E)^2 \ni (\mu,\nu) \mapsto \rho(\mu,\nu)$ is bounded since $\mathcal P(E)$ is compact. Also, the functions $(U^n)_{n\ge1}$ extend the original functions $(u^n)_{n\ge1}$ to $\mathcal P(E)$ in the sense that $u^n(Y) = U^n(\bar\mu^n_Y)$ for $Y \in E^n$ and $n \ge 1$. Indeed, by choosing $X = Y$ in the infimum appearing in definition (1.4) we get $U^n(\bar\mu^n_Y) \le u^n(Y)$. Equality then holds true because, if not, there would exist an $X \in E^n$ such that $u^n(X) + c\,\rho(\bar\mu^n_X,\bar\mu^n_Y) < u^n(Y)$, which would contradict assumption 2. Furthermore, these extensions are $c$-Lipschitz continuous on $\mathcal P(E)$ in the sense that:
$$|U^n(\mu) - U^n(\nu)| \le c\,\rho(\mu,\nu)$$
for all $\mu,\nu \in \mathcal P(E)$. To prove this, let $X \in E^n$ and $Y \in E^n$ be such that $U^n(\mu) = u^n(X) + c\,\rho(\bar\mu^n_X,\mu)$ and $U^n(\nu) = u^n(Y) + c\,\rho(\bar\mu^n_Y,\nu)$. The infimum in the definition (1.4) is attained because the space $E^n$ is compact and, for each fixed $\mu \in \mathcal P(E)$, the function $E^n \ni X \mapsto u^n(X) + c\,\rho(\bar\mu^n_X,\mu)$ is continuous. Now, using $Y$ in the infimum defining $U^n(\mu)$ and the triangle inequality:
$$U^n(\mu) - U^n(\nu) \le \big[u^n(Y) + c\,\rho(\bar\mu^n_Y,\mu)\big] - \big[u^n(Y) + c\,\rho(\bar\mu^n_Y,\nu)\big] \le c\,\rho(\mu,\nu).$$
Similarly, we prove that $U^n(\nu) - U^n(\mu) \le c\,\rho(\mu,\nu)$ by using $X$ in the infimum defining $U^n(\nu)$. This completes the proof of the $c$-Lipschitz continuity.
Now since $\mathcal P(E)$ is a compact metric space, the Arzelà-Ascoli theorem gives the existence of a subsequence $(n_k)_{k\ge1}$ for which $U^{n_k}$ converges uniformly toward a limit $U$, and consequently:
$$\limsup_{k\to\infty}\ \sup_{X\in E^{n_k}} \big|u^{n_k}(X) - U\big(\bar\mu^{n_k}_X\big)\big| \le \limsup_{k\to\infty}\ \sup_{\mu\in\mathcal P(E)} \big|U^{n_k}(\mu) - U(\mu)\big| = 0,$$
which follows from the fact that $u^{n_k}(X) = U^{n_k}(\bar\mu^{n_k}_X)$ and which concludes the proof. □
The way we wrote formula (1.1) was in anticipation of the result of the above lemma. Indeed, given this result, the take-home message is that symmetric functions of many variables depending weakly upon each of their arguments are conveniently approximated (up to a subsequence) by regular functions of measures evaluated at the empirical distributions of their original arguments.

Returning to the game set-up of the previous subsection, we assume that the cost functions, which we denote by $(J^{N,i})_{i=1,\dots,N}$ to emphasize the fact that they depend on $N$ variables in $A$, satisfy:
$$J^{N,i}(\alpha^1,\dots,\alpha^N) = J^N\big(\alpha^i,\ \bar\mu^{N-1}_{\alpha^{-i}}\big), \qquad\text{where}\qquad \bar\mu^{N-1}_{\alpha^{-i}} = \frac{1}{N-1}\sum_{1\le j\ne i\le N}\delta_{\alpha^j},$$
this being the form of the assumption referred to as Large Symmetric Cost Functional in the sequel. In analogy with (1.4), we also consider:
$$J^{(N)}(\alpha,\mu) = \inf_{(\alpha^2,\dots,\alpha^N)\in A^{N-1}}\Big[J^{N,1}\big(\alpha,\alpha^2,\dots,\alpha^N\big) + c\,\rho\big(\bar\mu^{N-1}_{(\alpha^2,\dots,\alpha^N)},\,\mu\big)\Big]. \tag{1.5}$$
Proposition 1.4 We assume that, for each integer $N \ge 1$, $\hat\alpha^N = (\hat\alpha^{N,1},\dots,\hat\alpha^{N,N})$ is a Nash equilibrium for the game defined by the cost functions $J^{N,1},\dots,J^{N,N}$, which are assumed to satisfy assumption Large Symmetric Cost Functional. Also, we assume that the metric $\rho$ in Large Symmetric Cost Functional satisfies, for all $N \ge 1$, $\alpha \in A$ and $\mu \in \mathcal P(A)$:
$$\rho\Big(\frac{N-1}{N}\,\mu + \frac{1}{N}\,\delta_\alpha,\ \mu\Big) \le \frac{c}{N}, \tag{1.6}$$
for a constant $c > 0$. Then there exist a subsequence $(N_k)_{k\ge1}$ and a continuous function $J : A\times\mathcal P(A) \to \mathbb R$ such that the sequence $(\bar\mu^{N_k}_{\hat\alpha^{N_k}})_{k\ge1}$ converges as $k \to \infty$ toward a probability measure $\hat\mu \in \mathcal P(A)$, and
$$\lim_{k\to\infty}\ \sup_{\alpha^{N_k}\in A^{N_k}} \Big|J^{N_k,1}\big(\alpha^{N_k,1},\dots,\alpha^{N_k,N_k}\big) - J\big(\alpha^{N_k,1},\ \bar\mu^{N_k-1}_{\alpha^{N_k,-1}}\big)\Big| = 0, \tag{1.7}$$
and
$$\int_A J(\alpha,\hat\mu)\,\hat\mu(d\alpha) = \inf_{\nu\in\mathcal P(A)} \int_A J(\alpha,\hat\mu)\,\nu(d\alpha). \tag{1.8}$$
Remark 1.5 Notice that property (1.8) is equivalent to the fact that the topological support of the measure $\hat\mu$ is contained in the set of minima of the function $\alpha \mapsto J(\alpha,\hat\mu)$, or in other words that:
$$\hat\mu\big(\{\alpha \in A :\ J(\alpha,\hat\mu) \le J(\alpha',\hat\mu)\ \text{for all}\ \alpha' \in A\}\big) = 1. \tag{1.9}$$
Remark 1.6 The bound (1.6) is easily checked in the case of the Kantorovich-Rubinstein distance $\rho = d_{KR}$ introduced in Chapter 5, but it can also be checked directly for other metrics consistent with the weak convergence of measures.

Remark 1.7 In (1.8), we used the notation $\nu(d\alpha)$ in the integral, but throughout the text, we use indistinguishably the two notations $\nu(d\alpha)$ and $d\nu(\alpha)$.
Proof. Since $A$ is compact, we can find a subsequence $(N_k)_{k\ge1}$ such that $(\bar\mu^{N_k}_{\hat\alpha^{N_k}})_{k\ge1}$ converges toward a probability measure $\hat\mu$. Also, by Lemma 1.2, we can assume that (1.7) holds. We just check that $\hat\mu$ and $J$ satisfy (1.8). By definition of a Nash equilibrium and by assumption Large Symmetric Cost Functional, we have:
$$\delta_{\hat\alpha^{N,i}} \in \underset{\nu\in\mathcal P(A)}{\arg\inf}\ \int_A J^{(N)}\Big(\alpha,\ \frac{1}{N-1}\sum_{j\ne i}\delta_{\hat\alpha^{N,j}}\Big)\,\nu(d\alpha),$$
for all $\nu \in \mathcal P(A)$. Since the functions $(J^{(N)})_{N\ge1}$ are uniformly Lipschitz continuous, we can find a constant $c'$, independent of $N$, such that:
$$J^{(N)}\big(\hat\alpha^{N,i},\ \bar\mu^N_{\hat\alpha^N}\big) \le \inf_{\nu\in\mathcal P(A)} \int_A J^{(N)}\big(\alpha,\ \bar\mu^N_{\hat\alpha^N}\big)\,\nu(d\alpha) + \frac{c'}{N}.$$
Choosing $N = N_k$ and using the fact that the sequence $(J^{(N_k)})_{k\ge1}$ converges to $J$ uniformly, we deduce that there exists a sequence of positive reals $(\varepsilon_k)_{k\ge1}$, converging to $0$ as $k$ tends to $\infty$, such that:
$$\int_A J\big(\alpha,\ \bar\mu^{N_k}_{\hat\alpha^{N_k}}\big)\,\bar\mu^{N_k}_{\hat\alpha^{N_k}}(d\alpha) \le \inf_{\nu\in\mathcal P(A)} \int_A J\big(\alpha,\ \bar\mu^{N_k}_{\hat\alpha^{N_k}}\big)\,\nu(d\alpha) + \varepsilon_k.$$
Letting $k$ tend to $\infty$ and using the fact that $J$ is Lipschitz continuous on $A\times\mathcal P(A)$, we get:
$$\int_A J(\alpha,\hat\mu)\,\hat\mu(d\alpha) = \inf_{\nu\in\mathcal P(A)} \int_A J(\alpha,\hat\mu)\,\nu(d\alpha),$$
which is precisely (1.8). □
Remark 1.8 Below, we shall use a special form of Proposition 1.4. Starting from a limiting cost functional $J : A\times\mathcal P(A) \to \mathbb R$, we shall reconstruct the $N$-player cost functionals $J^{N,i}$ by letting:
$$J^{N,i}(\alpha^1,\dots,\alpha^N) = J\Big(\alpha^i,\ \frac{1}{N-1}\sum_{j\ne i}\delta_{\alpha^j}\Big).$$
In other words, we assume that not only does (1.7) hold in the limit $k \to \infty$, but also for any $k \ge 1$.
In this context, the last step of the search for an equilibrium, namely the fixed point argument, amounts to making sure that the support of the measure is included in the set of minimizers. So the definition of a Nash point can be rewritten in the form:
$$\mathrm{Supp}(\hat\mu) \subset \underset{\alpha\in A}{\arg\min}\ J(\alpha,\hat\mu),$$
where $\mathrm{Supp}(\mu)$ denotes the support of $\mu$. Summarizing the above discussion, the search for Nash equilibria via the search for fixed points of the best response function can be recast as the two-step procedure:
(i) for each fixed $\mu \in \mathcal P(A)$, identify the minimizers of $A \ni \alpha \mapsto J(\alpha,\mu)$;
(ii) find $\hat\mu \in \mathcal P(A)$ such that $\mathrm{Supp}(\hat\mu)$ is contained in the set of minimizers of $\alpha \mapsto J(\alpha,\hat\mu)$.
The above two-step problem is what we call the MFG problem. This formulation captures the information of all the Nash points as $N \to \infty$. In particular, the pathologies associated with undesirable Nash points, which are expected when $N$ is finite, should disappear in the limit $N \to \infty$. Clearly, the goal of the analysis of an MFG problem is to find the equilibrium distribution $\hat\mu$ for the population, not so much the optimal positions of the individuals.
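A schematic numerical rendering of the two-step procedure, on a finite action set and for a hypothetical crowd-aversion cost of our own choosing (step (i) computes the minimizers of $J(\cdot,\mu)$; step (ii) updates $\mu$ toward a measure supported by those minimizers):

```python
# Sketch: the two-step MFG procedure as a damped fixed-point iteration.
# The cost J(alpha, mu) = |alpha - alpha0| + kappa * mu({alpha}) is a toy choice.
import numpy as np

grid = np.linspace(0.0, 1.0, 51)          # finite action set A
alpha0, kappa = 0.5, 2.0                  # point of interest, aversion to crowds

mu = np.full(grid.size, 1.0 / grid.size)  # initial guess: uniform distribution
for _ in range(2000):
    J = np.abs(grid - alpha0) + kappa * mu   # step (i): costs given the crowd mu
    argmin = (J <= J.min() + 1e-12)          # minimizers of J(., mu)
    target = argmin / argmin.sum()           # a measure carried by the minimizers
    mu = 0.98 * mu + 0.02 * target           # step (ii): damped fixed-point update
print(grid[mu > 1e-3])                    # support of the approximate equilibrium
```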
One of the objectives of the book is to generalize Proposition 1.4 to stochastic
differential games. Indeed, the theory of mean field games is grounded in the
premise of the convergence of Nash equilibria of large stochastic differential games
with a mean field interaction. This is the object of Chapter (Vol II)-6. Interestingly
enough, we shall provide in Chapter 7 at the end of this first volume, a counter-
example showing that the limiting MFG problem may not capture all the limits of
the finite-game equilibria.
Uniqueness
Existence will not be much of a problem when the cost function $J$ is jointly continuous, which we denote by $J \in C(A\times\mathcal P(A))$, as long as we still assume that $A$ is compact. Indeed, in this case, a plethora of fixed point theorems for continuous functions on compact spaces can be brought to bear on the model to prove existence. Uniqueness is typically more difficult. In fact, uniqueness is not even true in general. The main sufficient condition used to derive uniqueness (whether we consider the simple case at hand or the more sophisticated stochastic differential games which we study later) will be a monotonicity property identified by Lasry and Lions. We illustrate its workings in the present framework of static deterministic games. It will be studied in a more general context in Section 3.4 of Chapter 3. We say that the cost function $J$ is strictly monotone if:
$$\int_A \big[J(\alpha,\mu_1) - J(\alpha,\mu_2)\big]\,d(\mu_1-\mu_2)(\alpha) > 0 \tag{1.10}$$
whenever $\mu_1 \ne \mu_2$. If $J$ is strictly monotone, the MFG problem has at most one solution. Indeed, assume that $\mu_1$ and $\mu_2$ are two solutions. Since the support of $\mu_1$ is contained in the set of minimizers of $J(\cdot,\mu_1)$, we have:
$$\int_A J(\alpha,\mu_1)\,d(\mu_2-\mu_1)(\alpha) \ge 0,$$
and, similarly,
$$\int_A J(\alpha,\mu_2)\,d(\mu_1-\mu_2)(\alpha) \ge 0,$$
because $\mu_2$ is an optimum. Summing the two last inequalities and computing the left-hand side of (1.10), we find a contradiction unless the two solutions $\mu_1$ and $\mu_2$ are the same. □
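As an illustration (ours, not taken from the book), costs generated by a symmetric, strictly positive definite interaction kernel are strictly monotone. Take $J(\alpha,\mu) = J_0(\alpha) + \int_A k(\alpha,\alpha')\,\mu(d\alpha')$ with $k$ symmetric and strictly positive definite (a Gaussian kernel, for instance). Then, for $\mu_1 \ne \mu_2$:
$$\int_A \big[J(\alpha,\mu_1)-J(\alpha,\mu_2)\big]\,d(\mu_1-\mu_2)(\alpha) = \iint_{A\times A} k(\alpha,\alpha')\,d(\mu_1-\mu_2)(\alpha')\,d(\mu_1-\mu_2)(\alpha) > 0,$$
since $\mu_1-\mu_2$ is a nonzero signed measure and the quadratic form associated with a strictly positive definite kernel is positive on such measures. So (1.10) holds and the corresponding MFG problem has at most one solution.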
t
for some real valued function f on the real line. We used the notation Leb for the
Lebesgue measure, and for > 0, we denoted byR B the Euclidean ball of radius
around the origin. Consequently, 1B .˛/ D B .˛ d˛ 0 /. So back to the N
player game, the argument of f is the proportion of individuals in the ball of radius
around ˛, so that the cost to player i is given by:
1 #fjI d.˛ i ; ˛ j / < g
J .˛ ; ; ˛ / D J0 .˛ / C f
N;i N i
;
NLeb.B /
and is a function of the fraction of the population inside the neighborhood in which
player i would like to choose ˛ i . This is a particular case of a smoothing of the
measure to guarantee the existence of a density: if ' is a positive function with
integral 1 which vanishes outside a ball around the origin, one replaces the density
p.˛/, which may not exist, by the quantity Œ '.˛/. The closer to the delta
function ' is, the better the approximation of the density. Notice that the function
7! Œ '.˛/ is continuous with respect to the topology of the weak convergence
of measures whenever ' is continuous. As we already explained, this may play a
crucial role in the analysis.
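A small numerical sketch of this smoothing (our illustration, assuming a triangular mollifier $\varphi$ supported in $(-h,h)$ with unit integral):

```python
# Sketch: the smoothing mu -> [mu * phi](alpha) applied to an empirical measure.
import numpy as np

rng = np.random.default_rng(2)
sample = rng.normal(0.0, 1.0, 2000)      # atoms of the empirical measure
h = 0.3                                  # support radius of the mollifier

def phi(x):
    return np.maximum(1.0 - np.abs(x) / h, 0.0) / h   # triangle kernel, integral 1

def smoothed_density(alpha):
    # [mu * phi](alpha) = (1/n) sum_i phi(alpha - x_i): a continuous density proxy
    return phi(alpha - sample).mean()

print(smoothed_density(0.0))             # close to the N(0,1) density at 0 (~0.399)
```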
An example which we will use several times in the sequel corresponds to the choice of the function $f(t) = c\,t^a$ for some positive exponent $a$ and a real constant $c$. The case $c > 0$ corresponds to aversion to crowds, while $c < 0$ indicates the desire to mingle with the crowd. This example offers instances of models for which uniqueness does not hold for some values of the parameters.
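To see how uniqueness can fail in the case $c < 0$, consider the following quick argument of ours, under the smoothing above with a continuous mollifier $\varphi$ peaked at the origin. Take $J_0 \equiv 0$ and $J(\alpha,\mu) = f([\mu*\varphi](\alpha))$ with $f$ decreasing. For any point $\alpha^*$ in the interior of $A$, the choice $\hat\mu = \delta_{\alpha^*}$ makes $[\hat\mu*\varphi](\alpha) = \varphi(\alpha-\alpha^*)$ maximal at $\alpha = \alpha^*$, hence $J(\cdot,\hat\mu)$ minimal there; the support of $\hat\mu$ is thus contained in the set of minimizers of $J(\cdot,\hat\mu)$, so that every such $\delta_{\alpha^*}$ solves the MFG problem and uniqueness fails.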
We shall recall the precise definition of potential games in Chapter 2. For the purpose of the present discussion, we identify a specific class of games in the limit $N \to \infty$. We shall say that the large game is a potential game if the cost function $J : A\times\mathcal P(A) \to \mathbb R$ is of the form:
$$J(\alpha,\mu) = \delta F(\mu)(\alpha),$$
for some function $F : \mathcal P(A) \to \mathbb R$ which is differentiable in the following sense: for every $\mu,\nu \in \mathcal P(A)$,
$$\lim_{\varepsilon \searrow 0} \frac{F\big((1-\varepsilon)\mu + \varepsilon\nu\big) - F(\mu)}{\varepsilon} = \int_A \delta F(\mu)(\alpha)\,(\nu-\mu)(d\alpha),$$
for some function $\delta F(\mu) : A \to \mathbb R$. In this case, minimizers of $F$ provide solutions of the MFG problem:

Proposition. If $\mu \in \mathcal P(A)$ is a minimum of $F$, then $\mu$ solves the MFG problem associated with the cost $J$. If, moreover, $J$ is strictly monotone, the solution is unique.
Proof. If $\mu$ is a minimum of $F$, the Euler first order condition for the minimization of $F$ reads:
$$\int_A \delta F(\mu)(\alpha)\,d[\nu-\mu](\alpha) \ge 0$$
for all $\nu \in \mathcal P(A)$, which is the definition of a solution of the MFG problem. Uniqueness when $J$ is strictly monotone was argued earlier. □
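For a concrete instance consistent with the definition above (our illustration), take:
$$F(\mu) = \int_A J_0(\alpha)\,\mu(d\alpha) + \frac{1}{2}\iint_{A\times A} k(\alpha,\alpha')\,\mu(d\alpha)\,\mu(d\alpha'),$$
with $k$ symmetric and continuous. Expanding $F((1-\varepsilon)\mu + \varepsilon\nu)$ and collecting the terms linear in $\varepsilon$ gives:
$$\delta F(\mu)(\alpha) = J_0(\alpha) + \int_A k(\alpha,\alpha')\,\mu(d\alpha'),$$
so the mean field game with cost $J(\alpha,\mu) = J_0(\alpha) + \int_A k(\alpha,\alpha')\,\mu(d\alpha')$ is a potential game, and its solutions can be searched for by minimizing $F$ over $\mathcal P(A)$.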
The discussion of the previous section was mostly introductory. Its purpose was to
introduce important notation and definitions, and to highlight, already in the case of
a simple static deterministic game, the philosophy of the mean field game approach,
by finding a more tractable problem than the N-player game when N is large.
The first stochastic game model we present was called "When does the meeting start?" when it was first introduced. A meeting is scheduled to start at a time $t_0$ known to everyone, and participant $i \in \{1,\dots,N\}$ tries to get to the meeting at time $t^i$ (which is a number completely under the control of player $i$), except for the fact that, despite this desire to get to the meeting at time $t^i$, due to uncertainties in traffic conditions and public transportation, individual $i$ arrives at the meeting at time $X^i$ which is the realization of a Gaussian random variable with mean $t^i$ and variance $\sigma_i^2 > 0$ which is itself random. So in this game model, the control of player $i$ is the time $t^i$, which we shall denote by $\alpha^i$ from now on in order to conform with the notation used throughout the book.
The random variables $X^i$ may not have the same variances because the players may be coming from different locations and facing different traffic conditions. To be specific, we shall assume that, for $i \in \{1,\dots,N\}$, $X^i = \alpha^i + \sigma_i\varepsilon_i$ for a sequence $(\varepsilon_i)_{1\le i\le N}$ of independent identically distributed (i.i.d. for short) $N(0,1)$ random variables, while the $(\sigma_i)_{1\le i\le N}$ form an i.i.d. sequence of their own which is assumed to be independent of the sequence $(\varepsilon_i)_{1\le i\le N}$. We denote by $\nu$ the common distribution of the $\sigma_i$'s; $\nu$ is assumed to be a probability measure on $[0,\infty)$ whose topological support does not contain $0$.
If a meeting scheduled to start at time $t_0$ actually starts at time $t$, and if agent $i$ arrives at the meeting at time $X^i$ controlled as defined above, the expected overall cost to participant $i$ is defined as:
$$J^i = \mathbb E\big[a(X^i - t_0)^+ + b(X^i - t)^+ + c(t - X^i)^+\big], \tag{1.12}$$
for three positive constants $a$, $b$ and $c$, where we use the notation $x^+$ to denote the positive part $\max(0,x)$ of a real number $x \in \mathbb R$. The interpretations of the three terms appearing in the total cost to agent $i$ are as follows:
• the first term $a(X^i - t_0)^+$ represents a reputation cost for arriving late (as compared to the announced starting time $t_0$);
• the term $b(X^i - t)^+$ quantifies the inconvenience for missing the beginning of the meeting;
• the term $c(t - X^i)^+$ represents the cost of the time wasted by being early and having to wait.
We assume that the actual start time of the meeting is decided algorithmically by computing an agreed upon function of the empirical distribution $\bar\mu^N_X$ of the arrival times $X = (X^1,\dots,X^N)$. In other words, we assume that $t = \tau(\bar\mu^N_X)$ for some function $\tau : \mathcal P(\mathbb R_+) \to \mathbb R$. For the sake of illustration, we can think of the case where $\tau(\mu)$ is chosen to be the $100p$-percentile of $\mu$, for some fixed number $p \in [0,1]$. In other words, the meeting starts when $100p$ percent of the agents have already joined the meeting. This is a form of quorum rule. Notice that the fact that the choice of the start time $t = \tau(\bar\mu^N_X)$ is a function of the empirical distribution is the source of the interactions between the agents, who need to take into account the decisions of the other agents in order to make their own decision on how to minimize (1.12).
As expected, the search for a Nash equilibrium starts with the computation of the best response function of each player given the decisions of the other players. So we need, for each agent $i \in \{1,\dots,N\}$, to assume that the other players have made their choices $\alpha^{-i}$, and then solve the minimization problem:
$$\inf_{\alpha^i}\ \mathbb E\big[a(X^i - t_0)^+ + b(X^i - t)^+ + c(t - X^i)^+\big], \tag{1.13}$$
with $t = t(X^1,\dots,X^N) = \tau(\bar\mu^N_X)$. Observe that, in contrast with the type of cost functional used in Remark 1.8, the empirical measure is here computed over the states instead of the controls, and the computation is performed over all the players as opposed to over the players different from $i$.
The problem (1.13) is not easy to solve, especially given the fact that the computation of the best response function needs to be followed by the computation of its fixed points in order to identify Nash equilibria. So instead of searching for exact Nash equilibria for the finite player game, we settle for an approximate solution and try to solve the limiting MFG problem. The rationale for the MFG strategy explained earlier was partly based on the convergence of the empirical distributions toward a probability measure $\mu$. Now, because of the randomness of the arrival times and the independence assumption, the intuition behind the MFG strategy is reinforced when $N$ is relatively large, since a form of the law of large numbers should kick in, and the empirical measures $\bar\mu^N_X$ (which are random) should converge toward a deterministic probability measure $\mu$. Moreover, the sensitivity of $\bar\mu^N_X$ with respect to perturbations of $\alpha^i$ for a single individual should not affect the limit, and it seems reasonable to consider the alternate optimization problem obtained by replacing $t = \tau(\bar\mu^N_X)$ in (1.13) by $t = \tau(\mu)$. In other words, we assume that the times of arrival of the agents have a statistical distribution $\mu$, and we compute, for a representative individual, the best response to this distribution. Since we choose a rule given by a predetermined function of the arrival time distribution, we, de facto, compute the best response, say $\hat\alpha$, to $t = \tau(\mu)$. The fixed point step required to determine Nash equilibria in the classical case can now be mimicked by searching for a value of $t$ which is reproduced by the best response it induces, namely $t = \tau$ applied to the distribution of $\hat\alpha + \sigma\varepsilon$. This is exactly the MFG program outlined in the previous section. We now give the details of its implementation in the present set-up.
Proposition 1.11 Given $t \ge t_0$, the best response $\hat\alpha = \hat\alpha(t)$ is the unique solution of:
$$a\,F(\hat\alpha - t_0) + b\,F(\hat\alpha - t) - c\,F(t - \hat\alpha) = 0, \tag{1.14}$$
where $F(z) = \int_0^\infty \Phi(z/\sigma)\,\nu(d\sigma)$.

Proof. Notice that $F(z) = \int \Phi(z/\sigma)\,\nu(d\sigma)$ defines a strictly positive, strictly increasing continuous function satisfying $\lim_{z\to-\infty} F(z) = 0$, $\lim_{z\to+\infty} F(z) = 1$, and $1 - F(z) = F(-z)$. Here and throughout, $\Phi$ denotes the cumulative distribution function of the standard Gaussian distribution $N(0,1)$. Moreover, the fact that the topological support of $\nu$ does not contain $0$ implies that $F$ is differentiable and that its derivative is uniformly bounded over $\mathbb R$. The quantity to minimize reads:
$$\alpha \mapsto a\,\mathbb E\big[(\alpha + \sigma\varepsilon - t_0)^+\big] + b\,\mathbb E\big[(\alpha + \sigma\varepsilon - t)^+\big] + c\,\mathbb E\big[(t - \alpha - \sigma\varepsilon)^+\big],$$
so that, taking the derivative of the above expression with respect to $\alpha$, we get the first order condition of optimality:
$$a\,F(\alpha - t_0) + b\,F(\alpha - t) - c\,F(t - \alpha) = 0,$$
which is exactly (1.14). Given the properties of $F$ identified earlier, this equation has a unique solution $\hat\alpha$. □
Theorem 1.12 Assume that the function $\tau : \mathcal P(\mathbb R_+) \to \mathbb R$ satisfies the following three properties:
1. $\forall \mu \in \mathcal P(\mathbb R_+)$, $\tau(\mu) \ge t_0$; in other words, the meeting never starts before $t_0$;
2. Monotonicity: if $\mu, \mu' \in \mathcal P(\mathbb R_+)$ and if $\mu([0,\alpha]) \le \mu'([0,\alpha])$ for all $\alpha \ge 0$, then $\tau(\mu) \ge \tau(\mu')$;
3. Sub-additivity: if $\mu \in \mathcal P(\mathbb R_+)$, then for all $\alpha > 0$, $\tau\big(\mu(\cdot - \alpha)\big) \le \tau(\mu) + \alpha$.
If the constants $a$, $b$, and $c$ are strictly positive, there exists a unique fixed point for the map $\alpha \mapsto \hat\alpha$, as defined in the previous proposition with $t = \tau(F(\cdot - \alpha))$.

In the statement of the theorem as well as in the subsequent proof, when we use the notation $\tau(F(\cdot - \alpha))$, we identify the cumulative distribution function $F(\cdot - \alpha)$ with the distribution of the random variable $\alpha + \sigma\varepsilon$.

Proof. We are looking for a fixed point of the map $\alpha \mapsto G(\alpha) = \hat\alpha$ defined by:
$$\alpha \mapsto F(\cdot - \alpha) \mapsto t = \tau\big(F(\cdot - \alpha)\big) \mapsto \hat\alpha = \hat\alpha(t),$$
the last step being given by the solution of equation (1.14). Assuming that $x < y$, the monotonicity assumption on $\tau$ gives that:
$$\tau\big(F(\cdot - x)\big) \le \tau\big(F(\cdot - y)\big).$$
Using the special form of equation (1.14), the implicit function theorem implies that, when viewing $\hat\alpha$ as a function of $t$, we have:
$$0 \le \frac{d\hat\alpha}{dt} = \frac{b\,F'(\hat\alpha - t) + c\,F'(t - \hat\alpha)}{a\,F'(\hat\alpha - t_0) + b\,F'(\hat\alpha - t) + c\,F'(t - \hat\alpha)},$$
which is bounded from above by a constant strictly smaller than $1$ because $F'$ is positive. This implies that $G$ is a strict contraction, and that it admits a unique fixed point. □
Remark 1.13 Notice that the quorum rule given by a $100p$-percentile of the distribution of the arrival times satisfies the three assumptions stated in the above theorem. As a natural extension of the models considered above, one could envision cases where the cost to each agent depends on more general functionals of the distribution $\mu$ of the individual arrival times. In such a case, the optimization problem should be solved for each fixed distribution $\mu$, and the fixed point part of the proof should concern $\mu$ instead of $t$. This is much more involved mathematically, as the space of measures is infinite dimensional.
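To see the contraction at work, here is a small numerical sketch (our construction) assuming a single noise level $\sigma$ (so that $\nu = \delta_\sigma$ and $F(z) = \Phi(z/\sigma)$), the quorum rule $\tau(\mu) = \max(t_0,\ 100p\text{-percentile of }\mu)$, and arbitrary values for $a$, $b$, $c$:

```python
# Sketch: Picard iteration for the fixed point of Theorem 1.12.
# Model choices (sigma, p, a, b, c, t0) are illustrative assumptions.
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm

t0, a, b, c = 10.0, 1.0, 2.0, 0.5
sigma, p = 0.5, 0.7

F = lambda z: norm.cdf(z / sigma)       # F(z) = Phi(z/sigma) since nu = delta_sigma

def tau(alpha):
    """Quorum rule floored at t0: the 100p-percentile of N(alpha, sigma^2)."""
    return max(t0, alpha + sigma * norm.ppf(p))

def G(alpha):
    """Best response to t = tau(F(. - alpha)), obtained by solving the
    first order condition (1.14) for alpha-hat."""
    t = tau(alpha)
    foc = lambda x: a * F(x - t0) + b * F(x - t) - c * F(t - x)
    return brentq(foc, t0 - 50.0, t + 50.0)   # (1.14) has a unique root

alpha = t0
for _ in range(100):                    # G is a strict contraction
    alpha = G(alpha)
print("equilibrium target:", alpha, " meeting starts at:", tau(alpha))
```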
In the first stochastic game model just presented, the interventions of the players were choices affecting random times. As such, this game could have been called a game of timing. However, we simplified the model by allowing the actions of the players to be limited to the choices of the means of these random times. The standard terminology game of timing seems to be restricted to situations where, as time evolves, the information available to each player increases, and the choices of random times are made in a non-anticipative way vis-à-vis the information available at the time of the decision. Mathematically, this means that the random times are in fact stopping times for the filtrations representing the information available to the players.
In this section, we present two important examples of games of this type. We call
them games of timing. As expected, we concentrate on models for which the players
interact in a mean field fashion. We choose to motivate the mathematical framework
with an application to a very important issue in the stability theory of the financial
system: liquidity and bank runs. The first model below is intended to present
the fundamental economic equilibrium principles which underpin the analysis of
liquidity and bank runs in the financial system. It is static in nature, and as such, it
may not fully capture the mathematical challenges we want the games of timing to
address. Indeed, it reduces the choice of the time of the run on the bank to a binary
decision: whether or not to withdraw a deposit, at a predetermined time. Despite this simplification, the model clearly illustrates the equilibrium mechanism at work.
Our first model is set in a three dates/two periods framework often used in the early economic literature on bank runs. While some of the assumptions are common to many studies, the gory details of the specific model presented below are borrowed from a study of banking liquidity attempting to derive policy implications of the models, highlighting, among other things, the role of the lender of last resort. See the Notes & Complements at the end of the chapter for references.

We first summarize the state of a bank ex ante (at time $t = 0$) by its balance sheet: on the liability side, the bank holds an amount $D_0$ of uninsured deposits, in the form of certificates of deposit (CDs for short), together with equity $E$. These funds are used to finance an investment of size $I$ in risky assets like loans, the remaining funds, say $M$, being held in cash reserves. In particular, $D_0 + E = I + M$. For the sake of convenience, we shall eventually normalize $D_0$ to $1$, that is $D_0 = 1$, in which case $1 + E = I + M$.
The horizon is $t = 2$, at which time the returns on the risky investments are collected, the CDs are repaid, and the stockholders of the bank get the remaining funds whenever there are any left. The returns on the bank's investments are given by the value of a Gaussian random variable $R \sim N(\bar R, \sigma_R^2)$ with mean $\bar R$ and variance $\sigma_R^2$, the value of $R$ being revealed at time $t = 2$. Denoting by $D_1$ the amount owed to the CD holders at time $t = 2$, the bank regulator lets the bank operate based on its solvency threshold $R_S$ and its liquidity ratio $m$:
$$R_S = \frac{D_1 - M}{I} = \frac{D_1 - D_0 - E + I}{I}, \qquad m = \frac{M}{D_1} = \frac{D_0 + E - I}{D_1}.$$
The liquidity ratio $m$ should be thought of as the maximum proportion of CDs which could be redeemed without the bank being forced to seek cash by changing its investment portfolio. We assume that the number $N$ of investors is large and that they have handed their investment decisions over to fund managers.
At time $t = 1$, early withdrawals are possible. At that time, the $N$ investors/fund managers $i \in \{1,\dots,N\}$ have access to a private signal $X_i = R + \epsilon_i$ where the $\epsilon_i$'s are independent identically distributed Gaussian variables with mean $0$ and inverse variance $\beta$, i.e., $\epsilon_i \sim N(0,\beta^{-1})$, independent of $R$. On the basis of the private signal $X_i$, each investor/fund manager makes a binary decision $\alpha^i$ which will be his or her control: to withdraw and collect $D_1/N$ in case $\alpha^i = 1$, or do nothing if $\alpha^i = 0$. We denote by $n$ (resp. $\bar\alpha^N$) the number (resp. proportion) of investors who decide to withdraw their deposits at time $t = 1$. So $n = \sum_{i=1}^N \alpha^i$ and $\bar\alpha^N$ is the mean of the empirical distribution $\bar\mu^N$ of the controls $\alpha^i$. We model a bank run by the refusal of investors to renew their CDs at time $t = 1$. We assume that the investors cannot coordinate their investment strategies: if they could pool their information, they would gain near perfect knowledge of the return $R$, which could then be treated as a deterministic constant instead of a random variable whose outcome is only revealed at time $t = 2$.
We now explain how to compute the volume y of the fire sales. When forced to sell, banks cannot get full price for their assets, they can only get a fraction, say 1/(1 + λ) for some λ > 0, of their value. We assume that the market aggregates efficiently all the private signals, gaining perfect knowledge of the return R. Accordingly, the price at which the loan portfolio can be sold is P = R/(1 + λ). Since the number of withdrawals is n, the volume of fire-sales needed to compensate for the withdrawals is given by:

y = (nD/N − M)^+ / P = (1 + λ)(ᾱ^N D − M)^+ / R,

where we use the notation x^+ to denote the positive part max(0, x) of a real number x ∈ R.
The bank is close to insolvency when the return R is small or when there is a liquidity shortage and λ is large (the interbank markets are not enough to prevent failure). If the bank needs to close at time t = 1, we assume that the liquidation value of its assets is a fraction of R much smaller than 1/(1 + λ), modeling the liquidity premium.
The quantity R_S was defined earlier as the solvency threshold of the bank because if there are no withdrawals at t = 1, namely if ᾱ^N = 0, the bank fails at t = 2 if and only if R < R_S. The threshold R_S is a decreasing function of the solvency ratio E/I.
The second bullet point shows that solvent banks can fail when the number of early withdrawals is too large. Notice however that when the returns are high enough, the bank is supersolvent, by which we mean R > (1 + λ)R_S, in which case it can never fail, even if everybody withdraws at t = 1, i.e., ᾱ^N = 1.
While the ε^i appear as idiosyncratic sources of noise, R is a source of randomness common to all the participants. As a result, we expect the behavior of the investors and the failure/non-failure of the bank to depend upon the outcome of this common noise.
Fig. 1.1 The possible regimes of the bank in the (R, ᾱ) plane: liquidation at t = 1, fire sales at t = 1 followed by failure or no failure at t = 2, and no liquidation at t = 1 followed by failure or no failure at t = 2; the thresholds R_S and (1 + λ)R_S appear on the R-axis, the level M/D on the ᾱ-axis.
For this reason, we express the critical values of R as functions of the natural
parameters of the model. Recall that, from the above discussion, we learned that:
• the bank is closed early if R < R_EC(ᾱ^N) where the function R_EC is defined on [0, 1] by R_EC(ᾱ) = (1 + λ)(ᾱD − M)^+ / I;
• the bank fails if R < R_F(ᾱ^N) where the function R_F is defined on [0, 1] by R_F(ᾱ) = R_S ( 1 + λ(ᾱD − M)^+ / (D − M) ).
Notice that R_EC(ᾱ) < R_F(ᾱ). Our findings are further illustrated in Figure 1.1.
The rewards to the fund managers are as follows:
• each fund manager gets a benefit B > 0 if they get their money back at time t = 2, or if the bank fails and they withdraw their funds at time t = 1, nothing otherwise;
• fund managers' compensations are based on the size of their funds more than on their returns, so each withdrawal at time t = 1 has a reputation cost C > 0.
Recall the information on the basis of which the fund managers make their decisions. At time t = 1, fund manager i observes the private signal X^i = R + ε^i, and decides on the basis of the value of the observation, say X^i = x, whether or not to withdraw. In other words, the strategy of player i is a function x ↦ α^i(x) ∈ {0, 1}, with α^i(x) = 1 if the decision is to withdraw the funds, and α^i(x) = 0 if the decision is to remain invested. Since α^i is a binary function on the range of the signal X^i, it is the indicator function of a set of outcomes of X^i. So given strategies α^1, …, α^N, for each player i ∈ {1, …, N}, one can define the expected reward J^i(α^1, …, α^N) to player i given the rules and scenarios stated and reviewed in the above bullet points and Figure 1.1. We refrain from writing such an expectation in detail, but notice that it should only depend upon α^i and the average ᾱ^N. Rather than risk drowning in the gory details of the search for equilibria with a finite number of investors, we switch to the mean field game formulation obtained in the limit N → ∞.
Finally, notice that because of the monotonicity properties alluded to in the remark below, we can restrict ourselves to strategies writing as indicator functions of intervals of the form (−∞, ℓ^i], so that α^i(x) = 1 if and only if x ≤ ℓ^i.
Remark 1.14 In the case of full information, i.e., when R is common knowledge at time t = 1, everybody runs at time t = 1 on the event {R < R_S}, in which case ᾱ^N = 1, while nobody runs on the bank, i.e., ᾱ^N = 0, on the event {R > (1 + λ)R_S}. These two extreme equilibria co-exist in the intermediate regime {R_S ≤ R ≤ (1 + λ)R_S}. See the Notes & Complements at the end of the chapter and Section (Vol II)-7.2 of Chapter (Vol II)-7 for further discussion and references.
• if ᾱ^N ≤ M/D,
– at t = 1, the runs on the bank can be covered without the need to sell assets;
– at t = 2, failure occurs if and only if R < R_S;
• if M/D < ᾱ^N,
– at t = 1 the bank needs to sell (ᾱ^N D − M)(1 + λ)/R worth of its loan investments in order to return ᾱ^N D − M to the requests for withdrawal; if ᾱ^N D > M + PI, i.e., if ᾱ^N > p_1(R) with p_1(R) = R I/[D(1 + λ)] + M/D, the bank needs to sell more than what it owns, so it is liquidated (early closure);
– at t = 2, failure occurs if and only if RI − (1 + λ)(ᾱ^N D − M) < (1 − ᾱ^N)D, i.e., ᾱ^N > p_2(R) with p_2(R) = (R − R_S)(D − M)/(λ D R_S) + M/D.
Recall that all these scenarios are visualized in Figure 1.1. To summarize:
• the bank fails if the returns are too small, specifically, if R < R_S;
• the bank does not fail if the returns are sufficiently large, specifically, if R > (1 + λ)R_S.
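Before moving on, the reader may want to experiment numerically with these thresholds. The following minimal sketch in Python classifies the fate of the bank in each of the regimes listed above; the balance sheet values E, I, M, D and the fire-sale parameter λ are illustrative assumptions, not values taken from the text.

# Minimal numerical sketch of the failure regions of the static bank-run model.
# All parameter values below (E, I, M, D, lam) are illustrative assumptions.

E, I, M, D = 0.2, 1.0, 0.2, 1.0    # equity, investment, cash reserves, deposits
lam = 0.5                          # fire-sale discount parameter lambda
R_S = (D - M) / I                  # solvency threshold (no early withdrawals)

def p1(R):
    """Proportion of withdrawals above which the bank is liquidated at t = 1."""
    return R * I / (D * (1 + lam)) + M / D

def p2(R):
    """Proportion of withdrawals above which the bank fails at t = 2."""
    return (R - R_S) * (D - M) / (lam * D * R_S) + M / D

def regime(R, alpha_bar):
    """Classify the fate of the bank given a return R and a withdrawal proportion."""
    if alpha_bar <= M / D:
        return "failure at t=2" if R < R_S else "no failure"
    if alpha_bar > p1(R):
        return "liquidation at t=1"
    return "failure at t=2" if alpha_bar > p2(R) else "no failure (fire sales at t=1)"

for R in (0.5, 0.9, 1.5):
    for a in (0.1, 0.5, 0.9):
        print(f"R={R:.2f}, alpha={a:.1f}: {regime(R, a)}")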
We now try to capture the most important stylized facts of the above static model in
a dynamic setting. We simplify the description of the balance sheet of the bank, as
well as the impact of the fire sales, in order to still be able to identify the optimal
timing decisions of the investors who decide to run on the bank.
We assume that the market value of the assets of the bank is given at time t by an Itô process:

Y_t = Y_0 + ∫_0^t b_s ds + ∫_0^t σ_s dW_s^0,

where the value Y_0 > 0 is known to everyone, and in particular to the N depositors. We assume that the assets generate a dividend cash flow at rate r̄ strictly greater than the risk free rate r. These dividends are not reinvested in the sense that their values are not included in Y_t. The depositors are promised the same interest rate r̄ on their deposits. The bank collapses if Y_t reaches 0.
As we did in the treatment of the static model considered above, without any loss
of generality, we normalize the aggregate initial deposits to 1. Moreover, since we
shall eventually cast the problem as a mean field game, we shall require a strong
symmetry in the model, and as a result we shall assume that each initial deposit is
in the amount D_0^i = 1/N. At any given time t, the liquidation value of the assets of the bank is given by L(Y_t) where L : y ↦ L(y) is a deterministic continuously differentiable function satisfying 0 ≤ L(y) ≤ y.
Given that the depositors can withdraw their funds at any time, the bank can tap
a credit line at interest rate r̄ > r to pay off the running depositors. At any given time t, the credit line limit is equal to the liquidation value L(Y_t) of the bank's assets.
The bank is said to be safe if all depositors can be paid in full, even in case of
a run. The bank is said to have liquidity problems if the current market value of its
assets is sufficient to pay depositors, but the liquidation value is not. Finally, it is
said to be insolvent if the current market value of its assets is less than its obligation
to depositors. We shall confirm below that in the case of complete information about
the solvency of the bank, depositors start to run as soon as the bank starts having
liquidity problems, long before the bank is insolvent.
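In code, the three regimes can be told apart as follows; the linear liquidation function L used here is a pure illustration of a function satisfying L(y) ≤ y, and the deposits are normalized to 1 as above.

# Hedged illustration of the three regimes of the dynamic model: with aggregate
# deposits normalized to 1 and an assumed liquidation function L, classify the
# bank at a given asset value y.
def bank_status(y, L=lambda y: 0.6 * y):    # L(y) <= y is an illustrative choice
    if L(y) >= 1.0:
        return "safe"                        # even a full run can be paid off
    if y >= 1.0:
        return "liquidity problems"          # market value suffices, liquidation value does not
    return "insolvent"

for y in (2.0, 1.3, 0.8):
    print(y, bank_status(y))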
The endogenous default is not the only way the bank can default. Indeed there is the possibility of an exogenous default at time t < τ if the mass of running depositors reaches L(Y_t). Let us denote by τ^i the time at which depositor i tries to withdraw his or her deposit and let us denote by μ̄^N the empirical distribution of these times, i.e.,

μ̄^N = (1/N) Σ_{i=1}^N δ_{τ^i}.

Notice that μ̄^N([0, t)) represents the proportion of depositors who tried to withdraw before time t, and that the time of this second type of default is given by:

inf{ t ≥ 0 : μ̄^N([0, t)) ≥ L(Y_t) }.
For the sake of simplicity we assume that once a depositor runs, he or she cannot
get back in the game, in other words, his or her decision is irreversible.
Each depositor monitors the value of the bank's assets through a private signal process X^{i,N} = (X_t^{i,N})_{t≥0}, typically of the form X_t^{i,N} = X_0 + Y_t + σ_X W_t^i, where σ_X > 0 and for i ∈ {1, …, N}, the processes W^i = (W_t^i)_{t≥0} are independent Wiener processes (also independent of W^0) representing idiosyncratic noise terms blurring the observations of the exact value Y_t of the assets of the bank. When F^i = F^{X^{i,N}}, we talk about private monitoring of the asset value of the bank. However,
i;N
for a more realistic form of the model, we shall require that the filtration FX does
not include the information provided by the process .N N .Œ0; t//t>0 which involves
the private signals of the other depositors. This model will be more challenging
mathematically as the individual depositors will have to choose their withdrawal
strategies in a distributed manner, using only the information contained in their
private signals.
In any case, the filtrations F^i will be specified in each particular application. Clearly τ^i should be an F^i-stopping time in order to be admissible. In other words, depositors do not have a crystal ball to decide if and when to run.
Given that all the other players j ≠ i have chosen their times τ^j to try to withdraw their deposits, the payoff P_t^i(τ^i) at time t to depositor i for trying to run on the bank at time t (i.e., for τ^i = t) can be written as:

P_t^i(τ^i) = [ D_0^i ∧ ( L(Y_t) − μ̄^N([0, t)) )^+ ] 1_{[0,τ)}(t),

if L(Y_s) − (1/N) Σ_{j=1}^N 1_{[0,s)}(τ^j) > 0 for all s < t, and P_t^i(τ^i) = 0 otherwise. The problem of depositor i is then to choose for τ^i the F^i-stopping time solving the maximization problem:

J^i(τ^i) = max_{0 ≤ τ^i ≤ τ} E[ e^{(r̄−r)τ^i} P^i_{τ^i}(τ^i) ].

If we let τ̂ = inf{ t ≥ 0 : L(Y_t) ≤ (N − 1)/N }, then a Nash equilibrium is obtained when all the depositors decide to run at time τ̂.
So a bank run occurs as soon as the bank has liquidity problems, even if this is long
before it is insolvent. Notice also that according to this proposition, all the depositors
experience full recovery of their deposits, which is in flagrant contrast with typical
bank runs in which depositors usually experience significant losses.
Proof. We argue that we have indeed identified a Nash equilibrium. If all the other depositors but i choose the strategy given by the running time τ̂, we show that player i cannot do better than choosing to also run at time τ̂. If L(Y_0) < (N − 1)/N, all the other depositors run immediately, and his or her only hope to get something out of his or her deposit is to run at time 0 as well. Similarly, if L(Y_0) = (N − 1)/N and all the other depositors run, depositor i needs to run at that time as well. Now if L(Y_0) > (N − 1)/N, no depositor has a reason to run while L(Y_t) > (N − 1)/N since, by not running for a small time interval while L(Y_t) is still strictly greater than (N − 1)/N, he or she can earn the superior interest r̄ > r without facing any risk. This proves that every depositor using τ̂ as time to run is a Nash equilibrium. □
In the mean field limit, the empirical distribution μ̄^N of the withdrawal times is replaced by a candidate measure μ, and the payoff for trying to run at time t takes the same form as above as long as L(Y_s) − μ([0, s)) is positive for all s < t; otherwise the payoff is null. Then, the optimal time for a representative depositor to claim his or her deposit back is given by the stopping time (for his or her own information filtration) solving the optimal stopping problem:

τ̂ = arg max_{0 ≤ τ} E[ e^{(r̄−r)τ} P_τ(τ, Y_τ) ],

where the argument is required to be a stopping time with respect to the filtration describing the available information to the typical player and where for the sake of definiteness we choose τ̂ to be the smallest of the optimal stopping times when uniqueness of the maximizer does not hold. As explained before, the representative player bases his/her own decision on the observation of a private signal of the form:

X_t = X_0 + Y_t + σ_X W_t,   t ≥ 0,

from which one can compute the conditional law L(τ̂ | Y, μ). Hence this creates a map μ ↦ L(τ̂ | Y, μ), and the final step of the mean field game approach is to find a fixed point for this map.
We shall continue the discussion of this model in Section (Vol II)-7.2 of Chapter
(Vol II)-7.
1.3 Financial Applications

This example will stay with us throughout. We solve the finite player game explicitly in Chapter 2, both for open loop and Markovian closed loop equilibria, see Chapter 2 for
precise definitions. In the following Chapter 3, we identify the limits as N ! 1 of
the solutions to the finite player games and solve the corresponding limiting problem
as a mean field game. Finally, in Chapter (Vol II)-4 we revisit this example one more
time to check the so-called master equation by explicit computations.
We describe the model as a network of N banks and we denote by X_t^i the logarithm of the cash reserves of bank i ∈ {1, …, N} at time t. The following simple model for borrowing and lending between banks through the drifts of their log-cash reserves, while unrealistic, will serve perfectly our pedagogical objectives. For independent Wiener processes W^i = (W_t^i)_{0≤t≤T} for i = 0, 1, …, N and a positive constant σ > 0 we assume that:

dX_t^i = (a/N) Σ_{j=1}^N (X_t^j − X_t^i) dt + σ dB_t^i,
where:

dB_t^i = √(1 − ρ²) dW_t^i + ρ dW_t^0,

for some ρ ∈ [−1, 1]. In other words, we assume that the log-cash reserves are Ornstein-Uhlenbeck (OU) processes reverting to their sample mean X̄_t at a rate a > 0. This sample mean represents the interaction between the various banks. We also consider a negative constant D < 0 which represents a critical liability threshold under which a bank is considered in a state of default.
A remarkable feature of this model is the presence of the Wiener process W^0 in the dynamics of all the log-cash reserve processes X^i. While these state processes are usually correlated through their empirical distribution, when ρ ≠ 0, the presence of this common noise W^0 creates an extra source of dependence which makes the solution of mean field games much more challenging. Models with a common noise will only be studied in the second volume because of their high level of technicality. However, we shall see that the present model can be solved explicitly whether or not ρ = 0!
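The following Euler-scheme simulation of the uncontrolled dynamics may help visualize the effect of the common noise; all parameter values are illustrative assumptions.

import numpy as np

# Minimal Euler-scheme simulation of the interbank model: N coupled OU
# log-cash-reserve processes reverting to their sample mean, driven by
# idiosyncratic noises W^i and a common noise W^0. All parameter values
# are illustrative assumptions.

rng = np.random.default_rng(0)
N, T, n_steps = 100, 1.0, 1000
dt = T / n_steps
a, sigma, rho = 10.0, 1.0, 0.5       # mean-reversion rate, volatility, correlation
D = -0.7                             # critical default threshold (D < 0)

X = np.zeros(N)                      # log-cash reserves, started at 0
defaulted = np.zeros(N, dtype=bool)
for _ in range(n_steps):
    dW0 = rng.normal(0.0, np.sqrt(dt))            # common shock
    dW = rng.normal(0.0, np.sqrt(dt), size=N)     # idiosyncratic shocks
    dB = np.sqrt(1 - rho**2) * dW + rho * dW0
    X = X + a * (X.mean() - X) * dt + sigma * dB
    defaulted |= (X < D)             # a bank is in default once X dips below D

print("proportion of banks having hit the default level:", defaulted.mean())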
The model becomes a game when the banks are allowed to control their rates of borrowing and lending:

dX_t^i = [ a(X̄_t − X_t^i) + α_t^i ] dt + σ dB_t^i,

where α^i is understood as the control of bank i, say the amount of lending and borrowing outside of the N bank network (e.g., issuing debt, borrowing at the Fed window, etc). In this modified model, firm i tries to minimize:

J^i(α^1, …, α^N) = E[ ∫_0^T ( (1/2)|α_t^i|² − q α_t^i (X̄_t − X_t^i) + (ε/2)(X̄_t − X_t^i)² ) dt + (c/2)(X̄_T − X_T^i)² ],

for some positive constants ε and c which balance the individual costs of borrowing and lending with the average behavior of the other banks in the network. The parameter q > 0 weighs the contributions of the relative sizes of these components imposing the choice of the sign of α_t^i and the decision whether to borrow or lend.
The choice of q is likely to be the regulator's prerogative. Notice that:
• If X_t^i is small relative to the empirical mean X̄_t, bank i will want to borrow and choose α_t^i > 0;
• If X_t^i is large, then bank i will want to lend and set α_t^i < 0.
We shall also impose the condition:

q² ≤ ε  (1.17)

to guarantee the convexity of the running cost functions. In this way, the problem is an instance of a linear-quadratic game (LQ for short). We shall solve explicitly several forms of this model in Chapter 2.
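As a simple numerical companion, one can estimate the cost J^1 by Monte Carlo for an arbitrary linear feedback control; the gain phi below is a free illustrative choice and is not the equilibrium feedback computed in Chapter 2.

import numpy as np

# Monte Carlo evaluation of the cost above for the illustrative linear feedback
# alpha_t^i = phi * (Xbar_t - X_t^i). This is only a way to explore the model
# numerically, not a solution of the game.

rng = np.random.default_rng(3)
N, T, n_steps, n_paths = 20, 1.0, 200, 400
dt = T / n_steps
a, sigma, rho = 1.0, 0.5, 0.2
q, eps, c = 0.5, 1.0, 1.0            # chosen so that q**2 <= eps, cf. (1.17)
phi = 0.8                            # illustrative feedback gain

cost = np.zeros(n_paths)
for p in range(n_paths):
    X = np.zeros(N)
    J = 0.0
    for _ in range(n_steps):
        dW0 = rng.normal(0, np.sqrt(dt))
        dW = rng.normal(0, np.sqrt(dt), N)
        dev = X.mean() - X                       # Xbar_t - X_t^i
        alpha = phi * dev
        J += (0.5 * alpha[0]**2 - q * alpha[0] * dev[0] + 0.5 * eps * dev[0]**2) * dt
        X = X + (a * dev + alpha) * dt + sigma * (np.sqrt(1 - rho**2) * dW + rho * dW0)
    cost[p] = J + 0.5 * c * (X.mean() - X[0])**2
print("estimated J^1:", cost.mean())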
Remark 1.19 We used the notation x̄ = (x¹ + ⋯ + x^N)/N for the empirical mean of all the x^j's. So in the above specification of the model, each bank interacts with the empirical mean of the states of all the banks, including its own state. As we often do when we discuss games with mean field interactions (see for example Section 2.3 of Chapter 2), it may be more natural to formalize the interactions as taking place between the state of a given bank and the empirical mean of the states of the other banks. Doing so would require to replace (X̄_t − X_t^i) by:

(1/(N−1)) Σ_{1≤j≠i≤N} (X_t^j − X_t^i) = (1/(N−1)) Σ_{1≤j≤N} (X_t^j − X_t^i) = (N/(N−1)) (X̄_t − X_t^i).
This shows that the model would be exactly the same as long as we multiply the constants a and q by (N − 1)/N and the constants ε and c by (N − 1)²/N². So for N fixed, the qualitative properties of the models should be the same, and in any case, the quantitative differences should disappear in the limit N → ∞.
The inventories of the traders evolve according to equations of the form:

dX_t^i = α_t^i dt + σ^i dW_t^i,   i = 1, …, N,

where the W^i = (W_t^i)_{t≥0} are independent standard Wiener processes and the volatilities σ^i > 0 are assumed to be constant for the sake of simplicity, and all equal to the same positive number σ > 0 for symmetry reasons. All the agents trade the same stock whose mid-price at time t is denoted by S_t. The amount of cash held by trader i at time t is denoted by K_t^i. For the changes over time of the mid-price, we use:

dS_t = (1/N) Σ_{i=1}^N h(α_t^i) dt + σ_0 dW_t^0,   t ∈ [0, T].
Here we assume that α ↦ h(α) is a deterministic function known to everyone, σ_0 > 0 is a constant, and W^0 = (W_t^0)_{t≥0} is a standard Wiener process independent of the family (W^i)_{1≤i≤N}. This is a particular case of the classical model of Almgren and Chriss for permanent price impact. Since the drift of the mid-price is the integral of the function h with respect to the empirical measure μ̄^N_{α_t} of the controls α_t^i, we see that in this model, each participant interacts with the empirical distribution of the controls of the other participants. In order to avoid unruly notation, we shall denote by ⟨h, μ̄^N_{α_t}⟩ this integral.
Note that S_t follows an arithmetic Brownian motion with a drift which depends on the accumulated impacts of previous trades. The function h is sometimes called the instantaneous market impact function. Since a buy is expected to increase the price of the stock and a sell will tend to decrease the stock price, the function h should satisfy h(α)α ≥ 0. Linear, power-law and logarithmic functions are often used in practice for that reason. Price impact models are most of the time used in optimal execution problems for high frequency trading. Therefore the fact that (S_t)_{0≤t≤T} can become negative is not a real issue in practice since the model is most often used on a time scale too short for the mid-price to become negative.
When the controls are square integrable and the instantaneous market impact function h has at most linear growth, the processes X_t^i and S_t are square integrable and the stochastic integrals in the wealth dynamics are martingales. We assume that the traders are subject to a running liquidation constraint modeled by a function c_X of the shares they hold, and to a terminal liquidation constraint at maturity T represented by a scalar function g. Usually c_X and g are convex nonnegative quadratic functions in order to penalize unwanted inventories. If as usual we denote by J^i the expected costs of trader i as a function of the controls of all the traders, we have:

J^i(α^1, …, α^N) = E[ ∫_0^T c_X(X_t^i) dt + g(X_T^i) − V_T^i ],  (1.19)
where V_T^i stands for the terminal wealth of trader i and where, as before, we use the notation μ̄^N_{α_t} for the empirical distribution of the N components of α_t, which is a probability measure on the space A in which the controls take their values. The running cost function f is defined accordingly.
Remark 1.20 It is important to emphasize the crucial role played by the innocent
looking assumption that the traders are risk neutral as they choose to minimize the
expectation of their cost (as opposed to a nonlinear function of the costs). Indeed,
the quadratic variation terms disappear in the computation of the expected cost,
and so doing, the common noise W 0 as well as the mid-price S disappear from the
expression of the individual expected costs. The game would be much more difficult to solve if they did not.
Since the common noise disappeared from the model, if we restrict the rates of trading to functions of the inventories, or in the case of open loop models, if we assume that they are adapted to the filtrations generated by the W^i and are independent of the common noise, the independence of the random shocks dW_t^i suggests that in the limit N → ∞ the empirical measures μ̄^N_{α_t} converge, provided that the rates are sufficiently symmetric, toward a deterministic measure ν_t. The mean field game strategy is then to solve the resulting optimal control problem for a given Wiener process W and then, try to find a flow of measures ν = (ν_t)_{t≥0} so that ν_t = L(α̂_t) where α̂ = (α̂_t)_{0≤t≤T} is an optimal control for the above problem.
As a general rule, this form of mean field game is more difficult to solve than
a standard MFG problem where the interaction between the players is through the
empirical distribution of their states. As already explained at the beginning of this
subsection, we shall call these models extended mean field games. We study them
in Section 4.6 of Chapter 4, and we solve this particular model of price impact in
Subsection 4.7.1.
Background
As for many models in the economic literature, the problem was set for an infinite time horizon (T = ∞) with a positive discount rate r > 0, but to be consistent with the rest of the text, we shall frame it with a finite horizon T. In this model, the private states of the individual agents are not subject to idiosyncratic shocks. They react to a common source of noise given by a one-dimensional Wiener process W^0 = (W_t^0)_{0≤t≤T}. We denote by F^0 = (F_t^0)_{0≤t≤T} its filtration. We also assume that
the volatilities of these states are linear as given by a function x ↦ σx for some positive constant σ, and that each player controls the drift of his or her own state so that the dynamics of the state of player i read:

dX_t^i = α_t^i dt + σ X_t^i dW_t^0,   t ≥ 0.
We shall restrict ourselves to Markovian controls of the form α_t^i = α(t, X_t^i) for a deterministic function (t, x) ↦ α(t, x), which will be assumed to be nonnegative and Lipschitz in the variable x. See Chapter 2 for a detailed discussion of the use of these special controls. Under these conditions, for any player, say player i, X_t^i ≥ 0 at all times t ≥ 0 if X_0^i ≥ 0, and for any two players, say players i and j, the homeomorphism property of Lipschitz stochastic differential equations (SDE for short) implies that X_t^i ≤ X_t^j at all times t ≥ 0 if X_0^i ≤ X_0^j.
For later purposes we notice that when the Markovian control is of the form:

α(t, x) = γ_t x,  (1.24)

then,

X_t^j = X_t^i + (X_0^j − X_0^i) exp( ∫_0^t γ_s ds − (σ²/2) t + σ W_t^0 ).  (1.25)

Recall that:

μ^{(q)}(dx) = k (q^k / x^{k+1}) 1_{[q,∞)}(x) dx  (1.26)

is the Pareto distribution with exponent k and left endpoint q. Conditioned on the history of the common noise, the distribution of the states of the players remains Pareto with parameter k if it starts that way, and the left most point of the support of the distribution, say q_t, can be understood as a sufficient statistic characterizing the distribution μ_t. This remark is an immediate consequence of formula (1.25) applied to X_t^i = q_t, in which case q_0 = 1, and X_t^j = X_t, implying that X_t = X_0 q_t. So if X_0 ∼ μ^{(1)}, then μ_t = μ^{(q_t)}. This simple remark provides an explicit formula for the time evolution of the (conditional) marginal distributions of the states. As we shall see, this time evolution is generally difficult to come by, and requires the solution of a forward Partial Differential Equation (PDE for short) known as the forward Kolmogorov equation or forward Fokker-Planck equation, which in the particular case at hand is solved explicitly by the flow of Pareto distributions (μ^{(q_t)})_{t≥0}.
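The propagation of the Pareto family can be checked by simulation; in the sketch below, γ_s is frozen to a constant and all parameter values are illustrative assumptions.

import numpy as np

# Monte Carlo sanity check of the propagation-of-Pareto remark: with the linear
# Markovian control alpha(t, x) = gamma_t * x of (1.24), formula (1.25) gives
# X_t = X_0 * q_t conditionally on the common noise, so a Pareto(k, 1) initial
# distribution is carried to Pareto(k, q_t).

rng = np.random.default_rng(1)
k, sigma, t = 1.5, 0.3, 1.0
gamma = 0.2                                   # constant gamma_s = gamma for simplicity
X0 = rng.pareto(k, size=200_000) + 1.0        # numpy's pareto is shifted; add 1 => support [1, inf)
W0_t = np.sqrt(t) * rng.standard_normal()     # one sample of the common noise at time t
q_t = np.exp(gamma * t - 0.5 * sigma**2 * t + sigma * W0_t)
X_t = X0 * q_t                                # formula (1.25) applied from q_0 = 1

# The conditional law of X_t given W^0 should be Pareto(k, q_t):
# P[X_t > x] = (q_t / x)**k for x >= q_t.
x = 2.0 * q_t
print("empirical tail:", (X_t > x).mean(), " theoretical:", (q_t / x)**k)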
Optimization Problems
We now introduce the reward functions of the individual agents and define their
optimization problems. The technicalities required to describe the interactions
between the agents at a rigorous mathematical level are a hindrance to the intuitive
understanding of the nature of these interactions. Indeed, the reward functions would
have to be defined in such a way as to accommodate empirical distributions, despite the fact that the latter do not have densities with respect to the Lebesgue measure.
Overcoming this technical difficulty would force us to jump through hoops, which
we consider as an unnecessary distraction at this stage of our introduction to mean
field games. For the time being, we define the running reward function f by:
f(x, μ, α) = (x^a E) / [ (dμ/dx)(x) ]^b − c α^p / ( p [ μ([x, ∞)) ]^b ),
for x; ˛ > 0 and 2 P.RC / and for some positive constants a, b, c, E and p > 1
whose economic meanings are discussed in the references provided in the Notes &
Complements at the end of the chapter. We use the convention that the density is
the density of the absolutely continuous part of the Lebesgue’s decomposition of
the measure , and that in the above sum, the first term is set to 0 when this density
is not defined or is itself 0. Similarly, the second term is set to 0 when does not
charge the interval Œx; 1/.
Solutions of the MFG problem as formulated in this section will be given in
Section 4.5.2 of Chapter (Vol II)-4.
One major difference with the growth model discussed in the previous subsection is the fact that, on top of the common noise affecting all the states, we also consider idiosyncratic random shocks specific to each individual agent in the economy. While the random shocks are assumed to be independent and identically distributed with a common distribution in Aiyagari's model, for the sake of definiteness, we first discuss the approach of Krusell and Smith in which the shocks are kept discrete and finite in nature for the purpose of numerical implementation. In the next subsection, we change the nature of the random shocks by introducing Wiener processes to recast the model in the framework of stochastic differential games.
Remark 1.21 In the absence of the common noise z, the model was originally
proposed by Aiyagari. It consists of a more traditional game with independent
idiosyncratic noise terms given by the individual changes in employment.
The per-capita output is given by a Cobb-Douglas production function:

Y_t = z_t K_t^α (ℓ̄ L_t)^{1−α},

where K_t and L_t stand for per-capita capital and employment rates respectively, and z_t is the common noise representing the aggregate productivity shocks. Here the constant ℓ̄ can be interpreted as the number of units of labor produced by an employed individual. In such a model, two quantities play an important role: the capital rent r_t and the wage rate w_t. In equilibrium, these marginal rates are defined as the partial derivatives of the per-capita output Y_t with respect to capital and employment rate respectively. So,

r_t = r(K_t, L_t, z_t) = α z_t ( K_t / (ℓ̄ L_t) )^{α−1},  (1.28)

and

w_t = w(K_t, L_t, z_t) = ℓ̄ (1 − α) z_t ( K_t / (ℓ̄ L_t) )^α.  (1.29)
for some utility function U and scrap function Ũ. The discount factor β ∈ (0, 1] does not play any significant role in the finite horizon version of the model so we take it equal to 1 in this case. However, it is of crucial importance in the infinite horizon version for which the optimization is to maximize:

E[ ∫_0^∞ β^t U(c_t^i) dt ],

in which case β ∈ (0, 1). To conform with most of the economic literature on the subject, we use the power utility function:

U(c) = (c^{1−γ} − 1) / (1 − γ),  (1.30)

also known as the CRRA (short for Constant Relative Risk Aversion) utility function, and its limit as γ → 1 given by the logarithmic utility function.
The constraints on the individual consumption choices are given by the values of the individual capitals k_t^i at time t which need to remain nonnegative at all times. The changes in capital over time are given by the equation:

dk_t^i = [ (r_t − δ) k_t^i + ( (1 − τ_t) ℓ̄ ε_t^i + δ̄ (1 − ε_t^i) ) w_t ] dt − c_t^i dt.  (1.31)

Here, the constant δ > 0 represents a depreciation rate, and ε_t^i ∈ {0, 1} indicates whether or not consumer i is employed at time t. The second term in the above right-hand side represents the wages earned by the consumer. It is equal to δ̄ w_t when the consumer is unemployed, a quantity which should be understood as an unemployment benefit rate. On the other hand, it is equal to (1 − τ_t) ℓ̄ w_t, after adjustment for taxes, when he or she is employed. Here,

τ_t = δ̄ u_t / (ℓ̄ L_t),

where u_t = 1 − L_t denotes the rate of unemployment, so that the tax rate τ_t balances the unemployment benefits paid at time t.
The above form (1.31) of the dynamics of the state variables k_t^i is rather deceiving as it partially hides the coupling between the equations. The main source of coupling comes from the quantities r_t and w_t which not only depend upon the common noise z_t, but also depend upon the aggregate capital K_t which is in some sense the average of the individual capitals (k_t^i)_{1≤i≤N}! This average can be expressed through the empirical measures:

μ̄^N_{k_t} = (1/N) Σ_{i=1}^N δ_{k_t^i},   and   μ̄^{i,N}_{k_t} = (1/(N−1)) Σ_{j=1, j≠i}^N δ_{k_t^j},   i = 1, …, N.
Here, (z_t, ε_t)_{0≤t≤T} is a continuous time Markov chain with the same law as any of the (z_t, ε_t^i)_{0≤t≤T} introduced earlier, the rental rate function r and the wage level function w are as in (1.28) and (1.29), and K_t = k̄_t^z is the mean of the conditional measure μ_t^z, namely:

K_t = ∫_{[0,∞)} k μ_t^z(dk),

and where L_t is as above. This is the source of the mean field interaction in the model.
In this form of the model, the private state at time t of agent i is a two-dimensional vector X_t^i = (Z_t^i, A_t^i). As before, the agents i ∈ {1, …, N} can be viewed as the workers comprising the economy: Z_t^i gives the labor productivity of worker i, and A_t^i his or her wealth at time t. The time evolutions of the states are given by stochastic differential equations:

dZ_t^i = μ_Z(Z_t^i) dt + σ_Z(Z_t^i) dW_t^i,
dA_t^i = [ w_t^i Z_t^i + r_t A_t^i − c_t^i ] dt,  (1.32)
In this model, given processes r = (r_t)_{t≥0} and w^i = (w_t^i)_{t≥0} for i = 1, …, N, each worker tries to maximize:

J^i(c^1, …, c^N) = E[ ∫_0^∞ e^{−ρt} U(c_t^i) dt ],  (1.33)
while the total amount of labor Lt supplied in the economy at time t is normalized
to 1. In a competitive equilibrium the interest rate and the wages are given by the
partial derivatives of the production function:
r_t = [ ∂_K F(K_t, L_t) ]|_{L_t=1} − δ,
w_t = [ ∂_L F(K_t, L_t) ]|_{L_t=1},

where the aggregate capital K_t is computed from the empirical distribution of the individual states:

μ̄^N_{X_t} = (1/N) Σ_{i=1}^N δ_{X_t^i}.
Practical Application
We shall revisit this diffusion model several times in the sequel. It will be one of the
testbeds we use to apply the tools developed in the book. We first solve the model in
Subsection 3.6.3 of Chapter 3 as an example of Mean Field Game (MFG) without
common noise. We revisit this solution in Subsection 6.7.4 of Chapter 6 in light of
our analysis of the optimal control of McKean-Vlasov dynamics. In order to check
the assumptions under which our results are proven, we use a specific model for
the mean reverting labor productivity process Z D .Zt /t>0 . We choose an Ornstein-
Uhlenbeck process for the sake of definiteness. As already mentioned, we use the
CRRA isoelastic utility function with constant relative risk aversion γ:

U(c) = (c^{1−γ} − 1) / (1 − γ),  (1.35)

and the Cobb-Douglas production function:

F(K, L) = ā K^α L^{1−α},

for some constants ā > 0 and α ∈ (0, 1). With this choice, in equilibrium, we have:

r_t = α ā / K_t^{1−α} − δ,   and   w_t = (1 − α) ā K_t^α.  (1.37)
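As a quick sanity check of (1.37), one may evaluate the two marginal products numerically; the values of ā, α, δ and K_t below are illustrative assumptions.

# Quick check of (1.37): marginal products of the Cobb-Douglas technology
# F(K, L) = a_bar * K**alpha * L**(1 - alpha) at L = 1, with illustrative values.
a_bar, alpha, delta, K = 1.0, 0.36, 0.05, 4.0
r = alpha * a_bar / K**(1 - alpha) - delta     # r_t = alpha*a_bar*K^(alpha-1) - delta
w = (1 - alpha) * a_bar * K**alpha             # w_t = (1-alpha)*a_bar*K^alpha
print(f"r = {r:.4f}, w = {w:.4f}")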
In the first part of this section, we present a macro-economic model for exhaustible resources. We consider N oil producers in a competitive economy. We denote by x_0^1, …, x_0^N the initial oil reserves of the N producers. Each producer tries to control his or her own rate of production so that, if we denote by X_t^i the oil reserves of producer i at time t, the changes in reserves are given by equations of the form:

dX_t^i = −α_t^i dt + σ X_t^i dW_t^i,   t ≥ 0,
where σ > 0 is a volatility level common to all the producers, the nonnegative adapted and square integrable processes (α^i = (α_t^i)_{t≥0})_{i=1,…,N} are the controls exerted by the producers, and the (W^i = (W_t^i)_{t≥0})_{i=1,…,N} are independent standard Wiener processes. The interpretation of X_t^i as an oil reserve requires that it remains nonnegative. However, we shall not say that much on this sign constraint for the purpose of the present discussion. If we denote by P_t the price of one barrel of oil at time t, and if we denote by C(α) = (b/2)α² + aα the cost of producing α barrels of oil, then producer i tries to maximize:

J^i(α^1, …, α^N) = sup_{α: α_t ≥ 0, X_t ≥ 0} E[ ∫_0^∞ ( α_t^i P_t − C(α_t^i) ) e^{−rt} dt ].  (1.39)
The price Pt is the source of coupling between the producer strategies. The model
is set up over an infinite horizon with a discount factor r > 0. In this competitive
economy model, the price is given by a simple equilibrium argument forcing supply
to match demand. The demand at time t, when the price is p, is given as D.t; p/ by a
demand function D.
In the mean field limit, we fix a deterministic flow μ = (μ_t)_{t≥0} of probability measures, interpreting μ_t as the distribution of the oil reserves of a generic producer at time t. Notice that the knowledge of (μ_t)_{t≥0} determines the price (P_t)_{t≥0} since:

P_t = p(t, μ) = D^{−1}( t, −(d/dt) ∫ x μ_t(dx) ),   t ≥ 0,  (1.40)
which is a mere statement that the quantity produced is given by the negative of the change in reserves. The optimization problem which needs to be solved to determine the best response to this flow of distributions is based on the running reward function of a representative producer. It is given by:

f(t, x, μ, α) = α p(t, μ) − C(α),

where the quantity p(t, μ) is given by (1.40). Accordingly, the value function of the representative producer is defined by:

u^μ(t, x) = sup_{(α_s)_{s≥t}: α_s ≥ 0, X_s ≥ 0} E[ ∫_t^∞ ( α_s P_s − C(α_s) ) e^{−r(s−t)} ds ].
As usual, once the best response is found by solving this optimization problem for a given initial condition x_0 at time 0, the fixed point argument amounts to finding a measure flow μ = (μ_t)_{t≥0} so that the marginal distributions of the optimal paths (X_t)_{t≥0} are exactly the distributions (μ_t)_{t≥0} we started from. In particular, μ_0 must be equal to δ_{x_0}. This model was originally treated in a Partial Differential
Equation (PDE for short) formalism which we shall call the analytic approach in
this book: the optimization problem is solved by computing the value function
u of the problem as the solution of the corresponding Hamilton-Jacobi-Bellman
(HJB for short) equation, and the fixed point property is enforced by requiring
.t /t>0 to satisfy a forward Kolmogorov (or Fokker-Planck) PDE to guarantee that
.t /t>0 is indeed the distribution of the optimal paths. Kolmogorov’s equation is
linear and forward in time while the HJB equation is nonlinear and backward in
time. These two equations are highly coupled and, more than the nonlinearities, the opposite directions of time are the major source of difficulty in solving these
equations. The authors did not solve them analytically. Numerical approximations
were provided as illustration. As we shall see, the occurrence of a forward-backward
system is a characteristic feature of the analysis of mean field games. It will be a
mainstay of our probabilistic approach as we start emphasizing in Chapter 3.
Remark 1.23 The reader will notice that the dependence of P_t upon μ in (1.40) is not of the form P_t = p(t, μ_t) but of the more complicated form P_t = p(t, μ) where p(t, μ) is a functional of the path μ = (μ_s)_{s≥0} (at least when regarded in a neighborhood of t).
In order to fit the formulation we have used so far, we may notice by taking the expectation in (1.41) that:

(d/dt) E[X_t] = −E[α_t],   t ≥ 0,

which shows that:

(d/dt) ∫ x μ_t(dx) = −E[α_t],   t ≥ 0,

so that P_t = D^{−1}(t, E[α_t]): the price appears as a function of the statistical distribution of the control α_t, in the spirit of the extended mean field games already encountered.
Remark 1.24 It is not too much of a stretch to imagine that the above mean field
formulation can be tweaked to include terms in the maximization which incentivize
producers to avoid being the last to produce, the effects of externalities, the impact
of new entrants producing from alternative energy sources, …
with absorption at 0 to guarantee that the reserves of a generic oil producer do not become negative, and to assume that, as in most models for Cournot games, the price P_t^i experienced by each producer is given by a linear inverse demand function of the rates of production of all the other players, in the form:

P_t^i = 1 − α_t^i − (1/(N−1)) Σ_{1≤j≠i≤N} α_t^j,
Biologists, social scientists, and engineers have a keen interest in understanding the
behavior of schools of fish, flocks of birds, animal herds, and human crowds. Under-
standing collective behavior resulting from the aggregation of individual decisions
and actions is a time honored intellectual challenge which has only received partial
answers. In this section, we introduce a small sample of mathematical models which
have been proposed for the analysis and simulation of the behavior of large crowds
from individual rules of conduct.
In the original model, the positions and velocities of the N birds evolve according to the ordinary differential equations:

dx_t^i/dt = v_t^i,
dv_t^i/dt = Σ_{j=1}^N w_{i,j}(t) (v_t^j − v_t^i),

where the weights are given by:

w_{i,j}(t) = w(|x_t^i − x_t^j|) = κ̃ / (1 + |x_t^i − x_t^j|²)^β,  (1.44)

for some constants κ̃ > 0 and β > 0. The first equation is a mere consistency
condition since it states that the velocity is the time derivative of the position.
The second equation says that the changes in the velocity of a bird are given by
a weighted average of the differences between its velocity and the velocities of the
other birds in the flock, a form of mean reversion toward the mean velocity. Given
the form of the weights posited in (1.44), the further apart are the birds, the smaller
the weights. Notice that the exact value of the mean reversion constant κ̃ does not play much of a role when N is fixed. However, when the size of the flock increases, namely when N grows, it is natural to expect that κ̃ should be of order 1/N for the system to remain stable. More on that later on.
The fundamental result of Cucker and Smale's original mathematical analysis is that if N is fixed and 0 ≤ β < 1/2, then:

lim_{t→∞} v_t^i = v̄_0,   for i = 1, …, N,
sup_{t≥0} max_{i,j=1,…,N} |x_t^i − x_t^j| < ∞,  (1.45)

irrespective of the initial configuration. The first statement says that for large times, all the birds in the flock eventually align their velocities, while the second implies that the birds remain bunched, hence the relevance of the analysis of this system of ODEs to the biological theory of flocking phenomena. When β ≥ 1/2, flocking can still occur depending upon the initial configuration.
Since the publication of the original paper of Cucker and Smale, many extensions
and refinements appeared, including a treatment of the case ˇ D 1=2. See the Notes
& Complements at the end of the Chapter for details and references.
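The flocking behavior (1.45) is easy to observe numerically. The short simulation below implements the Cucker-Smale dynamics with the weights (1.44) by forward Euler; the values of κ̃ and β are illustrative, with κ̃ of order 1/N as suggested above.

import numpy as np

# Minimal forward-Euler simulation of the deterministic Cucker-Smale dynamics:
# each bird's velocity relaxes toward a weighted average of the others'.
# Parameters kappa_tilde and beta are illustrative assumptions.

rng = np.random.default_rng(2)
N, dim, dt, n_steps = 50, 3, 0.01, 5000
kappa_tilde, beta = 1.0 / 50, 0.3          # kappa_tilde ~ 1/N keeps the system stable

x = rng.normal(0, 5.0, (N, dim))           # positions
v = rng.normal(0, 1.0, (N, dim))           # velocities

for _ in range(n_steps):
    diff = x[None, :, :] - x[:, None, :]                     # x^j - x^i
    w = kappa_tilde / (1 + (diff**2).sum(-1))**beta          # weights w_{i,j}
    dv = (w[:, :, None] * (v[None, :, :] - v[:, None, :])).sum(axis=1)
    x = x + v * dt
    v = v + dv * dt

# With beta < 1/2 the velocities should align (flocking):
print("velocity spread:", np.abs(v - v.mean(axis=0)).max())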
The extension which teased our curiosity was proposed by Nourian, Caines, and Malhamé in the form of an equilibrium problem. Instead of positing a phenomenological description of the behavior of the birds in the flock, the idea is to let the birds decide on the macroscopic behavior of the flock by making rational decisions at the microscopic level. By rational decision, we mean resulting from a careful risk-reward optimization. So in this new formulation, the behavior of the flock of N birds will still be captured by their individual states X_t^i = [x_t^i, v_t^i] which have the same meanings as before, but whose dynamics are now given by Stochastic Differential Equations (SDEs):

dx_t^i = v_t^i dt,
dv_t^i = α_t^i dt + σ dW_t^i,   t ≥ 0.

While the first equation has the same obvious interpretation, the second just says that except for random shocks given by the increments σ dW_t^i of a Wiener process proper to the specific bird, each bird can control the changes in its velocity through the term α_t^i dt. However, this control comes at a cost which each bird will try to minimize. To be specific, for a given strategy profile α = (α^1, …, α^N) giving the control choices of all the birds over time, the cost to bird i is given by:
J^i(α) = lim_{T→∞} (1/T) ∫_0^T { (1/2)|α_t^i|² + (1/2) | Σ_{j=1}^N w_{i,j}(t) [v_t^j − v_t^i] |² } dt.  (1.46)
The special form of these cost functionals is very intuitive and can be justified in
the following way: by trying to minimize this cost, each bird tries to save energy
(minimization of the contribution from the first term) in order to be able to go far,
and tries to align its velocity with those close to him in order to remain in the pack
and avoid becoming an easy prey to aggressive predators. While introducing the
infinite horizon model stated in (1.46), the authors noticed that in the case β = 0, the nonlinear weights w_{i,j}(t) are independent of i, j, and t, and the model reduces to a Linear Quadratic (LQ) game which can be solved. They also suggest to approach the case β > 0 by perturbation techniques for β ≪ 1 (i.e., β small), but fall short of the derivation of asymptotic expansions which could be used to analyze the qualitative properties of the model.
For the purpose of illustration, we recast their model in the finite horizon set-up, even though this will presumably prevent us from recovering the conclusions (1.45) of the deterministic model which account for large time properties of the dynamical system. Our reason to work on a finite time interval is to conform with the notation and the analyses which the reader will find throughout the book. So the dynamics of the velocity become:

dv_t^i = α_t^i dt + σ dW_t^i,   0 ≤ t ≤ T,

and the cost to bird i becomes:

J^i(α) = E[ ∫_0^T f(t, X_t^i, μ̄_t^N, α_t^i) dt ],  (1.47)

with:

f(t, X, μ, α) = (1/2)|α|² + (κ²/2) | ∫_{R^6} [ (v′ − v) / (1 + |x − x′|²)^β ] μ(dx′, dv′) |².  (1.48)
Remark 1.25 We explain how the constant κ should relate to the constant κ̃ introduced earlier in (1.44). If we want the probability measure μ̄_t^N appearing in formula (1.47) to be the empirical measure of the states of the other birds, namely the X_t^j = [x_t^j, v_t^j] for j ≠ i, then we need to choose κ² = (N − 1)² κ̃². On the other hand, if we want this probability measure to be the empirical measure of all the bird states, namely the X_t^j = [x_t^j, v_t^j] for j = 1, …, N including j = i, then we need to choose κ² = N² κ̃². In any case, since we already noticed that in the large flock limit, the constant κ̃ should be of order 1/N, κ should be viewed as a dimensionless constant independent of the size of the flock.
One of the main challenges of this model is the fact that the running cost function
f is not convex for ˇ > 0. This complicates significantly the solution of the
optimization problem. In particular, such a running cost function will not satisfy
the typical assumptions under which we provide solutions for this type of model.
Moreover, many population biologists have argued that more general interactions
(e.g., involving quantiles of the empirical distribution) are needed for the model to
have any biological significance. Finally, restricting the random shocks affecting
the system to idiosyncratic shocks attached to individual birds is highly unrealistic,
as an ambient source of noise common to all individuals should be present in the
physical environment in which the birds are evolving.
Remark 1.26 In this example, like in many other examples, each individual
interacts, inside the running cost, with the empirical distribution of the states of
the other individuals involved in the game.
A typical form for the interaction potential U is the Morse potential:

U(x) = −C_A e^{−|x|/ℓ_A} + C_R e^{−|x|/ℓ_R},

where C_A, C_R, and ℓ_A, ℓ_R are the strengths and the typical lengths of attraction and repulsion respectively. As in the case of the Cucker-Smale model of flocking, we turn this deterministic descriptive model into a stochastic differential game with mean field interactions by defining the same controlled dynamics as before:

dx_t^i = v_t^i dt,
dv_t^i = α_t^i dt + σ dW_t^i,   t ≥ 0,

with:

f(t, X, μ, α) = (1/2)|α|² + (1/2) | ∫_{R^6} U(x − x′) μ(dx′, dv′) |²,
where X D Œx; v as before. Notice that the value of the running cost function does
not depend upon the v-component of the state X, and that it only depends upon the
x-marginal of the probability distribution of the state.
This subsection is an attempt to introduce, in the spirit and with the notation of
this chapter, several models of crowd behavior. Part of our motivation is to consider
models with different groups of individuals for which the mean field limit and the
mean field game strategy apply separately to each group. But first, as a motivation,
we start with a single group as we did so far.
The dynamics of the positions of the N individuals are given by:

dX_t^i = α_t^i dt + σ dW_t^i,   t ≥ 0,

where the (W^i)_{1≤i≤N} are N independent standard Wiener processes, σ > 0, and α^i is a square integrable process giving individual i a form of control on the evolution of its position over time. In other words, each individual chooses its own velocity, at least up to the idiosyncratic noise shocks dW_t^i. If individual i is at x at time t, its velocity is α, and the empirical distribution of the other individuals is μ, then it faces a running cost given by the value of the function:

f(t, x, μ, α) = (1/2)|α|² ( ∫_{R³} ρ(x − x′) μ(dx′) )^a + e^{−rt} k(t, x).  (1.51)
Here ρ is a smooth density with a support concentrated around 0, and the function k models the effect of panic which frightens individuals depending on where they are, though this effect is dampened with time through the actualization factor e^{−rt} for which we assume that r > 0. The power a > 0 is intended to penalize congestion since large positive values of a penalize the kinetic energy, and make it difficult to move where the highest density of population can be found.
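For concreteness, here is one possible numerical rendering of the running cost (1.51); the Gaussian kernel ρ and the panic function k below are placeholder assumptions chosen only to make the sketch self-contained.

import numpy as np

# Illustrative evaluation of the running cost (1.51): the kinetic term is scaled
# by a local crowd density seen through a smoothing kernel rho (here a narrow
# Gaussian, an assumption), raised to the congestion exponent a. The panic
# function k below is also a placeholder assumption.

def running_cost(t, x, alpha, crowd, a=1.0, r=0.1, eps=0.25):
    """f(t, x, mu, alpha) of (1.51), with mu the empirical measure of `crowd`."""
    d2 = ((crowd - x)**2).sum(axis=1)
    rho = np.exp(-d2 / (2 * eps**2)) / ((2 * np.pi * eps**2)**1.5)  # Gaussian kernel
    local_density = rho.mean()                   # integral of rho(x - .) against mu
    k = np.exp(-np.linalg.norm(x))               # placeholder panic field k(t, x)
    return 0.5 * (alpha**2).sum() * local_density**a + np.exp(-r * t) * k

crowd = np.random.default_rng(4).normal(0, 1, (500, 3))   # 500 individuals in R^3
print(running_cost(0.0, np.zeros(3), np.ones(3) * 0.2, crowd))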
We shall generalize the model to the case of two subpopulations (which are often
called species in biological applications) in Chapter 7.
The purpose of this section is to introduce a special class of models which, while
not central to the subject matter of the book, still play a crucial role in many
practical applications of great importance. For these models, although the time
variable varies continuously as in all the subsequent developments in the book,
the states controlled (or at least influenced) by the players are restricted to a discrete
set which we shall assume to be finite in some cases. These models cannot be cast as
stochastic differential games, and they require a special treatment which we provide
in Section 7.2 of Chapter 7 and extend to games with major and minor players in
Subsection 7.1.9 of Chapter 7 in Volume II.
In this model, we assimilate the Limit Order Book (LOB for short) of an electronic trading exchange to a set of M different queues, so that at each time t, the state of the limit order book is given by the lengths of these queues. When one of the N agents, typically a trading program, arrives and is ready to trade, the value X_t^i of its private state is set to 0 if it decides to leave and not enter any queue, or to j ∈ {1, …, M} if it decides to enter the j-th queue. So, in this particular instance, the space in which the states of the players evolve is the finite set E = {0, 1, …, M} instead of the Euclidean space R^d used in most of the examples treated in this book.
While X_t = (X_t^1, …, X_t^N) gives the states of the N market participants at time t, the empirical distribution:

μ̄^N_{X_t} = (1/N) Σ_{i=1}^N δ_{X_t^i} = Σ_{k=0}^M ( (1/N) Σ_{i: X_t^i = k} 1 ) δ_k = Σ_{k=0}^M ( #{i : X_t^i = k} / N ) δ_k,

contains the histogram of the relative lengths of the individual queues. For the sake of convenience, we shall denote by m_t = (m_t^1, …, m_t^M) the relative lengths of these queues, namely m_t^k = #{i : X_t^i = k}/N. So because the private states of the individual agents can only take finitely many values at any given time, 0, 1, …, M in this instance, the marginal distribution of the system can be identified to an element of R^{M+1}, or an element of the M-dimensional probability simplex in this Euclidean space if we wanted to be more specific.
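The identification of the empirical distribution with a histogram is immediate in code; the states below are made up for the illustration.

import numpy as np

# Tiny illustration of the identification above: private states in E = {0,...,M}
# and the empirical distribution as a histogram (m_t^0, ..., m_t^M).
M = 3
X_t = np.array([0, 2, 2, 1, 3, 2, 0, 1])           # states of N = 8 agents
m_t = np.bincount(X_t, minlength=M + 1) / len(X_t)
print(m_t)                                          # relative queue lengths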
So if we want to capture a property of the game via a real valued function U(t, x, m) of time t, of the state x of an individual player at time t, and of the statistical distribution of this state at time t, say m, such a function can be viewed as a function u : [0, T] × R^{M+1} → R^{M+1} with the convention [u(t, m)]_x = u_x(t, m) = U(t, x, m), the value of the individual state determining which component of the vector u we use. So if the function U were to appear as the solution of a complex equation in its three variables, or even an infinite dimensional Partial Differential Equation (PDE), such an equation could be rewritten as an equation for the function u which would appear as the solution of a simpler system (say a system of scalar ordinary differential equations, ODEs for short). Such a function U will be introduced in Chapter 7 as a solution of what we shall call the master equation, an equation which will be studied in full detail in Chapter 4 of Volume II.
We now discuss a second instance of departure from the great majority of models
treated in the book which happen to have a continuum state space. In this example,
not only do we consider a finite state space, but we also break the symmetry among
the players as we single out one of them. Note that this special player could also
be a small number of players which we would bundle together into what we often
call a major player. This special player faces a large number of opponents which
will be assumed to be statistically similar, restoring the framework of mean field
game models in this large group. We shall use the terminology minor player to
refer to each element of this homogeneous group. Stochastic differential mean field
games with major and minor players will be studied in Section 7.1 of Chapter 7
in Volume II, the special case of games with finite state spaces being discussed in
Subsection 7.1.9 at the end of that same section.
The example we choose to illustrate these two new features is inspired by
academic works on cyber security, to which an extensive literature on two-player
games has been devoted. Typically, the first player, characterized as the attacker,
tries to infect, or take control of, the computers of a network administered and
protected by a defender. The connectivity of the network and the relative importance
of the nodes dictate the defensive measures implemented by the administrator of
the network. The costs incurred as a result of the attacks and the implementation
of defensive measures, cast the model as a zero-sum game since the cost to the
network parallels the reward to the attacker. Zero-sum games are very popular in
the mathematical literature. This stems mostly from the fact that, since the analysis
reduces to the study of one single value function (as opposed to one value function
per player), the techniques of stochastic control can be extended with a minimal
overhead.
In the model which we study in detail in Chapter (Vol II)-7, we consider an
interconnected network of N computers labeled by i 2 f1; ; Ng, which we
identify to their owners or users, and whose levels of security depend upon the
levels of security of the other computers in the same network. For the sake of
definiteness, we shall assume that each computer can be in one of four possible
states: DI for “defended infected”; DS for “defended and susceptible to infection”;
UI for “unprotected and infected”; and finally US for “unprotected and susceptible
to infection.” Each computer user makes an investment trying to keep its machine
secure by installing anti-virus filters, setting up firewalls, : : : and pays a cost for
this investment. On the other hand, the attacker will be rewarded for taking control
of computers in the network, and pay a cost for the implementation of attacks on
the network, the intensity of its attack and the associated cost depending upon the
proportion of computers in the network already infected. This last feature is what
guarantees the mean field nature of the model.
to be sure, neither one previously met any of the persons the other one met in
the past. This assumption guarantees enough independence to provide sufficient
statistics reducing the complexity of the game. For example, when agents try to
guess the value of a random variable (say a Gaussian N.0; 1/ random variable) by
sharing information when they meet, the number of past encounters happens to be a
sufficient statistic whenever the information of each agent is in the form of a private
signal which is also a Gaussian random variable.
In Chapter 3 Section 3.7, we provide the mathematical setting needed to make
rigorous sense of the continuum of independent random variables without losing
measurability requirements, and we prove a weak form of the exact law of large
numbers at the core of this modeling assumption.
Our stylized version of the model also assumes that each single one of N agents
can improve the level of his or her information only by meeting other agents and
sharing information. It would be easy to add a few bells and whistles to allow each
individual to increase his or her information on his or her own, but this would add
significantly to the complexity of the notation without adding to the crux of the
matter, or to its relevance to the theory of mean field games. So we shall assume
that, in order to improve his or her knowledge of an unknown random quantity, each agent tries to meet other agents, that the private information of agent i at time t can be represented by an integer X_t^i, and that the sharing of information can be modeled by the fact that if agent i meets agent j at time t, we have X_t^i = X_t^j = X_{t−}^i + X_{t−}^j. In other words, the state X_t^i is an integer representing the precision at time t of the best guess player i has of the value of a random variable of common interest.
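Before turning to the precise mathematical model, here is a toy simulation of this information-sharing mechanism in which all agents search with the same constant intensity; the intensity c and the horizon are illustrative assumptions, and the controlled, game-theoretic version is described next.

import numpy as np

# Toy simulation of the information-sharing mechanism described above: agents
# meet in pairs at random exponential times and pool their precisions,
# X^i = X^j = X^i + X^j. Meeting intensities are constant here (an assumption);
# in the model they are controlled.

rng = np.random.default_rng(5)
N, T, c = 200, 5.0, 0.5               # agents, horizon, common search intensity
X = np.ones(N, dtype=int)             # everyone starts with precision 1
t = 0.0
while True:
    # each ordered pair (i, j) meets at rate c*c/(2*(N-1)); total rate N*c*c/2:
    total_rate = N * (N - 1) * (c * c) / (2 * (N - 1))
    t += rng.exponential(1.0 / total_rate)
    if t > T:
        break
    i, j = rng.choice(N, size=2, replace=False)
    X[i] = X[j] = X[i] + X[j]         # both agents leave with the pooled precision

print("mean precision at time T:", X.mean())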
Each agent controls his or her own search intensity as follows. For each i ∈ {1, …, N}, there exists a (measurable) function [0, T] × N ∋ (t, n) ↦ c_t^i(n) ∈ R_+, for a finite time horizon T > 0, so that c_t^i(n) represents the intensity with which agent i searches when his or her state is n at time t (i.e., when X_t^i = n). Typically, each c^i takes values in a bounded subinterval [c_L, c_U] of [0, ∞). We then model the dynamics of the state (X_t = (X_t^1, …, X_t^N))_{0≤t≤T} of the information of the set of N agents in the following way:
X_t = X_0 + Σ_{1≤i≠j≤N} ∫_{[0,t]} ∫_{[0,∞)} φ_{i,j}(s, X_s, v) M_{i,j}(ds, dv),  (1.52)

where the (M_{i,j})_{1≤i≠j≤N} are independent homogeneous Poisson random measures on R_+ × R_+ with mean measures proportional to the 2-dimensional Lebesgue measure Leb_2. More precisely,

E[ M_{i,j}(A × B) ] = (1/(2(N−1))) Leb_2(A × B),

for i ≠ j and A, B two Borel subsets of R_+. The functions (φ_{i,j})_{1≤i≠j≤N} are given by:

φ_{i,j}(t, x, v) = (0, …, 0) if λ_{i,j}(t, x) < v, and φ_{i,j}(t, x, v) = y otherwise,

where λ_{i,j}(t, x) = c_t^i(x^i) c_t^j(x^j) and y is the vector whose i-th and j-th coordinates are equal to x^j and x^i respectively, all the other coordinates being equal to 0.
The costs of search are specified as follows: the controls are given by the feedback functions ((α_t^i = c_t^i(X_t^i))_{0≤t≤T})_{i=1,…,N}, and K : [c_L, c_U] ∋ c ↦ K(c) is a bounded measurable function representing the cost for an individual searching with intensity c. It is natural to assume that this latter function is increasing and convex. The terminal cost function g represents the penalty for ending the game with a given level of information. Typically, g(n) will be chosen to be inversely proportional to (1 + n), but any convex function decreasing with n would do as well.
To understand the behavior of the system, we describe the dynamics of the state (X_t^k)_{0≤t≤T} of the information of agent k by extracting the k-th component of both sides of (1.52).
In order to proceed, we fix a coordinate k ∈ {1, …, N} and we provide an alternative representation of X^k, obtained by choosing (M_{i,k})_{i≠k} and (M_{k,j})_{j≠k} in a relevant way. For any time t ∈ [0, T], our construction of (M_{i,k}(dt, ·))_{i≠k} and (M_{k,j}(dt, ·))_{j≠k} is based on a suitable coupling with the past of the whole system up until time t. For two independent homogeneous Poisson random measures M̃_1 and M̃_2 on R_+ × R_+ × (0, 1] with (1/2) Leb_3 as intensity, where Leb_3 is the 3-dimensional Lebesgue measure, we choose M_{i,k}(dt, dv) and M_{k,j}(dt, dv) as:

M_{i,k}(dt, dv) = ∫_{[0,1]} 1_{ (σ_t(i)−1)/(N−1) < w ≤ σ_t(i)/(N−1) } M̃_1(dt, dv, dw),   i ≠ k,
M_{k,j}(dt, dv) = ∫_{[0,1]} 1_{ (σ_t(j)−1)/(N−1) < w ≤ σ_t(j)/(N−1) } M̃_2(dt, dv, dw),   j ≠ k,

where (σ_t)_{0≤t≤T} is a predictable process with values in the set of one-to-one mappings from {1, …, N} \ {k} onto {1, …, N−1}. The precise form of (σ_t)_{0≤t≤T} will be specified later on.
Recall that:

X_t^k = X_0^k + Σ_{1≤i≠j≤N} ∫_{[0,t]} ∫_{[0,∞)} φ^k_{i,j}(s, X_s, v) M_{i,j}(ds, dv).

Since φ^k_{i,j}(s, x, v) = 0 if k ∉ {i, j}, we get:

X_t^k = X_0^k
+ Σ_{i=1, i≠k}^N ∫_{[0,t]} ∫_{[0,∞)} ∫_{[0,1]} φ^k_{i,k}(s, X_s, v) 1_{ (σ_s(i)−1)/(N−1) < w ≤ σ_s(i)/(N−1) } M̃_1(ds, dv, dw)
+ Σ_{j=1, j≠k}^N ∫_{[0,t]} ∫_{[0,∞)} ∫_{[0,1]} φ^k_{k,j}(s, X_s, v) 1_{ (σ_s(j)−1)/(N−1) < w ≤ σ_s(j)/(N−1) } M̃_2(ds, dv, dw).

The first sum may be rewritten as:

Σ_{i=1, i≠k}^N φ^k_{i,k}(s, X_s, v) 1_{ (σ_s(i)−1)/(N−1) < w ≤ σ_s(i)/(N−1) }
= Σ_{i=1, i≠k}^N X_s^i 1_{[v,∞)}( c_s^i(X_s^i) c_s^k(X_s^k) ) 1_{ (σ_s(i)−1)/(N−1) < w ≤ σ_s(i)/(N−1) }.  (1.54)
If we now restrict ourselves to symmetric Nash equilibria, in the sense that the control profiles (α^1, …, α^N) are exchangeable at equilibrium, we can assume that the search intensities of all the other agents are given by the same feedback function, say [0, T] × N ∋ (t, x^i) ↦ c_t^i(x^i) = c̃_t^k(x^i) ∈ R_+ for i ≠ k, for some function c̃^k : [0, T] × N → R_+. In such a case, the above sum becomes:

Σ_{i=1, i≠k}^N φ^k_{i,k}(s, X_s, v) 1_{ (σ_s(i)−1)/(N−1) < w ≤ σ_s(i)/(N−1) }
= Σ_{i=1, i≠k}^N X_s^i 1_{[v,∞)}( c̃_s^k(X_s^i) c_s^k(X_s^k) ) 1_{ (σ_s(i)−1)/(N−1) < w ≤ σ_s(i)/(N−1) }.  (1.55)
Now, if we choose (σ_s)_{0≤s≤T} so as to order the states (X_s^j)_{j≠k}, any sum of this form can be rewritten in terms of order statistics. Indeed, for any function f on N:

Σ_{i=1, i≠k}^N f(X_s^i) 1_{ (σ_s(i)−1)/(N−1) < w ≤ σ_s(i)/(N−1) } = Σ_{i=1}^{N−1} f( X_s^{(i),k} ) 1_{ (i−1)/(N−1) < w ≤ i/(N−1) },

where X_s^{(i),k} denotes the state whose rank under σ_s is i among the (X_s^j)_{j≠k}. We now call μ̄_s^k the empirical measure of the sample (X_s^j)_{j≠k}, and F̄_s^k(·) the associated empirical distribution function, and we denote by Q̄_s^k(·) its pseudo-inverse. We recall that:

(i−1)/(N−1) < w ≤ i/(N−1)  ⟹  Q̄_s^k(w) = X_s^{(i),k}.
We finally get:

Σ_{i=1, i≠k}^N f(X_s^i) 1_{ (σ_s(i)−1)/(N−1) < w ≤ σ_s(i)/(N−1) } = f( Q̄_s^k(w) ),   w ∈ (0, 1].
Applying this identity to the sum in (1.55), we obtain:

Σ_{i=1, i≠k}^N φ^k_{i,k}(s, X_s, v) 1_{ (σ_s(i)−1)/(N−1) < w ≤ σ_s(i)/(N−1) }
= Q̄_s^k(w) 1_{[v,∞)}( c̃_s^k(Q̄_s^k(w)) c_s^k(X_s^k) ).

Proceeding similarly with the second sum and setting M̃ = M̃_1 + M̃_2, we end up with:

X_t^k = X_0^k + ∫_{[0,t]} ∫_{[0,∞)} ∫_{[0,1]} Q̄_s^k(w) 1_{[v,∞)}( c̃_s^k(Q̄_s^k(w)) c_s^k(X_s^k) ) M̃(ds, dv, dw).
This writing will serve as a starting point for the formulation of the game with a
large number of agents as a mean field game.
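Before moving on, it may help to see the coupling device at work numerically: the identity $\sum_{i\ne k} f(X^i_s)\mathbf{1}_{\{\cdot\}} = f(\bar Q^k_s(w))$ says that feeding a uniform $w$ into the empirical pseudo-inverse amounts to picking one of the other agents uniformly at random. The following sketch (hypothetical values, NumPy only, not code from the text) checks this:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 6
X = rng.integers(0, 10, size=N)          # hypothetical private information levels
k = 0
others = np.sort(X[np.arange(N) != k])   # order statistics X^{(1),k} <= ... <= X^{(N-1),k}

def Q_bar(w):
    # empirical pseudo-inverse: (i-1)/(N-1) < w <= i/(N-1)  =>  X^{(i),k}
    i = max(1, int(np.ceil(w * (N - 1))))
    return others[i - 1]

# plugging a uniform w into Q_bar samples uniformly from the other agents' states
draws = np.array([Q_bar(w) for w in rng.uniform(0.0, 1.0, size=100_000)])
emp = np.bincount(draws, minlength=10) / draws.size
true = np.bincount(others, minlength=10) / (N - 1)
print(np.abs(emp - true).max())          # small, up to Monte Carlo error
```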
where:
$$\varphi(t,\alpha,v,w) = \mathbf{1}_{[v,\infty)}\bigl(\alpha\,c_t(Q_t(w))\bigr)\,Q_t(w),$$
with:
$$\bar c_t = \int_{\mathbb{N}} c_t\,d\mu_t = \sum_{n'\in\mathbb{N}} c_t(n')\,\mu_t(\{n'\}).$$
In other words, $(X_t)_{t\ge 0}$ is a pure-jump process whose jump-arrival intensity at time $t$ is given by the function $\bar c_t\,\gamma_t(\cdot)$, where $\gamma_t(\cdot)$ denotes the feedback function giving the search intensity chosen by the agent. The jump-size distribution at time $t$ is then given by:
$$\mathbb{N}\ni n \mapsto \frac{c_t(n)\,\mu_t(\{n\})}{\bar c_t}.$$
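For intuition, here is a minimal Monte Carlo sketch of such a pure-jump process. All specific choices (the intensity function, the frozen flow $\mu_t\equiv\mu$, the truncation of $\mathbb{N}$) are hypothetical illustrations, not taken from the text; the simulation only uses the jump-arrival intensity and jump-size distribution just described, with $\gamma = c$:

```python
import numpy as np

rng = np.random.default_rng(1)
n_max = 50
c = lambda n: 0.5 + 1.0 / (1.0 + n)       # hypothetical search intensity in [c_L, c_U]
mu = np.ones(n_max) / n_max               # hypothetical frozen population distribution
c_vals = c(np.arange(n_max))
c_bar = float(c_vals @ mu)                # bar c = sum_n c(n) mu({n})
jump_law = c_vals * mu / c_bar            # jump-size distribution c(n) mu({n}) / bar c

def simulate(T=5.0, x0=1):
    t, x = 0.0, x0
    while True:
        t += rng.exponential(1.0 / (c_bar * c(x)))   # arrival intensity bar c * c(x)
        if t > T:
            return x
        x += rng.choice(n_max, p=jump_law)           # gain the partner's information

print(np.mean([simulate() for _ in range(1000)]))    # average knowledge at time T
```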
Also, if we denote by $(L_t(n,y))_{n\in\mathbb{N},y\in\mathbb{N}}$ the jump rate kernel at time $t$ of the state process $X$, that is $\mathbb{P}[X_{t+dt} = n+y \,|\, X_t = n] = L_t(n,y)\,dt + o(dt)$, then:
$$L_t(n,y) = \gamma_t(n)\,c_t(y)\,\mu_t(\{y\}). \tag{1.58}$$
Now, the last step of the mean field game approach is to find a flow $\mu = (\mu_t)_{0\le t\le T}$ of distributions on $\mathbb{N}$ and a function $c : [0,T]\times\mathbb{N}\ni(t,n)\mapsto c_t(n)\in[c_L,c_U]$ such that the stochastic control problem (1.56)–(1.57) has an optimizer, given by a family $(\gamma_t)_{0\le t\le T}$ of feedback functions from $\mathbb{N}$ to $[c_L,c_U]$ and admitting $\nu = (\nu_t)_{0\le t\le T}$ as flow of marginal distributions of the optimal states, such that the following fixed point condition holds true: $\gamma_t = c_t$ and $\nu_t = \mu_t$, for $t\in[0,T]$.
We can characterize the equilibria as the solutions of a nonlinear Fokker-Planck-Kolmogorov equation. To do so, the following notation will be useful. If $\psi : \mathbb{N}\to\mathbb{R}$ is a bounded function and $\nu = (\nu_t)_{0\le t\le T}$ is a flow of probability measures on $\mathbb{N}$, we denote by $\psi\nu_t$ the measure with density $\psi$ with respect to the measure $\nu_t$, for any $t\in[0,T]$. In other words:
$$(\psi\nu_t)(\{n\}) = \psi(n)\,\nu_t(\{n\}), \qquad n\in\mathbb{N}.$$
Now, for a given flow $\mu = (\mu_t)_{0\le t\le T}$ of probability measures, we write the Fokker-Planck-Kolmogorov equation satisfied by the flow of marginal distributions $\nu = (\nu_t)_{t\ge 0}$ of the pure jump process admitting $(L_t)_{0\le t\le T}$ in (1.58) as jump rate kernel. We find this equation by computing, for a bounded function $f$ on $\mathbb{N}$:
$$\frac{d}{dt}\mathbb{E}\bigl[f(X_t)\bigr] = \frac{d}{dt}\langle f,\nu_t\rangle, \qquad t\in[0,T],$$
which gives:
$$\begin{aligned}
\frac{d}{dt}\langle \nu_t, f\rangle &= \sum_{n\in\mathbb{N}} \nu_t(\{n\}) \sum_{m\ge 1} \bigl[f(n+m)-f(n)\bigr]\,L_t(n,m)\\
&= \sum_{n\in\mathbb{N}} \nu_t(\{n\}) \sum_{m\in\mathbb{N}} \bigl[f(n+m)-f(n)\bigr]\,\gamma_t(n)\,c_t(m)\,\mu_t(\{m\})\\
&= \sum_{n,m\in\mathbb{N}} \gamma_t(n)\,\nu_t(\{n\})\,c_t(m)\,\mu_t(\{m\})\,f(n+m) - \sum_{n,m\in\mathbb{N}} f(n)\,\gamma_t(n)\,\nu_t(\{n\})\,c_t(m)\,\mu_t(\{m\})\\
&= \bigl\langle (\gamma_t\nu_t)\star(c_t\mu_t), f\bigr\rangle - \langle c_t\mu_t, 1\rangle\,\langle \gamma_t\nu_t, f\rangle,
\end{aligned}$$
where for two probability measures $\mu$ and $\nu$ on $\mathbb{N}$, the convolution $\mu\star\nu$ is given by:
$$(\mu\star\nu)(\{n\}) = \sum_{k=0}^n \mu(\{k\})\,\nu(\{n-k\}), \qquad n\in\mathbb{N}.$$
We deduce:
$$\frac{d}{dt}\nu_t = (c_t\mu_t)\star(\gamma_t\nu_t) - \langle c_t\mu_t, 1\rangle\,\gamma_t\nu_t, \qquad t\in[0,T].$$
Notice that, if the last step of the mean field game approach can be performed, in other words, if the fixed point problem can be solved, then $\gamma_t = c_t$ and $\nu_t = \mu_t$, and the (linear) Fokker-Planck-Kolmogorov equation for the state distribution becomes the nonlinear McKean-Vlasov equation:
$$\frac{d}{dt}\mu_t = (c_t\mu_t)\star(c_t\mu_t) - \langle c_t\mu_t, 1\rangle\,c_t\mu_t, \qquad t\in[0,T], \tag{1.59}$$
which is exactly the finite horizon analog of the equation obtained in the literature (see the Notes & Complements below) since we do not have entrance and exit of agents at exponential random times.
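Equation (1.59) is straightforward to integrate numerically once the state space is truncated. The sketch below uses a hypothetical intensity function and an explicit Euler scheme (truncation level and step size chosen for illustration only), with the discrete convolution defined above computed by np.convolve:

```python
import numpy as np

n_max, dt, T = 200, 0.01, 2.0
c = lambda n: 1.0 / (1.0 + 0.1 * n)           # hypothetical feedback intensity c_t(n)
c_vals = c(np.arange(n_max))

mu = np.zeros(n_max)
mu[1] = 1.0                                   # everybody starts with one unit of information

for _ in range(int(T / dt)):
    nu = c_vals * mu                          # the measure c_t mu_t
    conv = np.convolve(nu, nu)[:n_max]        # (c_t mu_t) * (c_t mu_t), truncated
    mu = mu + dt * (conv - nu.sum() * nu)     # d mu/dt per equation (1.59)
    mu = np.clip(mu, 0.0, None)
    mu /= mu.sum()                            # re-normalize mass lost to truncation

print((np.arange(n_max) * mu).sum())          # mean information level at time T
```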
1.7 Notes & Complements

The notion of Nash equilibrium in game theory goes back to the seminal works
by Nash [288] and [289]. Throughout the book, we consider game models with
finitely many players, and mathematical objects capturing the limits of features of
these models when the number of players grows without bound. In some sense,
the mean field game models we formulate and solve pertain to an infinite, though
countable, population of players. For a long time, economists have used alternative
models for which the set of players is a measurable space equipped with a nonatomic
probability measure. Even though we chose not to use this framework, we recognize
that it is an attempt to abstract stylized facts from the same finite player game
models which we study. The reader interested in Aumann’s theory of games with a
continuum of players is referred to [27, 28] and to Section 3.7 of Chapter 3.
We borrowed the idea of grounding the formulation of a mean field game problem
on the elementary Lemma 1.2 and Proposition 1.4 from P.L. Lions’ lectures [265]
as explained in Cardaliaguet’s presentation [83]. These simple results provide a
rigorous foundation for the fundamental assumptions we make on the coefficients
of a mean field game. These sources also prompted us to introduce early the notion
of potential game and its connections to centralized decision making in lieu of Nash
equilibrium. This idea will be revisited several times in Chapter 2 and Chapter 6 for
example, with increasingly more sophisticated models and analysis tools.
The discussion of the model “When does the meeting start?” given in the text is
borrowed from [189]. In fact, the deterministic and stochastic one period examples,
as well as the stochastic differential game models used to illustrate the management
of exhaustible resources and the economic growth model are all borrowed from the
survey [189] by Guéant, Lasry and Lions. We chose not to include a discussion of
the Mexican wave (a fixture of most crowd behaviors at soccer games all over the
world) even though it is one of P.L. Lions’ favorite examples of mean field games.
It is explained in detail in [189].
Early game theoretic models for the banking system are due to Bryant [73] and
Diamond and Dybvig whose fundamental paper [136] initiated a wave of interest
leading to a series of papers with increasing realism. In their original analysis,
Diamond and Dybvig proposed a banking model in the form of a game played
by depositors. A distinctive feature of the model is that there always exist at least two equilibria: a good one and a bad one. Many generalizations were proposed,
for example to model random returns. The first of the two models discussed in
the text is in this line of research, analyzing a static model of the inter-banking
system. It is borrowed from the paper [319] of Rochet and Vives. There, the authors
use the methodology of global games proposed by Morris and Shin in [286], and
the differences in opinions among investors to prove existence and uniqueness of a
Nash equilibrium. They go on to analyze the economic and financial underpinnings
of bank runs and propose a benchmark for the role of lenders of last resort. We gave
a detailed account of their set-up because of the mean field game nature of their
approach, despite the fact that their model is de facto static. The theory of games
with strategic complementarities goes back to the original works [339] of Vives and
[284] of Milgrom and Roberts. An application to games with mean field interactions
can be found in the paper [8] by Adlakha and Johari.
The second model of bank run presented in the text ports the most important
stylized facts of the first model to a dynamic set-up in continuous time. It is based on
a diffusion model for the value of the assets of a bank, and for that reason, it is more
in line with the theoretical developments presented later in the text. It was inspired
by a lecture given by Olivier Gossner at a PIMS Workshop on Systemic Risk in
July 2014. This model builds on an earlier paper [197] by He and Xiong modeling
staggered debt maturities in continuous time. Section (Vol II)-7.2 of Chapter (Vol
II)-7 is devoted to the discussion and the solution of more general games of timing.
The toy model of systemic risk introduced in Subsection 1.3.1 is borrowed from
the paper [102] of Carmona, Fouque, and Sun to which we refer the interested reader
for the interpretation of the results in terms of systemic risk. This simple model is
remarkable because it can be used as a testbed for all the theoretical tools developed
in the text. It is solved explicitly in Chapter 2 to illustrate the differences between
open loop and closed loop equilibria for finite player games. It is used in Chapter 3
as an example for which the limit $N\to\infty$ of large games can be performed leading
to a solvable mean field game in the limit. It will also be revisited in Chapter (Vol
II)-4 to illustrate how the master equation can appear in the limit of finite player
games with a common noise. The model was recently extended in [101] to include
delay in the control in order to make the model more realistic and more in line with
the way interbank borrowing and lending actually occur.
Price impact models as mean field games have been considered by Gomes and
Saude in [182] where the problem is approached from a PDE perspective, and
by Carmona and Lacker in [103] where it is treated within the framework of the
weak formulation. The model presented in Subsection 1.3.2 is borrowed from a
technical report by Aghbal and Carmona where it is treated by adapting the tools
developed later on in the text. The model of price impact used in the text is due
to Almgren and Chriss. See for example [18]. It is relatively easy to calibrate it to
high frequency data, hence its popularity among practitioners. As we see in the
solution proposed in Subsection 4.7.1 of Chapter 4, this model of price impact
can lead to tractable solutions, hence its popularity in the financial engineering
literature. Carmona and Webster proved in [107] that the self-financing condition of
the classical Black-Scholes theory is not always appropriate to account for frictions
in high frequency trading data. They propose an alternative including price impact
and adverse selection and it would be interesting to solve the global equilibrium
problem in the mean field set-up they propose in [108].
Games for which the interaction between the players occurs through the distri-
bution of the controls of the players have been called extended mean field games by
Gomes and Saude in [182]. They will be studied in Section 4.6 of Chapter 4.
The first macro-economic growth model presented in Subsection 1.4.1 is bor-
rowed from [189]. The discussion of the second growth model of Subsection 1.4.2
is patterned after the original paper [241] by Krusell and Smith where the authors
propose a formulation which leads to approximations and computational algorithms
for approximate solutions. We learned about the version of Aiyagari’s model
presented in Subsection 1.4.3 from a talk by B. Moll at the Institute of Mathematics and its Applications, Minneapolis, MN, in November 2012. This continuous time
version of the original discrete time model proposed by Aiyagari is also discussed
in the review [1] by Achdou, Buera, Lasry, Lions, and Moll of partial differential
equation models in macroeconomics. There, it is stated that a mathematical solution
of such a model is not known. We shall provide such a mathematical solution in
Subsection 3.6.3 of Chapter 3 together with numerical illustrations of the properties
of the solution.
The mean field game model for exhaustible resources presented in Subsection
1.4.4 is borrowed from [189]. The last subsection is adapted from a recent technical
report [113] by Chan and Sircar.
Our presentation of the deterministic model of flocking is based on the original
paper [126] of Cucker and Smale. Soon after the publication of the original
treatment [126], Cucker and Mordecki proposed in [125] a generalization in which
the dynamics of the velocities are perturbed by a mean zero stationary Gaussian
process. In [125], the authors compute a lower bound for the probability that the
velocities eventually align in terms of the various parameters of the model. The idea
of using a mean field game model to formulate the flocking problem as the search
for equilibrium in a stochastic game model with mean field interactions is due
to Nourian, Caines, and Malhamé in [285]. In this paper, the authors recognize
that the case $\beta = 0$ leads to a linear quadratic mean field game which could be solved, and they suggest that an asymptotic expansion for $\beta$ small could provide a reasonable approximation to the solution. In this book, we solve the particular case $\beta = 0$ in Section 2.4 of Chapter 2 explicitly for a finite number of birds, and in
Section 3.6.1 of Chapter 3 in the mean field game limit when the number of birds
increases without bound. Furthermore, we give a theoretical solution and numerical illustrations for the general case $\beta\ne 0$ in Section 4.7.3 of Chapter 4.
A game theoretic network security model is discussed in the conference proceedings [290] by Nguyen, Alpcan, and Basar, where the authors frame the problem as a zero-sum game between an attacker and the defender of the network.
Our discussion of the propagation of knowledge in Subsection 1.6.3 was inspired
by the work of Duffie, Giroux, Malamud, and Manso, see for example [144–147]
which are based on a form of continuous law of large numbers proven in [148]. In
keeping with the spirit of this first chapter, we used slightly different assumptions
to formulate a finite player game with mean field interactions. However, in order
to remain faithful to the original works of Duffie et al. we use N as the state space
of the system and as a result, model the idiosyncratic sources of random shocks by
Poisson processes, which will prevent us from applying directly the main existence
results of the book to these models.
In a couple of interesting papers, [199] and [200], Horst discusses stochastic
games with a form of weak interaction between players. There, the author includes
peer and neighborhood effects in a dynamic analysis of equilibrium, very much
in the spirit of an earlier work of Bisin, Horst, and Özgür of 2002 [59] eventually
published in 2006, in which the authors proved the existence of rational expectations
equilibria of random economies with locally interacting agents under the assumption
that the interaction between different players is weak. The weak interaction
approach suggested in this paper provides a unified framework for integrating
strategic behavior into dynamic models of social interactions. However, the weak
interactions of the models considered in this book are of a different nature. They
are defined in an asymptotic sense when the number of players tends to 1. This
is different from Horst’s notion of weak interaction for which the weakness of
the interaction is localized to the neighbors of the given agent. Other examples
of application may be found in the survey by Djehiche, Tcheukam Siwe and
Tembine [137].
2 Probabilistic Approach to Stochastic Differential Games
Abstract
This chapter offers a crash course on the theory of nonzero sum stochastic
differential games. Its goal is to introduce the jargon and the notation of this
class of game models. As the main focus of the text is the probabilistic approach
to the solution of stochastic games, we review the strategy based on the stochastic
Pontryagin maximum principle and show how BSDEs and FBSDEs can be
brought to bear in the search for Nash equilibria. We emphasize the differences
between open loop and closed loop or Markov equilibria and we illustrate the
results of the chapter with detailed analyses of some of the models introduced in
Chapter 1.
Despite the fact that we discussed in Chapter 1 a couple of examples of one and
two period games, the book is devoted to continuous time models. As before, we
consider $N$ players, and we label them by the integers $1,\cdots,N$. They act at time
t on a system whose state Xt they influence through their actions. As the title of
the chapter suggests, the dynamics of the state of the system are given by an Itô
stochastic differential equation. As can be seen from the examples introduced in
Chapter 1, some of the state dynamics appearing naturally in these examples are not
easily amenable to stochastic differential equations. For this reason, we devote an
entire section to games with finite state spaces in Chapter 7. Moreover, we shall
add remarks when appropriate, to indicate how to handle more general Markov
dynamics, possibly including jumps, and we shall give precise references in the
Notes & Complements sections at the end of all the chapters.
As explained in Remark 2.1 below, in order to avoid alternating between he, his, ... and she, her, ..., we decided to make the players genderless and use the pronouns it, its, ... throughout the book. This will sound strange at times but in the end, the
discussions of players’ behaviors will be consistent.
We assume that the Itô process used to specify the state dynamics is driven by an $M$-dimensional Wiener process $W = (W_t)_{0\le t\le T}$ defined on a probability space $(\Omega,\mathcal{F},\mathbb{P})$, $\mathbb{F} = (\mathcal{F}_t)_{0\le t\le T}$ being the completion of the natural filtration of the Wiener process.
We denote by $A^1,\cdots,A^N$ the sets of actions that players $1,\cdots,N$ can take at any point in time. The sets $A^i$ are typically compact metric spaces or subsets of a Euclidean space, say $A^i\subset\mathbb{R}^{k_i}$, and we denote by $\mathcal{A}^i = \mathcal{B}(A^i)$ their Borel $\sigma$-fields, where, throughout the book, $\mathcal{B}(E)$ denotes the Borel $\sigma$-field on any metric space $(E,d)$. We use the notation $\mathbb{A}^{(N)}$ for the set of admissible strategy profiles. The elements $\boldsymbol\alpha$ of $\mathbb{A}^{(N)}$ are $N$-tuples $\boldsymbol\alpha = (\alpha^1,\cdots,\alpha^N)$ where each $\alpha^i = (\alpha^i_t)_{0\le t\le T}$ is a progressively measurable $A^i$-valued process. Most often, these individual strategies will have to satisfy extra conditions (e.g., measurability and integrability constraints). These conditions change with the application. In most of the cases considered in this book, we assume that these constraints can be defined player by player, independently of each other. To be more specific, we often assume that $\mathbb{A}^{(N)} = \mathbb{A}^1\times\cdots\times\mathbb{A}^N$ where, for each $i\in\{1,\cdots,N\}$, $\mathbb{A}^i$ is the space of control strategies which are deemed admissible to player $i$, irrespective of what the other players do. In most applications, $\mathbb{A}^i$ will be a space of $A^i$-valued progressively measurable processes $\alpha^i = (\alpha^i_t)_{0\le t\le T}$, either bounded, or satisfying an integrability condition such as $\mathbb{E}\int_0^T |\alpha^i_t|^2\,dt < \infty$. We will also add measurability conditions specifying the kind of information each player can use in order to choose its course of action at any given time. Finally, we shall use the notation $A^{(N)} = A^1\times\cdots\times A^N$ for the set of actions $\alpha = (\alpha^1,\cdots,\alpha^N)$ available to all the players at any given time.

For each choice of strategy profile $\boldsymbol\alpha = (\alpha_t)_{0\le t\le T}\in\mathbb{A}^{(N)}$, it is assumed that the time evolution of the state $X = X^{\boldsymbol\alpha}$ of the system satisfies:
$$\begin{cases} dX_t = B(t,X_t,\alpha_t)\,dt + \Sigma(t,X_t,\alpha_t)\,dW_t, & 0\le t\le T,\\ X_0 = x_0. \end{cases} \tag{2.1}$$
Assumption (Games).
(A1) For all $S\in[0,T]$, the function $[0,S]\times\Omega\times\mathbb{R}^D\times A^{(N)}\ni(t,\omega,x,\alpha)\mapsto (B,\Sigma)(t,\omega,x,\alpha)$ is $\mathcal{B}([0,S])\otimes\mathcal{F}_S\otimes\mathcal{B}(\mathbb{R}^D)\otimes\mathcal{B}(A^{(N)}) \big/ \mathcal{B}(\mathbb{R}^D)\otimes\mathcal{B}(\mathbb{R}^{D\times M})$-measurable.
(A2) There exists a constant $c>0$ such that, for all $t\in[0,T]$, $\omega\in\Omega$, $x,x'\in\mathbb{R}^D$ and $\alpha,\alpha'\in A^{(N)}$:
$$\bigl|(B,\Sigma)(t,\omega,x,\alpha) - (B,\Sigma)(t,\omega,x',\alpha')\bigr| \le c\bigl(|x-x'| + |\alpha-\alpha'|\bigr).$$
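Nothing in the sequel depends on it, but the following sketch indicates how the controlled dynamics (2.1) are typically simulated under a given strategy profile; the drift, volatility, and feedback functions below are hypothetical stand-ins chosen to satisfy the Lipschitz bound in (A2):

```python
import numpy as np

rng = np.random.default_rng(2)
D, M, N_players = 2, 2, 3
dt, T = 0.01, 1.0

def B(t, x, alpha):                    # hypothetical Lipschitz drift
    return -x + alpha.sum(axis=0)

def Sigma(t, x, alpha):                # hypothetical Lipschitz volatility (D x M)
    return 0.3 * np.eye(D, M)

def phi(i, t, x):                      # hypothetical feedback function of player i
    return -0.1 * (i + 1) * x

def euler_maruyama(x0):
    x, t = np.array(x0, dtype=float), 0.0
    while t < T:
        alpha = np.stack([phi(i, t, x) for i in range(N_players)])
        dW = rng.normal(0.0, np.sqrt(dt), size=M)
        x = x + B(t, x, alpha) * dt + Sigma(t, x, alpha) @ dW
        t += dt
    return x

print(euler_maruyama([1.0, -1.0]))
```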
Remark 2.1 Gender of the Players. This book is about mathematical models,
their theories and solutions. Practical applications are used as motivations for the
introduction of models, and numerical results are given as illustrations of the power
and the limitations of the methods developed in the book. Given this emphasis, we do
not feel that political correctness should be an issue forcing us to choose the way we
address the individual players, as he or she. Clearly, their biological genders have
no bearing on what we are interested in, and keeping track of grammatical genders
can only be a hindrance and a distraction. So as already stated in the warning at the
beginning of this introductory section, we decided that, for the sake of definiteness,
we shall refer to the individuals involved in the game models we study as neutral
from a grammatical standpoint. As a result, we shall treat the players as genderless.
The spaces of processes used below are defined as follows:
$$\mathbb{H}^{2,n} = \Bigl\{ Z\in\mathbb{H}^{0,n} \,:\ \mathbb{E}\int_0^T |Z_t|^2\,dt < \infty \Bigr\},$$
where $\mathbb{H}^{0,n}$ stands for the collection of all $\mathbb{R}^n$-valued progressively measurable processes on $[0,T]$. The set $\mathbb{H}^{2,n}$ is a Hilbert space for the inner product obtained by polarization of the double integral appearing in the definition. We shall also denote by $\mathbb{S}^{2,n}$ the space of all the continuous processes $S = (S_t)_{0\le t\le T}$ in $\mathbb{H}^{0,n}$ such that $\mathbb{E}[\sup_{0\le t\le T}|S_t|^2] < +\infty$, the square root of this quantity providing $\mathbb{S}^{2,n}$ with a norm which we shall use repeatedly in the sequel.
Quite often, the state of the system is the mere aggregation of private states of individual players, so that $X_t = (X^1_t,\cdots,X^N_t)$ where $X^i_t\in\mathbb{R}^{d_i}$ can be interpreted as the private state of player $i\in\{1,\cdots,N\}$. Here $D = d_1+\cdots+d_N$ and, consequently, $\mathbb{R}^D = \mathbb{R}^{d_1}\times\cdots\times\mathbb{R}^{d_N}$. Moreover, the dynamics of the private states will be assumed to be given by stochastic differential equations driven by separate Wiener processes $W^i = (W^i_t)_{0\le t\le T}$ which are most often assumed to be independent of each other, even though we already saw many examples in Chapter 1 for which this assumption failed to hold. For us, the typical example for which this assumption fails will be given by models with random shocks which are common to all the players. We shall identify these examples as games with a common noise. We focus on these models in Chapters (Vol II)-2 and (Vol II)-3. Typically for these models, we assume that the state dynamics are of the form:
$$dX^i_t = b^i(t,X_t,\alpha_t)\,dt + \sigma^i(t,X_t,\alpha_t)\,dW^i_t + \sigma^0(t,X_t,\alpha_t)\,dW^0_t, \qquad 0\le t\le T,$$
where the coefficients $b^i$ and $\sigma^i$, for $i\in\{1,\cdots,N\}$, satisfy the same assumptions as before, and $\sigma^0$ satisfies the same assumptions as the $(\sigma^i)_{1\le i\le N}$'s. The Wiener processes $W^i$ with $i = 1,\ldots,N$ represent idiosyncratic random shocks while $W^0$ is used to model what we call the common noise. It is important to notice that, even in the absence of a common noise (i.e., when $\sigma^0\equiv 0$), these $N$ dynamical equations are coupled by the fact that all the private states and all the actions enter into the coefficients of these $N$ equations.
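The role of $W^0$ is easy to visualize with a two-line change to a standard Monte Carlo loop: in the hypothetical linear model below, each private state receives its own shock plus the shared one, and the terminal states are markedly correlated across players:

```python
import numpy as np

rng = np.random.default_rng(3)
N, dt, T, n_paths = 5, 0.01, 1.0, 2000
sigma, sigma0 = 0.2, 0.4                    # hypothetical idiosyncratic / common volatilities

X = np.zeros((n_paths, N))
for _ in range(int(T / dt)):
    dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, N))   # idiosyncratic shocks
    dW0 = rng.normal(0.0, np.sqrt(dt), size=(n_paths, 1))  # common noise, shared by all
    X += -X * dt + sigma * dW + sigma0 * dW0

print(np.corrcoef(X[:, 0], X[:, 1])[0, 1])  # positive: players correlated via W^0
```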
The common noise models described above are not the most general (see the
discussion in the Notes & Complements at the end of Chapter 2 in Volume II).
They are the ones we introduce in Chapter (Vol II)-2 and solve in Chapter (Vol II)-3
despite the fact that, as we saw in Chapter 1, and especially in the section on macro-
economic models, some of the instances of common noise are not covered by this
additive intervention of the common noise term.
The popularity of the formulation described in this subsection is due to the ease
with which we can define the information structures and admissible strategy profiles
of some specific games of interest. For example, in a game where each player
can only use the information of the state of the system at time t when making a
strategic decision at that time, the admissible strategy profiles will be of the form $\alpha^i_t = \phi^i(t,X_t)$ for some deterministic function $\phi^i$ which we often call a feedback function. These strategies are said to be closed loop in feedback form, or Markovian. Moreover, if the information which can be used by player $i$ at time $t$ can only depend upon the state of player $i$ at time $t$, then the admissible strategy profiles will be of the form $\alpha^i_t = \phi^i(t,X^i_t)$. Such strategies are usually called distributed.
Under assumption Games, we assume further that each player faces instantaneous (running) and terminal costs. So for each $i\in\{1,\cdots,N\}$, player $i$ has cost coefficients $f^i : [0,T]\times\Omega\times\mathbb{R}^D\times A^{(N)}\to\mathbb{R}$ and $g^i : \Omega\times\mathbb{R}^D\to\mathbb{R}$, and its expected total cost is:
$$J^i(\boldsymbol\alpha) = \mathbb{E}\Bigl[\int_0^T f^i(t,X_t,\alpha_t)\,dt + g^i(X_T)\Bigr],$$
where we implicitly assume that the expectation in the above right-hand side is well defined. For instance, this is the case if the cost coefficients are at most of quadratic growth in $x$, uniformly in the other variables, and, for any $\boldsymbol\alpha\in\mathbb{A}^{(N)}$:
$$\mathbb{E}\int_0^T |f^i(t,0,\alpha_t)|\,dt < \infty.$$
Notice that, in the general situation considered here, the cost to a given player depends upon the strategies used by the other players indirectly through the values of the state $X_t$ over time, but also directly as the specific actions $\alpha^j_t$ taken by the other players may appear explicitly in the expression of the running cost $f^i$ of player $i$.
While the notion of Pareto optimality is natural in problems of optimal allocation of resources and, as a result, very popular in the economic literature and in operations research applications, as explained in Chapter 1, we shall use the notion of optimality associated with the concept of Nash equilibrium. For the sake of convenience, we repeat a definition already stated in Chapter 1: a strategy profile $\hat{\boldsymbol\alpha} = (\hat\alpha^1,\cdots,\hat\alpha^N)\in\mathbb{A}^{(N)}$ is a Nash equilibrium if, for every $i\in\{1,\cdots,N\}$ and every $\alpha^i\in\mathbb{A}^i$:
$$J^i(\hat{\boldsymbol\alpha}) \le J^i\bigl((\alpha^i,\hat{\boldsymbol\alpha}^{-i})\bigr),$$
where $(\alpha^i,\hat{\boldsymbol\alpha}^{-i})$ stands for the strategy profile $(\hat\alpha^1,\cdots,\hat\alpha^{i-1},\alpha^i,\hat\alpha^{i+1},\cdots,\hat\alpha^N)$, in which player $i$ chooses the strategy $\alpha^i$ while the others, indexed by $j\in\{1,\cdots,N\}\setminus\{i\}$, keep the original ones $\hat\alpha^j$.
The existence and uniqueness (or lack thereof) of Nash equilibria, as well as the properties of the corresponding optimal strategy profiles, strongly depend upon the information structures available to the players, and the types of actions they are allowed to take. In particular, the above definition of a Nash equilibrium can only make sense once we have properly defined the nature of the frozen strategies $\hat{\boldsymbol\alpha}^{-i}$ in the Nash condition: it is indeed necessary to specify how the players compute (we could even say "update") their strategies when one of them uses $\alpha^i$ instead of $\hat\alpha^i$. So rather than referring to a single game with several information structures and admissible strategy profiles for the players, we shall often talk about models, e.g., the open loop model for the game or the closed loop model, or even the Markovian model for the game. We give precise definitions below.

This arcane definition is best understood when the filtration $\mathbb{F}$ is generated by the Wiener process $W$, except possibly for the presence of independent events in $\mathcal{F}_0$. In this case, the strategy profiles used in an open loop game model are given by controls of the form:
$$\alpha^i_t = \phi^i\bigl(t, X_0, W_{[0,t]}\bigr), \qquad 0\le t\le T,$$
for measurable functions $\phi^i$.
The strategy profile $\hat{\boldsymbol\alpha} = ((\hat\alpha^1_t)_{0\le t\le T},\cdots,(\hat\alpha^N_t)_{0\le t\le T})$, where each strategy $\hat\alpha^\ell$ is of the form $\hat\alpha^\ell_t = \hat\phi^\ell(t,X_0,W_{[0,t]})$ for some measurable function $\hat\phi^\ell$ as above, is an open loop Nash equilibrium if each time a player $i\in\{1,\cdots,N\}$ uses a different strategy $\alpha^i$, given by a function $\phi^i$ possibly different from $\hat\phi^i$, while the other players keep using the same functions $\hat\phi^j$ for $j\ne i$, then this player $i$ is not better off, in the sense that $J^i(\hat{\boldsymbol\alpha}) \le J^i((\alpha^i,\hat{\boldsymbol\alpha}^{-i}))$.

A similar definition can be used when the functions $\hat\phi^\ell$ and $\phi^i$ fail to depend upon the Wiener process $W$, in which case the controls take the form:
$$\alpha^i_t = \phi^i(t, X_0).$$
This leads to the notion of deterministic open loop Nash equilibrium. In some sense, this merely amounts to redefining the sets $\mathbb{A}^i$ and $\mathbb{A}^{(N)}$ of admissible controls and strategy profiles as sets of deterministic processes, i.e., functions which do not depend upon $W$.

Definitions 2.4 and 2.5 are consistent with the definitions of open loop equilibria used in the standard literature on deterministic games. They accommodate models with and without sources of randomness.
$X = (X_t)_{0\le t\le T}$ is now the solution of the same state equation (2.1) in which we use the control $\alpha_t$ given by:
$$\alpha_t = \bigl(\hat\phi^1(t,X_{[0,t]}),\cdots,\hat\phi^{i-1}(t,X_{[0,t]}),\phi^i(t,X_{[0,t]}),\hat\phi^{i+1}(t,X_{[0,t]}),\cdots,\hat\phi^N(t,X_{[0,t]})\bigr).$$
This definition may seem rather cumbersome and pedantic, but we chose to spell
out the details needed to understand the subtle differences between open and closed
loop equilibria.
Important Differences.
It is crucial to realize the major differences between the notions of open and closed loop Nash equilibria. When checking that a strategy profile $\hat{\boldsymbol\alpha} = (\hat\alpha^1,\cdots,\hat\alpha^N)$ is an open loop equilibrium, the fact that a given player $i$ changes its strategy from $\hat\alpha^i$ to $\alpha^i$ does not affect the strategies $\hat\alpha^j$ for $j\ne i$ of the other players. This comes from the fact that the open loop controls $\hat\alpha^j_t$ are functions of the trajectories of the
Wiener process and, as long as $j\ne i$, they do not change when player $i$ changes its own function $\phi^i$ of the Wiener process. Even if this change affects the state of the system, it does not change the strategies $\hat\alpha^j$ of the other players.
However, things are different in the case of closed loop equilibria. Indeed, if player $i$ changes its strategy from $\hat\alpha^i = (\hat\phi^i(t,X_{[0,t]}))_{0\le t\le T}$ to a new control strategy $\alpha^i = (\phi^i(t,X_{[0,t]}))_{0\le t\le T}$, this change is likely to affect the trajectory of the state of the system, and even if the other players $j\ne i$ still use the same feedback functions $\hat\phi^j$, their controls $\hat\alpha^j = (\hat\phi^j(t,X_{[0,t]}))_{0\le t\le T}$ will change because of the changes in the value of the state!

To put it another way, the prescription that the players other than $i$ keep using the same feedback functions $\hat\phi^j$ to compute their controls allows their controls to take into account the new state of the system if player $i$ changes its strategy.

Also, notice that writing $\alpha^i_t = \phi^i(t,X_{[0,t]})$ supposes that player $i$ has perfect observation of the state of the whole system. In many practical applications, it has only partial observation, meaning that player $i$ can only observe (possibly indirectly) the states of some of the players in the system.
Finally, we give the definition of closed loop Nash equilibria in feedback form. In the search for a closed loop equilibrium in feedback form, the strategy profiles are given by controls of the form:
$$\alpha^i_t = \phi^i(t, X_0, X_t), \qquad 0\le t\le T,$$
and $X = (X_t)_{0\le t\le T}$ is the solution of the same state equation (2.1) in which we use:
$$\alpha_t = \bigl(\hat\phi^1(t,X_0,X_t),\cdots,\hat\phi^{i-1}(t,X_0,X_t),\phi^i(t,X_0,X_t),\hat\phi^{i+1}(t,X_0,X_t),\cdots,\hat\phi^N(t,X_0,X_t)\bigr).$$
Remark 2.8 It is important to emphasize that in open loop models, when a player
makes its decisions, it may not be able to take into account the plays of its opponents
since its decision can only be a function of the history of the random shocks. On the
other hand, in closed loop models, past plays impact the values of the state, and
in that way, become part of the common knowledge of the players. Though less
realistic, open loop equilibria are more mathematically tractable than closed loop
equilibria. Indeed, players need not consider how their opponents would react to
deviations from the equilibrium path. With this in mind, one should expect that when
the impact of players on their opponents’ costs/rewards is small, open loop and
closed loop equilibria should be the same. It is often conjectured that this should
be the case for large games. We shall see instances of this claim at the end of the
chapter in the large N limit of the Linear Quadratic (LQ) models for flocking with
ˇ D 0, and for systemic risk introduced in Subsections 1.5.1 and 1.3.1 of Chapter 1.
More generally, we shall address this question in a more systematic way in Chapter
(Vol II)-6.
For each player $i\in\{1,\cdots,N\}$, we define its Hamiltonian as the function $H^i : [0,T]\times\mathbb{R}^D\times\mathbb{R}^D\times\mathbb{R}^{D\times M}\times A^{(N)}\to\mathbb{R}$ defined by:
$$H^i(t,x,y,z,\alpha) = B(t,x,\alpha)\cdot y + \mathrm{trace}\bigl[\Sigma(t,x,\alpha)^\dagger z\bigr] + f^i(t,x,\alpha). \tag{2.8}$$
Pay attention that, at some point below, the inner product $\mathrm{trace}[\Sigma(t,x,\alpha)^\dagger z]$ in the space of matrices is just denoted by $\Sigma(t,x,\alpha)\cdot z$.

When the volatility is uncontrolled, namely when $\Sigma$ is independent of $\alpha$, we usually do not include the second term in the above right-hand side, and talk about the reduced Hamiltonian instead.
Definition 2.9 We say that the generalized Isaacs (min-max) condition holds if there exists a function:
$$\hat\alpha : [0,T]\times\mathbb{R}^D\times(\mathbb{R}^D)^N\times(\mathbb{R}^{D\times M})^N \ni (t,x,y,z) \mapsto \hat\alpha(t,x,y,z)\in A^{(N)}$$
satisfying, for every $i\in\{1,\cdots,N\}$, and for all $t\in[0,T]$, $x\in\mathbb{R}^D$, $y = (y^1,\cdots,y^N)\in(\mathbb{R}^D)^N$, and $z = (z^1,\cdots,z^N)\in(\mathbb{R}^{D\times M})^N$:
$$H^i\bigl(t,x,y^i,z^i,\hat\alpha(t,x,y,z)\bigr) \le H^i\bigl(t,x,y^i,z^i,(\alpha^i,\hat\alpha(t,x,y,z)^{-i})\bigr), \tag{2.9}$$
for all $\alpha^i\in A^i$.

Notice that in this definition, the function $\hat\alpha$ could be allowed to depend upon the random scenario $\omega\in\Omega$ if the Hamiltonians $H^i$ did. In words, this definition says that for each set of dual variables $y = (y^1,\cdots,y^N)$ and $z = (z^1,\cdots,z^N)$, for each time $t$ and state $x$ at time $t$, and possibly random scenario $\omega$, one can find a set of actions $\hat\alpha = (\hat\alpha^1,\cdots,\hat\alpha^N)$ depending on these quantities, such that, if we fix $N-1$ of these actions, say $\hat\alpha^{-i}$, then the remaining one $\hat\alpha^i$ minimizes the $i$-th Hamiltonian in the sense that:
$$\hat\alpha^i \in \underset{\alpha^i\in A^i}{\arg\inf}\ H^i\bigl(t,x,y^i,z^i,(\alpha^i,\hat\alpha^{-i})\bigr), \qquad \text{for all } i\in\{1,\cdots,N\}. \tag{2.10}$$
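As a simple illustration (a hypothetical linear-quadratic specification with uncontrolled volatility, not one of the models of Chapter 1), the Isaacs condition can be verified by direct minimization:

```latex
% Take B(t,x,\alpha) = Ax + \sum_{j=1}^N C^j \alpha^j, \Sigma independent of \alpha,
% and f^i(t,x,\alpha) = \tfrac12 |\alpha^i|^2 + \ell^i(x). The reduced Hamiltonian is
\[
H^i(t,x,y^i,\alpha)
  = \Bigl( Ax + \sum_{j=1}^N C^j \alpha^j \Bigr) \cdot y^i
    + \tfrac12 |\alpha^i|^2 + \ell^i(x),
\]
% which is strictly convex in \alpha^i, with first-order condition
\[
\partial_{\alpha^i} H^i = (C^i)^\dagger y^i + \alpha^i = 0
\qquad\Longrightarrow\qquad
\hat\alpha^i(t,x,y) = -(C^i)^\dagger y^i .
\]
% The minimizer of the i-th Hamiltonian does not depend on the other players'
% actions, so \hat\alpha = (\hat\alpha^1,\dots,\hat\alpha^N) satisfies (2.10)
% for every (t,x,y), and the generalized Isaacs condition holds.
```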
Once again, the notation can be lightened slightly when the volatility is not controlled. Indeed, minimizing the Hamiltonian gives the same $\hat\alpha$ as minimizing the reduced Hamiltonian. In this case, the argument $\hat\alpha$ of the minimization is independent of $z$. So when the volatility is not controlled, we say that the generalized Isaacs (min-max) condition holds if there exists a function:
$$\hat\alpha : [0,T]\times\mathbb{R}^D\times(\mathbb{R}^D)^N \ni (t,x,y) \mapsto \hat\alpha(t,x,y)\in A^{(N)}$$
satisfying:
$$H^i\bigl(t,x,y^i,\hat\alpha(t,x,y)\bigr) \le H^i\bigl(t,x,y^i,(\alpha^i,\hat\alpha(t,x,y)^{-i})\bigr), \qquad \alpha^i\in A^i,$$
where $H^i$ stands for the reduced Hamiltonian of player $i$. Notice that we use the same letter $H$ for the full-fledged and for the reduced Hamiltonians. We are confident that, at least at this stage, there is no possible confusion between the two because of the context and the variables appearing as arguments. In particular, if there is only one adjoint variable in $H$, then $H$ should be the reduced Hamiltonian. Alternatively,
if there are more than one adjoint variable, then $H$ should be understood as the full-fledged Hamiltonian. Still, in the second volume, we shall make the distinction between the two forms of Hamiltonians by writing $H^{(r)}$ for the reduced Hamiltonian.
In many applications of interest, the coefficients of the state dynamics (2.1) depend only upon the present value $X_t$ of the state instead of the entire past $X_{[0,t]}$ of the state of the system, or of the Wiener process driving the evolution of the state. In this case, the dynamics of the state are given by a diffusion-like equation:
$$dX_t = B(t,X_t,\alpha_t)\,dt + \Sigma(t,X_t,\alpha_t)\,dW_t, \qquad 0\le t\le T,$$
for deterministic coefficients $B$ and $\Sigma$. In this setting, it is natural to use strategy profiles which are deterministic functions of time and the current value of the state to force the controlled state process to be a Markov diffusion. Furthermore, we also assume that the running and terminal cost functions $f^i$ and $g^i$ are Markovian in the sense that, like $B$ and $\Sigma$, $f^i$ and $g^i$ do not depend upon the random scenario $\omega\in\Omega$, but only upon the current values of the state and the actions taken by the players, so that $f^i : [0,T]\times\mathbb{R}^D\times A^{(N)}\ni(t,x,\alpha)\mapsto f^i(t,x,\alpha)\in\mathbb{R}$, and $g^i : \mathbb{R}^D\ni x\mapsto g^i(x)\in\mathbb{R}$ with (at most) quadratic growth. So in the case of Markovian diffusion dynamics, the cost functional is of the form:
$$J^i(\boldsymbol\alpha) = \mathbb{E}\Bigl[\int_0^T f^i(t,X_t,\alpha_t)\,dt + g^i(X_T)\Bigr], \qquad \boldsymbol\alpha\in\mathbb{A}^{(N)}, \tag{2.13}$$
and we tailor the notion of equilibrium to this situation by considering closed loop strategy profiles in feedback form which provide simultaneously Nash equilibria for all the games starting at times $t\in[0,T]$ (i.e., over the time periods $[t,T]$) and all the possible initial conditions $X_t = x$, as long as they share the same state drift and volatility coefficients $B$ and $\Sigma$, and cost functions $f^i$ and $g^i$. More precisely:
The strategy profiles used in the above definition are called Markovian strategy profiles and the deterministic functions $\phi$ and $\phi^i$ feedback functions. Extra regularity assumptions on the functions $\phi$ and $\phi^i$ may be needed for the stochastic differential equations giving the dynamics of the controlled state to have a unique strong solution. Under assumption Games, the coefficients $B$ and $\Sigma$ are Lipschitz in $(x,\alpha)$ uniformly in $t\in[0,T]$, so that assuming that the feedback function $\phi$ is locally bounded and Lipschitz in $x$ uniformly in $t$ is enough. However, in some cases, requiring that the feedback functions be Lipschitz continuous may be overkill. Indeed, using the Markovian nature of the dynamics, the stochastic differential equation for $X$ is known to be well posed, regardless of the smoothness of $\phi$, when the coefficients are bounded and the volatility is Lipschitz continuous, uniformly nondegenerate, and uncontrolled. We shall use this fact in Chapter (Vol II)-6 in order to identify an instance of uniqueness for Markovian Nash equilibria in Proposition (Vol II)-6.27.
We shall still use the same notation (2.8) for the players' Hamiltonians. However, their roles and their interpretations will be slightly different than in the search for open loop Nash equilibria. Indeed, using Markovian strategy profiles instead of merely state-insensitive adapted processes for controls will dramatically affect the dependence upon the state variable $x$. In order to illustrate this last point, we abandon momentarily the notation $\alpha$ for the control processes, and we use the notation $\phi = (\phi^1,\cdots,\phi^N)$ for deterministic functions $\phi^i : [0,T]\times\mathbb{R}^D\to A^i$ to emphasize the Markovian nature of the strategy profiles. The controlled dynamics of the state $X = X^\phi$ solve the Markovian stochastic differential equation:
$$dX_t = B\bigl(t,X_t,\phi(t,X_t)\bigr)\,dt + \Sigma\bigl(t,X_t,\phi(t,X_t)\bigr)\,dW_t, \qquad t\in[0,T].$$
Hence, the state of the system is a Markov process with infinitesimal generator $\mathcal{L}_t$ defined by:
$$\mathcal{L}_t = \frac{1}{2}\sum_{p,q=1}^D A_{pq}(t,x)\,\frac{\partial^2}{\partial x_p\,\partial x_q} + \sum_{p=1}^D B_p(t,x)\,\frac{\partial}{\partial x_p},$$
with $B_p(t,x) = B_p(t,x,\phi(t,x))$ and, at least when the components of the Wiener process $W$ are independent:
$$A_{pq}(t,x) = \sum_{\ell=1}^M \Sigma_{p\ell}\bigl(t,x,\phi(t,x)\bigr)\,\Sigma_{q\ell}\bigl(t,x,\phi(t,x)\bigr).$$
It depends upon the feedback functions $\phi^{-i}$ of the other players. It is expected to satisfy the Hamilton-Jacobi-Bellman (HJB) equation:
$$\partial_t V^i + L^i\bigl(t,x,\partial_x V^i(t,x),\partial^2_{xx} V^i(t,x)\bigr) = 0. \tag{2.15}$$
Uncontrolled Volatilities
We now argue that the fact that the feedback function $\phi^i$ is a function of $\partial_x V$ happens often when the volatility $\Sigma$ is not controlled, in which case the operator symbol $L^i$ is, up to the second order term, identical to the reduced Hamiltonian $H^i$ of player $i$. So when the second order term is independent of the controls, it is equivalent to search for a minimizer of $L^i$ or of $H^i$. Consequently, in this case, the HJB equation (2.15) can be equivalently written using the minimized reduced Hamiltonian $H^i$:
$$\partial_t V^i(t,x) + \frac{1}{2}\,\mathrm{trace}\bigl[\Sigma(t,x)\Sigma(t,x)^\dagger\,\partial^2_{xx}V^i(t,x)\bigr] + H^i\bigl(t,x,\partial_x V^i(t,x)\bigr) = 0. \tag{2.16}$$
We shall often write the HJB equation using the minimized reduced Hamiltonian as in (2.16) above.
The form of the system (2.16) can be made more explicit when the Isaacs condition is in force, see Definition 2.9. In that case, the minimizer $\hat\alpha$ only depends on $t$, $x$, and $y$, and is independent of $z$ since the volatility is not controlled. Then, the guess is that $\phi^j$ should be given by $\phi^j(t,x) = \hat\alpha^j(t,x,\partial_x V(t,x))$, in which case (2.16) would take the form:
$$\partial_t V^i(t,x) + \frac{1}{2}\,\mathrm{trace}\bigl[\Sigma(t,x)\Sigma(t,x)^\dagger\,\partial^2_{xx}V^i(t,x)\bigr] + H^i\bigl(t,x,\partial_x V^i(t,x),\hat\alpha(t,x,\partial_x V(t,x))\bigr) = 0, \qquad (t,x)\in[0,T]\times\mathbb{R}^D. \tag{2.17}$$
System (2.17) is called the Nash system associated with the game. It is especially useful as a verification tool for the existence of Markovian Nash equilibria.
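To see the Nash system in action, here is a minimal finite-difference sketch for a hypothetical symmetric two-player linear-quadratic game (the model, the grid, and the crude boundary treatment are illustrative choices, not taken from the text). Symmetry collapses the system (2.17) to a single scalar PDE which is integrated backward from the terminal condition:

```python
import numpy as np

# Hypothetical symmetric 2-player LQ game: dX = (alpha^1 + alpha^2) dt + sigma dW,
# f^i = (alpha^i)^2 / 2 + x^2 / 2, g^i = x^2 / 2. Then alpha-hat^i = -dV^i/dx and,
# by symmetry, V^1 = V^2 = V solves
#   dV/dt + (sigma^2 / 2) V_xx - (3/2) (V_x)^2 + x^2 / 2 = 0,  V(T, x) = x^2 / 2.
sigma, T = 0.5, 1.0
nx, nt = 201, 20000
x = np.linspace(-3.0, 3.0, nx)
dx, dt = x[1] - x[0], T / nt

V = 0.5 * x**2                                # terminal condition g
for _ in range(nt):                           # explicit Euler, backward in time
    Vx = np.gradient(V, dx)
    Vxx = np.zeros_like(V)
    Vxx[1:-1] = (V[2:] - 2.0 * V[1:-1] + V[:-2]) / dx**2
    V = V + dt * (0.5 * sigma**2 * Vxx - 1.5 * Vx**2 + 0.5 * x**2)
    V[0], V[-1] = V[1], V[-2]                 # crude boundary conditions

alpha_hat = -np.gradient(V, dx)               # equilibrium feedback at time 0
print(V[nx // 2], alpha_hat[nx // 2])
```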
Proposition 2.11 Assume that $\mathbb{A}^{(N)}\subset\mathbb{H}^{2,K}$, with $K = k_1+\cdots+k_N$, and that there exists a classical solution $V = (V^1,\cdots,V^N)$ of the Nash system (2.17) such that, for any given initial condition $x_0\in\mathbb{R}^D$, the stochastic differential equation:
$$dX_t = B\bigl(t,X_t,\hat\alpha(t,X_t,\partial_x V(t,X_t))\bigr)\,dt + \Sigma(t,X_t)\,dW_t, \qquad t\in[0,T],$$
with $X_0 = x_0$, is uniquely solvable, and such that, for any state process $X$ with the same initial condition and controlled by some $\boldsymbol\alpha\in\mathbb{A}^{(N)}$, the expectation:
$$\mathbb{E}\int_0^T \bigl|\Sigma^\dagger(t,X_t)\,\partial_x V(t,X_t)\bigr|^2\,dt$$
is finite. Assume moreover that, for any $i\in\{1,\cdots,N\}$ and any feedback function $\psi^i$ for player $i$, the state equation controlled by $(\psi^i(t,X_t),\hat\alpha(t,X_t,\partial_x V(t,X_t))^{-i})$, for $t\in[0,T]$ and with $X_0 = x_0$, is uniquely solvable and its solution satisfies $((\psi^i(t,X_t),\hat\alpha(t,X_t,\partial_x V(t,X_t))^{-i}))_{0\le t\le T}\in\mathbb{A}^{(N)}$ and has finite costs. Then, the feedback functions $\phi^i(t,x) = \hat\alpha^i(t,x,\partial_x V(t,x))$, $i\in\{1,\cdots,N\}$, form a Markovian Nash equilibrium.
Remark 2.12 In fact, the proof shows that $V^i(0,x_0)$ is the cost to player $i$ under the Markovian Nash equilibrium. In particular, the tuple $(V^1,\ldots,V^N)$ reads as the value function of the game under the equilibrium $(\phi^1,\cdots,\phi^N)$.
Proof. The proof is a mere application of Itô's formula. For a given $i\in\{1,\ldots,N\}$ and a given feedback function $\psi^i$ such that the SDE:
$$dX_t = B\bigl(t,X_t,(\psi^i(t,X_t),\hat\alpha(t,X_t,\partial_x V(t,X_t))^{-i})\bigr)\,dt + \Sigma(t,X_t)\,dW_t, \qquad t\in[0,T],$$
is uniquely solvable, Itô's formula yields:
$$d\Bigl[V^i(t,X_t) + \int_0^t f^i\bigl(s,X_s,(\psi^i(s,X_s),\hat\alpha(s,X_s,\partial_x V(s,X_s))^{-i})\bigr)\,ds\Bigr]$$
$$= \Bigl[H^i\bigl(t,X_t,\partial_x V^i(t,X_t),(\psi^i(t,X_t),\hat\alpha(t,X_t,\partial_x V(t,X_t))^{-i})\bigr) - H^i\bigl(t,X_t,\partial_x V^i(t,X_t),\hat\alpha(t,X_t,\partial_x V(t,X_t))\bigr)\Bigr]\,dt + dM_t,$$
for a martingale $(M_t)_{0\le t\le T}$. Taking the expectation and implementing Isaacs' condition (2.9), we deduce that:
$$\mathbb{E}\Bigl[V^i(T,X_T) + \int_0^T f^i\bigl(t,X_t,(\psi^i(t,X_t),\hat\alpha(t,X_t,\partial_x V(t,X_t))^{-i})\bigr)\,dt\Bigr] \ge V^i(0,x_0),$$
which is the desired inequality since $V^i(T,\cdot) = g^i$ and $V^i(0,x_0)$ is the cost to player $i$ under the equilibrium strategies. □
Here is a specific set of assumptions under which the conclusion of the above
proposition holds true.
The following result is taken from the Partial Differential Equation (PDE)
literature:
Proposition 2.13 Under assumption N-Nash System, the Nash system (2.17)–(2.18) has a unique solution $V$ in the space of $\mathbb{R}^N$-valued bounded and continuous functions on $[0,T]\times\mathbb{R}^D$ that are differentiable in space on $[0,T)\times\mathbb{R}^D$, with a bounded and continuous gradient on $[0,T)\times\mathbb{R}^D$, and that have generalized time first-order and space second-order derivatives in $L^p_{\mathrm{loc}}([0,T)\times\mathbb{R}^D)$, for any $p\ge 1$, where the index loc is used to indicate that integrability holds true on compact subsets only.
Moreover, for any bounded and measurable function $\phi$ from $[0,T]\times\mathbb{R}^D$ into $A^{(N)}$ and for any initial condition $x_0\in\mathbb{R}^D$, the stochastic differential equation:
$$dX_t = B\bigl(t,X_t,\phi(t,X_t)\bigr)\,dt + \Sigma(t,X_t)\,dW_t, \qquad t\in[0,T];\ X_0 = x_0,$$
is uniquely solvable.
Proof. The first part of the statement is a standard result from the PDE literature; references are given in the Notes & Complements at the end of the chapter. What really matters is the fact that the volatility coefficient is uniformly nondegenerate.

The second part of the statement is also standard in stochastic analysis. Again, references are provided in the Notes & Complements below.

The third part of the statement follows from the same argument as that used in the proof of Proposition 2.11, except that an extension of Itô's formula is needed to overcome the lack of continuity of the first order time derivative and second order space derivatives of $V$. Such an extension is covered by the so-called Itô-Krylov formula for Itô processes driven by a bounded drift and a bounded and uniformly nondegenerate volatility coefficient. Once again, references are given in the Notes & Complements. □
Observe from the Lipschitz property of $B$ and $\Sigma$ that the partial derivatives $\partial_x B$, $\partial_\alpha B$, $\partial_x\Sigma$ and $\partial_\alpha\Sigma$ are also bounded.

Notice that, since $B$ takes values in $\mathbb{R}^D$ and $x\in\mathbb{R}^D$, $\partial_x B$ is an element of $\mathbb{R}^{D\times D}$, in other words a $D\times D$ matrix whose entries are the partial derivatives of the components $B_i$ of $B$ with respect to the components $x_j$ of $x$. Analogous statements can be made concerning $\partial_x\Sigma$, which has the interpretation of a tensor.
In this section, we allow the volatility $\Sigma$ to depend upon the control parameter $\alpha\in A^{(N)}$. Also, we assume that $\mathbb{A}^{(N)} = \prod_{i=1}^N \mathbb{A}^i$ with $\mathbb{A}^i\subset\mathbb{H}^{2,k_i}$.
Definition 2.14 Given an admissible strategy profile $\boldsymbol\alpha\in\mathbb{A}^{(N)}$ and the corresponding controlled state $X = X^{\boldsymbol\alpha}$ of the system, a set of $N$ couples of processes $((Y^{i,\alpha},Z^{i,\alpha}) = (Y^{i,\alpha}_t,Z^{i,\alpha}_t)_{0\le t\le T})_{i=1,\cdots,N}$ in $\mathbb{S}^{2,D}\times\mathbb{H}^{2,D\times M}$ for each $i = 1,\cdots,N$, is said to be a set of adjoint processes associated with $\boldsymbol\alpha\in\mathbb{A}^{(N)}$ if, for each $i\in\{1,\cdots,N\}$, they satisfy the Backward Stochastic Differential Equation (BSDE):
$$\begin{cases} dY^{i,\alpha}_t = -\partial_x H^i\bigl(t,X_t,Y^{i,\alpha}_t,Z^{i,\alpha}_t,\alpha_t\bigr)\,dt + Z^{i,\alpha}_t\,dW_t, & t\in[0,T],\\ Y^{i,\alpha}_T = \partial_x g^i(X^{\boldsymbol\alpha}_T). \end{cases} \tag{2.19}$$
In the stochastic analysis literature, the bounded variation part of a BSDE (up to the minus sign) is often called the driver of the equation. We shall not use the letter $f$ for the driver because we use it for the running cost. So, for each $i\in\{1,\cdots,N\}$, existence and uniqueness of a solution follow from standard results on BSDEs. See the Notes & Complements at the end of the chapter for references and Chapter 4 for precise statements of these results.
Theorem 2.15 Under the above conditions, if $\hat{\boldsymbol\alpha}\in\mathbb{A}^{(N)}$ is an open loop Nash equilibrium, if we denote by $\hat X = (\hat X_t)_{0\le t\le T}$ the corresponding controlled state of the system, and the adjoint processes by $\hat Y = (\hat Y^1,\cdots,\hat Y^N)$ and $\hat Z = (\hat Z^1,\cdots,\hat Z^N)$, then the generalized min-max Isaacs conditions hold along the optimal paths, in the sense that, for each $i\in\{1,\cdots,N\}$:
$$H^i\bigl(t,\hat X_t,\hat Y^i_t,\hat Z^i_t,\hat\alpha_t\bigr) = \inf_{\alpha^i\in A^i} H^i\bigl(t,\hat X_t,\hat Y^i_t,\hat Z^i_t,(\alpha^i,\hat\alpha^{-i}_t)\bigr), \qquad \mathrm{Leb}_1\otimes\mathbb{P}\ \text{a.e.} \tag{2.20}$$
Proof. We only provide a sketch of the proof since we exclusively use this result as a rationale behind our search strategy for a function satisfying the min-max Isaacs condition. The proof is a consequence of the stochastic maximum principle of stochastic control, whose statement is recalled in Theorem 3.27 and which is proven in the greater generality of the control of McKean-Vlasov equations in Chapters 6 and (Vol II)-1. See for example Theorem 6.14 in Chapter 6. For a given $i\in\{1,\ldots,N\}$, in order to find the best response to the strategies $\hat{\boldsymbol\alpha}^{-i}$, we may consider the optimal control problem consisting in minimizing the cost functional $J^i(\alpha^i,\hat{\boldsymbol\alpha}^{-i})$ over control strategies $\alpha^i\in\mathbb{A}^i$ and controlled Itô processes:
$$dX_t = B\bigl(t,X_t,(\alpha^i_t,\hat\alpha^{-i}_t)\bigr)\,dt + \Sigma\bigl(t,X_t,(\alpha^i_t,\hat\alpha^{-i}_t)\bigr)\,dW_t, \qquad t\in[0,T].$$
Since $\alpha^i = \hat\alpha^i$ is a minimizer, we can use the necessary part of the standard stochastic maximum principle of stochastic control. The proof is completed by adapting the proofs of Theorems 6.14 and (Vol II)-1.59 to the current setting, under which the coefficients are random and the volatility is controlled. □
Theorem 2.16 For an admissible strategy profile $\hat{\boldsymbol\alpha}\in\mathbb{A}^{(N)}$, with $\hat X = (\hat X_t)_{0\le t\le T}$ as corresponding controlled state and $(\hat Y,\hat Z) = ((\hat Y^1,\cdots,\hat Y^N),(\hat Z^1,\cdots,\hat Z^N))$ as corresponding adjoint processes, assume that, for each $i\in\{1,\cdots,N\}$, the function $g^i$ is convex, the function $(x,\alpha^i)\mapsto H^i(t,x,\hat Y^i_t,\hat Z^i_t,(\alpha^i,\hat\alpha^{-i}_t))$ is convex $\mathrm{Leb}_1\otimes\mathbb{P}$ a.e., and:
$$H^i\bigl(t,\hat X_t,\hat Y^i_t,\hat Z^i_t,\hat\alpha_t\bigr) = \inf_{\alpha^i\in A^i} H^i\bigl(t,\hat X_t,\hat Y^i_t,\hat Z^i_t,(\alpha^i,\hat\alpha^{-i}_t)\bigr), \qquad \mathrm{Leb}_1\otimes\mathbb{P}\ \text{a.e.} \tag{2.21}$$
Then $\hat{\boldsymbol\alpha}$ is an open loop Nash equilibrium.
Proof. We fix $i\in\{1,\cdots,N\}$, a generic $\alpha^i\in\mathbb{A}^i$, and for the sake of simplicity, we denote by $X$ the state process $X^{(\alpha^i,\hat{\boldsymbol\alpha}^{-i})}$ controlled by the strategies $(\alpha^i,\hat{\boldsymbol\alpha}^{-i})$. The function $g^i$ being convex, using the form of the terminal condition of the adjoint equations and integration by parts, we get:
$$g^i(\hat X_T) - g^i(X_T) \le \partial_x g^i(\hat X_T)\cdot(\hat X_T - X_T) = \hat Y^i_T\cdot(\hat X_T - X_T)$$
$$= \int_0^T (\hat X_t - X_t)\cdot d\hat Y^i_t + \int_0^T \hat Y^i_t\cdot d(\hat X_t - X_t) + \int_0^T \mathrm{trace}\Bigl\{\bigl[\Sigma(t,\hat X_t,\hat\alpha_t) - \Sigma\bigl(t,X_t,(\alpha^i_t,\hat\alpha^{-i}_t)\bigr)\bigr]^\dagger \hat Z^i_t\Bigr\}\,dt$$
$$= -\int_0^T (\hat X_t - X_t)\cdot\partial_x H^i\bigl(t,\hat X_t,\hat Y^i_t,\hat Z^i_t,\hat\alpha_t\bigr)\,dt + \int_0^T \hat Y^i_t\cdot\bigl[B(t,\hat X_t,\hat\alpha_t) - B\bigl(t,X_t,(\alpha^i_t,\hat\alpha^{-i}_t)\bigr)\bigr]\,dt$$
$$\quad + \int_0^T \mathrm{trace}\Bigl\{\bigl[\Sigma(t,\hat X_t,\hat\alpha_t) - \Sigma\bigl(t,X_t,(\alpha^i_t,\hat\alpha^{-i}_t)\bigr)\bigr]^\dagger \hat Z^i_t\Bigr\}\,dt + M_T,$$
where $(M_t)_{0\le t\le T}$ is a martingale with $M_0 = 0$. Taking expectations of both sides and plugging the result into:
$$J^i(\hat{\boldsymbol\alpha}) - J^i(\alpha^i,\hat{\boldsymbol\alpha}^{-i}) = \mathbb{E}\Bigl[\int_0^T \bigl[f^i(t,\hat X_t,\hat\alpha_t) - f^i\bigl(t,X_t,(\alpha^i_t,\hat\alpha^{-i}_t)\bigr)\bigr]\,dt + g^i(\hat X_T) - g^i(X_T)\Bigr],$$
we get:
$$J^i(\hat{\boldsymbol\alpha}) - J^i(\alpha^i,\hat{\boldsymbol\alpha}^{-i})$$
$$= \mathbb{E}\int_0^T \bigl[H^i\bigl(t,\hat X_t,\hat Y^i_t,\hat Z^i_t,\hat\alpha_t\bigr) - H^i\bigl(t,X_t,\hat Y^i_t,\hat Z^i_t,(\alpha^i_t,\hat\alpha^{-i}_t)\bigr)\bigr]\,dt$$
$$\quad - \mathbb{E}\int_0^T \hat Y^i_t\cdot\bigl[B(t,\hat X_t,\hat\alpha_t) - B\bigl(t,X_t,(\alpha^i_t,\hat\alpha^{-i}_t)\bigr)\bigr]\,dt$$
$$\quad - \mathbb{E}\int_0^T \mathrm{trace}\Bigl\{\bigl[\Sigma(t,\hat X_t,\hat\alpha_t) - \Sigma\bigl(t,X_t,(\alpha^i_t,\hat\alpha^{-i}_t)\bigr)\bigr]^\dagger \hat Z^i_t\Bigr\}\,dt + \mathbb{E}\bigl[g^i(\hat X_T) - g^i(X_T)\bigr] \tag{2.22}$$
$$\le \mathbb{E}\int_0^T \Bigl[H^i\bigl(t,\hat X_t,\hat Y^i_t,\hat Z^i_t,\hat\alpha_t\bigr) - H^i\bigl(t,X_t,\hat Y^i_t,\hat Z^i_t,(\alpha^i_t,\hat\alpha^{-i}_t)\bigr) - (\hat X_t - X_t)\cdot\partial_x H^i\bigl(t,\hat X_t,\hat Y^i_t,\hat Z^i_t,\hat\alpha_t\bigr)\Bigr]\,dt \le 0,$$
because the above integrand is non-positive for $\mathrm{Leb}_1\otimes\mathbb{P}$ almost every $(t,\omega)\in[0,T]\times\Omega$. The last claim is easily seen by using the convexity of $H^i$ together with the fact that $\hat\alpha_t$ is a critical point. Indeed, by convexity of $A^i$ and by the generalized Isaacs condition (2.21), we have, $dt\otimes d\mathbb{P}$ a.s., for all $\alpha^i\in A^i$:
$$\bigl(\alpha^i - \hat\alpha^i_t\bigr)\cdot\partial_{\alpha^i} H^i\bigl(t,\hat X_t,\hat Y^i_t,\hat Z^i_t,\hat\alpha_t\bigr) \ge 0,$$
and the convexity of $H^i$ in $(x,\alpha^i)$ then shows that the integrand above is non-positive, which completes the proof. □
Implementation Strategy
We shall use this sufficient condition in the following manner. Under assumptions Games and Games SMP, we shall search for a deterministic function:
$$\bigl(t,x,(y^1,\cdots,y^N),(z^1,\cdots,z^N)\bigr) \mapsto \hat\alpha\bigl(t,x,(y^1,\cdots,y^N),(z^1,\cdots,z^N)\bigr)$$
defined on $[0,T]\times\mathbb{R}^D\times(\mathbb{R}^D)^N\times(\mathbb{R}^{D\times M})^N$, with values in $A^{(N)}$ and satisfying the generalized Isaacs conditions. Next, we replace the adapted controls $\alpha$ by:
$$\hat\alpha\bigl(t,X_t,(Y^1_t,\cdots,Y^N_t),(Z^1_t,\cdots,Z^N_t)\bigr)$$
both in the forward and backward equations. This creates a large FBSDE comprising a forward equation in dimension $D$, and $N$ backward equations in dimension $D$. The couplings between these equations may be highly nonlinear, and this system may be very difficult to solve. However, if we find processes $X$, $(Y^1,\cdots,Y^N)$, $(Z^1,\cdots,Z^N)$ solving this FBSDE, namely:
$$\begin{cases}
dX_t = B\bigl(t,X_t,\hat\alpha(t,X_t,(Y^1_t,\cdots,Y^N_t),(Z^1_t,\cdots,Z^N_t))\bigr)\,dt\\
\qquad\qquad + \Sigma\bigl(t,X_t,\hat\alpha(t,X_t,(Y^1_t,\cdots,Y^N_t),(Z^1_t,\cdots,Z^N_t))\bigr)\,dW_t,\\[4pt]
dY^1_t = -\partial_x H^1\bigl(t,X_t,Y^1_t,Z^1_t,\hat\alpha(t,X_t,(Y^1_t,\cdots,Y^N_t),(Z^1_t,\cdots,Z^N_t))\bigr)\,dt + Z^1_t\,dW_t,\\
\qquad\vdots\\
dY^N_t = -\partial_x H^N\bigl(t,X_t,Y^N_t,Z^N_t,\hat\alpha(t,X_t,(Y^1_t,\cdots,Y^N_t),(Z^1_t,\cdots,Z^N_t))\bigr)\,dt + Z^N_t\,dW_t,
\end{cases} \tag{2.23}$$
for $t\in[0,T]$, with the initial condition $X_0 = x_0\in\mathbb{R}^D$ for the forward equation and with the terminal conditions $Y^i_T = \partial_x g^i(X_T)$ for the backward equations, $i\in\{1,\cdots,N\}$, then, provided that the convexity assumptions in the statement of Theorem 2.16 are satisfied, the above sufficient condition says that the strategy profile $\hat{\boldsymbol\alpha}$ defined by $\hat\alpha_t = \hat\alpha(t,X_t,(Y^1_t,\cdots,Y^N_t),(Z^1_t,\cdots,Z^N_t))$, for $t\in[0,T]$, forms an open loop Nash equilibrium.

For instance, the convexity assumptions are satisfied whenever, for each $i\in\{1,\ldots,N\}$, $g^i$ is convex and, for any $(t,y,z)\in[0,T]\times(\mathbb{R}^D)^N\times(\mathbb{R}^{D\times M})^N$ and $\alpha^{-i}\in A^{-i} = \prod_{j\ne i} A^j$, the function $\mathbb{R}^D\times A^i\ni(x,\alpha^i)\mapsto H^i(t,x,y,z,(\alpha^i,\alpha^{-i}))$ is convex.
In the last two sections of the chapter, we implement this strategy in the cases of the flocking model with $\beta = 0$, and the systemic risk toy model introduced in Chapter 1. We refer to Chapter 4 for general solvability results for forward-backward systems.
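To see this strategy succeed in closed form, consider a hypothetical symmetric two-player LQ game ($dX_t = (\alpha^1_t+\alpha^2_t)\,dt + \sigma\,dW_t$, $f^i = \frac12(\alpha^i_t)^2 + \frac12 X_t^2$, $g^i = \frac12 X_T^2$; an illustration, not a model from the text). There $\hat\alpha^i = -y^i$, the backward equations have driver $-X_t$, and the ansatz $Y^i_t = \eta_t X_t$ reduces the whole FBSDE (2.23) to the scalar Riccati equation $\dot\eta_t = 2\eta_t^2 - 1$ with $\eta_T = 1$, integrated below:

```python
import numpy as np

# Backward integration of the Riccati ODE eta' = 2 eta^2 - 1, eta(T) = 1,
# coming from the ansatz Y^i_t = eta_t X_t in the FBSDE (2.23) for the
# hypothetical symmetric 2-player LQ game described above.
T, nt = 1.0, 100000
dt = T / nt
eta = 1.0                              # terminal condition Y_T = dg/dx(X_T) = X_T
for _ in range(nt):
    eta -= dt * (2.0 * eta**2 - 1.0)   # step backward in time
print(eta)                             # open loop gain at time 0: alpha-hat^i_t = -eta_t X_t
```

In this illustration the open loop gain solves $\dot\eta = 2\eta^2 - 1$ while the Markovian feedback gain obtained from the corresponding Nash system solves $\dot p = 3p^2 - 1$, a concrete instance of the difference between the two notions of equilibrium emphasized earlier.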
In the search for Markovian Nash equilibria, despite the strong appeal of the HJB equation based PDE approach reviewed earlier, we may want to use a version of the stochastic maximum principle to tackle the individual control problems entering the construction of the best response function.

If we choose to do so, we may directly invoke, in the spirit of the sketch of proof of Theorem 2.15, the usual stochastic maximum principle for standard optimal control problems. As explained in the Notes & Complements below, a detailed review of this usual version of the stochastic maximum principle is provided in the next chapters, but anticipating on the sequel, we use now some of this material. In full analogy with the derivation of the Nash system in Subsection 2.1.4, we may indeed regard a Nash equilibrium $\phi = (\phi^1,\ldots,\phi^N)$, say in closed loop feedback form, as a partial optimizer. Once the Markovian feedback functions $(\phi^j)_{j\ne i}$ of the players $j\ne i$ are given, the feedback function $\phi^i$ reads as a minimizer of the cost $J^i$ to player $i$. Throughout the subsection, we use the convenient notation $J^i(\psi^i,\phi^{-i})$
to denote the cost to player $i$ when using the Markovian feedback function $\psi^i$ while the others use the Markovian feedback functions $(\phi^j)_{j\ne i}$.

Then, the Hamiltonian associated with the minimization of the cost $J^i(\psi^i,\phi^{-i})$ over Markovian feedback functions $\psi^i$ reads:
$$H^{i,\phi}(t,x,y,z,\alpha) = B\bigl(t,x,(\alpha,\phi^{-i}(t,x))\bigr)\cdot y + \mathrm{trace}\Bigl\{\Sigma\bigl(t,x,(\alpha,\phi^{-i}(t,x))\bigr)^\dagger z\Bigr\} + f^i\bigl(t,x,(\alpha,\phi^{-i}(t,x))\bigr) = H^i\bigl(t,x,y,z,(\alpha,\phi^{-i}(t,x))\bigr), \tag{2.24}$$
for $(t,x,y,z)\in[0,T]\times\mathbb{R}^D\times\mathbb{R}^D\times\mathbb{R}^{D\times M}$. Recalling that the adjoint equation in the stochastic maximum principle is driven by the negative of $\partial_x H^{i,\phi}$, see for instance Subsection 3.3.2, the $p$-th component of the driver in the adjoint BSDE reads:
$$\partial_{x_p} H^{i,\phi}(t,x,y,z,\alpha) = \partial_{x_p} H^i\bigl(t,x,y,z,(\alpha,\phi^{-i}(t,x))\bigr) + \sum_{j=1,\,j\ne i}^N \partial_{\alpha^j} H^i\bigl(t,x,y,z,(\alpha,\phi^{-i}(t,x))\bigr)\cdot\partial_{x_p}\phi^j(t,x). \tag{2.25}$$
Notice that the last term of the above formula is not present when we use the stochastic maximum principle to search for open loop Nash equilibria. Obviously, this new term corresponds to the fact that, in closed loop equilibria, the strategies chosen by the players are sensitive to the states of the others.
Notice that the last line of the above formula is not present when we use the
stochastic maximum principle to search for open loop Nash equilibria. Obviously,
this new line corresponds to the fact that, in closed loop equilibria, the strategies
chosen by the players are sensitive to the states of the others.
Let us assume that D . 1 ; ; N / is a jointly measurable function from
Œ0; T RD into A.N/ D A1 AN which is locally bounded and differentiable
in x 2 RD , for t 2 Œ0; T fixed, with derivatives that are uniformly bounded in
.t; x/ 2 Œ0; T RD . Recalling that the drift and volatility functions B and ˙ are
Lipschitz in .x; ˛/ uniformly in t 2 Œ0; T, we denote by X the unique strong
solution of the state equation:
dXt D B t; Xt ; .t; Xt / dt C ˙ t; Xt ; .t; Xt / dWt ; t 2 Œ0; T; (2.26)
$$\begin{cases}
dY^{\phi,i}_t = -\Bigl[\partial_x H^i\bigl(t,X_t,Y^{\phi,i}_t,Z^{\phi,i}_t,\phi(t,X_t)\bigr)\\
\qquad\qquad + \displaystyle\sum_{j=1,\,j\ne i}^N \partial_{\alpha^j} H^i\bigl(t,X_t,Y^{\phi,i}_t,Z^{\phi,i}_t,\phi(t,X_t)\bigr)\,\partial_x\phi^j(t,X_t)\Bigr]\,dt\\
\qquad\qquad + Z^{\phi,i}_t\,dW_t, \qquad t\in[0,T],\\[4pt]
Y^{\phi,i}_T = \partial_x g^i(X_T).
\end{cases} \tag{2.27}$$
We shall often drop the superscript $\phi$ when no confusion is possible. Given the current assumptions on the coefficients of the model and the functions $\phi^i$, the existence and uniqueness of the adjoint processes follow from the same argument as in the open loop case.

We now state and prove the sufficient condition for the existence of a Markovian equilibrium.
and:
$$H^i\bigl(t,X_t,Y^i_t,Z^i_t,\phi(t,X_t)\bigr) = \inf_{\alpha^i\in A^i} H^i\bigl(t,X_t,Y^i_t,Z^i_t,(\alpha^i,\phi^{-i}(t,X_t))\bigr), \tag{2.28}$$
$\mathrm{Leb}_1\otimes\mathbb{P}$ a.e., where $X = X^\phi$, then $\phi$ is a Markovian Nash equilibrium.
Proof. The proof is essentially the same as in the case of open loop equilibria. The differences will become clear below. As before, we fix $i\in\{1,\cdots,N\}$ together with a generic feedback function $(t,x)\mapsto\psi(t,x)\in A^i$ and, for the sake of simplicity, we denote by $\bar X$ the solution $X^{(\psi,\phi^{-i})}$ of (2.26) with the feedback function $(\psi,\phi^{-i})$ in lieu of $\phi$. Starting from:
$$J^i(\phi) - J^i(\psi,\phi^{-i}) = \mathbb{E}\Bigl[\int_0^T \bigl[f^i\bigl(t,X_t,\phi(t,X_t)\bigr) - f^i\bigl(t,\bar X_t,(\psi(t,\bar X_t),\phi^{-i}(t,\bar X_t))\bigr)\bigr]\,dt + g^i(X_T) - g^i(\bar X_T)\Bigr], \tag{2.29}$$
we use the definition of the Hamiltonian $H^i$ to replace $f^i$ in the above expression. We get:
$$J^i(\phi) - J^i(\psi,\phi^{-i})$$
$$= \mathbb{E}\int_0^T \bigl[H^i\bigl(t,X_t,Y^i_t,Z^i_t,\phi(t,X_t)\bigr) - H^i\bigl(t,\bar X_t,Y^i_t,Z^i_t,(\psi(t,\bar X_t),\phi^{-i}(t,\bar X_t))\bigr)\bigr]\,dt$$
$$\quad - \mathbb{E}\int_0^T \bigl[B\bigl(t,X_t,\phi(t,X_t)\bigr) - B\bigl(t,\bar X_t,(\psi(t,\bar X_t),\phi^{-i}(t,\bar X_t))\bigr)\bigr]\cdot Y^i_t\,dt$$
$$\quad - \mathbb{E}\int_0^T \mathrm{trace}\Bigl\{\bigl[\Sigma\bigl(t,X_t,\phi(t,X_t)\bigr) - \Sigma\bigl(t,\bar X_t,(\psi(t,\bar X_t),\phi^{-i}(t,\bar X_t))\bigr)\bigr]^\dagger Z^i_t\Bigr\}\,dt$$
$$\quad + \mathbb{E}\bigl[g^i(X_T) - g^i(\bar X_T)\bigr]. \tag{2.30}$$
The function $g^i$ being convex:
$$g^i(X_T) - g^i(\bar X_T) \le \partial_x g^i(X_T)\cdot(X_T - \bar X_T) = Y^i_T\cdot(X_T - \bar X_T)$$
$$= \int_0^T (X_t - \bar X_t)\cdot dY^i_t + \int_0^T Y^i_t\cdot d(X_t - \bar X_t) + \int_0^T \mathrm{trace}\Bigl\{\bigl[\Sigma\bigl(t,X_t,\phi(t,X_t)\bigr) - \Sigma\bigl(t,\bar X_t,(\psi(t,\bar X_t),\phi^{-i}(t,\bar X_t))\bigr)\bigr]^\dagger Z^i_t\Bigr\}\,dt,$$
where we used the special form (2.27) of the adjoint BSDE in order to compute the bracket in the last equality. Expanding $dY^i_t$ in the third line, we obtain:
$$g^i(X_T) - g^i(\bar X_T)$$
$$\le -\int_0^T (X_t - \bar X_t)\cdot\Bigl[\partial_x H^i\bigl(t,X_t,Y^i_t,Z^i_t,\phi(t,X_t)\bigr) + \sum_{j=1,\,j\ne i}^N \partial_{\alpha^j} H^i\bigl(t,X_t,Y^i_t,Z^i_t,\phi(t,X_t)\bigr)\,\partial_x\phi^j(t,X_t)\Bigr]\,dt \tag{2.31}$$
$$\quad + \int_0^T \bigl[B\bigl(t,X_t,\phi(t,X_t)\bigr) - B\bigl(t,\bar X_t,(\psi(t,\bar X_t),\phi^{-i}(t,\bar X_t))\bigr)\bigr]\cdot Y^i_t\,dt$$
$$\quad + \int_0^T \mathrm{trace}\Bigl\{\bigl[\Sigma\bigl(t,X_t,\phi(t,X_t)\bigr) - \Sigma\bigl(t,\bar X_t,(\psi(t,\bar X_t),\phi^{-i}(t,\bar X_t))\bigr)\bigr]^\dagger Z^i_t\Bigr\}\,dt + M_T,$$
where $(M_t)_{0\le t\le T}$ is a martingale with $M_0 = 0$. Plugging this bound into (2.30), we deduce:
$$J^i(\phi) - J^i(\psi,\phi^{-i}) \le \mathbb{E}\int_0^T \Bigl[H^i\bigl(t,X_t,Y^i_t,Z^i_t,\phi(t,X_t)\bigr) - H^i\bigl(t,\bar X_t,Y^i_t,Z^i_t,(\psi(t,\bar X_t),\phi^{-i}(t,\bar X_t))\bigr)$$
$$\qquad - (X_t - \bar X_t)\cdot\Bigl(\partial_x H^i\bigl(t,X_t,Y^i_t,Z^i_t,\phi(t,X_t)\bigr) + \sum_{j=1,\,j\ne i}^N \partial_{\alpha^j} H^i\bigl(t,X_t,Y^i_t,Z^i_t,\phi(t,X_t)\bigr)\,\partial_x\phi^j(t,X_t)\Bigr)\Bigr]\,dt.$$
We conclude by using the convexity of the function $(x,\alpha^i)\mapsto h^i(x,\alpha^i)$ for $(t,\omega)$ fixed, where for the sake of notation we defined this function as:
$$h^i(x,\alpha^i) = H^i\bigl(t,x,Y^i_t(\omega),Z^i_t(\omega),(\alpha^i,\phi^{-i}(t,x))\bigr).$$
Convexity yields:
$$h^i(x,\alpha^i) - h^i(\tilde x,\tilde\alpha^i) \le (x-\tilde x)\cdot\partial_x h^i(x,\alpha^i) + (\alpha^i - \tilde\alpha^i)\cdot\partial_{\alpha^i} h^i(x,\alpha^i),$$
which we apply to $x = X_t$, $\tilde x = \bar X_t$, $\alpha^i = \phi^i(t,X_t)$ and $\tilde\alpha^i = \psi(t,\bar X_t)$. Since the minimum of the Hamiltonian is attained along the (candidate for the) optimal path, notice also that:
$$\forall\beta\in A^i, \qquad \bigl(\phi^i(t,X_t) - \beta\bigr)\cdot\partial_{\alpha^i} h^i\bigl(X_t,\phi^i(t,X_t)\bigr) \le 0.$$
Since, by the chain rule, $\partial_x h^i$ coincides with the bracket appearing in the integrand above, the convexity inequality with $\beta = \psi(t,\bar X_t)$ shows that the integrand is non-positive, so that $J^i(\phi) \le J^i(\psi,\phi^{-i})$, as desired. □
Implementation Strategy
As one can expect, the systematic use of the above sufficient condition to construct Markovian Nash equilibria is much more delicate than in the open loop case. Under assumptions Games and Games SMP, we should, as in the open loop case, search for a deterministic function $\hat\alpha$:
$$\bigl(t,x,(y^1,\cdots,y^N),(z^1,\cdots,z^N)\bigr) \mapsto \hat\alpha\bigl(t,x,(y^1,\cdots,y^N),(z^1,\cdots,z^N)\bigr)$$
defined on $[0,T]\times\mathbb{R}^D\times(\mathbb{R}^D)^N\times(\mathbb{R}^{D\times M})^N$, with values in $A^{(N)}$ and satisfying the generalized Isaacs conditions. Let us assume for example that such a function is found and that it does not depend upon $z = (z^1,\cdots,z^N)$. As in the case of the open loop models, we would like to replace the instances of the controls in the forward dynamics of the state as well as in the adjoint BSDEs by $\hat\alpha(t,X_t,(Y^1_t,\cdots,Y^N_t))$, looking for an FBSDE which could be solved. Unfortunately, while this idea was reasonable in the open loop case, it cannot be implemented in a straightforward manner for Markov games because the adjoint equations require the derivatives of the controls.

However, taking advantage of the deterministic structure of the coefficients of such an FBSDE, we may expect that there exists a smooth function $(u^1,\ldots,u^N) : [0,T]\times\mathbb{R}^D\to(\mathbb{R}^D)^N$ such that $Y^i_t = u^i(t,X_t)$ and $Z^i_t = \partial_x u^i(t,X_t)\Sigma(t,X_t)$. Such a function is called a decoupling field; we refer to the first section in Chapter 4 for a review of this notion. Basically, it is understood as the space derivative of the solution to the Nash system (2.17).

So we may want to use $\phi(t,x) = \hat\alpha(t,x,u(t,x))$, and in the BSDE giving the adjoint processes, the quantity $\partial_x\hat\alpha(t,x,u(t,x)) + \partial_y\hat\alpha(t,x,u(t,x))\,\partial_x u(t,x)$ instead of the term $\partial_x\phi(t,x)$. As in the case of the open loop models, this creates a large FBSDE which we need to solve in order to obtain a Markov Nash equilibrium. In the last two sections of the chapter, we do just that in the cases of the flocking model with $\beta = 0$ and the systemic risk toy model introduced in Chapter 1.
2.3 N-Player Games with Mean Field Interactions

In this section, we specialize the results of the first part of the chapter to the class of models at the core of the book. As explained in Chapter 1, these models are characterized by strong symmetry properties in the coefficients and cost functions, and the interactions between players need to be such that the influence of each individual player on the rest of the population disappears as the size of the game grows without bound.
Stochastic differential game models with the strong symmetry property alluded to above would require the dynamics of the private states of the $N$ players $i\in\{1,\cdots,N\}$ to be given by Itô stochastic differential equations of the form:
$$dX^i_t = b^i\bigl(t,X^i_t,X^{-i}_t,\alpha^i_t,\alpha^{-i}_t\bigr)\,dt + \sigma^i\bigl(t,X^i_t,X^{-i}_t,\alpha^i_t,\alpha^{-i}_t\bigr)\,dW^i_t. \tag{2.33}$$
So for the sake of our mathematical analysis, we shall assume that, instead of (2.33), the dynamics of the private states of the individual players are given by Itô's stochastic differential equations of the form:
$$dX^i_t = b\bigl(t,X^i_t,\bar\mu^{N-1}_{(X_t,\alpha_t)^{-i}},\alpha^i_t\bigr)\,dt + \sigma\bigl(t,X^i_t,\bar\mu^{N-1}_{(X_t,\alpha_t)^{-i}},\alpha^i_t\bigr)\,dW^i_t, \tag{2.34}$$
where:
$$\bar\mu^{N-1}_{(X_t,\alpha_t)^{-i}} = \frac{1}{N-1}\sum_{1\le j\ne i\le N} \delta_{(X^j_t,\alpha^j_t)}. \tag{2.35}$$
Remark 2.20 In most of the applications considered in the book, the private states
of the players interact through the empirical distributions of the states themselves.
However, as we saw in Chapter 1, in many applications of interest, the interactions
are built into the models through the empirical distribution of the controls, or even
through the empirical distribution of the couple of state and control as posited
in (2.34) above. These last two classes of problems are more difficult to solve than
the first one, and we shall not try to offer a systematic presentation of their analyses.
We shall only discuss these models in the limit N ! 1 in Section 4.6 of Chapter 4
where we call them extended mean field games. But as a general rule, we shall
mostly restrict our discussions of extended models to particular cases for which a
solution can be derived with a low overhead from the theoretical results we provide
for the models with interactions through the states only.
As explained in the previous section, for each player, the choice of a strategy
is driven by the desire to minimize an expected cost over a period Œ0; T, each
individual cost being a combination of running and terminal costs. Based on the
above discussion of the form of the drift and volatility coefficients in the private
state dynamics, we shall assume that for each i 2 f1; ; Ng, the running cost to
player i is given by a measurable function f W Œ0; T Rd P.Rd A/ A ! R and
the terminal cost by a measurable function g W Rd P.Rd A/ ! R in such a way
that if the N players use the strategy profile ˛ D .˛1 ; ; ˛N / 2 AN , the expected
total cost to player i is:
Z
T i N1 i
J .˛/ D E
i
f t; Xt ; N t ; ˛t dt C g.XT ; N T / ;
i N1
(2.36)
0
2.3 N-Player Games with Mean Field Interactions 97
We revisit our notation system in order to take advantage of the current emphasis
on the decomposition of the state of the system as the aggregation of the private
states of the individual players, and the strong symmetry conditions we impose on
the dynamics and the costs. In particular, we pay special attention to the players’
Hamiltonians introduced in (2.8). The state variable, denoted by the bold face letter
x, reads as an N-tuple x D .x1 ; : : : ; xN / 2 .Rd /N , describing the private states of
the players, the first dual variable y reads as an N-tuple .y1 ; : : : ; yN / 2 Œ.Rd /N N ,
each yi being itself an N-tuple .yi;1 ; : : : ; yi;N / 2 .Rd /N and the second dual variable
z reads as an N-tuple .z1 ; : : : ; zN / 2 Œ.Rdm /NN N , each zi denoting an N N-tuple
.zi;1;1 ; : : : ; zi;N;N / 2 .Rdm /NN . Finally, the control variable ˛ reads as an N-tuple
.˛ 1 ; : : : ; ˛ N / 2 AN of possible actions by the players.
With these conventions, the Hamiltonian of player i is given by:
XN
i;j
H i t; x; yi ; zi ; ˛ D .x;˛/j ; ˛ y
b t; xj ; N N1 j
jD1
X
N
i;j;j
C .x;˛/j ; ˛ z
t; xj ; N N1 j
.x;˛/i ; ˛ ;
C f t; xi ; N N1 i
jD1
1 X
N N1
.x;˛/j D ı.xi ;˛i / : (2.37)
N1
16j¤i6N
The necessary part of the stochastic maximum principle suggests the minimization
of H i when t; x; yi ; zi and ˛i are frozen. Interestingly, whenever the interaction in
the coefficients is through the state only as explained in Remark 2.20, that is N N1
.x;˛/j
and N N1
.x;˛/i are replaced by
N N1
x j and N N1
xi , the Hamiltonian has a distributed
additive structure, in the sense that each given control variable appears separately
from the others in each of the terms of the above sum. As a by-product, the partial
98 2 Probabilistic Approach to Stochastic Differential Games
optimization procedure over ˛ i whenever all the other variables t; x; yi ; zi and ˛i
are frozen reduces to the optimization problem:
i;i i;i;i
inf b t; xi ; N N1
xi ; ˛ y C .t; x ;
i
N N1
xi ; ˛ z C f t; xi ; N N1
xi ; ˛ ;
˛2A
of the type we have been dealing with so far. The symmetry assumed in this section
implies that one can focus on the Hamiltonian of one single player. By symmetry,
one can drop the superscript i in the notation of the Hamiltonian as it is enough to
consider H.t; xi ; N N1
xi
; yi;i ; zi;i;i ; ˛ i /, where H is defined as:
H t; x; ; y; z; ˛/ D b t; x; ; ˛ y C .t; x; ; ˛/ z C f .t; x; ; ˛/;
Definition 2.21 Assume that the interaction in the coefficients is through the state
only, as explained in Remark 2.20. Then, given t 2 Œ0; T, x 2 Rd , 2 P.Rd /, y 2
Rd and z 2 Rdm , we use the generic notation ˛.t; O x; ; y; z/ to denote a minimizer
of the function A 3 ˛ 7! H.t; x; ; y; z; ˛/, in other words:
If the above minimizer is well-defined, then the function Œ0; T.Rd /N ..Rd /N /N
..Rdm /NN /N 3 .t; x; .y1 ; ; yN /; .z1 ; ; zN // 7! .˛.t;
O xi ; N N1
xi
; yi;i ; zi;i;i //16i6N
satisfies the Isaacs condition, see Definition 2.9.
Remark 2.22 In the most desirable situations, the minimizer in Definition 2.21 is
uniquely defined. This is for instance the case when A is convex and the Hamiltonian
is strictly convex in the variable ˛. We shall often restrict ourselves to this situation,
although it is rather restrictive since it requires the drift b to be linear in ˛. As
already used before, whenever A is convex and H is differentiable in ˛, we have:
8ˇ 2 A; .ˇ ˛/ @˛ H t; x; ; y; z; ˛.t;
O x; ; y; z/ > 0;
When H is strictly convex, the implicit function theorem may be used to transfer the
smoothness properties of H in the directions x, , y and z into smoothness properties
of ˛.
O We shall make this important remark precise in several lemmas in the sequel.
2.3 N-Player Games with Mean Field Interactions 99
The notion of potential game introduced in Chapter 1 in the particular case of one
period models can be generalized to the setting of stochastic differential games.
Here, we specialize this notion to the case of N-player games with mean field
interactions, and we concentrate on N-player open loop games for the sake of
definiteness. We leave the discussion of potential mean field games to Chapter 6
where we emphasize the connection with the control of McKean-Vlasov stochastic
differential equations.
Recall the definitions of the running and terminal cost functions of the players
entering the definition of the stochastic differential game (2.34)–(2.36), and in
particular the fact that J i is the cost functional of player i 2 f1; ; Ng.
Definition 2.23 The game is said to be a potential game if there exists a functional
.˛1 ; ; ˛N / 7! J.˛1 ; ; ˛N /, from AN to R, satisfying
In other words, a game is a potential game if one can find a single function J of
the set of player strategy profiles, such that any change in the value of this function
J when one (and only one) strategy is perturbed, equals the corresponding change
in the cost functional J i of the player i whose strategy is perturbed, for the same
change in strategies. This special property makes it possible to replace the search for
a Nash equilibrium by the search for a minimum of this function, problem which is
usually simpler! Assume indeed that ˛O D .˛O t1 ; ; ˛O tN /06t6T is an argument of the
minimization of J. Then, one readily checks that ˛O D .˛O t1 ; ; ˛O tN /06t6T is a Nash
equilibrium for the game because
The fact that we use the same notation for different functions should not create
ambiguities. Next, we assume that the running cost function f W Œ0; TRd P.Rd
A/ A ! R is in fact of the form:
1 2 Q
f .t; x; ; ˛/ D j˛j C f .t; x; x /
2
for some function fQ W Œ0; T Rd P.Rd / ! R and where x 2 P.Rd / denotes the
marginal in x 2 Rd of 2 P.Rd A/, in other words, the projection of onto Rd .
Similarly, we assume that the terminal cost function g W Rd P.Rd A/ ! R is of
the form:
g.x; / D gQ .x; x /:
Proposition 2.24 If on top of the above assumptions on the form of the coefficients,
there exist functions Œ0; T P.Rd / 3 .t; / 7! F.t; / and P.Rd / 3 7! G./
satisfying:
fQ .t; x; N N1 Q 0
X / f .t; x ;
N N1
X /
1 N 1 N1 1 N 1 N1
D F t; ıx C N X F t; ıx0 C N X
N N N N
and
X /g
gQ .x; N N1 Q .x0 ; N N1
X /
1 N 1 N1 1 N 1 N1
DG ıx C N X G ıx0 C N X
N N N N
for every x; x0 2 Rd and X D .x1 ; ; xN1 / 2 Rd.N1/ , then the game is a potential
game and the function J defined by:
Z T h1 X
N i
1
J.˛ ; ; ˛ / D E
N
j˛ti j2 C F.t; N NXt / dt C G.N XT /
N
(2.39)
0 2 iD1
Here and in the following we use the notation N NZ for the empirical measure
of the N-tuple Z D .z1 ; ; zN / as defined by formula (1.3) in Chapter 1. As
usual, it is implicitly required that the expectation in (2.39) is well-defined for
.˛1 ; ; ˛N / 2 AN .
It is crucial to emphasize the practical importance of this seemingly innocuous
result. While it is typically very difficult to identify Nash equilibria for stochastic
2.3 N-Player Games with Mean Field Interactions 101
games, especially for large games, it appears that if the cost functions are of the
special form identified in the above statement, the search for Nash equilibria reduces
to a single optimization problem. This type of result had many applications in
economics where this single optimization is often referred to as the central planner,
or representative agent or even the invisible hand optimization.
Proof. The proof is based on a direct verification argument relying on the specific form of
the cost functions.
For a given ˛O D .˛O 1 ; ; ˛O N / 2 AN , we denote by XO D .XO t /06t6T the corresponding
state process. For a given i 2 f1; ; Ng and for another ˛i 2 A, we change ˛O i into ˛i .
i
Thanks to the special form of the forward dynamics, this does not affect XO . We then denote
i
by Xi the state process to player i associated with ˛i . Also, we let X D .Xi ; XO /. Importantly,
we can write, for any t 2 Œ0; T,
1 N 1 N1
N NXt D ıXti C N XO i :
N N t
In particular,
F.t; N NXt / F.t; N NXO / D fQ t; Xti ; N NXO i fQ t; XO ti ; N NXO i :
t t t
J.˛i ; ˛O i / J.˛/
O
Z T h i
1 i 2
DE j˛t j j˛O ti j2 C F.t; N NXt / F.t; N NXO / dt C G.N NXT / G.N NXO /
0 2 t T
Z T h
1 i 2 i
DE j˛t j j˛O ti j2 C fQ t; Xti ; N NXO i fQ t; XO ti ; N NXO i dt
0 2 t t
C gQ .XTi ; N XN1 Q .XO Ti ; N XN1
O i / g O i /
T T
i i i
D J .˛ ; ˛O / J .˛O ; ˛O /;
i i i
Notice that the last statement of the proof is specific to the open loop nature of
the problem as it does not hold any longer if the controls are closed loop.
Remark 2.25 We shall revisit this result about potential games in the case of the
asymptotic regime N ! 1 of large games. In this asymptotic regime, the search
for Nash equilibria will amount to solving for a mean field game equilibrium, and
the central planner optimization problem will reduce to the solution of an optimal
control problem for McKean-Vlasov’s stochastic differential equations. For this
reason, it is instructive to use the stochastic maximum principle to get an idea of
102 2 Probabilistic Approach to Stochastic Differential Games
what the controls forming the Nash equilibrium look like. We do just that in the
analysis of an example below. Notice that this exercise sheds some new and different
light on the above argument.
Examples.
1. The first obvious situation in which the assumptions of the above proposition are
satisfied is when there exist functions F and G such that:
Qf .t; x; N N1 1 N 1 N1
X / D F t; ıx C N X ;
N N
and
1 N 1 N1
gQ .x; N N1
X / DG ıx C N X :
N N
for some smooth even functions h.t; / and k. Indeed, if we define the functions
F and G by:
N2 N2
F.t; / D hh.t; / ; i; and G./ D hk ; i
2.N 1/ 2.N 1/
XN X
N1
1 0
2 Nh.t; 0/ C 2 h.t; x x / C
i
h.t; x x /
i j
N iD1 i;jD1
XN XN
1 0
D h.t; x x /
i
h.t; x x /
i
N 1 iD1 iD1
D fQ .t; x; N N1 Q 0
X / f .t; x ;
N N1
X /;
The reader may wonder about the scaling used in the cost coefficients F and
G, which grow linearly with N. Indeed, such a scaling may seem to contradict our
objective to investigate the asymptotic behavior of the equilibria as N tends to 1.
Actually, when it comes to potential games, what really matters is the fact that Nash
equilibria coincide with minima of some collective cost functional J. Of course,
those minima remain the same if J is replaced by J=N. In our specific case, J=N has
the right scaling: it represents the average cost to the society. We shall come back
to this question in Chapters 6 and (Vol II)-6 when we face optimal control problems
for McKean-Vlasov diffusion processes.
For the purpose of illustration, we show that, at least for this particular example,
the classical version of the Pontryagin stochastic maximum principle when applied
to the standard stochastic control problem of the minimization problem of the central
planner leads to the same FBSDE as the game version of the stochastic maximum
principle when applied to the above potential N-player game. Let us assume for
example that the dynamics of the private states of the N players are given by:
X
N
1X i2
N
1 X N
H.t; x; y; ˛/ D ˛ i yi C j˛ j C h.xi xj /;
iD1
2 iD1 2.N 1/ i;jD1
1 P
Œ j¤i k0 .XTi XT /. Note that we
j
for i D 1; ; N and t 2 Œ0; T, with YTi D N1
used the fact that h and k are even functions in the computation of the drift term and
terminal condition of Y i . Whenever h and k are convex, the Hamiltonian H is convex
˛/ (regarded as a variable in .RN /2 ) and the terminal cost .x1 ; : : : ; xN / 7!
in .x;P
1
G. N NiD1 xi / is also convex, in which case the stochastic maximum principle is not
only a necessary but also a sufficient condition of optimality, see Subsection 3.3.2
104 2 Probabilistic Approach to Stochastic Differential Games
We can check that the application of the game version of the stochastic maximum
principle gives the same result. Indeed, in the case of the N player game with fQ and
gQ as above, the (reduced) Hamiltonian of player i reads:
X
N
1 1 X
N
H i .t; x; yi ; ˛/ D ˛ j yi;j C j˛ i j2 C h.xi xj /;
jD1
2 N 1
jD1;j¤i
and the necessary part of the Pontryagin stochastic maximum principle identifies the
same candidate ˛O ti D Yti;i for the equilibrium controls, so that the forward dynam-
ics of the state .Xt /06t6T are still given by the first equation in the system (2.40):
The backward component of the FBSDE which provides a necessary condition for
any Nash equilibrium includes:
dYti;i
X
N
i;i;j j
D @xi H i t; Xt ; .Yti;1 ; ; Yti;N /; .Yti;1 ; ; Yti;N / dt C Zt dWt
jD1
1 h X 0 i i X
N N
j i;i;j j
D h .Xt Xt / dt C Zt dWt ; t 2 Œ0; T;
N1 jD1
jD1;j¤i
which is exactly the same equation as the second equation in (2.40) with the same
i;j i;i;j
exact terminal conditions if we identify Yti and Yti;i , and Zt with Zt .
Remark 2.26 As a final remark, and anticipating on the discussion of the differen-
tiability of functions of measures in Chapter 5, we emphasize the fact that a crucial
N2 Q
role is played by the identity ıF.t; /.x/ D N1 f .t; x; / -a.e. which we will prove
in Chapter 5. Here, ıF.t; / is as in (1.11). Such a derivative is a function and, in
the notation ıF.t; /.x/, this function is evaluated at x 2 R.
Linear quadratic (LQ) game models are popular because their solutions reduce to
systems of ordinary differential equations, the only nonlinearity appearing in a
matrix Riccati equation. Because these equations are not always solvable in the
multivariate case, we refrain from dwelling on a discussion of the general form of
LQ games, and instead, we restrict our attention to those linear quadratic games
2.3 N-Player Games with Mean Field Interactions 105
with mean field interactions. For such models, the dynamics of the state of player i
are given by a linear equation of the form:
dXti D b1 .t/Xti C bN 1 .t/XN ti C b2 .t/˛ti dt C dWti ; (2.41)
where as usual the .W i D .Wti /06t6T /16i6N ’s are N independent standard Wiener
processes of dimension m and where:
X
N Z
1
XN ti D
j
Xt D xdN N1
Xti
.x/
N1 R d
jD1;j¤i
denotes the sample mean of the states of the players j ¤ i. Here b1 D .b1 .t//06t6T ,
bN 1 D .bN 1 .t//06t6T and b2 D .b2 .t//06t6T are deterministic continuous functions of
t 2 Œ0; T with values in Rdd , Rdd and Rdk respectively, while is a constant
matrix in Rdm . As explained before, the volatility does not need to be constant. We
make this assumption for the sake of simplicity. In terms of the notation used in this
chapter, we have:
and
1
g.x; / D N qN .x s/
x qx C .x s/ N ;
2
where the symbol is used for the transposition so that x q.t/x stands for the
inner product x .q.t/x/ (and similarly for the others) and where q; qN ; s 2 Rdd ,
q D .q.t//06t6T , r D .r.t//06t6T , s D .s.t//06t6T and qN D .Nq.t//06t6T are
deterministic continuous functions of t 2 Œ0; T with values in Rdd , Rkk , Rdd and
Rdd respectively. Moreover, we assume that q, qN , q.t/, and qN .t/ are symmetric and
nonnegative semi-definite, which guarantees that f and g are convex in the direction
x (which, in fact, would be true under the weaker assumption that qCNq and q.t/CNq.t/
are nonnegative semi-definite), while r.t/ is assumed to be symmetric and strictly
106 2 Probabilistic Approach to Stochastic Differential Games
X
N
H i .t; x; yi ; ˛/ D b1 .t/xj C bN 1 .t/Nxj C b2 .t/˛ j yi;j
jD1
1 i
C .x / q.t/xi C .xi s.t/Nxi / qN .t/.xi s.t/Nxi / C .˛ i / r.t/˛ i ;
2
1 PN
with xN i D N1 jD1;j6Di x , for .t; x; y; ˛/ 2 Œ0; T .R / .R / .R / and
j d N d N k N
i 2 f1; ; Ng, are strictly convex in the control variables and the generalized Isaacs
conditions are satisfied with ˛O i D r.t/1 b2 .t/ yi;i . We could write the large system
of forward and backward equations by substituting these values for the controls
appearing in the forward dynamics of the states and in the adjoint equations and try
to solve the resulting high dimensional FBSDE by reducing it to a set of ordinary
differential equations and a matrix Riccati equation. Unfortunately, matrix Riccati
equations are not always well posed, and their analysis can be involved. So we
refrain from pursuing the search for solutions at the present level of generality to
avoid unnecessary technicalities.
However, we show in the last two sections of this chapter that linear quadratic
games with mean field interactions can be explicitly solvable. We substantiate this
claim by solving completely two of the models introduced in Chapter 1. We first
treat the particular case of the flocking model in the case ˇ D 0, and next, we solve
the systemic risk model.
2.4 The Linear Quadratic Version of the Flocking Model 107
As a first application of the theory developed in this chapter, we consider the special
case of the flocking model introduced in Chapter 1 corresponding to the particular
choice ˇ D 0 of the parameter. In that case, the position xti of the bird at time t
does not appear in the cost function and as a result, the mathematical analysis can
be focused on the velocity component of the state. So for the purpose of this section,
Xti 2 R3 represents the velocity at time t of bird i, and its dynamics are given by:
where the 3-dimensional standard Wiener processes W i D .Wti /06t6T are indepen-
dent for i D 1; ; N, and where > 0 is assumed to be a scalar for the sake of
simplicity. If we specialize the discussion of Subsection 1.5.1 of Chapter 1 to the
case ˇ D 0, we find that if ˛ D .˛1 ; ; ˛N / is a strategy profile for the flock, bird
i will want to minimize the expected cost:
Z T
J i .˛/ D E f i t; .Xt1 ; ; XtN /; .˛t1 ; ; ˛tN / dt ;
0
2 1
f i t; .x1 ; ; xN /; .˛ 1 ; ; ˛ N / D jxi xN j2 C j˛ i j2 ;
2 2
for x D .x1 ; ; xN / 2 R3N and ˛ D .˛ 1 ; ; ˛ N / 2 R3N , and > 0. Recall that
there is no terminal cost in the model as we stated it.
Here, we could choose to have each bird interact with the empirical distribution
of the velocities of the other birds. We decided to have each bird interact with the
empirical distribution of all the velocities, including its own. So, we use the notation
xN D .x1 C CxN /=N for the sample mean of the states of all the birds. As explained
in Remark 1.25 of Chapter 1, having bird i pay or be rewarded by the difference
between its state and the mean of the states of the other birds j ¤ i would simply
amount to multiplying the constant by a quantity which converges to 1 as N ! 1.
So for the sake of definiteness, we use the empirical mean of all the states.
We first consider the open loop equilibrium problem. From now on, we use the
terms bird and player interchangeably.
X
N
2 1
H i .x; yi ; ˛/ D ˛ j yi;j C jNx xi j2 C j˛ i j2 ;
jD1
2 2
P
where x D .x1 ; : : : ; xN / 2 R3N , xN D N1 NiD1 xi and yi D .yi;1 ; : : : ; yi;N / 2 R3N . The
value of ˛ i minimizing this reduced Hamiltonian with respect to ˛ i , when all the
other variables, including ˛ j for j ¤ i, are fixed, is given by:
Now, given an admissible strategy profile ˛ D .˛t1 ; ; ˛tN /0tT and the corre-
sponding controlled state X D X˛ , the adjoint processes associated with ˛ are the
processes Y D .Y 1 ; ; Y N / and Z D .Z1 ; ; ZN /, each Y i being .R3 /N -valued
and each Zi being .R3 /NN -valued, solving the system of BSDEs:
X
N
Zt dWt` ;
i;j i;j;`
dYt D @xj H i t; Xt ; .Yti;1 ; : : : ; Yti;N /; ˛t dt C
`D1
1 X
N
D 2 .ıi;j /.Xti XN t / dt C dWt` ;
i;j;`
Zt t 2 Œ0; T;
N
`D1
i;j
for i; j D 1; ; N, with terminal conditions YT D 0. Notice that the controls do
not appear explicitly in the adjoint equations. According to the strategy outlined
earlier, we replace all the occurrences of the controls ˛ti in the forward dynamics by
˛O i .Xt ; Yti / D Yti;i , and we try to solve the resulting system of forward-backward
equations. If we manage to do so, the strategy profile ˛ D .˛t1 ; ; ˛tN /0tT
defined by:
˛ti D ˛O i Xt ; Yti D ˛O i Xt ; .Yti;1 ; : : : ; Yti;N / D Yti;i ; t 2 Œ0; T; (2.43)
will provide an open loop Nash equilibrium. Notice indeed that the stochastic
maximum principle here provides both a necessary and sufficient condition of
equilibrium since the Hamiltonian H i is convex in .x; ˛/. In the present situation,
the system of FBSDEs reads:
8
ˆ
ˆ dX i D Yti;i dt C dWti ;
ˆ t
ˆ
< 1 XN
dYt D 2 .ıi;j /.Xti XN t / dt C Zt dWt` ;
i;j i;j;`
t 2 Œ0; T; (2.44)
ˆ
ˆ N
ˆ kD1
:̂ i;j
YT D 0; i; j D 1; : : : ; N:
2.4 The Linear Quadratic Version of the Flocking Model 109
i;j
This is a system of affine FBSDEs, so we expect Yt to be an affine function
of Xti , or equivalently, using the terminology introduced in Chapter 4, we expect
the decoupling field to be affine. In other words, we expect that the backward
components Yt will be given by an affine function of Xt . In the present situation,
since the couplings between all these equations depend only upon quantities of the
form Xti XN t , we search for a solution of the form:
i;j 1 i N
Yt D t ıi;j .Xt Xt / (2.45)
N
i;j
Therefore, computing the differential dYt from the ansatz (2.45), we get:
1 i N 1
.Xt Xt / P t 2t 1
i;j
dYt D ıi;j dt
N N
(2.47)
1X
N
1
C t .ıi;j / dWti dWt` :
N N
`D1
Identifying this differential with the right-hand side of the backward component
of (2.44) we get:
i;j;` 1 1
Zt D t ıi;j ıi;` ; ` D 1; ; N;
N N
and
1 2
Pt D 1 t 2; (2.48)
N
with terminal condition T D 0. This is a scalar Riccati equation. Since we
shall encounter this type of equation frequently in the sequel, we state a standard
existence result for the sake of future reference.
110 2 Probabilistic Approach to Stochastic Differential Games
Scalar Riccati Equations. Let us assume that A; B; and C are real numbers
such that B ¤ 0, B > 0 and BC > 0. Then, the scalar Riccati equation:
2
P t D 2A t C B t C; t 2 Œ0; T; (2.49)
Pt
ACB t D ; t 2 Œ0; T;
t
which transforms the nonlinear equation (2.49) into the second order linear ordinary
differential equation:
Rt D A2 C BC t ; t 2 Œ0; T:
The crucial point is to observe that ı C > 0 and ı < 0, so that the denominator
in (2.50) does not vanish, which excludes any possibility of blow-up. Similarly, it is
clear that the numerator does not vanish except maybe in t D T, so that the function
W Œ0; T 3 t 7! t takes values in Œ0; C1/ if > 0 and in .1; 0 if 6 0.
In the particular case at hand we find that the unique solution of the Riccati
equation (2.48) is:
r p
N e2 .N1/=N.Tt/ 1
t D p ; t 2 Œ0; T: (2.51)
N 1 e2 .N1/=N.Tt/ C 1
1
˛O ti D 1 N
t .Xt Xt /;
i
(2.52)
N
2.4 The Linear Quadratic Version of the Flocking Model 111
obtained by plugging the ansatz (2.45) into (2.43), is an open loop Nash equilibrium.
Notice that the controls (2.52) are in feedback form since they only depend upon
the current value of the state Xt at time t. Note also that in equilibrium, the state
Xt is Gaussian, and more precisely, the dynamics of the states ..Xti /06t6T /iD1 ;N of
the individual birds are given by the stochastic differential equations (2.46), which
show that the velocities of the individual birds are Ornstein-Uhlenbeck processes
mean reverting toward the sample average of the velocities in the flock.
Important Remark. The strategy profile given by (2.52) was constructed in order
to satisfy the sufficient condition for an open loop Nash equilibrium given by
the stochastic maximum principle, and as such, it is indeed an open loop Nash
equilibrium. However, even though it is in closed loop form, or even Markovian
form, there is a priori no reason, except possibly wishful thinking, to believe that it
could also be a closed loop or a Markovian Nash equilibrium, simply because of the
definition we chose of Markovian equilibria.
X
N
2 i 1
H i .x; yi ; ˛/ D j .t; x/ yi;j C ˛ yi;i C jx xN j2 C j˛j2 ;
2 2
jD1;j¤i
1
i .t; x/ D 1 t .x x
i
N /; .t; x/ 2 Œ0; T R3N ; i D 1; ; N; (2.53)
N
112 2 Probabilistic Approach to Stochastic Differential Games
for some deterministic function Œ0; T 3 t 7! t 2 R. Even though we use the same
notation . t /06t6T , this function may differ from the one identified above in the case
of open loop equilibria. Using the special form of the Hamiltonian H i , which is
convex in .x; ˛/ since is linear in x, we get as FBSDE derived from the stochastic
maximum principle for Markovian equilibria:
8
ˆ
ˆdXt D Yt dt C dWt ;
i i;i i
ˆ
ˆ
ˆ
ˆ i;j h X N
i;` i
ˆ
ˆ ` 2 1 N
ˆ
ˆ dY D @ .t; X / Y C .ı /.X i
X / dt
< t x j t t i;j t t
N
`D1;`¤i
(2.54)
ˆ
ˆ X
N
ˆ
ˆ C Zt dWt` ;
i;j;`
t 2 Œ0; T;
ˆ
ˆ
ˆ
ˆ
ˆ `D1
:̂Y i;j D 0;
T
for i; j D 1; ; N. Each Xi , each Y i;j and each Zi;j;` is R3 -valued. For the particular
choice (2.53) of feedback functions, we have:
1 1
@xj ` .t; x/ D ıj;` 1 t I3 ;
N N
where I3 denotes the 3 3 identity matrix. The backward component of (2.54) can
be rewritten as:
h 1 X
N
1 i;` 1 i N i
Yt C 2 ıi;j
i;j
dYt D 1 t ı`;j .Xt Xt / dt
N N N
`D1;`¤i
X
N
dWt` ;
i;j;`
C Zt (2.55)
`D1
for t 2 Œ0; T, and i; j D 1; ; N. For the same reasons as in the open loop case
(couplings depending only upon XN t Xti ), we make the same ansatz (2.45) on the
i;j
form of Yt , and we search for a solution of the FBSDE (2.54) in the form (2.45)
with the same function Œ0; T 3 t 7! t 2 R as in (2.53). Evaluating the right-hand
side of the BSDE part of (2.55) using the ansatz (2.45), we get:
2.4 The Linear Quadratic Version of the Flocking Model 113
i;j 1 X
N
1 1 i N
dYt D 1 t ı`;j t ıi;` Xt Xt
N N N
`D1;`¤i
1 i N X
N
C 2 ıi;j Zt dWt`
i;j;`
Xt Xt dt C
N
`D1
1 2 i N X
N
1 1
D 1 X Xt ı`;j ıi;`
N t t N N
`D1;`¤i
1 i N X
N
C 2 ıi;j Zt dWt`
i;j;`
Xt Xt dt C
N
`D1
1 1 1 i N X
N
2
C 2 ıi;j Zt dWt` ;
i;j;`
D 1 t Xt Xt dt C
N N N
`D1
where, to pass from the second to the third equality, we used the identity:
X
N
1 1 1 1
ı`;j ı`;i D ıi;j : (2.56)
N N N N
`D1;`6Di
i;j
Equating with the differential dYt obtained in (2.47) from the ansatz (remember
that (2.47) only depends upon the form of the ansatz and not on the nature of the
i;j;`
equilibrium), we get the same identification for the Zt and the following Riccati
equation for t :
1 2 2
P t D .1 / t 2; t 2 Œ0; T; (2.57)
N
with the same terminal condition T D 0. This equation is very similar, but still
different from the Riccati equation (2.48) obtained in the search for open loop
equilibria. By (2.50), we get an explicit formula for the solution. In the present
case, ı C D .1 1=N/ and ı D ı C and consequently:
N e2.11=N/.Tt/ 1
t D ; 0 6 t 6 T: (2.58)
N 1 e2.11=N/.Tt/ C 1
As in the case of the open loop problem, the equilibrium dynamics of the state .X D
.X1 ; ; XN // are given by an RP3N
-valued Ornstein-Uhlenbeck process reverting
toward the sample mean .XN t D N NjD1 Xt /06t6T .
1 j
114 2 Probabilistic Approach to Stochastic Differential Games
e2.Tt/ 1
t D ; 0 6 t 6 T; (2.59)
e2.Tt/ C 1
We now present a second example of finite player game which can be solved
explicitly, and for which the open loop and Markovian equilibria, though similar
and given by feedback functions, differ as long as the number of players remains
finite. As in the previous section, the model is of the linear quadratic type and the
interactions are of a mean field nature. The computations will be more involved than
in the previous section, but the analysis will remain very similar. Our interest in this
example lies mostly in the fact that it contains a common noise and cross terms in
the running cost function.
The example is based on the model of systemic risk introduced in Subsec-
tion 1.3.1 in Chapter 1. It is a particular case of linear quadratic game with mean
field interactions as introduced in Subsection 2.3.4 above. Like in the previous
section, we solve the model completely, and illustrate how the several versions of the
stochastic maximum principle presented in this chapter can lead to different Nash
equilibria. Moreover, we also implement the analytic approach based on the solution
of a large system of coupled Hamilton-Jacobi-Bellman partial differential equations,
if only to show that the Markovian equilibrium found in this way does coincide with
the Markov equilibrium found via the Pontryagin stochastic maximum principle.
As in the case of the flocking model analyzed earlier, we choose to work with
the form of the interaction where each player interact with the empirical mean of
the states of all the banks, as opposed to the empirical mean of the states of all the
other banks. As already explained for the flocking model, switching from one form
of interaction to the other, simply amounts to multiplying the constants of the model
by functions of N which tend to 1 as N ! 1.
2.5 The Coupled OU Model of Systemic Risk 115
In this model, we assume that the log-cash reserves Xt D .Xt1 ; ; XtN / of N banks
are Ornstein-Uhlenbeck (OU) processes reverting to their sample mean XN t D .Xt1 C
C XtN /=N at a rate a > 0. To be specific, we assume that the dynamics of the
log-reserves of the banks are given by equations of the form:
dXti D a.XN t Xti / C ˛ti dt C dBit ; i D 1; ; N; (2.60)
where:
p
dBit D 1 2 dWti C dWt0 ;
for some 2 Œ1; 1. The major fundamental difference between this model
and the flocking model considered in the previous section is the presence of the
Wiener process W 0 in the dynamics of all the log-cash reserve processes Xi . The
state processes are usually correlated through their empirical distribution, but when
¤ 0, the presence of this common noise W 0 creates an extra source of dependence.
The process ˛i is understood as the control of bank i.
In this model, bank i tries to minimize:
J i .˛1 ; ; ˛N /
Z T
N 1 c
DE .Xt Xti /2 q˛ti .XN t Xti / C j˛ti j2 dt C .XN T XTi /2 ;
0 2 2 2
X
N
1
H i .x; yi ; ˛/ D Œa.Nx xj / C ˛ j yi;j C .Nx xi /2 q˛ i .Nx xi / C .˛ i /2 ;
jD1
2 2
P
where x D .x1 ; ; xN / 2 RN , xN D N1 NiD1 xi , yi D .yi;1 ; ; yi;N / 2 RN and
˛ D .˛ 1 ; ; ˛ N / 2 RN . The value of ˛ i minimizing this reduced Hamiltonian
with respect to ˛ i , when all the other variables, including ˛ j for j ¤ i, are fixed, is
given by:
X
N
Zt dWt` ;
i;j i;j;`
dYt D @xj H i t; Xt ; .Yti;1 ; ; Yti;N /; ˛t dt C
`D0
X
N
1 i 1 N i 1
D a. ı`;j /Yt q˛t . ıi;j / C .Xt Xt /. ıi;j / dt
i;`
N N N
`D1
X
N
dWt` ;
i;j;`
C Zt t 2 Œ0; T; (2.62)
`D0
to the strategy outlined earlier, we replace all the occurrences of the controls ˛ti ,
in the forward equations giving the dynamics of the states, and in the backward
adjoint equations, by ˛O i .Xt ; Yti / D Yti;i C q.XN t Xti /. Then, we try to solve the
resulting system of forward-backward equations. If we succeed, the strategy profile
˛ D .˛t1 ; ; ˛tN /0tT defined by:
˛ti D ˛O i Xt ; .Yti;1 ; : : : ; Yti;N / D Yti;i C q.XN t Xti /; t 2 Œ0; T; (2.63)
will provide an open loop Nash equilibrium. Notice that the condition > q2
implies that H i is convex in .x; ˛/. In the present situation, the FBSDEs read:
8 h i p
ˆ
ˆ dX i
D .a C q/. XN X i
/ Y i;i
dt C dWt0 C 1 2 dWti ;
ˆ
ˆ
t t t t
ˆ
ˆ h X N
ˆ
ˆ 1 1
ˆ . ı`;j /Yti;` C qŒYti;i q.XN t Xti /. ıi;j /
i;j
ˆ
ˆ dY t D a
ˆ
ˆ N N
ˆ
< `D1
i
1
C .XN t Xti /. ıi;j / dt (2.64)
ˆ
ˆ N
ˆ
ˆ XN
ˆ
ˆ Zt dWt` ; t 2 Œ0; T;
i;j;`
ˆ
ˆ C
ˆ
ˆ
ˆ
ˆ `D0
ˆ i;j
:̂Y D c.XN X i /. 1 ı /; i; j D 1; ; N:
T T T i;j
N
i;j
Yt D N Xti /. 1 ıi;j /;
t .Xt (2.65)
N
2.5 The Coupled OU Model of Systemic Risk 117
i;j
Therefore, computing the differential dYt from the ansatz (2.65), we get:
1 h 1 i
ıi;j .XN t Xti / P t
i;j
dYt D t a C q C .1 / t dt
N N
(2.66)
p 1 X
N
1
C 1 2 t . ıi;j / dWt` dWti :
N N
`D1
Evaluating the right-hand side of the BSDE part of (2.64) using the ansatz (2.65)
we get:
XN
1 1
ı`;j t .XN t Xti /
i;j
dYt D a ı`;i
N N
`D1
1 1
Cq t XN t Xti 1 q XN t Xti ıi;j
N N
X
1 N
C XN t Xti Zt dWt` :
i;j;`
ıi;j dt C
N
`D0
X
N
1 1 1
ı`;j t .N
x xi /. ı`;i / D t .Nx xi / ıi;j : (2.67)
N N N
`D1
i;j
Identifying the two Itô decompositions of Yt given in (2.66) and (2.68) we get, as
a necessary condition for (2.65):
i;j;0 i;j;`
p 1 1
Zt D 0; Zt D 1 2 t . ıi;j /. ıi;` /; ` D 1; ; N;
N N
and
1 1
Pt t a C q C .1 / t D .a C q/ t q t C q2 ;
N N
1 1 2
P t D 2.a C q/ q t C 1 t C q2 ; (2.69)
N N
with terminal condition T D c. Under the condition > q2 , the existence result
which we recalled in the previous section says that this Riccati equation has a unique
solution, given by:
C C
. q2 / e.ı ı /.Tt/ 1 c ı C e.ı ı /.Tt/ ı
t D ; (2.70)
ı e.ıC ı /.Tt/ ı C c.1 1=N/ e.ıC ı /.Tt/ 1
with:
q p
ı˙ D a C q ˙ R;
2N
(2.71)
q 2 1
and RD aCq C 1 . q2 / > 0:
2N N
Figure 2.1 gives the plots of the solution for a few values of the parameters.
With such a function Œ0; T 3 t 7! t 2 R in hand, the sufficiency part of the
Pontryagin stochastic maximum principle given in Theorem 2.18 implies that the
strategy profile given by:
1
˛ti D q C .1 / t .XN t Xti /; t 2 Œ0; T; (2.72)
N
i;j
obtained by plugging the value (2.65) of Yt in (2.63), is an open loop Nash
equilibrium. Notice that the controls (2.72) are in feedback form since they only
depend upon the current value of the state Xt at time t. Note also that in equilibrium,
the dynamics of the state X are given by the stochastic differential equations:
2.5 The Coupled OU Model of Systemic Risk 119
1 p
dXti D a C q C .1 / t XN t Xti dt C dWt0 C 1 2 dWti ; (2.73)
N
for i D 1; ; N, which are exactly the uncontrolled versions of the equations we
started from, except for the fact that the mean reversion coefficient a is replaced by
the time dependent mean reversion rate a C q C .1 N1 / t .
Same Remark as Before. Even though the strategy profile given by (2.72) is in
closed loop form, we can only claim that it is an open loop Nash equilibrium.
X
N
H i .x; yi ; ˛/ D Œa.Nx x` / C ` .t; x/yi;` C Œa.Nx xi / C ˛yi;i
`D1;`¤i
1
C .Nx xi /2 q˛.Nx xi / C ˛ 2 ;
2 2
1
i .t; x/ D q C 1 t .Nx xi /; .t; x/ 2 Œ0; T RN ; (2.74)
N
Using formula (2.25) for the partial derivative of the Hamiltonian, we can solve
the Markovian model by means of the Pontryagin principle, which leads to the
FBSDE:
8 p
ˆ
ˆ dXti D .a C q/.XN t Xti / Yti;i dt C dWt0 C 1 2 dWti ;
ˆ
ˆ
ˆ
ˆ h X N
1 X N
ˆ
ˆ @xj ` .t; Xt /Yti;`
i;j
ˆ
ˆ dYt D a . ı`;j /Yt Ci;`
ˆ
ˆ N
ˆ
ˆ `D1 `D1;`¤i
< 1 1 i
CqŒYt q.Xt Xt /. ıi;j / C .XN t Xti /. ıi;j / dt
i;i N i
ˆ
ˆ N N
ˆ
ˆ X N
ˆ
ˆ Zt dWt` ;
i;j;`
ˆ
ˆ C t 2 Œ0; T; (2.75)
ˆ
ˆ
ˆ
ˆ `D0
ˆ i;j
:̂Y D c.XN T X i /. 1 ıi;j /; i; j D 1; ; N:
T T
N
1 1
@xj ` .t; x/ D ıj;` q C .1 / t ;
N N
and the backward component of the BSDE rewrites:
X X
i;j
N
1
N
1 1 i;`
dYt D a . ı`;j /Yt C
i;`
. ı`;j / q C t 1 Yt
N N N
`D1 `D1;`¤i
N 1 N i 1
C qŒYti;i q.Xt Xt /. ıi;j / C .Xt Xt /. ıi;j / dt
i
N N
X
N
dWt` ;
i;j;`
C Zt t 2 Œ0; T; i; j D 1; ; N: (2.76)
`D0
For the same reasons as in the open loop case (couplings depending only upon XN t
Xti ), we make the same ansatz on the form of Yt , namely Yt D t .XN t Xti /. N1 ıi;j /,
i;j i;j
and search for a solution of the FBSDE (2.75) in the form (2.65). Evaluating the
right-hand side of the BSDE part of (2.76) using the ansatz (2.65), we get:
2.5 The Coupled OU Model of Systemic Risk 121
XN
1 1
ı`;j t XN t Xti
i;j
dYt D a ı`;i
N N
`D1
X
N
1 1 1
C ı`;j q C t .1 / t XN t Xti ı`;i
N N N
`D1;`¤i
1 1
C q t XN t Xti 1 q XN t Xti ıi;j
N N
1
C XN t Xti ıi;j dt
N
X
N
dWt` ;
i;j;`
C Zt t 2 Œ0; T; i; j D 1; ; N:
`D0
X
N
1 1 1
ı`;j q C t .1 / t xN xi ı`;i
N N N
`D1;`¤i
1 1 1
D q C t .1 / ıi;j N xi :
t x
N N N
i;j
Using (2.67) to handle the first line in dYt , we get:
1 1 1
i;j
dYt D ıi;j N
.Xt Xt / .a C q/ t C
i
1 2
t
2
C q dt
N N N
X
N
dWt` :
i;j;`
C Zt
`D0
i;j
Identifying this Itô decomposition with the differential dYt obtained in (2.66) from
i;j;k
the ansatz, we get the same identification for the Zt and the following Riccati
equation for t :
1 2
P t D 2.a C q/ t C .1 / C q2 ; t 2 Œ0; T; (2.77)
N2 t
with the same terminal condition T D c as before. This equation has a unique
solution since > q2 and (2.50) gives for any t 2 Œ0; T:
C C
. q2 / e.ı ı /.Tt/ 1 c ı C e.ı ı /.Tt/ ı
t D ; (2.78)
ı e.ıC ı /.Tt/ ı C c.1 1=N 2 / e.ıC ı /.Tt/ 1
122 2 Probabilistic Approach to Stochastic Differential Games
1.0
0.65
MFG MFG
OL OL
CL CL
0.8
0.60
eta
eta
0.6
0.55
0.4
0.50
0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0
t t
Fig. 2.1 Plot of the solution t of the Riccati equations (2.69) and (2.80) for several values of the
parameters and numbers of players N increasing from 1 to 50.
with:
p 1
ı ˙ D .a C q/ ˙ R; and R D .a C q/2 C 1 . q2 / > 0: (2.79)
N2
Clearly, the function Œ0; T 3 t 7! t 2 R obtained in our search for Markov Nash
equilibria is different from the function giving the open loop Nash equilibrium found
in (2.70) and (2.71).
Notice that both functions converge toward the same limit as N ! 1, this
common limit solving the Riccati equation:
2
P t D 2.a C q/ t C t C q2 ; T D c: (2.80)
Figure 2.1 gives the plots of the solutions for the two types of equilibria and for a
few values of the parameters. We indeed observe from the plots that, as N increases,
the two functions Œ0; T 3 t 7! t 2 R decrease to their common limit as N ! 1.
In the limit of large games (N ! 1) the open loop and the closed loop (Markovian)
Nash equilibria found with the Pontryagin stochastic maximum principle coincide.
The fact that the differences between open and closed loop equilibria disappear
in the limit of large games is expected. It is part of the game theory folklore. We
will elaborate further on that limit N ! 1 in Chapter 3 when we discuss Mean
Field Games (MFGs), and at the end of the Notes & Complements section of that
chapter where we give references to papers and book chapters discussing this claim.
These references include Chapter 6 of the second volume, which is dedicated to the
passage from games with finitely many players to mean field games.
2.5 The Coupled OU Model of Systemic Risk 123
For the sake of completeness, we show that the analytic approach based on the
solution of a system of coupled partial differential equations of the Hamilton-
Jacobi-Bellman (HJB for short) type can also be implemented, and that it gives
exactly the same Markovian Nash equilibrium as the stochastic maximum principle
approach implemented in the previous subsection. Notice that the present set-up
fits the setting used in Subsection 2.1.4, with B.t; x; ˛/ D .Bi .t; x; ˛//16i6N and
˙.t; x; ˛/ D .˙i;j .t; x; ˛//16i6N;06j6NC1 , given by:
for x; ˛ 2 RN , where, as above, we use the notation xN for the mean xN D .x1 C
C xN /=N. Accordingly, the noise in (2.12) is regarded as an .N C 1/-dimensional
Wiener process W D .Wt /06t6T D .Wt0 ; Wt1 ; ; WtN /06t6T .
Recall that, given an N-tuple . i /16i6N of functions from Œ0; T R into R, we
define, for each i 2 f1; ; Ng, the related value function V i by:
V i .t; x1 ; : : : ; xN /
Z
T i N i N ˇ
ˇ
D inf E f s; Xs ; N s ; ˛s ds C g.XT ; N T / Xt D x ;
i
.˛si /t6s6T t
with x D .x1 ; : : : ; xN / 2 RN and with the same cost functions f and g as before.
Here the dynamics of .Xs1 ; : : : ; XsN /t6s6T are given by (2.60) with Xt D xj for j 2
j
@t V i .t; x/
˚ 1
C inf a.Nx xi / C ˛ @xi V i .t; x/ C ˛ 2 q˛ xN xi C .Nx xi /2
˛2R 2 2
X
N
C a.Nx xj / C j .t; xj / @xj V i .t; x/
jD1;j6Di
2 X X 2
N N
C C ıj;k .1 2 / @2xj xk V i .t; x/ D 0;
2 jD1 kD1
124 2 Probabilistic Approach to Stochastic Differential Games
for .t; x/ 2 Œ0; T RN , with the terminal condition V i .T; x/ D c.Nx xi /2 =2. The
infima in these HJB equations can be computed explicitly:
˚ 1
inf a.Nx xi / C ˛ @xi V i .t; x/ C ˛ 2 q˛ xN xi
˛2R 2
1 2
D a.Nx xi /@xi V i .t; x/ q xN xi @xi V i .t; x/ ;
2
the infima being attained for
˛O D q xN xi @xi V i .t; x/:
X
N
@t V i .t; x/ C .a C q/ xN xj @xj V j .t; x/ @xj V i .t; x/
jD1
2 X X 2
N N
(2.81)
C C ıj;k .1 2 / @2xj xk V i .t; x/
2 jD1 kD1
1 2 1 2
C . q2 / xN xi C @xi V i .t; x/ D 0;
2 2
for i D 1; ; N, with the same terminal condition as above. In (2.17), we called the
system (2.81) the Nash system of the game. If and when this system is solved, the
feedback functions i .t; x/ D q.Nx xi / @xi V i .t; x/ should give the equilibrium
Markovian strategies. Generally speaking, these systems of HJB equations are
difficult to solve. Here, because of the particular forms of the couplings and the
terminal conditions, we can solve the system by inspection, checking that a solution
can be found in the form:
.Nx xi /2 C t ;
t
V i .t; x/ D .t; x/ 2 Œ0; T RN ; (2.82)
2
1
@xj V i .t; x/ D t ıi;j xN xi ;
N
1 1
@2xj xk V i .t; x/ D t ıi;j . ıi;k /;
N N
2.6 Notes & Complements 125
and plugging these expressions into (2.81), and identifying term by term, we see
that the system of HJB equations is solved if and only if:
(
P t D 2.a C q/ t C 1 N12 2t . q2 /;
(2.83)
P t D 12 2 .1 2 / 1 N1 t ;
for t 2 Œ0; T, with the terminal conditions T D c and T D 0. As already explained
earlier, the Riccati equation is scalar and can be solved explicitly. Here it coincides
with (2.77), and following (2.78), we get:
C C
. q2 / e.ı ı /.Tt/ 1 c ı C e.ı ı /.Tt/ ı
t D ; (2.84)
ı e.ıC ı /.Tt/ ı C c.1 1=N 2 / e.ıC ı /.Tt/ 1
provided we set:
p 1
ı ˙ D .aCq/˙ R; with R D .aCq/2 C 1 2 .q2 / > 0: (2.85)
N
Once t is identified, one solves for t (remember that T D 0) and finds:
Z
1 2 1 T
t D .1 2 / 1 s ds: (2.86)
2 N t
For the record, we note that the optimal Markovian strategies read:
1
˛O ti D q XN t Xti @xi V i .t; Xt / D q C .1 / t XN t Xti ; (2.87)
N
1 p
dXti D a C q C .1 / t XN t Xti dt C 1 2 dWti C dWt0 ; (2.88)
N
for t 2 Œ0; T. As announced, we recover the solution found by the Pontryagin
stochastic maximum principle.
The main purpose of this chapter was to present background material and notation
for the analysis of finite player stochastic differential games. The published
literature on general nonzero sum stochastic differential games is rather limited,
especially in textbook form. Moreover, the terminology varies from one source
to the next. In particular, there is no clear consensus on the names to give to
126 2 Probabilistic Approach to Stochastic Differential Games
the many notions of admissibility for strategy profiles and for the corresponding
equilibria. The definitions we use in this text reflect our own personal biases. They
are borrowed from Carmona’s recent text [94]. The reader is referred to Chapter 5
of this book for proofs of the necessary part of the stochastic Pontryagin maximum
principle, and detailed discussions of linear quadratic game models and applications
to predatory trading.
The formulation of the Isaacs condition as given in Definition 2.9 is credited to
Isaacs in the case of two-player (N D 2) zero-sum games, and to Friedman in the
general case of noncooperative N-player games. Earlier results on the solvability
of the Nash system (2.17)–(2.18) in the classical or strong sense and with bounded
controls may be found in the monograph by Ladyzenskaja et al. [258] and in the
paper by Friedman [163]. We also refer to the series of papers by Bensoussan
and Frehse [45, 48, 49] for refined solvability properties and estimates for parabolic
or elliptic Nash systems allowing for Hamiltonians of quadratic growth. For other
monographs on semilinear PDEs, we refer to Friedman [162] and Lieberman [264].
The solvability property of the Nash system used in the proof of Proposition 2.13
may be explicitly found in Delarue and Guatteri [134]. The unique solvability
of the SDE appearing in the same proof is taken from the seminal work by
Veretennikov [336]. The Itô-Krylov formula is due to Krylov, see Chapter II in his
monograph [242].
The stochastic maximum principle for stochastic differential games was used
in the linear quadratic setting by Hamadène [193] and [194]. Generalizations have
been considered by several authors, among which generalizations to games with
stochastic dynamics including jumps or with partial observation. We refer the
interested reader to [22] and [23] and the references therein. For further details on
the stochastic maximum principle for stochastic optimal control problems, from
which the stochastic maximum principle for games may be derived, we refer the
reader to the subsequent Chapters 3, 4, 6, and (Vol II)-1: The standard version with
deterministic coefficients is exposed in Chapters 3 and 4, while the case with random
coefficients is addressed in Chapter (Vol II)-1; Chapter 6 is dedicated to the optimal
control of McKean-Vlasov diffusion processes.
Our reasons to present the case ˇ D 0 of the flocking model, and the systemic
risk toy model (whose discussion is based on the paper by Carmona, Fouque, and
Sun [102]), are mainly pedagogical. Indeed, in both cases, the open and closed
loop forms of the models can be solved explicitly, and the large game limits appear
effortlessly. So in this sense, they offer a perfect introduction to the discussion
of mean field games, hence our decision to present them in full detail, despite
their possible shortcomings. Indeed, the LQ form of the flocking model is rather
unrealistic, and when viewed as a model for systemic risk in an interbank system,
our toy model of systemic risk is very naive. Indeed, despite the strong case made
in [102] for the relevance of the model to systemic risk of the banking system, it
remains that according to this model, banks can borrow from each other without
2.6 Notes & Complements 127
having to repay their debts, and even worse, the case for the model is further
weakened by the fact that the liabilities of the banks are not included in the model.
As already mentioned in the Notes & Complements of Chapter 1, the realism of
the model was recently improved in [101] by Carmona, Fouque, Moussavi, and Sun
who included delayed terms in the drift of the state to account for the fact that the
decision to borrow or lend at a given time will have an impact down the road on the
ability of a bank to borrow or lend.
Stochastic Differential Mean Field Games
3
Abstract
The goal of this chapter is to propose solutions to asymptotic forms of the
search for Nash equilibria for large stochastic differential games with mean field
interactions. We implement the Mean Field Game strategy, initially developed
by Lasry and Lions in an analytic set-up, in a purely probabilistic framework.
The roads to solutions go through a class of standard stochastic control problems
followed by fixed point problems for flows of probability measures. We tackle
the inherent stochastic optimization problems in two different ways. Once
by representing the value function as the solution of a backward stochastic
differential equation (reminiscent of the so-called weak formulation approach),
and a second time using the Pontryagin stochastic maximum principle. In both
cases, the optimization problem reduces to the solutions of a Forward-Backward
Stochastic Differential Equation (FBSDE for short). The search for a fixed
flow of probability measures turns the FBSDE into a system of equations of
the McKean-Vlasov type where the distribution of the solution appears in the
coefficients. In this way, both the optimization and interaction components of
the problem are captured by a single FBSDE, avoiding the twofold reference
to Hamilton-Jacobi-Bellman equations on the one hand, and to Kolmogorov
equations on the other hand.
Here, we recall the basic results and ingredients from stochastic analysis and optimal
stochastic control theory which we use throughout the chapter. We leverage the
resources of Chapter 2 to formalize what we mean by a mean field game problem.
We consider a stochastic differential game with N players. As usual the players are
denoted by the integers i 2 f1; ; Ng. Each player is controlling its own private
state Xti 2 Rd at time t 2 Œ0; T by taking an action ˛ti in a closed convex set A Rk .
We assume that the dynamics of the private states of the individual players are given
by Itô’s stochastic differential equations of the form:
earlier discussion of finite player games with mean field interactions in which the
empirical distribution was typically assumed to be the empirical distribution of the
couples “state/control.” We saw in Chapter 1 several instances of models for which
the interactions appeared through the empirical distributions of the controls, or even
through the empirical distributions of the couples “state/control.” We shall provide
in Section 4.6 of Chapter 4 insight and tools to handle some of these more general
classes of mean field games which we call extended mean field games.
Recall that the symmetry and small individual influence conditions articulated
in Chapters 1 and 2 have been incorporated in the model through the choice of the
form of the coefficients of the states dynamics. Indeed, the dimensions of the states
and the random shocks, as well as the drift and volatility coefficients b and are the
same for all the players. Moreover, since we want the influence of the players j ¤ i
on the state of player i to be symmetric and diminishing quickly as the number of
players grows, we used the intuition behind the result of Lemma 1.2 to assume that
the coefficients are given by functions of measures. In this way, the state of player i
is influenced by the empirical distribution of the states of the other players.
Remark 3.1 Later on, we shall add a term of the form 0 .t; Xti ; N N1
Xti
; ˛ti /dWt0 to the
right-hand side of the state dynamics given by (3.1). For obvious reasons, the Wiener
process W 0 will be called a common noise as opposed to the Wiener processes W i
for i D 1; ; N which are intrinsic to the private states and called idiosyncratic
noises.
In this chapter, we concern ourselves with both open and closed loop equilibria,
without paying much attention to the differences between the two cases since, in
our framework, the asymptotic formulations are expected to be the same in the
limit N ! 1. So, whatever the type of the equilibrium, each player chooses a
3.1 Notation, Assumptions, and Preliminaries 131
Quite often, we denote by AN the product of N copies of A. We shall also use the
notation J N;i when we want to emphasize the dependence upon the number N of
players. This will be the case when we study the limit N ! 1 in Chapter (Vol II)-
6. Notice that even though only ˛ti appears in the formula giving the cost to player
i, this cost depends upon the strategies used by the other players indirectly, as these
strategies affect not only the private state Xti , but also the empirical distribution N N1
Xti
of the private states of the other players. As emphasized in Chapter 1, we restrict
ourselves to games with strong symmetry properties and our models require that
the behaviors of the players be statistically identical when driven by controls which
are statistically invariant under permutation, imposing that the running and terminal
cost functions f i and gi , like the drift and volatility coefficients, do not depend upon
i. We denote them by f and g respectively.
The final remark of this introductory subsection is related to the actual definition
of the mean field interaction between finitely many players. In accordance with
earlier discussions, the empirical measure which appears in (3.1) and (3.3) is the
j
empirical measure of the other states, namely of the variables Xt for j ¤ i. However,
as we already explained in several instances, if we were to use instead the empirical
j
measure N Nt of all the states Xt including j D i, the results would be qualitatively the
same, though possibly different quantitatively. This was highlighted in Remark 1.19
and Remark 1.25 of Chapter 1 where the net effect of switching from one empirical
measure to the other amounts to applying multiplicative factors on the parameters
of the models. We also argued that these multiplicative factors were converging to
1 as N ! 1. Since the mean field game problems which we study throughout the
book are essentially limits as N ! 1 of N-player stochastic differential games with
mean field interactions (see Chapter (Vol II)-6 for rigorous proofs), the convention
we use for the empirical measures in the finite player games should not matter in the
end. For this reason, we shall often start from the empirical measure N Nt of the states
of all the players when we motivate the formulation of mean field game models.
132 3 Stochastic Differential Mean Field Games
We now formalize the definition of the Mean Field Game problem without a
common noise. For this purpose, we start with a complete filtered probability
space .˝; F; F D .Ft /06t6T ; P/, the filtration F supporting a d-dimensional
Wiener process W D .Wt /06t6T with respect to F and an initial condition 2
L2 .˝; F0 ; PI Rd /. As announced in Subsection 3.1.1, we thus choose W of the
same dimension as the state variable. This is for convenience only. Most of the
time, the filtration F will be chosen as the filtration generated by F0 and W. As
usual, the law of is denoted by L./. From a practical point of view, 0 D L./
should be understood as the initial distribution of the population. Following the
notations introduced in the previous subsection, we shall denote by A the set of F-
progressively measurable A-valued stochastic processes ˛ D .˛t /06t6T that satisfy
the square-integrability condition (3.2).
In the present context, the mean field game problem derived from the finite player
game model introduced in the previous section is articulated in the following way:
(i) For each fixed deterministic flow D .t /06t6T of probability measures on
Rd , solve the standard stochastic control problem:
Z T
inf J .˛/ with J .˛/ D E f .t; Xt˛ ; t ; ˛t /dt C g.XT˛ ; T / ;
˛2A 0
subject to (3.4)
(
dXt˛ D b.t; Xt˛ ; t ; ˛t /dt C .t; Xt˛ ; t ; ˛t /dWt ; t 2 Œ0; T;
X0˛ D :
(ii) Find a flow D .t /06t6T such that L.XO t / D t for all t 2 Œ0; T, if XO is a
Notice that here, Xt represents the private state of a representative player, not the
whole system as before. Recasting these two steps in the set-up of finite player
games and the concept of Nash equilibrium, we see that the first step provides the
best response of a given player interacting with the statistical distribution of the
states of the other players if this statistical distribution is assumed to be given by t ,
while the second step solves a specific fixed point problem in the spirit of the search
for fixed points of the best response function. The strategy outlined by these two
steps parallels exactly what needs to be done to construct Nash equilibria for finite
player games. Once these two steps have been taken successfully, if the fixed-point
optimal control ˛O identified in step (ii) is in feedback form, in the sense that it is of
the form ˛t D .t; XO t ; t / for some deterministic function on Œ0; TRd P.Rd /,
where D .t D L.XO t //06t6T is the flow of marginal distributions at the fixed
3.1 Notation, Assumptions, and Preliminaries 133
point, we expect that the prescription ˛O ti D .t; Xti ; t /, if used by the players i D
1; ; N of a large game, should form an approximate Nash equilibrium. This fact
will be proven rigorously in Chapter (Vol II)-6, where we also quantify the accuracy
of the approximation.
Remark 3.2 Throughout the book, we shall consider the case when the optimization problem $\inf_{\alpha \in \mathbb{A}} J^\mu(\alpha)$ has a unique minimizer for any input $\mu$. In that case, we denote by $\hat X^\mu$ the unique optimal trajectory under the input $\mu$, and $\mu$ is said to be an equilibrium (or a solution of the mean field game) if $\mu_t = \mathcal{L}(\hat X^\mu_t)$ for all $t \in [0,T]$. When the optimization problem $\inf_{\alpha \in \mathbb{A}} J^\mu(\alpha)$ has several solutions, the two steps (i) and (ii) may be reformulated as follows. Denoting the set of minimizing controls by $\hat{\mathbb{A}}^\mu = \mathrm{argmin}_{\alpha \in \mathbb{A}} J^\mu(\alpha)$, $\mu$ is said to be an equilibrium if there exists $\hat\alpha \in \hat{\mathbb{A}}^\mu$ such that, for all $t \in [0,T]$, $\mu_t = \mathcal{L}(X^{\hat\alpha}_t)$. However, we shall not consider this level of generality in the book.
If our goal is to study the limiting MFG problem rather than to solve the finite player games from which it is issued, an alternative introduction of the problem may be useful. We motivate this approach by the limit of finite player games, but it should be understood that the finite player games we are about to introduce are different from the games we started from to derive the MFG problem.
We framed the search for Nash equilibria as a search for fixed points of the best response map. We exploit this point of view systematically throughout the book, and our formulation of the mean field game problem was strongly influenced by this approach. It naturally leads to a search for fixed points on flows of probability measures. The present discussion will remain informal, as we do not spend much effort providing explicit definitions of all the objects we manipulate. As before, we assume that the $N$ players use controls $(\alpha^i)_{i=1,\dots,N}$ given by deterministic feedback functions $(\phi^i)_{i=1,\dots,N}$. We shall denote by $\bar\mu^N_t$ the empirical distribution of $X_t = (X^1_t, \dots, X^N_t)$ at time $t$, and assume that the controls used by the players are of the form $\alpha^i_t = \phi(t, X^i_t)$ for a common deterministic feedback function $\phi$. We then search for the best response, say $\bar\alpha$, of a virtual $(N+1)$-th player (which we will refer to using the index 0), interacting with the empirical distribution of the states of the $N$ players. In other words:

• Given the feedback function $\phi$, we solve the system of $N$ stochastic differential equations:
$$dX^i_t = b\big(t, X^i_t, \bar\mu^N_t, \phi(t, X^i_t)\big)\,dt + \sigma\big(t, X^i_t, \bar\mu^N_t, \phi(t, X^i_t)\big)\,dW^i_t, \quad t \in [0,T],$$
and we treat $X^1_t, \dots, X^N_t$ as the states of the $N$ players of the game.
• We compute the flow $(\bar\mu^N_t)_{0 \le t \le T}$ of empirical measures of the states $X^1_t, \dots, X^N_t$.
• Given the flow $(\bar\mu^N_t)_{0 \le t \le T}$ of (random) empirical measures of the $X^1_t, \dots, X^N_t$ just computed, we now introduce a virtual player which looks for a control $\alpha^0$ given by a feedback function $\phi^0$, which bears to $\alpha^0$ the same relationship as $\phi$ bears to $\alpha^i$ for $i = 1, \dots, N$, in order to minimize:
$$J(\alpha^0) = \mathbb{E}\Big[ \int_0^T f\big(t, X^0_t, \bar\mu^N_t, \alpha^0_t\big)\,dt + g\big(X^0_T, \bar\mu^N_T\big) \Big].$$
We now explain how, in the limit $N \to \infty$ of large games, the above steps lead to the mean field game paradigm introduced in the previous subsection.
First we notice that the first two bullet points do not involve any optimization. Anticipating on the several discussions of the propagation of chaos which we give in Chapters 4, 5, (Vol II)-2 and (Vol II)-7, we realize that when $N \to \infty$, the $(X^i)_{i=1,\dots,N}$ become independent of each other, and their marginal distributions converge toward the law of the solution $\tilde X = (\tilde X_t)_{0 \le t \le T}$ of the McKean-Vlasov equation:
$$d\tilde X_t = b\big(t, \tilde X_t, \mathcal{L}(\tilde X_t), \phi(t, \tilde X_t)\big)\,dt + \sigma\big(t, \tilde X_t, \mathcal{L}(\tilde X_t), \phi(t, \tilde X_t)\big)\,dW_t, \quad t \in [0,T].$$
In this sense, in the limit $N \to \infty$, the role of the first two bullet points is to associate, with each control $\alpha$ or feedback function $\phi$, a flow $\mu = (\mu_t)_{0 \le t \le T}$ of probability measures given by the marginal laws $\mu_t = \mathcal{L}(\tilde X_t)$ of the solution of the above McKean-Vlasov stochastic differential equation.
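The convergence asserted here is easy to probe numerically. The sketch below (with a hypothetical drift and feedback, not a model from the text) simulates the particle system of the first bullet point for increasing $N$; the empirical statistics stabilize as $N$ grows, in line with propagation of chaos.

```python
# Particle approximation of a McKean-Vlasov SDE: N particles interact through
# their empirical mean. Hypothetical choices for illustration:
# feedback phi(t, x) = -x and drift b(t, x, mu, a) = a - (x - mean(mu)).
import numpy as np

def simulate(N, T=1.0, n_steps=100, seed=1):
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    phi = lambda t, x: -x                    # hypothetical feedback function
    X = rng.normal(0.0, 1.0, size=N)         # i.i.d. initial states
    for k in range(n_steps):
        drift = phi(k * dt, X) - (X - X.mean())   # interaction via empirical mean
        X = X + drift * dt + np.sqrt(dt) * rng.normal(size=N)
    return X

for N in (10, 100, 10_000):
    X = simulate(N)
    print(f"N={N:6d}: empirical mean {X.mean():+.3f}, std {X.std():.3f}")
```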
Once this flow of measures is obtained, the third bullet point proposes a standard optimal control problem (still not a game) in which a virtual player minimizes its expected costs in interaction with the flow of distributions. This optimization problem is exactly the same as (3.4) except for the fact that the input flow of probability measures $\mu = (\mu_t)_{0 \le t \le T}$ is not arbitrary. Instead, this flow is given by the marginal laws of the solution of a McKean-Vlasov stochastic differential equation whose coefficients are determined by the choice of a control $\alpha$ or a function $\phi$. Clearly, if the fixed point problem can be solved, its solution provides a solution to the mean field game problem as articulated in the previous subsection. Conversely, any solution to the mean field game problem of the previous subsection provides a solution to the problem stated in the above bullet points.
The present formulation of the mean field game problem exhibits two useful features.
for $t \in [0,T]$, $x, y \in \mathbb{R}^d$, $z \in \mathbb{R}^{d \times d}$, $\alpha \in A$ and $\mu \in \mathcal{P}(\mathbb{R}^d)$. Above, the dots '$\cdot$' stand for the inner products in $\mathbb{R}^d$ and $\mathbb{R}^{d \times d}$ respectively.
In order to lighten the notation and avoid unwanted technicalities at this early stage of the discussion, we also assume throughout the chapter (and actually throughout most of the book) that the volatility is uncontrolled (i.e., does not depend upon the value of the control). In other words we assume that:
$$\sigma(t, x, \mu, \alpha) = \sigma(t, x, \mu).$$
In fact, for some of the derivations in this chapter, we shall sometimes assume that the volatility is also independent of $\mu$ or, even, that it is a constant matrix $\sigma \in \mathbb{R}^{d \times d}$.
The fact that the volatility is uncontrolled allows us to use the reduced Hamiltonian defined as:
$$H(t, x, \mu, y, \alpha) = b(t, x, \mu, \alpha) \cdot y + f(t, x, \mu, \alpha),$$
for $t \in [0,T]$, $x, y \in \mathbb{R}^d$, $\mu \in \mathcal{P}(\mathbb{R}^d)$ and $\alpha \in A$, together with its minimizer:
$$\hat\alpha(t, x, \mu, y) \in \mathrm{argmin}_{\alpha \in A} H(t, x, \mu, y, \alpha). \tag{3.6}$$
Proof. For any given $(t, x, \mu, y)$, the function $A \ni \alpha \mapsto H(t, x, \mu, y, \alpha)$ is once continuously differentiable and strictly convex, so that $\hat\alpha(t, x, \mu, y)$ appears as the unique solution of the variational inequality:
$$\forall \beta \in A, \quad \big(\beta - \hat\alpha(t, x, \mu, y)\big) \cdot \partial_\alpha H\big(t, x, \mu, y, \hat\alpha(t, x, \mu, y)\big) \ge 0. \tag{3.11}$$
so that:
$$\big|\beta_0 - \hat\alpha(t, x, \mu, y)\big| \le \frac{1}{2\lambda}\Big( \big|\partial_\alpha f(t, x, \mu, \beta_0)\big| + |b_2(t)|\,|y| \Big),$$
and consequently:
$$\big|\hat\alpha(t, x, \mu, y)\big| \le \frac{1}{2\lambda}\Big( \big|\partial_\alpha f(t, x, \mu, \beta_0)\big| + |b_2(t)|\,|y| \Big) + |\beta_0|,$$
which proves the local boundedness claim since $\beta_0$ is arbitrary, $\partial_\alpha f$ is locally bounded, and $b_2$ is bounded.
The smoothness of $\hat\alpha$ with respect to $x$ and $y$ follows from a suitable adaptation of the implicit function theorem to variational inequalities driven by coercive functionals. Indeed, for $x, x', y, y' \in \mathbb{R}^d$ and $(t, \mu) \in [0,T] \times \mathcal{P}_2(\mathbb{R}^d)$, we have the two inequalities:
$$\big(\hat\alpha(t, x', \mu, y') - \hat\alpha(t, x, \mu, y)\big) \cdot \partial_\alpha H\big(t, x, \mu, y, \hat\alpha(t, x, \mu, y)\big) \ge 0,$$
$$\big(\hat\alpha(t, x, \mu, y) - \hat\alpha(t, x', \mu, y')\big) \cdot \partial_\alpha H\big(t, x', \mu, y', \hat\alpha(t, x', \mu, y')\big) \ge 0,$$
that is:
$$\big(\hat\alpha(t, x', \mu, y') - \hat\alpha(t, x, \mu, y)\big) \cdot \Big[ \partial_\alpha H\big(t, x, \mu, y, \hat\alpha(t, x', \mu, y')\big) - \partial_\alpha H\big(t, x, \mu, y, \hat\alpha(t, x, \mu, y)\big) \Big]$$
$$\le \big(\hat\alpha(t, x', \mu, y') - \hat\alpha(t, x, \mu, y)\big) \cdot \Big[ \partial_\alpha H\big(t, x, \mu, y, \hat\alpha(t, x', \mu, y')\big) - \partial_\alpha H\big(t, x', \mu, y', \hat\alpha(t, x', \mu, y')\big) \Big].$$
Exchanging the roles of $\alpha$ and $\alpha'$ in (3.9) and summing the resulting bounds, we check that for any $\alpha, \alpha' \in A$:
$$\big(\alpha' - \alpha\big) \cdot \big(\partial_\alpha f(t, x, \mu, \alpha') - \partial_\alpha f(t, x, \mu, \alpha)\big) \ge 2\lambda\,|\alpha' - \alpha|^2.$$
Using the two previous inequalities together with the fact that $b$ is linear in $\alpha$, we deduce that:
$$2\lambda\,\big|\hat\alpha(t, x', \mu, y') - \hat\alpha(t, x, \mu, y)\big|^2$$
$$\le \big(\hat\alpha(t, x', \mu, y') - \hat\alpha(t, x, \mu, y)\big) \cdot \Big[ \partial_\alpha H\big(t, x, \mu, y, \hat\alpha(t, x', \mu, y')\big) - \partial_\alpha H\big(t, x', \mu, y', \hat\alpha(t, x', \mu, y')\big) \Big]$$
$$\le C\,\big|\hat\alpha(t, x', \mu, y') - \hat\alpha(t, x, \mu, y)\big|\,\big(|x' - x| + |y' - y|\big),$$
where $C$ only depends upon the bound for $b_2$ and the Lipschitz constant of $\partial_\alpha f$ as a function of $x$. $\square$
Remark 3.4 Various generalizations of Lemma 3.3 to cases for which $b_2$ is allowed to depend upon $x$ and $\mu$ are possible. For the sake of simplicity, we refrain from giving such generalizations as we shall most often focus on the case where $b_2$ is a function of the sole variable $t$. Indeed, in this case, the whole drift $b$ in (3.8) is Lipschitz continuous in the variable $x$ whenever $b_1$ is itself Lipschitz continuous in $x$. This assumption on $b_2$ may be rather restrictive for some practical applications. In order to consider more general models, the reader may want to reformulate some of the results proven in this chapter (and the next one) in the more general case where $b_2$ is a function of $(t, x, \mu)$ (or of $(t, \mu)$). In doing so, he/she must pay particular attention to the regularity of the whole drift $b$.
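When $f$ is quadratic in $\alpha$ and $b$ is linear, the minimizer in (3.6) is explicit, which gives a convenient cross-check of Lemma 3.3. The snippet below compares a numerical minimization of the reduced Hamiltonian with the closed form; the data ($\lambda$, $b_2$, and the simplifying choice $A = \mathbb{R}$) are hypothetical.

```python
# Minimizer of the reduced Hamiltonian H(t,x,mu,y,alpha) = b . y + f:
# hypothetical data with b(t,x,mu,a) = b1(t)x + b2(t)a and f quadratic,
# f = lam * |a|^2 (up to terms not depending on a), so that
# alpha_hat(t,x,mu,y) = -b2(t)^T y / (2 lam), Lipschitz in y as in Lemma 3.3.
import numpy as np
from scipy.optimize import minimize

lam = 0.7
b2 = np.array([[1.0], [0.5]])                # d = 2, one-dimensional control
y = np.array([0.3, -1.2])

H = lambda a: float((b2 @ a) @ y + lam * (a @ a))   # H as a function of alpha
numeric = minimize(H, x0=np.zeros(1)).x
closed = -(b2.T @ y) / (2 * lam)
print(numeric, closed)                       # agree up to solver tolerance
```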
Going back to the program (i)-(ii) articulated in Subsection 3.1.2, the first step consists in solving a standard stochastic control problem when the deterministic flow $\mu = (\mu_t)_{0 \le t \le T}$ of probability measures is given and frozen. A natural route is to express the value function of the optimization problem (3.4) as the solution of the corresponding Hamilton-Jacobi-Bellman (HJB for short) equation. This is the keystone of the analytic approach to the MFG theory, the matching problem (ii) being resolved by coupling the HJB equation with a Kolmogorov equation intended to identify the flow $\mu = (\mu_t)_{0 \le t \le T}$ with the flow of marginal distributions of the optimal states. With the same notation $\hat\alpha$ as above for the minimizer of the reduced Hamiltonian $H$, the resulting system of PDEs can be written as:
$$\begin{cases}
\displaystyle \partial_t V(t,x) + \frac{1}{2}\mathrm{trace}\Big[ \sigma\sigma^\dagger(t, x, \mu_t)\,\partial^2_{xx} V(t,x) \Big] + H\big(t, x, \mu_t, \partial_x V(t,x), \hat\alpha(t, x, \mu_t, \partial_x V(t,x))\big) = 0, \\[2mm]
\displaystyle \partial_t \mu_t - \frac{1}{2}\mathrm{trace}\Big[ \partial^2_{xx}\big( \sigma\sigma^\dagger(t, x, \mu_t)\,\mu_t \big) \Big] + \mathrm{div}_x\Big( b\big(t, x, \mu_t, \hat\alpha(t, x, \mu_t, \partial_x V(t,x))\big)\,\mu_t \Big) = 0,
\end{cases} \tag{3.12}$$
in $[0,T] \times \mathbb{R}^d$, with $V(T, \cdot) = g(\cdot, \mu_T)$ as terminal condition for the first equation and $\mu_0 = \mathcal{L}(\xi)$ as initial condition for the second (recall (3.4) for the meaning of $\mu_0$).
The first equation is the HJB equation of the stochastic control problem when the flow $\mu = (\mu_t)_{0 \le t \le T}$ is frozen, see for instance Lemma 4.47. Notice that, as pointed out earlier, this equation can be written using the reduced Hamiltonian $H$ instead of the usual minimized operator symbol because the volatility is not controlled and because $H$ is assumed to have a minimizer. The existence of $\hat\alpha$ is especially useful as it provides the form of the optimal feedback function, which reads $[0,T] \times \mathbb{R}^d \ni (t, x) \mapsto \hat\alpha(t, x, \mu_t, \partial_x V(t,x))$. The second equation is the Kolmogorov (sometimes referred to as the Fokker-Planck) equation giving the time evolution of the flow $\mu = (\mu_t)_{0 \le t \le T}$ of measures dictated by the dynamics (3.4) of the state of the system once we have implemented the optimal feedback function. These two equations are coupled by the fact that the Hamiltonian appearing in the HJB equation is a function of the measure $\mu_t$ at time $t$ and the drift appearing in the Kolmogorov equation is a function of the gradient of the value function $V$. Notice that the first equation is a backward equation to be solved from a terminal condition, while the second equation is forward in time, starting from an initial condition.
The resulting system thus reads as a two-point boundary value problem, notorious for being difficult to solve. In other words, the system (3.12) is nothing but a forward-backward deterministic differential system in infinite dimension. From experience with the analysis of forward-backward stochastic differential systems in finite dimension, we expect that a Cauchy-Lipschitz like theory, when it can be applied, will only provide solutions in small time. One of the major difficulties of mean field games is to identify sufficient conditions under which existence and/or uniqueness of a solution hold over a time-interval of arbitrary length. Moreover, it is also to be expected that, for systems of the same type as (3.12), ellipticity of the diffusion matrix cannot suffice to decouple the two equations as the forward component is entirely deterministic. On this last point, we refer to Subsection 3.2.3 below for a more detailed account.
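To see the forward-backward structure at work, here is a crude one-dimensional finite-difference sketch of (3.12): the HJB equation is marched backward for a frozen flow, the Fokker-Planck equation is marched forward with the resulting feedback, and the flow is updated by damped fixed-point iteration. The model ($\sigma = 1$, quadratic cost in $\alpha$ so that $\hat\alpha = -\partial_x V$, and a local coupling equal to the density) is a hypothetical choice for illustration; convergence of the sweep is not guaranteed in general.

```python
# Damped fixed-point sweep on the coupled HJB / Fokker-Planck system (3.12),
# d = 1, sigma = 1, alpha_hat = -dV/dx (hypothetical quadratic-cost model).
import numpy as np

L, nx, nt, T = 4.0, 81, 400, 1.0
x = np.linspace(-L, L, nx); dx = x[1] - x[0]; dt = T / nt

def lap(u):                                   # discrete Laplacian (interior points)
    v = np.zeros_like(u); v[1:-1] = (u[2:] - 2 * u[1:-1] + u[:-2]) / dx**2; return v

def dxc(u):                                   # centered first derivative
    v = np.zeros_like(u); v[1:-1] = (u[2:] - u[:-2]) / (2 * dx); return v

mu0 = np.exp(-(x + 1.0)**2); mu0 /= mu0.sum() * dx
mu = np.tile(mu0, (nt + 1, 1))                # frozen flow, initial guess
g = 0.5 * x**2                                # terminal cost

for sweep in range(40):
    # (i) backward HJB for frozen mu:  dV/dt + 0.5 V_xx - 0.5 (V_x)^2 + mu = 0
    V = g.copy(); alpha = np.zeros((nt + 1, nx)); alpha[nt] = -dxc(V)
    for k in range(nt - 1, -1, -1):
        V = V + dt * (0.5 * lap(V) - 0.5 * dxc(V)**2 + mu[k + 1])
        alpha[k] = -dxc(V)
    # (ii) forward Fokker-Planck with the optimal feedback
    m = mu0.copy(); mu_new = np.empty_like(mu); mu_new[0] = m
    for k in range(nt):
        m = m + dt * (0.5 * lap(m) - dxc(alpha[k] * m))
        m = np.clip(m, 0.0, None); m /= m.sum() * dx
        mu_new[k + 1] = m
    err = np.abs(mu_new - mu).max()
    mu = 0.5 * (mu + mu_new)                  # damped fixed-point update
    if err < 1e-5:
        break
print(f"fixed-point residual after {sweep + 1} sweeps: {err:.2e}")
```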
As we shall see next, the crux of our approach is to recast the infinite dimensional
deterministic forward-backward system (3.12) into a finite dimensional stochastic
forward-backward system of the McKean-Vlasov type. The fact that the probabilis-
tic point of view yields a finite dimensional system should not be a surprise. The
infinite dimensional feature is in fact hidden in the McKean-Vlasov component.
strategies. This search for the best response amounts to the solution of an optimal control problem whereby the typical player seeks a control strategy in order to minimize its expected cost, assuming that all the other players have chosen their own strategies and play the game without deviating from these choices. If we work with models of stochastic differential games with mean field interactions, such an optimal control problem can be written in the same form as in (3.4) (with, in full generality, $\sigma$ possibly depending on $\alpha$):
$$\inf_{\alpha \in \mathbb{A}} \mathbb{E}\Big[ \int_0^T f(t, X_t, \mu_t, \alpha_t)\,dt + g(X_T, \mu_T) \Big]$$
subject to
$$dX_t = b(t, X_t, \mu_t, \alpha_t)\,dt + \sigma(t, X_t, \mu_t, \alpha_t)\,dW_t, \quad t \in [0,T]; \qquad X_0 = \xi. \tag{3.13}$$
Here, $X_t$ is the private state of the typical player, $\alpha_t$ the action it chooses to take at time $t$, and $\mu_t$ represents the impact of the strategies chosen by the other players. For the purpose of this control problem, $\mu_t$ is regarded as an input: it is fixed. We saw in Chapter 2 that, in the case of games with finitely many players, $\mu_t$ should be the empirical distribution of the states (or the actions, or both states and actions) of the other players. This can be viewed as a random measure with finite support. In the case of mean field games, $\mu_t$ is typically deterministic. It is the marginal distribution at time $t$ of the state of a generic player in the population. However, in the case of mean field games with a common noise discussed in Chapter (Vol II)-2, $\mu_t$ is a random measure representing the conditional marginal distribution of a generic state given the realization of the common noise. In that case, and as already alluded to in Remark 3.1, the state dynamics appearing in the stochastic control problem (3.13) contain an extra diffusion term involving the common noise.
The solutions of the optimal control problems leading to the best responses of
individual players form an important component of the search for Nash equilibria.
However, they are not the whole story. Since Nash equilibria are the fixed points
of the best response function, the second step needs to be the search for fixed
points of this function, in accordance with step (ii) in the program articulated in
Subsection 3.1.2. In our context, this will involve the search for particular flows of probability measures $\mu = (\mu_t)_{0 \le t \le T}$ which, if used as input, need to be recovered as output.
While the analytic methods discussed in the previous section have underpinned
the first works on mean field games, our contention is that a probabilistic approach
should bring new insights and allow for more general and possibly less regular
models to be solved. The purpose of this section is to explain why and how
Forward-Backward Stochastic Differential Equations (FBSDEs) appear naturally in
the solutions of the mean field game problems, and to develop the tools necessary
for their analyses.
We first consider the optimal control step of the formulation of the mean field game problems described earlier. Probabilists have a two-pronged approach to these optimal control problems. We proceed to their descriptions when the input $\mu = (\mu_t)_{0 \le t \le T}$ is deterministic and fixed. We refer the reader to Chapter (Vol II)-1 for a review of the corresponding optimization problem in a random environment given by a stochastic input. As announced, we assume that the volatility function appearing in the state dynamics part of the stochastic control problem (3.13) is independent of the control $\alpha$.
1. The first method is closer in spirit to the analytic approach based on the Hamilton-Jacobi-Bellman (HJB) equation derived from the dynamic programming principle. The crux of this method is to give a probabilistic representation of the value function of the optimization problem as the solution of a Backward Stochastic Differential Equation (BSDE). Assuming that the volatility $\sigma$ is an invertible matrix, this BSDE reads:
$$dY_t = -H\big(t, X_t, \mu_t, \sigma(t, X_t, \mu_t)^{-1,\dagger} Z_t, \hat\alpha_t\big)\,dt + Z_t \cdot dW_t, \quad t \in [0,T], \tag{3.14}$$
with terminal condition $Y_T = g(X_T, \mu_T)$. In (3.14), the process $Y = (Y_t)_{0 \le t \le T}$ is scalar valued while $Z = (Z_t)_{0 \le t \le T}$ takes values in $\mathbb{R}^d$. For that reason, the stochastic integration is denoted under the form of an inner product. This is in contrast with the notation used when $Y$ is vector valued, in which case the stochastic integration is written under the form of a matrix multiplication. Moreover, $\sigma(t, X_t, \mu_t)^{-1}$ is the inverse of $\sigma(t, X_t, \mu_t)$ and $\sigma(t, X_t, \mu_t)^{-1,\dagger}$ denotes its transpose. Also, the function $\hat\alpha$ is the minimizer of the Hamiltonian in the sense that:
$$\hat\alpha_t = \hat\alpha\big(t, X_t, \mu_t, \sigma(t, X_t, \mu_t)^{-1,\dagger} Z_t\big).$$
Lemma 3.3 provides conditions for existence, uniqueness, and regularity of such a function $\hat\alpha$. A first remark is that, even before we replace $\alpha_t$, the state $X_t$ appears in the driver of the BSDE (3.14), by which we mean the coefficient, up to the minus sign, in front of the $dt$. So this BSDE has random coefficients. But if, as we require, the player has to use the specific control $\hat\alpha_t$ given by the function $\hat\alpha$ for the system to be at the optimum, then the term $Z_t$ appears in the forward dynamics of the state of the control problem (3.13). The equations giving $dX_t$ and $dY_t$ are now strongly coupled, and instead of a BSDE with random coefficients (due to the dependence of the driver upon $X_t$), we now need to solve a fully coupled FBSDE, whose coefficients are deterministic and depend upon the measure flow $\mu = (\mu_t)_{0 \le t \le T}$.
2. The second prong of the probabilistic approach is based on the extension of the Pontryagin maximum principle to the control of stochastic differential equations. It does not require invertibility of the volatility matrix $\sigma$, but it requires differentiability of the coefficients. It is based on a probabilistic representation of the derivative of the value function (the so-called adjoint variable or adjoint process) as a solution of a BSDE called the adjoint equation:
$$dY_t = -\partial_x H^{\mathrm{full}}\big(t, X_t, \mu_t, Y_t, Z_t, \alpha_t\big)\,dt + Z_t\,dW_t, \quad t \in [0,T]; \qquad Y_T = \partial_x g(X_T, \mu_T). \tag{3.15}$$
Here $Y$ is vector valued, so that the stochastic integral $Z_t\,dW_t$ is written under the form of a matrix acting on a vector, and the Hamiltonian $H^{\mathrm{full}}(t, x, \mu, y, z, \alpha)$ is equal to $H(t, x, \mu, y, \alpha) + \sigma(t, x, \mu) \cdot z$, which also admits $\hat\alpha(t, x, \mu, y)$ as minimizer in $\alpha$.
The equation (3.15) is a BSDE with random coefficients because of the presence of $X_t$ in the expression giving the driver. But as before, replacing $\alpha_t$ by $\hat\alpha(t, X_t, \mu_t, Y_t)$ in the forward dynamics of the control problem (3.13) creates a strong coupling between $dX_t$ and $dY_t$, and the solution of the control problem reduces to the solution of an FBSDE with deterministic coefficients.
Remark 3.5 This remark complements Remark 3.1 on mean field games with a common noise. Indeed, the above discussion makes a strong case for the use of FBSDEs in the solution of mean field game problems. However, the FBSDEs touted above cannot be used to handle mean field games with a common noise, which we study later in the book. Indeed, as seen in some of the models introduced in Chapter 1 (see for instance paragraphs 1.3.2 and 1.4.1), the forward SDEs giving the dynamics of the state should contain an extra term in $dW^0_t$ accounting for a common source of random shocks. Accordingly, the input $\mu = (\mu_t)_{0 \le t \le T}$ should be random and stand for the conditional distribution of a generic state given the realization of the common noise. As a result, the BSDE should also have a term $Z^0_t \cdot dW^0_t$ (or $Z^0_t\,dW^0_t$, depending on the dimension of the backward equation). However, as explained in detail in Chapter (Vol II)-2, although $\mu = (\mu_t)_{0 \le t \le T}$ is expected to be adapted to the filtration generated by $W^0 = (W^0_t)_{0 \le t \le T}$, it may happen that $\mu$ involves additional sources of randomness, as is the case in the construction of weak solutions of stochastic differential equations, which often end up not being adapted to the underlying Brownian filtration. Therefore, randomness in the measure $\mu$ may prevent us from assuming that the filtrations satisfy the martingale representation theorem. As a result, we should be prepared to face cases for which the extra martingale term forced on us by the presence of a common noise may not be a stochastic integral of the form $Z^0_t\,dW^0_t$. Instead, this martingale term should be of the more general form $Z^0_t\,dW^0_t + dM_t$ for some martingale $M = (M_t)_{0 \le t \le T}$ orthogonal to $(W, W^0) = (W_t, W^0_t)_{0 \le t \le T}$. This class of FBSDEs is not standard and, as far as we know, was little studied in the existing literature. We call them FBSDEs in a random environment (due to the randomness of $\mu$), or simply FBSDEs with random coefficients. We develop their theory (or at least what we need for the purpose of the analysis of mean field games with a common noise) in Section (Vol II)-1.1.
The second step of the search for Nash equilibria is the construction of the fixed points (if any) of the best response map, see for instance step (ii) in the program articulated in Subsection 3.1.2. In the present context of mean field games, this step amounts to finding particular flows of probability measures $\mu = (\mu_t)_{0 \le t \le T}$ which, if used as input to the stochastic control problem, will force the marginal distribution at time $t$ of the optimal state of the controlled problem to coincide with the original $\mu_t$ we started from. In other words, these fixed points will force $(\mu_t)_{0 \le t \le T}$ to be the flow of marginal distributions of the $X = (X_t)_{0 \le t \le T}$-component of the solution of the FBSDE associated with the control problem. Replacing $\mu_t$ by $\mathcal{L}(X_t)$ in the FBSDE turns the family of standard FBSDEs parameterized by the flow of measures into an FBSDE of McKean-Vlasov type.
Remark 3.6 Strictly speaking, the above discussion applies to the mean field game models solved in this chapter. In the presence of a common noise, as we shall see in Chapter (Vol II)-2, the measures $(\mu_t)_{0 \le t \le T}$ are random since they depend upon the realizations of the common noise. In that case, the fixed point argument says that $\mu_t$ needs to be identified with the conditional distribution of $X_t$ given the common noise. So technically speaking, the solution of the mean field game becomes equivalent to the solution of an FBSDE of conditional McKean-Vlasov type. These new FBSDEs of McKean-Vlasov type will also appear in Chapter (Vol II)-6 when we study mean field games with major and minor players. The solvability of FBSDEs of conditional McKean-Vlasov type is addressed in Chapter (Vol II)-3.
measures of order $p$, namely those probability measures which integrate the $p$-th power of the distance to a fixed point whose choice is irrelevant in the definition of $\mathcal{P}_p(E)$. If $\mu, \mu' \in \mathcal{P}_p(E)$, the $p$-Wasserstein distance $W_p(\mu, \mu')$ is defined by:
$$W_p(\mu, \mu') = \inf_{\pi \in \Pi_p(\mu, \mu')} \Big[ \int_{E \times E} d(x, y)^p\,\pi(dx, dy) \Big]^{1/p}, \tag{3.16}$$
where $\Pi_p(\mu, \mu')$ denotes the set of probability measures in $\mathcal{P}_p(E \times E)$ with $\mu$ and $\mu'$ as marginals. It is customary to talk about Wasserstein space and Wasserstein distance (without referring to $p$) when $p$ is assumed to be equal to 2.
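For two empirical measures with the same number of equally weighted atoms, the infimum in (3.16) is attained at a permutation coupling, so $W_p$ can be computed exactly with an assignment solver. A short sketch (ours), useful for experiments:

```python
# p-Wasserstein distance (3.16) between two uniform empirical measures with
# n atoms each: the optimal coupling reduces to an optimal assignment.
import numpy as np
from scipy.optimize import linear_sum_assignment

def wasserstein_p(xs, ys, p=2):
    # xs, ys: (n, d) arrays of atoms
    cost = np.linalg.norm(xs[:, None, :] - ys[None, :, :], axis=-1) ** p
    i, j = linear_sum_assignment(cost)        # Hungarian algorithm
    return cost[i, j].mean() ** (1.0 / p)

rng = np.random.default_rng(0)
mu = rng.normal(0.0, 1.0, size=(300, 2))
nu = rng.normal(0.5, 1.0, size=(300, 2))      # shifted copy of the same law
print(wasserstein_p(mu, nu))                  # ~ |shift| = 0.707 for large n
```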
Recasting the two prongs of the probabilistic approach into a single formulation, and leaving for later the introduction of an additional common noise $W^0$ as explained in Remark 3.5, we see that the optimal control part leads in both cases to the analysis of an FBSDE of the form:
$$\begin{cases}
dX_t = B(t, X_t, \mu_t, Y_t, Z_t)\,dt + \Sigma(t, X_t, \mu_t)\,dW_t,\\
dY_t = -F(t, X_t, \mu_t, Y_t, Z_t)\,dt + Z_t\,dW_t, \quad t \in [0,T],\\
Y_T = G(X_T, \mu_T).
\end{cases} \tag{3.17}$$
In the search for an MFG equilibrium, the flow $\mu$ is required to match the flow of marginal distributions $(\mathcal{L}(X_t))_{0 \le t \le T}$ of the forward process $X = (X_t)_{0 \le t \le T}$ in (3.17). The resulting equation is the epitome of an FBSDE of McKean-Vlasov type:
$$\begin{cases}
dX_t = B\big(t, X_t, \mathcal{L}(X_t), Y_t, Z_t\big)\,dt + \Sigma\big(t, X_t, \mathcal{L}(X_t)\big)\,dW_t,\\
dY_t = -F\big(t, X_t, \mathcal{L}(X_t), Y_t, Z_t\big)\,dt + Z_t\,dW_t, \quad t \in [0,T],\\
Y_T = G\big(X_T, \mathcal{L}(X_T)\big).
\end{cases} \tag{3.20}$$
We shall provide a systematic analysis of this new class of FBSDEs in Section 4.3.
Remark 3.9 When the dimension $m$ of the backward component in (3.17) is equal to 1, the process $Z = (Z_t)_{0 \le t \le T}$ takes values in $\mathbb{R}^{1 \times d}$. As already explained, it will be more convenient to regard it as a process with values in $\mathbb{R}^d$ and to write the product $Z_t\,dW_t$ as an inner product in $\mathbb{R}^d$. For that reason, in the special case $m = 1$, we often write $Z_t \cdot dW_t$ instead of $Z_t\,dW_t$, $Z_t$ being understood as an element of $\mathbb{R}^d$.
The above reformulation of the MFG problems is screaming for the investigation
of the solvability of forward-backward SDEs of the McKean-Vlasov type. Most
of Chapter 4 will be devoted to this specific question, while Chapter (Vol II)-3
will address the same problem in the presence of a common noise. Here, we try
to provide new insight on the nature of the technical difficulties we are about to
face, and the tools that we shall bring to bear to overcome them.
The most challenging feature of these equations is the twofold structure of the boundary condition. To wit, a forward-backward equation is a two-point boundary value problem. One of the simplest examples we may think of for this type of equation is a pair of two coupled ODEs (with values in $\mathbb{R}$), one being set forward and the other one being set backward:
$$\begin{cases}
\dot x_t = b(t, x_t, y_t),\\
\dot y_t = f(t, x_t, y_t), \quad t \in [0,T],
\end{cases} \tag{3.21}$$
with a given initial condition $x_0 \in \mathbb{R}$ for $x = (x_t)_{0 \le t \le T}$ and terminal condition $y_T = g(x_T)$ for $y = (y_t)_{0 \le t \le T}$. In most cases of interest to us, the forward-backward
system is stochastic as the forward equation is forced by a random noise, and
is driven by coefficients depending on an infinite dimensional variable since the
state variable at time t comprises both the private state Xt and the collective state
L.Xt /. Quite remarkably, the purely analytic formulation of the MFG problems
presented in Subsection 3.1.5 also consists in a forward-backward system: the
forward equation is the Fokker-Planck equation for the evolution of the population,
while the backward equation is the Hamilton-Jacobi-Bellman equation for the value
function. In that case, the forward-backward system is clearly infinite dimensional,
though deterministic, since both the forward and backward components are of
infinite dimension.
As we already mentioned, one of the major drawbacks of forward-backward systems, even those of the simplest form (3.21), is that the Cauchy-Lipschitz theory fails except possibly in small time. There are very simple examples of systems of the form (3.21), with coefficients $b$ and $f$ Lipschitz continuous in the variables $x$ and $y$, for which existence and/or uniqueness fail. We shall present one of them in Subsection 4.3.4 in order to highlight the fact that the same difficulties plague FBSDEs of the McKean-Vlasov type. This is clearly a bad omen as we cannot expect to solve systems like (3.20) by a standard Picard fixed point argument, except possibly when $T$ is small enough or, equivalently, when the coupling between the forward and the backward components is weak. Henceforth, we must seek alternative strategies for proving existence and/or uniqueness of solutions to (3.20); one of the major objectives of the book is to present some of them.
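A standard numerical strategy for two-point boundary value problems of the type (3.21) is the shooting method: guess $y_0$, integrate both equations forward, and adjust the guess until the terminal constraint $y_T = g(x_T)$ is met. The sketch below uses hypothetical Lipschitz coefficients; when uniqueness fails, the root-finder may of course select only one of several solutions.

```python
# Shooting method for the two-point boundary value problem (3.21):
# hypothetical coefficients b(t,x,y) = y, f(t,x,y) = -x, g(x) = tanh(x).
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import brentq

T, x0 = 1.0, 0.3
b = lambda t, x, y: y
f = lambda t, x, y: -x
g = lambda x: np.tanh(x)

def mismatch(y0):
    sol = solve_ivp(lambda t, u: [b(t, *u), f(t, *u)], (0.0, T), [x0, y0],
                    rtol=1e-8, atol=1e-10)
    xT, yT = sol.y[:, -1]
    return yT - g(xT)                         # zero iff y_T = g(x_T)

y0_star = brentq(mismatch, -5.0, 5.0)         # adjust the initial guess
print(f"shooting solution: y_0 = {y0_star:.6f}")
```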
As an illustration, consider the system:
$$\dot x_t = -y_t, \qquad \dot y_t = 0, \qquad t \in [0,T], \tag{3.22}$$
with a given initial condition $x_0 \in \mathbb{R}$ and terminal condition $y_T = G(x_T)$, for the terminal condition functions $G(x) = \pm(-1) \vee x \wedge 1$.
When $T = 1$ and the leading sign in $G$ is $+$, it is easy to check that the solution to (3.22) is given by $(x_t = x_0(1 - t/2),\, y_t = x_0/2)_{0 \le t \le T}$ when $x_0 \in [-2, 2]$. When $T = 1$ and the leading sign in $G$ is $-$, the solutions to (3.22) are $(x_t = x_0 + t\,\mathrm{sign}(x_0),\, y_t = -\mathrm{sign}(x_0))_{0 \le t \le T}$ if $x_0 \ne 0$, but when $x_0 = 0$, all the curves $(x_t = at,\, y_t = -a)_{0 \le t \le T}$, for $a \in [-1, 1]$, are solutions.
[Figure: plots of the terminal condition functions $G$, and of the $x$-components of the solutions in the two cases; in the second case, the continuum of solutions issued from $x_0 = 0$ is highlighted.]
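The continuum of solutions in the second case is easy to confirm numerically. The following check (ours, under the sign conventions used above) verifies that every member of the family solves the forward equation and matches the terminal condition:

```python
# Verifying the non-uniqueness for (3.22) with T = 1, G(x) = -((-1) v x ^ 1)
# and x_0 = 0: every pair (x_t, y_t) = (a t, -a), a in [-1, 1], is a solution.
import numpy as np

T = 1.0
G = lambda x: -np.clip(x, -1.0, 1.0)

def is_solution(a, n=1000):
    t = np.linspace(0.0, T, n)
    x, y = a * t, -a * np.ones_like(t)
    ok_ode = np.allclose(np.gradient(x, t), -y)   # dx/dt = -y (and dy/dt = 0)
    ok_terminal = np.isclose(y[-1], G(x[-1]))     # y_T = G(x_T)
    return ok_ode and ok_terminal

print(all(is_solution(a) for a in np.linspace(-1.0, 1.0, 21)))   # True
```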
Suppose now that existence and uniqueness hold true for any initial condition $x_0 \in \mathbb{R}$ at any initialization time $t_0 \in [0,T]$, and let $u(t_0, x_0)$ denote the corresponding value of the $y$-component at time $t_0$. Then $u$ must solve the nonlinear equation:
$$\partial_t u(t,x) - u(t,x)\,\partial_x u(t,x) = 0, \quad (t,x) \in [0,T] \times \mathbb{R}; \qquad u(T, \cdot) = G,$$
a form of the inviscid Burgers equation, the curves $(x_t)_{0 \le t \le T}$ playing the role of the characteristics. We may now perturb this discussion of the characteristics, and explain how the inclusion of the Laplacian in the equation manifests in the form of a Brownian motion! In that case, the system (3.22) becomes stochastic:
$$\begin{cases}
dX_t = -Y_t\,dt + dW_t,\\
dY_t = Z_t\,dW_t, \quad t \in [0,T],\\
X_0 = x_0 \in \mathbb{R}, \quad Y_T = G(X_T),
\end{cases} \tag{3.24}$$
whose solution satisfies $Y_t = u(t, X_t)$, $u$ being the solution of the viscous Burgers equation, and $Z_t = \partial_x u(t, X_t)$ almost everywhere under $\mathrm{Leb}_1 \otimes \mathbb{P}$, where $\mathrm{Leb}_1$ denotes the one-dimensional Lebesgue measure. In the theory of FBSDEs, $u$ is called the decoupling field of the forward-backward system.
This provides still another avenue to solve FBSDEs. Indeed, when the diffusion
coefficient (or volatility) driving the noise term is nondegenerate and the coefficients
are bounded in the space variable, it may be shown that the Cauchy-Lipschitz theory
still holds true, see the Notes and Complements at the end of the chapter. We
shall state and use this result in Chapter 4, see Theorem 4.12. The need for the
boundedness of coefficients has been documented in the literature with examples of
linear forward-backward systems with an additive nondegenerate noise for which
existence and/or uniqueness fail.
1. The first one is a suitable notion of upward monotonicity for functionals depending upon a measure argument, in full analogy with the upward monotonicity property of $G$ in the example (3.22) discussed above. This notion is due to Lasry and Lions and will be shown, in Subsection 3.4, to play a key role in the uniqueness of equilibria for mean field games.
2. The second one is a systematic use of Schauder's fixed point theorem in order to prove the existence (though not the uniqueness) of solutions to FBSDEs of the McKean-Vlasov type and, subsequently, of equilibria for the corresponding mean field games. The implementation of Schauder's theorem is discussed in detail in Chapter 4, where it is applied to the function mapping the input $\mu = (\mu_t)_{0 \le t \le T}$ in (3.17) onto the output flow of marginal laws $(\mathcal{L}(X_t))_{0 \le t \le T}$ formed by the forward component $X = (X_t)_{0 \le t \le T}$ of the solution. This approach works well because we can easily embed the input $\mu$ and the output $(\mathcal{L}(X_t))_{0 \le t \le T}$ in a topological space to which simple compactness criteria can be applied. This is crucial as Schauder's theorem is based on compactness arguments.
For the sake of illustration, we provide below the statement of one of the solvability results proven in Chapter 4 by means of Schauder's theorem. This statement is given here for pedagogical reasons, in anticipation of the discussion of the next subsection where we use it to prove our first results of existence of equilibria for mean field games. It will be generalized in Chapter 4.
The statement given below addresses the existence (but not the uniqueness) of a solution to a fully coupled McKean-Vlasov forward-backward system of the type (3.20), namely of the form:
$$\begin{cases}
dX_t = B\big(t, X_t, \mathcal{L}(X_t), Y_t, Z_t\big)\,dt + \Sigma\big(t, X_t, \mathcal{L}(X_t), Y_t\big)\,dW_t,\\
dY_t = -F\big(t, X_t, \mathcal{L}(X_t), Y_t, Z_t\big)\,dt + Z_t\,dW_t, \quad t \in [0,T],\\
Y_T = G\big(X_T, \mathcal{L}(X_T)\big).
\end{cases} \tag{3.25}$$
For the sake of definiteness, we state formally the precise assumptions under which the existence result will be proven.
(A3) The function $\Sigma$ is uniformly elliptic in the sense that, for any $t \in [0,T]$, $x \in \mathbb{R}^d$, $\mu \in \mathcal{P}_2(\mathbb{R}^d)$ and $y$, the following inequality holds true:
$$\Sigma\Sigma^\dagger(t, x, \mu, y) \ge L^{-1} I_d,$$
We can now state the anticipated existence result, whose proof is deferred to Subsection 4.3:
Theorem 3.10 Under assumption Nondegenerate MKV FBSDE, for any random variable $\xi \in L^2(\Omega, \mathcal{F}_0, \mathbb{P}; \mathbb{R}^d)$, the FBSDE (3.25) has a solution $(X, Y, Z) = (X_t, Y_t, Z_t)_{0 \le t \le T}$ satisfying:
$$\mathbb{E}\Big[ \sup_{0 \le t \le T}\big( |X_t|^2 + |Y_t|^2 \big) + \int_0^T |Z_t|^2\,dt \Big] < \infty.$$
In this section, we offer a first, mostly pedagogical, approach to the mean field game problem using probabilistic tools in two different ways. In both cases we introduce BSDEs to tackle the stochastic optimization problem, and in both cases, the optimality condition creates a coupling between the forward dynamics of the state and the original BSDE, leading to the solution of an FBSDE. Both approaches are well understood by probabilists working on optimal control problems. The first approach is known as the weak formulation or martingale method, while the second one is known under the name of stochastic maximum principle approach. We introduce them below. As emphasized in the previous section, the fixed point step in the solution of the mean field game problem, when implemented in each of these two approaches, turns standard FBSDEs into FBSDEs of the McKean-Vlasov type. This unexpected twist to the standard theory will require special attention in this chapter and in the next one for the solution of mean field game problems without common noise, and in Chapter 6 for the control of McKean-Vlasov dynamics.
In this section and the next, all the processes are assumed to be defined on a complete filtered probability space $(\Omega, \mathcal{F}, \mathbb{F} = (\mathcal{F}_t)_{0 \le t \le T}, \mathbb{P})$, the filtration $\mathbb{F}$ satisfying the usual conditions and supporting a $d$-dimensional Wiener process $W = (W_t)_{0 \le t \le T}$ with respect to $\mathbb{F}$. Recall that for each random variable/vector or stochastic process $X$, we denote by $\mathcal{L}(X)$ the law (alternatively called the distribution) of $X$ and, for any integer $n \ge 1$, by $\mathbb{H}^{2,n}$ the Hilbert space:
$$\mathbb{H}^{2,n} = \Big\{ Z \in \mathbb{H}^{0,n} : \mathbb{E}\int_0^T |Z_s|^2\,ds < \infty \Big\},$$
where $\mathbb{H}^{0,n}$ stands for the collection of all $\mathbb{R}^n$-valued progressively measurable processes on $[0,T]$. We shall also denote by $\mathbb{S}^{2,n}$ the collection of all continuous processes $U = (U_t)_{0 \le t \le T}$ in $\mathbb{H}^{0,n}$ such that $\mathbb{E}[\sup_{0 \le t \le T} |U_t|^2] < \infty$.
Of the two probabilistic approaches which we propose in this section, the Weak Formulation is closest to the analytic approach. Indeed, it follows the strategy based on the search for an equation for the value function of the optimal control problem (3.4) of step (i) of the formulation of a mean field game problem, when the flow $\mu = (\mu_t)_{0 \le t \le T}$ of probability measures is fixed. Recall that, as stated at the beginning of Subsection 3.1.2, we restrict ourselves to deterministic flows $\mu$ in this chapter. The main characteristic (and possibly the main shortcoming) of this approach is that the volatility is uncontrolled and of the form $\sigma(t, X_t)$, for a locally bounded and measurable function $[0,T] \times \mathbb{R}^d \ni (t, x) \mapsto \sigma(t, x) \in \mathbb{R}^{d \times d}$ which we assume to be Lipschitz in $x$ uniformly in $t \in [0,T]$. This guarantees existence and uniqueness of a strong solution $X = (X_t)_{0 \le t \le T}$ of the driftless equation:
$$X_t = \xi + \int_0^t \sigma(s, X_s)\,dW_s, \quad t \in [0,T], \tag{3.28}$$
for any given square integrable random variable $\xi$ with values in $\mathbb{R}^d$. We shall also
assume uniform ellipticity, namely that the spectrum of the matrix $\sigma(t, x)\sigma(t, x)^\dagger$ is bounded from below by a strictly positive constant independent of $t$ and $x$. This implies that $\sigma(t, x)$ is invertible with a uniformly bounded inverse. This remark is important because we plan to use Girsanov's theorem. Indeed, for each continuous measure flow $\mu = (\mu_t)_{0 \le t \le T}$ and admissible control $\alpha \in \mathbb{A}$, we define the probability measure $\mathbb{P}^{\mu,\alpha}$ on $(\Omega, \mathcal{F}_T)$ by:
$$\frac{d\mathbb{P}^{\mu,\alpha}}{d\mathbb{P}} = \mathcal{E}\Big( \int_0^{\cdot} \sigma(t, X_t)^{-1}\,b\big(t, X_t, \mu_t, \alpha_t\big) \cdot dW_t \Big)_T,$$
where $\mathcal{E}(M)$ denotes the stochastic (Doleans-Dade) exponential of a continuous martingale $M$:
$$\mathcal{E}(M)_t = \exp\Big( M_t - M_0 - \frac{1}{2}[M, M]_t \Big),$$
where $([M, M]_t)_{0 \le t \le T}$ stands for the quadratic variation of $M$. The process $W^{\mu,\alpha}$
defined by:
$$W^{\mu,\alpha}_t = W_t - \int_0^t \sigma(s, X_s)^{-1}\,b\big(s, X_s, \mu_s, \alpha_s\big)\,ds, \quad t \in [0,T],$$
is a Wiener process under $\mathbb{P}^{\mu,\alpha}$, provided that Girsanov's theorem applies. The latter is true if $b$ is bounded, since $\sigma^{-1}$ is already known to be bounded. In such a case, it holds $\mathbb{P}^{\mu,\alpha}$ almost-surely:
$$dX_t = b\big(t, X_t, \mu_t, \alpha_t\big)\,dt + \sigma(t, X_t)\,dW^{\mu,\alpha}_t, \quad t \in [0,T].$$
That is, under $\mathbb{P}^{\mu,\alpha}$, $X$ is a weak solution of the state equation. Note that $\mathbb{P}^{\mu,\alpha}$ and $\mathbb{P}$ agree on $\mathcal{F}_0$; in particular, the law of $X_0 = \xi$ remains the same. Moreover, $\xi$ and $W$ remain independent under $\mathbb{P}^{\mu,\alpha}$.
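This change of measure lends itself to straightforward Monte Carlo: simulate the driftless state under $\mathbb{P}$, then weight each path by the stochastic exponential to compute expectations under $\mathbb{P}^{\mu,\alpha}$. A hedged sketch, for a hypothetical scalar model with $\sigma = 1$, $b(t, x, \mu, \alpha) = \alpha$ and the Markovian control $\alpha_t = -X_t$:

```python
# Girsanov reweighting: driftless paths under P, importance weights
# dP^{mu,alpha}/dP = E( int u dW )_T with u_t = sigma^{-1} b = -X_t here.
import numpy as np

rng = np.random.default_rng(0)
M, n, T = 200_000, 200, 1.0
dt = T / n
X = rng.normal(0.0, 1.0, size=M)              # X_0 = xi
logE = np.zeros(M)                            # log of the Girsanov density
for k in range(n):
    dW = np.sqrt(dt) * rng.normal(size=M)
    u = -X                                    # sigma^{-1} b(t, X_t, mu_t, alpha_t)
    logE += u * dW - 0.5 * u**2 * dt
    X = X + dW                                # driftless dynamics under P
w = np.exp(logE)
print("E[dP'/dP] ~", w.mean())                # ~ 1 (martingale property)
print("E'[X_T]   ~", (w * X).mean())          # ~ 0: X is mean-reverting under P'
```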
The cost functional of the weak formulation is then defined as:
$$J^{\mu,\mathrm{weak}}(\alpha) = \mathbb{E}^{\mu,\alpha}\Big[ \int_0^T f(t, X_t, \mu_t, \alpha_t)\,dt + g(X_T, \mu_T) \Big],$$
where $X$ solves the driftless equation (3.28) and $\mathbb{E}^{\mu,\alpha}$ denotes the expectation with respect to $\mathbb{P}^{\mu,\alpha}$. It is worth mentioning that $J^{\mu,\mathrm{weak}}(\alpha)$ may differ from $J^\mu(\alpha)$ in (3.4) since the distribution of the pair $(X, \alpha)$ under $\mathbb{P}^{\mu,\alpha}$ may be different from the distribution of the pair $(X^\alpha, \alpha)$ under $\mathbb{P}$. However, when $b$ is bounded and $\sigma$ is bounded and continuous, and when the control $\alpha$ is Markovian in the sense that $\alpha_t = \phi(t, X_t)$ for some Borel-measurable function $\phi : [0,T] \times \mathbb{R}^d \to A$, Stroock and Varadhan's uniqueness in law theorem guarantees that $(X, \alpha)$, with $\alpha_t = \phi(t, X_t)$ for any $t \in [0,T]$, has the same law under $\mathbb{P}^{\mu,\alpha}$ as $(X_t, \phi(t, X_t))_{0 \le t \le T}$ where $X$ is the solution of the SDE:
$$dX_t = b\big(t, X_t, \mu_t, \phi(t, X_t)\big)\,dt + \sigma(t, X_t)\,dW_t, \quad t \in [0,T]; \qquad X_0 = \xi,$$
(A2) The functions $b$, $f$, and $g$ are bounded, the common bound being also denoted by $L$.
(A3) The function $\sigma$ is continuous and uniformly elliptic in the sense that, for any $t \in [0,T]$ and $x \in \mathbb{R}^d$, the following inequality holds:
$$\sigma(t, x)\,\sigma(t, x)^\dagger \ge L^{-1} I_d,$$
Proposition 3.11 Let assumption Weak Formulation be in force. Recall also the definition (3.28) of the process $X$ for an initial condition $\xi \in L^2(\Omega, \mathcal{F}_0, \mathbb{P}; \mathbb{R}^d)$. Then, for any continuous flow $\mu = (\mu_t)_{0 \le t \le T}$ of probability measures on $\mathbb{R}^d$, the BSDE:
$$dY_t = -H\big(t, X_t, \mu_t, \sigma(t, X_t)^{-1,\dagger} Z_t, \hat\alpha(t, X_t, \mu_t, \sigma(t, X_t)^{-1,\dagger} Z_t)\big)\,dt + Z_t \cdot dW_t, \quad 0 \le t \le T, \tag{3.30}$$
with $Y_T = g(X_T, \mu_T)$ as terminal condition,
Proof.
First Step. We first show that the BSDE (3.30) has a solution. Importantly, the process $X$ is adapted with respect to the filtration $\mathbb{F}$, which is assumed to satisfy the martingale representation theorem. However, the difficulty is that the Hamiltonian $H$ is not Lipschitz continuous in the variable $y$, and consequently, the driver of the BSDE is not Lipschitz as a function of $Z_t$. Indeed, when expanding $H$, (3.30) takes the form:
$$\begin{aligned}
dY_t = -\Big[ &\sigma(t, X_t)^{-1,\dagger} Z_t \cdot b\big(t, X_t, \mu_t, \hat\alpha(t, X_t, \mu_t, \sigma(t, X_t)^{-1,\dagger} Z_t)\big)\\
&+ f\big(t, X_t, \mu_t, \hat\alpha(t, X_t, \mu_t, \sigma(t, X_t)^{-1,\dagger} Z_t)\big) \Big]\,dt + Z_t \cdot dW_t, \quad 0 \le t \le T,
\end{aligned} \tag{3.32}$$
with $Y_T = g(X_T, \mu_T)$. In order to bypass this obstacle, we shall invoke results from the theory of quadratic BSDEs, a short account of which is given in Chapter 4. Thanks to the boundedness of $f$ and $g$, we first notice from standard results for backward SDEs that, for any solution $(Y, Z)$ to (3.32), the component $Y$ is bounded in the sense that $\sup_{0 \le t \le T} |Y_t|$ is in $L^\infty(\Omega, \mathcal{F}, \mathbb{P}; \mathbb{R})$. We only provide a sketch of the proof. We can find two constants $c > 0$ and $C > 0$ such that, when applying Ito's formula to $(\exp(ct)|Y_t|^2)_{0 \le t \le T}$, we get, with probability 1, for all $t \in [0,T]$:
$$\exp(ct)|Y_t|^2 + \frac{1}{2}\int_t^T \exp(cs)\,|Z_s|^2\,ds \le C + 2\int_t^T \exp(cs)\,Y_s Z_s \cdot dW_s.$$
Taking conditional expectation given $\mathcal{F}_t$ on both sides, we get an almost sure bound for $|Y_t|$. Since $Y$ is continuous, we easily obtain an almost sure bound for $\sup_{0 \le t \le T} |Y_t|$. Existence and uniqueness then follow from Theorem 4.15.
Second Step. Given an admissible control $\beta \in \mathbb{A}$, the variable $y$ does not appear in the driver of the BSDE, and hence the map $\mathbb{R} \times \mathbb{R}^d \ni (y, z) \mapsto -H(t, X_t, \mu_t, \sigma(t, X_t)^{-1,\dagger} z, \beta_t)$ is independent of $y$ and uniformly Lipschitz in $z$ (recall that $A$ is assumed to be bounded); therefore, existence and uniqueness hold for the following BSDE, whose solution is denoted by $(Y^\beta, Z^\beta)$:
$$\begin{cases}
dY^\beta_t = -H\big(t, X_t, \mu_t, \sigma(t, X_t)^{-1,\dagger} Z^\beta_t, \beta_t\big)\,dt + Z^\beta_t \cdot dW_t, \quad t \in [0,T],\\
Y^\beta_T = g(X_T, \mu_T).
\end{cases}$$
Recalling that $X = (X_t)_{0 \le t \le T}$ is the solution of the driftless dynamic equation (3.28), we have:
$$\begin{aligned}
Y^\beta_t &= g(X_T, \mu_T) + \int_t^T H\big(s, X_s, \mu_s, \sigma(s, X_s)^{-1,\dagger} Z^\beta_s, \beta_s\big)\,ds - \int_t^T Z^\beta_s \cdot dW_s\\
&= g(X_T, \mu_T) + \int_t^T \Big[ f(s, X_s, \mu_s, \beta_s) + \sigma(s, X_s)^{-1,\dagger} Z^\beta_s \cdot b(s, X_s, \mu_s, \beta_s) \Big]\,ds - \int_t^T Z^\beta_s \cdot dW_s\\
&= g(X_T, \mu_T) + \int_t^T f(s, X_s, \mu_s, \beta_s)\,ds - \int_t^T Z^\beta_s \cdot \Big[ dW_s - \sigma(s, X_s)^{-1} b(s, X_s, \mu_s, \beta_s)\,ds \Big]\\
&= g(X_T, \mu_T) + \int_t^T f(s, X_s, \mu_s, \beta_s)\,ds - \int_t^T Z^\beta_s \cdot dW^{\mu,\beta}_s.
\end{aligned}$$
Since the density of $\mathbb{P}^{\mu,\beta}$ with respect to $\mathbb{P}$ has moments of any order, and since $Z^\beta$ is square integrable under $\mathbb{P}$, the stochastic integral above is a martingale under $\mathbb{P}^{\mu,\beta}$. So by taking $\mathbb{P}^{\mu,\beta}$-conditional expectation with respect to $\mathcal{F}_t$, we get:
$$Y^\beta_t = \mathbb{E}^{\mathbb{P}^{\mu,\beta}}\Big[ g(X_T, \mu_T) + \int_t^T f(s, X_s, \mu_s, \beta_s)\,ds \,\Big|\, \mathcal{F}_t \Big],$$
and:
$$\mathbb{E}\big[Y^\beta_0\big] = \mathbb{E}^{\mathbb{P}^{\mu,\beta}}\big[Y^\beta_0\big] = \mathbb{E}^{\mathbb{P}^{\mu,\beta}}\Big[ g(X_T, \mu_T) + \int_0^T f(s, X_s, \mu_s, \beta_s)\,ds \Big] = J^{\mu,\mathrm{weak}}(\beta).$$
In order to conclude the proof, we notice that the solution $(Y, Z)$ of the BSDE (3.30) is the solution of the BSDE with terminal condition $g(X_T, \mu_T)$ and driver defined by:
$$\Psi(t, \omega, y, z) = H\big(t, X_t(\omega), \mu_t, \sigma(t, X_t(\omega))^{-1,\dagger} z, \hat\alpha(t, X_t(\omega), \mu_t, \sigma(t, X_t(\omega))^{-1,\dagger} z)\big),$$
while $(Y^\beta, Z^\beta)$ is the solution of the BSDE with the same terminal condition $g(X_T, \mu_T)$ and driver defined by:
$$\Psi^\beta(t, \omega, y, z) = H\big(t, X_t(\omega), \mu_t, \sigma(t, X_t(\omega))^{-1,\dagger} z, \beta_t(\omega)\big),$$
for every $t$, $y$ and $z$. From this, we conclude $\mathbb{E}[Y_0] \le \mathbb{E}[Y^\beta_0]$ by the comparison theorem for BSDEs (see the Notes & Complements at the end of the chapter for references, see also Theorem 4.16). Since the comparison theorem for BSDEs is strict and the minimizer of $H$ is strict as well, we have that:
$$\mathbb{E}[Y_0] = \mathbb{E}\big[Y^\beta_0\big] \iff \beta_t = \hat\alpha_t \quad \mathrm{Leb}_1 \otimes \mathbb{P} \text{ almost-everywhere},$$
with $X_0 = \xi$ as initial condition and $Y_T = G(X_T, \mu_T)$ as terminal condition. Indeed, owing to Theorem 4.18 in Chapter 4, we know that, for any $p \ge 1$:
$$\mathbb{E}\Big[ \Big( \int_0^T |Z_t|^2\,dt \Big)^p \Big] < \infty.$$
Since $d\mathbb{P}^{\mu,\hat\alpha}/d\mathbb{P}$ is in any $L^p(\Omega, \mathcal{F}_T, \mathbb{P}; \mathbb{R})$, $p \ge 1$, we deduce that:
$$\mathbb{E}^{\mu,\hat\alpha}\Big[ \Big( \int_0^T |Z_t|^2\,dt \Big)^p \Big] < \infty,$$
As we already alluded to, we shall prove in Theorem 4.12 that the system (3.34) has a unique solution when the McKean-Vlasov component is replaced by a mere input $\mu = (\mu_t)_{0 \le t \le T}$. This proves that there is no loss in replacing the noise $W^{\mu,\hat\alpha}$ in (3.33) by $W$, as done in (3.34).
Combining with Theorem 3.10, we finally deduce:
Theorem 3.14 On top of assumption Weak Formulation, assume that the coefficients $b$, $f$ and $g$ are continuous in the measure argument $\mu$ and that the optimizer $\hat\alpha$ is also continuous in $\mu$. Then, there exists an MFG equilibrium whenever the optimization problem in (3.4) is solved through the weak formulation.
Remark 3.15 The main shortcoming of Theorem 3.14 above is the restrictive assumption that the set $A$ of possible control values is bounded. However, it is possible to extend the application of the formulation based upon the representation of the value function to cases where this assumption is not satisfied. For instance, Theorem 4.44 in Chapter 4 gives a more general solvability result for MFGs with $A$ unbounded, $\sigma$ depending upon $\mu$, and the optimal control problem (3.4) being understood in the strong sense! At the current stage of our presentation of mean field games, we chose not to introduce the technical tools required to overcome the underlying obstacles, by fear of obstructing the view of the road to the solution of these problems with too many technicalities.
Remark 3.16 The reader may want to compare the PDE system (3.12) with the mean field FBSDE (3.34). They suggest that the value of the adjoint process $Y_t$ at time $t$ should be identified with $V(t, X_t)$. Accordingly, the value of the representation process $Z_t$ should be identified with $\sigma(t, X_t, \mathcal{L}(X_t))^\dagger\,\partial_x V(t, X_t)$. Hence the dynamics of $Y = (Y_t)_{0 \le t \le T}$ are directly connected with the dynamics of the value function $V$ in (3.12) along the optimal paths. Similarly, the distribution of $X_t$ at time $t$ should be identified with $\mu_t$ in (4.70). We shall revisit this question again in Chapter 4.
The notation $\partial_{(x,\alpha)} f$ stands for the gradient in the joint variables $(x, \alpha)$. Finally, $f$, $\partial_x f$ and $\partial_\alpha f$ are locally bounded over $[0,T] \times \mathbb{R}^d \times \mathcal{P}_2(\mathbb{R}^d) \times A$.
Assumption (A2) is presumably too restrictive, and the results of this section could still be true under the more general assumption $\sigma(t, x) = \sigma_0(t) + \sigma_1(t)x$ of linearity instead of boundedness of the volatility; see for instance the generalizations in Section (Vol II)-3.4.
Theorem 3.17 Let us assume that assumption SMP holds and that the mapping $\mu : [0,T] \ni t \mapsto \mu_t \in \mathcal{P}_2(\mathbb{R}^d)$ is measurable and bounded. Then, the FBSDE:
$$\begin{cases}
dX_t = b\big(t, X_t, \mu_t, \hat\alpha(t, X_t, \mu_t, Y_t)\big)\,dt + \sigma\,dW_t,\\
dY_t = -\partial_x H\big(t, X_t, \mu_t, Y_t, \hat\alpha(t, X_t, \mu_t, Y_t)\big)\,dt + Z_t\,dW_t, \quad t \in [0,T],\\
X_0 = \xi, \quad Y_T = \partial_x g(X_T, \mu_T),
\end{cases} \tag{3.36}$$
has a solution $(X, Y, Z)$ and, setting $\hat\alpha_t = \hat\alpha(t, X_t, \mu_t, Y_t)$, it holds for every admissible $\alpha \in \mathbb{A}$:
$$J^\mu(\alpha) \ge J^\mu(\hat\alpha) + \lambda\,\mathbb{E}\int_0^T |\alpha_t - \hat\alpha_t|^2\,dt. \tag{3.38}$$
Proof. The proof of the existence of a solution to (3.36) is deferred to Chapter 4; see Lemma 4.56 there. Here we just focus on the proof of inequality (3.38). By Lemma 3.3, $\hat\alpha = (\hat\alpha_t)_{0 \le t \le T}$ satisfies (3.2), and the proof of the stochastic maximum principle (see for example the proof given in Theorem 2.16) gives:
$$J^\mu(\alpha) \ge J^\mu(\hat\alpha) + \mathbb{E}\int_0^T \Big[ H(t, X^\alpha_t, \mu_t, Y_t, \alpha_t) - H(t, X_t, \mu_t, Y_t, \hat\alpha_t) - (X^\alpha_t - X_t) \cdot \partial_x H(t, X_t, \mu_t, Y_t, \hat\alpha_t) \Big]\,dt.$$
By linearity of $b$ and assumption (A3) on $f$, the Hessian of $H$ satisfies (3.35), so that the required convexity assumption is satisfied. The result easily follows. $\square$
Remark 3.19 As the proof shows, and exactly as we claimed in the previous remark, there is no need for $\mathbb{F}$ to be the filtration generated by $\mathcal{F}_0$ and the Wiener process $W = (W_t)_{0 \le t \le T}$.
Remark 3.20 Theorem 3.17 has interesting consequences. First, it says that the optimal control exists and is unique. Second, it also implies uniqueness of the solution of the FBSDE (3.36). Indeed, given two solutions $(X, Y, Z)$ and $(X', Y', Z')$ of (3.36), it holds by (3.38), $\mathrm{Leb}_1 \otimes \mathbb{P}$ a.e., that:
$$\hat\alpha(t, X_t, \mu_t, Y_t) = \hat\alpha(t, X'_t, \mu_t, Y'_t),$$
so that $X$ and $X'$ coincide by the Lipschitz property of the coefficients of the forward equation. As a consequence, $(Y, Z)$ and $(Y', Z')$ coincide as well.
The bound provided by Theorem 3.17 is sharp in the class of convex models as
shown for example by the following slight variation on the same theme. We shall
use this form repeatedly in this chapter and the next one.
Proposition 3.21 Under the assumptions and notation of Theorem 3.17 above, suppose that we are also given another measurable and bounded flow $[0,T] \ni t \mapsto \mu'_t \in \mathcal{P}_2(\mathbb{R}^d)$ of probability measures of order 2, another initial condition $\xi' \in L^2(\Omega, \mathcal{F}_0, \mathbb{P}; \mathbb{R}^d)$, and the corresponding controlled state process $X' = (X'_t)_{0 \le t \le T}$ defined by:
$$X'_t = \xi' + \int_0^t b\big(s, X'_s, \mu'_s, \alpha_s\big)\,ds + \sigma W_t, \quad t \in [0,T],$$
where the quantity $J^{\alpha;\mu,\mu'}$ appearing in the comparison inequality (3.39) is defined by:
$$J^{\alpha;\mu,\mu'} = \mathbb{E}\Big[ g\big(X'_T, \mu_T\big) + \int_0^T f\big(t, X'_t, \mu_t, \alpha_t\big)\,dt \Big].$$
The process $X'$ is the controlled diffusion process driven by the control $\alpha$ and evolving in the environment $\mu'$, but the cost functional is computed under the environment $\mu$.
Proof. As before, we use the same old strategy of the original proof of the stochastic maximum principle, by computing the Ito differential of the process:
$$\Big( (X'_t - X_t) \cdot Y_t + \int_0^t \big[ f(s, X'_s, \mu_s, \alpha_s) - f(s, X_s, \mu_s, \hat\alpha_s) \big]\,ds \Big)_{0 \le t \le T},$$
and integrating it between 0 and $T$. Since the initial conditions $\xi$ and $\xi'$ are possibly different, we get the additional term $\mathbb{E}[(\xi' - \xi) \cdot Y_0]$ in the left-hand side of (3.39). Similarly, since the drift of $X'$ is driven by $\mu' = (\mu'_t)_{0 \le t \le T}$, we get the additional difference of the drifts in order to account for the fact that the drifts are driven by the different flows of probability measures. $\square$
Definition 3.22 Under assumption SMP, for any continuous flow of measures $\mu = (\mu_t)_{0 \le t \le T}$ from $[0,T]$ to $\mathcal{P}_2(\mathbb{R}^d)$, call $(\hat X^\mu, \hat Y^\mu, \hat Z^\mu)$ the solution to the FBSDE (3.36) (which is unique by Remark 3.20). Then, we say that $\mu$ is a solution to the mean field game (3.4), or an MFG equilibrium, if, for any $t \in [0,T]$:
$$\mathbb{P} \circ \big(\hat X^\mu_t\big)^{-1} = \mu_t.$$
Similar to Definition 3.12, Definition 3.22 captures the essence of the approach to mean field games summarized in Subsection 3.1.2. The crux of this approach is to freeze the probability measure when optimizing the cost. This is in sharp contrast
With the special form of coefficients chosen in assumption SMP, the FBSDE reads:
$$\begin{cases}
d\hat X_t = \Big[ b_0\big(t, \mathcal{L}(\hat X_t)\big) + b_1(t)\hat X_t + b_2(t)\,\hat\alpha\big(t, \hat X_t, \mathcal{L}(\hat X_t), \hat Y_t\big) \Big]\,dt + \sigma\,dW_t,\\
d\hat Y_t = -\Big[ b_1(t)^\dagger \hat Y_t + \partial_x f\big(t, \hat X_t, \mathcal{L}(\hat X_t), \hat\alpha(t, \hat X_t, \mathcal{L}(\hat X_t), \hat Y_t)\big) \Big]\,dt + \hat Z_t\,dW_t,
\end{cases} \tag{3.41}$$
where, as usual, $b_1(t)^\dagger$ denotes the transpose of the matrix $b_1(t)$.
Theorem 3.24 On top of assumption SMP, assume that the set $A$ is bounded, that $\sigma$ is invertible, that the coefficients $b_0$, $\partial_x f$ and $\partial_x g$ are globally bounded, that $b_1$ is zero, and that the coefficients $b_0$, $\partial_x f$, $\partial_x g$ and $\hat\alpha$ are also continuous in the measure argument $\mu$. Then, there exists a solution to the MFG problem.

Regularity properties of $\hat\alpha$ follow from Lemma 3.3, continuity with respect to $\mu$ being easily tackled by a compactness argument. $\square$
Remark 3.25 Clearly, demanding that $\partial_x f$ and $\partial_x g$ be bounded while $f$ and $g$ are already assumed to be convex is very restrictive. For instance, the Linear Quadratic (LQ) models considered in Section 3.5 are not covered by this result since the driver $F$ is not allowed to have linear growth. We shall revisit the problem under much weaker conditions in Chapter 4. In full analogy with Remark 3.15, the reader will find in Chapter 4 a more general version of Theorem 3.24 in which the set $A$ and the coefficients $b_0$, $\partial_x f$ and $\partial_x g$ are allowed to be unbounded and the coefficient $b_1$ to be nonzero. We refer to Theorem 4.53 for a precise statement.
(A1) The functions $b$ and $f$ are differentiable with respect to $(x, \alpha)$, the mappings $\mathbb{R}^d \times A \ni (x, \alpha) \mapsto \partial_x(b, f)(t, x, \mu, \alpha)$ and $\mathbb{R}^d \times A \ni (x, \alpha) \mapsto \partial_\alpha(b, f)(t, x, \mu, \alpha)$ being continuous for each $(t, \mu) \in [0,T] \times \mathcal{P}_2(\mathbb{R}^d)$. Similarly, the function $g$ is differentiable with respect to $x$, the mapping $\mathbb{R}^d \ni x \mapsto \partial_x g(x, \mu)$ being continuous for each $\mu \in \mathcal{P}_2(\mathbb{R}^d)$.
(A2) The functions $[0,T] \ni t \mapsto (b, f)(t, 0, \delta_0, 0_A)$ are uniformly bounded, for some point $0_A \in A$. The derivative $\partial_{(x,\alpha)} b$ is uniformly bounded and, for any $R > 0$ and any $(t, \mu) \in [0,T] \times \mathcal{P}_2(\mathbb{R}^d)$ such that $M_2(\mu) \le R$, the functions $\partial_x f(t, \cdot, \mu, \cdot)$, $\partial_x g(\cdot, \mu)$ and $\partial_\alpha f(t, \cdot, \mu, \cdot)$ are at most of linear growth in $(x, \alpha)$.
Theorem 3.27 Let $\mu = (\mu_t)_{0 \le t \le T}$ be a bounded and measurable function from $[0,T]$ into $\mathcal{P}_2(\mathbb{R}^d)$ and $\alpha = (\alpha_t)_{0 \le t \le T} \in \mathbb{A}$ be an admissible control. Under assumption Necessary SMP, assume further that the Hamiltonian $H$ is convex in $\alpha \in A$. If $\alpha$ is optimal, then, for the associated controlled state $X^\alpha = (X^\alpha_t)_{0 \le t \le T}$ and the corresponding solution $(Y, Z) = (Y_t, Z_t)_{0 \le t \le T}$ of the adjoint backward SDE:
$$\begin{cases}
dY_t = -\partial_x H\big(t, X^\alpha_t, \mu_t, Y_t, \alpha_t\big)\,dt + Z_t\,dW_t, \quad t \in [0,T],\\
Y_T = \partial_x g\big(X^\alpha_T, \mu_T\big),
\end{cases} \tag{3.42}$$
Since we make little use of Theorem 3.27 in this chapter and the next, we postpone its proof to Chapters 6 and (Vol II)-1, where more general versions are given, including cases where $\sigma$ is not constant; see Theorem 6.14 for mean field stochastic control problems and Theorem (Vol II)-1.59 for stochastic control problems in a random environment. Also, as indicated in Proposition 6.15, a weaker form holds if convexity of $H$ in $\alpha$ fails. Roughly speaking, the corresponding version says that, instead of (3.43), it holds $\partial_\alpha H(t, X_t, \mu_t, Y_t, \alpha_t) = 0$ when $\alpha_t$ is in the interior of $A$.
1. The first one is to assume that the coupling between the forward and backward
equations is weak in the sense that one of the two equations depends on the
solution of the other one through coefficients with a small Lipschitz constant.
Basically, this amounts to assuming that the time horizon T is small enough.
2. In full analogy with the analysis of the inviscid Burgers equation presented in
Subsection 3.2.3, the second one is to make use of monotonicity conditions, but
in the direction of the measure argument.
3. Finally, given the role played by the Laplace operator in the viscous Burgers
equation, another possibility is to make use of non-degeneracy conditions, but on
the space of probability measures this time around.
Again, existence and uniqueness in small time under Lipschitz conditions will be
investigated in Subsection 4.2.3.
Adapting the third strategy to the McKean-Vlasov case is much more challenging as the state variable has to be understood as the pair made of $X_t$, which describes the private state of the player at time $t$, and of $\mathcal{L}(X_t)$, which stands for the statistical distribution of the states in the population at time $t$. As we shall see in Chapters (Vol II)-4 and (Vol II)-5, the analogue of the viscous Burgers equation, whose solution is the decoupling field of the FBSDE (3.24), is a PDE on the space of probability measures, called the master equation. To put it differently, the decoupling field of a McKean-Vlasov FBSDE has to be understood as a function over an infinite-dimensional space. It is thus a rather intricate object. Moreover, it is worth mentioning that, for mean field games without common noise, the dynamics of $(\mathcal{L}(X_t))_{0 \le t \le T}$ are entirely deterministic, and for this reason, we cannot invoke a non-degeneracy argument. As we shall see in Chapters (Vol II)-2, (Vol II)-3, (Vol II)-4, and (Vol II)-5, it is only in the presence of a common noise that we may expect these arguments to make sense. Actually, even the framework considered in these four chapters is too restrictive to address the smoothing effect of the common noise in full generality. Indeed, except for a few cases, strict ellipticity cannot hold true if the common noise is of finite dimension, a situation we encounter throughout the book. The few examples which could work are cases where the marginal laws $(\mathcal{L}(X_t))_{0 \le t \le T}$ belong to a parametric family. We provide such an example in Subsection (Vol II)-3.5.2 where we manage to prove that the common noise restores uniqueness in some specific cases.
Therefore, at this stage of the discussion, it seems that there is only one possible
road to uniqueness, and it has to be based on a structural monotonicity condition.
We make this clear in what follows.
The following definition of monotonicity is taken from the earlier works by Lasry and Lions. We call it the Lasry-Lions monotonicity condition.
Definition 3.28 A real-valued function $h$ on $\mathbb{R}^d \times \mathcal{P}_2(\mathbb{R}^d)$, at most of quadratic growth, is said to be monotone if, for all $\mu, \mu' \in \mathcal{P}_2(\mathbb{R}^d)$:
$$\int_{\mathbb{R}^d} \big[ h(x, \mu) - h(x, \mu') \big]\,d(\mu - \mu')(x) \ge 0.$$
Clearly, any linear combination of functions which satisfy the Lasry-Lions monotonicity condition also satisfies it if the coefficients are nonnegative. A first set of examples of monotone functions will be provided in Subsection 3.4.2. More properties of monotone functions, including convexity, will be discussed in Chapter 5, see for instance Remark 5.75.
We now introduce what turns out to be the most popular set of assumptions under
which uniqueness has been proven to hold in the existing literature. It goes back to
the earlier works of Lasry and Lions on mean field games. With the same notation
as in Subsection 3.1.2, it reads as follows.
(A1) The coefficients $b$ and $\sigma$ do not depend upon the measure argument. They thus read as mappings $b : [0,T] \times \mathbb{R}^d \times A \to \mathbb{R}^d$ and $\sigma : [0,T] \times \mathbb{R}^d \to \mathbb{R}^{d \times d}$.
(A2) The running cost $f$ has a separated structure of the form:
$$f(t, x, \mu, \alpha) = f_0(t, x, \mu) + f_1(t, x, \alpha),$$
The main result of this section is the following important uniqueness consequence of the monotonicity assumption.
Theorem 3.29 Let assumption Lasry-Lions Monotonicity hold, and let us assume that for any deterministic continuous flow $\mu = (\mu_t)_{0 \le t \le T}$ from $[0,T]$ to $\mathcal{P}_2(\mathbb{R}^d)$, the optimal control problem (3.4) has a unique minimizer $\hat\alpha^\mu \in \mathbb{A}$. Call $\hat X^\mu$ the corresponding optimal path.
Proof. Assume that there are two different MFG equilibria $\mu = (\mu_t)_{0 \le t \le T}$ and $\mu' = (\mu'_t)_{0 \le t \le T}$. Then, the processes $\hat\alpha^\mu$ and $\hat\alpha^{\mu'}$ must differ, as otherwise $\hat X^\mu$ and $\hat X^{\mu'}$ would be the same and then, by (3.44), $\mu$ and $\mu'$ would be the same as well. Therefore, by uniqueness of the minimizers of the cost functionals $J^\mu$ and $J^{\mu'}$, we have:
$$J^\mu\big(\hat\alpha^\mu\big) - J^\mu\big(\hat\alpha^{\mu'}\big) < 0 \quad \text{and} \quad J^{\mu'}\big(\hat\alpha^{\mu'}\big) - J^{\mu'}\big(\hat\alpha^\mu\big) < 0,$$
so that:
$$J^\mu\big(\hat\alpha^\mu\big) - J^\mu\big(\hat\alpha^{\mu'}\big) + J^{\mu'}\big(\hat\alpha^{\mu'}\big) - J^{\mu'}\big(\hat\alpha^\mu\big) < 0. \tag{3.45}$$
Now, we use the fact that the coefficients $b$ and $\sigma$ are independent of $\mu$. So in the environment $\mu$, the controlled path driven by $\hat\alpha^{\mu'}$ is exactly $\hat X^{\mu'}$. Similarly, in the environment $\mu'$, the controlled path driven by $\hat\alpha^\mu$ is exactly $\hat X^\mu$. Therefore:
$$\begin{aligned}
J^\mu\big(\hat\alpha^\mu\big) - J^{\mu'}\big(\hat\alpha^\mu\big) &= \mathbb{E}\Big[ \int_0^T \big[ f_0\big(t, \hat X^\mu_t, \mu_t\big) - f_0\big(t, \hat X^\mu_t, \mu'_t\big) \big]\,dt\\
&\qquad + \int_0^T \big[ f_1\big(t, \hat X^\mu_t, \hat\alpha^\mu_t\big) - f_1\big(t, \hat X^\mu_t, \hat\alpha^\mu_t\big) \big]\,dt + g\big(\hat X^\mu_T, \mu_T\big) - g\big(\hat X^\mu_T, \mu'_T\big) \Big],
\end{aligned}$$
and we observe that the first term in the second line is zero. Thanks to (3.44), we deduce that:
$$J^\mu\big(\hat\alpha^\mu\big) - J^{\mu'}\big(\hat\alpha^\mu\big) = \int_0^T \int_{\mathbb{R}^d} \big[ f_0(t, x, \mu_t) - f_0(t, x, \mu'_t) \big]\,d\mu_t(x)\,dt + \int_{\mathbb{R}^d} \big[ g(x, \mu_T) - g(x, \mu'_T) \big]\,d\mu_T(x).$$
Similarly,
$$J^\mu\big(\hat\alpha^{\mu'}\big) - J^{\mu'}\big(\hat\alpha^{\mu'}\big) = \int_0^T \int_{\mathbb{R}^d} \big[ f_0(t, x, \mu_t) - f_0(t, x, \mu'_t) \big]\,d\mu'_t(x)\,dt + \int_{\mathbb{R}^d} \big[ g(x, \mu_T) - g(x, \mu'_T) \big]\,d\mu'_T(x).$$
Subtracting the two identities and invoking (3.45), we get:
$$\begin{aligned}
0 &> J^\mu\big(\hat\alpha^\mu\big) - J^\mu\big(\hat\alpha^{\mu'}\big) + J^{\mu'}\big(\hat\alpha^{\mu'}\big) - J^{\mu'}\big(\hat\alpha^\mu\big)\\
&= \int_0^T \int_{\mathbb{R}^d} \big[ f_0(t, x, \mu_t) - f_0(t, x, \mu'_t) \big]\,d\big(\mu_t - \mu'_t\big)(x)\,dt + \int_{\mathbb{R}^d} \big[ g(x, \mu_T) - g(x, \mu'_T) \big]\,d\big(\mu_T - \mu'_T\big)(x),
\end{aligned}$$
which contradicts the monotonicity of $f_0$ and $g$.
The same observation can be made within the weak formulation of Subsection 3.3.1: since $b$ does not depend upon the measure argument, the Girsanov density defining $\mathbb{P}^{\mu,\alpha}$ does not depend upon $\mu$ either, so that:
$$J^{\mu,\mathrm{weak}}\big(\hat\alpha^\mu\big) - J^{\mu',\mathrm{weak}}\big(\hat\alpha^\mu\big) = \mathbb{E}^{\mu,\hat\alpha^\mu}\Big[ g(X_T, \mu_T) - g(X_T, \mu'_T) + \int_0^T \big[ f_0(t, X_t, \mu_t) - f_0(t, X_t, \mu'_t) \big]\,dt \Big].$$
Exploiting the fact that $\mathbb{P}^{\mu,\hat\alpha^\mu} \circ X_t^{-1} = \mu_t$, we then deduce that:
$$J^{\mu,\mathrm{weak}}\big(\hat\alpha^\mu\big) - J^{\mu',\mathrm{weak}}\big(\hat\alpha^\mu\big) = \int_{\mathbb{R}^d} \big[ g(x, \mu_T) - g(x, \mu'_T) \big]\,d\mu_T(x) + \int_0^T \int_{\mathbb{R}^d} \big[ f_0(t, x, \mu_t) - f_0(t, x, \mu'_t) \big]\,d\mu_t(x)\,dt,$$
and the argument proceeds as before.
3.4.2 Examples
Example 2. If $h$ does not depend upon $x$, then it also satisfies the requirements of Definition 3.28. Indeed, for any function $h : \mathcal{P}_2(\mathbb{R}^d) \to \mathbb{R}$ and for all $\mu, \mu' \in \mathcal{P}_2(\mathbb{R}^d)$:
$$\int_{\mathbb{R}^d} \big[ h(\mu) - h(\mu') \big]\,d(\mu - \mu')(x) = \big[ h(\mu) - h(\mu') \big]\,(\mu - \mu')(\mathbb{R}^d) = 0.$$
Example 3. Consider the function:
$$h(x, \mu) = a\,x \cdot \int_{\mathbb{R}^d} y\,d\mu(y), \quad x \in \mathbb{R}^d, \ \mu \in \mathcal{P}_2(\mathbb{R}^d),$$
for some $a > 0$. Then $h$ satisfies the requirements of Definition 3.28. Indeed, for all $\mu, \mu' \in \mathcal{P}_2(\mathbb{R}^d)$:
$$\int_{\mathbb{R}^d} \big[ h(x, \mu) - h(x, \mu') \big]\,d(\mu - \mu')(x) = a \int_{\mathbb{R}^d} x \cdot \Big[ \int_{\mathbb{R}^d} y\,d(\mu - \mu')(y) \Big]\,d(\mu - \mu')(x).$$
Therefore,
$$\int_{\mathbb{R}^d} \big[ h(x, \mu) - h(x, \mu') \big]\,d(\mu - \mu')(x) = a\,\Big| \int_{\mathbb{R}^d} x\,d(\mu - \mu')(x) \Big|^2 \ge 0.$$
Example 4. Consider now the function:
$$h(x, \mu) = \int_{\mathbb{R}^d} \ell(x - y)\,d\mu(y), \quad x \in \mathbb{R}^d, \ \mu \in \mathcal{P}_2(\mathbb{R}^d),$$
for some Borel-measurable odd function $\ell$ satisfying $|\ell(x)| \le C(1 + |x|^2)$ for some $C > 0$ and all $x \in \mathbb{R}^d$. Then, $h$ is also covered by Definition 3.28. Indeed, for all $\mu, \mu' \in \mathcal{P}_2(\mathbb{R}^d)$:
$$\begin{aligned}
\int_{\mathbb{R}^d} \big[ h(x, \mu) - h(x, \mu') \big]\,d(\mu - \mu')(x)
&= \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} \ell(x - y)\,d(\mu - \mu')(y)\,d(\mu - \mu')(x)\\
&= \frac{1}{2}\int_{\mathbb{R}^d} \int_{\mathbb{R}^d} \ell(x - y)\,d(\mu - \mu')(y)\,d(\mu - \mu')(x)\\
&\quad + \frac{1}{2}\int_{\mathbb{R}^d} \int_{\mathbb{R}^d} \ell(y - x)\,d(\mu - \mu')(y)\,d(\mu - \mu')(x)\\
&= 0,
\end{aligned}$$
where we used Fubini's theorem to exchange the roles of $x$ and $y$, and the fact that $\ell(x - y) = -\ell(y - x)$, to pass from the second to the third line. This form of function $h$ is well adapted to our discussion of potential games in Chapter 6.
Example 5. Consider the function:
$$h(x, \mu) = \mu\big((-\infty, x)\big) + \frac{1}{2}\mu\big(\{x\}\big), \quad x \in \mathbb{R}, \ \mu \in \mathcal{P}_2(\mathbb{R}).$$
Then, $h$ satisfies the requirements of Definition 3.28. Notice that, when $\mu$ has no atoms, $h(x, \mu)$ coincides with the cumulative distribution function of $\mu$ at point $x$. Indeed, using the sign function ($\mathrm{sign}(x) = 1$ if $x > 0$, $-1$ if $x < 0$ and $0$ if $x = 0$), we have:
$$\int_{\mathbb{R}} \mathrm{sign}(x - y)\,d\mu(y) = \mu\big((-\infty, x)\big) - \mu\big((x, \infty)\big) = 2\Big[ \mu\big((-\infty, x)\big) + \frac{1}{2}\mu(\{x\}) \Big] - 1 = 2h(x, \mu) - 1.$$
Since the sign function is odd and bounded, the monotonicity of $h$ then follows from the previous example.
Example 6. Consider finally:
\[ h(x,\mu) = \int_{\mathbb{R}^d} \varphi(x - z)\, L\big( z, (\varphi \star \mu)(z) \big)\,dz, \qquad x \in \mathbb{R}^d,\ \mu \in \mathcal{P}_2(\mathbb{R}^d), \]
where $\varphi$ is a bounded even smooth probability density function over $\mathbb{R}^d$ such that $\int_{\mathbb{R}^d} |x|^2 \varphi(x)\,dx < \infty$ and $L : \mathbb{R}^d \times [0,\infty) \to [0,\infty)$ is nondecreasing in the second variable and satisfies, for any $r > 0$ and all $\varrho \in [-r, r]$, $|L(z,\varrho)| \leq C_r(1 + |\varrho|^2)$ for some constant $C_r > 0$. The notation $\star$ denotes the standard convolution product. Notice in particular that the function $\varphi \star \mu$ is bounded. Then, $h$ satisfies Definition 3.28. While the fact that $h$ is at most of quadratic growth is easily checked, monotonicity may be proved as follows. For all $\mu, \mu' \in \mathcal{P}_2(\mathbb{R}^d)$,
\[
\int_{\mathbb{R}^d} \big( h(x,\mu) - h(x,\mu') \big)\,d(\mu - \mu')(x)
= \int_{\mathbb{R}^d} \Big[ L\big( z, (\varphi \star \mu)(z) \big) - L\big( z, (\varphi \star \mu')(z) \big) \Big] \int_{\mathbb{R}^d} \varphi(x - z)\,\big( d\mu(x) - d\mu'(x) \big)\,dz
\]
\[
= \int_{\mathbb{R}^d} \Big[ L\big( z, (\varphi \star \mu)(z) \big) - L\big( z, (\varphi \star \mu')(z) \big) \Big] \Big[ (\varphi \star \mu)(z) - (\varphi \star \mu')(z) \Big]\,dz \geq 0,
\]
the last line following from the fact that $L$ is nondecreasing in the second variable, together with the fact that $\varphi$ is even, so that $\int_{\mathbb{R}^d} \varphi(x - z)\,d\mu(x) = (\varphi \star \mu)(z)$. When $h$ is understood as a cost functional, it increases at a point $x$ as the mass of $\mu$ in the neighborhood of $x$ increases.
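The monotonicity computations above are easy to probe numerically. The following sketch (ours, not from the text) checks the sign of $\int (h(x,\mu) - h(x,\mu'))\,d(\mu-\mu')(x)$ for the convolution-type $h$ of Example 6 in dimension $d = 1$, with a Gaussian kernel standing in for $\varphi$ and the simplest admissible choice $L(z,\varrho) = \varrho$, evaluated on empirical measures; all helper names and parameter values are our own:

import numpy as np

SIG = 0.7                                   # kernel bandwidth (our choice)
Z = np.linspace(-10.0, 10.0, 2001)          # quadrature grid for dz-integrals
DZ = Z[1] - Z[0]

def phi(u):
    """Even Gaussian smoothing kernel playing the role of varphi."""
    return np.exp(-u**2 / (2 * SIG**2)) / np.sqrt(2 * np.pi * SIG**2)

def conv(samples):
    """(phi * mu)(Z) for the empirical measure mu of `samples`."""
    return phi(Z - np.asarray(samples)[:, None]).mean(axis=0)

rng = np.random.default_rng(1)
xs, ys = rng.normal(0.0, 1.0, 300), rng.normal(0.8, 1.5, 300)  # samples of mu, mu'
cxs, cys = conv(xs), conv(ys)

def h_diff(x):
    """h(x, mu) - h(x, mu') with L(z, r) = r, via the grid quadrature."""
    return (phi(x - Z) * (cxs - cys)).sum() * DZ

gap = np.mean([h_diff(x) for x in xs]) - np.mean([h_diff(y) for y in ys])
direct = ((cxs - cys) ** 2).sum() * DZ
print(gap, direct)   # both nonnegative, and approximately equal

The quantity `direct` is the integral $\int \big((\varphi\star\mu)(z) - (\varphi\star\mu')(z)\big)^2\,dz$ from the last display, so the two printed numbers should agree up to Monte Carlo and discretization error.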
Importantly, observe that the above definition does not depend upon the choice of the probability space $(\Omega, \mathcal{F}, \mathbb{P})$, provided that $(\Omega, \mathcal{F}, \mathbb{P})$ is assumed to be rich enough to carry, for any joint distribution $\theta \in \mathcal{P}_2((\mathbb{R}^d)^2)$, a pair of random variables $(X, X')$ with $\theta$ as distribution. We shall address this latter point in detail in Chapter 5.
Examples of L-monotone functions will be given in Subsection 3.4.3. The reasons for the terminology "L-monotone" will be made clear in Subsection 5.7.1. Therein, we shall show that, surprisingly, the two notions of monotonicity have different origins.
We now provide another sufficient condition for uniqueness using the notion of L-monotonicity introduced in Definition 3.31:

Assumption (L-Monotonicity).
(A1) The coefficient $\sigma$ is constant and the coefficient $b$ does not depend upon the measure argument and reads, for all $(t,x,\alpha) \in [0,T] \times \mathbb{R}^d \times A$,
\[ b(t,x,\alpha) = b_1(t)\,x + b_2(t)\,\alpha. \]
(A4) The functions $\partial_x f_0(t,\cdot,\cdot)$ for $t \in [0,T]$, and $\partial_x g$ are L-monotone in the sense of Definition 3.31. Moreover, the function $f_1$ satisfies the following strong form of convexity: for all $t \in [0,T]$, $x, x' \in \mathbb{R}^d$ and $\alpha, \alpha' \in A$,
\[ f_1(t,x',\alpha') - f_1(t,x,\alpha) - (x' - x, \alpha' - \alpha) \cdot \partial_{(x,\alpha)} f_1(t,x,\alpha) \geq \lambda\,|\alpha' - \alpha|^2, \]
for some $\lambda > 0$. The notation $\partial_{(x,\alpha)} f_1$ stands for the gradient in the joint variables $(x,\alpha)$.
Proof. The proof depends upon the equilibrium criticality condition based on the necessary form of the Pontryagin stochastic maximum principle.

First Step. Owing to Theorem 3.27, the Pontryagin FBSDE of McKean-Vlasov type satisfied by any equilibrium takes the form:
\[
\begin{cases}
dX_t = b\big( t, X_t, \hat\alpha_t \big)\,dt + \sigma\,dW_t, & t \in [0,T], \\
dY_t = -\partial_x H\big( t, X_t, \mathcal{L}(X_t), Y_t, \hat\alpha_t \big)\,dt + Z_t\,dW_t, & t \in [0,T], \\
Y_T = \partial_x g\big( X_T, \mathcal{L}(X_T) \big).
\end{cases}
\tag{3.46}
\]
Let us assume that $X' = (X'_t)_{0 \leq t \leq T}$ is the optimal path of another equilibrium. The Pontryagin FBSDE of McKean-Vlasov type for $X'$ takes a similar form as long as we replace $(X_t, Y_t, Z_t, \hat\alpha_t)_{0 \leq t \leq T}$ with $(X'_t, Y'_t, Z'_t, \hat\alpha'_t)_{0 \leq t \leq T}$.
Second Step. Like in the derivation of the stochastic Pontryagin principle, we compute:
\[
d\big[ (X'_t - X_t) \cdot (Y'_t - Y_t) \big]
= \Big[ \big( b(t, X'_t, \hat\alpha'_t) - b(t, X_t, \hat\alpha_t) \big) \cdot (Y'_t - Y_t)
- \big( \partial_x H(t, X'_t, \mathcal{L}(X'_t), Y'_t, \hat\alpha'_t) - \partial_x H(t, X_t, \mathcal{L}(X_t), Y_t, \hat\alpha_t) \big) \cdot (X'_t - X_t) \Big]\,dt + dM_t,
\]
for some martingale $(M_t)_{0 \leq t \leq T}$. Using the special form of $b$ in (A1), the terms involving $b_1(t)$ cancel and we get:
\[
d\big[ (X'_t - X_t) \cdot (Y'_t - Y_t) \big]
= \Big[ b_2(t)\big( \hat\alpha'_t - \hat\alpha_t \big) \cdot (Y'_t - Y_t)
- \big( \partial_x f_0(t, X'_t, \mathcal{L}(X'_t)) - \partial_x f_0(t, X_t, \mathcal{L}(X_t)) \big) \cdot (X'_t - X_t)
- \big( \partial_x f_1(t, X'_t, \hat\alpha'_t) - \partial_x f_1(t, X_t, \hat\alpha_t) \big) \cdot (X'_t - X_t) \Big]\,dt + dM_t.
\tag{3.47}
\]
Now, since $\hat\alpha_t$ minimizes the Hamiltonian $\alpha \mapsto b_2(t)\alpha \cdot Y_t + f_1(t, X_t, \alpha)$, up to terms independent of $\alpha$, the strong convexity assumption (A4) yields:
\[
b_2(t)\hat\alpha'_t \cdot Y_t + f_1\big( t, X'_t, \hat\alpha'_t \big)
\geq b_2(t)\hat\alpha_t \cdot Y_t + f_1\big( t, X_t, \hat\alpha_t \big) + (X'_t - X_t) \cdot \partial_x f_1\big( t, X_t, \hat\alpha_t \big) + \lambda\,|\hat\alpha'_t - \hat\alpha_t|^2.
\]
Similarly,
\[
b_2(t)\hat\alpha_t \cdot Y'_t + f_1\big( t, X_t, \hat\alpha_t \big)
\geq b_2(t)\hat\alpha'_t \cdot Y'_t + f_1\big( t, X'_t, \hat\alpha'_t \big) + (X_t - X'_t) \cdot \partial_x f_1\big( t, X'_t, \hat\alpha'_t \big) + \lambda\,|\hat\alpha'_t - \hat\alpha_t|^2,
\]
and consequently, adding the two inequalities:
\[
b_2(t)\big( \hat\alpha'_t - \hat\alpha_t \big) \cdot (Y'_t - Y_t) - (X'_t - X_t) \cdot \big( \partial_x f_1(t, X'_t, \hat\alpha'_t) - \partial_x f_1(t, X_t, \hat\alpha_t) \big)
\leq -2\lambda\,|\hat\alpha'_t - \hat\alpha_t|^2.
\]
Plugging into (3.47) and using in addition the L-monotonicity of $\partial_x f_0$, we deduce that:
\[
\mathbb{E}\big[ (Y'_T - Y_T) \cdot (X'_T - X_T) \big] + 2\lambda\, \mathbb{E} \int_0^T |\hat\alpha'_t - \hat\alpha_t|^2\,dt \leq 0.
\]
Using now the terminal condition, and again the L-monotonicity condition, we get:
\[
\mathbb{E}\big[ (Y'_T - Y_T) \cdot (X'_T - X_T) \big]
= \mathbb{E}\Big[ \big( \partial_x g(X'_T, \mathcal{L}(X'_T)) - \partial_x g(X_T, \mathcal{L}(X_T)) \big) \cdot (X'_T - X_T) \Big] \geq 0,
\]
proving that:
\[ \mathbb{E} \int_0^T |\hat\alpha_t - \hat\alpha'_t|^2\,dt = 0, \]
from which we conclude that the two equilibria coincide. $\square$
Remark 3.33 As made clear by the proof of Theorem (Vol II)-1.59 in Chapter (Vol II)-1, the result easily extends to the case when $\sigma$ takes the form $\sigma(t,x) = \sigma_0(t) + \sigma_1(t)\,x$.
As an example, consider $f(x,\mu) = \int_{\mathbb{R}^d} h(x - x')\,d\mu(x')$ for a convex differentiable function $h : \mathbb{R}^d \to \mathbb{R}$ whose gradient $\partial h$ is odd (and, say, of linear growth so that all expectations below are finite). Then $\partial_x f$ is L-monotone.

Proof. We have:
\[ \partial_x f(x,\mu) = \int_{\mathbb{R}^d} \partial h(x - x')\,d\mu(x'), \qquad x \in \mathbb{R}^d,\ \mu \in \mathcal{P}_2(\mathbb{R}^d). \]
Then, for two square-integrable random variables $X$ and $X'$ with values in $\mathbb{R}^d$ and respective distributions $\mu$ and $\mu'$, we have:
\[
\mathbb{E}\Big[ \big( \partial_x f(X,\mu) - \partial_x f(X',\mu') \big) \cdot (X - X') \Big]
= \mathbb{E}\Big[ \big( \partial h(X - Y) - \partial h(X' - Y') \big) \cdot (X - X') \Big],
\]
where $(Y, Y')$ has the same distribution as, and is independent of, $(X, X')$. Since $\partial h$ is odd, we have, exchanging the roles of $(X,X')$ and $(Y,Y')$:
\[
\mathbb{E}\Big[ \big( \partial h(X - Y) - \partial h(X' - Y') \big) \cdot (X - X') \Big]
= \mathbb{E}\Big[ \big( \partial h(Y - X) - \partial h(Y' - X') \big) \cdot (Y - Y') \Big]
= -\mathbb{E}\Big[ \big( \partial h(X - Y) - \partial h(X' - Y') \big) \cdot (Y - Y') \Big].
\]
Therefore,
\[
\mathbb{E}\Big[ \big( \partial_x f(X,\mu) - \partial_x f(X',\mu') \big) \cdot (X - X') \Big]
= \frac{1}{2}\,\mathbb{E}\Big[ \big( \partial h(X - Y) - \partial h(X' - Y') \big) \cdot \big( (X - Y) - (X' - Y') \big) \Big] \geq 0,
\]
the last inequality following from the fact that $h$ is convex. This proves that $\partial_x f$ is L-monotone. $\square$
3.5 Linear Quadratic Mean Field Games

Our first application of the strategy and the results obtained in this chapter concerns the Linear Quadratic (LQ for short) models. The linearity of the coefficients and the convexity of the costs are screaming for the use of the stochastic maximum principle approach, as the weak formulation approach cannot take advantage of these features as easily.

With $A$ being equal to the entire $\mathbb{R}^k$, we use the notation and assumptions introduced in Subsection 2.3.4 of Chapter 2, but with $W$ of dimension $d$ and $\sigma$ a constant matrix in $\mathbb{R}^{d \times d}$. In this setting, assumption (A3) is also satisfied as the matrices $q(t)$ and $\bar q(t)$ are symmetric nonnegative semi-definite and continuous in time $t \in [0,T]$ and the matrix $r(t)$ is symmetric strictly positive definite and continuous in $t \in [0,T]$; in particular, $r(t)$ is strictly positive definite, uniformly in $t \in [0,T]$. Finally, since
\[ g(x,\mu) = \frac{1}{2}\Big[ x \cdot q x + (x - s\bar\mu) \cdot \bar q\,(x - s\bar\mu) \Big], \qquad \text{with } \bar\mu = \int_{\mathbb{R}^d} x\,d\mu(x), \]
the terminal cost also fits the required convexity conditions. The implementation of the MFG strategy then comprises the following two steps:
(i) For each fixed deterministic function $[0,T] \ni t \mapsto \bar\mu_t \in \mathbb{R}^d$, solve the standard stochastic control problem
\[
\inf_{\alpha \in \mathbb{A}} \mathbb{E}\bigg[ \frac{1}{2}\, X_T \cdot q X_T + \frac{1}{2}\, (X_T - s\bar\mu_T) \cdot \bar q\,(X_T - s\bar\mu_T)
+ \frac{1}{2} \int_0^T \Big( X_t \cdot q(t) X_t + \big( X_t - s(t)\bar\mu_t \big) \cdot \bar q(t)\big( X_t - s(t)\bar\mu_t \big) + \alpha_t \cdot r(t)\alpha_t \Big)\,dt \bigg]
\]
subject to
\[
\begin{cases}
dX_t = \big[ b_1(t) X_t + b_2(t)\alpha_t + \bar b_1(t)\bar\mu_t \big]\,dt + \sigma\,dW_t, \\
X_0 = \xi.
\end{cases}
\tag{3.48}
\]
(ii) Determine a function $[0,T] \ni t \mapsto \bar\mu_t \in \mathbb{R}^d$ so that, for all $t \in [0,T]$, $\mathbb{E}[X_t] = \bar\mu_t$, where $(X_t)_{0 \leq t \leq T}$ is the optimal path of the optimal control problem in the environment $(\bar\mu_t)_{0 \leq t \leq T}$.

The Hamiltonian is minimized by:
\[ \hat\alpha = \hat\alpha(t,x,\mu,y) = -r(t)^{-1} b_2(t)^\top y. \tag{3.49} \]
This system of ordinary differential equations is not always easy to solve despite its deceptive simplicity. Its forward/backward nature is the source of difficulty. We shall say more below, especially in the univariate case $d = m = k = 1$. In any case, its properties play a crucial role in the solution of the linear quadratic mean field game. Indeed, we have the following statement.

Theorem 3.34 The LQ mean field game problem defined through (3.48) admits a solution (respectively, a unique solution) if and only if the deterministic forward-backward system (3.52) admits a solution (respectively, a unique solution).
Proof. Clearly, existence of a solution to the MFG problem implies existence of a solution for (3.52). We now prove the analogue for uniqueness. To do so, let us assume that there is at most one equilibrium for the MFG problem. For each solution $(\bar x_t, \bar y_t)_{0 \leq t \leq T}$ of (3.52), we can solve the system (3.50) with $\bar x_t$ in lieu of $\bar\mu_t$, and conclude that $\mathbb{E}[X_t] = \bar x_t$ and $\mathbb{E}[Y_t] = \bar y_t$ for all $t \in [0,T]$. Indeed, forming the difference between (3.50) and (3.52), we see that $(\mathbb{E}[X_t] - \bar x_t, \mathbb{E}[Y_t] - \bar y_t)_{0 \leq t \leq T}$ is the solution of a homogeneous linear system of order 1 with zero initial condition; invoking Theorem 3.17, or duplicating its proof, we observe that this homogeneous linear system has zero as unique solution because of the strict convexity assumption implied by the fact that $r(t)$ is strictly positive definite. Therefore, the solution $(X_t, Y_t)_{0 \leq t \leq T}$ of (3.50) with $\bar x_t$ in lieu of $\bar\mu_t$ solves (3.51); since (3.51) is uniquely solvable, this shows that $(\bar x_t = \mathbb{E}[X_t], \bar y_t = \mathbb{E}[Y_t])_{0 \leq t \leq T}$ is uniquely determined. This concludes the proof of the uniqueness of the solution of (3.52).

Conversely, let us assume existence of a unique solution for the deterministic system (3.52). Recall from Theorem 3.17 that for each fixed deterministic continuous function $[0,T] \ni t \mapsto \bar\mu_t \in \mathbb{R}^d$, the FBSDE (3.50) is uniquely solvable. Using the unique solution $(\bar x_t)_{0 \leq t \leq T}$ of (3.52) in lieu of $(\bar\mu_t)_{0 \leq t \leq T}$ as input in (3.50) and forming as above the difference between (3.50) and (3.52), we get by the same argument that $\mathbb{E}[X_t] = \bar x_t$ for all $t \in [0,T]$, which proves that the fixed point step of the MFG strategy is also satisfied. Furthermore, the uniqueness of the solution of (3.52) and the uniqueness of the solution of (3.50) for $[0,T] \ni t \mapsto \bar\mu_t \in \mathbb{R}^d$ fixed imply the uniqueness of the MFG equilibrium. $\square$
Consistent with the time honored method to solve affine FBSDEs, we may want to look for a solution of (3.52) in the form $\bar y_t = \bar\eta_t \bar x_t + \bar\chi_t$, where $t \mapsto \bar\eta_t$ and $t \mapsto \bar\chi_t$ are smooth functions with values in $\mathbb{R}^{d \times d}$ and $\mathbb{R}^d$ respectively. For the sake of notation, we rewrite the forward-backward system (3.52) in the form:
\[
\begin{cases}
\dot{\bar x}_t = a_t \bar x_t + b_t \bar y_t, \\
\dot{\bar y}_t = c_t \bar x_t + d_t \bar y_t, \qquad t \in [0,T], \\
\bar x_0 = \mathbb{E}[\xi], \qquad \bar y_T = e\,\bar x_T,
\end{cases}
\tag{3.53}
\]
where the coefficients $a_t$, $b_t$, $c_t$, $d_t$ and the terminal matrix $e$ are read off from (3.52); in particular, $b_t = -b_2(t) r(t)^{-1} b_2(t)^\top$ and $e = q + \bar q(I_d - s)$, as recalled below.
Notice that we use the standard ODE notation of a dot for the time derivative of deterministic functions of time. If we compute the derivative of $\bar y_t$ from the ansatz $\bar y_t = \bar\eta_t \bar x_t + \bar\chi_t$ and use the forward equation to express $\dot{\bar x}_t$, we obtain:
\[ \dot{\bar y}_t = \dot{\bar\eta}_t \bar x_t + \bar\eta_t \big( a_t \bar x_t + b_t(\bar\eta_t \bar x_t + \bar\chi_t) \big) + \dot{\bar\chi}_t, \]
and identifying the two forms of the derivative $\dot{\bar y}_t$, we find that given the ansatz, the system (3.53) is equivalent to the system:
\[
\begin{cases}
\dot{\bar\eta}_t + \bar\eta_t a_t - d_t \bar\eta_t + \bar\eta_t b_t \bar\eta_t - c_t = 0, & \bar\eta_T = e, \\
\dot{\bar\chi}_t + \big[ \bar\eta_t b_t - d_t \big] \bar\chi_t = 0, & \bar\chi_T = 0.
\end{cases}
\tag{3.54}
\]
The first equation is a matrix Riccati equation which is not always solvable on a time interval of pre-assigned length. When it is, its solution can be injected into the second equation, which then becomes a first order homogeneous linear equation with terminal condition zero, so its solution is identically zero.
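As a simple illustration of how (3.54) is handled in practice, here is a backward Euler sketch (ours) in the scalar case $d = 1$; the constant coefficients below are arbitrary illustrative values, not taken from the text:

import numpy as np

T, N = 1.0, 10_000
dt = T / N
a, b, c, d, e = 0.5, -1.0, -0.3, 0.2, 1.0   # illustrative scalar constants only

eta = np.empty(N + 1); chi = np.empty(N + 1)
eta[N], chi[N] = e, 0.0
for n in range(N, 0, -1):                    # integrate backward from t = T
    # Riccati: eta' = -(eta*a - d*eta + b*eta**2 - c)
    eta[n - 1] = eta[n] + dt * (eta[n] * a - d * eta[n] + b * eta[n]**2 - c)
    # linear offset: chi' = -(eta*b - d)*chi ; chi(T) = 0 keeps it at 0
    chi[n - 1] = chi[n] + dt * (eta[n] * b - d) * chi[n]

print(eta[0], abs(chi).max())   # chi vanishes identically, as claimed

With other coefficient choices the Riccati component may blow up before reaching $t = 0$, which is exactly the solvability issue discussed in the text.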
Let us assume momentarily that the Riccati equation appearing as the first equation in the system (3.54) has a unique solution which we denote by $(\bar\eta_t)_{0 \leq t \leq T}$. Injecting the ansatz $\bar y_t = \bar\eta_t \bar x_t$ into the first equation of the system (3.53), we find that $(\bar x_t)_{0 \leq t \leq T}$ has to solve the ODE:
\[ \dot{\bar x}_t = \big( a_t + b_t \bar\eta_t \big) \bar x_t, \qquad t \in [0,T]; \qquad \bar x_0 = \mathbb{E}[\xi], \tag{3.55} \]
which is a linear ODE for which existence and uniqueness are guaranteed. Finding the optimal mean function $[0,T] \ni t \mapsto \bar x_t$ guarantees the existence of a solution to the MFG problem, but it does not tell much about the optimal state trajectories or the optimal control. The latter can be obtained by plugging the so-obtained $\bar x_t$ into the FBSDE (3.51) in lieu of $\mathbb{E}[X_t]$ and solving for $X = (X_t)_{0 \leq t \leq T}$ and $Y = (Y_t)_{0 \leq t \leq T}$.
This search reduces to the solution of the affine FBSDE:
\[
\begin{cases}
dX_t = \big[ a_t X_t + b_t Y_t + c_t \big]\,dt + \sigma\,dW_t, \\
dY_t = \big[ m_t X_t - a_t^\top Y_t + d_t \big]\,dt + Z_t\,dW_t, \\
X_0 = \xi, \qquad Y_T = q X_T + r,
\end{cases}
\tag{3.56}
\]
where the coefficients $m_t$, $d_t$ and the terminal data $q$ and $r$ are determined by the cost coefficients and by the input $(\bar x_t)_{0 \leq t \leq T}$. We then make the ansatz:
\[ Y_t = \eta_t X_t + \chi_t. \tag{3.57} \]
Notice that taking expectations on both sides of this ansatz we get $\mathbb{E}[Y_t] = \eta_t \mathbb{E}[X_t] + \chi_t$, but since both functions $\eta$ and $\chi$ depend upon the function $[0,T] \ni t \mapsto \bar x_t$, there is no contradiction with the formula $\bar y_t = \bar\eta_t \bar x_t$ even if, as we are about to find out, the function $\chi$ is not identically zero, since the function $\eta$ may solve a Riccati equation different from the Riccati equation solved by $\bar\eta$.

Computing $dY_t$ from ansatz (3.57) by using the expression of $dX_t$ given by the first equation of (3.56), we get:
\[ dY_t = \big[ \dot\eta_t X_t + \eta_t\big( a_t X_t + b_t(\eta_t X_t + \chi_t) + c_t \big) + \dot\chi_t \big]\,dt + \eta_t \sigma\,dW_t, \]
and identifying term by term with the expression of $dY_t$ given in (3.56), we get:
\[
\begin{cases}
\dot\eta_t + \eta_t b_t \eta_t + \eta_t a_t + a_t^\top \eta_t - m_t = 0, & \eta_T = q, \\
\dot\chi_t + \big( a_t^\top + \eta_t b_t \big)\chi_t - d_t + \eta_t c_t = 0, & \chi_T = r, \\
Z_t = \eta_t \sigma.
\end{cases}
\tag{3.58}
\]
As before, the first equation is a matrix Riccati equation. If and when it can be solved, the third equation becomes solved automatically, and the second equation becomes a first order linear ODE, though not homogeneous this time, which can be solved by standard methods. Notice that the quadratic terms of the two Riccati equations (3.54) and (3.58) are the same since $b_t = -b_2(t) r(t)^{-1} b_2(t)^\top$ in both. However, the terminal conditions are different since the terminal condition in (3.58) is given by the matrix $q + \bar q$ while it was given by $e = q + \bar q(I_d - s)$ in (3.54). Notice also that the first order terms are different as well.
Stochastic Maximum Principle and Riccati Equation. Under the same assumptions on the dynamics, we consider the minimization of the functional:
\[
\tilde J(\alpha) = \frac{1}{2}\, x_T \cdot \tilde q\, x_T + \int_0^T \frac{1}{2} \Big( x_t \cdot \tilde q_t\, x_t + \alpha_t \cdot r_t \alpha_t \Big)\,dt,
\]
where $\alpha = (\alpha_t)_{0 \leq t \leq T}$ is a deterministic control in $L^2([0,T]; A)$. Pay attention that $(a_t)_{0 \leq t \leq T}$ and $(b_t)_{0 \leq t \leq T}$ may differ from the coefficients $(a_t)_{0 \leq t \leq T}$ and $(b_t)_{0 \leq t \leq T}$ defined earlier and, similarly, $(r_t)_{0 \leq t \leq T}$ may differ from $(r(t))_{0 \leq t \leq T}$. Notice that this problem has a unique solution if we assume as before that the matrix coefficients are continuous functions of the time variable $t$, and if we also assume that $\tilde q$ and $\tilde q_t$ are symmetric and nonnegative semi-definite, and that $r_t$ is symmetric and strictly positive definite. The Hamiltonian:
\[ H(t,x,y,\alpha) = y \cdot a_t x + y \cdot b_t \alpha + \frac{1}{2}\, x \cdot \tilde q_t x + \frac{1}{2}\, \alpha \cdot r_t \alpha \]
is minimized for $\alpha = \hat\alpha(t,x,y) = -r_t^{-1} b_t^\top y$, so that the forward-backward system given by the maximum principle reads:
\[
\begin{cases}
\dot x_t = a_t x_t - b_t r_t^{-1} b_t^\top y_t, \\
\dot y_t = -\tilde q_t x_t - a_t^\top y_t, \qquad t \in [0,T], \\
x_0 = \mathbb{E}[\xi], \qquad y_T = \tilde q\, x_T.
\end{cases}
\tag{3.59}
\]
Making the ansatz $y_t = \eta_t x_t + \chi_t$ as before, $(\eta_t)_{0 \leq t \leq T}$ must solve the matrix Riccati equation:
\[ \dot\eta_t - \eta_t b_t r_t^{-1} b_t^\top \eta_t + \eta_t a_t + a_t^\top \eta_t + \tilde q_t = 0, \qquad \eta_T = \tilde q, \tag{3.60} \]
while the offset term $(\chi_t)_{0 \leq t \leq T}$ solves:
\[ \dot\chi_t + \big[ a_t - b_t r_t^{-1} b_t^\top \eta_t \big]^\top \chi_t = 0, \qquad \chi_T = 0, \tag{3.61} \]
whose solution is identically zero, the optimal path then solving $\dot x_t = a_t x_t - b_t r_t^{-1} b_t^\top \eta_t x_t - b_t r_t^{-1} b_t^\top \chi_t$. Importantly, the Riccati equation (3.60) coincides with the Riccati equation in (3.58) when $a_t$ and $b_t$ are chosen as in (3.56) and $m_t = -\tilde q_t$, and we can use the equivalence provided by the maximum principle to conclude existence and uniqueness of a solution for the Riccati equation in (3.58).
However, in our study of LQ mean field games, it is also necessary to provide conditions implying existence and uniqueness of a solution to the system (3.52) which, according to Theorem 3.34, is equivalent to the existence and uniqueness of a solution to the LQ mean field game problem. Indeed, as we already remarked, the only issue is the solution of the fixed point problem (ii), since for any continuous function $[0,T] \ni t \mapsto \bar\mu_t \in \mathbb{R}^d$ the standard optimal control problem (i) has a unique solution under the above assumptions on the coefficients.

Notice that if we assume that the $\mathbb{R}^d$-valued function $t \mapsto \chi_t$ satisfies an equation of the form:
\[ \dot\chi_t = c_t x_t + d_t \chi_t, \]
like in the second equation of the system (3.53), and if we define $t \mapsto e_t$ as the unique $\mathbb{R}^{d \times d}$-valued solution of the matrix ODE $\dot e_t = -\bar b_1(t)^\top e_t$ with initial condition $e_0 = I_d$, then, the $\mathbb{R}^d$-valued function $t \mapsto \tilde\chi_t$ defined by $\tilde\chi_t = e_t \chi_t$ satisfies:
\[ \dot{\tilde\chi}_t = e_t c_t x_t + \big( d_t - \bar b_1(t)^\top \big) \tilde\chi_t, \]
if the matrices $e_t$ and $d_t$ commute in the sense that $e_t d_t = d_t e_t$ for all $t \in [0,T]$. Notice that, while this commutativity property is often satisfied in applications (in particular in the unidimensional case $d = k = 1$), it is still rather restrictive. We make it here for the purpose of the discussion of the assumptions under which the fixed point step (ii) can be solved for LQ models.
The relevance of this remark comes about in the following way. The above maximum principle argument shows that if we set $\tilde\chi_t = e_t y_t$, the pair $(x_t, \tilde\chi_t)_{0 \leq t \leq T}$ satisfies:
\[
\begin{cases}
\dot x_t = \big[ b_1(t) + \bar b_1(t) \big] x_t - b_2(t)\,\tilde r_t^{-1} b_2(t)^\top e_t^{-1}\,\tilde\chi_t, \\
\dot{\tilde\chi}_t = -e_t \tilde q_t x_t - b_1(t)^\top \tilde\chi_t, \qquad t \in [0,T], \\
x_0 = \mathbb{E}[\xi], \qquad \tilde\chi_T = e_T\,\tilde q\, x_T,
\end{cases}
\tag{3.63}
\]
which is nothing but the system (3.52) if we can choose $(\tilde q_t)_{0 \leq t \leq T}$ and $\tilde q$ so that:
\[
\begin{cases}
e_t \tilde q_t = q(t) + \bar q(t) - \bar q(t)s(t), & t \in [0,T], \\
e_T \tilde q = q + \bar q - \bar q s,
\end{cases}
\]
together with $\tilde r_t = e_t^{-1} r(t)$,
for all $t \in [0,T]$. Checking that $\tilde q_t$ is symmetric and nonnegative semi-definite and that $\tilde r_t$ is symmetric and strictly positive definite may be difficult. Still, if $\bar b_1(t)$ happens to be a multiple of the $d \times d$ identity matrix $I_d$, $e_t$ will also be a multiple of $I_d$, with a positive multiplicative constant. As such, $e_t$ will commute with all the $d \times d$ matrices. In this case, we may choose $\tilde r_t$ as equal to $r_t$ up to a multiplicative constant. Moreover, $\tilde q_t$ will be symmetric if $\bar q(t)s(t)$ is symmetric, and nonnegative semi-definite if $q(t) + \bar q(t) - \bar q(t)s(t) \geq 0$, which is the case if $\bar q(t)s(t) \leq 0$, and similarly for $\tilde q$. This matches condition (A7) in the set of assumption MFG Solvability SMP that we shall introduce in Chapter 4 in order to prove existence of an MFG equilibrium within a larger framework that includes both Theorem 3.24 and the linear-quadratic case.
We deduce:

Proposition 3.35 Assume, as above, that the matrix coefficients are continuous, $q$, $\bar q$, $q(t)$, and $\bar q(t)$ are nonnegative semi-definite, and $r(t)$ is strictly positive definite. Assume also that $\bar b_1(t)$ is a multiple of the identity matrix $I_d$, that for all $t \in [0,T]$, $\bar q(t)s(t)$ is symmetric and $q(t) + \bar q(t) - \bar q(t)s(t)$ is nonnegative semi-definite, and that $\bar q s$ is symmetric and $q + \bar q - \bar q s$ is nonnegative semi-definite. Then, the LQ mean field game problem defined through (3.48) has a unique solution.
The first equation is now a scalar Riccati equation and, according to the classical theory of scalar Ordinary Differential Equations (ODEs for short), a straightforward approach is to reduce it to a second order linear equation, via the standard substitution $\eta_t = \dot\psi_t / (b_t \psi_t)$. Once the Riccati equation is solved, the second equation is a first order linear ODE whose solution reads:
\[
\chi_t = r\, e^{\int_t^T [a_u + b_u \eta_u]\,du} - \int_t^T \big[ d_s - c_s \eta_s \big]\, e^{\int_t^s [a_u + b_u \eta_u]\,du}\,ds, \qquad t \in [0,T].
\tag{3.65}
\]
When the Riccati equation is well posed, its solution does not blow up and all the terms above are integrable. Now that the deterministic functions $(\eta_t)_{0 \leq t \leq T}$ and $(\chi_t)_{0 \leq t \leq T}$ are computed, we rewrite the forward stochastic differential equation for the dynamics of the state, see (3.56), using the ansatz (3.57):
\[ dX_t = \big[ \big( a_t + b_t \eta_t \big) X_t + b_t \chi_t + c_t \big]\,dt + \sigma\,dW_t, \tag{3.66} \]
which provides the solution to (3.56) in the univariate case once the fixed point condition in the LQ mean field game problem has been solved.
In order to solve the fixed point condition, we may pursue the argument started earlier in the multidimensional case. To do so, we notice that in the one-dimensional case, the function $(e_t)_{0 \leq t \leq T}$ used in (3.63) is given explicitly by:
\[ e_t = \exp\Big( -\int_0^t \bar b_1(u)\,du \Big), \qquad t \in [0,T], \]
and since the commutativity conditions are automatically satisfied we only need to check the positivity conditions. Since $\tilde r_t = e_t^{-1} r(t)$ is strictly positive if and only if $r(t)$ is, which is part of our assumptions, the only requirement we need to guarantee existence of a solution to the MFG problem is the nonnegativity of $\tilde q_t$, for $t \in [0,T]$, and of $\tilde q$, which amounts to assuming that $q(t) + \bar q(t) - \bar q(t)s(t) \geq 0$, for $t \in [0,T]$, and $q + \bar q - \bar q s \geq 0$.
Remark 3.36 Whenever we solve the above LQ mean field game problems with a deterministic initial private state $\xi = x_0 \in \mathbb{R}$, the equilibrium state process $X = (X_t)_{0 \leq t \leq T}$, its adjoint process $Y = (Y_t)_{0 \leq t \leq T}$ as well as the optimal control process $\alpha = (\alpha_t)_{0 \leq t \leq T}$ are Gaussian processes whose mean and auto-covariance functions can be computed explicitly in terms of the functions $(\eta_t)_{0 \leq t \leq T}$ and $(\chi_t)_{0 \leq t \leq T}$.

Remark 3.37 We refer to (2.49)–(2.50) for the analysis of the Riccati equation in (3.64) when the coefficients $b$, $a$ and $m$ are constant.
We also assume that the running cost is simply the square of the control, i.e., $f(t,x,\mu,\alpha) = \alpha^2/2$, so that $r(t) = 1$ and $q(t) = \bar q(t) = 0$. In particular, $a_t = d_t = c_t = 0$ and $b_t = -1$, see (3.56). In this example, the interaction between the players occurs only through the terminal cost, which we assume to be of the form $g(x,\mu) = \bar q\,(x - s\bar\mu)^2/2$ for some $\bar q > 0$ and $s \in \mathbb{R}$ to conform with the setting of this section. Using the notation and the results above, we see that after fixing the mean $\bar\mu_t = \mathbb{E}[X_t]$, the FBSDE from the Pontryagin stochastic maximum principle has the simple form:
\[
\begin{cases}
dX_t = -Y_t\,dt + \sigma\,dW_t, \\
dY_t = Z_t\,dW_t, \qquad t \in [0,T], \\
X_0 = \xi, \qquad Y_T = q X_T + r.
\end{cases}
\tag{3.67}
\]
In this case, the solutions of the equations (3.58) are explicitly given by $\eta_t = q/(1 + q(T-t))$ and $\chi_t = r/(1 + q(T-t))$ (keep in mind that $q \geq 0$ so that these functions are well defined), and plugging these expressions into (3.66) we get:
\[
X_t = \frac{1 + q(T-t)}{1 + qT}\,\xi - \frac{r\,t}{1 + qT} + \big[ 1 + q(T-t) \big]\,\sigma \int_0^t \frac{dW_u}{1 + q(T-u)}.
\tag{3.68}
\]
Notice further that the optimal control $\alpha_t$ and the adjoint process $Y_t$ satisfy:
\[ \alpha_t = -Y_t = -\frac{q X_t + r}{1 + q(T-t)}, \]
and that the only quantity depending upon the fixed mean function $t \mapsto \bar\mu_t$ is the constant $r = -\bar q\, s\, \bar\mu_T$, which depends only upon the mean state at the end of the time interval. Recalling that $q = \bar q$, this makes the search for a fixed point very simple, and one easily checks that, provided $1 + \bar q(1-s)T \neq 0$, the fixed point condition holds if and only if:
\[ \bar\mu_T = \frac{\mathbb{E}[\xi]}{1 + \bar q(1-s)T}. \tag{3.69} \]
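Since everything is explicit in this example, the equilibrium can be checked by direct simulation. The following sketch (ours) computes the fixed point (3.69) and then samples $X_T$ from (3.68) by an Euler discretization of the stochastic integral; the parameter values and the Gaussian law chosen for $\xi$ are illustrative assumptions, not taken from the text:

import numpy as np

q_bar, s, T, sigma = 1.0, 0.5, 2.0, 0.3
E_xi = 1.0
assert 1 + q_bar * (1 - s) * T != 0          # otherwise no equilibrium

mu_T = E_xi / (1 + q_bar * (1 - s) * T)      # fixed point (3.69)
q, r = q_bar, -q_bar * s * mu_T              # terminal condition Y_T = q X_T + r

rng = np.random.default_rng(2)
N, M = 500, 20_000                           # time steps, Monte Carlo samples
t = np.linspace(0.0, T, N + 1)
xi = E_xi + 0.2 * rng.standard_normal(M)
dW = np.sqrt(T / N) * rng.standard_normal((M, N))

weights = 1.0 / (1.0 + q * (T - t[:-1]))     # integrand 1/(1 + q(T-u)) at left endpoints
stoch_T = sigma * (dW * weights).sum(axis=1) # stochastic integral evaluated at t = T
X_T = xi / (1.0 + q * T) - r * T / (1.0 + q * T) + stoch_T

print(X_T.mean(), mu_T)   # the empirical mean of X_T should be close to mu_T

The printed empirical mean matching $\bar\mu_T$ is exactly the fixed point condition $\mathbb{E}[X_T] = \bar\mu_T$ of step (ii).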
Remark 3.38 Whenever $1 + \bar q(1-s)T < 0$, the reader may find contradictory the fact that $\bar\mu_T$ and $\bar\mu_0$ have opposite signs in equilibrium, while (3.55) seemingly implies that $\mathbb{E}[X_t]$ has the same sign as $\mathbb{E}[X_0] = \mathbb{E}[\xi]$ in equilibrium. To resolve this ostensible contradiction, we must recall that (3.55) holds whenever the Riccati equation in (3.54) is solvable. Whenever $1 + \bar q(1-s)T < 0$, the Riccati equation in (3.54) is certainly not solvable on $[0,T]$, as otherwise there would be a solution on the shorter interval $[0,T_c]$ with the same terminal condition at $T_c$ in lieu of $T$, where $T_c$ is the critical time when $1 + \bar q(1-s)T_c = 0$. Of course, the Riccati equation cannot be solvable on $[0,T_c]$ since (3.69) asserts that there is no equilibrium when $T = T_c$.
3.6 Revisiting Some of the Examples of Chapter 1

We test the results of this chapter on some of the examples introduced in Chapter 1. We first consider the mean field game problem arising from the flocking model introduced in Subsection 1.5.1 of Chapter 1. As in the discussion of the finite player game given in Section 2.4 of Chapter 2, we first consider the particular case $\beta = 0$. The general case $\beta \neq 0$ will be treated in Chapter 4. As explained in Section 2.4, when $\beta = 0$, the weights (1.44) are identically equal to a constant, and the costs to the individuals depend only upon their velocities. Since the position does not appear in the dynamics of the velocities, it is possible to reframe the model in terms of the velocities only. Doing so, we are facing a linear quadratic mean field game model fitting perfectly the framework discussed in this chapter.

As explained in Section 2.4 of Chapter 2, we use the notation $X^i_t$ to denote the velocity at time $t$ of bird $i$. This choice of the upper case letter $X$ for the velocity is made to conform with the notation used throughout Chapter 2. Later on, when we consider the general case in Chapter 4, we shall switch back to the original notation introduced in Chapter 1. Here we focus on the dynamics of the velocity:
\[ dX_t = \alpha_t\,dt + \sigma\,dW_t, \]
and recalling the form (1.48) of the running cost, the minimization concerns the cost functional:
\[ J(\alpha) = \mathbb{E}\bigg[ \int_0^T \Big( \frac{1}{2}\,|\alpha_t|^2 + \frac{\kappa^2}{2}\,|X_t - \bar\mu_t|^2 \Big)\,dt \bigg]. \]
In the notation of the previous section, the system (3.58) then comprises a matrix Riccati equation for $\eta$ and a linear equation for $\chi$. The first equation is a $d \times d$ matrix Riccati equation. However, in its present very special form, its solution can clearly be searched for as a scalar multiple of the identity. So if the above matrix valued function $t \mapsto \eta_t$ is of the form $t \mapsto \eta_t I_3$ for some real valued function $t \mapsto \eta_t$, the latter should solve the scalar Riccati equation:
\[ \dot\eta_t - \eta_t^2 + \kappa^2 = 0, \qquad \eta_T = 0. \tag{3.70} \]
We chose to use the same letter $\eta$ for both the matrix valued and the scalar functions, not so much because of a shortage of Greek characters, but because we already considered a scalar Riccati equation of the type (3.70) in Section 2.4. In fact, this very equation appeared as the limit $N \to \infty$ of the Riccati equations providing open loop and Markovian Nash equilibria for the $N$ player games, and we explained there that this equation has a unique solution given by (2.59):
\[ \eta_t = \kappa\,\frac{e^{2\kappa(T-t)} - 1}{e^{2\kappa(T-t)} + 1}, \qquad t \in [0,T]. \]
Similarly, the components $\chi^i_t$ of the vector valued function $\chi_t$ are given by:
\[ \chi^i_t = -\kappa^2\,\bar\mu^i_0 \int_t^T e^{\int_s^t \eta_u\,du}\,ds, \qquad t \in [0,T],\quad i = 1, \cdots, d. \]
Fig. 3.1 Monte Carlo simulations of the flocking model: sample velocities ($v_1$, $v_2$, left pane) and the corresponding positions ($x_1$, $x_2$, right pane).
These dynamics are mean reverting because the scalar function $t \mapsto \eta_t$ is positive. In fact, the velocity is given by the explicit formula:
\[
X_t = e^{-\int_0^t \eta_u\,du}\, v_0 - \int_0^t e^{-\int_s^t \eta_u\,du}\,\chi_s\,ds + \sigma \int_0^t e^{-\int_s^t \eta_u\,du}\,dW_s, \qquad t \in [0,T],
\]
where $v_0$ is the initial velocity, from which we obtain $x_t = x_0 + \int_0^t X_s\,ds$ for the position at time $t$ of a typical bird in the flock, $x_0$ denoting the initial position. The left pane of Figure 3.1 shows the results of $N = 50$ Monte Carlo simulations of the model with $\sigma = 1$, $\kappa = 1$ and $T = 10$ in dimension $d = 2$.
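A simulation in the spirit of Figure 3.1 is straightforward to reproduce. The sketch below (ours) runs an Euler-Maruyama scheme for the equilibrium velocity dynamics $dX_t = -(\eta_t X_t + \chi_t)\,dt + \sigma\,dW_t$, using the closed form $\eta_t = \kappa\tanh(\kappa(T-t))$ of the Riccati solution and assuming a centered initial mean $\bar\mu_0 = 0$ so that $\chi \equiv 0$; the remaining choices are ours:

import numpy as np

kappa, sigma, T = 1.0, 1.0, 10.0
N_birds, N_steps = 50, 5_000
dt = T / N_steps
rng = np.random.default_rng(3)

X = rng.normal(0.0, 0.5, size=(N_birds, 2))          # initial 2d velocities
spread = [np.linalg.norm(X.std(axis=0))]
for n in range(N_steps):
    eta = kappa * np.tanh(kappa * (T - n * dt))      # explicit Riccati solution
    X += -eta * X * dt + sigma * np.sqrt(dt) * rng.standard_normal(X.shape)
    spread.append(np.linalg.norm(X.std(axis=0)))

# mean reversion toward the common mean velocity while eta_t stays positive
print(spread[0], min(spread), spread[-1])

Near the terminal time $\eta_t \to 0$, so the flocking (contraction of the velocity spread) weakens and the diffusion takes over, which is visible in the printed values.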
Even though we were able to solve the finite player game for open and closed loop Nash equilibria, it is instructive to consider the mean field game version of the toy model of systemic risk in the absence of common noise, i.e., when $\rho = 0$. The general case will be discussed later in Chapter (Vol II)-4. When $\rho = 0$, this model is a particular case of the LQ mean field game models considered earlier. Indeed, the MFG strategy is based on the following two steps:

(i) For each fixed deterministic function $[0,T] \ni t \mapsto m_t \in \mathbb{R}$, solve the standard control problem:
\[
\inf_{\alpha \in \mathbb{A}} \mathbb{E}\bigg[ \int_0^T \Big( \frac{\alpha_t^2}{2} - q\alpha_t(m_t - X_t) + \frac{\epsilon}{2}(m_t - X_t)^2 \Big)\,dt + \frac{c}{2}\,(m_T - X_T)^2 \bigg],
\]
where the state dynamics are of the form $dX_t = [a(m_t - X_t) + \alpha_t]\,dt + \sigma\,dW_t$.
As stated, this problem is a particular case of the LQ mean field game models discussed above only when $q = 0$. However, in the general case $q^2 \leq \epsilon$, the arguments used above can be applied mutatis mutandis. The reduced Hamiltonian of the system is given by:
\[ H(t,x,y,\alpha) = \big[ a(m_t - x) + \alpha \big]\,y + \frac{\alpha^2}{2} - q\alpha(m_t - x) + \frac{\epsilon}{2}(m_t - x)^2, \]
which is strictly convex in $(x,\alpha)$ under the condition $q^2 \leq \epsilon$, and attains its minimum for:
\[ \alpha = \hat\alpha(t,x,m_t,y) = q(m_t - x) - y. \]
This affine FBSDE is of the type considered above and can be solved in the same way. Given our experience with the corresponding finite player game solved in the previous chapter, we make the (educated) ansatz:
\[ Y_t = -\eta_t\,(m_t - X_t). \tag{3.72} \]
where $(\bar\mu_t)_{0 \leq t \leq T}$ denotes the flow of average wealths in the population in equilibrium. We switched to a system of notation used in this chapter, but the reader is warned that this average wealth was denoted $K_t$ when we introduced the model in Chapter 1 using standard notation in the macro-economic literature. In any case, this average wealth is assumed to take (strictly) positive values, both for economic reasons and because of the powers $\alpha \in (0,1)$ and $1 - \alpha \in (0,1)$ appearing in the above equation. Observe also that we used the Greek letter $\alpha$ for the exponent in (3.74), although we already used the same letter for the elements of the admissible values for the control processes; clearly, there is no risk of confusion between both in the sequel. In order to make sure that $(Z_t)_{t \geq 0}$ is stationary for all times (and not simply "asymptotically stationary"), we can assume that the distribution of $Z_0$ is the invariant measure $N(1, \sigma_Z^2/2)$ of the process. For our purpose, we just assume that $\mathbb{E}[Z_0] = 1$, $Z_0$ being independent of $W$. Among other things, this implies that $\mathbb{E}[Z_t] = 1$ for all $t \geq 0$, a fact which we shall use later on. Last, observe that, in comparison with (1.37) in Chapter 1, we took $\bar a = 1$.
The set $\mathbb{A}$ of admissible controls is the set $\mathbb{H}^{2,1}_+$ of real valued square-integrable $\mathbb{F}$-adapted processes $c = (c_t)_{0 \leq t \leq T}$ with nonnegative values, and the cost functional is defined by:
\[ J(c) = \mathbb{E}\bigg[ \int_0^T (-U)(c_t)\,dt - \tilde U(A_T) \bigg], \]
for the CRRA utility function $U$ given by (1.35), namely $U(c) = (c^{1-\gamma} - 1)/(1 - \gamma)$ for $\gamma > 0$ with $U(c) = \ln(c)$ if $\gamma = 1$, and $\tilde U(a) = a$. Notice the additional minus signs due to the fact that we want to treat the optimization problem as a minimization problem. Here we chose to take 0 for the discount rate since we are working on a finite horizon. Throughout the analysis, we shall assume that $A_0 > 0$.
\[ H(t,z,a,\mu,y_z,y_a,c) = (1 - z)\,y_z + \big( -c + (1-\alpha)\bar\mu^{\alpha} z + (\alpha\bar\mu^{\alpha-1} - \delta)\,a \big)\,y_a - U(c), \]
where $\bar\mu = \int_{\mathbb{R}^2} a\,d\mu(z,a)$ denotes the mean of the second marginal of the measure $\mu$. Notice that we use the reduced Hamiltonian because the volatility of $Z$ is constant and the volatility of $A$ is zero. The first adjoint equation reads:
\[
dY_{z,t} = -\partial_z H\big( t, Z_t, A_t, \mu_t, Y_{z,t}, Y_{a,t}, c_t \big)\,dt + \tilde Z_{z,t}\,dW_t
= \big( Y_{z,t} - (1-\alpha)\bar\mu_t^{\alpha}\, Y_{a,t} \big)\,dt + \tilde Z_{z,t}\,dW_t, \qquad t \in [0,T],
\]
where we used the notation $(\tilde Z_t)_{0 \leq t \leq T}$ to denote the integrand of the backward equation in order to distinguish it from the process $(Z_t)_{0 \leq t \leq T}$ used in the model for the first component of the state. Since the variables $z$ and $y_z$ do not play any role in the minimization of the Hamiltonian with respect to the control variable $c$, the process $(Y_{z,t})_{0 \leq t \leq T}$ does not enter the definition of the optimal trajectory. Consequently, we can ignore these variables and not include them in the Hamiltonian. Accordingly, we shall use the (further) reduced Hamiltonian:
\[ H(t,a,\mu,y,c) = \big( -c + (\alpha\bar\mu^{\alpha-1} - \delta)\,a \big)\,y - U(c). \]
We emphasize that, the utility function $U$ having a singularity at 0, the assumptions of the Pontryagin principle used in this chapter are not satisfied here. However, it is easy to see that the proof of the sufficient part of the Pontryagin principle goes through provided that the adjoint process $(Y_t)_{0 \leq t \leq T}$ lives, with probability 1, in a compact subset of $(-\infty, 0)$.
The crux of our analysis is to notice that the backward equation may be decoupled from the forward equation. Its solution is deterministic and is obtained by solving the backward ordinary differential equation:
\[ \dot Y_t = -Y_t\big( \alpha\bar\mu_t^{\alpha-1} - \delta \big), \qquad t \in [0,T]; \qquad Y_T = -1. \]
Among other things, this shows that the process $(Y_t)_{0 \leq t \leq T}$ is negative valued, whatever the input $(\bar\mu_t)_{0 \leq t \leq T}$. Also the optimal trajectory is unique and the optimal consumption $\hat c_t = (-Y_t)^{-1/\gamma}$ is also deterministic! Once we know that $(Y_t)_{0 \leq t \leq T}$ is deterministic, taking the expectation in the dynamics of $(A_t)_{0 \leq t \leq T}$, we deduce that the flow $(\bar\mu_t)_{0 \leq t \leq T}$ describing the average wealth of the population in equilibrium, if it exists, must solve the deterministic forward-backward system:
\[
\begin{cases}
d\bar\mu_t = \big( \bar\mu_t^{\alpha} - \delta\bar\mu_t - (-Y_t)^{-1/\gamma} \big)\,dt, \\
dY_t = -Y_t\big( \alpha\bar\mu_t^{\alpha-1} - \delta \big)\,dt, \qquad t \in [0,T]; \qquad Y_T = -1.
\end{cases}
\tag{3.76}
\]
Above we used the fact that $\mathbb{E}[Z_t] = 1$ for all $t \in [0,T]$. In order to tackle the existence and uniqueness of an MFG equilibrium, it thus suffices to prove that (3.76) admits a unique solution $(\bar\mu_t, Y_t)_{0 \leq t \leq T}$ satisfying $\bar\mu_t > 0$ (and $Y_t < 0$) for any $t \in [0,T]$. Once (3.76) has been solved, it is indeed straightforward to plug the solution into (3.75) and to solve the forward equation therein.
Figure 3.2 gives the plots of $\varphi$ and its derivative for the value $\epsilon = 0.01$ which we shall use in the numerical computations below.

Notice that if we assume that $(a_t = \bar\mu_t)_{0 \leq t \leq T}$ is given, then the second equation of (3.77) can be solved explicitly. We get:
\[ Y_t = -e^{\int_t^T [\varphi'(a_s) - \delta]\,ds}, \qquad t \in [0,T]. \tag{3.78} \]
Fig. 3.2 Plots of the regularizing function $a \mapsto \varphi(a)$ (left) and its derivative $a \mapsto \varphi'(a)$ (right) for the values $\epsilon = 0.01$ and $\alpha = 0.5$ of the parameters.
Strangely enough, the system (3.77) is quite easy to solve numerically. Indeed, a simple Picard iteration converges very quickly to a numerical solution. Typically, we start with $(Y_t = -1)_{0 \leq t \leq T}$, inject it in the first equation, run a standard Ordinary Differential Equation (ODE) solver to find $(a_t = \bar\mu_t)_{0 \leq t \leq T}$ satisfying this first equation, inject this solution into formula (3.78), retrieve $(Y_t)_{0 \leq t \leq T}$ which we then inject in the first equation, etc. The process converges after a small number (no more than 5 or 6 depending upon the values of the parameters) of iterations. Figure 3.3 gives the plots of the solutions $(\bar\mu_t)_{0 \leq t \leq T}$ and $(Y_t)_{0 \leq t \leq T}$ obtained for a few values of the parameters given in the caption.
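For concreteness, here is a minimal implementation (ours) of the Picard iteration just described, with $\varphi(a) = (a^2 + \epsilon^2)^{\alpha/2}$ used as a smooth stand-in for the regularized power function of the text and with illustrative parameter values:

import numpy as np

alpha, gamma, delta, eps = 0.5, 1.5, 0.1, 0.01
T, N = 5.0, 5_000
dt = T / N
a0 = 2.0                                   # initial average wealth (our choice)

phi  = lambda a: (a**2 + eps**2) ** (alpha / 2)
dphi = lambda a: alpha * a * (a**2 + eps**2) ** (alpha / 2 - 1)

Y = -np.ones(N + 1)                        # start the iteration from Y == -1
for it in range(8):
    a = np.empty(N + 1); a[0] = a0         # forward equation, Euler scheme
    for n in range(N):
        a[n + 1] = a[n] + dt * (phi(a[n]) - delta * a[n] - (-Y[n]) ** (-1 / gamma))
    # backward equation via (3.78): Y_t = -exp( int_t^T (phi'(a_s) - delta) ds )
    g = dphi(a) - delta
    tail = np.concatenate(([0.0], np.cumsum(g[::-1][:-1] * dt)))[::-1]
    Y_new = -np.exp(tail)
    print(it, np.abs(Y_new - Y).max())     # the gap shrinks fast across iterations
    Y = Y_new

The printed gap decreasing geometrically is the numerical signature of the contraction behind the quick convergence reported in the text.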
Figure 3.4 shows how the average wealth $(\bar\mu_t)_{0 \leq t \leq T}$ and the adjoint variable $Y_t$ depend upon the risk aversion level $\gamma$ of the agents.

As for the mathematical analysis of the system (3.77), we first notice that, quite remarkably, it reads like the forward-backward system derived from the Pontryagin maximum principle for a deterministic control problem.
Fig. 3.3 Plots of the solutions $(\bar\mu_t)_{0 \leq t \leq T}$ (left) and $(Y_t)_{0 \leq t \leq T}$ (right) of the forward/backward system (3.77) for different values of $\alpha$ (0.1, 0.5 and 0.9), and for the values $\epsilon = 0.01$ and $\gamma = 1.5$ for the cut-off and risk aversion parameters.
Fig. 3.4 Plots of the solutions $(\bar\mu_t)_{0 \leq t \leq T}$ (left) and $(Y_t)_{0 \leq t \leq T}$ (right) of the forward/backward system (3.77) for different values of the risk aversion parameter $\gamma$ (0.5, 1.5 and 6), and for the values $\epsilon = 0.01$ and $\alpha = 0.5$ for the cut-off and Cobb-Douglas parameters.
The Hamiltonian of this deterministic problem is:
\[ \bar H(a,y,c) = \big[ \varphi(a) - \delta a - c \big]\,y - U(c), \]
which is easily seen to be convex in $(a,c)$ when $y$ is restricted to $(-\infty, 0)$. Since $\varphi'$ is bounded, it is indeed pretty clear that the solution of the backward equation:
\[ \dot y_t = -y_t\big( \varphi'(a_t) - \delta \big), \qquad t \in [0,T]; \qquad y_T = -1, \tag{3.80} \]
where $a = (a_t)_{0 \leq t \leq T}$ solves the controlled equation (3.79), lives in a compact subset of $(-\infty, 0)$. As a byproduct, this shows that, in the minimization of $\bar J$, the dual variable $y$ must live in a compact subset of $(-\infty, 0)$. In particular, in (3.77), the singular term $(-Y_t)^{-1/\gamma}$ may be replaced by a bounded and Lipschitz function of $Y$. With such a prescription, (3.77) may be seen as an FBSDE with Lipschitz coefficients. It is thus uniquely solvable in small time, see Chapter 4. The fact that the backward equation lives in a compact subset of $(-\infty, 0)$, determined by $\|\varphi'\|_\infty$ and $\delta$ only, shows that the optimal control, given by $((-Y_t)^{-1/\gamma})_{0 \leq t \leq T}$, lives in a compact subset of $(0, +\infty)$, independently of the initial condition of $(\bar\mu_t)_{0 \leq t \leq T}$. In order to extend inductively the property of unique solvability from small to long time intervals, it suffices to notice that, for the useful values of $c$ and $y$, the mapping $(a,y,c) \mapsto \bar H(a,y,c)$ is convex in $(a,c)$ and is uniformly convex in $c$. Then, by the Pontryagin maximum principle and using the same argument as in the proof of Lemma 4.56, based on Proposition 3.21, we can control the Lipschitz constant of the decoupling field along the induction used in the extension from small time to long time. This gives the unique solvability of (3.77) for an initial condition $\bar\mu_0 > 0$.
We now show that we can replace the function $\varphi$ and its derivative by the original power function $x \mapsto x^\alpha$ and its derivative, and still solve the system. First we notice that formula (3.78) giving $Y_t$ in terms of $a_t$ implies that $-Y_t \geq \exp(-\delta(T-t))$, and therefore:
\[ (-Y_t)^{-1/\gamma} \leq \exp\Big( \frac{\delta(T-t)}{\gamma} \Big), \qquad t \in [0,T]. \]
Then,
\[
a_t = \exp(-\delta t)\Big[ a_0 + \int_0^t \exp(\delta s)\varphi(a_s)\,ds - \int_0^t \exp(\delta s)(-Y_s)^{-1/\gamma}\,ds \Big]
\geq \exp(-\delta t)\Big[ a_0 - \int_0^t \exp(\delta s) \exp\Big( \frac{\delta(T-s)}{\gamma} \Big)\,ds \Big].
\]
Therefore, when
\[ a_0 = \bar\mu_0 > \int_0^T \exp(\delta s) \exp\Big( \frac{\delta(T-s)}{\gamma} \Big)\,ds, \tag{3.81} \]
we can choose $\epsilon$ small enough so that the solution of (3.77) is also a solution of (3.76). By the same argument, the solution must be unique, since any other solution of (3.76), with the prescribed sign condition (that is $\bar\mu_t > 0$ and $Y_t < 0$ for all $t \in [0,T]$), is a solution of (3.77) for a well-chosen $\epsilon$. It is worth mentioning that the solution to (3.75), obtained in the end under the condition (3.81), satisfies $\mathbb{E}[A_t] > 0$ for any $t \in [0,T]$.
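The threshold in (3.81) is explicit enough to be evaluated directly; for $\gamma \neq 1$ the integral even has a closed form. A quick check (ours, with the same illustrative parameters as in the earlier sketch):

from math import exp

delta, gamma, T = 0.1, 1.5, 5.0
n = 100_000
dt = T / n
threshold = sum(exp(delta * k * dt) * exp(delta * (T - k * dt) / gamma)
                for k in range(n)) * dt
# closed form of the integral in (3.81), valid for gamma != 1:
closed = exp(delta * T / gamma) * (exp(delta * T * (1 - 1 / gamma)) - 1) \
         / (delta * (1 - 1 / gamma))
print(threshold, closed)   # mu_bar_0 must exceed this common value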
3.7 Games with a Continuum of Players

The rationale for the mean field game models studied in this book is based on the limit as $N \to \infty$ of $N$-player games with mean field interactions. One of the justifications given in Chapter 1 for the formulation of the mean field game paradigm is that the influence on the game of each individual player vanishes in this limit.
Fig. Empirical density of the terminal values $a_T$ (from the example of the previous section).
Mathematical physicists and economists have been using game models in which the impact of each single player is insignificant. They do just that by considering games for which the players are labeled by elements $i$ of an uncountable set $I$, accounting for a continuum of agents. This set $I$ is equipped with a $\sigma$-field $\mathcal{I}$ and a probability measure $\lambda$ which is assumed to be continuous (i.e., nonatomic). In this way, if $i \in I$ represents a player, the fact that $\lambda(\{i\}) = 0$ accounts for the insignificance of the players in the model. This section is thus intended to be a quick introduction to the framework of games with a continuum of players.
The classical Glivenko-Cantelli form of the Law of Large Numbers (LLN) states that if $F$ denotes the cumulative distribution function of a probability measure $\mu$ on $\mathbb{R}$, if $(X^n)_{n \geq 1}$ is an infinite sequence of independent identically distributed random variables on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ with common distribution $\mu$, and if we use the notation:
\[ F_\omega(x) = \limsup_{N \to \infty} \frac{1}{N}\,\#\big\{ n \in \{1, \cdots, N\} : X^n(\omega) \leq x \big\}, \qquad x \in \mathbb{R},\ \omega \in \Omega, \tag{3.82} \]
for the proportion of $X^n(\omega)$'s not greater than $x$, then this $\limsup$ is in fact a limit for all $x \in \mathbb{R}$ and $\mathbb{P}$-almost all $\omega \in \Omega$, and $\mathbb{P}[\{\omega \in \Omega : F_\omega(\cdot) = F\}] = 1$.
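For readers who like to see the statement in action, here is a small experiment (ours) illustrating (3.82) for a standard Gaussian $\mu$: the sup-distance between the empirical proportion and $F$ shrinks, roughly like $N^{-1/2}$:

import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(4)
x_grid = np.linspace(-4.0, 4.0, 801)
F = np.vectorize(lambda x: 0.5 * (1 + erf(x / sqrt(2))))   # N(0,1) CDF

for N in (100, 10_000, 1_000_000):
    X = np.sort(rng.standard_normal(N))
    F_emp = np.searchsorted(X, x_grid, side="right") / N   # fraction of X <= x
    print(N, np.max(np.abs(F_emp - F(x_grid))))            # sup-norm distance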
Switching gears momentarily, recall that, over fifty years ago, economists suggested that the appropriate model for perfectly competitive markets is a model with a continuum of traders represented as elements of a measurable space. In such a set-up, the insignificance of individual traders is captured by the idea of a set with zero measure, and summation or aggregation is generalized by the notion of integral.

In games with a continuum of players, the latter are labeled by the elements $i \in I$ of an arbitrary set $I$ (often assumed to be uncountable, and most often chosen to be the unit interval $[0,1]$) equipped with a $\sigma$-field $\mathcal{I}$ and a probability measure $\lambda$. In this set-up, if the state of each player $i \in I$ is given by a random variable $X^i$ on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$, in analogy with the countable case leading to formula (3.82), the quantity:
\[ F_\omega(x) = \lambda\big( \{ i \in I : X^i(\omega) \leq x \} \big) \tag{3.83} \]
is the natural candidate for the empirical cumulative distribution function of the states of the population.
Definition 3.39 If $E$ is a Polish space, the $E$-valued random variables $(X^i)_{i \in I}$ are said to be essentially pairwise independent if, for $\lambda$-almost every $i \in I$, the random variable $X^i$ is independent of $X^j$ for $\lambda$-almost every $j \in I$. Accordingly, if the real valued random variables $(X^i)_{i \in I}$ are square integrable, we say that the family $(X^i)_{i \in I}$ is essentially pairwise uncorrelated if, for $\lambda$-almost every $i \in I$, the correlation coefficient of $X^i$ with $X^j$ is 0 for $\lambda$-almost every $j \in I$.
\[
\int_I \Big( \int_\Omega X^i(\omega)\,d\mathbb{P}(\omega) \Big)\,d\lambda(i)
= \int_\Omega \Big( \int_I X^i(\omega)\,d\lambda(i) \Big)\,d\mathbb{P}(\omega)
= \int_{I \times \Omega} X^i(\omega)\,d\big( \lambda \boxtimes \mathbb{P} \big)(i,\omega).
\tag{3.84}
\]
In the sequel, we shall use the standard symbol $\mathbb{E}$ for denoting the expectation under the sole probability $\mathbb{P}$.
Measurable essentially pairwise independent processes $X$ are first constructed in such a way that, for each $i \in I$, the law of $X^i$ is the uniform distribution on the unit interval $[0,1]$. Then, using the tools we develop in Chapter 5, see for example Lemma 5.29, we easily construct measurable essentially pairwise independent Euclidean-valued processes with any given prescribed marginals. So the actual problem is to construct rich product probability spaces in the sense of the following definition.

We refer to the Notes & Complements at the end of the chapter for references to papers giving the construction of essentially pairwise independent measurable processes on Fubini extensions.
The following gives a simple property of rich Fubini extensions.
and using the Fubini property (3.84), we deduce that for $\lambda$-a.e. $i \in A$, the random variable $\Omega \ni \omega \mapsto X^i(\omega)$ is $\mathbb{P}$-a.e. equal to the random variable $\zeta : \Omega \ni \omega \mapsto \int_A X^j(\omega)\,d\lambda(j)/\lambda(A)$. Also, for any Borel subset $B$ of $\mathbb{R}$,
\[
\mathbb{P}\big[ \zeta \in B \big] = \frac{1}{\lambda(A)}\,\big( \lambda \boxtimes \mathbb{P} \big)\Big[ \big\{ (i,\omega) \in A \times \Omega : X^i(\omega) \in B \big\} \Big]
= \frac{1}{\lambda(A)} \int_A \mathbb{P}\big[ X^i \in B \big]\,d\lambda(i) = \mathrm{Leb}_1\big( B \cap [0,1] \big),
\]
proving that $\zeta$, as a random variable on $(\Omega, \mathcal{F}, \mathbb{P})$, has the uniform distribution. In particular, $\mathbb{E}[\zeta^2] = 1/3$.

On the other hand, we know that, for almost every $i \in I$, the function $I \times \Omega \ni (j,\omega) \mapsto X^i(\omega) X^j(\omega)$ is $\mathcal{I} \boxtimes \mathcal{F}$-measurable. Also, by the Fubini property, the function $I \ni j \mapsto \mathbb{E}[X^i X^j]$ is integrable with respect to $\lambda$ and
\[ \frac{1}{\lambda(A)} \int_A \mathbb{E}\big[ X^i X^j \big]\,d\lambda(j) = \mathbb{E}\big[ X^i \zeta \big]. \tag{3.85} \]
Now, we observe that the function $I \times \Omega \ni (i,\omega) \mapsto X^i(\omega)\zeta(\omega)$ is also $\mathcal{I} \boxtimes \mathcal{F}$-measurable. Hence, $I \ni i \mapsto \mathbb{E}[X^i \zeta]$ is integrable with respect to $\lambda$ and
\[ \frac{1}{\lambda(A)} \int_A \mathbb{E}\big[ X^i \zeta \big]\,d\lambda(i) = \mathbb{E}\big[ \zeta^2 \big] = \frac{1}{3}. \]
The contradiction comes from the fact that, for almost every $i \in I$, $X^i$ is uncorrelated with $X^j$ for almost every $j \in I$. In other words, the left-hand side in (3.85) is equal to:
\[ \frac{1}{\lambda(A)} \int_A \mathbb{E}\big[ X^i X^j \big]\,d\lambda(j) = \frac{1}{4}, \]
for $\lambda$-almost every $i \in A$, which is incompatible with the previous identity. $\square$
Proof.

First Step. We first check that if $Y = (Y^i)_{i \in I}$ and $Z = (Z^i)_{i \in I}$ are measurable and square integrable processes on the Fubini extension $(I \times \Omega, \mathcal{I} \boxtimes \mathcal{F}, \lambda \boxtimes \mathbb{P})$, and if we set $\tilde X^{i,j}(\omega) = Y^i(\omega) Z^j(\omega)$ for $i,j \in I$ and $\omega \in \Omega$, then $\Omega \ni \omega \mapsto \tilde X^{i,j}(\omega)$ is $\mathbb{P}$-integrable for $\lambda$-a.e. $i \in I$ and $j \in I$. Now, proceeding as in the proof of Lemma 3.42 and using the Fubini property of the space, we easily check that, for $\lambda$-a.e. $i \in I$, the function $I \ni j \mapsto \mathbb{E}[\tilde X^{i,j}]$ is $\lambda$-integrable, that the function $I \ni i \mapsto \int_I \mathbb{E}[\tilde X^{i,j}]\,d\lambda(j) = \int_I \mathbb{E}[Y^i Z^j]\,d\lambda(j)$ is $\lambda$-integrable, and that:
\[ \int_I \int_I \mathbb{E}\big[ Y^i Z^j \big]\,d\lambda(j)\,d\lambda(i) = \mathbb{E}\Big[ \int_I Y^i\,d\lambda(i) \int_I Z^j\,d\lambda(j) \Big]. \tag{3.86} \]
Second Step. Let $A, B \in \mathcal{I}$, and let us define the processes $Y = (Y^i)_{i \in I}$ and $Z = (Z^i)_{i \in I}$ by $(Y^i = \mathbf{1}_A(i)(X^i - \mathbb{E}[X^i]))_{i \in I}$ and $(Z^i = \mathbf{1}_B(i)(X^i - \mathbb{E}[X^i]))_{i \in I}$ respectively. Applying (3.86) from the first step we get:
\[
\int_A \int_B \mathbb{E}\Big[ \big( X^i - \mathbb{E}[X^i] \big)\big( X^j - \mathbb{E}[X^j] \big) \Big]\,d\lambda(i)\,d\lambda(j)
= \mathbb{E}\Big[ \int_A \big( X^i - \mathbb{E}[X^i] \big)\,d\lambda(i) \int_B \big( X^j - \mathbb{E}[X^j] \big)\,d\lambda(j) \Big],
\tag{3.87}
\]
and the implication (i) $\Rightarrow$ (ii) follows by taking $B = A$. On the other hand, if we assume that (ii) holds, equation (3.87) implies that:
\[ \int_A \int_B \mathbb{E}\Big[ \big( X^i - \mathbb{E}[X^i] \big)\big( X^j - \mathbb{E}[X^j] \big) \Big]\,d\lambda(i)\,d\lambda(j) = 0 \]
for all $A, B \in \mathcal{I}$, from which (i) follows.
Theorem 3.44 provides a form of the weak law of large numbers for essentially
pairwise uncorrelated uncountable families of random variables. Here is a stronger
form for essentially pairwise independent families of random variables.
We now revisit the introductory discussion of Section 3.1, and especially Subsection 3.1.1, to introduce what would be the analogue with a continuum of players. In other words, we would like to replace the finite set $I = \{1, \cdots, N\}$ of players by a general probability space $(I, \mathcal{I}, \lambda)$, possibly with a continuous measure $\lambda$. Under the same assumptions on the drift and volatility functions $b$ and $\sigma$, as well as on the running and terminal cost functions $f$ and $g$, we thus posit that the dynamics of the state process $X^i$ of each player $i \in I$ are given by a stochastic differential equation of the form:
\[ dX^i_t = b\big( t, X^i_t, \mu^i_t, \alpha^i_t \big)\,dt + \sigma\big( t, X^i_t, \mu^i_t, \alpha^i_t \big)\,dW^i_t, \qquad t \in [0,T], \tag{3.88} \]
in full analogy with (3.1). Here, each $W^i = (W^i_t)_{0 \leq t \leq T}$, for $i \in I$, is intended to be a Brownian motion with values in $\mathbb{R}^d$, where $d$ is the dimension of the state space, constructed on some common probability space $(\Omega, \mathcal{F}, \mathbb{P})$. Accordingly, each $(\alpha^i_t)_{0 \leq t \leq T}$ is intended to be a control process which we require to be progressively measurable with respect to the larger filtration generated by all the noises $(W^i)_{i \in I}$. Similarly, each $(\mu^i_t)_{0 \leq t \leq T}$ is a measure-valued process which we also require to be progressively measurable. The notion of Nash equilibrium then comprises two requirements:

(i) For each $t \in [0,T]$, $\mu^i_t = \mu_t$, where $\mu_t$ is the image of the measure $\lambda$ under the map that associates any player $j$ with its state at time $t$ under the realization $\omega$ of the randomness, for all $i \in I$.
(ii) If for each player $i \in I$ and any strategy profile $\alpha$, we define the expected cost to player $i$ by the formula:
\[ J^i(\alpha) = \mathbb{E}\Big[ \int_0^T f\big( t, X^i_t, \mu_t, \alpha^i_t \big)\,dt + g\big( X^i_T, \mu_T \big) \Big], \]
where $X^i$ solves (3.88), then, for $\lambda$-a.e. $i \in I$ and any admissible control $\alpha'^i = (\alpha'^i_t)_{0 \leq t \leq T}$ for player $i$, we have:
\[ J^i\big( \alpha^i \big) \leq J^i\big( \alpha'^i \big). \]

Observe that, in the notation used right above, $J^i$ implicitly depends upon $\alpha = (\alpha^j)_{j \in I}$ through the empirical distributions $(\mu_t)_{0 \leq t \leq T}$. The fact that the same $(\mu_t)_{0 \leq t \leq T}$ is used whenever player $i$ uses $\alpha'^i$ in place of $\alpha^i$ is fully legitimated by Lemma 3.42, which asserts that the measure $\lambda$ is necessarily continuous. Indeed, the continuity of the measure $\lambda$ guarantees that each player is insignificant, and in particular that the empirical measures constructed from the strategy profiles $\alpha$ and $(\alpha'^i, \alpha^{-i})$ are the same, where as usual, the strategy profile $(\alpha'^i, \alpha^{-i})$ is given by all the players $j \neq i$ using the controls $\alpha^j$ while player $i$ is using control $\alpha'^i$.
It is worth noting that the rich Fubini extension is in fact just needed to construct the state process $X$ forming the equilibrium. In stark contrast, any unilateral deviation from the equilibrium calls for the redefinition of the trajectories of one single player $i \in I$ only, which can be done on the sole space $(\Omega, \mathcal{F}, \mathbb{P})$.

We now explain why, at least at an intuitive level, a solution of the mean field game problem stated in Section 3.1 provides a Nash equilibrium for the mean field game with a continuum of players.
Given the drift and volatility functions $b$ and $\sigma$, and given the running and terminal cost functions $f$ and $g$, let us assume that the mean field game problem formulated in Subsection 3.1.2 has a solution. We denote by $\hat\alpha = (\hat\alpha_t)_{0 \leq t \leq T}$ the equilibrium strategy, as defined in Subsection 3.1.2, which we assume to be a progressively measurable function of the path of the Wiener process $W = (W_t)_{0 \leq t \leq T}$ on which the game model is based. We also denote by $\mu = (\mu_t)_{0 \leq t \leq T}$ the corresponding equilibrium flow of probability measures. Recall that the latter is entirely deterministic. Next, we proceed to define the strategy profile $\alpha = (\alpha^i)_{i \in I}$ for the mean field game with a continuum of players by demanding that, for each $i \in I$, $\alpha^i$ bears to $W^i$ the same relationship as $\hat\alpha$ does to $W$. Next, still for each $i \in I$, we consider the process $X^i = (X^i_t)_{0 \leq t \leq T}$ solving (3.88) when we use the deterministic measures $(\mu_t)_{0 \leq t \leq T}$ in lieu of the possibly random empirical measures. Under standard assumptions, the state equation (3.88) is strongly solvable when, for each $i \in I$, $(\alpha^i_t)_{0 \leq t \leq T}$ and $(\mu^i_t)_{0 \leq t \leq T}$ are given as we just explained. In particular, we claim that there exists a progressively measurable function $F$ from $C([0,T]; \mathbb{R}^d)$ into itself such that $X^i = F(W^i)$ for all $i \in I$. We deduce that $X : I \times \Omega \ni (i,\omega) \mapsto X^i(\omega) \in C([0,T]; \mathbb{R}^d)$ is measurable with respect to $\mathcal{I} \boxtimes \mathcal{F}$ and that the family $(X^i)_{i \in I}$ is essentially pairwise independent. Hence, the exact law of large numbers implies that, for each $t \in [0,T]$, the corresponding empirical measure, defined as $\lambda \circ (I \ni i \mapsto X^i_t)^{-1}$, is nonrandom and coincides with $\mu_t$.

It then remains to identify $J^i$, as defined in (ii) above, with the cost functional defined in (3.4). The only subtlety when we do so is to note that $\alpha^i$ is adapted to a much larger filtration than the filtration generated by $W^i$. This says that, to conclude, we must assume that $\hat\alpha$ in (3.4) is optimal among a class $\mathbb{A}$ of control processes that are progressively measurable with respect to a filtration $\mathbb{F}$ which is strictly larger than the filtration generated by $W$. In fact, this is the case under standard assumptions on the coefficients of the game, like those we use throughout the book.
3.8 Notes & Complements

The formulation of the mean field game problem given in Subsection 3.1.2 as a
set of bullet points leads to a family of standard continuous time stochastic control
problems followed by a fixed point problem in the space of deterministic measure
flows. This is inspired by the presentation of the Nash Certainty Equivalence (NCE)
principle by Huang, Caines and Malhamé in [211]. However, our search for a
solution is of a probabilistic nature as opposed to the analytic approach identifying
the value functions of the control problems as solutions of HJB partial differential
equations as described in Subsection 3.1.5. It is tempting to tackle the solutions of
both the HJB equations and the fixed point problems by contraction fixed point
arguments. This scheme is followed in [211]. Unfortunately, this strategy faces
subtle difficulties created by the fact that the two problems have time evolutions in
opposite directions, and as a result, it requires strong hypotheses which are difficult
to check in practice, and equilibria are only obtained over sufficiently short time
intervals. Existence over arbitrary time intervals can be proved for various types of
models at the cost of sophisticated PDE arguments. This was first done by Lasry
and Lions in [260–262] when σ is equal to the identity. In all these references, the
coefficients are allowed to depend upon the distribution of the population through its
density, in which case the coupling is said to be local. The note [260] is dedicated to
the stationary case presented in Chapter 7, while the finite horizon case is discussed
in both [261] and [262]. The arguments are detailed in the video lectures [265];
the reader may also consult the notes by Cardaliaguet [83]. As in our approach,
Schauder’s theorem is explicitly invoked in [83] to complete the existence proof of
a solution to the MFG system (3.12).
Since Lasry and Lions’ work, several contributions have addressed the solvability
of the MFG system (3.12). Some efforts have been concentrated on the so-called first
order case, when the volatility in (3.1) is null: We refer to Cardaliaguet [85] and
Cardaliaguet and Graber [87] for results with local coupling and to Cardaliaguet,
Mészáros, and Santambrogio [92] for cases when the density of the population is
constrained. Second order but degenerate cases were addressed by Cardaliaguet,
Graber, Porretta, and Tonon [88], while a great attention has been paid by Gomes
and his coauthors to the nondegenerate case but with various forms of Hamiltonians
and couplings: Gomes, Pimentel, and Sanchez [179] studied mean field games
on the torus with sub-quadratic Hamiltonians and power like dependence on the
density of the population, such a form of interaction accounting for some aversion
to congestion; Gomes and Pimentel [177] addressed the same problem but with a
logarithmic dependence on the density; and Gomes, Pimentel, and Sanchez [180]
investigated mean field games with super-quadratic Hamiltonians and power like
dependence on the density. Similar models, but on the whole space, are discussed in
Gomes and Pimentel [178]. In [85,87,88,92], the construction of a solution relies on
the connection with mean field optimal control problems as exposed in Chapter 6; in
[177,179,180], it is based on a smoothing procedure of the coefficients permitting to
apply Lasry and Lions’ original results. We refer to Guéant [186] for a subtle change
of variable transforming the MFG system, when driven by quadratic Hamiltonians,
into a tractable system of two heat equations. For a more complete account, the
reader may also consult the monographs by Bensoussan, Frehse and Yam [50],
Gomes, Nurbekyan, and Pimentel [176] and Gomes, Pimentel, and Voskanyan
[181].
For a complete overview of the theory of stochastic optimal control, we refer the
reader to the textbooks by Fleming and Soner [157], Pham [310], Touzi [334], and
Yong and Zhou [343]. For a quicker introduction to the subject, the reader may have
at a look at the surveys by Borkar [65] and Pham [309].
The theory of backward SDEs goes back to the pioneering works by Pardoux
and Peng in the early 90s, see for instance [297, 298]. We refer to the monograph
by Pardoux and Rǎşcanu [299] for a complete overview of the subject and of the
bibliography. For a pedagogical introduction on the connection between backward
SDEs and stochastic control, the reader may also have a look at Pham’s textbook
[310]. Actually, existence of a connection between backward SDEs and stochastic
control was known before Pardoux and Peng’s works as an earlier version of the
stochastic maximum principle appeared in Bismut’s contributions [60–62]. The
standard version of the stochastic maximum principle, as exposed in this chapter,
is due to Peng [302], and it is now a standard tool of stochastic optimization. It is
featured in many textbooks on the subject, for example Chapter IV of Yong and
Zhou’s textbook [343] or Chapter 4 of Carmona’s lectures [94]. We also refer to the
survey by Hu [202]. The sufficient condition can be found in Chapter 6 of Pham’s
book [310] or in Chapter 10 of Touzi’s monograph [334]. We give a complete proof
in the more general set-up of stochastic dynamics of the McKean-Vlasov type in
Chapter 6. The representation of Hamilton-Jacobi-Bellman equations by means
of backward SDEs, as explained in Remark 3.16, is also due to Peng, see [304];
we shall revisit it in the next chapter. Our presentation of the weak formulation in
Subsection 3.3.1 is inspired by the articles by Hamadène and Lepeltier [195] and
El Karoui, Peng, and Quenez [226]. An earlier formulation of the comparison principle
for BSDEs, as used in the proof of Proposition 3.11 on the weak formulation, may
be found in [304]; we refer to Chapter 5 in the monograph by Pardoux and Rǎşcanu
[299] for a more systematic presentation or to any textbook on the subject. See for
example [94, 310] or [343].
of optimal paths $\hat X$. Instead of Schauder's fixed point theorem, one may invoke Kakutani's fixed point theorem for multivalued functions in order to solve the equilibrium condition $\mu \in (\mathcal{L}(\hat X_t))_{0 \leq t \leq T}$.
We refer the reader to the survey by Borkar [65] and to the monograph by Yong and Zhou [343] for an overview of controlled diffusion processes with relaxed controls.
Earlier results in that direction are due to Young [344] and Fleming [158]. In
Chapter 6, we shall study a simple example of mean field control problem by means
of the notion of relaxed controls.
Examples of mean field games with other different cost functionals, like risk-
sensitive cost functionals for instance, may be found in Tembine, Zhu, and Basar
[332]. Models leading to mean field games with several populations, mean field
games with an infinite time horizon, and mean field games with a finite state space
are discussed in Chapter 7. Mean field games with major minor players will be
presented in Chapter (Vol II)-7.
The results on convex optimization used in the text can be found in most standard
monographs on the subject, see for instance Bertsekas’ [55] or Ciarlet’s [115]
textbooks.
Game models with a continuum of players were introduced by Aumann in 1964
in a breakthrough paper [27]. Our presentation of the exact law of large numbers is
modeled after Sun’s paper [324]. This law was also used by Duffie and Sun in [148]
to model matching from searches. This last work was used to justify the assumptions
of the percolation of information model presented in Chapter 1.
4 FBSDEs and the Solution of MFGs Without Common Noise

Abstract
The goal of this chapter is to develop a general methodology for the purpose
of solving mean field games using the forward-backward SDE formulations
introduced in Chapter 3. We first proceed with a careful analysis of forward-
backward mean field SDEs, that is of McKean-Vlasov type, which shows how
Schauder’s fixed point theorem can be used to prove existence of a solution.
As a by-product, we derive two general solvability results for mean field games:
first from the FBSDE representation of the value function, and then from the
stochastic Pontryagin maximum principle. In the last section, we revisit some
of the examples introduced in the first chapter, and illustrate how our general
existence results can be applied.
The goal of this section is to provide basic solvability results for standard forward-
backward SDEs. These results will serve us well when we try to prove existence
of solutions to forward-backward SDEs of the McKean-Vlasov type. Precise
references are cited in the Notes & Complements at the end of the chapter for all the
results given without proof.
As a general rule, we consider forward-backward SDEs of a slightly more general
form than what is really needed in order to implement the program outlined in
Chapter 3. Typically, we allow the diffusion coefficient (or volatility) to depend
upon the backward component of the solution. In doing so, we obtain almost for
free, an existence result for FBSDEs of the McKean-Vlasov type for a larger class
of models covering mean field games as a particular case. Being able to handle such
a larger class will turn out to be handy in Chapter 6 for the study of the optimal
control of McKean-Vlasov diffusion processes.
and (4.1) holds true $\mathbb{P}$-almost surely. In the next subsections, we address the existence and uniqueness of such solutions.
Remark 4.1 For the reader who is not familiar with the theory of forward-
backward equations, it may sound rather strange to ask for the well posedness of
a system with three unknowns but two equations only. Actually, the reader must
remember the fact that the triple .X; Y; Z/ is required to be progressively measurable
with respect to F. In particular, it should not anticipate the future of W. The role of
the process Z is precisely to guarantee the adaptedness of the solution with respect
to the filtration F. In the end, the forward-backward system actually consists of two
equations and a progressive measurability constraint.
All the results stated in this section are given in a rigorous form, and all the
assumptions they require are given in full detail. However, proofs are frequently
skipped. Indeed, while we want the reader to have a good sense of the underpin-
nings of the theory of FBSDEs, its main achievements as well as its limitations,
we fear that too many technical proofs will distract from the thrust of our analysis.
We shall only give proofs of results which further the theory of FBSDEs to help
us solve the new challenges posed by mean field game models and the control of
McKean-Vlasov dynamics.
Throughout this subsection, we assume that the coefficients are Lipschitz continuous in the variables $x$, $y$, and $z$:

Assumption (Lipschitz FBSDE). There exist two constants $\Gamma \geq 0$ and $L \geq 0$ such that:
(A1) The function $[0,T] \ni t \mapsto (B(t,0,0,0), F(t,0,0,0), \Sigma(t,0,0), G(0))$ is bounded by $\Gamma$.
(A2) For each $t \in [0,T]$, the functions $B(t,\cdot,\cdot,\cdot)$, $F(t,\cdot,\cdot,\cdot)$, $\Sigma(t,\cdot,\cdot)$ and $G$ are $L$-Lipschitz continuous on their own domain.

Theorem 4.2 Under assumption Lipschitz FBSDE, there exists a constant $c > 0$, only depending on $L$ (and not on $\Gamma$), such that, for any initial condition $\xi \in L^2(\Omega, \mathcal{F}_0, \mathbb{P}; \mathbb{R}^d)$, equation (4.1) has a unique solution as long as $T \leq c$.
We do not give the proof of Theorem 4.2 here. Indeed, the reader will find the proof of a more general statement, including the McKean-Vlasov case, in Subsection 4.2.3. In fact, a careful inspection of the proof provided in Subsection 4.2.3 shows that Theorem 4.2 may be easily turned into a stability property which we also state without proof.

Theorem 4.4 There exist two constants $c, C > 0$, only depending on $L$, such that, for any other set of coefficients $(B', F', G', \Sigma')$ satisfying assumption Lipschitz FBSDE with the same constant $L$, the difference between the solutions of the corresponding equations is controlled, up to the multiplicative constant $C$, by the differences between the coefficients and between the initial conditions, as long as $T \leq c$.
u.0; x/ D EŒY00;x ; x 2 Rd :
We then have:
P Y00;x D u.0; x/ D 1: (4.3)
The thrust of the notion of decoupling field is that the relationship (4.3) remains true when the initial condition is random; namely, denoting by $(X^{0,\xi}, Y^{0,\xi}, Z^{0,\xi})$ the unique solution to (4.1) when $T \le c$, we claim that:
\[ \mathbb{P}\big[Y^{0,\xi}_0 = u(0,\xi)\big] = 1. \tag{4.5} \]
The same construction applies at any time $t \in [0,T]$, with:
\[ u(t,x) = \mathbb{E}\big[Y^{t,x}_t\big], \]
provided $u$ is continuous in $(t,x)$, which is the object of the next lemma; along the solution itself, we then have:
\[ \mathbb{P}\big[\forall t \in [0,T],\ Y_t = u(t,X_t)\big] = 1. \tag{4.8} \]
The next lemma shows that u is not only continuous in space, but also jointly
continuous in time and space.
Lemma 4.5 Under assumption Lipschitz FBSDE and for $T \le c$ with $c$ as in the statements of Theorem 4.2 and Theorem 4.4, the decoupling field is Lipschitz continuous in space uniformly in time, and $1/2$-Hölder continuous in time locally in space, the Hölder constant growing at most linearly with the space variable.
Proof. Given $t \in [0,T]$ and $h > 0$ such that $t, t+h \in [0,T]$, and $x \in \mathbb{R}^d$, we have:
\[
\begin{aligned}
u(t,x) - u(t+h,x) &= \mathbb{E}\big[u(t,x) - u(t+h, X^{t,x}_{t+h})\big] + \mathbb{E}\big[u(t+h, X^{t,x}_{t+h}) - u(t+h,x)\big] \\
&= \mathbb{E}\big[Y^{t,x}_t - Y^{t,x}_{t+h}\big] + \mathbb{E}\big[u(t+h, X^{t,x}_{t+h}) - u(t+h,x)\big].
\end{aligned} \tag{4.9}
\]
The first term is handled by standard BSDE estimates, while, for the second one, standard SDE estimates yield:
\[ \mathbb{E}\big[|X^{t,x}_{t+h} - x|^2\big] \le C\big(1+|x|^2\big)h, \]
where $C$ is independent of $t$ and $x$. Plugging this estimate into (4.6), and using the Lipschitz property of $u$ with respect to $x$, we deduce that the two terms in (4.9) are less than $C(1+|x|)h^{1/2}$, for a possibly new value of the constant $C$. □
Remark 4.6 As we already alluded to, we shall prove in Chapter (Vol II)-1 that strong (or pathwise) uniqueness for FBSDEs of the type considered here implies uniqueness in law. As a by-product, this will show that the decoupling field $u$ is independent of the probabilistic set-up on which the solution is constructed; see also Lemma 4.25.
Remark 4.7 The construction of the decoupling field $u$ provided in this subsection relies on the assumption $T \le c$. The role of this condition is to guarantee the unique solvability of (4.6). Clearly, the above construction of the decoupling field is possible on any time interval on which existence and uniqueness are known to hold for any initial condition. Furthermore, the analysis of the regularity of $u$ may be carried out for arbitrary times provided that the conclusion of Theorem 4.4 remains true.
Our goal is now to provide a systematic method to extend the small time existence
and uniqueness result. The counter-example of Subsection 3.2.3 shows that such
a method cannot work under the sole assumption Lipschitz FBSDE. Our strategy
is to prove that, as long as we can exhibit a decoupling field which is Lipschitz
continuous in the space variable, existence and uniqueness must hold.
Iteration
Our approach relies on the following observation. For an arbitrary time horizon $T > c$, we can first restrict the analysis of the forward-backward system (4.1) to the interval $[T-c, T]$, for $c$ as in the statements of Theorem 4.2 and Theorem 4.4. Then, following the argument of the previous subsection, we know that for any $t \in [T-c, T]$ and any random variable $\xi \in L^2(\Omega,\mathcal{F}_t,\mathbb{P};\mathbb{R}^d)$, the restriction of the system (4.1) to $[t,T]$ with $X_t = \xi$ as initial condition is uniquely solvable. We set $(X^{t,\xi}, Y^{t,\xi}, Z^{t,\xi}) = (X^{t,\xi}_s, Y^{t,\xi}_s, Z^{t,\xi}_s)_{t \le s \le T}$ for the unique solution and define the decoupling field accordingly.
Regarding (4.10) as the new forward-backward system, we may apply the same argument as above. Denoting the previous $c$ by $c_0$, we deduce that there exists a new $c_1$ such that (4.10) is uniquely solvable on $[T-(c_0+c_1), T-c_0]$. The reason why $c_1$ may differ from $c_0$ is that the new terminal condition $u(T-c_0, \cdot)$ is Lipschitz continuous, but with a Lipschitz constant possibly differing from $L$. Of course, when this Lipschitz constant is greater than $L$, $c_1$ is smaller than $c_0$. We shall come back to this crucial point momentarily.
In order to distinguish the solutions constructed on the interval $[T-c_0, T]$ from those constructed on $[T-(c_0+c_1), T-c_0]$, we use the following convention. We label the solutions constructed on the interval $[T-c_0, T]$ with a superscript '0' and those constructed on the interval $[T-(c_0+c_1), T-c_0]$ with a superscript '1'. In particular, for any $t \in [T-c_0, T]$ and $\xi \in L^2(\Omega,\mathcal{F}_t,\mathbb{P};\mathbb{R}^d)$, the previous $(X^{t,\xi}, Y^{t,\xi}, Z^{t,\xi})$ is now denoted by $(X^{0;t,\xi}, Y^{0;t,\xi}, Z^{0;t,\xi})$. Similarly, for any $t \in [T-(c_0+c_1), T-c_0]$ and $\xi \in L^2(\Omega,\mathcal{F}_t,\mathbb{P};\mathbb{R}^d)$, we call $(X^{1;t,\xi}, Y^{1;t,\xi}, Z^{1;t,\xi})$ the unique solution to (4.10) with $X^{1;t,\xi}_t = \xi$ as initial condition.
Of course, a very natural idea is to patch the two solutions together. For given $t \in [T-(c_0+c_1), T-c_0]$ and $\xi \in L^2(\Omega,\mathcal{F}_t,\mathbb{P};\mathbb{R}^d)$, we let, for any $s \in [t,T]$:
\[
(X_s, Y_s, Z_s) =
\begin{cases}
\big(X^{1;t,\xi}_s,\ Y^{1;t,\xi}_s,\ Z^{1;t,\xi}_s\big), & \text{if } s \in [T-(c_0+c_1), T-c_0], \\[4pt]
\big(X^{0;T-c_0,\chi}_s,\ Y^{0;T-c_0,\chi}_s,\ Z^{0;T-c_0,\chi}_s\big), \quad \chi = X^{1;t,\xi}_{T-c_0}, & \text{if } s \in (T-c_0, T],
\end{cases}
\]
which proves that $(X_s, Y_s)_{T-(c_0+c_1) \le s \le T}$ is continuous at $s = T-c_0$. It is then plain to check that the process $(X_s, Y_s, Z_s)_{T-(c_0+c_1) \le s \le T}$ is a solution of the FBSDE (4.1) on $[T-(c_0+c_1), T]$. Quite remarkably, this solution satisfies:
\[ \mathbb{P}\big[\forall t \in [T-(c_0+c_1), T],\ Y_t = u(t, X_t)\big] = 1. \]
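In algorithmic form, the induction just described can be summarized by the following sketch in Python. All the ingredients are hypothetical placeholders of ours: solve_small_time stands for an abstract small-time solver in the spirit of Theorem 4.2, returning the decoupling field at the left endpoint of the interval together with its Lipschitz constant, and c encodes the admissible interval length as a function of that constant. The loop terminates in finitely many steps precisely when the Lipschitz constants, and hence the step sizes, stay bounded, which is the content of Proposition 4.8 below.

    # Hypothetical sketch of the backward patching of small-time solutions.
    # solve_small_time(t0, t1, field) is assumed to solve the FBSDE on
    # [t0, t1] with terminal condition y = field(x) and to return the pair
    # (decoupling field at time t0, its Lipschitz constant in x).
    def patch_backward(T, G, lip_G, solve_small_time, c):
        fields = [(T, G)]              # decoupling fields at the patching times
        t, field, lip = T, G, lip_G
        while t > 0.0:
            step = min(c(lip), t)      # step stays >= delta > 0 under Prop. 4.8
            field, lip = solve_small_time(t - step, t, field)
            t -= step
            fields.append((t, field))
        return fields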
Proposition 4.8 On top of assumption Lipschitz FBSDE, assume that there exists a continuous function $u : [0,T] \times \mathbb{R}^d \to \mathbb{R}^m$ which is Lipschitz continuous in space uniformly in time, and such that, for any $(t,x) \in [0,T] \times \mathbb{R}^d$, we can find a solution to (4.1), with $X_t = x$ as initial condition at time $t$, satisfying:
\[ \mathbb{P}\big[\forall s \in [t,T],\ Y_s = u(s, X_s)\big] = 1. \tag{4.12} \]
Then, for any $t \in [0,T]$ and $\xi \in L^2(\Omega,\mathcal{F}_t,\mathbb{P};\mathbb{R}^d)$, there exists a unique solution to (4.1) with $X_t = \xi$ as initial condition, and this solution satisfies the representation formula (4.12).
Proof. With the same notation as before, we start with the following observation: there must exist $\delta > 0$ such that, for all $n \in \mathbb{N}$, $c_n \ge \delta$. Indeed, since existence and uniqueness hold true on the interval $[T-c_0, T]$ for any initial condition, by (4.12), $u$ must coincide with the decoupling field on $[T-c_0, T]$. Since the Lipschitz constant of $u$ in space is bounded from above by a known (fixed) constant by assumption, this implies that $c_1$ is bounded from below by a known (fixed) constant. Iterating the argument, we realize that the extension of the decoupling field to the domain $[T-(c_0+c_1), T] \times \mathbb{R}^d$ still coincides with $u$. Therefore, the Lipschitz constant of the decoupling field at time $T-(c_0+c_1)$ is bounded from above by the same known (fixed) constant and hence, $c_2$ is bounded from below by a known (fixed) constant, and so on.
As a result, there exists a finite integer $n \in \mathbb{N}$ such that $T \le c_0 + \cdots + c_n$. In other words, the iteration argument presented above needs only a finite number of steps to provide a solution for any initial condition $(t,\xi)$ satisfying the prescribed conditions.
In order to prove uniqueness, it suffices to prove that any solution satisfies the representation formula (4.12). Indeed, once we have the representation formula (4.12), we may compare any two solutions (say starting from some $\xi$ at time 0) on the small interval $[0, T-(c_0+\cdots+c_{n-1})]$, where $n$ is the smallest integer such that $T \le c_0 + \cdots + c_n$. On this small interval, the two solutions are known to satisfy (4.1) with the same terminal condition because of the representation formula. By existence and uniqueness in small time, they coincide on this first interval. Then, we can repeat the argument on $[T-(c_0+\cdots+c_{n-1}), T-(c_0+\cdots+c_{n-2})]$, since the two solutions are now known to restart from the same (new) initial condition at time $T-(c_0+\cdots+c_{n-1})$. Uniqueness follows by induction.
The fact that any solution $(X,Y,Z)$ satisfies (4.12) may be proved by a backward induction starting from the last interval $[T-c_0, T]$. The representation property on $[T-c_0, T]$ is indeed a consequence of Theorem 4.2. It permits us to identify $Y_{T-c_0}$ with $u(T-c_0, X_{T-c_0})$ and then to repeat the same argument on $[T-(c_0+c_1), T-c_0]$. And so on. □
Stability
Proposition 4.8 may be complemented with the following stability property, which
is the long time analogue of Theorem 4.4.
Lemma 4.9 Let us assume that there is another set of coefficients $(B', F', G', \Sigma')$ satisfying the same assumption as $(B, F, G, \Sigma)$ in the statement of Proposition 4.8, with respect to another decoupling field $u'$.
Then, there exists a constant $C$, depending only on $T$, $L$ and the Lipschitz constants of $u$ and $u'$ in $x$, such that, for any initial conditions $\xi, \xi' \in L^2(\Omega,\mathcal{F}_0,\mathbb{P};\mathbb{R}^d)$, the two processes $(X,Y,Z)$ and $(X',Y',Z')$ obtained by solving (4.1) with $\xi$ and $\xi'$ as respective initial conditions and with $(B,F,G,\Sigma)$ and $(B',F',G',\Sigma')$ as respective coefficients satisfy:
\[
\begin{aligned}
&\mathbb{E}\Big[\sup_{0\le t\le T}\big(|X_t - X'_t|^2 + |Y_t - Y'_t|^2\big) + \int_0^T |Z_t - Z'_t|^2\,dt\Big] \\
&\qquad \le C\,\mathbb{E}\Big[|\xi - \xi'|^2 + \big|(G-G')(X_T)\big|^2 + \int_0^T \big|\big(B-B', F-F', \Sigma-\Sigma'\big)\big(t, X_t, Y_t, Z_t\big)\big|^2\,dt\Big].
\end{aligned}
\]
Proof. For small time $T > 0$, this estimate follows immediately from Theorem 4.4. We only need to show that one can extend it to arbitrarily large values of $T$. We then choose a regular subdivision $0 = T_0 < T_1 < \cdots < T_{N-1} < T_N = T$ so that the common length of the intervals $[T_i, T_{i+1}]$ is small enough to apply Theorem 4.4 on each interval $[T_i, T_{i+1}]$ with $u(T_{i+1}, \cdot)$ or $u'(T_{i+1}, \cdot)$ as terminal condition function. For any $i \in \{0, \ldots, N-1\}$, we have:
\[
\begin{aligned}
&\mathbb{E}\Big[\sup_{T_i\le t\le T_{i+1}}\big(|X_t - X'_t|^2 + |Y_t - Y'_t|^2\big) + \int_{T_i}^{T_{i+1}} |Z_t - Z'_t|^2\,dt\Big] \\
&\qquad \le C\,\mathbb{E}\Big[|X_{T_i} - X'_{T_i}|^2 + \big|(u-u')(T_{i+1}, X_{T_{i+1}})\big|^2 + \int_{T_i}^{T_{i+1}} \big|\big(B-B', F-F', \Sigma-\Sigma'\big)\big(t,X_t,Y_t,Z_t\big)\big|^2\,dt\Big].
\end{aligned} \tag{4.13}
\]
For simplicity, we denote the left-hand side by $\Theta(T_i, T_{i+1})$ and we let:
\[ \theta_t = \big|\big(B-B', F-F', \Sigma-\Sigma'\big)\big(t, X_t, Y_t, Z_t\big)\big|^2. \]
In this proof, we shall also use the notation $\delta_T = \big|(G-G')(X_T)\big|^2$.
We first consider the last interval $[T_{N-1}, T_N]$, corresponding to the case $i = N-1$. Since $T_N = T$, we have $u(T,\cdot) = G$ and $u'(T,\cdot) = G'$, so that:
\[ \Theta(T_{N-1}, T) \le C\,\mathbb{E}\Big[|X_{T_{N-1}} - X'_{T_{N-1}}|^2 + \delta_T + \int_{T_{N-1}}^T \theta_t\,dt\Big], \]
this estimate being true for all possible initial conditions for the process $X'$ at time $T_{N-1}$. In this regard, notice that, while some freedom is allowed in the choice of $X'_{T_{N-1}}$, the initial condition of $X$ is somehow fixed through $(\theta_t)_{T_{N-1}\le t\le T}$. Note also that $C$ is implicitly assumed to be larger than 1 and that we allow its value to change from line to line as long as this new value depends only upon $T$, the constant $L$ in assumption Lipschitz FBSDE, and the Lipschitz constants in $x$ of $u$ and $u'$.
Next we freeze the process $X$ but we let $X'$ vary. We use the fact that the decoupling field $u'$ does not depend on the initial condition of $X'$. In particular, we can keep the coefficients $(B', F', G', \Sigma')$ but set $X'_{T_{N-1}} = X_{T_{N-1}}$. Then the above inequality implies:
\[ \mathbb{E}\big[|u(T_{N-1}, X_{T_{N-1}}) - u'(T_{N-1}, X_{T_{N-1}})|^2\big] \le C\,\mathbb{E}\Big[\delta_T + \int_{T_{N-1}}^T \theta_t\,dt\Big]. \]
We can now plug this estimate into inequality (4.13) with $i = N-2$ to get:
\[ \Theta(T_{N-2}, T_{N-1}) \le C\,\mathbb{E}\Big[|X_{T_{N-2}} - X'_{T_{N-2}}|^2 + \delta_T + \int_{T_{N-2}}^T \theta_t\,dt\Big]. \]
As before, we can write what this estimate gives if we keep $(B', F', G', \Sigma')$ and set $X'_{T_{N-2}} = X_{T_{N-2}}$:
\[ \mathbb{E}\big[|u(T_{N-2}, X_{T_{N-2}}) - u'(T_{N-2}, X_{T_{N-2}})|^2\big] \le C\,\mathbb{E}\Big[\delta_T + \int_{T_{N-2}}^T \theta_t\,dt\Big]. \]
Iterating, we get:
\[ \Theta(T_i, T_{i+1}) \le C\,\mathbb{E}\Big[|X_{T_i} - X'_{T_i}|^2 + \delta_T + \int_{T_i}^T \theta_t\,dt\Big]. \tag{4.14} \]
As before, the value of the constants can change from line to line. From this, we get the desired estimate once we notice that, for each $i \in \{1, \ldots, N\}$, we have:
\[
\mathbb{E}\big[|X_{T_i} - X'_{T_i}|^2\big] \le \mathbb{E}\Big[\sup_{T_{i-1}\le t\le T_i} |X_t - X'_t|^2\Big]
\le C\,\mathbb{E}\Big[|X_{T_{i-1}} - X'_{T_{i-1}}|^2 + \delta_T + \int_{T_{i-1}}^T \theta_t\,dt\Big],
\]
from which we easily derive the required bound for $\mathbb{E}[\sup_{0\le t\le T}|X_t - X'_t|^2]$ by means of a forward induction. Summing over $i$ in (4.14), we complete the proof. □
Lemma 4.10 For a given $T > 0$, assume that, on top of assumption Lipschitz FBSDE, the system of PDEs:
\[
\begin{aligned}
\partial_t u^i(t,x) &+ B\big(t, x, u(t,x), \partial_x u(t,x)\Sigma(t,x,u(t,x))\big)\cdot\partial_x u^i(t,x) \\
&+ \tfrac{1}{2}\,\mathrm{trace}\Big[\Sigma\Sigma^{\dagger}\big(t,x,u(t,x)\big)\,\partial^2_{xx} u^i(t,x)\Big]
+ F^i\big(t, x, u(t,x), \partial_x u(t,x)\Sigma(t,x,u(t,x))\big) = 0,
\end{aligned}
\]
for $(t,x) \in [0,T] \times \mathbb{R}^d$ and $i \in \{1,\ldots,m\}$, with the terminal condition $u(T,x) = G(x)$ for $x \in \mathbb{R}^d$, has a bounded classical solution $u : [0,T] \times \mathbb{R}^d \ni (t,x) \mapsto u(t,x) = (u^1(t,x), \ldots, u^m(t,x))$, continuous on $[0,T] \times \mathbb{R}^d$, once differentiable in time and twice differentiable in space with jointly continuous derivatives on $[0,T) \times \mathbb{R}^d$, and with bounded first and second order derivatives in space.
Then, for any $t \in [0,T]$ and $\xi \in L^2(\Omega,\mathcal{F}_t,\mathbb{P};\mathbb{R}^d)$, the FBSDE (4.1) with initial condition $X_t = \xi$ has a unique solution $(X^{t,\xi}, Y^{t,\xi}, Z^{t,\xi})$, with $X^{t,\xi}$ solving the SDE:
\[
dX^{t,\xi}_s = B\big(s, X^{t,\xi}_s, u(s,X^{t,\xi}_s), \partial_x u(s,X^{t,\xi}_s)\Sigma(s, X^{t,\xi}_s, u(s,X^{t,\xi}_s))\big)\,ds
+ \Sigma\big(s, X^{t,\xi}_s, u(s,X^{t,\xi}_s)\big)\,dW_s, \quad s \in [t,T]. \tag{4.15}
\]
Proof. The proof is quite straightforward. There is no difficulty in solving (4.15). Once this is done, one can define $Y^{t,\xi}$ and $Z^{t,\xi}$ as in (4.16), and prove the desired result, namely that $(X^{t,\xi}, Y^{t,\xi}, Z^{t,\xi})$ satisfies (4.1), by applying Itô's formula to $Y^{t,\xi} = (u(s, X^{t,\xi}_s))_{t\le s\le T}$ and by taking advantage of the fact that the function $u$ solves the above system of PDEs. Uniqueness follows from Proposition 4.8. □
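The verification argument in the proof can be tested numerically on a toy example of ours (not from the text): with $B = 0$, $\Sigma = 1$, $F = 0$ and $G(x) = x^2$ in dimension $d = m = 1$, the function $u(t,x) = x^2 + (T - t)$ solves the above PDE system (which reduces to the backward heat equation), and Lemma 4.10 asserts that $Y_t = u(t,X_t)$ and $Z_t = \partial_x u(t,X_t) = 2X_t$ solve the BSDE. The following self-contained Python sketch checks that $Y_T = Y_0 + \int_0^T Z_s\,dW_s$ up to discretization error.

    import numpy as np

    rng = np.random.default_rng(0)
    T, n_steps, n_paths = 1.0, 1000, 20_000
    dt = T / n_steps

    X = np.zeros(n_paths)              # X_t = W_t since B = 0 and Sigma = 1
    Y0 = X**2 + T                      # Y_0 = u(0, X_0) with u(t,x) = x^2 + T - t
    stoch_int = np.zeros(n_paths)      # running value of int_0^t Z_s dW_s
    for _ in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt), n_paths)
        stoch_int += 2.0 * X * dW      # Z_t = dx_u(t, X_t) = 2 X_t
        X += dW

    # Backward equation with F = 0: Y_T = G(X_T) should equal Y_0 + int Z dW.
    err = np.abs(X**2 - (Y0 + stoch_int))
    print("max pathwise error:", err.max())   # vanishes as dt -> 0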
At this stage, the reader may wonder whether the representation of the gradient
given by the formula (4.16) can be directly proved, without any use of a PDE
argument. A positive answer is given by the following result.
Lemma 4.11 On top of assumption Lipschitz FBSDE, assume that, for a random variable $\xi \in L^2(\Omega,\mathcal{F}_0,\mathbb{P};\mathbb{R}^d)$, we can find a solution $(X,Y,Z)$ to (4.1) with $X_0 = \xi$ as initial condition, and a jointly continuous function $u : [0,T] \times \mathbb{R}^d \to \mathbb{R}^m$, once differentiable in space with a bounded and jointly continuous derivative, such that:
\[ \mathbb{P}\big[\forall t \in [0,T],\ Y_t = u(t, X_t)\big] = 1. \tag{4.17} \]
Proof. We first prove the representation formula under the strong assumptions on $u$. Consider a uniform subdivision $0 = T_0 < T_1 < \cdots < T_N = T$ of step size $h$, together with a simple process of the form:
\[ \eta_t = \sum_{i=0}^{N-1} \eta_i\,\mathbf{1}_{(T_i, T_{i+1}]}(t), \qquad t \in [0,T], \]
with $\eta_i \in L^\infty(\Omega,\mathcal{F}_{T_i},\mathbb{P};\mathbb{R}^d)$, the variables $(\eta_i)_{i=0,\ldots,N-1}$ being uniformly bounded by some constant $K$. Next, for any $i = 0, \ldots, N-1$, notice that:
\[ \mathbb{E}\Big[\int_{T_i}^{T_{i+1}} \eta_t\cdot dW_t\; Y_{T_{i+1}}\Big] = \mathbb{E}\big[\eta_i\cdot\big(W_{T_{i+1}} - W_{T_i}\big)\, Y_{T_{i+1}}\big], \]
with:
\[ \mathbb{E}\Big[\sup_{0\le t\le T}\big(|X_t|^2 + |Y_t|^2\big) + \int_0^T |Z_s|^2\,ds\Big] < \infty, \]
where, here and below in the proof, $(\varepsilon(t,h))_{t\in[0,T],h>0}$ is a generic notation for a function satisfying $\sup_{t\in[0,T]}|\varepsilon(t,h)| \to 0$ as $h \searrow 0$; the various remainder terms in the expansion are of the form $h\,\varepsilon(T_i,h)$.
We finally get:
\[
\begin{aligned}
\mathbb{E}\Big[\int_{T_i}^{T_{i+1}} \eta_s\cdot dW_s\; Y_{T_{i+1}}\Big]
&= \mathbb{E}\Big[\int_{T_i}^{T_{i+1}} \eta_s\cdot dW_s \int_{T_i}^{T_{i+1}} \partial_x u(s,X_s)\Sigma(s,X_s,Y_s)\,dW_s\Big] + h\,\varepsilon(T_i,h) \\
&= \mathbb{E}\Big[\int_{T_i}^{T_{i+1}} \partial_x u(s,X_s)\Sigma(s,X_s,Y_s)\,\eta_s\,ds\Big] + h\,\varepsilon(T_i,h).
\end{aligned} \tag{4.20}
\]
Identifying (4.18) and (4.20), summing over $i = 0, \ldots, N-1$, and then letting $h$ tend to 0, we finally get:
\[ \mathbb{E}\Big[\int_0^T \partial_x u(s,X_s)\Sigma(s,X_s,Y_s)\,\eta_s\,ds\Big] = \mathbb{E}\Big[\int_0^T Z_s\,\eta_s\,ds\Big]. \]
The proof of the first claim is easily completed, using the fact that the class of simple
processes is dense within the family of square-integrable F-progressively measurable
processes.
When $u$ is Lipschitz continuous in space and not necessarily differentiable in $x$, we assume that $\Sigma$ is bounded. We then observe that (4.19) still makes sense. Indeed, since the function $\mathbb{R} \ni r \mapsto u(T_{i+1}, X_{T_i} + r(X_{T_{i+1}} - X_{T_i}))$ is Lipschitz continuous, we can still give a meaning to the integral $\int_0^1 \partial_x u(T_{i+1}, rX_{T_{i+1}} + (1-r)X_{T_i})\,dr$. Also, we can handle the last term in (4.19) by Itô's formula; namely, we can prove:
\[ \mathbb{E}\bigg[\Big|\int_{T_i}^{T_{i+1}} \eta_s\cdot dW_s\,\big(X_{T_{i+1}} - X_{T_i}\big) - \int_{T_i}^{T_{i+1}} \Sigma(s,X_s,Y_s)\,\eta_s\,ds\Big|\bigg] = h\,\varepsilon(T_i,h). \]
Therefore, we can find a constant $C$, only depending on the Lipschitz bound of $u$ and on the bound of $\Sigma$, such that:
\[
\Big|\mathbb{E}\Big[\int_{T_i}^{T_{i+1}} \eta_s\cdot dW_s\;Y_{T_{i+1}}\Big]\Big|
\le C\,\mathbb{E}\bigg[\Big|\int_{T_i}^{T_{i+1}} \Sigma(s,X_s,Y_s)\,\eta_s\,ds\Big|\bigg] + h\,\varepsilon(T_i,h)
\le C\,\mathbb{E}\Big[\int_{T_i}^{T_{i+1}} |\eta_s|\,ds\Big] + h\,\varepsilon(T_i,h),
\]
the value of $C$ being allowed to increase from line to line. Identifying again with (4.18), summing over $i \in \{0, \ldots, N-1\}$ and letting $h$ tend to 0, we deduce that:
\[ \Big|\mathbb{E}\Big[\int_0^T Z_s\,\eta_s\,ds\Big]\Big| \le C\,\mathbb{E}\Big[\int_0^T |\eta_s|\,ds\Big]. \]
Once again, the proof of the second claim is easily completed. The last claim may be proved by changing $(\eta_s)_{0\le s\le T}$ into $(\Sigma^{-1}(s,X_s,Y_s)\eta_s)_{0\le s\le T}$. □
(A2) The functions $\Sigma$ and $G$ are bounded by $L$. Moreover, for any $t \in [0,T]$, $x \in \mathbb{R}^d$, $y \in \mathbb{R}^m$ and $z \in \mathbb{R}^{m\times d}$:
\[ |(B,F)(t,x,y,z)| \le L\big(1 + |y| + |z|\big). \]
(A3) The function $\Sigma$ is uniformly elliptic in the sense that, for any $t \in [0,T]$, $x \in \mathbb{R}^d$ and $y \in \mathbb{R}^m$, the following inequality holds:
\[ \Sigma\Sigma^{\dagger}(t,x,y) \ge L^{-1} I_d, \]
Theorem 4.12 Under assumption Nondegenerate FBSDE, for any $t \in [0,T]$ and $\xi$ in the space $L^2(\Omega,\mathcal{F}_t,\mathbb{P};\mathbb{R}^d)$, the forward-backward system (4.1) with $X_t = \xi$ as initial condition has a unique solution, denoted by $(X^{t,\xi}_s, Y^{t,\xi}_s, Z^{t,\xi}_s)_{t\le s\le T}$. Moreover, the decoupling field $u : [0,T] \times \mathbb{R}^d \ni (t,x) \mapsto u(t,x) = Y^{t,x}_t \in \mathbb{R}^m$ satisfies:
\[ |u(t,x) - u(t',x')| \le \gamma\big(|t-t'|^{1/2} + |x-x'|\big), \]
for some constant $\gamma$ depending only upon $T$ and $L$. Finally, $Y^{t,\xi}_s = u(s, X^{t,\xi}_s)$ for any $t \le s \le T$, and $|Z^{t,\xi}_s| \le L$, $\mathrm{Leb}_1 \otimes \mathbb{P}$ almost everywhere.
Remark 4.13 The above Lipschitz estimate (in the variable $x$) will be established in Subsection 4.4.2 when $\Sigma$ is independent of $y$. The proof relies on the theory of quadratic BSDEs, which we present in the next subsection.
So far, we have provided results for general FBSDEs of the form (4.1), allowing the diffusion coefficient $\Sigma$ to depend upon the variable $y$. Actually, in most of the applications considered in this book, we do not need such a level of generality. In fact, it will suffice to manipulate FBSDEs driven by a diffusion coefficient depending only on the variables $t$ and $x$.
A crucial insight into this case is the following: when $\Sigma$ is independent of the backward component, the forward-backward system can be decoupled by means of a Girsanov transformation, at least when $\Sigma$ is invertible. For instance, if $(X,Y,Z)$ is a solution to (4.1), we may let:
\[ \frac{d\mathbb{Q}}{d\mathbb{P}} = \mathcal{E}\Big(-\int_0^{\cdot} \Sigma^{-1}(t,X_t)\,B(t,X_t,Y_t,Z_t)\cdot dW_t\Big)_T, \]
and define the new driver $H$ of (4.21) accordingly, for $(t,x,y,z) \in [0,T]\times\mathbb{R}^d\times\mathbb{R}^m\times\mathbb{R}^{m\times d}$, where $z\,\Sigma^{-1}(t,x)B(t,x,y,z)$ is understood as the product of an $m \times d$ matrix and a $d$-vector.
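To make the decoupling mechanism explicit (a sketch of ours; the signs depend on the convention chosen in the stochastic exponential above), Girsanov's theorem makes $W^{\mathbb{Q}}_t = W_t + \int_0^t \Sigma^{-1}(s,X_s)B(s,X_s,Y_s,Z_s)\,ds$ a $\mathbb{Q}$-Brownian motion, under which the system becomes:
\[ dX_t = \Sigma(t,X_t)\,dW^{\mathbb{Q}}_t, \qquad dY_t = -H(t,X_t,Y_t,Z_t)\,dt + Z_t\,dW^{\mathbb{Q}}_t, \]
with $H(t,x,y,z) = F(t,x,y,z) + z\,\Sigma^{-1}(t,x)B(t,x,y,z)$: the forward equation is now driftless, hence decoupled from the backward one, at the cost of a driver which becomes quadratic in $z$ whenever $B$ grows linearly in $z$.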
We already used this strategy in the presentation of mean field games under the
weak formulation in Subsection 3.3.1. In particular, (4.21) is very close to (3.30),
although the functions H in the two definitions do not exactly coincide.
A key observation concerning (4.21) is that the driver $H$ is most often of quadratic growth in the variable $z$. This departure from the standard Lipschitz setting creates additional difficulties and requires a special treatment. In order to overcome these new challenges, we shall assume that $m = 1$, implying that the backward component $Y$ is one-dimensional. This restrictive assumption will not be too much of a hindrance in what follows. Indeed, in the majority of the cases of interest to us, quadratic BSDEs will be used to represent a cost; in this regard, the quadratic BSDEs under consideration will only be one-dimensional.
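A classical illustration of how this one-dimensional quadratic structure can be tamed (our own aside, not spelled out in the text) is the exponential, or Cole-Hopf, transform: if the driver has the form $H(t,x,y,z) = h(t,x,y) + \frac{\kappa}{2}|z|^2$ for some bounded $h$ and some $\kappa > 0$, then Itô's formula shows that $P_t = \exp(\kappa Y_t)$ satisfies:
\[ dP_t = -\kappa P_t\,h\big(t, X_t, \kappa^{-1}\log P_t\big)\,dt + \kappa P_t Z_t \cdot dW_t, \]
so that the purely quadratic term disappears and the pair $(P_t, \kappa P_t Z_t)$ solves a BSDE whose driver is Lipschitz continuous on the range of $P$.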
Below, we provide some of the basic results in the analysis of quadratic BSDEs. We do not give proofs because of their technical nature. We refer to the Notes & Complements at the end of the chapter for references on the subject.
Remark 4.14 In accordance with the convention introduced in Chapter 3, the value $Z_t$ at time $t$ of the martingale integrand process $Z$ in a one-dimensional BSDE will often be regarded as a $d$-dimensional vector (and not as a $1 \times d$ matrix). This justifies the use of the notation $Z_t \cdot dW_t$ instead of $Z_t\,dW_t$.
Theorem 4.15 Under assumption Quadratic BSDE, there exists a unique pair of $\mathbb{F}$-progressively measurable processes $(Y,Z) = (Y_t, Z_t)_{0\le t\le T}$ with values in $\mathbb{R}$ and $\mathbb{R}^d$ satisfying (4.22) and:
\[ \sup_{0\le t\le T} |Y_t| \in L^\infty(\Omega,\mathcal{F}_T,\mathbb{P};\mathbb{R}), \qquad \mathbb{E}\int_0^T |Z_t|^2\,dt < \infty. \]
The solution also satisfies a comparison principle: if $(\xi',\Psi')$ is another pair of terminal condition and driver satisfying assumption Quadratic BSDE, with $\xi \le \xi'$ almost surely and, for all $y \in \mathbb{R}$ and $z \in \mathbb{R}^d$:
\[ \mathrm{Leb}_1 \otimes \mathbb{P}\Big[\big\{(t,\omega) \in [0,T] \times \Omega : \Psi(t,\omega,y,z) > \Psi'(t,\omega,y,z)\big\}\Big] = 0, \]
then:
\[ \forall t \in [0,T], \qquad \mathbb{P}\big[Y_t > Y'_t\big] = 0, \]
where $(Y', Z')$ is the unique solution (in the sense of Theorem 4.15) to (4.22) when driven by $(\xi', \Psi')$.
Bounded-Mean-Oscillation Martingales
One crucial ingredient in the analysis of quadratic BSDEs is that the martingale process $\big(\int_0^t Z_s \cdot dW_s\big)_{0\le t\le T}$ is of bounded mean oscillation in the sense of the following definition.
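Recall the standard definition: a square-integrable martingale $M = (M_t)_{0\le t\le T}$ with $M_0 = 0$ is said to be of bounded mean oscillation if there exists a constant $K \ge 0$ such that, for every stopping time $\tau$ with values in $[0,T]$:
\[ \mathbb{E}\big[\langle M\rangle_T - \langle M\rangle_\tau \,\big|\, \mathcal{F}_\tau\big] \le K^2, \]
which, for $M_t = \int_0^t Z_s \cdot dW_s$, amounts to $\mathbb{E}\big[\int_\tau^T |Z_s|^2\,ds \,\big|\, \mathcal{F}_\tau\big] \le K^2$.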
We call the smallest constant K with this property the BMO norm of the martingale.
Theorem 4.19 Under assumption Quadratic BSDE, consider the unique solution $(Y,Z) = (Y_t,Z_t)_{0\le t\le T}$ to (4.22) as given in Theorem 4.15. Then, the process $\big(\int_0^t Z_s \cdot dW_s\big)_{0\le t\le T}$ is a BMO martingale and its BMO norm only depends upon $T$, $L$ and the bound for the $L^\infty$-norm of $\xi$.
Remark 4.20 Part of the statements given here will be revisited in Chapter (Vol II)-1 when handling optimal control problems in random environments.
This argument will be used again when we state and prove Theorem 1.45 in Chapter (Vol II)-1 as a generalization of Theorem 4.2 to the case of random coefficients.
Solving FBSDEs of the McKean-Vlasov type on arbitrary time intervals is much
more delicate and involved. We present a first class of models for which we can
actually do that in the following section. Other models will be investigated in the
next sections when we return to the existence of solutions for mean field games.
In this section and the next, all the processes are assumed to be defined on a complete filtered probability space $(\Omega, \mathcal{F}, \mathbb{F} = (\mathcal{F}_t)_{0\le t\le T}, \mathbb{P})$ supporting a $d$-dimensional Wiener process $W = (W_t)_{0\le t\le T}$ with respect to $\mathbb{F}$, the filtration $\mathbb{F}$ satisfying the usual conditions. We recall that, for each random variable/vector or stochastic process $X$, we denote by $\mathcal{L}(X)$ the law (alternatively called the distribution) of $X$, and, for any integer $n \ge 1$, by $\mathbb{H}^{2,n}$ the Hilbert space:
\[ \mathbb{H}^{2,n} = \Big\{ Z \in \mathbb{H}^{0,n} : \mathbb{E}\int_0^T |Z_s|^2\,ds < \infty \Big\}, \tag{4.23} \]
where $\mathbb{H}^{0,n}$ stands for the collection of all $\mathbb{R}^n$-valued progressively measurable processes on $[0,T]$. We shall also denote by $\mathbb{S}^{2,n}$ the collection of all continuous processes $U = (U_t)_{0\le t\le T}$ in $\mathbb{H}^{0,n}$ such that $\mathbb{E}[\sup_{0\le t\le T}|U_t|^2] < +\infty$. As for the dependence of the coefficients (and the solutions) upon the measure parameters, we refer the reader to the definition (3.16) of the Wasserstein distance given earlier, and to Section 5.1 of Chapter 5 for a thorough discussion of its properties. We merely highlight a simple property of the 2-Wasserstein distance $W_2$:
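namely the familiar coupling bound, which is all we shall need below: for any two square-integrable random variables $X$ and $X'$ defined on the same probability space,
\[ W_2\big(\mathcal{L}(X), \mathcal{L}(X')\big)^2 \le \mathbb{E}\big[|X - X'|^2\big], \tag{4.24} \]
since the pair $(X, X')$ realizes a coupling of the two laws.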
For technical reasons, we allow the coefficients to be random. This means that the drift and diffusion coefficients of the state $X_t$ of the system at time $t$ are given by a pair of (measurable) functions $(B, \Sigma) : [0,T] \times \Omega \times \mathbb{R}^d \times \mathcal{P}_2(\mathbb{R}^d) \to \mathbb{R}^d \times \mathbb{R}^{d\times d}$. The term nonlinear used to qualify (4.25) does not refer to the fact that the coefficients $B$ and $\Sigma$ could be nonlinear functions of $x$, but instead to the fact that they depend not only on the value of the unknown process $X_t$ at time $t$, but also on its marginal distribution $\mathcal{L}(X_t)$. We shall use the following assumptions.
Proof. Let $\mu = (\mu_t)_{0\le t\le T} \in \mathcal{C}([0,T]; \mathcal{P}_2(\mathbb{R}^d))$ be temporarily fixed. Substituting momentarily $\mu_t$ for $\mathcal{L}(X_t)$ for all $t \in [0,T]$ in (4.25) and recalling that $X_0$ is given, the classical existence result for Lipschitz SDEs guarantees existence and uniqueness of a strong solution of the resulting stochastic differential equation with random coefficients. We denote its solution by $X = (X_t)_{0\le t\le T}$. This classical existence result also implies that the law of $X$ is of order 2, so that we can define the mapping:
\[ \Phi : \mathcal{C}([0,T]; \mathcal{P}_2(\mathbb{R}^d)) \ni \mu \mapsto \Phi(\mu) = \big(\mathcal{L}(X_t)\big)_{0\le t\le T} = \big(\mathbb{P}\circ(X_t)^{-1}\big)_{0\le t\le T} \in \mathcal{C}([0,T]; \mathcal{P}_2(\mathbb{R}^d)). \]
Observe that $\Phi(\mu)$ indeed belongs to $\mathcal{C}([0,T]; \mathcal{P}_2(\mathbb{R}^d))$ because $X$ has continuous paths and satisfies $\mathbb{E}[\sup_{0\le t\le T}|X_t|^2] < \infty$.
Since a process $X = (X_t)_{0\le t\le T}$ satisfying $\mathbb{E}[\sup_{0\le t\le T}|X_t|^2] < \infty$ is a solution of (4.25) if and only if its law is a fixed point of $\Phi$, we prove the existence and uniqueness claims of the theorem by proving that the mapping $\Phi$ has a unique fixed point. Let us choose $\mu$ and $\mu'$ in $\mathcal{C}([0,T]; \mathcal{P}_2(\mathbb{R}^d))$, with associated solutions $X$ and $X'$. Since $X$ and $X'$ have the same initial condition, Doob's maximal inequality and the Lipschitz assumption yield, for all $t \in [0,T]$:
\[
\begin{aligned}
\mathbb{E}\Big[\sup_{0\le s\le t} |X_s - X'_s|^2\Big]
&\le 2\,\mathbb{E}\Big[\sup_{0\le s\le t}\Big|\int_0^s \big[B(r,X_r,\mu_r) - B(r,X'_r,\mu'_r)\big]\,dr\Big|^2\Big] \\
&\quad + 2\,\mathbb{E}\Big[\sup_{0\le s\le t}\Big|\int_0^s \big[\Sigma(r,X_r,\mu_r) - \Sigma(r,X'_r,\mu'_r)\big]\,dW_r\Big|^2\Big] \\
&\le c(T)\,\Big(\int_0^t \mathbb{E}\Big[\sup_{0\le r\le s}|X_r - X'_r|^2\Big]\,ds + \int_0^t W_2(\mu_s,\mu'_s)^2\,ds\Big),
\end{aligned}
\]
for a constant $c(T)$ depending on $T$ and $L$, $c(T)$ being nondecreasing in $T$. As usual, and except for the dependence upon $T$ which we keep track of, we use the same notation $c(T)$ even though the value of this constant can change from line to line. Using Gronwall's inequality, one concludes that:
\[ \mathbb{E}\Big[\sup_{0\le s\le t} |X_s - X'_s|^2\Big] \le c(T)\int_0^t W_2(\mu_s, \mu'_s)^2\,ds. \tag{4.27} \]
Iterating this inequality and denoting by $\Phi^k$ the $k$-th composition of the mapping $\Phi$ with itself, we get, for any integer $k \ge 1$:
\[
\sup_{0\le s\le T} W_2\big(\Phi^k(\mu)_s, \Phi^k(\mu')_s\big)^2
\le c(T)^k \int_0^T \frac{(T-s)^{k-1}}{(k-1)!}\,W_2(\mu_s, \mu'_s)^2\,ds
\le \frac{\big(c(T)\,T\big)^k}{k!}\, \sup_{0\le s\le T} W_2(\mu_s, \mu'_s)^2,
\]
which shows that, for $k$ large enough, $\Phi^k$ is a strict contraction; hence, $\Phi$ admits a unique fixed point since the space $\mathcal{C}([0,T]; \mathcal{P}_2(\mathbb{R}^d))$ is complete. □
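The contraction argument translates directly into a numerical scheme; the following self-contained Python sketch (a toy model of ours) approximates the fixed point of $\Phi$ by Picard iteration for $d = 1$, $B(t,x,\mu) = -(x - \langle\mu\rangle)$ with $\langle\mu\rangle$ the mean of $\mu$, and $\Sigma = 1$. Since the interaction is only through the mean, the flow of measures can be summarized by the flow of empirical means of $N$ simulated particles; the driving noise is drawn once and reused so that the simulated $\Phi$ is a deterministic map on flows.

    import numpy as np

    rng = np.random.default_rng(1)
    T, n_steps, N = 1.0, 200, 10_000
    dt = T / n_steps
    x0 = rng.normal(1.0, 1.0, N)                     # initial condition xi
    dW = rng.normal(0.0, np.sqrt(dt), (n_steps, N))  # frozen Brownian increments

    def apply_phi(mean_flow):
        """One application of Phi: solve the SDE with the input flow frozen."""
        X = x0.copy()
        out = [X.mean()]
        for k in range(n_steps):
            X = X - (X - mean_flow[k]) * dt + dW[k]  # Euler step, frozen measure
            out.append(X.mean())
        return np.array(out)

    flow = np.zeros(n_steps + 1)                     # arbitrary initial guess
    for it in range(6):
        new_flow = apply_phi(flow)
        print(f"iteration {it}: sup distance = {np.abs(new_flow - flow).max():.2e}")
        flow = new_flow                              # geometric convergence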
Remark 4.22 The reader may object to the fact that the SDE (4.25) does not include all the models suggested by the general form touted in (3.17). Indeed, in order to do so, we should investigate an SDE of the form:
\[ dX_t = B\big(t, X_t, \mathcal{L}(X_t, \theta_t)\big)\,dt + \Sigma\big(t, X_t, \mathcal{L}(X_t, \theta_t)\big)\,dW_t, \qquad t \in [0,T], \tag{4.28} \]
where $(\theta_t)_{0\le t\le T}$ denotes a given $\mathbb{F}$-adapted process with paths in $\mathcal{C}([0,T]; \mathbb{R}^m)$ and with square-integrable marginals. However, it is easy to check that Theorem 4.21 extends to this slightly more general setting.
This section is devoted to an existence and uniqueness result for BSDEs of McKean-Vlasov type. We consider a backward stochastic differential equation of the form:
\[ dY_t = -\Psi\big(t, Y_t, Z_t, \mathcal{L}(\theta_t, Y_t)\big)\,dt + Z_t\,dW_t, \qquad t \in [0,T], \tag{4.29} \]
with terminal condition $Y_T = G$. The driver $\Psi$ and the terminal condition $G$ are measurable and random, with $\Psi : [0,T] \times \Omega \times \mathbb{R}^m \times \mathbb{R}^{m\times d} \times \mathcal{P}_2(\mathbb{R}^d \times \mathbb{R}^m) \to \mathbb{R}^m$ and $G : \Omega \to \mathbb{R}^m$. Moreover, $W = (W_t)_{0\le t\le T}$ is an $\mathbb{R}^d$-valued Brownian motion and $\theta = (\theta_t)_{0\le t\le T}$ is an $\mathbb{R}^d$-valued square-integrable $\mathbb{F}$-adapted process with continuous paths on $[0,T]$, where $\mathbb{F}$ is the usual augmentation of the filtration generated by $W$ and by an initial $\sigma$-field $\mathcal{F}_0$, independent of $W$.
Assumption (MKV BSDE).
(A1) For each $(y,z,\nu) \in \mathbb{R}^m \times \mathbb{R}^{m\times d} \times \mathcal{P}_2(\mathbb{R}^d \times \mathbb{R}^m)$, the process $\Psi(\cdot,\cdot,y,z,\nu) : [0,T] \times \Omega \ni (t,\omega) \mapsto \Psi(t,\omega,y,z,\nu)$ is $\mathbb{F}$-progressively measurable and belongs to $\mathbb{H}^{2,m}$. Also, $G \in L^2(\Omega,\mathcal{F}_T,\mathbb{P};\mathbb{R}^m)$.
(A2) There exists a constant $L > 0$ such that, for any $t \in [0,T]$, $\omega \in \Omega$, $y,y' \in \mathbb{R}^m$, $z,z' \in \mathbb{R}^{m\times d}$, and $\nu, \nu' \in \mathcal{P}_2(\mathbb{R}^d \times \mathbb{R}^m)$, $\nu$ and $\nu'$ having the same first marginal on $\mathbb{R}^d$:
\[ |\Psi(t,y,z,\nu) - \Psi(t,y',z',\nu')| \le L\big(|y-y'| + |z-z'| + W_2(\nu,\nu')\big), \]
where we use the same notation $W_2$ for the 2-Wasserstein distance on $\mathcal{P}_2(\mathbb{R}^d)$ and on $\mathcal{P}_2(\mathbb{R}^d \times \mathbb{R}^m)$.
Theorem 4.23 Under assumption MKV BSDE, there exists a unique solution $(Y,Z) \in \mathbb{S}^{2,m} \times \mathbb{H}^{2,m\times d}$ of (4.29).
Proof. We equip $\mathbb{H}^{2,m} \times \mathbb{H}^{2,m\times d}$ with the weighted norm:
\[ \big\|(Y,Z)\big\|_{\mathbb{H},\alpha} = \Big(\mathbb{E}\int_0^T e^{\alpha t}\big(|Y_t|^2 + |Z_t|^2\big)\,dt\Big)^{1/2}, \]
with a positive constant $\alpha$ to be chosen later in the proof. For any $(Y,Z) \in \mathbb{H}^{2,m} \times \mathbb{H}^{2,m\times d}$, we denote by $(Y',Z')$ the unique solution of the BSDE (which is known to exist by standard results from BSDE theory):
\[ dY'_t = -\Psi\big(t, Y'_t, Z'_t, \mathcal{L}(\theta_t, Y_t)\big)\,dt + Z'_t\,dW_t, \qquad t \in [0,T], \qquad Y'_T = G. \]
This defines a map $\Phi : (Y,Z) \mapsto (Y',Z') = \Phi(Y,Z)$ from $\mathbb{H}^{2,m} \times \mathbb{H}^{2,m\times d}$ into itself. Notice that $Y' \in \mathbb{S}^{2,m}$. The proof consists in showing that one can choose $\alpha$ so that the mapping $\Phi$ is a strict contraction, its unique fixed point giving the desired solution to the mean field BSDE (4.29). Let us choose $(Y^1,Z^1)$ and $(Y^2,Z^2)$ in $\mathbb{H}^{2,m} \times \mathbb{H}^{2,m\times d}$ and let us set $(Y'^1,Z'^1) = \Phi(Y^1,Z^1)$, $(Y'^2,Z'^2) = \Phi(Y^2,Z^2)$, $(\hat{Y},\hat{Z}) = (Y^2 - Y^1, Z^2 - Z^1)$ and $(\hat{Y}',\hat{Z}') = (Y'^2 - Y'^1, Z'^2 - Z'^1)$. Applying Itô's formula to $(e^{\alpha t}|\hat{Y}'_t|^2)_{0\le t\le T}$, we get, for any $t \in [0,T]$:
\[
\begin{aligned}
|\hat{Y}'_t|^2 &+ \mathbb{E}\Big[\int_t^T \alpha\,e^{\alpha(r-t)}|\hat{Y}'_r|^2\,dr \,\Big|\, \mathcal{F}_t\Big] + \mathbb{E}\Big[\int_t^T e^{\alpha(r-t)}|\hat{Z}'_r|^2\,dr \,\Big|\, \mathcal{F}_t\Big] \\
&= 2\,\mathbb{E}\Big[\int_t^T e^{\alpha(r-t)}\,\hat{Y}'_r \cdot \Big(\Psi\big(r, Y'^2_r, Z'^2_r, \mathcal{L}(\theta_r, Y^2_r)\big) - \Psi\big(r, Y'^1_r, Z'^1_r, \mathcal{L}(\theta_r, Y^1_r)\big)\Big)\,dr \,\Big|\, \mathcal{F}_t\Big].
\end{aligned}
\]
From the integrability assumption (A1) and the uniform Lipschitz assumption (A2) in assumption MKV BSDE, together with the coupling bound $W_2(\mathcal{L}(\theta_r,Y^2_r), \mathcal{L}(\theta_r,Y^1_r))^2 \le \mathbb{E}[|\hat{Y}_r|^2]$, we deduce that there exists a constant $c$, depending on $L$ but not on $\alpha$, such that:
\[ \alpha\,\mathbb{E}\int_0^T e^{\alpha r}|\hat{Y}'_r|^2\,dr + \frac{1}{2}\,\mathbb{E}\int_0^T e^{\alpha r}|\hat{Z}'_r|^2\,dr \le c\,\mathbb{E}\int_0^T e^{\alpha r}\big(|\hat{Y}'_r|^2 + |\hat{Y}_r|^2\big)\,dr. \]
Choosing $\alpha$ large enough in terms of $c$, the first term on the right-hand side can be absorbed into the left-hand side and we are left with $\|(\hat{Y}',\hat{Z}')\|_{\mathbb{H},\alpha} \le 2^{-1/2}\|(\hat{Y},\hat{Z})\|_{\mathbb{H},\alpha}$, so that $\Phi$ is a strict contraction. This completes the proof. □
\[
\begin{cases}
dX_t = B\big(t, X_t, Y_t, Z_t, \mathcal{L}(X_t,Y_t)\big)\,dt + \Sigma\big(t, X_t, Y_t, \mathcal{L}(X_t,Y_t)\big)\,dW_t, \\[4pt]
dY_t = -F\big(t, X_t, Y_t, Z_t, \mathcal{L}(X_t,Y_t)\big)\,dt + Z_t\,dW_t, \qquad t \in [0,T],
\end{cases} \tag{4.30}
\]
where we use the same notation $W_2$ for the 2-Wasserstein distance on $\mathcal{P}_2(\mathbb{R}^d)$ and on $\mathcal{P}_2(\mathbb{R}^d \times \mathbb{R}^m)$.
Theorem 4.24 Under assumption MKV FBSDE in Small Time, there exists a constant $c > 0$, depending only on the parameter $L$ in the assumption, such that, for $T \le c$ and for any initial condition $X_0 = \xi \in L^2(\Omega,\mathcal{F}_0,\mathbb{P};\mathbb{R}^d)$, the FBSDE (4.30) has a unique solution $(X,Y,Z) \in \mathbb{S}^{2,d} \times \mathbb{S}^{2,m} \times \mathbb{H}^{2,m\times d}$.
Proof. Throughout the proof, the initial condition $\xi \in L^2(\Omega,\mathcal{F}_0,\mathbb{P};\mathbb{R}^d)$ is fixed. For an element $X = (X_t)_{0\le t\le T} \in \mathbb{S}^{2,d}$, $X$ being progressively measurable with respect to the completion of the filtration generated by $\xi$ and $W$, we call $(Y,Z) = (Y_t,Z_t)_{0\le t\le T}$ the solution of the BSDE:
\[ dY_t = -F\big(t, X_t, Y_t, Z_t, \mathcal{L}(X_t,Y_t)\big)\,dt + Z_t\,dW_t, \qquad t \in [0,T], \tag{4.31} \]
with the terminal condition $Y_T = G(X_T, \mathcal{L}(X_T))$. The pair $(Y,Z)$ is progressively measurable with respect to the completion of the filtration generated by $\xi$ and $W$. Its existence is guaranteed by Theorem 4.23, applied with the driver built from $F$ and with $\theta_t = X_t$. With this $(Y,Z) \in \mathbb{S}^{2,m} \times \mathbb{H}^{2,m\times d}$, we associate $X' = (X'_t)_{0\le t\le T}$, the solution of the SDE:
\[ dX'_t = B\big(t, X'_t, Y_t, Z_t, \mathcal{L}(X'_t,Y_t)\big)\,dt + \Sigma\big(t, X'_t, Y_t, \mathcal{L}(X'_t,Y_t)\big)\,dW_t, \qquad t \in [0,T], \]
with $X'_0 = \xi$ as initial condition; see Theorem 4.21 and Remark 4.22. Obviously, $X'$ is progressively measurable with respect to the completion of the filtration generated by $\xi$ and $W$. In this way, we have created a map:
\[ \Phi : \mathbb{S}^{2,d;(\xi,W)} \ni X \mapsto X' \in \mathbb{S}^{2,d;(\xi,W)}, \]
where $\mathbb{S}^{2,d;(\xi,W)}$ denotes the collection of the processes $X \in \mathbb{S}^{2,d}$ which are progressively measurable with respect to the completion of the filtration generated by $\xi$ and $W$, our goal being now to prove that $\Phi$ is a contraction when $T$ is small enough.
Given two inputs $X^1$ and $X^2$ in $\mathbb{S}^{2,d;(\xi,W)}$, we denote by $(Y^1,Z^1)$ and $(Y^2,Z^2)$ the solutions of the BSDE (4.31) when $X$ is replaced by $X^1$ and $X^2$ respectively. Moreover, we let $X'^1 = \Phi(X^1)$ and $X'^2 = \Phi(X^2)$. Then, we can find a constant $C \ge 1$, depending on $L$ in assumption MKV FBSDE in Small Time, such that, for $T \le 1$:
\[ \mathbb{E}\Big[\sup_{0\le t\le T}|Y^1_t - Y^2_t|^2 + \int_0^T |Z^1_t - Z^2_t|^2\,dt\Big] \le C\,\mathbb{E}\Big[\sup_{0\le t\le T}|X^1_t - X^2_t|^2\Big], \]
and
\[ \mathbb{E}\Big[\sup_{0\le t\le T}|X'^1_t - X'^2_t|^2\Big] \le CT\,\mathbb{E}\Big[\sup_{0\le t\le T}|Y^1_t - Y^2_t|^2 + \int_0^T |Z^1_t - Z^2_t|^2\,dt\Big]. \]
Combining the two estimates, $\mathbb{E}[\sup_{0\le t\le T}|X'^1_t - X'^2_t|^2] \le C^2T\,\mathbb{E}[\sup_{0\le t\le T}|X^1_t - X^2_t|^2]$, so that $\Phi$ is a strict contraction as soon as $C^2T < 1$, which proves the claim for $T \le c$ with $c$ small enough.
Lemma 4.25 On top of assumption MKV FBSDE in Small Time, let us assume that the coefficients $B$, $\Sigma$, $F$ and $G$ are deterministic, and let us also assume that, on any probabilistic set-up $(\Omega, \mathcal{F}, \mathbb{F} = (\mathcal{F}_t)_{0\le t\le T}, \mathbb{P})$, for any $t \in [0,T]$ and $\xi \in L^2(\Omega,\mathcal{F}_t,\mathbb{P};\mathbb{R}^d)$, there exists a unique solution, denoted by $(X^{t,\xi}_s, Y^{t,\xi}_s, Z^{t,\xi}_s)_{t\le s\le T}$, of (4.30) on $[t,T]$ with $X^{t,\xi}_t = \xi$ as initial condition.
Then, for any $\mu \in \mathcal{P}_2(\mathbb{R}^d)$, there exists a measurable mapping $U(t,\cdot,\mu) : \mathbb{R}^d \ni x \mapsto U(t,x,\mu) \in \mathbb{R}^m$ such that:
\[ \mathbb{P}\big[Y^{t,\xi}_t = U(t, \xi, \mathcal{L}(\xi))\big] = 1. \]
Furthermore:
\[ \forall s \in [t,T], \qquad \mathbb{P}\big[Y^{t,\xi}_s = U\big(s, X^{t,\xi}_s, \mathcal{L}(X^{t,\xi}_s)\big)\big] = 1. \]
Proof. Given a probabilistic set-up $(\Omega,\mathcal{F},\mathbb{F},\mathbb{P})$, an initial time $t \in [0,T)$ and an initial condition $\xi \in L^2(\Omega,\mathcal{F}_t,\mathbb{P};\mathbb{R}^d)$, we can solve (4.30) with respect to the augmented filtration generated by $\xi$ and $(W_s - W_t)_{t\le s\le T}$. The resulting solution is also a solution with respect to the larger filtration $\mathbb{F}$, and, by uniqueness, it coincides with the solution obtained by solving the FBSDE (4.30) with respect to $\mathbb{F}$. We deduce that $Y^{t,\xi}_t$ coincides almost surely with a $\sigma(\xi)$-measurable $\mathbb{R}^m$-valued random variable. In particular, there exists a measurable function $u^{\xi}(t,\cdot) : \mathbb{R}^d \to \mathbb{R}^m$ such that $\mathbb{P}[Y^{t,\xi}_t = u^{\xi}(t,\xi)] = 1$.
We now claim that the law of $(\xi, Y^{t,\xi}_t)$ only depends upon the law of $\xi$ (i.e., it depends on $\xi$ through its law only). This directly follows from the version of the Yamada-Watanabe theorem for FBSDEs that we shall prove in Chapter (Vol II)-1, see Theorem (Vol II)-1.33. Since uniqueness holds pathwise, it also holds in law, so that, given two initial conditions with the same law, the solutions also have the same laws. Therefore, given another $\mathbb{R}^d$-valued random vector $\xi'$ with the same law as $\xi$, it holds that $(\xi, u^{\xi}(t,\xi)) \sim (\xi', u^{\xi'}(t,\xi'))$. In particular, for any measurable function $v : \mathbb{R}^d \to \mathbb{R}^m$, the random variables $u^{\xi}(t,\xi) - v(\xi)$ and $u^{\xi'}(t,\xi') - v(\xi')$ have the same law. Choosing $v = u^{\xi}(t,\cdot)$, we deduce that $u^{\xi'}(t,\cdot)$ and $u^{\xi}(t,\cdot)$ are almost everywhere equal under the probability measure $\mathcal{L}(\xi)$. To put it differently, denoting by $\mu$ the law of $\xi$, there exists an element $U(t,\cdot,\mu) \in L^2(\mathbb{R}^d,\mu)$ such that $u^{\xi}(t,\cdot)$ and $u^{\xi'}(t,\cdot)$ coincide $\mu$-almost everywhere with $U(t,\cdot,\mu)$. Identifying $U(t,\cdot,\mu)$ with one of its versions, this proves that:
\[ \mathbb{P}\big[Y^{t,\xi}_t = U(t,\xi,\mu)\big] = 1. \]
When $t > 0$, we notice that, for any $\mu \in \mathcal{P}_2(\mathbb{R}^d)$, there exists an $\mathcal{F}_t$-measurable random variable $\xi$ such that $\mu = \mathcal{L}(\xi)$. As a result, the procedure we just described makes it possible to define $U(t,\cdot,\mu)$ for any $\mu \in \mathcal{P}_2(\mathbb{R}^d)$. The situation may be different when $t = 0$, as $\mathcal{F}_0$ may reduce to events of measure zero or one. In such a case, $\mathcal{F}_0$ can be enlarged without any loss of generality in order to support $\mathbb{R}^d$-valued random variables with arbitrary distributions.
The fact that $U$ is independent of the choice of the probabilistic set-up $(\Omega,\mathcal{F},\mathbb{F},\mathbb{P})$ directly follows from the uniqueness in law property. □
Remark 4.26 Notice that the additional variable $\mathcal{L}(\xi)$ comes "for free" in the above writing, since we could set $v(t,\xi) = U(t,\xi,\mathcal{L}(\xi))$ and then have $Y^{t,\xi}_t = v(t,\xi)$. In fact, this additional variable $\mathcal{L}(\xi)$ is singled out to emphasize the non-Markovian nature of the equation over the state space $\mathbb{R}^d$: the decoupling fields are not the same if the laws of the initial conditions are different. Indeed, it is important to keep in mind that, in the Markovian framework, the decoupling field is the same for all possible initial conditions, thus yielding the connection with partial differential equations. Here the Markov property holds, but over the enlarged space $\mathbb{R}^d \times \mathcal{P}_2(\mathbb{R}^d)$, justifying the use of the extra variable $\mathcal{L}(\xi)$.
Remark 4.27 The notion of master field will be revisited in Subsection 5.7.2, and
used in a more systematic way in Chapters (Vol II)-4 and (Vol II)-5. The main
challenge will be to prove that the master field solves a partial differential equation
on the enlarged state space Œ0; T Rd P2 .Rd /, this partial differential equation
being referred to as the master equation.
The goal of this section is to provide a general existence result for McKean-Vlasov FBSDEs over an arbitrarily prescribed time interval. The result of this section was announced and appealed to in Chapter 3 in order to provide first existence results of MFG equilibria. It will be revisited in Subsection 4.5 below and in Chapter 6 in order to cover models which elude some of the assumptions made in this section. There, the FBSDEs arise from optimization problems, and, by taking advantage of the assumptions specific to the applications to mean field games and to the control of McKean-Vlasov dynamics respectively, we shall be able to extend the coverage of the existence result of this section. These assumptions include, for example, strong convexity of the cost functions or linearity of the drift. Instead, we here require the diffusion matrix to be nondegenerate and the coefficients to be bounded in the space variable.
The motivation of this section comes from the short time restriction in Theorem 4.24. This restriction is not satisfactory for practical applications, hence the need for conditions under which solutions exist on an arbitrary time interval. The non-degeneracy condition used in this section is borrowed from the theory of standard FBSDEs, and part of the proof is based upon a result of unique solvability for these equations, see Theorem 4.12.
We emphasize once more that all the regularity properties with respect to the probability measure argument are understood in the sense of the 2-Wasserstein distance $W_2$, whose definition was given in (3.16) and whose properties will be discussed in detail in Section 5.1 of Chapter 5. We use the same notation as in Section 4.2, see for instance (4.23) and (4.24).
Our goal is to prove existence (but not necessarily uniqueness) of a solution to a fully coupled McKean-Vlasov forward-backward system of the same form as (4.30):
\[
\begin{cases}
dX_t = B\big(t, X_t, Y_t, Z_t, \mathcal{L}(X_t,Y_t)\big)\,dt + \Sigma\big(t, X_t, Y_t, \mathcal{L}(X_t,Y_t)\big)\,dW_t, \\[4pt]
dY_t = -F\big(t, X_t, Y_t, Z_t, \mathcal{L}(X_t,Y_t)\big)\,dt + Z_t\,dW_t, \qquad t \in [0,T],
\end{cases} \tag{4.32}
\]
(A3) The function $\Sigma$ is uniformly elliptic in the sense that, for any $t \in [0,T]$, $x \in \mathbb{R}^d$, $y \in \mathbb{R}^m$ and $\nu \in \mathcal{P}_2(\mathbb{R}^d \times \mathbb{R}^m)$, the following inequality holds:
\[ \Sigma\Sigma^{\dagger}(t,x,y,\nu) \ge L^{-1} I_d, \]
Remark 4.28 Recall that $M_2(\nu)$ denotes the square root of the second moment of $\nu$, see (3.7). We also notice that (A2) may be rewritten as:
\[
\big|B\big(t,x,y,z,\mathcal{L}(X,Y)\big)\big| \le L\Big(1 + |y| + |z| + \mathbb{E}\big[|X|^2 + |Y|^2\big]^{1/2}\Big), \qquad
\big|F\big(t,x,y,z,\mathcal{L}(X,Y)\big)\big| \le L\Big(1 + |y| + |z| + \mathbb{E}\big[|Y|^2\big]^{1/2}\Big),
\]
for any square-integrable random variables $X$ and $Y$. The fact that $F$ is bounded uniformly with respect to $\mathbb{E}[|X|^2]^{1/2}$ will be explicitly used in the analysis below.
Recall that, throughout the book, we use the superscript $\dagger$ to denote the transpose of a matrix. We can now state the main result of this section. Notice that it extends Theorem 3.10.
Theorem 4.29 Under assumption Nondegenerate MKV FBSDE, for any random variable $\xi \in L^2(\Omega,\mathcal{F}_0,\mathbb{P};\mathbb{R}^d)$, the FBSDE (4.32) has a solution $(X,Y,Z) \in \mathbb{S}^{2,d} \times \mathbb{S}^{2,m} \times \mathbb{H}^{2,m\times d}$ with $X_0 = \xi$ as initial condition.
under the constraints that $Y_t = \varphi(t,X_t)$ and $\mu_t = \mathcal{L}(X_t)$ for $t \in [0,T]$, and with the boundary conditions $X_0 = \xi$ and $Y_T = G(X_T, \mathcal{L}(X_T))$. The strategy we use below consists in recasting the stochastic system (4.33) as a fixed point problem over the arguments $(\varphi, (\mu_t)_{0\le t\le T})$. The first step is to use $\varphi(t,\cdot)\ltimes\mu_t$ (the law of $(\chi, \varphi(t,\chi))$ when $\chi \sim \mu_t$) as an input and to solve (4.33) as a standard FBSDE. In order to do so, we should be able to use some of the known existence results for standard FBSDEs which we reviewed earlier.
Remark 4.30 Theorem 4.29 could be extended to the more general case when, in the McKean-Vlasov argument, the joint law $\mathcal{L}(X_t,Y_t)$ of $X_t$ and $Y_t$ is replaced by the joint law $\mathcal{L}(X_t,Y_t,Z_t)$ of $X_t$, $Y_t$, and $Z_t$ in $B$ and $F$. Indeed, in the nondegenerate setting, $Z_t$ is also given by a continuous function of $X_t$ in the same way as $Y_t$ is, namely $Z_t = v(t,X_t)$ with $v(t,x) = \partial_x u(t,x)\Sigma\big(t,x,u(t,x),u(t,\cdot)\ltimes\mathcal{L}(X_t)\big)$ whenever $Y_t = u(t,X_t)$ (i.e., $u \equiv \varphi$ with the notation used above); see Lemmas 4.10 and 4.11. However, since the proof would require a careful analysis of the smoothing properties of the operator driving the forward component of the equation, we refrain from tackling this question here.
Remark 4.31 The assumption that $\Sigma$ is independent of $(Z_t)_{0\le t\le T}$ should not be underestimated. Indeed, if $\Sigma$ depends upon $Z_t$, even in the classical (i.e., non-McKean-Vlasov) case, the arguments needed to prove existence are much more involved. In essence, they try to recreate, via specific monotonicity assumptions, the role played by convexity in the analysis of the so-called adjoint FBSDEs arising in optimal stochastic control.
Below, the fixed point problem is solved by means of Schauder's fixed point theorem, which provides existence of a fixed point by compactness arguments. However, it is important to keep in mind that it does not say anything about uniqueness. The way we implement Schauder's theorem is quite typical of the strategy we shall use later on to solve mean field games. In this regard, Theorem 4.29 serves as a good testbed for our technology, although so far, nothing has been said about the connection with mean field games.
We here recall the statement of Schauder's theorem for the sake of completeness.
Theorem 4.32 Let $(V, \|\cdot\|)$ be a normed vector space and $E$ be a nonempty closed convex subset of $V$. Then, any continuous mapping from $E$ into itself which has a relatively compact range has a fixed point.
Lemma 4.33 Fix $T > 0$ and, on top of assumption Nondegenerate MKV FBSDE, assume that, instead of (A2), $B$ and $F$ satisfy the growth property:
\[ |(B,F)(t,x,y,z,\nu)| \le L\big(1 + |y| + |z|\big). \]
Then the conclusions of Theorem 4.12 hold for the FBSDE with frozen measure argument, for some constant depending only upon $T$ and $L$; in particular, both the bound and the Hölder constant of the decoupling field are independent of the inputs $\varphi$ and $\mu$ introduced below. Finally, it holds that $Y^{t,\xi}_s = u(s, X^{t,\xi}_s)$ for any $t \le s \le T$ and $|Z^{t,\xi}_s| \le L$, $ds \otimes \mathbb{P}$ almost everywhere.
For the time being, we use this existence result in the following way. We start with a bounded continuous function $\varphi$ from $[0,T] \times \mathbb{R}^d$ into $\mathbb{R}^m$, and a flow of probability measures $\mu = (\mu_t)_{0\le t\le T}$ in $\mathcal{C}([0,T]; \mathcal{P}_2(\mathbb{R}^d))$, which we want to think of as the flow of marginal laws $(\mathcal{L}(X_t))_{0\le t\le T}$ of the solution. We apply the above existence result for (4.34) with $\nu = \mu_T$ and $\nu_t = \varphi(t,\cdot)\ltimes\mu_t$ for $t \in [0,T]$, and solve:
\[
\begin{cases}
dX_t = B\big(t, X_t, Y_t, Z_t, \varphi(t,\cdot)\ltimes\mu_t\big)\,dt + \Sigma\big(t, X_t, Y_t, \varphi(t,\cdot)\ltimes\mu_t\big)\,dW_t, \\[4pt]
dY_t = -F\big(t, X_t, Y_t, Z_t, \varphi(t,\cdot)\ltimes\mu_t\big)\,dt + Z_t\,dW_t, \qquad t \in [0,T],
\end{cases} \tag{4.35}
\]
Lemma 4.34 Under the same assumptions as in Lemma 4.33, there exists a positive constant $C$, depending on $T$ and $L$ only, such that, for any initial conditions $\xi, \xi' \in L^\infty(\Omega,\mathcal{F}_0,\mathbb{P};\mathbb{R}^d)$ and any inputs $(\varphi,\mu)$ and $(\varphi',\mu')$ as above, the processes $(X,Y,Z)$ and $(X',Y',Z')$ obtained by solving (4.35) with $\xi$ and $\xi'$ as respective initial conditions and $(\varphi,\mu)$ and $(\varphi',\mu')$ as respective inputs satisfy:
\[
\begin{aligned}
&\mathbb{E}\Big[\sup_{0\le t\le T}|X_t - X'_t|^2\Big] + \mathbb{E}\Big[\sup_{0\le t\le T}|Y_t - Y'_t|^2\Big] + \mathbb{E}\int_0^T |Z_t - Z'_t|^2\,dt \\
&\le C\,\Big(\mathbb{E}\big[|\xi - \xi'|^2\big] + \mathbb{E}\int_0^T \big|\big(B,F,\Sigma\big)\big(t,X_t,Y_t,Z_t,\varphi(t,\cdot)\ltimes\mu_t\big) - \big(B,F,\Sigma\big)\big(t,X_t,Y_t,Z_t,\varphi'(t,\cdot)\ltimes\mu'_t\big)\big|^2\,dt\Big).
\end{aligned} \tag{4.36}
\]
We are now in a position to implement the fixed point part of the strategy touted
for the construction of solutions to McKean-Vlasov FBSDEs in arbitrary time.
In this subsection, we assume that the coefficients $B$ and $F$ are bounded by the constant $L$. In addition, we assume that the initial condition $\xi$ of the forward component is also bounded, in the sense that it belongs to $L^\infty(\Omega,\mathcal{F}_0,\mathbb{P};\mathbb{R}^d)$.
For any bounded continuous function $\varphi : [0,T] \times \mathbb{R}^d \to \mathbb{R}^m$ and for any flow of probability measures $\mu = (\mu_t)_{0\le t\le T} \in \mathcal{C}([0,T]; \mathcal{P}_2(\mathbb{R}^d))$, the map $[0,T] \ni t \mapsto \varphi(t,\cdot)\ltimes\mu_t \in \mathcal{P}_2(\mathbb{R}^d \times \mathbb{R}^m)$ is continuous. So, by Lemma 4.33, there exists a unique triplet $(X_t,Y_t,Z_t)_{0\le t\le T}$ satisfying (4.35) with $X_0 = \xi$ as initial condition. Moreover, there exists a bounded and continuous mapping $u$ from $[0,T] \times \mathbb{R}^d$ into $\mathbb{R}^m$ such that $Y_t = u(t,X_t)$, the bound for $u$ being denoted by $\gamma$. This maps the input $(\varphi,\mu)$ into the output $(u, (\mathcal{L}(X_t))_{0\le t\le T})$, and our goal is to find a fixed point for this map. We shall take advantage of the a priori $L^\infty$-bound on $u$ to restrict the choice of the functions $\varphi$ to the set:
\[ E_1 = \big\{ \varphi \in \mathcal{C}([0,T] \times \mathbb{R}^d; \mathbb{R}^m) : \forall (t,x) \in [0,T] \times \mathbb{R}^d,\ |\varphi(t,x)| \le \gamma \big\}. \tag{4.38} \]
Similarly, since $\xi$ is in $L^\infty(\Omega,\mathcal{F}_0,\mathbb{P};\mathbb{R}^d)$ and the drift $B$ and the volatility $\Sigma$ are uniformly bounded, the fourth moment of the supremum $\sup_{0\le t\le T}|X_t|$ is bounded by a constant depending only upon the bounds of $\xi$, $B$ and $\Sigma$. Consequently, we shall choose the input measure $\mu$ in the set:
\[ E_2 = \Big\{ \mu \in \mathcal{C}\big([0,T]; \mathcal{P}_4(\mathbb{R}^d)\big) : \sup_{0\le t\le T} \int_{\mathbb{R}^d} |x|^4\,d\mu_t(x) \le \gamma' \Big\}, \tag{4.39} \]
for $\gamma'$ appropriately chosen in such a way that $\mathbb{E}[\sup_{0\le t\le T}|X_t|^4] \le \gamma'$ for any input $(\varphi,\mu)$. We then denote by $E$ the Cartesian product $E = E_1 \times E_2$. We view $E$ as a subset of the product vector space $V = V_1 \times V_2$, where $V_1 = \mathcal{C}_b([0,T] \times \mathbb{R}^d; \mathbb{R}^m)$ stands for the space of bounded continuous functions from $[0,T] \times \mathbb{R}^d$ into $\mathbb{R}^m$, and $V_2 = \mathcal{C}([0,T]; \mathcal{M}^1_f(\mathbb{R}^d))$ for the space of continuous functions from $[0,T]$ into the space $\mathcal{M}^1_f(\mathbb{R}^d)$ of finite signed measures $\mu$ on $\mathbb{R}^d$ such that $\mathbb{R}^d \ni x \mapsto |x|$ is integrable under $|\mu|$. On $V_1$, we use the exponentially weighted supremum norm defined in (4.37).
Here, $\mathrm{Lip}_1(\mathbb{R}^d)$ stands for the space of Lipschitz continuous functions on $\mathbb{R}^d$ with Lipschitz constant at most 1. As we shall show in Corollary 5.4, when restricted to $\mathcal{P}_1(\mathbb{R}^d)$, the distance induced by the Kantorovich-Rubinstein norm $\|\cdot\|_{KR^*}$ coincides with the 1-Wasserstein metric $W_1$ already defined in (3.16) earlier in Chapter 3 and studied in detail in Chapter 5.
The fact that $\|\cdot\|_{V_2}$ is a norm on $V_2$ may be easily checked. It suffices to check that $\|\cdot\|_{KR^*}$ is a norm on $\mathcal{M}^1_f(\mathbb{R}^d)$. While the triangle inequality and the homogeneity are easily verified, the property $\|\mu\|_{KR^*} = 0 \Rightarrow \mu = 0$ may be proved as follows. If $\|\mu\|_{KR^*} = 0$, then $\mu^+(\mathbb{R}^d) = \mu^-(\mathbb{R}^d)$, where $\mu^+$ and $\mu^-$ are the positive and negative parts of $\mu$. If $\mu^+(\mathbb{R}^d) > 0$, we may assume without any loss of generality that $\mu^+$ and $\mu^-$ are two probability measures, and $\|\mu^+ - \mu^-\|_{KR^*}$ is equal to:
\[
\begin{aligned}
\|\mu^+ - \mu^-\|_{KR^*} &= \sup\Big\{ \int_{\mathbb{R}^d} \ell(x)\,d(\mu^+ - \mu^-)(x)\;;\; \ell \in \mathrm{Lip}_1(\mathbb{R}^d),\ \ell(0) = 0 \Big\} \\
&= \sup\Big\{ \int_{\mathbb{R}^d} \ell(x)\,d(\mu^+ - \mu^-)(x)\;;\; \ell \in \mathrm{Lip}_1(\mathbb{R}^d) \Big\},
\end{aligned}
\]
the passage from the first to the second line following from the fact that any $\ell$ as in the second line can be replaced by $\ell(\cdot) - \ell(0)$. By Corollary 5.4, we get $\|\mu^+ - \mu^-\|_{KR^*} = W_1(\mu^+, \mu^-)$. Since $W_1$ is a distance and $\|\mu^+ - \mu^-\|_{KR^*}$ is equal to 0, we conclude that $\mu^+ = \mu^-$, i.e., $\mu = 0$.
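The coincidence between the Kantorovich-Rubinstein distance and $W_1$ is particularly transparent in dimension one, where $W_1$ between two empirical measures with the same number of atoms is computed by sorting; the following small Python sketch of ours illustrates the metric that the norm $\|\cdot\|_{KR^*}$ induces on $\mathcal{P}_1(\mathbb{R})$.

    import numpy as np

    def w1_empirical_1d(xs, ys):
        """W1 distance between the empirical measures of two samples of equal
        size: in dimension one the optimal coupling matches order statistics."""
        return np.abs(np.sort(xs) - np.sort(ys)).mean()

    rng = np.random.default_rng(2)
    a = rng.normal(0.0, 1.0, 100_000)
    b = rng.normal(0.5, 1.0, 100_000)
    # For two Gaussians with equal variances, W1 is the distance between the
    # means; the empirical value should be close to 0.5.
    print(w1_empirical_1d(a, b))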
We emphasize that $E_1$ is a convex, closed, bounded subset of $V_1$. Moreover, we notice that convergence in the norm $\|\cdot\|_{V_1}$ of a sequence of functions in $E_1$ is equivalent to uniform convergence on compact subsets of $[0,T] \times \mathbb{R}^d$. Similarly, $E_2$ is a convex, closed, bounded subset of $V_2$, since the space $\mathcal{P}_1(\mathbb{R}^d)$ is closed under $\|\cdot\|_{KR^*}$ and since, as shown in Theorem 5.5, convergence of probability measures for $\|\cdot\|_{KR^*}$ implies weak convergence of measures, which guarantees that the mapping $\mathcal{P}_1(\mathbb{R}^d) \ni \mu \mapsto \int_{\mathbb{R}^d} |x|^4\,d\mu(x) \in [0,+\infty]$ is lower semicontinuous for the distance induced by the Kantorovich-Rubinstein norm. We now claim:
\[ \forall t \in [0,T], \qquad \lim_{n\to\infty} \int_{\mathbb{R}^d} \big|(\varphi - \varphi^n)(t,y)\big|^2\,d\mu^n_t(y) = 0. \]
Therefore, by (4.37), we deduce that $(\mathcal{L}(X^n_t))_{0\le t\le T}$ converges towards $(\mathcal{L}(X_t))_{0\le t\le T}$ as $n$ tends to $+\infty$, in the Wasserstein metric $W_1$ uniformly in $t \in [0,T]$, and thus in the topology associated with the norm $\|\cdot\|_{V_2}$. Denoting by $u^n$ the FBSDE decoupling field, which is a function from $[0,T] \times \mathbb{R}^d$ into $\mathbb{R}^m$ such that $Y^n_t = u^n(t,X^n_t)$, and by $u$ the FBSDE decoupling field for which $Y_t = u(t,X_t)$, we deduce that:
\[ \lim_{n\to\infty} \sup_{0\le t\le T} \mathbb{E}\big[|u^n(t,X^n_t) - u(t,X_t)|^2\big] = 0. \]
By Lemma 4.33, we know that all the mappings $(u^n)_{n\ge1}$ are bounded and Lipschitz continuous with respect to $x$, uniformly with respect to $n$. Therefore,
\[ \lim_{n\to\infty} \sup_{0\le t\le T} \mathbb{E}\big[|u^n(t,X_t) - u(t,X_t)|^2\big] = 0. \]
Moreover, by the Arzelà-Ascoli theorem and by Lemma 4.33 again, the sequence $(u^n)_{n\ge1}$ is relatively compact for the uniform convergence on compact sets, so, denoting by $\hat{u}$ the limit of a subsequence converging for the norm $\|\cdot\|_{V_1}$, we deduce that, for any $t \in [0,T]$, $\hat{u}(t,\cdot) = u(t,\cdot)$, $\mathcal{L}(X_t)$-almost surely. By Stroock and Varadhan's support theorem for diffusion processes, $\mathcal{L}(X_t)$ has full support for any $t \in (0,T]$, so that, by continuity, $\hat{u}(t,\cdot) = u(t,\cdot)$ for any $t \in (0,T]$. By continuity of $u$ and $\hat{u}$ on the whole $[0,T] \times \mathbb{R}^d$, equality holds at $t = 0$ as well. This shows that $(u^n)_{n\ge1}$ converges towards $u$ for $\|\cdot\|_{V_1}$ and completes the proof of the continuity of $\Phi$.
We now prove that $\Phi(E)$ is relatively compact for the product norm of $V_1 \times V_2$. Given $(u,\mu') = \Phi(\varphi,\mu)$ for some $(\varphi,\mu) \in E$, we know from Lemma 4.33 that $u$ is bounded by $\gamma$ and $(1/2,1)$-Hölder continuous with respect to $(t,x)$, the Hölder constant being bounded by $\gamma$. In particular, $u$ remains in a compact subset of $\mathcal{C}([0,T] \times \mathbb{R}^d; \mathbb{R}^m)$ for the topology of uniform convergence on compact sets as $(\varphi,\mu)$ varies over $E$. Similarly, $\mu'$ remains in a compact set when $(\varphi,\mu)$ varies over $E$. Indeed, if $\mu' = (\mathcal{L}(X_t))_{0\le t\le T}$ is associated with $(\varphi,\mu)$, the moments of the measures $(\mu'_t)_{0\le t\le T}$ can easily be controlled from the fact that $B$ and $\Sigma$ are bounded by constants independent of $\varphi$ and $\mu$. Using Corollary 5.6, which will be proven in Chapter 5, this implies that all the $(\mu'_t)_{0\le t\le T}$ live in a compact subset of $\mathcal{P}_1(\mathbb{R}^d)$ equipped with the distance $W_1$, independently of the input $(\varphi,\mu) \in E$. Moreover, it is clear that there exists a constant $C$, independent of the input $(\varphi,\mu) \in E$, such that $W_1(\mu'_t, \mu'_s) \le C|t-s|^{1/2}$ for all $s,t \in [0,T]$, which proves, by the Arzelà-Ascoli theorem, that the path $[0,T] \ni t \mapsto \mu'_t$ lives in a compact subset of $\mathcal{C}([0,T]; \mathcal{P}_1(\mathbb{R}^d))$, independently of the input $(\varphi,\mu) \in E$. □
We have now completed all the steps needed for the proof of the main result of this subsection.
Proof. By Schauder's fixed point theorem (Theorem 4.32), $\Phi$ has a fixed point $(\varphi,\mu)$. As explained in our description of the strategy of the proof, solving (4.35) with this $(\varphi,\mu)$ as input and denoting by $(X_t,Y_t,Z_t)_{0\le t\le T}$ the resulting solution, by definition of a fixed point we have $Y_t = \varphi(t,X_t)$ for any $t \in [0,T]$, almost surely, and $(\mathcal{L}(X_t))_{0\le t\le T} = (\mu_t)_{0\le t\le T}$. In particular, $\varphi(t,\cdot)\ltimes\mu_t$ coincides with $\mathcal{L}(X_t,Y_t)$. We conclude that $(X_t,Y_t,Z_t)_{0\le t\le T}$ satisfies (4.32). □
We now complete the proof of Theorem 4.29 when the coefficients only satisfy assumption Nondegenerate MKV FBSDE. The proof consists in approximating the initial condition $\xi$ and the coefficients $B$ and $F$ by a sequence of initial conditions $(\xi^n)_{n\ge1}$ and sequences of coefficients $(B^n)_{n\ge1}$ and $(F^n)_{n\ge1}$, such that each $(\xi^n, B^n, F^n, \Sigma, G)$, for $n \ge 1$, satisfies the assumptions of Proposition 4.37.
Proof. We first construct the approximating sequences. For any $n \ge 1$, $t \in [0,T]$, $x \in \mathbb{R}^d$, $y \in \mathbb{R}^m$, $z \in \mathbb{R}^{m\times d}$ and $\nu \in \mathcal{P}_2(\mathbb{R}^d \times \mathbb{R}^m)$, we set:
\[ (B^n, F^n)(t,x,y,z,\nu) = (B,F)\big(t, \pi^{(d)}_n(x), \pi^{(m)}_n(y), \pi^{(m\times d)}_n(z), \nu \circ (\pi^{(d+m)}_n)^{-1}\big), \]
where, for any integer $k \ge 1$, $\pi^{(k)}_n$ is the orthogonal projection from $\mathbb{R}^k$ onto the $k$-dimensional ball of radius $n$ centered at the origin, and, for any probability measure $\nu$ on $\mathbb{R}^k$, $\nu \circ (\pi^{(k)}_n)^{-1}$ denotes the push-forward of $\nu$ by $\pi^{(k)}_n$. Finally, for each $n \ge 1$, we define $\xi^n = \pi^{(d)}_n(\xi)$.
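In concrete terms, the truncation used here is elementary; the following Python lines of ours implement $\pi^{(k)}_n$ on a sample and the induced push-forward on an empirical measure, which is how $\nu \circ (\pi^{(d+m)}_n)^{-1}$ would be realized on particles.

    import numpy as np

    def project_ball(x, n):
        """Orthogonal projection of points x (shape (N, k)) onto the centered
        ball of radius n in R^k: the identity inside, radial rescaling outside."""
        norms = np.linalg.norm(x, axis=1, keepdims=True)
        scale = np.minimum(1.0, n / np.maximum(norms, 1e-300))
        return x * scale

    # Push-forward of an empirical measure: project each atom.
    sample = np.random.default_rng(3).normal(0.0, 5.0, size=(1000, 2))
    truncated = project_ball(sample, n=3.0)          # atoms of nu o (pi_n)^{-1}
    print(np.linalg.norm(truncated, axis=1).max())   # <= 3.0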
For each $n \ge 1$, the assumptions of Proposition 4.37 are satisfied with $\xi^n$ instead of $\xi$ as initial condition and $(B^n, F^n, \Sigma, G)$ instead of $(B, F, \Sigma, G)$ as coefficients. We denote by $(X^n, Y^n, Z^n)$ the solution of (4.32) given by Proposition 4.37 when the system (4.32) has $\xi^n$ as initial condition and is driven by the coefficients $B^n$, $F^n$, $\Sigma$, and $G$. As explained in the previous subsection, the process $Y^n$ satisfies $Y^n_t = u^n(t, X^n_t)$, for any $t \in [0,T]$, for some deterministic function $u^n$.
The next step of the proof is to provide a uniform bound on the decoupling fields $(u^n)_{n\ge1}$. Applying Itô's formula and using the specific growth condition (A2) in assumption Nondegenerate MKV FBSDE, we get:
\[ \forall t \in [0,T], \qquad \mathbb{E}\big[|Y^n_t|^2\big] \le C + C\int_t^T \mathbb{E}\big[|Y^n_s|^2\big]\,ds, \]
for some constant $C$ depending on $T$ and $L$ only, whose value may vary from line to line at our convenience. By Gronwall's inequality, we deduce that the quantity $\sup_{0\le t\le T} \mathbb{E}[|Y^n_t|^2]$ can be bounded in terms of $T$ and $L$ only. Injecting this estimate into (A2) shows that $(F^n(t,X^n_t,Y^n_t,Z^n_t,\mathcal{L}(X^n_t,Y^n_t)))_{0\le t\le T}$ is bounded by $(C(1+|Y^n_t|+|Z^n_t|))_{0\le t\le T}$, which fits the growth condition in Lemma 4.33. Moreover, repeating the Itô expansion of $(|Y^n_t|^2)_{0\le t\le T}$, we also have:
\[ \mathbb{E}\int_0^T |Z^n_s|^2\,ds \le C. \tag{4.40} \]
Plugging the bounds for $(\sup_{0\le t\le T}\mathbb{E}[|Y^n_t|^2])_{n\ge1}$ and for $(\mathbb{E}\int_0^T |Z^n_t|^2\,dt)_{n\ge1}$ into the forward equation, we obtain in a similar way:
\[ \forall t \in [0,T], \qquad \mathbb{E}\big[|X^n_t|^2\big] \le C + C\int_0^t \mathbb{E}\big[|X^n_s|^2\big]\,ds. \]
The crucial fact is that $C$ is independent of $n$. Injecting this new estimate into (A2) shows that the drift $(B^n(t,X^n_t,Y^n_t,Z^n_t,\mathcal{L}(X^n_t,Y^n_t)))_{0\le t\le T}$ is bounded by $(C(1+|Y^n_t|+|Z^n_t|))_{0\le t\le T}$, which also fits the growth condition in Lemma 4.33.
By Lemma 4.33, we deduce that the processes $(Y^n)_{n\ge1}$ and $(Z^n)_{n\ge1}$ are uniformly bounded by a constant $C$ that depends upon $T$ and $L$ only.
The next step of the proof is to establish the relative compactness of the family of functions $([0,T] \ni t \mapsto \mathcal{L}(X^n_t,Y^n_t) \in \mathcal{P}_2(\mathbb{R}^d \times \mathbb{R}^m))_{n\ge1}$, $\mathcal{P}_2(\mathbb{R}^d \times \mathbb{R}^m)$ being equipped with the 2-Wasserstein metric $W_2$. Thanks to the uniform bounds we have for $(Y^n)_{n\ge1}$ and $(Z^n)_{n\ge1}$, we see that the driver:
\[ \Big(F^n\big(t, X^n_t, Y^n_t, Z^n_t, \mathcal{L}(X^n_t,Y^n_t)\big)\Big)_{0\le t\le T} \]
of the backward equation is bounded by $C$, for a possibly new value of $C$. In particular, using the fact that $(Z^n)_{n\ge1}$ is uniformly bounded, we get:
\[ \forall s,t \in [0,T], \qquad \mathbb{E}\big[|Y^n_t - Y^n_s|^2\big] \le C|t-s|, \tag{4.41} \]
and, similarly, that:
\[ \mathbb{E}\big[|X^n_t - X^n_s|^2\big] \le C|t-s|, \tag{4.43} \]
for all $s,t \in [0,T]$. From (4.41) and (4.43), we deduce that, for all $n \ge 1$:
\[ \forall s,t \in [0,T], \qquad W_2\big(\mathcal{L}(X^n_t,Y^n_t), \mathcal{L}(X^n_s,Y^n_s)\big) \le C|t-s|^{1/2}. \tag{4.44} \]
Moreover, for any $\varepsilon > 0$ and any event $D \in \mathcal{F}$:
\[
\begin{aligned}
\mathbb{E}\Big[\sup_{0\le t\le T}|X^n_t|^2\,\mathbf{1}_D\Big]
&\le \mathbb{E}\Big[\mathbb{E}\big[\sup_{0\le t\le T}|X^n_t|^4\,\big|\,\mathcal{F}_0\big]^{1/2}\,\mathbb{P}\big[D\,\big|\,\mathcal{F}_0\big]^{1/2}\Big] \\
&\le C\,\mathbb{E}\Big[\big(1 + |\xi|^2\big)\,\mathbb{P}\big[D\,\big|\,\mathcal{F}_0\big]^{1/2}\Big] \\
&\le C\Big(\varepsilon\,\mathbb{E}\big[1 + |\xi|^2\big] + \frac{1}{\varepsilon}\,\mathbb{E}\big[\big(1 + |\xi|^2\big)\,\mathbb{P}\big[D\,\big|\,\mathcal{F}_0\big]\big]\Big) \\
&= C\Big(\varepsilon\,\mathbb{E}\big[1 + |\xi|^2\big] + \frac{1}{\varepsilon}\,\mathbb{E}\big[\big(1 + |\xi|^2\big)\,\mathbf{1}_D\big]\Big).
\end{aligned} \tag{4.46}
\]
We deduce that:
\[ \lim_{\delta\searrow 0}\ \sup_{n\ge1}\ \sup_{D\in\mathcal{F}:\,\mathbb{P}(D)\le\delta}\ \mathbb{E}\Big[\sup_{0\le t\le T}|X^n_t|^2\,\mathbf{1}_D\Big] = 0. \]
By Corollary 5.6 in Chapter 5, we deduce that the family $((\mathcal{L}(X^n_t,Y^n_t))_{0\le t\le T})_{n\ge1}$ takes its values in a compact subset of $\mathcal{P}_2(\mathbb{R}^d \times \mathbb{R}^m)$. From (4.44) and the Arzelà-Ascoli theorem, we finally obtain that the mappings $([0,T] \ni t \mapsto \mathcal{L}(X^n_t,Y^n_t) \in \mathcal{P}_2(\mathbb{R}^d \times \mathbb{R}^m))_{n\ge1}$ form a relatively compact subset of $\mathcal{C}([0,T]; \mathcal{P}_2(\mathbb{R}^d \times \mathbb{R}^m))$.
For the last step of the proof, we denote by $\mu = (\mu_t)_{0\le t\le T}$ a limit point of $(\mu^n = (\mu^n_t)_{0\le t\le T})_{n\ge1}$, with $\mu^n_t = \mathcal{L}(X^n_t,Y^n_t)$, and we call $(X,Y,Z)$ the solution to the FBSDE (4.34) with $\xi$ as initial condition, with $\mu$ as input flow of measures, and with $\nu = \mu_T \circ (\mathbb{R}^d \times \mathbb{R}^m \ni (x,y) \mapsto x)^{-1}$ as input terminal measure. From Lemma 4.34 (injecting in (A2) the bounds we have on the moments of the solutions in order to fit the framework of Lemma 4.33), we deduce that, possibly after extracting a subsequence:
\[ \lim_{n\to\infty} \mathbb{E}\Big[\sup_{0\le t\le T}|X^n_t - X_t|^2 + \sup_{0\le t\le T}|Y^n_t - Y_t|^2 + \int_0^T |Z^n_t - Z_t|^2\,dt\Big] = 0. \]
Therefore, for all $t \in [0,T]$, $(\mathcal{L}(X^n_t,Y^n_t))_{n\ge1}$ converges in $W_2$ to $\mathcal{L}(X_t,Y_t) = \mu_t$. From there, we easily conclude that $(X,Y,Z)$ satisfies (4.32). □
with initial condition $x_0 = 0$ and terminal condition $y_T = G(x_T)$ over the interval $[0,T]$. For such a value of $a$, we now set:
\[ X_t = x_t + W_t, \qquad Y_t = y_t, \qquad t \in [0,T]. \]
Remark 4.38 The reason for the failure of uniqueness can be explained as follows.
In the classical FBSDE framework, uniqueness holds because of the smoothing
effect of the diffusion operator in the spatial direction. However, in the McKean-
Vlasov setting, the smoothing effect of the diffusion operator is ineffective in the
direction of the measure variable.
Theorem 4.29 can be applied directly to some of the FBSDEs of the McKean-Vlasov type describing equilibria of mean field games. However, its setting is somewhat too general for what is actually needed for the solution of MFG problems, and one should be able to do better, under weaker assumptions, when solving for MFG equilibria. As we already explained, the FBSDEs of the McKean-Vlasov type underpinning mean field games are of the simpler form (at least for games for which the volatility is independent of the control parameter):
\[
\begin{cases}
dX_t = B\big(t, X_t, \mathcal{L}(X_t), Y_t, Z_t\big)\,dt + \Sigma\big(t, X_t, \mathcal{L}(X_t)\big)\,dW_t, \\[4pt]
dY_t = -F\big(t, X_t, \mathcal{L}(X_t), Y_t, Z_t\big)\,dt + Z_t\,dW_t, \qquad t \in [0,T].
\end{cases} \tag{4.49}
\]
In contrast with the notation used in the previous section, we changed the order in which the arguments appear in the coefficients. Most noticeably, the measure argument now appears right after the state argument $x$, to conform with the notation used in Chapter 3 and earlier in this chapter when we discussed mean field games.
Based on these observations, we can revisit the proof of Theorem 4.29 and
establish, under the new set of assumptions spelled out below, a version of the
existence result for a system of the type (4.49). See Theorem 4.39 for the statement.
(A3) For any $(t_0,x) \in [0,T] \times \mathbb{R}^d$, the FBSDE (4.50) over the time interval $[t_0,T]$ with $X_{t_0} = x$ as initial condition at time $t_0$ has a unique solution $(X^{t_0,x}_t, Y^{t_0,x}_t, Z^{t_0,x}_t)_{t_0\le t\le T}$.
(A4) There exists a continuous mapping $u : [0,T] \times \mathbb{R}^d \to \mathbb{R}^m$, Lipschitz continuous in $x$ uniformly in $t \in [0,T]$, such that, for any initial condition $(t_0,x) \in [0,T] \times \mathbb{R}^d$:
\[ \mathbb{P}\big[\forall t \in [t_0,T],\ Y^{t_0,x}_t = u(t, X^{t_0,x}_t)\big] = 1. \]
Theorem 4.39 Under assumption MKV FBSDE for MFG, for any random variable $\xi$ belonging to $L^2(\Omega,\mathcal{F}_0,\mathbb{P};\mathbb{R}^d)$, the FBSDE (4.49) has a solution $(X,Y,Z) \in \mathbb{S}^{2,d} \times \mathbb{S}^{2,m} \times \mathbb{H}^{2,m\times d}$ with $X_0 = \xi$ as initial condition.
Proof. For $\xi \in L^2(\Omega,\mathcal{F}_0,\mathbb{P};\mathbb{R}^d)$ and $\mu \in \mathcal{C}([0,T]; \mathcal{P}_2(\mathbb{R}^d))$, we know from (A3), (A4) and Proposition 4.8 that (4.50) has a unique solution $(X^\mu, Y^\mu, Z^\mu) = (X^\mu_t, Y^\mu_t, Z^\mu_t)_{0\le t\le T}$. We thus define the mapping $\Phi : \mathcal{C}([0,T]; \mathcal{P}_2(\mathbb{R}^d)) \ni \mu \mapsto \Phi(\mu) = (\mathcal{L}(X^\mu_t))_{0\le t\le T}$. The goal is to apply Schauder's theorem (Theorem 4.32) in order to prove that $\Phi$ has a fixed point.
Following the argument used in Subsection 4.3.2, we apply Schauder's fixed point theorem in the space $\mathcal{C}([0,T]; \mathcal{M}^1_f(\mathbb{R}^d))$ of continuous functions $\mu = (\mu_t)_{0\le t\le T}$ from $[0,T]$ into the space of finite signed measures over $\mathbb{R}^d$ such that $\mathbb{R}^d \ni x \mapsto |x|$ is integrable under $|\mu|$, equipped with the supremum over time of the Kantorovich-Rubinstein norm:
\[ \|\mu\| = \sup_{0\le t\le T} \|\mu_t\|_{KR^*}, \qquad \text{with} \qquad \|\nu\|_{KR^*} = |\nu(\mathbb{R}^d)| + \sup\Big\{\int_{\mathbb{R}^d} \ell(x)\,d\nu(x)\;;\; \ell \in \mathrm{Lip}_1(\mathbb{R}^d),\ \ell(0) = 0\Big\}. \]
As already mentioned, the norm $\|\cdot\|_{KR^*}$ is known to coincide with the Wasserstein distance $W_1$ on $\mathcal{P}_1(\mathbb{R}^d)$. This fact will be proven rigorously in Chapter 5.
We prove existence by proving that there exists a closed convex subset E included in
C .Œ0; TI P2 .Rd // C .Œ0; TI M1f .Rd // which is stable under ˚, such that ˚.E / is relatively
compact, and ˚ is continuous on E . But first, we establish a priori estimates for the solution
of (4.49).
The key point is to notice that the coefficients B, ˙, F and G being Lipschitz in
the variable .x; y; z/ and the decoupling field satisfying (A5), Lemma 4.11 implies that
jYt j 6 C.1 C jXt j/ and jZt j 6 C, Leb1 ˝ P almost everywhere. Plugging these bounds into
the forward part of (4.49) and using (A2), standard Lp estimates for SDEs imply that there
exists a constant C, independent of (but possibly depending on EŒjj2 ), such that:
1=2
E sup jXt j4 jF0 6 C 1 C jj2 : (4.51)
06t6T
Following (4.46) and allowing the constant C to change from line to line, we deduce that, for
any " > 0 and a > 1,
4.3 Solvability of McKean-Vlasov FBSDEs by Schauder’s Theorem 259
h i
E sup jXt j2 1fsup06t6T jXt j > ag 6 C " C "1 E 1 C jj2 1fsup06t6T jXt j > ag
06t6T
6 C " C "1 sup E 1 C jj2 1D ;
D2F WP.D/6Ca2
where we used the fact that EŒsup06t6T jXt j2 6 C, which is implied by (4.51). Minimizing
over " > 0, we get:
h i
1=2
E sup jXt j2 1fsup06t6T jXt j > ag 6 C sup E 1 C jj2 1D :
06t6T D2F WP.D/6Ca2
Now, for any D 2 F such that P.D/ 6 Ca2 , with a > 1, we have that:
E 1 C jj2 1D 6 2Ca1 C E jj2 1fjj>a1=2 g ;
so that:
h i
E sup jXt j2 1fsup06t6T jXt j>ag 6 C a1 C E jj2 1fjj>a1=2 g :
06t6T
E D 2C Œ0; TI P2 .Rd / W
Z
2 1
2
8a > 1; sup jxj dt .x/ 6 C a C E jj 1fjj>a1=2 g :
06t6T fjxj>ag
Clearly, E is convex and closed in C .Œ0; TI M1f .Rd // equipped with k k. Also ˚ maps E
into itself.
Returning to the dynamics of X , observe that (4.51) implies that, for any 2 E and
0 6 s 6 t 6 T:
E jXt Xs j2 6 C.t s/;
so that:
W2 Œ˚./t ; Œ˚./s D W2 L.Xt /; L.Xs / 6 C.t s/1=2 ;
Remark 4.40 Notice that Corollary 5.6 in Chapter 5 implies that the set:
Z
2 P2 .Rd / W 8a > 1; jxj2 d.x/ 6 C a1 C E jj2 1fjj>a1=2 g ;
fjxj > ag
is a compact subset of P2 .Rd /. This remark will play a crucial role in the sequel.
We already appealed to this type of argument in the third step of the proof given in
Subsection 4.3.3.
Remark 4.41 At this stage, it may be worth mentioning the relevant version of the
Arzelà-Ascoli theorem which we use: if X is a compact Hausdorff space, and Y
is a metric space, then F C.X I Y/ is compact in the compact-open topology
if and only if it is equicontinuous, pointwise relatively compact and closed. Here
pointwise relatively compact means that for each x 2 X , the set Fx D ff .x/I f 2 Fg
is relatively compact in Y.
Remark 4.42 The reader presumably noticed the following difference between
the proofs of Theorems 4.29 and 4.39. In the proof of Theorem 4.29, we first
assume that is in L1 .˝; F0 ; PI Rd /, and then handle the general case where
2 L2 .˝; F0 ; PI Rd / by an approximation argument. In contrast, we work directly
with 2 L2 .˝; F0 ; PI Rd / in the proof of Theorem 4.39. As a result, different
prescriptions are required for the definition of the set E in the first step of the proof
of Theorem 4.39.
In this section and the next, we provide two general solvability results for the MFG
problem described in Subsection 3.1.2 of Chapter 3. We remind the reader of the
objective: find a deterministic flow D .t /06t6T of probability measures on Rd
such that the stochastic control problem:
Z T
inf J .˛/; with J .˛/ D E f .t; Xt˛ ; t ; ˛t /dt C g.XT˛ ; T / ;
˛2A 0
subject to (4.52)
(
dXt˛ D b.t; Xt˛ ; t ; ˛t /dt C .t; Xt˛ ; t ; ˛t /dWt ; t 2 Œ0; T;
X0˛ D ;
has an optimally controlled process with .t /06t6T as flow of marginal distributions.
4.4 Solving MFGs from the Probabilistic Representation of the Value Function 261
We recall that the above problem is set on a complete filtered probability space
.˝; F; F D .Ft /06t6T ; P/ supporting a d-dimensional F-Wiener process W D
.Wt /06t6T , and for an initial condition 2 L2 .˝; F0 ; PI Rd /. The set A denotes the
collection of square-integrable and F-progressively measurable control processes
˛ D .˛t /06t6T taking values in a convex closed subset A Rk . Moreover, the
state process X D .Xt /06t6T takes values in Rd . The solvability results which we
provide in this section and the next are derived within the two forms of FBSDE-
based approaches introduced in Subsection 3.2 to characterize the solutions of an
optimal stochastic control problem.
This section is specifically dedicated to the method based upon the FBSDE
representation of the value function, in the spirit of the weak formulation approach
discussed in Subsection 3.3.1, except for the fact that the control problem
underpinning the mean field game is formulated in the strong sense.
Throughout the section, we assume that the volatility coefficient does not
depend upon the control. The coefficients b and f will be regarded as measurable
mappings from Œ0; T Rd P2 .Rd / A into Rd and R respectively, the volatility
coefficient as a measurable mapping from Œ0; T Rd P2 .Rd / into Rdd , and g
as a measurable function from Rd P2 .Rd / into R.
(continued)
262 4 FBSDEs and the Solution of MFGs Without Common Noise
(A4) For any t 2 Œ0; T, 2 P2 .Rd / and ˛ 2 A, the functions b.t; ; ; ˛/,
f .t; ; ; ˛/, .t; ; / and g.; / are L-Lipschitz continuous in x; for
any t 2 Œ0; T, x 2 Rd and ˛ 2 A, the functions b.t; x; ; ˛/, f .t; x; ; ˛/,
.t; x; / and g.x; / are continuous in the measure argument with respect
to the 2-Wasserstein distance.
(A5) For the same constant L and for all .t; x; ; ˛/ 2 Œ0; TRd P2 .Rd /A,
j@˛ b.t; x; ; ˛/j 6 L; j@˛ f .t; x; ; ˛/j 6 L 1 C j˛j :
(A6) Letting
In exactly the same way as for Lemma 3.3, one proves the following sufficient
condition ensuring that assumption (A6) holds.
Lemma 4.43 On top of assumptions (A1–5) above, assume that b.t; x; ; ˛/ has
the form:
for a bounded function b1 , that, for any t 2 Œ0; T, the function @˛ f .t; ; ; / is
continuous in and L-Lipschitz continuous in .x; ˛/, and finally that f satisfies
the -convexity assumption:
for all t 2 Œ0; T, .x; / 2 Rd P2 .Rd / and .˛; ˛ 0 / 2 A A, for some > 0. Then
(A6) in assumption MFG Solvability HJB holds true for a possibly new value of L.
4.4 Solving MFGs from the Probabilistic Representation of the Value Function 263
We now state the main result of this section. It provides a solution to the mean
field game problem by solving the appropriate FBSDE associated with the stochastic
control problem (4.52). Recall that in the first prong of the probabilistic approach,
the variable z 1 .t; x; / is substituted for the dual variable y appearing in the
coefficients b and f (and hence the Hamiltonian H) and the minimizer ˛. O
Theorem 4.44 Let assumption MFG Solvability HJB be in force. Then, for any
initial condition 2 L2 .˝; F0 ; PI Rd /, the McKean-Vlasov FBSDE:
8
ˆ 1
ˆ
ˆ dXt D b t; Xt ; L.Xt /; ˛
O t; Xt ; L.Xt /; .t; Xt ; L.Xt // Zt dt
ˆ
< C .t; Xt ; L.Xt //dWt ;
(4.54)
ˆ 1
ˆ dYt D f t; Xt ; L.Xt /; ˛O t; Xt ; L.Xt /; .t; Xt ; L.Xt // Zt dt
ˆ
:̂
CZt dWt ;
for t 2 Œ0; T, with the initial condition X0 D 2 L2 .˝; F0 ; PI Rd / and the terminal
condition YT D g.XT ; T /.
264 4 FBSDEs and the Solution of MFGs Without Common Noise
Remark that (4.55) differs from equation (3.30) appearing in the statement
of Proposition 3.11 articulating the so-called weak formulation of the optimal
stochastic control problem. As demonstrated by the proof of Proposition 3.11, we
may go (at least formally) from (3.30) to (4.55) by means of Girsanov’s theorem.
Part of the argument in the proof of Proposition 3.11 is precisely to check that it is
indeed legal to invoke Girsanov’s theorem in that context.
Here we work with (4.55) instead of (3.30) in order to avoid any Girsanov
transformation, and in so doing, get a direct representation of the solution of the
stochastic control problem instead of a weaker one. We give a precise statement
now, and we postpone the proof to Subsection 4.4.3.
Theorem 4.45 For the same input as above and under assumption MFG
Solvability HJB, the FBSDE (4.55) with X0 D as initial condition at time 0
0; 0; 0; 0;
has a unique solution .Xt ; Yt ; Zt /06t6T with .Zt /06t6T being bounded by a
deterministic constant, almost everywhere for Leb1 ˝ P on Œ0; T ˝.
Moreover, there exists a continuous mapping u W Œ0; T Rd ! R, Lipschitz con-
tinuous in x uniformly with respect to t 2 Œ0; T and to the input , such that, for any
0; 0; 0;
initial condition 2 L2 .˝; F0 ; PI Rd /, the unique solution .Xt ; Yt ; Zt /06t6T to
the FBSDE (4.55) with X0 D as initial condition at time 0, satisfies:
h 0; i
0;
P 8t 2 Œ0; T; Yt D u t; Xt D 1:
0; 0;
Also, the process . .t; Xt ; t /1 Zt /06t6T is bounded by the Lipschitz con-
0;
stant of u in x. Finally, the process .Xt /06t6T is the unique solution of the
optimal control problem (4.52). In particular, EŒu.0; / D J .˛/O for ˛O D
0; 0; 0;
O Xt ; t ; .t; Xt ; t /1 Zt //06t6T .
.˛.t;
Remark 4.46 Observe that the driver in the backward component of (4.55) is not
assumed to be Lipschitz continuous, see (A5) above. This explains why the analysis
of (4.55) requires a special treatment.
Lemma 4.47 For the same input as above and under assumption MFG Solv-
ability HJB, assume that the HJB equation:
1 h i
@t V.t; x/ C trace .t; x/@2xx V.t; x/
2
C H t; x; t ; @x V.t; x/; ˛.t;
O x; t ; @x V.t; x// D 0; .t; x/ 2 Œ0; T Rd ;
4.4 Solving MFGs from the Probabilistic Representation of the Value Function 265
and:
solves (4.55). It is the unique solution for which the process .Zt /06t6T is bounded.
Moreover, the assumption of Proposition 4.51 below are satisfied by taking u D V.
Observe that the representation of Zt is fully justified by the fact that Zt is here
understood as a random vector with values in Rd and that, V being R-valued, @x V
is also regarded as an Rd -valued function. When Zt and @x V are regarded as taking
values in R1d , Zt takes the form @x V.t; Xt / .t; Xt ; t /.
We give an example of an application of Lemma 4.47 in Subsection 4.7.3.
Proof. We only provide a sketch of the proof. The fact that (4.55) is satisfied is a
straightforward application of Itô’s formula to compute dYt D dV.t; Xt / given the fact that V
solves the HJB equation, and the SDE satisfied by X, which is solvable under the standing
assumption.
Uniqueness of the solution can be proved in two ways. First, one can observe that
for solutions with a bounded .Zt /06t6T , the equation may be rewritten as an equation
with Lipschitz coefficients. Since we have identified V with a Lipschitz decoupling field,
there must be at most one solution by Proposition 4.8. Another strategy is to expand
.V.t; Xt0 //06t6T , for any other solution .Xt0 ; Yt0 ; Zt0 /06t6T , and to check that the pair process
.V.t; Xt0 /; .t; Xt0 ; t / @x V.t; Xt0 //06t6T satisfies a BSDE with random Lipschitz coefficients
that is also satisfied by .Yt0 ; Zt0 /06t6T . By Cauchy-Lipschitz theory for BSDEs (and not
FBSDEs), we get Yt0 D V.t; Xt0 / and Zt0 D .t; Xt0 ; t // @x V.t; Xt0 /, for t 2 Œ0; T, which
shows that .Xt0 /06t6T solves the same SDE as .Xt /06t6T . Uniqueness then follows. t
u
Remark 4.48 We shall not discuss existence results for classical solutions of the
HJB equation. We just emphasize that, whenever the coefficients (obtained after
composition with ) are Hölder continuous in time and satisfy the conditions
required in assumption MFG Solvability HJB and the terminal condition g is
smooth in x, classical solutions are known to exist. We refer to the Notes &
Complements at the end of the chapter for references.
The following lemma shows that, in the framework of Lemma 4.47, the
conclusion of Theorem 4.45 can be checked by a standard verification argument.
266 4 FBSDEs and the Solution of MFGs Without Common Noise
Lemma 4.49 Under the assumptions of Lemma 4.47, the process X D .Xt /06t6T
identified in the statement is the unique optimal solution of the stochastic control
problem (4.52) with X0 D as initial condition.
Proof. The proof consists in another application of Itô’s formula. Indeed, consider the
process:
ˇ ˇ ˇ
dXt D b.t; Xt ; t ; ˇt /dt C .t; Xt ; t /dWt ; t 2 Œ0; T;
with b1 .t; x; / as in Lemma 4.43, then the Hamiltonian takes the form:
H .t; x; ; y; ˛/ D b1 .t; x; / y C .t; x; /y ˛ C f .t; x; ; ˛/;
We now turn to the proof of Theorem 4.45 as we aim at proving that the FBSDE:
8
< dXt D bt; Xt ; t ; ˛.t;
O Xt ; t ; .t; Xt ; t /1 Zt / dt C .t; Xt ; t /dWt ;
: dY D f t; X ; ; ˛.t; (4.57)
t tO X ; ; .t; X ; /1 Z / dt C Z dW ;
t t t t t t t t
Proposition 4.51 On top of assumption MFG Solvability HJB, assume that there
exists R > 0 such that, for any initial condition .t0 ; x/ 2 Œ0; T Rd , the
FBSDE (4.57), with Xt0 D x as initial condition at time t0 , has a unique solution
.Xtt0 ;x ; Ytt0 ;x ; Ztt0 ;x /t0 6t6T satisfying j .t; Xtt0 ;x ; t /1 Ztt0 ;x j 6 R for almost every
.t; !/ 2 Œt0 ; T ˝ under Leb1 ˝ P. Assume also that there exists a continuous
mapping u W Œ0; T Rd ! R, Lipschitz continuous in x uniformly in t 2 Œ0; T, such
that for any .t0 ; x/ 2 Œ0; T Rd :
h i
P 8t 2 Œt0 ; T; Ytt0 ;x D u.t; Xtt0 ;x / D 1:
Then, the FBSDE (4.57) with X0 D as initial condition at time 0 has a solution
.Xt ; Yt ; Zt /06t6T . Also, . .t; Xt ; t /1 Zt /06t6T is bounded by the Lipschitz constant
268 4 FBSDEs and the Solution of MFGs Without Common Noise
of u in x and .Xt ; Yt ; Zt /06t6T is in fact the unique solution of (4.57) with a bounded
martingale integrand .Zt /06t6T . Finally, .Xt /06t6T is the unique optimal path of
the stochastic control problem (4.52). In particular, EŒu.0; / D J .˛/, O for ˛O D
O Xt ; t ; .t; Xt ; t /1 Zt //06t6T .
.˛.t;
Remark 4.52 While the result is true in full generality, the proof provided below
uses the fact that the filtration F is generated by F0 and the Wiener process W. This
is due to the use of BSDEs with F-progressively measurable coefficients in the proof
below. A proof of the result in the general case will be provided in Theorem (Vol II)-
1.57 in Chapter (Vol II)-1 when we discuss FBSDEs with random coefficients. A key
ingredient in that proof will be the so-called Kunita-Watanabe decomposition which
we do not want to introduce at this stage.
for any t 2 Œt0 ; T and any .t0 ; x/ 2 Œ0; T Rd , we may replace the driver
O x; t ; .t; x; t /1 z/
.t; x; z/ D f t; x; t ; ˛.t;
of the backward component of the FBSDE (4.57) by .z/ .t; x; z/ for a smooth cut-off
function W Rd ! Œ0; 1 such that .z/ D 1 when jzj 6 LR and .z/D0 when jzj > 2LR. In
this way, we may regard (4.57) as an FBSDE driven by Lipschitz continuous coefficients.
By Proposition 4.8, we deduce that the FBSDE driven by .z/ .t; x; z/ has a unique
solution .X0; ; Y 0; ; Z0; / with X0 D . By Lemma 4.11, Z0; is bounded by a deterministic
constant. Without any loss generality, we can assume that this constant is LR itself. Therefore,
.X0; ; Y 0; ; Z0; / is also a solution of (4.57). Uniqueness in the class of processes .Y; Z/ with
Z bounded is proved in the same way, using the same truncation argument.
Second Step. We now return to the control problem (4.52). Given another controlled path
.X ˇ ; ˇ/, the control ˇ being bounded by some deterministic constant, we consider, on the
original probabilistic set-up and with the same cut-off function as above, the BSDE:
ˇ ˇ ˇ ˇ
dYt D Zt f t; Xt ; t ; ˛O t dt
h ˇ ˇ i
ˇ ˇ
C Zt 1 b t; Xt ; t ; ˇt 1 b t; Xt ; t ; ˛O t dt (4.58)
ˇ
C Zt dWt ;
ˇ ˇ ˇ ˇ ˇ ˇ
O Xt ; t ; .t; Xt ; t /1 Zt / and terminal condition YT D g.XT ; T /. Here, we
with ˛O t D ˛.t;
use the notation . 1 b/.t; x; ; ˛/ for .t; x; /1 b.t; x; ; ˛/ despite the fact that 1 and
b do not have the same arguments. Equation (4.58) is a quadratic BSDE and we can solve
ˇ
it using Theorem 4.15. Let .Et /06t6T be the Doléans-Dade exponential of the stochastic
integral:
4.4 Solving MFGs from the Probabilistic Representation of the Value Function 269
Z
t
. 1 b s; Xsˇ ; s ; ˇs . 1 b s; Xsˇ ; s ; ˛O sˇ dWs :
0 06t6T
ˇ
We
R t observe that the integrand, at time t, is bounded by C.1 C jZt j/. Since the martingale
ˇ
. 0 Zs dWs /06t6T is of bounded mean oscillation, so is the stochastic integral above. By
ˇ
Proposition 4.18, .Et /06t6T is a true martingale and we can define the probability measure
ˇ
P D ET P. Under Pˇ , the process:
ˇ
Z t
ˇ 1 ˇ
Wt D Wt C . b s; Xs ; s ; ˇs . 1 b s; Xsˇ ; s ; ˛O sˇ ds ;
0 06t6T
ˇ RT ˇ
is a d-dimensional Brownian motion. We show, at the end of the proof, that EP 0 jZt j2 dt <
ˇ ˇ ˇ ˇ
1 and that, under P , .X ; Y ; Z /06t6T is a solution of the FBSDE (4.57) when driven by
.z/ .t; x; z/ instead of .t; x; z/, and W ˇ instead of W. Therefore, taking these facts for
granted momentarily, we infer (by strong and thus weak uniqueness) that the law of .Xˇ ; ˛O ˇ /
under Pˇ is the same as the law of the pair .X; ˛O D .˛.t; O Xt ; t ; .t; Xt ; t /1 Zt //06t6T /
under P, which proves in particular that:
Z T
ˇ ˇ ˇ ˇ
J .˛/
O D EP g.XT ; T / C f .t; Xt ; t ; ˛O t /dt ;
0
ˇ ˇ
and that . .t; Xt ; t /1 Zt /06t6T is bounded by R, Leb1 ˝ Pˇ almost everywhere, and thus
ˇ
Leb1 ˝ P almost everywhere. As a byproduct, by (4.58), .Zt / is equal to 1. Moreover,
Pˇ ˇ
O Since
E ŒY0 is equal to the above right-hand side, and thus to J .˛/.
ˇ ˇ ˇ ˇ ˇ ˇ ˇ
EP ŒY0 D EP ŒET Y0 D EP ŒE0 Y0 D EP ŒY0 ;
we have:
ˇ
J .˛/
O J .ˇ/ D EP ŒY0 J .ˇ/
Z T ˇ ˇ ˇ ˇ
D EP H t; Xt ; t ; .t; Xt ; t /1 Zt ; ˛O t (4.59)
0
ˇ ˇ ˇ
H t; Xt ; t ; .t; Xt ; t /1 Zt ; ˇt dt ;
so that J .˛/
O 6 J .ˇ/. RT
For a generic ˇ satisfying E 0 jˇt j2 dt < 1, we can apply the previous inequality with
ˇ replaced by ˇ n D .ˇt 1jˇt j6n /06t6T . Using the continuity and growth assumptions on the
coefficients, it is plain to prove that J .ˇ n / converges to J .ˇ/ as n tends to 1, and deduce
that ˛O is a control minimizing the cost.
Third Step. Since ˛.t;
O x; ; y/ is a strict minimizer of H.t; x; ; y; /, we have that, for any
O if and only if ˇ D ˛O ˇ Leb1 ˝ P almost everywhere. In
bounded control ˇ, J .ˇ/ D J .˛/
such a case, the second line in (4.58) vanishes and .Xˇ ; Y ˇ ; Zˇ / satisfies the FBSDE (4.57)
under P and by uniqueness of solutions with a bounded martingale integrand, we conclude
270 4 FBSDEs and the Solution of MFGs Without Common Noise
J .ˇ/ J .˛/
O D lim J .ˇ n / J .˛/
O
n!1
Z h
T ˇn ˇn ˇn
> EP lim inf H t; Xt ; t ; .t; Xt ; t /1 Zt ; ˇtn
0 n!1
ˇn i
ˇn ˇn ˇn
H t; Xt ; t ; .t; Xt ; t /1 Zt ; ˛O t dt:
Again, if ˇ is not bounded, we can find R0 > R T L.R C 1/ C 1 for L such that
O x; ; y/j 6 L.1 C jyj/, and such that E 0 1L.RC1/C1<jˇt j<R0 dt 6D 0. Given
j˛.t;
.t; x; / 2 Œ0; T Rd P2 .Rd /, we then let:
˚
.t; x; / D .y; ˇ/ 2 Rd A W jyj 6 R; jˇj 6 R0 ; jˇ ˛.t;
O x; ; y/j > 1 :
Then,
J .ˇ/ J .˛/
O
Z h
T ˇn
> EP lim inf inf H t; Xt ; t ; y; ˇ
n!1 ˇn
0 .y;ˇ/2.t;Xt ;t /
ˇn i
ˇn
H t; Xt ; t ; y; ˛.t;
O Xt ; t ; y/ 1L.RC1/C16jˇtn j6R0 dt:
ˇ
which cannot be zero by definition of .t; Xt ; t /. This proves that X is the unique
minimizing path.
Fourth Step. In order to complete the proof, it remains to check that, for ˇ bounded and
under Pˇ , .Xˇ ; Y ˇ ; Zˇ / is a solution of the FBSDE (4.57) when driven by .z/ .t; x; z/
instead of .t; x; z/ and by W ˇ instead of W. We know that, for t 2 Œ0; T,
ˇ ˇ ˇ
dXt D b t; Xt ; t ; ˇt dt C t; Xt ; t dWt
ˇ ˇ ˇ ˇ
D b t; Xt ; t ; ˛O t dt C t; Xt ; t dWt ;
4.4 Solving MFGs from the Probabilistic Representation of the Value Function 271
and
ˇ ˇ ˇ ˇ ˇ ˇ
dYt D Zt f t; Xt ; t ; ˛O t dt C Zt dWt
ˇ ˇ ˇ ˇ ˇ
D Zt t; Xt ; Zt dt C Zt dWt ;
ˇ ˇ
with the terminal condition YT D g.XT ; T /. This shows that the equations in the
system (4.57) hold true under Pˇ with W replaced by W ˇ and with .z/ .t; x; z/ as driver in
the backward equation. To prove that .Xˇ ; Y ˇ ; Zˇ / is indeed a solution of (4.57), it remains
to check the appropriate integrability conditions.
By Proposition 4.18, we know that, for any p > 1,
Z T p
P ˇ
E jZt j2 ds < 1:
0
ˇ
Since ET is in Lr .˝; FT ; PI R/ for some r > 1, the above is also true under Pˇ .
ˇ ˇ ˇ
By (4.57), we also have EP Œsup06t6T jYt jp < 1, and then EP Œsup06t6T jYt jp < 1.
ˇ ˇ
Since ˇ is bounded, the same holds with X instead of Y . The proof is easily completed.
t
u
Proof. The objective is to prove that, for a given deterministic initial condition, the
FBSDE (4.57) has a solution with a bounded martingale integrand, and that this solution
is unique within the class of solutions with bounded martingale integrands. Meanwhile, we
must also construct a decoupling field. As before, we split the proof into successive steps.
First Step. Following the proof of Proposition 4.51, we first focus on a truncated version
of (4.57), namely:
(
dXt D b t; Xt ; t ; ˛O t; Xt ; t ; .t; Xt ; t /1 Zt dt C .t; Xt ; t /dWt ;
(4.60)
dYt D .Zt /f t; Xt ; t ; ˛O t; Xt ; t ; .t; Xt ; t /1 Zt dt C Zt dWt ;
for t 2 Œ0; T, with the terminal condition YT D g.XT ; T /, for a cut-off function W Rd !
Œ0; 1, equal to 1 on the ball of center 0 and radius R, and equal to 0 outside the ball of center
0 and radius 2R, such that sup j 0 j 6 2=R. For the time being, R > 0 is an arbitrary real
number. Its value will be fixed later on.
By Theorem 4.12, we know that, for any initial condition .t0 ; x/ 2 Œ0; T Rd , (4.60)
is uniquely solvable. We denote the unique solution by .XRIt0 ;x ; Y RIt0 ;x ; ZRIt0 ;x / D
.XtRIt0 ;x ; YtRIt0 ;x ; ZtRIt0 ;x /t0 6t6T . Thanks to the cut-off function , the driver of (4.60) is
indeed Lipschitz-continuous in the variable z. Moreover, the solution can be represented
through a continuous decoupling field uR , Lipschitz continuous in the variable x, uniformly
in time. Also, the martingale integrand ZRIt0 ;x is bounded by L times the Lipschitz constant
of uR , with L as in assumption MFG Solvability HJB. See also Lemma 4.11. Therefore,
the proof boils down to showing that we can bound the Lipschitz constant of the decoupling
field independently of the cut-off function in (4.60).
272 4 FBSDEs and the Solution of MFGs Without Common Noise
Second Step. In this step, we fix the values of .t0 ; x/ 2 Œ0; T Rd and R > 0, and we use
the notation .X; Y; Z/ for .XRIt0 ;x ; Y RIt0 ;x ; ZRIt0 ;x /. We then let .Et /06t6T be the Doléans-Dade
exponential of the stochastic integral:
Z
t
. 1 b/.s; Xs ; s ; ˛O s / dWs ;
0 06t6T
the integrand is bounded, .Et /06t6T is a true martingale, and we can define the probability
measure Q D ET P. Under Q, the process:
Z
t
WtQ D Wt C 1 b .s; Xs ; s ; ˛O s /ds
0 06t6T
is a d-dimensional Brownian motion. Following the proof of Proposition 4.51, we learn that
under Q, .Xt ; Yt ; Zt /06t6T is a solution of the forward-backward SDE:
8
ˆ
ˆ dXt D .t; Xt ; t /dWtQ ;
ˆ
< dY D .Z /f t; X ; ; ˛O t; X ; ; .t; X ; /1 Z dt
ˆ
t t t t t t t t t
(4.61)
ˆ
ˆ Z . 1
b/ t; X ; ; ˛
O t; X ; ; .t; X ; / 1
Zt dt
ˆ t t t t t t t
:̂
CZt dWtQ ;
over the interval Œt0 ; T, with the same terminal condition as before. Since Z is bounded, the
forward-backward SDE (4.61) may be regarded as an FBSDE with Lipschitz-continuous
coefficients. By the FBSDE version of Yamada-Watanabe theorem proven in Theorem
(Vol II)-1.33 of Chapter (Vol II)-1, any other solution with a bounded martingale integrand,
with the same initial condition but constructed with respect to another Brownian motion, has
the same distribution. Therefore, we can focus on the version of (4.61) obtained by replacing
W Q by W. If, for this version, the backward component Y can be represented in the form
Yt D V.t; Xt /, for all t 2 Œt0 ; T, with V being Lipschitz continuous in space, uniformly in
time, and with Z bounded, then V.t0 ; x/ must coincide with uR .t0 ; x/. Repeating the argument
for any .t0 ; x/ 2 Œ0; T Rd , we then have V uR .
Third Step. The strategy is now as follows. We consider the same FBSDE as in (4.61), but
with W Q replaced by the original W:
8
ˆ
ˆ dXt D .t; Xt ; t /dWt ;
ˆ
< dY D .Z /f t; X ; ; ˛O t; X ; ; .t; X ; /1 Z dt
ˆ
t t t t t t t t t
ˆ
ˆ Z . 1
b/ t; X ; ; ˛
O t; X ; ; .t; X ; / 1
Zt dt
ˆ t t t t t t t
:̂
CZt dWt ; t 2 Œ0; T;
developed in Subsection 4.1.2 for FBSDEs with Lipschitz continuous coefficients. To bypass
this obstacle, we shall modify the form of the equation and focus on the following version:
8
ˆ
ˆ dXt D .t; Xt ; t /dWt ;
ˆ
< dY D .Z /f t; X ; ; ˛O t; X ; ; .t; X ; /1 Z dt
ˆ
t t t t t t t t t
(4.62)
ˆ
ˆ .Z /Z . 1
b/ t; X ; ; ˛
O t; X ; ; .t; Xt ; t /1 Zt dt
ˆ t t t t t t
:̂
CZt dWt ; t 2 Œ0; T:
Notice that the cut-off function now appears on the third line. Our objective being to prove
that (4.62) admits a solution for which Z is bounded independently of R, when R is large, the
presence of the cut-off does not make any difference.
Now, (4.62) may be regarded as both a quadratic and a Lipschitz FBSDE. For any initial
condition .t0 ; x/, we may again denote the solution by .XRIt0 ;x ; Y RIt0 ;x ; ZRIt0 ;x /. This is the
same notation as in the first step although the equation is different. Since the steps are
completely separated, there is no risk of confusion. We denote the corresponding decoupling
field by V R . By Theorem 4.12, it is bounded (the bound possibly depending on R at this stage
of the proof) and ZRIt0 ;x is bounded.
For the sake of simplicity, we assume that t0 D 0 and we drop the indices R and t0 in the
notation .XRIt0 ;x ; Y RIt0 ;x ; ZRIt0 ;x /. We just denote it by .Xx ; Y x ; Zx /. Similarly, we just denote
V R by V.
The goal is then to prove that there exists a constant C, independent of R and of the cut-off
functions, such that, for all x; x0 2 Rd ,
ˇ x0 ˇ
ˇE Y Y x ˇ 6 Cjx0 xj; (4.63)
0 0
we can write:
dıXt D ıt ıXt dWt ; t 2 Œ0; T; (4.64)
X
d
ıt ıXt i;j
D ıt i;j;`
ıXt ` ; i; j 2 f1; ; dg2 ;
`D1
274 4 FBSDEs and the Solution of MFGs Without Common Noise
with:
0 0 0
Xt`Ix!x D .Xtx /1 ; ; .Xtx /` ; .Xtx /`C1 ; ; .Xtx /d :
From the Lipschitz property of in x, the process .ıt /06t6T is bounded by a constant C only
depending upon L in the assumption. Notice that in the notation ıt ıXt , .ıt ıXt /i;j appears
as the inner product of ..ıt /i;j;` /16`6d and ..ıXt /` /16`6d . Because of the presence of the
additional indices .i; j/, we chose not to use the inner product notation in this definition. This
warning being out of the way, we may use the inner product notation when convenient.
Indeed, in a similar fashion, the pair .ıYt ; ıZt /06t6T satisfies a backward equation of the
form:
.1/
where ıgT is an Rd -valued random variable bounded by C and ıF.1/ D .ıFt /06t6T
.2/
and ıF.2/ D .ıFt /06t6T are progressively measurable Rd -valued processes, which are
bounded, the bounds possibly depending upon the function . Here, “” denotes the inner
product of Rd . Notice that, as a uniform bound on the growth of ıF.1/ and ıF.2/ , we have:
.1/ 0
jıFt j 6 C 1 C jZtx j2 C jZtx j2
.2/ 0 ; t 2 Œ0; T; (4.66)
jıFt j 6 C 1 C jZtx j C jZtx j
the constant C only depending on the constant L appearing in the assumption and where we
used the assumption sup j 0 j 6 2=R.
Since ıF.2/ is bounded, we may introduce a probability Q (again this is not the same Q as
that appearing in the second step, but, since the two steps are completely independent, there
R t .2/
is no risk of confusion), equivalent to P, under which .WtQ D Wt 0 ıFs ds/06t6T is a
Brownian motion. Then,
ˇ Z T ˇ
ˇ ˇ ˇ Q ˇ ˇ Q ˇ
ˇE ıY0 ˇ D ˇE ıY0 ˇ D ˇE ıgT ıXT C ıF .1/
ıX ds ˇ: (4.67)
ˇ s s ˇ
0
In order to handle the above right-hand side, we need to investigate dQ=dP. This requires to
go back to (4.66) and to (4.62).
Fifth Step. The backward equation in (4.62) may be regarded as a quadratic BSDE satisfying
assumption
Rt Quadratic BSDE, uniformly in R. By Theorem 4.19, the stochastic integral
. 0 Zsx dWs /06t6T is of Bounded Mean Oscillation and its BMO norm is independent of
x and R. Without any loss of generality, we may assume that it is less than C.
4.4 Solving MFGs from the Probabilistic Representation of the Value Function 275
.2/
Coincidentally, the same holds true if we replace Zsx by ıFs from (4.65), as
.2/ 0
jıFs j 6 C.1 C jZsx j C jZsx j/. By Proposition 4.18, we deduce that there exists an exponent
r > 1, only depending on L and T, such that (allowing the constant C to increase from line
to line):
h dQ ri
E 6 C:
dP
Now (4.64) implies that, for any p > 1, there exists a constant Cp0 , independent of the cut-
off functions , such that EŒsup06t6T jıXs jp 1=p 6 Cp0 jx x0 j. Therefore, applying Hölder’s
inequality, (4.67) and the bound for the r-moment of dQ=dP, we obtain:
Z r0 1=r0
ˇ ˇ T 0
ˇE ıY0 ˇ 6 Cjx x0 j 1 C E jZsx j2 C jZsx j2 ds ; (4.68)
0
for some r0 > 1. In order to estimate the right-hand side, we invoke Theorem 4.18 again. We
deduce that:
ˇ ˇ
ˇE ıY0 ˇ 6 C0 jx x0 j;
for a constant C0 that only depends upon L and T. This proves the required estimate for the
Lipschitz constant of the decoupling field associated with the system (4.62). t
u
4.4.4 Conclusion
We now return to the mean field game associated with the stochastic control
problem (4.52).
By Theorem 4.45, any solution to the McKean-Vlasov FBSDE:
8
ˆ
ˆ
ˆ dXt D b t; Xt ; L.Xt /; ˛O t; Xt ; L.Xt /; .t; Xt ; L.Xt //1 Zt dt
ˆ
< C .t; Xt ; L.Xt //dWt ;
(4.69)
ˆ 1
ˆ dYt D f t; Xt ; L.Xt /; ˛O t; Xt ; L.Xt /; .t; Xt ; L.Xt // Zt dt
ˆ
:̂
CZt dWt ;
We now present another strategy for proving the existence of an MFG equilibrium,
based upon the stochastic Pontryagin maximum principle. We already presented
this alternative method in Subsection 3.3.2, but we now address the solvability
of the underpinning McKean-Vlasov FBSDE under weaker conditions than in
Theorem 3.24. Recall that the FBSDE takes the form:
(
dXt D b t; Xt ; L.Xt /; ˛.t;
O Xt ; L.Xt /; Yt / dt C dWt ;
(4.70)
dYt D @x H t; Xt ; L.Xt /; Yt ; ˛.t;
O Xt ; L.Xt /; Yt / dt C Zt dWt ;
for t 2 Œ0; T, with the initial condition X0 D 2 L2 .˝; F0 ; PI Rd /, together with
the terminal condition YT D @x g.XT ; L.XT //. For convenience purposes, we recall
the assumption introduced in Chapter 3.
(A1) The drift b is an affine function of .x; ˛/ in the sense that it is of the
form:
b.t; x; ; ˛/ D b0 .t; / C b1 .t/x C b2 .t/˛; (4.71)
where Œ0; T P2 .Rd / 3 .t; / 7! b0 .t; /, b1 W Œ0; T 3 t 7! b1 .t/
and b2 W Œ0; T 3 t 7! b2 .t/ are Rd , Rdd and Rdk valued respectively,
and are measurable and bounded on bounded subsets of their respective
domains.
(A2) The function is constant.
(A3) The function Rd A 3 .x; ˛/ 7! f .t; x; ; ˛/ 2 R is once continuously
differentiable with Lipschitz-continuous derivatives (so that f .t; ; ; /
is C1;1 ), the Lipschitz constant in x and ˛ being bounded by L (so that
it is uniform in t and ). Moreover, it satisfies the following form of
-convexity:
f .t; x0 ; ; ˛ 0 / f .t; x; ; ˛/
(4.72)
.x0 x; ˛ 0 ˛/ @.x;˛/ f .t; x; ; ˛/ > j˛ 0 ˛j2 :
The notation @.x;˛/ f stands for the gradient in the joint variables .x; ˛/.
Finally, f , @x f and @˛ f are locally bounded over Œ0; TRd P2 .Rd /A.
(continued)
4.5 Solving MFGs by the Stochastic Pontryagin Principle 277
On top of assumption SMP, we shall use the following assumptions to solve the
matching problem (ii) in (3.4). Recall the notation M2 ./2 for the second moment
of the measure introduced in (3.26). The following assumptions are stated using
a fixed point ˛0 2 A. Clearly, the assumptions do not depend upon the particular
choice of this control value in A.
Assumption (A5) provides Lipschitz continuity while condition (A6) controls the
smoothness of the running cost f with respect to ˛ uniformly in the other variables.
The most unusual assumption is certainly condition (A7). We refer to it as a weak
mean-reverting condition as it looks like a standard mean-reverting condition for
recurrent diffusion processes, even though the notion of recurrence does not make
much sense in our case since we are working on a finite time interval. Still, as
shown by the proof of Theorem 4.53 below, its role is to control the expectation of
the solution of the forward equation in (4.70), providing an a priori bound for it.
The latter will play a crucial role in the proof of compactness.
278 4 FBSDEs and the Solution of MFGs Without Common Noise
for some constant C > 0, and such that P-a.s. Yt D u.t; Xt / for all t 2 Œ0; T. In
particular, for any p > 1, EŒsup06t6T jXt jp < C1.
In line with the terminology used so far, the function u will be referred to as the
decoupling field of the FBSDE when the environment D .t D L.Xt //06t6T is
fixed.
1 ˇˇ ˇ2
b0 .t; / D b0 .t/;
N g.x; / D qx C qN N ˇ ;
2
1 ˇˇ ˇ2 1
N N ˇ C jn.t/˛j2 ;
f .t; x; ; ˛/ D m.t/x C m.t/
2 2
0
N are elements of Rd d , for some d0 > 1, n.t/ is an element
where q, qN , m.t/ and m.t/
0
of Rk k , for some k0 > k, and N stands for the mean of . Assumption MFG
Solvability SMP is satisfied when b0 .t/ 0 (so that b0 is bounded as required in
(A5)), qN q > 0, m.t/
N m.t/ > 0, and n.t/ n.t/ > Ik in the sense of quadratic forms
so that (A7) and (A3) hold. In the one-dimensional case d D m D 1, (A7) says that
N must be nonnegative and (A3) says that n.t/2 must be greater than
qNq and m.t/m.t/
. As we saw in Section 3.5, these conditions are not optimal for existence when
d D m D 1, as we showed that (4.70) is indeed solvable when Œ0; T 3 t 7! b0 .t/ is
a (possibly nonzero) continuous function, q.q C qN / > 0 and m.t/.m.t/ C m.t//
N > 0.
Obviously, the gap between these conditions is the price to pay for treating general
systems within a single framework.
Another example investigated in Section 3.5 is b0 0, b1 0, b2 1, f ˛ 2 =2,
with d D m D 1. When g.x; / D qN .x s/ N 2 =2, with qN > 0 and s 2 R, assumption
MFG Solvability SMP is satisfied when qN s 6 0 (so that (A7) holds). The optimal
condition given in Section 3.5 is 1 C qN .1 s/T 6D 0.
Notice that assumption MFG Solvability SMP is satisfied when g.x; / D qN .x
N 2 =2 for a bounded Lipschitz-continuous function from R into itself.
s .//
4.5 Solving MFGs by the Stochastic Pontryagin Principle 279
Remark 4.55 The reader may want to apply Theorem 4.39 in order to prove
Theorem 4.53, but, as made clear in the proof below, the difficulty is that (A5) in
assumption MKV FBSDE for MFG may not be satisfied.
The remainder of the section is dedicated to the proof of Theorem 4.53. It split
into four main steps.
Lemma 4.56 Given a continuous flow D .t /06t6T from Œ0; T to P2 .Rd /, the
FBSDE (4.74) is uniquely solvable under assumption SMP, A being a general
0; 0; 0;
closed convex subset of Rk . If we denote its solution by .Xt ; Yt ; Zt /06t6T , then
there exist a constant C > 0, only depending upon the parameters in SMP and thus
independent of , and a locally bounded measurable function u W Œ0; T Rd ! Rd ,
depending on , such that
0; 0;
and P-a.s., for all t 2 Œ0; T, Yt D u.t; Xt /.
The proof of Lemma 4.56 is based on Lemma 3.3 and Proposition 3.21 from
Chapter 3.
where JO t0 ;x0 D J ..˛O tt0 ;x0 /t0 6t6T / and ˛O tt0 ;x0 D ˛.t;
O Xtt0 ;x0 ; t ; Ytt0 ;x0 / (with similar definitions
0
0 ;x
for JO 0 0 and ˛O t
t
t ;x 0 0
by replacing x0 by x0 ). Exchanging the roles of x0 and x00 in (4.76) and
0
Moreover, by standard SDE estimates first and then by standard BSDE estimates, there exists
a constant C (the value of which may vary from line to line), independent of t0 and ı, such
that:
t ;x0 t ;x0
E sup jXtt0 ;x0 Xt 0 0 j2 C E sup jYtt0 ;x0 Yt 0 0 j2
t0 6t6T t0 6t6T
Z T
t ;x0
6 CE j˛O tt0 ;x0 ˛O t 0 0 j2 dt:
t0
Plugging (4.77) into the above inequality completes the proof of (4.75). As explained in
Section 4.1, the function u is then defined as u W Œ0; T Rd 3 .t; x/ 7! Ytt;x and the
representation property of Y in terms of X follows from (4.8). Local boundedness of u follows
from the Lipschitz continuity in the variable x together with the obviousinequality:
sup06t6T ju.t; 0/j 6 sup06t6T E ju.t; Xt0;0 / u.t; 0/j C E jYt0;0 j < 1. t
u
Proposition 4.57 The system (4.70) is solvable if, in addition to assumption MFG
Solvability SMP, we also assume that EŒjj4 < 1 and that @x f and @x g are
uniformly bounded, i.e., for some constant c > 0
Proof. Throughout the proof, we denote by .XI ; Y I ; ZI / the solution of (4.74) with
as input. Also, we denote by u the corresponding decoupling field.
I
For a given 2 L4 .˝; F0 ; PI Rd /, we consider the map 7! ˚./ D .L.Xt //06t6T
and try to apply Schauder’s Theorem 4.32 in order to prove that ˚ has a fixed point, very
much in the spirit of the proofs of Theorems 4.29 and 4.39. See also Remark 4.42 for a
comment on the difference between the two proofs. Following Subsections 4.3.2 and 4.3.5,
we apply Schauder’s fixed point theorem in the space C .Œ0; TI M1f .Rd // of continuous
functions D .t /06t6T from Œ0; T into the space of finite signed measures over Rd ,
equipped with the supremum of the Kantorovich-Rubinstein norm:
with
Z
kkKR? D j.Rd /j C sup `.x/d.x/I ` 2 Lip1 .Rd /; `.0/ D 0 :
Rd
As already explained several times, k kKR? coincides with the Wasserstein distance W1 on
P1 .Rd /. See the Notes & Complements at the end of Chapter 5 for details and references.
We prove existence by proving that there exists a closed convex subset E included in
C .Œ0; TI P2 .Rd // which, when viewed as a subset of C .Œ0; TI M1f .Rd //, is stable for ˚,
with a relatively compact range, ˚ being continuous on E .
First Step. We first establish several a priori estimates for the solution of (4.74). The
coefficients @x f and @x g being bounded, the terminal condition in (4.74) is bounded, and
the growth of the driver is controlled by:
j@x H t; x; t ; y; ˛.t;
O x; t ; y/ j 6 c C Ljyj:
I
By expanding .jYt j2 /06t6T as the solution of a one-dimensional BSDE, we can compare
it with the solution of a deterministic BSDE with a constant terminal condition. This implies
that there exists a constant C, only depending upon c, L and T, such that, for any 2
C .Œ0; TI P2 .Rd //, P-almost surely,
I
8t 2 Œ0; T; jYt j 6 C: (4.79)
Notice that the value of the constant C will vary from line to line. By (3.10) in the statement
of Lemma 3.3, and by (A6) in assumption MFG Solvability SMP, we deduce that:
ˇ I ˇ
8t 2 Œ0; T; ˇ˛O t; Xt ; t ; YtI ˇ 6 C: (4.80)
Plugging this bound into the forward part of (4.74), and again, allowing the constant C to
increase when necessary, standard Lp estimates for SDEs imply:
I 4
E sup jXt j j 6 C 1 C E jj4 : (4.81)
06t6T
282 4 FBSDEs and the Solution of MFGs Without Common Noise
Clearly, E is convex and closed for the 1-Wasserstein distance, and ˚ maps E into itself.
Second Step. By (4.80), we get for any 2 E and 0 6 s 6 t 6 T:
I
jXt XsI j 6 C .t s/ 1 C sup jXrI j C jWt Ws j ;
06r6T
so that, by (4.81),
I
W2 Œ˚./t ; Œ˚./s D W2 L.Xt /; L.XsI / 6 C.t s/1=2 ;
I I
O Xt ; t ; Yt / for t 2 Œ0; T, ˛O t0 being defined in a similar way by replacing
where ˛O t D ˛.t;
0
by . By optimality of ˛O 0 for the cost functional J ./, we claim:
0
0 0
J ˛O 0 ; 0 6 J ˛O C J ˛O 0 ; 0 J ˛O 0 ;
0 0
We now compare J .˛/ O and similarly J .˛O 0 / with J .Œ˛O 0 ; 0 /. We notice
O with J .˛/,
O is the cost associated with the flow of measures D .t /06t6T and the
that J .˛/
0
diffusion process X I , whereas J .˛/
O is the cost associated with the flow of measures
0 0
D .t /06t6T and the controlled diffusion process U D .Ut /06t6T satisfying:
dUt D b0 .t; 0t / C b1 .t/Ut C b2 .t/˛O t dt C dWt ; t 2 Œ0; T I U0 D :
4.5 Solving MFGs by the Stochastic Pontryagin Principle 283
By Gronwall’s lemma, we can modify the value of C (which is now allowed to depend on )
in such a way that:
Z
I
T
E sup jXt U t j2 6 C W22 .t ; 0t /dt:
06t6T 0
Since and 0 are in E , we deduce from (A5) in assumption MFG Solvability SMP, (4.80)
and (4.81) that:
Z 1=2
0 T
J ˛O J ˛O 6 C W22 .t ; 0t /dt C W2 .T ; 0T / :
0
0
A similar bound holds for J .Œ˛O 0 ; 0 /J .˛O 0 /, the argument being even simpler as the costs
are driven by the same processes. So from (4.83) and (4.79) again, together with Gronwall’s
lemma to go back to the controlled SDEs,
Z T I I0 2
E j˛O t0 ˛O t j2 dt C E sup jXt Xt j
0 06t6T
Z T 1=2
6C W22 .t ; 0t /dt C W2 .T ; 0T / ;
0
showing that ˚ is continuous on E . The last inequality follows from Hölder’s inequality
EŒjXj2 D EŒjXj1=2 jXj3=2 6 EŒjXj1=2 EŒjXj3 1=2 6 EŒjXj1=2 EŒjXj4 3=8 , for any random
variable X. t
u
Obviously, examples of functions f and g which are convex in x and such that @x f and
@x g are bounded are rather limited in number and scope. Also, boundedness of @x f
and @x g fails in the typical case when f and g are quadratic with respect to x. In order
to overcome this limitation, we propose to approximate the cost functions f and g
by two sequences .f n /n>1 and .gn /n>1 , referred to as approximated cost functions,
satisfying assumption MFG Solvability SMP uniformly with respect to n > 1, and
such that, for any n > 1, equation (4.70), with .@x f ; @x g/ replaced by .@x f n ; @x gn /
and with 2 L4 .˝; F0 ; PI Rd /, has a solution .Xn ; Y n ; Zn /. In this framework,
Proposition 4.57 says that such approximate FBSDEs are indeed solvable when @x f n
and @x gn are bounded for any n > 1. Our approximation procedure relies on the
following:
284 4 FBSDEs and the Solution of MFGs Without Common Noise
Lemma 4.58 Let us assume that there exist two sequences .f n /n>1 and .gn /n>1 such
that:
(i) there exist two parameters L0 and 0 > 0 such that, for any n > 1, f n and gn
satisfy assumption MFG Solvability SMP with 0 and L0 ;
(ii) f n (resp. gn ) converges towards f (resp. g) uniformly on bounded subsets of
Œ0; T Rd P2 .Rd / A (resp. Rd P2 .Rd /);
(iii) for any n > 1, equation (4.70), with .@x f ; @x g/ replaced by .@x f n ; @x gn / and
with 2 L4 .˝; F0 ; PI Rd / instead of 2 L2 .˝; F0 ; PI Rd /, has a solution.
Then, equation (4.70), with the original coefficients and with 2 L2 .˝; F0 ; PI Rd /,
is solvable.
Proof. For a sequence of F0 -measurable random variables . n /n>1 with values in Rd , such
that j n j 6 jj ^ n for all n > 1 and EŒj n j2 ! 0 as n tends to 1, we denote
by .Xn ; Y n ; Zn /n>1 the sequence of processes obtained by solving (4.70), with .@x f ; @x g/
replaced by .@x f n ; @x gn / and by n .
We establish tightness of the processes .Xn /n>1 in order to extract a convergent subse-
quence of .n D .nt D L.Xtn //06t6T /n>1 . For any n > 1, we consider the approximate
Hamiltonian:
Since Xn is the diffusion process controlled by ˛O n D .˛O tn /06t6T , we use Theorem 3.17 to
compare its behavior to the behavior of a reference controlled process Un whose dynamics are
driven by a specific control ˇ n . We shall consider two different versions for Un corresponding
to the following choices for ˇ n :
For each of these controls, we compare its cost to the optimal cost by using the version of
the stochastic maximum principle which we proved earlier, and subsequently, derive useful
information on the optimal control ˛O n .
First Step. We first consider .i/ in (4.85). In this case,
Z t
Utn D n C b0 .s; L.Xsn // C b1 .s/Usn C b2 .s/E.˛O sn / ds C Wt ; t 2 Œ0; T: (4.86)
0
Notice that taking expectations on both sides of (4.86) shows that E.Usn / D E.Xsn /, for
0 6 s 6 T, and that:
Z
n t
Ut E.Utn / D n E. n / C b1 .s/ Usn E.Usn / ds C Wt ; t 2 Œ0; T;
0
4.5 Solving MFGs by the Stochastic Pontryagin Principle 285
from which it easily follows that supn>1 sup06s6T Var.Usn / < C1. By Theorem 3.17, with
gn .; L.XTn // as terminal cost and .f n .t; ; L.Xtn /; //06t6T as running cost, we get:
Z
T 0
E gn XTn ; L.XTn / C E j˛O sn ˇsn j2 C f n s; Xsn ; L.Xsn /; ˛O sn ds
0
Z (4.87)
T
6E g n
UTn ; L.XTn / C f n
s; Usn ; L.Xsn /; ˇsn ds :
0
Using the fact that ˇsn D EŒ˛O sn , the convexity condition in (A2) and (A4) and Jensen’s
inequality, we obtain:
Z
T 0
gn E.XTn /; L.XTn / C Var.˛O sn / C f n s; E.Xsn /; L.Xsn /; E.˛O sn / ds
0
Z (4.88)
T
6 E gn UTn ; L.XTn / C f n s; Usn ; L.Xsn /; E.˛O sn / ds :
0
By (A5) in assumption MFG Solvability SMP, we deduce that there exists a constant C,
depending only on , L, EŒjj2 and T, such that (the actual value of C possibly varying from
line to line):
Z T 1=2 1=2 1=2
Var.˛O sn /ds 6 C 1 C E jUTn j2 C E jXTn j2 E jUTn E.XTn /j2
0
Z T 1=2 1=2 1=2 1=2
CC 1 C E jUsn j2 C E jXsn j2 C E j˛O sn j2 E jUsn E.Xsn /j2 ds:
0
Since E.Xtn / D E.Utn / for any t 2 Œ0; T, we deduce from the uniform boundedness of the
variance of .Usn /06s6T that:
Z T Z T 1=2
Var.˛O sn /ds 6 C 1 C sup EŒjXsn j2 1=2 C E j˛O sn j2 ds : (4.89)
0 06s6T 0
From this, the linearity of the dynamics of Xn and Gronwall’s inequality, we deduce:
Z T 1=2
sup Var.Xsn / 6 C 1 C E j˛O sn j2 ds ; (4.90)
06s6T 0
since
Z
T
sup E jXsn j2 6 C 1 C E j˛O sn j2 ds : (4.91)
06s6T 0
Bounds like (4.90) allow us to control, for any 0 6 s 6 T, the Wasserstein distance between
the distribution of Xsn and the Dirac mass at the point E.Xsn /.
Second Step. We now compare Xn to the process controlled by the null control. So we
consider case .ii/ in (4.85), and now:
286 4 FBSDEs and the Solution of MFGs Without Common Noise
Z t
Utn D n C b0 s; L.Xsn / C b1 .s/Usn ds C Wt ; t 2 Œ0; T:
0
Note that we still denote the solution by Un although it is different from the one in the first
step. By the boundedness of b0 in (A5) in assumption MFG Solvability SMP, it holds that
supn>1 EŒsup06s6T jUsn j2 < 1. Using Theorem 3.17 as before in the derivation of (4.87)
and (4.88), we get:
Z
T 0
gn E.XTn /; L.XTn / C E.j˛O sn j2 / C f n s; E.Xsn /; L.Xsn /; E.˛O sn / ds
0
Z
T
6 E gn UTn ; L.XTn / C f n s; Usn ; L.Xsn /; 0 ds :
0
the value of C possibly varying from line to line. From (4.91), Young’s inequality yields:
Z
T 0
gn E.XTn /; ıE.XTn / C E j˛O sn j2 C f n s; E.Xsn /; ıE.Xsn / ; 0 ds
0 2
Z
T
6 gn 0; ıE.XTn / C f n s; 0; ıE.Xsn / ; 0 ds C C 1 C sup Var.Xsn / :
0 06s6T
By (4.90), we obtain:
Z
T 0
gn E.XTn /; ıE.XTn / C E j˛O sn j2 C f n s; E.Xsn /; ıE.Xsn / ; 0 ds
0 2
Z Z 1=2
T T
6 gn 0; ıE.XTn / C f n s; 0; ıE.Xsn / ; 0 ds C C 1 C E j˛O sn j2 ds :
0 0
4.5 Solving MFGs by the Stochastic Pontryagin Principle 287
Using (4.84) and (4.92), we can prove that the processes .Xn /n>1 are tight in C .Œ0; TI Rd /.
Indeed, there exists a constant C0 , independent of n, such that, for any 0 6 s 6 t 6 T,
Z 1=2
T
jXtn Xsn j 6 C0 .t s/1=2 1 C jXrn j2 C j˛O rn j2 dr C C0 jWt Ws j;
0
So there exists a constant, still denoted by C0 , such that ju.t; x/j 6 C0 .1 C jxj/, for t 2
Œ0; T and x 2 Rd . By (3.10) and (A6), we deduce that (for a possibly new value of C0 )
O x; t ; u.t; x//j 6 C0 .1 C jxj/: Plugging this bound into the forward SDE satisfied by X
j˛.t;
in (4.70), we conclude that, for a possibly new value of C0 ,
1=`
8` > 1; E sup jXt j2` jF0 6 C0 1 C jj2 ; (4.94)
06t6T
and, thus,
Z T
E j˛O t j2 dt < 1; (4.95)
0
with ˛O t D ˛.t;
O Xt ; t ; Yt /, for t 2 Œ0; T. We can now apply the same argument to any
.Xtn /06t6T , for any n > 1. We claim:
288 4 FBSDEs and the Solution of MFGs Without Common Noise
1=`
8` > 1; sup E sup jXtn j2` jF0 6 C0 1 C jj2 ; (4.96)
n>1 06t6T
which is a consequence of the following three observations. First, the constant C in the
statement of Lemma 4.56 does not depend on n. Second, the second-order moments of
sup06t6T jXtn j are bounded, uniformly in n > 1 by (4.92). Third, by (A5), the driver of the
backward component in (4.74) is at most of linear growth in .x; y; ˛/, so that by (4.84) and
standard L2 estimates for BSDEs, the second-order moments of sup06t6T jYtn j are uniformly
bounded as well. This shows (4.96) by repeating the proof of (4.94). By (4.94) and (4.96) and
by the same uniform integrability argument as in the third step of the proof of Theorem 4.29
n
in Subsection 4.3.3, we get that sup06t6T W2 .t p ; t / ! 0 as n tends to C1. Repeating the
proof of (4.83) (see (3.39) for the notations), we have:
Z
n n
T
0
E j˛O tn ˛O t j2 dt 6 J n; ˛O J ˛O C J ˛O n ; n J n; ˛O n
0
Z (4.97)
T
E n Y0 E b0 .t; nt / b0 .t; t / Yt dt;
0
where J ./ is given by (4.52) and J n; ./ is defined in a similar way, but with .f ; g/ and
n
.t /06t6T replaced by .f n ; gn / and .nt /06t6T . With these definitions at hand, we notice that:
n
J n; ˛O J ˛O
Z
T n
D E gn .UTn ; nT / g.XT ; T / C E f t; Utn ; nt ; ˛O t f t; Xt ; t ; ˛O t dt;
0
By Gronwall’s lemma and by convergence of np towards for the 2–Wasserstein distance,
we claim that Unp ! X as p ! C1 for the norm EŒsup06s6T j s j2 1=2 , namely in S2;d .
Using on one hand the uniform convergence of f n and gn towards f and g on bounded subsets
of their respective domains together with the regularity properties of f n , gn , f and g, and on
the other hand the convergence of np towards together with the bounds (4.94), (4.95)
np
and (4.96)), we deduce that J np ; .˛/ O ! J .˛/ O as p ! C1. Similarly, using the
bounds (4.84), (4.94) and (4.96), the other differences in the right-hand side in (4.97) tend to
0 along the subsequence .np /p>1 so that ˛O np ! ˛O as p ! C1 in L2 .Œ0; T ˝; Leb1 ˝ P/.
We conclude that X is the limit of the sequence .Xnp /p>1 in S2;d . Therefore, matches the
flow of marginal laws of X, proving that equation (4.70) is solvable. t
u
In order to complete the proof of Theorem 4.53, we must specify the choice of the
approximating sequences in Lemma 4.58. Actually, the choice is performed in two
steps. We first consider the case when the cost functions f and g are strongly convex
in the variables x.
4.5 Solving MFGs by the Stochastic Pontryagin Principle 289
Lemma 4.59 Assume that, in addition to assumption MFG Solvability SMP, there
exists a constant > 0 such that the functions f and g satisfy (compare with (4.72)):
f .t; x0 ; ; ˛ 0 / f .t; x; ; ˛/
.x0 x; ˛ 0 ˛/ @.x;˛/ f .t; x; ; ˛/ > jx0 xj2 C j˛ 0 ˛j2 ; (4.98)
g.x0 ; / g.x; / .x0 x/ @x g.x; / > jx0 xj2 :
Then, there exist two positive constants 0 and L0 , depending only upon , L and ,
and two sequences of functions .f n /n>1 and .gn /n>1 such that:
(i) for any n > 1, f n and gn satisfy MFG Solvability SMP with the parameters 0
and L0 and @x f n and @x gn are bounded;
(ii) for any bounded subsets of Œ0; T Rd P2 .Rd / Rk and Rd P2 .Rd /, there
exists an integer n0 , such that, for any n > n0 , f n and gn coincide with f and g
on these bounded sets.
Proof. The proof of Lemma 4.59 is a pure technical exercise in convex analysis. We focus
on the approximation of the running cost f (the case of the terminal cost g is similar) and we
ignore the dependence of f upon t to simplify the notation. For any n > 1, we define fn as the
truncated Legendre transform:
fn .x; ; ˛/ D sup inf y .x z/ C f .z; ; ˛/ ; (4.99)
d
jyj6n z2R
jyj2
sup y z f .z; ; ˛/ > sup y z cjzj2 c.1 C R2 / D c.1 C R2 /: (4.102)
z2Rd z2Rd 4c
290 4 FBSDEs and the Solution of MFGs Without Common Noise
Therefore,
jyj2
inf y .x z/ C f .z; ; ˛/ 6 Rjyj C c.1 C R2 /: (4.103)
z2Rd 4c
By (4.101) and (A5) in assumption MFG Solvability SMP, fn .t; x; ; ˛/ > c.1 C R2 /, c
depending possibly on , so that optimization in the variable y in the definition of fn can be
done over points y? satisfying:
jy? j2
c.1 C R2 / 6 Rjy? j C c.1 C R2 /; that is jy? j 6 c.1 C R/; (4.104)
4c
the constant c being allowed to vary from inequality to another. In particular, for n large
enough (depending on R),
fn .x; ; ˛/ D sup inf y .x z/ C f .z; ; ˛/ D f .x; ; ˛/: (4.105)
d
y2Rd z2R
So on bounded subsets of Rd P2 .Rd /Rk , fn and f coincide for n large enough. In particular,
for n large enough, fn .0; ı0 ; 0/, @x fn .0; ı0 ; 0/ and @˛ fn .0; ı0 ; 0/ exist, coincide with f .0; ı0 ; 0/,
@x f .0; ı0 ; 0/ and @˛ f .0; ı0 ; 0/ respectively, and are bounded by L as in (A5). Moreover, still
for jxj 6 R, j˛j 6 R and M2 ./ 6 R, we see from (4.100) and (4.104) that optimization in z
can be reduced to z? satisfying:
Second Step. We now investigate the convexity property of fn .; ; /, for given 2 P2 .Rd /.
For any h 2 R, x; e; y; z1 ; z2 2 Rd and ˛; ˇ 2 Rk , with jyj 6 n and jej; jˇj 1, we deduce
from the convexity of f .; ; /:
2 inf y .x z/ C f .z; ; ˛/
z2Rd
z1 C z 2 .˛ C hˇ/ C .˛ hˇ/
6 y .x C he z1 / C .x he z2 / C 2f ; ;
2 2
6 y .x C he z1 / C f .z1 ; ; ˛ C hˇ/ C y .x he z2 / C f .z2 ; ; ˛ hˇ/ 2 h2 jˇj2 :
1 1
fn .x; ; ˛/ 6 fn .x C he; ; ˛ C hˇ/ C fn .x he; ; ˛ hˇ/ h2 jˇj2 : (4.107)
2 2
4.5 Solving MFGs by the Stochastic Pontryagin Principle 291
By expanding f .; ; / up to the first order and by using the Lipschitz regularity of the first
order derivatives, we see that:
inf y1 .x C he z/ C f .z; ; ˛ C hˇ/
z2Rd
C inf y2 .x he z/ C f .z; ; ˛ hˇ/
z2Rd
6 inf .y1 C y2 / .x z/ C 2f .z; ; ˛/ C cjhj2 jej2 C jˇj2
z2Rd
y1 C y2
D 2 inf .x z/ C f .z; ; ˛/ C cjhj2 jej2 C jˇj2 ;
z2Rd 2
By (A5) in assumption MFG Solvability SMP and (4.106), we can find a constant c0
(possibly depending on ) such that:
292 4 FBSDEs and the Solution of MFGs Without Common Noise
fn .x; 0 ; ˛/ D sup inf y .x z/ C f .z; 0 ; ˛/
jyj6n jzj6c.1CR/
6 sup inf y .x z/ C f .z; ; ˛/ C L.1 C R C jzj/ı (4.109)
jyj6n z6c.1CR/
D sup inf y .x z/ C f .z; ; ˛/ C c0 .1 C R/ı:
d
jyj 6 n z2R
By (4.105), for any integer p > 1, there exists an integer np , such that, for any n > np ,
@x fn .0; ; 0/ and @˛ fn .0; ; 0/ are respectively equal to @x f .0; ; 0/ and @˛ f .0; ; 0/ for
M2 ./ 6 p. In particular, for n > np ,
ˇ ˇ ˇ ˇ
ˇ@x fn .0; ; 0/ˇ C ˇ@˛ fn .0; ; 0/ˇ 6 c 1 C M2 ./ whenever M2 ./ 6 p; (4.111)
so that (4.110) implies (A5) whenever n > np and M2 ./ 6 p. We get rid of these
restrictions by modifying the definition of fn . Given a probability measure 2 P2 .Rd /
and an integer p > 1, we define ˚p ./ as the push-forward of by the mapping Rd 3
x ! Œmax.M2 ./; p/1 px so that ˚p ./ 2 P2 .Rd / and M2 .˚p .// 6 min.p; M2 .//.
Indeed, if the random variable X has as distribution, i.e., L.X/ D , then the random
variable Xp D pX= max.M2 ./; p/ has ˚p ./ as distribution. It is easy to check that ˚p is
Lipschitz continuous for the 2-Wasserstein distance, uniformly in n > 1. We then consider
the approximating sequence:
fOp W Rd P2 .Rd / Rk 3 .x; ; ˛/ ! fnp x; ˚p ./; ˛/; p > 1;
instead of .fn /n>1 itself. Clearly, on any bounded subset, fOp still coincides with f for p
large enough. Moreover, the conclusion of the second step is preserved. In particular, the
conclusion of the second step together with (4.109), (4.110), and (4.111) say that (A5) holds
(for a possible new choice of L). From now on, we get rid of the symbol “hat” in .fOp /p>1 and
keep the notation .fn /n>1 for .fOp /p>1 .
Fourth Step. It only remains to check that fn satisfies the bound (A6) and the sign condition
(A7) in assumption MFG Solvability SMP. Since j@˛ f .x; ; 0/j 6 L, the Lipschitz property
of @˛ f implies that there exists a constant c > 0 such that j@˛ f .x; ; ˛/j 6 c for all
.x; ; ˛/ 2 Rd P2 .Rd / Rk with j˛j 6 1. In particular, for any n > 1, it is plain to
see that fn .x; ; ˛/ 6 fn .x; ; 0/ C cj˛j; for any .x; ; ˛/ 2 Rd P2 .Rd / Rk with j˛j 6 1,
so that j@˛ fn .x; ; 0/j 6 c. This proves (A6).
Finally, we can modify the definition of fn once more to satisfy (A7). Indeed, for any R >
0, there exists an integer nR , such that, for any n > nR , fn .x; ; ˛/ and f .x; ; ˛/ coincide for
.x; ; ˛/ 2 Rd P2 .Rd / Rk with jxj; j˛j; M2 ./ 6 R so that x @x fn .0; ıx ; 0/ > L.1 C jxj/;
for jxj 6 R and n > nR . Next we choose a smooth function W Rd ! Rd , satisfying
j .x/j 6 1 for any x 2 R , .x/ D x for jxj 6 1=2 and .x/ D x=jxj for jxj > 1, and we
d
4.5 Solving MFGs by the Stochastic Pontryagin Principle 293
set fOp .x; ; ˛/ D fnp x; p ./; ˛ for any integer p > 1 and .x; ; ˛/ 2 Rd P2 .Rd / Rk
where p ./ is the push-forward of by the mapping Rd 3 x ! x N C p .p1 /. N
Recall that N stands for the mean of . In other words, if X has distribution , then XO p D
X E.X/ C p .p1 E.X// has distribution p ./.
The function p is Lipschitz continuous with respect to W2 , Runiformly in p > 1.
Moreover, for any R > 0 and p > 2R, M2 ./ 6 R implies j Rd x0 d.x0 /j 6 R so
R
that p1 j Rd x0 d.x0 /j 6 1=2, that is p ./ D and, for jxj; j˛j 6 R, fOp .x; ; ˛/ D
fnp .x; ; ˛/ D f .x; ; ˛/. Therefore, the sequence .fOp /p>1 is an approximating sequence for f
which satisfies the same regularity properties as .fn /n>1 . In addition:
for x 2 Rd . Finally we choose .x/ D Œ.jxj/=jxjx (with .0/ D 0), where is a smooth
nondecreasing function from Œ0; C1/ into Œ0; 1 such that .x/ D x on Œ0; 1=2 and .x/ D 1
on Œ1; C1/. If x 6D 0, then the above right-hand side is equal to:
jp1 xj
x @x f .0; ıp .p1 x/ ; 0/ D p .p1 x/ @x f .0; ıp .p1 x/ ; 0/
.jp1 xj/
jp1 xj
> L 1 C jp .p1 x/j :
.jp1 xj/
For jxj 6 p=2, we have .p1 jxj/ D jp1 xj, so that the right-hand side coincides with
L.1 C jxj/. For jxj > p=2, we have .p1 jxj/ > 1=2 so that:
jp1 xj
1 C jp .p1 x/j > 2p1 jxj 1 C jp .p1 x/j > 2p1 jxj 1 C p > 4jxj:
.jp1 xj/
This proves that (A7) in assumption MFG Solvability SMP holds with a new constant. t
u
4.5.5 Conclusion
1 1
fn .t; x; ; ˛/ D f .t; x; ; ˛/ C jxj2 I gn .x; / D g.x; / C jxj2 ;
n n
The purpose of this section is to revisit the theory of mean field games when
the individual players can also interact via the controls. These models have been
referred to as extended mean field games in the PDE literature on mean field games.
We saw several examples of this kind in Chapter 1, notably when we discussed
models for exhaustible resources as in Subsection 1.4.4. Therein, the inventories of
N oil producers are modeled by means of a stochastic differential game in which the
cost functional of each player depends upon the empirical mean of the instantaneous
rates of production of all the producers. A similar situation appeared when we
introduced the price impact model which we shall solve in detail in the next section.
While the N-player game models are usually detailed in the discussions of the
practical applications, here, we jump directly to the asymptotic formulation of
the mean field games. We work with the usual set-up .˝; F; F D .Ft /06t6T ; P/,
equipped with a d-dimensional F - Wiener process W D .Wt /06t6T and an
F0 -measurable initial condition 2 L2 .˝; F0 ; PI Rd /. We rewrite the matching
problem (i)–(ii) of Subsection 3.1.2 as follows:
(i) For each fixed deterministic continuous flow D .t /06t6T of probability
measures on Rd A (where the closed convex subset A Rk denotes the
set of admissible values for the controls), solve the standard stochastic control
problem
Z T
inf J .˛/; with J .˛/ D E f .t; Xt˛ ; t ; ˛t /dt C g.XT˛ ; T / ;
˛2A 0
subject to (4.112)
(
dXt˛ D b.t; Xt˛ ; t ; ˛t /dt C .t; Xt˛ ; t ; ˛t /dWt ; t 2 Œ0; T;
X0˛ D ;
1 h i
@t V.t; x/ C trace .t; x; t /@2xx V.t; x/
2 (4.114)
C H t; x; t ; @x V.t; x/; ˛O t; x; t ; @x V.t; x/ D 0;
which implies in particular that the optimal control in (i) takes the Markovian form:
˛O t D ˛Q t; XO t ; t ; t 2 Œ0; T;
for the function ˛Q defined as ˛.t; Q x; / D ˛.t; O x; ; @x V.t; x//. Therefore, for any
t 2 Œ0; T, the law of .XO t ; ˛O t / appears as the pushed forward image of the law of XO t
since:
1
L XO t ; ˛O t D L XO t ı Id ; ˛.t;
O ; t ; @x V.t; // :
Consequently, the fixed point condition (ii) for the flow D .t /06t6T of joint
distributions of the state and the control can be rewritten as:
296 4 FBSDEs and the Solution of MFGs Without Common Noise
(
t D L XO t ;
1 t 2 Œ0; T; (4.115)
t D t ı Id ; ˛.t;
O ; t ; @x V.t; // ;
in Œ0; T Rd , with V.T; / D g.; T / as terminal condition for the first equation,
and 0 D L./ as initial condition for the second equation.
In comparison with the standard case when the mean field interaction is only
through the states, the new feature is the second relationship in (4.115), which
provides in equilibrium, an implicit expression for the flow D .t /06t6T of joint
distributions of both the state and the control in terms of the flow D .t /06t6T of
marginal distributions of the state. Of course, a natural question is to identify cases
in which this implicit expression is uniquely solvable. In order to do so, we shall
restrict ourselves to flows of probability measures with values in P2 .Rd A/:
> 0; (4.117)
1
D ı Id ; ˛O t; ; ; ./ :
4.6 Extended Mean Field Games: Interaction Through the Controls 297
has a unique solution 2 P2 .Rd A/ for any t 2 Œ0; T and 2 P2 .Rd /, where:
8x; y 2 Rd ; ˛O ı t; x; ; y D argmin˛2A b2 .t/˛ y C fı .t; x; ; ˛/ ;
and
fı .t; x; ; ˛/ D ıf t; x; ; ˛ C .1 ı/f0 t; x; ˛/:
First Step. We start with the case ı D 0. The result is obviously true since f0 and thus ˛O 0
are independent of the argument . Therefore, we can just denote ˛O 0 .t; x; ; y/ by ˛O 0 .t; x; y/
and the only solution to (4.118) must be given by the function ./ D ˛O 0 t; ; ./ . By
Lemma 3.3, such a function is square-integrable because:
ˇ ˇ
8x 2 Rd ; ˇ˛O 0 t; x; .x/ ˇ 6 c 1 C jxj C j.x/j ; (4.119)
1
0 D ı Id ; ˛O ı t; ; 0 ; ./ ; (4.120)
the function fı depending upon the measure through the choice of a new function f0 ,
namely:
fı t; x; 0 ; ˛ D ıf t; x; 0 ; ˛/ C f t; x; ; ˛/ C 1 .ı C / f0 .t; x; ˛/
D ıf t; x; 0 ; ˛/ C .1 ı/fQ0 .t; x; ˛/;
298 4 FBSDEs and the Solution of MFGs Without Common Noise
with
1 .ı C /
fQ0 .t; x; ˛/ D f .t; x; ; ˛/ C f0 .t; x; ˛/:
1ı 1ı
Therefore, fı is covered by the induction assumption if we use for f0 the new function fQ0 .
The induction assumption implies that equation (4.120) has a unique solution in P2 .Rd
A/ guaranteeing that the mapping is well defined. Observe then that any fixed point of
satisfies:
1
D ı Id ; ˛O ı t; ; ; ./ ;
providing a solution of the equation (4.118) with ı replaced by ıC. Conversely, any solution
of the equation (4.118), with ı replaced by ıC, is a fixed point of the mapping . Therefore,
in order to prove that (4.118), with ı replaced by ı C , is uniquely solvable, it suffices to
prove that is a contraction on P2 .Rd A/ for the Wasserstein distance W2 .
Third Step. We now prove that, for small enough, the function is a contraction. Given
1 and 2 2 P2 .Rd A/, we call 10 and 20 their respective images by , and we denote by X
a random variable with distribution . Then, by optimality of ˛O ı .t; ; 0 ; .//, we have:
b2 .t/˛O ı1 t; X; 10 ; .X/ .X/ C fı1 t; X; 10 ; ˛O ı1 t; X; 10 ; .X/
ˇ ˇ2
C ˇ˛O ı1 t; X; 10 ; .X/ ˛O ı2 t; X; 20 ; .X/ ˇ
6 b2 .t/˛O ı2 t; X; 20 ; .X/ .X/ C fı1 t; X; 10 ; ˛O ı2 t; X; 20 ; .X/
b2 .t/˛O ı2 t; X; 20 ; .X/ .X/ C fı2 t; X; 20 ; ˛O ı2 t; X; 20 ; .X/
ˇ ˇ2
C ˇ˛O ı1 t; X; 10 ; .X/ ˛O ı2 t; X; 20 ; .X/ ˇ
6 b2 .t/˛O ı1 t; X; 10 ; .X/ .X/ C fı2 t; X; 20 ; ˛O ı1 t; X; 10 ; .X/ :
fı1 t; X; 10 ; ˛O ı1 t; X; 10 ; .X/ C fı2 t; X; 20 ; ˛O ı2 t; X; 20 ; .X/
ˇ ˇ2
C 2 ˇ˛O ı1 t; X; 10 ; .X/ ˛O ı2 t; X; 20 ; .X/ ˇ
6 fı1 t; X; 10 ; ˛O ı2 t; X; 20 ; .X/ C fı2 t; X; 20 ; ˛O ı1 t; X; 10 ; .X/ :
fı1 t; X; 10 ; ˛O ı1 t; X; 10 ; .X/ fı2 t; X; 20 ; ˛O ı1 t; X; 10 ; .X/
C fı2 t; X; 20 ; ˛O ı2 t; X; 20 ; .X/ fı1 t; X; 10 ; ˛O ı2 t; X; 20 ; .X/
(4.121)
ˇ ˇ2
C 2 ˇ˛O ı1 t; X; 10 ; .X/ ˛O ı2 t; X; 20 ; .X/ ˇ
6 0:
Now, the expectation of the four terms on the two first lines is equal to:
h i
E fı1 t; X; 10 ; ˛O ı1 t; X; 10 ; .X/ fı2 t; X; 20 ; ˛O ı1 t; X; 10 ; .X/
h i
C E fı2 t; X; 20 ; ˛O ı2 t; X; 20 ; .X/ fı1 t; X; 10 ; ˛O ı2 t; X; 20 ; .X/
h
i
D ı E f t; X; 10 ; ˛O ı1 t; X; 10 ; .X/ f t; X; 20 ; ˛O ı1 t; X; 10 ; .X/
h i
E f t; X; 10 ; ˛O ı2 t; X; 20 ; .X/ f t; X; 20 ; ˛O ı2 t; X; 20 ; .X/
h
i
C E f t; X; 1 ; ˛O ı1 t; X; 10 ; .X/ f t; X; 2 ; ˛O ı1 t; X; 10 ; .X/
h
i
E f t; X; 1 ; ˛O ı2 t; X; 20 ; .X/ f t; X; 2 ; ˛O ı2 t; X; 20 ; .X/ :
Since the random vector .X; ˛O 1 .t; X; 10 ; .X/// (respectively .X; ˛O 2 .t; X; 20 ; .X///) has
exactly 10 (respectively 20 ) as distribution, we deduce from (4.121) and from the monotonic-
ity property of f that:
hˇ ˇ2 i
2 E ˇ˛O ı1 t; X; 10 ; .X/ ˛O ı2 t; X; 20 ; .X/ ˇ
ˇ h
ˇ i
6 ˇˇE f t; X; 1 ; ˛O ı1 t; X; 10 ; .X/ f t; X; 2 ; ˛O ı1 t; X; 10 ; .X/
h ˇ
iˇ
E f t; X; 1 ; ˛O ı2 t; X; 20 ; .X/ f t; X; 2 ; ˛O ı2 t; X; 20 ; .X/ ˇˇ:
Thanks to the regularity properties of f , the term between absolute values in the right-hand
side is less than:
hˇ ˇ2 i1=2
CW2 .1 ; 2 /E ˇ˛O ı1 .t; X; 10 ; .X// ˛O ı2 .t; X; 20 ; .X//ˇ ;
for a constant C which only depends on the parameter L in the assumption. In particular, C
is independent of ı and of f0 . Therefore,
300 4 FBSDEs and the Solution of MFGs Without Common Noise
hˇ ˇ2 i
2 E ˇ˛O ı1 t; X; 10 ; .X/ ˛O ı2 t; X; 20 ; .X/ ˇ
hˇ ˇ2 i1=2
6 CW2 .1 ; 2 /E ˇ˛O ı1 t; X; 10 ; .X/ ˛O ı2 t; X; 20 ; .X/ ˇ :
Allowing the constant C to increase from line to line if necessary, we deduce that:
hˇ ˇ2 i1=2
E ˇ˛O ı1 t; X; 10 ; .X/ ˛O ı2 t; X; 20 ; .X/ ˇ 6 CW2 .1 ; 2 /:
Using again the fact that .X; ˛O 1 .t; X; 10 ; .X/// (respectively .X; ˛O 2 .t; X; 20 ; .X///) has
exactly 10 (respectively 20 ) as distribution, we notice that the left-hand side is greater than
W2 .10 ; 20 /, which finally yields:
This shows that the mapping is a contraction for C < 1. Therefore, for C 6 1=2,
Equation (4.118), with ı replaced by ı C , has a unique solution.
Final Step. Since the constant C, in the condition C 6 1=2, is independent of ı, we can
apply a straightforward induction argument to prove that (4.118) is uniquely solvable for any
ı 2 Œ0; 1, which completes the proof. t
u
Examples
We now provide three important examples of function f satisfying (4.117).
satisfies:
2
jh.t; x; ; ˛; /j 6 C 1 C jxj C j˛j C M2 ./ C M2 . / ; (4.122)
where L W Rk Œ0; C1/ 3 .x; y/ 7! L.x; y/ is twice differentiable and convex in both
variables on A Œ0; C1/, and nondecreasing in the second variable on Œ0; C1/,
and is nonnegative, even, smooth, with compact support, satisfies:
Z
@2˛ h.˛; /D @2xx L ˇ; .ˇ/ C 2@2xy L ˇ; .ˇ/ ˝ @ .ˇ/
Rk
˝2
C @2yy L ˇ; .ˇ/ @ .ˇ/ .˛ ˇ/dˇ
Z
C @y L ˇ; .ˇ/ @2 .ˇ/ .˛ ˇ/dˇ:
Rk
302 4 FBSDEs and the Solution of MFGs Without Common Noise
FBSDE Formulation
In the spirit of Chapter 3, we may characterize solutions of extended mean field
games by means of an FBSDE of the McKean-Vlasov type.
Using the Representation of the HJB Equation. As in the standard case, two
strategies are possible. The first one is to represent the value function of the game
as in Proposition 3.11 and Theorem 4.44, and the second one is to use the stochastic
maximum principle as we did in Proposition 3.21. In both cases, the main issue
is the identification of the analogue of relationship (4.115) which provides an
implicit expression of t in terms of its first marginal t . When representing the
value function of the game in Proposition 3.13, the gradient of the solution of the
Hamilton-Jacobi-Bellman equation (4.114) which appears in (4.115), is connected
with the martingale integrand .ZO t /06t6T which appears in the FBSDE formulation
of the game. A quick glance at Proposition 3.13 shows that the analogue of (4.115)
should be:
t D L XO t ; ˛O t; XO t ; t ; .t; Xt ; t /1 ZO t (4.126)
1
D L XO t ; .t; XO t ; t /1 ZO t ı Rd Rd 3 .x; y/ 7! x; ˛.t;
O x; t ; y/ ;
for t 2 Œ0; T, where .XO t ; YO t ; ZO t /06t6T denotes the solution of the associated FBSDE:
8
< dXO t D bt; XO t ; t ; ˛O t; XO t ; t ; O .t; XO t ; t /1 Zt dt C .t; XO t ; t /dWt ;
: dYO D f t; XO ; ; ˛O t; XO ; ; .t; XO ; /1 ZO dt C ZO dW ;
(4.127)
t t t t t t t t t t
Z Z
f .t; x; ; ˛/ f .t; x; 0 ; ˛/ Q.x; d˛/ Q0 .x; d˛/ d.x/ > 0;
Rd A
for D 2 B.Rd A/. Then, for any joint distribution on Rd Rd , there exists a
unique distribution 2 P2 .Rd A/, which we shall denote ˘.t; /, such that:
1
D ı Rd Rd 3 .x; y/ 7! x; ˛.t;
O x; ; y/ 2 Rd A :
The proof of Lemma 4.61 is similar to that of Lemma 4.60. A crucial fact in the
proof is that, for any 2 P2 .Rd Rd / and 2 P2 .Rd A/, the first marginal of
O x; ; y// 2 Rd A/1 is equal to the first marginal of
ı.Rd Rd 3 .x; y/ 7! .x; ˛.t;
on Rd and is thus independent of . This permits to recover the same framework
as in the proof of Lemma 4.60: the first marginal of , denoted by in the proof of
Lemma 4.60, is entirely fixed.
When only depends on 2 P2 .Rd A/ through its first marginal 2 P2 .Rd /
on Rd and under the assumption of Lemma 4.61, Equation (4.126) has a unique
solution:
t D ˘ t; L XO t ; .t; XO t ; L.XO t //1 ZO t ;
which we may rewrite, without any ambiguity, ˘ 0 .t; L.XO t ; ZO t //. Then, the
FBSDE (4.127) rewrites:
8
ˆ
ˆ
ˆ dXO t D b t; XO t ; ˘ 0 t; L.XO t ; ZO t / ;
ˆ
ˆ
ˆ
ˆ ˛O t; XO t ; ˘ 0 .t; L.XO t ; ZO t //; .t; XO t ; L.XO t //1 ZO t dt
ˆ
ˆ
ˆ
ˆ
< C t; XO t ; L.XO t / dWt ;
(4.128)
ˆ
ˆ dYO t D f t; XO t ; ˘ 0 t; L.XO t ; ZO t / ;
ˆ
ˆ
ˆ
ˆ
ˆ
ˆ ˛O t; XO t ; ˘ 0 .t; L.XO t ; ZO t //; .t; XO t ; L.XO t //1 ZO t dt
ˆ
ˆ
:̂
CZO t dWt ;
Based on Theorem 4.45, the analog of Proposition 3.11 (the control problem
being understood in the strong instead of weak sense) reads:
Proposition 4.62 Let assumption MFG Solvability HJB be in force, the measure
argument in the coefficients being in P2 .Rd A/ in lieu of P2 .Rd / except in ,
which is still assumed to be a function from Œ0; T Rd P2 .Rd / into Rdd . Then,
a continuous flow of measures D .t /06t6T from Œ0; T to P2 .Rd A/ is an MFG
equilibrium if and only if
O Y;
where .X; O Z/
O solves the McKean-Vlasov FBSDE (4.128).
Using the Stochastic Maximum Principle. When using the Pontryagin stochastic
maximum principle, to derive an analog of Proposition 3.23, the gradient of the
solution of the Hamilton-Jacobi-Bellman equation is no longer related to the process
Z D .Zt /06t6T appearing in the FBSDE formulation of the game, but with the
process Y D .Yt /06t6T instead. In particular, repeating the above discussion shows
that, under the assumption of Lemma 4.61, the analogue of (4.115) reads:
where .X; Y; Z/ D .Xt ; Yt ; Zt /06t6T now denotes the solution of the FBSDE:
4.6 Extended Mean Field Games: Interaction Through the Controls 305
8
ˆ
ˆ dXt D b t; Xt ; ˘ t; L.Xt ; Yt / ; ˛O t; Xt ; ˘.t; L.Xt ; Yt //; Yt dt
ˆ
ˆ
ˆ
ˆ
ˆ
ˆ C dWt ;
<
dYt D @x H t; Xt ; ˘ t; L.Xt ; Yt / ; Yt ; (4.130)
ˆ
ˆ
ˆ
ˆ
ˆ
ˆ ˛O t; Xt ; ˘.t; L.Xt ; Yt //; Yt dt
ˆ
:̂
CZt dWt ;
Proposition 4.64 Under the assumption of Definition 3.22, the measure argument
in the coefficients being in P2 .Rd A/, a continuous flow of measures D .t /06t6T
from Œ0; T to P2 .Rd A/ is an MFG equilibrium if and only if t D ˘.t; L.Xt ; Yt //
for any t 2 Œ0; T, where .X; Y; Z/ solves the McKean-Vlasov FBSDE (4.130).
Example. Following Example 1 right above, we know that, when the running cost
f is of the form:
where denotes the first marginal of on Rd , f0 W Œ0; TRd P2 .Rd A/ ! R and
f1 W Œ0; T Rd P2 .Rd / A ! R, the minimizer ˛O of the Hamiltonian H depends
upon through only and writing ˛.t;
O ; ; / instead of ˛.t;
O ; ; /, the fixed point
mapping ˘.t; / has the explicit expression:
1
˘.t; / D ı Rd Rd 3 .x; z/ 7! .x; ˛.t;
O x; ; z// 2 Rd A ;
Typical Solution
Motivated by the model of price impact which was introduced in Subsection 1.3.2
of Chapter 1, and which we shall solve in the next section, we choose the following
specific set of assumptions to illustrate the applicability of the approach based on
the stochastic maximum principle discussed above.
Assumption (EMFG). The set A is closed and convex and the coefficients b,
, f , and g are defined on Œ0; T Rd A, Œ0; T Rd , Œ0; T Rd P2 .A/ A,
and Rd respectively and they satisfy, for two constants ; L > 0:
(A1) The volatility is constant and the drift is an affine function of .x; ˛/:
(continued)
4.6 Extended Mean Field Games: Interaction Through the Controls 307
Recall that the notation @.x;˛/ f1 stands for the gradient in the joint
variables .x; ˛/. Finally, f1 , @x f1 and @˛ f1 are locally bounded over
Œ0; T Rd A.
(A3) The function g is differentiable and its derivative is Lipschitz continu-
ous. Moreover, g is convex in the sense that:
Notice that in the present context, the measure 2 P2 .A/ should be understood
as the distribution of the control, namely the second marginal of the measure used
throughout this section.
Proof. We make use of Proposition 4.64. Under the standing assumption, the McKean-
Vlasov FBSDE (4.130) takes the form:
(
dXt D b t; Xt ; ˛.t;
O Xt ; Yt / dt C dWt ;
dYt D @x H1 t; Xt ; Yt ; ˛.t;
O Xt ; Yt / C f0 t; L ˛.t;
O Xt ; Yt / dt C Zt dWt ;
for t 2 Œ0; T, with X0 D as initial condition and YT D @x g.XT / as terminal boundary
condition. Above, H1 denotes the Hamiltonian:
and ˛.t;
O x; y/ is the unique minimizer of the function A 3 ˛ 7! H1 .t; x; y; ˛/.
First Step. We proceed as in the analysis of the system (4.50). For a given continuous flow
of measures D . t /06t6T with values in P2 .A/, we consider the system:
(
dXt D b t; Xt ; ˛.t;
O Xt ; Yt / dt C dWt ;
(4.131)
dYt D @x H1 t; Xt ; Yt ; ˛.t;
O Xt ; Yt / C f0 .t; t / dt C Zt dWt ;
for t 2 Œ0; T, with X0 D as initial condition and YT D @x g.XT / as terminal boundary
condition.
Following the proof of Lemma 4.56, the system (4.131) is uniquely solvable and we may
call u its decoupling field. Also, we can find a constant c, independent of , such that, for
all t 2 Œ0; T and x; x0 2 Rd ,
ˇ ˇ
ˇu .t; x0 / u .t; x/ˇ 6 cjx0 xj: (4.132)
308 4 FBSDEs and the Solution of MFGs Without Common Noise
Second Step. We now consider the system (4.131) but without .f0 .t; t //06t6T in the
backward equation:
(
dXt0 D b t; Xt0 ; ˛.t;
O Xt0 ; Yt0 / dt C dWt ;
(4.133)
dYt0 D @x H1 t; Xt0 ; Yt0 ; ˛.t;
O Xt0 ; Yt0 / dt C Zt0 dWt ;
for t 2 Œ0; T, with X00 D as initial condition and YT0 D @x g.XT0 / as terminal boundary
condition.
Below, we let:
˛O t D ˛.t;
O Xt ; Yt /; ˛O t0 D ˛.t;
O Xt0 ; Yt0 /; t 2 Œ0; T:
the constant c being allowed to increase from line to line, as long as it remains independent
of .
Proceeding as in the proof of Lemma 4.56, we obtain that:
E jY0 Y00 j2 6 c:
It is well checked that the above bound is independent of . We then deduce that
ju .0; 0/j 6 c. More generally, we have:
Third Step. We now have all the ingredients to follow the proof of Theorem 4.39. To do so,
we call X and Y the forward and backward components of the solution to (4.131).
Proceeding as in the proof of Theorem 4.39, we deduce that there exists a compact subset
K C .Œ0; T; P2 .Rd // such that, for any input as above, the path .L.Xt //06t6T is in K.
Since u satisfies (4.132) and (4.136), we also deduce that, for any as above, there exists
a compact subset K P2 .Rd / such that, for any t 2 Œ0; T, L.Yt / 2 K.
Thanks to (4.132) and (4.136) once again, we may proceed as in (4.9) and deduce that
ju .t; x/ u .s; x/j 6 c.1 C jxj/jt sj1=2 , from which we get that:
E jYt Ys j2 6 cjt sj;
where the constant c is, as we already explained, independent of . Following the proof of
Theorem 4.39 and modifying if necessary the compact subset K C .Œ0; TI P2 .Rd //, we can
prove that, for all , .L.Yt //06t6T 2 K.
Since ˛O is known to be Lipschitz continuous in .x; y/ and locally bounded, see Lemma 3.3,
we deduce that there exists a compact subset K0 C .Œ0; T; P2 .Rk // such that, for any ,
O Xt ; Yt ///06t6T belongs to K0 . We then conclude as in the proof of Theorem 4.39 the
.L.˛.t;
existence of 2 C .Œ0; T; P2 .Rk // such that
8t 2 Œ0; T; O Xt ; Yt / :
t D L ˛.t;
By construction of ˛,
O it holds that t 2 P2 .A/ for all t 2 Œ0; T. t
u
4.7 Examples
The price impact model presented in Subsection 1.3.2 of Chapter 1 led to the
mean field game model in which for each fixed flow D . t /06t6T of probability
measures on A, a typical player minimizes the quantity:
Z T
J.˛/ D E f .t; Xt ; t ; ˛t /dt C g.XT / (4.137)
0
310 4 FBSDEs and the Solution of MFGs Without Common Noise
Numerical Results
The Hamiltonian
Z
c˛ 2 cX 2
H.t; x; y; ; ˛/ D ˛y C ˛ C x x hd
2 2 R
N
We first consider the case of a linear price impact function h, say h.˛/ D h˛.
Under this extra assumption, assumption EMFG is not satisfied any longer, but the
problem becomes particularly simple because the McKean-Vlasov FBSDE (4.139)
is now affine:
8
ˆ 1
< dXt D c˛ Yt dt C dWt ;
ˆ
N
dYt D cX Xt ch˛ EŒYt dt C Zt dWt ; t 2 Œ0; T; (4.140)
ˆ
:̂ X D x0 ; Y D c X :
0 T g T
This is a particular case of the system (3.53) which we solved by relying on the
ansatz yN t D N t xN t C N t . In the present situation, the functions N and N can be identified
as the solutions of the system of ODEs:
8
ˆ
ˆ PN t C hN N t 1 N 2t C cX D 0;
< c˛ c˛
N
PN t Œ c1˛ N t ch˛ N t D 0; t 2 Œ0; T; (4.142)
ˆ
:̂ N T D cg ; N T D 0:
The first equation is a one-dimensional Riccati equation. Recalling that both c˛ and
cX are strictly positive and cg is nonnegative and following (2.49)–(2.50), we get:
C C
C e.ı ı /.Tt/ 1 cg ı C e.ı ı /.Tt/ ı
Nt D ; (4.143)
ı e.ıC ı /.Tt/ ı C cg B e.ıC ı /.Tt/ 1
p
N
for t 2 Œ0; T, where A D h=.2c˛ /, B D 1=c˛ , C D cX , ı
˙
D A ˙ R, with
2
R D A C BC > 0.
312 4 FBSDEs and the Solution of MFGs Without Common Noise
The second equation in (4.142) is a first order homogenous linear equation with
terminal condition zero, so its solution is identically zero. Consequently the means
xN t and yN t are given by:
1
Rt 1
Rt
xN t D x0 e c˛ 0 N u du
; and yN t D x0 N t e c˛ 0 N u du
; t 2 Œ0; T: (4.144)
We can now go back to the McKean-Vlasov FBSDE (4.140) and solve for the
equilibrium processes X and Y. As explained above, we use the ansatz Yt D t Xt Ct
and substitute the quantity yN t just computed for EŒYt . Computing the stochastic
differential of Y using such an ansatz and the equations in (4.140), we find that
these functions solve the system of ODEs:
8
ˆ
ˆ Pt D 1 2
cX ;
< c˛ t
1 N
P t D ch˛ yN t ; (4.145)
ˆ c˛ t t
:̂ D cg ; T D 0;
T
where we use the notation yN t for the expectation EŒYt . The first equation is a Riccati
equation which can be solved directly. Proceeding as above, we find:
p p p
p c˛ cX =c˛ cg .c˛ cX =c˛ C cg /e2 cX =c˛ .Tt/
t D c˛ cX =c˛ p p p ; (4.146)
c˛ cX =c˛ cg C .c˛ cX =c˛ C cg /e2 cX =c˛ .Tt/
for t 2 Œ0; T. Once is determined, one can inject its value (4.146) into the explicit
solution of the second equation in (4.145) which reads:
Z T
hN 1
Rs
t D yN s e c˛ t u du
ds; t 2 Œ0; T: (4.147)
c˛ t
1
dXt D xt dt C dWt ;
t Xt C . N t t /N t 2 Œ0; T I X0 D x0 :
c˛
Since Yt D t Xt C t , the adjoint process and the optimal control process ˛O D .˛O t D
.˛/ .˛/
Yt =c˛ /06t6T are also Gaussian. Notice that ˛O t N.t ; Œt 2 / with:
Z
.˛/ x0 1 Rt
N u du .˛/ 2 t
2
Rt
t D N t e c˛ 0 and Œt 2 D 2
e c˛ s u du
ds:
c˛ c2˛ t
0
Figure 4.1 shows the time evolution of the density of the control process.
4.7 Examples 313
6
5
4
density
3
2
1
0
Fig. 4.1 Time evolution (for t ranging from 0:06 to T D 1) of the marginal density of the optimal
rate of trading ˛O t for a representative trader for the values x0 D 1, cX D 0:1, cg D 0:3, c˛ D 2,
k D 10 and D 0:7 of the parameters.
Inventory Inventory
(E[X_T]-Xi)/Xi (E[X_T]-Xi)/Xi
-0.2 -0.2
(E[X_T]-Xi)/
(E[X_T]-Xi)/
-0.4 -0.4
-0.6 -0.6
Xi
Xi
-0.8 -0.8
r
x
ba
c_
h_
c_g c_g
Fig. 4.2 Expected terminal inventory as a function of cg and cX (left) when the latter varies from
0:01 to 1 and hN D 10, and as a function of cg and hN (right) when the latter varies from 0:01 to
10 and cX D 0:1. In both cases, cg varies from 0:01 to 10. The values of the other parameters are
c˛ D 2 and D 0:7.
Figures 4.2 and 4.3 give surface plots of the expected terminal inventory EŒXT as
a function of the various parameters of the model. Clearly, both plots of Figure 4.2
confirm the intuition that large values of the parameter cg would force this terminal
inventory to be small.
Figure 4.3 seems to indicate that the price impact parameter hN does not have
a large influence on the expected terminal inventory for large values of the
parameters cx and c˛ . However, for small values of the parameters cx and c˛ , the
expected terminal inventory seems to be a decreasing function of the price impact
N
parameter h.
314 4 FBSDEs and the Solution of MFGs Without Common Noise
Inventory Inventory
(E[X_T]-Xi)/Xi (E[X_T]-Xi)/Xi
-0.2 -0.4
(E[X_T]-XI)/
(E[X_T]-XI)/
-0.5
-0.4
-0.6
-0.6 -0.7
-0.8
XI
-0.8
XI
r
ba
ba
-0.9
h_
h_
c_alp -1.0 c_x
ha
Fig. 4.3 Expected terminal inventory as a function of c˛ and hN (left) when the former varies from
0:01 to 10 and cX D 0:1, and as a function of cX and hN (right) when the former varies from 0:01
to 10 and c˛ D 2. In both cases, hN varies from 0:01 to 10. The values of the other parameters are
cg D 1 and D 0:7.
We now study in detail the crowd congestion model introduced in Subsection 1.5.3
of Chapter 1.
We model the behavior of N individuals exiting an enclosed area such as a
ballroom, or a theater. The room is modeled as a bounded closed convex polyhedron
D Rd , and the exits comprise the connected components of a relatively closed
subset E of the boundary @D (so that E itself is closed) with a nonempty relative
interior. The convexity assumption is mostly for convenience as it is not needed for
most of the theoretical arguments we use below. Non-convex models are important
for applications. Indeed, domains with holes can be used to model physical obstacles
(e.g., barriers, pillars, rows of seats, : : :) impeding the motion of the individuals.
Also, dumbell-like domains comprising thin corridors connecting convex bodies
can provide realistic models for suites of rooms connected by narrow hallways or
by staircases.
In the mean field game limit, the dynamics of the position of an individual are
assumed to be given by a controlled reflected stochastic differential equation of the
form:
will want to track is the exit time of the room given by the first hitting time of E by
the process X defined as:
When an individual reaches a door, we consider that it is not part of the game any
longer, and instead of letting the reflection take place, we use a standard procedure
in the classical theory of Markov processes to send the individual to a point added
to the state space D and called the cemetery. We use the notation for the cemetery
to follow the tradition of the classical texts on Markov processes, hoping that it will
not be confused with the notation x used for the Laplacian operator at the end of
this subsection. Indeed the cemetery notation will only enter the definition of the
extended state space D D D [ f g. For the sake of definiteness, we shall assume
that 2 Rd n D. We want to apologize to the reader for our willingness to abide by
the standard notation and terminology that create this amusing oxymoron: we end
up calling cemetery the place the individuals want to reach (since they want to leave
the room) in the shortest amount of time !
For the control step of the mean field game problem, we fix a continuous flow
D .t /06t6T of probability measures in C.Œ0; TI P.D //, and to each admissible
control ˛, we associate the cost:
Z
T^
j˛t j2
J .˛/ D E `.Xt ; t / C f .t/ dt ; (4.150)
0 2
for an increasing continuous function ' from RC into itself, and a smooth even
compactly supported density W Rd ! RC . Here, is any probability measure on
D , jD denotes its restriction to D, and ‘*’ stands for the standard convolution on
Euclidean space.
The fixed point step of the mean field game problem can be formulated as follows.
If X is the solution of the optimal control problem described above, we define the
316 4 FBSDEs and the Solution of MFGs Without Common Noise
process X0 by Xt0 D Xt for t < and Xt0 D whenever t > , and we say that X0
(or the flow ) is a solution of the MFG problem if, for any t 2 Œ0; T; t coincides
with the distribution of Xt0 (and not of Xt ). Notice that in the law of Xt0 , the interesting
part is the measure tjD D 1D t which describes the statistical distribution of the
individuals who have not exited yet by time t. This is not necessarily a probability
measure since its total mass is the proportion of individuals still in D at time t.
We could have avoided the introduction of the cemetery and used flows of sub-
probability measures to define the costs and the mean field game equilibrium, but in
order to use the tools developed in this book for probability measures we introduced
the state process X0 and the above definition for the MFG equilibrium. In case when
` is given as the convolution of jD with , we may choose for cemetery a point
satisfying 62 supp./ C D. This condition ensures that the convolution of and
does not feel the mass allocated to the cemetery.
Because the problem is set on a bounded domain instead of the whole Euclidean
space Rd , and because of the special type of mixed boundary conditions needed in
this model, we cannot use directly the results derived in Subsections 4.4 and 4.5.
However, we shall prove that similar arguments may be used to solve the MFG
problem as derived from (4.148)–(4.150).
For the purpose of illustration we treat a numerical example at the end of this
subsection.
where for any t 2 Œ0; T, nt is an inward pointing unit normal vector to @D at Xt .
The first identity expresses the fact that K only acts when X is on the boundary.
4.7 Examples 317
The second one says that K is directed along an inward pointing unit normal to
the boundary. Importantly, observe that for any F-progressively measurable process
Z D .Zt /06t6T with values in D, it holds, for all t 2 Œ0; T,
Z Z
t t
Zs Xs dKs D 1fXs 2@Dg Zs Xs ns djKjs > 0; (4.152)
0 0
Lemma 4.66 For any 2 .0; 1=2/, the cumulative distribution function of D
infft 0 W Xt 2 Eg, namely the function RC 3 t 7! P. 6 t/ is .1=2 /
Hölder continuous, uniformly in time. Moreover, P. < 1/ D 1 and for any t > 0,
P. 6 t/ > 0.
Weak Formulation
Now, for any measure flow D .t /06t6T in C.Œ0; TI P.D // and any admissible
control process ˛ D .˛t /06t6T (i.e., any F-progressively measurable A-valued
process), we define the probability P;˛ on .˝; FT / by:
Z Z
dP;˛ T
1 T
D exp ˛t dWt j˛t j2 dt :
dP 0 2 0
Observe that P;˛ is in fact independent of , the rationale for the exponent being
mostly for pedagogical reasons. Notice also that, under P;˛ , X is not a reflected
Brownian motion any longer. It is the result
R t of the reflection of a process which is a
Brownian motion plus a drift given by 0 ˛s ds, which is what we were looking for.
Under the weak formulation, the cost associated with ˛ is:
Z
;weak ;˛
T^ 1 2
J .˛/ D E `.Xt ; t /j˛t j C f .t/ dt ;
0 2
where we use the notation E;˛ for the expectation with respect to the probability
P;˛ . The reduced Hamiltonian H is independent of the boundary condition. It is
given by the same formula:
318 4 FBSDEs and the Solution of MFGs Without Common Noise
1
H.t; x; ; y; ˛/ D ˛ y C `.x; /j˛j2 C f .t/:
2
A straightforward computation shows that the minimizer of the function A 3 ˛ 7!
H.t; x; ; y; ˛/ is equal to the orthogonal projection of y=`.x; / onto the convex
set A. With the same notation as above, we thus have ˛.x;
O ; y/ D ˘A .y=`.x; //
where ˘A is the orthogonal projection onto A. Hence, the minimized Hamiltonian
H is given by:
Proposition 4.67 For any continuous flow D .t /06t6T of probability measures
on D , the BSDE:
dYt D 1ft6 g H t; Xt ; t ; Zt ; ˛.X
O t ; t ; Zt / dt C Zt dWt ; (4.153)
for t 2 Œ0; T, with terminal condition YT D 0, is uniquely solvable. Moreover, the
control ˛O D .˛O t /06t6T defined by ˛O t D ˛.X
O t ; t ; Zt / is the unique optimal control
over the interval Œ0; T and the optimal cost of the problem is given by:
Xt if t <
8t 2 Œ0; T; t D P;˛O ı .Xt0 /1 ; where Xt0 D : (4.155)
otherwise
our strategy is to prove that ˚ admits a fixed point by checking that Schauder’s
theorem can be used in the same way as in Subsection 4.3.2.
We start with the following simple remark. Since D is bounded and closed,
P.D / coincides with P2 .D / and the topology of weak convergence usually
considered on P.D / is the same as the topology given by the Wasserstein distance
W2 on P2 .D /. Also, P.D / is a closed compact subset of P2 .Rd /. These facts were
already mentioned in Chapter 1, and they will be discussed in detail in Chapter 5.
For this reason, C.Œ0; TI P.D // may be regarded as a closed convex subset of
C.Œ0; TI P2 .Rd //, and we can work along the same lines as in Section 4.3. Below,
we shall write C.Œ0; TI P2 .D // to emphasize the fact that P.D / is equipped with
the 2-Wasserstein distance and that C.Œ0; TI P.D // is equipped with the supremum
distance induced by W2 . Henceforth, we aim at applying Schauder’s theorem as in
Subsection 4.3.2, and for that, it suffices to prove that ˚ is continuous and has a
relatively compact range. This is proven in Lemmas 4.68 and 4.69 below, proving
that the MFG problem (4.155) has a solution.
Lemma 4.68 There exists a constant C such that, for any 2 C.Œ0; TI P2 .D //,
8s; t 2 Œ0; T; W2 P;˛O ı .Xt0 /1 ; P;˛O ı .Xs0 /1 6 Cjt sj1=8 :
dt jXt Xs j2 D 2.Xt Xs / ˛O Xt ; t ; Zt dt C dWt C dKt C dt:
By (4.152), we have:
2.Xt Xs / dKt 6 0;
Taking expectations and using the fact that ˛O is bounded, we deduce that there exists a
constant C, independent of , such that:
8s; t 2 Œ0; T; E;˛O jXt Xs j2 6 Cjt sj: (4.157)
320 4 FBSDEs and the Solution of MFGs Without Common Noise
where we allowed the value of the constant C to change from line to line. By (4.157), we get:
E;˛O jXt0 Xs0 j2 6 C.t s/ C CP;˛O s < < t :
The density of P;˛O with respect to P is defined in terms of ˛,
O which is bounded,
independently of . Therefore,
dP;˛O 2
E 6 C;
dP
from which we get:
1=2
E;˛O jXt0 Xs0 j2 6 C.t s/ C CP s < < t :
The result follows from the fact that according to Lemma 4.66, the cumulative distribution
function of is Hölder continuous. t
u
Lemma 4.69 The function ˚ in (4.156) is continuous from C.Œ0; TI P2 .D // into
itself.
Proof. Given two continuous flows D .t /06t6T and 0 D .0t /06t6T with values in
0
P .D /, we compare the control processes ˛O and ˛O . By Proposition 4.15, we may call
.Yt ; Zt /06t6T the solution of the quadratic BSDE (4.153) driven by , and .Yt0 ; Zt0 /06t6T the
solution of the quadratic BSDE (4.153) driven by 0 . Computing the difference between Yt
and Yt0 for t 2 Œ0; T, we get:
h
d Yt Yt0 D 1ft<g H Xt ; t ; Zt ; ˛O Xt ; t ; Zt H Xt ; t ; Zt0 ; ˛O Xt ; t ; Zt0
i
C H Xt ; t ; Zt0 ; ˛O Xt ; t ; Zt0 H Xt ; 0t ; Zt0 ; ˛O Xt ; 0t ; Zt0 dt
C Zt Zt0 dWt :
Notice now, from the local Lipschitz property of the Hamiltonian H in the variables z and ˛
and from the Lipschitz property of the optimizer ˛O in the variables z and ˛, that we can find
a process D .t /06t6T with values in Rd such that, for some constant C > 0, independent
of and 0 ,
jt j 6 C 1 C jZt j C jZt0 j ; t 2 Œ0; T;
4.7 Examples 321
Rt
Define now the drifted Brownian motion .Wt D Wt 0 s ds/06t6T together with the
Girsanov transform:
Z Z
dP T
1 T
2
D exp s dWs js j ds :
dP 0 2 0
By the BMO property of Z and Z0 , we know from Proposition 4.18 that there exists r > 1,
independent of and 0 , such that (allowing the constant C to vary from line to line):
h dP ri
E 6 C: (4.158)
dP
for t 2 Œ0; T. Taking the power 2p on both sides for some p > 1, we deduce by standard
BSDE inequalities that:
Z p
T
E sup jYt Yt0 j2p C E jZt Zt0 j2 dt
06t6T 0
Z Tˇ
6 CE ˇH Xt ; t ; Z 0 ; ˛O Xt ; t ; Z 0
t t
0
p
ˇ2
H Xt ; 0t ; Zt0 ; ˛O Xt ; 0t ; Zt0 ˇ dt :
RT
Recalling that for any q > 1, EŒ. 0 jZt0 j2 dt/q can be bounded independently of 0 , see again
Proposition 4.18, we easily deduce that the above right-hand side tends to 0 as 0 tends to .
Therefore, for any p > 1,
Z T p Z T p
0
E jZt Zt0 j2 dt ; and thus E
j˛O t ˛O t j2 dt ;
0 0
322 4 FBSDEs and the Solution of MFGs Without Common Noise
0 0 Z Z
dP ;O˛ T 0 1 T 0
D exp ˛O t ˛O t dWt j˛O t j2 j˛O t j2 dt :
dP;O˛ 0 2 0
Recalling that the set A is bounded, we easily deduce that the above right-hand side tends to
1 as 0 tends to . Therefore, for any p > 1,
0
h dP0 ;O˛ pi
lim sup E 6 1:
0 ! dP;O˛
Now,
0 0 0
h dP0 ;O˛ 2i
h dP0 ;O˛ 2i
h dP0 ;O˛ i
;O
˛ ;O
˛ ;O
˛
E 1 DE C 1 2E
dP;O˛ dP;O˛ dP;O˛
0
(4.161)
h dP0 ;O˛ 2i
D E;O˛ 1:
dP;O˛
Finally, for any bounded measurable function F on C .Œ0; TI R2d /, we have:
0
0
h dP0 ;O˛ i
0 ;O
˛ ;O
˛ ;O
˛
E F.X/ E F.X/ D E 1 F.X/ ;
dP;O˛
the convergence being uniform over measurable mappings F with a supremum norm less
than 1.
Observing that, for a given t 2 Œ0; T, the mapping C .Œ0; TI Rd / 3 x 7! xt0 , with xt0 D xt
if t < £ and xt0 D if t > £ and £ D infft 2 Œ0; T W xt 2 Eg, is measurable, we deduce that,
for any bounded measurable function h W D ! R,
ˇ 0 0
ˇ
ˇ ˇ
lim
0
sup ˇE ;O˛ h.Xt0 / E;O˛ h.Xt0 / ˇ D 0;
! t2Œ0;T
the convergence being uniform over measurable mappings h with a sup-norm less than 1.
Expressed in terms of the function ˚ used in the statement, this says that:
ˇZ Z ˇ
ˇ ˇ
lim
0
sup sup ˇˇ hd ˚.0 / t hd ˚./ t ˇˇ D 0;
! khk 6 1 t2Œ0;T D D
1
that is, the probability measure .˚.0 //t weakly converges to .˚.//t , uniformly in t 2
Œ0; T. Recall that, since D is bounded, the metric associated with weak convergence on
P .D / is equivalent to the 2-Wasserstein distance. Therefore,
lim sup W2 ˚.0 / t ; ˚./ t D 0;
0 ! t2Œ0;T
which shows that ˚ is a continuous mapping from C .Œ0; TI P2 .D // into itself. t
u
Proof. First, we recall some basic facts about exit times and exit distributions of standard
Brownian motion, as well as some properties of reflected Brownian motions. Recall that
the boundary @D of the domain D is assumed to be piecewise smooth. See the Notes &
Complements at the end of the chapter for references to papers providing proofs of these
results.
Under the standing assumptions on the domain D, the reflected Brownian motion in D has
a fundamental solution .p.t; x; y//t>0;x2D;y2D , namely:
Z
P Xt 2 BjX0 D x D p.t; x; y/dy; B 2 B.D/:
B
For any t > 0, the mapping D2 3 .x; y/ 7! p.t; x; y/ is continuous and (strictly) positive.
Moreover, there exists a constant C such that, for all t 2 .0; 1/,
jx yj2
8x; y 2 D; p.t; x; y/ 6 Ctd=2 exp ; (4.163)
Ct
4.7 Examples 325
At times, it will be convenient to compare the stopping time (defined as the first hitting time
of the part E of the boundary @D) to the first exit time Q D infft > 0I Xt 2 @Dg of the domain
D. Although we shall not use this fact, we mention that the joint distribution of the first time
of exit and the location of exit, namely L.Q ; XQ / is absolutely continuous with respect to the
measure dt .dy/ where .dy/ denotes the surface measure on @D. More precisely for any
starting point x in the interior of D, we have:
1 @ 0
P Q 2 dt; XQ 2 dy j X0 D x D p .t; x; y/ dt .dy/ (4.166)
2 @ny
where ny denotes the inward pointing unit normal vector to @D at y 2 @D, and p0 .t; x; y/ is
the fundamental solution of the Dirichlet problem in D, namely the density of the Brownian
motion killed the first time it hits the boundary @D; in other words:
We used the process X while talking about standard Brownian motion because, at least in
distribution, X behaves like a standard Brownian motion up until time Q . See the Notes &
Complements for references.
Now, we tackle the proof of the lemma by considering X0 D , as in (4.151). For any
t > 0, we denote by t the distribution of Xt . For each t > 0, t has a density, which we
denote by t :
Z
t .x/ D p.t; y; x/d0 .y/; x 2 D;
D
=2
When t 2 .0; 1/ and x 2 E , we have:
Z Z
C jx yj2
t .x/ 6 p.t; y; x/0 .y/dy C d=2 exp d0 .y/
E t DnE Ct
Z
C 2
6 p.t; y; x/0 .y/dy C d=2 exp ;
E t 4Ct
where as before, E D fx 2 D W dist.x; E/ < g, and we used the notation 0 for the density
of the absolutely continuous part of 0 . Since we assume that 0 is bounded on E , we deduce
that there exists a constant C0 such that, for all t 2 .0; 1/,
=2
8x 2 E ; t .x/ 6 C0 : (4.167)
We now denote by mt the restriction of the distribution of Xt0 to D, namely the sub-
probability measure defined by:
mt .B/ D P Xt 2 B; t < ; B 2 B.D/;
the initial condition of X0 being prescribed. Obviously, we always have mt .B/ 6 t .B/. For
any t > 0 such that PŒ > t > 0, we have:
Z
P 6 t C hj > t D P 6 h j X0 D x dt .x/:
D
For a given x 2 D, we deduce from (4.165) that, for all h 2 .0; 1/,
h ˇ i
P 6 h j X0 D x 6 P sup jXt xj > dist.x; E/ ˇ X0 D x
06t6h
.dist.x; E//2
6 C exp :
Ch
We split the integral in the right-hand side into two parts according to the partition of D into
E =2 and D n E =2 . Using (4.167) and allowing the constant C to increase from line to line,
we get, for any 2 .0; 1=2/,
P 6 t C hj > t
Z Z
.dist.x; E//2 2
6C exp dx C C exp dt .x/
E =2 4Ch DnE =2 Ch
Z
dx
6 Ch.1/=2 dx C C 1 h1=2 :
E =2 .dist.x; E//1
and then,
P > t C h > P > t .1 Ch.1/=2 /:
Since the function t 7! PŒ > t is nonincreasing, this completes the proof of the desired
Hölder continuity, as long as we can check that (4.168) holds. This is indeed the case because,
writing E @D D [NiD1 Fi , where .Fi /iD1; ;N denote the faces of D, it suffices to prove that,
for all i D 1; ; N,
Z
1
dx < 1:
D .dist.x; Fi //1
Now, the distance from x to Fi is greater than the distance from x to the hyperplane Hi
supporting Fi . The result easily follows by changing the coordinates in such a way that,
under the new coordinates, dist.x; Hi / D x1 , x1 denoting the first coordinate of x in the new
reference frame.
Finally, we prove that, for any t > 0, PŒ 6 t > 0.
We start with the case when the support of 0 D L./ is not included in @D. Then, we can
find x0 2 D and > 0 small enough such that the d-dimensional ball B.x0 ; / is included in
the interior of D. Also, we know that E contains a .d 1/-dimensional relatively open ball F
included in one of the face of @D. Thanks to the polyhedral structure of D, we can find, for
any t > 0, a piecewise linear function % W Œ0; t ! Rd such that %.0/ D x0 , %.Œ0; t=2/ D,
%.Œ0; t=2/ \ @D F, %..t=2; t/ \ D D ; and dist.%.t/; D/ > 1 (that is %.Œ0; t/ crosses
@D at some point in F). Also, for small enough, we can a draw a tube T D fx 2 Rd W
infs2Œ0;t jx %.s/j 6 g such that T \ @D F. By support theorem for the Brownian motion,
we know that PŒ8s 2 Œ0; t; X0 C Ws 2 T > 0. Since Xs D X0 C Ws for s 6 Q , we deduce
that PŒQ 6 t; XQ 2 F > 0. This concludes the proof since PŒ 6 t > PŒQ 6 t; XQ 2 F.
When X0 D is concentrated on the boundary, we may use the fact that D2 3 .x; y/ 7!
p.t; x; y/ is strictly positive. In particular, for any t > 0, p.t; x; / must charge a ball in the
interior of D. Therefore, when starting from the boundary, there is a positive probability to
reach a ball in the interior of D, in any positive time. By the Markov property, we deduce that
there is a positive probability to reach E, in any positive time.
In order to prove that is almost surely finite, we use the fact that p.1=2; ; / is bounded
from below. Starting from any point, there is a positive probability to belong, at time 1=2,
to the same ball B.x0 ; / as that constructed right above. Starting from this ball, there is a
positive probability to reach E between t D 1=2 and t D 1. By a standard iteration argument
based on the Markov property, we deduce that PŒ < 1 D 1. t
u
where we use the notation ˛.t; Q x/ D ˛.x; O t ; @x V.t; x// for convenience. The
intuitive interpretation of mt .x/ is the proportion of individuals who have not yet
exited by time t. Since deriving this equation directly involves delicate computations
with the singular process K (involving the local time of the process at the relevant
part of the boundary), we start from a solution of the above equation and identify it,
at least formally, with the distribution of the part of the population still in D by time
t. So we pick a time S 2 Œ0; T and an arbitrary continuous bounded function g on D
and we prove that:
Z
g.x/m.S; x/dx D E g.XS /1fS< g : (4.171)
D
In order to do so, for S and g given, we consider the solution u of the parabolic
Dirichlet-Neumann problem:
1
@t u.t; x/ C x u.t; x/ C ˛.t;
Q x/ @x u.t; x/ D 0; (4.172)
2
for .t; x/ 2 Œ0; S D, with Neumann boundary condition @x u.t; x/ n.x/ D 0 for
.t; x/ 2 Œ0; S/ .@D n E/, Dirichlet boundary condition u.t; x/ D 0 for t 2 Œ0; S
E, and terminal condition u.S; x/ D g.x/ for x 2 E. Consider also the solution
.Xt /06t6T of the reflected SDE:
dXt D ˛.t;
Q Xt /dt C dWt C dKt ; t 2 Œ0; T:
Introducing as above the first hitting time D infft > 0 W Xt 2 Eg, and assuming
that u is smooth enough to apply Itô’s formula, we get:
d
E u t ^ ; Xt^ D 0; t 2 Œ0; S: (4.173)
dt
4.7 Examples 329
We used the fact that the expectation of the stochastic integral with respect to dWt
is 0 and that the Neumann boundary condition on u kills the integral with respect to
dKt . Now, using the notation m.t; x/ for mt .x/ and the equations (4.170) and (4.172)
satisfied by m and u, we get:
Z
d
u.t; x/m.t; x/dx
dt D
Z h i
1
D x u.t; x/ C ˛.t;
Q x/ @x u.t; x/ m.t; x/dx (4.174)
D 2
Z h
1 i
C u.t; x/ x m.t; x/ u.t; x/divx ˛.t;
Q x/m.t; x/ dx:
D 2
where denotes the surface measure. Since u.t; x/ D m.t; x/ D 0 for x 2 E and
@x u.t; x/ n.x/ D 0 for x 2 @D n E, we obtain:
Z Z
x u.t; x/m.t; x/dx C u.t; x/ x m.t; x/dx
D D
Z (4.175)
D u.t; x/@x m.t; x/ n.x/d .x/:
@DnE
which gives the desired result (4.171) if we use the terminal condition of u.
Numerical Illustration
For the purpose of illustration, we consider the domain D D Œ0; 1 Œ0; 1 in the
plane, and two exit doors E D .Œ0:95; 1 f0g/ [ .Œ0:98; 1 f1g/. They are shown
as unions of gray circles on the various panels of Figure 4.5. We implemented the
search for an equilibrium density as a simple Picard iteration. At each step of the
iteration, we use the values m.t; x; y/ obtained from the previous iteration with a
simple monotone finite difference Euler scheme to compute the solution of the HJB
equation (4.169) backward in time. In the case at hand, this HJB equation becomes:
2 ˇ
@t V.t; x/ C V.t; x/ jrVj2 C ı D 0; V.T; / 0:
2 2.1 C m.t; x//˛
if we choose R2 for A and a constant ı > 0 for the function f .t/ appearing in
the expression of the loss J .˛/ in (4.150) penalizing the time spent in the room
before exiting, and if we replace the congestion penalty `.x; / by a multiple of the
quantity .1 C m.x//˛ where m stands for the density of . Once a solution of the
HJB equation is found, we solve (again with a simple monotone finite difference
Euler scheme) the forward Kolmogorov equation (4.170) which reads:
2 mt rV
@t mt mt ˇdiv D 0:
2 .1 C mt /˛
For the sake of definiteness (and comparison with numerical studies reported in the
still unpublished literature), we chose the values D 0:1, ˇ D 16, ı D 1=320, and
we let time evolve from t D 0 to T D 8 by increments of size t D 0:02. Lack
of congestion corresponds to the choice ˛ D 0. For the purpose of the numerical
experiments whose results are reported below, we include minor contagion in the
model by choosing ˛ D 0:1.
Figure 4.4 shows the time evolution of the total mass of the measure mt ,
essentially the number of individuals still in the room at time t, starting from a
uniform distribution of individuals in a square at the center of the room. This plot
shows that the effect of the congestion is to slow down the exit indeed.
4.7 Examples 331
1.0
Total
Mass
alpha=0
0.8
alpha=0.1
5
Total mass
0.6
4
m_t
0.4
2
0.2
1
y
x 0
0.0
0 2 4 6 8
time
Fig. 4.4 Left: Initial distribution m0 used for the numerical illustrations. The small exits are
marked in gray. Right: Time evolution of the total mass of the distribution mt of the individuals
still in the room at time t for ˛ D 0, i.e., absence of congestion (continuous line) and ˛ D 0:1, i.e.,
moderate congestion (dotted line).
Figure 4.5 shows the time evolution of the density m.t; / starting from a uniform
distribution of individuals in a square at the center of the room. Snapshots of the
density are given for times t D 0:42; 1:22 and t D 2:42. For the sake of comparison
with numerical studies reported in the unpublished literature, we chose the values
D 0:2, ˇ D 16, and ˛ D 0:1.
Clearly the congestion term slows down the exit of the individuals as we see
that it takes longer for the same mass of individuals to reach the exit doors, as
more individuals are stranded looking for the exit and bouncing off the walls before
finding the exit.
Total Total
Mass Mass
7
6
6
5
5
4
m_t
m_t
4
3 3
2 2
1 1
y
x 0 x 0
Total Total
Mass Mass
8
10
8 6
m_t
6 m_t 4
4
2
2
y
y
x 0 x 0
Total Total
Mass Mass
10
15
8
m_t
m_t
10 6
4
5
2
y
x 0 x 0
Total Total
Mass Mass
5
15
4
m_t
m_t
3 10
2
5
1
y
x 0 x 0
Fig. 4.5 Surface plots of the density mt for times t D 0:42; 1:22 and t D 2:42 (from the top)
for ˛ D 0, i.e., absence of congestion (left column) and ˛ D 0:1, i.e., moderate congestion (right
column). The exit doors are shown as unions of gray circles.
process. Given a fixed flow D .t /06t6T 2 C.Œ0; TI R2d / of probability measures
on R2d , the cost of a control ˛ is given by:
Z
T
1 2
J .˛/ D E j˛t j C f .xt ; vt /; t dt : (4.178)
0 2
Clearly, the parameter ˇ > 0 plays a crucial role in both cases. Its role is to quantify
how much particles whose positions x are far from the bulk of the positions x0 of
particles distributed according to the input distribution , contribute to the running
cost. Interestingly, in the case ˇ D 0, and for the running cost function (4.179),
we saw in Subsection 3.6.1 that the denominator is identically one, and the model
reduces to a LQ mean field game which we solved by the methods presented in
Section 3.5 of Chapter 3. Existence of equilibria in the case ˇ > 0 does not follow
directly from the results presented so far, mostly because of the degeneracy and
because of the lack of convexity of the function f .
In order to simplify somehow the discussion of the mathematical analysis
presented below, we shall assume that f is bounded. This comes as a slight restriction
in comparison with the original model described in Subsection 1.5.1 as recalled
above. In fact, we shall assume much more as we will require that the function f
is continuous on R2d P2 .R2d /, P2 .R2d / being equipped with the 2-Wasserstein
distance, and that for any fixed 2 P2 .R2d /, f is twice differentiable in .x; v/ 2
R2d , with derivatives uniformly bounded in .
Our goal is to use the first prong of the probabilistic approach based on the FBSDE
representation of the value function, in the strong formulation as given in (4.55).
In this approach, the relevant FBSDE is obtained by replacing the control by its
minimizer in both the forward dynamics and the BSDE representation of the value
function, and the adjoint variable by the martingale integrand multiplied by the
inverse of the volatility. Unfortunately, the .2d/ .2d/ volatility matrix is not
invertible in the present situation. Indeed, it is of the form:
0d 0d
.t; x; / D :
0d Id
However, if we identify the control space to the closed linear subspace A D f0d gRd
of R2d , the current form of the flocking model fits the framework of Remark 4.50
since:
b t; .x; v/; ; .0; ˛/ D .v; 0/ C .t; x; /.0; ˛/ ;
and we can use the content of this remark to derive the appropriate version of the
FBSDE to be solved. Choosing D 1 in the subsequent analysis, it reads:
8
< dxt D vt dt; dvt D Zt dt C dWt ;
(4.181)
: dYt D 1
jZ j2
2 t
C f .xt ; vt /; t dt C Zt dWt ;
Proposition 4.70 For a given input D .t /06t6T as above, the equation (4.182)
has a unique solution .xt ; vt ; Yt ; Zt /06t6T , for which Y D .Yt /06t6T and the
martingale integrand Z D .Zt /06t6T are bounded.
Moreover, there exists a bounded and continuous function U W Œ0; T Rd
R ! R, differentiable and Lipschitz continuous in the space argument uniformly
d
in time, such that with probability 1, for all t 2 Œ0; T, Yt D U.t; xt ; vt / and Zt D
@v U.t; xt ; vt /. For each x; v 2 Rd and t 2 Œ0; T, the quantity U.t; x; v/ is given by
the formula:
U.t; x; v/ D
Z Tt Z s (4.183)
ln E exp f x C vs C Wr dr; v C Ws ; sCt ds :
0 0
Proof. Since the forward and backward components of equation (4.182) are decoupled,
existence and uniqueness of a solution .x; v; Y; Z/ depend only on the BSDE part, and such
a result (with Y bounded), follows from Theorem 4.15. The identification of this solution
and of the decoupling field relies on the so-called Cole-Hopf transformation. Indeed, by Itô
formula, it must hold:
d eYt D eYt f .xt ; vt /; t dt eYt Zt dWt ; t 2 Œ0; T I eYT D 1:
Q Z/
Since Y D .Yt /06t6T is a bounded process, we deduce that the pair process .Y; Q D
.eYt ; eYt Zt /06t6T solves the linear BSDE:
dYQ t D YQ t f .xt ; vt /; t dt C ZQ t dWt ; t 2 Œ0; T I YQ T D 1;
which is uniquely solvable since f is bounded. Call .YQ t ; ZQ t /06t6T the solution. Then,
Rt
Rt
d YQ t e 0 f ..xs ;vs /;s /ds D e 0 f ..xs ;vs /;s /ds ZQ t dWt :
336 4 FBSDEs and the Solution of MFGs Without Common Noise
Again, by boundedness of f , the term in the right-hand side must be a martingale. Thus,
RT
Rt
E YQT e 0 f ..xs ;vs /;s /ds j Ft D YQ t e 0 f ..xs ;vs /;s /ds ;
that is,
R
T
YQ t D E e t f ..xs ;vs /;s /ds j Ft :
YQ t D eU.t;xt ;vt / ;
Recalling that by definition Yt D ln.YQ t / and Zt D ZQ t =YQ t , we conclude that the process
.Yt ; Zt /06t6T satisfies the conditions in the statement of the lemma. t
u
1 1
@t U.t; x; v/ C v @x U.t; x; v/ C @2vv U.t; x; v/ j@v U.t; x; v/j2
2 2
C f .x; v/; t D 0;
for .t; .x; v// 2 Œ0; T R2d , with the terminal condition U.T; ; / D 0.
Proof. Existence and continuity of the first and second order derivatives in .x; v/ is a
straightforward consequence of the regularity of f and of the formula given for U in the
statement of Proposition 4.70.
Existence of the first order derivative in time may be proved as follows. For some initial
condition .t; .x; v// 2 Œ0; T R2d , consider the unique solution .Xs ; Ys ; Zs /t6s6T of (4.182)
with Xt D .x; v/ as initial condition at time t. Recall that we use the notation Xt D .xt ; vt / for
0 6 t 6 T. Then, by applying Itô’s formula in the space variable only, we get:
4.7 Examples 337
i
E U t C h; xtCh ; vtCh D U.t C h; x; v
Z (4.184)
tCh 1
CE vs @x U t C h; xs ; vs C @2vv U t C h; xs ; vs ds:
t 2
Recall that we can identify the left-hand side with EŒYtCh . Going back to the backward
equation (4.182) and using the fact that Zs D @v U.s; xs ; vs / for s 2 Œt; T, we see that:
Z tCh
1
EŒYtCh D U.t; x; v/ C E j@v U.s; xs ; vs /j2 f .xs ; vs /; s ds: (4.185)
t 2
Therefore, by identifying the left-hand sides in (4.184) and (4.185) and by taking advantage
of the fact that @x U and @2vv U are (jointly) continuous and bounded, we easily deduce that:
U.t C h; x; v/ U.t; x; v/
lim
h&0 h
1 1
D v @x U.t; x; v/ @2vv U.t; x; v/ C j@v U.t; x; v/j2 f .x; v/; t :
2 2
Since the right-hand side is continuous in t, we conclude that U is differentiable in time and
that the time derivative is equal to the right-hand side. t
u
Using the above two results in the same way we used Lemmas 4.47 and 4.49 ear-
lier, we prove the following existence and uniqueness result for the FBSDE (4.181).
Proposition 4.72 Equation (4.181) has a unique solution .Xt ; Yt ; Zt /06t6T for
which the process Y D .Yt /06t6T is bounded and Z D .Zt /06t6T is essentially
bounded for Leb1 ˝ P. This solution admits the function U defined in (4.183) as
decoupling field.
Moreover, for a given initial condition X0 D .x0 ; v0 / D 2 L2 .˝; F0 ; PI R2d /,
the process X D .x; v/ D .xt ; vt /06t6T is the unique optimal path for the control
problem (4.177)–(4.178).
Lemma 4.73 For a given initial distribution 0 2 P2 .R2d /, we equip the space
C.Œ0; TI R2d / with the image P0 of the measure 0 ˝ Wd on R2d C.Œ0; TI Rd / by
the mapping:
R2d C.Œ0; TI Rd / 3 .x; v/; w D .wt /06t6T
Z t
7! x C .v C ws /ds; v C wt ;
0 06t6T
where .x D .xt /06t6T ; v D .vt /06t6T / denotes the canonical process on the space
C.Œ0; TI R2d / and U is as in the statement of Proposition 4.70.
Proof. On the canonical space C .Œ0; TI R2d /, equipped with the probability P0 and with the
complete and right-continuous augmentation of the canonical filtration F generated by the
canonical process .x; v/, we let .wt D vt v0 /06t6T . By construction of P0 , w D .wt /06t6T
is an F-Wiener process, and we can write the dynamics of v in the form:
Z t
dvt D @v U.t; xt ; vt /dt C d wt C @v U.s; xs ; vs /ds ; t 2 Œ0; T:
0
This prompts us to define the equivalent probability measure (recall that @v U is bounded):
Z Z
dP T
1 T
D exp @v U.s; xs ; vs / dws j@v U.s; xs ; vs /j2 ds :
dP0 0 2 0
Rt
Under P , the process .wt D wt C 0 @v U.s; xs ; vs /ds/06t6T is an F-Brownian motion.
Moreover,
dvt D @v U.t; xt ; vt /dt C dwt ;
so that .xt ; vt /06t6T solves, under P , the same SDE as the optimal path of the optimal
control problem (4.177)–(4.178), see Propositions 4.70 and 4.72. Therefore, P is the law of
the optimal path.
4.7 Examples 339
It only remains to derive the form (4.186) of dP =dP0 . By Itô’s formula, we have:
U.T; xT ; vT /
Z T 1
D U.0; x0 ; v0 / C @t U.t; xt ; vt / C vt @x U.t; xt ; vt / C @2vv U.t; xt ; vt / dt
0 2
Z T
C @v U.t; xt ; vt / dwt ;
0
Z Z
T T
1
D U.0; x0 ; v0 / f .xt ; vt /; t dt C j@v U.t; xt ; vt /j2 dt
0 0 2
Z T
C @v U.t; xt ; vt / dwt ;
0
where we have used the HJB equation to pass from the first to the second line. Finally, we
complete the proof using the fact that U.T; ; / D 0. u
t
Proposition 4.74 With the same notation as in the statement of Lemma 4.73, any
measure flow D .t /06t6T 2 C.Œ0; TI P2 .R2d // solving:
t D P ı e1
t ; 0 6 t 6 T;
where et denotes the evaluation map at time t on C.Œ0; TI R2d / (i.e., et .x; v/ D
.xt ; vt )) is an equilibrium for the model of flocking.
Computational Implications
Formula (4.186) may be very useful for numerical purposes. Indeed it provides a
direct way to simulate the optimal path in the environment D .t /06t6T by using
an acceptance-rejection method when the supremum norm of f is not too large, or
more refined particle method otherwise. In this regard, it is worth noticing that there
is no need to solve the HJB equation to simulate the distribution of the optimal path
by Monte Carlo methods, even though U.0; ; / explicitly appears in the density. The
reason is that exp.U.0; x0 ; v0 // is nothing but a normalizing constant and simulation
methods based on systems of particles do not need to evaluate it. This is the more
convenient that the HJB equation is in 2d dimensions.
The above discussion could serve as the basis for a strategy to solve MFG
problems numerically by means of Picard iterations based on the construction of
successive approximations of the measure flow D .t /06t6T by the empirical
measures of Monte Carlo samples generated according to formula (4.186).
We implemented the Monte Carlo strategy described above to simulate the
optimal paths of the solution of the optimal control problem whenever the input
measure flow D .t /06t6T is fixed. We restricted ourselves to the two-
dimensional case, i.e., d D 2, and we used the running cost functions f given
340 4 FBSDEs and the Solution of MFGs Without Common Noise
10
10
10
5
5
x2
x2
x2
0
0
-5
-5
-5
-10
-10
-10
-10 -5 0 5 10 15 -10 -5 0 5 10 15 -10 -5 0 5 10 15
x1 x1 x1
Fig. 4.6 Monte Carlo samples of a system of particles near equilibrium in the case ˇ D 0 (left),
ˇ D 0:1 (center), and ˇ D 5 (right).
in (4.180) and (4.179). The qualitative features of the results being the same, we
only report on simulations based on the running cost function (4.179):
Z
jv v 0 j2
f ..x; v/; / D .dx0 ; dv 0 /; x; v 2 R2 :
R4 .1 C jx x0 j2 /ˇ
p
We chose D 2 for the sake of definiteness. Figure 4.6 shows Monte Carlo
samples of a system near equilibrium (i.e., for an input flow still not a fixed point
of the Picard iteration) in the case ˇ D 0, ˇ D 0:1, and ˇ D 5, the other parameters
of the model being the same. Even though it may not appear very clearly, each
plot contains 2000 Monte Carlo sample trajectories, each of them comprising 100
points, the velocity vector being attached to each of these points. The impact of the
size of the parameter ˇ is clear. Indeed, when ˇ is large (right pane of Figure 4.6),
positions x far from the bulk of the positions likely to occur according to the flow
will create large denominators in the expression of f , and as a consequence, a
small overall cost. So one way for the particles to lower the cost is to drift apart,
property which we can clearly see from Figure 4.6. As recalled in Subsection 1.5.1
of Chapter 1, the terminology flocking was introduced to describe situations in which
the birds (particles in our present context) remain in a bounded set as time evolves.
From the plots above, it seems clear that flocking is likely for small values of ˇ and
highly questionable for large values of ˇ, the threshold separating these regimes
being ˇ D 0:5 in the deterministic (nonequilibrium) dynamical system proposed
by Cucker and Smale. One way to prove flocking in the classical deterministic
dynamical systems was to prove asymptotic stability of the velocity. We reinforce
the points made earlier on the basis of the positions of the particles by looking at
the time evolutions of the velocities. Even though we let the time go from t D 0 to
T D 4 in 100 time steps, we see from Figure 4.7 that the velocity vectors seem to
remain in a compact set when ˇ D 0 while they seem to lack stability and diverge
when ˇ D 5, confirming the characterization of flocking touted for deterministic
dynamical systems.
4.8 Notes & Complements 341
6
6
6
4
4
4
2
2
2
v2
v2
v2
0
0
0
-2
-2
-2
-4
-4
-4
-8 -6 -4 -2 0 2 4 -8 -6 -4 -2 0 2 4 -8 -6 -4 -2 0 2 4
v1 v1 v1
6
6
4
4
2
2
v2
v2
v2
0
0
-2
-2
-2
-4
-4
-4
-8 -6 -4 -2 0 2 4 -8 -6 -4 -2 0 2 4 -8 -6 -4 -2 0 2 4
v1 v1 v1
Fig. 4.7 Monte Carlo samples of the two components of the velocities of a system of 2000
particles near equilibrium in the case ˇ D 0 (top row), ˇ D 5 (bottom row). The three plots give
from left to right, the locations of the atoms of the empirical distributions of the two components
of the velocity vectors after 10, 50 and 100 time steps.
The analysis of fully coupled forward-backward SDEs in small time was initiated
first by Antonelli in [25]. Since then, several methods have been discussed in order
to prove existence and uniqueness over an interval of arbitrary length: Hu and
Peng [203] and Peng and Wu [307] implemented a continuation argument under
a suitable monotonicity assumption inspired by convexity conditions in the theory
of stochastic optimal control (see also Chapter 6 below where we implement this
method to solve McKean-Vlasov FBSDEs deriving from the stochastic Pontryagin
principle for mean field stochastic control problems), Pardoux and Tang [300]
exhibited another type of monotonicity condition that permits to apply the Picard
fixed point theorem, and Ma, Protter, and Yong [271] and Delarue [132] made
use of the connection with nondegenerate quasilinear PDEs for systems driven by
deterministic coefficients. The notion of decoupling field was introduced by Ma
and his coauthors in [272]. The reader is referred to the book of Ma and Yong
[274] for background material on adjoint equations, FBSDEs and the stochastic
maximum principle approach to stochastic optimization problems. The reader is
342 4 FBSDEs and the Solution of MFGs Without Common Noise
also referred to the papers by Hu and Peng [203] and Peng and Wu [307] for general
solvability properties of standard FBSDEs within the same framework of stochastic
optimization.
The rehabilitation of the Cauchy-Lipschitz theory when the diffusion coefficient
(or volatility) driving the noise term is nondegenerate and the coefficients are
bounded in the space variable, see Theorem 4.12, is due to Delarue. The original
result can be found in [132], see Theorem 2.6 and Corollary 2.8 therein.
The representation formula for the process .Zt /06t6T in the statement of
Lemma 4.11 is a standard result in the theory of backward SDEs. It may be
seen as a particular case of a more general result permitting to represent .Zt /06t6T
in terms of the Malliavin derivative of .Yt /06t6T , see for instance El Karoui, Peng
and Quenez [226]. For differentiability properties of BSDEs, as used in the proof of
Proposition 4.51, we refer to the seminal paper by Pardoux and Peng [298].
The theory of quadratic BSDEs goes back to the original work of Kobylanski
[232], see also the seminal paper by Briand and Hu [71]. We refer the interested
reader to Dos Reis’ monograph [141] for a quite comprehensive overview of the
subject, including a discussion of the differentiability of the flow formed by the
solutions.
The most standard reference on the BMO condition is the monograph by
Kazamaki [227].
The analysis of SDEs of McKean-Vlasov type has a long history. These equations
were first introduced by Henry McKean Jr in [277] and [278] to provide a rigorous
treatment of special nonlinear Partial Differential Equations (PDEs). Later on, they
were studied for their own sake, and in a more general mathematical setting.
The standard reference for existence and uniqueness of solutions to these special
SDEs is Sznitman’s original set of lectures [325]. See also the paper of Jourdain,
Méléard, and Woyczynski [221] for a generalization including jumps. Properties of
the solutions have been studied in the framework of the propagation of chaos, as
McKean-Vlasov equations appear as effective equations describing the dynamics
of large populations of individuals subject to mean field interactions, see again
[325] together with Méléard [279]. Propagation of chaos will be revisited in Chapter
(Vol II)-2.
As explained in the Notes & Complements of Chapter 3, Backward Stochastic
Differential Equations (BSDEs) of mean field type were introduced by Buckdahn
and his coauthors, see [74, 75] for example. In these papers, the McKean-Vlasov
interaction is of a more restricted form than in Subsection 4.2.2. Also, these results
cannot be used in the applications considered in Chapter 3 and in the optimal control
of McKean-Vlasov stochastic differential equations studied in Chapter 6. Indeed,
we are mostly interested in the analysis of systems of coupled FBSDEs of McKean-
Vlasov type. As far as BSDEs (and not FBSDEs) are concerned, the discussion of
Subsection 4.2.2 is inspired by Carmona’s lecture notes [94]. The proof is adapted
from the original existence and uniqueness result of Pardoux and Peng [298] for
standard BSDEs.
4.8 Notes & Complements 343
Concerning the reflected Brownian motion used in the application to crowd exits
discussed in Subsection 4.7.2, we refer the interested reader to the seminal papers by
Tanaka [329], and by Lions and Sznitman [267] for its construction in a bounded
domain of an Euclidean space. This construction is based on the use of the so-
called Skorokhod map whose theory was enhanced in a more recent paper [81]
by Burdzy, Kang, and Ramanan. The properties of Brownian motion reflected in
a convex domain which we used in the proof of Lemma 4.66 can be found for
instance in the paper [39] of Bass and Hsu, [109] by Carmona and Zheng, and [129]
by Davies.
The theoretical solution of the flocking model provided in Subsection 4.7.3 is
original.
Part II
Analysis on Wasserstein Space and Mean Field
Control
Spaces of Measures and Related Differential
Calculus 5
Abstract
The goal of the present chapter is to present in a self-contained manner,
elements of differential calculus and stochastic analysis over spaces of prob-
ability measures. Such a calculus will play a crucial role in the sequel when
we discuss stochastic control of dynamics of the McKean-Vlasov type, and
various forms of the master equation for mean field games. After reviewing
the standard metric theory of spaces of probability measures, we introduce a
notion of differentiability of functions of measures tailor-made to our needs. We
provide a thorough analysis of its properties, and relate it to different notions
of differentiability which have been used in the existing literature, in particular
the geometric notion of Wasserstein gradient. Finally, we derive a first form of
chain rule (Itô’s formula) for functions of flows of measures, and we illustrate its
versatility on a couple of applications.
Recall from Chapter 1 that if .E; E/ is a measurable space, we use the notation
P.E/ for the space of probability measures on .E; E/, assuming that the -field E
on which the measures are defined is understood.
Lévy-Prokhorov Distance
The weak convergence of probability measures on E appears as the convergence
in the sense of the so-called Lévy-Prokhorov distance dLP on P.E/ defined in the
following way:
dLP .; /
˚
D inf > 0 W 8A 2 B.E/; .A/ 6 .A / C ; and .A/ 6 .A / C ;
1X
N
N NX D ıx j ;
N jD1
and similarly for N NY . Using (5.1), we get the obvious rough bound:
Actually, the proof of the result by Strassen and Dudley (see the Notes &
Complements at the end of the chapter for a precise reference) shows that:
dLP .N NX ; N NY /
(5.3)
]fi 2 f1; ; NgI d.xi ; y.i/ / > g
D inf > 0 W inf < ;
2SN N
where SN denotes the set of all the permutations of f1; ; Ng. The argument
relies on the so-called pairing theorem (whose statement may be found in the same
reference), applied with f1; ; Ng f1; ; N C kg, where k is the floor part of N
for > dLP .N NX ; N NY /. Two elements i 2 f1; ; Ng and j 2 f1; ; N C kg are said
to be connected if j N and d.xi ; yj / < or if j N C 1. Since > dLP .N NX ; N NY /,
the pairing theorem implies the existence of a 1-1 mapping & from f1; ; Ng into
f1; ; N C kg such that i and &.i/ are connected for all i 2 f1; ; Ng. Using this
& , we may construct an element 2 SN as in (5.3).
Observe that, in comparison with (5.1), the representation formula (5.3) is solely
based on the couplings of N NX and N NY that are of the form N NX ı .IE ; '/1 where IE is
the identity on E and ' is a measurable mapping from E into itself. Such couplings
are said to be deterministic. Whenever both x1 ; ; xN and y1 ; ; yN are pairwise
distinct, ' must induce a one-to-one mapping from fx1 ; ; xN g onto fy1 ; ; yN g
by restriction.
for ; 2 P.E/ in the definition of dLP .; /, we get the bound:
1
dLP .; / 6 dTV .; /;
2
352 5 Spaces of Measures and Related Differential Calculus
where dTV .; / D 2 supA2B.E/ j.A/ .A/j denotes the total variation distance
between and . In fact, a striking parallel with (5.1) exists. Indeed, another
coupling argument shows that the total variation distance between and can also
be expressed as:
Z
dTV .; / D 2 inf 1fx6Dyg d.x; y/:
2˘.;/ EE
and
dTV .; / D 2 inf P X 6D Y :
L.X/D; L.Y/D
Wasserstein Distances
We now introduce formally a class of metrics which we already used in several
instances, and which we will use most frequently in this book. For any p > 1,
we denote by Pp .E/ the subspace of P.E/ of the probability measures of order p,
namely those probability measures which integrate the p-th power of the distance to
a fixed point whose choice is irrelevant in the definition of Pp .E/.
For any p > 1 and ; 2 Pp .E/, the p-Wasserstein distance Wp .; / is
defined by:
Z 1=p
Wp .; / D inf d.x; y/p d.x; y/ : (5.4)
2˘.;/ EE
Note that the quantity Wp .; / depends upon the actual distance d in the sense
that another distance would lead to different values of Wp .; /, even if the
topology of E and the space P.E/ were to remain the same. Observe also that,
whenever and belong to Pp .E/, any 2 ˘.; / is also in Pp .E E/,
E E being equipped with any product distance. For convenience, we often use
product distances of the form dEE ..x1 ; x2 /; .y1 ; y2 // D .d.x1 ; y1 /q C d.x2 ; y2 /q /1=q
for someR q 2 Œ1; 1/, or dEE ..x1 ; x2 /; .y1 ; y2 // D max.d.x1 ; y1 /; d.x2 ; y2 //. The
quantity EE d.x; y/p d.x; y/ is sometimes referred to as a cost associated with the
coupling .
5.1 Metric Spaces of Probability Measures 353
opt
We shall often use the notation ˘ opt for ˘p when p D 2. A measurable map
from E into itself is called a transport map from to if it maps into , in
other words, if the push-forward ı 1 is equal to . The associated transport
plan is defined as ı .IE ; /1 where as before IE denotes the identity map of E,
namely the push-forward of the measure by the map E 3 x 7! .x; .x// 2 E E.
This particular form of transport plan given by a transport map is often called a
deterministic transport plan or a deterministic coupling.
The claim that Wp is a distance is not completely obvious. Indeed, the proof of the
triangle inequality is not immediate as it requires using disintegration of measures,
see Theorem (Vol II)-1.1 for a short remainder. If , and are elements of Pp .E/,
and if 1;2 (resp. 2;3 ) is a coupling between and (resp. and ), we can write:
1;2 .dx; dy/ D 1;2 .dx; y/.dy/; .resp. 2;3 .dy; dz/ D .dy/ 2;3 .y; dz/ /
for the disintegration of 1;2 on its second marginal (resp. 2;3 on its first marginal
), and then define the measure on E E E by:
Then the probability measure 1;3 defined as the projection of onto its first and
third coordinates provides a coupling between and which can be used to prove
the desired inequality:
Remark 5.1 While the terminology coupling is ubiquitous in probability and statis-
tics, transport plan is systematically used in the optimal transportation literature
(see Subsection 5.1.3 below). We shall use these two terminologies interchangeably.
We believe that the simultaneous use of both terms will not be the source of confusion
or ambiguity in the sequel.
354 5 Spaces of Measures and Related Differential Calculus
Remark 5.2 Instead of the generic terminology Wasserstein distance, we shall try
to use “p-Wasserstein distance” to make clear that we are using the distance Wp
on Pp .E/, for some p > 1. Indeed, in general, the term Wasserstein distance is
restricted to the distance W2 while the distance W1 is often called the Kantorovich-
Rubinstein distance because of the role it plays in optimal transportation. We
emphasize this connection in the next few results.
The next result is known as the Kantorovich duality theorem. It is central to the
theory of optimal transportation.
where the supremum is taken over all the real valued bounded continuous functions
opt
and on E. Moreover, if 2 ˘p .; / is an optimal transport plan between
and , then there exists 2 L .E; / and 2 L1 .E; / such that, for almost
1
every .x; y/ 2 E E,
Proof.
First Step. Let us denote by W Q p .; /p the right-hand side of (5.6), and prove that W
Qp
satisfies the triangle inequality in the sense that for three probability measures , , and
we have:
Q p .; / 6 W
W Q p .; / C W
Q p . ; /:
The idea of the proof is borrowed from the classical analysis proof that the norm of Lp spaces
satisfies the triangle inequality. For i D 1; 2, let .0; 1/2 3 .s; t/ 7! ci .s; t/ 2 .0; 1/ be
deterministic functions such that:
Explicit formulas can be found for c1 .s; t/ and c2 .s; t/. We refrain from giving them because
they do not play any role in the proof, but the reader can easily check that, in the case p D 2,
we can choose c1 .s; t/ D 1 C t and c2 .s; t/ D 1 C 1=t.
Now let and be two real valued bounded continuous functions on E satisfying .x/ C
.y/ 6 d.x; y/p for all x; y 2 E. Notice that (5.7) implies that, for any s; t > 0 and x; y; z 2 E,
we have:
.x/ C .y/ 6 d.x; y/p 6 c1 .s; t/d.x; z/p C c2 .s; t/d.z; y/p : (5.8)
5.1 Metric Spaces of Probability Measures 355
By construction,
Moreover,
where we used the assumption on the functions and and inequality (5.8). Since the left-
hand side does not depend upon x 2 E, it is still not greater than the infimum of the right-hand
side with respect to x. Consequently, we get:
.y/ .z/ 6 inf Œc1 .s; t/d.x; z/p .x/ C c2 .s; t/d.y; z/p .z/
x2E
(5.10)
D c2 .s; t/d.y; z/p :
We can now take the infimum over s and t in the right-hand side and still get an upper bound
for W Q p .; / C W
Q p .; /p . But by (5.7), this infimum is equal to ŒW Q p . ; /p which proves
Q
the desired triangle inequality for Wp .
Second Step. We now prove that (5.6) holds when the space E is finite, say E D fe1 ; ; en g,
in which case we use the notation .i/ D .fei g/ and .i/ D .fei g/ for i D 1; ; n. By
definition, we have:
X
Wp .; /p D P inf d.ei ; ej /p .i; j/:
.i;j/>0; 16i6n .i;j/D.j/;
P
16j6n .i;j/D.i/
16i;j6d
356 5 Spaces of Measures and Related Differential Calculus
If we treat the n n matrix .d.ei ; ej /p /16i;j6n given by the distance on the space E as
an n2 vector b, then the p-Wasserstein distance between and is given by the value
of a plain linear program. This program is given by the infimum appearing in the right-
hand side of the equality below for the .2n/ n2 matrix A derived by the equality
constraints of the above definition of Wp .; /p , and the 2n vector c comprising the
values .1/; ; .n/; .1/; ; .n/. To sum up, b D .b.i;j/ D d.ei ; ej /p /16i;j6n , A D
.A`;.i;j/ /16`62n;16i;j6n with A`;.i;j/ D 1iD` if ` 6 n and A`;.i;j/ D 1jD`n if ` > n, and
c D .c.`//16`62n with c.`/ D .`/ if ` 6 n and c.`/ D .` n/ if ` > n. We may think of
this problem as the primal problem:
inf b ;
.i;j/ > 0; ADc
and then write that its value is given by the value of the corresponding dual problem. We
recall the classical duality theory for finite dimensional linear programming with obvious
notation:
sup c x D inf b y:
A x6b y>0; AyDc
if we denote by .1/; ; .n/; .1/; ; .n/ the components of the vector x. This is
exactly the Kantorovich’s duality (5.6) in the case of the finite set E.
Third Step. Next, we prove, the inequality Wp .; /p > W Q p .; /p in full generality. This
follows from the fact that if and are real valued bounded continuous functions on E
satisfying .x/ C .y/ 6 d.x; y/p , then for any coupling 2 ˘.; /, we have:
Z Z Z Z
.x/.dx/ C .y/.dy/ D .x/.dx; dy/ C .y/.dx; dy/
E E EE EE
Z
6 d.x; y/p .dx; dy/:
EE
Since the left-hand side is independent of the coupling , one can take the infimum of the
right-hand side over all the couplings and get that Wp .; /p is still an upper bound for the
left-hand side. But we can now take the supremum of the left-hand side over all the couples
.; / and obtain the desired inequality.
Fourth Step. Finally, we prove the remaining inequality by an approximation procedure. Let
x0 2 E be fixed. Given > 0, there exists a compact set K E such that:
Z
d.x; x0 /p Œ.dx/ C .dx/ < p :
Kc
j
Next we construct a finite partition .E /16j6n of K by Borel sets of diameter at most , and
j
for each j 2 f1; ; ng we pick an element xj 2 E . Finally, we define the map from E into
j
E by .x/ D xj whenever x 2 E and .x/ D x0 if x 2 Kc . Clearly, is a coupling mapping
5.1 Metric Spaces of Probability Measures 357
Q 6 21=p
Wp .; / and Q / 6 21=p ;
Wp .;
Wp .; / 6 Wp .; /
Q C Wp .;
Q /
Q C Wp .;
Q / 6 Wp .; Q C 21C1=p :
Q / (5.11)
Using the result proven for probability measures on finite spaces in the Second Step, we see
that Wp .;
Q /
Q DW Q p .;
Q /
Q and using the triangle inequality proven in the First Step, we get:
Q p .;
W Q / Q p .;
Q 6W Q p .; / C W
Q / C W Q p .; /
Q
6 Wp .; Q p .; / C Wp .; /
Q / C W Q
Q p .; / C 21C1=p ;
6W (5.12)
where we used once more the fact that W Q p is not greater than Wp as proven in Third Step.
Putting together (5.11) and (5.12) we get:
Q p .; / C 22C1=p :
Wp .; / 6 W
This proves that the Wasserstein distance W1 coincides with the Kantorovich-
Rubinstein distance dKR introduced earlier.
Proof. In the case p D 1, the constraint in the supremum of the Kantorovich duality, namely
the inequality .x/ C .y/ 6 d.x; y/ can be replaced by:
from which we immediately conclude that is 1-Lipschitz. So, limiting oneself to functions
that are 1-Lipschitz, the inequality .x/ C .y/ 6 d.x; y/ can be replaced by:
so that, in the Kantorovich duality, it is enough to maximize over pairs of Lip-1 functions
.; / which completes the proof. t
u
We shall use this simple estimate quite often throughout the book. Moreover, when
and are of order p,
Wp .; /p D inf E d.X; Y/p I L.X/ D ; L.Y/ D :
Theorem 5.5 For any p > 1, if .n /n>1 and are in Pp .E/, limn!1 Wp .n ; / D
0 if and only if .n /n>1 converges toward for the weak convergence of probability
measures and:
Z Z
lim d.x0 ; x/p dn .x/ D d.x0 ; x/p d.x/; (5.15)
n!1 E E
for one (and hence for all) x0 2 E. The latter is also equivalent to the fact that
.n /n>1 converges toward for the weak convergence of probability measures and
is p-uniformly integrable, namely
Z
lim sup d.x0 ; x/p 1fd.x0 ;x/>rg dn .x/ D 0: (5.16)
r!1 n>1 E
A famous theorem of Skorohod states that the weak convergence of .n /n>1
toward is equivalent to the existence of random variables .Xn /n>1 and X defined
on the same probability space, say .˝; F; P/, such that L.X/ D and L.Xn / D n
for each n > 1, and limn!1 Xn D X almost surely. We discuss this result later in the
chapter and we even provide a proof of a somewhat stronger statement tailor-made
5.1 Metric Spaces of Probability Measures 359
to our needs in this book. See Lemma 5.29 and the ensuing discussion. In any case,
Skorohod’s theorem together with Fatou’s lemma imply that:
Z
d.x0 ; x/p d.x/ D E d.x0 ; X/p
E
6 lim inf E d.x0 ; Xn /p (5.17)
n!1
Z
6 lim inf d.x0 ; x/p dn .x/:
n!1 E
Proof.
First Step. Let us first assume that limn!1 Wp .n ; / D 0. We then observe that .n /n>1
converges in law toward .
opt
To do so, we denote by n an element of ˘p .n ; /, for any n > 1. Then, for any
bounded and uniformly continuous function f from E to R, we have
Z Z Z
f .x/dn .x/ f .x/d.x/ D f .x/ f .y/ dn .x; y/:
E E EE
Splitting the integral in the right-hand side according to the partition of E E into the sets
f.x; y/I d.x; y/ > g and f.x; y/I d.x; y/ 6 g, for a given > 0, and using the boundedness
and the uniform continuity of f , it is plain to deduce that .n /n>1 converges in law toward .
We now prove the convergence of the moments. Again, let us fix > 0 momentarily.
There exists a constant c > 0 such that for all a; b > 0 we have .a C b/p 6 .1 C /ap C c bp .
So, for x; y 2 E, we have:
opt
and integrating both sides with respect to n 2 ˘p .n ; / we get:
Z Z Z
d.x0 ; x/p dn .x/ 6 .1 C / d.x0 ; y/p d.y/ C c d.y; x/p dn .x; y/:
E E EE
By definition, the right most integral is equal to Wp .n ; /p which goes to 0 as n ! 1. This
implies:
Z Z
lim sup d.x0 ; x/p dn .x/ 6 .1 C / d.x0 ; y/p d.y/;
n!1 E E
in which we can take & 0. The resulting inequality together with (5.17) gives (5.15).
Second Step. Conversely, let us assume that .n /n>1 converges weakly toward and
that (5.15) holds. Invoking Skorohod’s representation theorem, we can find a sequence of
Rd -valued random variables .Xn /n>1 , constructed on some probability space .˝; F ; P/ with
L.Xn / D n for any n > 1 and converging almost surely to some random variable X with
L.X/ D . By (5.17), X 2 Lp .˝; A; PI Rd /. Of course, Wp .n ; /p 6 EŒd.Xn ; X/p , for
any n > 1. Therefore, in order to complete the proof, it suffices to prove that the sequence
.Xn /n>1 is p-uniformly integrable, see (5.16). For any r > 0, we have:
360 5 Spaces of Measures and Related Differential Calculus
p
E d.x0 ; Xn /p 1fd.x0 ;Xn />rg D E d.x0 ; Xn /p E d.x0 ; Xn / ^ r C rp P d.x0 ; Xn / > r :
Obviously, the last term can be made as small as we want by taking the limit r ! 1.
Uniform integrability easily follows. The last claim in the statement is clear. t
u
Corollary 5.6 For any p > 1, any subset K Pp .E/, relatively compact for
the topology of weak convergence of probability measures, any x0 2 E, and any
sequences .an /n>1 and .bn /n>1 of positive real numbers tending to C1 with n, the
set:
Z
1
K \ 2 Pp .E/I 8n > 1; d.x0 ; x/ d.x/ <
p
;
fd.x0 ;x/>an g bn
Proposition 5.7 The Borel -field B.Pp .E// of Pp .E/ is generated by the family of
functions .Pp .E/ 3 7! .D//D2B.E/ , where B.E/ is the Borel -field of E. More
generally, if E is a collection of subsets of E which generates B.E/ and is closed
under finite intersections, then B.Pp .E// is generated by the family of functions
.Pp .E/ 3 7! .D//D2E . In particular, for any Borel measurable function W
E ! R which satisfies j .x/j 6 C.1 R C d.x 0 ; x/ p
/ for some C > 0, x0 2 E, and all
x 2 E, the function Pp .E/ 3 7! E d is Borel measurable on Pp .E/.
5.1 Metric Spaces of Probability Measures 361
Proposition 5.7 remains true with P.E/ instead of Pp .E/, P.E/ being equipped
with the Lévy-Prokhorov metric. Indeed, the Borel -field B.P.E// on P.E/ is
generated by the family of mappings .P.E/ 3 7! .D//D2B.E/ . The fact that
B.Pp .E// D fD\Pp .E/I D 2 B.P.E//g, which we shall also denote by B.P.E//\
Pp .E/, can be checked directly by inspection. Indeed, for any 0 2 Pp .E/, the
mapping Pp .E/ 3 7! Wp .; 0 / is lower semicontinuous for the Lévy-Prokhorov
distance, which proves that, for any " > 0, the set f 2 Pp .E/ W Wp .; 0 / <
"g 2 B.P.E//. Therefore, B.Pp .E// B.P.E// \ Pp .E/. Conversely, for any
closed subset D P.E/ for the Lévy-Prokhorov distance, the set D \ Pp .E/ is
a closed subset of Pp .E/ equipped with Wp . We get that D \ Pp .E/ 2 B.Pp .E//.
Since the -algebra generated by sets of the form D \ Pp .E/, with D as above, is
B.P.E// \ Pp .E/, we get the required equality.
As an application of Proposition 5.7, observe that if Œ0; T 3 t 7! t 2 Pp .E/
is measurable and % W Œ0; T E 3 .t; x/ 7! %.t; x/ 2 R is jointly measurable and
satisfies j%.t; x/j 6 C.1 C dE .x0 ; x/p / for all .t;Rx/ 2 Œ0; T E and for some C > 0
and x0 2 E, then the mapping Œ0; T 3 t 7! E .t; x/dt .x/ is measurable. The
proof is a consequence of the monotone class theorem. Moreover, R if % W Pp .E/E 3
.; x/ 7! %.; x/ 2 R is jointly measurable and satisfies
R E j%.; x/jd.x/ < 1 for
all 2 Pp .E/, then the mapping Pp .E/ 3 7! E %.; x/d.x/ is measurable.
In this subsection, we analyze the rate of convergence, for the Wasserstein distance,
in the Glivenko-Cantelli theorem.
We start with a basic reminder. If .Xn /n>1 is a sequence of independent identically
distributed (i.i.d. for short) random variables in Rd with common distribution 2
P.Rd / and if, for each N > 1, we denote by N N the empirical measure:
1 X
N
N N D ıX ;
N iD1 i
This follows from Theorem 5.5 and the law of large numbers, which asserts that:
h Z Z i
P lim jxj2 dN N .x/ D jxj2 d.x/ D 1: (5.18)
N!C1 Rd Rd
362 5 Spaces of Measures and Related Differential Calculus
Actually, the sequence .W2 .; N N //N>1 is also uniformly square-integrable, since P
almost-surely:
2 1 X
N
W2 ı0 ; N N D jXi j2 ;
N iD1
As far as we know, the above a priori estimate does not exist in book form. It
will be used repeatedly throughout the book so we give a detailed proof. It relies
on several technical results which we present, for the sake of completeness, in
the form of three lemmas. The reader only interested in the applications of the
estimate (5.20) may want to skip these lemmas which will only be needed in the
proof of Theorem 5.8.
1X
N
1
N N ı D ı .Xi / ;
N iD1
5.1 Metric Spaces of Probability Measures 363
1
and Mq . ı / D 1, we obtain:
2
E W2 .N N ; /2 6 Mq ./2 E W2 N N ı 1 ; ı 1
8
ˆ
ˆ 1=2
; if d < 4;
<N
2
6 C.d; q; 1/Mq ./ N 1=2
log N; if d D 4;
ˆ
:̂N 2=d ; if d > 4;
Lemma 5.10 There exists a universal constant c > 0 such that, for any pair .; /
of probability measures on .1; 1d , it holds that:
ˇ ˇ
X X X ˇ .C/ .C/ ˇ
2
W2 .; / 6 c 2`
2 .B/ ˇ ˇ
ˇ .B/ .B/ ˇ; (5.21)
`>0 B2P` C2P`C1 ; CB
where Œ.C/=.B/ is set to 1=2d whenever .B/ D 0 (and similarly for in lieu
of ).
Proof. We first isolate an argument which will be used repeatedly in the proof.
Preliminary Step. If .Ak /k>0 is a measurable partition of a Polish space E on which and
are probability measures, we define a new probability Q by its restrictions to each of the
.Ak /k>0 given by Q . \ Ak / D .Ak / . jAk / for any k > 0. Of course, we implicitly assume
that charges all the .Ak /k>0 . Below we use the notation QjAk . / for Q . \ Ak /. We then
have:
. ^ Q /jAk . / D .Ak / ^ .Ak / . jAk /; k > 0;
so that, if we set:
1X X
ıD j .Ak / .Ak /j D 1 .Ak / ^ .Ak / ;
2 k>0 k>0
we have:
X
. Q /C D .Ak / .Ak / C
.jAk /
k>0
X
D .Ak / . ^ /.Ak / .jAk / D ^ Q;
k>0
X
. Q /C D .Ak / .Ak / C
.jAk /
k>0
X
D .Ak / . ^ /.Ak / C
.jAk / D Q ^ Q;
k>0
364 5 Spaces of Measures and Related Differential Calculus
and thus:
. Q /C .E/ D . Q /C .E/ D ı:
.B/
Z Z
D 1fxDyg . ^ Q / ı 1
.dx; dy/ C ı 1 1fx6Dyg . Q /C .dx/. Q /C .dy/
B B
Z Z
D 1fxDyg . ^ Q / ı 1
.dx; dy/ C ı 1 . Q /C .dx/. Q /C .dy/;
B B
X Z
p
D 2`d f .x/dx ` .B/ C O w. d2` /
B2P` B
5.1 Metric Spaces of Probability Measures 365
X Z
p
D 2`d f .x/dx .B/ C O w. d2` /
B2P` B
Z
p
D f .x/d.x/ C O w. d2` / ;
Rd
where we used the Landau notation O. / for a function O. / of the form R 3 x 7! ı.x/ 2 R
such that jı.x/j 6 Cjxj for some constant C > 0. We deduce that .` /`2N converges weakly
toward . By Theorem 5.5, the convergence also holds in the sense of the 2-Wasserstein
distance, which implies that:
showing that in order to prove (5.21), it suffices to prove that its right-hand side is an upper
bound for W2 .; ` /2 for ` fixed.
Third Step. Now, for each ` > 0, we construct a coupling ` 2 ˘.` ; `C1 /. The strategy
is to apply the Preliminary Step with D ` , D and then Q D `C1 . To do so, observe
first that, for any B 2 P` ,
2
X
d 2
X
d
where B1 ; ; B2d form the partition of B into 2d hypercubes in P`C1 . Moreover, for each
C 2 P`C1 contained in B, we have:
.B/.C/
` .C/ D .B/.CjB/ D ;
.B/
.B/
` . jC/ D jC . / D . jC/;
.B/` .C/
Dividing both sides by .B/, we can reinterpret this equality in the framework of the
Preliminary Step: It says that, if we start with D ` . jB/ and D . jB/, the probability
`C1 . jB/ is nothing but the probability Q obtained as in the Preliminary Step from the
partition fC 2 P`C1 W C Bg of B. We then denote by `;B the resulting coupling, and let:
X
` . / D ` .B/`;B . /:
B2P`
Observe that ` .B/ D .B/, so that ` .B/ D `C1 .B/. Therefore, ` 2 ˘.` ; `C1 /; ` is a
coupling constructed from the aggregation prescription. Notice, again from the Preliminary
Step, that:
366 5 Spaces of Measures and Related Differential Calculus
X
` .f.x; y/ W x ¤ yg/ D ` .B/`;B f.x; y/ W x ¤ yg
B2P`
X 1 X
ˇ ˇ
D ` .B/ ˇ` .CjB/ .CjB/ˇ
B2P`
2 C2P ;CB
`C1
ˇ ˇ
1 X X ˇ ˇ
ˇ.C/ .B/ .C/ ˇ:
D ˇ
2 B2P C2P`C1 ;CB
.B/ ˇ
`
Then, Ionescu-Tulcea’s theorem guarantees the existence of a sequence .Z` /`>1 of .1; 1d -
valued random variables constructed on some probability space .˝; F ; P/ such that:
P Z0 2 A0 ; Z1 2 A1 ; ; Z` 2 A`
Z Z Z
D 0 .dx0 /K1 .x0 ; dx1 / K` .x`1 ; dx` /;
A0 A1 A`
for every ` > 1, and any Borel sets A0 ; A1 ; ; A` in .1; 1d . For each ` > 1, the joint
distribution of .Z0 ; Z` / is a coupling between D 0 and ` . Now, if the random variable L
is defined by L D inff` > 0 W Z` ¤ Z`C1 g:
W2 .; /2 6 sup W2 .; ` /2 6 sup E jZ0 Z` j2
`>1 `>1
6 2 sup E jZ0 ZL j2 C jZL Z` j2 1fL6`1g
`>1
D 2 sup E jZL1 ZL j2 C jZL Z` j2 1fL6`1g
`>1
6 c sup E 22L 1fL6`1g ;
`>1
for a universal constant c > 0. The constant c being allowed to increase from line to line, we
get:
X
W2 .; /2 6 c 22` P Z` 6D Z`C1
`>0
X
Dc 22` `C1 f.x; y/I x ¤ yg
(5.23)
`>0
ˇ ˇ
c X 2` X X ˇ
ˇ .C/ ˇˇ
D 2 ˇ.C/ .B/ .B/ ˇ;
2 ;CB
B2P C2P
`>0 ` `C1
which completes the proof when the Lebesgue measure on .1; 1d is absolutely continuous
with respect to and . Observe that we have exchanged the role of and in (5.21).
5.1 Metric Spaces of Probability Measures 367
Conclusion. In order to complete the proof, it remains to discuss the general case when
the Lebesgue measure is no longer absolutely continuous with respect to and . We then
approximate and by:
D .1 / C Lebdj.1;1d I D .1 / C Lebdj.1;1d ;
2d 2d
where Lebd denotes the d-dimensional Lebesgue measure and Lebdj.1;1d . / D Lebd . \
.1; 1d //. Then, . /">0 and . />0 converge in total variation to and as tends to
0. Obviously, we can apply the conclusion of the fourth step to each pair . ; /, > 0.
Letting tend to 0 in (5.23) completes the proof. t
u
For 2 P.Rd / and n > 0 such that .Bn / > 0, we define the probability measure
Bn on .1; 1d by:
Bn .A/ D 2n AjBn ;
for any Borel set A .1; 1d . In other words, Bn is the push-forward by the map
Rd 3 x 7! x=2n of the probability conditioned to be in Bn , namely the probability:
.B \ Bn /
B 7! .BjBn / D :
.Bn /
for ; 2 P2 .Rd /. Importantly, D2 .; / does not depend upon the (arbitrary)
choice of Bn or Bn when .Bn / D 0 or .Bn / D 0. The relevance of D2 to our
analysis of convergence in the sense of the Wasserstein distance W2 is provided by
the following estimate.
368 5 Spaces of Measures and Related Differential Calculus
Lemma 5.11 There exists a universal constant such that, for all pairs of
probability measures and in P2 .Rd /, we have:
Proof. If and are supported in .1; 1d , using estimate (5.21) from Lemma 5.10, we
have:
ˇ ˇ
X X X ˇ .C/ .C/ ˇ
W2 .; /2 6 c 22` .B/ ˇ ˇ
ˇ .B/ .B/ ˇ
`>0 B2P C2P ` ; CB `C1
X X X
.C/
6c 22` j.B/ .B/j C j.C/ .C/j
B2P` C2P`C1 ; CB
.B/
`>0
X X X
6c 22` j.B/ .B/j C j.C/ .C/j
`>0 B2P` C2P`C1
X X
6 c.1 C 22 / 22` j.B/ .B/j;
`>1 B2P`
P
where we used the fact that B2P0 j.B/ .B/j D 0. The above is nothing but the desired
right-hand side of (5.25) which completes the proof when and are supported in .1; 1d .
In the general case, for each n > 1, we denote by n the optimal coupling of Bn and Bn ,
and by n (that we shall also denote by n .dx; dy/ for pedagogical reasons) the push-forward
of n by scaling by 2n , namely by the mapping .x; y/ 7! .2n x; 2n y/. Obviously,
Z Z Z Z
22n W2 .Bn ; Bn /2 D 22n jx yj2 dn .x; y/ D jx yj2 dn .x; y/:
Rd Rd Rd Rd
P
where a D .1=2/ n>0 j.Bn / .Bn /j,
X
˛.dx/ D ..Bn / .Bn //C .dxjBn /;
n>0
X
ˇ.dy/ D ..Bn / .Bn //C .dyjBn /:
n>0
Following the proof of the preliminary step in Lemma 5.10, we notice that:
X X X
aD ..Bn / .Bn //C D ..Bn / .Bn //C D 1 .Bn / ^ .Bn /:
n>0 n>0 n>0
We also note that by construction, the marginals of are and respectively. Indeed, if
A 2 B.Rd / we have:
5.1 Metric Spaces of Probability Measures 369
X
.A Rd / D ..Bn / ^ .Bn //n .A Rd / C a1 ˛.A/ˇ.Rd /
n>0
X
D ..Bn / ^ .Bn //n .2n A Rd / C ˛.A/
n>0
X X
D ..Bn / ^ .Bn //Bn .2n A/ C ..Bn / .Bn //C .AjBn /
n>0 n>0
X
D .Bn /.AjBn / D .A/;
n>0
where we used the fact that a D ˇ.Rd /. Notice that the proof works correctly even if .Bn / D
0 for some n > 0. We argue similarly for the second marginal. Moreover,
Z Z Z Z
2
jx yj2 a1 ˛.dx/ˇ.dy/ 6 jxj2 C jyj2 ˛.dx/ˇ.dy/
Rd Rd a Rd Rd
Z Z
62 jxj2 ˛.dx/ C 2 jyj2 ˇ.dy/
Rd Rd
X
62 22n ..Bn / .Bn //C C ..Bn / .Bn //C
n>0
X
62 22n j.Bn / .Bn /j;
n>0
where we used once more the fact that a D ˛.Rd / D ˇ.Rd /. Now, using the fact that the
marginals of are and , we have:
Z Z
W2 .; /2 6 jx yj2 .dx; dy/
Rd Rd
X 2
6 22n 2j.Bn / .Bn /j C .Bn / ^ .Bn / W2 .Bn ; Bn / ;
n>0
R R
where we used the fact that Rd Rd jx yj2 n .dx; dy/ D 22n .W2 .Bn ; Bn //2 . Since Bn and
are probability measures on .1; 1d , we can use the first part of the proof to bound
Bn
W2 .Bn ; Bn /2 by 5c.ı2 .Bn ; Bn //2 , completing the proof in the general case. t
u
Lemma 5.12 There exists a universal constant C such that, for all probability
measures and in P2 .Rd /, we have:
X X Xˇ ˇ
D2 .; / 6 C 22n 22` ˇ .2n B/ \ Bn .2n B/ \ Bn ˇ;
n>0 `>0 B2P`
with the same notation as in the definition (5.24) of D2 .; /, and where the notation
2n B stands for the set f2n x 2 Rd I x 2 Bg.
370 5 Spaces of Measures and Related Differential Calculus
Proof. Going back to the definition of D2 .; /, we first notice that, for each n > 0:
X ˇ ˇ
j.Bn / .Bn /j D ˇ .2n B/ \ Bn .2n B/ \ Bn ˇ;
B2P0
which completes the proof since the last term is not greater than j.Bn / .Bn /j=3, and the
bound remains true whenever .Bn / D 0. t
u
R 5.8. Let us assume that 2 Pq .R /, for some q > 4, and without loss of
d
Proof of Theorem
generality that Rd jxjq d.x/ D 1. By Markov inequality, .Bn / 6 2q.n1/ for all n > 0.
First Step. We shall apply Lemma 5.11 in order to find a bound for EŒD2 .N N ; /. For a
Borel subset A Rd , the random variable N N N .A/ is Binomial with parameters N and .A/
so that:
p
E jN N .A/ .A/j 6 min 2.A/; .A/=N :
Using Cauchy-Schwarz’ inequality and the fact that the partition P` has exactly 2d` elements,
we deduce that, for all n > 0 and ` > 0,
X ˇ ˇ p
E ˇN N .2n B/ \ Bn .2n B/ \ Bn ˇ 6 min 2.Bn /; 2d`=2 .Bn /=N :
B2P`
Using the result of Lemma 5.12 and the fact that .Bn / 6 2q.n1/ , we get, for a universal
constant C possibly depending upon q (and whose value is allowed to increase from line to
line):
X X p
E D2 .N N ; / 6 C 22n 22` min 2qn ; 2d`=2 2qn =N : (5.26)
n>0 `>0
Second Step. We first consider the case d > 4. For N > 2 fixed, we estimate the right-hand
side of (5.26) by computing the sums in the order they appear. Let n0 D bq1 log2 Nc where
we use the notation bxc for the integer part of x, namely the largest integer smaller than or
equal to x. For each integer n > 0, we define `.n/ D d1 .log2 N qn/. Notice that `.n/ > 0
if and only if n 6 n0 and
5.1 Metric Spaces of Probability Measures 371
p
` 6 `.n/ , 2qn > 2d`=2 2qn =N
p p
, min 2qn ; 2d`=2 2qn =N D 2d`=2 2qn =N:
Similarly,
p p
` > `.n/ , 2qn < 2d`=2 2qn =N ) min 2qn ; 2d`=2 2qn =N D 2qn :
P P
So if we split the above sum over n into two parts, ˙1 D 06n6n0 and ˙2 D n>n0 ,
X
n0 b`.n/c
X p X
˙1 D 22n 22` 2d`=2 2qn =N C 22` 2qn
nD0 `D0 `>b`.n/c
X
n0 b`.n/c
X p
D 22n 22` 2d`=2 2qn =N C 31 2qn 22b`.n/c
nD0 `D0
b`.n/c
X
n0 X X
n0
6 N 1=2 2.2q=2/n 2.2Cd=2/` C CN 2=d 2.2qC2q=d/n :
nD0 `D0 nD0
Obviously, the second sum above is bounded by a constant times N 2=d since 2 q C2q=d <
0, recall d 4. As for the first sum, when d > 4, it is bounded from above by:
X
n0 X
n0
CN 1=2 2.2q=2/n 2.2Cd=2/`.n/ 6 CN 1=2 N .d4/=.2d/ 2.2qC2q=d/n
nD0 nD0
2=d
6 CN ;
where we used once more the fact that 2 q C 2q=d < 0 because q > 4, and where the
constant C is allowed to increase from line to line.
Now when d D 4 the upper bound on the first term reads:
X
n0
CN 1=2 2.2q=2/n `.n/ 6 CN 1=2 log2 N:
nD0
X X 4 1
˙2 D 22n 22` 2qn D 2.q2/n0 6 CN .12=q/ :
32 q2 1
n>n0 `>0
Third Step. In order to treat the case d < 4, we interchange the order of the summations in
the right-hand side of (5.26) and write:
X X p
E D2 N N ; 6 C 22` 22n min 2qn ; 2d`=2 2qn =N ;
`>0 n>0
that N is fixed. Let `0 DPbd1 log2 Nc and let us split the above sum
and as before assume P
into two parts, ˙1 D 06`6`0 and ˙2 D `>`0 . For each integer ` > 0, we define
n.`/ D q1 .log2 N d`/. Notice that n.`/ > 0 if and only if ` 6 `0 and
p
n 6 n.`/ , 2qn > 2d`=2 2qn =N
p p
, min 2qn ; 2d`=2 2qn =N D 2d`=2 2qn =N;
and similarly,
p p
n > n.`/ , 2qn < 2d`=2 2qn =N ) min 2qn ; 2d`=2 2qn =N D 2qn :
Consequently,
`0
X bn.`/c
X p X
˙1 D 22` 22n 2d`=2 2qn =N C 22n 2qn
`D0 nD0 n>bn.`/c
`0 bn.`/c `0
X X X
6 N 1=2 22` 2d`=2 22n 2qn=2 C C 22` 2.q2/n.`/
`D0 nD0 `D0
`0
X `0
X
6 CN 1=2 2.d=22/` C CN .12=q/ 2.d22d=q/` :
`D0 `D0
The second sum on the last line is less than C if d22d=q < 0 and is less than CN 12=q2=d
if d 2 2d=q > 0. If d 2 2d=q D 0, which happens if and only if d D 3 and q D 6
since we assumed d < 4 and q > 4, it is less than C log2 .N/. So, in any case the last term is
the right hand side is less than CN 1=2 since d < 4 and q > 4. Similarly,
X X X
˙2 D 22` 22n 2qn D C 22` 6 CN 2=d :
`>`0 n>0 `>`0
We usually prefer the notation @' to r' to denote the gradient (first derivative)
of a differentiable function '. However, in this subsection, we introduce and use
the notion of subdifferential of a function ' which is not necessarily differentiable,
and since the most commonly used notation for the subdifferential is @', we shall
use the notation r' for the gradient of a differentiable function '. Also, we often
denote by I the identity mapping on Rd , which, though it is connected with, should
not be confused with the identity matrix which we denote by Id . We are confident
that the context should make it clear.
The next proposition identifies an instance of an optimal transport map that may be
easily identified.
Proof. By strict convexity of ', the gradient r' of ' is increasing in the sense that:
In particular, r' is one-to-one from Rd onto its range (also known as co-domain). Since
the Hessian of ' has a strictly positive determinant, the global inversion theorem ensures
that r' is a C 1 diffeomorphism from Rd on its range. We denote the inverse by .r'/1 .
The remainder of the proof relies on a duality argument. We compute the so-called square-
transform of the potential jxj2 2'.x/:
˚ ˚
.y/ D inf jx yj2 jxj2 C 2'.x/ D jyj2 C inf 2x y C 2'.x/ ; y 2 Rd : (5.27)
x2Rd x2Rd
By strict convexity of ', we deduce that, when y is in the range of r', the infimum is attained
at the unique root x 2 Rd of the equation r'.x/ D y, so that:
.y/ D j.r'/1 .y/ yj2 j.r'/1 .y/j2 C 2' .r'/1 .y/
(5.28)
D jyj2 2y .r'/1 .y/ C 2' .r'/1 .y/ :
374 5 Spaces of Measures and Related Differential Calculus
The key point is to observe that because of the definition (5.27) of the square transform, we
have:
Now, if X and Y are two Rd -valued random variables on some probability space .˝; F ; P/,
such that L.X/ D and L.Y/ D ,
E jX Yj2 > E jXj2 C .Y/ 2'.X/
(5.29)
D E jXj2 C r'.X/ 2'.X/ ;
where we used the fact that Y and r'.X/ have the same distribution. A priori, the right-hand
side could be 1, but Inserting (5.28), we get:
E jX Yj2 > E jXj2 C jr'.X/j2 2r'.X/ X C 2'.X/ 2'.X/
D E jX r'.X/j2 ;
which shows that r' is an optimal transport map. As for the proof of uniqueness, we return
to the first line in (5.29). By definition of , it holds P -a.s.
The expectation of the right-hand side only depends on and and does not depend on the
joint law of .X; Y/. Therefore, it is equal to W2 .; /2 . So, if EŒjX Yj2 D W2 .; /2 , then
the above inequality becomes an equality:
which shows that X reaches the infimum in the definition of .Y/. Notice now that Y belongs
to the codomain of r' with probability 1 (since is the push-forward image of by r').
Therefore, the minimum in the definition of .Y/ is unique and is given by r' 1 .Y/, in
other words Y D r'.X/. t
u
Remark 5.14 Strict convexity is crucial for the conclusion of Proposition 5.13.
Here is a simple counter-example. Let 2 P2 .R/ have mean 0 and let be the
push-forward image of by the mapping R 3 x 7! x. We show that the transport
map R 3 x 7! x does not induce an optimal transport plan from to .
The cost of the transport map R 3 x 7! x is:
E jX .X/j2 D 4E jXj2 ;
which says the transport plan ˝ is of smaller cost than the plan associated with
the map R 3 x 7! x.
5.1 Metric Spaces of Probability Measures 375
Remark 5.15 One can easily push further the analysis in the one-dimensional case.
For example if and are two distributions in P2 .R/, if we denote by F and F
their cumulative distribution functions, being atomless (i.e., F is continuous), it
is well known that is the image of by F1 ı F , where F1 denotes the pseudo-
inverse of F . Moreover, if and have smooth positive densities, then F1 ı F
is strictly increasing. So, we can apply Proposition 5.13 by choosing ' as any anti-
derivative of F1 ı F . We deduce that there exists a unique optimal plan induced
by the transport map F1 ı F . In fact, this result remains true under the mere
assumption that has no atom, i.e., F is continuous.
Definition 5.16 If ' W Rd 7! .1; C1 is convex and proper in the sense that it
is not identically equal to C1, for each x 2 Rd , we define the subdifferential of '
at x as the set:
Obviously, if ' is finite and differentiable at the point x, the subdifferential @'.x/ is
the singleton fr'.x/g given by the actual derivative (or gradient) of ' at x.
Proof. The fact that a subset of the subdifferential of a convex function is cyclic monotone
is easy, and can be proven by simple computation from the definition of cyclic monotony by
iterating the definition of subdifferential. We do not give the details of the argument because
we shall not use this half of the equivalence. We give a detailed proof of the reciprocal which
we shall use in the sequel.
Let .x0 ; y0 / 2 A and let us define the function ' by:
˚
'.x/ D sup ym .x xm / C ym1 .xm xm1 / C C y0 .x1 x0 /I
m > 1; .x1 ; y1 /; ; .xm ; ym / 2 A :
376 5 Spaces of Measures and Related Differential Calculus
The function ' is convex and lower semi-continuous as the supremum of linear functions.
Moreover, is it proper because '.x0 / D 0. Indeed, '.x0 / 6 0 by definition of cyclic
monotonicity, and '.x0 / > 0 by choosing m D 1 and x1 D x0 , so that '.x0 / D 0. Now,
if .x; y/ 2 A, for any z 2 Rd we easily get from the very definition of ' that:
Proposition 5.19 For all measures and in P2 .Rd /, the topological support of
any optimal transport plan is cyclic monotone.
Proof. Let us assume that the topological support of an optimal plan is not cyclic
monotone. There exist an integer m > 1 and couples .x1 ; y1 /, , .xm ; ym / in Supp. /
such that:
m
X
jxkC1 yk j2 jxk yk j2 < 0;
kD1
where we use the notation xmC1 for x1 . By definition of the topological support of a measure,
we can find neighborhoods Ui of xi and Vi of yi for i D 1; ; m such that .Ui Vi / > 0
for all i D 1; ; m and
m
X
jQxkC1 yQ k j2 jQxk yQ k j2 < 0; xQ k 2 Uk ; yQ k 2 Vk ; k D 1; ; m:
kD1
m
c X .1/ .2/
D C ˝ k k ;
m kD1 kC1
.1/ .1/
for a positive constant c to be chosen below and with mC1 D 1 . Observe that, for all
A 2 B.Rd Rd /,
c X .A \ .Uk Vk //
m
.A/ > .A/
m kD1 .Uk Vk /
m
1 X
> .A/ .A \ .Uk Vk // > 0;
m kD1
5.1 Metric Spaces of Probability Measures 377
if c < min16i6m .Ui Vi /. The measure is obviously a probability. Its second marginal
is the same as the second marginal of which is . As for its first marginal, it is because
of the cyclic summation over k. Hence it is a coupling between and . We reach the desired
contradiction by computing:
Z Z
jx yj2 .dx; dy/ jx yj2 .dx; dy/
Rd Rd Rd Rd
Z m
X Y
m
c .1/ .2/
D jxk yk j2 jQxk yQ k j2 kC1 .dxkC1 /k .dyk /k .dQxk ; dQyk /
m .Rd /4m kD1 kD1
< 0;
Proof. Let be an optimal transport plan from to . Proposition 5.19 implies that its
support Supp. / is cyclic monotone, and by the part of Rockafellar’s characterization
which we proved in Proposition 5.18, there exists a .1; 1-valued lower semi-continuous
proper convex function ' on Rd such that Supp. / @', which we can rewrite as:
f.x; y/ 2 Rd Rd I y 2 @'.x/g D 1: (5.30)
We now observe that the left- and right-hand sides are integrable with respect to and
satisfy:
Z
ˇ ˇ
ˇ'.x0 / C y0 .x x0 /ˇd .x; y/ 6 j'.x0 /j C jy0 j M1 ./ C jx0 j < 1;
Rd Rd
Z
ˇ ˇ
ˇ'.x0 / C y .x x0 /ˇd .x; y/ 6 j'.x0 /j C M2 ./j M2 ./ C jx0 j < 1:
Rd Rd
378 5 Spaces of Measures and Related Differential Calculus
Therefore, ' is a.s. finite under , that is the domain of ', i.e., Dom.'/ D fx 2 Rd W
j'.x/j < 1g, is of full measure under . Of course, Dom.'/ is a convex subset of Rd , and
its boundary has a zero Lebesgue measure, from which get .Int.Dom.'/// D 1 since
is absolutely continuous with respect to the Lebesgue measure, where Int.Dom.'// is the
interior of Dom.'/.
Recall now that a proper convex function is continuous and locally Lipschitz, and thus
almost everywhere differentiable, on the interior of its domain. Since .Int.Dom.'/// D 1,
we deduce that .fx 2 Rd W fr'.x/g D @'.x/g/ D 1, from which we conclude that:
f.x; y/ 2 Rd Rd W y D r'.x/g D 1;
Remark 5.21 The above proof shows that any optimal transport plan is of the form
ı .I; r'/1 for some proper convex function. A simple duality argument shows
that in fact, there is uniqueness, not only of the optimal transport plan, but of the
gradient of a convex function transporting onto . Moreover, if is also absolutely
continuous, and if we denote by ' the convex conjugate of ', then ' is the convex
function whose gradient transports onto optimally and it is not difficult to see
that for -almost every x 2 Rd and -almost every y 2 Rd , we have:
Throughout the analysis below, we shall use repeatedly the fact that, over an
atomless probability space .˝; F; P/, for any probability distribution on a Polish
space E, we can construct an E-valued random variable on ˝ with as distribution.
In this regard, we refer to Remark 5.26 for the properties of the lifting uQ over general
spaces .˝; F; P/ that are neither Polish nor atomless.
We first prove that the law of the random variable DQu.XQ 0 / does not depend
upon the particular choice of the random variable XQ 0 satisfying L.XQ 0 / D 0 .
See Proposition 5.24 below, whose proof will use the following simple measure
theoretical lemma:
Lemma 5.23 If X and Y are elements of L2 .˝; F; PI Rd / with the same law, then
for each > 0 there exist two measurable measure preserving mappings and 1
from ˝ into itself such that Pf! 2 ˝ W . ı 1 /.!/ D . 1 ı /.!/ D !g D 1 and
PŒjY X ı 1 j 6 D 1.
Proof. Let .An /n>1 be a partition of Rd by Borel sets of diameter at most , and, for each
n > 1, let us set Bn D fX 2 An g and Cn D fY 2 An g. For each n > 1, P.Bn / D P.Cn /.
We now use the fact that P is an atomless probability measure on a Polish space. We
denote by F the completion of F under P and by P the extension of P to F . Whenever
P.Bn / D P.Cn / > 0, there exist two subsets Mn Bn and Nn Cn , both being included in
Borel subsets of zero measure under P, and a one-to-one map n from Bn n Mn onto Cn n Nn
such that n and n1 are measurable with respect to the restriction of F to Bn n Mn and
Cn n Nn and preserve the restrictions of the measure P to Bn n Mn and Cn n Nn (see Corollary
6.6.7 and Theorem 9.2.2 in Bogachev [64]).
Then, we can extend n into a measurable mapping, still denoted by n , from Bn to Cn
(measurability being understood with respect to the restrictions of F to Bn and Cn ) and then
n1 into a measurable mapping, still denoted by n1 , from Cn to Bn (measurability being
understood with respect to the restrictions of F to Cn and Bn ) in such a way that n ı n1 is
the identity on Cn n Nn , n1 ı n is the identity on Bn n Mn , n1 .Nn / Mn and n .Mn / Nn .
Necessarily, .n /1 .Nn / Mn and .n1 /1 .Mn / Nn . Here, .n /1 ./ denotes the pre-
image by n and, similarly, .n1 /1 ./ denotes the pre-image by n1 . Obviously, for all
A 2 F , with A Cn , we have P .A/ D P .A n Nn / D P .n1 .A n Nn // D P ..n /1 .A//.
Similarly, for all A 2 F , with A Bn , we have P .A/ D P ..n1 /1 .A//.
Whenever P.Bn / D P.Cn / D 0, we construct n and n1 according to the same principle,
but with Mn D Bn and Nn D Cn .
380 5 Spaces of Measures and Related Differential Calculus
P Since .Bn /n>1 1 and .CP n /n>1 are partitions of ˝ by measurable sets, the maps D
1
n>1 n 1Bn and D n>1 n 1Cn are measurable from .˝; F / into itself. Letting
M D [n>1 Mn and N D [n>1 Nn , we have P .M/ D P .N/ D 0. On ˝ n N, ı 1 is the
identity and, on ˝ n M, 1 ı is also the identity.
Since and 1 are measurable with respect to F , we can find two mappings Q and
1
Q from ˝ into itself, measurable with respect to F , such that and Q , and similarly
1 and Q 1 , coincide outside an event A in F of zero measure under P. In particular,
Q and Q 1 preserve the probability measure P. As a by-product, P..Q /1 .A// D 0 and
P..Q 1 /1 .A// D 0. For any ! 62 A [ .Q /1 .A/, we have .!/ D Q .!/ 62 A and thus
1 . .!// D Q 1 .Q .!//. Therefore, P.f! 2 ˝ W Q 1 .Q .!// D !g/ D P .f! 2 ˝ W
1 . .!// D !g/ D 1. Similarly, P.f! 2 ˝ W Q .Q 1 .!// D !g/ D 1.
Finally, observe that, by construction, kY X ı 1 k1 6 . Since and Q coincide
outside A, we deduce that PŒjY X ı Q 1 j 6 D 1. t
u
Proof. By definition, there exists a random variable X0 with law 0 such that the lifted
function uQ is Fréchet differentiable at X0 . Let X 2 L2 .˝; F ; PI Rd / be such that L.X/ D 0 .
Then Lemma 5.23 implies that, for any > 0, there exist two measurable measure preserving
mappings and 1 from ˝ into itself, such that P.f! 2 ˝ W . ı 1 /.!/ D !g/ D
P.f! 2 ˝ W .1 ı /.!/ D !g/ D 1 and PŒjX0 X ı j 6 D 1. Using the fact that
the lifting uQ is differentiable at X0 and that its values depend only upon the distributions of
its arguments, we get, for any Y 2 L2 .˝; F ; PI Rd /:
uQ .X C Y/ D uQ X ı C Y ı
D uQ .X0 / C E DQu.X0 / X ı C Y ı X0
C o kX ı C Y ı X0 k2
(5.32)
D uQ .X0 / C E DQu.X0 / X ı C Y ı X0
C o kX ı X0 k2 C kYk2
D uQ .X/ C E DQu.X0 / ı 1 Y C O kX ı X0 k2 C o kYk2 :
It is important to observe that the symbols O. / and o. / which we use according to the
Landau convention, are here uniform with respect to . Here o. / stands for a function o. /
of the form R 3 x 7! xı.x/ 2 R with limx!0 ı.x/ D 0.
Let us assume momentarily that DQu.X0 /ı1 converges in L2 .˝; F ; PI R2d / when & 0,
and let us denote by Z its limit. Then, .X; Z/ has the same law as .X0 ; DQu.X0 // because is
measure preserving. Taking the limit & 0 in (5.32) we get:
uQ .X C Y/ D uQ .X/ C E Z Y C o kYk2 ;
We conclude the proof by proving that .DQu.X0 / ı 1 />0 forms a Cauchy family as
& 0. This follows from the fact that, if we subtract the value of (5.32) for to its value for
0 2 .0; /, we find that:
ˇ ˇ
sup ˇE DQu.X0 / ı 1 DQu.X0 / ı 1
0 Y ˇ 6 C C o.kYk2 / ;
0 2.0;/
p
which is enough to conclude by taking kYk2 D , dividing by kYk2 , and finally, taking the
supremum over these Y’s. t
u
Proof. For a given X 2 L2 .˝; F ; PI Rd /, the goal is to prove that, as a random variable,
DQu.X/ is measurable with respect to the -field generated by X (fact which we denote by
DQu.X/ 2 fXg), as the existence of such that PŒDQu.X/ D .X/ D 1 then follows
from standard measure theory arguments. The fact that is independent of the choice of
the random variable X representing the distribution then follows from Proposition 5.24.
Without any loss of generality, we may assume that u (and thus uQ ) is bounded. Indeed,
it suffices to compose u with any smooth bounded function matching the identity on a
sufficiently large interval in order to recover the general case. For the time being, we also
assume that is absolutely continuous with respect to the Lebesgue measure and that:
Z
jxjq d.x/ < 1;
Rd
for some q > 4. For each > 0, we define the function on L4 .˝; F ; PI Rd / by:
1
.Y/ D uQ .Y/ C E jX Yj2 C EŒjYj4 :
2
382 5 Spaces of Measures and Related Differential Calculus
Note that is Fréchet differentiable on L4 .˝; F ; PI Rd / and that its Fréchet derivative is
given by (or at least can be identified with) D .Y/ D DQu.Y/ C 1 .Y X/ C 4jYj2 Y.
Notice also that .Y/ ! C1 as EŒjYj4 ! C1 since uQ is bounded. We then call
.Zn /n>0 a minimizing sequence for , and for each n > 0, we let n D L.Zn /. Since
is absolutely continuous, we can use Brenier’s Theorem 5.20 stating that there exists a real
valued convex function n on Rd , which is differentiable -almost everywhere, such that the
random variable Yn D r n .X/ satisfies L.Yn / D n and EŒjX Yn j2 D W2 .; n /2 . These
two facts imply that:
1
.Yn / D uQ .Yn / C W2 .; n /2 C EŒjYn j4
2
1
D uQ .Zn / C W2 .; n /2 C EŒjZn j4 6 .Zn /;
2
proving that .Yn /n>0 is also a minimizing sequence of . Since the lifting uQ is bounded, we
conclude that:
Z
sup jxj4 dn .x/ D sup EŒjYn j4 < 1;
n>0 Rd n>0
and consequently that the sequence .n /n>0 is tight. Extracting a subsequence if necessary,
we can assume that this sequence converges (in the sense of weak convergence as well as
for the distance W2 because of the uniform bound on the fourth moments), and we call its
limit. Notice that:
the first part following from the fact that uQ is continuous on L2 .˝; F ; PI Rd /. By Fatou’s
lemma (modulo Skorohod’s equivalent form of weak convergence), we also have:
Z Z
jxj4 d.x/ 6 lim inf jxj4 dn .x/ D lim inf EŒjYn j4 :
Rd n!1 Rd n!1
Using once more Brenier’s Theorem 5.20, we get the existence of a real valued convex
function on Rd such that if we set Y D r .X/, then L.Y/ D and W2 .; /2 D
EŒjX Yj2 . Such a Y is a minimizer of so that D .Y/ D 0 which gives:
DQu.Y/ D 1 .Y X/ 4jYj2 Y;
which together with the identity Y D r .X/, also shows that DQu.Y/ 2 fXg. Since the
latter is closed in L2 .˝; F ; PI Rd /, we conclude that DQu.X/ 2 fXg by letting & 0 since Y
converges toward X (notice 2 .Y/ 6 2 .X/), and consequently DQu.Y/ converges toward
DQu.X/ by continuity of DQu.
Now if 2 P2 .Rd / is still absolutely continuous but does not necessarily satisfy the
above moment condition, we use X withpdistribution , i.e., such that L.X/ D , and
we apply the above proof to Xn D nX= n2 C jXj2 , whose law is absolutely continuous,
5.2 Differentiability of Functions of Probability Measures 383
has moments of all orders, and converges toward X in L2 .˝; F ; PI Rd /. Hence DQu.Xn / 2
fXn g fXg and, letting n ! 1, we get DQu.X/ 2 fXg, again by continuity of DQu and
the fact that fXg is closed.
Finally, if 2 P2 .Rd / is not assumed to be absolutely continuous, we consider a triple
of random variables .X; G1 ; G2 / with L.X/ D , X being independent of .G1 ; G2 /, G1 and
G2 being two independent standard d-dimensional Gaussian random variables (recall that we
work on an atomless probability space). For each n > 1, we set Xi;n D X C n1 Gi , i D 1; 2.
The distribution of Xi;n is absolutely continuous so, by what we just saw, DQu.Xi;n / 2 fX; Gi g,
and taking the limit n ! 1 as before we get DQu.X/ 2 fX; Gi g for i D 1; 2. Since G1 and
G2 are independent, we infer that DQu.X/ 2 fXg, which concludes the proof. t
u
Remark 5.26 seems pretty obvious. Actually, the proof requires some care since
the Landau symbol o./ in (5.34) a priori depends on the variable X0 at which
the expansion is performed, and thus on the underlying probability space used to
construct the lift. In order to proceed, one may let:
Z 1=2
2
#./ D jx yj d.x; y/
.Rd /2
Z
u./ u.0 / @ u 0 .x/ .y x/ d.x; y/ ;
.Rd /2
R
for any 2 P2 ..Rd /2 / such that .Rd /2 jx yj2 d.x; y/ 6D 0, where 0 denotes the
R
first marginal of on Rd . When .Rd /2 jx yj2 d.x; y/ D 0, we just let #./ D 0.
Expansion (5.34) says that there exists a probability space .˝; F; P/ such that,
for any X0 2 L2 .˝; F; PI Rd /,
lim # L.X0 ; X/ D 0;
X!X0
the limit in the left-hand side holding true in L2 .˝; F; PI Rd /. We shall prove in
Subsection 5.3.1 below, see Lemma 5.30, that this implies:
Z
#./ ! 0 as jx yj2 d.x; y/ ! 0;
.Rd /2
jDQu.X/ .Y X/j 6 jQu.Y/ uQ .X/j C o kY Xk2
D ju./ u./j C o kY Xk2
6 cW2 .; / C o kY Xk2
6 ckY Xk2 C o kY Xk2 :
Taking the infimum over the random variables X and X 0 with prescribed marginal
distributions, we deduce that u is Lipschitz continuous with respect to the
2-Wasserstein distance.
5.2.2 Examples
Observing that the last term in the right-hand side is less than:
ˇ Z 1 ˇ
ˇ ˇ
ˇE @h.X C Y/ @h.X/ Y d ˇ
0
Z h
1 ˇ ˇ i
6 E sup ˇ@h.X C y/ @h.X/ˇ jYj d
0 jyj6kYk2
1=2
h i
C CE 1fjYj>kYk1=2 g 1 C jXj C jYj jYj
2
h ˇ ˇ i1=2
6E sup ˇ@h.X C y/ @h.X/ˇ2 kYk2
1=2
jyj6kYk2
p 1=2
C CkYk2 kYk2 C kYk2 C sup E jXj2 1A ;
P.A/6kYk2
where the constant C is connected with the growth of @h, it is easy to check that
the Fréchet derivative of uQ at X is given by @h.X/ (where @h is the classical gradient
of h) viewed as an element of the dual since DQu.X/Y D EŒ@h.X/Y. Consequently,
we can think of @ u./ as the deterministic function @h. Example (5.35) highlights
the fact that this notion of L-differentiability is very different from the usual one.
Indeed, given the fact that the function u defined by (5.35) is linear in the measure
variable , when viewed as an element of the dual of a function space, one should
expect the derivative to be h and NOT its derivative @h! We shall revisit this issue
in Section 5.4 where we show that this apparent anomaly is in fact generic. We
shall use this particular example to derive, from the general Pontryagin principle
for the optimal control of McKean-Vlasov diffusion processes which we prove
in Chapter 6, a simple form applicable to scalar interactions which are given by
functions of measures of this type.
D .i/ C .ii/:
from which we conclude that the Gâteaux derivative of the function uQ at X with
distribution in the direction U is given by EfŒ.@h C @h/ N .X/ Ug if we
N N
use the notation f to denote the function f .x/ D f .x/. Since the mapping
N .L.X//.X/ 2 L2 .˝; F; PI Rd / is continuous,
L2 .˝; F; PI Rd / 3 X 7! Œ.@h C @h/
we deduce that uQ is Fréchet differentiable and that the Fréchet derivative at X with
N .X/.
distribution is given by Œ.@h C @h/
Notice that when h is even (i.e., when h.x/ D h.x/), then @h is odd (i.e.,
@h.x/ D @h.x/) and the derivative is given by 2@h or:
@ u././ D 2Œ@h ./:
P2 .Rd /. A subset K Rof P2 .Rd / is said to be bounded if there exists a > 0 such
that for all 2 K, Rd jxj2 d.x/ 6 a. We also assume that v is L-continuously
differentiable in for x fixed, and that for each 2 P2 .Rd /, we can choose, for
each x 2 Rd , a version of @ v.x; /./ in L2 .Rd ; I Rd / in such a way that the
mapping Rd Rd 3 .x; x0 / 7! @ v.x; /.x0 / is measurable and at most of linear
growth, uniformly in when restricted to bounded subsets of P2 .Rd /.
Observe that v is at most of quadratic growth in x, uniformly in in bounded
subsets, proving that u is well defined. Indeed, for X 2 L2 .˝; F; PI Rd / with
L.X/ D :
v.x; /
Z 1
D v.0; / C @x v. x; / x d
0
Z 1 Z 1
D v.0; ı0 / C E @ v 0; L. X/ . X/ X d C @x v. x; / x d :
0 0
For the proof, we introduce an approach which we shall use repeatedly throughout
Q F;
the chapter. We denote by .˝; Q a copy of the space .˝; F; P/, and we use the
Q P/
following convention. For any random variable X 2 L2 .˝; F; PI Rd /, we denote by
Q F;
XQ the copy of X on .˝; Q Then, for X; Y 2 L2 .˝; F; PI Rd / with L.X/ D
Q P/.
and kYk2 6 1:
ŒQu.X C Y/ uQ .X/
D E v X C Y; L.X C Y/ v X; L.X C Y/
C E v X; L.X C Y/ v X; L.X/
h i
D E @x v X; L.X/ Y C EEQ @ v X; L.X/ .X/Q YQ
C E v X C Y; L.X C Y/ v X; L.X C Y/ @x v X; L.X/ Y
h i
C EEQ v X; L.X C Y/ v X; L.X/ @ v X; L.X/ .X/ Q YQ
h i
D E @x v X; L.X/ Y C EEQ @ v X; L.X/ .X/
Q YQ
C .i/ C .ii/:
5.2 Differentiability of Functions of Probability Measures 389
p 1=2
C CkYk2 kYk2 C kYk2 C sup E jXj2 1A ;
P.A/6kYk2
where C may depend on M2 ./, which shows that .i/ D o.kYk2 /. Regarding .ii/,
we have:
Z 1
.ii/ D EEQ @ v X; L.X C Y/ XQ C YQ @ v X; L.X/ .X/Q YdQ
0
h ˇ ˇ2 i1=2
6 kYk2 E sup EQ ˇ@ v X; L.X C Z/ XQ C ZQ @ v X; L.X/ XQ ˇ :
kZk2 6 kYk2
Example 5. Finally we consider a slightly different challenge than the four exam-
ples considered above. We assume that Rd P2 .Rd / 3 .x; / 7! v.x; / 2 Rd
satisfies the same assumption as in Example 3, and we define the function u by:
Remark 5.28 The result of Proposition 5.36 below shows that, under a mild
regularity assumption on the Fréchet derivatives, the differentials constructed above
can be handled by rather regular versions. Indeed, if the function uQ is Fréchet
differentiable and its Fréchet derivative is uniformly Lipschitz, i.e., there exists a
constant c > 0 such that kDQu.X/ DQu.X 0 /k2 6 ckX X 0 k2 for all X; X 0 in
L2 .˝; F; PI Rd /, then there exists a function @ u:
such that j@ u./.x/ @ u./.x0 /j 6 cjx x0 j for all x; x0 2 Rd and 2 P2 .Rd /,
and for every 2 P2 .Rd /, @ u./.X/ D DQu.X/ almost surely if D L.X/.
Because of the technical nature of some of these results, the reader may want to
skip the proofs in a first reading and jump to the more intuitive and enlightening
results of this section.
Proof.
First Step. We first construct .; / for measures concentrated on the unit cube in the
sense that .Œ0; 1/d / D 1. We call U the set of such measures. Observe from Proposition 5.7
that U is a Borel subset of P2 .Rd /.
Given some n Q > 0, we split the hypercube Œ0; 1/d into .2n /d hypercubes of the form
Qn .k1 ; ; kd / D diD1 Œki =2n ; .ki C 1/=2n /, with .k1 ; ; kd / 2 .Z \ Œ0; 2n 1/d . For any
d-tuple .k1 ; ; kd /, we let M n; .k1 ; ; kd / D .Qn .k1 ; ; kd //.
The strategy is to arrange the Qn .k1 ; ; kd / increasingly according to some order. To this
end, we observe that, for any 1 6 i 6 d, ki =2n may be uniquely written as:
ki Xn
"nj .ki /
D ; (5.38)
2 n
jD1
2j
with "nj .ki / 2 f0; 1g. Given .k1 ; ; kd / and .k10 ; ; kd0 / in .Z \ Œ0; 2n 1/d , with
.k1 ; ; kd / 6D .k10 ; ; kd0 /, we say that .k1 ; ; kd / n .k10 ; ; kd0 / if, letting:
˚
p D inf j 2 f1; ; ng W "nj .k1 /; ; "nj .kd / 6D "nj .k10 /; ; "nj .kd0 / ;
˚
q D inf i 2 f1; ; dg W "np .ki / 6D "np .ki0 / ;
it holds 0 D "np .kq / < "np .kq0 / D 1. In other words, the order is defined by taking into account
first the index j in (5.38) and then the coordinate i. Writing x n y if x n y or x D y, n is a
total order on f0; ; 2n 1gd .
We divide the interval Œ0; 1/ into a sequence .I n; .k1 ; ; kd //.k1 ; ;kd /2.Z\Œ0;2n 1/d of
.2 / disjoint (possibly empty) intervals, closed on the left and open on the right, of
n d
392 5 Spaces of Measures and Related Differential Calculus
length M n; .k1 ; ; kd /, and ordered increasingly according to n . This means that, for any
x 2 I n; .k1 ; ; kd / and x0 2 I n; .k10 ; ; kd0 /, x < x0 if .k1 ; ; kd / n .k10 ; ; kd0 /. Then,
we let:
X
n. ; / D 2n k1 ; ; 2n kd 1f 2I n; .k1 ; ;kd /g ; 2 Œ0; 1/:
.k1 ; ;kd /2.Z\Œ0;2n 1/d
proving that on Œ0; 1/ equipped with the Lebesgue measure, the sequence of random variables
.Œ0; 1/ 3 7! n . ; //n>0 converges in distribution to as n tends to C1. Moreover,
because of our choice of ordering, we have:
[
I n; .k1 ; ; kd / D I nC1; .2k1 C 1 ; ; 2kd C d /: (5.39)
.1 ; ;d /2f0;1gd
Indeed, for any .k1 ; ; kd / and .k10 ; ; kd0 / in f0; ; 2n 1gd , .k10 ; ; kd0 / n .k1 ; ; kd /
if and only if .2k10 C10 ; ; 2kd0 Cd0 / nC1 .2k1 ; ; 2kd / for any 1 ; 10 ; ; d ; d0 2 f0; 1g,
which implies that:
[
an; .k1 ; ; kd / D QnC1 .k10 ; ; kd0 /
.k10 ; ;kd0 / nC1 .2k1 ; ;2kd /
bn; .k1 ; ; kd /
[
D
.k10 ; ;kd0 / nC1 .2k1 ; ;2kd /
[
[ Q n
.k10 ; ; kd0 /
0 0
.2k1 ; ;2kd / nC1 .k1 ; ;kd / nC1 .2k1 C1; ;2kd C1/
This proves that in (5.39), the right-hand side is included in the left-hand side. Observing
that both sides are intervals of the same length (closed on the left and open on the right), this
proves the equality.
As a by-product, there exists a constant C, independent of n and , such that:
C
8 2 Œ0; 1/; j n. ; / nC1 . ; /j 6 : (5.40)
2n
We deduce that, for each 2 U , the random variables .Œ0; 1/ 3 7! n. ; //n>0 converge
pointwise. So, we can define:
1. ; / D lim n. ; /:
n!1
1 1 1 1
.x1 ; ; xd / D arctan.x1 / C ; ; arctan.xd / C ; (5.41)
2 2
we then let:
. ; / D 1 1 ; ı 1 ; 2 Œ0; 1/; 2 P2 .Rd /:
Third Step. In order to complete the proof, we consider an atomless probability space
.˝; F ; P/ equipped with a uniformly distributed random variable W ˝ ! Œ0; 1/. First,
from the first two steps of the proof, we know that for any 2 P2 .˝/, the random
variable ˝ 3 ! 7! ..!/; / has distribution . Next, we argue that the mapping
P2 .Rd / 3 7! cl.˝ 3 ! 7! ..!/; // 2 L2 .˝; F ; PI Rd / is measurable, where we
use the notation cl./ for the equivalence class of the random variable for the P-almost sure
equality.
For the proof, we shall use the fact that for a bounded continuous function ` W Rd Rd !
R , the mapping L2 .˝; F ; PI Rd / Rd 3 .X; x/ 7! cl.`.X; x// 2 L2 .˝; F ; PI Rd / is contin-
d
Lemma 5.30 Let # W P2 ..Rd /2 / ! R satisfy, for some atomless probability space
.˝; F; P/,
lim # L.X0 ; X/ D 0;
X!X0
the limit in the left-hand side holding true in L2 .˝; F; PI Rd /, then, for any 0 2
P2 .Rd /,
Z
#./ ! 0 as jx yj2 d.x; y/ ! 0;
.Rd /2
Recall that, for any D 2 B.Rd /, the mapping Rd 3 x 7! .x; D/ is measurable, see Theorem
(Vol II)-1.1 if needed. Notice also that, without any loss of generality, we can assume that,
for all x 2 Rd , .x; / 2 P2 .Rd /. By Proposition 5.7, we deduce that the mapping Rd 3
x 7! .x; / 2 P2 .Rd / is measurable. In particular, the mapping Œ0; 1/ Rd 3 . ; x/ 7!
. ; .x; // is measurable, where is given by Lemma 5.29.
5.3 Regularity Properties of the L-Differentials 395
Lemma 5.31 With defined in (5.41), for any 2 P2 .Rd / such that the measure
ı 1 satisfies:
˚ k ˚ k
ı 1 x 2 Œ0; 1/d W xi D D x 2 Rd W xi D tan n D 0;
2n 2 2
for all i 2 f1; ; dg, n > 1 and k 2 f0; ; 2n 1g, we can find a Borel subset
C Œ0; 1/ (obviously depending upon ), with Leb1 .C/ D 1, such that:
8 2 C; lim . ; 0 / D . ; /;
0 )
Proof. We use the same notation as in the proof of Lemma 5.29. It suffices to prove that, for
every 2 P2 .Œ0; 1/d / satisfying, for all i 2 f1; ; dg, n > 1 and k 2 f0; ; 2n 1g,
˚ k
x 2 Œ0; 1/d W xi D n D 0; (5.42)
2
there exists a Borel subset C Œ0; 1/, with Leb1 .C/ D 1, such that:
8 2 C; lim 1. ; 0 / D 1. ; /:
0 2P2 .Œ0;1/d /W 0 )
396 5 Spaces of Measures and Related Differential Calculus
For a given 2 P2 .Œ0; 1/d / satisfying (5.42), we thus consider a sequence .N /N>1 of
probability measures on Œ0; 1/d weakly converging to . We then recall that, for any integer
n > 1,
[
an; .k1 ; ; kd / D Qn .k10 ; ; kd0 / ;
.k10 ; ;kd0 / n .k1 ; ;kd /
[
bn; .k1 ; ; kd / D Qn .k10 ; ; kd0 / ;
.k10 ; ;kd0 / n .k1 ; ;kd /
where the relationship x n y stands for x n y or x D y. For a given value of n, observe that
the boundaries of both
[ [
Qn .k1 ; ; kd / and Qn .k10 ; ; kd0 /
0 0 0 0
.k1 ; ;kd / n .k1 ; ;kd / .k1 ; ;kd / n .k1 ; ;kd /
are included in [diD1 [2`D0 fx 2 Œ0; 1/d W xi D `=2n g. Thanks to (5.42), they are of zero
n
In particular, for any tuple .k1 ; ; kd / 2 f0; ; 2n 1gd and any real in the interval
.an; .k1 ; ; kd /; bn; .k1 ; ; kd // (if the interval is not empty), we can find an integer
N n; .k1 ; ; kd / such that, for N > N n; .k1 ; ; kd /,
proving that:
k1 kd
n. ; N / D ; ; D n. ; /:
2n 2n
Since
ˇ ˇ C ˇ ˇ C
ˇ 1. ; N / n. ; N /ˇ 6 n ; ˇ 1. ; / n. ; /ˇ 6 n ;
2 2
ˇ ˇ C
ˇ 1. ; N / 1. ; /ˇ 6 ;
2n1
˚ k
ı a1x 2 Rd I xi D tan n
2 2
˚ k
D x 2 Rd I xi D tan n a ;
2 2
so that:
˚ k
ı a1 x 2 Rd I xi D tan >0
2n 2
k
, 9` 2 N; a D tan c` :
2n 2
Letting:
n o
N D tan k cI
Q c 2 Q; n > 0 n f0g; k 2 f0; ; 2n 1g ;
2n 2
N
we deduce that, for a 62 Q:
˚ k
8n > 1; 8k 2 f0; ; 2n 1g; ı a1 x 2 Rd I xi D tan D 0:
2n 2
From Lemma 5.31, we deduce that, for a 62 Q N and 2 P2 .Rd /, there exists a Borel
subset C Œ0; 1/ such that, for all 2 C, the function Q . ; a; / is continuous at .
So for any sequence .N /N>1 weakly converging to ,
˚
N
8a 62 Q; Leb1 2 Œ0; 1/ W lim Q . ; a; N / D Q . ; a; / D 1:
N!1
398 5 Spaces of Measures and Related Differential Calculus
And then,
Z
˚
Leb1 2 Œ0; 1/ W lim Q . ; a; N / D Q . ; a; / '.a/da D 1;
R N!1
where ' denotes the density of the standard Gaussian probability measure. By
Fubini’s theorem, for almost every . ; a/ 3 Œ0; 1/ R, the function Q . ; a; / is
continuous at . This shows that it suffices to add an additional parameter in the
definition of in order to make it continuous in 2 P2 .Rd / for almost every
fixed values of the other parameters. Put it differently, if and G denote two
independent random variables on some probability space .˝; F; P/, with being
uniformly distributed on Œ0; 1/ and G N.0; 1/, then PŒlimN!1 Q .; G; N / D
Q .; G; / D 1, with Q .; G; N / N , for all N > 1, and Q .; G; / . This
recovers Blackwell and Dubins’ theorem.
Recall that we use freely the notation X in lieu of L.X/ D whenever the
expressions for X and may render the typesetting too cumbersome.
Proof. Let .˝; F ; P/ be a Polish atomless probability space. We use the fact that for any
bounded continuous function ` W Rd Rd ! R and d intervals I1 ; ; Id on the real
line, the mapping L2 .˝; F ; PI Rd / Rd 3 .X; x/ 7! cl.1I1 .`.X; x//; ; 1Id .`.X; x/// 2
L2 .˝; F ; PI Rd / is measurable. As before, L2 .˝; F ; PI Rd / is the quotient of the space
5.3 Regularity Properties of the L-Differentials 399
of square-integrable random variables for P-almost sure equality, and cl./ denotes the
equivalence class of in L2 .˝; F ; PI Rd /.
Here is the way we apply this simple fact. Denoting by ‘’ the inner product in
L2 .˝; F ; PI Rd /, the mapping ŒL2 .˝; F ; PI Rd /2 3 .X; Y/ 7! ŒDQu.X/ Y is measurable
as the pointwise limit of measurable mappings. Therefore, for any vector e 2 Rd and any
" > 0, the mapping L2 .˝; F ; PI Rd / Rd 3 .X; x/ 7! ŒDQu.X/ .e1fjXxj6"g / is jointly
measurable. Then, the mapping D . 1 ; ; d / W L2 .˝; F ; PI Rd / Rd ! Rd given by
h ŒDQu.X/ .e 1 i
i fjXxj 6 "g /
i .X; x/ D lim inf 1fP.jXxj6"/>0g
"&0 P.jX xj 6 "/
P
In the right-hand side of the above equality, the function @ u.N 1 NjD1 ıxj /./ is
P
uniquely defined in L2 .Rd ; N 1 NjD1 ıxj I Rd /. In particular, for each i 2 f1; ; Ng,
P
@ u.N 1 NjD1 ıxj /.xi / is uniquely defined.
Proof. On an atomless Polish probability space .˝; F ; P/, we consider a random variable #
uniformly distributed over the set f1; ; Ng. Then, forP
any fixed x D .x1 ; ; xN / 2 .Rd /N ,
1 N
x# is a random variable having distribution N x D N
N
iD1 ıxi . In particular, with the same
notation as above for uQ ,
the dot product being here the L2 - inner product over .˝; F ; P/, from which we deduce:
1 X
N
uN .x C h/ D uN .x/ C @ u.N Nx /.xi / hi C o.jhj/;
N iD1
Since the L-derivative of a function has the particular form given by Proposi-
tion 5.25, the Lipschitz property can be rewritten as:
E j@ u.PX /.X/ @ u.PY /.Y/j2 6 C2 E jX Yj2 ; (5.43)
Then, for each 2 P2 .Rd /, one can redefine v./. / on a -negligible set in such
a way that:
Remark 5.37 Here, the atomless property is just used to guarantee the existence of
random variables with a prescribed distribution on any Polish space.
The proof of Proposition 5.36 is rather long and technical, so the reader mostly
interested in the practical applications of the notion of L-differentiability may
want to skip it in a first reading.
Proof.
First Step. We first consider the case of a bounded function v, and assume that has
a strictly positive continuous density p on the whole Rd , p and its derivatives being of
exponential decay at infinity. We claim that there exists a continuously differentiable one-
to-one function U from .0; 1/d onto Rd such that, whenever 1 ; ; d are d independent
random variables uniformly distributed on .0; 1/, U. 1 ; ; d / has distribution . It satisfies
for any .z1 ; ; zd / 2 .0; 1/d :
@Ui @Uj
.z1 ; ; zd / 6D 0; .z1 ; ; zd / D 0; 1 6 i < j 6 d:
@zi @zi
The result is well known when d D 1. In such a case, U is the inverse of the cumulative
distribution function of . In higher dimension, U can be constructed by an induction
argument on the dimension. Assume indeed that some U O has been constructed for the first
marginal distribution O of on Rd1 , that is for the push-forward of by the projection
mapping Rd 3 .x1 ; ; xd / 7! .x1 ; ; xd1 /. Given .x1 ; ; xd1 / 2 Rd1 , we then denote
by p.jx1 ; ; xd1 / the conditional density of given the d 1 first coordinates:
p.x1 ; ; xd /
p.xd jx1 ; ; xd1 / D ; x1 ; ; xd1 2 Rd1 ;
pO .x1 ; ; xd1 /
with
Z xd
Fd .xd jx1 ; ; xd1 / D p.yjx1 ; ; xd1 /dy;
1
@U .d/ 1
.zd jx1 ; ; xd1 / D ;
@zd p.U .d/ .zd jx1 ; ; xd1 /jx1 ; ; xd1 /
which is nonzero. We now let:
U.z1 ; ; zd / D U.z O 1 ; ; zd1 // ;
O 1 ; ; zd1 /; U .d/ .zd jU.z z1 ; ; zd 2 .0; 1/d :
In particular, setting:
vn .x/ D E v PCn1 G x C n1 G
Z
nd jx yj2
D v PCn1 G .y/ exp n2 dy;
.2/ d=2
Rd 2
we have:
E jvn ./ vn . 0 /j2 6 C2 E j 0 j2 : (5.45)
We now choose a specific coupling for and 0 . Indeed, we know that, for any D
. 1 ; ; d / and 0 D . 01 ; ; 0d /, with uniform distributions on .0; 1/d , U. / and U. 0 /
have the same distribution as . Without any loss of generality, we may assume that the
probability space .˝; F ; P/ is given by .0; 1/d Rd endowed with its Borel -algebra and
the product of the Lebesgue measure on .0; 1/d and of the Gaussian measure Nd .0; Id /. The
random variables and G are then chosen as the canonical mappings W .0; 1/d Rd 3
.z; y/ 7! z and G W .0; 1/d Rd 3 .z; y/ 7! y.
We then define 0 as a function of the variable z 2 .0; 1/d only. For a given z0 D
.z1 ; ; z0d / 2 .0; 1/d and for h small enough so that the open ball B.z0 ; h/ of center z0
0
where ed is the d-th vector of the canonical basis, that is 0 matches locally the symmetry
with respect to the hyperplane containing z0 and orthogonal to ed . Clearly, 0 preserves the
Lebesgue measure. With this particular choice, we rewrite (5.45) as:
Z Z
ˇ ˇ ˇ ˇ
ˇvn U. .z// vn U. 0 .z// ˇ2 dz 6 C2 ˇU. .z// U. 0 .z//ˇ2 dz;
.0;1/d .0;1/d
or equivalently:
Z
ˇ 0 ˇ
ˇvn U z C r 2rd ed vn U.z0 C r/ ˇ2 dr
jrj<h
Z (5.46)
ˇ 0 ˇ
6 C2 ˇU z C r 2rd ed U.z0 C r/ˇ2 dr:
jrj<h
Xd
@vn @Ui 0
vn U z0 C r 2rd ed vn .U.z0 C r// D 2 .U.z0 // .z /rd C o.r/
iD1
@xi @zd
@vn @Ud 0
D 2 .U.z0 // .z /rd C o.r/;
@xd @zd
Similarly,
Z Z
ˇ ˇ ˇ
ˇU.z0 C r 2rd ed / U.z0 C r/ˇ2 dr D 4ˇ @Ud .z0 /j2 rd2 dr C o.hdC2 /; (5.48)
jrj<h @zd jrj<h
ˇ @vn
ˇ .U.z0 //j2 6 C2 ;
@xd
and since U is a one-to-one mapping from .0; 1/d onto Rd , and z0 2 .0; 1/d is arbitrary,
we conclude that jŒ@vn =@xd .x/j 6 C, for any x 2 Rd . By changing the basis used for the
construction of U (we used the canonical basis but we could use any orthonormal basis as
well), we have jrvn .x/ej 6 C for any x; e 2 Rd with jej D 1. This proves that the functions
.vn /n>1 are uniformly bounded and C-Lipschitz continuous. We then denote by vO the limit
of a subsequence converging for the topology of uniform convergence on compact subsets.
For simplicity, we keep the index n to denote the subsequence. Assumption (5.44) implies:
E jvn ./ v.P /./j2 6 E jv.PCn1 G /. C n1 G/ v.P /./j2 6 C2 n2 ;
and taking the limit n ! C1, we deduce that vO and v. ; P / coincide P almost everywhere.
This completes the proof when v is bounded and has a continuous positive density p, p and
its derivatives being of exponential decay at infinity.
Third Step. When v is bounded and is bounded and has a general distribution, we
approximate by C n1 G again. Then, C n1 G has a positive continuous density, the
density and its derivatives being of Gaussian decay at infinity, so that, by the second step, the
function Rd 3 x 7! v.PCn1 G /.x/ can be assumed to be C-Lipschitz continuous for each
n > 1. Extracting a convergent subsequence and passing to the limit as above, we deduce
that v.P /./ admits a C-Lipschitz continuous version.
When v is bounded and is not assumed to be bounded, we approximate by its
orthogonal projection on the ball of center 0 and radius n. We then complete the proof in
a similar way.
Finally when v is not bounded, we approximate v by . n .v//n>1 where, for each n > 1,
n is a bounded smooth function from R into itself such that n .r/ D r for r 2 Œn; n and
jŒd n =dr.r/j 6 1 for all r 2 R. Then, for each n > 1, there exists a C-Lipschitz continuous
version of n .v.P //./. Letting n tend to 1, we complete the proof. t
u
Under the Lipschitz assumption (5.43) on the Fréchet derivative of the lifting
of u, we can use Proposition 5.36 in order to define @ u./.x/ for every and
every x while preserving the Lipschitz property in the variable x. From now on,
whenever the derivative DQu is Lipschitz, we shall use such a version of @ u././.
Importantly, observe that this version is uniquely defined on the support of only.
5.3 Regularity Properties of the L-Differentials 405
So, if ; 2 P2 .Rd / and X and Y are random variables such that L.X/ D and
L.Y/ D , we have:
E j@ u./.X/ @ u./.X/j2
6 2 E j@ u./.X/ @ u./.Y/j2 C E j@ u./.Y/ @ u./.X/j2
6 4C2 E jY Xj2 ;
where we used the Lipschitz property (5.43) of the derivative together with the result
of Proposition 5.36 applied to the function @ u./. Now, taking the infimum over
all the couplings .X; Y/ with marginals and , we obtain:
inf E j@ u./.X/ @ u./.X/j2 6 4C2 W2 .L.X/; L.Y//2 ;
X;L.X/D
and since the left-hand side depends only upon and not on X as long as L.X/ D ,
we get:
E j@ u./.X/ @ u./.X/j2 6 4C2 W2 .; /2 : (5.49)
.; x/ 7! @ u./.x/ is measurable and continuous at any point .; x/ such that x
belongs to the support of .
Proof. We already know that, for any 2 P2 .Rd /, we can find a Lipschitz version of
@ u././. Whenever the support of is the entire space Rd , this Lipschitz version is the
unique continuous version of @ u././.
First Step. For any 2 .0; 1, we consider the function:
U W P2 .Rd / Rd 3 .; x/ 7! U .; x/ D @ u Nd .0; Id / .x/;
where as usual, Id is the d-dimensional identity matrix and Nd .0; Id / is the d-dimensional
Gaussian law with zero as mean and Id as covariance matrix. We observe that U is uniquely
defined and that, for any 2 P2 .Rd /, U .; / is Lipschitz continuous, uniformly in and
> 0. Moreover, for any 2 P2 .Rd /, any random variable X with as distribution,
and any random variable G independent of X, with Nd .0; Id / as distribution, X and G being
constructed on .˝; F ; P/, it holds that:
the value of the constant C being allowed to increase from line to line, and where we used
the Lipschitz property of DQu in the last line. So for a constant C independent of 2 .0; 1,
we have:
2
jU .; 0/j2 6 C 1 C M2 ./ :
We now claim that the mapping U is jointly continuous. It suffices to observe that, for
a sequence .n /n>0 converging to 2 P2 .Rd /, the family of mappings .U .n ; //n>0 is
uniformly continuous, the sequence .U .n ; 0//n>0 being bounded. Therefore, the family of
functions .U .n ; //n>0 is relatively compact for the topology of uniform convergence on
compact subsets. Passing to the L2 .˝; F ; PI Rd /-limit in the identity:
where, for all n > 0, Xn n is independent of G, we deduce that any limit of .U .n ; //n>0
coincides with U .; /. Joint continuity easily follows.
Second Step. We now let:
n
U .; x/ D lim inf U 2 .; x/; 2 P2 .Rd /; x 2 Rd ;
n!1
Therefore:
n
P lim inf U 2 .; X C 2n G/ D U .; X/ D 1;
n!1
and finally:
P DQu.X/ D U .; X/ D 1:
Third Step. It remains to check that U is continuous at any .; x/ such that x belongs to
the support of . We thus consider such a pair .; x/ together with a sequence .n ; xn /n>0
converging to .; x/. By the same argument as in the first step, we may extract a subsequence
of .U .n ; //n>0 converging for the topology of uniform convergence on compact subsets.
For simplicity, we still denote this sequence by .U .n ; //n>0 and we call v./ its limit. Given
a sequence .Xn /n>0 of random variables converging to X in L2 .˝; F ; PI Rd /, with Xn n
and X (see for instance Subsection 5.3.1 for the construction), we can pass to the limit in:
P DQu.Xn / D U .n ; Xn / D 1;
and exploiting the fact U .n ; / is Lipschitz continuous, uniformly in n, deduce that:
P DQu.X/ D v.X/ D 1;
which proves that v coincides with U .; / almost everywhere under . Since both are
continuous, they must coincide on the support of . In particular, for the same x as above,
v.x/ D U .; x/. Since limn!1 U .n ; xn / D v.x/, this completes the proof. t
u
1X
N
@uN .x/ .y x/ D @ u./.xi / .yi xi /
N iD1
X N 1=2
1
C O W2 .N Nx ; / jxi yi j2 ;
N iD1
the dot product in the left-hand side standing for the usual Euclidean inner product.
408 5 Spaces of Measures and Related Differential Calculus
@uN .x/ .y x/
X
N
D @xi uN .x/ .yi xi /
iD1
1 X
N
D @ u.N Nx /.xi / .yi xi /
N iD1
1 X 1 X
N N
D @ u./.xi / .yi xi / C @ u.N Nx /.xi / @ u./.xi / .yi xi /;
N iD1 N iD1
where ‘’ in the left-hand side is the inner product in RdN , while ‘’ in the right-hand side is
the inner product in Rd . Now, by Cauchy-Schwarz’ inequality,
ˇ X ˇ
ˇ1 N ˇ
ˇ Œ@ u.N x /.xi / @ u./.xi / .yi xi /ˇˇ
N
ˇN
iD1
1=2 X 1=2
1 X
N N
1
6 j@ u.N Nx /.xi / @ u./.xi /j2 jyi xi j2
N iD1 N iD1
1=2
1 X
N
1=2
D E j@ u.N Nx /.x# / @ u./.x# /j2 jyi xi j2
N iD1
1=2
1 X
N
6 2CW2 .N Nx ; / jyi xi j2 ;
N iD1
if we use the same notation for # as in the proof of Proposition 5.35, and apply the
estimate (5.49) with X D x# , D N Nx , and D . t
u
and
h i
lim E W2 .N N ; /2 D 0:
n!C1
sample .Xi /16i6N is close to the sample .@ u./.Xi //16i6N , the accuracy of the
approximation being specified in the L2 .˝; F; PI Rd / norm by (5.20) when is
sufficiently integrable.
Joint Differentiability
The notion of joint differentiability is defined according to the same procedure as
before: h is said to be jointly differentiable if the lifting hQ W Rn L2 .˝; F; PI Rd / 3
.x; X/ 7! h.x; PX / over some atomless Polish probability space .˝; F; P/ is jointly
differentiable. If hQ is continuously differentiable in the direction X, we can define the
partial derivatives in x and . They read Rn P2 .Rd / 3 .x; / 7! @x h.x; / and Rn
P2 .Rd / 3 .x; / 7! @ h.x; /./ 2 L2 .Rd ; I Rd / respectively. By construction,
the partial Fréchet derivative of hQ in the direction X is given by the mapping
L2 .˝; F; PI Rd / 3 .x; X/ 7! DX h.x; Q X/ D @ h.x; PX /.X/ 2 L2 .˝; F; PI Rd /.
The statement and the proof of Proposition 5.33 can be easily adapted to the joint
measurability of Rn P2 .Rd / Rd 3 .x; ; v/ 7! @ h.x; /.v/. In order to
distinguish the variable x in h.x; / from the variable at which the derivative with
respect to is computed, we shall often denote the latter by v.
A standard result from classical analysis which we often use says that joint
continuous differentiability in the two arguments is equivalent to the partial
differentiability in each of the two arguments together with the joint continuity
of the partial derivatives. Here, the joint continuity of @x h is understood as the
joint continuity with respect to the Euclidean distance on Rn and the Wasserstein
distance on P2 .Rd /. The joint continuity of @ h needs to be understood as the joint
continuity of the mapping .x; X/ 7! @ h.x; PX /.X/ from Rn L2 .˝; F; PI Rd / into
L2 .˝; F; PI Rd /. The proof follows from the standard decomposition:
h.x0 ; 0 / h.x; /
Lemma 5.41 Let .u.x; /.//x2Rn ;2P2 .Rd / be a collection of real-valued functions
satisfying, for all x 2 Rn and 2 P2 .Rd /, u.x; /./ 2 L1 .Rd ; I R/, and for which
there exists a constant C such that, for all x; x0 2 Rn , and ; 0 ; 2 L2 .˝; F; PI Rd /,
E u.x; L.//./ u.x0 ; L. 0 //. 0 /
h i
6 C kk1 jx x0 j C k 0 k1 C E j 0 j jj ;
Remark 5.42 Observe that, differently from what we have done so far, we here use
the L1 norm instead of the L2 norm, and the 1-Wasserstein distance W1 instead of
the 2-Wasserstein distance W2 , in order to characterize continuity with respect to
the measure argument. Of course, this is more demanding. This choice is dictated
by the argument used below for exhibiting a fully regular version of U , which is
based on the duality between L1 and L1 .
Notice also that, although u.x; /./ 2 L1 .Rd ; I R/, we do not claim that the
version provided by the statement is bounded on the whole Rd .
Proof. As a preliminary remark, notice that from the main assumption in the statement, the
map Rd L2 .˝; F ; PI Rd / 3 .x; / 7! u.x; L.//./ 2 L2 .˝; F ; PI Rd / is continuous.
5.3 Regularity Properties of the L-Differentials 411
First Step. Consider x; x0 2 Rn and ; 0 2 L2 .˝; F ; PI Rd /. Observe that from the regularity
assumption, with probability 1 under P,
ˇ ˇ
ˇu x; L./ ./ u x0 ; L. 0 / . 0 /ˇ 6 C jx x0 j C k 0 k1 C j 0 j :
Observe that the integrals in the left-hand side are well defined: Since L.C p1 Z/ has a positive
density, u.x; L. C 1p Z//./ 2 L1 .Rd ; Lebd I R/, and similarly for the second integral.
Letting, for all x 2 Rn , 2 P2 .Rd / and v 2 Rd ,
Z
1
up .x; /.v/ D u x; .Nd .0; I //
p2 d
.v C 1p z/'d .z/dz;
Rd
we get that up .x; /./ is continuous and satisfies, for all x; x0 in Rn and ; 0 in
L2 .˝; F ; PI Rd /, with probability 1 under P,
ˇ ˇ
ˇup x; L./ ./ up x; L. 0 / . 0 /ˇ 6 C jx x0 j C k 0 k1 C j 0 j : (5.50)
Second Step. We now consider ; 0 2 P2 .Rd / such that both have a strictly positive smooth
density that decays at least exponentially fast at the infinity and whose derivative also decays
at least exponentially fact at infinity. We let be an optimal coupling between and 0 for
the 1-Wasserstein distance, that is
Z
W1 .; 0 / D jv v 0 jd.v; v 0 /;
Rd Rd
vD .y0 /; v0 D .y00 /:
412 5 Spaces of Measures and Related Differential Calculus
Then, for a given random variable with uniform distribution on .0; 1/d and for ı > 0 such
that B.y0 ; ı/ .0; 1/d and B.y00 ; ı/ .0; 1/d , where B.y0 ; ı/ denotes the d-dimensional open
ball of center y0 and of radius ı, we let:
8
ˆ
ˆ if 62 B.y0 ; ı/ [ B.y00 ; ı/;
<
0
D C y00 y0 if 2 B.y0 ; ı/;
ˆ
:̂ C y y0 if 2 B.y00 ; ı/:
0 0
N
" D " C .1 "/; ";0 D " 0 C .1 "/N 0 :
Clearly, " and ";0 have and 0 as respective distributions. Taking advantage of the
conclusion of the first step, we deduce that, with probability 1 under P,
ˇ ˇ
ˇup .x; /. " / up .x0 ; 0 /. ";0 /ˇ 6 C jx x0 j C k " ";0 k1 C j " ";0 j :
Therefore, we can find a sequence .ym /m>1 converging toward y0 such that:
ˇ ˇ
ˇup .x; / .ym / up .x0 ; 0 / 0 .ym C y0 y0 / ˇ
0
6 C jx x0 j C k " ";0 k1 C j .ym / 0 .ym C y00 y0 /j :
Inequality (5.51) holds true for probability measures ; 0 that have a strictly positive smooth
density that decays at least exponentially fast at infinity and whose derivative also decays
at least exponentially fast at infinity. Since the set of such smooth probability measures
is dense in P2 .Rd /, we deduce that the restriction of up to smooth probability measures
extends by continuity to the whole Rn P2 .Rd / Rd . Of course, the continuous extension,
5.3 Regularity Properties of the L-Differentials 413
Since up .0; ı0 /.0/ D EŒu.0; 1p Z/. p1 Z/ and since the map Rd L2 .˝; F ; PI Rd / 3 .x; / 7!
u.x; L.//./ 2 L2 .˝; F ; PI Rd / is continuous, we deduce that the sequence .Nup .0; ı0 /.0/ D
up .0; ı0 /.0//p>1 is bounded, which shows that the functions .Nup /p>1 are uniformly at most
of linear growth.
Recalling that any bounded subset of P2 .Rd / is a compact subset of P1 .Rd /, we deduce
from the Arzelà-Ascoli theorem that there exists a subsequence, still denoted by .Nup /p>1 , that
converges uniformly on any bounded subset of Rn P2 .Rd / Rd .
It remains to identify the limit of uN p .x; /./ with a version of u.x; /./ in L1 .Rd ; I R/.
This follows from the fact that, for any bounded and measurable function g W Rd ! R,
E uN p x; L./ ./g./ D E u x; L. C 1p Z/ . C 1p Z/g./ ;
So, if h is jointly measurable on Rn P2 .Rd /, then, for any Z 2 L2 .˝; F; PI Rd /, the
mapping Rn ŒL2 .˝; F; PI Rd /2 3 .x; X; Y/ 7! EŒ.DX h.x; Q X/Z/Y is measurable
2
and the mapping R L .˝; F; PI R / 3 .x; X/ 7! EŒ.DX h.x;
n d Q X/ Z/ .X; Z/ is
also measurable for any bounded and continuous function from .Rd /2 into Rd . If
Q is a dense countable subset of the space of continuous functions from .Rd /2 ! Rd
converging to 0 at infinity, we see that:
Rn L2 .˝; F; PI Rd / 3 .x; X/
n o
7! sup E DX h.x;Q X/ Z .X; Z/ 1fk .X;Z/k2 6 1g
2Q
414 5 Spaces of Measures and Related Differential Calculus
The proof is quite simple. It follows from the fact that, whenever a sequence
.x` ; X` /`>1 converges to some .x; X/ in Rn L2 .˝; F; PI Rd /, then the sequence
. .x` ; L.X` /; X` //`>1 converges to .x; L.X/; X/ in probability and the proof is
completed using uniform integrability. When p D 2 and continuity does not hold,
the measurability may be proved by approximating by .` ı /`>1 where, for
each ` > 1, ` W R ! R is a bounded and continuous function satisfying ` .x/ D x
for jxj 6 ` and j` .x/j 6 ` for all x 2 Rd .
We argued repeatedly that the notion of L-differentiability was natural in the context
of functions of probability measures appearing as the distributions of random
variables which needed to be perturbed. More convincing arguments will come
with the discussions of applications in which we need to track the dependence
with respect to the marginal distributions of a stochastic dynamical system. See
for example Section 5.6 later in this chapter or Chapter 6.
5.4 Comparisons with Other Notions of Differentiability 415
This being said, L-differentiability may not appear as the most intuitive notion,
especially from a geometric analysis perspective. The goal of this section is to
enlighten the reader as to the relationships between L-differentiability and some
of the more traditional approaches which have been advocated in similar contexts,
and especially in the theory of optimal transportation.
ıu
W M.Rd / Rd ! R
ım
ıu ıu
W P2 .Rd / Rd 3 .m; x/ 7! .m/.x/ 2 R;
ım ım
continuous for the product topology, P2 .Rd / being equipped with the 2-Wasserstein
distance, such that, for any bounded subset K P2 .Rd /, the function Rd 3 x 7!
Œıu=ım.m/.x/ is at most of quadratic growth in x uniformly in m for m 2 K, and
such that for all m and m0 in P2 .Rd /, it holds:
Z 1 Z
ıu 0
u.m0 / u.m/ D tm C .1 t/m .x/dŒm0 m.x/ dt: (5.53)
0 Rd ım
Proposition 5.44 Let u have a linear functional derivative in the sense of Defi-
nition 5.43. Assume further that, for any m 2 P2 .Rd /, the function Rd 3 x 7!
Œıu=ım.m/.x/ is differentiable and the derivative
5.4 Comparisons with Other Notions of Differentiability 417
h ıu i
P2 .Rd / Rd 3 .m; x/ 7! @x .m/.x/ 2 Rd
ım
Proof. We write:
Z
ıu
u.m0 / u.m/ D .m/.x/dŒm0 m.x/
ım
Rd
Z 1Z h i
ıu 0 ıu
C tm C .1 t/m .x/ .m/.x/ dŒm0 m.x/ dt:
0 Rd ım ım
u.m0 / u.m/
Z
ıu
D .m/.x/dŒm0 m.x/
R d ım
Z 1Z h ıu ıu
C tm0 C .1 t/m .y/ .m/.y/
0 R R
d d ım ım
ıu 0 ıu i
tm C .1 t/m .x/ .m/.x/ d.x; y/ dt
ım ım
Z
ıu
D .m/.x/dŒm0 m.x/
ım
Rd
Z 1Z 1Z h ıu 0
C @x tm C .1 t/m y C .1 /x
0 0 R R
d d ım
ıu i
@x .m/ y C .1 /x .y x/ d.x; y/ d dt:
ım
In order to complete the proof, it suffices to show that the last term in the above right-hand
side is o.W2 .m0 ; m// as W2 .m0 ; m/ ! 0 while m is fixed. Invoking Lemma 5.30 and letting,
with the same notation as in the statement, ".r/ D supf#./g, the supremum being taken
over
R the probability measures 2 P2 ..Rd /2 / having m as first marginal on Rd and satisfying
2
.Rd /2 jx yj d.x; y/ r2 , we just have to prove that:
418 5 Spaces of Measures and Related Differential Calculus
Z 1 Z 1
ıu 0
lim E @x tm C .1 t/m X 0 C .1 /X
X 0 !X 0 0 ım
ıu X0 X
@x .m/ X 0 C .1 /X d dt D 0;
ım kX 0 Xk2
Remark 5.45 In order to distinguish the functional derivative ıu=ım from the
L-derivative @ u, we use the letter ı instead of @ for the differential symbol.
Moreover, we use the letter m instead of for the measure argument.
Remark 5.46 Formulas (5.53) and (5.54) only involve integrals with respect to the
measure m0 m. As a result, any constant can be added to ıu=ım without affecting
either of these formulas as long as the measures m and m0 have the same total mass.
Consequently, ıu=ım is only defined up to an additive constant.
Remark 5.47 Observe also that if there exists a continuous function Œıu=ım which
is at most of quadratic growth in x uniformly in m for m in a bounded subset of
P2 .Rd / and for which the conclusion (5.54) of Proposition 5.44 holds true, then u
has Œıu=ım as linear functional derivative. This follows from the fact that, for any
two m; m0 2 P2 .Rd /, under (5.54), the mapping Œ0; 1 3 t 7! u.m C t.m0 m// is
differentiable with
Z
d ıu
u m C t.m0 m/ D m C t.m0 m/ .x/d m0 m .x/; t 2 Œ0; 1;
dt Rd ım
uQ .X C Y/ uQ .X/ D u L.X C Y/ u L.X/
Z 1Z
ıu
D L .t/ .x/ d L.X C Y/ L.X/ .x/ dt
0 Rd ım
Z 1
ıu ıu
D E L .t/ .X C Y/ L .t/ .X/ dt;
0 ım ım
where we used the notation L .t/ for tL.X C Y/ C .1 t/L.X/. Furthermore, if
we assume that the function u satisfies the assumption of Proposition 5.44, then,
following the proof of (5.54), we have for kYk2 6 1:
uQ .X C Y/ uQ .X/
Z 1 Z 1 h i
ıu
D E @x L .t/ .X C Y/ Y d dt
0 0 ım
h ıu i
D E @x L0 .t/ .X/ Y
ım
Z 1 Z 1 h i
ıu ıu 0
C E @x L .t/ .X C Y/ @x L .t/ .X/ Y d dt
0 0 ım ım
h ıu i
D E @x L.X/ .X/ Y C o./; (5.55)
ım
where the Landau notation o./ in the last line is uniform with respect to Y 2
L2 .˝; F; PI Rd / with kYk2 6 1. This shows that the lifting uQ is Fréchet differ-
entiable at X and that its Fréchet derivative is given by:
ıu
DQu W L2 .˝; F; PI Rd / 3 X 7! @x L.X/ .X/;
ım
ıu
@ u./. / D @x ./. /; 2 P2 .Rd /:
ım
Remark 5.49 The above derivation has the following enlightening interpretation.
If a smooth extension of u to the space of measures exists, the L-derivative of u at
2 P2 .Rd /, when viewed as a function on Rd , is the derivative (gradient) of the
(Fréchet or Gâteaux) derivative of u when considered as a function on the vector
420 5 Spaces of Measures and Related Differential Calculus
space of signed measures. This general fact was already encountered with our first
example of an L-derivative computation. The fact that the L-derivative, when viewed
as a function from Rd into itself, appears to be the gradient of a scalar function will
be argued rigorously in a more general setting in Proposition 5.50 below.
Proof.
First Step.R We use the fact that, if a locally square-integrable vector field f W Rd ! Rd is
such that Rd f .x/ b.x/dx D 0 for every smooth divergence free vector field b with compact
support, then there exists a locally square-integrable function p W Rd ! R such that:
f D rp;
in the sense of distributions. See Remark 1.9 in Temam’s book [331]. We show that whenever
f is continuous, p must be continuously differentiable. Consider indeed a sequence of
mollifiers . " /">0 with compact support. Then, for all " > 0,
"
"
f Dr p ;
where we used the symbol to denote convolution. Since both sides of the equality are
smooth, we have:
Z 1 h i
"
"
"
8x 2 Rd ; p .x/ D p .0/ C f .tx/ x dt: (5.57)
0
Since f is continuous, the right-hand side is continuous in x, uniformly in " > 0 on any
compact subset of Rd . This shows that the functions .Rd 3 x 7! p " .x/ p " .0//">0 are
5.4 Comparisons with Other Notions of Differentiability 421
equicontinuous on compact sets. Therefore, the family .Rd 3 x 7! p " .x/ p " .0//">0
is relatively compact for the topology of uniform convergence on compact subsets of Rd . We
call pQ a limit. Similarly,
Z h i
"
"
1
"
8x; y 2 Rd ; p .y/ D p .x/ C f ty C .1 t/x .y x/ dt:
0
b
XP t D .X /; t > 0;
m t
has a unique solution for every initial condition, and the measure m .dx/ D m .x/dx is
invariant for this dynamical system, as b is divergence free. The fact that we use the same
notation m for a measure and its density should not be a source of confusion. In any case,
L.Xt / D L.X0 / if L.X0 / D m . Note that the randomness in X comes from the initial
condition only. Consequently,
1
0D u L.Xt / u L.X0 /
t
1
D uQ Xt uQ X0
t
1 1
D DQu.X0 / .Xt X0 / C o kXt X0 k2
t t
b
D E @ u.m /.X0 / .X / C o.1/;
m 0
We can now take the limit & 0 in the above equality and get the desired result. Recall
that Proposition 5.36 and Corollary 5.38 guarantee the existence, for every 2 P2 .Rd /, of a
version x 7! @ u./.x/ of the L-derivative such that x 7! @ u./.x/ is Lipschitz continuous
with a constant independent of and such that j@ u./.0/j 6 c.1 C M2 .// for a constant
c independent of . So, the family .@ u.m /. //>0 is uniformly continuous on compact
422 5 Spaces of Measures and Related Differential Calculus
subsets of Rd , and we can extract a subsequence which converges uniformly on compact sets
toward a function providing a version of @ u.m/./. Since this limit satisfies (5.58) for every
b, it is a gradient.
Third Step. Since the version constructed in the second step is continuous, we deduce from
the first step that its potential p is continuously differentiable. By choosing the version of
p which vanishes at 0 and by taking the limit in (5.57), we prove that the representation
formula (5.56) holds. t
u
Proof.
First Step. The goal is to check that we can write u as in Definition 5.43, with Œıu=ım being
equal to the version of the potential constructed in Proposition 5.50.
We thus consider two square-integrable random variables X and Y together with U a third
random variable assumed to be uniform on the segment Œ0; 1 and independent of .X; Y/.
As usual, we work on an atomless Polish probability space .˝; F ; P/. We also denote by
˚ (resp. ') the cumulative distribution (resp. density) function of the univariate standard
normal distribution N.0; 1/. Given two parameters > 0 and t 2 .0; 1/, we let:
t U t U
t; D ˚ Y C 1˚ X:
This definition was chosen so that, as tends to 0, t; tends almost surely toward:
d t; 1 t U
D ' Y X :
dt
Second Step. Thanks to Proposition 5.50, we may consider, for any 2 P2 .Rd /, a potential
p W Rd ! R of @ u././ provided we carefully choose the version of @ u././. Actually,
since the L-derivative of u is assumed to have a version jointly continuous in the space and
measure variables, this jointly continuous version must coincide with the version constructed
in the proof of Proposition 5.50. When has Rd as support, @ u././ admits a unique
continuous version on the entire Rd . When the support of is a strict subset of Rd ,
uniqueness of the continuous version holds true on the support of only. However, the
density argument used in Proposition 5.50 to construct the continuous version shows that, in
that case as well, it coincides with the jointly continuous version of the L-derivative.
Choosing the version of the potential that vanishes at 0, we have the representation
formula:
Z 1
p .x/ D @ u./.tx/ xdt; .x; / 2 Rd P2 .Rd /;
0
which proves that p is jointly continuous in .x; /. Returning to the conclusion of the first
step, we observe that:
t t 1 t
@ u L. t; / ˚ Y C 1˚ X ' Y X
h t t i
D @ pL. / ˚
t;
Y C 1˚ X :
From the proof of Proposition 5.50, we know that @ u./.x/ is at most of linear growth
in x, uniformly in in bounded sets. Therefore, p .x/ is at most of quadratic growth in x,
uniformly in in bounded sets. By a uniform integrability argument, we can exchange the
limit and the integral. We get:
424 5 Spaces of Measures and Related Differential Calculus
Z 1 h i
u L.Y/ u L.X/ D E ptL.Y/C.1t/L.X/ .X/ ptL.Y/C.1t/L.X/ .Y/ dt
0
Z 1 Z
D ptL.Y/C.1t/L.X/ .y/ L.Y/ L.X/ .dy/ dt;
0 Rd
which proves that, for all 2 P2 .Rd /, p ./ D Œıu=ım././ up to an additive constant
depending on .
Third Step. The last two claims in the statement are easily proven. The fundamental
relationship is a straightforward consequence of the fact that p ./ D Œıu=ım././, while
the last claim follows from the fact that the assumptions of Proposition 5.44 are satisfied, the
linear growth property of @ u being shown as in the proof of Proposition 5.50. t
u
Displacements in P2 .Rd /
Our earlier discussion of optimal transportation, recall for example Subsection 5.1.3,
took place in a static framework. Our goal is now to give a dynamic flavor to this
theory. For that purpose, we need a differential geometric structure on the space
of probability measures, and its introduction requires the notion of differentiable
curves joining elements of this space. So, given two probability measures and
in P2 .Rd / for which we can find an optimal transport map from to , we would
like to construct natural paths, think for example of the graphs of functions from
Œ0; 1 to P2 .Rd /, that would go from to .
A straightforward though rather naive solution would be to use the probability
measures t defined by t D .1 t/ C t for 0 6 t 6 1. This is natural indeed as
P2 .Rd / can be viewed as embedded in the linear space M.Rd / of signed measures
on Rd equipped with its vector space structure. However this natural guess ignores
the metric structure of P2 .Rd / provided by the Wasserstein distance W2 , and offers
no insight in the transport of 0 D into 1 D .
Inspired by the lifting procedure used in the construction of L-derivatives, one
can also search for a path from Œ0; 1 into L2 .˝; F; PI Rd / for an atomless Polish
probability space .˝; F; P/, which would go from a random variable X with
distribution to .X/ which has as distribution if is indeed a transport map
from to . Taking advantage of the flat nature of the linear space L2 , it is tempting
to consider:
Xt D .1 t/X C t .X/ D X C t .X/ X ; t 2 Œ0; 1;
5.4 Comparisons with Other Notions of Differentiability 425
and use the path D .t /06t6T in P2 .Rd / given by t D L.Xt /. Notice that in this
case:
While it is unclear at this stage that this situation is generic, the above example
shows that, in order to go from X to .X/ with an optimal transport map, one can
simply follow the flow of an Ordinary Differential Equation (ODE) in a linear vector
space. As we shall see next, it can be proved that the path .L.Xt //06t61 is in fact a
geodesic for the 2-Wasserstein distance.
This is indeed a key idea in the theory of optimal transportation according to
which probability measures are transported along the flow of an ODE induced by
some possibly time-dependent vector field:
which should be compared with (5.60). As a result, integration by parts implies that
the dynamics of .t /06t61 are given by the first-order Fokker-Planck equation:
@t t C div b.t; /t D 0; t 2 Œ0; 1; (5.63)
understood in the sense of distributions. Notice that there are plenty more vector
fields transporting 0 to 1 in this way. Indeed, if we add a divergence free (for t )
vector field w.t; / to b.t; /, that is a vector field satisfying:
div w.t; /t D 0; (5.64)
in the sense of distributions, then equation (5.63) remains unchanged. Recall that
this equation needs to be understood in the sense that:
Z
8 2 Cc1 .Rd /; w.t; x/ r .x/dt .x/ D 0;
Rd
where Cc1 .Rd / is the space of real valued smooth functions with compact support
in Rd .
Remark 5.52 The above argument, though informal, will make up half of the proof
of an important result of Benamou and Brenier proven below as Theorem 5.53.
XP t D ˛t ;
feedback form, i.e., ˛t D b.t; Xt /, and we understand that the divergence free
component of b.t; / has no impact on the dynamics of (5.63). Therefore, we may
restrict the search for b.t; / to the subspace orthogonal to the divergence free
vector fields for t , namely the closure of the gradients in L2 .Rd ; t I Rd /, where
t D L.Xt /. Part of the difficulty is the fact that this space is not known since the
measure t is unknown. The following result, which is often referred to as Benamou
and Brenier’s theorem, addresses this quandary.
Theorem 5.53 For any ; 2 P2 .Rd /, the 2-Wasserstein distance between and
satisfies:
the infimum being taken over the set A.; / of pairs .; b/ 2 C.Œ0; 1I P2 .Rd //
L2 .Œ0; 1 Rd ; t .dx/dtI Rd /, where 0 D , 1 D , and:
@t t C div b.t; /t D 0;
Remark 5.54 The above version of Benamou and Brenier’s theorem is due to
Ambrosio, Gigli, and Savaré, but the lines of the proof below are inspired from
Villani’s monograph; in this latter reference, and are required to be absolutely
continuous and compactly supported. The reader is referred to the Notes &
Complements at the end of the chapter for precise citations.
Proof. We provide a sketch of proof only in the case when and are absolutely continuous.
For a regular enough vector field b (for example locally bounded in .t; x/ and Lipschitz in x,
uniformly in t), and for each t 2 Œ0; 1, let us define t D ıXt ./1 where .Xt .x//06t61; x2Rd
is the flow associated with the vector field b. Then, .; b/ 2 A.; /. Using successively,
the definition of .t /06t61 , the definition of the solution .Xt /06t61 , Fubini’s theorem, Hölder
inequality, the definition of the set A.; /, and finally the definition of W2 .; /, we get:
Z 1 Z Z 1 Z
jb.t; x/j2 t .dx/dt D jb.t; Xt .x//j2 d.x/dt
0 Rd 0 Rd
Z 1 Z
D jXP t .x/j2 d.x/dt
0 Rd
Z Z 1
D jXP t .x/j2 dt d.x/
Rd 0
428 5 Spaces of Measures and Related Differential Calculus
Z ˇZ 1 ˇ2
ˇ ˇ
> ˇ XP t .x/dtˇ d.x/
Rd 0
Z
ˇ ˇ
D ˇX1 .x/ xˇ2 d.x/ > W2 .; /2 :
Rd
Since the right-hand side is independent of .; b/ 2 A.; /, we can take the infimum
of the left-hand side over A.; / and still preserve the inequality. In order to prove that
W2 .; /2 is not greater than the right-hand side of (5.65), we need to consider general
.; b/ 2 A.; /, without assuming that b is Lipschitz in space and make sure that the
above inequality still holds. This can be done by a mollifying argument and controlling the
limits when removing the mollification. The details are rather involved and we shall not give
them here. Details can be found in the references given in the Notes & Complements at the
end of the chapter.
We now prove the reverse inequality. For that, we assume that is absolutely continuous
and we use the Brenier map '. So ı .r'/1 D and:
Z
W2 .; /2 D jr'.x/ xj2 .dx/:
Rd
Piggybacking on the informal discussion of the beginning of the section, for t 2 Œ0; 1 and
x 2 Rd we set:
1t 2
't .x/ D jxj C t'.x/; and Xt .x/ D r't .x/ D .1 t/x C tr'.x/;
2
the second definition making sense at points x where ' is well defined (which is true almost
everywhere under the Lebesgue measure Lebd on Rd ). Since is absolutely continuous with
respect to Lebd , we can define the flow by t D ı Xt ./1 , for t 2 Œ0; 1. From the
definition of .Xt .x//06t61 we get:
Our goal is to find a vector field b for which b.t; / is defined t -almost surely and in such a
way that the above right-hand side can be rewritten as b.t; Xt .x// for 0 -almost every x 2 Rd .
Indeed, for such a vector field, we have:
Z 1 Z Z 1 Z
jb.t; x/j2 t .dx/dt D jb.t; Xt .x//j2 0 .dx/dt
0 Rd 0 Rd
D W2 .; /2 ;
where we used the definition of the flow D .t /06t61 and the definition of b. This shows
that .; b/ is optimal. The existence of b follows from Remark 5.21 following Brenier’s
Theorem 5.20 by setting:
b.t; x/ D r' r't .x/ r't .x/; t 2 Œ0; 1; t almost every x 2 Rd ;
5.4 Comparisons with Other Notions of Differentiability 429
where 't is the convex conjugate of 't . When t D 1, '1 D ' and Remark 5.21 guarantees
that r'.r' .y// D y for -almost every y 2 Rd and r' .r'.x// D x for almost every
x 2 Rd . When t 2 Œ0; 1/, 't is strongly convex, which shows that t D ır't1 is absolutely
continuous. Proceeding as in Remark 5.21, we deduce that r't .r't .y// D y for t -almost
every y 2 Rd and r't .r't .x// D x for -almost every x 2 Rd . t
u
It is called the McCann’s interpolation between and . Observe also that b.t; /
may be rewritten:
b.t; x/ D r' r't .x/ r't .x/
1 1t
D r't I r't .x/ r't .x/
t t
1
D x r't .x/ ;
t
for t almost every x 2 Rd and for t 2 .0; 1.
the right-hand side denoting the closure of the set of smooth gradients in the space
L2 .Rd ; t I Rd /.
.dx; dy/ D .dx; y/t .dy/; and .dy; dz/ D t .dy/ .y; dz/:
and use the notations 1;2 , 1;3 and 2;3 for the projections defined by 1;2 .x; y; z/ D .x; y/,
1;3 .x; y; z/ D .x; z/, and 2;3 .x; y; z/ D .y; z/. By construction, we have ı . 1;2 /1 D ,
and ı . 2;3 /1 D . Moreover, if we define 2 P2 .Rd Rd / by D ı . 1;3 /1 ,
then 2 ˘.0 ; 1 /. We show that in fact, is an optimal transport plan in the sense that
opt
2 ˘2 .0 ; 1 /. Indeed:
Z 1=2
W2 .0 ; 1 / 6 jx zj2 .dx; dz/
.Rd /2
Z 1=2
D jx zj2 .dx; dy; dz/
.Rd /3
Z 1=2 Z 1=2
6 jx yj2 .dx; dy; dz/ C jy zj2 .dx; dy; dz/
.Rd /3 .Rd /3
Z Z 1=2
1=2
D jx yj2 .dx; dy/ C jy zj2 .dy; dz/
.Rd /2 .Rd /2
D W2 .0 ; 1 /;
so that all the above inequalities are in fact equalities. In particular, since the norm of
L2 ..Rd /3 ; I Rd / is strictly convex, this implies that there exists ˛ > 0 such that:
Using the factR that W2 .0 ; t / D tW2 .0 ; 1 /, we conclude that ˛ D t. Defining the function
N by N .y/ D Rd x .dx; y/ for t -a.e. y 2 Rd , and integrating both sides of (5.67) with respect
to the probability measure .dx; y/ we get:
1 1t
y 7! 't .y/ D y N .y/
t t
5.4 Comparisons with Other Notions of Differentiability 431
is a transport map giving the transport plan . Since N depends only upon , and and
were chosen independently of each other, this shows that the transport plan is unique. This
opt opt
concludes the proof for ˘2 .t ; 1 /. We reach the same conclusion for ˘2 .t ; 0 / by
exchanging the roles of 0 and 1 through a simple time reversal t 7! 1 t. t
u
The following simple result provides a large class of constant speed geodesics.
In particular, it justifies the claim made in Remark 5.56 about the McCann
interpolation.
Remark 5.60 As announced in Remark 5.56, the above proposition says that, when
1 is given by 1 D 0 ı .r'/1 , where ' is the Brenier map, the McCann’s
interpolation between 0 and 1 , as constructed in the proof of Benamou-Brenier’s
theorem, is a geodesic path of constant speed between 0 and 1 .
6 .t s/W2 .0 ; 1 / :
In fact, this inequality is an equality because, if there were a couple .s; t/ with s 6 t for which
this inequality was strict, we would have:
we observe that, as the perturbation in (5.60) goes to zero, namely as and get
closer and closer to each other, the direction used to go from to converges to r .
Importantly, in (5.60) may be any compactly supported smooth function from Rd
to R so that the admissible direction r used for transporting locally may be the
gradient of any compactly supported smooth function from Rd to R. More generally,
we can check that the function Rd 3 x 7! r ..I C r /1 .x// which provides the
vector field in (5.60) is always a gradient provided that is small enough, which
shows that, at any time t 2 Œ0; 1, the law of Xt in (5.60) indeed moves along a
gradient vector field. This can be proven by adapting the duality argument used in
Proposition 5.13. If we set:
˚
.y/ D inf jx yj2 C 2 .x/ ;
x2Rd
and if we mimic the computations of Proposition 5.13, we see that for any y 2 Rd :
ˇ 1 ˇ2
.y/ D ˇy I C r .y/ˇ C 2 .I C r /1 .y/ :
Next we notice that the range of I C r is the whole space, so that by expanding
.I C r /, we get for all y 2 Rd :
1 1
r" .y/ D 2y 2 I C r .y/ D 2r I C r .y/ ;
˚ L2 .Rd ;IRd /
Tan P2 .Rd / D r'I ' 2 Cc1 .Rd I R/ :
Since gradients are orthogonal to divergence free fields, Tan .P2 .Rd // is also
equal to:
n
Tan P2 .Rd / D v 2 L2 .Rd ; I Rd / W
Z o
2
8w 2 L .R ; I R / with div.w/ D 0;
d d
v w d D 0 :
Rd
As a first step in our search for connections between L-derivatives and the
Wasserstein geometry advocated in this section, we have:
5.4 Comparisons with Other Notions of Differentiability 433
Proof.
First Step. We start with the case when the Fréchet derivative of the lifting of $u$ is uniformly Lipschitz. Then, for a given $\mu \in \mathcal{P}_2(\mathbb{R}^d)$, we know from Proposition 5.50 that there exists a continuously differentiable function $p_\mu : \mathbb{R}^d \to \mathbb{R}$, with a Lipschitz continuous gradient $\nabla p_\mu$, such that, $\mu$-almost everywhere:
\[
\partial_\mu u(\mu)(\cdot) = \nabla p_\mu.
\]
The Lipschitz property of the L-derivative implies that we can find a constant $C$ such that, for any $x, x' \in \mathbb{R}^d$,
\[
\bigl|\nabla p_\mu(x) - \nabla p_\mu(x')\bigr| \le C\,|x - x'|,
\]
and
\[
\bigl|\nabla p_\mu(x)\bigr| \le C\bigl(1 + |x|\bigr), \qquad \bigl|p_\mu(x)\bigr| \le C\bigl(1 + |x|^2\bigr).
\]
For a sequence $(\rho_n)_{n\ge1}$ of $\mathcal{C}^\infty$ mollifiers with supports included in a fixed compact set and converging to $\delta_0$, we set:
\[
p_n(x) = \bigl(p_\mu \ast \rho_n\bigr)(x), \qquad x \in \mathbb{R}^d.
\]
Since $\nabla p_\mu$ is continuous, $\nabla p_n$ converges to $\nabla p_\mu$, uniformly on compact subsets. The convergence also holds in $L^2(\mathbb{R}^d,\mu;\mathbb{R}^d)$ since, up to a possible modification of the constant $C$:
\[
\bigl|\nabla p_n(x)\bigr| \le C\bigl(1 + |x|\bigr), \qquad x \in \mathbb{R}^d.
\]
Since $\zeta_n - 1$ converges to $0$ pointwise, the same domination argument as above shows that $(\zeta_n - 1)\nabla p_n$ converges to $0$ in $L^2(\mathbb{R}^d,\mu)$. In order to handle the second term in the right-hand side, we observe that $\nabla\zeta_n$ converges to $0$ pointwise and that:
\[
\bigl|p_n\,\nabla\zeta_n\bigr| \le \frac{C}{n}\bigl(1 + |x|^2\bigr)\mathbf{1}_{\{n \le |x| \le 2n\}} \le C\bigl(1 + |x|\bigr).
\]
Indeed, we shall prove in Lemma 5.94 below that for any smooth function $\phi : \mathbb{R}^d \to \mathbb{R}^d$ with compact support, the function $\mathcal{P}_2(\mathbb{R}^d) \ni \mu \mapsto u(\mu \circ \phi^{-1}) \in \mathbb{R}$ is bounded and continuously L-differentiable. This shows that $\partial_\mu u \in \mathrm{Tan}_\mu(\mathcal{P}_2(\mathbb{R}^d))$ provided that $\partial_\mu u_n \in \mathrm{Tan}_\mu(\mathcal{P}_2(\mathbb{R}^d))$ for all $n \ge 1$. Put differently, we can assume that the lifting $\tilde u$ of $u$ is bounded and continuously Fréchet differentiable on $L^2(\Omega,\mathcal{F},\mathbb{P};\mathbb{R}^d)$ and that its Fréchet derivative is bounded (in $L^2(\Omega,\mathcal{F},\mathbb{P};\mathbb{R}^d)$).
For any $0 < \delta < \epsilon$, we call $\tilde u^{\epsilon,\delta}$ the sup-inf convolution of $\tilde u$ with parameters $(\epsilon,\delta)$:
\[
\tilde u^{\epsilon,\delta}(X) = \sup_{Z \in H}\,\inf_{Y \in H}\,\Bigl[\tilde u(Y) + \frac{1}{2\epsilon}\mathbb{E}\bigl[|Y-Z|^2\bigr] - \frac{1}{2\delta}\mathbb{E}\bigl[|Z-X|^2\bigr]\Bigr].
\]
Since $\tilde u$ is bounded,
\[
\sup_{Z\in H}\,\inf_{Y\in H}\,\Bigl[\tilde u(Y) + \frac{1}{2\epsilon}\mathbb{E}\bigl[|Y-Z|^2\bigr] - \frac{1}{2\delta}\mathbb{E}\bigl[|Z-X|^2\bigr]\Bigr] \le \sup_{Z\in H}\Bigl[C - \frac{1}{2\delta}\mathbb{E}\bigl[|Z-X|^2\bigr]\Bigr],
\]
which shows that the maximization over $Z$ may be restricted to those $Z$ such that $\|Z - X\|_2^2 \le C\delta$, the value of $C$ being allowed to increase from line to line. Therefore, the minimization over $Y$ may be restricted to those $Y$ such that $\|Z - Y\|_2^2 \le C\epsilon$, that is $\|Y - X\|_2^2 \le C\epsilon$ for a new value of $C$.
Now, for another $W \in H$, we have:
\[
\begin{aligned}
\tilde u^{\epsilon,\delta}(X+W) &= \sup_{Z\in H}\,\inf_{Y\in H}\,\Bigl[\tilde u(Y) + \frac{1}{2\epsilon}\mathbb{E}\bigl[|Y-Z|^2\bigr] - \frac{1}{2\delta}\mathbb{E}\bigl[|Z-(X+W)|^2\bigr]\Bigr]\\
&= \sup_{Z\in H}\,\inf_{Y\in H}\,\Bigl[\tilde u(Y) + \frac{1}{2\epsilon}\mathbb{E}\bigl[|Y-W-Z|^2\bigr] - \frac{1}{2\delta}\mathbb{E}\bigl[|Z-X|^2\bigr]\Bigr] \qquad (5.69)\\
&= \sup_{Z\in H}\,\inf_{Y\in H}\,\Bigl[\tilde u(Y+W) + \frac{1}{2\epsilon}\mathbb{E}\bigl[|Y-Z|^2\bigr] - \frac{1}{2\delta}\mathbb{E}\bigl[|Z-X|^2\bigr]\Bigr].
\end{aligned}
\]
Recalling that $\tilde u^{\epsilon,\delta}$ has a Lipschitz continuous Fréchet derivative, we deduce that, for all $r > 0$,
\[
\bigl\|D\tilde u^{\epsilon,\delta}(X) - D\tilde u(X)\bigr\|_2 \le \varepsilon_X\bigl(C\epsilon + r\bigr) + C_{\epsilon,\delta}\,r,
\]
where $\varepsilon_X(\cdot)$ is a modulus of continuity of $D\tilde u$ around $X$ and the constant $C_{\epsilon,\delta}$ depends upon $\epsilon$ and $\delta$. Letting first $r$ tend to $0$ and then $\epsilon$ to $0$, we deduce that $\|D\tilde u^{\epsilon,\delta}(X) - D\tilde u(X)\|_2$ tends to $0$ as $\epsilon$ tends to $0$.
Therefore, in order to complete the proof, it suffices to show that $D\tilde u^{\epsilon,\delta}(X)$ may be represented in the form $\eta^{\epsilon,\delta}(X)$ for some $\eta^{\epsilon,\delta} \in \mathrm{Tan}_\mu(\mathcal{P}_2(\mathbb{R}^d))$. To do so, it suffices to observe that:
\[
\tilde u^{\epsilon,\delta}(X) = \sup_{\varrho}\,\inf_{\pi}\,\Bigl[u(\nu) + \frac{1}{2\epsilon}\int_{(\mathbb{R}^d)^2}|y-z|^2\,d\pi(y,z) - \frac{1}{2\delta}\int_{(\mathbb{R}^d)^2}|x-z|^2\,d\varrho(x,z)\Bigr],
\]
the infimum being taken over the probability measures $\pi \in \mathcal{P}_2((\mathbb{R}^d)^2)$, the argument $\nu$ in $u$ standing for the first marginal of $\pi$ on $\mathbb{R}^d$, and the supremum being taken over the probability measures $\varrho \in \mathcal{P}_2((\mathbb{R}^d)^2)$ with $\mu = \mathcal{L}(X)$ as first marginal on $\mathbb{R}^d$ and with the same second marginal on $\mathbb{R}^d$ as $\pi$. Obviously, the right-hand side only depends on $\mu = \mathcal{L}(X)$, which shows that $\tilde u^{\epsilon,\delta}$ may be projected as a function $u^{\epsilon,\delta}$ on $\mathcal{P}_2(\mathbb{R}^d)$. By the first step of the proof, $\partial_\mu u^{\epsilon,\delta}(\mu)(\cdot) \in \mathrm{Tan}_\mu(\mathcal{P}_2(\mathbb{R}^d))$. $\square$
Elements in $\mathrm{Tan}_\mu(\mathcal{P}_2(\mathbb{R}^d))$ are identified when equal up to a $\mu$-null Borel set.
Proof. Let us assume that $\eta^- \in \partial^- u(\mu)$ and $\eta^+ \in \partial^+ u(\mu)$. The proof is based on the very same argument we used in our introductory discussion of displacements when we suggested that optimal transport of measures was taking place along gradients. In the definition of the sub-differential, choose $\mu' = \mu \circ (I + \epsilon\nabla\psi)^{-1}$, where $\psi$ is a smooth function with compact support from $\mathbb{R}^d$ into $\mathbb{R}$ and $\epsilon \in \mathbb{R}$. For $|\epsilon|$ small enough, $I + \epsilon\nabla\psi$ is the gradient of a strictly convex function, namely $\mathbb{R}^d \ni x \mapsto (1/2)|x|^2 + \epsilon\psi(x)$. By Proposition 5.13, we know that the map $\mathbb{R}^d \ni x \mapsto x + \epsilon\nabla\psi(x) \in \mathbb{R}^d$ is an optimal transport map from $\mu$ to $\mu'$, and that it defines the unique optimal transport plan from $\mu$ to $\mu'$. Therefore, by definition of the sub- and super-differentials, we have that:
\[
u(\mu') \ge u(\mu) + \epsilon\int_{\mathbb{R}^d}\eta^-(x)\cdot\nabla\psi(x)\,d\mu(x) + o(\epsilon),
\]
and
\[
u(\mu') \le u(\mu) + \epsilon\int_{\mathbb{R}^d}\eta^+(x)\cdot\nabla\psi(x)\,d\mu(x) + o(\epsilon),
\]
from which we deduce, by letting $\epsilon$ tend to $0$ (on both sides $0^+$ and $0^-$), that:
\[
\int_{\mathbb{R}^d}\bigl(\eta^+(x) - \eta^-(x)\bigr)\cdot\nabla\psi(x)\,d\mu(x) = 0.
\]
The above is true for any smooth function $\psi$ from $\mathbb{R}^d$ to $\mathbb{R}$ with a compact support. Since $\eta^+ - \eta^- \in \mathrm{Tan}_\mu(\mathcal{P}_2(\mathbb{R}^d))$, and the latter is equal to the closure in $L^2(\mathbb{R}^d,\mu;\mathbb{R}^d)$ of the gradients of smooth functions with compact supports, we conclude that $\eta^+ = \eta^-$ almost everywhere under $\mu$. $\square$
On an atomless Polish probability space $(\Omega,\mathcal{F},\mathbb{P})$, we consider two $\mathbb{R}^d$-valued random variables $X$ and $X'$ such that the pair $(X,X')$ has $\pi$ as joint distribution. By definition of the L-derivative, we have:
\[
\begin{aligned}
u(\mu') &= u(\mu) + \int_0^1\mathbb{E}\Bigl[\partial_\mu u\bigl(\mathcal{L}((1-t)X + tX')\bigr)\bigl((1-t)X + tX'\bigr)\cdot\bigl(X' - X\bigr)\Bigr]\,dt\\
&= u(\mu) + \mathbb{E}\bigl[\partial_\mu u\bigl(\mathcal{L}(X)\bigr)(X)\cdot\bigl(X' - X\bigr)\bigr]\\
&\quad + \int_0^1\mathbb{E}\Bigl[\Bigl(\partial_\mu u\bigl(\mathcal{L}((1-t)X + tX')\bigr)\bigl((1-t)X + tX'\bigr) - \partial_\mu u\bigl(\mathcal{L}(X)\bigr)(X)\Bigr)\cdot\bigl(X' - X\bigr)\Bigr]\,dt.
\end{aligned}
\]
Following the discussion right after Remark 5.26 based on the application of Lemma 5.30, we can prove that:
\[
\Bigl|\int_0^1\mathbb{E}\Bigl[\Bigl(\partial_\mu u\bigl(\mathcal{L}((1-t)X + tX')\bigr)\bigl((1-t)X + tX'\bigr) - \partial_\mu u\bigl(\mathcal{L}(X)\bigr)(X)\Bigr)\cdot\bigl(X' - X\bigr)\Bigr]\,dt\Bigr|
\le \Bigl(\int_{(\mathbb{R}^d)^2}|x-y|^2\,d\pi(x,y)\Bigr)^{1/2}\varepsilon\Bigl(\Bigl(\int_{(\mathbb{R}^d)^2}|x-y|^2\,d\pi(x,y)\Bigr)^{1/2}\Bigr),
\]
where $\varepsilon : \mathbb{R}_+ \to \mathbb{R}_+$ satisfies $\lim_{r\searrow0}\varepsilon(r) = 0$. Since the coupling $\pi$ is optimal, it holds $\int_{(\mathbb{R}^d)^2}|x-y|^2\,d\pi(x,y) = W_2(\mu,\mu')^2$, so that:
\[
u(\mu') \ge u(\mu) + \int_{(\mathbb{R}^d)^2}\partial_\mu u(\mu)(x)\cdot(y-x)\,d\pi(x,y) - W_2(\mu,\mu')\,\varepsilon\bigl(W_2(\mu,\mu')\bigr).
\]
Therefore, taking the supremum over the optimal plans $\pi \in \Pi_2^{\mathrm{opt}}(\mu,\mu')$, we deduce that:
\[
u(\mu') \ge u(\mu) + \sup_{\pi\in\Pi_2^{\mathrm{opt}}(\mu,\mu')}\int_{(\mathbb{R}^d)^2}\partial_\mu u(\mu)(x)\cdot(y-x)\,d\pi(x,y) - W_2(\mu,\mu')\,\varepsilon\bigl(W_2(\mu,\mu')\bigr).
\]
Lemma 5.61 asserts that $\partial_\mu u(\mu) \in \mathrm{Tan}_\mu(\mathcal{P}_2(\mathbb{R}^d))$. This proves that $\partial_\mu u(\mu) \in \partial^- u(\mu)$. We then prove that $\partial_\mu u(\mu) \in \partial^+ u(\mu)$ in the same way. $\square$
Mean field games with finite state spaces have been studied sporadically, their
relevance coming from the importance of some of their applications. Our intro-
ductory Chapter 1 contains a couple of such examples. In these models, marginal
distributions have a fixed finite support, and functions of these distributions appear
as functions on a finite dimensional simplex. As before, when these functions are
assumed to be restrictions of smooth functions on the ambient Euclidean space, a
natural notion of differentiability can be used. However, this notion differs from the
notion of L-differentiability, and we propose to highlight the differences.
from which we conclude that the derivation in the space of measures (in the rough sense of (5.52)) should coincide with the usual derivative of the function $\mathbb{R}^d \ni (m_1,\dots,m_d) \mapsto u\bigl(\sum_{i=1}^d m_i\delta_{e_i}\bigr)$, with the identity:
\[
\frac{\partial}{\partial m_k}\Bigl[u\Bigl(\sum_{i=1}^d m_i\delta_{e_i}\Bigr)\Bigr] = \frac{\delta u}{\delta m}\Bigl(\sum_{i=1}^d m_i\delta_{e_i}\Bigr)(e_k),
\]
that should hold true for all $k \in \{1,\dots,d\}$ and $(m_1,\dots,m_d) \in \mathbb{R}^d$. Importantly, this formula is for functions of measures and not only for functions of probability measures. In particular, $(m_1,\dots,m_d)$ lives in the whole space $\mathbb{R}^d$ and not only in the simplex $S_d$. We shall prove in Proposition 5.66 below that the formula takes a somewhat different form when the domain of definition of $u$ is restricted to the subset of probability measures.
The above formula says that we can view the $d$ values taken by the function $[\delta u/\delta m]\bigl(\sum_{i=1}^d m_i\delta_{e_i}\bigr)$ on $E$ as the $d$ components of the gradient of the function $u$ when viewed as a function on $\mathbb{R}^d$. Clearly, the present discussion is rather informal. Indeed, implicit continuity assumptions were used to derive the above identity from (5.70).
\[
\frac12\,|p - p'| \;\le\; W_2(p,p') \;\le\; \sqrt{d}\,|p - p'|^{1/2},
\]
where
\[
|p - p'|^2 = \sum_{i=1}^d|p_i - p_i'|^2,
\]
while, for two random variables $X$ and $X'$ (constructed on some probability space $(\Omega,\mathcal{F},\mathbb{P})$) with values in $E$ and with $p$ and $p'$ as respective distributions, we have (using the fact that $e_1,\dots,e_d$ are chosen as the vectors of the canonical basis of $\mathbb{R}^d$):
\[
\mathbb{E}\bigl[|X - X'|^2\bigr] = 2\,\mathbb{P}\bigl[X \ne X'\bigr] = 2 - 2\,\mathbb{P}\bigl[X = X'\bigr]
\ge 2 - 2\sum_{i=1}^d\mathbb{P}\bigl[X = e_i\bigr]^{1/2}\,\mathbb{P}\bigl[X' = e_i\bigr]^{1/2}
= \sum_{i=1}^d\bigl(p_i^{1/2} - (p_i')^{1/2}\bigr)^2.
\]
Since $\bigl(p_i^{1/2} + (p_i')^{1/2}\bigr)^2 \le 4$ for each $i$, so that $|p_i - p_i'|^2 \le 4\bigl(p_i^{1/2} - (p_i')^{1/2}\bigr)^2$, and the bound holds for every coupling, we get:
\[
W_2\bigl(p,p'\bigr)^2 \ge \frac14\,|p - p'|^2.
\]
Conversely, for $p, p' \in S_d$, we can construct two random variables $X$ and $X'$ (on the same probability space as above), with $p$ and $p'$ as respective distributions, such that $\mathbb{P}[X = X' = e_i] = \min(p_i,p_i')$ for all $i \in \{1,\dots,d\}$. We then have:
\[
\begin{aligned}
W_2\bigl(p,p'\bigr)^2 &\le \mathbb{E}\bigl[|X - X'|^2\bigr] = 2\,\mathbb{P}\bigl[X \ne X'\bigr] \le 2\Bigl(1 - \sum_{i=1}^d\min(p_i,p_i')\Bigr)\\
&= 2\Bigl(1 + \frac12\sum_{i=1}^d\bigl(|p_i - p_i'| - (p_i + p_i')\bigr)\Bigr) = \sum_{i=1}^d|p_i - p_i'| \le \sqrt{d}\,|p - p'|.
\end{aligned}
\]
\[
S_{d-1,\le} \ni (p_1,\dots,p_{d-1}) \mapsto u\Bigl(\sum_{i=1}^d p_i\delta_{e_i}\Bigr) \quad\text{with}\quad p_d = 1 - \sum_{i=1}^{d-1}p_i,
\]
\[
u\Bigl(\sum_{i=1}^d p_i'\delta_{e_i}\Bigr) - u\Bigl(\sum_{i=1}^d p_i\delta_{e_i}\Bigr)
= \sum_{i=1}^d\int_0^1\frac{\delta u}{\delta m}\Bigl(\sum_{j=1}^d\bigl(tp_j' + (1-t)p_j\bigr)\delta_{e_j}\Bigr)(e_i)\,dt\;\bigl(p_i' - p_i\bigr),
\]
\[
\begin{aligned}
u\Bigl(\sum_{i=1}^d p_i'\delta_{e_i}\Bigr) - u\Bigl(\sum_{i=1}^d p_i\delta_{e_i}\Bigr)
&= \sum_{i=1}^{d-1}\int_0^1\Bigl[\frac{\delta u}{\delta m}\Bigl(\sum_{j=1}^d\bigl(tp_j' + (1-t)p_j\bigr)\delta_{e_j}\Bigr)(e_i)\\
&\hspace{4em} - \frac{\delta u}{\delta m}\Bigl(\sum_{j=1}^d\bigl(tp_j' + (1-t)p_j\bigr)\delta_{e_j}\Bigr)(e_d)\Bigr]\,dt\;\bigl(p_i' - p_i\bigr) \hspace{4em} (5.71)\\
&= \sum_{i=1}^{d-1}\Bigl[\frac{\delta u}{\delta m}\Bigl(\sum_{j=1}^d p_j\delta_{e_j}\Bigr)(e_i) - \frac{\delta u}{\delta m}\Bigl(\sum_{j=1}^d p_j\delta_{e_j}\Bigr)(e_d)\Bigr]\bigl(p_i' - p_i\bigr) + o\bigl(|p' - p|\bigr),
\end{aligned}
\]
the last equality following from the continuity of $[\delta u/\delta m]$ on $\mathcal{P}_2(\mathbb{R}^d)\times\mathbb{R}^d$ assumed as part of the definition of the existence of a linear functional derivative. Notice that we also used the fact that the set $\{\sum_{i=1}^d p_i\delta_{e_i}\,;\ p = (p_1,\dots,p_d) \in S_d\}$ is a compact subset of $\mathcal{P}_2(\mathbb{R}^d)$, which guarantees that for any $i \in \{1,\dots,d\}$, the function:
\[
S_d \ni p \mapsto \frac{\delta u}{\delta m}\Bigl(\sum_{j=1}^d p_j\delta_{e_j}\Bigr)(e_i)
\]
is uniformly continuous with respect to the Wasserstein distance, and hence the Euclidean distance because of Lemma 5.65. $\square$
\[
u\Bigl(\sum_{i=1}^d p_i\delta_{e_i}\Bigr) = \bar u\bigl(p_1,\dots,p_d\bigr).
\]
\[
\frac{\delta u}{\delta m}\Bigl(\sum_{j=1}^d p_j\delta_{e_j}\Bigr)(e_i) - \frac{\delta u}{\delta m}\Bigl(\sum_{j=1}^d p_j\delta_{e_j}\Bigr)(e_d)
= \frac{\partial\bar u}{\partial p_i}\bigl(p_1,\dots,p_d\bigr) - \frac{\partial\bar u}{\partial p_d}\bigl(p_1,\dots,p_d\bigr).
\]
\[
\begin{aligned}
u\Bigl(\sum_{i=1}^d p_i'\delta_{e_i}\Bigr) - u\Bigl(\sum_{i=1}^d p_i\delta_{e_i}\Bigr)
&= \sum_{i=1}^d\frac{\partial\bar u}{\partial p_i}\bigl(p_1,\dots,p_d\bigr)\bigl(p_i' - p_i\bigr) + o\bigl(|p' - p|\bigr)\\
&= \sum_{i=1}^{d-1}\Bigl[\frac{\partial\bar u}{\partial p_i}\bigl(p_1,\dots,p_d\bigr) - \frac{\partial\bar u}{\partial p_d}\bigl(p_1,\dots,p_d\bigr)\Bigr]\bigl(p_i' - p_i\bigr) + o\bigl(|p' - p|\bigr),
\end{aligned}
\]
where we used the fact that $\sum_{i=1}^d(p_i' - p_i) = 0$ in order to derive the last equality. Identifying with the expansion (5.71), we complete the proof. $\square$
Remark 5.68 Observe that, in contrast with our preliminary discussion in (5.70) for functions of signed measures, the statement of Corollary 5.67 identifies the vector $\bigl([\delta u/\delta m](\sum_{j=1}^d p_j\delta_{e_j})(e_i)\bigr)_{i\in\{1,\dots,d\}}$ with $\bigl((\partial\bar u/\partial p_i)(p_1,\dots,p_d)\bigr)_{i\in\{1,\dots,d\}}$ up to an additive constant. This is consistent with our previous observation of the fact that $\delta u/\delta m$ is uniquely defined up to an additive constant when $u$ is only defined on the subset of probability measures. Recall Remark 5.46.
\[
X_1 = \mathbf{1}_{\{U_1 \le p_1\}}, \qquad
X_2 = \mathbf{1}_{\{U_1 > p_1,\,U_2 \le p_2/(1-p_1)\}}, \qquad \dots, \qquad
X_d = \mathbf{1}_{\{U_1 > p_1,\,U_2 > p_2/(1-p_1),\,\dots,\,U_{d-1} > p_{d-1}/(1-p_1-\dots-p_{d-2})\}}.
\]
Remark 5.69 provides an alternative road for proving Proposition 5.66, at least when $p_i > 0$ for all $i \in \{1,\dots,d\}$ and when the assumption of Proposition 5.51 is in force so that $[\delta u/\delta m](\mu)(\cdot)$ is a potential of $\partial_\mu u(\mu)(\cdot)$.
is differentiable, with:
\[
\frac{d}{dt}\chi_t = \frac{d}{dt}\Bigl[\varphi\bigl(tp' + (1-t)p,\,U\bigr)\Bigr]\bigl(e_1 - e_2\bigr)
= \bigl(p' - p\bigr)\,\partial_w\varphi\bigl(tp' + (1-t)p,\,U\bigr)\bigl(e_1 - e_2\bigr).
\]
We deduce that:
\[
\begin{aligned}
u\bigl(\mathcal{L}(\chi_1)\bigr) - u\bigl(\mathcal{L}(\chi_0)\bigr)
&= \bigl(p' - p\bigr)\int_0^1\mathbb{E}\Bigl[\partial_\mu u\bigl(\mathcal{L}(\chi_t)\bigr)(\chi_t)\cdot\partial_w\varphi\bigl(tp' + (1-t)p,\,U\bigr)\bigl(e_1 - e_2\bigr)\Bigr]\,dt\\
&= \bigl(p' - p\bigr)\int_0^1\!\!\int_0^1\partial_\mu u\bigl(\mathcal{L}(\chi_t)\bigr)\bigl(\chi(tp' + (1-t)p,w)\bigr)\cdot\partial_w\varphi\bigl(tp' + (1-t)p,\,w\bigr)\bigl(e_1 - e_2\bigr)\,dw\,dt.
\end{aligned}
\]
\[
\begin{aligned}
&u\bigl(p'\delta_{e_1} + (1-p')\delta_{e_2}\bigr) - u\bigl(p\delta_{e_1} + (1-p)\delta_{e_2}\bigr)\\
&\quad= \bigl(p' - p\bigr)\int_0^1\Bigl[\frac{\delta u}{\delta m}\Bigl(\bigl(tp' + (1-t)p\bigr)\delta_{e_1} + \bigl(1 - tp' - (1-t)p\bigr)\delta_{e_2}\Bigr)(e_1)\\
&\hspace{6em} - \frac{\delta u}{\delta m}\Bigl(\bigl(tp' + (1-t)p\bigr)\delta_{e_1} + \bigl(1 - tp' - (1-t)p\bigr)\delta_{e_2}\Bigr)(e_2)\Bigr]\,dt,
\end{aligned}
\]
\[
\chi_d(q,w) = \prod_{j=1}^{d-1}\mathbf{1}_{\{w_j > q_j\}},
\]
and
\[
q_1(p) = p_1, \qquad q_i(p) = \frac{p_i}{1 - (p_1 + \dots + p_{i-1})} = \frac{p_i}{p_i + \dots + p_d}, \qquad i \in \{2,\dots,d-1\}.
\]
Now, for any two $p$ and $p'$ in $S_d^\circ$ and any vector $U = (U_1,\dots,U_{d-1})$ of $d-1$ independent random variables $U_1,\dots,U_{d-1}$ uniformly distributed on $[0,1]$, the mapping:
\[
[0,1] \ni t \mapsto \chi_t^\epsilon = \sum_{i=1}^d\chi_i^\epsilon\bigl(q\bigl(tp' + (1-t)p\bigr),\,U\bigr)\,e_i \in L^2(\Omega,\mathcal{F},\mathbb{P};\mathbb{R}^d)
\]
is differentiable, with:
\[
\frac{d}{dt}\chi_t^\epsilon = \sum_{i=1}^d\partial_q\chi_i^\epsilon\bigl(q\bigl(tp' + (1-t)p\bigr),\,U\bigr)\cdot\frac{d}{dt}\Bigl[q\bigl(tp' + (1-t)p\bigr)\Bigr]\,e_i.
\]
We deduce that:
\[
u\bigl(\mathcal{L}(\chi_1^\epsilon)\bigr) - u\bigl(\mathcal{L}(\chi_0^\epsilon)\bigr)
= \sum_{i=1}^d\int_0^1\mathbb{E}\Bigl[\partial_\mu u\bigl(\mathcal{L}(\chi_t^\epsilon)\bigr)\bigl(\chi_t^\epsilon\bigr)\cdot\Bigl(\partial_w\chi_i^\epsilon\bigl(q\bigl(tp' + (1-t)p\bigr),\,U\bigr)\cdot\frac{d}{dt}\Bigl[q\bigl(tp' + (1-t)p\bigr)\Bigr]\Bigr)e_i\Bigr]\,dt,
\]
so that:
\[
\begin{aligned}
u\bigl(\mathcal{L}(\chi_1^\epsilon)\bigr) - u\bigl(\mathcal{L}(\chi_0^\epsilon)\bigr)
&= \sum_{i=1}^d\int_0^1\!\!\int_{(0,1)^{d-1}}\partial_\mu u\bigl(\mathcal{L}(\chi_t^\epsilon)\bigr)\Bigl(\sum_{j=1}^d\chi_j^\epsilon\bigl(tp' + (1-t)p,\,w\bigr)e_j\Bigr)\\
&\hspace{4em}\cdot\Bigl(\partial_w\chi_i^\epsilon\bigl(tp' + (1-t)p,\,w\bigr)\cdot\frac{d}{dt}\Bigl[q\bigl(tp' + (1-t)p\bigr)\Bigr]\Bigr)e_i\,dw\,dt.
\end{aligned}
\]
Using once again the fact that $[\delta u/\delta m](\mu)(\cdot)$ is a potential of $\partial_\mu u(\mu)(\cdot)$, and in particular the relationship $\partial_{x_i}[\delta u/\delta m](\mu)(x) = \partial_\mu u(\mu)(x)\cdot e_i$, we get:
\[
u\bigl(\mathcal{L}(\chi_1^\epsilon)\bigr) - u\bigl(\mathcal{L}(\chi_0^\epsilon)\bigr)
= \int_0^1\!\!\int_{(0,1)^{d-1}}\partial_w\Bigl[\frac{\delta u}{\delta m}\bigl(\mathcal{L}(\chi_t^\epsilon)\bigr)\Bigl(\sum_{j=1}^d\chi_j^\epsilon\bigl(tp' + (1-t)p,\,w\bigr)e_j\Bigr)\Bigr]\cdot\frac{d}{dt}\Bigl[q\bigl(tp' + (1-t)p\bigr)\Bigr]\,dw\,dt.
\]
By Stokes' theorem,
\[
u\bigl(\mathcal{L}(\chi_1^\epsilon)\bigr) - u\bigl(\mathcal{L}(\chi_0^\epsilon)\bigr)
= \int_0^1\!\!\int_{\partial(0,1)^{d-1}}\frac{\delta u}{\delta m}\bigl(\mathcal{L}(\chi_t^\epsilon)\bigr)\Bigl(\sum_{j=1}^d\chi_j^\epsilon\bigl(tp' + (1-t)p,\,w\bigr)e_j\Bigr)\Bigl[\frac{d}{dt}q\bigl(tp' + (1-t)p\bigr)\cdot n(w)\Bigr]\,ds(w)\,dt,
\quad (5.72)
\]
where $n(w)$ denotes the outward unit normal vector to $\partial(0,1)^{d-1}$ at $w$ and $s$ denotes the surface measure on $\partial(0,1)^{d-1}$.
Third Step. Now, for any $t \in (0,1)$,
\[
\begin{aligned}
&\lim_{\epsilon\searrow0}\int_{\partial(0,1)^{d-1}}\frac{\delta u}{\delta m}\bigl(\mathcal{L}(\chi_t^\epsilon)\bigr)\Bigl(\sum_{j=1}^d\chi_j^\epsilon\bigl(tp' + (1-t)p,\,w\bigr)e_j\Bigr)\Bigl[\frac{d}{dt}q\bigl(tp' + (1-t)p\bigr)\cdot n(w)\Bigr]\,ds(w)\\
&= \sum_{i=1}^{d-1}\sum_{\ell=1}^d\int_{\{w_i=1\}}\frac{\delta u}{\delta m}\Bigl(\sum_{j=1}^d\bigl(tp_j' + (1-t)p_j\bigr)\delta_{e_j}\Bigr)(e_\ell)\\
&\hspace{3em}\times\prod_{k=1}^{\ell-1}\mathbf{1}_{\{w_k > q_k(tp'+(1-t)p)\}}\,\mathbf{1}_{\{w_\ell \le q_\ell(tp'+(1-t)p)\}}\Bigl[\frac{d}{dt}q\bigl(tp' + (1-t)p\bigr)\cdot e_i\Bigr]\,ds(w)\\
&\quad- \sum_{i=1}^{d-1}\sum_{\ell=1}^d\int_{\{w_i=0\}}\frac{\delta u}{\delta m}\Bigl(\sum_{j=1}^d\bigl(tp_j' + (1-t)p_j\bigr)\delta_{e_j}\Bigr)(e_\ell)\\
&\hspace{3em}\times\prod_{k=1}^{\ell-1}\mathbf{1}_{\{w_k > q_k(tp'+(1-t)p)\}}\,\mathbf{1}_{\{w_\ell \le q_\ell(tp'+(1-t)p)\}}\Bigl[\frac{d}{dt}q\bigl(tp' + (1-t)p\bigr)\cdot e_i\Bigr]\,ds(w),
\end{aligned}
\]
where we used the convention $w_d = 0$ and the fact that, whenever $w_i = 0$ (resp. $1$), $n(w) = -e_i$ (resp. $+e_i$) except at the vertices where the outward normal unit vector is not uniquely defined. Above, we used two main ingredients. First, we used the fact that for any $p \in S_d^\circ$ and $\mathbb{P}$-almost surely:
\[
\lim_{\epsilon\searrow0}\chi_i^\epsilon\bigl(tp' + (1-t)p,\,U\bigr) = \prod_{j=1}^{i-1}\mathbf{1}_{\{U_j > q_j(tp'+(1-t)p)\}}\,\mathbf{1}_{\{U_i < q_i(tp'+(1-t)p)\}}.
\]
By Remark 5.69, we deduce that for any $t \in [0,1]$, the random vector $\chi_t^\epsilon$ converges in law to $\sum_{j=1}^d(tp_j' + (1-t)p_j)\delta_{e_j}$ as $\epsilon$ tends to $0$. Also, we used the fact that for $w \ne q(tp' + (1-t)p)$, the vector $\sum_{j=1}^d\chi_j^\epsilon(tp' + (1-t)p,\,w)e_j$ converges to the vector $e_\ell$, where $\ell$ is the smallest index such that $w_\ell \le q_\ell(tp' + (1-t)p)$.
Recalling the convention $q_d(tp' + (1-t)p) = 1$ and thus $\frac{d}{dt}\bigl[q_d(tp' + (1-t)p)\bigr] = 0$, we deduce that for any $t \in (0,1)$:
\[
\begin{aligned}
&\lim_{\epsilon\searrow0}\int_{\partial(0,1)^{d-1}}\frac{\delta u}{\delta m}\bigl(\mathcal{L}(\chi_t^\epsilon)\bigr)\Bigl(\sum_{j=1}^d\chi_j^\epsilon\bigl(tp' + (1-t)p,\,w\bigr)e_j\Bigr)\Bigl[\frac{d}{dt}q\bigl(tp' + (1-t)p\bigr)\cdot n(w)\Bigr]\,ds(w)\\
&= \sum_{\ell=1}^d\sum_{i\ne\ell}\frac{\delta u}{\delta m}\Bigl(\sum_{j=1}^d\bigl(tp_j' + (1-t)p_j\bigr)\delta_{e_j}\Bigr)(e_\ell)
\Bigl[\prod_{k\le\ell-1,\,k\ne i}\bigl(1 - q_k\bigl(tp' + (1-t)p\bigr)\bigr)\Bigr]\,q_\ell\bigl(tp' + (1-t)p\bigr)\,\frac{d}{dt}q_i\bigl(tp' + (1-t)p\bigr)\\
&\quad- \sum_{\ell=1}^d\sum_{i=\ell+1}^d\frac{\delta u}{\delta m}\Bigl(\sum_{j=1}^d\bigl(tp_j' + (1-t)p_j\bigr)\delta_{e_j}\Bigr)(e_\ell)
\Bigl[\prod_{k\le\ell-1}\bigl(1 - q_k\bigl(tp' + (1-t)p\bigr)\bigr)\Bigr]\,q_\ell\bigl(tp' + (1-t)p\bigr)\,\frac{d}{dt}q_i\bigl(tp' + (1-t)p\bigr)\\
&\quad- \sum_{\ell=1}^d\frac{\delta u}{\delta m}\Bigl(\sum_{j=1}^d\bigl(tp_j' + (1-t)p_j\bigr)\delta_{e_j}\Bigr)(e_\ell)
\Bigl[\prod_{k\le\ell-1}\bigl(1 - q_k\bigl(tp' + (1-t)p\bigr)\bigr)\Bigr]\,\frac{d}{dt}q_\ell\bigl(tp' + (1-t)p\bigr),
\end{aligned}
\]
which gives:
\[
\begin{aligned}
&\lim_{\epsilon\searrow0}\int_{\partial(0,1)^{d-1}}\frac{\delta u}{\delta m}\bigl(\mathcal{L}(\chi_t^\epsilon)\bigr)\Bigl(\sum_{j=1}^d\chi_j^\epsilon\bigl(tp' + (1-t)p,\,w\bigr)e_j\Bigr)\Bigl[\frac{d}{dt}q\bigl(tp' + (1-t)p\bigr)\cdot n(w)\Bigr]\,ds(w)\\
&= \sum_{\ell=1}^d\sum_{i\le\ell-1}\frac{\delta u}{\delta m}\Bigl(\sum_{j=1}^d\bigl(tp_j' + (1-t)p_j\bigr)\delta_{e_j}\Bigr)(e_\ell)
\Bigl[\prod_{k\le\ell-1,\,k\ne i}\bigl(1 - q_k\bigl(tp' + (1-t)p\bigr)\bigr)\Bigr]\,q_\ell\bigl(tp' + (1-t)p\bigr)\,\frac{d}{dt}q_i\bigl(tp' + (1-t)p\bigr)\\
&\quad- \sum_{\ell=1}^d\frac{\delta u}{\delta m}\Bigl(\sum_{j=1}^d\bigl(tp_j' + (1-t)p_j\bigr)\delta_{e_j}\Bigr)(e_\ell)
\Bigl[\prod_{k\le\ell-1}\bigl(1 - q_k\bigl(tp' + (1-t)p\bigr)\bigr)\Bigr]\,\frac{d}{dt}q_\ell\bigl(tp' + (1-t)p\bigr).
\end{aligned}
\]
By the product rule, for each $\ell$,
\[
\begin{aligned}
\frac{d}{dt}\Bigl[\prod_{k\le\ell-1}\bigl(1 - q_k\bigl(tp' + (1-t)p\bigr)\bigr)\,q_\ell\bigl(tp' + (1-t)p\bigr)\Bigr]
&= -\sum_{i=1}^{\ell-1}\Bigl[\prod_{k\le\ell-1,\,k\ne i}\bigl(1 - q_k\bigl(tp' + (1-t)p\bigr)\bigr)\Bigr]\,q_\ell\bigl(tp' + (1-t)p\bigr)\,\frac{d}{dt}q_i\bigl(tp' + (1-t)p\bigr)\\
&\quad+ \Bigl[\prod_{k\le\ell-1}\bigl(1 - q_k\bigl(tp' + (1-t)p\bigr)\bigr)\Bigr]\,\frac{d}{dt}q_\ell\bigl(tp' + (1-t)p\bigr),
\end{aligned}
\]
so that the two sums above combine into
\[
\lim_{\epsilon\searrow0}\int_{\partial(0,1)^{d-1}}(\cdots)\,ds(w)
= \sum_{\ell=1}^d\frac{\delta u}{\delta m}\Bigl(\sum_{j=1}^d\bigl(tp_j' + (1-t)p_j\bigr)\delta_{e_j}\Bigr)(e_\ell)\,
\frac{d}{dt}\Bigl[\prod_{k\le\ell-1}\bigl(1 - q_k\bigl(tp' + (1-t)p\bigr)\bigr)\,q_\ell\bigl(tp' + (1-t)p\bigr)\Bigr].
\]
Now, for $m$ in the interior of the simplex, the telescoping identity
\[
\prod_{k\le\ell-1}\bigl(1 - q_k(m)\bigr)\,q_\ell(m) = \bigl(1 - m_1 - \dots - m_{\ell-1}\bigr)\frac{m_\ell}{1 - m_1 - \dots - m_{\ell-1}} = m_\ell
\]
holds, where we used the convention that $p_0 = 0$. So that, by differentiating the above identity along the curve $(0,1) \ni t \mapsto tp' + (1-t)p$, we get:
\[
\frac{d}{dt}\Bigl[\prod_{k\le\ell-1}\bigl(1 - q_k\bigl(tp' + (1-t)p\bigr)\bigr)\,q_\ell\bigl(tp' + (1-t)p\bigr)\Bigr] = p_\ell' - p_\ell.
\]
We finally get:
\[
\lim_{\epsilon\searrow0}\int_{\partial(0,1)^{d-1}}\frac{\delta u}{\delta m}\bigl(\mathcal{L}(\chi_t^\epsilon)\bigr)\Bigl(\sum_{j=1}^d\chi_j^\epsilon\bigl(tp' + (1-t)p,\,w\bigr)e_j\Bigr)\Bigl[\frac{d}{dt}q\bigl(tp' + (1-t)p\bigr)\cdot n(w)\Bigr]\,ds(w)
= \sum_{\ell=1}^d\frac{\delta u}{\delta m}\Bigl(\sum_{j=1}^d\bigl(tp_j' + (1-t)p_j\bigr)\delta_{e_j}\Bigr)(e_\ell)\,\bigl(p_\ell' - p_\ell\bigr).
\]
Plugging this into (5.72), we conclude:
\[
u\Bigl(\sum_{i=1}^d p_i'\delta_{e_i}\Bigr) - u\Bigl(\sum_{i=1}^d p_i\delta_{e_i}\Bigr)
= \sum_{\ell=1}^d\int_0^1\frac{\delta u}{\delta m}\Bigl(\sum_{j=1}^d\bigl(tp_j' + (1-t)p_j\bigr)\delta_{e_j}\Bigr)(e_\ell)\,\bigl(p_\ell' - p_\ell\bigr)\,dt,
\]
Most of the practical applications for which the theoretical results of the book
have been developed are concerned with optimizations of functions. So the fact
that sufficient conditions will often involve convexity assumptions should not come
as a surprise. For functions defined on flat linear spaces, the notion of convexity
based on the convexity of restrictions of the functions to lines, and the notion of
convexity based on the idea of graph sitting above the tangent hyperplane are easily
seen to be equivalent. This is not clear any longer for functions defined on curved
spaces like P2 .Rd /. In most of the applications considered in this book, we shall use
convex functions whose graphs are above their tangents when the latter are defined
in terms of L-derivatives. So we first study this class of convex functions. For the
sake of completeness, we next define the class of functions which are convex when
restricted to geodesic curves, and we compare this form of convexity to the previous
one.
We first study the notion of convexity associated with the special notion of L-
differentiability introduced in this chapter. Using this notion of differentiability, the
notion of convexity coming from the above the tangent philosophy can be defined
as follows.
\[
h\bigl(x',\mu'\bigr) - h\bigl(x,\mu\bigr) - \partial_x h\bigl(x,\mu\bigr)\cdot\bigl(x' - x\bigr) - \mathbb{E}\bigl[\partial_\mu h\bigl(x,\mu\bigr)(X)\cdot\bigl(X' - X\bigr)\bigr] \ge 0, \qquad (5.74)
\]
whenever $X$ and $X'$ are square integrable random variables with distributions $\mu$ and $\mu'$ respectively.
where $X \sim \mu$ and $X' \sim \mu'$. Since $\varphi$ is convex, it holds that $\varphi(X') \ge \varphi(X) + \partial\varphi(X)\cdot(X' - X)$. Taking the expectation, we get $\mathbb{E}[\varphi(X')] \ge \mathbb{E}[\varphi(X)] + \mathbb{E}[\partial\varphi(X)\cdot(X' - X)]$. Now, using the fact that $g$ is nondecreasing and convex, we get:
\[
g\bigl(\mathbb{E}[\varphi(X')]\bigr) \ge g\bigl(\mathbb{E}[\varphi(X)] + \mathbb{E}[\partial\varphi(X)\cdot(X' - X)]\bigr)
\ge g\bigl(\mathbb{E}[\varphi(X)]\bigr) + g'\bigl(\mathbb{E}[\varphi(X)]\bigr)\,\mathbb{E}\bigl[\partial\varphi(X)\cdot(X' - X)\bigr],
\]
is L-convex. Indeed, we know from Example 4 in Subsection 5.2.2 that the function $h$ is L-differentiable, with:
\[
\partial_\mu h(\mu)(x) = \int_{\mathbb{R}^d}\partial_x g(x,x')\,d\mu(x') + \int_{\mathbb{R}^d}\partial_{x'} g(x',x)\,d\mu(x').
\]
In order to complete the proof, it suffices to take the expectation $\mathbb{E}\tilde{\mathbb{E}}$ and to notice that:
\[
\mathbb{E}\tilde{\mathbb{E}}\bigl[\partial_x g(X,\tilde X)\cdot(X' - X)\bigr] = \mathbb{E}\Bigl[\Bigl(\int_{\mathbb{R}^d}\partial_x g(X,x')\,d\mu(x')\Bigr)\cdot(X' - X)\Bigr],
\]
\[
\mathbb{E}\tilde{\mathbb{E}}\bigl[\partial_{x'} g(X,\tilde X)\cdot(\tilde X' - \tilde X)\bigr]
= \tilde{\mathbb{E}}\Bigl[\Bigl(\int_{\mathbb{R}^d}\partial_{x'} g(x',\tilde X)\,d\mu(x')\Bigr)\cdot(\tilde X' - \tilde X)\Bigr]
= \mathbb{E}\Bigl[\Bigl(\int_{\mathbb{R}^d}\partial_{x'} g(x',X)\,d\mu(x')\Bigr)\cdot(X' - X)\Bigr].
\]
is L-convex.
\[
\mu = \frac12\bigl(\delta_{x_1} + \delta_{x_2}\bigr), \qquad \nu = \frac12\bigl(\delta_{x_3} + \delta_{x_4}\bigr),
\]
and $X$ and $Y$ are $\mathbb{R}^d$-valued random variables such that $\mathcal{L}(X) = \mu$ and $\mathcal{L}(Y) = \nu$, then:
\[
\mathbb{E}\bigl[|X - Y|^2\bigr] = \frac12\bigl(|x_1|^2 + |x_2|^2 + |x_3|^2 + |x_4|^2\bigr)
- \bigl(x_1 + x_2\bigr)\cdot x_4 - \bigl(\alpha x_1 + \beta x_2\bigr)\cdot\bigl(x_3 - x_4\bigr), \qquad (5.75)
\]
where:
\[
\alpha = \mathbb{P}\bigl[Y = x_3 \mid X = x_1\bigr], \qquad \beta = \mathbb{P}\bigl[Y = x_3 \mid X = x_2\bigr],
\]
so that:
\[
W_2(\mu,\nu)^2 = \inf_{\alpha,\beta\in[0,1]}\Bigl[\frac12\bigl(|x_1|^2 + |x_2|^2 + |x_3|^2 + |x_4|^2\bigr)
- \bigl(x_1 + x_2\bigr)\cdot x_4 - \bigl(\alpha x_1 + \beta x_2\bigr)\cdot\bigl(x_3 - x_4\bigr)\Bigr]. \qquad (5.76)
\]
Now, if $x_1 = (0,0)$, $x_2 = (2,1)$, $x_3 = (2,1)$ and $x_4 = (0,0)$ and $\mathbb{P}[X_0 = x_1, Y_0 = x_3] = \mathbb{P}[X_0 = x_2, Y_0 = x_4] = 1/2$, then, for any $t \in [0,1]$, we have:
\[
\mu^{(t)} = \frac12\bigl(\delta_{(2t,t)} + \delta_{(2-2t,1-t)}\bigr) \qquad\text{with}\qquad \mu^{(t)} = \mathcal{L}\bigl((1-t)X_0 + tY_0\bigr).
\]
Finally, we set $\nu_0 = \frac12\bigl(\delta_{(0,0)} + \delta_{(0,-2)}\bigr)$ and, if $X$ and $Y$ are two other random variables with distributions $\nu_0$ and $\mu^{(t)}$ respectively, using (5.75) we get:
\[
\mathbb{E}\bigl[|X - Y|^2\bigr] = 5t^2 - 7t + \frac{13}{2} + 2\beta\bigl(2t - 1\bigr),
\]
so that:
\[
W_2\bigl(\nu_0,\mu^{(t)}\bigr)^2 =
\begin{cases}
\;5t^2 - 3t + \dfrac92 & \text{if } t \le \dfrac12,\\[2mm]
\;5t^2 - 7t + \dfrac{13}{2} & \text{if } t \ge \dfrac12.
\end{cases}
\]
[Figure: plot of $t \mapsto W_2(\nu_0,\mu^{(t)})^2$ (labeled "dist^2"), with values ranging roughly between 4.05 and 4.5 and a cusp at $t = 1/2$.]
Clearly, the plot of the function $t \mapsto W_2(\nu_0,\mu^{(t)})^2$ shows that the latter is not convex, and not even differentiable because of the cusp at $t = 1/2$. This proves our claims.
Of course, it should be noticed that, for any $\nu \in \mathcal{P}_2(\mathbb{R}^d)$, the map $\mathcal{P}_2(\mathbb{R}^d) \ni \mu \mapsto W_2(\mu,\nu)^2$ is always convex for the structure inherited from the linear space of measures. Precisely, for any $t \in [0,1]$ and any $\mu,\mu' \in \mathcal{P}_2(\mathbb{R}^d)$,
\[
W_2\bigl((1-t)\mu + t\mu',\,\nu\bigr)^2 \le (1-t)\,W_2(\mu,\nu)^2 + t\,W_2(\mu',\nu)^2.
\]
An Intriguing Example
We return to the second example of L-convex function presented earlier. There $u(\mu) = \langle g \ast \mu, \mu\rangle$, and we now choose the function $g$ to be $g(x) = |x|^2$, so that $\partial g(x) = 2x$ and $\partial_\mu u(\mu)(x) = 4\int_{\mathbb{R}^d}(x - y)\,d\mu(y)$. Notice that:
\[
u(\mu) = \mathbb{E}\tilde{\mathbb{E}}\bigl[|X - \tilde X|^2\bigr],
\]
where we used Fubini’s theorem in order to pass from the second to the third line.
The second order term being nonnegative, we recover the fact that the function u is
convex in the sense of Definition 5.70.
We now explain why we find this example intriguing. The function $u$ may be rewritten:
\[
u(m) = \int_{\mathbb{R}^d}\int_{\mathbb{R}^d}|x - y|^2\,dm(x)\,dm(y), \qquad m \in \mathcal{P}_2(\mathbb{R}^d).
\]
\[
\begin{aligned}
\int_{\mathbb{R}^d}\int_{\mathbb{R}^d}|x - y|^2\,d(m' - m)(x)\,d(m' - m)(y)
&= \int_{\mathbb{R}^d}\int_{\mathbb{R}^d}|x|^2\,d(m' - m)(x)\,d(m' - m)(y)\\
&\quad+ \int_{\mathbb{R}^d}\int_{\mathbb{R}^d}|y|^2\,d(m' - m)(x)\,d(m' - m)(y)\\
&\quad- 2\int_{\mathbb{R}^d}\int_{\mathbb{R}^d}x\cdot y\;d(m' - m)(x)\,d(m' - m)(y)\\
&= -2\,\Bigl|\int_{\mathbb{R}^d}x\,d(m' - m)(x)\Bigr|^2,
\end{aligned}
\]
the first two terms vanishing since $(m' - m)(\mathbb{R}^d) = 0$. Obviously the absolute value of this expression is less than $2\,W_2(m,m')^2$. This says that the absolute value of the last term in the right-hand side in (5.77) is less than $2\,W_2(m,m')^2$. By Remark 5.47, this implies that $u$ admits the following linear functional derivative:
\[
\frac{\delta u}{\delta m}(m)(x) = 2\int_{\mathbb{R}^d}|x - y|^2\,dm(y), \qquad (m,x) \in \mathcal{P}_2(\mathbb{R}^d)\times\mathbb{R}^d.
\]
Now, using (5.77) to develop the function $u$ near $m$ along the ray from $m$ to $m'$, we find:
\[
\begin{aligned}
u\bigl(m + \epsilon(m' - m)\bigr)
&= \int_{\mathbb{R}^d}\int_{\mathbb{R}^d}|x - y|^2\,dm(x)\,dm(y) + 2\epsilon\int_{\mathbb{R}^d}\int_{\mathbb{R}^d}|x - y|^2\,dm(x)\,d(m' - m)(y)\\
&\quad+ \epsilon^2\int_{\mathbb{R}^d}\int_{\mathbb{R}^d}|x - y|^2\,d(m' - m)(x)\,d(m' - m)(y)\\
&= u(m) + \epsilon\int_{\mathbb{R}^d}\frac{\delta u}{\delta m}(m)(y)\,d(m' - m)(y) - 2\epsilon^2\,\Bigl|\int_{\mathbb{R}^d}x\,d(m' - m)(x)\Bigr|^2.
\end{aligned}
\]
The second order correction is now negative, suggesting concavity instead of the
convexity argued earlier. The space of finite measures is flat, and this concavity can
be interpreted using the standard intuition associated with the shapes of surfaces
plotted above a flat space. However, the notion of convexity derived from an
expansion using L-derivatives cannot be interpreted in terms of Euclidean geometry.
We already encountered this example in Subsection 3.4.3. We shall appeal to it
again in the next subsection and in Section 5.7.1, when revisiting the uniqueness of
mean field game solutions.
Already in Chapter 1, we used the notion of monotonicity in our first baby steps
toward the existence and uniqueness of Nash equilibria for mean field games. The
discussion of this notion culminated in Section 3.4 of Chapter 3 for general unique-
ness results, which we shall revisit in Subsection 5.7.1 below. In this subsection, we
derive a couple of simple properties relating monotonicity to convexity. First, we
define the notion of monotonicity for operators on Hilbert spaces (which will be L2 -
spaces in all the examples considered below), which generalizes the Definition 3.31
of L-monotonicity:
F is convex $\iff$ DF is monotone.
Indeed, if $F$ is convex, then $F(Y) \ge F(X) + \langle DF(X), Y - X\rangle$, as well as: $F(X) \ge F(Y) + \langle DF(Y), X - Y\rangle$, and summing both inequalities gives the monotonicity condition for $DF$. Conversely, if $DF$ is assumed to be monotone, since $F$ is continuously differentiable we have:
\[
F(Y) - F(X) = \int_0^1\bigl\langle DF\bigl(X + t(Y - X)\bigr),\,Y - X\bigr\rangle\,dt \ge \bigl\langle DF(X),\,Y - X\bigr\rangle,
\]
the inequality following from $\langle DF(X + t(Y - X)) - DF(X),\,Y - X\rangle \ge 0$ for all $t \in [0,1]$.
The second result of this subsection makes the connection with the Lasry-Lions
monotonicity condition introduced in Chapter 3 to ensure uniqueness of mean field
game solutions. It is reminiscent of Example 5 of Section 5.2.2.
then the function $U$ from the Hilbert space $L^2(\Omega,\mathcal{F},\mathbb{P};\mathbb{R}^d)$ into itself defined by $U(X) = \partial_x u(X,\mathcal{L}(X))$ is monotone.
Similarly,
\[
u\bigl(X',\mathcal{L}(X)\bigr) \ge u\bigl(X,\mathcal{L}(X)\bigr) + \bigl\langle\partial_x u\bigl(X,\mathcal{L}(X)\bigr),\,X' - X\bigr\rangle_{L^2},
\]
which is nonnegative by the monotonicity assumption. This completes the proof of the monotonicity of $U$. $\square$
Remark 5.74 The converse of the implication proven in the above lemma is not true, as shown by the following example. Indeed, consider the function
\[
u(x,\mu) = \frac12\bigl(x - \bar\mu\bigr)^2, \qquad x \in \mathbb{R},\ \mu \in \mathcal{P}_2(\mathbb{R}), \qquad \bar\mu = \int_{\mathbb{R}}x\,d\mu(x),
\]
for which $U(X) = \partial_x u(X,\mathcal{L}(X)) = X - \mathbb{E}[X]$ satisfies
\[
\bigl\langle U(X) - U(X'),\,X - X'\bigr\rangle_{L^2} = \mathrm{Var}\bigl(X - X'\bigr) \ge 0,
\]
for all m; m0 2 P2 .Rd /. Then the result of Lemma 5.72 still holds in the sense
that u is convex in this sense if and only if Œıu=ım is monotone in the sense of
Definition 3.28. The proof is exactly the same.
We now introduce the notion of convexity most popular in the theory of optimal
transportation of measures, and we connect it to the notion of L-convexity studied
above.
Recall the definition (5.68) of the projection t . With this definition of displacement
convex sets, the definition of displacement convex functions follows naturally.
\[
\frac{d^2}{dt^2}h\bigl(\pi\circ(\pi_t)^{-1}\bigr) \ge \lambda\,W_2(\mu_0,\mu_1)^2.
\]
We now revisit Example 3 introduced earlier in this section in the new framework of displacement convexity.
uniformly convex (in the sense that the Hessian of $g$ is bounded below by $\lambda I_d$, with $\lambda > 0$, when $g$ is twice differentiable), then the restriction of $h$ to the set $\mathcal{P}(m)$ is $\lambda$-uniformly displacement convex.
Proof. The first claim is easy to check. Indeed, if $\mu_0,\mu_1 \in \mathcal{P}_2(\mathbb{R}^d)$, $\pi \in \Pi_2^{\mathrm{opt}}(\mu_0,\mu_1)$, and we use the notation $\mu_t$ for $\pi\circ(\pi_t)^{-1}$, then:
\[
\begin{aligned}
h(\mu_t) &= \int_{\mathbb{R}^d}\int_{\mathbb{R}^d}g\bigl(z - z'\bigr)\,d\mu_t(z)\,d\mu_t(z')\\
&= \int_{(\mathbb{R}^d)^2}\int_{(\mathbb{R}^d)^2}g\bigl(\pi_t(x,y) - \pi_t(x',y')\bigr)\,d\pi(x,y)\,d\pi(x',y')\\
&= \int_{(\mathbb{R}^d)^2}\int_{(\mathbb{R}^d)^2}g\bigl((1-t)(x - x') + t(y - y')\bigr)\,d\pi(x,y)\,d\pi(x',y')\\
&\le \int_{(\mathbb{R}^d)^2}\int_{(\mathbb{R}^d)^2}\bigl[(1-t)\,g(x - x') + t\,g(y - y')\bigr]\,d\pi(x,y)\,d\pi(x',y')\\
&= (1-t)\int_{\mathbb{R}^d}\int_{\mathbb{R}^d}g(x - x')\,d\mu_0(x)\,d\mu_0(x') + t\int_{\mathbb{R}^d}\int_{\mathbb{R}^d}g(y - y')\,d\mu_1(y)\,d\mu_1(y')\\
&= (1-t)\,h(\mu_0) + t\,h(\mu_1).
\end{aligned}
\]
If we now assume that $g$ is strictly convex, the above inequality can only be an equality if and only if $\pi^{\otimes2}\bigl(\{(x,y),(x',y')\,:\ x - x' = y - y'\}\bigr) = 1$, which is equivalent to the fact that there exists an element $a \in \mathbb{R}^d$ such that $\pi\bigl(\{(x,y)\,;\ y - x = a\}\bigr) = 1$, which implies that $\mu_1$ is merely a shift of $\mu_0$. In that case, for each fixed $m \in \mathbb{R}^d$, the restriction of the function $h$ to $\mathcal{P}(m)$ is strictly convex.
Finally, if $t \in [0,1]$ is fixed, the above computation shows that:
Now, if we assume that $\int_{\mathbb{R}^d}x\,d\mu_0(x) = \int_{\mathbb{R}^d}y\,d\mu_1(y)$, we have:
\[
\int_{(\mathbb{R}^d)^2}\int_{(\mathbb{R}^d)^2}\bigl|x - x' - (y - y')\bigr|^2\,d\pi(x,y)\,d\pi(x',y') = 2\int_{(\mathbb{R}^d)^2}|x - y|^2\,d\pi(x,y) = 2\,W_2(\mu_0,\mu_1)^2.
\]
Proposition 5.79 Let us assume that the real valued function $h$ is continuously L-differentiable on $\mathcal{P}_2(\mathbb{R}^d)$. Then $h$ is L-convex if and only if it is displacement convex.
Proof. As usual we denote by $\tilde h$ the lifting of $h$ to an $L^2$-space. Recall that saying that $h$ is L-convex means that the graph of $\tilde h$ is above its tangent as given by its Fréchet derivative. So if $\mu_0$ and $\mu_1$ are given in $\mathcal{P}_2(\mathbb{R}^d)$, if $\pi \in \Pi_2^{\mathrm{opt}}(\mu_0,\mu_1)$, we denote by $(X,Y)$ a couple of random variables in the $L^2$-space with joint distribution $\pi$, that is $\mathcal{L}(X,Y) = \pi$. If we use the same notation $\mu_t = \pi\circ(\pi_t)^{-1}$ for the optimal displacement from $\mu_0$ to $\mu_1$, then $\mu_t = \mathcal{L}(X_t)$ with $X_t = (1-t)X + tY$. Now:
\[
\begin{aligned}
h(\mu_1) - h(\mu_0) - \frac{d}{dt}h(\mu_t)\Big|_{t=0}
&= h(\mu_1) - h(\mu_0) - \frac{d}{dt}\tilde h(X_t)\Big|_{t=0}\\
&= h(\mu_1) - h(\mu_0) - D\tilde h(X)\cdot\bigl(Y - X\bigr)\\
&= h(\mu_1) - h(\mu_0) - \mathbb{E}\bigl[\partial_\mu h(\mu_0)(X)\cdot\bigl(Y - X\bigr)\bigr].
\end{aligned}
\]
From the definition of the L-convexity and the above-the-tangent formulation of displacement convexity, we see that $h$ is L-convex if and only if it is displacement convex. $\square$
The goal of this section is to provide a chain rule for the differentiation of functions of $t$ of the form $(u(\mu_t))_{t\ge0}$ when $u$ is an $\mathbb{R}$-valued smooth function defined on the space $\mathcal{P}_2(\mathbb{R}^d)$ of probability measures of order 2 on $\mathbb{R}^d$, and $\mu = (\mu_t)_{t\ge0}$ is the flow of marginal distributions of an $\mathbb{R}^d$-valued Itô process $X = (X_t)_{t\ge0}$. We shall sometimes use the terminology Itô's formula instead of chain rule because the dynamics are driven by an Itô process.
There are two obvious strategies to expand $u(\mu_t)$ and derive an infinitesimal chain rule for the differential in time.
For each given $t > 0$, a first strategy consists in dividing the interval $[0,t]$ into small intervals of length $\Delta t = t/N$ for some integer $N \ge 1$, and writing the difference $u(\mu_t) - u(\mu_0)$ as a telescoping sum:
\[
u(\mu_t) - u(\mu_0) = \sum_{i=1}^{N}\bigl[u\bigl(\mu_{i\Delta t}\bigr) - u\bigl(\mu_{(i-1)\Delta t}\bigr)\bigr].
\]
One could then use an appropriate form of Taylor's formula at the order 2 for functions of probability measures and expand each difference in the above summation. Since the remainder terms are expected to be smaller than the step size $\Delta t$, one should be able to derive the chain rule by collecting the terms and letting $\Delta t$ tend to 0. This strategy fits the original proof of Itô's formula in classical stochastic differential calculus.
A different strategy consists in another approximation of the Itô dynamics. Instead of discretizing in time as in the previous approach, it is tempting to reduce the space dimension by approximating the flow $\mu = (\mu_t)_{t\ge0}$ by a flow of empirical measures:
\[
\bar\mu_t^N = \frac1N\sum_{\ell=1}^N\delta_{X_t^\ell}, \qquad t \ge 0,
\]
for $N \ge 1$, where $X^1 = (X_t^1)_{t\ge0}, \dots, X^N = (X_t^N)_{t\ge0}$ are $N$ independent copies of $X = (X_t)_{t\ge0}$ (constructed on an appropriate probability space $(\Omega,\mathcal{F},\mathbb{P})$). Using the empirical projection of $u$ defined as the real valued function $u^N$ on $\mathbb{R}^{Nd}$ by:
\[
u^N\bigl(x^1,\dots,x^N\bigr) = u\Bigl(\frac1N\sum_{\ell=1}^N\delta_{x^\ell}\Bigr), \qquad (5.79)
\]
the strategy is then to expand uN .Xt1 ; ; XtN / using the standard version of Itô’s
formula in finite dimension, and try to control the limit when N tends to infinity.
Obviously, we should recover the same chain rule as the one obtained by the first
approach.
Whatever the strategy, it is necessary to pay special attention to the regularity
conditions needed to expand .u.t //t>0 infinitesimally. As evidenced by (5.79), we
may expect that, not only, u has to be once differentiable in the measure argument,
but also to have second order derivatives in order to allow for the application of Itô’s
formula to the empirical projection uN . For instance, it would be quite tempting to
require the lifting uQ to be twice (continuously) Fréchet differentiable. However, as
we show in Remark 5.80 below, this is a very restrictive condition for our purpose.
We shall spend quite a bit of time in the next subsections identifying the right notion
of second order differentiability needed to perform the desired chain rule expansion.
See Remark 5.81 for a first account.
use the quadratic form notation to denote the second order directional derivatives $D^2\tilde u(X)[Y,Z]$ in the directions $Y$ and $Z$ of $L^2(\Omega,\mathcal{F},\mathbb{P};\mathbb{R}^d)$.
However, this notion of differentiability can have serious shortcomings for some of our purposes. Indeed, as we are about to show, there exist smooth functions $h : \mathbb{R}^d \to \mathbb{R}$ with compact supports for which the function $\tilde u : L^2(\Omega,\mathcal{F},\mathbb{P};\mathbb{R}^d) \ni X \mapsto \mathbb{E}[h(X)]$ is not twice continuously Fréchet differentiable. For such a $\tilde u$, we already know that $D\tilde u(X) = \partial h(X)$. Therefore, for any $Y \in L^2(\Omega,\mathcal{F},\mathbb{P};\mathbb{R}^d)$,
Remark 5.81 The first strategy mentioned above may appear to be most natural as
it mimics the proof of the standard Itô formula. However, the second approach seems
to be right in line with our desire to apply our tools to models of large populations
of individuals interacting through empirical measures. Indeed, the new perspective
provided by the second strategy of proof enlightens the choice we made for a form
of differential calculus on the space of probability measures. In any case, both
strategies require some smoothness conditions on u. As we just accounted for, u must
be twice differentiable (in some sense) in both cases. However, the strategy based
on approximations by empirical projections ends up being less demanding in terms
of assumptions. Indeed, by taking full advantage of the finite dimensional stochastic
calculus chain rule, it allows us to apply standard finite dimensional mollification
arguments, and in so doing, weaken the smoothness conditions required on the
coefficients. In particular, this approach works under a weak notion of second order
differentiability. See Theorem 5.99 below.
We first establish the chain rule for $(u(\mu_t))_{t\ge0}$ when $u$ satisfies a strong notion of $C^2$-regularity. We enumerate the properties we require for this notion to hold.
1. Under (A1), there exists one and only one version of $\partial_\mu u(\mu)(\cdot) \in L^2(\mathbb{R}^d,\mu;\mathbb{R}^d)$ for each $\mu \in \mathcal{P}_2(\mathbb{R}^d)$ such that the mapping $\mathcal{P}_2(\mathbb{R}^d)\times\mathbb{R}^d \ni (\mu,v) \mapsto \partial_\mu u(\mu)(v) \in \mathbb{R}^d$ is jointly continuous.
2. Under (A2), there exists one and only one version of $\partial_\mu u(\mu)(\cdot)$ for each $\mu \in \mathcal{P}_2(\mathbb{R}^d)$ such that $\mathbb{R}^d \ni v \mapsto \partial_\mu u(\mu)(v)$ is differentiable for each $\mu \in \mathcal{P}_2(\mathbb{R}^d)$ and the mapping $\mathcal{P}_2(\mathbb{R}^d)\times\mathbb{R}^d \ni (\mu,v) \mapsto \partial_v\partial_\mu u(\mu)(v)$ is jointly continuous. In particular, the values of the derivatives $\partial_\mu u(\mu)(v)$ and $\partial_v\partial_\mu u(\mu)(v)$, for $\mu \in \mathcal{P}_2(\mathbb{R}^d)$ and $v \in \mathbb{R}^d$, are uniquely determined.
3. Under (A3), there exists one and only one continuous version of $\partial_\mu u(\mu)(\cdot)$ for each $\mu \in \mathcal{P}_2(\mathbb{R}^d)$ such that for each fixed $v \in \mathbb{R}^d$, the mapping $\mathcal{P}_2(\mathbb{R}^d) \ni \mu \mapsto \partial_\mu u(\mu)(v)$ is L-continuously differentiable and the derivative $\mathcal{P}_2(\mathbb{R}^d)\times\mathbb{R}^d\times\mathbb{R}^d \ni (\mu,v,v') \mapsto \partial_\mu^2 u(\mu)(v,v')$ is jointly continuous. Also, the values of $\partial_\mu u(\mu)(v)$ and $\partial_\mu^2 u(\mu)(v,v')$ are uniquely determined.
Proof.
First Step. The proof of the first claim in Remark 5.82 is straightforward. When the support of $\mu$ is the entire $\mathbb{R}^d$, there exists at most one continuous version of the mapping $\mathbb{R}^d \ni v \mapsto \partial_\mu u(\mu)(v) \in \mathbb{R}^d$. By approximating any $\mu \in \mathcal{P}_2(\mathbb{R}^d)$ by a sequence of probability measures with full supports, we deduce that $\partial_\mu u(\mu)(\cdot)$ is also uniquely determined when the mapping $(\mu,v) \mapsto \partial_\mu u(\mu)(v)$ is jointly continuous, as required in (A1).
Second Step. The proof of the second claim is pretty similar. By the same argument as above, we observe that the mapping $(\mu,v) \mapsto \partial_v\partial_\mu u(\mu)(v)$ is uniquely determined under (A2). Then, we use the fact that for any $\mu \in \mathcal{P}_2(\mathbb{R}^d)$, any $v_0 \in \mathrm{Supp}(\mu)$ and any $v \in \mathbb{R}^d$,
\[
\partial_\mu u(\mu)(v) = \partial_\mu u(\mu)(v_0) + \int_0^1\partial_v\partial_\mu u(\mu)\bigl(tv + (1-t)v_0\bigr)\bigl(v - v_0\bigr)\,dt.
\]
The second term in the right-hand side is uniquely determined. Since $\partial_\mu u(\mu)(\cdot)$ is differentiable, it is also continuous. Hence, the value of $\partial_\mu u(\mu)(v_0)$ is also uniquely determined since $v_0$ belongs to the support of $\mu$. As a result, the left-hand side is uniquely determined under (A2).
Third Step. Proceeding as in the first two steps, we deduce that $(\mu,v,v') \mapsto \partial_\mu^2 u(\mu)(v,v')$ is uniquely determined under (A3). Then, we use the fact that for any $v \in \mathbb{R}^d$ and any $\mu \in \mathcal{P}_2(\mathbb{R}^d)$,
\[
\partial_\mu u\bigl(\mu \ast \mathcal{N}_d(0,I_d)\bigr)(v) - \partial_\mu u(\mu)(v) = \int_0^1\mathbb{E}\bigl[\partial_\mu^2 u\bigl(\mathcal{L}(X + tZ)\bigr)\bigl(v,\,X + tZ\bigr)\,Z\bigr]\,dt,
\]
where $\mathcal{N}_d(0,I_d)$ denotes the $d$-dimensional Gaussian distribution with $I_d$ as covariance matrix and $X$ and $Z$ are two independent random variables with values in $\mathbb{R}^d$, with $X \sim \mu$ and $Z \sim \mathcal{N}_d(0,I_d)$. In the above equality, the right-hand side is uniquely determined while the first term in the left-hand side is also uniquely determined since $\mu \ast \mathcal{N}_d(0,I_d)$ has the entire $\mathbb{R}^d$ as support. We easily deduce that $\partial_\mu u(\mu)(v)$ is uniquely determined. $\square$
Notice that when the first derivative $D\tilde u$ exists and is Lipschitz, Proposition 5.36 (see also Corollary 5.38) guarantees the existence of a version $\mathcal{P}_2(\mathbb{R}^d)\times\mathbb{R}^d \ni (\mu,v) \mapsto \partial_\mu u(\mu)(v)$ which is Lipschitz in $v$ uniformly in $\mu$. And if this function is differentiable in $\mu$ with a Lipschitz derivative, the same Proposition 5.36 guarantees the existence of a regular version of the second derivative.
However, neither Proposition 5.36 nor Corollary 5.38 ensure the existence of a jointly continuous version of $\partial_\mu u$ (and a fortiori of $\partial_v\partial_\mu u$ or of $\partial_\mu^2 u$). Indeed, Corollary 5.38 just provides sufficient conditions ensuring that $\partial_\mu u$ is jointly continuous at any point $(\mu,v) \in \mathcal{P}_2(\mathbb{R}^d)\times\mathbb{R}^d$ such that $v$ is in the support of $\mu$. In
this regard, only Lemma 5.41 may be useful. So, the regularity conditions required
in the three bullet points of assumption Full C 2 Regularity above appear as very
strong. Part of the objectives of this section will be precisely to relax them.
Remark 5.84 As announced in Subsection 5.3.4, we used the letter v (and not the
more common letter x) in order to denote the Euclidean variable in the derivative
@ u. This is especially useful when u is defined on the larger space Rd P2 .Rd / (as
it will be often the case below) and thus reads u W Rd P2 .Rd / 3 .x; / 7! u.x; /:
In that case, our convention permits to distinguish @x @ u.x; /.v/ (which is the
partial derivative of @ u with respect to the original Euclidean variable x) from
@v @ u.x; /.v/ (which is the partial derivative of @ u with respect to the auxiliary
Euclidean variable appearing in the L-derivative).
\[
\partial_\mu^2 u(\mu)(v,v')\,y = \Bigl(\sum_{j=1}^d\partial_\mu\bigl[\partial_\mu u(\mu)_i\bigr](v)_j(v')\,y_j\Bigr)_{1\le i\le d} \in \mathbb{R}^d,
\qquad (5.80)
\]
\[
\partial_v\partial_\mu u(\mu)(v)\cdot\bigl(y\otimes z\bigr) = \sum_{i,j=1}^d\partial_{v_j}\bigl[\partial_\mu u(\mu)_i\bigr](v)\,z_j\,y_i \in \mathbb{R},
\]
\[
\partial_\mu^2 u(\mu)(v,v')\cdot\bigl(y\otimes z\bigr) = \sum_{i,j=1}^d\partial_\mu\bigl[\partial_\mu u(\mu)_i\bigr](v)_j(v')\,z_j\,y_i \in \mathbb{R}.
\]
\[
\partial_v\partial_\mu u(\mu)(v)\cdot a = \sum_{i,j=1}^d\partial_{v_j}\bigl[\partial_\mu u(\mu)_i\bigr](v)\,a_{i,j} = \operatorname{trace}\bigl\{\partial_v\partial_\mu u(\mu)(v)\,a\bigr\},
\]
\[
\partial_\mu^2 u(\mu)(v,v')\cdot a = \sum_{i,j=1}^d\partial_\mu\bigl[\partial_\mu u(\mu)_i\bigr](v)_j(v')\,a_{i,j} = \operatorname{trace}\bigl\{\partial_\mu^2 u(\mu)(v,v')\,a\bigr\}.
\]
Proposition 5.85 Assume that $u$ is fully $C^2$ and satisfies, for any compact subset $K \subset \mathcal{P}_2(\mathbb{R}^d)$:
\[
\sup_{\mu\in K}\,\sup_{v\in\mathbb{R}^d}\,\Bigl[\bigl|\partial_v\partial_\mu u(\mu)(v)\bigr|^2 + \int_{\mathbb{R}^d}\bigl|\partial_\mu^2 u(\mu)(v,v')\bigr|^2\,d\mu(v')\Bigr] < +\infty.
\]
is differentiable with:
\[
\frac{d}{dt}D\tilde u(X + tY)\Big|_{t=0}
= \partial_v\partial_\mu u\bigl(\mathcal{L}(X)\bigr)(X)\,Y + \int_{\mathbb{R}^d\times\mathbb{R}^d}\partial_\mu^2 u\bigl(\mathcal{L}(X)\bigr)(X,x)\,y\;d\mathbb{P}_{(X,Y)}(x,y),
\]
the right-hand side being linear in the variable $Y$ and defining a continuous function from $L^2(\Omega,\mathcal{F},\mathbb{P};\mathbb{R}^d)\times L^2(\Omega,\mathcal{F},\mathbb{P};\mathbb{R}^d)$ into $L^2(\Omega,\mathcal{F},\mathbb{P};\mathbb{R}^d)$.
Remark 5.86 Observe that the linearity of the Gâteaux derivative in the direction $Y$ is best seen if we use a copy $(\tilde\Omega,\tilde{\mathcal{F}},\tilde{\mathbb{P}})$ of the space $(\Omega,\mathcal{F},\mathbb{P})$ and we write:
\[
\frac{d}{dt}D\tilde u(X + tY)\Big|_{t=0}
= \partial_v\partial_\mu u\bigl(\mathcal{L}(X)\bigr)(X)\,Y + \tilde{\mathbb{E}}\bigl[\partial_\mu^2 u\bigl(\mathcal{L}(X)\bigr)(X,\tilde X)\,\tilde Y\bigr],
\]
where, by convention, $\tilde X$ and $\tilde Y$ are copies of $X$ and $Y$ on $(\tilde\Omega,\tilde{\mathcal{F}},\tilde{\mathbb{P}})$.
\[
\frac{d}{dt}D\tilde u(X + tY)
= \partial_v\partial_\mu u\bigl(\mathcal{L}(X + tY)\bigr)(X + tY)\,Y + \tilde{\mathbb{E}}\bigl[\partial_\mu^2 u\bigl(\mathcal{L}(X + tY)\bigr)\bigl(X + tY,\,\tilde X + t\tilde Y\bigr)\,\tilde Y\bigr].
\]
Under the growth conditions assumed in the statement of the proposition, the right-hand side is in $L^2(\Omega,\mathcal{F},\mathbb{P};\mathbb{R}^d)$.
In order to prove that differentiability holds in $L^2(\Omega,\mathcal{F},\mathbb{P};\mathbb{R}^d)$, it suffices to prove that the mapping:
\[
\bigl[L^2(\Omega,\mathcal{F},\mathbb{P};\mathbb{R}^d)\bigr]^2 \ni (X,Y) \mapsto
\partial_v\partial_\mu u\bigl(\mathcal{L}(X)\bigr)(X)\,Y + \tilde{\mathbb{E}}\bigl[\partial_\mu^2 u\bigl(\mathcal{L}(X)\bigr)(X,\tilde X)\,\tilde Y\bigr] \in L^2(\Omega,\mathcal{F},\mathbb{P};\mathbb{R}^d)
\]
is continuous. For two pairs $(X,Y)$ and $(X',Y')$ in $[L^2(\Omega,\mathcal{F},\mathbb{P};\mathbb{R}^d)]^2$, we have:
\[
\begin{aligned}
&\partial_v\partial_\mu u\bigl(\mathcal{L}(X')\bigr)(X')\,Y' + \tilde{\mathbb{E}}\bigl[\partial_\mu^2 u\bigl(\mathcal{L}(X')\bigr)(X',\tilde X')\,\tilde Y'\bigr]
- \partial_v\partial_\mu u\bigl(\mathcal{L}(X)\bigr)(X)\,Y - \tilde{\mathbb{E}}\bigl[\partial_\mu^2 u\bigl(\mathcal{L}(X)\bigr)(X,\tilde X)\,\tilde Y\bigr]\\
&= \Bigl\{\partial_v\partial_\mu u\bigl(\mathcal{L}(X')\bigr)(X')\bigl(Y' - Y\bigr) + \tilde{\mathbb{E}}\bigl[\partial_\mu^2 u\bigl(\mathcal{L}(X')\bigr)(X',\tilde X')\bigl(\tilde Y' - \tilde Y\bigr)\bigr]\Bigr\}\\
&\quad+ \Bigl\{\bigl[\partial_v\partial_\mu u\bigl(\mathcal{L}(X')\bigr)(X') - \partial_v\partial_\mu u\bigl(\mathcal{L}(X)\bigr)(X)\bigr]Y
+ \tilde{\mathbb{E}}\Bigl[\bigl[\partial_\mu^2 u\bigl(\mathcal{L}(X')\bigr)(X',\tilde X') - \partial_\mu^2 u\bigl(\mathcal{L}(X)\bigr)(X,\tilde X)\bigr]\tilde Y\Bigr]\Bigr\}\\
&= (i) + (ii).
\end{aligned}
\]
In order to complete the proof, we notice that from the Cauchy-Schwarz inequality and from the a priori bound on $\partial_\mu^2 u$, we have:
\[
\mathbb{E}\Bigl[\Bigl|\tilde{\mathbb{E}}\Bigl[\bigl(\partial_\mu^2 u\bigl(\mathcal{L}(X')\bigr)(X',\tilde X') - \partial_\mu^2 u\bigl(\mathcal{L}(X)\bigr)(X,\tilde X)\bigr)\,\tilde Y\,\mathbf{1}_{\{|X'|+|\tilde X'|+|X|+|\tilde X|\ge R\}}\Bigr]\Bigr|^2\Bigr]
\le C\,\mathbb{E}\tilde{\mathbb{E}}\Bigl[|\tilde Y|^2\,\mathbf{1}_{\{|X'|+|\tilde X'|+|X|+|\tilde X|\ge R\}}\Bigr],
\]
Proposition 5.87 Assume that $u$ is fully $C^2$ and satisfies, for any compact subset $K \subset \mathcal{P}_2(\mathbb{R}^d)$,
\[
\sup_{\mu\in K}\,\Bigl[\int_{\mathbb{R}^d}\bigl|\partial_v\partial_\mu u(\mu)(v)\bigr|^2\,d\mu(v) + \int_{\mathbb{R}^d}\int_{\mathbb{R}^d}\bigl|\partial_\mu^2 u(\mu)(v,v')\bigr|^2\,d\mu(v)\,d\mu(v')\Bigr] < \infty.
\]
\[
\mathbb{R}^2 \ni (s,t) \mapsto \tilde u\bigl(X + sY + tZ\bigr),
\]
\[
\begin{aligned}
\frac{\partial^2}{\partial s\,\partial t}\tilde u\bigl(X + sY + tZ\bigr)\Big|_{(s,t)=(0,0)}
&= \frac{\partial}{\partial s}\Bigl\{\mathbb{E}\bigl[\partial_\mu u\bigl(\mathcal{L}(X + sY)\bigr)\bigl(X + sY\bigr)\cdot Z\bigr]\Bigr\}\Big|_{s=0}\\
&= \mathbb{E}\bigl[\partial_v\partial_\mu u\bigl(\mathcal{L}(X)\bigr)(X)\cdot\bigl(Z\otimes Y\bigr)\bigr]
+ \mathbb{E}\tilde{\mathbb{E}}\bigl[\partial_\mu^2 u\bigl(\mathcal{L}(X)\bigr)(X,\tilde X)\cdot\bigl(Z\otimes\tilde Y\bigr)\bigr],
\end{aligned}
\]
where, as usual, we use the tilde notation to denote copies of the various objects at hand.
Remark 5.88 Notice that, in the statement of Proposition 5.87, the variables $Y$ and $Z$ are required to be in $L^\infty(\Omega,\mathcal{F},\mathbb{P};\mathbb{R}^d)$. Obviously, this requirement is stronger than the condition used so far for defining the Gâteaux and Fréchet derivatives of the lifting of $u$, which have been computed along directions in $L^2(\Omega,\mathcal{F},\mathbb{P};\mathbb{R}^d)$. Observe also that we here address the differentiability of $\mathbb{R} \ni s \mapsto \mathbb{E}[D\tilde u(X + sY)(X + sY)\cdot Z] \in \mathbb{R}$. This is in contrast with the statement of Proposition 5.85, in which we addressed the differentiability of the mapping $\mathbb{R} \ni s \mapsto D\tilde u(X + sY)(X + sY) \in L^2(\Omega,\mathcal{F},\mathbb{P};\mathbb{R}^d)$.
Choosing $Y$ of the form $\varepsilon\varphi(X)$ and $Z$ of the form $\varepsilon\psi(X)$, for two bounded Borel measurable functions $\varphi, \psi$ from $\mathbb{R}^d$ to $\mathbb{R}^d$, with $\mathbb{P}[\varepsilon = 1] = \mathbb{P}[\varepsilon = -1] = 1/2$ and $\varepsilon$ independent of $X$, we get rid of the second-order derivatives $\partial_\mu^2 u$ in the above identity, so that:
\[
\mathbb{E}\bigl[\partial_v\partial_\mu u\bigl(\mathcal{L}(X)\bigr)(X)\cdot\bigl(\psi(X)\otimes\varphi(X)\bigr)\bigr]
= \mathbb{E}\bigl[\partial_v\partial_\mu u\bigl(\mathcal{L}(X)\bigr)(X)\cdot\bigl(\varphi(X)\otimes\psi(X)\bigr)\bigr],
\]
from which we deduce that $\partial_v\partial_\mu u(\mathcal{L}(X))(X)$ takes values in the set of symmetric matrices of size $d$. By continuity of $\partial_\mu u(\mu)(\cdot)$ in the variable $v$, where $\mu = \mathcal{L}(X)$, this shows that $\partial_v\partial_\mu u(\mu)(v)$ is a symmetric matrix for any $v \in \mathbb{R}^d$ when the support of $\mu$ is the entire $\mathbb{R}^d$. By continuity in $\mu$, we deduce that $\partial_v\partial_\mu u(\mu)(v)$ is a symmetric matrix for any $\mu \in \mathcal{P}_2(\mathbb{R}^d)$ and any $v \in \mathbb{R}^d$.
Choosing $Y$ and $Z$ of the form $\varphi(X)$ and $\psi(X)$ respectively, for two bounded Borel measurable functions from $\mathbb{R}^d$ to $\mathbb{R}^d$, we get, making use of the symmetry of $\partial_v\partial_\mu u$:
\[
\begin{aligned}
\mathbb{E}\tilde{\mathbb{E}}\bigl[\partial_\mu^2 u\bigl(\mathcal{L}(X)\bigr)(X,\tilde X)\cdot\bigl(\psi(X)\otimes\varphi(\tilde X)\bigr)\bigr]
&= \mathbb{E}\tilde{\mathbb{E}}\bigl[\partial_\mu^2 u\bigl(\mathcal{L}(X)\bigr)(X,\tilde X)\cdot\bigl(\varphi(X)\otimes\psi(\tilde X)\bigr)\bigr]\\
&= \mathbb{E}\tilde{\mathbb{E}}\bigl[\partial_\mu^2 u\bigl(\mathcal{L}(X)\bigr)(\tilde X,X)\cdot\bigl(\varphi(\tilde X)\otimes\psi(X)\bigr)\bigr]\\
&= \mathbb{E}\tilde{\mathbb{E}}\bigl[\bigl(\partial_\mu^2 u\bigl(\mathcal{L}(X)\bigr)(\tilde X,X)\bigr)^{\dagger}\cdot\bigl(\psi(X)\otimes\varphi(\tilde X)\bigr)\bigr],
\end{aligned}
\]
from which we deduce that $\partial_\mu^2 u(\mathcal{L}(X))(X,\tilde X) = \bigl(\partial_\mu^2 u(\mathcal{L}(X))(\tilde X,X)\bigr)^{\dagger}$. By the same argument as above, we conclude that $\partial_\mu^2 u(\mu)(v,v') = \bigl(\partial_\mu^2 u(\mu)(v',v)\bigr)^{\dagger}$, for any $v,v' \in \mathbb{R}^d$ and $\mu \in \mathcal{P}_2(\mathbb{R}^d)$.
The following corollary summarizes our discussion.
Corollary 5.89 Assume that the function $u$ is fully $C^2$. Then, for any $\mu \in \mathcal{P}_2(\mathbb{R}^d)$ and any $v,v' \in \mathbb{R}^d$, $\partial_v\partial_\mu u(\mu)(v)$ is a symmetric matrix and $\partial_\mu^2 u(\mu)(v,v') = \bigl(\partial_\mu^2 u(\mu)(v',v)\bigr)^{\dagger}$.
Proof. The proof is essentially identical to the above argument except for the integrability assumptions on $\partial_v\partial_\mu u$ and $\partial_\mu^2 u$. However, we can repeat the argument when $\mu$ has a bounded support. Indeed, by continuity, $\partial_v\partial_\mu u(\mu)$ and $\partial_\mu^2 u(\mu)$ are bounded on the support of $\mu$ whenever it is bounded, and the bounds remain uniform as long as the support of $\mu$ remains included in a prescribed bounded subset of $\mathbb{R}^d$.
This shows that the symmetry relationships hold on the support of $\mu$ when the latter is bounded. By approximating any probability measure with $\mathbb{R}^d$ as support by a sequence of probability measures with bounded supports, using the continuity of the derivatives, we show that the symmetry relationships hold on the whole $\mathbb{R}^d$ when the support of $\mu$ is the whole $\mathbb{R}^d$. By continuity again, we complete the proof as above in the case when the support of $\mu$ is merely a subset of $\mathbb{R}^d$. $\square$
Remark 5.90 Symmetry of $\partial_v\partial_\mu u$ should not come as a surprise since Lemma 5.61 asserts that $\partial_\mu u$ is somehow a gradient.
Proposition 5.91 Assume that $u$ is fully $C^2$. Then, for any $N \ge 1$, the empirical projection $u^N$ is $C^2$ on $(\mathbb{R}^d)^N$ and, for all $x^1,\dots,x^N \in \mathbb{R}^d$,
\[
\partial^2_{x^i x^j}u^N\bigl(x^1,\dots,x^N\bigr)
= \frac1N\,\partial_v\partial_\mu u\Bigl(\frac1N\sum_{\ell=1}^N\delta_{x^\ell}\Bigr)(x^i)\,\mathbf{1}_{i=j}
+ \frac1{N^2}\,\partial_\mu^2 u\Bigl(\frac1N\sum_{\ell=1}^N\delta_{x^\ell}\Bigr)(x^i,x^j).
\qquad (5.81)
\]
Proof. The proof piggybacks on the computation of the first order derivatives given in Proposition 5.35. When $i \ne j$, (5.81) can be obtained by applying the result of Proposition 5.35 twice. A modicum of care is required when $i = j$ as differentiability is performed simultaneously in the directions of $\mu$ and $v$ in the first order derivative $\partial_\mu u(\mu)(v)$. However, the assumption of joint continuity of the second-order derivatives $\partial_v\partial_\mu u(\mu)(v)$ and $\partial_\mu^2 u(\mu)(v,v')$ can be used to handle this minor difficulty. $\square$
where $W = (W_t)_{t\ge0}$ is an $\mathbb{F}$-Brownian motion with values in $\mathbb{R}^d$, and $(b_t)_{t\ge0}$ and $(\sigma_t)_{t\ge0}$ are $\mathbb{F}$-progressively measurable processes with values in $\mathbb{R}^d$ and $\mathbb{R}^{d\times d}$ respectively. We assume that they satisfy:
\[
\forall T > 0, \qquad \mathbb{E}\int_0^T\bigl[|b_t|^2 + |\sigma_t|^4\bigr]\,dt < +\infty. \qquad (5.83)
\]
The main result of this section is the form of Itô's formula given by the following chain rule.
Theorem 5.92 Under the above conditions, in particular assuming that (5.82) and (5.83) hold, let us further assume that $u$ is fully $C^2$ and that, for any compact subset $K \subset \mathcal{P}_2(\mathbb{R}^d)$,
\[
\sup_{\mu\in K}\int_{\mathbb{R}^d}\bigl|\partial_v\partial_\mu u(\mu)(v)\bigr|^2\,d\mu(v) < +\infty. \qquad (5.84)
\]
Then, if for any $t \ge 0$ we denote by $\mu_t$ the marginal distribution $\mu_t = \mathcal{L}(X_t)$ and we let $a_t = \sigma_t\sigma_t^{\dagger}$, it holds that:
\[
u(\mu_t) = u(\mu_0) + \int_0^t\mathbb{E}\bigl[\partial_\mu u(\mu_s)(X_s)\cdot b_s\bigr]\,ds
+ \frac12\int_0^t\mathbb{E}\bigl[\partial_v\partial_\mu u(\mu_s)(X_s)\cdot a_s\bigr]\,ds. \qquad (5.85)
\]
Then, $u \star \varphi$ is fully $C^2$. Moreover, $u \star \varphi$ and its first and second order derivatives are bounded and uniformly continuous on the whole space.
\[
\partial_\mu\bigl(u\star\varphi\bigr)(\mu)(v)
= \Bigl[\sum_{k=1}^d\partial_\mu u\bigl(\mu\circ\varphi^{-1}\bigr)_k\bigl(\varphi(v)\bigr)\frac{\partial\varphi_k}{\partial v_i}(v)\Bigr]_{i=1,\dots,d},
\]
\[
\partial_\mu^2\bigl(u\star\varphi\bigr)(\mu)(v,v')
= \Bigl[\sum_{k,\ell=1}^d\partial_\mu^2 u\bigl(\mu\circ\varphi^{-1}\bigr)_{k,\ell}\bigl(\varphi(v),\varphi(v')\bigr)\frac{\partial\varphi_k}{\partial v_i}(v)\frac{\partial\varphi_\ell}{\partial v_j}(v')\Bigr]_{i,j=1,\dots,d},
\qquad (5.87)
\]
\[
\partial_v\partial_\mu\bigl(u\star\varphi\bigr)(\mu)(v)
= \Bigl[\sum_{k=1}^d\partial_\mu u\bigl(\mu\circ\varphi^{-1}\bigr)_k\bigl(\varphi(v)\bigr)\frac{\partial^2\varphi_k}{\partial v_i\partial v_j}(v)
+ \sum_{k,\ell=1}^d\partial_v\partial_\mu u\bigl(\mu\circ\varphi^{-1}\bigr)_{k,\ell}\bigl(\varphi(v)\bigr)\frac{\partial\varphi_k}{\partial v_i}(v)\frac{\partial\varphi_\ell}{\partial v_j}(v)\Bigr]_{i,j=1,\dots,d}.
\]
Lemma 5.95 Assume that the chain rule holds for any fully $C^2$ function $u$ with bounded and uniformly continuous (with respect to the space and measure arguments) first and second order derivatives. Then, it holds for any function $u$ satisfying the assumptions of Theorem 5.92.
Proof. Assume that the chain rule has been proved for any bounded and uniformly continuous function $u$ with bounded and uniformly continuous derivatives of orders 1 and 2. Then, for $u$ satisfying the assumption of Theorem 5.92, we can apply the chain rule to $u \star \varphi$, for any $\varphi$ as in the statement of Lemma 5.94. In particular, we can apply the chain rule to $u \star \varphi_n$ for any $n \ge 1$, where $(\varphi_n)_{n\ge1}$ is a sequence of compactly supported smooth functions such that $(\varphi_n, \partial\varphi_n, \partial^2\varphi_n)(v) \to (v, I_d, 0)$ uniformly on compact sets as $n \to \infty$, where $I_d$ denotes the identity matrix of size $d$ and $0$ is here the zero of $\mathbb{R}^{d\times d}$. In order to pass to the limit in the chain rule (5.85), the only thing we need to do is to verify some almost sure (or pointwise) convergence in the underlying expectations, and to check that the relevant uniform integrability arguments can be used.
Without any loss of generality, we can assume that there exists a constant $C$ such that $|\varphi_n(v)| \le C|v|$, $|\partial\varphi_n(v)| \le C$ and $|\partial^2\varphi_n(v)| \le C$ for any $n \ge 1$ and $v \in \mathbb{R}^d$, and that $\varphi_n(v) = v$ for any $n \ge 1$ and $v$ with $|v| \le n$. Then, for any $\mu \in \mathcal{P}_2(\mathbb{R}^d)$ and any random variable $X$ with $\mu$ as distribution, it holds:
\[
W_2\bigl(\mu\circ\varphi_n^{-1},\,\mu\bigr)^2 \le \mathbb{E}\bigl[|\varphi_n(X) - X|^2\,\mathbf{1}_{\{|X|\ge n\}}\bigr] \le C\,\mathbb{E}\bigl[|X|^2\,\mathbf{1}_{\{|X|\ge n\}}\bigr],
\]
which follows directly from (5.84) and (5.86), noticing that the sequence $(\mu\circ\varphi_n^{-1})_{n\ge1}$ lives in a compact subset of $\mathcal{P}_2(\mathbb{R}^d)$ as it is convergent.
By (5.88) and (5.89) and by a standard uniform integrability argument, we deduce that, for any $t \ge 0$ and any $s \in [0,t]$ such that $\mathbb{E}[|b_s|^2 + |\sigma_s|^4] < \infty$,
\[
\lim_{n\to+\infty}\mathbb{E}\bigl[\partial_\mu\bigl[u\star\varphi_n\bigr]\bigl(\mathcal{L}(X)\bigr)(X)\cdot b_s\bigr] = \mathbb{E}\bigl[\partial_\mu u\bigl(\mathcal{L}(X)\bigr)(X)\cdot b_s\bigr],
\]
\[
\lim_{n\to+\infty}\mathbb{E}\bigl[\partial_v\partial_\mu\bigl[u\star\varphi_n\bigr]\bigl(\mathcal{L}(X)\bigr)(X)\cdot a_s\bigr] = \mathbb{E}\bigl[\partial_v\partial_\mu u\bigl(\mathcal{L}(X)\bigr)(X)\cdot a_s\bigr].
\]
Recall that the above is true for any $\mu \in \mathcal{P}_2(\mathbb{R}^d)$ and any $X \in L^2(\Omega,\mathcal{F},\mathbb{P};\mathbb{R}^d)$ with $\mu$ as distribution. We then choose $X = X_s$ in the above limits. As the bound $\mathbb{E}[|b_s|^2 + |\sigma_s|^4] < \infty$ is satisfied for almost every $s \in [0,t]$, we can pass to the limit inside the integrals appearing in the chain rule applied to each of the $(u\star\varphi_n)_{n\ge1}$. In order to pass to the limit in the chain rule itself, we must exchange the pathwise limit which holds for almost every $s \in [0,t]$ and the integral with respect to $s$. The argument is the same as in (5.89). Indeed, since the flow of measures $(\mathcal{L}(X_s))_{0\le s\le t}$ is continuous for the 2-Wasserstein distance, the family of measures $((\mathcal{L}(\varphi_n(X_s)))_{0\le s\le t})_{n\ge1}$ is relatively compact and thus:
\[
\sup_{n\ge1}\,\sup_{s\in[0,t]}\mathbb{E}\Bigl[\bigl|\partial_\mu\bigl[u\star\varphi_n\bigr]\bigl(\mathcal{L}(X_s)\bigr)(X_s)\bigr|^2 + \bigl|\partial_v\partial_\mu\bigl[u\star\varphi_n\bigr]\bigl(\mathcal{L}(X_s)\bigr)(X_s)\bigr|^2\Bigr] < \infty,
\]
We now turn to the proof of Theorem 5.92. We just give a sketch as a complete
proof will be given for a refined version in Theorem 5.99 later in this section.
Proof of Theorem 5.92. By Lemma 5.95, we can replace $u$ by $u \star \varphi$ for some compactly supported smooth function $\varphi$ or, equivalently, we can replace $(X_t)_{t\ge0}$ by $(\varphi(X_t))_{t\ge0}$. In other words, we can assume without any loss of generality that $u$ and its first and second order derivatives are bounded and uniformly continuous, and that $(X_t)_{t\ge0}$ is a bounded Itô process.
Repeating the proof of Lemma 5.95, we can even assume that $(b_t)_{t\ge0}$ and $(\sigma_t)_{t\ge0}$ are also bounded. Indeed, it suffices to prove the chain rule when $(X_t)_{t\ge0}$ is driven by truncated processes and pass to the limit along a sequence of truncations converging to $(X_t)_{t\ge0}$.
Let us denote by $((X_t^\ell)_{t\ge0})_{\ell\ge1}$ a sequence of i.i.d. copies of $(X_t)_{t\ge0}$. That is, for any $\ell \ge 1$,
\[
dX_t^\ell = b_t^\ell\,dt + \sigma_t^\ell\,dW_t^\ell, \qquad t \ge 0,
\]
where $((b_t^\ell,\sigma_t^\ell,W_t^\ell)_{t\ge0},X_0^\ell)_{\ell\ge1}$ are i.i.d. copies of $((b_t,\sigma_t,W_t)_{t\ge0},X_0)$ constructed on an extension of $(\Omega,\mathcal{F},\mathbb{P})$. Recalling the definition of the flow of marginal empirical measures:
\[
\bar\mu_t^N = \frac1N\sum_{\ell=1}^N\delta_{X_t^\ell},
\]
the classical Itô formula, together with Proposition 5.91, yields, $\mathbb{P}$-a.s., for any $t \ge 0$:
\[
\begin{aligned}
u^N\bigl(X_t^1,\dots,X_t^N\bigr) &= u^N\bigl(X_0^1,\dots,X_0^N\bigr)
+ \frac1N\sum_{\ell=1}^N\int_0^t\partial_\mu u\bigl(\bar\mu_s^N\bigr)(X_s^\ell)\cdot b_s^\ell\,ds
+ \frac1N\sum_{\ell=1}^N\int_0^t\partial_\mu u\bigl(\bar\mu_s^N\bigr)(X_s^\ell)\cdot\sigma_s^\ell\,dW_s^\ell\\
&\quad+ \frac1{2N}\sum_{\ell=1}^N\int_0^t\operatorname{trace}\bigl\{\partial_v\partial_\mu u\bigl(\bar\mu_s^N\bigr)(X_s^\ell)\,a_s^\ell\bigr\}\,ds
+ \frac1{2N^2}\sum_{\ell=1}^N\int_0^t\operatorname{trace}\bigl\{\partial_\mu^2 u\bigl(\bar\mu_s^N\bigr)(X_s^\ell,X_s^\ell)\,a_s^\ell\bigr\}\,ds,
\end{aligned}
\]
with $a_s^\ell = \sigma_s^\ell(\sigma_s^\ell)^{\dagger}$. We take expectations on both sides of this equality and obtain, using the fact that the stochastic integral has zero expectation thanks to the boundedness of the coefficients:
\[
\begin{aligned}
\mathbb{E}\bigl[u\bigl(\bar\mu_t^N\bigr)\bigr] &= \mathbb{E}\bigl[u\bigl(\bar\mu_0^N\bigr)\bigr]
+ \frac1N\sum_{\ell=1}^N\mathbb{E}\int_0^t\partial_\mu u\bigl(\bar\mu_s^N\bigr)(X_s^\ell)\cdot b_s^\ell\,ds\\
&\quad+ \frac1{2N}\sum_{\ell=1}^N\mathbb{E}\int_0^t\operatorname{trace}\bigl[\partial_v\partial_\mu u\bigl(\bar\mu_s^N\bigr)(X_s^\ell)\,a_s^\ell\bigr]\,ds
+ \frac1{2N^2}\sum_{\ell=1}^N\mathbb{E}\int_0^t\operatorname{trace}\bigl[\partial_\mu^2 u\bigl(\bar\mu_s^N\bigr)(X_s^\ell,X_s^\ell)\,a_s^\ell\bigr]\,ds.
\end{aligned}
\]
All the expectations are finite, thanks to the boundedness of the coefficients. Using the fact that the processes $((a_s^\ell,b_s^\ell,X_s^\ell)_{0\le s\le t})_{\ell\in\{1,\dots,N\}}$ are i.i.d., we deduce that:
\[
\begin{aligned}
\mathbb{E}\bigl[u\bigl(\bar\mu_t^N\bigr)\bigr] &= \mathbb{E}\bigl[u\bigl(\bar\mu_0^N\bigr)\bigr]
+ \int_0^t\mathbb{E}\bigl\{\partial_\mu u\bigl(\bar\mu_s^N\bigr)(X_s^1)\cdot b_s^1\bigr\}\,ds
+ \frac12\int_0^t\mathbb{E}\bigl\{\operatorname{trace}\bigl[\partial_v\partial_\mu u\bigl(\bar\mu_s^N\bigr)(X_s^1)\,a_s^1\bigr]\bigr\}\,ds\\
&\quad+ \frac1{2N}\int_0^t\mathbb{E}\bigl\{\operatorname{trace}\bigl[\partial_\mu^2 u\bigl(\bar\mu_s^N\bigr)(X_s^1,X_s^1)\,a_s^1\bigr]\bigr\}\,ds\\
&= \mathbb{E}\bigl[u\bigl(\bar\mu_0^N\bigr)\bigr] + (i) + (ii) + (iii).
\end{aligned}
\]
This implies, together with the continuity of $u$ with respect to the distance $W_2$, that $\mathbb{E}[u(\bar\mu_t^N)]$ (respectively $\mathbb{E}[u(\bar\mu_0^N)]$) converges to $u(\mu_t)$ (respectively $u(\mu_0)$). Combining the boundedness and the uniform continuity of $\partial_\mu u$ on $\mathcal{P}_2(\mathbb{R}^d)\times\mathbb{R}^d$ with the boundedness of $(b_s)_{0\le s\le t}$, we prove in the same way that $(i)$ converges to the first integral appearing in the right-hand side of (5.85). Similar arguments lead to the convergence of $(ii)$, while $(iii)$, which carries the prefactor $1/(2N)$, vanishes as $N$ tends to infinity. $\square$
One of the most remarkable features of equation (5.85) is the fact that the second order derivative $\partial_\mu^2 u$ does not appear in the final form of the chain rule provided by Theorem 5.92. Thus, it is quite natural to wonder whether the chain rule could still hold without assuming the existence of $\partial_\mu^2 u$. Motivated by this quandary, we prove that the chain rule does indeed hold under a weaker set of assumptions not requiring the existence of $\partial_\mu^2 u$. We shall refer to this set of assumptions as partial $C^2$ regularity.
Remark 5.97 Observe that, contrary to what we did in the Definition 5.83 of the full $C^2$ regularity, joint continuity of the first and second order derivatives is only required at pairs $(\mu,v)$ such that $v \in \mathrm{Supp}(\mu)$. According to our discussion in Corollary 5.38, this is much more satisfactory. Notice also that, on the support of $\mu$, $\partial_\mu u(\mu)(\cdot)$ and $\partial_v\partial_\mu u(\mu)(\cdot)$ are uniquely defined provided that they are continuous.
Remark 5.98 Following our discussion of the fully $C^2$ case, we argue the symmetry of $\partial_v\partial_\mu u(\mu)(\cdot)$ whenever $u$ is partially $C^2$. The only difficulty is that we cannot repeat the proof of Corollary 5.89 (which applies to the fully $C^2$ case) since it requires the existence of $\partial_\mu^2 u(\mu)$ explicitly.
We shall prove next (see the proof of Theorem 5.99 below) that whenever $u$, $\partial_\mu u$ and $\partial_v\partial_\mu u$ are bounded and uniformly continuous, there exists a family of twice continuously differentiable functions $(u_n^N : (\mathbb{R}^d)^N \to \mathbb{R})_{n,N\ge1}$ together with a sequence of reals $(\epsilon_p)_{p\ge1}$ converging to 0 as $p$ tends to $\infty$, such that, for any $\mu \in \mathcal{P}_2(\mathbb{R}^d)$ and any $N$-tuple $(X^1,\dots,X^N)$ of independent random variables constructed on some auxiliary probability space $(\Omega,\mathcal{F},\mathbb{P})$, with common distribution $\mu$, it holds for any $i \in \{1,\dots,N\}$:
\[
\mathbb{E}\bigl[\bigl|N\,\partial^2_{x^i x^i}u_n^N\bigl(X^1,\dots,X^N\bigr) - \partial_v\partial_\mu u(\mu)(X^i)\bigr|\bigr] \le \epsilon_n + n\,\epsilon_N.
\]
Therefore, for any two continuous versions $\partial_v\partial_\mu u(\mu)(\cdot)$ and $\widehat{\partial_v\partial_\mu u}(\mu)(\cdot)$,
\[
\mathbb{E}\bigl[\bigl|\partial_v\partial_\mu u(\mu)(X^1) - \widehat{\partial_v\partial_\mu u}(\mu)(X^1)\bigr|\bigr] \le 2\bigl(\epsilon_n + n\,\epsilon_N\bigr).
\]
Theorem 5.99 Assume that $u$ is partially $C^2$ and that, for any compact subset $K \subset \mathcal{P}_2(\mathbb{R}^d)$, the bound (5.84) holds true. Then, the chain rule (5.85) holds for any Itô process of the form (5.82) satisfying (5.83).
Remark 5.101 Notice that, in (5.85), the mappings $[0,T] \ni s \mapsto \mathbb{E}[\partial_\mu u(\mu_s)(X_s)\cdot b_s]$ and $[0,T] \ni s \mapsto \mathbb{E}[\partial_v\partial_\mu u(\mu_s)(X_s)\cdot a_s]$ are measurable if we assume without any loss of generality that $a_s$ and $b_s$ are square integrable for any $s \in [0,T]$. This follows from the fact that for any $t \in [0,T]$, the mappings:
\[
L^2(\Omega,\mathcal{F},\mathbb{P};\mathbb{R}^d) \ni X \mapsto \mathbb{E}\bigl[\partial_\mu u\bigl(\mathcal{L}(X)\bigr)(X)\cdot b_t\bigr],
\qquad
L^2(\Omega,\mathcal{F},\mathbb{P};\mathbb{R}^d) \ni X \mapsto \mathbb{E}\bigl[\partial_v\partial_\mu u\bigl(\mathcal{L}(X)\bigr)(X)\cdot a_t\bigr],
\]
are continuous, so that
\[
\mathbb{E}\bigl[\partial_\mu u\bigl(\mathcal{L}(X_t)\bigr)(X_t)\cdot b_t\bigr]
= \lim_{N\to\infty}\sum_{k=0}^{N-1}\mathbb{E}\bigl[\partial_\mu u\bigl(\mathcal{L}(X_{Tk/N})\bigr)(X_{Tk/N})\cdot b_t\bigr]\mathbf{1}_{\{Tk/N \le t < T(k+1)/N\}}
+ \mathbb{E}\bigl[\partial_\mu u\bigl(\mathcal{L}(X_T)\bigr)(X_T)\cdot b_T\bigr]\mathbf{1}_{\{t = T\}},
\qquad (5.91)
\]
and similarly for $(\partial_v\partial_\mu u(\mathcal{L}(X_t))(X_t))_{0\le t\le T}$ and $(a_t)_{0\le t\le T}$.
The proof can be completed by noticing that for any $Z \in L^2(\Omega,\mathcal{F},\mathbb{P};\mathbb{R}^d)$, the mapping $[0,T] \ni t \mapsto \mathbb{E}[Z\cdot b_t]$ is measurable by Fubini's theorem. We deduce that the right-hand side in (5.91) is measurable in $t$. Letting $N$ tend to infinity, we complete the proof for $[0,T] \ni t \mapsto \mathbb{E}[\partial_\mu u(\mathcal{L}(X_t))(X_t)\cdot b_t]$. The same argument works for $[0,T] \ni t \mapsto \mathbb{E}[\partial_v\partial_\mu u(\mathcal{L}(X_t))(X_t)\cdot a_t]$.
on the whole space since $\partial_\mu u$ and $\partial_v\partial_\mu u$ are only continuous at points $(\mu,v)$ such that $v$ is in the support of $\mu$. Still, from formulas (5.87), we notice that $\partial_\mu(u\star\varphi)$ and $\partial_v\partial_\mu(u\star\varphi)$ are also continuous at points $(\mu,v)$ such that $v$ is in the support of $\mu$, the reason being that $v \in \mathrm{Supp}(\mu)$ implies $\varphi(v) \in \mathrm{Supp}(\mu\circ\varphi^{-1})$.
Next, we replace $\mathcal{P}_2(\mathbb{R}^d) \ni \mu \mapsto (u\star\varphi)(\mu)$ by $\mathcal{P}_2(\mathbb{R}^d) \ni \mu \mapsto (u\star\varphi)(\mu\ast\phi)$, where $\phi$ is the density of the standard normal (Gaussian) distribution $\mathcal{N}_d(0,I_d)$ on $\mathbb{R}^d$, and $\mu\ast\phi$ is the usual convolution product, with density:
\[
\mathbb{R}^d \ni x \mapsto \int_{\mathbb{R}^d}\phi(x - y)\,d\mu(y).
\]
Using the fact that a lifting of the map $\mu \mapsto u(\mu\ast\phi)$ is given by $X \mapsto \tilde u(X + G)$, where $\tilde u$ is the lifting of $u$ and $G$ is an $\mathcal{N}_d(0,I_d)$ Gaussian vector independent of $X$, we see that:
\[
\partial_\mu\bigl[u(\cdot\ast\phi)\bigr](\mu)(v) = \int_{\mathbb{R}^d}\partial_\mu u\bigl(\mu\ast\phi\bigr)(v - v')\,\phi(v')\,dv'.
\]
Similarly, we get:
\[
\partial_v\partial_\mu\bigl[(u\star\varphi)(\cdot\ast\phi)\bigr](\mu)(v) = \int_{\mathbb{R}^d}\partial_v\partial_\mu\bigl(u\star\varphi\bigr)\bigl(\mu\ast\phi\bigr)(v - v')\,\phi(v')\,dv'.
\]
Since the support of $\mu\ast\phi$ is the whole $\mathbb{R}^d$, for any $v \in \mathbb{R}^d$, $(\mu\ast\phi, v)$ is a continuity point of both $\partial_\mu(u\star\varphi)$ and $\partial_v\partial_\mu(u\star\varphi)$. Therefore, the mappings $\mathcal{P}_2(\mathbb{R}^d)\times\mathbb{R}^d \ni (\mu,v) \mapsto \partial_\mu(u\star\varphi)(\mu\ast\phi)(v)$ and $\mathcal{P}_2(\mathbb{R}^d)\times\mathbb{R}^d \ni (\mu,v) \mapsto \partial_v\partial_\mu(u\star\varphi)(\mu\ast\phi)(v)$ are continuous. Since they are bounded, we deduce from Lebesgue's theorem that the maps $(\mu,v) \mapsto \partial_\mu[(u\star\varphi)(\cdot\ast\phi)](\mu)(v)$ and $(\mu,v) \mapsto \partial_v\partial_\mu[(u\star\varphi)(\cdot\ast\phi)](\mu)(v)$ are continuous on the whole $\mathcal{P}_2(\mathbb{R}^d)\times\mathbb{R}^d$.
Moreover, whenever $\phi$ is replaced by the density $\phi_\sigma$ of $\mathcal{N}_d(0,\sigma^2 I_d)$, which converges to the Dirac mass at 0 for the $W_2$ distance when $\sigma \searrow 0$, it is easy to check that, for any $\mu \in \mathcal{P}_2(\mathbb{R}^d)$ and any $v \in \mathrm{Supp}(\mu)$, $\partial_\mu[(u\star\varphi)](\mu\ast\phi_\sigma)(v)$ and $\partial_v\partial_\mu[(u\star\varphi)](\mu\ast\phi_\sigma)(v)$ converge to $\partial_\mu(u\star\varphi)(\mu)(v)$ and $\partial_v\partial_\mu(u\star\varphi)(\mu)(v)$. In particular, if Itô's formula holds true for functionals of the type $\mathcal{P}_2(\mathbb{R}^d) \ni \mu \mapsto (u\star\varphi)(\mu\ast\phi_\sigma)$, it also holds true for functionals of the type $\mathcal{P}_2(\mathbb{R}^d) \ni \mu \mapsto (u\star\varphi)(\mu)$ and then for functionals of the type $\mathcal{P}_2(\mathbb{R}^d) \ni \mu \mapsto u(\mu)$ by the same approximation argument as in the proof of Theorem 5.92, noticing in particular that (5.88) remains true.
Therefore, without any loss of generality, we can assume that $u$ and its first and partial second order derivatives are bounded and continuous on the whole space. Then, repeating once again the argument from Theorem 5.92, we can also assume that $u$ and its derivatives are uniformly continuous and that $(X_t)_{t\ge0}$ is a bounded Itô process.
Second Step. As before, we use a mollification argument. For a smooth compactly supported density $\rho$ on $\mathbb{R}^d$, and using the same notations as above, for each integer $n \ge 1$, we define the mollified version $u_n^N$ of $u^N$ by:
\[
u_n^N\bigl(x^1,\dots,x^N\bigr) = n^{Nd}\int_{(\mathbb{R}^d)^N}u^N\bigl(x^1 - y^1,\dots,x^N - y^N\bigr)\prod_{\ell=1}^N\rho\bigl(ny^\ell\bigr)\prod_{\ell=1}^N dy^\ell
= \mathbb{E}\Bigl[u\Bigl(\frac1N\sum_{i=1}^N\delta_{x^i - Y^i/n}\Bigr)\Bigr],
\qquad (5.92)
\]
where $Y^1,\dots,Y^N$ are $N$ i.i.d. random variables with density $\rho$. From the estimate:
\[
W_2\Bigl(\frac1N\sum_{i=1}^N\delta_{x^i - Y^i/n},\ \frac1N\sum_{i=1}^N\delta_{x^i}\Bigr)^2 \le \frac1N\sum_{i=1}^N\Bigl|\frac{Y^i}{n}\Bigr|^2,
\]
we deduce that:
\[
W_2\Bigl(\frac1N\sum_{i=1}^N\delta_{x^i - Y^i/n},\ \frac1N\sum_{i=1}^N\delta_{x^i}\Bigr)^2 \le \frac{C}{n^2},
\qquad (5.93)
\]
the constant $C$ depending upon the size of the support of $\rho$. Above and in the rest of the proof, the constant $C$ is a general constant which is allowed to increase from line to line. Importantly, it does not depend on $n$ or $N$.
Recalling that the function $\mathcal{P}_2(\mathbb{R}^d)\times\mathbb{R}^d \ni (\mu,x) \mapsto \partial_\mu u(\mu)(x)$ is assumed to be bounded, we deduce from Remark 5.27, (5.92), and (5.93) that:
\[
\bigl|u_n^N\bigl(x^1,\dots,x^N\bigr) - u^N\bigl(x^1,\dots,x^N\bigr)\bigr|
= \Bigl|\mathbb{E}\Bigl[u\Bigl(\frac1N\sum_{i=1}^N\delta_{x^i - Y^i/n}\Bigr) - u\Bigl(\frac1N\sum_{i=1}^N\delta_{x^i}\Bigr)\Bigr]\Bigr| \le C\,n^{-1}.
\qquad (5.94)
\]
Given a bounded random variable $X$ with distribution $\mu$, we know from Theorem 5.8 that $\mathbb{E}[W_2(\mu,\bar\mu^N)^2]$ tends to 0 as $N$ tends to infinity, $\bar\mu^N$ denoting the empirical measure of a sample of $N$ independent random variables with the same law as $X$. Moreover, the rate of convergence of $(\mathbb{E}[W_2(\mu,\bar\mu^N)^2])_{N\ge1}$ towards 0 only depends upon the moments of $X$. Together with (5.94), this implies that we can find a sequence $(\epsilon_\ell)_{\ell\ge1}$ independent of $t$, converging to 0 as $\ell$ tends to $\infty$, and such that, for any $n, N \ge 1$ and $t \ge 0$,
\[
\mathbb{E}\bigl[\bigl|u_n^N\bigl(X_t^1,\dots,X_t^N\bigr) - u\bigl(\mu_t\bigr)\bigr|\bigr]
\le \mathbb{E}\bigl[\bigl|u_n^N\bigl(X_t^1,\dots,X_t^N\bigr) - u^N\bigl(X_t^1,\dots,X_t^N\bigr)\bigr|\bigr] + \mathbb{E}\bigl[\bigl|u\bigl(\bar\mu_t^N\bigr) - u\bigl(\mu_t\bigr)\bigr|\bigr]
\le \epsilon_n + \epsilon_N.
\qquad (5.95)
\]
Since $u$ is bounded, the same bound holds in $L^p$: for any $p \ge 1$,
\[
\mathbb{E}\bigl[\bigl|u_n^N\bigl(X_t^1,\dots,X_t^N\bigr) - u\bigl(\mu_t\bigr)\bigr|^p\bigr]^{1/p} \le \epsilon_n^{(p)} + \epsilon_N^{(p)},
\qquad (5.96)
\]
for a sequence $(\epsilon_\ell^{(p)})_{\ell\ge1}$ which tends to 0 as $\ell$ tends to $\infty$.
\[
\begin{aligned}
\partial_{x^i}u_n^N\bigl(x^1,\dots,x^N\bigr)
&= n^{Nd}\int_{(\mathbb{R}^d)^N}\partial_{x^i}u^N\bigl(x^1 - y^1,\dots,x^N - y^N\bigr)\prod_{\ell=1}^N\rho(ny^\ell)\prod_{\ell=1}^N dy^\ell\\
&= \frac{n^{Nd}}{N}\int_{(\mathbb{R}^d)^N}\partial_\mu u\Bigl(\frac1N\sum_{\ell=1}^N\delta_{x^\ell - y^\ell}\Bigr)\bigl(x^i - y^i\bigr)\prod_{\ell=1}^N\rho(ny^\ell)\prod_{\ell=1}^N dy^\ell\\
&= \frac1N\,\mathbb{E}\Bigl[\partial_\mu u\Bigl(\frac1N\sum_{\ell=1}^N\delta_{x^\ell - Y^\ell/n}\Bigr)\Bigl(x^i - \frac{Y^i}{n}\Bigr)\Bigr].
\end{aligned}
\]
Using the boundedness and the uniform continuity of $\partial_\mu u$ on the whole space and following the proof of (5.95), we deduce that, for any $t \ge 0$,
\[
\mathbb{E}\bigl[\bigl|N\,\partial_{x^i}u_n^N\bigl(X_t^1,\dots,X_t^N\bigr) - \partial_\mu u\bigl(\mu_t\bigr)(X_t^i)\bigr|\bigr] \le \epsilon_n + \epsilon_N.
\qquad (5.97)
\]
Again, by boundedness of $\partial_\mu u$, we deduce that, for any $p \ge 1$ and any $t \ge 0$,
\[
\mathbb{E}\bigl[\bigl|N\,\partial_{x^i}u_n^N\bigl(X_t^1,\dots,X_t^N\bigr) - \partial_\mu u\bigl(\mu_t\bigr)(X_t^i)\bigr|^p\bigr]^{1/p} \le \epsilon_n^{(p)} + \epsilon_N^{(p)}.
\qquad (5.98)
\]
the tensor product operating on elements of $\mathbb{R}^d$. We then rewrite the derivative as:
\[
N\,\partial^2_{x^i x^i}u_n^N\bigl(x^1,\dots,x^N\bigr) = T_{n,i}^{1,N}\bigl(x^1,\dots,x^N\bigr) + T_{n,i}^{2,N}\bigl(x^1,\dots,x^N\bigr),
\]
with:
\[
T_{n,i}^{1,N}\bigl(x^1,\dots,x^N\bigr)
= n^{Nd+1}\int_{(\mathbb{R}^d)^N}\partial_\mu u\Bigl(\frac1N\sum_{\ell\ne i}\delta_{x^\ell - y^\ell} + \frac1N\delta_{x^i}\Bigr)\bigl(x^i - y^i\bigr)\otimes\nabla\rho\bigl(ny^i\bigr)\prod_{\ell\ne i}\rho(ny^\ell)\prod_{\ell=1}^N dy^\ell,
\]
and
\[
T_{n,i}^{2,N}\bigl(x^1,\dots,x^N\bigr)
= n^{Nd+1}\int_{(\mathbb{R}^d)^N}\Bigl[\partial_\mu u\Bigl(\frac1N\sum_{\ell=1}^N\delta_{x^\ell - y^\ell}\Bigr) - \partial_\mu u\Bigl(\frac1N\sum_{\ell\ne i}\delta_{x^\ell - y^\ell} + \frac1N\delta_{x^i}\Bigr)\Bigr]\bigl(x^i - y^i\bigr)
\otimes\nabla\rho\bigl(ny^i\bigr)\prod_{\ell\ne i}\rho(ny^\ell)\prod_{\ell=1}^N dy^\ell.
\]
By integration by parts (recall that $\mathbb{R}^d \ni x \mapsto \partial_\mu u(\mu)(x)$ is differentiable), we can split $T_{n,i}^{1,N}$ into $T_{n,i}^{11,N} + T_{n,i}^{12,N}$, with:
\[
T_{n,i}^{11,N}\bigl(x^1,\dots,x^N\bigr)
= n^{Nd}\int_{(\mathbb{R}^d)^N}\partial_v\partial_\mu u\Bigl(\frac1N\sum_{\ell=1}^N\delta_{x^\ell - y^\ell}\Bigr)\bigl(x^i - y^i\bigr)\prod_{\ell=1}^N\rho(ny^\ell)\prod_{\ell=1}^N dy^\ell
\]
and
\[
T_{n,i}^{12,N}\bigl(x^1,\dots,x^N\bigr)
= n^{Nd}\int_{(\mathbb{R}^d)^N}\Bigl[\partial_v\partial_\mu u\Bigl(\frac1N\sum_{\ell\ne i}\delta_{x^\ell - y^\ell} + \frac1N\delta_{x^i}\Bigr)
- \partial_v\partial_\mu u\Bigl(\frac1N\sum_{\ell=1}^N\delta_{x^\ell - y^\ell}\Bigr)\Bigr]\bigl(x^i - y^i\bigr)\prod_{\ell=1}^N\rho(ny^\ell)\prod_{\ell=1}^N dy^\ell.
\]
The first term is treated as (5.95) and (5.97). Namely, we argue that, because of the uniform continuity of $\partial_v\partial_\mu u$, we have for any $t \ge 0$:
\[
\mathbb{E}\bigl[\bigl|T_{n,i}^{11,N}\bigl(X_t^1,\dots,X_t^N\bigr) - \partial_v\partial_\mu u\bigl(\mu_t\bigr)(X_t^i)\bigr|\bigr] \le \epsilon_n + \epsilon_N,
\qquad (5.99)
\]
from which we get that for any $p \ge 1$ and any $t \ge 0$,
\[
\mathbb{E}\bigl[\bigl|T_{n,i}^{11,N}\bigl(X_t^1,\dots,X_t^N\bigr) - \partial_v\partial_\mu u\bigl(\mu_t\bigr)(X_t^i)\bigr|^p\bigr]^{1/p} \le \epsilon_n^{(p)} + \epsilon_N^{(p)}.
\qquad (5.100)
\]
To handle the second term, we use once more the uniform continuity of $\partial_v\partial_\mu u$. Indeed, we have:
\[
\bigl|T_{n,i}^{12,N}\bigl(x^1,\dots,x^N\bigr)\bigr| \le \epsilon_N,
\]
as implied by:
\[
W_2\Bigl(\frac1N\sum_{\ell\ne i}\delta_{x^\ell - y^\ell} + \frac1N\delta_{x^i},\ \frac1N\sum_{\ell=1}^N\delta_{x^\ell - y^\ell}\Bigr)^2 \le \frac1N\bigl|y^i\bigr|^2 \le \frac{C}{N},
\]
if, as in the definition of $T_{n,i}^{12,N}(x^1,\dots,x^N)$, the quantity $ny^i$ is restricted to the compact support of $\rho$. This implies that, for any $t \ge 0$,
\[
\mathbb{E}\bigl[\bigl|T_{n,i}^{12,N}\bigl(X_t^1,\dots,X_t^N\bigr)\bigr|\bigr] \le \epsilon_N.
\qquad (5.101)
\]
We finally handle $T_{n,i}^{2,N}$. Following the proof of (5.102), we have, for any $p \ge 1$ and any $t \ge 0$,
\[
\mathbb{E}\bigl[\bigl|T_{n,i}^{2,N}\bigl(X_t^1,\dots,X_t^N\bigr)\bigr|^p\bigr]^{1/p} \le n\,\epsilon_N^{(p)}.
\qquad (5.103)
\]
In analogy with the proof of Theorem 5.92, the standard Itô formula applied to $u_n^N$ produces, among others, the terms:
\[
\frac1N\sum_{\ell=1}^N\int_0^t\bigl[N\,\partial_{x^\ell}u_n^N\bigl(X_s^1,\dots,X_s^N\bigr)\bigr]\cdot\sigma_s^\ell\,dW_s^\ell,
\qquad
\frac1{2N}\sum_{\ell=1}^N\int_0^t\operatorname{trace}\bigl\{\bigl[N\,\partial^2_{x^\ell x^\ell}u_n^N\bigl(X_s^1,\dots,X_s^N\bigr)\bigr]\,a_s^\ell\bigr\}\,ds,
\]
with $a_s^\ell = \sigma_s^\ell(\sigma_s^\ell)^{\dagger}$. We compare with the expected result, by computing the difference:
\[
\begin{aligned}
\Theta_t^N &= u(\mu_t) - u(\mu_0) - \frac1N\sum_{\ell=1}^N\int_0^t\partial_\mu u(\mu_s)(X_s^\ell)\cdot b_s^\ell\,ds
- \frac1N\sum_{\ell=1}^N\int_0^t\partial_\mu u(\mu_s)(X_s^\ell)\cdot\sigma_s^\ell\,dW_s^\ell\\
&\quad- \frac1{2N}\sum_{\ell=1}^N\int_0^t\operatorname{trace}\bigl[\partial_v\partial_\mu u(\mu_s)(X_s^\ell)\,a_s^\ell\bigr]\,ds.
\qquad (5.104)
\end{aligned}
\]
From (5.96), (5.98), (5.100), (5.102), and (5.103), we obtain, for any $T > 0$,
\[
\sup_{0\le t\le T}\bigl|\mathbb{E}\bigl[\Theta_t^N\bigr]\bigr| \le \epsilon_n + (1 + n)\,\epsilon_N,
\]
the sequence $(\epsilon_\ell)_{\ell\ge1}$ now depending on $T$. Using a straightforward exchangeability argument and letting $N$ tend to $\infty$, we deduce that $\Theta_t = 0$ for every $t \in [0,T]$, where:
\[
\Theta_t = u(\mu_t) - u(\mu_0) - \int_0^t\mathbb{E}\bigl[\partial_\mu u(\mu_s)(X_s)\cdot b_s\bigr]\,ds
- \frac12\int_0^t\mathbb{E}\bigl[\operatorname{trace}\bigl(\partial_v\partial_\mu u(\mu_s)(X_s)\,a_s\bigr)\bigr]\,ds.
\]
(continued)
5.6 Itô’s Formula Along a Flow of Measures 485
(A3) For the version of @ u mentioned above and for any .t; x; / 2 Œ0; T
Rd P2 .Rd /, the mapping Rd 3 v 7! @ u.t; x; /.v/ 2 Rd is
continuously differentiable and its derivative, denoted by Rd 3 v 7!
@v @ u.t; x; /.v/ 2 Rdd , is locally bounded and is jointly continuous
in .t; x; ; v/ at any point .t; x; ; v/ such that v 2 Supp./.
Proposition 5.102 If u satisfies assumption Joint Chain Rule and if for every
compact subset K Rd P2 .Rd /, it holds that:
Z
ˇ ˇ
sup ˇ@ u.t; x; /.v/ˇ2 d.v/
.t;x;/2Œ0;TK Rd
Z ˇ ˇ2 (5.106)
ˇ ˇ
C ˇ@v @ u.t; x; /.v/ˇ d.v/ < 1;
Rd
if we set t D L.Xt / for t 2 Œ0; T for an Itô process .Xt /06t6T of the form (5.82)
satisfying (5.83) at time T, and if .t /t2Œ0;T is another d-dimensional Itô process
on the same filtered probability space .˝; F; F; P/ with similar dynamics dt D
t dt C t dWt , for two F-progressively measurable processes . t /06t6T and . t /06t6T
with values in Rd and Rdd respectively such that:
Z
T 2
P j t j C j t j dt < 1 D 1;
0
where the process .XQ t ; bQ t ; Q t /06t6T is a copy of the process .Xt ; bt ; t /06t6T , on a
Q F;
copy .˝; Q of the probability space .˝; F; P/.
Q P/
486 5 Spaces of Measures and Related Differential Calculus
Remark 5.103 Importantly, in full analogy with Remark 5.101, the processes
Œ0; T ˝ 3 .s; !/ 7! EQ @ u s; s .!/; s .XQ s / bQ s ;
Œ0; T ˝ 3 .s; !/ 7! EQ trace @v @ u s; s .!/; s .XQ s /Q s Q s ;
are progressively measurable if we assume that as and s s are square integrable
for any s 2 Œ0; T. This is due to the fact that the functions:
Œ0; T Rd 3 .s; x/ 7! EQ @ u s; x; s .XQ s / bQ s ;
Œ0; T Rd 3 .s; x/ 7! EQ trace @v @ u s; x; s .XQ s /Q s Q s ;
are jointly measurable. For the first of them, the measurability follows from the fact
that we can find a jointly measurable version of @ u W Œ0; T Rd P2 .Rd / Rd 3
.t; x; ; v/ 7! @ u.t; x; /.v/, as explained in Subsection 5.3.4. For the second,
arguing the measurability is less straightforward. Still, we can use the fact (see
again Subsection 5.3.4) that the mapping:
Q F;
Œ0; T Rd L2 .˝; Q Rd / 3 .t; x; X/
Q PI Q
Q .X/
7! %dd @v @ u t; x; L.X/ Q 2 L2 .˝; Q Rdd /
Q PI
Q F;
is continuous for any compactly supported smooth function %dd from Rdd
into itself, see again Subsection 5.3.4. Then, the proof can be completed as in
Remark 5.101.
Proof of Proposition 5.102. As a preliminary remark, we observe that for any .t; x/ 2 Œ0; T
Rd , the mapping P2 .Rd / 3 7! u.t; x; / is partially C 2 .
First Step. We first assume that the processes .bt /06t6T and .t /06t6T have continuous paths
and satisfy:
h i
E sup jbt j2 C jt j4 < 1:
06t6T
Since the path Œ0; T 3 t 7! t 2 P2 .Rd / is continuous for the Wasserstein distance, U is
continuous. By a similar argument, U is twice differentiable in space and @x U and @2xx U are
jointly continuous in time and space.
We now prove that U is differentiable with respect to the time variable and that @t U is
continuous. For any t 2 Œ0; T/ and h > 0 such that t C h 2 Œ0; T and for any x 2 Rd , we
have:
5.6 Itô’s Formula Along a Flow of Measures 487
U.t C h; x/ U.t; x/
D u t C h; x; tCh u t; x; tCh C u.t; x; tCh / u.t; x; t /
Z Z
tCh tCh
D @t u.s; x; tCh /ds C E @ u.t; x; s /.Xs / bs ds
t t
Z
1 tCh
C E @v @ u.t; x; s /.Xs / as ds
2 t
where we used the chain rule for partially C 2 functions. Using the joint continuity of @t u,
we clearly have limh&0 .i/=h D @t u.t; x; t /. Using the joint continuity of @ u (at points
.t; x; ; v/ such that v 2 Supp./) and the pathwise continuity of .bs /06s6T , we have, in
probability,
so that, by (5.106) the family .@ u.t; x; s /.Xs //06s6T is uniformly integrable. We deduce
that:
lim E @ u.t; x; s /.Xs / bs D E @ u.t; x; t /.Xt / bt ;
s&t
and, subsequently:
1
lim .ii/ D E @ u.t; x; t /.Xt / bt :
h&0 h
1 1
lim .iii/ D E @v @ u.t; x; t /.Xt / at ;
h&0 h 2
1
@t U.t; x/ D @t u.t; x; t / C E @ u.t; x; t /.Xt / bt C E @v @ u.t; x; t /.Xt / at :
2
Using the same argument as the one used above to investigate the last two limits, we
can prove that @t U is jointly continuous in time and space. This shows that U is of class
C 1;2 on Œ0; T Rd . Applying standard Itô’s formula to .U.t; t / D u.t; t ; t //06t6T , we
obtain (5.107).
Second Step. We now get rid of the continuity assumption on the processes .bt /06t6T
and .t /06t6T . Let ..bnt /06t6T /n>0 and ..tn /06t6T /n>0 be sequences of F-progressively
488 5 Spaces of Measures and Related Differential Calculus
measurable processes such that, for each n > 0, .bnt /06t6T and .tn /06t6T satisfy the
assumptions used in the first step, together with:
Z T
lim E jbt bnt j2 C jt tn j4 dt D 0:
n!1 0
The goal is then to pass to the limit in (5.107). To do so, we use repeatedly (5.106) together
with the fact that the sequence ..nt /0tT /n0 is relatively compact in P2 .Rd /. By using a
localization sequence for the process .s /06s6T , we may assume that it lives in a bounded
subset of Rd . Considering the penultimate line in (5.107), observe by Cauchy Schwarz’
inequality that:
ˇZ t Z t ˇ
ˇ ˇ
sup ˇˇ EQ @ u.s; s ; ns /.XQ sn / bQ ns ds EQ @ u.s; s ; ns /.XQ sn / bQ s dsˇˇ
06t6T 0 0
Z T 1=2
6c E jbnt bt j2 dt :
0
Therefore, in order to pass to the limit (in the pathwise sense, uniformly in time) in the first
of the last two terms of (5.107), it suffices to focus on the limit of:
Z t
EQ @ u.s; s ; ns /.XQ sn / bQ s ds:
0
Repeating the uniform integrability arguments used in the first step of the proof, we claim
that P almost surely, for almost every s 2 Œ0; T (namely those for which EŒjbs j2 < 1),
lim EQ @ u.s; s ; ns /.XQ sn / bQ s D EQ @ u.s; s ; s /.XQ s / bQ s ;
n!1
which, after we take advantage of the fact that .t /06t6T takes values in a bounded subset of
Rd , of (5.106), and after we apply Lebesgue’s dominated convergence theorem, proves that
P almost surely:
ˇZ t Z t ˇ
ˇ ˇ
lim sup sup ˇˇ EQ @ u.s; s ; ns /.XQ sn / bQ s ds EQ @ u.s; s ; s /.XQ s / bQ s dsˇˇ
n!1 06t6T 0 0
Z ˇ ˇ
T
ˇQ ˇ
6 lim sup ˇE @ u.s; s ; ns /.XQ sn / bQ s EQ @ u.s; s ; s /.XQ s / bQ s ˇds D 0:
n!1 0
5.6 Itô’s Formula Along a Flow of Measures 489
The last term of (5.107) is handled in the same way. The terms appearing in the second and
third lines of (5.107) are easily handled. Regarding the stochastic integral, it suffices to notice
that, for a universal constant C > 0,
ˇZ t Z t ˇ2
ˇ ˇ
E ˇ
sup ˇ @x u.s; s ; s / . s dWs /
n
@x u.s; s ; s / . s dWs /ˇˇ
06t6T 0 0
Z T ˇ ˇ
6E ˇ @x u.t; t ; n / @x u.t; t ; t / ˇ2 j t j2 dt:
t
0
Since @x u is assumed to be (jointly) continuous and .t /06t6T is assumed to take values in a
bounded subset of Rd , the right-hand side tends to 0 as n tends to 1, which completes the
proof. t
u
d
@2; uQ .X/ D DQu.X / ;
d j D0
d
whenever X D X 0 and D X :
d j D0
(continued)
490 5 Spaces of Measures and Related Differential Calculus
so that the chain rule applies to any Itô process satisfying (5.83).
Remark 5.105 The thrust of Theorem 5.104 is to focus on the smoothness of the
mapping Rd 3 v 7! @ u./.v/ independently of the smoothness in the direction
by restricting the test random variables .X / 2R to an identically distributed family.
One of the issue in the proof is precisely to construct such a family of test random
variables.
which implies that, for any 2 P2 .Rd /, we can find a Lipschitz continuous version
of the map Rd 3 x 7! @ u./.x/, with C as Lipschitz constant (in particular,
the Lipschitz property holds true uniformly with respect to ). Therefore, for any
X 2 L2 .˝; F; PI Rd / with X , j@ u./.0/j 6 CEŒjXj C EŒj@ u./.X/j, the
last term being bounded thanks to .i/. Obviously the right-hand side is uniformly
bounded for in bounded subsets of P2 .Rd /, from which we deduce that @ is
locally bounded.
Actually, on the model of Corollary 5.38, we can say a little bit more. Indeed, by
the same arguments as in the proof of Corollary 5.38, we can prove that the function
P2 .Rd / Rd 3 .; v/ 7! @ u./.v/ is jointly continuous at any point .; v/ such
that v 2 Supp./.
5.6 Itô’s Formula Along a Flow of Measures 491
We now proceed with the proof of Theorem 5.104. It relies on two main
ingredients. The first one is a new mollification argument. The second is a coupling
argument which permits to choose relevant versions of the random variables along
which the differentiation is performed.
n D 'd;n ;
where the function 'd;n denotes the density of the mean-zero d-dimensional Gaussian
distribution Nd .0; .1=n/Id / with covariance matrix .1=n/Id , where as usual Id is the identity
matrix of dimension d. We then define the mapping:
Z
V n .; v/ D @ u n .v x/nd=2 'd n1=2 x/dx; (5.108)
Rd
where the function 'd D 'd;1 denotes the density of the standard d-dimensional Gaus-
sian distribution. The mapping V n is given by the convolution of @ u.n /./ with the
measure Nd .0; .1=n/Id /. According to the warm-up preceding the proof, the sequence
.@ u.n /.0//n>1 is bounded and the functions .@ u.n / W Rd 3 v 7! @ u.n /.v/ 2
Rd /n>1 are uniformly Lipschitz continuous. Thus, the sequence of functions .V n .; //n>1 is
relatively compact for the topology of uniform convergence on compact subsets. Any limit
must coincide with @ u././ at points v in the support of or, put it differently, any limit
provides a version of @ u././ which is Lipschitz continuous, the Lipschitz constant being
uniform in . When has full support, the sequence .V n .; //n>1 converges to the unique
continuous version of @ u./, the convergence being uniform on compact subsets.
Let X n D X C n1=2 G, where G is an Nd .0; Id / Gaussian variable independent of X, so
that L.X n / D n . We then observe that, for any Rd -valued square integrable random variable
such that the pair .X; / is independent of G,
DQu.X n / D E @ u.n /.X n /
Z
DE @ u.n /.X x/nd=2 'd .n1=2 x/dx (5.109)
Rd
D E V n .; X/ :
The main advantage of this formula is the fact that the mapping Rd 3 v 7! V n .; v/ is
differentiable with respect to v, which is not known for Rd 3 x 7! @ u./.v/ at this stage of
the proof.
Second Step. We construct now, independently of the measure considered above, a family
.Y / 2R that is differentiable with respect to in L2 .˝; F ; PI R/ but which is, at the same
time, invariant in law, all the Y , for 2 R, being uniformly distributed on Œ=2; =2.
Given two independent N.0; 1/ random variables Z and Z 0 , for any 2 R, we set:
For any 2 R, the pair .Z ; Z 0; / has the same law as .Z; Z 0 / (because of the invariance of
the Gaussian distribution by rotation). Next, we define the random variables Y by:
Z Z
Y D arcsin p D arcsin p :
.Z /2 C .Z 0; /2 Z2 C .Z 0 /2
For any 2 R, Y is uniformly distributed over Œ=2; =2. Pointwise (that is to say for
! 2 ˝ fixed), the mapping R 3 7! Y is differentiable at any such that Z 0; 6D 0.
Noticing that Œd=d Z D Z 0; pointwise, we get in that case:
d Z 0; .Z /2 1=2
Y D p 1 D sign Z 0; :
d Z 2 C .Z 0 /2 .Z / C .Z 0; /2
2
On the event fZ 0;0 6D 0g D fZ 0 6D 0g, which is of probability 1, the set of ’s such that
Z 0; D 0 is locally finite. The above derivative being bounded by 1, this says that pointwise,
the mapping R 3 7! Y is 1-Lipschitz continuous. Therefore, the random variables
.Y Y 0 /= , 6D 0, are bounded by 1. Moreover, still on the event fZ 0 6D 0g, the above
computation shows that:
Y Y0
lim D sign Z 0 : (5.110)
!0
X D .ı Y /e C X;
Going back to (5.109), we get, for another random variable 2 L2 .˝; F ; PI Rd /, with
.X; ; Z; Z 0 / independent of G,
1
DQu X C p G D E V n L.X /; X :
n
5.6 Itô’s Formula Along a Flow of Measures 493
Observe in the above formula that @v V n takes values in the set of symmetric d d matrices
since @ u././ derives from a potential for all 2 P2 .Rd /, see Proposition 5.50.
Noticing that the random variable jsign.Z 0 /j is equal to 1 almost surely, we can replace
by sign.Z 0 / with .X; / independent of .Z; Z 0 /, so that:
1
@2sign.Z 0 /e;sign.Z 0 / uQ X C ıYe C p G
n
h n oi
D E trace @v V n L X C ıYe ; X C ıYe ˝ e :
Finally, we let:
Z
W n;ı .; v/ D @v V n pı ; v C ıre p.r/dr; (5.111)
R
where p is the uniform density on Œ=2; =2 and pı ./ D p.=ı/=ı is the uniform density
on Œı=2; ı=2. As usual, pı is an abbreviated notation for denoting the convolution
of with the uniform distribution on the segment Œ.ı=2/e; .ı=2/e. Since the pair .X; /
is independent of .Z; Z 0 /, we end up with the duality formula:
1 h n oi
@2sign.Z 0 /e;sign.Z 0 / uQ X C ıYe C p G D E trace W n;ı .; X/ ˝ e : (5.112)
n
By the smoothness assumption on @2; uQ (see (ii) in (A2) in assumption Sufficiency for
Partial C 2 ), we deduce that, for another X 0 , with distribution as well, such that the triple
.X; X 0 ; / is independent of .Z; Z 0 / and the 5-tuple .X; X 0 ; ; Z; Z 0 / is independent of G,
ˇ h n oiˇˇ
ˇ 1=2 1=2
ˇE trace W n;ı .; X/ W n;ı .; X 0 / ˝ e ˇ 6 CE jX X 0 j2 E jj2 ; (5.113)
the constant C being independent of , ı and n. The above is true for any fX; X 0 g-
measurable 2 L2 .˝; F ; PI Rd /. We deduce that, for any other e0 2 Rd with je0 j D 1,
494 5 Spaces of Measures and Related Differential Calculus
hˇ n oˇˇ2 i
ˇ 1=2
E ˇtrace W n;ı .; X/ W n;ı .; X 0 / e0 ˝ e ˇ 6 CE jX X 0 j2 :
By Proposition 5.36, this says that Rd 3 v 7! tracef.W n;ı .; v//.e0 ˝ e/g has a C-Lipschitz
continuous version.
Fourth Step. From (5.108) and (5.111), we know that:
Z
W n;ı .; v/ D @v V n pı ; v C ıre p.r/dr
R
Z
D n.dC1/=2 @ u pı Nd .0; 1n Id /; w C ıre p.r/@'d n1=2 .v w/ drdw:
RRd
Since Nd .0; .1=n/Id / has full support, we know that @ u. pı Nd .0; .1=n/Id /; /
converges towards @ u. Nd .0; .1=n/Id /; / as ı tends to 0, uniformly on compact subsets
(see the warm-up). We deduce that, as ı tends to 0, W n;ı .; v/ converges to:
Z
W n .; v/ D n.dC1/=2 @ u Nd .0; 1n Id /; w @'d n1=2 .v w/ dw
Rd
Z
1
1=2
D @v n d=2
@ u Nd .0; I /; w
n d
'd n .v w/ dw D@v V n .; v/:
Rd
Therefore, we deduce that the mappings .Rd 3 v 7! tracef.@v V n .; v//.e0 ˝ e/g/n>1 are
Lipschitz continuous, uniformly in . Since @v V n .; v/ is independent of e and e0 , this
implies that the mappings .Rd 3 v 7! @v V n .; v//n>1 are Lipschitz continuous, uniformly
with respect to .
By (5.112) and (ii) in (A2) in assumption Sufficiency for Partial C 2 ,
ˇ ˚ ˇ2
sup E ˇtrace W n;ı .; X/ .e0 ˝ e/ ˇ 6 C;
n>1;ı2Œ0;1
for a possibly new value of C. Above, X . Letting ı tend to 0, we deduce from Fatou’s
lemma that:
hˇ ˚ ˇ2 i
sup E ˇtrace @v V n .; X/ .e0 ˝ e/ ˇ 6 C;
n>1
and thus that supn>1 EŒj@v V n .; X/j2 6 C, which implies by Lipschitz property of
@v V n .; /, that:
8n > 1; j@x V n .; 0/j 6 C.1 C EŒjXj2 : (5.114)
This says that the sequence of mappings .Rd 3 v 7! @v V n .; v//n>1 is relatively compact for
the topology of uniform convergence. Therefore, we can extract a convergent subsequence.
As the limit of V n .; / is @ u././, we deduce that Rd 3 v 7! @ u./.v/ is differentiable
with respect to v. Passing to the limit in (5.112) (first on ı and then on n), we deduce that
5.7 Applications 495
h n oi
@2sign.Z 0 /e;sign.Z 0 / uQ .X/ D E trace @v @ u./.X/ ˝ e : (5.115)
Again, by (ii) in (A2) in assumption Sufficiency for Partial C 2 , there exists a constant
C such that, for any 2 P2 .Rd / and any X 2 L2 .˝; F ; PI Rd / with as distribution,
EŒj@v @ u./.X/j2 6 C, which is a required condition for applying the chain rule. In order
to complete the proof, it remains to prove that the mapping P2 .Rd / Rd 3 .; v/ 7!
@v @ u./.v/ is jointly continuous at any point .; v/ such that v 2 Supp./. We already
know that it is Lipschitz continuous with respect to v, uniformly in . For a sequence
.n /n>1 in P2 .Rd / converging for the 2-Wasserstein distance to some 2 P2 .Rd /, we
deduce from the Lipschitz property and by the same argument as in (5.114) that the sequence
of functions .Rd 3 v 7! @v @ u.n /.v//n>1 is relatively compact for the topology of uniform
convergence on compact subsets. By means of the bound supn>1 EŒj@v @ u.n /.X n /j2 6 C,
with X n n , it is quite easy to pass to the limit in the right-hand side of (5.115). By (ii) in
(A2) in assumption Sufficiency for Partial C 2 , we can also pass to the limit in the left-hand
side. Equation (5.115) then permits to identify any limit with @v @ u././ on the support
of . Since the mappings .@v @ u.n /.//n>1 are uniformly continuous on compact subsets,
we deduce that, for an additional sequence .v n /n>1 , with values in Rd , that converges to
some v 2 Supp./, the sequence .@v @ u.n /.v n //n>1 converges, up to a subsequence, to
@v @ u./.v/. Now, by relative compactness of the sequence .Rd 3 v 7! @v @ u.n /.v//n>1 ,
the sequence .@v @ u.n /.v n //n>1 is bounded. By a standard compactness argument, the
sequence .@v @ u.n /.v n //n>1 must be convergent with @v @ u./.v/ as limit. Arguing as
we did for @ u in the warm-up preceding the proof, we easily prove that @v @ u is locally
bounded. t
u
5.7 Applications
We now comment more on Remark 5.75 within the framework of mean field games.
Assume further that there exist two functions F0 W Œ0; T P2 .Rd / ! R and G W
P2 .Rd / ! R such that, for any t 2 Œ0; T, F0 .t; / and G are differentiable for
the linear functional differentiation defined in Subsection 5.4.1, such that, for all
.t; x; / 2 Œ0; T Rd P2 .Rd /,
ıF0
f0 .t; x; / D .t; /.x/;
ı
ıG
g.x; / D .t; /.x/:
ı
then h is L-monotone. Whenever Proposition 5.51 applies, this says that @x ŒıH=ı
is L-monotone when H is L-convex.
Returning to the mean field game described above, we deduce that @x f0 .t; / and
@x g are L-monotone if F0 .t; / and G are L-convex in the direction . Therefore,
in full analogy with the above interpretation of the Lasry-Lions monotonicity
condition, the L-monotonicity condition in the statement of Theorem 3.32 (which
provides another sufficient condition for guaranteeing uniqueness) can be also
interpreted as a convexity condition (of F0 and G) in the direction of the measure
argument but for the L-differentiation!
5.7 Applications 497
for .t; x; / 2 Œ0; TRd P2 .Rd /, with u.T; x; / D G.x; / as terminal condition.
498 5 Spaces of Measures and Related Differential Calculus
The next statement consists of a verification argument that makes the connection
between (5.116) and (5.117):
Proposition 5.106 Assume that there exists a function u W Œ0; TRd P2 .Rd / ! R
satisfying assumption Joint Chain Rule together with (5.106) and such that u and
@x u are Lipschitz continuous in .x; / uniformly in time. Assume further that ˙ is
bounded.
Then, for any initial condition X0 2 L2 .˝; F0 ; PI Rd /, the system (5.116) admits
a unique solution .Xt ; Yt ; Zt /06t6T satisfying:
Z
T
E sup jXt j2 C jYt j2 C jZt j2 dt < 1:
06t6T 0
Remark 5.107 The master equation for mean field games is obtained by choosing
B, ˙ , F and G as in the statement of Theorem 4.44. We let the reader write the
corresponding form of (5.117). In that case, Proposition 5.106 shows that the
function Œ0; T Rd 3 .t; x/ 7! u.t; x; L.Xt // should coincide with the solution
of the HJB equation in the mean field game system (3.12), while .t /06t6T therein
should coincide with .L.Xt //06t6T .
Proof.
First Step. We first prove the existence of a solution. To do so, we notice that, under our
assumption, the stochastic differential equation:
dXt D B t; Xt ; L.Xt /; u t; Xt ; L.Xt / ; .˙ @x u/ t; Xt ; L.Xt / dt C ˙ t; Xt ; L.Xt / dWt ;
for t 2 Œ0; T, with X0 2 L2 .˝; F0 ; PI Rd / as initial condition, is uniquely solvable, see
Theorem 4.21. The solution satisfies EŒsup06t6T jXt j2 < 1.
Let now:
Yt D u t; Xt ; L.Xt / ; Zt D ˙ @x u t; Xt ; L.Xt / ; t 2 Œ0; T:
Combining the PDE (5.117) with Proposition 5.102, we deduce that .Xt ; Yt ; Zt /06t6T is a
solution of (5.116).
5.7 Applications 499
Second Step. We now consider another solution .Xt0 ; Yt0 ; Zt0 /06t6T to (5.116), with the same
initial condition X00 D X0 as in the first step. We let:
Yt D u t; Xt0 ; L.Xt0 / ; Zt D ˙ @x u t; Xt0 ; L.Xt0 / ; t 2 Œ0; T:
Again, we can combine the PDE (5.117) with Proposition 5.102. We deduce that:
dYt D B t; L.Xt0 /; Xt0 ; Yt0 ; Zt0 B t; L.Xt0 /; Xt0 ; Yt ; Zt @x u t; Xt0 ; L.Xt0 / dt
h i
C EQ B t; L.X 0 /; XQ 0 ; YQ 0 ; ZQ 0 B t; L.X 0 /; XQ 0 ; YQ t ; ZQt @ u t; X 0 ; L.X 0 / .XQ 0 / dt
t t t t t t t t t
F t; L.Xt0 /; Xt0 ; Yt ; Zt dt C Zt dWt ; t 2 Œ0; T;
with the terminal boundary condition YT0 D G.XT0 ; L.XT0 //, and where we used the same
convention as above for the variables labeled with a tilde: They denote copies of the original
variables that are constructed on a copy of the original probability space.
In order to complete the proof, it suffices to regard the difference .Yt0 Yt ; Zt0 Zt /06t6T
as the solution of a backward SDE with random coefficients with 0 as terminal condition.
Notice then from the fact that u is Lipschitz continuous in , that DX uQ takes values in a
bounded subset of L2 .˝; F ; PI Rd /, see Remark 5.27. In particular,
ˇ h iˇ
ˇQ ˇ
ˇE B t; L.Xt0 /; XQ t0 ; YQ t0 ; ZQ t0 B t; L.Xt0 /; XQ t0 ; YQ t ; ZQt @ u t; Xt0 ; L.Xt0 / .XQ t0 / ˇ
h i1=2 h i1=2
6 CEQ jYQ t0 YQ t j2 C jZQ t0 ZQt j2 EQ j@ u t; Xt0 ; L.Xt0 / .XQ t0 /j2
h i1=2
6 CEQ jYQ t0 YQ t j2 C jZQ t0 ZQt j2 ;
for a value of C, independent of t, that is allowed to increase from line to line. By adapting
the stability arguments used in the proof of Theorem 4.23, we get that:
Z
T
sup E jYt0 Yt j2 C E jZt0 Zt j2 dt D 0:
06t6T 0
Therefore, .Xt0 /06t6T solves the same SDE as .Xt /06t6T , which proves uniqueness. t
u
The purpose of this short section is to present an application of the chain rule to a
nonstandard control problem. In full analogy with the classical case and with the
previous subsection, we use a verification argument whereby the classical solution
of a partial differential equation, if it exists, provides a solution to the optimal control
problem. However, the control problem has to be of a very special nature to justify
the need for such a sophisticated form of the chain rule. Case in point, the application
500 5 Spaces of Measures and Related Differential Calculus
over the set A D H2;d of admissible controls for the controlled dynamics:
Proposition 5.108 Let us assume that there exists a function u W Œ0; T P2 .Rd / !
R, differentiable in t, with @t u being continuous in .t; /, partially C 2 in the
measure variable, with DX uQ W Œ0; T L2 .˝; F; PI Rd / 3 .t; X/ 7! DX uQ .t; X/ 2
L2 .˝; F; PI Rd / being Lipschitz continuous with respect to X, uniformly in time,
@ u W Œ0; TP2 .Rd /Rd 3 .t; ; v/ 7! @ u.t; /.v/ and @v @ u W Œ0; TP2 .Rd /
Rd 3 .t; ; v/ 7! @v @ u.t; /.v/ being continuous at any point .t; ; v/ such that
v 2 Supp./, satisfying u.T; / 0 and
Z ˇ ˇ2
ˇ ˇ
sup sup ˇ@v @ u.t; /.v/ˇ d.v/ < 1; (5.120)
t2Œ0;T 2K Rd
for all compact K P2 .Rd /. Furthermore, if we assume that u satisfies the infinite-
dimensional PDE:
Z
1
@t u.t; / j@ u.t; /.v/j2 d.v/
2 Rd
Z (5.121)
1
C trace @v @ u.t; /.v/d.v/ C f ./ D 0;
2 Rd
O D inf J.˛/:
J.˛/
˛2A
Proof.
First Step. We first prove that (5.122) is uniquely solvable. We first recall from Subsec-
tion 5.3.4 that we can a find a version of each @ u.t; /./ 2 L2 .Rd ; / such that the mapping
Œ0; T P2 .Rd / Rd 3 .t; ; v/ 7! @ u.t; /.v/ is measurable. Of course, for any t 2 Œ0; T
and any random variable X 2 L2 .˝; F ; PI Rd /, @ u.t; L.X//.X/ is almost surely equal to
DX uQ .t; X/. In particular, the Lipschitz property of DX uQ in the variable X shows that, for any
X; Y 2 L2 .˝; F ; PI Rd /,
hˇ ˇ2 i
sup E ˇ@ u t; L.X/ .X/ @ u t; L.Y/ .Y/ˇ 6 CkX Yk22 :
06t6T
This suffices to implement Picard’s fixed point theorem along the lines of the proof of
Theorem 4.21. As a byproduct of the above estimates, we also get that:
Z
ˇ ˇ
sup sup ˇ@ u.t; /.v/ˇ2 d.v/ < 1;
t2Œ0;T 2K Rd
for all compact K K2 .Rd /, which is a necessary condition to apply the chain rule in the
second step below.
Second Step. Consider now a generic admissible control ˛ D .˛t /06t6T , denote by
X˛ D .Xt˛ /06t6T the corresponding controlled state given by (5.119), and let us apply the
time dependent form of the chain rule of Theorem 5.99 discussed in Proposition 5.102 to
.u.t; L.Xt˛ ///06t6T . We get:
502 5 Spaces of Measures and Related Differential Calculus
du t; L.Xt˛ /
h i
D @t u t; L.Xt˛ / C E @ u t; L.Xt˛ / Xt˛ ˛t
1 h i
C E trace @v @ u t; L.Xt˛ / Xt˛ dt
2
i
1 hˇ ˇ2 i h
D f L.Xt˛ / C E ˇ@ u t; L.Xt˛ / Xt˛ ˇ C E @ u t; L.Xt˛ / Xt˛ ˛t dt
2
1 1 hˇ ˇ2 i
D f L.Xt˛ / E j˛t j2 C E ˇ˛t C @ u t; L.Xt˛ / Xt˛ ˇ dt
2 2
where we used the PDE (5.121) satisfied by u before identifying a perfect square. If we
integrate both sides and use the definition of the cost J.˛/, we get:
Z hˇ
1 T ˇ i
J.˛/ D u 0; L.X0 / C E ˇ˛t C @ u t; L.X ˛ / X ˛ ˇ2 dt ;
2 0
t t
Remark 5.109 Equation (5.121) is a simple form of the master equation for the
optimal control of McKean-Vlasov dynamics which we shall derive in Chapter 6.
Remark 5.110 Benamou and Brenier’s Theorem 5.53 provides a first variational
formula for the 2-Wasserstein distance W2 . We shall show in Subsection 6.7.3 that
the 2-Wasserstein distance W2 can be viewed (up to a slight modification) as the
solution of an optimization problem of the type considered in this section.
As another application of the chain rule for the flow of marginals measures of an Itô
process, we revisit the propagation of chaos for McKean-Vlasov processes.
for all t 2 Œ0; T, x; x0 2 Rd and ; 0 2 P2 .Rd /, for some constant c > 0. We
assume that .˝; F0 ; P/ is atomless so that, for any 2 P2 .Rd /, there exists a
random variable X0 2 L2 .˝; F0 ; PI Rd / such that X0 .
Under these Cauchy-Lipschitz assumptions, Theorem 4.21 implies that, for any
initial random variable X0 2 L2 .˝; F0 ; PI Rd /, the McKean-Vlasov SDE (5.123) is
uniquely solvable. We emphasize that uniqueness must also hold in law. Indeed, for
any two solutions constructed on possibly different spaces, the standard Yamada-
Watanabe theorem permits to construct on the same probability space, two new
solutions, driven by the same initial condition and by the same random noise,
each one being distributed according to one of the two original laws. These
two new solutions also satisfy the McKean-Vlasov SDE, but on the same space.
Consequently, they must be equal. In particular, for any t > 0, the law of Xt only
depends upon the initial distribution of X0 . This makes it possible to define, for any
t > 0 and any function W P2 .Rd / ! R, the function Pt by:
Pt ./ D L.XtX0 / ; 2 P2 .Rd /:
Here we denote by XtX0 the solution at time t of (5.123) with initial condition X0 ,
and we assume that X0 has distribution .
As a side effect of the proof of existence and uniqueness of a solution of the
McKean-Vlasov SDE (5.123), we get that, for any time T > 0, there exists a
constant C > 0, such that, for any X0 ; X00 2 L2 .˝; F0 ; PI Rd /,
X0
E sup jXtX0 Xt 0 j2 6 C2 E jX0 X00 j2 :
06t6T
This implies that, for any t > 0, Pt W P2 .Rd / ! R is bounded and continuous
whenever is bounded and continuous for the 2-Wasserstein distance W2 . It also
shows that Pt is Lipschitz continuous for the same Wasserstein distance W2
whenever is Lipschitz continuous, and more generally, that Pt is bounded
and uniformly continuous whenever is bounded and uniformly continuous. In
particular, whenever b and are time-independent, uniqueness implies that .Pt /t>0
is a one-parameter semigroup of operators on the Banach space Cb .P2 .Rd /I R/
of bounded continuous functions on P2 .Rd / (equipped with the 2-Wasserstein
distance). It is also a one-parameter semigroup of operators on the Banach space
U C b .P2 .Rd /I R/ of bounded uniformly continuous functions on P2 .Rd / and on
504 5 Spaces of Measures and Related Differential Calculus
Therefore, for a given 2 U C 0 .P2 .Rd /I R/, we have, for any initial condition X0 2
L2 .˝; F0 ; PI Rd / and for any R > 0,
ˇ ˇˇ ˇˇ ˇˇ
ˇ
ˇ L.XtX0 / L.X0 / ˇ D ˇ L.XtX0 / L.X0 / ˇ1fEŒjX0 j2 6 R2 g
ˇ ˇˇ
ˇ
C ˇ L.XtX0 / L.X0 / ˇ1fEŒjX0 j2 >R2 g
ˇ ˇ
6 sup ˇ./ ./ˇ C sup ./
W2 .;/ 6 Ct.1CR/2 M2 ./ > R
C sup ./:
M2 ./2 > exp.C/R2 C
Choosing R large enough first, and then t small enough, we complete the proof of
the strong continuity of the semi-group.
Then, for any initial condition X0 for the McKean-Vlasov SDE (5.123), we may
expand the function RC 3 t 7! Pt ./ with L.X0 / D using the chain rule.
From the bound EŒsup06t6T jXtX0 j2 < 1, which holds true for any T > 0, and from
the growth conditions on and b we see that:
hˇ ˇ2 ˇ ˇ4 i
sup E ˇb t; XtX0 ; L.XtX0 / ˇ C ˇ t; XtX0 ; L.XtX0 / ˇ < 1;
06t6T
where we used the fact that is bounded in x. Therefore, Theorem 5.99 implies that
for all t > 0:
5.7 Applications 505
d d
Pt ./ D .t /
dt dt
Z
D b.t; v; t / @ .t /.v/dt .v/
Rd
Z
1
C trace .t; v; t /@v @ .t /.v/ dt .v/;
2 Rd
d
Pt ./ D Pt Lt ./; 2 P2 .Rd /; (5.125)
dt
and thus:
d
Pt ./ jtD0
D L0 ./; 2 P2 .Rd /: (5.126)
dt
Whenever b and do not depend upon time, Lt is also independent of t, and will
be denoted by L . The above application of the chain rule then says that the domain
of the infinitesimal generator of the strongly continuous semi-group .Pt /t>0 on
U C 0 .P2 .Rd // contains the intersection of U C 0 .P2 .Rd // with the space of partially
C 2 functions, intersection on which this generator coincides with L . As a result,
identity (5.125) can be interpreted as a forward Kolmogorov equation on P2 .Rd /.
Observe also that the intersection of U C 0 .P2 .Rd // and the space of partially C 2
functions is not empty. Indeed, for any compactly supported smooth function W
RC ! R, the function W P2 .Rd / 3 7! .M2 ./2 / is in this intersection, with
@ ./.v/ D 2v0 .M2 ./2 / and @v @ ./.v/ D 20 .M2 ./2 /. Any multiplication
of with a function of the same type as those described in the first step of the
proof of Theorem 5.99 also belongs to the intersection of U C 0 .Rd / with the space
of partially C 2 functions.
506 5 Spaces of Measures and Related Differential Calculus
Backward Equation
Assume now that there exists a subspace C of bounded smooth functions W
P2 .Rd / ! R for which the mapping:
Tt;
˚ W RC P2 .Rd / 3 .t; / 7! .L.XT // (5.127)
Tt;
where XT is the value of the solution at time T starting from a random variable
at time T t, satisfies assumption Joint Chain Rule together with (5.106).
Notice that we shall only need the simplest form of the chain rule since ˚ is
independent of the space variable x.
Then, for any initial condition X0 2 L2 .˝; F0 ; PI Rd / of the McKean-Vlasov
SDE (5.123), we get from the extended chain rule Proposition 5.102 that, for any
T > 0 and t 2 Œ0; T,
d
˚.T t; t /
dt
Z
D @t ˚.T t; t / C b.t; v; t / @ ˚.T t; t /.v/dt .v/ (5.128)
Rd
Z
1
C trace .t; v; t /@v @ ˚.T t; t /.v/ dt .v/;
2 Rd
where we used t D L.XtX0 /. Observe that the left-hand side must be zero since, for
any t 2 Œ0; T,
t; t
˚.T t; t / D L.XT / D L.XTX0 / : (5.129)
As a result, the right-hand side in (5.128) must be zero. Noticing that the dynamics
of the McKean-Vlasov SDE may be initialized from any 2 P2 .Rd / at any time
t 2 Œ0; T, we get the backward equation:
Z
@t ˚.T t; / C b.t; v; / @ ˚.T t; /.v/d.v/
Rd
Z (5.130)
1
C trace .t; v; /@v @ ˚.T t; /.v/ d.v/ D 0;
2 Rd
with the terminal condition ˚.T t; /jtDT D ./, for 2 P2 .Rd /. Observe
that the above PDE reads as a simplified (linear) version of the (nonlinear) master
equation (5.117).
for some integer N > 1, where .X0i /iD1; ;N are independent F0 -measurable random
variables with the same distribution as some X0 2 L2 .˝; F0 ; PI Rd / and .W i /iD1; ;N
are independent Wiener processes of dimension d. Without any loss of generality,
we can work on the same probability space as above. As before, we use the empirical
measure:
1X
N
N Nt D ı i; t 2 Œ0; T:
N iD1 Xt
Assume, in addition to the previous assumptions, that is bounded and that, for any
function in the class C , the function ˚ also satisfies:
being continuous;
3. it holds that:
Z
ˇ ˇ ˇ ˇ
sup sup ˇ@ ˚.t; /.v/ˇ2 C ˇ@2 ˚.t; /.v; v/ˇ2 d.v/ < 1:
t2Œ0;T 2P2 .Rd / Rd
Notice in particular that, for any t 2 Œ0; T, the function ˚.t; / is fully C 2 .
Then, Propositions 5.35 and 5.91 imply that the empirical projection function:
1X
N
Œ0; T .Rd /N 3 t; .x1 ; ; xN / 7! ˚ T t; ıx i
N iD1
is of class C 1;2 with specific partial derivatives, so that we can apply standard Itô’s
formula. We get:
d ˚ T t; N Nt
D @t ˚ T t; / N Nt dt
1 X i N
N
C b t; Xt ; N t @ ˚.T t; N Nt /.Xti /dt
N iD1
1X
N
C @ ˚.T t; N Nt /.Xti / t; Xti ; N Nt dWti
N iD1
508 5 Spaces of Measures and Related Differential Calculus
1 X
N h i
C trace t; Xti ; N Nt @v @ ˚.T t; N Nt /.Xti / dt
2N iD1
1 X
N h i
C 2
trace t; Xti ; N Nt @2 ˚.T t; N Nt /.Xti ; Xti / dt:
2N iD1
Notice that the second and fourth terms in the right-hand side can be rewritten as:
1 X i N
N
b t; Xt ; N t @ ˚.T t; N Nt /.Xti /
N iD1
Z
D b t; v; N Nt @ ˚.T t; N Nt /.v/dN Nt .v/;
Rd
1 X
N h i
trace t; Xti ; N Nt @v @ ˚.T t; N Nt /.Xti /
2N iD1
Z h i
1
D trace t; v; N Nt @v @ ˚.T t; N Nt /.v/ dN Nt .v/;
2 Rd
so that, using the PDE (5.130) satisfied by ˚ at .t; N Nt /, and as before, the notation
t D L.XtX0 /, we get:
d ˚ T t; N Nt ˚ T t; t
D d ˚ T t; N Nt
1 X i N
N
D t; Xt ; N t @ ˚.T t; N Nt /.Xti /dWti (5.131)
N iD1
1 X
N h i N 2 i
C trace t; X ;
N @ ˚.T t;
N N
/.X i
; X i
/
t dt;
2N 2 iD1 t t t t
where we used the fact that ˚.T t; t / is constant in t, see (5.129). By assumption 3
above, there exists a constant C such that:
h ˇ ˇ2 i C hˇ ˇ2 i
E sup ˇ˚ T t; N Nt ˚ T t; t ˇ 6 C CE ˇ˚ T; N N0 ˚ T; 0 ˇ :
06t6T N
Since ˚.T; / is continuous and bounded, the right-hand side tends to 0. Choosing
t D T in the left-hand side, we get:
hˇ ˇ2 i
8 2 C; lim E ˇ N NT T ˇ D 0; (5.132)
N!1
5.7 Applications 509
which is another form of the propagation of chaos, at least when the class C is rich
enough.
the normalization by the root of N in the right-hand side being reminiscent of that
appearing in the statement of the central limit theorem whilst the last term may be
estimated by means of Theorem 5.8.
Another way of quantifying the convergence is to take the expectation in (5.131).
Under the same assumption as above, we get:
ˇ N ˇ C ˇ ˇ
ˇE N T ˇ 6 C ˇEŒ˚.T; N N0 / ˚.T; 0 /ˇ; (5.133)
T
N
which provides a bound on the rate of convergence of the semi-group generated
by the system of particles towards the limiting McKean-Vlasov flow. Obviously, the
order of the last term should depend on the smoothness of ˚.T; /.
for t 2 Œ0; T. With these notations and definition (5.124) for .Lt /t>0 , the chain
rule (5.107) now reads:
d u.t; t ; t / D @t u.t; t ; t / C Lt;x u .t; t ; t / C Lt; u .t; t ; t / dt
C @x u.t; t ; t / t dWt ; t 2 Œ0; T;
ŒLt;x u.t; x; / D ŒLt u.t; ; /.x/; and ŒLt; u.t; x; / D ŒLt u.t; x; /./:
as presented in the recent work of Fournier and Guillin [161]. The upper bounds
provided in the text for the rate of convergence of the empirical measures toward
the true distribution are essentially optimal. Indeed, the lower bound proved in [36]
by Barthe and Bordenave shows that the rate N 1=d cannot be improved for the
1-Wasserstein distance W1 in dimension d > 3.
There are many notions of differentiability for functions defined on spaces of
probability measures, and recent progress in the theory of optimal transportation
have put some of them in the limelight. We refer the interested reader to the books
of Ambrosio, Gigli and Savaré [21] and Villani [337, 338] for detailed exposés of
these geometric approaches in textbook form. The notion of differentiability used
in the text was introduced by P.L. Lions in his lectures at the Collège de France
[265], hence our terminology of L-differentiability. Our presentation benefited from
Cardaliaguet’s lucid and readable account [83]. In particular, the statements and
the strategies of the proofs of Propositions 5.24 and 5.25 are borrowed from [83].
The idea behind the connection between the L-derivative and the linear functional
derivative is from the paper by Cardaliaguet, Delarue, Lasry, and Lions [86].
Proposition 5.51 was already proven in that paper, though using a different strategy.
As for the various notions of differentiability in normed spaces used in the text, we
refer to any textbook on analysis and differential calculus, see for instance [308].
Subsections 5.3.1, 5.3.2, and 5.3.3 are essentially borrowed from the paper by
Carmona and Delarue [98]. The discussion of the Blackwell and Dubins’ theorem
is inspired from the original note by Blackwell and Dubins [63]. The connection
between monotonicity and convexity, as exposed in Subsection 5.5.2, was discussed
in Lions’ lectures at the Collège de France [265].
The reader interested in the theory of optimal transportation can complement the
results presented in this chapter, including Brenier’s theorem whose original proof
can be found in [69], and the transport along vector fields like in the Benamou
and Brenier’s theorem, with Villani’s book [337], and Ambrosio, Gigli and Savaré
textbook [21].
The sobering counter-example showing the lack of differentiability and convexity
of the square of the Wasserstein distance to a fixed measure is borrowed from
the book [21] by Ambrosio, Gigli, and Savaré. The remaining discussion about
the Wasserstein gradients is modeled after this book. Theorem 5.53 is a version
of a famous result by Benamou and Brenier [41] and Otto [295]. Once again, the
statement is taken from [21], but the proof given in the text, which requires the
probability measures to be absolutely continuous, follows the arguments used in
Villani’s monograph [337]. The properties of the inf-sup convolution, as used in the
proof of Lemma 5.61, may be found in the paper by Lasry and Lions [259].
The notions of full and partial C 2 regularity presented in Section 5.6 are taken
from the papers by Buckdahn, Li, Peng, and Rainer [79] and by Chassagneux,
Crisan, and Delarue [114]. The various versions of Itô’s formula are taken from
the same work. The sufficient condition for ensuring the partial C 2 regularity can
also be found in [114]. We shall use it in Chapter (Vol II)-5 in order to prove the
existence of a classical solution to the master equation.
512 5 Spaces of Measures and Related Differential Calculus
The verification argument for the master equation established in Subsection 5.7.2
is inspired by a similar result for classical FBSDEs which can be found in the paper
[271] by Ma, Protter, and Young, and from the introductory paper of Carmona and
Delarue [97] on the master equation. We shall provide an in-depth analysis of the
master equation in Chapters (Vol II)-4 and (Vol II)-5. The optimization problem
introduced in Subsection 5.7.3 will be revisited in Chapter 6. We refer to the paper
by Gangbo, Nguyen, and Tudorascu [167] for an analytic treatment of this example.
The analysis of the semi-group generated by a McKean-Vlasov diffusion process
on Cb .P2 .Rd /I Rd / as well as the corresponding propagation chaos were inspired
by the analysis of the convergence of finite games toward mean field games, as
provided in the paper by Cardaliaguet, Delarue, Lasry, and Lions. We address these
questions of convergence in Chapter (Vol II)-6. Similar ideas to that exposed in
Subsection 5.7.4 have been developed by Kolokoltsov for investigating propagation
of chaos, see for instance the earlier article [233] together with the monograph [234];
in particular, the inequality (5.133) in Remark 5.111 plays a key role in [233, 234]
and in the subsequent works by the same author on mean field games, see the Notes
& Complements of Chapter (Vol II)-6 for precise references. Propagation of chaos
for standard McKean-Vlasov SDEs will be revisited in Chapter (Vol II)-2.
Optimal Control of SDEs
of McKean-Vlasov Type 6
Abstract
The purpose of this chapter is to provide a detailed probabilistic analysis of
the optimal control of nonlinear stochastic dynamical systems of McKean-
Vlasov type. We tackle the characterization and construction of solutions of
this special type of optimal control problem by means of forward-backward
stochastic differential equations. Because of the presence of the distribution of
the controlled state in the coefficients, the approach based on the Pontryagin
stochastic maximum principle requires special attention. We provide a version
of this maximum principle based on the differential calculus for functions of
probability measures introduced and developed in Chapter 5. We test the results
of the analysis on linear quadratic models and a few other models already
considered in the framework of mean field games. Finally, we highlight the
similarities and the differences between this problem and MFG problems with
which it is often confused.
6.1 Introduction
generator could be done with tools developed for infinite dimensional differential
operators, the standard differential calculus, even in infinite dimension, would have
a hard time capturing the fact that the second component of the state process has to
match the marginal distribution of the first component. The chain rules developed
in Chapter 5 were introduced to handle these difficulties.
Because of the crucial role they play in the probabilistic approach to mean
field games, existence and uniqueness results for forward, backward, and forward-
backward stochastic differential equations of the McKean-Vlasov (MKV for short)
type were given in Section 4.2 and Section 4.3 of Chapter 4. While most of the
proofs benefited from an understanding of the metric structure of the spaces of
probability measures, they did not require any form of differential calculus on
these spaces. Here we concentrate on the optimal control of stochastic systems
whose dynamics are given by equations of the McKean-Vlasov type. This is where
the special differential calculus introduced in Chapter 5 comes handy. Strangely
enough, the optimal control of dynamics driven by McKean-Vlasov SDEs seems to
be a brand new problem, to a great degree ignored in the standard stochastic control
literature. See nevertheless the Notes & Complements at the end of the chapter for
exceptions.
As we saw in Chapter 4, solving a McKean-Vlasov SDE is done by a fixed point
argument. First one fixes a set of candidates for the distribution of the solution, then
one solves the resulting standard SDE, the fixed point argument being to demand
that the distribution of the solution be equal to the distribution we started from. A
stochastic control problem adds an extra optimization layer to the fixed point. This
formulation bears a lot of resemblance to the approach to mean field game problems
as formulated in Chapters 1 and 3, and it is of the utmost importance to understand
the extent of the similarities and differences between the two problems.
SDEs of McKean-Vlasov type were introduced to describe the asymptotic
behavior of a generic element of a large population of particles with mean field
interactions. The adjective large underscores the fact that the analysis is intended
to describe the asymptotic regime when the number of particles tends to infinity.
In this asymptotic regime, particles become independent of each others, and the
state of each single particle satisfies an SDE of McKean-Vlasov type. Such a
phenomenon is usually referred to as propagation of chaos. We already alluded to
it in the applications of Chapter 5 and we shall revisit it in a more detailed fashion
in Chapter (Vol II)-1. To see the relevance of this theory to the models of stochastic
differential games where players interact in a mean field way, we assume for the
sake of definiteness that the private states XN;i D .XtN;i /06t6T of N players satisfy
the system of SDEs:
dXtN;i D b t; XtN;i ; N NXN ; ˛ti dt C t; XtN;i ; N NXN ; ˛ti dWti ; t 2 Œ0; T; (6.1)
t t
for some time horizon T > 0 and with a common (deterministic) initial con-
dition. Such a model is similar to those introduced in Chapter 2. However,
differently from Chapter 2, we assume that all the players use distributed controls
6.1 Introduction 515
.˛ti D .t; XtN;i //06t6T given by the same feedback function in order to minimize
an expected cost from running and terminal costs:
Z T
N;i N N;i N
J ./ D E
i
f t; Xt ; N X N ; ˛t dt C g XT ; N X N :
i
(6.2)
t
0 T
Actually, J i ./ is in fact independent of i and is thus common to all the players since
all of them use the same feedback function.
In contrast with our approach to MFG problems, instead of optimizing over
the control right away, we assume that the common feedback function is
momentarily kept fixed, and we first consider the large population limit. The theory
of propagation of chaos states that, in the limit N ! 1, for any fixed integer k,
the joint distribution of the k-dimensional process .XtN;1 ; ; XtN;k /06t6T converges
to a product distribution (in other words the k processes XN;i D .XtN;i /06t6T
for i D 1; ; k become independent in the limit) and the distribution of each
single marginal process converges toward the distribution of the unique solution
X D .Xt /06t6T of the McKean–Vlasov evolution equation:
dXt D b t; Xt ; L.Xt /; .t; Xt / dt
(6.3)
C t; Xt ; L.Xt /; .t; Xt / dWt ; t 2 Œ0; T;
subject to (6.5)
dXt D b t; Xt ; L.Xt /; ˛t dt C t; Xt ; L.Xt /; ˛t dWt ; t 2 Œ0; T:
516 6 Optimal Control of SDEs of McKean-Vlasov Type
Summary. While similar in purpose to the MFG strategy used to identify equilibria
for large symmetric games, the above plan of action may lead to a different notion
of equilibrium. To emphasize this point, we rely on the following diagram which
illustrates the fact that there are two different paths to go from the North West corner
corresponding to the statement of a N player stochastic differential game, to the
South East corner where one should expect to find equilibria. For better or worse,
this diagram is not commutative and this chapter provides insight on the matter.
Accordingly, we demonstrate by examples that the choice of a particular path has
drastic consequences on the properties of the resulting equilibria.
the adjoint processes. The presence of the minimizer in both of these equations
creates a strong coupling between the forward and backward equations, and the
solution of the control problem reduces de facto to the solution of a Forward-
Backward Stochastic Differential Equation (FBSDE). In the present situation, the
marginal distributions of the solutions appear in the coefficients. These FBSDEs of
McKean-Vlasov type were studied in Section 4.3, but one of the assumptions used
there, typically the boundedness of the coefficients with respect to the state variable,
precludes the application of this result to the Linear Quadratic (LQ) models often
used as benchmarks in stochastic control. Like in Chapters 3 and 4, we extend the
basic results of Section 4.3 by taking advantage of the convexity of the Hamiltonian.
A strong form of this convexity assumption can be used to apply the continuation
method, providing existence and uniqueness of the solution of the FBSDE at hand.
Restoring the Markov property by extending the state space as alluded to earlier,
we identify the backward component of the solution of this FBSDE to a function of
the forward component and its marginal distribution. This function is known as the
decoupling field of the FBSDE. In the classical cases, it can be found by solving
a PDE. In the present set-up, such a PDE is infinite dimensional as it involves
differentiation with respect to the state of the forward dynamics as well as its
distribution. Precisely, it reads as the derivative of an infinite dimensional Hamilton-
Jacobi-Bellman equation, similar to that presented in the applications of Chapter 5.
This PDE is formulated in Section 6.5. Somehow, it is related to the master equation
for mean field games, which we shall address with care in the second volume of the
book.
In analogy with our presentation of mean field games in Chapter 3, we address the
optimal control of McKean-Vlasov diffusion processes in two different ways, based
on a direct analytic formulation and a probabilistic methodology respectively. In the
case of the analytic approach, we develop the Pontryagin and the HJB strategies for
the optimization problem, while we concentrate mostly on the Pontryagin approach
in the probabilistic case. Interestingly, we shall see that in both cases, the Pontryagin
systems can be linked to forward/backward systems of the types of those identified
in Chapters 3 and 4 for MFG problems. We shall take advantage of this unexpected
connection when we revisit the class of potential mean field game problems later in
the chapter.
Controlled Dynamics
For a given initial condition 2 L2 .˝; F0 ; PI Rd /, the stochastic dynamics of
the controlled state are given by a stochastic process X D .Xt /06t6T satisfying a
nonlinear SDE of the form:
where the drift and diffusion coefficients of the state Xt of the system at time t
are given by a pair of deterministic (measurable) functions .b; / W Œ0; T Rd
P2 .Rd / A ! Rd Rdd , see Proposition 5.7 for the description of the -field
on P2 .Rd /, and ˛ D .˛t /06t6T is a progressively measurable process with values
in a measurable space .A; A/. Typically, A will be a closed convex subset of the
Euclidean space Rk , for k 1, and A the -field induced by the Borel -field
of this Euclidean space. The fact that W and X are required to have the same
dimension d is for convenience only. As already explained in the introduction, the
term nonlinear used to qualify (6.6), does not refer to the fact that the coefficients
b and could be nonlinear functions of x, but instead to the fact that they depend
not only on the value of the unknown process Xt at time t, but also on its marginal
distribution L.Xt /. Using the terminology introduced in Section 4.2, we call (6.6) a
controlled stochastic differential equation of the McKean-Vlasov type. Sometimes
the McKean-Vlasov dynamics posited in (6.6) are also called of mean field type,
in which case we use the terminology mean field stochastic control problem. This
is justified by the fact that the uncontrolled stochastic differential equations of
McKean-Vlasov type first appeared as the infinite particle limits of large systems
of particles with mean field interactions (see for instance Chapter 1 of Volume II).
Throughout the chapter, we assume that the drift coefficient b and the volatility
satisfy the following assumptions:
Remark 6.1 The above choice of admissibility is motivated by the search for
optimal controls in open loop forms. This class of controls is especially well suited to
the probabilistic approach based on the Pontryagin stochastic maximum principle.
However, we shall consider other classes of admissible controls from time to time.
Indeed, it will be convenient to work with Markovian controls in closed loop
feedback form ˛t D .t; Xt / for a deterministic function W Œ0; T Rd ! A
when we introduce the value function of the optimization problem, and we search
for equations (typically PDEs) satisfied by this value function. We refer to the next
section for a first account in that direction.
Cost Functional
Under the above assumptions (A1) and (A2), Theorem 4.21 of Section 4.2.1 implies
that for every admissible control ˛ 2 A, there exists a unique solution X D X˛
of (6.6). Moreover for every p 2 Œ1; 2, this solution satisfies:
E sup jXt jp < 1: (6.8)
06t6T
over the set A of admissible control processes. The running cost function f is given
by a real valued deterministic (measurable) function on Œ0; TRd P2 .Rd /A, and
the terminal cost function g by a real valued deterministic (measurable) function on
Rd P2 .Rd /. Assumptions on the cost functions f and g will be spelled out later on,
but typically we shall assume:
for all .t; x; ; ˛/ 2 Œ0; T Rd P2 .Rd / A, where M2 ./ denotes the square
root of the second moment of , see (3.26).
6.2 Probabilistic and Analytic Formulations 521
For the sake of simplicity we assume that all the coefficients are deterministic.
Some of the results can be extended to random coefficients. See the Notes &
Complements at the end of the chapter for a discussion of some of these possible
generalizations.
As before, we use a bold face character to indicate that we are working with a
function of t, and we use the notation ./ to emphasize that we are dealing with a
function of x 2 Rd . In this way, we can distinguish the notation ˛./ D ..t; //06t6T
for the feedback function from the values ˛ 2 A in the range of this function
and, also, from the control process ˛ D .˛t D ..t; Xt ///06t6T obtained by
implementing the feedback function ˛./ in the SDE (6.6). Since we keep the
discussion in this section at an informal level only, we do not discuss the well
posedness of (6.10).
522 6 Optimal Control of SDEs of McKean-Vlasov Type
if we use the notation h'; i for the integral of the function ' with respect to the
measure .
Motivated by this (deterministic) form of the objective function, we rewrite
the dynamics of the controlled state in terms of its distribution only, replacing
the dynamical equation (6.3) given by a stochastic differential equation, by a
deterministic dynamical equation for the marginal distributions themselves. A
simple application of the classical form of Itô’s formula to the McKean-Vlasov
SDE (if well posed) shows that, similar to (3.12), the dynamics of the marginal
distributions are given by the nonlinear Kolmogorov-Fokker-Planck’s equation:
.t;/;
@t t D Lt t ; t 2 Œ0; T I 0 ; (6.12)
.t;/;
where the action of the operator Lt on measures (in the sense of distributions)
is given by:
.t;/;
Lt D divx b.t; ; ; .t; //
1 (6.13)
C trace @2xx .t; ; ; .t; // .t; ; ; .t; // :
2
So, instead of considering (6.3)–(6.4), we may want to focus on the deterministic
optimal control problem:
Z
T ˝ ˛ ˝ ˛
inf f t; ; t ; .t; / ; t dt C g.; T /; T ; (6.14)
0
the infimum being taken over the set of pairs .t ; .t; //06t6T , for which the
mappings Œ0; T 3 t 7! t 2 P2 .Rd / and Œ0; T Rd 3 .t; x/ 7! .t; x/ are
measurable, and satisfy:
Z T
k.t; /k2L2 .Rd ;t IRk / dt < 1;
0
b.t; x; ; ˛/ D ˛; .t; x; ; ˛/ D 0;
0 if D 1 (6.15)
f .t; x; ; ˛/ D j˛j2 ; g.x; / D ;
C1 otherwise
Q x; X;
H.t; Q y; z; ˛/ D H.t; x; ; y; z; ˛/ (6.17)
for any random variable XQ with distribution . Below, we shall use the following
convention: Whenever X is a random variable constructed on .˝; F; P/, XQ denotes
an independent copy on a clone .˝; Q F; Q of the space .˝; F; P/, according to the
Q P/
same principle as in Chapter 5. Now, for any t 2 Œ0; T, we may regard the random
variable Xt , seen as an element of L2 .˝; F; PI Rd /, as the state variable itself as it
encodes both the realization Xt .!/ of Xt and the copy XQ t .
So we may regard the full-fledged adjoint variables of the mean field control
problem as random variables Y 2 L2 .˝; F; PI Rd / and Z 2 L2 .˝; F; PI Rdd /, in
which case the Hamiltonian should be given by:
Q X; Y; Z; ˇ/ D E H.t;
H.t; Q X; X;
Q Y; Z; ˇ/ ;
1. First, if ˛.t;
O x; ; y; z/ is a minimizer of A 3 ˛ 7! H.t; x; ; y; z; ˛/ in the sense
that:
Q X; Y; Z; ˇ/ D @x H.t; X; L.X/; Y; Z/
DX H.t;
(6.20)
C EQ @ H.t; X;
Q L.X/; Y;
Q Z; Q
Q ˇ/.X/ ;
the second term in the right-hand side resulting from the definition of the L-
differential and from Fubini’s theorem, see Example 3 in Subsection 5.2.2.
.t;/;
the partial differential operator Lt being defined in (6.13). Obviously, this partial
differential operator is unbounded but, for pedagogical reasons, we choose to ignore
the technical issues related to the definition of its domain and keep the discussion
rather informal.
First, we observe that the state variable is a probability measure in P2 .Rd / and
we
R shall regard it as an element of the space of signed measures on Rd such that
2
Rd .1Cjxj/ djj.x/ < 1. So by duality, the adjoint variable should be a continuous
function u 2 C.Rd I R/ with sub-quadratic growth. It is then natural to introduce the
(formal) Hamiltonian:
D 1 h iE
H t; ; u; ˇ D u./; divx b.t; ; ; ˇ./ C trace @2xx a t; ; ; ˇ./
2
˝ ˛
C f t; ; ; ˇ./ ; ;
for u 2 C.Rd I R/ with sub-quadratic growth as dual variable of the state variable
2 P2 .Rd / and for ˇ 2 L2 .Rd ; I A/. As always, we use the short notation a D
. Whenever u is smooth enough, say C 2 , we can use integration by parts. We get:
D 1 h i E
H t; ; u; ˇ D b t; ; ; ˇ./ @x u./ C trace a t; ; ; ˇ./ @2xx u./ ;
2
˝ ˛
C f t; ; ; ˇ./ ; :
526 6 Optimal Control of SDEs of McKean-Vlasov Type
1
K.t; x; ; y; z; ˛/ D b.t; x; ; ˛/ y C a.t; x; ; ˛/ z C f .t; x; ; ˛/
2 (6.22)
1
D H t; x; ; y; 2
z .t; x; ; ˛/; ˛ ;
such that:
K t; x; ; y; z; ˛ .t; x; ; y; z/ D inf K.t; x; ; y; z; ˛/; (6.23)
˛2A
H .t; ; u/
Z
D K t; x; ; @x u.x/; @2xx u.x/; ˛ .t; x; ; @x u.x/; @2xx u.x// d.x/;
Rd
optimization problem. Since we are using the standard duality between function
spaces and spaces of measures, we should use the functional derivative introduced
in Subsection 5.4.1 instead of the L-derivative in order to compute this derivative.
We get:
ıH
t; ; u; ˇ ./
ım
D b t; ; ; ˇ./ @x u./
1
C trace a t; ; ; ˇ./ @2xx u./ C f t; ; ; ˇ./ (6.24)
2
D ıb
C t; ; ; ˇ./ ./ @x u./
ım
1 ıa ıf E
C trace t; ; ; ˇ./ ./ @2xx u./ C t; ; ; ˇ./ ./; ;
2 ım ım
for t, , u and ˇ as above. In the duality products, the integration variable is and
is fixed.
In order to be consistent with the notations introduced in Chapter 5, we denote by
ı=ım the linear functional derivative with respect to the measure argument, although
the latter one is denoted by . In this way, we avoid any confusion with the notation
@ for the L-derivative.
where .s /t6s6T has the dynamics (6.12) with the initial condition t D 2
P2 .Rd / at time t 2 Œ0; T. The dynamic programming principle says that it should
satisfy (at least in some generalized sense) the HJB equation:
ıv
@t v.t; / C H t; ; .t; / D 0; .t; / 2 Œ0; T P2 .Rd /: (6.26)
ım
The third argument in the Hamiltonian H is the function u./ D Œıv=ım.t; /./
and the terminal condition is v.T; / D hg.; /; i. For exactly the same reasons
as above, we use the standard Fréchet or Gateaux differentiation introduced in
Subsection 5.4.1 instead of the L-derivative in order to compute the derivative of
the value function with respect to the state variable .
528 6 Optimal Control of SDEs of McKean-Vlasov Type
Example 6.3 Let us assume that the matrix is equal to the identity, that the drift
is equal to the control, i.e., b.t; x; ; ˛/ D ˛ 2 A, with A D Rd , and that the running
cost is separable in the sense that:
1 2
f .t; x; ; ˛/ D j˛j C f0 .t; x; /;
2
or equivalently:
Z
1 ˇˇ ˇ2
ˇ
@t v.t; / C ˇ@ v.t; /.x/ˇ
Rd 2
(6.27)
1
C trace @v @ v.t; /.x/ C f0 .t; x; / d.x/ D 0;
2
if we use the relationship between the L-derivative and the functional derivative
proven in Subsection 5.4.1 of Chapter 5.
Example 6.4 We now assume that the matrix is equal to the identity, that the
drift does not depend upon the measure , i.e., b.t; x; ; ˛/ D b.t; x; ˛/, and that the
running cost is separable in the sense that f .t; x; ; ˛/ D f0 .t; x; / C f1 .t; x; ˛/ for
some smooth functions f0 and f1 defined on Œ0; T Rd P2 .Rd / and Œ0; T Rd A
respectively. In this case, the minimizer ˛ of K.t; x; ; y; z; / is the minimizer of
b.t; x; ˛/ y C f1 .t; x; ˛/ (and also of H.t; x; ; y; z/), and for that reason, it depends
neither on the measure nor on z, i.e., ˛ .t; x; ; y; z/ D ˛.t;
O x; ; y; z/ D ˛ .t; x; y/
(which we shall also denote by ˛.t; O x; y/). Consequently:
Z h 1
H .t; ; u/ D b t; x; ˛O t; x; @x u.t; x/ @x u.t; x/ C x u.t; x/
Rd 2
i
Cf0 .t; x; / C f1 t; x; ˛O t; x; @x u.t; x/ d.x/;
6.2 Probabilistic and Analytic Formulations 529
and if we replace the dual variable u by ıv=ım, as before, we can use the fact that
@x u becomes @ v and u becomes traceŒ@x @ v. Therefore, the HJB equation reads:
Z
@t v.t; / C b t; x; ˛O t; x; @ v.t; /.x/ @ v.t; /.x/
Rd
1
C trace @v @ v.t; /.x/ C f0 .t; x; / (6.28)
2
C f1 t; x; ˛O t; x; @ v.t; /.x/ d.x/ D 0;
for .t; / 2 Œ0; T P2 .Rd /. Making use of the shorten notation H.t; x; ; y; ˛/ for
the reduced Hamiltonian associated with the full Hamiltonian (6.16), (6.28) can also
be written:
Z
1
@t v.t; / C trace @x @ v.t; /.x/
R d 2
C H t; x; ; @ v.t; /.x/; ˛O t; x; @ v.t; /.x/ d.x/ D 0;
and
hˇ ˇ2 i
E ˇ@x g.XT ; L.Xt //ˇ C EQ j@ g.XT ; L.Xt //.XQ T /j2 < 1; (6.30)
we call adjoint processes of X (or of the admissible control ˛), any couple .Y; Z/ of
progressively measurable stochastic processes Y D .Yt /06t6T and Z D .Zt /06t6T in
S2;d H2;dd satisfying the equation:
8 h
ˆ
ˆ dYt D @x H t; Xt ; L.Xt /; Yt ; Zt ; ˛t
ˆ
ˆ i
ˆ
<
CEQ @ H t; XQ t ; L.Xt /; YQ t ; ZQ t ; ˛Q t /.Xt / dt
(6.31)
ˆ
ˆ CZt dWt ; t 2 Œ0; T;
ˆ
ˆ
:̂
YT D @x g XT ; L.XT / C EQ @ g XQ T ; L.XT / .XT / ;
Q Y;
where .X; Q Z;
Q ˛/Q is an independent copy of .X; Y; Z; ˛/ defined on the space
.˝; F; P/ and EQ denotes the expectation on .˝;
Q Q Q Q
Q P/.
Q F;
Remark 6.6 Beyond the mathematical rationale (6.20) for the form of the above
adjoint equation, its intuitive justification can be argued as follows. The search
for optimality goes through the computation of the variations of the cost J.˛/
for infinitesimal variations in the control process ˛. The main difference with the
classical case is that when ˛ varies, the variations of both the state .Xt˛ /06t6T and its
distribution .L.Xt˛ //06t6T have to be controlled. So it should not come as a surprise
that derivatives with respect to the state variable x and the measure appear in the
right-hand sides of both the equation for .Yt /06t6T and its terminal condition YT .
The specific ways in which these derivatives enter formula (6.31) directly follow
from (6.20). However, they may not appear transparent in a first reading. They
are best understood from the proofs of the necessary and sufficient conditions
of the stochastic maximum principle given below. As in the proof of (6.20),
these derivatives are manipulated inside an expectation, and an interchange of
this expectation and the expectation over the space introduced for the lifting of
the functions of measures is the source of the special form of formula (6.31).
Fubini’s theorem is the reason for the special role played by the independent copies
appearing in (6.31).
6.2 Probabilistic and Analytic Formulations 531
with the terminal condition YT D @x g.XT ; L.XT // C EŒ@ Q g.XQ T ; L.XT //.XT /.
Notice that @x b and @x are bounded since b and are assumed to be Lipschitz
continuous in the variable x by (A2). Also, EŒj@ b.t; XQ t ; L.Xt /; ˛Q t /.Xt /j2 1=2 and
EŒj@ .t; XQ t ; L.Xt /; ˛Q t /.Xt /j2 1=2 are bounded by c since b and are assumed to be
c-Lipschitz continuous in the variable with respect to the 2-Wasserstein distance.
Indeed, Remark 5.27 ensures that, for a differentiable function h W P2 .Rd / ! R,
which is c-Lipschitz continuous in with respect to the 2-Wasserstein distance, it
holds EŒj@ h.X/j2 1=2 6 c, for any 2 P2 .Rd / and any random variable X having
distribution .
Remark 6.7 The notation ˇ is especially convenient for terms of the form @x ˇ z
or @ ˇ z. In contrast, we did not introduce it in Chapter 4 since, in the various
applications of the stochastic maximum principle addressed therein, is assumed
to be constant.
532 6 Optimal Control of SDEs of McKean-Vlasov Type
ıH
du.t; / D t; t ; u.t; /; .t; / ./ dt;
ım
Z
ı
uT ./ D g.x; /d.x/ ./;
ım Rd jDT
where .t /06t6T satisfies (6.12). Again, the fact that we use the standard duality
between function spaces and spaces of measures prompts us to compute the deriva-
tive of the Hamiltonian and the terminal condition using functional derivatives. So
the adjoint equation should read:
6.2 Probabilistic and Analytic Formulations 533
dut
./ D b t; ; t ; .t; / @x ut ./
dt
1
C trace a t; ; t ; .t; / @2xx ut ./ C f t; ; t ; .t; /
2
D ıb
C t; ; t ; .t; / ./ @x ut ./ (6.34)
ım
1 ıa
C trace t; ; t ; .t; / ./ @2xx ut ./
2 ım
ıf E
C t; ; t ; .t; / ./; t ;
ım
where we denoted u by .ut .//06t6T instead of .u.t; //06t6T and where we used
as integration variable in the duality products. The terminal condition should be:
Z
ıg
uT ./ D g.; T / C .x; T /./dT .x/:
Rd ım
˛ being the minimizer of K defined in (6.23) and u D .u.t; //06t6T solving (6.34).
Therefore, combining the Kolmogorov-Fokker-Planck equation (6.12) for D
.t /06t6T with the equation (6.34) for u D .u.t; //06t6T , we deduce that the pair
.; u/ should necessarily solve the forward-backward system:
8 ˛ .t;;t ;@x u.t;/;@2xx u.t;//;
ˆ
ˆ @t t D Lt t ; t 2 Œ0; T I 0 ;
ˆ
ˆ
ˆ
ˆ ıH
< @t u.t; x/ D t; t ; u.t; /; ˛ t; ; t ; @x u.t; /; @2xx u.t; / .x/;
ım (6.35)
ˆ
ˆ .t; x/ 2 Œ0; T Rd ;
ˆ
ˆ Z
ˆ ıg
:̂ u.T; x/ D g.x; T / C .y; T /.x/dT .y/; x 2 Rd ;
R d ım
where ıH =ım can be computed as in (6.24), namely, with the same notation as
in (6.21):
ıH
t; ; u; ˇ .x/ D K t; x; ; @x u.x/; @2xx u.x/; ˇ.x/
ım
Z
ıK
C t; y; ; @x u.y/; @2xx u.y/; ˇ.y/ .x/d.y/:
Rd ım
534 6 Optimal Control of SDEs of McKean-Vlasov Type
In full analogy with Remark 3.26 for stochastic control in finite dimension, the
adjoint variable u should coincide with the derivative of the value function v D
.v.t; //06t6T in (6.25) computed along the optimal path .t /06t6T , namely we
should have:
ıv
u.t; x/ D .t; t /.x/; .t; x/ 2 Œ0; T Rd : (6.36)
ım
Using the relationship between the L-derivative and the functional derivative, we
derive the following (formal) identification:
for .t; x/ 2 Œ0; T Rd . Relationship (6.36) is nothing but the analogue of the
equation (4.8) for the decoupling field of a forward-backward system. Here, the
decoupling field is the mapping:
ıv
U W Œ0; T Rd P2 .Rd / 3 .t; x; / 7! .t; /.x/:
ım
Differentiating the HJB equation (6.26) with respect to , we deduce that U should
satisfy (at least formally):
ı h i
@t U.t; x; / C H t; ; U.t; ; / .x/ D 0; (6.37)
ım
for .t; x; / 2 Œ0; T Rd P2 .Rd /, which is sometimes called the master equation
of the McKean-Vlasov control problem (6.12)–(6.14). Unfortunately, it is only fair
to say that the jury is still out on the use of the terminology master equation. See
nevertheless the discussion in Section 6.5. Here, the terminal boundary condition
for U.T; ; / is:
Z
ıg
U.T; x; / D g.x; / C .y; /.x/d.y/; x 2 Rd ; 2 P2 .Rd /:
R d ım
We conclude this subsection by revisiting the two examples for which we derived
the HJB equations, and we write explicitly the corresponding master equations.
Example 6.8 Under the assumptions of Example 6.3 above, the master equation
takes the form:
1 1 ˇˇ ˇ2
@t U.t; x; / C x U.t; x; / @x U.t; x; /ˇ C f0 .t; x; /
2 2
Z h
1 ıU ıU
C y .t; y; /.x/ @y .t; y; /.x/ @y U.t; y; /
R d 2 ım ım
ıf0 i
C .t; y; /.x/ d.y/ D 0;
ım
6.2 Probabilistic and Analytic Formulations 535
for .t; x; / 2 Œ0; TRd P2 .Rd /, where @y U.t; y; / and @y ŒıU=ım.t; y; /.x/ are
seen as column vectors of dimension d. Notice that, on the first line, we recognize
the structure of a finite dimensional HJB equation.
Example 6.9 Under the assumptions of Example 6.4 above with A D Rk , we have:
1
@t U.t; x; / C x U.t; x; / C H t; x; ; @x U.t; x; /; ˛
O t; x; @x U.t; x; /
2
Z h
1 ıU
C y .t; y; /.x/
R d 2 ım
ıU
b t; y; ˛O t; y; @y U.t; y; / @y .t; y; /.x/
ım
ıf0 i
C .t; y; /.x/ d.y/ D 0;
ım
for .t; x; / 2 Œ0; T Rd P2 .Rd /, where we used the fact that ˛.t;
O x; y/ is a zero of
@˛ H.t; x; ; y; / since A D Rk . As in the case of Example 6.8, the first part of this
master equation has the structure of a finite dimensional HJB equation.
In the previous subsections, we gave two distinct formulations for the optimal
control of stochastic differential equations of the McKean-Vlasov type, and for each
of them, we developed a dedicated form of the Pontryagin maximum principle.
However, it is not clear how these two forms relate to each other. Here, still in
an informal way, we explain the connections between these two versions of the
Pontryagin maximum principle. Then, we highlight how they can be linked to mean
field game problems.
For simplicity, and since we only accounted for mean field games in this specific
case, we shall assume that does not depend upon the control ˛. In particular,
due to the relationship (6.22) between H and K, ˛O in (6.18) and ˛ in (6.23)
coincide, see Remark 6.2. Moreover, ˛O and thus ˛ as well, only depend upon the
variables .t; x; ; y/.
@t u.t; x/
D K t; x; t ; @x u.t; x/; @2xx u.t; x/; ˛O t; x; t ; @x u.t; x/ (6.38)
Z
ıK
t; y; t ; @y u.t; y/; @2yy u.t; y/; ˛ t; y; t ; @y u.t; y/ .x/dt .y/;
R d ım
1 i
C trace a.t; x; t /@2xx u.t; x/ C f .t; x; t ; ˛/ :
2
Therefore, this solution u D .u.t; //06t6T may be understood as the value function
of a new stochastic control problem, set over controlled standard diffusion processes
of the form:
dXt˛ D b t; Xt˛ ; t ; ˛t dt C t; Xt˛ ; t dWt ; t 2 Œ0; T I X0˛ D ; (6.40)
˛ standing for an A-valued control process as in (6.6), and associated with the cost
functional:
Z
˛ ıg
I.˛/ D E g.XT ; T / C .y; T /.XT˛ /dT .y/
R d ım
Z T
C f t; Xt˛ ; t ; ˛t dt
0
Z TZ (6.41)
ıK 2
C t; y; t ; @y u.t; y/; @yy u.t; y/;
0 Rd ım
˛ t; y; t ; @y u.t; y/ .Xt˛ /dt .y/dt :
It is very important to observe that the flow of measures D .t /06t6T appearing
in the dynamics (6.40) of X and in the definition (6.41) of I.˛/ is not required
to coincide with the flow of marginal distributions of X. Instead, is merely the
forward component of the solution to (6.35). This is in stark contrast with (6.6). In
particular, we stress the fact that the control problem (6.40)–(6.41) is not a McKean-
Vlasov control problem but a standard control problem.
6.2 Probabilistic and Analytic Formulations 537
Since L.Xt / D t for all t 2 Œ0; T, we may regard X as a controlled diffusion
process of the same form as in (6.40), with ˛ equal to:
˛ D ˛t D ˛ t; Xt ; t ; @x u.t; Xt / ; t 2 Œ0; T:
06t6T
Since u is the value function of the HJB equation (6.38) and ˛ is a minimizer of
the Hamiltonian K in ˛, see (6.39), the process X D .Xt /06t6T is an optimal path
for the optimal control problem (6.40)–(6.41), see Lemma 4.47.
H 0 .t; x; y; z; ˛/
D H.t; x; t ; y; z; ˛/
Z
ıK
C t; y; t ; @y u.t; y/; @2yy u.t; y/; ˛O t; y; t ; @y u.t; y/ .x/dt .y/;
Rd ım
8
ˆ
ˆ dXt0 D b t; Xt0 ; t ; ˛.t;
O Xt0 ; t ; Yt0 / dt
ˆ
ˆ 0
ˆ
ˆ C t; Xt ; t dWt ; t 2 Œ0; T I X00 D ;
ˆ
ˆ
ˆ
ˆ
ˆ
ˆ 0 0 0 0
< dYt D @x H t; Xt ; t ; Yt ; Zt ; ˛.t;
ˆ O Xt0 ; t ; Yt0 /
Z
(6.44)
ˆ
ˆ
ˆ C @ K t; y; t ; @y u.t; y/; @2yy u.t; y/;
ˆ
ˆ Rd
ˆ
ˆ 0
ˆ
ˆ ˛
O ; @ .X /d .y/
ˆ
ˆ
t; y; t y u.t; y/ t t dt
ˆ
:̂
CZt0 dWt ; t 2 Œ0; T;
as a possible candidate for the solution of (6.44). See Subsection 3.3.2 for the case
constant, the generalization to the current setting being straightforward. The fact that
this triple is indeed a solution of (6.44) may be easily checked by writing the PDE
satisfied by @x u by differentiation of (6.38), and then by expanding .@x u.t; Xt //06t6T
using Itô’s formula.
@ K.t; x; ; y; z; ˛/.x0 /
1 (6.47)
D @ H t; x; ; y; ˛ .x0 / C @ trace .t; x; / .t; x; / z .x0 /;
2
where @i denotes the i coordinate of the derivative @ , that is @i D .@ /i with the
notations used in Chapter 5. Plugging this expression into (6.47), we deduce:
@ K.t; x; ; y; z; ˛/.x0 / D .@ H/ t; x; ; y; z .t; x; /; ˛ .x0 /;
The FBSDE system of McKean-Vlasov type derived from the version of the stochastic
Pontryagin maximum principle for the optimal control of McKean-Vlasov dynamics
can be identified with the FBSDE system for a standard optimal control problem
derived from the Pontryagin maximum principle for the deterministic formulation of
the original McKean-Vlasov optimal control problem!
The forward/backward PDE system issued from the application of the Pontryagin
maximum principle to the deterministic formulation of the McKean-Vlasov problem
may be identified with an auxiliary MFG problem. In fact, the FBSDE system given
by the Pontryagin principle for the McKean-Vlasov problem may be identified with
the FBSDE system given by the application of the Pontryagin principle (in the sense
described in Chapters 3 and 4) to this auxiliary MFG problem.
ıK ıf ıf1
.t; x; ; y; z; ˛/ D .t; x; ; ˛/ D .t; x; /:
ım ım ım
6.2 Probabilistic and Analytic Formulations 541
Therefore, the auxiliary MFG problem consists in optimizing the cost functional:
Z
ıg
I.˛/ D E g.XT˛ ; T / C .y; T /.XT˛ /dT .y/
Rd ım
Z (6.49)
T ıf1
C f t; Xt˛ ; t ; ˛t C .t; y; t /.Xt˛ / dt .y/dt ;
0 ım
where D .t /06t6T is a continuous trajectory with values in P2 .Rd /, under the
dynamic constraint:
dXt˛ D b t; Xt˛ ; ˛t dt C t; Xt˛ dWt ; t 2 Œ0; T I X0˛ D :
We shall revisit this example in Subsection 6.7.2 through the lenses of potential
games.
the second case, we could reach the same conclusion provided that we allow, in
the probabilistic formulation, for controlled processes constructed on possibly
different spaces. This requires to redefine the cost functional as a function of the
joint law of the Brownian motion W, of the controlled process X D X˛ and of
the control ˛ (instead of a function of the realizations of ˛). We provide a short
account in that direction in Section 6.6.
(A1) The functions b, and f are differentiable with respect to .x; ˛/,
the mappings .x; ; ˛/ 7! @x .b; ; f /.t; x; ; ˛/ and .x; ; ˛/ 7!
@˛ .b; ; f /.t; x; ; ˛/ being continuous for each t 2 Œ0; T. The func-
tions b, and f are also differentiable with respect to the vari-
able , the mapping Rd L2 .˝; F; PI Rd / A 3 .x; X; ˛/ 7!
@ .b; ; f /.t; x; L.X/; ˛/.X/ 2 L2 .˝; F; PI Rdd R.dd/d Rd / being
continuous for each t 2 Œ0; T. Similarly, the function g is differentiable
with respect to x, the mapping .x; / 7! @x g.x; / being continuous.
(continued)
6.3 Stochastic Pontryagin Principle for Optimality 543
Observe that our formulation of the joint differentiability is very much in the
spirit of Subsection 5.3.4. Also notice that we used the notation M2 ./2 introduced
in (3.7) for the second moment of a measure:
Z
M2 ./2 D jxj2 d.x/; 2 P2 .Rd /:
Rd
As before, we assume that the set A of control values is a closed convex subset of
Rk and we denote by A the set of admissible control processes and by X D X˛ the
controlled state process, namely the solution of (6.6) with a given initial condition
X0 D 2 L2 .˝; F0 ; PI Rd /. The filtration F is assumed to be generated by F0 and
by W. Our first task is to compute the Gâteaux derivative of the cost functional J at
˛ in all directions. In order to do so, we choose ˇ 2 H2;k such that ˛ C ˇ 2 A for
> 0 small enough. We then compute the variation of J at ˛ in the direction of ˇ
(think of ˇ as the difference between another element of A and ˛).
They are progressively measurable bounded processes with values in the spaces
Rdd , R.dd/d , Rd , and Rdd respectively (the parentheses around d d indicating
that Ot u is seen as an element of Rdd whenever u 2 Rd ), and:
t D EQ @ b.t; t /.XQ t /VQ t D EQ @ b.t; x; L.Xt /; ˛/.XQ t /VQ t ˇˇˇ xDXt ;
˛D˛t
(6.51)
Ot D EQ @ .t; t /.XQ t /VQ t D EQ @ .t; x; L.Xt /; ˛/.XQ t /VQ t ˇˇˇ xDXt ;
˛D˛t
which are progressively measurable processes with values in Rd and Rdd respec-
tively, and where .XQ t ; VQ t / is a copy of .Xt ; Vt / defined on .˝; Q F; Q We refer
Q P/.
to Subsection 5.3.4 for a complete account of the measurability properties. As
expectations of functions of .XQ t ; VQ t /, t , and Ot depend upon the joint distribution
of Xt and Vt . In (6.50) we wrote t .L.Xt ; Vt // and Ot .L.Xt ; Vt // in order to stress
this dependence upon the joint distribution of Xt and Vt . Even though we are
dealing with possibly random coefficients, the existence and uniqueness of the
variation process are guaranteed by a suitable version of Theorem 4.21 applied to
the couple .X; V/ and the system formed by (6.6) and (6.50). Our assumption on
the boundedness of the partial derivatives of the coefficients implies that V satisfies
EŒsup06t6T jVt jp < 1 for every finite p > 1. In particular .t /0tT and .Ot /0tT
are bounded.
Lemma 6.10 For > 0 small enough, we denote by ˛ the admissible control
defined by ˛t D ˛t C ˇt , and by X D X˛ the corresponding controlled state.
We have:
ˇ ˇ2
ˇ Xt Xt ˇ
lim E sup ˇ ˇ Vt ˇˇ D 0: (6.52)
&0 06t6T
Proof. For the purpose of this proof we set t D .Xt ; L.Xt /; ˛t / and Vt D 1 .Xt Xt /Vt .
Notice that V0 D 0 and that:
1
dVt D b.t; t / b.t; t / @x b.t; t / Vt @˛ b.t; t / ˇt
EQ @ b.t; t /.XQ t / VQ t dt
1 (6.53)
C .t; t / .t; t / @x .t; t / Vt @˛ .t; t / ˇt
EQ @ .t; t /.XQ t / VQ t dWt
Z 1 ;
C EQ @ b t; t
;
XQ t .VQ t C VQ t / d ;
0
By (A2) in assumption Pontryagin Optimality, the last three terms of the above right-hand
side are bounded in L2 .Œ0; T ˝/, uniformly in . Next, we treat the diffusion part V ;2 in
the same way using Jensen’s inequality and Burkholder-Davis-Gundy’s inequality to control
the quadratic variation of the stochastic integrals. Consequently, going back to (6.53), we see
that, for any S 2 Œ0; T,
Z
S
E sup jVt j2 6 c0 C c0 E sup jVs j2 dt;
06t6S 0 06s6t
where as usual c0 > 0 is a generic constant whose value can change from line to line,
as long as it remains independent of . Applying Gronwall’s inequality, we deduce that
EŒsup06t6T jVt j2 6 c0 . Therefore, we have:
ˇ ˇ2
lim E sup sup ˇXt ; Xt ˇ D 0:
&0 06 61 06t6T
We then prove that I ;1 , I ;2 and I ;3 converge to 0 in L2 .Œ0; T ˝/ as & 0. Indeed,
Z Z ˇZ ˇ2
T T ˇ 1 ˇ
E jIt;1 j2 dt DE ˇ Œ@x b t; ;
@x b.t; t / Vt d ˇˇ dt
ˇ t
0 0 0
Z Z 1
T ;
6E j@x b t; t @x b.t; t /j2 jVt j2 d dt:
0 0
546 6 Optimal Control of SDEs of McKean-Vlasov Type
Since the function @x b is bounded and continuous in x, , and ˛, the above right-hand side
converges to 0 as & 0. A similar argument applies to It;2 and It;3 . Again, we treat the
diffusion part V ;2 in the same way using Jensen’s inequality and Burkholder-Davis-Gundy’s
inequality. Consequently, going back to (6.53), we finally see that, for any S 2 Œ0; T,
Z
S
E sup jVt j2 6 c0 E sup jVs j2 dt C c ;
06t6S 0 06s6t
where lim&0 c D 0. Finally, we get the desired result applying Gronwall’s inequality. t
u
d ˇ
J.˛ C ˇ/ˇD0
d
Z T
DE Q f .t; t /.XQ t / VQ t C @˛ f .t; t / ˇt dt
@x f .t; t / Vt C EŒ@ (6.54)
0
Q g.XT ; L.XT //.XQ T / VQ T :
C E @x g.XT ; L.XT // VT C EŒ@
Proof. We use freely the notation introduced in the proof of the previous lemma.
d ˇ
J.˛ C ˇ/ˇD0
d
Z T (6.55)
1
D lim E f t; f .t; t / dt C g.XT ; L.XT // g.XT ; L.XT // :
&0
t
0
where we used the hypothesis on the continuity and growth of the derivatives of f , the uniform
convergence proven in the previous lemma, and standard uniform integrability arguments.
The second term in (6.55) is handled in a similar way. t
u
6.3 Stochastic Pontryagin Principle for Optimality 547
Proof. Letting t D .Xt ; L.Xt /; Yt ; Zt ; ˛t / and using the definitions (6.50) of the variation
process V, and (6.31) or (6.32) of the adjoint process Y, integration by parts gives:
YT VT
Z T Z T Z T
D Y0 V0 C Yt dVt C dYt Vt C dŒY; Vt
0 0 0
Z
T
D MT C Yt @x b.t; t /Vt C Yt EQ @ b.t; t /.XQ t /VQ t C Yt @˛ b.t; t /ˇt
0
@x H.t; t / Vt EQ @ H.t; Q t /.Xt / Vt
C Zt @x .t; t /Vt C Zt EQ @ .t; t /.XQ t /VQ t
C Zt @˛ .t; t /ˇt dt;
where .Mt /06t6T is a mean zero integrable martingale. By taking expectations on both sides
and applying Fubini’s theorem:
EEQ @ H.t; Q t /.Xt / Vt
D EEQ @ H.t; t /.XQ t / VQ t
D EEQ @ b.t; t /.XQ t /VQ t Yt C @ .t; t /.XQ t /VQ t Zt C @ f .t; t /.XQ t / VQ t ;
we get the desired equality (6.56) by handling in a similar way the derivatives in x. t
u
Proof. Using Fubini’s theorem, the second expectation appearing in the expression (6.54) of
the Gâteaux derivative of J given in Lemma 6.11 can be rewritten as:
E @x g.XT ; L.XT // VT C EQ @ g.XT ; L.XT //.XQ T / VQ T
D E @x g.XT ; L.XT // VT C EEQ @ g.XQ T ; L.XT //.XT / VT
D EŒYT VT ;
and using the expression derived in Lemma 6.12 for EŒYT VT in (6.54) we get the desired
result. t
u
Main Statement
The main result of this subsection is the following:
Proof. Since A is convex, given ˇ 2 A we can choose the perturbation ˛t D ˛t C .ˇt ˛t /
which is still in A for 0 6 6 1. Since ˛ is optimal, we have the inequality
Z
d ˇ T
J.˛ C .ˇ ˛//ˇD0 D E @˛ H t; Xt ; L.Xt /; Yt ; Zt ; ˛t .ˇt ˛t / dt > 0:
d 0
By convexity of the Hamiltonian with respect to the control variable ˛ 2 A, we conclude that
Z T
E H t; Xt ; L.Xt /; Yt ; Zt ; ˇt H t; Xt ; L.Xt /; Yt ; Zt ; ˛t dt > 0;
0
for all ˇ. Now, if for a given (deterministic) ˛ 2 A we choose ˇ in the following way:
(
˛ if .t; !/ 2 C;
ˇt .!/ D
˛t .!/ otherwise;
for an arbitrary progressively measurable set C Œ0; T˝ (that is C \Œ0; t 2 B.Œ0; t/˝ Ft
for any t 2 Œ0; T), we see that:
Z T
E 1C H t; Xt ; L.Xt /; Yt ; Zt ; ˛ H t; Xt ; L.Xt /; Yt ; Zt ; ˛t dt > 0;
0
6.3 Stochastic Pontryagin Principle for Optimality 549
When convexity of the set A does not hold, the following weaker version of the
necessary part of the stochastic Pontryagin principle holds:
for t 2 Œ0; T. By construction, ˛t C ˇt 2 A for all t 2 Œ0; T and 2 .0; 0 /. Following the
proof of Theorem 6.14, we claim:
Z T
E @˛ H t; Xt ; L.Xt /; Yt ; Zt ; ˛t ˇt dt > 0;
0
The necessary condition for optimality identified in the previous subsection can be
turned into a sufficient condition for optimality under some technical assumptions.
550 6 Optimal Control of SDEs of McKean-Vlasov Type
If
Remark 6.17 As made clear by the proof below, the optimal control ˛ is unique if
H is -strongly convex in ˛ for some > 0, namely if there is an extra C j˛ ˛ 0 j2
in the right-hand side of (6.60).
Also, the proof shows that J.˛/ 6 J.˛0 / for control processes ˛0 that are
progressively measurable for a larger filtration F0 containing F such that W is an
F0 -Brownian motion. The fact that F is generated by F0 and W is just needed to
guarantee that the adjoint BSDE (6.31) is solvable when ˛ is given.
0
Proof. Let ˛0 2 A be a generic admissible control, and X0 D X˛ the corresponding
controlled state. By definition of the objective function of the control problem we have:
6.3 Stochastic Pontryagin Principle for Optimality 551
J.˛/ J.˛0 /
Z
T
D E g.XT ; L.XT // g.XT0 ; L.XT0 // C E f .t; t / f .t; 0
t/ dt
0
Z (6.61)
T
D E g.XT ; L.XT // g.XT0 ; L.XT0 // C E H.t; t / H.t; t0 / dt
0
Z T ˚ 0
0
E b.t; t / b.t; t/ Yt C .t; t / .t; t / Zt dt;
0
so that:
E g XT ; L.XT / g XT0 ; L.XT0 /
6 E @x g.XT ; L.XT // .XT XT0 /
C EQ @ g.XT ; L.XT //.XQ T / .XQ T XQT0 / (6.62)
D E @x g.XT ; L.XT // C EŒ@Q g.XQ T ; L.XT //.XT / .XT X 0 /
T
0
0
D E YT .XT XT / D E .XT XT / YT ;
where we used Fubini’s theorem and the fact that the “tilde random variables” are indepen-
dent copies of the “non-tilde variables”. Using the adjoint equation and taking expectation,
we get:
E .XT XT0 / YT
Z T Z T Z T
DE .Xt Xt0 / dYt C Yt dŒXt Xt0 C Œ.t; t / .t; 0
t / Zt dt
0 0 0
Z T
D E @x H.t; t / .Xt Xt0 / C EQ @ H.t; Q t /.Xt / .Xt Xt0 / dt
0
Z T 0 0
CE Œb.t; t / b.t; t / Yt C Œ .t; t / .t; t / Zt dt;
0
where we used integration by parts and the fact that Y D .Yt /06t6T solves the adjoint
equation. Using Fubini’s theorem and the fact that Q t is an independent copy of t , the
expectation of the second term in the second line can be rewritten as:
552 6 Optimal Control of SDEs of McKean-Vlasov Type
Z T
E EQ @ H.t; Q t /.Xt / .Xt Xt0 /dt
0
Z T
Q
D EE Œ@ H.t; t /.XQ t / .XQ t XQ t0 / dt (6.63)
0
Z T
DE EQ @ H.t; t /.XQ t / .XQ t XQ t0 / dt:
0
J.˛/ J.˛0 /
Z T
6E ŒH.t; t / H.t; t0 /dt
0
(6.64)
Z h i
T
E @x H.t; t / .Xt Xt0 / C EQ @ H.t; Q t /.XQ t / .XQ t XQ t0 / dt 6 0;
0
because of the convexity assumption on H, see in particular (6.60), and because of the
criticality of the admissible control ˛ D .˛t /06t6T , see (6.59), which says .˛t ˇ/
@˛ H.t; Xt ; t ; Yt ; Zt ; ˛t / 6 0 for all ˇ 2 A, see (3.11) if needed. t
u
Scalar Interactions
We first consider scalar interactions for which the dependence upon the probability
measure comes through functions of scalar moments of the measure. More specifi-
cally, we assume that:
O x; h ; i; ˛/;
b.t; x; ; ˛/ D b.t; .t; x; ; ˛/ D O .t; x; h; i; ˛/;
f .t; x; ; ˛/ D fO .t; x; h ; i; ˛/; g.x; / D gO .x; h; i/;
We derive the particular form taken by the adjoint equation in the present situation.
We start with the terminal condition as it is easier to identify. According to (6.31),
it reads:
Since the terminal cost is of the form g.x; / D gO .x; h; i/, given our definition
of differentiability with respect to the variable , we know, as a generalization
of (5.35), that @ g.x; /. / reads:
@ g.x; /.x0 / D @r gO x; h; i @.x0 /; x0 2 Rd :
Notice that the ‘tildes’ can be removed at this stage since XQ T has the same
distribution as XT . Note also that if g and gO do not depend upon x, the function
P2 .Rd / 3 7! g./ D gO .h; i/ is convex if is convex and gO is nondecreasing
and convex, see Example 1 in Subsection 5.5.1.
Similarly, @ H.t; x; ; y; z; ˛/ can be identified to the Rd -valued function
defined by:
O x; h ; i; ˛/ ˇ y @ .x0 /
@ H.t; x; ; y; z; ˛/.x0 / D @r b.t;
O x; h; i; ˛/ ˇ z @.x0 /
C @r .t;
O x; ; ˛/; i;
b.t; x; ; ˛/ D hb.t; .t; x; ; ˛/ D h.t;
O x; ; ˛/; i;
f .t; x; ; ˛/ D hfO .t; x; ; ˛/; i; g.x; / D hOg.x; /; i:
O ,
for some functions b, O and fO defined on Œ0; TRd Rd A with values in Rd , Rdd
and R respectively and continuously differentiable with respect to .x; x0 ; ˛/ with
derivatives at most of linear growth, and a real valued function gO defined on Rd Rd
and continuously differentiable with derivatives at most of linear growth. This form
of dependence comes from the original derivation of the McKean-Vlasov equation
as limit of large systems of particles whose dynamics are given by stochastic
differential equations with mean field interactions as in:
1 XO
N
j
dXti D b.t; Xti ; Xt /dt
N jD1
(6.65)
1X
N
j j
C O .t; Xti ; Xt /dWt ; i D 1; ; N; 0 6 t 6 T;
N jD1
@ H.t; x; ; y; z; ˛/.x0 /
O x; x0 ; ˛/ ˇ y C @x0 .t;
D @x0 b.t; O x; x0 ; ˛/ ˇ z C @x0 fO .t; x; x0 ; ˛/;
O x; x0 ; y; z; ˛/ D b.t;
H.t; O x; x0 ; ˛/ y C O .t; x; x0 ; ˛/ z C fO .t; x; x0 ; ˛/;
We state the conditions we shall use from now on. These assumptions subsume
assumption Pontryagin Optimality introduced in Section 6.3. As it is most often
the case in applications of the stochastic maximum principle, we choose the set A
of control values to be a closed convex subset of Rk and we consider a linear model
for the forward dynamics of the state.
(A1) The drift and volatility functions b and are linear in x, and ˛. To
wit, for all .t; x; ; ˛/ 2 Œ0; T Rd P2 .Rd / A, we assume that:
(A3) The derivatives of f and g with respect to .x; ˛/ and x respectively are
L-Lipschitz continuous with respect to .x; ˛; / and .x; / respectively,
the Lipschitz property in the variable being understood in the sense
of the 2-Wasserstein distance. Moreover, for any t 2 Œ0; T, any x; x0 2
Rd , any ˛; ˛ 0 2 Rk , any ; 0 2 P2 .Rd /, and any Rd -valued random
variables X and X 0 having and 0 as distributions,
E j@ f .t; x0 ; 0 ; ˛ 0 /.X 0 / @ f .t; x; ; ˛/.X/j2
6 L j.x0 ; ˛ 0 / .x; ˛/j2 C E jX 0 Xj2 ;
E j@ g.x0 ; 0 /.X 0 / @ g.x; /.X/j2
6 L jx0 xj2 C E jX 0 Xj2 :
Finally,
The drift and the volatility being linear, the Hamiltonian takes the particular form:
H.t; x; ; y; z; ˛/ D b0 .t/ C b1 .t/x C bN 1 .t/N C b2 .t/˛ y
C 0 .t/ C 1 .t/x C N 1 .t/N C 2 .t/˛ z C f .t; x; ; ˛/;
Assumption Control of MKV Dynamics being slightly stronger than the assump-
tions used in Chapter 3, we can strengthen the conclusions of Lemma 3.3 while still
using the same arguments.
Proof. Except maybe for the Lipschitz property with respect to the measure argument, these
facts were explicitly proved in Lemma 3.3. Lemma 3.3 applies when D 0, but the proof
may be easily adapted to the current setting. The regularity of ˛O with respect to follows
from the following remark. If .t; x; y; z/ 2 Œ0; TRd Rd Rdd is fixed and ; 0 are generic
O x; ; y; z/ and ˛O 0 D ˛.t;
elements in P2 .Rd /, ˛O D ˛.t; O x; 0 ; y; z/ denoting the associated
minimizers, we deduce from the convexity condition (A4) in assumption Control of MKV
Dynamics:
2 j˛O 0 ˛j
O 2 6 .˛O 0 ˛/
O @˛ f t; x; ; ˛O 0 @˛ f t; x; ; ˛O
D .˛O 0 ˛/
O @˛ H t; x; ; y; z; ˛O 0 @˛ H t; x; ; y; z; ˛O
6 .˛O 0 ˛/
O @˛ H t; x; ; y; z; ˛O 0 @˛ H t; x; 0 ; y; z; ˛O 0 (6.67)
D .˛O 0 ˛/
O @˛ f t; x; ; ˛O 0 @˛ f t; x; 0 ; ˛O 0
6 Cj˛O 0 ˛j
O W2 .0 ; /;
558 6 Optimal Control of SDEs of McKean-Vlasov Type
the passage from the second to the third line following from the two inequalities:
8ˇ 2 A; ˇ ˛O @˛ H t; x; ; y; z; ˛O > 0;
ˇ ˛O 0 @˛ H t; x; 0 ; y; z; ˛O 0 > 0;
Given the necessary and sufficient conditions proven in the previous section, our
goal is to use the control ˛O D .˛O t /06t6T defined by ˛O t D ˛.t;O Xt ; L.Xt /; Yt ; Zt / where
˛O is the minimizer function constructed above and .Xt ; Yt ; Zt /06t6T is a solution of
the FBSDE:
8
ˆ
ˆ dXt D b0 .t/ C b1 .t/Xt C bN 1 .t/EŒXt
ˆ
ˆ
ˆ
ˆ
ˆ
ˆ C b2 .t/˛.t;
O Xt ; L.Xt /; Yt ; Zt / dt
ˆ
ˆ
ˆ
ˆ
ˆ
ˆ C 0 .t/ C 1 .t/Xt C N 1 .t/EŒXt
ˆ
ˆ
ˆ
ˆ
ˆ
ˆ C 2 .t/˛.t;
O Xt ; L.Xt /; Yt ; Zt / dWt ;
ˆ
ˆ
ˆ
ˆ
<
dYt D @x f t; Xt ; L.Xt /; ˛.t;
O Xt ; L.Xt /; Yt ; Zt / (6.69)
ˆ
ˆ
ˆ
ˆ
C b1 .t/ Yt C 1 .t/ Zt dt
ˆ
ˆ
ˆ
ˆ h
ˆ
ˆ
ˆ
ˆ
ˆ EQ @ f t; XQ t ; L.Xt /; ˛.t;
O XQ t ; L.Xt /; YQ t ; ZQ t / .Xt /
ˆ
ˆ
ˆ
ˆ i
ˆ
ˆ
ˆ
ˆ C bN 1 .t/ EŒYt C N 1 .t/ EŒZt dt
ˆ
ˆ
:̂ C Zt dWt ;
Theorem 6.19 Under assumption Control of MKV Dynamics and for any initial
condition 2 L2 .˝; F0 ; PI Rd /, the forward-backward system (6.69) is uniquely
solvable.
Observe from Theorem 6.16 that, under the above assumption, the solution
of (6.69) is the unique optimal path of the mean field stochastic control problem
defined in (6.6)–(6.9).
The proof is an adaptation of the continuation method for FBSDEs. The idea is
to prove that existence and uniqueness are preserved under small perturbations of
the coefficients. Starting from a case for which existence and uniqueness are known
to hold, we then establish Theorem 6.19 by modifying iteratively the coefficients so
that (6.69) is eventually shown to belong to the class of uniquely solvable systems. A
simple strategy is to modify the coefficients in a linear way. The notations becoming
quickly unruly, we use the following conventions.
Parameterized Solutions
Like in Subsection 6.3.1, the notation .t /06t6T stands for stochastic processes of
the form .Xt ; L.Xt /; Yt ; Zt ; ˛t /06t6T with values in Rd P2 .Rd / Rd Rdd
A. We will denote by S the space of processes D .t /06t6T such that
.Xt ; Yt ; Zt ; ˛t /06t6T is F–progressively measurable, X D .Xt /06t6T and Y D
.Yt /06t6T have continuous sample paths, and
Z 1=2
T 2
kkS D E sup jXt j2 C jYt j2 C jZt j C j˛t j2 dt < 1: (6.70)
06t6T 0
Similarly, the notation . t /06t6T is generic for processes .Xt ; L.Xt /; ˛t /06t6T with
values in Rd P2 .Rd / A. All the processes . t /06t6T considered below will be
restrictions of extended processes .t /06t6T 2 S .
An input for (6.69) will be a four-tuple I D ..Itb ; It ; It /06t6T ; IT /, .Itb /06t6T ,
f g
.It /06t6T and .It /06t6T being three square-integrable progressively measurable
f
g
processes with values in Rd , Rdd and Rd respectively, and IT denoting a square-
integrable FT -measurable random variable with values in Rd . Such an input is
specifically designed to be injected into the dynamics of (6.69), I b being plugged
into the drift of the forward equation, I into the volatility of the forward equation,
I f into the bounded variation term of the backward equation and I g into the terminal
condition of the backward equation. The space of inputs is denoted by I. This
justifies their respective dimensions. It is endowed with the norm:
560 6 Optimal Control of SDEs of McKean-Vlasov Type
Z 1=2
T b2
kIkI D E jIT j2 C jIt j C jIt j2 C jIt j2 dt
g f
: (6.71)
0
Definition 6.20 For any 2 Œ0; 1, 2 L2 .˝; F0 ; PI Rd / and input I 2 I, the
FBSDE:
8
ˆ .t; t / C It dWt ;
< dXt D b.t;
ˆ t / C It dt C
b
˚ f
dYt D @x H.t; t / C EQ @ H.t; Q t /.Xt / C It dt (6.72)
ˆ
:̂ CZ dW ; t 2 Œ0; T;
t t
with:
˛t D ˛.t;
O Xt ; L.Xt /; Yt ; Zt /; t 2 Œ0; T; (6.73)
Remark 6.21 The way the coupling between the forward and backward equations
enters (6.72) is a bit different from the way Equation (6.69) is written. In the
formulation used in the statement of Definition 6.20, the coupling between the
forward and the backward equations follows from the optimality condition (6.73).
Because of that optimality condition, the two formulations are equivalent in the
sense that, when D 1 and I 0, the pair (6.72)–(6.73) coincides with (6.69). The
formulation used above matches the one used in the statements of Theorems 6.14
and 6.16.
Induction Argument
The following definition is stated for the sake of convenience only. It will help
articulate concisely the induction step of the proof of Theorem 6.19.
Definition 6.22 For any 2 Œ0; 1, we say that property .S / holds if, for any
2 L2 .˝; F0 ; PI Rd / and any I 2 I, the FBSDE E. ; ; I/ has a unique extended
solution in S .
Lemma 6.23 Let 2 Œ0; 1 such that .S / holds. Then, there exists a constant C,
independent of , such that, for any ; 0 2 L2 .˝; F0 ; PI Rd / and I; I 0 2 I, the
respective extended solutions and 0 of E. ; ; I/ and E. ; 0 ; I 0 / satisfy:
6.4 Solvability of the Pontryagin FBSDE 561
1=2
k 0 kS 6 C E j 0 j2 C kI I 0 kI :
Proof. We use a mere variation on the proof of the classical stochastic maximum principle.
With the same notation as in the statement, and using for .Xt ; L.Xt /; Yt ; Zt ; ˛t /06t6T and
. t D .Xt ; L.Xt /; ˛t //06t6T , we compute:
E .XT0 XT / YT
D E . 0 / Y0
Z T
E @x H.t; t / .Xt0 Xt / C EQ @ H.t; Q t /.Xt / .Xt0 Xt / dt
0
Z
T 0 0
E Œb.t; t/ b.t; t / Yt C Œ .t; t/ .t; t / Zt dt
0
Z
T
.Xt0 Xt / It C .Itb Itb;0 / Yt C .It It;0 / Zt dt
f
E
0
D T0 T1 T2 :
Following (6.62),
E .XT0 XT / YT
D E @x g.XT ; L.XT // C EŒ@ Q g.XQ T ; L.XT //.XT / .X 0 XT /
T
g;0 g
C E .IT IT / YT
g;0
6 E g.XT0 ; L.XQ T0 // g.XT ; L.XT // C E .IT IT / YT :
g
Identifying the two expressions above and repeating the proof of Theorem 6.16, we obtain:
Z T g
J.˛0 / J.˛/ > j˛t ˛t0 j2 dt C T0 T2 C E .IT IT / YT :
g;0
E (6.74)
0
Now, we can reverse the roles of ˛ and ˛0 in (6.74). Denoting by T00 and T20 the corresponding
terms in the inequality and summing both inequalities, we deduce that:
Z T g
j˛t ˛t0 j2 dt C T0 C T00 .T2 C T20 / C E .IT IT / .YT YT0 / 6 0:
g;0
2 E
0
Similarly,
T0 C T00 D E . 0 / .Y0 Y00 / :
562 6 Optimal Control of SDEs of McKean-Vlasov Type
Therefore, using Young’s inequality, there exists a constant C (the value of which may change
from line to line), C being independent of , such that, for any " > 0,
Z T
C
E j˛t ˛t0 j2 dt 6 "k 0 k2S C E j 0 j2 C kI I 0 k2I : (6.75)
0 "
From standard estimates for BSDEs, there exists a constant C, independent of , such that:
Z T
E sup jYt Yt0 j2 C jZt Zt0 j2 dt
06t6T 0
Z (6.76)
T
6C E sup jXt Xt0 j2 C j˛t ˛t0 j2 dt C CkI I 0 k2I :
06t6T 0
Similarly,
Z
T
E sup jXt Xt0 j2 6 E j 0 j2 C C E j˛t ˛t0 j2 dt C CkI I 0 k2I : (6.77)
06t6T 0
C
6 C"k 0 k2S C E j 0 j2 C kI I 0 k2I :
"
Using (6.75) again and choosing " small enough, we complete the proof. t
u
Lemma 6.24 There exists ı0 > 0 such that, if .S / holds for some 2 Œ0; 1/, then
.S C / also holds for any 2 .0; ı0 satisfying C 6 1.
Proof. The proof follows from a standard Picard’s contraction argument. Indeed, if is such
that .S / holds, for > 0, 2 L2 .˝; F0 ; PI Rd / and I 2 I, we then define a mapping ˚ from
S into itself whose fixed points coincide with the solutions of E . C ; ; I /. The definition
of ˚ is as follows. Given a process 2 S , we denote by 0 the extended solution of the
FBSDE E . ; ; I 0 / with:
k 0;1 0;2 kS 6 C k 1 2 kS ;
Lemma 6.25 For any t 2 Œ0; T and 2 L2 .˝; Ft ; PI Rd /, there exists a unique
t; t; t;
solution .Xs ; Ys ; Zs /t6s6T , of the Pontryagin forward/backward system (6.69) on
t;
Œt; T with Xt D as initial condition. Moreover, for any 2 P2 .Rd /, there exists
a measurable mapping U .t; ; / W Rd 3 x 7! U .t; x; / such that:
h i
t;
P Yt D U t; ; L./ D 1: (6.79)
We here use the letter U instead of U as in the statement of Lemma 4.25 in order to
distinguish from U in (6.37).
Remark 6.26 We shall revisit the notion of master field in the next section, see
Subsection 6.5.2. The notion of master field for mean field games will be addressed
in detail in Chapters (Vol II)-4 and (Vol II)-5.
564 6 Optimal Control of SDEs of McKean-Vlasov Type
Also, observe that, from Proposition 5.36, for any 2 P2 .Rd /, there exists a
version of Rd 3 x 7! U .t; x; / in L2 .Rd I / that is Lipschitz-continuous with
respect to x, for the same Lipschitz constant C as in (6.80).
Our goal is now to provide a probabilistic derivation of the master equation (6.37)
obtained in Subsection 6.2.4 and to connect its solution U with the decoupling
field U identified in the statement of Lemma 6.25. In order to do so, we assume
throughout the section that assumption Control of MKV Dynamics introduced in
the previous section is in force, but with the restriction that is uncontrolled. In
particular, Theorem 6.19 applies and guarantees that for any given initial condition
2 L2 .˝; F0 ; PI Rd /, the mean field stochastic control problem (6.6)–(6.9) has a
unique optimal path, which is characterized by the solution of the forward-backward
system (6.69).
Most of the discussion below could be generalized to more general forms of
drift and volatility coefficients, and running and terminal cost functions, as long as
existence and uniqueness of an optimal path remain true.
over the space of A-valued square integrable F-adapted controls ˛ D .˛t /06t6T
under the dynamic constraint:
dXt˛ D b t; Xt˛ ; L.Xt˛ /; ˛t dt C s; Xt˛ ; L.Xt˛ / dWt ; t 2 Œ0; T; (6.82)
Definition 6.27 Under the assumption prescribed above, for any t 2 Œ0; T and
2 L2 .˝; Ft ; PI Rd /, the quantity:
6.5 Several Forms of the Master Equation 565
Z
T ˛ ˛
˛ ˛
inf E f s; Xs ; L.Xs /; ˛s ds C g XT ; L.XT / ; (6.83)
˛jŒt;T t
under the prescription Xt˛ D , where the infimum is taken over A-valued square-
integrable .Fs /t6s6T -progressively measurable processes ˛jŒt;T D .˛s /t6s6T , only
depends upon t and the distribution of . For that reason, we can denote it by v.t; /
if is the law of .
The function v is called the value function of the mean field stochastic control
problem.
Proof. Without any loss of generality, we can assume that t D 0. By Theorem 6.19, the mean
field stochastic control problem (6.6)–(6.9) with as initial condition has a unique optimal
path, which is characterized by the solution of the forward-backward system (6.69).
We shall prove in Theorem (Vol II)-1.33 a relevant version of the Yamada and Watanabe
theorem for McKean-Vlasov forward-backward SDEs. It says that, the forward-backward
system (6.69) being uniquely solvable in the strong sense, the law of its solution only depends
on the law of the initial condition. t
u
Remark 6.28 The definition of v in (6.83) is quite similar to that given in (6.25),
except that (6.83) is based upon the probabilistic formulation of the control problem
while (6.25) is based upon the analytic approach. In order to identify the two
definitions rigorously, it is necessary to connect first the two formulations of the
optimal control problem. We refer to the final discussion in Subsection 6.2.5 for a
short account.
Proposition 6.29 The value function v, as defined above, satisfies, for all t 2 Œ0; T,
h 2 Œ0; T t and 2 P2 .Rd /:
v.t; /
Z
tCh (6.84)
D inf E f s; Xs˛ ; L.Xs˛ /; ˛s ds C v t C h; L.XtCh
˛
/ ;
˛jŒt;tCh t
Notice that, despite the presence of the expectation in (6.84), the dynamic
programming principle is deterministic. Indeed, the underlying state variable is the
marginal law of the controlled process X˛ .
Proof. The proof of (6.84) is pretty standard. For t, h and as in the statement, we first prove
that v.t; / is greater than the right-hand side in (6.84). To do so, it suffices to start from the
definition of v.t; / in Definition 6.27 and observe that for any ˛jŒt;T :
566 6 Optimal Control of SDEs of McKean-Vlasov Type
Z
T ˛
E f s; Xs˛ ; L.Xs˛ /; ˛s ˛
ds C g XT ; L.XT /
t
Z
tCh
DE f s; Xs˛ ; L.Xs˛ /; ˛s ds
t
Z
T
CE f s; Xs˛ ; L.Xs˛ /; ˛s ds C g XT˛ ; L.XT˛ / ;
tCh
˛
but the last term in the right-hand side is greater than v.t C h; L.XtCh //, which proves that
v.t; / is greater than the right-hand side in (6.84).
To prove the converse inequality, we consider a control ˛ D .˛s /t6s6tCh defined over
Œt; t C h and a control ˛0 D .˛s0 /tCh6s6T defined over Œt C h; T. We patch them together by
letting ˇs D ˛s for s 2 Œt; t C h and ˇs D ˛s0 for s 2 .t C h; T. Clearly, ˇ is an admissible
control over Œt; T. Therefore,
Z
T ˇ ˇ
v.t; / 6 E f s; Xsˇ ; L.Xsˇ /; ˇs ds C g XT ; L.XT / ;
t
ˇ
with Xt D 2 L2 .˝; Ft ; PI Rd /, with . It is pretty clear that:
˛
˛0 ;XtCh
Xsˇ D Xs ; s 2 Œt C h; T;
˛0 ;X ˛ ˛
where .Xs tCh /tCh6s6T is the solution of (6.6) with XtCh as initial condition at time t C h.
0
By freezing ˛ and by minimizing over ˛ , we get:
Z
tCh
v.t; / 6 E f s; Xs˛ ; L.Xs˛ /; ˛s ds C v t C h; L.XtCh
˛
/ ;
t
Whenever v is smooth enough, this weak form of the DPP is sufficient to recover
the HJB equation (6.26), with the difference that computations are then based upon
the L-differential calculus instead of the linear functional derivative. Indeed, the use
˛
of the L-derivative is very natural as .v.t C h; L.XtCh ///t6tCh6T may be expanded
in h by means of the chain rule proven in Chapter 5. We refer to Subsection 5.7.3
for a preliminary discussion of the same kind. We shall address this question again
below.
Whenever v is not smooth, the DPP may be used to derive the HJB equation in
the viscosity sense. We shall do so in Chapter (Vol II)-4, but for the master equation
associated with mean field games. As for mean field stochastic control, we refer to
citations in the Notes & Complements below.
Z
v.T; / D g.x; /d.x/; 2 P2 .Rd /;
Rd
for some function V, in which case V would read as a value function defined over
the enlarged state space Rd P2 .Rd /.
A natural way to do so is to proceed by conditioning. Roughly speaking, for
each .t; x; / 2 Œ0; T Rd P2 .Rd /, we should define V.t; x; / as the conditional
expected future costs:
V.t; x; /
Z
T ˇ (6.85)
DE f s; Xs˛O ; L.Xs˛O /; ˛O s ds C g XT˛O ; L.XT˛O / ˇ Xt˛O D x ;
t
where ˛O minimizes the quantity (6.83) under the constraint Xt˛ D . Notice
that, with this definition of the value function, for each t 2 Œ0; T and 2 P2 .Rd /,
V.t; x; / is only defined for -almost every x 2 Rd . Below, the ‘hat’ symbol always
refers to optimal quantities, and .Xs˛O /t6s6T is sometimes denoted by XO D .XO s /t6s6T .
Put differently, X˛O in (6.85) is understood as the optimal path minimizing the cost
functional (6.83) over McKean-Vlasov diffusion processes satisfying (6.82) with the
initial condition Xt˛ D 2 L2 .˝; Ft ; PI Rd /, where . We shall prove below
that the definition (6.85) is consistent in the sense that the right-hand side in (6.85)
is independent of the choice of the random variable representing the distribution
.
In order to reformulate (6.85) in a more proper fashion, we use the fact that the
minimizer .˛O s /t6s6T has a feedback form, given by Lemma 6.25. Namely ˛O s reads
N Xs˛O ; L.Xs˛O //, where:
as ˛.s;
˛.t;
N x; / D ˛O t; x; ; U .t; x; / : (6.86)
for s 2 Œt; T, with the initial condition XO t D 2 L2 .˝; Ft ; PI Rd /. By Lemma 6.25,
L2 .˝; Ft ; PI Rd / 3 X 7! U .t; X; L.X// is Lipschitz continuous in X. Following the
last argument in the proof of Lemma 4.56, it is bounded on bounded subsets. Hence,
568 6 Optimal Control of SDEs of McKean-Vlasov Type
the above SDE is uniquely solvable. As already explained, strong uniqueness for
McKean-Vlasov SDEs implies weak uniqueness, see the proof of Definition 6.27.
Therefore, the law of the path XO only depends on .t; /, where D L./. In
particular, we can write L.XO s / for L.XO s /, since the latter only depends on
t; t;
through .
Now, in order to well define the conditioning in (6.85), we may define, for any
t;x;
.t; x; / 2 Œ0; T Rd P2 .Rd /, XO D .XO s /t6s6T as the solution of the SDE:
t;x;
dXO st;x; D b s; XO st;x; ; L.XO st; /; ˛N s; XO st;x; ; L.XO st; / dt
(6.87)
C s; XO st;x; ; L.XO st; / dWs ; s 2 Œt; T;
Proof. The identification of v is a mere consequence of the Markov property for the
SDE (6.87), which holds since the equation is well posed. t
u
Proposition 5.102, one may try to guess the kind of relationship it should satisfy in
order for the bounded variation part to vanish. The implementation of this approach
depends only upon the availability of a form of the dynamic programming principle
which is the basis for the martingale property of the value function along optimal
paths.
Following the approach used in finite dimension, a natural strategy is to use (6.88)
as a basis for the derivation of a dynamic programming principle for V. We shall
use the fact that the pair .XO s ; L.XO s //t6s6T is Markovian with values in Rd
t;x; t;
P2 .R /. The Markov property states that for any t 6 t C h 6 T, the future states of
d
t;
tCh;L.XtCh /
L.XO st; / D L XO s ; s 2 Œt C h; T;
and the conditional law of the future states of .XO s /tCh6s6T given the past before
t;x;
O t;
tCh;XtCh ;L.XtCh /O t;
XO st;x; D XO s ; s 2 Œt C h; T;
It follows that:
Under the same assumption as before, V satisfies the dynamic programming
principle:
t;
V t C h; XO tCh ; L.XO tCh /
t;x;
Z T
DE f s; XO st;x; ; L.XO st; /; ˛N s; XO st;x; ; L.XO st; / ds
tCh
t;x; t; ˇ
C g XO T ; L.XO T / ˇ FtCh ;
for t 6 t C h 6 T.
Taking expectations on both sides and using the definition (6.88), this shows that:
Proposition 6.31 Under the above assumption, V satisfies the following version of
the dynamic programming principle:
Z tCh
V.t; x; / D E f s; XO st;x; ; L.XO st; /; ˛N s; XO st;x; ; L.XO st; / ds
t
(6.89)
t;
O t;x; O
C V t C h; XtCh ; L.XtCh / ;
According to the strategy we hinted at earlier, our derivation of the master equation
is based on the application of the chain rule stated in Proposition 5.102 to the
dynamics of V along optimal paths. Consequently, for the purpose of this derivation,
we assume that the value function V defined in (6.88) is smooth enough so that we
can apply the chain rule (5.107). In order to satisfy the integrability constraints in
the chain rule, we also require to be bounded.
It would be possible to carry out the complete analysis of the regularity of V, but
the proof would be lengthy and would require a lot of technicalities. The interested
reader may have a look at the references at the end of the chapter. Also, she/he may
find in Chapter (Vol II)-5 a similar analysis, but for mean field games instead of
mean field stochastic control problems.
1 h i
C trace @2xx V s; XO st;x; ; L.XO st; / s s
2
h A
C EQ @ V s; XO st;x; ; L.XO st; / fXO st; g bQ s
i
C
1 Qh A
E trace @v @ V s; XO st;x; ; L.XO st; / fXO st; g Q s Q s
i
ds
2
C @x V s; XO st;x; ; L.XO st; / s dWs ;
where we wrote bs for b.s; XO s ; L.XO s /; ˛O s /, s for .s; XO s ; L.XO s //, and ˛O s for
t;x; t; t;x; t;
A
fXO st; g
is a copy of XO s .
t;
6.5 Several Forms of the Master Equation 571
Inserting the above expansion in (6.89) in order to get the limit form of the left-
hand side as h tends to 0, we deduce that V satisfies the equation:
@t V.t; x; / C b t; x; ; ˛.t;
N x; / @x V.t; x; /
1 h i
C trace a.t; x; /@2xx V.t; x; / C f t; x; ; ˛.t; N x; /
2
Z
(6.90)
C b t; x0 ; ; ˛.t;
N x0 ; / @ V.t; x; /.x0 /
Rd
1
C trace a t; x0 ; @v @ V.t; x; /.x0 / d.x0 / D 0;
2
for .t; x; / 2 Œ0; TRd P2 .Rd /, with the terminal condition V.T; x; / D g.x; /.
We call this equation the master equation for the value function of the problem.
Recalling that the mapping ˛O W Œ0; TRd P2 .Rd /Rd 3 .t; x; ; y/ 7! ˛.t;
O x; ; y/
minimizes the reduced Hamiltonian H, the minimizer in (6.91) must be:
˛N t; ; D ˛O t; ; ; @x V.t; ; / C EQ @ V t; ;
Q ./
Z
0 0
D ˛O t; ; ; @x V t; ; C @ V t; x ; ./d.x / ;
Rd
is an optimal feedback. Plugging this relationship into (6.90), we obtain the full-
fledged form to the master equation:
6.5 Several Forms of the Master Equation 573
1 h i
@t V.t; x; / C trace a.t; x; /@2xx V.t; x; /
2
Z
0 0
C b t; x; ; ˛O t; x; ; @x V.t; x; / C @ V.t; x ; /.x/d.x /
Rd
@x V.t; x; /
Z
C f t; x; ; ˛O t; x; ; @x V.t; x; / C @ V.t; x0 ; /.x/d.x0 / (6.93)
Rd
Z Z
0 0
C b t; xQ ; ; ˛O t; xQ ; ; @x V.t; xQ ; / C @ V.t; x ; /.Qx/d.x /
Rd Rd
@ V.t; x; /.Qx/
1
C trace a t; xQ ; @v @ V.t; x; /.Qx/ d.Qx/ D 0;
2
for .t; x; / 2 Œ0; TRd P2 .Rd /, with the terminal condition V.T; x; / D g.x; /.
Also, the optimal path solving the optimal control of the McKean-Vlasov dynamics
is given by:
dXO s D b s; XO s ; O s ; ˛O s; XO s ; O s ; @x V.s; XO s ; O s /
Z
(6.94)
C @ V.s; x ; O s /.XO s /dO s .x0 /
0
ds
Rd
C s; XO s ; O s dWs ;
subject to the constraint O s D L.XO s / for s 2 Œt; T, with L.XO t / D , for some
initial distribution 2 P2 .Rd /. Moreover, by comparing (6.86) and (6.92), we may
conjecture that:
Z
U .t; x0 ; / D @x V.t; x; / C @ V.t; x0 ; /.x/d.x0 /;
Rd (6.95)
.t; x; / 2 Œ0; T R P2 .R /:
d d
The fact that the right-hand side contains two different terms is a perfect reflection
of the backward propagation of the terminal condition of the FBSDE (6.69). Indeed,
this terminal condition has two terms corresponding to the partial derivatives of the
terminal cost function g with respect to the state variable x and the distribution .
Recalling Definition 6.30, this leads to us the formal identification:
574 6 Optimal Control of SDEs of McKean-Vlasov Type
The proof of the above relationship can be made rigorous. Observing from (6.88)
that:
Z T
v.t; / D E f s; XO st; ; L.XO st; /; ˛N s; XO st; ; L.XO st; / ds
t
(6.96)
t; t;
O
C g XT ; L.XT / ; O
Verification Argument
The relevance of the master equation (6.90) is contained in the following verification
result, which is an extension of Proposition 5.108 and which does not require
Control of MKV Dynamics to hold.
Proof. We first notice that because of the linear growth assumption and (6.97), the supremum
over Œ0; T of the solution of (6.94) is square integrable and that, for any square integrable
control ˛, the supremum of X ˛ (with X0˛ D ) is also square integrable. Next, we replace
g by V.T; ; / in (6.9), and apply the chain rule (5.107), the integrability condition (6.97)
ensuring that the expectation of the martingale part is zero. Using the same
R Fubini argument
as in (6.91), we deduce that the right-hand side is indeed greater than Rd V.0; x; /d.x/,
with D L./. Choosing ˛ D ˛O 0; , we see that equality must hold. t
u
As explained earlier, the decoupling field U of the FBSDE (6.69). can be identified
with the L-derivative of the value function v, as defined in Definition 6.27. In full
analogy with the previous subsection, but also with Subsections 4.1.2 and 5.7.2,
the goal of this subsection is to derive informally an equation (most likely a PDE,
though possibly in infinite dimension) satisfied by U .
To do so, we assume that U satisfies the assumptions of the Itô chain rule stated
in Proposition 5.102. We start from the equality:
Yt D U t; Xt ; L.Xt / ; (6.98)
dU .t; Xt ; t / D @t U t; Xt ; L.Xt / C @x U t; Xt ; L.Xt / bt
1 2
C @xx U t; Xt ; L.Xt / t t
2
(6.100)
C EQ @ U .t; Xt ; L.Xt //.XQ t /bQ t
1 h i
C EQ @v @ U .t; Xt ; L.Xt //.XQ t /Q t Q t dt
2
C @x U t; Xt ; L.Xt / t dWt ;
where we used the notation bt for b.t; Xt ; L.Xt /; ˛O t / and t for .t; Xt ; L.Xt //. As
usual, .XQ t ; bQ t ; Q t / is a copy of .Xt ; bt ; t /, the expectation over which is denoted by
Q Notice also that @x U .t; Xt ; L.Xt // and @ U .t; Xt ; L.Xt //.XQ t / are d d matrices
E.
acting on the d-dimensional drifts bt and bQ t . Similarly, @2xx U .t; Xt ; L.Xt // and
@v @ U .t; Xt ; L.Xt //.XQ t / are of dimension d .d d/. The d d components act
on t t and Q t Q t .
Identifying the quadratic variation terms in (6.99) and (6.100), we get:
Zt D @x U t; Xt ; L.Xt / t; Xt ; L.Xt / ; t 2 Œ0; T: (6.101)
We can now identify the bounded variation parts of the differentials of both sides
of (6.98) after replacing ˛O t by the argument of the minimization of the Hamiltonian,
namely ˛.t;
O Xt ; L.Xt /; Yt /, and Yt and Zt by (6.98) and (6.101) respectively. We get:
0 D @t U t; Xt ; L.Xt / C @x U t; Xt ; L.Xt / bt
1
C @2xx U t; Xt ; L.Xt / t t
2
h i
C EQ @ U t; Xt ; L.Xt / .XQ t /bQ t
1 Qh
i (6.102)
C E @v @ U t; Xt ; L.Xt / .XQ t /Q t Q t
2
C @x H t; Xt ; L.Xt /; Yt ; Zt ; ˛O t
h i
C EQ @ H t; XQ t ; L.Xt /; YQ t ; ZQ t ; ff
˛O t g .Xt / :
Since .XQ t ; YQ t ; ZQ t ; ff
˛O t g/ is an independent copy of .Xt ; Yt ; Zt ; ˛O t /, if we use formu-
las (6.98) and (6.101), since bt , t and ˛t are also functions of Xt only, the above
expectations EQ are nothing but mere integrals with respect to the measure L.Xt /. So
equality (6.102) will be satisfied along all the optimal paths Œ0; T 3 t 7! .Xt ; L.Xt //
if the function U satisfies the following system of PDEs:
6.5 Several Forms of the Master Equation 577
0 D @t U .t; x; / C @x U .t; x; /b t; x; ; ˛O t; x; ; U .t; x; /
1 2
C @ U .t; x; /a.t; x; /
2 xx
Z
C @ U .t; x; /.x0 /b t; x0 ; ; ˛O t; x0 ; ; U .t; x0 ; /
Rd
1
C @v @ U .t; x; /.x0 /a.t; x0 ; / d.x0 / (6.103)
2
C @x H t; x; ; U .t; x; /; @x U .t; x; /.t; x; /; ˛O t; x; ; U .t; x; /
Z
C @ H t; x0 ; ; U .t; x0 ; /; @x U .t; x0 ; /.t; x0 ; /;
Rd
˛O t; x0 ; ; U .t; x0 ; / .x/d.x0 /;
1
0 D @t U .t; x; / @x U .t; x; /U .t; x; / C x U .t; x; /
2
Z h
C @ U .t; x; /.x0 /U .t; x0 ; /
Rd
1 i
C trace @v @ U .t; x; /.x0 / d.x0 /
2
Z
C @x f0 .t; x; / C @ f0 .t; x0 ; /.x/d.x0 /:
Rd
578 6 Optimal Control of SDEs of McKean-Vlasov Type
The full analysis based on the stochastic maximum principle we provided earlier in
the chapter is very robust, in part because it describes the optimizers of the control
problem in quite a simple way. However, since it relies on a strong joint convexity
assumption in x, and ˛, the conditions under which it can be applied may not be
satisfied in some practical examples.
The purpose of this section is to provide, at least in a particular case, an
alternative formulation of the control of McKean-Vlasov stochastic differential
equations for which existence of an optimizer can be proven without requiring the
Hamiltonian to be convex in the state variable.
In order to circumvent the lack of joint convexity in the cost functions, we introduce a new formulation of the minimization problem $\inf_\alpha J(\alpha)$ defined in (6.6) and (6.9). Throughout the section, the controlled dynamics are linear:
\[
dX^\alpha_t = \bigl(\Gamma X^\alpha_t + B\alpha_t\bigr)\,dt + \Sigma\,dW_t, \qquad t\in[0,T], \tag{6.104}
\]
with a prescribed initial condition $X^\alpha_0 = X_0$. Here, $X^\alpha=(X^\alpha_t)_{0\le t\le T}$ takes values in $\mathbb R^d$, $\Gamma$, $B$ and $\Sigma$ are constant matrices of dimensions $d\times d$, $d\times k$ and $d\times m$ respectively, and $W=(W_t)_{0\le t\le T}$ is an $m$-dimensional Wiener process. Motivated by practical applications addressed in the next section, we assume that $m\ne d$, which differs from what we have done so far. Without any loss of generality we can assume $d>m$, which is the only interesting case, and that $\Sigma$ is of rank $m$. The complete probability space carrying $W$ is denoted by $(\Omega,\mathcal F,\mathbb P)$. It is equipped with a complete and right-continuous filtration $\mathbb F$ and $W$ is assumed to be a Brownian motion with respect to $\mathbb F$. Above, the control $\alpha=(\alpha_t)_{0\le t\le T}$ is a square-integrable and $\mathbb F$-progressively measurable process with values in a closed convex subset $A\subset\mathbb R^k$. Of course, whenever $X_0$ is deterministic, $\mathbb F$ may be the complete (and thus right-continuous) augmentation of the filtration generated by $W$, but, as we shall see next, it may also be a larger filtration. In any case, for the purpose of the present discussion, we remark that the dynamics (6.104) imply that:
\[
X^\alpha_t = X^\alpha_0 + \int_0^t\bigl(\Gamma X^\alpha_s + B\alpha_s\bigr)\,ds + M_t, \qquad t\in[0,T], \tag{6.105}
\]
where $M=(M_t)_{t\ge0}$ is a continuous martingale with quadratic variation $[M,M]_t = \Sigma\Sigma^\dagger t$. Notice also that the state $X^\alpha_t$ in (6.104) satisfies:
\[
X^\alpha_t = e^{\Gamma t}X^\alpha_0 + \int_0^te^{\Gamma(t-s)}B\alpha_s\,ds + Y_t, \qquad t\in[0,T], \tag{6.106}
\]
where the process $Y=(Y_t)_{t\ge0}$ is given by $Y_t = \int_0^te^{\Gamma(t-s)}\Sigma\,dW_s$ and satisfies:
\[
dY_t = \Gamma Y_t\,dt + \Sigma\,dW_t, \qquad Y_0 = 0. \tag{6.107}
\]
For the analysis, it is important to observe that, starting from a state process given by (6.106) with a continuous process $Y$ with $Y_0=0$ such that $(M_t = Y_t - \int_0^t\Gamma Y_s\,ds)_{0\le t\le T}$ is a martingale with quadratic variation $([M,M]_t = \Sigma\Sigma^\dagger t)_{0\le t\le T}$, we can recover the dynamics (6.104) in the following way. If $\Sigma = UDV$ is the singular value decomposition of the matrix $\Sigma$, the matrix $U$ (resp. $V$) is an orthogonal $d\times d$ (resp. $m\times m$) matrix, and the matrix $D$ is a $d\times m$ diagonal matrix (i.e., the entries $D_{i,i}$ for $i=1,\cdots,m$ are the only nonzero entries). We then define the process $W=(W_t)_{t\ge0}$ by:
\[
W_t = V^\dagger D^{-1}U^\dagger M_t, \qquad t\ge0,
\]
where $D^{-1}$ denotes the $m\times d$ diagonal matrix with $[D^{-1}]_{i,i} = [D_{i,i}]^{-1}$ for $i=1,\cdots,m$. The process $W$ is an $m$-dimensional Brownian motion for the same filtration, because it is a continuous martingale with quadratic variation $[W,W]_t = tI_m$. Moreover, by construction $M_t = \Sigma W_t$. While the computation of the quadratic variation of $W$ is straightforward given its definition, the fact that $M_t=\Sigma W_t$ is less obvious. It can be proven by checking that $U^\dagger M_t = U^\dagger\Sigma W_t$, which is equivalent to:
\[
U^\dagger M_t = \bigl[DD^{-1}\bigr]\,U^\dagger M_t.
\]
Since $[DD^{-1}]$ is a $d\times d$ diagonal matrix with ones on the first $m$ entries of the diagonal, the desired equality can be proven by showing that the last $d-m$ entries of the random vector $U^\dagger M_t$ are identically zero. This is indeed the case because:
\[
\mathrm{cov}\bigl(U^\dagger M_t\bigr) = U^\dagger\,\mathbb E\bigl[M_tM_t^\dagger\bigr]\,U = U^\dagger\Sigma\Sigma^\dagger U\,t = DD^\dagger t, \qquad t\in[0,T].
\]
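The linear-algebraic step above is easy to check numerically. The following minimal sketch (hypothetical dimensions and a random $\Sigma$; numpy assumed) simulates $M_t = \Sigma W_t$ and recovers $W$ through the singular value decomposition, exactly as in the construction of the text:

```python
import numpy as np

rng = np.random.default_rng(0)
d, m, T, n = 3, 2, 1.0, 10_000          # d > m, as in the text
dt = T / n

# A rank-m volatility matrix Sigma (hypothetical values) and its SVD.
Sigma = rng.normal(size=(d, m))
U, s, Vt = np.linalg.svd(Sigma)          # Sigma = U @ D @ Vt
D_inv = np.zeros((m, d)); D_inv[:m, :m] = np.diag(1.0 / s)

# Simulate M_t = Sigma W_t from an m-dimensional Brownian motion W.
dW = rng.normal(scale=np.sqrt(dt), size=(n, m))
W = np.cumsum(dW, axis=0)
M = W @ Sigma.T                          # M_t = Sigma W_t, row by row

# Reconstruct W_t = V^T D^{-1} U^T M_t and compare with the original path.
W_rec = M @ (Vt.T @ D_inv @ U.T).T
print(np.max(np.abs(W_rec - W)))         # ~1e-13: exact up to round-off
```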
Remark 6.34 For pedagogical reasons, we could have chosen a simpler model for the dynamics of the state, typically something of the form $dX^\alpha_t = \alpha_t\,dt + dW_t$. Despite the fact that it increases the technicalities of the proofs, the choice of (6.104) (and in particular the assumption $d>m$) was made to accommodate applications such as the flocking model discussed at the end of Subsection 6.7.2 below on potential mean field games.
We begin our discussion of the weak formulation with a cost functional of the type:
\[
J(\alpha) = \mathbb E\Bigl[\int_0^Tf\bigl(t,X^\alpha_t,\mathcal L(X^\alpha_t),\alpha_t\bigr)\,dt + g\bigl(X^\alpha_T,\mathcal L(X^\alpha_T)\bigr)\Bigr], \tag{6.108}
\]
where the coefficients $f$ and $g$ satisfy:

(A1) $g$ is continuous and, for any $t\in[0,T]$, the function $f(t,\cdot,\cdot,\cdot)$ is also continuous;
(A2) there exist two constants $C\ge0$ and $\lambda>0$ such that, for all $(t,x,\mu,\alpha)\in[0,T]\times\mathbb R^d\times\mathcal P_2(\mathbb R^d)\times A$,
\[
|f(t,x,\mu,\alpha)| + |g(x,\mu)| \le C\bigl(1+|x|^2+M_2(\mu)^2+|\alpha|^2\bigr),
\]
\[
f(t,x,\mu,\alpha) \ge \lambda|\alpha|^2 - C\bigl(1+|x|+M_2(\mu)\bigr), \qquad g(x,\mu) \ge -C\bigl(1+|x|+M_2(\mu)\bigr).
\]
Reformulation
A first step is to regard the cost functional as a mere function of the law of the control process $\alpha$, as opposed to a function of the actual realizations of $\alpha$. This requires a new definition of the optimization problem so as to dissociate it from the specific choice of the probability space $(\Omega,\mathcal F,\mathbb P)$. Observe indeed that, instead of constructing control processes on a prescribed probability space (as we did above), we may consider a triple $(X_0,Y,\alpha)$, defined on some complete probability space, still denoted by $(\Omega,\mathcal F,\mathbb P)$ for simplicity, such that:
1. $X_0$ is distributed according to $\mu_0$;
2. $\alpha$ is a square-integrable process with values in $A$, progressively measurable with respect to the filtration appearing in item 3;
3. $Y$ is a $d$-dimensional continuous process such that $Y_0=0$ and the process $(M_t = Y_t - \int_0^t\Gamma Y_s\,ds)_{0\le t\le T}$ is a martingale with quadratic variation $([M,M]_t = \Sigma\Sigma^\dagger t)_{0\le t\le T}$ for the complete and right-continuous augmentation of the filtration generated by the process $(X_0,Y,(\int_0^t\alpha_s\,ds)_{0\le t\le T})$.

On the canonical space $\Omega_{\mathrm{canon}} = \mathbb R^d\times\mathcal C([0,T];\mathbb R^d)\times L^2([0,T];A)$, with canonical random variable $(\chi,y,a)$, a probability measure $\mathbb P$ is then called admissible (Definition 6.35) if:

1. $\chi$ is distributed according to $\mu_0$;
2. the process $(y_t - \int_0^t\Gamma y_s\,ds)_{0\le t\le T}$ is a martingale with quadratic variation $(\Sigma\Sigma^\dagger t)_{0\le t\le T}$ for the complete and right-continuous augmentation $\mathbb F=(\mathcal F_t)_{0\le t\le T}$ of the filtration generated by the process $(\chi,y_t,\int_0^ta_s\,ds)_{0\le t\le T}$;
3. $\mathbb E\int_0^T|a_t|^2\,dt$ is finite.

On $\Omega_{\mathrm{canon}}$, we define:
\[
x_t = e^{\Gamma t}\chi + \int_0^te^{\Gamma(t-s)}Ba_s\,ds + y_t, \qquad t\in[0,T],
\]
and then:
\[
\mathcal J(\mathbb P) = \mathbb E^{\mathbb P}\Bigl[\int_0^Tf\bigl(t,x_t,\mathbb P\circ x_t^{-1},a_t\bigr)\,dt + g\bigl(x_T,\mathbb P\circ x_T^{-1}\bigr)\Bigr].
\]
Observe finally that, for any admissible triple $(X_0,Y,\alpha)$, irrespective of the probability space $(\Omega,\mathcal F,\mathbb P)$ on which it is defined, the law of $(X_0,Y,\alpha)$ is admissible in the sense of Definition 6.35.
Main Statement
Here is the main result of this section.

Theorem 6.37 Under assumption MKV Weak Formulation, there exists a probability $\mathbb P^\star$ on $\Omega_{\mathrm{canon}}$ such that $\mathcal J(\mathbb P^\star)$ is equal to the infimum of $\mathcal J$ over the set of admissible probability measures on $\Omega_{\mathrm{canon}}$.
The advantage of the weak formulation is quite clear: it is much easier to establish the relative compactness of a family of laws of random variables than the relative compactness of the family formed by the random variables themselves. However, although this fact makes the approach more appealing, it still does not answer the need for a topology on the space of control processes for which one can prove tightness for sequences of controls whose costs are uniformly bounded. Indeed, this is more or less what we should prove in order to establish the relative compactness of the sublevel sets of $\mathcal J$.

A way to bypass this difficulty is to relax the formulation of the control problem using the notion of relaxed control, which has been widely used in the theory of stochastic control. Roughly speaking, a relaxed control is a random measure on $[0,T]\times A$ satisfying some conditions that are described below. Obviously, standard controls ought to be relaxed controls. A natural way to make sure that this is indeed the case is to associate, with each control process $\alpha=(\alpha_t)_{0\le t\le T}$ defined on a complete filtered probability space $(\Omega,\mathcal F,\mathbb F,\mathbb P)$, the random measure $Q(\omega,dt,d\alpha) = dt\,\delta_{\alpha_t(\omega)}(d\alpha)$, rigorously defined as:
\[
Q(\omega,B) = \int_0^T\mathbf 1_B\bigl(t,\alpha_t(\omega)\bigr)\,dt, \qquad B\in\mathcal B([0,T]\times A),\ \omega\in\Omega.
\]
Notice that the marginal distribution of $Q$ on $[0,T]$ is the Lebesgue measure. This prompts us to introduce the following definition.
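In coordinates, this embedding is easy to visualize. The following sketch (hypothetical grids and control path; numpy assumed) discretizes $Q(\omega,dt,d\alpha)=dt\,\delta_{\alpha_t(\omega)}(d\alpha)$ into a (time, action) histogram and checks that its first marginal is the Lebesgue measure, which is the defining property of the set $\mathcal Q$ of relaxed controls introduced next:

```python
import numpy as np

def occupation_measure(alpha_path, dt, a_bins):
    """Discretize Q(omega, dt, da) = dt * delta_{alpha_t(omega)}(da).

    alpha_path : control values on a uniform time grid,
    dt         : time step, a_bins : edges of the action-space bins.
    Returns a (time, action) histogram of total mass T = len(path)*dt.
    """
    n, k = len(alpha_path), len(a_bins) - 1
    Q = np.zeros((n, k))
    idx = np.clip(np.digitize(alpha_path, a_bins) - 1, 0, k - 1)
    Q[np.arange(n), idx] = dt           # each time slice carries mass dt
    return Q

# Example: a one-dimensional control path on [0, 1].
dt = 0.01
t = np.arange(0.0, 1.0, dt)
Q = occupation_measure(np.sin(2 * np.pi * t), dt, np.linspace(-1, 1, 21))
print(Q.sum())                          # = 1.0: first marginal is Leb on [0,1]
```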
Recall also from Proposition 5.7 that, for any $B\in\mathcal B([0,T]\times A)$, the mapping $\Omega\ni\omega\mapsto Q(\omega,B)$, which we sometimes denote by $Q(B)$, is a random variable. A key observation is that every element $q\in\mathcal Q$ may be normalized into a probability measure $q/T\in\mathcal P([0,T]\times A)$. In this respect, the following simple result will come handy in what follows.

Proposition 6.39 The set $\{q/T;\ q\in\mathcal Q\}$ is a closed subset of $\mathcal P([0,T]\times A)$ equipped with the topology of weak convergence.

Proof. Without any loss of generality, we assume that $T=1$. We then consider a sequence $(q_n)_{n\ge0}$, with values in $\mathcal Q$, which converges in the weak sense to some $q\in\mathcal P([0,T]\times A)$, and prove that, for any $B\in\mathcal B([0,T])$, $q(B\times A) = \mathrm{Leb}_1(B)$, which will show that $q$ belongs to $\mathcal Q$. Consider a continuous function $\ell$ from $[0,T]$ to $\mathbb R$. We have:
\[
\lim_{n\to\infty}\int_0^T\!\!\int_A\ell(t)\,q_n(dt,da) = \int_0^T\!\!\int_A\ell(t)\,q(dt,da).
\]
Obviously, the fact that the left-hand side is equal to $\int_0^T\ell(t)\,dt$ concludes the proof. □
The fact that the first marginal of a relaxed control is fixed plays a key role. For instance, we shall appeal several times to the following property, sometimes referred to as stable convergence in law; see Lemma (Vol II)-7.34 for a proof.
With the above notation, we can prove the following properties which will be helpful in manipulating relaxed controls:

1. for any $t\in[0,T]$, the mapping $[0,t]\times\Omega\ni(s,\omega)\mapsto Q_s(\omega,\cdot)\in\mathcal P(A)$ is measurable when its domain $[0,t]\times\Omega$ is equipped with the $\sigma$-field $\mathcal B([0,t])\otimes\mathcal F_t$ and its range $\mathcal P(A)$ with the Borel $\sigma$-field of the Lévy-Prokhorov metric;
2. for $\mathbb P$-almost every $\omega\in\Omega$, $Q(\omega,\cdot) = dt\,Q_t(\omega,\cdot)$.

For $0<h\le t\le T$ and $B\in\mathcal B(A)$, let $Q^h_t(\omega,B) = h^{-1}Q(\omega,[(t-h)^+,t]\times B)$. For any $t\in[0,T]$, $Q^h_t:\Omega\ni\omega\mapsto Q^h_t(\omega,B)$ is a random variable. Moreover, for $0\le s<t\le T$,
\[
\bigl|Q^h_t(\omega,B) - Q^h_s(\omega,B)\bigr| \le h^{-1}\Bigl[Q\bigl(\omega,[s,t]\times A\bigr) + Q\bigl(\omega,[(s-h)^+,(t-h)^+]\times A\bigr)\Bigr] \le 2\,\frac{t-s}{h},
\]
so that $Q^h(\cdot,B)$ is continuous in $t$ and is thus jointly measurable in $(t,\omega)$. We deduce that $[h,T]\times\Omega\ni(t,\omega)\mapsto Q^h_t(\omega,\cdot)\in\mathcal P(A)$ is measurable.
Observe now that, for any $\omega\in\Omega$ and for almost every $t\in[0,T]$, for any $\ell$ in a countable dense subset of the space of real valued continuous functions on $\mathbb R$ converging to $0$ at infinity, we have:
\[
\int_A\ell(a)\,Q_t(\omega,da) = \lim_{h\searrow0}h^{-1}\int_{(t-h)^+}^t\!\!\int_A\ell(a)\,Q\bigl(\omega,(ds,da)\bigr) = \lim_{h\searrow0}\int_A\ell(a)\,Q^h_t(\omega,da),
\]
the middle term being well defined for every $t$ since, by the above bound, the map $t\mapsto\int_A\ell(a)\,Q^h_t(\omega,da)$ is Lipschitz continuous. In particular, for any $\omega\in\Omega$, $\varepsilon>0$ and almost every $t\in(\varepsilon,T]$, the family $(Q^h_t(\omega,\cdot))_{0<h<\varepsilon}$ converges weakly to $Q_t(\omega,\cdot)$ as $h\searrow0$.

For each fixed $\varepsilon>0$, let $D_\varepsilon\subset[\varepsilon,T]\times\Omega$ be the set of points $(t,\omega)$ at which the sequence $(Q^{1/n}_t(\omega,\cdot))_{n>\varepsilon^{-1}}$ has a limit in $\mathcal P(A)$ as $n$ tends to $\infty$. By joint measurability of the map $[\varepsilon,T]\times\Omega\ni(t,\omega)\mapsto Q^{1/n}_t(\omega,\cdot)$, we have $D_\varepsilon\in\mathcal B([\varepsilon,T])\otimes\mathcal F$. Moreover, for all $\omega\in\Omega$, the set $\{t\in[\varepsilon,T] : (t,\omega)\not\in D_\varepsilon\}$ has zero Lebesgue measure. Therefore, $D_\varepsilon$ has full measure in $[\varepsilon,T]\times\Omega$. In particular, we can redefine $Q_t(\omega,\cdot)$ as the limit of $(Q^{1/n}_t(\omega,\cdot))_{n\ge1}$ when $(t,\omega)\in\cup_{\varepsilon>0}D_\varepsilon$ (each $D_\varepsilon$ being regarded as an element of $\mathcal B([0,T])\otimes\mathcal F$), and as a fixed arbitrary value otherwise.

By the same argument as above, for any $\varepsilon\le h\le t\le T$, the mapping $[\varepsilon,t]\times\Omega\ni(s,\omega)\mapsto Q^h_s(\omega,\cdot)$ is $\mathcal B([\varepsilon,t])\otimes\mathcal F_t$-measurable. In particular, the set $\{(s,\omega)\in[\varepsilon,t]\times\Omega : (s,\omega)\in D_\varepsilon\}$ belongs to $\mathcal B([\varepsilon,t])\otimes\mathcal F_t$. Therefore, $[0,t]\times\Omega\ni(s,\omega)\mapsto Q_s(\omega,\cdot)$ is $\mathcal B([0,t])\otimes\mathcal F_t$-measurable. □
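The mollification $Q^h_t$ used in this proof is a plain sliding-window average in time, which the following sketch reproduces on a discretized occupation measure (hypothetical grid sizes; same conventions as the earlier snippet):

```python
import numpy as np

# A (time, action) histogram Q: each row carries the mass dt placed at the
# bin containing alpha_t (same construction as in the previous sketch).
dt, n_bins = 0.01, 20
t = np.arange(0.0, 1.0, dt)
bins = np.linspace(-1.0, 1.0, n_bins + 1)
idx = np.clip(np.digitize(np.sin(2 * np.pi * t), bins) - 1, 0, n_bins - 1)
Q = np.zeros((len(t), n_bins)); Q[np.arange(len(t)), idx] = dt

def mollified_kernel(Q, dt, h_steps):
    """Q^h_t(da) = h^{-1} Q([(t-h)^+, t] x da): sliding-window average in t."""
    Qh = np.zeros_like(Q)
    for i in range(len(Q)):
        lo = max(0, i - h_steps + 1)                  # window [(t-h)^+, t]
        Qh[i] = Q[lo:i + 1].sum(axis=0) / ((i + 1 - lo) * dt)
    return Qh

Qh = mollified_kernel(Q, dt, h_steps=5)
print(np.allclose(Qh.sum(axis=1), 1.0))               # each Q^h_t lies in P(A)
```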
We now provide a new formulation of the optimal control problem which accommodates relaxed controls.

Observe that the two integrals with respect to $Q$ may be rewritten in terms of $(Q_t)_{0\le t\le T}$. Of course, the resulting expression does not depend upon the choice of the kernel $(Q_t)_{0\le t\le T}$ in Proposition 6.41.

1. the law of $X_0$ is the probability measure $\mu_0$ originally chosen for the initial distribution;
2. $Q$ is a random variable with values in $\mathcal Q$ such that $\mathbb E\int_0^T\!\int_A|a|^2\,Q(dt,da)$ is finite;
3. $Y$ is a $d$-dimensional continuous process such that $Y_0=0$ and the process $M$ defined by $M_t = Y_t - \int_0^t\Gamma Y_s\,ds$ for $0\le t\le T$ is a martingale for the complete and right-continuous augmentation of the filtration generated by $(X_0,Y,Q)$ (see the paragraph preceding the statement of Proposition 6.41 for the definition of the filtration generated by $Q$) with quadratic variation $[M,M]_t = \Sigma\Sigma^\dagger t$.

On the canonical space $\Omega^{\mathrm{relax}}_{\mathrm{canon}} = \mathbb R^d\times\mathcal C([0,T];\mathbb R^d)\times\mathcal Q$, with canonical random variable $(\chi,y,q)$, a probability measure $\mathbb P$ is called admissible (Definition 6.42) if:

1. $\chi$ is distributed according to $\mu_0$;
2. $y=(y_t)_{0\le t\le T}$ is such that $y_0=0$ and the process $m=(m_t)_{0\le t\le T}$ defined by $m_t = y_t - \int_0^t\Gamma y_s\,ds$ is a martingale for the complete and right-continuous augmentation $\mathbb F$ of the canonical filtration generated by the process $(\chi,y_t,q(\cdot\cap([0,t]\times A)))_{0\le t\le T}$ with quadratic variation $[m,m]_t = \Sigma\Sigma^\dagger t$;
3. $\mathbb E\int_0^T\!\int_A|a|^2\,q(dt,da)<\infty$.
Taking for granted the conclusion of Theorem 6.43, the proof of Theorem 6.37 may be completed as follows.

Proof of Theorem 6.37. Consider a minimizing probability measure $\mathbb P^{\star,\mathrm{relax}}$ identified in the statement of Theorem 6.43. It is a probability measure on the canonical space $\Omega^{\mathrm{relax}}_{\mathrm{canon}}$. With each $(\chi,y,q)\in\Omega^{\mathrm{relax}}_{\mathrm{canon}}$ such that:
\[
\int_0^T\!\!\int_A|a|\,q(dt,da)<\infty,
\]
we associate the process $x=(x_t)_{0\le t\le T}$ defined by (6.111). Under $\mathbb P^{\star,\mathrm{relax}}$, the integral $\int_0^T\!\int_A|a|^2\,q(ds,da)$ is almost surely finite. In particular, $x$ has continuous sample paths. We call $(q_t(\omega,\cdot))_{0\le t\le T}$ the kernel given by Proposition 6.41 satisfying $q(\omega,\cdot) = dt\,q_t(\omega,\cdot)$ for $\mathbb P^{\star,\mathrm{relax}}$-almost every $\omega\in\Omega^{\mathrm{relax}}_{\mathrm{canon}}$. Then, by convexity of the running cost in the variable $\alpha$, we have:
\[
\begin{aligned}
\mathcal J^{\mathrm{relax}}\bigl(\mathbb P^{\star,\mathrm{relax}}\bigr) &= \mathbb E^{\mathbb P^{\star,\mathrm{relax}}}\Bigl[g\bigl(x_T,\mathbb P^{\star,\mathrm{relax}}\circ x_T^{-1}\bigr) + \int_0^T\!\!\int_Af\bigl(t,x_t,\mathbb P^{\star,\mathrm{relax}}\circ x_t^{-1},a\bigr)\,q(dt,da)\Bigr]\\
&= \mathbb E^{\mathbb P^{\star,\mathrm{relax}}}\Bigl[g\bigl(x_T,\mathbb P^{\star,\mathrm{relax}}\circ x_T^{-1}\bigr) + \int_0^T\Bigl(\int_Af\bigl(t,x_t,\mathbb P^{\star,\mathrm{relax}}\circ x_t^{-1},a\bigr)\,q_t(da)\Bigr)dt\Bigr]\\
&\ge \mathbb E^{\mathbb P^{\star,\mathrm{relax}}}\Bigl[g\bigl(x_T,\mathbb P^{\star,\mathrm{relax}}\circ x_T^{-1}\bigr) + \int_0^Tf\Bigl(t,x_t,\mathbb P^{\star,\mathrm{relax}}\circ x_t^{-1},\int_Aa\,q_t(da)\Bigr)dt\Bigr].
\end{aligned}
\tag{6.112}
\]
Notice that, $\mathbb P^{\star,\mathrm{relax}}$-almost surely, for almost every $t\in[0,T]$, $\int_Aa\,q_t(da)$ makes sense and belongs to $A$. Letting:
\[
\alpha_t(\omega) =
\begin{cases}
\displaystyle\int_Aa\,q_t(\omega,da) & \text{if }\displaystyle\int_A|a|\,q_t(\omega,da)<\infty,\\[4pt]
0 & \text{otherwise,}
\end{cases}
\]
we now define $\mathbb P^\star$ as the law of $(\chi,y,\alpha)$ on $\mathbb R^d\times\mathcal C([0,T];\mathbb R^d)\times L^2([0,T];A)$ under $\mathbb P^{\star,\mathrm{relax}}$. Also, with the same notation as in Definition 6.35 and by the very definition of $x=(x_t)_{0\le t\le T}$ in (6.111), the law of $(\chi,y,a,x)$ under $\mathbb P^\star$ is the same as the law of $(\chi,y,\alpha,x)$ under $\mathbb P^{\star,\mathrm{relax}}$. By (6.112), we then have:
\[
\mathcal J\bigl(\mathbb P^\star\bigr) \le \mathcal J^{\mathrm{relax}}\bigl(\mathbb P^{\star,\mathrm{relax}}\bigr).
\]
Since any admissible probability measure for the weak formulation (see Definition 6.35) is also an admissible probability measure for the weak relaxed formulation (see Definition 6.42), or equivalently since $\mathcal A\subset\mathcal A^{\mathrm{relax}}$, we deduce that, for any $\mathbb P\in\mathcal A$,
\[
\mathcal J\bigl(\mathbb P^\star\bigr) \le \mathcal J^{\mathrm{relax}}\bigl(\mathbb P^{\star,\mathrm{relax}}\bigr) \le \mathcal J\bigl(\mathbb P\bigr),
\]
which completes the proof. □
The rest of the section is devoted to the proof of Theorem 6.43. Without any loss of generality we assume that $T=1$, which allows us to identify $\mathcal Q$ with a subset of $\mathcal P([0,1]\times A)$.

In order to prove that the weak relaxed formulation admits a minimizer, we proceed in two steps. First we prove that any nonempty sublevel set of the form $\{\mathbb P\in\mathcal A^{\mathrm{relax}} : \mathcal J^{\mathrm{relax}}(\mathbb P)\le K\}$, for $K\in\mathbb R$, is relatively compact for the weak topology. Next, we show that $\mathcal J^{\mathrm{relax}}$ is lower semicontinuous for the weak topology.
Lemma 6.44 Let $K\in\mathbb R$ be such that the sublevel set $\{\mathbb P\in\mathcal A^{\mathrm{relax}} : \mathcal J^{\mathrm{relax}}(\mathbb P)\le K\}$ is not empty. Then, for any $\gamma\in[1,2)$, $\{\mathbb P\in\mathcal A^{\mathrm{relax}} : \mathcal J^{\mathrm{relax}}(\mathbb P)\le K\}$ is relatively compact for the weak topology on $\mathcal P(\mathbb R^d\times\mathcal C([0,1];\mathbb R^d)\times\mathcal Q_\gamma)$, where $\mathcal Q_\gamma = \mathcal Q\cap\mathcal P_\gamma([0,1]\times A)$ is equipped with the $\gamma$-Wasserstein distance on $\mathcal P_\gamma([0,1]\times A)$. Moreover, any weak limit of sequences with values in $\{\mathbb P\in\mathcal A^{\mathrm{relax}} : \mathcal J^{\mathrm{relax}}(\mathbb P)\le K\}$ belongs to $\mathcal A^{\mathrm{relax}}$.

Notice that Proposition 6.39 implies that $\mathcal Q_\gamma$ is a closed subset of $\mathcal P_\gamma([0,1]\times A)$ for the $\gamma$-Wasserstein distance.
Proof. Recall that we assume that $T=1$ for simplicity. Throughout the proof, we work on the canonical space $\Omega^{\mathrm{relax}}_{\mathrm{canon}}$ and the canonical random variable is denoted by $(\chi,y,q)$. With each $(\chi,y,q)\in\Omega^{\mathrm{relax}}_{\mathrm{canon}}$ such that:
\[
\int_0^1\!\!\int_A|a|\,q(dt,da)<\infty,
\]
we associate the process $x=(x_t)_{0\le t\le1}$ defined by (6.111) for $t\in[0,1]$. When $\mathbb P\in\mathcal A^{\mathrm{relax}}$, the integral $\int_0^1\!\int_A|a|^2\,q(ds,da)$ is finite $\mathbb P$-almost surely. In particular, $x$ has continuous sample paths.

First Step. Let us consider a sequence $(\mathbb P_n)_{n\ge1}$ with values in $\mathcal A^{\mathrm{relax}}$ such that $\mathcal J^{\mathrm{relax}}(\mathbb P_n)\le K$ for all $n\ge1$. By assumption (A2), we have, for a possibly new value of the constant $C$ whose value is allowed to increase from line to line:
\[
\lambda\,\mathbb E^n\Bigl[\int_0^1\!\!\int_A|a|^2\,q(dt,da)\Bigr] \le \mathcal J^{\mathrm{relax}}(\mathbb P_n) + C\Bigl(1+\mathbb E^n\bigl[\sup_{0\le t\le1}|x_t|\bigr]\Bigr).
\]
We use the notation $\mathbb E^n$ to denote the expectation with respect to $\mathbb P_n$. Observe now that, by definition (6.111) of $x$, we have:
\[
\mathbb E^n\bigl[\sup_{0\le t\le1}|x_t|\bigr] \le C\Bigl(1+\mathbb E^n\Bigl[\Bigl(\int_0^1\!\!\int_A|a|^2\,q(dt,da)\Bigr)^{1/2}\Bigr]\Bigr),
\]
from which we deduce that $\sup_{n\ge1}\mathbb E^n[\int_0^1\!\int_A|a|^2\,q(dt,da)]\le C_K$ for some constant $C_K$, and then that the sequence $(\mathbb P_n\circ q^{-1})_{n\ge1}$ is tight on $\mathcal P([0,1]\times A)$. In particular, by Markov's inequality, for any $\varepsilon,c>0$:
\[
\sup_{n\ge1}\mathbb P_n\Bigl[q\Bigl(|a|>\frac c\varepsilon\Bigr)>\frac1c\Bigr] \le c\,\sup_{n\ge1}\mathbb E^n\Bigl[q\Bigl(|a|>\frac c\varepsilon\Bigr)\Bigr] \le \frac{C_K\,\varepsilon^2}{c}.
\]
Choosing $c=2^p$ for $p\in\mathbb N$, we get:
\[
\sup_{n\ge1}\mathbb P_n\Bigl[q\Bigl(|a|>\frac{2^p}\varepsilon\Bigr)>\frac1{2^p}\Bigr] \le \frac{C_K\,\varepsilon^2}{2^p},
\]
and, summing over $p$:
\[
\sup_{n\ge1}\mathbb P_n\Bigl[\bigcup_{p\in\mathbb N}\Bigl\{q\Bigl(|a|>\frac{2^p}\varepsilon\Bigr)>\frac1{2^p}\Bigr\}\Bigr] \le \sum_{p\in\mathbb N}\frac{C_K\,\varepsilon^2}{2^p} = 2C_K\,\varepsilon^2. \tag{6.115}
\]
Second Step. Similarly, since $\mathbf 1_{\{|a|>2^p\}}|a|^\gamma\le2^{-(2-\gamma)p}|a|^2$, Markov's inequality yields, for each $p\in\mathbb N$:
\[
\sup_{n\ge1}\mathbb P_n\Bigl[\int_0^1\!\!\int_A\mathbf 1_{\{|a|>2^p\}}|a|^\gamma\,q(dt,da)>\frac{2^{-(2-\gamma)p/2}}\varepsilon\Bigr] \le C_K\,\varepsilon\,2^{-(2-\gamma)p/2}, \tag{6.117}
\]
the right-hand side being summable in $p$.
Now, using Corollary 5.6, the fact that the time component $t$ runs through the compact interval $[0,1]$, and the relative compactness of the set (6.116) for the topology of weak convergence, we conclude that, for any $\varepsilon>0$, the set:
\[
\Bigl\{q\in\mathcal P_\gamma([0,1]\times A):\ \forall p\in\mathbb N,\ q\Bigl(|a|>\frac{2^p}\varepsilon\Bigr)\le\frac1{2^p}\ \ \text{and}\ \ \int_0^1\!\!\int_A\mathbf 1_{\{|a|>2^p\}}|a|^\gamma\,q(dt,da)\le\frac{2^{-(2-\gamma)p/2}}\varepsilon\Bigr\}
\]
is relatively compact in $\mathcal P_\gamma([0,1]\times A)$. Since the bounds (6.115) and (6.117) control the mass that $\mathbb P_n\circ q^{-1}$ puts outside of this relatively compact set, the proof of the tightness of $(\mathbb P_n\circ q^{-1})_{n\ge1}$ on $\mathcal Q_\gamma$ is complete.
Third Step. Obviously, $(\mathbb P_n)_{n\ge1}$ is tight on $\Omega^{\mathrm{relax}}_{\mathrm{canon}}$. Therefore, we can extract a subsequence converging in the weak sense. Let $\mathbb P$ be a limit point. In order to complete the proof, we must show that $\mathbb P$ satisfies the properties of Definition 6.42.

It is clear that under $\mathbb P$, $\chi$ has distribution $\mu_0$. Moreover, it is standard to prove that, under $\mathbb P$, the process $m$ defined by $(m_t = y_t - \int_0^t\Gamma y_s\,ds)_{0\le t\le1}$ is a martingale for the complete and right-continuous augmentation of the filtration of the canonical process $(\chi,y,q)$ with quadratic variation $([m,m]_t = \Sigma\Sigma^\dagger t)_{0\le t\le1}$.

Finally, observe that, if we denote by $\mathbb E$ the expectation with respect to $\mathbb P$, we have:
\[
\mathbb E\Bigl[\int_0^1\!\!\int_A|a|^2\,q(dt,da)\Bigr] = \lim_{p\to\infty}\mathbb E\Bigl[\int_0^1\!\!\int_A|a|^2\varphi_p(|a|)\,q(dt,da)\Bigr],
\]
where $(\varphi_p)_{p\in\mathbb N}$ is a nondecreasing sequence of continuous functions with values in $[0,1]$, equal to $1$ on $[-p,p]$ and vanishing outside $[-2p,2p]$. Since the mapping $\mathcal Q_\gamma\ni Q\mapsto\int_0^1\!\int_A|a|^2\varphi_p(|a|)\,Q(dt,da)$ is continuous, we deduce that, for all $p\in\mathbb N$,
\[
\mathbb E\Bigl[\int_0^1\!\!\int_A|a|^2\varphi_p(|a|)\,q(dt,da)\Bigr] = \lim_{n\to\infty}\mathbb E^n\Bigl[\int_0^1\!\!\int_A|a|^2\varphi_p(|a|)\,q(dt,da)\Bigr] \le \liminf_{n\to\infty}\mathbb E^n\Bigl[\int_0^1\!\!\int_A|a|^2\,q(dt,da)\Bigr] \le C_K.
\]
We conclude that:
\[
\mathbb E\Bigl[\int_0^1\!\!\int_A|a|^2\,q(dt,da)\Bigr] \le C_K,
\]
which completes the proof. □
Lemma 6.45 Let $(S,d)$ be a Polish space and $\varphi:[0,1]\times S\times A\to\mathbb R$ be a bounded measurable function such that, for any $t\in[0,1]$, the function $S\times A\ni(\varsigma,a)\mapsto\varphi(t,\varsigma,a)$ is continuous. Then, for any sequence $(\varsigma^n,q_n)_{n\ge1}$ converging to some $(\varsigma,q)$ for the product topology on $\mathcal C([0,1];S)\times\mathcal P([0,1]\times A)$, we have:
\[
\lim_{n\to\infty}\int_0^1\!\!\int_A\varphi(t,\varsigma^n_t,a)\,q_n(dt,da) = \int_0^1\!\!\int_A\varphi(t,\varsigma_t,a)\,q(dt,da).
\]
Proof.
First Step. We first assume that there exists $c>0$ such that $\varphi(t,\varsigma,a)=0$ if $|a|\ge c$. Then, we write:
\[
\int_0^1\!\!\int_A\varphi(t,\varsigma^n_t,a)\,q_n(dt,da) - \int_0^1\!\!\int_A\varphi(t,\varsigma_t,a)\,q(dt,da) = \int_0^1\!\!\int_A\bigl[\varphi(t,\varsigma^n_t,a)-\varphi(t,\varsigma_t,a)\bigr]\,q_n(dt,da) + \int_0^1\!\!\int_A\varphi(t,\varsigma_t,a)\,\bigl[q_n-q\bigr](dt,da). \tag{6.118}
\]
Since for any $t\in[0,1]$ the function $S\times A\ni(\varsigma,a)\mapsto\varphi(t,\varsigma,a)$ is continuous in $(\varsigma,a)$ and null when $|a|\ge c$, we conclude that, for any $t\in[0,1]$:
\[
\lim_{n\to\infty}\sup_{a\in A}\bigl|\varphi(t,\varsigma^n_t,a)-\varphi(t,\varsigma_t,a)\bigr| = 0.
\]
Now,
\[
\Bigl|\int_0^1\!\!\int_A\bigl[\varphi(t,\varsigma^n_t,a)-\varphi(t,\varsigma_t,a)\bigr]\,q_n(dt,da)\Bigr| \le \int_0^1\sup_{a\in A}\bigl|\varphi(t,\varsigma^n_t,a)-\varphi(t,\varsigma_t,a)\bigr|\,dt,
\]
which takes care of the first term in the right-hand side of (6.118). As for the second term, it tends to $0$ as $n$ tends to $\infty$ when $\varphi$ is jointly continuous in its three arguments. In order to overcome the lack of continuity in the variable $t$, we use Proposition 6.40.

Second Step. We now turn to the general case when $\varphi$ does not vanish anymore when $|a|$ is large. We consider a sequence of continuous functions $(\psi_p)_{p\in\mathbb N}$, with values in $[0,1]$, such that $\psi_p(x)=1$ when $|x|\le p$ and $\psi_p(x)=0$ whenever $|x|\ge2p$. Then, the first step applies to each function $[0,1]\times S\times A\ni(t,\varsigma,a)\mapsto\varphi(t,\varsigma,a)\psi_p(a)$. Therefore, in order to complete the proof, it suffices to notice that:
\[
\sup_{n\ge1}\Bigl|\int_0^1\!\!\int_A\varphi(t,\varsigma^n_t,a)\bigl[\psi_p(a)-1\bigr]\,q_n(dt,da)\Bigr| \le C\,\sup_{n\ge1}\int_0^1\!\!\int_A\bigl|\psi_p(a)-1\bigr|\,q_n(dt,da) \le C\,\sup_{n\ge1}\int_0^1\!\!\int_A\mathbf 1_{\{|a|\ge p\}}\,q_n(dt,da),
\]
where $C$ depends upon the sup norm of $\varphi$. Since the sequence of measures $(q_n)_{n\ge1}$ is convergent, it is tight, so that the last term tends to $0$ as $p$ tends to $\infty$. □
Lemma 6.46 For any $\gamma\in(1,2)$, the cost functional $\mathcal J^{\mathrm{relax}}:\mathcal A^{\mathrm{relax}}\to\mathbb R$ given in Definition 6.42 is lower semicontinuous on any sublevel set with respect to the weak topology on $\mathcal P(\mathbb R^d\times\mathcal C([0,T];\mathbb R^d)\times\mathcal Q_\gamma)$, where $\mathcal Q_\gamma = \mathcal Q\cap\mathcal P_\gamma([0,1]\times A)$ is equipped with the $\gamma$-Wasserstein distance.
Proof. Let us consider a sequence $(\mathbb P_n)_{n\ge1}$ in $\mathcal A^{\mathrm{relax}}$ such that $\sup_{n\ge1}\mathcal J^{\mathrm{relax}}(\mathbb P_n)\le K$, for some $K\in\mathbb R$, and converging to $\mathbb P$ with respect to the weak topology on $\mathcal P(\mathbb R^d\times\mathcal C([0,T];\mathbb R^d)\times\mathcal Q_\gamma)$. By Lemma 6.44, $\mathbb P$ belongs to $\mathcal A^{\mathrm{relax}}$.

First Step. When $x$ is defined on the canonical space $\Omega^{\mathrm{relax},\gamma}_{\mathrm{canon}} = \mathbb R^d\times\mathcal C([0,1];\mathbb R^d)\times\mathcal Q_\gamma$, with $\mathcal Q_\gamma = \mathcal Q\cap\mathcal P_\gamma([0,1]\times A)$, via formula (6.111), we view it as the image of $(\chi,y,q)$ under the mapping $\mathcal X$ defined by:
\[
\Omega^{\mathrm{relax},\gamma}_{\mathrm{canon}}\ni(\chi,y,q)\mapsto\mathcal X(\chi,y,q) = \Bigl(e^{\Gamma t}\chi + \int_0^t\!\!\int_Ae^{\Gamma(t-s)}Ba\,q(ds,da) + y_t\Bigr)_{0\le t\le1}\in\mathcal C([0,1];\mathbb R^d).
\]
One checks that $\mathcal X$ is continuous on $\Omega^{\mathrm{relax},\gamma}_{\mathrm{canon}}$, with a modulus of continuity controlled by a constant $C$ depending only upon the norms of the matrices $\Gamma$ and $B$.
Second Step. From the first step, we have that $(\mathbb P_n\circ(\chi,y,q,x)^{-1})_{n\ge1}$ converges to $\mathbb P\circ(\chi,y,q,x)^{-1}$ on $\Omega^{\mathrm{relax},\gamma}_{\mathrm{canon}}\times\mathcal C([0,1];\mathbb R^d)$. Therefore, letting, for each $t\in[0,1]$ and $n\ge1$, $\mu^n_t = \mathbb P_n\circ x_t^{-1}$ and $\mu_t = \mathbb P\circ x_t^{-1}$, and observing that:
\[
\sup_{0\le t\le1}|x_t| \le C\Bigl(|\chi| + \int_0^1\!\!\int_A|a|\,q(dt,da) + \sup_{0\le t\le1}|y_t|\Bigr),
\]
we see that, for any $\gamma\in[1,2)$, the sequence of measures $(\mathbb P_n\circ(\sup_{0\le t\le1}|x_t|)^{-1})_{n\ge1}$ is $\gamma$-uniformly integrable. We deduce that, for any $t\in[0,1]$, the sequence $(\mu^n_t)_{n\ge1}$ converges to $\mu_t$ in $\mathcal P_\gamma(\mathbb R^d)$. Moreover:
\[
\mathbb E\Bigl[\int_0^1\!\!\int_A|a|^2\,q(dt,da)\Bigr] \le C_K, \qquad \sup_{0\le t\le1}M_2(\mu_t)^2 \le \mathbb E\bigl[\sup_{0\le t\le1}|x_t|^2\bigr] \le C_K. \tag{6.120}
\]
Third Step. By Skorohod's representation theorem on the Polish space $\Omega^{\mathrm{relax},\gamma}_{\mathrm{canon}}\times\mathcal C([0,1];\mathbb R^d)$, we get random variables $(\xi^n,Y^n,Q^n,X^n)_{n\ge1}$ together with $(\xi^\infty,Y^\infty,Q^\infty,X^\infty)$, defined on the same probability space $(\Theta,\mathcal G,\mathbb P^\Theta)$, such that, for all $n\ge1$, $(\xi^n,Y^n,Q^n,X^n)$ is distributed according to $\mathbb P_n\circ(\chi,y,q,x)^{-1}$, $(\xi^\infty,Y^\infty,Q^\infty,X^\infty)$ is distributed according to $\mathbb P\circ(\chi,y,q,x)^{-1}$ and, $\mathbb P^\Theta$-almost surely,
\[
\lim_{n\to\infty}\bigl(\xi^n,Y^n,Q^n,X^n\bigr) = \bigl(\xi^\infty,Y^\infty,Q^\infty,X^\infty\bigr).
\]
Then, for each $p\in\mathbb N$, we consider the truncated running cost $f(t,x,\mu,a)\psi_p(a)$, where $\psi_p$ is a continuous cut-off function from $\mathbb R^k$ to $[0,1]$, which is equal to $1$ on the ball of center $0$ and radius $p$, and to $0$ outside the ball of center $0$ and radius $2p$.

We now apply Fatou's lemma to the sequence $(\int_0^1\!\int_Af(t,X^n_t,\mu^n_t,a)\psi_p(a)\,Q^n(dt,da) + C_p\int_0^1(1+|X^n_t|+M_2(\mu^n_t))\,dt)_{n\ge1}$, which is non-negative for a well-chosen constant $C_p$. By uniform integrability of the random variables $(\int_0^1(1+|X^n_t|+M_2(\mu^n_t))\,dt)_{n\ge1}$, we get, for any $p\in\mathbb N$:
\[
\mathbb E^\Theta\Bigl[\int_0^1\!\!\int_Af\bigl(t,X^\infty_t,\mu_t,a\bigr)\psi_p(a)\,Q^\infty(dt,da)\Bigr] \le \liminf_{n\to\infty}\mathbb E^\Theta\Bigl[\int_0^1\!\!\int_Af\bigl(t,X^n_t,\mu^n_t,a\bigr)\psi_p(a)\,Q^n(dt,da)\Bigr],
\]
where we used the notation $\mathbb E^\Theta$ for the expectation with respect to $\mathbb P^\Theta$ over $\Theta$. Returning to the original sequence $(\mathbb P_n\circ(\chi,y,q,x)^{-1})_{n\ge1}$, we deduce that:
\[
\mathbb E\Bigl[\int_0^1\!\!\int_Af\bigl(t,x_t,\mu_t,a\bigr)\psi_p(a)\,q(dt,da)\Bigr] \le \liminf_{n\to\infty}\mathbb E^n\Bigl[\int_0^1\!\!\int_Af\bigl(t,x_t,\mu^n_t,a\bigr)\psi_p(a)\,q(dt,da)\Bigr].
\]
Since, by (A2),
\[
|f(t,x_t,\mu_t,a)| \le C\Bigl(1+\sup_{0\le t\le1}|x_t|^2+\sup_{0\le t\le1}M_2(\mu_t)^2+|a|^2\Bigr),
\]
we have:
\[
\lim_{p\to\infty}\mathbb E\Bigl[\int_0^1\!\!\int_Af\bigl(t,x_t,\mu_t,a\bigr)\psi_p(a)\,q(dt,da)\Bigr] = \mathbb E\Bigl[\int_0^1\!\!\int_Af\bigl(t,x_t,\mu_t,a\bigr)\,q(dt,da)\Bigr].
\]
The assumption on the running cost function implies that, for all $n\ge1$, $t\in[0,1]$ and $a\in A$:
\[
f\bigl(t,x_t,\mu^n_t,a\bigr) \ge \lambda|a|^2 - C\bigl(1+|x_t|+M_2(\mu^n_t)\bigr) \ge \lambda|a|^2 - C\bigl(1+|x_t|\bigr),
\]
where we used the fact that the sequence $(\sup_{0\le t\le1}M_2(\mu^n_t))_{n\ge1}$ is bounded, and allowed the constant $C$ to vary from line to line. Therefore,
\[
\begin{aligned}
\mathbb E^n\Bigl[\int_0^1\!\!\int_Af\bigl(t,x_t,\mu^n_t,a\bigr)\psi_p(a)\,q(dt,da)\Bigr] &= \mathbb E^n\Bigl[\int_0^1\!\!\int_Af\bigl(t,x_t,\mu^n_t,a\bigr)\,q(dt,da)\Bigr] + \mathbb E^n\Bigl[\int_0^1\!\!\int_Af\bigl(t,x_t,\mu^n_t,a\bigr)\bigl[\psi_p(a)-1\bigr]\,q(dt,da)\Bigr]\\
&\le \mathbb E^n\Bigl[\int_0^1\!\!\int_Af\bigl(t,x_t,\mu^n_t,a\bigr)\,q(dt,da)\Bigr] + C\,\mathbb E^n\Bigl[\int_0^1\!\!\int_A\Bigl(1+\sup_{0\le t\le1}|x_t|\Bigr)\mathbf 1_{\{|a|\ge p\}}\,q(dt,da)\Bigr].
\end{aligned}
\]
The last term is bounded by $C/p$, uniformly in $n$, so that:
\[
\limsup_{p\to\infty}\liminf_{n\to\infty}\mathbb E^n\Bigl[\int_0^1\!\!\int_Af\bigl(t,x_t,\mu^n_t,a\bigr)\psi_p(a)\,q(dt,da)\Bigr] \le \liminf_{n\to\infty}\mathbb E^n\Bigl[\int_0^1\!\!\int_Af\bigl(t,x_t,\mu^n_t,a\bigr)\,q(dt,da)\Bigr].
\]
Taking the limit as $p$ tends to $\infty$ in the conclusion of the third step, we obtain:
\[
\mathbb E\Bigl[\int_0^1\!\!\int_Af\bigl(t,x_t,\mu_t,a\bigr)\,q(dt,da)\Bigr] \le \liminf_{n\to\infty}\mathbb E^n\Bigl[\int_0^1\!\!\int_Af\bigl(t,x_t,\mu^n_t,a\bigr)\,q(dt,da)\Bigr].
\]
Handling the terminal cost as the running cost in the third step, we complete the proof. □
Conclusion
The proof of Theorem 6.43 is easily completed, using the relative compactness of the sublevel sets and the lower semicontinuity of the cost functional $\mathcal J^{\mathrm{relax}}$.
6.7 Examples
The treatment of this subsection parallels the discussion of Section 3.5 of Chapter 3
where we solved linear quadratic mean field games. As before, we start with the
multidimensional models before focusing on the one-dimensional case. We use the
same notation as in Section 3.5 for the purpose of comparison.
subject to
\[
dX_t = \bigl[b_1(t)X_t + b_2(t)\alpha_t + \bar b_1(t)\mathbb E[X_t]\bigr]\,dt + \sigma\,dW_t, \qquad X_0 = x_0.
\]
The Hamiltonian is minimized at:
\[
\hat\alpha = \hat\alpha(t,x,\mu,y) = -r(t)^{-1}b_2(t)^\dagger y, \tag{6.122}
\]
and the McKean-Vlasov adjoint system (6.123) holds for $t\in[0,T]$, with initial condition $X_0$ and terminal condition $Y_T = [q+\bar q]X_T + [s^\dagger\bar qs - (\bar qs+s^\dagger\bar q)]\mathbb E[X_T]$. Above, $I_d$ denotes the identity matrix of dimension $d$. By Theorems 6.14 and 6.16, the above system characterizes the optimal trajectories of (6.121).
In order to proceed with the analysis of (6.123), we take the expectations of both sides. Using the notation $\bar x_t$ and $\bar y_t$ for the expectations $\mathbb E[X_t]$ and $\mathbb E[Y_t]$ respectively, we find that:
\[
\begin{cases}
d\bar x_t = \Bigl[\bigl(b_1(t)+\bar b_1(t)\bigr)\bar x_t - b_2(t)r(t)^{-1}b_2(t)^\dagger\,\bar y_t\Bigr]\,dt,\\[4pt]
d\bar y_t = -\Bigl[\bigl(q(t)+\bar q(t)-\bar q(t)s(t)-s(t)^\dagger\bar q(t)+s(t)^\dagger\bar q(t)s(t)\bigr)\bar x_t + \bigl(b_1(t)+\bar b_1(t)\bigr)^\dagger\bar y_t\Bigr]\,dt, \qquad t\in[0,T],\\[4pt]
\bar x_0 = \mathbb E[X_0], \qquad \bar y_T = \bigl[q+\bar q+s^\dagger\bar qs-(\bar qs+s^\dagger\bar q)\bigr]\bar x_T,
\end{cases}
\tag{6.124}
\]
with
\[
a_t = b_1(t)+\bar b_1(t), \qquad b_t = -b_2(t)r(t)^{-1}b_2(t)^\dagger, \qquad c_t = -\bigl[q(t)+\bar q(t)-\bar q(t)s(t)-s(t)^\dagger\bar q(t)+s(t)^\dagger\bar q(t)s(t)\bigr], \qquad d_t = -\bigl(b_1(t)+\bar b_1(t)\bigr)^\dagger, \tag{6.125}
\]
for $t\in[0,T]$. Pay attention to the fact that the notation $b_t$ does not stand for a drift term. The drift coefficients are denoted by $b_1(t)$, $\bar b_1(t)$ and $b_2(t)$. We can solve the system (6.125) if we are able to solve the matrix Riccati equation:
\[
\dot{\bar\eta}_t + \bar\eta_ta_t - d_t\bar\eta_t + \bar\eta_tb_t\bar\eta_t - c_t = 0, \qquad \bar\eta_T = e, \tag{6.126}
\]
in which case $\bar y_t = \bar\eta_t\bar x_t$, for $t\in[0,T]$. Clearly, (6.126) is similar to the MFG case (see Section 3.5), except for the fact that the coefficients $c_t$ and $d_t$ are different, and the terminal condition requires $e = q+\bar q+s^\dagger\bar qs-(\bar qs+s^\dagger\bar q)$. Assuming momentarily that the matrix Riccati equation (6.126) has a unique solution $\bar\eta_t$, the solution of (6.125) is obtained by solving:
\[
d\bar x_t = \bigl(a_t+b_t\bar\eta_t\bigr)\bar x_t\,dt, \qquad \bar x_0 = \mathbb E[X_0], \tag{6.127}
\]
and setting $\bar y_t = \bar\eta_t\bar x_t$.
The solvability of the Riccati equation (6.126) may be addressed by the same arguments as in Section 3.5. Indeed, as in the case of the linear quadratic mean field game models, existence and uniqueness of a solution of this matrix Riccati equation are equivalent to the unique solvability of a deterministic control problem. This deterministic control problem has a convex Hamiltonian (with strong convexity in $\alpha$) whenever the matrix coefficients are continuous, $q$, $\bar q$, $q(t)$ and $\bar q(t)$ are nonnegative definite, and $r(t)$ is strictly positive definite. This suffices to prove the solvability of (6.126).
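Numerically, (6.126) is a terminal value problem that can be integrated backward in time. A minimal sketch (hypothetical one-dimensional coefficients; a plain backward Euler scheme rather than a production-grade integrator):

```python
import numpy as np

def solve_riccati(a, b, c, d, e, T, n=1000):
    """Backward Euler scheme for the matrix Riccati equation of (6.126):
    eta' + eta a_t - d_t eta + eta b_t eta - c_t = 0,  eta_T = e,
    with time-dependent coefficients passed as callables t -> matrix."""
    dt = T / n
    eta = e.copy()
    for i in range(n, 0, -1):
        t = i * dt
        d_eta = eta @ a(t) - d(t) @ eta + eta @ b(t) @ eta - c(t)
        eta = eta + dt * d_eta          # step backward: eta_{t-dt} ~ eta_t + dt*eta'
    return eta                          # approximation of eta_0

# One-dimensional toy coefficients (hypothetical): a = 0, b = -1, c = 0, d = 0.
q = np.array([[2.0]])
eta0 = solve_riccati(lambda t: np.zeros((1, 1)), lambda t: -np.ones((1, 1)),
                     lambda t: np.zeros((1, 1)), lambda t: np.zeros((1, 1)),
                     q, T=1.0)
print(eta0)   # closed form for this case: q/(1+qT) = 2/3
```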
Returning to (6.123) and plugging $\mathbb E[X_t]=\bar x_t$ and $\mathbb E[Y_t]=\bar y_t$ into the McKean-Vlasov FBSDE (6.123), we reduce the problem to the solution of the affine FBSDE:
\[
\begin{cases}
dX_t = \bigl[a_tX_t + b_tY_t + c_t\bigr]\,dt + \sigma\,dW_t, & X_0 = x_0,\\
dY_t = \bigl[m_tX_t - a_t^\dagger Y_t + d_t\bigr]\,dt + Z_t\,dW_t, & Y_T = \mathbf qX_T + \mathbf r,
\end{cases}
\tag{6.128}
\]
with:
\[
\begin{cases}
a_t = b_1(t), \qquad b_t = -b_2(t)r(t)^{-1}b_2(t)^\dagger, \qquad c_t = \bar b_1(t)\bar x_t,\\
m_t = -\bigl[q(t)+\bar q(t)\bigr],\\
d_t = \bigl[\bar q(t)s(t)+s(t)^\dagger\bar q(t)-s(t)^\dagger\bar q(t)s(t)\bigr]\bar x_t - \bar b_1(t)^\dagger\bar y_t,\\
\mathbf q = q+\bar q, \qquad \mathbf r = \bigl[s^\dagger\bar qs-(\bar qs+s^\dagger\bar q)\bigr]\bar x_T.
\end{cases}
\]
As usual, the affine structure of the FBSDE suggests that the decoupling field will be an affine function, so we search for deterministic differentiable functions $t\mapsto\eta_t$ and $t\mapsto\chi_t$ such that:
\[
Y_t = \eta_tX_t + \chi_t, \qquad t\in[0,T].
\]
Computing $dY_t$ from this ansatz, using the expression of $dX_t$ given by the first equation of (6.128), and identifying term by term with the expression of $dY_t$ given in (6.128), we get:
\[
\begin{cases}
\dot\eta_t + \eta_tb_t\eta_t + a_t^\dagger\eta_t + \eta_ta_t - m_t = 0, & \eta_T = \mathbf q,\\
\dot\chi_t + \bigl(a_t^\dagger+\eta_tb_t\bigr)\chi_t - d_t + \eta_tc_t = 0, & \chi_T = \mathbf r,\\
Z_t = \eta_t\sigma.
\end{cases}
\tag{6.129}
\]
As before, the first equation is a matrix Riccati equation. If and when it can be solved, the third equation becomes solved automatically, and the second equation becomes a first order linear ODE, though not homogeneous this time, which can be solved by standard methods. Notice that the quadratic terms of the two Riccati equations (6.126) and (6.129) are the same since $b_t = -b_2(t)r(t)^{-1}b_2(t)^\dagger$ in both. However, the terminal conditions are different since the terminal condition in (6.129) is given by $\mathbf q = q+\bar q$, while it is given by $e = q+\bar q+s^\dagger\bar qs-(\bar qs+s^\dagger\bar q)$ in (6.126). Notice also that the first order terms are different as well. Anyway, although it differs from (6.126), (6.129) may be proved to be uniquely solvable by the same argument, since existence and uniqueness of a solution to (6.129) are equivalent to the unique solvability of a deterministic control problem with a convex Hamiltonian. This proves once more that we have existence and uniqueness of a solution to the LQ McKean-Vlasov control problem under the standing assumption.
When $X_0$ is deterministic, the optimally controlled state is Gaussian despite the nonlinearity due to the McKean-Vlasov nature of the dynamics, and because of the linearity of the ansatz, the adjoint process $Y=(Y_t)_{0\le t\le T}$ is also Gaussian. Also, using again the form of the ansatz, we see that the optimal control $\alpha=(\alpha_t)_{0\le t\le T}$, which was originally expected to be an open loop control, is in fact in closed loop feedback form $\hat\alpha_t = \varphi(t,X_t)$ since it can be rewritten as:
\[
\hat\alpha_t = -r(t)^{-1}b_2(t)^\dagger\bigl(\eta_tX_t+\chi_t\bigr), \qquad t\in[0,T],
\]
which incidentally shows that the optimal control is also a Gaussian process.
Remark 6.47 The reader might notice that, within the linear-quadratic framework, the conditions for the unique solvability of the adjoint equations are not the same in the MFG approach and in the control of MKV dynamics. On the one hand, optimization over controlled MKV dynamics reads as an optimization problem of purely (strictly) convex nature, for which existence and uniqueness of an optimal state are expected. On the other hand, the optimal states in the MFG approach appear as the solutions of a fixed point problem over flows of measures, for which convexity alone does not guarantee uniqueness.
The running cost function was of the simple form $f(t,x,\mu,\alpha) = \alpha^2/2$, so that $r(t)=1$ and $q(t)=\bar q(t)=0$. As before, we assume that the terminal cost function is given by $g(x,\mu) = \bar q(x-s\bar\mu)^2/2$ for some $\bar q>0$ and $s\in\mathbb R$, $\bar\mu$ denoting the mean of $\mu$. Using the notation and the results above, we see that the McKean-Vlasov FBSDE derived from the Pontryagin stochastic maximum principle is the same as (3.67), except for its terminal conditions. Indeed, while $\mathbf q$ is the same since $\mathbf q=\bar q$, $\mathbf r$ is now given by $\mathbf r = \bar qs(s-2)\mathbb E[x_T]$. Postulating again an affine relationship $Y_t = \eta_tX_t+\chi_t$ and solving for the two deterministic functions $\eta$ and $\chi$, we find the same expression as in (3.68):
\[
X_t = \frac{1+\mathbf q(T-t)}{1+\mathbf qT}\,x_0 - \frac{\mathbf r\,t}{1+\mathbf qT} + \bigl[1+\mathbf q(T-t)\bigr]\int_0^t\frac{dW_s}{1+\mathbf q(T-s)}.
\]
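As a sanity check of the explicit formula, one may compare its mean with a Monte Carlo simulation of the closed-loop dynamics $dX_t = -(\eta_tX_t+\chi_t)\,dt+dW_t$, using $\eta_t = \mathbf q/(1+\mathbf q(T-t))$ and $\chi_t = \mathbf r/(1+\mathbf q(T-t))$; the numerical values below are hypothetical:

```python
import numpy as np

q, r, T, n, paths = 2.0, 0.7, 1.0, 400, 20000
dt = T / n
rng = np.random.default_rng(4)
X = np.zeros(paths)                       # x0 = 0 for all paths
for i in range(n):
    t = i * dt
    eta = q / (1 + q * (T - t))           # Riccati solution
    chi = r / (1 + q * (T - t))           # linear ODE solution
    X += -(eta * X + chi) * dt + np.sqrt(dt) * rng.normal(size=paths)
# Mean of the explicit formula at t = T with x0 = 0: -r*T/(1+q*T).
print(X.mean(), -r * T / (1 + q * T))     # the two values should be close
```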
\[
\forall t\in[0,T], \qquad \chi_t - \bar y_t + \eta_t\bar x_t = 0,
\]
so that:
\[
\forall t\in[0,T], \qquad Y_t = \eta_tX_t + \bar y_t - \eta_t\bar x_t.
\]
Recalling that:
\[
\bar y_t = \bar\eta_t\bar x_t, \qquad t\in[0,T],
\]
where $(\bar\eta_t)_{0\le t\le T}$ is also defined autonomously, we finally end up with the relationship:
\[
\forall t\in[0,T], \qquad Y_t = \eta_tX_t + \bigl(\bar\eta_t-\eta_t\bigr)\mathbb E[X_t].
\]
When $t=T$, the above decomposition is consistent with the writing of $Y_T$ under the form $Y_T = [q+\bar q]X_T + [e-(q+\bar q)]\mathbb E[X_T]$.
As an exercise, the reader may check that it solves the master equation (6.103).
6.7.2 Potential Mean Field Games

We first revisit the discussion initiated in Subsection 2.3.3. Accordingly, our goal is to identify the notion of potential game appropriate in the setting of mean field games.
Informal Discussion
the infimum being taken over control processes $\alpha=(\alpha_t)_{0\le t\le T}$, the process $X^\alpha=(X^\alpha_t)_{0\le t\le T}$ denoting the controlled diffusion process:
\[
X_t = \xi + \int_0^t\alpha_s\,ds + W_t, \qquad t\in[0,T]. \tag{6.131}
\]
the infimum being taken over the same class of control processes $\alpha=(\alpha_t)_{0\le t\le T}$ as above, and $X^\alpha$ denoting the same controlled diffusion process as in (6.131). Obviously, the optimization problem (6.133) is a special case of the class of McKean-Vlasov optimal control problems considered in this chapter.
for a given continuous path $\boldsymbol\mu=(\mu_t)_{0\le t\le T}$ with values in $\mathcal P_2(\mathbb R^d)$, where, as above,
\[
X^\alpha_t = \xi + \int_0^t\alpha_s\,ds + W_t, \qquad t\in[0,T].
\]
Consequently, we can find a nonrandom quantity $C(\boldsymbol\mu)$, depending upon the flow $\boldsymbol\mu$ and independent of the control $\alpha$, such that:
\[
I(\alpha) = \mathbb E\Bigl[\int_0^T\Bigl(\frac12|\alpha_t|^2 + \frac{\delta F}{\delta m}(t,\mu_t)(X^\alpha_t)\Bigr)dt + \frac{\delta G}{\delta m}(\mu_T)(X^\alpha_T)\Bigr] + C(\boldsymbol\mu).
\]
Up to the constant $C(\boldsymbol\mu)$, we recognize the same cost functional as in the definition of the MFG problem (6.130). Since the constant $C(\boldsymbol\mu)$ plays no role when optimizing with respect to $\alpha$ while keeping $\boldsymbol\mu$ frozen, we deduce that the
auxiliary MFG problem associated with (6.133) through the procedure defined in
Subsection 6.2.5 coincides with the MFG problem (6.130). Moreover, the analysis
provided in Subsection 6.2.5 shows that the Pontryagin forward-backward systems
associated with the mean field game (6.130) and with the mean field stochastic
control problem (6.133) are the same.
Generic Model

We now specialize the choice of $F$ and $G$ and make the above discussion rigorous in that case. To do so, let us assume that $\ell$ is an even and continuously differentiable function on $\mathbb R^d$ whose derivative is at most of linear growth, and similarly that, for any $t\in[0,T]$, $h(t,\cdot)$ is also even and continuously differentiable on $\mathbb R^d$, with a derivative at most of linear growth. Then, we define the functions $F$ and $G$ by:
\[
F(t,\mu) = \frac12\int_{\mathbb R^d}\int_{\mathbb R^d}h(t,x-x')\,d\mu(x)\,d\mu(x'), \qquad
G(\mu) = \frac12\int_{\mathbb R^d}\int_{\mathbb R^d}\ell(x-x')\,d\mu(x)\,d\mu(x'), \tag{6.134}
\]
for $(t,\mu)\in[0,T]\times\mathcal P_2(\mathbb R^d)$, so they can be used in the present setup.
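For intuition, $F$ and $G$ are easily evaluated against empirical measures, which is also how they arise in the $N$-player game. A short Monte Carlo sketch (the kernel and the sample below are hypothetical):

```python
import numpy as np

def F_empirical(h, t, sample):
    """Evaluate F(t, mu) = (1/2) int int h(t, x - x') dmu(x) dmu(x'),
    with mu replaced by the empirical measure of `sample` (shape (N, d))."""
    diffs = sample[:, None, :] - sample[None, :, :]      # all pairwise x - x'
    return 0.5 * h(t, diffs).mean()

# Hypothetical even kernel h(t, x) = exp(-|x|^2), Gaussian sample for mu.
h = lambda t, x: np.exp(-np.sum(x**2, axis=-1))
rng = np.random.default_rng(1)
print(F_empirical(h, 0.0, rng.normal(size=(500, 2))))
```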
Letting $\boldsymbol\mu = (\mu_t = \mathcal L(X_t))_{0\le t\le T}$, we may rewrite the above system as:
\[
\begin{cases}
dX_t = -Y_t\,dt + dW_t,\\
dY_t = -\partial_xf(t,X_t,\mu_t)\,dt + Z_t\,dW_t, \qquad t\in[0,T],\\
X_0 = \xi, \qquad Y_T = \partial_xg(X_T,\mu_T),
\end{cases}
\]
which is exactly the Pontryagin system for the standard optimal control problem (6.130). Observing that $f$ and $g$ are convex in the Euclidean space variable, we deduce from Theorem 3.17 that $X=(X_t)_{0\le t\le T}$ is an optimal path of the optimal control problem (6.130). Therefore, $\boldsymbol\mu=(\mu_t)_{0\le t\le T}$ is an MFG equilibrium, and Theorem 6.19 again shows that it is the unique one since the MFG equilibria are characterized by the McKean-Vlasov FBSDE (6.136). □
The property of potential mean field games which we identified above says
that the solution (via the Pontryagin stochastic maximum principle) of the mean
field game problem (6.130)–(6.131) reduces to the solution of a central planner
optimization problem. This much was expected on the basis of previous discussions
of potential games. However, what is remarkable is the fact that this central planner
optimization problem is in fact an optimal control of McKean-Vlasov dynamics.
The motion of a representative bird is given by:
\[
dx_t = v_t\,dt, \qquad dv_t = \alpha_t\,dt + dW_t,
\]
where $x_t$ denotes the position of a bird at time $t$, and $v_t$ its velocity. Here, $W=(W_t)_{t\ge0}$ is a three-dimensional Wiener process and $\alpha=(\alpha_t)_{t\ge0}$ is a three-dimensional progressively measurable process with respect to the filtration generated by the initial position and by $W$. The process $\alpha$ plays the role of the control of the bird on its velocity.
With the rationale of mean field games as a framework for the search for a large
population consensus, the finite player game formulation of Chapter 1 suggests that
we consider the following cost functional:
Z
T 1
J .˛/ D E j˛t j2 C h t .xt ; vt / dt ; (6.137)
0 2
for a given time horizon T > 0, a continuous flow D .t /t>0 of measures
on R6 and an even function h of the variables x and v. Notice that the special
convolution form of the running cost function is covered by the second example
of N-player potential game in Subsection 2.3.3. In the context of the limit N ! 1
of large populations, the convolution form appearing in the cost functional (6.137)
is reminiscent of that used to define f and g in the statement of Proposition 6.48.
Unfortunately, Proposition 6.48, as stated above, cannot apply directly to (6.137)
because of the degeneracy of the equation for the position .xt /06t6T .
However, we can easily recast (6.137) in a more tractable setting, as done in Section 6.6. Indeed, with $X^\alpha_t = (x^\alpha_t,v^\alpha_t)$ for $0\le t\le T$, the flocking controlled dynamics can be written as:
\[
dX^\alpha_t = \bigl(\Gamma X^\alpha_t + B\alpha_t\bigr)\,dt + \Sigma\,dW_t,
\]
with:
\[
\Gamma = \begin{pmatrix}0_d & I_d\\ 0_d & 0_d\end{pmatrix}, \qquad B = \begin{pmatrix}0_d\\ I_d\end{pmatrix} \qquad\text{and}\qquad \Sigma = \begin{pmatrix}0_d\\ I_d\end{pmatrix}.
\]
where:
\[
F(\mu) = \frac12\int_{\mathbb R^6}h\star\mu(x,v)\,d\mu(x,v), \tag{6.140}
\]
and, when h is convex, the mean field game can be solved by solving the central
planner optimization problem, the latter being an optimal control problem of the
McKean-Vlasov type.
Still, the typical form for $h$ in the flocking model is:
\[
h(x,v) = \frac{|v|^2}{(1+|x|^2)^\beta}, \tag{6.141}
\]
for some $\beta>0$; see (4.180), with the parameter therein equal to $\sqrt2$. Except for the case $\beta=0$, the function $h$ is not convex! Therefore Proposition 6.48 does not apply. Still we can solve the central planner optimization and regard its solutions as possible candidates for solving the MFG problem. Indeed, although the running cost function $F$ is not convex – which prevents us from applying the results of Section 6.4 – assumption MKV Weak Formulation is satisfied and we can appeal to Theorem 6.37 to prove existence of a solution of the McKean-Vlasov central planner optimization in the weak formulation.
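The lack of convexity of (6.141) can be checked directly: along the segment between two (position, velocity) pairs, $h$ at the midpoint may exceed the average of the endpoint values. A two-line numerical verification (hypothetical points):

```python
import numpy as np

# h(x, v) = |v|^2 / (1 + |x|^2)^beta from (6.141) is not jointly convex for
# beta > 0: a midpoint check along a segment exhibits a convexity violation.
beta = 1.0
h = lambda x, v: np.sum(v**2) / (1.0 + np.sum(x**2))**beta

p0 = (np.array([-1.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0]))   # (x, v) pairs
p1 = (np.array([+1.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0]))
mid = tuple(0.5 * (a + b) for a, b in zip(p0, p1))
print(h(*mid), 0.5 * (h(*p0) + h(*p1)))   # 1.0 > 0.5: h(mid) exceeds average
```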
The purpose of this section is to revisit the example introduced at the very end
of Subsection 6.2.2. There, starting from the statement of Benamou-Brenier’s
Theorem 5.53, we gave an informal argument to show that the 2-Wasserstein
distance could appear as the value function of an optimal control problem of the
McKean-Vlasov type formulated in an analytic way. By analytic formulation, we
mean that the controlled trajectories were not regarded as controlled stochastic
processes, as we did in most of the chapter, but as deterministic flows of probability
measures satisfying a Kolmogorov-Fokker-Planck equation of the form (6.12).
The fact that two approaches, a probabilistic one and an analytic one, are
conceivable for handling mean field stochastic control problems was already
explained in the introductory Section 6.2. However, there, we just gave a few
indications on the strategies that could be used to pass from one formulation to
another. Motivated by the statement of Benamou-Brenier’s theorem, our goal is here
to address these questions more properly in the framework of optimal transportation.
As mentioned at the end of Section 6.2, one difficulty for passing from the analytic to the probabilistic approach is to reconstruct, for a given flow $\boldsymbol\mu$ of probability measures satisfying a continuity equation of the Kolmogorov-Fokker-Planck type, a stochastic process $X$ admitting $\boldsymbol\mu$ as flow of marginal laws. This is exactly the purpose of the following statement, which is taken from Ambrosio-Gigli-Savaré's monograph:
for all real valued smooth functions from $[0,T]\times\mathbb R^d$ with compact support included in $(0,T)\times\mathbb R^d$, and for some measurable vector field $b:[0,T]\times\mathbb R^d\to\mathbb R^d$ satisfying:
\[
\int_0^T\!\!\int_{\mathbb R^d}|b(t,x)|^2\,d\mu_t(x)\,dt<\infty.
\]
and, for all $t\in[0,T]$, $\mu_t = \mathcal L(x_t)$. In particular, the trajectories of $(x_t)_{0\le t\le T}$ belong to the so-called Cameron-Martin space of absolutely continuous trajectories whose derivative is square integrable on $[0,T]$; they satisfy:
\[
\mathbb E^{\mathbb P}\Bigl[\int_0^T|\dot x_t|^2\,dt\Bigr]<\infty,
\]
induces, via its flow of marginal measures $\boldsymbol\mu=(\mathbb P\circ x_t^{-1})_{0\le t\le T}$, a solution to the Kolmogorov-Fokker-Planck equation (6.142).
the infimum being taken over probability measures $\mathbb P$ on the canonical space $\Omega$ under which $x_0\sim\mu_0$, $x_1\sim\mu_1$, and $(x_t)_{0\le t\le1}$ satisfies (6.143) for $T=1$ and for some Borel vector field $b$ satisfying $\mathbb E\int_0^1|b(t,x_t)|\,dt<\infty$.
In this regard, observe that the fact that the infimum is taken over probability
measures and not over control processes is reminiscent of the weak formulation
used in Section 6.6.
Interestingly, we may wonder about a similar formulation of Benamou-Brenier’s
theorem using open loop instead of Markovian controls in (6.143). This prompts us
to quote another result, which we already alluded to at the end of Section 6.2:
Theorem 6.50 For some time horizon $T>0$, let $X=(X_t)_{0\le t\le T}$ be an $\mathbb R^d$-valued absolutely continuous process defined on some probability space $(\Theta,\mathcal G,\mathbb P^\Theta)$ with dynamics of the form:
\[
X_t = X_0 + \int_0^t\alpha_s\,ds, \tag{6.144}
\]
where $\mathbb E[|X_0|^2]<\infty$ and $\alpha=(\alpha_t)_{0\le t\le T}$ is a jointly measurable process satisfying $\mathbb E\int_0^T|\alpha_t|^2\,dt<\infty$, $\mathbb E$ being the expectation under $\mathbb P^\Theta$. Then, there exists a probability measure $\mathbb P$ on the canonical space $\Omega = \mathbb R^d\times\mathcal C([0,T];\mathbb R^d)$ under which the canonical process $(\chi,x=(x_t)_{0\le t\le T})$ satisfies, for some Borel-measurable vector field $b:[0,T]\times\mathbb R^d\to\mathbb R^d$ and $\mathbb P$ almost-surely:
\[
x_t = \chi + \int_0^tb(s,x_s)\,ds, \qquad t\in[0,T],
\]
such that, for any $t\in[0,T]$, the law of $x_t$ under $\mathbb P$ is the same as the law of $X_t$ under $\mathbb P^\Theta$.
where the infimum is taken over all the probability measures $\mathbb P$ on the canonical space $\Omega = \mathbb R^d\times\mathcal C([0,1];\mathbb R^d)$ under which the canonical process $(\chi,x=(x_t)_{0\le t\le1})$ satisfies:

1. $x_0 = \chi$,
2. $x$ is absolutely continuous and $\mathbb E^{\mathbb P}\int_0^1|\dot x_t|^2\,dt<\infty$,
3. the law of $x_0$ is $\mu_0$,
4. the law of $x_1$ is $\mu_1$.
Of course, this may be rewritten as a mean field stochastic control problem, but formulated in the weak sense as in Section 6.6. To any probability $\mathbb P$ on the canonical space such that the first three (and not four) items above are satisfied, we may indeed associate the cost:
\[
\mathcal J(\mathbb P) = \mathbb E^{\mathbb P}\Bigl[\int_0^1|\dot x_t|^2\,dt\Bigr] + g\bigl(\mathcal L(x_1)\bigr), \tag{6.145}
\]
where:
\[
g(\mu) = \begin{cases}0 & \text{if }\mu=\mu_1,\\ +\infty & \text{if }\mu\ne\mu_1.\end{cases}
\]
We now face a control problem of the McKean-Vlasov type, as the cost functional depends upon the marginal distribution of the process $x$ at the terminal time. A natural variant includes a small volatility in the dynamics:
\[
dX_t = \alpha_t\,dt + \sigma\,dW_t, \qquad t\in[0,1], \tag{6.146}
\]
and consists in minimizing:
\[
\inf_\alpha J(\alpha), \qquad\text{with}\qquad J(\alpha) = \mathbb E\Bigl[\int_0^1|\alpha_t|^2\,dt\Bigr] + g\bigl(\mathcal L(X_1)\bigr),
\]
and, as above:
\[
g(\mu) = \begin{cases}0 & \text{if }\mu=\mu_1,\\ +\infty & \text{if }\mu\ne\mu_1.\end{cases}
\]
In (6.146), the volatility $\sigma$ provides some form of mollification. For that reason, it may sound convenient to regularize $\mu_1$ as well in the terminal condition and to replace $g$ by:
\[
g^\sigma(\mu) = \begin{cases}0 & \text{if }\mu=\mu_1\star\mathcal N_d(0,\sigma^2I_d),\\ +\infty & \text{if }\mu\ne\mu_1\star\mathcal N_d(0,\sigma^2I_d).\end{cases}
\]
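In dimension one, the value appearing in Benamou-Brenier's theorem is easy to compute: the optimal coupling between two empirical measures with equally many atoms is the monotone rearrangement, and straight-line particle trajectories realize the kinetic energy $W_2(\mu_0,\mu_1)^2$. A sketch (the Gaussian samples are hypothetical):

```python
import numpy as np

def wasserstein2_1d(xs, ys):
    """Squared 2-Wasserstein distance between two 1-d empirical measures
    with the same number of atoms: match sorted samples (monotone coupling)."""
    return np.mean((np.sort(xs) - np.sort(ys))**2)

# Straight lines x_t = (1 - t) x_0 + t x_1 between matched atoms have
# constant speed, and their kinetic energy equals W_2(mu_0, mu_1)^2.
rng = np.random.default_rng(2)
x0 = rng.normal(0.0, 1.0, size=2000)      # sample from mu_0
x1 = rng.normal(3.0, 0.5, size=2000)      # sample from mu_1
speed = np.sort(x1) - np.sort(x0)         # constant velocity of each atom
print(wasserstein2_1d(x0, x1), np.mean(speed**2))   # the two values agree
```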
where $U(c) = (c^{1-\gamma}-1)/(1-\gamma)$ for $\gamma>0$, with $U(c)=\ln(c)$ if $\gamma=1$, and $\tilde U$ is the identity function. As in the MFG version of the problem, the (time-homogeneous) reduced Hamiltonian has the form:
\[
H(z,a,\mu,y_z,y_a,c) = \bigl[1-z\bigr]y_z + \Bigl[(1-\alpha)\bar\mu^\alpha z + \alpha\bar\mu^{\alpha-1}\delta a - c\Bigr]y_a - U(c),
\]
where $\bar\mu = \int_{\mathbb R^2}a\,d\mu(z,a)$ is the mean of the second marginal of $\mu$. Also, we denoted the control by $c$; here, $\alpha$ is a constant exponent. Obviously, $H(z,a,\mu,y_z,y_a,c)$ makes sense only if $\bar\mu>0$. A first difference with the analysis performed for the MFG version of the problem is the fact that we keep the variable $z$ in the expression of the Hamiltonian, but, as in the MFG case, its adjoint process does not enter the equation for the optimal trajectories. An obvious reason for keeping $z$ is that the measure argument is now part of the state of the forward dynamics, and we need to compute exactly the value of $\partial_\mu H$. In the present situation,
\[
\partial_\mu H(z,a,\mu,y_z,y_a,c)(v) = \Bigl(0,\ \bigl[(1-\alpha)\alpha\bar\mu^{\alpha-1}z + \alpha(\alpha-1)\bar\mu^{\alpha-2}\delta a\bigr]y_a\Bigr),
\]
if $\bar\mu>0$.
Since the first adjoint process $Y^z = (Y_{z,t})_{0\le t\le T}$ has no influence on the value of the optimal trajectory, we can focus on the dynamics of the adjoint of the wealth process $A=(A_t)_{0\le t\le T}$, for which we use the notation $Y=(Y_t)_{0\le t\le T}$ instead of $Y^a=(Y_{a,t})_{0\le t\le T}$ for the sake of simplicity. The McKean-Vlasov forward-backward system derived from the Pontryagin maximum principle proved in this chapter for MKV diffusion processes (see Definition 6.5) writes:
\[
\begin{cases}
dA_t = \Bigl[(1-\alpha)\mathbb E[A_t]^\alpha Z_t + \alpha\mathbb E[A_t]^{\alpha-1}\delta A_t - (-Y_t)^{-1/\gamma}\Bigr]\,dt,\\[4pt]
dY_t = -Y_t\,\alpha\mathbb E[A_t]^{\alpha-1}\delta\,dt - (1-\alpha)\alpha\mathbb E[A_t]^{\alpha-1}\mathbb E[Z_tY_t]\,dt - \alpha(\alpha-1)\mathbb E[A_t]^{\alpha-2}\delta\,\mathbb E[A_tY_t]\,dt + \tilde Z_t\,dW_t, \qquad t\in[0,T],\\[4pt]
Y_T = -1.
\end{cases}
\tag{6.147}
\]
Whenever $Y=(Y_t)_{0\le t\le T}$ is deterministic and $\mathbb E[Z_t]=1$, the mean field terms in the backward equation cancel:
\[
dY_t = -Y_t\,\alpha\mathbb E[A_t]^{\alpha-1}\delta\,dt - (1-\alpha)\alpha\mathbb E[A_t]^{\alpha-1}Y_t\,dt - \alpha(\alpha-1)\mathbb E[A_t]^{\alpha-1}Y_t\,dt = -Y_t\,\alpha\mathbb E[A_t]^{\alpha-1}\delta\,dt,
\]
which is the same backward equation as in (3.76). Taking the mean in the forward equation of (6.147), we then deduce that the pair $(\mathbb E[A_t],Y_t)_{0\le t\le T}$ solves (3.76), which makes it possible to repeat the analysis of the case of the MFG version of the problem, provided that $\bar\mu_0$ is large enough. In such a case, the system (6.147) has a unique solution such that $\mathbb E[A_t]>0$ for all $t\in[0,T]$.
We claimed in the introduction that the problem of the optimal control of SDEs of
McKean-Vlasov type has notoriously been ignored in the mathematical literature.
However, it is fair to mention that some special cases such as the mean variance portfolio selection problem have been considered by Andersson and Djehiche in [24] and Fischer and Livieri in [155] in the spirit of this chapter. The linear quadratic case was discussed (though quite recently) in [53] and [99], by Bensoussan, Sung, Yam, and Yung on the one hand, and by Carmona, Delarue, and Lachapelle on the other hand, simultaneously and independently of each other. However, the technical analysis presented in this chapter follows the approach of Carmona and Delarue as originally developed in [98], which is similar to, though different from, the one presented in the monograph [50] by Bensoussan, Frehse, and Yam. Also, inspired by the surge of interest in the theory of optimal control, several works have been published on the analysis of
Hamilton-Jacobi-Bellman equations on the Wasserstein space, see for instance Feng
and Katsoulakis [152], Gangbo, Nguyen, and Tudorascu [167] and Pham and Wei
[311]. As explained in the chapter, see also the additional comments right below,
HJB equations on the Wasserstein space play a central role in the deterministic
analysis of mean field stochastic control problems.
Several versions of the stochastic maximum principle for optimization problems
over systems with mean field interactions exist in the literature. For example,
Hosking derives in [201] a maximum principle for a finite player game with mean field interactions. Also, Meyer-Brandis, Øksendal, and Zhou [281] use Malliavin calculus to derive a stochastic maximum principle for a mean field control problem including jumps.
The discussion of models with scalar interactions in Subsection 5.2.2 shows how the model treated by Andersson and Djehiche in [24] appears as an example of our more general formulation of the Pontryagin stochastic maximum principle. In fact,
the mean variance portfolio optimization example discussed in [24] as well as the
solution proposed in [53] and [99] of the optimal control of linear-quadratic (LQ)
McKean-Vlasov dynamics are based on the general form of the Pontryagin principle
proven in this chapter as applied to models with scalar interactions.
The continuation method alluded to in the proof of Theorem 6.19 was originally
introduced for the analysis of forward-backward stochastic differential equations,
by Peng and Wu in [307].
Throughout the chapter, we assumed that the space A of controls was convex.
This assumption was only made for the sake of simplicity. More general spaces can
be handled at the cost of using spike variation techniques, and adding one extra
adjoint equation. See for example [343, Chapter 3] for a discussion of the classical
(i.e., non-McKean-Vlasov) case. Without the motivation from specific applications,
we chose to refrain from providing this level of generality and avoid an excessive
overhead in notation and technicalities.
7 Extensions for Volume I

Abstract
The goal of this chapter is to follow up on some of the examples introduced in
Chapter 1, especially those which are not directly covered by the probabilistic
theory of stochastic differential mean field games developed so far. Indeed,
Chapter 1 included a considerable amount of applications hinting at mathematical models with distinctive features which were not accommodated in the
previous chapters. We devote this chapter to presentations, even if only informal,
of extensions of the Mean Field Game paradigm to these models. They include
extensions to several homogeneous populations, infinite horizon optimization, and
finite state space models. These mean field game models have a great potential
for the quantitative analysis of very important practical applications, and we
show how the technology developed in this book can be brought to bear on their
solutions.
7.1 First Extensions

To start with, we present in this section two natural extensions of the class of mean
field games we have studied so far.
The first extension concerns mean field games with several populations or,
equivalently, with multiclass agents. The second one is about mean field games with
infinite time horizon.
From a modeling perspective, one of the major shortcomings of the standard mean
field game theory is the strong symmetry requirement that all the players in the game
are statistically identical. A first way to break this symmetry is to assume that the
players belong to a finite number of homogeneous groups in which the mean field
asymptotic theory can be carried out, group by group. Another way is to assume that
one of the players dominates the others in the sense that it directly influences the
states of the other players while it only feels the others through their collective state.
This subsection is devoted to the former approach only. The latter, which is more
demanding from the technical point of view, will be addressed in the last chapter of
the next volume.
For the sake of simplicity, we restrict ourselves to the case of two homogeneous
subgroups in the population. Clearly, the discussion below can be easily adapted to
cover cases with a higher number of subgroups.
Finite-Player Game
The generic $N$-player stochastic differential games leading to mean field games were introduced in Chapter 2. See in particular Subsection 2.3. A key feature was the fact that the dynamics of the states of the players were driven by the same drift and volatility coefficients $b$ and $\sigma$ from $[0,T]\times\mathbb R^d\times\mathcal P(\mathbb R^d)\times A$ into $\mathbb R^d$ and $\mathbb R^{d\times m}$, where $A$ is the set of admissible actions, $d$ the dimension of the state space of the players, and $m$ the dimension of the noise. Recall that $m$ was chosen equal to $d$ in Chapters 3 and 4.
Accordingly, when the population is divided into two homogeneous subgroups, we shall consider two sets of drift and volatility coefficients $(b_1,\sigma_1)$ and $(b_2,\sigma_2)$. Obviously, $(b_1,\sigma_1)$ denotes the drift and volatility coefficients of players from the first group, and similarly for $(b_2,\sigma_2)$. Observe that the set $A$ and the dimension parameters $d$ and $m$ can be chosen to be proper to each of the two subgroups, in which case $(A,d,m)$ becomes $(A_1,d_1,m_1)$ and $(A_2,d_2,m_2)$. Also, due to the mean field hypothesis, we now require that each $(b_l,\sigma_l)$, for $l\in\{1,2\}$, is a function defined on $[0,T]\times\mathbb R^{d_l}\times\mathcal P(\mathbb R^{d_1})\times\mathcal P(\mathbb R^{d_2})\times A_l$, which accounts for the fact that the dynamics of the state of any player in the game now feel the collective states of the two subgroups. In the end, the dynamics of players from the first group take the form:
\[
dX^{1,i}_t = b_1\bigl(t,X^{1,i}_t,\bar\mu^{1,-i}_t,\bar\mu^2_t,\alpha^{1,i}_t\bigr)\,dt + \sigma_1\bigl(t,X^{1,i}_t,\bar\mu^{1,-i}_t,\bar\mu^2_t,\alpha^{1,i}_t\bigr)\,dW^{1,i}_t,
\]
for $i\in\{1,\cdots,N_1\}$, while, for players from the second group, they take the form:
\[
dX^{2,i}_t = b_2\bigl(t,X^{2,i}_t,\bar\mu^1_t,\bar\mu^{2,-i}_t,\alpha^{2,i}_t\bigr)\,dt + \sigma_2\bigl(t,X^{2,i}_t,\bar\mu^1_t,\bar\mu^{2,-i}_t,\alpha^{2,i}_t\bigr)\,dW^{2,i}_t,
\]
for $i\in\{1,\cdots,N_2\}$. We refrain from spelling out the assumptions on the coefficients in order to limit technicalities here. Also, we use the same notations as in Chapter 2 for the empirical distributions, namely:
\[
\bar\mu^{1,-i}_t = \frac1{N_1-1}\sum_{j=1,j\ne i}^{N_1}\delta_{X^{1,j}_t}, \qquad \bar\mu^2_t = \frac1{N_2}\sum_{j=1}^{N_2}\delta_{X^{2,j}_t},
\]
\[
\bar\mu^1_t = \frac1{N_1}\sum_{j=1}^{N_1}\delta_{X^{1,j}_t}, \qquad \bar\mu^{2,-i}_t = \frac1{N_2-1}\sum_{j=1,j\ne i}^{N_2}\delta_{X^{2,j}_t}.
\]
The costs to players of each of the subgroups are defined in a similar manner. For tuples of strategies $(\alpha^{1,i})_{i=1,\cdots,N_1}$ and $(\alpha^{2,i})_{i=1,\cdots,N_2}$, the cost to any player $i\in\{1,\cdots,N_1\}$ from the first group is:
\[
J^{1,i}\bigl((\alpha^{1,j})_{j=1,\cdots,N_1},(\alpha^{2,j})_{j=1,\cdots,N_2}\bigr) = \mathbb E\Bigl[\int_0^Tf_1\bigl(t,X^{1,i}_t,\bar\mu^{1,-i}_t,\bar\mu^2_t,\alpha^{1,i}_t\bigr)\,dt + g_1\bigl(X^{1,i}_T,\bar\mu^{1,-i}_T,\bar\mu^2_T\bigr)\Bigr],
\]
and, similarly, for any player $i\in\{1,\cdots,N_2\}$ from the second group:
\[
J^{2,i}\bigl((\alpha^{1,j})_{j=1,\cdots,N_1},(\alpha^{2,j})_{j=1,\cdots,N_2}\bigr) = \mathbb E\Bigl[\int_0^Tf_2\bigl(t,X^{2,i}_t,\bar\mu^1_t,\bar\mu^{2,-i}_t,\alpha^{2,i}_t\bigr)\,dt + g_2\bigl(X^{2,i}_T,\bar\mu^1_T,\bar\mu^{2,-i}_T\bigr)\Bigr].
\]
Remark 7.1 Obviously, the model is not as general as what we could think of. For
instance, we could incorporate common noises, in analogy with mean field games
with a common noise investigated in Part I in the second volume. In that case, we
could use either the same common noise for the two subgroups or two different
common noises, one for each subgroup.
Also, in this presentation, the way the coefficients are required to depend upon
the empirical distributions of the two subgroups is somewhat restrictive. Indeed, it
would be more realistic to allow the coefficients to depend upon the proportions
of players from each of the two subgroups in the population. However, although
this would make sense from a modeling standpoint, the mathematical significance
would be rather limited. Indeed, the proportions $N_1/(N_1+N_2)$ and $N_2/(N_1+N_2)$ of each of the two subgroups would be regarded as fixed, since there is no way, in this model, for a player to switch from one subgroup to another. This observation will become especially clear in the next paragraph: we shall take the mean field limit $N_1,N_2\to\infty$ with the prescription that:
\[
\frac{N_1}{N_1+N_2}\to\lambda_1, \qquad \frac{N_2}{N_1+N_2}\to\lambda_2, \tag{7.1}
\]
$\lambda_1$ and $\lambda_2$ representing the limiting proportions of players from each subgroup in the limiting population. In this approach, $\lambda_1$ and $\lambda_2$ are a priori prescribed.
Asymptotic Formulation
We now present the limiting formulation of the game when $N_1$ and $N_2$ tend to infinity in such a way that (7.1) is satisfied, $\lambda_1$ and $\lambda_2$ representing the limiting proportions of players from each subgroup in the limiting population.
The intuition leading to the definition of an asymptotic Nash equilibrium is exactly the same as in standard mean field games. Asymptotically, any unilateral change of strategy decided by one of the players cannot affect the global states of any of the two populations. So, in the limiting framework, everything works as if the best response of any player was computed as the solution of a standard optimal control problem within the environment determined by the equilibrium distributions of the two populations. So, the search for an asymptotic equilibrium should comprise the following two steps (a Picard iteration implementing them is sketched after the next paragraph):

1. For any two deterministic flows of probability measures $\boldsymbol\mu^1=(\mu^1_t)_{0\le t\le T}$ and $\boldsymbol\mu^2=(\mu^2_t)_{0\le t\le T}$ given on $\mathbb R^{d_1}$ and $\mathbb R^{d_2}$ respectively, solve the two optimal control problems:
\[
\inf_{\alpha^1}J^{1;\boldsymbol\mu^1,\boldsymbol\mu^2}(\alpha^1) \qquad\text{and}\qquad \inf_{\alpha^2}J^{2;\boldsymbol\mu^1,\boldsymbol\mu^2}(\alpha^2),
\]
with
\[
dX^1_t = b_1\bigl(t,X^1_t,\mu^1_t,\mu^2_t,\alpha^1_t\bigr)\,dt + \sigma_1\bigl(t,X^1_t,\mu^1_t,\mu^2_t,\alpha^1_t\bigr)\,dW^1_t,
\]
\[
dX^2_t = b_2\bigl(t,X^2_t,\mu^1_t,\mu^2_t,\alpha^2_t\bigr)\,dt + \sigma_2\bigl(t,X^2_t,\mu^1_t,\mu^2_t,\alpha^2_t\bigr)\,dW^2_t,
\]
for $t\in[0,T]$, and $X^1_0=x^1_0\in\mathbb R^{d_1}$ and $X^2_0=x^2_0\in\mathbb R^{d_2}$ as initial conditions.
2. Determine the flows $\boldsymbol\mu^1$ and $\boldsymbol\mu^2$ so that, along the resulting optimal paths, $\mu^1_t = \mathcal L(X^1_t)$ and $\mu^2_t = \mathcal L(X^2_t)$ for all $t\in[0,T]$.
We assume that the initial conditions $X^1_0$ and $X^2_0$ are deterministic for convenience only. The above procedure can be easily extended to cases when $X^1_0$ and $X^2_0$ are random.
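The following sketch implements the two steps as a Picard iteration in a deliberately simple linear-quadratic specification (the dynamics, costs, and numerical values are hypothetical, chosen so that the best response of each group has a closed form at the level of the means):

```python
import numpy as np

# Hypothetical model: population l controls dX^l = alpha^l dt + dW^l, pays
# (1/2)|alpha|^2 running cost plus the terminal cost (q/2)(X^l_T - s*m^{3-l})^2
# tracking the other group's terminal mean m^{3-l}. Given the flows, the
# optimally controlled mean at time T has the closed form used below;
# iterating the two best responses is a Picard scheme for the equilibrium.
q, s, T = 1.0, 0.5, 1.0
x0 = np.array([1.0, -1.0])                  # initial means of the two groups

def best_response_mean(target, x_init):
    # terminal mean of the optimal state when tracking the frozen `target`
    return target + (x_init - target) / (1.0 + q * T)

m = np.array([0.0, 0.0])                    # initial guess for terminal means
for _ in range(100):
    m_new = np.array([best_response_mean(s * m[1], x0[0]),
                      best_response_mean(s * m[0], x0[1])])
    if np.max(np.abs(m_new - m)) < 1e-12:
        break
    m = m_new
print(m)                                    # fixed point = equilibrium means
```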
Each optimization problem articulated in step 1 above can be handled using the tools introduced for the analysis of standard mean field games. For instance, both problems can be reformulated by means of: (i) a Hamilton-Jacobi-Bellman equation, (ii) an FBSDE for the value function, or (iii) the stochastic maximum principle, which relies on another FBSDE.

Below, we review these three approaches when $\sigma_1$ and $\sigma_2$ are independent of the control variables.
For the PDE approach, each of the two HJB equations is coupled with a Kolmogorov equation in $[0,T]\times\mathbb R^{d_l}$, with $V_l(T,\cdot) = g_l(\cdot,\mu^1_T,\mu^2_T)$ as terminal condition for the first equation, and $\mu^l_0 = \delta_{x^l_0}$ as initial condition for the second, for $l=1,2$. Above, $H^{(r)}_l$ is the reduced Hamiltonian associated with $(b_l,\sigma_l,f_l)$:
\[
H^{(r)}_l(t,x_l,\mu^1,\mu^2,y_l,\alpha_l) = b_l(t,x_l,\mu^1,\mu^2,\alpha_l)\cdot y_l + f_l(t,x_l,\mu^1,\mu^2,\alpha_l). \tag{7.2}
\]
For the FBSDE representation of the value function, the backward equations hold for $t\in[0,T]$, with $Y^l_T = g_l(X^l_T,\mathcal L(X^1_T),\mathcal L(X^2_T))$ as terminal condition, for $l=1,2$. For the approach based on the stochastic maximum principle, the adjoint equations hold for $t\in[0,T]$, with the terminal condition $Y^l_T = \partial_xg_l(X^l_T,\mathcal L(X^1_T),\mathcal L(X^2_T))$, for $l=1,2$. Above, $H_l$ stands for the full-fledged Hamiltonian:
\[
H_l(t,x_l,\mu^1,\mu^2,y_l,z_l,\alpha_l) = b_l(t,x_l,\mu^1,\mu^2,\alpha_l)\cdot y_l + \mathrm{trace}\bigl[\sigma_l(t,x_l,\mu^1,\mu^2)^\dagger z_l\bigr] + f_l(t,x_l,\mu^1,\mu^2,\alpha_l).
\]
In contrast with (4.70), observe that $\sigma_l$ may not be constant, which explains why the backward equation involves the full Hamiltonian instead of the reduced one.

The analysis of FBSDEs of the McKean-Vlasov type (7.3) and (7.4) can be carried out as the analysis of (4.54) and (4.70). In the latter case, $\sigma_1$ and $\sigma_2$ need to be assumed to be constant. We shall revisit the Pontryagin principle for processes with nonconstant volatility coefficients in Chapter (Vol II)-1, see Subsection (Vol II)-1.4.4.
A Practical Example
As a practical application, we revisit the crowd congestion model discussed in
Subsection 1.5.3.
Instead of one representative individual, we consider two sub-populations with $\mathbb R$ as state space for the individuals, and with the same dynamics as in (1.50):
\[
dX^l_t = \alpha^l_t\,dt + \sigma\,dW^l_t, \qquad l=1,2,
\]
for $t\in[0,T]$, where $\sigma>0$ and $W^1$ and $W^2$ are two independent Wiener processes. Here, $\alpha^1$ and $\alpha^2$ are $\mathbb F^1$- and $\mathbb F^2$-progressively measurable square integrable processes with values in $A_1=A_2=\mathbb R$. As in (1.51), we then choose:
\[
f_1(t,x,\mu^1,\mu^2,\alpha) = \frac12|\alpha|^2\Bigl(\int_{\mathbb R}\varrho(x-x')\,d\mu^1(x')\Bigr)^{a_1}\Bigl(\int_{\mathbb R}\varrho(x-x')\,d\mu^2(x')\Bigr)^{a_2} + e^{-rt}k(t,x), \tag{7.6}
\]
as running cost for the representative player of the first group, and similarly for the representative player of the second group. Here $\varrho$ is a smooth density with a support concentrated around $0$.
As explained in Chapter 1, the function k models the effect of panic depending
upon where the player is. Also, powers a1 > 0 and a2 > 0 are intended to penalize
congestion. Whenever a2 > a1 , individuals from group 1 primarily avoid congestion
with people from group 2, which may be typical of a xenophobic behavior.
We refer to the Notes & Complements below for references on this model.
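For concreteness, the congestion cost (7.6) can be evaluated against empirical measures of the two groups, replacing the convolutions by Monte Carlo averages. A minimal sketch (the Gaussian bump playing the role of the density $\varrho$, and all numerical values, are hypothetical):

```python
import numpy as np

def congestion_cost(alpha, x, sample1, sample2, a1, a2, k, r, t, eps=0.1):
    """Empirical version of the running cost (7.6): the control penalty is
    amplified by the local densities of the two groups, smoothed by a
    Gaussian bump of width eps (all names hypothetical)."""
    rho = lambda u: np.exp(-0.5 * (u / eps)**2) / (eps * np.sqrt(2 * np.pi))
    dens1 = rho(x - sample1).mean()       # int rho(x - x') dmu^1(x')
    dens2 = rho(x - sample2).mean()       # int rho(x - x') dmu^2(x')
    return 0.5 * abs(alpha)**2 * dens1**a1 * dens2**a2 + np.exp(-r*t) * k(t, x)

rng = np.random.default_rng(3)
g1, g2 = rng.normal(0.0, 1.0, 1000), rng.normal(0.5, 1.0, 1000)
print(congestion_cost(1.0, 0.2, g1, g2, a1=1.0, a2=2.0,
                      k=lambda t, x: x**2, r=0.1, t=0.5))
```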
Potential Games
As another example, we consider the analogue of potential games, but for models with two homogeneous subpopulations. We consider two $\mathbb R^{d_1}$- and $\mathbb R^{d_2}$-valued representative players with state dynamics of the form:
\[
dX^l_t = \alpha^l_t\,dt + \sigma\,dW^l_t, \qquad l=1,2,
\]
where, as above, $W^1$ and $W^2$ are two independent Wiener processes with values in $\mathbb R^{d_1}$ and $\mathbb R^{d_2}$ respectively, and $\alpha^1$ and $\alpha^2$ are two $\mathbb F^1$- and $\mathbb F^2$-progressively measurable square-integrable processes with values in $A_1\subset\mathbb R^{d_1}$ and $A_2\subset\mathbb R^{d_2}$. Their cost functionals are of the form:
\[
J^l(\alpha^1,\alpha^2) = \int_0^T\Bigl(F_l\bigl(t,\mathcal L(X^1_t),\mathcal L(X^2_t)\bigr) + \frac12\mathbb E\bigl[|\alpha^l_t|^2\bigr]\Bigr)\,dt + G_l\bigl(\mathcal L(X^1_T),\mathcal L(X^2_T)\bigr),
\]
for $l=1,2$.
Remark 7.2 As explained above, whenever $X^1_0$ and $X^2_0$ are deterministic, $\mathbb F^1$ and $\mathbb F^2$ may be chosen as the complete filtrations generated by $W^1$ and $W^2$. Whenever $X^1_0$ and $X^2_0$ are random, both $\mathbb F^1$ and $\mathbb F^2$ have to be augmented in an obvious manner.
Notice that in contrast with the notion of equilibrium defined for mean field
games with two subgroups, the Nash equilibrium is here regarded as an equilibrium
between the two populations. The formulation is in fact reminiscent of the mean
field stochastic control problems investigated in Chapter 6, since the marginal laws
appearing in the cost functionals are directly influenced by the strategies.
In order to find a Nash equilibrium, we can follow the arguments developed in
Chapter 6 and implement the stochastic Pontryagin principle, except that we have to
use the version of the stochastic maximum principle for games instead of the version
for control problems. Based upon our experience from Chapter 6, we expect that the
resulting adjoint equations depend upon the differential calculus used on the space
of probability measures of order 2.
If we choose the L-differential calculus introduced in Chapter 5 and if we
assume that the coefficients satisfy suitable differentiability assumptions, then in
full analogy with Definition 6.5, we associate with each couple $(\alpha^1,\alpha^2)$ two pairs of backward SDEs:
$$\begin{cases} dY^{i,j}_t = -\tilde{\mathbb{E}}\big[\partial_j H^i\big(t,\mathcal{L}(X^1_t),\mathcal{L}(X^2_t),\tilde Y^{i,1}_t,\tilde Y^{i,2}_t,\tilde\alpha^1_t,\tilde\alpha^2_t\big)(X^j_t)\big]\,dt + Z^{i,j}_t\,dW^j_t,\quad t\in[0,T],\\ Y^{i,j}_T = \partial_j G_i\big(\mathcal{L}(X^1_T),\mathcal{L}(X^2_T)\big)(X^j_T), \end{cases}\qquad(7.7)$$
where $(\tilde X^1,\tilde X^2,\tilde Y^{i,1},\tilde Y^{i,2},\tilde\alpha^1,\tilde\alpha^2)$ denotes an independent copy of $(X^1,X^2,Y^{i,1},Y^{i,2},\alpha^1,\alpha^2)$ defined on $(\tilde\Omega,\tilde{\mathcal{F}},\tilde{\mathbb{P}})$, and $\tilde{\mathbb{E}}$ denotes the expectation on $(\tilde\Omega,\tilde{\mathcal{F}},\tilde{\mathbb{P}})$.
Here, the Hamiltonians $H^1$ and $H^2$ are defined as reduced Hamiltonians for games (observe that the index is in exponent in order to distinguish these equations from (7.3)):
$$H^i(t,\mu^1,\mu^2,y_1,y_2,\alpha_1,\alpha_2) = \sum_{j=1}^2 \alpha_j\cdot y_j + \frac12|\alpha_i|^2 + F_i(t,\mu^1,\mu^2).$$
We use the reduced Hamiltonians here because the volatility is uncontrolled. Also, in (7.7), we represent the martingale part with respect to the sole $W^j$ because the randomness in the equation only comes from $X^j$, which is $\mathbb{F}^j$-progressively measurable.
Computing $\partial_j H^i$ under the current assumptions, we get:
$$\begin{cases} dY^{i,j}_t = -\partial_j F_i\big(t,\mathcal{L}(X^1_t),\mathcal{L}(X^2_t)\big)(X^j_t)\,dt + Z^{i,j}_t\,dW^j_t,\quad t\in[0,T],\\ Y^{i,j}_T = \partial_j G_i\big(\mathcal{L}(X^1_T),\mathcal{L}(X^2_T)\big)(X^j_T). \end{cases}$$
Since the first-order optimality condition given by the stochastic maximum principle takes the form $\alpha^i_t = -Y^{i,i}_t$, this yields the following McKean-Vlasov FBSDE:
$$\begin{cases} dX^i_t = -Y^{i,i}_t\,dt + \sigma\,dW^i_t,\\ dY^{i,j}_t = -\partial_j F_i\big(t,\mathcal{L}(X^1_t),\mathcal{L}(X^2_t)\big)(X^j_t)\,dt + Z^{i,j}_t\,dW^j_t,\quad t\in[0,T],\\ Y^{i,j}_T = \partial_j G_i\big(\mathcal{L}(X^1_T),\mathcal{L}(X^2_T)\big)(X^j_T). \end{cases}$$
Example. In the spirit of the congestion cost functions (7.6), we may choose, with the same dynamics as in (7.5),
$$F_l = F_0(\mu^l) + \theta\int_{\mathbb{R}}\Big(\int_{\mathbb{R}}\varphi(x-x')\,d\mu^1(x')\Big)\Big(\int_{\mathbb{R}}\varphi(x-x')\,d\mu^2(x')\Big)\,dx,$$
for $\theta>0$ and a smooth density $\varphi$ with a small support containing 0. As above, $\theta$ can be interpreted as a xenophobia parameter. In particular, we may expect some form of segregation for $\theta$ large, suggesting that the supports of the state distributions of the two populations might separate from one another in equilibrium. We refer to the bibliography in the Notes & Complements for references to further discussions of this issue.
Some of the examples discussed in Chapter 1 were presented with an infinite time horizon: see for instance the economic growth model in Subsection 1.4.2, the model of production of exhaustible resources in Subsection 1.4.4, and the Cucker-Smale model of flocking in Subsection 1.5.1. However, the mathematical theory covered in the book has been limited to mean field games with a finite time horizon. The goal of this subsection is to provide information on the methodology that could be implemented to solve infinite horizon models.
In order to do so, we distinguish two cases: (i) Mean field games with an
infinite time horizon and a discounted running cost, which cover the aforementioned
economic growth model and model of production of exhaustible resources; (ii)
Ergodic mean field games, which appeared in the presentation of the Cucker-Smale
model.
Mean Field Games with Infinite Time Horizon and Discounted Running Cost
Following the presentation of mean field games given in Chapter 3, we consider a
player in interaction with a homogeneous population and with controlled dynamics
of the form:
$$dX_t = b(t,X_t,\mu_t,\alpha_t)\,dt + \sigma(t,X_t,\mu_t)\,dW_t,\quad t\ge0,$$
and with cost functional:
$$J^{\boldsymbol\mu}(\alpha) = \mathbb{E}\Big[\int_0^\infty e^{-\beta t} f(t,X_t,\mu_t,\alpha_t)\,dt\Big],$$
where $\beta>0$ is an actualization factor, most often a discount factor as in financial applications. In some sense, one may think of a zero terminal cost function $g$.
Formally, the search for an equilibrium within the population should follow the
same procedure as that defined for mean field games with a finite time horizon:
1. For any deterministic flow of probability measures $\boldsymbol\mu=(\mu_t)_{t\ge0}$ on $\mathbb{R}^d$, solve the infinite horizon optimal control problem:
$$\inf_{\alpha} J^{\boldsymbol\mu}(\alpha);$$
2. Find a flow $\boldsymbol\mu$ such that $\mathcal{L}(X_t)=\mu_t$ for all $t\ge0$, where $X$ is a solution of the above optimal control problem.
Remark 7.3 Obviously, we could consider a more general version of the model,
including for example a common noise, or a random initial condition. We leave it to
the reader to adapt the above definition of an equilibrium accordingly.
When comparing with the analysis of mean field games with a finite time horizon
presented in Chapters 3 and 4, the main difference lies in the optimal control
problem defined in step 1. Indeed, since the cost functional is defined via an integral
over an unbounded interval, this integral may not make sense under the regularity
and integrability assumptions used in Chapters 3 and 4, and additional conditions
may be needed to make the whole machinery work.
In this subsection, we do not address this question in detail, though we provide
references in the Notes & Complements to works where results addressing this issue
can be found. Still, we observe that whenever $f$ is bounded, the cost is obviously well defined. However, it is often convenient to assume that $f$ is strictly convex in $\alpha$; in order to accommodate these two seemingly contradictory constraints, it then makes sense to require that $A$ be bounded. Notice also that, more generally, when $f$ is bounded from below, the cost is well defined, although it may be infinite. Furthermore, whenever $f$ is neither bounded from above nor from below, special properties of the drift function $b$ and the volatility $\sigma$ can still make it possible for the cost to be well defined. Indeed, such properties can be used to control the growth of the solution $(X_t)_{t\ge0}$ and of its moments, which may grow exponentially or polynomially, or could be bounded.
As is often the case in this book, we assume that $\sigma$ is uncontrolled.
Value Function and HJB Equation. When the cost functional is well defined for a sufficiently large class $\mathbb{A}$ of admissible control processes $\alpha$, one can try to characterize the solutions to the optimal control problem $\inf_{\alpha\in\mathbb{A}} J^{\boldsymbol\mu}(\alpha)$ by means of equations similar to those used when the time horizon is finite. A common way to do so is to introduce the analogue of the value function:
$$V(t,x) = e^{\beta t}\inf_{\alpha\in\mathbb{A}^t}\mathbb{E}\Big[\int_t^\infty e^{-\beta s} f(s,X_s,\mu_s,\alpha_s)\,ds \,\Big|\, X_t=x\Big],\qquad(7.9)$$
where $\mathbb{A}^t$ is the class of admissible controls starting from time $t$. Here, the exponential pre-factor is a normalization accounting for the fact that the system is initialized at time $t$. By a formal application of the dynamic programming principle, we expect that:
$$V(t,x) = \inf_{\alpha\in\mathbb{A}^t}\mathbb{E}\Big[e^{\beta t}\int_t^{t+h} e^{-\beta s} f(s,X_s,\mu_s,\alpha_s)\,ds + e^{-\beta h}V(t+h,X_{t+h}) \,\Big|\, X_t=x\Big].\qquad(7.10)$$
$$\partial_t V(t,x) + \frac12\mathrm{trace}\Big[\sigma\sigma^\dagger(t,x,\mu_t)\,\partial^2_{xx}V(t,x)\Big] - \beta V(t,x) + \inf_{\alpha\in A} H^{(r)}\big(t,x,\mu_t,\partial_x V(t,x),\alpha\big) = 0,\qquad(7.11)$$
Obviously, (7.11) has the same form as the HJB equation appearing in the statement
of Lemma 4.47, except for the presence of an additional zero-order term and the
apparent lack of a terminal condition. The terminal condition should be replaced by a condition on the asymptotic behavior of $V(t,\cdot)$ as $t$ tends to $\infty$. The need for such an additional condition on the growth of $V(t,\cdot)$ for $t\to\infty$ becomes especially clear when implementing the analogue of the verification argument used in Lemma 4.47 in the case of finite horizon models. Following the statement of this lemma, assume indeed that $V$ is a classical solution to (7.11), and for an admissible control process $\alpha$, expand $(e^{-\beta t}V(t,X_t))_{t\ge0}$ using Itô's formula. Taking expectations in the resulting expansion (provided this is permissible), we get, for any $t\ge0$,
$$\begin{aligned} &\mathbb{E}\big[e^{-\beta t}V(t,X_t)\big] + \mathbb{E}\Big[\int_0^t e^{-\beta s} f(s,X_s,\mu_s,\alpha_s)\,ds\Big]\\ &= V(0,x_0) + \mathbb{E}\Big[\int_0^t e^{-\beta s}\Big( H^{(r)}\big(s,X_s,\mu_s,\partial_x V(s,X_s),\alpha_s\big) - \inf_{\alpha\in A} H^{(r)}\big(s,X_s,\mu_s,\partial_x V(s,X_s),\alpha\big)\Big)\,ds\Big]. \end{aligned}$$
Since the integrand in the right-hand side is nonnegative, the left-hand side is always greater than or equal to $V(0,x_0)$, with equality if
$$\forall t\ge0,\quad \alpha_t = \hat\alpha\big(t,X_t,\mu_t,\partial_x V(t,X_t)\big),$$
where $\hat\alpha$ denotes a minimizer of the reduced Hamiltonian $H^{(r)}$, and with strict inequality if the above identity for $\alpha$ is not satisfied on some measurable subset of $[0,\infty)\times\Omega$ with a nonzero measure for $\mathrm{Leb}_1\otimes\mathbb{P}$. Therefore, if the minimizer $\hat\alpha$ is well defined, if the SDE:
$$dX_t = b\big(t,X_t,\mu_t,\hat\alpha(t,X_t,\mu_t,\partial_x V(t,X_t))\big)\,dt + \sigma(t,X_t,\mu_t)\,dW_t,\quad t\ge0,$$
for all $t\ge0$, with $X_0=x_0$ as initial condition, is solvable, then its solution provides optimal paths. Like (7.11), (7.12) has no explicit terminal condition. Instead, it is necessary to impose conditions on the behavior of $Y_t$ as $t$ tends to $\infty$. As above, a standard strategy for constructing a solution is to solve the approximating problem on the interval $[0,n]$ instead of $[0,\infty)$, with 0 as explicit terminal condition at time $n$, and then let $n$ tend to $\infty$. Also, uniqueness of the solution may be proved by combining, as in the above verification argument, the condition imposed on the asymptotic behavior of the solution together with the strategy used in Proposition 4.51 to prove uniqueness of the optimal paths on finite intervals.
The strategy is the same when dealing with the stochastic Pontryagin principle.
The corresponding FBSDE should be of the form:
$$\begin{cases} dX_t = b\big(t,X_t,\mu_t,\hat\alpha(t,X_t,\mu_t,Y_t)\big)\,dt + \sigma(t,X_t,\mu_t)\,dW_t,\\ dY_t = -\Big(\partial_x H\big(t,X_t,\mu_t,Y_t,Z_t,\hat\alpha(t,X_t,\mu_t,Y_t)\big) - \beta Y_t\Big)\,dt + Z_t\,dW_t, \end{cases}\qquad(7.13)$$
for $t\ge0$, with appropriate conditions on the behavior of the solution as $t$ tends to $\infty$. Above, $H$ stands for the full Hamiltonian of the problem.
We refer to the Notes & Complements at the end of the chapter for further
references on these kinds of equations, including results on the choice of the
asymptotic boundary condition, and on practical applications.
with
$$J^{\boldsymbol\mu}(\alpha) = \mathbb{E}\Big[\int_0^\infty e^{-\beta t} f(X_t,\mu_t,\alpha_t)\,dt\Big]$$
as cost functional. If we now require that the flow $\boldsymbol\mu=(\mu_t)_{t\ge0}$ remain constant over time, that is $\mu_t=\mu$ for all $t\ge0$ for some $\mu\in\mathcal{P}(\mathbb{R}^d)$, then the optimization problem becomes the minimization of the cost functional:
$$J^{\mu}(\alpha) = \mathbb{E}\Big[\int_0^\infty e^{-\beta t} f(X_t,\mu,\alpha_t)\,dt\Big].\qquad(7.14)$$
Notice that we denoted $J^{\boldsymbol\mu}$ by $J^{\mu}$ with the superscript in a regular font (as opposed to the boldface $\boldsymbol\mu$) to emphasize the fact that the flow is now constant. Accordingly, the value function in (7.9) is expected to become time independent. It reads:
$$V(x) = \inf_{\alpha\in\mathbb{A}}\mathbb{E}\Big[\int_0^\infty e^{-\beta s} f(X_s,\mu,\alpha_s)\,ds \,\Big|\, X_0=x\Big].$$
$$\frac12\mathrm{trace}\Big[\sigma\sigma^\dagger(x,\mu)\,\partial^2_{xx}V(x)\Big] - \beta V(x) + \inf_{\alpha\in A} H^{(r)}\big(x,\mu,\partial_x V(x),\alpha\big) = 0,\qquad(7.15)$$
$H^{(r)}$ being now independent of $t$. Then, minimizers $\hat\alpha$ of $H^{(r)}$ merely write $\hat\alpha(x,\mu,y)$, and optimal paths are given by time-homogeneous diffusion processes:
$$d\hat X_t = b\big(\hat X_t,\mu,\hat\alpha(\hat X_t,\mu,\partial_x V(\hat X_t))\big)\,dt + \sigma(\hat X_t,\mu)\,dW_t,\quad t\ge0.$$
Ergodic Cost. In order to fully legitimize the substitution of $\boldsymbol\mu$ by its long-run limit $\mu$, a convenient strategy is to provide a formulation of the cost functional $J^{\boldsymbol\mu}$ which is independent of the initial condition of $X=(X_t)_{t\ge0}$. A natural candidate is:
$$J^{\boldsymbol\mu,\mathrm{erg}}(\alpha) = \lim_{T\to\infty}\frac1T\,\mathbb{E}\Big[\int_0^T f(X_t,\mu_t,\alpha_t)\,dt\Big],\qquad(7.16)$$
for a flow $\boldsymbol\mu=(\mu_t)_{t\ge0}$ with values in $\mathcal{P}(\mathbb{R}^d)$, where as above, we assume that $b$ and $\sigma$ are independent of $t$ and also of $\mu$, that is, $b(t,x,\mu,\alpha)=b(x,\alpha)$ and $\sigma(t,x,\mu)=\sigma(x)$.
This is the same cost functional as in (7.14), except that we now emphasize the dependence upon the actualization rate $\beta$. It is natural to expect:
$$\lim_{\beta\searrow0}\beta V^\beta(x) = \inf_{\alpha\in\mathbb{A}} J^{\mu,\mathrm{erg}}(\alpha),$$
where:
$$V^\beta(x) = \inf_{\alpha\in\mathbb{A}}\mathbb{E}\Big[\int_0^\infty e^{-\beta s} f(X_s,\mu,\alpha_s)\,ds\,\Big|\,X_0=x\Big],$$
for any $x\in\mathbb{R}^d$.
Letting $\lambda = \inf_{\alpha\in\mathbb{A}} J^{\mu,\mathrm{erg}}(\alpha)$ and passing to the limit in a formal way in (7.15), we derive the HJB equation:
$$\frac12\mathrm{trace}\Big[\sigma\sigma^\dagger(x)\,\partial^2_{xx}V(x)\Big] - \lambda + \inf_{\alpha\in A} H^{(r)}\big(x,\mu,\partial_x V(x),\alpha\big) = 0.\qquad(7.20)$$
From this equation, $V$ can only be determined up to an additive constant, and the constant $\lambda$ needs to be part of the solution. For this reason, $V$ is usually constructed as the limit of $V^\beta - V^\beta(0)$ as $\beta\searrow0$. In that case, optimal paths are given by:
$$d\hat X_t = b\big(\hat X_t,\hat\alpha(\hat X_t,\mu,\partial_x V(\hat X_t))\big)\,dt + \sigma(\hat X_t)\,dW_t,\quad t\ge0,$$
where $\hat\mu$ denotes the unique invariant measure of $\hat X$ whenever it exists. This identification of $\lambda$ follows from (7.19) when $\hat X$ is irreducible, strong Feller, and has an invariant measure.
In this framework, the fixed point condition for mean field games merely writes $\hat\mu=\mu$, that is, $\mu$ is the invariant measure of $\hat X$. Therefore, for an ergodic mean field game, the search for an equilibrium consists in the following two-step procedure:
1. For each $\mu\in\mathcal{P}(\mathbb{R}^d)$, solve the ergodic optimal control problem $\inf_{\alpha\in\mathbb{A}} J^{\mu,\mathrm{erg}}(\alpha)$;
2. Find $\mu$ such that $\hat\mu=\mu$, that is, such that $\mu$ is an invariant measure of the corresponding optimal process $\hat X$.
Observe that in step 1, we can choose whether or not to reduce the analysis to controls $\alpha$ in stationary Markov feedback form. In any case, a reasonable guess is that the optimal control should be in stationary Markov feedback form. The requirement that $\hat X$ have an invariant measure for any $\mu\in\mathcal{P}(\mathbb{R}^d)$ is a very restrictive condition. In order to satisfy it, one usually imposes specific conditions on the coefficients $b$ and $\sigma$. For example, denoting by $\varphi$ the optimal stationary feedback function, one may want to require that the diffusion process $\hat X$ solving the SDE:
$$d\hat X_t = b\big(\hat X_t,\varphi(\hat X_t)\big)\,dt + \sigma(\hat X_t)\,dW_t,\quad t\ge0,$$
have suitable non-degeneracy and positive recurrence properties, this being the case if the set $A$ is bounded, the coefficient $\sigma$ is bounded from above and uniformly elliptic, and the drift $b$ is dissipative in the $x$-direction. We refer to the Notes & Complements below for standard references on the subject.
Obviously, if the analysis is reduced to controlled processes which are irreducible and strong Feller, the measure $\mu$ in step 2 is the unique invariant measure of $\hat X$. Also, once a fixed point in step 2 is found, the two formulations (7.16) and (7.18) coincide if the marginals $(\mu_t = \mathcal{L}(\hat X_t))_{t\ge0}$ converge to $\mu$, which is an invariant measure of $\hat X$ because of the fixed point condition. For instance, if $\hat X$ is irreducible and strong Feller, in which case $\mu$ is the unique invariant measure of $\hat X$, then $\mu_t$ converges in law to $\mu$ as $t\to\infty$. Depending on the smoothness of $f$ in the measure argument, the convergence may be investigated with respect to other topologies or distances, such as the 2-Wasserstein distance.
PDE Formulation. Accounting for the HJB equation (7.20) together with the standard Poisson equation for the invariant measure of a diffusion process, we end up describing ergodic mean field games with a system of two stationary PDEs:
$$\begin{cases} \dfrac12\mathrm{trace}\Big[\sigma\sigma^\dagger(x)\,\partial^2_{xx}V(x)\Big] - \lambda + \inf_{\alpha\in A} H^{(r)}\big(x,\mu,\partial_x V(x),\alpha\big) = 0,\\[4pt] \dfrac12\mathrm{trace}\Big[\partial^2_{xx}\big(\sigma\sigma^\dagger\mu\big)(x)\Big] - \mathrm{div}_x\Big( b\big(x,\hat\alpha(x,\mu,\partial_x V(x))\big)\,\mu\Big)(x) = 0. \end{cases}$$
Ergodic BSDE. In analogy with (7.12), the stationary HJB equation may be represented by means of an ergodic BSDE, which can be obtained by replacing $\beta Y_t$ in (7.12) by $\lambda$. Once again, we refer to the end of the chapter for further discussion of this point.
The N-Player Game. At the risk of indulging in anticipation of Chapter (Vol II)-6, we cannot resist the temptation to describe the connection with finite-player games. The reason is that, as explained in the Notes & Complements of Chapter (Vol II)-6, the first published results on the convergence of finite-player game equilibria to mean field games were obtained for ergodic mean field games. In this last paragraph, we follow the presentation used in these early papers to explain the differences with the strategy that we shall adopt in Chapter (Vol II)-6.
With coefficients of the same form as above, consider $N$ states with dynamics:
$$dX^{N,i}_t = b\big(X^{N,i}_t,\alpha^i_t\big)\,dt + \sigma\big(X^{N,i}_t\big)\,dW^i_t,\quad t\ge0,\ i=1,\dots,N,$$
driven by $N$ independent Wiener processes $W^1,\dots,W^N$, where:
$$\bar\mu^{N,i}_t = \frac{1}{N-1}\sum_{j=1,\,j\ne i}^N \delta_{X^{N,j}_t},\quad t\ge0.$$
The ergodic cost to player $i$ takes the form:
$$J^{N,i}(\alpha^i) = \lim_{T\to\infty}\frac1T\,\mathbb{E}\Big[\int_0^T\int_{\mathbb{R}^{(N-1)d}} f\Big(X^i_t,\frac{1}{N-1}\sum_{j=1,\,j\ne i}^N\delta_{z^j},\alpha^i_t\Big)\prod_{j=1,\,j\ne i}^N d\mu^j(z^j)\,dt\Big],$$
and the corresponding Hamiltonian reads:
$$H^i(x,\boldsymbol\mu^{-i},y,\alpha) = b(x,\alpha)\cdot y + \int_{\mathbb{R}^{(N-1)d}} f\Big(x,\frac{1}{N-1}\sum_{j=1,\,j\ne i}^N\delta_{z^j},\alpha\Big)\prod_{j=1,\,j\ne i}^N d\mu^j(z^j).$$
This section is devoted to the analysis of models for which the state space of the system is finite. The methods can be adjusted to handle countable discrete state spaces like the set $\mathbb{N}$ of integers used in the example Searching for Knowledge of Section 1.6 of Chapter 1. We refrain from working at this level of generality to avoid having to deal with heavier notations and, most importantly, having to add technical conditions to guarantee that all the quantities (which would typically involve infinite sums) are actually finite and well defined. Moreover, even though we still work in continuous time, the stochastic analysis tools developed throughout the book for solutions of stochastic differential mean field games can no longer be used in their original forms. The solutions and implementations need to be ported from the framework of stochastic differential games to a discrete game set-up.
In this section, we assume that the possible states of the system comprise a finite set $E=\{e_1,\dots,e_d\}$. Even though this will not play a major role in this section, we can always assume, as we did in Subsection 5.4.4 of Chapter 5, that the set $E$ is embedded in $\mathbb{R}^d$ by regarding its elements as the vectors of the canonical basis of $\mathbb{R}^d$ formed by the unit coordinate vectors $e_1=(1,0,\dots,0)$, $\dots$, $e_d=(0,\dots,0,1)$.
Before we introduce the specifics of the finite player games from which we
derive mean field game models, we review some of the features of the basic
stochastic differential mean field games studied in the first part of the book. At any
given time, players choose their actions from a Borel set $A\subset\mathbb{R}^k$. The dynamics of the state of the system are completely determined by the drift and volatility functions $b$ and $\sigma$, which depend upon time, the value of the state of a given player, a probability measure serving as a proxy for the distribution of the states of the other players, and a possible action. For each $\mu\in\mathcal{P}_2(\mathbb{R}^d)$ and $\alpha\in A$, the functions $[0,T]\times\mathbb{R}^d\ni(t,x)\mapsto b(t,x,\mu,\alpha)$ and $[0,T]\times\mathbb{R}^d\ni(t,x)\mapsto\sigma(t,x,\mu,\alpha)$ determine the local means and standard deviations of the changes of the state in infinitesimally small time intervals. Put more mathematically, they determine the infinitesimal generator $(L^{\mu,\alpha}_t)_{0\le t\le T}$ of a Markov diffusion process for the dynamics of the states from time $t$ on. Replacing $\mu$ by a deterministic flow $\boldsymbol\mu=(\mu_t)_{0\le t\le T}$ of measures, or $\alpha$ by a feedback function $\varphi(t,x)$, would change the dynamics without affecting the Markovian character of these dynamics. However, replacing $\boldsymbol\mu$ by a stochastic flow of measures, or $\alpha$ by an adapted process with values in $A$, would prevent us from constructing the controlled dynamics of the states as a Markov process. Nevertheless, such dynamics could still be constructed from the local mean and standard deviation characteristics (now random processes) by solving stochastic differential equations with random coefficients. We now explain why and how the situation is different, and possibly more delicate, when the state space is a finite set $E$ instead of $\mathbb{R}^d$.
We first recall the characterization of the infinitesimal generators of continuous time Markovian dynamics on a finite state space $E$: a matrix $(q(x,x'))_{x,x'\in E}$ with nonnegative off-diagonal entries whose rows sum to zero, that is, $q(x,x')\ge0$ for $x\ne x'$ and $\sum_{x'\in E}q(x,x')=0$ for every $x\in E$, is called a Q-matrix.
The linearity assumption (A2) will only be needed later on for some specific technical results. We included it here for the sake of completeness as it is, after all, an assumption on the form of the jump rates. Observe however that it is rather restrictive, as the sign constraint $\lambda_t(x,x',\mu,\alpha)\ge0$ for $x\ne x'$ becomes $\lambda_t(x,x',\mu)\cdot\alpha\ge0$, which precludes $A$ from containing opposite vectors. We let the reader check that a more general affine, instead of linear, condition would in fact suffice to implement the arguments used below.
Our first task is to show that the rates $\lambda_t(x,x',\mu,\alpha)$ can be used to construct dynamics controlled by the players when $\mu$ is replaced by $\mu_t$ for a flow $\boldsymbol\mu=(\mu_t)_{0\le t\le T}$ of measures which is either deterministic or Markovian, in the sense that $\mu_t$ is a function of time and of the state $X_t$ at time $t$, and similarly when $\alpha$ is replaced by a feedback function $\varphi(t,X_t)$ of the state $X_t$ at time $t$.
Constructing state dynamics when the flow of measures and the controls are only
assumed to be adapted to a given filtration is much more involved mathematically.
It requires the construction of point processes from their local characteristics as
given by their dual predictable projections. Indeed, the jump intensities are the
only characteristics which can be controlled. More details on what is actually
needed, at least in the mean field game formulation, will be given at the beginning
of Subsection 7.2.2 below. So for the sake of simplicity, we restrict ourselves to
models without common noise, and we only consider Markovian control strategies.
References to specific texts containing elements of the control theory of point
processes are given in the Notes & Complements at the end of the chapter.
For pedagogical reasons, we first describe, though in a rather informal way, the finite
player games leading to the mean field game models considered in this section.
Complete hypotheses and precise and rigorous statements will be given in the
following subsections.
The state of player $i\in\{1,\dots,N\}$ is given at time $t$ by an element $X^i_t$ of the finite set $E$. Intuitively, the way player $i$ acts on the system is by choosing, or at least influencing, the rate at which its own state will switch from its current value $X^i_t=x$ to another possible value $x'\in E$. So even if this choice were to depend upon other quantities, such as the current or past values $X^j_s$, for $j\ne i$ and $0\le s\le t$, of the states of the other players, the action of player $i$ at time $t$ should only affect directly the value of the rates at which its own state will jump to other states $x'\in E$. Again, these rates can and will depend upon the current value of the state of player $i$, as well as other quantities like the current values of the states of the other players.
for $i=1,\dots,N$, whenever $x'\ne x$, where as usual, $\bar\mu^{N,i}_t = \frac{1}{N-1}\sum_{j=1,\,j\ne i}^N\delta_{X^j_t}$ is the empirical distribution of the states $X^j_t$ of the other players.
The rate $\lambda^i_t(x,x',\mu)$ is here to give the small $\Delta t$ asymptotic behavior of the probability that the state of player $i$ changes from $x$ to $x'$ around time $t$ when the empirical distribution of the states of all the other players is $\mu$. The formulation of this requirement is inspired by the properties of continuous time Markov chains on finite state spaces.
denote by $\mathbb{A}^i=\mathbb{A}$ this set of strategies. For the time being, it will serve as the set of admissible control strategies for each player $i$. We may add extra conditions for a control strategy to be admissible later on.
In any case, the controlled state evolves as a Markov process in $E^N$, with càd-làg trajectories, and its distribution is entirely determined by the transitions:
$$\mathbb{P}\big[X_{t+\Delta t}=x'\,\big|\,X_t=x,\,\mathcal{F}_t\big] = \begin{cases} 1-\displaystyle\sum_{j=1}^N\sum_{y\in E\setminus\{x^j\}} \lambda^j_t\big(x^j,y,\bar\mu^{N-1}_{x^{-j}},\varphi^j(t,x)\big)\,\Delta t + o(\Delta t) & \text{if } x'=x,\\[4pt] \lambda^i_t\big(x^i,x'^i,\bar\mu^{N-1}_{x^{-i}},\varphi^i(t,x)\big)\,\Delta t + o(\Delta t) & \text{if } x'^i\ne x^i \text{ and } x'^j=x^j \text{ for all } j\ne i,\\ & \quad\text{for some } i\in\{1,\dots,N\},\end{cases}\qquad(7.23)$$
where $\bar\mu^{N-1}_{x^{-j}}$ denotes the empirical distribution of the entries $(x^k)_{k\ne j}$. More generally, for $x=(x^1,\dots,x^N)\in E^N$, the empirical measure of the entries writes:
$$\bar\mu^N_x = \frac1N\sum_{i=1}^N\delta_{x^i} = \sum_{\ell=1}^d p^\ell\,\delta_{e_\ell},$$
with $p^\ell$ denoting the proportion of entries of $x$ equal to $e_\ell$.
Cost Functionals
The goal of player $i$ is to minimize its expected cost as given by:
$$J^i(\alpha^1,\dots,\alpha^N) = \mathbb{E}\Big[\int_0^T f\big(t,X^i_t,\bar\mu^{N-1}_{X^{-i}_t},\alpha^i_t\big)\,dt + g\big(X^i_T,\bar\mu^{N-1}_{X^{-i}_T}\big)\Big],$$
where the strategies $\alpha^j$ belong to $\mathbb{A}$ for $j=1,\dots,N$, and where the running cost function $f$ and the terminal cost function $g$ are real valued functions defined on the sets $[0,T]\times E\times\mathcal{P}(E)\times A$ and $E\times\mathcal{P}(E)$ respectively.
We emphasize the similarities and the differences between the present analysis
and the stochastic differential models studied throughout the book by using similar
notations and formulating hypotheses and results in as close a manner as possible.
General Formulation
The reader may skip this short subsection on a first reading. Indeed, it will not be used in the remainder of this section, where we consider only Markovian games. Still, we thought it would be useful to hint at the technicalities we would have to face should we want to consider models with common noise or games with major and minor players, as we do in Subsection (Vol II)-7.1.9.
In its most abstract form, a general formulation of the problem would involve a filtered probability space $(\Omega,\mathcal{F},\mathbb{F},\mathbb{P})$ together with two predictable processes $\alpha=(\alpha_t)_{0\le t\le T}$ and $\boldsymbol\mu=(\mu_t)_{0\le t\le T}$, with values in $A$ and $\mathcal{P}(E)$ respectively. We should think of $\mu_t$ as a random input serving as a proxy for the conditional distribution of the state of a generic player, the conditioning being with respect to a random environment external to the game and common to all the players. We should think of $\alpha_t$ as the control exerted at time $t$ by a generic player. Given the environment $\boldsymbol\mu$ and the control strategy $\alpha$, the state of the process should be a continuous time process $X=(X_t)_{0\le t\le T}$ with values in $E$. We shall assume that it is càd-làg, by which we mean right continuous with left limits. Because $E$ is finite, this process determines a random point measure on $[0,T]\times E$:
$$\nu(dt,dx) = \sum_{0\le t\le T:\,X_{t^-}\ne X_t}\delta_{(t,X_t)}(dt,dx) = \sum_{n\ge0}\delta_{(T_n,X_n)}(dt,dx),$$
if we think of $X$ as a marked point process $(T_n,X_n)_{n\ge0}$, where $T_n$ denotes the time of the $n$-th jump of the process and $X_n$ the landing point of the jump. The mathematical formulation of the fact that a generic player controls the state in the random environment with the strategy $\alpha$ would state that the dual predictable projection $\bar\nu$ of the measure $\nu$ is given by:
$$\bar\nu(dt,\{x'\}) = \lambda_t\big(X_{t^-},x',\mu_t,\alpha_t\big)\,dt,$$
which appears as the data from which the state evolution needs to be constructed. Such a level of generality raises questions which could distract us from the main thrust of this section. Indeed, reconstructing the point measure $\nu$ from its dual predictable projection $\bar\nu$ is rather involved, and even when this can be done, the measure $\nu$ does not always determine uniquely the state process $X$ if the filtration $\mathbb{F}$ is larger than the filtration $\mathbb{F}^X$ generated by the process $X$. This difficulty can be resolved by working on the canonical space on which we can construct the marked point process, but this can prevent us from capturing natural models for important applications. See the Notes & Complements at the end of the chapter for references to textbooks and articles providing the necessary material needed to tackle these subtle issues.
$$q_t(x,x') = \lambda_t\big(x,x',\mu_t,\varphi(t,x)\big),\quad t\in[0,T],\ x,x'\in E.$$
2. Find $\boldsymbol\mu=(\mu_t)_{0\le t\le T}$ such that the flow of marginal distributions of a solution $X=(X_t)_{0\le t\le T}$ to the above optimal control problem coincides with the flow we started from, in the sense that $\mathcal{L}(X_t)=\mu_t$ for all $t\in[0,T]$.
(A1) The function $f$ is jointly continuous in $(t,\mu,\alpha)$. For each fixed $(t,x,\mu)\in[0,T]\times E\times\mathcal{P}(E)$, it is differentiable with respect to the control parameter $\alpha$, with a derivative $\partial_\alpha f$ which is jointly continuous in $(t,\mu,\alpha)$.
(A2) The set $A$ is closed and convex and the function $f$ is uniformly $\gamma$-convex in $\alpha\in A$ for some strictly positive constant $\gamma>0$, uniformly in $(t,x,\mu)\in[0,T]\times E\times\mathcal{P}(E)$, in the sense that:
$$f(t,x,\mu,\alpha') - f(t,x,\mu,\alpha) - \partial_\alpha f(t,x,\mu,\alpha)\cdot(\alpha'-\alpha) \ge \gamma|\alpha'-\alpha|^2,\quad \alpha,\alpha'\in A.$$
Notice that the assumptions made above are automatically uniform with respect
to the state variable x since the state space E is finite.
The Hamiltonian
For each $x\in E$, we define the difference operator $\Delta_x$ acting on functions $h$ on $E$ by the formula:
$$[\Delta_x h](x') = h(x') - h(x),\quad x,x'\in E.\qquad(7.25)$$
The introduction of this definition is justified by the fact that the action of the infinitesimal generators $(L_t)_{t\ge0}$ of a nonhomogeneous continuous time Markov chain in $E$ with Q-matrix (or transition rates) $(q_t)_{t\ge0}$ is given by, for each function $h$ on $E$:
$$[L_t h](x) = q_t(x,\cdot)\cdot h = \sum_{x'\in E} q_t(x,x')\,h(x') = \sum_{x'\in E} q_t(x,x')\,\big[h(x')-h(x)\big] = q_t(x,\cdot)\cdot\Delta_x h,\qquad(7.27)$$
the third equality following from the fact that each row $q_t(x,\cdot)$ sums to zero.
We shall use the notation $\Delta_x$ for the difference operator whenever convenient. In any case, the Kolmogorov or Fokker-Planck equation for a flow of probability measures $\boldsymbol\mu=(\mu_t)_{t\ge0}$ governed by $(q_t)_{t\ge0}$ is given by $\partial_t\mu_t = L^*_t\mu_t$, where we use the notation $L^*_t$ to denote the adjoint/transpose of the operator/matrix $L_t$. In developed form, the Kolmogorov equation reads:
$$\partial_t\mu_t(\{x\}) = \sum_{x'\in E} q_t(x',x)\,\mu_t(\{x'\}),\quad x\in E,\ t\ge0.\qquad(7.28)$$
We use the obvious notation $L^{\mu,\alpha}_t$ for the infinitesimal generator $L_t$ defined above for the Q-matrix $q_t(x,x') = \lambda_t(x,x',\mu,\alpha)$.
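For intuition, the Kolmogorov equation (7.28) is just a linear ODE on the simplex, which an explicit Euler scheme integrates directly. The following minimal sketch is ours (the three-state Q-matrix is an arbitrary illustration, not a model from the text):

import numpy as np

# A minimal sketch of the Kolmogorov equation (7.28): d/dt mu_t = L_t^* mu_t,
# integrated by explicit Euler on a finite state space E = {0, 1, 2}.
Q = np.array([[-1.0, 0.7, 0.3],
              [ 0.2, -0.5, 0.3],
              [ 0.0, 0.4, -0.4]])    # rows sum to zero, off-diagonals >= 0

def kolmogorov(mu0, Q, T=1.0, n=1000):
    dt = T / n
    mu = np.array(mu0, dtype=float)
    for _ in range(n):
        mu = mu + dt * (mu @ Q)      # mu(x) gains sum_{x'} mu(x') q(x', x) per unit time
    return mu

print(kolmogorov([1.0, 0.0, 0.0], Q))   # mass spreads out; stays a probability vector

Since the rows of $Q$ sum to zero, total mass is conserved at each Euler step, which is why the iterate remains a probability vector up to the usual $O(\Delta t)$ error.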
Using (A2) in assumption Discrete MFG Rates and (A2) in assumption Discrete MFG Cost Functions, and following the steps of the proof of Lemma 3.3 in Chapter 3, we can prove the existence of a unique minimizer $\hat\alpha$ defined by:
$$\hat\alpha(t,x,\mu,h) = \mathrm{argmin}_{\alpha\in A} H(t,x,\mu,h,\alpha).\qquad(7.30)$$
For the sake of convenience, we implicitly assume below that (A1) and (A2) in assumption Discrete MFG Cost Functions and assumption Discrete MFG Rates are satisfied. However, in some specific cases, we shall still use the notations $\hat\alpha$ and $H$ even though $\lambda_t$ is not linear in $\alpha$.
as the integral of its derivative between $t$ and $T$, when $X=(X_t)_{0\le t\le T}$ is driven by a control $\alpha$. This can be achieved by means of (7.27). We then get that the cost to $\alpha$ is greater than $\mathbb{E}[u(0,X_0)]$, with equality if $\alpha_t = \hat\alpha(t,X_t,\mu_t,u(t,\cdot))$ for all $t\in[0,T]$.
Lemma 7.6 Under the standing assumptions, there exists a constant $C$ such that, for any $S\in[0,T]$ for which the HJB equation (7.33) has a continuously differentiable solution $u$ on $[S,T]\times E$ with terminal condition $u(T,x)=g(x,\mu_T)$, it holds:
$$\sup_{(t,x)\in[S,T]\times E}|u(t,x)| \le C.$$
Proof. We consider $S$ as in the statement. Thanks to the preliminary discussion before the statement of Lemma 7.6, we know that $u$ and $u^{\boldsymbol\mu}$ coincide on $[S,T]$.
For $t\in[0,T]$ and $\alpha_0\in A$, using the constant control $(\alpha_s=\alpha_0)_{t\le s\le T}$ in the right-hand side of (7.32), we easily get an upper bound for $u^{\boldsymbol\mu}$, and thus for $u$, on $[S,T]\times E$, the bound being independent of $S$.
In order to establish the lower bound, we make use of the convexity of $f$:
$$\begin{aligned} u^{\boldsymbol\mu}(t,x) &\ge \mathbb{E}\Big[\int_t^T\Big( f(s,X_s,\mu_s,\alpha_0) + \partial_\alpha f(s,X_s,\mu_s,\alpha_0)\cdot\big(\hat\alpha\big(s,X_s,\mu_s,u^{\boldsymbol\mu}(s,\cdot)\big)-\alpha_0\big)\Big)\,ds + g(X_T,\mu_T)\,\Big|\,X_t=x\Big]\\ &\ge -C\Big(1+\mathbb{E}\Big[\int_t^T\big|\hat\alpha\big(s,X_s,\mu_s,u^{\boldsymbol\mu}(s,\cdot)\big)\big|\,ds\,\Big|\,X_t=x\Big]\Big), \end{aligned}$$
where $C>0$ depends upon $\alpha_0$ and the three maxima $\max_{t\in[0,T],x'\in E}|f(t,x',\mu_t,\alpha_0)|$, $\max_{t\in[0,T],x'\in E}|\partial_\alpha f(t,x',\mu_t,\alpha_0)|$, and $\max_{x'\in E}|g(x',\mu_T)|$.
where we use the notation $(\hat q_t)_{t\ge0}$ for the Q-matrix of transition rates of the optimal continuous time Markov chain identified in the above verification theorem. In other words:
$$\hat q_t(x,x') = \lambda_t\big(x,x',\mu_t,\hat\alpha(t,x,\mu_t,u(t,\cdot))\big).\qquad(7.37)$$
As for the stochastic differential games studied in this book, mean field game Nash equilibria are then identified with the solutions of the system of Kolmogorov and Hamilton-Jacobi-Bellman equations (7.33) and (7.36).
for all $\mu,\mu'\in\mathcal{P}(E)$, and that the running cost function $f$ has a decomposition of the form:
$$f(t,x,\mu,\alpha) = f_0(t,x,\mu) + f_1(t,x,\alpha),$$
with $f_0(t,\cdot,\cdot)$ monotone for each fixed $t\in[0,T]$. A crucial ingredient in the proof of Theorem 3.29 is the fact that the drift and the volatility are independent of the flow $\boldsymbol\mu$, so that, given an admissible control strategy $\alpha$, the law of the controlled process is entirely determined, irrespective of the flow $\boldsymbol\mu$. This is still the case in the present situation if, on top of the above conditions, we assume that the jump rates do not depend on the measure argument, namely $\lambda_t(x,x',\mu,\alpha)=\lambda_t(x,x',\alpha)$. Under these conditions, we can repeat the proof of Theorem 3.29 mutatis mutandis in the present situation and prove uniqueness.
states makes it possible to discuss the master equation without the heavy lifting of the second part of the second volume.
We consider a real valued function $\mathcal{U}$ defined on $[0,T]\times E\times\mathcal{P}(E)$, which is assumed to be functionally differentiable in the variable $\mu\in\mathcal{P}(E)$ for every fixed $(t,x)\in[0,T]\times E$. Recall that $\mathcal{P}(E)$ can be identified with the simplex $\mathcal{S}_d$ via the correspondence $\mu\leftrightarrow p=(p_1,\dots,p_d)$ with $p_i=\mu(\{e_i\})$ for $i=1,\dots,d$, or in other words, $\mu=\sum_{i=1}^d p_i\delta_{e_i}$. Given this identification, Proposition 5.66 explains how one goes from the standard Euclidean partial derivatives $\partial\mathcal{U}/\partial p_i$ to the linear functional derivative $\delta\mathcal{U}/\delta\mu$ when $\mathcal{U}$ is defined on the whole $\mathcal{P}_2(\mathbb{R}^d)$. If that were to be the case, we would be able to use this identification to express the master equation in terms of this functional derivative.
In the present context of finite state mean field games, and in analogy with the master equation (5.117) for stochastic differential mean field games without common noise introduced in Chapter 5, the master equation here takes the form:
$$\partial_t\mathcal{U}(t,x,\mu) + H\big(t,x,\mu,\mathcal{U}(t,\cdot,\mu)\big) + \sum_{x'\in E} h\big(t,\mu,\mathcal{U}(t,\cdot,\mu)\big)(x')\,\frac{\partial\mathcal{U}(t,x,\mu)}{\partial\mu(\{x'\})} = 0.\qquad(7.38)$$
In (7.38), we identify the difference:
$$\frac{\partial\mathcal{U}(t,x,\mu)}{\partial\mu(\{x'\})} - \frac{\partial\mathcal{U}(t,x,\mu)}{\partial\mu(\{x\})},$$
for $x'\ne x$, with the partial derivative of $\mathcal{U}(t,x,\mu)$ with respect to $\mu(\{x'\})$ whenever $\mathcal{U}(t,x,\cdot)$ is regarded as a smooth function of the $(d-1)$-tuple $(\mu(\{x'\}))_{x'\in E\setminus\{x\}}$, which we can see as an element of the $(d-1)$-dimensional domain:
$$\mathcal{S}_{d-1,\le} = \Big\{(p_1,\dots,p_{d-1})\in[0,1]^{d-1} :\ \sum_{i=1}^{d-1} p_i\le1\Big\}.$$
Proposition 7.7 Let us assume that $\mathcal{U}$ is a real valued function defined on $[0,T]\times E\times\mathcal{P}(E)$ which solves equation (7.38) with terminal condition $\mathcal{U}(T,x,\mu)=g(x,\mu)$ for $(x,\mu)\in E\times\mathcal{P}(E)$. If $\boldsymbol\mu : [0,T]\ni t\mapsto\mu_t\in\mathcal{P}(E)$ is the solution of the ordinary differential equation:
$$\partial_t\mu_t(\{x\}) = h\big(t,\mu_t,\mathcal{U}(t,\cdot,\mu_t)\big)(x),\quad x\in E,\qquad(7.41)$$
with a given initial condition $\mu_0$, then the function $u : [0,T]\times E\ni(t,x)\mapsto u(t,x)=\mathcal{U}(t,x,\mu_t)\in\mathbb{R}$ solves the HJB equation (7.33) for the flow $\boldsymbol\mu$, and appears as the value function of the optimization problem in the environment $\boldsymbol\mu$. Also, (7.41) can be identified with the Kolmogorov equation (7.36). Consequently, $\boldsymbol\mu$ is an equilibrium.
The proof is immediate in the present situation: we can use the chain rule to compute $\partial_t u(t,x)$, and the fact that $\mathcal{U}$ satisfies the master equation (7.38), together with the fact that $\boldsymbol\mu$ satisfies (7.41), imply the desired result. The claim that (7.41) identifies with the Kolmogorov equation (7.36) is justified by (7.37).
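As a quick illustration of Proposition 7.7 in action, once a master-equation solution is available, the equilibrium flow can be recovered by integrating (7.41) forward in time. The sketch below is ours: the callable `h_drift(t, mu)` is a hypothetical placeholder for the map $h(t,\mu,\mathcal{U}(t,\cdot,\mu))$, not tied to a specific model.

import numpy as np

# A minimal sketch of Proposition 7.7: recover the equilibrium flow mu_t by
# forward Euler on the ODE (7.41), given a drift built from a solved master equation.
def equilibrium_flow(mu0, h_drift, T=1.0, n=1000):
    """h_drift(t, mu) returns the vector (h(t, mu, U(t,., mu))(x))_{x in E}."""
    dt = T / n
    mu = np.array(mu0, dtype=float)
    path = [mu.copy()]
    for k in range(n):
        mu = mu + dt * h_drift(k * dt, mu)   # d/dt mu_t({x}) = h(t, mu_t, ...)(x)
        path.append(mu.copy())
    return np.array(path)

# Example with the two-state game studied later in this section, where the
# drift reduces to (-beta*mu({0}), +beta*mu({0})):
beta = 2.0
flow = equilibrium_flow([0.7, 0.3], lambda t, mu: np.array([-beta * mu[0], beta * mu[0]]))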
For the sake of illustration, we propose a first analysis of the Botnet defense model alluded to in Subsection 1.6.2 of Chapter 1. There, we introduced the model as an instance of a game with major and minor players. Here, we exogenize the behavior of the major player and we concentrate on the population of potential victims, assuming that the hacker has already chosen its strategy and that all the players know it. We shall consider the full model with an attacker and a field of targets in Subsection (Vol II)-7.1.9 of Chapter (Vol II)-7.
We first specify the state space and the transition rates for the dynamics of the states of the potential victims, as well as their cost functions. We refer to the bibliography cited in the Notes & Complements below for a complete account of this model. We assume that each vulnerable computer can be in one of the following $d=4$ states: defended and infected ($DI$), defended and susceptible ($DS$), unprotected and infected ($UI$), and unprotected and susceptible ($US$).
So $E=\{DI,DS,UI,US\}$. In this simplistic model, the rate $\lambda_t$ is independent of $t$, and each network computer owner can choose one of two actions, that is $A=\{0,1\}$. Action 0 means that the computer owner is happy with its level of defense (Defended or Unprotected) and does not try to change its own state, while action 1 means that the computer owner is willing to update the level of protection of its computer and switch to the other state (Unprotected or Defended). In the latter case, updating occurs after an exponential time with parameter $\rho>0$, which accounts for the speed of the response of the defense system.
When infected, a computer may recover at a rate depending on its protection level: the recovery rate is denoted by $q^D_{\mathrm{rec}}$ for a protected computer and by $q^U_{\mathrm{rec}}$ for an unprotected one.
Conversely, a computer may become infected in two ways, either directly from the attacks of the hacker or indirectly from infected computers that spread the infection. The rate of direct infection depends upon the intensity of the attacks, as fixed by the botnet herder. This intensity is denoted by $v_H$, and the rate of direct infection of a protected computer is $v_H q^D_{\mathrm{inf}}$ while the rate of direct infection of an unprotected computer is $v_H q^U_{\mathrm{inf}}$. Also, the rates of infection spreading from infected to susceptible computers depend upon the distribution of states within the population of computers. The rate of infection of an unprotected susceptible computer by other unprotected infected computers is $\beta_{UU}\,\mu(\{UI\})$, the rate of infection of a protected susceptible computer by other unprotected infected computers is $\beta_{UD}\,\mu(\{UI\})$, and similarly, infections spread by protected infected computers occur at the rates $\beta_{DU}\,\mu(\{DI\})$ and $\beta_{DD}\,\mu(\{DI\})$ for unprotected and protected susceptible computers respectively. The resulting transition rates are:
$$\lambda(\cdot,\cdot,\mu,v_H,0) = \begin{array}{c} DI \\ DS \\ UI \\ US \end{array}\!\!\begin{pmatrix} \ast & q^D_{\mathrm{rec}} & 0 & 0\\ v_H q^D_{\mathrm{inf}}+\beta_{DD}\mu(\{DI\})+\beta_{UD}\mu(\{UI\}) & \ast & 0 & 0\\ 0 & 0 & \ast & q^U_{\mathrm{rec}}\\ 0 & 0 & v_H q^U_{\mathrm{inf}}+\beta_{UU}\mu(\{UI\})+\beta_{DU}\mu(\{DI\}) & \ast \end{pmatrix}$$
and:
$$\lambda(\cdot,\cdot,\mu,v_H,1) = \begin{array}{c} DI \\ DS \\ UI \\ US \end{array}\!\!\begin{pmatrix} \ast & q^D_{\mathrm{rec}} & \rho & 0\\ v_H q^D_{\mathrm{inf}}+\beta_{DD}\mu(\{DI\})+\beta_{UD}\mu(\{UI\}) & \ast & 0 & \rho\\ \rho & 0 & \ast & q^U_{\mathrm{rec}}\\ 0 & \rho & v_H q^U_{\mathrm{inf}}+\beta_{UU}\mu(\{UI\})+\beta_{DU}\mu(\{DI\}) & \ast \end{pmatrix}$$
where the rows and columns are ordered as $DI$, $DS$, $UI$, $US$, and where each diagonal placeholder $\ast$ should be replaced by the negative of the sum of the other entries of its row.
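This bookkeeping is easy to assemble programmatically. The following sketch is our own illustration of the tables above (the state ordering, the parameter dictionary, and the symbol for the response speed are our choices, not the authors'):

import numpy as np

DI, DS, UI, US = 0, 1, 2, 3   # state ordering as in the tables above

def botnet_rates(mu, vH, action, p):
    """Q-matrix lambda(., ., mu, vH, action) of the botnet model (our sketch)."""
    L = np.zeros((4, 4))
    L[DI, DS] = p['qDrec']                       # recovery, defended
    L[UI, US] = p['qUrec']                       # recovery, unprotected
    L[DS, DI] = vH * p['qDinf'] + p['bDD'] * mu[DI] + p['bUD'] * mu[UI]
    L[US, UI] = vH * p['qUinf'] + p['bUU'] * mu[UI] + p['bDU'] * mu[DI]
    if action == 1:                              # switch protection level at rate rho
        L[DI, UI] = L[UI, DI] = L[DS, US] = L[US, DS] = p['rho']
    np.fill_diagonal(L, -L.sum(axis=1))          # diagonal: minus the row sums
    return L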
As explained earlier, we do not specify the dynamics nor the state of the attacker in this first form of the model. For the present purposes, it suffices to know the value of the intensity of the attacks, here given by $v_H$. Notice also that, in the current form of the model, the rate $\lambda$ not only depends on the action of the typical computer owner, but also on the intensity of the attacks. Each computer owner pays a fee $k_D$ per unit of time for the defense of its system, and $k_I$ per unit of time for losses resulting from infection. So, if we denote by $X_t$ the state of its computer at time $t$, and by $\alpha=(\alpha_t)_{0\le t\le T}$ its control, the expected cost to a typical computer owner is given by:
$$J(\alpha) = \mathbb{E}\Big[\int_0^T\big(k_D\mathbf{1}_D + k_I\mathbf{1}_I\big)(X_t)\,dt\Big],$$
where $\mathbf{1}_D$ and $\mathbf{1}_I$ denote the indicator functions of the defended states $\{DI,DS\}$ and of the infected states $\{DI,UI\}$ respectively.
Numerical Implementation
For the purpose of illustration, we provide numerical results from a straightforward implementation of the solution of the mean field game of cyber security described above. We chose a time interval $[0,T]$ with $T=10$ (see the comments below for a discussion of this particular choice), which we covered by a regular mesh
Fig. 7.1 Time evolution in equilibrium of the distribution $\mu_t$ of the states of the computers in the network for different initial conditions $\mu_0$: $\mu_0=(0.25,0.25,0.25,0.25)$ in the left plot on the top row; $\mu_0=(1,0,0,0)$ in the right plot on the top row; and $\mu_0=(0,0,0,1)$ in the bottom plot.
$\{t_i\}_{i=0,\dots,N_s}$ with $N_s=10^4$ and $t_i=i\Delta t$ for $\Delta t=10^{-3}$. We implemented the solutions of the HJB equation (7.33) and the Kolmogorov equation (7.36)-(7.37) with straightforward explicit Euler schemes, and we iterated the solutions of these equations to find the fixed point. In the numerical experiments we conducted, the process converged in a very small number of iterations.
We used the following parameters to produce the plots of Figures 7.1 and 7.2: $\beta_{UU}=0.3$, $\beta_{UD}=0.4$, $\beta_{DU}=0.3$, and $\beta_{DD}=0.4$ for the rates of infection; $v_H=0.6$ for the attack intensity of the hacker, $\rho=0.8$ for the speed of response, $q^D_{\mathrm{rec}}=0.5$ and $q^U_{\mathrm{rec}}=0.4$ for the rates of recovery, and $q^D_{\mathrm{inf}}=0.4$ and $q^U_{\mathrm{inf}}=\dots$ for the rates of direct infection. Finally, the constants appearing in the definition of the expected cost were chosen as $k_D=0.3$ for the cost of being defended, and $k_I=0.5$ for the cost of being infected.
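For concreteness, here is a minimal sketch of such a fixed-point iteration, re-assembling the rate matrices from the previous sketch. It is our own illustration, not the authors' code: the terminal cost is taken to be zero and the value used for $q^U_{\mathrm{inf}}$ is our own placeholder.

import numpy as np

# A sketch (ours) of the iterative scheme described above: alternate a backward
# Euler pass on the HJB equation (7.33) and a forward Euler pass on the
# Kolmogorov equation (7.36)-(7.37) until the measure flow stabilizes.
DI, DS, UI, US = 0, 1, 2, 3
p = dict(bUU=0.3, bUD=0.4, bDU=0.3, bDD=0.4, vH=0.6, rho=0.8,
         qDrec=0.5, qUrec=0.4, qDinf=0.4, qUinf=0.3,   # qUinf: placeholder value
         kD=0.3, kI=0.5)
T, Ns = 10.0, 10_000
dt = T / Ns
f = np.array([p['kD'] + p['kI'], p['kD'], p['kI'], 0.0])   # running cost per state

def rates(mu, a):
    L = np.zeros((4, 4))
    L[DI, DS], L[UI, US] = p['qDrec'], p['qUrec']
    L[DS, DI] = p['vH']*p['qDinf'] + p['bDD']*mu[DI] + p['bUD']*mu[UI]
    L[US, UI] = p['vH']*p['qUinf'] + p['bUU']*mu[UI] + p['bDU']*mu[DI]
    if a == 1:
        L[DI, UI] = L[UI, DI] = L[DS, US] = L[US, DS] = p['rho']
    np.fill_diagonal(L, -L.sum(axis=1))
    return L

def solve(mu0, n_iter=20):
    mu = np.tile(mu0, (Ns + 1, 1))                # initial guess for the flow
    for _ in range(n_iter):
        u = np.zeros((Ns + 1, 4))                 # zero terminal cost (assumption)
        phi = np.zeros((Ns + 1, 4), dtype=int)    # optimal action per state
        for n in range(Ns - 1, -1, -1):           # backward Euler for the HJB
            H = np.stack([rates(mu[n], a) @ u[n + 1] + f for a in (0, 1)])
            phi[n], u[n] = H.argmin(axis=0), u[n + 1] + dt * H.min(axis=0)
        new = np.empty_like(mu); new[0] = mu0     # forward Euler for Kolmogorov
        for n in range(Ns):
            Q = np.stack([rates(new[n], phi[n][x])[x] for x in range(4)])
            new[n + 1] = new[n] + dt * new[n] @ Q
        mu = new
    return mu, phi

mu, phi = solve(np.array([0.25, 0.25, 0.25, 0.25]))

When such iterations oscillate instead of converging, as in the second experiment discussed below, a common stabilization is to damp the update of the flow, for instance by averaging successive iterates.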
Fig. 7.2 Time evolution of the optimal feedback function $\varphi(t,\cdot)$ in equilibrium. From left to right and from top to bottom, $\varphi(t,DI)$, $\varphi(t,DS)$, $\varphi(t,UI)$, and $\varphi(t,US)$.
We chose this value of $T$ in order to see that the proportions become constant; larger values of $T$ confirm this fact. We kept $T$ to a reasonably small value to still see the patterns in the left-hand side of the plots. Varying the parameters of the model gives different values for the limiting levels of $\mu_t(\{UI\})$ and $\mu_t(\{US\})$, while $\mu_t(\{DI\})$ and $\mu_t(\{DS\})$ remain constant and equal to 0.
While the interpretation of this invariant distribution is not clear, its existence and the strong evidence for the convergence suggest that a strong ergodicity exists in the model, and that a search for stationary solutions in an analysis of an infinite horizon model, in the spirit of Subsection 7.1.2, is reasonable. See the Notes & Complements at the end of the chapter for references to such an analysis.
For the sake of completeness, we computed and plotted the time evolution of the optimal feedback function $\varphi(t,\cdot)$. Interestingly, irrespective of our choice of initial condition $\mu_0$, or even of the parameters of the model, we found that the function $\varphi(t,\cdot)$ is constant over time, and given by $\varphi(t,DI)=\varphi(t,DS)=1$ and $\varphi(t,UI)=\varphi(t,US)=0$ for all $t\in[0,T]$.
The strong ergodicity which made us believe in the convergence over time of the distribution $\mu_t$ and of the optimal feedback control function $\varphi(t,\cdot)$ toward unique well specified limits does not always hold. There exist combinations of parameters for which several stationary limits are possible, or no stationary limit exists.
For the purpose of illustration, we used the following parameters: $\beta_{UU}=\beta_{UD}=5$ and $\beta_{DU}=\beta_{DD}=2$ for the rates of infection; $q^D_{\mathrm{rec}}=q^U_{\mathrm{rec}}=0.3$ for the rates of recovery, and $q^D_{\mathrm{inf}}=0.3$ and $q^U_{\mathrm{inf}}=0.4$ for the rates of direct infection. Finally, we kept the same attack intensity $v_H=0.6$ and the same time horizon $T=10$, as well as $k_I=1$ for the cost of being infected, but we chose $k_D=0.5385$ for the cost of being defended and $\rho=1000$ for the speed of response. We explain these choices in the Notes & Complements at the end of the chapter.
With these parameters, the iterations of the successive solutions of the HJB equation and the Kolmogorov-Fokker-Planck equation do not behave the same way. Instead of a very fast convergence to what we took as the equilibrium measure flow, we see oscillations questioning the convergence toward a unique equilibrium, which we had previously accepted as a fact. We do not plot the time evolution of the optimal feedback function $\varphi(t,\cdot)$ because it appears to be the same as before. However, we illustrate the behavior of the measure flow through these iterations in Figure 7.3.
For the sake of completeness, we also solved numerically the master equation (7.38). For the purpose of comparison, we still work on the interval $[0,T]$ with $T=10$, but we now use a coarser grid with $N_s=10$ time steps only. We discretize the simplex $\mathcal{S}_3$ with the grid:
$$G = \Big\{\Big(\frac{k_1}{M},\frac{k_2}{M},\frac{k_3}{M}\Big) :\ k_i \text{ integer},\ 0\le k_i\le M,\ i=1,2,3,\ \frac{k_1}{M}+\frac{k_2}{M}+\frac{k_3}{M}\le1\Big\},$$
with $M=25$, and we use backward difference quotients for the derivatives. While we do not know how to
provide an instructive plot of the values of the solution of the master equation (7.38), we checked the consistency of our results by using this solution to derive the equilibrium measure flow $\boldsymbol\mu=(\mu_t)_{0\le t\le T}$ by solving the ordinary differential equation (7.41) from Proposition 7.7. The results are reproduced in Figure 7.4. They were obtained with the parameters used to produce Figure 7.1.
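As a small aside, the grid $G$ above can be enumerated directly; the sketch below is ours (the ordering of the points is arbitrary):

import numpy as np

# A minimal sketch of the grid G on the simplex S_3: triples (k1/M, k2/M, k3/M)
# with k1 + k2 + k3 <= M; the fourth weight is recovered as 1 - p1 - p2 - p3.
def simplex_grid(M):
    return np.array([(k1 / M, k2 / M, k3 / M)
                     for k1 in range(M + 1)
                     for k2 in range(M + 1 - k1)
                     for k3 in range(M + 1 - k1 - k2)])

G = simplex_grid(25)     # M = 25 as in the text
print(len(G))            # binomial(28, 3) = 3276 grid points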
Fig. 7.3 From left to right and from top to bottom, time evolution of the distribution $\mu_t$ for the parameters given in the text, after 1, 5, 20, and 100 iterations of the successive solutions of the HJB equation and the Kolmogorov-Fokker-Planck equation.
Fig. 7.4 Time evolution of the distribution $\mu_t$ computed as the solution of the ordinary differential equation (7.41) from Proposition 7.7, using the numerical solution of the master equation. For the purpose of comparison, we used the initial conditions $\mu_0=(0.25,0.25,0.25,0.25)$ on the left and $\mu_0=(1,0,0,0)$ on the right.
As above, we assume that $A$ is a closed convex subset of the Euclidean space $\mathbb{R}^k$. The typical example we have in mind is the analogue of the model used in Subsection 6.7.2, provided by:
$$f_1(t,x,\alpha) = \frac12|\alpha|^2,\quad t\in[0,T],\ x\in E,\ \alpha\in A.\qquad(7.42)$$
For the sake of convenience, we recall the important definitions and equations. The Hamiltonian $H$ is still given by the same formula:
$$H(t,x,\mu,h,\alpha) = \big[L^{\mu,\alpha}_t h\big](x) + f_0(t,x,\mu) + f_1(t,x,\alpha).$$
So, for a given flow $\boldsymbol\mu=(\mu_t)_{0\le t\le T}$ of measures on $E$, the HJB equation (7.33) reads:
$$0 = \partial_t u(t,x) + H_1\big(t,x,\mu_t,u(t,\cdot)\big) + f_0(t,x,\mu_t),$$
with the same terminal condition as before, where $H_1 := H - f_0$ denotes the reduced part of the Hamiltonian, minimized over $\alpha$ whenever the control argument is omitted. Now, since the sum of the jump rates $\big(\lambda_t(x'',x',\mu,\hat\alpha(t,x'',\mu,\mathcal{U}(t,\cdot,\mu)))\big)_{x'\in E}$ is null, the master equation may be also written:
$$\partial_t\mathcal{U}(t,x,\mu) + H_1\big(t,x,\mu,\mathcal{U}(t,\cdot,\mu)\big) + f_0(t,x,\mu) + \sum_{x',x''\in E}\lambda_t\big(x'',x',\mu,\hat\alpha(t,x'',\mu,\mathcal{U}(t,\cdot,\mu))\big)\,\mu(\{x''\})\Big[\Delta_{x''}\frac{\partial\mathcal{U}(t,x,\mu)}{\partial\mu(\{\cdot\})}\Big](x') = 0.$$
Here, we take $A=[0,\infty)^{d-1}$, where $\alpha(i)$ is the $i$th coordinate of $\alpha=(\alpha(1),\dots,\alpha(d-1))\in[0,\infty)^{d-1}$, for $i\in\{1,\dots,d-1\}$; when the player stands in state $x$, the coordinates of $\alpha$ are indexed by the states $x'\ne x$, and we write $\alpha(x,x')$ for the corresponding coordinate.
In agreement with assumption Discrete MFG Rates, we then assume that the rate function has the following linear form:
$$\lambda_t(x,x',\mu,\alpha) = \alpha(x,x'),\quad t\in[0,T],\ x,x'\in E,\ x'\ne x,\ \mu\in\mathcal{P}(E),\ \alpha\in A.$$
Hence,
$$H_1(t,x,\mu,h,\alpha) = \sum_{x'\in E,\,x'\ne x}\alpha(x,x')\,[\Delta_x h](x') + \frac12\sum_{x'\in E,\,x'\ne x}|\alpha(x,x')|^2,$$
whose infimum over $\alpha\in A$ is attained at $\hat\alpha(x,x')=[h(x)-h(x')]_+$, so that:
$$\inf_{\alpha\in A} H_1(t,x,\mu,h,\alpha) = -\frac12\sum_{x'\in E,\,x'\ne x}\big[h(x)-h(x')\big]_+^2 = -\frac12\sum_{x'\in E}\big[h(x)-h(x')\big]_+^2.$$
As a result, for any given flow $\boldsymbol\mu=(\mu_t)_{0\le t\le T}$ of measures on $E$, the HJB equation (7.33) reads:
$$\partial_t u^{\boldsymbol\mu}(t,x) - \frac12\sum_{x'\in E}\big[u^{\boldsymbol\mu}(t,x)-u^{\boldsymbol\mu}(t,x')\big]_+^2 + f_0(t,x,\mu_t) = 0,$$
while the master equation takes the form:
$$\partial_t\mathcal{U}(t,x,\mu) - \frac12\sum_{x'\in E}\big[\mathcal{U}(t,x,\mu)-\mathcal{U}(t,x',\mu)\big]_+^2 + f_0(t,x,\mu) + \sum_{x',x''\in E}\mu(\{x''\})\big[\mathcal{U}(t,x'',\mu)-\mathcal{U}(t,x',\mu)\big]_+\Big[\Delta_{x''}\frac{\partial\mathcal{U}(t,x,\mu)}{\partial\mu(\{\cdot\})}\Big](x') = 0,$$
or, equivalently,
$$\partial_t\mathcal{U}(t,x,\mu) - \frac12\sum_{x'\in E}\big[\mathcal{U}(t,x,\mu)-\mathcal{U}(t,x',\mu)\big]_+^2 + f_0(t,x,\mu)\qquad(7.45)$$
$$+ \sum_{x',x''\in E}\mu(\{x''\})\big[\mathcal{U}(t,x'',\mu)-\mathcal{U}(t,x',\mu)\big]_+\Big(\frac{\partial\mathcal{U}(t,x,\mu)}{\partial\mu(\{x'\})}-\frac{\partial\mathcal{U}(t,x,\mu)}{\partial\mu(\{x''\})}\Big) = 0.$$
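The positive-part structure above is easy to exercise numerically. The following small sketch (ours, with an arbitrary illustrative vector $h$) computes the optimal jump rates and the minimized reduced Hamiltonian:

import numpy as np

# A minimal sketch of the quadratic-cost formulas above: optimal jump rates
# alpha_hat(x, x') = [h(x) - h(x')]_+ and the minimized reduced Hamiltonian
# inf_alpha H_1 = -(1/2) * sum_{x'} [h(x) - h(x')]_+^2, one value per state x.
def optimal_rates(h):
    a = np.maximum(h[:, None] - h[None, :], 0.0)
    np.fill_diagonal(a, 0.0)     # no jump from a state to itself
    return a

def minimized_H1(h):
    return -0.5 * (optimal_rates(h) ** 2).sum(axis=1)

h = np.array([1.0, 0.2, 0.5])
print(optimal_rates(h))          # e.g., rate 0.8 from state 0 to state 1
print(minimized_H1(h))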
We now assume that there exist two functions $F$ and $G$, defined on $[0,T]\times\mathcal{P}(E)$ and on $\mathcal{P}(E)$ respectively, which admit linear functional derivatives giving the running and terminal cost functions of the game, in the sense that:
$$f_0(t,x,\mu) = \frac{\partial F(t,\mu)}{\partial\mu(\{x\})},\quad\text{and}\quad g(x,\mu) = \frac{\partial G(\mu)}{\partial\mu(\{x\})},$$
for $t\in[0,T]$, $x\in E$, and $\mu\in\mathcal{P}(E)$. In the present context, the central planner problem is the following optimal control problem of the McKean-Vlasov type:
$$\inf_{\alpha\in\mathbb{A}}\mathbb{E}\Big[\int_0^T\Big( f_1(t,X_t,\alpha_t) + F\big(t,\mathcal{L}(X_t)\big)\Big)\,dt + G\big(\mathcal{L}(X_T)\big)\Big],$$
with $f_1$ as in (7.42), where the infimum is taken over the set of admissible Markov strategies $\alpha_t=\varphi(t,X_t)=(\varphi_1(t,X_t),\dots,\varphi_{d-1}(t,X_t))$ with values in $[0,\infty)^{d-1}$, and where $X=X^\alpha=(X_t)_{0\le t\le T}$ is the inhomogeneous continuous time $E$-valued Markov process with transition rates given by (7.44), namely by the Q-matrices:
$$q_t(x,x') = \begin{cases} \varphi_{x'}(t,x) & \text{if } x'\ne x,\\ -\displaystyle\sum_{x''\in E,\,x''\ne x}\varphi_{x''}(t,x) & \text{if } x'=x, \end{cases}$$
where $\varphi_{x'}(t,x)$ denotes the coordinate of $\varphi(t,x)$ labeled by the state $x'$.
Motivated by the result of Subsection 6.7.2, we try to prove that the solution of this optimal control problem provides a solution to the mean field game problem considered in this section.
We solve the McKean-Vlasov control problem by writing its HJB equation and identifying the optimal control. Once this is done, we check that the optimal control
provides the solution of the original mean field game problem defined by the running and terminal cost functions $f$ and $g$. The Hamiltonian $\mathcal{H}$ of the McKean-Vlasov control problem is given by:
$$\begin{aligned} \mathcal{H}(t,\mu,h,\varphi) &= \langle L^\varphi_t h,\mu\rangle + \langle f_1(t,\cdot,\varphi(\cdot)),\mu\rangle + F(t,\mu)\\ &= \sum_{x\in E}\mu(\{x\})\Big[\sum_{x'\in E,\,x'\ne x}\Big(\varphi(t,x,x')\,[\Delta_x h](x') + \frac12|\varphi(t,x,x')|^2\Big)\Big] + F(t,\mu), \end{aligned}$$
and, since we can minimize term by term, the minimized Hamiltonian is given by:
$$\mathcal{H}(t,\mu,h) = -\frac12\sum_{x\in E}\mu(\{x\})\sum_{x'\in E,\,x'\ne x}\big[h(x)-h(x')\big]_+^2 + F(t,\mu).\qquad(7.46)$$
Using the notation $v(t,\mu)$ for the value function of the resulting deterministic control problem, the HJB equation is given by:
$$\partial_t v(t,\mu) - \frac12\sum_{x\in E}\mu(\{x\})\sum_{x'\in E}\Big[\frac{\partial v(t,\mu)}{\partial\mu(\{x\})}-\frac{\partial v(t,\mu)}{\partial\mu(\{x'\})}\Big]_+^2 + F(t,\mu) = 0.\qquad(7.47)$$
Again, using Proposition 5.66 and Corollary 5.67, we see that, instead of the standard derivatives of functions defined on $\mathbb{R}^d$, we could as well use the linear functional derivatives of functions defined on $\mathcal{P}_2(\mathbb{R}^d)$, provided that the function $v(t,\cdot)$ has a smooth extension to $\mathcal{P}_2(\mathbb{R}^d)$.
We now prove that the solution of this deterministic control problem can provide
a solution to the original mean field game problem.
Proposition 7.8 Let us assume that the function $[0,T]\times\mathcal{P}(E)\ni(t,\mu)\mapsto v(t,\mu)\in\mathbb{R}$ is twice differentiable with respect to the weights $(\mu(\{x\}))_{x\in E}$ and solves equation (7.47) with terminal condition $v(T,\mu)=G(\mu)$. Then the function $[0,T]\times E\times\mathcal{P}(E)\ni(t,x,\mu)\mapsto\mathcal{U}(t,x,\mu)=[\partial v/\partial\mu(\{x\})](t,\mu)\in\mathbb{R}$ solves the master equation (7.38) with terminal condition $\mathcal{U}(T,x,\mu)=g(x,\mu)$.
Here and below, we use the same rules of differentiation as in the writing of the master equation (7.38).
Proof. We give the proof in broad strokes only. The interested reader can easily fill in the
technical details. Notice first that a standard verification argument can be used to show
that v, as a solution of equation (7.47), is indeed the value function of the deterministic
McKean-Vlasov control problem of interest here. Next, exchanging freely the order of partial
derivatives, we differentiate both sides of (7.47) with respect to $\mu(\{x_0\})$ for some $x_0\in E$ and obtain:
$$\begin{aligned} &\partial_t\frac{\partial v(t,\mu)}{\partial\mu(\{x_0\})} - \frac12\sum_{x'\in E}\Big[\frac{\partial v(t,\mu)}{\partial\mu(\{x_0\})}-\frac{\partial v(t,\mu)}{\partial\mu(\{x'\})}\Big]_+^2 + \frac{\partial F(t,\mu)}{\partial\mu(\{x_0\})}\\ &\quad - \sum_{x\in E}\mu(\{x\})\sum_{x'\in E}\Big(\frac{\partial^2 v(t,\mu)}{\partial\mu(\{x\})\partial\mu(\{x_0\})}-\frac{\partial^2 v(t,\mu)}{\partial\mu(\{x'\})\partial\mu(\{x_0\})}\Big)\Big[\frac{\partial v(t,\mu)}{\partial\mu(\{x\})}-\frac{\partial v(t,\mu)}{\partial\mu(\{x'\})}\Big]_+ = 0. \end{aligned}$$
Setting:
$$\mathcal{U}(t,x,\mu) = \frac{\partial v(t,\mu)}{\partial\mu(\{x\})},\qquad f_0(t,x,\mu) = \frac{\partial F(t,\mu)}{\partial\mu(\{x\})},$$
and identifying $x$ with $x_0$, we recover (7.45). $\square$
Similarly, we denote by $\mathcal{V}_-(x)$ the subset of $E\setminus\{x\}$ of nodes $x'$ for which there exists a directed edge from $x'$ to $x$. Notice that the case studied earlier corresponds to $\mathcal{V}_+(x)=\mathcal{V}_-(x)=E\setminus\{x\}$ for all $x\in E$. It is plain to port all the results proven above to the set-up of mean field games on directed graphs (including potential games and central planner optimization problems) using (7.48). For example, the Kolmogorov equation (7.36) rewrites:
$$\partial_t\mu_t(\{x\}) = \sum_{x'\in\mathcal{V}_-(x)}\mu_t(\{x'\})\,q_t(x',x) - \sum_{x'\in\mathcal{V}_+(x)}\mu_t(\{x\})\,q_t(x,x').\qquad(7.49)$$
The goal of this subsection is to provide an example of a mean field game with
finitely many states for which the associated N-player game has a Nash equilibrium
that does not converge to the solution of the mean field game.
Throughout the subsection, we consider the following example with $T=1$, $E=A=\{0,1\}$ and, for $t\in[0,1]$, $x,x',\alpha\in E$ and $\mu\in\mathcal{P}(E)$:
$$\lambda_t(x,x',\mu,\alpha) = \beta\,\mathbf{1}_{\{x'=\alpha\}},\quad x'\ne x,\qquad \lambda_t(x,x,\mu,\alpha) = -\beta\,\mathbf{1}_{\{x=1-\alpha\}},$$
for some $\beta>0$, which has a simple interpretation: the next state $x'$ coincides with the action $\alpha$. Also, we let:
$$f(t,x,\mu,\alpha) = \mathbf{1}_{\{x=0\}},\qquad g(x,\mu) = 2\mu(\{1\}).$$
In that case, the dynamics of the representative player may be easily described by means of a Poisson process with intensity $\beta$. At any occurrence $t$ of the Poisson process, the player jumps to the state $\alpha_t$ if it differs from its current position and, otherwise, stays in its current state. Here, $(\alpha_t)_{0\le t\le1}$ denotes the control process chosen by the representative player.
which shows that, for a given initial condition $\mu_0\in\mathcal{P}(E)$, the flow $\boldsymbol\mu=(\mu_t)_{0\le t\le1}$ is a solution to the mean field game problem if and only if
$$\mu_t(\{1\}) = \mu_0(\{0\})\big(1-\exp(-\beta t)\big) + \mu_0(\{1\}) = 1 - \mu_0(\{0\})\exp(-\beta t),\quad t\in[0,1].\qquad(7.50)$$
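A quick numerical sanity check of (7.50) (a sketch of ours, with the arbitrary illustrative values $\beta=2$ and $\mu_0(\{0\})=0.7$): integrating $\partial_t\mu_t(\{1\})=\beta\mu_t(\{0\})$ by forward Euler reproduces the closed form.

import numpy as np

# Forward Euler for d/dt mu_t({1}) = beta * (1 - mu_t({1})), since everyone
# plays alpha = 1; compare against the closed form 1 - mu_0({0}) * exp(-beta*t).
beta, mu0_0, n = 2.0, 0.7, 10_000
dt = 1.0 / n
m1 = 1.0 - mu0_0
for _ in range(n):
    m1 += dt * beta * (1.0 - m1)
closed = 1.0 - mu0_0 * np.exp(-beta * 1.0)
print(abs(m1 - closed))              # O(dt), about 1e-4 here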
HJB Equation. The value function may be retrieved by writing down the HJB equation. Here, the Hamiltonian is independent of $\mu$ and reads:
$$H(t,0,h,\alpha) = \beta\big(h(1)-h(0)\big)\mathbf{1}_{\{\alpha=1\}} + 1,\qquad H(t,1,h,\alpha) = \beta\big(h(0)-h(1)\big)\mathbf{1}_{\{\alpha=0\}}.$$
Therefore,
$$\hat\alpha(t,0,h) = \begin{cases} 1 & \text{if } h(1)-h(0)<0,\\ 0 & \text{if } h(1)-h(0)>0, \end{cases}\qquad(7.51)$$
and
$$\hat\alpha(t,1,h) = \begin{cases} 0 & \text{if } h(0)-h(1)<0,\\ 1 & \text{if } h(0)-h(1)>0, \end{cases}\qquad(7.52)$$
and
$$\frac{d}{dt}u^{\boldsymbol\mu}(t,1) + \inf_{\alpha\in A}H\big(t,1,u^{\boldsymbol\mu}(t,\cdot),\alpha\big) = \frac{d}{dt}u^{\boldsymbol\mu}(t,1) = 0.$$
Observing in particular that $u^{\boldsymbol\mu}(t,0)>u^{\boldsymbol\mu}(t,1)$ for all $t\in[0,T)$, we recover from (7.51) and (7.52) the fact that, whatever the initial point and the initial distribution, the optimal strategy is given by the constant control strategy 1.
$$\frac{\partial\mathcal{U}(t,0,\mu)}{\partial\mu(\{0\})} - \frac{\partial\mathcal{U}(t,0,\mu)}{\partial\mu(\{1\})} = -2\exp(-\beta(1-t)),\qquad \frac{\partial\mathcal{U}(t,1,\mu)}{\partial\mu(\{0\})} - \frac{\partial\mathcal{U}(t,1,\mu)}{\partial\mu(\{1\})} = -2\exp(-\beta(1-t)),$$
while
$$h(\mu)(0) = -\beta\mu(\{0\}),\qquad h(\mu)(1) = \beta\mu(\{0\}).$$
Therefore,
$$\frac{\partial\mathcal{U}(t,0,\mu)}{\partial\mu(\{0\})}h(\mu)(0) + \frac{\partial\mathcal{U}(t,0,\mu)}{\partial\mu(\{1\})}h(\mu)(1) = \Big(\frac{\partial\mathcal{U}(t,0,\mu)}{\partial\mu(\{0\})} - \frac{\partial\mathcal{U}(t,0,\mu)}{\partial\mu(\{1\})}\Big)h(\mu)(0) = 2\beta\mu(\{0\})\exp(-\beta(1-t)),$$
and, similarly,
$$\frac{\partial\mathcal{U}(t,1,\mu)}{\partial\mu(\{0\})}h(\mu)(0) + \frac{\partial\mathcal{U}(t,1,\mu)}{\partial\mu(\{1\})}h(\mu)(1) = \Big(\frac{\partial\mathcal{U}(t,1,\mu)}{\partial\mu(\{0\})} - \frac{\partial\mathcal{U}(t,1,\mu)}{\partial\mu(\{1\})}\Big)h(\mu)(0) = 2\beta\mu(\{0\})\exp(-\beta(1-t)).$$
Then, with the same computation as above, we have:
$$\begin{aligned} &\frac{d}{dt}\mathcal{U}(t,0,\mu) + H\big(t,0,\mathcal{U}(t,\cdot,\mu)\big) + \frac{\partial\mathcal{U}(t,0,\mu)}{\partial\mu(\{0\})}h(\mu)(0) + \frac{\partial\mathcal{U}(t,0,\mu)}{\partial\mu(\{1\})}h(\mu)(1)\\ &= \Big(-\exp(-\beta(1-t)) - 2\beta\mu(\{0\})\exp(-\beta(1-t))\Big) - \big(1-\exp(-\beta(1-t))\big) + 1 + 2\beta\mu(\{0\})\exp(-\beta(1-t))\\ &= 0, \end{aligned}$$
and, similarly,
$$\begin{aligned} &\frac{d}{dt}\mathcal{U}(t,1,\mu) + H\big(t,1,\mathcal{U}(t,\cdot,\mu)\big) + \frac{\partial\mathcal{U}(t,1,\mu)}{\partial\mu(\{0\})}h(\mu)(0) + \frac{\partial\mathcal{U}(t,1,\mu)}{\partial\mu(\{1\})}h(\mu)(1)\\ &= -2\beta\mu(\{0\})\exp(-\beta(1-t)) + 2\beta\mu(\{0\})\exp(-\beta(1-t)) = 0, \end{aligned}$$
where $H$ denotes the minimized Hamiltonian,
where $\bar X^{N,i}_t = \frac{1}{N-1}\sum_{j=1,\,j\ne i}^N X^{N,j}_t$.
Notice first that the constant strategies $\varphi^{N,i}\equiv1$, for $i=1,\dots,N$, form a Markovian Nash equilibrium. Indeed, if all the players $j\ne i$ use such a constant strategy, then $(\mathbb{E}[\bar X^{N,i}_t])_{0\le t\le1}$ is independent of $\varphi^{N,i}$, so that the expected cost to player $i$ is, up to an additive constant which is independent of $\varphi^{N,i}$, its expected time spent in state 0, which is minimized by using the strategy $\varphi^{N,i}\equiv1$. This proves that we have indeed identified a Nash equilibrium.
However, this example is especially interesting because, in some cases, one can identify another explicit Nash equilibrium. Consider the strategy profiles:
$$\varphi^{N,i}(x^1,\dots,x^N) = \begin{cases} 1 & \text{if } \bar x^{N,i}>0,\\ 0 & \text{if } \bar x^{N,i}=0, \end{cases}\qquad \bar x^{N,i} = \frac{1}{N-1}\sum_{j=1,\,j\ne i}^N x^j,$$
which gives the proportion of players different from $i$ in state 1 in the system described by the state $(x^1,\dots,x^N)\in\{0,1\}^N$.
We claim:
Lemma 7.9 There exists $\beta_0>0$ such that, for $\beta\ge\beta_0$, there exists $N_0(\beta)\ge2$ such that, for $N\ge N_0(\beta)$, the functions $(\varphi^{N,i})_{1\le i\le N}$ form a Markovian Nash equilibrium.
We write $v^{N,1}(t,x)$ for $J^{N,1}(\boldsymbol\varphi^N)$ when the initial condition at time $t$ of the system is a deterministic tuple $x=(x^1,\dots,x^N)\in\{0,1\}^N$.
Throughout the proof, we denote by $(\varrho_n)_{n\ge0}$ a Poisson process with intensity $N\beta$, modeling the possible jump times in the game, at least up until time 1. Precisely, players cannot jump at any time $s\notin\{\varrho_n,\,n\ge1\}$. At any time $\varrho_n$, an index $I_n\in\{1,\dots,N\}$ is selected, and player $I_n$ jumps if allowed by its own control strategy.
Obviously, the sequences $(I_n)_{n\ge1}$ and $(\varrho_n)_{n\ge0}$ are independent, and the variables $(I_n)_{n\ge1}$ are also independent of one another. For any $n\ge1$, $I_n$ is uniformly distributed over $\{1,\dots,N\}$.
Also, throughout the proof, we denote by $((Y^{N,i}_s)_{t\le s\le1})_{1\le i\le N}$ a family of $N$ independent Markov processes with values in $\{0,1\}$, with the prescription that each $(Y^{N,i}_s)_{t\le s\le1}$ starts from $X^{N,i}_t$, cannot exit from state 1, and jumps from state 0 to state 1 at the first time $\varrho_n$ with $I_n=i$. In words, the jump times of $Y^{N,i}$, for $i=1,\dots,N$, are dictated by the same Poisson process as the jump times of $X^{N,i}$. For any $s\in[t,1]$, we let $\bar Y^{N,1}_s = \frac{1}{N-1}\sum_{j=2}^N Y^{N,j}_s$.
The strategy of proof is to compare $v^{N,1}(t,(1,x^{-1}))$ and $v^{N,1}(t,(0,x^{-1}))$ for all the possible values of $x^{-1}$.
b. If player 1 starts from 0 at time $t$, it will stay in 0 up until $\tau_1$, where $\tau_1$ is the first jump time of player 1. Since all the players implement strategy 1, we get:
$$v^{N,1}\big(t,(0,x^{-1})\big) = \mathbb{E}\Big[\int_t^1\mathbf{1}_{\{X^{N,1}_s=0\}}\,ds + 2\bar X^{N,1}_1\Big] \ge \mathbb{E}\big[1\wedge\tau_1 - t\big] + 2\,\mathbb{E}\big[\bar Y^{N,1}_1\big],$$
which is strictly greater than $v^{N,1}(t,(1,x^{-1}))$. As a result, we have:
$$\varphi^{N,1}(1,x^{-1}) = 1 = \mathrm{sign}\Big(v^{N,1}\big(t,(0,x^{-1})\big) - v^{N,1}\big(t,(1,x^{-1})\big)\Big),\qquad \varphi^{N,1}(0,x^{-1}) = 1 = \mathrm{sign}\Big(v^{N,1}\big(t,(0,x^{-1})\big) - v^{N,1}\big(t,(1,x^{-1})\big)\Big),\qquad(7.55)$$
at least when $\bar x^1\ge\frac2N$.
Second Case. Assume now that there is exactly one player, say $I_0$, different from 1, that starts from state 1 at time $t$, that is, $\bar x^1=\frac{1}{N-1}$.
a. If player 1 starts from 1, that is $x^1=1$, then all the players implement strategy 1 up until the end of the game. As a result, we have as before:
$$v^{N,1}\big(t,(1,x^{-1})\big) = 2\,\mathbb{E}\big[\bar X^{N,1}_1\big] = 2\,\mathbb{E}\big[\bar Y^{N,1}_1\big],$$
that is:
$$v^{N,1}\big(t,(1,x^{-1})\big) = \frac{2}{N-1} + 2\,\frac{N-2}{N-1}\big(1-\exp(-\beta(1-t))\big).$$
b. If player 1 starts from 0 at time $t$, then all the players except player $I_0$ implement strategy 1 up until the first jump time $\varrho_1$, while player $I_0$ implements strategy 0. At time $\varrho_1$, the system restarts with two particles in state 1 provided that $\varrho_1<1$ and $I_1\ne I_0$. So, on the event $\{\varrho_1<1,\,I_1\ne I_0\}$, the process $(\bar X^{N,1}_s)_{t\le s\le T}$ coincides with $(\bar Y^{N,1}_s)_{t\le s\le T}$. Therefore,
$$v^{N,1}\big(t,(0,x^{-1})\big) \ge 2\,\mathbb{E}\Big[\frac{1}{N-1}\mathbf{1}_{\{\varrho_1\ge1\}} + \mathbf{1}_{\{\varrho_1<1,\,I_1\ne I_0\}}\bar Y^{N,1}_1\Big] + \mathbb{E}\big[1\wedge\tau_1 - t\big],$$
where we may decompose:
$$\frac{1}{N-1}\mathbf{1}_{\{\varrho_1\ge1\}} + \mathbf{1}_{\{\varrho_1<1,\,I_1\ne I_0\}}\bar Y^{N,1}_1 = \big(1-\mathbf{1}_{\{\varrho_1<1,\,I_1=I_0\}}\big)\frac{1}{N-1} + \big(1-\mathbf{1}_{\{\varrho_1<1,\,I_1=I_0\}}\big)\frac{1}{N-1}\sum_{j=2,\,j\ne I_0}^N Y^j_1.\qquad(7.56)$$
1 1 ˇ
P %1 < 1; I1 D I0 D 1 exp.Nˇ.1 t// D .1 t/O /;
N1 N.N 1/ N
and
X
N1 X
N1
1 j 1 1 j
E 1f%1 <1; I1 DI0 g Y1 6 E Y1 j %1 < 1; I1 D I0
N1 N N1
jD2;j6DI0 jD2;j6DI0
1 ˇ
6 1 exp.ˇ.1 t// D .1 t/O ;
N1 N
where we used the Markov property for the Poisson process in the second line.
Inserting the two previous bounds in (7.56) and comparing with v N;1 .t; .1; x1 // obtained
in part a, we deduce that:
1 ˇ
2E 1f%1 >1g C 1f%1 <1; I1 6DI0 g YN 1N;1 > v N;1 t; .1; x1 / .1 t/O :
N1 N
Thus, returning to the lower bound for $v^{N,1}(t,(0,x^{-1}))$, we get:
$$\begin{aligned}
v^{N,1}\big(t,(0,x^{-1})\big) &\geq v^{N,1}\big(t,(1,x^{-1})\big) - (1-t)\,O\Big(\frac{\beta}{N}\Big) + \beta\int_0^1 \big((1-t)\wedge r\big)\exp(-\beta r)\,dr \\
&\geq v^{N,1}\big(t,(1,x^{-1})\big) - (1-t)\,O\Big(\frac{\beta}{N}\Big) + (1-t)\exp(-\beta),
\end{aligned}$$
which shows that, for each fixed $\beta > 1$, we can choose $N$ large enough so that $v^{N,1}(t,(0,x^{-1})) > v^{N,1}(t,(1,x^{-1}))$. Hence the conclusion is the same as in the first case, see (7.55).
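Quantitatively, writing $C$ for the constant hidden in the Landau symbol $O(\cdot)$ above (a notation introduced here for convenience), the last two inequalities combine into:
$$v^{N,1}\big(t,(0,x^{-1})\big) - v^{N,1}\big(t,(1,x^{-1})\big) \geq (1-t)\Big(\exp(-\beta) - C\,\frac{\beta}{N}\Big),$$
which is positive as soon as $N > C\beta\exp(\beta)$. This gives one admissible reading of how large $N_0(\beta)$ must be taken in Lemma 7.9.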
Third Case. We now assume that $\bar x^{-1} = 0$.
a. If player 1 starts from 0, then all the players implement strategy 0 and thus remain in 0. In particular, $v^{N,1}(t,(0,x^{-1})) = 1-t$.
b. Now, if player 1 starts from 1, then all the players except player 1 implement strategy 1, at least up until $\varrho_1$. If $\varrho_1 < 1$ and $I_1 \geq 2$, then there are two players in state 1 at time $\varrho_1$; after $\varrho_1$, all of them implement strategy 1. If $\varrho_1 < 1$ and $I_1 = 1$, then player 1 switches to 0 at time $\varrho_1$ and then all the players remain in 0. So, the cost to player 1 is greater than:
$$2\,\mathbb{E}\big[\mathbf{1}_{\{\varrho_1<1,\,I_1\geq2\}}\,\bar Y_1^{N,1}\big] = 2\,\mathbb{E}\big[\mathbf{1}_{\{I_1\geq2\}}\,\bar Y_1^{N,1}\big] = 2\,\mathbb{E}\big[\bar Y_1^{N,1}\big] - 2\,\mathbb{E}\big[\mathbf{1}_{\{\varrho_1<1,\,I_1=1\}}\,\bar Y_1^{N,1}\big],$$
where, as in the second case, we used the fact that $\mathbf{1}_{\{\varrho_1<1\}}\,\bar Y_1^{N,1} = \bar Y_1^{N,1}$ with probability 1. Now, following the second case again,
$$2\,\mathbb{E}\big[\mathbf{1}_{\{\varrho_1<1,\,I_1=1\}}\,\bar Y_1^{N,1}\big] \leq \frac{2}{N}\,\mathbb{E}\big[\bar Y_1^{N,1}\,\big|\,\varrho_1<1,\,I_1=1\big] \leq \frac{2}{N}\Big(1-\exp\big(-\beta(1-t)\big)\Big).$$
Since all the players different from 1 start from 0, $2\,\mathbb{E}[\bar Y_1^{N,1}] = 2(1-\exp(-\beta(1-t)))$, so that $v^{N,1}(t,(1,x^{-1})) \geq 2(1-\frac{1}{N})(1-\exp(-\beta(1-t)))$. If $\beta(1-t) \geq \ln 3$, this lower bound is at least $\frac{4}{3}(1-\frac{1}{N})$, which is greater than 1 and thus than $v^{N,1}(t,(0,x^{-1}))$ for $N$ large enough. If $\beta(1-t) < \ln 3$, then:
$$v^{N,1}\big(t,(1,x^{-1})\big) \geq 2\Big(1-\frac{1}{N}\Big)\,\beta\int_0^{1-t}\exp(-\beta r)\,dr \geq 2\Big(1-\frac{1}{N}\Big)\exp(-\ln 3)\,\beta\,(1-t).$$
So, choosing $\beta$ and $N$ large enough, it is also greater than $v^{N,1}(t,(0,x^{-1}))$. In any case, the conclusion (7.55) of the first step remains true for well-chosen values of $\beta$ and $N$.
Conclusion. For $\beta$ and $N$ as in the statement, we get that (7.55) is always satisfied, whatever the value of $x$. Also, recall from its definition that $v^{N,1}$ satisfies the backward ODE:
$$\frac{d}{dt}v^{N,1}(t,x) + \beta\sum_{i=1}^N \Big(v^{N,1}\big(t,(1-x_i,x^{-i})\big) - v^{N,1}\big(t,(x_i,x^{-i})\big)\Big)\mathbf{1}_{\{\varphi^{N,i}(x_i,x^{-i})=1-x_i\}} + \mathbf{1}_{\{x_1=0\}} = 0,$$
with $v^{N,1}(1,x) = 2\bar x^{-1}$ as terminal condition. Using (7.55), this may be rewritten as:
$$\begin{aligned}
&\frac{d}{dt}v^{N,1}(t,x) + \mathbf{1}_{\{x_1=0\}} \\
&\quad+ \beta\Big(v^{N,1}\big(t,(1-x_1,x^{-1})\big) - v^{N,1}\big(t,(x_1,x^{-1})\big)\Big)\mathbf{1}_{\{v^{N,1}(t,(1-x_1,x^{-1}))-v^{N,1}(t,(x_1,x^{-1}))<0\}} \\
&\quad+ \beta\sum_{i=2}^N \Big(v^{N,1}\big(t,(1-x_i,x^{-i})\big) - v^{N,1}\big(t,(x_i,x^{-i})\big)\Big)\mathbf{1}_{\{\varphi^{N,i}(x_i,x^{-i})=1-x_i\}} = 0.
\end{aligned}$$
We then recognize on the first two lines the Hamiltonian structure generated by the Hamiltonian $H$ in (7.53). This shows that $v^{N,1}$ is the value function of the optimal control problem characterizing the best response of player 1 when all the others play the strategies $(\varphi^{N,i})_{2\leq i\leq N}$. □
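For concreteness, on this two-state model the Hamiltonian structure just mentioned can be made explicit. Reading off the first two lines of the rewritten equation, the Hamiltonian acts on a function $h : \{0,1\}\to\mathbb{R}$, standing for $v^{N,1}(t,(\cdot,x^{-1}))$, through
$$H\big(t,x_1,h\big) = \mathbf{1}_{\{x_1=0\}} + \beta\min\Big(h(1-x_1)-h(x_1),\,0\Big),$$
the minimum encoding the bang-bang choice of jumping exactly when the discrete derivative of the value function is negative. We stress that this reduced form is our reading of the display above; the general definition of $H$ remains the one in (7.53).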
Conclusion
Remarkably, Lemma 7.9 shows that, whenever the $N$-player game is initialized with $(0,\dots,0)$, the strategy profiles $(\varphi^{N,i})_{1\leq i\leq N}$ force all the players to stay in state 0, while, in the limiting mean field game, the Dirac mass $\delta_0$ at point 0 is not an equilibrium.
Obviously, what happens is that the mean field limit captures the trivial Nash equilibrium obtained by letting all the players in the $N$-player game play strategy 1. In contrast, the mean field formulation cannot keep track of the strategies $(\varphi^{N,i})_{1\leq i\leq N}$ because, in the asymptotic regime, it is no longer possible for the whole population to optimize its response when a single player deviates. In this regard, it is worth observing that the limiting optimal cost in the mean field problem, when initialized at 0, is $2 - O(1/\beta)$, while, when all the players start from 0 and play strategy 0 in the $N$-player game, the cost to any player is 1, which shows that the equilibrium captured by the mean field limit is not the one with the minimal cost.
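To make the comparison quantitative, here is a back-of-the-envelope reconstruction of the orders of magnitude, under the cost structure used in the proof of Lemma 7.9 (running cost 1 per unit of time spent in state 0, terminal cost twice the mass of the population in state 1); this is our own sketch, not a computation from the original analysis. If the whole population switches to state 1 at rate $\beta$, the representative player, starting from 0 with own first jump time $\tau\sim\mathcal{E}(\beta)$, pays
$$\mathbb{E}\big[1\wedge\tau\big] + 2\big(1-e^{-\beta}\big) = \frac{1}{\beta}\big(1-e^{-\beta}\big) + 2\big(1-e^{-\beta}\big),$$
which differs from 2 by a term of order $1/\beta$. By contrast, staying in state 0 against the same population would cost $1 + 2(1-e^{-\beta})$, close to 3, while the trivial profile where nobody moves costs exactly 1.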
It is also worth mentioning that, in appearance, the limiting mean field game possesses all the properties that we shall use in Chapter (Vol II)-6 to prove the convergence of games with finitely many players to mean field games. Notice in particular that the mean field game is uniquely solvable for any initial distribution and that the master equation has a smooth solution. This may seem contradictory with the existence of the extra Nash equilibrium $(\varphi^{N,i})_{1\leq i\leq N}$ since the latter facts are basically the main ingredients used in the analysis performed in Chapter (Vol II)-6.
We now explain why the master equation cannot capture the additional Nash
equilibrium exhibited in the statement of Lemma 7.9.
The Nash System. Following the analysis performed in Chapter (Vol II)-6, we first write down the Nash system for the $N$-player game.
Similar to (2.17), see also (Vol II)-(6.94) together with the proof of Lemma 7.9, the Nash system reads as a system of differential equations with a tuple of functions $(v^{N,i} : [0,T]\times E^N \to \mathbb{R})_{i=1,\dots,N}$ as unknown:
$$\frac{d}{dt}v^{N,i}(t,x) + H\big(t,x_i,v^{N,i}(t,(\cdot,x^{-i}))\big) + \beta\sum_{j=1,\,j\neq i}^N \Big(v^{N,i}\big(t,(1-x_j,x^{-j})\big) - v^{N,i}(t,x)\Big)\mathbf{1}_{\{\varphi^{N,j}(t,x)=1-x_j\}} = 0. \tag{7.57}$$
Lemma 7.10 The functions $(u^{N,i})_{1\leq i\leq N}$ are solutions of the Nash system (7.57).
Of course, Lemma 7.10 should not come as a surprise: It is just a way to rephrase the fact that the constant strategy profile 1 is a Nash equilibrium of the $N$-player game. Actually, its interest is mostly pedagogical as it provides a clear parallel with the statement of Proposition (Vol II)-6.31 in Chapter (Vol II)-6.
Proof. By (7.54),
$$\begin{aligned}
u^{N,i}\big(t,(0,x^{-i})\big) &= \frac{1}{\beta}\Big(1-\exp\big(-\beta(1-t)\big)\Big) + 2 - \frac{2}{N-1}\sum_{j=1,\,j\neq i}^N (1-x_j)\exp\big(-\beta(1-t)\big), \\
u^{N,i}\big(t,(1,x^{-i})\big) &= 2 - \frac{2}{N-1}\sum_{j=1,\,j\neq i}^N (1-x_j)\exp\big(-\beta(1-t)\big).
\end{aligned}$$
Therefore, for $j\neq i$,
$$u^{N,i}\big(t,(1-x_j,x^{-j})\big) - u^{N,i}\big(t,(x_j,x^{-j})\big) = \frac{2}{N-1}(1-2x_j)\exp\big(-\beta(1-t)\big),$$
while, for $j=i$,
$$u^{N,i}\big(t,(0,x^{-i})\big) - u^{N,i}\big(t,(1,x^{-i})\big) = \frac{1}{\beta}\Big(1-\exp\big(-\beta(1-t)\big)\Big),$$
so that, letting:
$$\varphi^{N,i}(t,x) = \begin{cases} 1-x_i & \text{if } u^{N,i}\big(t,(1-x_i,x^{-i})\big) - u^{N,i}(t,x) < 0,\\ x_i & \text{if } u^{N,i}\big(t,(1-x_i,x^{-i})\big) - u^{N,i}(t,x) \geq 0, \end{cases}$$
we obtain:
$$\begin{aligned}
\beta\sum_{j=1,\,j\neq i}^N \Big(u^{N,i}\big(t,(1-x_j,x^{-j})\big) - u^{N,i}(t,x)\Big)\mathbf{1}_{\{\varphi^{N,j}(t,x)=1-x_j\}}
&= \frac{2\beta}{N-1}\exp\big(-\beta(1-t)\big)\sum_{j=1,\,j\neq i}^N (1-2x_j)\mathbf{1}_{\{x_j=0\}} \\
&= \frac{2\beta}{N-1}\exp\big(-\beta(1-t)\big)\sum_{j=1,\,j\neq i}^N (1-x_j).
\end{aligned}$$
Computing the time derivatives of $(u^{N,i})_{1\leq i\leq N}$, we easily complete the proof. □
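As a sanity check, the closed-form expressions above can be plugged into the Nash system numerically. The following sketch (ours, not the book's) assumes the reduced form of the Hamiltonian spelled out after the proof of Lemma 7.9, that is $H(t,x_i,h) = \mathbf{1}_{\{x_i=0\}} + \beta\min(h(1-x_i)-h(x_i),0)$, and uses the fact, established above, that $\varphi^{N,j}(t,x) = 1-x_j$ exactly when $x_j = 0$; the residual of (7.57) then vanishes up to finite-difference error at every state. The parameters are illustrative.

    import math
    from itertools import product

    N, beta = 5, 2.0   # illustrative parameters

    def u(i, t, x):
        """Closed-form u^{N,i}(t,x) when every player uses strategy 1."""
        s = sum(1 - x[j] for j in range(N) if j != i)
        val = 2.0 - (2.0 / (N - 1)) * s * math.exp(-beta * (1.0 - t))
        if x[i] == 0:
            val += (1.0 / beta) * (1.0 - math.exp(-beta * (1.0 - t)))
        return val

    def flip(x, j):
        return tuple(1 - x[k] if k == j else x[k] for k in range(N))

    def residual(i, t, x, h=1e-6):
        """Left-hand side of the Nash system (7.57) evaluated on u^{N,i}."""
        dudt = (u(i, t + h, x) - u(i, t - h, x)) / (2.0 * h)
        # Reduced Hamiltonian: running cost plus optimized own-jump term.
        ham = (1.0 if x[i] == 0 else 0.0) + beta * min(
            u(i, t, flip(x, i)) - u(i, t, x), 0.0)
        # Jump terms of the other players, who move to 1 whenever x_j = 0.
        others = beta * sum(u(i, t, flip(x, j)) - u(i, t, x)
                            for j in range(N) if j != i and x[j] == 0)
        return dudt + ham + others

    err = max(abs(residual(i, 0.3, x))
              for i in range(N) for x in product((0, 1), repeat=N))
    print(err)   # of the order of the finite-difference error only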
Now, if we perform the same computation with the strategy $(\varphi^{N,i})_{1\leq i\leq N}$ given by Lemma 7.9, then we find that:
$$\varphi^{N,i}(t,x) = 0 \quad \text{if } x = 0, \qquad \text{since} \qquad v^{N,i}\big(t,(1-x_i,x^{-i})\big) - v^{N,i}(t,x) \geq 0 \quad \text{when } x = 0,$$
$(v^{N,i})_{1\leq i\leq N}$ being defined as the corresponding value functions. Actually, the above inequality is precisely what we checked in the proof of Lemma 7.9. Of course, it is false if $v^{N,i}$ is replaced by $u^{N,i}$.
So, the explanation is now clear: The control strategy profile is of bang-bang type as it oscillates from state 0 or 1 to state 1 or 0 according to the sign of the discrete derivative of the value function of the game. Rephrased with the notation used in the book, the minimizer $\hat\alpha(t,x,h)$ in (7.51) and (7.52) is not continuous with respect to $h$, and this explains why the master equation fails to capture the equilibrium identified in Lemma 7.9. In comparison, the minimizer of the Hamiltonian appearing in the analysis of Chapter (Vol II)-6 is regular, which makes a big difference when investigating the convergence property.
7.3 Notes & Complements

Mean field game models with several groups of players were already part of the original contribution of Huang, Caines, and Malhamé, see [211]. Another early paper on the subject is due to Lachapelle and Wolfram [253], who introduced models for groups of pedestrians with crowd aversion and xenophobia, that is, aversion of an ingroup towards outgroups. One of these models was revisited by Cirant and Verzini in the more recent work [119], with a more detailed analysis of the segregation phenomenon. General well-posedness of stationary mean field games with two populations was addressed by Feleqi in [151] under periodic boundary conditions and by Cirant in [117] under Neumann boundary conditions. Convergence of the N-player game was investigated by Feleqi in [151]. A synthetic presentation is also given in Chapter 8 of the textbook by Bensoussan, Frehse, and Yam [50].
The derivation of the HJB equation for stochastic optimal control problems with
an infinite horizon and a discounted running cost may be found in Chapter III of
the monograph by Fleming and Soner [157]. However, the exposition therein is
limited to time homogeneous coefficients, in which case the resulting HJB equation
becomes stationary. Obviously, the stationary case is not well suited to infinite
horizon mean field games unless we modify the fixed point condition as explained
in the introduction of ergodic mean field games: Under the standard fixed point
condition, the distribution of the population may depend on time and, subsequently,
the coefficients of the underlying control problem are not time homogeneous. It
is only when addressing mean field games with an ergodic cost that the problem
becomes time independent.
Regarding the probabilistic approach to this kind of optimal control problems,
earlier results on decoupled FBSDEs in infinite horizon were obtained by Peng
[303] and Buckdahn and Peng [80] under appropriate monotonicity conditions,
which were relaxed by Briand and Hu [70] and Royer [321]. Peng and Shi, in
[306], implemented a continuation argument to prove an existence and uniqueness
result for fully coupled FBSDEs in infinite horizon. The connection between FBSDEs and optimal control problems in infinite horizon was addressed by Fuhrman and Tessitore [166] and by Hu and Tessitore [205]. The corresponding version of the
stochastic maximum principle was investigated by Haadem, Øksendal, and Proske
[192] and by Maslowski and Veverka [276].
Examples of infinite horizon mean field games may be found in Huang, Caines,
and Malhamé [212], Huang [208] and Huang [209]. We refer to Chapter 7 in
the monograph by Bensoussan, Frehse, and Yam [50] and to the article by Priuli
[314] for other considerations on mean field games with an infinite horizon and a
discounted running cost.
Ergodic mean field games, including the convergence of the N-player game,
were addressed by Lasry and Lions in their first works on the subject. We refer
to the two seminal papers [260, 262] for a complete account of the available results.
Refinements were obtained by Cirant in [118], where special attention is paid to cost
functionals favoring congestion phenomena, in [153] by Ferreira and Gomes whose
analysis allows for degenerate cases, and in [313] by Pimentel and Voskanyan who
address the existence of classical solutions. The extended mean field game models introduced in Subsection 4.6 of Chapter 4 were studied in the ergodic case by Gomes, Patrizi, and Voskanyan in [127], where they are called extended ergodic mean field games. The linear quadratic case was considered by Bardi
and Priuli in [35]. The note by Borkar [66] provides a pedagogical introduction
to the theory of ergodic optimal control (without mean field interactions). The
HJB equations used to solve ergodic control problems and ergodic games were
investigated by Bensoussan and Frehse [46, 47]. We refer to the monographs [228]
by Khasminskii and [128] by Da Prato and Zabczyk for a general presentation of the
ergodic properties of Markov and diffusion processes. Ergodic BSDEs, which we
alluded to in Subsection 7.1.2, were investigated by Debussche, Hu, and Tessitore
in [131] and Fuhrman, Hu, and Tessitore in [270].
The connection between ergodic mean field games and the large time limit of
mean field games with finite time horizon was addressed by Cardaliaguet, Lasry,
Lions, and Porretta in [90, 91] for mean field games with nondegenerate diffusion
coefficients, and by Cardaliaguet in [84] for degenerate first order mean field games.
For numerical methods on ergodic mean field games, we refer the interested reader
to the paper [19] of Almulla, Ferreira, and Gomes and to [4] by Achdou and
Capuzzo-Dolcetta. In [82], Camilli and Marchi study a class of stationary mean
field games on networks.
Our introductory discussion of mean field games with finite state spaces emphasizes the fact that we refrain from considering the equivalent of a common noise,
and that we restrict our presentation to Markovian dynamics as given by control
strategies in feedback closed loop form. For the construction of state dynamics as
marked point processes from their dual predictable projections and the discussion
of a few examples of control of queuing systems, the interested reader is referred
to the book [231] of Kitaev and Rykov. For the theory of point processes we refer the reader to Brémaud's monograph [67], and to Çinlar's textbook [116] for a clear
pedagogical introduction. All these treatises rely heavily on the fundamental work of
Jacod [214] on the theory of the predictable projection of a random measure. We also
refer to the monograph [173] of Gihman and Skorohod for a general overview of
the theory of controlled processes and to the textbook [190] of Guo and Hernández-
Lerma for a more specific focus on Markov decision processes.
The bulk of the technical results presented in our discussion of the mean field
games with finite state spaces was inspired by the works of Guéant [185, 188] and
Gomes, Mohr, and Souza [175]. The earlier publication [174] by the same authors
investigates the discrete time case which we did not consider in the text. These
works are very similar. The dynamics of the states are given by an evolution on a directed graph in [185, 188], emphasizing the possibility to restrict transitions
to and from specific sets of nodes. However, except for a clear emphasis in the
notation which could help the intuition regarding the time evolution of the state,
this directed graph structure is not really used when it comes to theoretical results
and proofs, so the set-up of [175] could be used as well to describe the dynamics.
The main difference is that from the start of [185, 188], the contributions of the control $\alpha$ and of the marginal distribution to the running cost function are split apart and appear in two different additive components of the cost. Even though the
form of the contribution of the control is more general than the quadratic function
which we use in the text, the strict convexity assumption of [185, 188] ends up
playing the same role as our quadratic assumption, and for the sake of exposition,
we decided to use the quadratic function instead of carrying around Legendre
transforms. Finally, while [175] does not assume that the running cost function splits
into parts containing the contributions of the control and the marginal distribution,
technical assumptions and dedicated a priori estimates lead to similar arguments and
proofs.
The four state model for the behavior of computer owners facing cyber attacks
by hackers which we chose for the purpose of illustration is borrowed from the
paper [235] by Kolokoltsov and Bensoussan. There, the authors consider the infinite
time horizon version of the model, and search for stationary equilibria. They give a complete characterization of the state of affairs, i.e., nonexistence, existence and uniqueness, and even existence of two equilibria in the asymptotic regime $\lambda = \infty$. Obviously, the numerical illustrations given in the text are for $T$ and $\lambda$ finite. We
chose the parameters of the model for our numerical results to be consistent with
the asymptotic properties expected from their results. This cyber-security model
will be revisited in Chapter 7 of Volume II where we extend the framework of finite
state mean field games to include major and minor players, a generalization which
makes the model more realistic for the analysis of cyber attack applications.
The counter-example presented in Subsection 7.2.5 is inspired by the note [139]
of Doncel, Gast, and Gaujal. See also the expanded version [140] by the same
authors. It may be regarded as a dynamic variant of the classical prisoner’s dilemma
game. In the finite player version, a strategy for constructing an equilibrium is
given by the tit-for-tat principle: any player who defects from the cooperation is punished by the others. Intuitively, such a strategy cannot be implemented anymore
in the limiting setting since the mass of one defecting player is zero.
We refer to Gast and Gaujal [169], Gast, Gaujal, and Le Boudec [170], and Kolokoltsov [239] for a study of mean field control problems, very much in
the spirit of Chapter 6, on a finite state space. Some of these articles treat only the
discrete time case. Finally, we refer to Basna, Hilbert, and Kolokoltsov [37] for a
discussion of mean field games when the dynamics of the states of the players are
given by pure jump Markov processes in a continuous state space.
References
1. Y. Achdou, F. Buera, J.M. Lasry, P.L. Lions, and B. Moll. Partial differential equation models
in macroeconomics. Philosophical Transactions of the Royal Society, A, 372, Oct. 2014.
2. Y. Achdou, F. Camilli, and I. Capuzzo-Dolcetta. Mean field games: numerical methods for
the planning problem. SIAM Journal on Control and Optimization, 50:77–109, 2010.
3. Y. Achdou, F. Camilli, and I. Capuzzo-Dolcetta. Mean field games: convergence of a finite
difference method. SIAM Journal on Numerical Analysis, 51:2585–2612, 2013.
4. Y. Achdou and I. Capuzzo-Dolcetta. Mean field games: numerical methods. SIAM Journal
on Numerical Analysis, 48:1136–1162, 2010.
5. Y. Achdou and M. Laurière. On the system of partial differential equations arising in mean
field type control. Discrete and Continuous Dynamical Systems, A, 35:3879–3900, 2015.
6. Y. Achdou and V. Perez. Iterative strategies for solving linearized discrete mean field games
systems. Networks and Heterogeneous Media, 7:197–217, 2012.
7. Y. Achdou and A. Porretta. Convergence of a finite difference scheme to weak solutions of
the system of partial differential equations arising in mean field games. SIAM Journal on
Numerical Analysis, 54:161–186, 2016.
8. S. Adlakha and R. Johari. Mean field equilibrium in dynamic games with strategic
complementarities. Operations Research, 61:971–989, 2013.
9. N. Aghbal and R. Carmona. A solvable mean field game with interactions through the control.
Technical report, Princeton University, 2014.
10. P. Aghion and P. Howitt. A model of growth through creative destruction. Econometrica,
60:323–352, 1992.
11. S. Ahuja. Wellposedness of mean field games with common noise under a weak monotonicity
condition. SIAM Journal on Control and Optimization, 54:30–48, 2016.
12. S.R. Aiyagari. Uninsured idiosyncratic risk and aggregate saving. The Quarterly Journal of
Economics, 109:659–684, 1994.
13. M. Aizenman and B. Simon. Brownian motion and Harnack inequality for Schrödinger
operators. Communications in Pure and Applied Mathematics, 35:209–273, 1982.
14. S. Alanko. Regression-based Monte Carlo methods for solving nonlinear PDEs. PhD thesis,
New York University, 2015.
15. D. Aldous. Weak convergence and the general theory of processes. Unpublished notes. http://www.stat.berkeley.edu/~aldous/Papers/weak-gtp.pdf, 1983.
16. D. Aldous. Exchangeability and related topics. In Ecole d’Eté de Probabilités de Saint Flour
1983. Volume 1117 of Lecture Notes in Mathematics, pages 1–198. Springer-Verlag Berlin
Heidelberg, 1985.
17. C. D. Aliprantis and K. Border. Infinite Dimensional Analysis. Third Edition. Springer-Verlag
Berlin Heidelberg, 2006.
18. R. Almgren and N. Chriss. Optimal execution of portfolio transactions. Journal of Risk,
3:5–39, 2001.
19. N. Almulla, R. Ferreira, and D. Gomes. Two numerical approaches to stationary mean-field
games. Dynamic Games and Applications, 7:657–682, 2016.
20. L. Ambrosio and J. Feng. On a class of first order Hamilton-Jacobi equations in metric spaces.
Journal of Differential Equations, 256:2194–2245, 2014.
21. L. Ambrosio, N. Gigli, and G. Savaré. Gradient flows in metric spaces and in the Wasserstein
space of probability measures. Birkhäuser Basel, 2004.
22. T.T.K. An and B. Oksendal. Maximum principle for stochastic differential games with partial
information. Journal of Optimization Theory and Applications, 139:463–483, 2008.
23. T.T.K. An and B. Oksendal. A maximum principle for stochastic differential games with
g-expectations and partial information. Stochastics, 84:137–155, 2012.
24. D. Andersson and B. Djehiche. A maximum principle for SDEs of mean-field type. Applied
Mathematics & Optimization, 63:341–356, 2010.
25. F. Antonelli. Backward-forward stochastic differential equations. Annals of Applied
Probability, 3:777–793, 1993.
26. F. Antonelli and J. Ma. Weak solutions of forward-backward SDE’s. Stochastic Analysis and
Applications, 21(3):493–514, 2003.
27. R. Aumann. Markets with a continuum of traders. Econometrica, 32:39–50, 1964.
28. R. J. Aumann. Existence of competitive equilibrium in markets with continuum of traders.
Econometrica, 34:1–17, 1966.
29. B. Bouchard, D. Possamaï, X. Tan, and C. Zhou. A unified approach to a priori estimates for supersolutions of BSDEs in general filtrations. Annales de l'Institut Henri Poincaré, Probabilités et Statistiques, to appear.
30. R. Bafico and P. Baldi. Small random perturbations of Peano phenomena. Stochastics, 6:279–
292, 1982.
31. F. Baghery and B. Oksendal. A maximum principle for stochastic control with partial
information. Stochastic Analysis and Applications, 25:705–717, 2007.
32. K. Bahlali, B. Mezerdi, M. N’zi, and Y. Ouknine. Weak solutions and a Yamada-Watanabe
theorem for FBSDEs. Random Operators and Stochastic Equations, 15:271–285, 2007.
33. M. Bardi. Explicit solutions of some linear quadratic mean field games. Networks and
Heterogeneous Media, 7:243–261, 2012.
34. M. Bardi and E. Feleqi. Nonlinear elliptic systems and mean field games. Nonlinear
Differential Equations and Applications NoDEA, 23:44, 2016.
35. M. Bardi and F. Priuli. Linear-quadratic N-person and mean-field games with ergodic cost.
SIAM Journal on Control and Optimization, 52:3022–3052, 2014.
36. F. Barthe and C. Bordenave. Combinatorial optimization over two random point sets. In
C. Donati-Martin et al., editors, Séminaire de Probabilités XLV. Volume 2046 of Lecture
Notes in Mathematics, pages 483–536. Springer International Publishing, 2013.
37. R. Basna, A. Hilbert, and V.N. Kolokoltsov. An approximate Nash equilibrium for pure jump
Markov games of mean-field-type on continuous state space. Stochastics, 89:967–993, 2017.
38. R.F. Bass. Diffusions and Elliptic Operators. Springer-Verlag New York, 1998.
39. R.F. Bass and P. Hsu. Some potential theory for reflecting Brownian motion in Hölder and
Lipschitz domains. Annals of Probability, 19:486–508, 1991.
40. J.R. Baxter and R.V. Chacon. Compactness of stopping times. Zeitschrift für Wahrschein-
lichkeitstheorie und verwandte Gebiete, 40:169–181, 1977.
41. J.-D. Benamou and Y. Brenier. A computational fluid mechanics solution to the Monge-
Kantorovich mass transfer problem. Numerische Mathematik, 84:375–393, 2000.
42. J.D. Benamou and G. Carlier. Augmented Lagrangian methods for transport optimization,
mean field games and degenerate elliptic equations. Journal of Optimization Theory and
Applications, 167:1–26, 2015.
43. J.D. Benamou, G. Carlier, and N. Bonne. An augmented Lagrangian numerical approach to
solving mean field games. Technical report, INRIA, 2013. https://hal.inria.fr/hal-00922349/
44. A. Bensoussan, M.H.M. Chau, and S.C.P. Yam. Mean field games with a dominating player.
Applied Mathematics & Optimization, 74:91–128, 2016.
45. A. Bensoussan and J. Frehse. Nonlinear elliptic systems in stochastic game theory. Journal
für die reine und angewandte Mathematik, 350:23–67, 1984.
46. A. Bensoussan and J. Frehse. On Bellman equations of ergodic control in R^n. Journal für die
reine und angewandte Mathematik, 492:125–160, 1992.
47. A. Bensoussan and J. Frehse. Ergodic Bellman systems for stochastic games in arbitrary
dimension. Proceedings of the Royal Society of Edinburgh, A, 449:65–77, 1995.
48. A. Bensoussan and J. Frehse. Stochastic games for N players. Journal of Optimization Theory
and Applications, 105:543–565, 2000.
49. A. Bensoussan and J. Frehse. Smooth solutions of systems of quasilinear parabolic equations.
ESAIM: Control, Optimisation and Calculus of Variations, 8:169–193, 2010.
50. A. Bensoussan, J. Frehse, and P. Yam. Mean Field Games and Mean Field Type Control
Theory. SpringerBriefs in Mathematics. Springer-Verlag New York, 2013.
51. A. Bensoussan, J. Frehse, and S. C. P. Yam. The master equation in mean field theory. Journal
de Mathématiques Pures et Appliquées, 2014.
52. A. Bensoussan, J. Frehse, and S. C. P. Yam. On the interpretation of the master equation.
Stochastic Processes and their Applications, 127:2093–2137, 2016.
53. A. Bensoussan, K.C.J. Sung, S.C.P. Yam, and S.P. Yung. Linear quadratic mean field games.
Journal of Optimization Theory and Applications, 169:469–529, 2016.
54. A. Bensoussan and S. C. P. Yam. Control problem on space of random variables and master
equation. Technical report, http://arxiv.org/abs/1508.00713, 2015.
55. D.P. Bertsekas. Nonlinear Programming. Athena Scientific, 1995.
56. D.P. Bertsekas and S.E. Shreve. Stochastic Optimal Control: The Discrete Time Case.
Academic Press, 1978.
57. P. Billingsley. Convergence of Probability Measures. Third edition. John Wiley & Sons, Inc.,
1995.
58. P. Billingsley. Probability and Measure. Second edition. John Wiley & Sons, Inc., 1999.
59. A. Bisin, U. Horst, and O. Özgür. Rational expectations equilibria of economies with local
interactions. Journal of Economic Theory, 127:74–116, 2006.
60. J.-M. Bismut. Théorie probabiliste du contrôle des diffusions, Memoirs of the American
Mathematical Society, 167(4), 1976.
61. J.-M. Bismut. Conjugate convex functions in optimal stochastic control. Journal of
Mathematical Analysis and Applications, 44:384–404, 1973.
62. J.M. Bismut. An introductory approach to duality in optimal stochastic control. SIAM Review,
20:62–78, 1978.
63. D. Blackwell and L.E. Dubins. An extension of Skorohod’s almost sure convergence theorem.
Proceedings of the American Mathematical Society, 89:691–692, 1983.
64. V.I. Bogachev. Measure Theory, Volume 2. Springer-Verlag Berlin Heidelberg, 2007.
65. V.S. Borkar. Controlled diffusion processes. Probability Surveys, 2:213–244, 2005.
66. V.S. Borkar. Ergodic control of diffusion processes. In Marta Sanz-Solé et al., editors,
Proceedings of the International Congress of Mathematics, Madrid, Spain, pages 1299–1309.
European Mathematical Society, 2006.
67. P. Brémaud. Point Processes and Queues: Martingale Dynamics. Springer Series in Statistics.
Springer-Verlag New York, 1981.
68. P. Brémaud and M. Yor. Changes of filtrations and of probability measures. Zeitschrift für
Wahrscheinlichkeitstheorie und verwandte Gebiete, 45:269–295, 1978.
69. Y. Brenier. Polar factorization and monotone rearrangement of vector-valued functions.
Communications on Pure and Applied Mathematics, 44:375–417, 1991.
70. P. Briand and Y. Hu. Stability of BSDEs with random terminal time and homogenization of
semilinear elliptic PDEs. Journal of Functional Analysis, 155:455–494, 1998.
71. P. Briand and Y. Hu. BSDE with quadratic growth and unbounded terminal value. Probability
Theory and Related Fields, 136:604–618, 2006.
72. G. Brunick and S. Shreve. Mimicking an Itô process by a solution of a stochastic differential
equation. Annals of Applied Probability, 23:1584–1628, 2013.
73. J. Bryant. A model of reserves, bank runs and deposit insurance. Journal of Banking and
Finance, 4:335–344, 1980.
74. R. Buckdahn, B. Djehiche, and J. Li. Mean field backward stochastic differential equations
and related partial differential equations. Stochastic Processes and their Applications,
119:3133–3154, 2007.
75. R. Buckdahn, B. Djehiche, J. Li, and S. Peng. Mean field backward stochastic differential
equations: A limit approach. Annals of Probability, 37:1524–1565, 2009.
76. R. Buckdahn and H.-J. Engelbert. A backward stochastic differential equation without strong
solution. Theory of Probability and its Applications, 50:284–289, 2006.
77. R. Buckdahn and H.-J. Engelbert. On the continuity of weak solutions of backward stochastic
differential equations. Theory of Probability and its Applications, 52:152–160, 2008.
78. R. Buckdahn, H. J. Engelbert, and A. Rǎşcanu. On weak solutions of backward stochastic
differential equations. Theory of Probability and its Applications, 49:16–50, 2005.
79. R. Buckdahn, J. Li, S. Peng, and C. Rainer. Mean-field stochastic differential equations and
associated PDEs. Annals of Probability, 45:824–878, 2017.
80. R. Buckdahn and S. Peng. Stationary backward stochastic differential equations and
associated partial differential equations. Probability Theory and Related Fields, 115:383–
399, 1999.
81. K. Burdzy, W. Kang, and K. Ramanan. The Skorokhod problem in a time-dependent interval.
Stochastic Processes and their Applications, 119:428–452, 2009.
82. F. Camilli and C. Marchi. Stationary mean field games systems defined on networks. SIAM
Journal on Control and Optimization, 54:1085–1103, 2016.
83. P. Cardaliaguet. Notes from P.L. Lions’ lectures at the Collège de France. Technical report,
https://www.ceremade.dauphine.fr/~cardalia/MFG100629.pdf, 2012.
84. P. Cardaliaguet. Long time average of first order mean field games and weak KAM theory.
Dynamic Games and Applications, 3:473–488, 2013.
85. P. Cardaliaguet. Weak solutions for first order mean field games with local coupling. In
P. Bettiol et al., editors, Analysis and Geometry in Control Theory and its Applications.
Springer INdAM Series, pages 111–158. Springer International Publishing, 2015.
86. P. Cardaliaguet, F. Delarue, J.-M. Lasry, and P.-L. Lions. The master equation and the
convergence problem in mean field games. Technical report, http://arxiv.org/abs/1509.02505,
2015.
87. P. Cardaliaguet and J. Graber. Mean field games systems of first order. ESAIM: Control,
Optimisation and Calculus of Variations, 21:690–722, 2015.
88. P. Cardaliaguet, J. Graber, A. Porretta, and D. Tonon. Second order mean field games with
degenerate diffusion and local coupling. Nonlinear Differential Equations and Applications
NoDEA, 22:1287–1317, 2015.
89. P. Cardaliaguet and S. Hadikhanloo. Learning in mean field games: the fictitious play. ESAIM:
Control, Optimisation and Calculus of Variations, 23:569–591, 2017.
90. P. Cardaliaguet, J.M. Lasry, P.L. Lions, and A. Porretta. Long time average of mean field
games. Networks and Heterogeneous Media, 7:279–301, 2012.
91. P. Cardaliaguet, J.M. Lasry, P.L. Lions, and A. Porretta. Long time average of mean field
games with a nonlocal coupling. SIAM Journal on Control and Optimization, 51:3558–3591,
2013.
92. P. Cardaliaguet, A. R. Mészáros, and F. Santambrogio. First order mean field games with
density constraints: Pressure equals price. SIAM Journal on Control and Optimization,
54:2672–2709, 2016.
93. E. A. Carlen. Conservative diffusions. Communication in Mathematical Physics, 94:293–315,
1984.
94. R. Carmona. Lectures on BSDEs, Stochastic Control and Stochastic Differential Games.
SIAM, 2015.
95. R. Carmona and F. Delarue. Mean field forward-backward stochastic differential equations.
Electronic Communications in Probability, 2013.
96. R. Carmona and F. Delarue. Probabilistic analysis of mean field games. SIAM Journal on
Control and Optimization, 51:2705–2734, 2013.
97. R. Carmona and F. Delarue. The master equation for large population equilibriums. In
D. Crisan, B. Hambly, T. Zariphopoulou, editors, Stochastic Analysis and Applications 2014:
In Honour of Terry Lyons, pages 77–128. Springer Cham, 2014.
98. R. Carmona and F. Delarue. Forward-backward stochastic differential equations and
controlled Mckean Vlasov dynamics. Annals of Probability, 43:2647–2700, 2015.
99. R. Carmona, F. Delarue, and A. Lachapelle. Control of McKean-Vlasov versus mean field
games. Mathematics and Financial Economics, 7:131–166, 2013.
100. R. Carmona, F. Delarue, and D. Lacker. Mean field games with common noise. Annals of
Probability, 44:3740–3803, 2016.
101. R. Carmona, J.P. Fouque, M. Moussavi, and L.H. Sun. Systemic risk and stochastic games
with delay. Technical report, 2016. https://arxiv.org/abs/1607.06373
102. R. Carmona, J.P. Fouque, and L.H. Sun. Mean field games and systemic risk: a toy model.
Communications in Mathematical Sciences, 13:911–933, 2015.
103. R. Carmona and D. Lacker. A probabilistic weak formulation of mean field games and
applications. Annals of Applied Probability, 25:1189–1231, 2015.
104. R. Carmona, F. Delarue, and D. Lacker. Mean field games of timing and models for bank
runs. Applied Mathematics & Optimization, 76:217–260, 2017.
105. R. Carmona and D. Nualart. Nonlinear Stochastic Integrators, Equations and Flows. Gordon
& Breach, 1990.
106. R. Carmona and P. Wang. An alternative approach to mean field game with major and minor
players, and applications to herders impacts. Applied Mathematics & Optimization, 76:5–27,
2017.
107. R. Carmona and K. Webster. The self financing condition in high frequency markets. Finance
Stochastics, to appear.
108. R. Carmona and K. Webster. A Stackelberg equilibrium for the Limit Order Book.
Mathematical Finance, to appear.
109. R. Carmona and W.I. Zheng. Reflecting Brownian motions and comparison theorems for
Neumann heat kernels. Journal of Functional Analysis, 123:109–128, 1994.
110. R. Carmona and G. Zhu. A probabilistic approach to mean field games with major and minor
players. Annals of Applied Probability, 26:1535–1580, 2016.
111. C. Ceci, A. Cretarola, and F. Russo. BSDEs under partial information and financial
applications. Stochastic Processes and their Applications, 124:2628–2653, 2014.
112. U. Cetin, H.M. Soner, and N. Touzi. Options hedging under liquidity costs. Finance
Stochastics, 14:317–341, 2010.
113. P. Chan and R. Sircar. Bertrand and Cournot mean field games. Applied Mathematics &
Optimization, 71:533–569, 2015.
114. J.F. Chassagneux, D. Crisan, and F. Delarue. McKean-vlasov FBSDEs and related master
equation. Technical report, http://arxiv.org/abs/1411.3009, 2015.
115. P. G. Ciarlet. Introduction to Numerical Linear Algebra and Optimisation. Cambridge Texts
in Applied Mathematics. Cambridge University Press, 1989.
116. E. Çinlar. Probability and Stochastics. Graduate Texts in Mathematics. Springer-Verlag New
York, 2011.
117. M. Cirant. Multi-population mean field games systems with Neumann boundary conditions.
Journal de Mathématiques Pures et Appliquées, 103:1294–1315, 2015.
118. M. Cirant. Stationary focusing mean-field games. Communications in Partial Differential
Equations, 41:1324–1346, 2016.
119. M. Cirant and G. Verzini. Bifurcation and segregation in quadratic two-populations mean
field games systems. ESAIM: Control, Optimisation and Calculus of Variations, 23:1145–
1177, 2017.
120. M. Coghi and F. Flandoli. Propagation of chaos for interacting particles subject to
environmental noise. Annals of Applied Probability, 26:1407–1442, 2016.
121. M.G. Crandall and P.L. Lions. Viscosity solutions of Hamilton-Jacobi equations in infinite dimensions. I. Uniqueness of viscosity solutions. Journal of Functional Analysis, 62:379–396, 1985.
122. M.G. Crandall and P.L. Lions. Viscosity solutions of Hamilton-Jacobi equations in infinite dimensions. II. Existence of viscosity solutions. Journal of Functional Analysis, 65:368–405, 1986.
123. M.G. Crandall and P.L. Lions. Viscosity solutions of Hamilton-Jacobi equations in infinite dimensions. III. Journal of Functional Analysis, 68:214–247, 1986.
124. M.G. Crandall and P.L. Lions. Viscosity solutions of Hamilton-Jacobi equations in infinite dimensions. IV. Hamiltonians with unbounded linear terms. Journal of Functional Analysis, 90:237–283, 1990.
125. F. Cucker and E. Mordecki. Flocking in noisy environments. Journal de Mathématiques
Pures et Appliquées, 89:278–296, 2008.
126. F. Cucker and S. Smale. Emergent behavior in flocks. IEEE Transactions on Automatic
Control, 52:852–862, 2007.
127. D.A. Gomes, S. Patrizi, and V. Voskanyan. On the existence of classical solutions for stationary extended mean field games. Nonlinear Analysis: Theory, Methods & Applications, 99:49–79, 2014.
128. G. Da Prato and J. Zabczyk. Ergodicity for Infinite Dimensional Systems. Cambridge
University Press, 1996.
129. E.B. Davies. Spectral properties of compact manifolds and changes of metric. American
Journal of Mathematics, 112:15–39, 1990.
130. D. Dawson and J. Vaillancourt. Stochastic McKean-Vlasov equations. NoDEA. Nonlinear
Differential Equations and Applications, 2(2):199–229, 1995.
131. A. Debussche, Y. Hu, and G. Tessitore. Ergodic BSDEs under weak dissipative assumptions.
Stochastic Processes and their Applications, 121:407–426, 2011.
132. F. Delarue. On the existence and uniqueness of solutions to FBSDEs in a non-degenerate
case. Stochastic Processes and their Applications, 99:209–286, 2002.
133. F. Delarue. Estimates of the solutions of a system of quasi-linear PDEs. A probabilistic scheme. In J. Azéma et al., editors, Séminaire de Probabilités XXXVII, pages 290–332.
Springer-Verlag Berlin Heidelberg, 2003.
134. F. Delarue and G. Guatteri. Weak existence and uniqueness for FBSDEs. Stochastic Processes
and their Applications, 116:1712–1742, 2006.
135. S. Dereich, M. Scheutzow, and R. Schottstedt. Constructive quantization: approximation
by empirical measures. Annales Institut Henri Poincaré, Probabilités Statistiques, 49:1183–
1203, 2013.
136. D.W. Diamond and P.H. Dybvig. Bank runs, deposit insurance, and liquidity. The Journal of
Political Economy, 91:401–419, 1983.
137. B. Djehiche, A. Tcheukam Siwe, and H. Tembine. Mean field-type games in engineering.
Technical report, https://arxiv.org/abs/1605.03281, 2016.
138. B. Djehiche, H. Tembine, and R. Tempone. A stochastic maximum principle for risk-sensitive
mean-field type control. IEEE Transactions on Automatic Control, 60:2640–2649, 2015.
139. J. Doncel, N. Gast, and B. Gaujal. Are mean-field games the limits of finite stochastic games?
SIGMETRICS Performance Evaluation Review, 44:18–20, 2016.
140. J. Doncel, N. Gast, and B. Gaujal. Mean-field games with explicit interactions. Technical
report, https://hal.inria.fr/hal-01277098/file/main.pdf, 2016.
141. G. Dos Reis. Some advances on quadratic BSDE: Theory - Numerics - Applications. LAP
LAMBERT Academic Publishing, 2011.
142. R. Duboscq and A. Réveillac. Stochastic regularization effects of semi-martingales on random
functions. Journal de Mathématiques Pures et Appliquées, 106:1141–1173, 2016.
143. R.M. Dudley. Real Analysis and Probability. Wadsworth & Brooks/Cole, 1989.
144. D. Duffie, G. Giroux, and G. Manso. Information percolation. American Economic Journal: Microeconomics, 2:1, 2010.
145. D. Duffie, S. Malamud, and G. Manso. Information percolation with equilibrium search
dynamics. Econometrica, 77:1513–1574, 2009.
146. D. Duffie, S. Malamud, and G. Manso. Information percolation in segmented markets.
Journal of Economic Theory, 153:1–32, 2014.
147. D. Duffie and G. Manso. Information percolation in large markets. American Economic
Review, Papers and Proceedings, 97:203–209, 2007.
148. D. Duffie and Y. Sun. Existence of independent random matching. Annals of Applied
Probability, 17:385–419, 2007.
149. S. Ethier and T. Kurtz. Markov Processes: Characterization and Convergence. John Wiley &
Sons, Inc., 2005.
150. G. Fabbri, F. Gozzi, and A. Swiech. Stochastic Optimal Control in Infinite Dimensions:
Dynamic Programming and HJB Equations. Probability Theory and Stochastic Modelling.
Springer International Publishing, 2017.
151. E. Feleqi. The derivation of ergodic mean field game equations for several population of
players. Dynamic Games and Applications, 3:523–536, 2013.
152. J. Feng and M. Katsoulakis. A comparison principle for Hamilton-Jacobi equations related
to controlled gradient flows in infinite dimensions. Archive for Rational Mechanics and
Analysis, 192:275–310, 2009.
153. R. Ferreira and D. Gomes. Existence of weak solutions to stationary mean field games through
variational inequalities. Technical report, http://arxiv.org/abs/1512.05828, 2015.
154. M. Fischer. On the connection between symmetric N-player games and mean field games.
Annals of Applied Probability, 27:757–810, 2017.
155. M. Fischer and G. Livieri. Continuous time mean-variance portfolio optimization through the
mean field approach. ESAIM: Probability and Statistics, 20:30–44, 2016.
156. F. Flandoli. Random Perturbation of PDEs and Fluid Dynamics: Ecole d’été de probabilités
de Saint-Flour XL. Volume 2015 of Lecture Notes in Mathematics. Springer-Verlag Berlin
Heidelberg, 2011.
157. W.H. Fleming and M. Soner. Controlled Markov Processes and Viscosity Solutions.
Stochastic Modelling and Applied Probability. Springer-Verlag, New York, 2010.
158. W.H. Fleming. Generalized solutions in optimal stochastic control. In Proceedings of the
Second Kingston Conference on Differential Games, pages 147–165. Marcel Dekker, 1977.
159. R. Foguen Tchuendom. Restoration of uniqueness of Nash equilibria for a class of linear-
quadratic mean field games with common noise. Dynamic Games and Applications, to appear.
160. M. Fornasier and F. Solombrino. Mean-field optimal control. ESAIM: Control, Optimisation
and Calculus of Variations, 20:1123–1152, 2014.
161. N. Fournier and A. Guillin. On the rate of convergence in the Wasserstein distance of the
empirical measure. Probability Theory and Related Fields, 162:707–738, 2015.
162. A. Friedman. Partial differential equations of parabolic type. Prentice-Hall, Englewood
Cliffs, N.J., first edition, 1964.
163. A. Friedman. Stochastic differential games. Journal of Differential Equations, 11:79–108,
1972.
164. D. Fudenberg and D. Levine. Open-loop and closed-loop equilibria in dynamic games with
many players. Journal of Economic Theory, 44:1–18, 1988.
165. D. Fudenberg and J. Tirole. Game Theory. MIT Press, 1991.
166. M. Fuhrman and G. Tessitore. Infinite horizon backward stochastic differential equations and
elliptic equations in Hilbert spaces. Annals of Probability, 30:607–660, 2004.
167. W. Gangbo, T. Nguyen, and A. Tudorascu. Hamilton-Jacobi equations in the Wasserstein
space. Methods and Applications of Analysis, 15:155–184, 2008.
168. W. Gangbo and A. Swiech. Existence of a solution to an equation arising from the theory of
mean field games. Journal of Differential Equations, 259:6573–6643, 2015.
169. N. Gast and B. Gaujal. A mean field approach for optimization in discrete time. Journal of
Discrete Event Dynamic Systems, 21:63–101, 2011.
170. N. Gast, B. Gaujal, and J.-Y. Le Boudec. Mean field for Markov decision processes: from
discrete to continuous optimization. IEEE Transactions on Automatic Control, 57:2266–
2280, 2012.
171. J. Gatheral, A. Schied, and A. Slynko. Transient linear price impact and Fredholm integral
equations. Mathematical Finance, 22:445–474, 2012.
172. R. Gayduk and S. Nadtochiy. Liquidity effects of trading frequency. Mathematical Finance,
to appear.
173. I.I. Gihman and A.V. Skorohod. Controlled Stochastic Processes. Springer-Verlag Berlin
Heidelberg New York, 1979.
174. D.A. Gomes, J. Mohr, and R.R. Souza. Discrete time, finite state space mean field games.
Journal de Mathématiques Pures et Appliquées, 93:308–328, 2010.
175. D.A. Gomes, J. Mohr, and R.R. Souza. Continuous time finite state mean field games. Applied
Mathematics & Optimization, 68:99–143, 2013.
176. D.A. Gomes, L. Nurbekyan, and E. Pimentel. Economic Models and Mean-field Games
Theory. Publicações Matemáticas, IMPA, Rio, Brazil, 2015.
177. D.A. Gomes and E. Pimentel. Time-dependent mean-field games with logarithmic nonlinearities. SIAM Journal on Mathematical Analysis, 47:3798–3812, 2015.
178. D.A. Gomes and E. Pimentel. Local regularity for mean-field games in the whole space.
Minimax Theory and its Applications, 1:65–82, 2016.
179. D.A. Gomes, E. Pimentel, and H. Sánchez-Morgado. Time-dependent mean-field games in the subquadratic case. Communications in Partial Differential Equations, 40:40–76, 2015.
180. D.A. Gomes, E. Pimentel, and H. Sánchez-Morgado. Time-dependent mean-field games in
the superquadratic case. ESAIM: Control, Optimisation and Calculus of Variations, 22:562–
580, 2016.
181. D.A. Gomes, E. Pimentel, and V. Voskanyan. Regularity Theory for Mean-Field Game Systems. SpringerBriefs in Mathematics. Springer International Publishing, 2016.
182. D.A. Gomes and J. Saude. Mean field games models - a brief survey. Dynamic Games and
Applications, 4:110–154, 2014.
183. D.A. Gomes and V. Voskanyan. Extended mean field games. SIAM Journal on Control and
Optimization, 54:1030–1055, 2016.
184. A. Granas and J. Dugundji. Fixed point theory. Springer Monographs in Mathematics.
Springer-Verlag New York, 2003.
185. O. Guéant. From infinity to one: The reduction of some mean field games to a global control
problem. Cahier de la Chaire Finance et Développement Durable, 42, 2011.
186. O. Guéant. Mean field games equations with quadratic Hamiltonian: A specific approach.
Mathematical Models and Methods in Applied Sciences, 22:291–303, 2012.
187. O. Guéant. New numerical methods for mean field games with quadratic costs. Networks and
Heterogeneous Media, 2:315–336, 2012.
188. O. Guéant. Existence and uniqueness result for mean field games with congestion effect on
graphs. Applied Mathematics & Optimization, 72:291–303, 2015.
189. O. Guéant, J.M. Lasry, and P.L. Lions. Mean field games and applications. In R. Carmona
et al., editors, Paris Princeton Lectures on Mathematical Finance 2010. Volume 2003 of
Lecture Notes in Mathematics. Springer-Verlag Berlin Heidelberg, 2010.
190. X. Guo and O. Hernández-Lerma. Continuous-Time Markov Decision Processes. Stochastic
Modelling and Applied Probability. Springer-Verlag Berlin Heidelberg, 2009.
191. I. Gyöngy. Mimicking the one-dimensional marginal distributions of processes having an Itô
differential. Probability Theory and Related Fields, 71:501–516, 1986.
192. S. Haadem, B. Øksendal, and F. Proske. Maximum principles for jump diffusion processes
with infinite horizon. Automatica, 49:2267–2275, 2013.
193. S. Hamadène. Backward-forward SDE’s and stochastic differential games. Stochastic
Processes and their Applications, 77:1–15, 1998.
194. S. Hamadène. Nonzero-sum linear quadratic stochastic differential games and backward
forward equations. Stochastic Analysis and Applications, 17:117–130, 1999.
195. S. Hamadène and J.P. Lepeltier. Backward equations, stochastic control and zero-sum
stochastic differential games. Stochastics and Stochastic Reports, 54:221–231, 1995.
196. E. Häusler and H. Luschgy. Stable Convergence and Stable Limit Theorems. Probability Theory and Stochastic Modelling. Springer International Publishing, 2015.
197. Z. He and W. Xiong. Dynamic debt runs. Review of Financial Studies, 25:1799–1843, 2012.
198. J. Horowitz and R.L. Karandikar. Mean rates of convergence of empirical measures in the
Wasserstein metric. Journal of Computational and Applied Mathematics, 55:261–273, 1994.
199. U. Horst. Ergodic fluctuations in a stock market model with interacting agents: the mean field
case. Discussion paper No. 106, Sonderforschungsbereich 373, Humboldt Universität, Berlin,
1999.
200. U. Horst. Stationary equilibria in discounted stochastic games with weakly interacting
players. Games and Economic Behavior, 51:83–108, 2005.
201. J.A. Hosking. A stochastic maximum principle for a stochastic differential game of a mean-
field type. Applied Mathematics & Optimization, 66:415–454, 2012.
202. Y. Hu. Stochastic maximum principle. In John Baillieul, Tariq Samad, editors, Encyclopedia
of Systems and Control, pages 1347–1350. Springer-Verlag London, 2015.
203. Y. Hu and S. Peng. Maximum principle for semilinear stochastic evolution control systems.
Stochastics and Stochastic Reports, 33:159–180, 1990.
204. Y. Hu and S. Tang. Multi-dimensional backward stochastic differential equations of
diagonally quadratic generators. Stochastic Processes and their Applications, 126:1066–
1086, 2016.
205. Y. Hu and G. Tessitore. BSDE on an infinite horizon and elliptic PDEs in infinite dimension.
Nonlinear Differential Equations and Applications NoDEA, 14:825–846, 2007.
206. C-F. Huang and L. Li. Continuous time stopping games with monotone reward structures.
Mathematics of Operations Research, 15:496–507, 1990.
207. M. Huang. Large-population LQG games involving a major player: the Nash equivalence
principle. SIAM Journal on Control and Optimization, 48:3318–3353, 2010.
208. M. Huang. A mean field accumulation game with HARA utility. Dynamics Games and
Applications, 3:446–472, 2013.
209. M. Huang. Mean field capital accumulation games: the long time behavior. In Proceedings
of the 52nd IEEE Conference on Decision and Control, pages 2499–2504. 2013.
210. M. Huang, P.E. Caines, and R.P. Malhamé. Individual and mass behavior in large population
stochastic wireless power control problems: centralized and Nash equilibrium solutions. In
Proceedings of the 42nd IEEE International Conference on Decision and Control, pages 98–
103. 2003.
211. M. Huang, P.E. Caines, and R.P. Malhamé. Large population stochastic dynamic games:
closed-loop McKean-Vlasov systems and the Nash certainty equivalence principle. Commu-
nications in Information and Systems, 6:221–252, 2006.
212. M. Huang, P.E. Caines, and R.P. Malhamé. Large population cost coupled LQG problems
with nonuniform agents: individual mass behavior and decentralized ε-Nash equilibria. IEEE
Transactions on Automatic Control, 52:1560–1571, 2007.
213. M. Huang, R.P. Malhamé, and P.E. Caines. Nash equilibria for large population linear
stochastic systems with weakly coupled agents. In R.P. Malhamé, E.K. Boukas, editors,
Analysis, Control and Optimization of Complex Dynamic Systems, pages 215–252. Springer-
US, 2005.
214. J. Jacod. Multivariate point processes: predictable projections, Radon-Nikodym derivatives, representation of martingales. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, 31:235–253, 1975.
215. J. Jacod. Weak and strong solutions of stochastic differential equations. Stochastics, 3:171–
191, 1980.
216. J. Jacod and J. Mémin. Weak and strong solutions of stochastic differential equations:
Existence and stability. In D. Williams, editor, Stochastic Integrals. Volume 851 of Lecture
Notes in Mathematics, pages 169–212. Springer-Verlag Berlin Heidelberg, 1981.
217. M. Huang, P.E. Caines, and R.P. Malhamé. Social optima in mean field LQG control: central-
ized and decentralized strategies. IEEE Transactions on Automatic Control, 57(7):1736–1751,
2012.
218. X. Huang, S. Jaimungal and M. Nourian. Mean-field game strategies for optimal execution.
Technical report, University of Toronto, https://papers.ssrn.com/sol3/papers.cfm?abstract_
id=2578733, 2017.
219. M. Jeanblanc and Y. Le Cam. Immersion property and credit risk modelling. In F. Delbaen, M.
Rásonyi, C. Stricker, editors, Optimality and Risk-Modern Trends in Mathematical Finance,
pages 99–132. Springer-Verlag Berlin Heidelberg, 2010.
220. R. Jordan, D. Kinderlehrer, and F. Otto. The variational formulation of the Fokker-Planck
equation. SIAM Journal on Mathematical Analysis, 29:1–17, 1998.
221. B. Jourdain, S. Meleard, and W. Woyczynski. Nonlinear SDEs driven by Lévy processes and
related PDEs. ALEA, Latin American Journal of Probability, 4:1–29, 2008.
222. J. Kallsen and C. Kühn. Pricing derivatives of american and game type in incomplete markets.
Finance and Stochastics, 8:261–284, 2004.
223. J. Kallsen and C. Kühn. Convertible bonds: Financial derivatives of game type. In
A. Kyprianou, W. Schoutens, P. Wilmott, editors, Exotic Option Pricing and Advanced Lévy
Models, pages 277–288. John Wiley & Sons, Inc., 2005.
224. N. El Karoui and S.J. Huang. A general result of existence and uniqueness of backward
stochastic differential equations. In N. El Karoui, L. Mazliak, editors, Backward stochastic
differential equations, Research Notes in Mathematics, pages 27–36. Pitman, Longman,
Harlow, 1997.
225. N. El Karoui, D.H. Nguyen, and M. Jeanblanc-Picqué. Compactification methods in the
control of degenerate diffusions: existence of an optimal control. Stochastics, 20:169–219,
1987.
226. N. El Karoui, S. Peng, and M.C. Quenez. Backward stochastic differential equations in
finance. Mathematical Finance, 7:1–71, 1997.
227. N. Kazamaki. Continuous Exponential Martingales and BMO. Volume 1579 of Lecture Notes
in Mathematics. Springer-Verlag Berlin Heidelberg, 1994.
228. R.Z. Khasminskii. Stochastic Stability of Differential Equations. Sijthoff & Noordhoff, 1980.
229. Y. Kifer. Game options. Finance and Stochastics, 4:443–463, 2000.
230. J.F.C. Kingman. Uses of exchangeability. Annals of Probability, 6:183–197, 1978.
231. M.Y. Kitaev and V. Rykov. Controlled Queuing Systems. CRC Press, 1995.
232. M. Kobylanski. Backward stochastic differential equations and partial differential equations
with quadratic growth. Annals of Probability, 28:558–602, 2000.
233. V.N. Kolokoltsov. Nonlinear Markov semigroups and interacting Lévy processes. Journal of
Statistical Physics, 126:585–642, 2007.
234. V.N. Kolokoltsov. Nonlinear Markov processes and kinetic equations. Cambridge University
Press, Cambridge, 2010.
235. V.N. Kolokoltsov and A. Bensoussan. Mean-field-game model for botnet defense in cyber-
security. Applied Mathematics & Optimization, 74:669–692, 2016.
236. V.N. Kolokoltsov, J. Li, and W. Yang. Mean field games and nonlinear Markov processes.
Technical report, http://arxiv.org/abs/1112.3744, 2011.
237. V.N. Kolokoltsov and M. Troeva. On the mean field games with common noise and the
McKean-Vlasov SPDEs. Technical report, http://arxiv.org/abs/1506.04594, 2015.
238. V.N. Kolokoltsov, M. Troeva, and W. Yang. On the rate of convergence for the mean field
approximation of controlled diffusions with large number of players. Dynamic Games and
Applications, 4:208–230, 2013.
239. V.N. Kolokoltsov. Nonlinear Markov games on a finite state space (mean-field and binary
interactions). International Journal of Statistics and Probability, 1:77–91, 2012.
240. T. Kruse and A. Popier. BSDEs with monotone generator driven by Brownian and Poisson
noises in a general filtration. Stochastics, 88:491–539, 2016.
241. P. Krusell and A. Smith, Jr. Income and wealth heterogeneity in the macroeconomy. Journal
of Political Economy, 106:867–896, 1998.
242. N. Krylov. Controlled Diffusion Processes. Stochastic Modelling and Applied Probability.
Springer-Verlag Berlin Heidelberg, 1980.
243. H. Kunita. Stochastic Flows and Stochastic Differential Equations. Cambridge Studies in
Advanced Mathematics. Cambridge University Press, 1990.
244. H. Kunita and S. Watanabe. On square integrable martingales. Nagoya Mathematical Journal,
30:209–245, 1967.
245. T.G. Kurtz. Random time changes and convergence in distribution under the Meyer-Zheng
conditions. Annals of Probability, 19:1010–1034, 1991.
246. T.G. Kurtz. The Yamada-Watanabe-Engelbert theorem for general stochastic equations and
inequalities. Electronic Journal of Probability, 12:951–965, 2007.
247. T.G. Kurtz. Weak and strong solutions of general stochastic models. Electronic Journal of
Probability, 19:1–16, 2014.
248. T.G. Kurtz and P. Protter. Weak limit theorems for stochastic integrals and stochastic
differential equations. Annals of Probability, 19:1035–1070, 1991.
249. T.G. Kurtz and J. Xiong. Particle representations for a class of nonlinear SPDEs. Stochastic
Processes and their Applications, 83(1):103–126, 1999.
250. T.G. Kurtz and J. Xiong. A stochastic evolution equation arising from the fluctuations of a
class of interacting particle systems. Communications in Mathematical Sciences, 2(3):325–
358, 2004.
251. A. Lachapelle, J.M. Lasry, C.A. Lehalle, and P.L. Lions. Efficiency of the price formation
process in the presence of high frequency participants: a mean field games analysis.
Mathematics and Financial Economics, 10:223–262, 2016.
252. A. Lachapelle, J. Salomon, and G. Turinici. Computation of mean field equilibria in
economics. Mathematical Models and Methods in Applied Sciences, 20:567–588, 2010.
253. A. Lachapelle and M.T. Wolfram. On a mean field game approach modeling congestion and
aversion in pedestrian crowds. Transportation Research Part B: Methodological, 45:1572–
1589, 2011.
254. D. Lacker. Mean field games via controlled martingale problems: Existence of markovian
equilibria. Stochastic Processes and their Applications, 125:2856–2894, 2015.
255. D. Lacker. A general characterization of the mean field limit for stochastic differential games.
Probability Theory and Related Fields, 165:581–648, 2016.
256. D. Lacker. Limit theory for controlled McKean-Vlasov dynamics. http://arxiv.org/1609.08064, 2016.
257. D. Lacker and K. Webster. Translation invariant mean field games with common noise.
Electronic Communications in Probability, 20, 2015.
258. O.A. Ladyzenskaja, V.A. Solonnikov, and N. N. Ural’ceva. Linear and Quasi-linear Equa-
tions of Parabolic Type. Translations of Mathematical Monographs. American Mathematical
Society, 1968.
259. J.M. Lasry and P.L. Lions. A remark on regularization in Hilbert spaces. Israel Journal of
Mathematics, 55, 1986.
260. J.M. Lasry and P.L. Lions. Jeux à champ moyen I. Le cas stationnaire. Comptes Rendus de
l’Académie des Sciences de Paris, ser. I, 343:619–625, 2006.
261. J.M. Lasry and P.L. Lions. Jeux à champ moyen II. Horizon fini et contrôle optimal. Comptes
Rendus de l’Académie des Sciences de Paris, ser. I, 343:679–684, 2006.
262. J.M. Lasry and P.L. Lions. Mean field games. Japanese Journal of Mathematics, 2:229–260,
2007.
263. M. Laurière and O. Pironneau. Dynamic programming for mean field type control. Comptes
Rendus Mathematique, ser. I, 352:707–713, 2014.
264. G.M. Lieberman. Second Order Parabolic Differential Equations. World Scientific, 1996.
265. P.L. Lions. Théorie des jeux à champ moyen et applications. Lectures at the Collège
de France. http://www.college-de-france.fr/default/EN/all/equ_der/cours_et_seminaires.htm,
2007–2008.
266. P.L. Lions. Estimées nouvelles pour les équations quasilinéaires. Seminar in Applied
Mathematics at the Collège de France. http://www.college-de-france.fr/site/pierre-louis-
lions/seminar-2014-11-14-11h15.htm, 2014.
267. P.L. Lions and A.S. Sznitman. Stochastic differential equations with reflecting boundary
conditions. Communications on Pure and Applied Mathematics, 37:511–537, 1984.
268. K. Lye and J.M. Wing. Game strategies in network security. International Journal of
Information Security, 4:71–86, 2005.
269. M. Ajtai, J. Komlós, and G. Tusnády. On optimal matchings. Combinatorica, 4:259–264,
1984.
270. M. Fuhrman, Y. Hu, and G. Tessitore. Ergodic BSDEs and optimal ergodic control in Banach
spaces. SIAM Journal on Control and Optimization, 48:1542–1566, 2009.
271. J. Ma, P. Protter, and J. Yong. Solving forward-backward stochastic differential equations
explicitly – a four step scheme. Probability Theory and Related Fields, 98:339–359, 1994.
272. J. Ma, Z. Wu, D. Zhang, and J. Zhang. On well-posedness of forward-backward SDEs - a
unified approach. Annals of Applied Probability, 25:2168–2214, 2015.
273. J. Ma, H. Yin, and J. Zhang. On non-Markovian forward backward SDEs and backward
stochastic PDEs. Stochastic Processes and their Applications, 122:3980–4004, 2012.
274. J. Ma and J. Yong. Forward-Backward Stochastic Differential Equations and their Applica-
tions. Volume 1702 of Lecture Notes in Mathematics. Springer-Verlag Berlin Heidelberg,
2007.
275. J. Ma and J. Zhang. Path regularity for solutions of backward stochastic differential equations.
Probability Theory and Related Fields, 122:163–190, 2002.
276. B. Maslowski and P. Veverka. Sufficient stochastic maximum principle for discounted control
problem. Applied Mathematics & Optimization, 70:225–252, 2014.
277. H.P. McKean. A class of Markov processes associated with nonlinear parabolic equations.
Proceedings of the National Academy of Sciences of the USA, 56:1907–1911, 1966.
278. H.P. McKean. Propagation of chaos for a class of nonlinear parabolic equations. Lecture
Series in Differential Equations, 7:41–57, 1967.
279. S. Méléard. Asymptotic behaviour of some interacting particle systems; McKean-Vlasov
and Boltzmann models. In D. Talay, L. Denis, L. Tubaro, editors, Probabilistic Models for
Nonlinear Partial Differential Equations. Volume 1627 of Lecture Notes in Mathematics,
pages 42–95. Springer-Verlag Berlin Heidelberg, 1996.
280. P.-A. Meyer and W.A. Zheng. Tightness criteria for laws of semimartingales. Annales de l'Institut
Henri Poincaré, Probabilités et Statistiques, 20:353–372, 1984.
281. T. Meyer-Brandis, B. Øksendal, and X.Y. Zhou. A mean field stochastic maximum principle
via Malliavin calculus. Stochastics, 84:643–666, 2012.
282. T. Mikami. Markov marginal problems and their applications to Markov optimal control. In
W.M. McEneaney, G.G. Yin, Q. Zhang, editors, Stochastic Analysis, Control, Optimization
and Applications, A Volume in Honor of W.H. Fleming, pages 457–476. Birkhäuser, Boston,
1999.
283. T. Mikami. Monge’s problem with a quadratic cost by the zero-noise limit of h-path processes.
Probability Theory and Related Fields, 129:245–260, 2004.
284. P. Milgrom and J. Roberts. Rationalizability, learning, and equilibrium in games with strategic
complementarities. Econometrica, 58:1255–1277, 1990.
285. M. Nourian, P.E. Caines, and R.P. Malhamé. Mean field analysis of controlled Cucker-Smale
type flocking: Linear analysis and perturbation equations. In S. Bittanti, editor, Proceedings
of the 18th IFAC World Congress, Milan, August 2011, pages 4471–4476. Curran Associates,
Inc., 2011.
286. S. Morris and H.S. Shin. Unique equilibrium in a model of self-fulfilling currency attacks.
American Economic Review, 88:587–597, 1998.
287. N. Krylov and M. Safonov. An estimate for the probability of a diffusion process hitting a set
of positive measure. Doklady Akademii Nauk SSSR, 245:18–20, 1979.
288. J. Nash. Equilibrium points in n-person games. Proceedings of the National Academy of
Sciences of the USA, 36:48–49, 1950.
289. J. Nash. Non-cooperative games. Annals of Mathematics, 54:286–295, 1951.
290. K. Nguyen, T. Alpcan, and T. Basar. Stochastic games for security in networks with
interdependent nodes. In Proceedings of the 2009 International Conference on Game Theory
for Networks, 13–15 May, 2009, Istanbul, pages 697–703, 2009.
291. S. Nguyen and M. Huang. Linear-quadratic-Gaussian mixed games with continuum-
parametrized minor players. SIAM Journal on Control and Optimization, 50:2907–2937,
2012.
292. S. Nguyen and M. Huang. Mean field LQG games with mass behavior responsive to a major
player. In Proceedings of the 51st IEEE Conference on Decision and Control, pages 5792–
5797, 2012.
293. M. Nourian and P. Caines. ε-Nash mean field game theory for nonlinear stochastic dynamical
systems with major and minor agents. SIAM Journal on Control and Optimization, 51:3302–
3331, 2013.
294. M. Nutz. A mean field game of optimal stopping. Technical report, Columbia University,
https://arxiv.org/abs/1605.09112, 2016.
295. F. Otto. The geometry of dissipative evolution equations: the porous medium equation.
Communications in Partial Differential Equations, 26:101–174, 2001.
296. E. Pardoux. Homogenization of linear and semilinear second order parabolic PDEs with
periodic coefficients: A probabilistic approach. Journal of Functional Analysis, 167:469–
520, 1999.
297. E. Pardoux and S. Peng. Adapted solution of a backward stochastic differential equation.
Systems & Control Letters, 14:55–61, 1990.
298. E. Pardoux and S. Peng. Backward SDEs and quasilinear PDEs. In B. L. Rozovskii and
R. B. Sowers, editors, Stochastic Partial Differential Equations and Their Applications.
Volume 176 of Lecture Notes in Control and Information Sciences. Springer-Verlag Berlin
Heidelberg, 1992.
299. E. Pardoux and A. Răşcanu. Stochastic Differential Equations, Backward SDEs, Partial
Differential Equations. Stochastic Modelling and Applied Probability. Springer International
Publishing, 2014.
300. E. Pardoux and S. Tang. Forward-backward stochastic differential equations and quasilinear
parabolic PDEs. Probability Theory and Related Fields, 114:123–150, 1999.
301. K.R. Parthasarathy. Probability on Metric Spaces. Chelsea AMS Publishing, 1967.
302. S. Peng. A general stochastic maximum principle for optimal control problems. SIAM Journal
on Control and Optimization, 28:966–979, 1990.
303. S. Peng. Probabilistic interpretation for systems of quasilinear parabolic partial differential
equations. Stochastics and Stochastics Reports, 37:61–74, 1991.
304. S. Peng. A generalized dynamic programming principle and Hamilton-Jacobi-Bellman
equation. Stochastics and Stochastics Reports, 38:119–134, 1992.
305. S. Peng. Stochastic Hamilton Jacobi Bellman equations. SIAM Journal on Control and
Optimization, 30:284–304, 1992.
306. S. Peng and Y. Shi. Infinite horizon forward-backward stochastic differential equations.
Stochastic Processes and their Applications, 85:75–92, 2000.
307. S. Peng and Z. Wu. Fully coupled forward-backward stochastic differential equations and
applications to optimal control. SIAM Journal on Control and Optimization, 37:825–843,
1999.
308. J.P. Penot. Calculus Without Derivatives. Graduate Texts in Mathematics. Springer-Verlag
New York, 2012.
309. H. Pham. On some recent aspects of stochastic control and their applications. Probability
Surveys, 2:506–549, 2005.
310. H. Pham. Continuous-time Stochastic Control and Optimization with Financial Applications.
Stochastic Modelling and Applied Probability. Springer-Verlag Berlin Heidelberg, 2009.
311. H. Pham and X. Wei. Bellman equation and viscosity solutions for mean field stochastic
control problem. ESAIM: Control, Optimisation and Calculus of Variations, to appear.
312. H. Pham and X. Wei. Dynamic programming for optimal control of stochastic McKean-
Vlasov dynamics. SIAM Journal on Control and Optimization, 55:1069–1101, 2017.
313. E. Pimentel and V. Voskanyan. Regularity theory for second order stationary mean-field
games. Indiana University Mathematics Journal, 66:1–22, 2017.
314. F. Priuli. Linear-quadratic N-person and mean-field games: Infinite horizon games with
discounted cost and singular limits. Dynamic Games and Applications, 5:397–419, 2015.
315. P. Protter. Stochastic Integration and Differential Equations. A New Approach. Stochastic
Modelling and Applied Probability. Springer-Verlag Berlin Heidelberg, 1990.
316. J. Quastel and S.R.S. Varadhan. Diffusion semigroups and diffusion processes corresponding
to degenerate divergence form operators. Communications on Pure and Applied Mathematics,
50:667–706, 1997.
317. S.T. Rachev and L. Rüschendorf. Mass Transportation Problems I: Theory. Probability and
Its Applications. Springer-Verlag New York, 1998.
318. M. Reed and B. Simon. Methods of Modern Mathematical Physics. I: Functional Analysis.
Academic Press, San Diego, 1980.
319. J.C. Rochet and X. Vives. Coordination failures and the lender of last resort. Journal of the
European Economic Association, 2:1116–1148, 2004.
320. R.T. Rockafellar. Convex Analysis. Princeton University Press, 1970.
321. M. Royer. BSDEs with a random terminal time driven by a monotone generator and their
links with PDEs. Stochastics and Stochastics Reports, 76:281–307, 2004.
322. W. Rudin. Real and Complex Analysis. McGraw-Hill, New York, 1966.
323. J. Schauder. Der Fixpunktsatz in Funktionalräumen. Studia Mathematica, 2:171–180, 1930.
324. Y. Sun. The exact law of large numbers via Fubini extension and characterization of insurable
risks. Journal of Economic Theory, 126:31–69, 2006.
325. A.S. Sznitman. Topics in propagation of chaos. In P.-L. Hennequin, editor, École d'Été de
Probabilités de Saint-Flour XIX – 1989. Volume 1464 of Lecture Notes in Mathematics, pages
165–251. Springer-Verlag Berlin Heidelberg, 1989.
326. T. Bielecki, M. Jeanblanc, and M. Rutkowski. Hedging of defaultable claims. In R. Carmona
et al., editors, Paris Princeton Lectures on Mathematical Finance 2003. Volume 1847 of
Lecture Notes in Mathematics, pages 1–132. Springer-Verlag Berlin Heidelberg, 2004.
327. T. Bielecki, S. Crépey, M. Jeanblanc, and M. Rutkowski. Arbitrage pricing of defaultable game
options with applications to convertible bonds. Quantitative Finance, 8:795–810, 2008.
328. X. Tan and N. Touzi. Optimal transportation under controlled stochastic dynamics. Annals of
Probability, 41:3201–3240, 2013.
329. H. Tanaka. Stochastic differential equations with reflecting boundary condition in convex
regions. Hiroshima Mathematical Journal, 9:163–177, 1979.
330. A. Tarski. A lattice-theoretical fixpoint theorem and its applications. Pacific Journal of Mathematics, 5:285–309,
1955.
331. R. Temam. Navier-Stokes Equations. AMS Chelsea, 1984.
332. H. Tembine, Q. Zhu, and T. Basar. Risk-sensitive mean-field stochastic differential games.
IEEE Transactions on Automatic Control, 59:835–850, 2014.
333. R. Tevzadze. Solvability of backward stochastic differential equations with quadratic growth.
Stochastic Processes and their Applications, 118:503–515, 2008.
334. N. Touzi. Optimal Stochastic Control, Stochastic Target Problems, and Backward SDE. Fields
Institute Monographs. Springer-Verlag New York, 2012.
335. J. Vaillancourt. On the existence of random McKean-Vlasov limits for triangular arrays of
exchangeable diffusions. Stochastic Analysis and Applications, 6(4):431–446, 1988.
336. A.Y. Veretennikov. On strong solutions and explicit formulas for solutions of stochastic integral
equations. Matematicheskii Sbornik, 111:434–452, 1980.
337. C. Villani. Topics in Optimal Transportation. Graduate Studies in Mathematics. American
Mathematical Society, 2003.
338. C. Villani. Optimal Transport, Old and New. Grundlehren der mathematischen Wis-
senschaften. Springer-Verlag Berlin Heidelberg, 2009.
339. X. Vives. Nash equilibrium with strategic complementarities. Journal of Mathematical
Economics, 19:305–321, 1990.
340. H. Xing and G. Žitković. A class of globally solvable Markovian quadratic BSDE systems
and applications. The Annals of Probability, to appear.
341. J. Yong. Linear forward backward stochastic differential equations. Applied Mathematics &
Optimization, 39:93–119, 1999.
342. J. Yong. Linear forward backward stochastic differential equations with random coefficients.
Probability Theory and Related Fields, 135:53–83, 2006.
343. J. Yong and X. Zhou. Stochastic Controls: Hamiltonian Systems and HJB Equations.
Stochastic Modelling and Applied Probability. Springer-Verlag New York, 1999.
344. L.C. Young. Lectures on the Calculus of Variations and Optimal Control Theory. W.B. Saunders, Philadelphia, 1969.
345. E. Zeidler. Nonlinear Functional Analysis and its Applications I: Fixed-Point Theorems.
Springer-Verlag New York, 1986.
346. A.K. Zvonkin. A transformation of the phase space of a diffusion process that removes the
drift. Matematicheskii Sbornik, 93:129–149, 1974.
Assumption Index
Note: Page numbers in Roman refer to Volume I and those in italics refer to Volume II
A
Approximate Nash HJB, 455
Approximate Nash SMP, 456
Approximate Nash with a Common Noise HJB, 472
Approximate Nash with a Common Noise SMP, 473

C
Coefficients Growth, 408
Coefficients MFG with a Common Noise, 158
Compatibility Condition, 11
Conditional MKV FBSDE in Small Time, 330
Conditional MKV SDE, 116
Control, 126, 242
Control Bounds, 173
Control of MKV Dynamics, 555

D
Decoupling Master, 260
Discrete MFG Cost Functions, 648
Discrete MFG Rates, 642

E
EMFG, 306

F
FBSDE, 132, 244
FBSDE in Random Environment, 10

G
Games, 68, 71
Games SMP, 84

H
Hamiltonian Minimization in Random Environment, 80
HJB in Random Environment, 81

I
Iteration in Random Environment, 67

J
Joint Chain Rule, 484
Joint Chain Rule Common Noise, 279

L
Large Symmetric Cost Functional, 9
Lasry-Lions Monotonicity, 169
Lipschitz FBSDE, 217
Lipschitz FBSDE in Random Environment, 54
L-Monotonicity, 175

M
Major Convexity, 560
Major Hamiltonian, 559
Major Minor Convergence, 565
Notation Index
Note: Page numbers in Roman refer to Volume I and those in italics refer to Volume II

Symbols
$\hat{\alpha}$, 98, 75
$\delta_x$, 5, 7, 151
$\lfloor x \rfloor$, 370, 191
$[\cdot, \cdot]$, 14
$\langle \varphi, \mu \rangle$, 522
$\langle\langle \cdot, \cdot \rangle\rangle$, 552
$\nabla \varphi$, 373
$\Phi$, 18, 443
$\Pi_p(\mu, \mu')$, 145
$\Pi_p(\mu, \nu)$, 350
$\Pi_p^{\mathrm{opt}}(\mu, \nu)$, 353
$\Pi_2^{\mathrm{opt}}(\mu, \mu')$, 435
$\Pi_2^{\mathrm{opt}}(\mu, \nu)$, 373
$\Omega_{\mathrm{canon}}$, 22, 581
$\Omega_{\mathrm{canon}}^{\mathrm{relax}}$, 586
$\Omega_{\mathrm{input}}$, 52, 624, 652
$\Omega_{\mathrm{output}}$, 22
$\Omega_{\mathrm{total}}$, 22
$A^{(N)}$, 68
$\mathbb{A}$, 131, 74, 126
$\mathbb{A}^{(N)}$, 68
$\mathcal{E}_s$, 343
$\mathcal{E}^0_s$, 343, 363
$H$, 287
$H^{(r)}$, 78, 287
$\mathbb{H}^{0,n}$, 235
$\mathbb{H}^{2,n}$, 69, 153, 235, 18
$I_d$, 188, 377, 432
$J$, 646
$J1$, 21, 663
$L^p_{\mathrm{loc}}([0, T] \times \mathbb{R}^D)$, 83, 509
$\mathrm{Leb}_1$, 16
$\mathcal{M}_2$, 136
$\mathcal{M}_2(\cdot)$, 151, 159
$\mathcal{M}_p$, 136
$\mathcal{M}_{p,E}$, 136
$N(0, 1)$, 443, 383
$O(\cdot)$, 365, 111, 277
$P \circ X^{-1}$, 518, 127
$P_X$, 400, 518, 127
$\mathbb{S}^{2,n}$, 69, 153, 235
$\mathbb{S}^2([t, T]; \mathbb{R}^q)$, 345
$\mathrm{Supp}(\cdot)$, 269, 12, 398, 477
$\mathrm{cl}(\cdot)$, 394
$W_1$, 536
$W_2$, 144, 124
$W_2(\mu, \mu')$, 236, 519
$W_p(\mu, \mu')$, 145
$W_p(\mu, \nu)$, 352

N
NCE, 209

O
ODE, 47, 183, 188, 198
OLNE, 73
OU, 31, 115

P
PDE, 37, 53

S
SDE, 37, 48
SIFI, 541
SPDE, 38, 112
Author Index
Note: Page numbers in Roman refer to Volume I and those in italics refer to Volume II
M
Ma, J., 211, 342, 105, 106, 319
Malamud, S., 65
Malhamé, R.P., 209, 212, 677, 678, 537
Manso, G., 65
Marchi, C., 679
Maslowski, B., 678
McCann, R.J., 429
McKean, H.P., 342
Méléard, S., 342
Mémin, J., 617, 104, 105, 663
Mészáros, A.R., 209
Meyer-Brandis, T., 615
Meyer, P.-A., 162, 163, 235
Mikami, T., 616, 618
Milgrom, P., 62, 662
Mohr, J., 679
Moll, B., 63
Mordecki, E., 63
Morris, S., 62

N
Nash, J., 5, 72
Nguyen, D.H., 104
Nguyen, K., 65
Nguyen, S., 661

Q
Quastel, J., 616
Quenez, M.C., 210, 342

R
Rachev, S.T., 510
Rainer, C., 511, 321, 446
Ramanan, K., 345
Răşcanu, A., 320
Reed, M., 344
Réveillac, A., 106
Roberts, J., 62, 662
Rochet, J.C., 62
Rockafellar, R.T., 375, 377
Royer, M., 678
Rudin, W., 152
Rüschendorf, L., 510
Rutkowski, M., 663
Rykov, V., 679

S
Safonov, M., 539
Salomon, J., 537
Subject Index
Note: Page numbers in Roman refer to Volume I and those in italics refer to Volume II
A
accrued interest, 653
adjoint
  equation, 143, 162, 276, 524, 525, 530–532, 96
  process, 85, 90, 143, 276, 517, 531, 96
  system, 161
  variable, 143, 524, 525, 531, 532
admissibility for FBSDEs in random environment
  admissible lifting, 132
  admissible probability, 23, 30
  admissible set-up, 11
  admissible set-up with initial information, 43
  t-initialized admissible set-up, 46
admissibility for Weak Formulation of MKV
  admissible triple, 581
approximate Nash equilibrium
  closed loop, 459, 475, 476
  for mean field game of timing, 613
  open loop, 458, 475
  semi-closed loop, 473
Arzelà-Ascoli theorem, 260
atomless probability space, 352, 629
attacker in security model, 54, 603
attraction-repulsion model, 50

B
bang-bang control, 657, 677, 505
bank run, 20, 62, 611, 614, 656
bankruptcy code, 654
bee swarm, 594
Benamou and Brenier's theorem, 427, 511
Berge's maximum theorem, 647, 648
Bertrand game, 46
best response
  in the alternative description of MFG, 133
  example, 26, 41, 45, 59
  function or map, 5, 6, 76, 80, 89
  in MFG, 132, 140, 144, 212
  in MFG with major and minor players, 586, 591, 600
  for strategies in open loop, 86
Blackwell and Dubins' theorem, 397
BMO, see bounded mean oscillation (BMO)
bond
  callable, 653
  convertible, 654
Borel σ-field, 350
botnet (model with), 656, 603
bounded mean oscillation (BMO)
  condition, 342
  martingale, 234, 511
  norm, 234, 511
bounded set in $\mathcal{P}_2(\mathbb{R}^d)$, 388
Brenier's theorem, 377, 511
Brownian motion
  as a common noise, 109
  killed, 325
  reflected, 324

C
càd-làg process
  Meyer-Zheng topology, 168, 235
  Skorohod topology, 21, 105
callable bond, 653
canonical process (or variable)
  for FBSDEs in a random environment, 23
  for MFG of timing, 623
dual variables, 77, 263, 267, 525, 529, 547, 75
dynamic programming principle
  for the master field, 248
  for MKV, 568

E
empirical distribution, 7, 96, 97, 130, 351, 408, 644, 447, 543, 611
  for a continuum of players, 202
empirical projection, 462, 471
endogenous default, 28
equilibrium
  deterministic (in a finite game), 73
  strong, 140, 613
  weak, 141, 613
essentially pairwise independent, 202
essentially pairwise uncorrelated, 202
exchangeability, 110, 480
exchangeable
  random variables, 110
  strategy profile, 480
exhaustible resources, 44
exit distribution, 325
exogenous default, 28
extended mean field games, 33, 36, 63, 96, 130, 294
extension
  Fubini, 202

F
FBSDE, see forward backward stochastic differential equation (FBSDE)
feedback function, 71, 79, 132, 134, 140, 515, 521, 79, 458, 477
filtration, 19, 28
  augmentation, 7
  immersion and compatibility, 5
  natural filtration, 7
fire-sale, 22
first exit time, 325
first order interaction, 385, 554
fixed point
  for games, 6, 76, 80
  MFG formulation, 132, 133, 626, 614
  MFG with a common noise, 127, 194
fixed point theorem
  Kakutani-Fan-Glicksberg, 648
  Picard, 237, 116
  Schauder, 150, 247, 155
  Tarski, 619
flocking, 340, 594
Fokker-Planck equation, 37, 45, 139, 140, 147
  for MKV, 522, 533
  stochastic, 112, 150
forward backward stochastic differential equation (FBSDE)
  as characteristics of a PDE, 149, 320
  decoupling field, 278
  disintegration, 42
  infinite dimensional, 150
  Kunita-Watanabe decomposition, 14, 80
  McKean-Vlasov type, 146, 240, 138, 324
  overview, 141
  with random coefficients, 3
  in a random environment, 13
  in a random environment, canonical process, 23
  in a random environment, canonical space, 22
  strong uniqueness, 16
  weak uniqueness, 21
  Yamada-Watanabe theorem, 34
Fréchet differential, 527
friction model, 50
Fubini extension, 202
  rich, 203
Fubini, theorem and completion, 114
full $C^2$-regularity, 464, 267, 415, 419
functional derivative, 415, 527, 566, 654

G
game
  complementarities, 615
  with a finite state space, 644, 670
  linear quadratic, 104, 308
  with major and minor players, 542
  potential, 15, 99, 602
  supermodular, 616
  of timing, 19
  unimodular, 615
  zero-sum, 54
Gateaux differential, 527
generalized closed loop equilibrium, 475
general set-up, 141, 156, 241, 466
geodesic, 429
  constant speed, 429
Girsanov theorem, 154, 231, 90, 512
Glivenko-Cantelli theorem, 201, 361

H
Hamiltonian
  for control in random environment, 75
  for a finite state space, 649
K
Kakutani-Fan-Glicksberg's theorem, 648, 663
Kantorovich duality, 354
Kantorovich-Rubinstein
  distance, 354
  norm, 249, 250, 258, 281
  theorem, 510
kernel, 4, 5
killed Brownian motion, 325
Kolmogorov equation, see Fokker-Planck equation

M
major player, 54, 541
Malliavin calculus, 615
marked point process, 646, 679
market impact
  instantaneous, 34
  temporary, 34
Markovian control for MKV, 520
Markovian equilibrium, 31, 75
  convergence, 519
  existence and uniqueness, 517
  for games with a finite state space, 645