
Contributions in Mathematical and Computational
Sciences 11
Frederik Graw
Franziska Matthäus
Jürgen Pahle Editors
Modeling
Cellular
Systems
Contributions in Mathematical
and Computational Sciences
Volume 11
Series editors
Hans Georg Bock
Willi Jäger
Hans Knüpfer
Otmar Venjakob
More information about this series at http://www.springer.com/series/8861
Editors
Frederik Graw, Heidelberg University, Heidelberg, Germany
Jürgen Pahle, Heidelberg University, Heidelberg, Germany
Franziska Matthäus, Heidelberg University, Heidelberg, Germany
ISSN 2191-303X ISSN 2191-3048 (electronic)

Contributions in Mathematical and Computational Sciences
ISBN 978-3-319-45831-1 ISBN 978-3-319-45833-5 (eBook)
DOI 10.1007/978-3-319-45833-5
Library of Congress Control Number: 2016960568
© Springer International Publishing Switzerland 2017

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, express or implied, with respect to the material contained herein or
for any errors or omissions that may have been made.
Printed on acid-free paper
This Springer imprint is published by Springer Nature

The registered company is Springer International Publishing AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
The life of multicellular organisms, such as humans and animals, is a complex
dynamic process requiring the constant interaction of various molecules, cells,
organs and other factors within the organism itself, and in exchange with the
environment. In the human body, each cell has its specific task that is necessary to
enable us to walk, breathe, produce energy or fight infections. Only the tight
regulation of multiple processes on various scales, be it on the genetic, molecular or
cell population level, leads to the proper functioning of the complete system. Each
cell on its own also represents a complex, multi-scale system that receives,
processes and transmits information by cellular signalling, the production of enzymes
and proteins, or by adapting its mechanical properties or cell cycle dynamics.
While, step by step, we learn more and more about individual elements and pro-
cesses of cellular dynamics and interactions, how these parts are connected within
the complete system still remains an area of active research. Currently, some major
topics of interest are the investigation of interacting processes across different
spatio-temporal scales (e.g. the genetic or chemical regulation of complex cell
behaviours like division, motility or ageing), the importance and effects of the
inherent stochasticity in biological systems, and the way in which the complex
interactions of many cellular components in response to a stimulus or new environment
are integrated into a single, well-defined cellular behaviour.
To better understand the processes regulating the functioning of single cells, or
the interaction between cells, mathematical models provide help to formulate
hypotheses in an abstract way and represent a means to make predictions.
Mathematical models formalise physical and chemical laws underlying biological
systems, for instance chemical reaction kinetics or diffusion. They allow the inte-
gration of elements and processes from various sources and scales in a systematic
and quantitative framework, i.e. studying the interdependency of individual pro-
cesses. Using these models, we are also able to quantify elements that are not
directly observable in lab experiments and to identify key processes that shape the
behaviour of the system.
The development of mathematical models as a tool to generate understanding of
biological processes has always been driven by the progress of experimental
techniques. Novel experimental data and observations usually also require the
development of appropriate mathematical methods in order to analyse and interpret
these data in a meaningful way. The theoretical considerations on particle diffusion
followed the development of microscopy. Similarly, one of the most famous
mathematical models in biology, the Hodgkin–Huxley model describing the
molecular basis of action potentials in nerves, was guided by observations based
on voltage clamp, a new experimental technique that made it possible to measure ion
currents of neuronal membranes and to control the membrane potential.
In recent years, biological sciences have experienced enormous advances in
experimental techniques that allow the quantification of biological processes in more
and more detail. In particular, imaging technologies have improved substantially in
terms of spatial and temporal resolution, also driven by the development of novel
fluorescent dyes and techniques. Single molecule spectroscopy allows the
visualisation and quantification of single molecules within individual cells. With live-cell
and two-photon imaging, as well as 3-D microscopy, we are able to observe and track
the behaviour of single cells within particular organs over time, or even across an
entire organism. Further experimental techniques, such as high-throughput -omics
technologies quantifying entire chemical subsystems of a cell (e.g. proteins, meta-
bolic compounds or gene expression), cellular barcoding, which enables us to follow
the fate of individual cells during differentiation and migration, or traction force
microscopy, generate tremendous amounts and various types of quantitative data.
We strongly expect that, inspired by the data generated by these technologies, novel
mathematical models and methods will have to be developed in order to formalise and
test newly generated hypotheses about biological processes, and to provide a
systematic and quantitative understanding of the often complex structures and
interactions across multiple scales. Without the help of mathematical models, a proper
quantification and interpretation of these novel types of data and observations is
practically impossible. Mathematical models are essential to describe the underlying
mechanistic processes spanning different spatio-temporal scales and the interaction
of many biological components. They provide a more thorough understanding of the
observed data than purely statistical approaches, as they go beyond simple quanti-
tative and qualitative comparisons.
The process of mathematical model building is challenging and often
tedious, as many different requirements have to be considered:
1. Does the model correctly describe the experimental data?
2. Are the assumptions on the underlying biological processes plausible and how
sensitive are our conclusions with regard to these assumptions?
3. Does the model comprise the minimum complexity necessary to explain all
observed behaviours, or is a further reduction sensible?
4. Does the model allow predictions that can be tested experimentally and, thus,
allow model validation?
5. Is a rigorous mathematical analysis of the model possible? If not, can the model
be analysed numerically?
The chapters of this book contain examples of mathematical modelling
approaches describing and analysing cellular systems on different scales. The
different models and technical approaches depend on the biological system being
investigated, and the particular question addressed. The consideration of
stochasticity within the modelling procedures plays an important role.
The first chapter by Ádám M. Halász, Meghan McCabe Pryor, Bridget S. Wilson
and Jeremy S. Edwards on “Spatiotemporal Modeling of Membrane Receptors”
provides a detailed overview of mathematical modelling approaches for cellular
processes with a specific focus on the processes involved in early signalling of
membrane receptors. Mathematical modelling of chemical reactions and reaction
networks is discussed, as well as the necessity for stochastic approaches when
the copy numbers of the involved compounds are small. The chapter introduces
models of increasing complexity starting from simple approaches based on reaction–
diffusion models up to models including specific spatial aspects such as agent-based
modelling systems accounting for Brownian and non-Brownian motion or anomalous
diffusion. This chapter describes and discusses classical modelling principles and
approaches also used in the following chapters. It provides an introduction to the field
of mathematical modelling in biology, also accessible to non-experts in the field.
Elaborating on the topic of stochasticity of chemical reactions, Chapter “Distribution
Approximations for the Chemical Master Equation: Comparison of the Method of
Moments and the System Size Expansion”—contributed by Alexander
Andreychenko, Luca Bortolussi, Ramon Grima, Philipp Thomas, and Verena Wolf—
focuses on stochastic chemical kinetics. The chapter presents and compares two
different approaches to estimate the probability distribution associated with the
amounts of different chemical species: the indirect estimation via moment closure
techniques and the direct approach using system size expansion. The two approxi-
mation methods analysed represent useful tools for the analysis of large-scale models.
The authors of Chapter “Sampling from T Cell Receptor Repertoires”, Marco
Ferrarini, Carmen Molina-París and Grant Lythe, address a related problem by
trying to predict the complete repertoire of T-cell clonotypes, a specific type of
immune cells, from a sample of a few hundred T-cells. The estimation of the size
and the distribution of these T-cell clonotypes is challenging due to the total
population size, which is distributed heterogeneously across the whole organism,
and the immense natural variability of clonotype construction. Therefore, a large
probability space needs to be estimated from extremely sparse experimental data.
Using various assumptions on the underlying distribution of clone sizes, they
investigate the possibility and challenges to infer the total population structure
based on small individual samples.
Chapters “IL-2 Stimulation of Regulatory T Cells: A Stochastic and Algorithmic
Approach” and “Understanding the Role of Mitochondria Distribution in Calcium
Dynamics and Secretion in Bovine Chromaffin Cells” couple stochastic modelling of
signalling processes determining cell–cell interactions and higher order cell functions,
respectively. In Chapter “IL-2 Stimulation of Regulatory T Cells: A Stochastic and
Algorithmic Approach” an algorithmic approach is presented by Luis de la Higuera,
Martín López-García, Grant Lythe, and Carmen Molina-París to solve and analyse a
stochastic model of receptor-mediated cell–cell interactions. In the underlying biological
scenario looking at the IL-2 dependency of regulatory T-cells, one cell type, i.e.
effector T-cells, provides the chemical substance needed by the second cell type, i.e. the
regulatory T-cells, for survival. The authors provide an algorithmic approach to
determine the rates at which receptor–ligand interactions are formed and thereby define
cell fate. Chapter “Understanding the Role of Mitochondria Distribution in Calcium
Dynamics and Secretion in Bovine Chromaffin Cells”, contributed by Amparo Gil,
Virginia González-Vélez, José Villanueva and Luis M. Gutiérrez, couples a stochastic
description of intracellular calcium signalling with cell function, i.e. exocytosis. The
regulatory relationship between signalling and function is hereby affected by the spatial
distribution of cell organelles, in particular mitochondria, since they limit the diffusion
of the involved components. The authors present a spatially explicit modelling scheme
analysing the dynamics of intracellular components.
The book concludes with Chapters “Dynamical Features of the MAP Kinase
Cascade” and “Numerical Treatment of the Filament-Based Lamellipodium Model
(FBLM)”, which comprise two different deterministic modelling approaches at two
different scales. In Chapter “Dynamical Features of the MAP Kinase Cascade”,
Juliette Hell and Alan D. Rendall use a deterministic modelling approach to
describe the dynamics of the MAPK signalling cascade, a widespread intracellular
signalling pathway within eukaryotes. They perform a mathematical analysis of the
system described by ordinary differential equations showing its qualitative beha-
viour under different conditions. Finally, Chapter “Numerical Treatment of the
Filament-Based Lamellipodium Model (FBLM)”, authored by Angelika Manhart,
Dietmar Oelz, Christian Schmeiser and Nikolaos Sfakianakis, introduces a con-
tinuous model coupling cell mechanics with the dynamics of cytoskeleton com-
ponents, discussing its mathematical analysis and showing corresponding numerical
simulations based on a finite-element method.
In summary, the different chapters in this book address various types of math-
ematical models and methods to describe and analyse biological systems and
processes on different cellular scales. They cover a variety of biological topics
reaching from the analysis of intracellular signalling pathways to the level of cell
mechanics and cytoskeleton structuring, up to the regulation of cell populations
within immune responses. The different approaches demonstrate how challenging
mathematical problems arise from the mechanistic description of cellular processes
and interactions. The importance of the development of such models, as well as
their rigorous mathematical and numerical analysis is steadily increasing in line
with the progress of measurement techniques in quantitative biology. The combi-
nation of detailed quantitative measurements of increasing resolution in time and
space with novel mathematical models will help us to get a systems level under-
standing of individual processes, and might finally lead us to a better understanding
of the dynamics and regulation of cellular processes that shape life.
Heidelberg, Germany
Frederik Graw
Franziska Matthäus
Jürgen Pahle
Organisation
Programme Chairs
Frederik Graw (BIOMS, BioQuant/IWR, Heidelberg University)
Franziska Matthäus (BIOMS, BioQuant/IWR, Heidelberg University)
Jürgen Pahle (BIOMS, BioQuant, Heidelberg University)
Contents
Spatiotemporal Modeling of Membrane Receptors
Ádám M. Halász, Meghan McCabe Pryor, Bridget S. Wilson and Jeremy S. Edwards
Distribution Approximations for the Chemical Master Equation: Comparison of the Method of Moments and the System Size Expansion
Alexander Andreychenko, Luca Bortolussi, Ramon Grima, Philipp Thomas and Verena Wolf
Sampling from T Cell Receptor Repertoires
Marco Ferrarini, Carmen Molina-París and Grant Lythe
IL-2 Stimulation of Regulatory T Cells: A Stochastic and Algorithmic Approach
Luis de la Higuera, Martín López-García, Grant Lythe and Carmen Molina-París
Understanding the Role of Mitochondria Distribution in Calcium Dynamics and Secretion in Bovine Chromaffin Cells
Amparo Gil, Virginia González-Vélez, José Villanueva and Luis M. Gutiérrez
Dynamical Features of the MAP Kinase Cascade
Juliette Hell and Alan D. Rendall
Numerical Treatment of the Filament-Based Lamellipodium Model (FBLM)
Angelika Manhart, Dietmar Oelz, Christian Schmeiser and Nikolaos Sfakianakis
Author Index
Spatiotemporal Modeling of Membrane Receptors
Ádám M. Halász, Meghan McCabe Pryor, Bridget S. Wilson and Jeremy S. Edwards
Abstract We discuss our approach to the detailed computational modeling of the
molecular processes involved in early signaling of membrane-bound receptors,
typically exemplified by members of the receptor tyrosine kinase (RTK) family. This
includes receptors whose mutations are associated with increased risk of cancers
(ErbB2) or are involved in the survival of nascent tumors (Kdr). Current imaging
methods can visualize individual molecules in the context of the living cell, allow-
ing the direct observation of molecular movement and transformations. Modeling
and simulation are necessary to connect these observations to the cell-level kinetics
of signaling and to help reveal connections between molecular properties and cell
signaling, under both normal and pathological conditions. We describe the relevant
methods and provide a minimal mathematical justification for the reader interested
in understanding or applying them. The chapter builds up from the simplest mod-
eling approach to the fully spatial, agent-based simulation that is currently used by
our group. This should be useful from a tutorial perspective and also to provide the
proper connections between models at different levels of granularity.
Á.M. Halász (corresponding author)
Department of Mathematics, West Virginia University,
P.O.Box 6310, Morgantown, WV 26506, USA
e-mail: halasz@math.wvu.edu
M.M. Pryor
Department of Biomedical Engineering, School of Medicine,
Johns Hopkins University, Baltimore, MD 21205, USA
e-mail: mmccabe9@jhu.edu
B.S. Wilson
Department of Pathology, University of New Mexico Health Science Center,
Albuquerque, NM 87131, USA
e-mail: BWilson@salud.unm.edu
J.S. Edwards
Department of Chemistry and Chemical Biology, University of New Mexico,
MSC03 2060, Albuquerque, NM 87131, USA
e-mail: jsedward@unm.edu
© Springer International Publishing Switzerland 2017
F. Graw et al. (eds.), Modeling Cellular Systems, Contributions in Mathematical
and Computational Sciences 11, DOI 10.1007/978-3-319-45833-5_1
1 Introduction
The technological advances of the past few decades fueled the fast growth of a
quantitative dimension to biology. The work discussed here describes computational
approaches for simulating molecular processes and predicting their effects on the
behavior and output of the overall system. The consideration of “dynamics” offers
improved understanding of stochastic processes, such as diffusion-limited reactions,
and is markedly distinct from approaches that infer causal connections between
genes, mutations, phenotypes, and pathologies from large datasets.
We define mathematical models that, at least in part, recapitulate the behavior
of living cells observable in quantitative experiments. Models that integrate the
dynamics of living systems are reductionist and mechanistic; they result from the
composition of simple, physically motivated descriptions of the various elements
taken separately. In contrast with occasional misperceptions,1 the resulting inte-
grative dynamical models are precisely the way to understand or predict emerging
system properties. Furthermore, the focus of our discussion is on processes that are
random,2 and therefore the predictions that such models can make are likely versions
of system behavior.
Modeling Membrane Dynamics
Very significant advances in experimental techniques, in particular, high-resolution
microscopy combined with the ability to reliably label molecules of interest, opened
the door to a wealth of data on the movement and chemical state of individual
molecules. This has the potential to greatly refine the already vast knowledge of
the biochemistry of cell signaling, accumulated through more traditional avenues of
investigation, such as in vitro identification of molecular structures and reactions,
and cell-level in vivo measurements of molecular processes.
Modern microscopy has revealed that the cell membrane presents a varied land-
scape which modulates the movement of membrane bound proteins. The distrib-
ution of receptors and other biomolecules is typically not homogeneous, and is
characterized by structures at spatial scales ranging from several hundreds to tens of
nanometers. Molecules exhibit clustering and anomalous diffusion, deviating from
the null-hypothesis of uniform, random movement. This complicates the interpre-
tation of experiments aimed at ascertaining or quantifying the details of known
molecular processes.
Modeling processes on the cell membrane requires taking into account both the
discrete nature of molecules, and the inhomogeneous spatial context. While mature
methods exist for handling both aspects, available software, and the majority of
modeling effort in the field, is not well adapted to handling the relevant spatial scale,
ranging from just above the diameter of a receptor to a fraction of a eukaryotic cell
(from 10 to a few hundred nm).
1 In a “fundamentalist” version of reductionism (motivated by nineteenth century classical Physics,
but long since abandoned in the physical sciences [40]), the properties of a system must be a sum
of the properties of its parts, negating the possibility of emerging system properties.
2 There is a distinction between processes that are intrinsically random and deterministic chaotic
behaviors that can be treated as random.
The material in this chapter is naturally divided into four sections. The rest of
this section discusses (1) the continuous description of well-mixed chemical reaction
systems. The following sections are devoted to: (2) stochastic models for well-mixed
systems; (3) modeling and simulation of Brownian motion (diffusion) of individual
molecules and finally (4) stochastic, agent-based simulation of reaction–diffusion
systems. We provide a brief summary and a few closing ideas in Sect. 5.
1.1 Background: Chemical Reaction Networks, ODE Simulations
The elementary building blocks of living cells are organic molecules, proteins, lipids,
and sugars. Their size ranges from a few atoms to large complexes of tens of thousands
of atoms that organize into molecular machines. All the functions of the cell are real-
ized through chemical transformations, changes in shape, assembly, and movement
of molecular entities.
1.1.1 Cells as Chemical Factories
The language of chemistry provides an approximate but natural framework for a
quantitative description of the state and processes inside a living cell. The state of
the cell is specified in terms of the number, internal state, and location of all the
molecules it is comprised of. We may account for all of these by extending the
notion of a chemical species A, B, . . . to mean a type of molecular aggregate, in
a specific internal state, in a specific location (section) within the cell (membrane,
cytoplasm, nucleus). Then, a reaction A → B + C may refer to an actual chemical
transformation, A(e) → A(i) would represent the transport of a substance from the
extracellular space into the cytoplasm, and B → B∗ could refer to the activation of
a kinase.
There is an inevitable trade-off between the scope and the level of detail of the
modeling framework. Even the simplest living cells contain hundreds of different
species of molecules. A practical model typically focuses on a much smaller set of
biomolecular species and processes. At a minimum, one keeps track of the amount of
each species. This is typically represented as a concentration, or amount of substance
per unit of volume (or area). We will denote the concentration of substance Si by
[Si ].
It is useful to specify the precise meaning of these terms. Chemical reactions
described by equations such as
L + 2R → RLR (1)
refer to the way individual molecules interact. In particular, the reaction above refers
to a process when one molecule of species L joins with two molecules of species
R, and they together form a single molecule of species RLR. Even if one does not
explicitly model molecules, the stoichiometric coefficients in (1) specify the exact
proportions in which the reactants combine.
Molecules of different species typically have different sizes or molecular weights;
reaction (1) does not mean that 1 g of L and 2 g of R will combine into 1 g
of RLR. The amount of substance is a macroscopic quantity that is proportional to
the number of molecules: 1 mol of any substance corresponds to the same number
of molecules, $N_{\mathrm{Avogadro}} \approx 6 \cdot 10^{23}$, known as Avogadro’s number. Going back
to the reaction (1), 1 mol of L combined with 2 mol of R will indeed form
1 mol of RLR. We summarize below the connection between the concentration $[A]$,
the number of molecules $n_A$, and the amount of substance $\nu_A = n_A / N_{\mathrm{Avogadro}}$ in a
volume $V$:
$$[A] = \frac{\nu_A}{V} = \frac{n_A}{V \cdot N_{\mathrm{Avogadro}}}. \tag{2}$$
The base unit for concentration is the molar; 1 M corresponds to 1 mol per 1 litre of
solution (the concentration is often referred to as molarity).
1.1.2 Chemical Reaction Network Theory
In general, a chemical reaction network is defined by a set of species $S_1, S_2, \dots, S_n$
that are subject to reactions $R_1, R_2, \dots, R_m$. Reaction $R_j$ is defined by a chemical
equation and a rate $\varphi_j$:
$$\sum_{i=1}^{n} \alpha_{ij} S_i \to \sum_{i=1}^{n} \beta_{ij} S_i ; \qquad \varphi_j = f_j([S_1], [S_2], \dots, [S_n]), \tag{3}$$
and is fully specified by the stoichiometric coefficients $\{\alpha_{ij}, \beta_{ij}\}_{i=1 \dots n}$ and the rate
function $f_j(\cdot)$. The state of the system at any time $t$ is given by the concentrations of
all substances, usually arranged into a vector $X = ([S_1], [S_2], \dots, [S_n])^T$ (see footnote 3). The net
stoichiometric coefficients $\gamma_{ij} \equiv \beta_{ij} - \alpha_{ij}$ form a matrix of size $n \times m$, $\Gamma \equiv \{\gamma_{ij}\}$.
If we also arrange the fluxes into a column vector $\phi = (\varphi_1, \varphi_2, \dots, \varphi_m)^T$, we may
write out the equations of motion of the CRN model:
$$\frac{dX}{dt} = \Gamma \cdot \phi(X) \tag{4}$$
3 We implicitly assume here that the reactions take place between substances in the same solution,
and therefore the stoichiometric coefficients carry over from the reactions to the concentrations.
The equations of motion are fully determined when the rate functions (or laws) f j (·)
in (3) are specified. The rate through a certain reaction typically depends on the
concentrations of the species on the origin side; the rate is generally a nondecreasing
function of these concentrations, and it must vanish if one or more of the concen-
trations becomes zero. Several types of functions can serve as a rate law; the exact
form is ultimately determined by the system under consideration. Mass-action rates
play a special role due to their close connection to simple molecular dynamics; the
mass-action rate for reaction $R_j$ in a generic CRN (3) is:
$$\varphi_j = k_j \prod_{i=1}^{n} [S_i]^{\alpha_{ij}} \tag{5}$$
The mass-action rate constant k j connects the continuum description with molecular
properties.
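As an illustration of Eqs. (4) and (5), the following Python sketch integrates the mass-action dynamics of reaction (1), L + 2R → RLR. The rate constant, the initial concentrations, the step size, and the choice of a fixed-step fourth-order Runge–Kutta integrator are assumptions made for this example only; none of them is taken from the chapter.

```python
# Minimal sketch: mass-action ODE for the single reaction L + 2R -> RLR.
# All numerical values (K, initial state, dt) are illustrative assumptions.

SPECIES = ["L", "R", "RLR"]
ALPHA = [[1], [2], [0]]   # reactant coefficients alpha_ij (n x m)
BETA = [[0], [0], [1]]    # product coefficients beta_ij (n x m)
K = [1.0]                 # hypothetical mass-action rate constants k_j

# Net stoichiometric matrix Gamma = beta - alpha, size n x m.
GAMMA = [[b - a for a, b in zip(ra, rb)] for ra, rb in zip(ALPHA, BETA)]


def fluxes(x):
    """Mass-action rates phi_j = k_j * prod_i x_i ** alpha_ij, Eq. (5)."""
    phi = []
    for j, k in enumerate(K):
        rate = k
        for i, conc in enumerate(x):
            rate *= conc ** ALPHA[i][j]
        phi.append(rate)
    return phi


def rhs(x):
    """Right-hand side dX/dt = Gamma . phi(X), Eq. (4)."""
    phi = fluxes(x)
    return [sum(g * p for g, p in zip(row, phi)) for row in GAMMA]


def rk4_step(x, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = rhs(x)
    k2 = rhs([xi + 0.5 * dt * ki for xi, ki in zip(x, k1)])
    k3 = rhs([xi + 0.5 * dt * ki for xi, ki in zip(x, k2)])
    k4 = rhs([xi + dt * ki for xi, ki in zip(x, k3)])
    return [xi + dt / 6.0 * (a + 2 * b + 2 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]


def simulate(x0, dt=0.001, steps=5000):
    """Integrate from x0 with a fixed step and return the final state."""
    x = list(x0)
    for _ in range(steps):
        x = rk4_step(x, dt)
    return x


if __name__ == "__main__":
    x_final = simulate([1.0, 3.0, 0.0])
    for name, value in zip(SPECIES, x_final):
        print(f"[{name}] = {value:.4f}")
```

Because every change in the state is a multiple of the single column of Γ, the combinations [L] + [RLR] and [R] + 2[RLR] stay constant along the trajectory, which is a convenient sanity check for any implementation of Eq. (4).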
Despite the compact appearance, the equations of motion (4)–(5) define a
class of complex nonlinear dynamical systems. Chemical reaction network theory
[5, 11, 32] provided a number of results regarding the possibility and number of equi-
libria (steady states) of CRN systems. Control theory provides methods to simplify
these systems and to classify their possible behaviors [36]. Such involved mathe-
matical tools are necessary to gain insight into the complex behavior of signaling
networks [4, 12, 14].
Independently of the theoretical insights, CRN models provide a basis for numer-
ical simulations. There are mature methods for the simulation of sizeable ODE
systems, implemented in widely available software tools. Simulations can be used
to predict the behavior of cells under a variety of conditions, and are a valuable tool
in biological, pharmaceutical, and medical research.
1.1.3 Need for a Stochastic Approach
The molecular nature of matter is always reflected in the exact proportions in
which different substances combine. In a macroscopic context, the number of
molecules involved is very large, comparable to Avogadro’s number $N_{\mathrm{Avogadro}} \approx 6 \cdot 10^{23}$ copies/mol;
the discrete nature of matter is irrelevant when it comes to measuring
the amount of substance. This is emphatically not the case with living cells.
The diameter of a human cell is a few microns; consider a cell of 10 µm$^3$ that contains
100 nM of a certain molecular species. That works out to
$$N = \left(10 \cdot 10^{-18}\,\mathrm{m^3}\right) \times \left(100 \cdot 10^{-9} \cdot 10^{3}\,\frac{\mathrm{mol}}{\mathrm{m^3}}\right) \times \left(6 \cdot 10^{23}\,\frac{\mathrm{copies}}{\mathrm{mol}}\right) = 600\ \mathrm{copies}.$$
With larger cells and concentrations of several µM, the copy number becomes large,
albeit not astronomical. The number of membrane bound receptors of one type ranges
from a few thousand to $10^5$ per cell, certainly not a small number. However, adding
spatial structure, for instance signaling islands that contain a few tens or hundreds of
receptors, further reduces the number of copies of a molecule that can be reasonably
thought of as “uniformly distributed” in a specific region of space.
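Estimates of this kind follow from Eq. (2) solved for the number of molecules, and are easy to check numerically. The short Python sketch below does so; the function name `copy_number` and the choice of units (mol/L and µm^3) are assumptions made for the illustration.

```python
# Minimal sketch: copy number from concentration and volume, Eq. (2).
N_AVOGADRO = 6.022e23  # molecules per mol (rounded)


def copy_number(conc_molar, volume_um3):
    """Number of molecules for a concentration in mol/L and a volume in um^3."""
    litres_per_um3 = 1e-15  # 1 um^3 = 1e-18 m^3 = 1e-15 L
    amount_mol = conc_molar * volume_um3 * litres_per_um3
    return amount_mol * N_AVOGADRO


if __name__ == "__main__":
    # The example from the text: 100 nM in a volume of 10 um^3.
    print(round(copy_number(100e-9, 10.0)))
    # A hypothetical larger cell: 1 uM in a volume of 1000 um^3.
    print(round(copy_number(1e-6, 1000.0)))
```

Restricting the volume to a small subregion, such as a signaling island, scales the result down proportionally, which is how copy numbers of a few tens arise even for abundant species.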
Thus, the notion of describing the amount of substance of a certain type as a
continuous quantity is not necessarily correct, raising questions about the validity of
a modeling approach along the lines of (1)–(5). For instance, it is not very meaningful
to talk about the “concentration” of a certain gene/promoter combination, when
referring to a single, specific cell. The gene is either present or not; it may have more
than one copy, but there cannot be exactly 1.376 copies of it.
On the other hand, it is perfectly meaningful to talk about an average copy number
of 1.376, referring to cells within a population. Even when dealing with relatively
small numbers of molecules within a given cell, the amount of substance or its various
proxies4 may be understood as averages over a large population.
The connection between continuous dynamics as described by (1)–(5) and the
physical processes includes averaging the stochastic dynamics of a large number
of cells. Individual cell behaviors may factor in the ensemble average in ways that
cannot be accounted for in the ODE picture; the continuum equations of motion
provide a necessary framework for large-scale modeling, but they may not capture
the primordial physical reality.
Thus, kinetic constants that reasonably recapitulate the observed behavior of cell
populations may be very different from the physical constants that control the under-
lying molecular processes. One goal of the more detailed (and necessarily, more
resource intensive) modeling approaches discussed here is to investigate and uncover
quantitative relations between the average and the relevant microscopic properties.
2 Reactions (Stochastic)
The discrete and stochastic nature of biomolecular processes limits the validity of
continuous, ordinary differential equation based (ODE) models (4), but does not
eliminate our ability to predict the future states of the system. While we cannot pre-
dict the exact states of the system, we can estimate the probabilities of the possible
outcomes, either individually, or defined by certain quantifiable features (observ-
ables). Such a calculation comes as close as possible to a deterministic model, in
that it provides the likelihood of all possible versions of the future behavior of the
system.
Accounting for all possible states is practical only for relatively simple systems.
As the number of random elements in the system increases, the number of possible
outcomes grows exponentially; typically, the set of likely outcomes does not increase
as fast and represents an increasingly small fraction. Consideration of all individual
outcomes becomes both exceedingly expensive and unnecessary.
4 Such as concentration, mass, etc.

Fig. 1 The probability that the difference between the number of heads and tails exceeds
10% of the number of flips of a fair coin decreases dramatically with the number of
trials. For 1,000 flips the probability is $8.6 \cdot 10^{-4}$, and for 10,000 it becomes
astronomically small, $7.75 \cdot 10^{-24}$. A significant bias observed in a large number
of trials would strongly indicate that the coin is not fair

Instead, enough insight can be gained by generating a sample of possible outcomes
by emulating the random variables in the system. Stochastic simulations are the main
approach used in the biomolecular context. It is important to keep in mind that,
unlike ODE simulations which predict the future behavior of the system, a stochastic
simulation produces one possible future behavior of the system.
Example: Sets of Coin Flips
A much-invoked analogy with games of chance may be illuminating for some readers.
One cannot predict the outcome of a single coin flip, but (assuming the coin is fair) we
expect that on average, the two possible outcomes (head and tail) are equally likely.
Given the probability of the elementary outcomes, we can calculate the likelihood
of a specific number of outcomes of one type from a given number of repeated trials.
We may only be interested in the excess number of heads or tails; the probability
that the difference $|N_\text{heads} - N_\text{tails}|$ exceeds 10% of the number of trials
$N_\text{flips} = N_\text{heads} + N_\text{tails}$ is shown in Fig. 1.⁵ The number of excess
heads or tails is an example of an observable, a measurable quantity that can be derived
from an ordered set of coin flips.
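The probability of such an observable can be computed exactly by summing binomial weights. The following is a minimal sketch; the function name `prob_excess` and its `percent` parameter are illustrative choices, not taken from the text. It uses a two-sided, strict inequality, so its values depend on that convention and need not coincide exactly with the numbers quoted in Fig. 1.

```python
def prob_excess(n_flips, percent=10):
    """Exact probability that |N_heads - N_tails| exceeds `percent` % of n_flips
    for a fair coin (two-sided, strict inequality; an assumed convention)."""
    favourable = 0   # number of flip sequences that satisfy the condition
    ways = 1         # binomial coefficient C(n_flips, k), starting at k = 0
    for k in range(n_flips + 1):          # k = number of heads
        if 100 * abs(2 * k - n_flips) > percent * n_flips:
            favourable += ways
        ways = ways * (n_flips - k) // (k + 1)   # C(n, k+1) from C(n, k)
    return favourable / 2 ** n_flips             # exact integers, one division


for n in (10, 100, 1000):
    print(n, prob_excess(n))
```

Working with exact integers and dividing only once avoids the underflow that would occur if each term were converted to floating point separately.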
The set of possible values an observable could take is much smaller than the
set of possible outcomes (or sample space). The length of the longest sequence of
consecutive heads or tails in a set of 1,000 flips is an observable which also depends
on the order of individual coin flips; the number of distinct outcomes is enormous,
$2^{1000} \approx 10^{301}$, while the length of the longest streak can only take 1,000 values.
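As an illustration of an order-dependent observable, the sketch below extracts the longest streak from one simulated set of flips. The helper name `longest_streak` and the seed are assumptions made for this example.

```python
import random


def longest_streak(flips):
    """Length of the longest run of identical consecutive outcomes."""
    best = current = 0
    previous = None
    for flip in flips:
        # Extend the current run, or start a new run of length 1.
        current = current + 1 if flip == previous else 1
        best = max(best, current)
        previous = flip
    return best


rng = random.Random(1)                         # fixed seed, for reproducibility only
flips = [rng.randint(0, 1) for _ in range(1000)]
print(longest_streak(flips))                   # one value out of 1,000 possible ones
```

Each run of the script with a different seed samples one point of the enormous sample space, but the observable it reports falls in a much smaller set.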
5 Note that for an odd number of trials the number of heads and tails cannot be equal, increasing the
likelihood of a given relative difference.

2.1 Dealing with the Molecular Nature of Chemical Reactions
In a molecular framework, we keep track of the copy number of molecules of each
species. This defines the state of the system, which can be changed through reactions.
Reactions occur at random times; the physical flux (or rate) through each reaction (3)
is replaced by a probability per time for the occurrence of an instance of the reaction.
2.1.1 Intrinsic Transformation of a Single Molecule
To illustrate the principle of representing reactions by random processes, consider a
system with two types of molecules A, B and a single reaction type, A → B. The state
of the system at any time $t$ is defined by the number of molecules of the two types,
$(n_A(t), n_B(t))$. The idealized model for this type of transformation is an abstraction
of molecular dynamics acting on finer time and spatial scales; we assume that each
molecule of type A has an intrinsic tendency to spontaneously convert into a molecule
of type B.
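For each A molecule that exists at time $t$, the probability that it will convert
to B over the subsequent short time interval $(t, t + \Delta t)$ is proportional to the
length $\Delta t$:⁶
$$ \lim_{\Delta t \to 0} \frac{\Delta p_{A\to B}(t, t + \Delta t)}{\Delta t} = k_{A\to B} \;\Rightarrow\; \Delta p_{A\to B}(t, t + \Delta t) \approx k_{A\to B}\,\Delta t . \tag{6} $$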
The proportionality constant $k_{A\to B}$ is an inverse time, and is characteristic of the
transformation. The transition is triggered independently of the rest of the system
and of the past history of the transitioning molecule.
This defines a Poisson process with rate $k_{A\to B}$ that triggers the transition; the
process ‘fires’ at some time, $\tau \in (t, t + \Delta t)$, with probability as described above. It
can be conceptualized as a random experiment whose outcome is the firing time $\tau$;
alternatively we think of it as a time dependent random system whose state changes
from A to B.
The meaning of the probability (6) is clarified if we think of multiple copies of
the system. Assume we started with $N_A$ molecules of A and we have $n_A(t)$ of them
at time $t$. The change in $n_A$ over $(t, t + \Delta t)$ is the same as the number of A → B
transformations that take place in this time, which can be approximated by (6):
$$ n_A(t) - n_A(t + \Delta t) \approx \Delta p_{A\to B}(t, t + \Delta t) \cdot n_A(t) \approx n_A(t) \cdot k_{A\to B} \cdot \Delta t + O(\Delta t^2). \tag{7} $$
6 Strictly speaking, only the limit version of Eq. (6) is exact; the second form is for $\Delta p \ll 1$.
The approximate equalities in (7) neglect relative differences on the order of
$1/\sqrt{\Delta n_A}$. For molecule counts $n_A(t)$ comparable to
$N_\text{Avogadro} \approx 6 \times 10^{23}$, this is perfectly satisfactory, with room for
infinitesimally small time steps $\Delta t \ll 1/k$. In this interpretation, we may treat
$n_A$ as continuous (replacing it with $\nu_A = n_A/N_\text{Avogadro}$), and write (7) as a
proper limit similarly to (6):
$$ \lim_{\Delta t \to 0} \frac{n_A(t + \Delta t) - n_A(t)}{\Delta t} = \frac{dn_A}{dt} = -k_{A\to B} \cdot n_A(t). \tag{8} $$
With the initial condition $n_A(0) = N_A$, the differential equation implies
$$ n_A(t) = N_A \cdot e^{-k_{A\to B} t} . \tag{9} $$
Not surprisingly, Eqs. (8) and (9) are formally identical to those one would obtain
for the concentration in the continuum picture (4)–(5). If $N_A$ is large, the relative
fluctuations around this average are small (on the order of $1/\sqrt{N_A}$).
But what about small copy numbers, where $\Delta n_A = n_A k_{A\to B} \Delta t \ll 1$, making the
limit in Eq. (8) meaningless? The interpretation that also works in this case is that
since we are describing a stochastic system of initially $N_A$ copies of A molecules,
$n_A(t)$ represents the average, $\langle n_A(t) \rangle$, of the number of A molecules at time $t$. The
average can be thought of as taken over a large number of copies of the system.
Importantly for our purposes, we can also think of $N_A$ in terms of a similar thought
experiment, as the number of copies of the random system represented by a single A
molecule that exists at time 0. Then, the ratio $n_A(t)/N_A$ approaches the probability
that, at time $t$, the molecule under consideration has not yet converted to B;
$$ p_A(t) = \lim_{N_A \to \infty} \frac{n_A(t)}{N_A} = e^{-k_{A\to B} t} . \tag{10} $$
This line of reasoning also provides the PDF (probability density function) $f(\tau)$ of
the time $\tau$ when a specific A particle converts into B. Assuming that the particle
belongs to species A at time $\tau = 0$, the probability that it is still an A at time $\tau = t$
is one minus the integral of $f(\tau)$ over the interval $[0, t]$:
$$ 1 - \int_0^t f(\tau)\,d\tau = e^{-k_{A\to B} t} \;\Leftrightarrow\; f(\tau) = k_{A\to B} \cdot e^{-k_{A\to B} \tau} . \tag{11} $$
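The exponential law for the firing time can be checked numerically by drawing firing times for many independent copies of a single A molecule and counting those that have not yet converted at time t. This is a sketch only: the function names, the rate, the time and the seed are illustrative assumptions.

```python
import math
import random


def firing_time(k, rng):
    """Draw one firing time from f(tau) = k * exp(-k * tau) (inverse-transform sampling)."""
    return -math.log(1.0 - rng.random()) / k


def surviving_fraction(k, t, n_copies, rng):
    """Fraction of n_copies independent A molecules that have not converted by time t."""
    survivors = sum(1 for _ in range(n_copies) if firing_time(k, rng) > t)
    return survivors / n_copies


rng = random.Random(0)      # fixed seed, for reproducibility only
k, t = 2.0, 0.5             # illustrative rate and observation time
print(surviving_fraction(k, t, 20000, rng), math.exp(-k * t))
```

With a large number of copies the two printed values should be close, in line with the limit (10).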
2.1.2 Combining Poisson Processes
We know how to describe or simulate the behavior of a single molecule of A. We
did estimate the average number $\langle n_A(t) \rangle$ of A molecules left after time $t$ (9). A
complete analysis should provide either the probability of having exactly
$0, 1, 2, \ldots, N_A$ molecules at an arbitrary time $t$, or a prescription to simulate the
evolution of the system, consistently with the underlying stochastic properties.
The key mathematical element is the composition property of Poisson processes
[34]. Suppose that at $t = 0$ we start two independent Poisson processes, with rates
$k_1$ and $k_2$, and that we are only interested in the first time either of them fires. It turns
out that the time of this first firing follows the same law as the firing time of a Poisson
process that starts at $t = 0$ and has rate $k_T = k_1 + k_2$. This can be shown rigorously,
using the PDF of the two firing times (11). Furthermore, the probability that process
1 fires first is $p_1 = k_1/k_T$.
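The composition property lends itself to a quick numerical check. The sketch below races two exponential clocks many times and reports the mean time of the first firing and the fraction of races won by process 1; these should approach 1/(k1 + k2) and k1/(k1 + k2). The function name `race` and the rate values are assumptions made for illustration.

```python
import random


def race(k1, k2, n_trials, rng):
    """Run n_trials races between two independent Poisson processes.

    Returns (mean time of the first firing, fraction of races won by process 1).
    """
    total_time = 0.0
    wins_1 = 0
    for _ in range(n_trials):
        t1 = rng.expovariate(k1)     # firing time of process 1
        t2 = rng.expovariate(k2)     # firing time of process 2
        total_time += min(t1, t2)
        if t1 < t2:
            wins_1 += 1
    return total_time / n_trials, wins_1 / n_trials


k1, k2 = 1.0, 3.0                    # illustrative rates
mean_first, frac_1 = race(k1, k2, 20000, random.Random(0))
print(mean_first, 1.0 / (k1 + k2))
print(frac_1, k1 / (k1 + k2))
```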
Based on this property, we can easily show that the probability that any one
of the $n_A(t)$ molecules of species A that exist at time $t$ converts over $(t, t + \Delta t)$ is
$n_A \cdot k_{A\to B} \cdot \Delta t$. In other words, the conversion of any one of the $n_A$ molecules is
triggered by a Poisson process with rate $g_{A\to B} = n_A \cdot k_{A\to B}$. Finally, the time of this
next instance of the reaction A → B is distributed according to the PDF
$$ f(\tau) = g_{A\to B} \cdot e^{-g_{A\to B} \tau} ; \quad g_{A\to B} = n_A \cdot k_{A\to B} \tag{12} $$
The propensity $g_{A\to B}$ refers to all possible instances of the reaction A → B.
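A minimal sketch of how the propensity (12) drives a simulation of the pure decay A → B: after each conversion the propensity is recomputed from the current copy number. The function name `molecules_left` and all numerical values are illustrative assumptions; the sample mean is compared with the average (9).

```python
import math
import random


def molecules_left(n_start, k, t_end, rng):
    """Number of A molecules remaining at time t_end in one realization of A -> B."""
    t, n = 0.0, n_start
    while n > 0:
        t += rng.expovariate(n * k)   # waiting time with propensity g = n * k
        if t > t_end:
            break                     # the next conversion falls after t_end
        n -= 1
    return n


rng = random.Random(0)
n_start, k, t_end = 20, 1.0, 1.0      # illustrative values
runs = [molecules_left(n_start, k, t_end, rng) for _ in range(5000)]
print(sum(runs) / len(runs), n_start * math.exp(-k * t_end))
```

Individual runs return integers that scatter around the average; only the mean over many runs follows the exponential law.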

The principle of combining Poisson processes also helps when we have several
types of reactions. Suppose that A can also convert into C, with a molecular rate $k_{A\to C}$.
The “dilemma” facing each A molecule is easily quantified: the rate of a given A
molecule converting to either B or C is $k_{A\to X} = k_{A\to B} + k_{A\to C}$; the probability that
it becomes a B as a result is $p_B = k_{A\to B}/k_{A\to X}$. Finally, in a system with $n_A$ copies
of A, the propensity of A → C is $g_{A\to C} = n_A \cdot k_{A\to C}$, the joint propensity of the two
reactions is $n_A \cdot k_{A\to X}$, and the probability that the next reaction is A → B is still
$p_B = k_{A\to B}/k_{A\to X}$.
2.1.3 Simulations
A simulation provides the evolution of the state of a system over time, based on the
knowledge of its dynamics. An ODE model along the lines of (3)–(5) is deterministic
in the sense that the initial conditions fully determine the state of the system at any
future time. By contrast, a stochastic model allows for several possible future states;
the initial state determines the probabilities of the different possible outcomes. As
both the ODE and the stochastic models we discuss are surrogates of the system
being modeled, the future states we obtain are those of the respective model systems.
A simulation of an ODE model provides the future behavior of the (model) system,⁷
while a single simulation run of a stochastic model provides one possible
version of the future (a single realization of a probabilistic experiment). A complete
account of the future behavior of a stochastic system should specify the probabilities
of all possible states over time. A possible starting point for this approach is the
so-called Master Equation [34], which can be readily set up for the type of stochastic
models discussed in this section.
7 Which, in light of the discussion at Eqs. (7)–(9), approximates the average behavior of a more
detailed, stochastic model of the same biomolecular system, and by extension, the average behavior
of the system being modeled.
A major drawback of “brute force” approaches along these lines is that the likely
states typically represent a very small fraction of the set of possible states. Going
back to the example of coin tosses, the likelihood of observing a streak of $l \geq 25$
consecutive heads or tails in a single set of 1,000 flips is $\approx 2.25 \cdot 10^{-7}$, or one in 4.4
million. Even though the longest streak can be anywhere between 1 and 1,000, only
≈2.5% of these possible outcomes are ever observed.
We will focus exclusively on stochastic simulations that provide individual his-
tories of the system, by emulating the random processes involved using random
numbers generated from the appropriate distributions. A sizeable number of simula-
tion runs provides an incomplete but informative account of the future states of the
system, by necessarily sampling the likely states and avoiding the unlikely states.
As we proceed to discuss various simulation methods, we ask the reader to keep
in mind that the proper use of all stochastic simulations implies the estimation of
the variability of the predicted values, typically by generating a sample of several
simulation runs.
2.2 Stochastic Simulation Algorithm
The exact stochastic simulation algorithm (SSA) for a generic system of chemical
reactions is often referred to as Gillespie simulation [8, 9] due to the pioneering work
of Gillespie, who defined the terminology and proposed the use of the approach for
mesoscopic, well-mixed chemical systems.
Consider a system of several species and reactions among them, similar to (3).
We will denote the species by A, B, C, … The states of the system are defined by
the number of molecules (copies) of each molecular species, $(n_A, n_B, \ldots)$.
Reactions occur as instantaneous, random events. In the context of the SSA we do not
identify individual molecules of the same species; similarly, an instance of a reaction
corresponds to a discrete state transition that converts one set of molecules of the
appropriate species, according to the chemical equation. For example, one instance
of the reaction A → B + C means $(n_A, n_B, n_C) \to (n_A - 1, n_B + 1, n_C + 1)$. The
set of all possible states of a given system is therefore discrete and may be finite or
infinite.
Reactions are triggered by Poisson processes, similarly to what we discussed
for the reaction A → B. In the context of the SSA, we associate a single Poisson
process to all the instances of a given reaction. The rate of the corresponding Poisson
process is called the propensity of the reaction (denoted g), and it typically depends
on the copy number of the species that are involved. The exact functional dependence
reflects modeling and empirical considerations.
2.2.1 Mass-Action Propensities
In the discussion leading up to (12) we assumed that the possible instances of the
reaction A → B were driven by identical random processes, and were therefore
equally likely. If we can generally assume that each possible instance of a reaction
is equally likely, irrespective of past history and the configuration of the system, we
obtain the stochastic equivalent of mass-action kinetics.
In this model, each reaction is triggered by a Poisson process, whose rate is called
the propensity of the reaction. The propensity of a given reaction R, starting from
a given state of the system $(n_A, n_B, \ldots)$, is the number of possible instances $Z_R$
multiplied by an elementary propensity $k_R^{(\text{molec})}$ that characterizes the reaction,
$$ g_R(n_A, n_B, \ldots) = Z_R(n_A, n_B, \ldots) \cdot k_R^{(\text{molec})} . \tag{13} $$
The role of elementary propensities is similar to that of the macroscopic rate constants
$k_R^{(\text{macro})}$. They can be related to molecular considerations or to macroscopic rates.
For first-order reactions (involving a single molecule: A → ···), the number of
possible instances is $Z_{A\to\cdots} = n_A$, same as that of available A particles. We have
stated (12) that the elementary propensity coincides with the mass-action rate
constant and reflects the intrinsic (quantum mechanical) transition rate of the molecule,
$k_{A\to\cdots}^{(\text{molec})} = k_{A\to\cdots}^{(\text{macro})}$.
Let us spell out the reasoning behind this. We obtained a differential equation
(8) for the number of particles: $n_A'(t) = -k_{A\to B}^{(\text{molec})} n_A(t)$. We noticed it had the same
form as the mass-action equation of motion (4)–(5) for the concentration [A], namely
$[A]'(t) = -k_{A\to B}^{(\text{macro})} [A](t)$. Since the concentration and number of molecules are
proportional (2), we identified the ratio $[A]'/[A] \approx n_A'/n_A$ with minus the propensity
per A particle:
$$ \frac{[A]'(t)}{[A](t)} = -\lim_{V \to \infty} \frac{g_{A\to B}}{n_A} = -k_{A\to B} . \tag{14} $$
For second-order reactions such as A + B → D, the number of possible instances
is $Z_{A+B\to D} = n_A \cdot n_B$, the number of A, B pairs that can be formed from
available molecules.⁸ The propensity also depends on the collision rate between the two
molecules involved, and we therefore expect it to be inversely proportional to the
volume or space in which the molecules can move. It also reflects the interplay of
different collision geometries with the dynamics of the reaction; its estimation from
first principles is possible, but would have to include detailed molecular dynamics
considerations.
Alternatively, we may estimate the propensity by relating it to the physical kinetic
constant for the reaction. From the perspective of one participating A molecule,
the situation is similar to the first-order case: the propensity per molecule is the
probability per unit time of being involved in a reaction. In the macroscopic picture,
this should be comparable to the magnitude of the logarithmic derivative $[A]'/[A]$:
$$
\begin{gathered}
[A]' = -k_{A+B\to D}^{(\text{macro})} \cdot [A] \cdot [B] ; \quad g_{A+B\to D} = n_A \cdot n_B \cdot k_{A+B\to D}^{(\text{molec})} \\
\frac{[A]'}{[A]} = -\lim_{V \to \infty} \frac{g_{A+B\to D}}{n_A} \;\Rightarrow\; k_{A+B\to D}^{(\text{molec})} = \frac{k_{A+B\to D}^{(\text{macro})}}{V \cdot N_\text{Avogadro}}
\end{gathered}
\tag{15}
$$
8 If the reacting particles are identical, we have a combinatorial factor: $Z_{A+A\to C} = \frac{1}{2} n_A \cdot (n_A - 1)$.
The connection between elementary propensities and rate constants can be
generalized to higher order reactions, by including the appropriate number of
$V \cdot N_\text{Avogadro}$ concentration to particle number conversion factors.
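The conversion between macroscopic rate constants and elementary propensities can be sketched as a small helper. The function names, the choice of liters for the volume, and the sample values (a volume of 1 fL and a rate constant of 10^6 per molar per second) are assumptions made for this illustration; the formulas follow (13), (15) and footnote 8.

```python
N_AVOGADRO = 6.02214076e23   # particles per mole


def elementary_propensity(k_macro, order, volume_liters):
    """Convert a macroscopic mass-action rate constant to an elementary propensity.

    One factor of V * N_Avogadro is divided out for each reactant beyond the first,
    so first-order constants are unchanged.
    """
    return k_macro / (volume_liters * N_AVOGADRO) ** (order - 1)


def propensity_first_order(n_a, k_molec):
    return n_a * k_molec                      # Z = n_A


def propensity_second_order(n_a, n_b, k_molec):
    return n_a * n_b * k_molec                # Z = n_A * n_B


def propensity_identical_pair(n_a, k_molec):
    return 0.5 * n_a * (n_a - 1) * k_molec    # Z = n_A (n_A - 1) / 2


k_molec = elementary_propensity(1.0e6, order=2, volume_liters=1.0e-15)
print(k_molec, propensity_second_order(100, 80, k_molec))
```

Note that conventions for the macroscopic rate constant of reactions between identical particles differ by a factor of two between sources, so that case is left unconverted here.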
2.2.2 Competing Reactions: The Gillespie SSA and the First Reaction Method
In a generic CRN system (3) with several species and reactions, we associate a Poisson
process to each reaction. The mass-action reaction propensities are calculated as
described,
$$ g_j(n_1, n_2, \ldots) = k_j^{(\text{molec})} \cdot \prod_{i=1}^{n} (n_i)^{\alpha_{ij}} = k_j^{(\text{macro})}\, V \cdot N_\text{Avogadro} \cdot \prod_{i=1}^{n} \left( \frac{n_i}{V \cdot N_\text{Avogadro}} \right)^{\alpha_{ij}} \tag{16} $$
The Poisson processes for each possible reaction compete, in the sense that they
run concurrently and independently of each other, until one of them fires. At that
moment, one instance of the “winning” reaction (let it be $R_k$) occurs, changing the
state of the system by the vector $\boldsymbol{\gamma}_k = (\gamma_{1k}, \gamma_{2k}, \ldots)$:
$$ \mathbf{x} = (n_1, n_2, \ldots) \to \mathbf{x}^* = (n_1^*, n_2^*, \ldots) \equiv (n_1 + \gamma_{1k}, n_2 + \gamma_{2k}, \ldots) . \tag{17} $$
All Poisson processes restart, with propensities reflecting the new state vector
$\mathbf{x}^* = (n_1^*, n_2^*, \ldots)$.
The above defines a Markov chain, an idealized stochastic, well-mixed model of
the chemical reaction network. It has been shown [15] that, in the limit of infinite
volume, the expectation of concentrations predicted by this model coincides with the
traditional ODE description with mass-action rate laws (4)–(5). While this model is
still a mathematical construct, it emulates the random behavior of the number of
molecules of a real system, and its average behavior matches the ODE model; it
therefore represents a more detailed description of reality.
The Gillespie stochastic simulation algorithm [9] (exact SSA, or first reaction
method) is a first principles simulation of the Markov chain model. The state of the
system is given by the vector $\mathbf{x} = (n_1, n_2, \ldots)$. The update step is as follows. The
propensities $g_j$ of all possible reactions $R_j$ are calculated according to (16). For each
reaction $R_j$, a tentative firing time $\tau_j$ is generated from a random distribution (11)
with PDF $f_j(\tau) = g_j \exp(-g_j \tau)$. The smallest of the set $\{\tau_1, \tau_2, \ldots\}$ defines the first
reaction, $R_k$. The firing time $\tau_k$ is added to the simulation time, $t \to t^* = t + \tau_k$, and
one instance of $R_k$ is implemented by adding the vector $\boldsymbol{\gamma}_{\cdot k} = (\gamma_{1k}, \gamma_{2k}, \ldots)$ to the
state vector: $\mathbf{x} \to \mathbf{x}^* = \mathbf{x} + \boldsymbol{\gamma}_{\cdot k}$.
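The update step of the first reaction method can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not a reference one: reactions are represented as hypothetical (propensity function, change vector) pairs, the state is a tuple of copy numbers, and the example network (A → B competing with A → C) and its rates are invented for the demonstration.

```python
import math
import random


def first_reaction_step(state, reactions, rng):
    """One update of the first reaction method.

    `reactions` is a list of (propensity_function, change_vector) pairs.
    Returns (waiting time, new state); the waiting time is math.inf
    if no reaction can fire.
    """
    best_tau, best_change = math.inf, None
    for propensity, change in reactions:
        g = propensity(state)
        if g > 0.0:
            tau = rng.expovariate(g)     # tentative firing time, PDF g * exp(-g * tau)
            if tau < best_tau:
                best_tau, best_change = tau, change
    if best_change is None:
        return math.inf, state
    return best_tau, tuple(n + d for n, d in zip(state, best_change))


def simulate(state, reactions, t_end, rng):
    """Advance the state until the next firing time would exceed t_end."""
    t = 0.0
    while True:
        tau, new_state = first_reaction_step(state, reactions, rng)
        if tau == math.inf or t + tau > t_end:
            return state
        t, state = t + tau, new_state


# Hypothetical example: A -> B (rate 1.0) competing with A -> C (rate 3.0).
# State vector: (n_A, n_B, n_C).
reactions = [
    (lambda s: 1.0 * s[0], (-1, +1, 0)),
    (lambda s: 3.0 * s[0], (-1, 0, +1)),
]
final = simulate((4000, 0, 0), reactions, math.inf, random.Random(0))
print(final)   # about one quarter of the molecules should end up as B
```

One tentative firing time is drawn per reaction at every update, which is what makes this variant simple but comparatively expensive for networks with many reactions.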
Many alternative algorithms have been developed to simulate the Markov chain
model of CRN more efficiently [3, 7, 10]. Gillespie’s original direct method [8, 9]
is important in that it uses the addition property of Poisson processes to avoid the
generation of a full set of reaction times at each update. When the set of
propensities is updated to reflect the new state of the system, a total propensity is
calculated, $g_T = g_1 + g_2 + \cdots$. The next reaction time is generated from the PDF
$f(\tau) = g_T \exp(-g_T \tau)$. The identity of the reaction is chosen by comparing a
uniform random number $r \in [0, 1]$ to the cumulative sums of the reaction probabilities,
$$ 0 \le s_1 \le \cdots \le s_n = 1 , \quad \text{where } s_k \equiv \sum_{l=1}^{k} p_l , \quad p_l \equiv g_l / g_T . $$
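A sketch of one update of the direct method follows. As before, the (propensity function, change vector) representation, the function name `direct_step`, and the reversible binding example with its rate values are assumptions made for illustration. Comparing r · g_T with the cumulative sums of the propensities is equivalent to comparing r with the cumulative probabilities s_k.

```python
import math
import random


def direct_step(state, reactions, rng):
    """One update of the direct method: one waiting time, one reaction choice."""
    propensities = [propensity(state) for propensity, _ in reactions]
    g_total = sum(propensities)
    if g_total <= 0.0:
        return math.inf, state               # nothing can fire
    tau = rng.expovariate(g_total)           # PDF g_T * exp(-g_T * tau)
    threshold = rng.random() * g_total       # r * g_T, compared with sums of g_l
    cumulative, chosen = 0.0, None
    for g, (_, change) in zip(propensities, reactions):
        cumulative += g
        if g > 0.0:
            chosen = change                  # last reaction that is able to fire
            if threshold < cumulative:
                break
    return tau, tuple(n + d for n, d in zip(state, chosen))


# Hypothetical example: A + B <-> D. State vector: (n_A, n_B, n_D).
reactions = [
    (lambda s: 0.01 * s[0] * s[1], (-1, -1, +1)),   # A + B -> D
    (lambda s: 0.1 * s[2], (+1, +1, -1)),           # D -> A + B
]
rng = random.Random(0)
t, state = 0.0, (100, 80, 0)
while True:
    tau, new_state = direct_step(state, reactions, rng)
    if t + tau > 50.0:                       # stop at t_end = 50 (arbitrary units)
        break
    t, state = t + tau, new_state
print(state)
```

Only two random numbers are needed per update, regardless of the number of reactions, which is the saving referred to above.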
The hypotheses behind the Markov chain model can be relaxed to some extent,
as long as the system can be satisfactorily described by keeping track of only the
number of molecules of each species. In principle, any macroscopic rate law can be
implemented by defining an ‘effective’ rate constant that depends on the
concentrations (or particle numbers); for a generic rate law (3), the propensity would be the
macroscopic reaction rate (flux) converted to “instances per time per particle”:
$$ g_j(n_1, n_2, \ldots) = \varphi_j([S_1^*], [S_2^*], \ldots) \cdot V \cdot N_\text{Avogadro} \quad \text{where } [S_i^*] \equiv \frac{n_i}{V \cdot N_\text{Avogadro}} \tag{18} $$
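The conversion (18) can be wrapped in a small higher-order function. The sketch below is illustrative: `propensity_from_rate_law` and `michaelis_menten` are hypothetical names, the Michaelis-Menten form is used only as a familiar example of a non-mass-action rate law, and the volume and parameter values are assumptions.

```python
N_AVOGADRO = 6.02214076e23   # particles per mole


def propensity_from_rate_law(rate_law, volume_liters):
    """Turn a macroscopic rate law (concentrations -> flux) into a propensity function."""
    scale = volume_liters * N_AVOGADRO            # V * N_Avogadro

    def propensity(counts):
        concentrations = [n / scale for n in counts]   # [S_i*] = n_i / (V * N_Avogadro)
        return rate_law(concentrations) * scale        # flux converted to events per time

    return propensity


def michaelis_menten(v_max, k_m):
    """Example rate law with a single substrate, whose concentration is conc[0]."""
    return lambda conc: v_max * conc[0] / (k_m + conc[0])


volume = 1.0e-15                                   # 1 fL, an assumed cell-scale volume
g = propensity_from_rate_law(michaelis_menten(v_max=1.0e-6, k_m=1.0e-6), volume)
print(g((602,)))   # substrate copy number close to K_m, so roughly half the maximal flux
```

The returned function can be used wherever a mass-action propensity would be, provided the copy numbers alone describe the state adequately.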
3 Diffusion
An important fraction (if not most) of molecular transformations relevant to the
biological functions of living cells involve the interaction of two or more molecules
that can move independently. Such processes require the physical proximity of the
participants, and therefore depend on the position and movement of biomolecules.
The preceding discussion took into account the spatial aspect of molecular
processes only to the extent required to distinguish densities and concentrations from
mass and number of molecules. The dynamical picture ignored the spatial positions
of the interacting particles, consistent with an assumption of spatial homogeneity.
This is called the well-mixed hypothesis. A system where molecules are not distrib-
uted homogeneously is not well mixed, and its dynamics is influenced by the location
of individual molecules.
Someone approaching this subject with a background in a mature quantitative field
may find the complexity of the reaction networks “challenging enough” by itself.
The need to take into account the stochastic nature of the processes and the spatial
aspect that we are about to discuss adds further layers of complexity. However,
the study of spatial organization is necessary if we are to understand the chains of
causation that lead from mutations to pathology.
This is especially true for systems involving membrane proteins, whose dimerization
and dissociation reactions are important. Signal initiation by a host of membrane
bound receptor-ligand families relies on ligand-induced dimerization or the formation
of larger cross-linked aggregates [14, 19]. Spatial inhomogeneity, or rather, the
emergence of multiple levels of spatial organization, is a constant feature of the cell
membrane. This nontrivial landscape modulates the movement of proteins along the
membrane [18, 21, 41].
This section focuses on the movement (diffusion) of molecules along the mem-
brane; we will discuss methods for combining diffusion with reactions in the next
section.
3.1 Diffusion Basics
Historically, the term Brownian motion referred to a specific type of random motion
characteristic of particles in a colloidal suspension. Here we follow the more
current usage, to mean the idealized random motion that is consistent with
classical (Fickian) diffusion. A particle moving in two dimensions with position vector
$\mathbf{r}(t) = (x(t), y(t))$ is said to be Brownian if the components $\Delta x$, $\Delta y$ of the
displacement vector $\Delta\mathbf{r} = \mathbf{r}(t + \Delta t) - \mathbf{r}(t)$ over any interval $(t, t + \Delta t)$ are random
variables distributed according to the PDF
$$ f(\Delta x, \Delta y; \Delta t) = \frac{1}{4\pi D \Delta t} \exp\left( -\frac{\Delta x^2 + \Delta y^2}{4 D \Delta t} \right) . \tag{19} $$
The above simply states that the x- and y-displacements are distributed
independently, each following a normal distribution with variance $\sigma^2 = 2D \cdot \Delta t$. The
parameter $D$ is a measure of the mobility of the particle.
Brownian motion is random; even if we know the exact position of the particle,
its position $x, y$ at a later time can be predicted only in terms of a distribution. If we
place a single Brownian particle at $(x_0, y_0)$ at time $t = 0$, the localization probability
density coincides with (19), with $\Delta x = x - x_0$, $\Delta y = y - y_0$, and $\Delta t = t$.
$$ p(x, y; t) = \frac{1}{4\pi D t} \exp\left( -\frac{(x - x_0)^2 + (y - y_0)^2}{4 D t} \right) . \tag{20} $$
Properties of the normal distribution ensure that the function (20) is normalized at
all $t > 0$: $\iint p(x, y; t)\, dx\, dy = 1$. Its time evolution is consistent with (19), in that
the localization density at any time $t = t_0 + s > t_0$ can be obtained as a convolution
of the localization density at time $t_0$ and the displacement PDF over $\Delta t = s$:⁹
$$ p(x, y; t_0 + s) = \iint p(x', y'; t_0)\, f(x - x', y - y'; s)\, dx'\, dy' . \tag{21} $$
9 Note that (20) approaches a Dirac delta at $t = 0$: $\lim_{t \to 0} p(x, y; t) = \delta(x - x_0)\,\delta(y - y_0)$.
This behavior on the level of individual particles is closely related to the ‘bulk’
physical process of diffusion. Setting the matter of overlaps between particles aside
for now,¹⁰ we can describe a set of $n$ Brownian particles by the sum of the individual
localization probability density functions $p^{(j)}(x, y; t)$. This joint localization density
function is normalized to the number of particles:
$$ \rho(x, y; t) \equiv \sum_{j=1}^{n} p^{(j)}(x, y; t) ; \quad \iint \rho(x, y; t)\, dx\, dy = n . \tag{22} $$
The integral of $\rho(x, y)$ over any finite area corresponds to the average number of
particles in that area. The joint density function inherits the convolution property
(21). It can be shown that for $t \geq t_0$, the localization probability $p(x, y; t)$ defined
by (19)–(21), as well as $\rho(x, y; t)$ defined in (22), satisfy a diffusion equation:
$$ \frac{\partial p}{\partial t} = D \left( \frac{\partial^2 p}{\partial x^2} + \frac{\partial^2 p}{\partial y^2} \right) . \tag{23} $$
In other words, the particles of a substance that undergoes standard (Fickian) diffusion
with diffusion coefficient D, perform Brownian motion as defined by the PDF (19).
The mathematical equivalence between (23) and (19) is crucial in connecting the
behavior of individual particles to bulk properties such as diffusion coefficients and
(as we shall see in the next section) reaction rates.
We end this discussion with two applications of (19) used in single particle
tracking (SPT). SPT trajectories [24] are derived from microscopic images (frames)
recorded at equal time intervals $\tau$. Particles are identified based on the spatial
pattern of the fluorescent light they emit. The process results in a sequence of
positions $\{(x_0, y_0), (x_1, y_1), \ldots, (x_k, y_k), \ldots\}$, corresponding to recording times $t_0 + k\tau$.
One way to characterize the observed movement is by comparing the displacements
$\Delta x_{jk} = x_k - x_j$, $\Delta y_{jk} = y_k - y_j$ over time intervals of length $t_k - t_j = (k - j)\tau$.
Statistics of displacements can be obtained from one or more trajectories, over one
or more time intervals of a chosen length.¹¹
A quantity often analyzed is the square displacement over time $\theta$:
$$ s^2(\theta) \equiv (\Delta\mathbf{r})^2 = \Delta x^2 + \Delta y^2 ; \quad \Delta\mathbf{r} = \mathbf{r}(t + \theta) - \mathbf{r}(t) \tag{24} $$
If the particles follow (19) with the same parameter $D$, the x- and y-displacements
are independent with expectation zero and variance $\sigma^2 = 2D\theta$. The mean square
displacement (MSD) over time $\theta$ is the sum of $\langle \Delta x^2 \rangle$ and $\langle \Delta y^2 \rangle$, and therefore
$$ \langle \Delta x^2 \rangle = \langle \Delta y^2 \rangle = 2D\theta ; \quad \langle s^2(\theta) \rangle = \langle \Delta x^2 \rangle + \langle \Delta y^2 \rangle = 4D\theta . \tag{25} $$
10 This is justified at low overall densities, and also when the density of the particle species that are
explicitly modeled is sufficiently low.
11 When the same trajectory is sampled for intervals corresponding to multiple frames, the intervals
must not overlap in order to avoid oversampling.
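The MSD relation suggests a simple estimator of the diffusion coefficient from a tracked trajectory: average the square displacements over non-overlapping intervals and divide by 4θ. The sketch below generates a synthetic Brownian track in place of SPT data; the function names and all numerical values (D, frame interval, number of frames) are illustrative assumptions.

```python
import math
import random


def brownian_trajectory(n_frames, d_coeff, tau, rng):
    """Synthetic 2D Brownian track sampled at equal time intervals tau."""
    sigma = math.sqrt(2.0 * d_coeff * tau)       # per-frame standard deviation
    x = y = 0.0
    track = [(x, y)]
    for _ in range(n_frames):
        x += rng.gauss(0.0, sigma)
        y += rng.gauss(0.0, sigma)
        track.append((x, y))
    return track


def mean_square_displacement(track, lag):
    """MSD over `lag` frames, using non-overlapping intervals only."""
    squares = [
        (track[i + lag][0] - track[i][0]) ** 2 + (track[i + lag][1] - track[i][1]) ** 2
        for i in range(0, len(track) - lag, lag)
    ]
    return sum(squares) / len(squares)


d_true, tau = 0.1, 0.05                          # illustrative units
track = brownian_trajectory(40000, d_true, tau, random.Random(0))
for lag in (1, 4):
    theta = lag * tau
    print(lag, mean_square_displacement(track, lag) / (4.0 * theta))   # estimate of D
```

For a Brownian track the estimates at different lags should agree with each other within sampling error; a systematic dependence on the lag is one signature of non-Brownian motion.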
The distribution of square displacements corresponding to time intervals of length
$\theta$ can be derived¹² from (19):
$$ g(s^2; \theta) = \frac{1}{\langle s^2(\theta) \rangle} \exp\left( -\frac{s^2}{\langle s^2(\theta) \rangle} \right) . \tag{26} $$
The above should not be confused with the distribution function for the absolute value
of the displacement $s = \sqrt{\Delta x^2 + \Delta y^2}$, which is proportional to $s \cdot \exp(-s^2 / \langle s^2 \rangle)$.
3.2 Brownian or Not Brownian?
The motion of a particle in two dimensions is in general described by a position vector
that varies with time, $\mathbf{r}(t) = (x(t), y(t))$. The motion is random if the future values
$\mathbf{r}(t + \Delta t)$; $\Delta t > 0$ are set by a process that has (any) random elements. One may
define such random walks by providing the distribution (or PDF) $f(\Delta x, \Delta y; \Delta t)$ of
the two components of the displacement vector $\Delta\mathbf{r} = \mathbf{r}(t + \Delta t) - \mathbf{r}(t)$.
In Brownian motion as described in the previous section, displacements over
intervals of length $\Delta t$ are distributed according to (19); the components of the
displacement vector $\Delta\mathbf{r}$ along any direction follow a normal distribution with the same
variance, which is proportional to the length of time: $\sigma^2 = 2D\Delta t$. Consequently,
the mean squared displacement (MSD) over a time interval should increase
proportionally with the length of the interval.
Deviations from standard Brownian motion result in anomalous diffusion
[24, 30]; the term is somewhat misleading in that such deviations are typical, rather
than unusual. Brownian motion is an idealized mathematical model; it describes a
limiting behavior which is approximated by many real systems, but not truly followed
by any of them. Such caveats apply to any mathematical model, but in the case of
Brownian motion there are two aspects that merit a brief mathematical discussion.
First, by the Central Limit Theorem (CLT) [34], any isotropic random walk
approaches ideal Brownian motion in the limit of large time (or spatial) scales.
Consider a set of $N$ independent and identically distributed random variables
$\{X_k\}_{k=1,\ldots,N}$ with mean zero ($\langle X_k \rangle = 0$) and finite variance $\langle X_k^2 \rangle = \sigma^2$. Define a new
random variable $Y_N$ as the sum of the $X_k$’s, divided by $\sqrt{N}$. The theorem states that
in the limit of $N \to \infty$, the PDF of $Y_N$ converges to a normal distribution centered on 0
and with the same variance $\sigma^2$ as the individual $X_k$’s.
12 To obtain (26) we need to first change variables from $(x, y)$ to $(r, \alpha)$ → $dx\,dy = r\,dr\,d\alpha$;
integrate out the angle $\alpha$, then change variable to $w = r^2$ → $dw = 2r\,dr$.
Consider now a random walk where the x-displacements $\Delta X$ over some
characteristic time $\tau$ are distributed randomly with mean zero and standard deviation
$\sigma_0$ (so that $\langle \Delta X \rangle = 0$; $\langle \Delta X^2 \rangle = \sigma_0^2$). The displacement over a multiple $T = N\tau$
of the characteristic time $\tau$ is the sum of $N$ independent, identically distributed
displacements $Z(T) = \Delta X_1 + \cdots + \Delta X_N$ (where $\Delta X_k$ is the displacement over the
k-th interval). The CLT applies to $Y(T) = Z(T)/\sqrt{N}$. In the limit when $T$ becomes
much larger than the characteristic time (i.e., $T/\tau = N \to \infty$), the distribution of
$Y(T)$ must approach a normal centered on zero, with variance $\sigma_0^2$. It follows that,
in the same limit, $Z(T) = \sqrt{N} \cdot Y(T)$ approaches a normal distribution centered on
zero, with variance $\langle Z(T)^2 \rangle = N \sigma_0^2 = T \sigma_0^2 / \tau$. This is identical to the distribution
of the x-displacement of a Brownian particle (19) with diffusion coefficient
$$ D_\text{eff} = \frac{\sigma_0^2}{2\tau} . \tag{27} $$
In summary, for a random walk whose displacement along one direction, over a
characteristic time $\tau$, has mean zero and standard deviation $\sigma_0$, the distribution of
displacements over time $T$ approaches a normal with standard deviation
$\sigma(T) = \sqrt{2T \cdot D_\text{eff}}$ as $T \gg \tau$, where $D_\text{eff}$ is given by (27). Finally, if the movement is in two
dimensions and the displacements along both x and y satisfy the assumptions (zero
mean and standard deviation $\sigma_0$ over $\tau$), then, in the limit $T/\tau \to \infty$, the system
approaches ideal Brownian motion with diffusion coefficient (27).
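The effective diffusion coefficient (27) can be illustrated with a random walk whose elementary steps are clearly not normal. In the sketch below each step is uniform on an interval of half-width a, so that σ0² = a²/3; this step distribution, the function names and the numerical values are assumptions chosen for the illustration. The sample variance of Z(T) is compared with 2·D_eff·T.

```python
import math
import random


def effective_diffusion(sigma0, tau):
    """D_eff = sigma0^2 / (2 * tau), as in (27)."""
    return sigma0 ** 2 / (2.0 * tau)


def displacement(n_steps, half_width, rng):
    """Z(T): sum of n_steps independent steps, each uniform on [-half_width, half_width]."""
    return sum(rng.uniform(-half_width, half_width) for _ in range(n_steps))


rng = random.Random(0)
n_steps, half_width, tau = 100, 1.0, 0.01        # T = n_steps * tau = 1
sigma0 = half_width / math.sqrt(3.0)             # standard deviation of a uniform step
samples = [displacement(n_steps, half_width, rng) for _ in range(4000)]
variance = sum(z * z for z in samples) / len(samples)
print(variance, 2.0 * effective_diffusion(sigma0, tau) * n_steps * tau)
```

The agreement of the variances holds for any number of steps; what the CLT adds is that the shape of the distribution of Z(T) also approaches a normal as the number of steps grows.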
The second issue is that perfect Brownian motion is a mathematical idealization
that is, in fact, physically impossible. There are various physical limitations to how
far a membrane-bound particle could possibly travel over a given time, even in the
absence of any barriers. Yet, the PDF (19) implies that given any finite time Δt, no
matter how large a distance B is, there is a nonzero probability that the displacement
ΔX over Δt will be larger than B. This would imply that our particle could exceed
any such limit (including the speed of light).
Brownian trajectories have counterintuitive features: the function $x(t)$ describing
a specific trajectory (in one dimension) has an infinite number of zero-crossings
over any finite time interval, and it is continuous but not differentiable. One of the
most consequential properties is also quite obvious. Consider a Brownian trajectory
that we sample at time intervals of length $\tau$. The standard deviation corresponding
to each time step, $\sigma(\tau) = \sqrt{\langle (x(t + \tau) - x(t))^2 \rangle} = \sqrt{2D\tau}$, is proportional to $\tau^{1/2}$.
The standard deviation is a measure of the typical displacement over the time step;
this implies that the “typical velocity” $\sigma(\tau)/\tau$ varies like $\tau^{-1/2}$, that is, it increases
and grows indefinitely as the sampling time step is decreased. The closer one looks
at a Brownian trajectory, the less smooth it gets.
Potential artifacts of the Brownian motion model can be resolved by falling back
to a more realistic model of molecular motion. Randomness in physical motion
typically results from a process with a characteristic time; for instance, particles
travel along straight lines and change direction because of collisions. The rate of
direction changes defines the characteristic time; movement on shorter time scales
is actually deterministic, and is interspersed with random events.
In summary, we could say that all physical random motion looks Brownian if
examined on a long enough time scale; on the other hand, examination over a short
enough time scale will reveal that any apparently Brownian motion is, in fact,
“anomalous”. This and similar issues have to be kept in mind when performing
Brownian motion simulations.
3.3 Brownian Motion Based Simulations
We now develop the elements of simulations of particle motion on the cell mem-
brane, and then discuss how motion simulators are combined with reaction simulation
algorithms.
Simulations that take into account the location of individual particles generally
require significantly more organization and memory than nonspatial algorithms.13
In a nonspatial algorithm (16)–(17) with $N_s$ species, the state of the system is fully
determined by $N_s$ integers, $(n_1, n_2, \ldots, n_{N_s})$, that represent the number of molecules
of each type. In a spatial algorithm we need to keep track, at a minimum, of the
location of each particle of each species: $(x_j, y_j, s_j)_{j=1 \ldots N_P}$, where $N_P = \sum_k n_k$ is
the total number of particles.
With the availability of computer memory, the bookkeeping aspect is much less
of a limitation than it was 10–15 years ago. In the scheme we outlined, one would
need 2 floating point numbers and an integer for each particle. For a million particles,
this works out to $3 \cdot 10^6$ stored values (on the order of 10–20 MB at 4–8 bytes per value), which is not at all a logistical challenge. However,
the implementation of the updates may make a big difference, and fully spatial
simulations continue to be significantly more work and resource intensive.
The simplest spatial simulation applies the definition of Brownian motion to a set
of particles. Given $N_P$ coordinate pairs $\{r_k = (x_k, y_k)\}_{k=1 \ldots N_P}$ at time t, the positions
at time t + Δt are obtained as $r_k(t + \Delta t) = r_k(t) + \Delta r_k$, with the components of
$\Delta r_k = (\Delta x_k, \Delta y_k)$ chosen randomly, according to the Brownian PDF (19). Assume
we store the coordinates in an $N_P \times 2$ array. Then, for each update step, we simply
need to generate an array of the same size, filled with normally distributed random
numbers and add it to the coordinates.
The largest computational cost is in generating the random numbers. Algorithms
for the generation of normally distributed random numbers (based on a uniform ran-
dom number generator) are well known and easy to implement. The Brownian update
step is a typical candidate for vectorization, the ability to perform operations with
arrays without relying on a loop. Array operations as well as functions that provide
sets of normally distributed random numbers are readily available in environments
such as MATLAB, where the entire update step can be implemented in a single line
of code.
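The update step described above can be sketched in a few lines. This is an illustration rather than a reference implementation: it assumes that the Brownian PDF (19) is an isotropic Gaussian with variance 2DΔt per coordinate (consistent with the one-dimensional standard deviation quoted earlier), and the function name `brownian_step` and its parameters are hypothetical. A list comprehension plays the role of the vectorized array addition.

```python
import math
import random

def brownian_step(positions, D, dt, rng=random):
    """Return new (x, y) positions after one free Brownian update of length dt."""
    # assumption: each coordinate receives an independent Gaussian increment
    # with zero mean and standard deviation sqrt(2*D*dt)
    sigma = math.sqrt(2.0 * D * dt)
    return [(x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma))
            for (x, y) in positions]

# usage: 1000 particles starting at the origin, one update step
rng = random.Random(1)
particles = brownian_step([(0.0, 0.0)] * 1000, D=0.1, dt=0.01, rng=rng)
```

In an array-oriented environment the same step reduces to adding a block of normally distributed numbers, scaled by sqrt(2DΔt), to the coordinate array.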
13 With the one possible exception of nonspatial, but “rule based” simulations that keep track of
individual molecule parts [12].
In the absence of limitations to the allowed values of the positions, the procedure
described here can be employed for any time increment Δt, which can be chosen as
long or as short as the other elements of the model dictate. A typical implementation
would be as a fixed time step algorithm. Most of the effort in setting up this type of
simulation goes into dealing with limitations to unimpeded movement, which may
stem from the simulation landscape and the size of the diffusing molecules.
Depending on the context of the simulation, physical details of the molecular
motion become relevant. A more realistic description of the movement of membrane
proteins is a sequence of short, straight line displacements, interrupted by collisions
with small molecules. In principle, when there is any doubt about the proper approach
to a specific situation, one should go back to this picture as the true, physical reality.
A general feature of microscopic simulation methods is that one does not nec-
essarily aim at reproducing the physical reality over a short time or spatial scale;
rather, simulations are validated by approaching the proper Brownian behavior (or
macroscopic reaction kinetics) in the limit of large time and spatial scales.
3.4 Anomalous Diffusion/Barriers to Movement
Elements that limit the free diffusion of membrane bound receptors (and other pro-
teins) are a central feature of membrane dynamics and signaling. Far from being
homogeneous, the membrane presents a varied landscape of linear boundaries and
two-dimensional domains [1, 16, 33, 37, 39].
Virtually any experimental observation of these molecular aggregates has striking
features that point to inhomogeneity and limited movement, indicating a pattern of
spatial organization. Indeed, a significant fraction of our data analysis and modeling
effort is aimed at characterizing these features and understanding their possible role
in signaling.
3.4.1 Emulating Open Space with a Limited Simulation Area
Currently, a detailed simulation of the entire cell membrane is not practical. For a
single signaling pathway [14], this would involve on the order of $10^6$ molecules. The
limitation is not so much in terms of memory capacity as in terms of execution time.
The high computational cost is a result of the need to update all diffusing particles
at each time step, combined with the necessarily short steps required by numerous
other considerations.
Suppose we focus on a rectangular patch of dimensions $B_x$, $B_y$, which represents
a fraction $B_x B_y / A_T$ of a membrane of total area $A_T$. The number of molecules
of each species of interest can be set proportionally, using the copy number per cell
or an area-based concentration.14 If diffusion is unimpeded, the molecules initially
in the simulation area would leak out. On a real cell, the molecules that leave would
be replaced by others entering the area. To emulate this in a simulation, we usually
impose reflecting or periodic boundary conditions.
The boundary conditions are mathematically equivalent to those for differential
equations. In simulations, they are implemented by modifying the position update
rules for instances that would result in a boundary crossing. Suppose our simulation
area is the rectangle $S = [0, B_x] \times [0, B_y]$ defined by $0 \le x \le B_x$, $0 \le y \le B_y$.
The boundary rule is applied to all particles whose “raw” updated position
$r_k(t + \Delta t) = (x^*, y^*)$, resulting from the procedure discussed in Sect. 3.3, falls outside S.
With periodic boundary conditions, the updated coordinate(s) that would fall outside
the [0, B] interval are shifted up or down by the interval length. In the case of
the x coordinate, we replace $x^* \to x^* - B_x$ if $x^* > B_x$, and $x^* \to x^* + B_x$ if
$x^* < 0$, with an analogous prescription for y. This is equivalent to imagining that
the plane is tiled with an infinite number of identical copies of S; when a particle
leaves one copy of S, it enters the next one, without changing its direction. The corresponding
condition for Eq. (23) is: $\rho(0, y) = \rho(B_x, y)$, $\partial_x \rho(0, y) = \partial_x \rho(B_x, y)$.
With reflecting boundary conditions, each “overshooting” coordinate is reflected
about the boundary position; the additional update rule is $x^* \to -x^*$ if $x^* < 0$ and
$x^* \to 2B_x - x^*$ if $x^* > B_x$. Geometrically, this rule corresponds to a reflection of
the displacement vector by the boundary. The differential equation equivalent is the
zero flux boundary condition: $\partial_x \rho(0, y) = 0$, $\partial_x \rho(B_x, y) = 0$.
The price of not simulating the entire membrane is that both of these procedures
come with artifacts, unintended features that have no correspondent in reality. With
periodic boundary conditions, the geometry (topology) of the simulation box is that
of a torus, rather than a sphere. When particle interactions are taken into account,
reflecting boundary conditions potentially change the local distribution pattern by
inducing a “pile-up” of molecules at the boundary, a phenomenon that does occur
in the presence of physical barriers; this approach is used to emulate various obsta-
cles. For this reason, periodic boundary conditions are preferable for the purpose of
emulating open space without “leakage” of particles.
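Both boundary rules act on one coordinate at a time and can be sketched as small helper functions. This is a minimal illustration under stated assumptions: the names `wrap_periodic` and `reflect` are hypothetical, and the use of the modulo operation (which also handles overshoots larger than the box size) is a convenience added here, not part of the prescription in the text. A coordinate that overshoots a reflecting boundary is mirrored about that boundary.

```python
def wrap_periodic(x, B):
    """Map a raw coordinate x onto [0, B) by shifting it by multiples of B."""
    return x % B

def reflect(x, B):
    """Fold a raw coordinate x back into [0, B] by mirroring at 0 and at B."""
    x = x % (2.0 * B)                     # the mirrored pattern repeats with period 2B
    return 2.0 * B - x if x > B else x    # mirror about x = B when beyond it

# usage: apply the same rule independently to x (box size Bx) and y (box size By)
x_new = wrap_periodic(-0.3, 1.0)   # re-enters from the opposite side
y_new = reflect(1.2, 1.0)          # bounces back from the upper edge
```

For a single overshoot smaller than the box size these functions reduce to shifting by one box length (periodic case) or mirroring once about the crossed edge (reflecting case).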
3.4.2 Physical Obstacles
Aside from collision with other molecules,15 physical obstacles encountered by mem-
brane bound molecules typically [17, 23] act as linear, semipermeable barriers. These
can be modeled by expanding the approach for reflecting boundaries.
Suppose there is a barrier along the line x = B, faced by particles diffusing in
the xy plane. For a strict boundary that enforces x ≤ B, a proposed updated value
$x(t + \Delta t) = x^*$ such that $x^* > B$ would be replaced by $2B - x^*$. For a partially
permeable barrier we relax the rule and allow the boundary-crossing update $x^*$ with
a probability $p_{cross}$, and we replace $x^* \to 2B - x^*$ with probability $(1 - p_{cross})$.
14 A potentially confusing practice is to characterize the amount of a membrane bound species
in terms of a (volume) concentration that reflects the copy number per cell multiplied by the
concentration of cells in a suspension.
15 We will discuss intermolecular collisions in the context of reaction–diffusion models.
A linear barrier can be approached from two sides. The barrier rule applies to
any particle whose current $x(t) \equiv x_{old}$ and “raw” updated $x(t + \Delta t) \equiv x^*$ coordinates
are on opposite sides of x = B, in other words, $(x^* - B) \cdot (x_{old} - B) < 0$. The
crossing probability reflects the way the diffusing particle interacts with the bound-
ary and the surrounding membrane. It may have different values for each species,
reflecting molecular mobility; it may have different values for crossing in one or the
other direction, reflecting an affinity of a type of molecule for one or the other side
of the barrier.
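A two-sided semipermeable barrier can be sketched as follows. This is an illustrative sketch only: the function name, the argument names, and the choice of passing one crossing probability per direction are assumptions made here. A rejected crossing is mirrored about the barrier, so the particle stays on its original side.

```python
import random

def barrier_update(x_old, x_new, B, p_cross_right, p_cross_left, rng=random):
    """Apply a semipermeable barrier at x = B to a proposed move x_old -> x_new."""
    if (x_new - B) * (x_old - B) >= 0:
        return x_new                      # same side of the barrier: no crossing
    # direction-dependent crossing probability (hypothetical parameterization)
    p_cross = p_cross_right if x_old < B else p_cross_left
    if rng.random() < p_cross:
        return x_new                      # crossing accepted
    return 2.0 * B - x_new                # crossing rejected: mirror about x = B

# usage: a barrier at x = 1 that is easier to cross from left to right
x = barrier_update(0.95, 1.05, B=1.0, p_cross_right=0.2, p_cross_left=0.05)
```

Species-dependent permeability could be added by looking up the two probabilities from the species label of the particle before calling the function.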
The implementation of these barriers in simulations should be informed by the
nature of the obstacle, and, ideally, by some insight into the molecular mechanism
that interferes with the diffusing particles. One type of linear obstacle is represented
by large, linear protein structures—actin filaments or elements of the cytoskeleton.
Diffusing membrane proteins are generally blocked from crossing these structures.
Crossings are possible through small, transient gaps that open and close due to the
relative movement of the membrane and the barrier [17, 20].
Another source of linear obstacles may be the boundaries between different phases
of the mix of lipids that form the membrane. In addition to an overall barrier poten-
tial, electrostatic interaction between the tails of the lipids and the transmembrane
domains of diffusing proteins may result in a net force or potential difference that
favors the movement of the protein in one direction over the other [6, 28, 31].
4 Reaction–Diffusion Systems
The main goal of this contribution is to discuss stochastic modeling of reaction–diffusion
systems. This level of detail is necessary in the modeling of biomolecular
processes in living cells, when the number of molecules involved does not justify
a continuum (ODE or PDE) description, and when the hypothesis of spatial homo-
geneity does not hold. Modeling and simulation of reaction–diffusion systems are
required to connect experimental results from the study of membrane dynamics on
the molecular scale to cell-level behavior. The remaining task is to combine the sim-
ulation methods we discussed previously for well-mixed reaction systems with those
for diffusion.
The core aspect of molecular reaction–diffusion simulations is the coupling
between the location of the molecules and the triggering of individual reaction events.
To avoid ambiguity, we will assume that the chemical equations reflect the actual
participants in the molecular event corresponding to an instance of a reaction. For
instance, an enzymatic reaction where an enzyme E converts S into P is sometimes
described as $S \xrightarrow{E} P$, while the correct equation should be S + E → P + E,
irrespective of the reaction mechanism or the rate law.16 Consistent with a level of
detail that accounts for the location of individual molecules, we disallow reaction
events that involve “magical” action at a distance; if the conversion of S into P
is facilitated by a third species E, then each reaction event must require the close
proximity of both participants, one copy of S and one copy of E.
Possible exceptions would emerge in a hybrid modeling framework, when only
certain species of interest (larger, more scarce molecules) are modeled in spatial
detail. For the purposes of this discussion, we assume that all species are accounted
for and that a reaction event requires the physical proximity of no less and no more
than the set of molecules implied by the incoming side of the chemical equation.
4.1 First-Order Reactions and Diffusion
If all the reactions in the system are first order (such as A → B or A → B + C),
then individual instances of reactions are triggered independently of the position
of the molecules. In a sense, the dynamics of reactions and of diffusion take place
in parallel, without influencing each other. This is a good opportunity to develop a
unified framework that simulates the two phenomena without having to deal with
some of the more complex issues.
The state of the system is defined by the position and chemical species of all
particles. The spatial aspect requires that we track the position of each particle. The
fact that particles have an identity requires a major change in the way chemical
dynamics is accounted for. All the methods we discussed previously tracked only
the copy number of particles of each species. In the course of these well-mixed
simulations one would estimate the time when an instance of a reaction type would
occur, and then implement it by changing the particle numbers accordingly.
With identifiable particles we need to deal with two additional tasks: choosing the
actual particles involved in each reaction event, and managing the relation between
particle identities and chemical species. The choice of the reacting particle is straight-
forward. For first-order reactions, once the reaction type is set, it comes down to
choosing one of otherwise identical possibilities.
The concept of particle identity helps limit the size and changes to the lists of
particles we maintain. It is most useful when a relatively small set of moieties can
combine in different ways. Consider a system with a single receptor type R, which
may be liganded ($R_L$) and/or phosphorylated ($R^*$), and may form dimers ($R \cdot R$).
The following sequence of reaction events:
$$R + R_L \to R \cdot R_L ; \quad R \cdot R_L \to R^* \cdot R_L \to R^* \cdot R ; \quad R^* \cdot R \to R^* + R \qquad (28)$$
allows straightforward identification of the specific receptors that participate.17 The
identity-species relationship is not always this simple, with species transforming in
ways that do not allow consistent identification of individual molecules (moieties)
beyond one reaction step.
16 For a proper rendering of the Michaelis–Menten mechanism, one should include an intermediate,
reversible step.
4.1.1 One-to-One Reactions: Chemical Species as an Attribute
If reactions are strictly one to one (A → B, A → C, C → A, ...), particles change
their type but in a sense keep their identity, so the total number of particles never
changes. In this case we may treat the chemical species as a discrete valued variable
assigned to each particle. The state of the system at time t is then characterized by an
array of size N × 3, $\{(x_k, y_k, s_k)\}_{k=1 \ldots N}$, where $(x_k, y_k) = r_k(t)$ define the position
and $s_k(t)$ is the chemical species of particle k at time t.
First-order chemical reactions are triggered within each particle, by Poisson
processes that are independent of the position or of the presence of others. If
the system follows mass-action kinetics, reaction A → B is characterized by an intrinsic
rate constant $k_{A \to B}$, and the joint propensity for all instances of the reaction is
$g_{A \to B} = k_{A \to B}\, n_A(t)$, independently of the spatial configuration. If such considerations
apply to all the reactions in the system, we may simulate the reaction dynamics
following the direct method [9]. We calculate a total propensity $g_T$ as the sum of all
reaction propensities at a given time t. The time to the next reaction (of any type) is
obtained by generating a random value $\tau_{react}$ using the PDF $f(\tau) = g_T \exp(-g_T \tau)$.
The probability that the next reaction is A → B is given by $g_{A \to B}/g_T$; the type of
the next reaction can be chosen using a second random number.
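The two random draws of the direct method can be sketched as follows. This is a minimal illustration, assuming the propensities have already been computed and are supplied as a dictionary; the function name `next_reaction` and the string labels for reactions are hypothetical.

```python
import math
import random

def next_reaction(propensities, rng=random):
    """Return (tau, name) for the next event, given a dict name -> propensity."""
    g_total = sum(propensities.values())
    if g_total <= 0.0:
        return math.inf, None              # no reaction can occur
    tau = rng.expovariate(g_total)         # waiting time with PDF g_T * exp(-g_T * tau)
    threshold = rng.random() * g_total     # second random number selects the type
    running = 0.0
    for name, g in propensities.items():
        running += g
        if threshold < running:            # reaction chosen with probability g / g_T
            return tau, name
    return tau, name                       # guard against floating-point round-off

# usage: 40 A particles and 10 B particles with hypothetical rate constants
tau, reaction = next_reaction({"A->B": 0.5 * 40, "B->A": 0.1 * 10})
```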
To fully determine the next chemical transformation, we need to identify the
actual molecule that undergoes the predicted transformation. The total propensity
$g_{A \to B} = n_A k_{A \to B}$ corresponds to the joint Poisson process that triggers the next
one of the $n_A$ possible instances of the reaction A → B. Each A particle has the
same intrinsic propensity $k_{A \to B}$, and is equally likely to be the one reacting. The
choice can be made similarly to that of the reaction type, by comparing a uniform
random number r ∈ [0, 1] with the partition of the interval [0, 1] induced by the set
$\left\{ \frac{1}{n_A}, \frac{2}{n_A}, \ldots, \frac{n_A - 1}{n_A} \right\}$. The probability that r falls into the subinterval
$\left( \frac{j-1}{n_A}, \frac{j}{n_A} \right)$ is the same for all integers j from 1 to $n_A$, and the resulting $j^*$ is a proper way to identify
the transitioning particle.
Once we have obtained the value for $\tau_{react}$, selected the type of the next reaction
(A → B), and the identity of the particle involved (the $j^*$-th particle of type A, with index $k^*$),
the reaction event is implemented by simply changing the species value $s_{k^*}$
from A to B at the time of the reaction, $t + \tau_{react}$.
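Selecting the reacting particle and implementing a one-to-one reaction can be sketched as follows. The sketch assumes the species labels are held in a plain list indexed by particle; the function name and the use of strings as species labels are hypothetical. Note that the code counts the candidates from 0, whereas the text counts from 1.

```python
import random

def pick_and_convert(species, source, target, rng=random):
    """Convert one uniformly chosen particle of type `source` into `target`."""
    candidates = [k for k, s in enumerate(species) if s == source]
    if not candidates:
        return None                                   # no particle of that type
    n = len(candidates)
    j_star = min(int(rng.random() * n), n - 1)        # subinterval of [0, 1) containing r
    k_star = candidates[j_star]                       # index of the reacting particle
    species[k_star] = target                          # positions are left untouched
    return k_star

# usage: one instance of A -> B in a five-particle system
labels = ["A", "B", "A", "A", "B"]
changed = pick_and_convert(labels, "A", "B")
```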
Having established that there would be no chemical transformations between t and
$t + \tau_{react}$, we may evolve the spatial dynamics over this interval without concern for
chemical transformations. If particle–particle collisions are not an issue, Brownian
motion can be estimated over any time interval Δt by obtaining a single set of
displacement values from the Brownian PDF (19). We can either perform a Brownian
motion update using the time to the next reaction, or, if we have a fixed time step
$\tau_{step}$, perform the diffusion update up to whichever time comes first,
$t_{next} = \min(t + \tau_{step}, t + \tau_{react})$. If $\tau_{step} < \tau_{react}$, we can perform successive Brownian updates until
the time of the reaction.
17 All four reaction instances can be encoded as changes to the states of two receptor molecules:
Receptor 1 forms a dimer with Receptor 2 which is initially liganded; then Receptor 1 is phosphorylated;
after that, Receptor 2 loses its ligand; finally, the dimer breaks up.
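The interleaving of fixed diffusion steps with the next reaction time can be sketched as a short helper that returns the lengths of the successive Brownian updates. The function name is hypothetical, and the sketch assumes a strictly positive step length.

```python
def diffusion_intervals(t, tau_react, tau_step):
    """Split [t, t + tau_react] into diffusion updates no longer than tau_step."""
    t_end = t + tau_react
    intervals = []
    while t < t_end:
        t_next = min(t + tau_step, t_end)   # whichever time comes first
        intervals.append(t_next - t)        # length of this Brownian update
        t = t_next
    return intervals

# usage: a reaction due 0.25 time units from now, with a fixed step of 0.1
steps = diffusion_intervals(0.0, 0.25, 0.1)   # two full steps and one shorter one
```

Each returned interval would be passed as Δt to the Brownian update, and the reaction is implemented after the last one.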
4.1.2 First-Order Reactions with Particle Number Change: Bookkeeping Versus Rules
If all reactions are first order, each instance takes place independently of all the
molecules in the system, other than the one undergoing the molecular transformation.
Thus, the choice of the upcoming reaction type and of the reacting molecule, as well
as the diffusion update between reaction events, can be performed as described
above, whether the reactions are one to one (A → B) or not (A → A∗ + B,
A → ∅, . . .). The difference is in how the reaction event is implemented and the
machinery required to keep track of the chemical composition of the system.
When reactions change the number of molecules, we have to deal with the creation
and disappearance of particles. A particle is represented by at least two real numbers
(the spatial coordinates); depending on the algorithm, it can become a fairly complex
entity—similar to a record in a table or list.
One strategy is to maintain separate lists of particles of each type (species). In
this approach, each reaction event involves, at a minimum, moving a record between
lists (A → B), but may also require the deletion (A → ∅) or creation of new ones
(A → B + C). Care is required to properly handle memory issues stemming from the
continuously changing size of the respective arrays. High-level programming envi-
ronments such as MATLAB allow dynamic resizing of arrays; if no other precautions
are taken, this can result in dramatic loss of execution speed. A safer strategy is to
preallocate arrays with the expected largest number of particles of each species and
manage the lists in a way that the blank records are always at the end.
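One way to keep the blank records at the end of a preallocated list is to overwrite a deleted record with the last live one. The class below is a hypothetical sketch of this idea (the class and method names are not from the text), storing only the coordinates of each particle.

```python
class ParticleList:
    """Fixed-capacity store of (x, y) records; live records occupy slots 0..count-1."""

    def __init__(self, capacity):
        self.xy = [(0.0, 0.0)] * capacity   # allocated once, never resized
        self.count = 0                      # number of live records

    def add(self, x, y):
        """Place a new record in the first blank slot and return its index."""
        if self.count == len(self.xy):
            raise OverflowError("capacity exceeded; preallocate a larger list")
        self.xy[self.count] = (x, y)
        self.count += 1
        return self.count - 1

    def remove(self, k):
        """Delete record k by moving the last live record into its slot."""
        if not 0 <= k < self.count:
            raise IndexError(k)
        self.count -= 1
        self.xy[k] = self.xy[self.count]    # blank slots stay at the end

# usage: A -> B + C, with both products created at the position of the parent
list_a, list_b, list_c = ParticleList(100), ParticleList(100), ParticleList(100)
k = list_a.add(0.3, 0.7)
x, y = list_a.xy[k]
list_a.remove(k)
list_b.add(x, y)
list_c.add(x, y)
```

A consequence of this design is that a deletion changes the index of one other record, so any external reference to particles by index has to be updated accordingly.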
Given that particles must be created and deleted, the idea of keeping a single list
with the chemical species as an attribute is not easily applicable. Somewhat similar
ideas are used in the rule-based approach to biochemical reaction network modeling
[12], where the entities of interest are parts of molecules that connect to each other
following a set of rules. For the reader who is not at all acquainted with the idea, the
best analogy is Lego blocks with several types of connectors. This approach is useful
in dealing with the complexity of biomolecules, dramatically reducing the number
of chemical species. Implementation requires keeping track of each building block
copy and of its connections to others. The resulting structures can easily be expanded
to include position information [25].
4.2 Second-Order Reactions
Things get more complicated with binary reactions, such as A + B → · · ·. For this to
happen, the interacting A and B molecules have to be in close proximity, or collide.
The way we represent the position of our particles, using a two-component position
vector rk = (xk , yk ), only specifies a geometric point, with no size or shape. When
thinking about molecular collisions, we have to account for these, as well.
Molecular dynamics [29] studies the geometry of biomolecules and can in prin-
ciple provide detailed information about the size and shape of receptors and other
membrane proteins. For now (2015), this is yet another example of a level of detail
that may provide useful information, but cannot be directly integrated into a simula-
tion of tens of thousands of molecules. Here we will assume that, given two molecules
of given types with position vectors $r_A$, $r_B$, we can tell that they are in collision
(either overlapping or close enough to interact) based only on their distance:
$$|r_A - r_B| \le d^{coll}_{AB} \qquad (29)$$
We will use conditions similar to (29) to determine if a pair of molecules are eligible
to interact, or if their positions are physically overlapping and thus inconsistent with
free Brownian motion.
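A distance test of the form (29) can be applied to all A, B pairs by brute force, as in the sketch below. The function name is hypothetical, and the double loop is the simplest possible choice; its cost grows as the product of the two particle numbers, so larger systems would typically need some form of spatial binning.

```python
import math

def colliding_pairs(pos_a, pos_b, d_coll):
    """Return index pairs (i, j) with |r_A[i] - r_B[j]| <= d_coll."""
    pairs = []
    for i, (xa, ya) in enumerate(pos_a):
        for j, (xb, yb) in enumerate(pos_b):
            if math.hypot(xa - xb, ya - yb) <= d_coll:   # condition (29)
                pairs.append((i, j))
    return pairs

# usage: two A particles, two B particles, collision distance 1.0
hits = colliding_pairs([(0.0, 0.0), (5.0, 5.0)], [(0.5, 0.0), (9.0, 9.0)], 1.0)
```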
In the preceding section, the time horizon of the motion simulator was only limited
by the time $\tau_{react}$ to the next reaction event; this time was obtained from an exponential
PDF $f(\tau) = g_T \exp(-g_T \tau)$ using the joint reaction propensity $g_T = \sum_j g_j$.
In particular, in a Brownian motion simulation algorithm, we could use (19) with
$\Delta t = \tau_{react}$ to obtain the next set of particle positions. Collisions (that may or may
not lead to a reaction event) complicate the simulation of molecular motion.
4.2.1 Elastic Collisions
If the molecules of interest take up a significant fraction of the space available,
their movement will be limited by the presence of other molecules. A tentative
configuration where two or more molecules overlap (their distance is less than the
sum of their physical radii) corresponds to a collision.
We call a collision elastic if it does not lead to a chemical transformation. A
motion simulator that takes into account molecular sizes must resolve all colliding
configurations, replacing them with a physically possible version. Elastic collisions
are generally more common than reactive ones; they may be ignored at low particle
densities, but may require a significant computational effort at high particle densities,
due to the combinatorial explosion of the number of possible overlaps.
One approach to resolving prospective overlapping configurations is similar to
the way we dealt with reflective boundaries: construct a corrected updated position
that is not overlapping (similarly to reflection off the boundary), or simply leave one
or both colliding particles in their original position.
As the density of particles increases, random position updates are increasingly
likely to result in overlaps. The computational effort in rejecting these updates reflects
the physical reality of crowding, where the movement of individual particles is lim-
ited, and in extreme cases made practically impossible by surrounding particles.
Crowding and traffic jams are believed to be a significant factor in membrane dynam-
ics [16].
A further complication at high densities is the possibility of multiple concurrent
interactions. Let us say for example that in the proposed update, particle 1 overlaps
with particle 2. Changing the position of particle 2 will make it overlap with particle 3.
Whether we now reject the second prospective position of particle 2, or keep that and
generate a new position for particle 3, is a matter of choice of strategy/algorithm.
Here, we only wanted to emphasize that as the density of particles increases, the
number of collisions that may occur grows in a combinatorial fashion.
This aspect is exacerbated by large time steps, which result in increased travel
distance $|\Delta r|$ per update; the number of possible collision partners is set by the
volume of a sphere (on a membrane, the area of a disk) of radius $|\Delta r| + \rho_1 + \rho_2$,
where $\rho_1$, $\rho_2$ are the physical radii of the
particles. Conversely, reducing the simulation time step results in fewer concurrent
collisions and reduces the chances of complex interactions.
Another issue that arises with large time steps is that of missed collisions, situa-
tions where the previous and the updated positions of two molecules do not overlap,
but a collision is likely to have occurred between the two time points. At low densi-
ties one may take the position that intermediate collisions are accounted for by the
Brownian PDF, which reflects a complex trajectory that results from multiple colli-
sions. However, at high densities, particles may end up tunneling through each other,
in the sense that a Brownian update may put a particle past another one blocking its
way.
None of the issues listed here is intractable, but hopefully the reader can appreciate
that properly accounting for elastic intermolecular collisions at high density
can be an exceedingly complex task, both in terms of algorithm development and
computation time. The right approach is very much determined by the purpose of
the simulation and the physical situation of interest. Possible strategies include: (i)
reducing the simulation time step to keep the number of collisions per update manageable;
(ii) using physical insight into the details of molecular movement to replace
Brownian motion with realistic trajectories that can be used to predict collisions;
(iii) ignoring elastic collisions altogether and adjusting the parameters of the simulated
Brownian motion based on effective diffusion parameters.
In the discussion that follows, we will assume that a kinetic simulation module
takes care of elastic collisions (close encounters of particles that cannot enter into
a chemical reaction), in a way that does not directly limit the choice of the time
step employed in the rest of the calculation. One possibility, (iii) above, is that elastic
collisions can be ignored; another, (ii) above, is that we have a readily available
motion simulator that generates updated, nonoverlapping positions over an arbitrary
time horizon.
4.2.2 Reactive Collisions
We will focus on simulating binary reactions triggered by the physical proximity of
the two participants. In particular, assume that our system has a reversible
dimerization/dissociation reaction
$$A + B \to C ; \quad C \to A + B \qquad (30)$$
We briefly addressed dissociation reactions when they occurred isolated from dimer-
ization. Here we are interested in the interplay of the two reaction types. In the
well-mixed picture, the total propensity $G_{A+B \to C}$ of the forward reaction (30) is
proportional to the number of eligible A, B pairs of molecules, and an elementary
propensity $g_{A+B \to C}$ that relates to the mass-action rate constant $k_{A+B \to C}$, the volume,18
and Avogadro’s number.
$$G_{A+B \to C} = n_A \cdot n_B \cdot g_{A+B \to C} ; \quad g_{A+B \to C} = \frac{k_{A+B \to C}}{V_0 \cdot N_{Avogadro}} \qquad (31)$$
When designing a simulation, often the mass-action rate constant is all that is known
about the kinetics of a given reaction. If no further physical information is available,
the only requirement for the implementation of this particular reaction is the principle
that in the limit of large volumes, the simulation must reproduce the rate predicted
by (31).
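As a worked instance of (31), the sketch below converts a mass-action rate constant into a per-pair propensity. It assumes a volume-based rate constant in units of 1/(M·s) and a volume in liters; the function names and the numerical values in the usage lines are hypothetical. For membrane-bound species the volume would be replaced by an area, with the rate constant expressed in matching units.

```python
AVOGADRO = 6.02214076e23  # particles per mole

def pair_propensity(k_molar, volume_liters):
    """Per-pair propensity g = k / (V0 * N_Avogadro), in 1/s."""
    return k_molar / (volume_liters * AVOGADRO)

def total_propensity(n_a, n_b, k_molar, volume_liters):
    """Total propensity G = n_A * n_B * g for the reaction A + B -> C."""
    return n_a * n_b * pair_propensity(k_molar, volume_liters)

# usage: k = 1e6 1/(M s) in a volume of 1 picoliter, with 100 A and 50 B molecules
g = pair_propensity(1e6, 1e-12)
G = total_propensity(100, 50, 1e6, 1e-12)
```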
Recall that the elementary propensity of a first-order reaction is a rate (probability
per time) that reflects the internal dynamics of one molecule. It can be readily used
in simulations as the rate of a Poisson process; this rate can be directly measured in
experiments, because it coincides with the macroscopic rate constant. For second-
order reactions, the relation between propensities, the simulation of individual events,
and experimentally accessible quantities is more complicated.
A reaction such as A + B → C is typically one of several possible outcomes19
of a molecular collision event. The identity and location of the resulting particles
is the stochastic outcome of a quantum mechanical process, which is susceptible
to the incoming spatial configuration (impact parameter, relative orientation). The
elementary propensity $g_{A+B \to C}$ is the (average) probability rate that
a specific A + B collision occurs in the system under consideration and that the
collision results in the creation of a dimer, A + B → C. The value of $g_{A+B \to C}$
in a given system is the average over all possible A, B pairs, and all their future
trajectories over the simulation time step Δt. The formula (31) relates $g_{A+B \to C}$ to an
experimentally accessible number. But how do we implement binary reaction events
in a spatial simulation?
18 In the case of membrane-bound species, the volume is replaced by area, and the conversion
between the per-pair propensity and the physical (effective) rate constant has to account for the
units used for concentration. As we pointed out, sometimes molar concentrations are used for
receptors, representing a molar concentration based on a density in suspension of a specific type of
cell, and the average number of receptors per cell.
19 If nothing else, the collision may be elastic.
First Principles
In principle, quantum mechanical calculations [25] can provide the distribution (PDF)
for the future state of each pair of molecules; from here, we could derive the PDF
for the time and location where each specific pair reacts, and use a joint distribution
to generate the time of the next reaction event of type A + B → C.
With a prescription that identifies the next reaction event, we may proceed the
same way as described for first-order reactions—identify the next reaction event
using an approach along the lines of the next reaction method. This would be very
efficient, since we could evolve our simulation clock from one reaction event to
the next, implement the reaction event, and simply generate new Brownian motion
positions for all the other particles.
For this we would have to analyze all $n_A \cdot n_B$ pairs of molecules and predict (generate)
the time of their next inelastic collision. We must do this in a way that we can be
confident that other molecules have not precluded the pair under consideration from
interacting. Even if the two-body analysis could be performed, the three, four-body
problems get complicated quickly, but can be tackled with appropriate computational
resources. One approach in this direction is the GFRD algorithm [38].
Using an Effective Reaction Distance
The number of A, B pairs that may possibly react over the next time step is drastically
reduced if we impose a maximal reaction distance (29), and neglect the possibility
of reaction if the distance between molecules exceeds this maximum. This choice
neglects a small probability of reaction for pairs that are further apart.
For pairs within the reaction distance, one could perform a detailed analysis as
outlined above; in most cases, this first principles approach is still impractical, due to
the large number of different reactions, the complexity of the necessary calculations,
and the lack of detailed knowledge of reaction mechanisms and molecular structures.
A practical alternative is to assign an average reaction probability $p^{(reaction)}_{A+B \to C}$, conditioned
on the two molecules being within the reaction distance. In a given system,
the total reaction propensity is then the product of the collision rate (probability per
time that an A, B pair comes within reaction distance) and the conditional reaction
probability:
$$G_{A+B \to C} = G^{(coll)}_{A+B} \cdot p^{(reaction)}_{A+B \to C} \qquad (32)$$
A comparison of (31) and (32) highlights the difficulty in relating spatial simulations
of binary reactions to mass-action rate constants. The well-mixed rate constant is
an average rate that applies to all A + B pairs; it reflects not only the ability of the
two molecules to react once they are close, it also depends on the characteristics of
molecular movement: the diffusion rate, as well as any features of the space, which
in our case is the membrane with its varied landscape.
In contrast, in (32) mobility is accounted for by the collision rate
$G^{(\mathrm{coll})}_{A+B}$, and the conditional reaction probability is an intrinsic
parameter. In a simulation (once the reaction distance is specified) the spatial movement
algorithm (such as the BM simulator) automatically provides $G^{(\mathrm{coll})}_{A+B}$.
The actual reaction rate is therefore set separately from the motion simulator, by
specifying the collision radius and the conditional reaction probability. This approach is
especially useful when building a simulation required to match observed diffusion rates
and reaction rates. The reaction distance and conditional probability are not unique; a
decrease in one can be compensated, to some extent, by increasing the other.
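The factorization in (32) suggests a simple calibration step, sketched below under stated assumptions: the collision rate is estimated by counting how many times A, B pairs come within the reaction distance during a test run of the motion simulator, and the conditional probability is then chosen to hit a target propensity. Both function names and the counting procedure are hypothetical.

```python
def reaction_propensity(n_encounters, elapsed_time, p_reaction):
    """Propensity as in (32): estimated collision rate times conditional probability."""
    g_coll = n_encounters / elapsed_time  # encounters per unit time (estimate)
    return g_coll * p_reaction

def probability_for_target(g_target, g_coll):
    """Conditional reaction probability needed to reach a target propensity."""
    p = g_target / g_coll
    if not 0.0 <= p <= 1.0:
        # The target cannot be met at this reaction distance; a larger
        # distance (hence a larger collision rate) would be needed.
        raise ValueError("target propensity exceeds the collision rate")
    return p
```

If the required probability exceeds 1, the reaction distance has to be increased, which is one concrete form of the trade-off between the two parameters.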
4.2.3 Implementation Aspects
We will describe our algorithm in detail in the next subsection. The general idea is that
second-order reactions are triggered by the proximity of the participating molecules.
Within the type of simulation we have outlined in the preceding sections second-order
reactions can be implemented together with the spatial position updates.
At each update, after the tentative new positions are generated and any first-order
reactions have been implemented, we identify the A, B pairs that satisfy the reaction
condition $d(A, B) \le R_{A+B\to C}$. Not all of these reactions can be carried out, because
some of them might be mutually exclusive. For instance, the same A molecule may
be within reaction distance of two different B molecules. Such situations are similar
to the multiple collisions discussed in Sect. 4.2.1. Their likelihood increases with
particle density and the reaction radius. Reducing the time step can reduce but not
avoid the occurrence of contradictory reactions.
Ultimately, one has to eliminate some of the proposed reactions and identify a
noncontradictory subset. Each pair in this set is then allowed to react with probability
$p^{(\mathrm{reaction})}_{A+B\to C}$. The choice of the reactions to be implemented must be
performed in a way that is not biased with respect to the specific molecules that are
involved. Because of the elimination of some reacting pairs, the reaction radius and/or the
probability must be adjusted to match the required overall reaction rates.
Finally, when dissociation reactions are also present, the placement of the reaction
products needs to be done in a way that does not result in new collisions (reactive or
elastic).
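One way to obtain a noncontradictory subset without favoring particular molecules is to visit the candidate pairs in random order and keep a pair only if neither partner has been used already. The sketch below illustrates this idea; the function name `select_reactions` and the greedy strategy are assumptions for illustration, not necessarily the procedure used in the authors' code.

```python
import random

def select_reactions(candidate_pairs, p_reaction, rng=None):
    """Pick a conflict-free subset of (a, b) pairs, then react with probability p.

    candidate_pairs: iterable of (a_index, b_index) within reaction distance.
    """
    if rng is None:
        rng = random.Random()
    pairs = list(candidate_pairs)
    rng.shuffle(pairs)  # random visiting order avoids bias toward any molecule
    used_a, used_b, reacting = set(), set(), []
    for a, b in pairs:
        if a in used_a or b in used_b:
            continue  # contradicts a pair already placed in the subset
        used_a.add(a)
        used_b.add(b)
        # Each pair in the conflict-free subset reacts with probability p.
        if rng.random() < p_reaction:
            reacting.append((a, b))
    return reacting
```

Because some candidate pairs are discarded, the realized reaction rate is lower than the collision rate times `p_reaction`, which is why the radius or the probability has to be recalibrated.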
4.3 Smoluchowski Model—Binding Radius

Approach to Bimolecular Reactions
Bimolecular reactions are the product of two reactive molecules colliding. In our
current simulation, we use an approach similar to that of Smoldyn [2].
Fig. 2 Collision distance, binding radius, unbinding radius. The binding radius is the distance
(between points representing a pair of particles) below which a reaction event is triggered
4.3.1 Irreversible Bimolecular Reactions
Marian von Smoluchowski sought to develop a model for diffusion-influenced systems,
including irreversible bimolecular reactions, using diffusion coefficients and
molecular radii [35]. The model describes interacting particles within a solution as
point particles that do not interact except for instances of irreversible chemical reac-
tions that are instantaneous, and triggered by proximity. Assuming Brownian motion
for diffusing particles, the issue with bimolecular reactions (A + B → · · ·) is quan-
tifying the moment when the two species involved in the reaction will actually react
with one another to form the product/complex. Smoluchowski tackled this issue by
using the molecular radii to determine when two particles would collide, and there-
fore react. This collision distance was the sum of the molecular radii of the two
particles, and provided an analytical calculation of the corresponding (macroscopic)
binding rate.
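For reference, the classical diffusion-limited rate constant associated with this collision-distance picture can be written explicitly. The expression below is the standard three-dimensional steady-state result, quoted here as background rather than taken from this chapter; the symbols $D_A$, $D_B$ (diffusion coefficients) and $r_A$, $r_B$ (molecular radii) are notation assumed for this note. In two dimensions, as on a membrane, no comparably simple time-independent expression is available.

```latex
k_{\mathrm{Smol}} = 4\pi \,(D_A + D_B)\,(r_A + r_B)
```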
However, using the sum of the molecular radii as the reaction threshold is equivalent
to a purely diffusion-limited model, which does not take into consideration the
activation energy of the reaction. This results in an overestimation of reaction
occurrences. One way to correct this [2] is to replace the sum of the molecular radii with
a binding radius, a length shorter than the collision distance, to account for the slower
rate of reaction (see footnote 20). At the highest level, the binding radius is the
distance required in simulation to achieve the same rate of reaction observed in wet lab
experiments. To achieve this rate in the context of a specific computation, the binding
radius must take into account not only the bimolecular reaction rate, but also both
molecules’ diffusion coefficients, as well as the simulation time step (Fig. 2).
20 Alternatives include using a reaction probability that controls whether a reaction occurs upon
collision, as well as manipulating the unbinding radius discussed in Sect. 4.3.2.
The term “binding radius” is somewhat counterintuitive. In our simulation
approach, following the Smoluchowski model and [2], particles are represented by
points that diffuse freely, irrespective of the presence of other particles. The binding
radius is the maximum distance between two points at which the two corresponding
particles will react with one another. The important distinction here is between an
idealized geometric point and a particle/molecule. A point can be interpreted as the
center of a molecule and as having no volume, while a particle/molecule has a cer-
tain physical size (that can be represented by an average radius) and corresponding
volume. It follows that the binding radius can be, and typically is, smaller than the
actual collision radius between two molecules. See [2] for a more in-depth discussion
of the binding radius.
4.3.2 Reversible Bimolecular Reactions
The reverse of a bimolecular reaction, A + B → C, is a first-order reaction,
C → A + B, that results in the consumption of one molecule and generation of two
molecules. The first-order reaction is modeled as discussed above; however, the
placement of the product molecules requires additional considerations. In the approach
of [2], the Smoluchowski model was expanded to include reversible bimolecular
reactions by including an unbinding radius term. The unbinding radius is the initial
distance between two product molecules when they are generated in a reaction. This
distance is an important factor in controlling geminate recombination of product
molecules.
Geminate recombination is the re-reaction of two product molecules to reform the
original reactant molecule. An example reaction would be: C → A + B → C. This is
a real phenomenon whose occurrence may be artificially increased in a simulation. To
address the issue, Andrews and Bray [2] use a probability of geminate recombination
to calculate the unbinding radius: $P_{GR} = R_{binding}/R_{unbinding}$. The default probability
used in [2] is 0.2, meaning there is a 20% chance of rebinding/re-reaction upon the
formation of the initial product molecules.
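Solving the relation above for the unbinding radius gives a one-line helper. This is a sketch that only restates the ratio quoted in the text; the function name and the argument check are assumptions.

```python
def unbinding_radius(binding_radius, p_geminate=0.2):
    """Unbinding radius from P_GR = R_binding / R_unbinding."""
    if not 0.0 < p_geminate <= 1.0:
        raise ValueError("p_geminate must lie in (0, 1]")
    return binding_radius / p_geminate
```

With the default probability of 0.2, products are placed five binding radii apart; a probability of 1 places them exactly at the binding radius.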
The use of an unbinding radius to limit recombination is one of several possible
choices (see footnote 21). It introduces an additional unphysical feature in that it violates the principle
of detailed balance; when dimerization and dissociation are at equilibrium, the spatial
distribution of A and B particles entering the bound state (C) is different from that
of the same particles exiting the bound state. However, this is one of several artifacts
of the model that emerge at short length scales, and has the advantage of reducing
the computational burden due to repeated rebinding.
21 An alternative would be to use a larger binding radius and trigger reactions with probability <1.
4.3.3 Implementation Details
Our simulation area corresponds to a membrane patch of approximately 400 × 400 nm,
typically featuring a landscape of attractive domains derived from microscopy data. The
current version of our simulation code uses a fixed time step, Δt. To resolve spatial and
reaction conflicts, the logic is equivalent to updating particles one at a time, in random
order. A one-particle update corresponds to a small time step of δt = Δt/N, where N is the
number of individual particles in the initial simulation space.
The one-particle update proceeds as follows. After choosing the particle, a new
position of the particle is chosen following the Brownian motion PDF (19) for dis-
placements over the full time step Δt. Depending on the purpose of the calculation,
the diffusion coefficient is set by species, based either on a macroscopic value, or
directly, based on SPT data. The proposed position update is adjusted (accepted or
rejected and re-sampled) to satisfy boundary conditions, both for the simulation area
(periodic) and the attractive domains (semipermeable). Due to the relatively low
particle density, we do not currently check for elastic collisions, but the algorithm
allows it.
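The position proposal for a single particle can be sketched as below. The example assumes that the Brownian displacement over Δt is an isotropic Gaussian with variance 2DΔt per coordinate (the standard form, taken here to correspond to PDF (19)) and that the simulation area is a square of side `box` with periodic boundaries. The function name is hypothetical, and the semipermeable domain boundaries are deliberately left out.

```python
import numpy as np

def propose_position(pos, diff_coeff, dt, box, rng):
    """Propose a new 2D position after free Brownian motion over dt.

    pos: current (x, y); diff_coeff: diffusion coefficient D;
    box: side length of the periodic square domain;
    rng: a numpy.random.Generator. Units must be mutually consistent.
    """
    sigma = np.sqrt(2.0 * diff_coeff * dt)  # std. dev. per coordinate
    step = rng.normal(0.0, sigma, size=2)
    new_pos = np.asarray(pos, dtype=float) + step
    return np.mod(new_pos, box)  # wrap into [0, box) for periodic boundaries
```

A check against domain boundaries, with acceptance or re-sampling of the step, would follow this proposal in a full update.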
Once the new position is set, depending on the particle species, we check for
potential reaction partners, molecules that may participate in a binary reaction with
the current particle, and whose distance to it is less than the corresponding reaction
radius. If more than one such particle is found, one of them is chosen randomly. Once
a single partner has been chosen, the corresponding reaction is carried out (see footnote 22).
If the particle does not participate in a binary reaction, then, depending on the species
(assume it is an A), a first-order reaction is implemented with probability
$$1 - e^{-\Delta t \, g_{A\to\cdots}} \approx \Delta t \cdot g_{A\to\cdots} \qquad (33)$$
where $g_{A\to\cdots}$ is the joint propensity for all first-order reactions involving the
species A. If a reaction occurs, the specific reaction type (A → C, A → B + D, ...) is
selected using the corresponding relative probabilities $g_{A\to C}/g_{A\to\cdots}$,
$g_{A\to B+D}/g_{A\to\cdots}$, and so on. For dissociation reactions (other than one-to-one),
the position of the reaction product(s) is set randomly in a circle of the corresponding
unbinding radius.
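The first-order step can be sketched as follows. The code uses the exact form of the probability in (33) and then picks the reaction type in proportion to its propensity. The product placement shown is one possible reading of the text, chosen so that the two products start exactly one unbinding radius apart, with random orientation, around the position of the dissociating molecule. All function names are hypothetical.

```python
import math
import random

def first_order_step(propensities, dt, rng=None):
    """Return the label of the first-order reaction that fires, or None.

    propensities: dict mapping a reaction label to its rate g (per unit time).
    """
    if rng is None:
        rng = random.Random()
    g_total = sum(propensities.values())
    if g_total <= 0.0:
        return None
    # Probability that any first-order reaction occurs during dt, as in (33).
    if rng.random() >= 1.0 - math.exp(-dt * g_total):
        return None
    labels = list(propensities)
    weights = [propensities[label] for label in labels]
    return rng.choices(labels, weights=weights, k=1)[0]

def place_products(center, r_unbind, rng=None):
    """Place two products r_unbind apart, randomly oriented about center."""
    if rng is None:
        rng = random.Random()
    theta = rng.uniform(0.0, 2.0 * math.pi)
    dx = 0.5 * r_unbind * math.cos(theta)
    dy = 0.5 * r_unbind * math.sin(theta)
    return (center[0] + dx, center[1] + dy), (center[0] - dx, center[1] - dy)
```

For example, `first_order_step({"A->C": 0.3, "A->B+D": 0.1}, dt=0.01)` returns `None` most of the time and otherwise favors `"A->C"` three to one.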
5 Conclusion
In this chapter we discussed methods for spatial stochastic simulation of
reaction–diffusion systems. This framework is necessary to accommodate features of
molecular processes involving membrane-bound receptors, ligands, and other species
involved in signal initiation. While some of the methods presented here are widely
used in the field of biomolecular modeling, the particular combination of spatial
scales and experimental features is not commonly addressed by general purpose
software or in methods papers.
22 In our current EGF related work [17], all membrane bound species are either monomer or dimer
receptors, which may only participate in dimerization or dissociation reactions, respectively.
A careful reader will find many instances where the spatial simulation as described
in the preceding section is “unphysical”—from infinitely detailed Brownian motion
to a schematic representation of binary reactions. We should point out that the goal
is to account for the movement of molecules in a way that is consistent with experi-
mentally available information, from macroscopic down to just above the molecular
scale. Similarly to the relation between well-mixed stochastic and ODE models,
these spatial simulation methods allow the inclusion of more detail than nonspatial
models can, but they are still just an approximation of reality.
In analyzing nano-microscopy data on membrane dynamics, major challenges
arise from the fact that the existing information on reaction kinetics is based either
on cell-level measurements or is derived from in vitro conditions. A similar situation
exists regarding the mobility of membrane proteins. Ongoing single particle tracking
and other high spatial resolution methods provide direct observation of molecular
motion and transformations.
One important function of our modeling effort is to reconcile the observed behav-
ior with established, macroscopically derived kinetics. Other special aspects emerge
from the role of the membrane landscape. Modeling is needed both for identifying
landscape features, and for predicting their effect on signaling. This aspect will likely
require further refinement of spatial models. The current modeling paradigm [2] is to
calibrate microscopic parameters by matching known macroscopic (cell-level) kinet-
ics. This may need to be revisited as direct information on microscopic molecular
behavior becomes the main point of validation.
Summary of the Chapter
In the first section, we introduced the notation and basic elements of the traditional
approach to the dynamics of chemical reaction systems. This approach is well mixed
and continuous, in that the amount of each substance is represented by a single
quantity that can take any nonnegative value. The state vector of the system evolves
according to a set of chemical rate laws that amount to a dynamical system obeying
a set of ordinary differential equations (ODE). Even though this approach serves as
a baseline, the dynamical systems emerging from ODE models of chemical reaction
networks (CRN) are complex dynamical systems that are studied in their own right
[32] and in the context of control theory [36]. In particular, much of the literature on
signaling network dynamics [14, 22] provides such ODE models.
The second section is devoted to the stochastic description of well-mixed CRN
systems. A stochastic model is necessary when the copy number of molecules of
one or more chemical species becomes too small to be described as a continuum.
At this level, the fundamentally probabilistic nature of molecular processes becomes
important. This approach can be seen as a refinement of well-mixed ODEs, or con-
versely, well-mixed ODEs can be seen as a simplified approximation of the more
general stochastic description.
The random movement of molecules is discussed in the third section. Here we introduce
Brownian motion and briefly discuss its role as a
limiting behavior of any random walk. We describe algorithms that directly simulate
Brownian motion in two dimensions, in the presence of physical obstacles; we focus
on Brownian motion based simulators and do not discuss lattice-based models.
Section 4 brings together the stochastic treatment of chemical processes with ran-
dom spatial movement. We explain how the two can be combined in a single simula-
tion algorithm and describe the main approach we currently use for such simulations.
One salient point here is the notion of the binding/unbinding radius, which is a
simulation model parameter used to tune the on- and off-rates of dimerization/dissociation
reactions.
Emerging Roles of Modeling
Dynamical models can predict the behavior of a system based on the characteristics
of its components, far exceeding the ability of verbal reasoning in doing so.
In the context of ongoing research in membrane biology, dynamical modeling
in our group has two fairly distinct roles, integration and data interpretation. The
first refers to the traditional function of bringing together detailed information on
the components of a system to predict emerging behavior [13, 27]. Ongoing work
focuses on the impact of the membrane landscape and of emerging biochemistry and
molecular dynamics on ErbB2/ErbB3 signaling.
The second aspect emerged in response to increasing amounts of nano-microscopy
data that provide completely novel insights but raise previously unexpected chal-
lenges. This modality is the primary source of information on the membrane land-
scape, as well as on the movement and molecular dynamics of membrane proteins
within this landscape. We are developing methods aimed at identifying landscape
features and characterizing the motion of receptors. The output of this approach
consists of diffusion coefficients, identification of the binding state of the molecules
(moieties) under observation, as well as the location and characteristics of
membrane features that modulate molecular movement.
Acknowledgements This work was supported by NIH CA119232 (BSW), NIH P50GM085273
(BSW), R01GM104973 (JSE and ÁMH), and NIH K25CA131558 (ÁMH). MMP was supported
in part by the U.S. Department of Energy through the LANL/LDRD Program.
References
1. Andrews, N.L., Lidke, K.A., Pfeiffer, J.R., Burns, A.R., Wilson, B.S., Oliver, J.M., Lidke,
D.S.: Actin restricts FcεRI diffusion and facilitates antigen-induced receptor immobilization.
Nature Cell Biol. 10(8), 955–963 (2008). doi:10.1038/ncb1755
2. Andrews, S.S., Bray, D.: Stochastic simulation of chemical reactions with spatial resolution
and single molecule detail. Physical Biology 1, 137–151 (2004)
3. Cao, Y., Li, H., Petzold, L.: Efficient formulation of the stochastic simulation algorithm for
chemically reacting systems. J Chem Phys 121(9), 4059–4067 (2004)
4. Chen, K.C., Csikasz-Nagy, A., Gyorffy, B., Val, J., Novák, B., Tyson, J.J.: Kinetic analysis of a
molecular model of the budding yeast cell cycle. Molecular Biology of the Cell 11(1), 369–391
(2000)
5. Craciun, G., Tang, Y., Feinberg, M.: Understanding bistability in complex enzyme-driven reac-
tion networks. Proc. Nat. Acad. Sci. USA 103, 8697–8702 (2006)
6. Edidin, M.: Lipid microdomains in cell surface membranes. Current opinion in structural
biology 7(4), 528–532 (2001)
7. Gibson, M.A., Bruck, J.: Efficient exact stochastic simulation of chemical systems with many
species and many channels. J Phys. Chem. A 104, 1876–1889 (2000)
8. Gillespie, D.T.: A general method for numerically simulating the stochastic time evolution of
coupled chemical reactions. J Comput Phys 22, 403–434 (1976)
9. Gillespie, D.T.: Exact stochastic simulation of coupled chemical reactions. J Phys Chem 81,
2340–2361 (1977)
10. Gillespie, D.T.: Approximate accelerated stochastic simulation of chemically reacting systems.
J Chem Phys 115(4), 1716–1733 (2001)
11. Gunawardena, J.: A linear framework for time-scale separation in nonlinear biochemical
systems. PLoS One 7(5), e36321 (2012). doi:10.1371/journal.pone.0036321
12. Hlavacek, W.S., Faeder, J.R., Blinov, M.L., Posner, R.G., Hucka, M., Fontana, W.: Rules for
modeling signal-transduction systems. Science Signaling 2006(344), re6 (2006). doi:10.1126/
stke.3442006re6
13. Kerketta, R., Halász, Á.M., Steinkamp, M.P., Wilson, B.S., Edwards, J.S.: Effect of Spatial
Inhomogeneities on the Membrane Surface on Receptor Dimerization and Signal Initiation.
Frontiers in Cell and Developmental Biology 4:81 (2016)
14. Kholodenko, B.N., Demin, O.V., Moehren, G., Hoek, J.B.: Quantification of short term sig-
naling by the epidermal growth factor receptor. The Journal of Biological Chemistry 274(42),
30169–30181 (1999)
15. Kurtz, T.G.: The relationship between stochastic and deterministic models for chemical reac-
tions. J Chem Phys 57(7), 2976–2978 (1972)
16. Kusumi, A., Nakada, K., Ritchie, K., Murase, K., Suzuki, K., Murakoshi, H., Kasai, R.S.,
Kondo, J., Fujiwara, T.: Paradigm shift of the plasma membrane concept from the two dimen-
sional continuum fluid to the partitioned fluid: high-speed single-molecule tracking of mem-
brane molecules. Annu. Rev. Biophys. Biomol. Struct. 34, 351–378 (2005)
17. Kusumi, A., Sako, Y.: Cell surface organization by the membrane skeleton. Current opinion in
cell biology 8(4), 566–574 (1996)
18. Lavi, Y., Edidin, M., Gheber, L.A.: Lifetime of major histocompatibility complex class-I
membrane clusters is controlled by the actin cytoskeleton. Biophys. J. 102, 1543–1550 (2012)
19. Lemmon, M.A., Schlessinger, J.: Cell signaling by receptor tyrosine kinases. Cell 141, 1117–
1134 (2010)
20. Lillemeier, B., Pfeiffer, J.R., Surviladze, Z., Wilson, B.S., Davis, M.: Plasma membrane-
associated proteins are clustered into islands attached to the cytoskeleton. Proc. Natl. Acad.
Sci. U.S.A. 103(50), 18992 (2006)
21. Low-Nam, S.T., Lidke, K.A., Cutler, P.J., Roovers, R.C., van Bergen en Henegouwen, P.M.,
Wilson, B.S., Lidke, D.S.: ErbB1 dimerization is promoted by domain co-confinement and
stabilized by ligand binding. Nature Structural and Molecular Biology 18 (2011)
22. MacGabhann, F., Popel, A.S.: Systems Biology of Vascular Endothelial Growth Factors. Micro-
circulation 15(8), 715–738 (2008)
23. Niehaus, A.M.S., Edwards, J.S., Plechac, P., Tribe, R.: Microscopic Simulation of Membrane
Molecule Diffusion on Corralled Membrane Surfaces. Biophys. J. 94(5), 1551–1564 (2008)
24. Pezzarossa, A., Fenz, S., Schmidt, T.: Probing structure and dynamics of the cell membrane
with single fluorescent proteins. In: Fluorescent Proteins II, pp. 185–212. Springer, Berlin,
Heidelberg (2011)
25. Popov, A.V., Agmon, N.: Three-dimensional simulations of reversible bimolecular reactions:
the simple target problem. J Chem Phys 115(19), 8921–8932 (2001)
26. Pryor, M.M., Low-Nam, S.T., Halász, Á.M., Lidke, D.S., Wilson, B.S., Edwards, J.S.: Dynamic
transition states of ErbB1 phosphorylation predicted by spatial-stochastic modeling. Biophys.
J. 105(6), 1533–1543 (2013)
27. Pryor, M.M., Steinkamp, M.P., Halász, Á.M., Chen, Y., Yang, S., Smith, M.S.,
Zahoransky-Kohalmi, G., Swift, M., Xu, X-P., Hanein, D., Volkmann, N., Lidke, D., Edwards, J.S., Wilson,
B.S.: Orchestration of ErbB3 signaling through heterointeractions and homointeractions. Mol.
Biol. Cell 26:4109–4123 (2015)
28. Shaikh, S.R., Edidin, M.A.: Membranes are not just rafts. Chem. Phys. Lipids 144(1), 1–3 (2006)
29. Schlessinger, J.: Ligand-induced, receptor-mediated dimerization and activation of EGF receptor.
Cell 110, 669–672 (2002)
30. Schmidt, T., Schütz, G.J.: Single-Molecule Analysis of Biomembranes. In: Handbook of Single-
Molecule Biophysics, pp. 19–42. Springer US, New York, NY (2009)
31. Schmidt, T., Schütz, G.J., Baumgartner, W., Gruber, H.J., Schindler, H.: Photophysics and
motion of single fluorescent molecules in phospholipid membranes. J. Phys. Chem 99, 17662–
17668 (1995)
32. Shinar, G., Alon, U., Feinberg, M.: Sensitivity and robustness in chemical reaction networks.
SIAM Journal on Applied Mathematics 69(4), 977–998 (2009)
33. Simson, R., Yang, B., Moore, S.E., Doherty, P., Walsh, F.S., Jacobson, K.A.: Structural
mosaicism on the submicron scale in the plasma membrane. Biophys. J. 74, 297–308 (1998)
34. van Kampen, N.G.: Stochastic Processes in Physics and Chemistry. North-Holland, Amsterdam
(1992)
35. von Smoluchowski, M.V.: Z. Phys. Chem. 92, 129 (1917)
36. Sontag, E.: Monotone and near-monotone biochemical networks. Systems and Synthetic Biol-
ogy 1, 59–87 (2007)
37. Spira, F., Mueller, N.S., Beck, G., von Olshausen, P., Beig, J., Wedlich-Söldner, R.: Patchwork
organization of the yeast plasma membrane into numerous coexisting domains. Nature Cell
Biol. 14(6), 640–648 (2012)
38. Takahashi, K., Tanase-Nicola, S., ten Wolde, P.R.: Spatio-temporal correlations can drastically
change the response of a MAPK pathway. Proc Natl Acad Sci USA 107(6), 2473–2478 (2010)
39. Wieser, S., Moertelmaier, M., Fuertenbauer, E., Stockinger, H., Schütz, G.J.: (Un)confined
diffusion of CD59 in the plasma membrane determined by high-resolution single molecule microscopy.
Biophys. J. 92(10), 3719–3728 (2007). doi:10.1529/biophysj.106.095398
40. Woese, C.R.: A new biology for a new century. Microbiology and Molecular Biology Reviews
68(2), 173–186 (2004)
41. Yang, S., Raymond-Stintz, M.A., Ying, W., Zhang, J., Lidke, D.S., Steinberg, S., Williams,
L., Oliver, J.M., Wilson, B.S.: Mapping ErbB receptors on breast cancer cell membranes
during signal transduction. Journal of Cell Science 120(16), 2763–2773 (2007). doi:10.1242/
jcs.007658
Distribution Approximations for the
Chemical Master Equation: Comparison
of the Method of Moments and the System
Size Expansion
Alexander Andreychenko, Luca Bortolussi, Ramon Grima,
Philipp Thomas and Verena Wolf
Abstract The stochastic nature of chemical reactions has resulted in an increasing
research interest in discrete-state stochastic models and their analysis. A widely used
approach is the description of the temporal evolution of such systems in terms of a
chemical master equation (CME). In this paper we study two approaches for approx-
imating the underlying probability distributions of the CME. The first approach is
based on an integration of the statistical moments and the reconstruction of the dis-
tribution based on the maximum entropy principle. The second approach relies on an
analytical approximation of the probability distribution of the CME using the system
size expansion, considering higher order terms than the linear noise approximation.
We consider gene expression networks with unimodal and multimodal protein dis-
tributions to compare the accuracy of the two approaches. We find that both methods
provide accurate approximations to the distributions of the CME while having dif-
ferent benefits and limitations in applications.
A. Andreychenko · L. Bortolussi · V. Wolf (corresponding author)
Modelling and Simulation Group, Saarland University, Saarbrücken, Germany
e-mail: wolf@cs.uni-saarland.de
A. Andreychenko
e-mail: makedon@cs.uni-saarland.de
L. Bortolussi
Department of Mathematics and Geosciences, University of Trieste, Trieste, Italy
e-mail: luca@dmi.units.it
R. Grima
School of Biological Sciences, University of Edinburgh, Edinburgh, UK
e-mail: ramon.grima@ed.ac.uk
P. Thomas
Department of Mathematics, Imperial College London, London, UK
e-mail: p.thomas@imperial.ac.uk

1 Introduction
It is widely recognized that noise plays a crucial role in shaping the behavior of
biological systems [1–4]. Part of such noise can be explained by intrinsic fluctua-
tions of molecular concentrations inside a living cell, caused by the randomness of
biochemical reactions, and fostered by the low numbers of certain molecular species
[2]. As a consequence of this insight, stochastic modeling has rapidly become very
popular [5], dominated by Markov models based on the chemical master equation
(CME) [5, 6].
The CME represents a system of differential equations that specifies the time
evolution of a discrete-state stochastic model that explicitly accounts for the dis-
creteness and randomness of molecular interactions. It has therefore been widely
used to model gene regulatory networks, signaling cascades and other intracellular
processes which are significantly affected by the stochasticity inherent in reactions
involving low numbers of molecules [7].
A solution of the CME yields the probability distribution over population vectors
that count the number of molecules of each chemical species. While a numerical
solution of the CME is rather straightforward, i.e. via a truncation of the state space
[8, 9], for most networks the combinatorial complexity of the underlying state space
renders efficient numerical integration infeasible. Therefore, the stochastic simula-
tion algorithm (SSA), a Monte Carlo technique, is commonly used to derive statistical
estimates of the corresponding state probabilities [10].
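As a concrete illustration of the SSA, the sketch below implements the direct method for a hypothetical birth-death process (production at constant rate `k_birth`, degradation at rate `k_death` per molecule). The model and all names are assumptions chosen for brevity; they are not taken from the case studies of this chapter.

```python
import random

def ssa_birth_death(k_birth, k_death, x0, t_end, rng=None):
    """Simulate one trajectory of 0 -> X (k_birth), X -> 0 (k_death * x)."""
    if rng is None:
        rng = random.Random()
    t, x = 0.0, x0
    times, states = [t], [x]
    while True:
        a_birth = k_birth
        a_death = k_death * x
        a_total = a_birth + a_death
        if a_total <= 0.0:
            break  # no reaction can fire any more
        t += rng.expovariate(a_total)  # exponential waiting time
        if t > t_end:
            break
        # Choose the reaction with probability proportional to its propensity.
        if rng.random() * a_total < a_birth:
            x += 1
        else:
            x -= 1
        times.append(t)
        states.append(x)
    return times, states
```

Repeating such runs and histogramming the final states gives the statistical estimate of the state probabilities mentioned above; the cost grows with the number of reaction events, which motivates the approximation methods discussed next.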
An alternative to stochastic simulation is to rely on approximation methods
that can provide fast and accurate estimates of some aspects of stochastic models.
Typically, most approximation methods focus on the estimation of moments of the
distributions [11–16]. However, two promising approaches for the approximate com-
putation of the distribution underlying the CME have recently been developed, whose
complexity is independent of the molecular population sizes.
The first method discussed here is based on the inverse problem, i.e., reconstructing
the probability distribution from its moments. To this end, a closure on the
moment equations is employed which yields an approximation of the evolution of
all moments up to order $K$ of the joint distribution [14–16]. Thus, instead of solving
one equation per population vector we solve
$\sum_{k=1}^{K} \binom{N_S + k - 1}{k}$ equations if $N_S$ is
final time instant, it is possible to reconstruct the corresponding marginal probabil-
ity distributions using the maximum entropy principle [17, 18]. The reconstruction
requires the solution of a nonlinear constrained optimization problem. Nevertheless,
the integration of the moment equations and the reconstruction of the underlying
distribution can for most systems be carried out very efficiently and thus allows a
fast approximation of the CME.
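The number of moment equations, $\sum_{k=1}^{K} \binom{N_S + k - 1}{k}$, is easy to evaluate, and doing so shows how slowly it grows compared with the state space. The helper below is a small sketch; the function name is an assumption.

```python
from math import comb

def num_moment_equations(n_species, max_order):
    """Number of moments of order 1..K for N_S species."""
    return sum(comb(n_species + k - 1, k) for k in range(1, max_order + 1))
```

For two species and moments up to order two this gives 2 means plus 3 second-order moments, that is 5 equations, independent of how many molecules are present.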
The second method, which is based on van Kampen’s system size expansion [19],
does not resort to moments but instead represents a direct analytical approximation
of the probability distribution of the CME. Unlike the method of moments, the tech-
nique assumes that the distribution can be expanded about its deterministic limit
rate equations in powers of a small parameter, the inverse of the system size. For biochemical
systems, the system size coincides with the volume to which the reactants are confined.
The leading order term of this expansion is given by the linear noise approximation
which predicts that the fluctuations about the rate equation solution are approxi-
mately Gaussian distributed [19] in agreement with the central limit theorem that is
valid for large numbers of molecules. For low molecule numbers, the non-Gaussian
corrections to this law can be investigated systematically using the higher order terms
in the system size expansion. A solution to this expansion has recently been given
in closed form as a series of the probability distributions in the inverse system size
[20]. Although in general the positivity of this distribution approximation cannot be
guaranteed, it often provides simple and accurate analytical approximations to the
non-Gaussian distributions underlying the CME.
Since many reaction networks involve very small molecular populations, it is
often questionable whether the system size expansion and moment based approaches
can appropriately capture their underlying discreteness. For example, a gene can be
either ‘on’ or ‘off’ while the number of mRNA molecules is of the order of only a few
tens on average. In these situations, hybrid approaches
are more appropriate, and supported by convergence results in a hybrid sense in
the thermodynamic limit [21]. A hybrid moment approach for the solution of the
CME integrates a small master equation for the small populations while a system
of conditional moment equations is integrated for the large populations which is
coupled with the equation for the small populations. Equations of unlikely states
of the small populations can be truncated and then, with a similar computational
effort, a more accurate approximation of the CME is obtained compared to the
standard method of moments. Similarly, a conditional system size expansion can
be constructed that tracks the probabilities of the small populations and applies
the system size expansion to the large populations conditionally. Presently, such an
approach has employed the linear noise approximation for gene regulatory networks
with slow promoters invoking timescale separation [22]. The validity of a conditional
system size expansion including higher than linear noise approximation terms is,
however, still under question.
Given these recent developments, it remains to be clarified how these two competing
approximation methods perform in practice. Here, we carry out a comparative study
between numerical results obtained using the method of moments and analytical
results obtained from the system size expansion for two common gene expression
networks. The outline of the manuscript is the following: In Sect. 2 we briefly
review the CME formulation. Section 3 outlines the methods of (conditional) moments
and the reconstruction of the probability distributions using the maximum entropy
principle. In Sect. 4 the approximate solution of the CME using the SSE is reviewed.
We then carry out two detailed case studies in Sect. 5. In the first case study, we
investigate a model of a constitutively expressed gene leading to a unimodal protein
distribution. In the second case study, we examine the efficacy of the described
hybrid approximations, using the method of conditional moments and the conditional
system size expansion, for the prediction of multimodal protein distributions from
the expression of a self-activating gene. These two examples are typical scenarios
encountered in more complex models, and as such provide ideal benchmarks for a
qualitative and quantitative comparison of the two methods. We conclude with a
discussion in Sect. 6.
2 Stochastic Chemical Kinetics
A biochemical reaction network is specified by a set of $N_S$ different chemical
species $S_1, \dots, S_{N_S}$ and by a set of $R$ reactions of the form
$$\ell^-_{r,1} S_1 + \dots + \ell^-_{r,N_S} S_{N_S} \xrightarrow{k_r} \ell^+_{r,1} S_1 + \dots + \ell^+_{r,N_S} S_{N_S}, \qquad 1 \le r \le R.$$
Given a reaction network, we define a continuous-time Markov chain $\{X(t), t \ge 0\}$,
where $X(t) = (X_1(t), \dots, X_{N_S}(t))$ is a random vector whose $i$-th entry $X_i(t)$
is the number of molecules of type $S_i$. If $X(t) = x = (x_1, \dots, x_{N_S}) \in \mathbb{N}_0^{N_S}$
is the state of the process at time $t$ and $x_i \ge \ell^-_{r,i}$ for all $i$, then the
$r$-th reaction corresponds to a possible transition from state $x$ to state $x + v_r$,
where $v_r \in \mathbb{Z}^{N_S}$ is the change vector with entries
$v_{r,i} = \ell^+_{r,i} - \ell^-_{r,i}$. The rate of the reaction is given by the propensity
function $\gamma_r(x)$, with $\gamma_r(x)\,dt$ being the probability of a reaction of index
$r$ occurring in a time instant $dt$, assuming that the reaction volume is well-stirred.
The most common form of propensity function follows from the principle of mass action
and depends on the volume $\Omega$ to which the reactants are confined,
$$\gamma_r(x) := \Omega k_r \prod_{i=1}^{N_S} \Omega^{-\ell^-_{r,i}} \binom{x_i}{\ell^-_{r,i}},$$
as it can be derived on the basis of physical kinetics [6, 23].
We now fix the initial condition $x_0$ of the Markov process to be deterministic and
let $\Pi(x, t) = \mathrm{Prob}(X(t) = x \mid X(0) = x_0)$ for $t \ge 0$. The time evolution
of $\Pi(x, t)$ is governed by the Chemical Master Equation (CME) as
$$\frac{d\Pi(x, t)}{dt} = \sum_{r=1}^{R} \big(\gamma_r(x - v_r)\,\Pi(x - v_r, t) - \gamma_r(x)\,\Pi(x, t)\big). \tag{1}$$
We remark that the probabilities $\Pi(x, t)$ are uniquely determined when we consider
the equations of all states that are reachable from the initial conditions.
Example 1 In the sequel we will use the following system as a running example. It
describes the expression of a gene with enzymatic degradation [24].
$$\emptyset \xrightarrow{k_0} M \xrightarrow{k_1} \emptyset, \qquad M \xrightarrow{k_2} M + P, \qquad P \xrightarrow{f(x_P)} \emptyset,$$
where $f(x_P)$ is a propensity function that depends on the number of molecules of
type P (see also Sect. 5.1 for more details). A state of the associated Markov process
is a vector $x = (x_M, x_P)$ where $x_M, x_P \in \mathbb{N}_0$ are the number of mRNA (M) and
protein (P) molecules in the system.
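As an illustration of the Markov process behind Example 1, the following is a minimal sketch of a direct-method stochastic simulation (SSA) of one trajectory. It is not the implementation used for the results in this chapter. The function name `ssa_example1`, the zero initial state and the seed are assumptions made for illustration; the default parameter values and the Michaelis–Menten form of $f(x_P)$ are those given in Sect. 5.1.

```python
import random

def ssa_example1(t_end, k0=8.0, k1=10.0, k2=100.0, v_M=100.0, K_M=20.0,
                 omega=1.0, seed=0):
    """Simulate one trajectory of Example 1 up to t_end; returns (x_M, x_P)."""
    rng = random.Random(seed)
    # Change vectors of the four reactions, in the order of the rate list below.
    changes = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    x_M, x_P, t = 0, 0, 0.0  # assumed initial state: no mRNA, no protein
    while True:
        rates = [
            omega * k0,                               # 0 -> M
            k1 * x_M,                                 # M -> 0
            k2 * x_M,                                 # M -> M + P
            omega * v_M * x_P / (omega * K_M + x_P),  # P -> 0 with rate f(x_P)
        ]
        total = sum(rates)                 # always positive because omega * k0 > 0
        t += rng.expovariate(total)        # exponential waiting time
        if t > t_end:
            return x_M, x_P
        # Choose reaction r with probability rates[r] / total.
        u = rng.random() * total
        r, acc = 0, rates[0]
        while acc <= u and r < len(rates) - 1:
            r += 1
            acc += rates[r]
        if rates[r] == 0.0:                # guard against a rounding edge case
            continue
        x_M += changes[r][0]
        x_P += changes[r][1]
```

Repeating such runs with different seeds and collecting the final protein counts gives a histogram estimate of the marginal distribution $\Pi(x_P, t)$.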
3 Method of Moments
For most realistic examples the number of reachable states is extremely large or
even infinite, which renders an efficient numerical integration of Eq. (1) impossible.
An approximation of the moments of the distribution over time can be obtained by
considering the corresponding moment equations that describe the dynamics of the
moments up to order K < ∞. We refer to this approach as the method of moments
(MM). In this section we will briefly discuss the derivation of the corresponding
moment equations following the lines of Ale et al. [16].
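Let $f : \mathbb{N}_0^{N_S} \to \mathbb{R}^n$ be a function that is independent of $t$, where
$n \in \mathbb{N}$. In the sequel we will exploit the following relationship,
$$\frac{d}{dt} E\big(f(X(t))\big) = \sum_x f(x) \cdot \frac{d}{dt}\Pi(x, t) = \sum_{r=1}^{R} E\Big(\gamma_r(X(t)) \cdot \big(f(X(t) + v_r) - f(X(t))\big)\Big). \tag{2}$$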
For $f(x) = x$ this yields a system of equations for the population means
$$\frac{d}{dt} E(X(t)) = \sum_{r=1}^{R} v_r\, E\big(\gamma_r(X(t))\big). \tag{3}$$
Note that the system of ODEs in Eq. (3) is only closed if at most monomolecular mass
action reactions (reactant stoichiometries with $\sum_{i=1}^{N_S} \ell^-_{r,i} \le 1$) are
involved. For most networks the latter condition is not true and higher order moments
appear on the right side.
Let us write $\mu_i(t)$ for $E(X_i(t))$ and $\mu(t)$ for the vector with entries $\mu_i(t)$,
$1 \le i \le N_S$. Then a Taylor expansion of the function $\gamma_r(X(t))$ about the mean
$E(X(t))$ yields
$$E(\gamma_r(X)) = \gamma_r(\mu) + \frac{1}{1!} \sum_{i=1}^{N_S} E(X_i - \mu_i)\, \frac{\partial}{\partial \mu_i} \gamma_r(\mu) + \frac{1}{2!} \sum_{i=1}^{N_S} \sum_{k=1}^{N_S} E\big((X_i - \mu_i)(X_k - \mu_k)\big)\, \frac{\partial^2}{\partial \mu_i\, \partial \mu_k} \gamma_r(\mu) + \dots, \tag{4}$$
where we omitted $t$ to improve the readability. Note that $E(X_i(t) - \mu_i) = 0$.
Moreover, assuming that all propensities follow the law of mass action and all
reactions are at most bimolecular, the terms of order three and more disappear. By
letting $C_{ik}$ be the covariance $E[(X_i(t) - \mu_i)(X_k(t) - \mu_k)]$ we get
$$E(\gamma_r(X)) = \gamma_r(\mu) + \frac{1}{2} \sum_{i=1}^{N_S} \sum_{k=1}^{N_S} C_{ik}\, \frac{\partial^2}{\partial \mu_i\, \partial \mu_k} \gamma_r(\mu). \tag{5}$$
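That the second-order expansion in Eq. (5) is exact for propensities that are at most quadratic can be checked numerically. The sketch below uses a hypothetical single-species dimerization-type propensity $\gamma(x) = c\,x(x-1)/2$, where the constant `c` and both function names are assumptions made for illustration, and compares the directly averaged propensity with the right-hand side of Eq. (5) evaluated from the sample mean and the (population) variance.

```python
def expected_propensity_direct(samples, c):
    """Average of gamma(x) = c * x * (x - 1) / 2 over the given samples."""
    return sum(c * x * (x - 1) / 2 for x in samples) / len(samples)

def expected_propensity_eq5(samples, c):
    """Right-hand side of Eq. (5): gamma(mu) + 0.5 * C * gamma''(mu)."""
    n = len(samples)
    mu = sum(samples) / n
    var = sum((x - mu) ** 2 for x in samples) / n   # C_11, population variance
    gamma_at_mean = c * mu * (mu - 1) / 2
    second_derivative = c                           # d^2/dmu^2 of c * mu * (mu - 1) / 2
    return gamma_at_mean + 0.5 * var * second_derivative
```

For any list of molecule counts the two functions agree up to rounding, which is the statement that no moments beyond the covariances enter Eq. (5) for bimolecular mass action kinetics.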
Next, we derive an equation for the covariances by first exploiting the relationship
$$\frac{d}{dt} C_{ik} = \frac{d}{dt} E(X_i X_k) - \frac{d}{dt}(\mu_i \mu_k) = \frac{d}{dt} E(X_i X_k) - \Big(\frac{d}{dt}\mu_i\Big)\mu_k - \mu_i \Big(\frac{d}{dt}\mu_k\Big), \tag{6}$$
and if we couple this equation with the equations for the means, the only unknown
term that remains is the derivative $\frac{d}{dt} E(X_i X_k)$ of the second moment. For this,
we can apply the same strategy as before by using Eq. (2) for the test function
$f(x) := x_i x_k$. This equation will contain the expectations $E(\gamma_r(X))$ and
$E(\gamma_r(X) X_i)$, for which we again consider the Taylor expansion about the mean.
For the expansion of $E(\gamma_r(X) X_i)$, moments of order three come into play since
derivatives of order three of $\gamma_r(x) x_i$ may be nonzero. It is possible to take these
terms into account by deriving additional equations for moments of order three and
higher. These equations will then include moments of even higher order, such that
theoretically we end up with an infinite system of equations. Different strategies to
close the equations have been proposed in the literature [25–29]. Here we consider a
low dispersion closure and assume that all moments of order $> K$ that are centered
around the mean are equal to zero. E.g., if we choose $K = 2$, then we can obtain a
closed system of equations that does not include higher order terms. Then we can
integrate the time evolution of the means and that of the covariances and variances.
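Example 2 To illustrate the method we consider Example 1. Assuming that all central
moments of order three and higher are equal to zero, we get the following equations
for the means of the species,
$$\begin{aligned}
\frac{d}{dt}\mu_M &= \Omega k_0 - k_1 \mu_M, \\
\frac{d}{dt}\mu_P &= k_2 \mu_M - f(\mu_P) - \frac{1}{2} \frac{\partial^2 f(\mu_P)}{\partial \mu_P^2}\, C_{P,P},
\end{aligned} \tag{7}$$
where the expectations of the species are given by $\mu_M$ and $\mu_P$ and the variance of
P is given by $C_{P,P}$. The production term $k_2 \mu_M$ follows from Eq. (3) applied to
the reaction $M \to M + P$ of Example 1. Equations for covariances are omitted.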
In many situations, the approximation provided by the MM approach is very
accurate even if only the means and covariances are considered. In general, however,
numerical results show that the approximation tends to become worse if systems
exhibit complex behavior such as multistability or oscillations, and the resulting
equations may become very stiff [14, 30]. For some systems increasing the number
of moments improves the accuracy [16]. Generally, however, such convergence is
not seen [31], except in the limit of large volumes [32].
3.1 Equations for Conditional Moments
For many reaction networks a hybrid moment approach, called the method of conditional
moments (MCM), can be more advantageous. In this approach we decompose X(t)
into small and large populations. The reason is that small populations (often describing
the activation state of a gene) have distributions that assign a significant mass
of probability only to a comparatively small number of states. In this case we can
integrate the probabilities directly to get a more accurate approximation of the CME
compared to an integration of the moments.
Formally, we consider $X(t) = (Y(t), Z(t))$, where $Y(t)$ corresponds to the small,
and $Z(t)$ to the large populations. Similarly, we write $x = (y, z)$ for the states of the
process and $v_r = (\hat v_r, \tilde v_r)$ for the change vectors, $r \in \{1, \dots, R\}$. Then, we
condition on the state of the small populations and apply the MM to the conditional
moments but not to the distribution of $Y(t)$. The probabilities for $Y(t) = y$ are called
mode probabilities. Now, Eq. (1) becomes
$$\frac{d\Pi(y, z)}{dt} = \sum_{r=1}^{R} \big[\gamma_r(y - \hat v_r, z - \tilde v_r)\,\Pi(y - \hat v_r, z - \tilde v_r) - \gamma_r(y, z)\,\Pi(y, z)\big]. \tag{8}$$
Next, we sum over all possible $z$ to get the time evolution of the mode probabilities
$\Pi(y) = \sum_z \Pi(y, z)$:
$$\begin{aligned}
\frac{d}{dt}\Pi(y) &= \sum_z \sum_{r=1}^{R} \gamma_r(y - \hat v_r, z - \tilde v_r)\,\Pi(y - \hat v_r, z - \tilde v_r) - \sum_z \sum_{r=1}^{R} \gamma_r(y, z)\,\Pi(y, z) \\
&= \sum_{r=1}^{R} \Pi(y - \hat v_r)\, E[\gamma_r(y - \hat v_r, Z) \mid Y = y - \hat v_r] - \sum_{r=1}^{R} \Pi(y)\, E[\gamma_r(y, Z) \mid Y = y].
\end{aligned} \tag{9}$$
Note that in this small master equation that describes the change of the mode
probabilities over time, the sum runs only over those reactions that modify y, since
for all other reactions the terms cancel out. Moreover, on the right side we have
only mode probabilities of neighboring modes and conditional expectations of the
z-part of the reaction rate. For the latter, we can use a Taylor expansion about the
conditional population means. Similar to Eq. (5), this yields an equation that involves
the conditional means and centered conditional moments of second order (variances
and covariances). Thus, in order to close the system of equations, we need to derive
equations for the time evolution of the conditional means and centered conditional
moments of higher order. Since the mode probabilities may become zero, we first
derive an equation for the evolution of the partial means (conditional means
multiplied by the probability of the condition)
$$\begin{aligned}
\frac{d}{dt}\big(E[Z \mid y]\,\Pi(y)\big) &= \sum_z z\, \frac{d}{dt}\Pi(y, z) \\
&= \sum_{r=1}^{R} E[(Z + \tilde v_r)\,\gamma_r(y - \hat v_r, Z) \mid y - \hat v_r]\; \Pi(y - \hat v_r) - \sum_{r=1}^{R} E[Z\,\gamma_r(y, Z) \mid y]\; \Pi(y),
\end{aligned} \tag{10}$$
where in the second line we applied Eq. (8) and simplified the result. The conditional
expectations $E[(Z + \tilde v_r)\,\gamma_r(y - \hat v_r, Z) \mid y - \hat v_r]$ and
$E[Z\,\gamma_r(y, Z) \mid y]$ are then replaced by their Taylor expansion about the conditional
means such that the equation involves only conditional means and higher centered
conditional moments [33]. For higher centered conditional moments, similar equations
can be derived. If all centered conditional moments of order higher than $K$ are assumed
to be zero, the result is a (closed) system of differential algebraic equations (algebraic
equations are obtained whenever a mode probability $\Pi(y)$ is equal to zero). However,
it is possible to transform the system of differential algebraic equations into a system
of (ordinary) differential equations after truncating modes with insignificant
probabilities. Then we can get an accurate approximation of the solution after applying
standard numerical integration methods. For the case study in Sect. 5 we derived the
system of ODEs using the tool SHAVE [34], which implements a truncation based
approach and solves the ODEs using an explicit Runge–Kutta method.
Example 3 We apply the MCM approach to Example 1. The modes of the system
correspond to the number of molecules of type M. Here we provide equations only for
the first two modes, $M = 0$ and $M = 1$. The equations for the corresponding mode
probabilities $(p_0, p_1)$ and the expected number of proteins $(\mu_{P,0}, \mu_{P,1})$ conditioned
on the modes are given below. This equation system has to be closed not only by
setting to zero all the central moments of order higher than 2, but also by truncating
the infinite population of M. This can be done by assuming that $p_n = 0$ from a certain
number $n$ on.
$$\begin{aligned}
\frac{d}{dt} p_0 &= -\Omega k_0 p_0 + k_1 p_1, \\
\frac{d}{dt} p_1 &= \Omega k_0 p_0 - k_1 p_1 - \Omega k_0 p_1, \\
\frac{d}{dt}\big(\mu_{P,0}\, p_0\big) &= -\Omega k_0 \mu_{P,0}\, p_0 + k_1 \mu_{P,1}\, p_1 - f(\mu_{P,0})\, p_0 - \frac{1}{2} \frac{\partial^2 f(\mu_{P,0})}{\partial \mu_{P,0}^2}\, C_{P,P,0}\, p_0, \\
\frac{d}{dt}\big(\mu_{P,1}\, p_1\big) &= \Omega k_0 \mu_{P,0}\, p_0 - \Omega k_0 \mu_{P,1}\, p_1 - k_1 \mu_{P,1}\, p_1 - f(\mu_{P,1})\, p_1 - \frac{1}{2} \frac{\partial^2 f(\mu_{P,1})}{\partial \mu_{P,1}^2}\, C_{P,P,1}\, p_1 + k_2\, p_1.
\end{aligned} \tag{11}$$
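The mode probabilities in Example 3 do not depend on the protein, so the effect of truncating the modes can be illustrated on the small master equation for M alone. The sketch below is only a sketch under stated assumptions and is not the SHAVE tool: it integrates $dp_n/dt = \Omega k_0 p_{n-1} + k_1 (n+1) p_{n+1} - (\Omega k_0 + k_1 n) p_n$ for $n = 0, \dots, n_{\max}$ with a classical fourth-order Runge–Kutta scheme. The function names, the step size and the choice to block transcription out of the last retained mode (so that probability is conserved) are assumptions; the rate constants are those of Sect. 5.1.

```python
import math

def mode_rhs(p, k0, k1, omega):
    """Right-hand side of the truncated master equation for the mRNA modes."""
    n_max = len(p) - 1
    dp = [0.0] * len(p)
    for n, pn in enumerate(p):
        # Transcription out of the last mode is blocked (truncation assumption).
        birth = omega * k0 * pn if n < n_max else 0.0
        death = k1 * n * pn
        dp[n] -= birth + death
        if n < n_max:
            dp[n + 1] += birth
        if n > 0:
            dp[n - 1] += death
    return dp

def integrate_modes(t_end, n_max=10, k0=8.0, k1=10.0, omega=1.0, dt=0.002):
    """Integrate the mode probabilities from p_0 = 1 with classical RK4."""
    p = [1.0] + [0.0] * n_max
    for _ in range(int(round(t_end / dt))):
        s1 = mode_rhs(p, k0, k1, omega)
        s2 = mode_rhs([x + 0.5 * dt * s for x, s in zip(p, s1)], k0, k1, omega)
        s3 = mode_rhs([x + 0.5 * dt * s for x, s in zip(p, s2)], k0, k1, omega)
        s4 = mode_rhs([x + dt * s for x, s in zip(p, s3)], k0, k1, omega)
        p = [x + dt * (a + 2 * b + 2 * c + d) / 6
             for x, a, b, c, d in zip(p, s1, s2, s3, s4)]
    return p
```

For these rates the stationary mode probabilities approach a Poisson distribution with mean $\Omega k_0 / k_1 = 0.8$, so that only a handful of modes carry significant probability and the truncation error is small.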
3.2 Maximum Entropy Distribution Reconstruction
Given the (approximated) moments of a distribution up to order $K$ it is possible
to reconstruct the corresponding probability distribution. Since a finite number of
moments defines a set of distributions, we apply the maximum entropy principle
where we choose among the distributions that fulfill the moment equations the one
that maximizes the entropy. For instance, the normal distribution is chosen among
all continuous distributions with equal mean and variance.
In the sequel we describe how to obtain the one-dimensional marginal probability
distributions of a reaction network when we use the moments up to order $K$. We
mostly follow Andreychenko et al. [17] and simply write $X$ for the corresponding
molecular count at some fixed time instant $t$. Given a sequence of $K + 1$ noncentral
moments$^1$ $E(X^k) = \mu^{(k)}$, $k = 0, 1, \dots, K$, the set $G$ of allowed (discrete)
probability distributions consists of all non-negative functions $g$ for which the
following conditions hold:
$$\sum_x x^k g(x) = \mu^{(k)}, \qquad k = 0, 1, \dots, K. \tag{12}$$
Here $x$ ranges over possible arguments (usually $x \in \mathbb{N}_0$) with positive probability.
Note that we have included the constraint $\mu^{(0)} = 1$ in order to guarantee that $g$ is a
probability distribution. According to the maximum entropy principle, we choose
the distribution $q \in G$ that maximizes the entropy $H(g)$, i.e.,
$$q = \arg\max_{g \in G} H(g) = \arg\max_{g \in G} \Big(-\sum_x g(x) \ln g(x)\Big). \tag{13}$$
The problem of finding the maximum entropy distribution is a nonlinear constrained
optimization problem that can be addressed by considering the Lagrangian
functional
$$\mathcal{L}(g, \lambda) = H(g) - \sum_{k=0}^{K} \lambda_k \Big(\sum_x x^k g(x) - \mu^{(k)}\Big),$$
where $\lambda = (\lambda_0, \dots, \lambda_K)$ are the corresponding Lagrangian multipliers. The
maximum of the unconstrained Lagrangian $\mathcal{L}$ corresponds to the solution of the
constrained maximum entropy problem (13). Note that setting the derivatives of
$\mathcal{L}(g, \lambda)$ w.r.t. $\lambda_k$ to zero results in the moment constraints. The general form of
the maximum is obtained by setting $\partial \mathcal{L} / \partial g(x)$ to zero, which yields
$$q(x) = \exp\Big(-1 - \sum_{k=0}^{K} \lambda_k x^k\Big) = \frac{1}{Z} \exp\Big(-\sum_{k=1}^{K} \lambda_k x^k\Big),$$
where
$$Z = e^{1 + \lambda_0} = \sum_x \exp\Big(-\sum_{k=1}^{K} \lambda_k x^k\Big) \tag{14}$$
is a normalization constant. The last equality in Eq. (14) follows from the fact that
$q$ is a distribution and thus $\lambda_0$ is uniquely determined by $\lambda_1, \dots, \lambda_K$. Next we
insert the above general form into the Lagrangian, thus transforming the problem into
an unconstrained convex minimization problem of the dual function w.r.t. the variables
$\lambda_k$. This yields the dual function
$$\Psi(\lambda) = \ln Z + \sum_{k=1}^{K} \lambda_k \mu^{(k)}. \tag{15}$$
1 Noncentral moments can be easily obtained from central ones. For instance, the second noncentral
moment $\mu^{(2)}$ is obtained from the variance $\sigma^2$ and the mean $\mu$ as $\mu^{(2)} = \sigma^2 + \mu^2$.
According to the Kuhn–Tucker theorem, the solution $\lambda^* = \arg\min \Psi(\lambda)$ of the
minimization problem determines the solution $q$ of the original constrained
optimization problem in Eq. (13) (see [35]). We solve this unconstrained optimization
problem using the Newton method from the MATLAB numerical minimization package
minFunc, where we choose $\lambda^{(0)} = (0, \dots, 0)$ as an initial starting point and use the
approximated gradient and Hessian matrix. Since for the systems that we consider the
dual function is convex [36–38], there exists a unique minimum
$\lambda^* = (\lambda_1^*, \dots, \lambda_K^*)$ where all first derivatives are zero and where the Hessian is
positive definite. The final result $\lambda^*$ of the iteration yields the distribution
$$\tilde q(x) = \exp\Big(-1 - \sum_{k=0}^{K} \lambda_k^* x^k\Big),$$
which is an approximation of the marginal distribution of the process at time t.
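The minimization of the dual function (15) can be sketched in a few lines. The following is an illustrative sketch only and not the minFunc-based implementation described above: it uses a damped Newton iteration with the exact gradient $\mu^{(k)} - E_q(X^k)$ and the exact Hessian (the covariance matrix of the powers of $X$ under $q$) on a finite, user-supplied support. The function names, the rescaling of the support to $[0, 1]$ for numerical conditioning, and the backtracking rule are assumptions made for illustration.

```python
import math

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][j] * x[j] for j in range(r + 1, n))) / M[r][r]
    return x

def maxent(moments, support, iters=200, tol=1e-10):
    """moments[k-1] = E(X^k) for k = 1..K; returns q(x) on the given support."""
    K = len(moments)
    scale = float(max(support))
    u = [x / scale for x in support]                        # rescaled states
    mu = [m / scale ** (k + 1) for k, m in enumerate(moments)]

    def evaluate(lam):
        expo = [-sum(l * ux ** (k + 1) for k, l in enumerate(lam)) for ux in u]
        top = max(expo)                                     # shift to avoid overflow
        w = [math.exp(e - top) for e in expo]
        Z = sum(w)
        psi = math.log(Z) + top + sum(l * m for l, m in zip(lam, mu))
        return [wi / Z for wi in w], psi                    # q and dual value, Eq. (15)

    lam = [0.0] * K                                         # starting point lambda = 0
    q, psi = evaluate(lam)
    for _ in range(iters):
        raw = [sum(qi * ux ** k for qi, ux in zip(q, u)) for k in range(2 * K + 1)]
        grad = [mu[k] - raw[k + 1] for k in range(K)]
        if max(abs(g) for g in grad) < tol:
            break
        hess = [[raw[i + j + 2] - raw[i + 1] * raw[j + 1] for j in range(K)]
                for i in range(K)]
        step = solve_linear(hess, [-g for g in grad])
        slope = sum(g * s for g, s in zip(grad, step))
        t = 1.0
        while t > 1e-12:                                    # backtracking line search
            trial = [l + t * s for l, s in zip(lam, step)]
            q_new, psi_new = evaluate(trial)
            if psi_new <= psi + 1e-4 * t * slope:
                break
            t *= 0.5
        lam, q, psi = trial, q_new, psi_new
    return q
```

For example, `maxent([10.0, 110.0], list(range(61)))` returns a distribution on $\{0, \dots, 60\}$ whose first two noncentral moments match the prescribed values. The support passed to the routine plays the role of the truncated support discussed in the text.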

The sequence of moments $\mu^{(k)}$, $k = 0, \dots, K$, obtained using MM or MCM serves
as an input to the maximum entropy reconstruction procedure. Due to the high
sensitivity with respect to the accuracy of the highest order moment $E(X_P^K)$, we
compute all moments up to $E(X_P^{K+1})$ to get a better estimation but use moments only
up to $E(X_P^K)$ in the entropy maximization.
The maximum entropy approach provides a useful extension of moment-based
integration methods of the CME. In order to reconstruct a probability distribution,
we require that moments of the reconstructed distribution match those obtained by
the method of moments. We apply the maximum entropy principle to the infinitely
many distributions satisfying these constraints, i.e., we pick the distribution that
maximizes the uncertainty. The advantage of this choice is that we can efficiently
learn the maximum entropy distribution in one or two dimensions. Furthermore, it
seems to produce reasonably accurate results, as validated a posteriori [17, 18].
The reconstruction of the distributions of higher dimension is more involved due
to numerical instabilities arising when using the ill-conditioned Hessian matrix
[36, 39] in the optimization procedure. As discussed with the numerical results that
we present in the sequel, problems may arise if the support of the distribution (which
serves as an input argument to the optimization procedure) is not chosen adequately.
The true support of the distribution is often infinite and a reasonable truncation has to
be used [40]. A possible solution is addressed in [18], where we introduce an iterative
heuristic-based procedure for approximating the support. Generally, however, this
approach gives very accurate results relative to the information about the distribution
given by the moment constraints.
4 System Size Expansion of the Probability Distribution
We here describe the use of the system size expansion [19] to obtain approximate
but simple expressions for the probability distributions of the CME. For simplicity,
we will focus on the case of a single species and follow the approach developed by
Thomas and Grima [20]. The system size expansion makes use of the macroscopic
limit of the CME, which is attained for large reaction volumes. When concentrations
are held constant, large volumes imply large numbers of molecules, and hence the
macroscopic concentration $[X]$ follows the deterministic rate equation
$$\frac{d[X]}{dt} = \sum_{r=1}^{R} \nu_r f_r^{(0)}([X]). \tag{16}$$
Note that here $\nu_r = \nu_{r,1}$ because we only consider a single species. A prerequisite
for Eq. (16) to be the deterministic limit of the CME is that the rate functions satisfy
the scaling
$$\gamma_r(\Omega[X], \Omega) = \Omega \Big( f_r^{(0)}([X]) + \Omega^{-1} f_r^{(1)}([X]) + \dots \Big), \tag{17}$$
which holds for instance in the case of mass action kinetics [20]. The first term
in this series, $f_r^{(0)}([X]) = \lim_{\Omega \to \infty} \gamma_r(\Omega[X]) / \Omega$, is the macroscopic rate
function. Note that for a unimolecular reaction only the first term in Eq. (17) is present
while for a bimolecular one the first two terms are nonzero.
The expansion allows one to characterize the deviations from this deterministic
behavior by successively taking into account higher order terms. Specifically, van
Kampen proposed separating the dynamics of the molecular concentration into the
deterministic part $[X]$ and a fluctuating component $\varepsilon$ that reads
$$\frac{x}{\Omega} = [X] + \Omega^{-1/2} \varepsilon. \tag{18}$$
This ansatz can be used to expand the CME in powers of the inverse square root of
the volume by changing variables from $x$ to $\varepsilon$. Assuming that this transformation is
continuous, i.e., $\Pi(\varepsilon, t) = \Pi(\Omega[X] + \Omega^{1/2}\varepsilon, t)\, \Omega^{1/2}$, the CME becomes
$$\frac{\partial}{\partial t} \Pi(\varepsilon, t) = \mathcal{L}_0 \Pi(\varepsilon, t) + \sum_{k=1}^{N} \Omega^{-k/2} \mathcal{L}_k \Pi(\varepsilon, t) + O(\Omega^{-(N+1)/2}). \tag{19}$$
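When the above equation is truncated after the $N$th term, it approximates the CME by
a partial differential equation. It is shown in Ref. [20] that the differential operators
$\mathcal{L}_k$ can be written down explicitly and are given by
$$\mathcal{L}_k = \sum_{s=0}^{\lceil k/2 \rceil} \sum_{p=1}^{k - 2(s-1)} \frac{D_{p,s}^{k - p - 2(s-1)}}{p!\,(k - p - 2(s-1))!}\, (-\partial_\varepsilon)^p\, \varepsilon^{k - p - 2(s-1)}, \tag{20}$$
where $\lceil \cdot \rceil$ denotes the ceiling value and the coefficients
$$D_{p,s}^{q} = \sum_{r=1}^{R} (\nu_r)^p\, \frac{\partial^q f_r^{(s)}([X])}{\partial [X]^q}, \tag{21}$$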
depend only on the solution of the rate equation. We will solve Eq. (19) perturbatively
by expanding the probability density as
$$\Pi(\varepsilon, t) = \sum_{j=0}^{N} \Omega^{-j/2} \pi_j(\varepsilon, t) + O(\Omega^{-(N+1)/2}). \tag{22}$$
Using the above series in Eq. (19) and equating order $\Omega^0$ terms, one finds
$$\Big(\frac{\partial}{\partial t} - \mathcal{L}_0\Big) \pi_0 = 0, \tag{23a}$$
while equating terms to order $\Omega^{-j/2}$, one finds
$$\Big(\frac{\partial}{\partial t} - \mathcal{L}_0\Big) \pi_j(\varepsilon) = \mathcal{L}_1 \pi_{j-1} + \dots + \mathcal{L}_j \pi_0, \tag{23b}$$
for $j > 0$. The first equation, Eq. (23a), is called the linear noise approximation [19]
and its solution is a Gaussian distribution
$$\pi_0(\varepsilon, t) = \frac{1}{\sqrt{2\pi \sigma^2(t)}} \exp\Big(-\frac{\varepsilon^2}{2\sigma^2(t)}\Big), \tag{24}$$
with zero mean, meaning that the rate equations are valid on average. Its variance
$\sigma^2(t)$ follows the equation
$$\frac{\partial \sigma^2}{\partial t} = 2 J(t)\, \sigma^2 + D_2^0(t), \tag{25}$$
where we use the shorthand $D_p^q = D_{p,0}^q$ and have denoted by $J(t) = D_1^1(t)$ the
Jacobian of the rate equation (16).
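The solution of the partial differential equations (23b) can then be obtained using
the eigenfunctions of $\mathcal{L}_0$. In particular, we can write
$$\pi_j(\varepsilon, t) = \sum_{m=1}^{3j} a_m^{(j)}(t)\, \psi_m(\varepsilon, t)\, \pi_0(\varepsilon, t),$$
where $\psi_m(\varepsilon, t) = \pi_0^{-1} (-\partial_\varepsilon)^m \pi_0 = \frac{1}{\sigma^m(t)} H_m\big(\frac{\varepsilon}{\sigma(t)}\big)$ and $H_m$ are the
Hermite polynomials. The solution of the system size expansion is therefore
$$\Pi(\varepsilon, t) = \pi_0(\varepsilon, t) \Big(1 + \sum_{j=1}^{N} \Omega^{-j/2} \sum_{m=1}^{3j} a_m^{(j)}(t)\, \psi_m(\varepsilon, t)\Big) + O(\Omega^{-(N+1)/2}). \tag{26a}$$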
Mathematically speaking, the above equation represents an asymptotic series solution
to the CME. Equations for the coefficients can be obtained using the orthogonality
of the Hermite polynomials, and are given by the ordinary differential equations
$$\Big(\frac{\partial}{\partial t} - nJ\Big) a_n^{(j)} = \sum_{k=1}^{j} \sum_{m=0}^{3(j-k)} a_m^{(j-k)} \sum_{s=0}^{\lceil k/2 \rceil} \sum_{p=1}^{k - 2(s-1)} D_{p,s}^{k - p - 2(s-1)}\, \mathcal{I}_{mn}^{p,\, k - p - 2(s-1)}, \tag{26b}$$
with
$$\mathcal{I}_{mn}^{\alpha\beta} = \frac{\sigma^{\beta - \alpha + n - m}}{\alpha!} \sum_{s=0}^{\min(n - \alpha,\, m)} \binom{m}{s} \frac{(\beta + \alpha + 2s - (m + n) - 1)!!}{(\beta + \alpha + 2s - (m + n))!\, (n - \alpha - s)!}, \tag{26c}$$
and zero for odd $(\alpha + \beta) - (m + n)$. Note that $a_n^{(j)} = 0$ when $n + j$ is odd. Explicit
expressions for the probability density can now be evaluated to any desired order.
Relatively simple and explicit solutions are obtained in steady state conditions by
letting the time derivative on the left-hand side of Eq. (26b) go to zero. It follows
that the first term in Eq. (26a), the linear noise approximation $\pi_0$, describes the
distribution in the infinite volume limit. For finite volumes, implying finite numbers
of molecules, higher order terms given by Eq. (26) have to be taken into account.
It is, however, the case that this approximation can become inaccurate for
processes whose mean behavior differs significantly from the rate equation. This
is because in van Kampen’s ansatz, Eq. (18), the average concentration is
approximated by the solution of the rate equation $[X]$. For biochemical networks
involving bimolecular reactions, the rate equation provides only an approximation to
the true average predicted by the CME because its propensities depend nonlinearly on
the concentrations [41]. In applications it is important to account for these deviations
from the rate equation solution and the linear noise approximation. We therefore
calculate the true concentration mean and variance using the system size expansion
a priori and then perform the expansion about the true mean. A posteriori, this leads
to an expansion about the mean
$$\frac{x}{\Omega} = \underbrace{\Big\langle \frac{x}{\Omega} \Big\rangle}_{\text{mean}} + \underbrace{\Omega^{-1/2}\, \bar\varepsilon}_{\text{fluctuations}}, \tag{27}$$
which is different from the one proposed by van Kampen, Eq. (18), who expands the
CME about the solution of the rate equation. Here, the averages are calculated from
$\langle x / \Omega \rangle = [X] + \Omega^{-1/2} \langle \varepsilon \rangle$ such that $\bar\varepsilon = \varepsilon - \langle \varepsilon \rangle$ is a centered random
variable quantifying the fluctuations about the true average. The result is the
following expansion
$$\bar\pi(\bar\varepsilon, t) = \bar\pi_0(\bar\varepsilon, t) + \sum_{j=1}^{N} \Omega^{-j/2} \sum_{m=1}^{3j} \bar a_m^{(j)}\, \psi_m(\bar\varepsilon, t)\, \bar\pi_0(\bar\varepsilon, t) + O(\Omega^{-(N+1)/2}), \tag{28a}$$
where $\bar\pi_0(\bar\varepsilon)$ is a Gaussian different from the linear noise approximation because it
is centered about the true mean (instead of the rate equation). The coefficients in the
above equation can be calculated from
$$\bar a_m^{(j)} = \sum_{k=0}^{j} \sum_{n=0}^{3k} a_n^{(k)}\, \kappa_{m-n}^{(j-k)}, \tag{28b}$$
and
$$\kappa_j^{(n)} = \frac{1}{n!} \sum_{m=0}^{\lfloor j/2 \rfloor} \sum_{k=j-2m}^{n-m} (-1)^{(j+m)} \binom{n}{k}\, B_{k,\, j-2m}\big(\{\zeta!\, a_1^{(\zeta)}\}\big)\, B_{n-k,\, m}\Big(\Big\{\frac{\zeta!}{2}\, \bar\sigma^2_{(\zeta)}\Big\}\Big). \tag{28c}$$
Here $a_1^{(j)}$ and $\bar\sigma^2_{(j)} = 2\big[a_2^{(j)} - B_{j,2}(\{\zeta!\, a_1^{(\zeta)}\}) / j!\big]$ denote the coefficients in
the expansions of mean and variance
$$\langle \varepsilon \rangle = \sum_{j=1}^{N} \Omega^{-j/2} a_1^{(j)} + O(\Omega^{-(N+1)/2}), \tag{29a}$$
$$\bar\sigma^2 = \sigma^2 + \sum_{j=1}^{N} \Omega^{-j/2}\, \bar\sigma^2_{(j)} + O(\Omega^{-(N+1)/2}), \tag{29b}$$
respectively, and $B_{n,k}(\{y_\zeta\})$ denotes the partial Bell polynomials [42] defined as
$$B_{n,k}\big(\{y_\zeta\}_{\zeta=1}^{n-k+1}\big) = {\sum}' \frac{n!}{j_1! \cdots j_{n-k+1}!} \Big(\frac{y_1}{1!}\Big)^{j_1} \cdots \Big(\frac{y_{n-k+1}}{(n-k+1)!}\Big)^{j_{n-k+1}}. \tag{30}$$
Note that $\sum'$ denotes the summation over all sequences $j_1, \dots, j_{n-k+1}$ of
non-negative integers such that $j_1 + \dots + j_{n-k+1} = k$ and
$j_1 + 2 j_2 + \dots + (n-k+1)\, j_{n-k+1} = n$. Note that the expansion about the mean has
generally fewer nonzero coefficients because $\bar a_1^{(j)} = \bar a_2^{(j)} = 0$ for all $j$. Note also
that for systems whose propensities depend at most linearly on the concentrations,
mean and variance are exact to order $\Omega^0$ (linear noise approximation), and hence for
this case expansion (26a) coincides with Eq. (28a). In Sect. 5 we show that typically a
few terms in this expansion (28a) are sufficient to capture the underlying distributions
of the CME.
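The partial Bell polynomials entering Eq. (28c) are straightforward to evaluate numerically. The sketch below does not sum over the index sequences of Eq. (30) directly; it uses the standard recurrence $B_{n,k} = \sum_{i=1}^{n-k+1} \binom{n-1}{i-1}\, y_i\, B_{n-i,k-1}$ with $B_{0,0} = 1$, which gives the same polynomials. The function name `partial_bell` and the list-based argument convention are assumptions made for illustration.

```python
from math import comb

def partial_bell(n, k, y):
    """Partial Bell polynomial B_{n,k}(y_1, ..., y_{n-k+1}); y[0] holds y_1."""
    if n == 0 and k == 0:
        return 1
    if n == 0 or k == 0:
        return 0
    # Recurrence over the size i of the block containing the first element.
    return sum(comb(n - 1, i - 1) * y[i - 1] * partial_bell(n - i, k - 1, y)
               for i in range(1, n - k + 2))
```

For instance, `partial_bell(2, 2, [y1])` evaluates to $y_1^2$, so that with $y_1 = 1!\, a_1^{(1)}$ the definition of $\bar\sigma^2_{(2)}$ reduces to $2 a_2^{(2)} - (a_1^{(1)})^2$, the order $\Omega^{-1}$ variance correction.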
A particularly relevant case, namely the stationary solution of the CME, turns out
to be obtained quite straightforwardly. For example, truncating after $\Omega^{-1}$-terms, from
Eq. (29) it follows that $\langle \varepsilon \rangle = \Omega^{-1/2} a_1^{(1)} + O(\Omega^{-3/2})$ and
$\bar\sigma^2 = \sigma^2 + \Omega^{-1} \big(2 a_2^{(2)} - (a_1^{(1)})^2\big) + O(\Omega^{-3/2})$. Letting now the time derivative on
the left-hand side of Eq. (26b) go to zero, solving for the coefficients $a_n^{(j)}$ and using
the solution in Eq. (28b), one finds that to order $\Omega^{-1/2}$ the only nonzero coefficient is
$$\bar a_3^{(1)} = -\frac{\sigma^4 D_1^2}{6J} - \frac{\sigma^2 D_2^1}{6J} - \frac{D_3^0}{18J}, \tag{31a}$$
while to order $\Omega^{-1}$, one finds
$$\begin{aligned}
\bar a_4^{(2)} &= -\frac{D_4^0}{96J} - \frac{\sigma^2 D_3^1}{24J} - \frac{\sigma^4 D_2^2}{16J} - \frac{\sigma^6 D_1^3}{24J} - \bar a_3^{(1)} \Big(\frac{3 D_2^1}{8J} + \frac{3 \sigma^2 D_1^2}{4J}\Big), \\
\bar a_6^{(2)} &= \frac{1}{2} \big(\bar a_3^{(1)}\big)^2.
\end{aligned} \tag{31b}$$
The above equations determine the series solution of the CME, Eq. (28a), in stationary
conditions to order $\Omega^{-1}$.
5 Results
In this section we will compare the (hybrid) method of moments (MM and MCM
approach) and the (conditional) system size expansion (SSE approach) by consider-
ing the quality of the resulting distribution approximations. In the former case we use
the maximum entropy approach outlined in Sect. 3.2 and the approximate solution
obtained from the SSE in the latter case. We base our comparison on two simple
but challenging examples. The first one describes the bursty expression of a protein
which degrades via an enzyme-catalyzed reaction. The second example describes
the expression of a protein activating its own expression resulting in a multimodal
protein distribution.
5.1 Example 1: Bursty Protein Production
In order to compare the performance of the two methods, we will employ a simple
model of gene expression with enzymatic degradation described in Ref. [24]. The
system is described by the following set of biochemical reactions:
$$\emptyset \xrightarrow{k_0} M \xrightarrow{k_1} \emptyset, \qquad M \xrightarrow{k_2} M + P, \qquad P + E \underset{k_4}{\overset{k_3}{\rightleftharpoons}} C \xrightarrow{k_5} E. \tag{32}$$
Recently, such active protein degradation has been recognized to be an important
factor in skewing nuclear protein distributions in mammalian cells toward high molecule
numbers [43]. In our model we explicitly account for mRNA, which is important even
when the mRNA half-life is shorter than that of the corresponding protein [44].
Assuming binding and unbinding of P and E are fast compared to protein
degradation, the degradation kinetics can be simplified as in Refs. [43, 45], resulting
in a Michaelis–Menten like kinetic rate:
$$P \xrightarrow{f(x_P)} \emptyset, \tag{33}$$
with $f(x_P) = \Omega \frac{v_M x_P}{\Omega K_M + x_P}$, where $x_P$ is the number of proteins and
$K_M = \frac{k_4 + k_5}{k_3}$ is the Michaelis–Menten constant. Less obviously, for this reduction
to hold in the stochastic case, one also requires $k_4 \gg k_5$ as shown in Refs. [45, 46].
Here rates are set to $k_0 = 8$, $k_1 = 10$, $k_2 = 100$, $v_M = 100$, $K_M = 20$, $\Omega = 1$. In
particular, the reaction rates involving mRNA are chosen such that proteins are
produced in bursts of size $b = k_2 / k_1$, i.e., on average 10 protein molecules are
synthesized from a single transcript.
5.1.1 Method of Moments and Maximum Entropy Reconstruction
We compute an approximation of the moments of species M and P up to order 4
($K + 1 = 4$) and 6 ($K + 1 = 6$) using the MM as explained in Sect. 3. The moments
of P are then used to reconstruct the marginal probability distribution of P. For
instance, given the moments $E(X_P^0), E(X_P^1), E(X_P^2), E(X_P^3), E(X_P^4)$ we approximate
the distribution of P with $\tilde q(x) \approx \Pi(X_P = x)$ (see Sect. 3.2). Due to the high
sensitivity of the maximum entropy method even to small approximation errors of
the moments of highest order considered (in this case the approximation of $E(X_P^4)$
and $E(X_P^6)$, respectively), we use moments only up to $E(X_P^3)$ and $E(X_P^5)$,
respectively, for the reconstruction, corresponding to $K = 3$ and $K = 5$ in Fig. 1.
The same holds for all results presented below. In Fig. 1 we plot the two reconstructed
distributions.
While for most parts of the support (including the tail) the distribution is accurately
reconstructed even with K = 3, the method is less precise when considering the
probability of small copy numbers of P. To improve the reconstruction, we may use
more moments, for instance K = 7. However, in this case the moment equations
become so stiff that the numerical integration fails completely. This happens due to
a combination of highly nonlinear derivatives of the rate function f (x P ) with large
values of the higher order moments.
Fig. 1 Bursty protein production: reconstruction based on MM. The reconstructed distribution
(cf. Sect. 3.2) is compared to the distribution estimated with stochastic simulations (shaded gray
area), where we use K = 3 (dashed red) and K = 5 (solid blue line) for the moments in the
maximum entropy method. We find that taking into account moments of higher order increases the
accuracy significantly, in particular in the region of 0–200 proteins
5.1.2 Solution Using the System Size Expansion
The approximate solution to the distribution functions using the system size
expansion as outlined in Sect. 4 is available for networks of a single species only. We
therefore concentrate on the case where the mRNA dynamics is much faster than
that of the protein, called the burst approximation [44]. It can be shown that the
reaction scheme (32) follows the reduced CME
$$\frac{d}{dt} \Pi(x) = \Omega h_0 \sum_{z=0}^{\infty} (E_x^{-z} - 1)\, \varphi(z)\, \Pi(x) + \Omega\, (E_x^{+1} - 1)\, \frac{v_M\, x}{\Omega K_M + x}\, \Pi(x), \tag{34}$$
where $E_x^{-z}$ is the step operator defined by $E_x^{-z} f(x) = f(x - z)$ for any function
$f$. Note that protein synthesis occurs in random bursts $z$ following a geometric
distribution $\varphi(z) = \frac{1}{1+b} \big(\frac{b}{1+b}\big)^z$ with average $b$. The relations between the
parameters in scheme (32) and Eq. (34) are $h_0 = k_0$ and $b = k_2 / k_1$, i.e., bursts
occur at the transcription rate and their mean size is the number of proteins
synthesized per transcript. Within this description protein synthesis involves many
reactions: one for each value of $z$ with probability $\varphi(z)$. The coefficients in the
expansion of the CME follow from Eq. (21), and are given by
$$D_n^m = \delta_{m,0}\, h_0\, \langle z^n \rangle_\varphi + (-1)^n\, v_M\, \frac{\partial^m}{\partial [P]^m} \frac{[P]}{K_M + [P]}, \tag{35}$$
and $D_{n,s}^m = 0$ for $s > 0$, where $[P]$ denotes the protein concentration according
to the rate equation solution and
$\langle z^n \rangle_\varphi = \sum_{z=0}^{\infty} z^n \varphi(z) = \frac{1}{1+b}\, \mathrm{Li}_{-n}\big(\frac{b}{1+b}\big)$ denotes
the average over the geometric distribution in terms of the polylogarithm function
$\mathrm{Li}_{-n}(x) = \sum_{k=1}^{\infty} k^n x^k$. The rate Eq. (16) can be solved together with the linear noise
approximation, Eq. (25), in steady state conditions. Setting $D_1^0 = b h_0 - v_M [P] / (K_M + [P])$
to zero shows that a steady state exists for $v_M > b h_0$, and the solution is
$$[P] = K_M \Big(\frac{v_M}{b h_0} - 1\Big)^{-1}, \qquad \sigma^2 = K_M (b + 1)\, \varsigma (\varsigma + 1), \tag{36a}$$
where we have defined by $\varsigma = [P] / K_M$ the reduced substrate concentration. In Fig. 2a
we show that the expansion performed about the rate equation solution leads to large
undulations; we therefore focus on the expansion about the mean. To this end, we
have to take into account higher order corrections to the first two moments. We
find $\langle \varepsilon \rangle = \Omega^{-1/2} (1 + b)\, \varsigma$ and $\bar\sigma^2 = \sigma^2 + \Omega^{-1} (b + 1)\, \varsigma\, (b(\varsigma + 2) + \varsigma + 1)$. The
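nonzero coefficients to order $\Omega^{-1}$ given by Eq. (31) then evaluate to
$$\begin{aligned}
\bar a_3^{(1)} &= \frac{\sigma^2}{6} \big(2b(\varsigma + 1) + 2\varsigma + 1\big), \\
\bar a_4^{(2)} &= \frac{\sigma^2}{4} \Big[(b + 1)^2 \varsigma^2 + (b + 1)(2b + 1)\, \varsigma + \frac{1}{6} \big(6b(b + 1) + 1\big)\Big], \\
\bar a_6^{(2)} &= \frac{1}{2} \big(\bar a_3^{(1)}\big)^2.
\end{aligned} \tag{36b}$$
The coefficients to order $\Omega^{-3/2}$ can be obtained from Eq. (28) and read
$$\begin{aligned}
\bar a_3^{(3)} &= \frac{1}{6} (b + 1)\, \varsigma\, \big[2(b + 1)^2 \varsigma^2 + 3b(2b + 3)\, \varsigma + 6b(b + 1) + 3\varsigma + 1\big], \\
\bar a_5^{(3)} &= \frac{\bar a_3^{(1)}}{20} \big[1 + 12b(b + 1) + 12(b + 1)^2 \varsigma^2 + 12(b + 1)(2b + 1)\, \varsigma\big], \\
\bar a_7^{(3)} &= \bar a_3^{(1)}\, \bar a_4^{(2)}, \qquad \bar a_9^{(3)} = \frac{1}{6} \big(\bar a_3^{(1)}\big)^3.
\end{aligned} \tag{36c}$$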
The analytical form of these coefficients represents a particularly simple way of
solving the CME. The approximation resulting from using these in Eq. (28a) is
shown in Fig. 2b (dot-dashed blue line). This approximation captures much better the
true distribution obtained from exact stochastic simulation using the SSA (shaded
gray area) than the linear noise approximation (dashed red line). We found that
including higher order terms in Eq. (28) helped to improve the agreement. However,
the resulting expressions turn out to be more elaborate and are hence omitted. A
quantitative assessment of this agreement will be given in Sect. 5.3.
Fig. 2 Bursty protein production: system size expansion solution. a We compare the solution obtained using the system size expansion about the rate equation solution, Eq. (26a), truncated after $\Omega^0$ (dashed red), $\Omega^{-3/2}$ (dot-dashed blue) and $\Omega^{-3}$ terms (solid yellow line) to stochastic simulations (shaded gray area). We observe that the series yields large undulations and negative probabilities. b The system size expansion about the mean is shown when the series in Eq. (28a) is truncated after $\Omega^0$ (dashed red), $\Omega^{-3/2}$ (dot-dashed blue) and $\Omega^{-3}$ terms (solid yellow line). We find that these approximations avoid undulations and converge rapidly with increasing truncation order to the distributions obtained from simulations using the SSA

5.2 Example 2: Cooperative Self-Activation of Gene Expression

As a second application of our methods we consider the regulatory dynamics of a single gene inducing its own, leaky expression. We therefore consider the case where gene activation occurs by binding of its own protein $P$ to two independent sites

$$G + P \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} G^{*}, \qquad G^{*} + P \underset{k_{-2}}{\overset{k_2}{\rightleftharpoons}} G^{**}, \tag{37}$$

which is modeled explicitly using mass action kinetics. In effect, there are three gene states with increasing transcriptional activity, $G$, $G^{*}$ and $G^{**}$, corresponding to zero, one or two activators bound and leading to a cooperative form of activation. Transcription of an mRNA transcript, denoted by $M$, therefore must occur via one of the following reactions

$$G \xrightarrow{k_G} G + M, \qquad G^{*} \xrightarrow{k_{G^{*}}} G^{*} + M, \qquad G^{**} \xrightarrow{k_{G^{**}}} G^{**} + M, \tag{38}$$
where $k_G$ denotes the basal transcription rate, $k_{G^{*}}$ the transcription rate of the semi-induced state $G^{*}$, and $k_{G^{**}}$ the rate of the fully induced gene. Finally, by the standard model of translation and neglecting active degradation we have

$$M \xrightarrow{k_3} M + P, \qquad M \xrightarrow{k_4} \varnothing, \qquad P \xrightarrow{k_5} \varnothing. \tag{39}$$

In the following, two parameter sets listed in Table 1, leading to moderate and low protein numbers, are considered. As we shall see, the protein distributions are multimodal in both cases, representing ideal test cases for the distribution reconstruction using conditional moment closures and the conditional system size expansion.
Table 1 The two parameter sets used to study the multimodal expression of a self-activating gene described by the reactions (37)–(39). Set (A) leads to moderate protein levels while set (B) yields low protein levels. Note that we have set Ω = 1

Parameter    Set (A)         Set (B)
k_1          5 × 10^-4       5 × 10^-4
k_-1         3 × 10^-3       2 × 10^-4
k_2          5 × 10^-4       5 × 10^-4
k_-2         2.5 × 10^-2     2 × 10^-3
k_G          4               4
k_G*         12              60
k_G**        24              160
k_3          1200            30
k_4          300             300
k_5          1               1
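To make the reaction scheme (37)–(39) concrete, the following is a minimal sketch of a direct-method stochastic simulation (SSA) of the network with Ω = 1. It is an illustration under stated assumptions, not the implementation used for the figures: the gene state is encoded by the number of bound activators (0, 1 or 2), binding consumes one protein as implied by mass action kinetics, and the function name `simulate_gene` and the short simulation time are hypothetical.

```python
import bisect
import itertools
import random


def simulate_gene(params, t_end, seed=0):
    """Direct-method SSA for reactions (37)-(39); returns (gene, M, P) at t_end."""
    k1, km1, k2, km2, kG, kGs, kGss, k3, k4, k5 = params
    rng = random.Random(seed)
    gene, m, p, t = 0, 0, 0, 0.0  # gene = number of bound activators
    transcription = (kG, kGs, kGss)
    # State changes (d_gene, d_M, d_P), in the same order as the propensities.
    changes = [(1, 0, -1), (-1, 0, 1), (1, 0, -1), (-1, 0, 1),
               (0, 1, 0), (0, 0, 1), (0, -1, 0), (0, 0, -1)]
    while True:
        propensities = [
            k1 * p if gene == 0 else 0.0,   # G + P -> G*
            km1 if gene == 1 else 0.0,      # G* -> G + P
            k2 * p if gene == 1 else 0.0,   # G* + P -> G**
            km2 if gene == 2 else 0.0,      # G** -> G* + P
            transcription[gene],            # transcription in the current state
            k3 * m,                         # M -> M + P
            k4 * m,                         # M -> 0
            k5 * p,                         # P -> 0
        ]
        cumulative = list(itertools.accumulate(propensities))
        total = cumulative[-1]
        if total == 0.0:
            return gene, m, p  # no reaction can fire any more
        t += rng.expovariate(total)
        if t > t_end:
            return gene, m, p
        # Pick the reaction whose cumulative propensity first exceeds the threshold.
        index = bisect.bisect_right(cumulative, rng.random() * total)
        d_gene, d_m, d_p = changes[index]
        gene, m, p = gene + d_gene, m + d_m, p + d_p


# Parameter set (A) of Table 1.
set_A = (5e-4, 3e-3, 5e-4, 2.5e-2, 4.0, 12.0, 24.0, 1200.0, 300.0, 1.0)
print(simulate_gene(set_A, t_end=2.0, seed=1))
```

A protein distribution would be estimated by recording the protein number over a long trajectory or over many independent runs; the sketch only returns the final state.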
5.2.1 Method of Conditional Moments and Maximum Entropy Reconstruction

We compare the distribution reconstruction using an approximation of the first 3, 5, and 7 moments of all the species obtained by the MCM (see Sect. 3). As for the previous case study, the values of the moments of $P$ are used to reconstruct the corresponding marginal distribution. However, here we use the conditional moments $E[X^k \mid G = 1]$, $E[X^k \mid G^{*} = 1]$, $E[X^k \mid G^{**} = 1]$ instead. We construct the function $\tilde q(x)$ that approximates the distribution of $P$ by applying the law of total probability

$$\tilde q(x) = \sum_{S \in \{G, G^{*}, G^{**}\}} P(S = 1)\, \tilde q(x \mid S = 1). \tag{40}$$

The results of the approximation are plotted in Fig. 3. As we can see, the multimodality of the distribution is captured well, and the quality of the reconstructed distribution is good, in particular when using 7 moments.
Fig. 3 Self-activating gene: reconstruction based on MCM. The reconstructed distribution (cf. Eq. (40)) for the case of moderate (a) and slow (b) protein production is compared to the distribution estimated with stochastic simulations (shaded gray area), where we use K = 3, K = 5, and K = 7 for the conditional moments in the maximum entropy method. Again we find a significant improvement if moments of higher order are taken into account, in particular in those regions where the shape of the distribution is complex such as the region for 0–5 proteins in case of (b)
5.2.2 Conditional System Size Expansion
An alternative technique to approximate distributions for gene regulatory networks

with multiple gene states has been given by Thomas et al. [22]. The method makes use
of timescale separation by grouping reactions into reactions that change the gene state
and reactions that affect only the protein distributions. Based on a conditional variant
of the linear noise approximation it was shown that when the gene transitions are
slow, protein distributions are well approximated by Gaussian mixture distributions.
Implicit in this approach was, of course, that the protein numbers are sufficiently large to justify an application of the linear noise approximation. We will here extend this framework by considering higher order terms in the system size expansion to account for low numbers of protein molecules.
To this end, we describe by the vector $G$ one of the three gene states $G = (1, 0, 0)$, $G^{*} = (0, 1, 0)$, and $G^{**} = (0, 0, 1)$ and by $x$ the number of proteins. We will assume that gene transitions between these states evolve slowly on a timescale $1/\mu$. Rescaling time via $\tau = t/\mu$, the CME on the slow timescale reads

$$\frac{d\Pi(G, x, \tau)}{d\tau} = \mu W_0(G)\,\Pi(G, x, \tau) + W_1 \Pi(G, x, \tau), \tag{41}$$

where $W_0(G)$ describes the reactions (38) and (39) in the burst approximation

$$W_0(G) = \sum_{z=0}^{\infty} (E_x^{-z} - 1)\, \mathbf{k} \cdot G\, \varphi(z) + (E_x^{+1} - 1)\, k_5 x, \tag{42}$$

where $\mathbf{k} = (k_G, k_{G^{*}}, k_{G^{**}})$ and $W_1$ denotes the transition matrix of the slow gene binding kinetics given by the reactions (37). Note that as before $\varphi(z)$ is the geometric distribution with mean $b = k_3/k_4$. Using the definition of conditional probability, we can write $\Pi(G, x, \tau) = \Pi(x \mid G, \tau)\,\Pi(G, \tau)$ which transforms Eq. (41) into
$$\Pi(G, \tau)\, \frac{d\Pi(x \mid G, \tau)}{d\tau} + \Pi(x \mid G, \tau)\, \frac{d\Pi(G, \tau)}{d\tau} = \mu\, \Pi(G, \tau)\, W_0(G)\, \Pi(x \mid G, \tau) + W_1 \Pi(x \mid G, \tau)\, \Pi(G, \tau). \tag{43}$$
Marginalizing the above equation we find
$$\frac{d\Pi(G, \tau)}{d\tau} = \left[\sum_{x=0}^{\infty} W_1 \Pi(x \mid G)\right] \Pi(G, \tau), \tag{44}$$
where the term in brackets is a conditional average of the slow dynamics over the protein concentrations. In steady state conditions the above is equal to the equations $0 = \Pi(G)[P|G] - \Pi(G^{*})K_1$, $0 = \Pi(G^{*})[P|G^{*}] - K_2 \Pi(G^{**})$ and $\Pi(G^{**}) = 1 - \Pi(G) - \Pi(G^{*})$ by conservation of probability. Here $[P|G]$ denotes the conditional protein concentration that remains to be obtained from $\Pi(x \mid G)$. The steady state solution can be found analytically

$$\begin{aligned}
\Pi(G) &= \frac{K_1 K_2}{K_1 K_2 + K_2 [P|G] + [P|G][P|G^{*}]}, \\
\Pi(G^{*}) &= \frac{K_2 [P|G]}{K_1 K_2 + K_2 [P|G] + [P|G][P|G^{*}]}, \\
\Pi(G^{**}) &= \frac{[P|G][P|G^{*}]}{K_1 K_2 + K_2 [P|G] + [P|G][P|G^{*}]},
\end{aligned} \tag{45}$$

where $K_1 = k_{-1}/k_1$ and $K_2 = k_{-2}/k_2$ are the respective dissociation constants of the DNA-protein binding. The protein distribution is then given by the sum over gene states of the probability of finding $x$ proteins given a particular gene state, weighted by the probability of the gene being in that state:

$$\Pi(x) = \sum_{G} \Pi(x \mid G)\, \Pi(G). \tag{46}$$
It is, however, difficult to obtain $\Pi(x \mid G)$ analytically. We will therefore employ the limit of slow gene transitions ($\mu \to \infty$) in Eq. (43) to obtain

$$W_0(G)\, \Pi_\infty(x \mid G) = 0, \tag{47}$$

where $\Pi_\infty(x \mid G) = \lim_{\mu \to \infty} \Pi(x \mid G, \tau)$. We can now perform the system size expansion for the conditional distribution $\Pi_\infty(x \mid G)$ that is determined by Eq. (47) using the ansatz

$$\left.\frac{x}{\Omega}\right|_{G} = [P|G] + \Omega^{-1/2}\, \epsilon|_{G}, \tag{48}$$
for the conditional random variables describing the protein fluctuations. The coefficients in the expansion of the CME (47) are

$$D_n^m = \delta_{m,0} \left(\bar{\mathbf{k}} \cdot G\, \langle z^n \rangle_\varphi + (-1)^n k_5 [P|G]\right) + \delta_{m,1} (-1)^n k_5, \tag{49}$$

with $\bar{\mathbf{k}} = \mathbf{k}/\Omega$ and $D_{n,s}^m = 0$ for $s > 0$. The solution of the rate equation and the conditional linear noise approximation are given by

$$[P|G] = \bar{\mathbf{k}} \cdot G\, \frac{b}{k_5}, \qquad \sigma^2_{P|G} = [P|G](1 + b), \tag{50}$$
respectively. Note that there are no further corrections in $\Omega$ to this conditional linear noise approximation because the conditional CME (47) depends only linearly on the protein variables. The conditional distribution can now be obtained using the result (26a). Associating with $\pi_0(\epsilon \mid G)$ a centered Gaussian of variance as given in Eq. (50), we find to order $\Omega^{-1}$,

$$\Pi_\infty(\epsilon \mid G) = \pi_0(\epsilon \mid G) \left[1 + \Omega^{-1/2} a_3^{(1)}(G)\, \psi_{3,G}(\epsilon) + \Omega^{-1} a_4^{(2)}(G)\, \psi_{4,G}(\epsilon) + \Omega^{-1} a_6^{(2)}(G)\, \psi_{6,G}(\epsilon)\right] + O(\Omega^{-3/2}). \tag{51a}$$

By the definition given before Eq. (26a) the polynomials $\psi_{m,G}(\epsilon)$ depend on the gene state via the conditional variance Eq. (50). The coefficients obtained from Eq. (31) take the particularly simple form

$$a_3^{(1)}(G) = \frac{1}{6} \left(2b^2 + 3b + 1\right) [P|G], \tag{51b}$$

$$a_4^{(2)}(G) = \frac{1}{24} (b + 1) \left(6b^2 + 6b + 1\right) [P|G], \qquad a_6^{(2)}(G) = \frac{1}{2} \left(a_3^{(1)}(G)\right)^2. \tag{51c}$$

Finally, we remark that $\Pi(x \mid G)$ and $\Pi(\epsilon \mid G)$ are related by $\Pi(x \mid G) = \Omega^{-1/2}\, \Pi(\epsilon = \Omega^{-1/2} x - \Omega^{1/2} [P|G] \mid G)$.
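The leading-order part of this construction is easy to evaluate numerically. The sketch below is illustrative and restricted to order $\Omega^0$ with Ω = 1: it computes the conditional means and variances of Eq. (50), the gene state probabilities of Eq. (45), and the resulting Gaussian mixture of Eq. (46) for parameter set (A) of Table 1. The $\Omega^{-1/2}$ and $\Omega^{-1}$ corrections of Eq. (51a) are deliberately not included, and the function names are hypothetical.

```python
import math


def conditional_means(k_vec, k3, k4, k5):
    """Eq. (50) with Omega = 1: [P|G] = k_G * b / k5, with b = k3 / k4."""
    b = k3 / k4
    return [k * b / k5 for k in k_vec], b


def gene_state_probabilities(K1, K2, means):
    """Eq. (45); means = ([P|G], [P|G*], [P|G**])."""
    p_G, p_Gs = means[0], means[1]
    denom = K1 * K2 + K2 * p_G + p_G * p_Gs
    return [K1 * K2 / denom, K2 * p_G / denom, p_G * p_Gs / denom]


def gaussian_mixture(x, weights, means, b):
    """Eq. (46) with each Pi(x|G) replaced by a Gaussian of variance [P|G](1+b)."""
    total = 0.0
    for w, mu in zip(weights, means):
        var = mu * (1.0 + b)
        total += w * math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)
    return total


# Parameter set (A) of Table 1, with Omega = 1.
k1, km1, k2, km2 = 5e-4, 3e-3, 5e-4, 2.5e-2
means, b = conditional_means((4.0, 12.0, 24.0), k3=1200.0, k4=300.0, k5=1.0)
weights = gene_state_probabilities(km1 / k1, km2 / k2, means)
density = [gaussian_mixture(x, weights, means, b) for x in range(0, 201)]
print(means, weights)
```

For set (A) the conditional means are 16, 48 and 96 proteins, so the three mixture components are well separated, which is the origin of the multimodality discussed in the text.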
In Fig. 4a, this conditional system size expansion is compared to stochastic simulations using the SSA. We find that the conditional linear noise approximation, Eq. (51) truncated after $\Omega^0$, captures well the multimodal character of the distributions but fails to predict the precise location of its modes. In contrast, the conditional approximation of Eq. (51) taking into account up to $\Omega^{-1}$ terms accurately describes the location of these distribution maxima. We note, however, that a continuous approximation such as Eq. (51a) may fail in situations when the effective support of the conditional distributions spans only a few molecules. Such a case is depicted in Fig. 4b. In this case the distributions are captured better by an approximation with discrete support as has been given in Ref. [20], Eqs. (35) and (36) therein. The resulting approximation (blue filled circles), using the analytical form of the coefficients (51), is in excellent agreement with stochastic simulations performed using the SSA (shaded gray area). These findings highlight clearly the need to go beyond the conditional Gaussian approximation for the two cases studied here.
5.3 Comparison of Numerical Results
In order to compare both methods quantitatively, we calculated absolute |Π (x) −

Π ∗ (x)| and relative errors |Π (x) − Π ∗ (x)|/Π (x) between the exact distribution
Π (x), which was estimated from simulations via the SSA, and the distribution
approximation obtained either via the MM or via the SSE denoted by Π ∗ (x). The
results for the first case study are shown in Fig. 5. We observe that the maximum
absolute and relative errors occur close to the boundary of zero molecules for both
approximation methods. However, the SSE is more accurate than MM here. In this region a direct representation of the probabilities (as in the hybrid approaches) may be more appropriate.

Fig. 4 Self-activating gene: conditional system size expansion. The conditional system size expansion (cSSE), Eqs. (45), (46) and (51), for the moderate (a) and slow (b) protein production is compared to stochastic simulations (SSA). While the conditional linear noise approximation ($\Omega^0$) captures the multimodal character of the distribution, it misses the precise location of its modes. We find that the $\Omega^{-1}$-estimate of the cSSE given by Eq. (51) agrees much better with the SSA. b The discrete approximation of Ref. [20], see main text for details, is shown when truncated after $\Omega^0$ and $\Omega^{-1}$ terms. The analytical form of the coefficients in Eq. (51) has been used with parameter set B in Table 1

Fig. 5 Bursty protein production: absolute and relative error. We plot the absolute (a) and relative (b) errors of the MM (for moments up to order K = 3 and K = 5, Fig. 1) and the SSE (truncated after $\Omega^{-3/2}$ and $\Omega^{-3}$ terms, Fig. 2). The SSE for this example yields the percentage error $\|\epsilon\|_V$ to be 5.1% ($\Omega^{-3/2}$) and 2.8% ($\Omega^{-3}$) while the moment based approach yields 5.6% (K = 3) and 2.0% (K = 5). Both approaches become more accurate as more moments or higher order terms in the SSE are taken into account. For both methods, the maximum errors occur at zero molecules

Fig. 6 Self-activating gene: absolute and relative error. We plot the absolute (left) and relative (right) errors of the MCM (for moments up to order K = 3 and K = 7) and the cSSE (truncated after $\Omega^0$ and $\Omega^{-1}$ terms). For moderate protein production (a), corresponding to the distributions shown in Figs. 3a and 4a, the MCM yields a percentage error $\|\epsilon\|_V$ of 2.4% (K = 3) and 1.0% (K = 7) while the cSSE yields 5.4% ($\Omega^0$) and 1.4% ($\Omega^{-1}$), respectively. For slow protein production (b), corresponding to the distributions shown in Figs. 3b and 4b, we find $\|\epsilon\|_V$ = 10.9% (K = 3) and 1.7% (K = 7) for the MCM as well as $\|\epsilon\|_V$ = 8.3% ($\Omega^0$) and 1.9% ($\Omega^{-1}$) for the cSSE, respectively

For measuring the overall agreement of the distributions we computed the percentage statistical distance

$$\|\epsilon\|_V = \frac{100\%}{2} \sum_{x=0}^{\infty} |\Pi(x) - \Pi^{*}(x)|. \tag{52}$$
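As a small illustration of Eq. (52), the percentage statistical distance between two distributions given as lists of probabilities for x = 0, 1, 2, ... can be computed as follows. This is a sketch with a hypothetical function name; padding the shorter list with zeros is an assumption about how truncated distributions are to be compared.

```python
def percentage_statistical_distance(p, q):
    """Eq. (52): 50% times the L1 distance between two distributions on 0, 1, 2, ..."""
    length = max(len(p), len(q))
    p = list(p) + [0.0] * (length - len(p))  # pad the shorter list with zeros
    q = list(q) + [0.0] * (length - len(q))
    return 50.0 * sum(abs(a - b) for a, b in zip(p, q))


print(percentage_statistical_distance([0.5, 0.5], [0.25, 0.75]))
```

Identical distributions give 0 and distributions with disjoint support give 100, in line with the interpretation of the distance given in the text.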
This distance can also be interpreted as the maximum percentage difference between
the probabilities of all possible events assigned by the two distributions [47, 48]
and achieves its maximum (100% error) when the distributions do not overlap. The
numerical values given in the caption of Fig. 5 reveal that the estimation errors of
the MM and the SSE decrease as more moments or higher order terms in the SSE
are taken into account. The respective error estimates are of the same order of mag-
nitude. However, the analytical solution obtained using the SSE, given in Sect. 5.1.2
including only low order terms is slightly more accurate than the MM with only few
moments, while the MM with a larger number of moments becomes more accurate
than the SSE including up to Ω −3 -terms.
For the second case study, we find that the absolute and relative estimation errors
of the method of moments and the SSE are of the same order of magnitude, compare
Fig. 6. However, we found that the method of moments including three conditional
moments (K = 3) is overall more accurate than the conditional linear noise approx-
imation (Ω 0 ). The approximations become comparable as higher order conditional
moments and higher orders in conditional SSE are employed. However, the method
of moments including 7 conditional moments turned out to be slightly more accurate

than the analytical SSE solution to order Ω −1 given in Sect. 5.2.2.
6 Discussion
We have here studied the accuracy of two recently proposed approximations for
the probability distribution of the CME. The method of moments utilizes moment
closure to approximate the first few moments of the CME from which the distribution
is reconstructed via the maximum entropy principle. In contrast, the SSE method
does not rely on a moment approximation but instead the probability distribution is
obtained analytically as a series in a large parameter that corresponds roughly to the
number of molecules involved in the reactions. Interestingly, our comparative study
revealed that both methods yield comparable results. While generally both methods
provide highly accurate approximations for the distribution tails and capture well the
overall shape of the distributions, we found that for both methods the largest errors
occur close to the boundary of zero molecules. We observed a similar behavior
when conditional moments or the conditional SSE were used. These discrepancies
could be resolved by taking into account higher order moment closure schemes
or, equivalently, by taking into account higher order terms in the expansion of the
probability distribution.
From a computational point of view, the method based on moment closure is
limited by the number of moments that can be numerically integrated due to the
stiffness of the resulting high-dimensional ODE system. Our investigation showed
that such difficulties are encountered when one closes the moment equations beyond
the 8th moment. In contrast, the analytical solution provided by the SSE technique
does not suffer from these issues because it is provided in closed form. We note
however that the SSE solution given here is limited to a single species while the
method of moments has no such limitation. Moreover, the conditional SSE solution
relies on timescale separation which the method of conditional moments does not
assume.
The computational cost of the analytical approximation provided by the SSE is
generally less than that of the moment based approach because it avoids numeri-
cal integration and optimization. This fact may be particularly advantageous when
wide ranges of parameters are to be studied, as for instance in parameter estimation
from experimental data. We note, however, that the moment-based approach is still much faster than the SSA because it avoids large amounts of ensemble averaging. Therefore the moment-based approach may be preferable in situations where the SSE cannot be applied, as we have mentioned above. We hence conclude that both approximation schemes provide complementary strategies for the analysis of stochastic behavior of biological systems. Most importantly, they both provide computationally more efficient strategies compared with simulation and numerical integration of the CME while preserving a high degree of accuracy. The methods thus have a high potential for the analysis of large-scale models.
Acknowledgements PT acknowledges support from the Royal Commission for the Exhibition of
1851 in form of a Research Fellowship.
References
1. H. H. McAdams and A. Arkin.: “Stochastic mechanisms in gene expression”. Proc Natl Acad
Sci 94.3 (1997), pp. 814–819.
2. P. S. Swain, M. B. Elowitz, and E. D. Siggia.: “Intrinsic and extrinsic contributions to stochas-
ticity in gene expression”. Proc Natl Acad Sci 99.20 (2002), pp. 12795–12800.
3. T. J. Perkins and P. S. Swain.: “Strategies for cellular decision-making”. Mol Syst Biol 5 (2009).
4. M. B. Elowitz et al.: “Stochastic gene expression in a single cell”. Science 297.5584 (2002),
p. 1183.
5. D. Wilkinson.: Stochastic Modelling for Systems Biology. Chapman & Hall, 2006.
6. D. T. Gillespie.: “Exact stochastic simulation of coupled chemical reactions”. J Phys Chem
81.25 (1977), pp. 2340–2361.
7. N. Maheshri and E. K. O’Shea.: “Living with noisy genes: how cells function reliably with
inherent variability in gene expression”. Annu Rev Biophys Biomol Struct 36 (2007), pp.
413–434.
8. B. Munsky and M. Khammash.: “The finite state projection algorithm for the solution of the
chemical master equation”. J Chem Phys 124.4 (2006), p. 044104.
9. M. Mateescu et al.: “Fast Adaptive Uniformisation of the Chemical Master Equation”. IET
Syst Biol 4.6 (2010), pp. 441–452.
10. D. T. Gillespie.: “Stochastic simulation of chemical kinetics”. Annu Rev Phys Chem 58 (2007),
pp. 35–55.
11. J. Elf and M. Ehrenberg.: “Fast evaluation of fluctuations in biochemical networks with the
linear noise approximation”. Genome Res 13.11 (2003), pp. 2475–2484.
12. R Grima. “An effective rate equation approach to reaction kinetics in small volumes: Theory
and application to biochemical reactions in nonequilibrium steady-state conditions”. J Chem
Phys 133.3 (2010), p. 035101.
13. P. Thomas, H. Matuschek, and R. Grima.: “How reliable is the linear noise approximation of
gene regulatory networks?” BMC Genomics 14. Suppl 4 (2013), S5.
14. S. Engblom.: “Computing the moments of high dimensional solutions of the master equation”.
Appl Math Comput 180 (2 2006), pp. 498 -515.
15. C. Gillespie. “Moment-closure approximations for mass-action models”. IET Syst Biol 3.1
(2009), pp. 52–58.
16. A. Ale, P. Kirk, and M. P. H. Stumpf.: “A general moment expansion method for stochastic
kinetic models”. J Chem Phys 138.17 (2013), p. 174101.
17. A. Andreychenko, L. Mikeev, and V. Wolf.: “Model Reconstruction for Moment-Based Stochastic Chemical Kinetics”. ACM Trans Model Comput Simul 25.2 (2015), 12:1–12:19.
18. A. Andreychenko, L. Mikeev, and V. Wolf.: “Reconstruction of Multimodal Distributions for
Hybrid Moment-based Chemical Kinetics”. To appear in Journal of Coupled Systems and
Multiscale Dynamics (2015).
19. N. G. Van Kampen.: Stochastic Processes in Physics and Chemistry. Third edition. Amsterdam: Elsevier, 1997.
20. P. Thomas and R. Grima.: “Approximate probability distributions of the master equation”.
Phys Rev E 92.1 (2015), p. 012120.
21. L. Bortolussi.: “Hybrid Behaviour of Markov Population Models”. Information and Computa-
tion (2015 (accepted)).
22. P. Thomas, N. Popović, and R. Grima.: “Phenotypic switching in gene regulatory networks”.
Proc Natl Acad Sci 111.19 (2014), pp. 6994–6999.
23. D. T. Gillespie: “A diffusional bimolecular propensity function”. J Chem Phys 131.16 (2009),
p. 164109.
24. P. Thomas, H. Matuschek, and R. Grima.: “Computation of biochemical pathway fluctuations
beyond the linear noise approximation using iNA”. Bioinformatics and Biomedicine (BIBM),
2012 IEEE International Conference on. IEEE. 2012, pp. 1–5.
25. P. Whittle.: “On the Use of the Normal Approximation in the Treatment of Stochastic
Processes”. J R Stat Soc Series B Stat Methodol 19.2 (1957), pp. 268–281.
26. J. H. Matis and T. R. Kiffe.: “On interacting bee/mite populations: a stochastic model with
analysis using cumulant truncation”. Environ Ecol Stat 9.3 (2002), pp. 237–258.
27. I. Krishnarajah et al.: “Novel moment closure approximations in stochastic epidemics”. Bull
Math Biol 67.4 (2005), pp. 855–873.
28. A. Singh and J. P. Hespanha.: “Lognormal moment closures for biochemical reactions”. Deci-
sion and Control, 2006 45th IEEE Conference on. IEEE. 2006, pp. 2063–2068.
29. A. Singh and J. P. Hespanha.: “Approximate moment dynamics for chemically reacting sys-
tems”. Automatic Control, IEEE Transactions on 56.2 (2011), pp. 414–418.
30. D. Schnoerr, G. Sanguinetti, and R. Grima.: “Comparison of different moment-closure approximations for stochastic chemical kinetics”. J Chem Phys 143.18 (2015), p. 185101.
31. D. Schnoerr, G. Sanguinetti, and R. Grima.: “Validity conditions for moment closure approxi-
mations in stochastic chemical kinetics”. J Chem Phys 141.8 (2014), p. 084103.
32. R. Grima.: “A study of the accuracy of moment-closure approximations for stochastic chemical
kinetics”. J Chem Phys 136.15 (2012), p. 154105.
33. J. Hasenauer et al.: “Method of conditional moments for the Chemical Master Equation”. J
Math Biol (2013), pp. 1–49.
34. M. Lapin, L. Mikeev, and V. Wolf.: “SHAVE - Stochastic Hybrid Analysis of Markov Population
Models”. Proceedings of the 14th International Conference on Hybrid Systems: Computation
and Control (HSCC’11). ACM International Conference Proceeding Series. 2011.
35. A.L. Berger, V.J.D. Pietra, S.A.D. Pietra, A Maximum Entropy Approach to Natural Language
Processing. Comput Ling 22(1), 39–71 (1996)
36. R. Abramov.: “The multidimensional maximum entropy moment problem: a review of numer-
ical methods”. Commun Math Sci 8.2 (2010), pp. 377–392.
37. Z. Wu et al.: “A fast Newton algorithm for entropy maximization in phase determination”.
SIAM Rev 43.4 (2001), pp. 623–642.
38. L. R. Mead and N. Papanicolaou: “Maximum entropy in the problem of moments”. J Math
Phys 25 (1984), p. 2404.
39. G. W. Alldredge et al.: “Adaptive change of basis in entropy-based moment closures for linear
kinetic equations”. J Comput Phys 258 (2014), pp. 489–508.
40. Á. Tari, M. Telek, and P. Buchholz.: “A unified approach to the moments based distribution estimation-unbounded support”. Formal Techniques for Computer Systems and Business Processes. Springer, 2005, pp. 79–93.
41. J. Elf et al.: “Mesoscopic kinetics and its applications in protein synthesis”. Systems Biology.
Springer, 2005, pp. 95–18.
42. L. Comtet.: Advanced Combinatorics: The art of finite and infinite expansions. Springer Science
& Business Media, 1974.
43. E. Giampieri et al.: “Active Degradation Explains the Distribution of Nuclear Proteins during
Cellular Senescence”. PloS one 10.6 (2015), e0118442.
44. V. Shahrezaei and P. S. Swain.: “Analytical distributions for stochastic gene expression”. Proc
Natl Acad Sci 105.45 (2008), pp. 17256–17261.
45. P. Thomas, A. V. Straube, and R. Grima.: “Communication: Limitations of the stochastic quasi-
steady-state approximation in open biochemical reaction networks”. J Chem Phys 135(18),
181103 (2011)
46. K. R. Sanft, D. T. Gillespie, and L. R. Petzold.: “Legitimacy of the stochastic Michaelis-Menten
approximation”. Syst Biol, IET 5.1 (2011), pp. 58–69.
47. D. A. Levin, Y. Peres, and E. L. Wilmer.: Markov chains and mixing times. American Mathe-
matical Soc., 2009.
48. T. M. Cover and J. A. Thomas.: Elements of information theory. John Wiley & Sons, 2012.
Sampling from T Cell Receptor Repertoires
Marco Ferrarini, Carmen Molina-París and Grant Lythe
Abstract Modern single-cell sequencing techniques allow the unique T cell receptor
(TCR) signature of each of a sample of hundreds of T cells to be read. The mathemat-
ical challenge is to extrapolate from the properties of a sample to those of the whole
repertoire of an individual, made up of many millions of T cells. We consider the
distribution of the number of repeats of any TCR in a sample, the mean number of
cells needed to find a repeat with probability one half, and the relationship between
the true distribution of clonal sizes and that experimentally observed in a sample. In
the simplest hypothesis for the structure of the repertoire, the same number of cells
make up each clonotype. We also consider the case where the distribution of clonal
sizes is geometric, and examples where a small fraction of clones in the repertoire
are expanded.
1 Introduction
Approximately $4 \times 10^{11}$ T cells circulate in the adult human body [1]. About 30,000
T cell receptors (TCRs) are on the surface of each T cell, usually of only one speci-
ficity [2]. T cells are selected in the thymus by binding to self-peptides expressed in
association with major histocompatibility complex molecules (self-pMHC) [1, 3–5].
The set of cells with the same TCR defines a T cell clonotype, and the set of T cells
in the body can be thought of as a repertoire of clonotypes. CD8+ T cells recognise
peptide bound to MHC class I molecules and CD4+ T cells recognise peptide bound
to MHC class II molecules [2]. How many TCR clonotypes are there in humans, mice
and other mammals? [6–10]. Direct measurement is not possible even with the latest developments in sequencing techniques. Estimates of the number of different TCRs that could, in principle, be produced by VDJ gene rearrangement in the thymus are about $10^{15}$ [11–14]. However, the human body cannot contain even one T cell of each of the $10^{15}$ possible types: $10^{15}$ T cells would weigh about 500 kg [15].

M. Ferrarini (B) · C. Molina-París (B) · G. Lythe (B)
Department of Applied Mathematics, School of Mathematics, University of Leeds,
Leeds LS2 9JT, UK
e-mail: mmmf@leeds.ac.uk
C. Molina-París
e-mail: carmen@maths.leeds.ac.uk
G. Lythe
e-mail: grant@maths.leeds.ac.uk

The number of distinct TCR clonotypes, N, is equal to the total number of T cells
divided by the mean number of cells per clonotype. Equivalently, N is equal to the
product of the rate of release of new clonotypes from the thymus to the periphery and
the mean lifetime of a clonotype in the periphery [16]. Lower limits on the number
of distinct TCR β chains in the repertoire are about $4 \times 10^{6}$ [17–20]. If each TCR β chain combines with 25 α chains, then the number of distinct clonotypes in one human is at least $10^{8}$ [21].
Direct estimates of β chain diversity have been made by PCR amplification of
mRNA from pools of cells, but the technique is less suitable for measuring distri-
butions of clonal sizes because numbers of mRNA vary from cell to cell and PCR
amplification may depend on the TCR. Single-cell measurements, where PCR and
sequencing are performed on one cell at a time, avoid biases. However, their expense
means that only hundreds of cells are usually sequenced from a single mammal, and
estimates of diversity must therefore rely on mathematical extrapolation from small
samples [19, 22–24].
In our analysis, we use the geometric distribution as a basic model for the distri-
bution of clonal sizes in the repertoire. The mean number of cells per clonotype, in
the repertoire, is the total number of cells in the repertoire divided by the number
of clonotypes. We find, using probability generating functions, the distribution of
number of copies of cells of each clone that are found in a sample of cells taken
at random from the repertoire. When the sample is small, most clonotypes are not
present at all in the sample, and the majority of clonotypes that are in the sample
are only present as one copy. We show that the observed distribution of clonal sizes,
which is the distribution of sequences seen once, twice, three times, . . . in the sample,
also has a geometric distribution, and calculate the corresponding mean value.
1.1 Sampling from a Repertoire
What can be deduced from a sample of m cells taken from a repertoire of T cells if the
total number of cells in the repertoire, S, is very large? Let us begin by describing the
structure of the repertoire, which is divided into N subsets, called TCR clonotypes.
Denote by $n_i$ the number of cells of a clonotype labelled $i$. The index $i$ runs from 1 to $N$, and $\sum_i n_i = S$ (see Fig. 1). Typically $S$ is known, but $N$ and $n_i$ with $1 \le i \le N$ are not.
When the number of cells in the sample, m, is much smaller than the number of
cells in the repertoire, S, and much smaller even than the number of TCR clono-
types, N, it is not obvious how to draw direct conclusions. On the other hand, some
mathematical simplifications can be made. Let us consider one TCR clonotype, with label $i$ and $1 \le i \le N$. If $m n_i \ll S$ then, instead of the full expressions involving multinomial distributions, we can use the following approximations:
1. the probability that none of the $m$ cells in the sample are of clonotype $i$ is $\left(1 - \frac{n_i}{S}\right)^{m}$,
2. the probability that exactly one of the $m$ cells in the sample is of clonotype $i$ is $m\, \frac{n_i}{S} \left(1 - \frac{n_i}{S}\right)^{m-1}$, and
3. the probability that exactly two of the $m$ cells in the sample are of clonotype $i$ is $\frac{1}{2}\, m (m - 1)\, \frac{n_i}{S}\, \frac{n_i - 1}{S - 1} \left(1 - \frac{n_i}{S}\right)^{m-2}$.
Denote the number of cells of clonotype $i$ in the sample by $Y_i$. If $m \gg 1$ but $m n_i \ll S$, then $P(Y_i = 2) \simeq r_i$, where

$$r_i = \frac{1}{2}\, n_i (n_i - 1) \left(\frac{m}{S}\right)^{2}. \tag{1}$$

Fig. 1 The repertoire contains S cells, divided up into N TCR clonotypes. Here, cells are represented by circles, a TCR clonotype is the set of cells of one shade, and a random sample of cells is represented by those inside the black square
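The quality of these approximations can be checked against the exact probability of drawing k cells of clonotype i when m cells are sampled without replacement, which is a hypergeometric probability. The sketch below is illustrative: the function names and the values of S, m and n_i are assumptions, chosen so that m n_i is much smaller than S.

```python
from math import comb


def exact_probability(k, n_i, S, m):
    """Probability that exactly k of m cells drawn without replacement are of clonotype i."""
    return comb(n_i, k) * comb(S - n_i, m - k) / comb(S, m)


def approx_probability(k, n_i, S, m):
    """Approximations 1-3 of the text, for k = 0, 1, 2."""
    f = n_i / S
    if k == 0:
        return (1 - f) ** m
    if k == 1:
        return m * f * (1 - f) ** (m - 1)
    if k == 2:
        return 0.5 * m * (m - 1) * f * (n_i - 1) / (S - 1) * (1 - f) ** (m - 2)
    raise ValueError("only k = 0, 1, 2 are covered")


def r(n_i, S, m):
    """Eq. (1)."""
    return 0.5 * n_i * (n_i - 1) * (m / S) ** 2


# Hypothetical values with m * n_i much smaller than S.
S, m, n_i = 10 ** 6, 500, 10
for k in (0, 1, 2):
    print(k, exact_probability(k, n_i, S, m), approx_probability(k, n_i, S, m))
print("r_i =", r(n_i, S, m))
```

With these values the approximate and exact probabilities agree to well within one percent, and $r_i$ is close to the exact probability of finding exactly two cells of the clonotype.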
We say there is a repeat in the sample if two (or more) of the $m$ cells are of the same clonotype. Let us consider a group of $M$ identified clonotypes in the repertoire, with numbers of cells $n_1, n_2, \ldots, n_M$, respectively. How many repeats, of clonotypes in this group, will we see in our sample? If $r_i \ll 1$, $\forall i = 1, \ldots, M$, so that the occurrences of repeats in distinct clonotypes can be taken as independent events, then

$$\mathbb{E}(\text{number of repeats of identified clonotypes}) = \sum_{i=1}^{M} r_i = \frac{1}{2} \left(\frac{m}{S}\right)^{2} \sum_{i=1}^{M} n_i (n_i - 1).$$

That is,

$$\mathbb{E}(\text{number of repeats of identified clonotypes}) = \frac{1}{2}\, \frac{m^2}{S^2}\, M\, \mathbb{E}(n_i (n_i - 1)), \tag{2}$$

where the expectation is taken over the $M$ clonotypes

$$\mathbb{E}(n_i (n_i - 1)) = M^{-1} \sum_{i=1}^{M} n_i (n_i - 1).$$
Repertoires can be constructed and sampled inside a simple computer programme,

where each clonotype is assigned a label i and values of ni are assigned according to
a probability distribution. We have constructed repertoires with uniform, geometric
and double-geometric distributions of clonal sizes to verify the conclusions presented
here.
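A programme of the kind described here can be sketched in a few lines. The following is an illustration under stated assumptions, not the authors' code: clonal sizes are drawn from a geometric distribution with $P(n_i \ge k) = (1 - 1/\bar n)^{k-1}$, a sample of m cells is drawn without replacement, and a repeat is counted for every clonotype represented by two or more cells in the sample. All function names, the repertoire size and the number of trials are hypothetical.

```python
import bisect
import itertools
import math
import random
from collections import Counter


def geometric_clone_size(mean, rng):
    """Clone size n >= 1 with P(n >= k) = (1 - 1/mean)**(k - 1); requires mean > 1."""
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    return 1 + int(math.log(u) / math.log(1.0 - 1.0 / mean))


def build_repertoire(n_clonotypes, mean, rng):
    """List of clonal sizes n_i, one entry per clonotype."""
    return [geometric_clone_size(mean, rng) for _ in range(n_clonotypes)]


def count_repeats(cumulative, m, rng):
    """Sample m cells without replacement and count clonotypes seen at least twice."""
    cells = rng.sample(range(cumulative[-1]), m)
    counts = Counter(bisect.bisect_right(cumulative, c) for c in cells)
    return sum(1 for c in counts.values() if c >= 2)


def mean_repeats(sizes, m, trials, rng):
    """Average number of repeats over independent samples of m cells."""
    cumulative = list(itertools.accumulate(sizes))
    return sum(count_repeats(cumulative, m, rng) for _ in range(trials)) / trials


def predicted_repeats(sizes, m):
    """Sum of r_i over all clonotypes, with r_i from Eq. (1)."""
    S = sum(sizes)
    return 0.5 * (m / S) ** 2 * sum(n * (n - 1) for n in sizes)


rng = random.Random(1)
sizes = build_repertoire(20000, 10.0, rng)
print(predicted_repeats(sizes, 300), mean_repeats(sizes, 300, 200, rng))
```

Cells are indexed from 0 to S − 1 and mapped to their clonotype through the cumulative clonal sizes, which avoids storing one entry per cell.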
2 Results
2.1 The Mean Number of Repeats
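To find the mean number of repeats of any clonotype in the sample, we put M = N
in (2) and write $S = N\,\mathbb{E}(n_i)$, to obtain
\[ \mathbb{E}(\text{number of repeats}) = \sum_{i=1}^{N} r_i = \frac{m^2}{2N}\,\frac{\mathbb{E}(n_i (n_i - 1))}{\mathbb{E}(n_i)^2} . \tag{3} \]
The expression (3) is the product of the factor $\frac{m^2}{2N}$, that does not depend on the
distribution of clonal sizes, and the factor $\frac{\mathbb{E}(n_i (n_i - 1))}{\mathbb{E}(n_i)^2}$, that does. The latter can be
written
\[ \frac{\mathbb{E}(n_i (n_i - 1))}{\mathbb{E}(n_i)^2} = \frac{\mathbb{E}(n_i^2)}{\mathbb{E}(n_i)^2} - \frac{1}{\mathbb{E}(n_i)} . \]
• If $n_i = \bar{n}$ for every i then
\[ \frac{\mathbb{E}(n_i^2)}{\mathbb{E}(n_i)^2} = 1 \quad \text{and} \quad \frac{\mathbb{E}(n_i (n_i - 1))}{\mathbb{E}(n_i)^2} = 1 - \frac{1}{\bar{n}} . \]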
Fig. 2 Mean number of repeats as a function of the number of cells in the sample, m, from a
repertoire of $N = 10^5$ clonotypes and a geometric distribution of clonal sizes, with $\bar{n} = 10$
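• If $n_i$ has a geometric distribution with mean $\bar{n}$ (that is, $P(n_i \ge k) = \left(1 - \frac{1}{\bar{n}}\right)^{k-1}$,
k = 1, 2, . . .) then
\[ \frac{\mathbb{E}(n_i^2)}{\mathbb{E}(n_i)^2} = 2 - \frac{1}{\bar{n}} \quad \text{and} \quad \frac{\mathbb{E}(n_i (n_i - 1))}{\mathbb{E}(n_i)^2} = 2\left(1 - \frac{1}{\bar{n}}\right) . \]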
See Fig. 2.
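As a quick numerical check of (3), the sketch below evaluates the mean number of repeats for the two clonal size distributions considered above. The function names and the string labels "constant" and "geometric" are illustrative choices.

```python
def clonal_size_factor(mean_size, distribution):
    """E(n(n-1)) / E(n)^2 for the two distributions treated in the text."""
    if distribution == "constant":
        return 1.0 - 1.0 / mean_size
    if distribution == "geometric":
        return 2.0 * (1.0 - 1.0 / mean_size)
    raise ValueError("distribution must be 'constant' or 'geometric'")


def mean_repeats(m, num_clonotypes, mean_size, distribution):
    """Mean number of repeats in a sample of m cells, following (3)."""
    factor = clonal_size_factor(mean_size, distribution)
    return m**2 / (2.0 * num_clonotypes) * factor
```

With the parameters of Fig. 2 ($N = 10^5$, $\bar{n} = 10$, geometric clonal sizes), a sample of m = 1,000 cells gives a mean of about 9 repeats, twice the value obtained when every clonotype has the same size.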
2.2 Number of Draws to Find the First Repeat
Let us consider the probability of finding no repeats in a sample of m cells. With $r_i$
defined in (1), we approximate by $1 - r_i$ the probability that fewer than two cells of
clonotype i are found in the sample, so that
\[ P(\text{no repeat in sample of } m \text{ cells}) = \prod_{i=1}^{N} (1 - r_i) . \]
We can then write
\[ \log P(\text{no repeat in a sample of } m \text{ cells}) = \sum_{i=1}^{N} \log(1 - r_i) \simeq - \sum_{i=1}^{N} r_i , \]
assuming $r_i \ll 1$ for every i. Thus, we have
\[ P(\text{no repeat in sample of } m \text{ cells}) = \exp(-\lambda) , \]
where
\[ \lambda = \frac{m^2}{2N}\,\frac{\mathbb{E}(n_i (n_i - 1))}{\mathbb{E}(n_i)^2} \tag{4} \]
is the mean number of repeats in a sample of m cells.

How many cells do we need to sample in order to have a 50 percent chance of
finding a repeat? Let this number be $m_{0.5}$. Then
\[ m_{0.5}^2 = 2 N \log 2 \, \frac{\mathbb{E}(n_i)^2}{\mathbb{E}(n_i (n_i - 1))} . \tag{5} \]
In the simplest case, when all clonotypes have the same number of cells, $\bar{n}$, we
find $P(\text{no repeat in sample of } m \text{ cells}) = \exp\left(-\frac{m^2}{2N}\left(1 - \frac{1}{\bar{n}}\right)\right)$ and
\[ m_{0.5} = \left( \frac{2 N \log 2}{1 - \frac{1}{\bar{n}}} \right)^{1/2} . \]
When the distribution of the number of cells per clonotype is geometric with mean
$\bar{n}$, the desired number is
\[ m_{0.5} = \left( \frac{N \log 2}{1 - \frac{1}{\bar{n}}} \right)^{1/2} . \]
See Fig. 3.
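The relation between (4) and (5) can be checked numerically with the short sketch below. The function names are illustrative, and the argument `factor` stands for the ratio $\mathbb{E}(n_i(n_i-1))/\mathbb{E}(n_i)^2$.

```python
import math


def mean_number_of_repeats(m, num_clonotypes, factor):
    """lambda in (4); factor is E(n(n-1)) / E(n)^2."""
    return m**2 / (2.0 * num_clonotypes) * factor


def sample_size_for_half_chance(num_clonotypes, factor):
    """m_0.5 in (5): sample size giving a 50 percent chance of a repeat."""
    return math.sqrt(2.0 * num_clonotypes * math.log(2.0) / factor)
```

For a geometric distribution with $\bar{n} = 10$ the factor is 1.8, and a repertoire of $N = 10^6$ clonotypes then requires a sample of a little under 900 cells; evaluating $\exp(-\lambda)$ at that sample size returns 0.5 up to rounding.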
2.3 Poisson Distribution of Number of Repeats in a Sample
Let k be the total number of repeats in a sample of m cells. For example, if 96

sequences are found once and 2 sequences are found twice, then m = 100 and k = 2.
We have already seen the case k = 0 and analysed P(k = 0). Let us consider the
case k = 1:
\[ P(k = 1) = \sum_{i=1}^{N} r_i \prod_{\substack{j=1 \\ j \ne i}}^{N} (1 - r_j) . \]
Fig. 3 Mean number of cells that need to be sampled, $m_{0.5}$, in order to have a 50 percent
chance of one repeat, as a function of N (logarithmic axes), from a repertoire of N clonotypes
and a geometric distribution of clonal sizes, with $\bar{n} = 10$
If $r_i \ll 1$ for every i, then $\prod_{j=1,\, j \ne i}^{N} (1 - r_j) \simeq \prod_{i=1}^{N} (1 - r_i)$ and
\[ P(k = 1) = \lambda\, e^{-\lambda} . \]
The same argument works for all $k \ll m$, so that the number of repeats in a sample
has a Poisson distribution:
\[ P(\text{number of repeats in sample of } m \text{ cells is } k) = \frac{\lambda^k}{k!}\, e^{-\lambda} . \]
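The quality of the Poisson approximation can be examined for any given set of $r_i$ values. The sketch below, with illustrative function names, compares the product expressions for k = 0 and k = 1 with the corresponding Poisson probabilities.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability of k repeats when the mean number of repeats is lam."""
    return lam**k * math.exp(-lam) / math.factorial(k)


def prob_no_repeat(r):
    """Product of (1 - r_i) over all clonotypes."""
    return math.prod(1.0 - r_i for r_i in r)


def prob_one_repeat(r):
    """Sum over i of r_i times the product of (1 - r_j) over j != i."""
    return prob_no_repeat(r) * sum(r_i / (1.0 - r_i) for r_i in r)
```

For example, with $10^4$ clonotypes that each have $r_i = 10^{-4}$, so that λ = 1, both product expressions agree with the Poisson values to within about $10^{-4}$.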
2.4 Estimating the Size of the Repertoire from One Repeat
Suppose there is one repeat in a sample of $m_1$ cells. We then use (4) to estimate N.
Putting λ = 1 and, assuming a geometric distribution of clonal sizes, we conclude
that
\[ N = m_1^2 \left(1 - \frac{1}{\bar{n}}\right) . \]
If we find one repeat per 100 cells, we estimate that the size of the repertoire is $10^4$. If we
find one repeat per 1,000 cells, we estimate that the size of the repertoire is $10^6$. In
practice, the estimate $m_1^2$ is likely to be conservative, because any clonal expansion
will increase the number of observed repeats.
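The estimate can be written as a one-line function. In this sketch (the function name and the optional argument are illustrative), omitting the mean clonal size returns the rounded estimate $m_1^2$ quoted above.

```python
def estimate_repertoire_size(m1, mean_size=None):
    """Estimate N from one repeat in m1 cells, assuming geometric clonal sizes.

    If mean_size is not given, the factor (1 - 1/mean_size) is dropped,
    which gives the simpler estimate m1**2.
    """
    if mean_size is None:
        return float(m1**2)
    return m1**2 * (1.0 - 1.0 / mean_size)
```

With $\bar{n} = 10$, one repeat per 100 cells corresponds to an estimate of about 9,000 clonotypes rather than $10^4$, so the correction factor matters little unless the mean clonal size is close to 1.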
2.5 The Observed Distribution of Clonal Sizes
In this section, our goal is to find the probability distribution of the number of
instances of k copies of a TCR in a random sample of m cells. First, consider the
point of view of one cell in the total of S cells in the repertoire. The probability,
which we denote q, that this cell is one of the m cells in the sample is equal to m/S.
Next, let us define the Bernoulli random variable B
\[ P(B = 0) = 1 - q \quad \text{and} \quad P(B = 1) = q , \quad \text{where} \quad q = \frac{m}{S} . \tag{6} \]
The probability generating function of B is
\[ \phi_B(z) = \sum_{k=0}^{\infty} P(B = k)\, z^k = 1 - q + q z . \tag{7} \]
If $n_i$ is the number of cells of a clonotype labelled i, then the number of cells of type
i in the sample is the random variable $Y_i$, which can be written
\[ Y_i = B_1 + \cdots + B_{n_i} , \tag{8} \]
where $B_j$, $j = 1, \ldots, n_i$, are random variables with the same distribution as B. With
the approximation that the $B_j$ are independent random variables, the probability
generating function of $Y_i$ is
\[ \phi_{Y_i}(z) = \phi_B(z)^{n_i} = (1 - q + q z)^{n_i} . \tag{9} \]
Let Y be the number of copies of a randomly chosen clonotype found in the sample
of m cells, which can take the value 0 or any integer greater than 0. That is,
\[ P(Y = k) = \sum_{n} P(n_i = n)\, P(Y_i = k \mid n_i = n) . \]
Suppose that the probability generating function of the random variable $n_i$ is $\phi_n(z)$.
Then
\[ \phi_Y(z) = \sum_{n} P(n_i = n)\,(1 - q + q z)^n = \phi_n(1 - q + q z) . \]
For example, if $n_i$ has a geometric distribution with mean $\bar{n}$, then
$\phi_n(z) = \frac{z}{\bar{n} - (\bar{n} - 1) z}$, so that
\[ \phi_Y(z) = \frac{1 - q + q z}{\bar{n} - (\bar{n} - 1)(1 - q + q z)} = \frac{1 - q + q z}{1 + (\bar{n} - 1)\, q\, (1 - z)} . \tag{10} \]
Let us consider the distribution of observed clonal sizes (the histogram that is
obtained by plotting the number of TCRs as a function of the number of cells in the
sample). Observed clonal sizes can take integer values greater than, but not including,
0. We define $s_1, s_2, \ldots$ by
\[ s_k = \frac{P(Y = k)}{1 - P(Y = 0)} . \]
That is, the distribution is that of Y, conditioned on not being 0. Then
\[ s_k = \frac{\gamma^{k-1}}{1 + (\bar{n} - 1) q} , \quad k \ge 1 , \tag{11} \]
and
\[ \gamma = \frac{(\bar{n} - 1) q}{1 + (\bar{n} - 1) q} . \tag{12} \]
Note that $s_{k+1} / s_k = \gamma$. That is, if the distribution of clonal sizes in the repertoire is
geometric, with mean $\bar{n}$, then the observed distribution of clonal sizes is also geometric,
with mean $1 + (\bar{n} - 1) q$ (see Fig. 4).
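The closed form (11) can be compared with a direct evaluation of the sum over n, in which each term is the geometric probability of clonal size n multiplied by a binomial probability of finding k of the n cells in the sample. The sketch below does this; the function names and the truncation point `n_max` are illustrative assumptions.

```python
import math


def geometric_pmf(n, mean_size):
    """P(n_i = n) for a geometric distribution on 1, 2, ... with the given mean."""
    p = 1.0 / mean_size
    return p * (1.0 - p) ** (n - 1)


def observed_pmf_direct(k, mean_size, q, n_max=5000):
    """P(Y = k) as a sum over clonal sizes n, truncated at n_max."""
    return sum(
        geometric_pmf(n, mean_size) * math.comb(n, k) * q**k * (1.0 - q) ** (n - k)
        for n in range(max(k, 1), n_max + 1)
    )


def observed_size_formula(k, mean_size, q):
    """s_k from (11) and (12), for k >= 1."""
    a = 1.0 + (mean_size - 1.0) * q
    gamma = (mean_size - 1.0) * q / a
    return gamma ** (k - 1) / a
```

Dividing `observed_pmf_direct(k, ...)` by one minus `observed_pmf_direct(0, ...)` should reproduce `observed_size_formula(k, ...)` up to truncation and rounding error.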
2.6 Expansion of a Fraction of the Repertoire
In this Section, we assume that a fraction $f \ll 1$ of clonotypes are “expanded”. Even
if expanded clonotypes are rare, they will be over-represented in the repeats found in a
sample. Let us define a simple repertoire with expanded clonotypes. Let us suppose
that the fraction 1 − f of clonotypes consist of n cells each, while the remaining
fraction f consist of α n cells each, where $\alpha \gg 1$. Then, the probability generating
function of the random variable $n_i$ is
\[ \phi_{n_i}(z) = (1 - f)\, z^{n} + f\, z^{\alpha n} . \tag{13} \]
In this case, the probability generating function of the random variable Y, the number
of copies of a randomly chosen clonotype found in the sample of m cells, is
\[ \phi_Y(z) = \phi_{n_i}(1 - q + q z) = (1 - f)\,(1 - q + q z)^{n} + f\,(1 - q + q z)^{\alpha n} . \tag{14} \]
Fig. 4 Observed clonal size distribution in a sample of 1,000 cells, from repertoires containing
different numbers of clones, N. A “constant” repertoire means that there are $\bar{n} = 10$ cells of each
clonotype. In a “geometric” repertoire, the number of cells in each clonotype is drawn from a
geometric distribution with mean $\bar{n} = 10$
If $\alpha \gg 1$, $f \ll 1$ and $\alpha n q \ll 1$, then $P(Y = 1) \simeq n q + f \alpha n q$ and
$P(Y = 2) = \frac{1}{2} f (\alpha n q)^2$. To give a concrete example, suppose that n = 2, $q = 10^{-5}$,
$f = 10^{-4}$ and $\alpha = 4 \times 10^{3}$. Then $(P(Y = 1) - P(Y = 2))/P(Y = 1) = 0.99$,
which means that 99% of sequences in a sample of m sequences will be found
as a single copy.
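The concrete example can be reproduced without the small-parameter approximations by reading off the coefficients of $z^k$ in (14), which are mixtures of two binomial probabilities. The sketch below uses illustrative function names and assumes that α n is an integer.

```python
import math


def binomial_pmf(n, q, k):
    """Probability of k successes in n independent trials with success probability q."""
    return math.comb(n, k) * q**k * (1.0 - q) ** (n - k)


def observed_pmf(k, n, alpha, f, q):
    """Coefficient of z**k in (14); alpha * n is assumed to be an integer."""
    return (1.0 - f) * binomial_pmf(n, q, k) + f * binomial_pmf(alpha * n, q, k)


def single_copy_ratio(n, alpha, f, q):
    """The ratio (P(Y = 1) - P(Y = 2)) / P(Y = 1) used in the text."""
    p1 = observed_pmf(1, n, alpha, f, q)
    p2 = observed_pmf(2, n, alpha, f, q)
    return (p1 - p2) / p1
```

Calling `single_copy_ratio(2, 4000, 1e-4, 1e-5)` gives a value close to 0.99, in line with the approximate calculation.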
Finally, we consider a more realistic case in which the distribution of the number of
cells per clonotype is “double geometric”. That is, both the unexpanded and expanded
clonotypes have clonal sizes that follow a geometric distribution. The mean of $n_i$ for
unexpanded clonotypes is $\bar{n}$ and the mean of $n_i$ for expanded clonotypes is $\alpha \bar{n}$. Then,
the probability generating function of the random variable Y is
\[ \phi_Y(z) = (1 - f)\, \psi(1 - q + q z, \bar{n}) + f\, \psi(1 - q + q z, \alpha \bar{n}) , \tag{15} \]
where
\[ \psi(z, n) = \frac{z}{n - (n - 1) z} . \tag{16} \]
That is,
\[ P(Y = k) = \begin{cases}
f\, \dfrac{1 - q}{1 + q(\alpha \bar{n} - 1)} + (1 - f)\, \dfrac{1 - q}{1 + q(\bar{n} - 1)} , & k = 0 , \\[2ex]
f\, \dfrac{(\alpha \bar{n} - 1)^{k-1}\, \alpha \bar{n}\, q^{k}}{(1 + q(\alpha \bar{n} - 1))^{k+1}} + (1 - f)\, \dfrac{(\bar{n} - 1)^{k-1}\, \bar{n}\, q^{k}}{(1 + q(\bar{n} - 1))^{k+1}} , & k \ge 1 .
\end{cases} \]
To give a concrete example, similar to that above, suppose that $\bar{n} = 2$, $q = 10^{-5}$,
$f = 10^{-4}$ and $\alpha = 4 \times 10^{3}$. Then 98% of sequences in a sample of m sequences will
be found as a single copy.
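The double-geometric example can be evaluated directly from the expression for P(Y = k). In the sketch below (illustrative function names), each geometric term is written in the equivalent form $\gamma^{k-1}\, \bar{n}\, q / a^2$ with $a = 1 + q(\bar{n} - 1)$ and $\gamma = q(\bar{n} - 1)/a$, which avoids forming very large and very small powers separately. The ratio computed is the same one used in the simpler example above, which is an assumption about how the 98% figure is defined.

```python
def geometric_observed_pmf(k, mean_size, q):
    """P(Y = k) when clonal sizes are geometric with the given mean."""
    a = 1.0 + q * (mean_size - 1.0)
    if k == 0:
        return (1.0 - q) / a
    gamma = q * (mean_size - 1.0) / a
    return gamma ** (k - 1) * mean_size * q / a**2


def double_geometric_pmf(k, mean_size, alpha, f, q):
    """Mixture of unexpanded (mean_size) and expanded (alpha * mean_size) clonotypes."""
    return (f * geometric_observed_pmf(k, alpha * mean_size, q)
            + (1.0 - f) * geometric_observed_pmf(k, mean_size, q))


def single_copy_ratio(mean_size, alpha, f, q):
    """The ratio (P(Y = 1) - P(Y = 2)) / P(Y = 1)."""
    p1 = double_geometric_pmf(1, mean_size, alpha, f, q)
    p2 = double_geometric_pmf(2, mean_size, alpha, f, q)
    return (p1 - p2) / p1
```

With $\bar{n} = 2$, $q = 10^{-5}$, $f = 10^{-4}$ and $\alpha = 4 \times 10^{3}$ the ratio is close to 0.98; the fraction of observed clonotypes found exactly once, P(Y = 1)/(1 − P(Y = 0)), gives a similar value.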
3 Discussion
Small samples from a large repertoire are obtained in single-cell sequencing experiments
of T cell receptors. The estimates of the diversity of the TCR repertoire that can
be deduced from such samples depend on the distribution of clonal sizes, which is also unknown.
However, small sample sizes allow the simplifying approximation that random variables
describing quantities of interest, such as the numbers of cells of different types in
the sample, are independent. Then, the probability generating function of the distribution
of clonal sizes in the sample is the composition of that of a Bernoulli random
variable (that takes values 0 or 1) and that of the true distribution of clonal sizes in
the repertoire that is being sampled from. Our work is motivated by studies of the
repertoire of T cells in humans and mice. Bulk sequencing has been used to set a
lower limit, in the millions, on the number of distinct TCRs in an individual. In this
type of experiment, where mRNA is extracted from a pool of cells, it is difficult to
obtain statistics of the number of cells of each clonotype (abundance data) that are free
from biases. Single-cell TCR sequencing can eliminate biases but can, at present,
only be carried out on a few hundred cells from one individual.
Acknowledgements The research leading to these results has received funding from the European
Union Seventh Framework Programme (FP7/2007–2013) through the Marie-Curie Action
“Quantitative T cell Immunology” Initial Training Network, with reference number FP7-PEOPLE-
2012-ITN 317040-QuanTI.
We have benefitted from discussions with Pedro Gonçalves and Benedita Rocha, and from the
helpful comments of an anonymous referee.
References
1. M.K. Jenkins, H.H. Chu, J.B. McLachlan, and J.J. Moon. On the composition of the preimmune
repertoire of T cells specific for peptide-major histocompatibility complex ligands. Annual
Review of Immunology, 28:275–294, 2009.
2. R. Varma. TCR triggering by the pMHC complex: valency, affinity, and dynamics. Science
Signaling, 1(19):pe21, 2008.
3. Benedita Rocha and Harald von Boehmer. Peripheral selection of the T cell repertoire. Science,
251(4998):1225–1228, 1991.
4. I. Bains, R. Antia, R. Callard, and A.J. Yates. Quantifying the development of the peripheral
naive CD4+ T-cell pool in humans. Blood, 113(22):5480, 2009.
5. François Van Laethem, Anastasia N Tikhonova, and Alfred Singer. MHC restriction is imposed
on a diverse T cell receptor repertoire by CD4 and CD8 co-receptors during thymic selection.
Trends in Immunology, 33(9):437–441, 2012.
6. RE Langman and M Cohn. The ET (elephant-tadpole) paradox necessitates the concept of a
unit of B-cell function: the protecton. Molecular Immunology, 24(7):675–697, 1987.
7. Joseph N Blattman, Rustom Antia, David JD Sourdive, Xiaochi Wang, Susan M Kaech, Kaja
Murali-Krishna, John D Altman, and Rafi Ahmed. Estimating the precursor frequency of naive
antigen-specific CD8 T cells. Journal of Experimental Medicine, 195(5):657–664, 2002.
8. Stanca M Ciupe, Blythe H Devlin, Mary Louise Markert, and Thomas B Kepler. Quantification
of total T-cell receptor diversity by flow cytometry and spectratyping. BMC Immunology,
14(1):1–12, 2013.
9. Niclas Thomas, Katharine Best, Mattia Cinelli, Shlomit Reich-Zeliger, Hilah Gal, Eric Shifrut,
Asaf Madi, Nir Friedman, John Shawe-Taylor, and Benny Chain. Tracking global changes
induced in the CD4 T cell receptor repertoire by immunization with a complex antigen using
short stretches of CDR3 protein sequence. Bioinformatics, page btu523, 2014.
10. Tania Cukalac, Wan-Ting Kan, Pradyot Dash, Jing Guan, Kylie M Quinn, Stephanie Gras,
Paul G Thomas, and Nicole L La Gruta. Paired TCRαβ analysis of virus-specific CD8+ T cells
exposes diversity in a previously defined “narrow” repertoire. Immunology and Cell Biology,
2015.
11. Andrew K Sewell. Why must T cells be cross-reactive? Nature Reviews Immunology,
12(9):669–677, 2012.
12. Janko Nikolich-Žugich, Mark K. Slifka, and Ilhem Messaoudi. The many important facets of
T-cell repertoire diversity. Nature Reviews Immunology, 4(2):123–132, 2004.
13. Veronika Zarnitsyna, Brian Evavold, Louie Schoettle, Joseph Blattman, and Rustom Antia.
Estimating the diversity, completeness, and cross-reactivity of the T cell repertoire. Frontiers
in Immunology, 4:485, 2013.
14. Anand Murugan, Thierry Mora, Aleksandra M Walczak, and Curtis G Callan. Statistical inference
of the generation probability of T-cell receptors from sequence repertoires. Proceedings
of the National Academy of Sciences, 109(40):16161–16166, 2012.
15. D. Mason. A very high level of crossreactivity is an essential feature of the T-cell receptor.
Immunology Today, 19(9):395–404, 1998.
16. Grant Lythe, Robin E Callard, Rollo L Hoare, and Carmen Molina-París. How many TCR
clonotypes does a body maintain? Journal of Theoretical Biology, 389:214–224, 2016.
17. T.P. Arstila, A. Casrouge, V. Baron, J. Even, J. Kanellopoulos, and P. Kourilsky. A direct
estimate of the human αβ T cell receptor diversity. Science, 286(5441):958, 1999.
18. Can Keşmir, José AM Borghans, and Rob J de Boer. Diversity of human αβ T cell receptors.
Science, 288(5469):1135–1135, 2000.
19. Harlan S Robins, Paulo V Campregher, Santosh K Srivastava, Abigail Wacher, Cameron J
Turtle, Orsalem Kahsai, Stanley R Riddell, Edus H Warren, and Christopher S Carlson. Comprehensive
assessment of T-cell receptor β-chain diversity in αβ T cells. Blood, 114(19):4099–4107, 2009.
20. René L Warren, J Douglas Freeman, Thomas Zeng, Gina Choe, Sarah Munro, Richard Moore,
John R Webb, and Robert A Holt. Exhaustive T-cell repertoire sequencing of human peripheral
blood samples reveals signatures of antigen selection and a directly measured repertoire size
of at least 1 million clonotypes. Genome Research, 21(5):790–797, 2011.
21. Qian Qi, Yi Liu, Yong Cheng, Jacob Glanville, David Zhang, Ji-Yeun Lee, Richard A Olshen,
Cornelia M Weyand, Scott D Boyd, and Jörg J Goronzy. Diversity and clonal selection in the
human T-cell repertoire. Proceedings of the National Academy of Sciences, 111(36):13139–13144, 2014.
22. Vanessa Venturi, Katherine Kedzierska, Stephen J Turner, Peter C Doherty, and Miles P Davenport.
Methods for comparing the diversity of samples of the T cell receptor repertoire. Journal
of Immunological Methods, 321(1):182–195, 2007.
23. N. Sepúlveda, C.D. Paulino, and J. Carneiro. Estimation of T-cell repertoire diversity and
clonal size distribution by poisson abundance models. Journal of Immunological Methods,
353(1):124–137, 2010.
24. D.J. Laydon, C.R.M. Bangham, and B. Asquith. Estimating T-cell repertoire diversity: limitations
of classical estimators and a new approach. Philosophical Transactions of the Royal Society B,
370, 2015.
IL-2 Stimulation of Regulatory T Cells:
A Stochastic and Algorithmic Approach
Luis de la Higuera, Martín López-García,

Grant Lythe and Carmen Molina-París
Abstract Regulatory T cells express IL-2 receptor (IL-2R) complexes on their

surface, but do not produce IL-2 molecules. Survival of a population of regulatory
T cells depends on the production of IL-2 by other cells, such as effector T cells. We
formulate a stochastic version of the model of Busse (Dynamics of the IL-2 cytokine
network and T-cell proliferation, Logos, Berlin, 2010, [1]), for the synthesis of IL-2R
by a regulatory T cell in constitutive (ligand-independent) and in ligand-induced conditions,
with the assumption that synthesis is a function of the number of IL-2/IL-2R
bound complexes present on the cell surface. Exact analysis of the stochastic Markov
process, by considering its master equation, is usually not possible. Here, we develop
an algorithmic approach, which leads to the analysis of suitable random variables.
In particular, we focus on the time to reach a threshold number of IL-2/IL-2R bound
complexes on the cell surface, and the number of receptors synthesised in this time.
These descriptors provide a way to quantify the rates at which IL-2/IL-2R bound
complexes and IL-2R free receptors are formed in the cell, and how these rates
relate to each other. By following first-step arguments, the different order moments
of these random variables are obtained. We illustrate our approach with numerical
realisations. The contributions of the constitutive and the ligand-induced synthesis
pathways are quantified under different signalling hypotheses.
L. de la Higuera and M. López-García have contributed equally to this work.
L. de la Higuera · M. López-García · G. Lythe · C. Molina-París

Department of Applied Mathematics, School of Mathematics, University of Leeds,
Leeds LS2 9JT, UK
e-mail: mmldlh@leeds.ac.uk
M. López-García
e-mail: m.lopezgarcia@leeds.ac.uk
G. Lythe
e-mail: grant@maths.leeds.ac.uk
C. Molina-París
e-mail: carmen@maths.leeds.ac.uk

1 Introduction
T lymphocytes (T cells) are a fundamental part of the adaptive immune system.

T cell precursors originate in the bone marrow and mature in the thymus, undergoing
positive and negative selection [2, 3]. Only around 3% of immature T cells entering
the thymus complete the maturation process and are released into the periphery [4].
Maturation in the thymus ensures that thymocytes display functional T cell receptors
(TCRs) on their surface, and at the same time, will not mount a potential autoimmune
response in the periphery [5]. T cells in the periphery recognise pathogens during
an infection by means of a specific molecule that they express on their surface, the
T cell receptor. Peripheral T cells have different functions depending on the class
they belong to. Helper T cells (CD4+ T cells) assist other cells of the immune system
during an immune response, such as B cells or cytotoxic T cells. Cytotoxic T cells
(CD8+ T cells) are directly capable of killing virus-infected cells by recognition
of particular molecules expressed on their surface. These molecules, called pMHC
complexes, are formed by peptides presented by antigen presenting cells (APCs),
by means of major histocompatibility complexes (MHCs). T cell receptors on the
surface of T cells can bind pMHC complexes. Those T cells which have been released
from the thymus into the periphery and have not participated in an immune response
are called naive T cells. Upon primary infection, the subset of naive T cells which are
able to recognise pathogen-derived pMHC complexes on the surface of APCs, such
as dendritic cells, undergo proliferation and clonal expansion. Once the infection
is cleared, a clonal contraction follows [6]. After clonal contraction, a small pool
of those T cells that participated in the immune response is selected to remain in
the periphery, becoming memory T cells. These cells will be able then to mount
a faster immune response, following a potential secondary exposure to the same
pathogen [7]. Regulatory T cells are responsible for restraining the immune response
within safe levels (tolerance mechanisms), as well as for avoiding the occurrence of
potential autoimmune responses, caused by T lymphocytes which may have escaped
negative selection in the thymus [7]. We refer the reader to Ref. [8] for an interesting
introduction to these immune processes and other basic mechanisms of the immune
system.
The proper functioning of the different T cell classes introduced above, as well
as the interaction between them, depends not only on the TCR and the pMHC complexes,
but also on a group of cytokines called the interleukin family. Our
interest here is in the molecule interleukin-2 (IL-2), a cytokine involved in a number of
immunological processes (T cell thymic development [9, 10], immune responses [7]
and regulation of homeostatic levels of regulatory T cells [11–13]). Regulatory T cell
survival in the periphery depends on IL-2, as regulatory T cells are characterised
by the constitutive expression of IL-2 receptor (IL-2R) molecules on their surface
[11–13]. Since regulatory T cells do not produce IL-2, they depend on the production
of IL-2 by other cells in the periphery, such as effector T cells. We refer the reader
to Ref. [14] for a review on the role of IL-2 in the immune system.
In this chapter, we aim to introduce, develop and analyse a stochastic version of the
deterministic model presented in Ref. [1] for the interaction between a helper T cell
and a regulatory T cell by means of the IL-2 molecule. When analysing a biological
system like this one from a mathematical and computational perspective, deterministic
or stochastic approaches can be followed. The advantage of a deterministic
approach is that it allows one to elucidate the dynamics of the process in an analytical
manner, as the mathematical analysis of the system is usually more tractable
than in a stochastic approach. However, a stochastic approach reveals the intrinsic
randomness that naturally arises in these processes, and is especially desirable when
small numbers of molecules are involved (e.g. T cell responses have been reported
to be mediated by only around 10 IL-2/IL-2R molecules [15]). When analysing a
stochastic model, the usual approach is to study the master equation of the process
(a system of differential equations involving the probabilities of the process being
at each possible state at any particular time). This system, which is usually referred
to as the Kolmogorov equations [16], does not often have closed-form analytical
solutions. Under these circumstances, different approaches are implemented in the
literature to analyse the dynamics of these processes. For example, standard Gillespie
stochastic simulations [17] allow one to produce realisations of the process under
study. A trajectory corresponding to a single Gillespie simulation represents an exact
sample from the probability mass function that is the solution of the master equation.
On the other hand, moment-based approximations [18] focus on analysing the
dynamics of average quantities (e.g. average number of molecules).
Our aim in this Chapter is to introduce and develop an alternative algorithmic
method: the matrix-analytic approach. This method makes use of a matrix formalism
and was originally developed by Marcel F. Neuts in the area of queueing theory [19].
It has recently been applied in different areas of mathematical biology,
such as population dynamics [20], epidemiology [21, 22], or cellular and molecular
biology [23]. The analysis developed here relies on the introduction of stochastic
descriptors, which are conveniently defined random variables that provide detailed
information about the dynamics of the process, yet do not require the analysis of the
time-dependent dynamics in the master equation.
The Chapter is structured as follows. In Sect. 2, the stochastic process based on the
deterministic model of Ref. [1] is introduced and described. The stochastic descriptors
are defined in Sect. 3. They enable us to study the rate at which IL-2/IL-2R
complexes are formed on the regulatory T cell surface, as well as the rate at which
IL-2R molecules are synthesised. An algorithmic approach, discussed in the Appendix,
allows us to analyse these descriptors in an efficient manner. Finally, numerical
results are obtained and discussed in Sect. 4. A summary and the main conclusions
are given in Sect. 5.
2 Stochastic Model
Our aim is to develop a stochastic version of the deterministic model proposed in

Ref. [1] for the interaction between a helper T cell and a regulatory T lymphocyte,
mediated by the interleukin-2 cytokine (IL-2) and its receptor (IL-2R). We restrict
our study to the dynamics of regulatory T cells: the synthesis of IL-2R by regulatory
T cells is induced by IL-2 that is secreted by helper T cells, but sensed by regulatory
T cells in a paracrine fashion.
We consider here the stochastic counterpart of the mathematical model introduced
in Ref. [1], which considers the following molecules: IL-2 cytokine or free ligand
molecules (L), IL-2R (R) on the regulatory T cell surface, bound IL-2/IL-2R complexes
on the cell surface (C), and bound complexes in the endosome of regulatory
T cells (E). The model under study is represented in Fig. 1 and involves the analysis
of the following random variables:
R(t) = “Number of free IL-2R receptors on the cell surface at time t”,
C(t) = “Number of IL-2/IL-2R complexes on the cell surface at time t”,
E(t) = “Number of IL-2/IL-2R complexes in the endosome at time t”,
L(t) = “Number of free extra-cellular IL-2 molecules at time t”,
for any t ≥ 0, where we assume an initial number $L(0) = n_L$ of IL-2 molecules
(ligand), and an initial number $R(0) = n_R$ of free IL-2Rs, with C(0) = E(0) = 0.
In what follows, and as we are not explicitly modelling the dynamics of helper T cells,
their IL-2 secretion, or the spatial diffusion of IL-2 from helper T cells to regulatory
T cells, we assume a constant background of ligand, so that $L(t) = n_L$ $\forall t \ge 0$, in
the spirit of Ref. [24]. Finally, we consider that the total number of receptors per cell
is bounded by a carrying capacity of the regulatory T cells. Thus, we assume that
\[ R(t) + C(t) + E(t) \le n_R^{\max} , \]
for all t ≥ 0, so that $n_R \le n_R^{\max}$.
Fig. 1 Model of IL-2 stimulation of a regulatory T cell
Once the variables of the model have been described, we introduce the set of
reactions considered (see Fig. 1):
(R1) Binding of ligand to receptor: Extra-cellular IL-2 molecules can bind to
IL-2R on the surface of regulatory T cells, forming IL-2/IL-2R complexes, with
rate $k_{on}$,
\[ R + L \xrightarrow{\;k_{on}\;} C . \]
(R2) Dissociation of ligand and receptor: Bound complexes C can dissociate
with rate $k_{off}$,
\[ C \xrightarrow{\;k_{off}\;} R + L . \]
(R3) Synthesis of IL-2R: We assume both constitutive and IL-2 induced synthesis
of new IL-2R molecules,
\[ \emptyset \xrightarrow{\;\nu_0 + \nu_1 \sigma(x)\;} R . \]
The synthesis of new receptors occurs at a rate
\[ \nu_0 + \nu_1\, \sigma(x) = \nu_0 + \nu_1\, \frac{x^3}{x^3 + K_c^3} , \]
where x is the number of bound IL-2/IL-2R complexes on the cell surface at a given
time,¹ $\nu_0$ is the constitutive synthesis rate and $\nu_1$ is the ligand-induced synthesis
rate. The positive feedback of bound complexes on IL-2R synthesis is represented
by a Hill function with half-saturation constant $K_c = 1{,}000$ and Hill coefficient
m = 3, as discussed and considered in Ref. [1].
(R4) Internalisation of bound complexes: IL-2/IL-2R complexes are internalised
from the membrane of the cell into the endosome with rate γ,
\[ C \xrightarrow{\;\gamma\;} E . \]
(R5) Endosomal degradation: Internalised IL-2/IL-2R complexes are degraded
in the endosome with rate $k_e$,
\[ E \xrightarrow{\;k_e\;} \emptyset . \]
¹ Other hypotheses for x are discussed and considered in Sect. 4. In particular, x = C(t) in the original
model of Ref. [1], while alternative possibilities x = E(t) and x = C(t) + E(t) are introduced
and analysed here in Sect. 4.
(R6) Receptor recycling: IL-2R recycling takes place from internalised IL-2/IL-2R
complexes with rate δ,
\[ E \xrightarrow{\;\delta\;} R . \]
(R7) Surface receptor degradation: Free IL-2R receptors on the cell surface are
degraded with rate $k_s$,
\[ R \xrightarrow{\;k_s\;} \emptyset . \]
Under Markovian assumptions, we can introduce a continuous-time Markov
process $X(t) = \{(R(t), C(t), E(t)) : t \ge 0\}$ defined over the space of states
$\mathcal{S} = \{(n_1, n_2, n_3) \in (\mathbb{N} \cup \{0\})^3 : n_1 + n_2 + n_3 \le n_R^{\max}\}$, with non-null infinitesimal
transition rates, according to mass-action kinetics, given by
\[ q_{(n_1,n_2,n_3),(n_1',n_2',n_3')} = \begin{cases}
k_{on}\, n_1\, n_L , & \text{if } (n_1', n_2', n_3') = (n_1 - 1, n_2 + 1, n_3) , \\
k_{off}\, n_2 , & \text{if } (n_1', n_2', n_3') = (n_1 + 1, n_2 - 1, n_3) , \\
k_s\, n_1 , & \text{if } (n_1', n_2', n_3') = (n_1 - 1, n_2, n_3) , \\
\gamma\, n_2 , & \text{if } (n_1', n_2', n_3') = (n_1, n_2 - 1, n_3 + 1) , \\
\delta\, n_3 , & \text{if } (n_1', n_2', n_3') = (n_1 + 1, n_2, n_3 - 1) , \\
k_e\, n_3 , & \text{if } (n_1', n_2', n_3') = (n_1, n_2, n_3 - 1) , \\
\nu_0 + \nu_1\, \sigma(n_2) , & \text{if } (n_1', n_2', n_3') = (n_1 + 1, n_2, n_3) .
\end{cases} \tag{1} \]
In Sect. 4 we discuss the values for all the rates.
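The transition rates in (1) can be collected in a single function that maps a state to its reachable neighbours. The sketch below is only illustrative: the parameter values in `PARAMS` are hypothetical placeholders (the values actually used are those discussed in Sect. 4), the dictionary keys are invented names, and synthesis is switched off when the carrying capacity is reached, which is an assumption made here so that the process stays inside the space of states.

```python
# Hypothetical parameter values, for illustration only.
PARAMS = {
    "k_on": 0.01, "n_L": 100, "k_off": 0.1, "k_s": 0.01,
    "gamma": 0.05, "delta": 0.02, "k_e": 0.05,
    "nu0": 0.5, "nu1": 5.0, "n_max": 50,
}


def hill(x, K_c=1000.0, h=3):
    """Hill function sigma(x) with half-saturation constant K_c and coefficient h."""
    return x**h / (x**h + K_c**h)


def transition_rates(state, p):
    """Map each reachable state to its non-null transition rate, following (1)."""
    n1, n2, n3 = state
    rates = {
        (n1 - 1, n2 + 1, n3): p["k_on"] * n1 * p["n_L"],   # (R1) binding
        (n1 + 1, n2 - 1, n3): p["k_off"] * n2,             # (R2) dissociation
        (n1 - 1, n2, n3): p["k_s"] * n1,                   # (R7) surface degradation
        (n1, n2 - 1, n3 + 1): p["gamma"] * n2,             # (R4) internalisation
        (n1 + 1, n2, n3 - 1): p["delta"] * n3,             # (R6) recycling
        (n1, n2, n3 - 1): p["k_e"] * n3,                   # (R5) endosomal degradation
    }
    if n1 + n2 + n3 < p["n_max"]:                          # (R3) synthesis, below capacity
        rates[(n1 + 1, n2, n3)] = p["nu0"] + p["nu1"] * hill(n2)
    return {target: rate for target, rate in rates.items() if rate > 0.0}
```

Returning only the strictly positive rates means that transitions out of the space of states, such as unbinding when there are no bound complexes, never appear in the result.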

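The dynamics of this process can be analysed in terms of the probabilities
$\{P_{\mathbf{n}}(t) : t \ge 0, \mathbf{n} \in \mathcal{S}\}$, where $P_{\mathbf{n}}(t) = P((R(t), C(t), E(t)) = \mathbf{n})$ is the probability
of the process being in state $\mathbf{n}$ at time t, for a given initial state $\mathbf{n}_0 = (n_R, 0, 0)$.
These probabilities verify the master equation, which is the name commonly used in
quantitative biology to refer to the Kolmogorov differential equations corresponding
to the Markov process under consideration [16]:
\[ \frac{dP_{\mathbf{n}}(t)}{dt} = \sum_{\mathbf{n}' \in \mathcal{S},\, \mathbf{n}' \ne \mathbf{n}} q_{(\mathbf{n}',\mathbf{n})}\, P_{\mathbf{n}'}(t) \; - \sum_{\mathbf{n}' \in \mathcal{S},\, \mathbf{n}' \ne \mathbf{n}} q_{(\mathbf{n},\mathbf{n}')}\, P_{\mathbf{n}}(t) , \quad \forall\, \mathbf{n} \in \mathcal{S} . \]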
This system of differential equations cannot, in general, be solved analytically. Different
methods have been proposed and used in the literature to study the Kolmogorov
equations of a given Markov process, such as carrying out Gillespie simulations [17],
or making use of moment-based approaches by means of the analysis of the average
number of individuals (cells, molecules, etc.) within each species [18].
Our objective here is to propose, in Sect. 3, an alternative approach: the introduction
of new stochastic descriptors, which are random variables of interest to the
process under consideration. This approach, based on a matrix formalism, requires
arranging the space of states S in groups of states, and the use of several algorithmic
techniques, known as the matrix-analytic approach [25].
3 Stochastic Descriptors
Our aim in this section is to introduce and analyse two stochastic descriptors: a
continuous and a discrete one. The descriptors will allow us to study the role of the
rate at which IL-2/IL-2R bound complexes are formed and the rate at which IL-2R
molecules are synthesised in the dynamics of the process. We introduce:
• the time to reach a threshold number, B, of bound IL-2/IL-2R complexes simultaneously
present on the cell surface, and
• the number of newly synthesised receptors during that time.
The first descriptor gives us an absolute measure of the rate at which bound
complexes are formed and stay on the cell surface to reach a threshold number, while
the second descriptor allows us to relate this threshold number with its direct output,
the synthesis of new receptors.
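Both descriptors can also be estimated by direct stochastic simulation, which gives a useful point of comparison for the algorithmic approach. The sketch below is a standard Gillespie simulation rather than the matrix-analytic method developed in this Chapter. The parameter values in `PARAMS` are hypothetical placeholders, the function and key names are invented for illustration, and synthesis is assumed to stop at the carrying capacity.

```python
import random

# Hypothetical parameter values, for illustration only.
PARAMS = {
    "k_on": 0.01, "n_L": 100, "k_off": 0.1, "k_s": 0.01,
    "gamma": 0.05, "delta": 0.02, "k_e": 0.05,
    "nu0": 0.5, "nu1": 5.0, "n_max": 50,
}


def hill(x, K_c=1000.0, h=3):
    """Hill function with half-saturation constant K_c and coefficient h."""
    return x**h / (x**h + K_c**h)


def moves(state, p):
    """(next_state, rate, is_synthesis) for every reaction with positive rate."""
    r, c, e = state
    synth = p["nu0"] + p["nu1"] * hill(c) if r + c + e < p["n_max"] else 0.0
    candidates = [
        ((r - 1, c + 1, e), p["k_on"] * r * p["n_L"], False),
        ((r + 1, c - 1, e), p["k_off"] * c, False),
        ((r - 1, c, e), p["k_s"] * r, False),
        ((r, c - 1, e + 1), p["gamma"] * c, False),
        ((r + 1, c, e - 1), p["delta"] * e, False),
        ((r, c, e - 1), p["k_e"] * e, False),
        ((r + 1, c, e), synth, True),
    ]
    return [mv for mv in candidates if mv[1] > 0.0]


def time_to_threshold(state, B, p, rng, max_events=10**6):
    """One realisation: (time to reach B bound complexes, receptors synthesised)."""
    t, synthesised = 0.0, 0
    for _ in range(max_events):
        if state[1] >= B:
            return t, synthesised
        options = moves(state, p)
        if not options:
            break
        weights = [rate for _, rate, _ in options]
        t += rng.expovariate(sum(weights))      # exponential waiting time
        state, _, is_synth = rng.choices(options, weights=weights)[0]
        synthesised += is_synth                 # count synthesis events
    raise RuntimeError("threshold not reached within max_events")
```

Averaging the two returned values over many realisations, for example with `random.Random(0)` as the generator and initial state `(20, 0, 0)`, gives Monte Carlo estimates of the first moments of the two descriptors.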
Our choice for the first descriptor is guided by experimental observations that
support the hypothesis that cellular responses to receptor-mediated signals only
take place “once a threshold number of bound receptor-ligand complexes have been
formed on the cell surface” [26–29]. The second descriptor has been considered to
quantify the effect of receptor-ligand signalling in protein synthesis. In the case of
IL-2R, and for T cells, there is experimental evidence for the fact that signalling
through the IL-2R (mediated by IL-2) leads to transcription and translation (or de
novo synthesis) of IL-2R. Given the differences between regulatory T cells, naive
and activated T cells (regarding production of IL-2 and expression of the high affinity
chain of IL-2R, CD25 or IL-2Rα), it is of interest to compute the number of synthesis
events during the IL-2R signalling process [11, 12].
In order to analyse the stochastic descriptors we need to arrange the space of states
$\mathcal{S}$ by levels as follows:
\[ \mathcal{S} = \bigcup_{k=0}^{n_R^{\max}} \mathcal{S}(k) , \]
where each level, $\mathcal{S}(k) = \{(n_1, n_2, n_3) \in \mathcal{S} : n_2 = k\}$, is organised in sub-levels
\[ \mathcal{S}(k) = \bigcup_{r=0}^{n_R^{\max} - k} \mathcal{S}(k; r) , \quad 0 \le k \le n_R^{\max} , \]
with $\mathcal{S}(k; r) = \{(n_1, n_2, n_3) \in \mathcal{S} : n_1 = r, n_2 = k\} \subset \mathcal{S}(k)$. That is, level $\mathcal{S}(k)$
is the group of states within $\mathcal{S}$ representing a total number of bound complexes on the
cell surface equal to k, while each sub-level $\mathcal{S}(k; r)$ is formed by those states within
$\mathcal{S}(k)$, representing a total number of free receptors equal to r, for $0 \le r \le n_R^{\max} - k$,
$0 \le k \le n_R^{\max}$. Finally, it is easy to show that
\[ J(k) = \#\mathcal{S}(k) = \frac{(n_R^{\max} - k + 1)(n_R^{\max} - k + 2)}{2} , \qquad J(k; r) = \#\mathcal{S}(k; r) = n_R^{\max} - k - r + 1 , \]
for $0 \le k \le n_R^{\max}$, $0 \le r \le n_R^{\max} - k$. This organisation of $\mathcal{S}$ allows us to follow an
algorithmic approach when analysing the descriptors under study. This analysis is
based on the use of first-step arguments, Laplace–Stieltjes transforms and probability
generating functions, and is developed in the following sections. In Sect. 3.1 the time
to reach a total threshold number, B, of bound complexes simultaneously present
on the cell surface is studied as a random variable. We then obtain its Laplace–
Stieltjes transform and its order moments are computed. In Sect. 3.2, we analyse the
probability generating function of the random variable representing the number of
receptors synthesised during the time it takes to reach a threshold number, B, of
bound complexes simultaneously present on the cell surface. We obtain not only the
different order moments of this random variable, but also an algorithmic approach
for computing its probability mass function.
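The organisation of the state space by levels and sub-levels can be checked by direct enumeration. The following minimal sketch assumes that the state space consists of all triples (n1, n2, n3) of non-negative integers with n1 + n2 + n3 at most n_R^max, which is what the sub-level ranges above suggest; the function names are illustrative only.

```python
def enumerate_levels(n_max):
    """Group states (n1, n2, n3), n1 + n2 + n3 <= n_max, by level and sub-level.

    Assumption: the state space is the full set of such triples.
    levels[k][r] lists the states of sub-level S(k; r), i.e. n2 = k and n1 = r.
    """
    levels = {}
    for k in range(n_max + 1):              # k = n2, surface complexes
        levels[k] = {}
        for r in range(n_max - k + 1):      # r = n1, free receptors
            levels[k][r] = [(r, k, n3) for n3 in range(n_max - k - r + 1)]
    return levels


def J(n_max, k, r=None):
    """Cardinality of level S(k), or of sub-level S(k; r) when r is given."""
    m = n_max - k
    if r is None:
        return (m + 1) * (m + 2) // 2
    return m - r + 1


if __name__ == "__main__":
    levels = enumerate_levels(6)
    for k, sub_levels in levels.items():
        sizes = [len(states) for states in sub_levels.values()]
        print(k, sum(sizes), J(6, k))       # enumerated size vs closed form
```

The enumeration order (level, then sub-level, then n3) is the ordering that the block structure of the matrices in the Appendix relies on.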
3.1 Time to Reach a Threshold Number of Bound Complexes on the Cell Surface
The time to reach a threshold number of bound complexes simultaneously present on the cell surface can be analysed in terms of the random variable

$T^B_{(n_1,n_2,n_3)}$ = “Time to reach a threshold number, $B$, of bound complexes simultaneously present on the cell surface, given the current state of the process $(n_1, n_2, n_3) \in \mathcal{S}$”,

for values $n_2 \le B \le n_R^{\max}$, with $T^{n_2}_{(n_1,n_2,n_3)} = 0$. In order to study this random variable, we make use of its Laplace–Stieltjes transform [16, Appendix F]
\[
\varphi^B_{(n_1,n_2,n_3)}(z) = E\left[ e^{-z\, T^B_{(n_1,n_2,n_3)}} \right] , \quad \mathrm{Re}(z) \ge 0 ,
\]
which uniquely determines the distribution of the random variable and allows us to compute its moments of any order by successive differentiation, as follows:
\[
m^{B,(l)}_{(n_1,n_2,n_3)} = E\left[ \left( T^B_{(n_1,n_2,n_3)} \right)^l \right] = (-1)^l \left. \frac{d^l}{dz^l} \varphi^B_{(n_1,n_2,n_3)}(z) \right|_{z=0} , \quad l \ge 1 .
\]
The Laplace–Stieltjes transform $\varphi^B_{(n_1,n_2,n_3)}(z)$ can be obtained by a first-step argument
\[
\begin{aligned}
(z + \Delta_{(n_1,n_2,n_3)})\, \varphi^B_{(n_1,n_2,n_3)}(z) ={}& k_{on}\, n_1 n_L\, \varphi^B_{(n_1-1,n_2+1,n_3)}(z) + k_{off}\, n_2\, \varphi^B_{(n_1+1,n_2-1,n_3)}(z) \\
&+ k_s\, n_1\, \varphi^B_{(n_1-1,n_2,n_3)}(z) + \gamma\, n_2\, \varphi^B_{(n_1,n_2-1,n_3+1)}(z) \\
&+ \delta\, n_3\, \varphi^B_{(n_1+1,n_2,n_3-1)}(z) + k_e\, n_3\, \varphi^B_{(n_1,n_2,n_3-1)}(z) \\
&+ (1 - \delta_{n_1+n_2+n_3,\, n_R^{\max}})\, (\nu_0 + \nu_1 \sigma(n_2))\, \varphi^B_{(n_1+1,n_2,n_3)}(z) ,
\end{aligned}
\tag{2}
\]
where
\[
\Delta_{(n_1,n_2,n_3)} = k_{on}\, n_1 n_L + k_{off}\, n_2 + k_s\, n_1 + \gamma\, n_2 + \delta\, n_3 + k_e\, n_3 + (1 - \delta_{n_1+n_2+n_3,\, n_R^{\max}})\, (\nu_0 + \nu_1 \sigma(n_2)) ,
\]
and $\delta_{i,j}$ denotes the Kronecker delta (which is equal to 1 if $i = j$, and 0 otherwise). Equation (2) yields a system of equations involving the Laplace–Stieltjes transforms corresponding to states $(n_1, n_2, n_3) \in \bigcup_{k=0}^{B-1} \mathcal{S}(k)$, with boundary conditions $\varphi^B_{(n_1,B,n_3)}(z) = 1$ for states $(n_1, B, n_3) \in \mathcal{S}(B)$. This system of equations can be efficiently solved by making use of a matrix formalism, while exploiting the structure of $\mathcal{S}$. We refer the reader to the Appendix, where this procedure is shown in some detail.
Once the Laplace–Stieltjes transforms are in hand, the different order moments
are obtained by successive differentiation of Eq. (2) with respect to z, and setting
z = 0, as follows
\[
\begin{aligned}
\Delta_{(n_1,n_2,n_3)}\, m^{B,(l)}_{(n_1,n_2,n_3)} ={}& k_{on}\, n_1 n_L\, m^{B,(l)}_{(n_1-1,n_2+1,n_3)} + k_{off}\, n_2\, m^{B,(l)}_{(n_1+1,n_2-1,n_3)} \\
&+ k_s\, n_1\, m^{B,(l)}_{(n_1-1,n_2,n_3)} + \gamma\, n_2\, m^{B,(l)}_{(n_1,n_2-1,n_3+1)} \\
&+ \delta\, n_3\, m^{B,(l)}_{(n_1+1,n_2,n_3-1)} + k_e\, n_3\, m^{B,(l)}_{(n_1,n_2,n_3-1)} \\
&+ (1 - \delta_{n_1+n_2+n_3,\, n_R^{\max}})\, (\nu_0 + \nu_1 \sigma(n_2))\, m^{B,(l)}_{(n_1+1,n_2,n_3)} \\
&+ l\, m^{B,(l-1)}_{(n_1,n_2,n_3)} ,
\end{aligned}
\tag{3}
\]
so that moments of order $l$ can be obtained, in an algorithmic manner, from previously computed moments of order $l - 1$, starting with $m^{B,(0)}_{(n_1,n_2,n_3)} = \varphi^B_{(n_1,n_2,n_3)}(0)$, computed from Eq. (2). This procedure makes use of a matrix formalism similar to the one for solving Eq. (2), and is also briefly discussed in the Appendix.
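For very small state spaces, Eq. (3) with l = 1 can also be solved as a single dense linear system, which is a convenient cross-check on the block algorithm of the Appendix. The sketch below is illustrative only: it assumes the state space of triples with n1 + n2 + n3 at most n_max, takes the order-zero moment equal to one (the threshold is reached with probability one), uses hypothesis σ(C), and packs the rates in a hypothetical tuple whose order is a choice made here.

```python
def solve_linear(M, rhs):
    """Gaussian elimination with partial pivoting (pure Python, small systems)."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(M, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[r][j] -= f * aug[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def mean_time_to_threshold(B, n_max, n_L, rates, sigma):
    """Mean time to reach B surface complexes from every state with n2 < B.

    rates = (k_on, k_off, k_s, gamma, delta, k_e, nu0, nu1)  # assumed ordering
    sigma = callable giving the feedback term as a function of n2 (sigma(C)).
    """
    k_on, k_off, k_s, gamma, delta, k_e, nu0, nu1 = rates
    states = [(n1, n2, n3)
              for n2 in range(B)
              for n1 in range(n_max - n2 + 1)
              for n3 in range(n_max - n2 - n1 + 1)]
    index = {s: i for i, s in enumerate(states)}
    size = len(states)
    M = [[0.0] * size for _ in range(size)]
    for (n1, n2, n3), i in index.items():
        # Synthesis is switched off when the receptor cap n_max is reached.
        synth = nu0 + nu1 * sigma(n2) if n1 + n2 + n3 < n_max else 0.0
        moves = [
            (k_on * n1 * n_L, (n1 - 1, n2 + 1, n3)),   # binding
            (k_off * n2, (n1 + 1, n2 - 1, n3)),        # unbinding
            (k_s * n1, (n1 - 1, n2, n3)),              # receptor internalisation
            (gamma * n2, (n1, n2 - 1, n3 + 1)),        # complex internalisation
            (delta * n3, (n1 + 1, n2, n3 - 1)),        # recycling
            (k_e * n3, (n1, n2, n3 - 1)),              # endosomal degradation
            (synth, (n1 + 1, n2, n3)),                 # synthesis
        ]
        for rate, target in moves:
            M[i][i] += rate                            # builds Delta on the diagonal
            if rate > 0.0 and target in index:         # states in S(B) contribute 0
                M[i][index[target]] -= rate
    # Right-hand side: l * m^(l-1) with l = 1 and m^(0) = 1 (assumption).
    return dict(zip(states, solve_linear(M, [1.0] * size)))
```

A dense solve scales cubically with the number of states, so this is only practical for small values of n_max; the level-by-level algorithm in the Appendix is the approach the chapter actually follows.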
3.2 Number of Receptors Synthesised During the Time to Reach a Threshold Number of Bound Complexes on the Cell Surface
In this section our interest is in analysing the random variable

$N^B_{(n_1,n_2,n_3)}$ = “Number of newly synthesised receptors during the time it takes to reach a threshold number, $B$, of bound complexes simultaneously present on the cell surface, given the current state of the system $(n_1, n_2, n_3) \in \mathcal{S}$”,

which is defined for values $n_2 \le B \le n_R^{\max}$. In order to analyse this random variable, we propose to consider its probability generating function
\[
\phi^B_{(n_1,n_2,n_3)}(s) = E\left[ s^{N^B_{(n_1,n_2,n_3)}} \right] , \quad |s| \le 1 ,
\]
which characterises the random variable, while allowing us at the same time to compute any $p$th order factorial moment as follows
\[
\begin{aligned}
n^{B,(p)}_{(n_1,n_2,n_3)} &= E\left[ N^B_{(n_1,n_2,n_3)} (N^B_{(n_1,n_2,n_3)} - 1)(N^B_{(n_1,n_2,n_3)} - 2) \cdots (N^B_{(n_1,n_2,n_3)} - p + 1) \right] \\
&= \left. \frac{d^p}{ds^p} \phi^B_{(n_1,n_2,n_3)}(s) \right|_{s=1} , \quad p \ge 0 .
\end{aligned}
\tag{4}
\]
A particular advantage of using the probability generating function is that it allows us to compute the probability mass function of the random variable under study. That is, $P(N^B_{(n_1,n_2,n_3)} = a) = \alpha^B_{(n_1,n_2,n_3)}(a)$ is given by
\[
\alpha^B_{(n_1,n_2,n_3)}(a) = P(N^B_{(n_1,n_2,n_3)} = a) = \frac{1}{a!} \left. \frac{d^a}{ds^a} \phi^B_{(n_1,n_2,n_3)}(s) \right|_{s=0} , \quad a \ge 0 . \tag{5}
\]
Thus, the Laplace–Stieltjes transform $\varphi^B_{(n_1,n_2,n_3)}(z)$ considered in Sect. 3.1 (corresponding to the continuous random variable $T^B_{(n_1,n_2,n_3)}$) can be seen, from a probabilistic perspective, as the continuous analogue of the probability generating function $\phi^B_{(n_1,n_2,n_3)}(s)$, corresponding to the discrete random variable $N^B_{(n_1,n_2,n_3)}$. In particular, both allow the computation of the moments of the random variables under study (factorial moments in the discrete case, standard moments in the continuous case), as well as of their probability mass function (or density function, by numerical inversion of the Laplace–Stieltjes transform [30]). We refer the reader to Ref. [31, Appendix B] for an introduction to probability generating functions of discrete random variables and Laplace–Stieltjes transforms of continuous ones. The probability generating function $\phi^B_{(n_1,n_2,n_3)}(s)$ can be obtained by following a first-step argument, in a similar manner to that discussed in the previous subsection. In particular,
\[
\begin{aligned}
\Delta_{(n_1,n_2,n_3)}\, \phi^B_{(n_1,n_2,n_3)}(s) ={}& k_{on}\, n_1 n_L\, \phi^B_{(n_1-1,n_2+1,n_3)}(s) + k_{off}\, n_2\, \phi^B_{(n_1+1,n_2-1,n_3)}(s) \\
&+ k_s\, n_1\, \phi^B_{(n_1-1,n_2,n_3)}(s) + \gamma\, n_2\, \phi^B_{(n_1,n_2-1,n_3+1)}(s) \\
&+ \delta\, n_3\, \phi^B_{(n_1+1,n_2,n_3-1)}(s) + k_e\, n_3\, \phi^B_{(n_1,n_2,n_3-1)}(s) \\
&+ (1 - \delta_{n_1+n_2+n_3,\, n_R^{\max}})\, (\nu_0 + \nu_1 \sigma(n_2))\, s\, \phi^B_{(n_1+1,n_2,n_3)}(s) ,
\end{aligned}
\tag{6}
\]
for states $(n_1, n_2, n_3) \in \bigcup_{k=0}^{B-1} \mathcal{S}(k)$, with boundary conditions $\phi^B_{(n_1,B,n_3)}(s) = 1$ for states $(n_1, B, n_3) \in \mathcal{S}(B)$. Equation (6) yields a system of equations that can be efficiently solved by following the same matrix formalism described in the previous section. This procedure is briefly discussed in the Appendix.
A direct application of Eqs. (4)–(6) leads to a system of equations corresponding to the desired factorial moments and the probability mass function. In fact, we can write
\[
\begin{aligned}
\Delta_{(n_1,n_2,n_3)}\, n^{B,(p)}_{(n_1,n_2,n_3)} ={}& k_{on}\, n_1 n_L\, n^{B,(p)}_{(n_1-1,n_2+1,n_3)} + k_{off}\, n_2\, n^{B,(p)}_{(n_1+1,n_2-1,n_3)} \\
&+ k_s\, n_1\, n^{B,(p)}_{(n_1-1,n_2,n_3)} + \gamma\, n_2\, n^{B,(p)}_{(n_1,n_2-1,n_3+1)} \\
&+ \delta\, n_3\, n^{B,(p)}_{(n_1+1,n_2,n_3-1)} + k_e\, n_3\, n^{B,(p)}_{(n_1,n_2,n_3-1)} \\
&+ (1 - \delta_{n_1+n_2+n_3,\, n_R^{\max}})\, (\nu_0 + \nu_1 \sigma(n_2)) \left( n^{B,(p)}_{(n_1+1,n_2,n_3)} + p\, n^{B,(p-1)}_{(n_1+1,n_2,n_3)} \right) , \quad p \ge 1 ,
\end{aligned}
\]
\[
\begin{aligned}
\Delta_{(n_1,n_2,n_3)}\, \alpha^B_{(n_1,n_2,n_3)}(a) ={}& k_{on}\, n_1 n_L\, \alpha^B_{(n_1-1,n_2+1,n_3)}(a) + k_{off}\, n_2\, \alpha^B_{(n_1+1,n_2-1,n_3)}(a) \\
&+ k_s\, n_1\, \alpha^B_{(n_1-1,n_2,n_3)}(a) + \gamma\, n_2\, \alpha^B_{(n_1,n_2-1,n_3+1)}(a) \\
&+ \delta\, n_3\, \alpha^B_{(n_1+1,n_2,n_3-1)}(a) + k_e\, n_3\, \alpha^B_{(n_1,n_2,n_3-1)}(a) \\
&+ (1 - \delta_{n_1+n_2+n_3,\, n_R^{\max}})\, (1 - \delta_{a,0})\, (\nu_0 + \nu_1 \sigma(n_2))\, \alpha^B_{(n_1+1,n_2,n_3)}(a - 1) , \quad a \ge 0 .
\end{aligned}
\]
Boundary conditions for the equations above are $n^{B,(p)}_{(n_1,B,n_3)} = 0$, for all $p \ge 1$ and $(n_1, B, n_3) \in \mathcal{S}(B)$, and $\alpha^B_{(n_1,B,n_3)}(a) = 0$ for all $a \ge 1$ and $\alpha^B_{(n_1,B,n_3)}(0) = 1$. Efficient methods to solve the previous systems of equations can be developed by making use of the matrix formalism introduced above. We do not present the details here.
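Since the figures in the next section report means together with standard deviations, it may help to spell out how these follow from the first two moments. The identities below are standard; the state subscripts are dropped for readability, and the value of the order-zero factorial moment assumes that the threshold is reached with probability one.

```latex
% Discrete descriptor N^B: mean and variance from the factorial moments n^{B,(p)}
E[N^B] = n^{B,(1)} , \qquad
\mathrm{Var}(N^B) = n^{B,(2)} + n^{B,(1)} - \left( n^{B,(1)} \right)^2 .

% Continuous descriptor T^B: mean and variance from the moments m^{B,(l)} of Sect. 3.1
E[T^B] = m^{B,(1)} , \qquad
\mathrm{Var}(T^B) = m^{B,(2)} - \left( m^{B,(1)} \right)^2 .

% Case p = 1 of the factorial-moment equation, using n^{B,(0)} = \phi^B(1) = 1:
% the synthesis term becomes
(1 - \delta_{n_1+n_2+n_3,\, n_R^{\max}})\, (\nu_0 + \nu_1 \sigma(n_2))
\left( n^{B,(1)}_{(n_1+1,n_2,n_3)} + 1 \right) .
```

The extra "+1" in the last expression is the synthesis event itself being counted, which is the only place where the equation for the mean number of synthesised receptors differs from a homogeneous first-step equation.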
4 Numerical Results
In this section, we carry out a numerical study of the descriptors previously introduced
for the IL-2 stimulation of regulatory T cells. In order to do so, we propose to make use
of the physiological parameters and kinetic rates provided in Ref. [1] and reported
in Table 1.

Table 1 Physiological parameters and kinetic rates for the IL-2/IL-2R system from Ref. [1]

Parameter                                                              Value
Regulatory T cell surface area, $s_c$                                  $3 \times 10^{-10}$ m$^2$
Distance to a helper T cell secreting IL-2, $h$                        $1 \times 10^{-3}$ m
Antigen-induced IL-2R synthesis rate in a regulatory T cell, $\nu_0$   $10^3$ molecules/h
IL-2 induced IL-2R synthesis rate in a regulatory T cell, $\nu_1$      $8 \times 10^3$ molecules/h
IL-2 association rate constant to IL-2R, $\tilde{k}_{on}$              111.6 nM$^{-1}$ h$^{-1}$
IL-2 dissociation rate constant to IL-2R, $k_{off}$                    0.83 h$^{-1}$
Internalisation rate constant of IL-2R, $k_s$                          0.64 h$^{-1}$
Internalisation rate constant of IL-2/IL-2R complexes, $\gamma$        1.7 h$^{-1}$
Recycling rate constant of endosomal IL-2R, $\delta$                   9 h$^{-1}$
Endosomal degradation constant, $k_e$                                  5 h$^{-1}$

We are interested in the stimulation dynamics of a regulatory T cell under three different regimes, characterised by the availability of IL-2 (low, medium, high), given by $n_L \in \{1000, 5000, 10000\}$, respectively. These values, chosen for illustrative purposes, have been selected taking into account that a helper T cell has an antigen-induced IL-2 synthesis rate in the range of $(0\text{–}2) \times 10^4$ molecules/h [1]. Initial conditions for our process are then given by $R(0) = \nu_0 / k_s$, $C(0) = E(0) = 0$, which represent the state of the regulatory T cell before stimulation. We have assumed that in the absence of IL-2, the initial number of IL-2R on the surface of a regulatory T cell is given by the balance between receptor synthesis and degradation. We restrict ourselves to the first 60 minutes post-stimulation and, for computational convenience, we consider the dynamics occurring on a fraction, $f = 1\%$, of the cell surface. Thus, $n_L \in \{10, 50, 100\}$, $n_R = 15$ and the kinetic rates have been transformed accordingly (see Table 1). The binding rate, $k_{on}$ (see Fig. 1), is obtained from $\tilde{k}_{on}$ in Table 1 as follows
\[
k_{on} = \frac{\tilde{k}_{on}}{f\, h\, s_c\, N_A} ,
\]
where $h$ is the distance to the source of IL-2 (helper T cells), $s_c$ is the regulatory T cell surface area and $N_A$ is Avogadro's number. Finally, preliminary Gillespie simulations allow us to set $n_R^{\max} = 6\, R(0)$. This value is chosen so that the total number of receptors in the system, $R(t) + C(t) + E(t)$, for $t \in [0, 60]$ min, does not exceed the value $n_R^{\max}$ with a probability greater than 0.99.
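The rescaling to a fraction of the cell surface can be reproduced numerically. The sketch below is an illustration rather than the authors' code: the unit conversions (cubic metres to litres, nM to mol/L) and the truncation of f R(0) to an integer are assumptions made here, chosen because they reproduce the value n_R = 15 quoted in the text.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def scaled_parameters(f=0.01):
    """Per-molecule binding rate and initial receptor number on a surface fraction f."""
    s_c = 3e-10          # regulatory T cell surface area, m^2 (Table 1)
    h = 1e-3             # distance to the IL-2 source, m (Table 1)
    nu0 = 1e3            # antigen-induced synthesis rate, molecules/h (Table 1)
    k_s = 0.64           # receptor internalisation rate, 1/h (Table 1)
    k_on_tilde = 111.6   # association rate constant, 1/(nM h) (Table 1)

    # Assumed conversion: volume f*h*s_c in m^3 -> litres, and nM -> mol/L.
    volume_litres = f * h * s_c * 1e3
    molecules_per_nM = AVOGADRO * volume_litres * 1e-9
    k_on = k_on_tilde / molecules_per_nM      # 1/(molecule h)

    R0 = nu0 / k_s                            # receptors on the whole surface
    n_R = int(f * R0)                         # receptors on the fraction (truncated)
    return k_on, R0, n_R


if __name__ == "__main__":
    k_on, R0, n_R = scaled_parameters()
    print(f"k_on = {k_on:.4f} per molecule per hour, R(0) = {R0}, n_R = {n_R}")
```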
The aim of the numerical experiments carried out in this section is to investigate
the main hypothesis of the mathematical model, originally considered in Ref. [1].
Namely, that IL-2R ligand-induced synthesis is driven by a positive feedback from
the IL-2/IL-2R complexes on the cell surface. However, a number of other possi-
ble alternatives need to be considered: for example, a positive feedback from the
IL-2/IL-2R complexes in the endosome, or a synergistic positive feedback from the
IL-2/IL-2R complexes on the surface and those in the endosome. Thus, we propose
here three possible alternatives for the synthesis rate considered in reaction (R3 ) in
Sect. 2. In particular, we consider
\[
\sigma(C) = \frac{C^3}{C^3 + K_c^3} , \qquad
\sigma(E) = \frac{E^3}{E^3 + K_c^3} , \qquad
\sigma(C + E) = \frac{(C + E)^3}{(C + E)^3 + K_c^3} ,
\]
where C and E represent the number of surface complexes and endosomal complexes
at a given time, respectively.
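The three hypotheses differ only in the argument passed to the same Hill function, which the following sketch makes explicit. The value of K_c is not given in this section, so any number supplied for it here is purely hypothetical, as are the function and dictionary names.

```python
def hill(x, K_c, n=3):
    """Hill function x^n / (x^n + K_c^n) with coefficient n = 3, as in the text."""
    return x**n / (x**n + K_c**n)


# Argument of the Hill function under each synthesis hypothesis.
HYPOTHESES = {
    "sigma(C)": lambda C, E: C,          # surface complexes only
    "sigma(E)": lambda C, E: E,          # endosomal complexes only
    "sigma(C+E)": lambda C, E: C + E,    # both pools together
}


def synthesis_rate(hypothesis, C, E, nu0, nu1, K_c):
    """Total receptor synthesis rate nu0 + nu1 * sigma(x) for the chosen hypothesis."""
    x = HYPOTHESES[hypothesis](C, E)
    return nu0 + nu1 * hill(x, K_c)


if __name__ == "__main__":
    # Hypothetical values: K_c = 5 complexes, rates scaled to 1% of the surface.
    for name in HYPOTHESES:
        print(name, synthesis_rate(name, C=3, E=2, nu0=10.0, nu1=80.0, K_c=5.0))
```

Keeping the hypothesis as a parameter means the first-step equations of Sect. 3 need no other change when switching between the three cases.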
In Fig. 2, we plot the average time $E[T^B_{(15,0,0)}]$ to reach a threshold number, $B$, of bound complexes on the cell surface ± its standard deviation. These quantities are obtained by following the arguments provided in Sect. 3.1 and, in particular, by direct implementation of Algorithm 1 in the Appendix. We plot these times as a function of $B$, for different numbers of IL-2 molecules, $n_L \in \{10, 50, 100\}$, and for the three synthesis hypotheses, $\sigma(C)$, $\sigma(E)$ and $\sigma(C + E)$. In the first instance (Fig. 2 top, $\sigma(C)$), the average time to reach $B$ bound complexes on the cell surface is approximately equal to one hour for $B = 4$, $B = 21$ and $B = 39$, for $n_L = 10$, $n_L = 50$ and $n_L = 100$, respectively. That is, larger numbers of IL-2 molecules lead to a larger number of bound receptors on the surface, which induce synthesis of new IL-2Rs, thus further enhancing IL-2 binding to IL-2R. In the third model (Fig. 2 bottom, $\sigma(C + E)$), the corresponding values of $B$ for an average time of one hour are $B = 4$, $B = 23$ and $B = 41$, for $n_L = 10$, $n_L = 50$ and $n_L = 100$, respectively. This illustrates the small but additional positive feedback that endosomal complexes provide to the number of surface complexes, if they are explicitly considered in the synthesis rate. On the other hand, the second hypothesis, corresponding to $\sigma(E)$ (Fig. 2 middle), which assumes that only endosomal complexes induce positive feedback for the synthesis of new receptors, significantly changes the timescales of the threshold. In particular, the values of $B$ corresponding to an average time, $E[T^B_{(15,0,0)}]$, approximately equal to one hour are $B = 4$, $B = 10$ and $B = 13$, for $n_L = 10$, $n_L = 50$ and $n_L = 100$, respectively. That is, if only endosomal complexes were to give positive feedback for the synthesis of new receptors, the stimulation of the regulatory T cell, and in particular the rate at which IL-2/IL-2R bound complexes are formed and maintained on its surface, would significantly decrease.
A similar analysis can be made regarding the second descriptor, the average number $E[N^B_{(15,0,0)}]$ of synthesised receptors during the time it takes $B$ complexes to be displayed on the cell surface. The descriptor is shown in Fig. 3 and plotted as a function of $B$. Quantities in Fig. 3 are obtained by the arguments provided in Sect. 3.2, that is, by implementing a modified version of Algorithm 1 (discussed at the end of the Appendix). We study the behaviour of this discrete descriptor for different numbers of IL-2 molecules, and look for the threshold $B$ at which the mean number, $E[N^B_{(15,0,0)}]$, is approximately equal to 40. If we assume the first hypothesis, corresponding to $\sigma(C)$ (Fig. 3 top), the threshold value of IL-2/IL-2R complexes corresponds to $B = 7$, $B = 19$ and $B = 26$, for $n_L = 10$, $n_L = 50$ and $n_L = 100$, respectively. If we assume the third hypothesis, corresponding to $\sigma(C + E)$ (Fig. 3 bottom), the threshold values are then equal to $B = 7$, $B = 19$ and $B = 27$, for $n_L = 10$, $n_L = 50$ and $n_L = 100$, respectively. This means that there is a small contribution of endosomal complexes in $\sigma(C + E)$ when compared with $\sigma(C)$, which allows the regulatory T cell to synthesise receptors slightly faster. On the other hand, the hypothesis corresponding to $\sigma(E)$ (Fig. 3 middle) significantly changes the stimulatory dynamics, and the corresponding values of $B$ are approximately equal to $B = 7$, $B = 14$ and $B = 17$, for $n_L = 10$, $n_L = 50$ and $n_L = 100$, respectively. We note that the results in Fig. 3 need to be interpreted together with those of Fig. 2. In particular, since in Fig. 3 we plot the number of synthesised receptors to reach a threshold number of bound complexes on the surface, this number depends on the time it takes to reach this threshold $B$, which is plotted in Fig. 2. This explains the behaviour shown in Fig. 3. For example, in Fig. 3 middle, and under hypothesis $\sigma(E)$, the number of receptors synthesised to reach $B = 14$ complexes on the cell surface, for $n_L = 50$, is $E[N^B_{(15,0,0)}] \approx 44$. This large number of receptors synthesised (when compared with the same case in Fig. 3 top, under hypothesis $\sigma(C)$, where $E[N^B_{(15,0,0)}] \approx 23$) can be explained as follows: the timescale to reach $B = 14$ complexes, under hypothesis $\sigma(E)$, is significantly larger than one hour (see Fig. 2 middle).

Fig. 2 Mean time $E[T^B_{(15,0,0)}]$ (in minutes) ± its standard deviation, to reach a threshold number $B$ of bound complexes on the cell surface, as a function of $B$, for different numbers of IL-2 molecules, $n_L \in \{10, 50, 100\}$. Three different synthesis rate hypotheses have been considered in the process (from top to bottom): $\sigma(C)$, $\sigma(E)$ and $\sigma(C + E)$

Fig. 3 Mean number $E[N^B_{(15,0,0)}]$ of receptors synthesised ± its standard deviation, to reach a threshold number $B$ of bound complexes on the cell surface, as a function of $B$, for different numbers of IL-2 molecules, $n_L \in \{10, 50, 100\}$. Three different synthesis rate hypotheses have been considered in the process (from top to bottom): $\sigma(C)$, $\sigma(E)$ and $\sigma(C + E)$
From the comments above it is clear that, under the third hypothesis, σ (C + E),
the contribution of endosomal IL-2/IL-2R complexes to the number of receptors
synthesised during an hour is negligible. However, these results do not allow us
to separately quantify the contribution from the constitutive synthesis pathway, ν0 ,
and the ligand-induced pathway, ν1 σ (x), to the number of receptors synthesised
during a given time interval. This can be studied as follows: we note that regardless
of the particular hypothesis considered, σ (x) (with x ∈ {C, E, C + E}), the total
number of synthesised receptors to reach a threshold number, $B$, of surface IL-2/IL-2R complexes, $N^B_{(n_R,0,0)}$, can always be expressed as
\[
N^B_{(n_R,0,0)} = N^B_{(n_R,0,0)}(CS) + N^B_{(n_R,0,0)}(LIS) ,
\]
where $N^B_{(n_R,0,0)}(CS)$ is the number of receptors which have been constitutively synthesised and $N^B_{(n_R,0,0)}(LIS)$ is the number of receptors which have been synthesised due to ligand-binding. We can then write
\[
E[N^B_{(n_R,0,0)}] = E[N^B_{(n_R,0,0)}(CS)] + E[N^B_{(n_R,0,0)}(LIS)] ,
\]
where $E[N^B_{(n_R,0,0)}]$ has been plotted in Fig. 3. The values of $E[N^B_{(n_R,0,0)}(CS)]$ can be obtained by a slight modification of the arguments introduced in Sect. 3. In particular, if we define
\[
\phi^B_{(n_1,n_2,n_3)}(s; CS) = E\left[ s^{N^B_{(n_1,n_2,n_3)}(CS)} \right] , \quad |s| \le 1 ,
\]
then Eq. (6) is replaced by
\[
\begin{aligned}
\Delta_{(n_1,n_2,n_3)}\, \phi^B_{(n_1,n_2,n_3)}(s; CS) ={}& k_{on}\, n_1 n_L\, \phi^B_{(n_1-1,n_2+1,n_3)}(s; CS) + k_{off}\, n_2\, \phi^B_{(n_1+1,n_2-1,n_3)}(s; CS) \\
&+ k_s\, n_1\, \phi^B_{(n_1-1,n_2,n_3)}(s; CS) + \gamma\, n_2\, \phi^B_{(n_1,n_2-1,n_3+1)}(s; CS) \\
&+ \delta\, n_3\, \phi^B_{(n_1+1,n_2,n_3-1)}(s; CS) + k_e\, n_3\, \phi^B_{(n_1,n_2,n_3-1)}(s; CS) \\
&+ (1 - \delta_{n_1+n_2+n_3,\, n_R^{\max}}) \left( \nu_0\, s + \nu_1 \frac{x^3}{x^3 + K_c^3} \right) \phi^B_{(n_1+1,n_2,n_3)}(s; CS) ,
\end{aligned}
\]
where $x \in \{n_2, n_3, n_2 + n_3\}$ for each synthesis rate hypothesis, that is, $\sigma(C)$, $\sigma(E)$ and $\sigma(C + E)$, respectively. Thus, the moments of $N^B_{(n_R,0,0)}(CS)$ (and, similarly, those of $N^B_{(n_R,0,0)}(LIS)$) can be obtained by reproducing our arguments of Sect. 3 and the Appendix.
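For completeness, the corresponding modification for the ligand-induced count can be written down by the same reasoning. The expressions below are a sketch inferred from the constitutive case above, not equations taken from the chapter: the marker $s$ is attached to the ligand-induced part of the synthesis rate instead of the constitutive one, and the first-moment equation assumes that the generating function equals one at $s = 1$.

```latex
% Synthesis term in the first-step equation for \phi^B(s; LIS) (inferred by analogy)
(1 - \delta_{n_1+n_2+n_3,\, n_R^{\max}})
\left( \nu_0 + \nu_1 \frac{x^3}{x^3 + K_c^3}\, s \right)
\phi^B_{(n_1+1,n_2,n_3)}(s; LIS) .

% Differentiating the constitutive equation once at s = 1 gives, for the mean
% n^{B,(1)}(CS) = E[N^B(CS)], the synthesis term
(1 - \delta_{n_1+n_2+n_3,\, n_R^{\max}})
\left[ \left( \nu_0 + \nu_1 \frac{x^3}{x^3 + K_c^3} \right) n^{B,(1)}_{(n_1+1,n_2,n_3)}(CS) + \nu_0 \right] ,

% while all other terms keep the form of Eq. (6) with \phi replaced by n^{B,(1)}(CS).
```

Adding the analogous mean equations for the two counts recovers the equation for the mean total number of synthesised receptors, consistent with the decomposition of the expectation given above.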
We are now able to compute (see Table 2) not only the mean number $E[N^B_{(15,0,0)}]$ of
synthesised receptors to reach a threshold number of IL-2/IL-2R complexes, B, under
three different hypotheses, but also the percentage contribution to the synthesis of IL-
2R from the constitutive and the ligand-induced pathway. Under the first hypothesis,
corresponding to a synthesis rate given by σ (C), we observe that as the number
of IL-2 molecules increases, the ligand-induced pathway becomes more relevant,
as expected. However, a saturation behaviour can be seen between n L = 50 and
n L = 100, which seems to indicate that a greater number of IL-2 molecules will not
lead to a higher contribution from the ligand-induced pathway. This is, of course,
due to the Hill function form assumed for σ (x), as originally proposed in Ref. [1].
Similar comments can be made for the third hypothesis, which corresponds to the
choice $\sigma(C + E)$. As shown in Table 2, differences in the values of $E[N^B_{(15,0,0)}]$
for the cases σ (C) and σ (C + E), and even for σ (E), are small for B = 5. This
indicates that the ligand-induced synthesis pathway does not play a significant role,
in absolute terms, for short timescales. This is not the case for B = 10, where a
different behaviour can be observed for σ (E), when compared to σ (C) or σ (C + E).
Now, the ligand-induced synthesis pathway plays a larger role for long timescales.
Finally, we note that the values provided in Table 2 have to be carefully interpreted by looking at both Figs. 2 and 3. For example, the value $E[N^B_{(15,0,0)}] \approx 670$ for $B = 10$, with synthesis rate given by $\sigma(E)$ and $n_L = 10$, can be explained by pointing out that it takes, on average, over 66 h to display $B = 10$ IL-2/IL-2R complexes on the cell surface, and during this time about 670 IL-2R synthesis events take place. This is, indeed, a much longer timescale than the first hour of the experiment (see Fig. 2).
Table 2 Values of $E[T^B_{(15,0,0)}]$ (in minutes), $E[N^B_{(15,0,0)}]$, $100\, E[N^B_{(15,0,0)}(CS)] / E[N^B_{(15,0,0)}]$ % (that is, the percentage of synthesis events due to the constitutive pathway, column CS %), and $100\, E[N^B_{(15,0,0)}(LIS)] / E[N^B_{(15,0,0)}]$ % (that is, the percentage of synthesis events due to the ligand-induced pathway, column LIS %), for $n_L \in \{10, 50, 100\}$ and $B \in \{5, 10\}$. Three different synthesis rate hypotheses have been considered in the process: $\sigma(C)$, $\sigma(E)$ and $\sigma(C + E)$

B    σ(x)       n_L    E[T] (min)   E[N]      CS %     LIS %
5    σ(C)       10     86.52        15.97     90.33    9.67
                50     8.91         1.73      86.06    13.94
                100    4.10         0.80      85.66    14.34
     σ(E)       10     95.26        15.92     99.72    0.28
                50     9.00         1.50      99.73    0.27
                100    4.11         0.69      99.84    0.16
     σ(C + E)   10     84.12        15.98     87.72    12.28
                50     8.88         1.78      83.02    16.98
                100    4.09         0.82      83.46    16.54
10   σ(C)       10     408.35       107.32    63.43    36.57
                50     27.93        10.41     44.75    55.25
                100    12.23        4.85      42.00    58.00
     σ(E)       10     3993.32      669.99    99.36    0.64
                50     51.13        8.67      98.33    1.67
                100    16.31        2.76      98.36    1.64
     σ(C + E)   10     316.02       92.79     56.77    43.23
                50     26.32        10.95     40.05    59.95
                100    11.87        5.18      38.21    61.79

5 Discussion
We have defined and developed a stochastic version of the deterministic model introduced in Ref. [1], for the stimulation of a regulatory T cell by IL-2. Instead of solving the master equation associated with the Markov process, or of carrying out Gillespie simulations, we have introduced two random variables (or stochastic descriptors) to analyse the rate at which IL-2/IL-2R complexes are formed and stay on the cell surface, as well as the rate at which IL-2R is synthesised. We have computed the Laplace–Stieltjes transforms and the probability generating functions of these random variables by appropriately arranging the space of states and making use of first-step arguments.
The author in Ref. [1] hypothesises that IL-2R synthesis is induced by the presence
of bound IL-2/IL-2R complexes on the cell surface. We have further generalised this
hypothesis and have also considered the role of internalised (endosomal) IL-2/IL-2R
complexes in the IL-2R synthesis rate. We have made use of numerical experiments to
compare three different hypotheses. Our numerical results suggest that if endosomal
complexes contribute to the synthesis rate, their effect would be negligible when
considering the time to reach a certain signalling threshold (encoded by the number
of IL-2/IL-2R complexes on the surface of a regulatory T cell).
We have also been able to quantify the relative contribution to the number of
IL-2R molecules synthesised by the constitutive and the ligand-induced pathway,
respectively. We have done so by slightly modifying the stochastic descriptors. A
particular conclusion from Table 2 is that ligand-induced synthesis is not important
for the short-term dynamics of the system, in which the constitutive receptor synthesis
pathway seems to play the central role. On the other hand, for larger numbers of IL-2 ligand molecules, the ligand-induced synthesis pathway plays a major role, and
the three hypotheses for the synthesis rate need to be carefully analysed. In this
case, the contribution of endosomal complexes to the synthesis rate (together with
that of surface complexes, third hypothesis (σ (C + E))) does not seem to change
the behaviour of the system, when compared to the first hypothesis (σ (C)). If the
synthesis rate only depends on the number of endosomal IL-2/IL-2R complexes
(second hypothesis (σ (E))), the results change in a quantitative and qualitative way.
Our results also indicate that as the number of ligand molecules is increased, the
role of the ligand-induced synthesis pathway is more significant, and a saturation
behaviour is observed: above a certain threshold of IL-2 molecules, there are no
changes in the relative contribution of the ligand-induced receptor synthesis pathway
to that of the constitutive one. We have not done so in this paper, but a parallel
study could be carried out for two other stochastic descriptors: the time to reach
a threshold number of IL-2/IL-2R complexes in the endosome and the number of
IL-2R molecules that are synthesised during this time. This would correspond to
the hypothesis that the signalling complexes, that induce IL-2R synthesis, are the
IL-2/IL-2R complexes in the endosome of a regulatory T cell.
Finally, we conclude by noting that, although the algorithmic approach followed
here allows us to obtain analytical results for the stochastic descriptors consid-
ered, it has computational limitations. Thus, the algorithmic procedures developed
within the Appendix are essential for efficient numerical computation. It is clear that
standard stochastic Gillespie simulations are more efficient, from a computational
perspective, than our approach (e.g. the practical implementation of our methods,
according to our numerical experiments, seems to be computationally unfeasible
for ligand concentrations higher than $15 \times 10^3$ IL-2 molecules/cell). On the other
hand, our approach not only allows the computation of exact results (instead of sim-
ulated ones), as described in Figs. 2, 3 and Table 2, but it enables the development
of further studies, such as perturbation analysis, which allows us to characterise
(by means of the computation of partial derivatives) the impact that each kinetic
rate θ ∈ {kon , koff , ν0 , ν1 , K c , γ , ke , δ, ks } has on the descriptors here analysed. This,
which is not possible by means of Gillespie simulations, is out of the scope of
this Chapter. In general, a balance between computational limitations and model
complexity needs to be considered, and alternative procedures such as Gillespie sim-
ulations will be required when studying, for example, models with a greater number
of variables or higher ligand concentrations than the ones presented here.
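For comparison with the exact approach, a single Gillespie trajectory of the process up to the threshold can be sketched as follows. This is a minimal illustration under assumptions: the transitions and rates are read off the first-step equation (2), hypothesis σ(C) is used, and the ordering of the rate tuple and all names are choices made here rather than part of the chapter.

```python
import random


def gillespie_time_to_threshold(B, state, n_max, n_L, rates, sigma, rng):
    """Simulate one path until n2 == B; return (time, number of synthesis events).

    rates = (k_on, k_off, k_s, gamma, delta, k_e, nu0, nu1)  # assumed ordering
    sigma = callable giving the feedback term as a function of n2 (sigma(C)).
    """
    k_on, k_off, k_s, gamma, delta, k_e, nu0, nu1 = rates
    # State change (d n1, d n2, d n3) for each of the seven reactions.
    changes = [(-1, 1, 0), (1, -1, 0), (-1, 0, 0), (0, -1, 1),
               (1, 0, -1), (0, 0, -1), (1, 0, 0)]
    n1, n2, n3 = state
    t, synthesised = 0.0, 0
    while n2 < B:
        # Synthesis is switched off when the receptor cap n_max is reached.
        synth = nu0 + nu1 * sigma(n2) if n1 + n2 + n3 < n_max else 0.0
        propensities = [k_on * n1 * n_L, k_off * n2, k_s * n1,
                        gamma * n2, delta * n3, k_e * n3, synth]
        total = sum(propensities)
        if total == 0.0:
            raise RuntimeError("no reaction possible before reaching the threshold")
        t += rng.expovariate(total)                  # exponential waiting time
        event = rng.choices(range(7), weights=propensities)[0]
        d1, d2, d3 = changes[event]
        n1, n2, n3 = n1 + d1, n2 + d2, n3 + d3
        synthesised += int(event == 6)               # count synthesis events only
    return t, synthesised


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical rates for illustration; sigma is switched off here.
    rates = (0.06, 0.83, 0.64, 1.7, 9.0, 5.0, 10.0, 80.0)
    runs = [gillespie_time_to_threshold(3, (15, 0, 0), 90, 10, rates,
                                        lambda c: 0.0, rng) for _ in range(200)]
    print(sum(t for t, _ in runs) / len(runs))       # Monte Carlo estimate of E[T]
```

Averaging the returned pairs over many runs gives Monte Carlo estimates of the two descriptors, which carry sampling error, in contrast with the exact values obtained from the first-step equations.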
Acknowledgements This research is supported by the European Commission through the Marie-
Curie Action “Quantitative T cell Immunology” QuanTI Initial Training Network, with grant
number FP7-PEOPLE-2012-ITN 317040-QuanTI (Luis de la Higuera, Grant Lythe and Carmen
Molina-París). M. López-García is supported by The Leverhulme Trust RPG-2012-772. The authors
acknowledge the support of the University of Leeds for the permission to use the High Performance
Computing facilities ARC1 and ARC2.
Appendix
In order to efficiently analyse the first descriptor studied in Sect. 3.1, we express the
system of equations given by Eq. (2), in matrix form as
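\[
\boldsymbol{\varphi}(z) = \mathbf{A}(z)\, \boldsymbol{\varphi}(z) + \mathbf{b}(z) , \tag{7}
\]
where the constant $B$ has been omitted to simplify the notation. The vector of unknowns, $\boldsymbol{\varphi}(z)$, is structured, due to the organisation of $\mathcal{S}$ in levels and sub-levels, by blocks as follows
\[
\boldsymbol{\varphi}(z) = \begin{pmatrix} \boldsymbol{\varphi}_0(z) \\ \boldsymbol{\varphi}_1(z) \\ \vdots \\ \boldsymbol{\varphi}_{B-1}(z) \end{pmatrix} , \qquad
\boldsymbol{\varphi}_k(z) = \begin{pmatrix} \boldsymbol{\varphi}^k_0(z) \\ \boldsymbol{\varphi}^k_1(z) \\ \vdots \\ \boldsymbol{\varphi}^k_{n_R^{\max}-k}(z) \end{pmatrix} , \quad 0 \le k \le B - 1 ,
\]
with $\boldsymbol{\varphi}^k_r(z) = (\varphi_{(r,k,0)}(z), \ldots, \varphi_{(r,k,n_R^{\max}-r-k)}(z))^T$, and where $T$ represents the transpose operator. In a similar way, the organisation of states within $\mathcal{S}$ by levels and sub-levels, and the consideration of the transition rates given in Eq. (1), lead to
\[
\mathbf{A}(z) = \begin{pmatrix}
\mathbf{A}_{0,0}(z) & \mathbf{A}_{0,1}(z) & \mathbf{0} & \cdots & \mathbf{0} & \mathbf{0} \\
\mathbf{A}_{1,0}(z) & \mathbf{A}_{1,1}(z) & \mathbf{A}_{1,2}(z) & \cdots & \mathbf{0} & \mathbf{0} \\
\mathbf{0} & \mathbf{A}_{2,1}(z) & \mathbf{A}_{2,2}(z) & \cdots & \mathbf{0} & \mathbf{0} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
\mathbf{0} & \mathbf{0} & \mathbf{0} & \cdots & \mathbf{A}_{B-2,B-2}(z) & \mathbf{A}_{B-2,B-1}(z) \\
\mathbf{0} & \mathbf{0} & \mathbf{0} & \cdots & \mathbf{A}_{B-1,B-2}(z) & \mathbf{A}_{B-1,B-1}(z)
\end{pmatrix} ,
\]
where the sub-matrix $\mathbf{A}_{k,k'}(z)$ contains, in an ordered fashion, those coefficients in the system, Eq. (2), related to transitions from states in level $\mathcal{S}(k)$ to states in level $\mathcal{S}(k')$.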
The specific structure by sub-levels allows us to write the following expressions
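\[
\mathbf{A}_{k,k}(z) = \begin{pmatrix}
\mathbf{B}^{k,k}_{0,0}(z) & \mathbf{B}^{k,k}_{0,1}(z) & \mathbf{0} & \cdots & \mathbf{0} & \mathbf{0} \\
\mathbf{B}^{k,k}_{1,0}(z) & \mathbf{B}^{k,k}_{1,1}(z) & \mathbf{B}^{k,k}_{1,2}(z) & \cdots & \mathbf{0} & \mathbf{0} \\
\mathbf{0} & \mathbf{B}^{k,k}_{2,1}(z) & \mathbf{B}^{k,k}_{2,2}(z) & \cdots & \mathbf{0} & \mathbf{0} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
\mathbf{0} & \mathbf{0} & \mathbf{0} & \cdots & \mathbf{B}^{k,k}_{n_R^{\max}-k-1,\, n_R^{\max}-k-1}(z) & \mathbf{B}^{k,k}_{n_R^{\max}-k-1,\, n_R^{\max}-k}(z) \\
\mathbf{0} & \mathbf{0} & \mathbf{0} & \cdots & \mathbf{B}^{k,k}_{n_R^{\max}-k,\, n_R^{\max}-k-1}(z) & \mathbf{B}^{k,k}_{n_R^{\max}-k,\, n_R^{\max}-k}(z)
\end{pmatrix} ,
\]
\[
\mathbf{A}_{k,k-1}(z) = \begin{pmatrix}
\mathbf{B}^{k,k-1}_{0,0}(z) & \mathbf{B}^{k,k-1}_{0,1}(z) & \mathbf{0} & \cdots & \mathbf{0} & \mathbf{0} \\
\mathbf{0} & \mathbf{B}^{k,k-1}_{1,1}(z) & \mathbf{B}^{k,k-1}_{1,2}(z) & \cdots & \mathbf{0} & \mathbf{0} \\
\mathbf{0} & \mathbf{0} & \mathbf{B}^{k,k-1}_{2,2}(z) & \cdots & \mathbf{0} & \mathbf{0} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
\mathbf{0} & \mathbf{0} & \mathbf{0} & \cdots & \mathbf{B}^{k,k-1}_{n_R^{\max}-k-1,\, n_R^{\max}-k}(z) & \mathbf{0} \\
\mathbf{0} & \mathbf{0} & \mathbf{0} & \cdots & \mathbf{B}^{k,k-1}_{n_R^{\max}-k,\, n_R^{\max}-k}(z) & \mathbf{B}^{k,k-1}_{n_R^{\max}-k,\, n_R^{\max}-k+1}(z)
\end{pmatrix} ,
\]
\[
\mathbf{A}_{k,k+1}(z) = \begin{pmatrix}
\mathbf{0} & \mathbf{0} & \mathbf{0} & \cdots & \mathbf{0} & \mathbf{0} \\
\mathbf{B}^{k,k+1}_{1,0}(z) & \mathbf{0} & \mathbf{0} & \cdots & \mathbf{0} & \mathbf{0} \\
\mathbf{0} & \mathbf{B}^{k,k+1}_{2,1}(z) & \mathbf{0} & \cdots & \mathbf{0} & \mathbf{0} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
\mathbf{0} & \mathbf{0} & \mathbf{0} & \cdots & \mathbf{B}^{k,k+1}_{n_R^{\max}-k-1,\, n_R^{\max}-k-2}(z) & \mathbf{0} \\
\mathbf{0} & \mathbf{0} & \mathbf{0} & \cdots & \mathbf{0} & \mathbf{B}^{k,k+1}_{n_R^{\max}-k,\, n_R^{\max}-k-1}(z)
\end{pmatrix} ,
\]
where the dimensions of the sub-blocks $\mathbf{0}$ in the previous expressions have been omitted. We note that, in fact, the dimensions of a sub-block $\mathbf{0}$ corresponding to the group of rows $\mathcal{S}(k; r)$ and the group of columns $\mathcal{S}(k'; r')$ are $J(k; r) \times J(k'; r')$.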
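Expressions for the sub-matrices $\mathbf{B}^{k,k'}_{r,r'}(z)$ can be obtained from Eq. (2) as
\[
\left( \mathbf{B}^{k,k-1}_{r,r}(z) \right)_{ij} = \begin{cases} \gamma\, k\, (z + \Delta_{(r,k,i)})^{-1} , & \text{if } j = i + 1, \\ 0 , & \text{otherwise,} \end{cases}
\]
where $1 \le i \le J(k; r)$, $1 \le j \le J(k - 1; r)$, $1 \le k \le B - 1$ and $0 \le r \le n_R^{\max} - k$;
\[
\left( \mathbf{B}^{k,k-1}_{r,r+1}(z) \right)_{ij} = \begin{cases} k_{off}\, k\, (z + \Delta_{(r,k,i)})^{-1} , & \text{if } j = i, \\ 0 , & \text{otherwise,} \end{cases}
\]
where $1 \le i \le J(k; r)$, $1 \le j \le J(k - 1; r + 1)$, $1 \le k \le B - 1$ and $0 \le r \le n_R^{\max} - k - 1$;
\[
\left( \mathbf{B}^{k,k}_{r,r-1}(z) \right)_{ij} = \begin{cases} k_s\, r\, (z + \Delta_{(r,k,i)})^{-1} , & \text{if } j = i, \\ 0 , & \text{otherwise,} \end{cases}
\]
where $1 \le i \le J(k; r)$, $1 \le j \le J(k; r - 1)$, $0 \le k \le B - 1$ and $1 \le r \le n_R^{\max} - k$;
\[
\left( \mathbf{B}^{k,k}_{r,r}(z) \right)_{ij} = \begin{cases} k_e\, i\, (z + \Delta_{(r,k,i)})^{-1} , & \text{if } j = i - 1, \\ 0 , & \text{otherwise,} \end{cases}
\]
where $1 \le i \le J(k; r)$, $1 \le j \le J(k; r)$, $0 \le k \le B - 1$ and $0 \le r \le n_R^{\max} - k$;
\[
\left( \mathbf{B}^{k,k}_{r,r+1}(z) \right)_{ij} = \begin{cases} (\nu_0 + \nu_1 \sigma(k))\, (z + \Delta_{(r,k,i)})^{-1} , & \text{if } j = i, \\ \delta\, i\, (z + \Delta_{(r,k,i)})^{-1} , & \text{if } j = i - 1, \\ 0 , & \text{otherwise,} \end{cases}
\]
where $1 \le i \le J(k; r)$, $1 \le j \le J(k; r + 1)$, $0 \le k \le B - 1$ and $0 \le r \le n_R^{\max} - k - 1$; and
\[
\left( \mathbf{B}^{k,k+1}_{r,r-1}(z) \right)_{ij} = \begin{cases} k_{on}\, r\, n_L\, (z + \Delta_{(r,k,i)})^{-1} , & \text{if } j = i, \\ 0 , & \text{otherwise,} \end{cases}
\]
where $1 \le i \le J(k; r)$, $1 \le j \le J(k + 1; r - 1)$, $0 \le k \le B - 2$ and $1 \le r \le n_R^{\max} - k$. Finally, the expression for the vector $\mathbf{b}(z)$ in Eq. (7) is given by
\[
\mathbf{b}(z) = \begin{pmatrix} \mathbf{0} \\ \mathbf{0} \\ \vdots \\ \mathbf{0} \\ \mathbf{A}_{B-1,B}(z)\, \mathbf{e}_{J(B)} \end{pmatrix} ,
\]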
where $\mathbf{e}_j$ represents a column vector of ones with dimension $j$. Then, following a forward-elimination backward-substitution method suggested by Ciarlet [32, p. 144], Algorithm 1 is obtained. This algorithm allows us to compute all the Laplace–Stieltjes transforms in Eq. (2) in an efficient and recursive manner.

Algorithm 1 [to obtain the Laplace–Stieltjes transforms $\varphi^B_{(n_1,n_2,n_3)}(z)$]
  $\mathbf{H}_0(z) = \mathbf{I}_{J(0)} - \mathbf{A}_{0,0}(z)$;
  For $k = 1, \ldots, B - 1$:
    $\mathbf{H}_k(z) = \mathbf{I}_{J(k)} - \mathbf{A}_{k,k}(z) - \mathbf{A}_{k,k-1}(z)\, \mathbf{H}_{k-1}^{-1}(z)\, \mathbf{A}_{k-1,k}(z)$;
  $\boldsymbol{\varphi}_{B-1}(z) = \mathbf{H}_{B-1}^{-1}(z)\, \mathbf{A}_{B-1,B}(z)\, \mathbf{e}_{J(B)}$;
  For $k = B - 2, \ldots, 0$:
    $\boldsymbol{\varphi}_k(z) = \mathbf{H}_k^{-1}(z)\, \mathbf{A}_{k,k+1}(z)\, \boldsymbol{\varphi}_{k+1}(z)$;
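Finally, the moments $m^{B,(l)}_{(n_1,n_2,n_3)}$ of the random variable $T^B_{(n_1,n_2,n_3)}$ can be obtained by means of a matrix formalism similar to that described above. In particular, the system given by Eq. (3) can be expressed in matrix form as
\[
\mathbf{m}^{(l)} = \mathbf{A}(0)\, \mathbf{m}^{(l)} + \tilde{\mathbf{b}}^{l} ,
\]
with
\[
(\tilde{\mathbf{b}}^{l})_i = \frac{l}{\Delta_i}\, (\mathbf{m}^{(l-1)})_i , \quad 0 \le i \le \sum_{k=0}^{B-1} \#\mathcal{S}(k) ,
\]
where $\Delta_i$ represents the value $\Delta_{(n_1,n_2,n_3)}$ for the state $(n_1, n_2, n_3)$ corresponding to row $i$. The vector $\tilde{\mathbf{b}}^{l}$ can be structured by blocks as follows
\[
\tilde{\mathbf{b}}^{l} = \begin{pmatrix} \tilde{\mathbf{b}}^{l}_0 \\ \tilde{\mathbf{b}}^{l}_1 \\ \vdots \\ \tilde{\mathbf{b}}^{l}_{B-2} \\ \tilde{\mathbf{b}}^{l}_{B-1} \end{pmatrix} .
\]
Similar arguments to those used to derive Algorithm 1 lead to Algorithm 1 (continuation), which allows us to compute the moments in vector $\mathbf{m}^{(p)}$ from those previously computed in vector $\mathbf{m}^{(p-1)}$, starting at $\mathbf{m}^{(0)} = \boldsymbol{\varphi}(0)$ and until the desired order, $p = l$, is reached.

Algorithm 1 (Continuation) [to obtain the $l$-th order moments $m^{B,(l)}_{(n_1,n_2,n_3)}$]
  For $k = 0, 1, \ldots, B - 1$:
    $\mathbf{m}_k^{(0)} = \boldsymbol{\varphi}_k(0)$;
  For $p = 1, \ldots, l$:
    $\mathbf{J}_0^{(p)} = \tilde{\mathbf{b}}^{p}_0$;
    For $j = 1, \ldots, B - 1$:
      $\mathbf{J}_j^{(p)} = \mathbf{A}_{j,j-1}(0)\, \mathbf{H}_{j-1}^{-1}(0)\, \mathbf{J}_{j-1}^{(p)} + \tilde{\mathbf{b}}^{p}_j$;
    $\mathbf{m}_{B-1}^{(p)} = \mathbf{H}_{B-1}^{-1}(0)\, \mathbf{J}_{B-1}^{(p)}$;
    For $j = B - 2, \ldots, 1, 0$:
      $\mathbf{m}_j^{(p)} = \mathbf{H}_j^{-1}(0) \left( \mathbf{J}_j^{(p)} + \mathbf{A}_{j,j+1}(0)\, \mathbf{m}_{j+1}^{(p)} \right)$;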
We now turn to the second descriptor analysed in Sect. 3.2. Equation (6) can be
expressed in matrix form as
φ(s) = A(s) φ(s) + b, (8)
where we are again omitting B in the notation, and where the probability generating
functions $\varphi^{B}_{(n_1,n_2,n_3)}(s)$ for $(n_1, n_2, n_3) \in \cup_{k=0}^{B-1} \mathcal{S}(k)$
are stored in a column vector φ(s), which is organised in sub-vectors following the
structure of levels and sub-levels of $\mathcal{S}$. This follows similar arguments to those
used for the vector ϕ(z). A direct comparison between Eqs. (2) and (6) allows us to write
A(s) = A(z = 0), except for sub-blocks $B^{k,k}_{r,r+1}(z)$, which should be replaced by
$B^{k,k}_{r,r+1}(s)$ given by
\[
\left(B^{k,k}_{r,r+1}(s)\right)_{ij} =
\begin{cases}
(\nu_0 + \nu_1 \sigma(k))\, s\, \Delta_{(r,k,i)}^{-1}, & \text{if } j = i,\\
\delta\, i\, \Delta_{(r,k,i)}^{-1}, & \text{if } j = i - 1,\\
0, & \text{otherwise,}
\end{cases}
\]
for $1 \le i \le J(k;r)$, $1 \le j \le J(k;r+1)$, $0 \le k \le B-1$ and
$0 \le r \le n_R^{\max} - k - 1$. Finally, the vector b = b(z = 0), and Algorithm 1 leads
to the computation of the vector φ(s) from Eq. (8).
The factorial moments $n^{B,(p)}_{(n_1,n_2,n_3)}$ and the probabilities
$\alpha^{B}_{(n_1,n_2,n_3)}(a)$ of the random variable $N^{B}_{(n_1,n_2,n_3)}$ can be computed
following similar arguments to those provided above, and are not included here.
References
1. D Busse. Dynamics of the IL-2 cytokine network and T-cell proliferation. Logos Verlag Berlin
GmbH, 2010.
2. L Klein, B Kyewski, PM Allen, and KA Hogquist. Positive and negative selection of the T cell
repertoire: what thymocytes see (and don’t see). Nature Reviews Immunology, 14(6):377–391,
2014.
3. E Palmer. Negative selection? clearing out the bad apples from the T-cell repertoire. Nature
Reviews Immunology, 3(5):383–391, 2003.
4. K Shortman, D Vremec, and M Egerton. The kinetics of T cell antigen receptor expression
by subgroups of CD4+ 8+ thymocytes: delineation of CD4+ 8+ 3 (2+) thymocytes as post-
selection intermediates leading to mature T cells. The Journal of Experimental Medicine,
173(2):323–332, 1991.
5. Y Xing and KA Hogquist. T-cell tolerance: central and peripheral. Cold Spring Harbor Per-
spectives in Biology, 4(6):a006957, 2012.
6. EM Janssen, EE Lemmens, T Wolfe, U Christen, MG von Herrath, and SP Schoenberger.
CD4+ T cells are required for secondary expansion and memory in CD8+ T lymphocytes.
Nature, 421(6925):852–856, 2003.
7. O Boyman and J Sprent. The role of interleukin-2 during homeostasis and activation of the
immune system. Nature Reviews Immunology, 12(3):180–190, 2012.
8. LM Sompayrac. How the immune system works. John Wiley & Sons, 2011.
9. TR Malek, BO Porter, EK Codias, P Scibelli, and A Yu. Normal lymphoid homeostasis and
lack of lethal autoimmunity in mice containing mature T cells with severely impaired IL-2
receptors. The Journal of Immunology, 164(6):2905–2914, 2000.
10. TR Malek, A Yu, V Vincek, P Scibelli, and L Kong. CD4 regulatory T cells prevent lethal
autoimmunity in IL-2Rβ-deficient mice: implications for the nonredundant function of IL-2.
Immunity, 17(2):167–178, 2002.
11. ARM Almeida, N Legrand, M Papiernik, and AA Freitas. Homeostasis of peripheral CD4+
T cells: IL-2Rα and IL-2 shape a population of regulatory cells that controls CD4+ T cell
numbers. The Journal of Immunology, 169(9):4850–4860, 2002.
12. ARM Almeida, B Zaragoza, and AA Freitas. Indexation as a novel mechanism of lymphocyte
homeostasis: the number of CD4+ CD25+ regulatory T cells is indexed to the number of
IL-2-producing cells. The Journal of Immunology, 177(1):192–200, 2006.
13. TR Malek, A Yu, L Zhu, T Matsutani, D Adeegbe, and AL Bayer. IL-2 family of cytokines in T
regulatory cell development and homeostasis. Journal of Clinical Immunology, 28(6):635–639,
2008.
14. TR Malek and I Castro. Interleukin-2 receptor signaling: at the interface between tolerance and
immunity. Immunity, 33(2):153–165, 2010.
15. SK Dower, SR Kronheim, CJ March, PJ Conlon, TP Hopp, S Gillis, and DL Urdal. Detection
and characterization of high affinity plasma membrane receptors for human interleukin 1. The
Journal of Experimental Medicine, 162(2):501–515, 1985.
16. VG Kulkarni. Modeling and analysis of stochastic systems. CRC Press, 2009.
17. DT Gillespie. Exact stochastic simulation of coupled chemical reactions. The Journal of Phys-
ical Chemistry, 81(25):2340–2361, 1977.
18. NG Van Kampen. Stochastic processes in physics and chemistry, volume 1. Elsevier, 1992.
19. MF Neuts. Matrix-analytic methods in queuing theory. European Journal of Operational
Research, 15(1):2–12, 1984.
20. A Gómez-Corral and M López-García. Extinction times and size of the surviving species in a
two-species competition process. Journal of Mathematical Biology, 64(1–2):255–289, 2012.
21. A Economou, A Gómez-Corral, and M López-García. A stochastic SIS epidemic model with
heterogeneous contacts. Physica A: Statistical Mechanics and its Applications, 421:78–97,
2015.
22. M López-García. Stochastic descriptors in an SIR epidemic model for heterogeneous individ-
uals in small networks. Mathematical Biosciences, 271:42–61, 2015.
23. JR Artalejo, A Gómez-Corral, M López-García, and C Molina-París. Stochastic descriptors to
study the fate and potential of naive T cell clonotypes in the periphery. Journal of Mathematical
Biology. doi:10.1007/s00285-016-1020-6, 2016.
24. KG Gurevich, PS Agutter, and DN Wheatley. Stochastic description of the ligand-receptor
interaction of biologically active substances at extremely low doses. Cellular Signalling, 15(4):
447–453, 2003.
25. QM He. Fundamentals of matrix-analytic methods. Springer, 2014.
26. C Starbuck and DA Lauffenburger. Mathematical model for the effects of epidermal growth fac-
tor receptor trafficking dynamics on fibroblast proliferation responses. Biotechnology Progress,
8(2):132–143, 1992.
27. B Goldstein, JR Faeder, and WS Hlavacek. Mathematical and computational models of
immune-receptor signalling. Nature Reviews Immunology, 4(6):445–456, 2004.
28. PW Zandstra, DA Lauffenburger, and CJ Eaves. A ligand-receptor signaling threshold
model of stem cell differentiation control: a biologically conserved mechanism applicable
to hematopoiesis. Blood, 96(4):1215–1222, 2000.
29. J Currie, M Castro, G Lythe, E Palmer, and C Molina-París. A stochastic T cell response
criterion. Journal of The Royal Society Interface, 9(76):2856–2870, 2012.
30. J Abate and W Whitt. Numerical inversion of Laplace transforms of probability distributions.
ORSA Journal on Computing, 7(1):36–43, 1995.
31. D Insua, F Ruggeri, and M Wiper. Bayesian analysis of stochastic process models, volume 978.
John Wiley & Sons, 2012.
32. PG Ciarlet, B Miara, and JM Thomas. Introduction to numerical linear algebra and optimisa-
tion. Cambridge University Press, 1989.
Understanding the Role of Mitochondria
Distribution in Calcium Dynamics
and Secretion in Bovine Chromaffin Cells
Amparo Gil, Virginia González-Vélez, José Villanueva and

Luis M. Gutiérrez
Abstract Adrenomedullary chromaffin cells are widely used as a valuable model

to study calcium-induced exocytosis of dense vesicles. Functional studies demon-
strated the important role of organelles in shaping calcium signals in this cell type.
Therefore, the study of mitochondria distribution in relation with exocytotic sites is
relevant to understand the nature of such modulation. In this paper, we discuss the
spatial distribution of mitochondria in bovine chromaffin cells in culture inferred
from experimental observations and use a theoretical model for understanding the
role played by these organelles in the fine tuning of calcium signals and the modu-
lation of secretion.
1 Introduction
Calcium is well known to play a key role as a messenger in a large number
of vital processes, such as secretion of hormones and neurotransmitters, muscle
contraction and gene transcription, among others [1, 20]. Regarding the release
of neurotransmitters, the vast majority of synapses in the central nervous system
are chemical, as are all synapses between nerves and muscles. When an action
A. Gil (B)
Depto. de Matemática Aplicada Y CC de la Comput, Universidad de Cantabria,
39005 Santander, Spain
e-mail: amparo.gil@unican.es
V. González-Vélez
Area de Química Aplicada, Universidad Autónoma Metropolitana-Azcapotzalco,
02200 Mexico city, Mexico
e-mail: vgv@correo.azc.uam.mx
J. Villanueva · L.M. Gutiérrez
Instituto de Neurociencias, Centro Mixto Universidad Miguel Hernández-CSIC,
Sant Joan d’Alacant, Alicante, Spain
e-mail: jvillanueva@umh.es
L.M. Gutiérrez
e-mail: luisguti@umh.es

potential invades the terminal, the depolarization opens voltage-sensitive calcium

channels, allowing calcium ions to enter the nerve terminal and trigger the trans-
mitter release process. This model, established by Katz and Miledi [10, 11] for the
release of neurotransmitter in the frog neuromuscular junction, has been extended
to release processes in neurons, neuroendocrine and endocrine cells and many other
cell types.
In the particular case of chromaffin cells (a type of neuroendocrine cells located
in the adrenal glands), the release of catecholamines takes place in response to the
elevation of cytosolic calcium in a process involving the transport of granules, translo-
cation to the plasma membrane, docking at the secretory sites, and finally the fusion
of membranes with extrusion of soluble contents [2]. Chromaffin cells have been
widely used to study neurosecretion because several of their exocytotic steps show a
calcium dependence similar to that of synaptic terminals, with the great advantage
that they are larger than neurons, which facilitates the experimental study of
exocytosis and calcium dynamics. In chromaffin cells, major cellular structures such as
the cortical cytoskeleton seem to play fundamental roles in different stages of the
secretory cascade [9], whereas organelles such as mitochondria and the endoplasmic
reticulum appear to control and shape calcium elevations at the subplasmalemmal
region [5]. In connection with this second aspect, a characterization of the populations
of mitochondria and ER elements in cultured bovine chromaffin cells, in relation
to their distance from the secretory apparatus or exocytotic sites, was presented in
[18].
In this paper, we discuss experimental findings on the mitochondria distribution in
bovine chromaffin cells in culture obtained using confocal fluorescence microscopy
and use a theoretical stochastic model for understanding the role played by these
organelles in the fine tuning of calcium signals and the modulation of secretion. The
modeling approach, which describes the entry through L and P/Q voltage-dependent
calcium channels (VDCC), the 3-D diffusion of calcium ions and the kinetic reactions
of calcium and buffers, is particularly appropriate for the study of media with an
inhomogeneous spatial distribution of obstacles [15], as seems to be the case for
mitochondria in chromaffin cells. It should be mentioned that the specific model for
mitochondria organelles is not yet perfect and should be further validated using more
experimental data and more extensive simulations.
2 Experimental Results
The experimental protocol for characterizing the populations of mitochondria and
ER elements in cultured bovine chromaffin cells was described in detail in [18]. We
briefly discuss here the experimental protocol and present two new figures
summarizing the experimental findings.
2.1 Experimental Materials and Methods
Bovine chromaffin cell isolation and culture. Chromaffin cells were prepared from
bovine adrenal glands by collagenase digestion and separated from the debris and
erythrocytes using centrifugation on Percoll gradients as described before [7, 8].
After extensive washing, cells were maintained as monolayer cultures in Dulbecco's
modified Eagle's medium (DMEM) supplemented with 10% fetal calf serum,
10 µM cytosine arabinoside, 10 µM 5-fluoro-2′-deoxyuridine, 50 IU/ml penicillin,
and 50 µg/ml streptomycin. Finally, cells were harvested at a density of 150,000
cells/cm² on 22 mm diameter coverslips coated with polylysine. Experiments were
done in Krebs/HEPES (K/H) basal solution containing 134 mM NaCl, 4.7 mM KCl,
1.2 mM KH2PO4, 1.2 mM MgCl2, 2.5 mM CaCl2, 11 mM glucose, 0.56 mM ascorbic
acid and 15 mM HEPES/Na, pH 7.4. Cells were stimulated for 1 min using a
depolarizing solution with 59 mM high potassium (obtained by isosmotically replacing
NaCl with KCl) in K/H basal solution. Cells were used between the third and sixth day
after plating.
2.1.1 Visualization of F-Actin Cytoskeleton and Mitochondria by Confocal Fluorescence Microscopy
Chromaffin cells were transfected with GFP-lifeact, a 17 amino acid peptide that binds
to F-actin without altering its dynamics in vivo or in vitro, using the Amaxa
basic nucleofector kit for primary mammalian neuronal cells (Amaxa GmbH. Koehl,
Germany) as described in [17]. After 48 h, cells were incubated with 1 µM
mitotracker green (Invitrogen, Eugene, OR, USA) for 15 min at room temperature.
Cell fluorescence was studied in an Olympus Fluoview FV300 confocal laser
system mounted on an IX-71 inverted microscope incorporating a 100X PLAN-Apo
oil-immersion objective with 1.45 n.a. Excitation was achieved using Ar and HeNe
visible light lasers. After acquisition, images were processed using the ImageJ
program. For the statistical analysis, the nonparametric Mann–Whitney test for paired
samples or the one-way ANOVA Kruskal–Wallis test was used. Differences were
considered significant when p < 0.05. Data are expressed as mean ± SEM obtained
from experiments performed in a number (n) of individual cells from at least three
different cultures.
2.2 Mitochondria Distribution in Bovine Chromaffin Cells in Culture
The confocal fluorescence images shown in Fig. 1 reveal two different populations
of mitochondria in bovine chromaffin cells in culture. These populations
are shown in green (perinuclear mitochondria) and yellow (cortical
mitochondria) in the 3-D reconstruction of Fig. 1b. The existence of these
two populations was further supported by measurements of the green
fluorescence intensity in the different confocal planes (Fig. 1c, green line). These
measurements clearly show an external population of mitochondria associated with
the cortical region (0–1 µm from the cell limit) and an internal population around
2–3 µm from the cell limit.
Fig. 1 Confocal microscopy images show two populations of mitochondria in chromaffin cells.
Confocal images from representative experiments performed in cultured bovine chromaffin cells
expressing RFP-LifeAct (red channel) and labeled with mitotracker green. a Confocal planes
separated by 0.2 µm from the basal cortical region to the equatorial plane were used to reconstruct
cell fluorescence (XY, YZ and XZ planes). The YZ and XZ planes evidenced the low fluorescence
layer separating two populations of mitochondria. b 3-D image obtained with Imaris showing the
perinuclear mitochondria in green as well as the cortical mitochondria in yellow. c Normalized
fluorescence for the different confocal planes determined for LifeAct fluorescence (red line) and
mitotracker (green line), represented as a function of the distance measured from the cortical basal
region to the equatorial plane (3 µm in total). Bar in b represents 3 µm
The experiments also reveal that two subpopulations of cortical mitochondria
exist in the vicinity of exocytotic sites. The images shown in Fig. 2a, b were used to
measure the distances from secretory sites (punctate green dots) to the nearest
mitochondria (red elongated forms), and these distances are represented as a distribution.
The distance distribution analysis in Fig. 2c shows that ∼30% of secretory sites
colocalize with mitochondria. As can be seen, there is also a significant population of
mitochondria at distances of around 300–500 nm.
Fig. 2 Cortical mitochondria locate close to the secretory sites. Cultured chromaffin cells incubated
with mitotracker red were stimulated for 1 min with a 59 mM KCl solution at room temperature
(21–22 °C). Secretion was stopped by lowering the temperature using ice-cooled buffer. Secretory
sites were labeled using an anti-DβH antibody followed by a secondary antibody coupled to Cy-2
(green fluorescence) in ice-cold buffer to prevent endocytosis. The distribution of cortical
mitochondria and secretory sites was studied by acquiring confocal images of the cortical area in polar
sections (a) or equatorial sections (b) to calculate distances between secretory sites and the nearest
mitochondria (N = 347 distances from 10 cells). These distances were used to build a distribution
using 0.1 nm bin size (c). Red line is the best fit to a polynomial function. Bars in a and b represent
3 µm
3 Theoretical Model
The simulation scheme used in this work is an extension of the algorithms developed
by the authors [6, 14]. The Monte Carlo algorithm has proved to be successful in the
study of the influence of geometry on the exocytotic response of neuroendocrine
cells and presynaptic terminals. For the spatial resolutions that are relevant (of the
order of 100 nm in neuroendocrine cells or 10 nm in presynaptic terminals) and the
typical concentrations of calcium in chromaffin cells, we can expect few calcium
ions inside each cube of side 100 nm (∼7 ions for 10 µM) and also a moderate
number of calcium binding sites. A particle-based approach therefore seems more
appropriate than continuous modeling in terms of concentrations.
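The order of magnitude quoted above can be checked with a short calculation. The sketch below simply converts a molar concentration into a number of ions in a cubic compartment; the function name is illustrative. For 10 µM and a side of 100 nm it gives roughly six ions, the same order as the estimate in the text.

```python
AVOGADRO = 6.02214076e23  # particles per mole


def ions_per_compartment(conc_molar, side_m):
    """Expected number of ions in a cube of given side (metres)
    at a given concentration (mol/L)."""
    volume_litres = side_m ** 3 * 1000.0  # 1 m^3 = 1000 L
    return conc_molar * volume_litres * AVOGADRO


# 10 µM of calcium in a cube of side 100 nm
n_ions = ions_per_compartment(10e-6, 100e-9)
print(f"{n_ions:.1f} ions per compartment")
```

With so few particles per compartment, relative fluctuations are large, which is the motivation for the particle-based description.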
Our algorithm implements a microscopic simulation in which the fundamental
variables are the number of ions and buffers. The average values of the output of
our simulation converge to macroscopic simulations when considering symmetrical
configurations.
A conical domain is appropriate to describe buffered diffusion in the submembrane
domain of spherical cells (which, to a good approximation, is the case for chromaffin
cells). The base of the cone represents the membrane of the cell, in which
voltage-dependent calcium channels are distributed. An orthogonal 3-D regular grid maps
the domain of simulation with a distance between grid points Δl. Each point of the
grid is associated with a cubic compartment of volume (Δl)³. The 3-D diffusion of
calcium ions and possible mobile buffers is modeled as a random walk process. The
first-order kinetic reactions of calcium ions and buffers are interpreted (and solved)
probabilistically.
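A minimal sketch of one such simulation step is given below. It is not the authors' implementation: the time step Δt = (Δl)²/(6D), the exponential form of the binding probability and the neglect of the cone boundaries are assumptions made for illustration, and all names are hypothetical. The numerical values are those of Table 1.

```python
import math
import random

D_CA = 220.0   # µm^2/s, calcium diffusion coefficient (Table 1)
DL = 0.1       # µm, distance between grid points (100 nm, Table 1)
K_ON = 5e8     # 1/(M s), forward binding rate of the endogenous buffer (Table 1)

# Assumed time step: one lattice jump per step gives <r^2> = 6 D dt.
DT = DL ** 2 / (6.0 * D_CA)  # seconds

MOVES = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def diffusion_step(ions, rng):
    """Move every ion to one of its six nearest grid neighbours.
    Domain boundaries are ignored in this sketch."""
    moved = []
    for i, j, k in ions:
        di, dj, dk = rng.choice(MOVES)
        moved.append((i + di, j + dj, k + dk))
    return moved


def binding_probability(k_on, free_buffer_molar, dt):
    """Probability that a free ion binds a buffer site within dt,
    treating binding as a first-order process for the ion."""
    return 1.0 - math.exp(-k_on * free_buffer_molar * dt)


rng = random.Random(0)                  # fixed seed for reproducibility
ions = [(0, 0, 0)] * 5                  # five ions in the same compartment
after = diffusion_step(ions, rng)
p_bind = binding_probability(K_ON, 500e-6, DT)
```

In a full simulation each ion would draw a uniform random number and bind when it falls below `p_bind`, with the free buffer concentration taken from the ion's current compartment.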
Some of the parameter values used in the model can be found in Table 1.
To simulate currents through calcium channels, we use a simple stochastic scheme
in which every channel of the total population may switch from its present state to an
open, closed or inactive state; the transition parameters between states depend on
voltage and local calcium concentration. The total current is then the sum of the
unitary currents due to open channels; the unitary currents are specific to each
channel type and depend on its unitary conductance. In our simulations, we consider
that each channel cluster is formed by two P/Q- and one L-type calcium channels,
according to experimental estimations of channel populations involved in secretion
in chromaffin cells [13]. In Fig. 3, the simulated current to voltage relationships for
the L and P/Q calcium channels are shown.
Table 1 Parameters used in the simulations
Geometrical parameters
  Radius                      r = 1 µm
  Height                      h = 5 µm
  Spatial resolution          Δx = 100 nm
Kinetic parameters
  Calcium
    Basal concentration       [Ca2+]0 = 0.1 µM
    Diffusion coefficient     DCa = 220 µm²/s
  Endogenous buffer
    Total concentration       [B] = 500 µM
    Forward binding rate      kon = 5·10^8 M−1 s−1
    Dissociation constant     KD = 10 µM
  Secretory vesicles
    Number of binding sites   3
    Forward binding rate      kon = 8·10^6 M−1 s−1
    Dissociation constant     KD = 13 µM
Fig. 3 Simulation of calcium entry through P/Q and L-type Ca2+ channels in our algorithm:
current to voltage relationships considered in the channel gating kinetic schemes
3.1 Modeling Mitochondria as Obstacles for Diffusion of Calcium Ions
In our simulation scheme, mitochondria are modeled as permeable obstacles for
diffusion. Mitochondria are distributed randomly inside the simulation domain
according to the experimental findings: (a) 38% are cortical mitochondria
(located 0–1 µm from the cell membrane); (b) 15% are cytoplasmic mitochondria
(located 1–2 µm from the cell membrane); and (c) 47% are internal
mitochondria (located 2–3 µm from the cell membrane). Of the cortical
mitochondria, 1/3 (12.6% of the total) are colocalized with calcium channels and 2/3
(25.4% of the total) lie at a mean distance of 300 nm. In Fig. 4, we show a
representation of the base of the cone used as 3-D simulation domain and its first three
slices. Three clusters of voltage-dependent calcium channels (VDCC) and two cortical
mitochondria are also shown in the figure. The calcium ions enter through the calcium
channels and diffuse inside the cytosol. When, in the diffusion process, calcium ions are
trapped by a compartment corresponding to a mitochondrion, the obstructed diffusion
inside the mitochondrial matrix is simulated by moving them in exactly the same way as
calcium ions are moved in the cytosol, but only after a number N of simulation
steps. Although the treatment of diffusion inside mitochondria is simplistic, it has
the advantage of using just a single parameter that can be constrained by the available
experimental observations.
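One possible reading of this retardation rule is sketched below for a walker on a 1-D lattice: an ion sitting in a mitochondrial compartment is held in place and makes an ordinary random-walk jump only on every N-th simulation step, while an ion in the cytosol jumps at every step. The 1-D geometry, the per-ion counter and all names are assumptions made for illustration; the actual scheme is three-dimensional.

```python
import random


def retarded_step(position, waited, in_mito, n_delay, rng):
    """Advance one ion by one simulation step on a 1-D lattice.

    position : current lattice index of the ion
    waited   : steps the ion has already been held inside a mitochondrion
    in_mito  : function telling whether a lattice index is mitochondrial
    n_delay  : the parameter N of the text
    Returns the new (position, waited) pair.
    """
    if in_mito(position):
        waited += 1
        if waited < n_delay:
            return position, waited      # ion is still held in place
    # Cytosolic ion, or waiting period elapsed: ordinary random-walk jump.
    return position + rng.choice((-1, 1)), 0


rng = random.Random(1)
pos, waited = 0, 0
trace = []
for _ in range(5):
    # Toy domain in which every compartment is mitochondrial, with N = 5.
    pos, waited = retarded_step(pos, waited, lambda x: True, 5, rng)
    trace.append(pos)
```

Under this reading the effective diffusion coefficient inside the organelle is reduced by a factor N with respect to the cytosol.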
Fig. 4 Schematic representation of the three upper slices of the 3-D simulation domain including
three VDCC and two cortical mitochondria
In order to estimate a reasonable value of N in our approach, we have taken into
account experimental evidence in chromaffin cells suggesting that: (a) the amount of
calcium taken up by the subpopulation of mitochondria in the immediate vicinity of
the VDCC could be very large, even as high as 300–400 µM during maximal
stimulation of Ca2+ entry through calcium channels [16]; and (b) during stimulation,
calcium entering through the plasma membrane is taken up preferentially by mitochondria,
with very little calcium uptake by the endoplasmic reticulum (ER). Therefore,
as a first approximation to the problem it seems reasonable to exclude the effect
of the ER on the calcium profiles and to limit ourselves to the study of mitochondria.
Figure 5 (lower figure) shows the average calcium concentration for two different
values of N (N = 50, 100) inside the compartments corresponding to the cortical
mitochondria closest to the VDCC during the simulated stimulus. As can be
seen, the calcium fluctuations inside the mitochondria peak at ∼200 µM and ∼400 µM
for N = 50 and N = 100, respectively. Therefore, a value of N = 100 for simulating
the obstructed diffusion inside the mitochondrial matrix seems to generate
mitochondrial calcium concentrations closer to the experimental observations during the
simulated stimulus.
As a comparison, Fig. 5 (upper figure) also shows the average cytoplasmic
calcium concentration from 0 to 100 nm from the cell membrane. In the computation
of the cytoplasmic calcium concentration we have excluded those compartments
corresponding to a mitochondrion. As can be seen, this average calcium concentration
peaks at a few micromolar, in contrast to the high calcium levels reached inside the
mitochondrial matrix. These findings are in agreement with the idea that mitochondria
may act as a spatial buffer in many cells, regulating the local Ca2+ concentration in
cellular microdomains [3] and stopping the progression of the calcium ions toward
the cell core [5].
Fig. 5 Calcium concentrations in the submembrane domain. Upper figure average
cytoplasmic calcium concentration from 0 to 100 nm from the cell membrane. Lower figure
average calcium concentrations inside the compartments corresponding to the cortical
mitochondria closest to the VDCC. Two different values of the retarded diffusion steps
(N = 50, 100) are considered in the calculations for comparison. [Axes: [Ca2+]cyt (µM)
versus t (ms), upper panel; [Ca2+]mit (µM) versus t (ms), lower panel, with curves for
50 and 100 diffusion steps]
In our study, we have also computed the possible impact of mitochondrial
calcium uptake on the secretory response of chromaffin cells using a standard kinetic
model for the secretory sensor of vesicles in this cell type [12]. The kinetic model is a
noncooperative kinetic scheme in which three calcium ions have to bind to a protein to
achieve vesicle fusion. In the simulation scheme, secretory vesicles are considered
as additional buffers in the medium. These vesicles are assumed to represent the
readily releasable pool of vesicles (RRP) [19]. Different distances of vesicles in
this pool from the VDCC were simulated by considering a few hundred random
configurations in the first slice of the simulation domain.
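To give a feeling for the calcium sensitivity implied by such a scheme, the sketch below evaluates the equilibrium fraction of sensors with all three sites occupied at a fixed calcium concentration. It assumes that "noncooperative" means three identical, independent binding sites, uses the vesicle parameters of Table 1, and does not model the fusion step or any time dependence; the function and variable names are illustrative.

```python
K_ON = 8e6            # 1/(M s), forward binding rate of the vesicle sensor (Table 1)
K_D = 13e-6           # M, dissociation constant of each site (Table 1)
K_OFF = K_ON * K_D    # 1/s, unbinding rate implied by K_D = k_off / k_on


def fully_bound_fraction(ca_molar, kd_molar=K_D, n_sites=3):
    """Equilibrium fraction of sensors with all sites occupied,
    assuming identical and independent (noncooperative) sites."""
    p_site = ca_molar / (ca_molar + kd_molar)  # occupancy of a single site
    return p_site ** n_sites


at_kd = fully_bound_fraction(13e-6)      # calcium equal to K_D
at_rest = fully_bound_fraction(0.1e-6)   # basal calcium, 0.1 µM
```

Because the occupancy enters to the third power, the fully bound fraction is negligible at basal calcium and rises steeply only when the local concentration approaches the micromolar range reached near open channels.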
In Fig. 6, we compare the percentage of the accumulated number
of vesicles that have fused to the membrane with and without mitochondria
in the medium. It is important to note that the accumulated secretion time course
shown is the average of the results obtained with the hundreds of random distances
of vesicles from the VDCC mentioned before, not a particular simulation result.
As can be seen, the presence of mitochondria moderates the secretory response of
the RRP after the calcium peak of the simulated stimulus, in agreement with the
idea that mitochondria could play a significant role in the modulation of secretion in
chromaffin cells [4].
Future work will involve the refinement of the model for mitochondria
in chromaffin cells and other cell types: the model should be further validated using
more experimental data and more extensive simulations. Also, the predictive power
of the model should be studied. It will also be interesting to include the effect of
the Ca2+-induced Ca2+ release (CICR) mechanism from the endoplasmic reticulum
(excluded in the present study). Modeling this process, along with the entry of
calcium through calcium channels and the mitochondrial calcium uptake, will allow
us to obtain a clearer perspective of the functional triad controlling exocytosis in
chromaffin cells.
Fig. 6 Percentage of the accumulated number of vesicles that have fused to the membrane
when mitochondria are considered or not in the medium. The accumulated secretion time
course shown is the average of the results obtained with a few hundred random distances
of vesicles from the VDCC. [Axes: % accumulated secretory events versus t (ms); curves:
no mitochondria, mitochondria]
Acknowledgements This study was supported by grants from the Spanish Ministerio de Economia
y Competitividad (BFU2011-25095 to LMG).
References
1. Augustine, G.J. How does calcium trigger neurotransmitter release? Curr Opinion Neurobiol,
11:320–326 (2001)
2. Burgoyne, R.D., Morgan, A., Robinson, I., Pender, N., Cheek, T.R. Exocytosis in adrenal
chromaffin cells. J. Anat. 183 (Pt 2):309–314 (1993)
3. Duchen, M.R. Mitochondria and calcium: from cell signalling to cell death. J. Physiol. 529.1:
57–68 (2000)
4. García, A.G., García-De-Diego, A.M., Gandía, L., Borges, R., García-Sancho, J. Calcium
signaling and exocytosis in adrenal chromaffin cells. Physiol Rev. 86(4):1093–131 (2006)
5. Garcia-Sancho, J., de Diego, A.M., Garcia, A.G. Mitochondria and chromaffin cell function.
Pflugers Arch. 464:33–41 (2012)
6. Gil, A., Segura, J., Pertusa, J.A.G., Soria, B. Monte Carlo Simulation of 3-D Buffered Ca2+
Diffusions in Neuroendocrine Cells. Biophys. J. 78(1): 13–33 (2000)
7. Gil, A., Gutierrez, L. M., Carrasco-Serrano, C., Alonso, M. T., Viniegra, S. and Criado, M.
Modifications in the C terminus of the synaptosome associated protein of 25 kDa (SNAP-25)
and in the complementary region of synaptobrevin affect the final steps of exocytosis. J. Biol.
Chem. 277: 9904–9910 (2002)
8. Gutierrez, L. M., Ballesta, J. J., Hidalgo, M. J., Gandia, L., Garcia, A. G. and Reig, J. A. A two-
dimensional electrophoresis study of phosphorylation and dephosphorylation of chromaffin
cell proteins in response to a secretory stimulus. J. Neurochem. 51: 1023–1030 (1988)
9. Gutierrez, L.M. New insights into the role of the cortical cytoskeleton in exocytosis from
neuroendocrine cells. Int. Rev. Cell Mol. Biol. 295:109–137 (2012)
10. Katz, B., Miledi, R. The effect of calcium on acetylcholine release from motor nerve terminals.
Proc. R. Soc. London B 161: 496–503 (1965)
11. Katz, B., Miledi, R. The timing of calcium action during neuromuscular transmission. J. Physiol.
(London) 189: 535–544 (1967)
12. Klingauf, J., E. Neher. Modeling buffered Ca 2+ diffusion near the membrane: implications for
secretion in neuroendocrine cells. Biophys. J. 72:674–690 (1997)
13. Lukyanetz, E.A., Neher, E. Different types of calcium channels and secretion from bovine
chromaffin cells. Eur. J. Neurosci. 11: 2865–2873 (1999)
14. Segura, J., Gil, A., Soria, B. Modeling study of exocytosis in neuroendocrine cells:
influence of the geometrical parameters. Biophys. J. 79(4): 1771–1786 (2000)
15. Vilaseca, E., Pastor, I., Isvoran, A., Madurga, S., Garcés, J.L., Mas, F. Diffusion in macromole-
cular crowded media: Monte Carlo simulation of obstructed diffusion vs. FRAP experiments.
Theor. Chem. Acc. 128, Issue 4–6: 795–805 (2011)
16. Villalobos, C., Nunez, L., Montero, M., García, A.G., Alonso, M.T., Chamero, P., Alvarez, J.,
García-Sancho, J. Redistribution of Ca2+ among cytosol and organella during stimulation of
bovine chromaffin cells. FASEB J 16:343–353 (2002)
17. Villanueva, J., Torres, V., Torregrosa-Hetland, C. J., Garcia-Martinez, V., Lopez-Font, I.,
Viniegra, S. and Gutierrez, L. M. F-Actin-Myosin II Inhibitors Affect Chromaffin Granule
Plasma Membrane Distance and Fusion Kinetics by Retraction of the Cytoskeletal Cortex. J.
Mol. Neurosci. 48: 328–338 (2012)
18. Villanueva, J., Viniegra, S., Giménez-Molina, Y., García-Martínez, V., Expósito-Romero, G.,
del Mar Frances, M., García-Sancho, J., Gutiérrez, L.M. The position of mitochondria and ER
in relation to that of the secretory sites in chromaffin cells. J Cell Sci. 127(23), 5105–14 (2014)
19. Voets, T., Neher, E., Moser, T. Mechanisms underlying phasic and sustained secretion in
chromaffin cells from mouse adrenal slices. Neuron 23: 607–615 (1999)
20. Whitfield, J.F., Chakravarthy, B. Calcium: The Grand-Master Cell Signaler. NRC Research,
Ottawa (2001)
Dynamical Features of the MAP Kinase
Cascade
Juliette Hell and Alan D. Rendall
Abstract The MAP kinase cascade is an important signal transduction system in

molecular biology for which a lot of mathematical modelling has been done. This
paper surveys what has been proved mathematically about the qualitative properties
of solutions of the ordinary differential equations arising as models for this biological
system. It focuses, in particular, on the issues of multistability and the existence of
sustained oscillations. It also gives a concise introduction to the mathematical tech-
niques used in this context, bifurcation theory and geometric singular perturbation
theory, as they relate to these specific examples. In addition further directions are
presented in which the applications of these techniques could be extended in the
future.
1 Introduction
An important process in cell biology is the transmission of information by signalling

networks from the cell membrane to the nucleus, where it can influence transcription.
This provides the cell with a possibility of reacting to its environment. A common
module in many signalling networks is the mitogen activated protein kinase cascade
(MAPK cascade). It is the subject of what follows. The MAPK cascade is a pattern
of chemical reactions which is widespread in eukaryotes [45]. The individual pro-
teins which make up the cascade differ between different organisms and between
different examples within a given organism but what is common is a certain archi-
tecture. The cascade consists of three parts which we will call layers. Each layer is
a phosphorylation cycle or, as it is sometimes called, a multiple futile cycle [43].
J. Hell
Institut für Mathematik, Freie Universität Berlin, Arnimallee 7, 14195 Berlin, Germany
e-mail: jhell@zedat.fu-berlin.de
A.D. Rendall
Institut für Mathematik, Johannes Gutenberg-Universität Mainz, Staudingerweg 9,
55099 Mainz, Germany
e-mail: rendall@uni-mainz.de

A multiple futile cycle consists of a protein X which can be phosphorylated by
a kinase E at n sites. The resulting phosphoproteins can be dephosphorylated by a
phosphatase F. The cases of interest for the MAPK cascade are n = 1 and n = 2.
The following sketch shows the MAPK cascade with three layers. Each plain arrow
marked with an enzyme represents an enzymatic reaction, i.e. $Y \xrightarrow{G} Z$ stands for
the chemical reactions $Y + G \rightleftharpoons Y \cdot G \to Z + G$. The dotted arrows between a
phosphorylated protein and an enzyme in the next layer stand for equality, i.e.
$Z \dashrightarrow G$ means Z = G. The first layer of the cascade is a simple phosphorylation,
i.e. n = 1. The next layers are double phosphorylations, i.e. n = 2. The species Xi are
the proteins in the cascade, or in more detail X1 = MAPKKK (MAP kinase kinase kinase),
X2 = MAPKK (MAP kinase kinase), X3 = MAPK (MAP kinase).
\[
\begin{array}{l}
X_1 \underset{F_1}{\overset{E_1}{\rightleftarrows}} X_1P \\[4pt]
\qquad X_1P \dashrightarrow E_2 \\[4pt]
X_2 \underset{F_2}{\overset{E_2}{\rightleftarrows}} X_2P
    \underset{F_2}{\overset{E_2}{\rightleftarrows}} X_2PP \\[4pt]
\qquad X_2PP \dashrightarrow E_3 \\[4pt]
X_3 \underset{F_3}{\overset{E_3}{\rightleftarrows}} X_3P
    \underset{F_3}{\overset{E_3}{\rightleftarrows}} X_3PP
\end{array}
\tag{1}
\]
It is important that even when there is more than one site where the protein can
be phosphorylated there is only one kinase and one phosphatase. Thus different
phospho-forms of the protein may compete for binding to one of the enzymes. A
MAPK cascade consists of three layers, each of which is a multiple futile cycle
with a different protein. The connection between the layers is that the maximally
phosphorylated forms of the protein in the first and second layers are the kinases
which catalyze the phosphorylations in the second and third layers, respectively. The
most extensively studied example in mammals is that where the proteins in the three
layers are Raf, MEK and ERK.
The subject of what follows is mathematical modelling of the MAPK cascade. It
turns out that this system has a rich dynamics which needs mathematical modelling
for its understanding. Pioneering work in studying this question was done by Huang
and Ferrell [21]. In that paper the authors presented both theoretical and experimental
results and compared them. Their experiments were done in cell extracts
from Xenopus oocytes. On the theoretical side they wrote down a system of ordi-
nary differential equations describing the time evolution of the concentrations of the
substances involved in the MAPK cascade and simulated these equations numeri-
cally. The results of the simulations reproduced important qualitative features of the
experimental data.
In order to model the reactions taking place it is necessary to make assumptions
about the kinetics. In many enzymatic reactions the concentrations of the enzymes
are much less than those of the corresponding substrates. This cannot necessarily
be assumed in the case of the MAPK cascade. In particular there are substances,
for example the doubly phosphorylated form of MEK, which are the substrate for
one reaction and the enzyme for others. For this reason the model in [21] uses a
Michaelis–Menten scheme for each reaction with a substrate, an enzyme and a
substrate-enzyme complex without making an assumption of small enzyme concen-
tration. The elementary reactions involved are assumed to obey mass action kinetics.
There are seven conservation laws for the total amounts of the three substrates and the
four enzymes which are not also substrates (one kinase, E1, and three phosphatases,
Fi, i = 1, 2, 3). It is assumed that the phosphorylation and dephosphorylation are
distributive and sequential. In other words in any one encounter of a substrate with
an enzyme only one (de-)phosphorylation takes place, after which the enzyme is
released. To add or remove more than one phosphate group more than one encounter
is necessary. The phosphorylations take place in a particular order and the dephos-
phorylations in the reverse order. These assumptions are implemented in the model
of [21]. They have also frequently been adopted in other literature concerned with
the modelling of this system and we will call them the standard assumptions. The
authors of [21] mention that they also did simulations for cases where one or more of
the reactions is processive (i.e. more than one phosphate group is added or removed
during one encounter). The standard assumptions may not be correct in all biological
examples but they are a convenient starting point for modelling which can later be
modified if necessary.
In real biological systems the MAPK cascade is part of a larger signalling network
and cannot be seen in isolation. Nevertheless, one can hope to obtain insights by
first understanding the isolated cascade and later combining it with other reactions.
Similarly it can be helpful to approach an understanding of the cascade itself by
studying its component parts, the multiple futile cycles.
The most frequent approaches to these modelling questions in the literature use
simulations and heuristic considerations. An alternative possibility, which is the
central theme of this paper, is to prove mathematical theorems about certain aspects
of the dynamics with the aim of obtaining insights complementary to those coming
from the numerical procedures. The number of mathematically rigorous results on
this subject known up to now is rather limited. The aim of this paper is to survey the
results of this type which are available and to outline perspectives of how they might
be extended. At the same time it gives an introduction to some of the techniques
which are useful in this kind of approach. The description starts from the simplest
models and proceeds to more complicated ones. We discuss successively the simple
futile cycle, the dual futile cycle and the full cascade. The description also proceeds
from simpler dynamical features to more complicated ones, from multistationarity
to multistability and then to sustained oscillations. After this core material has been
treated further directions are explored. What happens when the basic cascade is
embedded in feedback loops? What happens in systems with other phosphorylation
schemes?
2 The Simple Futile Cycle
In this section we look at the case n = 1 of the multiple futile cycle, in other
words we isolate the first layer of the cascade Eq. (1). We omit the index 1 of
the chemical species for clarity in this section, since the other layers will play
no role here. Modelling this system in a way strictly analogous to that applied
to the MAPK cascade in [21] leads to a system of six equations for the sub-
strates X and XP, the enzymes E, F and the substrate-enzyme complexes X · E
and XP · F. There are three conserved quantities, which are the total amounts
of the enzymes and the substrate, Etot = [E] + [X · E], Ftot = [F] + [XP · F] and
Xtot = [X] + [XP] + [X · E] + [XP · F], where here and in the following [Z] denotes
the concentration of the species Z. These can be used to eliminate three of the
equations if desired. When the evolution is modelled by mass-action kinetics, these
manipulations can be done explicitly:
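\[
\begin{aligned}
\frac{d[X]}{dt} &= -k_1 [X]\,(E_{\mathrm{tot}} - [X \cdot E]) + k_2 [X \cdot E] + k_6 [XP \cdot F],\\
\frac{d[X \cdot E]}{dt} &= k_1 [X]\,(E_{\mathrm{tot}} - [X \cdot E]) - (k_2 + k_3)[X \cdot E],\\
\frac{d[XP \cdot F]}{dt} &= k_4 [XP]\,(F_{\mathrm{tot}} - [XP \cdot F]) - (k_5 + k_6)[XP \cdot F],
\end{aligned}
\tag{2}
\]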
where ki , i = 1, . . . , 6, are the reaction constants of the six reactions involved.

The remaining concentrations are given by the relations [E] = Etot − [X · E], [F] =
Ftot − [XP · F], [XP] = Xtot − [X] − [X · E] − [XP · F].
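The reduced system (2) is easy to explore numerically. The following is a minimal sketch, not taken from [21] or [5]: the rate constants, the total amounts and the fixed-step Runge-Kutta integrator are all hypothetical choices made for illustration. It integrates (2) from two different initial states with the same total amounts and prints the two end states.

```python
import math

def rhs(state, k, totals):
    """Right-hand side of system (2) for the state ([X], [X.E], [XP.F])."""
    x, xe, xpf = state
    k1, k2, k3, k4, k5, k6 = k
    e_tot, f_tot, x_tot = totals
    xp = x_tot - x - xe - xpf  # [XP] from the conservation law for Xtot
    return (
        -k1 * x * (e_tot - xe) + k2 * xe + k6 * xpf,
        k1 * x * (e_tot - xe) - (k2 + k3) * xe,
        k4 * xp * (f_tot - xpf) - (k5 + k6) * xpf,
    )

def rk4(state, k, totals, dt, steps):
    """Classical fourth-order Runge-Kutta integration with a fixed step."""
    for _ in range(steps):
        a = rhs(state, k, totals)
        b = rhs(tuple(s + 0.5 * dt * v for s, v in zip(state, a)), k, totals)
        c = rhs(tuple(s + 0.5 * dt * v for s, v in zip(state, b)), k, totals)
        d = rhs(tuple(s + dt * v for s, v in zip(state, c)), k, totals)
        state = tuple(
            s + dt * (p + 2 * q + 2 * r + w) / 6
            for s, p, q, r, w in zip(state, a, b, c, d)
        )
    return state

K = (1.0,) * 6            # hypothetical rate constants k1, ..., k6
TOTALS = (1.0, 1.0, 2.0)  # hypothetical Etot, Ftot, Xtot

# Two initial states with the same total amounts: all substrate
# unphosphorylated, and all substrate phosphorylated.
end_a = rk4((2.0, 0.0, 0.0), K, TOTALS, dt=0.01, steps=10000)
end_b = rk4((0.0, 0.0, 0.0), K, TOTALS, dt=0.01, steps=10000)
print(end_a)
print(end_b)
```

For these symmetric parameter values the steady state can be computed by hand: [X] = [XP] = √3 − 1 and [X · E] = [XP · F] = 2 − √3. Both runs should end close to this point, which is what the uniqueness and convergence result of [5] discussed below would lead one to expect.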
Suppose we have a system ẋi = fi (x), x = (x1 , . . . , xn ) ∈ Rn representing the
dynamics of a chemical reaction network. The xi (t) are the concentrations of the
substances involved as functions of time and the dot denotes the time rate of change.
A stationary solution (or steady state) is one which satisfies for all i ∈ {1, . . . , n},
xi (t) ≡ xi,0 for some fixed concentrations x0 = (x1,0 , . . . , xn,0 ). Thus it satisfies ẋ = 0
or equivalently f (x0 ) = 0. A solution x(t) is said to converge to the steady state x0
if limt→∞ x(t) = x0 . This is an idealization of the situation where an experimental
system settles down to a steady state on a sufficiently long time scale. For instance
in the experiments of [21] the system was found to approach a steady state after 100
minutes. Corresponding behaviour was found in the simulations. There is no reason
why a chemical system should behave in this simple way for all initial data. The
results of [21] indicate that it does so for the data considered there. Even if for a
particular system all solutions converge to a steady state, it may be that there exists more
than one steady state for fixed values of the total amounts of the substances involved.
In the language of chemical reaction network theory [11], there may be more than
one steady state in one stoichiometric compatibility class. This is the phenomenon
of multistationarity. When more than one of these steady states is stable one speaks of
multistability, which is important for biological processes such as cell differentiation.
In the case of the simple futile cycle it was proved in [5] that there is always
exactly one steady state for fixed values of the total amounts Etot , Ftot and Xtot and
that all other solutions converge to that steady state. The steady state is globally
asymptotically stable and bistability is ruled out in this case. We cannot enter into
the details of the proof of this result here but it is appropriate to mention some
of the key ideas involved. Suppose that the system ẋi = fi (x) has the property that
∂fj /∂xi > 0 for all i ≠ j and all x. Then the system is called monotone. If a system
does not satisfy this property we may try to make it do so by reversing the signs of
some of the variables. In other words we replace the variables xi by yi = εi xi , where
each εi is plus or minus one. In general a system is called monotone if there is a
mathematical transformation of this kind which makes all partial derivatives of the
right hand sides of the equations with i ≠ j positive. There is a graphical criterion to
decide whether this is possible. Define a graph which has a vertex for each variable
xi and which has an oriented edge connecting node i to node j if ∂fj /∂xi ≠ 0. Label
each edge with the sign of the corresponding derivative. Alternatively we can use
the convention that a positive sign is represented by a normal arrow while a negative
sign is represented by a blunt-headed arrow. This object is called the species graph.
For example the species graph of the simple futile cycle is the following.
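[Diagram (3): species graph on the six vertices X, XP, E, F, X · E and XP · F. Read off
from the mass action equations, the positive edges are X → X · E, E → X · E,
X · E → X, X · E → E, X · E → XP, XP → XP · F, F → XP · F, XP · F → XP,
XP · F → F and XP · F → X. The negative edges are X → E, E → X, XP → F and
F → XP, together with a negative loop from each vertex to itself.] (3)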
Since the use of terms concerning feedback loops is not always consistent between
different sources in the literature we specify the terminology which we will use. A
feedback loop is a sequence of arrows which starts at one node and ends at the same
node, i.e. a cycle. It is called a positive or negative feedback loop according to whether
the number of edges with a negative sign it contains is even or odd. More precisely
this object may be called a directed feedback loop, while the corresponding definition
where the orientation of the edges is ignored is called an undirected feedback loop.
Suppose we have a system of ordinary differential equations for which the sign of
the derivative ∂fj /∂xi is independent of x for each fixed i and j. Then it can be proved
that the system is monotone if and only if the species graph contains no negative
undirected feedback loops of length greater than one. In the case of the simple futile
cycle the species graph contains at least one negative feedback loop. This is true both
for the full six-dimensional system and for the three-dimensional system obtained by
eliminating the concentrations of E, F and XP using the conservation laws. However,
it was shown in [5] that there is a different type of transformation which makes this
system monotone. In this transformation the concentrations are replaced as variables
by the extents of the reactions. The resulting monotone system has additional good
properties and this allows the property of convergence to a unique steady state to be
concluded for the original system.
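The graphical criterion can be checked mechanically. The sketch below is an illustration under stated assumptions rather than the construction used in [5]: it takes the full six-dimensional mass action system with all rate constants set to one (a hypothetical choice), estimates the signs of ∂fj /∂xi by central differences at a point with positive concentrations, and evaluates the sign of the undirected loop through X, E and X · E.

```python
SPECIES = ["X", "XP", "E", "F", "XE", "XPF"]

def rhs(c, k=(1.0,) * 6):
    """Mass action right-hand side of the full six-dimensional system."""
    x, xp, e, f, xe, xpf = c
    k1, k2, k3, k4, k5, k6 = k
    v1, v2, v3 = k1 * x * e, k2 * xe, k3 * xe      # phosphorylation steps
    v4, v5, v6 = k4 * xp * f, k5 * xpf, k6 * xpf   # dephosphorylation steps
    return [
        -v1 + v2 + v6,  # d[X]/dt
        v3 - v4 + v5,   # d[XP]/dt
        -v1 + v2 + v3,  # d[E]/dt
        -v4 + v5 + v6,  # d[F]/dt
        v1 - v2 - v3,   # d[X.E]/dt
        v4 - v5 - v6,   # d[XP.F]/dt
    ]

def sign_matrix(point, h=1e-6, tol=1e-9):
    """signs[i][j] is the sign of d f_j / d x_i, i.e. of the edge i -> j."""
    n = len(point)
    signs = [[0] * n for _ in range(n)]
    for i in range(n):
        up, down = list(point), list(point)
        up[i] += h
        down[i] -= h
        f_up, f_down = rhs(up), rhs(down)
        for j in range(n):
            d = (f_up[j] - f_down[j]) / (2 * h)
            signs[i][j] = 0 if abs(d) < tol else (1 if d > 0 else -1)
    return signs

S = sign_matrix([1.0] * 6)
INDEX = {name: i for i, name in enumerate(SPECIES)}

def edge(a, b):
    """Sign of the edge from species a to species b (0 if there is none)."""
    return S[INDEX[a]][INDEX[b]]

# Undirected triangle X - E - X.E: the product of its edge signs.
loop_sign = edge("X", "E") * edge("E", "XE") * edge("XE", "X")
print(loop_sign)
```

The edge between X and E is negative while the edges joining each of them to X · E are positive, so the product is negative. This is one negative undirected feedback loop of length three, which is enough to rule out monotonicity in the original variables by the criterion stated above.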
Many chemical systems have interesting limiting cases obtained by letting cer-
tain combinations of reaction constants tend to zero. This can lead to a significant
reduction in the number of variables in the system and make analytical investigations
simpler. Under suitable circumstances solutions of the limiting system approximate
solutions of the original system in a certain parameter regime. This can be illustrated
by the case of the simple futile cycle. For the species E, F, X · E, XP · F involving
the enzymes E, F, alone or in a complex, scale the concentrations with a parameter
ε > 0 while the remaining concentrations [X] and [XP] are not rescaled. In other
words, if y = ([E], [F], [X · E], [XP · F]) is the vector of their concentrations, define
ỹ by y = εỹ, where ε is a positive constant. If x = ([X], [XP]) is the vector of the remaining
concentrations, the original system with mass-action kinetics is of the form

\[
\begin{aligned}
\dot{x} &= f(x, y),\\
\dot{y} &= g(x, y),
\end{aligned}
\tag{4}
\]
where f , g are linear in each of the concentrations, i.e. entries of the vectors x, y. The
smallness of ε corresponds to the fact that the amount of enzymes is small compared
to the amount of the other species. Define a new time coordinate by τ = εt and let a
prime denote the derivative with respect to time τ . The time τ is called the slow time
scale because its velocity dτ/dt = ε is small. This gives rise to a system of the general
form
\[
\begin{aligned}
x' &= f(x, \tilde{y}),\\
\varepsilon \tilde{y}' &= g(x, \tilde{y}).
\end{aligned}
\tag{5}
\]
In the limit ε → 0, and dropping the tildes, the second equation εy′ = g(x, y)
changes from being an ordinary differential equation to being an algebraic equation
0 = g(x, y). Under favourable conditions this equation can be solved for y in terms
of x and the result y = h(x) substituted into the first equation to give an equation for
x alone,
\[
x' = f(x, h(x)). \tag{6}
\]
The result is a system with fewer unknowns. The degeneration to an algebraic
equation means that the limit is singular. It may be asked whether the solutions
themselves nevertheless behave in a regular way in the limit and indeed this is the
case under certain conditions. The appropriate mathematical techniques for studying
this are known as geometric singular perturbation theory (GSPT) as introduced by
Fenichel, see [12]. This theory will be discussed in more detail below. The particular
type of limit just exhibited for the simple futile cycle is sometimes called a
Michaelis–Menten limit. In this case it leads to a two-dimensional system. The
conservation law for the total amount of substrate survives in the limit in a simplified
form and can be used to reduce the system further to a single equation. Let the
reaction constants for complex formation, complex dissociation and product formation
be denoted by ki , di and ai respectively, with i = 1 corresponding to phosphorylation
and i = 2 to dephosphorylation.
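\[
\begin{aligned}
X + E \underset{d_1}{\overset{k_1}{\rightleftarrows}} X \cdot E &\xrightarrow{a_1} XP + E,\\
XP + F \underset{d_2}{\overset{k_2}{\rightleftarrows}} XP \cdot F &\xrightarrow{a_2} X + F.
\end{aligned}
\tag{7}
\]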
The Michaelis constants are defined as usual by Km,i = (di + ai )/ki . Then the equation can
be written as
\[
\frac{d}{dt}[XP] = \frac{a_1 E_{\mathrm{tot}}\,(X_{\mathrm{tot}} - [XP])}{K_{m,1} + X_{\mathrm{tot}} - [XP]}
 - \frac{a_2 F_{\mathrm{tot}}\,[XP]}{K_{m,2} + [XP]}. \tag{8}
\]
After this reduction it is possible to get an explicit formula for the unique steady
state by solving a quadratic equation in the variable [XP] [16]. For ε small the
total amounts of the enzymes are small compared to the total amount of substrate
and this is sometimes described by saying that the enzymes are close to saturation.
Varying one of the parameters in the system and monitoring the concentration of
phosphorylated protein gives a response function. The main concern of [16] is the
form of this function, which corresponds to the property of ultrasensitivity: A small
change in the parameter leads to a large change in the value of the response function.
This property is quantitative rather than qualitative and not obviously amenable to
the application of the analytical techniques to be discussed in this paper. It is an
interesting question, whether these techniques can be extended so as to give more
information about quantitative properties. This will not be discussed further here
except to mention that ultrasensitivity was also the central feature of interest in [21],
where the response of the concentration of maximally phosphorylated ERK as a
function of that of the first kinase in the cascade was investigated.
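The response function of the reduced simple futile cycle can be illustrated numerically. The sketch below assumes the standard Michaelis–Menten rate law obtained from scheme (7), with phosphorylation rate a1 Etot [X]/(Km,1 + [X]) and dephosphorylation rate a2 Ftot [XP]/(Km,2 + [XP]). All parameter values are hypothetical, and bisection replaces the explicit quadratic formula of [16].

```python
def net_rate(xp, a1, a2, e_tot, f_tot, km1, km2, x_tot):
    """Assumed net production rate of XP in the Michaelis-Menten limit."""
    x = x_tot - xp  # conservation law for the substrate
    return a1 * e_tot * x / (km1 + x) - a2 * f_tot * xp / (km2 + xp)

def steady_state(a1, a2, e_tot, f_tot, km1, km2, x_tot, iterations=100):
    """Unique zero of net_rate on [0, x_tot], located by bisection.

    The rate is positive at xp = 0, negative at xp = x_tot and strictly
    decreasing in between, so the bracket always contains exactly one zero.
    """
    lo, hi = 0.0, x_tot
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if net_rate(mid, a1, a2, e_tot, f_tot, km1, km2, x_tot) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def response(e_tot, km=0.01):
    """Steady-state [XP] as a function of Etot (other parameters fixed)."""
    return steady_state(1.0, 1.0, e_tot, 1.0, km, km, 1.0)

for e_tot in (0.9, 1.0, 1.1):
    print(e_tot, response(e_tot))
```

With Michaelis constants that are small compared to Xtot, a change of ten percent in Etot on either side of the balanced value moves the steady state from a mostly unphosphorylated to a mostly phosphorylated substrate. Repeating the experiment with a larger value of km, for example response(e_tot, km=1.0), gives a much more gradual response.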
3 The Dual Futile Cycle
This section is concerned with the case n = 2 of the multiple futile cycle. The second
layer of the MAPK cascade Eq. (1) is an example of a dual futile cycle. We omit the
index indicating the layer in this section, since we consider a single layer of the
cascade. The basic system with mass action kinetics can be found, for instance, in
[43]. It is possible to do a Michaelis–Menten reduction in a similar way to that done
in the last section. We recall that this consists in scaling the concentrations of the
species containing one of the two enzymes E, F, alone or in a complex, via the
transformation y = εỹ of their concentration vector y, as well as the time by τ = εt.
In the limit ε → 0 the system can be reduced to a lower dimensional ODE by solving an algebraic
equation. This leads, after using all conservation laws to eliminate as many variables
as possible, to a two-dimensional system, which will be called the MM system (for
Michaelis–Menten). More details on this can be found in [19]. The system can be
written in the form
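\[
\frac{d}{dt}[X] = -\frac{k_1 K_{m,1}^{-1} E_{\mathrm{tot}} [X]}{1 + K_{m,1}^{-1}[X] + K_{m,3}^{-1}[XP]}
 + \frac{k_2 K_{m,2}^{-1} F_{\mathrm{tot}} [XP]}{1 + K_{m,2}^{-1}[XP] + K_{m,4}^{-1}[XPP]}, \tag{9}
\]
\[
\frac{d}{dt}[XPP] = \frac{k_3 K_{m,3}^{-1} E_{\mathrm{tot}} [XP]}{1 + K_{m,1}^{-1}[X] + K_{m,3}^{-1}[XP]}
 - \frac{k_4 K_{m,4}^{-1} F_{\mathrm{tot}} [XPP]}{1 + K_{m,2}^{-1}[XP] + K_{m,4}^{-1}[XPP]}. \tag{10}
\]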
When using the conservation law for Xtot we have the choice, which of the three
concentrations [X], [XP] or [XPP] to eliminate. Here the equation for [XP] has been
discarded and on the right hand side of the equations [XP] should be regarded as
an abbreviation for Xtot − [X] − [XPP]. In this case the MM system is monotone,
as defined in the previous section. Since it is two-dimensional this implies that all
solutions converge to steady states. Moreover it can be shown using GSPT that for
ε small but non-zero almost all initial data give rise to solutions which converge
to steady states [44]. For general ε it was not known until very recently whether
the corresponding statement was true. In [10] the authors used computer-assisted
methods to find periodic solutions of this system, indicating that the statement is false
for general ε. They do not only use dynamical simulations but also use computer
algebra to help implement a theoretical approach to finding Hopf bifurcations. They
do not obtain evidence for the existence of stable periodic solutions, so that it would be
consistent with their findings if periodic solutions, while present, were only relevant
for exceptional initial conditions. Despite the results of [10] the global behaviour of
solutions of the dual futile cycle is much less well understood than that in the case
of the simple futile cycle.
What has been proved is that multistationarity (existence of more than one steady
state) occurs in the dual futile cycle for certain values of the parameters [43]. In fact it
is known that there exist up to three steady states for given values of the total amounts
and that there are never more than three. The proof of the existence result can be
split into two steps. In the first step the equations for steady states are partly solved
explicitly. This leaves a system of two equations for two unknowns. The second step
involves taking a limit of these equations as a parameter ε tends to zero. This limit is
essentially the Michaelis–Menten limit discussed above. Since we are dealing with
steady states the factor ε in the equation for y plays no role and the limit is regular.
In the limit a single equation for one variable is obtained and this is relatively easy
to analyze. It is possible to find cases where there are three solutions of the equation
F(x) = 0 for steady states of the equation with ε = 0, all of which satisfy dF/dx ≠ 0.
This implies, using the implicit function theorem, that corresponding solutions exist
for ε small but positive. This argument gives no information on the important issue
of stability of these solutions. A steady state x0 will only be observed in practice if
it is stable. This means that if a solution starts close enough to x0 it will stay close
to x0 for all future times. For example if the linearization at the equilibrium x0 has
only eigenvalues with strictly negative real parts, then the equilibrium x0 is stable.
On the level of simulations multistationarity was already observed for the system
with mass action kinetics in [29] and it was found that two of the three steady states
are stable. Simulations in [31] indicated that these features are already present in
the MM system. On the other hand until recently there was no mathematical proof
of bistability for the dual futile cycle. A strategy suggested by what has been said
up to now for obtaining such a proof is to first prove bistability for the MM system
and then use GSPT to conclude the corresponding result for the mass action system.
This strategy was carried out in [19] and in that paper we proved bistability. We now
sketch the main lines of the proof.
In the previous section, we saw that the MM reduction is the singular limit as
ε → 0 of the system on the slow time scale Eq. (5), where g(x, ỹ) = 0 ⇔ ỹ = h(x).
Now consider the fast time scale, i.e. the same equations expressed in the original
time t, augmented by the trivial evolution of the parameter ε. This system is called
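the extended system.
\[
\begin{cases}
\dot{x} = \varepsilon f(x, \tilde{y}, \varepsilon)\\
\dot{\tilde{y}} = g(x, \tilde{y}, \varepsilon)\\
\dot{\varepsilon} = 0
\end{cases}
\tag{11}
\]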
The curve {(x, ỹ = h(x), ε = 0)} is a curve of equilibria in the extended system.
The linearization of the extended system Eq. (11) at such an equilibrium admits
a double zero eigenvalue. One corresponding eigenvector is (0, 0, 1), pointing in the
direction of ε orthogonally to the (x, y)-plane, and a second is a vector in the
(x, y)-plane tangent to the curve {(x, h(x))}. If the linearization Dy g(x, h(x), 0) admits
only eigenvalues with strictly negative real parts, then the centre eigenspace (i.e. the
eigenspace spanned by the eigenvectors associated to purely imaginary eigenvalues)
is exactly two dimensional. The eigenspace is tangential to a manifold called the
centre manifold: see [40] for details about centre manifold theory. The (local) centre
manifold M c (x, h(x), 0) of such an equilibrium is an invariant manifold containing
all bounded solutions sufficiently near the equilibrium. The centre manifold M c can
be written as a graph of a function Ψ over the centre eigenspace:
\[
M^c(x, h(x), 0) = \{(x, \Psi(x, \varepsilon), \varepsilon),\ \varepsilon \text{ small}\}, \tag{12}
\]
with Ψ (x, 0) = h(x). The manifold M c is tangent to the centre eigenspace of the
extended system at each equilibrium (x, h(x), 0). Because of the equation ε̇ = 0, the
invariant two-dimensional centre manifold M c is foliated by invariant 1-dimensional
leaves ε = constant. If the leaf at ε = 0 consists of hyperbolic equilibria (i.e. the
linearization there admits no purely imaginary eigenvalues) connected by heteroclinic
orbits as depicted in Fig. 1, then this dynamics is preserved in the leaves at ε
small. Furthermore, the linearization theorem of Shoshitaishvili (see [26], Theorem
5.4) tells us that, if Dy g(x, h(x)) has only stable eigenvalues, the center manifold M c
is attracting. Hence hyperbolic equilibria that are stable in the MM-reduced system
Eq. (6) give rise to stable equilibria in the extended system. Undoing the scaling
y = εỹ provides us with stable equilibria for the original mass-action system Eq. (4).
Fig. 1 Sketch of the phase portrait of the extended system with a curve of equilibria
{y = h(x) ≡ 0}. The one dimensional dynamics of the MM-reduced system consists
of one unstable equilibrium connected via heteroclinic orbits to two stable equilibria.
This dynamics persists on the invariant ε-leaves for small ε
In order to apply GSPT in this context the main property to be checked concerns
the eigenvalues of the matrix A = Dy g(x, h(x)). By this we mean the matrix of partial
derivatives of the function g with respect to the variables y at a point (x, h(x)). The
condition to be checked for the center manifold M c to be attracting is that the real
parts of these eigenvalues are negative. Fortunately in this case the calculation can
be reduced to one for the eigenvalues of two by two matrices, which is relatively
simple.
The remaining part of the argument is to prove bistability for the MM system.
When this has been done and if the stable steady states are hyperbolic (i.e. the lin-
earization of the system at those points has no eigenvalues with zero real parts) then
GSPT tells us that these steady states continue to exist and be stable for ε small but
non-zero. For more details we refer to [19]. Bistability for the MM system is proved
using bifurcation theory and now some more information will be given concern-
ing that technique. Consider a system of ordinary differential equations ẋ = f (x, α)
and a steady state (x0 , α0 ), i.e. f (x0 , α0 ) = 0. Here α denotes one or more parame-
ters. The linearization of the system at (x0 , α0 ) is the matrix of partial derivatives
A = Dx f (x0 , α0 ). If no eigenvalue of A has zero real part the equilibrium x0 is said to
be hyperbolic. Then for α close to α0 there is a unique solution of f (x∗ (α), α) = 0
with x∗ close to x0 by the implicit function theorem. The stability properties of the
solution are preserved, and the dynamics nearby is qualitatively the same as the dynamics
of the linearized system by the Hartman–Grobman Theorem (see [17], Theorem 1.3.1). For
instance if x0 is stable then x∗ (α) is stable. If, on the other hand, A has an eigenvalue
whose real part is zero then (x0 , α0 ) is said to be a bifurcation point and the
qualitative dynamics of the system may change at that parameter value. For instance
as the parameter is varied one steady state may split into several. In other words,
new branches of equilibria may come into being at the critical parameter value α0 .
Identifying a suitable bifurcation is a way of proving that several steady states exist
for certain parameter values.
For example, consider a system ẋ = f (x, α) with one unknown x depending on
two parameters, α = (α1 , α2 ). Denote by a prime the partial derivative of a function
of x and α with respect to x. Suppose that f (0, 0) = 0, f ′(0, 0) = 0, f ″(0, 0) = 0 and
f ‴(0, 0) ≠ 0 and that an additional quantity depending on derivatives with respect
to the parameters is non-zero. The eigenvalue of the linearization A that crosses zero
depends on the two parameters. In this example there is only one eigenvalue of the
Jacobian, which is f ′(0, 0) = 0. When the above mentioned quantity is non-zero,
it guarantees that the crossing happens at a non-zero velocity with respect to the
parameters and transversally to the imaginary axis. This is called a transversality
condition. See [26] for details. These are the defining properties of what is called a
generic cusp bifurcation. There is a surface of equilibria over the two dimensional
parameter space which develops a fold at (0, 0). In a cusp region of the parameter
space, three equilibria coexist: two stable ones and an unstable one. See Fig. 2. Then
there are parameter values near zero for which the system has three steady states
close to zero. The case of relevance for the examples considered in this paper is that
where f ‴(0, 0) < 0 and from now on we will only discuss that case. There two of the
steady states close to zero are stable and one unstable. Now suppose there are several
variables xi and that at the point (0, 0) the derivative A = Dx f has a zero eigenvalue
of multiplicity one. The kernel of A is of dimension one. The qualitative behaviour
of solutions near the steady state is determined by the restriction of the dynamics
to a curve, the one dimensional centre manifold, which is tangent to the kernel of A
and invariant under the flow. While the local stable and unstable manifolds contain
all initial conditions for solutions that converge exponentially to the equilibrium in
forward or backward time direction respectively, the center manifold contains local
bounded dynamics that depends heavily on the nonlinear terms. In this way the
general case may be reduced to the one-dimensional case already discussed when
the kernel of A is of dimension one. In fact the dynamics in the stable and unstable
directions corresponding to eigenvalues with nonzero real parts do not change. These
are the main techniques used in the proof of bistability in [19].
Fig. 2 Bifurcation diagram for a generic cusp bifurcation: the unstable branch of the
surface of equilibria is shaded, as well as the region in the parameter plane where two
stable and one unstable equilibria coexist (the state x is plotted over the parameter
plane with coordinates α1 and α2)
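The scalar case can be made concrete with a standard model example. The sketch below is not taken from [19] or [26]: it uses the polynomial f (x, α) = α1 + α2 x − x³, a hypothetical choice which satisfies f = f ′ = f ″ = 0 and f ‴ < 0 at the origin, computes its equilibria for given parameters and classifies them by the sign of f ′.

```python
import math

def f(x, a1, a2):
    """Model right-hand side with negative cubic coefficient (assumed example)."""
    return a1 + a2 * x - x ** 3

def equilibria(a1, a2):
    """Sorted real zeros of f, away from the boundary of the cusp region."""
    if 4 * a2 ** 3 > 27 * a1 ** 2:
        # Three real zeros of x**3 + p*x + q with p = -a2 < 0 and q = -a1
        # (trigonometric solution of a depressed cubic).
        p, q = -a2, -a1
        r = 2 * math.sqrt(-p / 3)
        phi = math.acos(3 * q / (2 * p) * math.sqrt(-3 / p)) / 3
        return sorted(r * math.cos(phi - 2 * math.pi * k / 3) for k in range(3))
    # Outside the cusp region f has a single simple zero; f is positive to
    # the left of it and negative to the right, so bisection applies.
    bound = 1 + abs(a1) + abs(a2)
    lo, hi = -bound, bound
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid, a1, a2) > 0:
            lo = mid
        else:
            hi = mid
    return [0.5 * (lo + hi)]

def is_stable(x, a2):
    """An equilibrium of a scalar equation is stable if f' < 0 there."""
    return a2 - 3 * x ** 2 < 0

inside = equilibria(0.0, 1.0)   # parameters inside the cusp region
outside = equilibria(1.0, 1.0)  # parameters outside the cusp region
print(inside, [is_stable(x, 1.0) for x in inside])
print(outside, [is_stable(x, 1.0) for x in outside])
```

For this example the cusp region is 4α2³ > 27α1². Inside it the two outer equilibria are stable and the middle one is unstable, while outside it a single stable equilibrium remains, in line with the picture described above for the case f ‴(0, 0) < 0.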
It is clear that in order to linearize about a steady state we first have to have that
steady state. It is not too difficult to find steady states since there are many parameters
in the problem which can be varied. In [19] steady states were considered for which
the concentrations of X and XPP are equal, since this simplifies the algebra. Moreover
it was assumed that all Michaelis constants have a common value K. In this case the
relation
\[
\left(\frac{E_{\mathrm{tot}}}{F_{\mathrm{tot}}}\right)^{2} = \frac{k_2 k_4}{k_1 k_3} \tag{13}
\]
holds. A bifurcation point was found under the restriction that q² = (k1 k4 )/(k2 k3 ) < 1.
The bifurcation occurs when Xtot /K = (2 + q)/(1 − q).
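These conditions can be checked numerically. The sketch below is an illustration and not the argument of [19]. It assumes that the MM system has the competitive form in which the kinase is shared by X and XP and the phosphatase by XP and XPP, with a common Michaelis constant, and it uses hypothetical rate constants with q = 1/2. It evaluates the determinant of the linearization at the steady state with [X] = [XPP] below, at and above the critical value of Xtot, and then integrates the system from two initial states above that value.

```python
import math

def mm_rhs(state, k, e_tot, f_tot, x_tot, km):
    """Assumed MM system for ([X], [XPP]) with common Michaelis constant km."""
    x, xpp = state
    k1, k2, k3, k4 = k
    xp = x_tot - x - xpp
    kinase = 1 + (x + xp) / km         # E is shared by X and XP
    phosphatase = 1 + (xp + xpp) / km  # F is shared by XP and XPP
    dx = (-k1 * e_tot * x / kinase + k2 * f_tot * xp / phosphatase) / km
    dxpp = (k3 * e_tot * xp / kinase - k4 * f_tot * xpp / phosphatase) / km
    return (dx, dxpp)

def jacobian_det(state, args, h=1e-6):
    """Determinant of the linearization, estimated by central differences."""
    cols = []
    for i in range(2):
        up, down = list(state), list(state)
        up[i] += h
        down[i] -= h
        f_up, f_down = mm_rhs(up, *args), mm_rhs(down, *args)
        cols.append([(a - b) / (2 * h) for a, b in zip(f_up, f_down)])
    return cols[0][0] * cols[1][1] - cols[0][1] * cols[1][0]

def integrate(state, args, dt=0.05, steps=8000):
    """Fixed-step fourth-order Runge-Kutta integration."""
    for _ in range(steps):
        a = mm_rhs(state, *args)
        b = mm_rhs([s + 0.5 * dt * v for s, v in zip(state, a)], *args)
        c = mm_rhs([s + 0.5 * dt * v for s, v in zip(state, b)], *args)
        d = mm_rhs([s + dt * v for s, v in zip(state, c)], *args)
        state = [s + dt * (p + 2 * q + 2 * r + w) / 6
                 for s, p, q, r, w in zip(state, a, b, c, d)]
    return state

K_RATES = (1.0, 2.0, 2.0, 1.0)  # hypothetical k1, ..., k4
KM, F_TOT = 1.0, 1.0            # hypothetical common K and Ftot
# Etot chosen so that relation (13) holds.
E_TOT = F_TOT * math.sqrt(K_RATES[1] * K_RATES[3] / (K_RATES[0] * K_RATES[2]))
Q = math.sqrt(K_RATES[0] * K_RATES[3] / (K_RATES[1] * K_RATES[2]))
THRESHOLD = KM * (2 + Q) / (1 - Q)  # critical value of Xtot

def symmetric_det(x_tot):
    """Determinant at the steady state [X] = [XPP], where [XP] = q [X]."""
    x = x_tot / (2 + Q)
    return jacobian_det((x, x), (K_RATES, E_TOT, F_TOT, x_tot, KM))

below, critical, above = symmetric_det(4.0), symmetric_det(THRESHOLD), symmetric_det(10.0)
ARGS = (K_RATES, E_TOT, F_TOT, 10.0, KM)
mostly_x = integrate([9.0, 0.5], ARGS)
mostly_xpp = integrate([0.5, 9.0], ARGS)
print(below, critical, above)
print(mostly_x, mostly_xpp)
```

Under these assumptions the determinant changes sign at Xtot = 5K, so the symmetric steady state is a saddle for larger total substrate. For Xtot = 10 the two runs settle on two different steady states, which for this parameter choice can be computed by hand as ([X], [XPP]) = ((9 ± √65)/2, (9 ∓ √65)/2) with [XP] = 1.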
It is important to note that here Michaelis–Menten reduction was carried out for
the whole system and not for the two phosphorylation steps separately. The latter
alternative leads to a different set of equations. It was used in [24], where the effect of
embedding a MAPK cascade in a negative feedback loop was investigated. Consider,
for instance, the dual futile cycle which is the third layer of the cascade in [24] and
the equation for the unphosphorylated protein. It is of the form
\[
\frac{d}{dt}[X] = -\frac{k_7 E_{\mathrm{tot}} [X]}{K_7 + [X]} + \frac{k_7 F_{\mathrm{tot}} [XP]}{K_{10} + [XP]} \tag{14}
\]
which is clearly different from the corresponding equation in the MM system intro-
duced above, even when all Michaelis constants are taken to be equal (which is done
in the simulations of [24]).
The results just discussed only give limited information on the region of parameter
space in which bistability occurs. In contrast, in [7] rigorous quantitative results on
multistationarity were proved. Rather general conditions on the reaction constants
were exhibited for which there are one or three steady states.
Bistability is important for the property of being a ‘good switch’: consider our
system as an input–output relation where the input consists of the values of the
conserved quantities and the output consists of the equilibrium reached by the system
after a certain amount of time. For certain values of the input, the stable equilibrium
reached by the system is unique and, say, on the lower part of the folded surface of
equilibria. When the input enters the cusp region, the output remains on the lower
part of the fold until the cusp region is left: at this point, the output switches to the
upper part of the folded surface. In fact, the same phenomenon of switching is present
when the parameter (input) is one dimensional and there exists an S-shaped curve of
equilibria. See Fig. 3.
Fig. 3 The input–output relation follows the lower stable equilibria until it becomes unstable through a fold (also called saddle-node) bifurcation, then jumps to the upper stable branch. Hence such an input–output relation is a good switch. (Axes: input λ, output x; the curve of equilibria is f(x, λ) = 0; the jump is labelled “switch”.)
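The switching behaviour on an S-shaped curve of equilibria can be reproduced with a toy example. The sketch below is an illustration under stated assumptions, not a model of the dual futile cycle: it uses the hypothetical scalar equation dx/dt = f(x, λ) = λ + x − x³, whose equilibria form an S-shaped curve with folds at λ = ±2/(3√3) ≈ ±0.385. The input λ is swept slowly up and then down, and for each value the state is relaxed to the nearby stable equilibrium.

```python
def relax(x, lam, dt=0.01, steps=5000):
    """Integrate dx/dt = lam + x - x**3 by explicit Euler until it settles."""
    for _ in range(steps):
        x += dt * (lam + x - x**3)
    return x

def sweep(lams, x0):
    """Follow the stable equilibrium while the input lam is changed step by step."""
    branch = {}
    x = x0
    for lam in lams:
        x = relax(x, lam)      # start from the equilibrium of the previous input
        branch[lam] = x
    return branch

lams = [i / 20 for i in range(-20, 21)]       # inputs from -1 to 1
up = sweep(lams, x0=-1.3)                     # start on the lower branch
down = sweep(list(reversed(lams)), x0=1.3)    # start on the upper branch

for lam in (-0.5, 0.0, 0.5):
    print(lam, round(up[lam], 3), round(down[lam], 3))
```

For inputs between the two folds the output depends on the direction of the sweep: the upward sweep stays on the lower branch until the fold is passed and then jumps, while the downward sweep stays on the upper branch.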
Sometimes when modelling phosphorylation systems the enzyme concentrations

are not included explicitly and instead mass action kinetics is used for the substrates
alone. If this is done for the cycle with two phosphorylations, or indeed for a cycle
with any number of phosphorylations, a dynamical system is obtained which has a
unique stationary solution to which all other solutions converge. This is because it
can be shown that the system is a weakly reversible system of deficiency zero and
the Deficiency Zero Theorem of chemical reaction network theory (see [11]) can be
applied. This type of argument was used to prove corresponding results for the kinetic
proofreading model of T cell activation in [38] and for the multiple phosphorylation
of the transcription factor NFAT in [34]. These systems only describe small parts
of the network involved in T cell activation, which also contains a MAPK cascade.
A comprehensive model of this phenomenon presented in [1] is too large to be
accessible by direct analytical investigation. Interestingly a very much simplified
version of this model introduced in [13] reproduces some of the key features of T
cell activation such as specificity, speed and sensitivity. In the simplified model the
MAPK cascade is represented by a simple response function.
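The deficiency computation behind the statement on substrate-only mass action models is short enough to sketch. The deficiency of a network is δ = n − l − s, where n is the number of complexes, l the number of linkage classes and s the rank of the stoichiometric matrix. The code below is an illustrative sketch with hypothetical helper names; it treats the chain X0 ⇄ X1 ⇄ ... ⇄ Xn of phosphoforms, in which every complex is a single species.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix given as a list of rows, by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def linkage_classes(n_complexes, reactions):
    """Number of connected components of the reaction graph (union-find)."""
    parent = list(range(n_complexes))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for source, target in reactions:
        parent[find(source)] = find(target)
    return len({find(i) for i in range(n_complexes)})

def deficiency(complexes, reactions):
    """delta = (#complexes) - (#linkage classes) - rank(stoichiometric matrix)."""
    vectors = [[b - a for a, b in zip(complexes[s], complexes[t])]
               for s, t in reactions]
    return len(complexes) - linkage_classes(len(complexes), reactions) - rank(vectors)

def phosphorylation_chain(n):
    """Complexes and reactions of X0 <-> X1 <-> ... <-> Xn (substrates only)."""
    complexes = [[1 if j == i else 0 for j in range(n + 1)] for i in range(n + 1)]
    reactions = []
    for i in range(n):
        reactions += [(i, i + 1), (i + 1, i)]
    return complexes, reactions

print(deficiency(*phosphorylation_chain(2)))   # chain with two phosphorylations
```

Since every reaction in the chain is reversible, the network is in particular weakly reversible, so both hypotheses of the Deficiency Zero Theorem hold for this toy network.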
4 The MAPK Cascade
The starting point of this section is the model of the MAPK cascade introduced in [21].
Simulations in [33] revealed the presence of bistability and sustained oscillations in
this model. Mathematically the latter correspond to periodic solutions, i.e. solutions
which satisfy x(t + T ) = x(t) for some time interval T but are not steady states.
In [33] similar results were obtained for the truncated cascade consisting of just the
first two layers. Michaelis–Menten reduction for the MAPK cascade is made difficult
by the fact that there is not a clear division between substrates and enzymes. This
issue was studied in [41]. A further development of these ideas in [42] indicated
that periodic solutions already occur in the Michaelis–Menten limit. It was shown in
[19] that a small modification of these ideas allows a Michaelis–Menten limit of the
equations for the truncated cascade to be defined which is well-behaved in the sense
of GSPT. Since the truncated system contains a species, X1 P, which is a product in
the first layer and an enzyme in the second layer, the ε-scaling of the MM-reduction
has to be carried out using two different powers of ε. We first define a new variable
$X_1$, replacing $[X_1P]$, as follows:
$$X_1 := [X_1P] + [X_2 \cdot X_1P] + [X_2P \cdot X_1P].$$
The concentration vector is split into three vectors $v_0$, $v_1$, $v_2$. The vector $v_2$ is
the vector of concentrations of species containing the enzymes of the first layer
of the cascade, alone or in a complex, i.e. $v_2 = ([E_1], [F_1], [X_1 \cdot E_1], [X_1P \cdot F_1])$.
This vector is rescaled by $\tilde v_2 = \varepsilon^2 v_2$. The vector $v_1$ is the vector of
concentrations of species containing $X_1$ or the enzymes $X_1P = E_2$ and $F_2$ of the second
layer of the cascade, alone or in a complex, i.e.
$v_1 = (X_1, [X_1], [X_1P], [E_2], [X_2 \cdot X_1P], [X_2P \cdot X_1P], [X_2P \cdot F_2], [X_2PP \cdot F_2])$.
This vector is rescaled by $\tilde v_1 = \varepsilon v_1$.
Finally, the vector $v_0$ contains the concentrations of the remaining species, i.e.
$v_0 = ([X_2], [X_2P], [X_2PP])$, and is not rescaled. Furthermore the reaction constants
of the first layer are also rescaled by ε. A slow time variable τ = εt is introduced: the
time derivative w.r.t. t is denoted by an upper dot while the time derivative w.r.t. τ is
denoted by a prime. Using conserved quantities to reduce the dimension of the concentration
vectors and the new variable $X_1$, we get a system of the form
$$\begin{aligned} x' &= f(x, y, z),\\ \varepsilon y' &= g(x, y, z), \end{aligned} \tag{15}$$
where $x = (X_1, [X_2], [X_2PP])$ and
$y = ([X_1 \cdot E_1], [X_1P \cdot F_1], [X_2 \cdot X_1P], [X_2P \cdot X_1P], [X_2P \cdot F_2], [X_2PP \cdot F_2])$.
The limit ε → 0 of this system allows a MM-reduction
that is well-behaved in terms of GSPT. For more details see [19, 20]. This result was
extended to the full cascade in [20].
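The slow–fast structure of (15) can be illustrated on the simplest possible case. The sketch below is not the truncated cascade: it assumes the classical dimensionless single-enzyme system s' = −s + (s + κ − λ)c, εc' = s − (s + κ)c, where s is the substrate, c the complex and κ, λ are hypothetical positive constants. In the limit ε → 0 the fast variable sits on the critical manifold c = s/(s + κ) and the slow variable obeys the Michaelis–Menten equation s' = −λs/(s + κ). The code integrates both systems with explicit Euler steps and compares them.

```python
def full_system(eps, kappa=1.0, lam=0.5, s0=1.0, tau_end=1.0, dt=1e-4):
    """Explicit Euler for s' = -s + (s + kappa - lam) c, eps c' = s - (s + kappa) c."""
    s, c = s0, 0.0
    for _ in range(round(tau_end / dt)):
        ds = -s + (s + kappa - lam) * c
        dc = (s - (s + kappa) * c) / eps
        s, c = s + dt * ds, c + dt * dc
    return s, c

def reduced_system(kappa=1.0, lam=0.5, s0=1.0, tau_end=1.0, dt=1e-4):
    """Explicit Euler for the eps -> 0 limit: c = s/(s + kappa), s' = -lam s/(s + kappa)."""
    s = s0
    for _ in range(round(tau_end / dt)):
        s += dt * (-lam * s / (s + kappa))
    return s, s / (s + kappa)

s_red, c_red = reduced_system()
for eps in (1e-2, 1e-3):
    s_full, c_full = full_system(eps)
    print(eps, abs(s_full - s_red), abs(c_full - c_red))
```

The step size is chosen small compared with ε so that the explicit scheme remains stable in the fast direction. For smaller ε a stiff integrator would be the more natural choice.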
The facts just listed indicate that the strategy used to prove bistability in the dual
futile cycle might also be used to prove the existence of sustained oscillations in the
truncated MAPK cascade, i.e. layers 1 and 2 of cascade Eq. (1). Here the relevant type
of bifurcation is a Hopf bifurcation where the linearization at an equilibrium admits
a pair of imaginary eigenvalues for a critical parameter. By a classical theorem of
Hopf, if under variation of a parameter a pair of complex conjugate eigenvalues of
the linearization passes through the imaginary axis (away from zero) with non-zero
velocity then there exist periodic solutions for at least some parameter values. See
[17, 26] for details. In [20] it was proved that Hopf bifurcations occur in the MM
system for the MAPK cascade. As a consequence periodic solutions occur. If an
additional genericity condition (hyperbolicity of the periodic orbit) were satisfied
then it could further be concluded using GSPT that periodic solutions also occur in
the mass action system for the truncated MAPK cascade. In fact it has not yet been
possible to prove hyperbolicity. Instead it was proved that the Hopf bifurcation itself
can be lifted to the mass action system and this then gives the existence of periodic
solutions of that system. Unfortunately these arguments give no information on the
stability of the periodic solutions involved. It is interesting to look at this situation
in the light of results on feedback loops. It has been proved that a system can only
admit a stable periodic solution if it includes a directed negative feedback loop [4,
35]. It can easily be checked directly that this condition holds in the case of the MM
system for the truncated MAPK cascade.
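The mechanism of the Hopf theorem can be seen in a standard planar toy system. The sketch below is purely illustrative and unrelated to the specific equations of the cascade: it assumes the system x' = μx − y − x(x² + y²), y' = x + μy − y(x² + y²), whose linearization at the origin has eigenvalues μ ± i. These cross the imaginary axis with non-zero velocity at μ = 0. The code integrates the system with a classical Runge–Kutta scheme and reports the distance from the origin at the final time.

```python
import math

def field(x, y, mu):
    """Planar toy vector field with a Hopf bifurcation at mu = 0."""
    r2 = x * x + y * y
    return mu * x - y - x * r2, x + mu * y - y * r2

def final_radius(mu, x=0.1, y=0.0, dt=0.01, t_end=100.0):
    """Integrate with classical RK4 and return the distance from the origin."""
    for _ in range(round(t_end / dt)):
        k1 = field(x, y, mu)
        k2 = field(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1], mu)
        k3 = field(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1], mu)
        k4 = field(x + dt * k3[0], y + dt * k3[1], mu)
        x += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        y += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return math.hypot(x, y)

for mu in (-0.25, 0.25):
    print(mu, final_radius(mu))
```

In polar coordinates the radius satisfies r' = μr − r³, so for μ > 0 solutions approach a periodic orbit of radius √μ, while for μ < 0 they decay to the origin. In this toy system the periodic orbit happens to be stable; as noted above, no such information is available from the arguments used for the MAPK cascade.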
The results for the truncated cascade imply analogous results for the full cascade
by another application of GSPT. What must be shown is that the truncated cascade
can be represented as a limit of the full cascade which is well-behaved in the sense
of GSPT. Consider the MM system for the full cascade. Let Z be the concentration
of a protein in the first or second layer and define a new variable by Z̃ = ε−1 Z. Let
ci be any of the rate constants in the third layer and define c̃i = εci . The transformed
system has a limit for ε → 0 which is well-behaved in the sense of GSPT and the
limiting system is the MM system for the truncated cascade. Thus in the context
of the MM system the Hopf bifurcation can be lifted from the truncated to the full
cascade. It can then be further lifted from the MM system for the full cascade to the
mass action system.
In [32] an in vitro model of the MAPK cascade was introduced. The substances
involved are modified in such a way that certain features of the reaction network are
modified. In the first layer Raf is constitutively active which means that for modelling
this layer can be ignored. In the third layer ERK can only be phosphorylated once, on
tyrosine and not on threonine. (In the wild type system MEK has the unusual property
of being a dual specificity kinase which can phosphorylate both threonine and tyro-
sine.) This leads to a cascade with two layers where the first has two phosphorylation
steps and the second only one:
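$$
\begin{array}{c}
X_2 \;\underset{F_2}{\overset{E_2}{\rightleftarrows}}\; X_2P \;\underset{F_2}{\overset{E_2}{\rightleftarrows}}\; X_2PP \\[1ex]
X_3 \;\underset{F_3}{\overset{E_3}{\rightleftarrows}}\; X_3P
\end{array}
\qquad E_2 = \mathrm{Raf},\quad E_3 = X_2PP \tag{16}
$$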
Here X3 = ERK. This was modelled mathematically in a certain way in [32] and
it was proved that for that system of equations there is a unique steady state and
all other solutions converge to it. If, on the other hand, the system is modelled in
direct analogy to what was done in [21] a system is obtained which might potentially
admit Hopf bifurcations and hence periodic solutions. It was written down in [20] but
attempts to prove the existence of Hopf bifurcations using the methods applied to the
truncated MAPK cascade have not succeeded. Simulations done in [46] indicate that
there may be chaotic behaviour in the MAPK cascade. The approach of the authors
was via (numerical) bifurcation theory. They discovered the presence of fold-Hopf
bifurcations (where the linearization of the system at the bifurcation point has one
zero and a pair of non-zero imaginary eigenvalues) and Hopf-Hopf bifurcations
(where the linearization of the system at the bifurcation point has two pairs of non-
zero imaginary eigenvalues). See [26] for more details on these types of bifurcations.
Bifurcations of these types are often associated with chaos. Simulations for initial
data close to the bifurcation points gave pictures consistent with the presence of
chaotic behaviour.
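The eigenvalue conditions just described can be turned into a small classification routine. The following sketch uses a hypothetical helper `classify` and only looks at the spectrum of the linearization: it counts zero eigenvalues and pairs of non-zero purely imaginary eigenvalues. It does not check the non-degeneracy conditions which a full bifurcation analysis requires.

```python
def classify(eigenvalues, tol=1e-9):
    """Name a candidate bifurcation from the spectrum of the linearization."""
    zero = 0
    imaginary_pairs = 0
    for ev in eigenvalues:
        ev = complex(ev)
        if abs(ev.real) > tol:
            continue                  # hyperbolic direction, not critical
        if abs(ev.imag) <= tol:
            zero += 1                 # zero eigenvalue
        elif ev.imag > 0:
            imaginary_pairs += 1      # count each conjugate pair once
    names = {
        (1, 0): "fold",
        (0, 1): "Hopf",
        (1, 1): "fold-Hopf",
        (0, 2): "Hopf-Hopf",
    }
    return names.get((zero, imaginary_pairs), "other")

print(classify([0, 1j, -1j, -2]))
print(classify([1j, -1j, 3j, -3j, -1]))
```

In practice the eigenvalues would come from a numerical linearization at a computed steady state, and the tolerance `tol` would have to be adapted to the accuracy of that computation.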
In [33] a heuristic explanation for the existence of oscillations in the MAPK cas-
cade was given, involving the embedding of a bistable system in a negative feedback
loop. This point of view played no role in the proof of the existence of periodic solu-
tions using bifurcation theory which has just been discussed. It could, however, in
principle lead to an alternative proof of that result which could also provide informa-
tion on the stability of the periodic solution. A corresponding strategy, which makes
use of the Conley index (see [6]), has been developed and applied to a system related
to that considered in [24]. More details can be found in [3, 14, 15].
5 Embedding the Cascade in Feedback Loops
Given that modelling predicts oscillatory behaviour in the MAPK cascade it is of great
interest to try to observe it experimentally. This has been done in [36]. Oscillations
were found in the concentration of activated (i.e. doubly phosphorylated) ERK which
had a period of about fifteen minutes and lasted up to ten hours. This effect was
monitored by observing the translocation of ERK tagged with GFP between the
cytosol and the nucleus. This is relevant because activated ERK is imported into
the nucleus and inactive ERK is exported into the cytosol. Mathematical models
presented in [36] are more complicated than the basic model of [21] in several ways.
One is that a two-compartment model is used so that transport between cytosol and
nucleus is included. This network is sketched in Fig. 4 below, where Raf = X1 , MEK
= X2, ERK = X3. A second is that ERK and MEK which are not fully phosphorylated
can bind to each other (wavy lines in Fig. 4). This protects ERK from phosphorylation
by MEK and thus represents a kind of negative feedback. Note that the tendency of
this type of feedback to encourage bistability was already pointed out in [27]. A third
way is that a negative feedback due to the repression of SOS by ERK is included (also
shown in Fig. 4). This is important since SOS influences the rate of phosphorylation
of Raf. This last effect is modelled in a simple phenomenological manner. It was
discovered that certain aspects of the experimental data do not fit the mathematical
model with an isolated cascade. In particular this concerns the facts that the
oscillations are found for a large range of total amounts of ERK and that the period
of the oscillations is found to depend only weakly on the total amount of EGF, the
substance being used to stimulate the cascade. It is assumed that the phosphorylation
of Raf is proportional to the amount of EGF. Incorporating the negative feedback loop
via SOS in the model allows the experimental results to be reproduced.
Fig. 4 Two-compartment model with a full MAPK cascade in the cytosol, a lower half of the
MAPK cascade in the nucleus, transport of X3PP from the cytosol to the nucleus and transport of
X3 from the nucleus to the cytosol
Yet another type of negative feedback which may lead to oscillations arises via the
competition of a substrate of ERK with a phosphatase being reduced by increasing
degradation of the activated substrate [28]. In this paper the authors present both
mass action and reduced models and find periodic solutions in simulations.
6 Alternative Phosphorylation Mechanisms
In this paper we have concentrated on distributive phosphorylation. If this is partly

replaced by processive (but still sequential) phosphorylation then this often results
in simpler dynamics. For instance it was proved in [8] that when phosphorylation,
dephosphorylation or both are replaced in the dual futile cycle by their processive
versions, then there is a unique steady state for fixed values of the total amounts.
This follows from the deficiency one algorithm of chemical reaction network theory.
The result was generalized to the analogue of the multiple futile cycle with strictly
processive and sequential phosphorylation in [9]. In addition it was proved that
all other solutions converge to the steady state. Note that other types of (partially)
processive phosphorylation may also be considered [18].
One of the most important roles of oscillations in biology is that they can act
as clocks, for instance those defining circadian rhythms. Many of these clocks are
dependent on translation but a clock has been found in cyanobacteria which
is not. It uses only phosphorylation states of the proteins KaiA, KaiB and KaiC.
This has been demonstrated by reconstructing the clock in vitro [30]. In KaiC the
phosphorylation is cyclic rather than sequential. In other words the first of two sites to
be phosphorylated is also the first to be dephosphorylated. This motivated the study
of oscillations in dual phosphorylation models more general than the usual dual futile
cycle [23]. In that paper the case of unordered phosphorylation was considered, i.e.
that where the two (de-)phosphorylations may take place in any order. Simulations
indicate that this is sufficient to produce sustained oscillations.
While the MAPK cascade is a type of phosphorylation system of central impor-
tance in eukaryotic signalling pathways, signalling pathways in prokaryotes more
often use a different type of phosphorylation system known as a two-component sys-
tem [39]. These are uncommon in eukaryotes and unknown in mammals. The central
mechanism is as follows. The two components are proteins generically called HK
and RR. The protein HK is a histidine kinase which, under appropriate conditions,
phosphorylates itself on a histidine. RR, the response regulator, catalyzes the transfer
of the phosphate group from the histidine of HK to an aspartate in RR. In this way
the RR is activated. HK can also dephosphorylate RR.
What is the advantage of the process of phosphorylation of RR taking place in
two steps rather than directly? It has been suggested that the motivation is that the
two-step process leads to absolute concentration robustness [37]. This means that the
output signal is independent of the total concentrations of the enzymes, so that the
system achieves independence from the stochastic variation in protein levels between
different cells.
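Absolute concentration robustness can be seen in a very small toy network. The sketch below is not a model of a two-component system: it assumes the hypothetical mass action network A + B → 2B (rate constant k1), B → A (rate constant k2), in which the total amount A + B is conserved. At a positive steady state k1[A][B] = k2[B], so [A] = k2/k1 whatever the total amount is.

```python
def steady_state(total, k1=1.0, k2=2.0, b0=0.5, dt=1e-3, t_end=50.0):
    """Relax dB/dt = k1 A B - k2 B with A = total - B by explicit Euler."""
    b = b0
    for _ in range(round(t_end / dt)):
        a = total - b
        b += dt * (k1 * a * b - k2 * b)
    return total - b, b

for total in (5.0, 10.0, 20.0):
    a, b = steady_state(total)
    print(total, a, b)
```

The concentration of A at the steady state is the same for all three totals, while the concentration of B absorbs the variation. This holds in the sketch as long as the total exceeds k2/k1, since otherwise B dies out.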
Bistability has been observed in two-component systems. The conditions for bista-
bility in these systems have been discussed in [22]. This dynamical property has also
been studied in [2]. The case treated there is that of a split kinase. This means that
instead of one kinase HK phosphorylating itself one kinase binds to a second which
then phosphorylates the first. These authors investigated bistability in different mod-
els using both dynamical simulations and the Chemical Reaction Network Toolbox.
The latter is a computer programme which provides positive or negative results on
bistability on the basis of chemical reaction network theory.
Some signalling pathways include phosphorelays in which the phosphate group
is transferred from one species to the next. They also have a cascade structure, in
many cases with four layers. In some layers, a phosphate group can easily be lost
by hydrolysis. Furthermore some species can be bifunctional, i.e. able to give as
well as take a phosphate group to/from the species on the next layer. As for the
MAPK cascade, such phosphorelays can be embedded in transcriptional feedback
loops. See for example [25] where mathematical results have been obtained on the
form of response functions and bistability in certain cases. There are many possible
topologies for these systems and questions arise which are similar to those which
could be posed for the MAPK cascade: how complicated should the architecture
of the network be in order to achieve a ‘good switch’ property of the input–output
relation? We saw that bistability could give an answer. Furthermore the methods
explained in the previous sections could give insight on the dynamics beyond the
steady states (heteroclinic structure, oscillations).
7 Summary and Outlook
The MAPK cascade is a system of chemical reactions occurring as a part of many

signal transduction networks. This cascade on its own has the potential to give rise to
complicated dynamical behaviour such as multistability, sustained oscillations and
even chaos. The positive and negative feedback loops in which the cascade is embed-
ded in real biological systems present even more possibilities for generating this type
of phenomena. One way of trying to obtain deeper insights into the conditions lead-
ing to different types of dynamical behaviour is to carry out mathematically rigorous
investigations of systems of ordinary differential equations modelling the cascade.
In this context it is natural to start by studying small building blocks in detail and
build up from there. In particular we can pass from single layers of the cascade to
the full cascade without feedback and then the full cascade with feedback.
In the first four sections of this paper the results on the MAPK cascade or parts of it
which have been proved rigorously are summarized. In the case of a single phospho-
rylation loop it could be shown that the dynamics are simple: there is only one steady
state and it is globally stable. For a series of several phosphorylation loops it has been
known for some time that multiple stationary solutions are possible and information
is available on their number. More recently it could be proved that for a system with
two loops there are parameter values which give bistability, thus confirming previous
conclusions based on numerical simulations and heuristic considerations. It could

also be proved that for the MAPK cascade (or even for the first two layers of it)
there exist periodic solutions. An introduction is given to some of the mathematical
techniques used to obtain these results, geometric singular perturbation theory and
bifurcation theory.
Cases are pointed out where further progress would be desirable. It could not yet
be proved that the oscillations in the MAPK cascade are stable although simulations
indicate that this is the case. In fact if they were not stable then it would be hard to find
them by simulations. There are also no analytical results available on the presence
of chaos. Obtaining results of this kind would require analyzing bifurcations much
more complicated than those treated up to now. It would also be valuable to obtain
more results which in addition to showing that certain types of behaviour can occur
in a given system also give useful information on the range of parameters for which
it occurs.
The fifth section contains some remarks on the influence of feedback loops on the
dynamics of the cascade. This should be a fruitful field of application for the tech-
niques already developed for the cascade on its own. Interestingly it seems based on
simulations that the range of parameters for which bistability or sustained oscilla-
tions occur is increased by the presence of feedback loops and this may be important
for the question of whether these features are present for biologically reasonable
parameters and if so, whether the ranges of these parameters are large enough to
allow the oscillations to be observed experimentally.
The sixth section discusses some generalizations to other types of phosphorylation
systems. The focus in this paper is on distributive and sequential phosphorylation
and this mirrors a more general tendency in the theoretical literature on this subject.
Replacing distributive by processive phosphorylation in some reactions appears to
lead to a simplification of the dynamics. On the other hand distributive unordered
phosphorylation can lead to oscillations not present in the corresponding sequen-
tial case. Here again there is a lot of potential for further analytical investigations.
Remarks are also made on the relations to signal transduction networks based on
two-component systems.
Phosphorylation systems involved in signal transduction give rise to many chal-
lenges for mathematical analysis which have only started to be addressed. It is to
be hoped that further progress in this direction will be rewarded by a deeper under-
standing of the mechanisms of the biological processes involved.
References
1. Altan-Bonnet, G. and Germain, R. N. 2005 Modelling T cell antigen discrimination based on
feedback control of digital ERK responses. PLoS Biol. 3 (11): e356.
2. Amin, M., Porter, S. L. and Soyer, O. S. 2013 Split histidine kinases enable ultrasensitivity and
bistability in two-component signalling networks. PLoS Comp. Biol. 9 (3): e1002949.
3. Angeli, D., Ferrell, J. E. Jr. and Sontag, E. D. 2004 Detection of multistability, bifurcation and
hysteresis in a large class of biological positive-feedback systems. Proc. Natl. Acad. Sci. USA
101, 1822–1827.
4. Angeli, D., Hirsch, M. and Sontag, E. 2009 Attractors in coherent systems of differential
equations. J. Diff. Eq. 246, 3058–3076.
5. Angeli, D. and Sontag, E. D. 2006 Translation-invariant monotone systems and a global con-
vergence result for enzymatic futile cycles. Nonlin. Anal. RWA 9, 128–140.
6. Conley, C. 1978 Isolated invariant sets and the Morse index. CBMS Regional Conference
Series in Mathematics 38.
7. Conradi, C. and Mincheva, M. 2014 Catalytic constants enable the emergence of bistability in
dual phosphorylation. Royal Society Interface 11, 20140158.
8. Conradi, C., Saez-Rodriguez, J., Gilles, E.-D. and Raisch, J. 2005 Using chemical reaction
network theory to discard a kinetic mechanism hypothesis. IEEE Syst. Biol. 152, 243–248.
9. Conradi, C. and Shiu, A. 2015 A global convergence result for processive multisite phospho-
rylation systems. Bull. Math. Biol. 77, 126–155.
10. Errami, H., Eiswirth, M., Grigoriev, D., Seiler, W. M. and Weber, A. 2015 Detection of Hopf
bifurcations in chemical reaction networks using convex coordinates. J. Comp. Phys. 291,
279–302.
11. Feinberg, M. 1980 Lectures on chemical reaction networks. Available at https://crnt.osu.edu/
lectures-chemical-reaction-networks.
12. Fenichel, N. 1979 Geometric perturbation theory for ordinary differential equations. J. Diff.
Eq. 31, 53–98.
13. François, P., Voisinne, G., Siggia, E. D., Altan-Bonnet, G. and Vergassola, M. 2013 Phenotypic
model for early T cell activation displaying sensitivity, specificity and antagonism. Proc. Natl.
Acad. Sci. USA 110, E888–E897.
14. Gedeon, T. 2010 Oscillations in monotone systems with a negative feedback. SIAM J. Dyn.
Sys. 9, 84–112.
15. Gedeon, T. and Sontag, E. D. 2007 Oscillations in multi-stable monotone systems with slowly
varying feedback. J. Diff. Eq. 239, 273–295.
16. Goldbeter, A. and Koshland, D. E., Jr. 1981 An amplified sensitivity arising from covalent
modification in biological systems. Proc. Natl. Acad. Sci. USA 78, 6840–6844.
17. Guckenheimer, J. and Holmes, P. 1983 Nonlinear oscillations, dynamical systems, and bifur-
cations of vector fields. Springer.
18. Gunawardena, J. 2007 Distributivity and processivity in multisite phosphorylation can be dis-
tinguished through steady-state invariants. Biophys. J. 93, 3828–3834.
19. Hell, J. and Rendall, A. D. 2015 A proof of bistability for the dual futile cycle. Nonlin. Anal.
RWA. 24, 175–189.
20. Hell, J. and Rendall, A. D. 2016 Sustained oscillations in the MAP kinase cascade. Math.
Biosci. 282, 162–173.
21. Huang, C.-Y. F. and Ferrell, J. E., Jr. 1996 Ultrasensitivity in the mitogen-activated protein
kinase cascade. Proc. Natl. Acad. Sci. USA 93, 10078–10083.
22. Igoshin, O. A., Alves, R. and Savageau, M. A. 2008 Hysteretic and graded responses in bacterial
two-component signal transduction. Mol. Microbiol. 68, 1196–1215.
23. Jolley, C. C., Ode, K. L. and Ueda, H. R. 2012 A design principle for a posttranslational
biochemical oscillator. Cell Reports 2, 938–950.
24. Kholodenko, B. N. 2000 Negative feedback and ultrasensitivity can bring about oscillations in
the mitogen activated protein kinase cascades. Eur. J. Biochem. 267, 1583–1588.
25. Kothamachu, V. B., Feliu, E., Wiuf, C., Cardelli, L. and Soyer, O. S. 2013 Phosphorelays provide
tunable signal processing capabilities for the cell. PLoS Comp. Biol. 9 (11): e1003322.
26. Kuznetsov, Y. 1995 Elements of Bifurcation Theory. Springer, Berlin.
27. Legewie, S., Schoeberl, B., Blüthgen, N. and Herzel, H. 2007 Competing docking interactions
can bring about bistability in the MAPK cascade. Biophys. J. 93, 2279–2288.
28. Liu, P., Kevrekidis, I. G. and Shvartsman, S. Y. 2011 Substrate-dependent control of ERK
phosphorylation can lead to oscillations. Biophys. J. 101, 2572–2581.
29. Markevich, N. I., Hoek, J. B. and Kholodenko, B. N. 2004 Signaling switches and bistability
arising from multisite phosphorylation in protein kinase cascades. J. Cell Biol. 164, 353–359.
30. Nakajima, T., Satomi, Y., Kitayama, Y., Terauchi, K., Kiyohara, R., Takao, T. and Kondo, T.
2005 Reconstitution of circadian oscillation of cyanobacterial KaiC phosphorylation in vitro.
Science 308, 414–415.
31. Ortega, F., Garcés, J. L., Mas, F., Kholodenko, B. N. and Cascante, M. 2006 Bistability from
double phosphorylation in signal transduction. FEBS J. 273, 3915–3926.
32. Prabakaran, S., Gunawardena, J. and Sontag, E. D. 2014 Paradoxical results in perturbation-
based network reconstruction. Biophys. J. 106, 2720–2728.
33. Qiao, L., Nachbar, R. B., Kevrekidis, I. G. and Shvartsman, S. Y. 2007 Bistability and
oscillations in the Huang-Ferrell model of MAPK signaling. PLoS Comp. Biol. 3, 1819–1826.
34. Rendall, A. D. 2012 Mathematics of the NFAT signalling pathway. SIAM J. Appl. Dyn. Sys.
11, 988–1006.
35. Richard, A. and Comet, J. P. 2010 Stable periodicity and negative circuits in differential systems.
J. Math. Biol. 63, 593–600.
36. Shankaran, H., Ippolito, D. L., Chrisler, W. B., Resat, H., Bollinger, N., Opresko, L. K. and
Wiley, H. S. 2009 Rapid and sustained nuclear-cytoplasmic ERK oscillations induced by epi-
dermal growth factor. Mol. Sys. Biol. 5, 332.
37. Shinar, G. and Feinberg, M. 2010 Structural sources of robustness in biochemical reaction
networks. Science 327, 1389–1391.
38. Sontag, E. D. 2001 Structure and stability of certain chemical networks and applications to the
kinetic proofreading model of T-cell receptor signal transduction. IEEE Trans. Automat. Contr.
46, 1028–1047.
39. Stock, A. M., Robinson, V. L. and Goudreau, P. N. 2000 Two-component signal transduction.
Ann. Rev. Biochem. 69, 183–215.
40. Vanderbauwhede, A. 1989 Centre Manifolds, Normal Forms and Elementary Bifurcations.
Dynamics Reported 2, 89–169.
41. Ventura, A. C., Sepulchre, J.-A. and Merajver, S. D. 2008 A hidden feedback in signalling cascades
is revealed. PLoS Comp. Biol. 4 (3): e1000041.
42. Ventura, A. C. and Sepulchre, J.-A. 2013 Intrinsic feedbacks in MAPK signalling cascades
lead to bistability and oscillations. Acta Biotheor. 61, 59–78.
43. Wang, L. and Sontag, E. D. 2008 On the number of steady states in a multiple futile cycle. J.
Math. Biol. 57, 29–52.
44. Wang, L. and Sontag, E. D. 2008 Singularly perturbed monotone systems and an application
to double phosphorylation cycles. J. Nonlin. Sci. 18, 527–550.
45. Widmann, C., Gibson, G., Jarpe, M. B. and Johnson, G. L. 1999 Mitogen-activated protein
kinase: conservation of a three-kinase module from yeast to human. Physiol. Rev. 79, 143–
180.
46. Zumsande, M. and Gross, T. 2010 Bifurcations and chaos in the MAPK signalling cascade. J.
Theor. Biol. 265, 481–491.
Numerical Treatment of the Filament-Based
Lamellipodium Model (FBLM)
Angelika Manhart, Dietmar Oelz, Christian Schmeiser and Nikolaos

Sfakianakis
Abstract We describe in this work the numerical treatment of the Filament-Based

Lamellipodium Model (FBLM). This model is a two-phase two-dimensional contin-
uum model, describing the dynamics of two interacting families of locally parallel
F-actin filaments. It includes, among others, the bending stiffness of the filaments,
adhesion to the substrate, and the cross-links connecting the two families. The numer-
ical method proposed is a Finite Element Method (FEM) developed specifically for
the needs of this problem. It is comprised of composite Lagrange–Hermite
two-dimensional elements defined over a two-dimensional space. We present some
elements of the FEM and emphasize the numerical treatment of the more complex
terms. We also present novel numerical simulations and compare to in-vitro experi-
ments of moving cells.
1 Introduction
The lamellipodium is a flat cell protrusion functioning as a motility organelle in

protrusive cell migration [28]. It is a very dynamic structure mainly consisting of a
network of branched actin filaments. These are semi-elastic rods that represent the
polymer form of the protein actin. They are continuously remodeled by polymerization
and depolymerization and therefore undergo treadmilling [2]. Actin associated
cross-linker proteins and myosin motor proteins integrate them into the lamellipodial
meshwork which plays a key role in cell shape stabilization and in cell migration.
Different modes of cell migration result from the interplay of protrusive forces due
to polymerization, actomyosin dependent contractile forces and regulation of cell
adhesion [9].
A. Manhart · C. Schmeiser
Faculty of Mathematics, University of Vienna, Oskar-Morgenstern Platz 1,
1090 Vienna, Austria
e-mail: angelika.manhart@univie.ac.at
C. Schmeiser
e-mail: christian.schmeiser@univie.ac.at
D. Oelz
Courant Institute of Mathematical Sciences, New York University,
251 Mercer Street, New York, NY 10012-1185, USA
e-mail: dietmar@cims.nyu.edu
N. Sfakianakis (corresponding author)
Johannes-Gutenberg University, Staudungerweg 9, 55099 Mainz, Germany
e-mail: sfakiana@uni-mainz.de
The first modeling attempts have resolved the interplay of protrusion at the front
and retraction at the rear in a one-dimensional spatial setting [1, 6]. Two-dimensional
continuum models were developed in order to include the lateral flow of F-actin
along the leading edge of the cell into the quantitative picture. Those models can
explain characteristic shapes of amoeboid cell migration [21, 22] on two-dimensional
surfaces as well as the transition to mesenchymal migration [23].
One of the still unresolved scientific questions concerns the interplay between
macroscopic observables of cell migration and the microstructure of the lamel-
lipodium meshwork. Specialized models have been developed separately from the
continuum approach to track microscopic information on filament directions and
branching structure [8, 11, 24]. However, solving fluid-type models that describe
the whole cytoplasm while retaining some information on the microstructure of the
meshwork has turned out to be challenging. One approach is to formulate hybrid
models [14]; another is to formulate models directly on the computational, discrete
level [13]. Recently, this direct computational approach has even been extended to
the three-dimensional setting by means of a finite element discretization [15].
In an attempt to create a simulation framework that addresses the interplay of
macroscopic features of cell migration and the meshwork structure, the
Filament-Based Lamellipodium Model (FBLM) has been developed. It is a two-dimensional,
two-phase, anisotropic continuum model for the dynamics of the lamellipodium
network which retains key directional information on the filamentous substructure
of this meshwork [20].
The model has been derived from a microscopic description based on the dynamics
and interaction of individual filaments [18], and through recent extensions [12] it
has reached a certain state of maturity. Since the model can be written in the form of
a generalized gradient flow, numerical methods based on optimization techniques
have been developed [19, 20]. Numerical efficiency has been a shortcoming of this
approach. This has led to the development of a Finite Element numerical method,
which is presented in this article alongside simulations of a series of migration assays
(Figs. 1 and 2).
2 Mathematical Modeling
In this section the FBLM will be sketched (see [12] for more detail). The main
unknowns of the model are the positions of the actin filaments in two locally parallel
families (denoted by the superscripts + and −). Each of these families covers a
topological ring with all individual filaments connecting the inner boundary with the
outer boundary. The outer boundaries are the physical leading edge and therefore
identical, whereas the inner boundaries of the two families are artificial and may be
different. Filaments are labeled by α ∈ [0, 2π), where the interval represents a
one-dimensional torus, which means that in the following all functions of α are assumed
periodic with period 2π. The maximal arclength of the filaments in an infinitesimal
element dα of the ±-family at time t is denoted by $L^\pm(\alpha,t)$, and an arclength
parametrization of the filaments is denoted by
$\{\mathbf{F}^\pm(\alpha,s,t) : -L^\pm(\alpha,t) \le s \le 0\} \subset \mathbb{R}^2$,
where the leading edge corresponds to s = 0, i.e.
\[
\left\{\mathbf{F}^+(\alpha,0,t) : 0 \le \alpha < 2\pi\right\}
= \left\{\mathbf{F}^-(\alpha,0,t) : 0 \le \alpha < 2\pi\right\} \qquad \forall\, t ,
\tag{1}
\]
which together with
\[
\left|\partial_s \mathbf{F}^\pm(\alpha,s,t)\right| = 1 \qquad \forall\, (\alpha,s,t) ,
\tag{2}
\]
constitutes constraints for the unknowns $\mathbf{F}^\pm$. The second constraint is
connected to an inextensibility assumption on the filaments, which implies that s can
also be interpreted as a monomer counter along filaments.
Fig. 1 Graphical representation of (3); showing here the lamellipodium Ω(t) “produced” by the
mappings F± and the crossing–filament domain C
Fig. 2 Discretized lamellipodium (left) and lamellipodium fragment (right)
We expect that different filaments of the same family do not intersect each other,
and each plus-filament crosses each minus-filament at most once. The first condition
is guaranteed by det(∂α F± , ∂s F± ) > 0, where the sign indicates that the labeling with
increasing α is in the clockwise direction. The second condition uniquely defines
s ± = s ± (α + , α − , t) such that F+ (α + , s + , t) = F− (α − , s − , t), for all (α + , α − ) ∈
C (t), the set of all pairs of crossing filaments. It has to be noted that the validity of
these properties is not guaranteed. For this and other reasons finite-time breakdown of
the model cannot be excluded, although elements of the model like filament repulsion
(see below) provide a regularization. As a consequence of the above assumptions,
there are coordinate transformations $\psi^\pm : (\alpha^\mp, s^\mp) \to (\alpha^\pm, s^\pm)$ such that
\[
\mathbf{F}^\mp = \mathbf{F}^\pm \circ \psi^\pm .
\]
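In the following, we shall concentrate on one of the two families and skip the
superscripts except that the other family is indicated by the superscript ∗. The heart
of the FBLM is the force balance
\[
\begin{aligned}
0 ={}& \underbrace{\mu^B \partial_s^2\left(\eta\,\partial_s^2 \mathbf{F}\right)}_{\text{bending}}
 - \underbrace{\partial_s\left(\eta\,\lambda_{\rm inext}\,\partial_s \mathbf{F}\right)}_{\text{inextensibility}}
 + \underbrace{\mu^A \eta\, D_t \mathbf{F}}_{\text{adhesion}} \\
&+ \underbrace{\partial_s\left(p(\rho)\,\partial_\alpha \mathbf{F}^\perp\right)
 - \partial_\alpha\left(p(\rho)\,\partial_s \mathbf{F}^\perp\right)}_{\text{pressure}} \\
&\pm \underbrace{\partial_s\left(\eta\eta^* \widehat{\mu}^T (\phi-\phi_0)\,\partial_s \mathbf{F}^\perp\right)}_{\text{twisting}}
 + \underbrace{\eta\eta^* \widehat{\mu}^S \left(D_t \mathbf{F} - D_t^* \mathbf{F}^*\right)}_{\text{stretching}} ,
\end{aligned}
\tag{3}
\]
where the notation F⊥ = (F1 , F2 )⊥ = (−F2 , F1 ) has been used. For fixed s and t,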
the function η(α, s, t) is the number density of filaments of length at least −s at
time t with respect to α. Its dynamics and that of the maximal length L(α, t) will
not be discussed here. It can be modeled by incorporating the effects of polymeriza-
tion, depolymerization, branching, and capping (see [12]). We only note that faster
polymerization (even locally) leads to wider lamellipodia.
The first term on the right hand side of (3) describes the filaments’ resistance
against bending with the stiffness parameter μ B > 0. The second term is a tan-
gential tension force, which arises from incorporating the inextensibility constraint
(2) with the Lagrange multiplier λinext (α, s, t). The third term describes friction of
the filament network with the nonmoving substrate (see [18] for its derivation as
a macroscopic limit of the dynamics of transient elastic adhesion linkages). Since
filaments polymerize at the leading edge with the polymerization speed v(α, t) ≥ 0,
they are continuously pushed into the cell with that speed, and the material derivative
Dt F := ∂t F − v∂s F
is the velocity of the actin material relative to the substrate. For the modeling of v
see [12].
The second line of (3) models a pressure effect caused by Coulomb repulsion
between neighboring filaments of the same family with pressure p(ρ), where the
actin density in physical space is given by
\[
\rho = \frac{\eta}{\left|\det(\partial_\alpha \mathbf{F}, \partial_s \mathbf{F})\right|} .
\tag{4}
\]
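As an illustration of (4), the following minimal sketch evaluates the density on a tensor grid of filament positions. It is not the discretization used in this chapter: the function name `actin_density`, the array layout and the finite-difference stencils (periodic central differences in α, `numpy.gradient` in s) are assumptions made only for this example.

```python
import numpy as np

def actin_density(F, eta, d_alpha, d_s):
    """Evaluate rho = eta / |det(dF/dalpha, dF/ds)| on a tensor grid.

    F   : array (N_alpha, N_s, 2), filament positions, periodic along axis 0
    eta : array (N_alpha, N_s), number density with respect to alpha
    """
    # periodic central differences in the alpha-direction
    dF_da = (np.roll(F, -1, axis=0) - np.roll(F, 1, axis=0)) / (2.0 * d_alpha)
    # second-order interior / one-sided boundary differences in the s-direction
    dF_ds = np.gradient(F, d_s, axis=1)
    det = dF_da[..., 0] * dF_ds[..., 1] - dF_da[..., 1] * dF_ds[..., 0]
    return eta / np.abs(det)
```

For radial filaments on an annulus, F = (R + s)(cos α, sin α), the determinant has modulus R + s, so the sketch should return approximately η/(R + s).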
Finally, the third line of (3) models the interaction between the two families caused
by transient elastic cross-links and/or branch junctions. The first term describes elastic
resistance against changing the angle φ = arccos(∂s F·∂s F∗ ) between filaments away
from the angle φ0 of the equilibrium conformation of the cross-linking molecule. The
last term describes friction between the two families analogously to friction with the
substrate. The friction coefficients have the form
\[
\widehat{\mu}^{T,S} = \mu^{T,S} \left| \frac{\partial \alpha^*}{\partial s} \right| ,
\]
with $\mu^{T,S} > 0$, and the partial derivative refers to the coordinate transformation ψ∗,
which is also used when evaluating partial derivatives of F∗.
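The system (3) is considered subject to the boundary conditions
\[
\begin{aligned}
&-\mu^B \partial_s\left(\eta\,\partial_s^2 \mathbf{F}\right) - p(\rho)\,\partial_\alpha \mathbf{F}^\perp
 + \eta\,\lambda_{\rm inext}\,\partial_s \mathbf{F}
 \mp \eta\eta^* \widehat{\mu}^T (\phi-\phi_0)\,\partial_s \mathbf{F}^\perp \\
&\qquad = \begin{cases}
 \eta \left( f_{\rm tan}(\alpha)\,\partial_s \mathbf{F} + f_{\rm inn}(\alpha)\,\mathbf{V}(\alpha) \right), & \text{for } s = -L , \\
 \pm \lambda_{\rm tether}\,\boldsymbol{\nu}, & \text{for } s = 0 ,
 \end{cases} \\
&\eta\,\partial_s^2 \mathbf{F} = 0 , \qquad \text{for } s = -L,\, 0 .
\end{aligned}
\tag{5}
\]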
The terms in the second line are forces applied to the filament ends. The force in
the direction ν orthogonal to the leading edge at s = 0 arises from the constraint (1)
with the Lagrange parameter λtether . Its biological interpretation is due to tethering
of the filament ends to the leading edge. The forces at the inner boundary s = −L
are models of the contraction effect of actin-myosin interaction in the interior region
(see [12] for details).
3 Numerical Method
Before discretization, the problem for each filament family is transformed to a rec-
tangular domain. For this formulation, a new anisotropic Finite Element (FE) method
is presented, and several implementational issues are discussed.
3.1 Reparametrization
The fact that the maximal filament length varies along the lamellipodium and poten-
tially with time has the consequence that the computational domain B(t) = {(α, s) :
0 ≤ α < 2π , −L(α, t) ≤ s < 0} is non-rectangular. In order to be able to use
tensor product grids, we introduce the coordinate change
(α, s, t) → (α, L(α, t)s, t) ,
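with the new domain (α, s) ∈ B0 := [0, 2π) × [−1, 0). Accordingly, a weak formulation
of the transformed version of (3), (5) is given by
\[
\begin{aligned}
0 ={}& \int_{B_0} \eta \left( \mu^B \partial_s^2 \mathbf{F} \cdot \partial_s^2 \mathbf{G}
 + L^4 \mu^A\, \widetilde{D}_t \mathbf{F} \cdot \mathbf{G}
 + L^2 \lambda_{\rm inext}\, \partial_s \mathbf{F} \cdot \partial_s \mathbf{G} \right) d(\alpha,s) \\
&+ \int_{B_0} \eta\eta^* \left( L^4 \widehat{\mu}^S \left( \widetilde{D}_t \mathbf{F} - \widetilde{D}_t^* \mathbf{F}^* \right) \cdot \mathbf{G}
 \mp L^2 \widehat{\mu}^T (\phi-\phi_0)\, \partial_s \mathbf{F}^\perp \cdot \partial_s \mathbf{G} \right) d(\alpha,s) \\
&- \int_{B_0} p(\rho) \left( L^3\, \partial_\alpha \mathbf{F}^\perp \cdot \partial_s \mathbf{G}
 - \frac{1}{L}\, \partial_s \mathbf{F}^\perp \cdot \partial_\alpha (L^4 \mathbf{G}) \right) d(\alpha,s) \\
&+ \int_0^{2\pi} \eta \left( L^2 f_{\rm tan}\, \partial_s \mathbf{F} + L^3 f_{\rm inn} \mathbf{V} \right) \cdot \mathbf{G}\, \Big|_{s=-1} d\alpha
 \mp \int_0^{2\pi} L^3 \lambda_{\rm tether}\, \boldsymbol{\nu} \cdot \mathbf{G}\, \Big|_{s=0} d\alpha ,
\end{aligned}
\tag{6}
\]
with $\mathbf{F}, \mathbf{G} \in H^1_\alpha((0,2\pi); H^2_s(-1,0))$, with the modified material derivative
\[
\widetilde{D}_t = \partial_t - \left( \frac{v}{L} + \frac{s\,\partial_t L}{L} \right) \partial_s
\]
and with the inextensibility constraint
\[
\left|\partial_s \mathbf{F}(\alpha,s,t)\right| = L(\alpha,t) .
\]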
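The form of the modified material derivative and of the rescaled inextensibility constraint can be checked with the chain rule. In the short derivation below, the symbol $\hat s$ is introduced only for this purpose (it is not notation from the source) and denotes the original arclength variable, so that $\hat s = L(\alpha,t)\,s$:

```latex
% \hat{s}: original arclength, s: rescaled variable, \hat{s} = L(\alpha,t)\, s
\begin{align*}
\partial_{\hat s} &= \frac{1}{L}\,\partial_s , \\
\partial_t\big|_{\hat s}
  &= \partial_t\big|_{s} + \frac{\partial s}{\partial t}\Big|_{\hat s}\,\partial_s
   = \partial_t\big|_{s} - \frac{s\,\partial_t L}{L}\,\partial_s ,
   \qquad \text{since } s = \hat s / L , \\
D_t = \partial_t\big|_{\hat s} - v\,\partial_{\hat s}
  &= \partial_t - \left( \frac{v}{L} + \frac{s\,\partial_t L}{L} \right) \partial_s
   = \widetilde{D}_t , \\
\left|\partial_{\hat s}\mathbf{F}\right| = 1
  \quad &\Longleftrightarrow \quad \left|\partial_s \mathbf{F}\right| = L .
\end{align*}
```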
3.2 The Finite Element Formulation
As previously, we skip the superscripts (±) except for those of the other family that
we indicate by ∗. For $N_\alpha, N_s \in \mathbb{N}$ we define the rectangular grid
\[
\begin{aligned}
\alpha_i &= (i-1)\Delta\alpha , & i &= 1,\dots,N_\alpha+1 , & \Delta\alpha &= \frac{2\pi}{N_\alpha} , \\
s_j &= -1 + (j-1)\Delta s , & j &= 1,\dots,N_s , & \Delta s &= \frac{1}{N_s-1} ,
\end{aligned}
\]
where $\alpha_{N_\alpha+1} = 2\pi$ is identified with $\alpha_1 = 0$. Then the domain
B0 = [0, 2π) × [−1, 0) can be decomposed into rectangular computational cells:
\[
B_0 = \bigcup_{i=1}^{N_\alpha} \bigcup_{j=1}^{N_s-1} C_{i,j} ,
\qquad \text{with } C_{i,j} = [\alpha_i, \alpha_{i+1}) \times [s_j, s_{j+1}) .
\tag{7}
\]
We introduce the conforming Finite Element space
\[
\begin{aligned}
V := \Big\{ \mathbf{F} \in C_\alpha([0,2\pi]; C^1_s([-1,0]))^2 :\ & \mathbf{F}\big|_{C_{i,j}}(\cdot,s) \in \mathbb{P}^1_\alpha ,\
 \mathbf{F}\big|_{C_{i,j}}(\alpha,\cdot) \in \mathbb{P}^3_s \\
 & \text{for } i = 1,\dots,N_\alpha ;\ j = 1,\dots,N_s-1 \Big\} ,
\end{aligned}
\]
of continuous functions, continuously differentiable with respect to s, and on each
computational cell coinciding with a first order polynomial in α for fixed s, and a
third order polynomial in s for fixed α.
For representing the elements of V, we introduce, for $(\alpha,s) \in C_{i,j}$, the shape
functions
\[
\begin{aligned}
L_1^{i,j}(\alpha) &= \frac{\alpha_{i+1}-\alpha}{\Delta\alpha} , &
G_1^{i,j}(s) &= 1 - \frac{3(s-s_j)^2}{\Delta s^2} + \frac{2(s-s_j)^3}{\Delta s^3} , \\
L_2^{i,j}(\alpha) &= 1 - L_1^{i,j}(\alpha) , &
G_2^{i,j}(s) &= s - s_j - \frac{2(s-s_j)^2}{\Delta s} + \frac{(s-s_j)^3}{\Delta s^2} , \\
& & G_3^{i,j}(s) &= 1 - G_1^{i,j}(s) , \\
& & G_4^{i,j}(s) &= -G_2^{i,j}(s_j + s_{j+1} - s) ,
\end{aligned}
\tag{8}
\]
which satisfy
\[
\begin{aligned}
&L_1^{i,j}(\alpha_i) = 1 , && L_1^{i,j}(\alpha_{i+1}) = 0 , \\
&L_2^{i,j}(\alpha_i) = 0 , && L_2^{i,j}(\alpha_{i+1}) = 1 , \\
&G_1^{i,j}(s_j) = 1 , && G_1^{i,j}(s_{j+1}) = 0 , && (G_1^{i,j})'(s_j) = 0 , && (G_1^{i,j})'(s_{j+1}) = 0 , \\
&G_2^{i,j}(s_j) = 0 , && G_2^{i,j}(s_{j+1}) = 0 , && (G_2^{i,j})'(s_j) = 1 , && (G_2^{i,j})'(s_{j+1}) = 0 , \\
&G_3^{i,j}(s_j) = 0 , && G_3^{i,j}(s_{j+1}) = 1 , && (G_3^{i,j})'(s_j) = 0 , && (G_3^{i,j})'(s_{j+1}) = 0 , \\
&G_4^{i,j}(s_j) = 0 , && G_4^{i,j}(s_{j+1}) = 0 , && (G_4^{i,j})'(s_j) = 0 , && (G_4^{i,j})'(s_{j+1}) = 1 ,
\end{aligned}
\tag{9}
\]
and span $\mathbb{P}^1_\alpha$ and $\mathbb{P}^3_s$, respectively. Consequently, we define the composite
Lagrange–Hermite shape functions (see also [3]), for $(\alpha,s) \in C_{i,j}$, by
\[
\begin{aligned}
H_1^{i,j}(\alpha,s) &= L_1^{i,j}(\alpha)\, G_1^{i,j}(s) , & H_5^{i,j}(\alpha,s) &= L_2^{i,j}(\alpha)\, G_1^{i,j}(s) , \\
H_2^{i,j}(\alpha,s) &= L_1^{i,j}(\alpha)\, G_2^{i,j}(s) , & H_6^{i,j}(\alpha,s) &= L_2^{i,j}(\alpha)\, G_2^{i,j}(s) , \\
H_3^{i,j}(\alpha,s) &= L_1^{i,j}(\alpha)\, G_3^{i,j}(s) , & H_7^{i,j}(\alpha,s) &= L_2^{i,j}(\alpha)\, G_3^{i,j}(s) , \\
H_4^{i,j}(\alpha,s) &= L_1^{i,j}(\alpha)\, G_4^{i,j}(s) , & H_8^{i,j}(\alpha,s) &= L_2^{i,j}(\alpha)\, G_4^{i,j}(s) ,
\end{aligned}
\tag{10}
\]
and by $H_k^{i,j}(\alpha,s) = 0$, k = 1, . . . , 8, for $(\alpha,s) \notin C_{i,j}$. Refer to Fig. 3 for a
graphical representation of (10). For a scalar function, there are eight degrees of
freedom on each computational cell, which can be chosen as the function values and
the derivatives with respect to s at the vertices. These degrees of freedom are the
coefficients in a representation in terms of the basis $\{H_1^{i,j}, \dots, H_8^{i,j}\}$.
Fig. 3 Graphical representation of the Lagrange–Hermite shape functions (10); panels (a) $H_1^C$,
(b) $H_2^C$, (c) $H_3^C$, (d) $H_4^C$. Each one of the shape functions attains the value 1 in one
degree of freedom, and 0 on all the rest
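The shape functions (8) and the composite functions (10) are easy to check numerically. The sketch below is an illustration only: the function names are hypothetical, and the closed form used for G4 is the result of expanding −G2(s_j + s_{j+1} − s) by hand, which is an added step rather than a formula from the text.

```python
import numpy as np

def hermite_shape(s, s_j, ds):
    """Cubic Hermite shape functions G_1..G_4 of (8) on [s_j, s_j + ds]."""
    x = s - s_j
    G1 = 1 - 3 * x**2 / ds**2 + 2 * x**3 / ds**3
    G2 = x - 2 * x**2 / ds + x**3 / ds**2
    G3 = 1 - G1
    G4 = -x**2 / ds + x**3 / ds**2   # expanded form of -G_2(s_j + s_{j+1} - s)
    return np.array([G1, G2, G3, G4])

def hermite_shape_ds(s, s_j, ds):
    """Derivatives with respect to s of G_1..G_4."""
    x = s - s_j
    return np.array([
        -6 * x / ds**2 + 6 * x**2 / ds**3,
        1 - 4 * x / ds + 3 * x**2 / ds**2,
        6 * x / ds**2 - 6 * x**2 / ds**3,
        -2 * x / ds + 3 * x**2 / ds**2,
    ])

def lagrange_shape(alpha, alpha_i, dalpha):
    """Linear Lagrange shape functions L_1, L_2 of (8) on [alpha_i, alpha_i + dalpha]."""
    L1 = (alpha_i + dalpha - alpha) / dalpha
    return np.array([L1, 1 - L1])

def composite_shape(alpha, s, alpha_i, s_j, dalpha, ds):
    """Composite functions H_1..H_8 of (10): H_k = L_1 G_k and H_{k+4} = L_2 G_k."""
    L = lagrange_shape(alpha, alpha_i, dalpha)
    G = hermite_shape(s, s_j, ds)
    return np.concatenate([L[0] * G, L[1] * G])
```

Evaluating `hermite_shape` and `hermite_shape_ds` at the two end points of a cell should reproduce the values listed in (9).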
Consequently, every element F of the Finite Element space V can be represented in
terms of the function values $\mathbf{F}_{i,j}$ and the s-derivatives $\partial_s \mathbf{F}_{i,j}$ at all grid
points:
\[
\mathbf{F}(\alpha,s) = \sum_{i=1}^{N_\alpha} \sum_{j=1}^{N_s}
 \left( \mathbf{F}_{i,j}\, \Phi_{i,j}(\alpha,s) + \partial_s \mathbf{F}_{i,j}\, \Psi_{i,j}(\alpha,s) \right) ,
\tag{11}
\]
with the basis functions
\[
\begin{aligned}
\Phi_{i,j} &:= H_7^{i-1,j-1} + H_5^{i-1,j} + H_3^{i,j-1} + H_1^{i,j} , \\
\Psi_{i,j} &:= H_8^{i-1,j-1} + H_6^{i-1,j} + H_4^{i,j-1} + H_2^{i,j} ,
\end{aligned}
\qquad i = 1,\dots,N_\alpha ,\ j = 1,\dots,N_s .
\tag{12}
\]
The Finite Element formulation of the lamellipodium problem on the time interval
[0, T ] is to find $\mathbf{F} \in C^1([0,T]; V)$, such that the weak formulation (6) holds for all
$\mathbf{G} \in C([0,T]; V)$.
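To make the representation (11) concrete, the following sketch evaluates one scalar component of a Finite Element function from its nodal values and nodal s-derivatives. It is a simplified illustration, not the authors' implementation: the function name `evaluate_fe`, the 0-based indexing and the cell-location logic are assumptions. Periodicity in α is handled by wrapping the right-hand neighbour index.

```python
import numpy as np

def evaluate_fe(alpha, s, F_nodes, dF_nodes):
    """Evaluate (11) for one scalar component at a point (alpha, s).

    F_nodes, dF_nodes : arrays (N_alpha, N_s) with nodal values and nodal
    s-derivatives on the grid alpha_i = i*dalpha (0-based, periodic) and
    s_j = -1 + j*ds.
    """
    n_alpha, n_s = F_nodes.shape
    dalpha, ds = 2 * np.pi / n_alpha, 1.0 / (n_s - 1)
    # locate the computational cell containing (alpha, s)
    a = alpha % (2 * np.pi)
    i = min(int(a // dalpha), n_alpha - 1)
    j = min(int((s + 1.0) // ds), n_s - 2)
    ip = (i + 1) % n_alpha               # periodic neighbour in alpha
    # linear Lagrange weights in alpha
    w2 = (a - i * dalpha) / dalpha
    w1 = 1.0 - w2
    # cubic Hermite weights in s
    x = s - (-1.0 + j * ds)
    g1 = 1 - 3 * x**2 / ds**2 + 2 * x**3 / ds**3
    g2 = x - 2 * x**2 / ds + x**3 / ds**2
    g3 = 1 - g1
    g4 = -x**2 / ds + x**3 / ds**2
    value = 0.0
    for w, k in ((w1, i), (w2, ip)):
        value += w * (g1 * F_nodes[k, j] + g2 * dF_nodes[k, j]
                      + g3 * F_nodes[k, j + 1] + g4 * dF_nodes[k, j + 1])
    return value
```

Since the space contains all cubic polynomials in s, nodal data sampled from a function such as s³ + 2s (independent of α) should be reproduced exactly up to rounding.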
3.3 Time Discretization – Implementation Issues
In this section we go through all the terms in (6) and discuss their time discretization
and some implementation details. This will lead to a semi-implicit time discretization
of the problem, where at each time step a linear system has to be solved. We shall
use the superscripts n and n + 1 for the numerical approximations at the old time tn
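and, respectively, the new time $t_{n+1} = t_n + \Delta t$, i.e.
\[
\mathbf{F}^n(\alpha,s) = \sum_{i=1}^{N_\alpha} \sum_{j=1}^{N_s}
 \left( \mathbf{F}^n_{i,j}\, \Phi_{i,j}(\alpha,s) + \partial_s \mathbf{F}^n_{i,j}\, \Psi_{i,j}(\alpha,s) \right) .
\tag{13}
\]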
Finally, we shall also describe a regridding procedure in the α-direction, which has
the goal to equidistribute the computational filaments.
Resistance Against Filament Bending
The bending term is evaluated at the new time step and therefore becomes
\[
\int_{B_0} \eta\, \mu^B\, \partial_s^2 \mathbf{F}^{n+1} \cdot \partial_s^2 \mathbf{G}\; d(\alpha,s) ,
\]
where for G the basis functions (12) are inserted. For the computation of the integral,
a piecewise constant approximation for η is used.
Adhesion with the Substrate
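For the transport operator $\widetilde{D}_t \mathbf{F}$ in
\[
\int_{B_0} \eta\, L^4 \mu^A\, \widetilde{D}_t \mathbf{F} \cdot \mathbf{G}\; d(\alpha,s) ,
\]
an explicit time discretization is used, i.e., it is replaced by
\[
\frac{\mathbf{F}^{n+1} - \mathbf{F}^n}{\Delta t} - \left( \frac{v}{L} + \frac{s\,\partial_t L}{L} \right) \partial_s \mathbf{F}^n .
\]
For the computation of the integral, piecewise constant approximations for η and 1/L
are used. For the factor $L^4$, L is approximated by piecewise linear functions.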
Stretching of Cross-Links
The friction term caused by the stretching of cross-links requires the computation of
the relative velocity Dt F− Dt∗ F∗ , which is a subtle issue since the material derivative
of F∗ has to be evaluated at (α ∗ , s ∗ ), defined by
F(α, s, t) = F∗ (α ∗ , s ∗ , t) . (14)
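The computation
\[
\begin{aligned}
\Delta t \left( D_t \mathbf{F}(\alpha,s) - D_t^* \mathbf{F}^*(\alpha^*,s^*) \right)
\approx{}& \mathbf{F}^{n+1}(\alpha,s) - \mathbf{F}^n(\alpha,s) - v\Delta t\, \partial_s \mathbf{F}^n(\alpha,s) \\
& - \mathbf{F}^{*,n+1}(\alpha^*,s^*) + \mathbf{F}^{*,n}(\alpha^*,s^*) + v^* \Delta t\, \partial_s \mathbf{F}^{*,n}(\alpha^*,s^*) \\
\approx{}& \mathbf{F}^{n+1}(\alpha,s) - \mathbf{F}^n(\alpha, s + v\Delta t)
 - \mathbf{F}^{*,n+1}(\alpha^*,s^*) + \mathbf{F}^{*,n}(\alpha^*, s^* + v^*\Delta t)
\end{aligned}
\]
shows that it is convenient to introduce an additional O(Δt)-discretization error,
replacing (14) by
\[
\mathbf{F}^n(\alpha, s + v\Delta t) = \mathbf{F}^{*,n}(\alpha^*, s^* + v^*\Delta t) .
\tag{15}
\]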
Another difficulty originates from the fact that the s ∗ -direction in the (α ∗ , s ∗ )-plane
does not correspond to the s-direction in the (α, s)-plane, and therefore it is difficult
to express the information encoded in the values of ∂s∗ F∗ in terms of (α, s). We
therefore approximate the cross-link terms only in terms of the filament positions:
\[
\mathbf{F}^n(\alpha,s) = \sum_{i=1}^{N_\alpha} \sum_{j=1}^{N_s} \mathbf{F}^n_{i,j}\, \widehat{\Phi}_{i,j}(\alpha,s) ,
\qquad
\mathbf{F}^{*,n}(\alpha^*,s^*) = \sum_{i=1}^{N_\alpha} \sum_{j=1}^{N_s} \mathbf{F}^{*,n}_{i,j}\, \widehat{\Phi}_{i,j}(\alpha^*,s^*) ,
\]
where the hat-functions $\widehat{\Phi}_{i,j}$ are piecewise bilinear. Equation (15) is solved for
$(\alpha,s) = (\alpha_i, s_j)$, i = 1, . . . , Nα, j = 1, . . . , Ns, using these representations,
which involves a search for the quadrilateral of $\mathbf{F}^{*,n}$-positions containing
$\mathbf{F}^n(\alpha_i, s_j + v\Delta t)$. The nonlinear system is then solved by using a bilinear
representation of F, which allows the system to be solved exactly. The resulting values
for $(\alpha^*, s^*)$ are denoted by $(\alpha^*_{i,j}, s^*_{i,j})$. With (15), the old-time terms in the
computation above cancel, and the relative velocity is approximated by
\[
\left( D_t \mathbf{F} - D_t^* \mathbf{F}^* \right)(\alpha,s) \approx
\sum_{i=1}^{N_\alpha} \sum_{j=1}^{N_s}
\frac{\mathbf{F}^{n+1}_{i,j} - \mathbf{F}^{*,n+1}(\alpha^*_{i,j}, s^*_{i,j})}{\Delta t}\, \widehat{\Phi}_{i,j}(\alpha,s)
\]
in the cross-link stretching term
\[
\int_{B_0} \eta\eta^* L^4 \widehat{\mu}^S \left( \widetilde{D}_t \mathbf{F} - \widetilde{D}_t^* \mathbf{F}^* \right) \cdot \mathbf{G}\; d(\alpha,s) ,
\tag{16}
\]
where, again, η and η∗ are approximated as piecewise constant and L as piecewise
linear.
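The exact solvability mentioned above rests on the fact that inverting a bilinear map on a quadrilateral reduces to a scalar quadratic equation. The sketch below shows one standard way of doing this. It is an assumption about how such a step could be implemented, not a description of the authors' code; the function names, the local coordinates (u, v) ∈ [0, 1]² and the tolerances are hypothetical.

```python
import numpy as np

def cross2(p, q):
    """Scalar cross product of two planar vectors."""
    return p[0] * q[1] - p[1] * q[0]

def invert_bilinear(P, P00, P10, P01, P11, tol=1e-12):
    """Find (u, v) in [0,1]^2 such that
    P = (1-u)(1-v) P00 + u(1-v) P10 + (1-u)v P01 + uv P11.
    Returns None if P does not lie in the quadrilateral."""
    # rewrite the equation as A + u B + v C + u v D = 0
    A, B = P00 - P, P10 - P00
    C, D = P01 - P00, P00 - P10 - P01 + P11
    # taking the cross product with (C + u D) eliminates v: a u^2 + b u + c = 0
    a = cross2(B, D)
    b = cross2(A, D) + cross2(B, C)
    c = cross2(A, C)
    if abs(a) < tol:                      # parallelogram: the equation is linear in u
        roots = [-c / b] if abs(b) > tol else []
    else:
        disc = b * b - 4 * a * c
        if disc < 0:
            return None
        r = np.sqrt(disc)
        roots = [(-b + r) / (2 * a), (-b - r) / (2 * a)]
    for u in roots:
        E = C + u * D
        n = E @ E
        if n < tol:
            continue
        v = -((A + u * B) @ E) / n        # from A + u B = -v (C + u D)
        if -1e-9 <= u <= 1 + 1e-9 and -1e-9 <= v <= 1 + 1e-9:
            return u, v
    return None
```

In a search over quadrilaterals of $\mathbf{F}^{*,n}$-positions, a return value of `None` would indicate that the next candidate cell has to be tried; the local coordinates (u, v) of the containing cell then translate into $(\alpha^*, s^*)$ by linear scaling.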
Twisting of Cross-Links
For the cross-link twisting term, a semi-implicit time discretization is used. The angle
between filaments is evaluated at each grid point and at the old time step:
\[
\phi^n_{i,j} = \arccos\left( \partial_s \mathbf{F}^n_{i,j} \cdot \partial_s \mathbf{F}^{*,n}(\alpha^*_{i,j}, s^*_{i,j}) \right) ,
\]
with $(\alpha^*_{i,j}, s^*_{i,j})$ computed as above. A piecewise constant approximation $\phi^n$ is then
obtained from averaging over grid cells. This is used in the evaluation of
\[
\int_{B_0} \eta\eta^* L^2 \widehat{\mu}^T (\phi^n - \phi_0)\, \partial_s \mathbf{F}^{n+1,\perp} \cdot \partial_s \mathbf{G}\; d(\alpha,s) ,
\]
where also η and η∗ are approximated by piecewise constant functions and L by
piecewise linear functions.
Filament Repulsion
For the pressure p(ρ) with $\rho = \eta / |\partial_s \mathbf{F} \cdot \partial_\alpha \mathbf{F}^\perp|$ a piecewise constant
approximation $\rho^n$ evaluated at the old time step is used. It is calculated using the
representation of F given in (13) to compute cell averages of ∂s F and ∂α F. The
function η is approximated by a piecewise constant function. The pressure term is
discretized semi-implicitly as
\[
\int_{B_0} p(\rho^n) \left( L^3\, \partial_\alpha \mathbf{F}^{n+1,\perp} \cdot \partial_s \mathbf{G}
 - \frac{1}{L}\, \partial_s \mathbf{F}^{n+1,\perp} \cdot \partial_\alpha (L^4 \mathbf{G}) \right) d(\alpha,s) .
\]
As above, L is approximated by a piecewise linear function, with the exception of
the coefficient 1/L, which is approximated by a piecewise constant function.
Inextensibility Constraint
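With a penalization approach, the inextensibility term
\[
\int_{B_0} \eta L^2 \lambda_{\rm inext}\, \partial_s \mathbf{F} \cdot \partial_s \mathbf{G}\; d(\alpha,s)
\]
in (6) is replaced by
\[
\int_{B_0} \eta L^2\, \frac{|\partial_s \mathbf{F}|^2 - L^2}{\varepsilon}\, \partial_s \mathbf{F} \cdot \partial_s \mathbf{G}\; d(\alpha,s) ,
\]
with a small positive parameter ε. We use the semi-implicit linearization
\[
\left( |\partial_s \mathbf{F}|^2 - L^2 \right) \partial_s \mathbf{F} \approx
\left( |\partial_s \mathbf{F}^n|^2 - L^2 \right) \partial_s \mathbf{F}^{n+1}
+ 2 \left( \partial_s \mathbf{F}^n \cdot \partial_s \mathbf{F}^{n+1} - |\partial_s \mathbf{F}^n|^2 \right) \partial_s \mathbf{F}^n
\]
and employ the augmented Lagrangian method, whence the inextensibility term
becomes
\[
\int_{B_0} \eta L^2 \left[ \left( \lambda^n + \frac{|\partial_s \mathbf{F}^n|^2 - L^2}{\varepsilon} \right) \partial_s \mathbf{F}^{n+1}
+ \frac{2}{\varepsilon} \left( \partial_s \mathbf{F}^n \cdot \partial_s \mathbf{F}^{n+1} - |\partial_s \mathbf{F}^n|^2 \right) \partial_s \mathbf{F}^n \right]
\cdot \partial_s \mathbf{G}\; d(\alpha,s) .
\]
After the time step is carried out, the Lagrange multiplier is updated by
\[
\lambda^{n+1} = \lambda^n + \frac{|\partial_s \mathbf{F}^n|^2 - L^2}{\varepsilon}
+ \frac{2}{\varepsilon} \left( \partial_s \mathbf{F}^n \cdot \partial_s \mathbf{F}^{n+1} - |\partial_s \mathbf{F}^n|^2 \right) .
\]
Again, L is approximated as piecewise linear.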
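The semi-implicit linearization is the first-order Taylor expansion of the map X ↦ (|X|² − L²)X around the old value, so its error should be quadratic in the increment. The following pointwise sketch, in which X stands for ∂s F at a single point and all function names are hypothetical, can be used to check this together with the multiplier update.

```python
import numpy as np

def penalty(X, L):
    """Nonlinear penalty expression (|X|^2 - L^2) X, with X standing for d_s F."""
    return (X @ X - L**2) * X

def penalty_linearized(X_new, X_old, L):
    """Semi-implicit linearization around X_old, linear in X_new."""
    return ((X_old @ X_old - L**2) * X_new
            + 2.0 * (X_old @ X_new - X_old @ X_old) * X_old)

def multiplier_update(lam, X_new, X_old, L, eps):
    """Pointwise augmented Lagrangian update of the multiplier."""
    return (lam + (X_old @ X_old - L**2) / eps
            + 2.0 * (X_old @ X_new - X_old @ X_old) / eps)
```

Halving the increment X_new − X_old should reduce the linearization error by a factor of about four, and the multiplier stays unchanged once X_new = X_old satisfies |X| = L.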
Spatial Equidistribution of Computational Filaments
In some simulations, the computational filaments tend to be distributed unevenly.
This is avoided by a regridding procedure, where the computational barbed ends are
evenly distributed along the leading edge, which can be achieved by a coordinate
change α → β, defined by
\[
\beta = 2\pi \int_0^\alpha \left|\partial_\alpha \mathbf{F}(\hat\alpha, 0, t)\right| d\hat\alpha
\left( \int_0^{2\pi} \left|\partial_\alpha \mathbf{F}(\hat\alpha, 0, t)\right| d\hat\alpha \right)^{-1} .
\]
Numerically this is realized after carrying out a time step $t_{n-1} \to t_n$, by defining a
piecewise linear function g(α) through its values at the grid:
\[
g(\alpha_i) := 2\pi \sum_{j=1}^{i-1} \left| \mathbf{F}^n_{j+1,N_s} - \mathbf{F}^n_{j,N_s} \right|
\left( \sum_{j=1}^{N_\alpha} \left| \mathbf{F}^n_{j+1,N_s} - \mathbf{F}^n_{j,N_s} \right| \right)^{-1} ,
\qquad i = 1,\dots,N_\alpha+1 .
\]
Then $\tilde\alpha_1, \dots, \tilde\alpha_{N_\alpha+1}$ are determined as the solutions of
\[
g(\tilde\alpha_i) = (i-1)\Delta\alpha , \qquad i = 1,\dots,N_\alpha+1 .
\]
Now the computational filaments, corresponding to $\alpha = \alpha_1, \dots, \alpha_{N_\alpha+1}$, are replaced
by those located at $\alpha = \tilde\alpha_1, \dots, \tilde\alpha_{N_\alpha+1}$:
\[
\mathbf{F}^n_{i,j} := \mathbf{F}^n(\tilde\alpha_i, s_j) , \qquad
\partial_s \mathbf{F}^n_{i,j} := \partial_s \mathbf{F}^n(\tilde\alpha_i, s_j) ,
\]
and the density also needs to be redefined:
\[
\eta^n_{i,j} := \frac{\eta^n(\tilde\alpha_i, s_j)}{g'(\tilde\alpha_i)} .
\]
This procedure can be carried out whenever needed. In the simulations of the
following section, it was done after every time step.
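The regridding step can be sketched in a few lines for the leading-edge row of nodes. The code below is an illustration under simplifying assumptions: it treats only the barbed ends (j = Ns), uses linear interpolation in α (which matches the piecewise linear dependence on α of the Finite Element space), and takes the density as a nodal array interpolated linearly, whereas the chapter uses a piecewise constant η. The function name `equidistribute` and the array layout are hypothetical.

```python
import numpy as np

def equidistribute(F_edge, eta_edge):
    """Redistribute barbed ends evenly along the discrete leading edge.

    F_edge   : array (N_alpha, 2), barbed-end positions F^n_{i,N_s}
    eta_edge : array (N_alpha,), density at the barbed ends
    Returns the new labels alpha~_i and the regridded positions and density.
    Assumes that neighbouring barbed ends are distinct (g strictly increasing).
    """
    n = F_edge.shape[0]
    dalpha = 2 * np.pi / n
    alpha = dalpha * np.arange(n + 1)                  # alpha_1, ..., alpha_{N+1}
    closed = np.vstack([F_edge, F_edge[:1]])           # periodic closure
    seg = np.linalg.norm(np.diff(closed, axis=0), axis=1)
    # g(alpha_i): normalized cumulative length of the polygonal leading edge
    g = 2 * np.pi * np.concatenate([[0.0], np.cumsum(seg)]) / seg.sum()
    target = dalpha * np.arange(n)
    alpha_new = np.interp(target, g, alpha)            # solves g(alpha~_i) = (i-1) dalpha
    g_prime = 2 * np.pi * seg / (seg.sum() * dalpha)   # slope of g on each interval
    cell = np.minimum((alpha_new // dalpha).astype(int), n - 1)
    F_new = np.column_stack(
        [np.interp(alpha_new, alpha, closed[:, k]) for k in range(2)])
    eta_closed = np.append(eta_edge, eta_edge[0])
    eta_new = np.interp(alpha_new, alpha, eta_closed) / g_prime[cell]
    return alpha_new, F_new, eta_new
```

In a full implementation the same new labels would be used to re-evaluate the positions and s-derivatives for every s_j, as described above.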
4 Numerical Simulations
The purpose of this section is to demonstrate that the model is capable of predicting
the outcome of migration experiments on inhomogeneous adhesive patterns. In [12]
the effect of varying different model parameters has been demonstrated. Addition-
ally the model has been used to simulate chemotactic migration and to study the
effect of changes in the signaling cascade on cell shape and filament density. Here
we go a step further and simulate how the shape of a migrating cell is influenced
by inhomogeneous adhesive patterns. Such studies are used to better understand
the interplay between adhesion, contraction, actin polymerization and other actin
associated proteins.
4.1 Experiment 1: Strongly Versus Weakly Adhesive Stripes
We show that model predictions are consistent with experimental data on migration
experiments published in [4]. In these experiments migrating fish keratocytes were
placed on substrates coated with distinct patterns of the extra-cellular-matrix (ECM)
protein fibronectin which binds to integrin transmembrane receptors mediating adhe-
sion. In [4] striped patterns were used featuring adhesive (fibronectin containing)
strips of 5 µm width and nonadhesive strips (without fibronectin) varying between 5
and 30 µm in width. In [4] it was reported that this affects cell shape in a very distinct
way. Protruding bumps on the adhesive strips and lagging bumps on the nonadhesive
stripes were observed and their width was correlated to the stripe width. Also it was
observed that cells tend to assume a symmetric shape such that they had an equal
number of adhesive strips to the right and to the left of their cell center (Fig. 4 shows
some of the data published in [4]).
Fig. 4 Figure reproduced from [4]: “Reversible deformation of the leading edge on line patterns.
a: ... keratocyte crawling from a 5–9 pattern ... onto an unpatterned region ... b: Deformation of
the leading edge on a 5–7 pattern with protruding bumps on adhesive stripes and lagging bumps
on non-adhesive stripes... c: Control experiment on a 5–7 line pattern where unprinted regions
(black) are not backfilled ... rendering the substrate homogeneously adhesive. Cells restore their
characteristic crescent-shaped outline ... Scale bars: 5 µm”
Fig. 5 a–i Time series of the simulation of a cell moving over a striped adhesive pattern (red)
with an 80% drop in adhesiveness (white) between adhesive strips. Shading represents actin network
density. Parameter values as in Table 1. The bar represents 10 µm
In the numerical experiment we used the same geometrical pattern with adhesive
strips of 5 µm width interspaced with 7 µm wide strips of reduced adhesiveness. In
the mathematical model adhesion forces result in friction between the cell and the
substrate and, speaking in numerical terms, they link one time step to the next. We
simulated the inhomogeneous adhesive pattern by decreasing the friction coefficient
by 80–90% in those regions of low fibronectin concentration as compared to adhe-
sive regions. Whilst the keratocytes in the original experiments move spontaneously
without an external signal, we simulate chemotactic cells under an external cue,
since at this point the model cannot describe the dynamics of contractile rear bundles
which stabilize autonomously migrating keratocytes. However, the numerical results
show that there are many similarities as far as general behavior and morphology are
concerned, suggesting that the underlying phenomena are very similar. In Fig. 5a–i a
time series resulting from the simulation of a cell on a striped adhesive pattern with
a drop in adhesiveness of 80% is depicted. The following agreements between the
simulation and the experiments (Fig. 4) were found:
• On the striped adhesion pattern the cell shape becomes more rectangular as com-
pared to the crescent shape in the homogeneously adhesive region.
• Cells show protruding bumps on the adhesive stripes and lagging bumps on non-
adhesive stripes.
• The width of the bumps is correlated with the widths of the stripes.
• Spikes appear at the rear of the cell.
• After leaving the striped region the cell resumes its crescent shape and continues
to migrate as before.
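The striped adhesion pattern enters the simulation only through a position-dependent friction coefficient. A minimal sketch of such a coefficient is given below, using the stripe widths and the adhesive value of µA from Table 1. The function name, the orientation of the stripes (parallel to the direction of migration, so that the pattern varies only with the transverse coordinate y) and the placement of the origin at the lower edge of an adhesive stripe are assumptions made for illustration.

```python
import numpy as np

def adhesion_coefficient(y, mu_adhesive=0.41, drop=0.8,
                         width_adhesive=5.0, width_gap=7.0):
    """Friction coefficient mu^A (pN min / um^2) at transverse position y (um).

    Adhesive stripes of width `width_adhesive` alternate with stripes of width
    `width_gap` on which the coefficient is reduced by the fraction `drop`.
    """
    period = width_adhesive + width_gap
    on_stripe = np.mod(y, period) < width_adhesive
    return np.where(on_stripe, mu_adhesive, (1.0 - drop) * mu_adhesive)
```

With `drop=0.8` and `drop=0.9` this gives the reduced values 0.082 and 0.041 pN min µm⁻² listed in Table 1.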
Fig. 6 Comparison of the cell shape for three different starting positions. Parameter values as in
Table 1. The bar represents 10 µm
Fig. 7 Movement of a cell on an adhesive substrate (red) with less-adhesive stripes (white). Shading
represents actin network density. a 90% drop in adhesiveness, b 80% drop in adhesiveness. Parameter
values as in Table 1. The bar represents 10 µm
To compare the influence of the starting position on the shape of the cell, the
simulation was performed with three different initial conditions differing by 2 µm
shifts in the y-direction. The outcome is depicted in Fig. 6. It can be observed that
the shape of the cell starting at the lowest position (blue, dashed) differs significantly
from the other two. This is due to the fact that it interacts with the lowest adhesive
stripe causing it to shift further down as compared to the other cells.
In Fig. 7a, b a comparison between the bumps on stripes with a 90% (a) and an
80% (b) drop in adhesiveness is shown. Here the α-discretization used was twice as
fine to resolve more details. As expected, the bumps are more pronounced when
adhesiveness drops by 90%. Over a time interval of several minutes the amplitude of the
bumps fluctuated, an observation also made in the experiment of [4].
Fig. 8 Movement of a cell on an adhesive substrate (red) with less-adhesive spikes (white). Shading
represents actin network density. Parameter values as in Table 1. The bar represents 10 µm
4.2 Experiment 2: Less-Adhesive Spikes on Strongly Adhesive Ground
Next we demonstrate the predictive capacity of the model simulating cell migration
along an adhesive path lined by irregular regions of low adhesion. The low-adhesion
pattern consists of two shifted spikes lining the trajectory of the cell from both sides.
The drop in adhesiveness was chosen to be 80%. As opposed to the situation above,
the cell is now able to almost fully avoid the less-adhesive regions. The behavior
observed over a time period of 30 min is depicted in the time series shown in Fig. 8.
The outcome of the simulation is counterintuitive in that the cell does not simply slide
off the nonadhesive areas. Instead it behaves as if the less-adhesive spikes were obstacles
and only a very small portion of the lamellipodium enters the less-adhesive areas.
4.3 Parameter Values
For the discretization we used a time step of 0.12 s and nine nodes per filament. For
the first experiment we used 36 and 72 discrete filaments, for the second one 36.
For the biological parameters, we used the same as those in [12], apart from the
adhesion coefficient which was increased for the adhesive regions and decreased for
the less-adhesive regions. They are summarized in Table 1.
Table 1 Parameter values

| Var. | Meaning | Value | Comment |
|------|---------|-------|---------|
| µB | Bending elasticity | 0.07 pN µm² | [5] |
| µA | Macroscopic friction caused by adhesions | 0.041, 0.082, 0.41 pN min µm⁻² | Lower values for less-adhesive regions, highest value for adhesive regions; order of magnitude from measurements in [10, 16], estimation and calculations in [17, 18, 20] |
| κbr | Branching rate | 10 min⁻¹ | Order of magnitude from [7], chosen to fit 2ρref = 90 µm⁻¹ [26] |
| κcap | Capping rate | 5 min⁻¹ | Order of magnitude from [7], chosen to fit 2ρref = 90 µm⁻¹ [26] |
| crec | Arp2/3 recruitment | 900 µm⁻¹ min⁻¹ | Chosen to fit 2ρref = 90 µm⁻¹ [26] |
| κsev | Severing rate | 0.38 min⁻¹ µm⁻¹ | Chosen to give lamellipodium widths similar to those described in [26] |
| µIP | Actin–myosin interaction strength | 0.1 pN µm⁻² | |
| A0 | Equilibrium inner area | 450 µm² | Order of magnitude as in [25, 27] |
| vmin | Minimal polymerization speed | 1.5 µm min⁻¹ | In biological range |
| vmax | Maximal polymerization speed | 8 µm min⁻¹ | In biological range |
| µP | Pressure constant | 0.05 pN µm | |
| µS | Cross-link stretching constant | 7.1 × 10⁻³ pN min µm⁻¹ | |
| µT | Cross-link twisting constant | 7.1 × 10⁻³ µm | |
| κref | Reference leading edge curvature for polymerization speed reduction | (5 µm)⁻¹ | |
Acknowledgements This work has been supported by the Austrian Science Fund through grant
no. J-3463 and through the PhD program Dissipation and Dispersion in Nonlinear PDEs, grant
no. W1245. The authors also acknowledge support by the Vienna Science and Technology Fund,
grant no. LS13-029. N. Sfakianakis wishes to thank the Alexander von Humboldt Foundation and
the Center of Computational Sciences (CSM) of Mainz for their support, and M. Lukacova for the
fruitful discussions during the preparation of this manuscript.
References
1. W. Alt and M. Dembo. Cytoplasm dynamics and cell motion: Two-phase flow models. Math.
Biosci., 156(1–2):207–228, 1999.
2. L. Blanchoin, R. Boujemaa-Paterski, C. Sykes, and J. Plastino. Actin dynamics, architecture,
and mechanics in cell motility. Physiological Reviews.
3. D. Braess. Finite Elements. Theory, fast solvers, and applications to solid mechanics. Cam-
bridge University Press, 2001.
4. G. Csucs, K. Quirin, and G. Danuser. Locomotion of fish epidermal keratocytes on spatially
selective adhesion patterns. Cell Motility and the Cytoskeleton, 64(11):856–867, 2007.
5. F. Gittes, B. Mickey, J. Nettleton, and J. Howard. Flexural rigidity of microtubules and actin fil-
aments measured from thermal fluctuations in shape. The Journal of Cell Biology, 120(4):923–
34, 1993.
6. M.E. Gracheva and H.G. Othmer. A continuum model of motility in ameboid cells. Bull. Math.
Biol., 66(1):167–193, 2004.
7. H.P. Grimm, A.B. Verkhovsky, A. Mogilner, and J.-J. Meister. Analysis of actin dynamics at the
leading edge of crawling cells: implications for the shape of keratocytes. European Biophysics
Journal, 32:563–577, 2003.
8. C.I. Lacayo, Z. Pincus, M.M. VanDuijn, C.A. Wilson, D.A. Fletcher, F.B. Gertler, A. Mogilner,
and J.A. Theriot. Emergence of large-scale cell morphology and movement from local actin
filament growth dynamics. 2007.
9. T. Lämmermann and M. Sixt. Mechanical modes of amoeboid cell migration. Current Opinion in
Cell Biology, 21(5):636–644, 2009.
10. F. Li, S.D. Redick, H.P. Erickson, and V.T. Moy. Force measurements of the α5β1 integrin-
fibronectin interaction. Biophysical Journal, 84(2):1252–1262, 2003.
11. I.V. Maly and G.G. Borisy. Self-organization of a propulsive actin network as an evolutionary
process. Proc. Natl. Acad. Sci., 98:11324–11329, 2001.
12. A. Manhart, D. Oelz, C. Schmeiser, and N. Sfakianakis. An extended Filament Based Lamel-
lipodium Model produces various moving cell shapes in the presence of chemotactic signals.
J. Theor. Biol., 382:244–258, 2015.
13. A. F. M. Marée, A. Jilkine, A. Dawes, V. A. Grieneisen, and L. Edelstein-Keshet. Polarization
and movement of keratocytes: a multiscale modelling approach. Bull. Math. Biol., 68(5):1169–
1211, 2006.
14. R.W. Metzke, M.R.K. Mofrad, and W.A. Wall. Coupling atomistic simulation to a continuum
based model to compute the mechanical properties of focal adhesions. Biophysical Journal,
96(3, Supplement 1):673a –, 2009.
15. S.J. Mousavi and M.H. Doweidar. Three-dimensional numerical model of cell morphology
during migration in multi-signaling substrates. PLoS ONE, 10(3), 2015.
16. A.F. Oberhauser, C. Badilla-Fernandez, M. Carrion-Vazquez, and J.M. Fernandez. The mechan-
ical hierarchies of fibronectin observed with single-molecule AFM. Journal of Molecular Biol-
ogy, 319(2):433–47, 2002.
17. D. Oelz and C. Schmeiser. Cell mechanics: from single scale-based models to multiscale
modeling., chapter How do cells move? Mathematical modeling of cytoskeleton dynamics and
cell migration. Chapman and Hall, 2010.
18. D. Oelz and C. Schmeiser. Derivation of a model for symmetric lamellipodia with instantaneous
cross-link turnover. Archive for Rational Mechanics and Analysis, 198:963–980, 2010.
19. D. Oelz and C. Schmeiser. Simulation of lamellipodial fragments. Journal of Mathematical
Biology, 64:513–528, 2012.
20. D. Oelz, C. Schmeiser, and J.V. Small. Modeling of the actin-cytoskeleton in symmetric lamel-
lipodial fragments. Cell Adhesion and Migration, 2:117–126, 2008.
21. B. Rubinstein, K. Jacobson, and A. Mogilner. Multiscale two-dimensional modeling of a motile
simple-shaped cell. Multiscale Model. Simul., 3(2):413–439, 2005.
22. B. Rubinstein, M.F. Fournier, K. Jacobson, A.B Verkhovsky, and A. Mogilner. Actin-myosin
viscoelastic flow in the keratocyte lamellipod. Biophysical Journal, 97(7):1853–1863, 2009.
23. Y. Sakamoto, S. Prudhomme, and M.H. Zaman. Modeling of adhesion, protrusion, and con-
traction coordination for cell migration simulations. Journal of Mathematical Biology, 68(1–
2):267–302, 2014.
24. T.E. Schaus, E. W. Taylor, and G. G. Borisy. Self-organization of actin filament orientation in
the dendritic-nucleation/array-treadmilling model. Proc Natl Acad Sci USA, 104(17):7086–91,
2007.
25. J.V. Small, G. Isenberg, and J.E. Celis. Polarity of actin at the leading edge of cultured cells.
Nature, 272:638–639, 1978.
26. J.V. Small, T. Stradal, E. Vignal, and K. Rottner. The lamellipodium: where motility begins.
Trends in Cell Biology, 12(3):112–20, 2002.
27. A.B. Verkhovsky, T.M. Svitkina, and G.G. Borisy. Self-polarisation and directional motility of
cytoplasm. Current Biology, 9(1):11–20, 1999.
28. M. Vinzenz, M. Nemethova, F. Schur, J. Mueller, A. Narita, E. Urban, C. Winkler, C. Schmeiser,
S.A. Koestler, K. Rottner, G.P. Resch, Y. Maeda, and J.V. Small. Actin branching in the initiation
and maintenance of lamellipodia. Journal of Cell Science, 125(11):2775–2785, 2012.
Author Index
Andreychenko, Alexander, 39
Bortolussi, Luca, 39
de la Higuera, Luis, 81
Edwards, Jeremy S., 1
Ferrarini, Marco, 67
Gil, Amparo, 107
González-Vélez, Virginia, 107
Grima, Ramon, 39
Gutiérrez, Luis Miguel, 107
Halász, Ádám M., 1
Hell, Juliette, 119
López-García, Martín, 81
Lythe, Grant, 67, 81
Manhart, Angelika, 141
Molina-París, Carmen, 67, 81
Oelz, Dietmar, 141
Pryor, Meghan McCabe, 1
Rendall, Alan D., 119
Schmeiser, Christian, 141
Sfakianakis, Nikolaos, 141
Thomas, Philipp, 39
Villanueva, José, 107
Wilson, Bridget S., 1
Wolf, Verena, 39
