Urszula Ledzewicz
Dept. of Mathematics and Statistics
Southern Illinois University at Edwardsville
Edwardsville, Il 62026, USA
Heinz Schättler
Dept. of Electrical and Systems Engineering
Washington University
St. Louis, Mo 63130, USA
Of all the limiting factors in chemotherapy, and there exist many of them, “prob-
ably the most important - and certainly the most frustrating - is drug resistance”
[11, pg. 335]. The entire process of the drugs’ actions is “forbiddingly complex”
[9] and given the complicated biochemical processes that are necessary for the cy-
totoxic agents to be effective, many defense mechanisms are open to the cell. For
example, many cytotoxic agents are susceptible to ABC transport proteins which
remove molecules out of the cell. Thus an over-expression (for example, by gene-
amplification) in these molecules is an important mechanism for resistance to various
drugs. Another one is that repair systems within the cells become activated which
overcome the damage done by the drug. Malignant cancer cell populations are
highly heterogeneous - the number of genetic errors present within one cancer cell
lies in the thousands [16] - and fast duplications combined with genetic instabilities
provide just one of several mechanisms which allow for quickly developing acquired
resistance to anti-cancer drugs. In addition, so-called intrinsic resistance (i.e. the
specific drug’s activation mechanism simply doesn’t work) makes some cancer cells
not susceptible to many cytotoxic agents. “ ... the truly surprising thing is that
some malignancies can be cured even with current approaches.” [9, pg. 65]. Several
mechanisms to circumvent the problem of acquired drug resistance have been
tried, but so far without success and currently no medical solution to the prob-
lem of drug resistance in chemotherapy exists. In fact, it is acquired or intrinsic
drug resistance which eventually makes most chemotherapy fail and in view of the
manifold ways in which the cell can react to an attack by cytostatic agents [9, 31]
it appears unlikely that there ever will exist a drug whose effectiveness will not
ultimately be limited in this way. While a “cure for cancer” thus may simply be
the “holy grail” of medicine, a more realistic objective of treatment is to increase
the life expectancy of the patient. For this combination therapies in which different
types of drugs are administered still provide a valuable option. They are based on
the idea that cells may become resistant to some particular agent, but then still can
be treated through other cytotoxic agents which have an entirely different mode of
action. Since cancer cells can in fact lose acquired drug resistance (for example, by
gene de-amplification), alternating treatments with different types of drugs may
therefore delay the onset of resistance. While this is true to some extent, in reality
multi-drug resistance and unacceptable levels of toxicity to the patient
unfortunately often limit these approaches as well.
Several probabilistic models for developing drug resistance exist in the literature
(e.g. [4, 8, 12, 33]). For example, in one of the early classical works by Coldman
and Goldie the tumor size is analyzed as a stochastic process and the probability
to have no resistant cells is maximized [4]. In this paper, as in numerous others
like [5], simple non cell-cycle specific two-compartment models distinguishing only
resistant and sensitive cells are considered. More recently, a probabilistic model for
the evolution of the drug sensitive cancer subpopulation from a single mutational
cell has been formulated and analyzed numerically by Westman et al. in [32, 33] in a
cell-cycle specific context distinguishing between the clonogenic (or quiescent) and
the growth (or proliferating) fraction. In these and earlier papers drug resistance is
treated as a sudden event, only distinguishing resistant and sensitive cells. While
drug resistance may be induced by a single mutational event (clinical resistance
sometimes appears so rapidly in patients that this would be a plausible explanation),
the more common mechanism seems to be random mutations over time: “a partially
resistant clone may ... undergo further mutations and become progressively more
resistant” [9, pg. 64]. A broad class of models which describe drug resistance not as
a single mutation event, but as a branching process, was developed by Harnevo and
Agur [7, 8] and Kimmel and Axelrod [12, 13]. Corresponding infinite-dimensional
deterministic models have been formulated and analyzed by Swierniak et al. [14, 28,
30]. However, due to high dimensionality these models often only allow a limited
All these models have in common that they analyze developing drug resistance
with respect to a single cytotoxic agent or a group of drugs which can be lumped
together in their effect. In this paper we consider a mathematical model for combi-
nation cancer chemotherapy under evolving drug resistance which considers multiple
killing agents. The underlying model was formulated jointly with A. Swierniak in
[21]. It is probabilistic, a branching random walk model with a finite number of
states [12], but averaged over the populations in individual compartments. The
model for acquired drug resistance is based on the mechanism of gene amplification,
but the equations can easily be adjusted to fit other mechanisms. For simplicity
and as a starting point, the models considered in this paper are not yet cell-cycle
specific and treat drug resistance as a “complete event” [33], i.e. they only distinguish between
resistant and sensitive compartments of cancer cells. Developing drug resistance is
unavoidable and will eventually lead to a halt of treatment. The problem therefore
is not to eliminate the cancer, but to prolong the patient’s life expectancy.
There are many non-equivalent ways of trying to formulate such a problem math-
ematically. In this paper we consider an optimal control approach. In section
2 we briefly consider the simpler model with one killing agent which falls into
a well researched class of bilinear optimal control problems mathematically (e.g.
[17, 18, 20, 27]). This is no longer the case when combination drug treatments are
considered. In that case, treated in section 3, the controls are determined by
minimizing an indefinite quadratic function over a compact set, which also leads to
interesting mathematical questions. We initiate the analysis of this model with tools of
modern optimal control in order to gain some qualitative insights into the structure
of optimal protocols. Section 4 gives some simulations of corresponding multi-drug
protocols comparing them with reasonable ad-hoc strategies.
with a one-copy forward gene amplification hypothesis which states that in cell di-
vision at least one of the two daughter cells will be an exact copy of the mother cell
while the second one with some positive probability undergoes gene amplifications.
These concepts form the background for models developed by Swierniak, Smieja
et al. [26, 29, 30] where various levels of drug resistance are considered. Taking
into account an increasing degree of gene amplification leads to infinite-dimensional
models [14] involving integro-differential equations which, however, are difficult to
analyze. Thus, assuming some level of simplification and staying within a finite-
dimensional structure enables a better analysis of these problems. In this paper we
keep the number of levels of drug resistance minimal, i.e. we only distinguish sensi-
tive and resistant compartments, but our aim is to analyze multi-drug treatments.
As a precursor we first briefly discuss the 2-dimensional model corresponding to a
one drug treatment.
Here the first terms on the right hand sides account for the deaths of the mother
cells, the second terms describe the return flows into the compartments, and the
third terms give the cross-over flows. We now assume that a cytotoxic agent kills
sensitive cells, but has no effect on the resistant population. Let u denote the drug
dose, 0 ≤ u ≤ umax ≤ 1, with u = 0 corresponding to no drug being used and
u = umax corresponding to a full dose. For simplicity here it is assumed that the
dosage, the concentration and the effect of the drug are equal, i.e. pharmacokinetics
(PK) or pharmacodynamics (PD) are not modelled. It is assumed that the drug
kills a fixed proportion u of the outflow of the sensitive cells at time t, aS(t), and
therefore only the remaining fraction (1−u)aS(t) of cells undergoes cell division. Of
these new cells then (2−q)(1−u)aS(t) remain sensitive, while a fraction q(1−u)aS(t)
mutates to resistant cells. It is assumed that the drug has no effect on resistant
Thus, in principle the cancer cells can be reduced through chemotherapy provided
aS > cR, the typical situation initially. However, since applying drugs diminishes
only the sensitive population the resistant population eventually takes over and the
total number of cancer cells then will still grow exponentially. Now the quotient
x = S/R satisfies a linear ODE,
\[ \dot{x} = rc - \bigl(a + (1-r)c\bigr)\,x, \]
with stable equilibrium at
\[ \bar{x} = \frac{rc}{a + (1-r)c}. \tag{11} \]
This value is small (since r is), even 0 in case of no gene de-amplification (r = 0).
Thus drug resistance takes over in the model, no matter what, and this is consistent
with medical experience. Clearly, in a specific case the actual parameters may be
favorable and it may take a very long time. These precisely are the situations when
chemotherapy will be successful.
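The convergence of the quotient S/R can be checked numerically. The following is a minimal sketch, assuming the two-compartment dynamics implied by the verbal description above (deaths, return flows and cross-over flows, with the drug acting only on the dividing sensitive cells) and a constant full dose u = 1. The parameter values, the initial condition, the step size and all function names are illustrative assumptions and are not taken from medical data.

```python
# Sketch of the two-compartment model under a constant dose u.
# The right-hand side is an assumption reconstructed from the verbal
# description of the flows; all numerical values are illustrative only.

def rhs(state, u, a, c, q, r):
    """Time derivatives (dS/dt, dR/dt) of sensitive cells S and resistant cells R."""
    S, R = state
    dS = -a * S + (1 - u) * (2 - q) * a * S + r * c * R
    dR = -c * R + (2 - r) * c * R + (1 - u) * q * a * S
    return (dS, dR)

def rk4_step(state, h, u, a, c, q, r):
    """One classical fourth-order Runge-Kutta step of size h."""
    def shifted(base, slope, factor):
        return tuple(b + factor * s for b, s in zip(base, slope))
    k1 = rhs(state, u, a, c, q, r)
    k2 = rhs(shifted(state, k1, h / 2), u, a, c, q, r)
    k3 = rhs(shifted(state, k2, h / 2), u, a, c, q, r)
    k4 = rhs(shifted(state, k3, h), u, a, c, q, r)
    return tuple(s + h / 6 * (p1 + 2 * p2 + 2 * p3 + p4)
                 for s, p1, p2, p3, p4 in zip(state, k1, k2, k3, k4))

def quotient_under_full_dose(T=40.0, h=0.01, a=0.4, c=0.3, q=0.02, r=0.01,
                             S0=0.9, R0=0.1):
    """Return (S/R at time T under u = 1, predicted equilibrium rc/(a+(1-r)c))."""
    state = (S0, R0)
    for _ in range(int(round(T / h))):
        state = rk4_step(state, h, 1.0, a, c, q, r)
    S, R = state
    return S / R, r * c / (a + (1 - r) * c)

if __name__ == "__main__":
    x_T, x_bar = quotient_under_full_dose()
    print(f"S/R at final time: {x_T:.6f}, predicted equilibrium: {x_bar:.6f}")
```

Setting r = 0 in the call makes the predicted equilibrium zero, which corresponds to the case of no gene de-amplification mentioned above.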
over all admissible controls u : [0, T ] → [0, umax ] subject to the dynamics (5) and
additional constraints that may be imposed. For example, the objective might be
chosen as to maximize the total time T while restricting the overall amount of drug
given, ∫₀ᵀ u(t) dt ≤ A, and requiring that an upper bound on the number
of cancer cells, S(t) + R(t) ≤ N̄, not be violated. State space constraints, besides leading to a much
more difficult problem mathematically, have the disadvantage that in principle they
require constant monitoring of the number of cancer cells, which is not feasible. It
therefore is more practical, and easier to handle mathematically, to formulate the
constraints on drugs and cancer cells implicitly by including these terms in the
objective. Since chemotherapy is normally given over some specified time period,
it also seems reasonable to minimize an objective of the type (12) over a fixed time
interval [0, T]. The objective of treatment is to kill as many of the sensitive cancer
cells as possible while limiting both the size of the resistant subpopulation and the
overall toxicity to the patient. Mathematically we therefore consider for the moment
the general problem
(P): minimize J = ∫₀ᵀ L(N, u) dt + ϕ(N(T)) over all Lebesgue measurable controls
u : [0, T] → [0, umax] subject to the dynamics Ṅ = (A + uB)N with
N(0) = N0 given and with positive components.
Necessary conditions for optimality of a control u∗ : [0, T ] → [0, umax ] are given
by the Pontryagin Maximum Principle [22]. It is easily seen that for our case the ab-
normal situation is not possible and hence these conditions can be stated as follows:
if u∗ is an optimal control with corresponding trajectory N∗ : [0, T ] → P = {(S, R) :
S > 0, R > 0}, then it follows that there exists an absolutely continuous function λ,
From a practical point of view, choosing an objective that is quadratic in the control
tends to understate the side effects and it favors giving partial doses. Typically
solutions will have segments where the control is given by the stationary point of
the Hamiltonian (whenever this minimum lies in the interior of the control set),
implying the use of time-varying partial doses that depend on the current number of
cancer cells. Controls of this type are not yet medically realistic.
An alternative way, and this is the one we pursue in this paper, is the use of a
Lagrangian function L which is linear in the control u. There exists some biological
justification for this if one equates the numbers of cells killed with the numbers of
ineffective cell divisions (and, at least for some range, this number can reasonably be
assumed to be proportional to the overall amount of drugs given). Mathematically,
we thus consider the problem to minimize an objective of the form
\[ J = kN(T) + \int_0^T \bigl(\ell N(t) + u(t)\bigr)\,dt \to \min \tag{16} \]
where k and ℓ are row-vectors of weights with the components of k positive and
those of ℓ non-negative. The penalty term kN(T) represents a weighted average of
the total number of cancer cells at the end of an assumed fixed therapy interval [0, T]
and the term ℓN(t) models cumulative effects during the therapy. The control term
u(t) in the Lagrangian models the negative side effects of the drugs, measured in the
L¹ norm. The parameters k and ℓ can also be used to put a stronger emphasis on
the number of cancer cells, since it can be argued that a doubling of the cancer or of
the drug dose is more hazardous than the corresponding doubling of the objective would represent.
For the methods used below it is important that the Hamiltonian is linear in the
control u, but its structure in N can be rather arbitrary, i.e. more generally we
could consider
\[ J = \phi(N(T)) + \int_0^T \bigl(L(N(t)) + u(t)\bigr)\,dt \to \min \tag{17} \]
with smooth penalty functions ϕ and Lagrangian L depending on the state N if such
an effect is considered important. In this paper, however, we use linear functions.
The mathematical problem therefore becomes to find a Lebesgue-measurable func-
tion u : [0, T ] → [0, umax ] which minimizes (16) subject to the dynamical equations
(3) and (4).
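To make the objective (16) concrete, the sketch below evaluates J for a given dosing schedule by integrating the running cost together with the state. It assumes the two-compartment dynamics implied by the verbal description of the flows, a control that is held constant over each integration step, and purely illustrative values for the parameters, the weights k and ℓ (called `ell` in the code) and the initial condition; none of these numbers come from the paper.

```python
# Sketch: evaluate J = k N(T) + integral_0^T (ell N(t) + u(t)) dt for a given
# control. Dynamics, weights and parameter values are illustrative assumptions.

def rhs(S, R, u, a=0.4, c=0.3, q=0.02, r=0.01):
    """Assumed two-compartment dynamics for sensitive (S) and resistant (R) cells."""
    dS = -a * S + (1 - u) * (2 - q) * a * S + r * c * R
    dR = -c * R + (2 - r) * c * R + (1 - u) * q * a * S
    return dS, dR

def objective(control, T=10.0, h=0.01, N0=(0.9, 0.1), k=(1.0, 1.0), ell=(1.0, 1.0)):
    """Integrate state and running cost with RK4; `control(t)` returns the dose."""
    def f(S, R, u):
        dS, dR = rhs(S, R, u)
        # Third component is the integrand of the running cost.
        return dS, dR, ell[0] * S + ell[1] * R + u

    S, R, running = N0[0], N0[1], 0.0
    for i in range(int(round(T / h))):
        u = control((i + 0.5) * h)  # dose sampled at the step midpoint
        k1 = f(S, R, u)
        k2 = f(S + h / 2 * k1[0], R + h / 2 * k1[1], u)
        k3 = f(S + h / 2 * k2[0], R + h / 2 * k2[1], u)
        k4 = f(S + h * k3[0], R + h * k3[1], u)
        dS, dR, dJ = (h / 6 * (p1 + 2 * p2 + 2 * p3 + p4)
                      for p1, p2, p3, p4 in zip(k1, k2, k3, k4))
        S, R, running = S + dS, R + dR, running + dJ
    return k[0] * S + k[1] * R + running

if __name__ == "__main__":
    full_then_rest = lambda t: 1.0 if t < 5.0 else 0.0
    print("J for full dose on [0, 5), rest afterwards:", objective(full_then_rest))
    print("J for no treatment:", objective(lambda t: 0.0))
```

Sampling the control at the midpoint of each step avoids ambiguity at switching times that fall on grid points.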
2.5. Analysis of optimal controls. Necessary conditions for optimality of a con-
trol u∗ : [0, T ] → [0, umax ] are given by the Pontryagin Maximum Principle [22].
As mentioned already, the abnormal situation is not possible and hence these
conditions can be stated as follows: if u∗ is an optimal control with correspond-
ing trajectory N∗ : [0, T ] → P = {(S, R) : S > 0, R > 0}, then it follows that
there exists an absolutely continuous function λ, which we write as row-vector,
λ = (λ0, λ1) : [0, T] → (ℝ²)∗, satisfying the adjoint equation
\[ \dot{\lambda} = -\lambda(A + u_* B) - \ell, \qquad \lambda(T) = k, \tag{18} \]
such that the optimal control u∗ minimizes the Hamiltonian H,
\[ H = \ell N + u + \lambda(A + uB)N, \tag{19} \]
over the control set [0, umax] along (λ(t), N∗(t)). Since the Hamiltonian is linear in
u and the control set is an interval, defining the switching function Φ by
\[ \Phi(t) = \frac{\partial H}{\partial u} = 1 + \lambda(t)BN(t) \tag{20} \]
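As a worked step, the Hamiltonian depends on u only through the product of u with the switching function, so the minimization over the interval [0, umax] can be read off from the sign of Φ. The relation below is the standard consequence for control-affine problems and is given here as a sketch under that general argument; it is not a quotation of the authors' formula.

```latex
H = \ell N + \lambda A N + u\,\Phi(t),
\qquad
u_*(t) =
\begin{cases}
0 & \text{if } \Phi(t) > 0,\\[2pt]
u_{\max} & \text{if } \Phi(t) < 0.
\end{cases}
```

Where Φ vanishes identically on an open interval, this minimization leaves the control undetermined; controls on such intervals are called singular.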
Proposition 3. Singular controls are not optimal in regions of the state space where
qS > (2 − q)R.
This holds as long as the resistant population is very small and then optimal
controls will be bang-bang, i.e. correspond to sessions of full dose chemotherapy
with rest periods interlaced. However, as the portion of resistant cells R increases,
the Legendre-Clebsch condition will be satisfied and, if the singular control is admissible,
there are mathematical reasons to suspect that it will indeed be optimal in this case. It
is also quite intuitive that a full dose may do more harm than good once resistance
builds up and thus the optimal strategy may switch to give partial doses, i.e. use
singular controls.
3.1. Modeling Aspects. We consider two cytostatic killing agents whose dosages
are labelled u1 and u2, both with values in intervals [0, u^i_max], i = 1, 2. (As before,
the value 0 represents “no dose” and the value u^i_max corresponds to a “maximum
dose”.) The state space now is comprised of four compartments, a compartment S
of cells sensitive to both drugs, a compartment L1 of cells sensitive to drug u1 , but
resistant to drug u2 , a compartment L2 of cells sensitive to drug u2 , but resistant
to drug u1 , and a compartment R of cells resistant to both drugs. We denote
the average numbers of cells in these compartments by the corresponding capital
Roman letters.
[Figure: flow diagram of the four compartments S, L1, L2 and R; arrow labels include q1, q2, r1, r2, s1, s2, 2-s1-r2 and 2-r1-s2.]
and thus becomes quadratic in the controls. In many probabilistic models (e.g.
[4]) in order to simplify the analysis similar quadratic terms are linearized with the
reasoning that the probabilities involved are small. But for this model the validity
of such an argument is questionable and is not needed. Overall the dynamics we
consider is therefore given as follows:
Ṡ = −aS + (1 − u1 )(1 − u2 )(2 − q1 − q2 )aS + (1 − u1 )r2 b1 L1 + (1 − u2 )r1 b2 L2 ,
L̇1 = −b1 L1 + (1 − u1 )(2 − s1 − r2 )b1 L1 + (1 − u1 )(1 − u2 )q1 aS + r1 cR, (37)
L̇2 = −b2 L2 + (1 − u2 )(2 − s2 − r1 )b2 L2 + (1 − u1 )(1 − u2 )q2 aS + r2 cR, (38)
Ṙ = −cR + (2 − r1 − r2 )cR + (1 − u1 )s1 b1 L1 + (1 − u2 )s2 b2 L2 . (39)
Proposition 4. For all Lebesgue measurable controls u = (u1, u2) : [0, ∞) →
[0, u^1_max] × [0, u^2_max] the solution N(·) = (S(·), L1(·), L2(·), R(·))ᵀ exists on [0, ∞)
and its components are positive.
Proof. As in Proposition 1, a contradiction arises if we assume there is a finite first
time at which any of the components vanishes.
The dynamical equations become more transparent if we change the control vari-
ables to vi = 1 − ui . Then the dynamical equations can be written in the form
Ṅ = (A + v1 B1 + v2 B2 + v1 v2 C)N (40)
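where
\[
A = \begin{pmatrix}
-a & 0 & 0 & 0\\
0 & -b_1 & 0 & r_1 c\\
0 & 0 & -b_2 & r_2 c\\
0 & 0 & 0 & (1-r_1-r_2)c
\end{pmatrix},
\qquad
C = a \begin{pmatrix}
2-q_1-q_2 & 0 & 0 & 0\\
q_1 & 0 & 0 & 0\\
q_2 & 0 & 0 & 0\\
0 & 0 & 0 & 0
\end{pmatrix},
\]
\[
B_1 = b_1 \begin{pmatrix}
0 & r_2 & 0 & 0\\
0 & 2-s_1-r_2 & 0 & 0\\
0 & 0 & 0 & 0\\
0 & s_1 & 0 & 0
\end{pmatrix},
\qquad
B_2 = b_2 \begin{pmatrix}
0 & 0 & r_1 & 0\\
0 & 0 & 0 & 0\\
0 & 0 & 2-s_2-r_1 & 0\\
0 & 0 & s_2 & 0
\end{pmatrix}.
\tag{42}
\]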
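The agreement between the matrix form (40) and the componentwise equations can be verified numerically. The sketch below builds A, B1, B2 and C as nested lists, evaluates both right-hand sides at a test state and reports the largest discrepancy. The parameter values are the illustrative ones used in the simulations (with c = 0.3); the function names and the test state are assumptions made only for this check.

```python
# Sketch: check that (A + v1*B1 + v2*B2 + v1*v2*C) N reproduces the
# componentwise dynamics written in terms of v_i = 1 - u_i.
# Parameter values are illustrative, not medical data.

P = {"a": 0.4, "b1": 0.35, "b2": 0.35, "c": 0.3,
     "q1": 0.02, "q2": 0.02, "r1": 0.005, "r2": 0.01, "s1": 0.02, "s2": 0.02}

def componentwise_rhs(N, v1, v2, p=P):
    """Right-hand sides for (S, L1, L2, R) with v_i = 1 - u_i substituted."""
    S, L1, L2, R = N
    a, b1, b2, c = p["a"], p["b1"], p["b2"], p["c"]
    dS = (-a * S + v1 * v2 * (2 - p["q1"] - p["q2"]) * a * S
          + v1 * p["r2"] * b1 * L1 + v2 * p["r1"] * b2 * L2)
    dL1 = (-b1 * L1 + v1 * (2 - p["s1"] - p["r2"]) * b1 * L1
           + v1 * v2 * p["q1"] * a * S + p["r1"] * c * R)
    dL2 = (-b2 * L2 + v2 * (2 - p["s2"] - p["r1"]) * b2 * L2
           + v1 * v2 * p["q2"] * a * S + p["r2"] * c * R)
    dR = (-c * R + (2 - p["r1"] - p["r2"]) * c * R
          + v1 * p["s1"] * b1 * L1 + v2 * p["s2"] * b2 * L2)
    return [dS, dL1, dL2, dR]

def matrices(p=P):
    """Return A, B1, B2, C as 4x4 nested lists."""
    a, b1, b2, c = p["a"], p["b1"], p["b2"], p["c"]
    q1, q2, r1, r2, s1, s2 = (p[n] for n in ("q1", "q2", "r1", "r2", "s1", "s2"))
    A = [[-a, 0, 0, 0], [0, -b1, 0, r1 * c],
         [0, 0, -b2, r2 * c], [0, 0, 0, (1 - r1 - r2) * c]]
    B1 = [[0, b1 * r2, 0, 0], [0, b1 * (2 - s1 - r2), 0, 0],
          [0, 0, 0, 0], [0, b1 * s1, 0, 0]]
    B2 = [[0, 0, b2 * r1, 0], [0, 0, 0, 0],
          [0, 0, b2 * (2 - s2 - r1), 0], [0, 0, b2 * s2, 0]]
    C = [[a * (2 - q1 - q2), 0, 0, 0], [a * q1, 0, 0, 0],
         [a * q2, 0, 0, 0], [0, 0, 0, 0]]
    return A, B1, B2, C

def matrix_rhs(N, v1, v2, p=P):
    """Evaluate (A + v1*B1 + v2*B2 + v1*v2*C) N row by row."""
    A, B1, B2, C = matrices(p)
    return [sum((A[i][j] + v1 * B1[i][j] + v2 * B2[i][j] + v1 * v2 * C[i][j]) * N[j]
                for j in range(4)) for i in range(4)]

def max_discrepancy(N, v1, v2, p=P):
    """Largest absolute difference between the two evaluations."""
    return max(abs(x - y) for x, y in
               zip(componentwise_rhs(N, v1, v2, p), matrix_rhs(N, v1, v2, p)))

if __name__ == "__main__":
    print(max_discrepancy((0.9, 0.05, 0.05, 0.2), 0.3, 0.7))
```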
As above, the optimal control problem becomes to choose Lebesgue measurable
functions vi : [0, T] → [v^i_min, 1], v^i_min = 1 − u^i_max, i = 1, 2, to minimize an objective
of the form
\[ J = kN(T) + \int_0^T \bigl(\ell N(t) - m\,v(t)\bigr)\,dt \to \min \tag{43} \]
subject to the dynamics (40). Here m = (m1, m2) also is a row-vector of positive
3.2. Analysis of optimal controls. It is again easily seen that optimal controls
are normal. Thus, if v∗ = (v1∗, v2∗) is an optimal control with corresponding trajectory
N∗ = (S∗, L1∗, L2∗, R∗) : [0, T] → P = {(S, L1, L2, R) : S > 0, L1 > 0, L2 > 0, R > 0},
then it follows that there exists an absolutely continuous function λ, which we write
as row-vector, λ = (λ0, λ1, λ2, λ3) : [0, T] → (ℝ⁴)∗, satisfying the adjoint equation
\[ \dot{\lambda} = -\lambda(A + v_1^* B_1 + v_2^* B_2 + v_1^* v_2^* C) - \ell, \qquad \lambda(T) = k, \tag{45} \]
These two curves divide the (Ψ̂1 , Ψ̂2 )-space into four sectors as shown in Fig. 2
and the optimal control is given by one of the vertices of V as the vector (Ψ̂1 (t), Ψ̂2 (t))
lies in one of these regions. More precisely, we have:
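[Fig. 2: the four sectors of the (Ψ̂1, Ψ̂2)-space with the corresponding optimal vertices of V; surviving labels: V = (1,0), V = (0,0).]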
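The reason a vertex of the control set is selected can be illustrated directly. With the dynamics (40) and the objective (43), the part of the Hamiltonian that depends on the controls has the bilinear form α1 v1 + α2 v2 + γ v1 v2, and a bilinear function on a rectangle attains its minimum at a vertex. The sketch below computes these coefficients from a given multiplier λ and state N and compares the four vertices. The names α1, α2, γ, the helper functions and the sample values of λ and N are assumptions for illustration; the code does not reproduce the sector classification in terms of Ψ̂1 and Ψ̂2.

```python
# Sketch: pointwise minimization of the control-dependent part of the
# Hamiltonian, alpha1*v1 + alpha2*v2 + gamma*v1*v2, over the rectangle
# [v1_min, 1] x [v2_min, 1]. Parameter values are illustrative only.
from itertools import product

P = {"a": 0.4, "b1": 0.35, "b2": 0.35,
     "q1": 0.02, "q2": 0.02, "r1": 0.005, "r2": 0.01, "s1": 0.02, "s2": 0.02}

def bilinear_coefficients(lam, N, m, p=P):
    """Return (alpha1, alpha2, gamma) = (lam B1 N - m1, lam B2 N - m2, lam C N)."""
    S, L1, L2, R = N
    l0, l1, l2, l3 = lam
    lam_B1_N = p["b1"] * L1 * (l0 * p["r2"] + l1 * (2 - p["s1"] - p["r2"]) + l3 * p["s1"])
    lam_B2_N = p["b2"] * L2 * (l0 * p["r1"] + l2 * (2 - p["s2"] - p["r1"]) + l3 * p["s2"])
    lam_C_N = p["a"] * S * (l0 * (2 - p["q1"] - p["q2"]) + l1 * p["q1"] + l2 * p["q2"])
    return lam_B1_N - m[0], lam_B2_N - m[1], lam_C_N

def control_part(v, coeffs):
    """Value of alpha1*v1 + alpha2*v2 + gamma*v1*v2 at v = (v1, v2)."""
    alpha1, alpha2, gamma = coeffs
    return alpha1 * v[0] + alpha2 * v[1] + gamma * v[0] * v[1]

def minimizing_vertex(lam, N, m=(1.0, 1.0), v_min=(0.0, 0.0), p=P):
    """Vertex of the control rectangle with the smallest Hamiltonian value.
    Recall that v_i = 1 means drug i is not given and v_i = v_min means full dose."""
    coeffs = bilinear_coefficients(lam, N, m, p)
    vertices = product((v_min[0], 1.0), (v_min[1], 1.0))
    return min(vertices, key=lambda v: control_part(v, coeffs))

if __name__ == "__main__":
    # Sample multiplier equal to the terminal value k = (1, 1, 1, 1).
    print(minimizing_vertex((1.0, 1.0, 1.0, 1.0), (0.9, 0.05, 0.05, 0.0)))
```

Ties between vertices are resolved by enumeration order here; along an actual extremal such ties correspond to switching times and would need separate treatment.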
4. Simulations. We include some simulations for the multi-drug model. The nu-
merical values chosen are just for illustrative purposes and are not based on medical
data. The data for the objective are the same for all runs and we simply picked all
parameters arising in the objective as 1, i.e. k = ℓ = (1, 1, 1, 1) and m = (1, 1). Also,
for simplicity we take u^1_max = u^2_max = 1. As transition probabilities in the
dynamics we chose q1 = .02, q2 = .02, r1 = .005, r2 = .01, s1 = .02, and s2 = .02.
In the data for the cell-cycle parameters we fix a = .4, b1 = .35, and b2 = .35,
but for some of the runs we vary the transit time through the compartment R of
cells that are resistant to both drugs. Clearly these parameters strongly influence
the structure of controls and here we only compare briefly runs for c = .3 and for
c = .8. The value c = .3 corresponds to a situation when the transit times through
the resistant compartments are somewhat slower than for the sensitive compartment
(tumors that are responding to treatment) while c = .8 means that these doubly
resistant cells duplicate on average twice as fast, a situation more reminiscent of
malignant situations. In the simulations below the time horizon is always taken as
the interval [0, 10].
We compare an extremal control (i.e. one that satisfies the necessary conditions
for optimality) with two reasonable ad hoc choices as reference controls. Reference
control 1 applies both drugs simultaneously at full dose over the intervals [0, 2],
[4, 6] and [8, 10] with rest periods in between while reference control 2 alternates
the drugs over these intervals with drug 1 given on [0, 2], [4, 6] and [8, 10] and drug
2 given on [2, 4] and [6, 8] so that there is no rest-period over the therapy interval.
Figs. 3 and 4 compare the response of the system for an initial condition given by
S(0) = .90, L1(0) = .05, L2(0) = .05, and R(0) = 0 for the case when c = .3. In all
figures the solid graph represents the response of the sensitive cells S, the dashed
curves give the responses of L1 and L2 and the dash-dot curve gives the response
of the fully resistant compartment R. Note that the curves for L1 and L2 in Fig.
3 overlay because of the symmetries in the data. In this case it is evident that
alternating the drugs is the better strategy. The reason simply lies in the fact that
resistance is still very small and does not yet build up significantly over the therapy
interval. Side effects have not yet become an issue for this initial condition.
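The two reference protocols can be simulated with a few lines of code. The sketch below integrates the four-compartment dynamics with a fixed-step Runge-Kutta scheme, using the parameter values and initial condition quoted above; the integrator, the step size and the function names are assumptions chosen for illustration, and the numbers it prints are not claimed to match the figures digit for digit.

```python
# Sketch: simulate the four-compartment model under the two reference controls.
# Parameters are the illustrative values quoted in the text; c is passed in.

P = {"a": 0.4, "b1": 0.35, "b2": 0.35,
     "q1": 0.02, "q2": 0.02, "r1": 0.005, "r2": 0.01, "s1": 0.02, "s2": 0.02}

def rhs(N, u1, u2, c, p=P):
    """Derivatives of (S, L1, L2, R) for doses u1, u2 in [0, 1]."""
    S, L1, L2, R = N
    v1, v2 = 1.0 - u1, 1.0 - u2
    a, b1, b2 = p["a"], p["b1"], p["b2"]
    dS = (-a * S + v1 * v2 * (2 - p["q1"] - p["q2"]) * a * S
          + v1 * p["r2"] * b1 * L1 + v2 * p["r1"] * b2 * L2)
    dL1 = (-b1 * L1 + v1 * (2 - p["s1"] - p["r2"]) * b1 * L1
           + v1 * v2 * p["q1"] * a * S + p["r1"] * c * R)
    dL2 = (-b2 * L2 + v2 * (2 - p["s2"] - p["r1"]) * b2 * L2
           + v1 * v2 * p["q2"] * a * S + p["r2"] * c * R)
    dR = (-c * R + (2 - p["r1"] - p["r2"]) * c * R
          + v1 * p["s1"] * b1 * L1 + v2 * p["s2"] * b2 * L2)
    return (dS, dL1, dL2, dR)

def reference_control_1(t):
    """Both drugs at full dose on [0,2], [4,6], [8,10]; rest periods in between."""
    return (1.0, 1.0) if int(t // 2) % 2 == 0 else (0.0, 0.0)

def reference_control_2(t):
    """Drug 1 on [0,2], [4,6], [8,10]; drug 2 on [2,4], [6,8]; no rest period."""
    return (1.0, 0.0) if int(t // 2) % 2 == 0 else (0.0, 1.0)

def simulate(control, c, N0=(0.90, 0.05, 0.05, 0.0), T=10.0, h=0.005):
    """Return the state at time T; the dose is held constant over each step."""
    def shifted(x, slope, factor):
        return tuple(xi + factor * si for xi, si in zip(x, slope))
    N = tuple(N0)
    for i in range(int(round(T / h))):
        u1, u2 = control((i + 0.5) * h)  # sample at the step midpoint
        k1 = rhs(N, u1, u2, c)
        k2 = rhs(shifted(N, k1, h / 2), u1, u2, c)
        k3 = rhs(shifted(N, k2, h / 2), u1, u2, c)
        k4 = rhs(shifted(N, k3, h), u1, u2, c)
        N = tuple(n + h / 6 * (p1 + 2 * p2 + 2 * p3 + p4)
                  for n, p1, p2, p3, p4 in zip(N, k1, k2, k3, k4))
    return N

if __name__ == "__main__":
    for name, ctrl in [("reference control 1", reference_control_1),
                       ("reference control 2", reference_control_2)]:
        S, L1, L2, R = simulate(ctrl, c=0.3)
        print(f"{name}: S={S:.4f} L1={L1:.4f} L2={L2:.4f} R={R:.4f} "
              f"total={S + L1 + L2 + R:.4f}")
```

Calling `simulate` with `c=0.8` instead of `c=0.3` gives the setting in which the doubly resistant compartment duplicates twice as fast as the sensitive one.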
[Fig. 3: states S, L1, L2 and R; time axis 0 to 10.]
[Fig. 4: states S, L1, L2 and R with alternating controls; time axis 0 to 10.]
On the other end of the spectrum of possibilities, Figs. 5 and 6 compare the
response of the system for the same initial condition (S(0) = .90, L1 (0) = .05,
L2(0) = .05, R(0) = 0), but now for the case when c = .8. In this case the strategy of
alternating the drugs is drastically inferior. Comparing the total number of cancer
cells that had been normalized to 1 at the initial time, reference control 1 leads to
a reduction in the total number of cancer cells to 0.7348, but the fully resistant
population R has grown to R = .2826, more than one third of the overall number
of cancer cells. For the alternating strategy 2 the total number of cancer cells
actually increases to 1.4957 and the fully resistant portion makes up 1.3467 of it, a
horrendous outcome. The reason is that resistance, once established at any ever so
small proportion, rapidly takes over. Hence the more drug can be applied initially,
the better it seems to be. Fig. 8 gives the response to an extremal control that
was computed by backward integration from the same terminal condition that was
generated with the now better reference control 1. The corresponding control is
identically v2∗ ≡ 1 over [0, 10] (i.e. the second drug is NOT used) and the first control
v1∗ switches from v1 = 0 to v1 = 1 at τ = 6.11. In this case, with the same number of
cancer cells at the end of the therapy interval, the initial number of cancer cells
is 1.0216; thus there is a comparable response in reducing the cancer cells, but it is
achieved with roughly half the dose since only the first drug is given.
[Fig. 5: states S, L1, L2 and R; time axis 0 to 10.]
[Fig. 6: states S, L1, L2 and R with alternating controls; time axis 0 to 10.]
[Fig. 7: extremal control v1; time axis 0 to 10.]
[Fig. 8: states S, L1, L2 and R for the extremal control; time axis 0 to 10.]
Figs. 9 and 10 compare the response of the system for a different initial condition
given by S(0) = .8, L1(0) = 0, L2(0) = .2, and R(0) = 0, and again for c = .8. As
above the strategy of alternating the drugs is inferior. Comparing the total number
of cancer cells that had been normalized to 1 at the initial time, reference control
1 now only leads to a minimal reduction in the total number of cancer cells to
0.9883 with the fully resistant population R growing to R = .5309, more than
half of the overall number of cancer cells. The alternating strategy 2 is disastrous
with the total number of cancer cells multiplying more than five-fold to 5.3292 and
the doubly resistant portion making up 4.8843. The reason for the much shorter
lead time until resistance builds up of course is that we already have 20% resistant
cells, although only resistant to drug 1, at the beginning. Nevertheless, through the
transitions to R this causes a resistant population to develop faster and then quickly
to become dominant. Fig. 12 again gives the response to an extremal control that
was computed by backward integration from the same terminal condition that was
generated with the now better reference control 1. The corresponding control now
is identically v1∗ ≡ 1 over [0, 10] (i.e. the first drug is NOT used, consistent with the
existing resistance to drug 1 of some of the cells) and the second control v2∗ switches
from v2 = 0 to v2 = 1 at τ = 6.75. In this case, however, with the same number of
cancer cells at the end of the therapy interval, the initial number of cancer cells
is 1.7410, which amounts to a much better response in reducing the cancer cells, and this
again is achieved with roughly half the dose since only the second drug is given.
[Fig. 9: system response (panel title lost); time axis 0 to 10.]
[Fig. 10: states S, L1, L2 and R with alternating controls; time axis 0 to 10.]
[Fig. 11: extremal control v2; time axis 0 to 10.]
[Fig. 12: states S, L1, L2 and R for the extremal control; time axis 0 to 10.]
given for a single drug model. This condition indicates that if the resistant popula-
tion becomes too large, then bang-bang controls may no longer be optimal since in
this case the damage caused by a full dose to healthy cells outweighs the benefits
of killing the cancer cells. In this situation we cannot help the patient any more
by treatment with this drug and the natural choice becomes combination treatments
involving other drugs. An example of a model for such a treatment involving
two drugs is presented in the second part of the paper. Mathematically it has a
different structure than the previous model since the dynamics is quadratic in the
control. Our initial results show that in scheduling the therapy it is not optimal to
simultaneously withdraw or initiate the treatment of both drugs.
