The Accumulation of Beneficial Mutations and Convergence to a Poisson Process
Abstract
We consider a model of a population with fixed size , which is subjected to an unlimited supply of beneficial mutations at a constant rate . Individuals with beneficial mutations have the fitness . Each individual dies at rate 1 and is replaced by a random individual chosen with probability proportional to its fitness. We show that when and for some , large numbers of beneficial mutations are present in the population at the same time, competing against each other, yet the fixation times of beneficial mutations, after a time scaling, converge to the times of a Poisson process.
MSC: Primary 92D15; Secondary 60J27, 60J80, 92D25
Keywords: Population model, mutation, selection, Poisson process
1 Introduction
One of the most important questions in evolutionary biology is to understand how beneficial mutations accumulate in a population. We consider here a simple model of a population which repeatedly acquires beneficial mutations. We assume the population has fixed size . We assume that, at time zero, no individuals have mutations, but then each individual in the population independently acquires mutations at times of a homogeneous Poisson process with rate . All mutations are assumed to be beneficial and to increase the individual’s fitness by a factor of , so that an individual with mutations has fitness . We assume that each individual independently lives for an exponentially distributed time with rate . When an individual dies, it gets replaced by a new individual whose parent is chosen at random from the individuals in the population, with probability proportional to the individual’s fitness. The new individual inherits all of its parent’s mutations.
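The dynamics described above can be illustrated with a minimal simulation sketch. The names `n` (population size), `mu` (per-individual mutation rate), and `s` (selective advantage), the death rate of 1, and the fitness form `(1 + s) ** k` for an individual with `k` mutations are assumptions made for this illustration. The sketch also lets the dying individual be chosen as its own parent, which is one possible reading of the replacement rule.

```python
import random


def simulate_population(n, mu, s, t_max, seed=0):
    """Simulate the type counts of the population up to time t_max.

    Assumptions of this sketch: each individual dies at rate 1, mutates at
    rate mu, and an individual with k mutations has fitness (1 + s) ** k.
    Returns a dict mapping number of mutations -> number of individuals.
    """
    rng = random.Random(seed)
    types = [0] * n                  # mutations carried by each individual
    t = 0.0
    total_rate = n * (1.0 + mu)      # n death clocks plus n mutation clocks
    while True:
        t += rng.expovariate(total_rate)
        if t > t_max:
            break
        i = rng.randrange(n)         # individual affected by this event
        if rng.random() < mu / (1.0 + mu):
            types[i] += 1            # mutation: one more beneficial mutation
        else:
            # Death: the parent is chosen with probability proportional to
            # fitness; fitness is taken relative to the least fit type.
            k_min = min(types)
            weights = [(1.0 + s) ** (k - k_min) for k in types]
            parent = rng.choices(range(n), weights=weights)[0]
            types[i] = types[parent]
    counts = {}
    for k in types:
        counts[k] = counts.get(k, 0) + 1
    return counts
```

Calling `simulate_population(200, 1e-4, 0.05, 500.0)` returns the type counts at time 500 for one random run. The cost per event is linear in `n`, so this is only suitable for small populations.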
It is instructive to consider what happens after one individual acquires a beneficial mutation, if we assume that no further mutations can occur. As we will explain in more detail below, the number of individuals with the mutation then evolves like a birth and death chain in which the ratio of the birth rate to the death rate is . Classical results on asymmetric random walks imply that the probability that this chain reaches before is
which is approximately as long as as . Therefore, the beneficial mutation may quickly disappear, but with probability approximately , the beneficial mutation will spread to the entire population, an event known as a selective sweep. One can also show that the duration of a selective sweep, that is, the time required for a beneficial mutation to spread to the entire population, is approximately . This question was first investigated by Kimura and Ohta [16], and a rigorous analysis for a population model very similar to the one presented here is given in section 6.1 of [5].
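The classical hitting probability for an asymmetric random walk can be evaluated numerically as a sanity check. The sketch below assumes a birth-to-death ratio `r` and an absorbing upper level `n`; identifying `r` with `1 + s` for a selective advantage `s` is an assumption of this illustration.

```python
def fixation_probability(r, n, start=1):
    """Probability that a birth-death chain started at `start` reaches n
    before 0, when each step is a birth with probability r / (1 + r).

    This is the gambler's ruin formula; r = 1 is the neutral case.
    """
    if r == 1.0:
        return start / n
    return (1.0 - r ** (-start)) / (1.0 - r ** (-n))


# With r = 1 + s and n large, the value is close to s / (1 + s), which is
# approximately s when s is small.
s = 0.01
p = fixation_probability(1.0 + s, 10 ** 5)
```

For `s = 0.01` and `n = 10 ** 5`, the denominator is numerically equal to 1, so `p` agrees with `s / (1 + s)` to machine precision.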
Returning now to the original population model, because there are individuals acquiring mutations at rate , the total mutation rate for the population is . Therefore, the rate of mutations which trigger a selective sweep is approximately . It follows that the expected time between such mutations is approximately . Therefore, the time between selective sweeps is much longer than the duration of a selective sweep provided that . As a result, when , we expect to have approximately exponentially distributed waiting times between selective sweeps, so that after a suitable rescaling of time, the times of selective sweeps converge as to the times of a homogeneous Poisson process. When is bounded away from zero as , which is the case of strong selection, this result is straightforward to prove because with high probability, there will only be one beneficial mutation in the population at any given time that has not already spread to the entire population, which means the selective sweeps can be analyzed individually. However, when as , even though the selective sweeps are well separated in time, at any given time there will be many different mutations in the population that will ultimately die out before spreading to a large number of individuals. The presence of these additional mutants leads to what is known as clonal interference and complicates the analysis significantly. In this paper, we demonstrate that nevertheless one can prove the expected result for a range of values of which includes the case of moderate selection, where for some .
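The comparison of time scales in this heuristic can be made concrete with a small numerical sketch. It assumes a population of size `n`, per-individual mutation rate `mu`, and selective advantage `s`, so that sweep-triggering mutations arrive at rate roughly `n * mu * s`, and it takes the duration of a sweep to be roughly `(2 / s) * log(n)`. These specific expressions are assumptions of the sketch rather than quotations from the text.

```python
import math


def sweep_time_scales(n, mu, s):
    """Return (expected waiting time between sweep-triggering mutations,
    approximate sweep duration, ratio of the two) under the assumed heuristics.
    """
    waiting = 1.0 / (n * mu * s)          # 1 / (rate of successful mutations)
    duration = 2.0 * math.log(n) / s      # assumed duration of one sweep
    return waiting, duration, waiting / duration


waiting, duration, ratio = sweep_time_scales(10 ** 6, 1e-9, 0.01)
```

Under these assumptions the ratio equals `1 / (2 * n * mu * log(n))`, which does not depend on `s`. Sweeps are well separated when this ratio is large.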
Given two sequences of positive numbers and , we write if and if . Throughout the paper, we will assume that
(1) |
and that
(2) |
Let be the number of individuals at time with mutations, which we call type individuals. Let , and let
Also, let
The following theorem is the main result of this paper.
Theorem 1.1.
Assume that (1) and (2) hold. Let be a sequence of independent random variables having the exponential distribution with mean one. Then for each fixed positive integer , as we have the convergence in distribution
(3) |
Furthermore, there exist positive constants and , depending on , such that for all nonnegative integers , we have
(4) |
and
(5) |
We can think of as being approximately the time when type becomes established in the population. The result (3) demonstrates that the times , when scaled by which is approximately the rate at which selective sweeps take place, converge as to the times of a homogeneous rate one Poisson process. The result (4) shows that shortly after time , most of the population consists of type individuals. Furthermore, the result (5) shows that all individuals of types and lower disappear from the population shortly after time , and then all individuals have types between and until at least time .
From this result, we obtain the following corollary regarding how the average number of mutations in the population evolves over time.
Corollary 1.2.
To give one indication of why these results are significant, we refer the reader to the award-winning papers [11, 1], which provide a mathematical analysis of the results of the famous Lenski experiments on bacterial evolution. In [11], the authors consider a model very similar to the one in the present paper and assume that and , where and . (In [11], the mutation rate is written as , but this is because in [11] refers to the mutation rate for the entire population and therefore corresponds to in the present paper.) These restrictions on the parameters are chosen to eliminate all clonal interference on the time scale of interest with high probability. Theorem 1.1 suggests that the same results may still hold if the condition is replaced by the weaker condition , which is sufficient to eliminate clonal interference among beneficial mutations that do not quickly die out.
Many papers have been devoted to analyzing this population model (or very similar models, perhaps with slightly different selection mechanisms) for different ranges of values for the parameters and . Much of this work has been carried out by statistical physicists and appears in the biology or physics literature; see, for example, [2, 3, 7, 12, 13, 18, 19, 21]. There is also a growing body of mathematically rigorous work on the subject. The case when , where one begins to see overlaps between selective sweeps, was considered by Gerrish and Lenski in [9] and has recently been studied rigorously in [10]. Durrett and Mayberry [6] studied the case in which is a constant and for some . Schweinsberg [22, 23] studied slightly faster mutation rates, so that tends to zero more slowly than any power of . This work made rigorous the analysis in [3, 4]. Rigorous results for the case in which both and are constants were established in [24, 15]. One can also consider the case in which the mutation rate is very fast, but the selective benefit resulting from each mutation is very small. In this case, the fitness of a lineage over time is well approximated by Brownian motion. This parameter regime was studied by Neher and Hallatschek [19]. A branching Brownian motion model that should serve as a good approximation to this population model was studied rigorously in [20, 17]. Finally, we note that the case when both and are on the scale of can be studied using a diffusion approximation, as discussed, for example, in section 8.1 of [5].
The rest of this paper is devoted to proving Theorem 1.1 and Corollary 1.2. An important component of the proof will be a coupling between the population process and a branching process with immigration, which will allow us to bound the number of individuals with a given number of mutations from above and below by branching processes.
2 Transition rates for the population process
For the rest of the paper, to lighten notation, we shall omit the subscript and simply write , , , and in place of , , , and . Nevertheless, it is important to keep in mind that these quantities do depend on .
In this section, we work out the transition rates for the population process. Let
(6) |
which is the total fitness of the population at time . Note that for all . We need to consider two types of transitions in the population process:
1. For every pair of non-negative integers such that and , decreases by while increases by 1 when a type individual is replaced by a type individual. Hence, the rate at which decreases by while increases by 1 at time is
because type individuals die at rate , and the probability that the new individual born is type is .
2. For every non-negative integer , decreases by while increases by 1 when a type individual is replaced by a type individual, or a type individual gains a new mutation and becomes a type individual. Hence, the rate at which decreases by while increases by 1 at time is
There are also events in which a type individual is replaced by another type individual, but we may ignore these events because they do not change the composition of the population.
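The two kinds of transitions described above can be tabulated with a short sketch. It assumes that each individual dies at rate 1 and mutates at rate `mu`, that an individual with `k` mutations has fitness `(1 + s) ** k`, and that the parent is drawn from the whole population in proportion to fitness. The function name and the dictionary layout are hypothetical.

```python
def transition_rates(counts, mu, s):
    """Map (i, j) -> rate at which the type-i count drops by 1 while the
    type-j count rises by 1.

    counts: dict mapping type (number of mutations) -> number of individuals.
    """
    # Total fitness of the population under the assumed fitness (1 + s) ** k.
    total_fitness = sum(x * (1.0 + s) ** k for k, x in counts.items())
    rates = {}
    for i, x_i in counts.items():
        if x_i == 0:
            continue
        for j, x_j in counts.items():
            if j != i and x_j > 0:
                # x_i deaths per unit time, each replaced by a type-j
                # offspring with probability x_j (1 + s)^j / total fitness.
                rates[(i, j)] = x_i * x_j * (1.0 + s) ** j / total_fitness
        # Mutation moves a type-i individual to type i + 1.
        rates[(i, i + 1)] = rates.get((i, i + 1), 0.0) + mu * x_i
    return rates


rates = transition_rates({0: 3, 1: 1}, mu=0.0, s=1.0)
```

With no mutation, the ratio `rates[(0, 1)] / rates[(1, 0)]` equals `1 + s`, matching the birth-to-death ratio of a single mutant lineage discussed in the introduction.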
From these transition rates, we can see that for , the process can be viewed as a birth-death process with immigration having the following transition rates:
1. An immigration event occurs when a type individual becomes a type individual by acquiring a new mutation, which occurs at rate
(7)
Note that immigration only occurs for . We will call a type individual a type immigrant if it arises from a type individual who gains a new mutation.
2. A given type individual gives birth when an individual that is not of type is replaced by a new individual who chooses this type individual as its parent. This event occurs at rate
(8)
3. A given type individual dies when it is replaced by an individual that is not of type , or it gains a new beneficial mutation, which occurs at rate
(9)
Note that when discussing births and deaths of type individuals, we are disregarding events in which a type individual is replaced in the population by another type individual. Ignoring these birth and death events does not affect the distribution of types in the population but does alter the genealogy of the population. For the rest of the paper, we will work with this modified genealogy. This affects what is meant when we consider, for example, the set of individuals that are descended from a particular type immigrant.
3 Structure of the induction argument
Define , and for each positive integer , let
Note that at time , all individuals of types or lower have disappeared from the population. Let
For positive integers , let
The following lemma is the key to the proof of our main results. Note that, although for the model described in the introduction, no individuals have mutations at time zero, we present the result here under a slightly more general initial condition, so that the lemma can be applied inductively.
Lemma 3.1.
Suppose for and for . Then the following hold:
1. For all , we have
2. There exists a positive constant such that
3. We have
4. We have
5. For all positive integers such that , we have
Part 1 of Lemma 3.1 shows that the number of type individuals reaches after a time which is approximately exponentially distributed with rate . Then part 2 of the lemma shows that the type 0 individuals completely disappear a short time later. Parts 3 and 4 show that type 0 individuals disappear before the number of type individuals reaches for any , and before any individual acquires mutations. Finally, part 5 of the lemma shows that at the time the type individuals disappear, there are at most individuals of type for .
Sections 4, 5, and 6 are devoted to the proof of Lemma 3.1. In the rest of this section, we will show how to apply Lemma 3.1 inductively to obtain Lemma 3.2, and then use Lemma 3.2 to obtain Theorem 1.1 and Corollary 1.2. Let be the -field generated by the random variables for nonnegative integers and , so that is the natural filtration associated with the population process. Note that this filtration implicitly depends on .
Lemma 3.2.
For all nonnegative integers , let be the event that for and for . Then for all positive integers , the following hold:
1. For all , we have
2. There exists a positive constant such that
3. We have
4. We have
5. We have
Proof.
The result when is equivalent to Lemma 3.1. Suppose is a positive integer, and the result holds up to . Then we have , so we can work on the event . On the event , every individual in the population must have at least mutations from time onward. For and , on the event ,
and
We can see from these formulas that the rates would be unchanged if were subtracted from the type of each individual, which is a consequence of the fact that subtracting from the type of each individual multiplies the fitness of each individual by , without changing the relative fitnesses of the individuals. Therefore, we will shift the type of each individual down by , so that type individuals are relabeled as type .
After this relabeling of the types, on the event , the distribution of types at time satisfies the same conditions as the distribution of types at time zero in Lemma 3.1. Therefore, we can apply the strong Markov property at time , and after accounting for the relabeling of types, the five conclusions in Lemma 3.1 are equivalent to the five conclusions in Lemma 3.2 with in place of . Thus, the result holds for , and the lemma follows by induction. ∎
Proof of Theorem 1.1.
Fix a positive integer . It follows from part 5 of Lemma 3.2 that
Therefore, by part 1 of Lemma 3.2, if , then
That is, we have . Because part 2 of Lemma 3.2 and (1) imply that as for , the result (3) follows.
Next, note that part 5 of Lemma 3.2 implies that at time , with probability tending to one, all individuals have type at least and at most . Parts 2 and 3 of Lemma 3.2 imply that with probability tending to one, we have , so in particular before time , the number of individuals of type is less than for . Part 4 of Lemma 3.2 implies that with probability tending to one as , no individual of type appears before time . Putting together these observations, we conclude that
In view of part 2 of Lemma 3.2, the result (4) follows with and . The result (5) also follows from this same reasoning. ∎
Proof of Corollary 1.2.
For all , let . It follows from (3) that the finite-dimensional distributions of the processes converge as to the finite-dimensional distributions of a homogeneous rate one Poisson process. Therefore, it suffices to show that for each fixed , we have
(10) |
By (3), for any fixed and any ,
Since by (1), it follows from part 2 of Lemma 3.2 that for each fixed , we have
(11) |
However, as long as, for all , we have for and for all and , an event which has probability tending to one as by Lemma 3.2, we have
(12) |
Because as by (2), the result (10) follows from (11) and (12). ∎
4 Following the process until time
In this section, we study the process between time zero and the time when the number of type 1 individuals reaches . By bounding the process from above and below by branching processes with immigration, we will show that is asymptotically exponentially distributed. We will also bound the processes from above to show that the number of individuals of type or higher stays small until after time .
4.1 Bounding the process from above by a branching process
For an interval , we define to be the number of type individuals at time that descend from type immigrants who appear during the time interval . When , descendants of type individuals that are in the population at time zero are included; recall that this matters because we are aiming to prove Lemma 3.1 under slightly more general initial conditions to facilitate the induction argument. Recall also that when determining which individuals are descended from a particular immigrant, we are ignoring events in which a type individual is replaced by another type individual. For , define
(13) |
and for each positive integer , define
(14) |
Lemma 4.1.
The following statements hold.
1. For all positive integers and , we have for all .
2. For sufficiently large , we have for all .
Proof.
For positive integers , let be equal to plus the number of times that a type individual mutates to type during the time interval . For , define the stopping time
(15) |
Because for all , for sufficiently large we have .
Let , and for , let . For all positive integers , we now construct a new process from the population process as follows.
1. Set .
2. For and , the process jumps up by 1 due to immigration at rate . For , there is no immigration.
3. For all , the process jumps up by 1 due to births at rate
4. For all , the process jumps down by 1 due to deaths at rate .
Lemma 4.1 implies that the prescribed transition rates are nonnegative, so this process is well-defined. Also, once the process hits 0, it cannot jump down. Thus, for all .
One can carry out this construction formally by defining homogeneous rate one Poisson processes , , and which are independent of one another and of the population process, and then defining the process to satisfy
For other similar constructions in this paper, we will simply specify the jump rates without explicitly introducing the Poisson processes.
For all , we define
Therefore, for all . Note that is a birth-death process with immigration with the following rates:
1. An immigrant appears in the process when an immigrant appears in or . Therefore, immigrants appear in between times and at rate
For , an immigrant appears in the process when an immigrant appears in , which occurs only during the time interval at rate .
2. For all , a birth occurs in the process at rate
3. For all , a death occurs in the process at rate
We shall scale the time so that each individual after the time scaling gives birth at rate and dies at rate . For all positive integers and all , define
(16) |
and define . Then the process is a branching process with immigration with the following rates:
1. When , immigration occurs at time at rate
When , immigration occurs at a rate which is not constant in time and depends on how the population has evolved at earlier times.
2. Each individual produces an offspring at the rate
3. Each individual dies at the rate
4.2 An upper bound on
We first record the following elementary result about branching processes, which follows from classical results on asymmetric random walks.
Lemma 4.2.
Consider a continuous-time branching process started from one individual in which each individual gives birth at rate and dies at rate . The probability that the branching process survives forever is , and the probability that it goes extinct is .
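The survival probability of a supercritical birth-death branching process can be checked by simulating the embedded random walk. The sketch below uses generic rates `birth` and `death`, for which the classical survival probability is `1 - death / birth` when `birth > death`. These parameter names, and the use of a large cap as a proxy for survival, are assumptions of this illustration.

```python
import random


def survival_probability(birth, death):
    """Classical survival probability of a birth-death branching process
    started from one individual."""
    return max(0.0, 1.0 - death / birth)


def estimate_survival(birth, death, cap=60, trials=5000, seed=1):
    """Monte Carlo estimate using the embedded random walk.

    Each event is a birth with probability birth / (birth + death);
    reaching `cap` individuals is treated as survival.
    """
    rng = random.Random(seed)
    p_up = birth / (birth + death)
    survived = 0
    for _ in range(trials):
        size = 1
        while 0 < size < cap:
            size += 1 if rng.random() < p_up else -1
        if size == cap:
            survived += 1
    return survived / trials
```

For example, `estimate_survival(2.0, 1.0)` should be close to `survival_probability(2.0, 1.0)`, which is 0.5. The discrepancy caused by the finite cap is negligible for these values.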
Define and to be the events that the processes and go extinct, respectively.
Lemma 4.3.
We have
Proof.
On the event , all families of individuals at time must go extinct. Since individuals in the branching process give birth at rate and die at rate , the extinction probability of each family is . Also, at time , there are at least individuals in the process because . Hence,
As , we have and , which completes the proof. ∎
Lemma 4.4.
For every constant ,
Proof.
First, we show that
(17) |
For , let be the number of immigrants in the process that appear in the time interval and whose families do not go extinct. In the process , immigrants appear at rate until the time . The family of each immigrant has extinction probability . Hence, the first immigrant whose family does not go extinct appears at rate
Note that . Also, by (9) and (16),
(18) |
Hence,
We obtain the inequality (17) by taking the of both sides and using that , , and as . Next, note that
Thus, the result of this lemma follows by Lemma 4.3 and (17). ∎
4.3 Finite and infinite lines of descent
In this subsection, we will use the fact that a branching process that is conditioned to go extinct is still a branching process. Let be a branching process with . Let be the generating function of the offspring distribution of . Let be the mean lifetime of an individual in the process . We define .
An individual in the branching process is said to have a finite line of descent if the family of this particular individual goes extinct; otherwise, it is said to have an infinite line of descent. Let be the number of individuals at time that have a finite line of descent, and let be the number of individuals at time that have an infinite line of descent. Gadag and Rajarshi [8] showed that is a two-type Markov branching process. Let where is the probability that an individual with a finite line of descent has offspring with a finite line of descent and offspring with an infinite line of descent. Let where is the probability that an individual with an infinite line of descent has offspring with a finite line of descent and offspring with an infinite line of descent. Also, define and . Gadag and Rajarshi [8] also showed that
(19) |
and
(20) |
where is the extinction probability of the branching process .
We will apply the following result to immigrant families in the branching process .
Lemma 4.5.
Let be a continuous-time branching process with such that each individual gives birth at rate and dies at rate . Let be the event that the process goes extinct. Then
Proof.
We have
By Lemma 4.2, the extinction probability is , which can also be found by finding the smallest non-negative root of . Thus,
and
(21) |
The coefficients of tell us that an individual with a finite line of descent gives birth at rate and dies at rate . Hence, for all , we have
The result follows by integrating over . ∎
Remark 4.6.
If instead each individual gives birth at rate and dies at rate one, then we can apply Lemma 4.5 with in place of to get
Remark 4.7.
Equation (21) shows that an individual with an infinite line of descent gives birth to another individual with an infinite line of descent at rate . Therefore, conditional on , the number of individuals with an infinite line of descent is a Yule process with birth rate .
4.4 The number of type immigrants
Lemma 4.8.
For every constant ,
(22) |
and
(23) |
Proof.
For , we have . Because and each type 1 individual can mutate to type 2 at rate , it follows that
Therefore, using Markov’s inequality, and then using to bound the first term and to bound the second term, we get
Because and as for all , the result (22) follows.
Next, note that
(24) |
In particular, we have . Therefore, for sufficiently large ,
which gives (23). ∎
For each positive integer , let and be the events that the processes and go extinct, respectively. Note that since for all .
Lemma 4.9.
For every positive integer ,
Proof.
Recall that . Using that , we have
Likewise, we have
The result follows. ∎
Lemma 4.10.
For every positive integer ,
Proof.
Let
Lemma 4.11.
For every positive integer , we have
Proof.
By the assumptions on and , we have for sufficiently large . Therefore, it suffices to show that . By applying the result of Remark 4.6 to each immigrant family, we obtain
(25) |
On the event , we have
Therefore, applying the strong Markov property at time and Remark 4.6, we get
(26) |
Note that if is a nonnegative random variable and and are events, then
Applying this result to (25) and (26), we get
Because by Lemma 4.10, it follows that , which implies the result. ∎
Lemma 4.12.
For every positive integer , we have
Proof.
Define the stopping time
It follows from (25) and the fact that that
Therefore, by Markov’s inequality,
which implies that
(27) |
Recall that and . By (15),
Since and , we have and
Therefore, for sufficiently large ,
(30) |
That is, is the first time that a type individual appears in the population.
Lemma 4.13.
We have
4.5 Bounding the process from below by a branching process
Let , , and be constants. Define
Lemma 4.14.
For sufficiently large , we have
Proof.
By (30), if is sufficiently large, then during the time interval , type has never appeared in the population. Hence, when ,
Therefore,
Because as , it is not difficult to show that for all , we have for sufficiently large . Therefore, when ,
Thus, when , for sufficiently large ,
For all positive real numbers and such that , the function is decreasing on the interval because for all . Therefore, for sufficiently large , when ,
Note that when , because and ,
Therefore, for sufficiently large ,
and thus . ∎
Lemma 4.15.
For sufficiently large , we have for all .
Proof.
We now construct a new birth-death process with immigration called which will bound the process from below. We set .
1. At time , if a birth occurs in , then
  • with probability , a birth also occurs in ,
  • with probability , nothing happens in .
2. At time , if a death occurs in , then
  • with probability , a death also occurs in ,
  • with probability , nothing happens in .
3. At time , if an immigration event occurs in , then
  • with probability , an immigration event also occurs in ,
  • with probability , nothing happens in .
4. For times , the process behaves independently of , and
  • a birth occurs at rate ,
  • a death occurs at rate ,
  • immigration occurs at rate .
From this construction, we see that for all . Also, all of the probabilities in the construction are guaranteed to be in by Lemmas 4.14 and 4.15. Hence, is a branching process with immigration where
• each individual gives birth at rate ,
• each individual dies at rate ,
• immigration occurs at rate .
Recall the definition of the time scaling function in (16). We define for all . Then the process is a branching process with immigration in which each individual gives birth at rate , each individual dies at rate , and immigration occurs at rate .
By Lemma 4.2, the extinction probability of a family of an immigrant is . Thus, in the process , an immigrant whose family survives forever appears at rate
We define to be the first time that an immigrant whose family survives forever appears in the process , and define
(31) |
Lemma 4.16.
Let be a constant. For sufficiently large , we have
Proof.
Let be the number of individuals in the process at time who have an infinite line of descent and descend from the first immigrant that has an infinite line of descent. Let
Since is the first time the process goes above , we have . Therefore,
It follows from Remark 4.7 that is a Yule process in which each individual gives birth at rate . Therefore, has a geometric distribution with success probability . Using that as , we have
as , and the result follows because . ∎
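The geometric distribution of a Yule process used in this proof can be checked by simulation. The sketch assumes a generic birth rate `rate` and a single initial individual, in which case the population size at time `t` is geometric on {1, 2, ...} with success probability `exp(-rate * t)` and mean `exp(rate * t)`. The function name is hypothetical.

```python
import math
import random


def yule_size(rate, t, rng):
    """Size at time t of a Yule process started from one individual in
    which every individual gives birth at the given rate."""
    size, clock = 1, 0.0
    while True:
        # With `size` individuals, the next birth occurs at rate rate * size.
        clock += rng.expovariate(rate * size)
        if clock > t:
            return size
        size += 1


rng = random.Random(7)
samples = [yule_size(1.0, 1.0, rng) for _ in range(20000)]
mean_size = sum(samples) / len(samples)
frac_one = samples.count(1) / len(samples)
```

Here `mean_size` should be close to `math.exp(1.0)` and `frac_one` should be close to `math.exp(-1.0)`, the probability that no birth has occurred by time 1.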
Lemma 4.17.
For every constant ,
Proof.
By the definition of in (9), for sufficiently large , when , we have
Thus, by (16), for sufficiently large , when ,
Because for , we have . Also, is an increasing function. Hence,
It follows from Lemma 4.16 that
(32) |
As , we have , and because for sufficiently large by (2), we have
Thus, from (32), for sufficiently large ,
Since has an exponential distribution with rate ,
Thus,
It follows that
Since this statement is true for all , the result follows by taking limits as and . ∎
Proposition 4.18.
The following statements hold.
1. For every , we have
2. We have
3. We have
Proof.
First, note that for every ,
From Lemma 4.13, we have . Therefore, by Lemma 4.17,
Combining this result with Lemma 4.4 gives the first statement in the proposition.
To prove the second statement, note that for every ,
(33) |
By Lemma 4.13, the term on the left hand side converges to as . Therefore, using the result of the first statement of the proposition and taking the liminf of both sides of (33), we get
The result follows because this inequality is true for all .
To prove the last statement, we first observe that
Taking the liminf of both sides and using Lemma 4.8 and part 1 of this proposition, we have
Since this is true for all , we have proved the last statement of the proposition. ∎
5 Following the process until time
In this section, we study the process between the time when the number of type 1 individuals first reaches and the time , when the number of type individuals reaches . Our goal is to show that the number of type 1 individuals reaches quickly after time , and that there is not enough time for many mutations to type 2 to occur during this period.
We now construct a branching process which will bound from below between times and . For , let
That is, is the number of type 1 individuals at time that are descended from type individuals that are alive at time . Similar to the way we constructed , we construct a new birth-death process from the process as follows:
1. Set .
2. At time , if a birth occurs in , then
  • with probability , a birth also occurs in ,
  • with probability , nothing happens in .
3. At time , if a death occurs in , then
  • with probability , a death also occurs in ,
  • with probability , nothing happens in .
4. For times , the process evolves independently of the population, and
  • a birth occurs at rate ,
  • a death occurs at rate .
From this construction, which is well-defined in view of Lemma 4.14, the process is a birth-death process in which each individual gives birth at rate and each individual dies at rate . Also,
(34) |
For , let . Then the process is a branching process in which each individual gives birth at rate and dies at rate 1.
We now review the following standard result for continuous-time branching processes, which can be obtained, for example, from Theorem 6.1 on page 103 of [14].
Lemma 5.1.
Let be a continuous-time branching process with such that each individual independently lives for an exponentially distributed time with mean and is then replaced by offspring with probability . For , let
Let . Then , and if , then
Let be the event that . Note that as by part 2 of Proposition 4.18, and that on the event , we have . Also, let
Lemma 5.2.
We have
Proof.
Since is a branching process as described above,
Therefore, if is sufficiently large, implies . It thus follows from the conditional Chebyshev’s inequality that
(35) |
The generating function for this branching process, using the notation of Lemma 5.1, satisfies
Therefore, by Lemma 5.1, since the numbers of offspring produced by the individuals at time zero are independent, we have
(36) |
Note that and, if is sufficiently large, . Therefore, it follows from (35) and (36) that for sufficiently large ,
which goes to 0 as . The result of the lemma follows. ∎
Lemma 5.3.
There is a positive constant such that
Proof.
By the definition of in (9), when ,
Let be a constant such that . Since as , for sufficiently large we have for all . By the definition of in (16), when ,
(37) |
Let be the event that . By (37), on we have
Since is an increasing function, it follows that on ,
Define . By (34), either reaches before or at the same time as does, or reaches after time . Therefore, we have , which means that on ,
By the definition of , it follows that the process does not go above until after time . That is, on we have
Therefore, recalling that is the event that and using Lemma 5.2, we have
Because by part 2 of Proposition 4.18, it follows that as , which means
Because and , for sufficiently large we have . The result follows if we choose . ∎
Lemma 5.4.
For all positive constants , we have
Proof.
The number of type 1 individuals at any time is bounded above by , so the rate of mutations to type 2 is bounded above by . Therefore, the expected number of mutations to type 2 between times and is bounded above by . By Markov’s inequality and the fact that ,
Since , we have
(38)
(39)
Since by (24), we have . Thus, this lemma follows from (39) and the definition of in (15). ∎
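The Markov-inequality step in the proof above bounds the probability of seeing at least one mutation by the expected number of mutations. The following is a minimal numerical sketch of that step. It assumes, purely for illustration, that the mutation count in the time window is Poisson with a given mean; the function names and the sample means are hypothetical and are not quantities from the paper.

```python
import math


def prob_at_least_one(mean_count: float) -> float:
    """Exact P(N >= 1) when N is Poisson with mean `mean_count`.

    The Poisson law is an illustrative assumption, not taken from the paper.
    """
    # -expm1(-x) equals 1 - exp(-x) and stays accurate for small x.
    return -math.expm1(-mean_count)


def markov_bound(mean_count: float) -> float:
    """Markov's inequality for a non-negative count: P(N >= 1) <= E[N]."""
    return min(1.0, mean_count)


if __name__ == "__main__":
    # Hypothetical expected mutation counts, chosen only to show the trend.
    for m in (1e-3, 1e-2, 1e-1, 1.0):
        exact = prob_at_least_one(m)
        bound = markov_bound(m)
        print(f"mean={m:g}  exact={exact:.6f}  Markov bound={bound:.6f}")
```

When the expected count tends to zero, the bound and the exact probability agree to first order, which is why a vanishing expectation is enough to rule out mutations with high probability.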
Proposition 5.5.
We have
(40)
Also, there exists a positive constant such that
(41)
6 Following the process until type 0 vanishes
In this section, we will prove that after the time when the number of type 1 individuals reaches , the type 0 population quickly goes extinct. In particular, we will show that with probability tending to one as , we have the inequality for some positive constant . We begin by showing that with probability going to 1 as .
Lemma 6.1.
We have .
Proof.
For , we have for . Also, since is the time that the first type individual appears when is sufficiently large by (30), there are no individuals of type or above for . Therefore, as long as , we have for sufficiently large ,
Because as , when is large enough, if . Because by Proposition 5.5, the result follows. ∎
We now bound the process from above by a branching process. For , we define
(42)
We construct a new process from the population process as follows.
1. Set .
2. The process jumps up by 1 at rate for all .
3. The process jumps down by 1 at rate for all .
Once this process hits 0, it cannot jump down. Therefore, for all . It remains to check that the rate at which the process jumps up by 1 is non-negative, which follows from the lemma below.
Lemma 6.2.
For , we have .
Let for . Clearly, for all . Also, is a birth-death process in which, for , a birth occurs at rate and a death occurs at rate for . Next, for , define
(43)
Then, we define for . It follows that is a subcritical branching process in which each individual gives birth at rate and dies at rate .
We define
(44)
and
(45)
Lemma 6.3.
We have .
Proof.
Consider the branching process . Since each individual in this process gives birth at rate and dies at rate , we know that at any time, the next event is a birth with probability and a death with probability . Therefore, if we evaluate the process at the time of each birth or death event, we obtain an asymmetric random walk. Note that if , then . Also, if , then . Thus, in both cases, we have .
Given that there are individuals of type at time and , the probability that this random walk reaches before is
The bound on the right-hand side does not depend on . Therefore, given that , the probability that hits before is bounded from below by . By the definitions of and in (44) and (45), we have
Therefore,
(46)
Since and as , we have and as . Thus, from (46), . Lastly, note that after time , the process will stay at forever. Hence, implies that . ∎
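The proof above reduces the branching process, observed at its jump times, to an asymmetric random walk and then uses the classical hitting probability for such a walk. As a hedged illustration, the sketch below evaluates the standard gambler's-ruin formula for a walk that steps up with probability `p_up` and down otherwise. The names `up_probability` and `hit_upper_before_zero`, and the numerical rates, are hypothetical choices for this example and do not correspond to specific quantities in the paper.

```python
def up_probability(birth_rate: float, death_rate: float) -> float:
    """Probability that the next event is a birth when each individual
    gives birth at `birth_rate` and dies at `death_rate`."""
    return birth_rate / (birth_rate + death_rate)


def hit_upper_before_zero(start: int, upper: int, p_up: float) -> float:
    """P(walk started at `start` reaches `upper` before 0).

    Classical gambler's-ruin formula with r = (1 - p_up) / p_up:
        (r**start - 1) / (r**upper - 1)  if r != 1,
        start / upper                    if r == 1.
    """
    if not 0 <= start <= upper or upper <= 0:
        raise ValueError("need 0 <= start <= upper and upper > 0")
    if not 0.0 < p_up < 1.0:
        raise ValueError("p_up must lie strictly between 0 and 1")
    r = (1.0 - p_up) / p_up
    if abs(r - 1.0) < 1e-12:
        return start / upper
    return (r**start - 1.0) / (r**upper - 1.0)


if __name__ == "__main__":
    # Hypothetical subcritical rates: deaths are more likely than births.
    p = up_probability(birth_rate=0.8, death_rate=1.0)
    for k in (1, 5, 10, 15):
        print(k, hit_upper_before_zero(k, upper=20, p_up=p))
```

In the subcritical case (`p_up` below 1/2) the probability of climbing to a high level before dying out decays geometrically in the distance to that level, which is the mechanism behind the bound used in the proof.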
Lemma 6.4.
We have
Proof.
Define and as in (44) and (45). Since for all , the process must reach before or at the same time the process does, which implies that . It is therefore enough to show that
(47)
Consider the process , which is a branching process in which each individual gives birth at rate and dies at rate . For all
Since , we have . Let . Then . By Markov’s inequality,
Hence, . Because is the first time that hits , we have . By Lemma 6.3, . ∎
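The argument above combines the exponential decay of the mean of a subcritical branching process with Markov's inequality to show that the process is extinct by a given time with high probability. The sketch below compares that Markov bound with the exact survival probability of a linear birth-death process, for which a closed form is classical. All names (`mean_size`, `survival_prob`, and so on) and the numerical rates are hypothetical values chosen for illustration, not parameters from the paper.

```python
import math


def mean_size(z0: int, birth: float, death: float, t: float) -> float:
    """E[Z(t)] for a linear birth-death process started from z0 individuals."""
    return z0 * math.exp((birth - death) * t)


def survival_prob_single(birth: float, death: float, t: float) -> float:
    """Exact P(Z(t) > 0 | Z(0) = 1), valid when birth != death."""
    g = math.exp((birth - death) * t)
    return (death - birth) * g / (death - birth * g)


def survival_prob(z0: int, birth: float, death: float, t: float) -> float:
    """Exact P(Z(t) > 0 | Z(0) = z0), using independence of the z0 families."""
    s = survival_prob_single(birth, death, t)
    return 1.0 - (1.0 - s) ** z0


def markov_survival_bound(z0: int, birth: float, death: float, t: float) -> float:
    """Markov's inequality: P(Z(t) >= 1) <= E[Z(t)], capped at 1."""
    return min(1.0, mean_size(z0, birth, death, t))


if __name__ == "__main__":
    # Hypothetical subcritical rates and initial size.
    z0, birth, death = 50, 0.7, 1.0
    for t in (5.0, 10.0, 20.0, 40.0):
        exact = survival_prob(z0, birth, death, t)
        bound = markov_survival_bound(z0, birth, death, t)
        print(f"t={t:5.1f}  exact={exact:.3e}  Markov bound={bound:.3e}")
```

The bound is crude for small times but captures the correct exponential rate of decay, which is all that the extinction-time estimate requires.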
Proof of Lemma 3.1.
Part 1 of Lemma 3.1 is part 1 of Proposition 4.18. Part 2 of Lemma 3.1 follows from (41) and Lemmas 6.1 and 6.4. To prove parts 3 and 4 of Lemma 3.1, it suffices to show that . This result holds because by Lemma 4.13 and by Lemma 5.4 and part 2 of Lemma 3.1. Finally, to prove part 5 of Lemma 3.1, it is enough to establish that for . However, we have already seen that , and Lemmas 4.11 and 4.12 imply that . ∎
References
- [1] E. Baake, A. González Casanova, S. Probst, and A. Wakolbinger (2019). Modelling and simulating Lenski’s long-term evolution experiment. Theor. Pop. Biol. 127, 58-74.
- [2] É. Brunet, I. M. Rouzine, and C. O. Wilke (2008). The stochastic edge in adaptive evolution. Genetics 179, 603-620.
- [3] M. M. Desai and D. S. Fisher (2007). Beneficial mutation-selection balance and the effect of linkage on positive selection. Genetics 176, 1759-1798.
- [4] M. M. Desai, A. M. Walczak, and D. S. Fisher (2013). Genetic diversity and the structure of genealogies in rapidly adapting populations. Genetics 193, 565-585.
- [5] R. Durrett (2008). Probability Models for DNA Sequence Evolution. 2nd ed. Springer, New York.
- [6] R. Durrett and J. Mayberry (2011). Traveling waves of selective sweeps. Ann. Appl. Probab. 21, 699-744.
- [7] D. S. Fisher (2013). Asexual evolution waves: fluctuations and universality. Journal of Statistical Mechanics: Theory and Experiment, P01011.
- [8] V. G. Gadag and M. B. Rajarshi (1992). On processes associated with a super-critical Markov branching process. Serdica 18, 173-178.
- [9] P. J. Gerrish and R. E. Lenski (1998). The fate of competing beneficial mutations in an asexual population. Genetica 102/103, 127-144.
- [10] F. Hermann, A. González Casanova, R. Soares dos Santos, A. Tobiás, and A. Wakolbinger (2024). From clonal interference to Poissonian interacting trajectories. arXiv:2407.00793.
- [11] A. González Casanova, N. Kurt, A. Wakolbinger, and L. Yuan (2016). An individual-based model for the Lenski experiment, and the deceleration of the relative fitness. Stochastic Process. Appl. 126, 2211-2252.
- [12] B. H. Good, I. M. Rouzine, D. J. Balick, O. Hallatschek, and M. M. Desai (2012). Distribution of fixed beneficial mutations and the rate of adaptation in asexual populations. Proc. Natl. Acad. Sci. USA 109, 4950-4955.
- [13] B. H. Good, A. M. Walczak, R. A. Neher, and M. M. Desai (2014). Genetic diversity in the interference selection limit. PLOS Genetics 10, e1004222.
- [14] T. E. Harris (1963). The Theory of Branching Processes. Springer-Verlag, Berlin.
- [15] M. Kelly (2013). Upper bound on the rate of adaptation in an asexual population. Ann. Appl. Probab. 23, 1377-1408.
- [16] M. Kimura and T. Ohta (1969). The average number of generations until the fixation of a mutant gene in a finite population. Genetics 61, 763-771.
- [17] J. Liu and J. Schweinsberg (2023). Particle configurations for branching Brownian motion with an inhomogeneous branching rate. ALEA Lat. Am. J. Probab. Math. Stat. 20, 731-803.
- [18] M. J. Melissa, B. H. Good, D. S. Fisher, and M. M. Desai (2022). Population genetics of polymorphism and divergence in rapidly evolving populations. Genetics 221, iyac053.
- [19] R. A. Neher and O. Hallatschek (2013). Genealogies in rapidly adapting populations. Proc. Natl. Acad. Sci. USA 110, 437-442.
- [20] M. I. Roberts and J. Schweinsberg (2021). A Gaussian particle distribution for branching Brownian motion with an inhomogeneous branching rate. Electron. J. Probab. 26, no. 103, 1-76.
- [21] I. M. Rouzine, É. Brunet, and C. O. Wilke (2008). The traveling-wave approach to asexual evolution: Muller’s ratchet and speed of adaptation. Theor. Pop. Biol. 73, 24-46.
- [22] J. Schweinsberg (2017). Rigorous results for a population model with selection I: evolution of the fitness distribution. Electron. J. Probab. 22, no. 37, 1-94.
- [23] J. Schweinsberg (2017). Rigorous results for a population model with selection II: genealogy of the population. Electron. J. Probab. 22, no. 38, 1-54.
- [24] F. Yu, A. Etheridge, and C. Cuthbertson (2010). Asymptotic behavior of the rate of adaptation. Ann. Appl. Probab. 20, 978-1004.